
Lecture Notes for Solid State Physics (3rd Year Course 6) Hilary Term 2012

© Professor Steven H. Simon, Oxford University

January 9, 2012


Short Preface to My Second Year Lecturing This Course

Last year was my first year teaching this course. In fact, it was my first experience teaching any undergraduate course. I admit that I learned quite a bit from the experience. The good news is that the course was viewed mostly as a success, even by the tough measure of student reviews. I particularly would like to thank the student who wrote on his or her review that I deserve a raise — and I would like to encourage my department chair to post this review on his wall and refer to it frequently. With luck, the second iteration of the course will be even better than the first. Having learned so much from teaching the course last year, I hope to improve it even further this year. One of the most important things I learned was how much students appreciate a clear, complete, and error-free set of notes. As such, I am spending quite a bit of time reworking these notes to make them as perfect as possible. Repeating my plea from last year: if you can think of ways that these notes (or this course) could be further improved (correction of errors or whatnot), please let me know. The next generation of students will certainly appreciate it, and that will improve your Karma.

Oxford, United Kingdom, January 2012.

Preface

When I was an undergraduate, I thought solid state physics (a sub-genre of condensed matter physics) was perhaps the worst subject that any undergraduate could be forced to learn – boring and tedious, “squalid state” as it was commonly called1. How much would I really learn about the universe by studying the properties of crystals? I managed to avoid taking this course altogether. My opinion at the time was not a reflection of the subject matter, but rather was a reflection of how solid state physics was taught. Given my opinion as an undergraduate, it is a bit ironic that I have become a condensed matter physicist. But once I was introduced to the subject properly, I found that condensed matter was my favorite subject in all of physics – full of variety, excitement, and deep ideas. Many many physicists have come to this same conclusion. In fact, condensed matter physics is by far the largest single subfield of physics (the annual meeting of condensed matter physicists in the United States attracts over 6000 physicists each year!). Sadly, a first introduction to the topic can barely scratch the surface of what constitutes the broad field of condensed matter. Last year, when I was told that a new course was being prepared to teach condensed matter physics to third year Oxford undergraduates, I jumped at the opportunity to teach it. I felt that it must be possible to teach a condensed matter physics course that is just as interesting and exciting as any other course that an undergraduate will ever take. It must be possible to convey the excitement of real condensed matter physics to the undergraduate audience. I hope I will succeed in this task. You can judge for yourself. The topics I was asked to cover (being given little leeway in choosing the syllabus) are not atypical for a solid state physics course. In fact, the new condensed matter syllabus is extremely similar to the old Oxford B2 syllabus – the main changes being the removal of device physics, among a few other topics.
A few other small topics, such as point-group symmetries, are also nonexaminable now, or are removed altogether. A few other topics (thermal expansion, chemical bonding) are now added by mandate of the IOP2. At any rate, the changes to the old B2 syllabus are generally minor, so I recommend that Oxford students use the old B2 exams as a starting point for figuring out what they need to study as the exams approach. In fact, I have used precisely these old exams to figure out what I need to teach. Since the same group of people will be setting the exams this year as set them last year, this seems like a good idea. As with most exams at Oxford, one starts to see patterns in terms of what types of questions are asked year after year. The lecture notes contained here are designed to cover exactly this crucial material. I realize that these notes contain a lot of material, and for this I apologize. However, this is the minimum set of notes that covers all of the topics that have shown up on old B2 exams. The actual lectures for this course will try to cover everything in these notes, but a few of the less crucial pieces will necessarily be glossed over in the interest of time. Many of these topics are covered well in standard solid state physics references that one might find online, or in other books. The reason I am giving these lectures (and not just telling students to go read a standard book) is that condensed matter/solid-state is an enormous subject — worth many years of lectures — and one needs a guide to decide what subset of topics

are most important (at least in the eyes of the examination committee). I believe that the lectures contained here give depth in some topics, and gloss over other topics, so as to reflect the particular topics that are deemed important at Oxford. These topics may differ a great deal from what is deemed important elsewhere. In particular, Oxford is extremely heavy on scattering theory (x-ray and neutron diffraction) compared with most solid state courses or books that I have seen. But on the other hand, Oxford does not appear to believe in group representations (which resulted in my elimination of point group symmetries from the syllabus). I cannot emphasize enough that there are many many extremely good books on solid-state and condensed matter physics already in existence. There are also many good resources online (including the rather infamous “Britney Spears’ Guide to Semiconductor Physics” — which is tongue-in-cheek about Britney Spears3, but actually is a very good reference about semiconductor physics). I will list here some of the books that I think are excellent, and throughout these lecture notes, I will try to point you to references that I think are helpful.

1This jibe against solid state physics can be traced to the Nobel Laureate Murray Gell-Mann, discoverer of the quark, who famously believed that there was nothing interesting in any endeavor but particle physics. Interestingly, he now studies complexity — a field that mostly arose from condensed matter. 2We can discuss elsewhere whether or not we should pay attention to such mandates in general – although these particular mandates do not seem so odious.

• States of Matter, by David L. Goodstein, Dover. Chapter 3 of this book is a very brief but well written and easy to read description of much of what we will need to cover (but not all, certainly). The book is also published by Dover, which means it is super-cheap in paperback. Warning: It uses cgs units rather than SI units, which is a bit annoying.

• Solid State Physics, 2nd ed., by J. R. Hook and H. E. Hall, Wiley. This is frequently the book that students like the most. It is a first introduction to the subject and is much more introductory than Ashcroft and Mermin.

• The Solid State, by H. M. Rosenberg, OUP. This slightly more advanced book was written a few decades ago to cover what was the solid state course at Oxford at that time. Some parts of the course have since changed, but other parts are well covered in this book.

• Solid-State Physics, 4th ed., by H. Ibach and H. Luth, Springer-Verlag. Another very popular book on the subject, with quite a bit of information in it. More advanced than Hook and Hall.

• Solid State Physics, by N. W. Ashcroft and D. N. Mermin, Holt-Sanders. This is the standard complete introduction to solid state physics. It has many many chapters on topics we won’t be studying, and goes into great depth on almost everything. It may be a bit overwhelming to try to use this as a reference because of information-overload, but it has good explanations of almost everything. On the whole, this is my favorite reference. Warning: Also uses cgs units.

• Introduction to Solid State Physics, 8th ed., by Charles Kittel4, Wiley. This is a classic text. It gets mixed reviews by some as being unclear on many topics. It is somewhat more complete than Hook and Hall, less so than Ashcroft and Mermin. Its selection of topics and organization may seem a bit strange in the modern era.

• The Basics of Crystallography and Diffraction, 3rd ed., by C. Hammond, OUP. This book has historically been part of the syllabus, particularly for the scattering theory part of the course. I don’t like it much.

3This guide was written when Ms. Spears was just a popular young performer and not the complete train wreck that she appears to be now. 4Kittel happens to be my dissertation-supervisor’s dissertation-supervisor’s dissertation-supervisor’s dissertation-supervisor, for whatever that is worth.

• Structure and Dynamics, by M. T. Dove, Oxford University Press. This is a more advanced book that covers scattering in particular. It is used in the Condensed Matter option 4th-year course.

• Magnetism in Condensed Matter, by Stephen Blundell, OUP. Well written advanced material on the magnetism part of the course. It is used in the Condensed Matter option 4th-year course.

• Band Theory and Electronic Properties of Solids, by John Singleton, OUP. More advanced material on electrons in solids. Also used in the Condensed Matter option 4th-year course.

• Solid State Physics, by G. Burns, Academic. Another more advanced book. Some of its descriptions are short but very good.

I will remind my reader that these notes are a first draft. I apologize that they do not cover the material uniformly. In some places I have given more detail than in others – depending mainly on my enthusiasm level at the particular time of writing. I hope to go back and improve the notes as much as possible. Updated drafts will hopefully be appearing. Perhaps this pile of notes will end up as a book, perhaps it will not. That is not my point. My point is to write something that will be helpful for this course. If you can think of ways that these notes could be improved (correction of errors or whatnot), please let me know. The next generation of students will certainly appreciate it, and that will improve your Karma.

Oxford, United Kingdom, January 2011.

Acknowledgements

Needless to say, I pilfered a fair fraction of the content of this course from parts of other books (mostly mentioned above). The authors of these books put great thought and effort into their writing. I am deeply indebted to these giants who have come before me. Additionally, I have stolen many ideas about how this course should be taught from the people who have taught the course (and similar courses) at Oxford in years past. Most recently this includes Mike Glazer, Andrew Boothroyd, and Robin Nicholas. I am also very thankful for all the people who have helped me proofread, correct, and otherwise tweak these notes and the homework problems. These include in particular Mike Glazer, Alex Hearmon, Simon Davenport, Till Hackler, Paul Stubley, Stephanie Simmons, Katherine Dunn, and Joost Slingerland. Finally, I thank my father for helping proofread and improve these notes... and for a million other things.

Contents

1 About Condensed Matter Physics
  1.1 What is Condensed Matter Physics
  1.2 Why Do We Study Condensed Matter Physics?

I Physics of Solids without Considering Microscopic Structure: The Early Days of Solid State

2 Specific Heat of Solids: Boltzmann, Einstein, and Debye
  2.1 Einstein’s Calculation
  2.2 Debye’s Calculation
    2.2.1 About Periodic (Born-von-Karman) Boundary Conditions
    2.2.2 Debye’s Calculation Following Planck
    2.2.3 Debye’s “Interpolation”
    2.2.4 Some Shortcomings of the Debye Theory
  2.3 Summary of Specific Heat of Solids
  2.4 Appendix to this Chapter: ζ(4)

3 Electrons in Metals: Drude Theory
  3.1 Electrons in Fields
    3.1.1 Electrons in an Electric Field
    3.1.2 Electrons in Electric and Magnetic Fields
  3.2 Thermal Transport
  3.3 Summary of Drude Theory

4 More Electrons in Metals: Sommerfeld (Free Electron) Theory
  4.1 Basic Fermi-Dirac Statistics
  4.2 Electronic Heat Capacity
  4.3 Magnetic Susceptibility (Pauli Paramagnetism)
  4.4 Why Drude Theory Works so Well
  4.5 Shortcomings of the Free Electron Model
  4.6 Summary of (Sommerfeld) Free Electron Theory

II Putting Materials Together

5 What Holds Solids Together: Chemical Bonding
  5.1 General Considerations about Bonding
  5.2 Ionic Bonds
  5.3 Covalent Bond
    5.3.1 Particle in a Box Picture
    5.3.2 Molecular Orbital or Tight Binding Theory
  5.4 Van der Waals, Fluctuating Dipole Forces, or Molecular Bonding
  5.5 Metallic Bonding
  5.6 Hydrogen Bonds
  5.7 Summary of Bonding (Pictorial)

6 Types of Matter

III Toy Models of Solids in One Dimension

7 One Dimensional Model of Compressibility, Sound, and Thermal Expansion

8 Vibrations of a One Dimensional Monatomic Chain
  8.1 First Exposure to the Reciprocal Lattice
  8.2 Properties of the Dispersion of the One Dimensional Chain
  8.3 Quantum Modes: Phonons
  8.4 Crystal Momentum
  8.5 Summary of Vibrations of the One Dimensional Monatomic Chain

9 Vibrations of a One Dimensional Diatomic Chain
  9.1 Diatomic Structure: Some Useful Definitions
  9.2 Normal Modes of the Diatomic Solid
  9.3 Summary of Vibrations of the One Dimensional Diatomic Chain

10 Tight Binding Chain (Interlude and Preview)
  10.1 Tight Binding Model in One Dimension
  10.2 Solution of the Tight Binding Chain
  10.3 Introduction to Electrons Filling Bands
  10.4 Multiple Bands
  10.5 Summary of Tight Binding Chain

IV Geometry of Solids

11 Crystal Structure
  11.1 Lattices and Unit Cells
  11.2 Lattices in Three Dimensions
  11.3 Summary of Crystal Structure

12 Reciprocal Lattice, Brillouin Zone, Waves in Crystals
  12.1 The Reciprocal Lattice in Three Dimensions
    12.1.1 Review of One Dimension
    12.1.2 Reciprocal Lattice Definition
    12.1.3 The Reciprocal Lattice as a Fourier Transform
    12.1.4 Reciprocal Lattice Points as Families of Lattice Planes
    12.1.5 Lattice Planes and Miller Indices
  12.2 Brillouin Zones
    12.2.1 Review of One Dimensional Dispersions and Brillouin Zones
    12.2.2 General Brillouin Zone Construction
  12.3 Electronic and Vibrational Waves in Crystals in Three Dimensions
  12.4 Summary of Reciprocal Space and Brillouin Zones

V Neutron and X-Ray Diffraction

13 Wave Scattering by Crystals
  13.1 The Laue and Bragg Conditions
    13.1.1 Fermi’s Golden Rule Approach
    13.1.2 Diffraction Approach
    13.1.3 Equivalence of Laue and Bragg Conditions
  13.2 Scattering Amplitudes
    13.2.1 Systematic Absences and More Examples
  13.3 Methods of Scattering Experiments
    13.3.1 Advanced Methods (interesting and useful but you probably won’t be tested on this)
    13.3.2 Powder Diffraction (you will almost certainly be tested on this!)
  13.4 Still More About Scattering
    13.4.1 Variant: Scattering in Liquids and Amorphous Solids
    13.4.2 Variant: Inelastic Scattering
    13.4.3 Experimental Apparatus
  13.5 Summary of Diffraction

VI Electrons in Solids

14 Electrons in a Periodic Potential
  14.1 Nearly Free Electron Model
    14.1.1 Degenerate Perturbation Theory
  14.2 Bloch’s Theorem
  14.3 Summary of Electrons in a Periodic Potential

15 Insulator, Semiconductor, or Metal
  15.1 Energy Bands in One Dimension: Mostly Review
  15.2 Energy Bands in Two (or More) Dimensions
  15.3 Tight Binding
  15.4 Failures of the Band-Structure Picture of Metals and Insulators
  15.5 Band Structure and Optical Properties
    15.5.1 Optical Properties of Insulators and Semiconductors
    15.5.2 Direct and Indirect Transitions
    15.5.3 Optical Properties of Metals
    15.5.4 Optical Effects of Impurities
  15.6 Summary of Insulators, Semiconductors, and Metals

16 Semiconductor Physics
  16.1 Electrons and Holes
    16.1.1 Drude Transport: Redux
  16.2 Adding Electrons or Holes With Impurities: Doping
    16.2.1 Impurity States
  16.3 Statistical Mechanics of Semiconductors
  16.4 Summary of Statistical Mechanics of Semiconductors

17 Semiconductor Devices
  17.1 Band Structure Engineering
    17.1.1 Designing Band Gaps
    17.1.2 Non-Homogeneous Band Gaps
    17.1.3 Summary of the Examinable Material
  17.2 p-n Junction

VII Magnetism and Mean Field Theories

18 Magnetic Properties of Atoms: Para- and Dia-Magnetism
  18.1 Basic Definitions of Types of Magnetism
  18.2 Atomic Physics: Hund’s Rules
    18.2.1 Why Moments Align
  18.3 Coupling of Electrons in Atoms to an External Field
  18.4 Free Spin (Curie or Langevin) Paramagnetism
  18.5 Larmor Diamagnetism
  18.6 Atoms in Solids
    18.6.1 Pauli Paramagnetism in Metals
    18.6.2 Diamagnetism in Solids
    18.6.3 Curie Paramagnetism in Solids
  18.7 Summary of Atomic Magnetism; Paramagnetism and Diamagnetism

19 Spontaneous Order: Antiferro-, Ferri-, and Ferro-Magnetism
  19.1 (Spontaneous) Magnetic Order
    19.1.1 Ferromagnets
    19.1.2 Antiferromagnets
    19.1.3 Ferrimagnetism
  19.2 Breaking Symmetry
    19.2.1 Ising Model
  19.3 Summary of Magnetic Orders

20 Domains and Hysteresis
  20.1 Macroscopic Effects in Ferromagnets: Domains
    20.1.1 Disorder and Domain Walls
    20.1.2 Disorder Pinning
    20.1.3 The Bloch/Néel Wall
  20.2 Hysteresis in Ferromagnets
    20.2.1 Single-Domain Crystallites
    20.2.2 Domain Pinning and Hysteresis
  20.3 Summary of Domains and Hysteresis in Ferromagnets

21 Mean Field Theory
  21.1 Mean Field Equations for the Ferromagnetic Ising Model
  21.2 Solution of Self-Consistency Equation
    21.2.1 Paramagnetic Susceptibility
    21.2.2 Further Thoughts
  21.3 Summary of Mean Field Theory

22 Magnetism from Interactions: The Hubbard Model
  22.1 Ferromagnetism in the Hubbard Model
    22.1.1 Hubbard Mean Field Theory
    22.1.2 Stoner Criterion
  22.2 Mott Insulators in the Hubbard Model
  22.3 Summary of the Hubbard Model
  22.4 Appendix: The Hubbard Model for the Hydrogen Molecule

23 Magnetic Devices

Indices
  Index of People
  Index of Topics

Chapter 1

About Condensed Matter Physics

This chapter is just my personal take on why this topic is interesting. It seems unlikely to me that any exam would ask you why you study this topic, so you should probably consider this section to be not examinable. Nonetheless, you might want to read it to figure out why you should think this course is interesting if that isn’t otherwise obvious.

1.1 What is Condensed Matter Physics

Quoting Wikipedia:

Condensed matter physics is the field of physics that deals with the macroscopic and microscopic physical properties of matter. In particular, it is concerned with the “condensed” phases that appear whenever the number of constituents in a system is extremely large and the interactions between the constituents are strong. The most familiar examples of condensed phases are solids and liquids, which arise from the electromagnetic forces between atoms.

The use of the term “Condensed Matter” being more general than just solid state was coined and promoted by Nobel-Laureate Philip W. Anderson.

1.2 Why Do We Study Condensed Matter Physics?

There are several very good answers to this question

1. Because it is the world around us

Almost all of the physical world that we see is in fact condensed matter. Questions such as

• why are metals shiny and why do they feel cold?
• why is glass transparent?

• why is water a fluid, and why does fluid feel wet?
• why is rubber soft and stretchy?

These questions are all in the domain of condensed matter physics. In fact almost every question you might ask about the world around you, short of asking about the sun or stars, is probably related to condensed matter physics in some way.

2. Because it is useful

Over the last century our command of condensed matter physics has enabled us humans to do remarkable things. We have used our knowledge of physics to engineer new materials and exploit their properties to change our world and our society completely. Perhaps the most remarkable example is how our understanding of solid state physics enabled new inventions exploiting semiconductor technology, which enabled the electronics industry, which enabled computers, iPhones, and everything else we now take for granted.

3. Because it is deep

The questions that arise in condensed matter physics are as deep as those you might find anywhere. In fact, many of the ideas that are now used in other fields of physics can trace their origins to condensed matter physics. A few examples for fun:

• The famous Higgs boson, which the LHC is searching for, is no different from a phenomenon that occurs in superconductors (the domain of condensed matter physicists). The Higgs mechanism, which gives mass to elementary particles, is frequently called the “Anderson-Higgs” mechanism, after the condensed matter physicist Phil Anderson (the same guy who coined the term “condensed matter”) who described much of the same physics before Peter Higgs, the high energy theorist.
• The ideas of the renormalization group (Nobel prize to Kenneth Wilson in 1982) were developed simultaneously in both high-energy and condensed matter physics.
• The ideas of topological quantum field theories, while invented by string theorists as theories of quantum gravity, have been discovered in the laboratory by condensed matter physicists!
• In the last few years there has been a mass exodus of string theorists applying black-hole physics (in N dimensions!) to phase transitions in real materials. The very same structures exist in the lab that are (maybe!) somewhere out in the cosmos!

That this type of physics is deep is not just my opinion. The Nobel committee agrees with me. During this course we will discuss the work of no fewer than 50 Nobel laureates! (See the Index of People at the end of this set of notes.)

4. Because reductionism doesn’t work

{begin rant} People frequently have the feeling that if you continually ask “what is it made of” you learn more about something. This approach to knowledge is known as reductionism. For example, asking what water is made of, someone may tell you it is made from molecules, then molecules are made of atoms, atoms of electrons and protons, protons of quarks, and quarks are made of who-knows-what. But none of this information tells you anything about why water is wet, about why protons and neutrons bind to form nuclei, why the atoms bind to form water, and so forth. Understanding physics inevitably involves understanding how many objects all interact with each other. And this is where things get difficult very

quickly. We understand the Schroedinger equation extremely well for one particle, but the Schroedinger equations for four or more particles, while in principle solvable, in practice are never solved because they are too difficult — even for the world’s biggest computers. Physics involves figuring out what to do then. How are we to understand how many quarks form a nucleus, or how many electrons and protons form an atom, if we cannot solve the many particle Schroedinger equation? Even more interesting is the possibility that we understand very well the microscopic theory of a system, but then we discover that macroscopic properties emerge from the system that we did not expect. My personal favorite example is that when one puts together many electrons (each with charge −e) one can sometimes find new particles emerging, each having one third the charge of an electron!1 Reductionism would never uncover this — it misses the point completely. {end rant}

5. Because it is a Laboratory

Condensed matter physics is perhaps the best laboratory we have for studying quantum physics and statistical physics. Those of us who are fascinated by what quantum mechanics and statistical mechanics can do often end up studying condensed matter physics, which is deeply grounded in both of these topics. Condensed matter is an infinitely varied playground for physicists to test strange quantum and statistical effects. I view this entire course as an extension of what you have already learned in quantum and statistical physics. If you enjoyed those courses, you will likely enjoy this as well. If you did not do well in those courses, you might want to go back and study them again, because many of the same ideas will arise here.

1Yes, this truly happens. The Nobel prize in 1998 was awarded to Dan Tsui, Horst Stormer and Bob Laughlin, for discovery of this phenomenon known as the fractional quantum Hall effect.

Part I

Physics of Solids without Considering Microscopic Structure: The Early Days of Solid State


Chapter 2

Specific Heat of Solids: Boltzmann, Einstein, and Debye

Our story of condensed matter physics starts around the turn of the last century. It was well known (and you should remember from last year) that the heat capacity1 of a monatomic (ideal) gas is C_v = 3k_B/2 per atom, with k_B being Boltzmann’s constant. The statistical theory of gases described why this is so. As far back as 1819, however, it had also been known that for many solids the heat capacity is given by2

C = 3k_B per atom, or C = 3R per mole

which is known as the Law of Dulong-Petit3. While this law is not always correct, it frequently is close to true. For example, at room temperature we have the values shown in Table 2.1. With the exception of diamond, the law C/R = 3 seems to hold extremely well at room temperature, although at lower temperatures all materials start to deviate from this law, and typically

1We will almost always be concerned with the heat capacity C per atom of a material. Multiplying by Avogadro’s number gives the molar heat capacity or heat capacity per mole. The specific heat (denoted often as c rather than C) is the heat capacity per unit mass. However, the phrase “specific heat” is also used loosely to describe the molar heat capacity, since they are both intensive quantities (as compared to the total heat capacity, which is extensive — i.e., proportional to the amount of mass in the system). We will try to be precise with our language, but one should be aware that frequently things are written in non-precise ways and you are left to figure out what is meant. For example, really we should say C_v per atom = 3k_B/2 rather than C_v = 3k_B/2 per atom, and similarly we should say C per mole = 3R. To be more precise I really would have liked to title this chapter “Heat Capacity Per Atom of Solids” rather than “Specific Heat of Solids”. However, for over a century people have talked about the “Einstein Theory of Specific Heat” and “Debye Theory of Specific Heat” and it would have been almost scandalous not to use this wording. 2Here I do not distinguish between C_p and C_v because they are very close to the same. Recall that C_p − C_v = V T α² / β_T, where β_T is the isothermal compressibility and α is the coefficient of thermal expansion. For a solid, α is relatively small. 3Both Pierre Dulong and Alexis Petit were French chemists. Neither is remembered for much else besides this law.


Material    C/R
Aluminum    2.91
Antimony    3.03
Copper      2.94
Gold        3.05
Silver      2.99
Diamond     0.735

Table 2.1: Heat Capacities of Some Solids

C drops rapidly below some temperature. (And for diamond when the temperature is raised, the heat capacity increases towards 3R as well, see Fig. 2.2 below). In 1896 Boltzmann constructed a model that accounted for this law fairly well. In his model, each atom in the solid is bound to neighboring atoms. Focusing on a single particular atom, we imagine that atom as being in a harmonic well formed by the interaction with its neighbors. In such a classical statistical mechanical model, the heat capacity of the vibration of the atom is 3kB per atom, in agreement with Dulong-Petit. (Proving this is a good homework assignment that you should be able to answer with your knowledge of statistical mechanics and/or the equipartition theorem). Several years later in 1907, Einstein started wondering about why this law does not hold at low temperatures (for diamond, “low” temperature appears to be room temperature!). What he realized is that quantum mechanics is important! Einstein’s assumption was similar to that of Boltzmann. He assumed that every atom is in a harmonic well created by the interaction with its neighbors. Further he assumed that every atom is in an identical harmonic well and has an oscillation frequency ω (known as the “Einstein” frequency). The quantum mechanical problem of a simple harmonic oscillator is one whose solution we know. We will now use that knowledge to determine the heat capacity of a single one dimensional harmonic oscillator. This entire calculation should look familiar from your statistical physics course.
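Boltzmann's classical result can be checked numerically. The sketch below (an illustration, not part of the original notes; units with k_B = m = κ = 1 are arbitrary choices) integrates the Boltzmann weight of a single 1D harmonic oscillator over x and p, recovers ⟨E⟩ = k_B T by equipartition, and differentiates to get the Dulong-Petit value C = 3k_B for the three-dimensional solid.

```python
import math

def avg_energy(T, m=1.0, kappa=1.0, kB=1.0, n=4001, width=30.0):
    """<E> of one classical 1D harmonic oscillator at temperature T,
    by direct Boltzmann-weighted integration over x and p."""
    beta = 1.0 / (kB * T)

    def mean_quadratic(coeff, sigma):
        # <coeff * q^2> under the weight exp(-beta * coeff * q^2),
        # sampled on a grid of half-width `width` standard deviations
        h = 2.0 * width * sigma / (n - 1)
        z = num = 0.0
        for i in range(n):
            q = -width * sigma + i * h
            w = math.exp(-beta * coeff * q * q)
            z += w
            num += coeff * q * q * w
        return num / z

    # kinetic term p^2/(2m) and potential term kappa*x^2/2: each gives kB*T/2
    ekin = mean_quadratic(1.0 / (2.0 * m), math.sqrt(m * kB * T))
    epot = mean_quadratic(kappa / 2.0, math.sqrt(kB * T / kappa))
    return ekin + epot  # equipartition: kB*T per 1D oscillator

# heat capacity per atom of the 3D solid: C = 3 * d<E>/dT = 3 kB
dT = 1e-4
C = 3.0 * (avg_energy(1.0 + dT) - avg_energy(1.0 - dT)) / (2.0 * dT)
```

Because ⟨E⟩ is exactly linear in T here, the finite-difference derivative reproduces the Dulong-Petit value regardless of where it is evaluated.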

2.1 Einstein’s Calculation

In one dimension, the eigenstates of a single harmonic oscillator are

E_n = ℏω(n + 1/2)

with ω the frequency of the harmonic oscillator (the “Einstein frequency”). The partition function is then4

Z_1D = Σ_{n≥0} e^{−βℏω(n+1/2)} = e^{−βℏω/2} / (1 − e^{−βℏω}) = 1 / (2 sinh(βℏω/2))

4We will very frequently use the standard notation β = 1/(k_B T).
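The geometric-series closed form can be spot-checked numerically (a quick sketch, not from the notes; units are chosen so that ℏω = 1, so the argument b below stands for βℏω):

```python
import math

# Closed form for the single-oscillator partition function:
#   Z_1D = sum_{n>=0} exp(-b*(n + 1/2))
#        = exp(-b/2) / (1 - exp(-b)) = 1 / (2*sinh(b/2))
def Z_direct(b, nmax=400):
    # truncated sum; terms decay geometrically, so the tail is negligible
    return sum(math.exp(-b * (n + 0.5)) for n in range(nmax))

def Z_closed(b):
    return math.exp(-b / 2.0) / (1.0 - math.exp(-b))
```

Comparing `Z_direct` and `Z_closed` for a few values of b confirms the geometric-series summation, and `Z_closed(b)` agrees with `1/(2*math.sinh(b/2))` to machine precision.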

The expectation of energy is then

⟨E⟩ = −(1/Z) ∂Z/∂β = (ℏω/2) coth(βℏω/2) = ℏω [n_B(βℏω) + 1/2]    (2.1)

where n_B is the occupation factor5

n_B(x) = 1/(e^x − 1)

This result is easy to interpret: the mode ω is an excitation that is excited on average n_B times, or equivalently there is a “boson” orbital which is “occupied” by n_B bosons. Differentiating the expression for energy we obtain the heat capacity for a single oscillator,

C = ∂⟨E⟩/∂T = k_B (βℏω)² e^{βℏω} / (e^{βℏω} − 1)²

Note that the high temperature limit of this expression gives C = k_B (check this if it is not obvious!). Generalizing to the three-dimensional case,

E_{n_x,n_y,n_z} = ℏω[(n_x + 1/2) + (n_y + 1/2) + (n_z + 1/2)]

and

Z_3D = Σ_{n_x,n_y,n_z ≥ 0} e^{−β E_{n_x,n_y,n_z}} = [Z_1D]³

resulting in ⟨E_3D⟩ = 3⟨E_1D⟩, so correspondingly we obtain

C = 3 k_B (βℏω)² e^{βℏω} / (e^{βℏω} − 1)²

Plotted this looks like Fig. 2.1.
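Both limits of the three-dimensional Einstein heat capacity can be verified in a few lines (an illustrative sketch, not part of the original notes; k_B is set to 1 and x stands for βℏω):

```python
import math

# Einstein heat capacity per atom in 3D, with x = beta*hbar*omega and kB = 1:
#   C(x) = 3 x^2 e^x / (e^x - 1)^2
def einstein_C(x):
    ex = math.exp(x)
    return 3.0 * x * x * ex / (ex - 1.0) ** 2

# High temperature (x -> 0): recovers the Dulong-Petit value C -> 3 kB.
# Low temperature (x -> infinity): C -> 0 exponentially ("freeze out").
```

Evaluating `einstein_C` at small x approaches 3 (Dulong-Petit), while at large x it is exponentially suppressed, matching the behavior plotted in Fig. 2.1.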

5Satyendra Bose worked out the idea of Bose statistics in 1924, but could not get it published until Einstein lent his support to the idea.

[Figure: C/(3k_B) plotted versus k_B T/(ℏω); the curve rises from 0 toward 1.]

Figure 2.1: Einstein Heat Capacity Per Atom in Three Dimensions

Note that in the high temperature limit k_B T ≫ ℏω we recover the law of Dulong-Petit: 3k_B heat capacity per atom. However, at low temperature (T ≪ ℏω/k_B) the degrees of freedom “freeze out”, the system gets stuck in only the ground eigenstate, and the heat capacity vanishes rapidly. Einstein’s theory reasonably accurately explained the behavior of the heat capacity as a function of temperature with only a single fitting parameter, the Einstein frequency ω. (Sometimes this frequency is quoted in terms of the Einstein temperature ℏω = k_B T_Einstein.) In Fig. 2.2 we show Einstein’s original comparison to the heat capacity of diamond. For most materials, the Einstein frequency ω is low compared to room temperature, so the Dulong-Petit law holds fairly well (room temperature being relatively high compared to the Einstein frequency). However, for diamond, ω is high compared to room temperature, so the heat capacity is lower than 3R at room temperature. The reason diamond has such a high Einstein frequency is that the bonding between atoms in diamond is very strong and its mass is relatively low (hence a high oscillation frequency ω = √(κ/m), with κ a spring constant and m the mass). These strong bonds also result in diamond being an exceptionally hard material. Einstein’s result was remarkable, not only in that it explained the temperature dependence

Figure 2.2: Plot of Molar Heat Capacity of Diamond from Einstein’s Original 1907 paper. The fit is to the Einstein theory. The x-axis is kBT in units of ~ω and the y axis is C in units of cal/(K-mol). In these units, 3R 5.96. ≈ of the heat capacity, but more importantly it told us something fundamental about quantum mechanics. Keep in mind that Einstein obtained this result 19 years before the Schroedinger equation was discovered!6

2.2 Debye’s Calculation

Einstein’s theory of specific heat was extremely successful, but still there were clear deviations from the predicted equation. Even in the plot in his first paper (Fig. 2.2 above) one can see that at low temperature the experimental data lies above the theoretical curve7. This result turns out to be rather important! In fact, it was known that at low temperatures most materials have a heat capacity that is proportional to T 3 (Metals also have a very small additional term proportional to T which we will discuss later in section 4.2. Magnetic materials may have other additional terms as well8. Nonmagnetic insulators have only the T 3 behavior). At any rate, Einstein’s formula at low temperature is exponentially small in T , not agreeing at all with the actual experiments. In 1912 Peter Debye9 discovered how to better treat the quantum mechanics of oscillations of atoms, and managed to explain the T 3 specific heat. Debye realized that oscillation of atoms is the same thing as sound, and sound is a wave, so it should be quantized the same way as Planck quantized waves. Besides the fact that the speed of light is much faster than that of sound, there is only one minor difference between light and sound: for light, there are two polarizations for each k whereas for sound, there are three modes for each k (a longitudinal mode, where the atomic motion is in the same direction as k and two transverse modes where the motion is perpendicular to k. Light has only the transverse modes.). For simplicity of presentation here we will assume that the transverse and longitudinal modes have the same velocity, although in truth the longitudinal

6Einstein was a pretty smart guy. 7Although perhaps not obvious, this deviation turns out to be real, and not just experimental error. 8We will discuss magnetism in part VII. 9Peter Debye later won a Nobel prize in for something completely different. 12 CHAPTER 2. SPECIFIC HEAT OF SOLIDS: BOLTZMANN, EINSTEIN, AND DEBYE velocity is usually somewhat greater than the transverse velocity10. We now repeat essentially what was Planck’s calculation for light. This calculation should also look familiar from your statistical physics course. First, however, we need some preliminary information about waves:

2.2.1 About Periodic (Born-Von-Karman) Boundary Conditions

Many times in this course we will consider waves with periodic or "Born-von-Karman" boundary conditions. It is easiest to describe these first in one dimension. Here, instead of having a one-dimensional sample of length L with actual ends, we imagine that the two ends are connected together, making the sample into a circle. The periodic boundary condition means that any wave e^{ikr} in this sample is required to have the same value at a position r as it has at r + L (we have gone all the way around the circle). This then restricts the possible values of k to be

k = 2πn/L

for n an integer. If we are ever required to sum over all possible values of k, then for large enough L we can replace the sum with an integral, obtaining11

Σ_k → (L/2π) ∫_{−∞}^{∞} dk

A way to understand this mapping is to note that the spacing between allowed points in k-space is 2π/L, so the integral ∫dk can be replaced by a sum over k-points times the spacing between the points.

In three dimensions, the story is extremely similar. For a sample of size L³, we identify opposite ends of the sample (wrapping the sample up into a hypertorus!) so that if you go a distance L in any direction, you get back to where you started12. As a result, our k values can only take the values

k = (2π/L)(n₁, n₂, n₃)

for integer values of the nᵢ, so here each k-point now occupies a volume of (2π/L)³. Because of this discretization of the values of k, whenever we have a sum over all possible k values we obtain

Σ_k → [L³/(2π)³] ∫ dk

with the integral over all three dimensions of k-space (this is what we mean by the bold dk). One might think that wrapping the sample up into a hypertorus is very unnatural compared to considering a system with real boundary conditions. However, these boundary conditions tend to simplify calculations quite a bit, and most physical quantities you might measure could be measured far from the boundaries of the sample anyway, and would then be independent of what you do with the boundary conditions.

10 We have also assumed the sound velocity to be the same in every direction, which need not be true in real materials. It is not too hard to include such anisotropy in Debye's theory as well.
11 In your previous courses you may have used particle-in-a-box boundary conditions, where instead of plane waves e^{i2πnr/L} you used particle-in-a-box wavefunctions of the form sin(nπr/L). This gives you instead

Σ_k → (L/π) ∫₀^∞ dk

which will inevitably result in the same physical answers as for the periodic boundary condition case. All calculations can be done either way, but periodic Born-von-Karman boundary conditions are almost always simpler.
12 Such boundary conditions are very popular in video games. It may also be possible that our universe has such boundary conditions, a notion known as the doughnut universe. Data collected by the Cosmic Microwave Background Explorer (led by Nobel laureates John Mather and George Smoot) and its successor, the Wilkinson Microwave Anisotropy Probe, appear consistent with this structure.
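The replacement of the k-sum by an integral is easy to sanity-check numerically with any smooth test function (the Gaussian below, and the value of L, are our own arbitrary choices):

```python
import math

# Periodic boundary conditions on a ring of length L allow k = 2*pi*n/L.
# For large L, the sum over allowed k of a smooth function should approach
# (L / 2 pi) times the integral over all k.
L = 200.0
f = lambda k: math.exp(-k**2)   # smooth test function (arbitrary choice)

# Sum over allowed k points; the range is wide enough that f has decayed to zero
k_sum = sum(f(2 * math.pi * n / L) for n in range(-2000, 2001))

integral = math.sqrt(math.pi)   # exact value of the integral of exp(-k^2) over all k
print(k_sum, L / (2 * math.pi) * integral)   # nearly equal
```

For a Gaussian the k-spacing 2π/L is so fine that the two numbers agree to essentially machine precision.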

2.2.2 Debye’s Calculation Following Planck

Debye decided that the oscillation modes of a solid were waves with frequencies ω(k) = v|k|, with v the sound velocity, and that for each k there should be three possible oscillation modes, one for each direction of motion. Thus he wrote an expression entirely analogous to Einstein's expression (compare to Eq. 2.1):

⟨E⟩ = 3 Σ_k ℏω(k) [n_B(βℏω(k)) + 1/2]
    = 3 [L³/(2π)³] ∫ dk ℏω(k) [n_B(βℏω(k)) + 1/2]

Each excitation mode is a boson of frequency ω(k), and it is occupied on average n_B(βℏω(k)) times. By spherical symmetry, we may convert the three-dimensional integral to a one-dimensional integral,

∫ dk → 4π ∫₀^∞ k² dk

(recall that 4πk² is the surface area of a sphere13 of radius k), and we also use k = ω/v to obtain

⟨E⟩ = 3 [4πL³/(2π)³] ∫₀^∞ dω ω² (1/v³) (ℏω) [n_B(βℏω) + 1/2]

It is convenient to replace nL³ = N, where n is the density of atoms. We then obtain

⟨E⟩ = ∫₀^∞ dω g(ω) (ℏω) [n_B(βℏω) + 1/2]    (2.2)

where the density of states is given by

g(ω) = N [12πω² / ((2π)³ n v³)] = N (9ω²/ω_d³)    (2.3)

where

ω_d³ = 6π² n v³    (2.4)

This frequency will be known as the Debye frequency, and below we will see why we chose to define it this way, with the factor of 9 removed. The meaning of the density of states14 here is that the total number of oscillation modes with frequencies between ω and ω + dω is given by g(ω)dω. Thus the interpretation of Eq. 2.2 is simply that we should count how many modes there are per frequency (given by g), then multiply by the expected energy per mode (compare to Eq. 2.1), and finally integrate over all frequencies.

This result, Eq. 2.2, for the quantum energy of the sound waves is strikingly similar to Planck's result for the quantum energy of light waves, only we have replaced 2/c³ by 3/v³ (replacing the 2 light modes by 3 sound modes). The other change from Planck's classic result is the +1/2 that we obtain as the zero-point energy of each oscillator15. At any rate, this zero-point energy gives us a contribution which is temperature independent16. Since we are concerned with C = ∂⟨E⟩/∂T, this term will not contribute, and we will separate it out. We thus obtain

⟨E⟩ = (9Nℏ/ω_d³) ∫₀^∞ dω ω³/(e^{βℏω} − 1) + T-independent constant

By defining a variable x = βℏω this becomes

⟨E⟩ = [9Nℏ/(ω_d³ (βℏ)⁴)] ∫₀^∞ dx x³/(eˣ − 1) + T-independent constant

The nasty integral just gives some number17; in fact, the number is π⁴/15. Thus we obtain

⟨E⟩ = 9N [(k_B T)⁴/(ℏω_d)³] (π⁴/15) + T-independent constant

Notice the similarity to Planck's derivation of the T⁴ energy of radiation. As a result, the heat capacity is

C = ∂⟨E⟩/∂T = N k_B [(k_B T)³/(ℏω_d)³] (12π⁴/5) ∼ T³

This correctly obtains the desired T³ specific heat. Furthermore, the prefactor of T³ can be calculated in terms of known quantities such as the sound velocity and the density of atoms. Note that the Debye frequency in this equation is sometimes replaced by a temperature

ℏω_d = k_B T_Debye

known as the Debye temperature, so that this equation reads

C = ∂⟨E⟩/∂T = N k_B (T/T_Debye)³ (12π⁴/5)

13 Or, to be pedantic, ∫dk → ∫₀^{2π} dφ ∫₀^π dθ sinθ ∫ k² dk, and performing the angular integrals gives 4π.
14 We will encounter the concept of density of states many times, so it is a good idea to become comfortable with it!
15 Planck should have gotten this energy as well, but he didn't know about zero-point energy; in fact, since it was long before quantum mechanics was fully understood, Debye didn't actually have this term either.
16 Temperature independent, and also infinite. Handling infinities like this is something that gives mathematicians nightmares, but physicists do it happily when they know that the infinity is not really physical. We will see below in section 2.2.3 how this infinity gets properly cut off by the Debye frequency.
17 If you wanted to evaluate the nasty integral, the strategy is to reduce it to the famous Riemann zeta function. We start by writing

∫₀^∞ dx x³/(eˣ − 1) = ∫₀^∞ dx x³e^{−x}/(1 − e^{−x}) = ∫₀^∞ dx x³e^{−x} Σ_{n≥0} e^{−nx} = Σ_{n≥1} ∫₀^∞ dx x³e^{−nx} = 3! Σ_{n≥1} 1/n⁴

The resulting sum is a special case of the famous Riemann zeta function, defined as ζ(p) = Σ_{n≥1} n^{−p}, where here we are concerned with the value of ζ(4). Since the zeta function is one of the most important functions in all of mathematics18, one can just look up its value in a table to find that ζ(4) = π⁴/90, thus giving the above stated result that the nasty integral is π⁴/15. However, in the unlikely event that you were stranded on a desert island and did not have access to a table, you could even evaluate this sum explicitly, which we do in the appendix to this chapter.
18 One of the most important unproven conjectures in all of mathematics is known as the Riemann hypothesis, and is concerned with determining for which values of p one has ζ(p) = 0. The hypothesis was written down in 1859 by Bernhard Riemann (the same guy who invented Riemannian geometry, crucial to general relativity) and has defied proof ever since. The Clay Mathematics Institute has offered one million dollars for a successful proof.
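The reduction of the "nasty integral" to 3!·ζ(4) is easy to confirm numerically; partial sums of 6·Σ 1/n⁴ converge very quickly to π⁴/15 (a quick check, with the number of terms our own choice):

```python
import math

# The integral of x^3/(e^x - 1) from 0 to infinity equals 3! * zeta(4)
# = 6 * (pi^4 / 90) = pi^4 / 15.  Check by partial sums of the zeta series:
zeta4_sum = sum(1.0 / n**4 for n in range(1, 10001))

print(6 * zeta4_sum)       # partial-sum estimate of the integral
print(math.pi**4 / 15)     # exact value, approximately 6.4939
```

Since the tail of the series falls off like 1/n³, ten thousand terms already agree with π⁴/15 to about twelve digits.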

2.2.3 Debye’s “Interpolation”

Unfortunately, Debye now has a problem. In the expression derived above, the heat capacity is proportional to T³ up to arbitrarily high temperature. We know, however, that the heat capacity should level off to 3k_B N at high T. Debye understood that the problem with his approximation is that it allows an infinite number of sound wave modes, up to arbitrarily large k. This would imply more sound wave modes than there are atoms in the entire system. Debye guessed (correctly) that really there should be only as many modes as there are degrees of freedom in the system. We will see in sections 8-12 below that this is an important general principle. To fix this problem, Debye decided not to consider sound waves above some maximum frequency ω_cutoff, with this frequency chosen such that there are exactly 3N sound wave modes in the system (3 dimensions of motion times N particles). We thus define ω_cutoff via

3N = ∫₀^{ω_cutoff} dω g(ω)    (2.5)

We correspondingly rewrite Eq. 2.2 for the energy (dropping the zero-point contribution) as

⟨E⟩ = ∫₀^{ω_cutoff} dω g(ω) ℏω n_B(βℏω)    (2.6)

Note that at very low temperature this cutoff does not matter at all, since for large β the Bose factor n_B goes to zero very rapidly at frequencies well below the cutoff frequency anyway.

Let us now check that this cutoff gives us the correct high-temperature limit. At high temperature,

n_B(βℏω) = 1/(e^{βℏω} − 1) → k_B T/(ℏω)

Thus in the high temperature limit, invoking Eqs. 2.5 and 2.6, we obtain

⟨E⟩ = k_B T ∫₀^{ω_cutoff} dω g(ω) = 3 k_B T N

yielding the Dulong-Petit high-temperature heat capacity C = ∂⟨E⟩/∂T = 3k_B N, that is, 3k_B per atom. For completeness, let us now evaluate our cutoff frequency:

3N = ∫₀^{ω_cutoff} dω g(ω) = 9N ∫₀^{ω_cutoff} dω ω²/ω_d³ = 3N ω_cutoff³/ω_d³

We thus see that the correct cutoff frequency is exactly the Debye frequency ω_d. Note that k = ω_d/v = (6π²n)^{1/3} (from Eq. 2.4) is on the order of the inverse interatomic spacing of the solid.

More generally (in the neither-high-nor-low temperature limit) one must evaluate the integral in Eq. 2.6, which cannot be done analytically. Nonetheless it can be done numerically, and can then be compared to actual experimental data, as shown in Fig. 2.3. It should be emphasized that the Debye theory makes predictions without any free parameters, as compared to the Einstein theory, which had the unknown Einstein frequency ω as a free fitting parameter.
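That numerical evaluation of Eq. 2.6 is straightforward. A minimal midpoint-rule sketch (the function name and grid size are our own choices), which reproduces both limits derived above:

```python
import math

def debye_heat_capacity(t):
    """Heat capacity per atom, in units of k_B, for the Debye model, where
    t = T / T_Debye.  Differentiating Eq. 2.6 with respect to T gives
    C/(N k_B) = 9 t^3 * integral from 0 to 1/t of x^4 e^x / (e^x - 1)^2 dx,
    evaluated here by a simple midpoint rule."""
    xd = 1.0 / t        # upper limit of the dimensionless integral
    steps = 100000
    h = xd / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h          # midpoint of each sub-interval
        ex = math.exp(x)
        total += x**4 * ex / (ex - 1.0)**2
    return 9.0 * t**3 * total * h

print(debye_heat_capacity(10.0))   # high T: approaches Dulong-Petit, 3 k_B per atom
print(debye_heat_capacity(0.05))   # low T: approaches (12 pi^4 / 5) t^3
```

At t = 10 the result is within a fraction of a percent of the Dulong-Petit value 3, and at t = 0.05 it matches the T³ law with the 12π⁴/5 prefactor.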

2.2.4 Some Shortcomings of the Debye Theory

While Debye’s theory is remarkably successful, it does have a few shortcomings. 16 CHAPTER 2. SPECIFIC HEAT OF SOLIDS: BOLTZMANN, EINSTEIN, AND DEBYE

  

Figure 2.3: Plot of Heat Capacity of Silver. The y axis is C in units of cal/(K-mol). In these units, 3R 5.96). Over the entire experimental range, the fit to the Debye theory is excellent.≈ At low T it correctly recovers the T 3 dependence, and at high T it converges to the law of Dulong-Petit.

• The introduction of the cutoff seems very ad hoc. This seems like a successful cheat rather than real physics.

• We have assumed sound waves follow the law ω = v|k| even for very very large values of k (on the order of the inverse lattice spacing), whereas the entire idea of sound is a long-wavelength idea, which doesn't seem to make sense for high enough frequency and short enough wavelength. At any rate, it is known that at high enough frequency the law ω = v|k| no longer holds.

• Experimentally, the Debye theory is very accurate, but it is not exact at intermediate temperatures.

• At very very low temperatures, metals have a term in the heat capacity that is proportional to T, so the overall heat capacity is C = aT + bT³, and at low enough T the linear term will dominate19. You can't see this contribution on the plot of Fig. 2.3, but at very low T it becomes evident.

Of these shortcomings, the first three can be handled more properly by treating the details of the crystal structure of materials accurately (which we will do much later in this course). The final issue requires us to carefully study the behavior of electrons in metals to discover the origin of this linear-in-T term (see section 4.2 below). Nonetheless, despite these problems, Debye's theory was a substantial improvement over Einstein's20.

19 In magnetic materials there may be still other contributions to the heat capacity, reflecting the energy stored in magnetic degrees of freedom. See part VII below.
20 Debye was pretty smart too... even though he was a chemist.

2.3 Summary of Specific Heat of Solids

• (Much of the) heat capacity (specific heat) of materials is due to atomic vibrations.

• The Boltzmann and Einstein models consider these vibrations as N simple harmonic oscillators.

• The Boltzmann classical analysis obtains the law of Dulong-Petit, C = 3Nk_B = 3R.

• The Einstein quantum analysis shows that at temperatures below the oscillator frequency, degrees of freedom freeze out, and the heat capacity drops exponentially. The Einstein frequency is a fitting parameter.

• The Debye model treats oscillations as sound waves. No fitting parameters.
  – ω = v|k|, similar to light (but three polarizations, not two)
  – similar to Planck quantization of light

  – Maximum frequency cutoff (ℏω_Debye = k_B T_Debye) necessary to obtain a total of only 3N degrees of freedom
  – obtains Dulong-Petit at high T and C ∼ T³ at low T

• Metals have an additional (albeit small) linear-in-T term in the heat capacity which we will discuss later.

References

Almost every book covers the material introduced in this chapter, but frequently it is done late in the book, only after the idea of phonons is introduced. We will get to phonons in chapter 8. Before we get there, the following references cover this material without discussion of phonons:

• Goodstein, sections 3.1 and 3.2
• Rosenberg, sections 5.1 through 5.13 (good problems included)
• Burns, sections 11.3 through 11.5 (good problems included)

Once we get to phonons, we can look back at this material again. Discussions are then given also by

• Dove, sections 9.1 and 9.2
• Ashcroft and Mermin, chapter 23
• Hook and Hall, section 2.6
• Kittel, beginning of chapter 5

2.4 Appendix to this Chapter: ζ(4)

The Riemann zeta function as mentioned above is defined as

ζ(p) = Σ_{n=1}^∞ n^{−p}

This function occurs frequently in physics, not only in the Debye theory of solids, but also in the Sommerfeld theory of electrons in metals (see chapter 4 below), as well as in the study of Bose condensation. As mentioned above in footnote 18 of this chapter, it is also an extremely important quantity to mathematicians.

In this appendix we are concerned with the value of ζ(4). To evaluate this we write a Fourier series for the function x² on the interval [−π, π]. The series is given by

x² = a₀/2 + Σ_{n>0} aₙ cos(nx)

with coefficients given by

aₙ = (1/π) ∫_{−π}^{π} dx x² cos(nx)

a₀ = 2π²/3
aₙ = 4(−1)ⁿ/n²   (n > 0)

We now calculate an integral in two different ways. First we can directly evaluate

∫_{−π}^{π} dx (x²)² = 2π⁵/5

On the other hand, using the above Fourier decomposition of x², we can write the same integral as

∫_{−π}^{π} dx (x²)² = ∫_{−π}^{π} dx [a₀/2 + Σ_{n>0} aₙ cos(nx)] [a₀/2 + Σ_{m>0} aₘ cos(mx)]
                    = ∫_{−π}^{π} dx (a₀/2)² + Σ_{n>0} ∫_{−π}^{π} dx [aₙ cos(nx)]²

where we have used the orthogonality of Fourier modes to eliminate cross terms in the product. We can do these integrals to obtain

∫_{−π}^{π} dx (x²)² = π (a₀²/2 + Σ_{n>0} aₙ²) = 2π⁵/9 + 16π ζ(4)

Setting this expression equal to 2π⁵/5 gives us the result ζ(4) = π⁴/90.

Chapter 3

Electrons in Metals: Drude Theory

The fundamental characteristic of a metal is that it conducts electricity. At some level the reason for this conduction boils down to the fact that electrons are mobile in these materials. In later chapters we will be concerned with the question of why electrons are mobile in some materials but not in others; after all, all materials have electrons in them! For now, we take it as given that there are mobile electrons and we would like to understand their properties.

J.J. Thomson's 1897 discovery of the electron ("corpuscles of charge" that could be pulled out of metal) raised the question of how these charge carriers might move within the metal. In 1900 Paul Drude1 realized that he could apply Boltzmann's kinetic theory of gases to understanding electron motion within metals. This theory was remarkably successful, providing a first understanding of metallic conduction.2

Having studied the kinetic theory of gases in previous courses, you should find Drude theory very easy to understand. We will make three assumptions about the motion of electrons:

1. Electrons have a scattering time τ. The probability of scattering within a time interval dt is dt/τ.

2. Once a scattering event occurs, we assume the electron returns to momentum p = 0.

3. In between scattering events, the electrons, which are charge −e particles, respond to the externally applied electric field E and magnetic field B.

The first two of these assumptions are exactly those made in the kinetic theory of gases3. The third assumption is just a logical generalization to account for the fact that, unlike gas molecules,

1 Pronounced roughly "Drood-a".
2 Sadly, neither Boltzmann nor Drude lived to see how much influence this theory really had; in unrelated tragic events, both of them committed suicide in 1906. Boltzmann's famous student, Ehrenfest, also committed suicide some years later. Why so many highly successful statistical physicists took their own lives is a bit of a mystery.
3 Ideally we would do a better job with our representation of the scattering of particles. Every collision should consider two particles having initial momenta p1^initial and p2^initial and then scattering to final momenta p1^final and p2^final, so as to conserve both energy and momentum. Unfortunately, keeping track of things so carefully makes the problem extremely difficult to solve. Assumption 1 is not so crazy as an approximation, being that there really is a typical time between scattering events in a gas. Assumption 2 is a bit more questionable, but on average the final

electrons are charged and must therefore respond to electromagnetic fields.

We consider an electron with momentum p at time t and ask what momentum it will have at time t + dt. There are two terms in the answer: there is a probability dt/τ that it will scatter to momentum zero, and if it does not scatter to momentum zero (with probability 1 − dt/τ) it simply accelerates as dictated by its usual equation of motion, dp/dt = F. Putting the two terms together we have

p(t + dt) = (1 − dt/τ)(p(t) + F dt) + (dt/τ) · 0

or4

dp/dt = F − p/τ    (3.1)

where here the force F on the electron is just the Lorentz force

F = −e(E + v × B)

One can think of the scattering term −p/τ as just a drag force on the electron. Note that in the absence of any externally applied field, the solution to this differential equation is just an exponentially decaying momentum

p(t) = p_initial e^{−t/τ}

which is what we should expect for particles that lose momentum by scattering.
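This exponential decay can be verified against a direct forward-Euler integration of Eq. 3.1 with F = 0 (the step size, τ, and initial momentum below are our own illustrative choices):

```python
import math

# dp/dt = -p/tau with F = 0: the scattering term acts as a pure drag.
tau = 1.0       # scattering time (illustrative units)
p0 = 2.5        # initial momentum (illustrative)
p, t, dt = p0, 0.0, 1e-4

while t < 3.0:
    p += dt * (-p / tau)   # forward Euler step of dp/dt = -p/tau
    t += dt

# Compare against the exact solution p(t) = p0 * exp(-t/tau)
print(p, p0 * math.exp(-t / tau))   # nearly equal
```

With a step of 10⁻⁴ τ, the Euler result matches the exact exponential to a few parts in 10⁵ after three scattering times.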

3.1 Electrons in Fields

3.1.1 Electrons in an Electric Field

Let us start by considering the case where the electric field is nonzero but the magnetic field is zero. Our equation of motion is then

dp/dt = −eE − p/τ

In steady state, dp/dt = 0, so we have

mv = p = −eτE

with m the mass of the electron and v its velocity. Now, if there is a density n of electrons in the metal, each with charge −e, and they are all moving at velocity v, then the electrical current is given by

j = −env = (e²τn/m) E

or in other words, the conductivity of the metal, defined via j = σE, is given by5

σ = e²τn/m    (3.2)

By measuring the conductivity of the metal (assuming we know both the charge and mass of the electron) we can determine the product of the density and the scattering time of the electron.

(Footnote 3, continued) momentum after a scattering event is indeed zero (if you average the momentum as a vector). However, it is obviously not correct that every particle has zero kinetic energy after a scattering event; this is a defect of the approach.
4 Here we really mean ⟨p⟩ when we write p. Since our scattering is probabilistic, we should view all quantities (such as the momentum) as being an expectation over these random events. A more detailed theory would keep track of the entire distribution of momenta rather than just the average momentum. Keeping track of distributions in this way leads one to the Boltzmann Transport Equation, which we will not discuss.

3.1.2 Electrons in Electric and Magnetic Fields

Let us continue on to see what other predictions come from Drude theory. Consider the transport equation (3.1) for a system in both an electric and a magnetic field. We now have

dp/dt = −e(E + v × B) − p/τ

Again setting this to zero in steady state, and using p = mv and j = −nev, we obtain an equation for the steady-state current:

0 = −eE + (j × B)/n + [m/(neτ)] j

or

E = (1/(ne)) j × B + [m/(ne²τ)] j

We now define the 3-by-3 resistivity matrix ρ̃, which relates the current vector to the electric field vector,

E = ρ̃ j

such that the components of this matrix are given by

ρ_xx = ρ_yy = ρ_zz = m/(ne²τ)

and, if we imagine B oriented in the ẑ direction, then

ρ_xy = −ρ_yx = B/(ne)

and all other components of ρ̃ are zero. This off-diagonal term in the resistivity is known as the Hall resistivity, named after Edwin Hall, who discovered in 1879 that when a magnetic field is applied perpendicular to a current flow, a voltage can be measured perpendicular to both the current and the magnetic field (see Fig. 3.1). As a homework problem you might consider a further generalization of Drude theory to finite-frequency conductivity, where it gives some interesting (and frequently accurate) predictions.

The Hall coefficient R_H is defined as

R_H = ρ_yx / |B|

which in the Drude theory is given by

R_H = −1/(ne)

5 A related quantity is the mobility, defined by v = μE, which is given in Drude theory by μ = eτ/m. We will discuss mobility further in section 16.1.1 below.
Figure 3.1: Edwin Hall’s 1879 experiment. The voltage measured perpendicular to both the magnetic field and the current is known as the Hall voltage which is proportional to B and inversely proportional to the electron density (at least in Drude theory).

This then allows us to measure the density of electrons in a metal.

Aside: One can also consider turning this experiment on its head. If you know the density of electrons in your sample, you can use a Hall measurement to determine the magnetic field. This is known as a Hall sensor. Since it is hard to measure small voltages, Hall sensors typically use materials, such as semiconductors, where the density of electrons is low, so that R_H, and hence the resulting voltage, is large.
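As a consistency check on the resistivity matrix above, one can invert its in-plane 2×2 block and verify the equivalent (standard) Drude result σ_xx = σ₀/(1 + (ω_c τ)²), with ω_c = eB/m the cyclotron frequency. The numerical values of n, τ, and B below are illustrative choices of ours:

```python
# In-plane block of the Drude resistivity matrix with B along z:
#   [[ rho0,  rho_xy], [-rho_xy,  rho0]]
# Inverting it gives the conductivity tensor; here we check its xx element.
e, m = 1.602e-19, 9.109e-31   # electron charge (C) and mass (kg)
n, tau, B = 1e28, 1e-14, 10.0  # illustrative density, scattering time, field

rho0 = m / (n * e**2 * tau)    # diagonal resistivity m/(n e^2 tau)
rho_xy = B / (n * e)           # Hall resistivity B/(ne)

det = rho0**2 + rho_xy**2      # determinant of the 2x2 block
sigma_xx = rho0 / det          # xx element of the inverse (conductivity) matrix

sigma0 = n * e**2 * tau / m    # zero-field Drude conductivity
omega_c = e * B / m            # cyclotron frequency
print(sigma_xx, sigma0 / (1 + (omega_c * tau)**2))   # equal
```

The agreement is exact up to floating-point rounding, since σ_xx = ρ₀/(ρ₀² + ρ_xy²) and ρ_xy/ρ₀ = ω_c τ are algebraic identities of the Drude expressions.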

Let us then calculate n = −1/(eR_H) for various metals and divide it by the density of atoms. This should give us the number of free electrons per atom. Later on we will see that it is frequently not so hard to estimate the number of electrons in a system. A short description is that electrons bound in the core shells of the atoms are never free to travel throughout the crystal, whereas the electrons in the outer shell may be free (we will discuss later when these electrons are free and when they are not). The number of electrons in the outermost shell is known as the valence of the atom.

Material   (−1/[eR_H]) / [density of atoms]   Valence
Li         0.8                                1
Na         1.2                                1
K          1.1                                1
Cu         1.5                                1 (usually)
Be         −0.2 (but anisotropic)             2
Mg         −0.4                               2

In Drude theory the middle column should give the number of free electrons per atom, which is the valence.

Table 3.1: Comparison of the valence of various atoms to the measured number of free electrons per atom (measured via the Hall resistivity and the atomic density).

We see from table 3.1 that for many metals this Drude theory analysis seems to make sense: the valence of lithium, sodium, and potassium (Li, Na, and K) is one, which agrees roughly with the measured number of electrons per atom. The effective valence of copper (Cu) is also one, so it is not surprising either. However, something has clearly gone seriously wrong for Be and Mg. In this case, the sign of the Hall coefficient has come out incorrect. From this result, one might conclude that the charge carriers for beryllium and magnesium (Be and Mg) have the opposite charge from that of the electron! We will see below in section 16.1.1 that this is indeed true, and is a result of the so-called band structure of these materials. However, for many metals, simple Drude theory gives quite reasonable results. We will see in chapter 16 below that Drude theory is particularly good for describing semiconductors.

If we believe the Hall effect measurement of the density of electrons in metals, using Eq. 3.2 we can then extract a scattering time from the expression for the conductivity. The Drude scattering time comes out to be in the range of τ ≈ 10⁻¹⁴ seconds for most metals near room temperature.
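For instance, the quoted τ ≈ 10⁻¹⁴ s scale can be recovered directly from Eq. 3.2 using rough room-temperature numbers for copper (the values of n and σ below are approximate textbook figures, our assumption, not from the text):

```python
# Extract the Drude scattering time tau = sigma * m / (n e^2) from a measured
# conductivity, using rough numbers for copper at room temperature.
e = 1.602e-19      # electron charge, C
m = 9.109e-31      # electron mass, kg
n = 8.5e28         # electron density of copper, m^-3 (approximate)
sigma = 5.9e7      # conductivity of copper, S/m (approximate)

tau = sigma * m / (n * e**2)
print(tau)         # on the order of 10^-14 seconds, as quoted in the text
```

The result, a few times 10⁻¹⁴ s, is indeed in the stated range.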

3.2 Thermal Transport

Drude was brave enough to attempt to further calculate the thermal conductivity κ due to mobile electrons6 using Boltzmann's kinetic theory. Without rehashing the derivation, the result should look familiar to you from your previous encounters with the kinetic theory of gases:

κ = (1/3) n c_v ⟨v⟩ λ

where c_v is the heat capacity per particle, ⟨v⟩ is the average thermal velocity, and λ = ⟨v⟩τ is the scattering length. For a conventional gas the heat capacity per particle is

c_v = (3/2) k_B

and

⟨v⟩ = √(8 k_B T/(π m))

Assuming this all holds true for electrons, we obtain

κ = (4/π) n τ k_B² T / m

While this quantity still has the unknown parameter τ in it, it is the same quantity that occurs in the electrical conductivity (Eq. 3.2). Thus we may look at the ratio of thermal conductivity to

6 In any experiment there will also be some amount of thermal conductivity from structural vibrations of the material as well, the so-called phonon thermal conductivity (we will meet phonons in chapter 8 below). However, for most metals the thermal conductivity is mainly due to electron motion and not from vibrations.

electrical conductivity, known as the Lorenz number7,8:

L = κ/(Tσ) = (4/π)(k_B/e)² ≈ 0.94 × 10⁻⁸ WattOhm/K²

A slightly different prediction is obtained by realizing that we have used ⟨v⟩² in our calculation, whereas perhaps we might have instead used ⟨v²⟩, which would have then given us instead

L = κ/(Tσ) = (3/2)(k_B/e)² ≈ 1.11 × 10⁻⁸ WattOhm/K²

This result was viewed as a huge success, being that it was known for almost half a century that almost all metals have roughly the same value of this ratio, a fact known as the Wiedemann-Franz law. In fact, the value predicted for this ratio is only a bit lower than that measured experimentally (see table 3.2).
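Both predicted Lorenz numbers follow directly from the fundamental constants; a quick check:

```python
import math

# Drude predictions for the Lorenz number L = kappa/(T sigma), in WattOhm/K^2.
kB = 1.381e-23    # Boltzmann constant, J/K
e = 1.602e-19     # electron charge, C

L1 = (4 / math.pi) * (kB / e)**2   # version built from <v>^2
L2 = (3 / 2) * (kB / e)**2         # version built from <v^2>
print(L1, L2)                      # approximately 0.94e-8 and 1.11e-8
```

Note that (k_B/e)² ≈ 7.4 × 10⁻⁹ WattOhm/K² sets the overall scale; only the prefactor differs between the two estimates.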

Material            L × 10⁸ (WattOhm/K²)
Li                  2.22
Na                  2.12
Cu                  2.20
Fe                  2.61
Bi                  3.53
Mg                  2.14
Drude Prediction    0.94-1.11

Table 3.2: Lorenz Numbers κ/(T σ) for Various Metals

So the result appears to be off by about a factor of 2, but still that is very good, considering that before Drude no one had any idea why this ratio should be a constant at all!

In retrospect we now realize that this calculation is completely incorrect (despite its successful result). The reason we know there is a problem is that we do not actually measure a specific heat of c_v = (3/2)k_B per electron in metals. (For certain systems where the density of electrons is very low, we do in fact measure this much specific heat, but not in metals.) In fact, in most metals we measure only a vibrational (Debye) specific heat, plus a very small term linear in T at low temperatures. So why does this calculation give such a good result? It turns out (and we will see later below) that we have made two mistakes that roughly cancel each other: we have used a specific heat that is far too large, but we have also used a velocity that is far too small. We will see later that both of these mistakes are due to the Fermi statistics of the electron (which we have so far ignored) and the Pauli exclusion principle.

We can see the problem much more clearly in some other quantities. The so-called Peltier effect is the fact that running electrical current through a material also transports heat. The

7 This is named after Ludvig Lorenz, not Hendrik Lorentz, who is famous for the Lorentz force and Lorentz contraction. However, just to confuse matters, the two of them worked on similar topics, and there is even a Lorentz-Lorenz equation.
8 The dimensions here might look a bit funny, but κ, the thermal conductivity, is measured in Watt/K and σ is measured in 1/Ohm. To see that WattOhm/K² is the same as (k_B/e)², note that k_B is in J/K and e is in Coulombs (C), so we need to show that (J/C)² is WattOhm:

(J/C)² = (J/sec)(J/C)(1/(C/sec)) = WattVolt/Amp = WattOhm

so-called Peltier coefficient Π is defined by

j^q = Π j

where j^q is the heat current and j is the electrical current.

Aside: The Peltier effect is used for thermoelectric refrigeration devices. By running electricity through a thermoelectric material, you can force heat to be transported through that material. You can thus transport heat away from one object and towards another. A good thermoelectric device has a high Peltier coefficient, but must also have a low resistivity, because running a current through a material with resistance R will result in power I²R being dissipated, thus heating it up.

In kinetic theory the thermal current is

j^q = (1/3)(c_v T) n v    (3.3)

Here c_v T is the heat carried by one particle (with c_v = 3k_B/2 the heat capacity per particle), n is the density of particles, and 1/3 is the geometric factor that is probably approximate anyway. Similarly the electrical current is

j = −env

Thus the Peltier coefficient is

Π = −c_v T/(3e) = −k_B T/(2e)    (3.4)

so the ratio (known as the thermopower, or Seebeck coefficient) S = Π/T is given by

S = Π/T = −k_B/(2e) = −4.3 × 10⁻⁵ V/K    (3.5)

in Drude theory. For most metals the actual value of this ratio is roughly 100 times smaller! This is a reflection of the fact that we have used c_v = 3k_B/2, whereas the actual specific heat per particle is much, much lower (which we will understand in the next section, when we consider Fermi statistics more carefully).
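The Drude thermopower is likewise just a ratio of fundamental constants (a quick check; recall k_B/e ≈ 86 μV/K):

```python
# Drude prediction for the thermopower (Seebeck coefficient) S = -k_B/(2e).
kB = 1.381e-23    # Boltzmann constant, J/K
e = 1.602e-19     # electron charge, C

S = -kB / (2 * e)
print(S)          # about -43 microvolts per kelvin in Drude theory
```

Measured metallic thermopowers are typically on the order of 1 μV/K or less, which is the factor-of-100 discrepancy discussed in the text.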

3.3 Summary of Drude Theory

• Based on kinetic theory of gases.
• Assumes some scattering time τ, resulting in a conductivity σ = ne²τ/m.
• Hall coefficient measures density of electrons.
• Successes
  – Wiedemann-Franz ratio κ/(σT) comes out close to right for most materials
  – Many other transport properties predicted correctly (ex, conductivity at finite frequency)
  – Hall coefficient measurement of the density seems reasonable for many metals.

• Failures
  – Hall coefficient frequently is measured to have the wrong sign, indicating a charge carrier with charge opposite to that of the electron

  – There is no 3k_B/2 heat capacity per particle measured for electrons in metals. This then makes the Peltier coefficient come out wrong by a factor of 100.

The latter of the two shortcomings will be addressed in the next section, whereas the former will be addressed in chapter 16 below, where we discuss band theory. Despite its shortcomings, Drude theory was nonetheless the only theory of metallic conductivity for a quarter of a century (until the Sommerfeld theory improved it), and it remains quite useful today, particularly for semiconductors and other systems with low densities of electrons (see chapter 16).
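To see how the summary's conductivity formula is used in practice, one can invert σ = ne²τ/m to estimate the scattering time; a sketch, assuming standard tabulated values for copper's electron density and room-temperature resistivity (these numbers are mine, not from the text):

```python
# Invert the Drude conductivity sigma = n e^2 tau / m to estimate tau for Cu.
e = 1.602176634e-19   # elementary charge, C
m_e = 9.1093837e-31   # electron mass, kg
n = 8.47e28           # electron density of copper, m^-3 (tabulated value)
rho = 1.68e-8         # resistivity of copper at room temperature, Ohm m

sigma = 1.0 / rho                 # conductivity, 1/(Ohm m)
tau = m_e * sigma / (n * e**2)    # Drude scattering time
print(f"tau = {tau:.2e} s")       # a few times 1e-14 s
```

A scattering time of a few times 10⁻¹⁴ s is the standard order of magnitude for good metals.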

References

• Ashcroft and Mermin, chapter 1
• Burns, chapter 9 part A
• Singleton, sections 1.1–1.4
• Hook and Hall, section 3.3 (sort of)

Actually, Hook and Hall is aimed mainly at free electron (Sommerfeld) theory (our next chapter), but they end up doing Drude theory anyway (they don't use the word "Drude").

Chapter 4

More Electrons in Metals: Sommerfeld (Free Electron) Theory

In 1925 Pauli discovered the exclusion principle, that no two electrons may be in the exact same state. In 1926, Fermi and Dirac separately derived what we now call Fermi-Dirac statistics.¹ Upon learning about Fermi statistics, Sommerfeld² realized that Drude's theory of metals could easily be generalized to incorporate Fermi statistics, which is what we shall presently do.

1 All three, Pauli, Fermi, and Dirac, won Nobel prizes in the next few years, but you probably knew that already.
2 Sommerfeld never won a Nobel prize, although he was nominated for it 81 times, more than any other physicist. He also was a research advisor for more Nobel laureates than anyone else in history (six: Heisenberg, Pauli, Debye, and Bethe, who were his PhD students, and Pauling and Rabi, who were postdoctoral researchers with him). He also was the first research advisor of Rudolf Peierls, for whom the theory building at Oxford is named, although Peierls eventually finished his PhD as a student of Pauli.


4.1 Basic Fermi-Dirac Statistics

Given a system of free3 electrons with chemical potential4 µ the probability of an eigenstate of energy E being occupied is given by the Fermi factor5 (See Fig. 4.1)

n_F(β(E − μ)) = 1/(e^{β(E−μ)} + 1)    (4.1)

At low temperature the Fermi function becomes a step function (states below the chemical potential are filled, those above the chemical potential are empty), whereas at higher temperatures the step function becomes more smeared out.
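The smearing of the step can be seen by evaluating Eq. 4.1 directly; a minimal sketch in units where k_B = 1 (the function name and the particular numerical values are my own choices):

```python
import math

def fermi(E, mu, T):
    """Fermi factor n_F = 1/(exp((E - mu)/T) + 1), in units with k_B = 1."""
    return 1.0 / (math.exp((E - mu) / T) + 1.0)

mu = 1.0
# Far below mu the state is essentially filled, far above essentially empty,
# and exactly at E = mu the occupation is 1/2 at any temperature.
print(fermi(0.5, mu, T=0.01))  # ~1
print(fermi(1.5, mu, T=0.01))  # ~0
print(fermi(1.0, mu, T=0.01))  # exactly 0.5
```

Note also the particle-hole symmetry n_F(β(E − μ)) = 1 − n_F(β(μ − E)), which will matter for the heat capacity argument below.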

Figure 4.1: The Fermi distribution n_F(β(E − μ)) plotted versus E/E_F for k_B T ≪ E_F (with μ ≈ E_F); the step at E = μ is smeared over a width of order k_B T.

We will consider the electrons to be in a box of size V = L³ and, as with our above discussion of sound waves, it is easiest to imagine that the box has periodic boundary conditions (see section 2.2.1). The plane wavefunctions are of the form e^{ik·r} where k must take values (2π/L)(n₁, n₂, n₃) with n_i integers due to the boundary conditions. These plane waves have corresponding energies

3 Here "free" means that they do not interact with each other, with the background crystal lattice, with impurities, or with anything else for that matter.
4 In case you did not properly learn about chemical potential in your statistical physics course, it can be defined via Eq. 4.1, by saying that μ is whatever constant needs to be inserted into this equation to make it true. It can also be defined as an appropriate thermodynamical derivative, such as μ = ∂U/∂N|_{V,S} with U the total energy and N the number of particles, or μ = ∂G/∂N|_{T,P} with G the Gibbs potential. However, such a definition can be tricky if one worries about the discreteness of the particle number: since N must be an integer, the derivative may not be well defined. As a result the definition in terms of Eq. 4.1 is frequently best (i.e., we are treating μ as a Lagrange multiplier).
5 When we say that there are a particular set of N orbitals occupied by electrons, we really mean that the overall wavefunction of the system is an antisymmetric function Ψ(1,...,N) which can be expressed as a Slater determinant of N particle coordinates occupying the N orbitals. We will never need to actually write out such Slater determinant wavefunctions except in Appendix 22.4, which is too advanced for any reasonable exam.

ε(k) = ℏ²|k|²/(2m)    (4.2)

with m the electron mass. Thus the total number of electrons in the system is given by

N = 2 Σ_k n_F(β(ε(k) − μ)) = 2 [V/(2π)³] ∫ dk n_F(β(ε(k) − μ))    (4.3)

where the prefactor of 2 accounts for the two possible spin states for each possible wavevector k. In fact, in a metal, N will usually be given to us, and this equation will then define the chemical potential as a function of temperature. We now define a useful concept:

Definition 4.1.1. The Fermi Energy, EF is the chemical potential at temperature T = 0.

This is also sometimes called the Fermi level. The states that are filled at T = 0 are sometimes called the Fermi sea. Frequently one also defines a Fermi temperature T_F = E_F/k_B, and also the Fermi wavevector k_F defined via

E_F = ℏ²k_F²/(2m)    (4.4)

and correspondingly a Fermi momentum p_F = ℏk_F and a Fermi velocity⁶

v_F = ℏk_F/m    (4.5)

Aside: Frequently people think of the Fermi energy as the energy of the most energetic occupied electron state in the system. While this is correct in the case where you are filling a continuum of states, it can also lead you to errors in cases where the energy eigenstates are discrete (see the related footnote 4 of this chapter), or more specifically when there is a gap between the most energetic occupied electron state in the system and the least energetic unoccupied electron state. More correctly, the Fermi energy, i.e., the chemical potential at T = 0, will be halfway between the most energetic occupied electron state and the least energetic unoccupied electron state.

Let us now calculate the Fermi energy in a (three-dimensional) metal with N electrons in it. At T = 0 the Fermi function (Eq. 4.1) becomes a step function (which we write as Θ; i.e., Θ(x) = 1 for x > 0 and Θ(x) = 0 for x < 0), so that Eq. 4.3 becomes

N = 2 [V/(2π)³] ∫_{|k|<k_F} dk = 2 [V/(2π)³] (4πk_F³/3)    (4.6)

so that, with n = N/V the density of electrons,

k_F = (3π²n)^{1/3}

6 Yes, Fermi got his name attached to many things. To help spread the credit around I've called this section "Basic Fermi-Dirac Statistics" instead of just "Basic Fermi Statistics".

and correspondingly

E_F = ℏ²(3π²n)^{2/3}/(2m)    (4.7)

Since we know roughly how many free electrons there are in a metal (say, one per atom for monovalent metals such as sodium or copper), we can estimate the Fermi energy, which, say for copper, turns out to be on the order of 7 eV, corresponding to a Fermi temperature of about 80,000 K(!). This amazingly high energy scale is a result of Fermi statistics and the very high density of electrons in metals. It is crucial to remember that for all metals T_F ≫ T for any temperature anywhere near room temperature. In fact metals melt (and even vaporize!) at temperatures far far below their Fermi temperatures. Similarly, one can calculate the Fermi velocity, which, for a typical metal such as copper, may be as large as 1% the speed of light! Again, this enormous velocity stems from the Pauli exclusion principle: all the lower momentum states are simply filled, so if the density of electrons is very high, the velocities will be very high as well.

With a Fermi energy that is so large, and therefore a Fermi sea that is very deep, any (not insanely large) temperature can only make excitations of electrons that are already very close to the Fermi surface (i.e., they can jump from just below the Fermi surface to just above with only a small energy increase). The electrons deep within the Fermi sea, near k = 0, cannot be moved by any reasonably low-energy perturbation simply because there are no available unfilled states for them to move to, unless they absorb a very large amount of energy.
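The copper estimates quoted above can be reproduced from Eqs. 4.5–4.7; a sketch, assuming the standard tabulated electron density of copper (one conduction electron per atom; the density value is mine, not from the text):

```python
import math

hbar = 1.054571817e-34   # J s
m_e = 9.1093837e-31      # kg
k_B = 1.380649e-23       # J/K
eV = 1.602176634e-19     # J per electron-volt

n = 8.47e28              # electron density of copper, m^-3 (tabulated value)

k_F = (3 * math.pi**2 * n) ** (1 / 3)   # Fermi wavevector, Eq. 4.6
E_F = hbar**2 * k_F**2 / (2 * m_e)      # Fermi energy, Eq. 4.7
T_F = E_F / k_B                         # Fermi temperature
v_F = hbar * k_F / m_e                  # Fermi velocity, Eq. 4.5

print(f"E_F = {E_F / eV:.1f} eV")       # ~7 eV
print(f"T_F = {T_F:.3g} K")             # ~8e4 K
print(f"v_F/c = {v_F / 3e8:.3f}")       # ~0.005, i.e. about half a percent of c
```

For copper the velocity comes out at roughly half a percent of the speed of light, consistent with the "as large as 1%" order-of-magnitude statement in the text.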

4.2 Electronic Heat Capacity

We now turn to examine the heat capacity of electrons in a metal. Analogous to Eq. 4.3, the total energy of our system of electrons is now given by

E_total = [2V/(2π)³] ∫ dk ε(k) n_F(β(ε(k) − μ)) = [2V/(2π)³] ∫₀^∞ 4πk² dk ε(k) n_F(β(ε(k) − μ))

where the chemical potential is defined as above by

N = [2V/(2π)³] ∫ dk n_F(β(ε(k) − μ)) = [2V/(2π)³] ∫₀^∞ 4πk² dk n_F(β(ε(k) − μ))

(Here we have changed to spherical coordinates to obtain a one-dimensional integral and a factor of 4πk² out front.) It is convenient to replace k in these equations by the energy ε by using Eq. 4.2, or equivalently

k = √(2mε/ℏ²)

we then have

dk = √(m/(2εℏ²)) dε

We can then rewrite these expressions as

E_total = V ∫₀^∞ dε ε g(ε) n_F(β(ε − μ))    (4.8)

N = V ∫₀^∞ dε g(ε) n_F(β(ε − μ))    (4.9)

where

g(ε) dε = [2/(2π)³] 4πk² dk = [2/(2π)³] 4π (2mε/ℏ²) √(m/(2εℏ²)) dε = [(2m)^{3/2}/(2π²ℏ³)] ε^{1/2} dε    (4.10)

is the density of states per unit volume. The definition⁷ of this quantity is such that g(ε) dε is the total number of eigenstates per unit volume (including both spin states) with energies between ε and ε + dε. From Eq. 4.7 we can simply derive (2m)^{3/2}/ℏ³ = 3π²n/E_F^{3/2}, thus we can simplify the density of states expression to

g(ε) = (3n/2E_F)(ε/E_F)^{1/2}    (4.11)

which is a fair bit simpler. Note that the density of states has dimensions of a density (an inverse volume) divided by an energy. It is clear that these are the dimensions it must have, given Eq. 4.9 for example.

Note that the expression Eq. 4.9 should be thought of as defining the chemical potential given the number of electrons in the system and the temperature. Once the chemical potential is fixed, then Eq. 4.8 gives us the total kinetic energy of the system. Differentiating that quantity would give us the heat capacity. Unfortunately there is no way to do this analytically in all generality. However, we can use to our advantage that T ≪ T_F for any reasonable temperature, so that the Fermi factors n_F are close to a step function.
Such an expansion about the step function was first used by Sommerfeld, but it is algebraically rather complicated⁸ (see Ashcroft and Mermin chapter 2 to see how it is done in detail). However, it is not hard to make an estimate of what such a calculation must give, which we shall now do. When T = 0 the Fermi function is a step function and the chemical potential is (by definition) the Fermi energy. For small T, the step function is smeared out as we see in Fig. 4.1. Note, however, that in this smearing the number of states that are removed from below the chemical potential is almost exactly the same as the number of states that are added above the chemical potential⁹. Thus, for small T, one does not have to move the chemical potential much from the Fermi energy in order to keep the number of particles fixed in Eq. 4.9. We conclude that μ ≈ E_F at any low temperature. (In fact, in more detail we find that μ(T) = E_F + O(T/T_F)², see Ashcroft and Mermin chapter 2.)
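This conclusion can also be checked numerically: solving Eq. 4.9 for μ at fixed density shows that μ barely moves from E_F when T ≪ T_F. Below is a sketch in units E_F = k_B = 1 with density normalized so n = 1; the quadrature and bisection details are my own choices, not from the text:

```python
import math

def density(mu, T, steps=20000, e_max=5.0):
    """n = integral of g(e) n_F(e) de with g(e) = (3/2) sqrt(e) (Eq. 4.11),
    evaluated by the trapezoid rule; at T = 0 this would give exactly mu^(3/2)."""
    de = e_max / steps
    total = 0.0
    for i in range(steps + 1):
        e = i * de
        nf = 1.0 / (math.exp((e - mu) / T) + 1.0)
        w = 0.5 if i in (0, steps) else 1.0
        total += w * 1.5 * math.sqrt(e) * nf * de
    return total

def solve_mu(T, lo=0.5, hi=1.5, iters=40):
    """Bisect on mu so that density(mu, T) = 1 (fixed electron number)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if density(mid, T) < 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

mu = solve_mu(T=0.01)   # T/T_F = 0.01, roughly room temperature for a metal
print(f"mu(T = 0.01 T_F) = {mu:.5f} E_F")   # stays within ~1e-4 of E_F
```

The chemical potential shifts downward only at second order in T/T_F, consistent with the Sommerfeld expansion quoted above.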

Thus we can focus on Eq. 4.8 with the assumption that µ = EF . At T = 0 let us call the kinetic energy10 of the system E(T = 0). At finite temperature, instead of a step function in Eq. 4.8 the step is smeared out as in Fig. 4.1. We see in the figure that only electrons within an energy range of roughly kB T of the Fermi surface can be excited — in general they are excited above the Fermi surface by an energy of about kBT . Thus we can approximately write

E(T) = E(T = 0) + (γ/2) [V g(E_F)(k_B T)] (k_B T) + ...

Here Vg(EF ) is the density of states near the Fermi surface (Recall g is the density of states per unit volume), so the number of particles close enough to the Fermi surface to be excited is Vg(EF )(kB T ), and the final factor of (kB T ) is roughly the amount of energy that each one gets

7 Compare the physical meaning of this definition to that of the density of states for sound waves given in Eq. 2.3 above.
8 Such a calculation requires, among other things, the evaluation of some very nasty integrals which turn out to be related to the Riemann zeta function (see section 2.4 above).
9 Since the Fermi function has a precise symmetry around μ given by n_F(β(E − μ)) = 1 − n_F(β(μ − E)), this equivalence of states removed from below the chemical potential and states inserted above would be an exact statement if the density of states in Eq. 4.9 were independent of energy.
10 In fact E(T = 0) = (3/5)N E_F, which is not too hard to show. Try showing it!

excited by. Here γ is some constant which we cannot get right by such an approximate argument (but it can be derived more carefully, and it turns out that γ = π²/3, see Ashcroft and Mermin). We can then derive the heat capacity

C = ∂E/∂T = γ V g(E_F) k_B² T

which then, using Eq. 4.11, we can rewrite as

C = γ (3Nk_B/2)(T/T_F)

The first factor in brackets is just the classical result for the heat capacity of a gas, but the final factor T/T_F is tiny (0.01 or smaller!). This is the above-promised linear-in-T term in the specific heat of electrons, which is far smaller than one would get for a classical gas.

This Sommerfeld prediction for the electronic (linear in T) contribution to the heat capacity of a metal is typically not far from being correct (the coefficient may be incorrect by factors of "order one"). A few metals, however, have specific heats that deviate from this prediction by as much as a factor of 10. Note that there are other measurements that indicate that these errors are associated with the electron mass being somehow changed in the metal. We will discover the reason for these deviations later when we study band theory (mainly in chapter 16).

Realizing now that the specific heat of the electron gas is reduced from that of the classical gas by a factor of T/T_F ≲ 0.01, we can return to re-examine some of the above Drude calculations of thermal transport. We had above found (see Eqs. 3.3–3.5) that Drude theory predicts a thermopower S = Π/T = −c_v/(3e) that is too large by a factor of 100. Now it is clear that the reason for this error was that we used in this calculation (see Eq. 3.4) the specific heat per electron for a classical gas, which is too large by roughly T_F/T ≈ 100. If we repeat the calculation using the proper specific heat, we will now get a prediction for the thermopower which is reasonably close to what is actually measured in experiment for most metals.

We also used the specific heat per particle in the Drude calculation of the thermal conductivity κ = (1/3) n c_v ⟨v⟩ λ.
In this case, the c_v that Drude used was too large by a factor of T_F/T, but on the other hand the value of ⟨v²⟩ that he used was too small by roughly the same factor (classically, one uses m⟨v²⟩/2 = k_B T, whereas for the Sommerfeld model one should use the Fermi velocity, m v_F²/2 = k_B T_F). Thus Drude's prediction for the thermal conductivity came out roughly correct (and thus the Wiedemann-Franz law correctly holds).
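The size of the suppression factor can be made concrete; a sketch comparing the Sommerfeld heat capacity per electron, C/N = γ(3k_B/2)(T/T_F) with γ = π²/3, against the classical 3k_B/2 at room temperature (the copper Fermi temperature used here is the rough value quoted earlier, not a precise figure):

```python
import math

k_B = 1.380649e-23  # J/K
T = 300.0           # K, room temperature
T_F = 8.0e4         # K, Fermi temperature of copper (order of magnitude)

gamma = math.pi**2 / 3
c_classical = 1.5 * k_B                          # per electron, classical gas
c_sommerfeld = gamma * c_classical * (T / T_F)   # per electron, Sommerfeld

ratio = c_sommerfeld / c_classical
print(f"c_el / c_classical = {ratio:.4f}")  # ~0.01: two orders of magnitude down
```

This factor of roughly 100 is exactly what is needed to repair the Drude thermopower prediction discussed above.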

4.3 Magnetic Spin Susceptibility (Pauli Paramagnetism)11

Another property we can examine about the free electron gas is its response to an externally applied magnetic field. There are several ways that the electrons can respond to the magnetic field. First, the electrons’ motion can be curved due to the Lorentz force. We have discussed this previously, and we will return to discuss it again in section 18.5 below12. Secondly, the electron

11 Part VII of this book is entirely devoted to the subject of magnetism, so it might seem a bit out of place to discuss magnetism now. However, since the calculation is an important result that hinges only on free electrons and Fermi statistics, it seems appropriate to me that it is discussed here. Most students will already be familiar with the necessary definitions of quantities such as magnetization and susceptibility, so should not be confused by this. However, for those who disagree with this strategy or are completely confused by this section, it is OK to skip over it and return after reading a bit of part VII.
12 For a free electron gas, the contribution to the magnetization from the orbital motion of the electron is known as Landau diamagnetism and takes the value χ_Landau = −(1/3)χ_Pauli. We will discuss diamagnetism

spins can flip over due to the applied magnetic field; this is the effect we will focus on. Roughly, the Hamiltonian (neglecting the Lorentz force of the magnetic field, see section 18.3 below for more detail) becomes¹³

ℋ = p²/(2m) + g μ_B B·σ

where g = 2 is the g-factor of the electron¹⁴, B is the magnetic field¹⁵, and σ is the spin of the electron, which takes eigenvalues ±1/2. Here I have defined (and will use elsewhere) the useful Bohr magneton

μ_B = eℏ/(2m_e) ≈ 0.67 (K/T) · k_B

Thus in the magnetic field the energy of an electron with spin up or down (with up meaning it points the same way as the applied field, and B = |B|) is

ε(k, ↑) = ℏ²|k|²/(2m) + μ_B B
ε(k, ↓) = ℏ²|k|²/(2m) − μ_B B

The spin magnetization of the system (moment per unit volume) in the direction of the applied magnetic field will then be

M = −(1/V) dE/dB = −([# up spins] − [# down spins]) μ_B / V    (4.12)

So when the magnetic field is applied, it is lower energy for the spins to be pointing down, so more of them will point down. Thus a magnetization develops in the same direction as the applied magnetic field. This is known as Pauli paramagnetism. Here paramagnetism means that the magnetization is in the direction of the applied magnetic field; Pauli paramagnetism refers in particular to the spin magnetization of the free electron gas. (We will discuss paramagnetism in more detail in chapter 18.)

Let us now calculate the Pauli paramagnetism of the free electron gas at T = 0. With zero magnetic field applied, both the spin-up and spin-down states are filled up to the Fermi energy (i.e., to the Fermi wavevector). Near the Fermi level the density of states per unit volume for spin-up electrons is g(E_F)/2, and similarly the density of states per unit volume for spin-down electrons is g(E_F)/2. When B is applied, the spin-ups will be more costly by an energy μ_B B. Thus, (assuming that the chemical potential does not change) we will have (g(E_F)/2)μ_B B fewer spin-up electrons

more in chapter 18 below. Unfortunately, calculating this diamagnetism is relatively tricky (see Peierls' book for example). This effect is named after the famous Russian Nobel laureate Lev Landau, who kept a now-famous ranking of how smart various physicists were, ranked on a logarithmic scale. Einstein was on top with a ranking of 0.5. Bose, Wigner, and Newton all received a ranking of 1.
Schroedinger, Heisenberg, Bohr, and Dirac were ranked 2, and Landau modestly ranked himself a 2.5, but after winning the Nobel prize raised himself to 2. He said that anyone ranked below 4 was not worth talking to.
13 The sign of the last term, the so-called Zeeman coupling, may be a bit confusing. Recall that because the electron charge is negative, the electron dipole moment is actually opposite the direction of the electron spin (the current is rotating opposite the direction that the electron is spinning). Thus spins are lower energy when they are anti-aligned with the magnetic field! This is yet another annoyance caused by Benjamin Franklin, who declared that the charge left on a glass rod when rubbed with silk is positive.
14 It is yet another constant source of grief that the letter "g" is used both for the density of states and for the g-factor of the electron. To avoid confusion we immediately set the g-factor to 2, and henceforth in this chapter g is reserved for the density of states. A similar grief is that we now have to write ℋ for the Hamiltonian, because H = B/μ₀ (with μ₀ the permeability of free space) is frequently used for the magnetic field.
15 One should be careful to use the magnetic field seen by the actual electrons; this may be different from the magnetic field applied to the sample if the sample itself develops a magnetization.

per unit volume. Similarly, the spin-downs will be less costly by the same amount, so we will have (g(E_F)/2)μ_B B more spin-downs per unit volume. Note that the total number of electrons in the system did not change, so our assumption that the chemical potential did not change is correct. (Recall that the chemical potential is always adjusted so it gives the right total number of electrons in the system.) This process is depicted in Figure 4.2.

Figure 4.2: Left: Before the magnetic field is applied the density of states for spin-up and spin-down are the same, g↑(E) = g↓(E) = g(E)/2. Note that these functions are proportional to E^{1/2} (see Eq. 4.11), hence the shape of the curves, and the shaded region indicates the states that are filled. Right: When the magnetic field is applied, the states with up and down spin are shifted in energy by +μ_B B and −μ_B B respectively, as shown. Hence up spins pushed above the Fermi energy can lower their energies by flipping over to become down spins. The number of spins that flip (the area of the approximately rectangular sliver) is roughly g↑(E_F) μ_B B.

Using Eq. 4.12, given that we have moved g(E_F)μ_B B/2 up spins to down spins, the magnetization (magnetic moment per unit volume) is given by

M = g(E_F) μ_B² B

and hence the magnetic susceptibility χ = ∂M/∂H is given (at T = 0) by¹⁶

χ_Pauli = dM/dH = μ₀ dM/dB = μ₀ μ_B² g(E_F)

with μ₀ the permeability of free space. In fact this result is not far from correct for simple metals such as Li, Cu, or Na.
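The formula can be evaluated for a real metal; a sketch for sodium, assuming the standard tabulated free-electron density (a value of mine, not from the text), with g(E_F) = 3n/(2E_F) taken from Eq. 4.11:

```python
import math

hbar = 1.054571817e-34   # J s
m_e = 9.1093837e-31      # kg
mu_0 = 4e-7 * math.pi    # permeability of free space, T m/A
mu_B = 9.2740100783e-24  # Bohr magneton, J/T

n = 2.65e28              # free-electron density of sodium, m^-3 (tabulated)
E_F = hbar**2 * (3 * math.pi**2 * n) ** (2 / 3) / (2 * m_e)   # Eq. 4.7
g_EF = 3 * n / (2 * E_F)          # density of states at E_F, per J per m^3
chi = mu_0 * mu_B**2 * g_EF       # Pauli susceptibility, dimensionless (SI)

print(f"chi_Pauli(Na) = {chi:.2e}")  # ~1e-5
```

The result is of order 10⁻⁵, which is indeed close to the measured paramagnetic susceptibility of sodium.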

16 See also the very closely related derivation given in section 22.1.2 below.

4.4 Why Drude Theory Works so Well

In retrospect we can understand a bit more about why Drude theory was so successful. As mentioned above, we now realize that because of Fermi statistics, treating electrons as a classical gas is incorrect, resulting in a huge overestimation of the heat capacity per particle and a huge underestimation of the typical velocity of particles. As described above, these two errors can sometimes cancel, giving reasonable results nonetheless.

However, we can also ask why it is that Drude was successful in the calculation of transport properties such as the conductivity and the Hall coefficient. In these calculations neither the velocity of the particle nor the specific heat enters. But still, the idea that a single particle will accelerate freely for some amount of time, then will scatter back to zero momentum, seems like it must be wrong, since the state at zero momentum is always fully occupied. The transport equation (Eq. 3.1) that we solve in the Drude theory,

dp/dt = F − p/τ    (4.13)

describes the motion of each particle. However, we can just as well use the same equation to describe the motion of the center of mass of the entire Fermi sea! On the left of Fig. 4.3 we have a picture of a Fermi sphere of radius k_F. The typical electron has a very large velocity, on the order of the Fermi velocity v_F, but the average of all of the (vector) velocities is zero. When an electric field is applied (in the ŷ direction as shown on the right of Fig. 4.3, so that the force on the electrons is in the −ŷ direction since the charge of the electron is −e), every electron in the system accelerates together in the −ŷ direction, and the center of the Fermi sea shifts. The shifted Fermi sea has some nonzero average velocity, known as the drift velocity v_drift.
Since the kinetic energy of the shifted Fermi sea is higher than that of the Fermi sea with zero average velocity, the electrons will try to scatter back (with scattering rate 1/τ) to lower their kinetic energy and shift the Fermi sea back to its original configuration with zero drift velocity. We can then view the Drude transport equation (Eq. 4.13) as describing the motion of the average velocity (momentum) of the entire Fermi sea. One can also think about how this scattering actually occurs in the Sommerfeld model. Here, most electrons have nowhere to scatter to, since all of the available k states with lower energy (lower |k|) are already filled. However, the few electrons near the Fermi surface, in the thin crescent between the shifted and unshifted Fermi seas, can scatter into the thin unfilled crescent on the other side of the shifted Fermi sea to lower their energies (see Fig. 4.3). Although these scattering processes happen only to a very few of the electrons, the scattering events are extremely violent, in that the change in momentum is exceedingly large (scattering all the way across the Fermi sea¹⁷).
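The claim that the shift of the Fermi sea is tiny can be quantified; a sketch comparing the steady-state drift velocity from Eq. 4.13, v_drift = eEτ/m, with the Fermi velocity (the field strength and scattering time are assumed typical values, not from the text):

```python
e = 1.602176634e-19   # C
m_e = 9.1093837e-31   # kg
tau = 2.5e-14         # s, typical metallic scattering time (assumed)
E_field = 1.0         # V/m, assumed applied field
v_F = 1.57e6          # m/s, copper Fermi velocity, from Eq. 4.5

v_drift = e * E_field * tau / m_e   # steady state of dp/dt = F - p/tau
print(f"v_drift = {v_drift:.2e} m/s")          # millimetres per second
print(f"v_drift / v_F = {v_drift / v_F:.1e}")  # a few times 1e-9
```

The drift velocity is smaller than the Fermi velocity by roughly nine orders of magnitude, so the shifted Fermi sphere in Fig. 4.3 is displaced by only a minuscule fraction of its radius.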

4.5 Shortcomings of the Free Electron Model

Although the Sommerfeld (Free Electron) Model of a metal explains quite a bit about metals, it remains incomplete. Here are some items that are not well explained within Sommerfeld theory:

• Having discovered now that the typical velocity of electrons, v_F, is extremely large, and being able to measure the scattering time τ, we obtain a scattering length λ = v_F τ that may be 100 Angstroms or more. One might wonder: if there are atoms every few angstroms in a

17 Actually, it may be that many small scatterings, walking around the edge of these crescents, make up this one effective scattering event.


Figure 4.3: Drift velocity and Fermi velocity. The drift momentum is the displacement of the entire Fermi sphere (which is generally very, very small), whereas the Fermi momentum is the radius of the Fermi sphere, which can be very large. Drude theory makes sense if you think of it as a transport equation for the center of mass of the entire Fermi sphere, i.e., it describes the drift velocity. Scattering of electrons only occurs between the thin crescents that are the difference between the shifted and unshifted Fermi spheres.

metal, why do the electrons not scatter from these atoms? (We will discuss this in chapter 14 below; the resolution is a result of Bloch's theorem.)

• Many of our results depend on the number of electrons in a metal. In order to calculate this number we have always used the chemical valence of the atom. (For example, we assume one free electron per Li atom.) However, in fact, except for hydrogen, there are actually many electrons per atom. Why do the core electrons not "count" when calculating the Fermi energy or velocity? What about insulators, where no electrons are free?

• We have still not resolved the question of why the Hall effect sometimes comes out with the incorrect sign, as if the charge carrier were positive rather than negative (the sign of the electron's charge).

• In optical spectra of metals there are frequently many features (higher absorption at some frequencies, lower absorption at others). These features give metals their characteristic colors (for example, they make gold yellowish). The Sommerfeld model does not explain these features at all.

• The measured specific heat of electrons is much more correct than in Drude theory, but for

some metals it is still off by factors as large as 10. Measurements of the mass of the electron in a metal also sometimes give answers that differ from the actual mass of the electron by similar factors.

• Magnetism: Some metals, such as iron, are magnetic even without any applied external magnetic field. We will discuss magnetism in part VII below.

• Electron interaction: We have treated the electrons as noninteracting fermions. In fact, the typical energy of interaction for electrons, e²/(4πε₀r) with r the typical distance between electrons, is huge, roughly the same scale as the Fermi energy. Yet we have ignored the Coulomb interaction between electrons completely. Understanding why this works is an extremely hard problem that was only understood starting in the late 1950s, again due to the brilliance of Lev Landau (see footnote 12 above in this chapter about Landau). The theory that explains this is frequently known as "Landau Fermi Liquid Theory", but we will not study it in this course.

With the exception of the final two points (Magnetism and Electron interaction) all of these issues will be resolved once we study electronic band structure in chapters 10, 14 and particularly 16 below. In short, we are not taking seriously the periodic structure of atoms in materials.

4.6 Summary of (Sommerfeld) Free Electron Theory

• Treats properly the fact that electrons are fermions.
• High density of electrons results in extremely high Fermi energy and Fermi velocity. Thermal and electric excitations are small redistributions of electrons around the Fermi surface.
• Compared to Drude theory, obtains an electron velocity ~100 times larger, but a heat capacity per electron ~100 times smaller. Leaves the Wiedemann-Franz ratio roughly unchanged from Drude, but fixes problems in predictions of thermal properties. Drude transport equations make sense if one considers velocities to be drift velocities, not individual electron velocities.
• Specific heat and (Pauli) paramagnetic susceptibility can be calculated explicitly (know these derivations!) in good agreement with experiment.

References

For free electron (Sommerfeld) theory, good references are:
• Ashcroft and Mermin, chapters 2–3
• Singleton, sections 1.5–1.6
• Rosenberg, sections 7.1–7.9
• Ibach and Luth, sections 6–6.5
• Kittel, chapter 6
• Burns, chapter 9B (excluding 9.14 and 9.16)

Part II

Putting Materials Together


Chapter 5

What Holds Solids Together: Chemical Bonding

In chapter 2 we found that the Debye model gave a reasonably good description of the specific heat of solids. However, we also found a number of shortcomings of the theory. These shortcomings basically stemmed from not taking seriously the fact that solids are actually made up of individual atoms assembled in a periodic structure. Similarly in chapter 4 we found that the Sommerfeld model of metals described quite a bit about metals, but had a number of shortcomings as well — many of these were similarly due to not realizing that the solids are made up of individual atoms assembled in periodic structures. As such, a large amount of this book will actually be devoted to understanding the effects of these individual atoms and their periodic arrangement on the electrons and on the vibrations of the solid. However, first it is worth backing up and asking ourselves why atoms stick together to form solids in the first place!

5.1 General Considerations about Bonding

To determine why atoms stick together to form solids, we are in some sense trying to describe the solution to a many particle Schroedinger1 equation describing the many electrons and many nuclei in a solid. We can at least write down the equation

HΨ= EΨ where Ψ is the wavefunction describing the positions and spin states of all the electrons and nuclei in the system. The terms in the Hamiltonian include a kinetic term (with inputs of the electron

1Erwin Schroedinger was a fellow at Magdalen College Oxford from 1933 to 1938, but he was made to feel not very welcome there because he had a rather “unusual” personal life — he lived with both his wife, Anny, and with his mistress, Hilde, who, although married to another man, bore Schroedinger’s child, Ruth. After Oxford, Schroedinger was coaxed to live in Ireland with the understanding that this unusual arrangement would be fully tolerated. Surprisingly, all of the parties involved seemed fairly content until 1946 after Schroedinger fathered two more children with two different Irish women, whereupon Hilde decided to take Ruth back to Austria to live with her lawful husband. Anny, entirely unperturbed by this development and having her own lovers as well, remained Erwin’s close companion until his death.

and nucleon mass) as well as a Coulomb interaction term between all the electrons and nuclei.2 While this type of description of chemical bonding is certainly true, it is also mostly useless. No one ever even tries to solve the Schroedinger equation for more than a few particles at a time. Trying to solve it for 10²³ electrons simultaneously is completely absurd. One must instead extract useful information about the behavior from simplified models in order to obtain a qualitative understanding. (This is a great example of what I was ranting about in chapter 1 — reductionism does not work: saying that the Schroedinger equation is the whole solution is misguided). More sophisticated techniques then try to turn these qualitative understandings into quantitative predictions. In fact, what we are trying to do here is to understand a whole lot of chemistry from the point of view of a physicist. If you have had a good chemistry course, much of this chapter may sound familiar. However, here we will try to understand chemistry using our knowledge of quantum mechanics. Instead of learning empirical chemistry rules, we will look at simplified models that show roughly how these rules arise. However, at the end of the day we cannot trust our simplified models too much, and we really should learn more chemistry to try to decide if yttrium really will form a carbonate salt or some similar question.

Figure 5.1: The periodic table of the elements.

From a chemist’s point of view one frequently thinks about different types of chemical bonds depending on the types of atoms involved, and in particular on the atom’s position on the periodic table (especially on the atom’s electronegativity — its tendency to attract electrons). Below we will discuss Ionic Bonds, Covalent Bonds, van der Waals (fluctuating dipole, or molecular) bonds, Metallic Bonds, and Hydrogen Bonds. Of course, they are all different aspects of the Schroedinger equation, and any given material may exhibit aspects of several of these types of bonding. Nonetheless, qualitatively it is quite useful to discuss these different types of bonds to give us intuition about how chemical bonding can occur. A brief description of the many types of bonding and their properties is shown in table 5.1. Note that this table should be considered just as rules-of-thumb, as many materials have properties intermediate between the categories listed.

2To have a fully functioning “Theory of Everything” as far as all of chemistry, biology, and most of everything else that matters to us (besides the sun and atomic energy) is concerned, one needs only the Coulomb interaction plus the kinetic term in the Hamiltonian, plus spin-orbit coupling (relativistic effects) for some of the heavy atoms.

Ionic: An electron is transferred from one atom to another, and the resulting ions attract each other. Typical of binary compounds made of constituents with very different electronegativity (e.g., group I-VII compounds such as NaCl). Properties: hard, very brittle; high melting temperature; electrical insulator; water soluble.

Covalent: An electron is shared equally between two atoms, forming a bond; energy is lowered by delocalization of the wavefunction. Typical of compounds made of constituents with similar electronegativities (e.g., III-V compounds such as GaAs), or solids made of one element only, such as diamond (C). Properties: very hard (brittle); high melting temperature; electrical insulators or semiconductors.

Metallic: Electrons are delocalized throughout the solid, forming a glue between the positive ions. Typical of metals; left and middle of the periodic table. Properties: ductile, malleable (due to the non-directional nature of the bond; can be hardened by preventing dislocation motion with impurities); lower melting temperature; good electrical and thermal conductors.

Molecular (van der Waals, or fluctuating dipole): No transfer of electrons. Dipole moments on constituents align to cause attraction; bonding strength increases with the size of the molecule or the polarity of the constituent. Typical of noble gas solids and solids made of non-polar (or slightly polar) molecules binding to each other (wax). Properties: soft, weak; low melting temperature; electrical insulators.

Hydrogen: Involves a hydrogen ion bound to one atom but still attracted to another; a special case because H is so small. Typical of, and important in, organic and biological materials. Properties: weak bond (though stronger than van der Waals); important for maintaining the shape of DNA.

Table 5.1: Types of Bonds in Solids. This table should be thought of as providing rough rules. Many materials show characteristics intermediate between two (or more!) classes.

In this section we will try to be a bit more quantitative about how some of these types of bonding come about. Remember, underneath it is all the Schroedinger equation and the Coulomb interaction between electrons and nuclei that is holding materials together!

5.2 Ionic Bonds

The general idea of an ionic bond is that, for certain compounds (for example, binary compounds, such as NaCl, made of one element in group I and one element in group VII), it is energetically favorable for an electron to be physically transferred from one atom to the other, leaving two oppositely charged ions which then attract each other. One writes a chemical “reaction” of the form

Na + Cl → Na+ + Cl− → NaCl

To find out if such a reaction happens, one must look at the energetics associated with the transfer of the electron. At least in principle, it is not too hard to imagine solving the Schroedinger equation3 for a single atom and determining the energy of the neutral atom, of the positive ion, and of the negative ion, or actually measuring these energies for individual atoms with some sort of spectroscopy. We define:

Ionization Energy = Energy required to remove one electron from a neutral atom to create a positive ion

Electron Affinity = Energy gain for creating a negative ion from a neutral atom by adding an electron

To be precise, in both cases we are comparing the energy of having an electron either at position infinity or on the atom. Further, if we are removing or adding only a single electron, then these are called the first ionization energy and first electron affinity respectively (one can similarly define energies for removing or adding two electrons, which would be called second). Finally, we note that chemists typically work with systems at fixed (room) temperature and (atmospheric) pressure, in which case they are likely to be more concerned with Gibbs free energies rather than pure energies. We will always assume that one is using the appropriate free energy for the experiment in question (and we will be sloppy and always call it an energy E). Ionization energy is smallest on the left (groups I and II) of the periodic table and largest on the right (groups VII and VIII). To a lesser extent the ionization energy also tends to decrease towards the bottom of the periodic table. Similarly, electron affinity is also largest on the right and top of the periodic table (not including the group VIII noble gases, which roughly do not attract electrons measurably at all). The total energy change from transferring an electron from atom A to atom B is

∆E_{A+B→A+ + B−} = (Ionization Energy)_A − (Electron Affinity)_B

3As emphasized in chapter 1, even the world’s largest computers cannot solve the Schroedinger equation for a system of more than a few electrons. Nobel prizes (in chemistry) were awarded to Walter Kohn and John Pople for developing computational methods that can obtain highly accurate approximations. These approaches have formed much of the basis of modern quantum chemistry.


Figure 5.2: Pictorial Tables of First Ionization Energies (left) and First Electron Affinities (right). The word “First” here means that we are measuring the energy to lose or gain a first electron starting with a neutral atom. The linear size of each box represents the magnitude of the energy (the scales on the two plots differ). For reference, the largest ionization energy is that of helium, at roughly 24.58 eV per atom; the lowest is that of caesium at 3.89 eV. The largest electron affinity is that of chlorine, which gains 3.62 eV when binding an additional electron. The few light green colored boxes are atoms that have negative electron affinities.

Note carefully the sign. The ionization energy is a positive energy that must be put in, the electron affinity is an energy that comes out. However this ∆E is the energy to transfer an electron between two atoms very far apart. In addition, there is also4

Cohesive Energy = Energy gain from A+ + B− → AB

This cohesive energy is mostly a classical effect of the Coulomb interaction between the ions as one lets the ions come close together.5 The total energy gain for forming a molecule from the two individual atoms is thus given by

∆E_{A+B→AB} = (Ionization Energy)_A − (Electron Affinity)_B − (Cohesive Energy of A-B)

One obtains an ionic bond if the total ∆E for this process is less than zero. In order to determine whether an electron is likely to be transferred from one atom to another, it is convenient to use a so-called electronegativity, which roughly describes how much an atom “wants” electrons, or how much an atom attracts electrons to itself. While there are several definitions in use, a simple one is the (Mulliken) electronegativity6,7, defined as

4The term “Cohesive Energy” can be ambiguous since sometimes people use it to mean the energy to put two ions together into a compound, and other times they mean it to be the energy to put two neutral atoms together! Here we mean the former. 5One can write a simple classical equation for a total cohesive energy for a solid

E_cohesive = − Σ_{i&lt;j} Q_i Q_j / (4πε₀ |r_i − r_j|)

(Mulliken) Electronegativity = [(Electron Affinity) + (Ionization Energy)] / 2

The electronegativity is extremely large for elements in the upper right of the periodic table (not including the noble gases). In bonding, the electron is always transferred from the atom of lower electronegativity to the atom of higher electronegativity. The greater the difference in electronegativities between two atoms, the more completely the electron is transferred from one atom to the other. If the difference in electronegativities is small, then the electron is only partially transferred from one atom to the other. We will see below that one can have covalent bonding even between two identical atoms, where there is no difference in electronegativities, and therefore no net transfer of electrons. Before leaving the topic of ionic bonds, it is worth discussing some of the typical physics of ionic solids. First of all, the materials are typically hard, as the Coulomb interaction between oppositely charged ions is strong. However, since water is extremely polar, it can dissolve an ionic solid. This happens (see Fig. 5.3) by arranging the water molecules such that the negative side of the molecule is close to the positive ions and the positive side of the molecule is close to the negative ions.


Figure 5.3: Salt, NaCl, dissolved in water. Ionic compounds typically dissolve easily in water since the polar water molecules can screen the highly charged, but otherwise stable, ions.
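As a rough numerical illustration of the energy balance above, the sketch below (Python) plugs numbers into ∆E = (Ionization Energy)_A − (Electron Affinity)_B − (Cohesive Energy), treating the cohesive energy as the classical Coulomb attraction of two point charges. Only the chlorine affinity of 3.62 eV is quoted in the text; the sodium ionization energy and the Na-Cl separation used here are assumed illustrative values, so treat the result as a sanity check of the sign, not a precision calculation.

```python
import math

E_CHARGE = 1.602176634e-19  # elementary charge, C
EPS0 = 8.8541878128e-12     # vacuum permittivity, F/m

def coulomb_cohesive_energy_eV(separation_m):
    """Classical Coulomb energy gain (in eV) for bringing point charges
    +e and -e from infinity to the given separation."""
    joules = E_CHARGE**2 / (4 * math.pi * EPS0 * separation_m)
    return joules / E_CHARGE

def delta_E_ionic_eV(ionization_A, affinity_B, cohesive):
    """Total energy change for A + B -> AB; an ionic bond forms if negative."""
    return ionization_A - affinity_B - cohesive

# Illustrative: Na + Cl -> NaCl molecule.
IE_Na = 5.14       # eV, first ionization energy of Na (assumed reference value)
EA_Cl = 3.62       # eV, electron affinity of Cl (quoted in the text)
d_NaCl = 2.36e-10  # m, approximate Na-Cl bond length (assumed value)

cohesive = coulomb_cohesive_energy_eV(d_NaCl)
dE = delta_E_ionic_eV(IE_Na, EA_Cl, cohesive)
print(f"cohesive ~ {cohesive:.2f} eV, Delta E ~ {dE:.2f} eV")
# Delta E comes out negative: the electron transfer plus Coulomb attraction
# more than pays for the ionization cost, so the ionic bond is favorable.
```

Note that the point-charge model overestimates the binding somewhat, since at the actual bond length the electron clouds overlap and repel.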

6This electronegativity can be thought of as approximately the negative of the chemical potential, via

(1/2)(E_affinity + E_ionization) = (1/2)([E_N − E_{N+1}] + [E_{N−1} − E_N]) = (E_{N−1} − E_{N+1})/2 ≈ −∂E/∂N ≈ −µ.

See however the comments in section 4.1 on defining a chemical potential for systems with discrete energy levels and a discrete number of electrons.

7Both Robert Mulliken and Linus Pauling won Nobel Prizes in Chemistry for their work understanding chemical bonding, including the concept of electronegativity. Pauling won a second Nobel Prize, in Peace, for his work towards banning nuclear weapons testing. (Only four people have ever won two Nobels: Marie Curie, Linus Pauling, John Bardeen, and Frederick Sanger. We should all know these names!) Pauling was criticized later in his life for promoting high doses of vitamin C to prevent cancer and other ailments, sometimes apparently despite scientific evidence to the contrary.

5.3 Covalent Bond

Roughly, a covalent bond is a bond where electrons are shared equally between two atoms. There are several pictures that can be used to describe the covalent bond.

5.3.1 Particle in a Box Picture

Let us model a hydrogen atom as a box of size L for an electron (for simplicity, let us think about a one dimensional system). The energy of a single electron in a box is (I hope this looks familiar!)

E = ℏ²π²/(2mL²)

Now suppose two such atoms come close together. An electron that is shared between the two atoms can now be delocalized over the positions of both atoms, thus it is in a box of size 2L and has lower energy

E = ℏ²π²/(2m(2L)²)

This reduction in energy that occurs by delocalizing the electron is the driving force for forming the bond. The new ground state orbital is known as a bonding orbital. If each atom starts with a single electron (i.e., it is a hydrogen atom), then when the two atoms come together to form a lower energy (bonding) orbital, both electrons can go into this same ground state orbital, since they can take opposite spin states. Of course, the reduction in energy of the two electrons must compete against the Coulomb repulsion between the two nuclei, and the Coulomb repulsion of the two electrons with each other, which is a much more complicated calculation. Now suppose we had started with two helium atoms, where each atom has two electrons. When the two atoms come together there is not enough room in the single ground state wavefunction, so two of the four electrons must occupy the first excited orbital — which in this case turns out to have exactly the same electronic energy as the original ground state orbital of the original atoms. Since no energy is gained by these electrons when the two atoms come together, such orbitals are known as antibonding orbitals. (In fact it requires energy to push the two atoms together if one includes the Coulomb repulsion between the nuclei.)
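The energy gain from delocalization can be checked with a one-line calculation. The sketch below (Python; the 1 Å box size is an illustrative choice, not a real atomic radius) evaluates E = ℏ²π²/(2mL²) and confirms that doubling the box length quarters the ground-state energy, which is the gain per electron in this toy picture.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant, J s
M_E = 9.1093837015e-31  # electron mass, kg
EV = 1.602176634e-19    # J per eV

def box_ground_energy_eV(L):
    """Ground-state energy E = hbar^2 pi^2 / (2 m L^2) of an electron
    in a 1D box of length L (meters), in eV."""
    return (HBAR * math.pi)**2 / (2 * M_E * L**2) / EV

L = 1e-10  # a 1 Angstrom "atom" (illustrative size only)
E_single = box_ground_energy_eV(L)      # isolated atom
E_double = box_ground_energy_eV(2 * L)  # two atoms pushed together

# Each of the two (opposite-spin) electrons gains this much by delocalizing:
gain_per_electron = E_single - E_double
print(f"E(L) = {E_single:.2f} eV, E(2L) = {E_double:.2f} eV")
print(f"E(2L)/E(L) = {E_double / E_single:.2f}")  # exactly 1/4, since E ~ 1/L^2
```

This gain must then compete with the nucleus-nucleus and electron-electron Coulomb repulsions, which the box model ignores.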

5.3.2 Molecular Orbital or Tight Binding Theory

In this section we make slightly more quantitative some of the ideas of the previous section. Let us write a Hamiltonian for two hydrogen atoms. Since the nuclei are heavy compared to the electrons, we will fix the nuclear positions and solve the Schroedinger equation for the electrons as a function of the distance between the nuclei. This fixing of the position of the nuclei is known as a



Figure 5.4: Particle in a box picture of covalent bonding. Two separated hydrogen atoms are like two different boxes, each with one electron in the lowest eigenstate. When the two boxes are pushed together, one obtains a larger box — thereby lowering the energy of the lowest eigenstate — which is known as the bonding orbital. The two electrons can take opposite spin states and can thereby both fit in the bonding orbital. The first excited state is known as the antibonding orbital.

“Born-Oppenheimer” approximation8,9. We hope to calculate the eigenenergies of the system as a function of the distance between the positively charged nuclei. For simplicity, let us consider a single electron and two identical positive nuclei. We write the Hamiltonian as

H = K + V₁ + V₂

with

K = p²/2m

being the kinetic energy of the electron and

V_i = −e²/(4πε₀ |r − R_i|)

being the Coulomb interaction energy between the electron at position r and the nucleus at position R_i. Generally this type of Schroedinger equation is hard to solve exactly. (In fact it can be solved exactly in this case, but it is not particularly enlightening to do so.) Instead, we will attempt a variational solution. Let us write a trial wavefunction as

|ψ⟩ = φ₁|1⟩ + φ₂|2⟩    (5.1)

8Max Born (also the same guy from the Born-von Karman boundary conditions) was one of the founders of quantum physics, winning a Nobel Prize in 1954. His daughter, and biographer, Irene, married into the Newton-John family, and had a daughter named Olivia, who became a pop icon and film star in the 1970s. Her most famous role was in the movie Grease, playing Sandra-Dee opposite John Travolta. When I was a kid, she was every teenage guy’s dream-girl (her, or Farrah Fawcett).

9J. Robert Oppenheimer later became the head scientific manager of the American atomic bomb project during the second world war. After this giant scientific and military triumph, he pushed for control of nuclear weapons, leading to his being accused of being a communist sympathizer during the “Red” scares of the 1950s, and he ended up having his security clearance revoked.


Figure 5.5: Molecular Orbital Picture of Bonding. In this type of picture, on the far left and far right are the orbital energies of the individual atoms well separated from each other. In the middle are the orbital energies when the atoms come together to form a molecule. Top: Two hydrogen atoms come together to form an H2 molecule. As mentioned above in the particle-in-a-box picture, the lowest energy eigenstate is reduced in energy when the atoms come together, and both electrons go into this bonding orbital. Middle: In the case of helium, since there are two electrons per atom, the bonding orbitals are filled, and the antibonding orbitals must be filled as well. The total energy is not reduced by the two helium atoms coming together (thus helium does not form He2). Bottom: In the case of LiF, the energies of the lithium and the fluorine orbitals are different. As a result, the bonding orbital is mostly composed of the orbital on the F atom, meaning that the bonding electrons are mostly transferred from Li to F, forming a more ionic bond.

where φ_i are complex coefficients, and the kets |1⟩ and |2⟩ are known as “atomic orbitals” or “tight binding” orbitals10. The form of Eq. 5.1 is frequently known as a “linear combination of atomic orbitals”, or LCAO11. The orbitals which we use here can be taken as the ground state solutions of the Schroedinger equation when there is only one nucleus present. I.e.

(K + V₁)|1⟩ = ε₀|1⟩
(K + V₂)|2⟩ = ε₀|2⟩    (5.2)

where ε₀ is the ground state energy of the single atom12. I.e., |1⟩ is a ground state orbital on nucleus 1 and |2⟩ is a ground state orbital on nucleus 2.

10The term “tight binding” is from the idea that an electron is tightly bound to its nucleus.

11The LCAO approach can be improved systematically by using more orbitals and more variational coefficients, which can then be optimized with the help of a computer. This general idea formed the basis of the quantum chemistry work of John Pople. See footnote 3 above in this section.

12Here ε₀ is not a dielectric constant or the permittivity of free space, but rather the energy of an electron in an orbital. (At some point we just run out of new symbols to use for new quantities!)

For simplicity, we will now make the rough approximation that |1⟩ and |2⟩ are orthogonal, so we can then choose a normalization such that

⟨i|j⟩ = δ_ij    (5.3)

When the two nuclei get very close together, this orthogonality is clearly no longer even close to correct. We then have to decide: either we keep our definition of the atomic orbitals being the solutions to the Schroedinger equation for a single nucleus, but we give up on the two atomic orbitals being orthogonal; or we give up on the orbitals being solutions to the Schroedinger equation for a single nucleus, but we keep orthonormality. It is a good exercise to consider what happens when we give up orthonormality, but fortunately most of what we learn does not depend too much on whether the orbitals are orthogonal or not, so for simplicity we will assume orthonormal orbitals. An effective Schroedinger equation can be written down for our variational wavefunction which (unsurprisingly) takes the form of an eigenvalue problem13

Σ_j H_ij φ_j = E φ_i

where

H_ij = ⟨i|H|j⟩

is a two by two matrix in this case. (The equation generalizes in the obvious way to the case where there are more than 2 orbitals.) Recalling our definition of |1⟩ as being the ground state of K + V₁, we can write14

H₁₁ = ⟨1|H|1⟩ = ⟨1|K + V₁|1⟩ + ⟨1|V₂|1⟩ = ε₀ + V_cross    (5.4)
H₂₂ = ⟨2|H|2⟩ = ⟨2|K + V₂|2⟩ + ⟨2|V₁|2⟩ = ε₀ + V_cross    (5.5)
H₁₂ = ⟨1|H|2⟩ = ⟨1|K + V₂|2⟩ + ⟨1|V₁|2⟩ = 0 − t    (5.6)
H₂₁ = ⟨2|H|1⟩ = ⟨2|K + V₁|1⟩ + ⟨2|V₂|1⟩ = 0 − t*    (5.7)

In the first two lines

V_cross = ⟨1|V₂|1⟩ = ⟨2|V₁|2⟩

is the Coulomb potential felt by orbital |1⟩ due to nucleus 2, or equivalently the Coulomb potential felt by orbital |2⟩ due to nucleus 1. In the second two lines (Eqs. 5.6 and 5.7) we have also defined the so-called hopping term15,16

t = −⟨1|V₂|2⟩ = −⟨1|V₁|2⟩

13To derive this eigenvalue equation we start with an expression for the energy

E = ⟨ψ|H|ψ⟩ / ⟨ψ|ψ⟩

then, with ψ written in the variational form of Eq. 5.1, we minimize the energy by setting ∂E/∂φ_i = ∂E/∂φ_i* = 0.

14In atomic physics courses, the quantities V_cross and t are often called the direct and exchange terms and are sometimes denoted J and K. We avoid this terminology because the same words are almost always used to describe 2-electron interactions in condensed matter.

15The minus sign is a convention for the definition of t. For many cases of interest, this definition makes t positive, although it can actually have either sign depending on the structure of the orbitals in question and the details of the potential.

16The second equality here can be obtained by rewriting H₁₂ = ⟨1|K + V₁|2⟩ + ⟨1|V₂|2⟩.

The reason for the name “hopping” will become clear below. Note that in the second two lines (Eqs. 5.6 and 5.7) the first term vanishes because of the orthogonality of |1⟩ and |2⟩. Thus our Schroedinger equation is reduced to a two by two matrix equation of the form

( ε₀ + V_cross        −t        ) ( φ₁ )  =  E ( φ₁ )    (5.8)
(     −t*        ε₀ + V_cross   ) ( φ₂ )       ( φ₂ )

The interpretation of this equation is roughly that orbitals |1⟩ and |2⟩ both have energy ε₀, which is shifted by V_cross due to the presence of the other nucleus. In addition, the electron can “hop” from one orbital to the other through the off-diagonal t term. To understand this interpretation more fully, we realize that in the time dependent Schroedinger equation, if the matrix were diagonal, a wavefunction that started completely in orbital |1⟩ would stay on that orbital for all time. With the off-diagonal term, however, the time dependent wavefunction can oscillate between the two orbitals.

Diagonalizing this two-by-two matrix we obtain the eigenenergies

E± = ε₀ + V_cross ± |t|

The lower energy orbital is the bonding orbital, whereas the higher energy orbital is the antibonding orbital. The corresponding wavefunctions are

ψ_bonding = (φ₁ ± φ₂)/√2    (5.9)
ψ_antibonding = (φ₁ ∓ φ₂)/√2    (5.10)

I.e., these are the symmetric and antisymmetric superpositions of orbitals. The signs ± and ∓ depend on the sign of t; the lower energy combination is always called the bonding orbital and the higher energy one is called antibonding. To be precise, t > 0 makes (φ₁ + φ₂)/√2 the lower energy bonding orbital. Roughly one can think of these two wavefunctions as being the lowest two “particle-in-a-box” orbitals: the lowest energy wavefunction does not change sign as a function of position, whereas the first excited state changes sign once, i.e., it has a single node (for the case of t > 0 the analogy is precise). It is worth briefly considering what happens if the two nuclei being bonded together are not identical. In this case the energy ε₀ for an electron to sit on orbital 1 would be different from that of orbital 2 (see the bottom of Fig. 5.5). The matrix equation 5.8 would no longer have equal entries along the diagonal, and the magnitudes of φ₁ and φ₂ would no longer be equal in the ground state as they are in Eq. 5.9. Instead, the lower energy orbital would be more greatly filled in the ground state. As the energies of the two orbitals become increasingly different, the electron is more completely transferred onto the lower energy orbital, essentially reducing to an ionic bond. Aside: In section 22.4 below, we will consider a more general tight binding model with more than one electron in the system and with Coulomb interactions between electrons as well. That calculation is more complicated, but shows very similar results. It is also much more advanced, but might be fun to read for the adventurous.
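The two-by-two eigenvalue problem can be solved in closed form as a quick sanity check. The sketch below (Python; the numbers chosen for ε₀, V_cross, and t are made up for illustration and restricted to real t) reproduces E± = ε₀ + V_cross ± |t| and the symmetric bonding combination for t > 0.

```python
import math

def tight_binding_2site(eps0, v_cross, t):
    """Eigen-solve the 2x2 LCAO Hamiltonian of Eq. 5.8 (real t assumed):
        [[eps0 + v_cross, -t], [-t, eps0 + v_cross]]
    Closed form: E = eps0 + v_cross -/+ |t|.  Returns the bonding energy,
    the antibonding energy, and the bonding eigenvector."""
    diag = eps0 + v_cross
    bonding = diag - abs(t)
    antibonding = diag + abs(t)
    amp = 1 / math.sqrt(2)
    # For t > 0 the bonding state is the symmetric combination (phi1 + phi2)/sqrt(2);
    # for t < 0 it is the antisymmetric one.
    bonding_vec = (amp, amp) if t > 0 else (amp, -amp)
    return bonding, antibonding, bonding_vec

# Illustrative values in eV (assumed, not from the text):
bonding, antibonding, vec = tight_binding_2site(-13.6, -2.0, 1.5)
print(f"bonding E = {bonding:.1f} eV, antibonding E = {antibonding:.1f} eV")
print(f"bonding amplitudes = ({vec[0]:.4f}, {vec[1]:.4f})")
```

One can verify the eigenvector by applying the matrix to (1, 1)/√2 directly: each row gives (ε₀ + V_cross − t) times the amplitude, as claimed.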

Note again that V_cross is the energy that the electron on orbital 1 feels from nucleus 2. However, we have not included the fact that the two nuclei also interact, and to a first approximation, this Coulomb repulsion between the two nuclei will cancel17 the attractive energy between the nucleus and the electron on the opposite orbital. Including this cancellation, we obtain

Ẽ± ≈ ε₀ ± |t|

As the nuclei get closer together, the hopping term |t| increases, giving an energy level diagram as shown in Fig. 5.6. This picture is obviously unrealistic, as it suggests that two atoms should bind together at zero distance between the nuclei. The problem here is that our assumptions and approximations begin to break down as the nuclei get closer together (for example, our orbitals are no longer orthogonal, V_cross does not exactly cancel the Coulomb energy between nuclei, etc.).

17If you think of a positively charged nucleus and a negatively charged electron surrounding the nucleus, from far outside of that electron’s orbital radius the atom looks neutral. Thus a second nucleus will neither be attracted to nor repelled from the atom so long as it remains outside of the electron cloud of the atom.


Figure 5.6: Model Tight Binding Energy Levels as a Function of Distance Between the Nuclei of the Atoms.

A more realistic energy level diagram for the bonding and antibonding states is given in Fig. 5.7. Note that the energy diverges as the nuclei get pushed together (this is from the Coulomb repulsion between nuclei). As such, there is a minimum energy of the system when the nuclei are at some nonzero distance apart from each other, which then becomes the ground state distance of the nuclei in the resulting molecule. Aside: In Fig. 5.7 there is a minimum of the bonding energy when the nuclei are some particular distance apart. This optimal distance will be the distance of the bond between two atoms. However, at finite temperature, the distance will fluctuate around this minimum (think of a particle in a potential well at finite temperature). Since the potential well is steeper on one side than on the other, at finite temperature the “particle” in this well will be able to fluctuate to larger distances a bit more than it is able to fluctuate to smaller distances. As a result, the average bond distance will increase at finite temperature. This thermal expansion will be explored again in the next chapter. Covalently bonded materials tend to be strong and tend to be electrical semiconductors or insulators (since electrons are tied up in the local bonds). The directionality of the orbitals makes these materials retain their shape well (non-ductile), so they are brittle. They do not dissolve in polar solvents such as water in the same way that ionic materials do.
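The thermal expansion aside above can be illustrated with a classical toy calculation: a Boltzmann-weighted average of the displacement in an asymmetric well. The cubic-anharmonic potential below is an arbitrary model chosen for illustration (not the actual bonding curve of Fig. 5.7), in units where k_B = 1; because the well is shallower on the +x side, the average displacement grows with temperature.

```python
import math

def mean_displacement(temperature, kappa=1.0, g=0.1, xmax=2.0, n=20001):
    """Classical Boltzmann average <x> in the anharmonic well
        U(x) = kappa x^2 / 2 - g x^3
    (illustrative model, units with k_B = 1), computed by direct
    numerical integration on [-xmax, xmax].  The cubic term makes the
    well shallower for x > 0, so <x> shifts outward as T rises."""
    beta = 1.0 / temperature
    dx = 2 * xmax / (n - 1)
    num = den = 0.0
    for i in range(n):
        x = -xmax + i * dx
        weight = math.exp(-beta * (0.5 * kappa * x * x - g * x**3))
        num += x * weight
        den += weight
    return num / den

cold = mean_displacement(0.05)
warm = mean_displacement(0.2)
print(f"<x>(T=0.05) = {cold:.4f}, <x>(T=0.2) = {warm:.4f}")
# Both averages are positive and the warmer one is larger: thermal expansion.
```

For small g the result agrees with the standard perturbative estimate ⟨x⟩ ≈ 3g k_B T / κ², i.e. the expansion is linear in temperature.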


Figure 5.7: More Realistic Energy Levels as a Function of Distance Between the Nuclei of the Atoms.

5.4 Van der Waals, Fluctuating Dipole Forces, or Molecular Bonding

When two atoms (or two molecules) are very far apart from each other, there remains an attraction between them due to what are known as van der Waals18 forces, sometimes known as fluctuating dipole forces, or molecular bonding. In short, both atoms have a dipole moment which may be zero on average, but which can fluctuate momentarily due to quantum mechanics. If the first atom obtains a momentary dipole moment, the second atom can polarize, also obtaining a dipole moment, to lower its energy. As a result, the two atoms (as momentary dipoles) will attract each other. This type of bonding between atoms is very typical of inert atoms (such as the noble gases: He, Ne, Kr, Ar, Xe) whose electrons do not participate in covalent bonds or ionic bonds. It is also typical of bonding between inert molecules such as nitrogen molecules (N2)19, where there is no possibility for the electrons in the molecule to form covalent or ionic bonds between molecules. This bonding is weak compared to covalent or ionic bonds, but it is also long ranged in comparison, since the electrons do not need to hop between atoms. To be more quantitative, let us consider an electron orbiting a nucleus (say, a proton). If the electron is at a fixed position, there is a dipole moment p = er, where r is the vector from the electron to the proton. With the electron “orbiting” (i.e., in an eigenstate), the average dipole moment is zero. However, if an electric field is applied to the atom, the atom will develop a polarization (i.e., it will be more likely for the electron to be found on one side of the nucleus than on the other). We write

p = χE

18J. D. van der Waals was awarded the Nobel Prize in Physics in 1910 for his work on the structure of liquids and gases. You may remember the van der Waals equation of state from your course last year. There is a crater named after him on the far side of the Moon.

19Whereas the noble gases are inert because they have filled atomic orbital shells, the nitrogen molecule is inert essentially because it has a filled shell of molecular orbitals — all of the bonding orbitals are filled, and there is a large energy gap to any anti-bonding orbitals.

where χ is known as the polarizability (also known as the electric susceptibility). This polarizability can be calculated explicitly for, say, a hydrogen atom20. At any rate, it is some positive quantity. Now, let us suppose we have two such atoms, separated by a distance r in the x̂ direction. Suppose one atom momentarily has a dipole moment p₁ (for definiteness, suppose this dipole moment is in the ẑ direction). Then the second atom will feel an electric field

E = p₁/(4πε₀r³)

in the negative ẑ direction. The second atom then, due to its polarizability, develops a dipole moment p₂ = χE which in turn is attracted to the first atom. The potential energy between these two dipoles is

U = −|p₁||p₂|/(4πε₀r³) = −p₁χE/(4πε₀r³) = −|p₁|²χ/(4πε₀r³)²    (5.11)

corresponding to a force −dU/dr which is attractive and proportional to 1/r⁷. You can check that, independent of the direction of the original dipole moment, the force is always attractive and proportional to 1/r⁷, although there will be a (nonnegative) prefactor which depends on the angle between the dipole moment p₁ and x̂, the direction between the two atoms.

Note. This argument appears to depend on the fact that the dipole moment p₁ of the first atom is nonzero, whereas on average the atom’s dipole moment will be zero. However, what in fact enters in Eq. 5.11 is |p₁|², which has a nonzero expectation value. (This is precisely analogous to the fact that ⟨x⟩ for an electron in a hydrogen atom is zero, but ⟨x²⟩ is nonzero.) While these fluctuating dipolar forces are generally weak, they are the only forces that occur when electrons cannot be shared or transferred between atoms — either in the case where the electrons are not chemically active or when the atoms are far apart. However, when considering the van der Waals forces of many atoms put together, the total forces can be quite strong. A well known example of a van der Waals force is the force that allows lizards, such as geckos, to climb up walls. They have hair on their feet that makes very close contact with the atoms of the wall, and they can climb up the walls mostly due to van der Waals forces!
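The 1/r⁷ force law can be verified numerically. The sketch below (Python) differentiates a model potential U(r) = −C/r⁶, which is the r-dependence of Eq. 5.11 with the prefactor lumped into an arbitrary constant C, and checks that doubling the separation reduces the force magnitude by a factor of 2⁷ = 128.

```python
def vdw_potential(r, C=1.0):
    """Van der Waals pair potential U(r) = -C / r^6 in arbitrary units
    (the r-dependence of Eq. 5.11; C stands in for |p1|^2 chi etc.)."""
    return -C / r**6

def numerical_force(r, h=1e-6):
    """F = -dU/dr by central difference; negative F means attraction
    (the force points towards smaller r)."""
    return -(vdw_potential(r + h) - vdw_potential(r - h)) / (2 * h)

# Analytically F = -6 C / r^7, so doubling r divides |F| by 2^7 = 128.
F1 = numerical_force(1.0)
F2 = numerical_force(2.0)
print(f"F(1) = {F1:.4f}, F(2) = {F2:.6f}, ratio = {F1 / F2:.1f}")
```

The same check with U ∝ 1/r gives a ratio of 4, so this is a quick way to distinguish a van der Waals tail from a bare Coulomb one in a numerical pair potential.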

5.5 Metallic Bonding

It is sometimes hard to distinguish metallic bonding from covalent bonding. Roughly, however, one defines a metallic bond to be the bonding that occurs in metals. These bonds are similar to covalent bonds in that electrons are shared between atoms, but in this case the electrons become delocalized throughout the crystal (we will discuss how this occurs in section 10.2 below). We should think of the delocalized free electrons as providing the glue that holds together the positive ions they have left behind. Since the electrons are completely delocalized, the bonds in metals tend not to be directional; metals are thus often ductile and malleable. Since the electrons are free, metals are good conductors of electricity as well as of heat.

20This is a good exercise in quantum mechanics. See, for example, Eugen Merzbacher's book on quantum mechanics.

5.6 Hydrogen bonds

The hydrogen atom is extremely special due to its very small size. As a result, the bonds formed with hydrogen atoms are qualitatively different from other bonds. When the hydrogen atom forms a covalent or ionic bond with a larger atom, being small, the hydrogen nucleus (a proton) simply sits on the surface of its partner. This then makes the molecule (hydrogen and its partner) into a dipole. These dipoles can then attract charges, or other dipoles, as usual. What is special about hydrogen is that when it forms a bond and its electron is attracted away from the proton onto (or partially onto) its partner, the unbonded side of the proton left behind is a naked positive charge, unscreened by any electrons in core orbitals. As a result, this positive charge is particularly effective at being attracted to other clouds of electrons.

A very good example of the hydrogen bond is water, H2O. Each oxygen atom is bound to two hydrogens (although, because of the atomic orbital structure, these three atoms are not collinear). The hydrogens, with their positive charge, remain attracted to the oxygens of other water molecules. In ice, these attractions are strong enough to form a weak but stable bond between water molecules, thus forming a crystal. Sometimes one can think of the hydrogen atom as forming "half" a bond with each of two oxygen atoms, thus holding the two oxygen atoms together. Hydrogen bonding is extremely important in biological molecules where, for example, hydrogen bonds hold together strands of DNA.

5.7 Summary of Bonding (Pictorial)

See also the table 5.1 for a summary of bonding types.


Figure 5.8: Cartoons of Bonding Types

References on Chemical Bonding

• Rosenberg, sections 1.11–1.19
• Ibach and Luth, chapter 1
• Hook and Hall, section 1.6
• Kittel, chapter 3 up to elastic strain
• Ashcroft and Mermin, chapters 19–20
• Burns, sections 6.2–6.6 and also chapters 7 and 8

Probably Ashcroft and Mermin as well as Burns chapters 7 and 8 are too much information.

Chapter 6

Types of Matter

Once we understand how it is that atoms bond together, we can examine what types of matter can be formed. An obvious thing that can happen is that atoms can bond together to form regular crystals. A crystal is made of small units reproduced many times and built into a regular array. The macroscopic morphology of a crystal can reflect its underlying structure (see Fig. 6.1). We will spend much of the remainder of this book studying crystals.

Figure 6.1: Crystals. Top left: small units (one green, one blue) reproduced periodically to form a crystal. Top right: a crystal of quartz (SiO2). Bottom: the macroscopic morphology of a crystal reflects its underlying structure.


It is also possible that atoms will bind together to form molecules, and the molecules will stick together via weak van der Waals bonds to form so-called molecular crystals.

Figure 6.2: A Molecular Crystal. Here, 60 atoms of carbon bind together to form a large molecule known as a buckyball2; the buckyballs can then stick together to form a molecular crystal.

Figure 6.3: Cartoon of a Liquid. In liquids, molecules are not in an ordered configuration and are free to move around (i.e., the liquid can flow). However, the liquid molecules do attract each other, and at any moment in time you can typically define neighbors.

Another form of matter is liquid. Here, atoms are attracted to each other, but not so strongly that they form permanent bonds (or the temperature is high enough to make the bonds unstable). Liquids (and gases)3 are disordered configurations of molecules where the molecules are

free to move around into new configurations. Somewhere midway between the idea of a crystal and the idea of a liquid is the possibility of amorphous solids and glasses. In this case the atoms are bonded into position in a disordered configuration. Unlike in a liquid, the atoms cannot flow freely.

2The name "buckyball" is a nickname for buckminsterfullerene, named after Richard Buckminster Fuller, the famed developer of the geodesic dome, which buckyballs are supposed to resemble (although the shape is actually precisely that of a soccer ball). This name is credited to the discoverers of the buckyball, Harold Kroto, James Heath, and Richard Smalley, who were awarded a Nobel Prize in chemistry for their discovery despite their choice of nomenclature. (Probably the name "soccerballene" would have been better.)

3As we should have learned in our stat-mech and thermo courses, there is no "fundamental" difference between a liquid and a gas. Generally liquids are high density and not very compressible, whereas gases are low density and very compressible. A single substance (say, water) may have a phase transition between its gas and liquid phase (boiling), but one can also go continuously from the gas to the liquid phase without boiling by going to high pressure and going around the critical point (becoming "supercritical").

Figure 6.4: Cartoon of an Amorphous Solid (Glass): Silica (SiO2) can be an amorphous solid, or a glass (as well as being crystalline quartz). Left is a three dimensional picture, and right is a two dimensional cartoon. Here the atoms are disordered, but are bonded together and cannot flow.

Many more possibilities exist. For example, one may have so-called liquid crystals, where the system orders in some ways but remains disordered in other ways. For example, in figure 6.5 the system is crystalline (ordered) in one direction, but remains disordered within each plane. One can also consider cases where the molecules are always oriented the same way but are at completely random positions (known as a "nematic"). There is a huge variety of possible phases of matter. In every case it is the interactions between the molecules ("bonding" of some type, whether it be weak or strong) that dictate the configurations.

Figure 6.5: Cartoon of a Liquid Crystal. Liquid crystals have some of the properties of a solid and some of the properties of a liquid. In this picture of a smectic-C liquid crystal, the system is crystalline in the vertical direction (forming discrete layers) but remains liquid (random positions) within each plane. Like a crystal, in this case, the individual molecules all have the same orientation.

One should also be aware of polymers4, which are long chains of atoms (such as DNA).

Figure 6.6: Cartoon of a Polymer: A polymer is a long chain of atoms.

And there are many more types of condensed matter systems that we simply do not have time to discuss5. One can even engineer artificial types of order which do not occur naturally. Each of these types of matter has its own interesting properties, and if we had more time we would discuss them all in depth! Given that there are so many types of matter, it may seem odd that we are going to spend essentially the entire remainder of our time focused on simple crystalline solids. There are very good reasons for this, however. First of all, the study of solids is one of the most successful areas of physics, both in terms of how completely we understand them and also in terms of what we have been able to do practically with this understanding (for example, the entire modern semiconductor industry is a testament to how successful our understanding of solids is). More importantly, however, the physics that we learn by studying solids forms an excellent starting point for trying to understand the many more complex forms of matter that exist.

References

• Dove, chapter 2 gives a discussion of many types of matter.
• For an even more complete survey of the types of condensed matter see "Principles of Condensed Matter Physics", by Chaikin and Lubensky (Cambridge).

4Here is a really cool experiment to do in your kitchen. Cornstarch is a polymer — a long chain of atoms. Take a box of cornstarch and make a mixture of roughly half cornstarch and half water (you may have to play with the proportions). The concoction should still be able to flow. If you put your hand into it, it will feel like a liquid and be gooey. But if you take a tub of this and hit it with a hammer very quickly, it will feel as hard as a brick, and it will even crack (then it turns back to goo). In fact, you can make a deep tub of this stuff and, although it feels completely like a fluid, you can run across the top of it (if you are too lazy to try doing this, try Googling "Ellen cornstarch" to see a YouTube video of the experiment). This mixture is a "non-Newtonian" fluid: its effective viscosity depends on how fast the force is applied to the material. The reason polymers have this property is that the long polymer strands get tangled with each other. If a force is applied slowly, the strands can disentangle and flow past each other. But if the force is applied quickly, they cannot disentangle fast enough and the material acts just like a solid.

5Particularly interesting are forms of matter such as superfluids, where quantum mechanics dominates the physics. But alas, we must save discussion of this for another course!

Part III

Toy Models of Solids in One Dimension

61

Chapter 7

One Dimensional Model of Compressibility, Sound, and Thermal Expansion

In the first few chapters we found that our simple models of solids, and of electrons in solids, were insufficient in several ways. In order to improve our understanding, we decided that we needed to take the periodic microstructure of crystals more seriously. In this part of the book we finally begin this more careful microscopic consideration. To get a qualitative understanding of the effects of the periodic lattice, it is frequently sufficient to think in terms of simple one dimensional systems. This is our strategy for the next few chapters. Once we have introduced a number of important principles in one dimension, we will address the complications associated with higher dimensionality. In the last part of the book we discussed bonding between atoms. We found, particularly in the discussion of covalent bonding, that the lowest energy configuration would have the atoms at some optimal distance apart (see figure 5.7, for example). Given this shape of the energy as a function of the distance between atoms, we will be able to come to some interesting conclusions. For simplicity, let us imagine a one dimensional system of atoms (atoms in a single line). The potential V(x) between two neighboring atoms is drawn in Figure 7.1.

The classical equilibrium position is the position at the bottom of the well (marked xeq in the figure). The distance between atoms at low temperature should then be xeq. (A good homework assignment is to consider how quantum mechanics can change this value and increase it a little bit!). Let us now Taylor expand the potential around its minimum.

V(x) ≈ V(x_eq) + (κ/2)(x − x_eq)² + (κ₃/3!)(x − x_eq)³ + …

Note that there is no linear term (if there were a linear term, then the position x_eq would not be the minimum). If there are only small deviations from the position x_eq, the higher-order terms are much smaller than the leading quadratic term and we can throw them out. This reflects a rather crucial general principle: any potential, close enough to its minimum, is quadratic.



Figure 7.1: Potential Between Neighboring Atoms (black). The thick red curve is a quadratic approximation to the minimum (it may look crooked but in fact the red curve is symmetric and the black curve is asymmetric). The equilibrium position is xeq . At finite temperature T , the system can oscillate between xmax and xmin which are not symmetric around the minimum. Thus as T increases the average position moves out to larger distance and the system expands.

Compressibility (or Elasticity)

We thus have a simple Hooke’s law quadratic potential around the minimum. If we apply a force to compress the system (i.e., apply a pressure to our model one dimensional solid) we find

κ(δx_eq) = −F

where the sign is chosen so that a positive (compressive) pressure reduces the distance between atoms. This is obviously just a description of the compressibility (or elasticity) of a solid. The usual description of compressibility is

β = −(1/V) ∂V/∂P

(one should ideally specify whether this is measured at fixed T or at fixed S; here we are working at T = S = 0 for simplicity). In the one dimensional case, we write the compressibility as

β = −(1/L) ∂L/∂F = (1/x_eq)(1/κ) = 1/(κa)    (7.1)

with L the length of the system and x_eq the spacing between atoms. Here we make the conventional definition that the equilibrium distance between identical atoms in a system (the so-called lattice constant) is written as a.

Sound

You may recall from your fluids course that in an isotropic compressible fluid one predicts sound waves with velocity

v = √(B/ρ) = √(1/(ρβ))    (7.2)

where ρ is the mass density of the fluid and B is the bulk modulus, B = 1/β, with β the (adiabatic) compressibility. While in a real solid the compressibility is anisotropic and the speed of sound depends in detail on the direction of propagation, in our model one dimensional solid this is not a problem. The density is m/a, with m the mass of each particle and a the equilibrium spacing between particles. Thus using our result from above, we predict a sound wave with velocity

v = √(κa²/m)    (7.3)

Shortly (in section 8.2) we will re-derive this expression from the microscopic equations of motion for the atoms in the one dimensional solid.
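As a quick sanity check, one can verify numerically that the fluid formula v = √(1/(ρβ)), with ρ = m/a and β = 1/(κa) from Eq. 7.1, reproduces Eq. 7.3. The values of κ, m, and a below are made-up order-of-magnitude numbers, not data for any real solid.

```python
import math

kappa = 10.0    # spring constant, N/m (illustrative)
m = 1.0e-26     # atomic mass, kg (illustrative)
a = 3.0e-10     # lattice constant, m (illustrative)

rho = m / a                 # 1D mass density
beta = 1.0 / (kappa * a)    # 1D compressibility, Eq. 7.1

v_fluid = math.sqrt(1.0 / (rho * beta))   # Eq. 7.2, v = sqrt(B/rho)
v_chain = math.sqrt(kappa * a**2 / m)     # Eq. 7.3

assert math.isclose(v_fluid, v_chain)
print(v_chain)  # a sound speed of order km/s, a sensible magnitude
```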

Thermal Expansion

So far we have been working at zero temperature, but it is worth thinking at least a little bit about thermal expansion. This will be fleshed out more completely in a homework assignment. (In fact, even in the homework assignment the treatment of thermal expansion will be very crude, but that should still be enough to give us the general idea of the phenomenon1.) Let us consider again figure 7.1, but now at finite temperature. We can imagine the distance between the atoms as being like the position of a ball rolling around in the potential. At zero energy, the ball sits at the minimum of the potential. But if we give the ball some finite temperature (i.e., some energy), it will oscillate around the minimum. At fixed energy k_B T the ball rolls back and forth between the points x_min and x_max, where V(x_min) = V(x_max) = k_B T. But away from the minimum the potential is asymmetric, so |x_max − x_eq| > |x_min − x_eq|, and on average the particle has a position ⟨x⟩ > x_eq(T = 0). This is in essence the reason for thermal expansion! We will obtain positive thermal expansion for any system where κ₃ < 0 (i.e., where the potential is steeper at small x), which is almost always true for real solids.
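The asymmetry argument can be illustrated with a crude numerical sketch (dimensionless toy values of κ and κ₃, not fit to any material): find the turning points where V = k_B T and check that the excursion to large x exceeds the excursion to small x.

```python
# Toy illustration of thermal expansion; kappa and kappa3 are illustrative.
kappa, kappa3 = 1.0, -0.3   # kappa3 < 0: potential is shallower at large x

def V(x):
    # Displacement x is measured from x_eq; Taylor expansion of Fig. 7.1.
    return 0.5 * kappa * x**2 + (kappa3 / 6.0) * x**3

def turning_point(E, lo, hi):
    # Bisect for the point where V(x) = E on a monotonic stretch of V.
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if V(mid) < E:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

E = 0.1  # plays the role of k_B T in these units
x_max = turning_point(E, 0.0, 2.0)    # right turning point (x > 0)
x_min = turning_point(E, 0.0, -2.0)   # left turning point (x < 0)

# The excursion to large x exceeds the excursion to small x, so the
# midpoint (a crude stand-in for <x>) is positive: the chain expands.
print(x_max, x_min, (x_max + x_min) / 2 > 0)
```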

Summary

• Forces between atoms determine the ground state structure.
• These same forces, perturbing around the ground state, determine elasticity, sound velocity, and thermal expansion.
• Thermal expansion comes from the non-quadratic part of the interatomic potential.

Sound and Compressibility:
• Goodstein, section 3.2b
• Ibach and Luth, beginning of section 4.5
• Hook and Hall, section 2.2

Thermal Expansion (most references go into way too much depth on thermal expansion):
• Kittel, chapter 5, section on thermal expansion

1Although this is an annoyingly crude discussion of thermal expansion, we are mandated by the IOP to teach something on this subject. Explaining it more correctly is, unfortunately, rather messy!

Chapter 8

Vibrations of a One Dimensional Monatomic Chain

In chapter 2 we considered the Boltzmann, Einstein, and Debye models of vibrations in solids. In this chapter we will consider a detailed model of vibration in a solid, first classically, and then quantum mechanically. We will then be able to better understand what these early attempts achieved, and to better understand their shortcomings. Let us consider a chain of identical atoms of mass m, where the equilibrium spacing between atoms is a. Let us define the position of the nth atom to be x_n and the equilibrium position of the nth atom to be x_n^eq = na.

Once we allow motion of the atoms, x_n will deviate from its equilibrium position, so we define the small variable

δx_n = x_n − x_n^eq

Note that in our simple model we are allowing motion of the masses only in one dimension (i.e., we are allowing longitudinal motion of the chain, not transverse motion). As discussed in the previous chapter, if the system is at low enough temperature we can consider the potential holding the atoms together to be quadratic. Thus, our model of a solid is a chain of masses held together with springs, as shown in this figure:

[Figure: a chain of masses m connected by springs of spring constant κ; the equilibrium spacing is a.]

Since the springs are quadratic potentials this model is frequently known as a harmonic chain.


With this quadratic interatomic potential, we can write the total potential energy of the chain to be

V_tot = Σ_i V(x_i − x_{i+1}) = V_eq + Σ_i (κ/2)(δx_i − δx_{i+1})²

The force on the nth mass of the chain is then given by

F_n = −∂V_tot/∂x_n = κ(δx_{n+1} − δx_n) + κ(δx_{n−1} − δx_n)

Thus we have Newton's equation of motion

m(δẍ_n) = F_n = κ(δx_{n+1} + δx_{n−1} − 2δx_n)    (8.1)

To remind the reader: for any coupled system, a normal mode is defined to be a collective oscillation where all particles move at the same frequency. We now attempt a solution to Newton's equations by using an ansatz that describes the normal modes as waves

δx_n = A e^{iωt − ikx_n^eq} = A e^{iωt − ikna}

where A is an amplitude of oscillation. Now the reader might be confused about how it is that we are considering complex values of δx_n. Here we are using complex numbers for convenience, but we implicitly mean to take the real part. (This is analogous to what one does in circuit theory with oscillating currents!) Since we are taking the real part, it is sufficient to consider only ω > 0; however, we must then be careful that k can have either sign, and these are inequivalent once we have specified that ω is positive. Plugging our ansatz into Eq. 8.1 we obtain

−mω² A e^{iωt − ikna} = κA e^{iωt} [e^{−ika(n+1)} + e^{−ika(n−1)} − 2e^{−ikan}]

or

mω² = 2κ[1 − cos(ka)] = 4κ sin²(ka/2)    (8.2)

We thus obtain the result

ω = 2√(κ/m) |sin(ka/2)|    (8.3)

In general a relationship between a frequency (or energy) and a wavevector (or momentum) is known as a dispersion relation. This particular dispersion relation is shown in Fig. 8.1
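A short numerical check of Eq. 8.3 (in toy units κ = m = a = 1) confirms two properties used below: the dispersion is periodic under k → k + 2π/a and is even in k.

```python
import math

kappa, m, a = 1.0, 1.0, 1.0  # toy units

def omega(k):
    # omega(k) = 2 sqrt(kappa/m) |sin(k a / 2)|, Eq. 8.3
    return 2.0 * math.sqrt(kappa / m) * abs(math.sin(k * a / 2.0))

k = 0.3 * math.pi / a
assert math.isclose(omega(k), omega(k + 2.0 * math.pi / a))  # periodic in k
assert math.isclose(omega(k), omega(-k))                     # even in k
print(omega(math.pi / a))  # maximum frequency 2 sqrt(kappa/m) = 2.0
```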

8.1 First Exposure to the Reciprocal Lattice

Note that in Fig. 8.1 we have only plotted the dispersion for −π/a ≤ k ≤ π/a. The reason for this is obvious from Eq. 8.3 — the dispersion relation is actually periodic in k → k + 2π/a. In fact this is a very important general principle:


Figure 8.1: Dispersion Relation for Vibrations of the One Dimensional Monatomic Harmonic Chain. The dispersion is periodic in k → k + 2π/a.

Principle 8.1: A system which is periodic in real space with periodicity a will be periodic in reciprocal space with periodicity 2π/a.

In this principle we have used the term reciprocal space, which means k-space. In other words, this principle tells us that if a system looks the same when x → x + a, then in k-space the dispersion will look the same when k → k + 2π/a. We will return to this principle many times in later chapters.

The periodic unit (the "unit cell") in k-space is conventionally known as the Brillouin zone1,2. This is your first exposure to the concept of a Brillouin zone, but it will play a very central role in later chapters. The "first Brillouin zone" is the unit cell in k-space centered around the point k = 0. Thus in Fig. 8.1 we have shown only the first Brillouin zone, with the understanding that the dispersion is periodic for higher k. The points k = ±π/a are known as the Brillouin zone boundary and are defined in this case as points which are symmetric around k = 0 and are separated by 2π/a.

It is worth pausing for a second and asking why we expect that the dispersion curve should

1Leon Brillouin was one of Sommerfeld's students. He is famous for many things, including being the "B" in the "WKB" approximation. I'm not sure if WKB is on your syllabus, but it really should be if it is not already!

2The pronunciation of "Brillouin" is something that gives English speakers a great deal of difficulty. If you speak French you will probably cringe at the way this name is butchered. (I did badly in French in school, so I'm probably one of the worst offenders.) According to online dictionaries it is properly pronounced somewhere between the following: brē-wan, breel-wahn, bree(y)lwa(n), and bree-l-(uh)-wahn. At any rate, the "l" and the "n" should both be very weak.

be periodic in k → k + 2π/a. Recall that we defined our vibration mode to be of the form

δx_n = A e^{iωt − ikna}    (8.4)

If we take k → k + 2π/a we obtain

δx_n = A e^{iωt − i(k+2π/a)na} = A e^{iωt − ikna} e^{−i2πn} = A e^{iωt − ikna}

where we have used e^{−i2πn} = 1 for any integer n. What we have found is that shifting k → k + 2π/a gives us back exactly the same oscillation mode that we had before we shifted k. The two are physically exactly equivalent! In fact, it is similarly clear that shifting k by any 2πp/a with p an integer will give us back exactly the same wave as well, since e^{−i2πnp} = 1. We can thus define a set of points in k-space (reciprocal space) which are all physically equivalent to the point k = 0. This set of points is known as the reciprocal lattice. The original periodic set of points x_n = na is known as the direct lattice or real-space lattice, to distinguish it from the reciprocal lattice when necessary. The concept of the reciprocal lattice will be extremely important later on. We can see the analogy between the direct lattice and the reciprocal lattice as follows:

x_n = …, −2a, −a, 0, a, 2a, …
G_n = …, −2(2π/a), −(2π/a), 0, 2π/a, 2(2π/a), …

Note that the defining property of the reciprocal lattice in terms of the points of the real lattice can be given as

e^{iG_m x_n} = 1    (8.5)

A point G_m is a member of the reciprocal lattice if and only if Eq. 8.5 is true for all x_n in the real lattice.
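Eq. 8.5 is easy to verify numerically: e^{iG_m x_n} = 1 whenever G_m = 2πm/a and x_n = na, and it fails for a wavevector not on the reciprocal lattice (toy value a = 1 below).

```python
import cmath, math

a = 1.0  # toy lattice constant

for m in range(-3, 4):            # reciprocal lattice points G_m = 2 pi m / a
    G = 2.0 * math.pi * m / a
    for n in range(-5, 6):        # direct lattice points x_n = n a
        x = n * a
        # Eq. 8.5: e^{i G_m x_n} = 1 for every pair of lattice points
        assert cmath.isclose(cmath.exp(1j * G * x), 1.0, abs_tol=1e-9)

# A wavevector NOT on the reciprocal lattice (here pi/a) fails the test:
assert not cmath.isclose(cmath.exp(1j * (math.pi / a) * a), 1.0, abs_tol=1e-9)
print("Eq. 8.5 verified on the sampled lattice points")
```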

8.2 Properties of the Dispersion of the One Dimensional Chain

We now return to more carefully examine the properties of the dispersion we calculated (Eq. 8.3).

Sound Waves:

Recall that a sound wave3 is a vibration that has a long wavelength (compared to the interatomic spacing). In this long-wavelength regime, we find the dispersion we just calculated to be linear in the wavevector, ω = v_sound k, as expected for sound, with

v_sound = a √(κ/m).

3For reference it is good to remember that humans can hear sound wavelengths roughly between 1 cm and 10 m. Both of these are very long compared to interatomic spacings.

(To see this, just expand the sin in Eq. 8.3). Note that this sound velocity matches the velocity predicted from Eq. 7.3! However, we note that at larger k, the dispersion is no longer linear. This is in disagreement with what Debye assumed in his calculation in section 2.2. So clearly this is a shortcoming of the Debye theory. In reality the dispersion of normal modes of vibration is linear only at long wavelength. At shorter wavelength (larger k) one typically defines two different velocities: The group velocity, the speed at which a wavepacket moves, is given by

v_group = dω/dk.

And the phase velocity, the speed at which the individual maxima and minima move, is given by

v_phase = ω/k.

These two match in the case of a linear dispersion, but otherwise are different. Note that the group velocity becomes zero at the Brillouin zone boundaries k = ±π/a (i.e., the dispersion is flat there). As we will see many times later on, this is a general principle!
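Both velocities can be checked numerically against Eq. 8.3 (toy units κ = m = a = 1): at long wavelength they agree and equal the sound velocity a√(κ/m), while at the zone boundary the group velocity vanishes.

```python
import math

kappa, m, a = 1.0, 1.0, 1.0  # toy units

def omega(k):
    # Dispersion of the chain, Eq. 8.3
    return 2.0 * math.sqrt(kappa / m) * abs(math.sin(k * a / 2.0))

def v_group(k, h=1e-6):
    # Group velocity d omega/dk by central difference
    return (omega(k + h) - omega(k - h)) / (2.0 * h)

def v_phase(k):
    return omega(k) / k

# Long wavelength: both velocities approach v_sound = a sqrt(kappa/m) = 1
k_small = 1e-4
assert math.isclose(v_group(k_small), v_phase(k_small), rel_tol=1e-4)
assert math.isclose(v_group(k_small), a * math.sqrt(kappa / m), rel_tol=1e-4)

# Zone boundary: the dispersion is flat, so the group velocity vanishes
print(abs(v_group(math.pi / a)) < 1e-5)  # True
```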

Counting Normal Modes:

Let us now ask how many normal modes there are in our system. Naively it would appear that we can put any k such that −π/a ≤ k < π/a into Eq. 8.3 and obtain a new normal mode with wavevector k and frequency ω(k). However, this is not precisely correct. Let us assume our system has exactly N masses in a row, and for simplicity let us assume that our system has periodic boundary conditions, i.e., particle x_0 has particle x_1 to its right and particle x_{N−1} to its left. Another way to say this is to let x_{n+N} = x_n, i.e., this one dimensional system forms a big circle. In this case we must be careful that the wave ansatz Eq. 8.4 makes sense as we go all the way around the circle. We must therefore have

e^{iωt − ikna} = e^{iωt − ik(N+n)a}

or equivalently we must have

e^{ikNa} = 1

This requirement restricts the possible values of k to be of the form

k = 2πp/(Na) = 2πp/L

where p is an integer and L is the total length of the system. Thus k becomes quantized rather than a continuous variable. This means that the k-axis in Figure 8.1 is actually a discrete set of many individual points, the spacing between two consecutive points being 2π/(Na) = 2π/L. Let us now count how many normal modes we have. As mentioned above in our discussion of the Brillouin zone, adding 2π/a to k brings one back to exactly the same physical wave. Thus we need only consider k values within the first Brillouin zone (i.e., −π/a ≤ k < π/a; since π/a is equivalent to −π/a, we choose to count one but not the other). Thus the total number of normal modes is

Total Number of Modes = (Range of k)/(Spacing between neighboring k) = (2π/a)/(2π/(Na)) = N    (8.6)

There is precisely one normal mode per mass in the system — that is, one normal mode per degree of freedom in the whole system. This is what Debye insightfully predicted in order to cut off his divergent integrals in section 2.2.3 above!
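The counting argument can be made concrete by listing the allowed wavevectors for a small chain (toy values of N and a):

```python
import math

# Toy chain: N masses with periodic boundary conditions, lattice constant a
N, a = 8, 1.0

# Allowed wavevectors k = 2 pi p/(N a) inside the first Brillouin zone,
# -pi/a <= k < pi/a (we keep -pi/a but not the equivalent +pi/a)
allowed_k = [2.0 * math.pi * p / (N * a) for p in range(-N // 2, N // 2)]

assert len(allowed_k) == N          # one normal mode per mass, Eq. 8.6
assert all(-math.pi / a <= k < math.pi / a for k in allowed_k)

# Neighboring k points are spaced by 2 pi/(N a) = 2 pi/L
spacing = allowed_k[1] - allowed_k[0]
assert math.isclose(spacing, 2.0 * math.pi / (N * a))
print(len(allowed_k))  # 8
```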

8.3 Quantum Modes: Phonons

We now make a rather important leap from classical to quantum physics.

Quantum Correspondence: If a classical harmonic system (i.e., any system with a quadratic Hamiltonian) has a normal oscillation mode at frequency ω, the corresponding quantum system will have eigenstates with energy

E_n = ℏω(n + 1/2)    (8.7)

Presumably you know this well in the case of a single harmonic oscillator. The only difference here is that our harmonic oscillator can be a collective normal mode, not just the motion of a single particle. This quantum correspondence principle will be the subject of a homework assignment. Thus at a given wavevector k there are many possible eigenstates, the ground state being the n = 0 eigenstate, which has only the zero-point energy ℏω(k)/2. The lowest-energy excitation is of energy ℏω(k) greater than the ground state, corresponding to the excited n = 1 eigenstate. Generally all excitations at this wavevector occur in energy units of ℏω(k), and the higher values of energy correspond classically to oscillations of increasing amplitude. Each excitation of this "normal mode" by a step up the harmonic oscillator excitation ladder (increasing the quantum number n) is known as a "phonon".

Definition 8.3.1. A phonon is a discrete quantum of vibration4

This is entirely analogous to defining a single quantum of light as a photon. As is the case with the photon, we may think of the phonon as actually being a particle, or we can think of the phonon as being a quantized wave. If we think of the phonon as a particle (as with the photon), then we see that we can put many phonons in the same state (i.e., the quantum number n in Eq. 8.7 can be increased to any value); thus we conclude that phonons, like photons, are bosons. As with photons, at finite temperature there will be a nonzero number of phonons (i.e., n will on average be nonzero), as given by the Bose occupation factor

n_B(βℏω) = 1/(e^{βℏω} − 1)

with β = 1/(k_B T) and ω the oscillation frequency. Thus, the energy expectation of the phonons at wavevector k is given by

E_k = ℏω(k) [n_B(βℏω(k)) + 1/2].

4I do not like the definition of a phonon as "a quantum of vibrational energy", which many books use. The vibration does indeed carry energy, but it carries other quantum numbers (such as crystal momentum) as well, so why specify energy only?

We can use this type of expression to calculate the heat capacity of our one dimensional model5:

U_total = Σ_k ℏω(k) [n_B(βℏω(k)) + 1/2]

where the sum over k is over all possible normal modes, i.e., k = 2πp/(Na) such that −π/a ≤ k < π/a. Thus we really mean

Σ_k → Σ_{p = −N/2}^{N/2 − 1}  with  k = 2πp/(Na).

Since for a large system the k points are very close together, we can convert the discrete sum into an integral (something we should be very familiar with by now) to obtain

Σ_k → (Na/2π) ∫_{−π/a}^{π/a} dk

Note that we can use this continuum integral to count the total number of modes in the system:

(Na/2π) ∫_{−π/a}^{π/a} dk = N

as predicted by Debye. Using this integral form of the sum, we have the total energy given by

U_total = (Na/2π) ∫_{−π/a}^{π/a} dk ℏω(k) [n_B(βℏω(k)) + 1/2]

from which we can calculate the specific heat as dU/dT. These two previous expressions look exactly like what Debye would have obtained from his calculation (for a one dimensional version of his model)! The only difference lies in our expression for ω(k). Debye only knew about sound, where ω = vk is linear in the wavevector. We, on the other hand, have just calculated that for our microscopic ball-and-spring model ω is not linear in k (see Eq. 8.3). Other than this change in the dispersion relation, our calculation of the heat capacity (exact for this model!) is identical to the approach of Debye. In fact, Einstein's calculation of the specific heat can also be phrased in exactly the same language: for Einstein's model the frequency ω is constant for all k (it is fixed at the Einstein frequency). We thus see Einstein's model, Debye's model, and our microscopic harmonic model in a very unified light. The only difference between the three is the dispersion relation used. One final comment is that it is frequently useful to further replace integrals over k with integrals over frequency (we did this when we studied the Debye model above). We obtain generally

(Na/2π) ∫_{−π/a}^{π/a} dk = ∫ dω g(ω)

where6

5The observant reader will note that we are calculating C_V = dU/dT, the heat capacity at constant volume. Why constant volume? As we saw above when we studied thermal expansion, the crystal does not expand unless we include third (or higher) order terms in the interatomic potential, which are not in this model!

6The factor of 2 out front comes from the fact that each ω occurs for the two possible values of ±k.

g(ω) = 2 (Na/2π) |dk/dω|

Recall again that the definition of the density of states is that the number of modes with frequency between ω and ω + dω is given by g(ω)dω. Note that in the (one dimensional) Debye model this density of states is constant from ω = 0 to ω = ω_Debye = vπ/a. In our model, as we have calculated above, the density of states is not constant, but becomes zero for frequencies above the maximum frequency 2√(κ/m). (In a homework problem we calculate this density of states explicitly.) Finally, in the Einstein model the density of states is a delta function at the Einstein frequency.
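The chain of formulas above can be checked numerically. The sketch below (toy units ℏ = k_B = 1, κ = m = a = 1, and an arbitrary N = 100) verifies that g(ω) integrates to N modes and that the heat capacity C = dU/dT approaches N k_B at high temperature, as equipartition demands. The closed form used for g(ω) is the homework-problem result obtained from Eq. 8.3.

```python
import math

N, kappa, m, a = 100, 1.0, 1.0, 1.0   # toy units, hbar = k_B = 1
w_max = 2.0 * math.sqrt(kappa / m)

def omega(k):
    # Dispersion of the chain, Eq. 8.3
    return 2.0 * math.sqrt(kappa / m) * abs(math.sin(k * a / 2.0))

def g(w):
    # g(omega) = 2 (Na/2pi) |dk/domega|; for Eq. 8.3 this works out to
    # (2N/pi) / sqrt(w_max^2 - w^2)
    return (2.0 * N / math.pi) / math.sqrt(w_max**2 - w**2)

# Normalization check: integrating g from 0 to w_max counts all N modes.
# Midpoint sampling sidesteps the integrable divergence at w = w_max.
steps = 400_000
dw = w_max / steps
modes = sum(g((i + 0.5) * dw) * dw for i in range(steps))

def U(T, nk=20000):
    # U_total = (Na/2pi) * integral dk  omega(k) [n_B + 1/2], midpoint rule
    dk = 2.0 * math.pi / a / nk
    total = 0.0
    for i in range(nk):
        w = omega(-math.pi / a + (i + 0.5) * dk)
        total += w * (1.0 / math.expm1(w / T) + 0.5) * dk
    return N * a / (2.0 * math.pi) * total

def C(T, dT=1e-3):
    # Heat capacity dU/dT by central difference
    return (U(T + dT) - U(T - dT)) / (2.0 * dT)

print(round(modes))     # 100: one mode per mass
print(round(C(50.0)))   # high T: equipartition gives C -> N k_B = 100
```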

8.4 Crystal Momentum

As mentioned above, the wavevector of a phonon is defined only modulo7 the reciprocal lattice. In other words, k is the same as k + G_m where G_m = 2πm/a is a point in the reciprocal lattice. Now we are supposed to think of these phonons as particles, and we like to think of our particles as having energy ℏω and a momentum ℏk. But we cannot define a phonon's momentum this way because physically it is the same phonon whether we describe it as ℏk or ℏ(k + G_m). We thus instead define a concept known as the crystal momentum, which is the momentum modulo the reciprocal lattice; or equivalently, we agree that we must always describe k within the first Brillouin zone. In fact, this idea of crystal momentum is extremely powerful. Since we are thinking about phonons as being particles, it is actually possible for two (or more) phonons to bump into each other and scatter from each other, the same way particles do.8 In such a collision, energy is conserved and crystal momentum is conserved! For example, three phonons each with crystal momentum ℏ(2/3)π/a can scatter off of each other to produce three phonons each with crystal momentum −ℏ(2/3)π/a. This is allowed since the initial and final states have the same energy and

$$3 \times (2/3)\pi/a = 3 \times \big(-(2/3)\pi/a\big) \quad {\rm mod}\ (2\pi/a)$$

During these collisions, although momentum ℏk is not conserved, crystal momentum is.9 In fact, the situation is similar when, for example, phonons scatter from electrons in a periodic lattice: crystal momentum becomes the conserved quantity rather than momentum. This is an extremely important principle which we will encounter again and again. In fact, it is a main cornerstone of solid-state physics. Aside: There is a very fundamental reason for the conservation of crystal momentum. Conserved

7 The word "modulo" or "mod" means to divide and keep only the remainder. For example, 15 modulo 7 = 1 since when you divide 15 by 7, you have a remainder of 1.

8 In the harmonic model we have considered, phonons do not scatter from each other. We know this because the phonons are eigenstates of the system, so their occupation does not change with time. However, if we add anharmonic (cubic and higher) terms to the inter-atomic potential, this corresponds to perturbing the phonon Hamiltonian and can be interpreted as allowing phonons to scatter from each other.

9 This thing we have defined, ℏk, has dimensions of momentum, but is not conserved. However, as we will discuss below in chapter 13, if a particle, like a photon, enters a crystal with a given momentum and undergoes a process that conserves crystal momentum but not momentum, when the photon exits the crystal we will find that the total momentum of the system is indeed conserved, with the momentum of the entire crystal accounting for any momentum that is missing from the photon. See footnote 6 in section 13.1.1.

quantities are results of symmetries (this is a deep and general statement known as Noether's theorem10). For example, conservation of momentum is a result of the translational invariance of space. If space is not the same from point to point, for example if there is a potential V(x) which is different at different places, then momentum is not conserved. The conservation of crystal momentum correspondingly results from space being invariant under translations of a, giving us momentum that is conserved modulo 2π/a.
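The three-phonon scattering example above reduces to a short arithmetic check. Here is a small sketch (assuming units ℏ = a = 1; the helper `to_first_bz` is our own, not from the text):

```python
import math

a = 1.0                      # lattice constant (assumed value)
G = 2.0 * math.pi / a        # reciprocal lattice spacing 2*pi/a

def to_first_bz(k):
    """Reduce a wavevector to the first Brillouin zone [-pi/a, pi/a)."""
    return (k + math.pi / a) % G - math.pi / a

# Three phonons each with crystal momentum (2/3) pi/a ...
k_initial = 3 * (2.0 / 3.0) * math.pi / a
# ... scatter into three phonons each with crystal momentum -(2/3) pi/a.
k_final = 3 * (-2.0 / 3.0) * math.pi / a

# Ordinary momentum changes by 4*pi/a, but crystal momentum
# (momentum modulo 2*pi/a) is conserved: both totals reduce to the
# same point of the first Brillouin zone.
print(to_first_bz(k_initial), to_first_bz(k_final))
```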

8.5 Summary of Vibrations of the One Dimensional Monatomic Chain

A number of very crucial new ideas have been introduced in this section. Many of these will return again and again in later chapters.

• Normal modes are collective oscillations where all particles move at the same frequency.
• If a system is periodic in space with periodicity ∆x = a, then in reciprocal space (k-space) the system is periodic with periodicity ∆k = 2π/a.
• Values of k which differ by multiples of 2π/a are physically equivalent. The set of points in k-space which are equivalent to k = 0 are known as the reciprocal lattice.
• Any value of k is equivalent to some k in the first Brillouin zone, −π/a ≤ k < π/a (in 1d).
• The sound velocity is the slope of the dispersion in the small-k limit (group velocity = phase velocity in this limit).
• A classical normal mode of frequency ω gets translated into quantum mechanical eigenstates $E_n = \hbar\omega(n + \frac{1}{2})$. If the system is in the nth eigenstate, we say that it is occupied by n phonons.
• Phonons can be thought of as particles, like photons, that obey Bose statistics.

References

Normal Modes of the Monatomic Chain and Introduction to Phonons:
• Kittel, beginning of chapter 4
• Goodstein, beginning of section 3.3
• Hook and Hall, section 2.3.1
• Burns, sections 12.1–12.2
• Ashcroft and Mermin, beginning of chapter 22

10 Emmy Noether has been described by Einstein, among others, as the most important woman in the history of mathematics.

Chapter 9

Vibrations of a One Dimensional Diatomic Chain

In the previous chapter we studied in detail a one dimensional model of a solid where every atom is identical to every other atom. However, in real materials not every atom is the same (for example, in sodium chloride, NaCl, we have two types of atoms!). We thus intend to generalize our previous discussion of the one dimensional solid to a one dimensional solid with two types of atoms. Much of this will follow the outline set in the previous chapter, but we will see that several fundamentally new features will now emerge.

9.1 Diatomic Crystal Structure: Some useful definitions

Consider the following model system

Fig. 9.1.1: A chain of alternating masses m1, m2 connected by springs with alternating spring constants κ1, κ2.

which represents a periodic arrangement of two different types of atoms. Here we have given them two masses m1 and m2 which alternate along the one dimensional chain. The springs connecting the atoms have spring constants κ1 and κ2 and also alternate. In this circumstance with more than one type of atom, we first would like to identify the so-called unit cell which is the repeated motif in the arrangement of atoms. In this picture, we have put a box around the unit cell. The length of the unit cell in one dimension is known as the

lattice constant and it is labeled a.

Fig. 9.1.2: One choice of unit cell, of length a (the lattice constant).

Note however, that the definition of the unit cell is extremely non-unique. We could just as well have chosen (for example) the unit cell to be as follows.

Fig. 9.1.3: An alternative, equally valid choice of unit cell, also of length a.

The important thing in defining a periodic system is to choose some unit cell and then construct the full system by reproducing the same unit cell over and over. (In other words, make a definition of the unit cell and stick with that definition!) It is sometimes useful to pick some reference point inside each unit cell. This set of reference points makes a simple lattice (we will define the term "lattice" more closely in later chapters, but for now the point is that a lattice has only one type of point in it – not two different types of points). So in this figure, we have marked our reference point in each unit cell with an X (again, the choice of this reference point is arbitrary).

Fig. 9.1.4: Reference points r1, r2, r3 (marked X), one per unit cell of length a.

Given the reference lattice point in the unit cell, the description of all of the atoms in the unit cell with respect to this reference point is known as a basis. In this case we might describe our basis as

  light gray atom centered at position 3a/40 to the left of the reference lattice point
  dark gray atom centered at position 7a/20 to the right of the reference lattice point

Thus if the reference lattice point in unit cell n is called r_n (and the spacing between the lattice points is a) we can set r_n = an, with a the size of the unit cell. Then the (equilibrium) position of the light gray atom in the nth unit cell is

$$x_n^{\rm eq} = an - 3a/40$$

whereas the (equilibrium) position of the dark gray atom in the nth unit cell is

$$y_n^{\rm eq} = an + 7a/20$$
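The lattice-plus-basis bookkeeping above is simple enough to sketch in a few lines (our own illustration, assuming a = 1):

```python
# Every atom sits at a lattice point r_n = a*n plus a fixed basis vector.
a = 1.0  # lattice constant (assumed value)
basis = {"light gray": -3.0 * a / 40.0,   # 3a/40 to the left of the lattice point
         "dark gray": +7.0 * a / 20.0}    # 7a/20 to the right of the lattice point

for n in range(3):
    x_eq = a * n + basis["light gray"]    # x_n^eq = a n - 3a/40
    y_eq = a * n + basis["dark gray"]     # y_n^eq = a n + 7a/20
    print(n, x_eq, y_eq)
```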

9.2 Normal Modes of the Diatomic Solid

For simplicity, let us focus on the case where all of the masses along our chain are the same m1 = m2 = m but the two spring constants κ1 and κ2 are different. (For homework we will consider the case where the masses are different, but the spring constants are the same!).

Fig. 9.2.1: A chain of equal masses m connected by springs with alternating constants κ1, κ2; the displacements of the two atoms in unit cell n are labeled x_n and y_n.

Given the spring constants in the picture, we can write down Newton's equations of motion for the deviations of the positions of the masses from their equilibrium positions. We obtain

$$m\,\delta\ddot{x}_n = \kappa_2(\delta y_n - \delta x_n) + \kappa_1(\delta y_{n-1} - \delta x_n) \qquad (9.1)$$
$$m\,\delta\ddot{y}_n = \kappa_1(\delta x_{n+1} - \delta y_n) + \kappa_2(\delta x_n - \delta y_n) \qquad (9.2)$$

Analogous to the one dimensional case we propose ansätze1 for these quantities that have the form of a wave

$$\delta x_n = A_x\, e^{i\omega t - ikna} \qquad (9.3)$$
$$\delta y_n = A_y\, e^{i\omega t - ikna} \qquad (9.4)$$

where, as in the previous chapter, we implicitly mean to take the real part of the complex number. As such, we can always choose to take ω > 0 as long as we allow k to be either positive or negative. As we saw in the previous chapter, values of k that differ by 2π/a are physically equivalent. We can thus restrict our attention to the first Brillouin zone −π/a ≤ k < π/a. Note that the important length here is the unit cell length, or lattice constant, a. Any k outside the first Brillouin zone is redundant with some other k inside the zone. As we found in the previous chapter, if our system has N unit cells (hence L = Na) then (putting periodic boundary conditions on the system) k will be quantized in units of 2π/(Na) = 2π/L. Note that here the important quantity is N, the number of unit cells, not the number of atoms (2N). Dividing the range of k in the first Brillouin zone by the spacing between neighboring k's, we obtain exactly N different possible values of k, exactly as we did in Eq. 8.6. In other words, we have exactly one value of k per unit cell. We might recall at this point the intuition that Debye used – that there should be exactly one possible excitation mode per degree of freedom of the system. Here we obviously have two degrees of freedom per unit cell, but we obtain only one possible value of k per unit cell. The resolution, as we will see in a moment, is that there will be two possible oscillation modes for each wavevector k. We now proceed by plugging our ansätze (Eqs. 9.3 and 9.4) into our equations of motion (Eqs. 9.1 and 9.2). We obtain

$$-\omega^2 m A_x\, e^{i\omega t - ikna} = \kappa_2 A_y\, e^{i\omega t - ikna} + \kappa_1 A_y\, e^{i\omega t - ik(n-1)a} - (\kappa_1 + \kappa_2) A_x\, e^{i\omega t - ikna}$$
$$-\omega^2 m A_y\, e^{i\omega t - ikna} = \kappa_1 A_x\, e^{i\omega t - ik(n+1)a} + \kappa_2 A_x\, e^{i\omega t - ikna} - (\kappa_1 + \kappa_2) A_y\, e^{i\omega t - ikna}$$

which simplifies to

$$-\omega^2 m A_x = \kappa_2 A_y + \kappa_1 A_y\, e^{ika} - (\kappa_1 + \kappa_2) A_x$$
$$-\omega^2 m A_y = \kappa_1 A_x\, e^{-ika} + \kappa_2 A_x - (\kappa_1 + \kappa_2) A_y$$

This can be rewritten conveniently as an eigenvalue equation

$$m\omega^2 \begin{pmatrix} A_x \\ A_y \end{pmatrix} = \begin{pmatrix} (\kappa_1 + \kappa_2) & -\kappa_2 - \kappa_1 e^{ika} \\ -\kappa_2 - \kappa_1 e^{-ika} & (\kappa_1 + \kappa_2) \end{pmatrix} \begin{pmatrix} A_x \\ A_y \end{pmatrix} \qquad (9.5)$$

1 I believe this is the proper pluralization of ansatz.

The solutions of this are obtained by finding the zeros of the secular determinant

$$0 = \det \begin{pmatrix} (\kappa_1 + \kappa_2) - m\omega^2 & -\kappa_2 - \kappa_1 e^{ika} \\ -\kappa_2 - \kappa_1 e^{-ika} & (\kappa_1 + \kappa_2) - m\omega^2 \end{pmatrix} = \big[(\kappa_1 + \kappa_2) - m\omega^2\big]^2 - \big|\kappa_2 + \kappa_1 e^{ika}\big|^2$$

The roots of which are clearly given by

$$m\omega^2 = (\kappa_1 + \kappa_2) \pm \big|\kappa_1 + \kappa_2 e^{ika}\big|$$

The second term needs to be simplified

$$\big|\kappa_1 + \kappa_2 e^{ika}\big| = \sqrt{(\kappa_1 + \kappa_2 e^{ika})(\kappa_1 + \kappa_2 e^{-ika})} = \sqrt{\kappa_1^2 + \kappa_2^2 + 2\kappa_1\kappa_2\cos(ka)}$$

So we finally obtain

$$\omega_{\pm} = \sqrt{\frac{\kappa_1 + \kappa_2}{m} \pm \frac{1}{m}\sqrt{\kappa_1^2 + \kappa_2^2 + 2\kappa_1\kappa_2\cos(ka)}} \qquad (9.6)$$

Note in particular that for each k we find two normal modes – usually referred to as the two branches of the dispersion. Thus since there are N different k values, we obtain 2N modes total (if there are N unit cells in the entire system). This is in agreement with our above discussion that we should have exactly one normal mode per degree of freedom in our system. The dispersion of these two modes is shown in Figure 9.1.

Figure 9.1: Dispersion Relation for Vibrations of the One Dimensional Diatomic Chain. The dispersion is periodic in k → k + 2π/a. Here the dispersion is shown for the case of κ2 = 1.4κ1. This scheme of plotting dispersions, putting all normal modes within the first Brillouin zone, is the reduced zone scheme. Compare this to Fig. 9.2 below.
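The two branches of Eq. 9.6 are easy to evaluate numerically. A sketch of our own (assuming m = a = 1 and κ2 = 1.4κ1, as in the figure) checking the limiting values at k = 0 and at the zone boundary:

```python
import numpy as np

# Assumed parameters: m = a = 1, kappa1 = 1, kappa2 = 1.4 (as in Fig. 9.1).
m, a, kappa1, kappa2 = 1.0, 1.0, 1.0, 1.4

def branches(k):
    """Acoustic (omega_minus) and optical (omega_plus) branches of Eq. 9.6."""
    root = np.sqrt(kappa1**2 + kappa2**2 + 2.0 * kappa1 * kappa2 * np.cos(k * a))
    # clamp tiny negative rounding error before the outer square root
    w_minus = np.sqrt(np.maximum(kappa1 + kappa2 - root, 0.0) / m)
    w_plus = np.sqrt((kappa1 + kappa2 + root) / m)
    return w_minus, w_plus

w_minus_0, w_plus_0 = branches(0.0)          # k = 0
w_minus_b, w_plus_b = branches(np.pi / a)    # zone boundary k = pi/a

# Acoustic branch vanishes at k = 0; optical branch reaches sqrt(2(kappa1+kappa2)/m).
# At the zone boundary the two frequencies are sqrt(2*kappa1/m) and sqrt(2*kappa2/m).
print(w_minus_0, w_plus_0)
print(w_minus_b, w_plus_b)
```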

A few things to note about this dispersion. First of all, we note that there is a long wavelength low energy branch of excitations with linear dispersion (corresponding to ω− in Eq. 9.6). This is the sound wave, or acoustic mode. Generally the definition of an acoustic mode is any mode that has linear dispersion as k → 0.

By expanding Eq. 9.6 for small k it is easy to check that the sound velocity is

$$v_{\rm sound} = \frac{d\omega_-}{dk} = \sqrt{\frac{a^2\,\kappa_1\kappa_2}{2m(\kappa_1 + \kappa_2)}} \qquad (9.7)$$
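A quick finite-difference check of this small-k slope (our own sketch, again with assumed values m = a = 1, κ1 = 1, κ2 = 1.4):

```python
import numpy as np

# Assumed parameters: m = a = 1, kappa1 = 1, kappa2 = 1.4.
m, a, kappa1, kappa2 = 1.0, 1.0, 1.0, 1.4

def omega_minus(k):
    """Acoustic branch of Eq. 9.6."""
    root = np.sqrt(kappa1**2 + kappa2**2 + 2.0 * kappa1 * kappa2 * np.cos(k * a))
    return np.sqrt(np.maximum(kappa1 + kappa2 - root, 0.0) / m)

# Since omega_minus(0) = 0, the slope at small k is just omega/k ...
dk = 1e-3
v_numeric = omega_minus(dk) / dk
# ... compared with the analytic sound velocity of Eq. 9.7.
v_analytic = np.sqrt(a**2 * kappa1 * kappa2 / (2.0 * m * (kappa1 + kappa2)))
print(v_numeric, v_analytic)
```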

In fact, we could have calculated this sound velocity on general principles analogous to what we did in Eq. 7.2 and Eq. 7.3. The density of the chain is 2m/a. The effective spring constant of two springs κ1 and κ2 in series is κ̃ = κ1κ2/(κ1 + κ2), so the compressibility of the chain is β = 1/(κ̃a) (see Eq. 7.1). Then plugging into Eq. 7.2 gives exactly the same sound velocity as we calculate here in Eq. 9.7.

The higher energy branch of excitations is known as the optical mode. It is easy to check that in this case the optical mode goes to frequency $\sqrt{2(\kappa_1 + \kappa_2)/m}$ at k = 0, and also has zero group velocity at k = 0. The reason for the nomenclature "optical" will become clearer later in the course when we study scattering of light from solids. For now we give a very simplified description of why it is named this way: Consider a solid being exposed to light. It is possible for the light to be absorbed by the solid, but energy and momentum must both be conserved. However, light travels at a very high velocity c, so ω = ck is a very large number. Since phonons have a maximum frequency, this means that photons can only be absorbed for very small k. However, for small k, acoustic phonons have energy ℏvk ≪ ℏck, so that energy and momentum cannot be conserved. On the other hand, optical phonons have energy ℏω_optical which is finite for small k, so that at some value of small k we have ω_optical = ck and one can match the energy and momentum of the photon to that of the phonon.2 Thus, whenever phonons interact with light, it is inevitably the optical phonons that are involved.

Let us examine a bit more closely the acoustic and the optical mode as k → 0. Examining our eigenvalue problem Eq. 9.5, we see that in this limit the matrix to be diagonalized takes the simple form

$$\omega^2 \begin{pmatrix} A_x \\ A_y \end{pmatrix} = \frac{\kappa_1 + \kappa_2}{m} \begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix} \begin{pmatrix} A_x \\ A_y \end{pmatrix} \qquad (9.8)$$

The acoustic mode (which has frequency 0) is solved by the eigenvector

$$\begin{pmatrix} A_x \\ A_y \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix}$$

This tells us that the two masses in the unit cell (at positions x and y) move together for the case of the acoustic mode in the long wavelength limit. This is not surprising considering our understanding of sound waves as being very long wavelength compressions and rarefactions. This is depicted in Figure 9.2.2. Note in the figure that the amplitude of the compression is slowly modulated, but always the two atoms in the unit cell move almost exactly the same way.

2 From this naive argument, one might think that the process where one photon with frequency ω_optical is absorbed while emitting a phonon is an allowed process. This is not true since photons carry spin and spin must also be conserved. Much more typically the interaction between photons and phonons is one where a photon is absorbed and then re-emitted at a different frequency while emitting a phonon. I.e., the photon is inelastically scattered. We will discuss this later on.

Fig. 9.2.2: A long wavelength acoustic mode.

On the other hand, the optical mode at k = 0, having frequency $\omega^2 = \frac{2(\kappa_1+\kappa_2)}{m}$, has the eigenvector

$$\begin{pmatrix} A_x \\ A_y \end{pmatrix} = \begin{pmatrix} 1 \\ -1 \end{pmatrix}$$

which describes the two masses in the unit cell moving in opposite directions for the optical mode. This is depicted in Figure 9.2.3. Note in the figure that the amplitude of the compression is slowly modulated, but always the two atoms in the unit cell move almost exactly the opposite way.

Fig. 9.2.3: A long wavelength optical mode.

In order to get a better idea of how motion occurs for both the optical and acoustic modes, it is useful to see animations, which you can find on the web. Another good resource is to download the program "ChainPlot" from Professor Mike Glazer's web site (http://www.amg122.com/programs).3

In this example we had two atoms per unit cell and we obtained two modes per distinct value of k. One of these modes is acoustic and one is optical. More generally, if there are M atoms per unit cell (in one dimension) we will have M modes per distinct value of k (i.e., M branches of the dispersion), of which one mode will be acoustic (goes to zero energy at k = 0) and all of the remaining modes are optical (do not go to zero energy at k = 0).

Caution: We have been careful to discuss a true one dimensional system, where the atoms are allowed to move only along the one dimensional line. Thus each atom has only one degree of freedom. However, if we allow atoms to move in other directions (transverse to the 1d line) we will have more degrees of freedom per atom. When we get to the 3d solid we should expect 3 degrees of freedom per atom. And there should be 3 different acoustic modes at each k at long wavelength. (In 3d, if there are n atoms per unit cell, there will be 3(n − 1) optical modes but always 3 acoustic modes, totalling 3n degrees of freedom per unit cell.)

3 Note in particular the comment on this website about most books getting the form of the acoustic mode incorrect!

One thing that we should study closely is the behavior at the Brillouin zone boundary. It is easy to check that the frequencies ω± at the zone boundary (k = π/a) are $\sqrt{2\kappa_1/m}$ and $\sqrt{2\kappa_2/m}$, the larger of the two being ω+. We can also check that the group velocity dω/dk of both modes goes to zero at the zone boundary (similarly, the optical mode has zero group velocity at k = 0).

In Fig. 9.1 above, we have shown both modes at each value of k, such that we only need to show k within the first Brillouin zone. This is known as the reduced zone scheme. Another way to plot exactly the same dispersions is shown in Fig. 9.2 and is known as the extended zone scheme. Essentially you can think of this as "unfolding" the dispersions such that there is only one mode at each value of k. In this picture we have defined (for the first time) the second Brillouin zone.

Figure 9.2: Dispersion Relation of Vibrations of the One Dimensional Diatomic Chain in the Extended Zone Scheme (Again choosing κ2 = 1.4κ1). Compare this to Fig. 9.1 above. The first Brillouin zone is labeled BZ1 and the second Brillouin zone is labeled BZ2.

Recall the first zone in 1d is defined as |k| ≤ π/a. Analogously, the second Brillouin zone is now π/a ≤ |k| ≤ 2π/a. In later chapters we will define the Brillouin zones more generally.

Here is an example where it is very useful to think using the extended zone scheme. We have been considering cases with κ2 > κ1; now let us consider what would happen if we take the limit κ2 → κ1. When the two spring constants become the same, the two atoms in the unit cell become identical, and we have a simple monatomic chain (which we discussed at length in the previous chapter). As such, we should define a new smaller unit cell with lattice constant a/2, and the dispersion curve is now just a simple |sin| as it was in chapter 8.

Thus it is frequently useful, if the two atoms in a unit cell are not too different from each

Figure 9.3: How a Diatomic Dispersion Becomes a Monatomic Dispersion When the Two Different Atoms Become the Same. (black) Dispersion relation of vibrations of the one dimensional diatomic chain in the extended zone scheme with κ2 not too different from κ1. (blue) Dispersion relation when κ2 = κ1. In this case, the two atoms become exactly the same, and we have a monatomic chain with lattice spacing a/2. This single band dispersion precisely matches that calculated in chapter 8 above, only with the lattice constant redefined to a/2.

other, to think about the dispersion as being a small perturbation to a situation where all atoms are identical. When the atoms are made slightly different, a small gap opens up at the zone boundary, but the rest of the dispersion continues to look mostly as if it is the dispersion of the monatomic chain. This is illustrated in Fig. 9.3.

9.3 Summary of Vibrations of the One Dimensional Diatomic Chain

A number of key concepts are introduced in this chapter as well.

• A unit cell is the repeated motif that comprises a crystal.
• The basis is the description of the unit cell with respect to a reference lattice.
• The lattice constant is the size of the unit cell (in 1d).
• If there are M atoms per unit cell we will find M normal modes at each wavevector k.
• One of these modes is an acoustic mode, meaning that it has linear dispersion at small k, whereas the remaining M − 1 are optical, meaning they have finite frequency at k = 0.
• For the acoustic mode, all atoms in the unit cell move in phase with each other, whereas for optical modes, they move out of phase with each other.
• Except for the acoustic mode, all other excitation branches have zero group velocity at k = nπ/a for any n.
• If all of the dispersion curves are plotted within the first Brillouin zone |k| ≤ π/a we call this the reduced zone scheme. If we "unfold" the curves such that there is only one excitation plotted per k, but we use more than one Brillouin zone, we call this the extended zone scheme.
• If the two atoms in the unit cell become identical, the new unit cell is half the size of the old unit cell. It is convenient to describe this limit in the extended zone scheme.

References

• Ashcroft and Mermin, chapter 22 (but not the 3d part)
• Ibach and Luth, section 4.3
• Kittel, chapter 4
• Hook and Hall, sections 2.3.2, 2.4, 2.5
• Burns, section 12.3

Chapter 10

Tight Binding Chain (Interlude and Preview)

In the previous two chapters we have considered the properties of vibrational waves (phonons) in a one dimensional system. At this point, we are going to make a bit of an excursion to consider electrons in solids again. The point of this excursion, besides being a preview of much of the physics that will recur later on, is to make the point that all waves in periodic environments (in crystals) are similar. In the previous two chapters we considered vibrational waves. In this chapter we will consider electron waves. (Remember that in quantum mechanics particles are just as well considered to be waves!)

10.1 Tight Binding Model in One Dimension

We described the molecular orbital, or tight binding, picture for molecules previously in section 5.3.2. We also met the equivalent picture, or LCAO (linear combination of atomic orbitals) model of bonding for homework. What we will do here is consider a chain of such molecular orbitals to represent orbitals in a macroscopic (one dimensional) solid.

Fig. 10.1.1: A chain of atoms with lattice spacing a, with one orbital on each atom: |1⟩, |2⟩, |3⟩, |4⟩, |5⟩, |6⟩.

In this picture, there is a single orbital on atom n which we call |n⟩. For convenience we will assume that the system has periodic boundary conditions (i.e., there are N sites, and site N

is the same as site 0). Further we will assume that all of the orbitals are orthogonal to each other.

$$\langle n | m \rangle = \delta_{n,m} \qquad (10.1)$$

Let us now take a general trial wavefunction of the form

$$|\Psi\rangle = \sum_n \phi_n |n\rangle$$

As we showed for homework, the effective Schrödinger equation for this type of tight-binding model can be written as

$$\sum_m H_{nm} \phi_m = E \phi_n \qquad (10.2)$$

where H_{nm} is the matrix element of the Hamiltonian

$$H_{nm} = \langle n | H | m \rangle$$

As mentioned previously when we studied the molecular orbital model, this Schrödinger equation is actually a variational approximation. For example, instead of finding the exact ground state, it finds the best possible ground state made up of the orbitals that we have put in the model. One can make the variational approach increasingly better by expanding the Hilbert space and putting more orbitals into the model. For example, instead of having only one orbital |n⟩ at a given site, one could consider many |n, α⟩ where α runs from 1 to some number p. As p is increased the approach becomes increasingly more accurate and eventually is essentially exact. This method of using tight-binding-like orbitals to increasingly well approximate the exact Schrödinger equation is known as LCAO (linear combination of atomic orbitals). However, one complication (which we treat only in one of the additional homework assignments) is that when we add many more orbitals we typically have to give up our nice orthogonality assumption, i.e., ⟨n, α | m, β⟩ = δ_{nm} δ_{αβ} no longer holds. This makes the effective Schrödinger equation a bit more complicated, but not fundamentally different. (See comments in section 5.3.2 above.)

At any rate, in the current chapter we will work with only one orbital per site and we assume the orthogonality Eq. 10.1. We write the Hamiltonian as

$$H = K + \sum_j V_j$$

where K = p²/(2m) is the kinetic energy and V_j is the Coulomb interaction of the electron with the nucleus at site j,

$$V_j = V(\mathbf{r} - \mathbf{r}_j)$$

where r_j is the position of the jth nucleus. With these definitions we have

$$H |m\rangle = (K + V_m)|m\rangle + \sum_{j \neq m} V_j |m\rangle$$

Now, we should recognize that K + V_m is the Hamiltonian which we would have if there were only a single nucleus (the mth nucleus) and no other nuclei in the system. Thus, if we take the tight-binding orbitals |m⟩ to be the atomic orbitals, then we have

$$(K + V_m)|m\rangle = \epsilon_{\rm atomic} |m\rangle$$

where ε_atomic is the energy of an electron on nucleus m in the absence of any other nuclei. Thus we can write

$$H_{n,m} = \langle n | H | m \rangle = \epsilon_{\rm atomic}\, \delta_{n,m} + \sum_{j \neq m} \langle n | V_j | m \rangle$$

We now have to figure out what the final term of this equation is. The meaning of this term is that, via the interaction with some nucleus which is not the mth, an electron on the mth atom can be transferred to the nth. Generally this can only happen if n and m are very close to each other. Thus, we write

$$\sum_{j \neq m} \langle n | V_j | m \rangle = \begin{cases} V_0 & n = m \\ -t & n = m \pm 1 \\ 0 & \text{otherwise} \end{cases} \qquad (10.3)$$

which defines both V_0 and t. (The V_0 term here does not hop an electron from one site to another, but rather just shifts the energy on a given site.) Note that by translational invariance of the system, we expect that the result should depend only on n − m, which this form does. These two types of terms, V_0 and t, are entirely analogous to the two types of terms V_cross and t that we met in section 5.3.2 above when we studied covalent bonding of two atoms.1 The situation here is similar except that now there are many nuclei instead of just two. With the above matrix elements we obtain

$$H_{n,m} = \epsilon_0\, \delta_{n,m} - t\,(\delta_{n+1,m} + \delta_{n-1,m}) \qquad (10.4)$$

where we have now defined2

$$\epsilon_0 = \epsilon_{\rm atomic} + V_0$$

This Hamiltonian is a very heavily studied model, known as the tight binding chain. Here t is known as the hopping term, as it allows the Hamiltonian (which generates time evolution) to move the electron from one site to another, and it has dimensions of energy. It stands to reason that the magnitude of t depends on how close together the orbitals are – becoming large when the orbitals are close together and decaying exponentially when they are far apart.

10.2 Solution of the Tight Binding Chain

The solution of the tight binding model in one dimension (the tight binding chain) is very analogous to what we did to study vibrations (and hence the point of presenting the tight binding model at this point!). We propose an ansatz solution

$$\phi_n = \frac{e^{-ikna}}{\sqrt{N}} \qquad (10.5)$$

where the denominator is included for normalization, there being N sites in the system. We now plug this ansatz into the Schrödinger equation Eq. 10.2. Note that in this case, there is no frequency in the exponent of our ansatz. This is simply because we are trying to solve the time-independent Schrödinger equation. Had we used the time-dependent equation, we would need a factor of e^{iωt} as well!

1 Just to be confusing, atomic physicists sometimes use J where I have used t here.

2 Once again ε_0 is not a dielectric constant or the permittivity of free space, but rather just the energy of having an electron sit on a site.

As with vibrations, it is obvious that k → k + 2π/a gives the same solution. Further, if we consider the system to have periodic boundary conditions with N sites (length L = Na), the allowed values of k are quantized in units of 2π/L. As with Eq. 8.6 there are precisely N possible different solutions of the form of Eq. 10.5.

Plugging the ansatz into the left side of the Schrödinger equation 10.2 and then using Eq. 10.4 gives us

$$\sum_m H_{n,m} \phi_m = \epsilon_0 \frac{e^{-ikna}}{\sqrt{N}} - t \left( \frac{e^{-ik(n+1)a}}{\sqrt{N}} + \frac{e^{-ik(n-1)a}}{\sqrt{N}} \right)$$

which we set equal to the right side of the Schrödinger equation

$$E \phi_n = E\, \frac{e^{-ikna}}{\sqrt{N}}$$

to obtain the spectrum

$$E = \epsilon_0 - 2t\cos(ka) \qquad (10.6)$$

which looks rather similar to the phonon spectrum of the one dimensional monatomic chain, which was (see Eq. 8.2)

$$\omega^2 = 2\frac{\kappa}{m} - 2\frac{\kappa}{m}\cos(ka)$$

Note however, that in the electronic case one obtains the energy, whereas in the phonon case one obtains the square of the frequency. This dispersion curve is shown in Fig. 10.1. Analogous to the phonon case, it is periodic in k → k + 2π/a. Further, analogous to the phonon case, the dispersion always has zero group velocity (is flat) at k = nπ/a for n any integer (i.e., at the Brillouin zone boundary).

Note that unlike free electrons, the electron dispersion here has a maximum energy as well as a minimum energy. Electrons only have eigenstates within a certain energy band. The word "band" is used both to describe the energy range for which eigenstates exist, as well as to describe one connected branch of the dispersion curve. (In this picture there is only a single mode at each k, hence one branch, hence a single band.) The energy difference from the bottom of the band to the top is known as the bandwidth. Within this bandwidth (between the top and bottom of the band) for any energy there exists (at least one) k state having that energy. For energies outside of the bandwidth there are no k-states with that energy.

The bandwidth (which in this model is 4t) is determined by the magnitude of the hopping, which, as mentioned above, depends on the distance between nuclei.3 As a function of the interatomic spacing then, the bandwidth increases as shown in Fig. 10.2. On the right of this diagram there are N states, each one being an atomic orbital |n⟩. On the left of the diagram these N states form a band, yet as discussed above, there remain precisely N states. (This should not surprise us, being that we have not changed the dimension of the Hilbert space, we have just expressed it in terms of the complete set of eigenstates of the Hamiltonian.)
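As a check, one can diagonalize the N × N Hamiltonian matrix of Eq. 10.4 directly and compare with the band of Eq. 10.6 (a sketch, assuming ℏ = 1 and illustrative values ε0 = 0, t = 1, N = 8, with periodic boundary conditions):

```python
import numpy as np

# Assumed illustrative values: eps0 = 0, t = 1, N = 8 sites, a = 1.
eps0, t, N, a = 0.0, 1.0, 8, 1.0

# Build H_{n,m} = eps0 * delta_{n,m} - t (delta_{n+1,m} + delta_{n-1,m})
# with periodic boundary conditions (site N is the same as site 0).
H = np.zeros((N, N))
for n in range(N):
    H[n, n] = eps0
    H[n, (n + 1) % N] = -t
    H[n, (n - 1) % N] = -t

numeric = np.sort(np.linalg.eigvalsh(H))

# Allowed k values are quantized in units of 2*pi/(N*a): exactly N of them.
k = 2.0 * np.pi * np.arange(N) / (N * a)
analytic = np.sort(eps0 - 2.0 * t * np.cos(k * a))

# The two spectra should agree up to rounding error.
print(np.max(np.abs(numeric - analytic)))
```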
Note that the average energy of a state in this band remains always zero. Aside: Note that if the band is not completely filled, the total energy of all of the electrons decreases as the atoms are moved together and the bandwidth increases (since the average energy remains zero, but some of the higher energy states are not filled). This decrease in energy is precisely the binding force of a "metallic

3Since the hopping t depends on an overlap between orbitals on adjacent atoms (See Eq. 10.3), in the limit that the atoms are well separated, the bandwidth will increase exponentially as the atoms are pushed closer together. 10.2. SOLUTION OF THE TIGHT BINDING CHAIN 91

   







      










Figure 10.1: Dispersion of the Tight Binding Chain.

bond"⁴ which we discussed in section 5.5. We also mentioned previously that one property of metals is that they are typically soft and malleable. This is a result of the fact that the electrons that hold the atoms together are mobile — in essence, because they are mobile, they can readjust their positions somewhat as the crystal is deformed.

Near the bottom of the band, the dispersion is parabolic. For our above dispersion (Eq. 10.6), expanding for small k, we obtain

E(k) = Constant + t a² k²

[Note that for t < 0, the energy minimum is at the Brillouin zone boundary k = π/a. In this case we would expand for k close to π/a instead of for k close to 0.] The resulting parabolic behavior is similar to that of free electrons, which have a dispersion

E_free(k) = ℏ²k² / (2m)

We can therefore view the bottom of the band as being almost like free electrons, except that we have to define a new effective mass, which we call m*, such that

ℏ²k² / (2m*) = t a² k²

⁴ Of course we have not considered the repulsive force between neighboring nuclei, so the nuclei do not get too close together. As in the case of the covalent bond considered above in section 5.3.2, some of the Coulomb repulsion between nuclei will be canceled by Vcross (here V0), the attraction of the electron on a given site to other nuclei.


Figure 10.2: Caricature of the Dependence of Bandwidth on Interatomic Spacing.

which gives us

m* = ℏ² / (2ta²)

In other words, the effective mass m* is defined such that the dispersion of the bottom of the band is exactly like the dispersion of free particles of mass m*. (We will discuss effective mass in much more depth in chapter 16 below. This is just a quick first look at it.) Note that this mass has nothing to do with the actual mass of the electron, but rather depends on the hopping matrix element t. Further, we should keep in mind that the k that enters into the dispersion relationship is actually the crystal momentum, not the actual momentum of the electron (recall that crystal momentum is defined only modulo 2π/a). However, so long as we stay at very small k, there is no need to worry about the periodicity of k. Nonetheless, we should keep in mind that if electrons scatter off of other electrons, or off of phonons, it is the crystal momentum that is conserved. (See the discussion in section 8.4.)
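The dispersion of Eq. 10.6 and the effective-mass formula can be checked numerically. The sketch below uses units with ℏ = 1 and illustrative values of t and a that I have chosen (not taken from the notes): it confirms that the bandwidth is 4t and that the curvature of the band bottom matches m* = ℏ²/(2ta²).

```python
import math

# Tight-binding parameters (illustrative values; units with hbar = 1).
t, a, eps0, hbar = 1.0, 1.0, 0.0, 1.0

def E(k):
    """Dispersion of the monatomic tight-binding chain, Eq. 10.6."""
    return eps0 - 2.0 * t * math.cos(k * a)

# Bandwidth: energy difference between the band top (k = pi/a) and bottom (k = 0).
bandwidth = E(math.pi / a) - E(0.0)

# Effective mass from the analytic formula m* = hbar^2 / (2 t a^2) ...
m_star = hbar**2 / (2.0 * t * a**2)

# ... compared with m* = hbar^2 / E''(0), where the curvature E''(0) is
# estimated by a central finite difference at the bottom of the band.
h = 1e-4
curvature = (E(h) - 2.0 * E(0.0) + E(-h)) / h**2
m_star_numeric = hbar**2 / curvature
```

The finite-difference estimate agrees with the analytic effective mass to high accuracy, since the dispersion really is parabolic to leading order near k = 0.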

10.3 Introduction to Electrons Filling Bands

We now imagine that our tight binding model is actually made up of atoms and each atom "donates" one electron into the band (i.e., the atom has valence one). Since there are N possible k-states in the band, and electrons are fermions, you might guess that this would precisely fill the band. However, there are two possible spin states for an electron at each k, so in fact this only half-fills the band. This is depicted in the left of Fig. 10.3. The filled states (shaded) in this picture are filled with both up and down spins.

It is crucial in this picture that there is a Fermi surface — the points where the shaded region meets the unshaded region. If a small electric field is applied to the system, it only costs a very small amount of energy to shift the Fermi surface as shown in the right of Fig. 10.3, populating a few k-states moving right and de-populating some k-states moving left. In other words, the state of the system responds by changing a small bit, and a current is induced. As such, this system is a metal in that it conducts electricity. Indeed, crystals of atoms that are mono-valent are very frequently metals!

       

 

             

 

 

Figure 10.3: Left: If each atom has valence 1, then the band is half-filled. The states that are shaded are filled with both up and down spin electrons. The Fermi surface is the boundary between the filled and unfilled states. Right: When a small electric field is applied, at only a small cost of energy, the Fermi sea can shift slightly thus allowing current to run.

On the other hand, if each atom in our model were di-valent (donates two electrons to the band), then the band would be entirely full of electrons. In fact, it does not matter if we think about this as being a full band where every k-state |k⟩ is filled with two electrons (one up and one down), or a filled band where every site |n⟩ is filled — these two statements describe the same multi-electron wavefunction. In fact, there is a single unique wavefunction that describes this completely filled band.

In the case of the filled band, were one to apply a small electric field to this system, the system cannot respond at all. There is simply no freedom to repopulate the occupation of k-states because every state is already filled. We conclude an important principle:

Principle: A filled band carries no current.

Thus our example of a di-valent tight-binding model is an insulator. (This type of insulator is known as a band insulator.) Indeed, many systems of di-valent atoms are insulators (although in a moment we will discuss how di-valent atoms can also form metals).
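The counting argument above is easy to verify numerically. The following is a minimal sketch with an N-site chain, the dispersion of Eq. 10.6, and illustrative parameter values of my own choosing: with valence 1 the Fermi energy sits in the middle of the band (here at ε0), while with valence 2 every state up to the very top of the band is occupied, leaving nowhere for electrons to go.

```python
import math

# Discretize the single band of an N-site tight-binding chain
# (parameter values are illustrative, not from the notes).
N, t, eps0, a = 64, 1.0, 0.0, 1.0

# The N allowed k-states in the Brillouin zone [-pi/a, pi/a).
ks = [-math.pi / a + 2.0 * math.pi * m / (N * a) for m in range(N)]
levels = sorted(eps0 - 2.0 * t * math.cos(k * a) for k in ks)

def fermi_energy(n_electrons):
    """Energy of the highest occupied level when the lowest-energy k-states
    are each filled with two electrons (spin up and spin down)."""
    n_k_filled = (n_electrons + 1) // 2
    return levels[n_k_filled - 1]

ef_half = fermi_energy(N)       # valence 1: band half-filled, Fermi surface mid-band
ef_full = fermi_energy(2 * N)   # valence 2: band completely filled to its top
```

With valence 1 there are empty states infinitesimally above the Fermi energy; with valence 2 the highest occupied level is the band top itself, so any excitation must jump to another band.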

10.4 Multiple Bands

In the above model, we considered only the case where there is a single atom in the unit cell and a single orbital per atom. However, more generally we might consider a case where we have multiple orbitals per unit cell.

One possibility is to consider one atom per unit cell, but several orbitals per atom.⁵ Analogous to what we found with the above tight binding model, when the atoms are very far apart, one has only the atomic orbitals on each atom. However, as the atoms are moved closer together, the orbitals merge together and the energies spread to form bands.⁶ Analogous to Fig. 10.2, we

⁵ Each atom actually has an infinite number of orbitals to be considered. But only a small number of them are filled, and within our level of approximation we need consider only a very few of them.

⁶ This picture of atomic orbitals in the weak hopping limit merging together to form bands does not depend on the fact that the crystal of atoms is ordered. Glasses and amorphous solids can have this sort of band structure as well!

have shown how this occurs for the two-band case in Fig. 10.4.


Figure 10.4: Caricature of Bands for a Two-Band Model as a Function of Interatomic Spacing. In the atomic limit, the orbitals have energies ε_atomic^1 and ε_atomic^2. If the system has valence one (per unit cell), then in the atomic limit, the lower orbital is filled and the upper orbital is empty. When the atoms are pushed together, the lower band will remain filled, and the upper will remain empty, until the bands start to overlap, whereupon we may have two bands both partially filled, and the system becomes a metal.

A very similar situation occurs when we have two atoms per unit cell but only one orbital per atom. We will do a problem like this for homework.⁷ However, the general result will be quite analogous to what we found for vibrations of a diatomic chain in chapter 9. In Fig. 10.5 we show the spectrum of a tight-binding model with two different atoms per unit cell, each having a single orbital. We have shown results here in both the reduced and extended zone schemes. As for the case of vibrations, we see that there are now two possible energy eigenstates at each value of k. In the language of electrons, we say that there are two bands (we do not use the words "acoustic" and "optical" for electrons, but the idea is similar). Note that there is a gap between the two bands where there are simply no energy eigenstates.

Let us think for a second about what should result in this situation. If each atom (of either type) were divalent, then the two electrons donated would completely fill the single orbital on each site. In this case, both bands would be completely filled with both spin-up and spin-down electrons.

On the other hand, if each atom (of either type) is monovalent, then this means exactly half of the states of the system should be filled. However, here, when one fills half of the states of the system, then all of the states of the lower band are completely filled (with both spins) but all of

⁷ The homework problem is sufficiently simplified that the bands do not overlap as they do in Figure 10.4. One can obtain overlapping bands by including second-neighbor hopping as well as nearest-neighbor hopping. (If you are brave you might try it!)

       

 

 

 

 

                 

Figure 10.5: Diatomic Tight Binding Dispersion in One Dimension. Left: Reduced zone scheme. Right: Extended zone scheme.

the states in the upper band are completely empty. In the extended zone scheme it appears that a gap has opened up precisely where the Fermi surface is (at the Brillouin zone boundary)!

In the situation where a lower band is completely filled but an upper band is completely empty, if we apply a weak electric field to the system, can current flow? In this case, one cannot rearrange electrons within the lower band, but one can remove an electron from the lower band and put it in the upper band in order to change the overall (crystal) momentum of the system. However, moving an electron from the lower band requires a finite amount of energy — one must overcome the gap between the bands. As a result, for small enough electric fields (and at low temperature), this cannot happen. We conclude that a filled band is an insulator as long as there is a finite gap to any higher empty bands.

As with the single band case, one can imagine the magnitude of hopping changing as one changes the distance between atoms. When the atoms are far apart, one is in the atomic limit, but these atomic states spread into bands as the atoms get closer together, as shown in Fig. 10.4. For the case where each atom is mono-valent, in the atomic limit half of the states are filled — that is, the lower-energy atomic orbital is filled with both spin-up and spin-down electrons whereas the higher-energy orbital is completely empty. (I.e., an electron is transferred from the higher-energy atom to the lower-energy atom and this completely fills the lower-energy band.) As the atoms are brought closer together, the atomic orbitals spread into bands (the hopping t increases). However, at some point the bands get so wide that their energies overlap,⁸ in which case there is no gap to transfer electrons between bands, and the system becomes a metal, as marked in Fig. 10.4.
(If it is not clear how bands may overlap, consider, for example the right side of Fig. 15.2. Band overlaps may occur — in fact, they often occur! — when we consider systems that are two and three dimensional.)
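One concrete way to see the two-band structure is to diagonalize a 2×2 Bloch Hamiltonian for a chain with two inequivalent sites per unit cell. The specific Hamiltonian form and all parameter values below are my own assumptions for illustration (this is not the homework problem itself); with equal hoppings t1 = t2, the gap at the zone boundary comes out equal to the on-site energy difference of the two atoms.

```python
import math, cmath

# Diatomic tight-binding chain: on-site energies epsA, epsB on the two
# sublattices, alternating hoppings t1, t2 (all values illustrative).
epsA, epsB, t1, t2, a = 0.0, 1.0, 1.0, 1.0, 1.0

def bands(k):
    """Eigenvalues of the 2x2 Bloch Hamiltonian at crystal momentum k."""
    f = t1 + t2 * cmath.exp(1j * k * a)            # inter-sublattice hopping term
    mid = 0.5 * (epsA + epsB)
    half = math.hypot(0.5 * (epsA - epsB), abs(f))
    return mid - half, mid + half                  # lower band, upper band

# Scan the Brillouin zone [-pi/a, pi/a] and locate the band edges.
ks = [math.pi * (m / 200.0 - 1.0) / a for m in range(401)]
lower_top = max(bands(k)[0] for k in ks)           # top of the lower band
upper_bottom = min(bands(k)[1] for k in ks)        # bottom of the upper band
gap = upper_bottom - lower_top
```

Both band extrema occur at the zone boundary k = ±π/a, where |f| is smallest, matching the picture that the gap opens precisely where the half-filled Fermi surface would have been.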

10.5 Summary of Tight Binding Chain

• Solving the tight-binding Schrödinger equation for electron waves is very similar to solving the equations for vibrational (phonon) waves. The structure of the reciprocal lattice and the Brillouin

⁸ As mentioned above, in our simplified model one needs to consider second-neighbor hopping to get overlapping bands.

zone remains the same.

• Obtain energy bands where energy eigenstates exist, and gaps between bands.

• Zero hopping is the atomic limit; as hopping increases, atomic orbitals spread into bands.

• Energies are parabolic in k near the bottom of the band — looks like free electrons, but with a modified effective mass.

• A filled band with a gap to the next band is an insulator (a band insulator); a partially filled band has a Fermi surface and is a metal.

• Whether a band is filled depends on the valence of the atoms.

• As we found for phonons, gaps open at Brillouin zone boundaries. Group velocities are also zero at zone boundaries.

References

No book has an approach to tight binding that is exactly like what we have here. The books that come closest do essentially the same thing, but in three dimensions (which complicates life a bit). These books are:

• Ibach and Lüth, section 7.3
• Kittel, chapter 9, section on tight binding
• Burns, sections 10.9 and 10.10
• Singleton, chapter 4
• Possibly the nicest (albeit short) description is given by Dove, section 5.5.5
• A nice short description of the physics (without any detail) is given by Rosenberg, section 8.19
• Finally, an alternative approach to tight binding is given by Hook and Hall, section 4.3. The discussion of Hook and Hall is good (and they consider one dimension, which is nice), but they insist on using the time-dependent Schrödinger equation, which is annoying.

Part IV

Geometry of Solids


Chapter 11

Crystal Structure

Having introduced a number of important ideas in one dimension, we must now deal with the fact that our world is actually spatially three dimensional. While this adds a bit of complication, really the important concepts are no harder in three dimensions than they were in one dimension. Some of the most important ideas we have already met in one dimension, but we will reintroduce them more generally here.

There are two parts that might be difficult here. First, we do need to wrestle with a bit of geometry. Hopefully most will not find this too hard. Secondly, we will also need to establish a language in order to describe structures in two and three dimensions intelligently. As such, much of this chapter is just a list of definitions to be learned, but unfortunately this is necessary in order to be able to carry on further at this point.

11.1 Lattices and Unit Cells

Definition 11.1.1. A Lattice¹ is an infinite set of points defined by integer sums of a set of linearly independent primitive lattice vectors.²

For example, in two dimensions, as shown in figure 11.1 the lattice points are described as

R_[n1 n2] = n1 a1 + n2 a2,   n1, n2 ∈ ℤ   (2d)

with a1 and a2 being the primitive lattice vectors and n1 and n2 being integers. In three dimensions, points of a lattice are analogously indexed by three integers

R_[n1 n2 n3] = n1 a1 + n2 a2 + n3 a3,   n1, n2, n3 ∈ ℤ   (11.1)   (3d)

¹ Warning: Some books (Ashcroft and Mermin in particular) refer to this as a Bravais lattice. This enables them to use the term Lattice to describe other things that we would not call a lattice (cf. the honeycomb). However, the definition we use here is more common, and more correct mathematically as well. [Thank you, Mike Glazer, for catching this.]

² Very frequently "primitive lattice vectors" are called "primitive basis vectors", although the former is probably more precise. Furthermore, we have already used the word "basis" before in section 9.1, and unfortunately here this is a different use of the same word. At any rate, we will try to use "primitive lattice vector" to avoid such confusion.


Figure 11.1: A lattice is defined as integer sums of a set of primitive lattice vectors.

Note that in one dimension this definition of a lattice fits with our previous description of a lattice as being the points R = na with n an integer. It is important to point out that in two and three dimensions the choice of primitive lattice vectors is not unique,³ as shown in figure 11.2. (In 1d, the single primitive lattice vector is unique up to the sign (direction) of a.)

Figure 11.2: The choice of primitive lattice vectors for a lattice is not unique.

It turns out that there are several definitions that are entirely equivalent to the one we have just given:

Equivalent Definition 11.1.1.1. A Lattice is an infinite set of vectors where addition of any two vectors in the set gives a third vector in the set.

It is easy to see that our above first definition 11.1.1 implies the second one 11.1.1.1. Here is a less crisply defined, but sometimes more useful definition.

Equivalent Definition 11.1.1.2. A Lattice is a set of points where the environment of any given point is equivalent to the environment of any other given point.

³ Given a set of primitive lattice vectors a_i, a new set of primitive lattice vectors may be constructed as b_i = Σ_j m_ij a_j, so long as m_ij is an invertible matrix with integer entries and the inverse matrix [m⁻¹]_ij also has integer entries.
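The condition in this footnote has a concrete form: an integer matrix has an integer inverse exactly when its determinant is ±1 (a so-called unimodular matrix). A small sketch for the two-dimensional case (the example matrices are my own, not from the notes):

```python
# A change of primitive lattice vectors b_i = sum_j m_ij a_j is allowed iff
# m is an integer matrix whose inverse is also integer; for a 2x2 integer
# matrix this is equivalent to det(m) = +1 or -1.

def is_valid_change_of_basis(m):
    """m is a 2x2 integer matrix [[p, q], [r, s]]."""
    (p, q), (r, s) = m
    det = p * s - q * r
    # If det = +-1 then m^{-1} = [[s, -q], [-r, p]] / det has integer entries.
    return det in (1, -1)

ok = is_valid_change_of_basis([[1, 1], [0, 1]])    # a shear: still primitive vectors
bad = is_valid_change_of_basis([[2, 0], [0, 1]])   # doubles the cell: not allowed
```

The shear keeps the unit-cell area fixed, while the second matrix produces vectors spanning a cell of twice the area, which therefore cannot be primitive.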

It turns out that any periodic structure can be expressed as a lattice of repeating motifs. A cartoon of this statement is shown in Fig. 11.3.   




Figure 11.3: Any periodic structure can be represented as a lattice of repeating motifs.

One should be cautious, however, that not all periodic arrangements of points are lattices. The honeycomb⁴ shown in Fig. 11.4 is not a lattice. This is obvious from the third definition 11.1.1.2: the environments of point P and point R are actually different — point P has a neighbor directly above it (the point R), whereas point R has no neighbor directly above it. In order to describe a honeycomb (or other more complicated arrangements of points) we have the idea of a unit cell, which we have met before in section 9.1 above. Generally we have

Definition 11.1.2. A unit cell is a region of space such that when many identical units are stacked together it tiles (completely fills) all of space and reconstructs the full structure

An equivalent (but less rigorous) definition is

Equivalent Definition 11.1.2.1. A unit cell is the repeated motif which is the elementary building block of the periodic structure.

To be more specific we frequently want to work with the smallest possible unit cell

⁴ One should be careful not to call this a hexagonal lattice. First of all, by our definition it is not a lattice at all, since all points do not have the same environment. Secondly, some people use the term "hexagonal" to mean what the rest of us call a triangular lattice: a lattice of triangles where each point has six nearest-neighbor points. (See Fig. 11.6 below.)


Figure 11.4: The honeycomb is not a lattice. Points P and R are inequivalent. (Points P and Q are equivalent)

Definition 11.1.3. A primitive unit cell for a periodic crystal is a unit cell containing only a single lattice point.

As mentioned above in section 9.1 the definition of the unit cell is never unique. This is shown, for example, in Fig. 11.5

Figure 11.5: The choice of a unit cell is not unique. All of these unit cells reconstruct the same crystal.

Sometimes it is useful to define a unit cell which is not primitive in order to make it simpler to work with. This is known as a conventional unit cell. Almost always these conventional unit cells are chosen so as to have orthogonal axes.

Some examples of possible unit cells are shown for the triangular lattice in Fig. 11.6. In this

figure the conventional unit cell (upper left) is chosen to have orthogonal axes — which is often easier to work with than axes which are non-orthogonal.


Figure 11.6: Some unit cells for the triangular lattice.

Figure 11.7: The Wigner-Seitz construction for a lattice in 2d.

A note about counting the number of lattice points in the unit cell: it is frequently the case that we will work with unit cells where the lattice points live at the corners (or edges) of the cells. When a lattice point is on the boundary of the unit cell, it should only be counted fractionally, depending on what fraction of the point is actually in the cell. So, for example, in the conventional unit cell shown in Fig. 11.6, there are two lattice points within this cell. Obviously there is one point in the center, then four points at the corners, each of which is one quarter inside the cell, so we obtain 1 + 4(1/4) = 2 points in the cell. (Since there are two points in this cell, it is, by definition, not primitive.) Similarly, for the primitive cell shown in this figure (upper right), the two lattice points at the left and the right each have a 60° slice (which is 1/6 of a circle) inside the cell. The two points at the top and the bottom each have 1/3 of the point inside the unit cell. Thus this unit cell contains 2(1/3) + 2(1/6) = 1 point, and is thus primitive. Note, however, that we can just imagine shifting the unit cell a tiny amount in almost any direction such that a single lattice point is completely inside the unit cell and the others are completely outside it. This sometimes makes counting much easier.

Also shown in Fig. 11.6 is a so-called Wigner-Seitz unit cell.⁵

⁵ Eugene Wigner was yet another Nobel laureate who was one of the truly great minds of the last century

Definition 11.1.4. Given a lattice point, the set of all points in space which are closer to that given lattice point than to any other lattice point constitute the Wigner-Seitz cell of the given lattice point.

There is a rather simple scheme for constructing such a Wigner-Seitz cell: choose a lattice point and draw lines to all of its possible near neighbors (not just its nearest neighbors). Then draw perpendicular bisectors of all of these lines. The perpendicular bisectors bound the Wigner-Seitz cell.⁶ It is always true that the Wigner-Seitz construction for a lattice gives a primitive unit cell. In figure 11.7 we show another example of the Wigner-Seitz construction for a two-dimensional lattice. A similar construction can be performed in three dimensions, in which case one must construct perpendicular-bisecting planes to bound the Wigner-Seitz cell.

The description of objects in the unit cell in terms of the reference point in the unit cell is known as a "basis". (This is the same definition of "basis" that we used in section 9.1 above.)
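Returning for a moment to Definition 11.1.4: it translates directly into a numerical test, since a point r belongs to the Wigner-Seitz cell of the origin precisely when the origin is the nearest lattice point to r. A minimal sketch for the 2D square lattice (the choice of lattice and the test points are my own):

```python
import itertools, math

# A point r lies in the Wigner-Seitz cell of the origin iff no other lattice
# point is closer to r than the origin is (Definition 11.1.4).
# Illustrated here for the 2D square lattice with lattice constant a.
a = 1.0

def in_wigner_seitz_cell(r):
    x, y = r
    d0 = math.hypot(x, y)                          # distance to the origin
    for n1, n2 in itertools.product(range(-2, 3), repeat=2):
        if (n1, n2) == (0, 0):
            continue
        if math.hypot(x - n1 * a, y - n2 * a) < d0:
            return False                           # another lattice point is closer
    return True

inside = in_wigner_seitz_cell((0.30 * a, 0.20 * a))    # well inside the cell
outside = in_wigner_seitz_cell((0.60 * a, 0.0))        # closer to the point (a, 0)
```

For the square lattice this reproduces the expected Wigner-Seitz cell: a square of side a centered on the origin, bounded by the perpendicular bisectors to the four nearest neighbors.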


Figure 11.8: Left: A periodic structure in two dimensions. A unit cell is marked with the dotted lines. Right: A blow-up of the unit cell with the coordinates of the particles in the unit cell with respect to the reference point in the lower left hand corner. The basis is the description of the atoms along with these positions.

In Fig. 11.8 we show a periodic structure in two dimensions made of two types of atoms. On the right we show a primitive unit cell (expanded) with the positions of the atoms given with respect to the reference point of the unit cell, which is taken to be the lower left-hand corner. We can describe the basis of this crystal as follows:

of physics. Frederick Seitz was far less famous, but gained notoriety in his later years by being a consultant for the tobacco industry, a strong proponent of the Reagan-era Star Wars missile defense system, and a prominent sceptic of global warming. He passed away in 2008.

⁶ This Wigner-Seitz construction can be done on an irregular collection of points as well as on a periodic lattice. For such an irregular set of points the resulting construction is known as a Voronoi cell.


Figure 11.9: The honeycomb from Fig. 11.4 with the two inequivalent points of the unit cell given different shades. The unit cell is outlined dotted on the left and the corners of the unit cell are marked with small black dots. On the right the unit cell is expanded and coordinates are given with respect to the reference point.

Basis for crystal in Fig. 11.8 =
  Large light gray atom: position [a/2, a/2]
  Small dark gray atoms: positions [a/4, a/4], [a/4, 3a/4], [3a/4, a/4], [3a/4, 3a/4]

The reference points forming the square lattice have positions

R_[n1 n2] = [a n1, a n2] = a n1 x̂ + a n2 ŷ   (11.2)

with n1, n2 integers, so that the large light gray atoms have positions

R^light-gray_[n1 n2] = [a n1, a n2] + [a/2, a/2]

whereas the small dark gray atoms have positions

R^dark-gray1_[n1 n2] = [a n1, a n2] + [a/4, a/4]
R^dark-gray2_[n1 n2] = [a n1, a n2] + [a/4, 3a/4]
R^dark-gray3_[n1 n2] = [a n1, a n2] + [3a/4, a/4]
R^dark-gray4_[n1 n2] = [a n1, a n2] + [3a/4, 3a/4]

In this way you can say that the positions of the atoms in the crystal are "the lattice plus the basis".
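"The lattice plus the basis" is easy to turn into code. The sketch below generates the atom positions of the Fig. 11.8 crystal from the square lattice and the basis listed above (the patch size is an arbitrary choice of mine), and confirms the counting of one large and four small atoms per unit cell:

```python
# Generate atom positions of the Fig. 11.8 crystal as "lattice plus basis":
# a square lattice with one large atom and four small atoms per unit cell.
a = 1.0
basis = {
    "light": [(a / 2, a / 2)],
    "dark":  [(a / 4, a / 4), (a / 4, 3 * a / 4),
              (3 * a / 4, a / 4), (3 * a / 4, 3 * a / 4)],
}

def positions(kind, n_cells=2):
    """All atoms of the given kind in an n_cells x n_cells patch of crystal."""
    pts = []
    for n1 in range(n_cells):
        for n2 in range(n_cells):
            for (bx, by) in basis[kind]:
                pts.append((a * n1 + bx, a * n2 + by))   # R_[n1 n2] + basis offset
    return pts

n_light = len(positions("light"))   # 1 per cell -> 4 atoms in a 2x2 patch
n_dark = len(positions("dark"))     # 4 per cell -> 16 atoms in a 2x2 patch
```

The counts scale with the number of cells in the patch, which is exactly the sense in which the basis is specified "per unit cell".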

We can now return to the case of the honeycomb shown in Fig. 11.4 above. The same honeycomb is shown in Fig. 11.9 as well with the lattice and the basis explicitly shown. Here, the reference points (small black dots) form a (triangular) lattice, where we can write the primitive lattice vectors as

a1 = a x̂

a2 = (a/2) x̂ + (a√3/2) ŷ

In terms of the reference points of the lattice, the basis for the primitive unit cell, i.e., the coordinates of the two larger circles with respect to the reference point, are given by (1/3)(a1 + a2) and (2/3)(a1 + a2).
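As a numerical check of this construction (the lattice constant and patch size below are my own choices), one can generate the honeycomb as a triangular lattice plus this two-point basis and verify that every site has exactly three nearest neighbours at distance a/√3, which is the expected honeycomb geometry:

```python
import itertools, math

# Honeycomb = triangular lattice plus a two-point basis (Fig. 11.9).
a = 1.0
a1 = (a, 0.0)
a2 = (a / 2, a * math.sqrt(3) / 2)
basis = [((a1[0] + a2[0]) / 3, (a1[1] + a2[1]) / 3),
         (2 * (a1[0] + a2[0]) / 3, 2 * (a1[1] + a2[1]) / 3)]

# A patch of honeycomb sites: lattice point plus basis point.
sites = [(n1 * a1[0] + n2 * a2[0] + bx, n1 * a1[1] + n2 * a2[1] + by)
         for n1, n2 in itertools.product(range(-3, 4), repeat=2)
         for (bx, by) in basis]

def count_neighbours(p, dist, tol=1e-9):
    """Number of sites at distance dist (within tol) from the point p."""
    return sum(1 for q in sites if abs(math.dist(p, q) - dist) < tol)

d_nn = a / math.sqrt(3)                       # nearest-neighbour spacing
n_nn_P = count_neighbours(basis[0], d_nn)     # the two inequivalent sites
n_nn_R = count_neighbours(basis[1], d_nn)     # of the central unit cell
```

Both inequivalent sites have three nearest neighbours; what distinguishes them (and makes the honeycomb not a lattice) is the orientation of those neighbours, not their number.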

Figure 11.10: A simple cubic lattice

11.2 Lattices in Three Dimensions

The simplest lattice in three dimensions is the simple cubic lattice shown in Fig. 11.10 (sometimes known as the cubic "P" or cubic-primitive lattice). The primitive unit cell in this case can most conveniently be taken to be a single cube, which includes 1/8 of each of its eight corners.

In fact, real crystals of atoms are rarely simple cubic.⁷ To understand why this is so, think of an atom as a small sphere. When you assemble spheres into a simple cubic lattice you find that it is a very inefficient way to pack the spheres together, in that you are left with a lot of empty space in the center of the unit cells, and this turns out to be energetically unfavorable in most cases.

Only slightly more complicated than the simple cubic lattice are the tetragonal and orthorhombic lattices, where the axes remain perpendicular but the primitive lattice vectors may be of different lengths (shown in Fig. 11.11). The orthorhombic unit cell has three different lengths of its perpendicular primitive lattice vectors, whereas the tetragonal unit cell has two lengths the same and one different.

Conventionally, to represent a given vector amongst the infinite number of possible lattice vectors in a lattice, one writes

[uvw] = u a1 + v a2 + w a3   (11.3)

⁷ Of all of the chemical elements, Polonium is the only one which forms a simple cubic lattice.

Figure 11.11: Unit cells for orthorhombic (left) and tetragonal (right) lattices.

where u, v and w are integers. For cases where the lattice vectors are orthogonal, the basis vectors a1, a2, and a3 are assumed to be in the x̂, ŷ and ẑ directions.⁸ We have seen this notation before, for example, in the subscripts of the equations after Definition 11.1.1.

Lattices in three dimensions certainly exist where the axes are non-orthogonal, but... you will not be held responsible for any three dimensional crystal system where the coordinate axes are not orthogonal.

Two further lattice systems that you will need to know are the Face Centered Cubic (fcc) and Body Centered Cubic (bcc) lattices. In terms of our above discussion of atoms as being like small spheres, packing spheres in either a bcc or fcc lattice leaves much less open space between the spheres than packing the spheres in a simple cubic lattice.⁹ Correspondingly, these two lattices are realized much more frequently in nature.

The Body Centered Cubic (bcc) Lattice

The body centered cubic (bcc) lattice is a simple cubic lattice where there is an additional point in the very center of the cube (this is sometimes known¹⁰ as cubic-I). The unit cell is shown in the left of Fig. 11.12. Another way to show this unit cell, which does not rely on showing a three-dimensional picture, is to use a so-called plan view of the unit cell, shown in the right of Fig. 11.12. A plan view (a term used in engineering and architecture) is a two-dimensional projection from the top of an object where heights are labeled to show the third dimension.

In the picture of the bcc unit cell, there are eight lattice points on the corners of the cell (each of which is 1/8 inside of the conventional unit cell) and one point in the center of the cell. Thus the conventional unit cell contains exactly two (= 8 × 1/8 + 1) lattice points.

Packing together these unit cells to fill space, we see that the lattice points of a full bcc lattice can be described as being points having coordinates [x, y, z] where either all three coordinates are integers [uvw] times the lattice constant a, or all three are odd-half-integers times the lattice

⁸ Note that this notation is also sometimes abused, as in Eq. 11.2, where the brackets [an1, an2] enclose not integers, but distances which are integer multiples of a lattice constant a. To try to make things more clear, in the latter usage we will put commas between the entries, whereas the typical [uvw] usage has no commas. However, most references will be extremely lax and switch between various types of notation freely.

⁹ In fact it is impossible to pack spheres more densely than you would get by placing the spheres at the vertices of an fcc lattice. This result (known empirically to people who have tried to pack oranges in a crate) was first officially conjectured by Johannes Kepler in 1611, but was not mathematically proven until 1998!

¹⁰ Cubic-I comes from "Innenzentriert" (inner centered). This notation was introduced by Bravais in his 1848 treatise. (Interestingly, Europe was burning in 1848, but obviously that didn't stop science from progressing.)


Figure 11.12: Conventional unit cell for the body centered cubic (I) lattice. Left: 3D view. Right: A plan view of the conventional unit cell. Unlabeled points are at both heights 0 and a.

constant a.

It is often convenient to think of the bcc lattice as a simple cubic lattice with a basis of two atoms per conventional cell. The simple cubic lattice contains points [x, y, z] where all three coordinates are integers in units of the lattice constant. Within the conventional simple-cubic unit cell we put one point at position [0, 0, 0] and another point at the position [a/2, a/2, a/2], in units of the lattice constant. Thus the points of the bcc lattice are written as

Rcorner = [an1,an2,an3]

Rcenter = [an1,an2,an3] + [a/2,a/2,a/2] as if the two different types of points were two different types of atoms, although all points in this lattice should be considered equivalent (they only look inequivalent because we have chosen a conventional unit cell with two lattice points in it). Now, we may ask why it is that this set of points forms a lattice. In terms of our first definition of a lattice (Definition 11.1.1) we can write the primitive lattice vectors of the bcc lattice as

a1 = [a, 0, 0]

a2 = [0, a, 0]
a3 = [a/2, a/2, a/2]

It is easy to check that any combination

R = n1 a1 + n2 a2 + n3 a3   (11.4)

with n1, n2 and n3 integers gives a point within our definition of the bcc lattice (that the three coordinates are either all integer or all half-odd-integer times the lattice constant). Further, one can check that any point satisfying the conditions for the bcc lattice can be written in the form of Eq. 11.4. We can also check that our description of a bcc lattice satisfies our second description of a lattice (definition 11.1.1.1): that addition of any two points of the lattice (given by Eq. 11.4) gives another point of the lattice.

More qualitatively, we can consider definition 11.1.1.2 of the lattice — that the local environment of every point in the lattice should be the same. Examining the point in the center of the unit cell, we see that it has precisely 8 nearest neighbors, one in each of the possible diagonal directions. Similarly, any of the points in the corners of the unit cells will have 8 nearest neighbors corresponding to the points in the center of the 8 adjacent unit cells.

The coordination number of a lattice (frequently called Z or z) is the number of nearest neighbors any point of the lattice has. For the bcc lattice the coordination number is Z = 8.

As in two dimensions, a Wigner-Seitz cell can be constructed around each lattice point which encloses all points in space that are closer to that lattice point than to any other point in the lattice. This Wigner-Seitz unit cell for the bcc lattice is shown in Figure 11.13. Note that this cell is bounded by the perpendicular bisecting planes between lattice points.
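The coordination number Z = 8 can be checked by brute force: generate the bcc points (all-integer or all-half-odd-integer coordinates in units of a, as described above) and count the neighbours of the origin at the minimal distance, which comes out to a√3/2. A small sketch (the grid size is my own choice):

```python
import itertools, math

# Generate a patch of bcc lattice points: simple cubic corner points plus
# body centers, in units of the lattice constant a.
a = 1.0
pts = []
for n1, n2, n3 in itertools.product(range(-2, 3), repeat=3):
    pts.append((a * n1, a * n2, a * n3))                          # corner points
    pts.append((a * (n1 + 0.5), a * (n2 + 0.5), a * (n3 + 0.5)))  # body centers

# Distances from the origin to all other lattice points in the patch.
origin = (0.0, 0.0, 0.0)
dists = sorted(math.dist(origin, p) for p in pts if p != origin)

d_nn = dists[0]                                     # nearest-neighbour distance
Z = sum(1 for d in dists if abs(d - d_nn) < 1e-9)   # coordination number
```

The eight nearest neighbours of a corner point are the body centers of the eight adjacent cubes, exactly as described in the text.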

Figure 11.13: Wigner-Seitz unit cell for the bcc lattice (left) and the fcc lattice (right).

The Face Centered Cubic (fcc) Lattice


Figure 11.14: Conventional unit cell for the face centered cubic (F) lattice. Left: 3D view. Right: A plan view of the conventional unit cell. Unlabeled points are both at heights 0 and a.

The face centered cubic (fcc) lattice is a simple cubic lattice where there is an additional lattice point in the center of every face of every cube (this is sometimes known as cubic-F, for "face centered"). The unit cell is shown in the left of Fig. 11.14. A plan view of the unit cell is shown on the right of Fig. 11.14.

In the picture of the fcc unit cell, there are eight lattice points on the corners of the cell (each of which is 1/8 inside the conventional unit cell) and one point in the center of each of the 6 faces, each of which is 1/2 inside the cell. Thus the conventional unit cell contains exactly four (= 8 × 1/8 + 6 × 1/2) lattice points. Packing together these unit cells to fill space, we see that the lattice points of a full fcc lattice can be described as being points having coordinates (x, y, z) where either all three coordinates are integers times the lattice constant a, or two of the three coordinates are half-odd-integers times the lattice constant a and the remaining coordinate is an integer times the lattice constant a.

Analogous to the bcc case, it is sometimes convenient to think of the fcc lattice as a simple cubic lattice with a basis of four atoms per conventional cell. The simple cubic lattice contains points [x, y, z] where all three coordinates are integers in units of the lattice constant a. Within the conventional simple cubic unit cell we put one point at position [0, 0, 0], another point at the position [a/2, a/2, 0], another at [a/2, 0, a/2] and another at [0, a/2, a/2]. Thus the points of the fcc lattice are written as

Rcorner = [an1,an2,an3] (11.5)

Rface−xy = [an1,an2,an3] + [a/2,a/2, 0]

Rface−xz = [an1,an2,an3] + [a/2, 0,a/2]

Rface−yz = [an1, an2, an3] + [0, a/2, a/2]

Again, this expresses the points of the lattice as if they were four different types of points, but they only look inequivalent because we have chosen a conventional unit cell with four lattice points in it. Again we can check that this set of points forms a lattice. In terms of our first definition of a lattice (Definition 11.1.1) we write the primitive lattice vectors of the fcc lattice as

a1 = [a/2, a/2, 0]

a2 = [a/2, 0, a/2]

a3 = [0, a/2, a/2]

Again it is easy to check that any combination

R = n1a1 + n2a2 + n3a3

with n1, n2 and n3 integers gives a point within our definition of the fcc lattice (that the three coordinates are either all integer, or two of the three are half-odd-integer and the remaining one integer, in units of the lattice constant a). We can also similarly check that our description of the fcc lattice satisfies our other two definitions (11.1.1.1 and 11.1.1.2) of a lattice.11 The Wigner-Seitz unit cell for the fcc lattice is shown in Figure 11.13.
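This check can also be done by brute force. Here is a small illustration of my own (not from the notes): form all integer combinations of the fcc primitive vectors over a finite range and confirm each lands on a point of the type just described.

```python
# My own quick check (not from the notes) that integer combinations of the fcc
# primitive lattice vectors give fcc lattice points: coordinates either all
# integer, or two half-odd-integers and one integer (in units of a).
import itertools

a1 = (0.5, 0.5, 0.0)  # fcc primitive vectors, in units of the lattice constant a
a2 = (0.5, 0.0, 0.5)
a3 = (0.0, 0.5, 0.5)

def is_fcc_point(r, tol=1e-9):
    # classify each coordinate as integer or half-odd-integer
    ints = sum(abs(c - round(c)) < tol for c in r)
    halves = sum(abs(2 * c - round(2 * c)) < tol and abs(c - round(c)) > tol for c in r)
    return ints == 3 or (halves == 2 and ints == 1)

ok = True
for n1, n2, n3 in itertools.product(range(-3, 4), repeat=3):
    # componentwise n1*a1 + n2*a2 + n3*a3
    r = tuple(n1 * u + n2 * v + n3 * w for u, v, w in zip(a1, a2, a3))
    ok = ok and is_fcc_point(r)
print(ok)
```

Every combination passes: the coordinates are (n1+n2)/2, (n1+n3)/2, (n2+n3)/2, which are either all integers or exactly two half-odd-integers depending on the parities of the ni.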

Other Lattices in Three Dimensions

In addition to the simple cubic, orthorhombic, tetragonal, fcc, and bcc lattices, there are nine other types of lattices in three dimensions. These are known as the fourteen Bravais lattice types.12 You are not responsible for knowing these! But it is probably a good idea to know that they exist.

11 Can you figure out the coordination number of the fcc lattice? Find the minimum distance between two lattice points, then find out how many lattice points are this distance away from any one given lattice point. (It would be too easy if I told you the answer!)

12 Named after Auguste Bravais who classified all the three dimensional lattices in 1848. Actually they should be named after Moritz Frankenheim who studied the same thing over ten years earlier — although he made a minor error in his studies, and therefore missed getting his name associated with them.

Figure 11.15: Unit cells for all of the three dimensional Bravais lattice types.

Figure 11.15 shows the full variety of Bravais lattice types in three dimensions. While it is an extremely deep fact that there are only 14 lattice types in three dimensions, the precise statement of this theorem, as well as the proof of it, are beyond the scope of this course. The key result is that any crystal, no matter how complicated, has a lattice which is one of these 14 types.13

13 There is a real subtlety here in classifying a crystal as having a particular lattice type. There are only these 14 lattice types, but in principle a crystal could have one lattice, but have the symmetry of another lattice. An example of this would be if a lattice were cubic, but the unit cell did not look the same from all six sides. Crystallographers would not classify this as being a cubic material even if the lattice happened to be cubic. The reason for this is that if the unit cell did not look the same from all six sides, there would be no particular reason that the three primitive lattice vectors should have the same length — it would be an insane coincidence were this to happen, and almost certainly in any real material the primitive lattice vector lengths would actually have slightly different values if measured more closely.

Real Crystals

Once we have discussed lattices we can combine a lattice with a basis to describe any periodic structure — and in particular, we can describe any crystalline structure.

Several examples of real (and reasonably simple) crystal structures are shown in Fig. 11.16.

11.3 Summary of Crystal Structure

This chapter introduced a plethora of new definitions, aimed at describing crystal structure in three dimensions. Here is a list of some of the concepts that one should know:

• Definition of a lattice (in three different ways; see definitions 11.1.1, 11.1.1.1, 11.1.1.2)

• Definition of a unit cell for a periodic structure, and definition of a primitive unit cell and a conventional unit cell

• Definition and construction of the Wigner-Seitz (primitive) unit cell

• One can write any periodic structure in terms of a lattice and a basis (see examples in Fig. 11.16)

• In 3d, know the simple cubic lattice, the fcc lattice and the bcc lattice

• The fcc and bcc lattices can be thought of as simple cubic lattices with a basis

• Know how to read a plan view of a structure

References

All books cover this. Some books give way too much detail for us. I recommend the following as giving not too much and not too little:

• Kittel, chapter 1

• Ashcroft and Mermin, chapter 4 (beware the nomenclature issue; see footnote 1 of this chapter)


Figure 11.16: Some examples of real crystals with simple structures. Note that in all cases the basis is described with respect to the primitive unit cell of a simple cubic lattice.

Chapter 12

Reciprocal Lattice, Brillouin Zone, Waves in Crystals

In the last chapter we explored lattices and crystal structure. However as we saw in chapters 8–10, the important physics of waves in solids (whether they be vibrational waves, or electron waves) is best described in reciprocal space. This chapter thus introduces reciprocal space in 3 dimensions. As with the previous chapter, there is some tricky geometry in this chapter, and a few definitions to learn as well. This makes this material a bit tough to slog through, but stick with it because soon we will make substantial use of what we learn here. At the end of this chapter we will finally have enough definitions to describe the dispersions of phonons and electrons in three dimensional systems.

12.1 The Reciprocal Lattice in Three Dimensions

12.1.1 Review of One Dimension

Let us first recall some results from our study of one dimension. We consider a simple lattice in one dimension, Rn = na with n an integer. Recall that two points in k-space (reciprocal space) were defined to be equivalent to each other if k1 = k2 + Gm where Gm = 2πm/a with m an integer. The points Gm form the reciprocal lattice. Recall that the reason we identified different k values was because we were considering waves of the form e^{ikxn} = e^{ikna} with n an integer. Because of this form of the wave, we find that shifting k → k + Gm leaves this functional form unchanged, since

e^{i(k+Gm)xn} = e^{i(k+Gm)na} = e^{ikna} e^{i(2πm/a)na} = e^{ikxn}

where we have used e^{i2πmn} = 1 in the last step. Thus, so far as the wave is concerned, k is the same as k + Gm.
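This equivalence is easy to see numerically. The following is my own small illustration (not part of the notes): a plane wave sampled only at the lattice sites xn = na cannot distinguish k from k + Gm.

```python
# Numerical illustration (mine, not from the notes): on the lattice sites
# x_n = n*a, the waves e^{i k x_n} and e^{i (k + G_m) x_n} are identical,
# where G_m = 2*pi*m/a is any reciprocal lattice point.
import cmath

a = 1.7  # arbitrary lattice constant
k = 0.3  # arbitrary wavevector
m = 5    # arbitrary integer labelling the reciprocal lattice point
G = 2 * cmath.pi * m / a

for n in range(-4, 5):                    # a few lattice sites
    w1 = cmath.exp(1j * k * n * a)        # e^{i k x_n}
    w2 = cmath.exp(1j * (k + G) * n * a)  # e^{i (k + G_m) x_n}
    assert abs(w1 - w2) < 1e-9            # identical at every lattice site
print("k and k + G_m give the same wave on the lattice")
```

Between the lattice sites the two waves do differ; it is only on the discrete lattice that they are indistinguishable.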

115 116 CHAPTER 12. RECIPROCAL LATTICE, BRILLOUIN ZONE, WAVES IN CRYSTALS

12.1.2 Reciprocal Lattice Definition

Generalizing the above result from one dimension, we make the following definition:

Definition 12.1.1. Given a (direct) lattice of points R, a point G is a point in the reciprocal lattice if and only if

e^{iG·R} = 1 (12.1)

for all points R of the direct lattice.

To construct the reciprocal lattice, let us write the points of the direct lattice in the form (here we specialize to the three dimensional case)

R = n1a1 + n2a2 + n3a3 (12.2)

with n1, n2 and n3 integers, and with a1, a2, and a3 being primitive lattice vectors of the direct lattice. We now make two key claims:

1. We claim that the reciprocal lattice (defined by Eq. 12.1) is a lattice in reciprocal space (thus explaining its name).

2. We claim that the primitive lattice vectors of the reciprocal lattice (which we will call b1, b2, and b3) are defined to have the following property:

ai · bj = 2π δij (12.3)

where δij is the Kronecker delta.1

We can certainly construct vectors bi to have the desired property of Eq. 12.3, as follows:

b1 = 2π (a2 × a3) / [a1 · (a2 × a3)]

b2 = 2π (a3 × a1) / [a1 · (a2 × a3)]

b3 = 2π (a1 × a2) / [a1 · (a2 × a3)]

It is easy to check that Eq. 12.3 is satisfied. For example,

a1 · b1 = 2π [a1 · (a2 × a3)] / [a1 · (a2 × a3)] = 2π

a2 · b1 = 2π [a2 · (a2 × a3)] / [a1 · (a2 × a3)] = 0

Now, given vectors b1, b2, and b3 satisfying Eq. 12.3 we have claimed that these are in fact primitive lattice vectors for the reciprocal lattice.
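The construction can be checked numerically for any set of primitive vectors. Below is a sketch of my own (not from the notes) using an arbitrary skewed (triclinic-looking) set of primitive vectors, so that the orthogonality relation Eq. 12.3 is a non-trivial check.

```python
# My own check (not from the notes): build b1, b2, b3 from the formulas above and
# verify a_i . b_j = 2*pi*delta_ij for a deliberately skewed set of primitive vectors.
import math

def cross(u, v):
    return (u[1]*v[2] - u[2]*v[1], u[2]*v[0] - u[0]*v[2], u[0]*v[1] - u[1]*v[0])

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

# arbitrary (made-up) primitive lattice vectors of some triclinic direct lattice
a1, a2, a3 = (1.0, 0.0, 0.0), (0.3, 1.1, 0.0), (-0.2, 0.4, 0.9)

vol = dot(a1, cross(a2, a3))  # unit cell volume a1 . (a2 x a3)
b1 = tuple(2 * math.pi * c / vol for c in cross(a2, a3))
b2 = tuple(2 * math.pi * c / vol for c in cross(a3, a1))
b3 = tuple(2 * math.pi * c / vol for c in cross(a1, a2))

for i, ai in enumerate((a1, a2, a3)):
    for j, bj in enumerate((b1, b2, b3)):
        expected = 2 * math.pi if i == j else 0.0
        assert abs(dot(ai, bj) - expected) < 1e-9
print("a_i . b_j = 2 pi delta_ij verified")
```

The off-diagonal products vanish because each bi is built from a cross product of the other two direct lattice vectors, and is therefore perpendicular to both of them.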

1 Leopold Kronecker was a mathematician who is famous (among other things) for the sentence "God made the integers, everything else is the work of man". In case you don't already know this, the Kronecker delta is defined as δij = 1 for i = j and zero otherwise. (Kronecker did a lot of other interesting things as well.)

Let us write an arbitrary point in reciprocal space as

G = m1b1 + m2b2 + m3b3 (12.4)

and for the moment let us not require m1, m2 and m3 to be integers. (We are about to discover that for G to be a point of the reciprocal lattice they must be integers, but this is what we want to prove!) To find points of the reciprocal lattice we must show that Eq. 12.1 is satisfied for all points R = n1a1 + n2a2 + n3a3 of the direct lattice with n1, n2 and n3 integers. We thus write

e^{iG·R} = e^{i(m1b1+m2b2+m3b3)·(n1a1+n2a2+n3a3)} = e^{2πi(n1m1+n2m2+n3m3)}

In order for G to be a point of the reciprocal lattice, this must equal unity for all points R of the direct lattice, i.e., for all integer values of n1, n2 and n3. Clearly this can only be true if m1, m2 and m3 are also integers. Thus, we find that the points of the reciprocal lattice are precisely those of the form of Eq. 12.4 with m1, m2 and m3 integers. This further proves our claim that the reciprocal lattice is in fact a lattice!

12.1.3 The Reciprocal Lattice as a Fourier Transform

Quite generally one can think of the reciprocal lattice as being a Fourier transform of the direct lattice. It is easiest to start by thinking in one dimension. Here the direct lattice is given again by Rn = an. If we think of the "density" of lattice points in one dimension, we might put a delta function of density at each of these lattice points, so we write the density as2

ρ(r) = Σ_n δ(r − an)

Fourier transforming this function gives3

F[ρ(r)] = ∫ dr e^{ikr} ρ(r) = ∫ dr e^{ikr} Σ_n δ(r − an) = Σ_n e^{ikan} = 2π Σ_m δ(k − 2πm/a)

The last step here is a bit nontrivial.4 Here e^{ikan} is clearly unity if k = 2πm/a, i.e., if k is a point on the reciprocal lattice. In this case, each term of the sum contributes unity to the sum and one obtains an infinite result. If k is not such a reciprocal lattice point, then the terms of the sum oscillate and the sum comes out to be zero. This principle generalizes to the higher (two and three) dimensional cases. Generally

F[ρ(r)] = Σ_R e^{ik·R} = (2π)^D Σ_G δ^D(k − G) (12.5)

where in the middle term the sum is over lattice points R of the direct lattice, and in the last term it is a sum over points G of the reciprocal lattice. Here D is the number of dimensions (1, 2 or 3) and δ^D is a D-dimensional delta function5. This equality is similar to that explained above. As above, if k is a point of the reciprocal lattice, then e^{ik·R} is always unity and the sum is infinite (a delta function). However, if k is not a point on the reciprocal lattice then the summands oscillate, and the sum comes out to be zero. Thus one obtains delta function peaks precisely at the positions of reciprocal lattice vectors.

Aside: It is an easy exercise to show6 that the reciprocal lattice of an fcc direct lattice is a bcc lattice in reciprocal space. Conversely, the reciprocal lattice of a bcc direct lattice is an fcc lattice in reciprocal space.

2 Since the sums are over all lattice points they should go from −∞ to +∞. Alternately, one uses periodic boundary conditions and sums over all points.

3 With Fourier transforms there are many different conventions about where one puts the factors of 2π. Probably in your mathematics class you learned to put 1/√2π with each k integral and with each r integral. However, in solid state physics conventionally 1/(2π) comes with each k integral, and no factor of 2π comes with each r integral. See section 2.2.1 to see why this is used.

4 This is sometimes known as the Poisson resummation formula, after Siméon Denis Poisson, the same guy after whom Poisson's equation ∇²φ = −ρ/ε0 is named, as well as other mathematical things such as the Poisson random distribution. His last name means "fish" in French.
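Footnote 6 invites the reader to try showing that the reciprocal lattice of an fcc lattice is bcc. Here is one way to see it numerically, a sketch of my own (not part of the notes): compute the reciprocal primitive vectors of the fcc lattice and observe that they are primitive vectors of a bcc lattice.

```python
# My own sketch (not from the notes) of footnote 6's exercise: the reciprocal
# primitive vectors of an fcc lattice (lattice constant a) come out to be
# (2*pi/a)*(1,1,-1) and permutations, i.e., bcc primitive vectors of a cube
# of side 4*pi/a in reciprocal space.
import math

def cross(u, v):
    return (u[1]*v[2] - u[2]*v[1], u[2]*v[0] - u[0]*v[2], u[0]*v[1] - u[1]*v[0])

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

a = 1.0
f = a / 2
a1, a2, a3 = (f, f, 0.0), (f, 0.0, f), (0.0, f, f)  # fcc primitive vectors

vol = dot(a1, cross(a2, a3))
b1 = tuple(2 * math.pi * c / vol for c in cross(a2, a3))
b2 = tuple(2 * math.pi * c / vol for c in cross(a3, a1))
b3 = tuple(2 * math.pi * c / vol for c in cross(a1, a2))
print(b1, b2, b3)  # each component is +/- 2*pi/a: bcc-type vectors
```

Each bi has components ±2π/a, i.e., it reaches the body center of a cube of side 4π/a, exactly the form of the bcc primitive vectors of chapter 11 (with a replaced by 4π/a). Running the same calculation starting from the bcc primitive vectors gives fcc-type reciprocal vectors.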

Fourier Transform of Any Periodic Function

In the above section we considered the Fourier transform of a function ρ(r) which is just a set of delta functions at lattice points. However, it is not too different to consider the Fourier transform of any function with the periodicity of the lattice (and this will be quite important below in chapter 13). We say a function ρ(r) has the periodicity of the lattice if ρ(r) = ρ(r + R) for any lattice vector R. We then want to calculate

F[ρ(r)] = ∫ dr e^{ik·r} ρ(r)

F[ρ(r)] = Σ_R ∫_{unit cell} dx e^{ik·(x+R)} ρ(x + R) = [Σ_R e^{ik·R}] ∫_{unit cell} dx e^{ik·x} ρ(x)

where we have used the invariance of ρ under lattice translations x → x + R. The first factor, as in Eq. 12.5, just gives a sum of delta functions, yielding

F[ρ(r)] = (2π)^D Σ_G δ^D(k − G) S(k)

where

S(k) = ∫_{unit cell} dx e^{ik·x} ρ(x) (12.6)

is known as the structure factor and will become very important in the next chapter.
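For a density made of delta functions at basis positions xj inside the unit cell, Eq. 12.6 reduces to a simple sum, S(k) = Σ_j e^{ik·xj}. The following is an illustrative sketch of my own (the two-atom basis is a made-up example, not from the notes):

```python
# Illustration (my own, not from the notes): S(k) for a 1D lattice of constant a
# with a hypothetical two-atom basis at x = 0 and x = a/2. For delta-function
# atoms, Eq. 12.6 becomes S(k) = sum over basis positions of e^{i k x_j}.
import cmath

a = 1.0
basis = [0.0, a / 2]  # hypothetical basis positions within the unit cell

def S(k):
    return sum(cmath.exp(1j * k * x) for x in basis)

for m in range(4):
    G = 2 * cmath.pi * m / a  # reciprocal lattice points G_m = 2*pi*m/a
    print(m, abs(S(G)))       # close to 2 for even m, close to 0 for odd m
```

At even reciprocal lattice points the two atoms scatter in phase (|S| = 2), while at odd ones they cancel (|S| = 0), a preview of the systematic absences discussed in the scattering chapter.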

12.1.4 Reciprocal Lattice Points as Families of Lattice Planes

Another way to understand the reciprocal lattice is via families of lattice planes of the direct lattice.

Definition 12.1.2. A lattice plane (or crystal plane) is a plane containing at least three noncollinear (and therefore an infinite number of) points of a lattice.

Definition 12.1.3. A family of lattice planes is an infinite set of equally separated lattice planes which taken together contain all points of the lattice.

5 For example, in two dimensions δ²(r − r0) = δ(x − x0) δ(y − y0) where r = (x, y).

6 Try it!

In Figure 12.1, two examples of families of lattice planes are shown. Note that the planes are parallel and equally spaced, and every point of the lattice is included in exactly one lattice plane.

Figure 12.1: Two Examples of Families of Lattice planes on the Cubic Lattice. Each of these planes is a crystal plane because it intersects an infinite number of lattice points. The left example is (100) and the right example is (111) in the Miller index notation.

I now make the following claim:

Claim: The families of lattice planes are in one-to-one correspondence7 with the possible directions of reciprocal lattice vectors, to which they are normal. Further, the spacing between these lattice planes is d = 2π/|Gmin| where Gmin is the minimum-length reciprocal lattice vector in this normal direction.

This correspondence is made as follows. First we consider the set of planes defined by points r such that

G · r = 2πm (12.7)

This defines an infinite set of parallel planes normal to G. Since e^{iG·R} = 1 for every lattice point R, we know that every lattice point is a member of one of these planes (since this is the definition of G in Eq. 12.1). However, for the planes defined by Eq. 12.7, not every plane needs to contain a lattice point (so generically this is a family of parallel equally spaced planes, but not a family of lattice planes). For this larger family of planes, the spacing between planes is given by

d = 2π/|G| (12.8)

7 For this one-to-one correspondence to be precisely true we must define G and −G to be the same direction. If this sounds like a cheap excuse, we can say that "oriented" families of lattice planes are in one-to-one correspondence with the directions of reciprocal lattice vectors, thus keeping track of the two possible normals of the family of lattice planes.

To prove this we simply note that two adjacent planes must have

G · (r1 − r2) = 2π

Thus in the direction parallel to G, the spacing between planes is 2π/|G| as claimed.

Clearly different values of G that happen to point in the same direction, but have different magnitudes, will define parallel sets of planes. As we increase the magnitude of G, we add more and more planes. For example, examining Eq. 12.7 we see that when we double the magnitude of G we correspondingly double the density of planes, as we can see from the spacing formula Eq. 12.8. However, whichever G we choose, all of the lattice points will be included in one of the defined planes. If we choose the maximally spaced planes, hence the smallest possible value of G allowed in any given direction, which we call Gmin, then in fact every defined plane will include lattice points and therefore be a lattice plane, and the spacing between these planes is correspondingly 2π/|Gmin|.8 This proves our above claim.

12.1.5 Lattice Planes and Miller Indices

It is convenient to define a notation for describing lattice planes. The conventional notations are known as Miller indices.9 One writes (h,k,l) or (hkl), with integers h, k and l, to mean the family of lattice planes corresponding to the reciprocal lattice vector

G(h,k,l) = h b1 + k b2 + l b3 (12.9)

where the bi are the standard primitive lattice vectors of the reciprocal lattice.10 Note that (h,k,l), as a family of lattice planes, should correspond to the shortest reciprocal lattice vector in that direction, meaning that the integers h, k and l should have no common divisor. One may also write (h,k,l) where h, k and l do have a common divisor, but then one is talking about a reciprocal lattice vector, or a family of planes that is not a family of lattice planes (i.e., there are some planes that do not intersect lattice points).

Important Complication: For fcc and bcc lattices, Miller indices are usually stated using the primitive lattice vectors of the simple cubic lattice in Eq. 12.9, rather than the primitive lattice vectors of the fcc or bcc lattice. This comment is quite important. For example, the (100) family of planes for the cubic lattice (shown on the left of Fig. 12.1) intersects every corner of the cubic unit cell. However, if we were discussing a bcc lattice, there would also be another lattice point in the center of every conventional unit cell which the (100) lattice planes would not intersect. However, the (200) planes would intersect these central points as well, so in this case (200) represents a true family of lattice planes for the bcc lattice whereas (100) does not!

8 More rigorously, if there is a family of lattice planes in direction Ĝ with spacing between planes d, then G = 2πĜ/d is necessarily a reciprocal lattice vector. To see this, note that e^{iG·R} will be unity for all lattice points R lying in these planes. Further, in a family of lattice planes, all lattice points are included within the planes, so e^{iG·R} = 1 for all lattice points R, which implies G is a reciprocal lattice vector. Furthermore, G is the shortest reciprocal lattice vector in the direction Ĝ, since increasing G will result in a smaller spacing of planes and some planes will not intersect lattice points R.

9 These are named after the 19th century mineralogist William Hallowes Miller. It is interesting that the structure of lattice planes was understood long before the world was even certain there was such a thing as an atom.

10 We have already used the corresponding notation [uvw] to represent lattice points of the direct lattice. See for example, Eq. 11.1 and Eq. 11.3.


Figure 12.2: Determining Miller Indices From the Intersection of a Plane with the Coordinate Axes. The spacing between lattice planes in this family would be given by 1/|d(233)|² = 2²/a² + 3²/b² + 3²/c².

From Eq. 12.8 one can write the spacing between a family of planes specified by Miller indices (h,k,l) as

d(hkl) = 2π/|G| = 2π / √(h²|b1|² + k²|b2|² + l²|b3|²) (12.10)

where we have assumed that the coordinate axes of the primitive lattice vectors bi are orthogonal. Recall that in the case of orthogonal axes |bi| = 2π/|ai| where the ai are the lattice constants in the three orthogonal directions. Thus we can equivalently write

1/|d(hkl)|² = h²/a1² + k²/a2² + l²/a3² (12.11)

Note that for a cubic lattice this simplifies to

d(hkl)^cubic = a / √(h² + k² + l²) (12.12)

A useful shortcut for figuring out the geometry of lattice planes is to look at the intersection of a plane with the three coordinate axes. The intersections x1, x2, x3 with the three coordinate axes (in units of the three principal lattice constants) are related to the Miller indices via

a1/x1 : a2/x2 : a3/x3 = h : k : l

This construction is illustrated in Fig. 12.2. In Fig. 12.3 we show three more examples of Miller indices.
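The spacing formulas are simple enough to code up directly. Here is a small helper of my own (the function name is mine, not from the notes) implementing Eq. 12.11 for orthogonal axes, checked against the cubic special case Eq. 12.12.

```python
# Hedged sketch (mine, not from the notes) of Eq. 12.11: spacing of the (hkl)
# family of planes for an orthorhombic lattice with orthogonal lattice
# constants a1, a2, a3.
import math

def d_hkl(h, k, l, a1, a2, a3):
    """Plane spacing for Miller indices (hkl), assuming orthogonal axes."""
    return 1.0 / math.sqrt((h / a1) ** 2 + (k / a2) ** 2 + (l / a3) ** 2)

a = 2.0  # cubic lattice constant (arbitrary units)
print(d_hkl(1, 0, 0, a, a, a))  # = a, the (100) spacing of a cubic lattice
print(d_hkl(1, 1, 1, a, a, a))  # = a / sqrt(3), cf. Eq. 12.12
```

Note that for the cubic case the result reduces to a/√(h² + k² + l²) as in Eq. 12.12, and that (200) gives half the spacing of (100), matching the discussion of plane density above.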

Figure 12.3: More Examples of Miller Indices.

Note that Miller indices can be negative if the planes intersect the negative axes. We could have, for example, a lattice plane (1,-1,1). Conventionally, the minus sign is denoted with an over-bar rather than a minus sign, so we write (11̄1) instead.11 Finally, we note that different lattice planes may be the same under a symmetry of the crystal. For example, in a cubic lattice, (111) looks the same as (11̄1) after rotation (and possibly reflection) of the axes of the crystal (but would never look like (122) under any rotation or reflection, since the spacing between planes is different!). If we want to describe all lattice planes that are equivalent in this way, we write {111} instead.

It is interesting that lattice planes in crystals were well understood long before people even knew for sure there was such a thing as atoms. By studying how crystals cleave along certain planes, scientists like Miller and Bravais could reconstruct a great deal about how these materials must be assembled.12

11 How (11̄1) is pronounced is a bit random. Some people say "one-(bar-one)-one" and others say "one-(one-bar)-one". I have no idea how the community got so confused as to have these two different conventions. I think in Europe the former is more prevalent whereas in America the latter is more prevalent. At any rate, it is always clear when it is written.

12.2 Brillouin Zones

The whole point of going into such gross detail about the structure of reciprocal space is in order to describe waves in solids. In particular, it will be important to understand the structure of the Brillouin zone.

12.2.1 Review of One Dimensional Dispersions and Brillouin Zones

As we learned in chapters 8–10, the Brillouin zone is extremely important in describing the excitation spectrum of waves in periodic media. As a reminder, in Fig. 12.4 we show the excitation spectrum of vibrations of a diatomic chain (chapter 9) in both the reduced and extended zone schemes. Since waves are physically equivalent under shifts of the wavevector k by a reciprocal lattice vector 2π/a, we can always express every excitation within the first Brillouin zone, as shown in the reduced zone scheme (left of Fig. 12.4). In this example, since there are two atoms per unit cell, there are precisely two excitation modes per wavevector. On the other hand, we can always unfold the spectrum and put the lowest (acoustic) excitation mode in the first Brillouin zone and the higher energy (optical) excitation mode in the second Brillouin zone, as shown in the extended zone scheme (right of Fig. 12.4). Note that there is a jump in the excitation spectrum at the Brillouin zone boundary.


Figure 12.4: Phonon Spectrum of a Diatomic Chain in One Dimension. Left: Reduced Zone scheme. Right: Extended Zone scheme. (See Figs. 9.1 and 9.2)

12 There is a law known as "Bravais' Law" which states that crystals cleave most readily along faces having the highest density of lattice points. In modern language this is essentially equivalent to stating that the fewest atomic bonds should be broken in the cleave. Can you see why this is?

12.2.2 General Brillouin Zone Construction

Definition 12.2.1. A Brillouin zone is a unit cell of the reciprocal lattice.

Entirely equivalent to the one dimensional situation, physical waves in crystals are unchanged if their wavevector is shifted by a reciprocal lattice vector, k → k + G. Alternately, we realize that the physically relevant quantity is the crystal momentum. Thus, the Brillouin zone has been defined to include each physically different crystal momentum exactly once (each k point within the Brillouin zone is physically different, and all physically different points occur once within the zone). While the most general definition of Brillouin zone allows us to choose any shape unit cell for the reciprocal lattice, there are some definitions of unit cells which are more convenient than others. We define the first Brillouin zone in reciprocal space quite analogously to the construction of the Wigner-Seitz cell for the direct lattice.

Definition 12.2.2. Start with the reciprocal lattice point G = 0. All k points which are closer to 0 than to any other reciprocal lattice point define the first Brillouin zone. Similarly all points where the point 0 is the second closest reciprocal lattice point to that point constitute the second Brillouin zone, and so forth. Zone boundaries are defined in terms of this definition of Brillouin zones.

As with the Wigner-Seitz cell, there is a simple algorithm to construct the Brillouin zones. Draw the perpendicular bisector between the point 0 and each of the reciprocal lattice points. These bisectors form the Brillouin zone boundaries. Any point that you can get to from 0 without crossing a perpendicular bisector is in the first Brillouin zone. If you cross only one perpendicular bisector, you are in the second Brillouin zone, and so forth. In figure 12.5, we show the Brillouin zones of the square lattice. A few general principles to note:

1. The first Brillouin zone is necessarily connected, but the higher Brillouin zones typically are made of disconnected pieces.

2. A point on a Brillouin zone boundary lies on the perpendicular bisector between the point 0 and some reciprocal lattice point G. Adding the vector −G to this point necessarily results in a point (the same distance from 0) which is on another Brillouin zone boundary (on the bisector of the segment from 0 to −G). This means that Brillouin zone boundaries occur in parallel pairs, symmetric around the point 0, which are separated by a reciprocal lattice vector (see Fig. 12.5).

3. Each Brillouin zone has exactly the same total area (or volume in three dimensions). This must be the case since there is a one-to-one mapping of points in each Brillouin zone to the first Brillouin zone. Finally, as in 1d, we claim that there are exactly as many k-states within the first Brillouin zone as there are unit cells in the entire system.13

Note that, as in the case of the Wigner-Seitz cell construction, the shape of the first Brillouin zone can look a bit strange, even for a relatively simple lattice (see Fig. 11.7).
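Definition 12.2.2 translates directly into a little algorithm: a point k is in the n-th Brillouin zone when 0 is its n-th closest reciprocal lattice point. Here is a sketch of my own (not from the notes) for the square lattice:

```python
# My own illustration of Definition 12.2.2 (not from the notes): find which
# Brillouin zone a point k lies in, for a square reciprocal lattice of spacing
# g = 2*pi/a, by counting how many reciprocal lattice points are closer to k
# than the point G = 0 is.
import itertools
import math

a = 1.0
g = 2 * math.pi / a  # reciprocal lattice spacing

def zone_index(kx, ky, n_max=6):
    d0 = math.hypot(kx, ky)  # distance from k to G = 0
    closer = sum(
        1
        for m1, m2 in itertools.product(range(-n_max, n_max + 1), repeat=2)
        if (m1, m2) != (0, 0) and math.hypot(kx - m1 * g, ky - m2 * g) < d0 - 1e-12
    )
    # if `closer` points beat 0, then 0 is the (closer + 1)-th closest point
    return closer + 1

print(zone_index(0.1, 0.2))      # deep inside the first zone
print(zone_index(0.6 * g, 0.0))  # just past the first perpendicular bisector
```

Crossing the bisector at kx = g/2 is exactly where one more lattice point becomes closer than 0, so the zone index increments by one there, reproducing the counting-of-crossed-bisectors rule above.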

13 Here’s the proof of this statement for a square lattice. Let the system be Nx by Ny unit cells. Then, with 12.3. ELECTRONIC AND VIBRATIONAL WAVES IN CRYSTALS IN THREE DIMENSIONS125

Figure 12.5: First, second, third, fourth, ... Brillioun zones of the square lattice. Note that zone boundaries occur in parallel pairs symmetric around 0 and separated by a reciprocal lattice vector.

The construction of the Brillouin zone is similar in three dimensions as it is in two, and is again entirely analogous to the construction of the Wigner-Seitz cell in three dimensions. For a simple cubic lattice, the first Brillouin zone is simply a cube. For fcc and bcc lattices, however, the situation is more complicated. As we mentioned above in the Aside at the end of section 12.1.3 above, the reciprocal lattice of the fcc lattice is bcc, and vice-versa. Thus, the Brillouin zone of the fcc lattice is the same shape as the the Wigner-Seitz cell of the bcc lattice! The Brillouin zone for the fcc lattice is shown in Fig. 12.6 (compare to Fig. 11.13). Note that in Fig. 12.6, various k-points are labeled with letters. There is a complicated labeling convention that we will not discuss in this course, but it is worth knowing that it exists. For example, we can see in the figure that the point k = 0 is labeled Γ, and the point k = (π/a)ˆy is labeled X. Given this diagram of this Brillouin zone we can finally arrive at some real physics!

12.3 Electronic and Vibrational Waves in Crystals in Three Dimensions

In the left of Fig. 12.7 we show the electronic band-structure (i.e., dispersion relation) of diamond, which is an fcc lattice with a diatomic basis (See Fig. 11.16). As in the one-dimensional case, we can choose to work in the reduced zone scheme where we only need to consider the first Brillouin zone. Since we are trying to display a three dimensional spectrum (Energy as a function of k) on a one dimensional diagram, what is done is to show several single-line cuts through reciprocal

periodic boundary conditions, the value of kx is quantized in units of 2π/Lx = 2π/(Nxa) and the value of ky is quantized in units of 2π/Ly = 2π/(Nya). But the size of the Brillouin zone is 2π/a in each direction, thus there are precisely NxNy different values of k in the Brillouin zone. 126 CHAPTER 12. RECIPROCAL LATTICE, BRILLOUIN ZONE, WAVES IN CRYSTALS

Figure 12.6: First Brillouin Zone of the FCC Lattice. Note that it is the same shape as the Wigner- Seitz cell of the bcc lattice, see Fig. 11.13. Various special points of the Brillioun zone are labeled with code letters such as X, K, and Γ.

space14. Starting on the left of the diagram, we start at the L-point of the Brillouin zone and show E(k) as k traces a straight line to the Γ point (the center of the Brillouin zone). Then we continue to the right and k traces a straight line from the Γ point to the X point. Note that the lowest band is quadratic at the center of the Brillouin zone (a dispersion ℏ²k²/(2m*) for some effective mass m*). Similarly, in the right of Fig. 12.7, we show the phonon spectrum of diamond. There are several things to note about this figure. First of all, since diamond has a unit cell with two atoms in it (it is fcc with a basis of two atoms) there should be six modes of oscillation per k-point (three directions of motion times two atoms per unit cell). Indeed, this is what we see in the picture, at least in the central third of the picture. In the other two parts of the picture, one sees fewer modes per k-point, but this is because, due to the symmetry of the crystal along this particular direction, several excitation modes have exactly the same energy (you can see, for example, that at the X-point two modes come in from the right, but only one goes out to the left; this means the two modes have the same energy on the left of the X-point). Secondly, we note that at the Γ-point, k = 0, there are exactly three modes which come down linearly to zero energy. These are the three acoustic modes. The other three modes, which have finite energy at k = 0, are the optical modes. Finally, you may note something a bit confusing about this diagram. On the far left of the diagram, we start at the Γ point, move in the (100) direction and end up at the X point. Then from the X point, we move in the (110) direction, and we end up back at the Γ point! This is because we have landed at the Γ point in a different Brillouin zone.

14This type of plot, because it can look like a jumble of lines, is sometimes called a "spaghetti diagram".

Figure 12.7: Dispersions in Diamond. Left: Electronic excitation spectrum of diamond (E = 0 is the Fermi energy). Right: Phonon spectrum of diamond (points are from experiment). In both plots the horizontal axis gives cuts through k-space as labeled in Fig. 12.6 above. (Left figure from W. Saslow, T. K. Bergstresser, and M. L. Cohen, Phys. Rev. Lett. 16, 354 (1966). Right figure from R. D. Turner and J. C. Inkson, J. Phys. C: Solid State Phys. 11 (1978).)

12.4 Summary of Reciprocal Space and Brillouin Zones

• The reciprocal lattice is a lattice in k-space defined by the set of points G such that e^{iG·R} = 1 for all R in the direct lattice. Given this definition, the reciprocal lattice can be thought of as the Fourier transform of the direct lattice.

• A reciprocal lattice vector G defines a set of parallel, equally spaced planes via G·r = 2πm, such that every point of the direct lattice is included in one of the planes. The spacing between the planes is d = 2π/|G|. If G is the smallest reciprocal lattice vector parallel to G then this set of planes is a family of lattice planes, meaning that all planes intersect points of the direct lattice.

• Miller indices (h, k, l) are used to describe families of lattice planes, or reciprocal lattice vectors. For fcc and bcc lattices, one specifies the Miller indices of the associated simple cubic conventional unit cell.

• The general definition of a Brillouin zone is any unit cell in reciprocal space. The first Brillouin zone is the Wigner-Seitz cell around the point 0 of the reciprocal lattice. Each Brillouin zone has the same volume and contains one k-state per unit cell of the entire system. Parallel Brillouin zone boundaries are separated by reciprocal lattice vectors.
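The defining condition e^{iG·R} = 1 can be checked numerically. The sketch below is my own illustration (the fcc primitive vectors are one conventional choice, not something fixed by the summary above): it builds reciprocal primitive vectors b_i satisfying b_i · a_j = 2πδ_ij and verifies the condition on a handful of sample points.

```python
import cmath
import math

def dot(u, v):
    return sum(x*y for x, y in zip(u, v))

def cross(u, v):
    return (u[1]*v[2] - u[2]*v[1],
            u[2]*v[0] - u[0]*v[2],
            u[0]*v[1] - u[1]*v[0])

a_lat = 1.0
# Primitive vectors of an fcc direct lattice (one conventional choice)
a1 = (0.0, a_lat/2, a_lat/2)
a2 = (a_lat/2, 0.0, a_lat/2)
a3 = (a_lat/2, a_lat/2, 0.0)

# Reciprocal primitive vectors, defined so that b_i . a_j = 2*pi*delta_ij
vol = dot(a1, cross(a2, a3))
b1 = tuple(2*math.pi*c/vol for c in cross(a2, a3))
b2 = tuple(2*math.pi*c/vol for c in cross(a3, a1))
b3 = tuple(2*math.pi*c/vol for c in cross(a1, a2))

# e^{iG.R} should equal exactly 1 for every G in the reciprocal lattice
# and every R in the direct lattice
for h, k, l in [(1, 0, 0), (1, 1, 0), (2, -1, 3)]:
    G = tuple(h*p + k*q + l*r for p, q, r in zip(b1, b2, b3))
    for n1, n2, n3 in [(1, 0, 0), (2, 1, -1), (0, 3, 2)]:
        R = tuple(n1*p + n2*q + n3*r for p, q, r in zip(a1, a2, a3))
        assert abs(cmath.exp(1j*dot(G, R)) - 1) < 1e-9
print("e^{iG.R} = 1 verified on sample points")
```

One can also check that the b_i generated this way are the primitive vectors of a bcc lattice, in line with the fcc/bcc duality mentioned above.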

References

For the reciprocal lattice, Miller indices, and Brillouin zones, I recommend:

• Ashcroft and Mermin, chapter 5 (again be warned of the nomenclature issue mentioned above in chapter 11, footnote 1).

Many books introduce X-ray diffraction and the reciprocal lattice at the same time. Once we have read the next chapter and we study scattering, we might go back and look at the nice introductions to reciprocal space given in the following books:

• Goodstein, sections 3.4–3.5 (very brief)
• Kittel, chapter 2
• Ibach and Luth, chapter 3

Part V

Neutron and X-Ray Diffraction


Chapter 13

Wave Scattering by Crystals

In the last chapter we discussed reciprocal space, and explained that the energy dispersion of phonons and electrons is plotted within the Brillouin zone. We understand how these are similar to each other due to the wave-like nature of both the electron and the phonon. However, much of the same physics occurs when a crystal scatters waves (or particles1) that impinge upon it externally. Indeed, exposing a solid to a wave in order to probe its properties is an extremely useful thing to do. The most commonly used probe is X-rays. Another common, more modern, probe is neutrons. It can hardly be overstated how important this type of experiment is to science.

The general setup that we will examine is shown in Fig. 13.1.

Figure 13.1: A generic scattering experiment.

1Remember, in quantum mechanics there is no real difference between particles and waves!


13.1 The Laue and Bragg Conditions

13.1.1 Fermi’s Golden Rule Approach

If we think of the incoming wave as being a particle, then we should think of the sample as being some potential V(r) that the particle experiences as it goes through the sample. According to Fermi's golden rule2, the transition rate Γ(k′, k) per unit time for the particle scattering from k to k′ is given by

$$\Gamma(\mathbf{k}',\mathbf{k}) = \frac{2\pi}{\hbar}\,\left|\langle \mathbf{k}'|V|\mathbf{k}\rangle\right|^2\,\delta(E_{\mathbf{k}'} - E_{\mathbf{k}})$$

The matrix element here,

$$\langle \mathbf{k}'|V|\mathbf{k}\rangle = \int d\mathbf{r}\; \frac{e^{-i\mathbf{k}'\cdot\mathbf{r}}}{\sqrt{L^3}}\, V(\mathbf{r})\, \frac{e^{i\mathbf{k}\cdot\mathbf{r}}}{\sqrt{L^3}} = \frac{1}{L^3}\int d\mathbf{r}\; e^{-i(\mathbf{k}'-\mathbf{k})\cdot\mathbf{r}}\, V(\mathbf{r})$$

is nothing more than the Fourier transform of the potential (where L is the linear size of the sample, so the √L³ factors just normalize the wavefunctions). Note that the above expressions are true whether or not the sample is a periodic crystal. However, if the sample is periodic, the matrix element is zero unless k′ − k is a reciprocal lattice vector! To see that this is true, let us write positions as r = R + x, where R is a lattice vector position and x is a position within the unit cell:

$$\langle \mathbf{k}'|V|\mathbf{k}\rangle = \frac{1}{L^3}\int d\mathbf{r}\; e^{-i(\mathbf{k}'-\mathbf{k})\cdot\mathbf{r}}\, V(\mathbf{r}) = \frac{1}{L^3}\sum_{\mathbf{R}} \int_{\mathrm{unit\ cell}} d\mathbf{x}\; e^{-i(\mathbf{k}'-\mathbf{k})\cdot(\mathbf{x}+\mathbf{R})}\, V(\mathbf{x}+\mathbf{R})$$

Now since the potential is assumed periodic, we have V(x + R) = V(x), so this can be rewritten as

$$\langle \mathbf{k}'|V|\mathbf{k}\rangle = \frac{1}{L^3}\left[\sum_{\mathbf{R}} e^{-i(\mathbf{k}'-\mathbf{k})\cdot\mathbf{R}}\right] \left[\int_{\mathrm{unit\ cell}} d\mathbf{x}\; e^{-i(\mathbf{k}'-\mathbf{k})\cdot\mathbf{x}}\, V(\mathbf{x})\right] \qquad (13.1)$$

As we discussed in section 12.1.3 above, the first term in brackets must vanish unless k′ − k is a reciprocal lattice vector3. This condition,

$$\mathbf{k}' - \mathbf{k} = \mathbf{G} \qquad (13.2)$$

is known as the Laue equation (or Laue condition)4,5. This condition is precisely the statement of the conservation of crystal momentum.6 Note also that when the waves leave the crystal, they

2Fermi's golden rule should be familiar to you from quantum mechanics. Interestingly, Fermi's golden rule was actually discovered by Dirac, giving us yet another example where something is named after Fermi when Dirac really should have credit as well, or even instead. See also footnote 6 in section 4.1.
3Also we discussed that this first term in brackets diverges if k′ − k is a reciprocal lattice vector. This divergence is not a problem here because it gives just the number of unit cells and is canceled by the 1/L³ normalization factor, leaving a factor of the inverse volume of the unit cell.
4Max von Laue won the Nobel prize for his work on X-ray scattering from crystals in 1914. Although von Laue never left Germany during the second world war, he remained openly opposed to the Nazi government. During the war he hid his gold Nobel medal at the Niels Bohr Institute in Denmark to prevent the Nazis from taking it. Had he been caught doing this, he might have been jailed or worse, since shipping gold out of Nazi Germany was considered a serious offense. After the occupation of Denmark in April 1940, George de Hevesy (a Nobel laureate in chemistry) decided to dissolve the medal in the solvent aqua regia to remove the evidence. He left the solution on a shelf in his lab. Although the Nazis occupied Bohr's institute and searched it very carefully, they did not find anything. After the war, the gold was recovered from solution and the Nobel Foundation presented Laue with a new medal made from the same gold.
5The reason this is called the "Laue condition" rather than the "von Laue condition" is that he was born Max Laue. In 1913 his father was elevated to the nobility and his family added the "von".
6Real momentum is conserved since the crystal itself absorbs any missing momentum. In this case, the center of mass of the crystal has absorbed momentum ℏ(k′ − k). See the comment in footnote 9 in section 8.4.

should have

$$|\mathbf{k}| = |\mathbf{k}'|$$

which is just the conservation of energy, enforced by the delta function in Fermi's golden rule. (In section 13.4.2 below we will consider more complicated scattering where energy is not conserved.)
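To see the lattice sum in Eq. 13.1 do its job numerically, here is a small sketch (my own illustration, not from the notes) for a one-dimensional chain of N unit cells: the sum over R is of order N when the momentum transfer is a reciprocal lattice vector, and stays of order one otherwise.

```python
import cmath
import math

a = 1.0          # lattice constant
N = 200          # number of unit cells in the chain

def lattice_sum(q):
    """Sum of e^{-i q R} over lattice points R = n a, n = 0..N-1."""
    return sum(cmath.exp(-1j*q*n*a) for n in range(N))

G = 2*math.pi/a                     # a reciprocal lattice vector
on = abs(lattice_sum(3*G))          # momentum transfer on the reciprocal lattice
off = abs(lattice_sum(3*G + 0.3))   # generic momentum transfer

print(on, off)   # on-condition sum grows like N; off-condition stays O(1)
assert abs(on - N) < 1e-6
assert off < 10
```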

13.1.2 Diffraction Approach

It turns out that this Laue condition is nothing more than the scattering condition associated with a diffraction grating. This description of the scattering from crystals is known as the Bragg Formulation of (x-ray) diffraction7.

Figure 13.2: Bragg Scattering off of a plane of atoms in a crystal.

Consider the configuration shown in Fig. 13.2. An incoming wave is reflected off of two adjacent layers of atoms separated by a distance d. A few things to note about this diagram. First note that the wave has been deflected by 2θ in this diagram8. Secondly, from simple geometry note that the additional distance traveled by the component of the wave that reflects off of the further layer of atoms is

extra distance = 2d sin θ.

In order to have constructive interference, this extra distance must be equal to an integer number of wavelengths. Thus we derive the Bragg condition for constructive interference, or what is known as Bragg’s law

nλ = 2d sin θ    (13.3)
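As a small numerical companion to Eq. 13.3 (the values of λ and d are illustrative, not from the text): solving for θ gives the diffraction angles order by order.

```python
import math

lam = 0.123   # wavelength in nm (illustrative value)
d = 0.270     # plane spacing in nm (illustrative value)

# Bragg's law: n*lam = 2*d*sin(theta) -> solve for theta at each order n
for n in (1, 2, 3, 4):
    s = n*lam/(2*d)
    if s > 1:
        print(f"n={n}: no solution (n*lam > 2d)")
    else:
        theta = math.degrees(math.asin(s))
        print(f"n={n}: theta = {theta:.1f} deg, total deflection 2*theta = {2*theta:.1f} deg")
```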

Note that we can have diffraction from any two parallel planes of atoms such as the one shown here

7William Henry Bragg and William Lawrence Bragg were a father-and-son team who won the Nobel prize together in 1915 for their work on X-ray scattering. William Lawrence Bragg was 25 years old when he won the prize, and remains the youngest Nobel laureate ever.
8This is a very common source of errors on exams. The total deflection angle is 2θ.

(figure: an oblique family of parallel lattice planes with spacing d)

What we will see next is that this Bragg condition for constructive interference is precisely equivalent to the Laue condition described above.

13.1.3 Equivalence of Laue and Bragg conditions

Consider the following picture (essentially the same as Fig. 13.2). Here we have shown the reciprocal lattice vector G which corresponds to the family of lattice planes. As we discussed in chapter 12, the spacing between lattice planes is d = 2π/|G| (see Eqn. 12.8).

(figure: incoming wavevector k and outgoing wavevector k′ each making angle θ with a family of lattice planes of spacing d; G is normal to the planes)

Just from geometry we have

$$\hat{\mathbf{k}}\cdot\hat{\mathbf{G}} = \sin\theta = -\hat{\mathbf{k}}'\cdot\hat{\mathbf{G}}$$

where the hats over the vectors indicate unit vectors. Suppose the Laue condition is satisfied. That is, k − k′ = G with |k| = |k′| = 2π/λ, with λ the wavelength. We can rewrite the Laue equation as

$$\frac{2\pi}{\lambda}\left(\hat{\mathbf{k}} - \hat{\mathbf{k}}'\right) = \mathbf{G}$$

Now let us dot this equation with Ĝ to give

$$\hat{\mathbf{G}}\cdot\frac{2\pi}{\lambda}\left(\hat{\mathbf{k}} - \hat{\mathbf{k}}'\right) = \hat{\mathbf{G}}\cdot\mathbf{G}$$
$$\frac{2\pi}{\lambda}\left(\sin\theta + \sin\theta\right) = |\mathbf{G}|$$
$$\frac{2\pi}{\lambda}\left(2\sin\theta\right) = |\mathbf{G}|$$
$$2d\sin\theta = \lambda$$

which is the Bragg condition (in the last step we have used the relation, Eq. 12.8, between |G| and d). You may wonder why in this equation we got λ on the right-hand side rather than nλ as we had in Eq. 13.3. The point here is that if there is a reciprocal lattice vector G, then there is also a reciprocal lattice vector nG, and if we did the same calculation with that lattice vector we would get nλ. In other words, in the nλ case we are reflecting off of the family of planes with spacing d/n defined by nG, which necessarily also exists when there is a set of lattice planes with spacing d. Thus we conclude that the Laue condition and the Bragg condition are equivalent. It is equivalent to say that interference is constructive (as Bragg indicates) or to say that crystal momentum is conserved (as Laue indicates).
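The equivalence can also be confirmed numerically in a two-dimensional sketch (my own construction, with illustrative numbers): pick λ and d, choose the incidence angle from Bragg's law, apply the Laue condition k′ = k − G, and check that the resulting scattering is elastic.

```python
import math

lam = 0.154                 # incoming wavelength (illustrative)
d = 0.25                    # lattice-plane spacing (illustrative)
G = (0.0, -2*math.pi/d)     # reciprocal lattice vector normal to the planes

kmag = 2*math.pi/lam
sin_theta = lam/(2*d)                       # Bragg: lam = 2 d sin(theta), n = 1
k_in = (kmag*math.cos(math.asin(sin_theta)), -kmag*sin_theta)
k_out = (k_in[0] - G[0], k_in[1] - G[1])    # Laue: k - k' = G  ->  k' = k - G

# elastic scattering: |k'| equals |k| exactly when theta satisfies Bragg
assert abs(math.hypot(*k_out) - kmag) < 1e-9
print("Laue-constructed outgoing wave is elastic: Bragg and Laue agree")
```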

13.2 Scattering Amplitudes

If the Laue condition is satisfied, we would now like to ask how much scattering we actually get. Recall in section 13.1.1 we started with Fermi’s golden rule

$$\Gamma(\mathbf{k}',\mathbf{k}) = \frac{2\pi}{\hbar}\,\left|\langle \mathbf{k}'|V|\mathbf{k}\rangle\right|^2\,\delta(E_{\mathbf{k}'} - E_{\mathbf{k}})$$

and we found out that if V is a periodic function, then the matrix element is given by (see Eq. 13.1)

$$\langle \mathbf{k}'|V|\mathbf{k}\rangle = \frac{1}{L^3}\left[\sum_{\mathbf{R}} e^{-i(\mathbf{k}'-\mathbf{k})\cdot\mathbf{R}}\right] \left[\int_{\mathrm{unit\ cell}} d\mathbf{x}\; e^{-i(\mathbf{k}'-\mathbf{k})\cdot\mathbf{x}}\, V(\mathbf{x})\right] \qquad (13.4)$$

The first factor in brackets gives zero unless the Laue condition is satisfied, in which case it gives a constant (due to the 1/L³ out front, this is now a nondivergent constant). The second term in brackets is known as the structure factor (compare to Eq. 12.6)

$$S(\mathbf{G}) = \int_{\mathrm{unit\ cell}} d\mathbf{x}\; e^{i\mathbf{G}\cdot\mathbf{x}}\, V(\mathbf{x}) \qquad (13.5)$$

where we have used G for (k − k′), since this must be a reciprocal lattice vector or the first term in brackets vanishes. Frequently, one writes

$$I_{(hkl)} \propto \left|S_{(hkl)}\right|^2 \qquad (13.6)$$

which is shorthand for saying that I_(hkl), the intensity of scattering off of the lattice planes defined by the reciprocal lattice vector (hkl), is proportional to the square of the structure factor at this reciprocal lattice vector. Sometimes a delta-function is also written explicitly to indicate that the wavevector difference (k′ − k) must be a reciprocal lattice vector.

We now turn to examine this structure factor more closely for our main two types of scat- tering probes – neutrons9 and x-rays.

Neutrons

Since neutrons are uncharged, they scatter almost exclusively from nuclei (rather than electrons) via the nuclear forces. As a result, the scattering potential is extremely short ranged, and can be approximated as a delta-function. We thus have

$$V(\mathbf{x}) = \sum_{j\,\in\,\mathrm{unit\ cell}} f_j\, \delta(\mathbf{x} - \mathbf{x}_j)$$

where x_j is the position of the j-th atom in the unit cell. Here, f_j is known as the form factor or scattering length, and represents the strength of scattering from that particular nucleus. In fact, for the case of neutrons this quantity is proportional to the so-called "nuclear scattering length" b_j. Thus for neutrons we frequently write

$$V(\mathbf{x}) \sim \sum_{j\,\in\,\mathrm{unit\ cell}} b_j\, \delta(\mathbf{x} - \mathbf{x}_j)$$

Plugging this expression into Eq. 13.5 above, we obtain

$$S(\mathbf{G}) \sim \sum_{j\,\in\,\mathrm{unit\ cell}} b_j\, e^{i\mathbf{G}\cdot\mathbf{x}_j} \qquad (13.7)$$
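Eq. 13.7 is simple to evaluate directly. In the sketch below the scattering lengths are made-up illustrative numbers, and the two-site basis anticipates the CsCl-type examples of the next section.

```python
import cmath
import math

def structure_factor(hkl, basis):
    """S_(hkl) = sum_j b_j exp(2*pi*i (h x_j + k y_j + l z_j)), as in Eq. 13.7."""
    h, k, l = hkl
    return sum(b * cmath.exp(2j*math.pi*(h*x + k*y + l*z))
               for b, (x, y, z) in basis)

# Two-atom basis in units of the cubic lattice vectors; b values are illustrative
basis = [(0.77, (0, 0, 0)), (0.96, (0.5, 0.5, 0.5))]

for hkl in [(1, 0, 0), (1, 1, 0), (1, 1, 1), (2, 0, 0)]:
    S = structure_factor(hkl, basis)
    print(hkl, round(abs(S)**2, 4))
```

Note how the intensity alternates with the parity of h + k + l, exactly the pattern derived analytically below.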

X-rays

X-rays scatter from the electrons in a system10. As a result, one can take V (x) to be proportional to the electron density. We can thus approximate

$$V(\mathbf{x}) \sim \sum_{j\,\in\,\mathrm{unit\ cell}} Z_j\, g_j(\mathbf{x} - \mathbf{x}_j)$$

where Z_j is the atomic number of atom j (i.e., its number of electrons) and g_j is a somewhat short-ranged function (i.e., it has a range of a few angstroms, roughly the size of an atom). Taking the Fourier transform, we obtain

$$S(\mathbf{G}) \sim \sum_{j\,\in\,\mathrm{unit\ cell}} f_j(\mathbf{G})\, e^{i\mathbf{G}\cdot\mathbf{x}_j} \qquad (13.8)$$

where f_j, the form factor, is roughly proportional to Z_j, but has some dependence on the magnitude of the reciprocal lattice vector G as well. Frequently, however, we approximate f_j to be independent of G (which would be true if g were extremely short ranged), although this is not strictly correct.

9Brockhouse and Shull were awarded the Nobel prize for pioneering the use of neutron scattering experiments for understanding properties of materials. Shull's initial development of this technique began around 1946, just after the second world war, when the US atomic energy program made neutrons suddenly available. The Nobel prize was awarded in 1994, making this the longest time-lag ever between a discovery and the awarding of the prize.
10The coupling of photons to matter is via the usual minimal coupling (p + eA)²/(2m). The denominator m, which is much larger for nuclei than for electrons, is why the nuclei are not important.

Aside: As noted above, f_j(G) is just the Fourier transform of the scattering potential for atom j. This scattering potential is proportional to the electron density. Taking the density to be a delta function results in f_j being a constant. Taking the slightly less crude approximation that the density is constant inside a sphere of radius r₀ and zero outside of this radius will result in a Fourier transform

$$f_j(\mathbf{G}) \sim 3Z_j\left[\frac{\sin(x) - x\cos(x)}{x^3}\right] \qquad (13.9)$$

with x = |G| r₀ (try showing this!). If the scattering angle is sufficiently small (i.e., |G| is small compared to 1/r₀), the right-hand side is roughly Z_j with no strong dependence on G.
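The claimed small-x limit of Eq. 13.9 is easy to check numerically (a sketch; Z = 59 is just an illustrative choice): since sin x − x cos x ≈ x³/3 for small x, we get f_j → Z_j, while f_j falls off once |G|r₀ is of order a few.

```python
import math

def f_sphere(Z, x):
    """Form factor of a uniform sphere of charge Z, with x = |G| r0 (Eq. 13.9)."""
    if x < 1e-6:
        return Z  # small-x limit: (sin x - x cos x)/x^3 -> 1/3
    return 3*Z*(math.sin(x) - x*math.cos(x))/x**3

Z = 59  # illustrative atomic number
for x in (1e-8, 0.1, 1.0, 3.0, 6.0):
    print(x, round(f_sphere(Z, x), 3))
```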

Comparison of Neutrons and X-rays

• For X-rays, since f_j ∼ Z_j, X-rays scatter very strongly from heavy atoms, and hardly at all from light atoms. This makes it very difficult to "see" light atoms like hydrogen in a solid. Further, it is hard to distinguish atoms that are very close to each other in their atomic number (since they scatter almost the same amount). Also, f_j is slightly dependent on the scattering angle.

• In comparison, the nuclear scattering length b_j varies rather erratically with atomic number (it can even be negative). In particular, hydrogen scatters fairly well, so it is easy to see. Further, one can usually distinguish atoms with similar atomic numbers rather easily.

• For neutrons, the scattering really is very short ranged, so the form factor really is proportional to the scattering length b_j, independent of G. For X-rays there is a dependence on G that complicates matters.

• Neutrons also have spin. Because of this they can detect whether various electrons in the unit cell have their spins pointing up or down. The scattering of the neutrons from the electrons is much weaker than the scattering from the nuclei, but is still observable. We will return to such situations, where the spin of the electron is spatially ordered, in section 19.1.2 below.

Simple Example

Generally, as mentioned above, we write the intensity of scattering as

$$I_{(hkl)} \propto \left|S_{(hkl)}\right|^2$$

Assuming we have orthogonal primitive lattice vectors, we can then generally write

$$S_{(hkl)} = \sum_{j\,\in\,\mathrm{unit\ cell}} f_j\, e^{2\pi i (h x_j + k y_j + l z_j)} \qquad (13.10)$$

where [x_j, y_j, z_j] are the coordinates of atom j within the unit cell, in units of the three primitive lattice vectors.

Example 1: Caesium Chloride: Let us now consider the simple example of CsCl, whose unit cell is shown in Fig. 13.3. This system can be described as simple cubic with a basis given by11

11Do not make the mistake of calling this a bcc lattice! Bcc is a lattice where all points must be the same.

Figure 13.3: Caesium Chloride Unit Cell. Cs are the white corner atoms; Cl is the red central atom. This is simple cubic with a basis. Note that bcc Cs can be thought of as just replacing the Cl with another Cs atom.

Basis for CsCl:
Cs position = [0, 0, 0]
Cl position = [a/2, a/2, a/2]

Thus the structure factor is given by

$$S_{(hkl)} = f_{\mathrm{Cs}} + f_{\mathrm{Cl}}\, e^{2\pi i (h,k,l)\cdot[1/2,1/2,1/2]} = f_{\mathrm{Cs}} + f_{\mathrm{Cl}}\,(-1)^{h+k+l}$$

with the f's being the appropriate form factors for the corresponding atoms. Recall that the scattered wave intensity is I_(hkl) ∼ |S_(hkl)|².
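A direct numerical check of the CsCl result (a sketch treating the form factors as constants with crude f ∼ Z stand-in values): S_(hkl) alternates between f_Cs + f_Cl and f_Cs − f_Cl with the parity of h + k + l, so no reflection is fully extinguished.

```python
import cmath
import math

f_Cs, f_Cl = 55.0, 17.0  # crude f ~ Z approximation (illustrative)

def S_CsCl(h, k, l):
    # basis: Cs at [0,0,0], Cl at [1/2,1/2,1/2]
    return f_Cs + f_Cl*cmath.exp(2j*math.pi*(h + k + l)/2)

for hkl in [(1, 0, 0), (1, 1, 0), (1, 1, 1), (2, 0, 0)]:
    S = S_CsCl(*hkl)
    expected = f_Cs - f_Cl if sum(hkl) % 2 else f_Cs + f_Cl
    assert abs(S - expected) < 1e-9   # matches the (-1)^{h+k+l} formula
    print(hkl, round(abs(S)**2, 1))
```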

13.2.1 Systematic Absences and More Examples

Example 2: Caesium bcc: Let us now consider instead a pure Cs crystal. In this case the crystal is bcc. We can think of this as simply replacing the Cl in CsCl with another Cs atom. Analogously, we think of the bcc lattice as a simple cubic lattice with exactly the same basis, which we now write as

Basis for Cs bcc:
Cs position = [0, 0, 0]
Cs position = [a/2, a/2, a/2]

Now the structure factor is given by

$$S_{(hkl)} = f_{\mathrm{Cs}} + f_{\mathrm{Cs}}\, e^{2\pi i (h,k,l)\cdot[1/2,1/2,1/2]} = f_{\mathrm{Cs}}\left[1 + (-1)^{h+k+l}\right]$$

Crucially, note that the structure factor, and therefore the scattering intensity, vanishes for h + k + l being any odd integer! This phenomenon is known as a systematic absence.
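The bcc absence can be confirmed by brute force (a sketch, with the form factor set to unity): every (hkl) with odd h + k + l has exactly vanishing structure factor.

```python
import cmath
import math

def S_bcc(h, k, l, f=1.0):
    """bcc viewed as simple cubic with identical atoms at [0,0,0] and [1/2,1/2,1/2]."""
    return f*(1 + cmath.exp(2j*math.pi*(h + k + l)/2))

absent, present = [], []
for h in range(3):
    for k in range(3):
        for l in range(3):
            (absent if abs(S_bcc(h, k, l)) < 1e-12 else present).append((h, k, l))

# every absent reflection has odd h+k+l; every present one has even h+k+l
assert all(sum(p) % 2 == 1 for p in absent)
assert all(sum(p) % 2 == 0 for p in present)
print(len(absent), "absent,", len(present), "present reflections for h,k,l in 0..2")
```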

To understand why this absence occurs, consider the simple case of the (100) family of planes (see Fig. 12.1). This is simply a family of planes along the crystal axes with spacing a. You might expect a wave of wavelength 2π/a oriented perpendicular to these planes to scatter constructively. However, if we are considering a bcc lattice, then there are additional planes of atoms half-way between the (100) planes which then cause perfect destructive interference. We refer back to the Important Complication mentioned in section 12.1.5. As mentioned there, the plane spacing for the bcc lattice in this case is not 2π/|G_(100)| but is rather 2π/|G_(200)|. In fact, in general, for a bcc lattice the plane spacing for any family of lattice planes is 2π/|G_(hkl)| where h + k + l is always even. This is what causes the selection rule.

Example 3: Copper fcc: Quite similarly, there are systematic absences for scattering from fcc crystals as well. Recall from Eq. 11.5 that the fcc crystal can be thought of as a simple cubic lattice with a basis given by the points [0, 0, 0], [1/2, 1/2, 0], [1/2, 0, 1/2], and [0, 1/2, 1/2] in units of the cubic lattice constant. As a result the structure factor of fcc copper is given by (plugging into Eq. 13.10)

$$S_{(hkl)} = f_{\mathrm{Cu}}\left[1 + e^{i\pi(h+k)} + e^{i\pi(h+l)} + e^{i\pi(k+l)}\right] \qquad (13.11)$$

It is easily shown that this expression vanishes unless h, k and l are either all odd or all even.

Summary of Systematic Absences of Scattering:
Simple cubic: all h, k, l allowed
bcc: h + k + l must be even
fcc: h, k, l must be all odd or all even

Systematic absences are sometimes known as selection rules. It is very important to note that these absences, or selection rules, occur for any structure with the given Bravais lattice type. Even if a material is bcc with a basis of five different atoms per primitive unit cell, it will still show the same systematic absences as the bcc lattice we considered above with a single atom per primitive unit cell.
To see why this is true we consider yet another example

Figure 13.4: Zinc Sulfide Conventional Unit Cell. This is fcc with a basis given by a Zn atom at [0, 0, 0] and a S atom at [1/4, 1/4, 1/4].

Example 4: Zinc Sulfide = fcc with a basis: As shown in Fig. 13.4, the zinc sulfide crystal is an fcc lattice with a basis given by a Zn atom at [0, 0, 0] and an S atom at [1/4, 1/4, 1/4] (this is known as a zincblende structure). If we consider the fcc lattice to itself be a cubic lattice with basis given by the points [0, 0, 0], [1/2, 1/2, 0], [1/2, 0, 1/2], and [0, 1/2, 1/2], we then have the 8 atoms in the conventional unit cell having positions given by the combination of the two bases, i.e.,

Basis for ZnS:
Zn positions = [0, 0, 0], [1/2, 1/2, 0], [1/2, 0, 1/2], and [0, 1/2, 1/2]
S positions = [1/4, 1/4, 1/4], [3/4, 3/4, 1/4], [3/4, 1/4, 3/4], and [1/4, 3/4, 3/4]

The structure factor for ZnS is thus given by

$$S_{(hkl)} = f_{\mathrm{Zn}}\left[1 + e^{2\pi i (hkl)\cdot[1/2,1/2,0]} + \ldots\right] + f_{\mathrm{S}}\left[e^{2\pi i (hkl)\cdot[1/4,1/4,1/4]} + e^{2\pi i (hkl)\cdot[3/4,3/4,1/4]} + \ldots\right]$$

This combination of 8 terms can be factored to give

$$S_{(hkl)} = \left[1 + e^{i\pi(h+k)} + e^{i\pi(h+l)} + e^{i\pi(k+l)}\right]\left[f_{\mathrm{Zn}} + f_{\mathrm{S}}\, e^{i(\pi/2)(h+k+l)}\right] \qquad (13.12)$$

The first term in brackets is precisely the same as the term we found for the fcc crystal in Eq. 13.11. In particular, it has the same systematic absences: it vanishes unless h, k and l are either all even or all odd. The second term gives additional absences associated specifically with the ZnS structure. Since the positions of the atoms are the positions of the underlying lattice plus the vectors in the basis, it is easy to see that the structure factor of a crystal system with a basis will always factorize into a piece which comes from the underlying lattice structure times a piece corresponding to the basis. Generalizing Eq. 13.12 we can write

$$S_{(hkl)} = S^{\mathrm{lattice}}_{(hkl)} \times S^{\mathrm{basis}}_{(hkl)} \qquad (13.13)$$

13.3 Methods of Scattering Experiments

There are many methods of performing scattering experiments. In principle they are all similar: one sends in a probe wave of known wavelength (an X-ray, for example) and measures the angles at which it diffracts when it comes out. Then using Bragg's law (or the Laue equation) one can deduce the spacings of the lattice planes in the system.

13.3.1 Advanced Methods (interesting and useful but you probably won’t be tested on this)

Laue Method

Conceptually, perhaps the simplest method is to take a large single crystal of the material in question, fire waves at it (X-rays, say) from one direction, and measure the direction of the outgoing waves. However, given a single direction of the incoming wave, it is unlikely that you precisely achieve the diffraction condition (the Bragg condition) for any given set of lattice planes. In order to get more data, one can then vary the wavelength of the incoming wave. This allows one to achieve the Bragg condition, at least at some wavelength.

Rotating Crystal Method

A similar technique is to rotate the crystal continuously so that at some angle of the crystal with respect to the incoming waves, one achieves the Bragg condition and measures an outgoing diffracted wave. Both of these methods are used. However, there is an important reason that they are sometimes impossible. Frequently it is not possible to obtain a single crystal of a material. Growing large crystals (such as the beautiful ones shown in Fig. 6) can be an enormous challenge.12 In the case of neutron scattering, the problem is even more acute since one typically needs fairly large single crystals compared to what is needed for X-rays.

13.3.2 Powder Diffraction (you will almost certainly be tested on this!)

Powder diffraction, or the Debye-Scherrer method13 is the use of wave scattering on a sample which is not single crystalline, but is powdered. In this case, the incoming wave can scatter off of any one of many small crystallites which may be oriented in any possible direction. In spirit this technique is similar to the rotating crystal method in that there is always some angle at which a crystal can be oriented to diffract the incoming wave. A figure of the Debye-Scherrer setup is shown in Fig. 13.5. Using Bragg’s law, given the wavelength of the incoming wave, we can deduce the possible spacings between lattice planes.

A Fully Worked Example. Study this!

Because this type of problem has historically ended up on exams essentially every year, and because it is hard to find references that explain how to solve these problems, I am going to work a powder- diffraction problem in detail here. As far as I can tell, they will only ever ask you about cubic lattices (simple cubic, fcc, and bcc). Before presenting the problem and solving it, however, it is useful to write down a table of possible lattice planes and the selection rules that can occur for the smallest reciprocal lattice vectors

12For example, high-temperature superconducting materials were discovered in 1986 (and resulted in a Nobel prize the next year!). Despite a concerted world-wide effort, good single crystals of these materials were not available for 5 to 10 years.
13Debye is the same guy from the specific heat of solids. Paul Scherrer was Swiss but worked in Germany during the second world war, where he passed information to the famous American spy (and baseball player), Moe Berg, who had been given orders to find and shoot Heisenberg if he felt that the Germans were close to developing a bomb.


Figure 13.5: Debye-Scherrer Powder Diffraction.

Lattice Plane Selection Rules

{hkl}   N = h²+k²+l²   Multiplicity   cubic   bcc   fcc
 100          1              6          X
 110          2             12          X      X
 111          3              8          X             X
 200          4              6          X      X      X
 210          5             24          X
 211          6             24          X      X
 220          8             12          X      X      X
 221          9             24          X
 300          9              6          X
 310         10             24          X      X
 311         11             24          X             X
 222         12              8          X      X      X
 ...

The selection rules are exactly those listed above: simple cubic allows scattering from any plane, bcc must have h + k + l even, and fcc must have h, k, l either all odd or all even. We have added a column N which is the square magnitude of the reciprocal lattice vector. We have also added an additional column labeled "multiplicity". This quantity is important for figuring out the amplitude of scattering. The point here is that the (100) planes have some particular spacing but there are 5 other families of planes with the same spacing: (010), (001), (1̄00), (01̄0), (001̄). (Because we mean all of these possible families of lattice planes, we use the notation {hkl} introduced at the end of section 12.1.5.) In the powder diffraction method, the crystal orientations are random, and here there would be 6 possible equivalent orientations of a crystal which will present the right angle for scattering from one of these planes, so there will be scattering intensity which is 6 times as large as we would otherwise calculate. This is known as the multiplicity factor. For the case of the 111 family, we would instead find 8 possible equivalent planes: (111), (111̄), (11̄1), (1̄11), (11̄1̄), (1̄11̄), (1̄1̄1), (1̄1̄1̄). Thus, we should replace Eq. 13.6 with

$$I_{\{hkl\}} \propto M_{\{hkl\}}\left|S_{\{hkl\}}\right|^2 \qquad (13.14)$$

where M is the multiplicity factor. Calculating this intensity is straightforward for neutron scattering, but is much harder for X-ray scattering because the form factor for X-rays depends on G. I.e., since in Eq. 13.7 the form factor (or scattering length b_j) is a constant independent of G, it is easy to calculate the expected amplitudes of scattering based only on these constants. For the case of X-rays you need to know the functional form of f_j(G). At some very crude level of approximation it is a constant. More precisely, we see in Eq. 13.9 that it is constant for small scattering angle but can vary quite a bit for large scattering angle.
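The table above (N, multiplicity, and the cubic/bcc/fcc selection rules) can be generated programmatically. The sketch below is my own implementation of the counting described in the text, with the multiplicity computed as the number of distinct signed permutations of (h, k, l).

```python
from itertools import product

def allowed(h, k, l, lattice):
    """Selection rules summarized in the text."""
    if lattice == "cubic":
        return True
    if lattice == "bcc":
        return (h + k + l) % 2 == 0
    if lattice == "fcc":
        return len({h % 2, k % 2, l % 2}) == 1  # all even or all odd
    raise ValueError(lattice)

def multiplicity(h, k, l):
    """Number of distinct signed permutations sharing the multiset {|h|,|k|,|l|}."""
    members = set()
    target = sorted(map(abs, (h, k, l)))
    for perm in product([h, k, l], repeat=3):
        if sorted(map(abs, perm)) != target:
            continue
        for signs in product([1, -1], repeat=3):
            members.add(tuple(s*p for s, p in zip(signs, perm)))
    return len(members)

rows = []
for h, k, l in product(range(4), repeat=3):
    if (h, k, l) == (0, 0, 0) or not (h >= k >= l):
        continue  # keep one representative per family
    N = h*h + k*k + l*l
    marks = "  ".join(lat if allowed(h, k, l, lat) else "-"
                      for lat in ("cubic", "bcc", "fcc"))
    rows.append((N, (h, k, l), multiplicity(h, k, l), marks))

for N, hkl, mult, marks in sorted(rows)[:12]:
    print(N, hkl, mult, marks)
```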

Even if one knows the detailed functional form of f_j(G), experimentally observed scattering intensities are never quite of the form predicted by Eq. 13.14. There can be several sources of corrections14 that modify this result (these corrections are usually swept under the rug in elementary introductions to scattering, but you should at least be aware that they exist). Perhaps the most significant corrections15 are known as Lorentz corrections or Lorentz-Polarization corrections. These terms, which depend on the detailed geometry of the experiment, give various prefactors (involving terms like cos θ, for example) which are smooth as a function of θ.

The Example

Consider the powder diffraction data from PrO2 shown in Fig. 13.6. (Exactly this data was presented in the 2009 Exam, and we were told that the lattice is some type of cubic lattice. As we will see below, there were several small, but important, errors in the question!) Given the wavelength 0.123 nm, we first would like to figure out the type of lattice and the lattice constant. Note that the full deflection angle is 2θ. We will want to use Bragg's law and the expression for the spacing between planes

$$d_{(hkl)} = \frac{\lambda}{2\sin\theta} = \frac{a}{\sqrt{h^2 + k^2 + l^2}}$$

where we have also used the expression Eq. 12.12 for the spacing between planes in a cubic lattice given the lattice constant a. Note that this then gives us

$$a^2/d^2 = h^2 + k^2 + l^2 = N$$

14Many of these corrections were first worked out by Charles Galton Darwin, the grandson of Charles Robert Darwin, the brilliant naturalist and proponent of evolution. The younger Charles was a terrific physicist in his own right. Later in life his focus turned to ideas of eugenics, predicting that the human race would eventually fail as we continue to breed unfavorable traits. (His interest in eugenics is not surprising considering that the acknowledged father of eugenics, Francis Galton, was also part of the same family.)
15Another important correction is due to the thermal vibrations of the crystal. Using Debye's theory of vibration, Ivar Waller derived what is now known as the Debye-Waller factor that accounts for the thermal smearing of Bragg peaks.


Figure 13.6: Powder Diffraction of Neutrons from PrO2. The wavelength of the neutron beam is λ = .123 nm. (One should assume that Lorentz corrections have been removed from the displayed intensities.)

This N is what we have labeled N in the above table of selection rules. We now make a table. In the first two columns we just read the angles off of the given graph. You should try to make the measurements of the angle from the data as carefully as possible. It makes the analysis much easier if you measure the angles right!

    peak   2θ       d = λ/(2 sin θ)   d_a²/d²   3d_a²/d²   N = h²+k²+l²   {hkl}   a = d√(h²+k²+l²)
    a      22.7°    0.313 nm          1         3          3              111     .542 nm
    b      26.3°    0.270 nm          1.33      3.99       4              200     .540 nm
    c      37.7°    0.190 nm          2.69      8.07       8              220     .537 nm
    d      44.3°    0.163 nm          3.67      11.01      11             311     .541 nm
    e      46.2°    0.157 nm          3.97      11.91      12             222     .544 nm
    f      54.2°    0.135 nm          5.35      16.05      16             400     .540 nm

In the third column of the table we calculate the distance between lattice planes for the given diffraction peak using Bragg's law. In the fourth column we have calculated the squared ratio of the lattice spacing d_a of the first peak (labeled a, used as a reference) to the lattice spacing d of the given peak. We then realize that these ratios are pretty close to whole numbers divided by three, so we try multiplying each of these quantities by 3 in the next column. If we round these numbers to integers (given in the next column), we produce precisely the values of N = h² + k² + l² expected for the fcc lattice (according to the above selection rules we must have h, k, l either all even or all odd). The final column calculates the lattice constant from the given diffraction angle. Averaging these numbers gives us a measurement of the lattice constant a = .541 ± .002 nm.

The analysis thus far is equivalent to what one would do for X-ray scattering. However, with neutrons, assuming the scattering length is independent of scattering angle (which is typically a good assumption), we can go a bit further by analyzing the intensity of the scattering peaks.
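The indexing procedure above is easy to automate. Here is a minimal sketch (our own illustration, not part of the notes) that reproduces the table from the measured peak angles:

```python
# Indexing the PrO2 powder pattern; peak angles (in degrees, full deflection
# 2*theta) are read off Fig. 13.6. Variable names are our own choices.
import math

lam = 0.123  # neutron wavelength, nm
two_theta = {"a": 22.7, "b": 26.3, "c": 37.7, "d": 44.3, "e": 46.2, "f": 54.2}
N_fcc = {"a": 3, "b": 4, "c": 8, "d": 11, "e": 12, "f": 16}  # matched h^2+k^2+l^2

# Bragg's law: d = lambda / (2 sin theta)
d = {p: lam / (2 * math.sin(math.radians(t / 2))) for p, t in two_theta.items()}

# The ratios 3 * d_a^2 / d^2 should land near the fcc selection-rule values of N
scaled = {p: 3 * (d["a"] / d[p]) ** 2 for p in d}

# Lattice constant from each peak, a = d * sqrt(h^2 + k^2 + l^2), then averaged
a_mean = sum(d[p] * math.sqrt(N_fcc[p]) for p in d) / len(d)
```

Running this reproduces the d spacings of the table and an average lattice constant near .541 nm.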

In real data, intensities are often weighted by the above-mentioned Lorentz factors. In Fig. 13.6 these factors have been removed, so we can expect that Eq. 13.14 holds precisely. (One error in the Exam question was that it was not mentioned that these factors had been removed!) In the problem given on the 2009 Exam, it is given that the basis for this crystal is a Pr atom at position [0,0,0] and O atoms at [1/4,1/4,1/4] and [1/4,1/4,3/4]. Thus, the Pr atoms form an fcc lattice and the O's fill in the holes as shown in Fig. 13.7.

Figure 13.7: The fluorite structure. This is fcc with a basis given by a white atom (Pr) at [0, 0, 0] and yellow atoms (O) at [1/4, 1/4, 1/4] and [1/4, 1/4, 3/4].

Let us calculate the structure factor for this crystal. Using Eq. 13.13 we have

    S(hkl) = [1 + e^{iπ(h+k)} + e^{iπ(h+l)} + e^{iπ(k+l)}] [b_Pr + b_O (e^{i(π/2)(h+k+l)} + e^{i(π/2)(h+k+3l)})]

The first term in brackets is the structure factor for the fcc lattice, and it gives 4 for every allowed scattering point (when h, k, l are either all even or all odd). The second term in brackets is the structure factor for the basis. The scattering intensity of the peaks is then given in terms of this structure factor and the peak multiplicities as shown in Eq. 13.14. We thus can write for all of our measured peaks¹⁶

    I_{hkl} = C M_{hkl} |b_Pr + b_O (e^{i(π/2)(h+k+l)} + e^{i(π/2)(h+k+3l)})|²

where the constant C contains other constant factors (including the factor of 4² from the fcc structure factor). Note: we have to be a bit careful here to make sure that the bracketed factor gives the same result for all possible (hkl) included in {hkl}, and in fact it does. Thus we can compile another table showing the predicted relative intensities of the peaks.
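One can check the bracketed basis factor for each {hkl} numerically. A minimal sketch (ours, with hypothetical names, not from the notes):

```python
# Squared modulus of the fluorite basis factor,
# |b_Pr + b_O (e^{i(pi/2)(h+k+l)} + e^{i(pi/2)(h+k+3l)})|^2
import cmath
import math

def basis_factor(h, k, l, b_pr, b_o):
    """Evaluate the bracketed basis term of the structure factor, squared."""
    p1 = cmath.exp(1j * (math.pi / 2) * (h + k + l))
    p2 = cmath.exp(1j * (math.pi / 2) * (h + k + 3 * l))
    return abs(b_pr + b_o * (p1 + p2)) ** 2

# With b_Pr = 1 and b_O = 1/2 one finds, e.g.:
#   (1,1,1): the two O phases cancel, leaving b_Pr^2
#   (2,0,0): both phases are -1, giving (b_Pr - 2 b_O)^2
#   (2,2,0): both phases are +1, giving (b_Pr + 2 b_O)^2
```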

¹⁶ Again assuming that smooth Lorentz correction terms have been removed from our data so that Eq. 13.14 is accurate.

Scattering Intensity

    peak   {hkl}   I_{hkl}/C ∝ M|S|²       Measured Intensity
    a      111     8 b_Pr²                  0.05
    b      200     6 [b_Pr − 2b_O]²         0.1
    c      220     12 [b_Pr + 2b_O]²        1.0
    d      311     24 b_Pr²                 0.15
    e      222     8 [b_Pr − 2b_O]²         0.1
    f      400     6 [b_Pr + 2b_O]²         0.5

where the final column lists the intensities measured from the data in Fig. 13.6. From the analytic expressions in the third column we can immediately predict that we should have

    I_d = 3 I_a        I_c = 2 I_f        I_e = (4/3) I_b

Examining the fourth column of this table, it is clear that the first two of these equations are properly satisfied. However, the final equation does not appear to be correct. This points to some error in constructing the plot. Thus we suspect some problem in either I_e or I_b: either I_e is too small or I_b is too large¹⁷.

To further home in on this problem with the data, we can look at the ratio I_c/I_a, which in the measured data has a value of about 20. Thus we have

    I_c/I_a = 12 [b_Pr + 2b_O]² / (8 b_Pr²) = 20

With some algebra this can be reduced to a quadratic equation with two roots, resulting in

    b_Pr = −.43 b_O   or   .75 b_O        (13.15)

Let us suppose now that our measurement of I_b is correct. In this case we have

    I_b/I_a = 6 [b_Pr − 2b_O]² / (8 b_Pr²) = 2

which we can solve to give

    b_Pr = .76 b_O   or   −3.1 b_O

the former solution being reasonably consistent with Eq. 13.15. However, were we to assume instead that I_e is correct, we would have

    I_e/I_a = 8 [b_Pr − 2b_O]² / (8 b_Pr²) = 2

from which we would obtain

    b_Pr = .83 b_O   or   −4.8 b_O

which appears inconsistent with Eq. 13.15. We thus conclude that the measured intensity of I_e given in Fig. 13.6 is actually incorrect, and should be larger by about a factor of 4/3. (This is the second error in the exam question.) Having now corrected this error, we note that we have used this neutron data to experimentally determine the ratio of the nuclear scattering lengths, b_Pr/b_O ≈ .75.

¹⁷ Another possibility is that the form factor is not precisely independent of scattering angle, as is the case for X-ray scattering. However, the fact that all the peaks are consistent but for this one peak suggests a transcription error.
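The quadratics above are easy to solve numerically. The sketch below is our own (the helper `roots_of_ratio` is hypothetical, with x = b_Pr/b_O); the sign argument encodes whether the basis factor is [b_Pr + 2b_O]² or [b_Pr − 2b_O]²:

```python
# Solve mult * (x + sign*2)^2 / (mult_ref * x^2) = ratio for x = b_Pr / b_O
import math

def roots_of_ratio(mult, mult_ref, ratio, sign):
    """Both roots of the intensity-ratio quadratic, sorted ascending."""
    # Expanding: (mult - ratio*mult_ref) x^2 + 4*sign*mult x + 4*mult = 0
    a = mult - ratio * mult_ref
    b = 4 * sign * mult
    c = 4 * mult
    disc = math.sqrt(b * b - 4 * a * c)
    return sorted([(-b - disc) / (2 * a), (-b + disc) / (2 * a)])

roots_c = roots_of_ratio(12, 8, 20, +1)  # I_c/I_a = 20 -> roots near -0.43 and 0.75
roots_b = roots_of_ratio(6, 8, 2, -1)    # I_b/I_a = 2  -> roots near -3.2 and 0.76
roots_e = roots_of_ratio(8, 8, 2, -1)    # I_e/I_a = 2  -> roots near -4.8 and 0.83
```

The physical (positive) roots reproduce the values quoted in the text.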

13.4 Still more about scattering

Scattering experiments such as those discussed here are the method for determining the microscopic structures of materials. One can use these methods (and extensions thereof) to sort out even very complicated atomic structures, such as those of biological molecules.

Aside: In addition to the obvious work of von Laue and Bragg that initiated the field of X-ray diffraction (and of Brockhouse and Shull for neutrons), there have been about half a dozen Nobel Prizes that have relied on, or further developed, these techniques. In 1962 a chemistry Nobel Prize was awarded to Perutz and Kendrew for using X-rays to determine the structure of the biological proteins hemoglobin and myoglobin. The same year, Watson and Crick were awarded the prize in Physiology or Medicine for determining the structure of DNA — which they did with the help of X-ray diffraction data taken by Rosalind Franklin¹⁸. Two years later, in 1964, Dorothy Hodgkin¹⁹ won the prize for determination of the structure of penicillin and other biological molecules. Further Nobels were given in chemistry for determining the structure of boranes (Lipscomb, 1976) and for the structure of photosynthetic proteins (Deisenhofer, Huber, and Michel, 1988).

13.4.1 Variant: Scattering in Liquids and Amorphous Solids

A material need not be crystalline to scatter waves. However, for amorphous solids or liquids, instead of having delta-function peaks in the structure factor at reciprocal lattice vectors (as in Fig. 13.6), the structure factor (which is again defined as the Fourier transform of the density) will have smooth behavior — with incipient peaks corresponding to 2π/d where d is roughly the typical distance between atoms. An example of a measured structure factor in liquid Al is shown in Fig. 13.8. As the material gets close to its freezing point, the peaks in the structure factor will get more pronounced, becoming more like the structure of a solid where the peaks are delta functions.

Figure 13.8: The structure factor of liquid Aluminum

¹⁸ There remains quite a controversy over the fact that Watson and Crick, at a critical juncture, were shown Franklin's data without her knowledge! Franklin might have shared the prize with Watson and Crick, and thereby received a bit more of the appropriate credit, but she tragically died of cancer at age 37 in 1958, four years before the prize was awarded.
¹⁹ Dorothy Hodgkin was a student and later a fellow at Somerville College, Oxford. Yay!

13.4.2 Variant: Inelastic Scattering

Figure 13.9: Inelastic scattering. Energy and crystal momentum must be conserved.

It is also possible to perform scattering experiments which are inelastic. Here, "inelastic" means that some energy of the incoming wave is left behind in the sample, and the energy of the outgoing wave is lower. The general process is shown in Fig. 13.9. A wave is incident on the crystal with momentum k and energy ε(k). (For neutrons the energy would be ℏ²k²/(2m), whereas for photons the energy would be ℏc|k|.) This wave transfers some of its energy and momentum to some internal excitation mode of the material, such as a phonon, or a quantum of spin or electronic excitation. One then measures the outgoing momentum k′ and energy ε(k′) of the wave. Since energy and crystal momentum must be conserved, one has

    Q = k − k′ + G
    E(Q) = ε(k) − ε(k′)

thus allowing one to determine the dispersion relation of the internal excitation (i.e., the relationship between Q and E(Q)). This technique is extremely useful for determining phonon dispersions experimentally. In practice, the technique is much more useful with neutrons than with X-rays. The reason for this is that, because the speed of light is so large (and E = ℏc|k|), the energy differences that one obtains are enormous except for a tiny range of k′ for each k. Since there is a maximum energy for a phonon, the X-rays therefore have a tiny total cross section for exciting phonons. A second reason that this technique is difficult for X-rays is that it is much harder to build an X-ray detector that determines energy than it is for neutrons.
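A minimal sketch (our own illustration, not from the notes) of how a single inelastic neutron event determines one point (Q, E(Q)) on the excitation's dispersion, using the conservation laws above (here with G = 0 and illustrative wavevectors):

```python
# Turning one inelastic neutron event into a (Q, E(Q)) point
HBAR = 1.0545718e-34   # reduced Planck constant, J*s
M_N = 1.674927e-27     # neutron mass, kg

def eps(k):
    """Free-neutron energy eps(k) = hbar^2 |k|^2 / (2 m), with k in 1/m."""
    return HBAR ** 2 * sum(c * c for c in k) / (2 * M_N)

def excitation(k_in, k_out, G=(0.0, 0.0, 0.0)):
    """Conservation: Q = k_in - k_out + G, E(Q) = eps(k_in) - eps(k_out)."""
    Q = tuple(a - b + g for a, b, g in zip(k_in, k_out, G))
    return Q, eps(k_in) - eps(k_out)

# Illustrative wavevectors in 1/m; energy is left behind, so E comes out positive
Q, E = excitation((2.0e10, 0.0, 0.0), (1.5e10, 0.0, 0.0))
```

With these numbers E comes out at a few meV, a typical phonon energy scale.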

13.4.3 Experimental Apparatus

Perhaps the most interesting aspect of this kind of experiment is the question of how one actually produces and measures the waves in question. Since at the end of the day one ends up counting photons or neutrons, brighter sources (higher flux of probe particles) are always better, as they allow one to do experiments more quickly and to reduce noise (the counting error on N counts is proportional to √N, meaning a fractional error that drops as 1/√N). Further, with a brighter source, one can examine smaller samples more easily.

X-rays: Even small laboratories can have X-ray sources that can do very useful crystallography. A typical source accelerates electrons electrically (with 10s of keV) and smashes them into a metal target. X-rays with a discrete spectrum of energies are produced when an electron is knocked out of a low atomic orbital and an electron in a higher orbital drops down to re-fill the hole (this is known as X-ray fluorescence). A continuous Bremsstrahlung spectrum is also produced by electrons coming near the charged nuclei, but for monochromatic diffraction experiments this is less useful. (One wavelength can be selected from the spectrum — using diffraction from a known crystal!) Much higher brightness X-ray sources are provided by huge (and hugely expensive) facilities known as synchrotron light sources, where particles (usually electrons) are accelerated around enormous loops (at energies in the GeV range). These electrons are rapidly accelerated around corners, which makes them emit X-rays extremely brightly and in a highly collimated fashion. Detection of X-rays can be done with photographic film (the old style) but is now more frequently done with more sensitive semiconductor detectors.

Neutrons: Although it is possible to generate neutrons in a small lab, the flux of such devices is extremely small, and neutron scattering experiments are therefore always done at large neutron source facilities. Although the first neutron sources simply used the byproduct neutrons from nuclear reactors, more modern facilities now use a technique called spallation, where protons are accelerated into a target and neutrons are emitted. As with X-rays, neutrons can be monochromated (made into a single wavelength) by diffracting them from a known crystal. Another technique is to use time-of-flight: since more energetic neutrons move faster, one can send a pulse of polychromatic neutrons and select only those that arrive at a certain time in order to obtain monochromatic neutrons.
On the detection side, one can again select for energy very easily. I won’t say too much about neutron detection as there are many methods. Needless to say, they all involve interaction with nuclei.
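The time-of-flight idea can be sketched with de Broglie's relation v = h/(mλ). In the sketch below (ours), the flight path is an illustrative assumption:

```python
# Time-of-flight monochromation: slower (longer-wavelength) neutrons arrive later
H_PLANCK = 6.62607015e-34  # Planck constant, J*s
M_N = 1.674927e-27         # neutron mass, kg

def arrival_time(wavelength, path):
    """Arrival time of a neutron of de Broglie wavelength lambda (m) over path (m)."""
    v = H_PLANCK / (M_N * wavelength)   # de Broglie: v = h / (m * lambda)
    return path / v

# The 0.123 nm neutrons of Fig. 13.6 over an assumed 10 m flight path
t = arrival_time(0.123e-9, 10.0)   # roughly 3 ms
```

Gating the detector on a narrow window around this arrival time selects a narrow band of wavelengths from the pulse.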

13.5 Summary of Diffraction

• Diffraction of waves from crystals in Laue and Bragg formulations (equivalent to each other).
• The structure factor (the Fourier transform of the scattering potential) in a periodic crystal has sharp peaks at allowed reciprocal lattice vectors for scattering. The scattering intensity is the square of the structure factor.
• There are systematic absences of diffraction peaks depending on the crystal structure (fcc, bcc). Know how to figure these out.
• Know how to analyze a powder diffraction pattern (very common exam question!)

References

It is hard to find references that give enough information about diffraction to suit the Oxford course. These are not bad:

• Kittel, chapter 2
• Ashcroft and Mermin, chapter 6

Figure 13.10: The Rutherford-Appleton Lab in Oxfordshire, UK. On the right, the large circular building is the DIAMOND synchrotron light source. The building on the left is the ISIS spallation neutron facility. This was the brightest neutron source in the world until August 2007, when it was surpassed by one in Oak Ridge, US. The next generation source is being built in Sweden and is expected to start operating in 2019. The price tag for construction of this device is over 10⁹ euros.

• Dove, chapter 6 (most detailed, with perhaps a bit too much information in places)

In addition, the following have nice, but incomplete, discussions:

• Rosenberg, chapter 2
• Ibach and Luth, chapter 3
• Burns, chapter 4

Part VI

Electrons in Solids


Chapter 14

Electrons in a Periodic Potential

In chapters 8 and 9 we discussed the wave nature of phonons in solids, and how crystal momentum is conserved (i.e., momentum is conserved up to a reciprocal lattice vector). Further, we found that we could describe the entire excitation spectrum within a single Brillouin zone in a reduced zone scheme. We also found in chapter 13 that X-rays and neutrons similarly scatter from solids by conserving crystal momentum. In this chapter we will consider the nature of electron waves in solids, and we will find that similarly crystal momentum is conserved and the entire excitation spectrum can be described within a single Brillouin zone using a reduced zone scheme. We have seen a detailed preview of the properties of electrons in periodic systems when we considered the one-dimensional tight binding model in chapter 10, so the results of this section will be hardly surprising. However, in the current chapter we will approach the problem from a very different (and complementary) starting point: here, we will consider electrons as free-electron waves that are only very weakly perturbed by the periodic arrangement of atoms in the solid. The tight binding model is exactly the opposite limit, where we consider electrons bound strongly to the atoms, so that they only weakly hop from one atom to the next.

14.1 Nearly Free Electron Model

We start with completely free electrons whose Hamiltonian is

    H₀ = p²/(2m)

The corresponding energy eigenstates, the plane waves |k⟩, have eigenenergies

    ε₀(k) = ℏ²|k|²/(2m)

We now consider a weak periodic potential perturbation to this Hamiltonian

    H = H₀ + V(r)   with   V(r) = V(r + R)

where R is any lattice vector. The matrix elements of this potential are then just the Fourier components

    ⟨k′| V |k⟩ = (1/L³) ∫ dr e^{i(k−k′)·r} V(r) ≡ V_{k′−k}        (14.1)

which is zero unless k′ − k is a reciprocal lattice vector (see Eq. 13.1). Thus, any plane wave state k can scatter into another plane wave state k′ only if these two plane waves are separated by a reciprocal lattice vector.

We now apply the rules of perturbation theory. At first order in the perturbation V, we have

    ε(k) = ε₀(k) + ⟨k| V |k⟩ = ε₀(k) + V₀

which is just an uninteresting constant energy shift to all of the eigenstates. In fact, it is an exact statement (at any order of perturbation theory) that the only effect of V₀ is to shift the energies of all of the eigenstates by this constant¹. Henceforth we will assume that V₀ = 0 for simplicity. At second order in perturbation theory we have

    ε(k) = ε₀(k) + V₀ + Σ′_{k′ = k+G} |⟨k′| V |k⟩|² / (ε₀(k) − ε₀(k′))        (14.2)

where the ′ on the sum means that the sum is restricted to G ≠ 0. In this sum, however, we have to be careful: it is possible that for some k′ it happens that ε₀(k) is very close to ε₀(k′), or perhaps they are even equal. In that case the corresponding term of the sum diverges and the perturbation expansion makes no sense. This is what we call a degenerate situation, and it needs to be handled with degenerate perturbation theory, which we shall consider below. To see when this degenerate situation happens, we look for solutions of

    ε₀(k) = ε₀(k′)        (14.3)
    k′ = k + G            (14.4)

First, let us consider the one-dimensional case. Since ε₀(k) ~ k², the only possible solution of Eq. 14.3 is k′ = −k. This means the two equations are only satisfied for

    k′ = −k = πn/a

that is, precisely on the Brillouin zone boundaries (see Fig. 14.1). In fact, this is quite general even in higher dimensions: given a point k on a Brillouin zone boundary, there is another point k′ (also on a Brillouin zone boundary) such that Eqs. 14.3 and 14.4 are satisfied (see in particular Fig. 12.5 for example)². Since Eq. 14.2 is divergent, we need to handle this situation with degenerate perturbation theory³. In this approach, one diagonalizes the Hamiltonian within the degenerate space first (and other perturbations can be treated after this). In other words, we take states of the same energy that are connected by the matrix element and treat their mixing exactly.
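As a numerical aside (ours, not part of the notes), one can check the key fact used above, that the matrix element of Eq. 14.1 vanishes unless k′ − k is a reciprocal lattice vector, by discretizing a periodic potential on a grid:

```python
# Discretized check of Eq. 14.1 for a one-dimensional V(x) with period a
import cmath
import math

a, n_cells, pts = 1.0, 8, 16          # lattice constant, cells, points per cell
L = a * n_cells
N = n_cells * pts
xs = [i * L / N for i in range(N)]
V = [math.cos(2 * math.pi * x / a) for x in xs]   # periodic: V(x) = V(x + a)

def matel(kp, k):
    """Discrete version of (1/L) * integral of e^{i(k - k') x} V(x) dx."""
    return sum(cmath.exp(-1j * (kp - k) * x) * v for x, v in zip(xs, V)) / N

k = 2 * math.pi / L        # an allowed plane-wave momentum for the box of size L
G = 2 * math.pi / a        # shortest reciprocal lattice vector

coupled = matel(k + G, k)                   # k' - k = G: the Fourier coefficient 1/2
uncoupled = matel(k + 2 * math.pi / L, k)   # k' - k not a reciprocal lattice vector: 0
```

The cosine potential has Fourier components only at ±G, so `coupled` comes out 1/2 and `uncoupled` vanishes (to floating-point accuracy).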

¹ You should be able to show this!
² To see this generally, recall that a Brillouin zone boundary is a perpendicular bisector of the segment between 0 and some G. We can write the given point as k = G/2 + k⊥ where k⊥ · G = 0. Then if we construct the point k′ = −G/2 + k⊥, clearly Eq. 14.4 is satisfied, k′ lies on the perpendicular bisector of the segment between 0 and −G and therefore is on a zone boundary, and |k| = |k′|, which implies that Eq. 14.3 is satisfied.
³ Hopefully you have learned this in your quantum mechanics courses already!

Figure 14.1: Scattering from Brillouin Zone Boundary to Brillouin Zone Boundary. The states at the two zone boundaries are separated by a reciprocal lattice vector G and have the same energy. This situation leads to a divergence in perturbation theory, Eq. 14.2 because when the two energies match, the denominator is zero.

14.1.1 Degenerate Perturbation Theory

If two plane wave states |k⟩ and |k′⟩ = |k + G⟩ are of approximately the same energy (meaning that k and k′ are close to zone boundaries), then we must diagonalize the matrix elements of these states first. We have

    ⟨k| H |k⟩   = ε₀(k)
    ⟨k′| H |k′⟩ = ε₀(k′) = ε₀(k + G)
    ⟨k| H |k′⟩  = V*_{k′−k} = V*_G
    ⟨k′| H |k⟩  = V_{k′−k}  = V_G        (14.5)

where we have used the definition of V_G from Eq. 14.1, and the fact that V_{−G} = V*_G is guaranteed by the fact that V(r) is real. Now, within this two-dimensional space we can write any wavefunction as

    |Ψ⟩ = α|k⟩ + β|k′⟩ = α|k⟩ + β|k + G⟩        (14.6)

Using the variational principle to minimize the energy is equivalent to solving the effective Schroedinger equation⁴

    [ ε₀(k)     V*_G     ] [α]       [α]
    [ V_G       ε₀(k+G)  ] [β]  = E  [β]        (14.7)

The secular equation determining E is then

    (ε₀(k) − E)(ε₀(k + G) − E) − |V_G|² = 0        (14.8)

(Note that once this degenerate space is diagonalized, one could go back and treat further, nondegenerate, scattering processes, in perturbation theory.)

⁴ This should look similar to our 2 by 2 Schroedinger equation, Eq. 5.8, above.

Simple Case: k exactly at the zone boundary

The simplest case we can consider is when k is precisely on a zone boundary (and therefore k′ = k + G is also precisely on a zone boundary). In this case ε₀(k) = ε₀(k + G) and our secular equation simplifies to

    (ε₀(k) − E)² = |V_G|²

or equivalently

    E± = ε₀(k) ± |V_G|

Thus we see that a gap opens up at the zone boundary. Whereas both k and k′ had energy ε₀(k) in the absence of the added potential, when the potential is added the two eigenstates form two linear combinations with energies split by ±|V_G|.
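A quick numerical check of this statement (ours; the 2×2 entries below are arbitrary illustrative values, not from the notes):

```python
# Exact eigenvalues of the 2x2 Hamiltonian [[eps_k, conj(V_G)], [V_G, eps_kG]]
# of Eq. 14.7, i.e. the roots of the secular equation Eq. 14.8
import cmath

def two_band(eps_k, eps_kG, V_G):
    """Return the two eigenvalues (lower, upper)."""
    avg = (eps_k + eps_kG) / 2.0
    half_diff = (eps_k - eps_kG) / 2.0
    split = abs(cmath.sqrt(half_diff ** 2 + abs(V_G) ** 2))
    return avg - split, avg + split

# Degenerate case eps0(k) = eps0(k + G): the gap is exactly 2 |V_G|
E_minus, E_plus = two_band(1.0, 1.0, 0.1)
```

With ε₀(k) = ε₀(k+G) = 1 and |V_G| = 0.1 the eigenvalues come out at 0.9 and 1.1, a gap of 2|V_G|.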

In one dimension

In order to understand this better, let us focus on the one-dimensional case. Let us assume we have a potential V(x) = Ṽ cos(2πx/a) with Ṽ > 0. The Brillouin zone boundaries are at k = π/a and k′ = −k = −π/a, so that k − k′ = G = 2π/a and ε₀(k) = ε₀(k′).

Examining Eq. 14.7, we discover that the solutions (when ε₀(k) = ε₀(k′)) are given by α = ±β, thus giving the eigenstates

    |ψ±⟩ = (1/√2)(|k⟩ ± |k′⟩)        (14.9)

corresponding to E± respectively. Since we can write the real-space version of these wavefunctions as⁵

    |k⟩  → e^{ikx}  = e^{ixπ/a}
    |k′⟩ → e^{ik′x} = e^{−ixπ/a}

we discover that the two eigenstates are given by

    ψ₊ ~ e^{ixπ/a} + e^{−ixπ/a} ∝ cos(xπ/a)
    ψ₋ ~ e^{ixπ/a} − e^{−ixπ/a} ∝ sin(xπ/a)

If we then look at the densities |ψ±|² associated with these two wavefunctions (see Fig. 14.2), we see that the higher-energy eigenstate ψ₊ has its density concentrated mainly at the maxima of the potential V, whereas the lower-energy eigenstate ψ₋ has its density concentrated mainly at the minima of the potential. So the general principle is that the periodic potential scatters between the two plane waves k and k + G. If the energies of these two plane waves are the same, the mixing between them is strong, and the two plane waves can combine to form one state with higher energy (concentrated on the potential maxima) and one state with lower energy (concentrated on the potential minima).

⁵ Formally what we mean here is ⟨x|k⟩ = e^{ikx}/√L.

Figure 14.2: Structure of Wavefunctions at the Brillouin Zone Boundary. The higher-energy eigenstate ψ₊ has its density concentrated near the maxima of the potential V, whereas the lower-energy eigenstate has its density concentrated near the minima.

k not quite on a zone boundary (and still in one dimension)

It is not too hard to extend this calculation to the case where k is not quite on a zone boundary. For simplicity, though, we will stick to the one-dimensional situation⁶. We need only solve the secular equation 14.8 for more general k. To do this, we expand around the zone boundaries. Let us consider the states at the zone boundary k = ±nπ/a, which are separated by the reciprocal lattice vectors G = ±2πn/a. As noted above, the gap that opens up precisely at the zone boundary will be ±|V_G|. Now let us consider a plane wave near this zone boundary, k = nπ/a + δ, with δ very small (and n an integer). This wavevector can scatter into k′ = −nπ/a + δ due to the periodic potential. We then have

    ε₀(nπ/a + δ)  = (ℏ²/2m)[(nπ/a)² + 2nπδ/a + δ²]
    ε₀(−nπ/a + δ) = (ℏ²/2m)[(nπ/a)² − 2nπδ/a + δ²]

The secular equation (Eq. 14.8) is then

    [(ℏ²/2m)((nπ/a)² + δ²) − E + (ℏ²/2m)(2nπδ/a)] [(ℏ²/2m)((nπ/a)² + δ²) − E − (ℏ²/2m)(2nπδ/a)] − |V_G|² = 0

which simplifies to

    [(ℏ²/2m)((nπ/a)² + δ²) − E]² = [(ℏ²/2m)(2nπδ/a)]² + |V_G|²

or

    E = (ℏ²/2m)((nπ/a)² + δ²) ± √[ ((ℏ²/2m)(2nπδ/a))² + |V_G|² ]        (14.10)

⁶ If you are very brave and good with geometry, you can try working out the three-dimensional case.

Expanding the square root for small δ we obtain⁷

    E = ℏ²(nπ/a)²/(2m) ± |V_G| + (ℏ²δ²/2m)[1 ± ℏ²(nπ/a)²/(m|V_G|)]        (14.11)

Note that for a small perturbation (which is what we are concerned with), the second term in the square brackets is larger than unity, so that for one of the two solutions the square bracket is negative. Thus we see that near the Brillouin zone boundary, the dispersion is quadratic (in δ) as shown in Fig. 14.3. In Fig. 14.4, we see (using the repeated zone scheme) that small gaps open at the Brillouin zone boundaries in what is otherwise a parabolic spectrum. (This plotting scheme is equivalent to the reduced zone scheme if restricted to a single zone.)
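One can check this expansion numerically. The sketch below (ours; the unit choices ℏ²/2m = 1, a = 1, n = 1 and the value of |V_G| are illustrative assumptions) compares the exact Eq. 14.10 with the quadratic form of Eq. 14.11 for small δ:

```python
# Exact two-band dispersion vs its small-delta expansion near the zone boundary
import math

k0 = math.pi        # n*pi/a with n = 1, a = 1
VG = 0.05           # weak periodic potential, |V_G|

def exact(delta, sign):
    """Eq. 14.10 in units with hbar^2/2m = 1."""
    return k0 ** 2 + delta ** 2 + sign * math.sqrt((2 * k0 * delta) ** 2 + VG ** 2)

def expanded(delta, sign):
    """Eq. 14.11: quadratic-in-delta form, valid very close to the boundary."""
    return k0 ** 2 + sign * VG + delta ** 2 * (1 + sign * 2 * k0 ** 2 / VG)

gap = exact(0.0, +1) - exact(0.0, -1)            # exactly 2 |V_G| at the boundary
err = abs(exact(1e-4, +1) - expanded(1e-4, +1))  # expansion error, tiny for small delta
```

One also sees the sign structure of the square bracket directly: the lower branch curves downward away from the zone boundary, as the text describes.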

Figure 14.3: Dispersion of a Nearly Free Electron Model. In the nearly free electron model, gaps open up at the Brillouin zone boundaries in an otherwise parabolic spectrum. Compare this to what we found for the tight binding model in Fig 10.5.

The general structure we find is thus very much like what we expected from the tight binding model we considered previously in chapter 10 above. As in the tight binding picture there are energy bands where there are energy eigenstates, and there are gaps between bands, where there are no energy eigenstates. As in the tight binding model, the spectrum is periodic in the Brillouin zone (See Fig 14.4). In section 10.2 above we introduced the idea of the effective mass — if a dispersion is parabolic, we can describe the curvature at the bottom of the band in terms of an effective mass. In this model at every Brillouin zone boundary the dispersion is parabolic (indeed, if there is a

⁷ The condition of validity for this expansion is that the first term under the square root is much smaller than the second, meaning that we must have small enough δ; that is, we must be very close to the Brillouin zone boundary. Note that as V_G gets smaller and smaller, the expansion is valid only for k closer and closer to the zone boundary.

Figure 14.4: Dispersion of a Nearly Free Electron Model. Same as Fig. 14.3 above, but plotted in the repeated zone scheme. This is equivalent to the reduced zone scheme, but the equivalent zones are repeated. Forbidden bands, where there are no eigenstates, are marked. The similarity to the free electron parabolic spectrum is emphasized.

gap, hence a local maximum and a local minimum, the dispersion must be parabolic around these extrema). Thus we can write the dispersion Eq. 14.11 as

    E₊(nπ/a + δ) = C₊ + ℏ²δ²/(2m*₊)
    E₋(nπ/a + δ) = C₋ − ℏ²δ²/(2m*₋)

where C₊ and C₋ are constants, and the effective masses are given here by⁸

    m*₊ = +m / [1 + ℏ²(nπ/a)²/(m|V_G|)]
    m*₋ = −m / [1 − ℏ²(nπ/a)²/(m|V_G|)]

We will define effective mass more precisely, and explain its physics in detail in chapter 16 below. For now we just think of this as a convenient way to describe the parabolic dispersion near the Brillouin zone boundary.

Nearly free electrons in two (and higher) dimensions

The principles of the nearly free electron model are quite similar in two and three dimensions. In short, near the Brillouin zone boundary, a gap opens up due to scattering by a reciprocal lattice vector. States of energy slightly higher than the zone boundary intersection point are pushed up

⁸ Note that since V_G is assumed small, 1 − ℏ²(nπ/a)²/(m|V_G|) is negative.

in energy, whereas states of energy slightly lower than the zone boundary intersection point are pushed down in energy. We will return to the detailed geometry of this situation in section 15.2. There is one more key difference between one dimension and higher dimensions. In one dimension, we found that if k is on a zone boundary, then there will be exactly one other k′ such that k − k′ = G is a reciprocal lattice vector and such that ε₀(k′) = ε₀(k) (i.e., Eqs. 14.3 and 14.4 are satisfied). As described above, these two plane wave states mix with each other (see Eq. 14.6) and open up a gap. However, in higher dimensions it may occur that for a given k there are several different k′ which satisfy these equations, i.e., many k′ which differ from k by a reciprocal lattice vector and which all have the same unperturbed energy. In this case, we need to mix together all of the possible plane waves in order to discover the true eigenstates. One example of when this occurs is the two-dimensional square lattice, where the four points (±π/a, ±π/a) all have the same unperturbed energy and are all separated from each other by reciprocal lattice vectors.

14.2 Bloch’s Theorem

In the above, “nearly free electron” approach, we started from the perspective of plane waves that are weakly perturbed by a periodic potential. But in real materials, the scattering from atoms can be very strong so that perturbation theory may not be valid (or may not converge until very high order). How do we know that we can still describe electrons with anything remotely similar to plane waves? In fact, by this time, after our previous experience with waves, we should know the answer in advance: the plane wave momentum is not a conserved quantity, but the crystal momentum is. No matter how strong the periodic potential, so long as it is periodic, crystal momentum is conserved. This important fact was first discovered by Felix Bloch9 in 1928, very shortly after the discovery of the Schroedinger equation, in what has become known as Bloch’s theorem10

Bloch’s Theorem: An electron in a periodic potential has eigenstates of the form

    Ψ^α_k(r) = e^{ik·r} u^α_k(r)

where u^α_k is periodic in the unit cell and k (the crystal momentum) can be chosen within the first Brillouin zone.

In reduced zone scheme there may be many states at each k and these are indexed by α. The periodic function u is usually known as a Bloch function, and Ψ is sometimes known as a modified plane-wave. Because u is periodic, it can be rewritten as a sum over reciprocal lattice vectors

    u^α_k(r) = Σ_G ũ^α_{G,k} e^{iG·r}

This form guarantees¹¹ that u^α_k(r) = u^α_k(r + R) for any lattice vector R. Therefore the full wavefunction is expressed as

    Ψ^α_k(r) = Σ_G ũ^α_{G,k} e^{i(G+k)·r}        (14.12)

Thus an equivalent statement of Bloch's theorem is that we can write each eigenstate as being made up of a sum of plane wave states k which differ by reciprocal lattice vectors G.

Given this equivalent statement of Bloch's theorem, we now understand that the reason for Bloch's theorem is that the scattering matrix elements ⟨k′| V |k⟩ are zero unless k′ and k differ by a reciprocal lattice vector. As a result, the Schroedinger equation is "block diagonal"¹² in the space of k, and in any given wavefunction only plane waves k that differ by some G can be mixed together. One way to see this more clearly is to take the Schroedinger equation

    [p²/(2m) + V(r)] Ψ(r) = E Ψ(r)

and Fourier transform it to obtain

    Σ_G V_G Ψ_{k−G} = [E − ℏ²|k|²/(2m)] Ψ_k

where we have used the fact that V_{k−k′} is only nonzero if k − k′ = G. It is then clear that for each k we have a Schroedinger equation for the set of Ψ_{k−G}'s, and we must obtain solutions of the form of Eq. 14.12.

⁹ Felix Bloch later won a Nobel Prize for inventing Nuclear Magnetic Resonance. NMR was then renamed MRI (Magnetic Resonance Imaging) when people decided the word "Nuclear" sounds too much like it must be related to some sort of bomb.
¹⁰ Bloch's theorem was actually discovered by the mathematician Gaston Floquet in 1883, and rediscovered later by Bloch in the context of solids. This is an example of what is known as Stigler's law of eponymy: "Most things are not named after the person who first discovers them." In fact, Stigler's law was discovered by Merton.
¹¹ In fact, the function u is periodic in the unit cell if and only if it can be written as a sum over reciprocal lattice vectors in this way.
¹² No pun intended.
Although by this time it may not be surprising that electrons in a periodic potential have eigenstates labeled by crystal momenta, we should not overlook how important Bloch's theorem is. This theorem tells us that even though the potential that the electron feels from each atom is extremely strong, the electrons still behave almost as if they do not see the atoms at all! They still almost form plane wave eigenstates, with the only modification being the periodic Bloch function u and the fact that momentum is now crystal momentum. A quote from Felix Bloch: "When I started to think about it, I felt that the main problem was to explain how the electrons could sneak by all the ions in a metal. By straight Fourier analysis I found to my delight that the wave differed from the plane wave of free electrons only by a periodic modulation."
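The block-diagonal structure of the Schroedinger equation can be checked numerically. The sketch below is my own illustration (not from the original notes): it builds the Fourier-transformed Schroedinger equation as a matrix for a single k in one dimension, in units ℏ = m = a = 1 and with a toy potential carrying a single Fourier component V₁, so that only the plane waves k − G mix. At the zone boundary k = π the two lowest levels split by very nearly 2V₁, the nearly free electron gap.

```python
import numpy as np

# Toy 1D "central equation" in units hbar = m = a = 1 (my choice).
# Assumed potential V(x) = 2*V1*cos(2*pi*x), i.e. a single Fourier
# component V_G = V1 at G = +-2*pi, so only plane waves k - G mix.

def band_energies(k, V1=0.1, nG=7):
    """Eigen-energies at crystal momentum k from the truncated matrix."""
    n = np.arange(-nG, nG + 1)
    G = 2 * np.pi * n
    H = np.diag((k - G) ** 2 / 2.0)       # kinetic energy, diagonal in k - G
    for i in range(len(G) - 1):           # V couples states differing by one G
        H[i, i + 1] = H[i + 1, i] = V1
    return np.linalg.eigvalsh(H)          # eigenvalues in ascending order

E = band_energies(k=np.pi)                # zone boundary k = pi/a
gap = E[1] - E[0]
print(gap)                                # close to 2*V1 = 0.2 for weak potential
```

Sweeping k across (−π, π] and plotting the first few eigenvalues traces out the bands in reduced zone scheme; away from the zone boundary the lowest level stays close to the free-electron k²/2.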

14.3 Summary of Electrons in a Periodic Potential

• When electrons are exposed to a periodic potential, gaps arise in their dispersion relation at the Brillouin zone boundary. (The dispersion is quadratic approaching a zone boundary.)
• Thus the electronic spectrum breaks into bands, with forbidden energy gaps between the bands. In the nearly free electron model, the gaps are proportional to the periodic potential |V_G|.
• Bloch's theorem guarantees us that all eigenstates are some periodic function times a plane wave. In reduced zone scheme the wavevector (the crystal momentum) can always be taken in the first Brillouin zone.

References

• Goodstein, section 3.6a
• Burns, sections 10.1–10.6
• Kittel, chapter 7 (skip the Kronig-Penney model)
• Hook and Hall, section 4.1
• Ashcroft and Mermin, chapters 8–9 (not my favorite)
• Ibach and Luth, sections 7.1–7.2
• Singleton, chapters 2–3

Chapter 15

Insulator, Semiconductor, or Metal

In chapter 10, when we discussed the tight-binding model in one dimension, we introduced some of the basic ideas of band structure. In chapter 14 we found that an electron in a periodic potential shows exactly the same type of band structure as we found for the tight-binding model: in both cases, we found that the spectrum is periodic in momentum (so all momenta can be taken to be in the first Brillouin zone, in reduced zone scheme) and we found that gaps open at Brillouin zone boundaries. These principles, the idea of bands and band structure, form the fundamental underpinning of our understanding of electrons in solids. In this chapter (and the next) we explore these ideas in further depth.

15.1 Energy Bands in One Dimension: Mostly Review

As we pointed out in chapter 12, the number of k-states in a single Brillouin zone is equal to the number of unit cells in the entire system. Thus, if each atom has exactly one electron (i.e., is valence 1) there would be exactly enough electrons to fill the band if there were only one spin state of the electron. Since there are two spin states of the electron, when each atom has only one valence electron the band is precisely half full. This is shown in the left of Fig. 15.1. Here, there is a Fermi surface where the unfilled states meet the filled states. (In the figure, the Fermi energy is shown as a green dashed line.) When a band is partially filled, the electrons can repopulate when a small electric field is applied, allowing current to flow as shown in the right of Fig. 15.1. Thus, the partially filled band is a metal. On the other hand, if there are two electrons per atom, then we have precisely enough electrons to fill one band. One possibility is shown on the left of Fig. 15.2 — the entire lower band is filled and the upper band is empty, and there is a band gap between the two bands (note that the chemical potential is between the bands). When this is the situation, the lower (filled) band is known as the valence band and the upper (empty) band is known as the conduction band. In this situation the minimum energy excitation is created by moving an electron from the valence to the conduction band, which costs a nonzero energy. Because of this, at zero temperature, a sufficiently small electric perturbation will not create any excitations — the system does not respond at all to electric field. Thus, systems of this type are known as (electrical) insulators (or more specifically



Figure 15.1: Band Diagrams of a One Dimensional Monovalent Chain with Two Orbitals per Unit Cell. Left: A band diagram with two bands shown where each atom has one electron so that the lowest band is exactly half filled, and is therefore a metal. The filled states are colored red, the chemical potential is the green line. Right: When electric field is applied, electrons accelerate, filling some of the k states to the right and emptying k-states to the left (in one dimension this can be thought of as having a different chemical potential on the left versus the right). Since there are an unequal number of left-moving versus right-moving electrons, the situation on the right represents net current flow.


Figure 15.2: Band Diagrams of a One Dimensional Divalent Chain with Two Orbitals per Unit Cell. When there are two electrons per atom, then there are exactly enough electrons to fill the lowest band. In both pictures the chemical potential is drawn in green. Left: One possibility is that the lowest band (the valence band) is completely filled and there is a gap to the next band (the conduction band), in which case we get an insulator. This is a direct band gap, as the valence band maximum and the conduction band minimum are both at the same crystal momentum (the zone boundary). Right: Another possibility is that the band energies overlap, in which case there are two bands, each of which is partially filled, giving a metal. If the bands were separated by more (imagine just increasing the vertical spacing between bands) we would have an insulator again, this time with an indirect band gap, since the valence band maximum is at the zone boundary while the conduction band minimum is at the zone center.

band insulators). If the band gap is below about 4 eV, then these types of insulators are called semiconductors, since at finite temperature electrons can be thermally excited into the conduction band, and these electrons then can move around freely, carrying some amount of current. One might want to be aware that in the language of chemists, a band insulator is a situation where all of the electrons are tied up in bonds. For example, in diamond, carbon has valence four — meaning there are four electrons per atom in the outermost shell. In the diamond lattice, each carbon atom is covalently bonded to each of its four nearest neighbors, and each covalent bond requires two electrons. One electron is donated to each bond from each of the two atoms on either end of the bond — this completely accounts for all four of the electrons in each atom. Thus all of the electrons are tied up in bonds.
This turns out to be equivalent to the statement that certain bonding bands are completely filled, and there is no mobility of electrons in any partially filled bands (See the left of Fig. 16.3). When there are two electrons per atom, one frequently obtains a band insulator as shown in the left of Fig. 15.2. However another possibility is that the band energies overlap, as shown in the right of Fig. 15.2. In this case, although one has precisely the right number of electrons to fill a single band, instead one has two partially filled bands. As in Fig. 15.1 there are low energy excitations available, and the system is metallic.
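The counting that underlies this discussion can be summarized in a few lines of bookkeeping (a toy sketch of my own; `bands_filled` is a made-up helper, not from the notes): each band holds two electrons per unit cell because of spin, so an even number of valence electrons per cell is a necessary, but, as the figures show, not a sufficient, condition for a band insulator.

```python
# Band-filling bookkeeping: each band holds 2 electrons per unit cell (spin).

def bands_filled(valence, atoms_per_cell=1):
    """Return (completely filled bands, leftover electrons in the next band)."""
    electrons_per_cell = valence * atoms_per_cell
    return divmod(electrons_per_cell, 2)

print(bands_filled(1))                    # (0, 1): half-filled band -> metal
print(bands_filled(2))                    # (1, 0): full band -> insulator, IF bands don't overlap
print(bands_filled(3, atoms_per_cell=2))  # (3, 0): even total per cell -> insulator possible
```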

Figure 15.3: Fermi Sea of a Square Lattice of Monovalent Atoms in Two Dimensions. Left: In the absence of a periodic potential, the Fermi sea forms a circle whose area is precisely half that of the Brillouin zone (the black square). Right: when a periodic potential is added, states closer to the zone boundary are pushed down in energy deforming the Fermi sea. Note that the area of the Fermi sea remains fixed.

15.2 Energy Bands in Two (or More) Dimensions

It is useful to try to understand how the nearly free electron model results in band structure in two dimensions. Let us consider a square lattice of monovalent atoms. The Brillouin zone is correspondingly square, and since there is one electron per atom, there should be enough electrons to half fill a single Brillouin zone. In the absence of a periodic potential, the Fermi sea forms a circle, as shown in the left of Fig. 15.3. The area of this circle is precisely half the area of the zone. Now when a periodic potential is added, gaps open up at the zone boundaries. This means that states close to the zone boundary get moved down in energy — and the closer they are to the boundary, the more they get moved down. As a result, states close to the boundary get filled up preferentially at the expense of states further from the boundary. This deforms the Fermi surface^1 roughly as shown in the right of Fig. 15.3. In either case, there are low energy excitations possible and therefore the system is a metal.
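As a quick numerical check of this geometry (my own sketch, with the lattice constant set to a = 1): a half-filled zone means the free-electron Fermi circle has half the zone's area, which puts k_F comfortably inside the zone boundary at π/a.

```python
import math

# Monovalent square lattice; lattice constant a = 1 is my assumed unit.
a = 1.0
bz_area = (2 * math.pi / a) ** 2          # area of the square Brillouin zone

# One electron per cell and two spin states: the Fermi sea fills HALF the zone.
kF = math.sqrt(bz_area / (2 * math.pi))   # from pi * kF^2 = bz_area / 2
print(kF)                                 # ~2.507/a
print(kF < math.pi / a)                   # True: the circle fits inside the zone
```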

Figure 15.4: Fermi Surfaces that Touch Brillouin Zone Boundaries. Left: Fermi Sea of a square lattice of monovalent atoms in two dimensions with strong periodic potential. The Fermi surface touches the Brillouin zone boundary. Right: The Fermi surface of copper, which is monovalent (the lattice structure is fcc, which determines the shape of the Brillouin zone, see Fig. 12.6).

If the periodic potential is strong enough the Fermi surface may even touch2 the Brillouin zone boundary as shown in the left of Fig. 15.4. This is not uncommon in real materials. On the right of Fig. 15.4 the Fermi surface of copper is shown, which similarly touches the zone boundary.

1 Recall that the Fermi surface is the locus of points at the Fermi energy (so all states at the Fermi surface have the same energy), separating the filled from unfilled states. Keep in mind that the area inside the Fermi surface is fixed by the total number of electrons in the system.
2 Note that whenever a Fermi surface touches the Brillouin zone boundary, it must do so perpendicularly. This is due to the fact that the group velocity is zero at the zone boundary — i.e., the energy is quadratic as one approaches normal to the zone boundary. Since the energy is essentially not changing in the direction perpendicular to the zone boundary, the Fermi surface must intersect the zone boundary normally.

Figure 15.5: Fermi Sea of a Square Lattice of Divalent Atoms in Two Dimensions. Left: In the absence of a periodic potential, the Fermi sea forms a circle whose area is precisely that of the Brillouin zone (the black square). Right: when a sufficiently strong periodic potential is added, states inside the zone boundary are pushed down in energy so that all of these states are filled and no states outside of the first Brillouin zone are filled. Since there is a gap at the zone boundary, this situation is an insulator. (Note that the area of the Fermi sea remains fixed).

Let us now consider the case of a two-dimensional square lattice of divalent atoms. In this case the number of electrons is precisely enough to fill a single zone. In the absence of a periodic potential, the Fermi surface is still circular, although it now crosses into the second Brillouin zone, as shown in the left of Fig. 15.5. Again, when a periodic potential is added a gap opens at the zone boundary — this gap opening pushes down the energy of all states within the first zone and pushes up the energy of all states in the second zone. If the periodic potential is sufficiently strong^3, then the states in the first zone are all lower in energy than the states in the second zone. As a result, the Fermi sea will look like the right of Fig. 15.5. I.e., the entire lower band is filled, and the upper band is empty. Since there is a gap at the zone boundary, there are no low energy excitations possible, and this system is an insulator. It is worth considering what happens for intermediate strength of the periodic potential. Again, states outside the first Brillouin zone are raised in energy and states inside the first Brillouin zone are lowered in energy. Therefore fewer states will be occupied in the second zone and more states occupied in the first zone. However, for intermediate strength of potential, there will remain some states occupied in the second zone and some states empty within the first zone. This is precisely analogous to what happens in the right half of Fig. 15.2. Analogously, there will

3 We can estimate how strong the potential needs to be. We need the highest-energy state in the first Brillouin zone to be lower in energy than the lowest-energy state in the second zone. The highest-energy state in the first zone, in the absence of periodic potential, is in the zone corner and therefore has energy ε_corner = 2ℏ²(π/a)²/(2m). The lowest-energy state in the second zone is at the middle of the zone boundary edge and in the absence of periodic potential has energy ε_edge = ℏ²(π/a)²/(2m). Thus we need to open up a gap at the zone boundary which is sufficiently large that the edge state becomes higher in energy than the corner state. This requires roughly that 2|V_G| = ε_corner − ε_edge.

Figure 15.6: Fermi Sea of a Square Lattice of Divalent Atoms in Two Dimensions. Left: For intermediately strong periodic potential, there are still some states filled in the second zone, and some states empty in the first zone, thus the system is still a metal. Right: The states in the second zone can be moved into the first zone by translation by a reciprocal lattice vector. This is the reduced zone scheme representation of the occupancy of the second Brillouin zone.

still be some low energy excitations available, and the system remains a metal. We emphasize that in the case where there are many atoms per unit cell, we should count the total valence of all of the atoms in the unit cell put together to determine if it is possible to obtain a filled-band insulator. If the total valence of all the atoms in the unit cell is even, then for strong enough periodic potential, it is possible that some set of low energy bands will be completely filled, there will be a gap, and the remaining bands will be empty — i.e., it will be a band insulator.
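The numbers behind the divalent case, and the estimate in footnote 3, are easy to verify directly. In units ℏ = m = a = 1 (my choice, for illustration): the free Fermi circle for two electrons per cell has the full zone area, so k_F = 2√π > π and the circle must spill into the second zone; and emptying the second zone requires a gap comparable to the kinetic-energy scale itself.

```python
import math

# Divalent square lattice in units hbar = m = a = 1 (assumed).
a = 1.0
bz_area = (2 * math.pi / a) ** 2

# Two electrons per cell: the free Fermi circle has the FULL zone area,
# so it cannot fit inside the square zone and spills into the second zone.
kF = math.sqrt(bz_area / math.pi)         # = 2*sqrt(pi)/a ~ 3.54/a > pi/a
print(kF > math.pi / a)                   # True

# Footnote-3 estimate: free-electron energies k^2/2 at the zone corner
# (pi/a, pi/a) and at the midpoint of the zone edge (pi/a, 0).
eps_corner = 2 * (math.pi / a) ** 2 / 2
eps_edge = (math.pi / a) ** 2 / 2
V_needed = (eps_corner - eps_edge) / 2    # from 2|V_G| ~ eps_corner - eps_edge
print(V_needed)                           # ~2.47: comparable to the kinetic energy scale
```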

15.3 Tight Binding

So far in this chapter we have described band structure in terms of the nearly free electron model. Similar results can be obtained starting from the opposite limit — the tight binding model introduced in chapter 10. In this model we imagine some number of orbitals on each atom (or in each unit cell) and allow electrons to hop only weakly between these orbitals. This spreads the eigen-energies of the atomic orbitals out into bands. Writing down a two (or three) dimensional generalization of the tight binding Hamiltonian Eq. 10.4 is quite straightforward and is a good exercise to try. One only needs to allow each orbital to hop to neighbors in all available directions. The eigenvalue problem can then always be solved with a plane wave ansatz analogous to Eq. 10.5. The solution (again a good exercise to try!) of a tight binding model of atoms, each having a single atomic orbital, on a square lattice is given by


Figure 15.7: Equi-Energy Contours for the Dispersion of a Tight Binding Model on a Square Lattice. This is a contour plot of Eq. 15.1. The first Brillouin Zone is shown. Note that the contours intersect the Brillouin zone boundary normally.

(Compare Eq. 10.6)

E(k) = ε₀ − 2t cos(k_x a) − 2t cos(k_y a)        (15.1)

Equi-energy contours for this expression are shown in Fig. 15.7. Note the similarity of this dispersion to our qualitative expectations shown in Fig. 15.3 (right), Fig. 15.4 (left), and Fig. 15.6, which were based on a nearly free electron picture. In the above described tight binding picture, there is only a single band. However, one can make the situation more realistic by starting with several atomic orbitals per unit cell, to obtain several bands (another good exercise to try!). As mentioned above in section 5.3.2 and chapter 10, as more and more orbitals are added to a tight binding (or LCAO) calculation, the results become increasingly accurate. In the case where a unit cell is divalent, as mentioned above, it is crucial to determine whether bands overlap. (I.e., is it insulating like the left of Fig. 15.2 or metallic like the right of Fig. 15.2?) This, of course, requires detailed knowledge of the band structure. In the tight binding picture, if the atomic orbitals start sufficiently far apart in energy, then small hopping between atoms cannot spread the bands enough to make them overlap (see Fig. 10.4). In the nearly free electron picture, the gap between bands formed at the Brillouin zone boundary is proportional to |V_G|, and it is the limit of strong periodic potential that will guarantee that the bands do not overlap (see Fig. 15.5). Qualitatively these two are the same limit — very far from the idea of a freely propagating wave!
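A few properties of the dispersion Eq. 15.1 can be confirmed numerically. A minimal sketch, with ε₀ = 0 and t = a = 1 being my own illustrative parameter choices: the full bandwidth is 8t, and the group velocity normal to the zone boundary vanishes there, which is why the contours in Fig. 15.7 meet the boundary at right angles.

```python
import numpy as np

# Square-lattice tight-binding band of Eq. 15.1; e0, t, a are assumed values.
e0, t, a = 0.0, 1.0, 1.0

def E(kx, ky):
    return e0 - 2 * t * np.cos(kx * a) - 2 * t * np.cos(ky * a)

# Bandwidth: minimum at the zone center, maximum at the zone corner.
print(E(np.pi / a, np.pi / a) - E(0.0, 0.0))   # 8t = 8.0

# The normal component of the group velocity vanishes at the zone boundary,
# so equi-energy contours intersect the boundary perpendicularly (Fig. 15.7).
dk = 1e-6
vx = (E(np.pi / a + dk, 1.0) - E(np.pi / a - dk, 1.0)) / (2 * dk)
print(abs(vx) < 1e-5)                          # True
```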

15.4 Failures of the Band-Structure Picture of Metals and Insulators

The picture we have developed is that the band structure, and the filling of bands, determines whether a material is a metal or an insulator (or a semiconductor, meaning an insulator with a small band gap). One thing we might conclude at this point is that any system where the unit cell has a single valence electron (so the first Brillouin zone is half-full) must be a metal. However, it turns out that this is not always true! The problem is that we have left out a very important effect — the Coulomb interaction between electrons. We have so far completely ignored the Coulomb repulsion between electrons. Is this neglect justified at all? If we try to estimate how strong the Coulomb interaction is between electrons (roughly e²/(4πε₀r), where r is the typical distance between two electrons — i.e., the lattice constant a), we find numbers on the order of several eV. This can be as large as, or even far larger than, the Fermi energy (which is already a very large number, on the order of 10,000 K). Given this, it is hard to explain why it is at all justified to have thrown out such an important contribution. In fact, one might expect that neglecting this term would give complete nonsense! Fortunately, it turns out that in many cases it is OK to assume noninteracting electrons. The reason this works is actually quite subtle and was not understood until the 1950s, due to the work of Lev Landau (see footnote 12 in chapter 4 about Landau). This (rather deep) explanation, however, is beyond the scope of this course, so we will not discuss it. Nonetheless, with this in mind it is perhaps not too surprising that there are cases where the noninteracting electron picture, and hence our view of band structure, fails.

Magnets

A case where the band picture of electrons fails is when the system is ferromagnetic^4. We will discuss ferromagnetism in detail in chapters 19–22 below, but in short this is where, due to interaction effects, the electron spins spontaneously align. From a kinetic energy point of view this seems unfavorable, since filling the lower energy states with two spins can lower the Fermi energy. However, it turns out that aligning all of the spins can lower the Coulomb energy between the electrons, and thus our rules of non-interacting electron band theory no longer hold.

Mott Insulators

Another case where interaction physics is important is the so-called Mott insulator5. Consider a monovalent material. From band theory one might expect a half-filled lowest band, therefore a metal. But if one considers the limit where the electron-electron interaction is extremely strong, this is not what you get. Instead, since the electron-electron interaction is very strong, there is a huge penalty for two electrons to be on the same atom (even with opposite spins). As a result, the ground state is just one electron sitting on each atom. Since each atom has exactly one electron, no electron can move from its atom — since that would result in a double occupancy of the atom it lands on. As a result, this type of ground state is insulating. In some sense this type of insulator — which can be thought of as more-or-less a traffic jam of electrons — is actually simpler to visualize than a band insulator! We will also discuss Mott insulators further in sections 18.4 and particularly 22.2 below.

4 Or antiferromagnetic or ferrimagnetic, for that matter. See chapter 19 below for definitions of these terms.
5 Named after the English Nobel Laureate, Nevill Mott. Classic examples of Mott insulators include NiO and CoO.

15.5 Band Structure and Optical Properties

To the extent that electronic band structure is a good description of the properties of materials (and usually it is), one can attribute many of the optical properties of materials to this band structure. First one needs to know a few simple facts about light, shown here in this table:

Color          ℏω
Infrared       < 1.65 eV
Red            ~1.8 eV
Orange         ~2.05 eV
Yellow         ~2.15 eV
Green          ~2.3 eV
Blue           ~2.7 eV
Violet         ~3.1 eV
Ultraviolet    > 3.2 eV

(A spectrum bar accompanying the table marks the boundaries between infrared, R, O, Y, G, B, V, and ultraviolet at 1.65, 2.01, 2.11, 2.17, 2.50, 2.75, and 3.27 eV.)

15.5.1 Optical Properties of Insulators and Semiconductors

With this table in mind we see that if an insulator (or wide-bandgap semiconductor) has a band gap greater than 3.2 eV, then it appears transparent. The reason for this is that a single photon of visible light cannot excite an electron from the valence band into the conduction band. Since the valence band is completely filled, the minimum energy excitation is the band gap energy — so the photon creates no excitations at all. As a result, the visible optical photons do not scatter from this material at all and they simply pass right through the material^6. Materials such as quartz, diamond, aluminum oxide, and so forth are insulators of this type. Semiconductors with somewhat smaller band gaps will absorb photons with energies above the band gap (exciting electrons from the valence to the conduction band), but will be transparent to photons below this band gap. For example, cadmium sulfide (CdS) is a semiconductor with a band gap of roughly 2.6 eV, so that violet and blue light are absorbed but red and green light are transmitted. As a result this material looks reddish. (See Fig. 15.8.)
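The arithmetic connecting band gaps to color uses only E = hc/λ ≈ 1240 eV·nm / λ. A sketch checking the CdS example against the table above (the representative wavelengths are my own rough choices, not from the notes):

```python
# Photon energy from vacuum wavelength: E[eV] ~ 1240 / lambda[nm], hc ~ 1240 eV*nm.
def photon_energy_eV(wavelength_nm):
    return 1240.0 / wavelength_nm

gap = 2.6   # rough CdS band gap quoted in the text, in eV

# Representative wavelengths in nm (assumed illustrative values).
for color, lam in [("red", 650), ("green", 530), ("blue", 450), ("violet", 410)]:
    verdict = "absorbed" if photon_energy_eV(lam) > gap else "transmitted"
    print(color, round(photon_energy_eV(lam), 2), "eV:", verdict)
```

Red and green come out below the gap (transmitted) while blue and violet are above it (absorbed), matching the description of why CdS looks reddish.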

15.5.2 Direct and Indirect Transitions

While the band gap determines the minimum energy excitation that can be made in an insulator (or semiconductor), this is not the complete story in determining whether or not a photon can be absorbed by a material. It turns out to matter quite a bit at which values of k the maximum of the valence band and the minimum of the conduction band lie. If the value of k for the valence band maximum is the same as the value of k for the conduction band minimum, then we say that it is a direct band gap. If the values of k differ, then we say that it is an indirect band gap. For example, the system shown on the left of Fig. 15.2 is a direct band gap, where both the valence band maximum and the conduction band minimum are at the zone boundary. In comparison, if the band shapes were as in the right of Fig. 15.2, but the band gap were large enough that it would be an insulator (just imagine the bands separated by more), this would be an indirect band gap, since the valence band maximum is at the zone boundary but the conduction band minimum is at k = 0.

6 Very weak scattering processes can occur where, say, two photons together can excite an electron, or a photon excites a phonon.

Figure 15.8: Orange crystals of CdS. This particular crystal is the naturally occurring mineral called “Greenockite” which is CdS with trace amounts of impurity which can change its color somewhat.

One can also have both indirect and direct band gaps in the same material, as shown in Fig. 15.9. In this figure, the minimum energy excitation is the indirect transition — meaning an excitation of an electron across an indirect band gap, or equivalently a transition of nonzero crystal momentum^7, where the electron is excited from the top of the valence band to the bottom of the lower conduction band at a very different k. While this may be the lowest energy excitation that can occur, it is very hard for this type of excitation to result from exposure of the system to light — the reason for this is energy-momentum conservation. If a photon is absorbed, the system absorbs both the energy and the momentum of the photon. But given an energy E in the eV range, the momentum of the photon ℏ|k| = E/c is extremely small, because c is so large. Thus the system cannot conserve momentum while exciting an electron across an indirect band gap. Nonetheless, typically if a system like this is exposed to photons with energy greater than the indirect band gap, a small number of electrons will manage to get excited — usually by some complicated process including absorption of a photon exciting an electron with simultaneous emission of a phonon^8 to arrange the conservation of energy and momentum. In comparison, if a system has a direct band gap and is exposed to photons of energy matching this direct band gap, then it strongly absorbs these photons while exciting electrons from the valence band to the conduction band.
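The momentum mismatch is easy to quantify. A sketch (the 0.5 nm lattice constant is an assumed typical value, not from the notes): a visible photon's wavevector is about a thousandth of the Brillouin zone scale π/a, so only nearly vertical (direct) transitions can conserve momentum with a photon alone.

```python
import math

# Compare a visible photon's wavevector to the Brillouin zone scale pi/a.
hbar_c = 197.3        # hbar*c in eV*nm
E_photon = 2.0        # a visible photon, in eV
a = 0.5               # assumed typical lattice constant, in nm

k_photon = E_photon / hbar_c     # |k| = E / (hbar*c), in nm^-1
k_zone = math.pi / a             # zone-boundary wavevector, in nm^-1
print(k_photon / k_zone)         # ~1.6e-3: photons carry negligible crystal momentum
```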

15.5.3 Optical Properties of Metals

The optical properties of metals, however, are a bit more complicated. Since these materials are very conductive, photons (which are electromagnetic) excite the electrons9, which then re-emit light. This re-emission (or reflection) of light is why metals look shiny. Noble metals (gold, silver,

7 By “nonzero” we mean, substantially nonzero – like a fraction of the Brillouin zone.
8 Another way to satisfy the conservation of momentum is via a “disorder assisted” process. You recall that the reason we conserve crystal momentum is because the system is perfectly periodic. If the system has some disorder, and is therefore not perfectly periodic, then crystal momentum is not perfectly conserved. Thus the greater the disorder level, the less crystal momentum needs to be conserved and the easier it is to make a transition across an indirect band gap.
9 Note the contrast with insulators — when an electron is excited above the band gap, since the conductivity is somewhat low, the electron does not re-emit quickly, and the material mostly just absorbs the given wavelength.


Figure 15.9: Direct and Indirect Transitions. While the indirect transition is lower energy, it is hard for a photon to excite an electron across an indirect band gap because photons carry very little momentum (since the speed of light, c, is large).

platinum) look particularly shiny because their surfaces do not form insulating oxides when exposed to air, which many metals (such as sodium) do within seconds. Even amongst metals (ignoring possible oxide surfaces), colors vary. For example, silver looks brighter than gold and copper, which look yellow or orange-ish. This again is a result of the band structure of these materials. These metals all have valence one, meaning that a band should be half-filled. However, the total energy width of the conduction band is greater for silver than it is for gold or copper (in tight-binding language, t is larger for silver; see chapter 10). This means that higher energy electronic transitions within the band are much more possible for silver than they are for gold and copper. For copper and gold, photons with blue and violet colors are not well absorbed and re-emitted, leaving these materials looking a bit more yellow and orange. For silver, on the other hand, all visible colors are re-emitted well, resulting in a more perfect (or “white”) mirror. While this discussion of the optical properties of metals is highly oversimplified^{10}, it captures the correct essence — that the details of the band structure determine which color photons are easily absorbed and/or reflected, and this in turn determines the apparent color of the material.

15.5.4 Optical Effects of Impurities

It turns out that small levels of impurities put into periodic crystals (particularly into semiconductors and insulators) can have dramatic effects on many of their optical (as well as electrical!) properties. For example, one nitrogen impurity per million carbon atoms in a diamond crystal gives the crystal a yellow-ish color. One boron atom per million carbon atoms gives the diamond a blue-ish color^{11}. We will discuss the physics that causes this in section 16.2.1 below.

10 Really there are many bands overlapping in these materials, and the full story addresses inter- and intra-band transitions.
11 Natural blue diamonds are extremely highly prized and are very expensive. Possibly the world’s most famous diamond, the Hope Diamond, is of this type (it is also supposed to be cursed, but that is another story). With modern crystal growth techniques, in fact it is possible to produce man-made diamonds of “quality” better than those that are mined. Impurities can be placed in as desired to give the diamond any color you like. Due to the

15.6 Summary of Insulators, Semiconductors, and Metals

• A material is a metal if it has low energy excitations. This happens when at least one band is partially full.
• (Band) Insulators and semiconductors have only filled bands and empty bands and have a gap for excitations. A semiconductor is a (band) insulator with a small band gap.
• The valence of a material determines the number of carriers being put into the band — and hence can determine if one has a metal or insulator/semiconductor. However, if bands overlap (and frequently they do) one might not be able to fill the bands to a point where there is a gap.
• The gap between bands is determined by the strength of the periodic potential. If the periodic potential is strong enough (the atomic limit in tight binding language), bands will not overlap.

• The band picture of materials fails to account for electron-electron interaction. It cannot describe (at least without modification) interaction-driven physics such as magnetism and Mott insulators.
• Optical properties of solids depend crucially on the possible energies of electronic transitions. Photons easily create transitions with low momentum, but cannot easily create transitions with larger momentum. Optical excitations over an indirect (finite momentum) gap are therefore weak.

References

• Goodstein, section 3.6c
• Kittel, chapter 7; first section of chapter 8; first section of chapter 9
• Burns, sections 10.7 and 10.10
• Hook and Hall, sections 4.2–4.3 and 5.4
• Rosenberg, sections 8.9–8.19

powerful lobby of the diamond industry, all synthetic diamonds are labeled as such — so although you might feel cheap wearing a synthetic, in fact, you probably own a better product than those that have come out of the earth! (Also you can rest with a clean conscience that the production of this diamond did not finance any wars in Africa.)

Chapter 16

Semiconductor Physics

16.1 Electrons and Holes

Suppose we start with an insulator or semiconductor and we excite one electron from the valence band to the conduction band, as shown in the left of Fig. 16.1. This excitation may be due to absorbing a photon, or it might be a thermal excitation. (For simplicity, in the figure we have shown a direct band gap. For generality we have not assumed that the curvatures of the two bands are the same.) When the electron has been moved up to the conduction band, there is an absence of an electron in the valence band, known as a hole. Since a completely filled band is inert, it is very convenient to keep track of only the few holes in the valence band (assuming there are only a few) and to treat these holes as individual elementary particles. The electron can fall back into the empty state that is the hole, emitting energy (a photon, say) and “annihilating” both the electron from the conduction band and the hole from the valence band^1. Note that while the electrical charge of an electron is negative, the electrical charge of a hole (the absence of an electron) is positive — equal and opposite to that of the electron.^2

Effective Mass of Electrons

As mentioned in sections 10.2 and 14.1.1, it is useful to describe the curvature at the bottom of a band in terms of an effective mass. Let us assume that near the bottom of the conduction band

1This is equivalent to pair annihilation of an electron with a positron. In fact, the analogy between electron-hole and electron-positron is fairly precise. As soon as Dirac constructed his equation (in 1928), describing the relativistic motion of electrons and predicting the positron, it was understood that the positron could be thought of as an absence of an electron in a filled sea of states. The filled sea of electron states with a gap to exciting electron-positron pairs is the inert vacuum, which is analogous to an inert filled valence band.
2If this does not make intuitive sense, consider the process of creating an electron-hole pair as described in Fig. 16.1. Initially (without the excited electron-hole pair) the system is charge neutral. We excite the system with a photon to create the pair, and we have not moved any additional net charge into the system. Thus if the electron is negative, the hole must be positive to preserve overall charge neutrality.


Figure 16.1: Electrons and Holes in a Semiconductor. Left: A single hole in the valence band and a single electron in the conduction band. Right: Moving the hole to a momentum away from the top of the valence band costs positive energy — like pushing a balloon under water. As such, the effective mass of the hole is defined to be positive. The energy of the configuration on the right is greater than that on the left by $E = \hbar^2 |k - k_{max}|^2/(2m^*)$.

(assumed to be at $k = k_{min}$)3,4,5 the energy is given by

$$E = E_{min} + \alpha |k - k_{min}|^2 + \ldots$$

where the dots mean higher-order terms in the deviation from $k_{min}$. We then define the effective mass to be given by
$$\frac{\hbar^2}{m^*} = \frac{\partial^2 E}{\partial k^2} = 2\alpha \qquad (16.1)$$
at the bottom of the band (with the derivative being taken in any direction for an isotropic system). Correspondingly, the (group) velocity is given by

$$v = \nabla_k E/\hbar = \hbar(k - k_{min})/m^* \qquad (16.2)$$

3It is an important principle that near a minimum or a maximum one can always expand and get something quadratic plus higher-order corrections.
4For simplicity we have assumed the system to be isotropic. In the more general case we would have
$$E = E_{min} + \alpha_x (k_x - k_x^{min})^2 + \alpha_y (k_y - k_y^{min})^2 + \alpha_z (k_z - k_z^{min})^2 + \ldots$$
for some orthogonal set of axes (the “principal axes”) x, y, z. In this case we would have an effective mass which can be different in the three different principal directions.
5For simplicity we also neglect the spin of the electron here. In general, spin-orbit coupling can make the dispersion depend on the spin state of the electron. Among other things, this can modify the effective electron g-factor.

This definition is chosen to be in analogy with the free electron behavior $E = \hbar^2|k|^2/(2m)$, with corresponding velocity $v = \nabla_k E/\hbar = \hbar k/m$.
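As a quick numerical sketch of the curvature definition above, one can build a quadratic dispersion with a known mass and recover that mass from a finite-difference second derivative. All parameter values here are purely illustrative, not tied to any particular material:

```python
import numpy as np

hbar = 1.0546e-34          # J s
m_e_bare = 9.109e-31       # kg
m_star = 0.3 * m_e_bare    # illustrative effective mass, 0.3 of the bare mass

# Quadratic dispersion near a band bottom: E(k) = E_min + hbar^2 (k - k_min)^2 / (2 m*)
E_min, k_min = 0.0, 1.0e9  # illustrative band-minimum energy (J) and position (1/m)

def E(k):
    return E_min + hbar**2 * (k - k_min)**2 / (2 * m_star)

# Eq. 16.1: hbar^2 / m* = d^2E/dk^2, estimated by a central finite difference
dk = 1.0e6
curvature = (E(k_min + dk) - 2 * E(k_min) + E(k_min - dk)) / dk**2
m_recovered = hbar**2 / curvature

print(m_recovered / m_e_bare)   # ≈ 0.3, the mass we put in
```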

Effective Mass of Holes

Analogously we can define an effective mass for holes. Here things get a bit more complicated6. For the top of the valence band, the energy dispersion for electrons would be

$$E = E_{max} - \alpha |k - k_{max}|^2 + \ldots$$
The modern convention is to define the effective mass for holes at the top of a valence band to be always positive7
$$\frac{\hbar^2}{m^*_{hole}} = \left| \frac{\partial^2 E}{\partial k^2} \right| = 2\alpha \qquad (16.3)$$

The convention of the effective mass being positive makes sense because the energy to boost the hole from zero velocity (k = kmax at the top of the valence band) to finite velocity is positive. This energy is naturally given by

$$E_{hole} = \frac{\hbar^2 |k - k_{max}|^2}{2 m^*_{hole}}$$
The fact that boosting the hole away from the top of the valence band costs positive energy may seem a bit counter-intuitive, given that the dispersion of the hole band is an upside-down parabola. However, one should think of this like pushing a balloon under water. The lowest energy configuration is with the electrons at the lowest energy possible and the hole at the highest energy possible. So pushing the hole under the electrons costs positive energy. (This is depicted in the right hand side of Fig. 16.1.) Analogous to the electron, we can write the hole group velocity as the derivative of the hole energy
$$v_{hole} = \nabla_k E_{hole}/\hbar = \hbar(k - k_{max})/m^*_{hole} \qquad (16.4)$$

Effective Mass and Equations of Motion

We have defined the effective masses above in analogy with that of free electrons, by looking at the curvature of the dispersion. An equivalent definition (equivalent at least at the top or bottom of the band) is to define the effective mass $m^*$ as being the quantity that satisfies Newton’s second law, $F = m^* a$, for the particle in question. To demonstrate this, our strategy is to imagine applying a force to an electron in the system and then equate the work done on the electron to its change in energy. Let us start with an electron in momentum state k. Its group velocity is $v = \nabla_k E(k)/\hbar$. If we apply a force8, the work done per unit time is
$$dW/dt = F \cdot v = F \cdot \nabla_k E(k)/\hbar$$

6Some people find the concept of effective mass for holes to be a bit difficult to digest. I recommend chapter 12 of Ashcroft and Mermin to explain this in more detail (in particular see page 225 and thereafter).
7Be warned: a few books define the mass of holes to be negative. This is a bit annoying but not inconsistent as long as the negative sign shows up somewhere else!
8For example, if we apply an electric field E and it acts on an electron of charge $-e$, the force is $F = -eE$.

On the other hand, the change in energy per unit time must also be (by the chain rule)

$$dE/dt = \frac{dk}{dt} \cdot \nabla_k E(k)$$
Setting these two expressions equal to each other we (unsurprisingly) obtain Newton’s equation
$$F = \hbar \frac{dk}{dt} = \frac{dp}{dt} \qquad (16.5)$$
where we have used $p = \hbar k$. If we now consider electrons near the bottom of a band, we can plug in the expression Eq. 16.2 for the velocity and this becomes
$$F = m^* \frac{dv}{dt}$$
exactly as Newton would have expected. In deriving this result recall that we have assumed that we are considering an electron near the bottom of a band, so that we can expand the dispersion quadratically (or similarly we assumed that holes are near the top of a band). One might wonder how we should understand electrons when they are neither near the top nor the bottom of a band. More generally Eq. 16.5 always holds, as does the fact that the group velocity is $v = \nabla_k E/\hbar$. It is then sometimes convenient to define an effective mass for an electron as a function of momentum to be given by9
$$\frac{\hbar^2}{m^*(k)} = \frac{\partial^2 E}{\partial k^2}$$
which agrees with our above definition (Eq. 16.1) near the bottom of the band. However, near the top of a band it is the negative of the corresponding hole mass (note the absolute value in Eq. 16.3). Note also that somewhere in the middle of the band the dispersion must reach an inflection point ($\partial^2 E/\partial k^2 = 0$), whereupon the effective mass actually becomes infinite as it changes sign.

Aside: It is useful to compare the time evolution of electrons and holes near the top of bands. If we think in terms of holes (the natural thing to do near the top of a band) we have $F = +eE$ and the holes have a positive mass. However if we think in terms of electrons, we have $F = -eE$ but the mass is negative. Either way, the acceleration of the k-state is the same, whether we are describing the state in terms of an electron in the state or in terms of a hole in the state. This is a rather important fundamental principle — that the time evolution of an eigenstate is independent of whether that eigenstate is filled with an electron or not.
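To see the sign change of the momentum-dependent effective mass concretely, here is a sketch using an illustrative one-dimensional cosine band (of the tight-binding form met earlier in the course). The hopping energy and lattice constant are made-up numbers, not parameters of a real material:

```python
import numpy as np

hbar = 1.0546e-34     # J s
m_e_bare = 9.109e-31  # kg
t = 1.602e-19         # illustrative hopping energy (1 eV in joules)
a = 3.0e-10           # illustrative lattice constant (3 angstroms)

# Illustrative cosine band E(k) = -2 t cos(k a); its curvature is
# d^2E/dk^2 = 2 t a^2 cos(k a), so m*(k) = hbar^2 / (2 t a^2 cos(k a))
def m_eff(k):
    return hbar**2 / (2 * t * a**2 * np.cos(k * a))

# m* is positive at the band bottom (k = 0), negative at the band top
# (k = pi/a), and blows up near the inflection point k = pi/(2a)
for frac in (0.0, 0.25, 0.49, 0.51, 0.75, 1.0):
    k = frac * np.pi / a
    print(f"k = {frac:4.2f} pi/a  ->  m*/m_e = {m_eff(k) / m_e_bare:+9.2f}")
```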

16.1.1 Drude Transport: Redux

Back in section 3 we studied Drude theory — a simple kinetic theory of electron motion. The main failure of Drude theory was that it did not treat the Pauli exclusion principle properly: it neglected the fact that in metals the high density of electrons makes the Fermi energy extremely high. However, in semiconductors or band insulators, when only a few electrons are in the conduction band and/or only a few holes are in the valence band, then we can consider this to be a low density situation, and to a very good approximation, we can ignore Fermi statistics. (For example, if only a single electron is excited into the conduction band, then we can completely ignore the Pauli principle, since it is the only electron around — there is no chance that any state it wants to sit in will already be filled!). As a result, when there is a low density of conduction electrons or valence holes, it turns out that Drude theory works extremely well! We will come back to this issue later in section 16.3 and make this statement much more precise.

9For simplicity we write this in its one-dimensional form.

At any rate, in the semiclassical picture, we can write a simple Drude transport equation (really Newton’s equations!) for electrons in the conduction band

$$m_e^* \, dv/dt = -e(E + v \times B) - m_e^* v/\tau$$
with $m_e^*$ the electron effective mass. Here the first term on the right hand side is the force on the electron, and the second term is a drag force with an appropriate scattering time $\tau$. The scattering time determines the so-called mobility $\mu$, which measures the ease with which the particle moves10

$$\mu = |v|/|E| = e\tau/m^*$$
Similarly we can write equations of motion for holes in the valence band

$$m_h^* \, dv/dt = e(E + v \times B) - m_h^* v/\tau$$
where $m_h^*$ is the hole effective mass. Note again that here the charge on the hole is positive. This should make sense — the electric field pulls on the electron in the direction opposite to the way it pulls on the absence of an electron! If we think back all the way to chapters 3 and 4, one of the physical puzzles that we could not understand was why the Hall coefficient sometimes changes sign (see the table in section 3.1.2). In some cases it looked as if the charge carrier had positive charge. Now we understand why this is true: in some materials the main charge carrier is the hole!
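For a feel for the numbers, here is a sketch of the steady-state version of the electron equation above (set dv/dt = 0 with B = 0). The scattering time and effective mass are illustrative values, not measured parameters of any particular semiconductor:

```python
e = 1.602e-19             # C
m_e_bare = 9.109e-31      # kg
tau = 1.0e-13             # illustrative scattering time (s)
m_star = 0.2 * m_e_bare   # illustrative electron effective mass

# Steady state of m* dv/dt = -e E - m* v / tau  gives  v = -(e tau / m*) E,
# so the mobility is mu = |v| / |E| = e tau / m*
mu = e * tau / m_star                 # m^2 / (V s)
E_field = 100.0                       # illustrative field, V/m
v_drift = mu * E_field                # drift speed, m/s

print(mu * 1e4)   # mobility in the conventional units cm^2/(V s), ≈ 880
```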

16.2 Adding Electrons or Holes With Impurities: Doping

In a pure band insulator or semiconductor, if we excite electrons from the valence to the conduction band (either with photons or thermally) we can be assured that the number of electrons in the conduction band (typically called n) is precisely equal to the number of holes left behind in the valence band (typically called p). However, in an impure semiconductor or band insulator this is not the case. Consider for example, silicon (Si), which is a semiconductor with a band gap of about 1.1 eV. Without impurities, a semiconductor is known as intrinsic11. Now imagine that a phosphorus (P) atom replaces one of the Si atoms in the lattice as shown on the left of Fig. 16.2. This P atom, being directly to the right of Si on the periodic table, can be thought of as nothing more than a Si atom plus an extra proton and an extra electron12 as shown on the right of Fig. 16.2. Since the valence band is already filled this additional electron must go into the conduction band. The P atom is known as a donor (or electron donor) in silicon since it donates an electron to the conduction band. It is also sometimes known as an n-dopant, since n is the symbol for the density of electrons in the conduction band. Analogously, we can consider aluminum, the element directly to the left of Si on the periodic table. In this case, the aluminum dopant provides one fewer electron than Si, so there will be one missing electron from the valence band. In this case Al is known as an electron acceptor, or equivalently as a p-dopant, since p is the symbol for the density of holes13.

10Mobility is defined to be positive for both electrons and holes.
11The opposite of intrinsic, the case where impurities donate carriers, is sometimes known as extrinsic.
12There is an extra neutron as well, but it doesn’t do much in this context.
13Yes, it is annoying that the common dopant phosphorus has the chemical symbol P, but it is not a p-dopant, it is an n-dopant.

Figure 16.2: Cartoon of Doping a Semiconductor. Doping Si with P adds one free electron

In a more chemistry oriented language, we can depict the donors and acceptors as shown in Fig. 16.3. In the intrinsic case, all of the electrons are tied up in covalent bonds of two electrons. With the n-dopant, there is an extra unbound electron, whereas with the p-dopant there is an extra unbound hole (one electron too few).

16.2.1 Impurity States

Let us consider even more carefully what happens when we add dopants. For definiteness let us consider adding an n-dopant such as P to a semiconductor such as Si. Once we add a single n-dopant to an otherwise intrinsic sample of Si, we get a single electron above the gap in the conduction band. This electron behaves like a free particle with mass $m_e^*$. However, in addition, we have a single extra positive charge $+e$ at some point in the crystal due to the P nucleus. The free electron is attracted back to this positive charge and forms a bound state that is just like a hydrogen atom. There are two main differences between a real hydrogen atom and this bound state of an electron in the conduction band and the impurity nucleus. First of all, the electron has effective mass $m_e^*$, which can be very different from the real (bare) mass of the electron (and is typically smaller than the bare mass of the electron). Secondly, instead of the two charges attracting each other with a potential $V = e^2/(4\pi\epsilon_0 r)$, they attract each other with a potential $V = e^2/(4\pi\epsilon_r\epsilon_0 r)$, where $\epsilon_r$ is the relative permittivity (or relative dielectric constant) of the material. With these two small differences, the calculation of the hydrogenic bound states proceeds exactly as for genuine hydrogen in our quantum mechanics courses. We recall that the energy eigenstates of the hydrogen atom are given by
$$E_n^{H-atom} = -\text{Ry}/n^2$$


Figure 16.3: Cartoon of Doping a Semiconductor. n and p doping: In the intrinsic case, all of the electrons are tied up in covalent bonds of two electrons. In the n-dopant case, there is an extra unbound electron, whereas with the p-dopant there is an extra hole.

where Ry is the Rydberg constant given by

$$\text{Ry} = \frac{m e^4}{8 \epsilon_0^2 h^2} \approx 13.6\,\text{eV}$$

with m the electron mass. The corresponding radius of this wavefunction is $r_n \approx n^2 a_0$ with the Bohr radius given by
$$a_0 = \frac{4\pi\epsilon_0\hbar^2}{m e^2} \approx 0.51 \times 10^{-10}\,\text{m}$$

The analogous calculation for a hydrogenic impurity state in a semiconductor gives precisely the same expression, only $\epsilon_0$ is replaced by $\epsilon_0 \epsilon_r$ and $m$ is replaced by $m_e^*$. One obtains

$$\text{Ry}^{\text{eff}} = \text{Ry}\left(\frac{m_e^*}{m}\right)\frac{1}{\epsilon_r^2}$$
and
$$a_0^{\text{eff}} = a_0\, \epsilon_r \left(\frac{m}{m_e^*}\right)$$
Because the dielectric constant of semiconductors is typically high (roughly 10 for most common semiconductors) and because the effective mass is frequently low (a third of m or even smaller), the effective Rydberg Ryeff can be tiny compared to the real Rydberg, and the effective

Bohr radius a0eff can be huge compared to the real Bohr radius14,15. For example, in silicon the effective Rydberg, Ryeff, is much less than 0.1 eV and a0eff is above 30 angstroms! Thus this donor impurity forms an energy eigenstate just below the conduction band. At zero temperature this eigenstate will be filled, but it takes only a small temperature to excite some of the bound electrons out of the hydrogenic orbital and into the conduction band. A depiction of this physics is given in Fig. 16.4, where we have plotted an energy diagram for a semiconductor with donor or acceptor impurities. Here the energy eigenstates are plotted as a function of position. Between the valence and conduction band (which are uniform in position), there are many localized hydrogen-atom-like eigenstates. The energies of these states lie in a range but are not all exactly the same, since each impurity atom is perturbed by other impurity atoms in its environment. If the density of impurities is high enough, electrons (or holes) can hop from one impurity to the next, forming an impurity band.

Note that because the effective Rydberg is very small, the impurity eigenstates are only slightly below the conduction band or above the valence band respectively. With a small temperature, these donors or acceptors can be thermally excited into the band. Thus, except at low enough temperature that the impurities bind the carrier, we can think of the impurities as simply adding carriers to the band. So the donor impurities donate free electrons to the conduction band, whereas the acceptor impurities give free holes to the valence band. However, at very low temperature, these carriers get bound back to their respective nuclei so that they can no longer carry electricity, a phenomenon known as carrier freeze out.

Note that in the absence of impurities, the Fermi energy (the chemical potential at zero temperature) is in the middle of the band gap.
When donor impurities are added, at zero temperature, these states are near the top of the band gap, and are filled. Thus the Fermi energy is moved up to the top of the band gap. On the other hand, when acceptors are added, the acceptor states near the bottom of the band gap are empty. (Remember it is a bound state of a hole to a nucleus!). Thus, the Fermi energy is moved down to the bottom of the band gap.
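The effective Rydberg and Bohr radius quoted above can be estimated in a few lines. The parameter values here are rough, silicon-like numbers for illustration only (real silicon has an anisotropic mass, so this is an order-of-magnitude sketch):

```python
Ry = 13.6        # hydrogen Rydberg, eV
a0 = 0.51        # hydrogen Bohr radius, angstroms
eps_r = 11.7     # illustrative relative permittivity (silicon-like)
m_ratio = 0.2    # illustrative m_e*/m (silicon-like, ignoring anisotropy)

# Hydrogenic impurity scaling: Ry_eff = Ry (m*/m) / eps_r^2,  a_eff = a0 eps_r (m/m*)
Ry_eff = Ry * m_ratio / eps_r**2
a_eff = a0 * eps_r / m_ratio

print(Ry_eff)   # ≈ 0.02 eV, far smaller than a ~1 eV band gap
print(a_eff)    # ≈ 30 angstroms, many lattice spacings across
```

The huge ratio between the bound-state radius and the lattice spacing is what justifies treating the crystal as a continuum dielectric in this estimate.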

Optical Effects of Impurities (Redux)

As mentioned previously in section 15.5.4, the presence of impurities in a material can have dramatic effects on its optical properties. There are two main optical effects of impurities. The first effect is that the impurities add charge carriers to an otherwise insulating material – turning an insulator into something that conducts at least somewhat. This obviously can have some important effects on the interaction with light. The second important effect is the introduction of new energy levels within the gap. Whereas before the introduction of impurities, the lowest energy transition that can be made is the full energy of the gap, now one can have optical transitions between impurity states, or from the bands to the impurity states.

14Note that the large Bohr radius justifies post-facto our use of a continuum approximation for the dielectric constant $\epsilon_r$. On small length scales, the electric field is extremely inhomogeneous due to the microscopic structure of the atoms, but on large enough length scales we can use classical electromagnetism and simply model the material as a medium with a dielectric constant.
15Because silicon has an anisotropic band, and therefore an anisotropic mass, the actual formula is more complicated.

Figure 16.4: Energy Diagram of a Doped Semiconductor. Left: with donor impurities. Right: with acceptor impurities. The energy eigenstates of the hydrogenic orbitals tied to the impurities are not all the same because each impurity is perturbed by neighbor impurities. At low temperature, the donor impurity eigenstates are filled and the acceptor eigenstates are empty. But with increasing temperature, the electrons in the donor eigenstates are excited into the conduction band and similarly the holes in the acceptor eigenstates are excited into the valence band.

16.3 Statistical Mechanics of Semiconductors

We now use our knowledge of statistical physics to analyze the occupation of the bands at finite temperature. Imagine a band structure as shown in Fig. 16.5. The minimum energy of the conduction band is defined to be $\epsilon_c$ and the maximum energy of the valence band is defined to be $\epsilon_v$. The band gap is correspondingly $E_{gap} = \epsilon_c - \epsilon_v$.

Recall from way back in Eq. 4.10 that the density of states per unit volume for free electrons (in three dimensions with two spin states) is given by
$$g(\epsilon \geq 0) = \frac{(2m)^{3/2}}{2\pi^2\hbar^3}\sqrt{\epsilon}$$

The electrons in our conduction band are exactly like these free electrons, except that (a) the bottom of the band is at energy $\epsilon_c$ and (b) they have an effective mass $m_e^*$. Thus the density

Figure 16.5: A Band Diagram of a Semiconductor.

of states for these electrons near the bottom of the conduction band is given by

$$g_c(\epsilon \geq \epsilon_c) = \frac{(2m_e^*)^{3/2}}{2\pi^2\hbar^3}\sqrt{\epsilon - \epsilon_c}$$
Similarly the density of states for holes near the top of the valence band is given by

$$g_v(\epsilon \leq \epsilon_v) = \frac{(2m_h^*)^{3/2}}{2\pi^2\hbar^3}\sqrt{\epsilon_v - \epsilon}$$

At fixed chemical potential $\mu$ the total number of electrons $n$ in the conduction band, as a function of temperature $T$, is thus given by
$$n(T) = \int_{\epsilon_c}^{\infty} d\epsilon \; g_c(\epsilon)\, n_F(\beta(\epsilon - \mu)) = \int_{\epsilon_c}^{\infty} d\epsilon \; \frac{g_c(\epsilon)}{e^{\beta(\epsilon-\mu)} + 1}$$
where $n_F$ is the Fermi occupation factor, and $\beta^{-1} = k_B T$ as usual. If the chemical potential is “well below” the conduction band (i.e., if $\beta(\epsilon_c - \mu) \gg 1$), then we can approximate
$$\frac{1}{e^{\beta(\epsilon - \mu)} + 1} \approx e^{-\beta(\epsilon - \mu)}$$
In other words, Fermi statistics can be replaced by Boltzmann statistics when the temperature is low enough that the density of electrons in the band is very low. (We have already run into this principle in section 16.1.1 when we discussed that Drude theory, a classical approach that neglects Fermi statistics, actually works very well for electrons above the band gap in semiconductors!). We thus obtain
$$n(T) \approx \int_{\epsilon_c}^{\infty} d\epsilon \; g_c(\epsilon)\, e^{-\beta(\epsilon - \mu)} = \frac{(2m_e^*)^{3/2}}{2\pi^2\hbar^3} \int_{\epsilon_c}^{\infty} d\epsilon \; (\epsilon - \epsilon_c)^{1/2} e^{-\beta(\epsilon - \mu)}$$
$$= \frac{(2m_e^*)^{3/2}}{2\pi^2\hbar^3}\, e^{\beta(\mu - \epsilon_c)} \int_{\epsilon_c}^{\infty} d\epsilon \; (\epsilon - \epsilon_c)^{1/2} e^{-\beta(\epsilon - \epsilon_c)}$$
The last integral is (using $y^2 = x = \epsilon - \epsilon_c$)
$$\int_0^{\infty} dx \; x^{1/2} e^{-\beta x} = 2\int_0^{\infty} dy \; y^2 e^{-\beta y^2} = -2\frac{d}{d\beta}\int_0^{\infty} dy \; e^{-\beta y^2} = -2\frac{d}{d\beta}\sqrt{\frac{\pi}{4\beta}} = \frac{\sqrt{\pi}}{2}\,\beta^{-3/2}$$
Thus we obtain the standard expression for the density of electrons in the conduction band

$$n(T) = \frac{1}{4}\left(\frac{2 m_e^* k_B T}{\pi \hbar^2}\right)^{3/2} e^{-\beta(\epsilon_c - \mu)} \qquad (16.6)$$
Note that this is mainly just exponential activation from the chemical potential to the bottom of the conduction band, with a prefactor which doesn’t change too quickly as a function of temperature (obviously the exponential changes very quickly with temperature!). Quite similarly, we can write the number of holes in the valence band p as a function of temperature16

$$p(T) = \int_{-\infty}^{\epsilon_v} d\epsilon \; g_v(\epsilon)\left[1 - \frac{1}{e^{\beta(\epsilon-\mu)}+1}\right] = \int_{-\infty}^{\epsilon_v} d\epsilon \; \frac{g_v(\epsilon)}{e^{\beta(\mu-\epsilon)}+1}$$
Again, if $\mu$ is substantially above the top of the valence band, we have $e^{\beta(\mu-\epsilon)} \gg 1$, so we can replace this by
$$p(T) = \int_{-\infty}^{\epsilon_v} d\epsilon \; g_v(\epsilon)\, e^{\beta(\epsilon-\mu)}$$
and the same type of calculation then gives

$$p(T) = \frac{1}{4}\left(\frac{2 m_h^* k_B T}{\pi \hbar^2}\right)^{3/2} e^{-\beta(\mu - \epsilon_v)} \qquad (16.7)$$
again showing that the holes are activated from the chemical potential down into the valence band. (Recall that pushing a hole down into the valence band costs energy!).

16If the Fermi factor $n_F$ gives the probability that a state is occupied by an electron, then $1 - n_F$ gives the probability that the state is occupied by a hole.

Law of Mass Action

A rather crucial relation is formed by combining Eq. 16.6 with 16.7.

$$n(T)\,p(T) = \frac{1}{2}\left(\frac{k_B T}{\pi\hbar^2}\right)^3 (m_e^* m_h^*)^{3/2}\, e^{-\beta(\epsilon_c - \epsilon_v)} = \frac{1}{2}\left(\frac{k_B T}{\pi\hbar^2}\right)^3 (m_e^* m_h^*)^{3/2}\, e^{-\beta E_{gap}} \qquad (16.8)$$
where we have used the fact that the gap energy $E_{gap} = \epsilon_c - \epsilon_v$. Eq. 16.8 is sometimes known as the law of mass action17, and it is true independent of the doping of the material.

Intrinsic Semiconductors

For an intrinsic (i.e., undoped) semiconductor the number of electrons excited into the conduction band must be equal to the number of holes left behind in the valence band, so $p = n$. We can then divide Eq. 16.6 by 16.7 to get
$$1 = \left(\frac{m_e^*}{m_h^*}\right)^{3/2} e^{-\beta(\epsilon_c + \epsilon_v - 2\mu)}$$
Taking the log of both sides gives the useful relation
$$\mu = \frac{1}{2}(\epsilon_c + \epsilon_v) + \frac{3}{4}(k_B T)\log(m_h^*/m_e^*) \qquad (16.9)$$
Note that at zero temperature, the chemical potential is precisely mid-gap. Using either this expression, or by using the law of mass action along with the constraint $n = p$, we can obtain an expression for the intrinsic density of carriers in the semiconductor

$$n_{intrinsic} = p_{intrinsic} = \sqrt{np} = \frac{1}{\sqrt{2}}\left(\frac{k_B T}{\pi\hbar^2}\right)^{3/2} (m_e^* m_h^*)^{3/4}\, e^{-\beta E_{gap}/2}$$
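Plugging rough numbers into the intrinsic-density formula shows just how few carriers a semiconductor has at room temperature. The sketch below uses a silicon-like gap of 1.1 eV and, purely for illustration, sets both effective masses equal to the bare electron mass (real Si has unequal, anisotropic masses):

```python
import numpy as np

hbar = 1.0546e-34        # J s
kB = 1.381e-23           # J / K
m_e_bare = 9.109e-31     # kg

me_star = mh_star = m_e_bare   # illustrative: both masses set to the bare mass
Egap = 1.1 * 1.602e-19         # silicon-like gap, in joules
T = 300.0                      # room temperature, K

# n_intrinsic = (1/sqrt(2)) (kB T / (pi hbar^2))^{3/2} (me* mh*)^{3/4} e^{-Egap/(2 kB T)}
n_i = (1 / np.sqrt(2)) * (kB * T / (np.pi * hbar**2))**1.5 \
      * (me_star * mh_star)**0.75 * np.exp(-Egap / (2 * kB * T))

print(n_i)   # ~1e16 per m^3, i.e. ~1e10 per cm^3: minuscule compared to a metal
```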

Doped Semiconductors

For doped semiconductors, the law of mass action still holds. If we further assume that the temperature is high enough so that there is no carrier freeze out (i.e., carriers are not bound to

17The nomenclature here, “law of mass action”, is a reference to an analogue in chemistry. In chemical reactions we may have an equilibrium between two objects A and B and their compound AB. This is frequently expressed as
$$A + B \rightleftharpoons AB$$
There is some chemical equilibrium constant K which gives the ratio of concentrations
$$K = \frac{[A][B]}{[AB]}$$
where [X] is the concentration of species X. The law of mass action states that this constant K remains fixed independent of the individual concentrations. In semiconductor physics it is quite similar, only the “reaction” is
$$e + h \rightleftharpoons 0$$
the annihilation of an electron and a hole, so that the product of $[e] = n$ and $[h] = p$ is fixed.

impurities) then we have

$$n - p = (\text{density of donors}) - (\text{density of acceptors})$$
This, along with the law of mass action, gives us two equations in two unknowns which can be solved18. In short, the result is that if we are at a temperature where the undoped intrinsic carrier density is much greater than the dopant density, then the dopants do not matter much, and the chemical potential is roughly midgap as in Eq. 16.9 (this is the intrinsic regime). On the other hand, if we are at a temperature where the intrinsic undoped density is much smaller than the dopant density, then the temperature does not matter much and we can think of this as a low temperature situation where the carrier concentration is mainly set by the dopant density (this is the extrinsic regime). In the n-doped case, the bottom of the conduction band gets filled with the density of electrons from the donors, and the chemical potential gets shifted up towards the conduction band. Correspondingly, in the p-doped case, holes fill the top of the valence band, and the chemical potential gets shifted down towards the valence band. Note that in this case of strong doping, the majority carrier concentration is obtained just from the doping, whereas the minority carrier concentration — which might be very small — is obtained via the law of mass action.
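The two-equations-in-two-unknowns solution described above (worked out explicitly in footnote 18) is easy to verify numerically. The doping and intrinsic densities below are illustrative numbers chosen to put us deep in the extrinsic regime:

```python
import math

def carrier_densities(D, I):
    """Solve n - p = D and n p = I^2 for the electron and hole densities."""
    root = math.sqrt(D**2 + 4 * I**2)
    return (root + D) / 2, (root - D) / 2

I = 1.5e16   # illustrative intrinsic density, m^-3
D = 1.0e21   # illustrative net donor density, m^-3 (extrinsic regime: D >> I)

n, p = carrier_densities(D, I)
print(n)   # majority carriers: essentially D
print(p)   # minority carriers: essentially I^2 / D, vastly smaller
```

In the extrinsic regime the majority density is set by the doping alone, while the tiny minority density follows from the law of mass action, np = I².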

16.4 Summary of Statistical Mechanics of Semiconductors

• Holes are the absence of an electron in the valence band. These have positive charge (electrons have negative charge), and positive effective mass. The energy of a hole gets larger at larger momentum (away from the maximum of the band) as it gets pushed down into the valence band. The positive charge of the hole as a charge carrier explains the puzzle of the sign of the Hall coefficient.

• Effective mass of electrons is determined by the curvature at the bottom of the conduction band. Effective mass of holes is determined by the curvature at the top of the valence band.

• Mobility of a carrier is $\mu = e\tau/m^*$.

18Here is how to solve these two equations. Let

$$D \equiv \text{doping} = n - p = (\text{density of donors}) - (\text{density of acceptors})$$
Let us further assume that $n > p$ so $D > 0$ (we can do the calculation again making the opposite assumption, at the end). Also let $I = n_{intrinsic} = p_{intrinsic}$, so that
$$I^2 = \frac{1}{2}\left(\frac{k_B T}{\pi\hbar^2}\right)^3 (m_e^* m_h^*)^{3/2}\, e^{-\beta E_{gap}}$$
from the law of mass action. Using $np = I^2$, we can then construct

$$D^2 + 4I^2 = (n - p)^2 + 4np = (n + p)^2$$
So we obtain
$$n = \frac{1}{2}\left(\sqrt{D^2 + 4I^2} + D\right)$$
$$p = \frac{1}{2}\left(\sqrt{D^2 + 4I^2} - D\right)$$

As stated in the main text, if $I \gg D$ then the doping D is not important. On the other hand, if $I \ll D$ then the majority carrier density is determined by the doping only and the thermal factor I is unimportant.

• When very few electrons are excited into the conduction band, or very few holes into the valence band, Boltzmann statistics is a good approximation for Fermi statistics and Drude theory is accurate.
• Electrons or holes can be excited thermally, or can be added to a system by doping. The law of mass action assures that the product np is fixed independent of the amount of doping (it only depends on the temperature, the effective masses, and the band gap).
• At very low temperature carriers may freeze out, binding to the impurity atoms that they came from. However, because the effective Rydberg is very small, carriers are easily ionized into the bands.
• Know how to derive the law of mass action!

References

• Ashcroft and Mermin, chapter 28. A very good discussion of holes and their effective mass is given in chapter 12.
• Rosenberg, chapter 9
• Hook and Hall, 5.1–5.5
• Kittel, chapter 8
• Burns, chapter 10, not including 10.17 and after
• Singleton, chapters 5–6

Chapter 17

Semiconductor Devices

The development of semiconductor devices, such as the transistor, no doubt changed the world. Every iPad, iPod, iPhone, and iBook literally contains billions of semiconductor transistors. Simple devices, like alarm clocks, TVs, radios, or cars, contain many thousands or even millions of them. It is hard to overstate how much we take these things for granted these days. This chapter discusses the physics behind some of the devices you can make with semiconductors.

17.1 Band Structure Engineering

To make a semiconductor device one must have control over the detailed properties of materials (band gap, doping, etc) and one must be able to assemble together semiconductors with differing such properties.

17.1.1 Designing Band Gaps

A simple example of band structure engineering is given by aluminium gallium arsenide. GaAs is a semiconductor (zincblende structure as in Fig. 13.4) with a direct band gap of about $E_{gap,k=0}(\text{GaAs}) = 1.4$ eV. AlAs is the same structure except that the Ga has been replaced by Al, and the gap1 at $k = 0$ is about 2.7 eV. One can also produce alloys (mixtures) where some fraction $x$ of the Ga has been replaced by Al, which we notate as $\text{Al}_x\text{Ga}_{1-x}\text{As}$. To a fairly good approximation the direct band gap just interpolates between the direct band gaps of the pure GaAs and the pure AlAs. Thus we get roughly (for $x < .4$)
$$E_{gap}(x) = (1-x)\,1.4\,\text{eV} + x\,2.7\,\text{eV}$$
Producing this type of alloyed structure allows one to obtain any desired band gap in this type of material2.

1AlAs is actually an indirect band gap semiconductor, but for $x < .4$ or so $\text{Al}_x\text{Ga}_{1-x}\text{As}$ is direct band gap as well.
2By alloying the material with arbitrary x, one must accept that the system can no longer be precisely periodic but instead will be some random mixture. It turns out that as long as we are concerned with long wavelength electron waves (i.e., states near the bottom of the conduction band or the top of the valence band) this randomness is very effectively averaged out, and we can roughly view the system as being a periodic crystal of As and some average of an $\text{Al}_x\text{Ga}_{1-x}$ atom. This is known as a “virtual crystal” approximation.
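The linear interpolation above is a one-liner. Here is a small sketch, treating the 1.4 eV and 2.7 eV endpoint gaps as given and restricting to the range where the alloy stays direct-gap:

```python
def egap_algaas(x):
    """Rough interpolated direct gap of Al_x Ga_{1-x} As, in eV (used here for x < 0.4)."""
    if not (0.0 <= x < 0.4):
        raise ValueError("linear interpolation used here only for 0 <= x < 0.4")
    return (1 - x) * 1.4 + x * 2.7

# e.g. to target a gap of about 1.8 eV one would choose x of about 0.3
print(egap_algaas(0.3))   # ≈ 1.79 eV
```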


In the context of device physics one might want to build, for example, a laser out of a semiconductor. The lowest energy transition which recombines a hole with an electron is the gap energy (this is the “lasing” energy typically). By tuning the composition of the semiconductor, one can tune the energy of the gap and therefore the optical frequency of the laser.

17.1.2 Non-Homogeneous Band Gaps

By constructing structures where the materials (or the alloying of a material) is a function of position, one can design more complex environments for electrons or holes in a system. Consider for example, the structure shown in the following figure:

[Layer diagram: a GaAs layer of thickness L sandwiched along the z direction between two AlxGa1−xAs layers]

Here a layer of GaAs with smaller band gap is inserted between two layers of AlGaAs, which has a larger band gap. This structure is known as a “quantum well”. In general a semiconductor made of several varieties of semiconductor is known as a semiconductor heterostructure. A band diagram of the quantum well structure as a function of the vertical position z is given in Fig. 17.1. The band gap is lower in the GaAs region than in the AlGaAs region. The changes in band energy can be thought of as a potential that an electron (or hole) would feel. For example, an electron in the conduction band can have a lower energy if it is in the quantum well region (the GaAs region) than it can have in the AlGaAs region. An electron in the conduction band with low energy will be trapped in this region. Just like a particle in a box, there will be discrete eigenstates of the electron’s motion in the z direction, as shown in the figure. The situation is similar for holes in the valence band (recall that it requires energy to push a hole down into the valence band), so there will similarly be confined particle-in-a-box states for holes in the quantum well.

17.1.3 Summary of the Examinable Material

• One can tune band gaps by forming an alloy.
• Band gaps act as a potential for electrons and holes.

References on Inhomogeneous Semiconductors

There are many good references on semiconductors (see also the material listed below). Almost all of them discuss the p-n junction first (which is nonexaminable for us). I recommend Hook and Hall section 6.6 on the quantum well to cover the above material. The rest of the material in this chapter is NOT EXAMINABLE. But since semiconductors really did change the world, you might be interested in learning it anyway!


Figure 17.1: Band diagram of a quantum well. A single electron in the conduction band can be trapped in the particle-in-a-box states in the quantum well. Similarly, a hole in the valence band can be trapped in the quantum well.

17.2 p-n Junction

The p-n junction is a junction in a semiconductor between a region of p-doping and a region of n-doping. This type of junction has the remarkable property of rectification: it will allow current to flow through the junction easily in one direction, but not easily (with very high resistance) in the other direction3. Consider... OK, I haven’t finished this chapter. Cut me some slack, typing these notes was a load of work! Anyway, it is nonexaminable material, so don’t worry about it too much.

3The phenomenon of rectification in semiconductors was discovered by Karl Ferdinand Braun way back in 1874, but was not understood in detail until the middle of the next century. This discovery was fundamental to the development of radio technology. Braun was awarded the Nobel Prize in 1909 with Guglielmo Marconi for contributions to wireless telegraphy. Perhaps as important to modern communication, Braun also invented the cathode ray tube (CRT), which formed the display for televisions for many years until the LCD display arrived very recently. (The CRT is known as a “Braun Tube” in many countries.)

Part VII

Magnetism and Mean Field Theories


Chapter 18

Magnetic Properties of Atoms: Para- and Dia-Magnetism

The first question one might ask is why we are interested in magnets. While the phenomenon of magnetism was known to the ancients1, it has only been since the discovery of quantum mechanics that we have come to any understanding of what causes this effect2. It may seem like this is a relatively small corner of physics on which to focus so much attention (indeed, several chapters), but we will see that magnetism is a particularly good place to observe the effects of both statistical physics and quantum physics3. As we mentioned in section 15.4, one place where the band theory of electrons fails is in trying to describe magnets. Indeed, this is precisely what makes magnets interesting! In fact, magnetism remains an extremely active area of research in physics (with many many hard and unanswered questions remaining). Much of condensed matter physics continues to use magnetism as a testing ground for understanding complex quantum and statistical physics both theoretically and in the laboratory. We should emphasize that most magnetic phenomena are caused by the quantum mechanical behavior of electrons. While nuclei do have magnetic moments, and therefore can contribute to magnetism, the magnitude of the nuclear moments is (typically) much less than that of electrons.4

1Both the Chinese and the Greeks probably knew about magnetic properties of Fe3O4, or magnetite (also known as lodestone when magnetized), possibly as far back as several thousand years BCE (with written records existing as far back as 600 years BCE). One legend has it that a shepherd named Magnes, in the province of Magnesia, had the nails in his shoes stuck to a large metallic rock, and the scientific phenomenon became named after him. 2Animal magnetism notwithstanding... (that was a joke). 3In fact there is a theorem by Niels Bohr and Hendrika van Leeuwen which shows that any treatment of statistical mechanics without quantum mechanics (i.e., classical statistical mechanics) can never produce a nonzero magnetization. 4To understand this, recall that the Bohr magneton, which gives the size of the magnetic moment of an electron, is given by µB = eℏ/(2m) with m the electron mass. If one were to consider magnetism caused by nuclear moments, the typical moments would be smaller by a ratio of the mass of the electron to the mass of a nucleus (a factor of over 1000). Nonetheless, the magnetism of the nuclei, although small, does exist.


18.1 Basic Definitions of types of Magnetism

Let us first make some definitions. Recall that for a small magnetic field, the magnetization of a system M (moment per unit volume) is typically related linearly to the applied5 magnetic field H by a (magnetic) susceptibility χ. We write for small fields H,

M = χH (18.1)

Note that χ is dimensionless. For small susceptibilities (and susceptibilities are almost always small, except in ferromagnets) there is little difference between µ0H and B (with µ0 the permeability of free space), so we can also write M = χB/µ0 (18.2)

Definition 18.1.1. A paramagnet is a material where χ > 0 (i.e., the resulting magnetization is in the same direction as the applied field).

We have run into (Pauli) paramagnetism previously in section 4.3 above. You may also be familiar with the paramagnetism of a free spin (which we will cover again in section 18.4 below). Qualitatively, paramagnetism occurs whenever there are moments that can be re-oriented by an applied magnetic field — thus developing magnetization in the direction of the applied field.

Definition 18.1.2. A diamagnet is a material where χ < 0 (i.e., the resulting magnetization is in the opposite direction from the applied field).

We will discuss diamagnetism more in section 18.5 below. As we will see, diamagnetism is quite ubiquitous and occurs generically unless it is overwhelmed by other magnetic effects. For example, water and almost all other biological materials are diamagnetic6. Qualitatively we can think of diamagnetism as being similar in spirit to Lenz’s law (part of Faraday’s law) that an induced current always opposes the change causing it. However, the analogy is not precise. If a magnetic field is applied to a loop of wire, current will flow to create a magnetization in the opposite direction. However, in any (nonsuperconducting) loop of wire, the current will eventually decay back to zero and there will be no magnetization remaining. In a diamagnet, in contrast, the magnetization remains so long as the applied magnetic field remains. For completeness we should also define a ferromagnet — this is what we usually think of as a “magnet” (the thing that holds notes to the fridge).

Definition 18.1.3. A ferromagnet is a material where M can be nonzero, even in the absence of any applied magnetic field7.

5The susceptibility is defined in terms of H. With a long rod-shaped sample oriented parallel to the applied field, H is the same outside and inside the sample, and is thus directly controlled by the experimentalist. The susceptibility is defined in terms of this standard configuration. However, more generally, one needs to take care that the internal field B that any electrons in the sample respond to is related to H via B = µ0(H + M). 6It is interesting to note that a diamagnet repels the field that creates it, so it is attracted to a magnetic field minimum. Earnshaw’s theorem forbids a local maximum of the B field in free space, but local minima can exist — and this then allows diamagnets to levitate in free space. In 1997 Andre Geim used this effect to levitate a rather confused frog. This feat earned him a so-called Ig-Nobel prize in 2000 (Ig-Nobel prizes are awarded for research that “cannot or should not be reproduced”). Ten years later he was awarded a real Nobel prize for the discovery of graphene — single layer carbon sheets. This makes him the only person so far to receive both the Ig-Nobel and the real Nobel. 7The definition of ferromagnetism given here is a broad definition which would also include ferrimagnets. We will discuss ferrimagnets in section 19.1.3 below, and we mention that occasionally people use a more restrictive definition of ferromagnetism that excludes ferrimagnets. At any rate, the broad definition given here is common.

It is worth already drawing the distinction between spontaneous and non-spontaneous magnetism. Magnetism is said to be spontaneous if it occurs even in the absence of an externally applied magnetic field, as is the case for a ferromagnet. The remainder of this chapter will mainly be concerned with non-spontaneous magnetism, and we will return to spontaneous magnetism in chapter 19 below. It turns out that a lot of the physics of magnetism can be understood by just considering a single atom at a time. This will be the strategy of the current chapter — we will discuss the magnetic behavior of a single atom and only in section 18.6 will we consider how the physics changes when we put many atoms together to form a solid. We thus start this discussion by reviewing some atomic physics that you might have learned in prior courses8.

18.2 Atomic Physics: Hund’s Rules

We start with some of the fundamentals of electrons in an isolated atom. (I.e., we ignore the fact that in materials atoms are not isolated, but are bound to other atoms.) For isolated atoms there is a set of rules, known as “Hund’s Rules”9, which determine how the electrons fill orbitals. Recall from basic quantum mechanics that an electron in an atomic orbital can be labeled by four quantum numbers, |n, l, lz, σz⟩, where

n = 1, 2, . . .
l = 0, 1, . . . , n − 1
lz = −l, . . . , l
σz = −1/2 or +1/2

Here n is the principal quantum number, l is the angular momentum, lz is its z-component, and σz is the z-component of spin10. Recall that the angular momentum shells with l = 0, 1, 2, 3, . . . are sometimes known as s, p, d, f, . . . respectively in atomic language. These shells can accommodate 2, 6, 10, 14, . . . electrons respectively, including both spin states. When we consider multiple electrons in one atom, we need to decide which orbitals are filled and which ones are empty. The first rule is known as the Aufbau principle11, which many people

think of as Hund’s 0th rule.

8You should have learned this in prior courses. But if you have not, it is probably not your fault! This material is rarely taught in physics courses these days, even though it really should be. Much of this material is actually taught in chemistry courses instead! 9Friedrich Hermann Hund was an important physicist and chemist whose work on atomic structure began in the very early days of quantum mechanics — he wrote down Hund’s rules in 1925. He is also credited with being one of the inventors of molecular orbital theory, which we met in chapter 5.3.2 above. In fact, molecular orbital theory is sometimes known as Hund-Mulliken molecular orbital theory. Mulliken thanked Hund heavily in his Nobel Prize acceptance speech (but Hund did not share the prize). Hund died in 1997 at the age of 101. The word “Hund” means “dog” in German. 10You probably discussed these quantum numbers in reference to the eigenstates of a hydrogen atom. The orbitals of any atom can be labeled similarly. 11Aufbau means “construction” or “building up” in German.

Aufbau Principle (paraphrased): Shells should be filled starting with the lowest available energy state. An entire shell is filled before another shell is started12.

(Madelung Rule): The energy ordering is from lowest value of n+l to the largest; and when two shells have the same value of n + l, fill the one with the smaller n first.13 This ordering rule means that shells should be filled in the order14

1s, 2s, 2p, 3s, 3p, 4s, 3d, 4p, 5s, 4d, 5p, 6s, 4f,...

A simple mnemonic for this order can be constructed by drawing the following simple diagram:

1s
2s 2p
3s 3p 3d
4s 4p 4d 4f
5s 5p 5d 5f 5g
6s 6p 6d 6f 6g 6h
...

Here the filling order 1, 2, 3, . . . runs along the diagonals, each diagonal read from the upper right to the lower left (1s; 2s; 2p, 3s; 3p, 4s; 3d, 4p, 5s; and so on).

So for example, let us consider an isolated nitrogen atom, which has atomic number 7 (i.e., 7 electrons). Nitrogen has a filled 1s shell (containing 2 electrons, one spin up, one spin down), has a filled 2s shell (containing 2 electrons, one spin up, one spin down), and has three remaining electrons in the 2p shell. In atomic notation we would write this as 1s²2s²2p³. To take a more complicated example, consider the atom praseodymium (Pr) which is a

12It is important to realize that a given orbital is different in different atoms. For example, the 2s orbital in a nitrogen atom is different from the 2s orbital in an iron atom. The reason for this is that the charge of the nucleus is different and also that one must account for the interaction of an electron in an orbital with all of the other electrons in that atom. 13Depending on your country of origin, the Madelung rule might instead be known as Klechkovsky’s rule. 14You may find it surprising that shells are filled in this order, being that for a simple hydrogen atom orbital energies increase with n and are independent of l. However, in any atom other than hydrogen, we must also consider the interaction of each electron with all of the other electrons. Treating this effect in detail is quite complex, so it is probably best to consider this ordering (Madelung) rule to be simply empirical. Nonetheless, various approximation methods have been able to give some insight. Typical approximation schemes replace the Coulomb potential of the nucleus with some screened potential which represents the charge both of the nucleus and of all the other electrons (essentially the effective charge of the nucleus is reduced if the electron is at a radius where other electrons can get between it and the nucleus). Note in particular that once we change the potential from the Coulomb 1/r form, we immediately break the energy degeneracy between different l states.

rare earth element with atomic number 59. Following the Madelung rule, we obtain an atomic15 configuration 1s²2s²2p⁶3s²3p⁶4s²3d¹⁰4p⁶5s²4d¹⁰5p⁶6s²4f³. Note that the “exponents” properly add up to 59. There are a few atoms that violate this ordering (Madelung) rule. One example is copper, which typically fills the 3d shell by “borrowing” an electron from the (putatively lower energy) 4s shell.
Also, when an atom is part of a molecule or is in a solid, the ordering may change a little as well. However, the general trend given by this rule is rather robust. This shell filling sequence is, in fact, the rule which defines the overall structure of the periodic table, with each “block” of the periodic table representing the filling of some particular shell. For example, the first line of the periodic table has the elements H and He, which have atomic fillings 1sˣ with x = 1, 2 respectively (and the 1s shell holds at most 2 electrons). The left of the second line of the table contains Li and Be, which have atomic fillings 1s²2sˣ with x = 1, 2 respectively. The right of the second line of the table shows B, C, N, O, F, Ne, which have atomic fillings 1s²2s²2pˣ with x = 1 . . . 6, and recall that the 2p shell can hold at most 6 electrons. One can continue and reconstruct the entire periodic table this way! In cases when shells are partially filled (which in fact includes most elements of the periodic table) we next want to describe which of the available orbitals are filled in these shells and which spin states are filled. In particular we want to know whether these electrons will have a net magnetic moment. Hund’s rules are constructed precisely to answer this question. Perhaps the simplest way to illustrate these rules is to consider an explicit example. Here we will again consider the atom praseodymium. As mentioned above, this element in atomic form has three electrons in its outer-most shell, which is an f-shell, meaning it has angular momentum l = 3, and therefore 7 possible values of lz, and of course 2 possible values of the spin for each electron. So where in these possible orbital/spin states do we put the three electrons?
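As an aside, the shell-filling bookkeeping described above is easily mechanized. This sketch (function names are my own) orders shells by the Madelung rule and fills each l-shell to its capacity of 2(2l + 1) electrons:

```python
# Sketch of the Aufbau/Madelung filling: order shells by smallest n + l,
# ties broken by smaller n; each l-shell holds 2(2l + 1) electrons.
L_LETTERS = "spdfghik"   # standard shell letters (j is skipped by convention)

def madelung_order(max_n=8):
    shells = [(n, l) for n in range(1, max_n + 1) for l in range(n)]
    return sorted(shells, key=lambda nl: (nl[0] + nl[1], nl[0]))

def configuration(z):
    """Aufbau/Madelung configuration for z electrons, e.g. '1s2 2s2 2p3' for nitrogen."""
    parts = []
    for n, l in madelung_order():
        if z <= 0:
            break
        fill = min(z, 2 * (2 * l + 1))
        parts.append(f"{n}{L_LETTERS[l]}{fill}")
        z -= fill
    return " ".join(parts)
```

`configuration(7)` reproduces nitrogen’s 1s2 2s2 2p3, and `configuration(59)` ends in 6s2 4f3, as quoted above for praseodymium (atoms like copper that violate the rule are of course not captured).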

Hund’s First Rule (paraphrased): Electrons try to align their spins.

Given this rule, we know that the three valence electrons in Pr will have their spins point in the same direction, thus giving us a total spin angular momentum S = 3/2 from the three S = 1/2 spins. So locally (meaning on the same atom), the three electron spins behave ferromagnetically — they all align16. The reason for this alignment will be discussed below in section 18.2.1, but in short, it is a result of the Coulomb interaction between electrons (and between the electrons and the nucleus) — the Coulomb energy is lower when the electron spins align. We now have to decide which orbital states to put the electrons in.

Hund’s Second Rule (paraphrased): Electrons try to maximize their total orbital angular momentum, consistent with Hund’s first rule.

For the case of Pr, we fill the lz = 3 and lz = 2 and lz = 1 states to make the maximum possible total Lz = 6 (this gives L = 6, and by rotational invariance we can point L in any direction

equally well). Thus, we have a picture as follows:

15This tediously long atomic configuration can be abbreviated as [Xe]6s²4f³ where [Xe] represents the atomic configuration of xenon, which, being a noble gas, is made of entirely filled shells. 16We would not call this a true ferromagnet since we are talking about a single atom here, not a macroscopic material!

lz:   −3   −2   −1   0   1↑   2↑   3↑

We have put the spins as far as possible to the right to maximize Lz (Hund’s 2nd rule) and we have aligned all the spins (Hund’s 1st rule). Note that we could not have put two electrons in the same orbital, since they have to be spin-aligned and we must obey the Pauli principle. Again the rule of maximizing orbital angular momentum is driven by the physics of the Coulomb interaction (as we will discuss briefly below in section 18.2.1). At this point we have S = 3/2 and L = 6, but we still need to think about how the spin and orbital angular momenta align with respect to each other.

Hund’s Third Rule (paraphrased): Given Hund’s first and second rules, the orbital and spin angular momentum either align or antialign, so that the total angular momentum is J = |L ± S|, with the sign being determined by whether the shell of orbitals is more than half filled (+) or less than half filled (−).

The reason for this rule is not interaction physics, but is spin-orbit coupling. The Hamiltonian will typically have a spin-orbit term α l · σ, and the sign of α determines how the spin and orbit align to minimize the energy.17 Thus for the case of Pr, where L = 6 and S = 3/2 and the shell is less than half filled, we have total angular momentum J = |L − S| = 9/2.

One should be warned that people frequently refer to J as being the “spin” of the atom. This is a colloquial use which is very persistent but imprecise. More correctly, J is the total angular momentum of the electrons in the atom, whereas S is the spin component of J.
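The three rules as applied above can be summarized in a short sketch (my own encoding of the rules, for a single partially filled l-shell):

```python
# Sketch of Hund's three rules for n_elec electrons in a shell of angular momentum l.
from fractions import Fraction

def hund(l, n_elec):
    """Return (S, L, J) predicted by Hund's rules for a partially filled l-shell."""
    n_orb = 2 * l + 1
    assert 0 < n_elec <= 2 * n_orb
    # Rule 1: maximize S -- occupy each orbital singly (all spins up) before pairing.
    n_up = min(n_elec, n_orb)
    n_down = n_elec - n_up
    S = Fraction(n_up - n_down, 2)
    # Rule 2: maximize L -- within each spin species, fill the highest-lz orbitals first.
    lz_vals = list(range(l, -l - 1, -1))          # l, l-1, ..., -l
    L = sum(lz_vals[:n_up]) + sum(lz_vals[:n_down])
    # Rule 3: J = L + S if the shell is more than half filled, |L - S| otherwise.
    J = L + S if n_elec > n_orb else abs(L - S)
    return S, L, J
```

For Pr’s three f-electrons, `hund(3, 3)` reproduces S = 3/2, L = 6, J = 9/2, as derived in the text.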

18.2.1 Why Moments Align

We now return, as promised above, to discuss roughly why Hund’s rules work — in particular we want to know why magnetic moments (real spin moments or orbital moments) like to align with each other. This section will be only qualitative, but should give at least a rough idea of the right physics. Let us first focus on Hund’s first rule and ask why spins like to align. First of all, we emphasize that it has nothing to do with magnetic dipole interactions. While the magnetic dipoles of the spins do interact with each other, when dipole moments are on the order of the Bohr magneton, this energy scale becomes tiny — way too small to matter for anything interesting. Instead, the alignment comes from the Coulomb interaction energy. To see how this works, let us consider a wavefunction for two electrons on an atom.

17The fact that the sign switches at half filling does not signal a change in the sign of the underlying α (which is always positive) but rather is a feature of careful bookkeeping. So long as the shell remains less than half full, all of the spins are aligned, in which case we have ∑ᵢ lᵢ · σᵢ ∝ L · S, thus always favoring L counter-aligned with S. When the shell is half filled, L = 0. When we add one more spin to a half filled shell, this spin must counter-align with the many spins that comprise the half-filled shell due to the Pauli exclusion principle. The spin-orbit coupling lᵢ · σᵢ then makes this additional spin want to counter-align with its own orbital angular momentum lᵢ, which is equal to the total orbital angular momentum L since the half full shell has L = 0. This means that the orbital angular momentum is now aligned with the net spin, since most of the net spin is made up of the spins comprising the half-filled shell, and these are counter-aligned with the spin of the electron which has been added.

Naive Argument

The overall wavefunction must be antisymmetric by Pauli’s exclusion principle. We can generally write

Ψ(r1, σ1; r2, σ2) = ψorbital(r1, r2) χspin(σ1, σ2)

where ri are the particles’ positions and σi are their spins. Now, if the two spins are aligned, say both are spin-up (i.e., χspin(↑, ↑) = 1 and χspin = 0 for other spin configurations), then the spin wavefunction is symmetric and the spatial wavefunction ψorbital must be antisymmetric. As a result we have

ψorbital(r1, r2) → 0 as r1 → r2

So electrons with aligned spins cannot get close to each other, thus reducing the Coulomb energy of the system. The argument we have just given is frequently stated in textbooks. Unfortunately, it is not the whole story.

More Correct

In fact it turns out that the crucial Coulomb interaction is that between the electron and the nucleus. Consider the case where there are two electrons and a nucleus as shown in Fig. 18.1. What we see from this figure is that the positive charge of the nucleus seen by one electron is screened by the negative charge of the other electron. This screening reduces the binding energy of the electrons to the nucleus. However, when the two spins are aligned, the electrons repel each other and therefore screen the nucleus less effectively. In this case, the electrons see the full charge of the nucleus and bind more strongly, thus lowering their energies. Another way of understanding this is to realize that when the spins are not aligned, sometimes one electron gets between the other electron and the nucleus — thereby reducing the effective charge seen by the outer electron, reducing the binding energy, and increasing the total energy of the atom. However, when the electrons are spin aligned, the Pauli principle largely prevents this configuration from occurring, thereby lowering the total energy of the system. Hund’s second rule is driven by very similar considerations. When two electrons take states which maximize their total orbital angular momentum, they are more likely to be found on opposite sides of the nucleus. Thus the electrons see the nucleus fully unscreened, so that the binding energy is increased and the energy is lowered. One must be somewhat careful with these types of arguments, however — particularly when they are applied to molecules instead of atoms. In the case of a diatomic molecule, say H2, we have two electrons and two nuclei. While the screening effect discussed above still occurs, and tries to align the electrons, it is somewhat less effective than for two electrons on a single atom — since most of the time the two electrons are near opposite nuclei anyway.
Furthermore, there is a competing effect that tends to make the electrons want to anti-align. As we discussed in section 5.3.1 when we discussed covalent bonding, we can think of the two nuclei as being a square well (see Fig. 5.4), and the bonding is really a particle-in-a-box problem. There is some lowest energy (symmetric) wavefunction in this large two-atom box, and the lowest energy state of two electrons would be to have the two spins anti-aligned so that both electrons can go in the same low energy spatial wavefunction. It can thus be quite difficult to determine whether electrons on neighboring atoms want to be aligned or anti-aligned. Generally either behavior is possible. We will discuss this

Figure 18.1: Why Aligned Spins Have Lower Energy (Hund’s First Rule). In this figure, the wavefunction is depicted for one of the electrons whereas the other electron (the one further left) is depicted as having fixed position. When the two electrons have opposite spin, the effective charge of the nucleus seen by the fixed electron is reduced by the screening provided by the other electron (left figure). However, when the spins are aligned, the two electrons cannot come close to each other (right figure) and the fixed electron sees the full charge of the nucleus. As such, the binding of the fixed electron to the nucleus is stronger in the case where the two electrons are spin aligned, and it is therefore a lower energy configuration.

much further below in chapter 22. The energy difference between having the spins on two atoms aligned versus anti-aligned is usually known as the exchange interaction or exchange energy.18

18.3 Coupling of Electrons in Atoms to an External Field

Having discussed how electron moments (orbital or spin) can align with each other, we now turn to discuss how the electrons in atoms couple to an external magnetic field. In the absence of a magnetic field, the Hamiltonian for an electron in an atom is of the usual

18The astute reader will recall that atomic physicists use the word “exchange” to refer to what we called the hopping matrix element (see footnote 14 in section 5.3.2), which “exchanged” an electron from one orbital to another. In fact the current name is very closely related. Let us attempt a very simple calculation of the difference in energy between two electrons having their spins aligned and two electrons having their spins antialigned. Suppose we have two electrons on two different orbitals which we will call A and B. We write a general wavefunction as ψ = ψspatial χspin, and overall the wavefunction must be antisymmetric. If we choose the spins to be aligned (a triplet, which is symmetric), then the spatial wavefunction must be antisymmetric, which we can write as |AB⟩ − |BA⟩. On the other hand, if we choose the spins to be anti-aligned (a singlet, which is antisymmetric), then the spatial wavefunction must be symmetric, |AB⟩ + |BA⟩. When we add the Coulomb interaction, the energy difference between the singlet and triplet is proportional to the cross term ⟨AB|V|BA⟩. In this matrix element the two electrons have “exchanged” places. Hence the name.

form19

ℋ0 = p²/(2m) + V(r)

where V is the electrostatic potential from the nucleus (and perhaps from the other electrons as well). Now consider adding an external magnetic field. Recall that the Hamiltonian for a charged particle in a magnetic field B takes the minimal coupling form20

ℋ = (p + eA)²/(2m) + g µB B · σ + V(r)

where −e is the charge of the particle (the electron), σ is the electron spin, g is the electron g-factor (approximately 2), µB = eℏ/(2m) is the Bohr magneton, and A is the vector potential. For a uniform magnetic field, we may take A = (1/2) B × r, such that ∇ × A = B. We then have21

ℋ = p²/(2m) + V(r) + (e/2m) p · (B × r) + (e²/2m)(1/4)|B × r|² + g µB B · σ    (18.3)

The first two terms in this equation comprise the Hamiltonian ℋ0 in the absence of the applied magnetic field. The next term can be rewritten as

(e/2m) p · (B × r) = (e/2m) B · (r × p) = µB B · l    (18.4)

where ℏl = r × p is the orbital angular momentum of the electron. This can then be combined with the so-called Zeeman term g µB B · σ to give

ℋ = ℋ0 + µB B · (l + gσ) + (e²/2m)(1/4)|B × r|²    (18.5)

The second term on the right of this equation, known sometimes as the paramagnetic term, is clearly just the coupling of the external field to the total magnetic moment of the electron (both orbital and spin). Note that when a B-field is applied, these moments align with the B-field (meaning that l and σ anti-align with B) such that the energy is lowered by the application of the field22. As a result a moment is created in the same direction as the applied field, and this term results in paramagnetism. The final term of Eq. 18.5 is known as the diamagnetic term of the Hamiltonian, and will be responsible for the effect of diamagnetism. Since this term is quadratic in B it will always cause an increase in the total energy of the atom when the magnetic field is applied, and hence has the opposite effect from that of the above considered paramagnetic term. These two terms of the Hamiltonian are the ones responsible for both the paramagnetic and diamagnetic response of atoms to external magnetic fields. We will treat them each in turn in the next two sections. Keep in mind that at this point we are still considering the magnetic response of a single atom!
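As a small aside, the rewriting in Eq. 18.4 rests on the cyclic property of the scalar triple product, p · (B × r) = B · (r × p); a quick numerical check (mine, not from the text):

```python
# Numerical sanity check of the cyclic identity a.(b x c) = b.(c x a),
# here in the form p.(B x r) = B.(r x p) used in Eq. 18.4.
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

random.seed(0)
p, B, r = [tuple(random.uniform(-1, 1) for _ in range(3)) for _ in range(3)]
lhs = dot(p, cross(B, r))   # p . (B x r)
rhs = dot(B, cross(r, p))   # B . (r x p)
# lhs and rhs agree to floating-point precision
```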

19Again, whenever we discuss magnetism it is typical to use ℋ for the Hamiltonian so as not to confuse it with the magnetic field strength H = B/µ0. 20Recall that minimal coupling requires p → p − qA where q is the charge of the particle. Here our particle has charge q = −e. The negative charge is also responsible for the fact that the electron spin magnetic moment is anti-aligned with its spin. Hence it is lower energy to have the spin point opposite the applied magnetic field (hence the positive sign of the so-called Zeeman term g µB B · σ). Blame Ben Franklin. (See footnote 13 of section 4.3.) 21Note that while pi does not commute with ri, it does commute with rj for j ≠ i, so there is no ordering problem between p and B × r. 22If the sign of the magnetic moment confuses you, it is good to remember that the moment is always −∂F/∂B, and at zero temperature the free energy is just the energy.

18.4 Free Spin (Curie or Langevin) Paramagnetism

We will start by considering the effect of the paramagnetic term of Eq. 18.5. We assume that the unperturbed Hamiltonian ℋ0 has been solved and we need not pay attention to this part of the Hamiltonian — we are only concerned with the reorientation of a spin σ or an orbital angular momentum l of an electron. At this point we also disregard the diamagnetic term of the Hamiltonian, as its effect is generally weaker than that of the paramagnetic term.

Free Spin 1/2

As a review let us consider a simpler case that you are probably familiar with from your statistical physics course: a free spin-1/2. The Hamiltonian, you recall, of a single spin-1/2 is given by

ℋ = g µB B · σ    (18.6)

with g the g-factor of the spin, which we set to be 2, and µB = eℏ/(2m) the Bohr magneton. We can think of this as being a simplified version of the above paramagnetic term of Eq. 18.5, for a single free electron where we ignore the orbital moment. The eigenvalues of B · σ are ±B/2, so we have a partition function

Z = e^{−βµB B} + e^{+βµB B}    (18.7)

and a corresponding free energy F = −kB T log Z, giving us a magnetic moment (per spin) of

moment = −∂F/∂B = µB tanh(βµB B)    (18.8)

If we have many such atoms together in a volume, we can define the magnetization M to be the magnetic moment per unit volume. Then, at small field (expanding the tanh for small argument) we obtain a susceptibility of

χ = lim_{H→0} ∂M/∂H = n µ0 µB² / (kB T)    (18.9)

where n is the number of spins per unit volume (and we have used B ≈ µ0 H, with µ0 the permeability of free space). Expression 18.9 is known as the “Curie Law”23 susceptibility (actually any susceptibility of the form χ ∼ C/(kB T) for any constant C is known as a Curie law), and paramagnetism involving free spins like this is often called Curie paramagnetism or Langevin24 paramagnetism.
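The small-field limit leading from Eq. 18.8 to Eq. 18.9 can be checked numerically. The sketch below works in dimensionless units (µB = kB = µ0 = n = 1), a unit choice of mine purely for illustration:

```python
# Sketch: moment per spin from Eq. 18.8 and its small-field Curie slope, Eq. 18.9,
# in dimensionless units (mu_B = k_B = mu_0 = n = 1).
import math

def moment(B, T):
    """Moment per spin, mu_B * tanh(mu_B B / k_B T), in units mu_B = k_B = 1."""
    return math.tanh(B / T)

T = 2.0
B_small = 1e-6
chi_numeric = moment(B_small, T) / B_small   # small-field slope dM/dB
chi_curie = 1.0 / T                          # Curie law: chi = n mu_0 mu_B^2 / (k_B T)
# the two agree in the small-field limit; at large B the moment saturates at mu_B
```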

Free Spin J

The actual paramagnetic term in the Hamiltonian will typically be more complicated than our simple spin-1/2 model, Eq. 18.6. Instead, examining Eq. 18.5 and generalizing to multiple electrons

23 Named after Pierre Curie. Pierre's work on magnetism was well before he married his mega-brilliant wife Marie Sklodowska Curie. She won one physics Nobel with Pierre, and then another one in chemistry after he died. Halfway between the two prizes, Pierre was killed when he was run over by a horse-drawn vehicle while crossing the street (be careful!). 24 Paul Langevin was Pierre Curie's student. He is well known for many important scientific discoveries. He is also well known for creating quite the scandal by having an affair with Marie Curie a few years after her husband's death (Langevin was married at the time). Although the affair quickly ended, ironically, the grandson of Langevin married the granddaughter of Curie and they had a son — all three of them are physicists.

in an atom, we expect to need to consider a Hamiltonian of the form

ℋ = µ_B B · (L + gS)    (18.10)

where L and S are the orbital and spin components of all of the electrons in the atom put together. Recall that Hund's rules tell us the values of L, S, and J. The form of Eq. 18.10 looks a bit inconvenient, since Hund's third rule tells us not about L + gS but rather about J = L + S. Fortunately, for the type of matrix elements we are concerned with (reorientations of J without changing the value of J, S, or L, which are dictated by Hund's rules) the above Hamiltonian Eq. 18.10 turns out to be precisely equivalent to

ℋ = g̃ µ_B B · J    (18.11)

where g̃ is an effective g-factor given by25

g̃ = ½(g + 1) + ½(g − 1) [S(S + 1) − L(L + 1)] / [J(J + 1)]

From our new Hamiltonian, it is easy enough to construct the partition function

Z = Σ_{Jz = −J}^{J} e^{β g̃ µ_B B Jz}    (18.12)

Analogous to the spin-1/2 case above, one can differentiate to obtain the moment as a function of temperature. If one considers a density n of these atoms, one can then determine the magnetization and the susceptibility (this is assigned as an "Additional Problem" for those who are interested). The result, of the Curie form, is that the susceptibility per unit volume is given by

χ = n µ0 (g̃ µ_B)² J(J + 1) / (3 k_B T)

(compare Eq. 18.9). Note that a Curie law susceptibility always diverges at low temperature26. If this term is nonzero (i.e., if J is nonzero) then the Curie paramagnetism is dominant compared to any other type of paramagnetism or diamagnetism27.
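The sum in Eq. 18.12 is easy to evaluate numerically. The sketch below is my own check, not part of the text; the density, temperature, and field are illustrative, and I take g̃ = 0.8, which follows from the formula above with g = 2, L = 5, S = 1, J = 4. It differentiates Z to get the moment and confirms the J(J + 1)/3 Curie form at small field:

```python
import math

mu_B, k_B, mu_0 = 9.274e-24, 1.381e-23, 4e-7 * math.pi  # SI, rounded

def moment_spin_J(J, g, B, T):
    """moment = -dF/dB = (1/beta) dlnZ/dB, with Z from Eq. 18.12
    summed directly over Jz = -J, -J+1, ..., +J (integer J assumed here)."""
    beta = 1.0 / (k_B * T)
    Jzs = [-J + m for m in range(2 * J + 1)]
    Z = sum(math.exp(beta * g * mu_B * B * Jz) for Jz in Jzs)
    dZdB = sum(beta * g * mu_B * Jz * math.exp(beta * g * mu_B * B * Jz) for Jz in Jzs)
    return dZdB / (beta * Z)

J, g, n, T, B = 4, 0.8, 1e28, 300.0, 1e-5   # illustrative values
chi_numeric = n * mu_0 * moment_spin_J(J, g, B, T) / B
chi_curie_J = n * mu_0 * (g * mu_B) ** 2 * J * (J + 1) / (3 * k_B * T)
print(chi_numeric, chi_curie_J)
```

The two susceptibilities agree to high precision, since µ_B B ≪ k_B T here.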

25 You probably do not need to memorize this formula for this course, although you might have to know it for atomic physics! The derivation of this formula is not difficult though. We are concerned with determining matrix elements of B · (L + gS) between different Jz states. To do this we write

B · (L + gS) = B · J [ (L · J)/|J|² + g (S · J)/|J|² ]

The final bracket turns out to be just a number, which we evaluate by rewriting it as

(|J|² + |L|² − |J − L|²)/(2|J|²) + g (|J|² + |S|² − |J − S|²)/(2|J|²)

Finally, replacing J − L = S and J − S = L, then substituting in |J|² = J(J + 1), |S|² = S(S + 1), and |L|² = L(L + 1), a small bit of algebra gives the desired result. 26 The current calculation is a finite temperature thermodynamic calculation resulting in a divergent susceptibility at zero temperature. In the next few sections we will study Larmor and Landau diamagnetism as well as Pauli and Van Vleck paramagnetism. All of these calculations will be zero temperature quantum calculations and will always give much smaller finite susceptibilities. 27 Not including superconductivity.

Aside: From Eqs. 18.7 or 18.12 we notice that the partition function of a free spin is only a function of the dimensionless ratio µ_B B/(k_B T). From this we can derive that the entropy S is also a function only of the same dimensionless ratio. Let us imagine now that we have a system of free spins at magnetic field B and temperature T, and we thermally isolate it from the environment. If we adiabatically reduce B, then since S must stay fixed, the temperature must drop proportionally to the reduction in B. This is the principle of the adiabatic demagnetization refrigerator.28
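This is a one-line check in code. The closed-form entropy S/k_B = log(2 cosh x) − x tanh x, with x = µ_B B/(k_B T), follows from F = −k_B T log Z for the spin-1/2 partition function Eq. 18.7; the derivation is mine, not spelled out in the text, and the field and temperature values are illustrative:

```python
import math

mu_B, k_B = 9.274e-24, 1.381e-23   # SI, rounded

def entropy_per_spin(B, T):
    """S = -dF/dT for Z = 2 cosh(x), x = mu_B B / (k_B T):
    S/k_B = log(2 cosh x) - x tanh x, a function of the ratio x only."""
    x = mu_B * B / (k_B * T)
    return k_B * (math.log(2 * math.cosh(x)) - x * math.tanh(x))

# Adiabatically halving B at fixed entropy means T must also halve:
S_initial = entropy_per_spin(1.0, 4.0)   # B = 1 T, T = 4 K
S_final = entropy_per_spin(0.5, 2.0)     # both halved: same x, so same S
print(S_initial, S_final)
```

Since the entropy depends on B and T only through their ratio, the two values printed are identical, which is exactly why reducing B at fixed S drags T down with it.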

18.5 Larmor Diamagnetism

Since Curie paramagnetism is dominant whenever J ≠ 0, the only time we can possibly observe diamagnetism is if an atom has J = 0. A classic situation in which this occurs is for atoms with filled shell configurations, like the noble gases, where L = S = J = 0. Another possibility is that J = 0 even though L = S is nonzero (one can use Hund's rules to show that this occurs if a shell has one electron fewer than being half filled). In either case, the paramagnetic term of Eq. 18.5 has zero expectation and can be mostly ignored29. We thus need to consider the effect of the final term in Eq. 18.5, the diamagnetic term. If we imagine that B is applied in the ẑ direction, the expectation of the diamagnetic term of the Hamiltonian (Eq. 18.5) can be written as

δE = (e²/8m) ⟨|B × r|²⟩ = (e²B²/8m) ⟨x² + y²⟩

Using the fact that the atom is rotationally symmetric, we can write

⟨x² + y²⟩ = (2/3)⟨x² + y² + z²⟩ = (2/3)⟨r²⟩

Thus we have

δE = (e²B²/12m) ⟨r²⟩

Thus the magnetic moment per electron is

moment = −dE/dB = −(e²/6m) ⟨r²⟩ B

28 Very low temperature adiabatic demagnetization refrigerators usually rely on using nuclear moments rather than electronic moments. The reason for this is that the (required) approximation of spins being independent holds down to much lower temperature for nuclei, which are typically quite decoupled from their neighbors. Achieving nuclear temperatures below 1 µK is possible with this technique. 29 Actually, to be more precise, even though ⟨J⟩ may be zero, the paramagnetic term in Eq. 18.5 may be important in second order perturbation theory. At second order, the energy of the system will be corrected by a term

δE₀ ∼ Σ_{p>0} |⟨p| B · (L + gS) |0⟩|² / (E₀ − E_p)

This contribution need not vanish. It is largest when there is a low energy excitation E_p so the denominator can be small. Since this energy decreases with increasing B, this term is paramagnetic. At any rate, this contribution is often important in the cases where J = 0 but L and S are individually nonzero — as this usually implies there is a low energy excitation that can occur by misorienting L and S with respect to each other, thus violating Hund's third rule only. However, for atoms like noble gases, where L and S are individually zero, there are no low energy excitations and this contribution is negligible. This type of paramagnetism is known as Van Vleck paramagnetism after the Nobel Laureate J. H. Van Vleck, who was a professor at Balliol College Oxford in 1961–1962 but spent most of his later professional life at Harvard.

Assuming that there is a density ρ of such electrons in a system, we can then write the susceptibility as

χ = −ρ e² µ0 ⟨r²⟩ / (6m)    (18.13)

This result, Eq. 18.13, is known as Larmor diamagnetism.30 For most atoms, ⟨r²⟩ is on the order of a few Bohr radii squared. In fact, the same expression can sometimes be applied for large conductive molecules if the electrons can freely travel the length of the molecule — by taking ⟨r²⟩ to be the radius squared of the molecule instead of that of the atom.
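Plugging numbers into Eq. 18.13 shows just how weak this effect is. The electron density and ⟨r²⟩ below are round illustrative values of my own choosing, not numbers from the text:

```python
import math

e, m_e = 1.602e-19, 9.109e-31   # electron charge (C) and mass (kg), rounded
mu_0 = 4e-7 * math.pi           # permeability of free space
a_0 = 0.529e-10                 # Bohr radius (m)

def chi_larmor(rho, r2):
    """Eq. 18.13: chi = -rho e^2 mu_0 <r^2> / (6 m), dimensionless in SI."""
    return -rho * e ** 2 * mu_0 * r2 / (6 * m_e)

# ~1e29 electrons per m^3 with <r^2> of a few Bohr radii squared:
chi = chi_larmor(1e29, 3 * a_0 ** 2)
print(chi)   # small, negative, and temperature independent
```

The result comes out around −5 × 10⁻⁶, the typical tiny magnitude of diamagnetic susceptibilities, to be contrasted with the divergent Curie law at low temperature.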

18.6 Atoms in Solids

Up to this point, we have always been considering the magnetism (paramagnetism or diamagnetism) of a single isolated atom. Although the atomic picture gives a very good idea of how magnetism occurs, the situation in solids can be somewhat different. As we have discussed in chapters 14 and 15, when atoms are put together the electronic band structure defines the physics of the material — we cannot usually think of atoms as being isolated from each other. We thus must think a bit more carefully about how our above atomic calculations may or may not apply to real materials.

18.6.1 Pauli Paramagnetism in Metals

Recall that in section 4.3 we calculated the susceptibility of the free electron gas. We found

χ_Pauli = µ0 µ_B² g(E_F)    (18.14)

with g(E_F) the density of states at the Fermi surface. We might expect that such an expression would hold for metals with nontrivial band structure — only the density of states would need to be modified. Indeed, such an expression holds fairly well for simple metals such as Li or Na. Note that the susceptibility, per spin, of a Fermi gas (Eq. 18.14) is smaller than the susceptibility of a free spin (Eq. 18.9) by roughly a factor of T/E_F (this can be proven using Eq. 4.11 for a free electron gas). We should be familiar with this idea: due to the Pauli exclusion principle, only the small fraction of spins near the Fermi surface can be flipped over, therefore giving a small susceptibility.

18.6.2 Diamagnetism in Solids

Our above calculation of Larmor diamagnetism was applied to isolated atoms each having J = L = S = 0, such as noble gas atoms. At low temperature noble gas atoms form very weakly bonded crystals and the same calculation continues to apply (with the exception of helium, which does not crystalize but rather forms a superfluid at low temperature). To apply the above result Eq. 18.13 to a noble gas crystal, one simply sets the density of electrons ρ to be equal to the density of atoms n times the number of electrons per atom (the atomic number) Z. Thus for noble gas atoms we obtain

χ_Larmor = −Z n e² µ0 ⟨r²⟩ / (6m)    (18.15)

where ⟨r²⟩ is set by the atomic radius.

30 Joseph Larmor was a rather important physicist in the late 1800s. Among other things, he published the Lorentz transformations for time dilation and length contraction two years before Lorentz, and seven years before Einstein. However, he insisted on the aether, and rejected relativity at least until 1927 (maybe longer).

In fact, for any material the diamagnetic term of the Hamiltonian (the coupling of the orbital motion to the magnetic field) will result in some amount of diamagnetism. To account for the diamagnetism of electrons in core orbitals, Eq. 18.15 is usually fairly accurate. For the conduction electrons in a metal, however, a much more complicated calculation gives the so-called Landau diamagnetism (see footnote 12 of chapter 3)

χ_Landau = −(1/3) χ_Pauli

which combines with the Pauli paramagnetism to reduce the total paramagnetism of the conduction electrons by 1/3. If one considers, for example, a metal like copper, one might be tempted to conclude that it should be a paramagnet, due to the above described Pauli paramagnetism (corrected by the Landau effect). However, copper is actually a diamagnet! The reason for this is that the core electrons in copper have enough Larmor diamagnetism to overwhelm the Pauli paramagnetism of the conduction electrons! In fact, Larmor diamagnetism is often strong enough to overwhelm Pauli paramagnetism in metals (this is particularly true in heavy elements, where there are many core electrons that can contribute to the diamagnetism). Note however, that if there are free spins in the material, then Curie paramagnetism occurs, which is always stronger than any diamagnetism27.

18.6.3 Curie Paramagnetism in Solids

Where to find free spins?

As discussed above, Curie paramagnetism describes the reorientation of free spins in an atom. We might ask how a "free spin" can occur in a solid. Our understanding of electrons in solids so far describes electrons as being either in full bands, in which case they cannot be flipped over at all; or in partially full bands, in which case the calculation of the Pauli susceptibility in section 4.3 is valid — albeit possibly with a modified density of states at the Fermi surface to reflect the details of the band structure (and with the Landau correction). So how is it that we can have a free spin? Let us think back to the description of Mott insulators in section 15.4. In these materials, the Coulomb interaction between electrons is strong enough that no two electrons can doubly occupy the same site of the lattice. As a result, having one electron per site results in a "traffic jam" of electrons where no electron can hop to any other site. When this sort of Mott insulator forms, there is exactly one electron per site, which can be either spin-up or spin-down. Thus we have a free spin on each site, exactly as we considered in the previous section!31 More generally we might expect that we could have some number N of valence electrons per atom, which fill orbitals to form free spins as dictated by Hund's rules. Again, if the Coulomb interaction is sufficiently strong that electrons cannot hop to neighboring sites, then the system will be Mott insulating and we can think of the spins as being free.

31 This picture of a Mott insulator resulting in independent free spins will be examined more closely in chapter 22. Very weakly, some amount of (virtual) hopping can always occur, and this will change the behavior at low enough temperatures.

Modifications of Free Spin Picture

Given that we have found free spins in a material, we can ask whether there are substantial differences between a free spin in an isolated atom and a free spin in a material. One possible modification is that the number of electrons on an atom becomes modified in a material. For example, we found above in section 18.2 that praseodymium (Pr) has three free electrons in its valence (4f) shell, which form a total angular momentum of J = 9/2. However, in many compounds Pr exists as a +3 ion. In this case it turns out that both of the 6s electrons are donated as well as a single f electron. This leaves the Pr atom with two electrons in its f shell, thus resulting in a J = 4 angular momentum instead (you should be able to check this with Hund's rules). Another possibility is that the atoms are no longer in a rotationally symmetric environment; they see the potential due to neighboring atoms, the so-called "crystal field". In this case orbital angular momentum is not conserved and the degeneracy of states all having the same L² is broken, a phenomenon known as crystal field splitting. As a (very) cartoon picture of this physics, we can imagine a crystal which is highly tetragonal (see Fig. 11.11) where the lattice constant in one direction is quite different from the constant in the other two. We might imagine that an atom living inside such an elongated box would have a lower energy if its orbital angular momentum pointed along the long axis (say, the z-axis), rather than in some other direction. In this case, we might imagine that Lz = +L and Lz = −L might be lower energy than any of the other possible values of Lz. Another thing that may happen due to crystal field splitting is that the orbital angular momentum may be pinned to have zero expectation (for example, if the ground state is a superposition of Lz = +L and Lz = −L). In this case, the orbital angular momentum decouples from the problem completely (a phenomenon known as quenching of the orbital angular momentum), and the only magnetically active degrees of freedom are the spins. This is precisely what happens for most transition metals.32 The most important moral to take home from this section is that paramagnets can have many different effective values of J, and one needs to know the microscopic details of the system before deciding which spin and orbital degrees of freedom are active.
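The Hund's-rules bookkeeping quoted above (J = 9/2 for three f electrons, J = 4 for two) is mechanical enough to code up. The helper below is my own sketch of the three rules for a single partially filled shell of orbital quantum number l:

```python
def hund(n, l):
    """Apply Hund's rules to n electrons in a shell of orbital quantum number l.
    Returns (S, L, J): first maximize S, then maximize L, then
    J = |L - S| if the shell is at most half filled, else J = L + S."""
    n_orb = 2 * l + 1
    n_up = min(n, n_orb)     # rule 1: fill all spins up before any down
    n_dn = n - n_up
    S = (n_up - n_dn) / 2
    ms = list(range(l, -l - 1, -1))          # m_l = l, l-1, ..., -l
    L = sum(ms[:n_up]) + sum(ms[:n_dn])      # rule 2: fill largest m_l first
    J = abs(L - S) if n <= n_orb else L + S  # rule 3
    return S, L, J

# f shell has l = 3: neutral Pr has 3 valence f electrons, Pr3+ has 2
print(hund(3, 3))   # S = 3/2, L = 6, J = 9/2
print(hund(2, 3))   # S = 1,   L = 5, J = 4
```

This reproduces the quoted values, though as the section stresses, crystal fields and ionization can make the effective J in a solid quite different from this free-atom answer.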

18.7 Summary of Atomic Magnetism; Paramagnetism and Diamagnetism

• Susceptibility χ = dM/dH is positive for paramagnets and negative for diamagnets.

• Sources of paramagnetism: (a) Pauli paramagnetism of the free electron gas (see section 4.3); (b) free spin paramagnetism – know how to do the simple statmech exercise of calculating the paramagnetism of a free spin.

• The magnitude of the free spin is determined by Hund's rules. The bonding of the atom, or environment of this atom (crystal field), can modify this result.

32 The 3d shell of transition metals is shielded from the environment only by the 4s electrons, whereas for rare earths the 4f shell is shielded by the 6s and 5p. Thus the transition metals are much more sensitive to crystal field perturbations than the rare earths.

• Larmor diamagnetism can occur when atoms have J = 0, therefore not having strong paramagnetism. This comes from the diamagnetic term of the Hamiltonian in first order perturbation theory. The diamagnetism per electron is proportional to the square of the radius of the orbital.

References

• Ibach and Luth, section 8.1
• Hook and Hall, chapter 7
• Ashcroft and Mermin, chapter 31
• Kittel, chapter 11
• Blundell, chapter 2
• Burns, chapter 15A
• Goodstein, section 5.4a–c (doesn't cover diamagnetism)
• Rosenberg, chapter 11 (doesn't cover diamagnetism)

Chapter 19

Spontaneous Order: Antiferro-, Ferri-, and Ferro-Magnetism

In section 18.2.1 we commented that applying Hund's rules to molecules can be quite dangerous, since spins on neighboring atoms could favor either having their spins aligned or having their spins anti-aligned, depending on which of several effects is stronger. In chapter 22 below we will show models of how either behavior might occur. In this chapter we will assume there is an interaction between neighboring spins (a so-called exchange interaction, see footnote 18 from section 18.2.1) and we will explore how the interaction between neighboring spins aligns spins on a lattice. We first assume that we have an insulator, i.e., electrons do not hop from site to site1. We then write a model Hamiltonian as

ℋ = −(1/2) Σ_{i,j} J_ij S_i · S_j + g µ_B B · Σ_i S_i    (19.1)

where S_i is the spin2 on site i and B is the magnetic field experienced by the spins3. Here J_ij S_i · S_j is the interaction energy4 between spin i and spin j. Note that we have included a factor of 1/2 out front to avoid overcounting, since the sum actually counts both J_ij and J_ji (which are equal).

If J_ij > 0 then it is lower energy when spins i and j are aligned, whereas if J_ij < 0 then it is lower energy when the spins are anti-aligned. The coupling between spins typically drops rapidly as the distance between the spins increases. A good model to use is one where only nearest neighbor spins interact with each other.

1 This might be the situation if we have a Mott insulator, as described in sections 15.4 and 18.6.3 above, where strong interaction prevents electron hopping. 2 When one discusses simplified models of magnetism, very frequently one writes angular momentum as S without regard as to whether it is really S, or L, or J. It is also conventional to call this variable the "spin" even if it is actually from orbital angular momentum in a real material. 3 Once again the plus sign in the final term assumes that we are talking about electronic moments. (See footnote 13 of section 4.3.) 4 WARNING: Many references use Heisenberg's original convention that the interaction energy should be defined as 2J_ij S_i · S_j rather than J_ij S_i · S_j. However, more modern researchers use the latter, as we have here. This matches up with the convention used for the Ising model below, Eq. 19.5, where the convention 2J is never used. At any rate, if someone on the street tells you J, you should ask whether they intend a factor of 2 or not.


Frequently one writes (neglecting the magnetic field B)

H = −(1/2) Σ_{i,j neighbors} J_ij S_i · S_j

or, using brackets ⟨i, j⟩ as a shorthand to indicate that i and j are neighbors,

H = −(1/2) Σ_{⟨i,j⟩} J_ij S_i · S_j

In a uniform system where each spin is coupled to its neighbors with the same strength, we can drop the indices from J_ij (since they all have the same value) and we obtain the so-called Heisenberg Hamiltonian

H = −(1/2) Σ_{⟨i,j⟩} J S_i · S_j    (19.2)
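As a concrete check of the sign conventions (a toy sketch I have added; the chain length is arbitrary): treating the spins as classical unit vectors, J > 0 favors the aligned configuration and J < 0 the anti-aligned one. Note that the 1/2 in Eq. 19.2 cancels against counting each neighbor pair twice, so summing once over bonds needs no 1/2:

```python
def heisenberg_energy(spins, J, bonds):
    """Classical E = -J * sum over bonds of S_i . S_j  (each bond counted once,
    which absorbs the factor of 1/2 in Eq. 19.2)."""
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    return -J * sum(dot(spins[i], spins[j]) for i, j in bonds)

N = 6
bonds = [(i, i + 1) for i in range(N - 1)]   # open 1D chain, 5 bonds
up, down = (0.0, 0.0, 1.0), (0.0, 0.0, -1.0)
aligned = [up] * N
alternating = [up if i % 2 == 0 else down for i in range(N)]

E_ferro = heisenberg_energy(aligned, +1.0, bonds)       # J > 0: aligned wins
E_ferro_alt = heisenberg_energy(alternating, +1.0, bonds)
print(E_ferro, E_ferro_alt)   # -5.0  5.0
```

Flipping the sign of J exactly swaps which configuration is the ground state, which is the dichotomy explored in the next section.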

19.1 (Spontaneous) Magnetic Order

As in the case of a ferromagnet, it is possible that even in the absence of any applied magnetic field, magnetism — or ordering of magnetic moments — may occur. This type of phenomenon is known as spontaneous magnetic order (since it occurs without application of any field). It is a subtle statistical mechanical question as to when magnetic interaction in a Hamiltonian actually results in spontaneous magnetic order. At our level of analysis we will assume that systems can always find ground states which “satisfy” the magnetic Hamiltonian. In chapter 21 we will consider how temperature might destroy this magnetic ordering.

19.1.1 Ferromagnets

As mentioned above, if J > 0 then neighboring spins want to be aligned. In this case the ground state is when all spins align together developing a macroscopic magnetic moment — this is what we call a ferromagnet, and is depicted on the left of Fig. 19.1. We will return to study these further in sections 20.1 and 21 below.

19.1.2 Antiferromagnets

On the other hand, if J < 0, neighboring spins want to point in opposite directions, and the most natural ordered arrangement is a periodic situation where alternating spins point in opposite directions, as shown on the right of Fig. 19.1 — this is known as an antiferromagnet. Such an antiferromagnet has zero net magnetization yet is magnetically ordered. This type of antiperiodic ground state is sometimes known as a Néel state after Louis Néel, who first proposed that these states exist5. We should be cautioned that our picture of spins pointing in definite directions is a classical picture, and is not quite right quantum mechanically. Particularly when the spin is small (like spin-1/2) the effects of quantum mechanics are strong and classical intuition can fail us. We will have a homework problem that shows that this classical picture of the antiferromagnet is not quite right, although it is fairly good when the spin on each site is larger than 1/2.

5 Néel won a Nobel prize for this work in 1970.

Figure 19.1: Magnetic Spin Orderings. Left: Ferromagnet — all spins aligned (at least over some macroscopic regions) giving finite magnetization. Middle: Antiferromagnet — neighboring spins antialigned, but periodic. This so-called Néel state has zero net magnetization.

Detecting Antiferromagnetism with Diffraction

Being that antiferromagnets have zero net magnetization, how do we know they exist? What is their signature in the macroscopic world? For homework (see also section 21.2.2) we will explore a very nice method of determining that something is an antiferromagnet by examining its susceptibility as a function of temperature (in fact it was this type of experiment that Néel was analyzing when he realized that antiferromagnets exist). However, this method is somewhat indirect. A more direct approach is to examine the spin configuration using diffraction of neutrons. As mentioned in section 13.2, neutrons are sensitive to the spin direction of the object they scatter from. If we fix the spin polarization of an incoming neutron, it will scatter differently from the two different possible spin states of atoms in an antiferromagnet. The neutrons then see that the unit cell in this antiferromagnet is actually of size 2a, where a is the distance between atoms (i.e., the distance between two atoms with the same spin is 2a). Thus when the spins align antiferromagnetically, the neutrons will develop scattering peaks at reciprocal wavevectors G = 2π/(2a), which would not exist if all the atoms were aligned the same way. This type of neutron diffraction experiment is definitive in showing that antiferromagnetic order exists6.
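The doubled unit cell shows up directly in a toy structure-factor calculation. The sketch below is my own illustration, not from the text: a 1D chain of scatterers with spin-dependent amplitudes ±1, where the Néel pattern produces a peak at G = 2π/(2a) and the uniformly aligned pattern gives nothing there:

```python
import cmath, math

def scattering_intensity(moments, G, a):
    """|sum_n m_n exp(i G n a)|^2 for a 1D chain of spin-dependent scatterers."""
    return abs(sum(m * cmath.exp(1j * G * n * a) for n, m in enumerate(moments))) ** 2

a, N = 1.0, 8
aligned = [1.0] * N
antialigned = [(-1) ** n for n in range(N)]   # Neel order: unit cell of size 2a

G_half = 2 * math.pi / (2 * a)   # the new "antiferromagnetic" reciprocal wavevector
print(scattering_intensity(aligned, G_half, a))      # essentially zero
print(scattering_intensity(antialigned, G_half, a))  # strong peak, ~ N^2
```

The extra Bragg peaks at half-integer positions (in units of 2π/a) are precisely the experimental fingerprint of antiferromagnetic order described above.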

Frustrated Antiferromagnets

On certain lattices, for certain interactions, there is no ground state that fully "satisfies" the interaction for all spins. For example, on a triangular lattice with an antiferromagnetic interaction on the bonds, there is no way that all the spins can point in the opposite direction from their neighbors. As shown on the left of Fig. 19.2, on a triangle, once two of the spins are aligned opposite each other, then independent of which direction the spin on the last site points, it cannot be antiparallel to both of its neighbors. It turns out that (assuming the spins are classical variables) the ground state of the antiferromagnetic Heisenberg Hamiltonian

on a triangle is the configuration shown on the right of Fig. 19.2. While each bond is not quite optimally anti-aligned, the overall energy is optimal for this Hamiltonian7.

6 These are the experiments that won the Nobel prize for Clifford Shull. See footnote 9 from chapter 13.

Figure 19.2: Cartoon of a Triangular Antiferromagnet. Left: An antiferromagnetic interaction on a triangular lattice is frustrated — not all spins can be antialigned with all of their neighbors. Right: The ground state of an antiferromagnetic interaction on a triangle for classical spins (large S) is the state shown, where spins are at 120° to their neighbors.
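A brute-force scan makes the 120° claim concrete (my own check of the exercise in footnote 7, for classical unit spins confined to a plane): each bond settles for S_i · S_j = −1/2 rather than the unreachable −1.

```python
import math

def triangle_energy(t1, t2, t3, J=-1.0):
    """E = -J * (S1.S2 + S2.S3 + S3.S1) for in-plane unit spins at angles t_i,
    so S_i . S_j = cos(t_i - t_j). J < 0 is antiferromagnetic."""
    c = math.cos
    return -J * (c(t1 - t2) + c(t2 - t3) + c(t3 - t1))

# Fix t1 = 0 (global rotations cost no energy) and scan the other two angles:
steps = 240
grid = 2 * math.pi / steps
E_min = min(triangle_energy(0.0, i * grid, j * grid)
            for i in range(steps) for j in range(steps))

E_120 = triangle_energy(0.0, 2 * math.pi / 3, 4 * math.pi / 3)
print(E_min, E_120)   # both close to -1.5, i.e. -1/2 per bond times 3 bonds
```

The fully anti-aligned value of −3 (three bonds at −1 each) is never reached; the compromise at 120° between every pair is the best the triangle can do.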

19.1.3 Ferrimagnetism

Once one starts to look for magnetic structure in materials, one can find many other interesting possibilities. One very common possibility is where you have a unit cell with more than one variety of atom, where the atoms have differing moments, and although the ordering is antiferromagnetic (neighboring spins point in opposite directions) there is still a net magnetic moment. An example of this is shown in Fig. 19.3. Here, the red atoms have a smaller moment than the green atoms and point opposite the green atoms. This type of configuration, where one has antiferromagnetic order yet a net magnetization due to the differing spin species, is known as ferrimagnetism. In fact, many of the most common magnets, such as magnetite (Fe3O4), are ferrimagnetic. Sometimes people speak of ferrimagnets as being a subset of ferromagnets (since they have a nonzero net magnetic moment in zero field), whereas other people think the word "ferromagnet" excludes the category of ferrimagnets.8

19.2 Breaking Symmetry

In any of these ordered states, we have not yet addressed the question of which direction the spins will actually point. Strictly speaking, the Hamiltonian Eq. 19.2 is rotationally symmetric — the magnetization can point in any direction and the energy will be the same! In a real system, however, this is rarely the case: due to the asymmetric environment the atom feels within the lattice, there will be some directions that the spins would rather point than others (This physics

7 Try showing this! 8 The fact that the scientific community cannot come to agreement on so many definitions does make life difficult sometimes. However, such disagreements inevitably come from the fact that many different communities, from high energy physicists to chemists, are interested in this type of physics. Coming from such diverse backgrounds, it is perhaps more surprising that there aren't even more disagreements!

Figure 19.3: Cartoon of a Ferrimagnet: Ordering is antiferromagnetic, but because the different spin species have different moments, there is a net magnetization.

was also discussed above in section 18.6). Thus to be more accurate we might need to add an additional term to the Heisenberg Hamiltonian. One possibility is to write9

H = −(1/2) Σ_{⟨i,j⟩} J S_i · S_j − κ Σ_i (S_i^z)²    (19.3)

(again dropping any external magnetic field). The κ term here favors the spin to be pointing in the +ẑ direction or the −ẑ direction, but not in any other direction. (You could imagine this being appropriate for a tetragonal crystal elongated in the ẑ direction.) This energy from the κ term is sometimes known as the anisotropy energy, since it favors certain directions over others. Another possible Hamiltonian is

H = −(1/2) Σ_{⟨i,j⟩} J S_i · S_j − κ̃ Σ_i [(S_i^x)⁴ + (S_i^y)⁴ + (S_i^z)⁴]    (19.4)

which favors the spin pointing along any of the orthogonal axis directions — but not towards any in-between angle. In some cases (as we discussed in section 18.6) the coefficient κ may be substantial. In other cases it may be very small. However, since the pure Heisenberg Hamiltonian Eq. 19.2 does not prefer any particular direction, even if the anisotropy (κ) term is extremely small, it will determine the direction of the magnetization in the ground state. We say that this term "breaks the symmetry." Of course, there may be some symmetry remaining. For example, in Eq. 19.3, if the interaction is ferromagnetic, the ground state magnetization may be all spins pointing in the +ẑ direction, or, equally favorably, all spins pointing in the −ẑ direction.

19.2.1 Ising Model

If the anisotropy (κ) term is extremely large, then this term can fundamentally change the Hamiltonian. For example, let us take a spin-S Heisenberg model. Adding the κ term in Eq. 19.3 with a

9 For small values of the spin quantum number, these added interactions may be trivial. For example, for spin 1/2, we have (S^x)² = (S^y)² = (S^z)² = 1/4, a constant. However, as S becomes larger, the spin becomes more like a classical vector and such κ terms will favor the spin pointing in the corresponding directions.

large coefficient forces the spin to be either S^z = +S or S^z = −S, with all other values of S^z having a much larger energy. In this case, a new effective model may be written

ℋ = −(1/2) Σ_{⟨i,j⟩} J σ_i σ_j + g µ_B B Σ_i σ_i    (19.5)

where σ_i = ±S only (and we have re-introduced the magnetic field B). This model is known as the Ising model10 and is an extremely important model in statistical physics11.
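Footnote 11 below dares you to compute the free energy of the one-dimensional Ising model at finite temperature. As a numerical warm-up (my own sketch, using σ = ±1, zero field, a ring of N sites, and units with k_B = 1), one can check the transfer-matrix answer Z = λ₊ᴺ + λ₋ᴺ against brute-force enumeration:

```python
import math
from itertools import product

def Z_exact(N, J, T):
    """Brute-force partition function of an N-site Ising ring,
    H = -J sum_i s_i s_{i+1} (periodic), s_i = +/-1, with k_B = 1."""
    beta = 1.0 / T
    total = 0.0
    for s in product((-1, 1), repeat=N):
        E = -J * sum(s[i] * s[(i + 1) % N] for i in range(N))
        total += math.exp(-beta * E)
    return total

def Z_transfer(N, J, T):
    """Transfer-matrix result: Z = lam_+^N + lam_-^N with
    lam_+/- = e^{beta J} +/- e^{-beta J} (eigenvalues of the 2x2 matrix)."""
    beta = 1.0 / T
    lam_p = math.exp(beta * J) + math.exp(-beta * J)
    lam_m = math.exp(beta * J) - math.exp(-beta * J)
    return lam_p ** N + lam_m ** N

N, J, T = 10, 1.0, 2.0
print(Z_exact(N, J, T), Z_transfer(N, J, T))
```

The two agree to machine precision; in the thermodynamic limit only λ₊ survives, giving the free energy per site f = −T log(2 cosh(J/T)).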

19.3 Summary of Magnetic Orders

• Ferromagnets: spins align. Antiferromagnets: spins antialign with neighbors, so there is no net magnetization. Ferrimagnets: spins antialign with neighbors, but alternating spins have different magnitudes so there is a net magnetization anyway. Microscopic spin structures of this sort can be observed with neutrons.

• Useful model Hamiltonians include Heisenberg (−J S_i · S_j) for isotropic spins, and Ising (−J S_i^z S_j^z) for spins that prefer to align along only one axis.

• Spins generally do not favor all directions equally (as the Heisenberg model suggests). Terms that favor spins along particular axes may be weak or strong. Even if they are weak, they will pick a direction among otherwise equally likely directions.

References

• Blundell, sections 5.1–5.3 (Very nice discussion, but covers mean field theory at the same time, which we will cover in chapter 21 below.)
• Burns, sections 15.4–15.8 (same comment).

10 "Ising" is properly pronounced "Ee-sing" or "Ee-zing". In the United States it is habitually mispronounced "Eye-sing". The Ising model was actually invented by Wilhelm Lenz (another example of Stigler's law, see footnote 10 in section 14.2). Ising was the graduate student who worked on this model for his graduate dissertation. 11 The Ising model is frequently referred to as the "hydrogen atom" of statistical mechanics since it is extremely simple, yet it shows many of the most important features of complex statistical mechanical systems. The one dimensional version of the model was solved by Ising in 1925, and the two dimensional version of the model was solved by Onsager in 1944 (a chemistry Nobel Laureate, who was amusingly fired by my alma mater, Brown University, in 1933). Onsager's achievement was viewed as so important that Wolfgang Pauli wrote after World War II that "nothing much of interest has happened [in physics during the war] except for Onsager's exact solution of the two-dimensional Ising model." (Perhaps Pauli was spoiled by the years of amazing progress in physics between the wars.) If you are very brave, you might try calculating the free energy of the one-dimensional Ising model at finite temperature.

Chapter 20

Domains and Hysteresis

20.1 Macroscopic Effects in Ferromagnets: Domains

We might think that in a ferromagnet, all the spins in the system will align as described above in the Heisenberg (or Ising) models. However, in real magnets this is frequently not the case. To understand why, we imagine splitting our sample into two halves as shown in Fig. 20.1. Once we have two magnetic dipoles, it is clear that they would have lower energy if one of them flipped over, as shown at the far right of Fig. 20.1 (the two north faces of these magnets repel each other1). This energy, the long range dipolar force of a magnet, is not described in the Heisenberg or Ising models at all. In those models we have only included nearest neighbor interactions between spins. As we mentioned above, the actual magnetic dipolar force between electronic spins (or orbital moments) is tiny compared to the "exchange" interaction between neighboring spins. But when you put together a whole lot of atoms (like 10²³ of them!) to make a macroscopic magnet, the summed effect of their dipole moments can be substantial. Of course, in an actual ferromagnet (or ferrimagnet), the material does not really break apart, but nonetheless different regions will have magnetization in different directions in order to minimize the dipolar energy. A region where the moments all point in one given direction is known as a domain or a Weiss domain.2 The boundary of a domain, where the magnetization switches direction, is known as a domain wall3. Some possible examples of domain structures are sketched in Fig. 20.2. In the left two frames we imagine an Ising-like ferromagnet where the moment can only point up or down. The leftmost frame shows a magnet with net zero magnetization. Along the domain walls, the ferromagnetic Hamiltonian is "unsatisfied". In other words, spin-up atoms on one side of the domain wall have spin-down neighbors — where the Hamiltonian says that they should want to have spin-up neighbors only. What is happening is that the system is paying an

¹ Another way to understand the dipolar force is to realize that the magnetic field far away from the magnets will be much lower if the two magnets are antialigned with each other. Since the electromagnetic field carries energy ∫ dV |B|²/(2µ_0), minimizing this magnetic field lowers the energy of the two dipoles.
² After Pierre-Ernest Weiss, one of the fathers of the study of magnets from the early 1900s.
³ Domain walls can also occur in antiferromagnets. Instead of the magnetization switching directions, we imagine a situation where to the left of the wall the up-spins are on the even sites and the down-spins are on the odd sites, whereas to the right of the domain wall the up-spins are on the odd sites and the down-spins are on the even sites. At the domain wall, two neighboring sites will be aligned rather than anti-aligned. Since antiferromagnets have no net magnetization, the argument that domain walls should exist in ferromagnets is not valid for antiferromagnets. In fact, it is always energetically unfavorable for domain walls to exist in antiferromagnets, although they can occur at finite temperature.


Figure 20.1: Dipolar Forces Create Magnetic Domains. Left: The original ferromagnet. Middle: The original ferromagnet broken into two halves. Right: Because two dipoles next to each other are lower energy if their moments are anti-aligned, the two broken halves would rather line up in opposing directions to lower their energies (the piece on the right hand side has been flipped over here). This suggests that in large ferromagnets, domains may form.

energy cost along the domain wall in order that the global energy associated with the long range dipolar forces is minimized. If we apply a small up-pointing external field to this system, we will obtain the middle picture, where the up-pointing domains grow at the expense of the down-pointing domains to give an overall magnetization of the sample. In the rightmost frame of Fig. 20.2 we imagine a sample where the moment can point along any of the crystal axis directions⁴. Again in this picture the total magnetization is zero, but it has a rather complicated domain structure.

20.1.1 Disorder and Domain Walls

The detailed geometry of domains in a ferromagnet depends on a number of factors. First of all, it depends on the overall geometry of the sample. (For example, if the sample is a very long thin rod and the system is magnetized along the long axis, it may gain very little energy by forming domains.) It also depends on the relative energies of the neighbor interaction versus the long range dipolar interaction: increasing the strength of the long range dipolar forces with respect to the neighbor interaction will obviously decrease the size of domains (having no long range dipolar forces at all will result in domains of infinite size). Finally, the detailed disorder in a sample can affect the shape and size of magnetic domains. For example, if the sample is polycrystalline, each domain could be a single crystallite (a single microscopic crystal).

⁴ See for example the Hamiltonian, Eq. 19.4, which would have moments pointing only along the coordinate axes — although that particular Hamiltonian does not have the long range magnetic dipolar interaction written in, so it would not form domains.

Figure 20.2: Some Possible Domain Structures for a Ferromagnet. Left: An Ising-like ferromagnet where in each domain the moment can only point either up or down. Middle: When an external magnetic field pointing upwards is applied to this ferromagnet, it will develop a net moment by having the down-domains shrink and the up-domains expand (The local moment per atom remains constant — only the size of the domains change). Right: In this ferromagnet, the moment can point in any of the crystal axis directions.

20.1.2 Disorder Pinning

Even for single-crystal samples, disorder can play an extremely important role in the physics of domains. For example, a domain wall can have lower energy if it passes over a defect in the crystal. To see how this occurs, let us look at a domain wall in an Ising ferromagnet, as shown in Fig. 20.3. Bonds are marked red where spins are antialigned rather than aligned. In both figures the domain wall starts and ends at the same points, but on the right it follows a path through a defect in the crystal — in this case a site that is missing an atom. When it intersects the location of the missing atom, the number of antialigned bonds (marked) is lower, and therefore the energy is lower. Since this lower energy makes the domain wall stick to the missing site, we say the domain wall is pinned to the disorder.
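The bond-counting argument behind pinning can be checked with a toy calculation. The following sketch is illustrative and not from the notes (the lattice size, the spin values, and the helper name `antialigned_bonds` are all assumptions): on a small Ising grid, a straight domain wall breaks one bond per row, while routing the wall through a vacancy removes the broken bonds that would have attached to the missing site.

```python
def antialigned_bonds(spins):
    """Count nearest-neighbour bonds whose spins are anti-aligned.
    spins is a 2D list with +1 (up), -1 (down), or 0 (vacancy: no bonds)."""
    rows, cols = len(spins), len(spins[0])
    count = 0
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((0, 1), (1, 0)):     # right and down neighbours
                rr, cc = r + dr, c + dc
                if rr < rows and cc < cols and spins[r][c] * spins[rr][cc] == -1:
                    count += 1
    return count

# A straight up/down domain wall between columns 1 and 2 of a 4x4 lattice
wall = [[1, 1, -1, -1] for _ in range(4)]

# The same wall, but with a vacancy (missing atom) sitting on the wall
pinned = [row[:] for row in wall]
pinned[1][1] = 0

# antialigned_bonds(pinned) < antialigned_bonds(wall): the vacancy removes
# one broken bond, so the wall through the defect has lower energy.
```

With these illustrative numbers the straight wall has 4 broken bonds and the pinned wall only 3, mirroring the "12 versus 10 red bonds" count of Fig. 20.3.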

20.1.3 The Bloch/Néel Wall

Our discussion of domain walls so far has assumed that the spins can only point up or down — that is, that the κ term in Eq. 19.3 is extremely strong. However, it often happens that this is not true — the spins would prefer to point either up or down, but there is not a huge energy penalty for pointing in other directions instead. In this case the domain wall might instead be a smooth rotation from spins pointing up to spins pointing down, as shown on the right of Fig. 20.4. This type of smooth domain wall is known as a Bloch wall or Néel wall⁵ depending on which direction the spin rotates in with respect to the direction of the domain wall itself (a somewhat subtle difference, which we will not discuss further here). The length of the domain wall (L in the

⁵ We have already met our heroes of magnetism — Felix Bloch and Louis Néel.

Figure 20.3: Domain Wall Pinning. The energy of a domain wall is lower if the domain wall goes through the position of a defect in the crystal. Here, the green dot is supposed to represent a missing spin. The red bonds, where spins are anti-aligned, each cost energy. When the domain wall intersects the location of the missing spin, there are fewer red bonds, and therefore it is a lower energy configuration. (There are 12 red bonds on the left, but only 10 on the right.)

figure, i.e., how many spins are pointing neither up nor down) clearly depends on a balance between the J term of Eq. 19.3 (sometimes known as the spin stiffness) and the κ term, the anisotropy. As mentioned above, if κ/J is very large, then the spins must point either up or down only. In this case, the domain wall is very sharp, as depicted on the left of Fig. 20.4. On the other hand, if κ/J is small, then it costs little to point the spins in other directions, and it is more important that each spin point mostly in the direction of its neighbor. In this case, the domain wall will be very fat, as depicted on the right of Fig. 20.4. A very simple scaling argument can give us an idea of how fat the Bloch/Néel wall is. Let us say that the length of the wall is N lattice constants, so L = Na is the actual length of the twist in the domain wall (see Fig. 20.4). Roughly, let us imagine that the spin twists uniformly over the course of these N spins, so between each spin and its neighbor the spin twists by an angle δθ = π/N. The first term −J S_i · S_j in the Hamiltonian 19.3 can then be rewritten in terms of the angle between the neighbors

E_one-bond = −J S_i · S_j = −J S² cos(θ_i − θ_j) = −J S² (1 − (δθ)²/2 + ...)

where we have used the fact that δθ is small to expand the cosine. Naturally, the energy of this term is minimized if the two neighboring spins are aligned, that is δθ = 0. However, if they are not quite aligned there is an energy penalty of

δE_one-bond = J S² (δθ)²/2 = J S² (π/N)²/2

This is the energy per bond. So the energy of the domain wall due to this spin "stiffness" is

δE_stiffness = N J S² (π/N)²/2 (A/a²)

Figure 20.4: Domain Wall Structure. Left: An infinitely sharp domain wall. This would be realized if the anisotropy energy (κ) is extremely large so the spins must point either up or down (i.e., this is a true Ising system). Right: A Bloch/Néel wall (actually this depicts a Néel wall) where the spin flips continuously from up to down over a length scale L. The anisotropy energy here is smaller, so that the spin can point at an intermediate angle for only a small energy penalty. By twisting slowly the domain wall pays less spin-stiffness energy.

Here we have written the energy per unit area A of the domain wall in units of the lattice constant a. On the other hand, in Eq. 19.3 there is a penalty proportional to κS² per spin when the spins are not either precisely up or down. We estimate the energy due to this term to be κS² per spin, or a total of

δE_anisotropy ≈ κ S² N (A/a²)

along the length of the twist.⁶ Thus the total energy of the domain wall is

E_tot = [J S² (π²/2)/N + κ S² N] (A/a²)

This can be trivially minimized resulting in a domain wall twist having length L = Na with

N = C_1 √(J/κ)     (20.1)

⁶ This approximation of the energy of the κ term is annoyingly crude. To be more precise, we should instead write −κS² cos²(θ_i) and then sum over i. Although this makes things more complicated, it is still possible to solve the problem so long as the spin twists slowly, so that we can replace the finite difference δθ with a derivative and replace the sum over sites with an integral. In this case, one minimizes the function

E = ∫ dx [ J S² (a²/2) (dθ(x)/dx)² − κ S² cos² θ(x) ] / a

with a the lattice constant. Using calculus of variations, the minimum of this energy is given by the solution of the differential equation

(J a²/κ) d²θ/dx² − sin(2θ) = 0

which has a truly remarkable solution of the form

θ(x) = 2 tan⁻¹ [ exp( √2 √(κ/J) (x/a) ) ]

where we once again see the same L ∼ a√(J/κ) scaling. Plugging in this solution, the total energy of the domain wall becomes E_tot/(A/a²) = 2√2 S² √(Jκ).

and a minimum domain wall energy per unit area

E_tot^min = C_2 S² √(Jκ) (A/a²)

where C_1 and C_2 are constants of order unity (which we will not get right here considering the crudeness of our approximation, but see footnote 6). As predicted, the length increases with J/κ. In many real materials the length of a domain wall can be hundreds of lattice constants. Since the domain wall costs an energy per unit area, it is energetically unfavorable. However, as mentioned above, this energy cost is weighed against the long-range dipolar energy, which tries to introduce domain walls. The more energy the domain wall costs, the larger individual domains will be (to minimize the number of domain walls). Note that if a crystal is extremely small (or, say, one considers a single crystallite within a polycrystalline sample) it can happen that the size of the crystal is much smaller than the optimum size of the domain wall twist. In this case the spins within this crystallite always stay aligned with each other. Finally, we comment that even though the actual domain wall may be hundreds of lattice constants thick, it is easy to see that these objects still have a tendency to stick to disorder as described in section 20.1.2 above.
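The minimization is easy to reproduce numerically. The sketch below is illustrative (the function names and the sample values of J, κ, S are mine, not from the notes): it evaluates the crude wall energy per unit area, J S² π²/(2N) + κ S² N, and uses the analytic minimum, which in this crude approximation gives C_1 = π/√2, to confirm the N ∝ √(J/κ) scaling.

```python
import math

def wall_energy(N, J, kappa, S=0.5):
    # Crude domain-wall energy per unit area (in units of a^2):
    # spin-stiffness cost J S^2 pi^2/(2N) plus anisotropy cost kappa S^2 N
    return J * S**2 * math.pi**2 / (2 * N) + kappa * S**2 * N

def optimal_N(J, kappa):
    # Setting dE/dN = 0 gives N = (pi/sqrt(2)) sqrt(J/kappa),
    # i.e. C_1 = pi/sqrt(2) in this crude approximation
    return (math.pi / math.sqrt(2)) * math.sqrt(J / kappa)

# Quadrupling J/kappa doubles the wall width: the L ~ sqrt(J/kappa) scaling
```

A quick check that `optimal_N` really sits at the bottom of `wall_energy`, and that quadrupling J/κ doubles N, confirms the scaling claimed in Eq. 20.1.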

20.2 Hysteresis in Ferromagnets

We know from our experience with electromagnetism that ferromagnets show a hysteresis loop with respect to the applied magnetic field, as shown in Fig. 20.5. After a large external magnetic field is applied, when the field is returned to zero there remains a residual magnetization. We can now ask why this should be true. In short, it is because there is a large activation energy for changing the magnetization.



Figure 20.5: The Hysteresis Loop of a Ferromagnet

20.2.1 Single-Domain Crystallites

For example, let us consider the case of a ferromagnet made of many small crystallites. If the crystallites are small enough then all of the moments in each crystallite point in a single direction.

(We determined in section 20.1.3 that domain walls are unfavorable in small enough crystallites). So let us imagine that all of the microscopic moments (spins or orbital moments) in this crystallite are locked with each other and point in the same direction. The energy per volume of the crystallite in an external field can be written as

E/V = E_0 − M·B − κ'(M_z)²

where here M is the magnetization vector, and M_z is its component along the ẑ crystal axis. Here the anisotropy term κ' stems from the anisotropy term κ in the Hamiltonian 19.3.⁷ Note that we have no J term, since this would just give a constant if all the moments in the crystallite are always aligned with each other. Assuming that the external field B is pointing along the ẑ axis (although we will allow it to point either up or down) we then have

E/V = E_0 − |M||B| cos θ − κ'|M|² cos² θ     (20.2)

where |M| is the magnitude of the magnetization and θ is the angle of the magnetization with respect to the ẑ axis. We see that this energy is a parabola in the variable cos θ, which ranges from +1 to −1. The minimum of this energy is always when the magnetization points in the same direction as the external field (which we have taken to always point in either the +ẑ or −ẑ direction, corresponding to θ = 0 or π). However, for small B_z the energy is not monotonic in θ. Indeed, having the magnetization point in the opposite direction to B is also a local minimum (because the κ' term favors pointing along the z-axis). This is shown schematically in Fig. 20.6. It is an easy exercise⁸ to show that there is a local minimum of the energy with the magnetization pointing in the opposite direction to the applied field for B below the critical field

B_crit = 2κ'|M|

So if the magnetization is oriented in the −ẑ direction and a field B < B_crit is applied in the +ẑ direction, the magnetization remains metastably pointing in the −ẑ direction, since there is an activation barrier that the moments must cross in order to flip.⁹ Only when a large enough field (B > B_crit) is applied does the activation barrier disappear, at which point the moments flip over. Clearly this type of activation barrier can result in hysteretic behavior as shown in Fig. 20.5.
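The metastability condition can be verified directly from Eq. 20.2. In this sketch (parameter values and function names are illustrative choices, not from the notes) the energy is written in the variable u = cos θ, and we test whether u = −1, the moment opposing a +ẑ field, remains a local minimum on either side of B_crit = 2κ'|M|.

```python
def crystallite_energy(u, B, M=1.0, kappa_p=0.2, E0=0.0):
    # Eq. 20.2 per unit volume, written with u = cos(theta) in [-1, 1];
    # kappa_p plays the role of kappa' in the text
    return E0 - M * B * u - kappa_p * M**2 * u**2

def down_state_is_metastable(B, M=1.0, kappa_p=0.2, eps=1e-4):
    # u = -1 means the moment opposes the applied (+z) field; it is a local
    # minimum on [-1, 1] exactly when B < B_crit = 2 * kappa_p * M
    return crystallite_energy(-1.0, B, M=M, kappa_p=kappa_p) < \
           crystallite_energy(-1.0 + eps, B, M=M, kappa_p=kappa_p)

B_crit = 2 * 0.2 * 1.0   # = 0.4 for these illustrative parameters
```

Just below B_crit the down state is still a local minimum (hysteresis is possible); just above it the barrier is gone and the moment must flip.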

20.2.2 Domain Pinning and Hysteresis

Domains turn out to be extremely important for determining the detailed magnetic properties of materials – and in particular for understanding hysteresis in crystals that are sufficiently large that they are not single-domain (Recall that we calculated the size L of a domain wall in Eq. 20.1. Crystals larger than this size can in principle contain a domain wall). As mentioned above, when a magnetic field is externally applied to a ferromagnet, the domain walls move to re-establish a new domain configuration (See the left two panels of Fig. 20.2) and therefore a new magnetization.

⁷ In particular, since M = −g µ_B S ρ with ρ the number of spins per unit volume, we have κ' = κ/[(g µ_B)² ρ]. Further we note that the −M·B term is precisely the Zeeman energy +g µ_B B·S per unit volume.
⁸ Try showing it!
⁹ In principle the spins can get over the activation barrier either by being thermally activated or by quantum tunneling. However, if the activation barrier is sufficiently large (i.e., for a large crystallite) both of these are greatly suppressed.


Figure 20.6: Energy of an Anisotropic Ferromagnet in a Magnetic Field as a Function of Angle. Left: Due to the anisotropy, in zero field the energy is lowest if the spins point either in the +ẑ or −ẑ direction. When a field is applied in the +ẑ direction the energy is lowest when the moments are aligned with the field, but there is a metastable solution with the moments pointing in the opposite direction. The moments must cross an activation barrier to flip over. Right: For large enough field, there is no longer a metastable solution.

However, as we discussed in section 20.1.2 above, when there is disorder in a sample, the domain walls can get pinned to the disorder: There is a low energy configuration where the domain wall intersects the disorder, and there is then an activation energy to move the domain wall. This activation energy, analogous to what we found above in section 20.2.1, results in hysteresis of the magnet. It is frequently the case that one wants to construct a ferromagnet which retains its magnetization extremely well — i.e., where there is strong hysteresis, and even in the absence of an applied magnetic field there will be a large magnetization. This is known as a "hard" magnet (also known as a "permanent" magnet). It turns out that much of the trick of constructing hard magnets is arranging to insert appropriate disorder and microstructure to strongly pin the domain walls.

20.3 Summary of Domains and Hysteresis in Ferromagnets

• Although the short range interaction in a ferromagnet favors all magnetic moments to align, the long range dipolar forces favor spins to anti-align. A compromise is reached with domains of aligned spins, where different domains point in different directions. A very small crystal may be a single domain.

• The actual domain wall boundary may be a continuous rotation of the spin rather than a sudden flip over a single bond-length. The size of this spin structure depends on the ratio of the ferromagnetic energy to the anisotropy energy. (I.e., if it is very costly to have spins point in directions between up and down, then the wall will be over a single bond length.)

• Domain walls are lower energy if they intersect certain types of disorder in the solid. This results in the pinning of domain walls — they stick to the disorder.

• In a large crystal, changes in magnetization occur by changing the size of domains. In polycrystalline samples with very small crystallites, changes in magnetization occur by flipping over individual single-domain crystallites. Both of these processes can require an activation energy (domain motion requires activation energy if domain walls are pinned) and thus result in hysteretic behavior of the magnetization in ferromagnets.

References

• Hook and Hall, section 8.7
• Blundell, section 6.7
• Burns, section 15.10
• Ashcroft and Mermin, end of chapter 33

Also good (but covers material in random order compared to what we want):

• Rosenberg, chapter 12
• Kittel, chapter 12

Chapter 21

Mean Field Theory

Given a Hamiltonian for a magnetic system, we are left with the theoretical task of predicting its magnetization as a function of temperature (and possibly external magnetic field). Certainly at low temperature the spins will be maximally ordered, and at high temperature the spins will thermally fluctuate and will be disordered. But calculating the magnetization as a function of temperature and applied magnetic field is typically a very hard task. Except for a few very simple exactly solvable models (like the Ising model in one dimension) we must always resort to approximations. The most important, and probably the simplest, such approximation is known as "Mean Field Theory" or "Molecular Field Theory" or "Weiss Mean Field Theory",¹ which we will discuss in depth in this chapter. The general concept of mean field theory proceeds in two steps:

First, one examines one site (or one unit cell, or some small region) and treats it exactly. • Any object outside the unit cell is approximated as an expectation (an average or a mean).

The second step is to impose self-consistency: Every site (or unit cell, or small region) in the • entire system should look the same. So the one site we treated exactly should have the same average as all of the others.

This procedure is extremely general and can be applied to problems ranging from magnetism to liquid crystals to fluid mechanics. We will demonstrate the procedure as it applies to ferromagnetism. As a homework problem we will consider how mean field theory can be applied to antiferromagnets as well (further generalizations should then be obvious).

21.1 Mean Field Equations for the Ferromagnetic Ising Model

As an example, let us consider the spin-1/2 Ising model

H = −(1/2) Σ_{⟨i,j⟩} J σ_i σ_j + g µ_B B Σ_j σ_j

¹ The same Pierre-Ernest Weiss for whom Weiss domains are named.

where J > 0, and here σ = ±1/2 is the z-component of the spin, and the magnetic field B is applied in the ẑ direction (as usual, µ_B is the Bohr magneton). For a macroscopic system, this is a statistical mechanical system with 10²³ degrees of freedom, where all the degrees of freedom are coupled to each other. In other words, it looks like a hard problem! To implement mean field theory, we focus on one site of the problem, say site i. The Hamiltonian for this site can be written as

H_i = (g µ_B B − J Σ_j σ_j) σ_i

where the sum is over sites j that neighbor i. We think of the term in brackets as being caused by some effective magnetic field seen by the spin on site i; thus we define B_eff,i such that

g µ_B B_eff,i = g µ_B B − J Σ_j σ_j

with again j neighboring i. Now B_eff,i is not a constant, but is rather an operator, since it contains the variables σ_j which can take several values. However, the first principle of mean field theory is that we should simply take an average of all quantities that are not on site i. Thus we write the Hamiltonian of site i as

H_i = g µ_B ⟨B_eff⟩ σ_i

This is precisely the same Hamiltonian we considered when we studied paramagnetism in Eq. 18.6 above, and it is easily solvable. In short, one writes the partition function

Z_i = e^{−β g µ_B ⟨B_eff⟩/2} + e^{+β g µ_B ⟨B_eff⟩/2}

From this we can derive the expectation of the spin on site i (compare Eq. 18.8)

⟨σ_i⟩ = −(1/2) tanh(β g µ_B ⟨B_eff⟩/2)     (21.1)

However, we can also write that

g µ_B ⟨B_eff⟩ = g µ_B B − J Σ_j ⟨σ_j⟩

The second step of the mean field approach is to set ⟨σ⟩ to be equal on all sites of the lattice, so we obtain

g µ_B ⟨B_eff⟩ = g µ_B B − J z ⟨σ⟩     (21.2)

where z is the number of neighbors j of site i (this is known as the coordination number of the lattice, and this factor has replaced the sum on j). Further, again assuming that ⟨σ⟩ is the same on all lattice sites, from Eqs. 21.1 and 21.2 we obtain the self-consistency equation for ⟨σ⟩,

⟨σ⟩ = −(1/2) tanh(β [g µ_B B − J z ⟨σ⟩]/2)     (21.3)

The expected moment per site will correspondingly be given by²

m = −g µ_B ⟨σ⟩     (21.4)

² Recall that the spin points opposite the moment! Ben Franklin, why do you torture us so? (See footnote 13 of section 4.3.)

21.2 Solution of Self-Consistency Equation

The self-consistency equation, Eq. 21.3, is still complicated to solve. One approach is to find the solution graphically. For simplicity, let us set the external magnetic field B to zero. We then have the self-consistency equation

⟨σ⟩ = (1/2) tanh((βJz/2) ⟨σ⟩)     (21.5)

We then choose a value of the parameter βJz/2. Let us start by choosing a value βJz/2 = 1 that is somewhat small, i.e., a high temperature. Then in Fig. 21.1 we plot both the right hand side of Eq. 21.5 as a function of ⟨σ⟩ (in blue) and the left hand side of Eq. 21.5 (in green). Note that the left hand side is ⟨σ⟩, so the straight line is simply y = x. We see that there is only a single point where the two curves meet, i.e., where the left side equals the right side. This point, in this case, is ⟨σ⟩ = 0. From this we conclude that, for this temperature, within the mean field approximation, there is no magnetization in zero field.


Figure 21.1: Graphical Solution of the Mean Field Self-Consistency Equations at Relatively High Temperature, βJz/2 = 1. The blue line is the tanh of Eq. 21.5. The green line is just the line y = x. Eq. 21.5 is satisfied only where the two curves cross — i.e., at ⟨σ⟩ = 0, meaning that at this temperature, within the mean field approximation, there is no magnetization.

Let us now reduce the temperature substantially to βJz/2 = 6. Analogously, in Fig. 21.2 we plot both the right hand side of Eq. 21.5 as a function of ⟨σ⟩ (in blue) and the left hand side of Eq. 21.5 (in green). Here, however, we see there are three possible self-consistent solutions to the equations. There is the solution at ⟨σ⟩ = 0 as well as two solutions marked with arrows in the figure at ⟨σ⟩ ≈ ±0.497. The two nonzero solutions tell us that at low temperature this system can have nonzero magnetization even in the absence of applied field — i.e., it is ferromagnetic. The fact that we have possible solutions with the magnetization pointing in both directions is quite natural: The Ising ferromagnet can be polarized either spin up or spin down. However, the fact that there is also a self-consistent solution with zero magnetization at the same temperature seems a bit puzzling. We will see as a homework assignment that when there are three solutions,


Figure 21.2: Graphical Solution of the Mean Field Self-Consistency Equations at Relatively Low Temperature, βJz/2 = 6. Here, the curves cross at three possible values (⟨σ⟩ = 0 and ⟨σ⟩ ≈ ±0.497). The fact that there is a solution of the self-consistency equations with nonzero magnetization tells us that the system is ferromagnetic (the zero magnetization solution is non-physical).

the zero magnetization solution is actually a solution of maximal free energy, not minimal free energy, and therefore should be discarded³. Thus the picture that arises is that at high temperature the system has zero magnetization (and we will see below that it is paramagnetic), whereas at low temperature a nonzero magnetization develops and the system becomes ferromagnetic⁴. The transition between these two behaviors occurs at a temperature known as T_c, which stands for critical temperature⁵ or Curie temperature⁶. It is clear from Figs. 21.1 and 21.2 that the behavior changes from one solution to three solutions precisely when the straight green line is tangent to the tanh curve, i.e., when the slope of the tanh is unity. This tangency condition thus determines the critical temperature. Expanding the tanh for small argument, we obtain the tangency condition

1 = (1/2)(β_c Jz/2)

or when the temperature is

k_B T_c = Jz/4

Using the above technique, one can solve the self-consistency equations (Eq. 21.5) at any temperature (although there is no nice analytic expression, it can be solved numerically or

³ We will see (as a homework problem) that our self-consistency equations are analogous to finding the minimum of a function by differentiation — we may also find maxima as well. ⁴ It is quite typical that at high temperature a ferromagnet will turn into a paramagnet, unless something else happens first — like the crystal melting. ⁵ Strictly speaking it should only be called a critical temperature if the transition is second order, i.e., if the magnetization turns on continuously at this transition. For the Ising model this is in fact true, but for some magnetic systems it is not true. ⁶ Named for Pierre again.

graphically). The results are shown in Fig. 21.3. Note that at low enough temperature, all of the spins are fully aligned (⟨σ⟩ = 1/2, which is the maximum possible for a spin-1/2). One can also, in


Figure 21.3: Magnetization as a Function of Temperature. The plot shows the magnitude of the moment per site in units of g µ_B as a function of temperature in the mean field approximation of the spin-1/2 Ising model, with zero external magnetic field applied.

principle, solve the self-consistency equation (Eq. 21.3) with finite magnetic field B.
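The graphical construction is equivalent to a simple numerical fixed-point iteration. The following sketch (function name and iteration parameters are illustrative choices) solves the B = 0 self-consistency equation, Eq. 21.5, and reproduces the two regimes discussed above: ⟨σ⟩ = 0 at the high temperature βJz/2 = 1, and ⟨σ⟩ ≈ 0.497 at the low temperature βJz/2 = 6.

```python
import math

def solve_mean_field(bJz_half, x0=0.5, tol=1e-12, max_iter=10_000):
    """Fixed-point iteration of the B = 0 self-consistency equation 21.5:
    <sigma> = (1/2) tanh( (beta J z / 2) <sigma> ).
    Starting from x0 > 0 selects the up-polarized branch when one exists
    (the down branch is its mirror image; 0 is always also a solution)."""
    x = x0
    for _ in range(max_iter):
        x_new = 0.5 * math.tanh(bJz_half * x)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x

# High temperature (beta J z/2 = 1): the iteration collapses to <sigma> = 0.
# Low temperature (beta J z/2 = 6): it converges to <sigma> ~ 0.497.
```

The crossover between the two behaviors occurs at βJz/2 = 2, i.e., at the tangency condition β_c Jz/4 = 1 derived above.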

21.2.1 Paramagnetic Susceptibility

At high temperature there will be zero magnetization in zero externally applied field. However, at finite field we will have a finite magnetization. Let us imagine applying a small magnetic field and solving the self-consistency equations, Eq. 21.3. Since the applied field is small, we can assume that the induced ⟨σ⟩ is also small. Thus we can expand the tanh in Eq. 21.3 to obtain

⟨σ⟩ = (1/2) (β [Jz⟨σ⟩ − g µ_B B]/2)

Rearranging this then gives

⟨σ⟩ = −(1/4)(β g µ_B)B / (1 − βJz/4) = −(1/4)(g µ_B)B / (k_B(T − T_c))

which is valid only so long as ⟨σ⟩ remains small. The moment per site is then given by (see Eq. 21.4) m = −g µ_B ⟨σ⟩, which divided by the volume of a unit cell gives the magnetization M. Thus we find that the susceptibility is

χ = µ_0 ∂M/∂B = ρ (g µ_B)² µ_0 / (4 k_B (T − T_c)) = χ_Curie / (1 − T_c/T)     (21.6)

where ρ is the number of spins per unit volume and χ_Curie is the pure Curie susceptibility of a system of (noninteracting) spin-1/2 particles (compare Eq. 18.9). Eq. 21.6 is known as the Curie-Weiss law. Thus we see that a ferromagnet above its critical temperature is roughly a paramagnet with an enhanced susceptibility. Note that the susceptibility diverges at the transition temperature when the system becomes ferromagnetic.⁷

⁷ This divergence is in fact physical. As the temperature is reduced towards T_c, the divergence tells us that it takes a smaller and smaller applied B field to create some fixed magnetization M. This actually makes sense, since once the temperature is below T_c the magnetization will be nonzero even in the absence of any applied B.
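The linearization leading to the Curie-Weiss law can be checked against the full self-consistency equation. In this sketch the units are illustrative assumptions of mine: g µ_B = 1, Jz = 1, and k_B = 1, so that T_c = 1/4; the function names are likewise mine. The full Eq. 21.3 is solved by fixed-point iteration at a small field and compared with the linearized result.

```python
import math

def solve_sigma(beta, B, gmuB=1.0, Jz=1.0, tol=1e-14, max_iter=100_000):
    # Fixed-point iteration of the full self-consistency Eq. 21.3:
    # <sigma> = -(1/2) tanh( beta (gmuB B - Jz <sigma>) / 2 )
    x = 0.0
    for _ in range(max_iter):
        x_new = -0.5 * math.tanh(beta * (gmuB * B - Jz * x) / 2)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x

def sigma_curie_weiss(T, B, gmuB=1.0, Jz=1.0):
    # Linearized (Curie-Weiss) result with kB = 1, Tc = Jz/4:
    # <sigma> = -(gmuB B / 4) / (T - Tc)
    Tc = Jz / 4
    return -(gmuB * B / 4) / (T - Tc)

T, B = 0.5, 1e-4      # T above Tc = 0.25, small applied field
full = solve_sigma(1 / T, B)
approx = sigma_curie_weiss(T, B)
# full and approx agree closely for this small B, as the expansion requires
```

For this small field the two agree to far better than a percent; the agreement degrades as B grows or as T approaches T_c, exactly where the "⟨σ⟩ remains small" condition fails.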

21.2.2 Further Thoughts

As mentioned above, the mean-field procedure is actually very general. As a homework problem we will also study the antiferromagnet. In this case, we divide the system into two sublattices — representing the two sites in a unit cell. In that example we will want to treat one spin of each sublattice exactly, but as above each spin sees only the average field from its neighbors. One can generalize even further to consider very complicated unit cells.

Aside: It is worth noting that solving the antiferromagnetic Ising model in mean field theory gives

χ = χ_Curie / (1 + T_c/T)

compared to Eq. 21.6. It is this difference in susceptibility that pointed the way to the discovery of antiferromagnets.

We see that in both the ferromagnetic and antiferromagnetic cases, at temperatures much larger than the critical temperature (much larger than the exchange energy scale J), the system behaves like a pure free-spin Curie paramagnet. In section 18.6.3 above we asked where we might find free spins so that a Curie paramagnet might be realized. In fact, now we discover that any ferromagnet or antiferromagnet (or ferrimagnet for that matter) will appear to be made of free spins at temperatures high enough compared to the exchange energy. Indeed, it is almost always the case that when one thinks one is observing free spins, at low enough energy scales one discovers that in fact the spins are coupled to each other! The principle of mean field theory is quite general and can be applied to a vast variety of difficult problems in physics⁸. No matter what the problem, the principle remains the same — isolate some small part of the system to treat exactly, average everything outside of that small system, then demand self-consistency: the average of the small system should look like the chosen average of the rest of the system. While the mean field approach is merely an approximation, it is frequently a very good approximation for capturing a variety of physical phenomena.
Furthermore, many of its shortcomings can be systematically improved by considering successively more corrections to the initial mean field approach⁹.

21.3 Summary of Mean Field Theory

• Understand the mean field theory calculation for ferromagnets. Understand how you would generalize this to any model: antiferromagnets (homework), ferrimagnets (try it!), different spins, anisotropic models, etc.

• For the ferromagnet the important results of mean field theory include:

⁸ In chapter 2 we already saw another example of mean field theory, when we considered the Boltzmann and Einstein models of the specific heat of solids. There we considered each atom to be in a harmonic well formed by all of its neighbors. The single atom was treated exactly, whereas the neighboring atoms were treated only approximately in that their positions were essentially averaged in order to simply form the potential well — and nothing further was said of the neighbors. Another example in similar spirit was given in footnote 2 of chapter 17, where an alloy of Al and Ga with As is replaced by some averaged atom Al_x Ga_{1−x} and is still considered a periodic crystal. (No unit cell is treated exactly in this case; all are replaced by the average unit cell.)
⁹ The motivated student might want to think about various ways one might improve mean field theory systematically. One approach is discussed in the Additional Problems.

  – A finite-temperature phase transition from a low-temperature ferromagnetic phase to a high-temperature paramagnetic phase at a transition temperature known as the Curie temperature.

  – Above the Curie temperature the paramagnetic susceptibility is
$$ \chi = \frac{\chi_{\rm Curie}}{1 - T_c/T} $$
where $\chi_{\rm Curie}$ is the susceptibility of the corresponding model where the ferromagnetic coupling between sites is turned off.

– Below Tc the magnetic moment turns on, and increases to saturation at the lowest temperature.
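The contrast between the ferromagnetic susceptibility just quoted and the antiferromagnetic form in the aside above can be compared numerically. In this sketch the units C = Tc = 1 are arbitrary illustrative choices.

```python
# Comparison of the two mean-field susceptibility forms quoted in this
# chapter: ferromagnet chi = chi_Curie/(1 - Tc/T) and antiferromagnet
# chi = chi_Curie/(1 + Tc/T), with the Curie law chi_Curie = C/T.
def chi_ferro(T, C=1.0, Tc=1.0):
    return (C / T) / (1.0 - Tc / T)   # = C/(T - Tc): diverges as T -> Tc+

def chi_antiferro(T, C=1.0, Tc=1.0):
    return (C / T) / (1.0 + Tc / T)   # = C/(T + Tc): stays finite at Tc

# Far above Tc both reduce to the free-spin Curie law C/T,
# while approaching Tc only the ferromagnetic susceptibility blows up.
print(chi_ferro(100.0), chi_antiferro(100.0))   # nearly equal
print(chi_ferro(1.001), chi_antiferro(1.001))   # wildly different
```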

References on Mean Field Theory

• Ibach and Luth, chapter 8 (particularly 8.6, 8.7)
• Hook and Hall, chapter 8 (particularly 8.3, 8.4)
• Kittel, beginning of chapter 12
• Burns, section 15.5
• Ashcroft and Mermin, chapter 33

Chapter 22

Magnetism from Interactions: The Hubbard Model

So far we have only discussed ferromagnetism in the context of isolated spins on a lattice that align due to their interactions with each other. However, many materials in fact have magnetism where the magnetic moments, the aligned spins, are not pinned down but rather can wander through the system. This phenomenon is known as itinerant ferromagnetism¹. For example, it is easy to imagine a free electron gas where the number of up spins is different from the number of down spins. However, for completely free electrons it is always lower energy to have the same number of up and down spins than to have the numbers differ². So how does it happen that electrons can decide, even in the absence of an external magnetic field, to polarize their spins? The culprit is the strong Coulomb interaction between electrons. On the other hand, we will see that antiferromagnetism can also be caused by strong interactions between electrons! The Hubbard model³ is an attempt to understand the magnetism that arises from interactions between electrons. It is certainly the most important model of interacting electrons in modern condensed matter physics. We will see through this model how interactions can produce both ferro- and antiferromagnetism (this was alluded to in section 18.2.1). The model is relatively simple to describe⁴. First we write a tight binding model for a band

¹ Itinerant means traveling from place to place without a home (from the Latin iter, or itiner, meaning journey or road. In case anyone cares.)

² The total energy of having N electrons spin-up in a system is proportional to $N E_F \sim N (N/V)^{2/d}$, where d is the dimensionality of the system (you should be able to prove this easily). We can thus write $E = C N^{1+a}$ with $a = 2/d > 0$ and C some constant. For $N_\uparrow$ up spins and $N_\downarrow$ down spins, we have a total energy $E = C N_\uparrow^{1+a} + C N_\downarrow^{1+a} = C \left( N_\uparrow^{1+a} + (N - N_\uparrow)^{1+a} \right)$, where $N = N_\uparrow + N_\downarrow$ is the total number of electrons. Setting $dE/dN_\uparrow = 0$ immediately gives $N_\uparrow = N/2$ as the minimum energy configuration.

³ John Hubbard, a British physicist, wrote down this model in 1963, and it quickly became an extremely important example in the attempt to understand interacting electrons. Despite the success of the model, Hubbard, who died relatively young in 1980, did not live to see how important his model became: in 1986, when the phenomenon of "high temperature superconductivity" was discovered by Bednorz and Müller (resulting in a Nobel prize the following year), the community quickly came to believe that an understanding of this phenomenon would only come from studying the Hubbard model. Over the next two decades the Hubbard model took on the status of being the most important question in condensed matter physics. Its complete solution remains elusive despite the tens of thousands of papers written on the subject. It is a shame that we do not have time to discuss superconductivity in this course.

⁴ The reason most introductory books do not cover the Hubbard model is that the model is conventionally introduced using so-called "second quantized" notation, that is, using field-theoretic methods which are rather

of electrons as we did in chapter 10, with hopping parameter t. (We can choose to do this in one, two, or three dimensions as we see fit⁵.) We will call this Hamiltonian $H_0$. As we derived above (and as should now be easy to derive in two and three dimensions), the full bandwidth of the band is 4dt in d dimensions. We can add as many electrons as we like to this band. Let us define the number of electrons in the band per site to be called the doping, x (so that x/2 is the fraction of k states in the band which are filled, there being two spin states). As long as we do not fill all of the states in the band (x < 2), in the absence of interactions, this partially filled tight binding band is a metal. Finally we include the Hubbard interaction

$$ H_{\rm interaction} = U \sum_i n_{i\uparrow} n_{i\downarrow} \qquad (22.1) $$
where here $n_{i\uparrow}$ is the number of electrons with spin up on site i, $n_{i\downarrow}$ is the number of electrons with spin down on site i, and U > 0 is an energy known as the repulsive Hubbard interaction energy. This term gives an energy penalty of U whenever two electrons sit on the same site of the lattice. This short-ranged interaction term is an approximate representation of the Coulomb interaction between electrons. The full Hubbard model Hamiltonian is given by the sum of the kinetic and interaction pieces

$$ H = H_0 + H_{\rm interaction} $$
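The bandwidth claim above (4dt in d dimensions) can be checked numerically, assuming the standard hypercubic tight-binding dispersion $\epsilon(\mathbf{k}) = -2t \sum_i \cos(k_i a)$ with a = 1; the grid size and t = 1 are illustrative choices.

```python
# Check of the tight-binding bandwidth: for the hypercubic band
# epsilon(k) = -2t * sum_i cos(k_i), the bandwidth (max - min over the
# Brillouin zone) is 4*d*t in d dimensions.
import itertools, math

def bandwidth(d, t=1.0, nk=50):
    ks = [2 * math.pi * j / nk for j in range(nk)]  # k-grid across the zone
    energies = [-2 * t * sum(math.cos(k) for k in kvec)
                for kvec in itertools.product(ks, repeat=d)]
    return max(energies) - min(energies)

for d in (1, 2, 3):
    print(d, bandwidth(d))  # 4.0, 8.0, 12.0 (up to floating-point rounding)
```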

22.1 Ferromagnetism in the Hubbard Model

Why should this on-site interaction create magnetism? Imagine for a moment that all of the electrons in the system had the same spin state (a so-called “spin-polarized” configuration). If this were true, by the Pauli exclusion principle, no two electrons could ever sit on the same site. In this case, the expectation of the Hubbard interaction term would be zero

$$ \langle \mbox{Polarized Spins} | H_{\rm interaction} | \mbox{Polarized Spins} \rangle = 0 $$
which is the lowest possible energy that this interaction term could have. On the other hand, if we filled the band with only one spin species, then the Fermi energy (and hence the kinetic energy of the system) would be much higher than if the electrons could be distributed between the two possible spin states. Thus, it appears that there will be some competition between the potential and kinetic energy that decides whether the spins align or not.
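The kinetic-energy side of this competition is exactly the argument of footnote 2: with $E(N_\uparrow) \propto N_\uparrow^{1+a} + (N - N_\uparrow)^{1+a}$ and a > 0, equal spin populations minimize the kinetic energy. A quick numerical check; the choices a = 2/3 (three dimensions) and N = 100 are illustrative.

```python
# Check of footnote 2: the kinetic energy of N electrons split into
# N_up spin-up and N - N_up spin-down electrons is minimized at
# N_up = N/2 whenever the exponent 1 + a exceeds 1 (i.e. a > 0).
def kinetic(n_up, N=100, a=2.0 / 3.0, C=1.0):
    return C * (n_up ** (1 + a) + (N - n_up) ** (1 + a))

best = min(range(0, 101), key=kinetic)
print(best)  # 50, i.e. N_up = N/2
```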

22.1.1 Hubbard Ferromagnetism Mean Field Theory

To try to decide quantitatively whether spins will align or not we start by writing

$$ U n_{i\uparrow} n_{i\downarrow} = \frac{U}{4} \left( n_{i\uparrow} + n_{i\downarrow} \right)^2 - \frac{U}{4} \left( n_{i\uparrow} - n_{i\downarrow} \right)^2 $$

Now we make the approximation of treating all operators $n_{i\uparrow}$ and $n_{i\downarrow}$ as their expectations.

$$ U n_{i\uparrow} n_{i\downarrow} \approx \frac{U}{4} \langle n_{i\uparrow} + n_{i\downarrow} \rangle^2 - \frac{U}{4} \langle n_{i\uparrow} - n_{i\downarrow} \rangle^2 $$

advanced. We will avoid this approach, but as a result, we cannot delve too deeply into the physics of the model.

⁵ In one dimension, the Hubbard model is exactly solvable.

This type of approximation is a type of mean-field theory, similar to that which we encountered in the previous chapter 21⁶: we replace operators by their expectations. The expectation $\langle n_{i\uparrow} + n_{i\downarrow} \rangle$ in the first term is just the average number of electrons on site i, which is just the average number of particles per site⁷, which is equal to the doping x, which we keep fixed.

Correspondingly, the second expectation, $\langle n_{i\uparrow} - n_{i\downarrow} \rangle$, is related to the magnetization⁹ of the system. In particular, since each electron carries a magnetic moment of $\mu_B$⁸, the magnetization is
$$ M = (\mu_B / v) \langle n_{i\downarrow} - n_{i\uparrow} \rangle $$
with v the volume of the unit cell. We thus see that the expectation of the energy of the Hubbard interaction is given by

$$ \langle H_{\rm interaction} \rangle \approx (V/v)(U/4) \left[ x^2 - (M v / \mu_B)^2 \right] \qquad (22.2) $$
where V/v is the number of sites in the system. Thus, as expected, increasing the magnetization M decreases the expectation of the interaction energy. To determine if the spins actually polarize we need to weigh this potential energy gain against the kinetic energy cost.
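The decoupling identity at the start of this subsection can be checked directly, since $n_{i\uparrow}$ and $n_{i\downarrow}$ each take only the values 0 and 1; here U = 1 is an illustrative value.

```python
# Direct check of the decoupling identity
#   U n_up n_dn = (U/4)(n_up + n_dn)^2 - (U/4)(n_up - n_dn)^2
# for every possible occupation of a single site.
U = 1.0
for n_up in (0, 1):
    for n_dn in (0, 1):
        lhs = U * n_up * n_dn
        rhs = (U / 4) * (n_up + n_dn) ** 2 - (U / 4) * (n_up - n_dn) ** 2
        assert lhs == rhs
print("identity holds for all occupations")
```

Since the two number operators commute, checking the identity on their joint eigenvalues suffices.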

22.1.2 Stoner Criterion¹⁰

Here we calculate the kinetic energy cost of polarizing the spins in our model, and we balance this against the potential energy gain. We will recognize this calculation as being almost identical to the calculation we did way back in section 4.3 when we studied Pauli paramagnetism (but we repeat it here for clarity). Consider a system (at zero temperature for simplicity) with the same number of spin-up and spin-down electrons. Let $g(E_F)$ be the total density of states at the Fermi surface per unit volume (for both spins put together). Now, let us flip over a small number of spins so that the spin-up and spin-down Fermi surfaces have slightly different energies¹¹:

$$ E_{F,\uparrow} = E_F + \delta/2 \qquad\qquad E_{F,\downarrow} = E_F - \delta/2 $$
The difference in the number density of up and down electrons is then

$$ \rho_\uparrow - \rho_\downarrow = \int_0^{E_F + \delta/2} dE \, \frac{g(E)}{2} \; - \; \int_0^{E_F - \delta/2} dE \, \frac{g(E)}{2} $$
where we have used the fact that the density of states per unit volume for either the spin-up or spin-down species is g(E)/2.

⁶ This is a slightly different type of mean field theory from that encountered in chapter 21. Previously we considered some local degree of freedom (some local spin) which we treated exactly, and replaced all other spins by their average. Here, we are going to treat the kinetic energy term exactly, but replace the operators in the potential energy term by their averages.

⁷ This assumes that the system remains homogeneous, that is, that all sites have the same average number of electrons.

⁸ We have assumed an electron g-factor of g = 2 and an electron spin of 1/2. Everywhere else in this chapter the symbol g will be used only for the density of states.

⁹ Recall that magnetization is moment per unit volume.

¹⁰ This has nothing to do with the length of your dreadlocks or the number of Grateful Dead shows you have been to (I've been to 6 shows ... I think).

¹¹ If we were being very careful we would adjust $E_F$ to keep the overall electron density $\rho_\uparrow + \rho_\downarrow$ fixed as we change δ. For small δ we would find that $E_F$ remains unchanged as we change δ, but this is not true for larger δ.

Although we could carry forward at this point and try to perform the integrals generally for arbitrary δ (indeed we will have a homework problem on this) it is enough for our present discussion to consider the simpler case of very small δ. In this case, we have

$$ \rho_\uparrow - \rho_\downarrow = \frac{g(E_F)}{2} \, \delta $$
The difference in the number of up and down electrons is related to the magnetization of the system by⁸
$$ M = \mu_B ( \rho_\downarrow - \rho_\uparrow ) $$
so
$$ M = - \mu_B \frac{g(E_F)}{2} \, \delta $$
The kinetic energy per unit volume is a bit more tricky. We write

$$ K = \int_0^{E_F+\delta/2} dE \, \frac{g(E)}{2} E \; + \; \int_0^{E_F-\delta/2} dE \, \frac{g(E)}{2} E $$
$$ = 2 \int_0^{E_F} dE \, \frac{g(E)}{2} E \; + \; \int_{E_F}^{E_F+\delta/2} dE \, \frac{g(E)}{2} E \; - \; \int_{E_F-\delta/2}^{E_F} dE \, \frac{g(E)}{2} E \qquad (22.3) $$
$$ K \approx K_{M=0} + \frac{g(E_F)}{2} \left[ \frac{(E_F+\delta/2)^2}{2} - \frac{E_F^2}{2} - \left( \frac{E_F^2}{2} - \frac{(E_F-\delta/2)^2}{2} \right) \right] $$
$$ = K_{M=0} + \frac{g(E_F)}{2} \, (\delta/2)^2 $$
$$ = K_{M=0} + \frac{g(E_F)}{2} \left( \frac{M}{\mu_B \, g(E_F)} \right)^2 \qquad (22.4) $$

where $K_{M=0}$ is the kinetic energy per unit volume for a system with no net magnetization (equal numbers of spin-up and spin-down electrons). We can now add this result to Eq. 22.2 to give the total energy of the system per unit volume
$$ E_{tot} = E_{M=0} + \left( \frac{M}{\mu_B} \right)^2 \left[ \frac{1}{2 g(E_F)} - \frac{v U}{4} \right] $$
with v the volume of the unit cell. We thus see that for
$$ U > \frac{2}{g(E_F) \, v} $$
the energy of the system is lowered by increasing the magnetization from zero. This condition for itinerant ferromagnetism is known as the Stoner criterion¹².

Aside: We did a lot of work to arrive at Eq. 22.4. In fact, we could almost have written it down with no work at all, based on the calculation of the Pauli susceptibility we did back in section 4.3. Recall first that when an external magnetic field B is applied in the up direction to a system, there is an energy induced from the coupling of the spins to the field, given by $\mu_B (\rho_\uparrow - \rho_\downarrow) B = -MB$ (with positive M being defined

¹² Edmund Stoner was a British physicist who, among other things, figured out the Pauli exclusion principle in 1924, a year before Pauli. However, Stoner's work focused on the spectra and behavior of atoms, and he was not bold enough to declare that the exclusion was a fundamental property of electrons. Stoner was diagnosed with diabetes in 1919 at 20 years of age and grew progressively weaker for the next eight years. In 1927 insulin treatment became available, saving his life. He died in 1969.

in the same direction as positive B, so that having the two aligned is low energy). Also recall that in section 4.3 we derived the (Pauli) susceptibility of an electron system to be

$$ \chi_{\rm Pauli} = \mu_0 \, \mu_B^2 \, g(E_F) $$
which means that when a magnetic field B is applied, a magnetization $\chi_{\rm Pauli} B / \mu_0$ is induced. Thus we can immediately conclude that the energy of such a system in an external field must be of the form

$$ E(M) = \frac{M^2 \mu_0}{2 \chi_{\rm Pauli}} - MB $$
To see that this is correct, we minimize the energy with respect to M at a given B, and we discover that this properly gives us $M = \chi_{\rm Pauli} B / \mu_0$. Thus, at zero applied B, the energy should be

$$ E(M) = \frac{M^2 \mu_0}{2 \chi_{\rm Pauli}} = \frac{M^2}{2 \mu_B^2 \, g(E_F)} $$
exactly as we found in Eq. 22.4!
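The Stoner energy balance can be illustrated numerically: the coefficient of $M^2$ in $E_{tot}$ changes sign at $U = 2/(g(E_F)v)$. The values $g(E_F) = v = 1$ (arbitrary units) are illustrative choices, not material parameters.

```python
# Illustration of the Stoner criterion: the coefficient multiplying
# (M/mu_B)^2 in E_tot = E_{M=0} + (M/mu_B)^2 * [1/(2 g(E_F)) - v*U/4]
# changes sign at U = 2/(g(E_F)*v).
def quadratic_coefficient(U, g=1.0, v=1.0):
    """Coefficient of (M/mu_B)^2 in the total energy per unit volume."""
    return 1.0 / (2.0 * g) - v * U / 4.0

print(quadratic_coefficient(1.0))  # 0.25 > 0: M = 0 favored (paramagnet)
print(quadratic_coefficient(3.0))  # -0.25 < 0: magnetization grows (ferromagnet)
```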

22.2 Mott Antiferromagnetism in the Hubbard Model

In fact, the Hubbard model is far more complex than the above mean field calculation would lead one to believe. Let us now consider the case where the doping is such that there is exactly one electron per site of the lattice. For noninteracting electrons this would be a half-filled band, and hence a conductor. However, if we turn on the Hubbard interaction with a large U, the system becomes an insulator. To see this, imagine one electron sitting on every site. In order for an electron to move, it must hop to a neighboring site which is already occupied. This process therefore costs energy U, and if U is large enough, the hopping cannot happen. This is precisely the physics of the Mott insulator which we discussed above in section 15.4.

With one immobile electron on each site we can now ask which way the spins align (in the absence of an external field). For a square or cubic lattice, there are two obvious options: either the spins want to be aligned with their neighbors or they want to be anti-aligned with their neighbors (ferromagnetism or antiferromagnetism). It turns out that antiferromagnetism is favored! To see this, consider the antiferromagnetic state $|GS_0\rangle$ shown on the left of Fig. 22.1. In the absence of hopping this state is an eigenstate with zero energy (as is any other state where there is precisely one electron on each site). We then consider adding the hopping perturbatively. Because the hopping Hamiltonian allows an electron to hop from site to site (with hopping amplitude $-t$), the electron can make a "virtual" hop to a neighboring site, as shown on the right of Fig. 22.1. The state on the right, $|X\rangle$, is of higher energy (in the absence of hopping it has energy U because of the double occupancy). Using second order perturbation theory we obtain

$$ E(|GS_0\rangle + \mbox{hopping}) = E(|GS_0\rangle) + \sum_X \frac{ |\langle X | H_{\rm hop} | GS_0 \rangle|^2 }{ E_{GS_0} - E_X } $$
$$ = E(|GS_0\rangle) - \frac{N z \, |t|^2}{U} $$
In the first line the sum is over all states $|X\rangle$ that can be reached in a single hop from the state $|GS_0\rangle$. In the second line, we have counted the number of such terms to be Nz, where z is the coordination number (the number of nearest neighbors) and N is the number of sites. Further, we have inserted $-t$ for the amplitude of hopping from one site to the next. Note that if the spins were all aligned, no virtual intermediate state $|X\rangle$ could exist, since it would violate the Pauli exclusion principle (hopping of electrons conserves spin state, so spins cannot flip over during a hop, and there is strictly no double occupancy). Thus we conclude that the antiferromagnetic state has its energy lowered compared to the ferromagnetic state in the limit of large U in a Mott insulating phase.

 

Figure 22.1: Spin Configurations of the Half Filled Hubbard Model. Left: The proposed antiferro- magnetic ground state in the limit that t is very small. Right: A higher energy state in the limit of small t which can occur by an electron from one site hopping onto a neighboring site. The energy penalty for double occupancy is U.

Admittedly, the above argument appears a bit handwaving (it is correct though!). To make the argument more precise, one should be much more careful about how one represents states with multiple electrons. This typically requires field-theoretic techniques. A very simple example of how this is done (without more advanced techniques) is presented in the appendix to this chapter.

Nonetheless, the general physics of why the antiferromagnetic Mott insulator state should be lower energy than its ferromagnetic counterpart can be understood qualitatively without resorting to the more precise arguments. On each site one can think of an electron as being confined by the interaction with its neighbors to that site. In the ferromagnetic case, the electron cannot make any excursions to neighboring sites because of the Pauli exclusion principle (those states are occupied). However, in the antiferromagnetic case, the electron can make excursions; even though the energy is higher when the electron wanders onto a neighboring site, there is nonetheless some amplitude for this to happen¹³. Allowing the electron wavefunction to spread out always lowers its energy¹⁴.

Indeed, a Mott insulator (on a square or cubic lattice) is typically an antiferromagnet (unless some other interesting physics overwhelms this tendency). It is generally believed that there is a substantial range of t, U, and doping x where the ground state is antiferromagnetic. Indeed, many real materials are thought to be examples of antiferromagnetic Mott insulators. Interestingly, it turns out that in the limit of very strong on-site interaction, $U \to \infty$, adding even a single additional hole to the half-filled Mott insulator will turn the Mott antiferromagnet into a ferromagnet! This rather surprising result, due to Nagaoka and Thouless¹⁵ (one of the few key results about the Hubbard model which is known as a rigorous theorem), shows the general complexity of this model.

¹³ Similar to when a particle is in a potential well V(x): there is some amplitude to find the particle at a position where V(x) is very large.

¹⁴ By increasing Δx we can decrease Δp and thus lower the kinetic energy of the particle, as per the Heisenberg uncertainty principle.

22.3 Summary of the Hubbard Model

• The Hubbard model includes tight-binding hopping t and an on-site "Hubbard" interaction U.

• For a partially filled band, the repulsive interaction (if strong enough) makes the system an (itinerant) ferromagnet: aligned spins cannot doubly occupy sites, and are therefore lower in energy with respect to U, although it costs more kinetic energy to align all the spins.

• For a half-filled band, the repulsive interaction makes the Mott insulator antiferromagnetic: virtual hopping lowers the energy of anti-aligned neighboring spins.

References on Hubbard Model

Unfortunately there are essentially no references that I know of that are readable without background in field theory and second quantization.

22.4 Appendix: The Hubbard model for the Hydrogen Molecule

Since my perturbative calculation above showing antiferromagnetism is very hand-waving, I thought it useful to do a real (but very simple) calculation showing how, in principle, these calculations are done more properly. This appendix is certainly nonexaminable, but if you are confused about the above discussion of antiferromagnetism in the Hubbard model, this appendix might be enlightening to read.

The calculation given here will address the Hubbard model for the hydrogen molecule. Here we consider two nuclei, A and B, near each other, with a total of two electrons, and we consider only the lowest spatial orbital (the s-orbital) of each atom¹⁶. There are then four possible states which an electron can be in:
$$ A\uparrow \qquad A\downarrow \qquad B\uparrow \qquad B\downarrow $$
To indicate that we have put electron 1 in, say, the $A\uparrow$ state, we write the wavefunction
$$ |A\uparrow\rangle \;\longleftrightarrow\; \varphi_{A\uparrow}(1) $$

(Here $\varphi_{A\uparrow}$ is the wavefunction, and (1) is shorthand for the position $\mathbf{r}_1$ as well as the spin coordinate $\sigma_1$.)

¹⁵ David Thouless, born in Scotland, is one of the most prominent names in modern condensed matter physics. He has not yet won a Nobel prize, but he is frequently mentioned as a high contender. Yosuke Nagaoka is a prominent Japanese theorist.

¹⁶ This technique can in principle be used for any number of electrons in any number of orbitals, although exact solution becomes difficult as the Schroedinger matrix becomes very high dimensional and hard to diagonalize exactly, necessitating sophisticated approximation methods.

For a two-electron state, we are only allowed to write wavefunctions that are overall antisymmetric. So, given two single-electron orbitals α and β (where α and β take values in the four possible orbitals $A\uparrow, A\downarrow, B\uparrow, B\downarrow$), we write so-called Slater determinants to describe the antisymmetric two-particle wavefunctions
$$ |\alpha; \beta\rangle = \frac{1}{\sqrt{2}} \det \begin{pmatrix} \alpha(1) & \beta(1) \\ \alpha(2) & \beta(2) \end{pmatrix} = \big( \alpha(1)\beta(2) - \beta(1)\alpha(2) \big)/\sqrt{2} = -|\beta; \alpha\rangle $$

Note that this Slater determinant can be generalized to write a fully antisymmetric wavefunction for any number of electrons. If the two orbitals are the same, then the wavefunction vanishes (as it must by Pauli exclusion). For our proposed model of the hydrogen molecule, we thus have six possible states for the two electrons:
$$ |A\uparrow; A\downarrow\rangle = -|A\downarrow; A\uparrow\rangle $$
$$ |A\uparrow; B\uparrow\rangle = -|B\uparrow; A\uparrow\rangle $$
$$ |A\uparrow; B\downarrow\rangle = -|B\downarrow; A\uparrow\rangle $$
$$ |A\downarrow; B\uparrow\rangle = -|B\uparrow; A\downarrow\rangle $$
$$ |A\downarrow; B\downarrow\rangle = -|B\downarrow; A\downarrow\rangle $$
$$ |B\uparrow; B\downarrow\rangle = -|B\downarrow; B\uparrow\rangle $$
The Hubbard interaction energy (Eq. 22.1) is diagonal in this basis; it simply gives an energy penalty U when there are two electrons on the same site. We thus have
$$ \langle A\uparrow; A\downarrow | H_{\rm interaction} | A\uparrow; A\downarrow \rangle = \langle B\uparrow; B\downarrow | H_{\rm interaction} | B\uparrow; B\downarrow \rangle = U $$
and all other matrix elements are zero.

To evaluate the hopping term we refer back to where we introduced tight binding in section 5.3.2 and chapter 10. Analogous to that discussion, the hopping term with amplitude $-t$ turns an $A\uparrow$ orbital into a $B\uparrow$ orbital or vice versa, and similarly turns an $A\downarrow$ into a $B\downarrow$ and vice versa (the hopping does not change the spin). Thus, for example, we have
$$ \langle A\downarrow; B\uparrow | H_{\rm hop} | A\downarrow; A\uparrow \rangle = -t $$
where here the hopping term turned the $B\uparrow$ into an $A\uparrow$. Note that this implies similarly that
$$ \langle A\downarrow; B\uparrow | H_{\rm hop} | A\uparrow; A\downarrow \rangle = t $$
since $|A\downarrow; A\uparrow\rangle = -|A\uparrow; A\downarrow\rangle$.

Since there are six possible basis states, our most general Hamiltonian can be expressed as a six-by-six matrix. We thus write our Schroedinger equation as
$$ \begin{pmatrix}
U & 0 & -t & t & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
-t & 0 & 0 & 0 & 0 & -t \\
t & 0 & 0 & 0 & 0 & t \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & -t & t & 0 & U
\end{pmatrix}
\begin{pmatrix} \psi_{A\uparrow A\downarrow} \\ \psi_{A\uparrow B\uparrow} \\ \psi_{A\uparrow B\downarrow} \\ \psi_{A\downarrow B\uparrow} \\ \psi_{A\downarrow B\downarrow} \\ \psi_{B\uparrow B\downarrow} \end{pmatrix}
= E
\begin{pmatrix} \psi_{A\uparrow A\downarrow} \\ \psi_{A\uparrow B\uparrow} \\ \psi_{A\uparrow B\downarrow} \\ \psi_{A\downarrow B\uparrow} \\ \psi_{A\downarrow B\downarrow} \\ \psi_{B\uparrow B\downarrow} \end{pmatrix} $$
where we mean that the full wavefunction is the sum
$$ |\Psi\rangle = \psi_{A\uparrow A\downarrow} |A\uparrow; A\downarrow\rangle + \psi_{A\uparrow B\uparrow} |A\uparrow; B\uparrow\rangle + \psi_{A\uparrow B\downarrow} |A\uparrow; B\downarrow\rangle + \psi_{A\downarrow B\uparrow} |A\downarrow; B\uparrow\rangle + \psi_{A\downarrow B\downarrow} |A\downarrow; B\downarrow\rangle + \psi_{B\uparrow B\downarrow} |B\uparrow; B\downarrow\rangle $$
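As a check, the spectrum of this six-by-six Hamiltonian can be verified numerically using matrix-vector products alone. A sketch (t = 1 and U = 8 are illustrative values; the basis is ordered as in the wavefunction expansion above, with +/- standing for up/down spins):

```python
# Numerical check of the 6x6 two-site Hubbard Hamiltonian.
# Basis order: (A+A-, A+B+, A+B-, A-B+, A-B-, B+B-).
import math

t, U = 1.0, 8.0
H = [[ U, 0, -t,  t, 0,  0],
     [ 0, 0,  0,  0, 0,  0],
     [-t, 0,  0,  0, 0, -t],
     [ t, 0,  0,  0, 0,  t],
     [ 0, 0,  0,  0, 0,  0],
     [ 0, 0, -t,  t, 0,  U]]

def is_eigvec(v, E, tol=1e-9):
    """Check H v = E v component by component."""
    Hv = [sum(H[i][j] * v[j] for j in range(6)) for i in range(6)]
    return all(abs(Hv[i] - E * v[i]) < tol for i in range(6))

# Triplet states: energy 0
assert is_eigvec([0, 1, 0, 0, 0, 0], 0.0)
assert is_eigvec([0, 0, 0, 0, 1, 0], 0.0)
assert is_eigvec([0, 0, 1, 1, 0, 0], 0.0)

# (|A+A-> - |B+B->): energy U
assert is_eigvec([1, 0, 0, 0, 0, -1], U)

# Ground state: a singlet mixing (|A+A-> + |B+B->) with (|A+B-> - |A-B+>)
E_ground = 0.5 * (U - math.sqrt(U * U + 16 * t * t))
x = -E_ground / (2 * t)          # weight of the doubly occupied components
assert is_eigvec([x, 0, 1, -1, 0, x], E_ground)
print(E_ground)  # close to -4*t*t/U = -0.5 when t/U is small
```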

We note immediately that the Hamiltonian is block diagonal. We have eigenstates
$$ |A\uparrow; B\uparrow\rangle \qquad\mbox{and}\qquad |A\downarrow; B\downarrow\rangle $$
both with energy E = 0 (hopping is not allowed, and there is no double occupancy, so no Hubbard interaction either). The remaining four-by-four Schroedinger equation is then

$$ \begin{pmatrix}
U & -t & t & 0 \\
-t & 0 & 0 & -t \\
t & 0 & 0 & t \\
0 & -t & t & U
\end{pmatrix}
\begin{pmatrix} \psi_{A\uparrow A\downarrow} \\ \psi_{A\uparrow B\downarrow} \\ \psi_{A\downarrow B\uparrow} \\ \psi_{B\uparrow B\downarrow} \end{pmatrix}
= E
\begin{pmatrix} \psi_{A\uparrow A\downarrow} \\ \psi_{A\uparrow B\downarrow} \\ \psi_{A\downarrow B\uparrow} \\ \psi_{B\uparrow B\downarrow} \end{pmatrix} $$
We find one more eigenvector, $\propto (0, 1, 1, 0)$, with energy E = 0, corresponding to the state¹⁷
$$ \frac{1}{\sqrt{2}} \big( |A\uparrow; B\downarrow\rangle + |A\downarrow; B\uparrow\rangle \big) $$
A second eigenstate has energy U and has a wavefunction
$$ \frac{1}{\sqrt{2}} \big( |A\uparrow; A\downarrow\rangle - |B\uparrow; B\downarrow\rangle \big) $$
The remaining two eigenstates are more complicated, and have energies $\frac{1}{2} \big( U \pm \sqrt{U^2 + 16 t^2} \big)$. The ground state always has energy
$$ E_{\rm ground} = \frac{1}{2} \left( U - \sqrt{U^2 + 16 t^2} \right) $$
In the limit of t/U going to zero, the ground state wavefunction becomes very close to
$$ \frac{1}{\sqrt{2}} \big( |A\uparrow; B\downarrow\rangle - |A\downarrow; B\uparrow\rangle \big) + \mathcal{O}(t/U) \qquad (22.5) $$
with amplitudes of order t/U for the two electrons to be on the same site. In this limit the energy goes to
$$ E_{\rm ground} = -4 t^2 / U $$
which is almost in agreement with our perturbative calculation above: the prefactor differs from that calculation by a factor of 2. The reason for this discrepancy is that the ground state is not just ↑ on one site and ↓ on the other, but rather a superposition between the two. This superposition can be thought of as a (covalent) chemical bond (containing two electrons) between the two atoms.

In the opposite limit, $U/t \to 0$, the ground state wavefunction for a single electron is the symmetric superposition $(|A\rangle + |B\rangle)/\sqrt{2}$ (see section 5.3.2), assuming t > 0. This is the so-called "bonding" orbital. The ground state for two electrons is then just the filling of this bonding orbital with both spins, resulting in
$$ \frac{|A\uparrow\rangle + |B\uparrow\rangle}{\sqrt{2}} \otimes \frac{|A\downarrow\rangle + |B\downarrow\rangle}{\sqrt{2}} = \frac{1}{2} \big( |A\uparrow; A\downarrow\rangle + |A\uparrow; B\downarrow\rangle + |B\uparrow; A\downarrow\rangle + |B\uparrow; B\downarrow\rangle \big) $$
$$ = \frac{1}{2} \big( |A\uparrow; A\downarrow\rangle + |A\uparrow; B\downarrow\rangle - |A\downarrow; B\uparrow\rangle + |B\uparrow; B\downarrow\rangle \big) $$
Note that eliminating the double-occupancy states (simply crossing them out)¹⁸ yields precisely the same result as Eq. 22.5. Thus, as the interaction is turned on, it simply suppresses the double occupancy in this case.

¹⁷ The three states with E = 0 are in fact the $S_z = -1, 0, 1$ states of S = 1.
Since the Hamiltonian is rotationally invariant, these all have the same energy.

¹⁸ Eliminating doubly occupied orbitals by hand is known as Gutzwiller projection (after Martin Gutzwiller) and is an extremely powerful approximation tool for strongly interacting systems.

Chapter 23

Magnetic Devices

This is the chapter on magnetic devices. It is NONEXAMINABLE. It is also NONFINISHED. I hope to finish it soon!

Indices

These notes have two indices¹. In the index of people, Nobel laureates are marked with *. There are over 50 of them. A few stray celebrity pop stars got into the index as well. A few people whose names are mentioned did not end up in the index because the use of their name is so common that it is not worth indexing them as people. A few examples are Coulomb's law, the Fourier transform, Boltzmann's constant, the Taylor expansion, the Hamiltonian, the Jacobian, and so forth. But then again, I did index the Schroedinger equation and Fermi statistics under Schroedinger and Fermi respectively. So I'm not completely consistent. So sue me.

The index of topics was much more difficult to put together. It was hard to figure out what the most useful division of topics would be. I tried to do this so that the index would be maximally useful, but I'm not sure how good a job I did. Most book indices are not very useful, and now I know why: it is hard to predict what a reader is going to want to look up.

1Making it a tensor. har har.


Index of People

Anderson, Philip*, 1, 2 Frankenheim, Moritz, 111 Appleton, Edward*, 150 Franklin, Benjamin, 33, 203, 228 Franklin, Rosalind, 147 Bardeen, John**, 46 Franz, Rudolph, 24, 25, 32, 37 Bednorz, Johannes*, 235 Fuller, Richard Buckminster, 58 Berg, Moe, 141 Bethe, Hans*, 27 Galton, Francis, 143 Bloch, Felix*, 36, 160–161, 219–222 Geim, Andre*, 196 Bohr, Niels*, 33, 132, 181, 195, 200, 228 Gell-Mann, Murray*, ii Boltzmann, Ludwig, 8, 17, 19, 20, 67, 184, Gutzwiller, Martin, 243 232 Born, Max*, 12–13, 48 Hall, Edwin, 21–23, 179 Bose, Satyendra, 9, 33, 72, 75 Heath, James*, 58 Bragg, William Henry*, 132–135, 140–141, Heisenberg, Werner*, 27, 33, 211, 212, 215, 143–144, 147 217, 240 Bragg, William Lawrence*, 132–135, 140–141, Hevesy, George*, 132 143–144, 147 Higgs, Peter, 2 Braun, Karl Ferdinand*, 191 Hodgkin, Dorothy*, 147 Bravais, Auguste, 99, 107, 110–111, 123 Hubbard, John, 235–243 Brillouin, Leon, 69, 71, 74, 75, 80–86, 95, Huber, Robert*, 147 123–127, 153–161, 163, 165–170, 172 Hund, Friedrich, 197–205, 211 Brockhouse, Bertram*, 136, 147 Ising, Ernst, 211, 215–217, 219, 227–232

Crick, Francis*, 147 Kendrew, John*, 147 Curie, Marie**, 46, 204 Kepler, Johannes, 107 Curie, Pierre*, 204, 205, 230–232 Klechkovsky,Vsevolod, 198 Kohn, Walter*, 44 Darwin, Charles Galton, 143 Kronecker, Leopold, 116 Darwin, Charles Robert, 143 Kroto, Harold*, 58 de Hevesy, George*, see Hevesy, George* Debye, Peter*, 11–17, 24, 27, 41, 67, 71, 73– Landau, Lev*, 33, 37, 170, 205, 208 74, 80, 141, 143 Langevin, Paul, 204 Deisenhofer, Johann*, 147 Larmor, Joseph, 205–208 Dirac, Paul*, 27–30, 33, 132, 175 Laue, Max von*, 132–135, 140–141 Drude, Paul, 19–26, 32, 35, 37, 178–179, 184 Laughlin, Robert*, 3 Dulong, Pierre, 7–8, 10, 15 Leeuwen, Hendrika van, 195 Lenz, Heinrich, 196 Earnshaw, Samuel, 196 Lenz, Wilhelm*, 216 Ehrenfest, Paul, 19 Lipscomb, William*, 147 Einstein, *, 8–11, 13, 17, 33, 67, 73, Lorentz, Hendrik*, 20, 24, 32, 143, 207 207, 232 Lorenz, Ludvig, 24 Very Smart, 11 Madelung, Erwin, 45, 198 Faraday, Michael, 196 Magnes, Shephard, 195 Fawcett, Farrah, 48 Marconi, Guglielmo*, 191 Fermi, Enrico*, 24, 27–30, 132–133, 184 Mather, John*, 12 Floquet, Gaston, 160 Merton, Robert, 160 INDEX OF PEOPLE 249

Michel, Hartmut*, 147 Thomson, Joseph John*, 19 Miller, William Hallowes, 119–123 Thouless, David, 241 Mott, Nevill*, 170, 208, 211, 239, 240 Travolta, John, 48 Müller, Karl Alex*, 235 Tsui, Dan*, 3 Mulliken, Robert*, 46, 197 Van der Waals, J. D.*, 53–54 Néel, Louis*, 212–213, 219–222 van Leeuwen, Hendrika, see Leeuwen, Hendrika van Nagaoka, Yosuke, 241 Newton, Isaac, 33, 60, 68, 177, 178 Van Vleck, John*, 205, 206 Newton-John, Irene Born, 48 Von Karman, Theodore, 12–13 Newton-John, Olivia, 48 von Laue, Max*, see Laue, Max von* Noether, Emmy, 75 Waller, Ivar, 143 Onsager, Lars*, 216 Watson, James*, 147 Oppenheimer, J. Robert, 48 Weiss, Pierre, 217, 227, 231 Wiedemann, Gustav, 24, 25, 32, 37 Pauli, Wolfgang*, 24, 27, 32–34, 37, 45, 178, Wigner, Eugene*, 33, 103–104, 109, 110, 112 201, 205, 207–208, 216, 236, 238, Wilson, Kenneth*, 2 240, 242 Pauling, Linus**, 27, 46 Zeeman, Pieter*, 33, 203, 223 Peierls, Rudolf, 27 Peltier, Jean Charles, 24–26 Perutz, Max*, 147 Petit, Alexis, 7–8, 10, 15 Planck, Max*, 12–14, 17 Poisson, Siméon, 117 Pople, John*, 44, 49

Rabi, Isadore Isaac*, 27 Riemann, Bernhard, 14 Rutherford, Ernest Lord*, 150 Rydberg, Johannes, 181

Sanger, Frederick**, 46 Scherrer, Paul, 141 Schroedinger, Erwin*, 3, 11, 33, 41–42, 44, 48–51, 88–90, 242 Schull, Clifford*, 136, 147, 213 Seebeck, Thomas, 25 Seitz, Frederick, 103–104, 109, 110, 112 Simon, Steven H., 1 Slater, John, 28, 242 Smalley, Richard*, 58 Smoot, George*, 12 Sommerfeld, Arnold, 26–37, 41 Spears, Britney, iii Stigler, Stephen, 160, 216 Stoner, Edmund, 237–239 Stormer, Horst*, 3 Superfluid, 207

Index of Topics

Acceptor, 179, 182, 183
Acoustic Mode, 81, 86
Adiabatic Demagnetization, 206
Alloy, 189
Amorphous Solid, 59, 147
Anderson-Higgs Mechanism, 2
Anisotropy Energy, 215, 219–221
Antibonding Orbital, 47, 51–53
Antiferromagnetism, 212–213, 216, 232
  Frustrated, 213–214
  Mott, see Mott Antiferromagnetism, 241
Atomic Form Factor, see Form Factor
Aufbau Principle, 197
Band, see Band Structure
Band Gap, 94, 96, 158, 161, 163, 174
  Designing of, 189–190
  Direct, see Direct Gap
  Indirect, see Indirect Gap
  Non-Homogeneous, 190
Band Insulator, 95, 96, 163, 167, 168, 174, 179
Band Structure, 90–96, 158, 161, 163–170
  Engineering, 189
  Failures of, 170
  of Diamond, 125
Bandwidth, 90
Basis
  in Crystal Sense, 79, 85, 104–106, 112
  Vectors
    Primitive, see Primitive Lattice Vectors
BCC Lattice, see Body Centered Cubic Lattice
Bloch Function, 160
Bloch Wall, 219–222
Bloch’s Theorem, 36, 160–161
Body Centered Cubic Lattice, 107–109, 112, 141
  Miller Indices, 120
  Selection Rules, 138–139
Bohr Magneton, 33, 200, 203, 204, 228
Boltzmann Model of Solids, 8, 17, 67, 232
Boltzmann Statistics, 184, 188
Boltzmann Transport Equation, 20
Bonding Orbital, 47, 51–53, 243
Books
  Good, iii–iv
Born-Oppenheimer Approximation, 48
Born-Von-Karman Boundary Condition, see Periodic Boundary Conditions
Bose Occupation Factor, 9, 72, 75
Bragg Condition, 132–135, 140–141, 143–144, 149
Bravais Lattice
  Nomenclatural Disagreements, 99
Bravais Lattice Types, 110–111
Brillouin Zone, 69, 71, 74, 75, 80–86, 96, 123–127, 153–161, 163, 167, 170, 172
  Boundary, 71, 84, 95, 96, 155–161, 163, 166, 167
  Definition of, 69, 124
  First, 69, 71, 74, 75, 84, 123–125, 161, 167–169
    Definition of, 124
    Number of k States in, 124
  Second, 84, 124, 167, 168
    Definition of, 124
Buckyball, 58
Bulk Modulus, 65
Carrier Freeze Out, 182, 188
Chemical Bond, 41–56
  Covalent, see Covalent Bond
  Fluctuating Dipolar, see Van der Waals Bond
  Hydrogen, see Hydrogen Bond
  Ionic, see Ionic Bond
  Metallic, see Metallic Bond
  Molecular, see Van der Waals Bond
  Van der Waals, see Van der Waals Bond
Compressibility, 64, 82
Condensed Matter
  Definition of, 1
Conduction Band, 163, 164, 173, 175, 179
Conductivity
  of Metals, 21
  Thermal, see Thermal Conductivity
Conventional Unit Cell, 102, 108, 110, 112
  For BCC Lattice, 108
  of FCC Lattice, 110
Coordination Number, 109
Cornstarch, 60
Covalent Bond, 42–44, 46–52, 243

Critical Temperature, 230
Crystal Field, 209
Crystal Momentum, 74–75, 92, 95, 132, 153, 161, 172
Crystal Plane, see Lattice Plane
Cubic Lattice, see Simple Cubic or FCC or BCC
Curie Law, 204, 205, 231
Curie Temperature, 230
Curie-Weiss Law, 231
Curse, 173
Debye Frequency, 13, 14
Debye Model of Solids, 11–17, 41, 67, 71–74, 80–143
Debye Temperature, 14
Debye-Scherrer Method, see Powder Diffraction
Debye-Waller Factor, 143
Density of States
  Electronic, 31, 33, 183, 184, 237
  of Debye Model, 13
  of One Dimensional Vibration Model, 73
Diamagnetism, 209
  Definition of, 196
  Landau, 33, 205, 208
  Larmor, 205–208, 210
Diffraction, 133–134, 213
Dipole Moment, see Electric Dipole Moment or Magnetic Dipole Moment
, 175
Direct Band Gap, 173
Direct Gap, 164, 171–172, 175
Direct Lattice, 70
Direct Transition, 171–173
Dispersion Relation
  of Electrons, 90
  of Vibrational Normal Modes, 68
  of Vibrations, 81
DNA, 55, 59, 147
Dollars
  One Million, 14
Domain Wall, 217–222, 225
Domains, 217–225
Donor, 179, 182, 183
Doped Semiconductor, 179–182, 186–187
Doping, see Impurities
Doughnut Universe, 12
Drude Model of Electron Transport, 19–27, 35, 37, 178–179, 185, 188
  Shortcomings of, 25
Dulong-Petit Law, 7, 8, 10, 15, 17
Earnshaw’s Theorem, 196
Effective Mass, 91, 96, 158, 175–177, 187
Einstein Frequency, 8, 10
Einstein Model of Solids, 8–11, 15, 17, 67, 73–74, 232
Einstein Temperature, 10
Elasticity, 64
Electric Dipole Moment, 53
Electric Susceptibility, see Polarizability, 53
Electron
  g-factor, see g-factor of Electron
Electron Affinity, 44–46
  Table of, 45
Electron Donor, see Donor
Electron Mobility, 179
Electron Transport, see Drude Model of Electron Transport
Electronegativity, 42, 46
  Mulliken, 46
Energy Band, see Band Structure
Exchange Interaction, 202, 211, 217
Extended Zone Scheme, 84–86, 94, 123, 158
Extrinsic Semiconductor
  Definition of, 179, 187
Face Centered Cubic Lattice, 109–110, 112, 141
  First Brillouin Zone of, 126
  Miller Indices, 120
  Selection Rules, 139–140
Family of Lattice Planes, 119, 120, 127, 135, 138
  Spacing Between, 121
Faraday’s Law, 196
FCC Lattice, see Face Centered Cubic Lattice
Fermi
  Energy, 29–31, 33, 37, 163, 166, 182
  Level, see Fermi Energy, 33
  Momentum, 29
  Occupation Factor, 28, 31, 184, 185
  Sea, 29, 30, 35, 166
  Sphere, 29, 35
  Statistics, 24, 25, 27–30, 35, 37, 184, 188

  Surface, 29–31, 92, 163, 166, 237
  Temperature, 29, 30, 32
  Velocity, 29, 30, 32, 35, 37
  Wavevector, 29, 33
, 37
Fermi’s Golden Rule, 132–133, 135
Fermi-Dirac Statistics, see Fermi Statistics
Ferrimagnetism, 214, 216, 217, 232
Ferromagnetism, 199, 212, 216–225, 229–232, 240–241
  Definition of, 196–197
  Hard, 224
  Itinerant, 235–239, 241
  Nagaoka-Thouless, 241
  Permanent, 224
First Brillouin Zone, see Brillouin Zone, First
Form Factor, 143
  of Neutrons, 136, 137
  of X-rays, 136–137
Fractional Quantum Hall Effect, 3
Free Electron Theory of Metals, see Sommerfeld Theory of Metals
g-factor
  Effective, 176
  of Electron, 33
  of Free spin, 204
Gecko, 54
General Relativity, 14
Glass, 59
Group Velocity, 71, 75, 176
Gutzwiller Projection, 243
Hall Effect, 35, 36, 179, 187
Hall Resistivity, 21–23, 25
Hall Sensor, 22
Harmonic Oscillator, 8, 72
Heat Capacity, see Specific Heat
  of Diamond, 8, 10–11
  of Gases, 7, 23
  of Metals, 17, 24, 26, 30–32
  of Solids, 7–17
    Debye Model, see Debye Model of Solids
    Einstein Model, see Einstein Model of Solids
  Table of, 8
Heisenberg Hamiltonian, see Heisenberg Model
Heisenberg Model, 211, 212, 214–217
Heisenberg Uncertainty, 240
, 2
High Temperature Superconductors, 141
Hole, 175, 176, 187
  Effective Mass of, 176–177
  Mobility of, 179
Hope Diamond, 174
Hopping, 50, 89
Hubbard Interaction, 236, 241
Hubbard Model, 235–243
Hund’s Rules, 197–205, 209, 211
Hydrogen Bond, 42–44, 54–55
Hydrogenic Impurity, 181
Hysteresis, 222–224
Impurities, 179–187
Impurity Band, 182
Impurity States, 180–183
Indirect Band Gap, 173
Indirect Gap, 164, 171–172
Indirect Transition, 171–173
Insulator, see Band Insulator or Mott Insulator
Integral
  Nasty, 14
Intrinsic Semiconductor, 186
  Definition of, 179, 187
Ionic Bond, 42–46, 49
Ionization Energy, 44–46
  Table of, 45
iPhone, 2, 189
Ising Model, 211, 215–217, 219, 227–231
Itinerant Ferromagnetism, see Ferromagnetism, Itinerant
Karma, i, iv
Kinetic Theory, 19, 23, 25
Klechkovsky’s Rule, see Madelung’s Rule
Landau Fermi Liquid Theory, 37
Laser, 189
Lattice, 78–79, 85, 99–112
  Definition of, 99–101
Lattice Constant, 64, 85, 108, 110, 121, 122, 143
  Definition of, 78
Lattice Plane, 118
  Family of, see Family of Lattice Planes
Laue Condition, 132–135, 140, 149
Laue Equation, see Laue Condition
Laue Method, 140

Law of Dulong-Petit, see Dulong-Petit Law
Law of Mass Action, see Mass Action, Law of, 188
LCAO, see Linear Combination of Atomic Orbitals
Lenz’s Law, 196
Linear Combination of Atomic Orbitals, 49
Liquid, 58, 147
Liquid-Crystal, 59
Lorentz Correction, 143, 144
Lorentz Force, 20, 32, 33
Lorentz-Polarization Correction, 143
Lorenz Number, 24
Madelung Energy, 45
Madelung Rule, 198
Magnetic Devices, 245
Magnetic Levitation, 196
Magnetic Susceptibility, 196, 204, 205, 207, 209, 231, 239
Magnetism, 32–34, 37, 170, 174, 195–210
  Animal, 195
Magnetization, 33, 196, 222, 227, 237
Mass Action, Law of, 186–188
Mean Field Theory, 227–233, 236–239
Metal, 92, 96, 163, 165, 174
Metal-Insulator Transition, 95
Metallic Bond, 42–44, 54, 91
Miller Indices, 119–123, 127, 138
  for FCC and BCC Lattices, 120
Minimal Coupling, 203
Mobility, 21, 179, 187
Modified Plane Wave, 160
Molar Heat Capacity, 7, see Heat Capacity
Molecular Crystal, 58
Molecular Field Theory, see Mean Field Theory
Molecular Orbital Theory, see Tight Binding Model
Mott Antiferromagnetism, 239–241
Mott Insulator, 170, 174, 208, 211, 239–241
Multiplicity, see Scattering Multiplicity
n-Dopant, see Donor
Néel state, see Antiferromagnetism
Néel Wall, 219–222
Nearly Free Electron Model, 153–160, 166–169
Nematic, 59
Neutrons, 131, 136, 141, 153, 213
  Comparison with X-rays, 137, 148
  Sources, 149
  Spin of, 137
Newton’s Equations, 68, 79, 177–179
Noether’s Theorem, 75
Non-, 60
Normal Modes, 68, 71–72, 75
  Enumeration of, 71–72, 163
Nuclear Scattering Length, 136, 137
One Dimension
  Diatomic Chain, 77–86
  Monatomic Chain, 65–75, 90
  Tight Binding Model, see Tight Binding Model of One Dimensional Solid
Optical Mode, 82, 86
Optical Properties, 82, 171–174
  Effect of Impurities, 173
  of Impurities, 182
  of Insulators and Semiconductors, 171–172
  of Metals, 36, 172–173
Orthorhombic Lattice, 106
p-Dopant, see Acceptor
p-n Junction, 191
Paramagnetism, 209, 231
  Curie, see Paramagnetism of Free Spins
  Definition of, 196
  Langevin, see Paramagnetism of Free Spins
  of Free Electrons, see Paramagnetism, Pauli
  of Free Spins, 203–206, 208–209, 232
  of Metals, see Paramagnetism, Pauli
  Pauli, 32–34, 205, 207–209, 239
  Van Vleck, 205, 206
Particle in a Box, 47, 190, 201
Particle-Wave Duality, 131
Pauli Exclusion Principle, 24, 27, 30, 45, 178, 200, 201, 236, 240, 242
Pauli Paramagnetism, see Paramagnetism, Pauli, 37
Peltier Effect, 24–26, 32
Periodic Boundary Conditions, 12–13, 28
Periodic Table, 42, 44–46, 179, 199
Perturbation Theory, 154–155, 206, 239
  Degenerate, 155
Phase Velocity, 71, 75
Phonon, 72–75, 90, 96, 172
  Definition of, 72

  Spectrum
    of Diamond, 126
Pinning, 218–219, 223–225
Plan View, 107, 109, 112
Polarizability, 53
Polymer, 59
Positron, 175
Powder Diffraction, 141–146, 149
Primitive Basis Vectors, see Primitive Lattice Vectors
Primitive Lattice Vectors, 99, 116
Primitive Unit Cell, 104, 112
  Definition of, 102
Proteins, 147
Quantum Gravity, 2
Quantum Well, 190
Quarks, ii, 2
Raise
  Steve Simon Deserves, i
Rant, 2–3, 42
Reciprocal Lattice, 68–70, 74, 75, 96, 115–123, 127, 132–135, 138, 153, 154
  as Fourier Transform, 117–118
  Definition of, 70, 115–116
Reciprocal Space, 74–75
  Definition of, 69
Reduced Zone Scheme, 81, 84, 86, 94, 123, 153
Reductionism, 2–3, 42
Refrigeration, 196, 206
  Thermoelectric, 25
, 2
Repeated Zone Scheme, 159
Resistivity
  Hall, see Hall Resistivity
  of Metals, 21
Riemann Hypothesis, 14
Riemann Zeta Function, 14, 17–18
Rotating Crystal Method, 140
Rydberg, 181, 188
Scattering, see Wave Scattering
  Amplitudes, 135–137
  Form Factor, see Form Factor
  in Amorphous Solids, 147
  in Liquids, 147
  Inelastic, 147
  Intensity, 135, 137–138, 143, 145
  Multiplicity, 142
Scattering Time, 19, 23, 35, 179
Schroedinger Equation, 3, 11, 41–42, 44, 48–51, 88–90, 96, 155, 161, 242, 243
Seebeck Effect, 25, 32
Selection Rules, see Systematic Absences
  Table of, 141
Semiconductor, 165, 174, 179
  Devices, 189–191
  Heterostructure, 190
  Laser, 189
  Physics, 175–188
  Statistical Physics of, 182–187
Simple Cubic Lattice, 106, 108, 109, 112, 119, 120, 122, 125, 127, 137–141
  Spacing Between Lattice Planes, 121
Slater Determinant, 28, 242
Soccer, 58
Somerville College, 147
Sommerfeld Theory of Metals, 27–37, 41
  Shortcomings of, 35
Sound, 11, 14, 64–65, 70–72, 75, 81–82
Spaghetti Diagram, 126
Spallation, 149
Specific Heat, 7, see Heat Capacity
  of Diamond, 8, 10–11
  of Gases, 7, 23
  of Metals, 17, 24, 26, 30–32, 37
  of One Dimensional Quantum Model, 72–73
  of Solids, 7–17
    Boltzmann Model, see Boltzmann Model of Solids
    Debye Model, see Debye Model of Solids
    Einstein Model, see Einstein Model of Solids
  Table of, 8
Spin Stiffness, 220, 221
Spin-orbit, 42, 176, 200
Spontaneous Order, 197, 212
Squalid State, ii
Stern-Gerlach Experiment, 137
Stoner Criterion, 237–239
Stoner Ferromagnetism, see Ferromagnetism, Itinerant
Structure Factor, 118, 135, 137–140, 143, 145, 149
Superconductor, 205, 235
Susceptibility

  Electric, see Polarizability
  Magnetic, see Magnetic Susceptibility
Synchrotron, 149
Systematic Absences, 138–141, 149

Tetragonal Lattice, 106
Thermal Conductivity, 23–25
Thermal Expansion, 52, 65
Thermoelectric, 25
Thermopower, 25, 32
Tight Binding Model, 153, 163, 168–169, 235–236, 241, 242
  of Covalent Bond, 47–52
  of One Dimensional Solid, 87–96
Time-of-Flight, 149
Topological , 2

Unit Cell, 77–79, 85, 94, 101–112
  Conventional, see Conventional Unit Cell
  Definition of, 78, 101
  Primitive, see Primitive Unit Cell
  Wigner-Seitz, see Wigner-Seitz Unit Cell

Valence, 22, 36, 92, 93, 96, 163, 174
Valence Band, 163, 164, 173, 175, 179
Van der Waals Bond, 42–44, 53–54, 58
Van Vleck Paramagnetism, see Paramagnetism, Van Vleck
Variational Method, 48, 88
Virtual Crystal Approximation, 189, 232

Wave Scattering, 131–149
Weiss Domain, see Domain
Weiss Mean Field Theory, see Mean Field Theory
Wiedemann-Franz Law, 24, 25, 32, 37
Wigner-Seitz Unit Cell, 103–104, 109, 110, 112, 124–125, 127
  of BCC Lattice, 109
  of FCC Lattice, 110
Wikipedia, 1

X-rays, 131, 136–137, 140–141, 147, 153
  Comparison with Neutrons, 137, 148
  Sources, 148

Zeeman Coupling, 33, 203, 223
Zeeman Term, 203
Zeta Function, see Riemann Zeta Function
Zone Boundary, see Brillouin Zone Boundary