Preprint typeset in JHEP style - HYPER VERSION 2018
Gauge Theory
David Tong Department of Applied Mathematics and Theoretical Physics, Centre for Mathematical Sciences, Wilberforce Road, Cambridge, CB3 OBA, UK
http://www.damtp.cam.ac.uk/user/tong/gaugetheory.html [email protected] Contents
0. Introduction 1
1. Topics in Electromagnetism 3 1.1 Magnetic Monopoles 3 1.1.1 Dirac Quantisation 4 1.1.2 A Patchwork of Gauge Fields 6 1.1.3 Monopoles and Angular Momentum 8 1.2 The Theta Term 10 1.2.1 The Topological Insulator 11 1.2.2 A Mirage Monopole 14 1.2.3 The Witten Effect 16 1.2.4 Why θ is Periodic 18 1.2.5 Parity, Time-Reversal and θ = π 21 1.3 Further Reading 22
2. Yang-Mills Theory 26 2.1 Introducing Yang-Mills 26 2.1.1 The Action 29 2.1.2 Gauge Symmetry 31 2.1.3 Wilson Lines and Wilson Loops 33 2.2 The Theta Term 38 2.2.1 Canonical Quantisation of Yang-Mills 40 2.2.2 The Wavefunction and the Chern-Simons Functional 42 2.2.3 Analogies From Quantum Mechanics 47 2.3 Instantons 51 2.3.1 The Self-Dual Yang-Mills Equations 52 2.3.2 Tunnelling: Another Quantum Mechanics Analogy 56 2.3.3 Instanton Contributions to the Path Integral 58 2.4 The Flow to Strong Coupling 61 2.4.1 Anti-Screening and Paramagnetism 65 2.4.2 Computing the Beta Function 67 2.5 Electric Probes 74 2.5.1 Coulomb vs Confining 74 2.5.2 An Analogy: Flux Lines in a Superconductor 78
– 1 – 2.5.3 Wilson Loops Revisited 85 2.6 Magnetic Probes 88 2.6.1 ’t Hooft Lines 89
2.6.2 SU(N) vs SU(N)/ZN 92 2.6.3 What is the Gauge Group of the Standard Model? 97 2.7 Dynamical Matter 99 2.7.1 The Beta Function Revisited 100 2.7.2 The Infra-Red Phases of QCD-like Theories 102 2.7.3 The Higgs vs Confining Phase 105 2.8 ’t Hooft-Polyakov Monopoles 109 2.8.1 Monopole Solutions 112 2.8.2 The Witten Effect Again 114 2.9 Further Reading 115
3. Anomalies 121 3.1 The Chiral Anomaly: Building Some Intuition 121 3.1.1 Massless Fermions in Two Dimensions 122 3.1.2 Massless Fermions in Four Dimensions 126 3.2 Deriving the Chiral Anomaly 129 3.2.1 Noether’s Theorem and Ward Identities 129 3.2.2 The Anomaly lies in the Measure 133 3.2.3 Triangle Diagrams 140 3.2.4 Chiral Anomalies and Gravity 148 3.3 Fermi Zero Modes 149 3.3.1 The Atiyah-Singer Index Theorem 149 3.3.2 Instantons Revisited 151 3.3.3 The Theta Term Revisited 153 3.3.4 Topological Insulators Revisited 155 3.4 Gauge Anomalies 157 3.4.1 Abelian Chiral Gauge Theories 158 3.4.2 Non-Abelian Gauge Anomalies 161 3.4.3 The SU(2) Anomaly 165 3.4.4 Anomaly Cancellation in the Standard Model 170 3.5 ’t Hooft Anomalies 174 3.6 Anomalies in Discrete Symmetries 177 3.6.1 An Anomaly in Quantum Mechanics 178 3.6.2 Generalised Symmetries 184 3.6.3 Discrete Gauge Symmetries 188
– 2 – 3.6.4 Gauging a ZN One-Form Symmetry 192 3.6.5 A ’t Hooft Anomaly in Time Reversal 196 3.7 Further Reading 197
4. Lattice Gauge Theory 200 4.1 Scalar Fields on the Lattice 202 4.2 Gauge Fields on the Lattice 204 4.2.1 The Wilson Action 204 4.2.2 The Haar Measure 208 4.2.3 The Strong Coupling Expansion 213 4.3 Fermions on the Lattice 218 4.3.1 Fermions in Two Dimensions 218 4.3.2 Fermions in Four Dimensions 222 4.3.3 The Nielsen-Ninomiya Theorem 224 4.3.4 Approaches to Lattice QCD 227 4.4 Towards Chiral Fermions on the Lattice 232 4.4.1 Domain Wall Fermions 233 4.4.2 Anomaly Inflow 235 4.4.3 The Ginsparg-Wilson Relation 238 4.4.4 Other Approaches 240 4.5 Further Reading 240
5. Chiral Symmetry Breaking 243 5.1 The Quark Condensate 244 5.1.1 Symmetry Breaking 246 5.2 The Chiral Lagrangian 248 5.2.1 Pion Scattering 249 5.2.2 Currents 251 5.2.3 Adding Masses 252 5.3 Miraculously, Baryons 253 5.3.1 The Skyrme Model 255 5.3.2 Skyrmions 257 5.4 QCD 258 5.4.1 Mesons 260 5.4.2 Baryons 263 5.4.3 Electromagnetism, the Weak Force, and Pion Decay 264 5.5 The Wess-Zumino-Witten Term 268 5.5.1 An Analogy: A Magnetic Monopole 269
– 3 – 5.5.2 A Five-Dimensional Action 271 5.5.3 Baryons as Bosons or Fermions 274 5.6 ’t Hooft Anomaly Matching 276 5.6.1 Confinement Implies Chiral Symmetry Breaking 277
5.6.2 Massless Baryons when Nf = 2? 282 5.6.3 The Vafa-Witten-Weingarten Theorems 284 5.6.4 Chiral Gauge Theories Revisited 293 5.7 Further Reading 297
6. Large N 300 6.1 A Quantum Mechanics Warm-Up: The Hydrogen Atom 301 6.2 Large N Yang-Mills 303 6.2.1 The Topology of Feynman Diagrams 303 6.2.2 A Stringy Expansion of Yang-Mills 309 6.2.3 The Large N Limit is Classical 311 6.2.4 Glueball Scattering and Decay 313 6.2.5 Theta Dependence Revisited 316 6.3 Large N QCD 318 6.3.1 Mesons 320 6.3.2 Baryons 322 6.4 The Chiral Lagrangian Revisited 326 6.4.1 Including the η0 326 6.4.2 Rediscovering the Anomaly 328 6.4.3 The Witten-Veneziano Formula 330 6.5 Further Reading 331
7. Quantum Field Theory on the Line 332 7.1 Electromagnetism in Two Dimensions 332 7.1.1 The Theta Angle 334 7.1.2 The Theta Angle is a Background Electric Field 336 7.2 The Abelian-Higgs Model 337 7.2.1 Vortices 339 7.2.2 The Wilson Loop 342 7.3 The CPN−1 Model 344 7.3.1 A Mass Gap 346 7.3.2 Confinement 348 7.3.3 Instantons 349 7.4 Fermions in Two Dimensions 351
– 4 – 7.4.1 The Gross-Neveu Model 352 7.4.2 Kinks in the Gross-Neveu Model 355 7.4.3 The Chiral Gross-Neveu Model 357 7.4.4 Back to Basics: Quantising Fermions in 2d 361 7.5 Bosonization in Two Dimensions 362 7.5.1 T-Duality 364 7.5.2 Canonical Quantisation of the Boson 366 7.5.3 The Bosonization Dictionary 370 7.5.4 The Allowed Operators: Is the Boson Really a Fermion? 372 7.5.5 Massive Thirring = Sine-Gordon 373
7.5.6 QED2: The Schwinger Model 376 7.6 Non-Abelian Bosonization 378 7.6.1 The Wess-Zumino-Witten Term 381 7.7 Further Reading 384
8. Quantum Field Theory on the Plane 387 8.1 Electromagnetism in Three Dimensions 387 8.1.1 Monopole Operators 388 8.2 The Abelian-Higgs Model 390 8.2.1 Particle-Vortex Duality 392 8.3 Confinement in d = 2 + 1 Electromagnetism 397 8.3.1 Monopoles as Instantons 398 8.3.2 Confinement 401 8.4 Chern-Simons Theory 403 8.4.1 Quantisation of the Chern-Simons level 403 8.4.2 A Topological Phase of Matter 406 8.4.3 Non-Abelian Chern-Simons Theories 410 8.5 Fermions and Chern-Simons Terms 412 8.5.1 Integrating out Massive Fermions 413 8.5.2 Massless Fermions and the Parity Anomaly 416 8.6 3d Bosonization 418 8.6.1 Flux Attachment 418 8.6.2 A Bosonization Duality 420 8.6.3 The Beginning of a Duality Web 424 8.6.4 Particle-Vortex Duality Revisited 426 8.6.5 Fermionic Particle-Vortex Duality 427 8.7 Further Reading 429
– 5 – – 6 – For Mum.
– 7 – Acknowledgements, Caveats and Apologies
The subject of quantum gauge dynamics is a rather mathematical one. These lectures makes no pretence at mathematical rigour. I have tried to put the physics front and centre, and introduced the relevant mathematics only when necessary. This might not be everyone’s cup of tea. But it’s mine. Also, I am not an ornithologist. I have no idea whether birds prefer to perch on wires at the end of the day, or during their mid-morning brunch. The opening paragraph should be read with poetic licence. The subsequent 400 pages should be more reliable.
My thanks to Pietro Benetti Genolini for many comments. I am supported by the STFC, by a Royal Society Wolfson Merit award, and by a Simons Investigator award.
– 8 – Pre-Requisites
These are advanced lectures on quantum field theory. They assume that you are com- fortable with the basics of canonical quantisation and, most importantly, path integral techniques. You can find an introduction to the former in my introductory lectures on Quantum Field Theory. Many of the ideas covered in these lectures have their genesis in statistical physics and, in particular, Wilson’s development of the renormalisation group; these were covered in the lectures on Statistical Field Theory.
Recommended Books and Resources
Much of the material covered in these lectures was discovered in a golden period of quantum field theory, dating from the mid 1970s and early 1980s, and underlies large swathes of current research. Some of this material can be found in the usual quantum field theory textbooks, but often they tend to peter out just as the fun gets going. Here are some books and resources which cover some relevant topics:
• John Preskill, Lectures on Quantum Field Theory
Preskill’s beautiful and comprehensive lectures on quantum field theory are the closest to this course and, in places, offer substantially more detail. Unfortunately they are available only in hand-written form, which means it can take some time to search for the topic you’re interested in.
They can be downloaded here: http://www.theory.caltech.edu/∼preskill/notes.html • Sidney Coleman, Aspects of Symmetry Despite their age, Coleman’s Erice lectures still sparkle. They cover only a small subset of the material we’ll need – solitons, instantons and large N are highlights – but do so with such charm that they shouldn’t be missed. • Alexander Polyakov, Gauge, Fields and Strings Polyakov is one of the masters of the path integral, whose pioneering work over the decades did much to cement our current understanding of quantum field theory. His book is not easy going, but rewards anyone who persists.
– 9 – • Gerard ’t Hooft, Under the Spell of the Gauge Principle
During the 1970s, ’t Hooft wrote a series of papers, each of which changed the way we think about quantum field theory. His name is attached to so many things in these lectures that it can, at times, get confusing. (How do ’t Hooft anomalies affect ’t Hooft lines in the ’t Hooft limit?) This book is a collection of preprints, prefaced by some brief remarks. Still, the originals are well worth the read.
• Yitzhak Frishman and Cobi Sonnenschein, Non-Perturbative Field Theory: From Two Dimensional Conformal Field Theory to QCD in Four Dimensions
The goal of this book is similar to these lectures but the itinerary is run in reverse, starting in two dimensions and building up to four.
• Eduardo Fradkin, Field Theories in Condensed Matter Physics
• Shankar, Quantum Field Theory and Condensed Matter
Both of these books discuss quantum field theory in condensed matter physics. Much of the material is restricted to field theories in d = 1 + 1 and d = 2 + 1 dimensions, and so useful for Sections 7 and 8. But the general approach to understanding the phase structure and behaviour of field theories should resonate.
Lecture notes on various topics discussed in these lectures can be downloaded from the course webpage.
– 10 – 0. Introduction
Towards the end of the day, as feathers droop and hearts flutter from too much flapping, it is not unusual to find flocks of birds resting on high voltage wires. For someone unacquainted with the gauge principle, this may seem like a dangerous act. But birds know better. There is no absolute sense in which the voltage of the wire is high. It is only high in comparison to the Earth.
Of the many fillets and random facts that we are fed in high school science classes, the story of the birds is perhaps the deepest. Most other ideas from our early physics lessons look increasingly antiquated as we gain a deeper understanding of the Universe. The concept of “force”, for example, is very 17th century. Yet the curious fact that the electrostatic potential does not matter, only the potential difference, blossoms into the gauge symmetry which underlies the Maxwell equations, the Standard Model and, in the guise of diffeomorphism invariance, general relativity.
Gauge symmetry is, in many ways, an odd foundation on which to build our best theories of physics. It is not a property of Nature, but rather a property of how we choose to describe Nature. Gauge symmetry is, at heart, a redundancy in our description of the world. Yet it is a redundancy that has enormous utility, and brings a subtlety and richness to those theories that enjoy it.
This course is about the quantum dynamics of gauge theories. It is here that the utility of gauge invariance is clearest. At the perturbative level, the redundancy allows us to make manifest the properties of quantum field theories, such as unitarity, locality, and Lorentz invariance, that we feel are vital for any fundamental theory of physics but which teeter on the verge of incompatibility. If we try to remove the redundancy by fixing some specific gauge, some of these properties will be brought into focus, while others will retreat into murk. By retaining the redundancy, we can flit between descriptions as is our want, keeping whichever property we most cherish in clear sight.
The purpose of this course is not so much to convince you that gauge theories are useful, but rather to explore their riches. Even at the classical level they have much to offer. Gauge theories are, like general relativity, founded in geometry. They are not associated only to the geometry of spacetime, but to a less intuitive and more general mathematical construct known as a fibre bundle. This brings something new to the table. While most interesting applications of general relativity are restricted to ripples of the curved, but topologically flat, spacetime in which we live, gauge fields are more supple: they can twist and wind in novel ways, bringing the subject of topology firmly into the realm of physics. This will be a dominant theme throughout these lectures. It
– 1 – is a theme that becomes particularly subtle when we include fermions in the mix, and see how they intertwine with the gauge fields.
However, the gauge theoretic fun really starts when we fully immerse ourselves in the quantum world. The vast majority of gauge theories are strongly coupled quantum field theories, where the usual perturbative techniques are insufficient to answer many questions of interest. Despite many decades of work, our understanding of this area remains rather primitive. Yet this is where the most interesting phenomena occur. In particle physics, the strong coupling dynamics of quantum field theory causes quarks and gluons to bind into protons, neutrons and other particles. In condensed matter physics, it causes electrons, which are indivisible particles, to fractionalise in high magnetic fields. There are even tantalising hints that such dynamics may be responsible for the emergence of space and time itself from more fundamental underlying degrees of freedom. The focus of these lectures is not on any particular phenomenon (although confinement in QCD will be something of a pre-occupation). Rather we will try to explain some of the ways in which we can make progress, primitive as it may be, in understanding gauge fields when interactions become strong, and quantum fluctuations wild.
– 2 – 1. Topics in Electromagnetism
We start these lectures by reviewing some topics in Maxwell theory. As we will see, there are some beautiful topological surprises hiding in electromagnetism that are not usually covered in our first undergraduate lectures. These topics will then follow us through these lectures as we explore other examples of gauge theories.
1.1 Magnetic Monopoles A magnetic monopole is an object which emits a radial magnetic field of the form gˆr Z B = ⇒ dS · B = g (1.1) 4πr2 Here g is called the magnetic charge.
We learn as undergraduates that magnetic monopoles don’t exist. First, and most importantly, they have never been observed. Second there’s a law of physics which insists that they can’t exist. This is the Maxwell equation
∇ · B = 0
Third, this particular Maxwell equation would appear to be non-negotiable. This is because it follows from the definition of the magnetic field in terms of the gauge potential
B = ∇ × A ⇒ ∇ · B = 0
Yet the gauge potential A is indispensable in theoretical physics. It is needed whenever we describe the quantum physics of particles moving in magnetic fields. Underlying this statement is the fact that the gauge potential is needed in the classical Hamiltonian treatment. Moreover, there are more subtle phenomena such as the Aharonov-Bohm effect which tell us that there is further, non-local information stored in the gauge potentials. (The Aharonov-Bohm effect was covered in the lectures on Applications of Quantum Mechanics.) All of this points to the fact that we would be wasting our time discussing magnetic monopoles.
Happily, there is a glorious loophole in all of these arguments, first discovered by Dirac, and magnetic monopoles play a crucial role in our understanding of the more subtle effects in gauge theories. The essence of this loophole is that there is an ambiguity in how we define the gauge potentials. In this section, we will see how we can exploit this.
– 3 – 1.1.1 Dirac Quantisation It turns out that not any magnetic charge g is compatible with quantum mechanics. Since this will be important, we will present several different arguments for the allowed values of g.
We start the simplest, and most physical of these arguments. For this we need to know a fact from quantum mechanics. Suppose that we take a particle which carries electric charge e. We adiabatically transport it along some closed path C in the back- ground of some gauge potential A(x, t). Then, upon returning to its initial starting position, the wavefunction of the particle picks up a phase I ψ → eieα/~ψ with α = A · dx (1.2) C There are different ways to see this, but the simplest is from the path integral approach to quantum mechanics, where the action for a point particle includes the term R dt ex˙ ·A; this directly gives the phase above.
The phase of the wavefunction is not an observable quantity in quantum mechanics. However, the phase in (1.2) is really a phase difference. We could, for example, place a particle in a superposition of two states, one of which stays still while the other travels around the loop C. The subsequent interference will depend on the phase eieα. Indeed, this is the essence of the Aharonov-Bohm effect.
Let’s now see what this has to do with magnetic monopoles. We place our electric particle, with charge e, in the background of a magnetic monopole with magnetic charge g. We keep the magnetic monopole fixed, and let the electric particle undergo some journey along a path C. We will ask only that the path C avoid the origin where the magnetic monopole is sitting. This is shown in the left-hand panel of the figure. Upon returning, the particle picks up a phase eieα/~ with I Z α = A · dx = dS · B C S R where, as shown in the figure, S is the area enclosed by C. Using the fact that S2 dS · B = g, if the surface S makes a solid angle Ω, this phase can be written as
Ωg α = 4π
– 4 – B C
C S
S’
Figure 1: Integrating over S... Figure 2: ...or over S0.
However, there’s an ambiguity in this computation. Instead of integrating over S, it is equally valid to calculate the phase by integrating over S0, shown in the right-hand panel of the figure. The solid angle formed by S0 is Ω0 = 4π − Ω. The phase is then given by (4π − Ω)g α0 = − 4π where the overall minus sign comes because the surface S0 has the opposite orientation to S. As we mentioned above, the phase shift that we get in these calculations is observable: we can’t tolerate different answers from different calculations. This means 0 that we must have eieα/~ = eieα /~. This gives the condition
eg = 2π~n with n ∈ Z (1.3)
This is the famous Dirac quantisation condition. The smallest such magnetic charge is also referred to as the quantum of flux,Φ0 = 2π~/e. Above we worked with a single particle of charge e. Obviously, the same argument holds for any other particle of charge e0. There are two possibilities. The first is that all particles carry charge that is an integer multiple of some smallest unit. In this case, it’s sufficient to impose the Dirac quantisation condition (1.3) where e is the smallest unit of charge. For example, in our world we should take e to be the electron charge. (You might want to insist that monopoles carry a larger magnetic charge so that they are consistent with quarks which have one third the electron charge. However, it turns out this isn’t necessary if the monopoles also carry colour magnetic charge.)
The second possibility is that the particles carry electric charges which are irrational multiples of each other. For example, there may be a particle with charge e and another √ particle with charge 2e. In this case, no magnetic monopoles are allowed.
– 5 – It’s sometimes said that the existence of a magnetic monopole would imply the quantisation of electric charges. This, however, has it slightly backwards. (It also misses the point that we have a beautiful explanation of the quantisation of charges from anomaly cancellation in the Standard Model; we will tell this story in Section 3.4.4.) Instead, the key distinction is the choice of Abelian gauge group. A U(1) gauge group has only integer electric charges and admits magnetic monopoles. In contrast, a gauge group R can have any irrational charges, but the price you pay is that there are no longer monopoles.
Above we looked at an electrically charged particle moving in the background of a magnetically charged particle. It is simple to generalise the discussion to particles that carry both electric and magnetic charges. These are called dyons. For two dyons, with charges (e1, g1) and (e2, g2), the generalisation of the Dirac quantisation condition requires
e1g2 − e2g1 ∈ 2π~Z (1.4)
This is sometimes called the Dirac-Zwanziger condition.
1.1.2 A Patchwork of Gauge Fields The discussion above shows how quantum mechanics constrains the allowed values of magnetic charge. It did not, however, address the main obstacle to constructing a magnetic monopole out of gauge fields A when the condition B = ∇ × A would seem to explicitly forbid such objects.
Let’s see how to do this. Our goal is to write down a configuration of gauge fields which give rise to the magnetic field (1.1) of a monopole which we will place at the origin. We will need to be careful about what we want such a gauge field to look like.
The first point is that we won’t insist that the gauge field is well defined at the origin. After all, the gauge fields arising from an electron are not well defined at the position of an electron and it would be churlish to require more from a monopole. This fact gives us our first bit of leeway, because now we need to write down gauge fields on R3\{0}, as opposed to R3 and the space with a point cut out enjoys some non-trivial topology that we will make use of.
Now consider the following gauge connection, written in spherical polar coordinates
g 1 − cos θ AN = (1.5) φ 4πr sin θ
– 6 – The resulting magnetic field is 1 ∂ 1 ∂ B = ∇ × A = (AN sin θ) ˆr − (rAN )θˆ r sin θ ∂θ φ r ∂r φ Substituting in (1.5) gives gˆr B = (1.6) 4πr2 In other words, this gauge field results in the magnetic monopole. But how is this possible? Didn’t we learn as undergraduates that if we can write B = ∇ × A then R dS · B = 0? How does the gauge potential (1.5) manage to avoid this conclusion?
The answer is that AN in (1.5) is actually a singular gauge connection. It’s not just singular at the origin, where we’ve agreed this is allowed, but it is singular along an entire half-line that extends from the origin to infinity. This is due to the 1/ sin θ term which diverges at θ = 0 and θ = π. However, the numerator 1 − cos θ has a zero when θ = 0 and the gauge connection is fine there. But the singularity along the half-line θ = π remains. The upshot is that this gauge connection is not acceptable along the line of the south pole, but is fine elsewhere. This is what the superscript N is there to remind us: this gauge connection is fine as long as we keep north.
Now consider a different gauge connection g 1 + cos θ AS = − (1.7) φ 4πr sin θ This again gives rise to the magnetic field (1.6). This time it is well behaved at θ = π, but singular at the north pole θ = 0. The superscript S is there to remind us that this connection is fine as long as we keep south.
At this point, we make use of the ambiguity in the gauge connection. We are going to take AN in the northern hemisphere and AS in the southern hemisphere. This is allowed because the two gauge potentials are the same up to a gauge transformation, A → A + ∇ω. Recalling the expression for ∇ω in spherical polars, we find that for N S θ 6= 0, π, we can indeed relate Aφ and Aφ by a gauge transformation, 1 gφ AN = AS + ∂ ω where ω = (1.8) φ φ r sin θ φ 2π However, there’s still a question remaining: is this gauge transformation allowed? The problem is that the function ω is not single valued: ω(φ = 2π) = ω(φ = 0) + g. Should this concern us?
– 7 – To answer this, we need to think more carefully about what we require from a gauge transformation. This is where the charged matter comes in. In quantum mechanics, the gauge transformation acts on the wavefunction of the particle as
ψ → eieω/~ψ
In quantum field theory, we have the same transformation but now with ψ interpreted as the field. We will not require that the gauge transformation ω is single-valued, but only that the wavefunction ψ is single-valued. This holds for the gauge transformation (1.8) provided that we have
eg = 2π~n with n ∈ Z This, of course, is the Dirac quantisation condition (1.3).
Mathematically, this is a construction of a topologically non-trivial U(1) bundle over the S2 surrounding the origin. In this context, the integer n is called the first Chern number.
1.1.3 Monopoles and Angular Momentum Here we provide yet another derivation of the Dirac quantisation condition, this time due to Saha. The key idea is that the quantisation of magnetic charge actually follows from the more familiar quantisation of angular momentum. The twist is that, in the presence of a magnetic monopole, angular momentum isn’t quite what you thought.
Let’s start with some simple classical mechanics. The equation of motion for a particle of mass m and charge e and position r, moving in a magnetic field B, is the familiar Lorentz force law dp = e r˙ × B dt with p = mr˙ the mechanical momentum. If you remember the Hamiltonian formalism for a particle in a magnetic field, you might recall that p is not the canonical momentum, a fact which is hiding in the background in what follows. Now let’s consider this equation in the background of a magnetic monopole, with g r B = 4π r3 The monopole has rotational symmetry so we would expect that the angular momen- tum, r × p, is conserved. Let’s check: d(r × p) = r˙ × p + r × p˙ = r × p˙ = er × (r˙ × B) dt
– 8 – eg eg r˙ r˙r = r × (r˙ × r) = − 4πr3 4π r r2 d eg = ˆr dt 4π We see that in the presence of a magnetic monopole, the naive angular momentum r×p is not conserved! However, we can easily write down a modified angular momentum that is conserved, namely1 eg L = r × p − ˆr 4π The extra term can be thought of as the angular momentum stored in E × B. The surprise is that the particle has angular momentum even if it doesn’t move! Before we move on, there’s a nice and quick corollary that we L can draw from this. The angular momentum vector L does not change with time. But the angle that the particle makes with this vector is eg L · ˆr = − = constant 4π θ This means that the particle moves on a cone, with axis L and angle cos θ = −eg/4πL. So far, our discussion has been classical. Now we invoke some Figure 3: simple quantum mechanics: the angular momentum should be quantised. In particular, 1 the angular momentum in the z-direction should be Lz ∈ 2 ~Z. Using the result above, we have eg 1 = n ⇒ eg = 2π n with n ∈ Z 4π 2~ ~ Once again, we find the Dirac quantisation condition.
On Bosons and Fermions There is an interesting factor of 2 buried in the discussion above. Consider a minimal Dirac monopole, with g = 2π~/e. In the background of this monopole, we will throw in a particle of spin S. The total angular momentum J is then 1 J = L + S = r × p + S − ˆr (1.9) 2 The key observation is that the final term, due to the monopole, shifts the total angular momentum by 1/2. That means, in the presence of a monopole, bosons have half-integer angular momentum while fermions have integer angular momentum! We’ll not need this curious fact for most of these lectures, but it will return in Section 8.6 when we discuss some surprising dualities in d = 2 + 1 quantum field theories. 1We also noticed this in the lecture notes on Classical Dynamics; see Section 4.3.2.
– 9 – 1.2 The Theta Term In relativistic notation, the Maxwell action for electromagnetism takes a wonderfully compact form, Z Z 1 4 1 µν 4 0 2 1 2 SMaxwell = d x − F Fµν = d x E − B (1.10) µ0 4 2 2µ0
Here Fµν = ∂µAν − ∂νAµ and Ei = cF0i and Fij = −ijkBk. One reason that the Maxwell action is so simple is that there is very little else we can write down that is both gauge invariant and Lorentz invariant. There are terms of order ∼ F 4 and higher, which give rise to non-linear electrodynamics, but these will always be suppressed by some high mass scale and are unimportant at low-energies.
There is, however, one other term that we can add to the Maxwell action that, at first glance, would seem to be of equal importance. A second glance then shows that it is completely unimportant and it’s on the third glance that we see the role it plays. This is the theta term.
We start by defining the dual tensor 1 ?F µν = µνρσF 2 ρσ ? µν F takes the same form as the original electromagnetic tensor Fµν, but with E/c → B and B → −E/c. The theta term is then given by
2 Z 2 Z θe 4 1 ? µν θe 4 Sθ = d x F Fµν = − d x E · B (1.11) 4π2~ 4 4π2~c where θ is a parameter. The morass of constants which accompany it ensure, among other things, that θ is dimensionless; we will have more to say about this in Section 1.2.4. Like the original Maxwell term, the theta term is quadratic in electric and magnetic fields. However, it is simple to check that the theta term can be written as a total derivative, 2 Z θe 4 µνρσ Sθ = d x ∂µ( Aν ∂ρAσ) (1.12) 8π2~ We say that the theta term is topological. It depends only on boundary information. Another way of saying this is that we don’t need to use the spacetime metric to define the theta term; we instead use the volume form µνρσ. The upshot is that the theta term does not change the equations of motion and, it would seem, can have little effect on the physics.
– 10 – As we will now see, this latter conclusion is a little rushed. There are a number of situations in which the theta term does lead to interesting physics. These situations often involve subtle interplay between quantum mechanics and topology.
Axion Electrodynamics We start by looking at situations where θ affects the dynamics classically. This occurs when θ is not constant, but instead varies in space and, possibly, time: θ = θ(x, t). In general, the action governing the electric and magnetic field is given by
Z 2 4 1 µν e ? µν S = d x − F Fµν + θ(x, t) F Fµν 4 16π2~ The equations of motion from this action read αc 1 ∂E α ∇ · E = − ∇θ · B and − + ∇ × B = θ˙B + ∇θ × E (1.13) π c2 ∂t πc where 1 e2 α = 4π0 ~c is the dimensionless fine structure constant. It takes the approximate value α ≈ 1/137. The deformed Maxwell equations are sometimes referred to as the equations of axion electrodynamics. The name is slightly misleading; an axion is what you get if you promote θ to a new dynamical field. Here we’re considering it to be some fixed back- ? µν ground. They are accompanied by the usual Bianchi identities, ∂µ F = 0, which remain unchanged ∂B ∇ · B = 0 and + ∇ × E = 0 ∂t The equations (1.13) carry much – although not all – of the new physics. The first tells us that in regions of space where θ varies, a magnetic field B acts like an electric charge density. The second tells us that the combination (θ˙B + ∇θ × E) acts like a current density.
1.2.1 The Topological Insulator There are a fascinating class of materials, known as topological insulators, whose dy- namics is characterised by the fact that θ = π. (We’ll see what’s special about the value θ = π in Section 1.2.4.) Examples include the Bismuth compounds Bi2Se3 and Bi2T e3.
– 11 – B E
θ=0 θ=0 B E
E
θ=π θ=π B
Figure 4: Applying a magnetic field Figure 5: Applying an electric field
Consider a topological insulator, with θ = π, filling (most of) the lower-half plane, z < −. We fill (most of) the upper-half plane, z > , with the vacuum which has
θ = 0. In the intermediate region z ∈ [−, ] we have ∂zθ 6= 0.
Let’s first shine a magnetic field Bz = B on this interface from below, as shown in the left-hand panel of the figure. The first equation in (1.13) tells us that a there is an effective accumulation of charge density, ρ = αc(∂zθ)B/π. The surface charge per unit area is given by Z σ = d2x ρ = αcB − This surface charge will give rise to an electric field outside the topological insulator. We learn that the boundary of a topological insulator has rather striking properties: it takes a magnetic field inside and generates an electric field outside!
Alternatively, we can turn on an electric field which lies tangential to the interface, say Ey = E. This is shown in the right-hand panel of the figure. The second equation in (1.13) tells us that, in the regime where ∂zθ 6= 0, the electric field acts as a surface current K, lying within the interface, perpendicular to E,
Kx = α0cEy (1.14)
This, in turn, then generates a magnetic field outside the topological insulator, per- pendicular to both E and K. This, again, is shown in the right-hand panel of the figure.
The creation of a two-dimensional current which lies perpendicular to an applied electric field is called the Hall effect. The coefficient of proportionality is known as the Hall conductivity and there is a long and beautiful story about how it takes certain
– 12 – very special values which are rational multiples of e2/2π~. (More details can be found in the lecture notes on the Quantum Hall Effect.) In the present example (1.14), the Hall conductivity is
1 e2 σxy = 2 2π~ This is usually abbreviated to say that the interface of the topological insulator has Hall conductivity 1/2.
The general phenomenon in which electric fields induce magnetic fields and vice versa goes by the name of the topological magneto-electric effect.
Continuity Conditions There’s a slightly different, but equivalent way of describing the physics above. It doesn’t tell us anything new, but it does make contact with the language we previously used to describe electrodynamics in materials2. We introduce the electric displacement
αcθ D = E + B 0 π
Comparing to the usual expression for D, we see that, in a topological insulator, a magnetic field B acts like polarisation. When θ varies, we have a varying polarisation, resulting in bound charge. This is what we saw in the topological insulator interface above. Similarly, we define the magnetising field
1 α H = B − θE µ0 πc
We see in a topological insulator, E acts like magnetisation. When θ varies, we get a varying magnetisation which results in bound currents.
With these definitions, the equations of axion electrodynamics (1.13) take the usual form of the Maxwell equations in matter
∂D ∇ · D = 0 and ∇ × H − = 0 ∂t Now we can use the standard arguments (involving Gaussian pillboxes and line in- tegrals) that tell us B perpendicular to a surface and E tangential to a surface are
2See Section 7 of the lectures on Electromagnetism.
– 13 – necessarily continuous. This means that if we introduce the normal vector to the sur- face nˆ, then
nˆ · ∆B = 0 and nˆ × ∆E = 0 (1.15)
For a usual dielectric, D perpendicular to a surface and H parallel to a surface are both discontinuous, with the discontinuity given by the surface charge and current respectively. Here, we’ve absorbed the θ-induced surface charges and currents into the definition of D and H. If there are no further, external charges we have
nˆ · ∆D = 0 and nˆ × ∆H = 0 (1.16)
It is simple the check that this condition reproduces the topological magneto-electric results that we described above.
1.2.2 A Mirage Monopole Let’s continue to explore the physics of interface between the vacuum (filling z > 0) and a topological insulator (filling z < 0). Here’s a fun game to play: take an electric charge q and place it in the vacuum at point x = (0, 0, d), a distance d above a topological insulator. What do the resulting electric and magnetic fields look like?
We can answer this using the continuity conditions described above, together with the idea of an image charge. (We met the image charge in the Electromagnetism lecture notes when discussing metals. One can also use the same tricks to describe the electric field in the presence of a dielectric, which is closer in spirit to the calculation here.) As always with the method of images, we need a flash of insight to write down an ansatz. (Or, equivalently, someone to tell us the answer). However, if we find a solution that works then general results about the uniqueness of boundary-value problems ensure that this is the unique condition.
In the present case, the answer is quite cute: we will see that if we sit in the vacuum z > 0, the electric and magnetic field lines are those due to the original particle at x = (0, 0, d), together with a mirror dyon sitting at x = (0, 0, −d) with electric and magnetic charges (q0, g). Meanwhile, if we sit in the topological insulator, z < 0, the electric and magnetic field lines are those due to the original particle, now superposed with those arising from a mirror dyon with charges (q0, −g), also sitting at x = (0, 0, d). Note that in both cases, the dyon is a mirage: it sits outside of the region we have access to. If we try to reach it by crossing the boundary, it switches the other side!
– 14 – i) ii) iii)
q θ=0 θ=0 q θ=0 (q+q’,−g) d
θ=π θ=π (q’,g) θ=π
Figure 6: i) An electric charge q placed near a topological insulator. ii) The resulting electric and magnetic field lines as seen outside. iii) The field lines as seen inside.
To see that this is the correct answer (and to compute q0 and g), we work with scalar potentials. It’s familiar to use the electrostatic equation ∇ × E = 0 to write E = −∇φ. The electric potential in the two regions is ! 1 q q0 φ = p + p z > 0 4π0 x2 + y2 + (z − d)2 x2 + y2 + (z + d)2 and 1 q + q0 φ = p z < 0 4π0 x2 + y2 + (z − d)2
Note that Ey = −∂yφ and Ex = −∂xφ are both continuous at the interface z = 0, as required by (1.15). In contrast, Ez will be discontinuous; we’ll look at this shortly. For the magnetic field, this is one of the few occasions where it’s useful to work with the magnetic scalar potential. This means that we use the fact that ∇ × B = 0 to write B = −∇Ω. (Recall the warning from earlier lectures: unlike the electric scalar φ, there is nothing fundamental about Ω; it is merely a useful computational trick). We then have 1 g Ω = z > 0 4π px2 + y2 + (z + d)2 and 1 g Ω = − z < 0 4π px2 + y2 + (z − d)2
Note that Bz = −∂zΩ is continuous across the plane z = 0, as required by the condition (1.15).
– 15 – Let’s now look at the (dis)continuity conditions (1.16). From the expressions above, we have q − q0 d Dz = 0Ez = − z=0+ z=0+ 4π (x2 + y2)3/2 and 0 q + q + αc0g d Dz = 0 Ez + αcBz = − z=0− z=0− 4π (x2 + y2)3/2
Equating these tells us that the magnetic charge on the image dyon is 2q0 g = (1.17) αc0 Similarly, the magnetic field tangent to the interface is 1 g d
Hx = Bx = 2 2 3/2 z=0+ µ0 z=0+ 4πµ0 (x + y ) and 1 α g − (q + q0)α/c d 0 Hx = Bx − Ex = − 2 2 3/2 z=0− µ0 c z=0− 4πµ0 (x + y ) which gives us (q + q0)α g = (1.18) 2c0 Happily we have found a solution both (1.15) and (1.16) can be satisfied across the boundary. Uniqueness means that this must be the correct solution. As we have seen, it involve mirage dyons sitting beyond our reach. From (1.17) and (1.18), we learn that the electric and magnetic charges carried by these dyons are given by 2 0 α 2α q = − 2 q and g = 2 q 4 + α (4 + α )c0 The monopoles and dyons that arise in this way are a mirages. Experimentally, we’re in the slightly unusual situation where we can see mirage monopoles, but not real monopoles!
1.2.3 The Witten Effect There is also an interesting story to tell about genuine magnetic monopoles. As we now show, the effect of the θ term is to endow the magnetic monopole with an electric charge. This is known as the Witten effect.
– 16 – It’s simplest to frame the set-up by first taking a θ non−zero magnetic monopole with magnetic charge g and placing B E it inside a vacuum, with θ = 0. We then surround this with a medium that has θ 6= 0 as shown in the figure. We θ=0 know what happen from our discussion above. When the magnetic field crosses the interface where θ changes, it will induce an electric charge. This charge follows from the first equation in (1.13). From inside the medium when θ 6= 0, it looks as if the monopole has electric charge Figure 7: θg e2 θg q = −αc0 = − (1.19) π 4π~ π Note, however, that this result is independent of the size of the interior region is where θ = 0. We could shrink this region down until it is infinitesimally small, and we still find that the monopole has charge q. The correct interpretation of this is that when θ 6= 0, a monopole is, in fact, a dyon: it carries electric charge (1.19).
When the monopole carries the minimum allowed magnetic charge, its electric charge is given by
2π eθ g = ~ ⇒ q = e 2π In particular, if we place a magnetic monopole inside a topological insulator, it turns into a dyon which carries half the charge of the electron.
Note that if we take θ = 2π then the electric charge of the monopole coincides with that of the electron; in this case, we can construct a neutral monopole by considering a bound state of the dyon + positron. However, when θ is not a mutliple of 2π, all monopoles necessarily carry electric charge.
One might wonder why we had to introduce the region with θ = 0 at all. What happens if we simply insist that we place the monopole directly in a system with θ 6= 0? You would again discover the Witten effect, but now you have to be careful about the boundary conditions you can place on the gauge field at the origin. We won’t describe this here. We will, however, give a slightly different derivation. Consider, once again first, placing a monopole in a medium with θ = 0. This time we will very slowly we increase θ. (Don’t ask me how...I don’t know! We just imagine it’s possible.) The second equation in (1.13) contains a θ˙ term which tells us that this will be accompanied
– 17 – by a time-varying electric field which lies parallel to B. At the end of this process, the final electric field will be −αc Z αcθ E = θdt˙ B = − B π π Once again, we learn that the monopole carries an electric charge given by (1.19).
We’ll see various other manifestations of the Witten effect as these lectures progress including, in Section 2.8.2, for monopoles in non-Abelian gauge theories.
1.2.4 Why θ is Periodic In classical axion electrodynamics, θ can take any value. Indeed, as we have seen, it is only spatial and temporal variations of θ that play a role. However, in the quantum theory θ is a periodic variable: it lies in the range θ ∈ [0, 2π) This is the real reason why θ was accompanied by that mess of other constants multi- plying the action; it is to ensure that the periodicity is something natural.
The periodicity of θ in electrodynamics is actually fairly subtle. It hinges on the topology of the U(1) gauge fields. We’ll see that, after imposing appropriate boundary conditions, Sθ can only take values of the form
Sθ = ~θN with N ∈ Z (1.20) This means that the theta angle contributes to the partition function as iS exp θ = eiNθ ~ The factor of i here is all important. In Minkowski signature, the action always sits with a factor of i. However, one of the special things about the theta term is that it has only a single time derivative in the integrand, a fact which can be traced to the appearance of the µνρσ anti-symmetric tensor. This means that the factor of i persists even in Euclidean signature. Since N is an integer, we see that the value of θ in the partition function is only important modulo 2π.
So our task is to show that, when evaluated on any field configuration, Sθ must take the form (1.20). The essence of the argument follows from the fact that the theta term is a total derivative (1.12), which shows us that the value of Sθ depends only on the boundary condition. To exploit the topology of lurking in the U(1) gauge field, we will work on a compact Euclidean spacetime which we take to be T4. We’ll take each of the circles in the torus to have radii R.
– 18 – We’ll make life even easier for ourselves by restricting to the special case E = (0, 0,E) and B = (0, 0,B) with E and B constant. The integral that we’re interested in is Z Z Z d4x EB = dx0dx3E dx1dx2 B (1.21) T4 T2 T2 This still looks like it can take any value we like. But we need to recall that E and
B are not the fundamental fields; these are the gauge fields Aµ. And these must be well defined on the underlying torus. As we’ll now show, this puts restrictions on the allowed values of E and B.
First, we need the following result: when a direction of space, say x1, is periodic with radius R, then the constant part of the corresponding gauge field (also known as the zero mode) also becomes periodic with radius
A ≡ A + ~ (1.22) 1 1 eR This arises because the presence of a circle allows us to do something interesting with gauge transformations A1 → A1 + ∂1ω. As in Section 1.1.2, we do not insist that ω(x) is single valued. Instead, we require only that eieω/~ is single-valued, since this is what acts on the wavefunction. This allows us to perform gauge transformations that wind around the circle, such as x1 ω = ~ eR These are sometimes called large gauge transformations, a name which reflects the fact that they cannot be continuously deformed to the identity. Under such a gauge transformation, we see that
A → A + ∂ ω = A + ~ 1 1 1 1 eR But field configurations that are related by a gauge transformation are to be viewed as physically equivalent. We learn that the constant part of the gauge field is periodically identified as (1.22) as claimed.
Now let’s see how this fact restricts the allowed values of the integral (1.21). The magnetic field is written as
B = ∂1A2 − ∂2A1
We can work in a gauge where A1 = 0, so that B = ∂1A2. If we want a uniform, 1 constant B then we need to write A2 = Bx . This isn’t single valued. However, that
– 19 – needn’t be a problem because, as we’ve seen above, A2 is actually a periodic variable with periodicity ~/eR. This means that we’re perfectly at liberty to write A2 = Bx1, but only if this has the correct period (1.22). This holds provided Z ~n 1 2 2π~n B = 2 with n ∈ Z ⇒ dx dx B = (1.23) 2πeR T2 e Note that this is the same as the condition on R dS · B = g that we derived from the Dirac quantisation condition (1.3). Indeed, the derivation above relies on the same kind of arguments that we used when discussing magnetic monopoles.
We can now apply exactly the same argument to the electric field, E = ∂ A − ∂ A c 0 3 3 0
0 Let’s work in a gauge with A0 = 0, so that E/c = ∂0A3. We can write A3 = (E/c)x , 0 2 which is compatible with the periodicity of A3 only when E/c = ~n /2πeR for some n0 ∈ Z. We find Z 2πc n0 dx0dx3 E = ~ (1.24) T2 e Before we go on, let me point out something that may be confusing. You may have thought that the relevant equation for E is Gauss’ law which, given the quantisation of charge, states that R dS · E = en0 for some n0 ∈ Z. But that’s not what we computed in (1.24) because E = (0, 0,E) lies parallel to the side of the torus, not perpendicular.
Instead, both (1.23) and (1.24) are best thought of as integrating the 2-form Fµν over the appropriate T2. For the magnetic field, this coincides with R dS·B which measures the magnetic charge enclosed in the manifold. It does not, however, coincide with R dS · E which measures the electric charge.
Armed with (1.23) and (1.24), we see that, at least for this specific example,
Z 2 2 4 4π ~ cN 0 d x E · B = 2 ⇒ Sθ = ~θN with N = nn ∈ Z T4 e which is our promised result (1.20)
The above explanation was rather laboured. It’s pretty straightforward to generalise it to non-constant E and B fields. If you’re mathematically inclined, it is the statement that the second Chern number of a U(1) bundle is integer valued and, as we have seen above, is actually equal to the product of two first Chern numbers. Finally note that,
– 20 – although we took Euclidean spacetime to be a torus T 4, the end result does not depend on the volume of the torus which is set by R. Nonetheless, the introduction of the torus was crucial in our argument: we needed the circles of T 4 to exploit the fact that ∼ Π1(U(1)) = Z. We will see another derivation of this when we come to discuss the anomaly in Section 3.3.1.
1.2.5 Parity, Time-Reversal and θ = π The theta term does not preserve the same symmetries as the Maxwell term. It is, of course, gauge invariant and Lorentz invariant. But it is not invariant under certain discrete symmetries.
The discrete symmetries of interest are parity P and time reversal invariance T . Parity acts by flipping all directions of space
P : x 7→ −x (1.25)
(At least this is true in any odd number of spatial dimensions; in an even number of spatial dimensions, this is simply a rotation.) Meanwhile, as the name suggests, time reversal flips the direction of time
T : t → −t
We would like to understand how these act on the electric and magnetic fields. This follows from looking at the Lorentz force law,
mx¨ = e (E + x˙ · B)
This equation is invariant under neither parity, nor time reversal. However it can be made invariant if we simultaneously act on both E and B as
P : E(x, t) 7→ −E(−x, t) and P : B(x, t) 7→ B(−x, t) and
T : E(x, t) 7→ E(x, −t) and T : B(x, t) 7→ −B(x, −t)
We say that E is odd under parity and even under time reversal; B is even under parity and odd under time reversal.
As an aside, note that a high energy theorist usually refers to CP rather than T . Here C is charge conjugation which acts as C : E 7→ −E and C : B 7→ −B, with the consequence that CP : E 7→ E and CP : B 7→ −B, rather like T . However, there is a difference between the two symmetries: CP is unitary, while T is anti-unitary.
– 21 – This means that, in general, the theta term breaks both parity and time-reversal invariance. We say that θ 7→ −θ under P and T . There are two exceptions. One of these is obvious: when θ = 0, the theory is invariant under these discrete symmetries. However, when θ = π the theory is also invariant. This is because, as we have seen θ is periodic so θ = π is the same as θ = −π.
This observation also gives some hint as to why the topological insulator has θ = π. These are materials which are defined to be time-reversal invariant. As we have seen, there are two possibilities for the dynamics of such materials. (In fancy language, they are said to have a Z2 classification.) Most materials are boring and have θ = 0. But some materials have a band structure which is twisted in a particular way. This results in θ = π.
1.3 Further Reading Anyone who has spent even the briefest time looking into the history of physics will have learned one thing: it’s complicated. It’s vastly more complicated than the air-brushed version we’re fed as students. A fairly decent summary is: everyone was confused. Breakthroughs are made by accident, or for the wrong reason, or lie dormant until long after they are rediscovered by someone else. Mis-steps later turn out to be brilliant moves. Ideas held sacred by one generation are viewed as distractions by the next.
Things become even harder when attempting to assign attribution. The scientific literature alone does not tell the full story. It misses the conference coffee conversations, the petty rivalries, the manoeuvering for glory. It misses the fact that, for most of the time, everyone was confused. Gell-Mann, who perhaps did more than anyone to lay the groundwork for particle physics and quantum field theory, captures this in an uncharacteristically rambling manner [47]
My whole life was like that and I think many people’s lives are like that. If you generalised my errors and the things that I got wrong and the things that I didn’t follow up properly and the things that I saw and I did not believe in and the things I saw and did not write up. Almost always there was some error of level. That is I knew that a certain thing was right and felt that it was right and it was contradictory to what I was doing and I could not get used to the idea that some things you have to answer late: you just put them off, but you answer some of the things now and...
These lectures are concerned with the theoretical structure of gauge theories. It is a subject whose history is inextricably bound with experimental discoveries in particle
– 22 – physics and the development of the Standard Model. Our understanding of gauge theory took place slowly, over many decades, and involved many hundreds, if not thousands, of physicists.
Each chapter of these lectures ends with a section in which I offer a broad brush account of this history. It is flawed. In places, given the choice between accuracy or a good story, I have erred towards a good story. I have, however, included references to the original literature. More usefully for students, I have also included references to reviews where a number of topics are treated in much greater detail.
Gauge Symmetry These lectures are about gauge symmetry. Although the use of a gauge choice was commonplace among classical physicists, it was viewed as a trick for finding solutions to the equations of electromagnetism. It took a surprisingly long time for physicists to appreciate the idea of gauge invariance as as an important principle in its own right. Fock was the first to realise, in 1926, that the action of gauge symmetry is intricately tied to the phase of the wavefunction in quantum mechanics [62]. The credit for viewing gauge symmetry (or “eichinvarianz”) as a desirable property of our theories of Nature is usually attributed to Weyl [203] although, as with many stories in the history of physics, his motivation now seems somewhat misplaced as he tried to prematurely develop a unified theory of gravity and electromagnetism [204]. (His approach survives in the Weyl invariance enjoyed by the worldsheet in string theory.) More historical background on the long road to the gauge principle can be found in [116, 150].
Monopoles Debrett’s style guide for physics papers includes the golden rule: one idea per paper. Many authors flaunt this, but few flaunt it in as spectacular a fashion as Dirac. His 1931 paper “Quantised Singularities in the Electromagnetic Field” [44] is primarily about the possibility of magnetic monopoles obeying the quantisation condition that now bears his name. But the paper starts by reflecting on the negative energy states predicted by the Dirac equation which, he is convinced, cannot be protons as he originally suggested. Instead, he argues, the negative energy states must correspond to novel particles, equal in mass to the electron but with positive charge.
It seems that Dirac held anti-matter and magnetic monopoles, both predictions made within a few pages of each other, on similar footing [55]. He returned to the subject of monopoles only once, in 1948, elaborating on the concept of the “Dirac string” [45]. But the spectacular experimental discovery of anti-matter in 1932, followed by a long,
– 23 – fruitless wait in the search for monopoles, left Dirac disillusioned. Fifty years after his first paper, and seemingly unconvinced by theoretical arguments (like “monopoles are heavy”), he wrote in a letter to Salam [46],
“I am inclined now to believe that monopoles do not exist. So many years have gone by without any encouragement from the experimental side.”
In the intervening years, the experimental situation has not improved. But monopoles now sit at the heart of our understanding of quantum field theory. This story took some decades to unfold and only really came to fruition with the discovery, by ’t Hooft [99] and Polyakov [158], of solitons carrying magnetic charge in non-Abelian gauge theories; this will be described in Section 2.8.
As we saw in these lectures, the angular momentum of a particle-monopole pair has an extra anomalous term. This fact was noted long ago by Poincar´e[155] in the charmingly titled short story “Remarques sur une exp´eriencede M. Birkeland”. In 1936, the Indian physicist Meghnad Saha showed that the quantum version of this observation provides a re-derivation of the Dirac quantisation [171]. The paper is an ambitious, but flawed, attempt to explain the mass of the neutron in terms of a monopole-anti-monopole bound state, and the argument for which it is now remembered is dealt with in a couple of brief sentences. The angular momentum derivation was later rediscovered by H. Wilson [213], prompting an “I did it first” response from Saha [172]. The implication for the spin-statistics of monopoles was pointed out in [92] and [112]; a more modern take can be found in [137].
The extension of the Dirac quantisation condition to dyons was made by Zwanziger in 1968 [234], while the idea of patching gauge fields is due to Wu and Yang [231]
There are many good reviews on magnetic monopoles. More details on the material discussed in this section can be found in the review by Preskill [164] or the book by Shnir [183]. More references are given in the next section when we discuss ’t Hooft-Polyakov monopoles.
Topological Insulators The story of topological insulators started in the study of band structures, and the ways in which they can twist. The first examples are the TKNN invariant for the integer quantum Hall effect [194], and the work of Haldane on Chern insulators [88]. Both of these were described in the lectures on the quantum Hall effect [193]. For this work, Thouless and Haldane were awarded the 2016 Nobel prize.
– 24 – The possibility that a topologically twisted band structure could exist in 3d materials was realised only in 2006. In July of that year, three groups posted papers on the arXiv [139, 170, 66], and in November of that year, Fu and Kane predicted the existence of this phase in a number of real materials [67]. This was quickly confirmed in experiments [109].
The effective field theory of a topological insulator, in terms of electrodynamics with θ = π, was introduced by Qi, Hughes and Zhang in [165]. This took the subject away from its lattice underpinnings, and into the realm of quantum field theory. Indeed, Wilczek had already discussed a number of properties of electrodynamics in the presence of a theta angle [212], including the Witten effect [216]. The existence of the mirror monopole was shown in [166], and a number of further related effects were discussed in [54].
More details of topological insulators can be found in the reviews [91, 167, 17].
– 25 – 2. Yang-Mills Theory
Pure electromagnetism is a free theory of a massless spin 1 field. We can ask: is it possible to construct an interacting theory of spin 1 fields? The answer is yes, and the resulting theory is known as Yang-Mills. The purpose of this section is to introduce this theory and some of its properties.
As we will see, Yang-Mills is an astonishingly rich and subtle theory. It is built upon the mathematical structure of Lie groups. These Lie groups have interesting topology which ensures that, even at the classical (or, perhaps more honestly, semi-classical) level, Yang-Mills exhibits an unusual intricacy. We will describe these features in Sections 2.2 and 2.3 where we introduce the theta angle and instantons.
However, the fun really gets going when we fully embrace ~ and appreciate that Yang-Mills is a strongly coupled quantum field theory, whose low-energy dynamics looks nothing at all like the classical theory. Our understanding of quantum Yang- Mills is far from complete, but we will describe some of the key ideas from Section 2.4 onwards.
A common theme in physics is that Nature enjoys the rich and subtle: the most beautiful theories tend to be the most relevant. Yang-Mills is no exception. It is the theory that underlies the Standard Model of particle physics, describing both the weak and the strong forces. Much of our focus, and much of the terminology, in this section has its roots in QCD, the theory of the strong force.
For most of this section we will be content to study pure Yang-Mills, without any additional matter. Only in Sections 2.7 and 2.8 will we start to explore how coupling matter fields to the theory changes its dynamics. We’ll then continue our study of the Yang-Mills coupled to matter in Section 3 where we discuss anomalies, and in Section 5 where we discuss chiral symmetry breaking.
2.1 Introducing Yang-Mills
Yang-Mills theory rests on the idea of a Lie group. The basics of Lie groups and Lie algebras were covered in the Part 3 lectures on Symmetries and Particle Physics. We start by introducing our conventions. A compact Lie group G has an underlying Lie algebra g, whose generators T a satisfy
[T a,T b] = if abcT c (2.1)
– 26 – Here a, b, c = 1,..., dim G and f abc are the fully anti-symmetric structure constants. The factor of i on the right-hand side is taken to ensure that the generators are Her- mitian: (T a)† = T a.
Much of our discussion will hold for general compact, simple Lie group G. Recall that there is a finite classification of these objects. The possible options for the group G, together with the dimension of G and the dimension of the fundamental (or minimal) representation F , are given by
G dim G dim F SU(N) N 2 − 1 N 1 SO(N) 2 N(N − 1) N Sp(N) N(2N + 1) 2N
E6 78 27
E7 133 56
E8 248 248
F4 52 6
G2 14 7 where we’re using the convention Sp(1) = SU(2). (Other authors sometimes write Sp(2n), or even USp(2n) to refer to what we’ve called Sp(N), preferring the argument to refer to the dimension of F rather than the rank of the Lie algebra g.)
Although we will present results for general G, when we want to specialise, or give examples, we will frequently turn to G = SU(N). We will also consider G = U(1), in which case Yang-Mills theory reduces to Maxwell theory.
We will need to normalise our Lie algebra generators. We require that the generators in the fundamental (i.e. minimal) representation F satisfy
1 tr T aT b = δab (2.2) 2 In what follows, we use T a to refer to the fundamental representation, and will re- fer to generators in other representations R as T a(R). Note that, having fixed the normalisation (2.2) in the fundamental representation, other T a(R) will have different normalisations. We will discuss this in more detail in Section 2.5 where we’ll extract some physics from the relevant group theory.
– 27 – a For each element of the algebra, we introduce a gauge field Aµ. These are then packaged into the Lie-algebra valued gauge potential
a a Aµ = AµT (2.3) This is a rather abstract object, taking values in a Lie algebra. For G = SU(N), a more down to earth perspective is to view Aµ simply as a traceless N × N Hermitian matrix.
a We will refer to the fields Aµ collectively as gluons, in deference to the fact that the strong nuclear force is described by G = SU(3) Yang-Mills theory. From the gauge potential, we construct the Lie-algebra valued field strength
Fµν = ∂µAν − ∂νAµ − i[Aµ,Aν] (2.4)
a a Since this is valued in the Lie algebra, we could also expand it as Fµν = FµνT . In more mathematical terminology, Aµ is called a connection and the field strength Fµν is referred to as the curvature. We’ll see what exactly the connection connects in Section 2.1.3.
Although we won’t look at dynamical matter fields until later in this section, it will prove useful to briefly introduce relevant conventions here. Matter fields live in some representation R of the gauge group G. This means that they sit in some vector ψ of dimension dim R. Much of our focus will be on matter fields in the fundamental representation of G = SU(N), in which case ψ is an N-dimensional complex vector. The matter fields couple to the gauge fields through a covariant derivative, defined by
Dµψ = ∂µψ − iAµψ (2.5) However, the algebra g has many different representations R. For each such repre- sentation, we have generators T a(R) which we can can think of as square matrices of dimension dim R. Dressed with all their indices, they take the form
a i T (R) j i, j = 1,..., dim R ; a = 1 ..., dim G For each of these representations, we can package the gauge fields into a Lie alge- a a i bra valued object Aµ T (R) j We can then couple matter in the representation R by generalising the covariant derivative from the fundamental representation to
i i a a i j Dµψ = ∂µψ − iAµ T (R) jψ i, j = 1,..., dim R (2.6) a Each of these representations offers a different ways of packaging the fields Aµ into Lie-algebra valued objects Aµ. As we mentioned above, we will mostly focus on G = SU(N): in this case, we usually take T a in the fundamental representation, in which case Aµ is simply an N × N Hermitian matrix.
– 28 – Aside from the fundamental, there is one other representation that will frequently arise: this is the adjoint, for which dim R = dim G. We could think of these fields as forming a vector φa, with a = 1,..., dim G, and then use the form of the covariant derivative (2.6). In fact, it turns out to be more useful to package adjoint valued matter fields into a Lie-algebra valued object, φ = φaT a. In this language the covariant derivative can be written as
Dµφ = ∂µφ − i[Aµ, φ] (2.7) The field strength can be constructed from the commutator of covariant derivatives. It’s not hard to check that
[Dµ, Dν]ψ = −iFµνψ The same kind of calculation shows that if φ is in the adjoint representation,
[Dµ, Dν]φ = −i[Fµν, φ] where the right-hand-side is to be thought of as the action of Fµν on fields in the adjoint representation. More generally, we write [Dµ, Dν] = −iFµν, with the understanding that the right-hand-side acts on fields according to their representation.
2.1.1 The Action The dynamics of Yang-Mills is determined by an action principle. We work in natural units, with ~ = c = 1 and take the action 1 Z S = − d4x tr F µνF (2.8) YM 2g2 µν where g2 is the Yang-Mills coupling. (It’s often called the “coupling constant” but, as we will see in Section 2.4, there is nothing constant about it so I will try to refrain from this language).
If we compare to the Maxwell action (1.10), we see that there is a factor of 1/2 outside the action, rather than a factor of 1/4; this is accounted for by the further factor of 1/2 that appears in the normalisation of the trace (2.2). There is also the extra factor of 1/g2 that we will explain below.
The classical equations of motion are derived by minimizing the action with respect a to each gauge field Aµ. It is a simple exercise to check that they are given by µν DµF = 0 (2.9) where, because Fµν is Lie-algebra valued, the definition (2.7) of the covariant derivative is the appropriate one.
– 29 – There is also a Bianchi identity that follows from the definition of Fµν in terms of the gauge field. This is best expressed by first introducing the dual field strength
1 ?F µν = µνρσF 2 ρσ and noting that this obeys the identity
? µν Dµ F = 0 (2.10)
The equations (2.9) and (2.10) are the non-Abelian generalisations of the Maxwell equations. They differ only in commutator terms, both those inside Dµ and those inside Fµν. Even in the classical theory, this is a big difference as the resulting equations are non-linear. This means that the Yang-Mills fields interact with themselves.
Note that we need to introduce the gauge potentials Aµ in order to write down the Yang-Mills equations of motion. This is in contrast to Maxwell theory where the Maxwell equations can be expressed purely in terms of E and B and we introduce gauge fields, at least classically, merely as a device to solve them.
A Rescaling Usually in quantum field theory, the coupling constants multiply the interaction terms in the Lagrangian; these are terms which are higher order than quadratic, leading to non-linear terms in the equations of motion.
However, in the Yang-Mills action, all terms appear with fixed coefficients determined by the definition of the field strength (2.4). Instead, we’ve chosen to write the (inverse) coupling as multiplying the entire action. This difference can be accounted for by a trivial rescaling. We define
1 A˜ = A and F˜ = ∂ A˜ − ∂ A˜ − ig[A˜ , A˜ ] µ g µ µν µ ν ν µ µ ν
Then, in terms of this rescaled field, the Yang-Mills action is
1 Z 1 Z S = − d4x tr F µνF = − d4x tr F˜µνF˜ YM 2g2 µν 2 µν
In the second version of the action, the coupling constant is buried inside the definition of the field strength, where it multiplies the non-linear terms in the equation of motion as expected.
– 30 – In what follows, we will use the normalisation (2.8). This is the more useful choice in the quantum theory, where SYM sits exponentiated in the partition function. One way to see this is to note that g2 sits in the same place as ~ in the partition function. This already suggests that g2 → 0 will be a classical limit. Heuristically you should think that, for g2 small, one pays a large price for field configurations that do not minimize the action; in this way, the path integral is dominated by the classical configurations. In contrast, when g2 → ∞, the Yang-Mills action disappears completely. This is the strong coupling regime, where all field configurations are unsuppressed and contribute equally to the path integral.
Based on this, you might think that we can just set g2 to be small and a classical analysis of the equations of motion (2.9) and (2.10) will be a good starting point to understand the quantum theory. As we will see in Section 2.4, it turns out that this is not an option; instead, the theory is much more subtle and interesting.
2.1.2 Gauge Symmetry The action (2.8) has a very large symmetry group. These come from spacetime- dependent functions of the Lie group G,
Ω(x) ∈ G
The set of all such transformations is known as the gauge group. Sometimes we will be sloppy, and refer to the Lie group G as the gauge group, but strictly speaking it is the much bigger group of maps from spacetime into G. The action on the gauge field is
−1 −1 Aµ → Ω(x)AµΩ (x) + iΩ(x)∂µΩ (x) (2.11)
A short calculation shows that this induces the action on the field strength
−1 Fµν → Ω(x) Fµν Ω (x) (2.12)
The Yang-Mills action is then invariant by virtues of the trace in (2.8).
In the case that G = U(1), the transformations above reduce to the familiar gauge transformations of electromagnetism. In this case we can write Ω = eiω and the trans- formation of the gauge field becomes Aµ → Aµ + ∂µω. Gauge symmetry is poorly named. It is not a symmetry of the system in the sense that it takes one physical state to a different physical state. Instead, it is a redundancy in our description of the system. This is familiar from electromagnetism and remains true in Yang-Mills theory.
– 31 – There are a number of ways to see why we should interpret the gauge symmetry as a redundancy of the system. Roughly speaking, all of them boil down to the statement that the theory fails to make sense unless we identify states related by gauge transfor- mations. This can be see classically where the equations of motion (2.9) and (2.10) do not uniquely specify the evolution of Aµ, but only its equivalence class subject to the identification (2.11). In the quantum theory, the gauge symmetry is needed to remove various pathologies which arise, such as the presence of negative norm states in the Hilbert space. A more precise explanation for the redundancy comes from appreciating that Yang-Mills theory is a constrained system which should be analysed as such using the technology of Dirac brackets; we will not do this here.
Our best theories of Nature are electromagnetism, Yang-Mills and general relativity. Each is based on an underlying gauge symmetry. Indeed, the idea of gauge symmetry is clearly something deep. Yet it is, at heart, nothing more than an ambiguity in the language we chose to present the physics? Why should Nature revel in such ambiguity?
There are two reasons why it’s advantageous to describe Nature in terms of a redun- dant set of variables. First, although gauge symmetry means that our presentation of the physics is redundant, it appears to be by far the most concise presentation. For example, we will shortly describe the gauge invariant observables of Yang-Mills theory; they are called “Wilson lines” and can be derived from the gauge potentials Aµ. Yet presenting a configuration of the Yang-Mills field in terms of a complete set of Wilson lines would require vastly more information specifying the four matrix-valued fields Aµ.
The second reason is that the redundant gauge field allow us to describe the dynamics of the theory in a way that makes manifest various properties of the theory that we hold dear, such as Lorentz invariance and locality and, in the quantum theory, unitarity. This is true even in Maxwell theory: the photon has two polarisation states. Yet try writing down a field which describes the photon that has only two indices and which transforms nicely under the SO(3, 1) Lorentz group; its not possible. Instead we introduce a field with four indices – Aµ – and then use the gauge symmetry to kill two of the resulting states. The same kind of arguments also apply to the Yang-Mills field, where there are now two physical degrees of freedom associated to each generator T a.
The redundancy inherent in the gauge symmetry means that only gauge independent quantities should be considered physical. These are the things that do not depend on our underlying choice of description. In general relativity, we would call such objects “coordinate independent”, and it’s not a bad metaphor to have in mind for Yang-Mills.
It’s worth pointing out that in Yang-Mills theory, the “electric field” Ei = F0i and the
– 32 – 1 “magnetic field” Bi = − 2 ijkFjk are not gauge invariant as they transform as (2.12). This, of course, is in contrast to electromagnetism where electric and magnetic fields are physical objects. Instead, if we want to construct gauge invariant quantities we should work with traces such as tr FµνFρσ or the Wilson lines that we will describe below. (Note that, for simple gauge groups such as SU(N), the trace of a single field strength vanishes: tr Fµν = 0.) Before we proceed, it’s useful to think about infinitesimal gauge transformations. To leading order, gauge transformations which are everywhere close to the identity can be written as
Ω(x) ≈ 1 + iωa(x)T a + ...
The infinitesimal change of the gauge field from (2.11) becomes
δAµ = ∂µω − i[Aµ, ω] ≡ Dµω where ω = ωaT a. Similarly, the infinitesimal change of the field strength is
δFµν = i[ω, Fµν] Importantly, however, there are classes of gauge transformations which cannot be de- formed so that they are everywhere close to the identity. We will study these in Section 2.2.
2.1.3 Wilson Lines and Wilson Loops It is a maxim in physics, one that leads to much rapture, that “gravity is geometry”. But the same is equally true of all the forces of Nature since gauge theory is rooted in geometry. In the language of mathematics, gauge theory is an example of a fibre bundle, and the gauge field Aµ is referred to as a connection. We met the idea of connections in general relativity. There, the Levi-Civita connec- ρ tion Γµν tells us how to parallel transport vectors around a manifold. The Yang-Mills connection Aµ plays the same role, but now for the appropriate “electric charge”. First we need to explain what this appropriate charge is.
Throughout this section, we will consider a fixed background Yang-Mills fields Aµ(x). In this background, we place a test particle. The test particle is going to be under our control: we’re holding it and we get to choose how it moves and where it goes. But the test particle will carry an internal degree of freedom – this is the “electric charge” – and the evolution of this internal degree of freedom is determined by the background Yang-Mills field.
– 33 – This internal degree of freedom sits in some representation R of the Lie group G. To start with, we will think of the particle as carrying a complex vector, w, of fixed length,
† wi i = 1,... dim R such that w w = constant In analogy with QCD, we will refer to the electrically charged particles as quarks, and to wi as the colour degree of freedom. The wi is sometimes called chromoelectric charge.
As the particle moves around the manifold, the connection Aµ (or, to dress it with all i a a i its indices, (Aµ)j = Aµ(T )j) tells this vector w how to rotate. In Maxwell theory, this “parallel transport” is nothing more than the Aharonov-Bohm effect that we discussed in Section 1.1. Upon being transported around a closed loop C, a particle returns H with a phase given by exp i C A . We’d like to write down the generalisation of this formula for non-Abelian gauge theory. For a particle moving with worldline xµ(τ), the rotation of the internal vector w is governed by the parallel transport equation dw dxµ i = A (x)w (2.13) dτ dτ µ † The factor of i ensues that, with Aµ Hermitian, the length of the vector w w remains µ µ constant. Suppose that the particle moves along a curve C, starting at xi = x (τi) and µ µ finishing at xf = x (τf ). Then the rotation of the vector depends on both the starting and end points, as well as the path between them,
w(τf ) = U[xi, xf ; C]w(τi) where Z τf dxµ Z xf U[xi, xf ; C] = P exp i dτ Aµ(x(τ)) = P exp i A (2.14) τi dτ xi where P stands for path ordering. It means that when expanding the exponential, we order the matrices Aµ(x(τ)) so that those at earlier times are placed to the left. (We met this notation previously in the lectures on quantum field theory when discussing
Dyson’s formula and you can find more explanation there.) The object U[xi, xf ; C] is referred to as the Wilson line. Under a gauge transformation Ω(x), it changes as
† U[xi, xf ; C] → Ω(xi)U[xi, xf ; C]Ω (xf ) If we take the particle on a closed path C, this object tells us how the vector w differs from its starting value. In mathematics, this notion is called holonomy. In this case, we can form a gauge invariant object known as the Wilson loop, I W [C] = tr P exp i A (2.15)
– 34 – The Wilson loop W [C] depends on the representation R of the gauge field, and its value along the path C. This will play an important role in Section 2.5 when we describe ways to test for confinement.
Quantising the Colour Degree of Freedom Above we viewed the colour degree of freedom as a vector w. This is a very classical perspective. It is better to think of each quark as carrying a finite dimensional Hilbert space Hquark, of dimension dim Hquark = dim R. Here we will explain how to accomplish this. This will provide yet another perspec- tive on the Wilson loop. What follows also offers an opportunity to explain a basic aspect of quantum mechanics which is often overlooked when we first meet the subject. The question is the following: what classical system gives rise to a finite dimensional quantum Hilbert space? Even the simplest classical systems that we meet as under- graduates, such as the harmonic oscillator, give rise to an infinite dimensional Hilbert space. Instead, the much simpler finite dimensional systems, such as the spin of the electron, are typically introduced as having no classical analog. Here we’ll see that there is an underlying classical system and that it’s rather simple.
We’ll stick with a G = SU(N) gauge theory. We consider a single test particle and attach to it a complex vector w, but this time we will insist that w has dimension N. We will restrict its length to be
w†w = κ (2.16)
The action which reproduces the equation of motion (2.13) is Z dw S = dτ iw† + λ(w†w − κ) + w†A(x(τ))w (2.17) w dt where λ is a Lagrange multiplier to impose the constraint (2.16), and where A = µ Aµ dx /dτ is to be thought of as a fixed background gauge field Aµ(x) which varies in time in some fixed way as the particle moves along the path xµ(τ).
Perhaps surprisingly, the action (2.17) has a U(1) worldline gauge symmetry. This acts as
w → eiαw and λ → λ +α ˙ for any α(τ). Physically, this gauge symmetry means that we should identify vectors iα which differ only by a phase: wγ ∼ e wγ. Since we already have the constraint (2.16), this means that the vectors parameterise the projective space S2N−1/U(1) ∼= CPN−1.
– 35 – Importantly, our action is first order in time derivatives rather than second order. This means that the momentum conjugate to w is iw† and, correspondingly, CPN−1 is the phase space of the system rather than the configuration space. This, it turns out, is the key to getting a finite dimensional Hilbert space: you should quantise a system with a finite volume phase space. Indeed, this fits nicely with the old-fashioned Bohr-Sommerfeld view of quantisation in which one takes the phase space and assigns a quantum state to each region of extent ∼ ~. A finite volume then gives a finite number of states.
We can see this in a more straightforward way doing canonical quantisation. The unconstrained variables wi obey the commutation relations
† [wi, wj ] = δij (2.18)
But we recognise these as the commutation relations of creation and annihilation op- erators. We define a “ground state” |0i such that wi|0i = 0 for all i = 1,...,N.A general state in the Hilbert space then takes the form
|i , . . . , i i = w† . . . w† |0i (2.19) 1 n i1 in
However, we also need to take into account the constraint (2.16). Note that this now arises as the equation of motion for the worldine gauge field λ. As such, it is analogous to Gauss’ law when quantising Maxwell theory and we should impose it as a constraint that defines the physical Hilbert space. There is an ordering ambiguity in defining this constraint in the quantum theory: we chose to work with the normal ordered constraint
† (wi wi − κ)|physi = 0
This tells us that the physical spectrum of the theory has precisely κ excitations. In this way, we restrict from the infinite dimensional Hilbert space (2.19) to a finite dimensional subspace. However, clearly this restriction only makes sense if we take
κ ∈ Z+ (2.20)
This is interesting. We have an example where a parameter in an action can only take integer values. We will see many further examples as these lectures progress. In the present context, the quantisation of κ means that the CPN−1 phase space of the system has a quantised volume. Again, this sits nicely with the Bohr-Sommerfeld interpretation of dividing the phase space up into parcels.
– 36 – For each choice of κ, the Hilbert space inherits an action under the SU(N) symmetry. For example:
• κ = 0: The Hilbert space consists of a single state, |0i. This is equivalent to putting a particle in the trivial representation of the gauge group.
† • κ = 1: The Hilbert space consists of N states, wi |0i. This describes a particle transforming in the fundamental representation of the SU(N) gauge group.
1 † † • κ = 2: The Hilbert space consists of 2 N(N + 1) states, wi wj |0i, transforming in the symmetric representation of the gauge group.
In this way, we can build any symmetric representation of SU(N). If we were to treat the degrees of freedom wi as Grassmann variables, and so replace the commutators in † (2.18) with anti-commutators, {wi, wj } = δij, then it’s easy to convince yourself that we would end up with particles in the anti-symmetric representations of SU(N).
The Path Integral over the Colour Degrees of Freedom We can also study the quantum mechanical action (2.17) using the path integral. Here we fix the background gauge field Aµ and integrate only over the colour degrees of freedom w(τ) and the Lagrange multiplier λ(τ).
First, we ask: how can we see the quantisation condition of κ (2.20) in the path integral? There is a rather lovely topological argument for this, one which will be repeated a number of times in subsequent chapters. The first thing to note is that the term κλ in the Lagrangian transforms as a total derivative under the gauge symmetry. Naively we might think that we can just ignore this. However, we shouldn’t be quite so quick as there are situations where this term is non-vanishing.
Suppose that we think of the worldline of the system, parameterised by τ ∈ S1 rather than R. Then we can consider gauge transformations α(τ) in which α winds around the circle, so that R dτ α˙ = 2πn for some n ∈ Z. The action (2.17) would then change as
Sw → Sw + 2πκn under a gauge transformation which seems bad. However, in the quantum theory it’s iSw not the action Sw that we have to worry about but e because this is what appears in the path integral. And eiSw is gauge invariant provided that κ ∈ Z.
– 37 – It is not difficult to explicitly compute the path integral. For convenience, we’ll set κ = 1, so we’re looking at objects in the N representation of SU(N). It’s not hard to see that the path integral over λ causes the partition function to vanish unless we put in two insertions of w. We should therefore compute Z † iSw(w,λ;A) † Zw[A] := DλDwDw e wi(τ = ∞)wi (τ = −∞)
The insertion at τ = −∞ can be thought of as placing the particle in some particular internal state. The partition function measures the amplitude that it remains in that state at τ = +∞
We next perform the path integral over w and w†. This is tantamount to summing a series of diagrams like this:
= + + + ....
where the straight lines are propagators for wi which are simply θ(τ1 − τ2)δij, while the dotted lines represent insertions of the gauge fields A. It’s straightforward to sum these. The final result is something familiar: Z Zw[A] = tr P exp i dτ A(τ) (2.21)
This, of course, is the Wilson loop W [C]. We see that we get a slightly different perspective on the Wilson loop: it arises by integrating out the colour degrees of freedom of the quark test particle.
2.2 The Theta Term The Yang-Mills action is the obvious generalisation of the Maxwell action, 1 Z S = − d4x tr F µνF YM 2g2 µν There is, however, one further term that we can add which is Lorentz invariant, gauge invariant and quadratic in field strengths. This is the theta term, θ Z S = d4x tr ?F µνF (2.22) θ 16π2 µν
? µν 1 µνρσ where F = 2 Fρσ. Clearly, this is analogous to the theta term that we met in Maxwell theory in Section 1.2. Note, however, that the canonical normalisation of the
– 38 – 1 Yang-Mills theta term differs by a factor of 2 from the Maxwell term (a fact which is a little hidden in this notation because it’s buried in the definition of the trace (2.2)). We’ll understand why this is the case below. (A spoiller: it’s because the periodicity of 2 the Maxwell theta term arises from the first Chern number, c1(A) while the periodicity of the non-Abelian theta-term arises from the second Chern number c2(A).) The non-Abelian theta term shares a number of properties with its Abelian counter- part. In particular,
• The theta term is a total derivative. It can be written as θ Z S = d4x ∂ Kµ (2.23) θ 8π2 µ where 2i Kµ = µνρσtr A ∂ A − A A A (2.24) ν ρ σ 3 ν ρ σ This means that, as in the Maxwell case, the theta term does not change the classical equations of motion.
• θ is an angular variable. For simple gauge groups, it sits in the range
θ ∈ [0, 2π)
This follows because the total derivative (2.23) counts the winding number of a gauge configuration known as the Pontryagin number such that, evaluated on
any configuration, Sθ = θn with n ∈ Z. This is similar in spirit to the kind of argument we saw in Section 1.2.4 for the U(1) theta angle, although the details differ because non-Abelian gauge groups have a different topology from their Abelian cousins. We will explain this in the rest of this section and, from a slightly different perspective, in Section 2.3.
There can, however, be subtleties associated to discrete identifications in the gauge group in which case the range of θ should be extended. We’ll discuss this in more detail in Section 2.6.
In Section 1.2, we mostly focussed on situations where θ varies in space. This kind of “topological insulator” physics also applies in the non-Abelian case. However, as we mentioned above, the topology of non-Abelian gauge groups is somewhat more complicated. This, it turns out, affects the spectrum of states in the Yang-Mills theory even when θ is constant. The purpose of this section is to explore this physics.
– 39 – 2.2.1 Canonical Quantisation of Yang-Mills Ultimately, we want to see how the θ term affects the quantisation of Yang-Mills. But we can see the essence of the issue already in the classical theory where, as we will now show, the θ term results in a shift to the canonical momentum. The full Lagrangian is
1 θ L = − tr F µνF + tr ?F µνF (2.25) 2g2 µν 16π2 µν
To start, we make use of the gauge redundancy to set
A0 = 0
With this ansatz, the Lagrangian becomes
1 θ L = tr A˙ 2 − B2 + tr A˙ · B (2.26) g2 4π2
1 Here Bi = − 2 ijkFjk is the non-Abelian magnetic field (sometimes called the chromo- ˙ magnetic field). Meanwhile, the non-Abelian electric field is Ei = Ai. I’ve chosen not to use the electric field notation in (2.26) as the A˙ terms highlight the canonical structure. Note that the θ term is linear in time derivatives; this is reminiscent of the effect of a magnetic field in Newtonian particle mechanics and we will see some similarities below.
The Lagrangian (2.26) is not quite equivalent to (2.25); it should be supplemented by the equation of motion for A0. In analogy with electromagnetism, we refer to this as Gauss’ law. It is
DiEi = 0 (2.27)
This is a constraint which should be imposed on all physical field configurations.
The momentum conjugate to A is
∂L 1 θ π = = E + B ∂A˙ g2 8π2 From this we can build the Hamiltonian 1 H = tr E2 + B2 (2.28) g2
We see that, when written in terms of the electric field E, neither the constraint (2.27) nor the Hamiltonian (2.28) depend on θ; all of the dependence is buried in the Poisson
– 40 – bracket structure. Indeed, when written in terms of the canonical momentum π, the constraint becomes
Diπi = 0 where the would-be extra term DiBi = 0 by virtue of the Bianchi identity (2.10). Meanwhile the Hamiltonian becomes
θ 2 1 H = g2tr π − B + tr B2 8π2 g2
It is this θ-dependent shift in the canonical momentum which affects the quantum theory.
Building the Hilbert Space Let’s first recall how we construct the physical Hilbert space of Maxwell theory where, for now, we set θ = 0. For this Abelian theory, Gauss’ law (2.27) is linear in A and it is equivalent to ∇ · A = 0. This makes it simple to solve: the constraint kills the longitudinal photon mode, leaving us with two, physical transverse modes. We can then proceed to build the Hilbert space describing just these physical degrees of freedom. This was the story we learned in our first course on Quantum Field Theory.
In contrast, things aren’t so simple in Yang-Mills theory. Now the Gauss’ law (2.27) is non-linear and it’s not so straightforward to solve the constraint to isolate only the physical degrees of freedom. Instead, we proceed as follows. We start by constructing an auxiliary Hilbert space built from all spatial gauge fields: we call these states |A(x, t)i. The physical Hilbert space is then defined as those states |physi which obey
DiEi |physi = 0 (2.29)
Note that we do not set DiEi = 0 as an operator equation; this would not be compatible with the commutation relations of the theory. Instead, we use it to define the physical states.
There is an alternative way to think about the constraint (2.29). After we’ve picked
A0 = 0 gauge, we still have further time-independent gauge transformations of the form
A → ΩAΩ−1 + iΩ∇Ω−1
Among these are global gauge transformations which, in the limit x → ∞, asymptote to Ω → constant 6= 1. These are sometimes referred to as large gauge transformations.
– 41 – They should be thought of as global, physical symmetries rather than redundancies. A similar interpretation holds in Maxwell theory where the corresponding conserved quantity is electric charge. In the present case, we have a conserved charge for each generator of the gauge group. The form of the charge follows from Noether’s theorem and, for the gauge transformation Ω = eiω, is given by
Z Q(ω) = d3x tr (π · δA) 1 Z θg2 = d3x tr E + B D ω (2.30) g2 i 8π2 i i 1 Z = − d3x tr (D E ω) g2 i i where we’ve used the fact that DiBi = 0. This is telling us that the Gauss’ law a a G = (DiEi) plays the role of the generator of the gauge symmetry. The constraint (2.29) is the statement that we are sitting in the gauge singlet sector of the Hilbert space where, for all ω, Q(ω) = 0.
2.2.2 The Wavefunction and the Chern-Simons Functional
It’s rare in quantum field theory that we need to resort to the old-fashioned Schr¨odinger representation of the wavefunction. But we will find it useful here. We will think of the states in the auxiliary Hilbert space as wavefunctions of the form Ψ(A). (Strictly speaking, these are wavefunctionals because the argument A(x) is itself a function.)
In this language, the canonical momentum πi is, as usual in quantum mechanics, i π = −iδ/δAi. The Gauss’ law constraint then becomes
δΨ Di −i = 0 (2.31) δAi
Meanwhile, the Schr¨odingerequation is
δ θ 2 1 HΨ = g2tr −i − B Ψ + tr B2Ψ = EΨ (2.32) δA 8π2 g2
This is now in a form that should be vaguely familiar from our first course in quantum mechanics, albeit with an infinite number of degrees of freedom. All we have to do is solve these equations. That, you may not be surprised to hear, is easier said than done.
– 42 – We can, however, try to see the effect of the θ term. Suppose that we find a physical, energy eigenstate — call it Ψ0(A) — that solves both (2.31), as well as the Schr¨odinger equation (2.32) with θ = 0. That is, δ2Ψ 1 −g2tr 0 + tr B2Ψ = EΨ (2.33) δA g2 0 0 Now consider the following state
iθ W [A] Ψ(A) = e Ψ0(A) (2.34) where W (A) is given by 1 Z 2i W (A) = d3x ijk tr F A + A A A (2.35) 8π2 ij k 3 i j k This is known as the Chern-Simons functional. It has a number of beautiful and subtle properties, some of which we will see below, some of which we will explore in Section 8. It also plays an important role in the theory of the Quantum Hall Effect. Note that we’ve already seen the expression (2.35) before: when we wrote the θ term as a total derivative (2.24), the temporal component was K0 = 4π2W .
For now, the key property of W (A) that we will need is
δW (A) 1 ijk 1 = 2 Fjk = 2 Bi δAi 8π 4π which gives us the following relation,
δΨ(A) iθW [A] δΨ0(A) θ −i = −ie + 2 Bi Ψ(A) δAi δAi 4π This ensures that Ψ satisfies the Gauss law constraint (2.31). (To see this, you need to convince yourself that the Di in (2.31) acts only on δΨ0/δAi in the first term above and on Bi in the second and then remember that DiBi = 0 by the Bianchi identity.) Moreover, if Ψ0 obeys the Schr¨odingerequation (2.33), then Ψ will obey the Schr¨odinger equation (2.32) with general θ.
The above would seem to show that if we can construct a physical state Ψ0 with energy E when θ = 0 then we can dress this with the Chern-Simons functional eiθW (A) to construct a state Ψ which has the same energy E when θ 6= 0. In other words, the physical spectrum of the theory appears to be independent of θ. In fact, this conclusion is wrong! The spectrum does depend on θ. To understand the reason behind this, we have to look more closely at the Chern-Simons functional (2.35).
– 43 – Is the Chern-Simons Functional Gauge Invariant? The Chern-Simons functional W [A] is not obviously gauge invariant. In fact, not only is it not obviously gauge invariant, it turns out that it’s not actually gauge invariant! But, as we now explain, it fails to be gauge invariant in an interesting way.
Let’s see what happens. In A0 = 0 gauge, we can still act with time-independent gauge transformations Ω(x) ∈ G, under which
A → ΩAΩ−1 + iΩ∇Ω−1
−1 The spatial components of the field strength then changes as Fij → ΩFijΩ . It is not difficult to check that the Chern-Simons functional (2.35) transforms as
1 Z 1 W [A] → W [A] + d3x iijk∂ tr (∂ ΩΩ−1A ) − ijktr (Ω−1∂ ΩΩ−1∂ ΩΩ−1∂ Ω) 4π2 j i k 3 i j k
The first term is a total derivative. It has an interesting role to play on manifolds with boundaries but will not concern us here. Instead, our interest lies in the second term. This is novel to non-Abelian gauge theories and has a beautiful interpretation.
To understand this interpretation, we need to understand something about the topol- ogy of non-Abelian gauge transformations. As we now explain, these gauge transfor- mations fall into different classes.
We’ve already met the first classification of gauge transformations. Those with Ω 6= 1 2 ∼ 3 at spatial infinity, S∞ = ∂R , are to be thought of as global symmetries. The remaining 2 gauge symmetries have Ω = 1 on S∞. These are the ones that we are interested in here.
2 3 3 Insisting that Ω → 1 at S∞ is equivalent to working on spatial S rather than R . Each gauge transformation with this property then defines a map,
Ω(x): S3 7→ G
Such maps fall into disjoint classes. This arises because the gauge transformations can “wind” around the spatial S3, in such a way that one gauge transformation cannot be continuously transformed into another. We’ll meet this kind of idea a lot throughout these lectures. Such maps are characterised by homotopy theory. In general, we will be interested in the different classes of maps from spheres Sn into some space X. Two maps are said to be homotopic if they can be continuously deformed into each other.
– 44 – The homotopically distinct maps are classified by the group Πn(X). For us, the relevant formula is
Π3(G) = Z for all simple, compact Lie groups G. In words, this means that the winding of gauge transformations is classified by an integer n. This statement is intuitive for G = SU(2) since SU(2) ∼= S3, so the homotopy group counts the winding of maps from S3 7→ S3. For higher dimensional G, it turns out that it’s sufficient to pick an SU(2) subgroup of G and consider maps which wind within that. It turns out that these maps cannot be unwound within the larger G. Moreover, all topologically non-trivial maps within G can be deformed to lie within an SU(2) subgroup. It can be shown that this winding is computed by, Z 1 3 ijk −1 −1 −1 n(Ω) = 2 d S tr (Ω ∂iΩΩ ∂jΩΩ ∂kΩ) (2.36) 24π S3 We claim that this expression always spits out an integer n(Ω) ∈ Z. This integer characterises the gauge transformation. It’s simple to check that n(Ω1Ω2) = n(Ω1) + n(Ω2).
An Example: SU(2) We won’t prove that the expression (2.36) is an integer which counts the winding. We will, however, give a simple example which illustrates the basic idea. We pick gauge group G = SU(2). This is particularly straightforward because, as a manifold, ∼ 3 3 ∼ SU(2) = S and it seems eminently plausible that Π3(S ) = Z. In this case, it is not difficult to give an explicit mapping which has winding number n. Consider the radially symmetric gauge transformation
σ xˆi ω ω Ω (x) = exp iω(r) i = cos + i sin σi · xˆi (2.37) n 2 2 2 where ω(r) is some monotonic function such that ( 0 r = 0 ω(r) = 4πn r = ∞
i Note that whenever ω is a multiple of 4π then Ω = e2πiσixˆ = 1. This means that as we move out radially from the origin, the gauge transformation (2.37) is equal to the identity n times, starting at the origin and then on successive spheres S2 before it
– 45 – 2 reaches the identity the final time at infinity S∞. If we calculate the winding (2.36) of this map, we find
n(Ωn) = n
For more general non-Abelian gauge groups G, one can always embed the winding
Ωn(x) into an SU(2) subgroup. It turns out that it is not possible to unwind this by moving in the larger G. Moreover, the converse also holds: given any non-trivial winding Ω(x) in G, one can always deform Ω(x) until it sits entirely within an SU(2) subgroup.
The Chern-Simons Functional is not Gauge Invariant! We now see the relevance of these topologically non-trivial gauge transformations. Dropping the boundary term, the transformation of the Chern-Simons functional is
W [A] → W [A] + n
We learn that the Chern-Simons functional is not quite gauge invariant. But it only changes under topologically non-trivial gauge transformations, where it shifts by an integer.
What does this mean for our wavefunctions? We will require that our wavefunctions are gauge invariant, so that Ψ(A0) = Ψ(A) with A0 = ΩAΩ−1 + iΩ∇Ω−1. Now, however, we see the problem with our dressing argument. Suppose that we find a wavefunction Ψ0(A) which is a state when θ = 0 and is gauge invariant. Then the dressed wavefunction
iθ W [A] Ψ(A) = e Ψ0(A) (2.38) will indeed solve the Schr¨odingerequation for general θ. But it is not gauge invariant: instead it transforms as Ψ(A0) = eiθnΨ(A).
This then, is the way that the θ angle shows up in the states. We do require that Ψ(A) is gauge invariant which means that it’s not enough to simply dress the θ = 0 iθW [A] wavefunctions Ψ0(A) with the Chern-Simons functional e . Instead, if we want to go down this path, we must solve the θ = 0 Schr¨odingerequation with the requirement 0 −iθn that Ψ0(A ) = e Ψ0(A), so that this cancels the additional phase coming from the dressing factor so that Ψ(A) is gauge invariant.
There is one last point: the value of θ only arises in the phase eiθn with n ∈ Z. This, is the origin of the statement of that θ is periodic mod 2π. We take θ ∈ [0, 2π).
– 46 – We have understood that the spectrum does depend on θ. But we have not un- derstood how the spectrum depends on θ. That is much harder. We will not have anything to say here, but will return to this a number of times in these lectures, both in Section 2.3 where we discuss instantons and in Section 6 when we discuss the large N expansion.
2.2.3 Analogies From Quantum Mechanics There’s an analogy that exhibits some (but not all) of the ideas above in a much simpler setting. Consider a particle of unit charge, restricted to move on a circle of radius R. Through the middle of the circle we thread a magnetic flux Φ. Because the particle sits away from the magnetic field, its classical motion is unaffected by the flux. Nonetheless, the quantum spectrum does depend on the flux and this arises for reasons very similar to those described above.
Let’s recall how this works. The Hamiltonian for the particle is 1 ∂ Φ 2 H = −i + 2m ∂x 2πR
We can now follow our previous train of logic. Suppose that we found a state Ψ0 which is an eigenstate of the Hamiltonian when Φ = 0. We might think that we could −iΦx/2πR then just write down the new state Ψ = e Ψ0 which is an eigenstate of the Hamiltonian for non-zero Φ. However, as in the Yang-Mills case above, this is too quick. For our particle on a circle, it’s not large gauge transformations that we have to worry about; instead, it’s simply the requirement that the wavefunction is single valued. The dressing factor eiΦx/2πR is only single valued if Φ is a multiple of 2π.
Of course, the particle moving on a circle is much simpler than Yang-Mills. Indeed, there is no difficulty in just solving it explicitly. The single-valued wavefunctions have the property that they are actually independent of Φ. (There is no reason to believe that this property also holds for Yang-Mills.) They are 1 Ψ = √ einx/R n ∈ Z 2πR These solve the Schr¨odingerequation HΨ = EΨ with energy 1 Φ 2 E = n + n ∈ Z (2.39) 2mR2 2π We see that the spectrum of the theory does depend on the flux Φ, even though the particle never goes near the region with magnetic field. Moreover, as far as the particle is concerned, the flux Φ is a periodic variable, with periodicity 2π. In particular, if Φ is an integer multiple of 2π, then the spectrum of the theory is unaffected by the flux.
– 47 – The Theta Angle as a “Hidden” Parameter There is an alternative way to view the problem of the particle moving on a circle. We explain this here before returning to Yang-Mills where we offer the same viewpoint. This new way of looking at things starts with a question: why should we insist that the wavefunction is single-valued? After all, we only measure probability |Ψ|2, which cares nothing for the phase. Does this mean that it’s consistent to work with wavefunctions that are not single-valued around the circle?
The answer to this question is “yes”. Let’s see how it works. Consider the Hamilto- nian for a free particle on a circle of radius R,
1 ∂2 H = − (2.40) 2m ∂x2 In this way of looking at things, the Hamiltonian contains no trace of the flux. Instead, it will arise from the boundary conditions that we place on the wavefunction. We will not require that the wavefunction is single valued, but instead that it comes back to itself up to some specified phase Φ, so that
Ψ(x + 2πR) = eiΦ Ψ(x)
The eigenstates of (2.40) with this requirement are
1 Ψ = √ ei(n+Φ/2π)x/R n ∈ Z 2πR
The energy of these states is again given by (2.39). We learn that allowing for more general wavefunctions doesn’t give any new physics. Instead, it allows for a different perspective on the same physics, in which the presence of the flux does not appear in the Hamiltonian, but instead is shifted to the boundary conditions imposed on the wavefunction. In this framework, the phase Φ is sometimes said to be a “hidden” parameter because you don’t see it directly in the Hamiltonian.
We can now ask this same question for Yang-Mills. We’ll start with Yang-Mills theory in the absence of a θ term and will see how we can recover the states with θ 6= 0. Here, the analog question is whether the wavefunction Ψ0(A) should really be gauge invariant, or whether we can suffer an additional phase under a gauge transformation. The phase that the wavefunction picks up should be consistent with the group structure of gauge transformations: this means that we are looking for a one-dimensional representation (the phase) of the group of gauge transformations.
– 48 – Topologically trivial gauge transformations (which have n(Ω) = 0) can be continu- ously connected to the identity. For these, there’s no way to build a non-trivial phase 0 factor consistent with the group structure: it must be the case that Ψ0(A ) = Ψ0(A) whenever A0 = ΩAΩ−1 + iΩ∇Ω−1 with n(Ω) = 0.
However, things are different for the topologically non-trivial gauge transformations. As we’ve seen above, these are labelled by their winding n(Ω) ∈ Z. One could require that, under these topologically non-trivial gauge transformations, the wavefunction changes as
0 −iθn Ψ0(A ) = e Ψ0(A) (2.41) for some choice of θ ∈ [0, 2π). This is consistent with consecutive gauge transformations because n(Ω1Ω2) = n(Ω1) + n(Ω2). In this way, we introduce an angle θ into the definition of the theory through the boundary conditions on wavefunctions.
It should be clear that the discussion above is just another way of stating our earlier results. Given a wavefunction which transforms as (2.41), we can always dress it with a Chern-Simons functional as in (2.38) to construct a single-valued wavefunction. These are just two different paths that lead to the same conclusion. We’ve highlighted the “hidden” interpretation here in part because it is often the way the θ angle is introduced in the literature. Moreover, as we will see in more detail in Section 2.3, it is closer in spirit to the way the θ angle appears in semi-classical tunnelling calculations.
Another Analogy: Bloch Waves There’s another analogy which is often wheeled out to explain how θ affects the states. This analogy has some utility, but it also has some flaws. I’ll try to highlight both below.
So far our discussion of the θ angle has been for all states in the Hilbert space. For this analogy, we will focus on the ground state. Moreover, we will work “semi-classically”, which really means “classically” but where we use the language of wavefunctions. I should stress that this approximation is not valid: as we will see in Section 2.4, Yang- Mills theory is strongly coupled quantum theory, and the true ground state will bear no resemblance to the classical ground state. The purpose of what follows is merely to highlight the basic structure of the Hilbert space.
With these caveats out the way, let’s proceed. The classical ground states of Yang- Mills are pure gauge configurations. This means that they take the form
A = iV ∇V −1 (2.42)
– 49 – for some V (x) ∈ G. But, as we’ve seen above, such configurations are labelled by the integer n(V ). This is a slightly different role for the winding: now it is labelling the zero energy states in the theory, as opposed to gauge transformations. At the semi- classical level, the configurations (2.42) map into quantum states. Since the classical configurations are labelled by an integer n(V ), this should carry over to the quantum Hilbert space. We call the corresponding ground states |ni with n ∈ Z.
If we were to stop here, we might be tempted to conclude that Yang-Mills has multiple ground states, |ni. But this would be too hasty. All of these ground states are connected by gauge transformations. But the gauge transformations itself must have non-trivial topology. Specifically, if Ω is a gauge transformation with n(Ω) = n0 then Ω|ni = |n + n0i.
The true ground state, like all states in the Hilbert space, should obey (2.41). For our states, this reads
Ω|Ψi = eiθn0 |Ψi
This means that the physical ground state of the system is a coherent sum over all the states |ni. It takes the form X |θi = eiθn|ni (2.43) n
This is the semi-classical approximation to the ground state of Yang-Mills theory. These states are sometimes referred to as theta vacua. Once again, I stress that the semi- classical approximation is a rubbish approximation in this case! This is not close to the true ground state of Yang-Mills.
Now to the analogy, which comes from condensed matter physics. Consider a particle moving in a one-dimensional periodic potential
V (x) = V (x + a)
Classically there are an infinite number of ground states corresponding the minima of the potential. We describe these states as |ni with n ∈ Z. However, we know that these aren’t the true ground states of the Hamiltonian. These are given by Bloch’s theorem which states that all eigenstates have the form X |ki = eikan|ni (2.44) n
– 50 – for some k ∈ [−π/a, π/a) called the lattice momentum. Clearly there is a parallel between (2.43) and (2.44). In some sense, the θ angle plays a role in Yang-Mills similar to the combination ka for a particle in a periodic potential. This similarity can be traced to the underlying group theory structure. In both cases there is a Z group action on the states. For the particle in a lattice, this group is generated by the translation operator; for Yang-Mills it is generated by the topologically non-trivial gauge transformation with n(Ω) = 1.
There is, however, an important difference between these two situations. For the particle in a potential, all the states |ki lie in the Hilbert space. Indeed, the spec- trum famously forms a band labelled by k. In contrast, in Yang-Mills theory there is only a single state: each theory has a specific θ which picks out one state from the band. This can be traced to the different interpretation of the group generators. The translation operator for a particle is a genuine symmetry, moving one physical state to another. In contrast, the topologically non-trivial gauge transformation Ω is, like all gauge transformations, a redundancy: it relates physically identical states, albeit it up to a phase.
2.3 Instantons We have argued that the theta angle is an important parameter in Yang-Mills, changing the spectrum and correlation functions of the theory. This is in contrast to electro- magnetism where θ only plays a role in the presence of boundaries (such as topological insulators) or magnetic monopoles. It is natural to ask: how do we see this from the path integral?
To answer this question, recall that the theta term is a total derivative θ Z θ Z S = d4x tr ?F µνF = d4x ∂ Kµ θ 16π2 µν 8π2 µ where 2i Kµ = µνρσtr A ∂ A − A A A ν ρ σ 3 ν ρ σ
This means that if a field configuration is to have a non-vanishing value of Sθ, then it must have something interesting going on at infinity.
At this point, we do something important: we Wick rotate so that we work in Euclidean spacetime R4. We will explain the physical significance of this in Section
2.3.2. Configurations that have finite action SYM must asymptote to pure gauge,
−1 Aµ → iΩ∂µΩ as x → ∞ (2.45)
– 51 – with Ω ∈ G. This means that finite action, Euclidean field configurations involve a map
3 Ω(x): S∞ 7→ G
But we have met such maps before: they are characterised by the homotopy group
Π3(G) = Z. Plugging this asymptotic ansatz (2.45) into the action Sθ, we have
Sθ = θν (2.46) where ν ∈ Z is an integer that tells us the number of times that Ω(x) winds around 3 the asymptotic S∞, Z 1 3 ijk −1 −1 −1 ν(Ω) = 2 d S tr (Ω∂iΩ )(Ω∂jΩ )(Ω∂kΩ ) (2.47) 24π 3 S∞
This is the same winding number that we met previously in (2.36).
This discussion is mathematically identical to the classification of non-trivial gauge transformations in Section 2.2.2. However, the physical setting is somewhat different. 3 Here we are talking about maps from the boundary of (Euclidean) spacetime S∞, while in Section 2.2.2 we were talking about maps from a spatial slice, R3, suitably compactified to become S3. We will see the relationship between these in Section 2.3.2.
2.3.1 The Self-Dual Yang-Mills Equations Among the class of field configurations with non-vanishing winding ν there are some that are special: these solve the classical equations of motion,
µν DµF = 0 (2.48)
There is a cute way of finding solutions to this equation. The Yang-Mills action is
1 Z S = d4x tr F F µν YM 2g2 µν
Note that in Euclidean space, the action comes with a + sign. This is to be contrasted with the Minkowski space action (2.8) which comes with a minus sign. We can write this as 1 Z 1 Z 8π2 S = d4x tr (F ∓ ?F )2 ± d4x tr F ?F µν ≥ |ν| YM 4g2 µν µν 2g2 µν g2
– 52 – where, in the last line, we’ve used the result (2.46). We learn that in the sector with winding ν, the Yang-Mills action is bounded by 8π2|ν|/g2. The action is minimised when the bound is saturated. This occurs when
? Fµν = ± Fµν (2.49)
These are the (anti) self-dual Yang-Mills equations. The argument above shows that solutions to these first order equations necessarily minimise the action is a given topo- logical sector and so must solve the equations of motion (2.48). In fact, it’s straightfor- ward to see that this is the case since it follows immediately from the Bianchi identity ? µν Dµ F = 0. The kind of “completing the square” trick that we used above, where we bound the action by a topological invariant, is known as the Bogomolnyi bound. We’ll see it a number of times in these lectures.
Solutions to the (anti) self-dual Yang-Mills equations (2.49) are known as instantons. This is because, as we will see below, the action density is localised at both a point in space and at an instant in (admittedly, Euclidean) time. They contribute to the path integral with a characteristic factor
2 2 e−Sinstanton = e−8π |ν|/g eiθν (2.50)
Note that the Yang-Mills contribution is real because we’ve Wick rotated to Euclidean space. However, the contribution from the theta term remains complex even after Wick rotation. This is typical behaviour for such topological terms that sit in the action with epsilon symbols.
A Single Instanton in SU(2) ? We will focus on gauge group G = SU(2) and solve the self-dual equations Fµν = Fµν with winding number ν = 1. As we’ve seen, asymptotically the gauge field must be −1 pure gauge, and so takes the form Aµ → iΩ∂µΩ . An example of a map Ω(x) ∈ SU(2) with winding ν = 1 is given by x σµ Ω(x) = √µ where σµ = (1, −i~σ) x2 with this choice, the asymptotic form of the gauge field is given by3 1 A → iΩ∂ Ω−1 = ηi xνσi as x → ∞ µ µ x2 µν
3In the lecture notes on Solitons, the instanton solution was presented in singular gauge, where it takes a similar, but noticeably different form.
– 53 – i Here the ηµν are usually referred to as ’t Hooft matrices. They are three 4 × 4 matrices which provide an irreducible representation of the su(2) Lie algebra. They are given by
0 1 0 0 0 0 1 0 0 0 0 1 η1 = −1 0 0 0 , η2 = 0 0 0 −1 , η3 = 0 0 1 0 µν 0 0 0 1 µν −1 0 0 0 µν 0 −1 0 0 0 0 −1 0 0 1 0 0 −1 0 0 0
1 i i These matrices are self-dual: they obey 2 µνρσηρσ = ηµν. This will prove important. (Note that we’re not being careful about indices up vs down as we are in Euclidean space with no troublesome minus signs.) The full gauge potential should now be of −1 the form Aµ = if(x)Ω∂µΩ for some function f(x) → 1 as x → ∞. The right choice turns out to be f(x) = x2/(x2 + ρ2) where ρ is a parameter whose role will be clarified shortly. We then have the gauge field 1 A = ηi xνσi (2.51) µ x2 + ρ2 µν You can check that the associated field strength is 2ρ2 F = − ηi σi µν (x2 + ρ2)2 µν This inherits its self-duality from the ’t Hooft matrices and therefore solves the Yang- Mills equations of motion.
The instanton solution (2.51) is not unique. By acting on this solution with various symmetries, we can easily generate more solutions. The most general solution with winding ν = 1 depends on 8 parameters which, in this context, are referred to as collective coordinates. Each of them is has a simple explanation:
• The instanton solution above is localised at the origin. But we can always generate a new solution localised at any point X ∈ R4 simply by replacing xµ → xµ − Xµ in (2.51). This gives 4 collective coordinates.
• We’ve kept one parameter ρ explicit in the solution (2.51). This is the scale size of the instanton, an interpretation which is clear from looking at the field strength which is localised in a ball of radius ρ. The existence of this collective coordinate reflects the fact that the classical Yang-Mills theory is scale invariant: if a solution exists with one size, it should exist with any size. This property is broken in the quantum theory by the running of the coupling constant, and this has implications for instantons that we will describe below.
– 54 – • The final three collective coordinates arise from the global part of the gauge group. These are gauge transformations which do not die off asymptotically, and correspond to three physical symmetries of the theory, rather than redundancies.
For our purposes, we can consider a constant V ∈ SU(2) , and act as Aµ → −1 VAµV . Before we proceed, we pause to mention that it is straightforward to write down a corresponding anti-self-dual instanton with winding ν = −1. We simply replace the ’t Hooft matrices with their anti-self dual counterparts,
0 −1 0 0 0 0 −1 0 0 0 0 −1 η¯1 = 1 0 0 0 , η¯2 = 0 0 0 −1 , η¯3 = 0 0 1 0 µν 0 0 0 1 µν 1 0 0 0 µν 0 −1 0 0 0 0 −1 0 0 1 0 0 1 0 0 0
1 i i They obey 2 µνρσηρσ = −ηµν, and one can use these to build a gauge potential (2.51) with ν = −1. These too form an irreducible representation of su(2), and obey [ηi, η¯j] = 0. The fact that we can find two commuting su(2) algebras hiding in a 4 × 4 matrix reflects the fact that Spin(4) ∼= SU(2) × SU(2) and, correspondingly, the Lie algebras are so(4) = su(2) ⊕ su(2).
General Instanton Solutions To get an instanton solution in SU(N), we could take the SU(2) solution (2.51) and simply embed it in the upper left-hand corner of an N × N matrix. We can then rotate this into other embeddings by acting with SU(N), modulo the stabilizer which leaves the configuration untouched. This leaves us with the action SU(N) S[U(N − 2) × U(2)] where the U(N −2) hits the lower-right-hand corner and doesn’t see our solution, while the U(2) is included in the denominator because it acts like V in the original solution (2.51) and we don’t want to over count. The notation S[U(p) × U(q)] means that we lose the overall central U(1) ⊂ U(p) × U(q). The coset space above has dimension 4N − 8. This means that the solution in which (2.51) is embedded into SU(N) comes with 4N collective coordinate. This is the most general ν = 1 instanton solution in SU(N).
What about solutions with higher ν? There is a beautiful story here. It turns out that such solutions exist and have 4Nν collective coordinates. Among these solutions are configurations which look like ν well separated instantons, each with 4N collec- tive coordinates describing its position, scale size and orientation. However, as the instantons overlap this interpretation breaks down.
– 55 – V ( x ) V ( x )
x
x Figure 8: The double well Figure 9: The upside down double well
Remarkably, there is a procedure to generate all solutions for general ν. It turns out that one can reduce the non-linear partial differential equations (2.49) to a straightfor- ward algebraic equation. This is known as the ADHM construction and is possible due to some deep integrable properties of the self-dual Yang-Mills equations. You can read more about this construction (from the perspective of D-branes and string theory) in the lectures on Solitons.
2.3.2 Tunnelling: Another Quantum Mechanics Analogy We’ve found solutions in Euclidean spacetime that contribute to the theta dependence in the path integral. But why Euclidean rather than Lorentzian spacetime? The answer is that solutions to the Euclidean equations of motion describe quantum tunnelling.
This is best illustrated by a simple quantum mechanical example. Consider the double well potential shown in the left-hand figure. Clearly there are two classical ground states, corresponding to the two minima. But we know that a quantum particle sitting in one minimum can happily tunnel through to the other. The end result is that the quantum theory has just a single ground state.
How can we see this behaviour in the path integral? There are no classical solutions to the equations of motion which take us from one minimum to the other. However, things are rather different in Euclidean time. We define
τ = it
After this Wick rotation, the action
Z m dx2 S[x(t)] = dt − V (x) 2 dt turns into the Euclidean action Z m dx2 S [x(τ)] = −iS = dτ + V (x) E 2 dτ
– 56 – We see that the Wick rotation has the effect of inverting the potential: V (x) → −V (x). In Euclidean time, the classical ground states correspond to the maxima of the inverted potential. But now there is a perfectly good solution to the equations of motion, in which we roll from one maximum to the other. We come to a rather surprising conclusion: quantum tunnelling can be viewed as classical motion in imaginary time!
As an example, consider the quartic potential V (x) = λ(x2 − a2)2 (2.52) which has minima at x = ±a. Then a solution to the equations of motion which interpolates between the two ground states in Euclidean time is given by ω x¯(τ) = a tanh (τ − τ ) (2.53) 2 0 with ω2 = 8λa2/m. This solution is the instanton for quantum mechanics in the double well potential. There is also an anti-instanton solution that interpolates from x = +a to x = −a. The (anti)-instanton solution is localised in a region 1/ω in imaginary time.
In this case, there is just a single collective coordinate, τ0, whose existence follows from time translational invariance of the quantum mechanics.
Returning to Yang-Mills, we now seek a similar tunnelling interpretation for the instanton solutions. In the semi-classical approximation, the instantons tunnel between the |ni vacua that we described in Section 2.2.3. Recall that the semi-classical vacuum −1 3 is defined by Ai = iV ∂iV on a spatial slice R , which we subsequently compactify to S3. The vacuum |ni is associated to maps V (x): S3 7→ G with winding n, defined in (2.36).
We noted previously that the construction of the vacua |ni in terms of winding relies on topological arguments which are simi- n + > lar to those which underlie the existence of instantons. To see the connection, we can take the definition of the instanton winding instanton with t (2.47) and deform the integration region from the asymptotic winding ν 3 4 3 S∞ = ∂R to the two asympotic three spheres S± which we 3 think of as the compactified R± spatial slices ar t = ±∞. We n − > can then compare the instanton winding (2.47) to the definition of the vacuum states (2.36), to write Figure 10:
ν(U) = n+(U) − n−(U) We learn that the Yang-Mills instanton describes tunnelling between the two semi- classical vacua, |n−i → |n+i = |n− + νi, as shown in the figure.
– 57 – 2.3.3 Instanton Contributions to the Path Integral Given an instanton solution, our next task is to calculate something. The idea is to use the instanton as the starting point for a semi-classical evaluation of the path integral.
We can first illustrate this in our quantum mechanics analogy, where we would like to compute the amplitude to tunnel from one classical ground state |x = −ai to the other |x = +ai over some time T .
Z x(T )=+a ha|e−HT |−ai = N Dx(τ) e−SE [x(τ)] x(0)=−a with N a normalisation constant that we shall do our best to avoid calculating. There is a general strategy for computing instanton contributions to path integrals which we sketch here. This strategy will be useful in later sections (such as Section 7.2 and 8.3 where we discuss instantons in 2d and 3d gauge theories respectively.) However, we’ll see that we run into some difficulties when applying these ideas to Yang-Mills theories in d = 3 + 1 dimensions.
Given an instanton solutionx ¯(τ), like (2.53), we write the general x(τ) as
x(τ) =x ¯(τ) + δx(τ) and expand the Euclidean action as Z 3 SE[x(τ)] = Sinstanton + dτ δx ∆δx + O(δx ) (2.54)
Here Sinstanton = SE[¯x(τ)]. There are no terms that are linear in δx becausex ¯(τ) solves the equations of motion. The expansion of the action to quadratic order gives the differential operator ∆. The semi-classical approach is valid if the higher order terms give sub-leading corrections to the path integral. For our quantum mechanics double well potential, one can check that this holds provided λ 1 in (2.52). For Yang-Mills, this requirement will ultimately make us think twice about the semi-classical expansion.
Substituting the expansion (2.54) into the path integral, we’re left with the usual Gaussian integral. It’s tempting to write
x(T )=+a δx(T )=0 Z Z 3 Dx(τ) e−SE [x(τ)] = e−Sinstanton Dδx(τ) e−δx∆δx+O(δx ) x(0)=−a δx(0)=0 e−Sinstanton ≈ det1/2∆
– 58 – This, however, is a little too quick. The problem comes because the operator ∆ has a zero eigenvalue which makes the answer diverge. A zero eigenvalue of ∆ occurs if there are any deformations of the solutionx ¯(τ) which do not change the action. But we know that such deformations do indeed exist since the instanton solutions are never unique: they depend on collective coordinates. In our quantum mechanics example, there is a just a single collective coordinate, called τ0 in (2.53), which means that the deformation δx = ∂x/∂τ¯ 0 is a zero mode: it is annihilated by ∆. To deal with this, we need to postpone the integration over any zero mode. These can then be replaced by an integration over the associated collective coordinate. For our quantum mechanics example, we have Z x(T )=+a Z T e−Sinstanton −SE [x(τ)] Dx(τ) e ≈ dτ0 J 0 1/2 x(0)=−a 0 det ∆ Here J is the Jacobian factor that comes from changing the integration variable from the zero mode to the collective coordinate. We will not calculate it here. Meanwhile the notation det0 means that we omit the zero eigenvalue of ∆ when computing the determinant. The upshot is that a single instanton gives a saddle point contribution to the tunnelling amplitude, N J ha|e−HT |−ai ≈ KT e−Sinstanton with K = det0 1/2∆ Note that we’ve packaged all the things that we couldn’t be bothered to calculate into a single constant, K. The result above gives the contribution from a single instanton to the tunnelling amplitude. But, it turns out, this is not the dominant contribution. That, instead, comes from summing over many such tunnelling events. Consider a configurations consisting of a string of instantons and anti-instantons. Each instanton must be followed by an anti-instanton and vice versa. This configu- ration does not satisfy the equation of motion. However, if the (anti) instantons are well separated, with a spacing 1/ω, then the configuration very nearly satisfies the equations of motion; it fails only by exponentially suppressed terms. We refer to this as a dilute gas of instantons. As above, we should integrate over the positions of the instantons and anti-instantons. Because each of these is sandwiched between two others, this leads to the integration Z T Z T Z T T n dt1 dt2 ... dtn = 0 t1 tn−1 n! where we’re neglecting the thickness 1/ω of each instanton.
– 59 – A configuration consisting of n instantons and anti-instantons is more highly sup- pressed since its action is approximately nSinstanton. But, as we now see, these contri- butions dominate because of entropic factors: there are many more of them. Summing over all such possibilities, we have
X 1 ha|e−HT |−ai ∼ (KT e−Sinstanton )n = sinh KT e−Sinstanton n! n odd where we restrict the sum to n odd to ensure that we end up in a different classical ground state from where we started. We haven’t made any effort to normalise this amplitude, but we can compare it to the amplitude to propagate from the state |−ai back to |−ai,
X 1 h−a|e−HT |−ai ∼ (KT e−Sinstanton )n = cosh KT e−Sinstanton n! n even In the long time limit T → ∞, we see that we lose information about where we started, and we’re equally likely to find ourselves in either of the ground states |ai or |−ai. If we were more careful about the overall normalisation, we can also use this argument to compute the energy splitting between the ground state and the first excited state.
As an aside, you may notice that the calculation above is identical to the argument for why there are no phase transitions in one dimensional thermal systems given in the lectures on Statistical Field Theory.
Back to Yang-Mills Instantons Now we can try to apply these same ideas to Yang-Mills instantons. Unfortunately, things do not work out as nicely as we might have hoped. We would like to approximate the Yang-Mills path integral Z Z = DA e−SYM +iSθ by the contribution from the instanton saddle point. There are the usual issues related to gauge fixing, but these do not add anything new to our story so we neglect them here and focus only on the aspects directly related to instantons. (We’ll be more careful about gauge fixing in Section 2.4.2 when we discuss the beta function.)
Let’s start by again considering the contribution from a single instanton. The story proceeds as for the quantum mechanics example until we come to discuss the collective coordinates. For the instanton in quantum mechanics, there was just a single collective
– 60 – coordinate τ0. For our Yang-Mills instanton in SU(2), there are eight. Four of these are associated to translations in Euclidean spacetime; these play the same role as τ0 and integrating over them gives a factor of the Euclidean spacetime volume VT , with V the 3d spatial volume. Three of the collective coordinates arise from the global part of the gauge symmetry and can be happily integrated over. But this leaves us with the scale size ρ. This too should be singled out from the path integral and integrated over. We find ourselves with an integral of the form, Z ∞ Z ≈ dρ K(ρ) V T e−8π2/g2 eiθ 0 where, as before, K(ρ) includes contributions from the Jacobians and the one-loop determinant. Now, however, it is a function of the instanton scale size ρ and so we should do the hard work of calculating it.
We won’t do this hard work, in part because the calculation is rather involved and in part because, as we advertised above, the end result doesn’t offer quantitative insights into the behaviour of Yang-Mills. It turns out that K(ρ) causes the integral diverge at large ρ. This raises two concerns. First, it is difficult to justify the dilute instanton gas approximation if it is dominated by instantons of arbitrarily large size which are surely overlapping. Second, and more pressing, it is difficult to justify the saddle point expansion at all. This is because, as we describe in some detail in the next section, the gauge coupling in Yang-Mills runs; it is small at high energy but becomes large at low energies. This means that any semi-classical approximation, such as instantons, is valid for describing short distance processes but breaks down at large distances. The fact that our attempt to compute the partition function is dominated by instantons of large size is really telling us that the whole semi-classical strategy has broken down. Instead, we’re going to have to face up to the fact that Yang-Mills is a strongly coupled quantum field theory.
It’s a little disappointing that we can’t push the instanton programme further in Yang-Mills. However, it’s not all doom and gloom and we won’t quite leave instan- tons behind in these lectures. There are situations where instantons are the leading contribution to certain processes. We will see one such example in Section 3.3.2 in the context of the anomaly, although for more impressive examples one has to look to supersymmetric field theories which are under greater control and beyond the scope of these lectures.
2.4 The Flow to Strong Coupling Our discussion in the previous sections has focussed on the classical (or, at the very
– 61 – least, semi-classical) approach to Yang-Mills. Such a description gives good intuition for the physics when a theory is weakly coupled, but often fails miserably at strong coupling. The next question we should ask is whether Yang-Mills theory is weakly or strongly coupled.
We have chosen a scaling in which the coupling g2 sits in front of the action 1 Z S = d4x tr F µνF (2.55) YM 2g2 µν The quantum theory is defined, in the framework of path integrals, by summing over all field configurations weighted, with eiSYM in Minkowski space or e−SYM in Euclidean space. When g2 is small, the Euclidean action has a deep minimum on the solutions to the classical equations of motion, and these dominate the path integral. In this case, the classical field configurations provide a good starting point for a saddle point analysis. (In Minkowski space, the action is a stationary point rather than a minimum on classical solutions but, once again, these dominate the path integral.) In contrast, when g2 is large, many field configurations contribute to the path integral. In this case, we sometimes talk about quantum fluctuations being large. Now the quantum state will look nothing like the solutions to the classical equations of motion.
All of this would seem to suggest that life is easy when g2 is small, and harder when g2 is large. However, things are not quite so simple. This is because the effective value of g2 differs depending on the length scale on which you look: we write g2 = g2(µ), where µ is an appropriate energy scale, or inverse length scale. Note that this is quite a radical departure from the the classical picture where any constants you put in the action remain constant. In quantum field theory, these constants are more wilful: they take the values they want to, rather than the values we give them.
We computed the running of the gauge coupling g2 at one-loop in our previous course on Advanced Quantum Field Theory. (We will review this computation in Section 2.4.2 below.) The upshot is that the coupling constant depends on the scale µ as
2 1 1 11 C(adj) ΛUV 2 = 2 − 2 log 2 (2.56) g (µ) g0 3 (4π) µ
2 where g0 is the coupling constant evaluated at the cut-off scale ΛUV . Here C(adj) is a group theoretic factor. Recall that we have fixed a normalisation of the Lie algebra generators in the fundamental representation to be (2.2), 1 tr T aT b = δab (2.57) 2
– 62 – Having pinned down the normalisation in one representation, the other representations R will have different normalisations,
tr T a(R)T b(R) = I(R) δab
The coefficient I(R) is called the Dynkin index of the representation R. The convention 1 (2.57) means that I(F ) = 2 . The group theoretic factor appearing in the beta function is simply the Dynkin index in the adjoint representation,
C(adj) = I(adj)
It is also known as the quadratic Casimir, which is why it is denoted by a different letter. For the various simple, compact Lie groups it is given by
G SU(N) SO(N) Sp(N) E6 E7 E8 F4 G2 1 C(adj) N 2 N − 1 N + 1 2 3/2 1/2 3/2 2
Note that the adjoint representation of E8 is the minimal representation; hence the 1 appearance of C(adj) = I(F ) = 2 .
The running of the gauge coupling (2.56) is often expressed in terms of the beta function
dg 11 C(adj) β(g) ≡ µ = β g3 with β = − (2.58) dµ 0 0 3 (4π)2
The minus sign in (2.56) or, equivalently, in (2.58), is all important. It tells us that the gauge coupling gets stronger as we flow to longer length scales. In contrast, it is weaker at short distance scales. This phenomena is called asymptotic freedom.
Asymptotic freedom means that Yang-Mills theory is simple to understand at high energies, or short distance scales. Here it is a theory of massless, interacting gluon fields whose dynamics are well described by the classical equations of motion, together with quantum corrections which can be computed using perturbation methods. In particular, our discussion of instantons in Section 2.3 is valid at short distance scales. However, it becomes much harder to understand what is going on at large distances where the coupling gets strong. Indeed, the beta function (2.58) is valid only when g2(µ) 1. This equation therefore predicts its own demise at large distance scales.
– 63 – We can estimate the distance scale at which we think we will run into trouble. Taking the one-loop beta function at face value, we can ask: at what scale does g2(µ) diverge? This happens at a finite energy
2 1/2β0g ΛQCD = ΛUV e 0 (2.59)
For historical reasons, we refer to this as the “QCD scale”, reflecting its importance in the strong force. Alternatively, we can write ΛQCD in terms of any scale µ,
2 1/2β0g (µ) ΛQCD = µ e and dΛQCD/dµ = 0. For this reason, it is sometimes referred to as the RG-invariant scale.
Asymptotic freedom means that β0 < 0. This ensures that if g0 1, so that the theory is weakly coupled at the cut-off, then ΛQCD ΛUV . This is interesting. Yang-Mills theory naturally generates a scale ΛQCD which is exponentially lower than the cut-off ΛUV of the theory. Theoretical physicists spend a lot of time worrying about “naturalness” which, at heart, is the question of how Nature generates different length scales. The logarithmic running of the coupling exhibiting by Yang-Mills theory provides a beautiful mechanism to do this. As we will see moving forwards, all the interesting physics in Yang-Mills occurs at energies of order ΛQCD.
Viewed naively, there’s something very surprising about the emergence of the scale
ΛQCD. This is because classical Yang-Mills has no dimensionful parameter. Yet the quantum theory has a physical scale, ΛQCD. It seems that the quantum theory has generated a scale out of thin air, a phenomenon which goes by the name of dimensional transmutation. In fact, as the definition (2.59) makes clear, there is no mystery about this. Quantum field theories are not defined only by their classical action alone, but also by the cut-off ΛUV . Although we might like to think of this cut-off as merely a crutch, and not something physical, this is misleading. It is not something we can do without. And it this cut-off which evolves to the physical scale ΛQCD.
The question we would like to ask is: what does Yang-Mills theory look like at low energies, comparable to ΛQCD? This is a difficult question to answer, and our current understanding comes primarily from experiment and numerical work, with intuition built from different analytic approaches. The answer is rather startling: Yang-Mills theory does not describe massless particles. Instead, the gluons bind together to form massive particles known as glueballs. These particles have a mass that is of the order of ΛQCD, but figuring out the exact spectrum remains challenging. We sometimes say
– 64 – that the theory is gapped, meaning that there is a gap in the energy spectrum between the ground state, which we can take to have E = 0, and the first excited state with energy E = Mc2, where M is the mass of the lightest glueball.
Proving the mass gap for Yang-Mills is one of the most important and difficult open problems in mathematical physics. In these lectures we will restrict ourselves to building some intuition for Yang-Mills theory, and understanding some of the consequences of the mass gap. In later sections, will also see how the situation changes when we couple Yang-Mills to dynamical matter fields.
Before we proceed, I should mention a rather subtle and poorly understood caveat. We have argued in Sections 2.2 and 2.3 that the dynamics of Yang-Mills theory also depends on the theta parameter and we can ask: how does θ affect the spectrum? We have only a cursory understanding of this. It is thought that, for nearly all gauge groups, Yang-Mills remains gapped for all values of θ. However, something interesting happens at θ = π. Recall from Section 1.2.5 that θ = π is special because it preserves time- reversal invariance, more commonly known in particle physics as CP. For most gauge groups, it is thought that the dynamics spontaneously breaks time reversal invariance at θ = π, so that Yang-Mills has two degenerate ground states. We will give an argument for this in Section 3.6 using discrete anomalies, and another in Section 6.2.5 when we discuss the large N expansion. However, there is speculation that the behaviour of Yang-Mills is rather different for gauge group G = SU(2) and that, while gapped for all θ 6= π, this theory actually becomes gapless at θ = π, where it is conjectured to be described by a free U(1) photon. We will have nothing to say about this in these lectures.
2.4.1 Anti-Screening and Paramagnetism The computations of the 1-loop beta functions are rather involved. It’s useful to have a more down-to-earth picture in mind to build some understanding for what’s going on. There is nice intuitive analogy that comes from condensed matter.
In condensed matter physics, materials are not boring passive objects. They contain mobile electrons, and atoms with a flexible structure, both of which can respond to any external perturbation, such as applied electric or magnetic fields. One consequence of this is an effect known as screening. In an insulator, screening occurs because an applied electric field will polarise the atoms which, in turn, generate a counteracting electric field. One usually describes this by introducing the electric displacement D, related to the electric field through D = E
– 65 – where the permittivity = 0(1 + χe) with χe the electrical susceptibility. For all materials, χe > 0. This ensures that the effect of the polarisation is always to reduce the electric field, never to enhance it. You can read more about this in Section 7 of the lecture notes on Electromagnetism.
(As an aside: In a metal, with mobile electrons, there is a much stronger screening effect which turns the Coulomb force into an exponentially suppressed Debye-H¨uckel, or Yukawa, force. This was described in the final section of the notes on Electromagnetism, but is not the relevant effect here.)
What does this have to do with quantum field theory? In quantum field theory, the vacuum is not a passive boring object. It contains quantum fields which can respond to any external perturbation. In this way, quantum field theories are very much like condensed matter systems. A good example comes from QED. There the one-loop beta function is positive and, at distances smaller than the Compton wavelength of the
electron, the gauge coupling runs as
+
+
2 + 1 1 1 ΛUV + + 2 = 2 + 2 log 2
e (µ) e0 12π µ +
+
+ This tells us that the charge of the electron gets effectively + smaller as we look at larger distance scales. This can be + + + understood in very much the same spirit as condensed matter systems. In the presence of an external charge, electron-positron pairs will polarize the vacuum, as shown Figure 11: in the figure, with the positive charges clustering closer to the external charge. This cloud of electron-positron pairs shields the original charge, so that it appears reduced to someone sitting far away.
The screening story above makes sense for QED. But what about QCD? The negative beta function tells us that the effective charge is now getting larger at long distances, rather than smaller. In other words, the Yang-Mills vacuum does not screen charge: it anti-screens. From a condensed matter perspective, this is unusual. As we mentioned above, materials always have χe > 0 ensuring that the electric field is screened, rather than anti-screened.
However, there’s another way to view the underlying physics. We can instead think about magnetic screening. Recall that in a material, an applied magnetic field in- duces dipole moments and these, in turn, give rise to a magnetisation. The resulting
– 66 – magnetising field H is defined in terms of the applied magnetic field as
B = µH with the permeability µ = µ0(1 + χm). Here χm is the magnetic susceptibility and, in contrast to the electric susceptibility, can take either sign. The sign of χm determines the magnetisation of the material, which is given by M = χmH. For −1 < χm < 0, the magnetisation points in the opposite direction to the applied magnetic field. Such materials are called diamagnets. (A perfect diamagnet has χm = −1. This is what happens in a superconductor.) In contrast, when χm > 1, the magnetisation points in the same direction as the applied magnetic field. Such materials are called paramagnets.
In quantum field theory, polarisation effects can also make the vacuum either dia- magnetic or paramagnetic. Except now there is a new ingredient which does not show up in real world materials discussed above: relativity! This means that the product must be
µ = 1 because “1” is the speed of light. In other words, a relativistic diamagnetic material will have µ < 1 and > 1 and so exhibit screening. But a relativistic paramagnetic material will have µ > 1 and < 1 and so exhibit anti-screening. Phrased in this way, the existence of an anti-screening vacuum is much less surprising: it follows simply from paramagnetism combined with relativity.
For free, non-relativistic fermions, we calculated the magnetic susceptibility in the lectures on Statistical Physics when we discussed Fermi surfaces. In that context, we found two distinct contributions to the magnetisation. Landau diamagnetism arose because electrons form Landau levels. Meanwhile, Pauli paramagnetism is due to the spin of the electron. These two effects have the same scaling but different numerical coefficients and one finds that the paramagnetism wins.
In the next section we will compute the usual one-loop beta-function. We present the computation in such a way that it makes clear the distinction between the diamagnetic and paramagnetic contributions. Viewed in this light, asymptotic freedom can be traced to the paramagnetic contribution from the gluon spins.
2.4.2 Computing the Beta Function In this section, we will sketch the derivation of the beta function (2.58). We’re going to use an approach known as the background field method. We work in Euclidean space
– 67 – and decompose the gauge field as
¯ Aµ = Aµ + δAµ
¯ We will think of Aµ as the low-energy, slowly moving part of the field. It is known as the background field. Meanwhile, δAµ describes the high-energy, short-wavelength modes whose effect we would like to understand. The field strength becomes
¯ ¯ ¯ Fµν = Fµν + DµδAν − DνδAµ − i[δAµ, δAν]
¯ ¯ where Dµ = ∂µ −i[Aµ, ·] is the covariant derivative with respect to the background field ¯ Aµ. From this, we can write the action (2.55) as
1 Z 1 S = d4x tr F¯ F¯µν + 2F¯µνD¯ δA YM g2 2 µν µ ν ¯µ ν ¯ ¯µ ν ¯ ¯µν + D δA DµδAν − D δA DνδAµ − iF [δAµ, δAν] 1 − 2iD¯µδAν[δA , δA ] − [δAµ, δAν][δA , δA ] (2.60) µ ν 2 µ ν where we’ve ordered the terms in the action depending on the number of δA’s. Note that the middle line is quadratic in δA.
Gauge Fixing and Ghosts
Our plan is to integrate over the fluctuations δAµ in the path integral, leaving ourselves ¯ with an effective action for the background field Aµ. To do this, we must first deal with the gauge symmetry. While the action of the gauge symmetry on Aµ is clear, there is ¯ no unique decomposition into the action on Aµ and δAµ. However, the calculation is simplest if we load the full gauge transformation into δAµ, so
¯ ¯ δgaugeAµ = 0 and δgauge(δAµ) = Dµω − i[δAµ, ω] where, for this section alone, we’ve changed our notation for infinitesimal gauge trans- formations so as not to confuse them with the fluctuating field δAµ. With this choice, δAµ transforms as any other adjoint field.
As usual, field configurations related by a gauge symmetry should be viewed as physically equivalent. This is necessary in the present context because the kinetic terms for δAµ are not invertible. For this reason, we first need a way to fix the gauge.
– 68 – We do this using the Faddeev-Popov procedure that we saw in the lectures on Advanced Quantum Field Theory. We choose to work in the gauge
¯ ¯µ G(A; δA) = D δAµ = 0 (2.61)
Note that this gauge fixing condition depends on our choice of background field. This ¯ is the advantage of this method; we will find that the gauge invariance of Aµ is retained throughout the calculation.
We add to our action the gauge-fixing term 1 Z S = d4x tr (D¯µδA )2 (2.62) gf g2 µ The choice of overall coefficient of the gauge fixing term is arbitrary. But nice things ¯µ ν ¯ happen if we make the choice above. To see why, let’s focus on the D δA DνδAµ term in (2.60) . Integrating by parts, we have Z Z 4 ¯µ ν ¯ 4 ¯µ ¯ν d x tr D δA DνδAµ = − d x tr δAνD D δAµ Z 4 ¯µ ¯ν ¯ν ¯µ = − d x tr δAν [D , D ] + D D δAµ Z h i 4 ¯µ 2 ¯µν = d x tr (D δAµ) + iδAν[F , δAµ]
The first of these terms is then cancelled by the gauge fixing term (2.62), leaving us with 1 Z 1 S + S = d4x tr F¯ F¯µν + 2F¯µνD¯ δA YM gf g2 2 µν µ ν ¯µ ν ¯ ¯µν +D δA DµδAν − 2iF [δAµ, δAν] 1 −2iD¯µδAν[δA , δA ] − [δAµ, δAν][δA , δA ] µ ν 2 µ ν and we’re left with just two terms that are quadratic in δA. We’ll return to these shortly.
The next step of the Faddeev-Popov procedure is to implement the gauge fixing condition (2.61) as a delta-function constraint in the path integral. We denote the ¯ω ¯ ω ¯ gauge transformed fields as Aµ = Aµ and δAµ = δAµ + Dω − i[δAµ, ω]. We then use the identity Z ∂G(A¯ω, δAω) Dω δ(G(A¯ω, δAω)) det = 1 ∂ω
– 69 – The determinant can be rewritten through the introduction of adjoint-valued ghost fields c. For the gauge fixing condition (2.61), we have ∂G(A,¯ δAω) Z 1 Z h i det = Dc Dc† exp − d4x tr − c†D¯2c + ic†[D¯µδA , c] ∂ω g2 µ where we’ve chosen to include an overall factor of 1/g2 in the ghost action purely as a convenience; it doesn’t effect subsequent calculations. The usual Faddeev-Popov story tells us that the integration R Dω now decouples, resulting in a unimportant overall constant. We’re left with an action that includes both the fluctuating gauge field δAµ and the ghost field c, S = SYM + Sgf + Sghost, 1 Z 1 S = d4x tr F¯ F¯µν + 2F¯µνD¯ δA g2 2 µν µ ν ¯µ ν ¯ ¯µν ¯ † ¯µ +D δA DµδAν − 2iF [δAµ, δAν] + Dµc D c 1 −2iD¯µδAν[δA , δA ] − [δAµ, δAν][δA , δA ] + ic†[D¯µδA , c] µ ν 2 µ ν µ As previously, we have arranged the terms so that the middle line is quartic in fluctu- ating fields, while the final line is cubic and higher.
One-Loop Determinants
Our strategy now is to integrate out the fluctuating fields, δAµ and c, to determine ¯ their effect on the dynamics of the background field Aµ.
¯ Z ¯ e−Seff [A] = DδA Dc Dc† e−S[A,δA,c]
Things are simplest if we take our background field to obey the classical equations of ¯ ¯µν motion, DµF , which ensures that the term linear in δAµ in the action disappears. Furthermore, at one loop it will suffice to ignore the terms cubic and quadratic in fluctuating fields that sit on the final line of the action above. We’re then left just with Gaussian integrations, and these are easy to do,
¯ 1 R 4 ¯ ¯µν −Seff [A] −1/2 +1 − 2 d x tr Fµν F e = det ∆gauge det ∆ghost e 2g where the quadratic fluctuation operators can be read off from the action and are given by
µν ¯2 µν ¯µν ¯2 ∆gauge = −D δ + 2i[F ·] and ∆ghost = −D where the F¯µν should be thought of as an operator acting on objects in the adjoint ¯ representation. This extra term, Fµν, arising from the gauge fields can be traced to
– 70 – the fact that they are spin 1 excitations. As we will see below, this contributes the paramagnetic part to the beta function and, ultimately, is responsible for the famous minus sign that leads to anti-screening.
Taking logs of both sides, the effective action is given by 1 Z 1 S [A¯] = d4x tr F¯ F¯µν + Tr log ∆ − Tr log ∆ (2.63) eff 2g2 µν 2 gauge ghost where the Tr means the trace over group, Lorentz and momentum indices (as opposed to tr which is over only gauge group indices). We need to figure out how to compute the contributions from these quadratic fluctuation operators.
The Ghost Contribution
The contribution from the ghost fields are simplest because it has the least structure. We write
2 ∆ghost = −∂ + ∆1 + ∆2 ¯ where the subscripts keep track of how many Aµ terms each operator has, µ ¯ ¯ µ ¯µ ¯ ∆1 = i∂ Aµ + iAµ∂ and ∆2 = [A , [Aµ, ·]] where, again these operators act on objects in the adjoint representation. This will prove important to get the right normalisation factor. We then have