<<

An introduction to and quantum field theory

Jean-No¨elFuchs [email protected]

Updated September 10, 2017 Contents

1 Introduction 5 1.1 Requirements...... 5 1.2 Why quantum field theory?...... 5 1.2.1 Quantum + relativity ⇒ QFT...... 5 1.2.2 Many-body problem in condensed ...... 6 1.2.3 What is a field?...... 7 1.3 Symmetries as a leitmotiv and guiding principle...... 7 1.4 Natural units, dimensional analysis and orders of magnitude in high- physics...... 8 1.5 of the course: the menu...... 9

2 - symmetries and geometrical objects 11 2.1 ...... 11 2.1.1 O(3) and SO(3) groups...... 11 2.1.2 Representations of SO(3)...... 14 2.2 ...... 18 2.2.1 Metric of flat ...... 18 2.2.2 Interval between two events in spacetime...... 19 2.2.3 Isometries of spacetime: the Lorentz group...... 19 2.2.4 Lorentz algebra...... 21 2.2.5 Representations of the Lorentz group...... 24 2.3 Poincar´egroup and field representations...... 26 2.3.1 Spacetime translations and field representations...... 27 2.3.2 of a field...... 27 2.3.3 Poincar´ealgebra...... 28 2.3.4 Lorentz transformation of a 4-vector field...... 28 2.3.5 Lorentz transformation of a field...... 28 2.3.6 Summary...... 29 2.3.7 Representations of the Poincar´egroup on single states...... 29 2.4 Conclusion...... 29

3 Classical fields, symmetries and conservation laws 30 3.1 Lagrangian formalism...... 30 3.1.1 and Lagrangian...... 30 3.1.2 Equations of ...... 31 3.1.3 Conjugate field and Hamiltonian...... 32 3.2 Symmetries and conservation laws...... 32 3.2.1 Symmetries of the action...... 32 3.2.2 Proof of Noether’s theorem...... 33 3.2.3 Explicit form of the Noether current...... 34 3.2.4 Examples...... 34

1 3.3 Scalar fields and the Klein-Gordon equation...... 37 3.3.1 Real scalar field...... 37 3.3.2 Complex scalar field...... 39 3.4 Electromagnetic field and the Maxwell equation...... 40 3.4.1 Covariant form of the Maxwell equations...... 40 3.4.2 Free electromagnetic field and gauge invariance...... 41 3.4.3 Gauge choices...... 43 3.4.4 Energy- ...... 44 3.4.5 Coupling to matter and electric conservation as a consequence of gauge invariance 44 3.4.6 Gauging a ...... 45 3.4.7 Advanced geometrical description of the electromagnetic field...... 47 3.5 Massless spinor fields and the Weyl equations...... 48 3.5.1 Left Weyl field, helicity and the ...... 48 3.5.2 Right Weyl field...... 49 3.6 Spinor field and the ...... 50 3.6.1 and the Dirac Lagrangian...... 50 3.6.2 The Dirac equation and γ matrices...... 50 3.6.3 ...... 51 3.6.4 The Clifford algebra and different representations of the γ matrices...... 52 3.6.5 General solution of the Dirac equation: mode expansion...... 54 3.6.6 Conserved currents and charges...... 56 3.6.7 Coupling to the electromagnetic field...... 57 3.6.8 The αj and β matrices and the Dirac equation as the square root of the KG equation 58 3.6.9 Conclusion on the Dirac equation...... 59 3.6.10 Bonuses: extra material...... 61 3.7 Conclusion on classical fields...... 62

4 Canonical of fields 64 4.1 Real scalar fields...... 64 4.1.1 Quantum field...... 64 4.1.2 Particle interpretation and Fock space...... 65 4.1.3 and ordering...... 67 4.1.4 Fock space and Bose-Einstein statistics...... 68 4.1.5 Summary on ...... 68 4.2 Complex scalar field and anti-...... 69 4.3 Dirac field...... 70 4.3.1 Anticommutators: spinor fields are weird...... 70 4.3.2 Fock space, Fermi-Dirac statistics and the -statistics relation...... 73 4.3.3 U(1) charge and anti-particles...... 73 4.3.4 Charge conjugation...... 73 4.3.5 Motivation for anticommutation...... 74 4.3.6 Spin...... 75 4.3.7 Causality and locality...... 75 4.4 and as exchange of particles...... 75 4.4.1 What is a ?...... 75 4.4.2 Force as an exchange of virtual particles...... 76 4.4.3 A first glance at Feynman diagrams...... 77

2 5 Spontaneous symmetry breaking 78 5.1 Introduction: SSB and phase transitions...... 78 5.2 Goldstone mechanism: SSB of a global symmetry...... 79 5.2.1 Classical groundstate...... 79 5.2.2 Excitations above the groundstate...... 80 5.2.3 What is the number of Goldstone modes?...... 81 5.2.4 SSB of a discrete symmetry and quantum tunneling...... 82 5.3 : SSB of a gauge symmetry...... 83 5.3.1 Superconductivity and the Meissner effect...... 85

3 Foreword

These are lecture notes for the course “Symmetries and quantum field theory” given at the master 2 “concepts fondamentaux de la physique, parcours de physique quantique” in Paris. It is work in progress and comments are most welcome. There are 12 or 13 sessions consisting of a 1.5h lecture and a 1.5h exercise session (under the supervision of Giuliano Orso, [email protected]) from September 7 to December 21, 2017. Location: between towers 55 and 56, 1st floor, room 1-02, in Universit´ePierre et Marie Curie (Jussieu, Paris) Thursday afternoon from 14:00 to 15:30 (lecture), then a 30’ break and from 16:00 to 17:30 (exercise session) Recommended books: • Textbooks: The three documents that I most used to prepare these lecture notes are the following. B. Delamotte, Un soup¸conde th´eoriedes groupes: groupe des et groupe de Poincar´e[1] M. Maggiore, A modern introduction to quantum field theory [2] L.H. Ryder, Quantum field theory [3] • More advanced: A. Zee, Quantum field theory in a nutshell [4] • Less advanced (popular science reading): A. Zee, Fearful symmetry: the search for beauty in [5]

4 Chapter 1

Introduction

V. Weisskopf: “There are no real one-particle systems in nature, not even few-particle systems. The existence of virtual pairs and of pair fluctuations show that the days of fixed particle number are over.”

These notes serve as an introduction to relativistic quantum fiel theory (but they often discuss the connection to as well). They start by discussing the symmetries of flat spacetime (Lorentz and Poincar´egroups) and constructing the natural geometrical objects of spacetime, such as scalars, vectors and . Then classical field theory is reviewed before quantizing the non-interacting fields using canonical quantization. The notes end with a short presentation of spontaneous symmetry breaking.

1.1 Requirements

Quantum including representations of the rotation group and the notion of spin in covariant notation (Lagrangian, action, Euler-Lagrange equations, etc.) Notions on and representations (wave propagation, gauge invariance)

1.2 Why quantum field theory? 1.2.1 Quantum + relativity ⇒ QFT Wedding of quantum physics and relativity imposes the language of quantum field theory (QFT). Indeed, is formulated to describe a fixed number of particles. But this is only possible non- relativistically. The uncertainty relation ∆E∆t & ~ permits the violation of the energy (by an amount ∆E) for a short time (∆t) and relativity theory permits the conversion of energy into matter (E = mc2 states that the rest energy is given by the ). Therefore the number of particles (even massive) can not be fixed: (virtual) particles may appear or disappear out of nothing (vacuum). We therefore need a theory that permits to create or destroy particles. This is QFT. The change of perspective is that we now describe a field (such as the electromagnetic field or the field) rather than particles and that particles emerge as excitations of this field upon quantization. Let us try to describe quantum mechanically a single relativistic massive particle (i.e. an electron, for example) and show that we run into a difficulty. Because ∆∆p & ~ and ∆p = ∆E/v with v = ∂E/∂p and E = pp2c2 + m2c4, then ∆E & ~v/∆x. In addition, remaining a single particle on the positive energy branch of the relativistic dispersion relation E = pp2c2 + m2c4 means that the energy uncertainty ∆E has to be smaller than the 2mc2. Therefore ∆x & ~v/(2mc2) ∼ ~/(mc), which means that there is a

5 minimum localization length for a single particle, which is of the order of the Compton wavelength ~/(mc) 1. If we try to localize a particle better than its Compton wavelength, then its energy becomes uncertain by an amount larger than the mass gap so that other particles start being involved (those occupying the negative energy branch in the picture of the vacuum having a negative energy branch E = −pp2c2 + m2c4 filled with ). And hence, it is impossible to describe a single relativistic particle. Note that quantum + relativity also imposes the notion of (pictured as a missing electron in the Dirac sea). The most well-known example of a QFT is that of the electromagnetic field, the quanta of which are the . Photons can be created and destroyed. There is no consistent description of a single . The blackbody radiation is described as an ideal Bose-Einstein of massless particles at thermal equilibrium with a zero chemical potential µ = 0: when T → 0, there are less and less photons on average. Photons are not conserved, their number is not fixed. In the present context, QFT is known as relativistic quantum field theory.

1.2.2 Many-body problem in condensed matter physics Another familiar example of a QFT can be found in non-relativistic condensed matter physics. Phonons appear as quanta of the displacement field of a crystal. They can also be destroyed and created. In the context of condensed matter, QFT is not required but it is frequently used as it is very convenient when dealing with quantum mechanics of a very large number of degrees of freedom. This is known as the many-body problem. QFT in the many-body problem is also known as “” description (meaning occupation number representation, Fock space, annihilation and creation operators, etc.) in contrast to using standard quantum mechanics known as “first quantization” description (meaning numbered particles, many-body wavefunction Ψ(1, ...., N; t) in configuration space, symmetrization or anti-symmetrization, etc). It means describing the population of modes instead of describing the behavior of numbered particles. Actually, it is not restricted to describing non-conserved particles (such as phonons, magnons, etc.) but can be adapted to describe conserved particles as well (such as electrons in a metal or atoms in a trapped ultracold gas, which are conserved in a non-relativistic theory). “Second quantization” is a historical name and a quite confusing one (a better name would be non- relativistic QFT or condensed matter field theory). It comes from a misinterpretation: of a single particle (Newton’s equation or Lagrangian dynamics or Hamiltonian dynamics) would first be quan- tized into the quantum mechanical description of a single particle (Schr¨odinger’s equation for a wavefunction ϕ(~x,t)), and then the wavefunction would itself be quantized again (i.e. second quantized) ϕ(~x,t) → ϕˆ(~x,t) to become a quantum field. Do you see what is wrong in the preceding argument? Think about this , it is important to grasp. Actually, there is only a single quantization procedure but accompanied by a change of perspective from a single particle to a field describing all identical particles. What is wrong is that the field that becomes quantized in a QFT is not a wavefunction (it is not a describing a single particle) but rather a field whose excitations are the particles. Once this field is quantized, it becomes an operator creating or destroying particles at position ~x and time t. A better notation for the quantized field ϕˆ(~x,t) would bea ˆ(~x,t) to clearly identify it as a ladder operator (or annihilation operator as for the harmonic oscillator). The conjugate fielda ˆ†(~x,t) is also a ladder operator (known as a creation operator). In QFT, as we will see, the Schr¨odingerequation for a single electron is reinterpreted as being a classical field equation describing the field of all electrons, i.e. the electronic field2. Just as Maxwell’s equation is a classical wave equation describing the electromagnetic field, i.e. the photonic field. In these lectures, we will mostly focus on relativistic QFT. For more on condensed matter field theory (i.e. the use of QFT techniques in the context of non-relativistic condensed matter physics), see other M2 courses such as “Quantum mechanics: second quantization and ” (first semester) and “Condensed matter theory” (first semester). We also strongly recommend the book by A. Altland and B. Simons, Condensed matter field theory [6].

1For an electron, the Compton wavelength is of the order of 500 fm, i.e. 100 smaller than the Bohr radius, the typical size of atoms and 500 times larger than the size of a nucleus. 2A tricky question at the heart of the confusion about “second quantization”: how come ~ already appears in Schr¨odinger’s equation if it is a classical wave equation?

6 number of degrees of freedom classical quantum finite and discrete quantum mechanics infinite and continuous classical field theory quantum field theory

Table 1.1: theory versus mechanics.

In short, QFT is a convenient/efficient framework to describe a system with many particles that are not necessarily conserved (they can be created or annihilated).

1.2.3 What is a field? A field is an object φ defined at each point of space-time

φ(~x,t) (1.1) where ~x denotes space, t time and φ can be a scalar, a vector, a tensor, a spinor, a or even something else. More generally, a field is a map from a base manifold3 M (usually space-time) to a target T (which depends on the nature of the field: scalar, vector, etc.):

φ : M → T x → φ(x) (1.2)

A field describes an infinite number of degrees of freedom (DoF). For example, if the field is a real scalar (φ ∈ R) and if space-time is continuous 1 + 1 dimensional, at each point of this 2d manifold, there is a single degree of freedom. QFT should be clearly contrasted with quantum mechanics, which is restricted to a finite countable number of degrees of freedom, see Table 1.1. Fields are also needed to avoid action at a , i.e. to mediate at finite and respect locality. For example, the electric interaction between charges is not instantaneous but propagates at the of . Although, it is often approximated as an instantaneous (non-retarded) Coulomb potential.

1.3 Symmetries as a leitmotiv and guiding principle

Symmetries are all important in physics (P.W. Anderson even wrote that “it is only slightly overstating the case to say that physics is the study of symmetry” in “More is different”, Science 1972). They constrain the form of theories (symmetry dictates design). We will use space-time symmetries to construct relativistic field theories. Symmetries also imply conservation laws (for example, the is a consequence of invariance under time ). Even more so in quantum then in classical physics. In addition to space-time symmetries, there are also less obvious internal symmetries. For example: gauge symmetry of electromagnetism. Symmetries can be continuous (as rotations) or discrete (as space inversion or time reversal). The mathematical tools needed to describe symmetries are groups and their representations. But what is a symmetry? To answer that question precisely, we first need to define a few important notions. First, there are transformations, which should be carefully distinguished from symmetries. This is a change in our description of a system. We adopt the passive viewpoint: the system is left unchanged, only the description (a frame, for example) is transformed (the active viewpoint would consist in having a single frame and in applying a transformation to the system). An example is the rotation of a reference frame used to describe a bicycle. A transformation need not be a symmetry. The bicycle is not under arbitrary rotation. Still, it is not forbidden to apply a rotation (transformation) to a bicycle.

3A manifold is a topological space that resembles near each point.

7 Second, there is the notion of invariance under a transformation. Something (an object, the state of a system, etc.) is said to be symmetrical under a transformation (or to admit a transformation as a symmetry) if it is left unchanged by the transformation. For example, a cube is invariant under certain rotations (but not all). Third, one needs to distinguish the symmetry of an object (or of the state of a system) from the symmetry of a law of physics. In the following, we will be more interested in the symmetry of physical laws than in the symmetry of an object (except in the last chapter on spontaneous symmetry breaking). If a law of physics (or a description of a system) is left unchanged by a transformation, then we say that the law exhibits a symmetry or that the system possesses a symmetry. We will often describe a system by an action S: then, the system exhibits a symmetry if the action is left invariant by a transformation. For example, 3d space is thought to be rotationally invariant. In order to respect that invariance, the fundamental laws of physics (e.g. Newton’s second law) have to be written in a covariant manner (i.e. as an equality between objects that transform the same under space symmetries, e.g. vectors F~ = m~a). But this does not preclude the existence of objects (e.g. a cube, a bicycle, etc.) that do not have the full rotational invariance of space. Not all objects are spheres. Fourth, when a symmetry is present for a physical law, it does not mean that every state of the system will feature that symmetry4. Indeed a symmetry can be spontaneously broken: the system may possess the symmetry but this is not necessarily reflected in its state. For example, of space may be apparent in the state of an atomic ensemble (when it is in its gaseous or its liquid phase) or may be spontaneously broken (when it is in its cristalline solid phase). The state of the system may be less symmetric than the laws of physics. Eventually, a symmetry can also be explicitly broken. For example, full of space is explicitly broken at the surface of earth by the presence of a gravitation field indicating a preferred direction (vertical direction).

1.4 Natural units, dimensional analysis and orders of magnitude in high-energy physics

We will use natural units such that ~ = 1 and c = 1. It is a good exercise to put units back in final expressions. With these units, energy = mass = 1/length = 1/time, which is usually expressed in GeV (1 GeV = 109 eV = 1.6 × 10−10 J = 1.8 × 10−27 kg, which is the order of magnitude of the or mass). The corresponding length scale is ∼ 0.1 fm = 10−16 m (femtometer or fermi). The typical size of a nucleus being ∼ 1 fm. When performing dimensional analysis of physical quantities, we will always be interested in knowing their “mass ”. For example, energy has mass dimension 1, time has mass dimension −1, power has mass dimension 2, action has mass dimension 0, the Lagrangian density has mass dimension D + 1 (in D−1 D + 1 spacetime), a scalar field has mass dimension 2 , etc. At this point it would be good to make connection to the natural playground for relativistic QFT, which is nuclear and subnuclear physics (also called high-energy physics). Unfortunately, we won’t have time to do that. We therefore only provide what we think is the minimum piece of information. For more details see the introduction chapter in Maggiore [2] or Ryder [3]. For a pleasant evening reading see the popular science account by A. Zee [5]. The elementary particles consist mainly of two categories: matter particles or constituents of matter () and interaction mediating particles or messengers of interaction (mediating bosons). The first category splits in (electron, , etc.: spin 1/2) that do not feel the strong in- teraction5 and hadrons (proton, neutron, , etc.) that do feel the . Hadrons further

4Again, it is important to distinguish the symmetry of a law and that of an object (or of the state of a system). 5In addition, the neutrino does not carry an and is almost massless. Therefore, it essentially only couples to the and is very hard to detect.

8 separate into (proton, neutron, etc.) that are fermions and mesons (pion, etc.) that are bosons6. The neutrino is almost massless (less than ∼ 1 eV). The electron mass is roughly 0.5 MeV and that of nucleons (proton or neutron) 1 GeV. Entering the world of happens at the mass scale of the pion 0.1 GeV which corresponds to a distance of 1 fm (the typical size of a nucleus). The four fundamental interactions (electromagnetism, , strong and weak) and their main properties are summarized in Table 1.2. The particles carrying interactions are all bosons. The photon (massless, spin 1, gauge boson) carries the electromagnetic interaction, the intermediate vector bosons W and Z (80 − 90 GeV, spin 1, gauge bosons) carry the weak interaction and the mesons (e.g. the , 0.1 GeV) carry the strong interaction (see also a preceding footnote and the caption of Table 1.2). Gravity is supposed to be mediated by a massless spin 2 boson called the graviton, although it has never been observed. Classical gravitational waves have recently been detected by LIGO-Virgo (september 2015). The huge energy scale at which gravity becomes quantum is expected to be the so called Planck mass p~c/G ∼ 1019 GeV (where G is Newton’s constant of gravity) corresponding to a distance of 10−20 fm. 7 Eventually, there is a third type of particles – on top of fermions and mediating bosons – with a single known member: the Higgs (scalar, spin 0) boson. Its peculiarity comes from its being a scalar (rather than a vector) boson, not mediating an interaction and therefore being closer to being a matter particle. Its mass (125 GeV) is of the same order as that of the intermediate vector bosons (W and Z). It was discovered experimentally in 2012. We will encounter this particle at the end of the course, when discussing spontaneous symmetry breaking. We should also mention that the highest energy currently achieved in particle accelerators (e.g. the large hadron collider LHC at CERN) is of the order of 1 TeV = 1000 GeV (per nucleon) corresponding to a distance of ∼ 10−4 fm, i.e. 0.01% of the nucleus size. The current status (2016) is the following: the seems to explain almost every observed phenomena in high-energy physics (with an extension to include finite ) and, at the moment, there are no signs of things such as super-symmetry or extra particles.

1.5 Logic of the course: the menu

Chapter 1: quantum + relativity ⇒ QFT as a necessity (also quantum + many bodies ⇒ QFT as a convenience) Chapters 2 and 3: space-time symmetries constrain physical theories ⇒ relativistic (classical) field theory Chapter 4: canonical quantization of relativistic field theory ⇒ QFT Chapter 5: spontaneous symmetry breaking (Goldstone and Higgs mechanisms)

[end of lecture #1]

6A potentially confusing fact in the above presentation is that mesons (which are composite bosons) appear both as matter particles (and as such should be fermions) but also as the carriers of the strong interaction at low energy. At a more fundamental level, the correct theory of strong interaction is . The elementary fermions in this theory are called (rather than baryons) and interact via the exchange of massless spin 1 gauge bosons called gluons (rather than mesons). It is only at a phenomenological level that hadrons are formed due to and gluon confinement – the strong interaction being so strong at low energy that free quarks or free gluons have never been seen – leaving baryons (bound states of three quarks) as “matter fermions” and mesons (quark anti-quark bound states) as ”mediating bosons”. At low energy, the strong interaction proceeds via the exchange of mesons (e.g. pions) between baryons (e.g. nucleons) as proposed by Yukawa. Baryons are composite fermions made of three fermions, whereas mesons are composite bosons made of two fermions. This resolve the apparent contradiction of mesons being constituents of matter and bosons at the same time. 7Later, we will see that fundamental interactions are described by gauge theories and that carriers of interactions are gauge bosons. Electromagnetism or (QED) will be seen as a U(1)Q (Q is the electric charge). It is an abelian gauge theory. or quantum flavor dynamics (QFD) will be seen as an SU(2)L × U(1)Y gauge theory ( and Y is weak ). It is a non-abelian or Yang-Mills gauge theory. Upon spontaneous symmetry breaking (Higgs mechanism) SU(2)L ×U(1)Y → U(1)Q. Strong interaction or quantum chromodynamics (QCD) will be seen as a SU(3) gauge theory (the corresponding charge is called ). The standard model is built from the gauge group SU(3) × SU(2)L × U(1)Y . And gravitation can also be seen as some kind of gauge theory where spacetime translations are gauged into diffeomorphisms? There is also a , and a Levi-Civita connection, etc.

9 interaction realm strength (at low energy) range mass of carriers e2 1 −2 em atoms, molecules, chemistry weak ( ~c = 137 ∼ 10 ) long (∞) 0 (photon) gravity planets, galaxies, cosmos weakest (10−40 @ 0.1 GeV) long (∞) 0 (graviton?) strong nuclei, nuclear binding strong (∼ 1 @ 0.1 GeV) short (1 fm) 0.1 GeV (pion) weak radioactive β decay weaker (10−7 @ 0.1 GeV) shortest (10−3 fm) 100 GeV (W , Z)

Table 1.2: The four fundamental interactions. Interaction range r and mass of the carrier boson m are related by r ∼ ~/mc ∼ 1/m in natural units (1 fm ↔ 0.2 GeV). At a more fundamental level, strong interactions are carried by spin 1 massless gauge bosons called gluons. The latter interact very strongly and are confined such that the resulting effective interaction is carried by massive composite particles (called mesons: pions being one example, e.g.) and the effective range of nuclear forces is finite despite the mass of gluons being zero.

10 Chapter 2

Space-time symmetries and geometrical objects

A. Zee: “The issue of symmetry is whether different observers perceive the same structure of physical reality.”

The aim of the present chapter is to study symmetries of 3+1 dimensional space-time using group theory and representations. This will allow us to identify natural objects, that have well defined transformation properties in space-time (just as scalars, vectors, etc. are natural objects of 3d space). There is little physics in this chapter (except for the structure of spacetime) and the focus is mainly on group theory as a way to recognize well behaved geometrical objects. We start with a familiar example – that of the rotations of 3d space – in order to recall important concepts of group theory as applied to the study of symmetries (such as , , linear representations, etc).

2.1 Rotation group 2.1.1 O(3) and SO(3) groups This is intended to be a warm up section with the aim of reviewing basic notions of groups and representations. We consider the 3d Euclidian space. The group of isometries (i.e. transformations preserving dl2 = dx2 + dy2 + dz2) of this space is the O(3) group (here we do not consider translations), which is also that of orthogonal real 3 × 3 matrices. Let us see that. Let r~0 = R~r such that r~0 · r~0 = ~r · ~r defines a rotation R for a position vector ~r. The matrix R is 3 × 3 and encodes a linear transformation. For simplicity, we will also write it as r0 = Rr. Then r0T r0 = (Rr)T Rr = rT r so that rT RT Rr = rT r for all vector r. Therefore RT R = 1, which shows that R is an orthogonal matrix. In addition by its definition as a r~0 = R~r it is obvious that it is a real matrix. Therefore R ∈ O(3). Let us see that O(3) has a group structure1. (i) If R ∈ O(3) and R0 ∈ O(3), (RR0)T (RR0) = R0T RT RR0 = R0T R0 = 1 so that RR0 ∈ O(3): closure. (ii) If R ∈ O(3) then RT ∈ O(3) and RT R = RRT = 1 so that R−1 = RT is the inverse of R. (iii) The unit 3 × 3 matrix I is a neutral element as IR = RI = R (for simplicity we will often write 1 instead of I). (iv) R(R0R”) = (RR0)R” (associativity). Therefore O(3) is a group. It is non-abelian (or non-commutative) as

1In a pragmatic view, a group is a G of elements g equiped with a composition law, the properties of which are specified by a multiplication table. For example the group Z2 = {1, −1; ×} contains two elements (1, −1) and has a composition law × 1 −1 denoted by × that obeys the following multiplication table 1 1 −1 . Alternatively, the same group can also be written −1 −1 1 + 0 1 {0, 1; +} with a multiplication table that reads 0 0 1 . Its elements are the integers modulo 2 justifying its name Z2. 1 1 0

11 in general RR0 6= R0R, which is a well known of matrices. As det(RT R) = 1 = (det R)2 and det R is real so that det R = 1. SO(3) (special ) is a subgroup made of det R = 1 matrices (show it). The subset of O(3) made of det R = −1 matrices is not a subgroup (why?). SO(3) is the group of proper  1 0 0   −1 0 0  rotations. It contains the identity I =  0 1 0  but not the space inversion2 P =  0 −1 0 . 0 0 1 0 0 −1 2 T R ∈ SO(3) a priori depends on 3 = 9 real elements but R R = 1 means that RkiRkj = δij (summation over repeated indices implied), which gives 6 independent conditions (because the two equations RkiRkj = 0 when i 6= j and RkjRki = 0 are the same) so that in the end there are only 3 independent (real) parameters 3. The group is therefore continuous and the three parameters can be taken as 3 angles: 2 angles to specify a direction in 3D space (an axis) and a last angle to specify the amount of rotation around that axis. When the elements of the group depend in a continuous and differentiable way on a set of real parameters, the group is called a Lie group. The parameter space of the SO(3) Lie group is 3-dimensional and made of 3 angles. Any element of O(3) can be obtained as the product of an element of SO(3) with either the identity I or the space inversion P 4. SO(3) is the part of O(3) that is continuously connected to the identity. In the following we concentrate on SO(3).

Generators of rotations and Lie algebra

To be concrete, we now build the rotation matrix Rz(ψ) for a rotation of angle ψ around the z axis in the passive viewpoint (there is a single vector and two different frames). We consider a fixed vector V~ and 0 0 0 describe it in two orthonormal frames {~ex,~ey,~ez} and {e~ x, e~ y, e~ z}. In the first frame V~ = Vi~ei and in the ~ 0 ~0 second V = Vi e i (summation over repeated indices). The second frame is rotated with respect to the first 0 0 0 one around the z axis. As e~ x = cos ψ~ex + sin ψ~ey, e~ y = cos ψ~ey − sin ψ~ex and e~ z = ~ez, we obtain that the coordinates of the vector V~ transform as  0        Vx cos ψ sin ψ 0 Vx Vx 0  Vy  =  − sin ψ cos ψ 0   Vy  = Rz(ψ)  Vy  (2.1) 0 Vz 0 0 1 Vz Vz It is important to realize that vectors do not transform as vectors. In the present passive viewpoint, vectors do not transform at all V~0 = V~ , unlike their coordinates and basis vectors, which both transform in the same way. Indeed  ~0      e x cos ψ sin ψ 0 ~ex ~0  e y  =  − sin ψ cos ψ 0   ~ey  (2.2) 0 e~ z 0 0 1 ~ez  1 ψ 0   0 −i 0  In an infinitesimal rotation ψ → 0, Rz(ψ) ≈  −ψ 1 0  = 1 + iψJz with Jz =  i 0 0  = 0 0 1 0 0 0 d −i dψ Rz(ψ)|ψ→0, which is called the generator of rotations around the z axis. For a finite rotation Rz(ψ) = 2Space inversion and parity are not exactly the same thing. Space inversion or point reflection means that ~r = (x, y, z) → −~r = (−x, −y, −z). Parity means that the direction of space is inverted, which can be realized by reverting either one – as in (x, y, z) → (−x, y, z) – or three – as in (x, y, z) → (−x, −y, −z) – components of a vector. Therefore parity needs to be clearly defined. In 3 space , one usually takes parity to be the same as space inversion ~r = (x, y, z) → −~r = (−x, −y, −z). But the general definition of parity is that of the change of sign of a single component. Physically, it corresponds to a mirror reflection. For example, in 2 space dimensions, one should carefully distinguish space inversion – (x, y) → (−x, −y) – from parity – either (x, y) → (−x, y) or (x, y) → (x, −y). Indeed, in 2 space dimensions, space inversion is equivalent to a rotation by π around an axis perpendicular to the xy plane and does not revert the direction of space. Whereas parity does revert the direction of space. Parity should have det P = −1. 3The fact that the be +1 rather than −1 allows one to choose a sign but does not lead to an extra independent linear equation that would reduce the number of independent parameters. 4 The zeroth homotopy group Π0(V ) of a manyfold V is the set of connected components. We have Π0(O(3)) = Z2 and Π0(SO(3)) = 0.

12 N N iψJz limN→∞ [Rz(ψ/N)] = limN→∞(1 + iJzψ/N) = e .  1 0 0  As an exercise, show that for a rotation of angle ψ around x, one obtains Rx(ψ) =  0 cos ψ sin ψ  = 0 − sin ψ cos ψ  0 0 0   cos ψ 0 − sin ψ  iψJx e with the generator Jx =  0 0 −i . And for a rotation around y, Ry(ψ) =  0 1 0  = 0 i 0 sin ψ 0 cos ψ  0 0 i  iψJy e with the generator Jy =  0 0 0 . −i 0 0 In the end, there are as many generator (Jx,Jy,Jz) as independent continuous parameters of the Lie † group. The generators are hermitian matrices Ji = Ji with i = 1, 2, 3 for x, y, z. You can check that, (Ji)jk = −iijk where i indicates the generator (either J1 = Jx or J2 = Jy or J3 = Jz), j is the row index and k the column index of the matrix. And ijk is the fully such that 123 = +1 (as an exercise show that only 6 out of its 27 elements are non-zero and try to represent this tensor as a “3D matrix”). The generators do not commute but satisfy the following (Lie) algebra [Jx,Jy] = iJz or more generally [Ji,Jj] = iijkJk (2.3)

where ijk are called the structure factors of the Lie algebra. In order to distinguish the Lie group SO(3) from its algebra, the latter is usually written so(3). A general proper rotation5 can be written as

~ iψ~n·J~ R~n(ψ) = R(ψ) = e , (2.5)

where the unit vector ~n – with coordinates (sin θ cos φ, sin θ sin φ, cos θ) where 0 ≤ θ ≤ π and 0 ≤ φ ≤ 2π – defines the rotation axis and ψ the rotation angle (sometimes with the notation ψ~ = ψ~n). Note that this is not the same as eiψnxJx eiψny Jy eiψnz Jz (to convince yourself, think of the Baker-Campbell-Haussdorff formula for the product of exponentials of two operators that do not commute: X Y X+Y + 1 [X,Y ]+ 1 ([X,[X,Y ]]+[Y,[Y,X]])+... X+Y e e = e 2 12 6= e ).

Parameter space and the topology of SO(3)

Because R~n(ψ + π) = R−~n(π − ψ) (show it), the rotation angle is such that 0 ≤ ψ ≤ π and not 0 ≤ ψ ≤ 2π. The three real parameters needed to specify a proper rotation are therefore (ψ, θ, φ) to be taken in the parameter space [0, π] × [0, π] × [0, 2π]. A Lie group can also be seen as a differentiable manifold, that can be characterized topologically. The parameters of the Lie group live in a parameter space which is roughly a ball (i.e. a filled sphere S2 in R3) of radius π when ψ is interpreted as a radial coordinate and (θ, φ) as spherical coordinates. It can also be seen as a kind of S3 sphere in R4, but see below. The Lie group SO(3) is said to be compact as [0, π] × [0, π] × [0, 2π] is compact (as it is a closed and bounded interval of R3). It is connected as the parameter space is made of a single piece. Let’s take a closer look at what happens on the surface of this ball of radius π. As R~n(π) = R−~n(π) (corresponding to ψ = 0 in the above equation R~n(ψ + π) = R−~n(π − ψ)) for any unit vector ~n, antipodal points on the surface of the ball have to be identified. Then this parameter space is not simply connected. Actually, it is doubly connected: there are two homotopy classes (two classes of closed paths that can not be continuously deformed into one another). One class uses the identification of opposite points on the surface of the ball (paths in this class can not be

5A general formula due to B. Olinde Rodrigues is (here we are momentarily using the active viewpoint) 0 ~x = R~n(ψ)~x = cos ψ ~x + (1 − cos ψ)(~x · ~n)~n + sin ψ(~n × ~x) (2.4) 0 where ~x = ~xk + ~x⊥ = (~x · ~n)~n + [~x − (~x · ~n)~n] and ~x = ~xk + cos ψ ~x⊥ + sin ψ (~n × ~x⊥). Note that ~n = ~xk/xk, ~x⊥/x⊥ and ~n × ~x⊥/x⊥ form a direct orthonormal basis. The Rodrigues formula clearly shows that the rotation matrix only depends on ψ ψ cos ψ and sin ψ (and not on cos 2 and sin 2 for example: this will play an important role later). In other words, ψ only modulo 2π. As an exercise, write explicitly the rotation matrix for a general rotation parametrized by (ψ, θ, φ).

13 smoothly contracted to a null path), the other does not (paths in this second class can be smoothly deformed to the null path, i.e. the identity). The fundamental group of SO(3) is therefore Z2, which is usually written 3 as Π1(SO(3)) = Z2. SO(3) as a topological manifold is similar to the projective space RP , which is the 3 3 3 3-sphere S with antipodal points identified. This is usually written as SO(3) ≈ RP ≈ S /Z2.

Figure 2.1: This picture illustrates the topology of the SO(3) manifold. It is roughly a ball (a filled sphere S2) of radius π, but antipodal points on the surface of the sphere (such as P and P’) have to be identified. There are therefore two homotopy classes (i.e. the fundamental homotopy group Π1(SO(3)) = Z2). Image taken from http://physics.stackexchange.com/questions/76096/lie-groups-and-group-extensions.

For more information on homotopy groups and basic notions of topology, you can consult Ryder [3] (section on “Topology of the vacuum: the Bohm-Aharonov effect”) or Altland and Simons [6] (section 9.2 on homotopy groups in chapter 9 “Topology”) or J. Sethna [7] (section “IV. Classify the topological defects”) or Nakahara [8]. We will essentially need the homotopy groups of the sphere [9]. You should understand 1 2 2 statements such as Π1(S ) = Z (winding number), Π1(S ) = 0, Π2(S ) = Z (wrapping number), etc.

2.1.2 Representations of SO(3) First, we review a few results from . A group can be thought of as an abstract object6 (G, ·) made of a set G of elements g and a composition law (symbolized by the dot ·) with closure, inverse, neutral element and associativity. Now, we would like to know how a group of transformations acts on physical quantities. We therefore introduce linear7 representations of this group. Let’s take a with n components (for example the electric field with 3 components). A representation built in this representation space (or base space) is such that for each g ∈ G there is an n × n matrix T (g) acting on physical quantities such that T (g1g2) = T (g1)T (). In other words:

T : G → V (2.6) g → T (g) (2.7)

6Here, by calling the group an abstract object, we want to emphasize that having constructed SO(3) from 3 × 3 matrices, we should now forget this construction and think of this group as an abstract group. Its elements are no longer matrices but abstract elements that obey a certain “multiplication table”. 7Why do we restrict ourselves to linear representations? See [1] for some ideas about this question.

14 where V is a linear (vector) space called the representation space and T (g) is a matrix (if V is of finite dimen- sion) or a linear operator (if V is infinite dimensional). A representation such that T (g1g2) = T (g1)T (g2) is iφ(g1,g2) called a faithful representation, whereas one such that T (g1g2) = e T (g1)T (g2) is called a projective representation (or a representation up to a phase). An irreducible representation is such that it contains no non-trivial subspace of V which is left invariant by all T (g). Otherwise, a representation is said to be reducible. For example, for SO(3) a quadruplet

(a, Vx,Vy,Vz) made of a scalar a and a vector V~ defines a reducible representation. For a such as SO(3), we can restrict ourselves to irreducible representations which are unitary and of finite dimension (see [1] encadr´esI and III). Then T (g) are unitary matrices (i.e. T (g)−1 = T (g)†) and the corresponding generators are hermitians. For a non-compact group, there are no unitary representations of finite dimension (apart from the trivial one). A representation of the Lie group can be obtained from a representation of its Lie algebra8. On the one iψ~n·J~ hand, the Lie group SO(3) made of elements R~n(ψ) = e has a global structure (for example, we also discussed it as a topological manifold being doubly connected). On the other hand, the Lie algebra so(3) defined by [Ji,Jj] = iijkJk only reflects the local structure of the group SO(3) close to the identity I but not its global structure9. We now concentrate on irreducible representations of the Lie algebra so(3). We already said that these representations are unitary and of finite dimension.

Dimension 1

Let us start by trying to construct a representation of dimension 1. [Ji,Jj] = iijkJk can be satisfied by 1×1 ~ iψ~·J~ matrices Ji = 0 so that R(ψ) = e = 1. This is the scalar (or trivial) representation. A scalar quantity s transforms indeed as s0 = 1s = s.

Dimension 3 We already found a representation of dimension 3 when first constructing the rotation group. The generators are 3 × 3 matrices such that (Ji)jk = −iijk. This is the vector (or fundamental or defining) representation of SO(3). Hence, for us a vector is now a triplet of numbers (Vx,Vy,Vz) that transforms under a rotation  0    Vx Vx 0 iψ~·J~ iψ~·J~ according to  Vy  = e  Vy  where e is a 3 × 3 matrix here. Remember that a triplet of 0 Vz Vz numbers that has no specified transformation property under spatial rotations is just a triplet of numbers, not a vector.

Dimension 2 M. Atiyah: “No one fully understands spinors. Their algebra is formally understood but their general signif- icance is mysterious. In some sense they describe the “square root” of geometry and, just as understanding the square root of −1 took centuries, the same might be true of spinors.”

Let us now try to find a two dimensional representation of the Lie algebra. are 2 × 2 hermitian matrices that do satisfy the so(3) algebra (up to a a factor 2, see below). We recall that σ1 =  0 1   0 −i   1 0  σ = , σ = σ = and σ = σ = are the three Pauli matrices. Check x 1 0 2 y i 0 3 z 0 −1 that indeed [σi/2, σj/2] = iijkσk/2. Another property of Pauli matrices is that they anticommute. Show

8Modulo the fact that two groups (e.g. SO(3) and SU(2)) can have the same algebra but different faithful representations. In addition, this statement turns out to be true only for compact groups. See below. 9 We will soon see that SU(2) is a different group from SO(3) – the former is simply connected Π1(SU(2)) = 0, while the latter is doubly connected Π1(SO(3)) = Z2 – but it has the same Lie algebra su(2) = so(3).

15 that σiσj = δij + iijkσk and {σi, σj} = 2δij (this last property defines a Clifford algebra, which we will encounter again when discussing the Dirac equation). The corresponding rotation matrix is

~ ψ ψ U(ψ~) = eiψ·~σ/2 = cos σ + i sin ~n · ~σ (2.8) 2 0 2 ~ where ψ = ψ~n and σ0 = I is here the 2 × 2 unit matrix and σi/2 are the generators for the representation ~ ~ ψ of dimension 2. Note that, unlike R(ψ) in the representation of dimension 3, U(ψ) depends on cos 2 and ψ ~ sin 2 , rather than on cos ψ and sin ψ. The matrix U(ψ) is a 2 × 2 complex matrix that depends on three real parameters (ψ, ~n(θ, φ)). In addition U(ψ~)†U(ψ~) = 1 and det U(ψ~) = +1, which means that U(ψ~) ∈ SU(2) the group of special (det= +1) unitary matrices. The SU(2) Lie group is different from SO(3). It has the same local structure (i.e. the same Lie algebra) as SO(3) but a different global structure. Indeed, for each element of SO(3), there are actually two elements ~ ~ ~ iψ~ ·~σ/2 ψ of SU(2). Let’s see that. Consider ψ = ψ~n and ψ = ψ~n = (ψ + 2π)~n. Then U(ψ) = e = cos 2 + ψ iψ~·~σ/2 ~ ~ iψ~ ·J~ iψ~·J~ ~ iψ iψ i sin 2 ~n · ~σ = −e = −U(ψ), whereas R(ψ) = e = e = R(ψ). This is a consequence of e = e iψ/2 iψ/2 and e = −e . The parameter space of SU(2) is (ψ, θ, φ) ∈ [0, 2π] × [0, π] × [0, 2π] as U~n(ψ + 2π) = −U~n(ψ) = U−~n(2π − ψ) (compare with the discussion of R~n(ψ)). This parameter space is twice as large as that of SO(3): for each element R~n(ψ) of SO(3) defined by the triplet (ψ, θ, φ) ∈ [0, π] × [0, π] × [0, 2π], there are two distinct elements U~n(ψ) and U~n(ψ + 2π) = −U~n(ψ) of SU(2) corresponding to the triplets of parameters (ψ, θ, φ) and (ψ + 2π, θ, φ) ≡ (2π − ψ, π − θ, φ + π). In summary U ∈ SU(2) ↔ R ∈ SO(3). As a manifold SU(2) ≈ S3 which is compact and simply connected (i.e. the fundamental homotopy 3 10 group Π1(S ) = 0) . SU(2) is known as the universal cover of SO(3) which is written as SO(3) = SU(2). The representation we have just found is a faithful representation of SU(2) but not of SO(3). Let’s see that it is actually a projective representation of SO(3). Take the following two elements of G = SO(3): g1 = Rz(π) iπσz /2 iπσz /2 iπσz and g2 = Rz(π), then g1 · g2 = Rz(2π) = I = e but U(g1)U(g2) = e e = e = −I = −U(e) iφ(g1,g2) where e is the neutral element of the group. Here the phase in T (g1)T (g2) = e T (g1g2) is π. This representation is called the spinor (or fundamental or defining) representation of SU(2). Note how by allowing projective representations, we have moved from the study of the group SO(3) to that of another group, SU(2). Objects that transform according to this representation are called spinors. They are doublets of complex numbers (z1, z2) that transform according to

 0      z1 iψ~·~σ/2 z1 ~ z1 0 = e = U(ψ) (2.9) z2 z2 z2 Just like any triplet of numbers is not a vector, any doublet is not a spinor. A characteristic feature of a spinor is that it gets a minus sign in a rotation of 2π and only comes back to itself after a 4π rotation. Classical objects that show such a behavior can be constructed with a belt or scissors or a glass of water (show some of the tricks in the classroom: plate, Dirac’s string or belt, Balinese cup, etc.). Note how in the present discussion of representations of the rotation group, spinors appear as geometrical objects that a priori have nothing to do with quantum physics. Spinors were actually discovered by the mathematician Elie´ Cartan in 1913 when studying linear representations of groups, before they appeared in quantum mechanics with in 1927. The connection between spinors and quantum mechanics comes from allowing for the presence of representations up to a phase.

 a b  10To see that directly, note that a generic SU(2) matrix can be written U = where a and b are complex −b∗ a∗ numbers such that |a|2 + |b|2 = 1 (indeed UU † = 1 and det U = +1). Writing a = x + iy and b = z + it, we see that x2 + y2 + z2 + t2 = 1 which is the definition of a sphere S3 of radius 1 in R4. This shows that SU(2) ≈ S3. As a side remark, iχ iψ~n·~σ/2 iχ ψ ψ iχ note that a general U(2) matrix can be written U = e e = e [cos 2 + i sin 2 ~n · ~σ] = e [γ0 + i~γ · ~σ] where γ0 and R 2 2 (γx, γy, γz) all ∈ such that γ0 + ~γ = 1. First, one recognizes the above four real numbers such that the sum of their square equals one. Second, one sees that U(2) ≈ U(1) × SU(2) where eiχ is the U(1) phase.

16 Tensor products allow one to construct representations of higher dimension from representations of smaller dimension. Let’s study a concrete example: that of the tensor product of two vectors (i.e. two representations ~ 0 iψ~·J~ of spin 1). Let V be a vector, i.e. its coordinates transform as Vi = RijVj with R = e . Let Tij be an 0 array of 9 numbers. If it transforms as Tij = RikRjlTjl, then it is said to be a rank 2 tensor. For example, Tij = ViVj. This defines a reducible representation of SO(3) of dimension 9. From your knowledge of the composition of in quantum mechanics11, you know that composing two spin 1 (i.e. vectors) gives rise to a spin 0 (i.e. scalar), a spin 1 and a spin 2. In terms of dimensions of the representations, this is usually written as 3⊗3 = 1⊕3⊕5, which means that the tensor product of two vector representations is reducible and decomposes into the direct sum of 3 irreducibles representations (irreps): one of dimension 1 (spin 0, scalar, trace of the tensor), one of dimension 3 (spin 1, vector, antisymmetric part of the tensor) and one of dimension 5 (spin 2, traceless symmetric part of the tensor). Another example is the composition of two spin 1/2 representations: 2 ⊗ 2 = 1 ⊕ 3. It is a reducible 4-dimensional representation that splits into a one-dimensional (spin 0) irrep and a three-dimensional (spin 1) irrep. We can continue this game and compose a spin 1/2 with a spin 1: 2 ⊗ 3 = 2 ⊕ 4. It is a reducible six-dimensional representation that splits into a two-dimensional irrep (spin 1/2) and a four-dimensional irrep (spin 3/2). Repeating this procedure, we can obtain all the irreps of SU(2).

Casimir operator A Casimir is an operator that commutes with all generators of the Lie group. In each irrep, it is proportional to the identity (Schur’s lemma). Its eigenvalues are used to label the irrep. For the so(3) = su(2) algebra, 2 2 2 th J~ = JiJi is the only Casimir operator. Check that [J~ ,Ji] = 0 and that J~ = j(j + 1)I in the j irrep of SU(2). An irrep of SU(2) (or of SO(3)) is therefore labeled by its spin j.

Summary All irreducible representations (faithful or projective) of SO(3) can be found as faithful irreps of SU(2). Irreps of SU(2) are labelled by a half integer j = 0, 1/2, 1, 3/2, ..., which is known as the spin of the representation. The dimension of the representation is 2j + 1. j = 0: spin 0, scalar, dimension = 1 j = 1/2: spin 1/2, spinor, dimension = 2 j = 1: spin 1, vector, dimension = 3 j = 3/2: spin 3/2, dimension = 4 j = 2: spin 2, traceless symmetric rank 2 tensor, dimension = 5 .... Faithful representations of SO(3) are that of integer spin j ∈ N. Projective representations of SO(3) are 1 that of 2 + integer spin j = 1/2, 3/2, .... Remark: from the above construction, keep in mind the difference between a rank 2 tensor and a spin 2. The former corresponds to 9 numbers and induces a reducible representation that splits into the direct sum of three irreps: its trace is a scalar (spin 0), its antisymmetric part is a vector (spin 1) and its traceless trT I trT symmetric part is a spin 2. T = 3 + A + S where Aij = (Tij − Tji)/2 and Sij = (Tij + Tji)/2 − 3 .

Space inversion (parity) Space inversion P does not belong to SO(3) but to O(3). It further distinguishes between a true and a pseudo- scalar or between a true (or polar) and a pseudo- (or axial) vector. Let V~ be a true vector. Under P it

transforms as Vi → −Vi. With two vectors V~ and W~ , we can make the scalar product V~ ·W~ which transforms as ViWi → (−Vi)(−Wi) = ViWi, which defines a true scalar. We can also consider the V~ × W~ which transforms as ijkVjWk → ijk(−Vj)(−Wk) = ijkVjWk, which is therefore not a true vector but a

11 When composing a spin j1 with a spin j2, one obtains all the spins in between |j1 − j2| and j1 + j2.

17 pseudo-vector. The mixed product (V~ × W~ ) · U~ of three vectors transforms as ijkVjWkUi → −ijkVjWkUi, which is a pseudo-scalar. If one is interested in going further in this direction of refining the classification of “vectors”, I suggest reading J. Hlinka, Phys. Rev. Lett. 113, 165502 (2014) (see arXiv:1312.0548), who considers the direct product of SO(3) and {I,P,T,PT } (where P is the space inversion and T the time reversal operator) to build 8 types of directional quantities (polar vector, axial vector, nematic director, etc.) and 4 scalar quantities (time-even scalar, time-even pseudo-scalar, time-odd scalar, time-odd pseudo-scalar). Question: studying the projective representations of O(3) (i.e. SO(3) and parity P ), is it possible to discover the existence of two types of spinors (left-chiral spinors and right-chiral spinors)? These two types of spinors are discussed in the next section and appear as projective irreps of SO(3, 1).

[end of lecture #2]

2.2 Lorentz group

We now move from 3 dimensional Euclidian space to 3+1 dimensional Minkowskian space-time and. As before, we do not consider translations (either in space or time) for the moment and restrict to a sub-class of isometries12: “rotations” (i.e. space rotations and boosts). At this point, it would be good for the reader to review the basics of special relativity (relativity principle + absolute value of the light velocity). If one wants to understand something to relativity, we recommend reading Epstein [10]. For a brief recap, we recommend Feynman [11]. And for a more formal presentation with the covariant notation, we suggest reading Boratav and Kerner [12]. The absolute nature of the velocity of light c justifies taking units such that c = 1 in the following.

2.2.1 Metric of flat spacetime Spacetime is described by the Minkowski metric. It is a flat metric (no curvature) but there is a minus sign that makes it different from the familiar Euclidian metric. A vector in spacetime (called a 4-vector) will be written A. Because the metric is not Euclidian13, we have to distinguish contravariant (upper indices) and covariant (lower indices) coordinates, which are different. We use a basis of four 4-vectors {eµ} with µ = 0, 1, 2, 3 = t, x, y, z (greek indices take 4 values, while latin indices only span the 3 space directions i = 1, 2, 3 = x, y, z) to define contravariant coordinates Aµ by

µ A = A eµ (2.10)

(summation convention for repeated lower and upper indices). We have Aµ = (A0,Ai) = (A0, A~) for µ ν contravariant coordinates. The scalar product between two 4-vectors is defined by A · B = A B eµ · eν = µ ν A B ηµν , where ηµν ≡ eµ · eν defines the , which is obviously symmetric ηµν = ηνµ. It is represented by the following matrix  1 0 0 0   0 −1 0 0  ηµν =   = diag(1, −1, −1, −1) (2.11)  0 0 −1 0  0 0 0 −1

It describes a flat spacetime and relies on a conventional choice (we could have chosen ηµν = diag(−1, 1, 1, 1) instead). If spacetime was Euclidian the metric would be diag(1, 1, 1, 1), i.e. the unit 4×4 matrix. Covariant coordinates are defined by Aµ ≡ A · eµ . (2.12)

12For the moment, we did not specify what is the proper definition of a “distance” in spacetime. 13Distinguishing contra- and co-variant indices is required whenever the basis of vectors is not orthonormal, i.e. when the metric is not simply the identity. It is not specific of relativity and of studying spacetime. As an exercise, consider a 2d flat space and work with a basis of two unit vectors that are not orthogonal. See Delamotte [1], page 70.

18 0 i ~ ν ν ν We have Aµ = (A0,Ai) = (A , −A ) = (A0, −A) because Aµ = A · eµ = A eν · eµ = A ηνµ = ηµν A . The µν µν µ object η is defined as being the inverse of ηµν i.e. η ηνσ = δσ = diag(1, 1, 1, 1). It is also represented by µν the same matrix as ηµν . The ηµν and η are used to lower and raise indices. Note, for example, µν µ µ µ that η ηνσ = δσ = η σ = ησ . We will see that the metric tensor is one of the few tensors that has the same expression in every frame (it is called an invariant tensor). We also define differential operators ∂ ∂ ≡ = (∂ , ∂ , ∂ , ∂ ) = (∂ , ∇~ ) = (∂ , ∂ ) (2.13) µ ∂xµ t x y z t 0 i and µ µν 0 i ∂ ≡ η ∂ν = (∂ , ∂ ) = (∂t, −∇~ ) . (2.14) A word on three words: invariant, covariant, contravariant. Invariant means unaffected by a transfor- mation (e.g. a scalar is invariant under an isometry). Covariant means that something transforms in the same way as something else. For example a covariant equation is such that both sides of the equality are transformed but that the equality remains true after transformation14. Covariant coordinates are coordi- nates that transforms as basis vectors do. Whereas contravariant coordinates do not transform as basis µ vectors. Contravariant coordinates A transform in a way opposite to that of basis vectors eµ such that the µ vector A = A eµ itself is not transformed (passive viewpoint). Instead of talking about contravariant and covariant coordinates, it would probably be easier just to say upstairs indices and downstairs indices. As a rule of thumb, remember that Aµ = (A0, A~) (upstairs) are the “natural objects” (meaning no minus signs ∂ ~ floating around). The only exception is the ∂µ = ∂xµ = (∂t, ∇) which is “naturally defined” with a downstairs index. Note that we are now manipulating four types of objects: 4-vectors A, contravariant coordinates Aµ, covariant coordinates Aµ and basis 4-vectors eµ. In the passive viewpoint, 4-vectors are not transformed, contravariant coordinates are transformed in a certain way (see below) and basis vectors are transformed in the opposite way (such as to maintain 4-vectors invariant). And covariant coordinates transform as basis vectors do. Remark: there is a potentially confusing point of detail that we want to spell out. Aµ = (A0,Ai) = 0 (A , A~) = (A0, −Ai). But because we are so used of writing A~ = (Ax,Ay,Az) in 3d space, we usually keep 1 2 3 i 1 this notation and therefore have to have that A~ = (Ax,Ay,Az) = (A ,A ,A ) = A , i.e. A = Ax = −A1, 2 3 A = Ay = −A2 and A = Az = −A3. Not so easy...

2.2.2 Interval between two events in spacetime The “distance” (called interval) between two “points” (called events) in spacetime ds is such that its “square” 2 2 2 2 2 µ ν 2 2 is ds = dt − (dx + dy + dz ) = dx · dx = dx√ dx ηµν . Note that ds is not positive in general. If ds > 0, the interval is said to be time-like and ds = ds2 is called the interval. If ds2 < 0, the interval is said to be space-like and p|ds2| is called the proper distance. If ds2 = 0 the interval is said to be light-like (or null). The latter defines the light-cone of a given . Inside the of an event are other events that can be causally related to it (either before in the “past” or after in the “future”). Outside the light cone are events that may be thought as happening “now” and that are not causally related to the given event.

2.2.3 Isometries of spacetime: the Lorentz group A Poincar´etransformation is an isometry of spacetime, i.e. a transformation that preserves the interval 0 02 2 0µ 0ν α β between two events. Therefore it transforms ds into ds such that ds = ds i.e. ηµν dx dx = ηαβdx dx . 0µ 0µ σ σ 0µ σ 0µ 0ν Using the fact that dx = ∂x /∂x dx = ∂σx dx , we obtain ηµν ∂σx ∂ρx = ησρ. First taking the

14For example, an equality between two vectors ~a = ~b is covariant, whereas an equality between a triplet of scalars (a, b, c) and a vector d~ = (dx, dy, dz) is not and is generally wrong, even if it may be true in a particular frame. It is seen to be wrong 0 0 0 in a rotation as (a, b, c) is invariant but not (dx, dy, dz) → (dx, dy, dz) = R(dx, dy, dz), where R is a rotation matrix.

19 Figure 2.2: Light cone of a given event (or observer). Taken from https://en.wikipedia.org/wiki/Light cone.

∂x0 ∂x0 determinant we see that the Jacobian | det ∂x | = 1 so that the matrix ∂x is invertible. Second taking 0µ 0ν a further derivative, we have ∂α(ηµν ∂σx ∂ρx ) = 0 so that Aασρ + Aαρσ = 0 where we defined Aασρ ≡ ∂2x0µ ∂x0ν ηµν ∂xα∂xσ ∂xρ . Playing with the indices we find the three relations: Aσαρ + Aαρσ = 0, Aσαρ + Aσρα = 0 ∂x0µ 0 and Aσρα + Aαρσ = 0...... one eventually obtains ∂xσ ∂xα = 0 so that x is a linear of x. Therefore, µ µ we introduce the 4 × 4 matrix Λ ν and the vector a such that

0µ µ ν µ x = Λ ν x + a . (2.15)

µ By convention Λ ν is represented by a matrix for which µ is the row index and ν the column index (note that this convention for Λ difers from the one we have taken for the metric η for which the associated matrix µ is defined by ηµν = diag(1, −1, −1, −1) and not by η ν ). The constant shift a corresponds to a spacetime translation, whereas Λ corresponds to a “spacetime rotation” (i.e. either to a space rotation or to a change of inertial frame called a Lorentz boost or simply a boost). (Λ, a) is an element of the Poincar´egroup (translations + rotations + boosts), whereas Λ is an element of the Lorentz group (rotations + boosts)15. 0µ µ ν For the moment, we concentrate on the Lorentz group, an element Λ of which acts as x = Λ ν x . From x0 · x0 = x · x, we obtain that µ ν T µ ν ηαβ = ηµν Λ αΛ β = (Λ )α ηµν Λ β , (2.16) T ν ν where we used the definition of the matrix (Λ )µ ≡ Λ µ simply exchanging row and column indices (pay attention to the position of the indices in this definition and compare ΛT to the definition of Λ−1 below). Therefore η = ΛT ηΛ (2.17)

µ as a matrix identity where η represents the matrix ηµν , while Λ represents the matrix Λ ν . This relation defines (homogeneous) Lorentz transformations. Equation (2.17) should be seen as a Minkowskian equivalent of the Euclidian relation I = RT IR showing that rotations are described by orthogonal matrices. Lorentz transformations form a group (show it) called

15The Poincar´egroup is also called the inhomogeneous Lorentz group and the Lorentz group is also known as the homogeneous Lorentz group.

20 O(3, 1). This group is actually made of four disconnected pieces due to the presence of discrete transfor- mations. Space inversion is P = diag(1, −1, −1, −1). Time reversal is T = diag(−1, 1, 1, 1). Their product is PT = −I. These three discrete transformations P , T and PT are not continuously connected to the identity. We would like to exhibit the part of O(3, 1) which is connected to the identity. P and T change the orientation of spacetime (just like in 3d Euclidian space, space inversion changes the orientation of space). This is seen in det T = −1 and det P = −1. They are called improper transformations. We call SO(3, 1) the proper Lorentz group: it is a subgroup (show it) of the Lorentz group containing only elements that do not change the orientation of spacetime (therefore excluding P and T but not PT as det PT = +1). The problem with PT is that, although it preserves the orientation of spacetime, it does reverse the time 0 2 0 direction. It can be shown that (Λ 0) is always ≥ 1 so that Λ 0 is either ≥ 1 or ≤ −1. Transformations Λ of the first kind do not reverse time and are called orthochronous. Those of the second kind reverse time and are called non-orthochronous. P is improper and orthochronous. T is improper and non-orthochronous. PT is proper and non-orthochronous. We call SO+(3, 1) the proper orthochronous Lorentz group. It is the subgroup of O(3, 1) that is continuously connected to the identity. Any element of O(3, 1) can be seen as the product of an element of SO+(3, 1) and I or P or T or PT 16. In the following, we concentrate on SO+(3, 1), which we will loosely call the Lorentz group. From ΛT ηΛ = η, one obtains that (det Λ)2 = 1 so that Λ−1 exists. We now obtain its expression. From α β −1 ν σµ −1 σ α µσ 17 ηµν = ηαβΛ µΛ ν contracting with (Λ ) ρη we find that (Λ ) ρ = ηραΛ µη , which by definition we σ α µσ −1 σ σ call Λρ ≡ ηραΛ µη . We then have the simple equality (Λ ) ρ = Λρ . Show that the transformation law 0 µ 0ν ν µ for basis vectors is eν = Λν eµ, which is contrary to contravariant coordinates transforming as x = Λ µx , 0 µ but similar to that of covariant coordinates transforming as xν = Λν xµ. This is the origin of the name “covariant coordinates”: coordinates that transform as the basis vectors do.

2.2.4 Lorentz algebra The Lorentz group is a Lie group. It is somewhat similar to SO(4) – the group of rotations in a 4-dimensional Euclidian space. The latter has 6 independent rotations (in the planes x0x1, x0x2, x0x3, x1x2, x1x3 and x2x3) and therefore 6 generators. Another way to see it is to realize that a 4 × 4 real matrix has 16 real parameters but that the defining condition ΛT ηΛ = η gives 10 independent constraints leaving 6 independent parameters. An important difference with SO(4), that will reveal crucial when contrsucting representations, is that the Lorentz group, which also has 6 generators, is non-compact.

Lorentz boost To be more concrete, we now construct specific Lorentz transformations in order to obtain the generators and their algebra. Among the 6 independent transformations, there are 3 space rotations in the planes xy, xz and yz and 3 boosts (changes of inertial frame) in the planes tx, ty and tz. As the rotations are similar to the ones studied in SO(3), we focus on a boost in the tx plane. This is a transformation x0 = Λx such that y0 = y, z0 = z and only t and x mixes. In addition, we know that Λ is linear so that t0 and x0 are linear combinations of t and x. We also know that the transformation preserves intervals (isometry) so

16The group O(3, 1) is the semi-direct product of SO+(3, 1) and the discrete group {I,P,T,PT }, which is known as the Klein four-group K4. As an exercise, write the multiplication table of the group K4 and compare it to that of the group Z2 × Z2 where Z2 = {1, −1; ×} or equivalently {0, 1; +}.  1 0   1 0  17A simple 2×2 example should make things clearer. Take η = with µ = 0, 1 = t, x. Then ηµν = . µν 0 −1 0 −1  cosh φ − sinh φ  And let’s take as a specific Lorentz transformation a boost of φ:Λµ = . The inverse matrix ν − sinh φ cosh φ  1 0   cosh φ − sinh φ   1 0   cosh φ sinh φ  Λ−1 is defined by (Λ−1)σ = η Λα ηµσ = = . This ρ ρα µ 0 −1 − sinh φ cosh φ 0 −1 sinh φ cosh φ  cosh φ − sinh φ  is simply Λ(−φ) as expected. The transpose matrix ΛT is defined by (ΛT ) ν ≡ Λν = so that here µ µ − sinh φ cosh φ ΛT = Λ. Note in particular that ΛT 6= Λ−1. We can also check the matrix equality ΛT ηΛ = η defining Lorentz transformations, which shows that Λ−1 = ηΛT η.

21 02 02 2 2 0 18 that t − x = t − x . In addition det Λ = +1 (proper) and Λ 0 ≥ 1 (orthochronous) . One finds that 0µ µ ν x = Λ ν x reads  t0   cosh φ − sinh φ 0 0   t   x0   − sinh φ cosh φ 0 0   x    =     (2.18)  y0   0 0 1 0   y  z0 0 0 0 1 z which is parametrized by a single quantity called the rapidity φ. An important point is that φ ∈ R is unbounded and not closed (unlike rotations which are parametrized by an angle ψ ∈ [0, π]). The rapidity φ is not an angle. Therefore the Lorentz group is non-compact. Physically, this transformation describes ~ a boost along the x direction with a velocity v = tanh φ. We characterize it by φ = φ~ex. We have x0 = cosh φx − sinh φt so that the origin of the primed frame at x0 = 0 moves such that x = tanh φ × t 19 showing that the velocity is indeed tanh√ φ . To make the connection√ with standard notations of special relativity, note that cosh φ = γ = 1/ 1 − v2 20 and sinh φ = γv = v/ 1 − v2. In the non-relativistic limit  ct0   1 −v/c   ct  v/c = tanh φ ≈ φ  1, we find that ≈ i.e. t0 ≈ t − vx/c2 ≈ t and x0 −v/c 1 x x0 ≈ x−vt as expected for a Galilean boost (time is absolute t0 = t and simply add x0/t = x/t−v).

Generators

We can now obtain the boost generator Kx from  0 i 0 0  dΛ  i 0 0 0  Kx ≡ −i |φ=0 =   (2.19) dφ  0 0 0 0  0 0 0 0

† 21 which is anti-hermitian Kx = −Kx . Similarly, one can work out the boosts in the y and z directions and obtain the corresponding generators:  0 0 i 0   0 0 0 i   0 0 0 0   0 0 0 0  Ky =   and Kz =   (2.20)  i 0 0 0   0 0 0 0  0 0 0 0 i 0 0 0 The three rotation generators are already known from our study of SO(3) (excluding the first row and column of the following 4 × 4 matrices, you should recognize the 3 × 3 submatrices (Ji)jk = −iijk):  0 0 0 0   0 0 0 0   0 0 0 0   0 0 0 0   0 0 0 i   0 0 −i 0  Jx =   , Jy =   and Jz =   (2.21)  0 0 0 −i   0 0 0 0   0 i 0 0  0 0 i 0 0 −i 0 0 0 0 0 0

† † One has Ki = −Ki and Ji = Ji. The parameter space of the Lorentz group is made of 3 and 3 22 3 3 angles such that (φx, φy, φz, θx, θy, θz) ∈ R ×[0, π] . The parameter space is non-compact (it is unbounded and not closed).

 a b 0 0  18 µ  c d 0 0  T An alternative way is so search for a matrix Λ ν =   such that Λ ηΛ = η.  0 0 1 0  0 0 0 1 19 If one takes ~v = v~ex as the parameters defining the boost, then v ∈] − 1, 1[ which is bounded but not closed and hence not compact. Physically, it means that it is not possible to go in an inertial frame moving at the velocity of light |v| = 1 compared to another inertial frame. √ 20The parameter γ = 1/ 1 − v2 is the usual factor describing time dilatation – a moving clock runs slower – and in the direction of motion. 21In the literature, you will also find definitions such that the generator is hermitian, having absorbed an i factor. 22From now on, we call θ the angle giving the magnitude of rotation. It used to be called ψ.

22 Lorentz algebra The Lie algebra of the Lorentz group so(3, 1) can be obtained from computing the commutators of the generators. Check that [Jx,Jy] = iJz,[Jx,Ky] = iKz and [Kx,Ky] = −iJz. The Lorentz algebra is therefore23

J i,J j = iijkJ k (2.22) J i,Kj = iijkKk (2.23) Ki,Kj = −iijkJ k (2.24)

The first equation means that rotation generators form a closed sub-algebra (which we recognize as so(3)). The second line means that the triplet (Kx,Ky,Kz) transforms as a vector (here meaning a 3-vector, a vector under space rotations). Therefore the first equation also means that the triplet (Jx,Jy,Jz) itself transforms as a vector. Hence the notations K~ and J~. The third equation is more subtle: it means that boosts do not form a closed sub-algebra. The fact that the commutator of two boosts is related to a rotation in the third space direction is at the origin of the effect. This is a relativistic effect. An infinitesimal Lorentz transformation parametrized by (φ~ → 0, θ~ → 0) is therefore Λ = 1+iφ~·K~ +iθ~·J~. N N 24 Using the usual trick limN→∞(1 + x/N) = e , we find for a finite transformation that   Λ = exp iφ~ · K~ + iθ~ · J~ . (2.25)

Covariant notation It may be disturbing that the equation Λ = 1 + iφ~ · K~ + iθ~ · J~ is not written itself in a covariant fashion (indeed φ~ · K~ is the scalar product in 3d Euclidian space, therefore φ~ · K~ is a 3-scalar but not a 4-scalar). µ µ µ For an infinitesimal Lorentz transformation (i.e. close to the identity), we can write Λ ν ≈ δν + ω ν .  0 −φ1 −φ2 −φ3   −φ1 0 θ3 −θ2  Comparing ωµ with Λ − 1 = iφ~ · K~ + iθ~ · J~, we find that ωµ =   so that ν ν  −φ2 −θ3 0 θ1  −φ3 θ2 −θ1 0  0 −φ1 −φ2 −φ3  1 3 2 σ  φ 0 −θ θ  ωµν ≡ ηµσω =   = −ωνµ. This antisymmetric rank 2 tensor contains the 6 ν  φ2 θ3 0 −θ1  φ3 −θ2 θ1 0 σρ σρ parameters specifying a generic Lorentz transformation. We define J such that Λ = 1 − iωσρJ /2 when ωσρ → 0. It can be chosen to be antisymmetric as well because any symmetric part would vanish in the σρ contraction with the antisymmetric tensor ωσρ. This antisymmetric tensor J contains the 6 generators. By σρ µ µ equating two different expressions for Λ, we find that −iωσρ(J ) ν /2 = ω ν where σρ labels the generators and µ is the row index and ν the column index in the matrix representation. In other words, each J σρ with σρ µ σµ ρ ρµ σ fixed σρ is itself a 4 × 4 matrix. One can show that (J ) ν = i(η δν − η δν ), which is the 3 + 1 version σρ σρ of (Ji)jk = −iijk. Let’s have a closer look at the 6 generators hidden in J . Among the 16 entries of J

23 1 2 With J = Jx, J = Jy, etc. Note that when using latin indices i, j, ... = 1, 2, 3, it means that we restrict to space rather than spacetime. In 3d space the metric is Euclidian. There is therefore, in that case, no reason in making a difference between upstairs and downstairs indices. When writing ijkJk, the summation over repeated indices is implied. 24Actually, this is not entirely true in the case of a non-compact group. The issue is that exponentiating the generators, one does not recover all of the elements of the group. All elements might be recovered as products of such exponentials (?). Matthieu Tissier has a nice counter-example involving the non compact Lie group SL(2, C), which is the group of 2 × 2 complex  −1 0  matrices with det = +1. The matrix belongs to SL(2, C) but can not be written as an exponential. However 1 −1  −1 0   1 0  it can be written as the product of two matrices that each can be written as an exponential 0 −1 −1 1 eiπσz ei(σy +iσx)/2. Check that by using the Baker-Campbell-Hausdorff formula.

23 i 0i i ijk 0i i only 6 are independent because of antisymmetry. Check that K = J and J =  Jjk/2, i.e. J = K k and Jij = ijkJ . Therefore  0 K1 K2 K3  1 3 2 σρ  −K 0 J −J  J =   (2.26)  −K2 −J 3 0 J 1  −K3 J 2 −J 1 0 In covariant notation, the Lorentz algebra reads:

[J µν ,J ρσ] = i (−ηµρJ νσ − ηνσJ µρ + ηνρJ µσ + ηµσJ νρ) (2.27)

ω −i µν J µν and an element of the Lorentz group is Λ = e 2 (see however, the preceding footnote).

[end of lecture #3]

2.2.5 Representations of the Lorentz group Following a strategy similar to that used for the rotation group, we construct representations of the Lie algebra to obtain that of the group. We first rewrite the Lorentz algebra by defining N~ ≡ (J~  iK~ )/2 such ~ † ~ that N = N. Now, the algebra reads

h i j i ijk k N+,N+ = i N+ (2.28)

h i j i ijk k N−,N− = i N− (2.29)

h i j i N+,N− = 0 (2.30) which means that it splits in two decoupled su(2) algebras, i.e. so(3, 1) = su(2) ⊕ su(2). Therefore irreps of so(3, 1) can be obtained as irreps of su(2) ⊕ su(2). The latter are labelled by (j+, j−) with j = 0, 1/2, 1, ... ~ 2 and are of dimension (2j+ + 1)(2j− + 1). N are the two Casimir operators: they are proportional to the ~ 2 I ~ 2 I identity in each irrep. In (j+, j−), N+ = j+(j+ + 1) and N− = j−(j− + 1) . From their definition, we also know that J~ = N~+ + N~− and K~ = −i(N~+ − N~−). The rotation generator J~ is therefore the sum of two angular momenta N~+ and N~−. From the familiar composition law of angular momentum applied to 2 J~ = N~+ + N~−, we know that the spin j corresponding to J~ goes in unit steps from |j+ − j−| to |j+ + j−|.

Representation (j+, j−) = (0, 0) This is the 4-scalar representation. It has dimension 1, J~ = 0, K~ = 0. It is also a 3-scalar under rotations as j = 0.

Representation (1/2, 0)

This representation has dimension 2 and behaves as a spinor under rotations as j = 1/2. Here N~+ = ~σ/2 iφ~·K~ +iθ~·J~ ~σ/2·(iθ~+φ~) and N~− = 0 so that J~ = ~σ/2 and K~ = −i~σ/2. Therefore ΛL = e = e . Doublets of complex numbers ψL that transform in such a representation are called left-handed Weyl spinors. The transformation 0 law is ψL = ΛLψL.

Representation (0, 1/2) This representation has dimension 2 and also behaves as a spinor under rotations as j = 1/2. However it is

inequivalent to the previous representation as we will see. Here N~+ = 0 and N~− = ~σ/2 so that J~ = ~σ/2 and iφ~·K~ +iθ~·J~ ~σ/2·(iθ~−φ~) K~ = i~σ/2. Therefore ΛR = e = e . Doublets of complex numbers ψR that transform in

24 such a representation are called right-handed Weyl spinors (in the old literature, left and right spinors are 0 also called dotted and undotted spinors). The transformation law is ψR = ΛRψR. Left and right spinors both behave as spinors (j = 1/2) under rotations but behave differently under boosts. The two representations are inequivalent because one can not find a 2 × 2 matrix S (a similarity −1 matrix) such that ΛR = SΛLS (this being the definition of equivalent representations). ∗ ~σ/2·(iθ~+φ~) ∗ ~σ∗·(−iθ~+φ~)/2 Actually σ2ΛLσ2 = ΛR. The proof goes as follows. First σ2(e ) σ2 = σ2e σ2. Then one uses the general matrix formula

∞ n −1 −1 X A BA(B B)AB −1 BeAB−1 = B B−1 = 1 + BAB−1 + + ... = eBAB (2.31) n! 2! n=0

∗ ∗ ~σ ·(−iθ~+φ~)/2 σ2~σ σ2·(−iθ~+φ~)/2 ∗ to show that σ2e σ2 = e . Finally, using σ2σ1,3σ2 = σ2σ1,3σ2 = −σ1,3 and ∗ ∗ ~σ/2·(iθ~−φ~) σ2σ2 σ2 = σ2(−σ2)σ2 = −σ2, we see that σ2~σ σ2 = −~σ, which completes the proof as ΛR = e . In other words σ2K where K is the operation of complex conjugation (it takes the complex conjugate of everything to its right) would be a candidate for S except that σ2K is an anti-unitary operator and the latter can not be represented by a matrix25. Also ΛL and ΛR do not belong to SO(3, 1): they are complex 2 × 2 matrices with det = +1, but they † ~σ·φ~ are not unitary. Indeed ΛLΛL = e 6= 1. This non-unitarity can be traced back to the fact that SO(3, 1) is non-compact, i.e. to the existence of boosts which are described by anti-hermitian generators so that iφ~·K~ † iφ~·K~ −iφ~·K~ (e ) = e 6= e . Rather, ΛL and ΛR belong to the group SL(2, C) (the of complex matrices of size 2 × 2). The latter is the covering group of SO(3, 1) – noted as SL(2, C) = SO(3, 1) – and the Weyl spinors realize projective (and not faithful) representations of SO(3, 1) (similarly to the relation between SO(3) and SU(2): SU(2) = SO(3)).

Representation (1/2, 0) ⊕ (0, 1/2) What is the effect of space inversion P on the Weyl spinors? J~ → J~ is a pseudo-vector, whereas K~ → −K~ i i i i 26 is a true vector. This can be checked from PJ P = J and PK P = −K . Therefore N~ → N~∓. This shows that the irreps (1/2, 0) and (0, 1/2) are exchanged under parity so that a left spinor becomes a right spinor and vice-versa. Let us therefore glue a left and a right Weyl spinor into a quadruplet ψ =  ψ  L known as a or (here written in the so-called chiral basis). It has four complex ψR components. It realizes a 4-dimensional reducible (and projective) representation of SO+(3, 1) – that’s the meaning of (1/2, 0) ⊕ (0, 1/2) – and an irreducible (and projective) representation when parity is added.  0      ψL ΛL 0 ψL 0 A Dirac spinor transforms as 0 = , which we write ψ = S(Λ)ψ, under a ψR 0 ΛR ψR  0   I    ψL 0 ψL 0 0 Lorentz transformation. And as 0 = I , which we write ψ = γ ψ, under a parity ψR 0 ψR transformation. We introduce the notation ψ ≡ ψ†γ0 called the conjugate of a Dirac spinor. The reason for this is that ψψ is a Lorentz scalar (prove it), whereas ψ†ψ is not. This peculiarity can again be traced back to having ηµν as a metric instead of δµν .

25Indeed K is not a linear operator and can therefore not be represented by a matrix. It is actually an anti-linear operator. The most famous example of such an operator is the quantum mechanical time reversal operator T which is anti-unitary (as shown by Wigner in 1932). It acts (in ) on a superposition of quantum states as T (a|ψi + b|χi) = a∗|ψi + b∗|χi and satisfies T †T = TT † = 1 and T 2 = 1 or −1. It is usually written as the product of a unitary (and linear) operator U and the complex conjugation operator K: T = UK. For a brief introduction to the time-reversal operator see pages 99-100 in [4]. 26Is it clear in the present context why the generators should transform as Ji → PJiP † under parity? This is the transfor- mation law for a matrix A → A0 = SAS−1 and here the role of S is played by P = P † = P −1. Intuitively, the generator J~ is an angular momentum and we know that the orbital angular momentum L~ = ~x × P~ (see the section on field representations) is a pseudo-vector because position ~x and momentum P~ are true vectors. Also K~ = tP~ − E~x as we will see in the section on field representations: therefore K~ → −K~ under parity, as a true vector.

25 Representation (1/2, 1/2) This representation has dimension 4. It is the defining or fundamental or 4-vector representation of the Lorentz group. It is actually the representation we have obtained in constructing the Lorentz group, when working with 4 × 4 matrices (section called Lorentz algebra). The contravariant coordinates Aµ of a 4-vector 0µ µ ν A transforms as A = Λ ν A . It can be shown that a 4-vector under the Lorentz group splits into a scalar and a 3-vector under the rotation group (for details, see Maggiore [2] page 28). It is important to realize that although a Dirac spinor is a quadruplet (it has 4 components), it is not a 0 0µ µ ν Lorentz 4-vector. Indeed the transformation law ψ = S(Λ)ψ is different from x = Λ ν x .

Tensor representations The 4-vector representation is the fundamental representation of the Lorentz group (but not of its cover- ing group SL(2, C)). From it, we can play the game of obtaining larger representations by using tensor products. Let’s do a useful case: the tensor product of two 4-vector representations. A rank 2 tensor µν µ ν 0µν µ ν ρσ T = A B transforms as T = Λ ρΛ σT (by definition) under a Lorentz transformation. This realizes a 16-dimensional representation. However, it is reducible. Indeed if T ρσ is antisymmetric then T 0µν as well. Similarly if T ρσ is symmetric then T 0µν is also symmetric. Therefore the 16-dimensional representa- tion splits into a 6-dimensional antisymmetric representation Aµν = (T µν − T νµ)/2 and a 10-dimensional µν µν νµ µν µν symmetric representation S = (T + T )/2. But the trace of the tensor trT = ηµν T = ηµν S is a µν 0µν µ ν ρσ ρσ scalar: indeed trT = ηµν S → ηµν S = ηµν Λ ρΛ σS = ηρσS is Lorentz-invariant. Therfore the 10- µν dimensional representation is also reducible into a 1-dimensional (scalar trT = ηµν S ) and a 9-dimensional (traceless symmetric rank 2 tensor Sµν − ηµν trT/4) representations. This can be summarized by saying that 4⊗4 = 1⊕6⊕9, i.e. the rank 2 tensor representation decomposes into 3 irreducible representations: a scalar (the trace), an antisymmetric part and a traceless symmetric part. For more details, see Maggiore [2] page 20. A particular symmetric rank 2 tensor is the metric tensor ηµν . It has a specific property: under a Lorentz 0 ρ σ α σ transformation ηµν → ηµν = Λµ Λν ηρσ = ηµν by virtue of equation (2.16), ηρσ = ηαβΛ ρΛ β, which defines µ ρ µ Lorentz transformations, and the fact that Λ ρΛν = δν . The metric tensor has the same expression in every frame: it is called an invariant tensor. Another tensor with a similar property is the totally antisymmetric rank 4 tensor αβγδ such that 0123 = +1 (also known as the Levi-Civita symbol). As an exercise, show that it is invariant under Lorentz transformations. There is an important difference between the Levi-Civita symbol with 3 and 4 indices. Indeed, with three indices, ijk = +1 if and only if (ijk) is an even permutation of (123) (i.e. if it is ob- tained from an even number of transposition of (123)). This is the same as saying that (ijk) is a circular permutation of (123). With four indices, µνρσ = +1 iff (µνρσ) is an even permutation of (0123). But this is not the same as saying that (µνρσ) is a circular permutation of (0123). As a counter-example, show that 3012 = −1: even if (3012) is a circular permutation of (0123), it is an odd permutation of (0123). Pay also αβγδ attention to the fact that µνρσ ≡ ηµαηνβηργ ησδ and the product of four η’s gives −1. In other words µνρσ ijk µνρσ = − . Whereas ijk =  .

2.3 Poincar´egroup and field representations

After having studied the homogeneous Lorentz group consisting of rotations and boosts, we want to study the inhomogeneous Lorentz group also known as the Poincar´egroup and denoted ISO(3, 1). In addition to rotations and boosts, this group contains translations in spacetime. There are 4 extra generators corre- sponding to the three translations in space and to time translation. Therefore the Poincar´egroup has 10 generators. 0µ µ ν µ An element of the group is noted g = (Λ, a) such that x = Λ ν x + a . Show that the composition law is (Λ˜, a˜)(Λ, a) = (ΛΛ˜ , Λ˜a +a ˜). Show that Lorentz transformations (Λ, 0) form a subgroup, i.e. SO+(3, 1) and that spacetime translations (I, a) constitute a subgroup usually called R1,3. The Poincar´egroup is then seen as the semi-direct product of these two subgroups: ISO(3, 1) = SO+(3, 1) × R1,3.

26 2.3.1 Spacetime translations and field representations The aim is now to find the 10 generators of the Poincar´egroup. As we already know the 6 generators of the Lorentz group, we will concentrate on the 4 generators of the spacetime translation group. The parameter space for the latter is R4, which is not compact. It turns out that a representation of the spacetime translations can therefore only be found using fields (i.e. functions of spacetime) rather than multiplets with a discrete and finite number of components27. In other words, spacetime translations will be represented by operators (such as ) and not by matrices. Consider the translation (I, a) acting as x0µ = xµ + aµ on coordinates. What is its action on a spacetime function φ(x) 28? It is simply φ0(x0) = φ(x). Indeed, in the passive viewpoint, the point of spacetime at which the field φ is evaluated is unchanged only its coordinates are changed from x to x0 by the transformation. In other words, every field behaves as a “scalar” under a µ 0 0µ 0 µ µ 0 µ ν 0 µ spacetime translation. In an infinitesimal translation, φ(x ) = φ (x ) = φ (x +a ) ≈ φ (x )+a ∂ν φ (x ) ≈ 0 µ ν µ µ 0 ν ν φ (x ) + a ∂ν φ(x ) when a → 0. Therefore φ (x) ≈ φ(x) − a ∂ν φ(x) = [1 − a ∂ν ]φ(x), from which the ν d[1−a ∂ν ] generators are obtained as usual from −i daν |aν =0 = i∂ν . The generators of translations are usually called Pν ≡ i∂ν (2.32) ν and there are indeed four of them. A finite translation is represented by φ0(x) = eia Pν φ(x). Because ∂µ∂ν = ∂ν ∂µ, the algebra of translation generators is trivial [Pµ,Pν ] = 0 and the translation group is 29 0 abelian . Later, we will recognize P = i∂t as the Hamiltonian operator, P~ = −i∇~ as the momentum operator and therefore P µ = (P 0, P~ ) as the 4-momentum operator. An important point to understand is that in a field representation we are comparing φ0(x) with φ(x) rather than with φ0(x0). To make the discussion more general, consider a field of a more complex nature having I internal degrees of freedom: φ (x) with I = 1, ..., NI labeling internal degrees of freedom. For example, the field could be a 4-vector (then NI = 4) or a Weyl spinor (then NI = 2) or a Dirac spinor (then NI = 4) etc. If we were comparing φI (x) with φ0I (x0) we would be studying a fixed point (or event) of spacetime (in the µ 0µ passive viewpoint, the event is fixed but its coordinates are changing from x to x ) and how the NI degrees I of freedom φ are mixed by the transformation. In the case of a field which is itself a scalar (NI = 1), the field is invariant, there is a single degree of freedom and therefore φ0(x0) = φ(x). In a more general case of a field 0I 0 I J with NI > 1, the internal components are mixed in a transformation φ (x ) = [S(g)] J φ (x) where g is an element of the Poincar´egroup and S(g) is a NI × NI matrix. Doing this, there would be no interest of using fields φI (x) rather than similar objects φI without the dependence on spacetime coordinates. Therefore, what we are now doing is to compare φ0I (x) with φI (x) i.e. comparing the transformed field at another point of spacetime (having the same coordinates in two different frames) with the original field at the original point of spacetime. Because we are going from one event to another, there is now an infinite number of degrees of freedom involved (even if NI = 1). And hence, the field representation is of infinite dimension. See the corresponding discussion on page 30 of Maggiore [2].

2.3.2 Lorentz transformation of a scalar field Here we want to study how a Lorentz transformation (Λ, 0) acts on fields. Let’s first consider a scalar field (i.e. a field that is itself a scalar rather than a vector or a spinor). One has φ0(x0) = φ(x), as before and 0µ µ ν 0µ µ µ by definition of a scalar field, except that now x = Λ ν x instead of x = x + a . In an infinitesimal 0µ µ µ ν µ µ µ transformation x ≈ x + ω ν x (when discussing the Lorentz group, we saw that Λ ν ≈ δν + ω ν ), 0 0µ 0 µ ρ ν 0 µ 0 µ ρ ν µ 0 0 therefore φ (x ) ≈ φ (x ) + ω ν x ∂ρφ (x ) ≈ φ (x ) + ω ν x ∂ρφ(x ). Using φ (x ) = φ(x), we obtain 0 ρ ν ρν ρν νρ φ (x) ≈ φ(x) − ω ν x ∂ρφ(x) = [1 − ω xν ∂ρ]φ(x). As ω = −ω is antisymmetric, we can keep only 0 ρν 0µ the antisymmetric part of xν ∂ρ. Therefore φ (x) ≈ [1 − ω (xν ∂ρ − xρ∂ν )/2]φ(x). This is similar to x ≈

27Under translations, objects such as scalars, 4-vectors, spinors, etc. are all left unchanged. That’s the reason why we have to search for different type of objects to build representations of the spacetime translation group. 28For shortness, we write φ(x) instead of φ(t, x, y, z) or φ(xµ). 29All irreducible representations of an abelian group are one-dimensional.

27 µ i ρσ µ ν x − 2 ωρσ(J ) ν x , and we can read the generators from the previous expression

Lνρ ≡ xν i∂ρ − xρi∂ν = xν Pρ − xρPν . (2.33) Later, we will identify Lij as the (extrinsic or orbital) angular momentum operator. For a finite transforma- 0 ωµν µν  µν µν νµ tion, we have φ (x) = exp −i 2 L φ(x). From its definition, L is antisymmetric L = −L and there- fore contains only 6 independent entries corresponding to the 6 generators (3 boosts: Ki = L0i = x0P i −xiP 0 i ijk i.e. K~ = tP~ − H~x and 3 rotations: L =  Ljk/2 i.e. L~ = ~x × P~ ) of the Lorentz group.

2.3.3 Poincar´ealgebra We are now in position to obtain the complete Poincar´ealgebra. We already know the spacetime translation algebra [P µ,P ν ] = 0 (2.34) and also the Lorentz algebra (see the section on the Lorentz group) [Lµν ,Lρσ] = i (−ηµρLνσ − ηνσLµρ + ηνρLµσ + ηµσLνρ) (2.35) What remains to be computed is the commutator of P µ and Lρσ:[P µ,Lρσ]φ(x) = i∂µ(xρi∂σ −xσi∂ρ)φ(x)− (xρi∂σ − xσi∂ρ)i∂µφ(x) = i(ηµρP σ − ηµσP ρ)φ(x). Therefore [P µ,Lρσ] = i(ηµρP σ − ηµσP ρ) (2.36) which tells us that P µ behaves as a 4-vector under Lorentz transformations. These three equations constitute the Poincar´ealgebra written in covariant notation. It can also be more explicitly written as Li,Lj = iijkLk, Li,Kj = iijkKk, Li,P j = iijkP k (2.37) Ki,Kj = −iijkLk, P i,P j = 0, Ki,P j = iP 0δij (2.38) Li,P 0 = 0, P i,P 0 = 0, P 0,P 0 = 0, Ki,P 0 = iP i (2.39)

i ijk i 0i with L =  Ljk/2 and K = L .

2.3.4 Lorentz transformation of a 4-vector field I Let us consider a more involved situation with a field with internal indices φ (x), where I = 1, ..., NI . Under a µ I 0µ µ ν 0I 0 I J Lorentz transformation g = (Λ, 0), both x and φ are transformed: x = Λ ν x and φ (x ) = [S(g)] J φ (x) I I where S(g) is an NI × NI matrix (for example if φ is a left Weyl spinor, NI = 2 and S(g) = ΛL; if φ is a 0I I scalar, NI = 1 and S(g) = 1). The question we ask is: how is φ (x) related to φ (x)? To be more concrete, we consider the example of a 4-vector field φµ(x) (here the internal index I is 0µ 0 µ ν µ µ ν called µ and NI = 4). In an infinitesimal transformation, φ (x ) = Λ ν φ (x) ≈ φ (x) + ω ν φ (x) (i.e. µ µ 0µ µ µ ν µ ρσ µ µ µ [S(g)] ν = Λ ν here) where x ≈ x + ω ν x . Here Λ ν = [exp (−iωρσS /2)] ν ≈ δν + ω ν where ρσ µ ρµ σ σµ ρ ρσ ρσ [S ] ν = i(η δν − η δν ) (note that S was previously called J ). Doing a Taylor expansion, we find 0ρ µ µ ν 0ρ µ µ ν ρ µ 0ρ 0µ ρ µ ρ ν µ 0ρ µ ρ µ φ (x + ω ν x ) ≈ φ (x ) + ω ν x ∂µφ (x ). But φ (x ) ≈ φ (x ) + ω ν φ (x ) so that φ (x ) ≈ φ (x ) + ρ ν µ µ ν ρ µ I i µν µν ρ σ µ µν µν µν ω ν φ (x )−ω ν x ∂µφ (x ) = [ − 2 ωµν (S +L )] σφ (x ). By definition we now call J = S +L the total generator of Lorentz transformations. It has two parts: Sµν generates the transformation on the nature of the field (it is the intrinsic generator of Lorentz transformations) whereas Lµν = xµP ν − xν P µ generates the transformation on the field as being a function of spacetime (it is the extrinsic generator of Lorentz transformations). This should be familiar from the decomposition J~ = S~ +L~ of the total angular momentum operator into its intrinsic (spin) and extrinsic (orbital) parts in non-relativistic quantum mechanics.

2.3.5 Lorentz transformation of a spinor field Exercise: write the two equations (2.40) and (2.41) in the case of a left Weyl spinor and also in the case of a Dirac spinor in order to identify the corresponding Sµν .

28 2.3.6 Summary In the case of an arbitrary field φI (x), at the same point in spacetime, we have

ωµν µν 0I 0 I J −i 2 S φ (x ) = [S(Λ)] J φ (x) where S(Λ) = e (2.40)

µν ρµ σ σµ ρ µν with S depending on the representation (0 for a scalar field, i(η δν − η δν ) for a 4-vector field, σ /2 = µ ν [γ , γ ]/4 for a Dirac spinor field, S(Λ) = ΛL for a left Weyl spinor field, etc). And at the same coordinates (but at different points in spacetime), we have

ωµν µν 0I −i 2 J I J µν µν µν φ (x) = [e ] J φ (x) where J = S + L (2.41) where Lµν is always given by xµP ν − xν P µ with P µ = i∂µ.

2.3.7 Representations of the Poincar´egroup on single particle states A subject that would be worth studying here, especially in preparation of the appearance of particles as excitation quanta of the fields, – but which we omit because we feel it does not belong to a classical (i.e. non- quantum) discussion of the Poincar´egroup – is the representation of the Poincar´egroup on single particle states in Hilbert space (see e.g. Maggiore [2] pages 36-40 or Weinberg [13], pages 62-74). This is a famous work of Wigner (1939). He showed that irreps are classified by the mass and the spin of particles. More precisely, for a massive particle, irreps are classified by spin j = 0, 1/2, 1, 3/2, ... and have a dimension 2j +1. And for a massless particle, irreps are classified by the helicity h = 1/2, 1, ... and have dimension 1. In case parity is conserved, one may build two-dimensional representations by grouping h = +1 and h = −1 for the photon for example.

2.4 Conclusion

Scalars, vectors, spinors, tensors, etc. are simply natural objects (i.e. irreducible representations of the of space-time) that emerge from the geometry of space-time. They have well-defined transformation properties under the symmetries of space-time. Almost no physics at this point. At least no dynamics, no particles, etc.

[end of lecture #4]

29 Chapter 3

Classical fields, symmetries and conservation laws

A. Zee: “Einstein’s legacy: Symmetry dictates design.”

The idea now is to use our knowledge of spacetime symmetries to build relativistic classical field theories. We will see that a field theory is conveniently specified by giving its action S = R d4xL in terms of its Lagrangian density L. To require that a field theory has a certain symmetry (for example Poincar´esymmetry) amounts to designing an action that is invariant under Poincar´etransformations. In others words, the action should be a Poincar´e-scalar.The Lagrangian density should therefore be a Lorentz-scalar and should not explictly depend on the spacetime coordinate x (in order to have translational invariance). Then the corresponding action (and therefore the field theory) will be invariant under Poincar´etransformations, i.e. automatically incorporate the physical facts that space is isotropic, homogeneous, that there is no prefered time origin, that relative motion is undetectable (relativity principle), etc.

3.1 Lagrangian formalism

For this section, it would be good to have analytical mechanics fresh in your mind. The essential forward step here is to go from analytical mechanics of a discrete and finite number of degrees of freedom to that of fields.

3.1.1 Action and Lagrangian To learn relativistic field theory, we will most often use the simplest example of a real scalar field φ(xµ). Therefore φ ∈ R and φ(x) → φ0(x0) = φ(x) under a Lorentz transformation. When we want to be slightly µ more general, we will take a field with internal indices φI (x ), where I = 1, ..., NI . The action Z Z S = dtL = d4L (3.1) R is a functional S[φ] of the field φ, which is written in terms of the Lagrangian L or of the Lagrangian density L. The is over a very large portion R of Minkowski spacetime M (eventually, all of it). A functional is a machine that eats a function (not just a number) and gives a number as an output. It should clearly be distinguished from a function1. The Lagrangian density (which will be called Lagrangian in the following) should be a Lorentz scalar, should not depend explictly on xµ (so as to have translation invariance), should be built out of local fields

1Indeed S[φ] depends on the whole function φ(x) and not just on the value of the function φ at a single given point x.A functional is not a function of a function in the sense of f(g(x)) = f ◦ g(x).

30 1 µ and is usually taken to depend on the field and its first derivatives L(φ, ∂µφ). For example, L = 2 ∂µφ∂ φ 1 describes a free (i.e. non-interacting, i.e. quadratic) scalar field. The coefficient 2 is conventional. Note that 1 ˙2 1 ~ 2 the Lagrangian looks like kinetic energy minus elastic energy: L = 2 φ − 2 (∇φ) . A generalization is to add an extra V (φ): 1 L = ∂ φ∂µφ − V (φ). (3.2) 2 µ 2 1 µ 1 2 2 2 If one insists on staying quadratic (free field) then V (φ) ∝ φ : L = 2 ∂µφ∂ φ − 2 m φ . At this point m is just the name of a constant (later m will be interpreted as a mass). Note that φ2 is a scalar, as well as µ ∂µφ∂ φ.

3.1.2 The equations of motion of the field are obtained from the variational principle (least action, Hamilton): the 0 2 action should be stationary δS = 0 when we make a small variation of the field δφI (x) ≡ φI (x) − φI (x) . Note the position of the prime and x is a short notation for xµ. Later we will need to introduce another type 0 0 0 of variation ∆φI (x) ≡ φI (x ) − φI (x) that should not be confused with δφI (x). When φI (x) → φI (x) the 0 R 4 0 0 action varies from S[φI (x)] → S[φI (x)] = S[φI (x)]+δS. Therefore δS = R d x[L(φI , ∂µφI )−L(φI , ∂µφI )] = R 4 h ∂L ∂L i 3 d x δφI + ∂µδφI . Integration by part allows us to express the integrand in terms of δφI only R ∂φI ∂(∂µφI ) R 4 h ∂L ∂L i 4 and we obtain δS = d x − ∂µ( ) δφI . By definition , the functional derivative of the action R ∂φI ∂(∂µφI ) δS ∂L ∂L is = − ∂µ( ). When it vanishes, we find the Euler-Lagrange (EL) equations of motion: δφI ∂φI ∂(∂µφI )

∂L  ∂L  − ∂µ = 0 (3.4) ∂φI ∂(∂µφI )

cl A solution of this equation is called a classical field φI . 1 α m2 2 As an example, take L = 2 ∂αφ∂ φ − 2 φ − U(φ), where U(φ) is a polynomial in φ with powers greater 3 4 µ 2 dU 2 3 than 2: U(φ) = aφ + bφ + .... Show that the EL equation is (∂µ∂ + m )φ = − dφ = −3aφ + 4bφ + .... µ A first thing to note is that this equation is linear in φ if U = 0. The operator ∂µ∂ is often called the d’Alembertian operator and denoted by a square  or by ∂2. If U = 0, one obtains the so-called Klein- Gordon (KG) equation ( + m2)φ(xµ) = 0. If in addition m = 0, then one obtains the d’Alembert wave equation φ = 0 familiar from the study of waves on string, sound waves and light waves. It describes the propagation of a wave at the velocity of light c = 1. A solution of the KG equation as a propagating wave µ ~ µ −ikµx −iωt ik·~x µ 0 ~ ~ φ(x ) = φ0e = φ0e e , where k = (k , k) = (ω, k), is easily found by and yields the following dispersion relation ω2 = ~k2 + m2 of relativistic flavor.

2 µ One should think of x here as an index labelling degrees of freedom. Just like j in qj labels the number of degrees of µ µ freedom in classical mechanics. In other words φ(x ) is something like q(j). At each x , there is a finite number NI of degrees µ of freedom. What we want to vary is φI not x . 3 R 4  ∂L  We have assumed that boundary terms vanish. Indeed, using Gauss’ theorem d x∂µ δφI = R ∂(∂µφI ) R 3 ∂L d Sµ δφI = 0 if δφI = 0 at the boundary ∂R of R. ∂R ∂(∂µφI ) 4The Taylor espansion of a functional reads: Z δS 1 Z Z δ2S S[φ(x) + δφ(x)] = S[φ(x)] + d4x δφ(x) + d4x d4x0 δφ(x)δφ(x0) + ... (3.3) δφ(x) 2 δφ(x)δφ(x0) This gives a practical definition of the successive derivatives of a functional.

31 3.1.3 Conjugate field and Hamiltonian

Given a field φI (x) (i.e. the equivalent of a position q in classical mechanics), the canonically conjugate field ΠI (x) (i.e. the equivalent of a momentum p = ∂L/∂q˙) is defined by ∂L Π (x) = ΠI (x) ≡ (3.5) φI ˙ ∂φI (x) ˙ where φI ≡ ∂tφI . The Hamiltonian density (often simply called Hamiltonian) is I ˙ H = Π (x)φI (x) − L (3.6) I ˙ and should be expressed in terms of φI and Π rather than φI and φI (this is easy to remember as H(q, p) = R 3 ˙ pq˙ − L(q, q˙)). The Hamiltonian (not density here) is H = d xH. Indeed in a Legendre transform L[φI , φI ] I R 3 I ˙ ˙ 1 µ becomes H[φI , Π ] = d xΠ φI − L[φI , φI ]. With our favorite example, L 2 ∂µφ∂ φ − V (φ), the conjugate ˙ 1 2 1 ~ 2 field Π = φ and H = 2 Π + 2 (∇φ) + V (φ). The Hamiltonian Z 1 1  H = d3x Π2 + (∇~ φ)2 + V (φ) (3.7) 2 2 is recognized as the sum of kinetic energy, elastic energy and potential energy.

3.2 Symmetries and conservation laws

We will see that continuous and global symmetries of the action imply conservation laws. This is a funda- mental result that was obtained by around 1918.

3.2.1 Symmetries of the action µ We start again from an action functional S[φI (x )] where I = 1, ..., NI labels the components of the field whatever their origin (whether due to spacetime symmetries as for a 4-vector field Aν or a Weyl spinor field ψ, or due to a truly internal symmetry, as in the case of a complex scalar field that can be decomposed in two real fields – real and imaginary parts –, NI = 2, with an SO(2) symmetry). A symmetry is a transformation that leaves the action invariant, even if φI (x) does not satisfy the EL equations of motion, i.e. for any φI (x). We consider a transformation that acts as xµ → x0µ = xµ + δxµ 0 0 φI (x) → φI (x ) = φI (x) + ∆φI (x) (3.8) 0 0 0 Note the important point that ∆φI (x) ≡ φI (x )−φI (x), which is different from δφI (x) ≡ φI (x)−φI (x). This notations are not standard and vary from book to book. By definition, there is not point in distinguishing ∆xµ and δxµ which are equal. We now make the transformation continuous and infinitesimal. We assume a that it depends on Na independent real parameters  (labelled by a = 1, ..., Na, where Na is the number of generators of the group of transformations):

µ 0µ µ a µ x → x = x +  Aa (x) (3.9) 0 0 a φI (x) → φI (x ) = φI (x) +  FI,a(φ, ∂φ) (3.10) with a → 0. 0 0 a If the transformation (3.9)+(3.10) is a symmetry, then ∆S ≡ S[φI (x )] − S[φI (x)] = 0. If  does not depend on xµ, the symmetry is said to be global. If a(x) 6= const. does depend on the spacetime event, 0µ µ µ the symmetry is said to be local. An internal symmetry is such that x = x so that Aa = 0 in (3.9) and only φ is affected. Then d4x0 = d4x so that ∆S = R d4∆L and ∆S = 0 becomes equivalent to having an µ µ invariant Lagrangian density ∆L = 0. A spacetime (or external) symmetry affects x (i.e. Aa 6= 0) and also φI (unless φI = φ is a scalar field).

32 3.2.2 Proof of Noether’s theorem We first present an elegant but somewhat tricky proof (apparently due to ). We proceed in five steps: Step 1: Suppose the action is invariant under a global (a = const.) transformation such as (3.9)+(3.10) (i.e. continuous and infinitesimal). Step 2: Now, consider the same transformation (3.9)+(3.10) but with a(x) 6= cst. In addition, we assume a a that not only  → 0 but ∂µ → 0 (the transformation is only slightly non-global in space and time). This is no longer a symmetry, so that ∆S 6= 0. However, because the transformation is infinitesimal and slowly spacetime varying, we can make a gradient expansion: Z 4 a a µ 2 ∆S = d x [ (x)Ka(φI ) − ∂µ (x)Ja (φI )] + O(∂∂,  , ∂) (3.11)

µ where we only kept terms linear in  or ∂; Ka(φI ) and −Ja (φI ) are just the name given to the unknown expansion coefficients. a a R 4 Step 3: Now, if  (x) = const., we have ∆S = 0 =  d xKa(φI ) for all φI (x) (remember that for a symmetry, the action is invariant for all φI (x)). Therefore Ka(φI ) = 0 for all φI . At this point, Z 4 a µ 2 ∆S = − d x∂µ (x)Ja (φI ) + O(∂∂,  , ∂), (3.12)

which we transform by an integration by part into Z 4 a µ 2 ∆S = d x (x)∂µJa (φI ) + O(∂∂,  , ∂) (3.13)

which still corresponds to a case where a(x) 6= cst 5. cl Step 4: We also know that δS = 0 if φI (x) = φI (x) (i.e. the action is invariant at first order when the field satisfies the EL equations of motion). In addition, it turns out that ∆S = δS because x0 is just a dummy variable in S[φ0(x0)] = R d4x0L(φ0(x0), ∂φ0(x0)) = R d4xL(φ0(x), ∂φ0(x)) = S[φ0(x)]. Therefore 0 0 0 cl cl ∆S = S[φ (x )] − S[φ(x)] = S[φ (x)] − S[φ(x)] = δS. We conclude that ∆S[φI ] = δS[φI ] = 0 even if the transformation is not a symmetry of the action. R 4 a µ cl Step 5: Applying the previous result to Eq. (3.13), we conclude that d x (x)∂µJa (φI ) = 0 for all a(x) such that a → 0 and ∂a → 0. This can only be true if

µ cl ∂µJa (φI ) = 0 (3.14) This is the proof that a divergenceless current exists as a consequence of a continuous and global symmetry. cl Note that it is only for a field that satisfies the EL equations of motion (i.e. for a classical field φI ) that the current is conserved. To summarize, we have found that for each generator (labelled by a) of a global and continuous symmetry µ cl group, there is a conserved (meaning divergenceless in the present context) current Ja (φI ) usually noted µ Ja (x). Equation (3.14) should be familiar from the , e.g. local in µ µ electromagnetism ∂µJ = ∂tρ + ∇~ · J~ = 0 where J = (ρ, J~), ρ is the electric and J~ the electric . The quantity Z 3 0 Qa ≡ d xJa (t, ~x) (3.15) is called a (Noether) charge. It is also said to be conserved (meaning time-independent here) because when integrated over most of spacetime R Z Z Z dQa 3 0 3 i 2 i = d x∂0Ja (x) = − d x∂iJa(x) = − d SiJa(x) = 0, (3.16) dt R R ∂R 5 a a R 4 µ 2 R 4 µ If here, we assume again that  (x) = cst, we find that ∆S =  d x∂µJa (φI ) + O( ) and as R d x∂µJa (φI ) = R 3 µ ∂R d SµJa (φI ) vanishes as a boundary term, we just recover that ∆S = 0 for any φI which is the definition of a symmetry.

33 where we used Gauss’ theorem and the fact that the field (and therefore the current) vanishes sufficiently fast at the boundary ∂R. This is now a global statement. For each generator of the continuous and global symmetry group, there is a quantity that when computed for the whole spacetime (?) is a constant. Remember electric charge conservation: Q = R d3xρ is supposed to be a constant in the entire universe. [end of lecture # 5]

3.2.3 Explicit form of the Noether current In this section, we present an alternative, less elegant, proof of Noether’s theorem. The main advantage is that we will obtain an explicit expression for the (in the previous proof, we only showed the existence of such a current, but we had no concrete way of finding it). We start from an action R 4 a S = d xL(φI , ∂φI ) and apply a transformation (3.9)+(3.10) with  (x) 6= const. The trick is to compute a ∆S to first order in ∂ in order to identify Jµ by comparing with equation (3.12). We will therefore discard terms like , 2 or ∂∂. The first step is Z Z   4 4  4 4 ∂L 4 ∂L ∆S = ∆(d x)L + d x∆L = ∆(d x)L + d x ∆φI + d x ∆(∂µφI ) (3.17) ∂φI ∂(∂µφI )

a a From (3.10), we see that ∆φI =  FI,a is proportional to  and is therefore an uninteresting term. For 0 0 later purpose, we nevertheless compute ∂µ(∆φI ), which is not that easy. Starting from ∆φ = φ (x ) − 0 a 0 0 a a φ(x) = φ (x +  (x)Aa(x)) − φ(x), we obtain ∂(∆φ) = [∂φ ](x )[1 + ∂ Aa +  ∂Aa] − ∂φ. Next we use 0 0 a a ∂[φ (x )] = ∂[φ(x) +  (x)Fa] = ∂φ(x) + Fa∂ + ... where the dots denote uninteresting terms proportional a ν to . In the end, ∂µ(∆φI ) = (∂µ )[Aa∂ν φI + FI,a] + O(, ∂∂). 4 We still have to compute ∆(d x) and ∆(∂µφI ), which will be seen to be 6= ∂µ(∆φI ). To make it simple, 0 0 a a let us start with a 1d version ∆(dx) ≡ dx − dx with x = x +  (x)Aa(x). Then ∆(dx) = (∂x )Aadx + a a  (∂xAa)dx = dx(∂x )Aa + ... as we are only interested in the first term (the one proportional to ∂). 4 4 0 4 4 a µ Therefore in 4d spacetime, we have ∆(d x) = d x − d x = d x(∂µ )Aa + .... 0 ν 0µ ∂φI ∂φI ∂x ∂ a ∂φI ∂x ∂ µ a µ Next, we turn to ∆(∂µφI ) ≡ ∂x0µ − ∂xµ = ∂x0µ ∂xν (φI +  FI,a) − ∂xµ . But ∂xν = ∂xν (x +  Aa ) = µ µ a a µ ∂xν ν ν a a ν δν +Aa ∂ν  + ∂ν Aa that we need to invert. Check that ∂x0µ = δµ −Aa∂µ − ∂µAa is indeed the inverse. ν ν a a ν a a ν a From there, ∆(∂µφI ) = (δµ−Aa∂µ − ∂µAa)(∂ν φI +FI,a∂ν  + ∂ν FI,a)−∂µφI = [FI,a−Aa∂ν φI ]∂µ +..., a ν a which is indeed not equal to ∂µ(∆φI ) = ∂µ [FI,a + Aa∂ν φI ]∂µ + .... R 4 ∂L We can now gather everything and obtain the order ∂ term in the action change ∆S = − d x[Aaν ∂ν φI − ∂(∂µφI ) µ ∂L a LA − FI,a ]∂µ + .... From there we identify the Noether current as a ∂(∂µφI )

  µ ∂L µ ν ∂L µ ν ∂L Ja (φI ) = ∂ν φI − Lδν Aa − FI,a = θ ν Aa − FI,a (3.18) ∂(∂µφI ) ∂(∂µφI ) ∂(∂µφI ) where we defined the quantity µ ∂L µ θ ν = ∂ν φI − Lδν (3.19) ∂(∂µφI ) µ later to be identified as the energy-momentum tensor. Remember that this current is conserved (i.e. ∂µJa = cl 0) only when computed on the solution of the EL equations of motion φI = φI .

3.2.4 Examples To get more familiar with Noether currents, we consider a few concrete examples.

Internal global symmetry

We have in mind the case of an internal global symmetry such as SO(N) where I = 1, ..., NI = N having µ a Na = N(N − 1)/2 generators (a = 1, ..., Na). Then Aa = 0 and ∆φI = δφI =  FI,a here so that

34 µ ∂L 6 0 a J J = FI,a . Most often the symmetry is linear and therefore φ = φI +  (Ta) φJ where for each a ∂(∂µφI ) I I J a, Ta is an NI × NI matrix and FI,a = (Ta)I φJ . The Ta’s are actually the generators of the symmetry c c group and they satisfy a Lie algebra [Ta,Tb] = ifab Tc, where fab are the so-called sructure constants of the algebra7. Therefore, the Noether current can be taken as

µ ∂L J Ja = (Ta)I φJ (3.20) ∂(∂µφI )

R 3 0 R 3 ∂L J R 3 I J and the corresponding charges are Qa = d xJa = d x (Ta)I φJ = d xΠ (Ta)I φJ . ∂φ˙I Exercise (the solution is given below in the section on the complex scalar field): The simplest example 1 µ of that type is probably the case of two real scalar fields φ1 and φ2 with the Lagrangian L = 2 ∂µφ1∂ φ1 − 2 2 m1 2 1 µ m2 2 2 (φ1) + 2 ∂µφ2∂ φ2 − 2 (φ2) . Show that this Lagrangian has an SO(2) symmetry when m1 = m2 = m (which would be absent had we introduced different coefficients m1 6= m2). How many generators? How many conserved currents? Expression of the conserved currents?

Spacetime translations

µ 0µ µ µ µ σ µ Under a spacetime translation x → x = x + a = x + a δσ and every field behaves as a scalar 0 0 σ a φI (x ) = φI (x). Therefore FI,a = 0, whereas a plays the role of  (the a index is here σ = 0, 1, 2, 3) and µ µ µ ν µ µ ∂L µ δ that of A . Therefore, the Noether current is J = δ θ = θ = ∂σφI − Lδ which is called the σ a σ σ ν σ ∂(∂µφI ) σ energy-momentum tensor (also known as the stress-energy tensor). It is usually written as

µν µ σν ∂L ν µν θ = θ ση = ∂ φI − Lη (3.21) ∂(∂µφI )

µ µ µ µ We write θ ν rather than θν because it is not obvious whether θ ν = θν ? But there are good reasons – that will be spelled out later when discussing angular momentum – to ask that θµν = θνµ. It is actually µν µν µν µν ρµν possible to define a symmetric energy-momentum tensor T from θ . Indeed, let T ≡ θ + ∂ρA ρµν µρν µν νµ µν µν where A = −A such that T = T . Check that ∂µT = 0 if ∂µθ = 0. Check also that the conserved charges P ν = R d3xT 0ν associated to the symmetric energy-momentum tensor are the same as that R d3xθ0ν associated to θµν . The conserved charge is called the 4-momentum Z Z ν 3 0ν 3 I ν 0ν P = d xθ = −i d xΠ i∂ φI − η L (3.22)

0 R 3 I ˙ R 3 R 3 I One recognizes the energy H = P = d xΠ φI − L = d H = −i d xΠ (i∂t)φI − L and the momentum R 3 I P~ = −i d xΠ (−i∇~ )φI , where −i∇~ is the generator of space translations (later to be identified as the momentum operator) and i∂t is the generator of time translations (later to be identified as the Hamiltonian operator).

Lorentz transformations

0µ µ ν µ µ ν Here, for simplicity, we restrict to a scalar field. A Lorentz transformation act as x = Λ ν x ≈ x + ω ν x 0 0 µ and φ (x ) = φ(x) (scalar field). Therefore, here Fa = 0 and we have a little bit of work to identify Aa , where a labels the generators of the Lorentz group. For an infinitesimal transformation, we have shown µ µ µ µ µσ µ ρσ a ρσ µ that Λ ν = δν + ω ν . Therefore δx = xσω = δρ xσω from which we identify  as ω and Aa µ ρσ σρ as δρ xσ. Remember that ω = −ω is antisymmetric and therefore a = (ρσ) only takes 6 different µ 1 µ µ ρσ values (01,02,03,12,13,23). Because of antisymmetry, we can also write δx = 2 (δρ xσ − δσ xρ)ω and µ 1 µ µ A (ρσ) = 2 (δρ xσ − δσ xρ) (the notation meaning that this tensor is antisymmetric with respect to its last

6 µ µ The minus sign is unimportant here and we dropped it: if Ja is a conserved current so is −Ja . 7 For example, in the case of SO(3), the Lie algebra is [Ji,Jj ] = iijkJk and the structure constants are just given by the 3d Levi-Civita symbol.

35 µ µ µ µ ν 1 µ µ two indices A ρσ = −A σρ). The Noether current is then J (ρσ) = θ ν A (ρσ) = − 2 (θ ν xρ − θ ρxσ). The divergenceless current can therefore be taken as

Mµ(ρσ) = xρθµσ − xσθµρ (3.23)

µ(ρσ) µρσ µσρ with ∂µM = 0. As M = −M , there are 6 independent divergenceless currents and hence 6 conserved charges Z Z M ρσ = d3xM0ρσ = d3x(xρθ0σ − xσθ0ρ) (3.24)

such that dM ρσ/dt = 0. These charges are the angular momentum Z ij 3 i j M = −i d xΠ[x (−i∂j) − x (−i∂i)]φ, (3.25)

ij i j 8 where L = x (−i∂j)−x (−i∂i) is the angular momentum operator (generator of rotations) , and a vectorial quantity G~ with no definite name and related to boosts generators (how?): Z Z Gi = M 0i = d3x(xiθ00 − x0θ0i) = d3x[Hxi − tΠ∂iφ] . (3.26)

In other words G~ = −tP~ + R d3x~xH. The conservation of this last quantity is perhaps less familiar then that of angular momentum or energy or linear momentum. Let us give an example in the simplest possible case of one-dimensional of a massive particle: the energy is conserved and H = E = pp2 + m2 (as c = 1). The Hamilton equations of motion arep ˙ = −∂xH = 0 andx ˙ = ∂pH = p/E = v is a constant. ˙ d p Then G = Ex−pt is indeed a as G = dt (Ex−tp) = Ex˙ −p = E E −p = 0. This conserved quantity is related to the uniform motion of the center of mass. Indeed, if at t = 0 the particle is in x = x0 p then the conservation law is G = Ex − pt = Ex0 = const. which means that x − E t = x − vt = x0 = const. The conserved quantity is unusual as it explicitly depends on time (it is also unusual in having no clear name...). In the non-relativistic limit v = p/E  1, a Lorentz boost becomes a Galilean boosts, the energy E ≈ m and the conserved quantity is G = mx−pt. The conservation law is usually written as x−vt = const. See the discussion on the conservation law of the center of mass in §8 of Ref. [14]. Let us now show that invariance under the Poincar´egroup implies that the energy-momentum tensor µνρ has to be a . Invariance under Lorentz transformations implies that ∂µM = 0 = µσ ρ µρ σ µσ ρ µρ σ µν ∂µ(θ x −θ x ) = θ ∂µx −θ ∂µx , where we used that ∂µθ = 0 as a consequence of invariance under ν ν ρσ σρ spacetime translations. Given that ∂µx = δµ, we obtain that 0 = θ − θ , qed. See the symmetrization procedure from θµν to T µν discussed above. Remark (to be skipped in a first reading): another reason for wanting an energy-momentum tensor that is symmetric T µν = T νµ comes from . General relativity is basically the double statement that (1) energy-momentum curves space-time (Einstein field equations) and (2) that in the absence of forces (gravity being considered not as a force here), particles follow in this curved space-time (equation of geodesics). The first equation reads:

1 8πG Rµν − Rgµν = − T µν (3.27) 2 c4 where gµν is the metric for curved space-time (it replaces ηµν defined for flat space-time) and all the quantities called R are curvatures of space-time defined from the metric tensor. The above equation states that space- time curvature (left hand side) is proportional to the energy-momentum tensor (right hand side). The proportionality constant depends on Newton’s gravitation constant G and on the velocity of light c. It turns out that the left hand side is (by construction) symmetric with respect to its two indices µ, ν. Therefore

8M ij is an anti-symmetric tensor and therefore contains 3 independent entries, which is equivalent to an axial vector usually called the angular momentum M~ .

36 the energy-momentum tensor also has to be symmetric9. Actually, when studying field theory in curved spacetime, the definition of the energy-momentum tensor becomes the variation of the action10 with respect to the metric: µν 2 δS δS T = −p ∝ (3.28) − det gµν δgµν δgµν See for example page 78 in Zee [4]. Again, this last definition makes it obvious that the energy-momentum tensor has to be symmetric.

As a summary of this section: the Poincar´egroup has 10 generators, which implies 10 conserved charges and the corresponding divergenceless currents. Invariance under time translation gives the conservation of µν energy H, invariance under space translations gives the conservation of momentum P~ . Overall ∂µT = 0. Invariance under Lorentz transformations give 6 other conserved charges: space rotations implies the conservation of angular momentum M~ (related to the rotation generator L~ ); Lorentz boosts implies the conservation of G~ (related to boosts generators K~ ), known as the conservation of the center of mass. Overall µρσ ∂µM = 0. As a word of caution, we note that the conserved quantities are strongly related to the corresponding generators (and are often called by the same name!) but should still be carefully distinguished from them. R 3 R 3 For example, the energy of the field H = d xH = −i d xΠ(i∂t)φ−L is different from the time translation operator i∂t. Or the momentum of the field P~ is different from the momentum operator (generator of space translations) −i∇~ . This difference is similar to that which exists between the total momentum of a gas of particles [a.k.a. the many-body momentum and equivalent to P~ = −i R d3xΠ(−i∇~ )φ] and the individual momenta [a.k.a. the single-particle momenta and equivalent to eigenvalues of −i∇~ ].

The next sections are about building field theories that respect the spacetime symmetries described above. The strategy is to write actions that are Lorentz invariants so that the EL equations of motion obtained by minimizing the action will automatically be covariant. We will study in turn the scalar (Klein-Gordon) fields, the vector (Maxwell and Proca) fields and the spinor (Weyl and Dirac) fields.

3.3 Scalar fields and the Klein-Gordon equation 3.3.1 Real scalar field One of the simplest Lagrangian for a real scalar field φ(x) ∈ R is 1 m2 1 1 m2 L = (∂ φ)(∂µφ) − φ2 = (∂ φ)2 − (∂ φ)2 − φ2 (3.29) 2 µ 2 2 0 2 i 2 The corresponding EL equation of motion is

µ 2 2 (∂µ∂ + m )φ = ( + m )φ = 0 (3.30) which is known as the Klein-Gordon (KG) equation 11. As we have invariance under space and time trans- µ ~ lation, we can look for a solution in the form of a plane wave φ(xµ) = φ(0)e−ik xµ = φ(0)eik·~x−iωt. Injecting

9 λ λ λ σ λ σ λ More precisely Rµνκ = ∂ν Γµκ − ∂κΓµν + ΓµκΓνσ − Γµν Γκσ is the defined in terms of Christoffel λ 1 λρ λ µν symbols Γµν = 2 g (∂ν gρµ +∂µgρν −∂ρgµν ), Rµν ≡ Rµλν is the tensor and R ≡ Rµν g is the scalar curvature. With hindsight and after studying the electromagnetic field, you will recognize that the Christoffel symbol is essentially a connection just as the 4-vector gauge potential Aµ and that the Riemann curvature tensor is essentially a curvature just as the electromagnetic field strength F µν = ∂µAν − ∂ν Aµ. For more details, see A. Zee [4] chapter VIII.1, page 419. 10 This action S being that for a field in a curved spacetime excluding the gravitational field itself that has its own action Sg. In plain words the total action should be Stot = S + Sg and here we only vary S with respect to the metric gµν and not Stot. 11It was actually first discovered in 1926 by Schr¨odingeras a relativistic version of what is now known as the Schr¨odinger equation. It was then rediscovered by Klein, Gordon and Fock in 1926. It was first considered as a relativistic and quantum equation describing the motion of a single electron rather than as a field equation. It suffers from two major problems as an

37 this ansatz in the KG equation, we find that q 2 ~ 2 2 ~ 2 2 ω = k + m i.e. ω = ωk with ωk ≡ k + m (3.31)

This dispersion relation should be familiar from relativistic mechanics except that it pertains to wave quan- tities (ω and ~k) instead of particle quantities (E and ~p). Therefore m is here a gap or a zero momentum frequency ω0 = m and not yet a mass. Maybe ~ is playing a role in this correspondance? The answer will be found in the next chapter. There is an internal and global symmetry of this Lagrangian. It is given by the discrete transformation φ → −φ. This is known as a Z2 or Ising symmetry. Because this symmetry is discrete and not continuous, it can not be made infinitesimal and does not result in a conserved Noether current. In the next section, we will study a complex (rather than real) scalar field that has a more interesting internal symmetry than Z2. A general solution of the KG equation can be found as a mode expansion thanks to the Fourier transform: R d4k −ik·x 2 2 φ(x) = (2π)4 e φ(k). Injecting in the KG we find that k = m and therefore k0 = ωk (k0 = ω). R d4k −ik·x 2 2 2 2 Therefore a general solution can be written as φ(x) = (2π)4 e ϕ(k)δ(k − m )2π. But δ(k − m ) = 1 P 1 δ((k0 − ωk)(k0 + ωk)) = [δ(k0 − ωk) + δ(k0 + ωk)] using the fact that δ(f(x)) = 0 δ(x − xj) where 2ωk j |f (xj )| 3 ~ ~ R d k ~ −iωkt+ik·~x ~ iωkt+ik·~x xj are all the roots of f, i.e. f(xj) = 0. Then φ(x) = 3 [ϕ(ωk, k)e + ϕ(−ωk, k)e ] = (2π) 2ωk R d3k  −ik·x ik·x 3 ϕ(k)e + ϕ(−k)e where in the last expression it is understood that k0 is not inde- (2π) 2ωk k0=ωk ~ ∗ pendent of k but is actually equal to ωk. We now use the fact that the field is real φ(x) = φ(x) which means R d3k  ∗ ik·x ∗ −ik·x R d3k  −ik·x ik·x ∗ that 3 ϕ(k) e + ϕ(−k) e = 3 ϕ(k)e + ϕ(−k)e so that ϕ(−k) = ϕ(k) . (2π) 2ωk (2π) 2ωk R d3k  −ik·x ∗ ik·x Therefore φ(x) = 3 ϕ(k)e + ϕ(k) e . The usual notation is to call a(k) ≡ ϕ(k) so that the (2π) 2ωk mode expansion reads

Z 3 Z d k  −ik·x ∗ ik·x −ik·x φ(x) = 3 a(k)e + a(k) e = [a(k)e + c.c.] (3.32) (2π) 2ωk k

R R d3k where we introduced the short hand notation ≡ 3 . k (2π) 2ωk Here are a few remarks on the above expression: - The a(k)’s are just the expansion coefficients of φ(x) on the plane wave basis. ~ - The function a(k), despite its name, depends only on k and not separatly on k0 which is fixed to k0 = ωk. d3k R d4k 2 2 - 3 is a Lorentz invariant integration measure as it comes from 3 δ(k − m ) which is obviously (2π) 2ωk (2π) µ 0µ µ ν invariant (show it by computing the Jacobian of the change of variable k → k = Λ ν k where Λ is any Lorentz transform) but d3k is not. - Note that we have made no hypothesis on the sign of the energy, although in the end everything only −iω t iω t depends on k0 = ωk ≥ 0. Indeed e k (positive ) and e k (negative energies) are both present in (3.32). A this point, it is good to recall the energy-momentum tensor of the real scalar field

2 µν ∂L ν µν µ ν 1 α µν m 2 µν θ = ∂ φ − Lη = (∂ φ)(∂ φ) − (∂αφ)(∂ φ)η + φ η , (3.33) ∂(∂µφ) 2 2 which we already computed when discussing Noether’s theorem. From the energy H of the field, identify the different contributions: kinetic energy, elastic energy, local potential energy (harmonic terms for the Lagrangian we have chosen).

equation of quantum mechanics: probability density that is not always positive and negative energy states. Also as an equation describing a single electron it does not include spin and gives energy levels for the hydrogen atom that are not in agreement with experiments. This is why it was discarded by Schr¨odingerwho later realized that the non-relativistic limit was much better behaved.

38 3.3.2 Complex scalar field The field φ is now assumed to be complex. The (free) Lagrangian is taken as

∗ µ 2 ∗ L = (∂µφ) (∂ φ) − m φ φ (3.34)

∗ ∗ Note that (∂µφ) = ∂µφ but it will be useful to pay attention to the position of the complex conjugation ∗ ∗ later when replacing the ∂µ with the covariant derivative Dµ because (Dµφ) 6= Dµ(φ ). This theory is also√ equivalent to that of√ two real fields with a special symmetry relating them. Indeed let ∗ φ = (φ1 + iφ2)/ 2 and φ = (φ1 − iφ2)/ 2, which is just a decomposition of the complex field into its real and imaginary parts (φ1 and φ2 ∈ R). Check that: 1 m2 1 m2 L = (∂ φ )(∂µφ ) − φ2 + (∂ φ )(∂µφ ) − φ2 . (3.35) 2 µ 1 1 2 1 2 µ 2 2 2 2 2 2 2 2 2 The special symmetry is related to having a single coefficient m = m1 = m2 instead of two m1 6= m2. ∗ The independent degrees of freedom are φ1 and φ2, but one often does as if φ and φ could be taken as independent. The EL equations of motion are (check it in two different ways):

( + m2)φ = 0 and ( + m2)φ∗ = 0 . (3.36)

We now study the above-mentioned special symmetry. It is a transformation φ(x) → φ0(x) = e−iχφ(x) that leaves the action (actually the Lagrangian density) invariant. Here χ is a phase, which is independent of the spacetime point x. It is therefore a global (not local) symmetry. In addition it mixes the two internal components of the complex field. It is therefore an internal symmetry. As e−iχ ∈ U(1), it is called a global internal U(1) symmetry. U(1) is an abelian Lie group (it has a single generator) and U(1) ≈ SO(2). The transformation can be thought of as either a global phase change for the complex scalar field φ (notation U(1)) or as a global rotation in the (φ1, φ2) plane around a perpendicular axis (notation SO(2)). Show that the Lagrangian is invariant under the above U(1) transformation. µ From Noether’s theorem for an internal symmetry, there should be a single divergenceless current Ja = ∂L FI,a with here a = 1 as Na = 1 (single generator) and I = 1, 2 as NI = 2 (a complex field is equivalent ∂(∂µφI ) 0 0∗ ∗ to a doublet of real fields). We need to identify FI . From φ ≈ (1 − iχ)φ and φ ≈ (1 + iχ)φ , we find that a ∆φ1 = δφ1 ≈ χφ2 and ∆φ2 = δφ2 ≈ −χφ1. So that  =  = χ → 0, F1 = φ2 and F2 = −φ1. Therefore

µ ∂L ∂L µ µ ∗ µ µ ∗ J = φ2 + (−φ1) = φ2∂ φ1 − φ1∂ φ2 = iφ ∂ φ − iφ∂ φ . (3.37) ∂(∂µφ1) ∂(∂µφ2)

µ ∗ µ ∗ µ ∗ ∗ Let’s check that we have found a conserved current: ∂µJ = i∂µφ ∂ φ + iφ φ − i∂µφ∂ φ − iφφ = iφ∗φ − iφφ∗. This is not generally equal to zero. It only equals zero upon using the EL equations of 2 ∗ 2 ∗ µ ∗ ∗ 2 ∗ 2 ∗ motion φ = −m φ and φ = −m φ as ∂µJ = iφ φ−iφφ = −im φ φ+im φ φ = 0. The conserved R 3 0 R 3 ∗ ∗ R 3 ˙ ˙ charge is Q = d xJ = d x(iφ ∂0φ − iφ∂0φ ) = d x(φ1φ2 − φ2φ1) such that dQ/dt = 0. It is called a ˙ ˙ R 3 U(1) charge. Note that the conjugate fields Π1,2 = ∂L/∂(φ1,2) = φ1,2 so that Q = d x(Π1φ2 − Π2φ1) =           R 3 0 1 φ1 R 3 ∆φ1 0 1 φ1 d x(Π1, Π2) = d xΠI TIJ φJ with ∆φI ≈ χTIJ φJ as ≈ χ = −1 0 φ2 ∆φ2 −1 0 φ2  χφ  2 . −χφ1 Later, we shall do two main things with the complex scalar field: 1) make the internal symmetry local (i.e. “to gauge the symmetry”) instead of global in order to see that it is related to the gauge invariance of the Maxwell equations (to be discussed in the next section) and that the above conserved U(1) charge is actually the electric charge. 2) quantize the free complex scalar field to see that Q is actually an integer. So that the electric charge is quantized (i.e. there is a minimal unit of electric charge).

[end of lecture # 6]

39 3.4 Electromagnetic field and the Maxwell equation

The electromagnetic (or Maxwell) field is a real massless 4-vector field. We will see that it is actually more than that: it is also a gauge field. We start by discussing the covariant formulation of the Maxwell equations.

3.4.1 Covariant form of the Maxwell equations (see also exercise sheet #1) Here we change perspective and do not start by constructing a 4-vector field theory just from symmetries but suppose that we already know the Maxwell equations describing the electromagnetic field. In Heaviside- Lorentz rationalized units (and with c = 1)12, these equations are

(a) ∇~ · B~ = 0 (magnetic monopoles (or magnetic charges) do not exist)

(b) ∇~ × E~ + ∂tB~ = 0 (Faraday: time dependent B field produces an E field) (c) ∇~ · E~ = ρ (Gauss: electric charges exist and are sources of E field)

(d) ∇~ × B~ − ∂tE~ = ~j (Amp`ere+ Maxwell: electric currents and time-dependent E field produce B field) when written in terms of 3-vectors and 3-scalars. The two first (a and b) are the homogeneous Maxwell equations (no sources in the right hand side). The two last (c and d) are the inhomogeneous Maxwell equations (sources in the right hand side, i.e. ρ and ~j). These equations are Lorentz covariant (the Lorentz transformation was actually discovered from them). But they are not manifestly covariant as they are written in terms of irreps of the rotation group, whereas they should be written in terms of irreps of the Lorentz group. We start by introducing the electromagnetic field strength F µν , which is just a smart way of writing the electric and magnetic fields together in a single object. It is an anti-symmetric rank 2 tensor, whose 6 independent components are F 0i = −Ei and F ij = −ijkBk. Or the other way around: Ei = −F 0i and Bi = −ijkF jk/2. Written in matrix form, the field strength is:

 0 −E1 −E2 −E3  1 3 2 µν  E 0 −B B  F =   (3.38)  E2 B3 0 −B1  E3 −B2 B1 0

We now rewrite the Maxwell equations in covariant form using the field strength tensor and start with the homogeneous ones (a) and (b). Usually the potential vector A~ is introduced so that B~ = ∇~ × A~ is

defined as a 3- in order that ∇~ · B~ = 0 is automatically satisfied. Likewise, the A0 is introduced such that E~ = −∇~ A0 − ∂tA~ in order that ∇~ × E~ + ∂tB~ = 0 is automatically satisfied. Note that µ ν ν µ ∂tA~ + ∇~ A0 = −E~ together with ∇~ × A~ = B~ actually defines a 4-curl i.e. ∂ A − ∂ A . Here, similarly, if we µ 0 i say that there exist a real 4-vector A = (A ,A ) = (A0, A~) such that the field strength is defined as a 4-curl F µν = ∂µAν − ∂ν Aµ, then (a) and (b) are automatically satisfied. Indeed F µν = ∂µAν − ∂ν Aµ implies that ∂ρFµν +∂µFνρ +∂ν Fρµ = 0 known as a Bianchi or Jacobi identity (check that it is equivalent to (a) and (b)). The field Aµ is known as the 4-vector potential (or the gauge potential or the gauge field). Still another way of writing the homogeneous Maxwell equations, that does not require introducing a gauge potential Aµ, is ˜µν 1 µνρσ ˜µν by introducing the dual field tensor F ≡ 2  Fρσ and noticing that ∂µF = 0 is equivalent to (a) and 12In this footnote, we explicitly restore the units of c to show the Maxwell equations in Heaviside-Lorentz rationalized units: 1 1 1 ∇~ · B~ = 0 , ∇~ × E~ + ∂tB~ = 0 , ∇~ · E~ = ρ and ∇~ × B~ − ∂tE~ = ~j c c c

For more details on the manipulations needed to get rid of µ0 and 0 and have only c to appear in the Maxwell equations, see [15].

40 (b). Indeed  0 −B1 −B2 −B3  1 3 2 µν  B 0 E −E  F˜ =   (3.39)  B2 −E3 0 E1  B3 E2 −E1 0

It is called the dual field strength because it is obtained from F µν by the replacement (E,~ B~ ) → (B,~ −E~ ). µν i0 µi Check that ∂µF˜ = 0 gives (a) ∂iF˜ = 0 = ∇~ · B~ and (b) ∂µF˜ = 0 = −∂tB~ − ∇~ × E~ . Consider what happens to the four Maxwell equations (a-d) in the absence of sources (ρ = 0 and ~j = 0) under this replacement. Would (E,~ B~ ) → (−B,~ E~ ) also work? And (E,~ B~ ) → (B,~ E~ )? This property is called duality. It would be worth studying in more depth. Let’s turn our attention to the other two Maxwell equations (c) and (d). We define a 4-vector jµ ≡ (ρ,~j) µν ν called the 4-current in order to rewrite the inhomogeneous Maxwell equations as ∂µF = j . Indeed i0 0 i µi i 0i ji i ijk k i ∂iF = j is (c) ∂iE = ρ and ∂µF = j is (d) ∂0F +∂jF = −∂tE + ∂jB = −∂tE~ +∇×~ B~ = j = ~j. In summary, we defined the field strength tensor, its dual and the 4-current

 0 −E1 −E2 −E3  1 3 2 µν  E 0 −B B  ˜µν 1 µνρσ ν ~ F ≡   , F ≡  Fρσ and j ≡ (ρ, j) (3.40)  E2 B3 0 −B1  2 E3 −B2 B1 0 such that the Maxwell equations are compactly and covariantly written

µν µν ν ∂µF˜ = 0 and ∂µF = j (3.41) or µν µ ν ν µ µν ν F = ∂ A − ∂ A and ∂µF = j (3.42) or

µν ν ∂ρFµν + ∂µFνρ + ∂ν Fρµ = 0 and ∂µF = j . (3.43)

3.4.2 Free electromagnetic field and gauge invariance In this section, we restrict to the free electromagnetic field, i.e. the field in absence of the sources jµ = 0. µν µ ν ν µ µν ν The field equations are then F = ∂ A − ∂ A and ∂µF = j , which, upon inserting the first equation into the second, gives µ ν ν µ ν ν ∂µ∂ A − ∂ ∂µA = A − ∂ (∂ · A) = 0 (3.44) The second term −∂ν (∂ · A) is weird for the moment and is discussed below: we will show that it actually disappears upon making a certain choice. For the moment, we just neglect it. The first term looks like d’Alembert’s wave equation for a 4-vector field Aν = 0. Note the absence of a “mass term” Aν +m2Aν = 0 (we don’t obtain the Klein-Gordon equation for a 4-vector field). This is in agreement with the dispersion relation expected for light ω = |~k|. The electromagnetic field can be described either in terms of the field strength F µν (i.e. E~ and B~ ) or in µ 13 µ terms of the vector potential A (i.e. A0 and A~) . However the description in terms of A is not unique.

13Actually, this is a subtle point. In classical physics, the electric and magnetic fields seem fundamental (they are also gauge-independent) and the scalar and vector potentials appear as a mathematical construction lacking physical reality (they are also gauge-dependent). In quantum physics, the situation is partially reversed. There are actually good reasons – see the Aharonov-Bohm effect, the Dirac monopole, etc. – to believe that the 4-vector potential is fundamental and physical (in the sense that it couples locally to other fields), despite its being gauge-dependent and therefore not directly measurable. We will come back later on that issue.

41 In other words, there is a redundancy or freedom: several choices of Aµ lead to the same equations (a-d). The transformation Aµ(x) → A0µ(x) = Aµ(x) + ∂µθ(x) , (3.45) where θ(x) is any differentiable function (field), is called a gauge transformation. It is an internal and continuous transformation. It leaves the field strength invariant as F µν → F 0µν = ∂µA0ν − ∂ν A0µ = µ ν ν µ µ ν ν µ µν µν µν ∂ A − ∂ A + ∂ ∂ θ − ∂ ∂ θ = F . And therefore the free field equations ∂µF˜ = 0 and ∂µF = 0 are also invariant. We postpone the discussion of what happens to the 4-current jµ under a gauge transformation. The Lagrangian for the free electromagnetic field is

1 E~2 − B~2 1 1 L = − F F µν = = − ∂ A ∂µAν + ∂ A ∂ν Aµ (3.46) 4 µν 2 2 µ ν 2 µ ν

It is a Lorentz scalar. Note that (apart for the sign) the first term looks similar to the Lagrangian of a 1 µ massless real scalar field L = 2 ∂µφ∂ φ. The above Lagrangian is also gauge-invariant as it only depends on the field strength. Therefore the gauge transformation is a symmetry (called gauge invariance). Note that it is a weird symmetry: it does not act on F µν directly but only on Aµ. It is an internal symmetry (not a spacetime one) but it is a local internal symmetry as θ(x) depends on the spacetime point. The factor −1/4 is here to make the kinetic energy positive and have the familiar 1/2 factor for a real field. Indeed 1 ~ 2 1 ~ 2 2 1 2 ~ ~ 1 ~ ~ i L = 2 (∂tA) − 2 (∂iA) + 0 × (∂tA0) + 2 (∂iA0) + ∂tA · ∇A0 + 2 ∂iA · ∇A (one also notices the absence of a kinetic energy term for A0 – it has therefore no dynamics – and the surprising sign of the elastic energy µ term for A0). The Lagrangian is quadratic in A (also in E~ and B~ ): it is a free field theory (light does not µ scatter light). A term like A Aµ is also quadratic in the field and could be present in a free field theory. But it is actually absent. Its presence would spoil the gauge invariance. Such a term would correspond to 1 µν m2 µ 14 a “mass term” L = − 4 Fµν F + 2 AµA . Note also that from our knowledge of the real scalar field 1 µ m2 2 1 µ ν m2 µ L = 2 ∂µφ∂ φ − 2 φ , we would have probably guessed that L = 2 ∂µAν ∂ A − 2 AµA instead of (3.46). Think of the different problems related to such a Lagrangian. We check that the Lagrangian is correct by re-obtaining the equations of motion from the EL equation15 ∂L ∂L = ∂µ giving: ∂Aν ∂(∂µAν ) µ ν ν µ ν ν 0 = −∂µ∂ A + ∂µ∂ A = −A + ∂ (∂ · A) (3.47) qed. By integration by part, the Lagrangian can also be written as 1 L = Aµ (η − ∂ ∂ ) Aν (3.48) 2 µν µ ν The main message of this subsection is that the electromagnetic field is not only a 4-vector field Aµ(x), but it is a gauge field, i.e. a vector field that has a gauge invariance. Gauge invariance is not truly a symmetry (although it is often called “gauge symmetry”). The gauge freedom is only a recognition of the fact that we are not able to construct a unique covariant description of the electromagnetic field. Remember that when we discussed rotation symmetry, we insisted on the difference between a transformation and a symmetry – that one could rotate a system, even if it did not possess rotational symmetry. However, electromagnetism always has gauge invariance. Gauge invariance is just a redundancy in our description of the electromagnetic field. It is therefore important to have in mind the difference between a vector field and a gauge field. A gauge field is a vector field that has a gauge invariance (i.e. the action should be invariant under the transformation Aµ → Aµ + ∂µθ for any θ(x)). The Maxwell (or electromagnetic) field is a gauge field (it is actually the simplest example of a gauge field). But it is also possible to construct a theory for a vector field that has no gauge invariance (see, for example, the Proca field theory for a massive vector field with 1 µν m2 µ µ Lagrangian L = − 4 Fµν F + 2 AµA ). In this case, the vector field A should not be called a gauge field. 14This is known as the Proca Lagrangian. See for example Ryder [3] pages 69-70. 15First show that ∂L = F αβ . ∂(∂β Aα)

42 3.4.3 Gauge choices There is something puzzling about the above construction. We know (experimentally) that the electromag- netic field has two internal degrees of freedom (two linear polarizations or two circular polarizations16). But the 4-vector field Aµ has 4 internal components. There are too many degrees of freedom. All of them can not be physical. There is a redundancy. We have already noticed that the component A0 has no dynamics (no kinetic energy in the Lagrangian) and can therefore not represent a physical degree of freedom. Getting rid of the redundancy can be done by fixing a gauge. There are several usual gauge choices. Below, we discuss the and the radiation gauge. Gauge freedom allows one to impose

µ ∂µA = 0 (3.49) (known as the Lorenz – and not Lorentz, who is a different physicist – gauge condition) by choosing θ(x). µ 0µ µ µ 0µ µ Imagine that A is given. Let A = A + ∂ θ. We want to find θ(x) such that ∂µA = 0 = ∂µA + θ we µ therefore only have to find a solution to the equation θ = −∂µA . However, this does not fully fix θ, so that there remains some partial gauge freedom17. The Lorenz gauge condition is Lorentz covariant. But it does not fully fix the gauge freedom. Several gauge choices satisfy the Lorenz gauge condition (see below). The latter fixes one component of Aµ among four but three remain, whereas only two are physical. To see this, we note that, within the Lorenz gauge condition, the equation of motion for the gauge field becomes

Aν = 0 (3.50)

This is a d’Alembert wave equation for each component of the 4-vector field and corresponds to a dispersion relation ω = |~k| as expected for light in the vacuum. Consider a plane wave Aµ(x) = Aµe−ik·x with µ ~ ~ µ ~ ~ ˆ k = (|k|, k). The Lorenz condition reads k Aµ = 0 = |k|A0 − k · A~. Therefore A0 = k · A~, which shows i that the time-component A0 is not independent of the spatial components A . This proves that the Lorenz condition reduces the number of independent components of the gauge field from 4 to 3. The advantage of the Lorenz gauge condition is that it removes part of the redundancy without sacrifying manifest . The drawback is that we still have to handle some redundancy and manipulate non-physical components. A choice that fully fixes the gauge freedom is the so-called radiation gauge. It is such that A0 = 0 and µ ∇~ · A~ = 0 (so that ∂µA = ∂tA0 + ∇~ · A~ = 0). These conditions are obviously not Lorentz covariant because under a Lorentz transformation A0 mixes with the other components of the 4-vector field whereas 0 is a 4- scalar. The advantage of the radiation gauge is that only physical degrees of freedom appear (there is no more redundancy). The drawback is that the theory is no longer manifestly Lorentz covariant. Let’s construct the radiation gauge choice step by step following Maggiore [2] pages 66-67. Starting from a given Aµ, we perform µ 0µ µ µ R t 0 a first gauge transformation A → A = A +∂ θ with θ(x) = − dt A0(t, ~x), such that ∂0θ(x) = −A0(x). 0 Therefore A0 = 0, which shows that we can get rid of a first component. Next, we perform a second gauge 0µ 00µ 0µ µ 0 0 R 3 1 0i 0 transformation A → A = A +∂ θ with θ (x) = d y 4π|~x−~y| ∂iA (t, ~y). Actually θ (x) does not depend i 0i 00 on time. The electric field E = −∂0A (because A = 0) and the Gauss equation (in the absence of a source) i 0i 0 R 3 1 0i ∂iE = 0 shows that ∂0∂iA = 0. Therefore ∂0θ (x) = d y 4π|~x−~y| ∂0∂iA (t, ~y) = 0, which shows that 0 0 00 0 0 0 00j 0j j 0 θ (x) = θ (~x). As a consequence, A = A + ∂0θ = A = 0. In addition ∂jA (x) = ∂jA (x) + ∂j∂ θ (x) =   0 0 0   ~ ~0 R 3 ~ 2 1 ~ ~0 ~ 2 1 (3) ∇ · A (x) + d y∇ 4π|~x−~y| ∇ · A (t, ~y). With the help of the identity ∇ 4π|~x−~y| = −δ (~x − ~y) (this is the Green’s function of the Laplacian in 3 spatial dimensions), we see that ∇~ · A~00 = ∇~ · A~0 − ∇~ · A~0 = 0. 00 In summary, starting from a given gauge field, we have found a gauge choice that allows one to have A0 = 0 ~ ~00 00 ~ ~00 and ∇ · A = 0. Going to Fourier space, we have A0 (k) = 0 and k · A (k) = 0. The last equation means 00 00 that for a plane wave propagating in the z direction, we have Az (k) = 0 in addition to A0 (k) = 0 and only 00 00 two non-zero components Ax and Ay .

16Anticipating, we also know that the photon exists in two helicities 1. It carries a spin 1 but it is massless and therefore the longitudinal component (Sz = 0) is not possible and only the two transverse ones (Sz = 1) are. 17 µ 0 0 µ If θ(x) satisfies θ = −∂µA then θ (x) = θ(x) + f(t − ~n · ~x) also satisfies θ = −∂µA for any smooth function f and any unit vector ~n. Indeed, show that φ(x) = f(t − ~n · ~x) is a general solution of the d’Alembert equation φ(x) = 0.

43 In conclusion, quoting Ryder [3] pages 143-144, “The origin of the difficulty is that the elecromagnetic field, like any massless field, possesses only two independent components, but is covariantly described by a 4-vector Aµ. In choosing two of these components as the physical ones, [...], we lose . Alternatively, if we wish to keep covariance, we have two redundant components.”

Gauge fixing to be skipped in a first reading A practical way to enforce the Lorenz gauge condition is to add a “gauge fixing term” to the Maxwell Lagrangian such that 1 1 L = − F µν F − (∂ Aµ)2 (3.51) 4 µν 2ξ µ where ξ is an arbitrary (for the moment) real parameter. The EL equation of motion becomes 1 Aν + (1 − )∂ν (∂ Aµ) = 0 (3.52) ξ µ Upon choosing ξ = 1, this becomes Aν = 0, which is the Maxwell wave equation with the Lorenz gauge condition.

3.4.4 Energy-momentum tensor (see exercise 2 in sheet #5)

3.4.5 Coupling to matter and electric charge conservation as a consequence of gauge invariance µν µν ν We now include the sources in the inhomogeneous Maxwell equations. From ∂µF˜ = 0 and ∂µF = j we get the equation of motion: Aν − ∂ν (∂ · A) = jν (3.53)

ν ν ν Taking the 4- of the preceding equation, it follows that ∂ν j = (∂ν A ) − ∂ν ∂ (∂ · A) = 0, which means that there is a divergenceless current. This is nothing but the familiar local form of the conservation of electric charge. Let us show that it is possible to derive this conservation law using Noether’s theorem for a specific symmetry (namely gauge symmetry). The Lagrangian for the electromagnetic field in presence of sources is18 1 L = − F F µν − jµA , (3.54) 4 µν µ which obviously gives back the correct equations of motion. Is this Lagrangian invariant under a gauge transformation Aµ → Aµ − ∂µθ? We know that the field strength is invariant and we further assume that µ 19 µ R 4 the 4-current j is also invariant . Then L → L + j ∂µθ is not invariant. But the action S = d xL → R 4 µ R 4 µ µ S + d xj ∂µθ = S − d xθ∂µj (using an integration by part in the last step). Therefore if ∂µj = 0 then the action is gauge invariant and there is a gauge symmetry. Reciprocally, if there is a gauge symmetry, i.e. R 4 µ µ if the action is gauge invariant, then d xθ∂µj = 0 for any θ(x) and therefore ∂µj = 0.

18 ˙ m~x˙ 2 m~x˙ 2 ˙ ~ For a single particle of mass m and charge q, the Lagrangian L(~x, ~x) = 2 − V (~x) becomes L = 2 − V (~x) − qA0 + q~x · A µ ˙ in the presence of an electro-magnetic field described by a potential A = (A0, A~). Note that −qA0 + q~x · A~ is similar to µ ˙ ∂L ˙ −jµA = −ρA0 + ~j · A~, with ρ → q and ~j → q~x. The canonical momentum is ~p = = m~x + qA~ = m~v + qA~ so that the ∂~x˙ (~p−qA~)2 (~p−qA~)2 m~v2 Hamiltonian is H(~x,~p) = 2m + V (~x) + qA0. This has a familiar form of the sum of kinetic energy 2m = 2 and potential energy V (~x) + qA0, where the last term is the energy. Note also that the canonical momentum ~p is gauge-dependent, while the mechanical momentum m~v is gauge-invariant. 19Actually, we know that the Maxwell equations (a) to (d) only involve gauge-independent quantities, when written in terms of E and B fields. Therefore, we know that the current and density are gauge invariant. So that jµ is also gauge invariant.

44 µ In summary, gauge invariance of the action is equivalent to ∂µj = 0. So that the conservation of electric charge can be seen as a consequence of a gauge symmetry. This is not yet entirely satisfying because of at least two things: (i) the conservation law is for charged matter but we derived it from a symmetry pertaining mainly to the neutral gauge field (the charged matter field being hidden in the 4-current jµ); (ii) the gauge symmetry is local whereas Noether’s theorem was proven for the case of a global symmetry (of course a global symmetry can be seen as a particular case of a local symmetry20).

3.4.6 Gauging a symmetry We now come to a very interesting point that goes back to the work of (in 1918 and 1929). We present his general method for writing a gauge invariant action for a matter field. It is called “gauging of an internal symmetry” (see pages 69-72 in [2] and pages 93-100 in [3]). It goes in four steps: 1) Start from a complex scalar (matter) field

∗ µ 2 ∗ L1 = (∂µφ) (∂ φ) − m φ φ . (3.55)

This Lagrangian has a global U(1) symmetry. It is also an internal and continuous symmetry. It is known as a phase symmetry. The Lagrangian is invariant under the transformation φ(x) → eiθφ(x) with θ = constant iθ µ ∗ µ µ ∗ µ (e ∈ U(1)). From Noether’s theorem, there is a conserved current J = iφ ∂ φ − iφ∂ φ , ∂µJ = 0 if ( + m2)φ = 0. 2) The “gauge principle” is the idea that an exact symmetry of Nature can not be global but has to be local (otherwise it would have the flavor of action at a distance). But it is an unproven assumption, it is a principle. It is inspired by Einstein’s step of going from special to general relativity by asking that the invariance under change of frame be local rather than global (an idea out of which popped the theory of gravitation)21. Let us therefore make the U(1) phase symmetry local (this is called “gauging the symmetry”):

φ(x) → φ0(x) = eiθ(x)φ(x) iθ(x) ∂µφ(x) → e [∂µφ + φi∂µθ] (3.56)

∗ µ The last equation implies that (∂µφ) ∂ φ is not invariant under the local U(1) transformation, so that µ neither L1 nor the corresponding action S1 are. The idea is to introduce a real field A (x) such that Aµ(x) → Aµ(x) − ∂µθ(x) under the local U(1) transformation. The purpose of this new field is solely to make the Lagrangian invariant under the local U(1) transformation by changing the differential ∂µ. Indeed

φ(x) → φ0(x) = eiθ(x)φ(x) iθ iθ iθ(x) (∂µ + iAµ(x))φ(x) → e [∂µφ + φi∂µθ] + i(Aµ − ∂µθ)e φ = e (∂µ + iAµ)φ (3.57)

By changing ∂µ into ∂µ + iAµ, we now have the property that the field φ and its “gradient” (∂µ + iAµ)φ transform similarly under a local U(1) phase transformation. The new field is called a gauge field and the local U(1) transformation law for the matter field corresponds to a gauge transformation for the gauge field Aµ. We define

Dµ ≡ ∂µ + iAµ (3.58)

20From the point of view of the 4-vector potential Aµ(x), a gauge transformation is an internal and continuous transformation. It can be made infinitesimal and rendered global by choosing a specific transformation. Indeed xµ → x0µ = xµ and Aµ(x) → µ 0 µ µ µ µ ν ν A (x ) = A (x) − ∂ θ ≈ A (x) + a for a specific field θ(x) = −a xν with  → 0 and a is a 4-vector. 21Weyl was inspired by Einstein’s theory of general relativity. He first hoped to derive electromagnetism from a deeper symmetry – just as Einstein had done with gravitation and the symmetry under general coordinate change – related to changing the scale length or changing locally the ruler or the gauge (in the sense of a metal bar) used to measure distances. This is now called dilatation transformation and conformal symmetry. This was not the correct symmetry for electromagnetism. But the name “gauge symmetry” sticked. Weyl later – in 1929 after the advent of quantum mechanics and complex wavefunctions – understood what the correct symmetry was: namely a U(1) phase symmetry of the matter field.

45 22 ∗ µ called the covariant derivative so that (Dµφ) (D φ) is invariant under the local U(1) transformation. We can therefore postulate a new Lagrangian

∗ µ 2 ∗ ∗ ∗ µ µ 2 ∗ L2 = (Dµφ) (D φ) − m φ φ = (∂µφ − iAµφ )(∂ φ + iA φ) − m φ φ (3.59)

The latter now includes coupling between the matter field and the gauge field. To see it clearly, we rewrite the Lagrangian as:

∗ µ 2 ∗ ∗ µ µ ∗ ∗ µ L2 = (∂µφ) (∂ φ) − m φ φ − Aµ(iφ ∂ φ − iφ∂ φ ) + φ φAµA µ ∗ 2 = L1 − AµJ + φ φA (3.60)

This coupling is called minimal coupling and is usually written as ∂µ → Dµ = ∂µ + iAµ or i∂µ → iDµ = i∂µ −Aµ, where i∂µ is the momentum operator (impulsion) and i∂µ −Aµ is the gauge-invariant or mechanical momentum (quantit´ede mouvement). In the above equation, we recognize the free matter field Lagrangian µ ∗ µ µ ∗ L1, the coupling between the gauge field and the current J = iφ ∂ φ − iφ∂ φ of the matter field (similar to the one in eq. (3.54)) and there is a third term that could look like a mass term for the gauge field µ (it is proportional to AµA ). This last term will play an important role later when discussing the Higgs mechanism. We also understand why a local U(1) symmetry is called a gauge symmetry. At first it was a global phase symmetry for the complex matter field. We turned it into a local phase transformation and were forced to introduce a gauge field if we wanted to maintain invariance under the local phase transformation. Then we realized that for the gauge field, the local phase transformation is actually a gauge transformation (in the sense of the familiar gauge transformation in the Maxwell equations). So that from now on, a gauge transformation actually means the following combined transformation

φ(x) → eiθ(x)φ(x) Aµ(x) → Aµ(x) − ∂µθ(x) (3.61)

Namely, a local U(1) phase transformation on the matter field, together with a gauge transformation (in the old acceptation of classical electromagnetism) on the 4-vector potential (now called a gauge field). 3) The gauge field Aµ that we introduced does not have a dynamic yet. There is no kinetic energy or elastic energy for this field in L2. But we already know how to write a free field Lagrangian for a gauge- 1 µν 2 µ invariant real 4-vector field. It is L = − 4 Fµν F . A term such as mAAµA is forbidden by gauge invariance and therefore the gauge field has to be massless. We therefore arrive at the following Lagrangian: 1 L = (D φ)∗(Dµφ) − m2φ∗φ − F F µν (3.62) 3 µ 4 µν This is actually a baby version of quantum electrodynamics (QED) called scalar QED. It is a theory for electrons and photons except that the electron is here spinless (scalar field instead of spinor field). 4) Gauge invariance of the Lagrangian L3 is now different from gauge invariance of the Maxwell action R 4 1 µν µ S = d x(− 4 Fµν F − jµA ). We therefore revisit Noether’s theorem in this new context. The gauge invariance of L3 implies global U(1) invariance and therefore a conserved current via Noether. However this µ ∗ µ µ ∗ current is not necessarily J = iφ ∂ φ−iφ∂ φ as we show below. From L3 and the EL equations,we obtain the equations of motion for the matter field

µ 2 (D Dµ + m )φ = 0 (3.63)

and the gauge field µν ν ∗ ν ∗ ν ν ∗ ∗ ν ∂µF = J − 2φ φA = iφ ∂ φ − iφ∂ φ − 2φ φA (3.64) where F µν ≡ ∂µAν − ∂ν Aµ is the field strength associated to the gauge field Aµ. These equations are µν coupled: the evolution of the matter field depends on the gauge field and vice versa. From ∂ν ∂µF = 0

22Here covariant means covariant with respect to the gauge transformation.

46 ν ∗ ν (contraction of a symmetric tensor with an anti-symmetric one), we find that ∂ν (J − 2φ φA ) = 0, which allows us to identify the conserved current as

ν ν ∗ ν ∗ µ µ ∗ ν j ≡ J − 2φ φA = iφ D φ − iφ(D φ) and ∂ν j = 0 (3.65)

The equation of motion for the gauge field is:

Aν − ∂ν (∂ · A) = jν i.e. ( + 2φ∗φ)Aν − ∂ν (∂ · A) = J ν = iφ∗∂ν φ − iφ∂ν φ∗ (3.66)

We can also check the current we found directly from the explicit form of the general Noether current. µ ∂L3 ∂L3 For a global continuous internal symmetry, the Noether current is j = − Fφ − ∗ Fφ∗ . The ∂(∂µφ) ∂(∂µφ ) ∗ ∗ ∗ ∗ infinitesimal transformations are φ → φ + θiφ and φ → φ + θ(−iφ ) so that Fφ = iφ and Fφ∗ = −iφ . Therefore the Noether current is jµ = −(Dµφ)∗iφ + (Dµφ)iφ∗, which is indeed correct. We rewrite the Lagrangian to identify three terms: 1 L = (∂ φ)∗(∂µφ) − m2φ∗φ − F F µν − A J µ + φ∗φA Aµ 3 µ 4 µν µ µ = Lmatter field + Lgauge field + Linteraction (3.67)

µ ∗ µ µ ∗ µ The last term Linteraction = −AµJ (φ)+φ φAµA can also be written Linteraction = −Aµj (φ, A)−φ φAµA , where the notation jµ(φ, A) emphasizes that, in the present case, the gauge-invariant current jµ depends both on the matter and on the gauge field. Whereas J µ is the gauge-dependent current. The construction of Weyl (inspired by the general relativity construction of Einstein) is marvellous! (cf. G. t’Hooft “Under the spell of the gauge principle”). It means that each time a new conservation law is found to exist in Nature and which appears to be exact, it should correspond to a local internal symmetry of the matter fields and to a new gauge field corresponding to an interaction. But remember it is a principle (why do exact symmetries have to be local?).

3.4.7 Advanced geometrical description of the electromagnetic field to be skipped in a first reading

Parallel transport A geometrical interpretation of the electromagnetic field follows from a parallel with the gravitational field (as described by general relativity), see page 124 in [3]. The 4-vector potential (or gauge field) Aµ is also known as a connection and the field strength F µν as a curvature tensor. This language is taken from the notions of parallel transport and of fiber bundles [8]. The homogeneous Maxwell equations ∂µFνρ + ∂ν Fρµ + ∂ρFµν = 0 are also known as a Bianchi (or Jacobi) identity. When minimally coupling the matter fields to the gauge field, the derivative ∂µ gets replaced by the covariant derivative Dµ ≡ ∂µ + iAµ. Such a minimal coupling is familiar from quantum physics and the prescription for how to modify the momentum operator pµ = i∂µ → pµ − Aµ = i∂µ − Aµ = iDµ. Remember from single particle classical mechanics that the canonical momentum (impulsion) ~p = m~v becomes ~p − qA~ = m~v, where m~v is the mechanical momentum (quantit´ede mouvement), in the presence of a vector potential. Similarly, the Hamiltonian H becomes H − qA0 in the ~p2 (~p−qA~)2 presence of a scalar potential. Overall H = 2m + V (~x) becomes H − qA0 = 2m + V (~x).

Differential forms Also, using the language of differential geometry (see §2.9 in [3]), the gauge potential defines a 1-form µ 1 µ ν A = Aµdx and the field strength a 2-form F = − 2 Fµν dx ∧ dx such that F = dA, where ∧ denotes the exterior (or wedge) product (its main property being that dxµ ∧ dxν = −dxν ∧ dxµ so that dxµ ∧ dxµ = 0). ν µ ν µ The differential operator d acts as dA = ∂ν Aµdx ∧ dx . Because of the antisymmetry of dx ∧ dx , 1 ν µ 1 ν µ dA = 2 (∂ν Aµ − ∂µAν )dx ∧ dx = 2 Fνµdx ∧ dx . But this is precisely the definition of a a two form F =

47 1 µ ν 2! Fµν dx ∧ dx . Therefore F = dA. The is such that dd = 0. Therefore dF = ddA = 0. µν This is actually the homogeneous Maxwell equation ∂µF˜ = 0 or ∂µFνρ + ∂ν Fρµ + ∂ρFµν = 0 (Bianchi ∗ ∗ 1 ˜ µ ν identity). The inhomogeneous Maxwell equation read d F = J where F = − 2 Fµν dx ∧dx is the dual form of F and J is the current density 3-form defined by J = (jxdy ∧dz +jydz ∧dx+jzdx∧dy)∧dt−ρdx∧dy ∧dz. Most compactly the Maxwell equations are dF = 0 and d∗F = J. The purpose of differential forms is to allow multivariable without specifying coordinates. It is µ µ taylor made for line, surface, volume, etc. . A 1-form is defined by A = Aµdx where dx is a µ 1 µ ν differential and Aµ are 4 functions of x . A 2-form is defined by F = 2! Fµν dx ∧ dx where ∧ denotes the exterior or wedge product. The idea is that dxµ ∧dxν is an oriented surface element (an integration measure). It is antisymmetric dxµ∧dxν = −dxν ∧dxµ, so that in a coordinate change x → x0, the Jacobian automatically 1 µ1 µ2 µp appears for the integration measure. A p-form is defined as H = p! Hµ1µ2...µp dx ∧dx ∧...∧dx . It already includes the volume element or integration measure dxµ1 ∧ dxµ2 ∧ ... ∧ dxµp . Because of the antisymmetry of the wedge product, the Jacobian rule to change the integration measure in a change of variable is automatic. 1 ν µ1 µ2 µp The exterior derivative d acts as dH = p! ∂ν Hµ1µ2...µp dx ∧ dx ∧ dx ∧ ... ∧ dx . A form H is said to be closed if dH = 0. The homogeneous Maxwell equation dF = 0 means that the field strength 2-form is closed. A p-form H is said to be exact if there exist a (p-1)-form G such that H = dG. Of course if H = dG (H is exact) then dH = ddG = 0 (H is closed). But the converse is not true. Actually, it is almost true. It is true locally but not globally (Poincar´e’slemma). Therefore F = dA ⇒ dF = 0 but dF = 0 implies F = dA locally but not globally. The important result about integration of forms is Stokes’ theorem: R R M dH = ∂M H, where M is an orientable differentiable manifold and ∂M is its boundary.

[end of lecture # 7]

3.5 Massless spinor fields and the Weyl equations

see Maggiore [2] pages 54-56 In this section, we build classical theories for spinor fields using symmetries. We will therefore be quite far from the historical appearance of the Dirac equation (in the context of single particle relativistic quantum mechanics, see exercise sheet # 3) or the Weyl equation. We have in mind the description of spin 1/2 particles such as the neutrino or the electron (even if for the moment we describe fields not yet particles, which will only emerge upon quantization).

3.5.1 Left Weyl field, helicity and the Weyl equation µ i Consider a single left Weyl field ψL(x). Letσ ¯ ≡ (I, −σ ) = (I, −~σ) in any frame be a quadruplet of 2 × 2 µ I i † µ matrices (and σ ≡ ( , σ )) built from the Pauli matrices σx, σy, σz. Then ψLσ¯ ψL is a 4-vector. Check it † † µ µ † ν † µ (i.e. show that, under a Lorentz transformation Λ, it becomes ψLΛLσ¯ ΛLψL = Λ ν ψLσ¯ ψL). Is ψLσ ψL also a 4-vector? From this knowledge, we can build a Lorentz invariant kinetic (and elastic) term that is first order in derivative: † µ LL = iψLσ¯ ∂µψL (3.68) and such that the corresponding action is also real (check it). It is not possible to write a mass term just † for a left Weyl field (for example, ψLψL is not Lorentz invariant – check it). † µ µ † µ The EL equations give 0 = ∂µ(iψLσ¯ ) and upon taking the hermitian conjugate and using (¯σ ) =σ ¯ , we obtain µ σ¯ ∂µψL = 0 (3.69) which is known as the Weyl equation (1929). The latter can also be written as i∂tψL = −~σ · (−i∇~ )ψL, which may remind you of the low-energy description of graphene (albeit in 3 space dimension rather than 2), see exercise sheet # 4. We can “take the square” of the Weyl equation to obtain

ψL = 0 . (3.70)

48 ν µ j i 2 j i 2 µ  This follows from σ ∂ν σ¯ ∂µ = (∂0 + σ ∂j)(∂0 − σ ∂i) = ∂0 − σ σ ∂j∂i = ∂0 − ∂j∂j = ∂ ∂µ = as σiσj = δij + iijkσk. The left Weyl spinor therefore satisfies the d’Alembert (or massless KG) equation. Actually, the spinor has two complex components. Each component of the doublet satisfies the massless KG equation (3.70). But in addition the doublet has to satisfy the Weyl equation (3.69). We will see that the latter is a projection equation that gets rid of one of the two (complex) degrees of freedom. −ik·x Take a plane wave solution ψL(x) = uLe where uL(x) = uL is a constant spinor. If we inject it in 2 ~ 2 ~ (3.70), we find that k0 = k i.e. the dispersion relation ω = |k| (identical to that of light). If we inject ~ 23 in (3.69), we find that k · ~σuL = −ωuL. Using the two equations and choosing the positive branch of the ~ ˆ ˆ ~ ~ dispersion relation ω = |k|, we find that k · ~σuL = −uL (where k ≡ k/|k|). But the spin operator S~ = ~σ/2 for a spin 1/2 and therefore 1 kˆ · Su~ = − u . (3.71) L 2 L This means that the left Weyl field has an helicity of −1/2. The helicity is the projection of the spin (or intrinsic angular momentum) on the direction of motion. This last equation should be thought of as an equation that projects out half of the apparent degrees of freedom. A Weyl field seems to have two (complex) degrees of freedom. But, because it is an helicity eigenstate, it only has a single (complex) degree of freedom. ~ ~ To make this point clear, imagine the motion is along the z direction, so that k = |k|~ez. Then σzuL = −uL  0  meaning that u is the spinor having a single non-zero component. L 1 Another Lagrangian – which one obtains by integration by part in the action, which is more symmetrical, gives the same equation of motion and is therefore equivalent – is: i i ←→ L0 = ψ† σ¯µ∂ ψ − (∂ ψ† )¯σµψ = iψ† σ¯µ ∂ ψ (3.72) L 2 L µ L 2 µ L L L µ L

←→ 1 1 ←→ 1 −→ 1 ←− where we introduced the following notation A∂µ B ≡ 2 A(∂µB) − 2 (∂µA)B or ∂µ ≡ 2 ∂µ − 2 ∂µ. As a conclusion to this part, it is essential to realize that the Weyl equation is more than just the dispersion relation ω = |~k| (which is essentially what the KG equation for a massless field is). It is an equation which is first order in gradient and which enforces a relation between the components of the spinor doublet. Physically it projects out unwanted or extra or redundant degrees of freedom.

3.5.2 Right Weyl field

A similar construction can be made for a right Weyl spinor field ψR(x). It gives a Lagrangian

† µ LR = iψRσ ∂µψR (3.73)

0 † µ←→ (or equivalently LR = iψRσ ∂µ ψR), a Weyl equation

µ σ ∂µψR = 0 (3.74)

and, as a consequence, a massless KG equation ψR = 0. When one injects a plane wave solution, one ~ ˆ ~ 1 obtains the dispersion relation ω = |k| and the helicity equation k · SuR = + 2 uR. Remark: In Nature, the neutrino (if massless) could be described by a left Weyl spinor field (it has an helicity of −1/2). While the anti-neutrino (if it exists) would correspond to a right Weyl spinor field (of helicity +1/2). It is now known that the neutrino has a finite tiny mass (see the phenomenon of neutrino oscillations). However it is unclear whether it is identical or different from its , i.e. whether it is a Majorana particle or not.

23The important point is not the choice of the positive branch. The argument goes as well with the negative branch. The point is that there is no freedom in the projection of the spin once the orbital motion is given. Hence, we do not have two complex degrees of freedom but a single complex one. The spin projection is locked to the momentum. This is the notion of helicity eigenstate.

49 3.6 Spinor field and the Dirac equation

see Maggiore [2] pages 56-65 Remember that ψL and ψR are exchanged under parity. Therefore, if we insist on having a parity- invariant action, we need to take both a left ψL(x) and a right ψR(x) Weyl spinor fields and glue them  ψ (x)  into a bispinor (i.e. a quadruplet of complex fields) to make a Dirac field ψ(x) = L . This is the ψR(x) representation (1/2, 0) ⊕ (0, 1/2).

3.6.1 Parity and the Dirac Lagrangian The game is now to build a Lagrangian, which is both Lorentz and parity invariant. It will now be possible to construct a simple mass term. Indeed, under a Lorentz transformation Λ, ψL → ΛLψL and ψR → ΛRψR (iθ~+φ~)·~σ/2 (iθ~−φ~)·~σ/2 † † † with ΛL = e and ΛR = e . The bilinears ψLψL and ψRψR are not invariant but ψLψR and † ψRψL are (check it). µ 0µ 0 0 0 † † Under parity x = (t, ~x) → x = (t, −~x) and ψL/R(x) → ψL/R(x ) = ψR/L(x ). So that ψLψR → ψRψL † † † † † † and ψRψL → ψLψR. Therefore ψLψR + ψRψL is parity-invariant: it is a true scalar (whereas ψLψR − ψRψL is a pseudo-scalar). Therefore, a free, Lorentz and parity invariant Lagrangian is † µ † µ † † LD = iψLσ¯ ∂µψL + iψRσ ∂µψR − m(ψLψR + ψRψL) , (3.75) which is known as the Dirac Lagrangian. It is obviously Lorentz invariant. Let’s check its behavior under µ µ parity: ∂i → −∂i so thatσ ¯ ∂µ → σ ∂µ and ψL ↔ ψR so that indeed LD → LD.

3.6.2 The Dirac equation and γ matrices Dirac to Feynman: “I have an equation. Do you have one too?”

∂LD LD µ ∂LD LD The EL equation † = ∂µ † gives −mψR = −iσ¯ ∂µψL. Similarly † = ∂µ † gives −mψL = ∂ψL ∂(∂µψL) ∂ψR ∂(∂µψR) µ −i(∂µσ ψR). The Dirac equation is therefore µ iσ¯ ∂µψL = mψR µ iσ ∂µψR = mψL , (3.76) which shows that the left and right spinors inside the Dirac bispinor are coupled. µ Here it is also possible to take the square of this first order equation by applying iσ ∂µ to the left ν µ ν µ ν 1 µ ν ν µ of iσ¯ ∂ν ψL = mψR to find (iσ ∂µ)(iσ¯ ∂ν ψL) = −σ σ¯ ∂µ∂ν ψL = − 2 (σ σ¯ + σ σ¯ )∂µ∂ν ψL. Now, we note that σµσ¯ν + σν σ¯µ = 2ηµν I (pay attention to the fact that σµσ¯ν + σν σ¯µ is not the anticommutator µ ν µ ν ν µ µ µ µ 2 {σ , σ¯ } = σ σ¯ +σ ¯ σ ). Therefore −∂µ∂ ψL = (iσ ∂µ)(mψR) = miσ ∂µψR = m ψL (where, in the last 2 step, we used the second equation (3.76)). We eventually arrive at (+m )ψL = 0, which is the massive KG 2 equation for ψL. We could as well obtain ( + m )ψR = 0 so that in the end, each of the four componenents of the Dirac spinor ψ satisfies the massive KG equation: ( + m2)ψ = 0 . (3.77)  ψ  We can rewrite the above results using the Dirac spinor notation ψ = L , which is here written in ψR the chiral representation (meaning that its first two components behave as a left Weyl spinor and its last  ψ + ψ  two as a right Weyl spinor). But we could have chosen to define e.g. ψ = √1 L R (see below for 2 ψL − ψR another representation). In the chiral representation, the Dirac equation reads:  µ     µ    0 iσ ∂µ ψL 0 σ ψL µ = µ i∂µψ = m = mψ (3.78) iσ¯ ∂µ 0 ψR σ¯ 0 ψR

50  0 σµ  We define the γ matrices in the chiral representation by γµ ≡ so that the Dirac equation σ¯µ 0 becomes µ (iγ ∂µ − m)ψ = 0 or (i∂/ − m)ψ = 0 or (p/ − m)ψ = 0 (3.79)

µ with the Feynman slash notation /r ≡ γ rµ and pµ = i∂µ is here the 4-momentum operator. Remarks: - theσ ¯µ and σµ are quadruplets of 2 × 2 (Pauli) matrices but they do not transform as 4-vectors (despite their notation). They are invariant under a Lorentz transformation. - Similarly, the γµ are a quadruplet of 4 × 4 matrices (known as or Dirac matrices) but they do not transform as a 4-vector. They are unchanged in a Lorentz transformation. Here, we have given their expression in the so-called chiral basis.

The γ matrices satisfy the following algebra

{γµ, γν } = 2ηµν I (3.80) known as the Clifford algebra. It implies that (γ0)2 = 1 and (γi)2 = −1. This algebra will actually be taken ν µ as the definition of γ matrices. Applying iγ ∂ν to the left of iγ ∂µψ = mψ and using the Clifford algebra, we find that ( + m2)ψ = 0 as expected. Each of the 4 components of the Dirac bispinor satisfies the same KG equation. The Dirac Lagrangian can be rewritten ←→ ¯ 0 ¯ LD = ψ(i∂/ − m)ψ or LD = ψ(i ∂/ − m)ψ (3.81) with ψ¯ ≡ ψ†γ0 the conjugate Dirac spinor (which we already defined; remember that its main property is that ψψ¯ is a Lorentz scalar whereas ψ†ψ is not).  ~σ 0  The spin operator is 1 Σ~ = 1 so that the helicity operator is 1 Σ~ · ~p where ~p = −i∇~ is here 2 2 0 ~σ 2 |~p| the 3-momentum operator.

3.6.3 Chirality operator µ 5 0 1 2 3 In addition to γ with µ = 0, 1, 2, 3, we also define γ ≡ iγ γ γ γ ≡ γ5. In the chiral basis, it is γ5 =  −I 0  , which we recognize as the chirality operator, i.e. the operator which distinguishes left from right 0 I Weyl spinors. A Dirac bispinor purely made of a left (resp. right) Weyl spinor is an eigenvector of γ5 with eigenvalue −1 (resp. +1). Its status as a γ matrix comes from the fact that it shares a few properties with the other 4 gamma matrices: it anticommutes with the other γ matrices {γ5, γµ} = 0 (check it) and it squares to one (γ5)2 = I (check it). The reason for avoiding γ4 as a name is because it already exists in some conventions 1γ5 in which µ = 1, 2, 3, 4 = x, y, z, t instead of µ = 0, 1, 2, 3 = t, x, y, z. The chirality projectors are 2 . Indeed 2  ψ  they are projectors 1γ5  = 1γ5 and they project onto chirality eigenstates 1−γ5 ψ = L (left) and 2 2 2 0           1+γ5 0 5 ψL ψL 5 0 0 2 ψ = (right) because γ = − and γ = + . Note that parity ψR 0 0 ψR ψR  0 I   ψ   0   0   ψ  γ0 = in the chiral basis such that γ0 L = and γ0 = R . Parity I 0 0 ψL ψR 0 exchanges left and right chirality eigenstates. Chirality refers to left and right spinors, which are spinors distinguished by their behavior under a Lorentz boost but not under a rotation. In addition, a left spinor becomes a right spinor under parity transformation and vice-versa. The word “chiral” actually means an object that is not identical – even after a rotation –

51 to its mirror image (like a right hand and a left hand). The parity transformation is essentially a mirror reflection.

[end of lecture #8]

3.6.4 The Clifford algebra and different representations of the γ matrices The γ matrices are defined by the Clifford algebra and therefore exist in several representations. In this course, we have first presented them in the chiral basis. Let U ∈ U(4) be a unitary transformation such that ψ0 = Uψ and γ0µ = UγµU −1 = UγµU †. Note that U is independent of the spacetime point whereas ψ(x) is a field even if we write ψ for conciseness. Let’s check that this is a do-nothing transformation (similar to a canonical transformation in classical mechanics: we just change our way of representing a given problem). µ µ ν µν 0µ 0 0µ 0ν µν From (iγ ∂µ − m)ψ = 0 and {γ , γ } = 2η , show that (iγ ∂µ − m)ψ = 0 and {γ , γ } = 2η . The different representations of the γ matrices correspond to different representations of the Clifford algebra. Two important representations are the chiral (or Weyl) one, which we already discussed, and the standard (or Dirac) representation, that we discuss below24. The chiral representation has a diagonal chirality operator γ5 and an off-diagonal parity operator γ0. It is convenient to study massless or almost massless fields (such as the neutrino field) and also for the transformation properties of ψ under a Lorentz transformation. The γ matrices are  0 I   0 σi    0 σµ   −I 0  γ0 = , γi = i.e. γµ = and γ5 = (3.82) I 0 −σi 0 σ¯µ 0 0 I and the Dirac spinor transforms as follows  ψ   Λ 0   ψ  ψ = L → L L (3.83) ψR 0 ΛR ψR

~σ/2·(iθ~φ~) under a Lorentz transformation Λ, where ΛL/R = e . The standard (or parity) representation has a diagonal parity operator γ0 and an off-diagonal chirality operator γ5. This representation is obtained from the chiral one through the unitary transformation U =  II  √1 so that 2 −II

 I 0   0 σi   0 I  γ0 = , γi = and γ5 = (3.84) 0 −I −σi 0 I 0 and  φ  1  ψ + ψ  ψ = = √ R L (3.85) χ 2 ψR − ψL This representation of γ matrices is convenient to discuss the non-relativistic limit of a massive field (such as the electronic field). The two spinors φ and χ are then known as the large and small components of the Dirac bispinor. We summarize some general properties of the γ matrices:

µ ν µν µ † 0 µ 0 5 0 1 2 3 5 µ 5 † 5 5 2 {γ , γ } = 2η , (γ ) = γ γ γ , γ ≡ iγ γ γ γ ≡ γ5 , {γ , γ } = 0 , (γ ) = γ and (γ ) = 1 (3.86) Note the similarity between γ5 and γ0. The following 16 matrices are linearly independent and form a basis of all 4 × 4 matrices: i I , γµ , γ5 , γµγ5 and σµν ≡ [γµ, γν ] (3.87) 2 In other words, a generic 4 × 4 matrix with complex entries (group called GL(4, C)) is a linear combinaison with complex coefficients of these 16 matrices.

24We won’t discuss a third important representation called the Majorana representation.

52 Dirac bilinears We here briefly mention the properties of Dirac field bilinears under Lorentz and parity transformations. The aim is to prepare building blocks in order to construct Lagrangians. Show that ψψ¯ is a true 4-scalar and that ψγ¯ 5ψ is a pseudo 4-scalar. That ψγ¯ µψ is a true 4-vector and that ¯ µ 5 ¯ µν µν i µ ν ψγ γ ψ is a pseudo 4-vector. And that ψσ ψ is an anti-symmetric (rank 2) tensor, where σ = 2 [γ , γ ]. 0 − i ω Sµν In exercise sheet #3, you will see that a Dirac spinor transforms as ψ → ψ = S(Λ)ψ = e 2 µν ψ µν σµν i under a Lorentz transformation Λ, with S = 2 . This last object contains the spin operator S and the i i 1 σjk 1 i ~ 5 0 boost generator K . The spin operator is S = 2 ijk 2 = 2 Σ . From there, we can show that Σ = γ γ ~γ,  ~σ 0  which in both the standard and the chiral representations gives Σ~ = . 0 ~σ

Parity, chirality and helicity These three operators with 1 eigenvalues should not be confused: - The parity operator γ0 realizes a space inversion P on a Dirac spinor – note that γ0 is only the action of space inversion onto the internal degrees of freedom of the Dirac spinor field, but in addition parity P does xµ = (t, ~x) → (t, −~x). More generally, a parity transformation realizes a mirror reflection xµ = (x0, x1, x2, ...) → (x0, −x1, x2, ...) and is such that det P = −1. Only when space has an odd number of dimensions can it be defined as inverting all spatial coordinates xµ = (x0, xi) → (x0, −xi). Therefore in 3+1 or in 1+1, parity can be considered to be equivalent to space inversion. Whereas in 2+1, parity is not the same transformation as space inversion. - The chirality operator is γ5, which by definition is the matrix that anticommutes with all other γ matrices. It allows one to distinguish between left and right Weyl spinors (to avoid confusion we should say left-chiral and right-chiral). The latter behave identically under a rotation, differently under a boost and are exchanged by a parity transformation ([γ5, γ0] 6= 0). Chirality is therefore related to the parity transformation (for example, it enters into the definition of a pseudo-scalar ψγ¯ 5ψ or of a pseudo-vector ψγ¯ 5γµψ made of two Dirac fields). Note that γ5 and γ0 are essentially “rotations” of one another. Chirality and parity operators exchange their matrix representation when going from the chiral to the standard basis. Chirality only exists for even space-time dimensions (see below). Chirality is also related to the (only left spinors carry a weak charge, not right spinors; and the weak interaction therefore maximally violates parity). Chirality is a Lorentz-invariant quantity: it therefore has an absolute meaning independent of the observer. - The helicity operator is essentially25 the projection of the spin operator onto the direction of motion ~ 1 ~ h = Σ · pˆ, where ~p is the momentum operator,p ˆ ≡ ~p/|~p| and 2 Σ is the spin operator. Note that helicity is not a Lorentz-invariant quantity: it has no absolute meaning and depends on the observer. However, it is a conserved quantity (in the sense that it does not depend on time). For a massless particle, helicity and chirality turn out to be the same. Indeed, if m = 0, then chirality (γ5 with eigenvalues 1) and helicity (Σ~ · pˆ with eigenvalues 1) are identical. Using Σ~ = γ5γ0~γ, show it. Therefore, physically, chirality is similar to helicity, which is less abstract. Note however, the subtle differences between the two (which are important for massive particles). The helicity operator is not Lorentz invariant, whereas γ0 and γ5 are. Therefore the chirality is an intrinsic (absolute) property, whereas helicity is not (it is relative, it depends on the frame). But the helicity is conserved (the helicity operator commutes 0 0 5 with the Dirac Hamiltonian HD = γ ~γ · ~p + mγ , see below), whereas the chirality is not (γ does not 5 5 0 commute with HD). Show that [Σ~ · p,ˆ HD] = 0 and [γ ,HD] = 2mγ γ . In order to clearly distinguish chirality and helicity, one should say left-chiral/right-chiral and left- helical/right-helical for the eigenstates of eigenvalue −1/ + 1 of γ5 and h = Σ~ · pˆ. In summary, for a massless Dirac particle, helicity = chirality, which has an absolute meaning and is conserved. Therefore it makes sense to say that a massless Dirac particle is left-handed. For a massive particle, helicity 6= chirality: helicity is conserved but has no absolute meaning, whereas chirality has an

25 1 1 ~ Here, for the clarity of the present discussion, we did not include S = 2 into the definition of the helicity 2 Σ · pˆ.

53 absolute meaning but evolves in time (i.e. is not conserved). Therefore it does not make sense to say that a massive Dirac particle is left-handed. See P.B. Pal, Am. J. Phys. 79, 485 (2011).

3.6.5 General solution of the Dirac equation: mode expansion see Ryder [3] pages 51-54

This is a technical section included in preparation for the quantization of the Dirac field to be done in the µ next chapter. Consider the Dirac equation (iγ ∂µ − m)ψ(x) = 0 with a finite mass term m 6= 0. We want to find a general plane wave solution. The strategy is to work in the standard (Dirac) basis for γ matrices (as it is well adapted to the non-relativistic limit) and first to obtain a plane wave solution in the rest frame (which only exists for a finite mass). And then to boost this solution to another frame to obtain an arbitrary plane wave. −ik·x A plane wave is written ψ(x) = ψke where ψk is a Dirac spinor (but no longer a field) and k · x = ~ ~ k0t − k · ~x = ωt − k · ~x. We first obtain the dispersion relation from the “squared Dirac equation” which is the massive Klein- 2 2 2 Gordon equation ( + m )ψ(x) = 0. On the plane wave ansatz, it gives (k + m )ψk = 0 and as ψk 6= 0 we 2 ~ 2 2 p~ 2 2 find that ω = k +m . We define ωk ≡ k + m such that ω = ωk. This means that for each wavevector ~ 26 k, there are two solutions ω = ωk corresponding to positive and negative energies . In the following we choose to write positive energy solutions as −ik·x ~ ψ(x) = u~ke with k · x = ωkt − k · ~x

~ p~ 2 2 ~ which (despite the notation) only depends on k (as ω = ωk = k + m is not independent of k). It is ~ a plane wave with energy ωk and momentum k, and u~k is a Dirac spinor. Negative energy solutions are written as ik·x ~ ψ(x) = v~ke with k · x = ωkt − k · ~x ~ which actually corresponds to a plane wave with energy −ωk and momentum −k, and v~k is a Dirac spinor. The Dirac equation (i∂/ − m)ψ(x) = 0 actually contains more information than the mere dispersion relation k2 = m2 obtained from the Klein-Gordon equation. When applied on the two above positive and negative energy solutions it gives

(k/ − m)u~k = 0 and (k/ + m)v~k = 0 In the following, we will see that these are actually projection equations getting rid of two out of the four components of a Dirac spinor (which makes sense if we remember that the electron has only two internal degrees of freedom and not four). µ µ 0 Let us specialize first to the rest frame where k = (m,~0) so that k/ = kµγ = mγ and the projection equations become 0 I 0 I (γ − )u~0 = 0 and (γ + )v~0 = 0       I 0 I 0 I 0 I 0 0 0 as m 6= 0. In the standard basis γ0 = , so that +γ = and −γ = act 0 −I 2 0 0 2 0 I

as projectors onto positive or negative energy states in the rest frame. Then u~0 = (6= 0, 6= 0, 0, 0) and v~0 = (0, 0, 6= 0, 6= 0). We therefore choose the following basis of solutions:  1   0   0   0  (1)  0  (2)  1  (1)  0  (2)  0  u =   , u =   , v =   , v =   . ~0  0  ~0  0  ~0  1  ~0  0  0 0 0 1

26Indeed, as seen in exercise session, the energy/Hamiltonian associated to the Dirac Lagrangian – and obtained from Noether’s R 3 † theorem via the energy-momentum tensor – is H = d xψ (x)i∂tψ(x) which equals ω when computed on a normalized plane wave ψ(x) = ψ(~x)e−iωt such that R d3xψ†(x)ψ(x) = 1.

54 To summarize, for ~k = 0, we have found four different solutions of the Dirac equation: two with positive energy u(α)e−imt and two with negative energy v(α)eimt (with α = 1, 2). ~0 ~0 To find an arbitrary state of motion, we perform a boost Λ to a of velocity ~v: the 4- momentum is transformed as (m,~0) → kµ = (ω,~k) with ω2 − ~k2 = m2. Such a boost on a Dirac spinor is   0 ΛL 0 easily performed in the chiral representation (CR) in which ψCR = S(Λ)ψCR = ψCR with ΛL = 0 ΛR ~σ·φ/~ 2 −~σ·φ/~ 2 ~ e and ΛR = e . The rapidity φ = φ~n is related to the velocity by ~v = tanh φ ~n. And the velocity ~ ~ is also related to the momentum by ~v = ∇~kωk = k/ωk with ω = +ωk. Using standard hyperbolic trigonom- 27 p p etry relations we therefore find that cosh(φ/2) = (ωk + m)/(2m), sinh(φ/2) = (ωk − m)/(2m) and ~ tanh(φ/2) = |k|/(ωk + m). We also know how to go from the chiral to the standard representation by a  II  unitary transformation ψ = Uψ with U = √1 . Therefore, under a Lorentz transformation SR CR 2 −II 0 0 Λ, a Dirac spinor in the standard representation (SR) transforms as ψSR → ψSR = UψCR = US(Λ)ψCR = † † US(Λ)U ψSR. Calculating the product of three matrices US(Λ)U , we obtain the Lorentz transformation in the SR as:

r ~k·~σ ! 1  Λ + Λ Λ − Λ   cosh(φ/2) −~n · ~σ sinh(φ/2)  ω + m I − R L R L k ωk+m = = ~ 2 ΛR − ΛL ΛR + ΛL −~n · ~σ sinh(φ/2) cosh(φ/2) 2m − k·~σ I ωk+m

We now apply US(Λ)U † to the Dirac spinors u(α) and v(α) to obtain u(α) and v(α) (we also know that e∓imt ~0 ~0 ~k ~k ∓ik·x (α) −ik·x (α) ik·x become e with ω = ωk). We find that the plane wave solutions are u e and v e with ~k ~k  1   0  r r (1) ωk + m  0  (2) ωk + m  1  u = k , u = k ~k  z  ~k  −  2m  ωk+m  2m  ωk+m  k+ − kz ωk+m ωk+m and  kz   k−  ωk+m ωk+m r k+ r k (1) ωk + m   (2) ωk + m  − z  v =  ωk+m  , v =  ωk+m  ~k 2m  1  ~k 2m  0  0 1

where k ≡ kx  iky. One may check the following basis independent relations:

k/ + m −k/ + m u(α) = u(α), v(α) = v(α) ~k p ~0 ~k p ~0 2m(m + ωk) 2m(m + ωk)

The Dirac spinors u(α) and v(α) are normalized as follows: ~k ~k ω u(α)†u(β) = k δαβ = v(α)†v(β) and u(α)†v(β) = 0 = v(α)†u(β) ~k ~k m ~k ~k ~k −~k ~k −~k

u¯(α)u(β) = δαβ = −v¯(α)v(β) andu ¯(α)v(β) = 0 =v ¯(α)u(β) ~k ~k ~k ~k ~k ~k ~k ~k We also define the following projectors on positive and negative energy states

2 2 ~ X (α) (α) ~ X (α) (α) P+(k) ≡ u u¯ and P−(k) ≡ − v v¯ ~k ~k ~k ~k α=1 α=1 p 27Such as cosh φ = 1/ 1 − tanh2 φ and cosh(φ/2) = p(1 + cosh φ)/2.

55 2 ~ I Exercise: show that P+ = P+ and similarly for P−. Assume that P+(k) = a + bk/ and find a and b to obtain that: k/ + m −k/ + m P (~k) = and P (~k) = + 2m − 2m

Remark: (k/ − m)u~k = 0 and (k/ + m)v~k = 0 A general solution of the Dirac equation can be written as a linear combinaison of the planes wave solutions that we found

Z 3 d k m X  (α) −ik·x ∗ (α) ik·x ψ(x) = bα(k)u e + d (k)v e (3.88) (2π)3 ω ~k α ~k k α where b and d’s are the name of the expansion coefficients. This expression will be useful when quantizing the Dirac field.

[end of lecture #9]

3.6.6 Conserved currents and charges see also exercise sheet # 5 In this section, we apply the results of the Noether theorem to obtain three conserved currents: the energy-momentum tensor, the vector current and the axial current. Starting from the Dirac Lagrangian i ¯ µ ¯ µ ¯ L = 2 [ψγ ∂µψ − (∂µψ)γ ψ] − mψψ and using the EL equations, we obtain the equations of motion for ψ ¯ µ ¯ µ ¯ and for ψ:(iγ ∂µ − m)ψ = 0 and i(∂µψ)γ + mψ = 0. The energy-momentum tensor is the Noether current associated to the symmetry under space-time trans- lation. It reads

µ ∂L ¯ ∂L µ i ¯ µ i ¯ µ µ θ ν = ∂ν ψ + ∂ν ψ ¯ − δν L = ψγ ∂ν ψ − (∂ν ψ)γ ψ − δν L (3.89) ∂(∂µψ) ∂(∂µψ) 2 2

00 ∂L Therefore θ = Π ∂ ψ + Π ¯∂ ψ¯ − L, which we recognize as the Hamiltonian density H, with Π = = ψ 0 ψ 0 ψ ∂(ψ˙) i 0 i † ∂L i 0 i † ψγ¯ = ψ and Π ¯ = = − γ ψ = ψ . After integration by part and using the equations of motion, 2 2 ψ ∂(ψ¯˙) 2 2 we find that the Hamiltonian is: Z Z 3 00 3 † H = d xθ = d xψ (i∂t)ψ (3.90)

0i i ¯ 0 i i i ¯ 0 Similarly θ = 2 ψγ ∂ ψ − 2 (∂ ψ)γ ψ and the corresponding Noether charge (the momentum) is: Z Z P~ = d3xθ0i = d3xψ†(−i∇~ )ψ (3.91)

Both equations can be summarized by saying that the 4-momentum P µ = R d3xψ†(i∂µ)ψ is a conserved charge, in which we recognize the 4-momentum operator i∂µ. There is also a so-called vector current that is conserved as a result of global U(1) phase invariance. Indeed when ψ → eiαψ with α(x) = α = constant, L is invariant. The Noether theorem implies that J µ = − ∂L iψ + iψ¯ ∂L should be conserved (the V index is for Vector and to distinguish it from V ∂(∂µψ) ∂(∂µψ¯) another conserved current called Axial, see below. This current is also often simply called J µ.). We find that µ µ ¯ µ J = JV = ψγ ψ. (3.92) We know that this is a true 4-vector (see the section on Dirac bilinears). Let’s check that it is divergenceless: ¯ µ ¯ µ ¯ µ m ¯ ¯ m ∂µ(ψγ ψ) = (∂µψ)γ ψ + ψγ (∂µψ) = − i ψψ + ψ i ψ = 0 upon using the equations of motion. In non- † † 0 † covariant notation ∂t(ψ ψ) + ∇~ · (ψ ~αψ) = 0 where ~α ≡ γ ~γ is the velocity operator, ψ ψ is the particle density and ψ†~αψ is the particle current density (here we anticipate in invoking “particles”). The conserved

56 R 3 0 R 3 † R 3 † † dQV charge is Q = QV = d xJV = d xψ ψ = d x(ψRψR +ψLψL) such that dt = 0. To clearly understand the nature of this conservation law, one would need to couple the electronic field to the electromagnetic field (in order to see the electric charge as a coupling strength) and to quantize the fields (in order to really have particles and give a meaning to the “number of particles”). † µ † µ Consider now the massless Dirac Lagrangian. In the chiral basis, it reads L = iψLσ¯ ∂µψL + iψRσ ∂µψR. On top of the U(1) invariance discussed above, there is an extra phase symmetry. Indeed, one can transform the phase of the left and right Weyl spinors separately. The usual way to define this transformation is iαγ5 −iα iα ¯ ψ(x) → e ψ(x), which means that ψL → e ψL and ψR → e ψR. Under such a transformation ψψ is not left invariant, which explains why we consider only the massless Dirac Lagrangian. According to Noether’s theorem the conserved current is

µ µ ¯ 5 µ J5 = JA = ψγ γ ψ. (3.93) µ It is called the axial current (it is also often called J5 ) and is a pseudo 4-vector. Check that it is indeed R 3 ¯ 5 0 R 3 † † conserved if and only if m = 0. The Noether charge is Q5 = QA = d xψγ γ ψ = d x(ψRψR − ψLψL), which is a pseudo 4-scalar. The conservation law is that of the difference in number of right and left particles. When m = 0 both the vector and the axial current are conserved, which means that the number of left particles is separately conserved and the number of right particles as well. Remark: - The phase transformation of a Weyl spinor is usually known as a chiral transformation. When applied on a Dirac bispinor, it decomposes into a vector transformation (acting similarly on left and right fields) and iαL an axial transformation (acting in opposite ways on the right and left fields). Indeed if ψL → e ψL and iαR iαV −iαA ψR → e ψR, one can define αV = (αR + αL)/2 and αA = (αR − αL)/2 such that ψL → e e ψL and 5 iαV iαA iαV iαAγ ψR → e e ψR or, in other words, ψ → e e ψ. Here αV is the overall global phase and αA is the relative phase between the two Weyl spinors. - The axial symmetry of the massless spinor field is quite interesting. It is a classical symmetry that does not survive upon quantizing the theory. This fact is known as an in quantum field theory. In the present case, it is called the axial or chiral or triangle or Adler-Jackiw-Bell anomaly. For a first introduction to anomalies see Zee [4] pages 243-254. The anomaly manifests itself as a non-conservation of the quantum current in the presence of an electromagnetic field.

3.6.7 Coupling to the electromagnetic field see Maggiore [2] pages 69-72 This section is essentially a repetition of the section on the gauging of a global internal U(1) symmetry, expect that it will now be performed on a Dirac field rather than on a complex scalar field. We start by making two remarks: ¯ µ ¯ - LD = iψγ ∂µψ − mψψ is a parity invariant free theory for a spin 1/2 complex and massive field (the Dirac field, which is a matter field). It is invariant under a global U(1) phase transformation ψ(x) → eiθψ(x). 1 µν µ - LM = − 4 Fµν F is a free theory for a massless spin 1 and real field (the gauge field A ). It is invariant under the (local) gauge transformation Aµ(x) → Aµ(x) − ∂µθ(x) with θ(x). Starting from the Dirac Lagrangian, we apply the gauge principle and decide to make the U(1) phase iθ(x) iθ(x) symmetry local: ψ(x) → e ψ(x). Then the derivative ∂µψ → e [∂µψ + ψi∂µθ] is not covariant (i.e. it does not transform as the field ψ) under the local phase transformation. We therefore introduce the covariant iθ(x) derivative Dµ ≡ ∂µ + iAµ with Aµ → Aµ − ∂µθ such that Dµψ → e Dµψ. Therefore we consider the ¯ µ ¯ µ modified Dirac Lagrangian L = iψγ Dµψ − mψψ. We also need to add a dynamics for the field A and therefore arrive at 1 1 L = iψγ¯ µD ψ − mψψ¯ − F F µν = ψ¯(iγµ∂ − m)ψ − F F µν − A ψγ¯ µψ = L + L − A J µ (3.94) µ 4 µν µ 4 µν µ D M µ V µ where Lint = −AµJV is the Lagrangian describing the minimal coupling between the Dirac field and the µ ¯ µ ¯ µ gauge field and JV = ψγ ψ is the vector current. Note that the vector current ψγ ψ is invariant under a

57 local phase transformation (unlike the current iφ∗∂µφ+c.c. obtained in the complex scalar case). The reason why the dynamics for the gauge field has this form is that it has to be quadratic, contain the correct kinetic energy and can not have a mass term that would spoil the invariance under the local phase transformation. Note that in the above construction, we did not introduce a knob to vary the strength for the coupling ¯ µ between the matter (Dirac) field and the gauge field. In other words, we would like to have Lint = −qAµψγ ψ µ instead of Lint = −AµJV with q a expressing the strength of the coupling between the two fields. 1 µν The way to do it is simply to realize that when introducing the Lagrangian − 4 Fµν F , there is the freedom to introduce an arbitrary constant λ > 0 28 such that

 1  L = iψγ¯ µD ψ − mψψ¯ + λ − F F µν (3.95) µ 4 µν

This Lagrangian is as valid as the one with λ = 1 in order to describe a Dirac field with a local phase symmetry. But it has one additional parameter. This parameter λ measures the relative weight between the µ two Lagrangians for the fields ψ and A . If one wants to recover the traditional√ form of the Lagrangian, this parameter can be absorbed in a redefinition of the gauge field Aµ → Aµ/ λ. Then it means that the gauge µ iqθ(x) transformation√ of the field A is unchanged but that the phase transformation of ψ becomes ψ → e ψ with q ≡ 1/ gλ and the covariant derivative is now Dµ = ∂µ + iqAµ. Therefore q can be considered as a characteristic property of the matter field (it will later be interpreted as the electric charge of an electron29). The Lagrangian becomes 1 1 L = iψγ¯ µD ψ − mψψ¯ − F F µν = ψ¯(iγµ∂ − m)ψ − F F µν − A qψγ¯ µψ (3.96) µ 4 µν µ 4 µν µ which is the QED Lagrangian describing electrons, photons and their interaction. The minimal coupling is µ µ µ µ µ now ∂µ → Dµ = ∂µ + iqAµ i.e. p = i∂ → iD = i∂ − qA . The EL equation for the Dirac field gives

µ µ µ (iγ Dµ − m)ψ = (iγ ∂µ − qγ Aµ − m)ψ = 0 (3.97) and for the gauge field: µν ¯ ν ν ∂µF = qψγ ψ = j (3.98) µν µν ν From the antisymmetry of F we obtain ∂ν ∂µF = ∂ν j = 0, in other words, we have a conserved current ν ¯ ν R 3 0 R 3 ¯ 0 R 3 † ν ν j = qψγ ψ and a conserved charge Q = d xj = q d xψγ ψ = q d xψ ψ. Here j = qJV : this should be contrasted to the complex scalar field case for which jν = iφ∗Dν φ + c.c. 6= J ν = iφ∗∂ν φ + c.c.. Upon quantizing the Dirac field, we will see that q is actually the electric charge carried by a single particle (an electron). Here we see the role of the electric charge as the coupling strength of the matter field to the gauge field. Remark: Note that µ ∂L δS JV = − = − , (3.99) ∂Aµ δAµ which is often taken as a definition of the current of a matter field that is coupled to a gauge field.

3.6.8 The αj and β matrices and the Dirac equation as the square root of the KG equation Bohr: “What are you working on, Mr Dirac?”. Dirac: “I am trying to take the square root of something.”

28It is conventional to call this parameter 1/g instead of λ. The reason is that g plays the role of a dimensionless . It is actually ∼ the fine structure constant. 2 29 1 2 ~ q And then indeed g = λ = q which is units such that = 1 and c = 1 is essentially the fine structure constant α = ~c with q the charge of a single electron.

58 µ 0 The Dirac equation iγ ∂µψ = mψ can also be written iγ ∂0ψ = [~γ · (−i∇~ ) + m]ψ. Multiplying this equation on the left with γ0, we obtain

0 0 i∂tψ = [γ ~γ · (−i∇~ ) + mγ ]ψ = [~α · (−i∇~ ) + mβ]ψ (3.100)

0 where ~α ≡ γ0~γ and β ≡ γ . These d+1 matrices (d α’s and one β) satisfy the following algebra (consequence of the Clifford algebra): 2 {αi, αj} = 2δi,j, {αi, β} = 0 and β = 1. (3.101)

The quantity ~α · (−i∇~ ) + mβ will later be identified as a (single-particle) Hamiltonian operator HD = ~α · ~p + R 3 † mβ = i∂t. Do not mistake this Hamiltonian operator i∂t with the energy of the Dirac field H = d xψ i∂tψ. µ µ ν µν It is important to realize that the Dirac equation is (iγ ∂µ − m)ψ = 0 AND {γ , γ } = 2η (or 2 equivalently i∂tψ = [~α · (−i∇~ ) + mβ]ψ AND {αi, αj} = 2δi,j, {αi, β} = 0 and β = 1). Indeed, imagine that only i∂tψ = [~α·~p+mβ]ψ, where ~p = −i∇~ is the momentum operator, is given without the algebra of α’s and β. For a moment, imagine that the α’s and β are just unknown coefficients (forget that you already know 2 2 2 2 that they are matrices). Then i∂t(i∂tψ) = −∂t ψ = [~α · ~p + mβ] ψ = [pipjαiαj + m β + mpj(αjβ + βαj)]ψ. 2 1 2 2 This means that −∂t ψ = [pipj 2 {αi, αj} + m β + mpj{αj, β}]ψ. But we would like this equation to be the  2 2 2 2 massive KG equation ( + m )ψ = 0 i.e. −∂t ψ = [~p + m ]ψ. In order for this to be true, we have to take 1 2 2 {αi, αj} = δij, β = 1 and {αj, β} = 0. Such an algebra can not be satisfied by ordinary numbers and one has to conclude that the α’s and β are matrices. This is actually more or less how Dirac discovered his equation and the γ matrices in 1928.

3.6.9 Conclusion on the Dirac equation To conclude, we contrast (i) the historical Dirac equation of relativistic quantum mechanics, (ii) the modern version of the Dirac equation as a field theory and (iii) the Dirac equation as an emergent low-energy single-particle theory in condensed matter physics (graphene, bismuth, topological insulators, etc.). Here we explicitly write c and ~ and do not take c = 1 and ~ = 1.

As a single-particle relativistic quantum mechanical equation The original Dirac equation was invented in 1928 by as a relativsitic version of single particle ~ (−i~∇~ )2 quantum mechanics (a generalization of the Schr¨odingerequation i ∂tψ(t, ~x) = 2m ψ(t, ~x) where ψ(t, ~x) is a wavefunction for a single electron). The Dirac equation reads

2 i~∂tψ(t, ~x) = [c~α · (−i~∇~ ) + mc β]ψ(t, ~x) (3.102)

ˆ 2 The Dirac Hamiltonian operator HˆD = c~α · ~p + mc β is an operator in the Hilbert space of a single electron (it is an operator both through ~pˆ – which acts on the orbital degrees of freedom, the external motion – and through the 4 × 4 matrices ~α and β – which act on the internal degrees of freedom such as the spin). It acts on |ψ(t)i, which is a quantum state in single particle Hilbert space. It can be represented as a single particle wavefunction ψ(t, ~x) = h~x|ψ(t)i. The Dirac equation also reads d i~ |ψ(t)i = [c~α · ~pˆ + mc2β]|ψ(t)i = Hˆ |ψ(t)i (3.103) dt D in bra-ket notation. As a single-particle relativistic quantum equation, the Dirac equation is better behaved than the Klein- Gordon equation (e.g., it does not suffer from negative probability density). But it still has to face difficulties related to the negative energy states, i.e. with the fact that its Hamiltonian is not bounded from below. First, we have already seen that quantum + relativistic imply the possibility to create and destroy particles (see the discussion in the introduction chapter) and hence it is impossible to write an equation for a single electron. Just like nobody writes an equation for a single photon but writes an equation for the photonic field (known as the electromagnetic field).

59 Second, the main difficulty is the existence of a negative energy branch for the dispersion relation, which also exists in a classical relativistic theory, but which becomes problematic in the quantum world because of the possibility of quantum jumps (across the mass gap 2mc2) from the positive to the negative energy branch. Such jumps mean that there is no stable groundstate for a single electron described by the Dirac equation of relativistic quantum mechanics. In order to save his equation and render a single electron stable, Dirac devised the following idea: he imagined that all negative energy states are occupied by other electrons. This is called the Dirac sea. Because of the Pauli exclusion principle (electrons are fermions), an electron in a positive energy state can not jump to the negative energy branch because there are no empty states available. Dirac postulated that the groundstate of his theory consists in such an unobservable Dirac sea (it is indeed the many-body state with the lowest energy: all single-particle negative states are filled). Now an extra electron becomes a stable particle that does not decay to negative energy states. You notice that even in the context of a single relativistic and quantum electron, Dirac had to introduce an infinite amount of other electrons to fill the negative energy branch states. A consequence of this bold step is to imagine an electron absorbing a very energetic photon and being promoted from the Dirac sea to the positive energy branch. In such a process, in addition to an electron in the positive energy branch, one obtains a hole in the negative energy branch. This hole is defined as the absence of an electron in an otherwise filled band. This hole actually behaves very similarly to an electron (same mass, same spin 1/2) except that it has an opposite energy 30 and charge 31. It is an anti-electron (also known as a ). That’s how Dirac invented anti-matter in 1931! We will later see that the modern view on anti-matter is different.

As a field equation The modern vision of the Dirac equation is that it is actually a field equation for the electronic field (just like Maxwell equation describes the electromagnetic or photonic field):

0 µ i∂tψ(x) = [c~α · (−i∇~ ) + ω0β]ψ(x) i.e. iγ ∂tψ(x) + ic~γ · ∇~ ψ(x) − ω0ψ(x) = (iγ ∂µ − ω0)ψ = 0 (3.104)

mc2 where ω0 = ~ is a characteristic frequency and in the last equality we set c = 1. It describes the field ψ(x) of all electrons (not just a single electron). This equation is for a classical field32. It is the equivalent for the electronic field of what Aµ − ∂µ(∂ · A) = 0 is for the electromagnetic field. The field itself is therefore not a quantum state (not a ket in a Hilbert space). Actually, at this point, there is no Hilbert (or Fock) space and no quantum states; just a classical field with internal dofs (a bispinor field). After quantization (which we will perform in the next chapter), we will understand that ψ is actually an annihilation operator ψˆ(x) acting in Fock space (which is a special kind of Hilbert space). And the Dirac equation is: 2 ˆ ~ mc ˆ i∂tψ(x) = [c~α · (−i∇) + ~ β]ψ(x) (3.105) The field operator ψˆ(x) annihilates an electron at event x in spacetime. The particles (the electrons) are just the excitation quanta of the electronic field (known as the Dirac field). Likewise ψˆ†(x) creates an electron at event x in spacetime. A particular state in Fock space is the vacuum |vaci, which by definition is empty of particles ψˆ(x)|vaci = 0. A single electron state looks like ψˆ†(x)|vaci. The Dirac equation (3.105) can then be seen as the Heisenberg equation of motion for the time-dependent field operator.

30The energy of the hole is defined as minus the energy of the missing electron (it is therefore positive), its momentum is equal to minus that of the missing electron and its spin projection is the opposite of that of the missing electron. The whole idea is actually very similar to that of a hole in the filled valence band of a semi-conductor as originally understood by Peierls almost simultaneously in the 30’s. 31The electric charge of the hole is q = e > 0, which is that of a missing electron. The charge of the electron is q = −e < 0. 32If you are annoyed by the appearance of ~ in the Dirac equation seen as an equation of motion for a classical field, remember that for us m is just the name of a coefficient in front of the ψψ¯ term in the Lagrangian. It is not yet a mass. We could have 2 µ called this coefficient ω0 instead of mc /~ as a characteristic zero wavevector frequency and written (iγ ∂µ − ω0)ψ = 0 with c = 1. A more serious difficulty about the interpretation of the Dirac equation as a classical field equation is that the field is actually not a usual commuting number but an anticommuting number (known as a ) as a result of fermionic statistics. You will encounter this notion when studying the quantization of fields via path integrals.

60 In conclusion, in front of a notation like ψ(x), one should always wonder whether it means: 1) h~x|ψ(t)i, i.e. a wavefunction, where |ψ(t)i is a ket, a vector in the space F of quantum states (either a Hilbert or a Fock space), i.e. a quantum state, such that the dimension of this vector (the number of components of the vector) is that of the space states dim F. OR 2) ψˆ(x), i.e. an operator acting in the space of quantum states, i.e. a matrix whose size is the square of that of the space of states, i.e. (dim F)2. Notations with no meaning are therefore |ψˆi or |ψˆ(x)i or |ψˆ(t)i, etc.

As an emergent single-particle quantum mechanical equation in condensed matter physics There are several condensed matter systems (such as graphene, boron nitride, bismuth, topological insulators, etc.) in which the Dirac Hamiltonian for a single-particle emerges as a low-energy effective description. Graphene is a two dimensional crystal of carbon atoms with a honeycomb . The latter is a triangular Bravais lattice with two atoms per unit cell. In undoped graphene, at low energy, and in the vicinity of the two Fermi points (valley K or K’), the motion of an electron is described by a massless 2+1 Dirac equation:

i~∂tψ(t, ~x) = vF ~σ · (−i~∇~ )ψ(t, ~x) (3.106)

where vF ≈ c/300 is the velocity of electrons, known as the Fermi velocity, the  sign refers to valley K/K’, and ~σ = (σx, σy) acts in the sublattice space of the two atoms in the unit cell. See also exercise sheet # 4. In this context, the Dirac equation does not suffer from its Hamiltonian being unbounded from below. Indeed the microscopic Hamiltonian (tight-binding model for example) is bounded from below and the Dirac Hamiltonian only emerges as an effective low-energy and long-wavelength description. In addition, it is not truly relativistic here neither as the velocity of electrons is vF  c and the Lorentz symmetry is only emergent. The Dirac equation as an effective single-particle quantum mechanical description in condensed matter physics is therefore well-behaved.

3.6.10 Bonuses: extra material Clifford algebra in different space dimensions d At d = 3, the Clifford algebra can be represented by four 4 × 4 matrices γµ with µ = 0, 1, 2, 3. And the chirality is defined by γ5 = iγ0γ1γ2γ3, which is another 4 × 4 matrix with the property that it anticommutes with the four γµ. In space dimension d (space-time dimension d + 1), there are d + 1 γµ matrices as µ = 0, 1, ..., d. In order to satisfy the Clifford algebra, the matrix size of the γµ’s has to depend on d. In d = 2 and d = 1, 2 × 2 matrices are enough to satisfy the Clifford algebra. 0 1 2 In d = 2, the three Pauli matrices can be used. For example, γ = σz, γ = iσx and γ = iσy. These matrices anticommute. The first squares to one and the last two to −1. The first is hermitian and the last two are anti-hermitian. There is no chirality operator γ5 as iγ0γ1γ2 = I does not anticommute with the γµ’s. 0 1 In d = 1, we can take γ = σz and γ = iσx (check that they satisfy the Clifford algebra). And 0 1 5 µ γ γ = −σy. This matrix can be taken as a chirality operator γ as it anticommutes with the γ ’s, it squares to 1 and is hermitian. [ d+1 ] 5 The general rule is that the d + 1 gamma matrices have size 2 2 . And the chirality operator γ only exists in even space-time dimensions.

61 From the Dirac to the in the non-relativistic limit The Pauli equation (i.e. Schr¨odinger+ Zeeman effect) for an electron (with mass m and charge −e < 0) in a magnetic field B~ = ∇~ × A~ reads " # (−i~∇~ + eA~)2 i~∂ ϕ(~x,t) = − ~µ · B~ ϕ(~x,t) (3.107) t 2m where ϕ(~x,t) is a two-component spinor field and −e ~ −e ~ ~µ = g ~σ = ~σ = −µ ~σ (3.108) 2m 2 m 2 B

e~ is the magnetic moment of the electron (this corresponds to a gyro-magnetic factor g = 2) and µB ≡ 2m is the Bohr magneton. The Pauli equation can be obtained as the non-relativistic limit of the Dirac equation (see exercise 3 in sheet # 3). An important point to notice is the reduction from 4 to 2 bands due to the 2 2 p 2 2 2 4 2 ~p ~p projection onto the positive energy bands only E =  ~p c + m c → E ≈ mc + 2m → E ≈ 2m (the last step just amounts to redefining the zero of energy). Another important point is the appearance of a magnetic moment (a spin 1/2 with gyro-magnetic factor g = 2) and of the corresponding Zeeman coupling to the magnetic field.

3.7 Conclusion on classical fields

see Ryder [3] pages 70-71. As a general conclusion on classical fields in 3+1, we propose the following table. It contains the field names, their geometrical (tensorial) nature and also their equations of motion. We separate the information on the orbital motion (external degrees of freedom) contained in a second order differential equation (essen- tially the Klein-Gordon equation ( + m2)... = 0) from that on the internal degrees of freedom (i.e. on the spin degrees of freedom) contained in a first order differential equation. The latter projects out unwanted or unphysical degrees of freedom (dof). In the words of S. Weinberg, “these equations are a confession that we have too many spin components”. Note the difference between the Lorenz gauge condition – which is a gauge choice and does not have to be imposed – and the Proca, Weyl or Dirac equations, which do not result from a choice. This difference is related to the peculiar nature of a gauge field. The Proca equation eliminates 1 dof out of 4 (4 → 3: massive spin 1 with Sz = 1, 0). Gauge fixing (such as the radiation gauge) allows to reduce the number of dof from 4 → 2 (massless spin 1 with Sz = 1 but not Sz = 0) for the Maxwell field. The Weyl equation reduces the number of dof from 2 → 1 (massless spin 1/2, helicity is conserved S~ · ~k = −1/2). The Dirac equation reduces the number of dof from 4 → 2 (massive spin 1/2, Sz = 1/2).

field name geometric nature of field external dofs, dispersion relation spin (# of internal dofs), projection p Klein-Gordon scalar (massive) ( + m2)φ = 0, ω =  ~k2 + m2 S = 0 (1), no projection 2 µ p~ 2 2 µ Proca vector (massive) ( + m )A = 0, ω =  k + m S = 1 (4 → 3), ∂µA = 0 µ µ ~ µ 0 Maxwell vector & gauge A − ∂ (∂ · A) = 0, ω = |k| S = 1 (4 → 2), ∂µA = 0 & A = 0 ~ µ (left) Weyl spinor (massless) ψL = 0, ω = |k| S = 1/2 (2 → 1),σ ¯ ∂µψL = 0 2 p~ 2 2 µ Dirac bi-spinor (massive) ( + m )ψ = 0, ω =  k + m S = 1/2 (4 → 2), (iγ ∂µ − m)ψ = 0

Second order differential equations: d’Alembert equation ... = 0 Klein-Gordon equation ( + m2)... = 0 µν 2 ν ν ν 2 ν 2 ν ν Proca equation ∂µF = −m A (i.e. A − ∂ (∂ · A) + m A = 0) implies ( + m )A = 0 (as ∂ν A = 0 see below).

62 First order differential equations: µ 0 Lorenz gauge condition ∂µA = 0 (A = 0 is a further condition to fully fix the gauge: it is known as the radiation gauge) µ Weyl equationσ ¯ ∂µ... = 0 µ Dirac equation (γ i∂µ − m)... = 0 ν µν 2 ν Proca equation implies ∂ν A = 0 (indeed ∂ν ∂µF = −m ∂ν A = 0 and m 6= 0).

[end of lecture #10]

63 Chapter 4

Canonical quantization of fields

In this chapter, we will turn the classical field theories discussed previously into quantum field theories. For that, we will use the machinery of “canonical quantization” (and not the alternative machinery of “path integral quantization”, for example). The idea is to identify pairs of canonically conjugate variables (such as q and p) in the classical field theory and then to impose that their (equal-time) commutator is non-vanishing and proportional to i~ (i.e. [ˆq, pˆ] = i~). It is the usual idea of replacing a Poisson bracket {q, p}PB = 1 with a commutator (i~)−1[ˆq, pˆ] = 1, which turns c-numbers (i.e. commuting numbers q, p) into q-numbers (i.e. operatorsq ˆ,p ˆ)1. Canonical quantization relies on the Hamiltonian formulation of classical field theory, whereas path integral quantization relies on the Lagrangian formulation. Canonical quantization is not explicitly covariant as time plays a special role in the Hamilton formalism. We start with scalar fields (first real, then complex) and continue with the Dirac field. We will not study the case of the electromagnetic field as this is done in other master 2 courses such as “atoms and photons”.

4.1 Real scalar fields

[see Ryder [3], pages 129-139 and Maggiore [2], pages 83-88]

4.1.1 Quantum field

1 2 m2 2 We consider a real scalar field φ(x) with Lagrangian L = 2 (∂φ) − 2 φ . The Hamiltonian is H = R 3 1  2 2 2 2 ∂L d x (∂0φ) + (∇~ φ) + m φ . We identify the field φ(x) and the conjugate field Π(x) ≡ = 2 ∂(∂0φ(x)) ∂0φ(x). For each point ~x of space, we have a generalized coordinate φ(x) = φ(t, ~x) (you should think of it as q~x(t) with ~x labeling the degrees of freedom) and a generalized conjugate momentum Π(x) = Π(t, ~x) (think of it as p~x(t)). We now impose the following equal-time commutation relations (ETCR):

[φ(t, ~x), Π(t, ~x0)] = i~δ(~x − ~x0) and [φ(t, ~x), φ(t, ~x0)] = 0 = [Π(t, ~x), Π(t, ~x0)] (4.1) Note that the above ETCR are not manifestly Lorentz covariant as t plays a special role. Note also that ~ → 0 just affects the first commutator. Once every commutator vanishes, we are back to a classical field theory. This trivial remark will sound very strange when applied to a Dirac field. From now on, we will use units such that ~ = 1 on top of c = 1.

1We are used to distinguish commuting numbers (ordinary numbers, c-numbers, classical numbers) from operators (q- numbers, quantum numbers, queer numbers), which usually do not commute. A third even stranger category is anti-commuting numbers (called Grassmann variables): these are not operators but numbers, but they anticommute rather than commute as ordinary numbers do. They are useful in the path integral quantization of fermionic fields.

64 The consequence of imposing ETCR is that the classical field φ(x) has been promoted to a quantum field, i.e. to an operator (and called a quantum field operator). We don’t write hats on φ (for simplicity), but we mean it! The quantum field φ(x) is an hermitian operator φ†(x) = φ(x) because the classical field was real. It is an operator in the Heisenberg representation φ(t, ~x) with an explicit time-dependence. It is therefore very different from a wavefunction or a quantum state (a ket)2. The EL equation of motion ( + m2)φ(x) = 0 is now interpreted as a Heisenberg equation of motion3 for the time-dependent operator iHt −iHt φ(t, ~x). In the φ(t, ~x) = e φ(0, ~x)e and therefore ∂tφ(x) = −i[φ(x),H]. Exercise: From φ˙ = −i[φ, H], φ¨ = −i[φ,˙ H] and [φ(t, ~x), φ˙(t, ~x0)] = iδ(~x − ~x0), show that ( + m2)φ(x) = 0. We recall the mode expansion of a classical free and real scalar field Z φ(x) = a(k)e−ik·x + a∗(k)eik·x (4.2) k µ 0 ~ 0 p~ 2 2 where it is understood that in the above expression k = (k , k) with k = ωk ≡ k + m and the R R d3k shorthand notation ≡ 3 . At this point a(k) is just the name of the coefficient of the Fourier k (2π) 2ωk expansion of a field φ(x) satisfying the massive KG equation ( + m2)φ = 0. Because of quantization, the mode expansion of the quantum field now reads Z φ(x) = a(k)e−ik·x + a†(k)eik·x (4.3) k and a(k) and a†(k) become operators as well (but non-hermitian as the classical a(k) is a complex and not a real number). For the conjugate field operator Z −ik·x † ik·x Π(x) = ∂0φ(x) = (−iωk) a(k)e − a (k)e (4.4) k Check that the ETCR of φ and Π imply that † 0 3 ~ ~ 0 [a(k), a (k )] = 2ωk(2π) δ(k − k ) (4.5) and [a(k), a(k0)] = 0 = [a†(k), a†(k0)] (4.6) This starts to smell like the algebra of annihilation and creation operators (i.e. [a, a†] = 1, [a, a] = 0 = [a†, a†]) familiar from the quantum mechanical harmonic oscillator. Hence the name a(k) for the coefficients in the mode expansion of the field φ(x).

4.1.2 Particle interpretation and Fock space In order to construct the Fock space and to strengthen the analogy with the harmonic oscillator, it is easier to work with a finite spatial volume V = L3 and periodic boundary conditions. Therefore Z d3k 1 X → and (2π)3δ(~k − ~k0) → V δ (4.7) (2π)3 V ~k,~k0 ~k ~ 2π 1 P 1 −ik·x  with k = (nx, ny, nz) where nj ∈ Z. Let a~ ≡ √ a(k) so that φ(x) = ~ √ a~ e + h.c. L k 2ωkV k V 2ωk k instead of (4.3). Then the commutation relations for the a~k operators read [a , a† ] = δ and [a , a ] = 0 = [a† , a† ] (4.8) ~k ~k0 ~k,~k0 ~k ~k0 ~k ~k0 Now, we really have a clear analogy with annihilation and creation operators of the harmonic oscillator. Except that we have one such harmonic oscillator (one such mode) for each ~k. 2In particular, it is not a wavefunction that has been quantized a second time. It is not a quantum wavefunction with a hat on top. Hence, the dislike with the terminology of “second quantization”. Much better would be “occupation number representation” or “annihilation/creation formalism”. 3Remember that, in quantum mechanics, an operator A in the usual (Schr¨odinger) picture – for simplicity, we assume that it does not explicitly depend on time – becomes A(t) ≡ eiHtAe−iHt in the Heisenberg picture and satisfies the Heisenberg equation of motion A˙(t) = −i[A(t),H].

65 Detour by the 1D quantum harmonic oscillator

2 mω2q2 p 0 ~ The 1D quantum mechanical harmonic oscillator has a Hamiltonian H = 2m + 2 with [q, p] = i where q is the position and p the canonically conjugated momentum. One usually defines a ≡ q√+ip and a† = q√−ip 2 2 q (in units such that the characteristic length ~ = 1 and ~ = 1), which satisfy the following algebra mω0 [a, a†] = 1 and [a, a] = 0 = [a†, a†] as a consequence of [q, p] = i and [q, q] = 0 = [p, p]. The Hamiltonian ω0 † † † 1 † † is then rewritten as H = 2 (a a + aa ) = ω0(a a + 2 ). We know that [a, a ] = 1 implies that N = a a is the number operator. Indeed, call |N = 0i = |vaci the vacuum state (i.e. the groundstate of the harmonic ω0 † † oscillator, H|0i = 2 |0i), which is defined by a|0i = 0. Then [N, a] = −a and [N, a ] = a such that [N, a†]|0i = a†|0i, which means that N(a†|0i) = 1(a†|0i). In other words a†|0i ∝ |N = 1i. We can choose a†|0i = |N = 1i, which shows that a† is a creation operator for an excitation quantum of the harmonic oscillator. Continuing this construction, we arrive at √ √ a|Ni = N|N − 1i , a†|Ni = N + 1|N + 1i and a†a|Ni = N|Ni (4.9) which confirms that N = a†a is the number operator (its eigenvalues are in N), a† is a creation operator (it creates a single excitation of the harmonic oscillator) and a is an annihilation operator (it destroys a single excitation of the harmonic oscillator). The Hilbert space of states for the harmonic oscillator4 can be thought of as a Fock space5 for excitation quanta (i.e. particles that you may call phonons): the number of excitations quanta is not fixed and an orthonormal basis is {|Ni,N ∈ N}. Because the number of excitation quanta that can be accommodated in this single mode is not bounded, we also know that we are dealing with bosons. In the case of a single harmonic oscillator, the particles (i.e. the excitation quanta) have no dispersion relation (here, the phonons are located at one point in space and can not move) and are characterized by a single frequency ω0. Here, we would like to make a connection with quantum statistics as seen in lectures on . The statistics for such phonons is the Bose-Einstein statistics at zero chemical potential (also known as Planck’s distribution). In the groundstate, there are no phonons and when the increases, the number of phonons increases. The number of phonons is not conserved. We are obviously not describing matter particles. One should have in mind the two equivalent descriptions: either a single massive particle in a harmonic oscillator (Hilbert space of a single 1D particle) or an ideal gas of non-conserved bosons which are the excitation quanta of the harmonic oscillator (Fock space for 0D excitation quanta). Remark: In non-relativistic quantum mechanics of identical particles (many-body physics), the idea of working with number representation (i.e. with occupation numbers of different modes) is known as second quantization formalism. It is an alternative to first quantization formalism in which particles carry labels (i.e. particles are numbered), are assigned to specific single-particle states and the many-body wavefunction is then symmetrized (for bosons) or antisymmetrized (for fermions). Here we have a very simple example, with a unique 1D massive particle in a harmonic oscillator, which can be either described by a wavefunction such as hq|1 : χni (1: meaning the particle number 1) or alternatively described by the occupation number † Nω0 = a a of its single mode of frequency ω0 given a way function such as hq|Nω0 = ni. The two kets are

indeed equal |1 : χni = |Nω0 = ni but the first refers to a single massive particle in 1D and the second to a 0D ensemble of n excitation quanta (phonons).

Back to quantum field theory Going back to the quantized scalar field, show that the Hamiltonian can be rewritten as Z   3 1  2 2 2 2 X ωk  † †  X † 1 H = d x (∂0φ) + (∇~ φ) + m φ = a a + a a = ωk a a + (4.10) 2 2 ~k ~k ~k ~k ~k ~k 2 ~k ~k

4This is actually not H = {|qi, q ∈ R} as localized states far from the minimum q = 0 of the harmonic potential are not −q2/2 allowed. It is a smaller Hilbert space H = {|χni, n ∈ N} if we call χn(q) = hq|ni ∝ Hn(q)e the eigen-wavefunctions, where Hn are the Hermite polynomials and H|χni = (n + 1/2)ω0|χni. 5 ∞ The Fock space is F = ⊕j=1Ej where Ej is a one-dimensional Hilbert space generated by |N = ji. One has that H = F.

66 This looks like a sum of independent harmonic oscillators (one for each ~k) and each with its own frequency ωk. We also see that, despite the fact that the massive KG equations gives a dispersion relation ω = ωk with positive and negative branches, the energy of the field is always positive6. Indeed, the expectation value of the Hamiltonian operator in any state is   X † 1 X ωk hHi = ωk ha a i + ≥ > 0 (4.11) ~k ~k 2 2 ~k ~k

as ha† a i ≥ 0. ~k ~k The Fock space is constructed by analogy with the harmonic oscillator. For each mode ~k of frequency † † ωk, there is a pair of creation/annihilation operators a~ and a~k such that N~k = a~ a~k is the occupation P k k number operator in that mode. The operator N = ~k N~k gives the total number of excitation quanta (i.e. the total number of particles) contained in the field. The vacuum state is defined as being annihilated by ~ all annihilation operators a~k|vaci = 0, ∀k: in other words, the vacuum contains no particles (no excitation quanta). A single particle state is |~ki = |N = 1i = a† |vaci. Show that it is an energy eigenstate with ~k ~k energy ωk above that of the vacuum.

4.1.3 Vacuum energy and normal ordering

P ωk P ωk 7 The vacuum energy is obtained from H|vaci = ( ~k 2 )|vaci. The energy hvac|H|vaci = ~k 2 is infinite . It comes from the zero-point motion of each mode. This absolute energy is unobservable (but energy differences are: see the exercise sheet # 6 on the Casimir effect). It is usually subtracted by redefining the zero of energy as being the energy of the vacuum state8. Then

X † H ≡ H − hvac|H|vaci = ωka a (4.12) ~k ~k ~k Another way to look at the same issue is to realize that when going from the classical to the quantum theory, there is an ambiguity. Indeed, take a complex field φ: when in the classical theory one has φ∗φ = ∗ 2 † † 1 † † † φφ = |φ| in the Hamiltonian density, should it be quantized as φ φ or φφ or 2 (φ φ + φφ ) or fφ φ + (1 − f)φφ† with f an arbitrary number between 0 and 1? We therefore fix the following prescription known as normal ordering (and symbolized by a pair of colons A →: A :): upon quantization, the field operators have to be normal ordered, i.e. creation operators are to be moved to the left of annihilation operators (while preserving the order among either creation operators and also among annihilation operators). The quantum Hamiltonian is then Z 3 1  2 2 2 2 X ωk  † †  X † H = d x : (∂0φ) + (∇~ φ) + m φ : = : a a + a a : = ωka a (4.13) 2 2 ~k ~k ~k ~k ~k ~k ~k ~k which is equivalent to removing the vacuum energy. Now H|vaci = 0. A precise definition of normal ordering is that † † † † : a(k1)a (k2) : = a (k2)a(k1) and : a(k1)a(k2)a(k3) a(k4) : = a(k3) a(k1)a(k2)a(k4) (4.14)

6Discuss this point in more details with the remark of J.Y. Ollitrault. There is a choice behind the way the mode expansion is interpreted which leads to positive energy. Note also that, contrary to the Dirac interpretation of the Dirac equation, we did not need to invoke the Pauli principle to fill the negative energy states and have a lower bound for energy. This is good news because Dirac’s argument would not work for bosons. Here the formalism of quantum field theory allows one to obtain a total energy that is positive despite the fact that it is built out of modes that have positive and negative energy branches. 3 7Even the energy density is infinite as 1 P ωk = R d k ωk ∼ R Λ dkk2ω ∼ R Λ dkk3 ∼ Λ4 diverges when the UV cutoff V ~k 2 (2π)3 2 0 k Λ → ∞. The UV cutoff Λ ∼ 1/a is related to the short distance structure of space (where a is an artificial lattice spacing) and was here introduced in order to show the nature of the divergence. Don’t mix it with the large distance L box that controls the IR behavior, i.e. the large distance behavior. 8Is this without physical consequence? Even in the presence of gravitation?

67 The 3-momentum is classically P~ = −i R d3xΠ(−i∇~ )φ (remember that this is just the Noether charge associated to the space translation symmetry). After quantization, it becomes

Z X ~k   X P~ = −i d3x : Π(−i∇~ )φ := : a† a + a a† := ~ka† a (4.15) 2 ~k ~k ~k ~k ~k ~k ~k ~k so that the vacuum has zero momentum P~ |vaci = 0. Remark: - Note that there is the 3-momentum operator −i∇~ , which is the space translation generator. And there is the 3-momentum of the whole field P~ , which upon canonical quantization also becomes an operator. The first is an operator in the sense that it acts on the field seen as a function (it is a gradient). Whereas the second is a many-body operator (it concerns the whole gas of particles): it acts in Fock space. In the language of second quantization, the first would be called a single-particle operator (acting in the Hilbert space for a single-particle) and the second a many-body operator (acting in Fock space). - For the classical scalar field, we found an orbital angular momentum L~ but no intrinsic angular momentum (no spin) S~ = 0. Therefore the corresponding excitation quanta have zero spin.

4.1.4 Fock space and Bose-Einstein statistics A single particle state is |~ki = |N = 1i = a† |vaci. It is indeed an eigenvector of the number operator N with ~k ~k eigenvalue 1. It is also an eigenvector of H with eigenvalue ωk (after normal ordering) and an eigenvector of P~ with eigenvalue ~k (after normal ordering). Notice that particles (i.e. field excitation quanta) only have positive energy despite the fact that the dispersion relation has positive and negative energy branches. ~ ~ † † A multi-particle state is |k1, ..., kni = a ....a |vaci. It is an eigenvector of N with eigenvalue n, an energy ~k1 ~kn ~ ~ eigenvector with eigenvalue ωk1 + ... + ωkn and a 3-momentum eigenvector with eigenvalue k1 + ... + kn. † † ~ ~ † † Because of [a , a 0 ] = 0, the multiparticle state |k1, ..., kni = a ....a |vaci is symmetric under any ~k ~k ~k1 ~kn exchange of particle, which means that the particles (excitation quanta of the scalar field) are bosons. There is actually a general connection between the fact that an integer spin field (here scalar field means spin 0) is quantized with commutators and the fact that the excitation quanta obey Bose-Einstein statistics. Making a connection to statistical mechanics, the average occupation of a mode ~k in an equilibrium state at temperature T (often simply called a thermal state) is given by

† 1 n~ ≡ ha~ a~ iT = (4.16) k k k eωk/T − 1 which is the Planck occupation factor (i.e. the Bose-Einstein distribution at zero chemical potential)9. The number of particles is not conserved: it changes with temperature. Indeed, the total number of particles is hNi = P 1 , which goes as T 3 → ∞ when T  m and as m3e−m/T → 0 when T  m. T ~k eωk/T −1 4.1.5 Summary on canonical quantization Canonical quantization is the usual way of going from classical to quantum mechanics by turning Poisson brackets into commutators, here extended from mechanics to fields. One has to identify a field and its conjugate field and then to impose equal-time commutation relations. The field becomes a quantum operator. Other operators defined from the field (often bilinears in the fields such as the energy, the momentum, the angular momentum, etc.) have to be normal ordered in order to be unambiguously defined upon quantization. This procedure of normal ordering also removes unobservable infinites (such as the vacuum energy).

9 We are taking units such that Boltzmann’s constant kB = 1 in addition to ~ = 1 and c = 1. All these fundamental constants are merely conversion factors that can safely be taken to be equal to 1 (kB from energy to temperature, c from length to time, ~ from energy to frequency or from momentum to wavevector) unlike coupling strengths (as the electric charge unit e or the gravitation constant G), which are also fundamental constants but of a different nature. They are actually not constants (cf. group and the flow of “coupling constants”) and their value has a meaning (the fine structure constant, which 2 e2 2 1 is essentially e as α = ~c = e ≈ 137  1, can obviously not be taken to equal one).

68 4.2 Complex scalar field and anti-particles

It is worth considering the complex scalar field φ(x) as it brings an essential novelty compared to the real ∗ µ 2 ∗ field. The novelty is related to its U(1) internal symmetry. The Lagrangian is L = (∂µφ) (∂ φ) − m φ φ, ∗ ∗ with φ (x) 6= φ(x). The conjugate field is now Π = ∂0φ . We impose ETCR, which have the same expression † as before (exercise by writing all of them). Upon quantization the conjugate field Π = ∂0φ . The mode expansion is slightly different from the case of the real scalar field as one does not impose φ†(x) = φ(x). Therefore Z φ(x) = a(k)e−ik·x + b†(k)eik·x (4.17) k µ ~ † where as usual k = (ωk, k) here. The coefficient a(k) and b (k) are both related to the field φ(x). Actually, they are related to its Fourier transform ϕ(k) by a(k) ≡ ϕ(k) and b†(k) ≡ ϕ(−k). We introduced the notation b because here ϕ†(−k) (i.e. b(k)) is different from ϕ(k) (i.e. a(k)). In other words, a corresponds to plane waves at positive energy and b to plane waves at negative energy. And in the case of a complex field, the two are not identical, meaning that b†(k) = a(−k) 6= a†(k). On the a(k) and b(k) operators, the ETCR become

† 0 2 ~ ~ 0 † 0 [a(k), a (k )] = 2ωk(2π) δ(k − k ) = [b(k), b (k )] (4.18) and all other commutators vanish. This last point is important, it concerns commutators involving a and a, those involving a† and a† but also those involving a and b. Don’t consider these vanishing commutators as a triviality, otherwise you will face big difficulties when discussing fermions later. Remember also that in these commutators k0 is always equal to ωk. Upon normal ordering, the Hamiltonian becomes Z Z 3  ∗ ∗ 2 ∗  X † † † † H = d x : ∂0φ ∂0φ + ∇~ φ · ∇~ φ + m φ φ := ωk(a a +b b ) = ωk(a(k) a(k)+b(k) b(k)) (4.19) ~k ~k ~k ~k k ~k P (without normal ordering, the vacuum energy would be ~k ωk) and the 3-momentum becomes X Z P~ = ~k(a† a + b† b ) = ~k(a†(k)a(k) + b†(k)b(k)) (4.20) ~k ~k ~k ~k k ~k The total number operator is Z X † † † † N = (a a + b b ) = (a (k)a(k) + b (k)b(k)) = Na + Nb (4.21) ~k ~k ~k ~k k ~k With these operators, we can again construct a Fock space. The novelty is that we have two types of ~ particles, because for each k we have two modes with the same frequency ωk (i.e. with the same mass m). We have two pairs of creation/annihilation operators i.e. a†/a and b†/b. The vacuum is defined as being ~ annihilated by all a~k and all b~k. It has H|vaci = 0, P |vaci = 0 and N|vaci = 0. Because the complex field has twice as many degrees of freedom as the real scalar field, it was expected that there would be twice as many modes. The two species of particles that appear are both quanta of the same field φ and have the same mass as a consequence of the U(1) symmetry. What distinguishes them? Remember that associated to the U(1) global internal symmetry, there is a classically conserved current µ ∗ µ µ ∗ R 3 ∗ J = i(φ ∂ φ−φ∂ φ ). The corresponding classical charge is Q = d x(iφ ∂0φ+c.c.). Upon quantization, it becomes Z 3 † X  † †  X  † †  Q = d x :(iφ ∂0φ + h.c.) := : a a − b b := a a − b b ~k ~k ~k ~k ~k ~k ~k ~k ~k ~k Z † † = (a (k)a(k) − b (k)b(k)) = Na − Nb (4.22) k

69 At this moment charge simply means difference in number of a-type and b-type particles, which is an integer. The vacuum is uncharged Q|vaci = 0 thanks to normal ordering. A single a-type particle has a charge Q|N = 1i = Qa† |vaci = +1|N = 1i, whereas a single b-type particle state has a charge a,~k ~k a,~k Q|N = 1i = Qb† |vaci = −1|N = 1i. Therefore the two types of particles are distinguished by their b,~k ~k b,~k charge being +1 (a-type) or −1 (b-type). Type a is actually called a particle and type b is called an antiparticle. They are distinguished by their charge. And the total charge counts the total number of + charge minus the total number of − charges. Remember that upon gauging the U(1) internal symmetry (following Weyl), we understood that the corresponding charge is actually the electric charge that couples the matter field to the Maxwell gauge field. Therefore, we see the important fact that the electric charge is now quantized (in the sense that it is discrete): Q ∈ Z (and not Q ∈ R). † In summary, a |vaci is a single particle state (known as a particle) of energy H = ωk ≥ m > 0, ~k momentum P~ = ~k, electric charge Q = +1 and spin S~ = 0. And b† |vaci is a single particle state (known as ~k ~ an anti-particle) of energy H = ωk ≥ m > 0, momentum P~ = k, electric charge Q = −1 and spin S~ = 0. Remarks: - Note that contrary to the historical Dirac construction of a filled Fermi sea of negative energy states that can no longer accommodate a single extra particle because of Fermi-Dirac statistics (Pauli exclusion), here we managed to construct antiparticles without invoking Fermi statistics. The particles and antiparticles that we considered are bosons. An antiparticle does not have to be seen as the removal of a negative energy particle (a hole in a filled Fermi sea). Note however that b†(k) = a(−k) could be interpreted as: the creation ~ of an anti-particle of energy ωk and momentum k is equal to the annihilation of a particle of energy −ωk and momentum −~k. - Therefore QFT allowed to make sense of the KG field theory despite its having a dispersion relation with positive and negative energy branches and despite its corresponding to bosons that do not obey an exclusion principle. The total energy in QFT is bounded from below and antiparticles are possible. - In hindsight, a real field has b(k) = a(k) (i.e. anti-particle = particle), the excitation quanta are charge neutral (they carry no electric charge) and there is no U(1) (but a Z2) symmetry. - When doing the Weyl construction (gauging of the U(1) internal symmetry), you remember that we have the freedom to introducing a tunable coupling strength q = −e between the matter and the gauge field. This coupling strength has the meaning of an electric charge unit (e = 1.6 × 10−19 C) and the charge operator becomes

Q = q(Na − Nb) = −e(Na − Nb) (4.23)

[end of lecture #11]

4.3 Dirac field

[see Ryder [3], pages 139-143]

4.3.1 Anticommutators: spinor fields are weird µ We start from the mode expansion of a classical Dirac field obeying the equation (iγ ∂µ − m)ψ(x) = 0. Remember, the four plane wave solutions: u(α)e−ik·x (positive energy) and v(α)eik·x (negative energy) with ~k ~k α = 1, 2 (α is something like a spin index, labelling the two spin projections). The u and v’s are Dirac spinors (not fields). The information they carry is about the internal polarization (the spin). A general mode expansion is Z X  (α) −ik·x ∗ (α) ik·x ψ(x) = bα(k)u e + d (k)v e (4.24) ~k α ~k k α

70 µ ~ R R d3k m 10 where k = (ωk, k) as usual and here ≡ 3 is different then in the case of the scalar fields (but k (2π) ωk it is also Lorentz invariant). This is conventional and is related to the way the u and v’s are normalized. Let ψ become an operator. Its mode expansion is Z X  (α) −ik·x † (α) ik·x ψ(x) = bα(k)u e + d (k)v e (4.25) ~k α ~k k α

¯ † 0 as bα(k) and dα(k) are now also operators. And the Dirac conjugate field operator ψ(x) ≡ ψ (x)γ is Z ¯ X  † (α) ik·x (α) −ik·x ψ(x) = b (k)¯u e + dα(k)¯v e (4.26) α ~k ~k k α whereu ¯ ≡ u†γ0 and similarly forv ¯. ¯ µ ¯ ∂L ¯ 0 † From the Lagrangian L = iψγ ∂µψ − mψψ, we obtain the conjugate field Π = = iψγ = iψ and ∂(∂0ψ) R 3 ˙ R 3 ¯ j R 3 † the Hamiltonian H = d x(Πψ − L) = d xψ(−iγ ∂j + m)ψ = d xψ i∂0ψ (in the last step, we used the Dirac equation)11. Inserting the mode expansion and using the normalization conditions on the u and v’s, we find that (check it, it is not that obvious) Z X † †  H = ωk bα(k)bα(k) − dα(k)dα(k) (4.27) k α with k0 = ωk. Note that we have not performed normal-ordering for the time being and remark the important minus sign in the above expression. We now impose commutation relations (as for the scalar fields):

† 0 3 ωk ~ ~ 0 † 0 [b (k), b 0 (k )] = (2π) δ 0 δ(k − k ) = [d (k), d 0 (k )] (4.28) α α m α,α α α

† † 3 ωk and all other commutators (with b and b, b and b , b and d, etc.) vanish. The strange factors (2π) m come from our normalization convention for the u and v’s (nothing profound). From these relations, we find that the Hamiltonian is Z Z X † †  X 3 (3) ~ H = ωk bα(k)bα(k) − dα(k)dα(k) − d kδ (0)ωk (4.29) k α α

P R 3 (3) ~ P P ωk A first issue is the diverging vacuum energy − α d kδ (0)ωk = − α,~k ωk = −4 ~k 2 (we used that R 3 (3) ~ P P d kδ (k) → ~k δ~0,~0 = ~k). It is negative, it has a fourfold degeneracy (later to be interpreted as spin up/spin down and particle/anti-particle) and comes from the zero-point motion. This could be taken care of by redefining the zero of energy (e.g. via normal ordering) H ≡ H − hvac|H|vaci. The second problem is R P † †  much more serious, it is related to k α ωk bα(k)bα(k) − dα(k)dα(k) being unbounded from below because of the minus sign in front of d†d. There is no lower bound to the energy! This means that we can not define a stable vacuum or groundstate, because we could always lower the energy (and hence it would not be a groundstate) by adding more d-type particles. Therefore, we step back and give up imposing commutation relations. To overcome this problem, we follow Jordan and Wigner (1928) and take a radical step by postulating that anticommutation relations should be used instead:

† 0 3 ωk ~ ~ 0 † 0 {b (k), b 0 (k )} = (2π) δ 0 δ(k − k ) = {d (k), d 0 (k )} (4.30) α α m α,α α α

10 R R d3k 1 For scalar fields the convention was that ≡ 3 . k (2π) 2ωk 11 0 0 Note that we have a “single-particle” Hamiltonian hD = −iγ ~γ · ∇~ + mγ = i∂t (this is the historical Dirac Hamiltonian). We also have a Hamiltonian for the field (or a “many-body” Hamiltonian) H = R d3xH given in terms of the Hamiltonian † density H = ψ (x)hDψ(x).

71 We also explicitly write the anticommutators that vanish because they are weird:

{b, b} = 0 = {b†, b†} = {d, d} = {d†, d†} i.e. (b†)2 = 0 e.g. (4.31) and even weirder {b, d} = 0 = {b†, d} = {b, d†} = {b†, d†} i.e. b†d† = −d†b† e.g. (4.32) Contrary to the case of the scalar field, which was quantized using commutation relations, here all these equations are “quantum” (not just the one explicitly containing δ(~k−~k0)) in the sense that an anticommutator can never be equal to zero in a classical theory (unless one introduces anticommuting numbers, known as Grassmann numbers). Note that b and d-type particles are all excitation quanta of the same field ψ(x) and are not like two species of particles (as e.g. electrons and ). With these anticommutation relations, the Hamiltonian (with the vacuum energy removed) becomes Z X † †  H ≡ H − hvac|H|vaci = ωk bα(k)bα(k) + dα(k)dα(k) , (4.33) k α which is now positive definite (it has a lower bound). Anticommutation relations are required by the positivity of energy H. We will see another motivation for anticommutation relations below. It is related to the excitation quanta (the particles) being fermions, which we admit for the moment. It is also possible to define a normal ordering procedure for fermions. It moves all creation operators to the left of annihilation operators (while preserving the order among either creation operators and also among annihilation operators) and gives a minus sign for each exchange of two (creation or annihilation) operators in the process. For example

† † † 2 † † : b(k1)b (k2) := −b (k2)b(k1) and : b(k1)b(k2)b (k3) := (−1) b (k3)b(k1)b(k2) = b (k3)b(k1)b(k2) (4.34)

Then the procedure of canonical quantization for spin 1/2 fields is: use anticommutators (instead of com- mutators) to quantize the fields and normal-order (with the specific prescription) observables (such as the Hamiltonian, 3-momentum, angular momentum, etc) which are constructed from bilinears in the field oper- ators. For example, the 4-momentum is Z Z Z µ 3 † µ X µ † †  X µ † †  P = d x : ψ i∂ ψ := k : bα(k)bα(k) − dα(k)dα(k) := k bα(k)bα(k) + dα(k)dα(k) k α k α (4.35) µ ~ with k = (ωk, k). As an exercise (see Zee [4], pages 106-107), show the following equal-time anticommutation relations (ETAR): † 0 ~ 0 {ψi(t, ~x), ψj (t, ~x )} = δijδ(~x − ~x ) (4.36) where i, j = 1, 2, 3, 4 here label the quadruplet of components of a Dirac bispinor (not to be confused with i = 1, 2, 3 = x, y, z). Note that Π(x) = iψ(x)† = iψ¯(x)γ0 is the conjugate field and that indeed 0 0 12 {ψi(t, ~x), Πj(t, ~x )} = i~δijδ(~x − ~x ). Other anticommutators are

0 † † 0 {ψi(t, ~x), ψj(t, ~x )} = 0 = {ψi (t, ~x), ψj (t, ~x )} (4.37)

12Think how weird fermions are: at equal-time, the Dirac field should anticommute at all distances! Doesn’t that violate relativity? No. Actually fermionic fields are not observables. Observables are constructed from field bilinears and this makes a huge difference. The field bilinears can be shown to respect the requirements of special relativity even if the fields themselves ~ † 0 seem not to. In addition, upon taking → 0, only the first anticommutator is modified to {ψi(t, ~x), ψj (t, ~x )} = 0. Is this the classical limit of fermionic fields? All equal-time anticommutators (not commutators!) vanish. These are no longer operators but are non commuting numbers. They are anticommuting numbers. The latter are usually called Grassmann numbers or variables, i.e. anticommuting c-numbers.

72 Hints: use the mode expansion, the anticommutation relation between b and b† and also between b and d P (α) (α)† 0 k/+m 0 and then the relation α u (k)u (k) = P+(k)γ = 2m γ and similarly for the v’s. Question: obtain the equation of motion for the Dirac field ψ(x) in several different ways. First from the ˙ δH EL equation. Then from the Hamilton equation ψ = δΠ . And as a third way, from the Heisenberg equation of motion ψ˙ = −i[ψ, H]. And now a tricky question: what about −i{ψ, H}?

4.3.2 Fock space, Fermi-Dirac statistics and the spin-statistics relation

The construction of Fock space starts by defining the vacuum state |vaci as being annihilated by all bα(k) and dα(k) operators. So that H|vaci = 0 after normal ordering. There are four types of single particle † † states bα(k)|vaci and dα(k)|vaci (these will later be interpreted as spin up/down and particle/anti-particle). † † † 2 However, because of the anticommutation relation {bα(k), bα(k)} = 0, one hase [bα(k)] = 0, which means that it is impossible to have two identical (bα-type here) particles in the same mode. This is reminiscent of the Pauli exclusion principle. Note that Fermi-Dirac statistics is more than the mere exclusion principle. It requires having many-body wavefunctions that are antisymmetric under exchange. But the anticommutation † † 0 † † 0 † 0 † relation {bα(k), bα0 (k )} = 0 implies that the two-particle state bα(k)bα0 (k )|vaci = −bα0 (k )bα(k)|vaci is indeed antisymmetric in the exchange of the two particles. This is indeed Fermi-Dirac statistics. We here glimpse at a particular case of a general relation: half-integer spin fields are quantized using anticommutation relations, which leads to Fermi-Dirac statistics. Whereas integer spin fields are quantized with commutation relations, leading to Bose-Einstein statistics. This is the famous spin-statistics theorem, which we only state here. It was proven in the frame of relativistics quantum field theory by Pauli, Fierz, L¨uders,Zumino and others.

4.3.3 U(1) charge and anti-particles From the invariance of the Dirac action under a global internal U(1) transformation, we found a conserved µ ¯ µ µ vector current JV = ψγ ψ such that ∂µJV = 0. Upon gauging the symmetry, we understood that this is R 3 0 R 3 † actually the conservation of electric charge. The conserved charge being Q = d xJV = d xψ (x)ψ(x). Now, when quantizing the Dirac field, the conserved charge becomes Z Z Z 3 † X † †  X † †  Q = d x : ψ (x)ψ(x) := : bα(k)bα(k) + dα(k)dα(k) := bα(k)bα(k) − dα(k)dα(k) (4.38) k α k α

The vacuum is uncharged (thanks to normal ordering) Q|vaci = 0. The charge is quantized Q ∈ Z and counts the number of particles (i.e. b-type particles, carrying a +1 charge) minus the number of anti-particles (i.e. d-type particle, carrying a -1 charge). Actually the electric charge is −eQ where e = 1.6 × 10−19 C is the electric charge unit that plays the role of a coupling strength.

4.3.4 Charge conjugation [see Zee [4], pages 97-98] Charge conjugation is a discrete transformation that exchanges particles and anti-particles. The Dirac µ equation in a U(1) gauge field is (iγ Dµ − m)ψ = 0 with the covariant derivative Dµ = ∂µ + iqAµ, where q = −e < 0 is a property of the Dirac field that measures the coupling strength between the Dirac and the gauge field (it is called the electric charge of the matter field). We now check that if ψ satisfies the Dirac equation µ [iγ (∂µ + iqAµ) − m]ψ = 0, (4.39) we can find a transformed field ψc that satisfies the Dirac equation with the opposite charge. In either the chiral or the standard representation (but not in a general representation), the transformation is

ψ → ψc = −iγ2ψ∗ (4.40)

73 (the phase −i is conventional). It is a transformation that involves complex conjugation (as time reversal). As (γ2)2 = −1, we have that (ψc)c = −iγ2(−iγ2ψ∗)∗ = −iγ2i(−γ2)ψ = (−iγ2)2ψ = ψ, as expected.  0 σ2  Note that (γ2)∗ = −γ2 because γ2 = contains the purely imaginary Pauli matrix σ2 = σ = −σ2 0 y  0 −i  . i 0 Second, we take the complex conjugation of the above Dirac equation to find

µ ∗ ∗ [−i(γ ) (∂µ − iqAµ) − m]ψ = 0 (4.41)

Then we multiply to the left by γ2 and insert the identity in the form (γ2)2 between [...] and ψ∗ to get

2 µ ∗ 2 2 ∗ [−iγ (γ ) γ (∂µ − iqAµ) − m]γ ψ = 0 (4.42)

Using that γ2(γµ)∗γ2 = −γµ, we finally obtain:

µ 2 ∗ [γ (∂µ − iqAµ) − m]γ ψ = 0 (4.43)

µ c µ ∗ c This shows the desired property that [iγ (∂µ − iqAµ) − m]ψ = 0 i.e. (iγ Dµ − m)ψ = 0. The charge conjugate field ψc satisfies the same Dirac equation (same mass) but with an opposite electric charge.

4.3.5 Motivation for anticommutation [see Zee [4], pages 103-108] Here, we would like to show that imposing Fermi-Dirac statistics leads to anticommutators (instead of the Jordan-Wigner way: assuming anticommutators leading to FD statistics). † Let bα|vaci be a single state (a mode) with α (here α is not the same thing as the index α = 1, 2 of the section on the Dirac equation in which it serves to number the spinor components. Here it serves as a short hand notation for something like ~k.). In first quantization this state would be called † † |1 : αi. Now add another fermion in mode β 6= α: bβbα|vaci. This is a two particle state. 1) For two electrons, which are fermions, we would like the wavefunction to be antisymmetric under † † † † exchange. Therefore bβbα|vaci = −bαbβ|vaci. Actually, we would like this to hold for an arbitrary state |χi † † † † † † and not only for the vacuum state. We want bβbα|χi = −bαbβ|χi i.e. {bα, bβ} = 0 and by taking the adjoint we find {bα, bβ} = 0 when α 6= β. As a subcase, we get the Pauli exclusion principle. Indeed, when β = α, there is no such two-particle † 2 † 2 state, i.e. (bα) |vaci = 0 (and also (bα) |χi = 0 for an arbitrary state |χi in Fock space). Therefore † † {bα, bα} = 0. By taking the adjoint, we also have {bα, bα} = 0. † † 2) At this point, we already have {bα, bβ} = 0 = {bα, bβ} for all α and β modes. But we still need † P † to obtain {bα, bβ} 6= 0. We now ask that N ≡ α bαbα be the number operator, which means that † † † [N, bα] = bα. Indeed, the whole construction of Fock space such as |Nα = 1i = bα|0i proceeds from this † † P † † relation. For bosons ([bα, bβ] = δα,β and [bα, bβ] = 0), the relation follows from [N, bβ] = α[bαbα, bβ] = P  † † † †  † α bα[bα, bβ] + [bα, bβ]bα = bβ upon using [AB, C] = A[B,C] + [A, C]B. Let’s try to repeat that for † P † † P  † † † †  fermions: [N, bβ] = α[bαbα, bβ] = α bα{bα, bβ} − {bα, bβ}bα upon using [AB, C] = A{B,C}−{A, C}B. † † † P † † † We already have {bα, bβ} = 0 and therefore [N, bβ] = α bα{bα, bβ}. To obtain the desired relation [N, bα] = † † bα, we therefore require that {bα, bβ} = δα,β, which is what we wanted to show. In the end, we obtain † † † {bα, bβ} = 0 = {bα, bβ} and {bα, bβ} = δα,β (4.44) as a consequence of asking for Fermi-Dirac statistics and the corresponding Fock space.

74 4.3.6 Spin TO BE FINISHED

4.3.7 Causality and locality TO BE FINISHED See J.Y. Ollitrault.

4.4 Propagators and force as exchange of particles

[see Zee [4], chapter I.4] In this section, we won’t have time to go into the details as deeply as we have been up to now (you can consider it as a quick and dirty preview of Feynman diagrams). We would like however to convey the following qualitative ideas. First, the motion of a quantum particle is described by an object called alternatively a propagator or a two-point correlation function or a Green’s function. Second, the interactions between fields are described by vertices (i.e. connections between more than two fields, typically three or four). Third the interaction between two matter particles corresponds to the exchange of a (virtual) gauge boson (carrier particle). The range r of the corresponding force being set by the inverse mass 1/m of the carrier particle. Fourth, the interest into propagator is twofold: (i) they are related to physical observables and (ii) they have simple perturbation theory (unlike most observables) and are therefore things that can be computed in a theory with interactions.

4.4.1 What is a propagator?

1 2 m2 2 We take the simple case of a massive scalar real field φ with Lagrangian L = 2 (∂φ) − 2 φ . There are several equivalent definitions (and corresponding names) for a propagator. We present three equivalent definitions. A first definition is as a two point correlation function (with time ordering):

~ ~ Z d3k e−i(ωkt−k·~x) Z d3k ei(ωkt−k·~x) − iD(x − 0) ≡ hvac|T [φ(t, ~x)φ(0, 0)]|vaci = Θ(t) 3 + Θ(−t) 3 (4.45) (2π) 2ωk (2π) 2ωk where the time-ordering T [...] of a product of operators is defined by T [A(t1)B(t2)] = Θ(t1 −t2)A(t1)B(t2)+ † † Θ(t2−t1)B(t2)A(t1). Here the field is real so that φ = φ but you should think that it is hvac|T [φ(t, ~x)φ (0, 0)]|vaci (first create then destroy, and the is to be read from right to left). It is the probability amplitude for a massive boson to be created at t = 0 in ~x = 0, to propagate and to be destroyed at time t > 0 in ~x. Or, when t < 0, it is the probability amplitude for an anti-particle to be created at time t < 0 in ~x, to propagate and to be destroyed at time t = 0 in ~x = 0. A second definition is as a Green’s function of the KG equation ( + m2)φ = 0. The Green function is the impulse response defined by: ( + m2)D(x, y) = δ(4)(x − y) (4.46) Here, invariance under space-time translations implies that D(x, y) = D(x − y). By Fourier transform, we obtain (−k2 + m2)D(k) = 1. However D(k) = −1/(k2 − m2) is ill defined when k2 = m2, i.e. on the mass ~ 2 2 shell. When the propagator is seen as a function of frequency ω, i.e. D(ω, k) = −1/(ω − ωk), we see that there is a problem when ω = ωk. The prescription to avoid the poles on the real frequency axis is to slightly shift them in the complex plane by introducing a finite imaginary part  → 0+ to the frequency such that −1 −1 −1 D(k) = 2 2 = 2 2 = 2 2 (4.47) k − m + i ω − ωk + i (ω + i sign(ω)) − ωk Note also that the regulator  is here to give an infinite and positive lifetime to the particle. This type of gives the so-called Feynman propagator, which is the one most relevant in field theory

75 (there are other ways of making well-defined propagators such as the retarded propagator or the advanced propagator or etc). The level broadening is proportional to  → 0+ and the lifetime to 1/ → +∞ (indeed e−iωt = e−it(ωk−i) = e−iωkte−t). The salient feature here is that the propagator (in reciprocal space) is defined essentially as the inverse of the dispersion relation. In other words, the poles D(k)−1 = 0 are on the mass-shell. By Fourier transform, we get that the propagator in real space is

Z 4 ik·x Z 3 Z iωt d k e d k −i~k·~x dω e D(x − 0) = − 4 2 2 = − 3 e 2 2 (4.48) (2π) k − m + i (2π) 2π ω − ωk + i where  → 0+ is here to give a sense to the integral (otherwise it is meaningless as there are poles on the real frequency axis). The frequency integral can be done using the residue theorem by closing the real axis by a semi-circle (of infinite radius) either in the upper or in the lower plane depending on the sign of t. Each closed contour captures a single pole: either ω = ωk − i or ω = −ωk + i. In the end, we find that iωt R dω e i −iωk|t| 2 2 = − e and we recover the first expression of the propagator as a two point correlation 2π ω −ωk+i 2ωk function (4.45). When studying path integral quantization, you will learn that the propagator is also the inverse of the quadratic kernel in the Lagrangian. This being a third definition. It is actually closely related to the second 1  2 as the Lagrangian for the massive scalar field can be written (after integration by part) as L = − 2 φ( +m )φ. Which shows that the kernel of the quadratic term is  + m2. Physically, the propagator describes the probability amplitude for a disturbance in the field to propagate from the origin to x. Inside the light-cone, the propagator oscillates. Whereas outside the lightcone it leaks exponentially (a kind of quantum mechanical tunnel effect). Why do we work with propagators? What is their interest? Why not work directly with the physical observables, the measurable quantities? It is because the observables are easily obtained from the Green’s function13. But in addition the latter is much easier to compute in perturbation theory in a systematic way. Perturbation theory in the coupling constant is made systematic by drawing Feynman diagrams. Each diagram corresponds to a precise mathematical expression. The more vertices, the higher the order in perturbation theory and the smaller the contribution of the diagram (if perturbation theory converges). Feynman diagrams are nothing more than a systematic and economical way of writing perturbation theory. Perturbation theory is not exact and is not all there is to field theory.

4.4.2 Force as an exchange of virtual particles A hint at interactions: vertices, particle exchange as mediating forces. Main message: source particles (fermions spin 1/2, also called matter particles) and force particles (gauge bosons of spin 1 mainly, also called force carrying particles). Exchange of a gauge boson is responsible for forces. The range r of the interaction is set by the mass m of the mediating particle: ~ 1 r ∼ = (4.49) mc m

Example of the Yukawa theory of pion exchange between nucleons. Range r ∼ 1 fm and mπ ∼ 0.1 GeV. Potential energy (obtained in the static limit) is ∼ e−m|~x|/|~x| Example of the Coulomb force as an exchange of photon between electrons. Range r → ∞ as m = 0. Potential energy (obtained in the static limit) is ∼ 1/|~x| and the corresponding force is 1/~x2 (the celebrated Coulomb law!).

13An example from standard quantum mechanics: the retarded Green function is G = 1/(ω − H + i), where H is the hamiltonian operator. We call Eα the eigenvalues and |αi the eigenvectors of H. The imaginary part of the trace of the Green 1 1 function gives the density of states. Using the mathematical identity x+i = P x − iπδ(x), where P denotes the principal part, P 1 show that the density of states ρ(ω) = α δ(ω − Eα) – an observable – is equal to − π ImTrG.

76 Figure 4.1: Scattering of an electron onto another electron due to the exchange of a single virtual photon as represented by a . The straight lines with arrows are electron propagators, the wavy line is a photon propagator and there are two vertices. Each vertex connects two electron lines (one incoming, one outgoing) and a photon line. A vertex is proportional to the electric charge e (i.e. the coupling constant of the electron field and of the electromagnetic field).

4.4.3 A first glance at Feynman diagrams Here we do a crash course on Feynman diagrams. A Feynman diagram corresponds to a precise mathematical expression involving propagators (represented as lines) and vertices (connections between the lines). But it may be interpreted as a movie-like description of elementary processes between particles. As an example, consider an electron propagating freely and then emitting a photon while changing direction. Later another electron absorbs the photon and changes direction as a result of the momentum kick. Seen from far away, such a process looks like the scattering of an electron on another electron. Or the collision of an electron on an electron due to their mutual repulsion (the Coulomb force). The diagram shown is the lowest order one for electron-electron scattering in perturbation theory. It involves two vertices and is therefore of order e2 = α = 1/137 (the fine structure constantα = e2/(~c)). The next order involves diagrams with four vertices (try to draw one) and is proportional to e4 = α2 = (1/137)2.

[end of lecture #12]

77 Chapter 5

Spontaneous symmetry breaking

[see Ryder [3], chapter 8, pages 290-307] In this chapter, we discuss spontaneous symmetry breaking (SSB). We are essentially considering classical (not quantum) field theory in the presence of self-interactions (i.e. of a non-linearity).

5.1 Introduction: SSB and phase transitions

Remember that by a symmetry, we mean a symmetry of the laws of physics and that we carefully distinguish it from the symmetry of a state of a system. This distinction will prove crucial in the following. The state of a system need not show the full symmetry of the laws: it may be less symmetric. We consider a concrete example from condensed matter physics. Take a 3D classical Heisenberg ferromagnet with Hamiltonian P ~ ~ ~ H = −J hi,ji Si · Sj. There is a classical micro-magnet (not to call it a spin) Sj with unit norm on each site j of an infinite 3D cubic lattice, for example. The magnets interact ferromagnetically with their nearest neighbors via an exchange interaction J > 0. The Hamiltonian is invariant under the SO(3) rotation group as it involves the scalar product of two 3-vectors (the micro-magnets). However, the groundstate (the state that minimizes the total energy H, which is reached at zero temper- ature) of the system is not invariant under all rotations. In the groundstate, all the magnets are aligned in the same direction. This is the state of minimum energy E = −JNz/2, where N is the number of sites in the lattice and z = 6 is the number of nearest neighbor sites (coordination number of the lattice). The ground- state is characterized by a non-zero magnetization ~m = hS~ji (it does not depend on the lattice site, thanks to translational invariance). The magnetization is known as the order parameter. But the groundstate is not unique as this common direction is arbitrary. There are as many groundstates as points on a sphere S2. The groundstate is therefore highly degenerate (there is actually a continuum of groundstates). A single groundstate is still invariant under a subgroup of SO(3): it is invariant with respect to rotations around the direction of the magnetization ~m. It therefore still admits SO(2) as a symmetry group. Transformations that belong to SO(3) but not SO(2) form a coset (i.e. a subset which is not a group) called SO(3)/SO(2). A rotation belonging to SO(3)/SO(2) transforms one groundstate into another one within the degeneracy manifold. At high temperature, the magnetization is zero and the state (in the sense of equilibrium statistical mechanics: we take an ensemble average over a thermal equilibrium ensemble) of the system has the full rotation symmetry. The system is said to be in the paramagnetic phase. Upon decreasing the temperature, there is a critical temperature Tc below which the system spontaneously acquires a finite magnetization in a specific direction. This direction is chosen at random. The rotation symmetry is spontaneously broken from SO(3) to SO(2). The high temperature phase is called paramagnetic and it is disordered and more symmetric. The low temperature phase is called ferromagnetic and it is ordered and less symmetric. Decreasing the temperature from Tc to 0, the direction of the magnetization does not change but its norm varies continuously from 0 to 1. The idea that a second order (or critical) phase transition is related to the spontaneous breaking

78 of a symmetry is due to Landau (1937). Note that not all phase transitions are of this sort (two counter- examples: the liquid-gas transition is first order in general and does not correspond to a change of symmetry; the Kosterlitz-Thouless transition of the 2D classical XY model is not due to SSB but to the unbinding of topological defects called vortices). In addition, the spontaneous symmetry breaking of SO(3) is accompanied by the appearance of gapless (or soft or massless) modes above the broken symmetry groundstate. These modes do not exist in the high-temperature symmetric phase. They only exist in the low temperature symmetry broken phase. They are gapless in the sense that their frequency ω goes to zero when the wavevector k → 0. In the Heisenberg ferromagnet, they are called spin waves. In general, they are known as Goldstone modes. Remarks: - the groundstate is often called the vacuum state or simply the vacuum - at a SSB phase transition, the symmetric vacuum becomes unstable - SSB refines the notion of vacuum. The main message is that the vacuum need not posses the full symmetry of the model (i.e. the symmetry of the action or of the Hamiltonian depending on the context). - in the above example, the temperature plays the role of the control parameter for the phase transition. The idea to transpose the concepts of SSB and phase transitions from condensed matter to high-energy physics is due to Nambu around 1960. He was inspired by the BCS theory of superconductivity (1957).

5.2 Goldstone mechanism: SSB of a global symmetry

We now turn to relativistic field theory and consider a complex scalar field with global U(1) phase symmetry and self-interaction. The Lagrangian of the Goldstone model is:

∗ µ ∗ ∗ µ 2 ∗ ∗ 2 L = (∂µφ) (∂ φ) − V (φ , φ) = (∂µφ) (∂ φ) − m φ φ − λ(φ φ) (5.1)

This corresponds to an interacting theory due to the quartic term |φ|4 which produces a non-linearity in the equation of motion (see below). The U(1) symmetry is global, internal and continuous. We take λ > 0 (corresponding to repulsive interactions). However, we assume that the parameter m2 (sometimes called µ) can be positive or negative (think that m2 is the name of a real parameter, not the square of a real mass). Making contact with the previous example, m2 is actually the control parameter for the phase transition 4 and usually taken to be proportional to T − Tc. This model is also known as the relativistic φ theory (even in the case where φ is complex, in which case it is actually |φ|4 and not φ4 that appears in the Lagrangian). In the non-relativistic limit, it is essentially a Landau-Ginzburg theory frequently used to describe a phase transition.

5.2.1 Classical groundstate We first find the classical groundstate. The Hamiltonian is Z   Z   H = d3x φ˙∗φ˙ + ∇~ φ∗ · ∇~ φ + V (φ∗, φ) = d3x |φ˙|2 + |∇~ φ|2 + V (φ∗, φ) (5.2)

As the kinetic and elastic energies are squares, the energy is minimized by taking a uniform and time- independent field φ(x) = φ = constant ∈ C. Therefore H = Vol. × V (φ∗, φ). The classical groundstate field should in addition minimize the potential energy (density) V (|φ|) = m2|φ|2 + λ|φ|4. The potential V is a function of φ∗ and φ = |φ|eiθ but it only depends on the modulus |φ| and not on the phase θ of the field. This function is plotted in Fig. 5.1 for m2 > 0 on the left and for m2 < 0 on the right. Its first and second derivatives with respect to the modulus are dV/d|φ| = 2|φ|(m2 + λ|φ2|) and d2V/d|φ|2 = 2m2 + 12λ|φ|2. If 2 2 2 2 m > 0, the first derivative vanishes when |φ| = 0 and d V/d|φ| |φ|=0 = 2m > 0: there is a unique stable 2 2 2 vacuum at φ0(x) = 0. However, if m < 0, the first derivative vanishes at |φ| = 0 and at |φ| = −m /(2λ) > 0. The second derivative is equal to 2m2 < 0 at |φ| = 0 and −4m2 > 0 at |φ|2 = −m2/(2λ). In the end, the extremum at |φ| = 0 is a local maximum and represents an unstable symmetric vacuum; whereas that at

79 m2 2 λ 4 Figure 5.1: Potential V = 2 |φ| + 4 |φ| as a function of the complex field φ = Reφ+iImφ. Left corresponds 2 2 to m > 0 (symmetric phase, unique stable vacuum at φ0 = 0) and right to m < 0 (symmetry broken phase, q −m2 iθ mexican hat potential, highly degenerate vacuum at φ0 = 2λ e with arbitrary θ).

|φ| = p−m2/(2λ) is a global minimum corresponding to a stable vacuum. It actually corresponds to a p 2 iθ gutter and is a highly degenerate vacuum. Indeed φ0(x) = −m /(2λ)e is a groundstate for any phase θ. Each φ0 for a given θ corresponds to a stable symmetry-broken groundstate. There are as many stable symmetry broken vacua as angles between 0 and 2π. If the parameter m2 is varied (by an external control parameter) from positive to negative, the point at which it changes sign (m2 = 0) is called a phase transition and corresponds to SSB.

5.2.2 Excitations above the groundstate In order to analyse the linear excitations above the groundstate, we will use the following idea: linearizing the equation of motion is equivalent to quadratizing the Lagrangian and leads to an effective low-energy (meaning close to the groundstate) description of the system in terms of non-interacting excitations. We now consider the linearized excitations close to the vacuum. The EL equation of motion is

( + m2)φ = −2λ|φ|2φ and ( + m2)φ∗ = −2λ|φ|2φ∗ (5.3)

2 which is non-linear. When m > 0, we expand close to the stable vacuum φ(x) = φ0 + δφ(x) = δφ(x) and retain only lowest order terms in the fluctuation δφ(x). We find that ( + m2)δφ = −2λ|δφ|2δφ ≈ 0 and similarly for δφ∗. This is the familiar massive KG equation. We therefore find two massive excitations ∗ p~ 2 2 (δφ and√ δφ or Reδφ and Imδφ) with dispersion relation ω = ωk =  k + m . We can now interpret m = m2 as a mass (or a gap). 2 p 2 iθ0 When m < 0, we expand around one of the stable vacua φ0 = −m /(2λ)e (we choose θ0 = 0) iθ iθ(x) p 2 and use polar coordinates φ = ρe . The field is φ(x) = [ρ0 + δρ(x)]e with ρ0 = −m /(2λ) and θ(x) = θ0 + δθ(x) = 0 + θ(x). We insert this field in the Lagrangian and obtain (check it) at quadratic order:

m4 L = + (∂ δρ)(∂µδρ) − 4λρ2(δρ)2 + ρ2(∂ θ)(∂µθ) + O(δρ3, θ2δρ) (5.4) 4λ µ 0 0 µ

m4 The first term is a positive constant equal to minus the vacuum energy E0 = − 4λ ×Vol. The second and third term describe a real massive amplitude mode δρ and the fourth term describes a real massless phase mode θ (it is massless due to the absence of θ2 term). The omitted terms describe either self interaction of the amplitude 3 4 mode (terms like ρ0(δρ) and (δρ) ) or interaction between the amplitude and the phase modes (terms like q 2 µ ~ 2 2 p~ 2 2 (δρ) ∂µθ∂ θ). The dispersion relation for the amplitude mode δρ is ω =  k + 4ρ λ =  k − 2m , the √ √ 0 2 ~ mass being mH = 2ρ0 λ = −2m > 0. For the phase mode θ, the dispersion relation is ω = |k|, which

80 Figure 5.2: Mexican hat potential corresponding to the symmetry broken phase of the Goldstone model. A groundstate is any point in the gutter. Above such a groundstate, there is a massive amplitude mode (radial, Higgs) that climbs the potential (restoring force) and a massless phase mode (azimuthal, Goldstone) that feels no restoring force. Taken from M. Endres et al., Nature 487, 454 (2012).

indeed corresponds to a vanishing mass mG = 0. The amplitude mode – a.k.a. the Higgs mode – corresponds to climbing the potential at fixed phase θ from the minimum. That the potential is not flat means that there is a restoring force. The phase mode – a.k.a. the Goldstone mode – corresponds to moving along the gutter, i.e. the manifold of degenerate vacua. The flatness of the potential along the gutter means that there is no restoring force in this direction. See Fig. 5.2. In short, the Higgs amplitude mode corresponds to climbing up the wall and the Goldstone phase mode to rolling along the gutter. Actually, you have to keep in mind that the above figure is drawn at a given position in space and at a given time. To really understand the dispersion relation you should think of such a picture at each point of space and at each time (very many Mexican hats!). The statement that the mode is massless in the limit of infinite wavelength (or zero wavevector) means that one can imagine a spatial excitation above a broken symmetry vacuum that costs almost zero energy: the position of the groundstate in the gutter of the Mexican hat changes very slowly in space and time. Such an excitation has no potential energy cost (it lies in the gutter, which is the potential minimum) and only cost a small amount of elastic energy (because of the slow spatial variation). The latter vanish in the limit of a infinite wavelength. There is a general theorem by Goldstone (1960-1961) stating that the spontaneous symmetry breaking of a continuous symmetry implies the existence of massless excitations. The corresponding particles (upon quantization) are usually called Nambu-Goldstone particles. In the above example, the U(1) symmetry is 2 spontaneously broken. When m > 0, there are two massive modes with√ mass m > 0. After the SSB, when 2 2 m < 0, there is one massive amplitude mode (with mass mH = −2m ) – the Higgs mode – and one massless phase mode (mG = 0) – the Goldstone mode.

5.2.3 What is the number of Goldstone modes? Imagine a symmetry group G that is spontaneously broken to a subgroup H (not to be confused with a Hamiltonian). G is the symmetry group of the action and H is the symmetry group of one of the vacuum states. Elements of H leave the vacuum unchanged and therefore do not produce soft modes. Elements of the coset G/H (not a group) change one vacuum into another one within the degeneracy manifold. By making such a transformation slightly inhomogeneous, one produces soft modes. The order of G is the number of its generators. The total number of modes is order(G). The number of massive modes is order(H). The number of massless modes (Goldstone modes) due to the SSB of a continuous symmetry is dim (G/H) = order(G) − order(H).

81 Example of the ferromagnet: G = SO(3) and H = SO(2). The total number of modes is 3, the number of massive modes is 1 and the number of massless modes (spin waves) is 2. This is due to the two possible transverse polarizations of the spin wave. Remark (thanks to Nicolas Dupuis): the number of Goldstone modes actually depends on the dynamical part of the Lagrangian and not just on the potential V (φ, φ∗) and on the above group theoretic argument. Contrast, for example, the case of the Goldstone model in the relativistic and in the non-relativistic limits of the Goldstone model.

5.2.4 SSB of a discrete symmetry and quantum tunneling Unlike the case of a continuous symmetry, SSB of a discrete symmetry does not imply the existence of a massless excitation. Consider a real scalar field φ with Z2 symmetry (i.e. symmetry under the transformation 1 µ 2 2 4 2 2 4 φ → −φ). The Lagrangian is L = 2 (∂µφ)(∂ φ) − m φ − λφ and the potential V (φ) = m φ + λφ . When m2 > 0, the potential has essentially the shape of a parabola and there is a unique vacuum at φ = 0 and a corresponding single massive mode (see Fig. 5.3). When m2 < 0, the potential V (φ) = m2φ2 + λφ4 is a double-well potential (not a mexican hat, see Fig. 5.3). It has two stable groundstates (so there is degeneracy but not a continuous degeneracy). Around each groundstate, there is a single massive mode. Note the absence of gutter and the absence of the corresponding massless phase mode. The Goldstone theorem does not apply to the case of a discrete symmetry1.

V 0.4

0.3

0.2

0.1

Φ -1.0 -0.5 0.5 1.0

-0.1

-0.2

-0.3

Figure 5.3: Potential V = m2φ2 +λφ4 as a function of the real field φ. Red corresponds to m2 > 0 (symmetric phase, unique stable vacuum at φ = 0) and blue to m2 < 0 (symmetry broken phase, double-well potential, q −m2 twofold degenerate stable vacuum at φ =  2λ ).

Upon quantizing the theory, there is a twist to SSB. Let us simply mention that quantum mechanically, it is possible to tunnel from one broken symmetry vacuum to the another (this is quantum mechanical tunneling). Therefore the true quantum mechanical groundstate is neither of the two classical groundstates but is a linear superposition of the two. Think of the standard problem of the ammonia molecule NH3 that is described as a double well problem but treated as a two level system, initially degenerate, and the degeneracy of which is lifted by a coupling due to tunneling (see for example Feynman III on quantum mechanics). However, if the system has an infinite number of degrees of freedom (which is the case of a field theory), it can be shown that the tunneling is suppressed by an extensitivity effect2 and SSB is also possible in the quantum world. More precisely SSB, which is classically possible both in mechanics and in field theory, is no

1Another case in which the Goldstone theorem does not apply is when the symmetry is local – i.e. a gauge symmetry – rather than global (see the section on the Higgs mechanism). 2Basically, tunneling should occur simultaneously for an infinite number of degrees of freedom, which makes its probability vanishingly small. It is roughly a tunneling probability for s single dof (i.e. a positive number < 1) to a power given by the number of degrees of freedom.

82 longer possible in quantum mechanics (i.e. with a finite number of degrees of freedom) but is again possible in quantum field theory. The interested reader should consult Maggiore [2] pages 253-256.

5.3 Higgs mechanism: SSB of a gauge symmetry

If we think that in the above model the U(1) symmetry is fundamental and exact, then following Weyl’s idea we should gauge it (i.e. make it local). What about SSB of a gauge symmetry? We start from a complex scalar field (self-interacting) coupled to a U(1) gauge field: 1 L = (D φ)∗(Dµφ) − V (φ∗, φ) − F F µν (5.5) µ 4 µν where Dµ = ∂µ + ieAµ. This Lagrangian is Lorentz invariant and is also invariant under a local U(1) symmetry φ(x) → eiΛ(x)φ(x), Aµ(x) → Aµ(x) − ∂µΛ/e such that Dµφ(x) → eiΛ(x)Dµφ(x). 2 µ Remember that gauge invariance forbids a mass term in the Lagrangian for the gauge field (mAAµA ). Gauge theories produce massless gauge bosons as mediators of the forces. These propagate over infinite distances. In Nature, there are long ranged forces (electromagnetism and gravitation) but there also short −18 −15 3 ranged ones (weak interactions 1/mZ,W ∼ 10 m and strong interactions 1/mπ ∼ 10 m ). Are gauge theories useless to describe the latter? Also SSB of a continuous global symmetry produces massless Gold- stone bosons, which are not observed as fundamental particles (there are no known massless scalar bosons). The Goldstone mechanism therefore also seems to be useless. Actually, the two problems (gauge theories producing massless vector bosons and Goldstone mechanism producing massless scalar bosons) will be shown to cancel each other. The SSB of a continuous gauge symmetry removes massless particles: this is the Higgs mechanism that we now discuss. Let us rewrite the Lagrangian as 1 L = (∂ φ)∗(∂µφ) − m2φ∗φ − λ(φ∗φ)2 − F F µν + ieA φ∂µφ∗ − ieA φ∗∂µφ + e2A Aµφ∗φ (5.6) µ 4 µν µ µ µ The first three terms are the Goldstone model of a self-interacting (φ4) relativistic complex massive and scalar field; the fourth term is the free theory of a massless U(1) gauge field (Maxwell action). The last three terms describe the coupling between the complex scalar field and the gauge field and e is the coupling strength. Note in particular the last term which is proportional to A2 and therefore looks like a mass term for the vector field. 2 When e = 0 (no coupling) and m > 0, there are 2 massive reals scalar modes (φ1 = Reφ and φ2 = Imφ) and 2 (and not 4) massless gauge modes corresponding to the 2 polarizations of the massless photon. When e 6= 0, we can use the gauge freedom to get rid of the phase mode of the matter field. Indeed, let φ(x) = ρ(x)eiθ(x) be written in amplitude ρ and phase θ variables. The Lagrangian reads 1 L = (∂ ρ)(∂µρ) − m2ρ2 − λρ4 − F F µν + ρ2(∂ θ + eA )(∂µθ + eAµ) (5.7) µ 4 µν µ µ Now, choosing a gauge, A0µ ≡ Aµ +∂µθ/e is also a valid gauge field (it gives the same field strength F 0 = F ). The Lagrangian therefore reads 1 L = (∂ ρ)(∂µρ) − m2ρ2 − λρ4 − F 0 F 0µν + e2ρ2A0 A0µ (5.8) µ 4 µν µ From now on, we work with this gauge choice and drop the primes. The phase field has disappeared! It is no longer in the scalar sector but has been put in the gauge sector as a consequence of our gauge choice. Note

3Gluons – the gauge vector bosons of the strong interaction in QCD are massless but due to the very strong interaction and the mechanism of color confinement (inspired by the Schwinger model), the effective low energy carriers of the strong interaction are not simple gluons but composite object known as mesons, which are colorless and therefore not confined. The mesons have a finite mass and carry the effective nuclear interaction over a finite distance. The effective interaction potential is given by the Yukawa potential.

83 also that, within our gauge choice, we have transformed the complex scalar field into a real positive ρ field. This has neither a phase nor even a discrete symmetry. Also note that writing the field in amplitude and phase variables is only meaningful if the average of the field is non-vanishing, i.e. if there is a spontaneous symmetry breaking, which we now discuss. Spontaneous symmetry breaking4 means m2 < 0 in V (ρ) = m2ρ2 + λρ4, where ρ is positive. There is a p 2 local maximum in ρ = 0 (unstable) and a minimum in ρ = −m /(2λ) = ρ0 (stable). Expanding around the minimum ρ(x) = ρ0 + δρ(x), we find (to quadratic order and up to a constant related to the vacuum energy) that 1 L = (∂δρ)2 − 4λρ2(δρ)2 − F 2 + e2ρ2A2 + O(δρ3, δρA2) (5.9) 0 4 0 √ √ 2 The first two terms show that there is a massive amplitude mode δρ of mass mH = 2ρ0 λ = −2m , which already existed in the Goldstone model with the global U(1) symmetry (this is the Higgs mode). In the gauge sector, we have to understand how many modes are present. Look at the third and fourth terms in the above Lagrangian. There is now a mass term for the vector field, so that the Lagrangian is that of Proca (rather than Maxwell) 1 m2 L = − F 2 + A A2 (5.10) P 4 2 √ with mA = 2eρ0. It describes a massive vector field. Such a Lagrangian is not gauge invariant. But here, we did not put the mass by hand: the mass was generated dynamically by the SSB. In addition, we actually work in a specific gauge and there is therefore no reason to expect that there still is a gauge freedom. The complete theory is still gauge invariant, only the vacuum state appears to break the symmetry. The µν 2 ν EL equations of motion for the Proca field is ∂µF + mAA = 0 (show it). Taking its gradient, we get µν 2 ν 2 ν µν µ ∂ν ∂µF + mA∂ν A = mA∂ν A = 0 because ∂ν ∂µF = 0. Therefore ∂µA = 0 is obtained automatically (it is not a choice, not the Lorenz gauge condition that we chose to impose in the case of the Maxwell field). This condition implies that we have reduced the number of independent components of the vector field Aµ 5 µν µ ν ν µ µ µν 2 ν from 4 to 3 . In addition inserting F = ∂ A − ∂ A and ∂µA = 0 in ∂µF + mAA = 0, we find that ( + m2 )Aµ = 0. This is the massive KG equation for each of the components of the vector field. It A q ~ 2 2 corresponds to a dispersion relation ω =  k + mA. Therefore mA is indeed the mass of the vector field. It is sometimes said that in the Higgs mechanism, the gauge becomes massive upon eating the would-be massless . In addition, there remains a massive scalar boson known as the . Summary: a) When m2 > 0 and whether e = 0 or e 6= 0 (indeed upon quadratizing the Lagrangian around the symmetry unbroken groundstate, the terms proportional to e are omitted), there are 2 massive scalar real fields with mass m and 2 massless gauge fields with mass mA = 0. There are 4 modes√ in total.√ 2 2 b) When m < 0 and e = 0, there is 1 massive amplitude field with mass mH = −2m = 2 λρ0 (Higgs mode), 1 massless phase field (Goldstone mode, mG = 0) and 2 massless gauge fields (i.e. one massless spin 1 field, mA = 0). In both a) and b) cases, the matter and gauge fields are essentially decoupled. √ 2 2 c) When m < 0 and e 6= 0, there is 1 massive amplitude mode (Higgs√ mode, mH = −2m ) and 3 massive gauge fields (i.e. one massive spin 1 field) with mass mA = 2eρ0 (described by the Proca theory). The latter contains 2 transverse components and 1 longitudinal one which is of mixed (matter- gauge) character. In all cases, there are 4 modes in total. The number of degrees of freedom is conserved.

4At this point, it is good to realize that SSB actually just means that the scalar field acquires a non-zero expectation value, i.e. that its groundstate is at finite ρ > 0 and not at ρ = 0. 5In the case of the massive vector field described by the Proca Lagrangian, there is no gauge invariance. Therefore, we can not play with the gauge transformation. The only way to get rid of some of the components of the vector field is to realize that µ ∂µA = 0 is automatically satisfied. Considering a plane wave excitation in its rest frame shows that A0 = 0 and that only three components out of four remain.

84 Remarks: - The Higgs mechanism has a complicated story and many inventors. P.W. Anderson was a pioneer (1963). Then Brout and Englert, Higgs and finally Hagen, Guralnik and Kibble (all in 1964). - Elitzur’s theorem and the pedantic point about the fact that a gauge symmetry can never be spontaneously broken as it is not a true symmetry but a mere redundancy in our description of a system. - The Higgs mechanism that we presented above concerns an abelian gauge group (namely U(1)) and is known as the abelian Higgs mechanism. The Higgs mechanism is actually even more important in the case of a non-abelian gauge group. It is a central ingredient of the electro-weak theory (Glashow, Weinberg and Salam), which is a gauge theory based on the non-abelian group SU(2)L × U(1)Y . This group has 4 generators and therefore 4 massless gauge bosons. Upon SSB of this gauge symmetry down to U(1)Q (where Q denotes the electric charge), 3 of the gauge bosons acquire a mass (these are the Z and W , with mass ∼ 90 GeV) and 1 remains massless (the photon). There is also a massive Higgs boson, which was recently discovered (2012, with a mass ∼ 125 GeV). - In the chart of fundamental particles there are three families of particles: the matter particles (spin 1/2, fermions), the force carriers (spin 1, gauge bosons) and the Higgs boson (massive spin 0). The Higgs boson really stands apart. Note also that it is supposed to be self-interacting and to spontaneously acquire a non-zero . The Higgs field is actually thought to couple to all other particles (not only gauge bosons as we have studied but also fermions and to give them a mass upon acquiring a non-zero vacuum expectation value). The mechanism of fermionic via SSB is actually quite easy. The hard task was to understand how to generate mass for the gauge bosons, which is precisely what the Higgs mechanism does.

5.3.1 Superconductivity and the Meissner effect For an example of the abelian Higgs mechanism, we turn to non-relativistic condensed matter physics. This was actually understood by Anderson in 1963 and is known as the Anderson-Higgs mechanism in this context. It concerns the superconductivity of a metal. In the Ginzburg-Landau description, there is a (non- relativistic) complex matter field φ(x), which is the order parameter field for the superconducting transition (it can be thought of as the wavefunction for the condensate of Cooper pairs, each carrying a charge of −2e, and it is often written ∆(x)). This charged matter field is coupled to a gauge field Aµ(x), which is the electromagnetic field. The control parameter for the phase transition is the temperature. In other words 2 the parameter m changes sign as a function of T − Tc where Tc is the critical temperature. In the ordered phase (superconducting phase, symmetry-broken phase), there are three massive photon modes (two massive transverse photon modes, with the mass given by the plasma frequency and a longitudinal mode which is the plasma oscillation or plasmon). The electromagnetic field can only enter the superconductor over a finite distance given by the inverse of the photon mass and known as the London penetration length. This is at the source of the Meissner effect. There is also a massive amplitude (Higgs) mode. The latter corresponds to vibrations of |∆(x)|. This is currently being measured in some superconductors. A neutral superfluid (corresponding to e = 0) can in some special cases6 be mapped onto an effective relativistic Goldstone model. Then after SSB, there is a gapless phase mode (Goldstone mode, known as a Bogolubov phonon in the present context) and a gapped amplitude mode (Higgs-like). However, because of the absence of coupling to a gauge field, this corresponds to the Goldstone mechanism and not to the Higgs mechanism. For more details see the reference given in the caption of Figure 5.2.

[end of lecture #13]

6Namely, for a repulsive Bose gas in an optical lattice. There is a special point in the phase diagram (at the tip of a Mott lobe of the superfluid to Mott insulator transition), where relativistic invariance is emergent.

85 Bibliography

[1] B. Delamotte, Un soup¸ccon de th´eorie des groupes: groupe des rotations et groupe de Poincar´e, http://cel.archives-ouvertes.fr/cel-00092924 (1996).

[2] M. Maggiore, A modern introduction to quantum field theory (Oxford university press, 2005). [3] L.H. Ryder, Quantum field theory (Cambridge university press, first edition, 1985). [4] A. Zee, Quantum field theory in a nutshell ( press, first edition, 2003).

[5] A. Zee, Fearful symmetry: the search for beauty in modern physics (Princeton university press, 1986). [6] A. Altland and B. Simons, Condensed matter field theory (Cambridge university press, second edition, 2010). [7] J. Sethna, Order Parameters, Broken Symmetry, and Topology, 1991 Lectures in Complex Systems, Eds. L. Nagel and D. Stein, Santa Fe Institute Studies in the Sciences of Complexity, Proc. Vol. XV, Addison- Wesley, 1992, available online http://www.lassp.cornell.edu/sethna/pubPDF/OrderParameters.pdf [8] M. Nakahara, Geometry, topology and physics (IOP Publishing, 2003). [9] https://en.wikipedia.org/wiki/Homotopy groups of spheres [10] L.C. Epstein, Relativity visualized (Insight press, 1981).

[11] Feynman, Leighton, Sands, The Feynman lectures on physics, volume 1, chapters 15 to 17. Available on line at http://www.feynmanlectures.caltech.edu/. [12] M. Boratav and R. Kerner, Relativit´e (Ellipses, 1991). [13] S. Weinberg, The quantum theory of fields (Cambridge university press, 1995), volume 1.

[14] L. Landau, E.M. Lifshitz, Mechanics, volume 1 of the course in (Pergamon Press, Second edition, 1969). [15] http://en.wikipedia.org/wiki/Lorentz-Heaviside units

86