<<

An Introduction to QED & QCD

N.J. Evans

Department of and Astronomy University of Southampton Southampton SO17 1BJ

Lectures presented at the School for High Physicists, September 2008. 1 Introduction

The aim of this course is to teach you how to calculate transition amplitudes, cross sections and decay rates, for elementary particles in the highly successful theories of (QED) and (QCD). Most of our work will be in understanding how to compute in QED. By the end of the course you should be able to go from a , such as the one for e−e− µ−µ− in → figure 5, to a number for the . To do this we will have to learn how to cope with relativistic, quantum, particles and anti-particles that carry . In fact all these properties of particles will emerge rather neatly from thinking about relativistic . The rules for calculating in QCD are slightly more complicated that in QED, as we will briefly review, however, the basic techniques for the calculation are very similar. We have a lot to cover so will necessarily have to take some short cuts. Our main fudge will be to work in relativistic quantum mechanics rather than the full Quantum Theory (QFT) (sometimes referred to as ‘second ’). We will be in good company though since we will largely follow methods from Feynman’s papers and text books such as Halzen and Martin. In quantum mechanics a classical wave is used to describe a particle whose motion is subject to the Uncertainty Principle. In a full QFT the wave’s motion itself is subject to the Uncertainty Principle too - the quanta of that field are what we then refer to as particles. Luckily at lowest order in a calculation one neglects the quantum nature of the field and the two theories give the same answer. At higher orders the quantum nature of the field gives rise to virtual pair creation of particles - in the quantum mechanics version of the story these are included in a more ad hoc fashion as we will see. Luckily the simultaneous QFT course will give you a good grounding in more precise methodologies. Thus our starting point will be ordinary Quantum Mechanics and our first goal (section 2) will be to write down a ‘relativistic version’ of Quantum Mechanics. This will lead us to look at relativistic wave equations, in particular the , which describes particles with spin 1/2. We will also develop a wave equation for and look at how they couple to our (section 3) - this is the core of QED. A perturbation theory analysis will result in quantum mechanical probability amplitudes for particular processes. After this, we will work out how to go from the probability amplitudes to cross sections and decay rates (section 4). We will look at some examples of tree level QED processes. Here you will get hands-on experience of calculating transition amplitudes and getting from them to cross sections (section 5). We will restrict ourselves to calculations at tree level but, at the end of the course (section 6), we will also take a first look at higher order loop effects, which, amongst other things, are responsible for the running of the couplings. For QCD, this running means that the coupling appears weaker when measured at higher energy scales and is the reason why we can sometimes do perturbative QCD calculations. However, in higher order calculations divergences appear and we have to understand — at least in principle — how these divergences can be removed. In reference [1] you will find a list of textbooks that may be useful. 1.1 Relativity Review An in a reference frame S is described by the four coordinates of a four-vector (in units where c = 1) xµ = (t, ~x), (1.1) where the Greek index µ 0, 1, 2, 3 . These coordinates are reference frame dependent. The coordinates in another∈ {frame S0}are given by x0µ, related to those in S by a Lorentz Transformation (LT) xµ x0 µ = Λµ xν , (1.2) → ν where summation over repeated indices is understood. This transformation identifies xµ as a contravariant 4-vector (often referred to simply as a vector). A familiar example of a LT is a boost along the z-axis, for which

γ 0 0 βγ 0 1 0 −0 Λµ =   , (1.3) ν 0 0 1 0      βγ 0 0 γ   −  with, as usual, β = v and γ = (1 β2)−1/2. LT’s can be thought of as generalized rotations. − The “length” of the 4-vector (t2 ~x 2) is invariant to LTs. In general we define the − | | Minkowski scalar product of two 4-vectors x and y as

x y xµyνg xµy , (1.4) · ≡ µν ≡ µ where the metric 1 if µ = ν gµν = g = diag(1, 1, 1, 1), gµλg = gµ = δµ = , (1.5) µν − − − λν ν ν 0 if µ = ν  6 has been introduced. The last step in eq. (1.4) is the definition of a covariant 4-vector (sometimes referred to as a co-vector),

x g xν. (1.6) µ ≡ µν This transforms under a LT according to

x x0 = Λ νx . (1.7) µ → µ µ ν Note that the invariance of the scalar product implies

ΛT gΛ = g gΛT g = Λ−1, (1.8) ⇒ i.e. a generalization of the orthogonality property of the rotation matrix RT = R−1. . Exercise 1.1 Show eq. (1.8), starting from the invariance of the scalar product. To formulate a coherent relativistic theory of dynamics we define kinematic variables that are also 4-vectors (i.e. transform according to eq. (1.2)). For example, we define a 4-velocity dxµ uµ = , (1.9) dτ where τ is the proper time measured by a clock moving with the particle. Everyone will agree what the clock says at a particular event so this measure of time is Lorentz invariant and uµ transforms as xµ. Note dt dxµ uµ = = γ(1, ~v) (1.10) dτ dt and has invariant length uµu = γ2(12 ~v 2) = 1. (1.11) µ − | | Similarly 4- provides a relativistic definition of energy and momentum pµ = muµ (E, p~). (1.12) ≡ The invariant length provides the crucial relation pµp = E2 p~ 2 = m2. (1.13) µ − | | . Exercise 1.2 Check that dt/dτ = γ and that our relativistic definitions of E and p~ make sense in the non-relativistic limit. The differentiation operator,

∂ ∂ ~ ν ν ∂µ = , , ∂µx = δµ, (1.14) ≡ ∂xµ ∂t ∇! is a covariant 4-vector (i.e. according to eq. (1.7)). This means that the contravariant equivalent 4-vector will have an extra minus sign in its space-like components, ∂ ∂µ = ( , ~ ). (1.15) ∂t −∇ The convention for the totally antisymmetric Levi-Civita tensor is +1 if µ, ν, λ, σ an even permutation of 0, 1, 2, 3 µνλσ { } { }  = 1 if an odd permutation . (1.16)  −  0 otherwise µνλσ µνλσ Note that  = µνλσ, and  pµqνrλsσ changes sign under a transformation since it contains an−odd number of spatial components. . Exercise 1.3 Verify the above two properties of µνλσ. I will use natural units, c = 1, h¯ = 1, so mass, energy, inverse length and inverse time all have the same dimensions. Generally think of energy as the basic unit, e.g. mass has units of GeV and distance has units of GeV−1. . Exercise 1.4 Noting that E has SI unit kg.m2.s−2, c has SI unit m.s−1 and h¯ has SI unit kg.m2.s−1, what is a mass of 1 GeV in kg and what is a cross-section of 1 GeV−2 in microbarns? 2 Relativistic Wave Equations

Let’s review how wave equations describe non-relativistic quantum particles. Experimen- tally we know that a particle with definite momentum p~ and energy E can be associated with a plane wave

~ p~ E ψ = ei(k.~x−wt), with ~k = , w = . (2.1) h¯ h¯ To extract E and p~ from the wave we use operators d Eψ = ih¯ ψ, pψ~ = ih¯ ~ ψ. (2.2) dt − ∇ In quantum mechanics, it is more usual to refer to the energy operator as the Hamiltonian H, and write (with h¯ = 1) ∂ψ Hψ = i . (2.3) ∂t I shall usually reserve the Greek symbol ψ for spin 1/2 fermions and φ for spin 0 bosons. So for and the like I shall write ∂φ Hφ = i . (2.4) ∂t In non-relativistic systems, conservation of energy can be written H = T + V, (2.5) where T is the kinetic energy and V is the potential energy. A particle of mass m and momentum p~ has non-relativistic kinetic energy, p~ 2 T = . (2.6) 2m Replacing the energy and momentum operators with the forms seen in eq. (2.2), we arrive at the Schr¨odinger equation d h¯2 ih¯ ψ = 2ψ + V ψ. (2.7) dt −2m∇ In this equation ψ is the wave function describing the single particle probability ampli- tude. For a slow moving particle v c (e.g. an in a ) this is  adequate, but for relativistic systems (v c) the Hamiltonian above is incorrect. For a free relativistic particle the total∼ energy E is given by the Einstein equation E2 = p~ 2 + m2. (2.8) Thus the square of the relativistic Hamiltonian H 2 is simply given by promoting the momentum to operator status: H2 = p~ 2 + m2. (2.9) So far, so good, but how should this be implemented into the wave equation of eq. (2.3), which is expressed in terms of H rather than H 2? Naively the relativistic wave equation looks like ∂ψ(t) p~ 2 + m2ψ(t) = i (2.10) ∂t q but this is difficult to interpret because of the square root. There are two ways forward: 1. Work with H2. By iterating the wave equation we have

2 2 2 ∂ φ(t) i∂ H φ(t) = 2 or V φ(t) (2.11) − ∂t  ∂t −   This is known as the Klein-Gordon (KG) equation. In this case the wave function describes spinless bosons.

2. Invent a new Hamiltonian HD that is linear in momentum, and whose square is 2 2 2 2 equal to H given above, HD = p~ + m . In this case we have ∂ψ(t) H ψ(t) = i (2.12) D ∂t

which is known as the Dirac equation, with HD being the Dirac Hamiltonian. In this case the wave function describes spin 1/2 fermions, as we shall see.

2.1 The Klein-Gordon Equation Let us now take a more detailed look at the KG equation (2.11). In position space we write the energy-momentum operator as

pµ i∂µ, (2.13) → so that the KG equation (for zero potential V ) becomes

(∂2 + m2) φ(x) = 0 (2.14) where we recall the notation,

∂2 = ∂ ∂µ = ∂2/∂t2 2 (2.15) µ − ∇ and x is the 4-vector (t, ~x). The operator ∂2 is Lorentz invariant, so the Klein-Gordon equation is relativistically covariant (that is, transforms into an equation of the same form) if φ is a scalar function. That is to say, under a Lorentz transformation (t, ~x) (t0, ~x0), → φ(t, ~x) φ0(t0, ~x0) = φ(t, ~x) (2.16) → so φ is invariant. In particular φ is then invariant under spatial rotations so it represents a spin-zero particle (more on spin when we come to the Dirac equation); there being no preferred direction which could carry information on a spin orientation. The Klein-Gordon equation has plane wave solutions:

φ(x) = Ne−i(Et−p~·~x) (2.17) where N is a normalization constant and E = √p~ 2 + m2. Thus, there are both positive and negative energy solutions. The negative energy solutions pose a severe problem if we try to interpret φ as a wave function (as indeed we are trying to do). The spectrum is no longer bounded from below, and we can extract arbitrarily large amounts of energy from the system by driving it to ever more negative energy states. Any external perturbation capable of pushing a particle across the energy gap of 2m between the positive and negative energy continuum of states can uncover this difficulty. Furthermore, we cannot just throw away these solutions as unphysical since they appear as Fourier modes in any realistic solution of (2.14). Note that if one interprets φ as a quantum field there is no problem, as you will see in the field theory course. The positive and negative energy modes are just associated with operators which create or destroy particles. A second problem with the wave function interpretation arises when trying to find a probability density. Since φ is Lorentz invariant, φ 2 does not transform like a density (i.e. as the time component of a 4-vector) so we will| | not have a Lorentz covariant con- tinuity equation ∂ρ + ~ J~ = 0. To search for a candidate we derive such a continuity equation. Defining ρ and∇ ·J~ by

∂φ ∂φ∗ ∂ ∂ ρ i φ∗ φ , or φ∗ i V + φ i V φ∗ , (2.18) ≡ ∂t − ∂t !   ∂t −  − ∂t −   J~ i (φ∗ ~ φ φ~ φ∗), (2.19) ≡ − ∇ − ∇ we obtain (see problem) a covariant conservation equation

µ ∂µJ = 0, (2.20)

where J is the 4-vector (ρ, J~). It is thus natural to interpret ρ as a probability density and J~ as a probability current. However, for a plane wave solution (2.17), ρ = 2 N 2E, so the negative energy solutions also have a negative probability! | | . Exercise 2.5 Derive the continuity equation (2.20). Start with the Klein-Gordon equation multiplied by φ∗ and subtract the complex conjugate of the KG equation multiplied by φ. Thus, ρ may well be considered as the density of a conserved quantity (such as elec- tric charge), but we cannot use it for a probability density. To Dirac, this and the existence of negative energy solutions seemed so overwhelming that he was led to intro- duce another equation, first order in time derivatives but still Lorentz covariant, hoping that the similarity to Schr¨odinger’s equation would allow a probability interpretation. Dirac’s original hopes were unfounded because his new equation turned out to admit negative energy solutions too! Even so, he did find the equation for spin-1/2 particles and predicted the existence of . Before turning to discuss what Dirac did, let us put things in context. We have found that the Klein-Gordon equation, a candidate for describing the quantum mechanics of spinless particles, admits unacceptable negative energy states when φ is interpreted as the single particle wave function. We could solve all our problems here and now, and restore our faith in the Klein-Gordon equation, by simply re-interpreting φ as a quantum field. However we will not do that. There is another way forward (this is the way followed in the textbook of Halzen & Martin) due to Feynman and Stuc¨ kelberg. Causality us to ensure that positive energy states propagate forwards in time, but if we the negative energy states to propagate only backwards in time then we find a theory that is consistent with the requirements of causality and that has none of the aforementioned problems. In fact, the negative energy states cause us problems only so long as we think of them as real physical states propagating forwards in time. Therefore, we should interpret the emission (absorption) of a negative energy particle with momentum pµ as the absorption (emission) of a positive energy with momentum pµ. In order to become more familiar with this picture, consider a process with− a π+ and a in the initial state and final state. In figure 1(a) the π+ starts from the point + A and at a later time t1 emits a photon at the point ~x1. If the energy of the π is still positive, it travels on forwards in time and eventually will absorb the initial state photon + at t2 at the point ~x2. The final state is then again a photon and a (positive energy) π . There is another process however, with the same initial and final state, shown in + figure 1(b). Again, the π starts from the point A and at a later time t2 emits a photon at the point ~x1. But this time, the energy of the photon emitted is bigger than the energy of the initial π+. Thus, the energy of the π+ becomes negative and it is forced to travel backwards in time. Then at an earlier time t1 it absorbs the initial state photon at the point ~x2, thereby rendering its energy positive again. From there, it travels forward in time and the final state is the same as in figure 1(a), namely a photon and a (positive energy) π+. B B

(t2, ~x2) (t1, ~x2)

(t1, ~x1) (t2, ~x1) A A space (a) (b) time

Figure 1: Interpretation of negative energy states

In todays language, the process in figure 1(b) would be described as follows: in the + initial state we have an π and a photon. At time t1 and at the point ~x2 the photon creates an π+π− pair. Both propagate forwards in time. The π+ ends up in the final − state, whereas the π is annihilated at (a later) time t2 at the point ~x1 by the initial state π+, thereby producing the final state photon. To someone observing in real time, the negative energy state moving backwards in time looks to all intents and purposes like a negatively charged with positive energy moving forwards in time. . Exercise 2.6 Consider a wave incident on the potential step shown in figure 2. Show that if the 2 2 step size V > m + Ep, where Ep = √p~ + m then one cannot avoid using the negative square root ~k = (E V )2 + m2, resulting in negative currents and densities. Hint: − p − use the continuityqof φ(x) and ∂φ(x)/∂x at x = 0, and ensure that the group velocity vg = ∂E/∂k is positive for x > 0. Interpret the solution. b exp( ip~ ~x) − · V a exp(ip~ ~x) d exp(i~k ~x) · ·

x = 0

Figure 2: A potential step

2.2 The Dirac Equation Dirac wanted an equation first order in time derivatives and Lorentz covariant, so it had to be first order in spatial derivatives too. His starting point was to assume a Hamiltonian of the form,

HD = α1p1 + α2p2 + α3p3 + βm, (2.21) where pi are the three components of the momentum operator p~, and αi and β are some unknown quantities, which we will show must be interpreted as 4 4 matrices. Substituting the expressions for the operators eq. (2.13) into the Dirac Hamiltonian× of eq. (2.21) results in the equation ∂ψ i = ( i α~ ~ + βm)ψ (2.22) ∂t − · ∇ which is the position space Dirac equation. If ψ is to describe a free particle it must satisfy the Klein-Gordon equation so that it has the correct energy-momentum relation. This requirement imposes relationships among α1, α2, α3 and β. To see this, apply the Hamiltonian operator to ψ twice, to give ∂2ψ = [ αiαj i j i (βαi + αiβ)m i + β2m2]ψ, (2.23) − ∂t2 − ∇ ∇ − ∇ with an implicit sum of i and j over 1 to 3. The Klein-Gordon equation by comparison is ∂2ψ = [ i i + m2]ψ. (2.24) − ∂t2 −∇ ∇ It is clear that we cannot recover the KG equation from the Dirac equation if the αi and β are normal numbers. Insisting that the terms linear in i vanish independently would ∇ require either β to vanish or all the αi to vanish. This would remove either i j term or the m2 term, both of which are unacceptable. Instead we must insist that∇the∇ terms linear in i vanish in their sum without any of αi or β vanishing, i.e. we must assume ∇ that αi and β anti-commute. We recover the KG equation only if

αiαj + αjαi = 2δij βαi + αiβ = 0 (2.25) β2 = 1 for i, j = 1, 2, 3. In principle, these equations define αi and β, and any objects which obey these relations are good representations of them. However, in practice, we will represent them by matrices. In this case, ψ is a multi-component spinor on which these matrices act. . Exercise 2.7 Prove that any matrices α~ and β satisfying eq. (2.25) are traceless with eigenvalues 1. Hence argue that they must be even dimensional.  In two dimensions a natural set of matrices for the α~ would be the Pauli matrices 0 1 0 i 1 0 σ1 = , σ2 = − , σ3 = . (2.26) 1 0 i 0 0 1      −  However, there is no other independent 2 2 matrix with the right properties for β, so we must use a higher dimensional form. The× smallest number of dimensions for which the Dirac matrices can be realized is four. One choice is the Dirac representation: 0 ~σ 1 0 α~ = , β = . (2.27) ~σ 0 0 1    −  Note that each entry above denotes a two-by-two block and that the 1 denotes the 2 2 identity matrix. The spinor ψ therefore has four components. × There is a theorem due to Pauli that states that all sets of matrices obeying the relations in eq. (2.25) are equivalent. Since the hermitian conjugates α~ † and β† clearly obey the relations, you can, by a change of basis if necessary, assume that α~ and β are hermitian. All the common choices of basis have this property. Furthermore, we would like αi and β to be hermitian so that the Dirac Hamiltonian (2.42) is hermitian. If we define ρ = J 0 = ψ†ψ, J~ = ψ†αψ~ , (2.28) then it is a simple exercise using the Dirac equation to show that this satisfies the µ continuity equation ∂µJ = 0. We will see in section 2.8 that (ρ, J~) transforms, as it must, as a 4-vector. Note that ρ is now also positive definite.

2.3 Solutions to the Dirac Equation We look for plane wave solutions of the form χ(p~) ψ = e−i(Et−p~·~x) (2.29)  φ(p~)  where φ(p~) and χ(p~) are two-component spinors that depend on momentum p~ but are independent of ~x. Using the Dirac representation of the matrices, and inserting the trial solution into the Dirac equation gives the pair of simultaneous equations χ m ~σ p~ χ E = . (2.30) φ ~σ p~ ·m φ    · −    There are two simple cases for which eq. (2.30) can readily be solved, namely 1. p~ = 0, m = 0, which might represent an electron in its rest frame. 6 2. m = 0, p~ = 0, which describes a massless particle or a particle in the ultra- relativistic limit6 (E m).  For case (1), an electron in its rest frame, the equations (2.30) decouple and become simply, Eχ = mχ, Eφ = mφ. (2.31) − So, in this case, we see that χ corresponds to solutions with E = m, while φ corresponds to solutions with E = m. In light of our earlier discussions, we no longer need to recoil in horror at the appearance− of these negative energy states. The negative energy solutions persist for an electron with p~ = 0 for which the solutions 6 to equation (2.30) are ~σ p~ ~σ p~ φ = · χ, χ = · φ. (2.32) E+m E m − . Exercise 2.8 Show that (~σ p~)2 = p~2. · Using (~σ p~)2 = p~2 we see that E = √p~ 2 + m2 . We write the positive energy solutions with· E = + √p~ 2 + m2 as | | | | χ ψ(x) = e−i(Et−p~·~x), (2.33) ~σ·p~ χ  E+m  while the general negative energy solutions with E = √p~ 2 + m2 are −| | ~σ·p~ φ ψ(x) = E−m e−i(Et−p~·~x), (2.34)  φ  for arbitrary constant φ and χ. Clearly when p~ = 0 these solutions reduce to the positive and negative energy solutions discussed previously. It is interesting to see how Dirac coped with the negative energy states. Dirac inter- preted the negative energy solutions by postulating the existence of a “sea” of negative energy states. The vacuum or has all the negative energy states full. An additional electron must now occupy a positive energy state since the Pauli exclusion principle forbids it from falling into one of the filled negative energy states. On promot- ing one of these negative energy states to a positive energy one, by supplying energy, an electron-hole pair is created, i.e. a positive energy electron and a hole in the negative energy sea. The hole is seen in nature as a positive energy . This was a radical new idea, and brought pair creation and antiparticles into physics. The problem with Dirac’s hole theory is that it does not work for bosons. Such particles have no exclusion principle to stop them falling into the negative energy states, releasing their energy. It is convenient to rewrite the solutions, eqs. (2.33) and (2.34), introducing the spinors (s) (s) uα (p~) and vα (p~). The label α 1, 2, 3, 4 is a spinor index that often will be sup- pressed, while s 1, 2 denotes the∈ {spin state} of the , as we shall see later. We ∈ { } take the positive energy solution eq. (2.33) and define

χs −ip·x (s) −ip·x √E+m ~σ·p~ e u (p)e . (2.35)  E+m χs  ≡ For the negative energy solution of eq. (2.34), change the sign of the energy, E E, and the three-momentum, p~ p~, to obtain, → − → − ~σ·p~ χ √E+m E+m s eip·x v(s)(p)eip·x. (2.36)  χs  ≡ In these two solutions E is now (and for the rest of the course) always positive and given 2 2 1/2 by E = (p~ + m ) . The χs for s = 1, 2 are 1 0 χ1 = , χ2 = . (2.37)  0   1 

For the simple case p~ = 0 we may interpret χ1 as the spin-up state and χ2 as the spin-down state. Thus for p~ = 0 the 4-component wave function has a very simple interpretation: the first two components describe with spin-up and spin-down, while the second two components describe with spin-up and spin-down. Thus we understand on physical grounds why the wave function had to have four components. The general case p~ = 0 is slightly more involved and is considered in the next section. 6 The u-spinor solutions will correspond to particles and the v-spinor solutions to antiparticles. The role of the two χ’s will become clear in the following section, where it will be shown that the two choices of s are spin labels. Note that each spinor solution depends on the three-momentum p~, so it is implicit that p0 = E.

2.4 Orthogonality and Completeness Our solutions to the Dirac equation take the form

ψ = Nu(s)e−ip.x, ψ = Nv(r)eip.x, with r, s = 1, 2, (2.38)

where N is a normalization factor. We have already included a factor √E+m in our spinors (see eqs. (2.35) and (2.36)), which results in

u(r) †(p)u(s)(p) = v(r) †(p)v(s)(p) = 2Eδrs. (2.39)

This convention allows u†u to transform as the time component of a 4-vector under Lorentz transformations, which is essential to its interpretation as a probability density (see eq. (2.28) and section 2.8). Also note that the spinors are orthogonal. . Exercise 2.9 Check the normalization condition for the spinors in eq. (2.39). We must further normalize the spatial part of the wave functions. In fact a plane wave is not normalizable in an infinite space so in the computatuions that follow where we use them we will work in a large box of volume V - such a construction is not Lorentz invariant. The number of particles in the box will be

ψ†ψ d3x = 2E N 2 V, (2.40) Z so setting N = 1/√V allows us to adopt the standard relativistic normalization con- vention of 2E particles per box of volume V . Most people and the books use this convention. I frequently find it more intuitive, given we’ve broken Lorentz invariance, to set N = 1/√2EV so there’s one particle in the box. I’ll try to be clear below when I do this. Remember that the solutions to the wave equation form a complete set of states meaning that we can expand (like a Fourier expansion) an arbitrary function χ(x) in terms of them

χ(x) = anψn(x) (2.41) n X The an are the equivalent of Fourier coefficients and if χ is a wave function in some 2 quantum mixed state then an is the probability of being in the state ψn (or 2E times that!). | |

2.5 Spin Now it is time to justify the statements we have been making that the Dirac equation describes spin-1/2 particles. The Dirac Hamiltonian in momentum space is given in eq. (2.21) as H = α~ p~ + βm, (2.42) D · and the orbital angular momentum operator is

L~ = R~ p.~ (2.43) ×

Evaluating the commutator of L~ with HD,

[L,~ H ] = [R~ p,~ α~ p~] D × · = [R~, α~ p~] p~ · × = iα~ p~, (2.44) × we see that the orbital angular momentum is not conserved (otherwise the commutator would be zero). We would like to find a total angular momentum J~ that is conserved, by adding an additional operator S~ to L~ ,

J~ = L~ + S~, [J~, HD] = 0. (2.45)

To this end, consider the three matrices,

~σ 0 Σ~ = iα1α2α3α~, (2.46) ≡  0 ~σ  − where the first equivalence is merely a definition of Σ~ and the last equality can be verified by an explicit calculation. The Σ~ /2 have the correct commutation relations to represent angular momentum, since the Pauli matrices do, and their commutators with α~ and β are, [Σ~ , β] = 0, [Σi, αj] = 2iεijkαk. (2.47) From the relations in (2.47) we find that

[Σ~ , H ] = 2iα~ p.~ (2.48) D − × . Exercise 2.10 Using α α α 1  α α α verify the commutation relations in eqs. (2.47) and (2.48). 1 2 3 ≡ 3 ijk i j k Comparing eq. (2.48) with the commutator of L~ with HD in eq. (2.44), you see that 1 [L~ + Σ~ , H ] = 0, (2.49) 2 D and we can identify 1 S~ = Σ~ (2.50) 2 as the additional quantity that, when added to L~ in equation (2.45), yields a conserved total angular momentum J~. We interpret S~ as an angular momentum intrinsic to the particle. Now 1 ~σ ~σ 0 3 1 0 S~2 = · = , (2.51) 4 0 ~σ ~σ 4 0 1  ·    and, recalling that the eigenvalue of J~ 2 for spin j is j(j+1), we conclude that S~ represents spin-1/2 and the solutions of the Dirac equation have spin-1/2 as promised. We worked in the Dirac representation of the matrices for convenience, but the result is necessarily independent of the representation. (s) Now consider the u-spinor solutions u (p) of eq. (2.35). Choose p~ = (0, 0, pz) and write √E+m 0 0 √E+m u u(1)(p) =   , u u(2)(p) =   . (2.52) ↑ ≡ √E m ↓ ≡ 0      −   √   0   E m     − −  With these definitions, we get 1 1 S u = u , S u = u . (2.53) z ↑ 2 ↑ z ↓ −2 ↓ So, these two spinors represent spin up and spin down along the z-axis respectively. For the v-spinors, with the same choice for p~, write, √E m 0 0− √E m v = v(1)(p) =   , v = v(2)(p) =   , (2.54) ↓ √E+m ↑ − 0 −        √   0   E+m      where now, 1 1 S v = v , S v = v . (2.55) z ↓ 2 ↓ z ↑ −2 ↑ This apparently perverse choice of up and down for the v’s is actually quite sensible when one realizes that a negative energy electron carrying spin +1/2 backwards in time looks just like a positive energy positron carrying spin 1/2 forwards in time. − 2.6 There is a much more compact way of writing the Dirac equation, which requires that we get to grips with some more notation. Define the γ-matrices,

γ0 = β, ~γ = βα~. (2.56)

In the Dirac representation,

1 0 0 ~σ γ0 = , ~γ = . (2.57) 0 1 ~σ 0  −   −  In terms of these, the relations between the α~ and β in eq. (2.25) can be written compactly as, γµ, γν = 2gµν. (2.58) { } . Exercise 2.11 Prove that γµ, γν = 2gµν. { } µ Combinations like aµγ occur frequently and are conventionally written as,

µ µ a/ = aµγ = a γµ,

pronounced “a slash.” Note that γµ is not, despite appearances, a 4-vector. It just denotes a set of four matrices. However, the notation is deliberately suggestive, for when combined with Dirac fields you can construct quantities that transform like vectors and other Lorentz tensors (see the next section). Observe that using the γ-matrices the Dirac equation (2.22) becomes

(i∂/ m)ψ = 0, (2.59) − or, in momentum space, (p/ m)ψ = 0. (2.60) − The spinors u and v satisfy

(p/ m)u(s)(p) = 0, (2.61) − (p/ + m)v(s)(p) = 0, (2.62)

since for v(s)(p), E E and p~ p~. We want the Dirac→ −equation (2.59)→ −to preserve its form under Lorentz transformations eq. (1.2). We’ve just naively written the matrices in the Dirac equation as γµ however this does not make them a 4-vector! They are just a set of numbers in four matrices and there’s no reason they should change when we do a boost. Since ∂µ does transform, for the equation to be Lorentz covariant we are led to propose that ψ transforms too. We know that 4-vectors get their components mixed up by LT’s, so we expect that the components of ψ might get mixed up too:

ψ(x) ψ0(x0) = S(Λ)ψ(x) = S(Λ)ψ(Λ−1x0) (2.63) → where S(Λ) is a 4 4 matrix acting on the spinor index of ψ. Note that the argument Λ−1x0 is just a fancy× way of writing x, i.e. each component of ψ(x) is transformed into a linear combination of components of ψ(x). In order to appreciate the above it is useful to consider a vector field, where the corresponding transformation is

Aµ(x) A0µ(x0) → where x0 = Λx. This makes sense physically if one thinks of space rotations of a vector field. For example the wind arrows on a weather map are an example of a vector field: with each point on the map there is associated an arrow. Consider the wind direction at a particular point on the map, say Abingdon. If the map is rotated, then one would expect on physical grounds that the wind vector at Abingdon always point in the same physical direction and have the same length. In order to achieve this, both the vector itself must rotate, and the point to which it is attached (Abingdon) must be correctly identified after the rotation. Thus the vector at the point x0 (corresponding to Abingdon in the rotated frame) is equal to the vector at the point x (corresponding to Abingdon in the unrotated frame), but rotated so as to keep the physical sense of the vector the same in the rotated frame (so that the wind always blows towards Oxford, say, in the two frames). Thus having correctly identified the same point in the two frames all we need to do is rotate the vector:

0µ 0 µ ν A (x ) = Λ ν A (x). (2.64) A similar thing also happens in the case of the 4-component spinor field above, except that we do not (yet) know how the components of the wave function themselves must transform, i.e. we do not know S. We now need to figure out what S is. The requirement is that the Dirac equation has the same form in any inertial frame. Thus, if we make a LT from our original frame into another (‘primed’) frame and write down the Dirac equation in this frame, it has to have the same form.

(iγµ∂ m)ψ(x) = 0 (iγµ∂0 m)ψ0(x0) = 0, (2.65) µ − −→ µ − where we used the fact that m is a scalar, i.e. m0 = m. The derivative transforms as a covector, eq. (1.7), so using the orthogonality condition σ 0 of eq. (1.8), we can write ∂µ = Λ µ∂σ and multiplying the Dirac equation in the original frame by S it becomes S(iγµΛσ ∂0 m)ψ(x) = 0. (2.66) µ σ − On the other hand, we can use the definition of S in eq. (2.63) to rewrite the equation in the primed frame as (iγµ∂0 m)Sψ(x) = 0. (2.67) µ − We can see that the second term (containing m) of eqs. (2.66) and (2.67) are now σ µ σ identical. To make the first term identical we need SΛ µγ = γ S. Thus, in order for the Dirac equation to be Lorentz invariant, S(Λ) has to satisfy

σ µ −1 σ Λ µγ = S γ S (2.68) We still haven’t solved for S explicitly. We need to find an S that satisfies eq. (2.68). Since S depends on the LT, we first have to find a convenient parameterization of a LT and then express S(Λ) in terms of these parameters. For an infinitesimal LT, it can be shown that, µ µ µ Λ ν = g ν + ω ν (2.69)

where ωµν is an antisymmetric set of infinitesimal parameters. For example, a boost along the z-axis corresponds to ω = ω = β (remember that ω = ω0 = ω i etc) 03 − 30 − 0i i − 0 with all other entries of ωµν zero,

1 0 0 β − 0 1 0 0 Λµ = gµ + ωµ =   . (2.70) ν ν ν 0 0 1 0      β 0 0 1   −  This corresponds to eq. (1.3) when one makes an expansion in small β, i.e. γ = 1 + (β2). Non-zero ω or ω correspond to boosts along the x and y axes respectively. O 01 02 The remaining combinations, non-zero ω23, ω31 or ω12, correspond to infinitesimal anti- clockwise rotations through an angle ωij about the x, y and z axes respectively. It’s a nice exercise to check this out. For an infinitesimal LT we are at liberty to write i S(Λ) = 1 + ω σµν , (2.71) 4 µν which is nothing but a definition of the set of matrices σµν . Our task is to determine these matrices. To do this, substitute the expression for S, eq. (2.71), into eq. (2.68) (and −1 i µν remember that S (Λ) = 1 4 ωµνσ ). After some algebra, we can convince ourselves that the solution is − i σµν = [γµ, γν] (2.72) 2 Thus S can be written explicitly in terms of γ-matrices for a general LT by building the finite transformation out of lots of infinitesimal ones. . Exercise 2.12 Verify that eq. (2.72) is true. Now that we now how ψ transforms we can find quantities that are Lorentz invariant, or transform as vectors or tensors under LT’s. To this end, we will find it useful to introduce the Dirac adjoint. The Dirac adjoint ψ¯ of a spinor ψ is defined by

ψ¯ ψ†γ0 (2.73) ≡ With the help of S†(Λ)γ0 = γ0S−1(Λ) (2.74) we see that ψ¯ transforms under LT’s as

ψ¯ ψ¯ 0 = ψ¯S−1(Λ). (2.75) → . Exercise 2.13 1. Verify that 㵆 = γ0γµγ0. 2. Prove eq. (2.74) 3. Show that ψ¯ satisfies the equation ← ψ¯ ( i∂/ m) = 0 − − where the arrow over ∂/ implies the derivative acts to the left. 4. Hence prove that ψ¯ transforms as in eq. (2.75). Combining the transformation properties of ψ and ψ¯ in eqs. (2.63) and (2.75) we see that the bilinear ψ¯ψ is Lorentz invariant. In section 2.8 we will consider the transforma- tion properties of general bilinears. Let’s close this section by recasting the spinor normalization eq. (2.39) in terms of Dirac inner products. The conditions become u¯(r)(p)u(s)(p) = 2mδrs u¯(r)(p)v(s)(p) = v¯(r)(p)u(s)(p) = 0 (2.76) v¯(r)(p)v(s)(p) = 2mδrs − where, in analogy to eq. (2.73), we defined u¯ u†γ0 and v¯ v†γ0. ≡ ≡ . Exercise 2.14 Verify the normalization properties in the above equations (2.76).

2.7 Parity, charge conjugation and time reversal 2.7.1 Parity We usually use LT’s which are in the connected Lorentz Group, SO(3, 1), meaning they can be obtained by a continuous deformation of the identity transformation (i.e. by lots of little transformations)1. This class of LT is often referred to as proper LT. However, the full Lorentz group consists not only of the proper transformations but also includes the discrete operations of parity (space inversion), P , and time reversal, T : 1 0 0 0 1 0 0 0 0 1 0 0 −0 1 0 0 Λ =   , Λ =   . (2.77) P 0 −0 1 0 T 0 0 1 0      −     0 0 0 1   0 0 0 1   −    LT’s satisfy ΛT gΛ = g, so taking determinants shows that det Λ = 1. Proper LT’s are  continuously connected to the identity so must have determinant 1, but both P and T operations have determinant 1. Let us now find the action−of parity on the Dirac wave function and determine the wave function ψP in the parity-reversed system. According to the discussion of the previous section, we need to find a matrix P satisfying P −1γ0P = γ0, P −1γiP = γi. (2.78) − 1Indeed in the last section we considered LT’s very close to the identity in equation (2.69) Using some properties of the γ-matrices we see that P = P −1 = γ0 is an acceptable solution (Clearly one could multiply γ0 by a phase and still have an acceptable definition for the parity transformation.), from which it follows that the transformation is

ψ(t, ~x) ψ (t, ~x) = P ψ(t, ~x) = γ0ψ(t, ~x). (2.79) → P − Since 1 0 γ0 = , (2.80) 0 1  −  the u-spinors and v-spinors at rest have opposite eigenvalues, corresponding to particle and antiparticle having opposite intrinsic parities.

2.7.2 Charge Conjugation Another discrete invariance of the Dirac equation is charge conjugation, which takes you from particle to antiparticle and vice versa. For scalar fields the is just complex conjugation, but in order for the charge conjugate Dirac field to remain a solution of the Dirac equation, you have to mix its components as well. The transformation on the fermion wavefunction is ψ ψ = Cψ¯ T , (2.81) → C T where ψ¯ T = ψ†γ0 = γ0 T ψ† T = γ0ψ∗. To find the form of C, let’s take the complex conjugate of the Dirac Equation,

T T (iγµ∂ m)∗ ψ∗ = i γµ † ∂ m ψ† µ − µ −       = γ0 T iγµ T ∂ m ψ¯T , (2.82) − µ −   where we have additionally used γµ † = γ0γµγ0. Premultiply by C and the Dirac equation becomes iCγµ T C−1∂ m ψ = 0. (2.83) − µ − c   In order for ψC to satisfy the Dirac equation we require C to be a matrix satisfying the condition CγT C−1 = γ (C−1 = C†). (2.84) µ − µ In the Dirac representation,a suitable choice for this operator is

0 iσ2 C = iγ2γ0 = . (2.85) iσ2 −0  −  The charge-conjugation transformation is then

ψ(t, ~x) ψ (t, ~x) = Cψ¯T (t, ~x) = iγ2γ0ψ¯T (t, ~x). (2.86) → C When Dirac wrote down his equation everybody thought parity and charge conju- gation were exact symmetries of nature, so invariance under these transformations was essential. Now we know that neither of them, nor the combination CP , is respected by the standard electroweak model. 2.7.3 Time reversal

As already noted, time reversal is an improper LT, given by ΛT in eq. (2.77). Naively one would expect to derive a time reversal operation in the same way as for parity. However, there is a subtlety that the momentum of a particle is a rate of change, so if we reverse the direction of time, the momentum must change direction. When we reverse the momentum p~ in a plane wave we find ∗ e−i(Et−p~·~x) e−i(Et−(−p~)·~x) = ei(E(−t)−p~·~x) = e−i(E(−t)−p~·~x) . (2.87) −→ In this example, taking the complex conjugate is the equiv alent of reversing the time coordinate and reversing the momentum. So once again, we must take the complex conjugate of the field, transforming it according to ψ(t, ~x) ψ ( t, ~x) = T ψ∗(t, ~x). (2.88) → T − To find the form of T , let’s take the complex conjugate of the Dirac equation, premultiply by T and interchange t t, → − 0 ∂ 0 ∗ ∂ ∗ −1 ∗ iγ + i~γ ~ m ψ(t, ~x) ST iγ i~γ ~ m T T ψ ( t, ~x) ∂t · ∇ − ! −→ − ∂( t) − · ∇ − ! − −

0 ∗ −1 ∂ ∗ −1 = i T γ T + i T~γ T ~ m ψT (t, ~x). ∂t − · ∇ − ! h i h i (2.89)

For ψT to satisfy the Dirac equation we need i T γ0 ∗T −1 = γ0, T~γ∗T −1 = ~γ. (2.90) − − A suitable choice is h i h i 0 1 0 0 0 iσ σ 1 0 0 0 T = iγ1γ3 = − 1 3 = i  −  , (2.91) iσ1σ3 0 ! 0 0 0 1   −    0 0 1 0   −  and the time reversal transformation on a fermion field is ψ(t, ~x) ψ ( t, ~x) = T ψ∗(t, ~x) = iγ1γ3ψ∗(t, ~x) (2.92) → T − 2.7.4 CPT We are now in the position to ask what is the effect of performing charge conjugation, parity and time-reversal all together on a Dirac field. The combined transformation is known as CPT. Using eqs. (2.79), (2.86) and (2.92), the CPT transformation is, ∗ ψ(t, ~x) ψ ( t, ~x) = iγ2γ0γ0 T γ0iγ1γ3ψ∗(t, ~x) → CP T − − h i = iγ2γ0γ0γ0( i)γ1γ3ψ(t, ~x) − = γ0γ1γ2γ3ψ(t, ~x) = iγ5ψ(t, ~x) (2.93) − Thus, apart from the factor of γ5, a particle moving forward in time is equivalent to an anti-particle moving backwards in time and in the opposite direction. In fact, the extra γ5 makes no difference to observable quantities (see the next section) so this justifies the Feynman-Stuc¨ kelberg interpretation of negative energy states we used earlier.

2.8 Bilinear Covariants Now, as promised, we will construct and classify the bilinears. These are useful for defin- ing quantities with particular properties under Lorentz transformations, and appearing in Lagrangians for fermion field theories. To begin, note that by forming products of the γ-matrices it is possible to construct 16 linearly independent 4 4 matrices. Any constant 4 4 matrix can then be decomposed into a sum over these basis× matrices. In equation (2.72)× we have defined i σµν [γµ, γν], ≡ 2 and now it is convenient to define 0 1 γ5 γ5 iγ0γ1γ2γ3 = , (2.94) ≡ ≡  1 0  where the last equality is valid in the Dirac representation. This new matrix satisfies γ5† = γ5, γ5, γµ = 0, (γ5)2 = 1. (2.95) . Exercise 2.15 n o Prove the three results in eq. (2.95) independently of the γ-matrix representation. Now, the set of 16 matrices 1, γ5, γµ, γµγ5, σµν form a basis for γ-matrix products.n There are 16 matriceso since there is 1 unit matrix, 1 γ5 matrix, 4 γµ matrices and 4 γµγ5 matrices, and 6 σµν matrices (see equation (2.72) for the definition of σµν ). Using the transformations of ψ and ψ¯ from eqs. (2.63) and (2.75), together with the transformation of γµ in eq. (2.74), the 16 fermion bilinears and their transformation properties can be written as follows: ψ¯ψ ψ¯ψ S scalar → ψ¯γ5ψ det(Λ) ψ¯γ5ψ P pseudoscalar → ψ¯γµψ Λµ ψ¯γν ψ V vector → ν ψ¯γµγ5ψ det(Λ) Λµ ψ¯γνγ5ψ A axial vector → ν ψ¯σµνψ Λµ Λν ψ¯σλσψ T tensor (2.96) → λ σ In particular we note that ψ¯γµψ = ψ†γ0γµψ = (ψ†ψ, ψ†α~ψ) (2.97) which is our previous definition eq. (2.28) of the current 4–vector J µ, i.e. we now see that it is really a 4–vector. . Exercise 2.16 Derive the transformation properties of the bilinears in equation (2.96) under C, P, T and CPT transformations.

2.9 Massless (Ultra-relativistic) Fermions At very high we may neglect the masses of particles (E2 p~ 2). Therefore, let us look at solutions of the Dirac equation with m = 0, on the basis' |that| this will be an extremely good approximation for many situations. From equation (2.30) we have in this case

Eφ = ~σ p~ χ, Eχ = ~σ p~ φ. (2.98) · · These equations can easily be decoupled by taking linear combinations and defining the two component spinors ΨL and ΨR, χ φ Ψ  , (2.99) R/L ≡ 2 which leads to EΨ = ~σ p~ Ψ , EΨ = ~σ p~ Ψ . (2.100) R · R L − · L The system is in fact described by two entirely separated two component spinors. If we take them to be moving in the z direction, and noting that σ3 = diag(1, 1), we see that there is one positive and one negative energy solution in each. − Further since E = p~ for massless particles, these equations may be written | | ~σ p~ ~σ p~ · Ψ = Ψ , · Ψ = Ψ (2.101) p~ L − L p~ R R | | | | 1 ~σ·p~ Now, 2 |p~| is known as the helicity operator (i.e. it is the spin operator projected in the direction of motion of the momentum of the particle). We see that the ΨL corresponds to solutions with negative helicity, while ΨR corresponds to solutions with positive helicity. In other words ΨL describes a left-handed particle while ΨR describes a right-handed particle, and each type is described by a two-component spinor. The two-component spinors transform very simply under LT’s,

i ~ ~ Ψ e 2 ~σ.(θ−iφ)Ψ (2.102) L → L i ~ ~ Ψ e 2 ~σ.(θ+iφ)Ψ (2.103) R → R where θ~ = ~nθ corresponds to space rotations through an angle θ about the unit ~n axis, and φ~ = ~vφ corresponds to Lorentz boosts along the unit vector ~v with a speed v = tanh φ. Note that these transformations are consistent with the fact that it is not possible to boost past a massless particle (i.e. its helicity cannot be reversed). However, under parity transformations ~σ ~σ (like R~ p~), p~ p~, therefore ~σ p~ ~σ p~, i.e. the spinors transform into eac→h other: × → − · → − · Ψ Ψ . (2.104) L ↔ R So a theory in which ΨL has different interactions to ΨR (such as the in which the weak force only acts on left handed particles) manifestly violates parity. Although massless particles can be described very simply using two component spinors as above, they may also be incorporated into the four-component formalism by using the γ5 we defined earlier. Let’s define projection operators 1 P 1 γ5 . (2.105) R/L ≡ 2    In the Dirac representation, these are,

1 1 1 PR/L =  , (2.106) 2 1 1  ! where 1 denotes the 2 2 identity matrix. Acting these projection operators on a general × Dirac field of the form eq. (2.29) projects onto right- or left-handed eigenstates. To see this, first note that

χ 1 1 1 χ ΨR/L PR/L =  = . (2.107) φ 2 1 1 φ ΨR/L !  ! ! ! The helicity operator in four-component Dirac space is given by S~ p/~ p~ , with S~ = 1 Σ,~ · | | 2 where Σ~ is defined in equation (2.46). Acting this operator on the projected state gives

1 ~σ·p~ 0 Ψ 1 Ψ |p~| R/L = R/L , (2.108)  ~σ·p~  2 0 |p~| ΨR/L ! 2 ΨR/L !   indicating that the projected states are indeed right- or left-handed eigenstates with helicity 1 .  2 This can be made more explicit by using a different representation for the γ-matrices. In the chiral representation (sometimes called the Weyl representation) we define the γ- matrices to be 0 1 0 ~σ γ0 , ~γ , (2.109) ≡  1 0  ≡  ~σ 0  so that, with γ5 = iγ0γ1γ2γ3 as before, the projection operators eq. (2.105) become

0 0 1 0 PR = , PL = . (2.110) 0 1 ! 0 0 !

Now, the left-handed Weyl spinor sits in the upper two components of the Dirac spinor, while the right-handed Weyl spinor sits in the lower two components of the Dirac spinor. The projection operators pick out only the upper or lower component, e.g.

ΨL 0 0 ΨL 0 PR = = , (2.111) ΨR ! 0 1 ! ΨR ! ΨR ! so the projected states are once again helicity eigenstates. 3 Quantum Electrodynamics

3.1 Classical So far, we have only considered relativistic wave equations for free particles. Now we want to include electromagnetic interactions, so let’s start by reviewing Maxwell’s Equations in differential form:

~ .E~ = ρ, ~ .B~ = 0, ∇ ∇ (3.1) ∂B~ ∂E~ ~ E~ = , ~ B~ = J~ + . ∇ × − ∂t ∇ × ∂t Note here that I’m using Heaviside Lorentz units - I’ve used my freedom to choose the unit of charge to set 0 = 1. Then, since in natural units c = 1, µ0 = 1 too. When one plays these games the value of the electron charge changes but the dimensionless 2 quantity α = e /4π0hc¯ remains unchanged - α = 1/137. We can rewrite the Maxwell equations in terms of a scalar potential φ, and a vector potential A~. Writing ∂A~ E~ = ~ φ, − ∂t − ∇ (3.2) B~ = ~ A,~ ∇ × we automatically have solutions of two of the Maxwell equations,

~ .B~ = ~ .(~ A~) 0 (3.3) ∇ ∇ ∇ × ≡ and ∂A~ ~ E~ = ~ ~ φ ∇ × ∇ × − ∂t − ∇    ∂(~ A~) (3.4) = ∇ × ~ (~ φ) − ∂t − ∇ × ∇

∂B~ = . − ∂t This simplifies things greatly since now there are only two Maxwell equations to solve. Let’s write them out in terms of the potentials,

d(~ .A~) ~ .E~ = 2φ ∇ = ρ, (3.5) ∇ −∇ − dt and (since ~ ~ A~ 2A~ + ~ .(~ .A~)), ∇ × ∇ × ≡ −∇ ∇ ∇

∂ ∂A~ ~ (~ .A~) 2A~ = J~ + ~ φ . (3.6) ∇ ∇ − ∇ ∂t − ∂t − ∇    or rearranging, ∂2A~ ∂φ 2A~ + = J~ ~ (~ .A~ + ). (3.7) −∇ ∂t2 − ∇ ∇ ∂t Unfortunately these two equations we are left with are quite complicated. To simplify them up we note that we can redefine our potentials,

A~ A~ + ~ ψ, → ∇ ∂ψ φ φ , (3.8) → − ∂t without changing E~ and B~ . This redefinition of the potentials is known as a gauge transformation. . Exercise 3.17 Check that E~ and B~ are invariant under the gauge transformation in eq. (3.8). We can choose a gauge transformation such that ∂φ ~ .A~ = . (3.9) ∇ − ∂t In this gauge (the Lorentz gauge) Maxwell’s equations simplify to

∂2φ 2φ + = ρ, (3.10) −∇ ∂t2 ∂2A~ 2A~ + = J~. (3.11) −∇ ∂t2 As well as being prettier, these equations also have a very suggestive form. They suggest we should define the 4-vectors,

J µ = (ρ, J~), Aµ = (φ, A~), (3.12)

so the Maxwell equations may be written in a manifestly covariant form,

∂2Aµ = J µ. (3.13)

The µ = 0 equation is the φ eq. (3.10) and the µ = 1, 2, 3 equations give the components of the eq. (3.11) for A~. The gauge condition, eq. (3.9), becomes

µ ∂ Aµ = 0. (3.14)

Eq. (3.13) is the classical wave equation for the electromagnetic field. In free space we have eq. (3.13) with no source, i.e.

∂2Aµ = 0, (3.15)

which has plane wave solutions, Aµ = µeiq.x, (3.16) where µ is the polarization tensor and q2 = 0. The Lorentz condition, eq. (3.14), enforces µ q µ = 0, (3.17) which removes one degree of freedom. Even after enforcing this condition, there is still room to make more gauge transformations, Aµ Aµ + ∂µχ where ∂2χ = 0. (3.18) → This can be used to remove one extra degree of freedom from µ. There are therefore two physical degrees of freedom, the normal polarizations of a photon.

3.2 The Dirac Equation in an We will now treat Aµ as a quantum mechanical wave function for photons. In the limit of a large number of photons the wave function is interpreted as a number density and produces the classical wave theory. But so far we have no interactions; to allow electrons to interact with electromagnetism we have to include the photon field into our Dirac equation. The ’obvious’ thing to do is to just be led by Lorentz invariance. The field Aµ is a vector field so we need to ’soak up’ its free index with a γ-matrix. We therefore include it into the Dirac equation as (iγµ∂ eγ Aµ m) ψ = 0, (3.19) µ − µ − where the factor of e is a free constant which quantifies how strongly the electron couples to the photon (the charge of the electron is e). It is convenient to incorporate this extra−term into a new definition of a covariant derivative2, Dµ ∂µ + ieAµ. (3.20) ≡ Our interacting Dirac equation was therefore obtained from the free Dirac equation by the minimal substitution ∂µ Dµ, and the Dirac equation becomes → (iD/ m)ψ = 0. (3.21) − There is a much nicer and theoretically much more appealing way to get the interac- tion term. That is if we require the QED Lagrangian to be invariant under a local gauge symmetry consisting of the transformations ψ e−ieΛ(x)ψ, Aµ Aµ ∂µΛ(x). (3.22) → → − then we are forced to the wave equation in eq. (3.21). For more details, I refer you to the Standard Model course. We must also allow the electrons to enter into the photon wave equation but here the classical theory already tells us how a current density enters. We expect ∂2Aµ = J µ (3.23) where J µ is just given by the charge times the Dirac equation number density ( eψ¯γµψ). − 2Conventions for the covariant derivative vary. Halzen and Martin, and Mandl and Shaw both use Dµ ∂µ ieAµ whereas Peskin and Schroeder both use eq. (3.20). Both conventions define the electron charge≡ to−be e but differ by a sign in the definition of the photon field, Aµ. − 3.3 g 2 of the Electron − We now have a wave equation which describes how an electron behaves in an electro- magnetic field, i.e. eq. (3.19). We will immediately put this to use by investigating the interaction between the spin of a non-relativistic electron and a magnetic field. Writing the electron field in the form of eq. (2.29), we see that eq. (3.19) gives

χ m ~σ ( i~ eA~) χ = · − ∇ − (3.24) φ ! ~σ ( i~ eA~) m ! φ ! · − ∇ − − Substituting the equation from the second row into the that from the first leads to,

2 ~σ ( i~ eA~) E m + · − ∇ − χ = 0. (3.25)  − h E + m i      We can simplify this somewhat by using to relation

σiσj = δij + iijkσk, (3.26) to show 2 ~σ i~ eA~ = i~ eA~ 2 e ~ A~ + A~ ~ ~σ, (3.27) · − ∇ − | − ∇ − | − ∇ × × ∇ · and note h  i   ~ A~ ψ + A~ ~ ψ = ~ A~ ψ = B~ ψ. (3.28) ∇ × × ∇ ∇ × Putting all this together we find,  

p~ eA~ 2 eB~ ~σ E m + | − | − · χ = 0. (3.29)  − E + m    In the non-relativistic limit we can write E m and observe that the lower 2-component ≈ spinor is ~σ (p~ eA~) φ · − χ χ. (3.30) ≈ 2m  This allows us to write, for the 4-component spinor ψ,

1 eB~ Σ~ p~ eA~ 2ψ · ψ = 0. (3.31) 2m| − | − 2m

Notice that we have a coupling between the magnetic field B~ and the spin of the ~ 1 ~ electron S = 2 Σ. This is known as a magnetic moment interaction and takes the form

µ~ B~ . (3.32) − · Our Dirac equation in an electromagnetic field has predicted e µ~ = Σ~ . (3.33) −2m In classical physics the magnetic moment of an orbiting charge is written e µ~ = L.~ (3.34) orb −2mc This is the magnetic moment associated with orbital angular momentum. By analogy we define the magnetic moment due to intrinsic angular momentum (i.e. spin) as e g e µ~ = g S~ = Σ~ (3.35) spin − 2m −2 2m where g is the gyromagnetic ratio of the particle. The Dirac equation predicts

g = 2. (3.36)

Experimentally one finds for the electron that

g = 2.0023193043738 0.0000000000082, (3.37)  so the Dirac equations prediction is pretty close. It is not exactly correct, as we can see from the incredible precision with which this quantity has been measured. The discrep- ancy is due to us not yet including quantum corrections to our prediction. The interaction of an electron with a photon (and thus the gyromagnetic ratio) will be changed by pro- cesses of the form seen in fig. 3, and processes involving yet more particle loops. When

Figure 3: Quantum corrections to the electron-photon interaction. one performs a more careful analysis, including these quantum effects, one predicts the deviation from 2 to be g 2 α α 2 α 3 α 4 − = 1 + 0.328 + 1.181 1.510 + . . . + 4.393 10−12, (3.38) 2 2π −  π   π  −  π  × and comparing this prediction with experiment: g 2 Theory : − = 1159652140(28) 10−12, 2 × (3.39) g 2 Experiment : − = 1159652186.9(4.1) 10−12. 2 × The figure in brackets denotes the error on the last significant figure. We can see that the experimental measurement matches the theoretical prediction to 8 significant figures, making this prediction of QED the most precisely tested prediction in physics. 3.4 Interactions in Perturbation Theory The principle technique for computations of particle is perturbation theory - in other words we assume that the coupling e 1 and expand about e = 0. We will be interested in processes such as 

a c

V

b d

Outside the shaded interaction region we assume the particles are free. We will use the plane wave particle solutions derived in section 2.3 which, as noted in section 2.4, can only be normalized in a box of volume V. The shaded region is a sketch of this box - if we take a very large box then we expect it’s presence to vanish from the answer for the which is dominated when the particles are close and at the centre of the box. This will indeed be the case for our final cross-section results but we will need to keep track of factors of V for a while to see that result. I find it intuitive to have just one of each of the incoming and outgoing states in the box and to calculate the probability of that scatter occuring - I therefore pick the normalization N = 1/√2EV from section 2.4. None of this analysis is Lorentz invariant but as the volume will factor out of our final results we will finally recover the Lorentz invariant forms for cross-sections. To begin let’s write the Dirac equation in a way that displays the smallness of the interaction ∂ψ iγ0 + iγi∂ ψ mψ + γ0 δV ψ = 0 (3.40) ∂t i − so for the electromagnetic interaction δV = eγ0γµA (3.41) − µ Note that (γ0)2 = 1 so the γ0 have been included simply for notational convenience. We will assume that the scattering particles begin in a pure p~ state but the interaction then scatters them to another p~ state with some (small) probability. In general we can write

iEnt ψ = κnφn(x)e (3.42) n X The φn(x) are the free Dirac equation solutions with n labelling the spinor state and the p~ state. The κn are the probability amplitudes for the given state n. Before the interaction all the κ will be zero except one but during the interaction ( T/2 < t < T/2) we allow n − κn to change - κn(t). If we now substitute the solution into the perturbed Dirac equation above then, at leading order, we obtain zero since we have expanded in solutions of the unperturbed equation. At next order we find

dκn −iEnt iEnt iγ0 φne = γ0 δV κnφn(x)e (3.43) n dt ! n X X Now we will make use of the orthogonality of the φn to extract the final state κn. We 3 † multiply through by d x φf γ0 R dκf 3 † −i(En−Ef )t = i κn d x φf δV φn e (3.44) dt − n X Z For a discussion of normalization of the spinors see section 2.4. Remembering that at t = T/2 κ = 1 and κ = 0 at leading order we have − i i6=n

dκf † 3 = i ψf δV ψi d x (3.45) dt − Z and integrating with respect to t we find the important result

† 4 κf (T/2) = i ψf δV ψi d x (3.46) − Z

Now lets use our explicit form for δV in QED and concentrate on the scattering of a particle a c by a photon Aµ → a c

κ = i ψ¯c( eγ Aµ)ψa d4x ca − − µ R (3.47) = i J caAµ d4x − µ where R J ca = e ψ¯cγ ψa = e N N u¯cγ ua ei(pc−pa).x (3.48) µ − µ − a c µ The Ns here are the normalizations of the spatial wave functions ψ again from section 2.5. We’re really interested in two particles scattering off each other so we’d better com- pute the Aµ field produced when another particle scatters from state b d →

b d

2Aµ = J µ = e N N u¯ γ u ei(pd−pb).x (3.49) db − b d d µ b the solution is 1 Aµ = J µ , q = p p (3.50) −q2 db d − b So finally substituting this back into our expression for κca we find

c a 1 d µ b i(pb+pd−pa−pc).x 4 κfi = i NaNbNcNd u¯ ( eγµ)u 2 u¯ ( eγ )u e d x (3.51) − − −q ! − Z Note that the integral is just a delta function that ensures 4-momentum conservation in the interaction. In order to make this result more memorable Feynman developed his famous rules that associate different parts of the expression with elements of a diagram of the scattering. − u a u c

µ i e γ

_ i gµν q 2

ν i e γ

_ u b u d

where momentum is conserved at the vertices. Multipling these rules out gives us i fi where − M

κ = i N N N N (2π)4δ4(p p ) (3.52) fi − a b c d f − i Mfi . Exercise 3.18 Derive the Feynman rules for the scattering of two particles described by the Klein Gordon equation to leading order in e. You may Assume the form of the result in (3.46).

3.5 Internal Fermions and External Photons We concentrated above on a scattering with external fermions interacting by the exchange of a photon. We can also imagine processes where there are external photon fields or internal virtual fermions. What are the Feynman rules for these cases? Given time constraints, rather than derive them, I’ll present some simple arguments to motivate the rules. If we have an external photon interacting with a fermion in some way, then the vertex rule is still ieγµ. Since the amplitude we wish to calculate is Lorentz invariant we can not allow a stra−y µ index to survive but must soak it up with a 4-vector. The obvious 4-vector associated with external photon is its polarization vector µ and indeed this is the appropriate factor for an external photon. Compare this to the way an external fermion closes the gamma matrix space indicies, to give a number, with the external spinor. We have seen that an internal photon (satisfying 2Aµ = 0) generates a Feynman rule (or ) ig 2Aµ = 0 − µν (3.53) → p2 Since a photon is just a collection of four scalar fields we can deduce that a massless, scalar field (which satisfies the KG equation 2φ = 0) will have a Feynman rule i 2φ = 0 (3.54) → p2 It turns out that the sign is that of a space-like photon degree of freedom. To find the propagator of a massive scalar field we can treat the mass as a perturbing interaction of the free particle. Writing the KG equation as

2φ = δV φ = m2φ (3.55) − − will generate a Feynman rule for the scalar self interaction −im

Now we can consider the set of perturbation theory diagrams that contribute to the full scalar propagator

+ + + i i −im i i −im i −im i p2 + p2( )p2 + p2( )p2( )p2 +

i i i i i i i + ( im) + ( im) ( im) + ... (3.56) p2 → p2 p2 − p2 p2 − p2 − p2 Pleasingly we can resum this series

i m2 m4 i 1 i 2 1 + 2 + 4 + ... = 2 m2 = 2 2 (3.57) p p p ! p 1 2  p m − p −   and this is indeed the full propagator in the massive case. By this point we can see that the propagator is basically just i times the inverse of the free field equation operator in momentum space. A sensible −guess for the fermionic field is

i i p/ + m i(p/ + m) (i∂/ m)ψ = 0 = = (3.58) − → p/ m p/ m p/ + m p2 m2 − − − This is in fact the correct answer. You will receive more insight into these results from the Field Theory course. For every . . . draw . . . write . . .

µ ν igµν Internal photon line − p2 + i0+

α β i(p/ + m) Internal fermion line αβ p p2 m2 + i0+ − α β µ Vertex ieγαβ µ −

Outgoing electron u¯α(s, p)

Incoming electron uα(s, p)

Outgoing positron vα(s, p)

Incoming positron v¯α(s, p) Outgoing photon ε∗µ(λ, p) Incoming photon εµ(λ, p) Attach a directed momentum to every internal line • (4) Conserve momentum at every vertex, i.e. include δ ( pi) • Integrate over all internal momenta • P

Table 1: Feynman rules for QED. µ, ν are Lorentz indices, α, β are spinor indices and s and λ fix the polarization of the electron and photon respectively.

3.6 Summary of Feynman Rules of QED

The Feynman rules for computing the amplitude fi for an arbitrary process in QED are summarized in Table 1. M The spinor indices in the Feynman rules are such that matrix multiplication is per- formed in the opposite order to that defining the flow of fermion number. The arrow on the fermion line itself denotes the fermion number flow, not the direction of the momen- tum associated with the line: I will try always to indicate the momentum flow separately as in Table 1. This will become clear in the examples which follow. We have already met the Dirac spinors u and v. I will say more about the photon polarization vector ε when we need to use it. To summarize, the procedure for calculating the amplitude for any process in QED is the following: 1. Draw all possible distinct diagrams 2. Associate a directed 4-momentum with all lines 3. Apply the Feynman rules for the , vertices and external legs 4. Ensure 4-momentum conservation at each vertex by adding (2π)4δ4(k k ),where i − f ki and kf are the total incoming and outgoing 4-momenta of the vertex respectively 5. Perform the integration over all internal momenta with the measure d4k/(2π)4

It is also part of the Feynman rules for QED that when diagrams differ by anR interchange of two fermion lines, a relative minus sign must be included. This is a reflection of Pauli’s exclusion principle or equivalently of the anticommutation of the fermion operators dis- cussed in the appendix. Note, however, that you don’t need to get the absolute sign of an amplitude right, just its sign relative to the other amplitudes, since it is the modulus of the amplitude squared that we need ultimately. This sounds rather complicated. In particular there seem to be an awful lot of integrations to be done. However, at tree-level, i.e. if there are no loop diagrams, the delta functions attached to the vertices together with the integration over the internal momenta simply result in an overall 4-momentum 4 4 conservation, i.e. in a factor (2π) δ (Pi Pf ), where Pi and Pf are the total incoming and outgoing 4-momenta of the process.−Thus at tree-level, no ‘real’ integration has to be done. At one loop, however, there is one non trivial integration to be done. Gen- erally, the calculation of an n-loop diagram involves n non trivial integrations. Even worse, these integrals very often are divergent. Still, we can get perfectly reasonable theoretical predictions at any order in QED. The procedure to get these results is called and will be the topic of section 6. At this point, some remarks con- cerning step 1, i.e. drawing all possible distinct Feynman diagrams, might be useful. In order to establish whether two diagrams are distinct, we have to try to convert one into the other. If this is possible without cutting lines and without gluing lines – that is solely by twisting and stretching the lines and rotating the whole figure – then the two diagrams are identical. It should be noted that the external lines are labeled in this process. Therefore, the two diagrams shown in figure 6 are different. Finally, let me mention that the diagrams shown in figure 1 are not Feynman diagrams. When drawing Feynman diagrams we are only interested in what particles are incoming and which ones are outgoing and there is no time direction involved. 4 Cross Sections and Decay Rates

Before explicitly calculating some transition amlitudes lets see how to connect those amplitudes to physical observables such as cross sections and particle widths.

4.1 Transition Rate

Consider an arbitrary scattering process with an initial state i with total 4-momentum Pi and a final state f with total 4-momentum Pf . Let’s assume we computed the scattering amplitude for this process in QED, i.e. we know the matrix element

N i N N (2π)4δ4(P P ) (4.1) − f i Mfi f − i fY=1 Yin Our task in this section is to convert this into a scattering cross section (relevant if there is more than 1 particle in the initial state) or a decay rate (relevant if there is just 1 particle in the initial state), see figure 4.

(a) (b)

Figure 4: Scattering (a) and decay (b) processes.

The probability for the transition to occur is the square of the matrix element, i.e.

N Probability = i N N (2π)4δ4(P P ) 2. (4.2) | − f i Mfi f − i | fY=1 Yin Attempting to take the squared modulus of the amplitude produces a meaningless square of a delta function. This is a technical problem because our amplitude is expressed between plane wave states. These states are states of definite momentum and so extend throughout all of space-time. In a real experiment the incoming and outgoing states are localized (e.g. they might leave tracks in a detector). To deal with this properly we would have to construct normalized wave packet states which do become well separated in the far past and the far future. A sloppier derivation is to maintain that our interaction is occuring in a box of volume V = L3 and over a time of order T . The final answers will come out independent of V and T , reproducing the ones we would get if we worked with localized wave packets. Using

4 4 i(Pf −Pi)x 4 (2π) δ (Pf Pi) = e d x (4.3) − Z we get in our space-time box the result

4 4 2 4 4 i(Pf −Pi)x 4 4 4 (2π) δ (Pf Pi) (2π) δ (Pf Pi) e d x V T (2π) δ (Pf Pi). (4.4) | − | ' − Z ' − We must also use the explicit expressions for the wave function normalizations from section 2.4. Above we used the normalization N = 1/√2EV . So putting everything together, we find for the transition rate W , i.e. the probability per unit time

N 1 2 4 4 1 1 W = fi V T (2π) δ (Pf Pi) . (4.5) T |M | − 2Ef V 2EiV fY=1   Yin   As expected, the dependence on T cancelled. Usually we are interested in much more detailed information than just the total transition rate. We want to know the differential transition rate dW , i.e. the transition rate into a particular element of the final state phase space. To get dW we have to multiply by the number of available states in the (small) part of phase space under consideration. For a single particle final state, the number of available states dn in some momentum range ~k to ~k + d~k is, in the box normalization, dn = V d3~k (4.6) This result is proved by recalling that the allowed momenta in the box have components that can only take on discrete values such as kx = 2πnx/L where nx is an integer. Thus dn = dnxdnydnz and the result follows. For a two particle final state we have

dn = dn1dn2 where 3 3 dn1 = V d ~k1, dn2 = V d ~k2,

where dn is the number of final states in some momentum range ~k1 to ~k1 +d~k1 for particle 1 and ~k2 to ~k2 + d~k2 for particle 2. There is an obvious generalization to an N particle final state, N V d3~k dn = f . (4.7) (2π)3 fY=1 The transition rate for transitions into a particular element of final state phase space is thus given by, using equations (4.7) and (4.5), N N 3~ 2 4 4 1 1 V d kf dW = fi (2π) δ (Pf Pi)V 3 |M | − 2Ef V 2EiV (2π) fY=1   Yin   fY=1

2 1 = fi V LIPS(N) , (4.8) |M | 2EiV × Yin   where in the second step we defined the Lorentz invariant phase space with N particles in the final state N 3~ 4 4 d kf LIPS(N) (2π) δ (Pf Pi) 3 . (4.9) ≡ − (2π) 2Ef fY=1 Observe that everything in the transition rate is Lorentz invariant save for the initial energy factor and the factors of V . . Exercise 4.19 Show that d3k/2E is a Lorentz-invariant element of phase space. (Hint: Think how you would write the phase space in a 4-dimensional, integral but with the particle on-shell, i.e. E = (~k2 + m2)1/2). 4.2 Decay Rates We turn now to the special case where we have only one particle with mass m in the initial state i, i.e. we consider the decay of this particle into some final state f. In this case, the transition rate is called the partial decay rate and denoted by Γif . First of all, we observe that the dependence on V cancels, as advertised above. In the rest frame of the particle the partial decay rate is given by

1 2 Γif = fi LIPS (4.10) 2m Z |M | × The important special case of two particles in the final state deserves further considera- tion. Consider the partial decay rate for a particle i of mass m into two particles f1 and f2. The Lorentz-invariant phase space is

3 3 4 4 d p~1 d p~2 LIPS(N) = (2π) δ (pi p1 p2) 3 3 . (4.11) − − (2π) 2E1 (2π) 2E2 In the rest frame the four-vectors of each particle are

p = (m, 0), p = (E , p~), p = (E , p~). (4.12) i 1 1 2 2 − Therefore we can eliminate one three-momentum in the phase space

3 1 d p~2 LIPS(N) = 2 δ(m E1 E2) . (4.13) (2π) − − 4E1E2 Hence the partial decay rate becomes

2 ∗ 1 2 d p~f p~f dΩ Γif = 2 fi δ(m E1 E2) | | | | (4.14) 8m(2π) Z |M | − − E1E2 where dΩ∗ is the solid angle element for the angle of one of the outgoing particles with respect to some fixed direction, and p~f is the momentum of one of the final state particles. But from the on-shell condition E = (p~2 + m2)1/2, we have dE = p~ /E d p~ and 1 1 1 1 | f | 1 | f | similarly for particle 2 and so

E1 + E2 d(E1 + E2) = p~f d p~f , | | | | E1E2 therefore 2 1 p~f p~f d p~f = | | d(E1 + E2). (4.15) | | | |E1E2 E1 + E2

Using this in eq. (4.14) and integrating over (E1 + E2) we obtain the final result

1 2 ∗ Γi→f1f2 = 2 2 fi p~f dΩ . (4.16) 32π m Z |M | | | The total decay rate of particle i is obtained by summation of the partial decay rates into all possible final states Γtot = Γif (4.17) Xf −1 The total decay rate is related to the mean life time τ via (Γtot) = τ. For completeness I also give the definition of the branching ratio for the decay into a specific final state f

Γif Bf (4.18) ≡ Γtot

In an arbitrary frame we find, W = (m/E)Γtot, which has the expected Lorentz dilation factor. In the master formula (equation (4.8)) this is what the product of 1/2Ei factors for the initial particles does.

4.3 Cross Sections The total cross section for a static target and a beam of incoming particles is defined as the total transition rate for a single target particle and a unit beam flux. The differential cross section is similarly related to the differential transition rate. We have calculated the differential transition rate with a choice of normalization corresponding to a single ‘target’ particle in the box, and a ‘beam’ corresponding also to one particle in the box. A beam consisting of one particle per volume V with a velocity v has a flux N0 given by v N = 0 V particles per unit area per unit time. Thus the differential cross section is related to the differential transition rate in equation (4.8) by dW V dσ = = dW . (4.19) N0 × v Now let us generalize to the case where in the frame in which you make the measurements, the ‘beam’ has a velocity v1 but the ‘target’ particles are also moving with a velocity v2. In a colliding beam experiment, for example, v1 and v2 will point in opposite directions in the laboratory. In this case the definition of the cross section is retained as above, but now the beam flux of particles N0 is effectively increased by the fact that the target particles are moving towards it. The effective flux in the laboratory in this case is given by ~v ~v N = | 1 − 2| 0 V which is just the total number of particles per unit area which run past each other per unit time. I denote the velocities with arrows to remind you that they are vector velocities, which must be added using the vector law of velocity addition, not the relativistic law. In the general case, then, the differential cross section is given by dW 1 1 dσ = = 2 LIPS (4.20) N ~v ~v 4E E |Mfi| × 0 | 1 − 2| 1 2 where we have used equation (4.8) for the transition rate, and the box volume V has again canceled. The amplitude-squared and phase space factors are manifestly Lorentz invariant. What about the initial velocity and energy factors? Observe that

E E (~v ~v ) = E p~ E p~ . 1 2 1 − 2 2 1 − 1 2 In a frame where p~1 and p~2 are collinear, E p~ E p~ 2 = (p p )2 m2m2, | 2 1 − 1 2| 1 · 2 − 1 2 and the last expression is manifestly Lorentz invariant. . Exercise 4.20 2 2 2 2 Prove that E2p~1 E1p~2 = (p1 p2) m1m2 in a frame where the momenta are collinear. Hence we can| define− a Loren| tz in· varian− t differential cross section. The total cross section is obtained by integrating over the final state phase space:

1 1 2 σ = fi LIPS. (4.21) ~v1 ~v2 4E1E2 |M | × | − | finalXstates Z A slight word of caution is needed in deciding on the limits of integration to get the total cross section. If there are identical particles in the final state then the phase space should be integrated so as not to double count. An important special case is 2 2 → scattering a(p ) + b(p ) c(p ) + d(p ). a b → c d . Exercise 4.21 Show that in the centre-of-mass frame the differential cross section for the scattering a(p ) + b(p ) c(p ) + d(p ) is a b → c d dσ 1 p~ = | c| 2. (4.22) dΩ 64π2s p~ |Mfi| | a|

4.4 Mandelstam Variables Invariant 2 2 scattering amplitudes are frequently expressed in terms of the Mandel- stam variables→. These are defined by s (p + p )2 = (p + p )2, ≡ a b c d t (p p )2 = (p p )2, ≡ a − c b − d u (p p )2 = (p p )2. (4.23) ≡ a − d b − c In fact there are only two independent Lorentz invariant combinations of the available momenta in this case, so there must be some relation between s, t and u. . Exercise 4.22 Show that 2 2 2 2 s + t + u = ma + mb + mc + md. (4.24) . Exercise 4.23 Show that, for two body scattering of particles of equal mass m, s 4m2, t 0, u 0. ≥ ≤ ≤ (Hint: since all variables are invariant work in the centre of mass frame.) 5 Processes in QED and QCD

5.1 Electron–Muon Scattering This is as simple a process as one can find since at lowest order in the electromagnetic coupling, just one diagram contributes. It is shown in figure 5. The amplitude obtained by applying the Feynman rules to this diagram is

µ igµν ν i fi = ie u¯(pc)γ u(pa) − 2 ie u¯(pd)γ u(pb), (5.1) M  q  2 2 where q = (pa pc) . Note that, for clarity, I have dropped the spin label on the spinors. I will restore− it when I need to. In constructing this amplitude we have followed the fermion lines backwards with respect to fermion flow when working out the order of matrix multiplication (which makes sense if you think of an unbarred spinor as a column vector and a barred spinor as a row vector and remember that the amplitude carries no spinor indices).

pa pc e− e−

µ− µ− pb pd

Figure 5: Lowest order Feynman diagram for e−µ− e−µ− scattering. → The cross section involves the squared modulus of the amplitude, 2. Let us see |Mfi| how we obtain a neat form for this. The hermitian conjugate of a ‘spinor sandwich’ is the same as its hermitian conjugate,

µ ∗ µ † (u¯(pc)γ u(pa)) = (u¯(pc)γ u(pa))

since it is just a number. Using rules of matrix algebra we see that this is

† 0 µ † † µ† 0† (u(pc) γ γ u(pa)) = (u(pa) γ γ u(pc)) † µ† 0 = (u(pa) γ γ u(pc)). (5.2)

But in section 2.6 we saw that γ0㵆γ0 = γµ, and so this becomes

µ ∗ µ (u¯(pc)γ u(pa)) = u¯(pa)γ u(pc). (5.3)

. Exercise 5.24 5 If Γ represents a string of γ-matrices (not including γ ) and ΓR is its reverse (i.e. the same γ-matrices in reverse order), show that,

0 ∗ 0 [u¯(k )Γu(k)] = u¯(k)ΓRu(k ). Using this result in the expression for 2 we obtain |Mfi| e4 2 = u¯(p )γµu(p )u¯(p )γ u(p )u¯(p )γνu(p )u¯(p )γ u(p ) |Mfi| q4 c a d µ b a c b ν d e4 = Lµν L , (5.4) q4 (e) (µ) µν where the subscripts e and µ refer to the electron and muon respectively and µν µ ν L(e) = u¯(pc)γ u(pa)u¯(pa)γ u(pc), µν with a similar expression for L(µ). Usually we have an unpolarized beam and target and do not measure the polarization of the outgoing particles. Thus we calculate the squared amplitudes for each possible spin combination, then average over initial spin states and sum over final spin states. Note that we square and then sum since the different spin configurations are in principle distinguishable. In contrast, if several Feynman diagrams contribute to the same process, you have to sum the amplitudes first. We will see examples of this below. The spin sums are made easy by the results u(s)(p) u¯(s)(p) = p/ + m, s X v(s)(p) v¯(s)(p) = p/ m. (5.5) s − X Do not forget that by m, we really mean m times the unit 4 4 matrix. × . Exercise 5.25 Prove eq. (5.5). Using the spin sums we find that

µν (sc) µ (sa) (sa) ν (sc) L(e) = u¯α (pc)γαβuβ (pa)u¯ρ (pa)γρσuσ (pc) s ,s spinsX Xa c µ ν = γαβ[p/a + me]βργρσ[p/c + me]σα µ ν = Tr (γ (p/a + me)γ (p/c + me)) . (5.6) where in the first line, we have make explicit the spinor indices in order to show how the trace emerges. Since all calculations of cross sections or decay rates in QED require the evaluation of traces of products of γ-matrices, you will generally find a table of “trace theorems” in any quantum field theory textbook [1]. All these theorems can be derived from the fundamental anti-commutation relations of the γ-matrices in eq. (2.58) together with the invariance of the trace under a cyclic change of its arguments. For now it suffices to use Tr(γµ1 . . . γµn ) = 0 for n odd Tr(γµ1 . . . γµn ) = gµ1µ2 Tr(γµ3 . . . γµn ) gµ1µ3 Tr(γµ2 γµ4 . . . γµn ) + − · · · + gµ1µn Tr(γµ2 . . . γµn−1 ) Tr(a/b/) = 4 a b, · Tr(a/b/c/d/) = 4(a b c d a c b d + a d b c). (5.7) · · − · · · · . Exercise 5.26 Derive the trace results in equation (5.7). (Hint: for the first one use (γ5)2 = 1.) Using these trace theorems,

Lµν = 4(pµpν gµνp p + pνpµ) + 4gµνm2, (5.8) (e) a c − a · c a c e spinsX µν with a similar result for L(µ). Putting this altogether, the spin averaged/summed ampli- tude squared is

1 2 4 |Mfi| spinsX e4 = 4 pµpν + pνpµ p p m2 gµν p p + p p p p m2 g q4 a c a c − a · c − e b µ d ν b ν d µ − b · d − µ µν         e4 = 8 (p p )(p p ) + (p p )(p p ) m2(p p ) m2 (p p ) + 2m2m2 . q4 c · d a · b c · b a · d − e b · d − µ a · c e µ   (5.9)

(Notice that we have divided by 4 since we are averaging over initial states, and there are 4 possible initial spin configurations.) This takes on a more compact form if expressed in terms of the Mandelstam variables of eq. (4.23),

1 2e4 2 = (s2 + u2 4(m2 + m2 )(s + u) + 6(m2 + m2 )2). (5.10) 4 |Mfi| t2 − e µ e µ spinsX

Finally, we can derive the differential cross section for this process in the centre-of- mass frame using eq. (4.22). In the high energy limit where s, u m2, m2 , i.e. setting | |  e µ the masses to zero, dσ e4 s2 + u2 = . (5.11) dΩ 32π2s t2 Other calculations of cross sections or decay rates will follow the same steps we have used above. Draw the diagrams, write down the amplitude, square it and evaluate the traces (if you are using spin sum/averages). There are one or two more complications to be aware of, which we will illustrate below.

5.2 Electron–Electron Scattering For the scattering e−e− e−e− we now have identical particles in the final state which → may only be distinguished by their momenta. Therefore we cannot just replace mµ by me in the calculation we performed above. Labeling the momenta in the process according − − − − − − to e (pa)+e (pb) e (pc)+e (pd) in analogy to e µ scattering, we realize that when particle a emits a →photon we do not know whether it ‘becomes’ particle c (as it did in the e−µ− scattering) or ‘becomes’ particle d. Since either is possible, we need to include both cases, resulting in the two diagrams of fig. 6. Applying the Feynman rules, the two pa pc pa pc e− e− e− e−

e− e− e− e− pb pd pb pd

Figure 6: Lowest order Feynman diagrams for electron–electron scattering.

diagrams give the amplitudes, ie2 i = u¯(p )γµu(p )u¯(p )γ u(p ), (5.12) M1 t c a d µ b ie2 i = u¯(p )γµu(p )u¯(p )γ u(p ). (5.13) M2 − u d a c µ b Notice the additional minus sign in the second amplitude, which comes from the anti- commuting nature of fermion fields. Remember that when diagrams differ by an inter- change of two fermion lines, a relative minus sign must be included. This is important because 2 = + 2 |Mfi| |M1 M2| = 2 + 2 + 2Re ∗ , (5.14) |M1| |M2| M1M2 so the interference term will have the wrong sign if you don’t include the extra sign 2 2 difference between the two diagrams. 1 and 2 are very similar to the previous calculation. The interference term is a|Mlittle| more|Mcomplicated| due to a different trace structure. Performing the calculation explicitly yields (in the limit of negligible fermion masses),

2 2 2 2 2 1 2 4 s + u s + t 2s fi = 2e + + . (5.15) 4 |M | t2 u2 tu ! spinsX . Exercise 5.27 Prove the result in eq. (5.15). It will be helpful first to prove γαγµγ = 2γµ α − α µ ν µν γ γ γ γα = 4g (5.16) γαγµγν γργ = 2γργνγµ. α − 5.3 Electron–Positron The two diagrams e+e− scattering are shown in fig. 7, with the one on the right known as the annihilation diagram. They are just what you get from the diagrams for electron– electron scattering in fig. 6 if you twist round the fermion lines. The fact that the diagrams are related in this way implies a relation between the amplitudes. The inter- change of incoming particles/antiparticles with outgoing antiparticles/particles is called . For our particular example, the squared amplitude for e+e− e+e− is related to that for e−e− e−e− by performing the interchange s u. Hence,→ squaring the amplitude and doing→ the traces yields (again neglecting fermion↔ mass terms)

2 2 2 2 2 1 2 4 s + u u + t 2u fi = 2e + + . (5.17) 4 |M | t2 s2 ts ! spinsX

e+ e+ e+ e+

e− e− e− e−

Figure 7: Lowest order Feynman diagrams for electron-positron scattering in QED.

If electrons and positrons collide and produce muon–antimuon or –antiquark pairs, then the annihilation diagram is the only one that contributes. At sufficiently high energies that the quark masses can be neglected, this immediately gives the lowest order QED prediction for the ratio of the annihilation cross section into hadrons to that into µ+µ−: σ(e+e− hadrons) R → = 3 Q2 , (5.18) ≡ σ(e+e− µ+µ−) f → Xf where the sum is over quark flavours f and Qf is the quark’s charge in units of e. The 3 comes from the existence of three colours for each flavour of quark. Historically this was important: you could look for a step in the value of R as your e+e− collider’s CM energy rose through a threshold for producing a new quark flavour. If you did not know about colour, the height of the step would seem too large. At the energies used at LEP you have to remember to include the diagram with a Z replacing the photon. Finally, we compute the total cross section for e+e− µ+µ−, neglecting the masses. Here we only have the annihilation diagram, and→for the amplitude, we get ig = ( ie)2u¯(p )γµv(p )− µν v¯(p )γνu(p ) Mfi − d c s a b ie2 = u¯ γµv v¯ γ u . (5.19) s d c a µ b Summing over final state spins and averaging over initial spins gives,

1 e4 2 = Tr(γµp/ γνp/ )Tr(γ p/ γ p/ ), 4 |Mfi| 4s2 c d µ b ν a spinsX where we have neglected me and mµ. Using the results in equation (5.7) to evaluate the traces gives, 1 8e4 2 = (p p p p + p p p p ). 4 |Mfi| s2 a · d b · c a · c b · d spinsX Neglecting masses we have,

p p = p p = t/2, (5.20) a · c b · d − p p = p p = u/2. (5.21) a · d b · c − Hence 1 t2 + u2 2 = 2e4 , (5.22) 4 |Mfi| s2 spinsX which incidentally is what you get by applying crossing to the electron–muon amplitude of section 5.1. We can use this in eq. (4.22) to find the differential cross section in the CM frame, dσ e4 t2 + u2 = . (5.23) dΩ 32π2s s2 You could get straight to this point by noting that the appearance of v spinors instead of u spinors in fi does not change the answer since only quadratic terms in mµ survive the Dirac algebraM and we go on to neglect masses anyway. Hence you can use the result of eq. (5.11) with appropriate changes. Neglecting masses, the CM momenta are 1 p = √s(1, ~e) p = 1 √s(1, ~e 0) (5.24) a 2 c 2 1 p = √s(1, ~e) p = 1 √s(1, ~e 0) (5.25) b 2 d 2 which gives t = s(1 cos θ)/2 and u = s(1 + cos θ)/2, where cos θ = ~e ~e 0. Hence, − − − · finally, the total cross section is,

1 dσ 4πα2 σ = 2πd(cos θ) = . (5.26) Z−1 dΩ 3s 5.4 The diagrams which need to be evaluated to compute the Compton cross section for γe γe are shown in fig. 8. For unpolarized initial and/or final states, the cross section calculation→ involves terms of the form

ε∗ µ(λ, p) εν(λ, p), (5.27) Xλ where λ represents the polarization of the photon of momentum p. Since the photon is massless, the sum is over the two transverse polarization states, and must vanish when contracted with pµ or pν. In principle eq. (5.27) is a complicated object. However, there is a simplification as far as the amplitude calculation is concerned. The photon is coupled to the electromagnetic current J µ = ψ¯γµψ of eq. (2.28). This is a conserved current, i.e. µ µ ∂µJ = 0, and in momentum space pµJ = 0. Hence, any term in the polarization sum, eq. (5.27), proportional to pµ or pν does not contribute to the cross section. This means that in calculations one can make the replacement ε∗ µ(λ, p) εν(λ, p) gµν , (5.28) → − Xλ and we have a simple, Lorentz-covariant prescription.

γ γ γ γ

e− e− e− e−

Figure 8: Lowest order Feynman diagrams for Compton scattering.

. Exercise 5.28 Show that the spin summed/averaged squared matrix element for Compton scattering in the massless limit is given by

2 4 u s fi = 2e (5.29) |M | − s − u Evaluate the total cross section using the expressions in the centre-of-mass frame at the end of the last sub-section. Why does this create a problem?

5.5 QCD Processes The theory of and , QCD, is in many ways very similar to QED. We have done most of the hard work to calculate tree level amplitudes already. The main difference between the theories is that QCD has three types of charges (called ‘colours’, e.g. red, green and blue). We can write a quark as a vector with the three colour states shown uR G u =  u  (5.30) uB   There are more possible interactions than in QED which are mediated by eight photon- like gauge fields called “gluons”. We encode the couplings of the gluons to the quarks by matrices which act on the above colour vector. For example there are two gluons with matrix “generators” 1 0 0 1 0 0 1 1 T 1 = 0 1 0 , T 2 = 0 1 0 . (5.31) √  −  2   12 0 0 0 0 0 2    −  These are just photon-likeinteractionswith each of the two photons having different couplings to the different colours. . Exercise 5.29 Check that the strength of a colour anti-colour quark pair scattering to itself at tree level is the same no matter which colour you pick. Show that the strength of a scattering of a colour anti-colour quark pair to a different colour pair is also the same no matter what colours you pick. The remaining six gluons change the colour of the quark and are associated with generators of the form 0 1 0 0 i 0 1 1 − T 3 = 1 0 0 , T 4 = i 0 0 (5.32) 2   2   0 0 0 0 0 0         The remaining four generators are of the same form but interchange the other two colour a b 1 ab combinations. Note these matrices are traceless and normalized so that T rT T = 2 δ . You will learn more about the origin of these fields and their couplings in the Standard Model course. From the point of view of calculating cross sections though the Feynman Rules are all we need to proceed, and these are very similar to those of QED. The generator T a is included in the Feynman rule for the –quark–anti-quark vertex as shown in fig. 9 (upper), where g is the QCD . Also, since a gluon

a, µ

−igT aγµ

j i

a, α

p − abc − γ αβ − α βγ − β γα gf (p q) g + (q r) g + (r p) g 

r q

c, γ b, β

Figure 9: An example of some QCD Feynamn Rules.

associated with, for example, T 3 can pair produce a red quark and an anti-green quark we see that the gluons themselves are charged. Therefore gluons can interact with other gluons, and there are multi-gluon vertices that do not occur in QED where the photon is chargeless. The Feynman rule for these vertices are given in fig. 9 (lower), where f abc, a, b, c = 1, ..., 8 are the QCD structure constants defined by [T a, T b] = f abcT c, (5.33) The QCD Feynman rules will be discussed at greater length in the Standard Model course. 6 Introduction to Renormalization

6.1 Ultraviolet (UV) Singularities So far, everything was computed at tree-level, that is, at the lowest nontrivial order in perturbation theory. Very often, a more precise determination of a cross section is desirable and we are thus led to consider loop diagrams. In order to illustrate this, consider the example e+e− µ+µ−. The perturbative expansion of the corresponding amplitude is written as →

= α + α2 + α3 + (α4), (6.1) M M0 M1 M2 O 2 where α = e 1/137. When we computed the corresponding amplitude in section 5.3 4π ≈ we only computed the leading order term

α = e2 α (6.2) M0 ∝ ∝ Using this expression for the amplitude, we will get the leading-order cross section σ α2 2. If we want to compute corrections of order α3 to this result, we will 0 ∝ |M0| have to compute the amplitude to an accuracy of order α2.

= + + + + . . . (6.3) M In fact this set of diagrams is one place where the distinction between relativistic quantum mechanics and true field theory raises its head. The diagram with an internal quark loop is naturally generated in quantum field theory but not in a perturbative expansion in quantum mechanics. In principle, a quark must also be included in this loop, but in QM you have to treat the quark as an external particle that is put there by hand. While the Feynman rules we derived are correct, you will see a much more rigorous derivation of the (scalar theory) Feynman rules in your QFT course. The one-loop correction to the cross section is related to the interference term of 0 and , M M1 σ α + α2 + (α3) 2 = α2 2 + 2α3Re( ∗) + (α4). (6.4) 1 ∝ | M0 M1 O | |M0| M0 M1 O The whole procedure looks pretty straightforward. However, if we try to compute a loop diagram, we run into trouble. Consider as an example the vertex correction , depicted in fig. 10. Using the Feyn- man rules listed in section 3.6 we end up with anVexpression of the form d4k k k (6.5) V ∝ (2π)4 k2((p + k)2 m2)((p k)2 m2) Z b − a − − where we did not bother to write down the full algebraic expression resulting from the spinor and Lorentz algebra but only the terms involving k. The two factors of k in the numerator stem from the two fermion propagators. The important point is that this integral diverges. Indeed, considering the limit k we can neglect pa, pb and m and find → ∞ d4k 1 dk 1 4 4 = (6.6) V ∼ Z (2π) k ∼ Z (2π) k ∞ pb

pb + k

pa + pb k

p k a − pa

Figure 10: Vertex correction for e+e− µ+µ− scattering. → where we used d4k k3dk. These singularities are called ultraviolet (UV) singularities because they come from∼ the region k . → ∞ Similar problems are encountered if we try to compute the other one-loop diagrams and our final answer for the cross section at next-to-leading order seems to be infinity.

6.2 Infrared (IR) Singularities There is another class of singularities that shows up in QED and QCD. As we saw in section 6.1 that UV singularities are related to the region of large k. However, there is also a potential danger of singularities from the region k 0 or more generally, from ∼ zeros in the denominators of the integrand. These singularities are called infrared (IR) singularities. These occur if some (massless) particle becomes very soft or two become very collinear. These singularities have nothing to do with the UV singularities. The solution to the problem is completely different in the two cases. In fact, you already should have encountered an IR singularity. When you tried to compute the total cross section for Compton scattering in section 5.4 you should have found that the total cross section diverges. This is due to an IR singularity. Indeed, the final state photon can become arbitrarily soft, in which case the electron-photon pair becomes indistinguishable from a single electron. One possibility to get a well defined finite answer is to require that the final state photon has some minimal energy but the general solution will be discussed in the phenomenology course. I will not discuss the IR singularities any further and will simply ignore them, safe in the knowledge that they can be dealt with in a manner totally different to that for the UV singularities. Thus in what follows I will call a cross section finite if it has no UV singularities, but it might well have IR singularities. Strictly speaking, we should replace every ‘finite’ below by ‘UV-finite’.

6.3 Renormalization It is important to realize that renormalization is not really about the removal of diver- gences, but simply an expression of the fact that in quantum field theories the value of certain parameters, e.g. the coupling constants, change with the energy scale used in a process. The infinities we encounter are then just a consequence of our ignorance of what is happening as E although we integrate up to this limit in any loop diagrams. → ∞ We will demonstrate this below, and show how results do turn out to be finite after all. To to obtain a prediction for any measurable quantity , say a cross section, we started with wave equations from which we deduced the FeynmanS rules, which in turn were used to compute . The wave equations of QED, eqs. (3.21) and (3.23), have some parameters. So far, weSdenoted them by e, m and referred to them as mass and charge of the electron. Therefore, our result will depend on these parameters. However, the S parameter m in the Lagrangian is not the real mass of the electron, nor is e its charge. The identification of the parameter in the Lagrangian and the measurable quantity is only justified at tree level, because beyond this level the parameters themselves receive corrections, i.e. the propagator and vertex diagram which define the mass and coupling strength are themselves corrected. Therefore, from now on we will be more precise and denote the parameters in by m0 and e0 and call them the bare mass and bare charge respectively. Note that theL bare parameters are not measurable. The (measurable) physical mass and charge of the electron will be denoted (as always) by m and e. also L depends on the fields, which we denoted so far by ψ and A. From now on, we denote them by ψ0 and A0 and call them the bare fields. We are now ready to reformulate the problem we encountered in section 6.1. If we try to compute a measurable quantity in terms of the unmeasurable bare quantities as a perturbative expansion in the coupling constant we generally encounter divergences. That is, if we compute

(e , m , ψ , A ) = (e , m , ψ , A ) + e2 (e , m , ψ , A ) + (e4) (6.7) S 0 0 0 0 S0 0 0 0 0 0 S1 0 0 0 0 O 0 then we may find that 1(e0, m0, ψ0, A0) = . In particular, this is true for two special physical quantities, namelyS the mass and the∞charge of the electron,

m = m + e2 m (e , m , ψ , A ) + (e4) 0 0 1 0 0 0 0 O 0 e = e + e2 e (e , m , ψ , A ) + (e4). (6.8) 0 0 1 0 0 0 0 O 0 But this is an expression for two measurable quantities in terms of unknown parame- ters. If the unknowns m0 and e0 are finite then we would get divergences in m1 and e1 and hence in m and e. Since m and e are finite quantities we conclude that the bare quantities are infinite. This is the root of the problem. UV divergences in our perturba- tive calculations show up if we try to express our results in terms of the unmeasurable, unphysical bare parameters, i.e. the parameters of the original Lagrangian. In order to save the situation, we have to find new parameters such that the result of any physical quantity expressed in these new parameters — at any order in perturbation theory — is finite. Is this possible? Generally, the answer is no. However, for some special theories (and luckily QED is one of them) it is possible. Such theories are called renormalizable theories. The new parameters are called the renormalized quantities and are denoted by eR, mR and ψR, AR. They are related to the bare quantities as follows:

1/2 ψ0 = Z2 ψR 1/2 A0 = Z3 AR 1/2 m0 = Zm mR

−1 −1/2 e0 = Z1Z2 Z3 eR (6.9) This is simply a definition of the renormalization factors Z1, Z2, Z3 and Zm. Since the renormalization factors relate finite and divergent quantities, they have to be divergent themselves. More precisely, they can be written as a perturbative series with divergent coefficients. To summarize, if we express the perturbative series for our physical quantity in terms of the renormalized quantities

(e , m , ψ , A ) = (e , m , ψ , A ) + e2 (e , m , ψ , A ) + (e4 ) (6.10) S R R R R S0 R R R R R S1 R R R R O R there will be no UV-divergences at any order in perturbation theory. Some people refer to this as ‘hiding the infinities’. What is meant by this statement is that if we have a small number of input values (mR, eR . . .) and express all results in terms of these input values we get finite answers for all measurable quantities. Thus, renormalizing QED enables us to relate any measurable quantity to a small number of measurable input values. It is a highly non-trivial exercise to show that QED is indeed a renormalizable theory. But once we know that we can find a set of renormalized parameters eR, mR, ψR, AR such that eq. (6.10) has finite coefficients at each order, it is clear that we can find as many 0 0 0 0 0 other sets as we like. Indeed, if we chose eR, mR, ψR, AR such that mR and mR (and all other parameters) are related by a finite series, then

0(e0 , m0 , ψ0 , A0 ) = 0 (e0 , m0 , ψ0 , A0 ) + (e0 )2 0 (e0 , m0 , ψ0 , A0 ) + ((e0 )4) (6.11) S R R R R S0 R R R R R S1 R R R R O R is also finite at each order in perturbation theory. In other words, the divergent pieces of the renormalization factors in eq. (6.9) are uniquely determined by requiring that the divergences cancel. However, we are completely free to fix the finite pieces to whatever we want. Choosing a particular set of renormalized quantities, that is, giving some pre- scription on how to fix the finite pieces of the renormalization factors, is called choosing the renormalization scheme. It is possible in QED that mR = m and eR = e, i.e. the renormalized coupling is determined by real electron photon scattering. The renormal- ization scheme that satisfies these constraints is called the on-shell scheme. Alternatively, the renormalized coupling may be determined by scattering with, for example, a virtual photon. In this case the value of eR will depend on the scale of the scattering, i.e. the coupling will “run” with the renormalization scale. To be precise let me also mention that one more constraint is needed to fix the scheme completely. Naively you would expect that four constraints are needed, since we have four renormalization factors to fix. However, two of them are related, Z1 = Z2. This identity is due to gauge invariance and is called the Ward identity. As a result, we only need three constraints to fix the renormalization scheme completely. . Exercise 6.30 Why is it not possible in QCD to use the on-shell scheme? Of course, the result of our calculation has to be independent of the renormalization scheme. This remark is not quite as innocuous as it looks. In fact, it is only true up to 2 the order to which we decided to compute. If we decide to include the (eR) but not the higher order terms in our calculation, we have O

(e , m , ψ , A ) 0(e0 , m0 , ψ0 , A0 ) = (e4 ) (6.12) S R R R R − S R R R R O R The numerical result for our prediction will depend on the renormalization scheme! Even though the difference is formally of higher order it still can be numerically significant, in particular in QCD. It’s worth stressing again that this ability to hide UV divergences in the couplings is not as conspiratorial as it at first seems. In the IR a theory involves long wavelength modes that are insensitive to UV physics - indeed they (like us!) don’t even know what the full UV theory of nature is. The incomplete IR theory will break down (generate infinities) if extended into the UV but since we know (presumably!) that the IR theory is part of a consistent UV theory there must be a way to hide the infinities. This is fundamentally why renormalization works.

6.4 What we have learned so far is that we have to express the result of our calculation in terms of renormalized quantities rather than the bare ones. But since the starting point of any calculation is the Lagrangian, the first step in any calculation is to get the results in terms of bare quantities. Only then, we replace the bare quantities by the renormalized quantities, using eq. (6.9) and get a finite result. In intermediate steps we will have to deal with divergent expressions. In order to give a mathematical meaning to these intermediate expressions, we will have to regularize the integrals. That is, we have to change them in a systematic way, such that they become finite. By doing so, we change the value of the integrals. However, at the end of our calculation, we are able to undo this change. Since the final result is finite, this step will not introduce a singularity. There are — at least in principle — many different possibilities for regularizing the integrals. To illustrate the idea of regularization I will discuss first the method of intro- ducing a cutoff, even though in practice this method is not really used. Consider again the vertex correction in eq. (6.5). As we saw, we got the UV singularity from the region k . To regularize this expression, we introduce a cutoff Λ → ∞ Λ d4k k k (6.13) V → Vreg ∼ (2π)4 k2((p + k)2 m2)((p k)2 m2) Z b − a − − Of course, by doing so we changed the value of the integral. At the end of our calculation we will have to let Λ . Introducing this cutoff, however, gives us the possibility to deal with such intermediate→ ∞ expressions. Let me illustrate the interplay between renormalization and regularization with an oversimplified example. Assume that with the cutoff regularization we get as a result of our calculation of some physical quantity, say a cross section

4 6 Λ 8 = e0A + e0 B ln + FS + (e0) (6.14) S  m  O where A, B and FS are some finite terms. The originally divergent expression for has been rendered finite by regularization. At this point we cannot let Λ sinceS → ∞ we would get . However, we learned that we have to express our results in S → ∞ terms of eR and not e0 (For simplicity, I ignore the mass renormalization). This step is renormalization (not regularization). Computing the relation between e0 and eR, using the same regularization, we would find

3 Λ 5 eR = e0 e0 C ln + Fe + (e0) (6.15) −  m  O and reversing this 3 Λ 5 e0 = eR + eR C ln + Fe + (eR) (6.16)  m  O where C and Fe are also finite. Plugging in eq. (6.16) into eq. (6.14) we get

4 6 Λ 8 = eRA + eR (B + 4AC) ln + FS + 4AFe + (eR) (6.17) S  m  O and we would find (B + 4AC) = 0. Since QED is a renormalizable theory this ‘miracle’ would happen for any measurable quantity. Finally, in the expression

= e4 A + e6 (F + 4AF ) + (e8 ) (6.18) S R R S e O R we can let Λ and ‘undo’ the regularization. To summarize,→ ∞ regularization enables us to work with divergent intermediate expres- sions. In the example above, instead of writing we write log Λ and have in mind Λ . Renormalization, on the other hand remo∞ves the (would be) singularities, i.e. it remo→ ∞ves the log Λ terms. Therefore, after renormalization we can (and have to) undo the regularization. Note that we could have defined a different renormalized coupling

3 Λ 5 e˜R = e0 e0 C ln + Ge + (e0) (6.19) −  m  O and this would have lead to

= e˜4 A + e˜6 (F + 4AG ) + (e˜8 ) (6.20) S R R S e O R and we would have a different expression in terms of a different coupling - both equally 8 valid, and identical up to the (e˜R) corrections. As mentioned above, the methoO d of introducing a cutoff for regularization is hardly ever used in actual calculations. The by far most popular method is to use dimensional regularization. The basic idea is to do the calculation not in 4 space-time dimensions but rather in D dimensions. Why does this help? Consider once more our initial example of the vertex correction in eq. (6.5), which has an UV singularity in D = 4 space-time dimensions (see eq. (6.6)). For arbitrary D, using dDk kD−1dk we get ∼ D d k 1 dk D−5 4 4 k (6.21) V ∼ Z (2π) k ∼ Z (2π) and the integral is UV-finite for say D 3. Thus changing the dimension can regulate ≤ integrals. It is important to note that this is only a technicality. There is no Physics associated with D = 4 and at the end of the calculation we have to let D 4. If we did renormalize our theory6 properly this last step will not lead to UV divergences.→ The reason why dimensional regularization is so popular is that it preserves gauge invariance and is technically relatively simple. Another very important issue is that this regularization not only regulates UV singularities, but also IR singularities. As men- tioned in section 6.2, theories like QED or QCD are very often plagued by IR singulari- ties. It is therefore very convenient if we do not have to introduce another regularization for IR singularities. Only after all UV and IR singularities have been removed, we can let D 4 and finally obtain a finite result. → 7 QED as a Field Theory

7.1 Quantizing the Dirac Field In this section we return to the Dirac equation and use it as the basis for a a field theory, which allows the creation and annihilation of particles naturally. Quantizing a field (or ) basically means that the wave function becomes an operator. The space in which this operator acts is called the Fock space. The Fock space contains states with an arbitrary number of particles and therefore we will be able to describe processes where the number of states changes. Dirac field theory is defined to be the theory whose field equations correspond to the Dirac equation. We regard the two Dirac fields ψ(x) and ψ¯(x) as being dynamically independent fields and postulate the Dirac Lagrangian density:

= ψ¯(x)(iγµ∂ m)ψ(x). (7.1) L µ − Then the Euler-Lagrange equation ∂ ∂ ∂ µ L L = 0 (7.2) ∂x ∂(∂µψ) − ∂ψ leads to the Dirac equation. The canonical momentum is ∂ π(x) = L = iψ†(x) (7.3) ∂ψ˙(x) and the Hamiltonian density is ∂ψ = πψ˙ = ψ†i . (7.4) H − L ∂t Now we want to regard ψ as a quantum field rather than as a wave function. In order to quantize this field, naively we would try to impose the usual equal time commutation relations, i.e.

[ψ (~x, t), π (~y, t)] = iδ δ3(~x ~y), α β αβ − [ψα(~x, t), ψβ(~y, t)] = 0,

[πα(~x, t), πβ(~y, t)] = 0, (7.5) where α and β label the spinor components of ψ and π. Without proving it for the moment we note that this would lead to a disaster. In particular, the Hamiltonian is unbounded from below - there is no ground state. The only way to cure the problem is to impose anti-commutation relations (we will soon see that this leads to the desired properties for spin-1/2):

ψ (~x, t), π (~y, t) = iδ δ3(~x ~y). { α β } αβ − ψ (~x, t), ψ (~y, t) = 0, { α β } π (~x, t), π (~y, t) = 0. (7.6) { α β } There is a very nice discussion in Peskin & Schroeder on this (Chapter 3). In particular, they show how anti-commutation relations really are the only solution. The Heisenberg equations of motion for the field operators have the solution

d3~k 1 ψ (~x, t) = [b(s,~k)u (s,~k)e−ik·x + d†(s,~k)v (s,~k)eik·x] (7.7) α (2π)3 2E α α Z sX=1,2 d3~k 1 ψ¯ (~x, t) = [b†(s,~k)u¯ (s,~k)eik·x + d(s,~k)v¯ (s,~k)e−ik·x] (7.8) α (2π)3 2E α α Z sX=1,2 Since ψ is now an operator, so are the expansion coefficients b†, d†, b and d. They are in- terpreted as creation and annihilation operators for electrons and positrons respectively. The anti-commutation relations for the fields, eq. (7.6), imply that

b(r,~k), b†(s,~k0) = (2π)3 2E δ3(~k ~k0)δ − sr n o d(r,~k), d†(s,~k0) = (2π)3 2E δ3(~k ~k0)δ − sr n o b(r,~k), b(s,~k0) = b†(r,~k), b†(s,~k0) = 0 n o n o d(r,~k), d(s,~k0) = d†(r,~k), d†(s,~k0) = 0 (7.9) n o n o . Exercise 7.31 Show that the anticommutation relations above lead to the correct anticommutation relations for the fields ψα(~x, t) and πβ(~x, t). You will need the spinor sum relations in eq. (5.5). The total Hamiltonian is H = d3~x : : (7.10) Z H The symbols : : denote normal ordering of the operator inside, i.e. we put all creation operators to the left of all annihilation operators so that H 0 = 0 by definition, and is | i the way we remove the ambiguity associated with the order of operators. Note that if we move an anti-commuting (fermion) operator through another such operator then we pick up a minus sign. Using eq. (7.4) after some algebra we get

d3~k 1 H = E [b†(s,~k)b(s,~k) + d†(s,~k)d(s,~k)]. (7.11) (2π)3 2E Z sX=1,2 . Exercise 7.32 Verify the above form of the Hamiltonian. Can you see from the derivation why com- mutation relations for ψ and π and therefore for b and d would have led to a disaster? The formula in eq. (7.11) has a very nice interpretation. The operator b†b is nothing but the number operator for electrons and d†d that for positrons. Thus, to get the total Hamiltonian, we have to count all electrons and positrons for all spin states s and momenta ~k and multiply this number by the corresponding energy E. If we had tried to impose commutation relations, the d†d term would have entered with a minus sign in front, which would signal that something has gone wrong. In par- ticular, it would mean that d† creates particles of negative energy. This is not supposed to happen in the quantized field theory. (We could try to fix the problem by simply re-labeling d d† but it may be shown that this leads to acausal propagation.) So, in order↔ to quantize the Dirac field we are necessarily led to the introduction of anti-commutation relations. Remarkably we find that we have automatically taken into account the Pauli exclusion principle! For example, b†(r,~k), b†(s,~k0) = 0 n o implies that it is not possible to create two quanta in the same state, i.e. b†(s,~k)b†(s,~k) 0 = 0. | i This intimate connection between spin and statistics is a direct consequence of desiring our theory to be consistent with the laws of relativity and quantum mechanics. Finally consider the charge operator

3 3 † Q = d ~x : j0(x) : = d ~x : ψ ψ : Z Z which, in terms of the creation and annihilation operators, is

d3~k 1 Q = [b†(s,~k)b(s,~k) d†(s,~k)d(s,~k)] (7.12) (2π)3 2E − Z sX=1,2 This shows again that b† creates fermions while d† creates the associated antifermions of opposite charge.

7.2 Quantizing the Electromagnetic Field The Maxwell equations can be derived from the Lagrangian density 1 = F µν F j Aµ (7.13) L −4 µν − µ where the field strength tensor is F ∂ A ∂ A , (7.14) µν ≡ µ ν − ν µ

and jµ is a source for the field. Maxwell’s equations do not change under the gauge transformation A (x) A (x) + ∂ Λ(x) (7.15) µ → µ µ where Λ(x) is some scalar field. This shows that there is some redundancy, and the 4 components of Aµ(x) are more than is required to describe the electromagnetic field (there are two transverse polarizations of e.m. radiation). This leads to a problem in quantization. To see this note that the canonically conjugate field to Aµ is ∂ Πµ = L = F µ0 (7.16) ∂(∂0Aµ) and from this it follows that Π0 = 0. This means there is no possibility of imposing a non-zero commutation relation between Π0 and A0, which we would need if we are to quantize the field. To get around this problem we recognize that gauge invariance allows us to impose an extra condition, which we use to fix the gauge invariance, and effectively lower the degrees of freedom. For example, we can impose the Lorentz gauge condition, i.e.

µ ∂µA = 0. (7.17) Note that, even after fixing the Lorentz gauge, we can perform another gauge transfor- mation on Aµ, i.e. Aµ(x) Aµ(x) + ∂µχ(x) where χ(x) must satisfy the wave equation, µ → ∂µ∂ χ = 0, i.e. we have two unphysical degrees of freedom and the two physical fields. µ We impose the constraint by noting that since ∂µA = 0, there is no harm in adding it to the Lagrangian density as 1 1 = F µνF j Aµ (∂ Aµ)2. (7.18) L −4 µν − µ − 2ξ µ Indeed what we are doing here is following the Lagrange multiplier method of imposing constraints (1/2ξ being the Lagrange multiplier), and recognizing that we should find 4 4 µ 2 the stationary points of S = d x subject to the constraint d x(∂µA ) = 0, i.e. this comes from the “equation of motion”L ∂ /∂(1/2ξ) = 0. R L R Using the gauge-fixed Lagrangian, the equations of motion are now 1 ∂µF j + ∂ (∂µA ) = 0. µν − ν ξ ν µ µ If we require that these equations are satisfied and then also ∂µA = 0, we have the original equations of motion but in a fixed gauge. In the Feynman gauge ξ = 1, the Lagrangian is particularly simple (after some integration by parts under d4x): R 1 = ∂ A ∂µAν j Aµ, L 2 µ ν − µ µ µ and quantization can now proceed: Π = ∂0A and thus [Aµ(~x, t), ∂ Aν(~y, t)] = igµν δ3(~x ~y) (7.19) 0 − − with all other commutators vanishing. The Heisenberg operator corresponding to the photon field is

d3~k 1 3 A (x) = ε (λ,~k)a(λ,~k)e−ik·x + ε∗ (λ,~k)a†(λ,~k)eik·x (7.20) µ (2π)3 2E µ µ Z λX=0 h i where εµ(λ,~k) are a set of four linearly independent basis 4-vectors for polarization µ µ (λ = 0, 1, 2, 3). For example, if k = (k0,~k), we might choose ε (0) = (1, 0, 0, 0), ε (3) = ~ µ µ 2 ~ 2 ~ ~ (0, k)/k0, ε (1) = (0, ~n1) and ε (2) = (0, ~n2), where k0 = k , ~n1 k = 0, ~n2 k = 0 and ~n ~n = 0. εµ(1) and εµ(2) are therefore polarization vectors for transv· erse polarizations· 1 · 2 whilst εµ(0) is referred to as the timelike polarization vector and εµ(3) is referred to as µ the longitudinal polarization vector. For example, if k = (k0, 0, 0, k0), ε (0) = (1, 0, 0, 0), εµ(3) = (0, 0, 0, 1), εµ(1) = (0, 1, 0, 0) and εµ(2) = (0, 0, 1, 0). The commutation relation (7.19) implies that

† 0 0 3 3 0 a(λ,~k), a (λ ,~k ) = g 0 2E (2π) δ (~k ~k ). (7.21) − λλ − h i At a glance this looks fine, i.e. we interpret a†(λ,~k) as an operator that creates quanta of the electromagnetic field (photons) with polarization λ and momentum ~k. However, for λ = 0 we have a problem since the sign on the RHS of (7.21) is opposite to that of the other 3 polarizations. This shows up in the fact that these timelike photons make a negative contribution to the energy:

d3~k 1 H = E a†(0,~k)a(0,~k) + a†(i,~k)a(i,~k) . (7.22) (2π)3 2E −  Z i=1X,3   Fortunately, although we might not realize it yet, we have already solved the problem. µ Recall that we still have to impose ∂µA = 0. It turns out that it is impossible to do this at the operator level, but we can do it for all physical expectation values, i.e. we can impose the correct physics. It then turns out that contributions from the timelike and longitudinal photons always cancel. More explicitly, by demanding for any state χ that | i χ ∂ Aµ χ = 0 (7.23) h | µ | i it follows that χ a†(3,~k)a(3,~k) a†(0,~k)a(0,~k) χ = 0. (7.24) h | − | i and therefore χ H χ 0. This is nice because it is in accord with our knowledge that free photons areh |transv| i ≥ersely polarized. . Exercise 7.33 Show that eq. (7.24) follows from eq. (7.23). Acknowledgements

This lecture course has a long history; David Miller and I have shared the teaching over the last 6 years. We both built upon the foundation laid by Robert Thorne, Tim Morris, Jeff Forshaw, Adrian Signer and all the previous lecturers of this course who have been responsible for its development over the years. These course notes have evolved dependent on the tastes of those lecturers and in that spirit I have only modified the LATEX files which David was kind enough to provide me. Any errors which I have managed to add while doing this (as well as any errors I have missed for sections left unchanged) are, of course, entirely my responsibility. Many thanks also to Margaret Evans for the impeccable organization and to Bill Murray for keeping us all under control. Thanks also to all the other lecturers and tutors and the students for making the school such a pleasant experience.

References

[1] J D Bjorken and S D Drell Relativistic Quantum Mechanics McGraw-Hill 1964 F Halzen and A D Martin Quarks and Wiley 1984 F Mandl and G Shaw Wiley 1984 I J R Aitchison and A J G Hey Gauge theories in 2nd edition Adam Hilger 1989 M E Peskin and D V Schroeder An Introduction to Quantum Field Theory Addison Wesley 1995 B Hatfield Quantum Field Theory of Point Particles and Strings Addison Wesley 1992 L H Ryder Quantum Field Theory Cambridge University Press 1985 C Itzykson and J-B Zuber Quantum Field Theory McGraw-Hill 1987 Pre School Problems

Rotations, Angular Momentum and the Pauli Matrices Show that a 3-dimensional rotation can be represented by a 3 3 orthogonal matrix R with determinant +1 (Start with ~x0 = R ~x, and impose ~x0 ~x0 ×= ~x ~x). Such rotations form the special orthogonal group, SO(3). · · For an infinitesimal rotation, write R = 1l + iA where 1l is the identity matrix and A is a matrix with infinitesimal entries. Show that A is antisymmetric (the i is there to make A hermitian). Parameterise A as

0 ia3 ia2 3 A = ia −0 ia a L  3 − 1  ≡ i i ia ia 0 i=1  2 1  X  −  where the ai are infinitesimal and verify that the Li satisfy the angular momentum commutation relations [Li, Lj] = iεijkLk 2 2 2 2 Note that the Einstein summation convention is used here. Compute L L1 + L2 + L3. What is the interpretation of L2 ? ≡ The Pauli matrices σi are, 0 1 0 i 1 0 σ1 = , σ2 = − , σ3 = . 1 0 i 0 0 1      −  1 Verify that 2 σi satisfy the same commutation relations as Li.

Four Vectors A Lorentz transformation on the coordinates xµ = (ct, ~x) can be represented by a 4 4 matrix Λ as follows: × 0µ µ ν x = Λ ν x For a boost along the x-axis to velocity v, show that γ βγ 0 0 − βγ γ 0 0 Λ =   (.25) −0 0 1 0      0 0 0 1    where β = v/c and γ = (1 β2)−1/2 as usual. By imposing the condition−

0µ 0ν µ ν gµν x x = gµνx x (.26) where 1 0 0 0  0 1 0 0  gµν = − 0 0 1 0    −   0 0 0 1   −  show that µ ν T gµν Λ ρΛ σ = gρσ or Λ gΛ = g This is the analogue of the orthogonality relation for rotations. Check that it works for the Λ given by equation (.25) above. Now introduce ν xµ = gµνx µ and show, by reconsidering equation (.26) using x xµ, or otherwise, that

0 −1 ν xµ = xν (Λ ) µ

µ µ Vectors A and Bµ that transform like x and xµ are sometimes called contravariant and covariant respectively. A simpler pair of names is vector and covector. A particularly important covector is obtained by letting ∂/∂xµ act on a scalar φ:

∂φ ∂ φ ∂xµ ≡ µ µ Show that ∂µ does transform like xµ and not x .

Probability Density and Current Density Starting from the Schr¨odinger equation for the wave function ψ(~x, t), show that the probability density ρ = ψ∗ψ satisfies the continuity equation ∂ρ + J~ = 0 ∂t ∇ where h¯ J~ = [ψ∗( ψ) ( ψ∗)ψ] 2im ∇ − ∇ What is the interpretation of J~? Verify that the continuity equation can be written in manifestly covariant form. µ ∂µJ = 0 where J µ = (cρ, J~).