1 Relativistic Quantum Mechanics

The aim of this chapter is to introduce a relativistic formalism which can be used to describe particles and their interactions. The choice of the formalism was dictated by the fact that it needs to be covered in a few lectures only. The emphasis is given to those elements of the formalism which can be carried on to Relativistic Quantum Fields (RQF), the formalism which will be introduced and used later, at the graduate level.

We will begin with a brief summary of special relativity, concentrating on 4-vectors and spinors. One-particle states and their Lorentz transformations will be introduced, leading to the Klein–Gordon and the Dirac equations for probability amplitudes; i.e. Relativistic Quantum Mechanics (RQM). Readers who would like to get to RQM quickly, without studying its foundation in special relativity, might skip the first sections and start reading from section 1.3. Intrinsic problems of RQM will be discussed and a region of applicability of RQM will be defined. Wave functions will be constructed and particle interactions will be described using their probability currents. A gauge symmetry will be introduced to derive a particle interaction with a classical gauge field.

Contents of this chapter:
1.1 Special Relativity
1.2 One-particle states
1.3 The Klein–Gordon equation
1.4 The Dirac equation
1.5 Gauge symmetry

1.1 Special Relativity

Einstein's special relativity is a necessary and fundamental part of any formalism of particle physics. We will begin with its brief summary. For a full account, please refer to specialized books, for example (1) or (2). A chapter about spinors in (3) is recommended^1.

^1 Please note that in that chapter, transformations are in the "active" sense (the coordinate system is not changing, vectors are changing), not in the "passive" sense (the coordinate system is changing but vectors don't) as in this book.

The basic elements of special relativity are 4-vectors (or contravariant 4-vectors) like a 4-displacement^2 $x^\mu = (t, \mathbf{x}) = (x^0, x^1, x^2, x^3) = (x^0, x^i)$ or a 4-momentum $p^\mu = (E, \mathbf{p}) = (p^0, p^1, p^2, p^3) = (p^0, p^i)$. 4-vectors have real components and form a vector space. There is a metric tensor $g_{\mu\nu} = g^{\mu\nu}$ which is used to form a dual space to the space of 4-vectors. This dual space is a vector space of linear functions, known as 1-forms (or covariant 4-vectors), which act on 4-vectors. For every 4-vector $x^\mu$, there is an associated 1-form $x_\mu = g_{\mu\nu} x^\nu$. Such a 1-form is a linear function which, acting on a 4-vector $y^\mu$, gives a number $x_\nu y^\nu = g_{\mu\nu} x^\mu y^\nu$. This number is called a scalar product $x \cdot y$ of $x^\mu$ and $y^\mu$^3. The Lorentz transformation between two coordinate systems, $\Lambda^\mu{}_\nu$, $x'^\mu = \Lambda^\mu{}_\nu x^\nu$, leaves the scalar product unchanged, which is equivalent to $g_{\rho\sigma} = g_{\mu\nu} \Lambda^\mu{}_\rho \Lambda^\nu{}_\sigma$. In the standard configuration, the Lorentz transformation becomes the Lorentz boost along the first space coordinate direction and is given by

$$\Lambda = \begin{pmatrix} \gamma & -\gamma\beta & 0 & 0 \\ -\gamma\beta & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \qquad \beta = \frac{v}{c}, \qquad \gamma = \frac{1}{\sqrt{1 - v^2/c^2}},$$

where $v$ is the velocity of the boost and the metric tensor is

$$g = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}.$$

^2 $\mu, \nu = 0, 1, 2, 3$ and $i, j = 1, 2, 3$.
^3 A similar situation is taking place in an infinitely-dimensional vector space of states in quantum mechanics (complex numbers there). For every state, a vector known as a ket, for example $|x\rangle$, there is a 1-form known as a bra, $\langle x|$, which acting on a ket $|y\rangle$ gives a number $\langle x|y\rangle$ which is called a scalar product of two kets, $|x\rangle$ and $|y\rangle$.

Two Lorentz boosts along different directions are equivalent to a single boost and a space rotation. This means that Lorentz transformations, which can be seen as space-time rotations, include Lorentz boosts (rotations by a pure imaginary angle) as well as space rotations (by a pure real angle). Representing Lorentz transformations by 4-dimensional real matrices acting on 4-vectors is not well suited to combine Lorentz boosts and space rotations in a transparent way. Even a simple question like "what is a single space rotation which is equivalent to a combination of two arbitrary space rotations?" is hard to answer. A better way is to represent Lorentz transformations by 2-dimensional complex matrices.

First we consider a 3-dimensional real space and rotations. With every rotation in that 3-dimensional real space we can associate a 2x2 complex matrix^4

$$R = \cos(\theta/2) + i \sin(\theta/2)\,(\sigma_x \cos\alpha + \sigma_y \cos\beta + \sigma_z \cos\gamma)$$

or

$$R = \cos(\theta/2) + i \sin(\theta/2)\,(\mathbf{n} \cdot \boldsymbol{\sigma})$$

or, after some algebra,

$$R = \exp[i(\theta/2)(\mathbf{n} \cdot \boldsymbol{\sigma})], \tag{1.1}$$

where $\theta$ is the angle of rotation, $\alpha, \beta, \gamma$^5 are the angles between the axis of rotation $\mathbf{n}$ and the coordinate axes and $\boldsymbol{\sigma} = (\sigma_x, \sigma_y, \sigma_z)$ are Pauli matrices. $R$ is unitary; $R^\dagger = R^{-1}$. The vector space of spin matrices (a subspace of all 2x2 complex matrices) can be defined using four basis vectors, such as the unit matrix and three basis vectors formed using Pauli matrices: $i\sigma_x$, $i\sigma_y$, $i\sigma_z$.

^4 Also known as Hamilton's quaternion or transformation or rotation operator.
^5 Only two of the angles $\alpha, \beta, \gamma$ are independent.
In this basis, the spin matrix $R$ has the following coordinates: $\cos(\theta/2)$, $\sin(\theta/2)\cos\alpha$, $\sin(\theta/2)\cos\beta$ and $\sin(\theta/2)\cos\gamma$. To combine two rotations, one multiplies the corresponding spin matrices and describes the outcome using the above basis, thus getting all the parameters of the equivalent single rotation. The next step is to associate each 3-dimensional space (real numbers) vector $\mathbf{x} = (x^1, x^2, x^3)$ with a corresponding spin matrix^6

$$X = x^1 \sigma_x + x^2 \sigma_y + x^3 \sigma_z. \tag{1.2}$$

Then, under the space rotation, $\mathbf{x}$ is transformed to $\mathbf{x}'$ and $X$ is transformed to $X' = R X R^\dagger = x'^1 \sigma_x + x'^2 \sigma_y + x'^3 \sigma_z$, from which we can read the coordinates of $\mathbf{x}'$.

^6 No unit matrix, only three Pauli matrices as the basis.
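
The recipe above for combining two rotations can be tried numerically. The following is only a minimal sketch using NumPy; the helper names (`spin_matrix`, `angle_and_axis`, `rotate_vector`) and the two sample rotations are illustrative assumptions, not part of the text. It builds $R$ from eqn 1.1, multiplies two spin matrices, reads the angle and axis of the equivalent single rotation from the coordinates in the basis $1, i\sigma_x, i\sigma_y, i\sigma_z$, and checks the result against $X' = R X R^\dagger$.

```python
import numpy as np

# Pauli matrices and the 2x2 unit matrix.
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)


def spin_matrix(theta, n):
    """R = cos(theta/2) + i sin(theta/2) (n . sigma), as in eqn 1.1."""
    n = np.asarray(n, dtype=float)
    n = n / np.linalg.norm(n)
    n_sigma = n[0] * SX + n[1] * SY + n[2] * SZ
    return np.cos(theta / 2) * I2 + 1j * np.sin(theta / 2) * n_sigma


def angle_and_axis(R):
    """Read theta and n from the coordinates of R in the basis 1, i*sx, i*sy, i*sz.

    Assumes a non-trivial rotation (sin(theta/2) != 0).
    """
    c = 0.5 * np.trace(R).real                      # cos(theta/2)
    s_vec = np.array([0.5 * np.trace(R @ S).imag    # sin(theta/2) * n
                      for S in (SX, SY, SZ)])
    s = np.linalg.norm(s_vec)
    return 2 * np.arctan2(s, c), s_vec / s


def rotate_vector(R, x):
    """Map x to X (eqn 1.2), apply X' = R X R^dagger, read x' back."""
    X = x[0] * SX + x[1] * SY + x[2] * SZ
    X_new = R @ X @ R.conj().T
    return np.array([0.5 * np.trace(X_new @ S).real for S in (SX, SY, SZ)])


# Two sample rotations (hypothetical choice): 90 degrees about z, then 90 degrees about x.
R1 = spin_matrix(np.pi / 2, [0, 0, 1])
R2 = spin_matrix(np.pi / 2, [1, 0, 0])
R12 = R2 @ R1                      # R1 is applied first
theta, axis = angle_and_axis(R12)
print("equivalent rotation angle (degrees):", np.degrees(theta))
print("equivalent rotation axis:", axis)

# The combined spin matrix acts on a vector like the two rotations in sequence.
x = np.array([0.3, -1.2, 0.5])
print(np.allclose(rotate_vector(R12, x),
                  rotate_vector(R2, rotate_vector(R1, x))))
```

For this particular pair the product is $(1 + i\sigma_x + i\sigma_y + i\sigma_z)/2$, i.e. a rotation by 120 degrees about the axis $(1, 1, 1)/\sqrt{3}$, which is what the sketch recovers.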

The beauty of the above approach is that it extends seamlessly to space-time rotations; i.e. to the Lorentz transformations. The spin matrix $R$ of eqn 1.1 becomes the Lorentz transformation

$$L = \exp[(-\boldsymbol{\rho} + i\theta\mathbf{n}) \cdot \boldsymbol{\sigma}/2], \tag{1.3}$$

where $\boldsymbol{\rho} = \rho\,\mathbf{n}_\rho$ is the rapidity, with $\tanh\rho = \beta$, $\cosh\rho = \gamma$ and $\sinh\rho = \beta\gamma$. Now, combination of two Lorentz transformations is very transparent; just a matter of real and imaginary parts in the exponent. Association of a 4-vector $x^\mu$ with a spin matrix^7

$$X = x^0 + x^1 \sigma_x + x^2 \sigma_y + x^3 \sigma_z \tag{1.4}$$

allows one to get its Lorentz transformed coordinates from $X' = L X L^\dagger = x'^0 + x'^1 \sigma_x + x'^2 \sigma_y + x'^3 \sigma_z$. $X$ is Hermitian; $X = X^\dagger$. And finally, the Lorentz boost alone ($\theta = 0$) along $\mathbf{n}_\rho$ is

$$L = \exp[-\boldsymbol{\rho} \cdot \boldsymbol{\sigma}/2] = \cosh(\rho/2) - \mathbf{n}_\rho \cdot \boldsymbol{\sigma} \sinh(\rho/2). \tag{1.5}$$

^7 Now, four matrices as the basis.

1.1.1 Spinors

Spin matrices act on two component complex vectors called spinors^8. Spinors play an important role in RQM^9 and in this section we will describe them in some detail.

^8 Spinors are vectors in the sense of a mathematical vector space but they are not vectors like a displacement $\mathbf{x}$ because they transform (for example under rotation) differently.
^9 Spinors, like vectors or tensors, are used in different parts of physics.

Under the space rotation, a spinor $\xi$ ($\xi^\alpha$ to be more precise) transforms the following way:

$$\xi' = R\xi.$$

In comparison, coordinates of a vector $\mathbf{x}$ transform under space rotation as

$$X' = R X R^\dagger = x'^1 \sigma_x + x'^2 \sigma_y + x'^3 \sigma_z.$$

Thus, a rotation of the coordinate system by $\theta = 2\pi$, for which $R = -1$ because of $\theta/2$ in $R$, gives $\xi' = -\xi$ and $\mathbf{x}' = \mathbf{x}$. Continuing the rotation by a further $2\pi$, so all together by $4\pi$, results in $\xi' = \xi$. Is that counter-intuitive minus sign caused by the $2\pi$ rotation good for any physics? Yes it is, as was demonstrated in a beautiful experiment (4) using neutrons. One of the two coherent neutron beams was going through a magnetic field of variable strength. In the magnetic field the neutrons' magnetic moments were precessing with the Larmor frequency and the angle of the precession could be easily calculated as a function of the strength of the magnetic field. After passing through the magnetic field, the beam interfered with the second beam which followed a path outside the magnetic field. As demonstrated in fig 1.1, an angle of $4\pi$ was needed for the neutron wave function to reproduce itself. A $2\pi$ rotation produced $-1$ in front of the original neutron wave function, as predicted for a spin 1/2 spinor. So the counter-intuitive minus sign was needed for the neutron to relate to the outside world, to be entangled with it the way quantum states can be entangled.

Fig. 1.1 The phase change of $4\pi$ is needed to get the same intensity from the interference of two neutron beams; from (4).
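
The sign change of a spinor under a $2\pi$ rotation, and its absence for a vector, can be checked directly from eqn 1.1. The sketch below is illustrative only: the axis, the sample spinor `xi` and the sample vector `x` are arbitrary assumptions, and the helper functions are not taken from the text.

```python
import numpy as np

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)


def spin_matrix(theta, n):
    """R = cos(theta/2) + i sin(theta/2) (n . sigma), as in eqn 1.1."""
    n = np.asarray(n, dtype=float)
    n = n / np.linalg.norm(n)
    n_sigma = n[0] * SX + n[1] * SY + n[2] * SZ
    return np.cos(theta / 2) * I2 + 1j * np.sin(theta / 2) * n_sigma


def rotate_vector(R, x):
    """Vector coordinates transform through X' = R X R^dagger."""
    X = x[0] * SX + x[1] * SY + x[2] * SZ
    X_new = R @ X @ R.conj().T
    return np.array([0.5 * np.trace(X_new @ S).real for S in (SX, SY, SZ)])


n = [1.0, 2.0, 2.0]                 # arbitrary rotation axis (assumption)
xi = np.array([0.6, 0.8j])          # arbitrary two-component spinor (assumption)
x = np.array([0.3, -1.2, 0.5])      # arbitrary 3-vector (assumption)

R_2pi = spin_matrix(2 * np.pi, n)
R_4pi = spin_matrix(4 * np.pi, n)

xi_2pi = R_2pi @ xi                 # spinor after a 2*pi rotation
xi_4pi = R_4pi @ xi                 # spinor after a 4*pi rotation
x_2pi = rotate_vector(R_2pi, x)     # vector after a 2*pi rotation

print("spinor flips sign after 2*pi:", np.allclose(xi_2pi, -xi))
print("spinor restored after 4*pi:  ", np.allclose(xi_4pi, xi))
print("vector unchanged after 2*pi: ", np.allclose(x_2pi, x))
```

The vector is insensitive to the sign of $R$ because $R$ appears twice in $R X R^\dagger$, while the spinor sees it once.
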
So far, one could think about spinors as being identical with Pauli spinors^10 of non-relativistic quantum mechanics^11. This is not quite right. In order to see that, we will look at spin matrices $X$ of eqn 1.4 from a different angle, seeing them as tensors created by a tensor product^12 of 2-dimensional Weyl spinors. Weyl spinors are rooted in space-time, not in space only like Pauli spinors.

^10 Eigenstates of spin operators, like the spin projection on the z axis $\frac{1}{2}\hbar\sigma_z$ for a spin 1/2 particle, in non-relativistic quantum mechanics (5).
^11 Giving components of spin matrices $X$ of eqn 1.2.
^12 Or dyadic product; something like this: $\begin{pmatrix} a \\ b \end{pmatrix} \begin{pmatrix} c & d \end{pmatrix}$.

Let's consider the Lorentz transformation of a spin matrix $X$ built from spinors $\xi = \begin{pmatrix} a \\ b \end{pmatrix}$ and $\eta = \begin{pmatrix} c \\ d \end{pmatrix}$:

$$X' = \begin{pmatrix} a' \\ b' \end{pmatrix} \begin{pmatrix} c' & d' \end{pmatrix} = L \begin{pmatrix} a \\ b \end{pmatrix} \begin{pmatrix} c & d \end{pmatrix} L^\dagger.$$

We can see that $\xi' = L\xi$ but $\eta' = L^*\eta$ (after taking the transpose, $L^{\dagger T} = L^*$). There are two different types of spinors, transforming differently. Those which transform with the complex conjugation are called dotted Weyl spinors, distinguished from the undotted $\xi^\alpha$ by a dot written above the index: $\eta^{\dot\alpha}$; for example, $(\xi^\alpha)^*$ is a dotted spinor. The spin matrix $X$ is then written as $X^{\alpha\dot\beta}$, with $\alpha = 1, 2$ and $\dot\alpha = 1, 2$. There is a metric tensor

$$\epsilon_{\alpha\beta} = \epsilon^{\alpha\beta} = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$$

(and an identical one for the dotted spinors) to create a dual space of 1-forms; $\xi_\alpha = \epsilon_{\alpha\beta}\xi^\beta$ (and $\eta_{\dot\alpha} = \epsilon_{\dot\alpha\dot\beta}\eta^{\dot\beta}$), so that $\xi_1 = \xi^2$ and $\xi_2 = -\xi^1$. The scalar product $\xi^\alpha\zeta_\alpha = \epsilon_{\alpha\beta}\xi^\alpha\zeta^\beta$ (and similarly for the dotted spinors) is invariant with respect to the Lorentz transformation. It follows that $\xi^\alpha\xi_\alpha = 0$ and $\xi^\alpha\zeta_\alpha = -\xi_\alpha\zeta^\alpha$. Because undotted Weyl spinors and dotted Weyl spinors are different objects, the scalar product, or in general any contraction, can only be performed on the same type of spinors; an undotted index is contracted with another undotted index and a dotted index is contracted with another dotted one; one can't contract a dotted index with an undotted one.

In order to get more insight, we will now go beyond Lorentz transformations and consider a space inversion $P$; $P(x^0, \mathbf{x}) = (x^0, -\mathbf{x})$. The space inversion $P$ commutes with space rotations but it doesn't commute with Lorentz transformations because Lorentz transformations affect the time component and $P$ doesn't.
One can see that by considering $P\Lambda$, the Lorentz boost followed by the space inversion in the 4-dimensional space-time: $P\Lambda = \Lambda' P$ and $\Lambda \neq \Lambda'$. If $\Lambda$ is the boost with a velocity $\mathbf{v}$, $\Lambda'$ is the boost with the velocity $-\mathbf{v}$. Thus $[P, \Lambda] \neq 0$ and therefore $P$ is not proportional to the identity, which commutes with every operator.

Back to Weyl spinors. Because the space inversion is not proportional to the identity operator, the space inversion is not transforming $\xi^\alpha$ into $\xi^\alpha$ times a number. It transforms $\xi^\alpha$ into a spinor of a different type, which transforms under the Lorentz transformation differently than $\xi^\alpha$. As Pauli spinors represent spin in non-relativistic quantum mechanics, Weyl spinors are going to represent spin in RQM. If so, we know that the space inversion leaves spin unaffected and therefore under $P$, $\xi^\alpha$ needs to be transformed to a spinor which transforms under space rotations the same way as $\xi^\alpha$ and is representing the same spin state. Out of all
three possibilities, only the 1-form $\eta_{\dot\alpha}$ transforms the same way. So under the space inversion $\xi^\alpha \to \eta_{\dot\alpha}$ and $\eta_{\dot\alpha} \to \xi^\alpha$.

Margin note: For a better grasp of the material presented in this part of the section on spinors, it is recommended to read the chapter on spinors in (6). In the discussion on the space inversion, $P^2 = 1$ is assumed. This is fine for all particles except Majorana particles. No fundamental Majorana particle has been discovered so far (a composite Majorana particle was discovered in condensed matter) but a neutrino could be one. In case of a Majorana particle, $P^2 = -1$ and the transformation of spinors under $P$ is different than given here. We will define a Majorana particle later.

As spinors $\xi^\alpha$ and $\eta_{\dot\alpha}$ play a very important role in RQM, here is a summary of how they transform under

rotation $R$: $\xi^\alpha \to R\xi^\alpha$, $\eta_{\dot\alpha} \to R\eta_{\dot\alpha}$, (1.6)
Lorentz transformation $L$: $\xi^\alpha \to L\xi^\alpha$, $\eta_{\dot\alpha} \to (L^\dagger)^{-1}\eta_{\dot\alpha}$, (1.7)
Lorentz boost $L = L^\dagger$: $\xi^\alpha \to L\xi^\alpha$, $\eta_{\dot\alpha} \to L^{-1}\eta_{\dot\alpha}$, (1.8)
space inversion $P$: $\xi^\alpha \to \eta_{\dot\alpha}$, $\eta_{\dot\alpha} \to \xi^\alpha$. (1.9)

Margin note: $(L^\dagger)^{-1} = L^{*-1}$.

Suppose that there is, described in a particular reference frame, a spin 1/2 particle with a 4-momentum $p^\mu$ represented via eqn 1.4 by $p^{\alpha\dot\beta}$. Following our non-relativistic intuition gained from using Pauli spinors, we want to represent the spin of that particle by $\xi^\alpha$. In an attempt to write a covariant^13 eqn, we can try to contract the undotted index $\alpha$, but that would lead to something like this:

$$p_{\alpha\dot\beta}\,\xi^\alpha = m\,\eta_{\dot\beta},$$

where $\eta_{\dot\beta}$ is a dotted spinor different from $\xi^\alpha$, coming from the not contracted dotted index, and $m$ is a dimensional scalar^14 parameter appearing there because of the dimensionality of $p^\mu$. So that equation is covariant only when $m = 0$ because we don't have any dotted spinor in hand to put on the right hand side of the eqn. A similar outcome is obtained having only $\eta_{\dot\beta}$ instead of $\xi^\alpha$^15. Those attempts lead to two independent Lorentz-invariant Weyl eqns:

$$p_{\alpha\dot\beta}\,\xi^\alpha = 0, \qquad (p^0 - \mathbf{p}\cdot\boldsymbol{\sigma})\,\xi = 0 \tag{1.10}$$

and

$$p^{\alpha\dot\beta}\,\eta_{\dot\beta} = 0, \qquad (p^0 + \mathbf{p}\cdot\boldsymbol{\sigma})\,\eta = 0. \tag{1.11}$$

Margin note: $p^{1\dot 1} = p_{2\dot 2} = p^0 + p^3$, $p^{2\dot 2} = p_{1\dot 1} = p^0 - p^3$, $p^{1\dot 2} = -p_{2\dot 1} = p^1 - ip^2$, $p^{2\dot 1} = -p_{1\dot 2} = p^1 + ip^2$. $p^{\alpha\dot\beta}$ is identical with $p^{\dot\beta\alpha}$; it is not a transpose operation.
^13 Here and in the rest of the RQM, covariant means covariant with respect to Lorentz transformations.
^14 From now on, a scalar means a scalar with respect to Lorentz transformations.
^15 A small clarification might be needed here. Having a column with two complex numbers is not enough. We need a name in addition, or how those two numbers transform (here) under Lorentz transformations. Having three real numbers in 3-dimensional Euclidean space is not enough; we need to know whether they represent a polar or axial vector; they transform differently under space inversion. So insisting on only one type of spinor excludes the other one because it transforms differently.

In the context of RQM, eqn 1.10 represents an eqn of motion for a free massless spin 1/2 particle with a positive helicity and eqn 1.11 an eqn of motion for a different, free massless spin 1/2 particle with a negative helicity. Each equation is not space inversion-covariant and violates parity because the space inversion, eqn 1.9, sends each spinor beyond the formalism; only one type of spinor is present in each formalism. At present there are no known particles which could be described by any of the Weyl eqns. If the neutrino would have no mass, it would be described by eqn 1.11 and the hypothetically massless and different electron anti-neutrino would be described by eqn 1.10.

Margin note: In quantum mechanics, a helicity operator representing a projection of a particle's spin on the direction of its momentum is defined as $\mathbf{p}\cdot\boldsymbol{\sigma}/|\mathbf{p}|$. For a massless particle, this is equivalent to $\mathbf{p}\cdot\boldsymbol{\sigma}/p^0$.

Suppose now that we have $p^{\alpha\dot\beta}$ and two different spinors: $\xi^\alpha$ and $\eta_{\dot\beta}$
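to describe a spin 1/2 particle. We can then contract the undotted index, getting $p_{\alpha\dot\beta}\,\xi^\alpha = m\,\eta_{\dot\beta}$. Acting with $p^{\alpha\dot\beta}$ on $\eta_{\dot\beta}$ from that eqn gives $m\,\xi^\alpha$ under the condition that $m^2 = p^\mu p_\mu$, because $p^{\alpha\dot\beta} p_{\gamma\dot\beta} = p^\mu p_\mu\,\delta^\alpha_\gamma$. The result is a covariant set of equations

$$p^{\alpha\dot\beta}\,\eta_{\dot\beta} = m\,\xi^\alpha, \qquad (p^0 + \mathbf{p}\cdot\boldsymbol{\sigma})\,\eta = m\,\xi,$$
$$p_{\alpha\dot\beta}\,\xi^\alpha = m\,\eta_{\dot\beta}, \qquad (p^0 - \mathbf{p}\cdot\boldsymbol{\sigma})\,\xi = m\,\eta. \tag{1.12}$$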

Requiring that under the space inversion $\xi^\alpha$ and $\eta_{\dot\beta}$ are transformed into each other as in eqn 1.9 makes the set of eqns 1.12 space inversion invariant as, simultaneously, $p^{\alpha\dot\beta}$ and $p_{\alpha\dot\beta}$ are also transformed into each other. Spinors $\xi^\alpha$ and $\eta_{\dot\beta}$ are combined into a four component bispinor, called the Dirac spinor, and the two eqns become one eqn 1.12 called the Dirac equation (after Paul A.M. Dirac). In the context of RQM, the Dirac equation describes a spin 1/2 particle like the electron.

In order to have more insight into the origin of the Dirac equation, we will consider the Lorentz boost, eqns 1.5 and 1.8, from the rest frame, momentum $\mathbf{p} = 0$, to the frame in which the particle has the energy $E$ and the momentum $\mathbf{p}$. The relevant spinors transform as^16

$$\xi(\mathbf{p}) = (\cosh(\rho/2) - \mathbf{n}_\rho\cdot\boldsymbol{\sigma}\,\sinh(\rho/2))\,\xi(0) \tag{1.13}$$

$$\eta(\mathbf{p}) = (\cosh(\rho/2) + \mathbf{n}_\rho\cdot\boldsymbol{\sigma}\,\sinh(\rho/2))\,\eta(0) \tag{1.14}$$

which can be written as

$$\xi(\mathbf{p}) = \frac{E + m + \mathbf{p}\cdot\boldsymbol{\sigma}}{(2m(E + m))^{1/2}}\,\xi(0) \tag{1.15}$$

$$\eta(\mathbf{p}) = \frac{E + m - \mathbf{p}\cdot\boldsymbol{\sigma}}{(2m(E + m))^{1/2}}\,\eta(0). \tag{1.16}$$

^16 $\cosh(\rho/2) = \dfrac{E + m}{(2m(E + m))^{1/2}}$, $\sinh(\rho/2) = \dfrac{|\mathbf{p}|}{(2m(E + m))^{1/2}}$.

In the particle's rest frame and in all frames moving with respect to it slowly enough, such that the Lorentz boost can be approximated by the Galilean transformation (not affecting time) when transforming between those frames, any differences in how spinors with dotted or undotted indices transform disappear. In that case, spinors effectively live in 3 real dimensions. Inspecting eqn 1.5, one can see that in the limit $\beta \to 0$, $L \to$ the unit matrix and therefore under the Galilean transformation, spinors don't change. Thus, at rest, both Weyl spinors, $\xi^\alpha$ and $\eta_{\dot\beta}$, become effectively identical with the same Pauli spinor and we can write $\xi^\alpha(0) = \eta_{\dot\alpha}(0)$. This allows, after some algebra, to remove the $\mathbf{p} = 0$ spinors from eqns 1.15 and 1.16 and to obtain the Dirac equation 1.12. So the Dirac equation 1.12 is equivalent to the Lorentz boost. This should be expected. Once an object, like a bispinor, is found to represent a particle in its rest frame, the only thing left to do is to boost it to another frame as needed.
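
The statement that the boosted spinors of eqns 1.15 and 1.16 satisfy the Dirac equation 1.12 can be verified numerically. This is a minimal sketch under stated assumptions: the mass, momentum and rest-frame spinor `chi` are arbitrary sample values, the function names are hypothetical, and natural units are used.

```python
import numpy as np

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)


def p_dot_sigma(p):
    """The 2x2 matrix p . sigma for a 3-momentum p."""
    return p[0] * SX + p[1] * SY + p[2] * SZ


def boosted_weyl_spinors(chi, p, m):
    """xi(p) and eta(p) from eqns 1.15 and 1.16 with xi(0) = eta(0) = chi."""
    E = np.sqrt(m**2 + p @ p)
    norm = np.sqrt(2 * m * (E + m))
    xi = ((E + m) * I2 + p_dot_sigma(p)) @ chi / norm
    eta = ((E + m) * I2 - p_dot_sigma(p)) @ chi / norm
    return E, xi, eta


m = 0.511                              # sample mass (assumption)
p = np.array([0.3, -0.4, 1.2])         # sample 3-momentum (assumption)
chi = np.array([0.6, 0.8j])            # sample rest-frame Pauli spinor (assumption)

E, xi, eta = boosted_weyl_spinors(chi, p, m)

# The two lines of eqn 1.12 in matrix form.
first_ok = np.allclose((E * I2 + p_dot_sigma(p)) @ eta, m * xi)
second_ok = np.allclose((E * I2 - p_dot_sigma(p)) @ xi, m * eta)
print("(p0 + p.sigma) eta = m xi:", first_ok)
print("(p0 - p.sigma) xi  = m eta:", second_ok)
```

The check relies only on $(\mathbf{p}\cdot\boldsymbol{\sigma})^2 = \mathbf{p}^2$ and $E^2 - \mathbf{p}^2 = m^2$, which is the "some algebra" mentioned above.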

1.2 One-particle states

The fact that quantum states of free relativistic particles are fully defined by the Lorentz transformation supplemented by the space-time translation was discovered by Eugene Wigner. Here we will follow his idea in a qualitative way just to get the main concept across.

First, we note that Lorentz transformations are not able to transform a given arbitrary 4-momentum $p^\mu$ to every possible $p'^\mu$. Instead, the vector space of 4-momenta is divided into sub-spaces of 4-momenta which can be Lorentz transformed into each other. Three of those sub-spaces represent experimentally known states. The simplest, at this stage, is the vacuum state given by the conditions $p^\mu = 0$ and $p^\mu p_\mu = 0$. There is no Lorentz transformation which would transform a 4-momentum not satisfying these conditions to the one which does and vice-versa. We will not study vacuum states in this course and therefore, without any further considerations about them, we will move on to consider two other possibilities.

A 4-momentum sub-space related to massive particles, like the electron, is given by the condition $p^\mu p_\mu > 0$. In addition to a 4-momentum $p^\mu$, what other degrees of freedom are present and which geometrical objects represent them? To answer this question, we can consider Lorentz transformations which leave $p^\mu$ invariant^17. To see what they are, we can transform $p^\mu$ to the particle rest frame where $p'^\mu = (m, 0, 0, 0)$, find the largest subset of the Lorentz transformations leaving $p'^\mu$ invariant, and then transform back to the same $p^\mu$. It turns out, as intuitively expected, that the looked-for transformations are space rotations acting on $2s + 1$ spinors representing $2s + 1$ spin projections of a spin $s$ particle. Thus, the electron, $s = 1/2$, is represented by two Dirac spinors. In fact, by two Dirac spinors multiplied by a dimensionless scalar. To get the scalar, we add space-time translations. Looking for a theory which is space-time translation invariant, meaning the energy and momentum conservation, we are looking for the energy and momentum eigenstates (free particles) which, in the position representation, would lead to the $\exp(-ip^\mu x_\mu)$ scalar.

^17 The group of such transformations is known as the little group.

The third 4-momentum sub-space is defined by the conditions $p^\mu \neq 0$ and $p^\mu p_\mu = 0$. Photons belong to this class. The question is again to find the largest sub-set of Lorentz transformations leaving $p^\mu$ invariant. There is no rest frame in this case and therefore, instead, we transform an arbitrary $p^\mu$ to the frame where $p'^\mu = (\omega, 0, 0, \omega)$. We can see that the largest^18 sub-set of the Lorentz transformations leaving $p'^\mu$ invariant are the rotations in the $x^1 x^2$ plane. As a result, a spin $s$ massless particle is represented by only one state, a helicity eigenstate, and not by $2s + 1$ states as in the massive case. This is an important difference. In order to conserve parity with photons having either helicity $+$ or helicity $-$ states, one puts those two, in principle different, helicity states into one theory.

^18 For a discussion of some subtle issues see (7).
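
The invariance statements above can be illustrated with explicit 4x4 matrices. The sketch below is not a derivation of the little group; it only checks a few specific transformations. The function names, the values of `m` and `omega`, and the chosen angles and velocity are illustrative assumptions.

```python
import numpy as np

G = np.diag([1.0, -1.0, -1.0, -1.0])   # metric tensor g


def rotation_x1x2(phi):
    """Rotation in the x1-x2 plane (about the x3 axis)."""
    L = np.eye(4)
    L[1, 1], L[1, 2] = np.cos(phi), np.sin(phi)
    L[2, 1], L[2, 2] = -np.sin(phi), np.cos(phi)
    return L


def rotation_x2x3(phi):
    """Rotation in the x2-x3 plane (about the x1 axis)."""
    L = np.eye(4)
    L[2, 2], L[2, 3] = np.cos(phi), np.sin(phi)
    L[3, 2], L[3, 3] = -np.sin(phi), np.cos(phi)
    return L


def boost_x3(beta):
    """Lorentz boost along the x3 axis with velocity beta (c = 1)."""
    gamma = 1.0 / np.sqrt(1.0 - beta**2)
    L = np.eye(4)
    L[0, 0] = L[3, 3] = gamma
    L[0, 3] = L[3, 0] = -gamma * beta
    return L


def minkowski_square(p):
    """p^mu p_mu."""
    return p @ G @ p


m, omega = 1.0, 2.0                        # sample values (assumption)
p_rest = np.array([m, 0.0, 0.0, 0.0])      # massive particle at rest
p_light = np.array([omega, 0.0, 0.0, omega])   # massless particle along x3

# Massive case: space rotations leave the rest-frame 4-momentum invariant.
rest_inv = (np.allclose(rotation_x1x2(0.7) @ p_rest, p_rest)
            and np.allclose(rotation_x2x3(1.3) @ p_rest, p_rest))

# Massless case: rotations in the x1-x2 plane leave (omega, 0, 0, omega) invariant,
# a rotation in the x2-x3 plane does not.
light_inv_12 = np.allclose(rotation_x1x2(0.7) @ p_light, p_light)
light_inv_23 = np.allclose(rotation_x2x3(1.3) @ p_light, p_light)

# A boost along x3 rescales omega but keeps p^mu p_mu = 0: no rest frame is reached.
p_boosted = boost_x3(0.6) @ p_light

print(rest_inv, light_inv_12, light_inv_23)
print(p_boosted, minkowski_square(p_boosted))
```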

1.2.1 Fields and probability amplitudes

We now have all that is needed to develop RQM and to describe fundamental particles and their interactions. But before we move on, we will pause for a short moment to look at a larger picture of which RQM is only a part. The Dirac equation, for example, can be studied in the context of classical fields (CF) or RQF or RQM. The algebra would often be identical but the basic objects and the interpretation different. The most natural way to progress from where we are right now would be to study CF. The model for that would be electromagnetism, or the classical tensor field $F^{\mu\nu}$ or the 4-vector potential field $A^\mu$; $F^{\mu\nu} = \partial^\mu A^\nu - \partial^\nu A^\mu$. Then such a field, for example a classical electron field, i.e. a classical Dirac spinor field $\Psi$, would be quantized, promoting $\Psi$ of CF to a relativistic quantum field operator $\Psi$ of RQF. Probability amplitudes would be obtained by $\Psi$ of RQF acting on particle states living in a suitably constructed space. In RQM, $\Psi$ represents a particle state and it isn't an operator. Describing it in a basis of eigenstates of a Hermitian operator representing a physical observable, one gets probability amplitudes.

A $\Psi$ of RQM looks like a $\Psi$ of CF but they are different. Algebraically they are the same, but a large modulus of $\Psi$^19 of CF would correspond to many electrons in a considered small volume (similarly, a large electric field created by a focused light corresponds to many photons) and if we would multiply that field by a big number, the number of electrons would increase. A corresponding quantity in RQM would represent a large probability to find an electron in that volume. Multiplying $\Psi$ by a big number would not change anything because, in comparison, there is an extra requirement of the normalization. Also a question about how to measure a physical quantity leads to different considerations and a different answer in each case. For example, one can measure a phase of a classical electric field but one can't measure a phase of $\Psi$ in RQM.

^19 To be defined shortly; one can think about something like $|\Psi|^2$ in non-relativistic quantum mechanics.

As mentioned earlier, in a few lectures we would not be able to study CF and RQF; a one year course at a basic level would be needed for that alone. Fortunately, considering the most important aspects of physics being studied in this course, RQF would be giving the same results as those which we will obtain in RQM. Differences will be in details beyond leading effects. The one important exception is that we will be missing the idea of a vacuum state. In RQF, a vacuum is not a nothingness, although particles are absent. For example, the QCD vacuum is a very complicated state.

1.3 The Klein–Gordon equation

RQM of spin 0 particles was considered by Schroedinger first, before he published his famous equation for the non-relativistic case. He abandoned RQM because of formal difficulties which were understood many years later. Here, we will see what they are and we will define an area of RQM applicability. As argued earlier, a spin 0 particle with

$$p^\mu p_\mu = m^2 > 0, \tag{1.17}$$

in the position representation, is expected to be described by a scalar wave function $\sim \exp(-ip^\mu x_\mu)$. Replacing the energy by $i\frac{\partial}{\partial t}$ and the momentum by $-i\nabla$ in eqn 1.17, one gets the Klein–Gordon (K–G) equation^20 of RQM in position representation:

$$(\Box + m^2)\Psi(t, \mathbf{x}) = 0, \tag{1.18}$$

where $\Box = \partial^\mu\partial_\mu = \frac{\partial^2}{\partial t^2} - \nabla^2$. The operators used here are $p^\mu = i\partial^\mu$, i.e. $p^0 = i\frac{\partial}{\partial x^0} = i\frac{\partial}{\partial t}$ and $p^i = i\partial^i = -i\partial_i = -i\frac{\partial}{\partial x^i}$.

^20 Or the Klein–Gordon–Fock equation.

For a particle at rest, $-i\nabla\Psi(t, \mathbf{x}) = 0$, only the time (the proper time $\tau$) derivative would be present in eqn 1.18 and there would be two independent solutions: $\Psi^\pm(\tau, \mathbf{x}) = \exp(\mp im\tau)\,\Psi^\pm(0, 0)$. Therefore in a frame in which the particle has a momentum $\mathbf{p}$ and the energy $E_p = +\sqrt{\mathbf{p}^2 + m^2} > 0$ (the subscript $p$ stands for the plus sign, indicating that $E_p$ is positive), we get, as expected^21,

$$\Psi^+(t, \mathbf{x}) = N\exp(-ip\cdot x) = N\exp(-iE_p t + i\mathbf{p}\cdot\mathbf{x}). \tag{1.19}$$

^21 $m\tau = p^\mu x_\mu$.

Instead of boosting the other solution, i.e. taking the $-E_p$ energy eigenstate and the $\mathbf{p}$ momentum eigenstate, we take the complex conjugate of $\Psi^+(t, \mathbf{x})$, corresponding to the $-E_p$ energy eigenstate and the $-\mathbf{p}$ momentum eigenstate, to get^22

$$\Psi^-(t, \mathbf{x}) = N\exp(+ip\cdot x) = N\exp(+iE_p t - i\mathbf{p}\cdot\mathbf{x}), \tag{1.20}$$

where $N$ is a normalization constant which will be defined shortly. By a direct substitution, one can check that a general solution of eqn 1.18 is indeed a superposition of $\Psi^+(t, \mathbf{x})$ and $\Psi^-(t, \mathbf{x})$.

^22 Why one is doing that should become clear after reading section 1.3.1.

We obtained, as expected, $\Psi^+(t, \mathbf{x})$ but in addition, we also got $\Psi^-(t, \mathbf{x})$. This is the first puzzle of RQM, the nature of which will become clearer when we progress a little further. Both solutions of the K–G eqn are eigenstates of the energy operator $i\frac{\partial}{\partial t}$; $\Psi^+(t, \mathbf{x})$ with an eigenvalue of $E_p$ and $\Psi^-(t, \mathbf{x})$ with $-E_p$, a negative energy for a free particle!

In exactly the same way as in the case of the non-relativistic Schroedinger eqn, one can derive the continuity equation for a probability density $\rho$ and a probability current $\mathbf{j}$:

$$\frac{\partial\rho}{\partial t} + \nabla\cdot\mathbf{j} = 0, \tag{1.21}$$

where

$$\rho = i\left(\Psi^*\frac{\partial\Psi}{\partial t} - \Psi\frac{\partial\Psi^*}{\partial t}\right) \tag{1.22}$$

and

$$\mathbf{j} = -i(\Psi^*\nabla\Psi - \Psi\nabla\Psi^*). \tag{1.23}$$

The probability current comes out to be given by the same expression as the non-relativistic one but the probability density is different, although nicely symmetric to the current, and we can define a 4-vector current $j^\mu \equiv (\rho, \mathbf{j}) = i(\Psi^*\partial^\mu\Psi - \Psi\partial^\mu\Psi^*)$. The continuity eqn 1.21 can then be written as $\partial_\mu j^\mu = 0$. The corresponding conserved quantity is the total probability, which one gets by integrating $j^0 = \rho$ over the 3D space. The underlying symmetry is the invariance with respect to the multiplication by a global phase factor: physics described by $\Psi$ is identical to physics described by $\exp(i\theta)\Psi$ for any fixed real parameter $\theta$^23.

^23 One can prove that considering the Klein–Gordon Lagrangian and the Noether theorem.

Substituting $\Psi^+(t, \mathbf{x})$ from eqn 1.19 into eqns 1.22 and 1.23 we obtain

$$\rho^+ = 2|N|^2 E_p \quad\text{and}\quad \mathbf{j}^+ = 2|N|^2\,\mathbf{p}. \tag{1.24}$$

Now we can fix the normalization $N$. In non-relativistic quantum mechanics, a volume integral of the probability density was a constant; 1 for one particle in the whole space. This doesn't work in RQM because there is the Lorentz contraction which modifies the volume, contracting one side of a cube, parallel to the Lorentz boost, by the Lorentz factor $\gamma$. To keep the integral independent of the Lorentz transformation, the probability density should grow by the same $\gamma$^24. So putting $N = 1$ would do the job, as any other constant would do it as well. The choice of $N = 1$ is called the covariant normalization and corresponds to $2E_p$ particles in a unit volume. Another popular choice is $N = 1/\sqrt{2m}$, which in the non-relativistic limit, $E_p \to m$, makes $\rho \to \Psi^*\Psi$ and $\mathbf{j} \to$ velocity, approaching the expressions of non-relativistic quantum mechanics.

^24 No problem, as $\rho$ is the time-like component of a 4-vector.

For $\Psi^-(t, \mathbf{x})$, eqn 1.24 becomes

$$\rho^- = -2|N|^2 E_p \quad\text{and}\quad \mathbf{j}^- = -2|N|^2\,\mathbf{p}. \tag{1.25}$$

Summarizing, $\Psi^+(t, \mathbf{x})$ and related quantities, the energy, probability density and the probability current, come out as expected and behave nicely in the non-relativistic limit. In contrast to $\Psi^+(t, \mathbf{x})$, an unexpected additional wave function $\Psi^-(t, \mathbf{x})$ describes a free particle with negative energy, negative probability density and with the probability current flowing in the opposite direction to the particle's momentum; all properties unexpected and difficult to accept.
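
Eqns 1.24 and 1.25 can be cross-checked by evaluating the definitions 1.22 and 1.23 numerically on the plane waves 1.19 and 1.20. The sketch below works in one space dimension with central finite differences; the function names, the step size `h` and the sample values of `m`, `p` and `N` are assumptions made for illustration.

```python
import numpy as np


def plane_wave(sign, p, m, N=1.0):
    """Psi^+ (sign=+1, eqn 1.19) or Psi^- (sign=-1, eqn 1.20) in one space dimension."""
    E = np.sqrt(p**2 + m**2)
    return lambda t, x: N * np.exp(-1j * sign * (E * t - p * x))


def density_and_current(psi, t, x, h=1e-5):
    """rho (eqn 1.22) and j (eqn 1.23) using central finite differences."""
    dpsi_dt = (psi(t + h, x) - psi(t - h, x)) / (2 * h)
    dpsi_dx = (psi(t, x + h) - psi(t, x - h)) / (2 * h)
    val = psi(t, x)
    rho = 1j * (np.conj(val) * dpsi_dt - val * np.conj(dpsi_dt))
    j = -1j * (np.conj(val) * dpsi_dx - val * np.conj(dpsi_dx))
    return rho.real, j.real


m, p, N = 1.0, 0.75, 1.0            # sample values (assumption); E_p = 1.25
E_p = np.sqrt(p**2 + m**2)

rho_plus, j_plus = density_and_current(plane_wave(+1, p, m, N), t=0.4, x=-1.1)
rho_minus, j_minus = density_and_current(plane_wave(-1, p, m, N), t=0.4, x=-1.1)

print("Psi+: rho, j =", rho_plus, j_plus, " expected", 2 * N**2 * E_p, 2 * N**2 * p)
print("Psi-: rho, j =", rho_minus, j_minus, " expected", -2 * N**2 * E_p, -2 * N**2 * p)
```

The numerical values agree with $\pm 2|N|^2 E_p$ and $\pm 2|N|^2 p$ up to the finite-difference error, independently of the point $(t, x)$ chosen.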

1.3.1 The Feynman–Stueckelberg interpretation of negative energy states

In this section we outline the Feynman–Stueckelberg interpretation of negative energy states following Feynman's lecture (20), which is recommended as further reading.

Suppose there is a particle in a state $\phi_0$ as indicated in fig 1.2a. At a time $t_1$ a potential $U_1$ is turned on for a moment, acting on the particle and changing its state to an intermediate state. At a time $t_2$, a second perturbation $U_2$ changes that intermediate state to the final one which could be the same as the original state $\phi_0$. An amplitude for the particle to go from the initial state $\phi_0$ to the same state $\phi_0$ after time $t_2$ has a contribution from an amplitude with an intermediate state, existing for a period of time from $t_1$ to $t_2$, of energy $E_p > 0$. All possible intermediate states of different $E_p > 0$ are contributing. Among them, there are amplitudes for particles travelling faster than the speed of light. This is the result of insisting that all energies $E_p$ are positive. If one starts a series of waves from a point keeping all energies positive, those waves can not be confined to be inside the light cone.

Fig. 1.2 A contribution to the transition amplitude viewed in two different reference frames; adapted from Feynman's lecture (20).

A sketch in fig 1.2a corresponds to an amplitude where the particle in the intermediate state travels faster than the speed of light. As indicated in fig 1.2b, there is a reference frame in which the sequence of events is different; $t'_2$ happens first, before $t'_1$. An observer in the reference frame $(t, x)$ observed one particle which moved from $x_1$ to $x_2$, ending in the same quantum state as the original one. An observer in a different reference frame, $(t', x')$, has a different story to tell. A particle was in $x'_1$, in a quantum state $\phi_0$. Nothing was happening until a time $t'_2$ when suddenly two particles emerged from a point $x'_2$. One of those particles travelled to $x'_1$ and at a time $t'_1$ collided with the original particle and both particles annihilated each other, disappearing from the scene, leaving the third particle at $x'_2$ in state $\phi_0$; three particles were present between $t'_2$ and $t'_1$. The second observer can argue that the particle which travelled from $x'_2$ to $x'_1$ is the antiparticle of the original particle and therefore they were able to annihilate each other. So antiparticles exist and their properties are defined by particles such that the annihilation works. The first observer might argue that the antiparticle of the second observer is his particle travelling backwards in time and that is the interpretation of the negative energy states.
The negative energy states correspond to particles travelling backwards in time and therefore the phase of the wave function in eqn 1.20 has a $+iE_p t$ contribution instead of $-iE_p t$ like in eqn 1.19. To make the picture complete we would also take into account that a particle travelling backwards in time has its momentum reversed. Mathematically, all that is equivalent to taking the complex conjugation of the positive energy solution $\Psi^+$ (eqn 1.19) to obtain the negative energy solution $\Psi^-$ (eqn 1.20).

In summary, negative energy solutions of the Klein–Gordon eqn represent antiparticles. The probability density represents the charge density and can be either negative or positive. Similarly for the probability current, representing the charged current; the number of charges passing through the unit area per unit time.

Inclusion of interactions via a potential

Now we will introduce interactions using a potential and see what happens. Following the way a potential $V$ was introduced into the non-relativistic Schroedinger eqn, we will modify the energy operator25:
$$i\frac{\partial}{\partial t} \rightarrow i\frac{\partial}{\partial t} - V,$$
transforming eqn 1.18 into
$$\left(i\frac{\partial}{\partial t} - V\right)^2 \Psi = (-\nabla^2 + m^2)\Psi,$$
which for a time-independent, time-like potential $V$ and for energy eigenstates26 with energy $E_p$, in one dimension, becomes
$$(E_p - V(s))^2\,\psi(s) = \left(-\frac{\partial^2}{\partial s^2} + m^2\right)\psi(s). \tag{1.26}$$

25 For a proper discussion of interactions one needs to wait until section 1.5.
26 $i\frac{\partial}{\partial t}\Psi = E_p\Psi$, $\Psi(t, s) = \psi(s)\exp(-iE_pt)$.

Let's consider a time-like potential barrier of fixed height $V > 0$ for $s \geq 0$, as shown in fig 1.3. The wave function $\psi(s)$ consists of incident, $I\exp(ips)$, reflected, $R\exp(-ips)$, and transmitted, $T\exp(iks)$, waves:
$$\psi_L(s) = I\exp(ips) + R\exp(-ips)$$
$$\psi_R(s) = T\exp(iks),$$
where $\psi(s) = \psi_L(s)$ for $s < 0$ and $\psi(s) = \psi_R(s)$ for $s \geq 0$.

Fig. 1.3 A time-like potential barrier of height V. Incoming, reflected and transmitted waves are also indicated.

Substituting $\psi_L$ and $\psi_R$ into eqn 1.26, one gets $E^2 = p^2 + m^2$ and $(E - V)^2 = k^2 + m^2$, leading to $p = \pm\sqrt{E^2 - m^2}$ and $k = \pm\sqrt{(E - V)^2 - m^2}$. In both cases one chooses the $+$ sign in front of the square root to match the expected propagation directions, as in fig 1.3.

From the continuity condition at $s = 0$ for $\psi(s)$ and $\frac{d\psi(s)}{ds}$ one gets
$$I + R = T, \qquad pI - pR = kT$$
and
$$T = \frac{2p}{p + k}\,I, \qquad R = \frac{p - k}{p + k}\,I. \tag{1.27}$$
The probability currents along $s$ for $s < 0$ and $s \geq 0$ are
$$j_L = \frac{p}{m}\left(|I|^2 - |R|^2\right), \qquad j_R = \frac{k}{m}|T|^2. \tag{1.28}$$
We will keep the energy $E$ fixed and consider three different regions of the potential strength.
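The matching conditions of eqn 1.27 and the currents of eqn 1.28 are easy to explore numerically. The following is a minimal sketch, not part of the original text: the function name `klein_step`, the unit incident amplitude $I = 1$ and the optional `k_sign` argument (for choosing the other sign of the square root) are assumptions made for illustration.

```python
import cmath
import math


def klein_step(E, V, m, k_sign=1):
    """Klein-Gordon step potential of height V (for s >= 0), incident amplitude I = 1.

    Returns (p, k, R, T, j_left, j_right) using eqn 1.27 and eqn 1.28.
    k_sign selects the sign in front of the square root for k
    (an illustrative parameter, not from the text).
    """
    p = math.sqrt(E * E - m * m)                    # E^2 = p^2 + m^2
    k = k_sign * cmath.sqrt((E - V) ** 2 - m * m)   # (E - V)^2 = k^2 + m^2
    T = 2 * p / (p + k)                             # eqn 1.27
    R = (p - k) / (p + k)                           # eqn 1.27
    j_left = (p / m) * (1 - abs(R) ** 2)            # eqn 1.28 with |I| = 1
    j_right = ((k / m) * abs(T) ** 2).real          # eqn 1.28, real part only
    return p, k, R, T, j_left, j_right


if __name__ == "__main__":
    m, E = 1.0, 2.0
    for label, V, sign in [("weak", 0.5, 1), ("moderate", 2.0, 1), ("strong", 5.0, -1)]:
        p, k, R, T, j_l, j_r = klein_step(E, V, m, k_sign=sign)
        print(label, "k =", k, "|R| =", abs(R), "j_L =", j_l, "j_R =", j_r)
```

In all three regions the current on the left equals the current on the right, which is a useful sanity check when trying other values of $E$, $V$ and $m$.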

The first region is that of a weak potential, $E > V + m$, where $k$ is real and $k < p$. The probability densities are
$$\rho_L = \frac{E}{m}|\psi_L|^2 > 0, \qquad \rho_R = \frac{E - V}{m}|\psi_R|^2 > 0$$
and this case looks like the non-relativistic one: nothing special, a small fraction of the incoming wave is reflected and the rest is transmitted.

In the second region, the potential is of moderate strength, $V - m < E < V + m$. Here $k$ is purely imaginary, so the wave inside the barrier is exponentially damped. The probability density inside the barrier can change from positive ($E - V > 0$, as in the first case of the weak potential) to negative:
$$E > V \Rightarrow \rho_R > 0 \quad\text{but if}\quad E < V \Rightarrow \rho_R < 0. \tag{1.29}$$
We will come back to this after discussing the case of the strong potential, $E < V - m$, when $k$ becomes purely real again (and, for a sufficiently strong potential, $k^2 > p^2$). The probability density, which became negative when the potential became strong enough in the previous case, stays negative and the probability current becomes real inside the barrier:
$$\rho_R = \frac{E - V}{m}|T|^2 < 0 \quad\text{and}\quad j_R = \frac{k}{m}|T|^2.$$
We will now consider the previously unphysical case of $k < 0$ because, in a counter-intuitive way, this case now corresponds to a particle moving to the right. To see that, we will calculate the group velocity
$$\mathcal{V}_g = \frac{\partial E}{\partial k} = \frac{k}{E - V} > 0.$$
A consequence of that is that $T$ and $R$ given by eqn 1.27 can be arbitrarily large, possibly making the reflected current bigger than the incoming one: the Klein paradox, another puzzle of RQM. The reason for this is that the very strong potential provides the energy needed to produce particle-antiparticle pairs. The probability current and the probability density within the barrier are negative because the created antiparticles (see eqn 1.25) are attracted to the barrier, moving to the right ($\mathcal{V}_g > 0$). The produced particles are repelled by the barrier, moving to the left and increasing the reflected probability current. Such a situation can be created by focusing light from a high power laser, making a very strong electric field which in turn produces electron-positron pairs from the vacuum. One can also think about the Hawking radiation in the neighbourhood of a black hole.

The fact that the probability density became negative already in the case of a moderate strength potential corresponds to the polarization of the vacuum by the creation of virtual particle-antiparticle pairs, which didn't influence the probability currents because there was not enough energy in the system to promote them to real particles. Looking for an analogy, one can probably think about the Lamb shift in atoms, where the vacuum polarization affects energy levels.

The problem of RQM is now clear: the formalism describes one particle (or a fixed number of particles) but physics needs many particles, the number of which can't be fixed; particles can be created and particles can be annihilated. One needs RQF to describe such physics. RQM can be used as long as the number of particles is fixed. Considering the limit of the uncertainty relation $\Delta p\,\Delta s \sim \hbar$, we see that the pair creation, which starts at $\Delta p \sim mc$, sets the limit on $\Delta s \sim \hbar/(mc)$. So as long as we are studying physics at a scale bigger than $\hbar/(mc)$, known as the Compton wavelength, RQM can be applied. Atomic physics is an example where this condition is usually fulfilled. But RQM can also be applied to many processes in high energy physics. A representative example could be the electron-positron annihilation producing hadrons; many hadrons. The fundamental process in that case is $e^+ + e^- \rightarrow q + \bar{q}$. The number of particles is 2 and is fixed, and the change from 2 leptons to 2 quarks can be handled by RQM.
The fragmentation of quarks into hadrons takes place on a different, much slower time scale and can therefore be separated from the fundamental process of the $e^+ + e^-$ annihilation.

Is there any limit on $\Delta p$? How well can one measure momentum? In non-relativistic quantum mechanics, momentum can be measured with any precision but due to the limit speed

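1.4 The Dirac equation

This is the main section of the book on RQM. First, we will consider different representations of the Dirac eqn, a probability current and bilinear covariants. Then, we will find wave functions describing free spin 1/2 particles like electrons, and discuss chirality and helicity operators before applying them to describe the Standard Model (SM) interactions in the limit when masses can be neglected. Electromagnetic interactions will then be introduced and the non-relativistic limit will be obtained.

Combining $\xi^\alpha$ and $\eta_{\dot\beta}$ into one Dirac spinor $\Psi = \begin{pmatrix}\xi\\ \eta\end{pmatrix}$, the Dirac eqn 1.12 can be written as
$$\begin{pmatrix} 0 & p_0 + \mathbf{p}\cdot\boldsymbol{\sigma} \\ p_0 - \mathbf{p}\cdot\boldsymbol{\sigma} & 0 \end{pmatrix}\Psi = m\Psi. \tag{1.30}$$
Instead of $\Psi$, we could use $\Psi' = U\Psi$, where $U$ is a unitary matrix. In the new basis, eqn 1.30 would look different. So, in general, the Dirac eqn can be written as
$$(\gamma p - m)\Psi = 0 \tag{1.31}$$
where
$$\gamma p \equiv \gamma^\mu p_\mu = p_0\gamma^0 - \mathbf{p}\cdot\boldsymbol{\gamma} = i\gamma^0\frac{\partial}{\partial t} + i\boldsymbol{\gamma}\cdot\nabla. \tag{1.32}$$
Comparing eqns 1.31 and 1.30 one can see that in the basis, or representation, which was used to get eqn 1.30,
$$\gamma^0 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \boldsymbol{\gamma} = \begin{pmatrix} 0 & -\boldsymbol{\sigma} \\ \boldsymbol{\sigma} & 0 \end{pmatrix}. \tag{1.33}$$
This representation is known as the Weyl or symmetric or chiral representation27. Multiplying eqn 1.31 by $\gamma p$ from the left, one gets representation independent constraints on the $\gamma$ matrices:
$$\gamma^\mu\gamma^\nu + \gamma^\nu\gamma^\mu = 2g^{\mu\nu}. \tag{1.34}$$
In particular, $(\gamma^0)^2 = 1$ and $(\gamma^1)^2 = (\gamma^2)^2 = (\gamma^3)^2 = -1$; under a change of representation, $\gamma' = U\gamma U^\dagger = U\gamma U^{-1}$. The matrix $\gamma^0$ is Hermitian and the matrices $\boldsymbol{\gamma}$ are anti-Hermitian28:
$$\gamma^{0\dagger} = \gamma^0, \qquad \boldsymbol{\gamma}^\dagger = -\boldsymbol{\gamma}. \tag{1.35}$$

27 In some books, $\xi^\alpha$ and $\eta_{\dot\beta}$ swap places, leading to different space-like $\gamma$ matrices; multiplied by $-1$.
28 In any representation.

Applying the Hermitian conjugation to eqn 1.31, using the properties of the $\gamma$ matrices given by eqn 1.35, and after some algebra29, one gets the adjoint Dirac equation
$$\bar\Psi(\gamma p + m) = 0 \tag{1.36}$$
where the adjoint spinor is
$$\bar\Psi \equiv \Psi^\dagger\gamma^0 \tag{1.37}$$
and $p$ acts on the left.

29 Eqn 1.31 is
$$i\gamma^0\frac{\partial\Psi}{\partial t} + i\gamma^k\frac{\partial\Psi}{\partial x^k} - m\Psi = 0.$$
Applying $\dagger$ one gets
$$i\frac{\partial\Psi^\dagger}{\partial t}\gamma^0 + i\frac{\partial\Psi^\dagger}{\partial x^k}(-\gamma^k) + m\Psi^\dagger = 0$$
and multiplying by $\gamma^0$ from the right, using eqn 1.34, gives
$$i\frac{\partial\bar\Psi}{\partial t}\gamma^0 + i\frac{\partial\bar\Psi}{\partial x^k}\gamma^k + m\bar\Psi = 0.$$

Multiplying eqn 1.31 by $\bar\Psi$ from the left and eqn 1.36 by $\Psi$ from the right and adding the resulting equations, one gets
$$\bar\Psi\gamma^\mu\partial_\mu\Psi + (\partial_\mu\bar\Psi)\gamma^\mu\Psi = \partial_\mu(\bar\Psi\gamma^\mu\Psi) = 0,$$
which is a continuity equation, $\partial_\mu j^\mu = 0$, for the probability current 4-vector
$$j^\mu = \bar\Psi\gamma^\mu\Psi. \tag{1.38}$$
The probability density
$$\rho \equiv j^0 = \bar\Psi\gamma^0\Psi = \sum_{i=1}^{4}|\Psi_i|^2 \tag{1.39}$$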

is the time-like component of the probability current; it is positive definite and it has a similar form to the non-relativistic expression.

One should note that $\bar\Psi$ is a natural partner to $\Psi$ for forming objects with well defined transformation properties, like the scalar $\bar\Psi\Psi$ or the probability current 4-vector; the so-called bilinear covariants. The matrix $\gamma^0$ which is sitting inside $\bar\Psi$ swaps the spinors inside the bispinor in such a way that the dotted spinor meets the dotted one and the undotted spinor meets the undotted one; as it should be, the dotted index is contracted with the dotted one and the undotted index is contracted with the undotted one.

Margin note: $(\xi^\alpha)^*$ transforms like a dotted spinor and $(\eta_{\dot\beta})^*$ transforms like an undotted spinor.

The Hamiltonian $H$ of the Dirac eqn is obtained by multiplying eqn 1.31 by $\gamma^0$ from the left and separating the time derivative:
$$H\Psi = i\frac{\partial\Psi}{\partial t}, \tag{1.40}$$
where
$$H = \boldsymbol{\alpha}\cdot\mathbf{p} + \beta m \tag{1.41}$$
and
$$\boldsymbol{\alpha} = \gamma^0\boldsymbol{\gamma}, \qquad \beta = \gamma^0. \tag{1.42}$$
These matrices satisfy $\alpha_i\alpha_j + \alpha_j\alpha_i = 2\delta_{ij}$, $\beta\boldsymbol{\alpha} + \boldsymbol{\alpha}\beta = 0$ and $\beta^2 = 1$. The matrices $\boldsymbol{\alpha}$ and $\beta$ are Hermitian and in the Weyl representation are given by
$$\boldsymbol{\alpha} = \begin{pmatrix} \boldsymbol{\sigma} & 0 \\ 0 & -\boldsymbol{\sigma} \end{pmatrix}, \qquad \beta = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}. \tag{1.43}$$
The Weyl representation is very well suited to the ultra-relativistic limit, when the mass can be neglected, because the Dirac bispinor is effectively reduced to the Weyl spinor; one spinorial component vanishes. In the non-relativistic limit, however, both spinor components of the Dirac bispinor contribute equally in the Weyl representation, and therefore in the non-relativistic limit another representation, called the standard or the Dirac representation, is more suitable (this representation is the most common in textbooks).

The transformation from the Weyl representation to the Dirac representation is a unitary transformation
$$U = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix},$$
which gives
$$\Psi(\text{Dirac}) = \begin{pmatrix}\varphi\\ \chi\end{pmatrix} = U\Psi(\text{Weyl}) = U\begin{pmatrix}\xi\\ \eta\end{pmatrix} = \frac{1}{\sqrt{2}}\begin{pmatrix}\xi + \eta\\ \xi - \eta\end{pmatrix}. \tag{1.44}$$
The transformation of the $\gamma$ matrices, $\gamma(\text{Dirac}) = U\gamma(\text{Weyl})U^{-1}$, gives
$$\gamma^0 = \beta = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad \boldsymbol{\gamma} = \begin{pmatrix} 0 & \boldsymbol{\sigma} \\ -\boldsymbol{\sigma} & 0 \end{pmatrix}, \qquad \boldsymbol{\alpha} = \begin{pmatrix} 0 & \boldsymbol{\sigma} \\ \boldsymbol{\sigma} & 0 \end{pmatrix}.$$
The Dirac eqn in the Dirac (standard) representation is then given by
$$E\varphi - \mathbf{p}\cdot\boldsymbol{\sigma}\chi = m\varphi$$
$$-E\chi + \mathbf{p}\cdot\boldsymbol{\sigma}\varphi = m\chi. \tag{1.45}$$
In the non-relativistic limit, $\chi \rightarrow 0$ and the Dirac spinor becomes effectively a two component Pauli spinor.

The fact that in the relativistic theory one needs two Weyl spinors to describe the electron, while in the non-relativistic world one Pauli spinor is enough to do the same, is always difficult to accept. To bring more insight into why it is like that, we will consider yet another representation: the Foldy–Wouthuysen (FW) representation. We start with the Dirac representation and apply a momentum dependent unitary transformation $U$ given by
$$U = \exp\left(\frac{1}{2}\,\frac{\beta\boldsymbol{\alpha}\cdot\mathbf{p}}{|\mathbf{p}|}\arctan\frac{|\mathbf{p}|}{m}\right).$$
The wave function in the FW representation is then
$$\Psi(FW) = \begin{pmatrix}u\\ w\end{pmatrix} = U\Psi(\text{Dirac}) = U\begin{pmatrix}\varphi\\ \chi\end{pmatrix}$$
and after the transformation of the Hamiltonian, eqn 1.41 splits into components, becoming
$$\sqrt{\mathbf{p}^2 + m^2}\;u = i\frac{\partial u}{\partial t}, \tag{1.46}$$
$$-\sqrt{\mathbf{p}^2 + m^2}\;w = i\frac{\partial w}{\partial t}. \tag{1.47}$$
So now we have two decoupled eqns, for positive and negative energy solutions respectively. But if we want to drop one of them and consider, say, only the one for the positive energies, then there is a problem, because we would not know how to transform the positive energy spinor without any knowledge of the other one. We only know how to transform $\xi^\alpha$ and $\eta_{\dot\beta}$, and they are buried inside the $u$ and $w$ spinors. We would need to get $\xi^\alpha$ and $\eta_{\dot\beta}$ out of $u$ and $w$, do the Lorentz transformation and put them back to get the Lorentz transformed $u$ and $w$. In the non-relativistic limit, however, $\sqrt{\mathbf{p}^2 + m^2} \simeq m + \frac{\mathbf{p}^2}{2m}$, which leads to the Schroedinger Hamiltonian, and the Lorentz transformation becomes the Galilean one, which doesn't affect spin. Therefore $\xi^\alpha$ and $\eta_{\dot\beta}$ transform the same way and we don't need to treat them separately. So, in the non-relativistic limit, we can just take one eqn, for example the one for the positive energy, and use it to describe a non-relativistic electron with its spinor $u$ being effectively the Pauli spinor; and we can "forget" about the negative energy solution.
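The algebra relating the Weyl and the Dirac representations can be verified with a few lines of code. The sketch below is an illustration added here, not part of the original derivation: it builds the Weyl matrices of eqn 1.33, checks the anticommutation relations of eqn 1.34, and applies the matrix $U$ of eqn 1.44. All helper names (`block`, `matmul`, `clifford_ok` and so on) are hypothetical, and only the Python standard library is assumed.

```python
import math

ZERO = [[0, 0], [0, 0]]
ONE = [[1, 0], [0, 1]]
SIGMA = [
    [[0, 1], [1, 0]],        # sigma_1
    [[0, -1j], [1j, 0]],     # sigma_2
    [[1, 0], [0, -1]],       # sigma_3
]
METRIC = [1, -1, -1, -1]     # diagonal of g^{mu nu}


def scale(s, a):
    return [[s * x for x in row] for row in a]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def matmul(a, b):
    n = len(b)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(len(b[0]))]
            for i in range(len(a))]


def block(a, b, c, d):
    """Assemble a 4x4 matrix from the 2x2 blocks [[a, b], [c, d]]."""
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]


def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def clifford_ok(gammas):
    """Check gamma^mu gamma^nu + gamma^nu gamma^mu = 2 g^{mu nu} (eqn 1.34)."""
    identity = block(ONE, ZERO, ZERO, ONE)
    for mu in range(4):
        for nu in range(4):
            anti = add(matmul(gammas[mu], gammas[nu]), matmul(gammas[nu], gammas[mu]))
            expected = scale(2 * METRIC[mu] if mu == nu else 0, identity)
            if not close(anti, expected):
                return False
    return True


# Weyl representation, eqn 1.33
gamma_weyl = [block(ZERO, ONE, ONE, ZERO)] + [
    block(ZERO, scale(-1, s), s, ZERO) for s in SIGMA
]

# U of eqn 1.44; it is real, symmetric and satisfies U U = 1, so U^{-1} = U
r = 1 / math.sqrt(2)
U = block(scale(r, ONE), scale(r, ONE), scale(r, ONE), scale(-r, ONE))
gamma_dirac = [matmul(matmul(U, g), U) for g in gamma_weyl]

if __name__ == "__main__":
    print("Weyl representation satisfies eqn 1.34:", clifford_ok(gamma_weyl))
    print("Dirac representation satisfies eqn 1.34:", clifford_ok(gamma_dirac))
```

Because eqn 1.34 is representation independent, the same check passes before and after the change of basis, while the explicit form of the matrices changes to the one quoted for the Dirac representation.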

Majorana particles

This is a small detour from the main track to introduce the concept of the Majorana particle. A neutrino might be the first fundamental particle of that type. From the discussion of spinors in section 1.1.1, one could get the impression that a massless spin 1/2 particle is described by the Weyl spinor and a massive one by the Dirac spinor, which has two independent Weyl spinors as components. Although all particles we know either fit or could fit this scenario, this is not the only scenario. One can start with the Weyl spinor, say an undotted spinor $\xi^\alpha$, get a dotted one by complex conjugation and then lower the index with the metric tensor $\epsilon$, getting a spinor which transforms like $\eta_{\dot\beta}$. Then one can insert these $\xi$ and $\eta$-like spinors into the Dirac spinor, getting the Dirac eqn in the Weyl representation for a massive particle, the Majorana particle, effectively defined by one Weyl spinor; not two independent Weyl spinors as for the electron.

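1.4.1 Free particle solutions

The Dirac eqn 1.45 in the Dirac (standard) representation can be written as
$$\begin{pmatrix} m & \mathbf{p}\cdot\boldsymbol{\sigma} \\ \mathbf{p}\cdot\boldsymbol{\sigma} & -m \end{pmatrix}\begin{pmatrix}\varphi\\ \chi\end{pmatrix} = E\begin{pmatrix}\varphi\\ \chi\end{pmatrix}$$
which for the electron at rest simplifies to
$$\begin{pmatrix} m & 0 \\ 0 & -m \end{pmatrix}\begin{pmatrix}\varphi\\ \chi\end{pmatrix} = i\frac{\partial}{\partial t}\begin{pmatrix}\varphi\\ \chi\end{pmatrix}.$$
There are 4 independent, not normalized solutions for energy eigenstates:
$$\begin{pmatrix}1\\0\\0\\0\end{pmatrix}\exp(-im\tau), \quad \begin{pmatrix}0\\1\\0\\0\end{pmatrix}\exp(-im\tau) \quad\text{for } E = m$$
and
$$\begin{pmatrix}0\\0\\1\\0\end{pmatrix}\exp(+im\tau), \quad \begin{pmatrix}0\\0\\0\\1\end{pmatrix}\exp(+im\tau) \quad\text{for } E = -m.$$
By boosting those solutions to the frame where $\mathbf{p} \neq 0$ one gets
$$u^{(s)}\exp(-i(E_pt - \mathbf{p}\cdot\mathbf{x})) = u^{(s)}\exp(-ip\cdot x) \quad\text{for } E > 0$$
and, for the $-E_p$ energy eigenstate and the $\mathbf{p}$ momentum eigenstate,
$$u^{(s+2)}\exp(+iE_pt)\exp(i\mathbf{p}\cdot\mathbf{x}) = u^{(s+2)}\exp(+i(E_pt + \mathbf{p}\cdot\mathbf{x})) \quad\text{for } E < 0$$
where
$$u^{(s)} = N\begin{pmatrix}\vartheta^{(s)}\\ \frac{\boldsymbol{\sigma}\cdot\mathbf{p}}{E_p + m}\vartheta^{(s)}\end{pmatrix}, \qquad u^{(s+2)} = N\begin{pmatrix}\frac{-\boldsymbol{\sigma}\cdot\mathbf{p}}{E_p + m}\vartheta^{(s)}\\ \vartheta^{(s)}\end{pmatrix}$$
and $N$ is the normalization constant,
$$E_p = +\sqrt{\mathbf{p}^2 + m^2}, \qquad \vartheta^{(1)} = \begin{pmatrix}1\\0\end{pmatrix}, \quad \vartheta^{(2)} = \begin{pmatrix}0\\1\end{pmatrix}, \qquad s = 1, 2.$$

Margin note: One can go back to the Weyl representation, boost $\xi$, eqn 1.15, and $\eta$, eqn 1.16, and return to the Dirac representation.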

As in the case of the K–G eqn, instead of the above two solutions for $E < 0$ energy eigenstates (in principle we could carry on with them, as done in a number of textbooks), we will follow the Feynman–Stueckelberg interpretation of negative energy states as antiparticles, which are equivalent to particles travelling backwards in time. This requires replacing the momentum $\mathbf{p}$ by $-\mathbf{p}$ and deciding which spin state to choose when going from a particle to an antiparticle (this last step will be addressed in a moment). This leads to the final choice of our 4 independent solutions of the Dirac eqn:
$$\Psi^+(x) = u^{(s)}\exp(-ip\cdot x) \quad\text{and}\quad \Psi^-(x) = v^{(s)}\exp(+ip\cdot x) \tag{1.48}$$
where $u^{(s)}$ is as above and
$$v^{(1)}(p) = u^{(4)}(-p) \quad\text{and}\quad v^{(2)}(p) = u^{(3)}(-p). \tag{1.49}$$
$\Psi^+$, describing a free electron, as well as $\Psi^-$, describing a free positron, is two-fold degenerate. We need an additional quantum number to label and distinguish those states with the same energy. A suitable operator, commuting with the free particle Hamiltonian, is the helicity operator
$$h(\mathbf{p}) = \begin{pmatrix}\frac{\boldsymbol{\sigma}\cdot\mathbf{p}}{|\mathbf{p}|} & 0\\ 0 & \frac{\boldsymbol{\sigma}\cdot\mathbf{p}}{|\mathbf{p}|}\end{pmatrix}. \tag{1.50}$$
In a convenient reference frame where $\mathbf{p} = (0, 0, p)$,
$$h(\mathbf{p}) = \begin{pmatrix}1 & 0 & 0 & 0\\ 0 & -1 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & -1\end{pmatrix} \quad\text{and, for example,}\quad u^{(1)} = N\begin{pmatrix}1\\ 0\\ \frac{p}{E_p + m}\\ 0\end{pmatrix},$$
thus $u^{(s)}$ (as well as $v^{(s)}$) are helicity eigenstates,
$$h(\mathbf{p})u^{(1)} = +u^{(1)} \quad\text{and}\quad h(\mathbf{p})u^{(2)} = -u^{(2)}.$$
Helicity eigenvalues correspond to the spin component in the direction of motion. The helicity $+$ means that, in its rest frame, the electron has $+1/2$ spin projection on the axis parallel to $\mathbf{p}$ (likewise for the helicity $-$ and $-1/2$ spin projection); $\vartheta^{(1)}$ and $\vartheta^{(2)}$ are also defined with respect to this axis, which has to be the $z$ axis by construction, given the chosen representation of the Pauli matrices. We can see now that changing $\mathbf{p}$ to $-\mathbf{p}$ changes the direction of the axis in the rest frame of the electron, and the spin projection changes sign as in eqn 1.49 (see the $s$ label)30.

30 See (9) for further reading.

As in the case of the K–G eqn, we will use the covariant normalization, requiring that the integral of the probability density over the unit volume gives $2E_p$ particles. This leads to the normalization constant $N = \sqrt{E_p + m}$.
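As a quick numerical cross-check of the statements above, one can build $u^{(1)}$ and $u^{(2)}$ for a momentum along the $z$ axis and verify that they solve the Dirac eqn with $E = +E_p$, that they are helicity eigenstates, and that $N = \sqrt{E_p + m}$ gives $u^\dagger u = 2E_p$. This is only a sketch under the stated assumption $\mathbf{p} = (0, 0, p)$; the function names are hypothetical.

```python
import math


def matvec(a, v):
    return [sum(a[i][j] * v[j] for j in range(len(v))) for i in range(len(a))]


def spinors_along_z(p, m):
    """Return (E_p, u1, u2) for p = (0, 0, p) with N = sqrt(E_p + m)."""
    E = math.sqrt(p * p + m * m)
    N = math.sqrt(E + m)
    ratio = p / (E + m)
    u1 = [N, 0.0, N * ratio, 0.0]     # theta(1) = (1, 0), sigma_z eigenvalue +1
    u2 = [0.0, N, 0.0, -N * ratio]    # theta(2) = (0, 1), sigma_z eigenvalue -1
    return E, u1, u2


def hamiltonian_along_z(p, m):
    """H = [[m, sigma.p], [sigma.p, -m]] in the Dirac representation, p along z."""
    return [
        [m, 0, p, 0],
        [0, m, 0, -p],
        [p, 0, -m, 0],
        [0, -p, 0, -m],
    ]


# Helicity operator of eqn 1.50 for p along z
HELICITY_Z = [
    [1, 0, 0, 0],
    [0, -1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, -1],
]

if __name__ == "__main__":
    E, u1, u2 = spinors_along_z(p=3.0, m=4.0)
    H = hamiltonian_along_z(3.0, 4.0)
    print("E_p =", E)
    print("H u1 =", matvec(H, u1), " E_p u1 =", [E * x for x in u1])
    print("h u2 =", matvec(HELICITY_Z, u2))
    print("u1^dagger u1 =", sum(x * x for x in u1))
```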

1.4.2 Chirality ≠ helicity

The chirality, also called the handedness, is defined by a pair of projection operators
$$P_L = \frac{1 - \gamma^5}{2} \quad\text{and}\quad P_R = \frac{1 + \gamma^5}{2} \quad\text{where}\quad \gamma^5 = i\gamma^0\gamma^1\gamma^2\gamma^3, \tag{1.51}$$
which satisfy $P_L + P_R = 1$, $P_LP_R = P_RP_L = 0$, $P_LP_L = P_L$ and $P_RP_R = P_R$. They divide the space of wave functions into right-handed and left-handed half-spaces. $P_R$ projects a wave function onto the right-handed half-space, giving the right-handed component of the wave function. $P_L$ does the same with respect to the left-handed half-space and the left-handed component of the wave function31.

31 $\Psi = P_R\Psi + P_L\Psi = \Psi_R + \Psi_L$.

Describing the Dirac spinor in the Dirac representation (ignoring the normalization) in terms of the Weyl spinors $\xi$ and $\eta$ of the Weyl representation, see eqn 1.44, one gets
$$\frac{1 - \gamma^5}{2}\begin{pmatrix}\xi + \eta\\ \xi - \eta\end{pmatrix} = \frac{1}{2}\begin{pmatrix}1 & -1\\ -1 & 1\end{pmatrix}\begin{pmatrix}\xi + \eta\\ \xi - \eta\end{pmatrix} = \begin{pmatrix}\eta\\ -\eta\end{pmatrix}$$
and
$$\frac{1 + \gamma^5}{2}\begin{pmatrix}\xi + \eta\\ \xi - \eta\end{pmatrix} = \frac{1}{2}\begin{pmatrix}1 & 1\\ 1 & 1\end{pmatrix}\begin{pmatrix}\xi + \eta\\ \xi - \eta\end{pmatrix} = \begin{pmatrix}\xi\\ \xi\end{pmatrix}.$$
Thus, we can define left-handed and right-handed spinors32 as
$$\eta_L \equiv \eta \quad\text{and}\quad \xi_R \equiv \xi. \tag{1.52}$$

32 The names left-handed and right-handed are very unfortunate historical remnants, hiding the different transformation properties of dotted and undotted Weyl spinors. We kept those handedness names away from the readers, emphasizing the transformation properties, for as long as we could.

So, the projection operators $P_L$ and $P_R$ pull out of the Dirac spinor its dotted and undotted components, respectively. This is important in particle physics because weak interactions are sensitive to those components. The W boson couples only to the left-handed part of a particle wave function and the Z boson couples to both parts but with a different strength.

In order to get a better insight, we will consider the chiral components of the Dirac spinor $u$:
$$\frac{1 - \gamma^5}{2}u = \frac{1 - \gamma^5}{2}\begin{pmatrix}\vartheta\\ \frac{\boldsymbol{\sigma}\cdot\mathbf{p}}{E_p + m}\vartheta\end{pmatrix} = \frac{1}{2}\begin{pmatrix}\left(1 - \frac{\boldsymbol{\sigma}\cdot\mathbf{p}}{E_p + m}\right)\vartheta\\ -\left(1 - \frac{\boldsymbol{\sigma}\cdot\mathbf{p}}{E_p + m}\right)\vartheta\end{pmatrix}$$
where $\vartheta = \vartheta_+ + \vartheta_-$ is a superposition of helicity $+$ ($\vartheta_+$) and helicity $-$ ($\vartheta_-$) eigenstates (not normalized). Because
$$\frac{\boldsymbol{\sigma}\cdot\mathbf{p}}{|\mathbf{p}|}\vartheta_+ = \vartheta_+ \quad\text{and}\quad \frac{\boldsymbol{\sigma}\cdot\mathbf{p}}{|\mathbf{p}|}\vartheta_- = -\vartheta_-,$$
therefore
$$\frac{\boldsymbol{\sigma}\cdot\mathbf{p}}{|\mathbf{p}|}\left(1 - \frac{\boldsymbol{\sigma}\cdot\mathbf{p}}{|\mathbf{p}|}\right)\vartheta_+ = 0, \qquad \frac{\boldsymbol{\sigma}\cdot\mathbf{p}}{|\mathbf{p}|}\left(1 - \frac{\boldsymbol{\sigma}\cdot\mathbf{p}}{|\mathbf{p}|}\right)\vartheta_- = -2\vartheta_-,$$
$$\frac{\boldsymbol{\sigma}\cdot\mathbf{p}}{|\mathbf{p}|}\left(1 + \frac{\boldsymbol{\sigma}\cdot\mathbf{p}}{|\mathbf{p}|}\right)\vartheta_+ = 2\vartheta_+, \qquad \frac{\boldsymbol{\sigma}\cdot\mathbf{p}}{|\mathbf{p}|}\left(1 + \frac{\boldsymbol{\sigma}\cdot\mathbf{p}}{|\mathbf{p}|}\right)\vartheta_- = 0,$$
and finally
$$\left(1 - \frac{\boldsymbol{\sigma}\cdot\mathbf{p}}{E_p + m}\right)\vartheta = \frac{a}{2}\left(1 + \frac{\boldsymbol{\sigma}\cdot\mathbf{p}}{|\mathbf{p}|}\right)\vartheta + \frac{b}{2}\left(1 - \frac{\boldsymbol{\sigma}\cdot\mathbf{p}}{|\mathbf{p}|}\right)\vartheta$$
where
$$a = 1 - \frac{|\mathbf{p}|}{E_p + m}, \qquad b = 1 + \frac{|\mathbf{p}|}{E_p + m},$$
gives a decomposition of the final state into helicity $+$ and helicity $-$ components, with weights $a/2$ and $b/2$ respectively. In summary,
$$\frac{1 - \gamma^5}{2}u = \frac{a}{2}(\text{helicity}\,+) + \frac{b}{2}(\text{helicity}\,-) = u_L; \text{ left chiral state}$$
and
$$\frac{1 + \gamma^5}{2}u = \frac{b}{2}(\text{helicity}\,+) + \frac{a}{2}(\text{helicity}\,-) = u_R; \text{ right chiral state.}$$
It should be clear now that chirality and helicity are two different things. However, in the limit
$$\text{speed} \rightarrow c \;\Rightarrow\; a \rightarrow 0 \text{ and } b \rightarrow 2,$$
chirality and helicity become identical, which is the subject of the next section. Before that, we will explore the consequences, relevant at energies where masses can't be neglected, of chirality and helicity being different; for example, affecting the weak decay rates of particles. As an example, we will consider the $\pi^- \rightarrow \mu^- + \bar\nu_\mu$ decay.

Fig. 1.4 $\pi^- \rightarrow \mu^- + \bar\nu_\mu$ decay.

In the rest frame of the $\pi^-$, the momenta of the $\mu^-$ and the $\bar\nu_\mu$ are back to back and the helicities are as indicated in fig 1.4. The $\bar\nu_\mu$ is in the helicity $+$ state33, with its spin along its momentum. The $\mu^-$ has to be in the helicity $+$ state to conserve the total angular momentum (the $\pi^-$ has no spin). But the $W^-$ couples to the left-handed component of the $\mu^-$ wave function and therefore the $\mu^-$ needs, simultaneously, to have helicity $+$ (to conserve the angular momentum) and left-handed chirality (for the $W^-$ to couple). The probability to get that is
$$\frac{|a|^2}{|a|^2 + |b|^2} = \frac{1}{2}\left(1 - \frac{\text{speed}}{c}\right).$$
We can see that for speed $\rightarrow c$ the decay rate tends to 0; the $\pi^-$ would not decay to a massless $\mu^-$. This explains why, although favoured by the available energy phase-space factor (not considered here), the decay rate of $\pi^- \rightarrow e^- + \bar\nu_e$ is smaller than that to $\mu^- + \bar\nu_\mu$.

33 We are not going to consider the helicity $-$ state, as such a state has never been observed experimentally. A neutrino with non-zero mass is either described by the Dirac spinor with two Weyl spinors contributing, but with one helicity state being sterile, i.e. not seen in weak interactions, or neutrinos are massive Majorana particles having only one helicity state.

1.4.3 Helicity conservation and interactions via currents

It is worth repeating the conclusion of the previous section: the chirality (handedness) is different from the helicity, and that has consequences at low energies, where masses cannot be neglected. At high energies, where masses can be neglected, the helicity and the chirality (handedness) can be treated as identical, and in many textbooks this is the working assumption right from the start.

It can be shown (see for example (8)) that in the probability current
$$j^\mu = \bar{u}\gamma^\mu u = (\bar{u}_L + \bar{u}_R)\gamma^\mu(u_L + u_R) = \bar{u}_L\gamma^\mu u_L + \bar{u}_R\gamma^\mu u_R$$
no cross terms like $\bar{u}_L\gamma^\mu u_R$ are present. In the high energy limit, where masses can be neglected,
$$\tfrac{1}{2}(1 - \gamma^5)u = u_L \simeq u_L^- \text{ is the helicity } - \text{ eigenstate},$$
$$\tfrac{1}{2}(1 + \gamma^5)u = u_R \simeq u_R^+ \text{ is the helicity } + \text{ eigenstate}$$
and therefore, in the probability current, there are no helicity mixing terms like $\bar{u}_L^-\gamma^\mu u_R^+$. The significance of that will be discussed in this section.

In classical electromagnetism, two parallel wires carrying electric currents interact with each other with a force proportional to the product of the currents. One can try a similar idea to describe the scattering of particles. Multiplying the probability current of a free electron by the electron's electric charge gives us a 4-vector with the time-like component representing the charge density and the space-like component representing the number of electric charges crossing unit area per unit time, i.e. the electric current corresponding to the moving electrons. If we now take that current and another one, for say a free muon, then we can expect that the dot product of these currents will have something to do with the electromagnetic interaction of these particles. Multiplying that by a propagator34 (which actually contains the $g_{\mu\nu}$ tensor for the above dot product) allows for the momentum transfer between the interacting particles and indeed gives the matrix element in the first approximation. That matrix element can be visualized by a Feynman diagram, see fig 1.5. It is a property of the Standard Model (SM) that interactions between any two SM fermions can be described, in a leading approximation, by the current–current interaction as outlined above; although in the case of the weak interactions, the left-handed and the right-handed parts of the current are treated separately, because the weak interactions couple to them differently.

34 For an introduction to propagators, see for example (8).
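The weights $a$ and $b$ of section 1.4.2 make the approach of chirality to helicity at high energy quantitative. The sketch below evaluates the probability $|a|^2/(|a|^2 + |b|^2)$ of finding the "wrong" helicity in a left chiral state for the charged lepton in a pion decay at rest. It is an illustration only: the function names are hypothetical, the neutrino is assumed to be massless, and the masses (in MeV) are approximate values inserted here, not taken from the text.

```python
import math

# Approximate masses in MeV (assumed values for illustration)
M_PION = 139.57
M_MUON = 105.66
M_ELECTRON = 0.511


def helicity_weights(p, m):
    """Return the weights a and b for momentum magnitude p and mass m."""
    E = math.sqrt(p * p + m * m)
    a = 1 - p / (E + m)
    b = 1 + p / (E + m)
    return a, b


def wrong_helicity_probability(p, m):
    """|a|^2 / (|a|^2 + |b|^2): helicity + content of a left chiral state."""
    a, b = helicity_weights(p, m)
    return a * a / (a * a + b * b)


def lepton_momentum(m_parent, m_lepton):
    """Momentum of the charged lepton in a two-body decay at rest,
    assuming the accompanying neutrino is massless."""
    return (m_parent ** 2 - m_lepton ** 2) / (2 * m_parent)


if __name__ == "__main__":
    for name, m in [("muon", M_MUON), ("electron", M_ELECTRON)]:
        p = lepton_momentum(M_PION, m)
        speed = p / math.sqrt(p * p + m * m)   # in units of c
        print(name, "speed/c =", speed,
              "probability =", wrong_helicity_probability(p, m))
```

The electron emerges much closer to the speed of light than the muon, so its probability is far smaller, in line with the suppression of $\pi^- \rightarrow e^- + \bar\nu_e$ described in section 1.4.2.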
Since in the probability current there are no helicity mixing terms like $\bar{u}_L^-\gamma^\mu u_R^+$, there are only two fundamental SM vertices, as illustrated in fig 1.6a, where $f$ stands for a fermion and the wiggly line of the exchanged particle represents any of the SM vector bosons: the photon, $W^+$, $W^-$, $Z^0$ or any of the eight gluons. The helicity is the same before and after the scattering (coupling to the exchanged particle). One says that the helicity is conserved, not changed by the interaction. The annihilation or pair creation vertices are obtained from the scattering ones by the so-called crossing symmetry, see for example (8) or (9). Keeping in mind that in fig 1.6 time is going from left to right, we "cross" the incoming particle to the other side of the reaction equation by inverting its 4-momentum and swapping the helicity state, so it travels backwards in time, representing the outgoing antiparticle with the opposite helicity travelling forward in time35, fig 1.6b:
$$\Psi^+(x) = u^{(1)}(p)\exp(-ip\cdot x) \rightarrow u^{(4)}(-p)\exp(+ip\cdot x) = v^{(1)}(p)\exp(+ip\cdot x).$$

Fig. 1.5 $e^- + \mu^- \rightarrow e^- + \mu^-$ Feynman diagram (the top diagram). The top diagram is in fact a sum of two diagrams. The vertical wiggly line representing the exchanged particle propagator is the sum of two scenarios, depending on which particle was the emitter and which one was the absorber of the exchanged particle (for example a photon or $Z^0$), as sketched below the main, top, diagram; time goes from left to right.

Fig. 1.6 Fundamental vertices of the SM in the high energy limit, where masses can be neglected.

35 The arrow on the fermion line indicates in which direction, with respect to the time arrow, the fermion specified by the label at the end of the line is travelling.

The crossing symmetry allows us to get the annihilation amplitude, for example $e^+ + e^- \rightarrow \mu^+ + \mu^-$, from the scattering one, here $e^- + \mu^- \rightarrow e^- + \mu^-$, by "crossing" the relevant particles to the other side of the reaction eqn: one changes in the scattering amplitude the corresponding 4-momenta $p \rightarrow -p$, the helicities and the spinors $u \rightarrow v$ and, what is beyond the formalism we are using, puts in "by hand" a minus sign in front of the whole amplitude (QFT is needed for that). This whole transformation also affects the description of the propagator. In the scattering amplitude we have, in the example considered, $p_1$ and $p_2$ 4-momenta for the incoming and outgoing electrons respectively, and therefore the 4-momentum of the exchanged boson is the difference $p_1 - p_2$, which dotted with itself gives $t \equiv (p_1 - p_2)\cdot(p_1 - p_2)$; for that reason the scattering is called the t-channel process. By changing $p_2 \rightarrow -p_2$, $t \rightarrow s \equiv (p_1 + p_2)\cdot(p_1 + p_2)$, which is the square of the centre-of-mass energy; the resulting annihilation reaction is therefore called the s-channel process36.

36 There is also a u-channel process, see for example (8).

Let's now consider $e^+ + e^- \rightarrow \gamma \rightarrow \mu^+ + \mu^-$, aiming at getting the angular distribution of the $\mu^-$ in the centre of mass system. There will be four annihilation amplitudes, corresponding to four different helicity configurations, contributing incoherently to the cross-section. Fig 1.7 shows a configuration of particles for one of them37.
Beam particles collide head on along the $z$ axis and the produced muons propagate along the $z'$ axis, chosen for simplicity of the muons' wave functions along that axis. In the shown configuration, the electron is in the helicity $+$ state, the positron is in the helicity $-$ state and the projection of the total angular momentum of the initial state configuration on the $z$ axis is $J_z = +1$. The final state, defined using the $z'$ axis, is in the $J_{z'} = -1$ configuration; the $\mu^-$ is in the helicity $-$ state and the $\mu^+$ in the helicity $+$ state.

37 At large centre of mass energies, a contribution from the, in principle possible, configuration in which the virtual photon has the projection of its spin on the $z$ axis equal to 0 can be neglected. Only $J_z = +1$ or $-1$ contribute and are considered here.

Fig. 1.7 $J_z = +1$ and $J_{z'} = -1$ configuration for $e^+ + e^- \rightarrow \mu^+ + \mu^-$.

The corresponding helicity amplitude
$$\mathcal{M}_{-1\,+1}\left(j_{\text{final}}(J_{z'} = -1),\, j_{\text{initial}}(J_z = +1)\right)$$
is obtained by taking the relevant probability currents and "crossing" them from the t-channel to the s-channel. In order to get the amplitude in one reference frame, one needs to transform one of the currents so that both have coordinates with respect to the same axes. One can get $j_{\text{final}}(J_{z'} = -1)$ by rotating $j_{\text{final}}(J_z = -1)$ by the angle $\theta$ about the $y$ axis. Since the only $\theta$ dependent part of the amplitude is the $d^1_{-1+1}(\theta)$ function, emerging there because of the rotation, the contribution to the angular distribution $\sigma(\theta)$ from this helicity amplitude is
$$|\mathcal{M}_{-1\,+1}|^2 \sim (d^1_{-1+1}(\theta))^2 = \frac{1}{4}(1 - \cos\theta)^2.$$
Since soon we will be considering different coupling constants (for weak interactions), before continuing with the other amplitudes we will define them following the Feynman diagram in fig 1.8: $g^{J_{pf}}_{f\bar{f}}(V)$ is the coupling between the fermions $f\bar{f}$ and the exchanged boson $V$, and $J_{pf}$ is the projection of the total angular momentum of the $f\bar{f}$ system on the axis along the momentum of $f$.

Fig. 1.8 $e^+ + e^- \rightarrow \mu^+ + \mu^-$ Feynman diagram.

The final cross section $\sigma(\theta)$ for the angular distribution of the $\mu^-$ produced in the $e^+ + e^- \rightarrow \gamma \rightarrow \mu^+ + \mu^-$ annihilation is the sum of contributions from four helicity amplitudes,
$$\sigma(\theta) \sim |\mathcal{M}_{-1\,+1}|^2 + |\mathcal{M}_{+1\,-1}|^2 + |\mathcal{M}_{+1\,+1}|^2 + |\mathcal{M}_{-1\,-1}|^2,$$
which gives
$$\sigma(\theta) \sim \left|g^{-1}_{e^+e^-}(\gamma)\,g^{+1}_{\mu^+\mu^-}(\gamma)\,d^1_{-1+1}(\theta)\right|^2 + \left|g^{+1}_{e^+e^-}(\gamma)\,g^{-1}_{\mu^+\mu^-}(\gamma)\,d^1_{+1-1}(\theta)\right|^2 + \left|g^{+1}_{e^+e^-}(\gamma)\,g^{+1}_{\mu^+\mu^-}(\gamma)\,d^1_{+1+1}(\theta)\right|^2 + \left|g^{-1}_{e^+e^-}(\gamma)\,g^{-1}_{\mu^+\mu^-}(\gamma)\,d^1_{-1-1}(\theta)\right|^2. \tag{1.53}$$
The photon coupling $g^{J_{pf}}_{f\bar{f}}(\gamma)$ doesn't depend on the helicity and is proportional38 to the electron's electric charge, and therefore
$$\sigma(\theta) \sim (1 + \cos\theta)^2 + (1 - \cos\theta)^2 \sim 1 + \cos^2\theta. \tag{1.54}$$
In fig 1.9 one can see that, indeed, the experimental data points follow that distribution, symmetric with respect to $\theta = \pi/2$, very well.

38 There are often some numerical factors used, for example to have the coupling in terms of the fine structure constant.

Fig. 1.9 The angular distribution of the $\mu^-$ produced in the $e^+ + e^- \rightarrow \mu^+ + \mu^-$ annihilation, measured at LEP.

Now we will consider the $u\bar{d}$ annihilation $u\bar{d} \rightarrow W^+ \rightarrow e^+ + \nu_e$ at the $u\bar{d}$ centre of mass energy equal to the $W^+$ mass, as in the CERN experiments where weak bosons were first observed. As the $W^+$ couples only to left-handed fermions, instead of taking the whole probability current we will take only the relevant part of it, by inserting the chirality projection operator $P_L = (1 - \gamma^5)/2$ into it:
$$j^\mu = \bar{u}\gamma^\mu u \rightarrow \bar{u}\gamma^\mu\frac{1 - \gamma^5}{2}u.$$
The net effect of that is that instead of the four helicity amplitudes of the electromagnetic interaction, we will get only one, $\mathcal{M}_{-1\,-1}$, for the weak charged current interaction39. The angular distribution of the $\nu_e$ is then
$$\sigma(\theta) \sim |\mathcal{M}_{-1\,-1}|^2 \sim (d^1_{-1-1}(\theta))^2 = \frac{1}{4}(1 + \cos\theta)^2.$$
Instead of the $\nu_e$ distribution one measures the $e^+$ distribution, but because one also changes the $z$ direction to follow the initial state anti-fermion momentum, the distribution is the same. As one can see in fig 1.10, there is an excellent agreement between the theory and the measurement. One should note that, in comparison to the symmetric distribution of the electromagnetic interaction, this one is not symmetric at all: no $e^+$ goes in the direction of the original $u$ quark.

39 The $Z^0$ gives the weak neutral current.

The $Z^0$ sits somewhere between the photon, giving the symmetric distribution, and the $W$, giving a very asymmetric one.
It couples to both chiral states but with a different strength; it couples more strongly to the left-handed particles than to the right-handed particles:
$$g^{J_{pf}}_{f\bar{f}}(V): \quad g^{-1}_{e^+e^-}(Z^0) > g^{+1}_{e^+e^-}(Z^0) \quad\text{and}\quad g^{-1}_{\mu^+\mu^-}(Z^0) > g^{+1}_{\mu^+\mu^-}(Z^0).$$

Fig. 1.10 The angular distribution of the $e^+$ produced in $u\bar{d} \rightarrow W^+ \rightarrow e^+ + \nu_e$.

In the reaction $e^+ + e^- \rightarrow Z^0 \rightarrow \mu^+ + \mu^-$, all four helicity amplitudes contribute as in the case of the photon, but with different strengths. As a result, instead of eqn 1.54, where the two terms with $\cos\theta$ enter with equal weight, we get
$$\sigma(\theta) \sim a(1 + \cos\theta)^2 + b(1 - \cos\theta)^2,$$
with different coefficients $a$ and $b$, which leads to an asymmetric angular distribution of the produced $\mu^-$ (with coupling dependent coefficients $A = a + b$ and $B = 2(a - b)$):
$$\sigma(\theta) \sim A(1 + \cos^2\theta) + B\cos\theta.$$
Fitting this distribution to the data, see fig 1.11, one gets a constraint on the $Z^0$ couplings. To get the couplings (assuming that those for electrons and muons are the same) one needs another constraint, and for that one usually takes the $Z^0$ line shape (with the same particles in the initial ($e^+ + e^-$) and the final ($\mu^+ + \mu^-$) state). Gluons, like the photon, couple to different helicity amplitudes with the same strength, and the corresponding angular distributions for quarks are symmetric. Instead of the couplings as defined in this section, one often uses their linear combinations, called vector and axial couplings.
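The angular distributions of this section can be reproduced numerically. The sketch below sums the four squared $d^1$ functions with equal weights to recover the $1 + \cos^2\theta$ shape of eqn 1.54, and computes a forward-backward asymmetry for the weighted form $a(1 + \cos\theta)^2 + b(1 - \cos\theta)^2$. The function names and the definition of the asymmetry as (forward minus backward) over (forward plus backward) in $\cos\theta$ are assumptions made for illustration.

```python
import math


def d1_same(theta):
    """|d^1_{+1+1}| = |d^1_{-1-1}| = (1 + cos theta) / 2."""
    return 0.5 * (1 + math.cos(theta))


def d1_flip(theta):
    """|d^1_{-1+1}| = |d^1_{+1-1}| = (1 - cos theta) / 2."""
    return 0.5 * (1 - math.cos(theta))


def photon_distribution(theta):
    """Equal-weight sum of the four helicity contributions (eqn 1.53 with equal couplings)."""
    return 2 * d1_same(theta) ** 2 + 2 * d1_flip(theta) ** 2


def weighted_distribution(cos_theta, a, b):
    """a (1 + cos theta)^2 + b (1 - cos theta)^2, as for Z0 exchange."""
    return a * (1 + cos_theta) ** 2 + b * (1 - cos_theta) ** 2


def forward_backward_asymmetry(a, b, steps=2000):
    """Midpoint-rule integration over cos(theta) in [0, 1] and [-1, 0]."""
    h = 1.0 / steps
    forward = sum(weighted_distribution((i + 0.5) * h, a, b) for i in range(steps)) * h
    backward = sum(weighted_distribution(-(i + 0.5) * h, a, b) for i in range(steps)) * h
    return (forward - backward) / (forward + backward)


if __name__ == "__main__":
    for theta in (0.0, math.pi / 3, math.pi / 2, math.pi):
        print(theta, photon_distribution(theta), 1 + math.cos(theta) ** 2)
    print("asymmetry for a = b:", forward_backward_asymmetry(1.0, 1.0))
    print("asymmetry for a = 1, b = 0.5:", forward_backward_asymmetry(1.0, 0.5))
```

Integrating the weighted form analytically gives an asymmetry of $\frac{3}{4}(a - b)/(a + b)$, so equal weights (photon or gluon exchange) give a symmetric distribution, while $b = 0$ (pure $W$-like coupling) gives the maximal value of 3/4.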

1.4.4 P, T and a comment on C

By its construction, the Dirac eqn is covariant with respect to Lorentz transformations and the space inversion. It is also covariant with respect to the time inversion. The space inversion $P(\mathbf{r} \rightarrow -\mathbf{r})$ and the time inversion $T(t \rightarrow -t)$ will be discussed in this section. A comment about the charge conjugation $C(\text{particle} \rightarrow \text{antiparticle})$ will be added at the end of this section.

Let's consider two observers and their reference frames $O$ and $O'$. They are describing the same system or physics process using their coordinates and wave functions, $\Psi(x)$ and $\Psi'(x')$, respectively. The coordinates are related by a linear coordinate transformation $(x^\nu)' = a^\nu_{\ \mu}x^\mu$, where $a^\nu_{\ \mu}$ could be the Lorentz transformation or the space or time inversion, for example. Correspondingly, the gamma matrices in the Dirac eqns used by both observers are related because of that transformation, but (after a lot of algebra) one can show40 that both observers can neglect the differences between their gamma matrices. The covariance of the Dirac eqn requires that the wave functions transform as41
$$\Psi'(x') = \Psi'(ax) = S(a)\Psi(x) = S(a)\Psi(a^{-1}x')$$
where $S(a)$ is a matrix, $S^{-1}(a)$ exists and $S^{-1}(a) = S(a^{-1})$. One says that the coordinate transformation $a$ induces a transformation $S(a)$ in the space of the wave functions; $\Psi'(x') = S(a)\Psi(x)$.

40 For proofs of all theorems in this section, please refer to (10) for example.
41 One can also show that if the wave functions transform that way, the Dirac eqn is covariant with respect to the considered coordinate transformation.

Fig. 1.11 The angular distribution of the $\mu^-$ produced in $e^+ + e^- \rightarrow Z^0 \rightarrow \mu^+ + \mu^-$.

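P ($\mathbf{r} \to -\mathbf{r}$)

The space inversion transformation is
\[
a = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}.
\]
We want to find $S(a)$ satisfying $\Psi'(x') = S(a)\Psi(x)$. Considering, for example, $\Psi^{+}(x) = u^{(1)}\exp(-ip\cdot x)$ and following the expectations that $\mathbf{r} \to -\mathbf{r}$ makes $\mathbf{r}' = -\mathbf{r}$, $\mathbf{p}' = -\mathbf{p}$ and $\boldsymbol{\sigma}' = \boldsymbol{\sigma}$ (the angular momentum is an axial vector), we obtain
\[
(\Psi^{+})'(x') =
\begin{pmatrix} \vartheta^{(1)} \\ \dfrac{\boldsymbol{\sigma}\cdot\mathbf{p}'}{E_p+m}\,\vartheta^{(1)} \end{pmatrix}
\exp\!\big(-i(E_p t - \mathbf{p}'\cdot\mathbf{x}')\big)
=
\begin{pmatrix} \vartheta^{(1)} \\ -\dfrac{\boldsymbol{\sigma}\cdot\mathbf{p}}{E_p+m}\,\vartheta^{(1)} \end{pmatrix}
\exp\!\big(-i(E_p t - \mathbf{p}\cdot\mathbf{x})\big)
= \gamma^0\,\Psi^{+}(x),
\]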

giving $S = \gamma^0$, up to a fixed phase, which is set to 1 by convention. We could get that result immediately by looking at the transformation properties of Weyl spinors, see eqn 1.9, and the transformation from the Weyl to the Dirac representation, eqn 1.44.

Eigenstates of the parity operator $S = \gamma^0$ are states of defined intrinsic parity. Positive energy states in the particle rest frame,
\[
u^{(1)}\exp(-im\tau) \quad\text{and}\quad u^{(2)}\exp(-im\tau),
\]
are eigenstates of $\gamma^0$ with the eigenvalue +1, i.e. they have intrinsic parity +1, but negative energy states
\[
v^{(1)}\exp(+im\tau) \quad\text{and}\quad v^{(2)}\exp(+im\tau)
\]
are eigenstates with the eigenvalue −1, thus having intrinsic parity −1. We can state then that the intrinsic parity of a spin 1/2 particle (defined in its rest frame) is the negative of the intrinsic parity of its antiparticle. This applies to higher spin fermions as well. Free particle states with $\mathbf{p} \neq 0$ are not eigenstates of the parity operator $\gamma^0$.

T ($t \to -t$)

For the time inversion
\[
a = \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.
\]
The derivation of $S(a)$ is more complicated than that for the space inversion. We will only consider one aspect of it and then give the result. From classical physics we expect that $t \to -t$ makes $\mathbf{r}' = \mathbf{r}$, $\mathbf{p}' = -\mathbf{p}$, $t' = -t$ and therefore
\[
\exp\!\big(-i(E_p t' - \mathbf{p}'\cdot\mathbf{x}')\big) = \exp\!\big(-i(-E_p t + \mathbf{p}\cdot\mathbf{x})\big) = \Big(\exp\!\big(-i(E_p t - \mathbf{p}\cdot\mathbf{x})\big)\Big)^{*},
\]
so there is complex conjugation involved. The exact derivation gives $\Psi'(t') = S(a)\Psi(t) = i\gamma^1\gamma^3\Psi(t)^{*}$, again up to an arbitrary fixed phase which, by convention, is taken to be 1. It is useful to consider how $S(T) \equiv S(a)$ acts on a free particle state, for example
\[
S(T)
\begin{pmatrix} \vartheta^{(1)} \\ \dfrac{\boldsymbol{\sigma}\cdot\mathbf{p}}{E_p+m}\,\vartheta^{(1)} \end{pmatrix}
\exp\!\big(-i(E_p t - \mathbf{p}\cdot\mathbf{x})\big)
= -i
\begin{pmatrix} \vartheta^{(2)} \\ \dfrac{\boldsymbol{\sigma}\cdot\mathbf{p}'}{E_p+m}\,\vartheta^{(2)} \end{pmatrix}
\exp\!\big(-i(E_p t' - \mathbf{p}'\cdot\mathbf{x}')\big).
\]

One should note that $\vartheta^{(1)} \to \vartheta^{(2)}$, but that doesn't mean that the helicity flips, because the direction of the momentum changes as well, so a positive helicity state remains a positive helicity state in the time reversed system. $S(T)$ is antiunitary, $S^2(T) = -1$, and this has interesting consequences. If we have interactions which are invariant with respect to the time inversion then $S(T)$ commutes with the Hamiltonian. If $\Psi$ is an eigenstate of the Hamiltonian then $S(T)\Psi$ is also an eigenstate of the Hamiltonian with the same energy. But if
\[
S(T)\Psi = \zeta\Psi,
\]
where $\zeta$ is a phase, then
\[
S(T)S(T)\Psi = S(T)\zeta\Psi = \zeta^{*}S(T)\Psi = \zeta^{*}\zeta\Psi = \Psi,
\]
in contradiction to $S^2(T) = -1$. For that reason $S(T)\Psi$ is a different state than $\Psi$; there is at least a two-fold degeneracy, known as the Kramers degeneracy. A spin 1/2 particle has a natural two-fold degeneracy due to the two spin projections (2j + 1 states). A static magnetic field can lift that degeneracy by coupling to the particle's magnetic dipole moment, but a static magnetic field is not invariant with respect to the time inversion, so nothing is wrong with having non-degenerate states in the magnetic field. The situation is different if that particle is put into a static electric field, which is invariant with respect to the time inversion. If the particle had an electric dipole moment, the electric field would couple to it and would lift the 2j + 1 degeneracy, shifting the energy levels up or down depending on the electric dipole orientation. The Kramers degeneracy forbids that, because the shifted energy states are required to be at least two-fold degenerate. So unless there is an extra degree of freedom to guarantee that, electric dipole moments are forbidden42 by the Kramers degeneracy.

42 We know that a molecule of water, for example, has a large electric dipole moment, but in molecular or atomic systems there are extra energy degenerate (or nearly degenerate) states which allow for that, see for example (11).

In a series of beautiful experiments, pioneered by N.F. Ramsey (12), using a beam of ultra-cold neutrons coming from a nuclear reactor at the ILL in Grenoble, the neutron electric dipole moment d was measured to be smaller than $2.9 \times 10^{-26}$ e cm at 90% confidence level. One should note that the SM predicts a very tiny violation of the time inversion symmetry, leading to a prediction of a very tiny d, much smaller than the above limit. A number of extensions of the SM were rejected because they predicted d to be larger than the current limit.
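The algebraic statements about P and T made in this section can be checked numerically. The sketch below is only an illustration under stated assumptions: it uses the standard Dirac representation of the gamma matrices, takes $\vartheta^{(1)} = (1, 0)$ and $\vartheta^{(2)} = (0, 1)$, drops the spinor normalisation and the plane-wave factor, and the helper names (`parity`, `time_reversal`, `u_spinor`) are hypothetical.

```python
import numpy as np

# Pauli matrices and gamma matrices in the Dirac representation
# (an assumed, standard choice with gamma^0 = diag(1, 1, -1, -1)).
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
SIGMA = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
GAMMA0 = np.block([[I2, Z2], [Z2, -I2]])
GAMMA = [np.block([[Z2, s], [-s, Z2]]) for s in SIGMA]  # gamma^1, gamma^2, gamma^3


def parity(psi):
    """S(P) psi = gamma^0 psi."""
    return GAMMA0 @ psi


def time_reversal(psi):
    """S(T) psi = i gamma^1 gamma^3 psi^* (antiunitary: complex conjugation included)."""
    return 1j * GAMMA[0] @ GAMMA[2] @ np.conj(psi)


def u_spinor(theta, p, m):
    """Unnormalised positive energy spinor (theta, sigma.p / (E_p + m) theta)."""
    energy = np.sqrt(m**2 + np.dot(p, p))
    sigma_p = sum(pi * s for pi, s in zip(p, SIGMA))
    return np.concatenate([theta, sigma_p @ theta / (energy + m)])


theta1 = np.array([1, 0], dtype=complex)
theta2 = np.array([0, 1], dtype=complex)
p = np.array([0.3, -0.4, 1.2])
m = 1.0

# Rest-frame spinors: u has only upper components, v only lower ones.
u_rest = np.array([1, 0, 0, 0], dtype=complex)
v_rest = np.array([0, 0, 1, 0], dtype=complex)

# A generic complex spinor for testing S(T)^2 = -1.
rng = np.random.default_rng(0)
psi = rng.normal(size=4) + 1j * rng.normal(size=4)

print("u has parity +1:", np.allclose(parity(u_rest), u_rest))
print("v has parity -1:", np.allclose(parity(v_rest), -v_rest))
print("P flips p:", np.allclose(parity(u_spinor(theta1, p, m)), u_spinor(theta1, -p, m)))
print("T maps theta1 -> theta2, p -> -p:",
      np.allclose(time_reversal(u_spinor(theta1, p, m)), -1j * u_spinor(theta2, -p, m)))
print("S(T)^2 = -1:", np.allclose(time_reversal(time_reversal(psi)), -psi))
```

The last check is the property behind the Kramers degeneracy: applying the antiunitary map twice returns minus the original spinor, so no spinor can be mapped onto a multiple of itself.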

A comment on C

For every particle, there is a corresponding antiparticle (and vice versa). This symmetry of nature is called the charge conjugation. It has nothing to do with the Lorentz invariance and is the same in all inertial frames. The charge conjugation leads to a unitary operator C in RQF (all states have positive energies in RQF). In RQM one can construct an antiunitary operator transforming a positive energy state into a negative energy state (u to v spinor). Because it is antiunitary, i.e. different from the common unitary charge conjugation operator of RQF, we will not spend time deriving it, to avoid possible confusion.

We will however briefly outline basic properties of the charge conjugation, see for example (13). C changes the sign of the electric charge, magnetic moment, baryon number and lepton number. Dynamical quantities like the energy, momentum and helicity are left unchanged. Except for weak interactions, all the other interactions obey the charge conjugation symmetry. Eigenstates of the charge conjugation are neutral particles like the photon, π0, η, ρ0. Eigenvalues are +1 or −1. For the photon it is −1 and therefore it is +1 for the π0, which decays as π0 → γγ. The charge conjugation symmetry forbids the π0 to decay to three photons. For particles built out of two fermions, $f\bar{f}$, the eigenvalue is $(-1)^{l+S}$, where S is the total spin of the $f\bar{f}$ and l is the relative orbital angular momentum of the fermions. The e+e− bound state decays to two photons if it is in the state with S = 0, l = 0 and decays to three photons if it is in the state with S = 1, l = 0.

In conclusion, we note that the weak interaction violates the symmetry of each of the three discrete transformations P, T, C, as well as any superposition of any two of them. For all other interactions, each of the three discrete transformations is a symmetry of the interaction. But all known interactions, including the weak interaction, obey CPT symmetry. In RQF it is impossible to construct Lorentz-invariant interactions which would violate the CPT symmetry.
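The photon-counting argument above can be written as a few lines of arithmetic. This is a minimal sketch, assuming only the rules quoted in the text (C = −1 per photon and $(-1)^{l+S}$ for a fermion–antifermion state); the function names are hypothetical.

```python
def c_parity_fermion_pair(l, s):
    """Charge conjugation eigenvalue (-1)**(l + S) of a fermion-antifermion state."""
    return (-1) ** (l + s)


def c_parity_photons(n):
    """C eigenvalue of an n-photon state: each photon contributes a factor -1."""
    return (-1) ** n


def allowed_photon_counts(l, s, max_photons=4):
    """Photon multiplicities allowed by C conservation for an f-fbar state.

    The count starts at 2 because a state at rest cannot decay to a single
    photon (energy-momentum conservation).
    """
    c = c_parity_fermion_pair(l, s)
    return [n for n in range(2, max_photons + 1) if c_parity_photons(n) == c]


print("S = 0, l = 0:", allowed_photon_counts(0, 0))
print("S = 1, l = 0:", allowed_photon_counts(0, 1))
```

The lowest allowed multiplicity dominates in practice, which gives two photons for the S = 0 state and three for the S = 1 state, as stated above.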

1.4.5 Electromagnetic interactions and the non-relativistic limit

So far we were considering the free particle Dirac eqn. The interaction of the electron (or indeed any electrically charged spin 1/2 structureless fermion) with a classical electromagnetic field is introduced through the so-called "minimal coupling", following the prescription by which the electromagnetic interactions were introduced in classical mechanics. The physics of this prescription will be discussed in section 1.5.

The classical electromagnetic field is given by a 4-vector potential $A^\mu = (\phi, \mathbf{A})$. The interaction is introduced into the Dirac eqn by modifications of the derivatives: $i\frac{\partial}{\partial t} \to i\frac{\partial}{\partial t} - q\phi$ and $-i\nabla \to -i\nabla - q\mathbf{A}$, where $q < 0$ is the electron's charge. The Dirac eqn then becomes
\[
\left(i\frac{\partial}{\partial t} - q\phi\right)\Psi(\mathbf{r},t) = \big(\boldsymbol{\alpha}\cdot(-i\nabla - q\mathbf{A}) + \beta m\big)\Psi(\mathbf{r},t).
\]
For solutions of this eqn, for some cases of the electromagnetic field, see for example (14).

We will now get the non-relativistic limit of this eqn by applying an iterative method of approximations. At low energies, the mass m of a particle is the main part of the energy, so we will factor that part out of the wave function:
\[
\Psi(\mathbf{r},t) = \exp(-imt)\begin{pmatrix} \psi_U(\mathbf{r},t) \\ \psi_L(\mathbf{r},t) \end{pmatrix};
\]
$\psi_U(\mathbf{r},t)$ and $\psi_L(\mathbf{r},t)$ are known as the upper and lower components of the Dirac spinor. They contain all "non-rest" energy information; all of what is relevant in the non-relativistic limit. After inserting $\Psi$ into the Dirac eqn,
\[
i\frac{\partial\psi_U}{\partial t} = \boldsymbol{\sigma}\cdot(\mathbf{p} - q\mathbf{A})\,\psi_L + q\phi\,\psi_U \tag{1.55}
\]
and
\[
i\frac{\partial\psi_L}{\partial t} = \boldsymbol{\sigma}\cdot(\mathbf{p} - q\mathbf{A})\,\psi_U + q\phi\,\psi_L - 2m\,\psi_L. \tag{1.56}
\]
From eqn 1.56,
\[
\psi_L = \frac{\boldsymbol{\sigma}\cdot(\mathbf{p} - q\mathbf{A})}{2m}\,\psi_U - \frac{i\frac{\partial}{\partial t} - q\phi}{2m}\,\psi_L. \tag{1.57}
\]
Because m is relatively large, $\psi_L$ is then small relative to $\psi_U$. In the 1st order approximation, we neglect the last term in eqn 1.57 and then take
\[
\psi_L \simeq \psi_{L1} = \frac{\boldsymbol{\sigma}\cdot(\mathbf{p} - q\mathbf{A})}{2m}\,\psi_U
\]
and insert that into eqn 1.55 to get the 1st order approximation $\psi_{U1}$ for $\psi_U$:
\[
i\frac{\partial\psi_{U1}}{\partial t} = q\phi\,\psi_{U1} + \frac{\big(\boldsymbol{\sigma}\cdot(\mathbf{p} - q\mathbf{A})\big)\big(\boldsymbol{\sigma}\cdot(\mathbf{p} - q\mathbf{A})\big)}{2m}\,\psi_{U1}.
\]
Using vector identities43 this eqn can be written as
\[
i\frac{\partial\psi_{U1}}{\partial t} = q\phi\,\psi_{U1} + \frac{(\mathbf{p} - q\mathbf{A})^2}{2m}\,\psi_{U1} - \frac{q}{2m}\,\boldsymbol{\sigma}\cdot(\nabla\times\mathbf{A} + \mathbf{A}\times\nabla)\,\psi_{U1}
\]
and finally as
\[
i\frac{\partial\psi_{U1}}{\partial t} = H_P\,\psi_{U1},
\]
where
\[
H_P \equiv q\phi + \frac{(\mathbf{p} - q\mathbf{A})^2}{2m} - \frac{q}{2m}\,\boldsymbol{\sigma}\cdot\mathbf{B}
\]
is the Pauli Hamiltonian of the Schroedinger QM and $\mathbf{B}$ is the magnetic field.

43 $(\boldsymbol{\sigma}\cdot\mathbf{A})(\boldsymbol{\sigma}\cdot\mathbf{A}) = \mathbf{A}\cdot\mathbf{A} + i\boldsymbol{\sigma}\cdot(\mathbf{A}\times\mathbf{A})$ and $(\nabla\times\mathbf{A} + \mathbf{A}\times\nabla)f = f(\nabla\times\mathbf{A}) = f\mathbf{B}$.

Since the Hamiltonian of the interaction of a magnetic dipole moment $\boldsymbol{\mu}$ with an external magnetic field $\mathbf{B}$ is $-\boldsymbol{\mu}\cdot\mathbf{B}$, we can identify $\frac{q}{2m}\boldsymbol{\sigma}$ with the electron magnetic dipole moment $\boldsymbol{\mu}$. On the other hand, the electron magnetic dipole moment is related to the electron spin as $\boldsymbol{\mu} = -g\,\frac{1}{2}\boldsymbol{\sigma}\,\mu_B$, where g is the proportionality factor and $\mu_B = \frac{|q|}{2m}$ is the Bohr magneton (q is the electron charge). From that we get g = 2, which was a triumph for Dirac and his eqn. The value of g was known from atomic physics but, until the Dirac eqn, there was no explanation why the experimental value was as it was. In fact the value of g, for the electron as well as for the muon, is not exactly 2. It differs from 2 a little. That small difference is explained by RQF. The value of (g − 2)/2 is measured with fantastic precision; $2.8 \times 10^{-13}$ for the electron, (15), and $6 \times 10^{-10}$ for the muon, (16). As in the case of the precision measurements of the neutron electric dipole moment, the precision of the measurements and of the theoretical calculations allows for tests of extensions of the SM. One should note here that the magnetic dipole moment of the proton is quite different from what one gets from the Dirac eqn.
The g value is about 5.6 instead of about 2 (using the proper magneton for the proton). For the neutron it is even more surprising: instead of 0, as the electric charge is 0, g is about −3.8. Those significant deviations from the expectations coming from the Dirac eqn, applicable to point-like particles, were the first indications that protons and neutrons are not point-like particles; they have substructure, quarks and gluons. One can modify the Dirac eqn, adding an extra term, consistent with all invariance principles, to account for the magnetic moment being different from what one gets for g = 2.

We can carry on the iterative procedure and get relativistic corrections beyond the Pauli Hamiltonian. The next one would be obtained by substituting $\psi_{L1}$ for $\psi_L$ in the last term of eqn 1.57, thus getting the 2nd order approximation:

\[
\psi_L \simeq \psi_{L2} = \frac{\boldsymbol{\sigma}\cdot(\mathbf{p} - q\mathbf{A})}{2m}\,\psi_U - \frac{\left(i\frac{\partial}{\partial t} - q\phi\right)\big(\boldsymbol{\sigma}\cdot(\mathbf{p} - q\mathbf{A})\big)}{4m^2}\,\psi_U.
\]
Unfortunately, after inserting that into eqn 1.55 one gets a Hamiltonian which is not Hermitian, and the electron acquires an imaginary electric dipole moment. Formal problems of this type took many years, after Dirac published his eqn, to solve. Eventually, one gets the second order Hermitian Hamiltonian
\[
\begin{aligned}
H = {} & \frac{(\mathbf{p} - q\mathbf{A})^2}{2m} - \frac{\mathbf{p}^4}{8m^3} \\
& + q\phi - \frac{q}{2m}\,\boldsymbol{\sigma}\cdot\mathbf{B} \\
& - \frac{iq}{8m^2}\,\mathbf{p}\cdot\mathbf{E} \\
& - \left(\frac{iq}{8m^2}\,\boldsymbol{\sigma}\cdot(\nabla\times\mathbf{E}) + \frac{q}{4m^2}\,\boldsymbol{\sigma}\cdot(\mathbf{E}\times\mathbf{p})\right).
\end{aligned}
\]
The first line in the Hamiltonian expression represents the kinetic term with its first relativistic correction, the third, the Darwin term and the fourth, the spin-orbit interaction; $\mathbf{E}$ is the electric field. For $\mathbf{E} = -\nabla\phi$ and a spherically symmetric potential $\phi$, $\boldsymbol{\sigma}\cdot(\nabla\times\mathbf{E}) = 0$, and the spin-orbit term takes the familiar form
\[
H_{SO} = -\frac{q}{4m^2}\,\boldsymbol{\sigma}\cdot(\mathbf{E}\times\mathbf{p}) = \frac{q}{4m^2}\,\boldsymbol{\sigma}\cdot\left(\frac{\partial\phi}{\partial r}\,\frac{\mathbf{r}}{r}\times\mathbf{p}\right) = \frac{q}{4m^2}\,\frac{1}{r}\frac{\partial\phi}{\partial r}\,\boldsymbol{\sigma}\cdot\mathbf{l}.
\]
One should note that the Thomas precession is automatically included, as it should be in the relativistic formalism.

Historically, the first formally successful derivation of the non-relativistic limit of the Dirac eqn, which had the interactions with the classical electromagnetic field included, was obtained via the F–W transformation. It is interesting that one has to introduce the interaction first into the Dirac eqn in the Weyl or Dirac representation (or any other representation related to them by a transformation which does not depend on the momentum) and only then do the F–W transformation. One could think that doing the F–W transformation for the free particle Dirac eqn first, thus decoupling the lower and upper components of the Dirac spinor, and then introducing the interactions via the "minimal coupling" would work; but it doesn't.
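The step from the squared $\boldsymbol{\sigma}\cdot(\mathbf{p} - q\mathbf{A})$ term to the Pauli Hamiltonian rests on the Pauli matrix identity $(\boldsymbol{\sigma}\cdot\mathbf{a})(\boldsymbol{\sigma}\cdot\mathbf{b}) = \mathbf{a}\cdot\mathbf{b} + i\boldsymbol{\sigma}\cdot(\mathbf{a}\times\mathbf{b})$. The sketch below checks it numerically for ordinary (commuting) real vectors, which is a simplification of the operator case used in the text, and then repeats the g = 2 identification as arithmetic. The function names and the sample values are assumptions made for the illustration.

```python
import numpy as np

# The three Pauli matrices stacked into one array of shape (3, 2, 2).
SIGMA = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)


def sigma_dot(a):
    """Return the 2x2 matrix sigma . a for a real 3-vector a."""
    return np.einsum("i,ijk->jk", np.asarray(a, dtype=complex), SIGMA)


def pauli_identity_residual(a, b):
    """Largest element of (sigma.a)(sigma.b) - [a.b 1 + i sigma.(a x b)]."""
    lhs = sigma_dot(a) @ sigma_dot(b)
    rhs = np.dot(a, b) * np.eye(2) + 1j * sigma_dot(np.cross(a, b))
    return np.max(np.abs(lhs - rhs))


def g_factor(q, m):
    """Solve (q / 2m) sigma = -g (1/2) sigma mu_B for g, with mu_B = |q| / 2m."""
    mu_bohr = abs(q) / (2 * m)
    return -2 * (q / (2 * m)) / mu_bohr


a = np.array([0.2, -1.3, 0.7])
b = np.array([1.1, 0.4, -0.5])
print("identity residual:", pauli_identity_residual(a, b))
print("g for q < 0:", g_factor(-1.0, 1.0))
```

For two different vectors the cross product term survives, which is the origin of the $\boldsymbol{\sigma}\cdot\mathbf{B}$ term once $\mathbf{p}$ and $\mathbf{A}$ fail to commute.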

1.5 Gauge symmetry

By considering transformations of space-time one gets a relativistic description of free particles. In order to describe their interactions, one needs to consider transformations, gauge transformations, in another space, an internal space. Symmetries of the gauge transformations in that internal space are at the heart of the SM. Before discussing gauge symmetries, we need to go first through three revision sections on the main three ingredients to the formalism. Here they are:

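1.5.1 Covariant derivative

Let's consider two coordinate systems as sketched in fig 1.12. The spherical basis vectors are related to the Cartesian ones in the following way:
\[
\mathbf{e}_r = \frac{\partial x}{\partial r}\,\mathbf{e}_x + \frac{\partial y}{\partial r}\,\mathbf{e}_y = \cos\varphi\,\mathbf{e}_x + \sin\varphi\,\mathbf{e}_y, \qquad |\mathbf{e}_r| = 1,
\]
\[
\mathbf{e}_\varphi = \frac{\partial x}{\partial\varphi}\,\mathbf{e}_x + \frac{\partial y}{\partial\varphi}\,\mathbf{e}_y = -r\sin\varphi\,\mathbf{e}_x + r\cos\varphi\,\mathbf{e}_y, \qquad |\mathbf{e}_\varphi| = r.
\]

Fig. 1.12 Cartesian and spherical coordinate systems.

Let's now describe a constant vector field $\mathbf{a}$ of unit length, as sketched in fig 1.13, using both coordinate systems. In the Cartesian basis
\[
\mathbf{a} = 1\,\mathbf{e}_x + 0\,\mathbf{e}_y = a^x\,\mathbf{e}_x + a^y\,\mathbf{e}_y
\]
and in the spherical one
\[
\mathbf{a} = \cos\varphi\,\mathbf{e}_r - \frac{1}{r}\sin\varphi\,\mathbf{e}_\varphi = a^r\,\mathbf{e}_r + a^\varphi\,\mathbf{e}_\varphi.
\]

Fig. 1.13 A constant vector field a.

While
\[
\frac{\partial a^x}{\partial x} = \frac{\partial a^x}{\partial y} = \frac{\partial a^y}{\partial x} = \frac{\partial a^y}{\partial y} = 0,
\]
we find, for example,
\[
\frac{\partial a^r}{\partial\varphi} = -\sin\varphi \neq 0 \quad\text{and}\quad \frac{\partial a^\varphi}{\partial\varphi} = -\frac{1}{r}\cos\varphi \neq 0,
\]
which looks wrong because the field is constant; nothing is changing from one point to another. It is wrong because the differentiation has not taken into account that the spherical basis vectors are changing from one space point to another. If we take this into account, differentiating not only coordinates but also basis vectors, everything is fine. For example
\[
\begin{aligned}
\frac{\partial\mathbf{a}}{\partial\varphi} &= \frac{\partial}{\partial\varphi}\left(\cos\varphi\,\mathbf{e}_r - \frac{1}{r}\sin\varphi\,\mathbf{e}_\varphi\right) \\
&= \frac{\partial}{\partial\varphi}(\cos\varphi)\,\mathbf{e}_r + \cos\varphi\,\frac{\partial}{\partial\varphi}(\mathbf{e}_r) + \frac{\partial}{\partial\varphi}\left(-\frac{1}{r}\sin\varphi\right)\mathbf{e}_\varphi - \frac{1}{r}\sin\varphi\,\frac{\partial}{\partial\varphi}(\mathbf{e}_\varphi) \\
&= -\sin\varphi\,\mathbf{e}_r + \cos\varphi\left(\frac{1}{r}\,\mathbf{e}_\varphi\right) - \frac{1}{r}\cos\varphi\,\mathbf{e}_\varphi - \frac{1}{r}\sin\varphi\,(-r\,\mathbf{e}_r) = 0.
\end{aligned}
\]
So in general, for a vector $\mathbf{V}$,
\[
\frac{\partial\mathbf{V}}{\partial x^\beta} = \frac{\partial V^\alpha}{\partial x^\beta}\,\mathbf{e}_\alpha + V^\alpha\,\frac{\partial\mathbf{e}_\alpha}{\partial x^\beta}.
\]
The last derivative, $\frac{\partial\mathbf{e}_\alpha}{\partial x^\beta}$, is a vector and can be described as a linear combination of the basis vectors:
\[
\frac{\partial\mathbf{e}_\alpha}{\partial x^\beta} = \Gamma^\mu_{\alpha\beta}\,\mathbf{e}_\mu.
\]
The geometrical object $\Gamma^\mu_{\alpha\beta}$ is called a connection. Changing the names of the indices in the second term, $\mu \to \alpha$, $\alpha \to \mu$, we can write
\[
\frac{\partial\mathbf{V}}{\partial x^\beta} = \left(\frac{\partial V^\alpha}{\partial x^\beta} + V^\mu\,\Gamma^\alpha_{\mu\beta}\right)\mathbf{e}_\alpha.
\]
This is the covariant derivative, which takes care of the changing coordinates as well as the changing basis vectors.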
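The constant-field example can be repeated numerically. The following sketch is only an illustration: it reads the non-zero connection components off the derivatives of the basis vectors ($\partial\mathbf{e}_r/\partial\varphi = \mathbf{e}_\varphi/r$, $\partial\mathbf{e}_\varphi/\partial\varphi = -r\,\mathbf{e}_r$, $\partial\mathbf{e}_\varphi/\partial r = \mathbf{e}_\varphi/r$), uses central finite differences for the partial derivatives, and all function names are hypothetical.

```python
import math


def a_components(r, phi):
    """Components (a^r, a^phi) of the constant field a = e_x in the basis (e_r, e_phi)."""
    return (math.cos(phi), -math.sin(phi) / r)


def connection(r, phi):
    """Connection gamma[mu][alpha][beta] = Gamma^mu_{alpha beta}, with 0 = r and 1 = phi."""
    gamma = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    gamma[0][1][1] = -r        # d(e_phi)/d(phi) = -r e_r
    gamma[1][0][1] = 1.0 / r   # d(e_r)/d(phi)   = e_phi / r
    gamma[1][1][0] = 1.0 / r   # d(e_phi)/dr     = e_phi / r
    return gamma


def partial(r, phi, beta, h=1e-6):
    """Central-difference derivatives of (a^r, a^phi) with respect to coordinate beta."""
    if beta == 0:
        plus, minus = a_components(r + h, phi), a_components(r - h, phi)
    else:
        plus, minus = a_components(r, phi + h), a_components(r, phi - h)
    return [(p - m) / (2 * h) for p, m in zip(plus, minus)]


def covariant_derivative(r, phi, beta):
    """Components dV^alpha/dx^beta + V^mu Gamma^alpha_{mu beta} for alpha = r, phi."""
    v = a_components(r, phi)
    dv = partial(r, phi, beta)
    gamma = connection(r, phi)
    return [
        dv[alpha] + sum(v[mu] * gamma[alpha][mu][beta] for mu in range(2))
        for alpha in range(2)
    ]


r0, phi0 = 2.0, 0.7
print("plain d/dphi of components:    ", partial(r0, phi0, 1))
print("covariant derivative along phi:", covariant_derivative(r0, phi0, 1))
print("covariant derivative along r:  ", covariant_derivative(r0, phi0, 0))
```

The plain partial derivatives of the components are non-zero, while the covariant derivative vanishes up to finite-difference error, as expected for a constant field.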

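1.5.2 Gauge invariance in electromagnetism

In classical electromagnetism, because of $\nabla\cdot\mathbf{B} = 0$, we can introduce a vector potential $\mathbf{A}$, such that the magnetic field $\mathbf{B} = \nabla\times\mathbf{A}$. Then
\[
\nabla\times\mathbf{E} + \frac{1}{c}\frac{\partial\mathbf{B}}{\partial t} = 0 \quad\text{can be written as}\quad \nabla\times\left(\mathbf{E} + \frac{1}{c}\frac{\partial\mathbf{A}}{\partial t}\right) = 0,
\]
and therefore one can introduce a scalar potential $\phi$, such that
\[
\mathbf{E} + \frac{1}{c}\frac{\partial\mathbf{A}}{\partial t} = -\nabla\phi
\]
and the electric field can be written as
\[
\mathbf{E} = -\nabla\phi - \frac{1}{c}\frac{\partial\mathbf{A}}{\partial t}.
\]
If we now take an arbitrary but differentiable scalar function $\lambda(\mathbf{r},t)$ and make a transformation
\[
\mathbf{A} \to \mathbf{A}' = \mathbf{A} + \nabla\lambda \tag{1.58}
\]
then $\mathbf{B}$ remains the same as it was before the transformation. If simultaneously with the transformation 1.58 we change
\[
\phi \to \phi' = \phi - \frac{1}{c}\frac{\partial\lambda}{\partial t}, \tag{1.59}
\]
then $\mathbf{E}$ will also stay unchanged. Transformations 1.58 and 1.59 are called gauge transformations44 and the fact that the electric and magnetic fields stay the same is called gauge invariance.

44 In manifestly Lorentz covariant form (and c = 1): $A^\mu \to A'^\mu = A^\mu - \partial^\mu\lambda$.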
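Gauge invariance of the fields can be illustrated numerically. The sketch below is a minimal example, assuming units with c = 1, arbitrarily chosen smooth sample functions for $\phi$, $\mathbf{A}$ and $\lambda$, and central finite differences for all derivatives; none of the names or sample functions come from the text.

```python
import math

C = 1.0   # units with c = 1 (an assumption of this example)
H = 1e-4  # finite-difference step


def phi(x, y, z, t):
    return x * y * math.cos(t)


def vec_a(x, y, z, t):
    return (y * z * t, math.sin(x) * t**2, x * y * z)


def lam(x, y, z, t):
    return math.sin(x * y) * z * t


def d(f, point, i):
    """Central difference of a scalar function f in coordinate i (0, 1, 2 space; 3 time)."""
    up, down = list(point), list(point)
    up[i] += H
    down[i] -= H
    return (f(*up) - f(*down)) / (2 * H)


def component(vec_f, k):
    """Turn component k of a vector-valued function into a scalar function."""
    return lambda *p: vec_f(*p)[k]


def fields(phi_f, a_f, point):
    """E = -grad(phi) - (1/c) dA/dt and B = curl(A) at the point (x, y, z, t)."""
    e = tuple(-d(phi_f, point, k) - d(component(a_f, k), point, 3) / C for k in range(3))
    b = tuple(
        d(component(a_f, (k + 2) % 3), point, (k + 1) % 3)
        - d(component(a_f, (k + 1) % 3), point, (k + 2) % 3)
        for k in range(3)
    )
    return e, b


def phi_prime(x, y, z, t):
    """Transformed scalar potential, eqn 1.59."""
    return phi(x, y, z, t) - d(lam, (x, y, z, t), 3) / C


def vec_a_prime(x, y, z, t):
    """Transformed vector potential, eqn 1.58."""
    p = (x, y, z, t)
    a = vec_a(*p)
    return tuple(a[k] + d(lam, p, k) for k in range(3))


point = (0.3, -0.7, 1.1, 0.5)
e_old, b_old = fields(phi, vec_a, point)
e_new, b_new = fields(phi_prime, vec_a_prime, point)
print("max change in E:", max(abs(x - y) for x, y in zip(e_old, e_new)))
print("max change in B:", max(abs(x - y) for x, y in zip(b_old, b_new)))
```

Both differences are at the level of the finite-difference error, even though the potentials themselves change by terms of order one.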

1.5.3 The Aharonov–Bohm effect

We will now consider two-slit electron diffraction, following (17). A diagram of the Aharonov–Bohm effect set-up is sketched in fig 1.14. With no current in the solenoid, electrons are diffracted by the slits and form an interference pattern on the screen. Once a power supply is turned on, supplying a current to the solenoid, the diffraction pattern changes to another one.

Fig. 1.14 A diagram of a two-slit electron diffraction experiment to demonstrate the Aharonov–Bohm effect; adapted from FEYNMAN.

The amplitude $\psi_1$ for an electron to follow path 1 and the amplitude $\psi_2$ for path 2 are modified in the following way:
\[
\psi_1 = \psi_{01}\exp(-iqS_1/\hbar) \quad\text{and}\quad \psi_2 = \psi_{02}\exp(-iqS_2/\hbar),
\]
where $\psi_{01}$ and $\psi_{02}$ are the amplitudes without the current, q is the electron's charge, and the extra phases $qS_1/\hbar$ and $qS_2/\hbar$ due to the presence of the current in the solenoid are given by the line integrals
\[
S_1 = \int_{\text{path 1}} \mathbf{A}\cdot d\mathbf{r} \quad\text{and}\quad S_2 = \int_{\text{path 2}} \mathbf{A}\cdot d\mathbf{r}.
\]
The modification of the phase on the screen is then
\[
\frac{q}{\hbar}(S_1 - S_2) = \frac{q}{\hbar}\int_{\text{path 1}} \mathbf{A}\cdot d\mathbf{r} - \frac{q}{\hbar}\int_{\text{path 2}} \mathbf{A}\cdot d\mathbf{r} = \frac{q}{\hbar}\oint_{\text{closed path}} \mathbf{A}\cdot d\mathbf{r},
\]
which, by Stokes' theorem, is proportional to the magnetic flux in the solenoid.

For an infinitely long, thin (therefore not obstructing the slits) solenoid, there is no magnetic field $\mathbf{B}$ outside the solenoid. The experimentally confirmed modification of the interference pattern (18) is therefore due to the vector potential $\mathbf{A}$, present outside the solenoid, which has a magnitude inversely proportional to the distance from the solenoid axis. Classically, the magnetic field $\mathbf{B}$ and the vector potential $\mathbf{A}$ are equivalent, in the sense that one can use one or the other. This is not true at the quantum level, where $\mathbf{A}$ is apparently more fundamental. One should note that the same applies to the scalar potential $\phi$ and the electric field $\mathbf{E}$, where instead of space dimensions one considers time.

The final conclusion we take from the Aharonov–Bohm effect is that the vector potential $\mathbf{A}$ is related to the space dependent phase of the probability amplitude and the scalar potential $\phi$ is related to the time dependent phase of the probability amplitude, so from the phase of the probability amplitude we can get the 4-vector potential (and the other way around).
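The role of the closed line integral can be made concrete with a small numerical example. The sketch below assumes an ideal, infinitely thin solenoid along the z axis carrying magnetic flux `flux`, for which the vector potential outside is azimuthal with magnitude flux/(2πρ); the function names, the circular integration paths and the numerical values are assumptions made for the illustration.

```python
import math


def vector_potential(x, y, flux):
    """(A_x, A_y) outside an ideal thin solenoid on the z axis: |A| = flux / (2 pi rho)."""
    rho2 = x * x + y * y
    return (-flux * y / (2 * math.pi * rho2), flux * x / (2 * math.pi * rho2))


def loop_integral(flux, centre, radius, n=2000):
    """Closed line integral of A . dr around a circle (counter-clockwise, midpoint rule)."""
    total = 0.0
    for k in range(n):
        t0 = 2 * math.pi * k / n
        t1 = 2 * math.pi * (k + 1) / n
        tm = 0.5 * (t0 + t1)
        x = centre[0] + radius * math.cos(tm)
        y = centre[1] + radius * math.sin(tm)
        ax, ay = vector_potential(x, y, flux)
        dx = radius * (math.cos(t1) - math.cos(t0))
        dy = radius * (math.sin(t1) - math.sin(t0))
        total += ax * dx + ay * dy
    return total


def relative_phase(q, flux, hbar=1.0):
    """Relative phase (q / hbar) * closed integral of A . dr for paths enclosing the solenoid."""
    return q * flux / hbar


flux = 0.8
print("loop around the solenoid:", loop_integral(flux, (0.3, 0.2), 2.0))
print("loop missing the solenoid:", loop_integral(flux, (5.0, 0.0), 1.0))
print("relative phase for q = -1:", relative_phase(-1.0, flux))
```

A loop that encloses the solenoid returns the flux regardless of its shape or position, while a loop that does not enclose it returns zero, although $\mathbf{B} = 0$ everywhere on both loops.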

1.5.4 Interactions from the gauge symmetry

The electromagnetic interactions were introduced to the Dirac eqn following the classical "minimal coupling" procedure, which required modifying the derivatives:
\[
\frac{\partial}{\partial t} \to D^0 \equiv \frac{\partial}{\partial t} + iq\phi \quad\text{and}\quad \nabla \to \mathbf{D} \equiv \nabla - iq\mathbf{A},
\]
which can be written in manifestly Lorentz covariant form as
\[
D^\mu \equiv \partial^\mu + iqA^\mu. \tag{1.60}
\]
The derivative $D^\mu$ is the covariant derivative and one can start thinking that the potential $\phi$ and the vector potential $\mathbf{A}$ are connections in a space to be defined. The outcome is the Dirac eqn for the electron interacting with a classical electromagnetic field represented by the 4-vector potential $(\phi, \mathbf{A})$:
\[
\left(i\frac{\partial}{\partial t} - q\phi\right)\Psi(\mathbf{r},t) = \big(\boldsymbol{\alpha}\cdot(-i\nabla - q\mathbf{A}) + \beta m\big)\Psi(\mathbf{r},t). \tag{1.61}
\]
We know that the potentials are not unique and we can perform the gauge transformations, eqns 1.58 and 1.59, without affecting the Maxwell eqns and their solutions. Would the same apply to the Dirac eqn 1.61? The answer is negative. A solution of 1.61 will be different from the one we get solving 1.61 after transformations 1.58 and 1.59. But if simultaneously with 1.58 and 1.59 we also transform

\[
\Psi(\mathbf{r},t) \to \Psi'(\mathbf{r},t) = \exp(iq\lambda(\mathbf{r},t))\,\Psi(\mathbf{r},t), \tag{1.62}
\]
then the Dirac eqn 1.61 will be covariant with respect to those combined three gauge transformations, 1.58, 1.59 and 1.62, and the solution will describe the same physics as the one of eqn 1.61 before the transformations. We see that changes to the 4-vector potential affect the space-time dependent phase of the wave function; it needs to be changed accordingly as well. As in the Aharonov–Bohm effect, they are connected.

Now, the crucial step. Let's reverse the flow of arguments. Let's demand first that we want interactions which are invariant with respect to the transformation 1.62. For that to happen, we will need to modify the derivatives of the free particle Dirac eqn in such a way that the symmetry is obeyed. That will require moving from space-time derivatives to covariant derivatives and the introduction of the 4-vector potential $(\phi, \mathbf{A})$. So demanding the gauge symmetry with respect to the gauge transformation 1.62, we get the classical electromagnetic field with which our particle is interacting. If in addition, in all formulae, for example the one for the probability current, we replace derivatives by the covariant derivatives, all will stay the same; it will be invariant.

Fig. 1.15 A particle travelling in space-time and its internal space; adapted from MORIYASU.

The last step is to introduce the internal space with its basis vectors on which the connection (the 4-vector potential) operates and to extend the formalism to other interactions beyond electromagnetism45.

45 The formalism will be only outlined here. For more details see (19).

One can imagine that a particle moving in space-time is carrying with itself its internal space, as sketched in fig 1.15. In mathematical language, the internal space is called a fibre and the whole structure, which locally is a product of the fibre and space-time, is called a fibre bundle. In the case of electromagnetism, the fibre is a circle, as sketched in fig 1.16. Each point on the circle, parametrised by a real function $\lambda(\mathbf{r},t)$, corresponds to a complex number with modulus 1, i.e. a one-dimensional unitary matrix $\exp(iq\lambda(\mathbf{r},t))$. All such matrices form a group called U(1), 1 for one-dimensionality. When the particle moves from one space-time point to another, the phase of its wave function is modified, eqn 1.62, and the 4-vector potential $(\phi, \mathbf{A})$ is modified as well, eqns 1.59 and 1.58. A basis in the space U(1) consists of one particular matrix, for example the unit matrix (in this case the real number 1), which corresponds to $\lambda = 0$.
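The covariance under the combined transformations 1.58, 1.59 and 1.62 amounts to the statement that the covariant derivative of the transformed wave function is the phase factor times the covariant derivative of the original one. The sketch below checks this numerically in a simplified setting, assuming one space dimension, a one-component complex wave function, units with c = 1, and arbitrarily chosen sample functions; all names are hypothetical.

```python
import cmath
import math

Q = -1.0   # charge (sample value)
H = 1e-5   # finite-difference step


def psi(x, t):
    return cmath.exp(1j * (1.3 * x - 0.7 * t)) * (1 + 0.2 * x * x)


def pot_a(x, t):
    return 0.5 * x * t


def pot_phi(x, t):
    return math.sin(x) + t


def lam(x, t):
    return math.cos(x) * t + 0.3 * x


def ddx(f, x, t):
    return (f(x + H, t) - f(x - H, t)) / (2 * H)


def ddt(f, x, t):
    return (f(x, t + H) - f(x, t - H)) / (2 * H)


# Gauge-transformed quantities, following eqns 1.62, 1.58 and 1.59.
def psi_prime(x, t):
    return cmath.exp(1j * Q * lam(x, t)) * psi(x, t)


def pot_a_prime(x, t):
    return pot_a(x, t) + ddx(lam, x, t)


def pot_phi_prime(x, t):
    return pot_phi(x, t) - ddt(lam, x, t)


def d_space(f, a, x, t):
    """Space part of the covariant derivative: (d/dx - i q A) f."""
    return ddx(f, x, t) - 1j * Q * a(x, t) * f(x, t)


def d_time(f, p, x, t):
    """Time part of the covariant derivative: (d/dt + i q phi) f."""
    return ddt(f, x, t) + 1j * Q * p(x, t) * f(x, t)


x0, t0 = 0.4, 1.2
phase = cmath.exp(1j * Q * lam(x0, t0))
print(abs(d_space(psi_prime, pot_a_prime, x0, t0) - phase * d_space(psi, pot_a, x0, t0)))
print(abs(d_time(psi_prime, pot_phi_prime, x0, t0) - phase * d_time(psi, pot_phi, x0, t0)))
print(abs(ddx(psi_prime, x0, t0) - phase * ddx(psi, x0, t0)))
```

The first two differences vanish up to finite-difference error, while the third, built from the ordinary derivative, does not. This is the sense in which the potentials compensate for the space-time dependent phase.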
When the particle is moving in space-time, $\lambda(\mathbf{r},t)$ is changing, corresponding to the changing basis (with respect to 4 different directions, so 4 different derivatives, with respect to time and 3 space directions). But the change of the basis is represented by the connection, which in turn gives us the change of the 4-vector potential $(\phi, \mathbf{A})$, eqns 1.59 and 1.58. So our "differential" eqn is: the connection equals the 4-vector potential.

Fig. 1.16 The internal space of the electromagnetism.

We are ready now to extend the formalism to include other interactions. What will happen if in eqn 1.62, instead of the real function $\lambda(\mathbf{r},t)$, we insert a matrix $M(\mathbf{r},t)$, with $\exp M \equiv 1 + M + \frac{1}{2}M^2 + \ldots$? Suppose $M$ belongs to SU(2), a group of complex unitary $2\times 2$ matrices with unit determinant (the letter S, for special, stands for that). That group has 3 basis vectors. The Pauli matrices, $\sigma_x$, $\sigma_y$, $\sigma_z$, can serve as those basis vectors and each element of that group can be defined by 3 real numbers, $\lambda^1$, $\lambda^2$, $\lambda^3$, and can be represented as a point within a sphere of radius 2π in 3 real dimensions (that includes the interior of the sphere). So our particle would be carrying with itself such a sphere, its internal space, and the interaction we would get this way would be the weak interaction. Generalizing eqn 1.62, we will demand gauge symmetry with respect to the following transformation:

\[
\Psi(\mathbf{r},t) \to \Psi'(\mathbf{r},t) = \exp\!\Big(iq\sum_{k=1}^{3}\lambda^k(\mathbf{r},t)\,\sigma_k\Big)\Psi(\mathbf{r},t). \tag{1.63}
\]
The operator acting on the wave function $\Psi$ is now a series of $2\times 2$ matrices and therefore our wave function $\Psi$ gets one extra dimension and is represented by two components (for historical reasons called projections of the isotopic spin) related to the SU(2) internal space. In order to get the 4-vector potential of the weak interactions, one considers an infinitesimal change of the wave function when the space-time point is changed from $(\mathbf{r},t)$ to $(\mathbf{r} + d\mathbf{r}, t + dt)$. All components of the wave function are affected, but to get the 4-vector potential one selects only the connection part of the covariant derivative, i.e. the change of the basis vectors in the internal space. That change is represented by the change of $\lambda^1$, $\lambda^2$, $\lambda^3$, giving finally:
\[
A^\mu = \sum_{k=1}^{3}\big(\partial^\mu\lambda^k(\mathbf{r},t)\big)\,\sigma_k. \tag{1.64}
\]
So, if we want to introduce the weak interactions to the free particle Dirac eqn, we change space-time derivatives to covariant derivatives, including the connection, i.e. the 4-vector potential given by eqn 1.64. One should note that the 4-vector potential is a $2\times 2$ matrix in the isotopic spin projection space brought in by the Pauli matrices.

The strong interactions are introduced in a similar way. The internal space is now SU(3), a group of unitary complex $3\times 3$ matrices with unit determinant. There are 8 basis vectors in that space and therefore 8 real numbers identify each matrix belonging to the group. The Pauli matrices generating the weak interactions are replaced by those 8 basis matrices of the SU(3) group and the wave function gains extra degrees of freedom, becoming a three dimensional column in the colour space of the strong interactions. The corresponding 4-vector potential becomes a $3\times 3$ matrix. Changing derivatives to covariant derivatives causes the particle to interact with the classical colour field.
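The matrix exponential in eqn 1.63 can be built directly from the power series and checked for the SU(2) properties named above. This is a minimal sketch, assuming sample values for the three real parameters and for q; the helper names are hypothetical and the series is simply truncated after a fixed number of terms.

```python
import numpy as np

SIGMA = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]


def exp_series(m, terms=40):
    """exp M = 1 + M + M^2 / 2 + ..., truncated after `terms` terms."""
    result = np.eye(m.shape[0], dtype=complex)
    term = np.eye(m.shape[0], dtype=complex)
    for n in range(1, terms):
        term = term @ m / n
        result = result + term
    return result


def su2_element(lams, q=1.0):
    """exp(i q sum_k lambda^k sigma_k) for three real numbers lambda^k."""
    generator = sum(l * s for l, s in zip(lams, SIGMA))
    return exp_series(1j * q * generator)


def su2_closed_form(lams, q=1.0):
    """Closed form cos(a) 1 + i sin(a) (n . sigma), with a = q |lambda| and n = lambda / |lambda|."""
    norm = np.linalg.norm(lams)
    n_sigma = sum(l * s for l, s in zip(lams, SIGMA)) / norm
    return np.cos(q * norm) * np.eye(2) + 1j * np.sin(q * norm) * n_sigma


lams = (0.4, -1.1, 0.7)
u = su2_element(lams)
print("unitary:", np.allclose(u.conj().T @ u, np.eye(2)))
print("unit determinant:", np.isclose(np.linalg.det(u), 1.0))
print("matches closed form:", np.allclose(u, su2_closed_form(lams)))

# Acting on a two-component (isotopic spin) wave function mixes its components
# but preserves its norm.
psi = np.array([1.0, 0.0], dtype=complex)
print("transformed components:", u @ psi)
```

The same construction with the 8 basis matrices of SU(3) in place of the Pauli matrices would give the $3\times 3$ matrices acting in colour space.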
After quantization of the classical weak and strong fields we would get three spin 1 bosons of the weak interactions and 8 spin 1 gluons for the strong interactions. All of them would be massless. To end up with physical massive Z0, W+ and W− bosons, one needs an extra mechanism, like the Higgs mechanism, which will be discussed later in this book.

Elements of the U(1) group commute with each other but those which belong to SU(2) or SU(3), in general, don't commute with each other. That leads to the fact that photons don't interact with each other but the weak interaction bosons and the strong interaction gluons do interact with each other. The SM vertices representing their interactions are sketched in fig 1.17. For further reading, (9) is particularly recommended.

Fig. 1.17 SM vertices of gauge boson interactions with each other.
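The difference between the commuting U(1) phases and the non-commuting SU(2) elements is easy to see numerically. The sketch below is only an illustration with sample parameter values; the function names are hypothetical, and the SU(2) elements are written in the closed form cos(a) 1 + i sin(a) (n·σ).

```python
import numpy as np

SIGMA = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]


def u1_element(lam, q=1.0):
    """Element exp(i q lambda) of U(1): a complex number of modulus 1."""
    return np.exp(1j * q * lam)


def su2_element(lams):
    """Element exp(i sum_k lambda^k sigma_k) of SU(2), in closed form."""
    norm = np.linalg.norm(lams)
    if norm == 0:
        return np.eye(2, dtype=complex)
    n_sigma = sum(l * s for l, s in zip(lams, SIGMA)) / norm
    return np.cos(norm) * np.eye(2) + 1j * np.sin(norm) * n_sigma


def commutator_norm(a, b):
    """Size of ab - ba for numbers or square matrices."""
    a, b = np.atleast_2d(a), np.atleast_2d(b)
    return np.linalg.norm(a @ b - b @ a)


print("U(1): ", commutator_norm(u1_element(0.3), u1_element(1.7)))
print("SU(2):", commutator_norm(su2_element((0.5, 0.0, 0.0)), su2_element((0.0, 0.5, 0.0))))
```

The U(1) commutator vanishes identically, while two SU(2) elements generated by different Pauli matrices do not commute.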

References

(1) Steane A.M., Relativity Made Relatively Easy, Oxford University Press (2012).
(2) Schutz B.F., A First Course in General Relativity, Cambridge University Press (1985).
(3) Misner C.W., Thorne K.S. and Wheeler J.A., Gravitation, W.H. Freeman and Company (1995).
(4) Rauch H. et al., Phys. Lett. A 54, 425 (1975).
(5) The chapter Spinors in Landau L.D. and Lifshitz E.M., Quantum Mechanics (Non-relativistic Theory), 3rd edition, Pergamon Press (1977).
(6) Landau course, Bierestecki W.B., Lifshitz E.M. and Pitajewski L.P., Relativistic Quantum Theory, part I.
(7) Maggiore M., A Modern Introduction to Quantum Field Theory, Oxford University Press (2005).
(8) Halzen F. and Martin A.D., Quarks & Leptons: An Introductory Course in Modern Particle Physics, Wiley (1984).
(9) Aitchison I.J.R. and Hey A.J.G., Gauge Theories in Particle Physics, 3rd edition, Institute of Physics Publishing (2003).
(10) Bjorken and Drell, Relativistic Quantum Mechanics.
(11) Schiff L.I., Quantum Mechanics, 3rd edition, McGraw–Hill (1955).
(12) Ramsey, Physics Today vol. 33 (7) (1980) 25 and Annu. Rev. Nucl. Part. Sci. vol. 40 (1990), pp 1-14.
(13) Perkins D.H., Introduction to High Energy Physics, 3rd edition, Addison–Wesley (1987).
(14) Landau R.H., Quantum Mechanics II, a Second Course in Quantum Theory, 2nd edition, John Wiley & Sons, Inc. (1996).
(15) g-2 electron
(16) g-2 muon
(17) Feynman R.P., Leighton R.B. and Sands M., Feynman Lectures on Physics Vol. II, Addison–Wesley, Reading, Mass. (1965).
(18) Chambers R.G., Phys. Rev. Lett. 5, 3-5 (1960).
(19) Moriyasu K., An Elementary Primer for Gauge Theory, World Scientific (1983).
(20) Feynman R.P. in Elementary Particles and the Laws of Physics, The 1986 Dirac Memorial Lectures, Cambridge University Press (1987).
(21) PDG