
LC-REV-1999-005 IKDA 99/11 June 1999

Electroweak Gauge Bosons at Future Electron-Positron Colliders

Thorsten Ohl

Darmstadt University of Technology Schloßgartenstr. 9 D-64289 Darmstadt, Germany

Abstract

The interactions of electroweak gauge bosons are severely constrained by the symmetries inferred from low energy observables. The exploration of the dynamics of electroweak symmetry breaking therefore requires both an experimental sensitivity and a theoretical precision that is better than the natural scale of deviations from minimal extrapolations of the low energy effective theory. Some techniques for obtaining improved theoretical predictions for precision tests at a future e+e− linear collider are discussed.

Preface

Several excellent, up-to-date reviews of the wide range of physics that can be probed at a future e+e− Linear Collider (LC) are available [?, ?, ?]. The present review discusses some physical and technical aspects of electroweak gauge bosons that will be important at a LC. The importance of gauge invariance even in a conservative Effective Field Theory (EFT) approach, which does not pretend to be valid at arbitrarily short distances, is emphasized. The implementation of gauge invariance in perturbation theory is discussed. Techniques for improving the convergence and speed of high dimensional phase space integrations are described.

Acknowledgments

I thank Panagiotis Manakos, Hans Dahmen, and Thomas Mannel for their support over the years. Howard Georgi deserves the credit for the most inspirational year of physics so far (Howard provided an open door, physics insight, and inspiration, while DFG provided the financial support). I thank Harald Anlauf, Edward Boos, Wolfgang Kilian, and Alex Nippe for enjoyable and productive collaborations that have contributed to some of the results presented in the following pages. I thank the organizers of the Maria Laach school for the invitation to climb on the Effective Field Theory soapbox, as reproduced in part in appendix A. The students make this a fun place to lecture and the hospitality of the congregation of Maria Laach makes this school unique. I thank Deutsche Forschungsgemeinschaft (DFG) and Bundesministerium für Bildung, Wissenschaft, Forschung und Technologie (BMBF) for financial support.

Contents

1 Introduction
  1.1 The Art and Craft of Elementary Particle Physics
  1.2 Success and Failure of the Standard Model
  1.3 The Mission of the Linear Collider
  1.4 Roadmap

2 The Standard Model as Effective Quantum Field Theory
  2.1 Currents
  2.2 Weak Interactions
  2.3 Fermi Theory
  2.4 Flavor Symmetries
  2.5 Higher Energy Scattering
  2.6 Intermediate Vector Bosons
    2.6.1 Power Counting
    2.6.2 Hidden Symmetry
    2.6.3 Tree Level Unitarity
  2.7 The Minimal Standard Model
  2.8 Precision Tests

3 Matrix Elements
  3.1 Three Roads from Actions to Matrix Elements
    3.1.1 Manual Calculation
    3.1.2 Computer Aided Calculation
    3.1.3 Automatic Calculation
  3.2 Forests and Groves
    3.2.1 Gauge Cancellations
    3.2.2 Flavor Selection Rules
    3.2.3 Forests
    3.2.4 Flavored Forests
    3.2.5 Groves
    3.2.6 Application
    3.2.7 Automorphisms
    3.2.8 Advanced Combinatorics
    3.2.9 Loops

4 Phase Space
  4.1 Four Roads from Matrix Elements to Answers
    4.1.1 Analytic Integration
    4.1.2 Semianalytic Calculation
    4.1.3 Monte Carlo Integration
    4.1.4 Monte Carlo Event Generation
    4.1.5 Why We Can Not Use the Metropolis Algorithm Naively
  4.2 Monte Carlo
  4.3 Adaptive Monte Carlo
    4.3.1 Multi Channel
    4.3.2 Delaunay Triangulation
    4.3.3 VEGAS
    4.3.4 VAMP
  4.4 Parallelization
    4.4.1 Vegas
    4.4.2 Formalization of Adaptive Sampling
    4.4.3 Multilinear Structure of the Sampling Algorithm
    4.4.4 Random Numbers
  4.5 Dedicated Phase Space Algorithms

5 Linear Colliders
  5.1 Triple Gauge Couplings
  5.2 Single W± Production
  5.3 Beamstrahlung
  5.4 Quadruple Gauge Couplings and W± Scattering

6 Conclusions

A The Renormalization Group
  A.1 Introduction
    A.1.1 History
    A.1.2 Renormalizability
    A.1.3 The Renormalization Group
  A.2 The Renormalization Group
    A.2.1 Cracking Integrals
    A.2.2 Wilson's Insight
    A.2.3 The Sliding Cut-Off
    A.2.4 Renormalization Group Flow
    A.2.5 Relevant, Marginal Or Irrelevant?
    A.2.6 Leading Logarithms
    A.2.7 Callan Symanzik Equations
    A.2.8 Running Couplings and Anomalous Dimensions
    A.2.9 Asymptotic Freedom
    A.2.10 Dimensional Regularization
    A.2.11 Gauge Theories
  A.3 Effective Field Theory
    A.3.1 Flavor Thresholds and Decoupling
    A.3.2 Integrating Out or Matching and Running
    A.3.3 Important Irrelevant Interactions
    A.3.4 Mass Independent Renormalization
    A.3.5 Naive Dimensional Analysis

—1— Introduction

1.1 The Art and Craft of Elementary Particle Physics

To teach how to live without certainty, and yet without being paralyzed by hesitation, is perhaps the chief thing that philosophy, in our age, can still do for those who study it. [Bertrand Russell: A History of Western Philosophy]

In a highly idealized scenario, Theoretical Elementary Particle Physics (TEPP) proceeds by condensing the results of experimental observations up to some energy scale into a model, which is designed to remain consistent at higher energies. In a second step, predictions for experiments at these higher energy scales are derived. Finally, comparisons with subsequent experimental results are used to refine the model, if necessary, and the process starts anew. Therefore, we can identify two complementary tasks: model building on one hand and the calculation of observables on the other. Below, we will focus our attention, in the area of model building, on the systematic procedure of EFT and its rôle in the modern interpretation of the Standard Model (SM) with simple extensions. The methods for the calculation of observables will be discussed in greater detail. Such calculations are naturally divided into two separate parts: the calculation of amplitudes and differential cross sections, and phase space integration. Some methods for both parts are described, with a noticeable bias towards the author's research interests.

Figure 1.1: Physics reaches of past, present and future colliders. Hadron colliders can not utilize the full beam energy in partonic collisions, since the momentum is shared by quarks and gluons. As a rule of thumb, the lower border of the grey rectangles (√s′ = √s/10) corresponds to the physics reach of hadron colliders. (Data taken from [?, ?].)

1.2 Success and Failure of the Standard Model

Tout comprendre, c'est tout mépriser. [Friedrich Nietzsche]

In the decade after the commissioning of LEP1, many aspects of the Standard Model [?] of elementary particle physics have been tested successfully to a precision previously unimaginable [?, ?]. Indeed it is not without irony that one of the main sources of theoretical uncertainty derives from the insufficient precision of the old data on hadron production in e+e− collisions at a few GeV, despite recent theoretical progress in the interpretation of this data [?]. The important qualitative features [?] of the SM were established from low energy experiments at a time before any of the colliders shown in figure 1.1 were commissioned. The symmetry structure of the SM as a SU(2)L × U(1)Y that is spontaneously broken to the U(1) of Quantum Electrodynamics (QED) has been firmly established by the stunning success of the SM at LEP1. Nevertheless, direct information regarding the underlying dynamics of Electroweak Symmetry Breaking (EWSB) has yet to be found. The existing data is compatible with the minimal SM with a light elementary Higgs particle [?]. Global fits of the SM predictions to the LEP1 precision observables (including mtop from the Tevatron and MW from LEP2) suggest a Higgs mass close to the present exclusion limits (see figure 1.2). However, the Higgs mass enters the precision observables only logarithmically and the limits are therefore weak. Despite the phenomenological hints in its favor, it must be mentioned that the minimal SM with an elementary Higgs particle is unattractive from a theoretical point of view. There is a quadratically divergent contribution

[Figure 1.2 appears here: Δχ² of the electroweak fit versus mH on a logarithmic scale from 10 to 1000 GeV, for Δα(5)had = 0.02804 ± 0.00065 and 0.02784 ± 0.00026, with a band indicating the theory uncertainty and the region excluded by direct searches shaded.]

Figure 1.2: The famous “blue band” derived from fits of electroweak precision data to SM predictions [?], that suggests a very light SM Higgs particle, with a mass close to the present exclusion limit. The area excluded by direct searches is shaded in light grey on the left. The parabolas show the distance in likelihood from the best fit as a function of the Higgs mass for two different determinations of the running electromagnetic coupling αQED(MZ ).

to the Higgs mass renormalization

  \delta m^2_{Higgs} \propto \frac{g}{16\pi^2} \Lambda^2   (1.1)

that requires an unrealistic fine-tuning of the mass to one part in 10^38 in order to keep the Higgs light and the EWSB scale close to 1 TeV (see section A.2.5 for more details). There are two ways around this problem and both feature fermions prominently. A supersymmetry can relate the scalar mass to a fermion mass. Since a fermion mass is protected from renormalization by chiral symmetry, it can be naturally light and the scalar mass can naturally remain on the order of the supersymmetry breaking scale. On the other hand, Goldstone bosons from the spontaneous breaking of a continuous symmetry are protected by

Goldstone's theorem and can be naturally light on the order of the symmetry breaking scale as well. Here the chiral symmetry of fermions is required to protect the particles that take part in the interaction, which is responsible for the symmetry breaking, from a large mass renormalization. The latter models are disfavored by current precision data. A necessary, though not sufficient, condition for the unification of gauge interactions in a single simple group at a Grand Unified Theory (GUT) scale is that the running couplings (see sections A.2.8 and A.3.2 for details) have a common value at one scale. According to the data from the pre-LEP1 era, the SM has this property. However, LEP1 precision data revealed that the couplings do not meet, unless supersymmetric partners are added to the particle spectrum [?]. Even if the naturalness problem is solved by the Minimal Supersymmetric Standard Model (MSSM) and gauge coupling unification is achieved, models with an elementary Higgs particle remain unsatisfactory, because they not only provide no insight into a theory of flavor [?], they also screen any theory of flavor so effectively that experiments at the GUT scale would be required to shed any light on the open questions of the physics of flavor, in particular the ratios of fermion masses and the fermion mixing angles. A strongly interacting weak sector that generates EWSB dynamically has a greater potential of explaining the physics of flavor. In such a theory it is conceivable that everything can be calculated from (renormalized) Clebsch-Gordan coefficients in a tower of spontaneous symmetry breakings. Such a reduction from the continuous manifold of free parameters to a discrete set of symmetry breaking patterns allowed by group theory would increase the predictive power of TEPP immensely. Today we have no working theory of flavor that could explain the ratios of fermion masses and the values of the mixing angles. We do not know if Nature was kind enough to hide the answer at the TeV-scale. Of course, she is under no obligation to do so and could as well have used supersymmetry to put the answers out of our experimental reach somewhere between the GUT and Planck scales. The remarkable success of the MSSM with no intermediate mass scales and gauge coupling unification [?] might indeed have been the first direct evidence of such a grand desert.

1.3 The Mission of the Linear Collider

From the previous section, we can identify the main missions for a future LC. Assuming that neither LEP2, TEV33, nor the LHC has done so, the LC has to try to discover the Higgs particle and supersymmetric partners of the

[Figure 1.3 appears here: σ(e+e− → W+W−(γ)) in pb versus √s from 160 to 200 GeV, comparing preliminary LEP data up to 189 GeV with the SM prediction and with curves omitting the ZWW vertex or keeping only νe exchange.]

Figure 1.3: The W+W−-production cross section [?] close to the threshold is in spectacular agreement with the SM prediction. Already at 183 GeV, the need for a W+W−Z0 vertex was clearly visible.

particles in the SM. If a Higgs particle has already been found at the time of commissioning, the LC will be in a unique position to determine its nature by measuring in detail its properties, such as branching ratios, that are difficult to access at a hadron collider. Complementary to this discovery mission, the LC has to perform precision measurements of the properties of the top quark and of electroweak gauge bosons. Whatever the mechanism of mass generation for the fermions turns out to be, the top quark couples most strongly of all known fermions to the symmetry breaking sector. In fact, in the minimal SM, the Yukawa coupling of the top quark to the Higgs particle is of order one. Thus the couplings of the top quark have to be measured as precisely as possible, because they offer the best chance of learning something about the interaction of the symmetry breaking sector with the fermionic sector. Finally, the LC is in a unique position for precision measurements of the properties of the electroweak gauge bosons. LEP2 has firmly established the trilinear W+W−Z0 interaction of gauge bosons. The shape of the

[Figure 1.4 appears here: three panels of preliminary LEP 68% and 95% CL contours for pairs of the couplings ∆g1Z, ∆κγ and λγ, with axes ranging from −1 to 1 and the Standard Model point marked.]

Figure 1.4: Combined LEP2 and Tevatron limits on Triple Gauge Couplings at the time of the ICHEP98 conference in Vancouver, B.C. (July 1998) [?]. The SM values are favored, but the 90% CL boundaries are still O(1). See figure 1.5 for a dramatic improvement (note the different scale!).

W+W−-production cross section at the threshold, as shown in figure 1.3, is as convincing as direct fits of the coupling parameters in figure 1.5. Due to limited statistics and energy, LEP2 will not be able to measure the W+W−Z0 couplings at a level where deviations from the minimal SM can reasonably be expected [?, ?] (see chapter 2). In a strongly interacting model, with no observed light resonances, deviations from the couplings predicted by low energy symmetries are on the order of (see sections 2.7 and A.3.5)

  \frac{1}{16\pi^2} \approx 6.3 \cdot 10^{-3}   (1.2)

and in a weakly interacting model, this factor is further reduced by coupling constants. Unless Quantum Field Theory (QFT) breaks down¹ just above the LEP2 energy, the LEP2 experiments are not able to establish any deviation from the SM [?]. The LEP2 measurements are nevertheless important, but we should not be disappointed by the "failure" to see any deviation from the SM. The improvements in the determination of the mass of the W±-bosons will be more significant, because MW is not constrained by a low energy symmetry. An improved measurement probes the custodial

¹ Recently, the idea of solving the hierarchy problem by introducing large extra dimensions at the TeV-scale [?] has attracted a lot of attention. Such a scenario allows many spectacular signatures that avoid the bounds from Renormalization Group (RG) arguments in four-dimensional QFT [?] and will not be discussed here.

[Figure 1.5 appears here: the same three panels of preliminary LEP 68% and 95% CL contours as in figure 1.4, now with axes ranging over only about ±0.2 to ±0.5.]

Figure 1.5: Combined LEP2 and Tevatron limits on Triple Gauge Couplings at the time of the MORIOND 1999 conference (March 1999) [?]. The 189 GeV data gives a dramatic improvement over figure 1.4 (note the different scale!) and vanishing TGC are now definitely excluded.

symmetry, if combined with precision calculations and measurements of MZ and sin²θw. The situation will change qualitatively with the commissioning of the LC. The LC will be able to detect deviations at the level of (1.2) and will be the only machine to do so [?, ?, ?]. Consequently, precision measurements of the properties of electroweak gauge bosons are the final piece of a no-lose theorem for a high luminosity (L ≳ 10^34 cm⁻² sec⁻¹), high energy (√s ≈ 1 TeV) LC:

  Either an elementary Higgs particle, and possibly supersymmetry, is found at the LC as a signal of a weakly interacting symmetry breaking sector, or the couplings of electroweak gauge bosons will deviate from the minimal SM predictions at a measurable level in order to accommodate a strongly interacting symmetry breaking sector.

In order to cover the second part of this no-lose theorem, the craft of precision electroweak gauge boson physics has to be mastered.

1.4 Roadmap

Antigonus: Thou art perfect then, our ship hath touch’d upon The deserts of Bohemia? Mariner: Ay, my lord, ... [William Shakespeare: The Winter’s Tale, III.3 ]

Chapter 2 sketches the phenomenological bottom-up approach to the field theoretical description of particle physics experiments and applies it to the SM of electroweak interactions. Most of this material can be found as part of systematic expositions in modern textbooks [?, ?, ?], but is included anyway in a condensed and slightly provocative form in order to put the other chapters in perspective. Chapter 2 attempts to give a concise description that separates the features of the SM of particle physics that are firmly established by phenomenology and the general principles of QFT, on one side, from model dependent assumptions that have not been tested so far, on the other side. Chapter 2 in particular assumes familiarity with the Renormalization Group (RG) for QFT in the Wilsonian approach [?], roughly on the level of [?]. Therefore, appendix A reproduces a short and deliberately elementary introduction to the Wilson RG for QFT, based on the first half of lectures presented at the German Autumn School for High Energy Physics, Maria Laach, September 9–19, 1997. The physical intuition provided by the Wilson RG is crucial for understanding why the SM is so successful at LEP1 and LEP2 and why we can nevertheless expect a lot of exciting physics at a LC. Chapters 3 and 4 dive deeper into the more technical parts of the phenomenologists' toolbox. Chapter 3 discusses specific issues arising in the calculation of matrix elements and differential cross sections for gauge bosons in Perturbation Theory (PT). Chapter 4 describes methods for phase space integration and event generation. These chapters combine published original contributions with results that appear to be known only to the practitioners of the field as "tricks of the trade." Finally, chapter 5 discusses some aspects of the forthcoming physics of electroweak gauge bosons at a LC.

—2— The Standard Model as Effective Quantum Field Theory

And as the smart ship grew In stature, grace, and hue, In shadowy silent distance grew the Iceberg too. [Thomas Hardy: The Convergence of the Twain]

This chapter explores some consequences of the general principles of QFT for the phenomenology of elementary particles. This material is hardly original (see the texts [?, ?, ?, ?]), but is included in a condensed and intentionally provocative form in order to set the stage for the following chapters. The purpose of this exercise is to allow a separation of kinematics and symmetry on one side, from dynamics on the other side. The result of the application of this formalism to the SM is Janus-faced: many predictions are very solid, because they depend on kinematics and symmetry only, but, by the same token, the independence from dynamical details reveals how little we still know about the underlying dynamics of the SM. This view on the virtues of the SM is purposefully pessimistic. It demonstrates that, despite the tremendous achievements of the past, there is still a lot of exciting physics waiting to be discovered at the high energy frontier. Perturbative renormalizability has often been presented as the main guiding principle behind the construction of the SM. But when we acknowledge our ignorance about the "true" short distance physics that is out of the reach of the currently operating, as well as the currently planned, colliders, we can not adopt it as a guiding principle anymore. After all, QFT might just be cut off at some string scale. However, as stressed in section A.2.5, among all equivalent theories that describe Nature, there is always one perturbatively renormalizable QFT that contains only relevant and marginal interactions at the highest energy scale. Since perturbatively renormalizable QFTs are a technically more convenient starting point, it is often more economical to

Figure 2.1: Scattering amplitude.

use them as the initial condition. Nevertheless, irrelevant interactions will be observed at low energies and one has to explain them as the result of matching conditions at intermediate thresholds, as described in section A.3.3, in order to show the consistency of the description. A comprehensive and systematic modern treatment of all the theoretical issues is provided by Weinberg's text [?, ?]. A more pedagogical treatment from a similarly modern perspective is given in [?], while more on the phenomenology can be found in [?].

2.1 Currents

current: in general circulation or use; ... [Concise Oxford Dictionary]

The asymptotic single particle states in a scattering experiment are characterized by quantum numbers belonging to the Poincaré group (energy E = P_0, momentum P_{1,2,3}, and angular momentum L_3, \vec{L}^2) as well as by internal quantum numbers corresponding to conserved charges Q_n:

  P_\mu |p, l, l_3, q_1, \ldots, q_n\rangle = p_\mu |p, l, l_3, q_1, \ldots, q_n\rangle   (2.1a)
  \vec{L}^2 |p, l, l_3, q_1, \ldots, q_n\rangle = l(l+1) |p, l, l_3, q_1, \ldots, q_n\rangle   (2.1b)
  L_3 |p, l, l_3, q_1, \ldots, q_n\rangle = l_3 |p, l, l_3, q_1, \ldots, q_n\rangle   (2.1c)
  Q_n |p, l, l_3, q_1, \ldots, q_n\rangle = q_n |p, l, l_3, q_1, \ldots, q_n\rangle.   (2.1d)

For this to make sense, the Q_n have to commute with all generators of the Poincaré group¹

[Pµ,Qn] = 0 (2.2a)

¹ We do not include supersymmetry charges among the Q_n.

  [L_{\mu\nu}, Q_n] = 0   (2.2b)

and in particular

[H, Qn]=0. (2.2c)

In general, the Qn will represent a compact Lie algebra

  [Q_{n_1}, Q_{n_2}] = i \sum_{n_3} f_{n_1 n_2 n_3} Q_{n_3}   (2.3)

that factorizes into simple non-abelian and abelian subalgebras. The quantum numbers (2.1) define the asymptotic states in scattering experiments completely, if we employ the cluster decomposition principle for asymptotic multi particle states. However, the operators appearing in (2.1) are not sufficient for a Lorentz invariant description of the interaction of the states corresponding to tensor products of asymptotic single particle states. The creation and annihilation operators in the Fock space built from the states in (2.1) are non-local, i. e. their Fourier transforms neither commute nor anti-commute at space like distances. However, the only known way (see [?] for a discussion) of constructing a relativistic (i. e. Lorentz invariant) scattering matrix uses a local interaction, i. e. a Lagrangian density that is a Lorentz invariant polynomial of local operators that commute or anti-commute at space like distances

  [\phi(x), \phi'(x')]_\pm = 0 \quad for\ (x - x')^2 < 0   (2.4)

and their derivatives. Therefore, the particle creation and the anti-particle annihilation operators have to be combined into local field operators that satisfy (2.4). As a result, it is impossible to have interactions that conserve particle number and the correct description of relativistic scattering will involve an infinite number of degrees of freedom. The most profound physical consequence of the mathematical result that there are an infinite number of degrees of freedom is the phenomenon of Spontaneous Symmetry Breaking (SSB). For a finite number of degrees of freedom, Wigner has shown long ago that all symmetries of the interaction have to be symmetries of the ground state and the symmetry has to be represented linearly. However, Wigner's theorem breaks down for an infinite number of degrees of freedom and symmetries can be broken spontaneously, guaranteeing the existence of massless scalar particles through Goldstone's theorem. This allows non-linear realizations of these symmetries [?]. The Goldstone bosons are protected from obtaining large masses through radiative corrections, unlike regular scalars. On the other hand, non-linearly realized gauge

symmetries provide a consistent description of massive spin-1 particles that does not suffer from bad high-energy behavior (see section 2.6 for a detailed discussion). The charges Q_n can be constructed from local quantities as the integral of a charge density j_n^0(x) on a space-like hypersurface x_0 = t

  Q_n(t) = \int_{x_0 = t} d^3x\, j_n^0(x),   (2.5)

where j_n^0(x) is the time component of a conserved current j_n^\mu(x)

  \partial_\mu j_n^\mu(x) = 0.   (2.6)

If the matrix elements of the j_n^\mu(x) fall off sufficiently rapidly at large distances, the charge Q_n is conserved

  \frac{dQ_n}{dt}(t) = \int_{x_0 = t} d^3x\, \partial_0 j_n^0(x) = -\int_{x_0 = t} d^3x\, \vec\nabla \cdot \vec\jmath_n(x) = 0.   (2.7)

These conserved currents are particularly useful in the phenomenology of interacting field theories, because they are not renormalized. For the non-abelian factors of the Lie algebra generated by the charges Q_n, this is trivial, because any renormalization j_\mu \to Z j_\mu would induce a renormalization Q \to Z Q and violate the commutation relations (2.3). For the abelian factors, we have to make the further assumption that there are operators \psi that form a non trivial representation

  [Q_n, \psi] = q_{n,\psi}\, \psi.   (2.8)

In this relation, any renormalization of the \psi will cancel and we find again a non renormalization theorem for the current. Matrix elements of the conserved currents in asymptotic states are constrained by symmetry, independent of dynamical assumptions. For example, the matrix element of a current between two scalar single particle states with identical mass (p^2 = p'^2) must have the form

  \langle p', q'_n | j_{n,\mu}(0) | p, q_n \rangle = (p'_\mu + p_\mu)\, q_n\, F((p' - p)^2)\, \delta_{q'_n q_n}, \qquad F(0) = 1   (2.9)

from Lorentz symmetry and current conservation alone, where the normalization of the form factor at zero momentum transfer, F(0) = 1, is fixed by²

  q_n\, 2p_0\, (2\pi)^3 \delta^3(\vec p' - \vec p) = q_n \langle p', q_n | p, q_n \rangle = \langle p', q_n | Q_n | p, q_n \rangle
    = \int_{x_0 = t} d^3x\, \langle p', q_n | j_{n,0}(x) | p, q_n \rangle = \int_{x_0 = t} d^3x\, e^{i\vec x (\vec p - \vec p')} \langle p', q_n | j_{n,0}(0) | p, q_n \rangle
    = (2\pi)^3 \delta^3(\vec p' - \vec p)\, \langle p', q_n | j_{n,0}(0) | p, q_n \rangle,

i. e.

  q_n\, 2p_0 = \langle p, q_n | j_{n,0}(0) | p, q_n \rangle.   (2.10)

Similar relations can be derived for matrix elements of particles with spin, but in general there is a larger number of independent form factors for which symmetry provides less constraints.

² The eigenstates are orthogonal for hermitian charges Q: \langle q' | Q | q \rangle \propto \delta_{q,q'}.
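A standard textbook illustration of (2.9) and (2.10), added here for concreteness (it is not taken from this text, but is safe physics): for the electromagnetic current between charged pion states,

  \langle \pi^+(p') | j^{em}_\mu(0) | \pi^+(p) \rangle = (p'_\mu + p_\mu)\, F_\pi((p' - p)^2), \qquad F_\pi(0) = 1,

where all the strong interaction dynamics is hidden in the momentum transfer dependence of the form factor F_\pi, while its normalization at vanishing momentum transfer is fixed by the conserved charge alone.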

2.2 Weak Interactions

The low energy manifestations of the interactions described by the Standard Model (SM) are the long range electromagnetic interaction and the short range weak interaction. In Nature, we observe weak interactions in the burning of the solar fuel

  4p + 2e^- \to {}^4He + 2\nu_e + 26.73\ MeV - E_\nu.   (2.11)

While the released energy is the binding energy of the ⁴He nucleus due to the strong interaction of protons and neutrons, the process (2.11) would be impossible without the weak interaction, because the strong interaction conserves isospin, which forbids p + e^- \to n + \nu_e transitions. The other natural manifestation of weak interactions is in the nuclear β-decay, for example

  {}^{137}Cs \to {}^{137}Ba + e^- + \bar\nu_e.   (2.12)

Decades of particle physics experiments with accelerators have found many more examples of weak interactions that can be observed because they violate the isospin and parity invariance of strong interactions (see [?] for the most exhaustive and up-to-date list).

2.3 Fermi Theory

αὐτὸς ἔφα [Pythagoras' pupils]

As mentioned in the previous section, there are two defining characteristics of weak interactions at low energies: parity violation and isospin non-conservation. The former is the result of the experimental observation that all electrons (and muons and taus) produced in β-decay are left handed, while all positrons (and anti-muons and anti-taus) produced in β+-decay are right handed³. Therefore, the weak interaction couples in the leptonic sector to (V − A) charged currents of the form

  \bar\nu \gamma^\mu (1 - \gamma_5) e + h.c.

Using the τ± = (τ1 ± iτ2)/2 isospin shift operators, such charged currents can be combined into the ±-components of a conserved isospin vector current

  j^\pm_\mu(x) = (\bar\nu(x)\ \bar e(x)) \left( \tau^\pm \otimes \gamma_\mu \frac{1-\gamma_5}{2} \right) \begin{pmatrix} \nu(x) \\ e(x) \end{pmatrix}
    + (\bar\nu_\mu(x)\ \bar\mu(x)) \left( \tau^\pm \otimes \gamma_\mu \frac{1-\gamma_5}{2} \right) \begin{pmatrix} \nu_\mu(x) \\ \mu(x) \end{pmatrix}
    + (\bar\nu_\tau(x)\ \bar\tau(x)) \left( \tau^\pm \otimes \gamma_\mu \frac{1-\gamma_5}{2} \right) \begin{pmatrix} \nu_\tau(x) \\ \tau(x) \end{pmatrix},   (2.13)

which is written more compactly as

  \vec\jmath_\mu(x) = \bar\psi \left( \frac{\vec\tau}{2} \otimes \gamma_\mu \frac{1-\gamma_5}{2} \right) \psi,   (2.14)

where the summation over all isospin doublets is implied. The neutral component j^0_\mu(x) of the current (2.14) is masked at low energies by much stronger electromagnetic interactions of charged leptons and strong interactions of hadrons and was observed only much later, after neutrino beams became available. Making the educated guess that the hadronic part is described by a similar isospin current, the simplest short range interaction imaginable couples these currents

  L(x) = \frac{g}{\Lambda^2}\, j^+_\mu(x)\, j^{-,\mu}(x)   (2.15)

with g a suitable coupling constant and Λ a suitable scale. According to section A.2.5, the interaction (2.15) is irrelevant and expected to be very weak at low energies (compared to Λ).
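The power counting behind this statement can be spelled out in one line, using the canonical field dimensions of appendix A (a standard estimate, added here for convenience):

  [\bar\psi \gamma^\mu \psi] = 3 \quad\Rightarrow\quad [j^+_\mu j^{-,\mu}] = 6 \quad\Rightarrow\quad [g/\Lambda^2] = 4 - 6 = -2,

so a dimensionless 2 → 2 amplitude mediated by (2.15) must scale like g E²/Λ² at center of mass energy E: negligible for E ≪ Λ, but growing quadratically with energy — the seed of the unitarity problem discussed in section 2.5.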

³ This observation summarizes many complicated experiments and ingenious theoretical interpretations.

The interaction (2.15) has been tested quantitatively (including radiative corrections, see e. g. [?]) in the leptonic sector, where the matrix elements of the products of isospin currents can be calculated without further assumptions. From the comparisons with experimental results, lepton universality, i. e. that each doublet of leptons contributes identically to the current as in (2.13), is firmly established. For the hadronic part of the current, no ab-initio calculations are possible yet, due to our limited understanding of the non-perturbative dynamics of the strong interactions. In semi-leptonic processes like B+ → D̄0 μ+ νμ, the matrix element of the interaction can be factorized in terms of a leptonic and a hadronic current

  \langle \mu^+ \nu_\mu \bar D^0 | j^+_\mu(x)\, j^{-,\mu}(x) | B^+ \rangle = \langle \mu^+ \nu_\mu | j^+_\mu(x) | 0 \rangle\, \langle \bar D^0 | j^{-,\mu}(x) | B^+ \rangle.   (2.16)

Model independent symmetry arguments for the conserved vector current and partially conserved axial current, similar to (2.9), can be used to express (2.16) in terms of a single unknown form factor. For heavy hadrons, the form factor has a known normalization at zero momentum transfer, up to small corrections [?]. This provides non-trivial cross checks for such processes and the interaction (2.15) has passed these tests. Similar symmetry arguments have less predictive power for light baryons and vector mesons, because there are more independent form factors, but some non-trivial checks can still be performed. Compared to this, purely hadronic weak processes remain poorly understood, because the matrix elements can not be factorized in terms of hadronic currents and ab-initio calculations in terms of quarks are not yet possible. However, this reflects more on our still poor understanding of non-perturbative QFT than on the accuracy of the description of weak interactions with the Fermi interaction (2.15).

2.4 Flavor Symmetries

What are the hadronic currents that the weak interaction couples to? A comprehensive study of the transitions among hadrons induced by the weak interaction currents

  j^a_{\mu,meson} = i \sum_{j,k} \Phi_j^\dagger T^a_{jk} \overleftrightarrow{\partial}_\mu \Phi_k   (2.17a)
  j^a_{\mu,baryon} = \sum_{j,k} \bar\Psi_j T^a_{jk} \Gamma_\mu \Psi_k   (2.17b)

Figure 2.2: Chiral symmetry breaking patterns of hadrons by weak interactions and quark masses. On the left hand side, the symmetry is broken by an as yet unknown mechanism that creates heavy quark masses. On the right hand side, the embedding of the weak interaction gauge group in the global chiral symmetry groups results in quite different symmetry breaking patterns.

together with the hadron masses Mj

  L_{0,meson} = \sum_j \left( \frac{1}{2} (\partial_\mu \Phi_j)^\dagger (\partial^\mu \Phi_j) - \frac{1}{2} M_j^2 \Phi_j^\dagger \Phi_j \right)   (2.18a)
  L_{0,baryon} = \sum_j \left( i \bar\Psi_j \slashed\partial \Psi_j - M_j \bar\Psi_j \Psi_j \right)   (2.18b)

reveals intriguing symmetry patterns for the meson fields Φ_j and baryon fields Ψ_j, which can be summarized by the succession of global (i. e. ungauged) SU(N) flavor symmetry groups in figure 2.2⁴. The strong interactions appear to be completely flavor blind and hadrons can be classified in representations of a global SU(6)L ⊗ SU(6)R ⊗ U(1) symmetry group⁵. The classification of the hadronic currents and their interactions according to the symmetries described in figure 2.2 is complicated by the fact that the observed hadrons do not live in the fundamental representation N of SU(N). Instead they live in the irreducible components of product representations that carry no overall charge of the color SU(3), responsible for Quantum Chromodynamics (QCD), the strong interaction in the quark picture. These representations of SU(N) are N ⊗ N̄ = 1 ⊕ (N² − 1) for the mesons and N ⊗ N ⊗ N for the baryons. However, the leptons do form a fundamental representation of the symmetries in figure 2.2, as shown in the upper left part of table 2.1. Furthermore, there is independent experimental evidence, in particular from deep-inelastic lepton hadron scattering, for

⁴ Note that the SU(6)L,R symmetry groups in figure 2.2 contain some revisionistic historiography, because no hadrons with top flavor are observed in nature. Furthermore, the Fermi interaction is not applicable at energies as high as the top threshold. Therefore, a strictly phenomenological analysis would use SU(5)L,R instead. Nevertheless, for the purpose of studying the symmetry properties, it is useful to pretend temporarily that the top is lighter than it actually is.
⁵ The six different flavors will later correspond to quarks, coming with left handed and right handed chiralities. Note that the SU(6)'s must not be confused with the nonrelativistic SU(6) symmetry containing the SU(3) of the light hadrons and the SU(2) of spin angular momentum.

quarks as constituents of hadrons, which live in a fundamental representation (see table 2.1). In particular, the properties of hadrons with heavy flavors are well described by heavy quarks surrounded by a SU(3)L ⊗ SU(3)R flavor symmetric "cloud" of light quarks and gluons [?]. The product representations for the observed hadron spectra are therefore understood as the bound states of quarks. The SU(6)L ⊗ SU(6)R symmetry is badly broken by the heavy flavor masses mt, mb, and mc, which leaves the SU(3)L ⊗ SU(3)R of the eightfold way of the three light flavors [?]. This is already a useful symmetry that can be used to derive quantitative results, provided that the breaking of the flavor-SU(3) to the isospin-SU(2) due to the finite strange quark mass is taken into account properly, together with the spontaneous chiral symmetry breaking SU(3)L ⊗ SU(3)R → SU(3)L+R caused by QCD. The product representations are the familiar singlet plus octet for the light mesons 3̄ ⊗ 3 = 1 ⊕ 8 and the

Table 2.1: The Standard Model (SM) of elementary particle physics. We have color (C) singlets (1) and triplets (3), weak isospin (T) singlets (1) and doublets (2) and various hypercharge (Y) assignments; the electric charge is Q = T3 + Y/2.

  Leptons                                          (C, T)_Y       Q
  ν_{e,R} (?)     ν_{μ,R} (?)     ν_{τ,R} (?)      (1, 1)_0       0
  e_R             μ_R             τ_R              (1, 1)_{−2}    −1
  (ν_{e,L} e_L)   (ν_{μ,L} μ_L)   ([ν_{τ,L}] τ_L)  (1, 2)_{−1}    (0, −1)

  Quarks
  u_R             c_R             t_R              (3, 1)_{4/3}   2/3
  d_R             s_R             b_R              (3, 1)_{−2/3}  −1/3
  (u_L d_L)       (c_L s_L)       (t_L b_L)        (3, 2)_{1/3}   (2/3, −1/3)

  Gauge Bosons
  γ   Z0   W±   g

  Higgs
  Φ (?)                                            (1, ?)_?       ?

singlet, two octets and a decuplet for the light baryons 3 ⊗ 3 ⊗ 3 = 1 ⊕ 8 ⊕ 8 ⊕ 10. On the other hand, the SU(6)L ⊗ SU(6)R is also broken by the differing charges of up-type quarks and down-type quarks under the weakly interacting neutral current interactions (i. e. Z0-exchange and QED). The remaining SU(3)UL ⊗ SU(3)UR corresponds to the flavors u, c, and t of the up-type quarks and the SU(3)DL ⊗ SU(3)DR corresponds to the flavors d, s, and b of the down-type quarks. This symmetry is broken further by the weakly interacting charged current interactions to a SU(3)L for the left handed part that interacts and a SU(3)UR ⊗ SU(3)DR for the non-interacting right handed parts. Again, these symmetries are badly broken by heavy flavor masses, but the SU(3)L ⊗ SU(3)R and the SU(3)UL ⊗ SU(3)DL ⊗ SU(3)UR ⊗ SU(3)DR have a common subgroup SU(2)DL ⊗ SU(2)DR that is a good symmetry and has interesting phenomenological consequences [?]. An additional complication is provided by the fact that the SU(3)UL ⊗ SU(3)UR, the SU(3)DL ⊗ SU(3)DR, and the isospin SU(2) of the weak interactions are twisted by the CKM mixing matrix. Without the intuition provided by quarks, this low energy symmetry structure of the weak interactions would be very hard to understand. The SM provides a complete phenomenological description of the currents described in this section. However, of all the symmetry breakings in figure 2.2, it only explains the two steps on the right, which are due to neutral currents and charged currents. The other symmetry breakings, as well as the CKM mixing, have to be put in by hand, introducing most of the free parameters in the SM. The strong interaction physics that is responsible for the relation between the fundamental representation and the product representations is interesting in itself, but not of fundamental importance for the weak interaction physics under consideration. The significant observation of the present section is that the leptonic and semi-leptonic low energy weak interactions can indeed be described by the Fermi interaction (2.15) for observable hadronic and leptonic currents. The axial part of the short range neutral current weak interaction has also been observed more recently in low energy experiments through their parity violating effects. Advances in experimental atomic physics have made atomic parity violation a viable tool for precision physics [?]. The emerging picture of the weak interactions at low energy is that of interacting isospin SU(2) currents, where the charged components act on the left handed fields and the neutral component couples differently to the two members of the isospin doublet particles. The symmetry is incomplete, as long as the charged current and neutral current pieces of the interaction appear to be unrelated. Below in section 2.7, we will see that the corresponding gauge bosons are indeed related by a custodial SU(2)c symmetry [?].

Figure 2.3: Differential cross section and S-wave tree-level unitarity bound for e−νe → e−νe.

2.5 Higher Energy Scattering

Having established the Fermi interaction (2.15) for low energy weak interactions, we must study if and when this description will break down at higher energies, either phenomenologically or from theoretical inconsistencies. From power counting, the interaction is irrelevant and the RG flow in the ultraviolet will eventually leave the perturbative domain (see section A.2.4). But instead of studying radiative corrections, we can also study a tree level process like eν scattering. The differential cross section is found to be quadratically rising with energy

  \Longrightarrow\quad \frac{d\sigma}{d\Omega} = \frac{G_F^2}{4\pi^2} \cdot E_{CM}^2.   (2.19)

On the other hand, we can project the scattering amplitude on the S-wave component and verify the unitarity of the partial wave

  |M_{J=0}| \le 1.   (2.20)

This gives an upper bound on the differential cross section

  \frac{d\sigma}{d\Omega} \le \frac{1}{E_{CM}^2}.   (2.21)

Both the differential cross section and the unitarity bound are shown in figure 2.3, where the unitarity bound is turned on smoothly, because we have to allow for higher order corrections close to the bound. In any case, it is obvious from figure 2.3 that the Fermi interaction (2.15) has to be replaced before 1 TeV; indeed, (2.19) crosses the bound (2.21) at E_{CM} = (4\pi^2/G_F^2)^{1/4} = \sqrt{2\pi/G_F} \approx 700 GeV. The solution to this problem is provided by softening the interaction at high energies, e. g.

  \frac{1}{\Lambda^2} \to -\frac{1}{p^2 - \Lambda^2}   (2.22)

for some combination p² of kinematical invariants that scales as E²_CM. The particular example in (2.22) is just one of many possibilities that reduce to

19 the Fermi interaction at low energy. However, it corresponds to replacing the local interaction (2.15) by the exchange of vector bosons

  \Longrightarrow\quad \frac{d\sigma}{d\Omega} = \frac{g^4}{32\pi^2} \frac{1}{E_{CM}^2} \left( \frac{1}{1 - \cos\theta + 2M_W^2/E_{CM}^2} \right)^2   (2.23)

and can therefore be described again by a local interaction. Unlike an arbitrary form factor, a local interaction produces a Lorentz invariant scattering matrix (see section 2.1). The formal Lorentz invariance of an ad-hoc form factor in a non-local interaction does not suffice for Lorentz invariance of the scattering matrix for all matrix elements if the form factor is taken into account consistently [?]. Space-like correlations will ruin Lorentz invariance. Incidentally, the interaction that produces the vector boson exchange to replace the irrelevant current-current interaction,

  L(x) = \sum_a A^a_\mu(x)\, j^{a,\mu}(x),   (2.24)

is marginal (see A.2.5) and the scale in the irrelevant interaction is replaced by the mass of the new particle. This fits of course nicely within the EFT paradigm, where a low energy EFT breaks down at the scale where new particles and interactions have to be introduced.
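To make the matching explicit, one can expand the softened interaction (2.22) for momenta small compared to Λ (a sketch; the identification with SM quantities in the last relation assumes the usual conventions and is not contained in the text above):

  -\frac{1}{p^2 - \Lambda^2} = \frac{1}{\Lambda^2} \left( 1 + \frac{p^2}{\Lambda^2} + \frac{p^4}{\Lambda^4} + \ldots \right),

i. e. the leading term reproduces the Fermi interaction (2.15) with the scale Λ traded for the vector boson mass, while the subleading terms correspond to ever more irrelevant operators suppressed by powers of p²/Λ². In the SM this matching at the W threshold takes the familiar form G_F/\sqrt{2} = g^2/(8 M_W^2).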

2.6 Intermediate Vector Bosons

The introduction of intermediate vector bosons (2.23) is not without problems, because unitarity requires that all particles that are exchanged must also appear in external states. Massive vector bosons have three degrees of freedom and massless vector bosons (photons) have only two degrees of freedom. However, a naive counting of the components of a Lorentz vector finds four degrees of freedom. One approach to a consistent theory would construct the vector bosons as narrow spin one resonances of scalars or spin-1/2 fermions. In such a theory, unitarity could be proven in a Hilbert space of asymptotic states built from scalars and spin-1/2 fermions, avoiding all problems with unphysical degrees of freedom. However, constituents that could form resonances with the correct quantum numbers are not observed and Occam's razor suggests to attempt a description in terms of the vectorial degrees of freedom before inventing new constituents.

The current operators have all the right quantum numbers, but they have a mass dimension of three and can therefore not be identified with intermediate vector bosons with a mass dimension of one. Unfortunately, the very fact that conserved currents are not renormalized prevents them from acquiring an anomalous dimension and turning into a vector field with canonical dimension for sufficiently large coupling⁶. For non-interacting massive vector bosons, a simple solution is that the equation of motion

  (\partial^2 + M^2) A^\mu - \partial^\mu \partial_\nu A^\nu = 0   (2.25)

  L = -\frac{1}{4} F_{\mu\nu} F^{\mu\nu} + \frac{1}{2} M^2 A_\mu A^\mu, \qquad F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu   (2.26)

eliminates the unphysical degree of freedom automatically, because of

  M^2 \partial_\mu A^\mu = 0.   (2.27)

The operator equation (2.27) removes precisely the degree of freedom that would correspond to a negative norm state in a naive canonical quantization of all four components.
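The derivation of (2.27) from (2.25), spelled out (a one-line check added for the reader): acting with \partial_\mu on the equation of motion, the two derivative terms cancel and only the mass term survives,

  0 = \partial_\mu \left[ (\partial^2 + M^2) A^\mu - \partial^\mu \partial_\nu A^\nu \right] = (\partial^2 + M^2)\, \partial_\mu A^\mu - \partial^2 \partial_\nu A^\nu = M^2 \partial_\mu A^\mu.

Note that this step fails for M = 0, which is why the massless case requires gauge fixing instead.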

2.6.1 Power Counting

If the vector boson A is coupled to a conserved current in (2.24), the operator equation (2.27) remains intact. However, even if the vector boson couples to conserved currents only, it is not true in general that contributions from the longitudinal part of the propagator

  -\frac{i}{p^2 - M^2 + i\epsilon} \left( g_{\mu\nu} - \frac{p_\mu p_\nu}{M^2} \right)   (2.28)

T a,Tb =0, (2.29)

6See however [?] for an interesting speculation along this line.

the longitudinal polarization state decouples nevertheless. This is trivial from the equations of motion for on-shell currents, but in order to show that the contributions from off-shell propagators cancel, one has to sum over all Feynman diagrams (see also section 3.2). For example, in the process e+e− → Z0Z0, the diagrams

  M_{\mu\nu}(p_+, p_-; k, q) = [t-channel e-exchange diagram] + [u-channel e-exchange diagram]   (2.30)

contribute equally to the Ward identity

  k^\mu \epsilon^\nu(q)\, M_{\mu\nu}(p_+, p_-; k, q) = 0.   (2.31)

In general, (2.29) is satisfied for the neutral current piece of the weak interaction, because these particular T^a are generated from the Pauli matrices 1 and τ3 by linear combination, direct product, and direct sum. Due to this decoupling, it is possible to replace the Proca propagator (2.28) for abelian currents by a Stückelberg propagator

  -\frac{i}{p^2 - M^2 + i\epsilon} \left( g_{\mu\nu} - (1-\xi) \frac{p_\mu p_\nu}{p^2 - \xi M^2 + i\epsilon} \right),   (2.32)

which is compatible with the power counting rules (A.89). Then the choice ξ = 1 is particularly convenient for quick calculations, while the cancellation of ξ dependent pieces provides useful cross checks in more complicated calculations. With (2.32), ∂A can be eliminated consistently from matrix elements

  \langle phys.| \partial_\mu A^\mu(x) |phys.'\rangle = 0   (2.33)

as a free field, but it does not necessarily vanish as an operator in the full indefinite metric Hilbert space. The situation is more involved for charged current interactions in which (2.29) is violated. For example, in the process e+e− → W+W− only one of the diagrams corresponding to (2.30) does not vanish

  M^{(1)}_{\mu\nu}(p_+, p_-; k, q) = [t-channel ν_e-exchange diagram]   (2.34)

22 and it does not satisfy the Ward identity by itself

  k^\mu \epsilon^\nu(q)\, M^{(1)}_{\mu\nu}(p_+, p_-; k, q) \propto \bar v(p_+)\, \slashed\epsilon(q)\, u(p_-).   (2.35)

Since the W± are charged, there is a second diagram with photon exchange

  M^{(2)}_{\mu\nu}(p_+, p_-; k, q) = [s-channel γ-exchange diagram],   (2.36)

but this does not cancel (2.35) for minimal coupling \partial_\mu \to \partial_\mu - ieA_\mu of the W±. Therefore, we must not use a Stückelberg propagator (2.32) for massive vector bosons coupled to charged currents naively. In an attempt to salvage the power counting, we can take a hint from unbroken non-abelian gauge theories (a. k. a. Yang-Mills theories) and investigate another diagram

  [s-channel Z0-exchange diagram],   (2.37)

arising from a coupling of the W±'s to a third gauge boson, the Z0, which couples to the τ3 component of the current, in order to close the gauge algebra generated by the τ±. Indeed, using the non-abelian gauge vertex and putting the outgoing W±'s on their mass-shell,

  k^\mu \epsilon^\nu(q)\, M_{\mu\nu}(p_+, p_-; k, q) \propto \frac{s - M_W^2}{s - M_Z^2}\, \bar v(p_+)\, \slashed\epsilon(q)\, u(p_-).   (2.38)

Thus (2.35) could be canceled with charge assignments that are compatible with the gauge algebra, as long as M_W = M_Z. However, the sum of (2.34), (2.36) and (2.37) still couples to unphysical polarization states if the external bosons are massive, because it is impossible to cancel the massless photon pole this way. Moreover, adding the Yang-Mills interaction to the Proca Lagrangian (2.26) results in equations of motion that violate (2.27). Therefore, a successive plugging of the holes is not likely to succeed and we should take a step back and reconsider how unitarity is implemented in covariantly quantized massless Yang-Mills theories. In addition to the two physical degrees of freedom there are four unphysical degrees of freedom in the covariant quantization of gauge theories: the scalar and longitudinal polarization states \partial_\mu A^\mu(x) and A_L(x) of the gauge

bosons are accompanied by two Faddeev-Popov ghosts η(x) and η̄(x) [?]. The quartet-mechanism [?] for Becchi-Rouet-Stora (BRS) [?] invariant interactions guarantees that these unphysical degrees of freedom decouple from the two physical degrees of freedom. Only in an abelian gauge theory, such as QED, do the unphysical polarization states and ghosts decouple separately. Thus the physical Hilbert space in a Yang-Mills theory is not characterized by the simple Gupta-Bleuler condition (2.33), but by the cohomology of the conserved BRS-charge

  Q_{BRS}(t) = \frac{1}{2} \int_{x_0=t} d^3\vec x\; tr\Big( \pi_0(x) \overleftrightarrow{\partial}_0 \eta(x) - ig\, \pi_0(x) [A_0(x), \eta(x)] + \frac{ig}{2} [\eta(x), \eta(x)]\, \partial_0 \bar\eta(x) \Big).   (2.39)

The cohomology arises because the BRS-charge QBRS projects on zero norm states

  \| Q_{BRS} |\Psi\rangle \|^2 = \langle\Psi| Q_{BRS}^2 |\Psi\rangle = 0,   (2.40)

due to

  Q_{BRS}^\dagger = Q_{BRS}   (2.41a)
  Q_{BRS}^2 = 0,   (2.41b)

and such states have to be factored out of the physical states characterized by

  Q_{BRS} |phys.\rangle = 0,   (2.42)

in order to construct a proper Hilbert space. For massive vector bosons, some other degree of freedom has to take the place of the now physical longitudinal polarization state A_L(x) in the quartet-mechanism. This degree of freedom can be provided by the Goldstone bosons of a spontaneous symmetry breaking and the quantum numbers will match, if, and only if, the broken symmetry is the gauge symmetry of the massive vector bosons.

2.6.2 Hidden Symmetry

Spontaneously broken gauge theories provide a consistent description of massive vector bosons, in which the dimensions of the fields are indeed given by (A.10) and (A.11). There is no argument that they are the only consistent description. However, such an argument is not necessary, because it

is possible to use a non-linear field redefinition to rewrite any low energy effective interaction of vector bosons as the unitarity gauge version of a non-linearly realized gauge theory [?, ?, ?]. In an abelian toy example⁷ there are the correspondences

  \begin{pmatrix} \psi(x) \\ V_\mu(x) \end{pmatrix} \Longleftrightarrow \begin{pmatrix} \chi'(x) \\ -\frac{1}{gf} D_\mu\phi(x) \end{pmatrix} = \begin{pmatrix} e^{-i\phi(x)/f} \chi(x) \\ A_\mu(x) - \frac{1}{gf} \partial_\mu\phi(x) \end{pmatrix}.   (2.43)

χ is a matter field and V_\mu is a massive vector field, while φ is a Goldstone boson field and A_\mu a gauge field of a non-linearly realized U(1)

  \begin{pmatrix} \chi(x) \\ \phi(x) \\ A_\mu(x) \end{pmatrix} \to \begin{pmatrix} e^{i\omega(x)} \chi(x) \\ \phi(x) + f\omega(x) \\ A_\mu(x) + \frac{1}{g} \partial_\mu\omega(x) \end{pmatrix}.   (2.44)

Without a specification of the subsidiary conditions for the quantization of the vector fields, it appears that (A_\mu, \phi) has one more degree of freedom than V_\mu. However, V_\mu is understood to be quantized with \partial_\mu V^\mu = 0 enforced explicitly, while (A_\mu, \phi) is quantized in an indefinite norm Hilbert space where \phi and \partial_\mu A^\mu conspire with the Faddeev-Popov ghosts to decouple from physical amplitudes [?] and the same number of degrees of freedom survive. Since the non-linear field redefinitions do not change the on-shell scattering matrix elements [?], the two formulations will yield the same predictions for observable quantities. The preceding reasoning does not prove that all vector bosons are gauge bosons in disguise, but it comes very close, because it shows that for every theory with massive vector bosons, there is another theory in which the vector bosons are parametrized gauge bosons of a non-linearly realized gauge symmetry. These theories are indistinguishable at energy scales below the symmetry breaking scale Λ = 4πf characterizing the non-linear realization. Above the symmetry breaking scale, the two theories are of course very different, because the gauge symmetry will then be realized linearly. In a covariant R_ξ-gauge, a special case of (2.42) is the new subsidiary condition

  \langle phys.| \partial_\mu A^\mu(x) - \xi m_A \phi(x) |phys.'\rangle = 0   (2.45)

that takes the place of (2.33). At energies large compared to the boson mass, the longitudinal polarization vector \epsilon_L^\mu(k) can be approximated by k^\mu/m_A.

⁷ In the non-abelian generalization, the physics is obscured slightly by the need to parametrize the factor group manifold G/H. Nevertheless, it is a straightforward application of the formalism set forth in [?] and explicit formulae are given in [?, ?].

Figure 2.4: Tree-level unitarity bound for ν̄ν → W+_L W−_L.

Then (2.45) turns into the celebrated equivalence theorem [?, ?] and the scattering amplitudes for longitudinal vector bosons can be calculated more efficiently by calculating the corresponding amplitudes for Goldstone bosons.
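The approximation \epsilon_L^\mu(k) \approx k^\mu/m_A invoked above is elementary kinematics (a one-line check, not from the text): for k^\mu = (E, 0, 0, |\vec k|) with E^2 = \vec k^2 + m_A^2,

  \epsilon_L^\mu(k) = \frac{1}{m_A} (|\vec k|, 0, 0, E) = \frac{k^\mu}{m_A} + O(m_A/E),

since E - |\vec k| = m_A^2/(E + |\vec k|), so the difference between \epsilon_L^\mu(k) and k^\mu/m_A has components of order m_A/E.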

2.6.3 Tree Level Unitarity

A complementary analysis arrives at the same conclusion. A systematic study of the tree level unitarity of four-point functions and five-point functions calculated with arbitrary interactions reveals that only spontaneously broken gauge theories result in amplitudes that satisfy tree level unitarity at high energy [?] (see also [?] for a less systematical but more pedagogical account). As could be expected from the study of the Ward identities, the production of longitudinal vector bosons

  \Longrightarrow\quad \frac{d\sigma}{d\Omega} = \frac{G_F^2}{32\pi^2} \cdot E_{CM}^2 \cdot \sin^2\theta   (2.46)

is responsible for this result. P-wave unitarity (|M_{J=1}| \le 1) demands

  \frac{d\sigma}{d\Omega} \le \frac{18 \sin^2\theta}{E_{CM}^2}   (2.47)

which is displayed in figure 2.4. Unitarity is maintained up to considerably higher scales than in figure 2.3, but will eventually be violated at a few TeV. Continuing this heuristic "patching up" will unambiguously [?] lead to a spontaneously broken gauge theory, identical to the one proposed in sections 2.6.1 and 2.6.2.
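A quick numerical check of the "few TeV" statement, using nothing but (2.46), (2.47) and G_F ≈ 1.17 · 10⁻⁵ GeV⁻²:

  \frac{G_F^2}{32\pi^2} E_{CM}^2 \sin^2\theta = \frac{18 \sin^2\theta}{E_{CM}^2} \quad\Longrightarrow\quad E_{CM} = \left( \frac{576\pi^2}{G_F^2} \right)^{1/4} = \sqrt{\frac{24\pi}{G_F}} \approx 2.5\ TeV,

which is where the cross section (2.46) saturates the P-wave bound (2.47).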

2.7 The Minimal Standard Model

Combining the low energy information on the flavor symmetries with the observation of the top quark and the description of weak interactions by gauge bosons of a non-linearly realized SU(2)L ⊗ U(1)Y / U(1)Q and the unbroken SU(3)C of QCD results in the particles of the minimal SM lined

up with their quantum number assignments in table 2.1. Instead of realizing SU(2)L ⊗ U(1)Y / U(1)Q non-linearly, it is also possible to introduce an elementary Higgs particle to break the electroweak symmetry. Both options provide a consistent theoretical framework for calculations and are not ruled out by experiment. However, a linear realization with a light Higgs near 100 GeV appears to be favored by current experimental results (see figure 1.2). As discussed at length in section 2.6, the low energy non-abelian charged currents require that among the equivalent effective theories describing the vector bosons, there is one in which the bosons are the non-abelian gauge bosons of a broken symmetry. Therefore the non-abelian couplings shown in figure 1.5 are unavoidable and deviations from the gauge theory predictions have to be consistent with the power counting rules of EFT, i. e. Naive Dimensional Analysis (NDA), as described in section A.3.5. NDA is based on the hierarchy of two scales, here the Fermi scale v ≈ 246 GeV and Λ_EWSB = 4πv ≈ 3.1 TeV, with no intermediate scale. Therefore its predictions can be invalidated by the observation of a new physical threshold between v and Λ_EWSB. However, such a threshold would be observed, and NDA is a valid tool for studying electroweak gauge bosons on the TeV scale if no new particles have been discovered by then. Unless a new threshold is discovered or something very spectacular happens that would undermine most of our present understanding of QFT, the deviations from the minimal SM must be on the order of

  \frac{1}{16\pi^2} \approx 6.3 \cdot 10^{-3}.   (2.48)

From the predictions for the sensitivity at LEP2 on gauge boson couplings [?], we see that (2.48) will not be reached. Thus LEP2 will not be able to distinguish among models for EWSB based on measurements of the gauge boson couplings alone. The direct search for a very light Higgs is more promising. This situation will change dramatically at the LC, due to the increased energy and luminosity (see section 5.1).

2.8 Precision Tests

The observation that low energy physics can be described by an effective field theory does not mean that precision experiments can not access at least some aspects of the physics at higher scales through loop corrections [?]. If the low energy EFT has less symmetry than the EFT relevant at a higher energy scale, the parameters of the low energy EFT can be tested for the

remnants of the high energy symmetry. The most celebrated example of this phenomenon at LEP1 is provided by the radiative corrections to the ρ-parameter and the evidence for a custodial SU(2)c symmetry [?] of the EWSB sector (see also section A.3.1). At low energies, in the current-current couplings (2.15), the relative strength of the charged current and neutral current couplings is a free parameter. After the introduction of intermediate vector bosons, the two dimensionful parameters are interpreted as the ratio of dimensionless couplings and vector boson masses. Above the production threshold of the vector bosons, their masses can be measured independently from the strength of weak interactions. From the observed values, they appear to be related, because the ρ-parameter

M 2 W (2.49) ρ = 2 2 cos θwMZ is very close to unity, with the renormalization scheme dependent8 deviations in the SM on the order of 1%. In (2.49), the weak mixing angle θw is defined

e = g sin θw (2.50) by the ratio of the left handed charged current couplings and the electromag- netic coupling. Of course, ρ = 1 could be just a numerical accident, unless it can be explained by a symmetry. In the minimal SM, ρ = 1 is explained by the fact that all three components of the isospin-triplet charged gauge boson receive the same mass from the coupling to the EWSB sector, before the neutral component mixes with the hypercharge gauge boson. In the linear realization of the EWSB, the global SU(2)c SU(2)L symmetry is equivalent to the SO(4) of the four components of the⊗ complex Higgs doublet. In a Higgs-less realization, the SU(2)c SU(2)L U(1)Y /SU(2)C+L U(1)Q has to be realized non-linearly [?]. Apparently,⊗ ⊗ whatever the detailed⊗ dynamics + 3 of EWSB is, it respects the custodial symmetry relating W , W −,andW = 0 cos θwZ +sinθwA. The most important deviations from ρ = 1 in the minimal SM depend on mt (see section A.3.1). Therefore, the joint efforts of LEP1, LEP2, SLC and Tevatron in the measurement of MZ , MW ,sinθw,andmt have provided strict constraints on the symmetry of the EWSB sector. The TeV-scale physics probed in this way exceeds the direct physics reach of the contributing colliders (see figure 1.1) by an order of magnitude.

8There are mass dependent renormalization schemes that actually enforce ρ = 1 at one scale. The non-trivial observation is then that ρ does not “run away” from this value.

28 —3— Matrix Elements

The quest for a theory of flavor demands precise calculations of high energy scattering processes in the framework of the SM and its extensions. At the Tevatron, the Large Hadron Collider (LHC), and at the LC, final states with many detected particles and with tagged flavor will be the primary handle for testing theories of flavor.

3.1 Three Roads from Actions to Matrix Elements

3.1.1 Manual Calculation The traditional textbook approach to deriving amplitudes from actions are manual calculations, which can be aided by computer algebra tools. The time-honored method of calculating the squared amplitudes directly using trace techniques is no longer practical for today’s multi particle final states and has generally been replaced by helicity amplitude methods (see e. g. [?, ?, ?]). Manual calculations have the disadvantage of consuming a lot of valuable physicist’s time, but can provide insights that are hidden by the other ap- proaches discussed below. Regarding basic processes, where extremely fast and accurate predictions are required for numerical fits, manual calculations can provide efficient formulae that are still unrivaled.

3.1.2 Computer Aided Calculation An increasingly popular technique is to use a well tested library of basic helic- ity amplitudes for the building blocks of Feynman diagrams and to construct the complete amplitude directly in the program in the form of function calls. A possible disadvantage is that the differential cross section is nowhere avail-

29 able as a formula, but the value of such a formula is limited anyway, since they can hardly be printed on a single page anymore.

3.1.3 Automatic Calculation Fully automatic calculations are a further step in the same direction. The Feynman rules (or equivalent prescriptions) are no longer applied manually, but encoded algorithmically. This method will become more and more impor- tant in the future, but more work is needed for the automated construction of efficient unweighted event generators (see section 4.1.4) and for the auto- mated factorization of collinear singularities in radiative corrections (see [?]).

3.2 Forests and Groves

Calculations of cross section with many-particle final states remain challeng- ing despite all technical advances and it is of crucial importance to be able to concentrate on the important parts of the scattering amplitude for the phenomena under consideration. In gauge theories, however, it is impossible to simply select a few signal di- agrams and to ignore irreducible backgrounds. The same subtle cancellations among the diagrams in a gauge invariant subset that lead to the celebrated good high energy behavior of gauge theories such as the SM (see section 2.6), come back to haunt us if we accidentally select a subset of diagrams that is not gauge invariant. The results from such a calculation have no predictive power, because they depend on unphysical parameters introduced during the gauge fixing of the Lagrangian. It must be stressed that not all diagrams in a gauge invariant subset have the same pole structure and that a selection based on “signal” or “background” will not suffice. The subsets of Feynman diagrams selected for any calculation must there- fore form a gauge invariant subset, i. e. together they must already satisfy the Ward- and Slavnov-Taylor-identities to ensure the cancellation of contribu- tions from unphysical degrees of freedom. In abelian gauge theories, such as QED, the classification of gauge invari- ant subsets is straightforward and can be summarized by the requirement of inserting any additional photon into all connected charged propagators, as shown in figure 3.1. This situation is similar for gauge theories with simple gauge groups, the difference being that the gauge bosons are carrying charge themselves. For non-simple gauge groups like the SM, which even includes mixing, the classification of gauge invariant subsets is much more involved.

30 + + Figure 3.1: The gauge invariant subsets in e e− µ µ−γ are obtained by adding the photon in all possible ways to connected→ lines of charged particles.

In early calculations, the classification of gauge invariant subsets in the SM has been performed in an ad-hoc fashion (see [?, ?]), which was later super- seded by the explicit construction of groves [?], the smallest gauge invariant classes of tree Feynman diagrams in gauge theories. This construction is not restricted to gauge theories with simple gauge groups. Instead, it is applica- ble to gauge groups with any number of factors, which can even be mixed, as in the SM. Furthermore, it does not require a summation over complete multiplets and can therefore be used in flavor physics when members of weak isospin doublets (such as charm or bottom) are detected separately. The method constructs the smallest gauge invariant subsets and examples below illustrate that they are indeed smaller than those derived from looking at the final state alone [?, ?]. The method will probably also have applications in loop calculations. However, the published proofs use properties of tree diagrams and further research is required in this area. In unbroken gauge theories, the permutation symmetry of external gauge quantum numbers can be used to subdivide the scattering amplitude corre- sponding to a grove further into gauge invariant sub-amplitudes (see [?]and references cited therein). In this decomposition, each Feynman diagram con- tributes to more than one sub-amplitude. It is not yet known how to perform a similar decomposition systematically in the SM, because the entanglement of gauge couplings and gauge boson masses complicates the structure of the amplitudes. The construction of the groves provides a necessary first step towards the solution of this important problem.

31 3.2.1 Gauge Cancellations Even if the general arguments are well known, the explicit cancellations of unphysical contributions remain amazing in explicit perturbative calcula- tions in gauge theories. The intricate nature of these cancellations compli- cates the development of systematic calculational procedures. Nevertheless, phenomenology needs numerically well behaved matrix elements and cross sections, which should be as compact as possible. The development of an automated procedure for obtaining compact expressions with explicit gauge cancellations remains a formidable challenge. The selection of gauge invari- ant subsets of Feynman diagrams described in the remainder of this section is one necessary step towards this goal. As discussed in chapter 2, there is overwhelming phenomenological evi- dence that the strong and electroweak interactions are mediated by vector particles. A naive description of vector particles would assign to them four degrees of freedom or polarization states, of which only two (for massless particles) or three (for massive particles) can be physical. The only known theories that can consistently decouple the unphysical degrees of freedom for interacting fields are gauge theories (for massless particles) or equivalent to spontaneously broken gauge theories (for massive particles). However, perturbative calculations require an explicit breaking of gauge invariance for technical reasons and the cancellation of unphysical contribu- tions is not manifest in intermediate stages of calculations. The contribution of a particular Feynman diagram to a scattering amplitude depends in the gauge fixing procedure and has no physical meaning, in general. Among the possible gauge fixing schemes, two complementary approaches can be identified: unitarity gauges and renormalizable gauges. A massive gauge boson in unitarity gauge has a propagator

i pµpν gµν (3.1) −p2 M 2 + i − M 2 −   and unitarity of the time evolution can be proved straightforwardly, because µ only physical degrees of freedom propagate and the condition ∂µV =0is maintained. Unfortunately, the longitudinal part of the propagator (3.1) does not fall off with high energy and momentum. This spoils the power counting for renormalization and, even worse, violates tree-level unitarity at high energies, because the scattering amplitudes for longitudinal vector bosons do not fall off rapidly enough to satisfy the optical theorem at high energies.

32 On the other hand, the family of Rξ-gauge fixings results in propagators that propagate unphysical degrees of freedom

i pµpν gµν (1 ξ) (3.2) −p2 M 2 + i − − p2 ξM2 + i −  −  but with good high energy behavior and standard power counting even for the longitudinal components. The cancellation of the unphysical contributions in scattering matrix elements turns out to be a formidable combinatorial problem [?]. The second approach is superior for formal proofs to all orders (see [?]and references therein), because it allows to treat the two complicated problems independently. In a first step, the theory is defined using a Bolgoliubov- Parasiuk-Hepp-Zimmermann (BPHZ) subtraction procedure [?], which vio- lates gauge invariance and unitarity, but allows to prove the Quantum Action Principle (QAP) [?]. The QAP states that the classical symmetries of the Lagrangian density L are respected by the effective action Γ, upto a lo- cal operator of dimension five (see also [?]). Subsequently, one can prove by an inspection of all operators of dimension five, that the BRS invari- ance of the effective action can be restored by finite and local counterterms of dimension four, provided that the coefficient of the Adler-Bell-Bardeen- Jackiw (ABBJ) anomaly vanishes. Finally, the unitarity of the BRS invariant theory is proven with the quartet-mechanism [?]. This two-step procedure is transparent because it decouples the combina- torics of renormalized perturbative power counting from the combinatorics of gauge invariance and BRS invariance. Once the QAP is established, BRS invariance is reduced to a problem in finite-dimensional cohomology. On the other hand, controlling the high energy behavior of the unitarity gauge propagator (3.1) and the combinatorics of Ward identities simultaneously is prohibitively complicated, even in low orders of the loop expansion. However, this decoupling of kinematical structure and gauge structure, which makes the abstract argument transparent, is not useful in concrete calculations, because the abstract arguments apply only to the sum of all diagrams and shed little light on how the gauge cancellations “work”. One particular problem is that the Ward identities relate diagrams with different pole structure. For example, in qq¯ gg → + + (3.3) the numerator factors cancel parts of denominators and the Ward identity is satisfied only by the sum and not by individual diagrams. Therefore, the physical amplitudes are determined by an intricate web of kinematical structure and gauge structure, that is very hard to disentangle.

33 Nevertheless, the identification of partial sums of Feynman diagrams that are gauge invariant by themselves is of great practical importance for at least two reasons. Firstly, it is more economical to spent the time improving Monte Carlo statistics for the important pieces of the amplitude than to calculate the complete amplitude all the time. This requires the identification of the smallest gauge invariant part of the amplitude that contains the important pieces. Secondly, different parts of the amplitude are characterized by differ- ent scales (e. g. s or t) for radiative corrections. The different scales can only be used if the parts correspond to separately gauge invariant pieces. The practical consequence of these arguments for LEP2 physics is that + + each individual diagram in the e e− W W − amplitude → + + (3.4) has a bad high energy behavior [?] and only the inclusion of subtle interfer- ences in the sum results in good high energy behavior1. The on-shell pair production (3.4) is unfortunately not the observed process. This causes two problems: the finite width necessitates Dyson- resummation and higher orders violate gauge invariance [?]. Furthermore, the observed final state consists of many observed particles with tagged flavor. Therefore there are many tree-level background diagrams that contribute to the same process. For this reason, the calculations are complicated and it is important to identify the important parts.

3.2.2 Flavor Selection Rules The interesting flavor physics at the LHC and at the LC is different from QCD with light quarks because the gauge symmetry is broken spontaneously and the gauge group is not simple. Furthermore, the factors are mixed according to

0 3 Z cos θw sin θw W = − (3.5) A sin θw cos θw B      and the combinatorics are consequently more intricate. Finally, the flavor of leptons and heavy quarks is tagged and the flavor transformations do not commute with the gauge transformations and no summation over complete gauge multiplets is possible. For these reasons, the QCD recursion relations or string inspired methods [?] are not directly applicable.

1 The good high energy behavior is realized separately for the SU(2)L and U(1)Y parts of the amplitude [?], but this requires breaking up diagrams.

34 Figure 3.2: Gauge invariant subsets of Feynman diagrams in Bhabha scat- tering.

One method of identifying gauge invariant subsets is to identify subsets that contribute to a particular final state [?, ?]. For example, there are ten + ¯ tree diagrams contributing to e e− µ−ν¯µud and therefore the correspond- +→ ¯ ing subset of the 20 diagrams in e e− e−ν¯eud and its complement must be both gauge invariant by themselves.→ The subsets that can be derived in this way are in general not minimal, but this example shows that selection rules of flavor symmetries that commute with the gauge group appear to be ausefultool. The simplest example of these flavor selection rules is provided by Bhabha scattering: the s-channel and the t-channel diagram in figure 3.2 have to + + + be gauge-invariant separately, because both e e− µ µ− and e µ− + → → e µ− are physical processes that must give gauge invariant amplitudes by themselves. The first process has only the s-channel diagram and the second only the t-channel diagram. As a result, we see that conserved currents of (real or fictitious) horizontal (i. e. generation) symmetries can serve as a tool to separate gauge invariant classes of Feynman diagrams. A less trivial example is provided by the three separately gauge invariant sets in ud ud scattering shown in figure 3.3. The separate gauge invari- ance of the→ gluon exchange diagram is obvious because the QCD generators commute with the electroweak generators and the strong coupling can be switched on and off without violating gauge invariance. The charged current diagram is separately gauge-invariant, because we may assume that the CKM mixing matrix is diagonal and then the charged current diagonal is absent in us us scattering, which is related to ud ud by a horizontal symmetry that→ commutes with the gauge group. → These example in 2 2 scattering are not useful by themselves, unless we can handle the combinatorics→ in the more complicated realistic applications.

35 Figure 3.3: Gauge invariant subsets of Feynman diagrams in ud ud scat- tering. →

3.2.3 Forests In order to develop tools for understanding the combinatorics, we have to take a step back from the intricacies of gauge theories and study unflavored scalar φ3-andφ4-theory. We assume that no φ5 or higher vertices are re- quired by phenomenology. In principle, the combinatorial methods describe below can be extended to non-renormalizable theories. However, the required arguments would have be more involved, without providing additional insight into the gauge sector. Since there are no selection rules, the diagrams S1, S2, and S3 in

S,1 S,2 S,3 S,4 T4 = t ,t ,t ,t = ,,, .(3.6) { 4 4 4 4 } { } must have the same coupling strength to ensure crossing invariance. If there S,4 are additional symmetries, as in the case of gauge theories, the coupling of t4 S,1 S,2 S,3 will also be fixed relative to t4 , t4 ,andt4 . Henceforth, we will call each exchange t t0 of two members of the ↔ set T4 of all tree graphs with four external particles an elementary flip.The elementary flips define a trivial relation t t0 on T4, which is true, if and only ◦ if t and t0 are related by a flip. The relation is trivial on T4 because all pairs are related by an elementary flip. ◦ However, the elementary flips in T4 induce flips in T5 (the set of all tree diagrams with five external particles) if they are applied to an arbitrary four particle subdiagram

= ,, (3.7a) ⇒{ } 36 3 Figure 3.4: The forest F5 of the 15 five-point tree diagrams in unflavored φ - theory. The diagrams are specified by fixing vertex 1 and using parentheses to denote the order in which lines are joined at vertices (see the footnote on page 39 for examples).

Obviously, there is more than one element of T4 embedded in a particular t T5 and the same diagram is member of other quartets, e. g. ∈ = ,, (3.7b) ⇒{ } There are 15 five-point tree diagrams in φ3 and inspection shows that only four other inequivalent diagrams can be reached from any diagram by flips of a four-point subdiagram. Thus there must be a non-trivial mathematical structure on the set T5 with the relation . This structure can be visualized by the graph in figure 3.4. Similarly, there◦ are 25 five-point tree diagrams in combined φ3 and φ4 and at most six can be reached from any diagram by a flip. In the same way, the trivial relation on T4 has a non-trivial natural ex- tension to the set Tn of all n-point tree diagrams: t t0 is true if and only if t ◦ and t0 are identical up to a single flip of a four-point subdiagram

t t0 t 4 T 4 ,t0 T4 :t4 t0 t t4 =t0 t0 (3.8) ◦ ⇐⇒ ∃ ∈ 4 ∈ ◦ 4 ∧ \ \ 4 Note that this relation is not transitive and therefore not an equivalence relation. Instead, this relation allows us to view the elements of Tn as the vertices of a graph Fn, where the edges of the graph are formed by the pairs of diagrams related by a single flip

Fn = (t, t0) Tn Tn t t0 . (3.9) ∈ × ◦  To avoid confusion, we will refer to graph Fn as forest and to its vertices as Feynman diagrams. The most important property of the forests for our applications is

Theorem 1 The unflavored forest Fn is connected for all n. which is easily proved by mathematical induction on n. Indeed, all members of Tn+1 can be derived from Tn by connecting a new external line to a prop- agator (forming a new φ3-vertex) or an existing φ3-vertex (transforming it

37 into φ4-vertex). Obviously, two such insertions are related by a finite num- ber of flips, which proves the induction step. This theorem shows that it is possible to construct all Feynman diagrams by visiting the nodes of Fn along successive applications of the flips in (3.6). Already the simplest non-trivial example of a unflavored (“vanilla”) for- est, the 15 tree diagrams with five external particles in unflavored φ3-theory, as shown in figure 3.4, displays an intriguing symmetry structure: there are 120 permutations of the vertices of F5 that leave this forest invariant. The automorphism group Aut(F5) is therefore with 120 elements much larger than one would expect. The classification of vanilla forests is probably an in- teresting mathematical problem, but without obvious physical applications. However, as we will see below, not all flips are created equal and only some flips are required by gauge invariance, leading to a partition of the forest into gauge invariant subsets, called groves [?].

Vanilla Trees The enumeration of tree diagrams is simplified considerably if we can find a unique representative for each topological equivalence class [?]. One such representation is defined by picking one external line as the root and drawing the reminder of the diagram as a mathematical tree. The edges at each node are sorted according to the index of the external lines beneath each node. The sorting lifts the remaining degeneracy and there remains only one representative for each topological equivalence class. To be specific, here is how the members of T4 are represented

(3.10a) ⇐⇒

(3.10b) ⇐⇒

(3.10c) ⇐⇒

38 (3.10d) ⇐⇒

In this representation2, the flips are equivalent to a recursive tree pattern matching

,,, (3.11a) ∨∨⇒{ } where each of a, b, c can be either a complete subtree or a single leaf. In these pattern matchings, a subtle special case in which two eligible vertices meet, has to be taken care of

,,, (3.11b) ⇒{ } in order not to miss a branch. However, cases like

(3.12) need no special rules because the root vertex will only be expanded and never be contracted with another vertex. This unique representation of tree diagrams provides an efficient equality predicate and the recursive pattern matching provides a function calculating the neighbors

φ(t)= t0 T t0 t (3.13) { ∈ | ◦ } of any tree t in the forest. Using these two ingredients, the construction of the forest is a well defined mathematical problem with efficient textbook solutions, known as depth first search and breadth first search. Essentially, these solutions consist of recursively following the edges to the neighbors until all nodes have been visited, keeping track of the visited nodes to avoid loops. If the forest is connected (from theorem 1, the vanilla forests are known to be connected) and a single tree is known, the construction of the forest is also an efficient algorithm for generating all Feynman tree diagrams.

2Instead of the graphical representation used in (3.10—3.12), a notation based on parenthesis has been used in figure 3.4. In this less intuitive but more concise notation, the four trees in (3.10) are denoted by a(bc), c(ab), b(ac), and abc, respectively. The matching in (3.11b) is written (ab)(cd) a(b(cd)),b((cd)a),c(d(ab)),d((ab)c) . ⇒{ }

39 3.2.4 Flavored Forests In physics applications, we have to deal with multiple flavors of particles. Therefore we introduce flavored forests, where the admissibility of elementary flips t t0 depends on the four particles involved through the Feynman rules ◦ for the vertices in t4 and t0. In order to simplify the combinatorics when counting diagrams for theo- ries with more than one flavor, we will below treat all external particles as outgoing. The physical amplitudes are obtained later by selecting the incom- ing particles in the end. Ward identities, etc. will be proved for the latter physical amplitudes, of course.

Flavored Trees Flavored trees are derived from vanilla trees by adding a label to each edge. Feynman rules are then used at each node to identify valid flavor combina- tions. All flavored trees can be generated from vanilla trees (which correspond to “topologies”) by a recursive tensor product of elementary alternatives. E. g.

,, . (3.14) 7→ { } The representation described in section 3.2.3 allows to use efficient tree ho- momorphisms for the necessary manipulations. In particular, the calculation of the algebraic expressions corresponding to flavored trees is a tree homo- morphism:

u¯(p1)γµu(p2) . (3.15) 7→ If there were no Feynman rules, the flavored forests would be connected just like the vanilla forests. However, the Feynman rules typically prohibit most of the possible flavor assignments and it turns out that the remaining vertices form more than one connected component.

3.2.5 Groves The construction of the groves is based on the observation that the flips in gauge theories fall into two different classes: the flavor flips among

tF,1,tF,2,tF,3 = ,, ,(3.16) { 4 4 4 } { }

40 which involve four matter fields that carry gauge charge and possibly addi- tional conserved quantum numbers and the gauge flips among tG,1,tG,2,tG,3,tG,4 = ,,, (3.17a) { 4 4 4 4 } { } and tG,5,tG,6,tG,7,tG,8 = ,,, (3.17b) { 4 4 4 4 } { } G,8 which also involve external gauge bosons (the diagram t4 ,isonlypresent for scalar matter fields that appear in extensions of the SM: SUSY partners, leptoquarks, etc.). In gauge theories with more than one factor, like the SM, the gauge flips are extended in the obvious way to include all four-point functions with at least one gauge-boson. Commuting gauge group factors lead to separate sets, of course. In spontaneously broken gauge theories, the Higgs and Goldstone boson fields contribute additional flips, in which they are treated like gauge bosons. Ghosts can be ignored at tree level. As we have seen for the examples of Bhabha-scattering and ud ud scattering in section 3.2.2, the flavor flips (3.16) are special because they→ can be switched off without spoiling gauge invariance by introducing a horizontal symmetry that commutes with the gauge group. Such a horizontal symmetry is similar to the replicated generations in the SM, but if three generations do not suffice, it can also be introduced artificially. This observation suggests to introduce two relations:

t t0 t and t0 related by a gauge flip (3.18a) • ⇐⇒ and

t t0 t and t0 related by a flavor or gauge flip. (3.18b) ◦ ⇐⇒ These two relations define two different graphs with the same set T (E)ofall Feynman diagrams as vertices:

F (E)= (t, t0) T (E) T (E) t t0 (3.19a) ∈ × ◦ and 

G(E)= (t, t0) T (E) T (E) t t0 . (3.19b) ∈ × • As we have seen above in Bhabha-scattering, it is in general not possible to connect all pairs of diagrams in G(E) by a series of gauge flips. Thus there will be more than one connected component. For brevity, we will continue to denote the flavor forest F (E)astheforest of the external state E and we will denote the connected components Gi(E)ofthegauge forest G(E)asthe groves of E.Sincet t0 t t0,wehave iGi(E)=G(E) F(E), i. e. the groves are a partition•of⇒ the◦ gauge forest. ⊆ S 41 Figure 3.5: The three diagrams in qq¯ gZ0γ are connected by successive gauge flips of subdiagrams like (3.17b).→ If the decays of the Z0 were included, bremsstrahlung from the decay products would form a separate grove, because no gauge flip will pass the photon across the uncharged Z0.

Figure 3.6: All five diagrams in ud¯ csγ¯ are in the same grove, because they are connected by gauge flips passing→ through the diagram in the center. + + In contrast, in e e− µ µ−γ (see figure 3.1), the center diagram is missing and there are two separate→ groves.

Theorem 2 The forest F (E) for an external state E consisting of gauge and matter fields is connected if the fields in E carry no conserved quantum numbers other than the gauge charges. The groves Gi(E) are the minimal gauge invariant classes of Feynman diagrams.

By construction, the theorem is true for the four-point diagrams and we can use induction on the number of external matter fields and gauge bosons. Since the matter fields are carrying conserved charges, they can only be added in pairs. If we add an additional external gauge boson to a gauge invariant am- plitude, the diagrammatical proof of the Ward and Slavnov-Taylor identities in gauge theories requires us to sum over all ways to attach a gauge boson to connected gauge charge carrying components of the Feynman diagrams. However, the gauge flips are connecting pairs of neighboring insertions and can be iterated along gauge charge carrying propagators. Therefore no par- tition of the forest F (E) that is finer than the groves Gi(E) can preserve gauge invariance. Two examples are shown in figures 3.5 and 3.6. If we add an additional pair of matter fields to a gauge invariant ampli- tude, we have to consider two separate cases

n n +2 f → f , . (3.20) −−−−−→ { } If the new flavor does not already appear among the other matter fields, the only way to attach the pair is through a gauge boson, as in the left diagram

42 in the braces in (3.20). If the new flavor is already present, we can also break up a matter field propagator and connect the two parts of the diagram with a new gauge propagator. Since it is always possible to introduce a new flavor, either physical or fictitious, without breaking gauge invariance, these cases fork off separately gauge invariant classes every time we add a new pair of matter fields. On the other hand, the cases in (3.20) are related by a flavor flip. Therefore F (E) remains connected, the Gi(E) are separately gauge invariant and the proof is complete. Earlier attempts [?, ?] have used physical final states as a criterion for identifying gauge invariant subsets. We have already shown that the groves are minimal and therefore never form a more coarse partition than the one derived from a consideration of the final states alone. Below we shall see examples where the groves do indeed provide a strictly finer partition. In a practical application, one calculates the groves for the interesting combinations of gauge quantum numbers, such as weak isospin and hyper- charge in the SM, using an external state where all other quantum numbers are equal, since the flavor forest would otherwise be disconnected. The com- binatorics is simplified by treating all particles as outgoing. The physical amplitude is then obtained by selecting the groves that are compatible with the other quantum numbers of the process under consideration. Concrete examples are considered in the next section. If only a few flavor combinations are interesting, it can be more conve- nient in practical applications to generate all diagrams for the external states conventionally and to partition the disconnected flavor forest. The resulting groves will be the same, of course.

3.2.6 Application In table 3.1, we list the groves for all processes with six external massless fermions in the SM, with and without single photon bremsstrahlung, with- out QCD and CKM mixing. We could also include fermion masses, QCD, and CKM mixing within the same formalism, but the table would have to be much larger, because additional gluon, Higgs and Goldstone diagrams appear, depending on whether the fermions are massive or massless, colored or uncolored. In the table, cases with identical SU(2)L U(1)Y quantum ⊗ numbers are listed only once and cases with different T3 and Y are listed separately only if the vanishing of the electric charge removes diagrams from + + + agrove.E.g.e e−e e−e e− is identical to uuu¯ uu¯ u¯ and not listed separately. + ¯ ¯ The familiar non-minimal gauge invariant classes for e e− f1f2f3f4 [?] are included in table 3.1 as special cases. The LEP2 WW-classes→ CC09, CC10, and CC11 are immediately obvious. As a not quite so obvious exam-

43 + + + ple, the process e e− µ µ−τ τ − has the same SU(2)L quantum numbers as uu¯ uuu¯ u¯. We can→ read off table 3.1 that, in the case of identical pairs, there are→ 18 groves, of 8 diagrams each. If all three pairs are different, the number of groves has to be divided by 3!, because we are no longer free to connect the three particle-antiparticle pairs arbitrarily. Thus there are + + + 24 diagrams contributing to the process e e− µ µ−τ τ − and they are organized in three groves of 8 diagrams each. Any→ diagram in a grove can be reached from the other 7 by exchanging the vertices of the gauge bosons on one fermion line and be exchanging Z0 and γ. Since there are no non- abelian couplings in this process, the separate gauge invariance of each grove could also be proven as in QED, by varying the hypercharge of each particle: 2 2 2 A A1 qe qµqτ + A2 qeqµqτ + A3 qeqµqτ . ∝ · · · + ¯ As is well known, the diagrams for the CC20 process e e− e−ν¯eud fall into two separately gauge invariant classes, depicted in figure→ 3.7 (including processes related by crossing symmetry). Figure 3.8 shows two views of the + groves for e e− νeν¯eνeν¯e (including processes related by crossing symme- →

Table 3.1: The groves for all processes with six external massless fermions and bremsstrahlung in the SM (without QCD). All particles are considered as outgoing.

external fields (E) #diagrams groves uuu¯ uu¯ u¯ 144 18 8 uuu¯ uu¯ uγ¯ 1008 18 · 24 +36 16 uuu¯ ud¯ d¯ 92 4 ·11 +6 8· uuu¯ ud¯ dγ¯ 716 4 · 95 +6·24 +12 16 + · · · ` `−uud¯ d¯ 35 1 11 +3 8 + · · ` `−uud¯ dγ¯ 262 1 94 +3 24 +6 16 `+νdud¯ d¯ 20 2 · 10 · · `+νdud¯ dγ¯ 152 2 · 76 + + · ` `−` νdu¯ 20 2 10 + + · ` `−` νduγ¯ 150 2 75 + · ` ν`−νd¯ d¯ 19 1 9 +2 4+1 2 + · · · ` ν`−νd¯ dγ¯ 107 1 59 +2 12 +2 8+2 4 + + · · · · ` ν`−ν`¯ `− 56 4 9 +4 4+2 2 + + · · · ` ν`−ν`¯ `−γ 328 4 58 +4 12 +4 8+4 4 + · · · · ` ν`−νν¯ ν¯ 36 4 6 +6 2 + · · ` ν`−νν¯ νγ¯ 132 4 26 +2 6+4 4 ννν¯ νν¯ ν¯ 36 18· 2 · · ·

44 + ¯ Figure 3.7: The two groves in the CC20 process e e− e−ν¯eud. Gauge flips are drawn a solid lines and flavor flips are drawn as→ dotted lines. These + + groves can be found in table 3.1 as ` `−` νdu¯.

try). The left hand side emphasizes the four-fold permutation symmetry of flavor flips, while the right hand side emphasizes the groups of six diagrams + forming a single grove. The process e e− ννν¯ νγ¯ is not yet the most press- ing physics problem, but small enough to→ produce the nice picture shown in figure 3.9. The decomposition 4 26+2 6+4 4 is clearly visible. Finally, fig- ure 3.10 displays the forest for the· process· γγ· udd¯ u¯ in the SM. The grove in the center consists of the 31 diagrams with→ charged current interactions (the set CC21 of [?]). The four small groves of neutral current interactions are only connected with the rest of the forest through the charged current grove. The automorphism group of this forest has 128 elements. The groves can now be used to select the Feynman diagrams to be calcu- lated by other means. However, we note that it is also possible to calculate the amplitude with little additional effort already during the construction of the groves by keeping track of momenta and couplings in the diagram flips.

45 + Figure 3.8: Two views of e e− νeν¯eνeν¯e. Gauge flips are drawn a solid lines and flavor flips are drawn as→ dotted lines.

3.2.7 Automorphisms The forests and groves that we have studied appear to be very symmetri- cal in the neighborhood of any vertex. However, the global connection of these neighborhoods is twisted, which makes it all but impossible to draw the graphs in a way that makes these apparent symmetries manifest (see figure 3.8). Nevertheless, one can turn to mathematics (in particular to computa- tional and constructive group theory [?]) and construct the automorphism groups Aut(F (E)) and Aut(Gi(E)) of the forest F (E) and the groves Gi(E), i. e. the group of permutations of vertices that leave the edges invariant. These groups turn out to be larger than one might expect. For example, the group of permutations of the 71 vertices of the forest F (γγ udd¯ u¯)in figure 3.10, that leave the edges invariant, has 128 elements. Similarly,→ the automorphism group of the forest in figure 3.4 has 120 = 5! elements. The study of these groups and their relations might enable us to construct gauge invariant subsets directly.

46 + Figure 3.9: The decomposition 4 26 +2 6+4 4 of e e− ννν¯ νγ¯ . Gauge flips are drawn a solid lines and· flavor flips· are· drawn as dotted→ lines.

3.2.8 Advanced Combinatorics Even pure gauge forests can provide new insights into the combinatorics of gauge cancellations among Feynman diagrams. Consider the process gg ggg: there are 25 diagrams (all in the same grove, of course). 15 of them→ have three triple gluon vertices and the remaining 10 diagrams have one triple and one quadruple vertex each. The sub-forest of the 15 diagrams with triple vertices is equivalent to the one shown in figure 3.4. The 30 endpoints of the gauge flips starting from the 10 diagrams with a quadruple vertex cover this sub-forest with 15 nodes exactly twice. Therefore, the gauge cancellations do not follow from a simple recursion of the gauge cancellations in gg gg, where the sub-forest of diagrams with triple couplings is covered exactly→ once. Unfortunately, the symmetries in gg gggg are much more complicated. Nevertheless, the automorphism groups→ of groves should be studied in order to understand the gauge cancel- lations better.

47 Figure 3.10: The forest of size 71 for the process γγ udd¯ u¯ in the SM (without QCD, CKM mixing and masses in unitarity gauge)→ with one grove of size 31, two of size 12 and two of size 8. Solid lines represent gauge flips and dotted lines represent flavor flips.

3.2.9 Loops The number of loops is invariant under flips, as can be seen from Euler’s for- mula. However, some of the proofs above have made explicit use of properties of tree diagrams. Therefore further research is required in this area.

48 —4— Phase Space

Not from the stars do I my judgment pluck, And yet methinks I have astronomy [William Shakespeare: Sonnet XIV ]

The calculation of matrix elements using the methods outlined in chapter 3 is but the first step towards a comparison of theoretical predictions with experiment. In the second step, the differential cross section obtained from squaring the amplitude has to be integrated in the region of phase space corresponding to the detector. While this step is common to all observables, it remains highly non-trivial and will be discussed in detail in this chapter. In our application, the integration of a function f with a given measure µ on a manifold M

I(f)= dµ(p)f(p) (4.1) ZM is complicated by three facts 1. the manifold of multi particle phase space and the corresponding Lorentz invariant measure

4 4 2 2 dµ(p)=δ ( kn P) dknδ(k m ) (4.2) n − n n− n is non-trivial for three particlesP in theQ final state and analytically in- tractable for more than three massive particles, 2. the function is given by a squared matrix element multiplied by kine- matical cuts or acceptances 2 f(p)= T(k1,k2,...) C(k1,k2,...) (4.3) | | · and usually ill-behaved. Integrable singularities are common in theories with massless particles and fluctuations over many orders of magnitude are the norm rather than the exception,

49 3. in realistic applications, the acceptances and efficiencies in C(k1,k2,...) are almost never given as analytical functions. Instead, they are them- selves the result of complicated simulations.

4.1 Four Roads from Matrix Elements to Answers

4.1.1 Analytic Integration Analytic integration is mentioned only for completeness. Already for three or more particles in the final state with halfway realistic cuts, a fully ana- lytical calculation of integrated cross sections is impossible. At least some integrations will have to be performed numerically.

4.1.2 Semianalytic Calculation The radiative corrections for LEP1 precision physics on the Z0-resonance are dominated by soft photons. The phase space in the soft limit remains sufficiently simple so that all but a few integrations can be performed analyt- ically for simple cuts. Since programs based on this approach are extremely fast compared to the more universally applicable, they play an important rˆole in LEP1 phenomenology. Due to their speed and precision, they are the preferred tool for the unfolding of electroweak precision observables at LEP1. An additional application is to act as benchmarks for other programs. The situation at LEP2 and the LC is different, because more phase space for hard radiation is available and substantial changes are necessary in these programs.

4.1.3 Monte Carlo Integration If only I(f) in (4.1) has to be calculated, it is possible to calculate the es- timator E(f), given in (4.7) below, directly, using an appropriate source of pseudo random numbers. This is known colloquially as “weighted event gen- eration,” because the phase space points used for sampling are not equally likely, but have a weight attached to them. Since f is not required to be bounded for this approach, the application can focus all attention on mini- mizing the sampling error (4.8). The techniques for achieving this goal are well developed and the currently most sophisticated are described in detail in section 4.3.

50 TD TE TS TK TV ST PA DELPHI Interactive Analysis 0 100 0 0 0 0 0 Act Beam: 45.6 GeV Run: -2 DAS : 10-May-1994 ( 0) (215) ( 0) ( 0) ( 35) ( 0) ( 0) 14:05:47 0 0 0 0 0 0 0 Evt: 2 Deact Proc: 10-May-1994 Scan: 11-May-1994 ( 0) ( 17) ( 0) ( 0) ( 18) ( 0) ( 0) 1.Z-R

 µ+

u¯-jet - /p (= ν)

- d-jet

R

F Z

+ Figure 4.1: Simulated event e e− duµ¯ ν¯µ at √s = 200 GeV (“particle identification” from Monte Carlo record).→ Event generated by WOPPER [?] and passed through DELPHI detector simulation [?].

4.1.4 Monte Carlo Event Generation In principle, it is possible to use weighted event generators also in detector simulations similar to the one shown in figure 4.1, as long as the detector simulation is equipped with the trivial infrastructure for the bookkeeping of event weights. However, this approach has the serious drawback that the distribution of events that are simulated in the detector, can be different from the realistic distributions. In the result of the integration, this is corrected by the weights, but the problem remains that the detector is exercised mostly in the region of the phase space where few events are expected in the experiment. Therefore it is desirable to generated “unweighted events”, i. e. events with uniform weight 1, which are distributed exactly as predicted by the- ory. For bounded probability distributions, such samples can be generated

51 Made on 25-May-1998 17:27:39 by lancon with DALI_E2. Filename: DC045360_005106_980525_1727.PS ) θ in the µ ¯ ν they were 17. Gev EC 2.7 Gev HC o −43 )*SIN( φ ¯ qµ Z x o q o − o − o 0 x x o − o o x as if → Run=45360 Evt=5106 − and a uniform de- − x e x x − + e

|−600cm 600cm|

o o

− o |−600cm 600cm| 0 |−600cm x − ρ o 53. GeV o o o : =0 ( =180 o θ θ RZ . The advantage of the pro- 9.6Gev EC 4.5Gev HC again ] appears to be an attractive alter- ? = 189 GeV s 52 √ , else repeat) and if the distribution does max /p ) x ( p ≤ y ,if DALI_E2 ECM=189 Pch=95.1 Efl=153. Ewi=59.2 Eha=9.28 P0045360 Efl=153. Ewi=59.2 Eha=9.28 P0045360 DALI_E2 ECM=189 Pch=95.1 Detb= E1FFFF EV2=.653 EV3=.201 ThT=1.45 98−05−25 7:55 Nch=13 EV1=.786 x Actual LEP2 event at ALEPH ; accept point must be added to the sample y cedure is that probabilities exceeding unity can be treated not vary too much,most this distributions method in can be Highsophisticated mapping sufficiently Energy techniques efficient. Physics to (HEP)section Unfortunately, handle 4.3). are the singular integrable and singularities require (see by brute force rejection (generate a phase space point ALEPH detector. viate At first sight, the Metropolis algorithm [ 4.1.5 Why We Can Not Usenative the to the Metropolis brute-force Algorithm rejection Naively The algorithm for Metropolis unweighted event algorithm generation. space, constructs which a can Markov beability process shown density covering to without the convergeaccepted phase asymptotically having with to to a the reject probability desiredsities that a at prob- is single the equal new to event.same the and ratio the A of old Markov the phase probability step space den- point. is If the move is rejected, the Figure 4.2: unity and probability densities need not be bounded for this agorithm to work (see [?, ?] for pedagogical discussions). If the probability distribution is rather steep, simple numerical experi- ments show that the number of rejected moves is high. Therefore, the num- ber of events that will appear more than once in a finite sample will be high as well and this causes problems further down the simulation chain. Error estimates in detector simulation and unfolding tools might not be prepared for finite samples with many duplicate events [?]. In general, the problem is that, unless the step size and the initial con- ditions are chosen (and varied!) carefully, errors can be underestimated significantly for steep distributions. Nevertheless, if the problems with ob- taining robust error estimates can be gotten under control, the Metropolis algorithm could also develop into a powerful part of the calculational tool chest for event generators. But until then, a strong caveat emptor regarding results obtained in this way is in order.

4.2 Monte Carlo

simulate: feign, pretend to have or feel, put on; pretend to be, act like, resemble, mimic, take or have an altered form suggested by; ... [Concise Oxford Dictionary]

The overview in the preceding section has shown that weighted and un- weighted Monte Carlo methods are the most robust and flexible part of the phenomenological tool chest for phase space integration. Therefore they de- serve a more detailed discussion. Given a (pseudo-)random sequence of points in n-particle phase space

pg = p1,p2,... ,pN , (4.4) { } where the points are distributed according to

dµg(p)=g(p)dµ(p), (4.5a) with

dµg(p)=1andg(p)>0 (almost everywhere) , (4.5b) ZM we get an estimator of f(p) I(f)= dµ(p)f(p)= dµ (p) (4.6) g g(p) ZM ZM 53 as

f 1 N f(p ) E(f)= = i . (4.7) g N g(p ) g i=1 i   X In (4.7), the volume of M is encoded in g, because g is normalized according to (4.5b). As written, E(f) depends on g and the particular pg, but the dependence on g drops out completely in the ensemble average over all pg. Explicitely spelling out the dependence on g is important, because the mea- sure dµ(p) (i. e. dµg(p)forg(x) = 1) is in general by no means the one that can be generated most efficiently. As will be shown shortly, it is also not the most useful one. Therefore a discussion should consider a general g from the outset. For large pg, E(f) will converge to I(f), but when comparing theoretical predictions with experimental observations, we have to deal with the more complicated problem of estimating the error for a finite sample size and minimizing this error. Besides optimizing g for this purpose, we could also use a determinis- tic pg derived from quasi random numbers. Under certain conditions, this approach can speed up convergence dramatically [?](from1/√Nto 1/N ) at the expense of the robustness of the conditions. Therefore we will not proceed along this line here and just notice that further research in the area is highly desirable. Also it is not clear how to interface deterministic Monte Carlos using quasi random numbers with detector simulations. An unbiased estimator for the variance of E(f)isgivenby

1 f 2 f 2 V(f,g)= . (4.8) N 1 g − g  − *  +g  g   A direct consequence of (4.8) is V (f,f) = 0. This vanishing error is not paradoxical, because I(g) has to be known exactly in order to be able to generate pg. For the following, it is useful to note that V (f,g)isabiased estimator for Wg(f/g)/(N 1), where the variation Wg(h) of a function h − with respect to the measure µg is defined as

2 2 2 Wg(h)= dµg(p) (h(p)) (I(h)) = dµg(p)(h(p) I(h)) 0 . − − ≥ ZM ZM  (4.9)

The advantage of Wg(f/g)overV(f,g)isthatWg(f/g)doesnot depend on the particular pg, which will simplify the following reasoning considerably.

54 Figure 4.3: In the left graph, ∆1(f,gα) is minimized for a simple step func- tion, while in the left graph ∆2(f,gα) is minimized in the same function space. ∆2(f,gα) distributes the mismatch more evenly to achieve faster convergence for integration, but has lower efficiency for event generation.

For Monte Carlo integration, the optimal g will minimize

∆2(f,g)=Wg(f/g) (4.10) while for event generation, the optimal g will minimize

f(p) g(p) I(f) ∆1(f,g)= dµg(p) 1 =1 , (4.11) M  − f  − f Z supM g supM g      in order to maximize the rejection efficiency. If f is a wildly fluctuating func- tion, this optimization of g is indispensable for obtaining a useful accuracy. Typical causes for large fluctuations are integrable singularities of f or µ inside of M or non-integrable singularities very close to M. Therefore, we will use the term “singularity” for those parts of M in which there are large fluctuations in f or µ. If the g could be chosen from a function space that includes f,both∆1 and ∆2 would find the same optimal g = f, since ∆1(f,f)=0and∆2(f,f)= 0. However, in practical applications, we have g G, with a function space G that does not include f (otherwise we would already∈ know both I(f)andan efficient algorithm to generate phase space points according to µf ). There- fore, which g G is closest to f will in general depend on whether the metric ∈ is ∆1 or ∆2. For concreteness, consider the trivial example of f(x)=x2 on M =[0,1] with a one parameter family of step functions

gα(x)=α Θ(1/2 x)+(1 α) Θ(x 1/2) (4.12) · − − · − Then 1 155 1 ∆2(f,gα)= + (4.13) 160 α 160 (1 α) − 9 · · − which is minimized by 1 α2 = =0.15226 ... . (4.14) 1+√31

55 On the other hand, ∆1(f,gα) is minimized by 1 α = (4.15) 1 5 and there is a noticeable difference, even for smooth functions, as shown in figure 4.3. From a technical point of view, ∆2 is more convenient. For a given g G, ∈ ∆2(f,g) will exist for many interesting functions f for which ∆1(f,g)isuse- less because supM (f/g) diverges. Furthermore, the estimation of ∆2(f,g)is statistically robust, while the estimation of supM (f/g)requiredby∆1(f,g)is much more delicate. Finally, the optimality equations for g derived from ∆2(f,g) by functional derivation are linear. The set G is not defined precisely, it includes all probability densities that can be generated “cheaply.” This includes all densities for which the inverse of the anti-derivative can be computed efficiently, but there are other, less well known, cases [?]. Of course, there is no absolute measure for the “cheapness” of generating g. Instead, the effort has to be put in context. As long as the generation of a phase space point p does not take longer than the evaluation of the differential cross section f(p), the generation can be considered “cheap.” If the variance reduction is substantial, it can be useful touseevenmoreexpensiveg’s. The manual minimization of either ∆n(f,g)foragivenfcan be a te- dious exercise, that has to be repeated every time the coupling parameters or kinematical cuts in f are changed. For this reason it is economical to use adaptive algorithms that minimize ∆n(f,g) automatically, as long as they perform satisfactorily.

4.3 Adaptive Monte Carlo

Manual optimization of g is often too time consuming, in particular if the dependence of the integral on external parameters (in the integrand and in the boundaries) is to be studied. Adaptive numerical approaches are more attractive in these cases. In this section, we will use for illustration only integrals over the unit hypercube

n x M =]0,1[⊗ (4.16) ∈ with Cartesian measure

dµ(x)=dx1 dx2 ... dxn (4.17) ∧ ∧ ∧ 56 Figure 4.4: A fixed grid with variable weights can not adapt to singular inte- grands.

and Vol(M) = 1. The general case follows by covering the manifold M with one or more maps and taking into account the appropriate Jacobians.

4.3.1 Multi Channel

The basic question is: how can we implement a variable weight g in dµg(x)= g(x)dµ(x) efficiently? An obvious option is to employ a weighted sum of densities n

g(x)= αigi(x) (4.18) i=1 X with n

dµ(x) gi(x)=1and αi =1andαi 0 (4.19) i=1 ≥ Z X and to optimize the weights αi [?]. This approach has been very successful in Monte Carlo integration (e. g. [?, ?]). However, in applications with several hundred different channels gi [?], the optimization of the weights αi according to [?] yields only modest improvements [?]. It appears to be more important that a channel is present, than that the weights are accurately tuned. In a typical application, the gi are constructed manually from squared signal diagrams with simplified spin correlations. The corresponding distri- butions can be generated easily by successive integration of the s-channel poles and t-channel singularities. The drawback of this approach is that the location and the shape of the singularities has to be known analytically. While the location of the poles is fixed by the particle masses, radiative correction and interference terms will modify the shape and shift the ac- tual maxima. A particularly serious problem is that the effects of radiative corrections depend strongly on kinematical cuts. A special case of (4.18) is given by using the characteristic functions of fixed bins as gi, which corresponds to variable weights for bins in a fixed grid, as depicted in figure 4.4. This case is not useful for applications in particle physics, because a grid with a fixed resolution can not adapt to the power law singularities from propagators as are ubiquitous in particle physics.

57 Figure 4.5: In one dimension, a variable grid with fixed weights can adapt well to singular integrands.

Figure 4.6: Quadrangles do not remain convex in general after unconstrained shifts of lattice points.

Differential cross sections typically vary over many orders of magnitude and the grid would have to be too fine to be useful. In one dimension, there is an interesting alternative to varying the weights of fixed width. It is possible to vary instead the size of bins with fixed weight1

N 1 Θ(x xi 1)Θ(xi x) g(x)= − − − . (4.20) N x x i=1 i i 1 X − − This way it is possible to have bins corresponding to densities that vary over many orders of magnitude. It is possible to iteratively improve the grid, driven by samplings of f(x)dµg(x). The improvement can either de- crease ∆1(f,g)(importance sampling for event generation) or decrease ∆2(f,g) (stratified sampling for high precision integration). An optimal g would solve the functional equation δ ∆ (f,g)=∆ (f,g)=0. (4.21) δg 1,2 10,2

While it is straightforward to sample ∆10 ,2(f,g) simultaneously with f/g, an exact solution of (4.21) is not possible, because contributions have to be moved from one bin to another. Nevertheless, an approximate solution is possible under the assumption that the integrand in (4.21) is piecewise constant [?]. Unfortunately, there is no immediate generalization to higher dimensions, because freely shifting vertices starting from a hypercubic lattice will in gen- eral lead to quadrangles that are no longer convex, as illustrated in figure 4.6. The concave quadrangles are more than a technical inconvenience, because they lead to numeric instabilities of the grid optimization.

1Varying the weights as well would be possible but adds no new flexibility. It is therefore better to avoid the added complexity and to keep all weights identical.

58 4.3.2 Delaunay Triangulation Mathematically, the most natural decomposition of the integration region in higher dimension is into simplices. In fact, there have been attempts to use simplical decompositions in adaptive Monte Carlo [?]. The results of this exercise have not been impressive so far, in particular for strongly fluctuating integrands. The reason for the failure of the naive approach is that the shape of the simplices is not constrained and that instabilities will produce “splinters” that are numerically ill-behaved. The construction of “well behaved” simplical decompositions is a rather non-trivial mathematical problem (see [?] for reviews), that remains unsolved in the general case. However, there has been progress in two and three di- mensions, driven in part by the needs of the engineering community for numerically robust finite element codes. The theory for two dimensions is rather complete and mathematically very beautiful [?]. Therefore it is ap- propriate to make a factorized ansatz [?], in which the “least factorizable” pairs and triplets are handled by simplical decomposition, using triangles and tetrahedra. In two dimensions, the most suitable tool are constrained Delaunay tri- angulations with Steiner points. For a given set of points, the Delaunay triangulation is the unique triangulation in which the center of the circle passing through the corners of each triangle is inside of that triangle. This uniqueness helps to stabilize the optimization of the triangulation, which is no longer unique, if we allow to add additional points, so-called Steiner points, to the point set in order to satisfy additional constraints. Nevertheless, there are well defined and terminating algorithms [?] that construct constrained Delaunay triangulations with global and local maximum area constraints, as well as minimum angle constraints. The former are what is required for mod- eling a function for importance sampling and the latter stabilize the iterative refinement of the triangulation. However, there is still much more work needed to put these ideas to work in particle physics applications beyond a proof of concept [?], because the required computational geometry is at or beyond the state-of-the-art in finite element analysis for engineering.

Figure 4.7: Delaunay triangulation with Steiner points adapted to a singularity on a quarter circle.

4.3.3 VEGAS

Factorization

The previous section has shown that one-dimensional distributions are singled out because a mathematically well defined optimization strategy can be formulated with very little effort. Therefore it was tempting to propose a factorized ansatz [?]

g(x) = g_1(x_1)\, g_2(x_2) \cdots g_n(x_n)   (4.22)

and to investigate its properties. It turns out that (4.22) produces satisfactory results with very little effort and the corresponding code VEGAS [?] has been used as a "black box" by countless physics programs. For two decades no significant improvements of VEGAS have been proposed.

The factorized ansatz (4.22) of VEGAS guarantees that the resulting grid is hypercubic. Furthermore, the simple one-dimensional update prescriptions can be used in each dimension, if the estimators entering the prescriptions are averaged over the remaining dimensions. A final benefit of the factorized ansatz is that the computational costs grow only linearly with the number of dimensions and not exponentially.

Of course, not all functions of interest in particle physics can be approximated by a factorized ansatz. While the functions depicted in figure 4.8 have factorized singularities after a simple coordinate transformation, their sum in figure 4.9 can not be sampled efficiently by VEGAS.

Figure 4.8: Integrands that VEGAS can handle after a coordinate transformation.

Figure 4.9: An integrand that VEGAS can not handle, even after a coordinate transformation.

Similar examples are ubiquitous in particle physics, because the cross sections are derived from sums of Feynman diagrams and have singularities in several kinematical variables which can not be used as coordinates simultaneously. For example, three jet production e^+e^- \to q\bar q g in figure 4.10 has important singularities in both variables s_{1/2} = (q_{1/2} + k)^2. In this case, there are parametrizations of the phase space that use both s_1 and s_2 simultaneously. However, the corresponding Jacobian |\partial\phi/\partial x| contains non-factorizable singularities in s_1 and s_2, induced by a Gram determinant, of the form 1/\sqrt{\Delta_4(p_1,p_2,q_1,q_2)}. Thus there is no parametrization in which both singularities factorize simultaneously and VEGAS can not adjust to both singularities (see [?]). A general solution to this problem will be presented below in section 4.3.4.

Stratified vs. Importance Sampling

The implementation of VEGAS [?] that is widely used has three different modes: importance sampling, stratified sampling, and a hybrid of the two. Importance sampling can be selected by the application program, while the choice between stratified sampling and the hybrid is made by VEGAS based on the dimension of the integration domain. The implementation [?] is very terse and hard to understand. Even popular textbooks [?] reproduce the code without further explanation. The reimplementation that was required for [?] aims to be more readable.

In pure importance sampling, VEGAS adjusts a grid like the two-dimensional illustration in figure 4.11. The sampling proceeds by first selecting a hypercube at random and subsequently selecting a point in this hypercube according to a uniform distribution. This algorithm generates points distributed according to (4.20) and (4.22).

Figure 4.10: Three jet production e^+e^- \to q\bar q g.

Figure 4.11: VEGAS grid structure for importance sampling

In stratified sampling, faster convergence is attempted by distributing the sampling points more uniformly than random samples would be. In VEGAS, stratified sampling is implemented by dividing the bins of the adaptive grid further, as shown in figure 4.12. The number of divisions is chosen such that two samples per cell of the finer grid add up to approximately the desired total number of samples. The sampling then proceeds by selecting two random points from each cell of the finer grid. This procedure shows better convergence for integration than importance sampling.

As can be seen from table 4.1, the algorithm for stratified sampling as described above can only be used in low dimensions. Already the eight-dimensional four-particle phase space would require many billions of sampling points, which is usually not possible.

Nevertheless, the stratified sampling algorithm performs empirically better than the importance sampling algorithm and it is worthwhile to pursue

Table 4.1: To stratify or not to stratify: number of calls required for a grid of 25 bins in each direction, as a function of the dimension of the integration domain.

n_dim    N_calls^max (n_g = 25)
  2          1 \cdot 10^3
  3          3 \cdot 10^4
  4          8 \cdot 10^5
  5          2 \cdot 10^7
  6          5 \cdot 10^8
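Each entry is just two samples in every one of the n_g^{n_dim} cells, rounded to one significant digit; a two-line check:

    n_g = 25
    for n_dim in range(2, 7):
        print(n_dim, 2 * n_g**n_dim)   # two samples per cell of the stratification grid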

Figure 4.12: VEGAS grid structure for genuinely stratified sampling.

Figure 4.13: VEGAS grid structure for pseudo stratified sampling.

the hybrid approach illustrated in figure 4.13. Here the stratification grid is no longer a subdivision of the adaptive grid. Instead, the points selected in the stratification grid are mapped linearly onto the adaptive grid and further nonlinearly onto the integration domain. The adaptive grid is updated to minimize \Delta_1(f,g). This hybrid is, strictly speaking, theoretically not justified, because importance sampling assumes a random distribution. However, the "more uniform" distribution appears to perform better in practical applications and the hybrid approach is very successful.

4.3.4 VAMP

Introduction

As mentioned before for the examples in figure 4.8, the property of factorization depends on the coordinate system. Consider, for example, the functions

f_1(x_1, x_2) = \frac{1}{(x_1 - a_1)^2 + b_1^2}   (4.23a)

f_2(x_1, x_2) = \frac{1}{\left(\sqrt{x_1^2 + x_2^2} - a_2\right)^2 + b_2^2}   (4.23b)

on M = (-1,1) \otimes (-1,1) with the measure d\mu = dx_1 \wedge dx_2. Obviously, f_1 is factorizable in Cartesian coordinates, while f_2 is factorizable in polar coordinates. VEGAS will sample either function efficiently for arbitrary b_{1,2} in suitable coordinate systems, but there is no coordinate system in which VEGAS can sample the sum f_1 + f_2 efficiently for small b_{1,2}.

There is however a generalization [?] of the VEGAS algorithm from factorizable distributions to sums of factorizable distributions, where each term may be factorizable in a different coordinate system. This larger class includes most of the integrands appearing in particle physics and empirical studies have shown a dramatic increase of accuracy for typical integrals.
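As an illustration, a small sketch of the two densities (4.23) together with two charts of the unit square of the kind introduced below, one Cartesian and one polar, in which f_1 and f_2 respectively factorize; the parameter values are arbitrary choices for the example. The polar chart overshoots M, which is handled by setting f to zero outside (as discussed in the following subsection):

    import numpy as np

    a1, b1, a2, b2 = 0.3, 0.01, 0.5, 0.01      # arbitrary example values

    def f1(x1, x2):                            # factorizable in Cartesian coordinates
        return 1.0 / ((x1 - a1)**2 + b1**2)

    def f2(x1, x2):                            # factorizable in polar coordinates
        r = np.hypot(x1, x2)
        return 1.0 / ((r - a2)**2 + b2**2)

    def phi_cartesian(u):
        """Chart U = (0,1)^2 -> M = (-1,1)^2; returns the point and the Jacobian."""
        return 2.0 * u - 1.0, 4.0

    def phi_polar(u):
        """Chart via (r, theta); covers M up to a set of measure zero and
        overshoots into M_i \ M, where f will be set to zero."""
        r, theta = np.sqrt(2.0) * u[0], 2.0 * np.pi * u[1]
        x = np.array([r * np.cos(theta), r * np.sin(theta)])
        return x, 2.0 * np.pi * np.sqrt(2.0) * r   # polar Jacobian r times box rescaling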

Maps

The problem of estimating I(f) can be divided naturally into two parts: parametrization of M and sampling of the function f. While the estimate will not depend on the parametrization, the error will.

In general, we need an atlas with more than one chart \phi to cover a manifold M. We will ignore this technical complication in the following, because, for the purpose of integration, we can always decompose M such that each piece is covered by a single chart. Moreover, a single chart suffices in most cases of practical interest, since we are at liberty to remove sets of measure zero from M. For example, after removing a single point, the unit sphere can be covered by a single chart. Nevertheless, even if we are not concerned with the global properties of M that require the use of more than one chart, the language of differential geometry will allow us to use our geometrical intuition. Instead of pasting together locally flat pieces, we will paste together factorizable pieces, which can be overlapping, because integration is an additive operation.

For actual computations, it is convenient to use the same domain for the charts of all manifolds. The obvious choice for n-dimensional manifolds is the open n-dimensional unit hypercube

U = (0,1)^{\otimes n} .   (4.24)

Sometimes, it will be instructive to view the chart as a composition \phi = \psi \circ \chi with an irregularly shaped P \subset \mathbf{R}^n as an intermediate step

(4.25)

(in all commutative diagrams, solid arrows are reserved for bijections and dotted arrows are used for other morphisms). The integral (4.1) can now be written

I(f) = \int_0^1 d^n x\, \left|\frac{\partial\phi}{\partial x}\right| f(\phi(x))   (4.26)

and it remains to sample |\partial\phi/\partial x| \cdot (f \circ \phi) on U.
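With such a chart in hand, a crude Monte Carlo estimate of (4.26) is just the average of Jacobian times integrand over uniform points in U; continuing the sketch above:

    def mc_integral(f, phi, n=100000, rng=np.random.default_rng()):
        """Estimate (4.26): average |dphi/dx| f(phi(x)) over x uniform in U."""
        total = 0.0
        for _ in range(n):
            x, jac = phi(rng.random(2))
            if np.all(np.abs(x) < 1.0):        # f vanishes on M_i \ M
                total += jac * f(x[0], x[1])
        return total / n

    print(mc_integral(f2, phi_polar))          # the polar chart suits f2's singularity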

Below, it will be crucial that there is always more than one way to map U onto M

(4.27)

and that we are free to select the map most suitable for our purposes.

The ideal choice for \phi would be a solution of the partial differential equation |\partial\phi/\partial x| = 1/(f \circ \phi), but this is equivalent to an analytical evaluation of I(f) and is impossible for the cases under consideration. A more realistic goal is to find a \phi such that |\partial\phi/\partial x| \cdot (f \circ \phi) has factorizable singularities and is therefore sampled well by VEGAS. This is still a non-trivial problem, however. For example, consider again the phase space integration for gluon radiation e^+e^- \to q\bar q g. From the Feynman diagrams in figure 4.10 we had seen that the squared matrix element has singularities in both variables s_{1/2} = (q_{1/2} + k)^2 that can not be factorized. On the other hand, it is straightforward to find parametrizations that factorize the dependency on s_1 or s_2 separately.

Returning to the general case, consider N_c different maps \phi_i : U \to M and probability densities g_i : U \to [0, \infty). Then the function

g = \sum_{i=1}^{N_c} \alpha_i\, (g_i \circ \phi_i^{-1}) \left|\frac{\partial\phi_i^{-1}}{\partial p}\right|   (4.28)

is a probability density g : M \to [0, \infty),

\int_M d\mu(p)\, g(p) = 1 ,   (4.29)

as long as the g_i are probability densities themselves and the \alpha_i are properly normalized

\int_0^1 g_i(x)\, d^n x = 1 , \qquad \sum_{i=1}^{N_c} \alpha_i = 1 , \qquad 0 \le \alpha_i \le 1 .   (4.30)

From the definition (4.28), we have obviously

I(f) = \sum_{i=1}^{N_c} \alpha_i \int_M d\mu(p)\, g_i(\phi_i^{-1}(p)) \left|\frac{\partial\phi_i^{-1}}{\partial p}\right| \frac{f(p)}{g(p)}   (4.31)

and, after pulling back from M to N_c copies of U,

I(f) = \sum_{i=1}^{N_c} \alpha_i \int_0^1 g_i(x)\, d^n x\, \frac{f(\phi_i(x))}{g(\phi_i(x))} ,   (4.32)

we find a new estimator of the integral I(f)

E(f) = \sum_{i=1}^{N_c} \alpha_i \left\langle \frac{f \circ \phi_i}{g \circ \phi_i} \right\rangle_{g_i} .   (4.33)

If the g_i in (4.31) and (4.33) are factorized, they can be optimized using the classic VEGAS algorithm [?] unchanged. However, since we have to sample with a separate adaptive grid for each channel, a new implementation [?] is required for technical reasons. Using the N_c^2 maps \pi_{ij} = \phi_j^{-1} \circ \phi_i : U \to U introduced in (4.27), we can write the g \circ \phi_i : U \to [0,\infty) in (4.33) as

g \circ \phi_i = \left|\frac{\partial\phi_i}{\partial x}\right|^{-1} \left( \alpha_i g_i + \sum_{j=1, j \neq i}^{N_c} \alpha_j\, (g_j \circ \pi_{ij}) \left|\frac{\partial\pi_{ij}}{\partial x}\right| \right) .   (4.34)

From a geometrical perspective, the maps \pi_{ij} are just the coordinate transformations from the coordinate systems in which the other singularities factorize into the coordinate system in which the current singularity factorizes. Even if all g_i factorize, the g \circ \phi_i will not necessarily factorize, provided the \phi_i are chosen appropriately. The maps \pi_{ij} are purely geometric objects, independent of the physical model under consideration. They are the coordinate transformations from a coordinate system \Xi_i in which the i-th singularity factorizes to a coordinate system \Xi_j in which the j-th singularity factorizes. The construction of the \pi_{ij} requires some mathematical input, but it is required only once and the procedure allows economical studies of the dependence of the cross section on kinematical cuts and on external parameters through f, because VEGAS can optimize the g_i for each parameter set and set of cuts without further human intervention.

Note that the integral in (4.31) does not change when we use \phi_i : U \to M_i \supseteq M, if we extend f from M to M_i by the definition f(M_i \setminus M) = 0. This is useful, for instance, when we want to cover (-1,1) \otimes (-1,1) by

both Cartesian and polar coordinates. This causes, however, a problem with the \pi_{12} in (4.34). In the diagram

(4.35)

the injections \iota_{1,2} are not onto, and since \pi_{12} is not necessarily a bijection anymore, the Jacobian |\partial\pi_{ij}/\partial x| may be ill-defined. But since f(M_i \setminus M) = 0, we only need the unique bijections \phi'_{1,2} and \pi'_{12} that make the diagram

(4.36)

commute.
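Putting the pieces together for the two-channel example above: a minimal sketch of the estimator (4.33), with the per-channel densities g_i frozen at unity, so that g of (4.28) reduces to a sum of inverse-chart Jacobians (this is the variant discussed under "A Cheaper Alternative" below; the full algorithm would additionally evaluate the adapted g_j on the transition maps \pi_{ij} as in (4.34)). All names are ours:

    def g_density(p, alphas):
        """g(p) of (4.28) with g_i = 1: sum over alpha_i |dphi_i^{-1}/dp|."""
        r = np.hypot(p[0], p[1])
        jac_inv = np.array([1.0 / 4.0, 1.0 / (2.0 * np.pi * np.sqrt(2.0) * r)])
        return alphas @ jac_inv

    def multichannel(f, alphas=np.array([0.5, 0.5]), n=200000,
                     rng=np.random.default_rng()):
        """Multi channel estimator (4.33) over the Cartesian and polar charts."""
        total = 0.0
        for alpha, phi in zip(alphas, (phi_cartesian, phi_polar)):
            acc = 0.0
            for _ in range(n):
                x, _ = phi(rng.random(2))
                if np.all(np.abs(x) < 1.0):    # f(M_i \ M) = 0
                    acc += f(x[0], x[1]) / g_density(x, alphas)
            total += alpha * acc / n
        return total

    print(multichannel(lambda u, v: f1(u, v) + f2(u, v)))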

Multi Channel Sampling

Up to now, we have not specified the \alpha_i; they are only subject to the conditions (4.30). Intuitively, we expect the best results when the \alpha_i are proportional to the contribution of their corresponding singularities to the integral. The option of tuning the \alpha_i manually is not attractive if the optimal values depend on varying external parameters. Instead, we use a numerical procedure [?] for tuning the \alpha_i.

We want to minimize the variance (4.8) with respect to the \alpha_i. This is equivalent to minimizing

W(f, \alpha) = \int_M \left(\frac{f(p)}{g(p)}\right)^2 g(p)\, d\mu(p)   (4.37)

with respect to \alpha, subject to the subsidiary condition \sum_i \alpha_i = 1. After adding a Lagrange multiplier, the stationary points of the variation are given by the solutions to the equations

\forall i : W_i(f, \alpha) = W(f, \alpha) ,   (4.38)

where

W_i(f, \alpha) = -\frac{\partial}{\partial\alpha_i} W(f, \alpha) = \int_0^1 g_i(x)\, d^n x \left(\frac{f(\phi_i(x))}{g(\phi_i(x))}\right)^2   (4.39)

and

W(f, \alpha) = \sum_{i=1}^{N_c} \alpha_i W_i(f, \alpha) .   (4.40)

It can easily be shown [?] that the stationary points (4.38) correspond to local minima. If we use

N_i = \alpha_i N   (4.41)

to distribute N sampling points among the channels, the W_i(f, \alpha) are just the contributions from channel i to the total variance. Thus we recover the familiar result from stratified sampling that the overall variance is minimized by spreading the variance evenly among the channels.

The W_i(f, \alpha) can be estimated with very little extra effort while sampling I(f) (see (4.33)):

V_i(f, \alpha) = \left\langle \left(\frac{f \circ \phi_i}{g \circ \phi_i}\right)^2 \right\rangle_{g_i} .   (4.42)

Note that the factor of g_i/g in the corresponding formula in [?] is absent from (4.42), because we are already sampling with the weight g_i in each channel separately.

The equations (4.38) are non-linear and can not be solved directly. However, the solutions of (4.38) are a fixed point of the prescription

\alpha_i \mapsto \alpha'_i = \frac{\alpha_i\, (V_i(f, \alpha))^\beta}{\sum_i \alpha_i\, (V_i(f, \alpha))^\beta} , \qquad (\beta > 0)   (4.43)

for updating the weights \alpha_i. There is no guarantee that this fixed point will be reached from a particular starting value, such as \alpha_i = 1/N_c, through successive applications of (4.43). Nevertheless, it is clear that (4.43) will concentrate on the channels with large contributions to the variance, as suggested by stratified sampling. Furthermore, empirical studies show that (4.43) is successful in practical applications. The value \beta = 1/2 has been proposed in [?], but it can be beneficial in some cases to use smaller values like \beta = 1/4 to dampen statistical fluctuations.
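A sketch of one update step (4.43), with the V_i estimated as in (4.42) during the previous iteration (the variable names and numbers are ours):

    def update_weights(alphas, v, beta=0.25):
        """One application of (4.43): reweight channels by their estimated
        contribution V_i to the variance, damped by the exponent beta."""
        w = alphas * v**beta
        return w / w.sum()

    alphas = np.array([0.5, 0.5])            # start from equal weights
    v = np.array([12.0, 3.0])                # per-channel estimates of (4.42)
    alphas = update_weights(alphas, v)       # channel 1 gets more weight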

Performance

Both the implementation and the practical use of the multi channel algorithm are more involved than the application of the original VEGAS algorithm.

Therefore it is necessary to investigate whether the additional effort pays off in terms of better performance. A Fortran implementation of this algorithm, VAMP [?], has been used for empirical studies. This implementation features other improvements over "VEGAS Classic", most notably system independent and portable support for parallel processing (see section 4.4) and support for unweighted event generation.

There are two main sources of additional computational costs: at each sampling point the function g \circ \phi_i has to be evaluated, which requires the computation of the N_c - 1 maps \pi_{ij} together with their Jacobians and of the N_c - 1 probability distributions g_i of the other VEGAS grids (see (4.34)).

The retrieval of the current g_i's requires a bisection search in each dimension, i.e. a total of O((N_c - 1) \cdot n_{dim} \cdot \log_2(n_{div})) executions of the inner loop of the search. For simple integrands, this can indeed be a few times more costly than the evaluation of the integrand itself. The computation of the \pi_{ij} can be costly as well. However, unlike the g_i, this computation can usually be tuned manually. This can be worth the effort if many estimations of similar integrals are to be performed. Empirically, straightforward implementations of the \pi_{ij} add costs of the same order as the evaluation of the g_i.

Finally, additional iterations are needed for adapting the weights \alpha_i of the multi channel algorithm described in section 4.3.4. Their cost is negligible, however, because they are usually performed with far fewer sampling points than the final iterations.

Even in cases in which the evaluation of the g_i increases computation costs by a whole order of magnitude, any reduction of the error by more than a factor of 4 will make the multi channel algorithm economical. In fact, it is easy to construct examples in which the error will be reduced by more than two orders of magnitude. The function

f(x) = \frac{b}{144 \arctan(1/2b)} \left( \frac{3\pi\, \Theta(r_3 < 1)}{r_3^2 ((r_3 - 1/2)^2 + b^2)} + \frac{2\pi\, \Theta(r_2 < 1, |x_3| < 1)}{r_2 ((r_2 - 1/2)^2 + b^2)} + \frac{\Theta(-1 < x < 1)}{\cdots} \right)   (4.44)

Figure 4.14: Comparison of the sampling error for the integral of f in (4.44) as a function of the width parameter b for the two algorithms at comparable computational costs.

has a known integral and allows a check of the result. The three terms factorize in spherical, cylindrical and Cartesian coordinates, respectively, suggesting a three channel approach. After five steps of weight optimization consisting of four iterations of 10^5 samples each, we have performed three iterations of 10^6 samples with the VAMP multi channel algorithm. Empirically, we found that we can perform four iterations of 5 \cdot 10^5 samples and three iterations of 5 \cdot 10^6 samples with the classic VEGAS algorithm during the same time period. Since the functional form of f is almost as simple as the coordinate transformations, the fivefold increase of computational cost is hardly surprising.

In figure 4.14, we compare the error estimates derived by the classic VEGAS algorithm and by the three channel VAMP algorithm. As one would expect, the multi channel algorithm does not offer any substantial advantages for smooth functions (i.e. b > 0.01). Instead, it is penalized by the higher computational costs. On the other hand, the accuracy of the classic VEGAS algorithm deteriorates like a power with smaller values of b. At the same time, the multi channel algorithm can adapt itself to the steeper functions, leading to a much slower loss of precision.

The function f in (4.44) has been constructed as a showcase for the multi channel algorithm, of course. Nevertheless, more complicated realistic examples from particle physics appear to gain about an order of magnitude in accuracy. Furthermore, the new algorithm allows unweighted event generation. This is hardly ever possible with the original VEGAS implementation, because the remaining fluctuations typically reduce the average weight to very small numbers.

A particularly attractive application is provided by automated tools for the calculation of scattering cross sections. While these tools can currently calculate differential cross sections without manual intervention, the phase space integrations still require hand tuning of mappings for importance sampling for each parameter set. The present algorithm can overcome this problem, since it requires solving the geometrical problem of calculating the maps \pi_{ij} in (4.34) for all possible invariants only once. The selection and optimization of the channels can then be performed algorithmically.

A Cheaper Alternative

There is an alternative approach that avoids the evaluation of the g_i's, sacrificing flexibility. Fixing the g_i at unity, we have for \tilde g : M \to [0, \infty)

\tilde g = \sum_{i=1}^{N_c} \alpha_i \left|\frac{\partial\phi_i^{-1}}{\partial p}\right|   (4.46)

and the integral becomes

I(f) = \sum_{i=1}^{N_c} \alpha_i \int_M d\mu(p)\, \left|\frac{\partial\phi_i^{-1}}{\partial p}\right| \frac{f(p)}{\tilde g(p)} = \sum_{i=1}^{N_c} \alpha_i \int_0^1 d^n x\, \frac{f(\phi_i(x))}{\tilde g(\phi_i(x))} .   (4.47)

VEGAS can now be used to perform adaptive integrations of the

I_i(f) = \int_0^1 d^n x\, \frac{f(\phi_i(x))}{\tilde g(\phi_i(x))}   (4.48)

individually. In some cases it is possible to construct a set of \phi_i such that the I_i(f) can be estimated efficiently. The optimization of the weights \alpha_i can again be effected by the multi channel algorithm described in section 4.3.4.

The disadvantage of this approach is that the optimal \phi_i will depend sensitively on external parameters and the integration limits. In the approach based on the g in (4.28), VEGAS can take care of the integration limits automatically.

4.4 Parallelization

The computing needs of the LC can only be satisfied economically if parallel computing on commodity hardware can be realized. Traditionally, parallelization is utilized in HEP by running independent Monte Carlo or analysis jobs in parallel and combining their results statistically. Nevertheless, more fine grained parallelism is desirable in Monte Carlo jobs that use adaptive algorithms^2.

The problem of the parallelization of adaptive Monte Carlo integration algorithms, in particular VEGAS, has gained some attention recently [?, ?]. These implementations start from the classic implementation of VEGAS and add synchronization barriers, either mutexes for threads accessing shared memory or explicit message passing. These approaches result in compact

2 Already at LEP2, there are problems with the parallelization of unweighted event generation if the maximum weights have to be found in each parallel job separately [?].

code and achieve high performance, but their low level nature obscures the mathematical structure. Since parallel computing is notoriously difficult, it is desirable to have a well defined mathematical model that frees physicists from having to waste too much effort dealing with technical computing details. Therefore, we suggest a mathematical model of parallelism for adaptive Monte Carlo integration that is independent both of a concrete paradigm for parallelism and of the programming language used for an implementation. We decompose the algorithm and prove that certain parts can be executed in any order without changing the result. As a corollary, we know that they can be executed in parallel. The algorithms presented below have been implemented successfully in the library VAMP [?], along with the multi channel algorithm described in section 4.3.4.

4.4.1 Vegas

As described in section 4.3.3, VEGAS uses two different grids: an adaptive grid G_A, which is used to adapt the distribution of the sampling points, and a stratification grid G_S for stratified sampling. The latter is static and depends only on the number of dimensions and on the number of sampling points. Both grids factorize into divisions d_{A,S}^i:

G_A = d_A^1 \otimes d_A^2 \otimes \cdots \otimes d_A^n   (4.49a)
G_S = d_S^1 \otimes d_S^2 \otimes \cdots \otimes d_S^n .   (4.49b)

The divisions come in three kinds:

d_i^S = \emptyset   (importance sampling)   (4.50a)
d_i^A = d_i^S / m   (stratified sampling)   (4.50b)
d_i^A \neq d_i^S / m   (pseudo-stratified sampling) .   (4.50c)

In the classic implementation of VEGAS [?], all divisions are of the same kind. In a more general implementation [?], this is not required and it can be useful to use stratification only in a few dimensions.

Two-dimensional grids for the cases (4.50a) and (4.50b) have been illustrated in figures 4.11 and 4.12. In case (4.50a), there is no stratification grid and the points are picked at random in the whole region according to G_A. In case (4.50b), the adaptive grid G_A is a regular subgrid of the stratification grid G_S and an equal number of points is picked at random in each cell of G_S. Since d_i^A = d_i^S/m, the points will be distributed according to G_A as well. A one-dimensional illustration of the most complicated case (4.50c) has been shown in figure 4.13.

4.4.2 Formalization of Adaptive Sampling

In order to discuss the problems with parallelizing adaptive integration algorithms and to present a solution, it helps to introduce some mathematical notation. A sampling S is a map from the space \pi of point sets and the space F of functions to the real (or complex) numbers

S : \pi \times F \to \mathbf{R}
    (p, f) \mapsto I = S(p, f) .   (4.51)

For our purposes, we have to be more specific about the nature of the point set p \in \pi. In general, this point set will be characterized by a sequence of pseudo random numbers \rho \in R and by one or more grids G \in \Gamma used for importance or stratified sampling. A simple sampling

S_0 : R \times \Gamma \times A \times F \times \mathbf{R} \times \mathbf{R} \to R \times \Gamma \times A \times F \times \mathbf{R} \times \mathbf{R}
      (\rho, G, a, f, \mu_1, \mu_2) \mapsto (\rho', G, a', f, \mu'_1, \mu'_2) = S_0(\rho, G, a, f, \mu_1, \mu_2)   (4.52)

estimates the n-th moments \mu'_n \in \mathbf{R} of the function f \in F. The integral and its standard deviation can be derived easily from the moments

I = µ1 (4.53a)

\sigma^2 = \frac{1}{N-1} \left( \mu_2 - \mu_1^2 \right) ,   (4.53b)

while the latter are more convenient for the following discussion. In addition, S_0 collects auxiliary information to be used in the grid refinement, denoted by a \in A. The unchanged arguments G and f have been added to the result of S_0 in (4.52), so that S_0 has identical domain and codomain and can therefore be iterated. Previous estimates \mu_n may be used in the estimation of \mu'_n, but a particular S_0 is free to ignore them as well. Using a little notational freedom, we augment \mathbf{R} and A with a special value \perp, which will

73 so that S = rS0 is well defined and we can specify n-step adaptive sampling as

n Sn = S0(rS0) (4.55)

Since, in a typical application, only the estimate of the integral and the standard deviation are used, a projection can be applied to the result of Sn: P : R Γ A F R R R R × × × × × → × (4.56) (ρ, G, a, f, µ1,µ2) (I,σ). 7→ Then

n (I,σ)=PS0(rS0) (ρ, G0, ,f, , ) (4.57) ⊥ ⊥ ⊥ and a good refinement prescription r,suchasVEGAS, will minimize the σ. For parallelization, it is crucial to find a division of Sn or any part of it into independent pieces that can be evaluated in parallel. In order to be effective, r has to be applied to all of a and therefore a synchronization of G before and after r is appropriate. Furthermore, r usually uses only a tiny fraction of the overall CPU time and it makes little sense to invest a lot of effort into parallelizing it. On the other hand, S0 can be parallelized naturally, because all operations are linear, including he computation of a.Weonlyhaveto make sure that the cost of communicating the results of S0 and r back and forth during the computation of Sn do not offset any performance gain from parallel processing. When we construct a decomposition of S0 and proof that it does not change the results, i.e.

S0 = ιS0φ (4.58) or

N N i=1 S0 N i=1 Gi i =1 Gi −−−−→L Lφ L ι , (4.59)

x S0  G G  −−−→ y where φ is a forking operation and ι is a joining operation, we are faced with the technical problem of a parallel random number source ρ. As made explicit in (4.52), S0 changes the state of the random number generator ρ. Demanding identical results, imposes therefore a strict ordering on the operations and defeats parallelization. In principle, it is possible to

74 devise implementations of S0 and ρ that circumvent this problem by dis- tributing subsequences of ρ in such a way among processes that the results do not depend on the number of parallel processes. However, a reordering of the random number sequence will only change the result by an amount on the order of the statistical sampling error, as long as the scale of the allowed reorderings is bounded and much smaller than the period of the random number generator3. Below, we will therefore use the notation x y for “equal for an appropriate finite reordering of the ρ used in calculating≈x and y”. For our purposes, the relation x y is strong enough and allows simple and efficient implementations. ≈

4.4.3 Multilinear Structure of the Sampling Algorithm

Since S0 is essentially a summation, it is natural to expect a linear structure

S0(ρi,Gi,ai,f,µ1,i,µ2,i) S0(ρ, G, a, f, µ1,µ2) (4.60a) ≈ i M where

ρ = ρi (4.60b) i M G = Gi (4.60c) i M a = ai (4.60d) i M µn = µn,i (4.60e) i M for appropriate definitions of “ ”. For the moments, we have standard ad- dition ⊕

µn,1 µn,2 = µn,1 + µn,2 (4.61) ⊕ and since we only demand equality up to reordering, we only need that the ρi are statistically independent. This leaves us with G and a and we have to discuss importance sampling and stratified sampling separately.

3 Arbitrary reorderings on the scale of the period of the pseudo random number generator have to be forbidden, of course.

Importance Sampling

In the cases of naive Monte Carlo without grid optimization and of importance sampling, the natural decomposition of G is to take j copies of the same grid G/j that are identical to G, each with one j-th of the total sampling points. As long as the a are linear themselves, we can add them up just like the moments

a_1 \oplus a_2 = a_1 + a_2   (4.62)

and we have found a decomposition (4.60). In the case of VEGAS, the a_i are sums of function values at the sampling points. Thus they are obviously linear and this approach is applicable to VEGAS in the importance sampling mode.

Stratified Sampling

The situation is more complicated in the case of stratified sampling. The first complication is that in pure stratified sampling there are only two sampling points per cell. Splitting the grid in two pieces as above can provide only a very limited amount of parallelization. The second complication is that the a are no longer linear, since they correspond to a sampling of the variance per cell and no longer of the function values themselves. However, as long as the samplings contribute to disjoint bins only, we can still "add" the variances by combining bins. The solution is therefore to divide the grid into disjoint bins along the divisions of the stratification grid and to assign a set of bins to each processor. Finer decompositions will incur higher communication costs and other resource utilization.

An implementation based on PVM is described in [?], which minimizes the overhead by running identical copies of the grid G on each processor. Since most of the time is usually spent in function evaluations, it makes sense to run a full S_0 on each processor, skipping function evaluations everywhere but in the region assigned to the processor. This is a neat trick, which is unfortunately tied to the computational model of message passing systems such as PVM and MPI. More general paradigms can not be supported since the separation of the state for the processors is not explicit (it is implicit in the separated address space of the PVM or MPI processes).

However, it is possible to implement (4.60) directly in an efficient manner. This is based on the observation that the grid G used by VEGAS is factorized into divisions D_j for each dimension

G = \bigotimes_{j=1}^{n_{dim}} D_j   (4.63)

and that decompositions of the D_j induce decompositions of G

G_1 \oplus G_2 = \left( \bigotimes_{j=1}^{i-1} D^j \otimes D_1^i \otimes \bigotimes_{j=i+1}^{n_{dim}} D^j \right) \oplus \left( \bigotimes_{j=1}^{i-1} D^j \otimes D_2^i \otimes \bigotimes_{j=i+1}^{n_{dim}} D^j \right)
               = \bigotimes_{j=1}^{i-1} D^j \otimes \left( D_1^i \oplus D_2^i \right) \otimes \bigotimes_{j=i+1}^{n_{dim}} D^j .   (4.64)

We can translate (4.64) directly to code that performs the decomposition D^i = D_1^i \oplus D_2^i discussed below and simply duplicates the other divisions D^j, j \neq i. A decomposition along multiple dimensions is implemented by a recursive application of (4.64).
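A literal transcription of (4.64) for a factorized grid represented as a list of per-dimension bin-edge arrays; this is our illustration, not the VAMP data structures:

    import numpy as np

    def fork_grid(grid, dim, n_forks):
        """Decompose a factorized grid along one dimension, as in (4.64):
        copy the divisions of all other dimensions and split dimension
        `dim` into n_forks consecutive chunks of bins (assumes n_forks
        divides the bin count)."""
        edges = grid[dim]
        n = (len(edges) - 1) // n_forks
        parts = []
        for k in range(n_forks):
            sub = list(grid)
            sub[dim] = edges[k * n : (k + 1) * n + 1]
            parts.append(sub)
        return parts

    grid = [np.linspace(0.0, 1.0, 26) for _ in range(3)]   # 25 bins per dimension
    halves = fork_grid(grid, 0, 2)                          # D = D1 (+) D2 along dimension 0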

In VEGAS, the auxiliary information a inherits a factorization similar to the factorization (4.63) of the grid,

a = (d^1, \ldots, d^{n_{dim}}) ,   (4.65)

but not a multilinear structure. Instead, as long as the decomposition respects the stratification grid, we find in place of (4.64)

a_1 \oplus a_2 = (d_1^1 + d_2^1, \ldots, d_1^i \oplus d_2^i, \ldots, d_1^{n_{dim}} + d_2^{n_{dim}})   (4.66)

with "+" denoting the standard addition of the bin contents and "\oplus" denoting the aggregation of disjoint bins. If the decomposition of the division would break up cells of the stratification grid, the equation (4.66) would be incorrect, because, as discussed above, the variance is not linear.

Now it remains to find a decomposition

D^i = D_1^i \oplus D_2^i   (4.67)

for both the pure stratification mode and the pseudo stratification mode of VEGAS (see figures 4.12 and 4.13). In the pure stratification mode, the stratification grid is strictly finer than the adaptive grid and we can decompose along either of them immediately. Technically, a decomposition along the coarser of the two is straightforward. Since a typical adaptive grid already

Figure 4.15: Forking one dimension d of a grid into three parts ds(1), ds(2), and ds(3). The picture illustrates the most complex case of pseudo stratified sampling (see figure 4.13).

has more than 25 bins, a decomposition along the stratification grid offers no advantages for medium scale parallelization (O(10) to O(100) processors) and the decomposition along the adaptive grid has been implemented. This scheme is particularly convenient, because the sampling algorithm S0 can be applied unchanged to the individual grids resulting from the decomposition. For pseudo stratified sampling (see figure 4.13), the situation is more complicated, because the adaptive and the stratification grid do not share bin boundaries. Since VEGAS does not use the variance in this mode, it would be theoretically possible to decompose along the adaptive grid and to mimic the incomplete bins of the stratification grid in the sampling algorithm. However, this would cause technical complications, destroying the universality of S0. Instead, the adaptive grid is subdivided in a first step in

\mathrm{lcm}\left( \frac{\mathrm{lcm}(n_f, n_g)}{n_f}, n_x \right)   (4.68)

bins^4, such that the adaptive grid is strictly finer than the stratification grid. This procedure is shown in figure 4.15. The code for forking and duplicating a single dimension suffices as a building block for the general recursion [?], as illustrated in figure 4.16.
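A small numerical check of the counting in footnote 4:

    from math import gcd

    def bins_per_fork(n_g, n_f):
        """Coarsest refinement of n_g bins that n_f equal forks respect:
        lcm(n_f, n_g) / n_f = n_g / gcd(n_f, n_g) bins per fork."""
        return n_g // gcd(n_f, n_g)

    print(bins_per_fork(25, 3))   # 25 bins forked three ways -> 25 bins per fork
    print(bins_per_fork(24, 3))   # -> 8 bins per fork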

4.4.4 Random Numbers

The implementation [?] of this procedure takes advantage of the ability of Knuth's new preferred generator [?] to generate provably statistically independent subsequences. However, since the state of the random number generator is explicit in all procedure calls, other means of obtaining subsequences can be implemented in a trivial wrapper.

The results of the parallel example will depend on the number of processors, because this affects the subsequences being used. Of course, the variation will be compatible with the statistical error. It must be stressed that the results are deterministic for a given number of processors and a

4 The coarsest grid covering the division of n_g bins into n_f forks has n_g / \gcd(n_f, n_g) = \mathrm{lcm}(n_f, n_g) / n_f bins per fork.



given set of random number generator seeds. Since parallel computing environments allow fixing the number of processors, debugging of exceptional conditions remains possible.

Figure 4.16: Recursively fork a 6 \times 5 \times 4 grid for distribution among 12 = 3 \times 2 \times 2 processors (forking along dimension 3, then 2, then 1).

4.5 Dedicated Phase Space Algorithms

No description of Monte Carlo algorithms for phase space integration would be complete without mentioning the beautiful RAMBO algorithm [?] for massless n-particle phase space. RAMBO proceeds by generating massless momentum vectors isotropically, without regard to energy or momentum conservation. In a second step, these vectors are boosted with a Lorentz transformation to their common rest frame to enforce momentum conservation, and the lengths of the massless vectors are rescaled by a common factor to ensure energy conservation without spoiling momentum conservation. It can be shown [?] that this transformation has a unit Jacobian. Therefore the phase space distribution can be generated without a rejection step. The only obvious additional cost is that an n-particle final state uses 4n random numbers for 3n - 4 degrees of freedom.

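The two steps translate into a few lines; this is our illustrative transcription of the published massless algorithm, not the original implementation:

    import numpy as np

    def rambo(n, sqrt_s, rng=np.random.default_rng()):
        """Massless n-particle phase space: isotropic vectors, then a boost
        to the common rest frame and a rescaling to total energy sqrt_s.
        Uses 4n random numbers for 3n - 4 degrees of freedom; the weight
        is uniform, so no rejection step is needed."""
        r = rng.random((n, 4))
        cos_t = 2.0 * r[:, 0] - 1.0
        sin_t = np.sqrt(1.0 - cos_t**2)
        phi = 2.0 * np.pi * r[:, 1]
        e = -np.log(r[:, 2] * r[:, 3])                   # isotropic energies
        q = np.column_stack([e, e * sin_t * np.cos(phi),
                             e * sin_t * np.sin(phi), e * cos_t])
        Q = q.sum(axis=0)                                # total momentum
        m = np.sqrt(Q[0]**2 - Q[1:] @ Q[1:])
        b, gamma = -Q[1:] / m, Q[0] / m
        a, x = 1.0 / (1.0 + gamma), sqrt_s / m
        bq = q[:, 1:] @ b
        p = np.empty_like(q)
        p[:, 0] = x * (gamma * q[:, 0] + bq)             # boost and rescale
        p[:, 1:] = x * (q[:, 1:] + np.outer(q[:, 0] + a * bq, b))
        return p

    p = rambo(4, 500.0)
    print(p.sum(axis=0))   # close to (500, 0, 0, 0): energy-momentum conservation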
Unfortunately, the transformation from the 4n random numbers to the 3n - 4 independent variables has a substantial hidden cost: the relation of the integration variables to the kinematical variables in which matrix elements have singularities is lost and adaptive Monte Carlo algorithms are almost useless. For this reason, RAMBO is mainly useful for smooth background distributions. There is a modification (called MAMBO) for light massive particles [?], but the Jacobians are no longer unity and the resulting event weights make the algorithm not very efficient.

—5—
Linear Colliders

As Bohemia withstood them, it that one fine day was pardoned to the sea and now lies by the water. [Ingeborg Bachmann: Böhmen liegt am Meer]

The marquee measurement in the area of gauge boson physics at a linear collider is without a doubt the precision determination of their self-couplings, known as Triple Gauge Couplings (TGC). As has been stressed in chapter 2, the measurements at LEP2 [?] (see figure 1.4) have been a very important confirmation, but can not have been surprising, because a consistent description of the low energy data leaves little room for deviations from the SM predictions at a level that would be detectable at LEP2 [?]. As will be shown in section 5.1, this is qualitatively different at a LC, because higher energy and luminosity allow probing the couplings at a level where deviations due to different realizations of EWSB are possible.

5.1 Triple Gauge Couplings

The most popular parametrization of the most general Lorentz invariant W^+W^-Z^0- and W^+W^-\gamma-vertices has been proposed by Hagiwara et al. [?]:

L_{WWV} / g_{WWV} =
    i g_1^V \left( W_{\mu\nu}^\dagger W^\mu V^\nu - W_\mu^\dagger V_\nu W^{\mu\nu} \right) + i \kappa_V W_\mu^\dagger W_\nu V^{\mu\nu} + i \frac{\lambda_V}{M_W^2} W_{\lambda\mu}^\dagger W^\mu_{\ \nu} V^{\nu\lambda}
    - g_4^V W_\mu^\dagger W_\nu \left( \partial^\mu V^\nu + \partial^\nu V^\mu \right) + g_5^V \epsilon^{\mu\nu\lambda\rho} \left( W_\mu^\dagger \partial_\lambda W_\nu - \partial_\lambda W_\mu^\dagger\, W_\nu \right) V_\rho
    + i \tilde\kappa_V W_\mu^\dagger W_\nu \tilde V^{\mu\nu} + i \frac{\tilde\lambda_V}{M_W^2} W_{\lambda\mu}^\dagger W^\mu_{\ \nu} \tilde V^{\nu\lambda} ,   (5.1)

where V is either Z^0 or \gamma, while g_{WW\gamma} = -e and g_{WWZ} = -e \cot\theta_w are the SM couplings. All field-strength tensors in (5.1) are abelian, V_{\mu\nu} = \partial_\mu V_\nu - \partial_\nu V_\mu, with the tilde denoting the dual tensor. At tree level in the SM, the parameters in (5.1) take the values

\Delta\kappa_{\gamma,Z} = \kappa_{\gamma,Z} - 1 = 0   (5.2a)
\Delta g_1^{\gamma,Z} = g_1^{\gamma,Z} - 1 = 0   (5.2b)
\lambda_{\gamma,Z} = 0   (5.2c)
\tilde\lambda_{\gamma,Z} = \tilde\kappa_{\gamma,Z} = g_4^{\gamma,Z} = g_5^{\gamma,Z} = 0 .   (5.2d)

Electromagnetic U(1)_Q gauge invariance dictates \Delta g_1^\gamma = g_5^\gamma = 0 at zero momentum transfer. The parametrization (5.1) is calculationally convenient, because the parameters translate directly to coefficients of the same size in the Feynman rules. Therefore, the matrix elements used in most Monte Carlo event generators are expressed in terms of (5.1).

Another advantage of the parameters in (5.1) is that they translate immediately, without unnaturally small or large coefficients, to model independent physical observables for the electromagnetic properties of charged vector bosons, such as the magnetic dipole

\mu_W = \frac{e}{2 m_W} \left( g_1^\gamma + \kappa_\gamma + \lambda_\gamma \right) ,   (5.3a)

electric quadrupole

Q_W^e = -\frac{e}{m_W^2} \left( \kappa_\gamma - \lambda_\gamma \right) ,   (5.3b)

electric dipole

d_W = \frac{e}{2 m_W} \left( \tilde\kappa_\gamma + \tilde\lambda_\gamma \right) ,   (5.3c)

and magnetic quadrupole

Q_W^m = -\frac{e}{m_W^2} \left( \tilde\kappa_\gamma - \tilde\lambda_\gamma \right)   (5.3d)

moments of W^\pm bosons [?].
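As a quick consistency check of the normalization, inserting the SM tree-level values (5.2) into (5.3a) and (5.3b) reproduces the gyromagnetic ratio g = 2 of a minimally coupled gauge boson:

\mu_W = \frac{e}{2 m_W} (1 + 1 + 0) = \frac{e}{m_W} , \qquad Q_W^e = -\frac{e}{m_W^2} (1 - 0) = -\frac{e}{m_W^2} .

Anomalous values of \kappa_\gamma and \lambda_\gamma would show up directly as deviations from these moments.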

Figure 5.1: Sensitivity to the parameters of a non-linearly realized EWSB sector with \int\mathcal{L} = 200 fb^{-1} and \sqrt{s} = 800 GeV [?]. The vertical bars indicate the constraint on x_{10} derived from LEP1 analyses [?]. The blind direction (2\sin^2\theta_w, -2\cos^2\theta_w, \cos^2\theta_w - \sin^2\theta_w) does not contribute to the triple gauge couplings.

Figure 5.2: The sensitivity ellipses of figure 5.1 rescaled to the new high luminosity TESLA with \int\mathcal{L} = 1 ab^{-1}.

On the other hand, (5.1) does not offer any indication of the probable size of the individual contributions. Completely model independent fits of all 14 parameters simultaneously are not possible, even at a high luminosity LC. Furthermore, the parameters in (5.1) are not necessarily constant and could be polynomials in the invariant masses or even general form factors. Obviously, g_5^V, \tilde\kappa_V, and \tilde\lambda_V are CP-violating couplings and their study could be postponed.

A more systematic approach starts from a basis [?] for the operators that can appear in an SU(2)_L \otimes U(1)_Y / U(1)_Q effective Lagrangian and classifies them according to their order in the momentum expansion. For non-linear realizations, NDA provides the necessary hierarchy. A systematic study can then start from the lowest dimensional operators and can incorporate symmetries like the custodial SU(2)_c, which is strongly suggested by the low energy electroweak precision data. The operators of dimension four in the momentum expansion contributing to the TGC are [?]

L = -i \frac{x_{9l}}{16\pi^2} \mathrm{tr}\left( g W^{\mu\nu} D_\mu U^\dagger D_\nu U \right) - i \frac{x_{9r}}{16\pi^2} \mathrm{tr}\left( g' B^{\mu\nu} D_\mu U^\dagger D_\nu U \right) + \frac{x_{10}}{16\pi^2} \mathrm{tr}\left( U^\dagger g' B_{\mu\nu} U g W^{\mu\nu} \right) .   (5.4)

The contributions of the interaction (5.4) to the TGC can be expressed in terms of the general parametrization (5.1) by comparing coefficients^1 of the trilinear terms:

g_1^Z = 1 - \frac{e^2}{2 \sin^2\theta_w \cos^2\theta_w} \frac{x_{9l}}{16\pi^2} + \frac{e^2}{\cos^2\theta_w (\cos^2\theta_w - \sin^2\theta_w)} \frac{x_{10}}{16\pi^2}   (5.5a)

\kappa_Z = 1 - \frac{e^2}{2 \sin^2\theta_w} \frac{x_{9l}}{16\pi^2} + \frac{e^2}{2 \cos^2\theta_w} \frac{x_{9r}}{16\pi^2} + \frac{2 e^2}{\cos^2\theta_w - \sin^2\theta_w} \frac{x_{10}}{16\pi^2}   (5.5b)

1 The erroneous non-singular transformation formula in [?] is corrected for the special case x_{9l} = x_{9r} in [?] and for the general case in [?].

\kappa_\gamma = 1 - \frac{e^2}{2 \sin^2\theta_w} \frac{x_{9l}}{16\pi^2} - \frac{e^2}{2 \sin^2\theta_w} \frac{x_{9r}}{16\pi^2} - \frac{e^2}{\sin^2\theta_w} \frac{x_{10}}{16\pi^2} .   (5.5c)

Unfortunately, the transformation (5.5) is singular and the blind direction

(x_{9l}, x_{9r}, x_{10})_{blind} \propto (2 \sin^2\theta_w, -2 \cos^2\theta_w, \cos^2\theta_w - \sin^2\theta_w)   (5.6)

in the parameter space can not be observed in measurements of the TGC alone. However, LEP1 analyses (e.g. [?]) provide independent constraints on x_{10}, which can lift the degeneracy.

The coefficients in (5.4) have been normalized such that their natural size according to NDA (see sections 2.7 and A.3.5) is of order one. Consequently, the all-important question from a physics perspective is: are experiments at a LC sensitive to values for x_{9l}, x_{9r}, x_{10} of order one?

Detector effects appear to be small and preliminary studies can therefore work on the generator level with simple geometrical acceptance cuts [?]. Much more important than detector effects are the distortions of the distributions from hard, almost collinear initial state photon radiation, when the photon is lost in the beam-pipe. Since these effects cause a migration of events in the phase space, simple acceptance corrections do not suffice and more complicated unfolding procedures are required.

In one study [?] (see also [?]) the unfolding has been performed by generating a large sample of Monte Carlo events (ten times the size of the expected data sample at TESLA) with WOPPER [?] and summing over all radiative and non-radiative events with the same experimental signature. The events were subsequently reweighted with anomalous TGC using fast matrix elements of WPHACT [?] to obtain the likelihood function for the TGC. Using a quadratic approximation of the likelihood function for the "old" high luminosity option at TESLA with \int\mathcal{L} = 200 fb^{-1} and \sqrt{s} = 800 GeV, the sensitivities in figure 5.1 have been derived. A more careful analysis will have to take non-linearities in the likelihood function into account^2, but they can not change the conclusion that the LC is definitely sensitive to values for x_{9l}, x_{9r}, x_{10} of order one, if constraints on x_{10} are taken into account.

Increasing the integrated luminosity according to the "new" TESLA high luminosity option by a factor of five to \int\mathcal{L} = 1 ab^{-1}, as shown in figure 5.2, shows that such a machine would allow precision tests of the TGC at the level of 10% of reasonable deviations from the SM, caused by different realizations of the EWSB sector.

2 The non-linearities had been neglected because the unfolding would have exceeded the limited computing resources in a non-linear fit.

Figure 5.3: "Single-W" contribution to e^+e^- \to e^- \bar\nu u \bar d.

5.2 Single W^\pm Production

At energies that are high compared to the masses of all participating particles, all cross sections with a finite limit for massless particles have to fall off as dictated by dimensional analysis:

\sigma(s) \propto \frac{1}{s} .   (5.7a)

Cross sections that have contributions from the exchange of light particles in the t-channel do not necessarily have a finite limit and can diverge in the forward direction if the mass of the exchanged particle is sent to zero. Such cross sections can scale like

\sigma(s) \propto \frac{1}{\mu^2} \ln\left(\frac{s}{\mu^2}\right) ,   (5.7b)

where \mu^2 denotes a typical kinematical scale set by a minimum momentum transfer. For the t-channel neutrino exchange in W^\pm pair production, the forward singularity is cut off kinematically by the W^\pm mass and the cross section is still dominated by (5.7a) in the TeV region. The situation is different for "single-W" production as shown in figure 5.3, because the electron or positron that goes undetected in the forward direction does not provide a kinematical cut-off. The observable cross section is still small at LEP2, but at a LC it will be larger than that of W^\pm pair production.

A unique physics opportunity of single-W production in the context of TGC is that, unlike W^\pm pair production, the diagram in figure 5.3 is sensitive to the W^+W^-\gamma-couplings alone, because the Z^0-exchange is not singular. Therefore it can be used to disentangle W^+W^-Z^0- and W^+W^-\gamma-couplings without studying \gamma\gamma-collisions in a dedicated collider.

For LEP2, W^\pm pair production has been studied extensively [?] and many reliable Monte Carlo event generators have been prepared [?]. A more recent dedicated comparison of theoretical predictions has revealed that the situation is less favorable for single-W production. At LC energies, unacceptable variations of the predictions for the total cross section on the order of up to five percent have been reported [?, ?]. It appears that these discrepancies can be traced back to incomplete implementations of effects from the finite electron mass.

The lesson to be learned is that a simple translation of the electron mass into a lower cut-off on the electron scattering angle will not suffice at a LC and more complete calculations in the forward region are required, if single-W production is to be used for precision physics, like the disentangling of W^+W^-Z^0- and W^+W^-\gamma-couplings.

The classification of the gauge invariant classes of Feynman diagrams for single-W production (see figure 3.7) allows improving the handling of radiative corrections. The single-W and W^\pm pair production contributions to the same four fermion final state can be folded with separate radiator functions, without spoiling gauge invariance. Since the two classes have very different characteristic scales and the interferences are small, this improves the predictions considerably.

5.3 Beamstrahlung

Another issue that differentiates a LC from LEP2 is the non-trivial beam energy spectrum caused by beamstrahlung. Unlike initial state radiation, beamstrahlung is not characterized by the scale of the hard interaction, but by the collective interaction of the crossing bunches, which must be much denser than the bunches in storage rings to obtain sufficient luminosity. Therefore, beamstrahlung is a more complex physical phenomenon, but it is independent of the hard interaction under study and can be parametrized in a single universal beam spectrum for each collider design at each beam energy.

The comparison of analytical approximations [?] with the results of microscopic simulations [?, ?] reveals that the former are not adequate. A successful pragmatic solution is to parametrize the results of the microscopic simulations with a simple factorized ansatz [?]

D_{p_1 p_2}(x_1, x_2, s) = d_{p_1}(x_1)\, d_{p_2}(x_2) ,   (5.8)

where x_{e^\pm,\gamma} = E_{e^\pm,\gamma} / E_{beam} and

d_{e^\pm}(x) = a_0\, \delta(1 - x) + a_1 x^{a_2} (1 - x)^{a_3}   (5.9a)

d_\gamma(x) = a_4 x^{a_5} (1 - x)^{a_6} .   (5.9b)

Figure 5.4: Vector boson scattering subprocess in six-fermion production.

For a_{0,1,4} \ge 0 and a_{2,3,5,6} \ge -1, the ansatz (5.9) corresponds to a positive and integrable distribution and can reproduce the soft singularities from multiple emission at x_{e^\pm} \to 1 and x_\gamma \to 0 for a_{3,5} < 0. The ansatz (5.9) is able to fit the results of microscopic simulations well [?]. Fits for the latest collider parameter sets are available as distributions and as an efficient random number generator in the library circe [?].
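Sampling from the electron ansatz (5.9a) is straightforward, since it is a delta peak plus a beta distribution; the parameter values below are arbitrary placeholders for illustration, not a circe fit:

    import math
    import numpy as np

    def sample_d_e(a0, a1, a2, a3, n, rng=np.random.default_rng()):
        """Draw n energy fractions x from (5.9a): with probability
        proportional to a0 take the delta peak at x = 1, otherwise draw
        from the beta distribution proportional to x^a2 (1 - x)^a3."""
        # continuum weight a1 * B(a2 + 1, a3 + 1)
        w = a1 * math.gamma(a2 + 1.0) * math.gamma(a3 + 1.0) / math.gamma(a2 + a3 + 2.0)
        x = rng.beta(a2 + 1.0, a3 + 1.0, size=n)
        x[rng.random(n) < a0 / (a0 + w)] = 1.0
        return x

    x = sample_d_e(a0=0.5, a1=0.4, a2=12.0, a3=-0.6, n=100000)   # placeholder values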

5.4 Quadruple Gauge Couplings and W^\pm Scattering

While LEP2 is, for gauge boson physics, essentially a four fermion production machine, a LC can also produce final states with six or more fermions, which have contributions from quadruple gauge boson couplings. In addition to the coupling of an s-channel Z^0 or \gamma to three vector bosons, there are the kinematical configurations corresponding to the fusion of almost on-shell vector bosons, as shown in figure 5.4. This part of the phase space is particularly attractive for studying the interactions of strongly interacting vector bosons. Realistic estimates for the physics reach in this channel have already been obtained using an improved equivalent particle approximation for the "production" of the vector bosons in the intermediate state [?] (see also [?]). Interesting physics can be probed, but a high luminosity option is required.

The construction of a complete unweighted event generator for e^+e^- \to 6f, including general gauge boson couplings, is still a challenge. Complete calculations for selected channels exist [?, ?], which are expected to be reliable in the regions of the phase space corresponding to top pair production and Higgs production. More work is required for a proper treatment of the part of the phase space that corresponds to the scattering of almost on-shell W^\pm's

as in figure 5.4. Currently, work is underway that uses the Monte Carlo methods described in section 4.3.4 to bridge this gap.

—6—
Conclusions

While it is never safe to say that the future of Physical Science has no marvels even more astonishing than those of the past, it seems probable that most of the grand underlying principles have been firmly established and that further advances are to be sought chiefly in the rigorous applications of these principles to all the phenomena which come under our notice. It is here that the science of measurement shows its importance—where quantitative results are more to be desired than qualitative work. An eminent physicist has remarked that the future truths of Physical Science are to be looked for in the sixth place of decimals. [Albert Michelson]

There is ample model independent evidence from low energy data that weak interactions exchange vector and axial-vector quantum numbers, coupling to charged and neutral currents.

The attempt to incorporate these interactions in a renormalizable QFT, in which perturbation theory is well defined without any short distance cut-off, has singled out spontaneously broken gauge theories in which the weak interactions are described by the exchange of massive gauge bosons. The same conclusion is reached by attempting to enforce high energy tree-level unitarity of scattering amplitudes.

As discussed in chapter 2 (with a little help from appendix A), gauge theories retain their special status, even if the electroweak SM is interpreted as an EFT, which may have to be replaced by a more comprehensive theory at an energy scale beyond the physics reach of current colliders. Instead of

guaranteeing the consistency of formal^1 perturbation theory to arbitrarily high energies, the improved high energy behavior is used for organizing the expansion of the effective interaction in a more systematic way than would be possible with non-gauge vector bosons. Selecting the gauge theory among all equivalent EFTs that describe the same physics (on-shell matrix elements) does not sacrifice any generality.

Chapter 3 described how to systematically organize tree-level calculations for many particle final states in such a way that non-abelian gauge invariance can be maintained by partial sums. This is important in state-of-the-art calculations that would exhaust available human and computing resources if they had to be done in a single step. In chapter 4, methods for improving numerical convergence and parallelization of computations have been presented that can be used for obtaining high precision theoretical predictions for event rates in channels with many particles in the final state, as discussed briefly in chapter 5.

Currently, a pessimistic scenario can not be excluded, in which a fundamental Higgs particle will be found, together with a few supersymmetry partners of SM particles, compatible with the MSSM. While these discoveries would be a tremendous achievement of experimental physics and the culmination of decades of theoretical work, it would probably put the answers to the open questions in the physics of flavor out of our reach. However, we must not forget which eminent "future truths of Physical Science" were found when Michelson had a look at "the sixth place of decimals." Rumors about the completion of physics are almost always greatly exaggerated ...

1 I.e. without regard to the convergence of the perturbative series.

Appendix

—A—
The Renormalization Group

Know that this universe is what it claims to be: an infinite one. Never attempt, trusting in the digestive power of your logic, to swallow it; rather be thankful if, by skillfully ramming this or that firm pillar into the chaos, you prevent it from swallowing you. [Thomas Carlyle: The French Revolution^1]

Until a few years ago, textbooks on elementary particle physics and QFT have hardly mentioned the method of EFT and have usually given treatments of the RG that have been more formalistic than lucid. Consequently, these subjects have had more of an oral tradition than canonical scriptures. Fortunately this has changed now with the advent of two excellent textbooks offering both a pedagogical [?] and a systematic [?, ?] exposition of QFT.

These lectures have been given to an audience of graduate students working in experiments and on theoretical projects. Therefore they had to live up to the combined challenge of presenting theorists new angles on a subject they had already studied, while at the same time offering concrete applications to experimentalists, without burying them in formalism.

The intuitive reasoning in these lectures avoids the horrendous combinatorial complications of perturbative renormalization theory. Nevertheless, mathematical proofs of the intuitive observations still need the technical results of traditional perturbative renormalization theory. These lectures deliberately use a strictly perturbative language in order to appeal to the audience's intuition, which is trained on Feynman diagrams. Many results remain valid outside of PT, but no attempt is made to mention this systematically.

1 Translation and quote from Thomas Mann: Betrachtungen eines Unpolitischen.

Figure A.1: One-loop contribution to e^+e^- \to q\bar q.

A.1 Introduction

A part of that power which always wills the evil and always works the good. [J. W. v. Goethe: Faust I]

What is the size of leptons and quarks? According to the most recent report of the Particle Data Group (PDG) [?], the compositeness scale is larger than 1.5 TeV for all of them, but even the best limits fall short of 5 TeV. In the SM, we assume all fermions to be strictly pointlike. As you all know, the SM is very successful. Yet, we can't pin down the size of its constituents below 0.2 TeV^{-1}. How is this possible?

The easy way out is to argue that our accelerators are too small to probe any deeper. This is too easy, however, because we claim to control the SM at the quantum (i.e. one-loop) level. The description of a typical LEP1 experiment e^+e^- \to q\bar q involves loop diagrams like the one in figure A.1. And in this diagram, the loop momentum k is not bounded anywhere and there is a part of the integration region where it exceeds the energy scale \Lambda \approx 1/\Delta x. At this point we can rephrase the question: why is the compositeness scale irrelevant to physics at LEP1 energies? The answer will be that the compositeness interactions

\Delta L = \frac{1}{2\Lambda^2}\, \bar\psi \Gamma_\mu \psi\, \bar\psi \Gamma^\mu \psi   (A.1)

are indeed irrelevant, and the term "irrelevant" will be given a precise technical definition below. This might be a bit surprising, because most of us have learned in graduate school that interactions like (A.1) are verboten in loop calculations, because they make the theory non-renormalizable.

A closely related question is: why don't we have to understand M-theory to do phenomenological particle physics? According to the theoretical party

Figure A.2: Electron self energy

line, we expect a lot of new particles (grand unification gauge and Higgs bosons, higher string and membrane modes, etc.) somewhere above 10^16 GeV. Why do they have so little impact on our bread and butter phenomenology that involves loop diagrams?

A.1.1 History

QFT was introduced shortly after Quantum Mechanics (QM) in the 1920s. But while QM was put on a solid mathematical basis in the 1930s, it took much longer to make sense out of QFT. Soon (in the early 1930s), it was discovered that the second order of PT gave non-sensical results in QED. For example, the electron self energy (see figure A.2, but the calculators in the 1930s had to work without the benefits of Feynman diagrams, of course) came out infinite. Even though the lowest order predictions were very successful, the inability to calculate corrections was unsettling, to put it mildly.

By the late 1940s, a workable scheme was established for absorbing the infinities in renormalized coupling constants and masses. Nevertheless, this process of renormalization was considered a nuisance and it was expected that a complete theory would not need it.

Even though the renormalization procedure apparently worked in practical applications, it took another twenty years of hard theoretical work until a mathematical proof that the procedure works in all orders of PT was established in the 1960s.

Today, the formalism of perturbative QFT is well established and the electroweak precision measurements at LEP1 leave no doubt that its results are phenomenologically sound. Nevertheless, it should be remembered that no mathematical proof for the existence of an interacting QFT in four space-time dimensions has been given that works outside of the realms of PT.

A.1.2 Renormalizability

Let us look at a simple example to get a feel for the physics of renormalization. For the vacuum polarization diagram of QED (see figure A.3)

$$\mathrm{i}\Pi_{\mu\nu}(p) = -\mathrm{i}\,(p^2 g_{\mu\nu} - p_\mu p_\nu)\,\Pi(p^2) \tag{A.2}$$

Figure A.3: Vacuum polarization

$$\Pi(p^2) = -\frac{\alpha}{3\pi}\int_{-p^2}^{\infty}\frac{\mathrm{d}(-k^2)}{(-k^2)} + \text{finite} \tag{A.3}$$
we find a logarithmically divergent integral over $k$. Obviously, the divergence comes from the region $-k^2 \to \infty$, where the $ee\gamma$-vertex has not been tested experimentally. A brute force remedy is to cut off the integral at an arbitrary, but very large, scale $\Lambda$

$$\tilde\Pi(p^2) = -\frac{\alpha}{3\pi}\int_{-p^2}^{\Lambda^2}\frac{\mathrm{d}(-k^2)}{(-k^2)} + \text{finite} = -\frac{\alpha}{3\pi}\ln\frac{\Lambda^2}{-p^2} + \text{finite}\,. \tag{A.4}$$
This leaves us with a logarithmic dependence on the unphysical parameter $\Lambda$ and is not acceptable in this form. We can however impose a renormalization condition

$$\tilde\Pi(p^2)\Big|_{p^2=-\mu^2} = 0 \tag{A.5}$$
that trades the dependence on the unphysical cut-off $\Lambda$ for the dependence on the physical renormalization point $\mu$ at which we define the charge. Even though the result is now finite
$$\Pi^R(p^2) = -\frac{\alpha}{3\pi}\ln\frac{-p^2}{\mu^2} \tag{A.6}$$
we are faced with a potentially large logarithm $\ln(p^2/\mu^2)$, which reaches 25 for LEP2 energies ($\sqrt{s} \approx 190\,\mathrm{GeV}$) with a charge renormalized at low energies ($\mu \approx m_e$): indeed $\ln(s/m_e^2) = 2\ln(\sqrt{s}/m_e) \approx 25$. The small coupling constant of QED helps to keep this logarithm relatively harmless, but since LEP1 has introduced the era of electroweak precision physics, higher orders in PT are required for reliable quantitative predictions. We will see below that there are cases where the effective coupling is much larger and a resummation of the perturbation series is required.

It is far from obvious that the subtraction procedure that I have sketched so crudely works to all orders in PT. It is not even obvious that all required subtractions correspond to a renormalization of coupling constants

Figure A.4: Vacuum polarization counterterm

and masses. While I will not go into detail, it will be helpful later to establish some nomenclature at this point. The subtraction leading from (A.4) to (A.6) is obviously equivalent to adding the new counterterm vertex depicted in figure A.4 to the theory. Thinking in terms of counterterms instead of subtractions has the advantage that the generalization to higher orders is more intuitive.

In order to get a feeling for the required counterterms, let us consider the free field actions for scalars $\phi$, spin-1/2 fermions $\psi$, and vectors $A_\mu$

$$L_0^\phi = \int\mathrm{d}^4x\,\left[\frac{1}{2}\frac{\partial\phi(x)}{\partial x_\mu}\frac{\partial\phi(x)}{\partial x^\mu} - \frac{m_\phi^2}{2}\phi^2(x)\right] \tag{A.7a}$$
$$L_0^\psi = \int\mathrm{d}^4x\,\left[\bar\psi(x)\,\mathrm{i}\gamma^\mu\frac{\partial}{\partial x^\mu}\psi(x) - m_\psi\bar\psi(x)\psi(x)\right] \tag{A.7b}$$
$$L_0^A = \int\mathrm{d}^4x\,\left(-\frac{1}{4}F_{\mu\nu}(x)F^{\mu\nu}(x)\right)\,, \quad F_{\mu\nu}(x) = \frac{\partial A_\nu(x)}{\partial x^\mu} - \frac{\partial A_\mu(x)}{\partial x^\nu}\,. \tag{A.7c}$$
We are using units with $\hbar = c = 1$ and therefore the actions must be dimensionless. If we assign dimension +1 to mass

$$\dim(m) = 1\,, \tag{A.8}$$
then length has dimension $-1$ in these units
$$\dim\left(\mathrm{d}^4x\right) = -4 \tag{A.9a}$$
$$\dim\left(\frac{\partial}{\partial x_\mu}\right) = 1\,. \tag{A.9b}$$
Now we can read off the dimensions of the fields from (A.7) immediately

$$\dim(\phi(x)) = 1 \tag{A.10a}$$
$$\dim(\psi(x)) = \frac{3}{2} \tag{A.10b}$$
$$\dim(A_\mu(x)) = 1\,. \tag{A.10c}$$
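To spell out one instance of this bookkeeping (the explicit step is mine): demanding that the fermion kinetic term in (A.7b) be dimensionless gives
$$\dim\left(\mathrm{d}^4x\right) + 2\dim(\psi) + \dim\left(\frac{\partial}{\partial x^\mu}\right) = -4 + 2\dim(\psi) + 1 = 0 \quad\Longrightarrow\quad \dim(\psi) = \frac{3}{2}\,;$$
the entries (A.10a) and (A.10c) follow in the same way from (A.7a) and (A.7c).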

These dimensions are reflected in the high energy behavior of the propagators, which is $1/p^{2d-4}$:
$$\int\mathrm{d}^4x\,e^{\mathrm{i}px}\,\langle 0|T\phi(x)\phi(0)|0\rangle = \frac{\mathrm{i}}{p^2 - m_\phi^2 + \mathrm{i}\epsilon} \tag{A.11a}$$
$$\int\mathrm{d}^4x\,e^{\mathrm{i}px}\,\langle 0|T\psi(x)\bar\psi(0)|0\rangle = \mathrm{i}\,\frac{p\!\!\!/ + m_\psi}{p^2 - m_\psi^2 + \mathrm{i}\epsilon} \tag{A.11b}$$

$$\int\mathrm{d}^4x\,e^{\mathrm{i}px}\,\langle 0|TA_\mu(x)A_\nu(0)|0\rangle = \frac{-\mathrm{i}g_{\mu\nu}}{p^2 + \mathrm{i}\epsilon}\,. \tag{A.11c}$$
The power counting for loop integrals in Feynman diagrams can be related directly to the dimensions of the interaction operators. Let me illustrate this by some examples.² I will give a more comprehensive account from the point of view of the RG in later sections.

Consider a one-loop integral with two $\phi^6$ insertions. Since the loop integral
$$\int\frac{\mathrm{d}^4k}{(2\pi)^4}\,\frac{1}{k^2(p-k)^2} \tag{A.12}$$
is logarithmically divergent, we need a $\phi^8$ counterterm
$$\text{(loop diagram)} + \text{(counterterm)} = \text{finite}\,. \tag{A.13}$$
Similarly, a $\phi^4$ and a $\phi^6$ operator need a $\phi^6$ counterterm
$$\text{(loop diagram)} + \text{(counterterm)} = \text{finite} \tag{A.14}$$
and two $\phi^4$ operators need a $\phi^4$ counterterm
$$\text{(loop diagram)} + \text{(counterterm)} = \text{finite}\,. \tag{A.15}$$
This suggests a pattern, which can indeed be verified by a more detailed analysis. If we have multiple insertions of operators with dimension greater than four, then counterterms of higher and higher dimension are needed. On the other hand, operators of dimension less than or equal to four require counterterms of dimension less than or equal to four.

This pattern can already be seen from dimensional arguments. If an operator has dimension greater than four, it must come with a coupling constant of negative dimension, e.g.
$$\frac{1}{\Lambda^2}\,\frac{1}{6!}\,\phi^6(x)\,, \quad \dim(\Lambda) = 1\,. \tag{A.16}$$

² A systematic exposition to all orders in PT is well beyond the scope of these lectures, but can be found in any QFT text.

If the loop integral does not depend on the scale in this coupling constant, then the product of two such operators will have a coupling constant of an even more negative dimension. The product will therefore be of even higher dimension, e.g.
$$\frac{1}{\Lambda^2}\,\frac{1}{6!}\,\phi^6(x)\;\times\;\frac{1}{\Lambda^2}\,\frac{1}{6!}\,\phi^6(y) \;\to\; \frac{1}{16\pi^2}\,\frac{1}{\Lambda^4}\,\frac{1}{8!}\,\phi^8(x)\,. \tag{A.17}$$
Since all building blocks have a positive dimension (see (A.9b) and (A.10)), there is only a finite number of operators of a given dimension. Therefore, interactions of dimension less than four can be renormalized by a finite number of counterterms.³ As long as a finite number of counterterms suffices, the theory retains its predictive power. As a result, interactions of dimension four are called renormalizable and interactions of dimension less than four are called super-renormalizable. There are additional complications with the preservation of symmetries, which do not affect the general argument, however.

It was then very reasonable to expect that nature is described by a renormalizable QFT, such as QED, QCD or the SM. These theories have been very successful and as a result, all other (non-renormalizable) theories fell from grace and became second class citizens. However, two very important questions were apparently never posed, let alone answered:

1. Why is nature described by a renormalizable QFT? I will show later that the RG provides a satisfactory answer to this question.

2. Why should our low energy QFT be correct to arbitrarily high energies? It is a contribution of string theory to particle physics that this question is no longer taboo.
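This pattern can be summarized compactly (the following formula is standard power counting, not spelled out in the text above): the superficial degree of divergence of a one-loop scalar diagram with $E$ external legs and vertices of dimensions $d_i$ is
$$D = 4 - E + \sum_i (d_i - 4)\,.$$
For the one-loop diagram with two $\phi^6$ insertions ($d_i = 6$, $E = 8$) this gives $D = 4 - 8 + 2 + 2 = 0$, i.e. a logarithmic divergence requiring exactly the $\phi^8$ counterterm of (A.13), and raising the operator dimensions raises $D$ further.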

A.1.3 The Renormalization Group

The great Cinderella story of theoretical particle physics since 1970 has been the rise of renormalization from a nuisance (sweeping things under the rug) to a powerful tool. The turning points of this epic are

1971: Wilson's RG [?] for critical phenomena (i.e. phase transitions) and strong interactions proved that renormalization could help physics insight, instead of hindering it.

³ While this is rather intuitive from the one-loop examples, the mathematical proof is extremely complicated in the general case. It took forty years of effort [?] of some of the brightest theoretical physicists of their time to disentangle the multi-loop calculations when the loops are not simply nested, but are overlapping.

1972: The proof of renormalizability of spontaneously broken gauge theories [?, ?] established a viable candidate for a QFT of weak interactions (now known as the SM).

1973: Asymptotic freedom [?] paved the way for QCD, the QFT of strong interactions.

1979: Weinberg’s EFT paper [?] summarized and formalized the folklore of systematic expansions in the energy.

1990s: Even the non-renormalizable theories have been reinstated as first class citizens under the name of EFT.

By the time of Weinberg's paper, the physics and the formalism were well understood by the experts, but they remained for some time part of an oral tradition that was covered insufficiently by textbooks.

A.2 The Renormalization Group

caminante, no hay camino
se hace camino al andar.
[Antonio Machado: Proverbios y cantares, VI]

We evaluate the predictions of QFT by doing integrals: loop integrals from Feynman diagrams in PT and Feynman path integrals beyond PT. The RG has a very intuitive interpretation [?] (see also [?]) in the evaluation of Feynman's path integral

$$\int\!\mathcal{D}\phi\;e^{\mathrm{i}S(\phi)} \tag{A.18}$$

A.2.1 Cracking Integrals

There are (at least) three common ways to do integrals in physics: from the anti-derivative, from the theorem of residues, or from a Riemann (or Cauchy or Lebesgue or ...) construction. The first method is familiar to all of you from high school days, but it is frowned upon in certain circles because the second method enables physicists to do "harder" integrals and

Figure A.5: Nested divergencies

Figure A.6: Overlapping divergencies

the third method allows mathematicians to integrate more (i. e. “wilder”) functions. As one might guess from this introduction, I will now sing the praise of the first method. The problem of calculating the integral

$$I(f; a, b) = \int_a^b\mathrm{d}x\,f(x) \tag{A.19}$$

$$\frac{\mathrm{d}}{\mathrm{d}x}F(x) = f(x)\,, \quad F(b) = C\,, \tag{A.20}$$

A.2.2 Wilson’s Insight We have seen heuristically in section A.1 that the divergencies and large logarithms in PT originate from logarithmic integrals in loop momenta. For nested divergencies (see figure A.5), it is intuitively clear that the proper pro- cedure is to proceed from the inside out and to subtract the subdivergencies first. Then only the overall divergency from the outermost loop remains.

More intricate is the treatment of overlapping divergencies (see figure A.6), where each subdivergency has to be subtracted without double counting. The proof that the subtraction algorithm is mathematically sound to all orders in PT is an extremely complicated combinatorial exercise [?].

Nevertheless, it can be given a lucid physical interpretation, if we change our point of view. Why should we insist on doing the loop integrals first and subtracting divergencies later? As we have seen in (A.4) and (A.6), the leading logarithmic dependence on the renormalization scale can be recovered from the dependence on the cut-off by dimensional analysis. Wouldn't it be easier to calculate the dependence on the cut-off directly? As we shall see now, this heuristic argument can be made precise and results in a powerful calculational tool, the RG. The essence of the method lies in deriving a differential equation, the Renormalization Group Equation (RGE), that can be integrated to give the desired result. It will turn out that the calculation never involves any infinite numbers, and in most cases the coefficients of the PT will never be large.

A.2.3 The Sliding Cut-Off

For simplicity, I will now concentrate on massless $\phi^4$-theory. This is the theory of a spinless, uncharged, massless particle that interacts through a local $\phi^4$ vertex. The arguments go through for more realistic QFTs, but the technical complications arising from gauge invariance would obscure the point that I'm trying to make. I will deal later with the effects from massive particles. Let us assume for the moment that the theory has been defined with a cut-off in momentum space, i.e. a propagator

$$\mathrm{i}D_\Lambda(k^2) = \frac{\mathrm{i}}{k^2 + \mathrm{i}\epsilon}\,\Theta(|k| \leq \Lambda)\,. \tag{A.21}$$
The cut-off $|k| \leq \Lambda$ needs some explanation, because it appears not to be Lorentz invariant. In fact, $\Theta(|k| \leq \Lambda)$ is just a symbolic notation. A simple minded cut-off like $|k^2| \leq \Lambda^2$ is invariant, but it is not restrictive enough, because $k$ can grow along light-like directions ($k^2 = 0$) without being cut off. A more appropriate definition can be found by using the "Wick rotation". As can be seen from figure A.7, in each loop integration, the integration contour in the complex $k^0$ plane can be deformed from the dashed curve to the dotted curve without crossing singularities. Then we can perform the substitution $(k^0, \vec k) = (\mathrm{i}k_E^0, \vec k_E)$ and the Minkowski "length" becomes a Euclidean length: $k^2 = (k^0)^2 - \vec k^2 = -(k_E^0)^2 - \vec k_E^2 = -k_E^2$. The Lorentz

Figure A.7: Wick rotation

invariant cut-off $k_E^2 \leq \Lambda^2$ is now effective, because there are no light-like directions in Euclidean space. We shall interpret $\Theta(|k| \leq \Lambda)$ in this fashion.

We can now compare the theories defined with two different cut-offs $\Lambda$ and $\Lambda' < \Lambda$. It turns out that we can change the theory in such a way that the physics does not change when we lower the cut-off, as long as the external momenta remain below the new cut-off. To see how this works, let us introduce a graphical notation

$$\mathrm{i}D_\Lambda(k^2) = \text{(graphical symbol)} \tag{A.22a}$$
$$\mathrm{i}D_{\Lambda'}(k^2) = \text{(graphical symbol)} \tag{A.22b}$$
$$\mathrm{i}D_\Lambda(k^2) - \mathrm{i}D_{\Lambda'}(k^2) = \text{(graphical symbol)} \tag{A.22c}$$
for propagators in the full theory, the low energy theory, and their difference. For the one-loop correction to the propagator, we find below the new cut-off
$$\text{(loop with cut-off $\Lambda$)} = \text{(loop with cut-off $\Lambda'$)} + \text{(loop with the difference propagator)}\,. \tag{A.23}$$
Therefore the physics will not change, if we introduce a new vertex
$$\text{(new vertex)} = \text{(tadpole loop with the difference propagator)} \tag{A.24}$$
into the low energy theory. When we calculate (A.24) explicitly, it will be useful later to calculate the more general integral
$$I(n, D) = \int\frac{\mathrm{d}^Dk}{(2\pi)^D}\left(\frac{1}{k^2 + \mathrm{i}\epsilon}\right)^n\,\Theta(\Lambda' \leq |k| \leq \Lambda)\,, \tag{A.25}$$
defined in $D$ space-time dimensions. As discussed above, we will perform a "Wick rotation"
$$I(n, D) = (-1)^n\,\mathrm{i}\int\frac{\mathrm{d}^Dk_E}{(2\pi)^D}\left(\frac{1}{k_E^2}\right)^n\,\Theta(\Lambda'^2 \leq k_E^2 \leq \Lambda^2)\,. \tag{A.26}$$
The volume of the surface of the $D$-dimensional sphere is given by

$$\int\mathrm{d}\Omega_D = \frac{2\pi^{D/2}}{\Gamma(D/2)} \tag{A.27}$$
and therefore

$$I(n, D) = \frac{(-1)^n\,\mathrm{i}}{(4\pi)^{D/2}\,\Gamma(D/2)}\int_{\Lambda'^2}^{\Lambda^2}\mathrm{d}k_E^2\;(k_E^2)^{D/2-1-n}\,, \tag{A.28}$$
which can be calculated explicitly

$$I(D/2, D) = \frac{(-1)^{D/2}\,\mathrm{i}}{(4\pi)^{D/2}\,\Gamma(D/2)}\,\ln\frac{\Lambda^2}{\Lambda'^2} \tag{A.29a}$$
$$I(n, D)\Big|_{n \neq D/2} = \frac{(-1)^n\,\mathrm{i}}{(4\pi)^{D/2}\,\Gamma(D/2)}\,\frac{1}{D/2 - n}\left(\Lambda^{D-2n} - \Lambda'^{D-2n}\right)\,. \tag{A.29b}$$
In the case of the tadpole diagram (A.24) we find finally

$$\int\frac{\mathrm{d}^4k}{(2\pi)^4}\,\frac{\mathrm{i}}{k^2 + \mathrm{i}\epsilon}\,\Theta(\Lambda' \leq |k| \leq \Lambda) = \mathrm{i}\,I(1, 4) = \frac{1}{(4\pi)^2}\left(\Lambda^2 - \Lambda'^2\right) \tag{A.30}$$
for the new vertex (this is just (A.29b) with $n = 1$, $D = 4$, where $\Gamma(2) = 1$ and $D/2 - n = 1$). In the four point function

$$\text{(four point function in the full theory)} = \text{(low energy diagrams)} + \text{(diagrams with high momentum propagators in the three channels)}\,, \tag{A.31}$$
we need more vertices. First a modified four particle vertex

$$\text{(modified four particle vertex)} = \text{(one-loop diagram with two difference propagators)}\,. \tag{A.32}$$

Under the assumption that the external momenta satisfy $|p| \leq \Lambda' < \Lambda$, we can do the loop integral in (A.32) easily to leading order

$$\frac{1}{2}\int\frac{\mathrm{d}^4k}{(2\pi)^4}\,\frac{\mathrm{i}\Theta(\Lambda' \leq |k| \leq \Lambda)}{k^2 + \mathrm{i}\epsilon}\,\frac{\mathrm{i}\Theta(\Lambda' \leq |p - k| \leq \Lambda)}{(p - k)^2 + \mathrm{i}\epsilon} = -\frac{1}{2}\,\frac{\mathrm{i}}{16\pi^2}\,\ln\frac{\Lambda^2}{\Lambda'^2} + O\!\left(\frac{p^2}{\Lambda^2}\right)\,. \tag{A.33}$$
The exact expression is more complicated, due to the finite integration limits, but we will not need its explicit form. More important is the observation that all integrals exist, since they are over a compact domain in Euclidean space, far from any singularities. Below, after motivating the physics, we shall see

how to perform calculations more efficiently. Anyway, there are three such diagrams and the new vertex is

$$-\frac{3}{2}\,\frac{g^2}{16\pi^2}\,\ln\frac{\Lambda^2}{\Lambda'^2}\;\frac{1}{4!}\,\phi^4(x)\,. \tag{A.34}$$
There is also a new six particle vertex

$$\text{(new six particle vertex)} = \text{(tree diagram with one difference propagator)}\,, \tag{A.35}$$
that reads (again neglecting small terms $O(p^2/\Lambda^2)$):

$$\frac{g^2}{\Lambda^2}\,\frac{1}{6!}\,\phi^6(x)\,. \tag{A.36}$$

The general procedure consists in a stepwise reduction of the scale $\Lambda'$

$$\Lambda > \Lambda' > \Lambda'' > \Lambda''' > \dots \tag{A.37}$$
where the Lagrangian is adjusted in each step such that the low energy observables below the current cut-off scale are unchanged. If we allow arbitrary powers of the field $\phi$ and derivatives, the low energy physics can indeed be reproduced fully, as shown above.
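The stepwise reduction can be made concrete in a few lines of code. The following sketch (mine; the one-loop coefficient is read off from (A.34)) lowers the cut-off in many small steps, adding the induced $\phi^4$ vertex at each step, and compares the result with the resummed solution (A.46) obtained below:

import math

# Wilsonian flow of the phi^4 coupling at one loop: lower the cut-off in
# many small steps, shifting g4 by the induced vertex (A.34) at each step,
# i.e. delta g4 = -(3 g4^2 / 32 pi^2) * delta ln(Lambda^2/Lambda'^2).

def evolve_stepwise(g4, lam, lam_prime, steps=100000):
    dlog = 2.0 * math.log(lam / lam_prime) / steps  # step in ln(Lambda^2/Lambda'^2)
    for _ in range(steps):
        g4 -= 3.0 * g4 ** 2 / (32.0 * math.pi ** 2) * dlog
    return g4

def evolve_closed_form(g4, lam, lam_prime):
    # the resummed solution (A.46)
    log = math.log(lam ** 2 / lam_prime ** 2)
    return g4 / (1.0 + 3.0 * g4 / (32.0 * math.pi ** 2) * log)

g4, lam, lam_prime = 1.0, 1.0e16, 1.0e2
print(evolve_stepwise(g4, lam, lam_prime))     # ~0.620
print(evolve_closed_form(g4, lam, lam_prime))  # ~0.620

The agreement of the two numbers illustrates that accumulating infinitesimal cut-off steps is precisely what resums the logarithm in (A.46).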

A.2.4 Renormalization Group Flow

The most general interaction can be expanded

$$L(x) = \sum_i g_i\,O_i(x) \tag{A.38}$$

$$O_i(x) \in \left\{\phi^2(x),\,\phi^4(x),\,(\partial\phi)^2(x),\,\phi^6(x),\,(\phi\,\partial\phi)^2(x),\,\dots\right\}\,, \tag{A.39}$$
where an infinite sum will not necessarily correspond to a local interaction. A discrete renormalization group transformation of the scale dependent coupling constants

$$g_i(\Lambda) \to g_i(\Lambda') = \sum_j \Gamma_{ij}(\Lambda', \Lambda)\,g_j(\Lambda) \tag{A.40}$$
is defined by the requirement that the low energy physics remains unchanged. However, discrete transformations like (A.40) are technically hard to handle. It is much more transparent to consider a continuous transformation

Figure A.8: The evolution of the coefficients $g_4$ and $g_6$ as a function of a cut-off scale $\Lambda'$ (left hand side) and after elimination of the unphysical cut-off scale $\Lambda'$ (right hand side).

with infinitesimal generators instead. Then the RGE is a system of coupled non-linear ordinary differential equations

$$\Lambda'^2\,\frac{\mathrm{d}g_i(\Lambda')}{\mathrm{d}\Lambda'^2} = \sum_j \gamma_{ij}\,g_j(\Lambda')\,. \tag{A.41}$$

As a concrete example, consider the evolution of the coefficients $g_4$ and $g_6$ in
$$L_{\text{int}}(x) = \frac{g_4}{4!}\,\phi^4(x) + \frac{g_6}{6!}\,\phi^6(x) + \dots \tag{A.42}$$
with a change of the cut-off scale $\Lambda'$, as depicted on the left hand side of figure A.8. By construction, the low-energy physics does not change along the trajectories in figure A.8, as long as each coupling uses the same $\Lambda'$. The cut-off scale $\Lambda'$ is therefore just conventional and determines what is considered to be part of the lowest order Lagrangian and what is to be calculated from it in higher orders. Thus the scale $\Lambda'$ is redundant and can be eliminated from the physical parameter space. The physics is therefore not determined by a point in the parameter space, but by a whole trajectory instead.

Since $\Lambda'$ is redundant, it is desirable to eliminate it completely. As long as $g_4(\Lambda')$ can be inverted (which is always possible in fixed orders of PT), the physics on the left hand side of figure A.8 can be recast as the right hand side of the same figure, where the dimensionful cut-off scale $\Lambda'$ has been traded for the dimensionless coupling parameter $g_4$. This is an example of the celebrated phenomenon of dimensional transmutation [?], of which the relation between the strong coupling $\alpha_S$ and the QCD scale $\Lambda_{\text{QCD}}$ is the most famous example.

So far, we have ignored the dependence on the upper cut-off scale $\Lambda$, which was assumed to be fixed and much larger than any physics scale. After the introduction of the RG trajectories, the scale $\Lambda$ obtains a straightforward interpretation as the starting point of the trajectory. In this interpretation, the low energy physics will not depend on $\Lambda$ if the RGE can be solved for all $\Lambda$. However, the existence of global solutions is not a trivial condition and there are several cases to consider.

Figure A.9: All trajectories on the left hand side extend to $\Lambda' \to \infty$ and a continuum limit exists. All trajectories on the right hand side even remain perturbative for $\Lambda' \to \infty$.

Figure A.10: The low energy theory on the left hand side can never be valid above $\Lambda_0$. The low energy theory on the right hand side can be valid above $\Lambda_0$, but only for a restricted part of the parameter space.

In the scenario depicted on the left hand side of figure A.9, all trajectories extend to $\Lambda' \to \infty$. Therefore, the starting point of the evolution can be pushed back to $\Lambda \to \infty$, which means that the cut-off can be removed altogether and a continuum limit exists. However, one must be careful that the trajectories might lead through non-perturbative regions with $g \gtrsim 1$, where a perturbative determination of the evolution equation is unreliable.

Even more favorable is the scenario depicted on the right hand side of figure A.9, where all trajectories that start in the low energy regime, where couplings are determined experimentally, remain perturbative for all $\Lambda' \to \infty$. Perturbative calculations are reliable everywhere in this scenario. The most favorable special case of this scenario is asymptotic freedom [?], in which the couplings vanish for $\Lambda' \to \infty$ and low orders of PT provide reliable predictions for high energy experiments.

In stark contrast to the previous two scenarios, no trajectory on the left hand side of figure A.10 can be extended from the low energy domain beyond the scale $\Lambda_0$. Thus the QFT that describes the present experiments can not be valid above $\Lambda_0$, unless non-perturbative effects change the trajectories significantly close to $\Lambda_0$. Such a scenario is not entirely bad news, because it predicts that something "interesting" will happen around $\Lambda_0$: at least strong interactions, maybe even "new physics."

An interesting combination of the scenarios in figures A.9 and A.10 is shown on the right hand side of figure A.10: some trajectories can be extended to $\Lambda' \to \infty$, while others are confined to $\Lambda' < \Lambda_0$. Typically, there will be a critical coupling $g_0(\Lambda')$: trajectories with $g(\Lambda') < g_0(\Lambda')$ can be extended to arbitrarily high scales, while trajectories with $g(\Lambda') > g_0(\Lambda')$ can not. As a result, there will be upper limits on the couplings at any given scale, if the theory is

intended to be defined at arbitrarily high scales. The most famous example of this phenomenon is provided by the upper limits on the Higgs self coupling, resulting in an upper limit on the Higgs mass $m_H^2 = g/2\cdot\langle\phi\rangle^2$ in the SM [?].

A.2.5 Relevant, Marginal Or Irrelevant?

While the graphical representation of the RG flow is very suggestive for a few couplings, the untruncated problem has to deal with infinitely many couplings. Therefore, it is not obvious that the intuition gained from the low-dimensional examples can be used in the infinite dimensional space of couplings. This is particularly problematic, because operators of higher and higher dimension produce more and more divergent contributions to Feynman diagrams and need even higher dimensional counterterms (see section A.1.2). Fortunately, "the first shall be last and the last shall be first" and the higher dimensional operators turn out to be harmless from the point of view of the RG.

If there was no interaction, the couplings would be constant. In this case, we can estimate matrix elements of the operators of dimension $d$ by dimensional analysis as (in four space-time dimensions)
$$\left(\frac{|p|}{\Lambda}\right)^{d-4} \tag{A.43}$$
where $\Lambda$ is the high energy scale at which the "bare" couplings are defined and $d$ is the power counting dimension of the operator, counted according to (A.9b) and (A.10). In (A.43), we can identify three different cases according to the value of $d$:

• $d < 4$: these operators are termed relevant, because their contributions become more important at low energies, where experiments are performed.

• $d = 4$: these operators are termed marginal.

• $d > 4$: these operators are termed irrelevant, because their contributions become less important at low energies, where experiments are performed.

Therefore, the initial conditions at high energies for higher dimensional ("non renormalizable") interactions play no important rôle in low energy physics. Unless there are very strong interactions, relevant and irrelevant operators will retain their property, even if interactions are switched on. This can be seen from the evolution
$$\Lambda'^2\,\frac{\mathrm{d}g_2(\Lambda')}{\mathrm{d}\Lambda'^2} = \Lambda'^2\,\frac{1}{16\pi^2}\,g_4(\Lambda') + \dots \tag{A.44a}$$

d 4 p − | | (A.43) Λ   where Λ is the high energy scale at which the “bare” couplings are defined and d is the power counting dimension of the operator, counted according to (A.9b) and (A.10). In (A.43), we can identify three different cases according to the value d d<4: these operators are termed relevant, because their contributions • are becoming more important at low energies, where experiments are performed. d = 4: these operators are termed marginal. • d>4: these operators are termed irrelevant, because their contribu- • tions are becoming less important at low energies, where experiments are performed. Therefore, the initial conditions at high energies for higher dimensional (“non renormalizable”) interactions play no important rˆole in low energy physics. Unless there are very strong interactions, relevant and irrelevant operators will retain their property, even if interactions are switched on. This can be seen from the evolution dg (Λ ) 1 2 2 0 =Λ2 g (Λ )+... (A.44a) Λ0 2 0 2 4 0 dΛ0 16π 107 Figure A.11: Contributions of irrelevant operators do not necessarily vanish in the infrared, but the dependence of these contributions on their initial conditions at high energy vanishes in the infrared.

$$\Lambda'^2\,\frac{\mathrm{d}g_4(\Lambda')}{\mathrm{d}\Lambda'^2} = \frac{3}{2}\,\frac{1}{16\pi^2}\,g_4^2(\Lambda') + \dots \tag{A.44b}$$
$$\Lambda'^2\,\frac{\mathrm{d}g_6(\Lambda')}{\mathrm{d}\Lambda'^2} = \frac{1}{\Lambda'^2}\,g_4^2(\Lambda') + \dots \tag{A.44c}$$
of the low dimensional operators:

• $d = 2$ (i.e. $g_2$ or $m^2$): the rate of change of this coupling is of dimension 2: $O(\Lambda'^2)$. Unless there is significant fine tuning of the initial conditions, the value of the coupling will be of the order of the high energy starting point of the evolution: $g_2 = O(\Lambda^2) \gg \Lambda'^2$.

• $d = 6$ (i.e. $g_6$): the rate of change of the coupling is of dimension $-2$: $O(\Lambda'^{-2})$. Therefore the value of the initial condition at the high energy scale is indeed irrelevant with respect to the value induced by the evolution.

To avoid a possible misunderstanding induced by the technical term "irrelevant operator", it is important to stress that the contributions of irrelevant operators do not necessarily vanish in the infrared. Instead, the dependence of these contributions on their initial conditions at high energy vanishes, because the RG trajectories flow together in the infrared, as shown in figure A.11. Thus the initial conditions for the irrelevant operators have no effect on the results of low energy experiments. In fact, the coefficients of the irrelevant operators can be set to zero at the high scale without changing the observable physics at lower energies. Therefore, the phenomena can be described by a renormalizable theory.

To avoid another possible misunderstanding: this observation does not "prove" that only a renormalizable QFT can describe the observed phenomena. It only states that there is always a renormalizable QFT that is indistinguishable from a non-renormalizable theory at low energies. Since renormalizable theories are technically more convenient and depend on fewer parameters, common sense and Occam's razor suggest to prefer the renormalizable QFT with only relevant or marginal interactions at the high scale over the others with the same infrared behavior.

The technical advantage of renormalizable theories lies in the fact that, for a given set of conserved quantum numbers, there is only a finite number of relevant or marginal operators, which is even quite small. This simplifies calculations and increases the predictive power of these theories, because fewer free parameters have to be fitted in experiments.

In fact, from the point of view of the RG, the relevant ("super-renormalizable") operators are more problematic than the non-renormalizable ones. The most prominent examples are masses of light particles. Unless there is a symmetry that forces the mass to be small, most trajectories will lead to a value of the mass of the order of the high energy cut-off scale, which is incompatible with the assumption that the particle is light. The selection of a trajectory that flows towards a light mass in the infrared requires unnatural fine tuning in this case.

Fermion masses can be protected by a chiral symmetry and need no fine tuning. The same applies to the Goldstone bosons of a spontaneous symmetry breaking: Goldstone's theorem protects their masses as well. Other scalars need to be related to fermions through a supersymmetry in order to be protected against a naturally large mass.

The marginal operators are the most interesting from the theoretical point of view. Without interaction, their contributions do not change with the scale. Therefore even weak interactions can make the marginal couplings grow strong or become weak asymptotically. From the RGE for $g_4$

$$\Lambda'^2\,\frac{\mathrm{d}g_4(\Lambda')}{\mathrm{d}\Lambda'^2} = \beta(g_4(\Lambda')) \tag{A.45}$$
it can be seen that the behavior of the coupling depends on the sign of the $\beta$-function:

• $\beta > 0$ (as in the scalar example considered above: $\beta = 3g_4^2/(32\pi^2)$): the coupling grows strong in the ultraviolet and becomes weak in the infrared,

• $\beta < 0$ (e.g. QCD, see section A.2.9): the coupling grows strong in the infrared and becomes weak in the ultraviolet.

The solution of (A.45) for our scalar example is

$$g_4(\Lambda') = \frac{g_4(\Lambda)}{1 + \frac{3}{32\pi^2}\,g_4(\Lambda)\,\ln\frac{\Lambda^2}{\Lambda'^2}}\,. \tag{A.46}$$

Figure A.12 displays more contributions to the $\beta$-function for $g_4$, but all these contributions are $O(\Lambda^{-2})$ and can be ignored in leading order.

Figure A.12: Suppressed contributions to the $\beta$-function for $g_4$, which are $O(\Lambda^{-2})$.

If the interactions are strong enough, relevant operators can become marginal or even irrelevant and vice versa. However, this is impossible in the perturbative domain and therefore outside our present scope. The non-perturbative calculation of path integrals via the solution of a RGE is in principle possible, but technically very complicated. The known solutions require a truncation of the RGE, and the systematic error introduced in this fashion is very hard to quantify.

A.2.6 Leading Logarithms

In addition to providing an explanation for the fortunate circumstance that nature seems to be described by a renormalizable QFT, the study of RG flows provides a more reliable calculational tool than the direct perturbative evaluation of Feynman diagrams.

In theories with widely separated mass scales, PT suffers from large coefficients in the perturbative series that are caused by logarithmic integrals from one mass scale to the next. In contrast, the coefficients of the RGE never contain more than one scale, which is determined by the slice of momentum space that is being integrated out. Thus these coefficients can not contain large logarithms and the theory remains perturbative as long as the couplings are weak. Observable large logarithms are recovered only in the final stage of the calculation, the solution of the RGE. Therefore, large unphysical logarithms that cancel in observables can never appear in intermediate steps.

A.2.7 Callan Symanzik Equations

The calculation of integrals over a small slice of momentum space like (A.25) or (A.33) is technically demanding, because the finite integration limits make it impossible to use many of the standard tricks of the trade, like the shifting of integration variables. In addition, fixed cut-offs manifestly violate gauge invariance, which makes their practical application in gauge theories [?] very awkward.

At this point, the relation of the RG flow to perturbative renormalization comes in handy. A change of the lower cut-off scale $\Lambda'$ corresponds to

a change of the Lagrangian that keeps the physics unchanged. In perturbative renormalization, the renormalization scale $\mu$ plays a similar rôle as a scale that characterizes how much of the interaction is absorbed into the Lagrangian in the form of local counterterms. Therefore, it is tempting to identify $\Lambda'$ with $\mu$.

The analogy is not complete, however. The hard cut-off $\Lambda'$ is replaced by counterterms that correspond to a soft cut-off, which is switched on gradually. This is the price to pay for gauge invariance and technical convenience. If we were to sum all orders of the perturbative series, both methods would give the same result. For a truncated series, however, differences of higher order will remain.

Consider a general $n$-point function

$$G^{(n)}(x_1, x_2, \dots, x_n; g, \mu) = \langle 0|T\phi(x_1)\phi(x_2)\cdots\phi(x_n)|0\rangle\,, \tag{A.47}$$

(n) n/2 (n) G (x1,x2,... ,xn;g0,µ0)=Z− (µ, µ0)G (x1,x2,... ,xn;g,µ). (A.48)

As above, it is more convenient to replace the discrete RG transformations by continuous transformations with infinitesimal generators

d n/2 (n) µ0 Z (µ, µ0)G (x1,... ,xn;g0,µ0) = 0 (A.49) dµ0  where d/dµ0 is a total derivative, that includes the change of the coupling g as well. Therefore, the Callan-Symanzik Equation (CSE) reads

∂ ∂ µ + β(g) + nγ(g) G(n)(x ,... ;g,µ)=0, (A.50) ∂µ ∂g 1   with d β(g)=µ g(µ) (A.51a) dµ 1 d γ(g)= µ Z(µ0,µ). (A.51b) 2Z(µ0,µ) dµ

The $\beta$-function is dimensionless. In a massless theory, it can therefore not depend on $\mu$ for dimensional reasons. In theories with masses, there are always renormalization schemes that retain this property [?]. In other schemes, $\beta$ can depend on ratios of $\mu$ and masses.

However, it is not obvious that $\gamma$ does not depend on $\mu_0$ through the ratio $\mu/\mu_0$. In renormalizable theories, the limit $\mu_0 \to \infty$ exists and $\gamma$ can not depend on $\mu_0$. As we have explained in section A.2.5 above, we can always choose a trajectory that corresponds to a renormalizable theory and drop the dependence of $\gamma$ on $\mu_0$.

The actual calculation of the coefficient functions $\beta$ and $\gamma$ proceeds by calculating a sufficiently large set of renormalized Green's functions and finding functions $\beta$ and $\gamma$ such that (A.50) is satisfied. The non-trivial statement of the Callan-Symanzik Equation (CSE) is that one universal function $\beta$ for each coupling and one universal function $\gamma$ for each field suffice to satisfy (A.50) for all Green's functions.

If there is more than one coupling and more than one field, the generalization is obvious:
$$\left[\mu\frac{\partial}{\partial\mu} + \sum_{i=1}^{k}\beta_i(g)\frac{\partial}{\partial g_i} + \sum_{j=1}^{m} n_j\gamma_j(g)\right] G^{(n_1, \dots, n_m)}(x_1, \dots; g_1, \dots, g_k, \mu) = 0\,. \tag{A.52}$$
For example, the QED vertex function obeys the CSE
$$\left[\mu\frac{\partial}{\partial\mu} + \beta(e)\frac{\partial}{\partial e} + 2\gamma_\psi(e) + \gamma_A(e)\right]\langle 0|T\psi(x_1)\bar\psi(x_2)A_\nu(x_3)|0\rangle = 0\,. \tag{A.53}$$

A.2.8 Running Couplings and Anomalous Dimensions

As a concrete example for solving the CSE, consider a four-point function

$$G^{(4)}(P; g, \mu) = \text{F.T.}\,\langle 0|T\phi(x_1)\phi(x_2)\phi(x_3)\phi(x_4)|0\rangle \tag{A.54}$$
in the deep Euclidean domain (this does not correspond to a physical amplitude, but can be part of one):
$$p_i^2 = -P^2 \tag{A.55a}$$
$$p_ip_j = 0\,. \tag{A.55b}$$
Lowest order PT gives for the connected (but not amputated) part
$$G^{(4)}(P; g, \mu) = \left(\frac{-\mathrm{i}}{P^2}\right)^4(-\mathrm{i}g) + O(g^2)\,. \tag{A.56}$$

g¯(P ;g) i 4 γ(g ) G(4)(P ; g,µ)= (4)(¯g(P; g)) exp dg 0 , −2 4 0 (A.62) P G ·  β(g0)   Zg   where (4) is an arbitrary function of one variable, which is not determined by theG RG. However, matching the solution (A.62) of the RGE to the per- turbative result (A.56) at P = µ gives

(4)(¯g)= ig¯+O(¯g2) . (A.63) G − In general, we can identify two elements in the solution of the RGE. One part of the solution is given by the perturbative result at a given order with

113 the coupling constant replaced by the running coupling at the appropriate scale. In addition, there is an exponential factor modifying the scaling be- havior of each field in the n-point function under consideration. Since the γ appear in the numerator of the exponent, they are usually called anomalous dimensions. From the β -function for φ4 3 β(g)= g2 + O(g3) (A.64) 16π2 follows the running coupling g g¯(P ; g)= . (A.65) 1 3 g ln P −16π2 µ The solution (A.62) has the nice feature that all large logarithms ln(P 2/µ2) are collected in two universal places, that are common to all Green’s func- tions: the running couplingg ¯(P ; g) and the exponents of the scale factors. Thus the large logarithms have been resummed successfully in universal quantities and PT is reliable, as long asg ¯(P ; g) is small. Unfortunately, this procedure does not work as straightforwardly for all Green’s functions. The RGE is much harder to solve if not all external momenta scale uniformly. If there is more than one relevant mass scale, the coefficient functions themselves can contain large logarithms. The classical example for large logarithms that can not be obtained from the RG are the Sudakov double logarithms [?] in exclusive scattering processes

α Q2 = ln2 (A.66) −2π m2  

A.2.9 Asymptotic Freedom As discussed earlier, the behavior of the β-function in the vicinity of g =0 determines the properties of PT. For a single coupling, the β-function must vanish for g = 0, because without interactions, there is no mechanism that could induce running of the coupling. Figures A.13 and A.14 display the four possible different scenarios. On the left hand side of figure A.13, we have β>0 everywhere and the running coupling grows in the ultraviolet without limits. However, PT is reliable in the infrared.

114 Figure A.13: β-functions for infrared free (left) and ultraviolet free (right) theories. The arrow denotes the direction of the evolution for a rising scale.

Figure A.14: β-functions with ultraviolet (left) and infrared (right) stable fixed points. The arrow denotes the direction of the evolution for a rising scale.

On the right hand side of figure A.13, we have β<0 everywhere and the running coupling grows in the infrared without limits. PT is reliable in the ultraviolet, but problematic in the infrared. The theory of strong interactions, QCD, has this property [?], called asymptotic freedom and it is known that non abelian gauge theories are the only class of theories in four space-time dimensions that can exhibit asymptotic freedom. In the left hand side of figure A.14, the coupling grows in the ultraviolet, but the β-function makes a turn and the coupling becomes constant after running into an ultraviolet stable fixed point at g∗, because β(g∗)=0.The opposite phenomenon appears in the right hand side of figure A.14, where g∗ is an infrared stable fixed point.

A.2.10 Dimensional Regularization As mentioned above, the RG functions β(g)andγ(g) are universal. They are determined by calculating µ∂G(n)/∂µ for a sufficient set of G(n) and solving the CSE for β(g)andγ(g). A typical integral appearing in these equations is

D 2 n 2 D 4 d k (k ) Iˆ (D, M )=µ − , (A.67) n,m (2π)D (k2 M 2 + i)m Z − 4 D where the factor µ − has been added in order to make the mass dimension of Iˆ independent of the dimension of space time. As long as m n D/2 and D/2+nare not equal to negative whole numbers, the integral− − can be evaluated

$$\hat I_{n,m}(D, M^2) = \frac{(-1)^{n+m}\,\mathrm{i}}{16\pi^2}\,\left(M^2\right)^{2+n-m}\left(\frac{M^2}{16\pi^2\mu^2}\right)^{D/2-2}\frac{\Gamma(m - n - D/2)\,\Gamma(D/2 + n)}{\Gamma(m)\,\Gamma(D/2)} \tag{A.68}$$

and the logarithmic divergency of $\hat I_{0,2}(4, M^2)$ shows up as the pole of the Euler $\Gamma$-function. This divergency is regularized by moving away from $D = 4$:

$$\hat I_{0,2}(4 - 2\epsilon, M^2) = \frac{\mathrm{i}}{16\pi^2}\left(\frac{M^2}{16\pi^2\mu^2}\right)^{-\epsilon}\Gamma(\epsilon)\,. \tag{A.69}$$
$\Gamma(\epsilon)$ can be expanded in a Laurent series that starts with a single pole:

$$\hat I_{0,2}(4 - 2\epsilon, M^2) = \frac{\mathrm{i}}{16\pi^2}\left(\frac{1}{\epsilon} + \ln\frac{\mu^2}{M^2} + 2\ln 4\pi - \gamma_E\right)\,. \tag{A.70}$$
In a massless theory, the dependence on $\mu$ must be identical to the dependence on the renormalization scale, and the expansion of the $\Gamma$-function shows that it suffices to determine the residues of the poles in $\epsilon = 2 - D/2$. Even in massive theories, there are renormalization schemes that retain this convenient property [?].

This procedure of dimensional regularization [?] and minimal subtraction of the poles in $1/(D - 4)$ is particularly useful in gauge theories, because gauge invariance is not violated by the regularization and no additional counterterms are needed in order to maintain gauge invariance. Since gauge invariance softens divergencies, dimensional regularization avoids the overestimation of radiative corrections induced by a naive application of regularization procedures that violate gauge invariance.
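For completeness, the short calculation leading from (A.69) to (A.70) (spelled out here, since it is used repeatedly): with $x = M^2/(16\pi^2\mu^2)$,
$$\Gamma(\epsilon)\,x^{-\epsilon} = \left(\frac{1}{\epsilon} - \gamma_E + O(\epsilon)\right)\left(1 - \epsilon\ln x + O(\epsilon^2)\right) = \frac{1}{\epsilon} - \gamma_E - \ln x + O(\epsilon)\,,$$
and $-\ln x = \ln(\mu^2/M^2) + 2\ln 4\pi$ reproduces the bracket in (A.70).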

A.2.11 Gauge Theories

In QED, we find for the lowest order RG functions
$$\beta(e) = \frac{e^3}{12\pi^2} \tag{A.71a}$$
$$\gamma_\psi(e) = \frac{e^2}{16\pi^2} \tag{A.71b}$$
$$\gamma_A(e) = \frac{e^2}{12\pi^2}\,. \tag{A.71c}$$
There is a peculiarity of QED: the Ward identity enforces that the contributions of the vertex function and the self-energy (see figure A.15) to the $\beta$-function cancel to all orders, and the $\beta$-function can be calculated from the vacuum polarization (figure A.3) alone.

Figure A.15: The contributions of the vertex function and the self energy to the β-function in QED cancel.

Figure A.16: Non-abelian contributions to the vacuum polarization: gluon and ghost loops.

From (A.71a), we see that $\beta_{\text{QED}} > 0$ and that QED is not asymptotically free. Higher order calculations have not revealed an ultraviolet fixed point either.

A more complicated calculation in a non-abelian $SU(N_C)$ gauge theory with $N_f$ quarks has the result [?]

$$\beta(g) = \frac{g^3}{16\pi^2}\left(\frac{N_f}{2}\,C_F - \frac{11}{3}\,N_C\right)\,, \tag{A.72}$$
where

$$C_F = \frac{N_C^2 - 1}{2N_C}\,. \tag{A.73}$$

In QCD (i.e. $N_C = 3$), this means
$$\beta_{\text{QCD}}(g) = \frac{g^3}{16\pi^2}\left(\frac{2}{3}\,N_f - 11\right) \tag{A.74}$$
and the theory is indeed asymptotically free for $N_f < 33/2$. As a result, the running coupling of QCD falls with rising energy:

$$\alpha_s(Q^2) = \frac{4\pi}{\beta_0\ln(Q^2/\Lambda_{\text{QCD}}^2)} \tag{A.75}$$
where $0.15\,\mathrm{GeV} < \Lambda_{\text{QCD}} < 0.25\,\mathrm{GeV}$ and $\beta_0 = 11 - 2N_f/3$.

The most obvious difference to an abelian gauge theory like QED is the existence of the genuinely non-abelian contributions to the vacuum polarization displayed in figure A.16. These diagrams do indeed give a negative contribution to the $\beta$-function.

Intuitively, the negative contribution from the gluon and ghost loops gives a fair description of the interplay of the positive quark contribution and the

negative non-abelian contributions. Strictly speaking, however, this explanation is not correct, because the cancellations of vertices and self-energies familiar from QED are not operative and more complicated diagrams also give contributions to the $\beta$-function.
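As a numerical orientation for (A.75) (the numbers are illustrative and mine): with five active flavors, $\beta_0 = 11 - 10/3 = 23/3$, and taking $\Lambda_{\text{QCD}} = 0.2\,\mathrm{GeV}$,
$$\alpha_s(M_Z^2) = \frac{4\pi}{\frac{23}{3}\ln\frac{(91\,\mathrm{GeV})^2}{(0.2\,\mathrm{GeV})^2}} \approx \frac{12.6}{7.67\times 12.2} \approx 0.13\,,$$
in the right ballpark for the measured value, as one may expect from a one-loop formula.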

A.3 Effective Field Theory

Ein Bild erhält ja gerade dadurch seine Brauchbarkeit, daß es nicht die Sache ist. [Egon Friedell: Kulturgeschichte Griechenlands]

A.3.1 Flavor Thresholds and Decoupling

So far, all effects of particle masses have been neglected, in order to highlight the simplicity of the RG approach. This is often a very good approximation. For example, at LEP2, fermion masses can be neglected in four fermion production, unless one of the final state particles is an electron and a t-channel singularity has to be regularized. The massless approximation is a good approximation, because the LEP2 center of mass energy is far from all flavor thresholds: $m_b \ll \sqrt{s_{\text{LEP2}}} \ll 2m_t$. Nevertheless, flavor thresholds matter for the RG evolution itself: from (A.72), the quark contribution to the $\beta$-function is

$$\propto \frac{g^3}{16\pi^2}\,\frac{N_f}{2}\,C_F \tag{A.76}$$
and in a concrete calculation we have to answer the question of the value of $N_f$. This question can not be answered by counting the quarks in table 2.1 alone. To see this, consider the typical case $m_b^2 \ll -p^2 \ll m_t^2$ and $m_b^2 \ll \mu^2 \ll m_t^2$. Then the contribution of a top quark loop in (A.76) is
$$\Pi^R(p^2) = O\!\left(\frac{p^2}{m_t^2}, \frac{\mu^2}{m_t^2}\right) \tag{A.77}$$

and is very small below the threshold. The purpose of the RG is to systematically resum the large logarithms in the vacuum polarization and other diagrams, but we find that there are no large logarithms below the threshold, as long as the renormalization scale is chosen sensibly. Therefore, the top flavor must not be included in (A.76) below threshold, and $N_f = 5$ is the appropriate value between the bottom and top thresholds.

The fact that a heavy particle does not contribute to RG functions below its threshold is non-trivial. In the unrenormalized version of (A.76), each flavor contributes equally to the high energy tail of the loop integral, as long as the cut-off is significantly above the threshold of this flavor. Only after renormalization can it be seen that each flavor also contributes equally to the counterterm and that the net contribution of a heavy flavor vanishes, if the Green's functions are renormalized below the threshold of this flavor.

Since the RG never deals with the loop integrals over all of momentum space, the cancellations described in the previous paragraph take place automatically. The $\beta$-function describes the rate of change of the running coupling, and if only active flavors contribute to this change, the cancellation of contributions from heavy flavors and their counterterms is already manifest. This argument, which has been made heuristically using (A.76) for the $\beta$-function, can be made rigorous in the decoupling theorem [?], also for the anomalous dimensions.

However, there is one circumstance under which the effects of a heavy particle can be felt in the RG flow below its threshold. This is the case if the large mass violates a symmetry of the theory. The most celebrated example of this effect is the breaking of the $SU(2)_c$ custodial symmetry [?] of the SM through the heavy top mass and the corresponding contribution to the $\rho$-parameter. In a perturbative calculation with masses, there is a contribution to $\rho$ that is proportional to the quadratic mass splitting in the third generation

$$\Pi_{W^+W^-}(q^2) \propto \frac{\alpha}{\pi}\,\frac{m_t^2 - m_b^2}{M_W^2} + \cdots \tag{A.78}$$

If every $SU(2)_L$ doublet were degenerate, the decoupling theorem would apply and each heavy doublet would decouple from the low energy phenomena. The resulting low energy physics could be described by an EFT built from $SU(2)_c$-invariant interactions. If, as in the third generation of the SM, there is a strong lifting of the degeneracy, there will be an effect on the low energy couplings that manifests itself as a $\rho$-parameter that is different from

unity. Then the low energy EFT below the top threshold has to contain $SU(2)_c$-violating operators. Below the bottom threshold, there are no new $SU(2)_c$-violating contributions from the fermion sector, but the effects accumulated between $m_t$ and $m_b$ remain. If there were no reason to expect a custodial symmetry or $\rho = 1$, the low energy EFT would contain $SU(2)_c$-violating operators from the outset, of course.

The general result of the decoupling theorem is that, unless symmetries are violated, heavy particles can be removed from a theory, if only phenomena below the energy scale given by the particle's mass are to be calculated. The only residual effect of the heavy particle remains in a renormalization of coupling constants and fields.

Figure A.17: Running couplings with thresholds. In an asymptotically free gauge theory, the running is slowed down at each new flavor threshold.

A.3.2 Integrating Out or Matching and Running

The decoupling of heavy flavors discussed in the previous section provides a systematic calculational procedure for theories with more than one mass scale. For example, in order to calculate the strong coupling at low energies from an experimental measurement at a 1 TeV LC, one proceeds as follows:

1. start with the measured value $g_0$ at $\mu_0 = 1\,\mathrm{TeV} \gg m_t$: $\bar g(\mu_0) = g_0$

2. solve the massless RGE for $\bar g(\mu)$ with the $\beta$-function for six active flavors from $\mu = \mu_0$ down to $\mu = m_t$

3. start a second evolution step at $\mu = m_t$, using the result of the first step as initial value

4. solve the massless RGE for $\bar g(\mu)$ with the $\beta$-function for five active flavors from $\mu = m_t$ down to $\mu = m_b$

5. start a third evolution step at $\mu = m_b$, ...

6. and so on.

This procedure is known as matching and running and is illustrated in figure A.17. The matching at each threshold is continuous, but not differentiable, because the rate of change (i.e. the $\beta$-function) changes discontinuously at each threshold.
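A minimal numerical sketch of this procedure (my own illustration: one-loop running only, step-function thresholds, and invented input values):

import math

# One-loop "matching and running" for the strong coupling.
# Thresholds are approximated by step functions, as in figure A.17.

MT, MB = 175.0, 4.8   # illustrative top and bottom thresholds in GeV

def beta0(nf):
    # one-loop coefficient, cf. (A.75)
    return 11.0 - 2.0 * nf / 3.0

def run(alpha, mu_from, mu_to, nf):
    # exact solution of the one-loop RGE:
    # 1/alpha(mu) - 1/alpha(mu') = beta0/(2 pi) * ln(mu/mu')
    return 1.0 / (1.0 / alpha
                  + beta0(nf) / (2.0 * math.pi) * math.log(mu_to / mu_from))

alpha = 0.085                       # hypothetical input: alpha_s at 1 TeV
alpha = run(alpha, 1000.0, MT, 6)   # six active flavors down to m_t
alpha = run(alpha, MT, MB, 5)       # five active flavors down to m_b
alpha = run(alpha, MB, 2.0, 4)      # four active flavors down to 2 GeV
print(alpha)                        # ~0.23: the coupling grows towards the infrared

The kinks of figure A.17 appear here as the change of beta0 at each threshold, while continuity of alpha across the threshold implements the leading order matching condition.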

The approximation of the flavor thresholds by step functions is not optimal, of course. However, for many observables, the detailed features of the threshold are much less important than the large logarithms between thresholds, and the matching and running procedure provides a systematic way of resumming these logarithms. In any case, the calculations can be improved systematically by including higher orders (NLO, NNLO) in the loop expansion, as well as power corrections.

The most famous example (already mentioned in section 1.2) for a threshold effect is given by the running couplings of the three SM gauge groups. The experimentally determined couplings do not meet [?] in one point, as would be a necessary condition for a GUT. However, matching the evolution of the couplings in the SM to the evolution of the couplings in the MSSM at the threshold provided by the supersymmetry breaking scale $\mu \approx 1\,\mathrm{TeV}$ causes the three couplings to meet [?]. This may be an accident, but it is the most suggestive phenomenological evidence for supersymmetry so far.

A.3.3 Important Irrelevant Interactions

We have argued in section A.2.5 that operators of dimension greater than four are irrelevant and that the influence of their initial value at a high matching scale on the physics at much lower energies can be ignored. This argument can not be applied in two important cases. If the irrelevant interaction violates a symmetry that is respected by all relevant and marginal operators, the effects of the irrelevant interactions can be observable, even if they are small in absolute terms. A different problem arises if the hierarchy of scales is not large enough and the lever arm of the RG is therefore not large enough to suppress irrelevant operators when running from one threshold to the next.

A typical example for the latter case is provided by the thresholds for the top quark and the $W^\pm$ bosons. The power corrections $M_W/m_t \approx 0.45$ can not be neglected relative to the logarithmic terms $|\ln(M_W/m_t)| \approx 0.8$. The solution in cases like $B^0\bar B^0$-mixing is to decouple the top quark and $W$ boson simultaneously

$$\text{(box diagram)} + \text{(box diagram)} = \Phi\!\left(\frac{M_W^2}{m_t^2}\right)\cdot\text{(local $|\Delta B| = 2$ operator)}\,, \tag{A.79}$$

where the function $\Phi$ is calculated from the massive loop [?]. Since the scales in the loop are rather close, no large logarithms can develop and the loop expansion for $\Phi$ is well behaved.

The $|\Delta B| = 2$ operator discussed in the previous paragraph is also an example for the first case, but the more common example is provided by tree level charged current interactions. Below $M_W$, the exchange of a $W$ can be approximated by a local interaction

$$\text{($W$ exchange)} = \text{(contact interaction)} + O\!\left(\frac{p^2}{M_W^2}\right) \tag{A.80}$$
of dimension six

$$L_F = \frac{G_F}{\sqrt{2}}\,\bar\psi(1 - \gamma_5)\gamma_\mu\psi\;\bar\psi(1 - \gamma_5)\gamma^\mu\psi\,, \tag{A.81}$$
which is irrelevant. However, the Fermi interaction $L_F$ can not be ignored, because it includes flavor changing terms that are absent in all relevant or marginal operators provided by the electromagnetic and strong interactions.

The SM $W^\pm$-exchange is matched to the local Fermi interaction (A.81) at $\mu = M_W$ as in (A.80). Below the $W$-mass, all calculations are performed using the local Fermi interaction. The irrelevant operator (A.81) is not renormalizable in the traditional perturbative sense, because it requires counterterms of higher dimensions. In the EFT, this is no problem, however, because these counterterms are all determined by matching conditions at $M_W$. Consequently, there is no loss of predictive power. The radiative corrections can be summed below $M_W$ by solving a RGE that includes the anomalous dimension of the irrelevant operator, which is derived from the corresponding loop diagrams.
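For concreteness, the tree level matching condition at $\mu = M_W$ that fixes the coefficient in (A.81) is the standard relation (quoted here, since it is not written out above)
$$\frac{G_F}{\sqrt{2}} = \frac{g^2}{8M_W^2}\,,$$
with $g$ the $SU(2)_L$ gauge coupling: the dimension-six coefficient is completely determined by the parameters of the theory above the threshold.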

A.3.4 Mass Independent Renormalization

A useful intuitive picture for the matching and running procedure is that more and more physics from higher scales is integrated out as the renormalization scale is lowered. The physics is not lost in this process, but is condensed from Feynman diagrams into the short distance interaction vertices. One can imagine this as if all momentum integrations were carried out from above down to the renormalization scale and the remaining low energy momentum integrations were cut off at the renormalization scale.

This intuitive picture is, however, not entirely accurate. As mentioned above, a fixed momentum space cut-off is technically inconvenient, because of complicated integrals. But more important is that the fixed cut-off is physically dangerous (see e.g. [?]), because gauge symmetries are violated by cut-offs. Therefore the dividing line between the short distance physics, which is absorbed in effective local interactions, and the long distance physics is gauge dependent and thus unphysical. Typically, the unphysical short distance coefficients will be spuriously large in magnitude, but their contributions will cancel in physical observables against unphysical long distance contributions.

In order to exploit the cancellation of unphysical contributions as soon as possible, it is preferable to extend all loop integrals over the whole momentum space and to replace the hard cut-offs by counterterms. The price to pay is small, because the resulting divergencies can be handled with traditional perturbative renormalization theory. Calculations in a mass independent scheme like minimal subtraction are convenient and do respect gauge invariance. The physically important threshold effects are handled by matching.

A.3.5 Naive Dimensional Analysis

If the theory at the high energy scale is known, EFT is but a very powerful calculational tool. It allows one to use the most appropriate degrees of freedom at each scale by matching different theories at thresholds. Furthermore, large radiative corrections can be summed systematically through the renormalization group. In such a scenario, the short distance coefficients are calculated by matching and running from the top to the bottom.

The opposite case is when low energy data is fitted to an arbitrary effective Lagrangian and this Lagrangian is evolved from bottom to top until it becomes inconsistent, as shown in figure A.10. In this case, reliable estimates of the scale of "new physics" can be made, but the "new physics" above the threshold can rarely be determined from the matching conditions at the threshold.

Nevertheless, we can use EFT to estimate the likely size of contributions from "new physics" to observables at future experiments using Naive Dimensional Analysis (NDA) [?, ?, ?]. We know from general RG arguments that the relevant operators will have larger coefficients at low energy than the irrelevant operators of the same symmetry. That this statement remains true after renormalization was first pointed out in [?]. Unfortunately, these arguments alone give no indication of the absolute magnitude of the coefficients.

From (A.30) and (A.33) we see that typical loop integrals that contain a single scale can be estimated by a factor of

$$\frac{1}{16\pi^2} \approx 6.3\cdot 10^{-3}\,, \tag{A.82}$$
which is multiplied by a power of this scale as dictated by simple dimensional analysis. A systematic analysis [?, ?] reveals that the small number (A.82) creates a hierarchy that can be used to classify terms.

One possible form⁴ of the lowest order effective Lagrangian in the momentum expansion [?] for a non-linearly realized broken abelian⁵ symmetry is

$$L = f^2\,\mathrm{tr}\left[(D_\mu U)^\dagger\,D^\mu U\right] \tag{A.83a}$$
where

$$U(x) = \mathrm{e}^{\mathrm{i}\phi(x)/f} \tag{A.83b}$$
and $f$ is a scale introduced by the symmetry breaking. It corresponds to the vacuum expectation value of a Higgs field in a linear realization of the symmetry breaking. Loops renormalize the operators of a given order as well as higher orders in the momentum expansion. For example, a loop with two lowest order terms
$$\frac{1}{f^2}\left(\phi(x)\overleftrightarrow{\partial_\mu}\phi(x)\right)\left(\phi(x)\overleftrightarrow{\partial^\mu}\phi(x)\right) \tag{A.84}$$
renormalized at a scale $\mu^2$

$$= \frac{1}{16\pi^2}\,\frac{P^2}{f^2}\cdot\ln\frac{\mu^2}{P^2} + \dots\,, \tag{A.85}$$
where $P^2$ is an invariant built from the external momenta, requires a counterterm from the next order of the momentum expansion
$$\frac{1}{\Lambda^2}\,\frac{1}{f^2}\,\partial_\nu\phi(x)\partial_\mu\phi(x)\partial^\nu\phi(x)\partial^\mu\phi(x)\,, \tag{A.86}$$

⁴ There are many equivalent forms [?], but all involve the field $\phi(x)$ only in the combination $\phi(x)/f$.

⁵ In the general non-abelian case, the formulae are more complicated, because the factor group manifold $G/H$ has to be parametrized [?]. These geometrical complications have no impact on the power counting argument.

where $\Lambda$ is the scale characterizing the momentum expansion. A priori, there is no relation between the scales $f$ and $\Lambda$ for symmetry breaking and momentum expansion, respectively. However, unless there is some unnatural fine-tuning of parameters at work, the rate of change of the couplings in the RG running from (A.85) should be of the same order as the coefficient of the counterterm (A.86). Equating coefficients gives
$$\frac{1}{16\pi^2}\,\frac{1}{f^2} \approx \frac{1}{\Lambda^2} \tag{A.87}$$
or the celebrated formula

$$\Lambda \approx 4\pi f\,. \tag{A.88}$$
From another perspective, the RG will evolve towards (A.88) in the infrared, unless a symmetry that has not been made manifest in the effective Lagrangian forces the RG flow in a particular direction. A physical threshold between $f$ and $\Lambda$ would require matching and provides the only way around (A.88) in an EFT. Such a physical threshold will however be observable independently.

The heuristic argument for the lowest order in the momentum expansion for scalar fields can be extended systematically to all orders in the loop and momentum expansions and to fermions and vector fields [?]. The result is that a natural interaction involving scalars $\phi$, fermions $\psi$, derivatives $\partial$ and gauge bosons $A$ has the size

$$L(N_\phi, N_\psi, N_A, N_\partial) = f^2\Lambda^2\left(\frac{\phi}{f}\right)^{N_\phi}\left(\frac{\psi}{f\sqrt{\Lambda}}\right)^{N_\psi}\left(\frac{eA}{\Lambda}\right)^{N_A}\left(\frac{\partial}{\Lambda}\right)^{N_\partial} \tag{A.89}$$
for any quartet $(N_\phi, N_\psi, N_A, N_\partial)$ of natural numbers. The power counting rules assume that the vector boson $A$ is indeed a gauge boson, such that the two parts of the covariant derivative $\partial - \mathrm{i}eA$ scale identically. The rules would be much more complicated for non-gauge bosons and their longitudinal polarization states.

The important result of NDA is that the natural strengths (A.89) of interactions can be estimated, even in a strongly interacting EFT. The assumption that couplings have their natural strengths is nothing more than an assumption, of course. Nevertheless, the predictions from NDA work for low energy strong interactions, and NDA can be used as a guide while searching for good places for hunting "new physics." Higher dimensional operators are most likely suppressed by a factor of $1/(16\pi^2)$, and proposals for experiments should take this into account.
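As a worked instance of (A.89) (the application is mine): a four-fermion interaction has $N_\psi = 4$ and $N_\phi = N_A = N_\partial = 0$, so its natural size is
$$f^2\Lambda^2\left(\frac{\psi}{f\sqrt{\Lambda}}\right)^4 = \frac{1}{f^2}\,(\bar\psi\psi)^2 \approx \frac{16\pi^2}{\Lambda^2}\,(\bar\psi\psi)^2$$
by (A.88), i.e. naturally larger than the naive $1/\Lambda^2$ of (A.1) by a loop factor. Conversely, with $f = 246\,\mathrm{GeV}$, (A.88) places the natural scale of a strongly interacting electroweak symmetry breaking sector at $\Lambda \approx 4\pi f \approx 3\,\mathrm{TeV}$.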
