
Inclusive Low-Mass Drell-Yan Cross-Section at LHCb at √s = 8 TeV

Dissertation submitted to the Faculty of Science of the University of Zurich for the degree of Doctor of Natural Sciences (Dr. sc. nat.) by Andreas Robert Weiden from Germany

CERN-THESIS-2020-279, 29/01/2020

Doctoral committee: Prof. Dr. Ulrich Straumann (chair and thesis supervisor), Dr. Katharina Müller, Prof. Dr. Nicola Serra

Zürich, 2020

This thesis was accepted as a doctoral dissertation by the Faculty of Science of the University of Zurich in the autumn semester 2020.

Doctoral committee: Prof. Dr. Ulrich Straumann (chair and thesis supervisor), Dr. Katharina Müller, Prof. Dr. Nicola Serra

ABSTRACT

The LHCb experiment, one of the four main experiments at the LHC, is optimized for decays of particles containing a b- or c-quark. The LHCb detector is a single-arm forward spectrometer with an acceptance from approximately 30 to 250 mrad, with respect to the incoming proton beams. In addition to its main goal, its unique geometry makes it also a very interesting detector to probe general physics in the forward region. This includes electroweak production, which can provide important insights into the parton distribution functions (PDFs) of the proton. As part of this electroweak program, a measurement of the differential and double-differential inclusive Drell-Yan cross-sections with subsequent decay to muon-pairs

$$\frac{d\sigma(pp \to Z/\gamma^{*} \to \mu^{+}\mu^{-})}{dM_{\mu\mu}} \quad \text{and} \quad \frac{d^{2}\sigma(pp \to Z/\gamma^{*} \to \mu^{+}\mu^{-})}{dy\, dM_{\mu\mu}}$$

is performed in the range $10.5 < M_{\mu\mu} < 110$ GeV/$c^{2}$ and $2 < y < 4.5$. The cross-section measurement benefits from the high-precision calibration of the absolute luminosity at LHCb. For this measurement, data corresponding to 2.0 fb$^{-1}$ collected with the LHCb detector at a centre-of-mass energy of √s = 8 TeV are analysed.

The cross-sections are compared to theoretical next-to-next-to-leading-order perturbative QCD predictions using four different PDF sets, none of which included measurements from this hitherto unexplored region in phase space. The differential cross-section as a function of the invariant mass of the dilepton pair is found to be in general agreement with the different predictions. However, some systematic discrepancies are found when studying the double-differential cross-section as a function of the rapidity of the dilepton pair.

Overall the measurement presented here is systematically limited in the region which would be of most interest as input for future PDF sets, the low mass region. Nevertheless, it is hoped that the data presented here will help place new constraints on the quark and anti-quark content of the proton PDFs down to very low values of the momentum fraction of the partons, x.

ACKNOWLEDGEMENTS

Writing a thesis is like a journey. You can choose the way you set out on, but you don’t always know where it leads or how long it will take to get there. I would like to thank all people who have accompanied me along the way.

Vaney, for helping me stay the course by giving me a goal.

Luan, for being that goal.

Mama und Papa, for always telling me I can do this if I want to.

Thorsten, for moral support from afar.

Prof. Ueli Straumann and Dr. Katharina Müller, for their advice and their insights.

The Zurich LHCb group, for making the journey enjoyable.

For Omi and Pünktchen

CONTENTS

Abstract

Acknowledgements

Contents

1 Introduction

2 Theoretical Introduction
  2.1 The Standard Model of Particle Physics
    2.1.1 Composite particles
    2.1.2 Experimental confirmation
  2.2 Electro-weak symmetry breaking
  2.3 Quantum-Chromo-Dynamics
    2.3.1 Asymptotic freedom and confinement
    2.3.2 Parton Density Functions
  2.4 The Drell-Yan process
    2.4.1 The general 2 → 2 process
    2.4.2 The differential Drell-Yan cross-section
    2.4.3 The FEWZ tool

3 The LHCb experiment at the LHC
  3.1 Particle accelerators
  3.2 The Large Hadron Collider
  3.3 The LHCb experiment
    3.3.1 Tracking system
    3.3.2 Calorimeter system
    3.3.3 Muon system
    3.3.4 Particle identification using RICH detectors
    3.3.5 Trigger and event reconstruction
    3.3.6 Production of simulated events
  3.4 Luminosity determination at LHCb
    3.4.1 Relative luminosity determination
    3.4.2 The Van-der-Meer method
    3.4.3 The Beam-Gas-Imaging method
    3.4.4 Absolute luminosity calibration at 8 TeV
    3.4.5 Determining the luminosity of leading bunches

4 Measurement of the Drell-Yan cross-section
  4.1 Previous measurements
  4.2 Trigger and data selection
    4.2.1 Signal samples
    4.2.2 Background samples
    4.2.3 Overview over samples
  4.3 Fit templates
    4.3.1 The isolation variable
    4.3.2 Signal template
    4.3.3 Background templates
  4.4 Determining the signal yields
    4.4.1 Initial fit
    4.4.2 Residual signal removal
    4.4.3 Fixed background fraction
    4.4.4 Toy studies
  4.5 Bin migration
    4.5.1 Bin migration as a function of mass
    4.5.2 Bin migration as a function of mass and rapidity
  4.6 Efficiencies
    4.6.1 Trigger efficiency
    4.6.2 Tracking efficiency
    4.6.3 Muon ID efficiency
    4.6.4 Combined reconstruction and trigger efficiency
    4.6.5 Global event cut efficiency
    4.6.6 Vertex χ² cut efficiency

5 Systematic uncertainties
  5.1 Fitting
    5.1.1 Signal template
    5.1.2 Heavy-flavour template
    5.1.3 Toy studies and bin migration
  5.2 Efficiencies
  5.3 Luminosity
  5.4 Cross-checks
    5.4.1 Number of bins
    5.4.2 Magnetic polarity
  5.5 Total systematic uncertainty
    5.5.1 As a function of mass
    5.5.2 As a function of mass and rapidity

6 Results
  6.1 Total cross-section at the Z-peak
  6.2 Differential cross-section as a function of mass
  6.3 Double-differential cross-section as a function of mass and rapidity

7 Conclusions & Outlook

A Theoretical predictions of the Drell-Yan cross-section

B Measured values of the Drell-Yan cross-section

C Individual fit results
  C.1 As a function of invariant mass
  C.2 As a function of invariant mass and rapidity

D Using a feed forward neural network to identify Drell-Yan events at the LHCb experiment

E Uncertainties of differences and ratios for correlated variables

F Individual contributions to the systematic uncertainty

G Correlation with Z-measurement

Bibliography

CHAPTER 1

INTRODUCTION

He was determined to discover the underlying logic behind the universe. Which was going to be hard, because there wasn’t one.

Terry Pratchett - Mort

While there might be no underlying logic behind the universe, trying to find such a logic has nevertheless been a very fruitful endeavour for humanity. By discovering patterns in nature, one can make predictions about future behaviour. This has opened the gateway to technology. In recent times one frontier of human knowledge has been particle physics.

The Standard Model of Particle Physics has been developed as a quantum field theory in order to describe the fundamental building blocks of nature. It is the most fundamental logic we have discovered so far and is briefly introduced in Chapter 2. One part of the Standard Model is Quantum-Chromo-Dynamics (QCD), the quantum theory that describes the strong interaction, which is responsible for the proton being held together and which is introduced in Section 2.3. Contrary to the other fundamental forces, the strong interaction does not diminish with distance, but increases in strength. Because of this, using perturbation theory to describe QCD processes is not possible in general. However, for processes which can be separated into a high-energy (small distances) part and a low-energy (large distances) part, the latter can be described by process-independent parton density functions determined using data as inputs, which are introduced in Section 2.3.2. The parton density functions allow perturbative calculations of the high-energy process by encapsulating the low-energy part. One of the most fundamental hard processes which benefits from this is the Drell-Yan process. It can change one fermion-pair into a different fermion-pair via exchange of a photon or Z-boson. In this thesis the particular Drell-Yan process of a quark – anti-quark pair annihilating and forming a muon – anti-muon pair is explored. The leading-order equation needed to describe this Drell-Yan process is introduced in Section 2.4. In order to calculate theoretical predictions for the cross-section of the Drell-Yan process the FEWZ tool is used, which is introduced in Section 2.4.3. It uses next-to-next-to-leading-order equations and different parton density function parametrizations.

In order to actually measure the cross-section of a process like this Drell-Yan process, experimental data is needed. In this thesis data collected by the LHCb experiment at the LHC is used, which is introduced in Chapter 3. The data was collected in 2012 at a centre-of-mass energy of √s = 8 TeV. The LHCb detector is sensitive to particles which are produced at small angles with respect to the incident beams. This allows probing a region of phase space that is not fully accessible to the other experiments at the LHC. To measure absolute cross-sections, the concept of luminosity is introduced in Section 3.4. Luminosity relates the number of observed events to the cross-section. It is a value specific to the accelerator and interaction region and needs to be calibrated for each data-taking period. The nominal luminosity calibration employed at LHCb is presented, as well as a way to determine the luminosity in cases where the nominal calibration is not sufficient.

The main part of this thesis, Chapter 4, is spent on the various steps needed to measure the (double-)differential Drell-Yan cross-section. An overview of relevant previous measurements of the Drell-Yan cross-section is given in Section 4.1. Sections 4.2 and 4.3 describe the data selection for the signal and background samples. A template fit, which is described in Section 4.4, is used to distinguish the signal from various background processes which mimic the signature of the Drell-Yan process, a pair of muons with opposite charges. The signal yields obtained in this way are corrected for biases of the fit using toy fits in Section 4.4.4. Final-state radiation and the finite detector resolution are corrected by means of an unfolding procedure in Section 4.5. After taking into account the efficiencies of the various stages of data selection necessary to produce a clean data sample in Section 4.6, the cross-section can be determined. The systematic uncertainties associated with the different stages of this process are shown in Chapter 5 and the final results are presented in Chapter 6. Some conclusions and lessons for the future can be found in Chapter 7.

CHAPTER 2

THEORETICAL INTRODUCTION

Three quarks for Muster Mark.

James Joyce - Finnegans Wake

This thesis begins with a general introduction to the Standard Model of Particle Physics, followed by the definition of the Drell-Yan cross-section in Section 2.4, whose measurement is the focus of this thesis. Central to this definition are the Parton Density Functions introduced in Section 2.3.2, which are needed to fully describe the Drell-Yan process at a hadron collider. The tool which was used to calculate theoretical predictions for the Drell-Yan cross-section at the LHCb detector at √s = 8 TeV is described in Section 2.4.3.

2.1 The Standard Model of Particle Physics

The Standard Model of Particle Physics (SM) was developed in the 1960s to explain the plethora of particles discovered by studying cosmic rays and by using particle accelerators at the beginning of the 20th century. It has been able to successfully describe almost all phenomena observed in particle physics with high precision.

The SM describes three of the four known fundamental forces in nature: electro-magnetism, which is the force between charged particles and responsible for holding atoms together; the strong nuclear force, which keeps atomic nuclei and their constituents bound together; and the weak nuclear force, which is responsible for some nuclear decays. Not included is the force we are most accustomed to, gravity.


Particles can have a multitude of properties. One such property is the mass of the particle, but they can also have charges under different fields or the intrinsic property called spin, which can be combined just like angular momenta in composite particles. If a certain property is conserved in interactions with a specific field, then this is usually connected to a symmetry of the field, according to Noether’s theorem [1].

Phenomenologically, the SM contains a number of fundamental particles (see Fig. 2.1 for an overview of all fundamental particles in the SM, including their mass and spin) which make up composite particles. The SM gives different roles to particles depending on their spin. Particles with integer spins are generally called bosons and particles with half-integer spins fermions. In the SM, bosons mediate the three forces included in the SM and fermions make up matter. The fermions are further classified, according to the force(s) they can interact with, into quarks, charged leptons and neutrinos, as explained in the following.

Particle            Symbol  Mass         Charge  Spin
up quark            u       2.3 MeV      2/3     1/2
charm quark         c       1.28 GeV     2/3     1/2
top quark           t       173.2 GeV    2/3     1/2
down quark          d       4.8 MeV      -1/3    1/2
strange quark       s       95 MeV       -1/3    1/2
bottom quark        b       4.7 GeV      -1/3    1/2
electron neutrino   νe      < 2 eV       0       1/2
muon neutrino       νμ      < 190 keV    0       1/2
tau neutrino        ντ      < 18.2 MeV   0       1/2
electron            e       511 keV      -1      1/2
muon                μ       105.7 MeV    -1      1/2
tau                 τ       1.777 GeV    -1      1/2
gluon               g       0            0       1
photon              γ       0            0       1
W bosons            W       80.4 GeV     ±1      1
Z boson             Z       91.2 GeV     0       1
Higgs boson         H       125.18 GeV   0       0

Figure 2.1: Overview of the fundamental particles in the Standard Model. The fermions are displayed in the first three columns, ordered by the three generations. The quarks are in red, neutrinos in green, charged leptons in orange and the bosons in blue. The centre shows the usual letter used for abbreviation, below it is the name of the particle. In the upper left corner is the approximate mass (if not zero) and in the upper right corner the electric charge and the spin. Masses taken from [2].

All fermions interact via the weak nuclear force, mediated by the W± and Z-bosons. Neutrinos are the only (electrically) uncharged fundamental fermions in the SM and therefore only interact via the weak force. All other fundamental fermions carry an electric charge, allowing them to interact via the electro-magnetic force, mediated by the photon (γ). Quarks carry an additional colour charge and can also interact via the strong force, mediated by the gluon (g). The fermions that cannot interact strongly, the electron, muon, tau and the neutrinos, are collectively called leptons.

Fermions

The fermions are separated into three generations. Each generation of fermions contains two quarks (an up-type quark and a down-type quark) and two leptons (a charged lepton and the associated neutrino). The first generation is the only one that contains stable particles. It includes the up- and down-quark, which can form a proton as uud or a neutron as udd, as well as the electron and electron-neutrino. The particles in the second and third generation have in general the same properties as the corresponding ones in the first generation, except that they have higher mass.

In addition to the particles described so far, each particle has a corresponding anti-particle. This is especially important for interactions, which conserve certain quantities like charge or spin of the initial state. Therefore particles such as the electrically neutral Z-boson can directly decay to a pair of oppositely charged leptons. These two leptons cannot, however, be any two leptons, they need to be from the same generation. In other words, the Z-boson can decay into an electron - anti-electron (historically named positron) pair, a muon - anti-muon pair, or a tau - anti-tau pair. Electrically neutral particles can be their own anti-particles (which is the case for e.g. the photon and the Z-boson in the SM), but they don’t have to be.

Bosons

The bosons, the particles with integer spin, mediate the forces included in the SM. The photon is the mediator of the electro-magnetic force. It is itself not electrically charged, massless, and its own anti-particle. The gluons are the mediators of the strong force. In contrast to the photon, each gluon carries a charge of the force it mediates. The charge of the strong force is called colour and instead of having only one dimension like the electric charge (plus and minus), there are three colours, each of which can be positive and negative (red and anti-red, green and anti-green, and blue and anti-blue). Quarks have a single colour charge, while anti-quarks carry an anti-colour. Each of the eight gluons carries a colour and an anti-colour, but is electrically neutral and massless, just like the photon. The weak force has three mediators, the two electrically charged W⁺ and W⁻ bosons and the neutral Z-boson, all of which are massive in contrast to the gluons and photon. Weak interactions are the only interactions which can change the flavour of a particle, e.g. turn a u-quark into a d-quark via an interaction with a W-boson.

All bosons mentioned up to this point have spin 1. The only exception† so far is the spin-0 Higgs-boson. It is responsible for giving the bosons mediating the weak force their mass, via a process called electro-weak symmetry breaking, described in more detail in Section 2.2. Indirectly, this process also gives the quarks and the electron, muon and tau their masses. Neutrinos are assumed to be massless in the SM, even though it has since been experimentally confirmed that they must have a small, but non-zero, mass [2,3].

† As far as experimentally confirmed particles go. The graviton, the purported mediator in most Quantum Gravity theories, would have spin 2.

All of the interactions included in the SM are described using quantum field theories. The electro-magnetic force is described by Quantum Electro Dynamics (QED), the strong force by Quantum Chromo Dynamics (QCD), described in Section 2.3, and the weak force by a unification of the electro-magnetic and weak forces into the electro-weak force, described in more detail in Section 2.2.

Not everything we observe in nature is included in the SM. The most obvious omission is gravity. So far attempts to formulate a consistent version of Quantum Gravity have failed. In addition, Dark Matter, inferred from the difference between how visible matter moves in the universe and how it would move if only the visible matter were present, and Dark Energy, the observation that the universe is experiencing an accelerated expansion [4,5], are also not described by the SM. It also does not explain why neutrinos have mass (they are massless in the SM) or why the other particles have the masses they do.

2.1.1 Composite particles

The phenomenology of the SM does not stop with its elementary particles. The major reason for its huge success in modern physics is that it is able to explain the plethora of particles discovered in the course of the 20th century as either elementary particles or composites of them.

The most numerous bound states are formed using quarks. Since the strong force is the strongest of the fundamental forces, hence the name, these composite particles form very readily, in a process called hadronization. Already with the weaker electro-magnetic force, objects tend to be electrically-neutral, since otherwise they would attract charged particles until they become neutral. This is even more extreme for the strong force, which does not diminish with increasing separation of particles charged under it, but increases. This leads to the phenomenon called confinement, the fact that we do not observe any object with a colour charge, only colour-neutral particles.

There are different ways to form these colour-neutral composite particles, called hadrons. The first way to combine coloured quarks to form an uncoloured object is realised in the baryons, where three objects, called partons, of all three colour charges or all three anti-colour charges, i.e. $rgb$ or $\bar{r}\bar{g}\bar{b}$, are combined. The most well-known representatives of this class are the proton, which combines two up-quarks with one down-quark, uud, and the neutron, udd, collectively called nucleons. When two nucleons come close enough together, they can become bound together by the remnants of the strong force, which extends a bit outside of the nucleons. This is how atomic nuclei form from protons and neutrons. Apart from the proton and neutron, more complex, unstable baryons can also be formed using quarks from generations other than the first. The lifetime of such baryons is usually less than a nanosecond [2]. The masses of the unstable baryons are higher than that of the proton or neutron, to which they eventually decay.

The second way to form a colour-neutral object is to combine a quark with an anti-quark with the opposite colour charge (e.g. $r\bar{r}$). These objects are called mesons and are all unstable, either because the quark and anti-quark can annihilate (as is the case with the neutral pion, which is a mixture of $u\bar{u}$ and $d\bar{d}$, or the J/ψ with the quark content $c\bar{c}$) or because one of them decays via the weak force (e.g. the charged pions $u\bar{d}$ and $d\bar{u}$). Mesons containing a strange-quark are called kaons, those containing a charm-quark D-mesons and those containing a bottom-quark B-mesons.

Lately also more complex colour-neutral composite particles containing four [6,7], five [8,9] and even six quarks [10] have been discovered, but these are very rare and only exist for very short times.

2.1.2 Experimental confirmation

Experimentally, the SM has been a great success. One of the first successes came in 1974 when the J/ψ, a $c\bar{c}$-resonance, was discovered [11,12]. The existence of the c-quark had only been postulated a few years earlier, in 1970, by Sheldon Glashow, John Iliopoulos and Luciano Maiani [13]. The third generation of quarks was discovered shortly afterwards in 1977 with the observation of another quark – anti-quark resonance, the Υ ($b\bar{b}$) [14].

The gauge bosons (except for the already known photon) were also successively discovered at colliders, starting with evidence for the quarks within the proton and for the gluons responsible for binding them into protons [15,16]. The Z-boson was discovered in the early 1980s by the UA1 [17] and UA2 [18] collaborations using the decays $Z \to e^{+}e^{-}$ and $Z \to \mu^{+}\mu^{-}$. A picture of the first event observed by the UA1 experiment which was consistent with the decay of a Z-boson to a muon - anti-muon pair can be seen in Fig. 2.2. The final step was the discovery of the Higgs-boson by the ATLAS [19] and CMS [20] collaborations in 2012, completing the SM.

2.2 Electro-weak symmetry breaking

The electro-weak (EW) theory, together with QCD (see Section 2.3), forms the theoretical foundation of the SM. It was formulated in the 1960s by Sheldon Lee Glashow [22], Abdus Salam and John Clive Ward [23], and Steven Weinberg [24]. The SM is described using gauge groups, which describe transformations under which the Lagrangian ($\mathcal{L}$) of the theory is invariant. The Lagrangian describes all of the dynamics of a system within a theory.

In the EW theory both the electro-magnetic force mediator, the photon, and the charged and neutral mediators of the weak force, the W± and Z-bosons, are described as fields left over after a symmetry breaking when transitioning from higher energies to low energies, called Electro-Weak Symmetry-Breaking (EWSB). This symmetry breaking was first described by Peter Higgs, Robert Brout, François Englert and others [25–28] in 1964.

Figure 2.2: The first event observed by the UA1 experiment which was consistent with the decay of a Z-boson to a muon - anti-muon pair. The energetic muon tracks are detected in the central detector and in the outer muon detector which surrounds the calorimeter. Adapted from [21].

The EWSB proceeds by postulating the two symmetry groups $U(1)_Y$ and $SU(2)_L$ of the Lagrangian and four fields $W_1$, $W_2$, $W_3$ and $B$. In addition it has a global complex scalar potential $V(\phi)$, which follows the description

$$V(\phi) = \mu^{2} |\phi|^{2} + \lambda \left(|\phi|^{2}\right)^{2}, \qquad \lambda > 0. \tag{2.1}$$

Depending on the sign of the parameter µ², this results in either a rotated parabola in the complex plane, or what is usually called the Mexican-hat potential; both are shown in Fig. 2.3. For µ² > 0 the potential has a minimum at φ = 0 and is equivalent to QED, but with a massive charged scalar field with mass µ added. In the case of µ² < 0, the potential has a minimum for φ ≠ 0, at the vacuum expectation value v. This means that the lowest energy state of this theory is not at the point of symmetry of the potential (the origin), but away from it. At energies high enough that the difference between the point of symmetry and the minimum is negligible, the theory is symmetric, in this case under the U(1) symmetry group. When the potential difference is not negligible, on the other hand, the symmetry will be spontaneously broken and the system will decay into the ground state, which can lie anywhere on a circle around the origin.

Figure 2.3: Shapes of the complex potential V(φ) for µ² > 0 (left) and µ² < 0 (right) in the complex plane, with axes Re(φ), Im(φ) and V(φ). Drawing based on [29].

After the spontaneous symmetry-breaking, the fields change into combinations of the original fields at higher energy. They can still be described as linear combinations of the original fields, though, using the coupling strengths of the charged and uncharged parts of the theory, $g$ and $g'$, respectively [30]:

$$W^{\pm}_{\mu} = \frac{1}{\sqrt{2}} \left( W^{1}_{\mu} \mp i W^{2}_{\mu} \right) \tag{2.2}$$

$$Z_{\mu} = \frac{-g' B_{\mu} + g W^{3}_{\mu}}{\sqrt{g^{2} + g'^{2}}} \tag{2.3}$$

$$A_{\mu} = \frac{g B_{\mu} + g' W^{3}_{\mu}}{\sqrt{g^{2} + g'^{2}}}. \tag{2.4}$$

In other words, the EW theory has two charged and massive gauge bosons, the W± bosons, one massive and uncharged gauge boson, the Z boson, and one massless and uncharged one, the photon A. The contributions to the gauge boson masses can also be determined from this, using the vacuum expectation value v:

$$M_W^{2} = \frac{1}{4} g^{2} v^{2} \tag{2.5}$$

$$M_Z^{2} = \frac{1}{4} \left( g^{2} + g'^{2} \right) v^{2} \tag{2.6}$$

$$M_A = 0. \tag{2.7}$$

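Instead of using the coupling constants, the weak mixing angle $\theta_W$, also called the Weinberg angle [22,24], can be used to describe the mixing of the two neutral fields [31]:

$$\begin{pmatrix} A_{\mu} \\ Z_{\mu} \end{pmatrix} = \begin{pmatrix} \cos\theta_W & \sin\theta_W \\ -\sin\theta_W & \cos\theta_W \end{pmatrix} \begin{pmatrix} B_{\mu} \\ W^{3}_{\mu} \end{pmatrix}. \tag{2.8}$$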

This transformation is equivalent to a rotation by the Weinberg angle, as illustrated in Fig. 2.4.

Figure 2.4: Relationship of the Z and γ bosons with the unbroken $W^{0}$ and $B^{0}$ fields as a simple rotation using the Weinberg angle $\theta_W$. Reproduced from [21].

The Weinberg angle is related to the coupling constants on one hand [30] via

$$e = g \sin\theta_W = g' \cos\theta_W, \tag{2.9}$$

but it also relates the masses of the gauge bosons via [31]

$$\cos\theta_W = \frac{M_W}{M_Z}. \tag{2.10}$$

Since the couplings $g$ and $g'$ change depending on the energy scale, the Weinberg angle also changes as a function of energy. The effective Weinberg angle, which is relevant at the energies of the LHC, has been measured by LHCb to be $\sin^{2}\theta_W = 0.23142 \pm 0.00106$ [32].

This model has two free parameters related to the Higgs-boson, which need to be measured. The first one is the vacuum expectation value v, which is determined by the $\mu^{-} \to e^{-} \bar{\nu}_e \nu_{\mu}$ decay rate [2,31]:

$$v = \left( \sqrt{2}\, G_F \right)^{-1/2} = 246\ \text{GeV}. \tag{2.11}$$

The other free parameter is the mass of the Higgs-boson which determines the parameter λ in Eq. (2.1):

$$\lambda = \frac{M_H^{2}}{2 v^{2}} \approx 0.13. \tag{2.12}$$
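As a rough numerical cross-check of Eqs. (2.5), (2.6) and (2.10) to (2.12), the following sketch evaluates the tree-level relations with the rounded masses from Fig. 2.1. The value of the Fermi constant is an assumption taken from standard tables and is not quoted in the text. The on-shell value of sin²θ_W obtained from the mass ratio is not expected to coincide with the effective angle measured by LHCb.

```python
import math

# Rounded masses from Fig. 2.1, in GeV.
M_W = 80.4
M_Z = 91.2
M_H = 125.18

# Assumption: Fermi constant in GeV^-2 (standard tabulated value, not given in the text).
G_F = 1.1663787e-5


def vacuum_expectation_value(g_fermi):
    """Eq. (2.11): v = (sqrt(2) * G_F)^(-1/2)."""
    return (math.sqrt(2.0) * g_fermi) ** -0.5


def sin2_theta_w_on_shell(m_w, m_z):
    """Eq. (2.10) rearranged: sin^2(theta_W) = 1 - (M_W / M_Z)^2."""
    return 1.0 - (m_w / m_z) ** 2


def higgs_self_coupling(m_h, v):
    """Eq. (2.12): lambda = M_H^2 / (2 v^2)."""
    return m_h ** 2 / (2.0 * v ** 2)


def gauge_couplings(m_w, m_z, v):
    """Invert Eqs. (2.5) and (2.6) to obtain g and g'."""
    g = 2.0 * m_w / v
    g_prime = 2.0 * math.sqrt(m_z ** 2 - m_w ** 2) / v
    return g, g_prime


v = vacuum_expectation_value(G_F)
lam = higgs_self_coupling(M_H, v)
s2w = sin2_theta_w_on_shell(M_W, M_Z)
g, g_prime = gauge_couplings(M_W, M_Z, v)

print(f"v = {v:.1f} GeV")
print(f"lambda = {lam:.3f}")
print(f"sin^2(theta_W), on-shell = {s2w:.4f}")
print(f"g = {g:.3f}, g' = {g_prime:.3f}")
```

The couplings obtained this way satisfy g'²/(g² + g'²) = sin²θ_W by construction, which is the content of Eq. (2.9) combined with Eq. (2.10).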

While the gauge bosons obtain their mass directly from the Higgs-field, the fermions, i.e. the quarks and charged leptons, can also gain mass by interacting with it. The strengths of the interactions between the fermions and the Higgs-field are called Yukawa couplings, named after Hideki Yukawa, who first used such couplings to describe the nuclear force between nucleons via exchange of virtual pions [33]. Their values are not predicted by the SM and need to be determined by measuring the masses and couplings of the particles.

2.3 Quantum-Chromo-Dynamics

The quantum theory used to describe the strong nuclear force, which binds quarks into nucleons, and nucleons into atomic nuclei, is called Quantum Chromo Dynamics (QCD). Only quarks and gluons are affected by the strong force. It is mediated by the gluons.

Similar to the electro-weak theory, which is described by the symmetries SU(2)L and U(1)Y (plus the spontaneous symmetry breaking), QCD is described by the symmetry group SU(3). This means that the wave function is invariant under the local transformation

$$\psi \to \exp\!\left( i g_s\, \alpha(x) \lambda / 2 \right) \psi, \tag{2.13}$$

as shown in Ref. [34]. Here $g_s$ is the coupling strength of the strong force, $\alpha(x)$ is the local phase angle and $\lambda$ are the generators of the SU(3) symmetry, which are usually represented by the eight Gell-Mann matrices [35] (and not the λ parameter of the Higgs potential).

The Lagrangian of QCD can be built with a kinetic and an interaction term, which sums over all flavours of quarks in this case. The different quark flavours do not matter to the strong force, only the colour charge does and only quarks appear because they are the only fundamental fermions affected by the strong force.

$$\mathcal{L}_{\mathrm{QCD}} = -\frac{1}{4} F^{A}_{\alpha\beta} F_{A}^{\alpha\beta} + \sum_{\text{flavours}} \bar{q}_a \left( i \not{D} - m \right)_{ab} q_b, \tag{2.14}$$

$$\not{D} = \gamma_{\mu} D^{\mu}, \qquad \{\gamma^{\mu}, \gamma^{\nu}\} = 2 g^{\mu\nu}, \tag{2.15}$$

with $g^{\mu\nu} = \mathrm{diag}(1, -1, -1, -1)$ the metric [36]. The structure of the strong force is hidden in the definition of $F^{A}_{\alpha\beta}$,

$$F^{A}_{\alpha\beta} = \partial_{\alpha} \mathcal{A}^{A}_{\beta} - \partial_{\beta} \mathcal{A}^{A}_{\alpha} - g f^{ABC} \mathcal{A}^{B}_{\alpha} \mathcal{A}^{C}_{\beta}, \tag{2.16}$$

with $f^{ABC}$ the structure constants of the SU(3) colour group. The last term is what distinguishes QCD from QED, a non-abelian term that gives rise to gluon self-interactions and ultimately the property of asymptotic freedom [36]. Here the generators of the symmetry, the Gell-Mann matrices λ, also appear again, because the structure constants are defined by the commutators of the (rescaled) Gell-Mann matrices $t^{A} = \frac{1}{2} \lambda^{A}$,

$$[t^{A}, t^{B}] = i f^{ABC} t^{C}. \tag{2.17}$$
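The commutation relation in Eq. (2.17) can be checked explicitly for a few generators. The sketch below is a minimal illustration using only the first three Gell-Mann matrices, which generate an SU(2) subgroup of SU(3), together with the standard value f¹²³ = 1. The helper functions are hypothetical names introduced for this example and are not taken from any tool used in this thesis.

```python
# First three Gell-Mann matrices; together they generate an SU(2) subgroup of SU(3).
LAMBDA_1 = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
LAMBDA_2 = [[0, -1j, 0], [1j, 0, 0], [0, 0, 0]]
LAMBDA_3 = [[1, 0, 0], [0, -1, 0], [0, 0, 0]]


def scale(factor, matrix):
    """Multiply every entry of a matrix by a scalar."""
    return [[factor * entry for entry in row] for row in matrix]


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def commutator(a, b):
    """[a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(row_ab, row_ba)]
            for row_ab, row_ba in zip(ab, ba)]


def matrices_close(a, b, tol=1e-12):
    """Entry-wise comparison with a small tolerance."""
    return all(abs(x - y) < tol
               for row_a, row_b in zip(a, b)
               for x, y in zip(row_a, row_b))


# Rescaled generators t^A = lambda^A / 2.
t1, t2, t3 = (scale(0.5, m) for m in (LAMBDA_1, LAMBDA_2, LAMBDA_3))

F_123 = 1.0  # standard SU(3) structure constant f^{123}
lhs = commutator(t1, t2)
rhs = scale(1j * F_123, t3)
print("[t^1, t^2] == i f^123 t^3:", matrices_close(lhs, rhs))
```

The same check can be repeated for any other pair of generators once the remaining five Gell-Mann matrices and the corresponding structure constants are added.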

2.3.1 Asymptotic freedom and confinement

Many equations which show up in physics cannot be solved analytically, either with current tools or possibly at all. If this is the case, the usual approach is to resort to perturbation theory. With this approach a system is considered to be in some free ground state and all other effects are just small perturbations from this ground state. The ground state approximation is called leading-order (LO). When one more order of corrections is included, it is called next-to-leading-order (NLO), and each further order adds another "next-to" (NNLO, ...). Higher orders in perturbation theory are represented by Feynman diagrams with additional internal vertices. For an introduction to Feynman diagrams, a graphical way to express perturbative parts of an amplitude of a process, see e.g. Ref. [37]. In general perturbative expansions take the form

$$x = x_0 + \alpha x_1 + \alpha^{2} x_2 + \dots \tag{2.18}$$

The parameter α is the parameter the expansion is performed in. The expansion can be cut off at some order if |α| < 1 in order to obtain an approximate result; in this case the parameter is in the perturbative regime. If |α| > 1, then every additional term of higher order contributes more than the previous terms, and the parameter is in the non-perturbative regime and the expansion must be calculated to all orders in order to get the actual value.
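The difference between the two regimes can be illustrated with a toy version of Eq. (2.18). The sketch below assumes, purely for illustration, that all coefficients x_n are equal to one, so that the series is geometric. Real perturbative coefficients are process dependent and are not given here.

```python
def partial_sums(alpha, coefficients):
    """Partial sums of x_0 + alpha * x_1 + alpha^2 * x_2 + ..., one per order."""
    total = 0.0
    sums = []
    for order, x_n in enumerate(coefficients):
        total += x_n * alpha ** order
        sums.append(total)
    return sums


# Hypothetical coefficients x_0 ... x_5, all set to one for illustration.
coefficients = [1.0] * 6

small = partial_sums(0.1, coefficients)  # |alpha| < 1: perturbative regime
large = partial_sums(2.0, coefficients)  # |alpha| > 1: non-perturbative regime

for order, (s, l) in enumerate(zip(small, large)):
    print(f"order {order}: alpha=0.1 -> {s:.6f}, alpha=2.0 -> {l:.1f}")
```

For α = 0.1 each additional order changes the result by a factor of ten less than the previous one, so truncating early gives a good approximation. For α = 2 every new term is larger than the sum of all previous ones and no truncation is meaningful.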

QED is a theory in which everything can be calculated perturbatively, while this is not the case for QCD. The underlying reason is that the nature of the coupling constant that describes the strength of the interaction is fundamentally different. In both theories the coupling constants are not actually constant, but depend on the transferred four-momentum squared of the process, $q^2$. The way in which the coupling constant changes is encoded in the β function, which is defined as

$$q^2 \frac{\partial \alpha_s}{\partial q^2} = \beta(\alpha_s). \tag{2.19}$$

In QCD the β function has the perturbative expansion

$$\beta(\alpha_s) = -b\,\alpha_s^2 \left(1 + b'\alpha_s + \mathcal{O}(\alpha_s^2)\right), \tag{2.20}$$

with

$$b = \frac{33 - 2 n_f}{12\pi}, \tag{2.21}$$

$$b' = \frac{153 - 19 n_f}{2\pi\,(33 - 2 n_f)}, \tag{2.22}$$

where $n_f$ is the number of active flavours [36]. In contrast, in QED the β function is to leading order given by

$$\beta_{\mathrm{QED}}(\alpha) = \frac{1}{3\pi}\alpha^2 + \ldots \tag{2.23}$$

From the β function the coupling constant can be calculated at any scale $q^2$, if it is known at any other scale $\mu^2$ and as long as both are in the perturbative regime. When neglecting $b'$ and higher coefficients, $\alpha_s$ can be expressed as

$$\alpha_s(q^2) = \frac{\alpha_s(\mu^2)}{1 + \alpha_s(\mu^2)\, b\, t}, \qquad t = \ln\frac{q^2}{\mu^2}. \tag{2.24}$$

When $t$ becomes large, i.e. at large energies $q^2$ or small distances, the coupling decreases slowly to zero. This means that at large energies, such as at the mass of the Z-boson, the quarks behave like free quarks; they are asymptotically free. At small energies, the opposite happens. The coupling gets larger, so that at large distances, where many soft particles are exchanged, the strong force becomes even stronger, leading to confinement of e.g. the partons in the proton. In contrast, for QED the opposite sign in the β function leads to the coupling becoming stronger at large energies, and smaller at small energies.
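The behaviour of Eq. (2.24) can be made concrete with a short numerical sketch. It evaluates the one-loop running coupling with the coefficient $b$ of Eq. (2.21). The reference value $\alpha_s(M_Z^2) = 0.118$ and the fixed choice $n_f = 5$ are assumptions made for illustration and are not taken from the text; $b'$ and flavour thresholds are ignored.

```python
import math

def b_coefficient(n_f: int) -> float:
    """Leading-order beta-function coefficient b of Eq. (2.21)."""
    return (33 - 2 * n_f) / (12 * math.pi)

def alpha_s(q2: float, alpha_s_mu2: float, mu2: float, n_f: int = 5) -> float:
    """One-loop running coupling of Eq. (2.24); q2 and mu2 in GeV^2."""
    t = math.log(q2 / mu2)
    return alpha_s_mu2 / (1 + alpha_s_mu2 * b_coefficient(n_f) * t)

# Assumed reference point (not from the text): alpha_s = 0.118 at the Z mass.
M_Z = 91.1876       # GeV
ALPHA_S_MZ = 0.118  # illustrative input value

for q in (10.5, M_Z, 1000.0):
    value = alpha_s(q ** 2, ALPHA_S_MZ, M_Z ** 2)
    print(f"q = {q:8.2f} GeV   alpha_s = {value:.4f}")
```

In this approximation the coupling is larger at the lower edge of the mass range studied here than at the Z mass and keeps shrinking above it, which is the asymptotic-freedom behaviour described above.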

The different structure encoded in the different β functions can be explained by the additional structure of QCD in comparison to QED. The non-abelian last term in Eq. (2.16) allows interactions of the gluons with themselves, in contrast to the photon of QED, which does not interact with itself (in leading order). This feature of QCD is also reflected in the gluons themselves carrying strong charge (colour), in contrast to photons, which are electrically neutral.

Confinement can also be understood by considering the force between e.g. two quarks. If two quarks were forcefully separated further and further in space, at some point the potential energy between them would exceed the rest mass energy of a quark – anti-quark pair, which is then spontaneously created. This process happens during hadronization, which occurs when one parton of e.g. a proton is removed from the proton (via a collision with another proton, for example). In this case the remaining partons are also no longer colour-neutral, and quarks and anti-quarks are created that form hadrons until all particles are colour-neutral.

2.3.2 Parton Density Functions

The non-perturbativity of QCD below some energy scale needs to be dealt with somehow when making predictions. In hadron collisions, two different scales of q2 are present. On the one side there is the hard process, for example the Drell-Yan process (see Section 2.4). Hard processes are distinguished by having a large q2, which means that they can be calculated perturbatively. On the other hand, there are the incoming hadrons and the hadronization of the parts of the colliding hadrons that were not involved in the hard process. These parts contain interactions at low energies, where QCD cannot be described perturbatively anymore.

Usually these two scales are assumed to operate independently of each other. Such an independent treatment of the hard process on one side and the hadronization on the other is called factorization. Factorizability allows taking the non-perturbative parts, like the distribution of partons within the hadrons, and describing them using experimental data. Since the non-perturbative parts are independent of the hard process, they can be measured using many different processes, operating in different regions of the phase space.

In order to describe these non-perturbative parts, consider colliding protons. When being accelerated, only the protons themselves have a known centre-of-mass energy, not the constituent partons. For highly relativistic protons one can assume that the partons move collinearly with the proton they are a part of, i.e. all partons have $p_T = 0$. Naturally, each parton carries only a fraction $x$ of the total momentum $p$ of a proton (or any hadron). How the total momentum is shared among the constituents of the proton cannot be calculated from first principles in QCD. However, as described in more detail e.g. in Ref. [38], one can define a probability distribution for finding a parton with a certain $x$, which only depends on $x$ and $q^2$. This probability distribution is given separately for each flavour $i$ of the quarks, is denoted by $f_i(x, q^2)$ and is called Parton Density Function (PDF)†. Usually u, d, c and s quarks and anti-quarks (denoted by $\bar{i}$), as well as gluons are considered as partons.

The PDFs exhibit multiple properties that make them very useful. Assuming some fixed scale q2, when summing over all momentum fractions of all partons, the average value is unity, i.e. the sum of all momenta of the partons must be the momentum of the proton:

$$\left\langle \sum x_i \right\rangle = \int_0^1 x \left( \sum_i f_i(x) + \sum_{\bar{i}} f_{\bar{i}}(x) + f_g(x) \right) \mathrm{d}x = 1 \tag{2.25}$$

In addition, the expected number of up- and down-quarks in a proton must be equal to the valence quark content uud:

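$$\langle N_u \rangle = \int_0^1 \left( f_u(x) - f_{\bar{u}}(x) \right) \mathrm{d}x = 2, \qquad \langle N_d \rangle = \int_0^1 \left( f_d(x) - f_{\bar{d}}(x) \right) \mathrm{d}x = 1 \tag{2.26}$$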
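The number sum rules of Eq. (2.26) can be checked numerically for any candidate parametrization. The sketch below uses a purely hypothetical toy valence shape, $f(x) - \bar{f}(x) = N x^{a-1}(1-x)^{b}$ with $a = 0.5$ and $b = 3$; these exponents and the shape itself are illustrative assumptions and do not correspond to any of the PDF sets used in this thesis.

```python
import math

def beta_function(a: float, b: float) -> float:
    """Euler beta function B(a, b), expressed through gamma functions."""
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)

def toy_valence(x: float, n_valence: float, a: float = 0.5, b: float = 3.0) -> float:
    """Hypothetical valence density N x^(a-1) (1-x)^b, normalised to n_valence quarks."""
    norm = n_valence / beta_function(a, b + 1.0)
    return norm * x ** (a - 1.0) * (1.0 - x) ** b

def integrate_unit_interval(func, n_steps: int = 20_000) -> float:
    """Midpoint rule on (0, 1); the substitution x = t^2 tames the x^(-1/2) endpoint."""
    h = 1.0 / n_steps
    total = 0.0
    for k in range(n_steps):
        t = (k + 0.5) * h
        total += func(t * t) * 2.0 * t
    return total * h

n_u = integrate_unit_interval(lambda x: toy_valence(x, 2.0))
n_d = integrate_unit_interval(lambda x: toy_valence(x, 1.0))
valence_momentum = integrate_unit_interval(
    lambda x: x * (toy_valence(x, 2.0) + toy_valence(x, 1.0))
)
print(f"<N_u> = {n_u:.6f}, <N_d> = {n_d:.6f}")
print(f"momentum fraction carried by the toy valence quarks = {valence_momentum:.4f}")
```

In this toy model the valence quarks carry only part of the proton momentum; in the full sum rule of Eq. (2.25) the remainder would have to be supplied by the gluon and sea-quark densities, which are not modelled here.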

†Not to be confused with Probability Density Functions (also PDFs).

Due to these expectation values, the PDF for u and d is expected to peak at about one third, i.e. each valence quark carries a bit less than one third of the proton’s momentum.

However, a proton contains not only the valence quarks by which it is usually identified. The gluons that hold the proton together start to dominate the PDF at lower momentum fractions. These gluons can even spontaneously split into quark and anti-quark pairs, which means that also c and s quarks contribute to the PDF at small x and also that the anti-quark density is non-zero. The quarks (and anti-quarks) that are not the valence quarks are called sea-quarks, because they are a sea of constantly forming and annihilating quark – anti-quark pairs. This progression is depicted as a sketch in Fig. 2.5. At very low energies the proton behaves as a single object. This is the case for most dynamics of the proton as a nucleon. At higher values of q2 the three valence quarks of the proton become important and at even higher q2 the low-momentum gluons and sea-quarks become visible.


Figure 2.5: Sketch of the proton at low (left) to high (right) values of q2. At very low q2 the proton behaves as a single object, at slightly higher q2 the three valence quarks become important. At higher energies the gluons start to dominate and at very high q2 the sea-quarks become visible.

As an example of what these PDFs actually look like, Fig. 2.6 shows the MMHT2014 PDFs (formally introduced later in this section) for a proton, evaluated at two different $q^2$ scales: once at $q^2 = 10\,\mathrm{GeV}^2/c^4$, just above the J/ψ peak and therefore just below the mass range considered in this analysis, and once at $q^2 = 10^4\,\mathrm{GeV}^2/c^4$, just above the Z-peak and at the higher end of that range. The dominance of the valence quarks at high x is clearly visible, with the u-quark PDF about twice the value of the d-quark PDF. In addition, the dominating gluon component is clearly visible at lower x, in this graph scaled down by a factor of ten.

PDF evolution

Parton density functions exhibit another useful property. Knowing the PDF at a certain $q^2$ allows the calculation of the PDF at a different $q^2$. This process is called evolution of the PDFs. It is governed by three equations: one each for the quark and anti-quark PDFs, where the anti-quark equation is the same as that for the quarks, and one for the gluon PDF. They are named DGLAP equations after Yuri Dokshitzer, Vladimir Gribov, Lev Lipatov, Guido Altarelli and Giorgio Parisi [40–43], who first independently described them during the 1970s.

Figure 2.6: MMHT2014 NNLO proton PDFs at $q^2 = 10\,\mathrm{GeV}^2/c^4$ and $q^2 = 10^4\,\mathrm{GeV}^2/c^4$ (close to the Z-peak), with associated 68% confidence-level uncertainty bands. [39]

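$$\frac{\mathrm{d}q_i(x, q^2)}{\mathrm{d}\ln q^2} = \frac{\alpha_s(q^2)}{2\pi} \int_x^1 \frac{\mathrm{d}w}{w} \left[ q_i(w, q^2)\, \mathcal{P}_{qq}\!\left(\frac{x}{w}\right) + g(w, q^2)\, \mathcal{P}_{qg}\!\left(\frac{x}{w}\right) \right] + \mathcal{O}\!\left(\alpha_s^2(q^2)\right) \tag{2.27}$$

$$\frac{\mathrm{d}g(x, q^2)}{\mathrm{d}\ln q^2} = \frac{\alpha_s(q^2)}{2\pi} \int_x^1 \frac{\mathrm{d}w}{w} \left[ q_i(w, q^2)\, \mathcal{P}_{qg}\!\left(\frac{x}{w}\right) + g(w, q^2)\, \mathcal{P}_{gg}\!\left(\frac{x}{w}\right) \right] + \mathcal{O}\!\left(\alpha_s^2(q^2)\right), \tag{2.28}$$

where $q_i(x, q^2)$ are the quark PDFs, $g(x, q^2)$ is the gluon PDF, and $\mathcal{P}_{qq}$, $\mathcal{P}_{qg}$ and $\mathcal{P}_{gg}$ are the so-called Altarelli-Parisi splitting functions [43], which denote the probability of the different possible parton processes containing quarks (q) and gluons (g), $q \to qg$, $g \to q\bar{q}$ and $g \to gg$, respectively [38,44]. The strong coupling constant, which also depends on the energy scale, is denoted by $\alpha_s(q^2)$.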

While this evolution prevents one from making absolute predictions for the PDFs, it does allow using experimental data from one region of q2 and extrapolating into a different region.

The evolution of parton distributions with x, especially towards low-x, is also possible and can be described by the Balitsky-Fadin-Kuraev-Lipatov (BFKL) equation [45–47], the expression of which is more complicated than the DGLAP equations. An introduction and their derivation can be found e.g. in Ref. [36] and more recently in Ref. [48].

Different PDF sets

With the DGLAP equations, it becomes a choice which data to use, i.e. at which $q^2$ to fix the PDFs, and to which regions of $q^2$ to extrapolate the PDFs. Usually this is done by parametrizing the PDFs at some initial scale $q_0^2$ using a polynomial function such as

$$x f(x, q_0^2) = A\,(1 - x)^{\eta} \left(1 + \epsilon\, x^{1/2} + \gamma x\right) x^{\delta}, \tag{2.29}$$

where the parameter η enforces that in the limit $x \to 1$ the PDF becomes 0 and the parameter δ determines the low-x behaviour [49]. The parameters of this parametrization can then be determined by fitting them to experimental data.

Different PDF sets have been developed which make different choices in which PDF parametrization they use, which data to include, but also on how exactly to evolve the PDF to a particular q2. All of the PDF sets used in this thesis use results from multiple experiments, spanning a large range in q2. The oldest PDF set used in this analysis is the MSTW08 PDF set [50], which used data from early fixed-target experiments as well as data collected at the electron-proton collider HERA and the proton - anti-proton collider Tevatron and performs a χ2 fit to extract the parameters. It was published in 2008, just in time for the start of the LHC.

After the end of Run I of the LHC and the publication of the first analyses using data collected during it, multiple new PDF sets were published. The same methods as for the MSTW08 PDF set were used again by the authors and published as MMHT2014 [39]. Apart from including the new data, the parametrization now uses Chebyshev polynomials [51] instead of the simple linear and square-root terms in Eq. (2.29). This was found to give a better description of the lepton charge asymmetry from $W^\pm$ decays [52] observed at the LHC.

In addition, two other updated PDF sets were also published after including the most recent LHC data, the NNPDF30 [53] and the CT14 [54] PDF sets. For the NNPDF30 PDF set, a neural network with a genetic algorithm minimization was used instead of a simple χ2 fit. In addition, the PDF uncertainties are controlled by performing a closure test using pseudoexperiments. The authors of the CT14 PDF set focused on data on inclusive, high-momentum transfer processes, for which perturbative QCD is expected to be reliable, instead of taking all available data.

In Fig. 2.6 the proton PDFs from MMHT2014 are shown, including their uncertainties, at two different energy scales. In general the uncertainties on the PDFs become larger with decreasing x, because this is where the least amount of data is available (and which is also where the analysis presented here is sensitive). They also become smaller with increasing $q^2$, because this is where perturbative QCD is more reliable.

2.4 The Drell-Yan process

One electro-weak process that has been studied extensively, and which is the focus of this thesis, is the so-called Drell-Yan process (DY), first described and calculated theoretically by Sidney Drell and Tung-Mow Yan in 1970 [55]. It is a two-body to two-body process where a fermion and an anti-fermion interact to form a (possibly distinct) fermion – anti-fermion pair. Specifically in this thesis it is taken to be the process where a quark $q$ interacts with an anti-quark $\bar{q}$ and forms a lepton pair. The interaction can proceed either electromagnetically, in which case the quark – anti-quark pair annihilates into a virtual photon, $\gamma^*$, or via the weak interaction, where a neutral Z-boson (on- or off-shell) is produced. Here on-shell means that the produced particle is real and has its actual mass, while off-shell means that it is virtual and can have any mass as long as it appears only as an internal line in the Feynman diagram. In either case, the boson subsequently decays to the two oppositely-charged leptons (of the same generation). The Drell-Yan process was actually the process by which the Z-boson was first discovered, as briefly shown in Section 2.1.2. In Fig. 2.7 the leading order Feynman diagram of this process is shown.

Figure 2.7: Leading order Feynman diagram for the process $q\bar{q} \to Z^0, \gamma^* \to l\bar{l}$.

As the initial state consists of quarks, the Drell-Yan process provides important information about the PDFs, specifically about the anti-quark content of the proton. The Drell-Yan process is also relatively easy to handle, both experimentally, due to the clean signature of just two leptons, and theoretically, due to its factorisation properties in QCD [56].

2.4.1 The general 2 → 2 process

The production of massive lepton pairs at hadron-hadron colliders via the Drell-Yan mechanism is one of the most studied processes in particle physics phenomenology, but historically, the time-reversed process was more relevant at first: two leptons (specifically an electron and a positron) annihilating and producing a quark-anti-quark pair that then hadronizes. Even though the hadronization itself cannot be calculated perturbatively, for large momentum transfers $q^2$† the cross-section of this process can be predicted by perturbation theory.

Figure 2.8: Leading order Feynman diagrams for the process $e^+e^- \to f\bar{f}$ via an exchange of a virtual photon or a Z-boson.

To calculate the cross-section of this process, one can first look at the slightly more general process $e^+e^- \to \gamma^* \to f\bar{f}$ (with a light charged fermion, $f \neq e$). When this process happens at energies far below the mass of the Z-boson, the exchange of a virtual photon $\gamma^*$ dominates and the effects of the Z, and of the interference between the Z and the γ, can be neglected. In this case the differential cross-section to leading order is given in Ref. [36] to be

$$\frac{\mathrm{d}\sigma_{e^+e^- \to \gamma^* \to f\bar{f}}}{\mathrm{d}\cos\theta} = \frac{\pi\alpha^2 Q_e^2 Q_f^2}{2s} \left(1 + \cos^2\theta\right) \tag{2.30}$$

with $s$ the centre-of-mass energy squared, $\alpha = \frac{e^2}{4\pi\epsilon_0\hbar c} = 1/137.035999139(31)$ [2] the electromagnetic coupling constant, $Q_e^2 = 1$ and $Q_f^2$ the squared charges of the electron and fermion in units of the electron charge $e$, respectively, and θ the centre-of-mass scattering angle of the final state fermion.

Integrating over θ gives the total-cross-section

$$\sigma_{e^+e^- \to \gamma^* \to f\bar{f}} = \frac{4\pi\alpha^2}{3s} Q_e^2 Q_f^2 \tag{2.31}$$
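As a cross-check of the step from Eq. (2.30) to Eq. (2.31), the angular integration can be carried out numerically and the result converted to conventional units. This is a sketch only: the conversion factor $(\hbar c)^2 \approx 0.3894\,\mathrm{GeV}^2\,\mathrm{mb}$ and the example energy $\sqrt{s} = 10$ GeV are assumptions that do not appear in the text.

```python
import math

ALPHA = 1 / 137.035999139     # electromagnetic coupling constant quoted in the text
HBARC2_NB_GEV2 = 0.3893794e6  # (hbar c)^2 in nb GeV^2; assumed conversion factor

def dsigma_dcos(cos_theta: float, s: float, q_f: float, q_e: float = -1.0) -> float:
    """Eq. (2.30) in natural units (GeV^-2); s in GeV^2."""
    return math.pi * ALPHA ** 2 * q_e ** 2 * q_f ** 2 / (2 * s) * (1 + cos_theta ** 2)

def sigma_total(s: float, q_f: float, q_e: float = -1.0) -> float:
    """Eq. (2.31) in natural units (GeV^-2); s in GeV^2."""
    return 4 * math.pi * ALPHA ** 2 / (3 * s) * q_e ** 2 * q_f ** 2

def integrate_cos(s: float, q_f: float, n_steps: int = 10_000) -> float:
    """Midpoint-rule integral of Eq. (2.30) over cos(theta) from -1 to 1."""
    h = 2.0 / n_steps
    return h * sum(dsigma_dcos(-1 + (k + 0.5) * h, s, q_f) for k in range(n_steps))

s_example = 10.0 ** 2  # hypothetical sqrt(s) = 10 GeV, well below the Z mass
ratio = integrate_cos(s_example, q_f=-1.0) / sigma_total(s_example, q_f=-1.0)
sigma_mumu_nb = sigma_total(s_example, q_f=-1.0) * HBARC2_NB_GEV2
print(f"numerical / analytic = {ratio:.8f}")
print(f"sigma(e+e- -> mu+mu-) at 10 GeV = {sigma_mumu_nb:.4f} nb")
```

The integral of $1 + \cos^2\theta$ over $\cos\theta \in [-1, 1]$ is $8/3$, which turns the prefactor $\pi\alpha^2/(2s)$ into $4\pi\alpha^2/(3s)$.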

At the Z-peak, the cross-section looks slightly different, since the process is dominated by the Z-boson. Since the electro-weak interaction contains both a vector and an axial-vector coupling, the cross-section is determined by the respective coupling strengths to the involved fermion, Vf and Af , which are given in [36] as

$$\begin{aligned} V_f &= T_f^3 - 2 Q_f \sin^2\theta_W \\ A_f &= T_f^3, \end{aligned} \tag{2.32}$$

†Sometimes also denoted with $Q^2$, but $q^2$ will be used throughout this thesis to avoid confusion with the charges of particles, denoted with Q.

with the Weinberg angle $\theta_W$, the charge of the fermion $Q_f$ and the weak isospin component $T^3$, which is $+\frac{1}{2}$ for the neutrinos and the up-type quarks u, c, t and $-\frac{1}{2}$ for the charged leptons and the down-type quarks d, s, b. The cross-section is then, as given in [36],

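$$\sigma_{e^+e^- \to Z \to f\bar{f}} = \frac{4\pi\alpha^2\kappa^2}{3\Gamma_Z^2} \left(A_e^2 + V_e^2\right)\left(A_f^2 + V_f^2\right), \tag{2.33}$$

with $\Gamma_Z = 2.4952(2)$ GeV [2] the total decay width of the Z boson. The constant κ is here defined as:

$$\kappa = \frac{\sqrt{2}\, G_F M_Z^2}{16\pi\alpha}, \tag{2.34}$$

with $G_F = 1.1663787(6) \times 10^{-5}\,\mathrm{GeV}^{-2}\cdot(\hbar c)^3$ [2] the Fermi constant and $M_Z = 91.1876(21)\,\mathrm{GeV}/c^2$ [2] the mass of the Z boson.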

The hadronic cross-section ratio R

This process was studied extensively at $e^+e^-$ colliders. One of the most well-known measurements is the ratio R of the cross-sections for this process to produce a quark – anti-quark pair, $f = q$, and a muon – anti-muon pair, $f = \mu$. Below the Z-mass this can be easily evaluated from Eq. (2.31) by summing over all quarks accessible at the centre-of-mass energy $\sqrt{s}$ and multiplying by the number of different colours quarks can carry, $N_c = 3$, as shown in [38]:

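$$R \equiv \frac{\sigma(e^+e^- \to \mathrm{hadrons})}{\sigma(e^+e^- \to \mu^+\mu^-)} = \frac{N_c \sum_q \frac{4\pi\alpha^2}{3s} Q_e^2 Q_q^2}{\frac{4\pi\alpha^2}{3s} Q_e^2 Q_\mu^2} = N_c \sum_q Q_q^2, \tag{2.35}$$

with $Q_i$ the charge of the particle $i$.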

Assuming $\sqrt{s} < 2M_t$, with $M_t = 173.34(76)\,\mathrm{GeV}/c^2$ [2] the mass of the top-quark, which is always the case below the Z-peak since $M_Z < M_t$, all quarks except the top-quark are accessible. Since there are three down-type quarks (d, s, b) with charge $-\frac{1}{3}$ and only two up-type quarks (u, c) with charge $\frac{2}{3}$, the ratio can be evaluated to be

$$R = N_c \left( 3 \cdot \frac{1}{9} + 2 \cdot \frac{4}{9} \right) = \frac{11 N_c}{9} = \frac{11}{3} \approx 3.67. \tag{2.36}$$
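The arithmetic of Eqs. (2.35) and (2.36) can be reproduced exactly with rational numbers. The sketch below also evaluates the ratio for smaller sets of accessible quarks, which illustrates the steps in R; the helper name `r_ratio` and the choice of flavour sets are illustrative, and the kinematic thresholds themselves are not modelled.

```python
from fractions import Fraction

N_C = 3  # number of colours

# Electric charges in units of the elementary charge.
QUARK_CHARGES = {
    "u": Fraction(2, 3), "c": Fraction(2, 3),
    "d": Fraction(-1, 3), "s": Fraction(-1, 3), "b": Fraction(-1, 3),
}

def r_ratio(accessible_quarks) -> Fraction:
    """Leading-order ratio R = N_c * sum of squared quark charges, Eq. (2.35)."""
    return N_C * sum(QUARK_CHARGES[q] ** 2 for q in accessible_quarks)

for flavours in (["u", "d", "s"], ["u", "d", "s", "c"], ["u", "d", "s", "c", "b"]):
    value = r_ratio(flavours)
    print(f"{''.join(flavours):>5}: R = {value} = {float(value):.2f}")
```

With all five quarks below the top-quark the function returns 11/3, in agreement with Eq. (2.36).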

At the Z-peak, the analogous quantity is the ratio of the partial decay widths of the Z to hadrons and muon pairs, as given in [36]:

$$R_Z = \frac{\Gamma(Z \to \mathrm{hadrons})}{\Gamma(Z \to \mu^+\mu^-)} = \frac{N_c \sum_{\mathrm{quarks}} \left(A_q^2 + V_q^2\right)}{A_\mu^2 + V_\mu^2} \tag{2.37}$$

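Since the Z-mass is smaller than the top mass, again all quarks except the top quark are accessible. In this case we get

$$R_Z = \frac{N_c \left\{ 3 \cdot \left[ \left(-\frac{1}{2}\right)^2 + \left(-\frac{1}{2} + \frac{2}{3}\sin^2\theta_W\right)^2 \right] + 2 \cdot \left[ \left(\frac{1}{2}\right)^2 + \left(\frac{1}{2} - \frac{4}{3}\sin^2\theta_W\right)^2 \right] \right\}}{\left(-\frac{1}{2}\right)^2 + \left(-\frac{1}{2} + 2\sin^2\theta_W\right)^2} = 20.072 \pm 0.018, \tag{2.38}$$

with $V_f$ and $A_f$ given in Eq. (2.32) and using the measured value of $\sin^2\theta_W = 0.23142 \pm 0.00106$ [32].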
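The central value of Eq. (2.38) follows directly from the couplings of Eq. (2.32). The short sketch below evaluates it for the quoted $\sin^2\theta_W$; the function names are illustrative and the uncertainty propagation is not reproduced.

```python
N_C = 3                  # number of colours
SIN2_THETA_W = 0.23142   # central value quoted in the text

def coupling_strength(t3: float, charge: float, sin2: float = SIN2_THETA_W) -> float:
    """A_f^2 + V_f^2 with V_f = T3 - 2 Q sin^2(theta_W) and A_f = T3, Eq. (2.32)."""
    vector = t3 - 2.0 * charge * sin2
    axial = t3
    return axial ** 2 + vector ** 2

def r_z(sin2: float = SIN2_THETA_W) -> float:
    """Ratio of hadronic to muonic Z partial widths, Eq. (2.37), for five quark flavours."""
    down_type = coupling_strength(-0.5, -1.0 / 3.0, sin2)  # d, s, b
    up_type = coupling_strength(+0.5, +2.0 / 3.0, sin2)    # u, c
    muon = coupling_strength(-0.5, -1.0, sin2)
    return N_C * (3 * down_type + 2 * up_type) / muon

print(f"R_Z = {r_z():.3f}")
```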

The number of accessible quarks changes as a function of the centre-of-mass energy, which leads to a step in R whenever a new quark becomes kinematically available. In addition, there are peaks at the positions of particle – anti-particle resonances, as can be seen in Fig. 2.9.

Figure 2.9: Ratio of the cross-section of electron and positron annihilation going to hadrons and to muons versus the centre-of-mass energy √s. Clearly visible are the steps whenever a new quark becomes kinematically accessible as well as the positions of the resonances [57].

The Drell-Yan cross-section at parton level

The Drell-Yan process, which is the focus of this thesis, is essentially just the time-reversed process of $l\bar{l} \to q\bar{q}$. The only difference is that instead of summing over all colours, we need to average over them, since now the quarks are in the initial state, giving an additional factor of $1/N_c$ for the cross-section (outside of the Z-peak):

$$\sigma_{q\bar{q} \to \gamma^* \to l^+l^-} = \frac{4\pi\alpha^2}{3s} \frac{1}{N_c} Q_l^2 Q_q^2 \tag{2.39}$$

The cross-section at the Z-peak exhibits a resonance structure, following a Breit-Wigner shape, and is therefore proportional to

$$\sigma_{q\bar{q} \to Z \to l^+l^-} \propto \frac{1}{(s - M_Z^2)^2 + M_Z^2 \Gamma_Z^2}, \tag{2.40}$$

as explained e.g. in Ref. [38].
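The shape of Eq. (2.40) can be explored with a few lines of code. The sketch uses the values of $M_Z$ and $\Gamma_Z$ quoted earlier in this section and ignores the overall normalisation, which Eq. (2.40) leaves unspecified.

```python
M_Z = 91.1876     # GeV, Z mass quoted in the text
GAMMA_Z = 2.4952  # GeV, Z width quoted in the text

def breit_wigner(s: float, mass: float = M_Z, width: float = GAMMA_Z) -> float:
    """Unnormalised resonance shape of Eq. (2.40); s in GeV^2."""
    return 1.0 / ((s - mass ** 2) ** 2 + mass ** 2 * width ** 2)

peak = breit_wigner(M_Z ** 2)
half = breit_wigner(M_Z ** 2 + M_Z * GAMMA_Z)

for sqrt_s in (80.0, 88.0, M_Z, 94.0, 100.0):
    relative = breit_wigner(sqrt_s ** 2) / peak
    print(f"sqrt(s) = {sqrt_s:7.3f} GeV   shape / peak = {relative:.4f}")
```

The shape is maximal at $s = M_Z^2$ and drops to half of its peak value at $s = M_Z^2 \pm M_Z\Gamma_Z$, so $\Gamma_Z$ sets the width of the resonance.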

In both cases, s is the centre-of-mass energy squared of the two quarks and not of the hadron they are contained in. To get from the cross-section involving individual quarks to the Drell-Yan cross-section at a pp-collider, such as the LHC, the Parton Density Functions described in Section 2.3.2 are necessary. With the PDFs, the hadronic cross- section relevant for proton-proton collisions can be calculated from the partonic one given in Eq. (2.39) by integrating over all partons with all momentum fractions x in both protons, as shown in Ref. [38]:

$$\begin{aligned} \sigma_{pp \to \gamma^* \to l^+l^-} &= \int_0^1 \sum_{j\bar{j}} f_j(x_1)\, f_{\bar{j}}(x_2)\, \sigma_{j\bar{j} \to l^+l^-}(x_1 x_2 s)\, \mathrm{d}x_1\, \mathrm{d}x_2 \\ &= \frac{4\pi\alpha^2 Q_l^2}{3 N_c} \int_0^1 \sum_j Q_j^2\, f_j(x_1)\, f_{\bar{j}}(x_2)\, \frac{1}{q^2}\, \mathrm{d}x_1\, \mathrm{d}x_2. \end{aligned} \tag{2.41}$$

Here $j$ runs over the partons of one proton, with momentum fraction $x_1$, and $\bar{j}$ over the partons of the other proton, with momentum fraction $x_2$, which need to be the anti-particles of the partons considered for the first proton. Also, the PDFs are taken to be at the relevant $q^2$, $f_i(x) \equiv f_i(x, q^2)$.

So, instead of being as simple as shown in the Feynman diagram in Fig. 2.7, the reality of the DY process looks more like the Feynman diagram shown in Fig. 2.10. In order to calculate the cross-section of such a process at a hadron-collider, the content of the hadrons needs to be considered.

2.4.2 The differential Drell-Yan cross-section

To get the differential Drell-Yan cross-section as a function of variables accessible in experiments, let us first define a different phase-space basis. Currently the cross-section in Eq. (2.41) is given in terms of the centre-of-mass energy squared $q^2$ of the hard process, but also in terms of the momentum fractions $x_{1,2}$. Instead of $x_{1,2}$ we can define the rapidity $y$ as

$$y = \frac{1}{2} \log\frac{E + p_L}{E - p_L}, \tag{2.42}$$

Figure 2.10: Leading order Feynman diagram for the Drell-Yan process (see Section 2.4) in pp collisions.

where pL is the longitudinal momentum component. The rapidity has the useful property that it is additive when performing longitudinal Lorentz transformations.

For two massless partons (from e.g. two protons) with momentum fractions $x_1$ and $x_2$ colliding, the following relation between the rapidity of the final state and the momentum fractions of the initial state is also useful:

$$y = \frac{1}{2} \log\frac{x_1}{x_2}. \tag{2.43}$$

It can be directly derived by considering that in the centre-of-mass frame the four-momenta of two colliding massless partons are given by $p_1 = \frac{\sqrt{s}}{2}(x_1, 0, 0, x_1)$ and $p_2 = \frac{\sqrt{s}}{2}(x_2, 0, 0, -x_2)$, so that $E = \frac{\sqrt{s}}{2}(x_1 + x_2)$ and $p_L = p_z = \frac{\sqrt{s}}{2}(x_1 - x_2)$ in Eq. (2.42) [36].

Conversely, the momentum fractions can be expressed as a function of the rapidity when using that the relation between the $q^2$ of the hard process and the centre-of-mass energy squared of the proton-proton collision $s$ is given by $q^2 = x_1 x_2 s$:

$$x_1 = \sqrt{\frac{q^2}{s}}\, e^{y}, \qquad x_2 = \sqrt{\frac{q^2}{s}}\, e^{-y} \tag{2.44}$$
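Equation (2.44) shows which momentum fractions a measurement at given mass and rapidity probes. The following sketch evaluates it at the corners of the range quoted in the abstract ($10.5 < M_{\mu\mu} < 110\,\mathrm{GeV}/c^2$, $2 < y < 4.5$, $\sqrt{s} = 8$ TeV) under the leading-order assumption $q^2 = M_{\mu\mu}^2$; the function name is illustrative.

```python
import math

SQRT_S = 8000.0  # GeV, proton-proton centre-of-mass energy of this analysis

def momentum_fractions(mass: float, rapidity: float, sqrt_s: float = SQRT_S):
    """Eq. (2.44) with q^2 = mass^2; mass and sqrt_s in GeV."""
    root_tau = mass / sqrt_s
    return root_tau * math.exp(rapidity), root_tau * math.exp(-rapidity)

for mass in (10.5, 110.0):
    for y in (2.0, 4.5):
        x1, x2 = momentum_fractions(mass, y)
        note = "" if x1 <= 1.0 else "  (x1 > 1: not kinematically allowed)"
        print(f"M = {mass:6.1f} GeV, y = {y:3.1f}: x1 = {x1:.3e}, x2 = {x2:.3e}{note}")
```

At low mass and high rapidity one of the two partons carries a very small momentum fraction, which is the low-x region mentioned in Section 2.3.2.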

For massless particles the rapidity is also equivalent to the pseudorapidity η

$$\eta = -\log\left(\tan\frac{\theta}{2}\right), \tag{2.45}$$

where θ is the angle between the momentum and the beam, conventionally taken to be on the z-axis. The pseudorapidity is a measure of the direction of the particle with respect to the beam and therefore usually easier to measure in an experiment than $y$. For muons in a Drell-Yan event measurable at e.g. LHCb, $|p| \gg m_\mu$ and therefore $|p| \approx E$ and $y \approx \eta$.
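How good the approximation $y \approx \eta$ is for a muon can be checked directly from Eqs. (2.42) and (2.45). In the sketch below the muon mass value and the example momentum ($p_T = 3$ GeV/c, $p_z = 50$ GeV/c) are assumptions chosen for illustration only.

```python
import math

M_MU = 0.1056584  # GeV/c^2, muon mass; assumed value, not quoted in the text

def rapidity(px: float, py: float, pz: float, mass: float) -> float:
    """Eq. (2.42) with the longitudinal momentum taken along the z-axis."""
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return 0.5 * math.log((energy + pz) / (energy - pz))

def pseudorapidity(px: float, py: float, pz: float) -> float:
    """Eq. (2.45), with theta the polar angle with respect to the z-axis."""
    theta = math.atan2(math.hypot(px, py), pz)
    return -math.log(math.tan(theta / 2.0))

y_mu = rapidity(3.0, 0.0, 50.0, M_MU)
eta_mu = pseudorapidity(3.0, 0.0, 50.0)
print(f"y = {y_mu:.5f}, eta = {eta_mu:.5f}, eta - y = {eta_mu - y_mu:.2e}")
```

For this example the two quantities differ only at the level of $m_\mu^2 / (2 p_T^2)$, far below typical bin widths in rapidity.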

For an alternative coordinate system two more coordinates are needed. The first one is the transverse momentum component $p_T = \sqrt{p_x^2 + p_y^2}$† and the second one is the azimuthal angle in the xy-plane

$$\phi = \arctan\left(\frac{p_y}{p_x}\right). \tag{2.46}$$

Since the beam is conventionally taken to be on the z-axis and both pT and φ are Lorentz-invariant under a boost in the z-direction this defines an alternative coordinate system which uses only Lorentz-invariant quantities and which is linked to the Cartesian coordinate system via the following relations:

$$\begin{aligned} p_x &= p_T \cos\phi \\ p_y &= p_T \sin\phi \\ p_z &= p_T \sinh\eta \\ |p| &= p_T \cosh\eta \end{aligned} \tag{2.47}$$

The definitions of the angles φ and θ, as well as the value of the pseudorapidity for different angles are shown in Fig. 2.11.
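The relations of Eq. (2.47) and their inverse can be written as a pair of small conversion functions. This is an illustrative sketch; the inverse uses `atan2` instead of the plain arctangent of Eq. (2.46) so that the quadrant of φ is resolved, which is a design choice made here and not something prescribed by the text.

```python
import math

def to_cartesian(pt: float, eta: float, phi: float):
    """(pT, eta, phi) -> (px, py, pz), following Eq. (2.47)."""
    return pt * math.cos(phi), pt * math.sin(phi), pt * math.sinh(eta)

def to_collider(px: float, py: float, pz: float):
    """(px, py, pz) -> (pT, eta, phi); requires a non-zero transverse momentum."""
    pt = math.hypot(px, py)
    eta = math.asinh(pz / pt)
    phi = math.atan2(py, px)
    return pt, eta, phi

px, py, pz = to_cartesian(3.0, 3.5, 2.0)
pt_back, eta_back, phi_back = to_collider(px, py, pz)
p_abs = math.sqrt(px * px + py * py + pz * pz)
print(f"px = {px:.4f}, py = {py:.4f}, pz = {pz:.4f}")
print(f"|p| = {p_abs:.4f}, pT cosh(eta) = {3.0 * math.cosh(3.5):.4f}")
```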

Figure 2.11: Sketches illustrating the relation between the Cartesian coordinate system and the coordinate system using $p_T$, η and φ. (a) Definition of the angles φ and θ used to convert from the Cartesian to the spherical coordinate system, based on [58]. (b) Values of the pseudorapidity for a selection of polar angles (θ = 90°: η = 0; θ = 60°: η = 0.55; θ = 45°: η = 0.88; θ = 30°: η = 1.32; θ = 10°: η = 2.44; θ = 0°: η = ∞), based on [59].

In the case of an intermediate particle decaying to two final-state particles, the rapidity $y$ can still be calculated using the reconstructed four-momentum of the decaying particle, which is equal to the sum of the four-momenta of the two particles. The same is true for the momentum transfer $q^2$, which just becomes the invariant mass squared of the final state. In the case of the DY process to two leptons this means that $q^2 = M_{l^+l^-}^2$. Using the rapidity $y$, together with the invariant mass of the intermediate boson, the cross section can be obtained from Eq. (2.41) by a change of variables, as detailed in Ref. [38]:

$$\sigma_{pp \to \gamma^* \to l^+l^-} = \frac{4\pi\alpha^2 Q_l^2}{3 N_c} \int \frac{1}{q^4} \sum_j Q_j^2\, x_1 f_j(x_1)\, x_2 f_{\bar{j}}(x_2)\, \mathrm{d}q^2\, \mathrm{d}y \tag{2.48}$$

† Here $p_y$ is the y-component of the momentum as usual, and not related to the rapidity $y$.

Instead of writing this as a double integral over $\mathrm{d}q^2\, \mathrm{d}y$, this equation can also be written in differential form, which gives the double-differential Drell-Yan cross-section to LO:

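$$\frac{\mathrm{d}^2\sigma_{pp \to \gamma^* \to l^+l^-}}{\mathrm{d}q^2\, \mathrm{d}y} = \frac{4\pi\alpha^2 Q_l^2}{3 N_c\, q^4} \sum_j Q_j^2\, x_1 f_j(x_1)\, x_2 f_{\bar{j}}(x_2). \tag{2.49}$$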
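The change of variables that leads from Eq. (2.41) to Eqs. (2.48) and (2.49) can be written out explicitly. The following is a sketch of this step, using only Eq. (2.44) and $q^2 = x_1 x_2 s$:

```latex
\begin{align*}
\frac{\partial x_1}{\partial q^2} &= \frac{x_1}{2q^2}, &
\frac{\partial x_1}{\partial y} &= x_1, &
\frac{\partial x_2}{\partial q^2} &= \frac{x_2}{2q^2}, &
\frac{\partial x_2}{\partial y} &= -x_2, \\
\left|\frac{\partial(x_1, x_2)}{\partial(q^2, y)}\right|
  &= \left| -\frac{x_1 x_2}{2q^2} - \frac{x_1 x_2}{2q^2} \right|
   = \frac{x_1 x_2}{q^2} = \frac{1}{s}, \\
\frac{1}{q^2}\,\mathrm{d}x_1\,\mathrm{d}x_2
  &= \frac{x_1 x_2}{q^4}\,\mathrm{d}q^2\,\mathrm{d}y .
\end{align*}
```

Inserting the last line into Eq. (2.41) reproduces the factor $x_1 f_j(x_1)\, x_2 f_{\bar{j}}(x_2) / q^4$ that appears in the integrand.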

The defining feature of this cross-section is that it falls off strongly as a function of $q^2$, leading to a much smaller cross-section at high invariant di-lepton masses. This only changes when getting closer to the Z-mass, where the cross-section exhibits a resonance-like structure, as shown in Eq. (2.40).

Higher order corrections

The equations presented so far are only to leading order. Higher order corrections need to be added to them. A selection of the next-to-leading order corrections to the DY cross-section is shown in Fig. 2.12. These include one of the initial quarks radiating a (soft) gluon, called initial-state radiation (ISR), the ISR interacting with the same gluon again, a gluon coupling to both incoming quarks, or the whole process being initiated by a quark and a gluon, instead of a quark and an anti-quark. The corrections at the next higher order, NNLO, include diagrams with another additional vertex, of which there are too many to show here.

Figure 2.12: Second order QCD Feynman diagrams for the processes $q\bar{q} \to Z/\gamma^* \to l\bar{l}$ and $qg \to Z/\gamma^* \to q\, l\bar{l}$.

2.4.3 The FEWZ tool

In practice, theoretical predictions of the differential Drell-Yan cross-sections, with which the results of this analysis are compared in Chapter 6, can be produced using a tool called Fully Exclusive W and Z production (FEWZ) version 3.1 [60,61]. This tool was specifically designed for the simulation of W and Z production at hadron colliders. It can produce predictions for the total and differential cross-sections at NNLO level, using the LO equations described above and their higher order corrections (see Fig. 2.12 for some NLO contributions). In addition, it can apply phase space cuts (which are a necessity from the experimental side due to limited resolution for e.g. low momentum tracks) and use different PDF sets [62]. Final state radiation (FSR) can be included as well, but this is not done here, because detector effects like bremsstrahlung and finite resolution need to be taken into account as well. The correction for the detector effects also automatically corrects for any FSR, as described in Section 4.5.

The PDF sets used in this thesis are the ones mentioned in Section 2.3.2: MMHT2014, MSTW08, NNPDF30 and CT14. The uncertainties reported by FEWZ are separated into three components. There is a statistical uncertainty, which results from the numerical integration of Eq. (2.41) internally. In addition, since this is only a perturbative calculation (albeit at NNLO), some theoretical uncertainty remains, which is estimated by varying the factorisation and renormalisation scales by a factor of two up and down (with some extreme values being discarded). Finally, the PDF sets themselves also have associated uncertainties, which are evaluated at 68% CL. The only exception is the CT14 PDF set, whose uncertainties are evaluated at 90% CL; they can easily be rescaled to the 68% CL reported by the other PDF sets by dividing the PDF uncertainty by the numerical factor 1.645 [63].
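The rescaling of the CT14 uncertainties is a one-line operation. The sketch below applies the factor 1.645 from the text and, as a plausibility check, compares it with the Gaussian quantile of a two-sided 90% interval. Treating the uncertainties as Gaussian is an assumption of this check, and the input value is hypothetical.

```python
from statistics import NormalDist

CL90_TO_CL68 = 1.645  # numerical factor quoted in the text

def rescale_90_to_68(uncertainty_90cl: float) -> float:
    """Convert a 90% CL uncertainty to the (approximately) 68% CL convention."""
    return uncertainty_90cl / CL90_TO_CL68

# A two-sided 90% Gaussian interval extends to the 95% quantile of the unit normal.
gaussian_factor = NormalDist().inv_cdf(0.95)

example_90cl = 3.29  # hypothetical PDF uncertainty at 90% CL, arbitrary units
print(f"68% CL uncertainty: {rescale_90_to_68(example_90cl):.3f}")
print(f"Gaussian 90% quantile: {gaussian_factor:.4f}")
```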

The numerical values predicted by FEWZ using the different PDFs can be found in Tables A.1 and A.2†.

†The predictions for this analysis were generated by K. Müller.

CHAPTER 3

THE LHCB EXPERIMENT AT THE LHC

The true method of knowledge is experiment.

William Blake

3.1 Particle accelerators

In order to study the constituents of matter, the favoured approach in high-energy particle physics is to use particle accelerators. These machines accelerate particles to higher and higher energies and collide them with other particles. Particle detectors observe the results of these collisions. During these interactions, the collision energy can manifest itself in the production of new particles. This creation often happens as particle – anti-particle pairs, in order to obey conservation laws like conservation of charge. Such pairs may originate from a parent particle, in which case the measurement of their kinematic quantities allows the features of the parent to be constrained. In order for these pairs to become real, instead of being purely virtual (see Chapter 2), the energy of the colliding particles needs to be larger than the invariant mass of the particles/anti-particles. The discovery of ever heavier new particles led to the development of accelerators which can reach higher and higher collision energies.

The cross-section (∝ likelihood) for a specific process depends on the collision centre-of-mass energy, $\sqrt{s}$. The production of specific particles is governed by the equation

$$N = \sigma \int \mathcal{L}\, \mathrm{d}t, \tag{3.1}$$

where $N$ is the number of times a specific particle is produced, σ the cross-section of this particle being produced (the normalized probability to produce this particle) and $\mathcal{L}$ is the luminosity, an accelerator-specific variable describing the density of the interacting particles. The higher the luminosity, the more often a specific process occurs. Luminosity and how it is measured at LHCb is described in more detail in Section 3.4.
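Equation (3.1) amounts to a multiplication once the luminosity has been integrated over time. The sketch below uses the integrated luminosity of 2.0 fb⁻¹ quoted in the abstract together with a hypothetical cross-section of 1 pb; detector acceptance and efficiencies are deliberately ignored.

```python
PB_INV_PER_FB_INV = 1000.0  # 1 fb^-1 = 1000 pb^-1

def expected_events(sigma_pb: float, integrated_lumi_fb_inv: float) -> float:
    """N = sigma * integrated luminosity, Eq. (3.1), for sigma in pb and luminosity in fb^-1."""
    return sigma_pb * integrated_lumi_fb_inv * PB_INV_PER_FB_INV

n_events = expected_events(sigma_pb=1.0, integrated_lumi_fb_inv=2.0)  # hypothetical 1 pb process
print(f"expected number of produced events: {n_events:.0f}")
```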

In general, there exist two different designs of particle physics experiments. The first is the fixed-target experiment. It involves creating and accelerating a single particle beam, which is then brought into collision with a stationary target. This target often, but not necessarily, consists of different particles than the particles being accelerated. In this design, the luminosity is very high, because the targets are designed in such a way that basically all of the accelerated particles interact with the target. On the other hand, this design has a limited centre-of-mass energy, since the centre-of-mass frame is moving. With $E_1$ being the energy of the beam and $m_2$ the mass of the target particles, the centre-of-mass energy $\sqrt{s}$ can be calculated from the four-momenta of the particles in the beam, $p_1^\mu$, and the target, $p_2^\mu$:

$$p_1^\mu = (E_1, \vec{p}_1), \qquad p_2^\mu = (m_2, \vec{0}) \tag{3.2}$$

The centre-of-mass energy √s is given by the magnitude of the combined four-momentum:

µ µ 2 2 2 2 2 E1m1,m2 s = (p + p ) = (E1 + m2) ~p1 = m + m + 2E1m2 2E1m2 (3.3) 1 2 − 1 2 ≈ The second way to build a particle accelerator is a collider. In this case there are two particle beams (not necessarily containing the same particles or being of the same energy), that are collided at an interaction region. In the case of both beams containing the same particles, of the same energy, the centre-of-mass frame does not move and the centre-of-mass energy squared becomes:

\[ s = (p_1^\mu + p_2^\mu)^2 = (E + E)^2 - (\vec{p} - \vec{p})^2 = (2E)^2 \tag{3.4} \]

Therefore, a collider is able to achieve higher centre-of-mass energies than a fixed-target experiment with the same beam energy. The simplest version of a collider is a linear collider, where the beams are accelerated in a straight line, brought into collision and dumped afterwards. This design is used if one (or both) of the beams contains unstable particles that would decay soon anyway. A more advanced design is a circular collider, which reuses the beams after each interaction by refocussing them and forcing them on a closed path in order to collide them again.
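To make the difference between Eqs. (3.3) and (3.4) concrete, the sketch below evaluates both for a 4 TeV proton beam, the beam energy implied by √s = 8 TeV. The function names and the proton-on-proton fixed-target scenario are assumptions made for illustration.

```python
import math

PROTON_MASS_GEV = 0.938272  # proton rest mass in GeV/c^2 (rounded)

def sqrt_s_fixed_target(beam_energy: float, m_beam: float, m_target: float) -> float:
    """Eq. (3.3) without the high-energy approximation; all inputs in GeV."""
    return math.sqrt(m_beam**2 + m_target**2 + 2.0 * beam_energy * m_target)

def sqrt_s_collider(beam_energy: float) -> float:
    """Eq. (3.4): two identical beams colliding head-on; input in GeV."""
    return 2.0 * beam_energy

beam = 4000.0  # GeV per beam
fixed = sqrt_s_fixed_target(beam, PROTON_MASS_GEV, PROTON_MASS_GEV)
collider = sqrt_s_collider(beam)
print(f"fixed target: {fixed:.1f} GeV, collider: {collider:.0f} GeV")
```

With these inputs the fixed-target configuration reaches less than 100 GeV, while the collider configuration reaches the full 8 TeV.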

In contrast to a fixed-target experiment, not all particles interact with each other in a linear or circular collider. Indeed, most particles do not interact with particles from the opposite beam at all. However, in a circular collider the beams can be reused every orbit.

Circular colliders are limited by energy loss due to synchrotron radiation emitted in the bending of the flight path of the particles and by the strength of the magnets (or equivalently, the diameter of the accelerator) needed to force the particles on a closed track. Both limitations depend on the momentum of the particles to collide. The energy loss due to synchrotron radiation also depends on the mass of the particles being bent by the magnetic field (∝ $m^{-4}$ [64]). Circular electron-positron colliders are limited by synchrotron radiation, which is why the next electron-positron collider will likely be a linear collider. For heavier particles, such as protons, energy losses due to synchrotron radiation can be neglected and the strength of the magnets is the major constraint.
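The size of the m^-4 effect can be estimated with a few lines. The sketch below compares an electron and a proton of the same energy on the same orbit; the function name is an assumption, and all other factors in the radiated power are taken to cancel in the ratio.

```python
ELECTRON_MASS_MEV = 0.51099895  # MeV/c^2
PROTON_MASS_MEV = 938.27209     # MeV/c^2

def synchrotron_loss_ratio(m_light: float, m_heavy: float) -> float:
    """Ratio of energy loss per turn (light/heavy) at equal energy and radius,
    using only the m^-4 scaling quoted in the text."""
    return (m_heavy / m_light) ** 4

ratio = synchrotron_loss_ratio(ELECTRON_MASS_MEV, PROTON_MASS_MEV)
print(f"electron/proton loss ratio: {ratio:.2e}")
```

The ratio is of the order of 10^13, which is why synchrotron radiation dominates for electrons but is negligible for protons.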

3.2 The Large Hadron Collider

The European Organization for Nuclear Research (CERN), near Geneva, Switzerland, has a long history of building particle accelerators, starting with the Synchrocyclotron (SC) constructed in 1957. Quite often when a new accelerator was built, the already existing accelerators were reused as pre-accelerators for the new accelerator. The accelerator chain of the newest accelerator, the Large Hadron Collider (LHC) [65] consists of five particle accelerators with increasing energies, as shown in Table 3.1. The oldest accelerator still in use (albeit refurbished multiple times) is the Proton Synchrotron (PS), which was originally built in 1959.

Table 3.1: Current accelerator chain used to inject protons into the LHC, the year of the start of operation and the current possible energy per beam [66].

Name                               Year   Current energy/proton
Linear Accelerator 2 (LINAC2)      1963   50 MeV
Proton Synchrotron Booster (PSB)   1972   1.4 GeV
Proton Synchrotron (PS)            1959   25 GeV
Super Proton Synchrotron (SPS)     1976   450 GeV
Large Hadron Collider (LHC)        2008   6.5 TeV

The LHC†, the last part of the chain, was finished in 2008. It is situated in the same 27 km long tunnel between the Jura mountains and Lake Geneva, under the Franco-Swiss border, as the Large Electron-Positron Collider (LEP) before it. LEP had been in operation from 1989 to 2000 and the LHC is expected to operate until at least 2030, however with multiple upgrades. The particles being accelerated are forced on the roughly circular underground path using more than 1200 liquid-helium-cooled superconducting dipole magnets. The LHC mostly accelerates and collides protons, but it can also be filled with lead or xenon ions.

The particles being accelerated are not in a continuous beam, but rather in small packets, called bunches. The synchrotron concept for circular accelerators uses high-frequency longitudinal electric fields to accelerate the particles, which determines both frequency (i.e. distance) and size of the bunches. The distance in time between two bunches in the LHC was 50 ns from 2010 to 2012. Starting with the 2015 data taking it was reduced to the design value of 25 ns. This corresponds to a spatial distance of about 7.5 m for particles travelling at the speed of light, which is a good approximation for protons and ions in the LHC.

†Some general information on the LHC can be found for example in [68].

[Figure 3.1: schematic of the CERN accelerator complex. Labelled machines include LINAC2, BOOSTER (1972, 157 m), PS (1959, 628 m), SPS (1976, 7 km), LHC (2008, 27 km), LINAC3, LEIR (2005, 78 m), AD (1999, 182 m) and ELENA (2016, 31 m), together with the experiments ALICE, ATLAS, CMS and LHCb.]

Figure 3.1: Full accelerator complex at CERN, as of 2016. The path of protons can be followed from the LINAC2, through the BOOSTER, the PS, the SPS and finally to the LHC, as described in Table 3.1. Similarly, ions start in the LINAC3, pass through LEIR and into the PS, from which they follow the same way protons take. Reproduced from [67].

Theoretically there would be space for 3564 bunches in the LHC ring with a bunch spacing of 25 ns. However, only a maximum of 2808 bunches can be circulated in the ring at one time [69]. The difference is due to the filling scheme used by the LHC, which for example includes an abort gap to ensure there is enough time to ramp up the magnets that steer the beam into the beam dump if necessary, but it also includes small gaps between the bunches. In practice the number of bunches populated is even lower than the maximum allowed by the filling scheme; however, it has been increased with every year of data taking as operational experience with the machine has grown. Each LHC bunch normally contains approximately $10^{11}$ protons. When filled with heavy ions, the number of particles is smaller. During injection of bunches from the SPS into the LHC, whole trains of bunches are injected at once. Within a train each bunch slot is filled, while there can be larger gaps between trains. Multiple injections from the SPS are needed to fill the LHC.
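The bunch numbers quoted above can be cross-checked with a short calculation. This is a minimal sketch using only values from the text plus the speed of light; the variable and function names are assumptions.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s

def bunch_distance_m(spacing_ns: float) -> float:
    """Spatial distance between bunches moving at (nearly) the speed of light."""
    return SPEED_OF_LIGHT * spacing_ns * 1e-9

slots = 3564              # bunch slots at 25 ns spacing (from the text)
max_bunches = 2808        # maximum allowed by the filling scheme (from the text)
protons_per_bunch = 1e11  # approximate bunch population (from the text)

distance = bunch_distance_m(25.0)
fill_fraction = max_bunches / slots
protons_per_beam = max_bunches * protons_per_bunch
print(f"{distance:.2f} m between bunches, {fill_fraction:.1%} of slots filled, "
      f"{protons_per_beam:.1e} protons per beam")
```

At the 50 ns spacing used in 2012 the same function gives twice the distance, about 15 m.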

Together with the number of bunches in the machine, the energy to which protons are accelerated has also been successively increased. See Table 3.2 for an overview of the settings used in the different years. During 2012, when the data analysed in this thesis were taken, the LHC was running at a centre-of-mass energy of √s = 8 TeV and a bunch spacing of 50 ns. Data taking is separated into multiple runs of the LHC, each spanning several years. Each year of data taking ends with an end-of-year shutdown, while runs are separated by long shutdowns (LS), which are used for maintenance and upgrade of the accelerator and detectors. Run I, which includes the data taking for this analysis, lasted from 2010 to 2012, while the data taking from 2015 to 2018 is called Run II. Run I and II were separated by LS1 and, as of the writing of this thesis, LS2 is ongoing.

Table 3.2: Overview over various machine parameters during pp collisions of the LHC during the different years of data taking [70].

Run      Year   √s       Bunch spacing   Maximum # bunches
Run I    2010   7 TeV    150 ns          368
         2011   7 TeV    75/50 ns        1380
         2012   8 TeV    50 ns           1380
Run II   2015   13 TeV   25 ns           2244
         2016   13 TeV   25 ns           2220
         2017   13 TeV   25 ns           2556
         2018   13 TeV   25 ns           2556

With a circumference D of about 27 km, a bunch reaches the same interaction region again with a frequency of $f_\mathrm{rev} = c/D \approx 11.3$ kHz. The path of the particles is not quite circular, but rather consists of eight straight segments connected by curved segments. The straight segments contain interaction points (IP), where the beams can be brought into collision. Four of the eight straight segments contain the main experiments. Going clockwise along the ring these are the general purpose experiment ATLAS [71] at IP1, the ALICE experiment [72] which focuses on studying quark-gluon plasma at IP2, the CMS experiment [73], like ATLAS a general purpose detector, at IP5 and the LHCb experiment [74] at IP8. The main dipole magnets are stationed in, and responsible for, the curved sections. In addition to the main dipole magnets, the LHC also contains thousands of other magnets used to steer, focus or manipulate the beam in other ways. A sketch of the different interaction points can be seen in Fig. 3.2.

3.3 The LHCb experiment

The LHCb detector [74] is a single-arm forward spectrometer situated at IP8 of the LHC. It was designed to measure the production and decay of hadrons containing a b- or c-quark. However, it also offers unique opportunities to study many processes in the forward region, both during proton-proton collisions and during heavy-ion runs.

Figure 3.2: Simplified sketch of the LHC. Particles are in two counter-rotating beams (beam 1 and beam 2) which cross at multiple interaction points (IPs), containing the four experiments ATLAS (IP1), ALICE (IP2), CMS (IP5) and LHCb (IP8).

The detector is about 21 m long, 10 m high and 13 m wide. The layout and the names of the detector systems, which will be explained in more detail in the following, can be seen in Fig. 3.3. In contrast to the general purpose (GP) detectors ATLAS and CMS, LHCb covers only the forward region, as the sketch of the pseudorapidity coverage in Fig. 3.4 shows. While the GP detectors resemble barrels, which are centred around the collision point, the LHCb detector forms a rough cone, opening from the collision point. This design is very similar to detectors designed for fixed-target experiments, where the products of the collisions are heavily boosted in the direction of the beam, away from the target. The reason behind this particular geometry is LHCb's main focus on studying hadrons containing b-quarks. As can be seen in Fig. 3.5, b-quarks are predominantly produced in the forward direction with small angles towards the beam axis. A large portion of the b-quarks produced in the forward direction (defined as into the detector) lies within the detector acceptance. Because of this, the detector does not have to cover the whole solid angle, which allowed its construction in an existing experimental hall†.

As also shown in Fig. 3.3, LHCb uses a right-handed Cartesian coordinate system, where the z-axis is along the direction of beam 1 (which rotates clockwise, viewed from above, as shown in Fig. 3.2), the y-axis points vertically upwards and the x-axis lies in the plane of the LHC ring.

†This experimental hall housed the DELPHI experiment when the tunnel was home to LEP.

Figure 3.3: The LHCb detector [74] including the abbreviations of its multiple subdetectors and the y and z-axis of its right-handed coordinate system.

Figure 3.4: Pseudorapidity coverage of the ATLAS, CMS and LHCb detectors with regard to the muon systems, hadronic and electromagnetic calorimeters and the tracking. Reproduced from [75].

Figure 3.5: Distribution of the angles θ1 and θ2 under which b and b̄ are produced at the LHC at √s = 8 TeV [76]. The phase space covered by the LHCb detector is shown in red. It can be seen that a large fraction of one half of the produced b quarks (those going in the forward direction) lies within the detector acceptance.

One of the distinguishing features of the LHCb experiment is its capability to determine the origin of the particles (called vertex), provided by the vertex locator (VELO, see below). It is used to identify the collision point(s) of the protons (primary vertices, PVs) as well as decay vertices of short-lived particles (secondary vertices, SVs). The precise determination of primary and secondary vertices only works if the average number of interactions per bunch crossing (µ) is small enough such that there are not too many tracks in the event.

In order to limit µ, the instantaneous luminosity (see Section 3.4) in LHCb is adjusted regularly. This process is called luminosity levelling. A constant µ over a long period of time is achieved by vertically separating the beams at the beginning of each fill and gradually decreasing this separation as the beams deteriorate. After about 20 hours of collisions they have deteriorated far enough that the vertical separation reaches zero, after which the interaction rate decays. A plot of an example fill from late 2018 can be seen in Fig. 3.6. Clearly visible is the long period of constant interaction rate, as well as the eventual exponential fall-off after the collisions have become head-on. In contrast, the GP detectors always have head-on collisions, and also more colliding bunch pairs. Luminosity levelling allows LHCb to identify the PV from which each particle produced in the collision originates. A typical number of interactions per bunch crossing in the period of levelling amounts to µ ≈ 1.2, while at the GP experiments this value can be as large as 500.

The integrated luminosity (see Section 3.4) recorded by LHCb as a function of time is shown in Fig. 3.7. During 2012, LHCb recorded approximately 2 fb$^{-1}$ of data in pp collisions.
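The levelling procedure can be illustrated with a toy model. The sketch below is not the LHC implementation: it assumes two equal, round Gaussian beams, for which a separation d reduces the luminosity by a factor exp(-d^2 / (4σ^2)), and it assumes that the luminosity available for head-on collisions decays exponentially. All numbers and function names are illustrative assumptions in arbitrary units.

```python
import math

def head_on_luminosity(t_hours: float, lumi_start: float, lifetime_hours: float) -> float:
    """Luminosity available for head-on collisions, assumed to decay exponentially."""
    return lumi_start * math.exp(-t_hours / lifetime_hours)

def levelling_separation(t_hours, lumi_start, lifetime_hours, lumi_target, sigma_beam):
    """Beam separation (same unit as sigma_beam) that reduces the head-on
    luminosity to lumi_target, using L = L_head_on * exp(-d^2 / (4 sigma^2)).
    Returns 0 once the head-on luminosity has dropped to the target or below."""
    available = head_on_luminosity(t_hours, lumi_start, lifetime_hours)
    if available <= lumi_target:
        return 0.0
    return 2.0 * sigma_beam * math.sqrt(math.log(available / lumi_target))

def delivered_luminosity(t_hours, lumi_start, lifetime_hours, lumi_target):
    """Levelled luminosity: constant at the target until head-on, then decaying."""
    return min(lumi_target, head_on_luminosity(t_hours, lumi_start, lifetime_hours))

# Purely illustrative numbers in arbitrary units (not LHC machine parameters).
for t in (0.0, 10.0, 20.0, 30.0):
    d = levelling_separation(t, 10.0, 10.0, 1.0, 1.0)
    lumi = delivered_luminosity(t, 10.0, 10.0, 1.0)
    print(f"t = {t:4.1f} h: separation = {d:.2f} sigma, luminosity = {lumi:.2f}")
```

In this toy model the separation shrinks steadily while the delivered luminosity stays at the target, and the exponential fall-off starts once the separation has reached zero, which is the qualitative behaviour described above.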

Figure 3.6: Instantaneous luminosity for ATLAS & CMS and LHCb for fill 2651. The instantaneous luminosity is kept stable at LHCb for about 15 hours by adjusting the vertical separation of the two beams [77].

Figure 3.7: Cumulative recorded luminosity per year of data taking [78]. Run I and II are separated by the Long Shutdown 1 (LS1). The data analysed in this thesis was recorded in 2012, shown here in dark blue.

In the following, the different detector components of LHCb are introduced.

3.3.1 Tracking system

The tracking system measures the position of charged particles both before (in the VELO and TT) and after (in the IT/OT) the magnet. It covers the forward acceptance of ±250 mrad vertically and of ±300 mrad horizontally, both with respect to the z-axis. The high-precision tracking system and the large magnetic field integral allow the kinematic quantities of the observed particles to be determined with high accuracy. For the analysis discussed in this thesis, the tracking system is the most relevant part of the LHCb detector. It contains several subdetectors, which are described in more detail in the following.

Vertex locator

To get a measurement of charged tracks close to the interaction region, a very precise, and at the same time radiation-hard, detector is needed. In LHCb this purpose is served by the VErtexLOcator (VELO). As the name suggests, its main task is to identify where exactly proton-proton collisions occurred and for each charged track determine from which primary or secondary vertex it most likely originated.

The VELO consists of a series of silicon sensor planes (see Fig. 3.9) along the z-axis, which provide alternate measurements of r or φ to facilitate fast track reconstruction of charged tracks. Most sensors are in the forward direction as seen from the nominal interaction point, but some sensors are also situated upstream of the nominal interaction point to get a measure of the activity in the direction not covered by the rest of the detector. This is especially important in measurements which require no other activity in the detector. During stable beams, the two halves of the detector are brought to within 8 mm of the beams, while they are retracted during the beam injection procedure.

The VELO is especially important for the measurement presented in this thesis, since it is used to separate signal candidates originating directly from a PV from background which originates from subsequent decays of particles produced in the primary interaction and which therefore have a displaced decay vertex. A cluster resolution of about 4 µm was aimed at in order to achieve the needed resolution. The VELO (together with the rest of the tracking system) can determine the closest distance of a track to a PV (the impact parameter, IP) with an uncertainty of σIP = 14 µm + 35 µm/pT [74], where the transverse momentum pT is measured in GeV/c. For the x-component of the IP this is also shown in Fig. 3.8.
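The parameterization of the impact-parameter resolution quoted above is easy to evaluate for a few transverse momenta. The following lines are a minimal sketch; the function name is an assumption, while the two coefficients are taken from the text.

```python
def ip_resolution_um(pt_gev: float) -> float:
    """Impact-parameter resolution in micrometres for a given pT in GeV/c,
    using sigma_IP = 14 um + 35 um / pT as quoted in the text."""
    return 14.0 + 35.0 / pt_gev

for pt in (0.5, 1.0, 5.0, 20.0):
    print(f"pT = {pt:5.1f} GeV/c -> sigma_IP = {ip_resolution_um(pt):.1f} um")
```

The resolution improves quickly with increasing pT and approaches the constant term of 14 µm for high-pT tracks.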

The acceptance for tracks with an angle from 15 to 60 mrad with respect to the beam axis is fully covered by the VELO, meaning that a track within this acceptance crosses all forward stations of the VELO. A track in the LHCb spectrometer angular acceptance of 300 mrad crosses at least three VELO stations, or in other words the VELO supplies tracking in the pseudorapidity range of 1.6 < η < 4.9.
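Angles with respect to the beam axis and pseudorapidities are used interchangeably in this chapter. The sketch below converts between them using the standard definition η = −ln tan(θ/2), which is not spelled out in this section; the function names are assumptions.

```python
import math

def pseudorapidity(theta_rad: float) -> float:
    """eta = -ln(tan(theta/2)) for a polar angle theta measured from the beam axis."""
    return -math.log(math.tan(theta_rad / 2.0))

def polar_angle(eta: float) -> float:
    """Inverse relation: theta = 2 * atan(exp(-eta))."""
    return 2.0 * math.atan(math.exp(-eta))

for theta_mrad in (15.0, 60.0, 250.0, 300.0):
    eta = pseudorapidity(theta_mrad * 1e-3)
    print(f"theta = {theta_mrad:5.1f} mrad -> eta = {eta:.2f}")
```

For example, an angle of 15 mrad corresponds to η ≈ 4.9, the upper edge of the VELO range quoted above.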

Figure 3.8: Resolution of the x-component of the impact parameter as a function of 1/pT using data collected in 2012 [77].

A rendering of one half of the VELO as well as a sketch of the positions of the sensors is shown in Fig. 3.9, including the RF foil separating the VELO vacuum from the LHC vacuum.

Magnet

The LHCb experiment uses a dipole magnet to bend the tracks of charged particles in the x-z-plane. The magnet has an integrated magnetic field of 4 Tm for tracks of 10 m length. The polarity of the magnetic field is periodically reversed (between MagUp and MagDown), which is especially important for the measurements of CP asymmetries. For other measurements, like the one presented in this thesis, it gives two statistically independent datasets of similar size, which helps to limit some systematic uncertainties of the track parameter measurements. Figure 3.10 shows a sketch of the magnet, together with the LHCb coordinate system. The magnetic field strength in the main direction, the y-axis, is shown in Fig. 3.11.
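The effect of the 4 Tm field integral on a track can be estimated with the usual rule of thumb that a unit-charge particle receives a momentum kick of about 0.3 GeV/c per tesla-metre. The sketch below applies it in the small-angle approximation; it ignores the detailed field map, and the function names are assumptions.

```python
KICK_GEV_PER_TM = 0.299792458  # GeV/c per tesla-metre for a unit-charge particle

def pt_kick_gev(field_integral_tm: float) -> float:
    """Momentum change in the bending plane, in GeV/c."""
    return KICK_GEV_PER_TM * field_integral_tm

def bending_angle_mrad(p_gev: float, field_integral_tm: float) -> float:
    """Small-angle deflection of a particle with momentum p (GeV/c), in mrad."""
    return 1e3 * pt_kick_gev(field_integral_tm) / p_gev

kick = pt_kick_gev(4.0)
print(f"momentum kick: {kick:.2f} GeV/c")
for p in (10.0, 50.0, 100.0):
    print(f"p = {p:5.1f} GeV/c -> deflection ~ {bending_angle_mrad(p, 4.0):.1f} mrad")
```

The deflection is inversely proportional to the momentum, so the relative momentum uncertainty from a fixed angular resolution grows with momentum.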

Tracking stations

The tracker consists of two parts, one in front (upstream) of the magnet and one behind (downstream). The upstream tracker is called Tracker Turicensis (TT) and covers the full acceptance of the detector. It was developed and built by the Zurich group. The downstream tracker consists of an inner and outer part.

Both the TT and the Inner Tracker (IT) use silicon microstrip sensors and need to be particularly radiation hard since they are in regions of high occupancy (i.e. in regions with very high particle rates). The spatial resolution of both detectors was chosen to be about 50 µm, so that the momentum resolution is dominated by multiple scattering at low pT. Resolutions measured from actual data show that this goal has been achieved (see Fig. 3.12 for the TT).

(a) A sketch of the relative positions of the VELO sensors and of the acceptance of the VELO. The beam is along the horizontal z-axis and the rest of the LHCb detector is to the right. The position of the two halves in closed and fully open position are also shown.

(b) A rendering of one half of the VELO, including the foil used to separate the vacuum in the VELO from the LHC vacuum. The beam enters from the top right corner and exits the VELO in the lower left corner. At the back, the two VETO stations, used to veto events with too many interactions, are shown as well.

Figure 3.9: Visualizations of the VELO [74] showing the position of the sensors and the overall layout of the detector.

Figure 3.10: The LHCb dipole magnet including dimensions [74]. Also indicated is the LHCb coordinate system.

Figure 3.11: The main B-field component By as a function of the z coordinate [74]. The positions of the different parts of the tracker are also indicated.

Figure 3.12: Hit resolution measured for all modules of the TT. The sector number corresponds approximately to the x-direction. The labels X1, U, V and X2 correspond to the four detection layers arranged with an (xuvx) geometry. [77]

The outer parts of the acceptance of the tracking stations (T1 - T3) are covered with a gaseous drift-time detector, the Outer Tracker (OT). This region has a lower occupancy and therefore straw tubes can be used, which allows covering a larger area at lower cost.

3.3.2 Calorimeter system

The energy of particles is measured using a calorimeter system consisting of four parts. The first is a scintillator pad detector (SPD), which detects charged particles in order to separate photons and electrons. It also provides a fast measure for the activity in each event and is used in the hardware level of the trigger (see Section 4.2) to reject events which would take too long to reconstruct.

The SPD is followed by a thin lead layer and the pre-shower detector (PS) in order to distinguish electrons from charged pions. Both the SPD and the PS are only one layer thick and contain an LED for calibration. A picture of the scintillators used both in the SPD and PS is shown in Fig. 3.13.

After the PS detector follows the main electromagnetic calorimeter (ECAL). It detects electrons and photons (via pair production in the material) using scintillation light, which is detected by photomultipliers. Between the layers of scintillators are lead absorbers. The ECAL consists of 66 layers and has a thickness of 25 radiation lengths in order to fully contain showers produced by photons. It has an energy resolution of $\sigma_E/E = 10\%/\sqrt{E} \oplus 1\%$ (E in GeV).
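The symbol ⊕ in the resolution formula denotes a sum in quadrature. As a minimal sketch, the function below evaluates the relative ECAL resolution for a few energies; the function and parameter names are assumptions, while the two coefficients are those quoted in the text.

```python
import math

def ecal_relative_resolution(energy_gev: float,
                             stochastic: float = 0.10,
                             constant: float = 0.01) -> float:
    """sigma_E / E = stochastic / sqrt(E) (+) constant, added in quadrature."""
    return math.hypot(stochastic / math.sqrt(energy_gev), constant)

for energy in (1.0, 10.0, 100.0):
    print(f"E = {energy:5.1f} GeV -> sigma_E/E = {ecal_relative_resolution(energy):.2%}")
```

The stochastic term dominates at low energies, while at 100 GeV both terms contribute equally and the constant term takes over above that.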

Figure 3.13: Individual SPD/PS scintillator pad with output fibre and calibration LED housing in the middle [74].

The final part of the calorimeter system is the hadronic calorimeter (HCAL). Its purpose is the detection and energy measurement of hadronic showers. This includes neutral hadrons such as neutral kaons and charged hadrons like pions, kaons and protons. In contrast to the ECAL it consists of alternating layers of scintillators and iron absorbers. Due to space constraints its length is only 5.6 interaction lengths, leading to some leakage of hadrons into the muon system. This leakage is one source of background events for the analysis presented in this thesis and is further described in Section 4.2.2.

3.3.3 Muon system

Paramount for this analysis is the reconstruction and identification of muons. Since muons are minimum-ionizing particles at the energies most common at the LHC, they traverse the rest of the detector without being stopped or even losing a lot of energy.

Five muon stations (M1 - M5) are used to reconstruct muons, with M1 situated before the calorimeter system and the rest after it. Stations M1 - M3 have a high spatial resolution and are used to calculate the pT of muon candidates, while the main purpose of M4 and M5 is the detection of the presence of penetrating particles in order to select good muon candidates. The stations M2 - M5 are separated by 80 cm thick iron absorbers. Each station is segmented into multiple cells (R1-R4), which become larger along the z-direction so that straight tracks coming from the PV stay in the same corresponding cell number in the subsequent stations. A side view of this setup can be seen in Fig. 3.14. Muons with momenta above ca. 6 GeV/c can punch through the whole muon system and leave the detector without being stopped (but not without being detected).

The muon stations themselves use two different technologies. For the most part, multiwire proportional chambers (MWPC) are used, which are cheap and reliable. Only for the inner region of the M1 station (R1), triple gas electron multiplier (GEM) detectors are used because the particle rate exceeds the safety limits for ageing of the MWPCs here [79,80].

Figure 3.14: Side view sketch of the muon system with the five muon stations (M1-M5) and the radial size of the four regions (R1-R4), each containing multiple cells [74].

An important parameter for the spatial resolution of the muon system is the cluster size, the average number of cells triggered by a crossing particle. In the L0 trigger, where the muon system is used to identify the two highest-pT muons (see Section 3.3.5), each cell simply returns a yes-no signal indicating whether it was hit. Therefore no interpolation between adjacent cells is possible and the cluster size should be as close as possible to one. Two sources contribute to a larger cluster size. The first is purely geometric: a track can be inclined with respect to the borders between cells. The second is cross-talk between cells, either inductive or capacitive, or in the read-out electronics. The design criteria required that the cluster size should be less than 1.2 [79]. In the actual detector it is below 1.1 for the whole range of voltages applied to the chambers [74]. The other important quantity is the overall efficiency of the muon system to detect muons. Both the efficiency and the cluster size are shown as a function of the applied high voltage in Fig. 3.15. Also shown is the resulting transverse-momentum resolution as a function of the momentum of the muon. It is mostly affected by multiple scattering between M1 and M2, i.e. in the calorimeters, and by the granularity of the muon system, i.e. the size of the cells being read out.

[Figure 3.15, panel (a): efficiency and cluster size versus high voltage HV (V) from 2300 to 2700 V for chamber M4R2 at a threshold of 8 fC, with curves for four-gap and two-gap operation. Panel (b): pT resolution (%) versus momentum (GeV/c), with curves for the total resolution and for the contributions from granularity, from multiple scattering between M1 and M2, and from the magnet parameterization and multiple scattering before M1.]

(a) Measured efficiency and cluster size in a 20 ns window as a function of the applied high voltage for two different configurations of one muon system module. Double-gap MWPCs are used in M1 and four-gap MWPCs in the rest of the muon system [74, 81].

(b) Contributions to the transverse-momentum resolution $|p_T^\mathrm{rec} - p_T^\mathrm{true}| / p_T^\mathrm{true}$ as a function of the muon momentum, averaged over the full acceptance. It is shown for muons from semileptonic b-decays having a reconstructed pT close to the trigger threshold (1 – 2 GeV/c). Simulation study during the technical design phase [79].

Figure 3.15: Performance characteristics of the muon system (measured and projected).

The muon system is mostly used to identify particles as muons. Only very few other particle types are able to penetrate the detector this far. Most of the electrons and photons are stopped in the ECAL, while charged and neutral hadrons are usually stopped in the HCAL. So, most of the particles making it to the muon stations are actually muons, but some highly energetic hadrons, like charged pions and kaons, can also reach this far. This leads to two major contributions to particles being misidentified as muons:

1. Highly energetic hadrons that punch through the HCAL and reach the muon system. These are mostly pions, kaons and protons.

2. Particles that decay in-flight into a muon (and other particles). These are mostly charged pions and kaons decaying to a muon and a neutrino ($\pi^\pm/K^\pm \to \mu^\pm \nu_\mu$).

Both categories are still present in the data used in this analysis and need to be disentangled from the signal events. The general muon ID efficiency is described in more detail in Section 4.6.3 for high-pT muons. The muon identification efficiency and the misidentification probabilities for the three major misidentification candidates are shown in Fig. 3.16. The efficiency to correctly identify muons is larger than 90% for all muon momenta considered. The largest misidentification probability is for low-momentum pions.

(a) Efficiency of the muon candidate selection based on the matching of hits in the muon system to track extrapolation, as a function of momentum for different pT ranges.

(b) Misidentification probability of protons as a function of momentum, for different pT ranges.

(c) Misidentification probability of pions as a function of momentum, for different pT ranges.

(d) Misidentification probability of kaons as a function of momentum, for different pT ranges.

Figure 3.16: Probabilities to identify a given particle as a muon in the muon system as a function of the (transverse-)momentum of the particle [77].

3.3.4 Particle identification using RICH detectors

Two Ring Imaging Cherenkov (RICH) detectors are used for the identification of charged particles. Charged particles traversing the radiator media (RICH1: aerogel, removed in Run II, and C4F10; RICH2: CF4) with a velocity larger than the speed of light in the medium emit Cherenkov radiation under an angle θc given by their velocity β and the refractive index n of the medium:

\[ \cos\theta_c = \frac{1}{n\beta}, \quad \text{with } \beta = \frac{v}{c} \tag{3.5} \]

For a given momentum p = γmv, with $\gamma = 1/\sqrt{1 - v^2/c^2}$, measured with the tracking system, particles with different mass travel at different speeds and can therefore be distinguished by the angle under which Cherenkov radiation is produced. Figure 3.17 shows that the RICH1 is able to distinguish pions, kaons, protons and muons in a wide momentum range (2 - 40 GeV/c). Together with RICH2, particle identification for particles with momenta between 2 and 100 GeV/c is possible.
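Eq. (3.5) can be evaluated directly to see how the Cherenkov angle separates particle species at a fixed momentum. In the sketch below the refractive index of 1.0014 is an assumed value of the order expected for a gas radiator such as C4F10; it is not taken from this thesis, and the function name and rounded particle masses are likewise assumptions.

```python
import math

# Masses in GeV/c^2 (rounded).
MASSES = {"muon": 0.105658, "pion": 0.139570, "kaon": 0.493677, "proton": 0.938272}

def cherenkov_angle_mrad(p_gev: float, mass_gev: float, n: float):
    """Cherenkov angle from cos(theta_c) = 1/(n*beta); None below threshold."""
    beta = p_gev / math.hypot(p_gev, mass_gev)  # beta = p / E in natural units
    cos_theta = 1.0 / (n * beta)
    if cos_theta > 1.0:
        return None  # particle is slower than light in the medium: no radiation
    return 1e3 * math.acos(cos_theta)

N_GAS = 1.0014  # assumed refractive index of the gas radiator
for name, mass in MASSES.items():
    angle = cherenkov_angle_mrad(10.0, mass, N_GAS)
    print(name, "below threshold" if angle is None else f"{angle:.1f} mrad")
```

With these assumptions, pions and kaons of 10 GeV/c emit light under clearly different angles, while protons of the same momentum are still below the Cherenkov threshold.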

Figure 3.17: Reconstructed Cherenkov angle as a function of track momentum in the C4F10 radiator of RICH1 [82].

3.3.5 Trigger and event reconstruction

Trigger

In order to understand the dataset generation for this thesis, it is necessary to understand how LHCb decides to reconstruct and store data of the collisions. During Run I a collision between counter-rotating bunches could occur every 50 ns, i.e. at a rate of up to 20 MHz, half of the nominal 40 MHz bunch-crossing rate that corresponds to the 25 ns design spacing. This is far more than can eventually be stored on disk. There is also not enough time to run the full event reconstruction or even to read out the full detector†. Therefore a system is needed to decide which events are kept for further analysis. For this purpose a multilevel triggering system is used to gradually decrease the rate of events that are processed and finally stored. An overview of the different levels can be seen in Fig. 3.18. A full description of this scheme can also be found in Ref. [74].

†Being able to read out the full detector at the full bunch crossing rate is one reason for the upgrade of the LHCb detector [83–86] during LS2.

[Figure 3.18, LHCb 2012 trigger diagram: 40 MHz bunch crossing rate → L0 hardware trigger (1 MHz readout, high ET/pT signatures; 450 kHz h±, 400 kHz µ/µµ, 150 kHz e/γ) → software High Level Trigger (29000 logical CPU cores; offline reconstruction tuned to trigger time constraints; mixture of exclusive and inclusive selection algorithms) → 5 kHz (0.3 GB/s) to storage (2 kHz inclusive topological, 2 kHz inclusive/exclusive charm, 1 kHz muon and dimuon).]

Figure 3.18: Trigger scheme during the 2012 data taking with the L0 hardware trigger and the two-stage software trigger as well as the rates allocated for the different trigger parts [87].

The first trigger level is the L0 hardware trigger. It is implemented in custom-built FPGAs and uses information from the VELO, SPD, ECAL and HCAL. The number of SPD cells with a hit is used to obtain a measure for the overall charged track multiplicity in the crossing. The calorimeters are used to detect and identify high-ET electrons, photons, neutral pions and hadrons, and the muon system reconstructs muon tracks and selects the two muons with the highest pT. A pile-up system estimates the number of primary pp interactions. This information is combined and used to make different L0 decisions. If the number of activated SPD cells is above 600, the L0Hadron trigger fires. If there is at least one high-pT muon or one high-pT muon pair in the event, the L0Muon or L0DiMuon trigger fires, respectively. If there is sufficient energy deposited in the ECAL, the L0Electron trigger fires. In addition, a small random number of events is kept using the minimum-bias trigger. The hardware trigger runs synchronously to the full 40 MHz bunch crossing rate and reduces the event rate to 1 MHz, the rate at which the full detector can be read out.

In addition to the L0 triggers mentioned above, the full LHCb detector is randomly read out at a rate of about 1 kHz in order to continuously monitor the luminosity, as described in Section 3.4 and in more detail in Ref. [88]. These triggers are called luminosity triggers.

Events that have been selected in the L0 trigger are further processed by the High Level Triggers 1 and 2 (HLT 1 and HLT 2), which are implemented in software. This second level of the trigger reduces the event rate from 1 MHz down to 5 kHz, the rate at which events can be written to storage for further offline analysis. The HLT makes use of the full event data, after confirming the result of the L0 trigger. In the HLT, an event is processed further if it triggered any of many different trigger lines that run in parallel. By requiring that one specific trigger line has fired, events with certain properties can be selected, which is used to produce the datasets for specific analyses, such as this one.
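The size of the rate reduction at each trigger level can be made explicit with a short calculation. The following Python sketch uses only the rates quoted in the text and in Fig. 3.18 (40 MHz, 1 MHz and 5 kHz); the names `STAGES`, `reduction_factors` and `total_reduction` are illustrative and do not correspond to any LHCb software.

```python
# Rates quoted in the text, in Hz. The names are illustrative only.
STAGES = [
    ("bunch crossing", 40e6),
    ("L0 hardware trigger", 1e6),
    ("HLT output to storage", 5e3),
]


def reduction_factors(stages):
    """Return (stage name, factor by which the rate drops at that stage)."""
    factors = []
    for (_, rate_in), (name, rate_out) in zip(stages, stages[1:]):
        factors.append((name, rate_in / rate_out))
    return factors


def total_reduction(stages):
    """Overall factor between the first and the last rate in the chain."""
    return stages[0][1] / stages[-1][1]


if __name__ == "__main__":
    for name, factor in reduction_factors(STAGES):
        print(f"{name}: rate reduced by a factor of {factor:.0f}")
    print(f"overall reduction: {total_reduction(STAGES):.0f}")
```

With these inputs the hardware trigger rejects all but one crossing in 40, and the software trigger keeps one event in 200 of those that remain.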

The first part of the HLT, HLT 1, reconstructs the particle tracks in the VELO and the positions of the primary vertices, and determines the impact parameters of the particles. Only tracks satisfying minimum requirements on track quality, impact parameter, momentum and transverse momentum are processed further. This reduces the event rate sufficiently for the rest of the track reconstruction to be run on the selected events. This step always runs concurrently with the collisions.

The last level of the trigger is the HLT 2. During Run I it ran concurrently with the collisions, while during Run II a deferred running was implemented, where 20% of the events were stored on disk, to be processed during the time between fills. This made it possible to increase the total number of events processed. During HLT 2 the full event reconstruction is performed, with calibration constants periodically updated during the year†. This means that values reconstructed during data taking are not necessarily the same as when the HLT 2 trigger is run again on offline data, because more accurate calibration constants become available afterwards.

For an overview of the trigger lines used specifically in this analysis, and the selection requirements they implement, see Section 4.2.

Event reconstruction and track types

The event reconstruction is run during the HLT but can also be rerun in full offline, allowing some modifications even after the data taking. These repeated runnings of the reconstruction are called Stripping. For a technical description of how the track reconstruction works, see e.g. Ref. [89]. Here only a high-level description of how the reconstruction proceeds is given.

During the reconstruction, hits in the different subdetectors are combined to form tracks within each subdetector. These partial tracks are then extrapolated to the other subdetectors and if a matching track is found they are combined. Since not all particles leave a track in every subdetector, this leads to different track types, depending on what kind of particle produced them. An overview of the different track types can be seen in Fig. 3.19.

Tracks of particles that leave the detector acceptance after leaving hits in the VELO are called VELO tracks. If the particle makes it to the TT before being swept out of the acceptance by the magnet (due to being close to the edge of the acceptance and/or having a low momentum), the track is called an Upstream track. Tracks from particles that are reconstructed in all three trackers, VELO, TT and IT/OT, are called Long tracks. These are particles with a long enough lifetime to fly multiple meters and which happen to lie within the acceptance of the detector. If a particle is only created outside of the VELO, for example through the decay of a long-lived, neutral particle, a Downstream track is created. Finally, tracks that only have hits in the IT/OT are called T tracks. For this analysis long tracks are most important, since muons created at a PV will be reconstructed in all three trackers, as well as the muon stations, due to their lifetime and relatively small energy loss in the calorimeters.
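The track-type definitions above amount to a lookup on which tracking stations contain hits. The following Python sketch is a simplified illustration under that reading of the text; the function `classify_track` and its boolean arguments are hypothetical, and the actual LHCb pattern recognition is considerably more involved.

```python
def classify_track(has_velo, has_tt, has_t):
    """Map the tracking stations with hits to the track type named in the text.

    Simplified assumption: following the text, a Long track is required to
    have hits in all three trackers (VELO, TT and IT/OT, the latter
    represented here by has_t).
    """
    if has_velo and has_tt and has_t:
        return "Long"
    if has_velo and has_tt:
        return "Upstream"
    if has_velo and not has_t:
        return "VELO"
    if has_tt and has_t and not has_velo:
        return "Downstream"
    if has_t and not has_velo and not has_tt:
        return "T"
    # Combinations not covered by the definitions given in the text.
    return "unclassified"


if __name__ == "__main__":
    for hits in [(True, True, True), (True, True, False), (True, False, False),
                 (False, True, True), (False, False, True)]:
        print(hits, "->", classify_track(*hits))
```

For the muons studied in this analysis the first branch is the relevant one, since they are expected to leave hits in all three trackers.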

†In Run II this was changed to automatically run at the beginning of every fill.

[Figure: track types drawn on the detector layout (VELO, TT, T1, T2, T3): upstream track, long track, T track, downstream track and VELO track.]

Figure 3.19: A schematic illustration of the various track types: long, upstream, downstream, VELO and T tracks [74].

Tracks have different properties assigned to them. One is the impact parameter mentioned before: the closest distance between the track and a PV, where the track is extrapolated backwards using its momentum vector if necessary. In addition to the impact parameter, the χ² related to the impact parameter, $\chi^2_{\mathrm{IP}}$, is often relevant and is also used in this analysis. It is the impact parameter in units of its uncertainty, where the uncertainty takes into account the covariance matrices of both the vertex and the track itself. In order to properly combine the two uncertainties, including any correlations, one step of a Kalman filter [90] with a specific structure designed for vertex fitting [91] is used. The other track property used in this analysis is the χ² probability of the track itself, $\mathrm{Prob}(\chi^2_{\mathrm{track}})$. It is simply the p-value of the χ² of the track fit.

After individual tracks have been reconstructed, sets of tracks with very loose selection criteria on their momentum and impact parameter are combined to form composite particles. These include particles such as $D^0 \to hh$, $J/\psi \to \mu^+\mu^-$ or $Z \to \mu^+\mu^-$, the last of which is the object of study in this thesis.
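The two track quantities used in this analysis can be illustrated with a small numerical sketch. It is deliberately simplified and is not the Kalman-filter procedure of Refs. [90, 91]: the impact-parameter χ² is computed in one dimension with uncorrelated track and vertex uncertainties, and the p-value uses the closed form of the χ² survival function that exists for an even number of degrees of freedom. The function names `chi2_ip_1d` and `chi2_prob_even_ndf` are hypothetical.

```python
import math


def chi2_ip_1d(ip, sigma_track, sigma_vertex):
    """Simplified 1D impact-parameter chi2.

    Assumption: track and vertex uncertainties are uncorrelated, so they
    add in quadrature. The real computation uses full covariance matrices.
    """
    return ip ** 2 / (sigma_track ** 2 + sigma_vertex ** 2)


def chi2_prob_even_ndf(chi2, ndf):
    """p-value of a chi2 value for an even, positive number of degrees of freedom.

    For ndf = 2k the survival function is
    exp(-chi2/2) * sum_{i=0}^{k-1} (chi2/2)^i / i!.
    """
    if ndf <= 0 or ndf % 2 != 0:
        raise ValueError("this sketch only handles even, positive ndf")
    half = chi2 / 2.0
    term = 1.0
    total = 1.0
    for i in range(1, ndf // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


if __name__ == "__main__":
    # Hypothetical numbers in mm, chosen only for illustration.
    print(chi2_ip_1d(ip=0.1, sigma_track=0.03, sigma_vertex=0.04))
    print(chi2_prob_even_ndf(chi2=12.0, ndf=10))
```

A large impact-parameter χ² indicates a track that is inconsistent with originating from the PV, while a very small track-fit p-value indicates a poorly fitted track.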

Trigger signals are associated with both tracks and reconstructed particles in order to determine if the particle was (partially) responsible for this event being triggered. If this is the case it is called TOS (Trigger On Signal), otherwise TIS (Trigger Independent of Signal).

3.3.6 Production of simulated events

In addition to the recorded data, simulated data samples are produced using an event generator and a model of the LHCb detector and its event reconstruction. For this the Pythia 8 Monte Carlo (MC) generator [92, 93] is used. Pythia is a general-purpose MC generator that can simulate a large number of processes. It calculates processes at leading order and also performs hadronization. The output of Pythia is an event table including particle type, momentum and charge for each particle produced in each event. At LHCb a specific LHCb configuration is used [94] and subsequent decays of unstable particles are described by EvtGen [95], in which final-state radiation is generated using Photos [96].

After the event has been generated, the interaction with the detector and the detector response are modelled using the Geant4 toolkit [97] as described in Ref. [98]. This includes generating analog signals just like in the real detector and digitizing these signals. The normal event reconstruction can be run on the output of Geant4.

In the case of this thesis Pythia is used to generate simulated Drell-Yan signal events, which are then available after reconstruction as the signal MC sample further described in Section 4.2.1.

3.4 Luminosity determination at LHCb

In particle physics, just like almost everywhere else, there is a difference between what can be physically observed and what can be predicted. From the theoretical point of view, all processes have a cross-section, which dictates how often that process happens. Experimentally, usually only the rate of this process, i.e. how often the process happened during some data-taking period, can be observed.

The link between these two concepts is the luminosity ($\mathcal{L}$). It quantifies how many collisions happened per second and area. With the luminosity $\mathcal{L}$, the observed rate $\dot{N}$ and the cross-section σ are directly proportional:

\[ \dot{N} = \sigma \mathcal{L}. \tag{3.6} \]

The observed rate can also be defined using more collider-specific variables. In the case of the LHC, a bunched collider, it is given by the average number of times the process happens per bunch-crossing µ and the revolution frequency frev of the bunches.

In addition, on the other side of the equation, no actual experiment is able to reconstruct every single event that occurs; it has some efficiency ε, which can depend on many different variables. This efficiency is neglected for the rest of this section, but plays an important role later in Section 4.6, where the efficiencies in this analysis are presented. Accordingly Eq. (3.6) can be rewritten as

\[ \dot{N} = \mu f_{\mathrm{rev}} = \varepsilon \sigma \mathcal{L}. \tag{3.7} \]

When comparing datasets collected at different experiments, or even at different accelerators, the integrated luminosity is usually given instead of the instantaneous luminosity defined above. It is obtained by integrating Eq. (3.7) with respect to time:

\[ L = \int \mathcal{L} \, \mathrm{d}t = \frac{1}{\sigma} N. \tag{3.8} \]

This has the advantage that it is process independent, in the sense that once the integrated luminosity of a dataset is known, the expected number of events for a specific process can easily be obtained if the cross-section of that process is known. Or, vice versa, the cross-section can be obtained from the number of signal events observed, instead of from a rate of signal events. The unit of the integrated luminosity is that of an inverse area, and it is usually expressed in inverse barn, where 1 barn = 100 fm². Similarly, cross-sections are expressed in barn, so that the total number of events in Eq. (3.8) is unitless.
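As a brief illustration of how Eqs. (3.7) and (3.8) are used in practice, the following Python sketch converts between a cross-section and an expected number of events for a given integrated luminosity, including an efficiency ε. The cross-section of 10 pb and the efficiency of 50% in the usage example are hypothetical values chosen for illustration; only the integrated luminosity of 2.0 fb⁻¹ is taken from this thesis.

```python
PB_INV_PER_FB_INV = 1000.0  # 1 fb^-1 = 1000 pb^-1


def expected_events(cross_section_pb, lumi_fb_inv, efficiency=1.0):
    """N = efficiency * sigma * L, with sigma in pb and L in fb^-1."""
    return efficiency * cross_section_pb * lumi_fb_inv * PB_INV_PER_FB_INV


def cross_section_pb(n_events, lumi_fb_inv, efficiency=1.0):
    """Invert the relation above: sigma = N / (efficiency * L), in pb."""
    return n_events / (efficiency * lumi_fb_inv * PB_INV_PER_FB_INV)


if __name__ == "__main__":
    # Hypothetical process with a 10 pb cross-section and 50% efficiency.
    n = expected_events(cross_section_pb=10.0, lumi_fb_inv=2.0, efficiency=0.5)
    print(f"expected events: {n:.0f}")
    print(f"recovered cross-section: {cross_section_pb(n, 2.0, 0.5):.1f} pb")
```

Keeping the unit conversion in a single named constant avoids the most common mistake in such calculations, mixing fb⁻¹ and pb.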

The luminosity can also be defined from a geometrical point of view. For two bunches which repeatedly cross, the following facts are true. The instantaneous luminosity should be higher if

• each bunch contains more particles,

• the bunches meet more often per second,

• the bunches meet head-on,

• the spatial overlap of the two bunches is larger.

From this heuristic description, the instantaneous luminosity of two counterrotating bunches moving at the speed of light with Ni particles in each bunch is given by:

\[ \mathcal{L} = 2c \cos^2\phi \, f_{\mathrm{rev}} N_1 N_2 \int \rho_1(x, y, z, t) \, \rho_2(x, y, z, t) \, \mathrm{d}x \, \mathrm{d}y \, \mathrm{d}z \, \mathrm{d}t, \tag{3.9} \]

with ρ1, ρ2 the normalized densities of the two beams, frev the revolution frequency of the bunches, and 2c cos²φ the Møller factor for beams moving at the speed of light [99] with φ the half crossing angle between the two beams. At LHCb the half crossing angle is usually small, on the order of 500 µrad, so cos²φ ≈ 1 [100].

This formula can be greatly simplified under the assumption that both bunches have a gaussian distribution in x and y with the widths σx and σy, a uniform distribution in z, and that they collide head-on. In that case

\[ \mathcal{L} = \frac{f_{\mathrm{rev}} N_1 N_2}{4\pi \sigma_x \sigma_y}, \tag{3.10} \]

where the overlap integral is completely determined by the size of the two bunches in the x-y plane.

At a real collider, this simplified equation is usually not enough, additional effects need to be taken into account. One of these effects is the hourglass-effect [101], the fact that the transversal beam width is not constant as a function of z. This happens because the beams are usually focused down towards the collision point, which decreases the beam width at the collision point, but increases the beam width away from the collision point, as can be seen in Eq. (3.10). Overall this leads to a reduction of the luminosity. Assuming the narrowest point being at z = 0, this leads to a beam width of [102]

\[ \sigma_{x,y}(z) = \sigma_{x,y}(0) \sqrt{1 + \frac{z^2}{\beta^{*2}}}. \tag{3.11} \]

Here β∗ is the value of the betatron function at the focussing point. The betatron function is defined by the optics of the magnets that are used to focus the beam. See Fig. 3.20 for a visualization of how the transverse beam size changes as a function of distance from the focussing point for two different values of β∗. At LHCb, in 2012, normal physics fills were operated at β∗ = 3 m, while most luminosity calibration fills were operated at β∗ = 10 m [100]. Over half of the length of the VELO, roughly 0.5 m, a β∗ of 3 m leads to an increase in the transverse beam width of about 1%, leading to a similarly sized reduction in luminosity.
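The size of the hourglass effect quoted above can be checked directly from Eq. (3.11). The following Python sketch evaluates the relative increase of the transverse beam width at a distance z from the focussing point; the function names are illustrative, while the values of β∗ and z are those given in the text.

```python
import math


def beam_width(z_m, width_at_focus, beta_star_m):
    """Transverse beam width at distance z from the focus, Eq. (3.11)."""
    return width_at_focus * math.sqrt(1.0 + (z_m / beta_star_m) ** 2)


def relative_increase(z_m, beta_star_m):
    """Fractional growth of the beam width relative to its value at z = 0."""
    return beam_width(z_m, 1.0, beta_star_m) - 1.0


if __name__ == "__main__":
    for beta_star in (3.0, 10.0):  # physics fills and calibration fills in 2012
        growth = relative_increase(0.5, beta_star)
        print(f"beta* = {beta_star:4.1f} m: width grows by {100 * growth:.2f}% at z = 0.5 m")
```

For β∗ = 3 m this evaluates to roughly 1.4% at z = 0.5 m, of the same order as the figure of about 1% quoted in the text, and the effect is about ten times smaller for β∗ = 10 m.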

[Plot: transverse beam size σx,y(z) in µm versus z in m, for β∗ = 0.05 m and β∗ = 0.50 m.]

Figure 3.20: Transverse beam size as a function of distance from the focussing point for two different values of β∗. Adapted from [102].

3.4.1 Relative luminosity determination

For measurements of e.g. cross-sections, the absolute luminosity is needed in order for Eq. (3.8) to be applicable. However, the conditions change continuously during data taking: the number of particles in each bunch decreases with each collision (see Fig. 3.6), among other effects. Therefore, ideally, one would measure the instantaneous luminosity continuously and obtain the absolute luminosity by integrating (or summing, in the case of discrete time intervals).

As the instantaneous luminosity is not easy to measure continuously, a proxy is used instead. The number of events n from certain reference processes is regularly observed. From this the absolute luminosity for a whole data sample can be obtained:

\[ L = \frac{1}{\sigma_{\mathrm{ref}}} \sum n, \tag{3.12} \]

with the reference cross-section σref to be determined. Different standard processes can be used as a reference process, which usually take the form of counters that indicate whether or not an interaction occurred. Possible reference variables include:

• number of tracks reconstructed,

• number of vertices reconstructed,

• number of muon tracks reconstructed,

• energy deposition in the calorimeters.

Instead of observing the reference variable in every event, the average number of interactions per bunch crossing, µ, is determined from the number of empty events. This works because when two particle beams cross, the number of interactions follows a Poissonian distribution.

Under that assumption the probability to have no interaction during a bunch-crossing is given by:

\[ P_\mu(0) = \frac{\mu^0}{0!} e^{-\mu} = e^{-\mu}. \tag{3.13} \]

Out of a total ntot crossings, the number of crossings without any interaction n0 is then:

\[ n_0 = n_{\mathrm{tot}} \cdot P_\mu(0) = n_{\mathrm{tot}} \cdot e^{-\mu}. \tag{3.14} \]

So the average number of interactions µ can be retrieved from the fraction of empty events:

\[ \mu = -\log\left( \frac{n_0}{n_{\mathrm{tot}}} \right). \tag{3.15} \]
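The zero-counting relation of Eq. (3.15) is simple enough to sketch in a few lines of Python. The function name `mu_from_empty_fraction` and the counts in the usage example are hypothetical; the only physics input is the Poisson assumption stated in the text.

```python
import math


def mu_from_empty_fraction(n_empty, n_total):
    """Average number of interactions per crossing, Eq. (3.15).

    Assumes the number of interactions per crossing is Poisson distributed,
    so that the fraction of empty crossings is exp(-mu).
    """
    if not 0 < n_empty <= n_total:
        raise ValueError("need 0 < n_empty <= n_total")
    return -math.log(n_empty / n_total)


if __name__ == "__main__":
    # Hypothetical counts: 22313 empty crossings out of 100000.
    print(f"mu = {mu_from_empty_fraction(22313, 100000):.3f}")
```

The method breaks down when no empty crossings are observed, which is why the sketch rejects `n_empty = 0` instead of returning an infinite value.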

To determine whether or not an event is empty, the previously mentioned reference variables are used. The definitions of an empty event in LHCb are summarized in Table 3.3. Of these reference variables, two are actually used in the two independent determinations of the reference cross-section described in the next sections. One is the number of tracks in the VELO that point to the interaction region (IR) and the other is the number of vertices reconstructed in the IR.

Table 3.3: Commonly used luminosity counter definitions at LHCb.

Name     Description                              Cut for empty event
Track    Number of VELO tracks pointing to IR     < 2
Muon     Number of Muon tracks                    < 1
SPD      Number of SPD hits                       < 2
Vertex   Number of vertices in IR                 < 1

Background subtraction

When calculating the rate µ as shown in Eq. (3.15), it contains not only events from the primary collisions, but also events from collisions of particles of one beam with the residual gas present even in the ultra-high vacuum of the LHC. In order to account for this, not only beam-beam (bb) collisions are recorded, but also bunch-crossings where one of the two beams has no nominal bunch in it (empty-beam and beam-empty, eb and be), as well as bunch-crossings where both beams contain no actual bunch (empty-empty, ee).

With this the background-corrected rate can be obtained to linear order by subtracting the rates obtained from the bunch-crossings with only one beam populated and adding back the rate from empty-empty bunch-crossings, since that contribution would otherwise be subtracted twice:

\[ \mu = \mu_{bb} - \mu_{be} - \mu_{eb} + \mu_{ee}. \tag{3.16} \]
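The background subtraction of Eq. (3.16) can be sketched as follows, combining it with the zero-counting estimate of µ for each bunch-crossing type. This is a minimal illustration: the function names and all counts in the usage example are hypothetical.

```python
import math


def mu_from_counts(n_empty, n_total):
    """Zero-counting estimate of mu for one crossing type, as in Eq. (3.15)."""
    return -math.log(n_empty / n_total)


def background_corrected_mu(mu_bb, mu_be, mu_eb, mu_ee):
    """Linear-order background subtraction, Eq. (3.16).

    The ee rate is added back because it is contained in both the be and
    the eb rates and would otherwise be subtracted twice.
    """
    return mu_bb - mu_be - mu_eb + mu_ee


if __name__ == "__main__":
    # Hypothetical (empty, total) counts per crossing type.
    counts = {
        "bb": (22313, 100000),
        "be": (99000, 100000),
        "eb": (99200, 100000),
        "ee": (99990, 100000),
    }
    mus = {name: mu_from_counts(*c) for name, c in counts.items()}
    mu = background_corrected_mu(mus["bb"], mus["be"], mus["eb"], mus["ee"])
    print(f"background-corrected mu = {mu:.4f}")
```

In this toy example the correction is at the percent level, dominated by the single-beam contributions.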

3.4.2 The Van-der-Meer method

LHCb is the only LHC experiment which can measure the reference cross-section σref in two complementary ways. One way is the Van-der-Meer scan (VdM), which is employed by all LHC experiments, and the other one is called beam-gas imaging (BGI), which is unique to LHCb.

The Van-der-Meer scan method was developed in 1968 [103] in order to calibrate the absolute luminosity for the Intersecting Storage Rings (ISR), the world’s first hadron collider. It works by continuously measuring the interaction rate, while displacing the two beams with respect to each other. For a discussion of the method in the LHC era, see Ref. [104].

The average number of interactions per bunch crossing is given by the rate of the reference process, $\dot{N}$, and the revolution frequency of the bunches frev. The rate of the reference process is given by its cross-section, σref, and the instantaneous luminosity, $\mathcal{L}$, given by Eq. (3.9):

\[ \mu = \frac{\dot{N}}{f_{\mathrm{rev}}} = \frac{\sigma_{\mathrm{ref}} \mathcal{L}}{f_{\mathrm{rev}}} = \sigma_{\mathrm{ref}} N_1 N_2 \int \rho_1(x, y) \, \rho_2(x, y) \, \mathrm{d}x \, \mathrm{d}y. \tag{3.17} \]

Here a constant density in the z-direction of bunches of both beams has been assumed.

The visible rate of interactions changes with a varying overlap between the two beams. It has a maximum if the beams are perfectly overlapping and is reduced when they are separated. If the beams are separated by some amount ∆x, ∆y, µ changes to

\[ \mu(\Delta x, \Delta y) = \sigma_{\mathrm{ref}} N_1 N_2 \int \rho_1(x, y) \, \rho_2(x + \Delta x, y + \Delta y) \, \mathrm{d}x \, \mathrm{d}y. \tag{3.18} \]

Here beam 1 was arbitrarily chosen as being stationary and only beam 2 being displaced, which can of course always be achieved by a suitable choice of coordinate systems. An illustration of this displacement is shown in Fig. 3.21.

Integrating over the separation ∆x, ∆y yields the reference cross-section σref multiplied by the number of protons in each bunch:

\[ \int \mu(\Delta x, \Delta y) \, \mathrm{d}(\Delta x) \, \mathrm{d}(\Delta y) = \sigma_{\mathrm{ref}} N_1 N_2. \tag{3.19} \]

The integral of the curve of µ versus the displacement is relatively easy to obtain since both beams, as well as the displacement curve, are Gaussian-like. They can be described with a double-gauss model, as can be seen in Fig. 3.22.

Therefore the reference cross-section can be measured by dividing the integral of the number of visible interactions as a function of beam displacement by the product of the numbers of protons in the two bunches, as follows from Eq. (3.19):

\[ \sigma_{\mathrm{ref}} = \frac{\int \mu(\Delta x, \Delta y) \, \mathrm{d}(\Delta x) \, \mathrm{d}(\Delta y)}{N_1 N_2}. \tag{3.20} \]
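The principle behind Eq. (3.19) can be demonstrated with a toy scan. The Python sketch below assumes single-Gaussian beams, for which µ as a function of displacement is itself Gaussian, generates µ(∆x, ∆y) from a known reference cross-section, integrates it numerically over the displacement plane and divides by N1N2. All numerical inputs (bunch populations, widths, scan range) are hypothetical, and a real analysis uses separate x and y scans fitted with double-Gaussian models instead of a full two-dimensional grid.

```python
import math

MB_IN_UM2 = 1e-19  # 1 mb = 1e-27 cm^2 = 1e-19 um^2


def gauss(x, width):
    """Normalised 1D Gaussian centred at zero."""
    return math.exp(-0.5 * (x / width) ** 2) / (width * math.sqrt(2.0 * math.pi))


def mu_displaced(dx, dy, sigma_ref_um2, n1, n2, cap_x, cap_y):
    """mu(dx, dy) of Eq. (3.18) for Gaussian beams.

    cap_x and cap_y are the widths of the overlap curve, i.e. the
    quadrature sums of the two beam widths in x and in y.
    """
    return sigma_ref_um2 * n1 * n2 * gauss(dx, cap_x) * gauss(dy, cap_y)


def integrate_scan(mu_func, half_range, n_steps):
    """Midpoint-rule integral of mu_func(dx, dy) over a square scan range."""
    step = 2.0 * half_range / n_steps
    total = 0.0
    for i in range(n_steps):
        dx = -half_range + (i + 0.5) * step
        for j in range(n_steps):
            dy = -half_range + (j + 0.5) * step
            total += mu_func(dx, dy)
    return total * step * step


# Hypothetical inputs, chosen only for illustration.
N1 = N2 = 8.0e10           # protons per bunch
CAP_X = CAP_Y = 80.0       # widths of the overlap curve in um
TRUE_SIGMA_REF_MB = 60.63  # value used to generate the toy scan


def toy_mu(dx, dy):
    return mu_displaced(dx, dy, TRUE_SIGMA_REF_MB * MB_IN_UM2, N1, N2, CAP_X, CAP_Y)


def measured_sigma_ref_mb():
    """Integral of mu over the displacement plane, divided by N1 * N2."""
    integral = integrate_scan(toy_mu, half_range=500.0, n_steps=200)
    return integral / (N1 * N2) / MB_IN_UM2


if __name__ == "__main__":
    print(f"recovered sigma_ref = {measured_sigma_ref_mb():.2f} mb")
```

The recovered value agrees with the generated one independently of the assumed beam widths, which is the key property of the method: the beam shapes do not need to be known, only the integral of the scan curve and the bunch populations.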

Figure 3.21: Illustration of the Van-der-Meer method, which separates the two beams by some amount ∆y. Each beam contains bunches with Ni particles and has a density profile ρi(x, y).

The number of particles in each bunch, Ni, is determined by the LHC using beam current transformers. Two independent DC current transformers per beam measure the overall current circulating in each beam and two fast beam-current transformers (of which only one per beam was operational in 2012) measure the relative charge of each of the 3564 nominal 25 ns slots [65,105,106].

In Ref. [88] the reference cross-section for the data taken in 2012 was determined to be

\[ \sigma_{\mathrm{ref}}^{\mathrm{VdM}} = 60.63 \pm 0.89 \ \mathrm{mb}. \tag{3.21} \]

[Plot: corrected µsp (upper panels) and fit pulls (lower panels) versus beam displacement ∆x (µm) and ∆y (µm), each from −250 to 250 µm, for BCID 164, pair 1 (LHCb).]

Figure 3.22: Average number of interactions as a function of the relative displacement of the beams in x and y during the VdM-scan in April 2012 which was used to calibrate the luminosity counters for the data used in this thesis [88].

3.4.3 The Beam-Gas-Imaging method

As an alternative to the VdM method, the Beam-Gas-imaging method was first proposed a few years before the start of the LHC [107]. Instead of scanning the beams across each other and thereby indirectly determining the shape of the beams, the high vertex resolution of modern silicon vertex detectors is employed. A small amount of gas is injected into the vacuum vessel of the vertex detector and collisions of the two beams with the gas atoms are reconstructed. The spatial distribution of these vertices directly yields the shape of the beams and therefore their spatial density.

Care has to be taken not to introduce too much gas, so that the loss rate of the beam from collisions with the gas does not become dominant. This is ensured for a partial pressure of the injected gas of less than about 10⁻⁷ mbar [107], with the general LHC vacuum being down to 10⁻¹⁰ mbar. In addition, the type of gas needs to be chosen such that it does not interact with the vacuum vessels or interfere with the getter pumps employed to keep the LHC vacuum stable [107]. Therefore only the use of noble gases is practicable and initially neon was chosen. In the meantime other noble gases (such as helium, argon and xenon) have also been injected, not with the goal of providing a luminosity calibration but in order to use the gas as a target and run LHCb as a fixed-target experiment [108,109].

The LHCb vertex detector, the VELO, allows for vertices to be reconstructed in a region about 3 m long, centred on the nominal collision point. In Fig. 3.23 the distribution of the vertices during a calibration run in Run I can be seen. The individual contributions of the beam-gas collisions can be seen in events where a bunch is only present in one of the two beams, and the ghost charge, protons present outside of the nominal bunches, can be measured from vertices in events where no nominal bunch is present in either beam. The three-dimensional distribution of the vertices can also be reconstructed for the different bunch crossing types, as is shown in Fig. 3.24.

The profiles of the two beams are determined by fitting two-dimensional double-gaussian models to slices in z of bb, eb and be events. The result of these fits in the central z slice can be seen in Fig. 3.25.

With the spatial distribution known, the overlap integral between the two beams can be calculated directly, from which the reference cross-section can then be determined via

\[ \sigma_{\mathrm{ref}}^{\mathrm{BGI}} = \frac{\mu}{N_1 N_2 \Omega}, \tag{3.22} \]

where Ω is the overlap integral of the two beams. The rate of interactions µ is measured in the same way as for the VdM method, however using the Vertex counter instead of the Track counter (see Table 3.3). The BGI reference cross-section was then determined to be [88]

\[ \sigma_{\mathrm{ref}}^{\mathrm{BGI}} = 60.62 \pm 0.87 \ \mathrm{mb}. \tag{3.23} \]

The BGI method can even be performed in parallel to other experiments performing VdM scans. It only requires the overlap integral to stay constant, so it cannot be performed while a VdM scan is being performed at LHCb. This way LHCb is able to provide a measurement of the ghost charge, the charge in nominally empty bunches, by measuring the visible rate in ee bunches [88].

Figure 3.23: Distribution of longitudinal z-position vertices for the various bunch crossing types. Crossing types ee, be and eb contain only beam-gas events while bb crossing types contain beam-beam vertices in the central region and beam-gas vertices away from z = 0. The effect of the trigger prescale reducing pp events is visible. Also the exclusion region of ±300 mm for beam-gas events in bb crossings is visible. [100]

Figure 3.24: Reconstructed beam-gas interaction vertices view in 3 dimensions for fill 2852 and BCID 1989. Only the first 5000 vertices per beam and in the luminous region are shown. [100]
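To make Eq. (3.22) concrete, the following Python sketch evaluates the overlap integral Ω analytically for the simplest possible case and converts a measured µ into a reference cross-section. It assumes centred, head-on beams with single-Gaussian transverse profiles, whereas the actual analysis fits double-Gaussian models in slices of z; the function names and all numerical inputs are hypothetical.

```python
import math

MB_IN_UM2 = 1e-19  # 1 mb = 1e-27 cm^2 = 1e-19 um^2


def overlap_integral(s1x, s1y, s2x, s2y):
    """Overlap integral of two centred 2D Gaussian beams, in 1/um^2.

    For normalised Gaussian densities the integral of rho1 * rho2 over the
    transverse plane is 1 / (2 pi Sx Sy), where Sx and Sy are the
    quadrature sums of the two beam widths.
    """
    cap_x = math.hypot(s1x, s2x)
    cap_y = math.hypot(s1y, s2y)
    return 1.0 / (2.0 * math.pi * cap_x * cap_y)


def sigma_ref_bgi_mb(mu, n1, n2, overlap_per_um2):
    """Reference cross-section from Eq. (3.22), converted to mb."""
    return mu / (n1 * n2 * overlap_per_um2) / MB_IN_UM2


if __name__ == "__main__":
    # Hypothetical beam widths in um and bunch populations.
    omega = overlap_integral(40.0, 45.0, 42.0, 47.0)
    n1 = n2 = 8.0e10
    mu = 1.0  # hypothetical measured average number of interactions
    print(f"overlap integral = {omega:.3e} per um^2")
    print(f"sigma_ref = {sigma_ref_bgi_mb(mu, n1, n2, omega):.2f} mb")
```

For two identical beams the overlap integral reduces to 1/(4πσxσy), which is the geometric factor appearing in Eq. (3.10).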

Figure 3.25: 2-dimensional fit for fill 2853. Left: central z-slice of beam 1. Middle: central z-slice of beam 2. Right: central z-slice of the luminous region. Remaining 19 z-slices are not shown. The fit result of the true beam shape including resolution convolution is shown as a 3-dimensional shape, the data is shown as a contour plot above the result. The pulls of the fit are shown on top [100].

3.4.4 Absolute luminosity calibration at 8 TeV

For the data collected in 2012 at √s = 8 TeV luminosity calibrations using both the VdM and the BGI method are available. In order to combine the two reference cross-sections, the average ratio between the rates from the two counter variables at 8 TeV, µTrack/µVertex = 1.106, is taken into account [88]. The systematic uncertainties are mostly uncorrelated, but where a correlation exists, such as for the measurement of the bunch charges, the choice of the fit model and the background subtraction, it was taken into account. The combined reference cross-section is

\[ \sigma_{\mathrm{ref}} = 60.62 \pm 0.68 \pm 0.19 \ \mathrm{mb}, \tag{3.24} \]

where the first uncertainty is the propagated uncertainty from the two methods and the second uncertainty comes from propagating this result to the actual physics data. The total relative uncertainty on the luminosity calibration for √s = 8 TeV is 1.16% [88].
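The quoted total relative uncertainty can be reproduced from the two uncertainties in Eq. (3.24). The Python sketch below assumes that the two contributions are uncorrelated and can therefore be added in quadrature; this assumption is made here for illustration, and the function name is hypothetical.

```python
import math


def total_relative_uncertainty(value, *uncertainties):
    """Quadrature sum of uncorrelated uncertainties, relative to the value."""
    return math.sqrt(sum(u * u for u in uncertainties)) / value


if __name__ == "__main__":
    rel = total_relative_uncertainty(60.62, 0.68, 0.19)
    print(f"total relative uncertainty: {100 * rel:.2f}%")
```

With the numbers of Eq. (3.24) this evaluates to about 1.16%, in agreement with the value quoted in the text.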

3.4.5 Determining the luminosity of leading bunches

Sometimes this official luminosity calibration is not sufficient. One such case was the 2015 early measurements, which used the first few weeks of data taking after LS1 (up to September 2015), corresponding to an integrated luminosity of about 20 pb⁻¹ [110]. During this running period a sample of minimum-bias events was collected, but instead of just randomly keeping a small fraction of all events, as is normally the case for minimum-bias events, here only events from collisions of leading bunches were considered. Leading bunches refer here to the first bunch in a bunch train, as injected from the SPS.

These bunches have the advantage that there can be no spill-over left from the collision of the bunch right in front of it, making these events very clean. This makes them ideal for cross-section measurements.

However, the standard luminosity calibration averages over all bunches and in addition it only takes a very small fraction of all events, so that a luminosity calibration with the standard luminosity stream would not have sufficient statistics. Therefore, a stand-alone luminosity calibration is needed for events like this. This procedure was first introduced in another internal LHCb analysis note [111], together with the rest of the activities related to the luminosity calibration of the data collected by LHCb in 2015. Specifically, the determination of the rate using the standard luminosity calibration, the cross-calibration between the two different luminosity counters and the vertex resolution of the VELO in MC were done as part of this thesis. The final results of the cross-calibration between the standard luminosity calibration and the offline luminosity calibration are documented in an internal LHCb note [112]. The procedure used there is briefly described here, together with its conclusions.

The luminosity stream is independent from the normal physics stream of events and uses a dedicated fast reconstruction of mainly the VELO data, which is not identical to the information available offline. Therefore, a cross-calibration between the standard online luminosity calibration and any offline calibration is necessary.

During the standard luminosity calibration two variables are mainly used as reference counters, as explained previously. Both variables are related to the VELO. An event is counted as not empty if it has

• Track (lc.3): at least two tracks reconstructed in the VELO, with the point of closest approach to the z-axis within a radius $r = \sqrt{x^2 + y^2} < 4$ mm and $|z| < 300$ mm.

• Vertex (lc.14): at least one reconstructed primary vertex with $r = \sqrt{x^2 + y^2} < 4$ mm and $|z| < 300$ mm.

Both variables count the number of events in which a certain criterion is fulfilled. The former is used for the calibration using VdM scans and the latter for the BGI analysis (see previous sections). The ratio between the two counters is approximately constant and is used to determine a combined luminosity calibration. This ratio is shown in Fig. 3.26 for the fills used for the early measurements.
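The two counter definitions can be expressed as simple predicates on reconstructed positions. The following Python sketch is an illustration of the selection logic described above, not of the actual luminosity-stream code; the function names and the representation of tracks and vertices as (x, y, z) tuples in mm are assumptions made for this example.

```python
import math

R_MAX_MM = 4.0    # maximum radial distance from the z-axis
Z_MAX_MM = 300.0  # maximum |z| of the interaction region


def in_interaction_region(x, y, z):
    """True if a point lies within r < 4 mm and |z| < 300 mm."""
    return math.hypot(x, y) < R_MAX_MM and abs(z) < Z_MAX_MM


def track_counter_fires(track_points):
    """Track counter: at least two VELO tracks pointing to the IR.

    track_points holds, for each track, the point of closest approach
    to the z-axis as an (x, y, z) tuple in mm.
    """
    n_good = sum(1 for (x, y, z) in track_points if in_interaction_region(x, y, z))
    return n_good >= 2


def vertex_counter_fires(vertices):
    """Vertex counter: at least one primary vertex inside the IR."""
    return any(in_interaction_region(x, y, z) for (x, y, z) in vertices)


if __name__ == "__main__":
    # Hypothetical event: two tracks inside the IR, one far downstream.
    tracks = [(0.1, -0.2, 15.0), (0.3, 0.1, -40.0), (1.0, 1.0, 450.0)]
    vertices = [(0.05, -0.02, 12.0)]
    print(track_counter_fires(tracks), vertex_counter_fires(vertices))
```

An event is counted as empty for a given counter when the corresponding function returns `False`, which is the input to the zero-counting determination of µ.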

During the offline reconstruction, the information available is slightly different. The exact objects used in the luminosity stream are no longer available, because they are created in a dedicated fast reconstruction. Instead, reasonable approximations to the counters are used. For the Track counter, two alternatives exist, using two different containers of reconstructed tracks, while the offline variant of the Vertex counter just lacks the edges of the phase-space:

• Track:

– number of VELO-tracks that have been fitted in the HLT1, specifically tracks contained in the FittedHLT1VeloTracks container,

– reconstructed tracks with a VELO-segment (long tracks, VELO tracks and upstream tracks, see Fig. 3.19), contained in the so-called best track container.

• Vertex: number of primary vertices within the fiducial region $r = \sqrt{x^2 + y^2} < 4$ mm and $|z| < 250$ mm, since the full region up to 300 mm is no longer available at this point.

[Plot: ratio µTrack/µVertex (values between about 1.086 and 1.094) versus fill number (3860 to 3980).]

Figure 3.26: Ratio between the two standard luminosity counters Track and Vertex for some fills during the 2015 early measurements period.

These offline counters can be compared to the values from the luminosity counters. This comparison can be performed for the different counters, for each run and for each individual bunch pair. Figure 3.27 shows the µ extracted from both the random luminosity triggers and the offline data sample as a function of time for the specific runs used for the 2015 early measurements. It is visible that they agree very well in general. The offline calibration has a much higher precision, which was one of the reasons to use it in the first place. The differences between the partial reconstruction in the trigger and the full offline reconstruction result in a difference of about 1%.

A calibration ratio for the offline luminosity for each of the three different offline counters is obtained by studying all twelve runs relevant for the early measurements. The combined cross-calibration factor can then be used to obtain an absolute luminosity for the data considered for the 2015 early measurements. This calibration was then used in e.g. Ref. [113] to determine the inelastic pp cross-section, where it contributes a relative uncertainty of 4%, most of which, however, comes from the absolute luminosity calibration. The cross-section for inelastic proton-proton collisions at a centre-of-mass energy of 13 TeV is measured to be

\[ \sigma_{\mathrm{inel}} = 75.4 \pm 3.0 \pm 4.5 \ \mathrm{mb}, \tag{3.25} \]

where the first uncertainty is experimental (and mostly due to the luminosity determination) and the second due to the extrapolation from the acceptance of the LHCb detector to the full phase space [113].

(a) Comparison between the average number of interactions µ as calculated from the luminosity stream (blue) and the offline reconstruction (red) using the different counters as a function of time during run 157808.

(b) Resulting cross-calibration factor R for the different counters in the different runs of the early measurements.

Figure 3.27: Determining the cross-calibration factor for the offline luminosity calibration for leading bunches [112].

CHAPTER 4

MEASUREMENT OF THE DRELL-YAN CROSS-SECTION

Asking a question of nature, and listening for the answer [...].

Carlo Rubbia

In order to determine the (double-) differential Drell-Yan cross-section for a γ∗/Z decaying into two muons, multiple investigation steps are required. In this chapter the actual analysis is described, from the data selection during and after the data taking, to the determination of the signal yields and the corrections needed to determine the cross-section from the signal yields.

During the data taking, candidates for Drell-Yan events need to be reconstructed and saved to disk. For this the hardware and software trigger of LHCb described in Section 3.3.5 is used. The specific trigger lines and their requirements are described in Section 4.2, together with the offline selection.

The stored data set still contains events from multiple background sources. In order to distinguish these from the signal DY events a template fit is performed. The templates are generated from data using events around the Z-peak for the signal and separate selections for the different backgrounds, as described in Section 4.3. The variable in which the templates are generated is the isolation of the two muons of a DY candidate from other charged tracks, described in Section 4.3.1. The fitting process itself is described in Section 4.4, where the bins in mass and rapidity in which the fit is performed are also introduced. The fit is corrected for potential biases using a toy MC study described in Section 4.4.4.


Signal yields in the data sample as a function of mass and rapidity are obtained from the fit. With this the cross-section is determined by applying multiple corrections needed to be able to compare it to QCD calculations. First, bremsstrahlung and final-state radiation (FSR) need to be accounted for, which is done by calculating correction factors in MC, described in Section 4.5. Finally, the efficiencies of the different selection stages need to be taken into account, which is described in Section 4.6.

4.1 Previous measurements

Before delving into the analysis performed in this thesis, some historical context is useful. This thesis is by no means the first measurement of the Drell-Yan cross-section. The first measurement of the DY cross-section was already performed in 1970 [114], the same year as its theoretical description by Drell and Yan [55], using muon pairs with an invariant mass from 1 to 6.7 GeV/c². After that, various fixed-target experiments supplied more data for the low-mass region, most recently up to 16.85 GeV/c² [115–117].

The Drell-Yan cross-section below the Z-mass has also been measured at the LHC at √s = 7 TeV, both by ATLAS [118] for 12 < Mll < 66 GeV/c² and |η| < 2.4 and by CMS [119] for 15 < Mll < 600 GeV/c² and |η| < 2.5, using electron as well as muon pairs. Measurements at √s = 8 TeV are also available from ATLAS [120] and at √s = 8 and 13 TeV from CMS [121,122].

LHCb, which covers the unique rapidity range 2 < η < 4.5, has not yet published a measurement in the mass region below the Z-mass. A preliminary measurement at √s = 7 TeV has been performed in a previous PhD thesis [123,124], in addition to measurements of the Z cross-section using electron, muon and tau pairs [125–127]. At √s = 8 and 13 TeV only measurements of the Z cross-section using muons and electrons have been performed so far [128–130].

A measurement of the Drell-Yan cross-section from LHCb is nevertheless still needed, even though the general-purpose (GP) experiments have already performed one at the same centre-of-mass energy. Because of its unique phase space coverage, LHCb has an unprecedented opportunity to provide insights into hitherto unexplored kinematic regions for Drell-Yan events. The low opening angle from 20 to 300 mrad enables studying events with a very low momentum fraction x. The LHCb coverage for DY events is shown in Fig. 4.1: while one half of the phase space accessible with the LHCb experiment has already been explored by previous experiments, LHCb has the potential to explore down to a Bjorken-x < 10⁻⁴ over a broad range in momentum transfer q². While the experiments H1 and ZEUS at the proton-electron collider HERA could reach similarly low x, they were only able to do this at a lower momentum transfer q², due to the lower centre-of-mass energy of the collider (318 GeV). In addition, the HERA measurements have significant statistical uncertainties. Any measurement in the low-q² region is extremely valuable for the PDF fits and also to probe DGLAP evolution at low q².
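The reach in Bjorken-x quoted above can be illustrated with a short numerical sketch. It assumes the standard leading-order relation x₁,₂ = (M/√s)·exp(±y) between the dilepton mass M, its rapidity y and the two parton momentum fractions; the function name `bjorken_x` is hypothetical and only serves this illustration.

```python
import math


def bjorken_x(mass_gev, rapidity, sqrt_s_gev=8000.0):
    """Leading-order parton momentum fractions (x1, x2).

    Assumes x_{1,2} = (M / sqrt(s)) * exp(+-y), which neglects the
    transverse momentum of the dilepton system.
    """
    tau = mass_gev / sqrt_s_gev
    return tau * math.exp(rapidity), tau * math.exp(-rapidity)


# Lowest mass and highest rapidity covered by this analysis.
x1, x2 = bjorken_x(10.5, 4.5)
print(f"x1 = {x1:.3e}, x2 = {x2:.3e}")
```

Under this assumption, the corner of the phase space at low mass and high rapidity corresponds to one parton carrying a large and the other a very small momentum fraction, the latter lying below 10⁻⁴.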

Studying the Drell-Yan cross-section in the region from 20 GeV/c² to the Z-peak is especially interesting because here the uncertainties due to the PDFs become comparable to the theoretical uncertainties. A measurement of the low-x DY cross-section has therefore the potential to improve our knowledge of the PDFs. Below 20 GeV/c² the theoretical uncertainties dominate, so a measurement here can also help the understanding of the theoretical calculations.

[Figure 4.1 plot: kinematic plane in x (horizontal axis, 10⁻⁶ to 10⁰) and q² (vertical axis); regions shown for ATLAS/CMS, LHCb, CDF/D0, H1/ZEUS (HERA) and fixed-target experiments; lines marked for γ* (10 GeV), Z (91 GeV) and constant rapidity y.]

Figure 4.1: Kinematic coverage in x and q² for different experiments assuming a centre-of-mass energy of √s = 8 TeV. Roughly indicated is also the mass range of this analysis from about 10 GeV to above the Z-peak. The left half of LHCb's coverage has not been explored by any other experiment so far. The two separate regions in rapidity accessible at LHCb stem from the fact that only events where the ratio between the Bjorken-x of the two particles is very large lie in the detector acceptance, cf. Eq. (2.44). Adapted from [123, 131].

The goal of this thesis is therefore the measurement of the single-differential and double-differential Drell-Yan cross-sections

\[ \frac{d\sigma(pp \to Z/\gamma^{*} \to \mu^{+}\mu^{-})}{dM_{\mu\mu}} \quad\text{and}\quad \frac{d^{2}\sigma(pp \to Z/\gamma^{*} \to \mu^{+}\mu^{-})}{dy\, dM_{\mu\mu}}, \tag{4.1} \]

where Mµµ is the invariant mass of the muon-pair and y its rapidity. The measurement is performed in the range 10.5 < Mµµ < 110 GeV/c² and 2 < y < 4.5 at √s = 8 TeV.

4.2 Trigger and data selection

During the data taking only those events are kept which pass multiple trigger levels, as described in Section 3.3.5. In the hardware trigger L0, the line used in this analysis is the L0DiMuon line, which triggers whenever there is a high-energy muon pair in the event. In the first stage of the software trigger, the event needs to pass one of the two HLT1 lines Hlt1DiMuonHighMass or Hlt1DiMuonLowMass, which place loose selection requirements on the (transverse-) momentum of the two muons. There are multiple HLT2 lines dedicated to the Drell-Yan process, all within different invariant mass regions. All three lines require two muons that have a minimum momentum, transverse momentum and track reconstruction quality, using the χ2 probability from the track fit. The exact requirements used are listed in Table 4.1.

In order to be considered a DY candidate, muon pairs need to have passed all selection requirements during data taking. Additional, harsher, cuts are applied offline. On the one hand this is done in order to reaffirm the selection that was used during the data taking, but now with the (better) offline reconstruction. The main difference between the online and offline reconstructions was that all tracks were additionally fitted with a Kalman filter in order to obtain a full covariance matrix, which was too CPU intensive during the data taking [74]. On the other hand, making slightly more restrictive cuts removes differences of the trigger settings during the data taking by choosing the most selective value from the different choices, or even stricter requirements. The offline selection is also included in Table 4.1. Some trigger thresholds also changed during the data taking, which is denoted by footnotes.

From this base selection multiple different samples are generated, which are described in the following.

4.2.1 Signal samples

Data

Signal candidates are required to have triggered (TOS) all three stages of the online trigger, L0, HLT1 and HLT2, as described above. To specifically select signal events and reject background events, the origin vertex of the muon pair is required to be well reconstructed. For this, a cut on the Z vertex quality of χ²vtx/ndf < 5 is used.

Monte-Carlo

The same selection as for the data signal sample is used to produce the MC signal sample, the generation of which is described in Section 3.3.6. In addition, the candidate is truth-matched, which means that the two muons of each candidate in the MC signal sample are true muons that originate from the same γ∗/Z.

Table 4.1: Selection requirements on both muons in the different stages of the trigger and offline selection. Here µ± denotes that the requirement is evaluated for both muons and pT^{1,2} denote the transverse momenta of the two muons with the highest momentum in the event. Values for the trigger stages taken from Ref. [89]. Changes of the cut values during the data taking are marked by footnotes. Where no line is designated, the requirement is used for all lines of that stage, where applicable.

| Stage | Line | Description | Cut |
|---|---|---|---|
| L0 | | High-momentum muon pair | √(pT^1 · pT^2) > 1.6 GeV/c † |
| | | | pT^{1,2} > 80 MeV/c |
| | | GEC | SPD mult. < 900 |
| HLT1 | | Muons well reconstructed | χ²track/ndf(µ±) < 3 |
| | | Muons close to each other | χ²vtx/ndf < 25 |
| | | | Closest approach < 0.2 mm |
| | | Muons identified as muons | isMuon = True |
| | LowMass | Not directly from PV | χ²IP(µ±) > 6 ‡ |
| | HighMass | Muon momentum | pT(µ±) > 0.5 GeV/c |
| | HighMass | | p(µ±) > 3 GeV/c § |
| | HighMass | Di-muon mass | Mµµ > 2.7 GeV/c² |
| HLT2 | | Muons well reconstructed | χ²track/ndf(µ±) < 10 |
| | DY2 | Muon momentum | pT(µ±) > 1 GeV/c |
| | DY2 | Di-muon mass | Mµµ > 5 GeV/c² |
| | DY3 | Di-muon mass | Mµµ > 10 GeV/c² |
| | DY4 | Di-muon mass | Mµµ > 20 GeV/c² |
| offline | | Muons well reconstructed | Prob(χ²track) > 0.001 |
| | | Muon momentum | pT(µ±) > 3 GeV/c |
| | | | p(µ±) > 10 GeV/c |
| | | Muons in acceptance | η(µ±) ∈ (2, 4.5) |
| | | Mass in range | Mµµ ∈ (10, 120) GeV/c² |
| | | Rapidity in acceptance | y ∈ (2, 4.5) |

† For parts of the data taking (<3%), this threshold was set to 1.296 GeV/c.
‡ For parts of the data taking (<3%), the same momentum requirements as for the HighMass line were used.
§ For parts of the data taking (∼30%), this threshold was set to 6 GeV/c.
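The offline rows of Table 4.1 can be read as a simple predicate on a muon pair. The following is a minimal sketch of that reading, not the selection code used in the analysis; the `Muon` container and the function `passes_offline` are hypothetical names, and all momenta are assumed to be given in GeV/c and masses in GeV/c².

```python
from dataclasses import dataclass


@dataclass
class Muon:
    """Hypothetical container for the per-muon quantities used offline."""
    p: float           # momentum in GeV/c
    pt: float          # transverse momentum in GeV/c
    eta: float         # pseudorapidity
    track_prob: float  # chi2 probability of the track fit


def passes_offline(mu_plus, mu_minus, mass, rapidity):
    """Apply the offline requirements of Table 4.1 to one candidate."""
    for mu in (mu_plus, mu_minus):
        if mu.track_prob <= 0.001:
            return False
        if mu.pt <= 3.0 or mu.p <= 10.0:
            return False
        if not 2.0 < mu.eta < 4.5:
            return False
    # Requirements on the muon pair itself.
    return 10.0 < mass < 120.0 and 2.0 < rapidity < 4.5
```

Because the offline thresholds are at least as strict as any trigger setting listed in the table, applying this predicate after the trigger removes the dependence on the threshold changes marked by the footnotes.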

While MC is able to reproduce the features seen in data very well in general, some features are not well described. Specifically, the occupancy in the detector is systematically too low. In order to improve the agreement between data and MC, a reweighting of the MC events is performed, which is used whenever data and MC are compared in the remainder of this thesis (unless noted otherwise). As reweighting variable the number of active cells in the SPD detectors, nSPDHits, is chosen. The result of the reweighting can be seen in Fig. 4.2.

[Figure 4.2 plot: normalized entries as a function of the number of SPD hits (0 to 900) for Data, MC and MC (reweighted).]

Figure 4.2: Number of cells with hits in the SPD in data, MC, and in MC with multiplicity weights applied. All three distributions are restricted to the Z-peak (80 - 110 GeV/c²) and normalized to unit area.
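A multiplicity reweighting of this kind is commonly implemented as a bin-by-bin ratio of the normalized data and MC distributions. The sketch below assumes this approach; the text does not specify the exact procedure, and the function names, the histogram range of 0 to 900 hits and the example bin contents are illustrative assumptions.

```python
def normalise(hist):
    """Scale a list of bin contents to unit area."""
    total = float(sum(hist))
    return [h / total for h in hist]


def multiplicity_weights(data_hist, mc_hist):
    """Per-bin weights that map the MC nSPDHits shape onto the data shape."""
    data_n = normalise(data_hist)
    mc_n = normalise(mc_hist)
    # Bins without MC entries cannot be reweighted and get weight zero.
    return [d / m if m > 0 else 0.0 for d, m in zip(data_n, mc_n)]


def event_weight(n_spd_hits, weights, low=0.0, high=900.0):
    """Look up the weight of one MC event from its SPD multiplicity."""
    if not low <= n_spd_hits < high:
        return 0.0
    index = int((n_spd_hits - low) / (high - low) * len(weights))
    return weights[index]


# Toy histograms with three bins (assumed numbers, for illustration only).
weights = multiplicity_weights([10, 30, 60], [20, 40, 40])
print(weights)
```

Applying `event_weight` to every MC event shifts the simulated occupancy towards higher values, which is the qualitative effect shown in Fig. 4.2.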

4.2.2 Background samples

The signature of a DY event, two oppositely charged muons in the final state, can also be produced by processes other than the DY process, which leads to additional DY candidates. If an event already contains two muons from a DY process, a single additional muon from any other process is sufficient to form one additional DY candidate. Only one additional candidate is formed, not two, because the two muons of a DY candidate need to be of opposite charge. All processes which produce one or more muons are therefore backgrounds for this analysis. In particular, the following physical backgrounds have to be taken into account:

• B-mesons decaying semileptonically, i.e. with one muon in the final state,

• Drell-Yan events decaying to two τ, where one or both τ decay into a muon,

• W-bosons decaying to a muon or to a τ which then decays to a muon,

• wrongly identifying a pion or kaon as a muon,

• random combinations of two true or wrongly identified muons.

These physical backgrounds are described using two different classes of samples. For each class one or more samples enriched in events with the corresponding backgrounds, or depleted of signal events, is created from data.

Heavy-flavour samples

The first class of background samples contains events where the origin vertex, the vertex from which the two muons originate, is either displaced from the primary vertex or badly reconstructed. These background samples cover the first three physical backgrounds mentioned above and are called heavy-flavour, because the primary constituents are muons from B-decays.

B-mesons have a comparatively long lifetime of τB = 1.638 ± 0.004 ps [2], which means that on average they fly a distance of τB · c ≈ 491 µm before decaying. This is measurable using the VELO because of its excellent impact parameter resolution (see Section 3.3.1). Charged B-mesons decay semileptonically in about 11% of all cases via the decay B+ → µ+ νµ X or B− → µ− ν̄µ X [2]. The muon can then be combined with a true or fake muon from any other process to form a DY candidate in the reconstruction. Since the mass of the B-meson is only about 5.3 GeV/c², the muon from this decay has a low momentum. These events are therefore expected to contribute mostly at low masses, because in order to be reconstructed as a high-mass DY event the muon needs to be combined with a very-high-momentum muon, which is statistically less likely than being combined with another low-momentum muon.

Actual Drell-Yan events where the γ∗/Z decays to two taus, instead of two muons, also contribute to the heavy-flavour background samples. If the DY process proceeds via the Z-resonance, a tau pair is produced with almost the same probability as a muon pair. The two taus can subsequently decay into muons via the decays τ+ → µ+ ν̄τ νµ and τ− → µ− ντ ν̄µ, each with a probability of about 15% [2]. If only one tau decays to a muon (and the other directly to an electron), another muon in the event is required in order to form a DY candidate; if both decay to muons, they can form a DY candidate on their own. In addition to DY events, tauonic W+ → τ+ ντ → µ+ ντ ν̄τ νµ decays (and their charge conjugates) also contribute to this class. As in all cases where neutrinos are in the final state, some energy and momentum is carried away unseen by the neutrinos. Therefore the reconstructed invariant mass is lower than the actual invariant mass, which leads to fewer events being expected at high mass.

Three heavy-flavour samples are constructed. Events with a badly reconstructed origin vertex (HF) are selected by requiring the same L0 and HLT1 lines as for signal and the origin vertex to have χ²vtx/ndf > 15. This sample is the baseline heavy-flavour sample. It is by construction statistically independent of the signal sample. Two alternative heavy-flavour samples are created by requiring a displaced vertex. For the HF (IP) sample the final selection requirement is min(IP(µ±)) > 90 µm and for the HF (IPCHI2) sample it is χ²IP/ndf > 10. These two samples are not statistically independent from the signal sample. The HF (IPCHI2) sample is used to determine a systematic uncertainty due to the heavy-flavour template in Section 5.1.2, while the HF (IP) sample is not used after having studied the residual signal content of all three heavy-flavour samples, which can be found in Section 4.4.2.

Same-sign sample

The second background sample used in this analysis is the same-sign sample. As the name suggests, it contains events with at least one pair of muons which have the same charge. By construction it contains events where the two muons do not come from the same Drell-Yan decay. This sample is used to describe the other two backgrounds mentioned, misidentification and random combinations of muons. Random combinations happen because there can randomly be two muons (or rather, two particles identified as muons) coming from a PV which are combined to form a DY candidate. Misidentification as a muon happens most often for pions or kaons, which can either decay in-flight, i.e. somewhere between the PV and the muon system, or even punch through the calorimeters and reach the muon system. The chance for misidentification is higher for particles with lower momentum (but still high enough momentum to reach the muon system). In Fig. 4.3 the misidentification probability of a pion as a muon is shown as a function of momentum. It quickly falls off at higher momenta, therefore this background is also expected to mostly contribute at low masses.

[Figure 4.3 plot: pion misidentification probability (0 to 0.08) as a function of momentum (0 to 45 GeV/c), shown for nominal and maximal background.]

Figure 4.3: Pion misidentification probability as a function of momentum, for b → µX events [79].

Both random combinations and misidentifications happen equally often for same-sign and opposite-sign muon pairs. Same-sign muon pairs have the advantage that the two muons cannot both come from the same DY process, except for a negligible contribution where one muon is assigned the wrong charge. The selection criteria for this sample are the same as for the signal sample, except that the HLT2 lines used require the two muons to have the same charge, instead of the opposite charge. Specifically, in order to be statistically independent of the baseline heavy-flavour sample, and to be as similar as possible to the signal samples, the same requirement on the origin vertex is placed as for the signal samples, χ²vtx/ndf < 5.

4.2.3 Overview over samples

To summarize the different samples, Table 4.2 contains the requirements on the origin vertex used to differentiate them.

Table 4.2: Selection criteria unique to the individual samples.

| Sample | Selection criterion | Selection condition |
|---|---|---|
| Signal | Z vertex quality | χ²vtx/ndf < 5 |
| MC Signal | Z vertex quality | χ²vtx/ndf < 5 |
| Same-sign | Z vertex quality | χ²vtx/ndf < 5 |
| HF | Badly reconstructed Z-vertex | χ²vtx/ndf > 15 |
| HF (IPCHI2) | At least one muon not from PV | min(χ²IP(µ±)) > 10 |
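The vertex-quality rows of Table 4.2 translate into a small decision function, which makes the statistical independence of the signal, same-sign and baseline HF samples explicit. This is only a sketch: the function name is hypothetical, the HF (IPCHI2) sample is left out because it overlaps with the signal sample, and it is assumed here that the baseline HF sample is built from opposite-sign pairs.

```python
def classify_candidate(vtx_chi2_ndf, same_sign):
    """Assign a candidate to one of the mutually exclusive samples.

    Thresholds follow Table 4.2. Candidates with a vertex quality
    between the two thresholds are not used by any of these samples.
    """
    if vtx_chi2_ndf < 5.0:
        return "Same-sign" if same_sign else "Signal"
    if vtx_chi2_ndf > 15.0 and not same_sign:
        # Assumption: the baseline HF sample uses opposite-sign pairs.
        return "HF"
    return None


for chi2, same_sign in [(3.0, False), (3.0, True), (20.0, False), (10.0, False)]:
    print(chi2, same_sign, classify_candidate(chi2, same_sign))
```

Since each candidate receives at most one label, no event can enter more than one of the three samples.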

The number of events left after the selection requirements in the different samples is shown in Table 4.3. The split between events recorded with the magnetic field pointing up- and downwards is about equal.

Table 4.3: Number of events in the different samples, both per magnetic polarity and combined. The HF (IPCHI2) sample is only available combined, since the data samples split by magnetic polarity are only needed for the baseline heavy-flavour sample.

| Sample | MagUp | MagDown | Total |
|---|---|---|---|
| Data | 1.42M | 1.39M | 2.81M |
| MC | 1.34M | 1.40M | 2.74M |
| Same-sign | 755k | 733k | 1.49M |
| HF | 276k | 269k | 418k |
| HF (IPCHI2) | - | - | 746k |

4.3 Fit templates

There are many different ways to distinguish signal from background events in a data sample. One can use distinct features of signal or background events. This can take the form of one or multiple cuts or even a neural network trained on distinguishing the different classes†. Alternatively, the distribution of events in some variable can be used. For this method two approaches are common. One can either make some general assumptions about the distribution of events, for example that the signal peaks with a Gaussian-like distribution and the background is exponentially falling, and then fit the parameters and relative fractions of these distributions to the data. The other possibility, the one used in this thesis, is not to assume any analytical relationship and to take the distribution of signal and background events directly from suitably selected data or simulation samples. In order for this to work, all major contributors of events to the data sample need to be accounted for, without overlap between samples. The distribution of events coming from one sample is called a template. In the following, the generation of the templates from the samples introduced in Section 4.2 is described. The first step for this is introducing the isolation variable, the distribution of which is used as the templates to estimate the signal fraction.

† See Appendix D for a first study in this direction.

4.3.1 The isolation variable

Quite often in particle physics, an invariant mass fit is used to separate signal and background. If an intermediate massive particle is created which subsequently decays, this is visible as a resonance in the invariant mass. While this is the case for events from the process qq̄ → Z → µ+µ−, this is not true if the intermediate state is virtual (i.e. a photon or an off-shell Z-boson). In addition, the goal is to measure the differential cross-section with respect to, and as a function of, the invariant mass, so the mass cannot be used to distinguish signal from background in this case.

Instead, the isolation of the two muons is used to separate signal from background. Drell-Yan events produce very isolated muon pairs, because the muons are produced directly at the PV and no other particles are produced in the decay of the γ∗/Z. Background events usually produce additional particles in the decay (for example the semileptonic B → µX decays). The only other particles in a Drell-Yan event come from the partons not involved in the primary process, which hadronize and form the underlying event.

In order to quantify the isolation of a muon track, all tracks in a cone in φ and η with ∆R = √(∆φ² + ∆η²) < 0.5 around the muon track are considered. The transverse momenta (pT) of all tracks within this cone are summed up. From this the sum of the transverse momenta of all tracks within an inner, smaller, cone with radius ∆R = 0.1 is subtracted. Thus the muon isolation is defined as

\[ \mathrm{isolation}(\mu) = \sum_{\substack{\text{tracks},\\ \Delta R \le 0.5}} p_{T}^{i} \; - \sum_{\substack{\text{tracks},\\ \Delta R \le 0.1}} p_{T}^{i}. \tag{4.2} \]

This second cone contains the muon track itself, by construction, and most of any potentially existing final-state radiation or bremsstrahlung emitted by the muon, which can produce charged tracks via photon conversion. It also contains ghost tracks, tracks which share a large fraction of hits with the muon track. An illustration of this definition is shown in Fig. 4.4. This variable was introduced in Ref. [123] and improved upon the previously used ratio between the momentum of the muon and the total momentum in a cone around it, see Ref. [124], as it is almost independent of the invariant mass of the muon pair [123].

[Figure 4.4 sketch: muon pair with an outer cone of ∆R = 0.5 and an inner cone of ∆R = 0.1.]

Figure 4.4: Sketch of the definition of the minimal isolation of the muon pair. The radius of the inner and outer cone are shown around the µ−, while around the µ+ both a bremsstrahlung photon as well as the tracks of other particles are shown.

In order to distinguish events where one muon is from a signal event and the other muon from a background process, the least isolated muon of the muon pair is taken and so the isolation variable is defined as:

\[ \mathrm{isolation} = \max\big(\mathrm{isolation}(\mu^{+}),\, \mathrm{isolation}(\mu^{-})\big) \tag{4.3} \]

The features of this variable are best studied by taking the logarithm of the isolation (plus one, to avoid log(0)):

\[ \log(\mathrm{isolation} + 1) = \log\big(\max(\mathrm{isolation}(\mu^{+}),\, \mathrm{isolation}(\mu^{-})) + 1\big) \tag{4.4} \]
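The definitions in Eqs. (4.2) to (4.4) can be sketched in a few lines. The representation of a track as a `(pt, eta, phi)` tuple and the function names are assumptions made for this illustration, as are the transverse momentum unit (MeV/c) and the use of the natural logarithm. The azimuthal difference is wrapped into [−π, π), which is a standard step that the text does not spell out.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Distance in the eta-phi plane with the phi difference wrapped."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)


def muon_isolation(muon, tracks, outer=0.5, inner=0.1):
    """Eq. (4.2): pT sum in the outer cone minus pT sum in the inner cone.

    `muon` and every entry of `tracks` are (pt, eta, phi) tuples;
    `tracks` is expected to contain the muon track itself.
    """
    outer_sum = 0.0
    inner_sum = 0.0
    for pt, eta, phi in tracks:
        dr = delta_r(muon[1], muon[2], eta, phi)
        if dr <= outer:
            outer_sum += pt
        if dr <= inner:
            inner_sum += pt
    return outer_sum - inner_sum


def log_pair_isolation(iso_plus, iso_minus):
    """Eqs. (4.3) and (4.4): least isolated muon, then log(isolation + 1)."""
    return math.log(max(iso_plus, iso_minus) + 1.0)
```

A pair in which neither muon has any other track in its outer cone yields `log_pair_isolation(0.0, 0.0) == 0.0`, i.e. it falls into the first bin of the distribution.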

In Fig. 4.5 the distribution of the isolation variable in the different samples is shown, integrated over the full mass- and rapidity range. Clearly visible are two distinct features common to all templates.

The distributions all have a peak at zero, which contains events where both muons are fully isolated because there are no other tracks reconstructed in the outer cone around each muon. A muon pair being fully isolated is more likely for a muon pair coming from a DY event than for muon pairs coming from the different background sources. This can also be seen when comparing the MC sample to the various background samples.

A bulk of events with a broader distribution is separated by a gap from the fully-isolated events. This gap is due to reconstruction thresholds: tracks with very low transverse momentum are either not detected by the LHCb detector at all or do not pass preselection requirements. The former can be due to them being swept out of the detector by the magnetic field before reaching the downstream tracking stations or because they are stopped by the material of the detector. The distributions taper off towards high values due to the limited amount of total energy available in the collision, which is distributed among all particles produced. This leads to a unimodal distribution of the bulk, which at first glance is almost Gaussian-like. However, the tails are asymmetric and more pronounced than they would be for a true Gaussian distribution. This asymmetry is expected since the two tails originate from two different effects. The bulk is present both for DY events and for events in the background samples. For DY events this is due to the underlying event or other collisions happening in the same event. Both of these can produce particles that just happen to be reconstructed within the cone around one of the muons. For the background samples these effects are also relevant, but in addition there are the particles coming from the background process itself (e.g. the rest of the semileptonic B-decay).

[Figure 4.5 plot: normalized entries as a function of log(isolation + 1) (0 to 14), in separate panels for Data, MC Signal, Same-Sign, Heavy-flavour and Heavy-flavour (IPCHI2).]

Figure 4.5: Normalized distributions of the logarithm of the isolation of the different samples used in this analysis. Note that the Data distribution contains both signal and background events and that all distributions are integrated over Mµµ and y. In addition, the bulk mean of each distribution is shown in grey.

A comparison of some characteristic values of the different samples is shown in Table 4.4. The table contains the fraction of the number of events which are fully isolated

\[ f_{\mathrm{iso}} = \frac{x_{0}}{\sum_{i=0}^{n} x_{i}}, \tag{4.5} \]

where x0, x1, ..., xn are the contents of the template, the mean of the bulk

\[ \mu_{\mathrm{bulk}} = \frac{1}{n-1} \sum_{i=1}^{n} x_{i}, \tag{4.6} \]

and its width

\[ \sigma_{\mathrm{bulk}} = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} \left(x_{i} - \mu_{\mathrm{bulk}}\right)^{2}}. \tag{4.7} \]

Note that the standard deviation, and not the sample standard deviation, is used here. The n − 1 in the denominator of the bulk mean and width results from the exclusion of the first bin, which contains the fully isolated events.

Table 4.4: Characteristic values of the different samples, including the isolation fraction, bulk mean and width.

| Sample | fiso | µbulk | σbulk |
|---|---|---|---|
| Data | 0.08 | 8.05 | 1.21 |
| MC Signal | 0.21 | 7.01 | 0.94 |
| Same-Sign | 0.01 | 8.65 | 1.05 |
| Heavy-flavour | 0.02 | 8.48 | 1.10 |
| Heavy-flavour (IPCHI2) | 0.02 | 8.38 | 1.08 |

The distribution shown for the data sample contains events from the full mass-range. Therefore it contains both events from the low mass region, where the signal fraction is low, and from the high mass region close to the Z-resonance, where the signal fraction is very high. This leads to the shown distribution being somewhat of a mixture of the signal MC and the background samples. The characteristic values are averaged over all mass and rapidity bins. Their mass and rapidity dependence is shown in Fig. 4.6. The fact that the heavy-flavour templates approach the signal template towards high mass is explored in more detail in Section 4.4.2, where it turns out that this is due to residual signal in the background samples. The distribution of the bulk as a function of invariant mass of the signal and MC templates is explored in more detail in Section 5.1.1 as a source of systematic uncertainty.

The templates used to represent the different sources of events are described in the following. They are all histograms of the isolation variable, taken from the different samples introduced previously. The templates all have the same binning, 40 uniform bins. Some of them use weights in order to be applicable in the different mass and rapidity bins in which the analysis is performed, which will be introduced in Section 4.4.

[Figure 4.6 plots: isolation fraction, bulk mean and bulk width for the MC Signal, SameSign, HF and HF (IPCHI2) samples, each shown as a function of Mµµ [GeV/c²] and of y.]

(a) Isolation fraction as a function of mass.
(b) Isolation fraction as a function of rapidity in the mass bin 10.5 – 12 GeV/c².
(c) Bulk mean as a function of mass.
(d) Bulk mean as a function of rapidity in the mass bin 10.5 – 12 GeV/c².
(e) Bulk width as a function of mass.
(f) Bulk width as a function of rapidity in the mass bin 10.5 – 12 GeV/c².

Figure 4.6: Changes of the characteristic values of the template with respect to mass and rapidity.

4.3.2 Signal template

From a previous analysis of the total cross-section of the process pp → Z → µ+µ− [123,128] it is known that at the Z-peak the signal purity is very high and the backgrounds are almost negligible. The purity was measured to be ρZ = (99.3 ± 0.2)% [128] in the mass range 60 - 120 GeV/c². The isolation template for the signal is taken from the slightly tighter mass region 80 < Mµµ < 110 GeV/c², surrounding the Z-peak. This range is referred to as "at the Z-peak" throughout the rest of this thesis, unless noted otherwise.

Rapidity reweighting

Ideally, this signal template could be used directly in all bins of mass and rapidity. However, the signal template cannot be used unmodified because the rapidity distribution is different in different mass bins, as shown for MC simulated events in Fig. 4.7. This reflects the underlying kinematics of the process, see Fig. 4.1 where it can be seen that different masses select different regions in rapidity. This mass dependence was already noticed in the previous analysis [123], but not fully taken into account.

[Figure 4.7 plot: mean rapidity ⟨y⟩ (3.00 to 3.25) as a function of Mµµ [GeV/c²] for MC MagUp and MC MagDown.]

Figure 4.7: Mean of the rapidity distribution as a function of mass in MC.

To compensate for this mass dependence, a reweighting of the signal template is performed to ensure that the signal template matches the rapidity distribution in the respective mass bin. The weights are calculated using signal MC. The ratio of the rapidity distribution in the mass bin under study and at the Z-peak in MC is interpolated with a B-spline [132–134] and used as weights. These weights are then applied to the data at the Z-peak, with different weights for the different mass bins.
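The weight calculation can be sketched as follows. Note that this illustration replaces the B-spline interpolation of the analysis with a simple piecewise-linear interpolation between bin centres, in order to stay within the Python standard library; all function names and the toy bin contents are assumptions.

```python
def ratio_weights(target_hist, zpeak_hist):
    """Binned ratio of the normalized rapidity distributions in MC.

    `target_hist` holds the bin contents in the mass bin under study,
    `zpeak_hist` those at the Z-peak, both with identical binning in y.
    """
    t_total = float(sum(target_hist))
    z_total = float(sum(zpeak_hist))
    return [
        (t / t_total) / (z / z_total) if z > 0 else 0.0
        for t, z in zip(target_hist, zpeak_hist)
    ]


def interpolate(y, centres, weights):
    """Piecewise-linear stand-in for the B-spline used in the analysis."""
    if y <= centres[0]:
        return weights[0]
    if y >= centres[-1]:
        return weights[-1]
    for i in range(len(centres) - 1):
        low, high = centres[i], centres[i + 1]
        if low <= y <= high:
            frac = (y - low) / (high - low)
            return weights[i] + frac * (weights[i + 1] - weights[i])


# Toy example with three rapidity bins centred at 2.5, 3.0 and 3.5.
centres = [2.5, 3.0, 3.5]
weights = ratio_weights([1, 1, 2], [2, 1, 1])
print(interpolate(2.75, centres, weights))
```

Each data event at the Z-peak would then enter the signal template of the mass bin under study with the weight `interpolate(y, centres, weights)` evaluated at its own rapidity.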

The procedure is visualized in Fig. 4.8 by showing how the rapidity reweighting affects the template in the mass bin 10.5 - 11 GeV/c². Even though the shape of the rapidity distribution can be quite different in the different mass bins, the overall effect on the templates is not very large.

[Figure 4.8 plots: four panels showing rapidity distributions, the distribution of weights, the weights as a function of y, and the isolation template as a function of log(isolation + 1).]

(a) Rapidity distribution of the MC signal sample at the Z-peak and in the target mass bin, and of the data sample at the Z-peak and with the weights from the MC sample.
(b) Distribution of weights.
(c) Weights as a function of rapidity, both binned (directly taken from the MC), as well as interpolated via B-splines.
(d) The isolation template with and without the reweighting.

Figure 4.8: Rapidity reweighting for the signal template in the mass bin 10.5 - 11 GeV/c².

4.3.3 Background templates

The heavy-flavour background templates are directly taken from data using the two background selections HF and HF (IPCHI2) as described in Table 4.2. No additional reweighting or correction is applied.

Similarly, the SameSign sample is used to obtain a template for the mis-ID and random background. No reweighting is applied; the template is taken as is in each mass and rapidity bin.

4.4 Determining the signal yields

With the templates now defined, the actual determination of the signal yields can proceed. As mentioned at the beginning of this chapter, the differential cross-section with respect to the invariant mass of the two muons is determined as a function of the invariant mass. Likewise, the double-differential cross-section is determined as a function of the invariant mass and the rapidity of the two muons.

The determination of the signal yields is performed in bins. For the single-differential cross-section the bins in the invariant mass are shown in Table 4.5. At low masses, where a lot of statistics is available, the bins have a width of 0.5 GeV/c², whereas at higher masses the bins get larger. The Z-peak is covered with a single bin in order to guarantee that there is enough statistics available, especially in the heavy-flavour sample.

Table 4.5: Bin edges for the differential cross-section determination as a function of invariant mass. Bin edges are the lower bin edges, except for the last bin, where the upper bin edge is also shown.

Mµµ [GeV/c²]   10.5  11.0  11.5  12.0  13.0  14.0  15.0  17.5  20.0  25.0  30.0  40.0  60.0  110.0

For the double-differential cross-section the fit is performed in 2D bins of invariant mass and rapidity. The bins in the invariant mass are wider than in the 1D fit to ameliorate the lower statistics due to the additional dimension. The bin edges used are shown in Table 4.6. In the rapidity the bins are mostly uniform, except for the last bin which is twice as wide since otherwise it would not have enough statistics. The mass bin that contains the Z-peak is not included, since there was not enough statistics available in the HF template to split up the data sample into bins in rapidity.

Table 4.6: 2D bin edges for the double-differential cross-section determination as a function of invariant mass and rapidity. Bin edges are the lower edges, except for the last bin, where the right edge is also shown.

Mµµ [GeV/c²]   10.5  12.0  15.0  20.0  60.0

y              2.0  2.25  2.5  2.75  3.0  3.25  3.5  3.75  4.5
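The assignment of a candidate to one of the 2D bins of Table 4.6 can be sketched as follows. This is only an illustration: the function names are hypothetical, and the convention that a bin includes its lower edge and excludes its upper edge is an assumption not stated in the text.

```python
from bisect import bisect_right

# Bin edges copied from Table 4.6 (mass in GeV/c^2, rapidity dimensionless).
MASS_EDGES = [10.5, 12.0, 15.0, 20.0, 60.0]
RAPIDITY_EDGES = [2.0, 2.25, 2.5, 2.75, 3.0, 3.25, 3.5, 3.75, 4.5]


def find_bin(value, edges):
    """Return the index of the half-open bin [low, high) containing value, or None."""
    if value < edges[0] or value >= edges[-1]:
        return None
    return bisect_right(edges, value) - 1


def find_2d_bin(mass, rapidity):
    """Return (mass_bin, rapidity_bin), or None if the candidate lies outside the grid."""
    i = find_bin(mass, MASS_EDGES)
    j = find_bin(rapidity, RAPIDITY_EDGES)
    if i is None or j is None:
        return None
    return i, j
```

A candidate at the Z-peak, for example, returns None here, reflecting that the mass bin containing the Z-peak is not part of the double-differential measurement.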

The templates described in the previous section are obtained for each mass and rapidity bin. They are used in a template fit implemented in ROOT [135] with the TFractionFitter class. The class implements more than a basic template fit, which would just return the fractions of each template needed to minimize the difference between the sum of the templates and the data. It also takes into account the uncertainties of both the data being fitted and the templates.

This procedure was introduced in Ref. [136], but is briefly described in the following. The fit itself is a standard maximum-likelihood fit using Poisson statistics. The variation of the templates within their uncertainties is taken into account on a bin-by-bin basis, leading to additional fit parameters, one per bin and template. An increase in computation time and instability is avoided by using an analytical minimization with respect to these additional parameters instead of treating them as formal fit parameters. Some special care is taken in the case of bins with zero content, as is the case between the peak at zero isolation and the bulk. Two conditions are required in order to produce meaningful results using this fitting procedure.

1. The total number of events in each template is not too small (so that the Poisson uncertainty of the total number of events can be neglected).

2. The number of events in each bin of each template is much smaller than the total number of events in each template (so that multinomial uncertainties can be replaced with Poisson uncertainties).

Biased fit uncertainties may result if these conditions are not fulfilled [137]. While these assumptions are well satisfied for the low mass bins, this is not necessarily the case for the high mass bins, where the number of events in the background templates becomes small. However, as asserted in Ref. [137], this only leads to an overestimation of the uncertainty, so the uncertainties can still be used as conservative estimates of the true uncertainties.
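As a rough illustration of the basic template fit mentioned above (not of the TFractionFitter algorithm itself, which additionally treats the template uncertainties), the following minimal sketch fits a single signal fraction with a binned Poisson likelihood. The function names, the restriction to two templates and the golden-section minimiser are assumptions made for this example only.

```python
import math


def normalise(hist):
    """Scale a histogram (list of bin contents) to unit area."""
    total = float(sum(hist))
    return [h / total for h in hist]


def neg_log_likelihood(frac, data, sig_shape, bkg_shape):
    """Binned Poisson negative log-likelihood, dropping terms constant in frac."""
    n_total = sum(data)
    nll = 0.0
    for d, s, b in zip(data, sig_shape, bkg_shape):
        mu = n_total * (frac * s + (1.0 - frac) * b)  # expected events in this bin
        if mu <= 0.0:
            if d > 0:
                return math.inf  # observed events where none are expected
            continue
        nll += mu - d * math.log(mu)
    return nll


def fit_signal_fraction(data, signal, background, tol=1e-7):
    """Minimise the likelihood over the signal fraction in [0, 1].

    The likelihood is convex in the fraction, so a golden-section search suffices.
    """
    sig_shape, bkg_shape = normalise(signal), normalise(background)
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        x1 = hi - inv_phi * (hi - lo)
        x2 = lo + inv_phi * (hi - lo)
        f1 = neg_log_likelihood(x1, data, sig_shape, bkg_shape)
        f2 = neg_log_likelihood(x2, data, sig_shape, bkg_shape)
        if f1 < f2:
            hi = x2
        else:
            lo = x1
    return 0.5 * (lo + hi)
```

This sketch treats the templates as exact. Accounting for their finite statistics, as done in the fit used here, adds one nuisance parameter per bin and template.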

The fit itself is carried out iteratively, in three steps which are described in more detail in the following sections. First, a baseline fit is performed, then the residual signal present in the heavy-flavour template is removed and finally the fraction between the two background templates is fixed, removing one degree of freedom and increasing the number of events in the background template. This iterative procedure is needed, since otherwise an instability of the fit is observed at low mass. This instability manifests itself as unphysically large uncertainties of single bins. A visualization of this procedure is shown in Fig. 4.9.

Figure 4.9: Steps to determine the signal yields in each bin in mass and rapidity. The flowchart shows, in order: initial fit; remove residual signal from heavy-flavour template; fix fraction between background templates; perform toy fits to determine bias.

4.4.1 Initial fit

The first fit step uses all three templates, just as described in the previous section. The fraction of the number of events assigned to each template is shown in Fig. 4.10, as a function of invariant mass.

Figure 4.10: Fraction of events associated with each category (Signal, HF, SameSign) after the first fit as a function of invariant mass Mµµ [GeV/c²].

In this figure the overall structure of the signal fraction is visible. At low mass the data sample is dominated mostly by the heavy-flavour background, whereas with increasing mass the signal fraction continues to rise. As expected, in the region of the Z resonance (60–110 GeV/c²) almost all events are actual DY events. The Mis-ID background, modelled by the same-sign sample, contributes only a small fraction and is also only relevant in the low mass range.

The fit instability mentioned before is also clearly visible at low masses. It manifests itself in unrealistically large uncertainties in multiple mass bins. The next two fit steps seek to remove this, without introducing too many additional biases. The next step starts with the output of this step as an input.

4.4.2 Residual signal removal

In both samples used for the heavy-flavour background (HF and HF (IPCHI2)) a single cut is used to define the background-enhanced sample (see Section 4.2.2). Choosing the right value for this cut comes with a trade-off between reducing the amount of signal left and maintaining enough statistics to be used as a template. This is especially problematic in the high mass bins, since the number of events drops off exponentially as a function of mass.

As a consequence of this, the amount of residual signal in the HF background samples needs to be studied. As can be seen in Fig. 4.11, while there is no peak around the Z-mass in the same-sign sample, both heavy-flavour samples show a significant Z-peak. These cannot be true background events. The only background source that could be a candidate for this would be the decay $Z \to \tau^+\tau^- \to \mu^+\mu^-\bar{\nu}_\tau\nu_\tau\bar{\nu}_\mu\nu_\mu$. However, the peak from this process is much broader and at lower mass (at approximately 25–45 GeV/c² [127]) due to the amount of energy carried away by the neutrinos.

Figure 4.11: Normalized distributions of the samples (data, MC signal, same-sign, heavy-flavour and heavy-flavour (IPCHI2)) as a function of the di-muon mass Mµµ [GeV/c²].

To quantify the residual signal, two steps are performed. First, the number of events under the Z-peak in the heavy-flavour samples is determined, relative to the number of events in the same mass region in the data sample. Second, the fraction of signal events surviving the heavy-flavour cuts is studied in MC as a function of the invariant mass in order to extrapolate the result to the full mass range.

For the first step, fits to the Z-peak still visible in the background samples and to the Z-peak in the signal sample are performed. This way the ratio of the amount of residual signal in the background i, $N_i^{\text{sig}}$, to the amount of signal in the signal sample, $N_{\text{signal}}^{\text{sig}}$, is obtained at the Z-peak:

$$R_i^{\text{data}}(M_Z, y) = \frac{N_i^{\text{sig}}(M_Z, y)}{N_{\text{signal}}^{\text{sig}}(M_Z, y)} \tag{4.8}$$

Two different signal models are used to fit the Z-peak in the mass region 60 - 120 GeV/c². For the data sample, a Hypatia [138] distribution is found to describe the peak very well. This distribution was developed to describe a Gaussian-like peak coming from events with different per-event errors. This is the case here, since the uncertainty on the mass of the lepton pair depends on the momenta of the leptons. For the heavy-flavour background samples a simpler non-relativistic Breit-Wigner distribution is used to describe the signal. This makes the fit more stable, as the Breit-Wigner distribution has fewer free parameters than the Hypatia distribution, and it is still able to describe the Z-peak sufficiently well given the lower amount of statistics available in the background samples. The fit stability is especially important when fitting the MC samples in the next step. In order to avoid systematic effects, the same two distributions are also used for the corresponding MC samples when determining the dependence of the residual signal on the invariant mass.

The background under the Z-peak is in both cases modelled with an exponential distribution. Note that this exponential also includes γ∗ signal events, but since this is used for both signal and background samples, this should cancel in the ratio in Eq. (4.8).

In Fig. 4.12 the fits to the data and heavy-flavour samples are shown. From these, the fractions of residual signal events in the heavy-flavour samples, relative to the data sample, can be extracted; they are shown in Table 4.7. These fractions are observed to be mostly stable as a function of rapidity, as is shown in Fig. 4.13. Nevertheless, the residual signal is calculated as a function of rapidity.

Table 4.7: Fraction of signal events $R^{\text{data}}$ in the heavy-flavour samples to signal events in the signal sample, in percent.

y             HF             HF (IP)        HF (IPCHI2)
2 - 4.5       0.34 ± 0.03    1.59 ± 0.05    1.10 ± 0.04
2.0 - 2.25    0.46 ± 0.16    1.92 ± 0.27    1.40 ± 0.23
2.25 - 2.5    0.38 ± 0.07    1.89 ± 0.15    1.17 ± 0.13
2.5 - 2.75    0.36 ± 0.07    1.67 ± 0.12    1.22 ± 0.10
2.75 - 3.0    0.30 ± 0.05    1.37 ± 0.10    1.03 ± 0.08
3.0 - 3.25    0.36 ± 0.05    1.59 ± 0.11    1.12 ± 0.08
3.25 - 3.5    0.23 ± 0.05    1.44 ± 0.11    0.87 ± 0.09
3.5 - 3.75    0.43 ± 0.07    1.74 ± 0.18    1.15 ± 0.14
3.75 - 4.5    0.16 ± 0.15    1.10 ± 0.40    0.89 ± 0.23

Figure 4.12: Fit of the Z-peak in the mass range 60 - 120 GeV/c² for different samples: (a) the data sample, (b) the heavy-flavour sample, (c) the heavy-flavour (IPCHI2) sample. Each panel shows the data, the total fit and its signal and background components, in events per 600 MeV/c². Below each fit is the pull between the fit and the data.

Figure 4.13: Fraction of signal events at the Z-peak in the background samples (HF, HF (IP), HF (IPCHI2)), $R^{\text{data}}$ [%], normalized to the number of signal events at the Z-peak in the data sample, as a function of rapidity.

The second step is determining the mass dependence of the residual signal fraction. This cannot be studied in data, therefore signal MC as introduced previously is used. After truth-matching, the same selection as for data is applied, as described in Section 4.2.1. In addition, a second set of MC samples with the heavy-flavour selections instead of the signal selection is prepared. These MC samples contain signal events which survive the background selections. The ratio between the background selection and the signal selection is calibrated to coincide with the ratio obtained in data in the previous step. From this the amount of residual signal in each background can be determined relative to the number of signal events, which is obtained from the first fit to the data:

$$R(M_{\mu\mu}, y) = R^{\text{MC}}(M_{\mu\mu}, y)\,\frac{R^{\text{data}}(M_Z, y)}{R^{\text{MC}}(M_Z, y)}. \tag{4.9}$$

The fraction of residual signal events for the different background samples is shown as a function of mass in Fig. 4.14. The HF sample consistently has the lowest amount of residual signal, which is why it was chosen as the baseline template. The HF (IP) sample has a much larger residual signal contribution, with a strong mass dependence. Therefore it was replaced by the HF (IPCHI2) sample as the cross-check sample, because both the residual signal fraction and its mass dependence are smaller.

Figure 4.14: Fraction of signal MC events passing the heavy-flavour selections (HF, HF (IP), HF (IPCHI2)), $R^{\text{MC}}$ [%], normalized to the number of events passing the signal selection, as a function of invariant mass.

With the residual signal contribution calculated, the heavy-flavour template can now be corrected. The signal template sig is scaled with the residual signal fraction R and subtracted from the background template hf:

$$\text{hf}_{\text{corr}} = \text{hf} - N_{\text{sig}} \cdot R \cdot \text{sig} \tag{4.10}$$

In Fig. 4.15 the result of this subtraction is shown for two different mass bins. At low masses the effect of the subtraction is small, because most of the events are real background events. However, under the Z-peak almost all events used previously in the background template are actually residual signal and are subtracted, so that only very few events remain. Note that negative bin contents resulting from this subtraction are fixed to zero, with their uncertainty retained. This happens only at large masses, where the number of events in the heavy-flavour template becomes small, and is visible e.g. in some bins in Fig. 4.15b.
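The correction of Eqs. (4.9) and (4.10), including the treatment of negative bins, can be sketched as below. The function names are hypothetical. The sketch also assumes that the signal template is normalized to unit area and that the original bin uncertainties of the heavy-flavour template are kept unchanged for all bins; the text only states this for the bins fixed to zero.

```python
def residual_fraction(r_mc_bin, r_data_z, r_mc_z):
    """Eq. (4.9): scale the MC ratio in a mass bin by the data/MC ratio at the Z-peak."""
    if r_mc_z <= 0.0:
        raise ValueError("MC ratio at the Z-peak must be positive")
    return r_mc_bin * r_data_z / r_mc_z


def correct_hf_template(hf, hf_err, sig_shape, n_sig, residual_frac):
    """Eq. (4.10) bin by bin: hf_corr = hf - n_sig * R * sig.

    sig_shape is assumed to be normalized to unit area. Negative results are
    set to zero and the original uncertainties are returned unchanged.
    """
    corrected = [max(0.0, h - n_sig * residual_frac * s)
                 for h, s in zip(hf, sig_shape)]
    return corrected, list(hf_err)
```

For example, with `hf = [50, 20, 5]`, a signal shape of `[0.1, 0.3, 0.6]`, 1000 signal events and a residual fraction of 1%, the last bin would become negative and is therefore set to zero.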

After removing the residual signal from the heavy-flavour template, the fit is run again using the updated template. The fractions of events assigned to the different templates in this second fit are shown in Fig. 4.16, again as a function of invariant mass. While the bins with large uncertainties in the previous fit step seem to behave normally now (their uncertainty agrees with that of the surrounding bins), there is still one other bin with a noticeably large uncertainty.

Figure 4.15: Residual signal removal in the HF sample in two mass bins, shown as number of entries versus log(isolation). (a) In the mass bin 40 – 60 GeV/c² there is only a small amount of residual signal. (b) In the mass bin 60 – 110 GeV/c² almost no events remain after the residual signal subtraction. Shown are the heavy-flavour template as coming from the background sample, the signal template scaled to the amount of residual signal left, and the difference between the two, the corrected heavy-flavour template.

Figure 4.16: Fit result (fraction of events for Signal, HF and SameSign) as a function of the di-muon mass Mµµ [GeV/c²] after subtracting the residual signal in the heavy-flavour template.

4.4.3 Fixed background fraction

The third and final step of the fit is a measure to increase the general stability of the fit, or rather of the uncertainties reported. After the second step of the fit, one bin in the low mass region still produces unrealistic uncertainties, especially when compared to the neighbouring bins, as can be seen in Fig. 4.16. The probable cause of this is the high degree of similarity between the two background templates, which could already be seen in Fig. 4.5. While the signal tends to produce more isolated events, as shown with the MC distribution there, background events tend to be less isolated. The difference in shape between the two background templates is quite small, compared to the difference to the signal template.

In order to remove the unreasonable uncertainties entirely, the ratio between the two background templates is fixed in each bin to the one obtained in the second fit, shown in Fig. 4.16. The fit is then repeated, yielding as a result signal and overall background fractions. The result of this third fit step is shown in Fig. 4.17. With this step, all bins produce uncertainties of reasonable size. In addition to the event fractions as a function of mass, the fit result as a function of rapidity is also shown in Fig. 4.18.
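One way to picture the fixed background ratio is to merge the two background templates into a single template before repeating the fit. The sketch below assumes unit-area templates and a hypothetical helper name; whether the analysis implements the constraint in exactly this way is not stated in the text.

```python
def combine_backgrounds(hf_shape, ss_shape, hf_fraction, ss_fraction):
    """Merge two unit-area background templates with a fixed relative weight.

    hf_fraction and ss_fraction are the fractions from the preceding fit step;
    only their ratio enters the combined template.
    """
    total = hf_fraction + ss_fraction
    if total <= 0.0:
        raise ValueError("background fractions must sum to a positive value")
    w_hf = hf_fraction / total
    return [w_hf * h + (1.0 - w_hf) * s for h, s in zip(hf_shape, ss_shape)]
```

The repeated fit then has only the signal fraction and one overall background fraction as parameters, which removes the degree of freedom that the two similar background shapes cannot constrain.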

Figure 4.17: Fit result (fraction of events for Signal and Background) as a function of the di-muon mass Mµµ [GeV/c²] after subtracting the residual signal in the heavy-flavour template and fixing the fraction between the two backgrounds to the values obtained in Fig. 4.16.

Figure 4.18: Fit result as a function of di-muon mass and rapidity after subtracting the residual signal in the heavy-flavour template and fixing the fraction between the two backgrounds. The panels show the signal and background fractions as a function of rapidity in the mass bins (a) 10.5 – 12 GeV/c², (b) 12 – 15 GeV/c², (c) 15 – 20 GeV/c² and (d) 20 – 60 GeV/c².

The evolution of the signal yield over the three fit steps is shown in Fig. 4.19. In the direct side-by-side comparison the difference seems small. The largest difference of up to 4% comes from removing the residual signal from the heavy-flavour template, as can be seen when plotted relative to the previous fit step. The residual signal removal clearly shifts events from background to signal, and not just from one background to the other. Fixing the background fraction only affects the signal yield in the low mass bins, where the Mis-ID template is relevant. This shift of less than 1% does constitute a bias, which is corrected together with all other possible biases using the toy studies performed in Section 4.4.4.

Figure 4.19: Comparison of the signal and background yield between the three fit steps (first fit, remove residual signal, fix background fraction), as a function of Mµµ [GeV/c²]. (a) Signal (background) fraction in the different fitting steps using fully saturated (desaturated) colours. (b) Ratio of the signal fraction in each fitting step relative to the signal fraction in the last fitting step in order to see the effect of each step on the signal fraction. See bottom plot for legend for both plots.

4.4.4 Toy studies

Since the fitting procedure is not a simple single fit, but the rather complicated procedure using three fit steps described above, the bias of the fit needs to be evaluated. This is done using toy studies. These consist of generating random data that resembles the real data as closely as possible and subsequently fitting the generated samples using the three fit steps. This way both the generated and fitted amount of signal are available and can be compared using the pull, which is defined as the difference between the fitted and generated amount of signal, normalized by the uncertainty of the fit, as reported by the fitter:

$$\text{pull} = \frac{N_{\text{sig}}^{\text{fit}} - N_{\text{sig}}^{\text{gen}}}{\sigma_{\text{fit}}} \tag{4.11}$$

If the fit is unbiased, the distribution of the pulls of each individual toy fit follows a Gaussian distribution centred at zero, i.e. with mean and median at zero. If in addition the uncertainty is properly calculated, this Gaussian distribution will have a standard deviation of one. If the standard deviation is larger (smaller) than one, the fit uncertainties are too small (big).

In each mass and rapidity bin n = 500 toy fits are performed. This number was chosen such that there is enough statistics available, but the runtime of the toy fits is still not too long. The total number of generated events in each bin is determined by the observed number of events in that bin in data. Before performing the toys in each bin, a fit on real data is performed, in order to Gaussian-constrain both the amount of signal and the fraction between the two backgrounds to values close to those observed in data in each bin.

With the number of total and signal events and the relative fraction between the two background templates, a toy sample can be generated using the templates themselves. Before doing this, the number of events in each template is randomly varied within its bin-by-bin uncertainty using a Gaussian distribution centred at the value of the bin N and with a standard deviation of √N. This way the statistics available for each template, which can vary greatly from the high statistics at low masses to very few events in the heavy-flavour template at high masses, is properly taken into account.
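The generation of a single toy sample can be sketched as follows. The function names are hypothetical, and two details are assumptions of this sketch rather than statements from the text: fluctuated bin contents below zero are clipped to zero, and the toy events are drawn as a fixed total number from the weighted sum of the fluctuated templates.

```python
import math
import random


def fluctuate_template(template, rng):
    """Vary each bin with a Gaussian of mean N and standard deviation sqrt(N)."""
    return [max(0.0, rng.gauss(n, math.sqrt(n))) if n > 0 else 0.0
            for n in template]


def generate_toy(n_events, fractions, templates, rng):
    """Draw n_events from the fraction-weighted sum of fluctuated templates.

    fractions and templates are parallel lists (e.g. signal, HF, SameSign).
    Returns the toy histogram as a list of bin counts.
    """
    n_bins = len(templates[0])
    expected = [0.0] * n_bins
    for frac, template in zip(fractions, templates):
        varied = fluctuate_template(template, rng)
        total = sum(varied)
        if total > 0.0:
            for i in range(n_bins):
                expected[i] += frac * varied[i] / total
    toy = [0] * n_bins
    for i in rng.choices(range(n_bins), weights=expected, k=n_events):
        toy[i] += 1
    return toy
```

Passing an explicit `random.Random(seed)` instance keeps each toy reproducible, which helps when a failed toy fit needs to be inspected.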

In Fig. 4.20 the median of the pulls of all toy fits which succeeded is shown as a function of mass and in Fig. 4.21 as a function of rapidity and mass. The values are also tabulated in Table 4.8 and Table 4.9, respectively. The median was chosen instead of the mean for robustness against outliers. Since the standard deviation of the median is not well-defined, a 68% CI is determined using bootstrap [139,140]. The fit as a function of mass exhibits a bias at low masses, while at high masses it is consistent with no bias. In the toy study as a function of rapidity and mass, this pattern stays the same. The higher the mass, the less overall bias. In total, low mass and low rapidity bins have a larger bias. When determining the cross-sections, the signal yield is corrected for this bias in each bin by rearranging Eq. (4.11) to

$$N_{\text{sig}}^{\text{corr}} = N_{\text{sig}}^{\text{fit}} - \langle \text{pull} \rangle \cdot \sigma_{\text{fit}}, \tag{4.12}$$

where $\langle \text{pull} \rangle$ is the median of the pull in this bin. The correction reaches up to 5.4σ at low masses.
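The pull of Eq. (4.11), a bootstrap interval for its median and the correction of Eq. (4.12) can be sketched as below. The function names are hypothetical, and the percentile method used for the interval is an assumption; the text only states that bootstrap is used.

```python
import random
import statistics


def pull(n_fit, n_gen, sigma_fit):
    """Eq. (4.11): difference between fitted and generated yield in units of sigma_fit."""
    return (n_fit - n_gen) / sigma_fit


def bootstrap_median_interval(values, n_resamples=1000, level=0.68, seed=0):
    """Percentile bootstrap interval for the median of values."""
    rng = random.Random(seed)
    medians = sorted(
        statistics.median(rng.choices(values, k=len(values)))
        for _ in range(n_resamples)
    )
    lo_idx = int((1.0 - level) / 2.0 * (n_resamples - 1))
    hi_idx = int((1.0 + level) / 2.0 * (n_resamples - 1))
    return medians[lo_idx], medians[hi_idx]


def correct_yield(n_fit, sigma_fit, median_pull):
    """Eq. (4.12): remove the median bias from the fitted signal yield."""
    return n_fit - median_pull * sigma_fit
```

A negative median pull means the fit underestimates the yield on average, so the correction increases the fitted yield.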

Figure 4.20: Median of the pull [σfit] of n = 500 toy fits per bin, as a function of mass Mµµ [GeV/c²]. The uncertainties shown as error bars are the standard deviation for the median determined using bootstrap.

Table 4.8: Median of the pull and its standard deviation determined using bootstrap in the toy study performed to evaluate the bias of the fit, as a function of invariant mass.

Mµµ [GeV/c²] Median of pull [σfit] 10.5 - 11.0 2.7 0.1 11.0 - 11.5 −4.7 ± 0.1 11.5 - 12.0 −5.0 ± 0.2 12.0 - 13.0 −5.2 ± 0.2 13.0 - 14.0 −3.8 ± 0.1 14.0 - 15.0 −3.1 ± 0.1 15.0 - 17.5− 0.81 ± 0.09 17.5 - 20.0 0.93 ± 0.08 20.0 - 25.0 0.088 ± 0.043 25.0 - 30.0 −0.24 ± 0.02 30.0 - 40.0− 0.56 ± 0.02 40.0 - 60.0 0.63 ± 0.04 60.0 - 110.0 0.34 ± 0.09 − ±

Figure 4.21: Median of the pull [σfit] of n = 500 toy fits per bin, as a function of rapidity y, for the mass bins 10.5 – 12, 12 – 15, 15 – 20 and 20 – 60 GeV/c². The markers are placed at the bin centres. The uncertainties shown as shaded areas are the standard deviation for the median determined using bootstrap.

Table 4.9: Median of the pull and its standard deviation determined using bootstrap in the toy study performed to evaluate the bias of the fit, as a function of invariant mass and rapidity.

Mµµ [GeV/c²] y 10.5 - 12.0 12.0 - 15.0 15.0 - 20.0 20.0 - 60.0 2.0 - 2.25 2.30 0.12 1.07 0.09 0.45 0.04 0.07 0.04 2.25 - 2.5 −4.60 ± 0.09 −2.84 ± 0.07 −0.91 ± 0.04 −0.096 ± 0.035 2.5 - 2.75 −5.40 ± 0.07 −3.24 ± 0.07 −1.40 ± 0.05 −0.14 ± 0.05 2.75 - 3.0 −3.21 ± 0.06 −3.41 ± 0.11− 1.78 ± 0.13 −0.33 ± 0.04 3.0 - 3.25− 1.17 ± 0.13− 0.91 ± 0.10 1.32 ± 0.15 −0.05 ± 0.05 3.25 - 3.5 3.12 ± 0.15 0.49 ± 0.12 1.28 ± 0.09− 0.13 ± 0.05 3.5 - 3.75 −2.69 ± 0.10 1.58 ± 0.11 0.83 ± 0.07 1.31 ± 0.13 3.75 - 4.5− 0.35 ± 0.15− 0.79 ± 0.14 0.25 ± 0.08 0.10 ± 0.04 ± ± ± ±

4.5 Bin migration

The LHCb detector, just as every other real detector, has only a finite position and momentum resolution, leading also to a finite mass and rapidity resolution. This can lead to the effect that events which are close to the boundary between two bins are reconstructed in a different bin than the bin they would be in if the true value was known. This effect is called bin migration and it is especially prevalent in the low-mass region, where the bins are smaller than at high masses. Figure 4.22 shows the average mass resolution (as reported by the reconstruction) in relation to the bin width, as a function of mass. At low mass the mass resolution is up to 20% of the bin width.

Figure 4.22: Average mass resolution σM in relation to the bin width ∆M, as a function of invariant mass Mµµ [GeV/c²]. The mass resolution is estimated per reconstructed particle by the reconstruction.

In addition to the finite resolution, two physical effects can also change the bin an event is reconstructed in. Charged particles can emit bremsstrahlung while traversing the detector, losing energy which is usually not recovered during the reconstruction†. During the hard process itself, final-state radiation (FSR) can also be emitted, which needs to be considered as well. These two processes, together with the finite mass resolution, can lead to a Drell-Yan candidate being reconstructed in a different mass or rapidity bin.

Therefore the number of signal events in each bin needs to be corrected for the bin migrations. This correction can only be determined using the signal MC sample. The selection requirements on the invariant mass and rapidity are removed in order to also include events which migrate out of or into the considered region in phase space. In MC both the values of the true mass and rapidity as well as the reconstructed mass and rapidity are available. This allows determining where each individual event is reconstructed, compared to where it was generated.

† For electrons, where this effect is very pronounced, such a recovery procedure exists. Photon clusters in the ECAL are matched to reconstructed electrons by extrapolating the VELO part of the electron track. This recovers at least some part of the lost energy, but can also falsely associate photons as bremsstrahlung.

4.5.1 Bin migration as a function of mass

In order to get a feeling for the size of the effect bin migrations can have, the difference between the reconstructed and true mass in MC is shown in Fig. 4.23. The MC sample used here is the same as the one described in Section 4.2.1, but with the requirement on the mass and rapidity removed. Clearly visible is a symmetric distribution around zero, due to the mass resolution, as well as a broader distribution towards lower energies. This broader distribution contains events that have lost energy due to bremsstrahlung or FSR. Note that there are even some events where the difference is larger than 100 GeV/c², so it is possible for events to skip the entire mass range being analysed.

Figure 4.23: Difference between the generated and reconstructed mass in MC 2012 MagUp events, shown as entries (logarithmic scale) versus reconstructed mass minus true mass [GeV/c²], with the contributions from bremsstrahlung / FSR and from the mass resolution indicated.

But in order to correct for the bin migration, more information is needed than just the difference between the true and reconstructed mass. In Fig. 4.24 the fraction of events in each reconstructed mass bin is shown for each bin in the true mass. The graph is normalized such that the entries in each column, i.e. in the true mass, sum to one. Clearly visible is the diagonal, which contains events that are reconstructed in the bin in which they were generated. This diagonal contains most events (> 91.5%) and it corresponds to the peak at zero in Fig. 4.23. Events above the diagonal are events which are reconstructed at a higher mass, which is only possible due to the finite resolution of the detector, while the events below the diagonal are mostly due to bremsstrahlung and FSR.
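How such a column-normalized matrix can be filled from pairs of true and reconstructed masses is sketched below. The function names and the explicit underflow and overflow bins at index 0 and at the last index are choices made for this illustration.

```python
from bisect import bisect_right


def bin_index(value, edges):
    """Bin index with 0 as underflow and len(edges) as overflow."""
    return bisect_right(edges, value)


def confusion_matrix(true_masses, reco_masses, edges):
    """Return matrix[reco_bin][true_bin], normalized so each true-mass column sums to one."""
    n = len(edges) + 1
    counts = [[0] * n for _ in range(n)]
    for m_true, m_reco in zip(true_masses, reco_masses):
        counts[bin_index(m_reco, edges)][bin_index(m_true, edges)] += 1
    column_totals = [sum(counts[i][j] for i in range(n)) for j in range(n)]
    return [
        [counts[i][j] / column_totals[j] if column_totals[j] else 0.0
         for j in range(n)]
        for i in range(n)
    ]
```

In this layout the diagonal entries give the fraction of events reconstructed in the bin in which they were generated, and entries with a lower reconstructed than true bin index correspond to events that lost energy.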

Figure 4.24: Correlation matrix for the bin migration in MC (LHCb simulation), showing the fraction of true events in bins of reconstructed mass [GeV/c²] versus true mass [GeV/c²]. The normalization is such that the entries in each bin of true mass sum to one. The lowest (highest) bin is the underflow (overflow) bin not analysed in data.

With this so-called correlation or confusion matrix, the number of events with a true mass within each bin can be obtained by correcting the number of reconstructed events in each bin by the ratio

$$f^{\text{MIG}} = \frac{N_i^{\text{true}}}{N_i^{\text{rec}}}. \tag{4.13}$$

This was the approach used in the previous analysis [123]. However, this is only a leading-order approximation. In addition, this method can lead to biases because it depends on the MC distribution used [141]. The mathematically more correct approach to the problem of bin migration is called unfolding. In general, the problem that needs to be solved by this unfolding procedure can be written as [142]

$$\tilde{y}_i = \sum_{j=1}^{m} A_{ij}\,\tilde{x}_j + b_i, \qquad 1 \le i \le n, \tag{4.14}$$

where the m bins $\tilde{x}_j$ represent the true distribution, $A_{ij}$ is the confusion matrix describing the migrations from bin j to any of the n bins of the reconstructed distribution, $\tilde{y}_i$ is the expected number of reconstructed events in bin i, and $b_i$ is the number of background events in bin i. The actually observed number of reconstructed events is then $y_i$, which results from statistical fluctuations of the $\tilde{y}_i$. In order to recover the actual true number of events $x_j$ multiple different unfolding techniques can be employed, such as matrix inversion [143], singular value decomposition [144], iterative methods or the use of Bayes' theorem [145–147]. The application used in this analysis, TUnfold [142], uses a least-squares method in order to solve for the $x_j$. In addition, a Tikhonov regularisation [148] is applied in order to be less susceptible to statistical fluctuations in the MC distributions.

The uncertainty due to the limited statistics of the sample used to generate the unfolding is taken into account automatically. A comparison between this method and other unfolding procedures can be found in Ref. [141] and a description of the different methods in Ref. [149].
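For orientation, a Tikhonov-regularised least-squares solution of Eq. (4.14) can be written schematically as the minimisation of the function below. The symbols $V_{yy}$ (covariance matrix of the observed $y$), $\tau$ (regularisation strength), $L$ (regularisation matrix) and $x_0$ (reference distribution) are introduced here for illustration only; the exact terms used by TUnfold are given in Ref. [142].

$$\chi^2(x) = (y - Ax - b)^{T}\, V_{yy}^{-1}\, (y - Ax - b) + \tau^2\, \lVert L\,(x - x_0) \rVert^2$$

For $\tau = 0$ and a square, invertible $A$ the minimum is $x = A^{-1}(y - b)$, which is the plain matrix inversion. A non-zero $\tau$ damps the large fluctuations that this inversion can produce.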

The confusion matrix is determined using the MagUp MC signal sample. The normalization is handled by TUnfold internally. In principle, the unfolding could be applied directly to the number of events in data. The number of background events to be subtracted is (approximately) known from the fit and the confusion matrix can be determined from MC. However, the application used here only works if the input bins to the unfolding are finer than the output. Since we do not want to lose any more statistics, instead correction factors are obtained by applying the unfolding to the MagDown MC signal sample.

The correction factors $f^{\mathrm{MIG}}$ are shown in Fig. 4.25 and the values from the unfolding are tabulated in Table 4.10. Values greater than one mean that a bin loses more events than it gains, whereas a value smaller than one corresponds to a bin gaining more events than it loses. In the bin containing the Z-peak, the correction factor is larger than one. Here many events are lost due to bremsstrahlung. At the same time, not many events migrate into the bin, since there are only few events with masses larger than the Z-mass. The bin directly below the Z-peak receives a lot of events with bremsstrahlung or FSR from the Z-peak, resulting in a value smaller than one. At intermediate masses, bins lose events on average, because of the falling trend of the cross-section as a function of mass. This leads to bins losing more events due to bremsstrahlung than they gain from bremsstrahlung in higher mass bins. Finally, at small masses the bin width becomes small enough to have an effect and events are lost due to bremsstrahlung pushing them out of the mass range considered in this analysis. Therefore the correction factor becomes smaller than one again. The uncertainties shown are the ones reported by the unfolding tool, which are due to the statistics of the MC sample.

[Plot: $f^{\mathrm{MIG}}$ versus $M_{\mu\mu}$ [GeV/c²]]

Figure 4.25: Bin migration correction factors determined using MagUp MC with the unfolding.

Table 4.10: Bin migration correction factors determined using the unfolding, as a function of mass.

Mµµ [GeV/c²]      f MIG
10.5 - 11.0       0.980 ± 0.005
11.0 - 11.5       0.988 ± 0.005
11.5 - 12.0       0.997 ± 0.005
12.0 - 13.0       1.000 ± 0.004
13.0 - 14.0       1.006 ± 0.005
14.0 - 15.0       1.011 ± 0.005
15.0 - 17.5       1.013 ± 0.004
17.5 - 20.0       1.019 ± 0.005
20.0 - 25.0       1.024 ± 0.004
25.0 - 30.0       1.025 ± 0.006
30.0 - 40.0       1.016 ± 0.007
40.0 - 60.0       0.953 ± 0.008
60.0 - 110.0      1.0189 ± 0.0032

In order to check that the unfolding procedure reproduces the true distribution, the reconstructed mass corrected with the unfolding and the generated mass for MC MagDown signal events are compared. The MC MagDown events are statistically independent of the MC MagUp events used to generate the unfolding. The ratio between the two is consistent with unity within 5‰, as is shown in Fig. 4.26.

[Plot: ratio Unfolded / True versus $M_{\mu\mu}$ [GeV/c²]]

Figure 4.26: Cross-check of the unfolding. Ratio of the number of events in each bin corrected with the unfolding, according to the reconstructed mass and to the generated mass.

4.5.2 Bin migration as a function of mass and rapidity

For the double-differential cross-section, the bin migration is needed both as a function of mass and rapidity. Unfortunately, TUnfold does not support multidimensional unfolding. In order to still obtain correction factors for the bin migration as a function of rapidity, the unfolding is first performed in the mass bins used for the double-differential cross-section measurement, see Section 4.4. Then the unfolding is performed as a function of rapidity separately in each of these mass bins. The final correction factor fMIG is then the product of the correction factor from the unfolding in rapidity and in the mass bin.
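The factorised correction described above amounts to a simple product. The sketch below shows this with hypothetical inputs. Adding the relative uncertainties in quadrature assumes that the two unfoldings are uncorrelated, which is an assumption of this illustration and is not stated in the text.

```python
import math

def combine_factors(f_mass, err_mass, f_y, err_y):
    """Product of the mass-bin factor and the rapidity factor.

    Relative uncertainties are added in quadrature, i.e. the two
    unfoldings are treated as uncorrelated (assumption of this sketch).
    """
    f = f_mass * f_y
    rel = math.hypot(err_mass / f_mass, err_y / f_y)
    return f, f * rel

# Hypothetical inputs, not values from the analysis.
f_mig, err = combine_factors(1.010, 0.004, 0.995, 0.008)
print(f_mig, err)
```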

Figure 4.27 and Table 4.11 show the result of this method as a function of rapidity in the different mass bins. The correction factor as a function of rapidity is almost constant, except for the mass bin 20 - 60 GeV/c². However, the correction factor does differ between the mass bins, as expected from the previous section. The uncertainties shown are statistical only and are taken as the systematic uncertainty due to the unfolding.

[Plot: $f^{\mathrm{MIG}}$ versus $y$ for the mass bins 10.5 - 12.0, 12.0 - 15.0, 15.0 - 20.0 and 20.0 - 60.0 GeV/c²]

Figure 4.27: Bin migration correction factor vs rapidity in the different mass bins.

Table 4.11: Bin migration correction factors f MIG determined using two separate unfoldings, as a function of mass and rapidity.

                 Mµµ [GeV/c²]
y                10.5 - 12.0       12.0 - 15.0       15.0 - 20.0       20.0 - 60.0
2.0 - 2.25       0.990 ± 0.022     1.004 ± 0.021     1.009 ± 0.022     0.981 ± 0.030
2.25 - 2.5       0.986 ± 0.011     1.001 ± 0.010     1.011 ± 0.012     0.999 ± 0.014
2.5 - 2.75       0.988 ± 0.009     1.005 ± 0.008     1.015 ± 0.009     1.010 ± 0.010
2.75 - 3.0       0.988 ± 0.008     1.004 ± 0.007     1.015 ± 0.008     1.014 ± 0.008
3.0 - 3.25       0.987 ± 0.008     1.005 ± 0.007     1.016 ± 0.008     1.018 ± 0.008
3.25 - 3.5       0.989 ± 0.007     1.004 ± 0.007     1.016 ± 0.008     1.015 ± 0.008
3.5 - 3.75       0.987 ± 0.008     1.006 ± 0.007     1.016 ± 0.008     1.014 ± 0.008
3.75 - 4.5       0.988 ± 0.007     1.003 ± 0.006     1.013 ± 0.007     1.008 ± 0.008

4.6 Efficiencies

There are multiple selection and reconstruction steps between the actual collisions and the dataset being analysed. The first steps are the triggering, followed by the reconstruction. Afterwards, additional offline selections are applied, as described in Section 4.2. Each of these steps has a certain efficiency to retain signal events, none of which is 100%. Since the goal is to perform an absolute measurement of the Drell-Yan cross-section, these efficiencies need to be accounted for in the master formula used in Chapter 6. To do so properly, efficiencies that depend on the kinematics of the event, and not just on mass and rapidity, are needed per event. This is the case for the trigger (Section 4.6.1), tracking (Section 4.6.2) and muon ID efficiency (Section 4.6.3). With the per-event efficiencies, the average efficiency in each bin in mass and rapidity can be determined. Efficiencies that do not depend on the kinematics can be taken as a global efficiency, such as the efficiency of the Global Event Cuts (Section 4.6.5) and of the vertex χ² cut used to select signal events (Section 4.6.6).

4.6.1 Trigger efficiency

Not all Drell-Yan events pass the selection requirements of the hardware and software trigger (see Section 3.3.5 for a description of the trigger setup during the data taking for this analysis). This can be due to overly strict selection criteria, dead time in the detector, or simply a malfunction of the detector. However, these undetected events also need to be taken into account when determining the cross-section.

In order to determine the efficiency of the whole trigger chain, a sample of untriggered events is needed. In data this is not possible: only a small part of the trigger bandwidth is dedicated to the minimum bias trigger, which selects events randomly, and even for these events it is unknown whether they actually were Drell-Yan events. In other words, there is no unbiased trigger available for low-mass DY data. Therefore the signal MC sample is used to determine the trigger efficiency, because it also contains events that were generated but did not fire the trigger. Since MC events differ from actual data with regard to the event multiplicity, the nSPD reweighting described in Section 4.2.1 is applied.

In the previous analysis it has been shown that the efficiency of the trigger firing is not dependent on the whole event, but on properties of the individual muons [123]. Therefore the trigger efficiency is evaluated using a tag and probe method [150], where one muon from the Drell-Yan event, which is required to have fired all three stages of the single muon trigger, acts as the tag and the response of the other muon is probed. Using this method has the advantage that it can be cross-checked in data at the Z-peak (85 - 95 GeV/c²), where the signal purity is very high. In addition, it allows determining the efficiency separately for muons and anti-muons, although no difference is expected here.

For this determination of the single-muon efficiency, $\varepsilon^{\mathrm{single}}_{\mathrm{trig}}$, the corresponding single-muon trigger first needs to be defined. It is defined analogously to the trigger described in Section 4.2 and is listed in Table 4.12. There are two HLT1 lines available for this, the Hlt1SingleMuonNoIP and Hlt1SingleMuonHighPT lines. The requirements are in general similar to the ones for the di-muon trigger lines presented in Table 4.1. The major difference is that the GEC cut and the cuts on the muon momentum are stricter. For the purposes of this thesis, a muon is said to have triggered the single-muon line if it has triggered the L0SingleMuon trigger and one of the two HLT1 lines.

Table 4.12: Selection requirements for the single muon trigger in the different stages of the trigger. Here µ denotes that the requirement is used for muons of either charge and $p_{\mathrm{T}}^{1}$ denotes the transverse momentum of the muon with the highest momentum in the event. Values for the trigger stages taken from Ref. [89]. Changes of the cut values during the data taking are marked by footnotes. Where no line is designated (-), the requirement is used for all lines of that stage, where applicable.

Stage   Line     Description                 Cut
L0      -        High-momentum muon          $p_{\mathrm{T}}^{1}$ > 1.76 GeV/c †
L0      -        GEC                         SPD mult. < 600
HLT1    -        Muon well reconstructed     χ²_track/ndf < 3
HLT1    -        Muon momentum               p(µ) > 3 GeV/c ‡
HLT1    -        Muon identified as muon     isMuon = True
HLT1    NoIP     Enough hits in VELO         VELO hits > 9
HLT1    NoIP     Muon momentum               pT(µ) > 1.3 GeV/c
HLT1    HighPT   Any hits in VELO            VELO hits > 0
HLT1    HighPT   Muon momentum               pT(µ) > 4.8 GeV/c

† For parts of the data taking (<3%), this threshold was set to 1.48 GeV/c.
‡ For parts of the data taking (∼30%), this threshold was set to 6 GeV/c for the NoIP line and 8 GeV/c for the HighPT line.

The single muon efficiency is the ratio of the number of events for which the full event has triggered the di-muon trigger over the number of events where the probe muon triggered the single muon trigger. As expected, it is the same for muons and anti-muons, as can be seen in Fig. 4.28 as a function of the pT of the probe muon.

As for the other efficiencies, the trigger efficiencies need to be applied to the events in the data sample once they are determined. This is done as per-event efficiencies, so that a total reconstruction efficiency can be assigned to each individual event. For the trigger efficiency this is done separately for the two muons, using the single-muon efficiencies. The single-muon efficiencies are determined in bins of the muon pT for each individual muon and can be seen in Fig. 4.28.

Determining the efficiency in MC has the advantage that not only the single-muon efficiencies can be determined, but also the combined di-muon trigger efficiency. Therefore it can be checked if the product of the single-muon efficiencies is the same as the di-muon efficiency, which is shown in Fig. 4.29. At low muon momenta the two are not exactly the same, but very similar with differences less than 1.2%. At high muon momenta the two efficiencies are almost identical. This difference is added as a systematic uncertainty on the trigger efficiency.

[Plot: $\varepsilon^{\mathrm{single}}_{\mathrm{trig}}$ for µ+ and µ− versus pT(µ) [GeV/c]]

Figure 4.28: Single-muon trigger efficiency as a function of the pT of the probe muon in simulation. The statistical uncertainty due to the size of the MC sample is included, but too small to see.

[Plot: $\varepsilon_{\mathrm{trig}}$ versus pT(µ) [GeV/c], comparing the full DiMuon trigger efficiency with the product of the SingleMuon trigger efficiencies]

Figure 4.29: Product of the single muon trigger efficiencies and the direct di-muon trigger efficiency in MC as a function of the transverse momentum of the probe muon.

Ideally, this should be all that is needed in order to determine the trigger efficiency. However, a discrepancy is observed when comparing the single-muon (and also the di-muon) trigger efficiency in MC and in data, both at the Z-peak. This can be seen in Fig. 4.30. In order to correct for this discrepancy, a correction factor is introduced that rescales the trigger efficiency to the one observed in data. Since for data this is only possible at the Z-peak, the dependence of this correction factor is also studied as a function of kinematic variables of the muon. The correction factor as a function of pT, φ and η of the probe muon is shown in Fig. 4.31. While it is very stable as a function of η and reasonably stable as a function of pT, there is a clear dependency visible as a function of the azimuthal angle φ.

The nature of the discrepancy between data and simulation has been studied in detail, but no root cause was found. In the measurement of the Z cross-section [44] a correction of the energy scale in MC was applied, which showed a similar dependence on φ, however this did not lead to any improvement here. A study whether the effect could be due to a wrong simulation of the crossing-angle was also performed by comparing the correction factor in MagUp and MagDown data and MC. If the crossing-angle was wrongly simulated in MC, the trend should be opposite for MagUp and MagDown, but this is not the case. Another possible cause of this dependency that was not checked is a small misalignment of the actual detector in φ.

Since no clear cause for the observed difference between data and MC was found, the correction factor is included as an additional factor, separately for muons and anti-muons. The correction factors are assumed to be universal and determined to have the values $c^{+}_{\mathrm{trig}} = 1.0418 \pm 0.0021$ and $c^{-}_{\mathrm{trig}} = 1.0438 \pm 0.0021$. The uncertainty, which is due to the statistics of the data and MC sample at the Z-peak (here 85 - 95 GeV/c²), is taken as a source of systematic uncertainty.

The single-muon trigger efficiency is listed in Table 4.13 as a function of pT, separately for µ− and µ+.

[Plot: $\varepsilon_{\mathrm{trig}}$ versus pT(µ) [GeV/c] for MC and data]

Figure 4.30: Single-muon trigger efficiency as a function of pT of the probe muon in MC and data, both at the Z-peak. Due to the large range of the two bins at the edges, the markers are placed at the centre-of-mass of each bin, instead of at the centre.

Table 4.13: Single-muon trigger efficiency from MC as a function of the transverse momentum of the muons, pT. Bins were chosen such that roughly equal statistics are present in each. The uncertainty is due to the statistical uncertainty from the size of the MC sample and the systematic uncertainty from the difference between the full di-muon efficiency and the product of the single muon efficiencies. The latter is fully correlated between the two single-muon efficiencies and is much larger than the statistical uncertainty.

pT [GeV/c]        ε+trig            ε−trig
3.00 - 4.61       0.831 ± 0.005     0.833 ± 0.005
4.61 - 5.56       0.833 ± 0.012     0.833 ± 0.012
5.56 - 6.44       0.834 ± 0.012     0.835 ± 0.012
6.44 - 7.40       0.836 ± 0.008     0.834 ± 0.008
7.40 - 8.62       0.833 ± 0.006     0.833 ± 0.006
8.62 - 10.36      0.830 ± 0.004     0.829 ± 0.004
10.36 - 13.52     0.826 ± 0.002     0.825 ± 0.002
13.52 - 24.01     0.814 ± 0.005     0.814 ± 0.005
24.01 - 41.02     0.764 ± 0.006     0.762 ± 0.006
41.02 - ∞         0.755 ± 0.004     0.754 ± 0.004
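How the binned single-muon efficiencies could be turned into a per-event trigger efficiency is sketched below. The bin edges and efficiencies are those of Table 4.13 and the correction factors are the values quoted in the text. Taking the event efficiency as the product of the two single-muon efficiencies follows the check against the di-muon efficiency described above, while applying the data/MC corrections as one multiplicative factor per muon is an assumption of this sketch. The function names are hypothetical.

```python
from bisect import bisect_right

# Lower bin edges in pT [GeV/c] and single-muon efficiencies from Table 4.13.
PT_EDGES = [3.00, 4.61, 5.56, 6.44, 7.40, 8.62, 10.36, 13.52, 24.01, 41.02]
EFF_PLUS = [0.831, 0.833, 0.834, 0.836, 0.833, 0.830, 0.826, 0.814, 0.764, 0.755]
EFF_MINUS = [0.833, 0.833, 0.835, 0.834, 0.833, 0.829, 0.825, 0.814, 0.762, 0.754]
C_PLUS, C_MINUS = 1.0418, 1.0438  # data/MC correction factors quoted in the text

def single_muon_eff(pt, table):
    """Look up the efficiency of the pT bin containing `pt` (last bin is open-ended)."""
    if pt < PT_EDGES[0]:
        raise ValueError("pT below the first bin edge")
    return table[bisect_right(PT_EDGES, pt) - 1]

def event_trigger_eff(pt_plus, pt_minus, corrected=True):
    """Per-event efficiency as the product of the two single-muon efficiencies.

    With corrected=True each muon is rescaled by its correction factor
    (assumption of this sketch: one multiplicative factor per muon).
    """
    eff = single_muon_eff(pt_plus, EFF_PLUS) * single_muon_eff(pt_minus, EFF_MINUS)
    if corrected:
        eff *= C_PLUS * C_MINUS
    return eff

print(event_trigger_eff(5.0, 12.0, corrected=False))
print(event_trigger_eff(5.0, 12.0))
```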

[Plots: correction factor for µ+ and µ− (a) as a function of the transverse momentum of the muon, pT(µ) [GeV/c]; (b) as a function of the pseudorapidity of the muon, η(µ); (c) as a function of the azimuthal angle of the muon track, φ(µ) [rad]]

Figure 4.31: Dependency of the correction factor to the trigger efficiency, accounting for differences between data and MC, as a function of the kinematics of the probe muon. The uncertainties are from the statistical uncertainties of the data and MC samples. The two bands show the values of the correction factors if no dependency is assumed.

4.6.2 Tracking efficiency

In addition to not all signal events successfully passing all stages of the trigger, events can also be lost because the reconstruction of (one of) the muons fails. This failure in the reconstruction has two origins. Either the track of the muon itself is not correctly reconstructed or the muon is not correctly identified as a muon. The efficiency of the former is accounted for in the tracking efficiency, which is described here, while the latter is corrected for by the muon-ID efficiency, described in the next section.

There are multiple potential reasons why not all charged particles that are created in an event are successfully reconstructed:

• The particle did not generate enough hits in the tracking chambers due to, e.g., dead time of the readout electronics or detection inefficiencies.

• The hits left in the trackers are associated to wrong particle tracks.

• The particle leaves the fiducial volume of the detector due to an interaction with the detector material or due to the magnetic field.

All of these reasons are accounted for in the tracking efficiency, defined as the efficiency to successfully reconstruct a track if a charged particle passes through the active region of the detector.

The tracking efficiency is determined from data using a tag and probe method. This method again exploits the fact that at the Z-peak the purity is > 99% [128]. Almost all di-muon events in the region 60 - 110 GeV/c² passing the χ²_vtx requirement are DY events. As a tag the track of one of the muons (a long track, see Section 3.3.5 for the definitions of the different track types) is chosen and as a probe a partial track is reconstructed by combining hits in the muon stations and the TT. The invariant mass is reconstructed from the probe and tag track, which means that the mass resolution is worse than in the standard reconstruction. Using this method, the efficiency of correctly reconstructing a long track associated with the probe track can be determined, given the tag. The matching between the probe track and the long track is done by requiring that the two tracks share more than 40% of their hits in the muon stations and more than 60% of their hits in the TT, and by requiring that the invariant mass of the long track and the tag track is more than 40 GeV/c² in order to exclude soft tracks from the underlying event [151]. The procedure was validated using signal MC for which truth information exists. The probe was chosen in such a way that the reconstruction efficiency does not depend on it, i.e. the reconstruction of a long track does not use hits in the muon or TT stations [152]. A sketch of the tag and probe tracks can be seen in Fig. 4.32.
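At its core, the tag and probe estimate is a ratio of counts: the number of probe tracks that are matched to a long track divided by the number of all probe tracks. A minimal sketch with hypothetical counts follows. The simple binomial uncertainty is an assumption of this illustration, and the bias and matching corrections mentioned in the text are not included.

```python
import math

def tag_and_probe_efficiency(n_probes, n_matched):
    """Efficiency = matched probes / all probes, with a simple binomial uncertainty."""
    if n_probes <= 0 or not 0 <= n_matched <= n_probes:
        raise ValueError("need n_probes > 0 and 0 <= n_matched <= n_probes")
    eff = n_matched / n_probes
    err = math.sqrt(eff * (1.0 - eff) / n_probes)
    return eff, err

# Hypothetical counts, not taken from the analysis.
eff, err = tag_and_probe_efficiency(n_probes=20000, n_matched=19100)
print(eff, err)
```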

The tracking efficiency has already been determined using events at the Z-peak (60 – 120 GeV/c²) in an internal LHCb note [151]. The efficiency is made available as a function of pseudorapidity and transverse momentum of each muon. It is corrected for biases due to the tag and probe method and for the efficiency of the matching procedure. In Ref. [123] it was also shown that the efficiency does not depend on the invariant mass by determining the tracking efficiency using events at the Υ(1S)-peak at $M_{\Upsilon} \approx 9.5\,\mathrm{GeV}/c^2 \ll M_Z$, instead of at the Z-peak. Because of this it is not important that the tracking efficiency was only determined using tracks with pT > 20 GeV/c. Therefore the values can be taken directly from Ref. [151] and applied as single-event efficiencies per muon, as a function of pseudorapidity. The efficiency as a function of pseudorapidity is shown in Fig. 4.33 and Table 4.14. The uncertainty of the efficiency, which is due to the size of the data sample and uncertainties on the corrections applied to it, is taken as a systematic uncertainty for the determination of the cross-section.

[Sketch: tag track and probe track, with the magnetic plane and the muon stations indicated]

Figure 4.32: Sketch of the tag and probe method to measure the tracking efficiency. A long track is used as the tag and a partially reconstructed track (MuonTT track) as the probe. Both tracks together are required to form a good Z-boson candidate [150].

[Plot: $\varepsilon_{\mathrm{track}}$ versus η(µ), values from LHCb-INT-2014-030 [151]]

Figure 4.33: Tracking efficiency as a function of pseudorapidity of the muons, muons and anti-muons combined. The uncertainties contain both the statistical and the systematic uncertainty [151].

Table 4.14: Tracking efficiency as a function of pseudorapidity of the muons, muons and anti-muons combined. The uncertainties contain both the statistical and the systematic uncertainty [151].

η                 εtrack
2.00 - 2.25       0.915 ± 0.006
2.25 - 2.50       0.951 ± 0.004
2.50 - 2.75       0.963 ± 0.003
2.75 - 3.00       0.955 ± 0.003
3.00 - 3.25       0.953 ± 0.003
3.25 - 3.50       0.971 ± 0.003
3.50 - 3.75       0.972 ± 0.004
3.75 - 4.00       0.969 ± 0.004
4.00 - 4.25       0.949 ± 0.006
4.25 - 4.50       0.924 ± 0.008

4.6.3 Muon ID efficiency

Just as not all muon tracks are reconstructed correctly, the identification as a muon can also fail. This is more likely for low-momentum muons, since they leave hits in fewer muon stations. This means that this efficiency needs to be applied as a function of transverse momentum as well.

In Ref. [151] the muon ID efficiency was determined for muons with a momentum greater than 20 GeV/c in data using Z → µµ events at √s = 7 and 8 TeV, while in Ref. [123] this was extended to lower momenta using MC simulation created for the measurement of the Drell-Yan cross-section at √s = 7 TeV. Both use the tag and probe method, with a fully reconstructed muon as the tag and a long track originating from a muon as the probe.

In this analysis the muon-ID efficiency is taken directly from the 7 TeV analysis [123], because otherwise no information about the muon-ID efficiency for low-momentum tracks would be available. For high-pT muons, i.e. for Z → µ+µ− events, the muon-ID efficiency is also available at 8 TeV in Ref. [151] in the mass range 60 – 120 GeV/c². When comparing the two, as is shown in Fig. 4.34, two differences are visible. First, the efficiency from the Z-peak is only available at high pT and quickly drops off towards lower momentum due to the kinematic constraints. Second, at high transverse momentum, a difference between the efficiencies at 7 TeV and 8 TeV can be seen. The largest observed difference in the high momentum region (33.92 - 6.85 GeV/c) between the average 2011 efficiency and the 2012 efficiency at the Z-peak (0.58%) is taken as systematic uncertainty. This is not ideal: preferably, the procedure presented in Ref. [123] would be repeated using the 2012 MC, but the necessary MC sample was not available at the time. As is done in Ref. [123], the efficiency is applied separately for muons and anti-muons. The values used are shown in Table 4.15.

[Plot: $\varepsilon_{\mathrm{ID}}$ versus pT(µ) [GeV/c], showing µ+ at 7 TeV [123], µ− at 7 TeV [123], µ± at 8 TeV at the Z-peak [151] and the systematic uncertainty band]

Figure 4.34: Muon-ID efficiency as a function of the muon pT in the full mass range for muons and anti-muons [123] and at the Z-peak [151]. The systematic uncertainty due to using the 7 TeV efficiency is indicated in grey in the region where it is determined.

Table 4.15: Muon-ID efficiency as a function of pT.

pT [GeV/c]        ε+ID              ε−ID
3.00 - 8.15       0.967 ± 0.000     0.968 ± 0.000
8.15 - 13.31      0.978 ± 0.001     0.978 ± 0.001
13.31 - 18.46     0.980 ± 0.001     0.982 ± 0.001
18.46 - 23.62     0.980 ± 0.001     0.985 ± 0.001
23.62 - 28.77     0.980 ± 0.001     0.980 ± 0.001
28.77 - 33.92     0.984 ± 0.001     0.985 ± 0.001
33.92 - 39.08     0.988 ± 0.001     0.988 ± 0.001
39.08 - 44.23     0.990 ± 0.000     0.990 ± 0.000
44.23 - 49.38     0.991 ± 0.000     0.991 ± 0.000
49.38 - 54.54     0.990 ± 0.001     0.992 ± 0.001
54.54 - 59.69     0.990 ± 0.001     0.989 ± 0.002
59.69 - 64.85     0.992 ± 0.002     0.992 ± 0.002
64.85 - 70.00     0.988 ± 0.003     0.990 ± 0.003

4.6.4 Combined reconstruction and trigger efficiency

All three previously presented efficiencies take the form of per-event efficiencies. However, in Chapter 6 these efficiencies are only needed per bin in mass and rapidity. Therefore a combined reconstruction and trigger efficiency εreco is computed for each event, which is then averaged over each bin in mass and rapidity. Since in Eq. (6.2) the inverse of the efficiency is needed, and the average of the inverse is not the same as the inverse of the average, the combined reconstruction and trigger efficiency is calculated as

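$$\left\langle \frac{1}{\varepsilon_{\mathrm{reco}}} \right\rangle \equiv \left\langle \frac{1}{\varepsilon_{\mathrm{trig}}\,\varepsilon_{\mathrm{track}}\,\varepsilon_{\mathrm{ID}}} \right\rangle . \tag{4.15}$$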

Since all three efficiencies are binned, the uncertainty can be propagated such that two events which are in the same bin for a specific efficiency have a fully correlated uncertainty for that efficiency. This way the correlation is preserved correctly.
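The reason for averaging the inverse rather than inverting the average can be seen in a short numerical sketch. The per-event efficiencies below are hypothetical, and the uncertainty propagation with correlations described above is not part of this illustration.

```python
def mean_inverse_efficiency(event_effs):
    """<1/eps_reco> over the events of one bin, with eps_reco = eps_trig * eps_track * eps_ID."""
    inverses = [1.0 / (trig * track * mu_id) for trig, track, mu_id in event_effs]
    return sum(inverses) / len(inverses)

# Hypothetical per-event (eps_trig, eps_track, eps_ID) values for one bin.
events = [(0.69, 0.92, 0.95), (0.66, 0.90, 0.96), (0.58, 0.85, 0.94)]

mean_inv = mean_inverse_efficiency(events)
# For comparison: the inverse of the average efficiency, which is NOT what Eq. (4.15) uses.
inv_of_mean = 1.0 / (sum(t * k * i for t, k, i in events) / len(events))
print(mean_inv, inv_of_mean)
```

Whenever the per-event efficiencies differ within a bin, the average of the inverse is larger than the inverse of the average, so using the latter would underestimate the correction.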

In Fig. 4.35 the inverse of the combined efficiency is shown as a function of mass and in Fig. 4.36 as a function of mass and rapidity. The reconstruction efficiency gets smaller for high masses, which is driven by the trigger efficiency, and for low rapidities, which is driven by the tracking efficiency. The reconstruction efficiency does not get smaller for the high rapidity bin, because its lower edge is still in a region with a high tracking efficiency and there are not as many events at the high edge of this bin. The uncertainties shown are the propagated uncertainties from the three efficiencies, which are used as the systematic uncertainty of the reconstruction and trigger efficiency.

[Plot: $\langle \varepsilon_{\mathrm{reco}} \rangle$ versus $M_{\mu\mu}$ [GeV/c²]]

Figure 4.35: Combined reconstruction and trigger efficiency as a function of mass. The uncertainty is the propagated uncertainty from the individual efficiencies, averaged in each bin including correlations.

[Plot: $\langle \varepsilon_{\mathrm{reco}} \rangle$ versus $y$ for the mass bins 10.5 - 12.0, 12.0 - 15.0, 15.0 - 20.0 and 20.0 - 60.0 GeV/c²]

Figure 4.36: Combined reconstruction and trigger efficiency as a function of mass and rapidity. The uncertainty is the propagated uncertainty from the individual efficiencies, averaged in each bin including correlations.

4.6.5 Global event cut efficiency

Events with high particle multiplicities produce many tracks to reconstruct and many hits in the calorimeter system. This can cause the event reconstruction to take too long compared to the short time of 50 ns (in Run I) between collisions. To avoid this, global event cuts (GEC) rejecting high-multiplicity events are enforced in the L0 trigger. For the L0DiMuon trigger there is only one such cut: events with more than 900 hits in the SPD are rejected (see also Table 4.1).

The efficiency of this cut is determined from data, following the second method from Ref. [44]. The distribution of the number of SPD hits in data within about 2 GeV/c² of the Z-peak (specifically, in the range 89 - 93 GeV/c²) is described using the sum of a Gamma distribution and a Gaussian distribution. The Gamma distribution used here is from the RooFit package [153], where it is defined as:

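$$f(x; \alpha, \beta) = \mathcal{N}(\alpha, \beta)\, x^{\alpha - 1} e^{-\beta x}, \qquad \mathcal{N}(\alpha, \beta) = \frac{\beta^{\alpha}}{\Gamma(\alpha)}, \tag{4.16}$$

where Γ is the Gamma function $\Gamma(a) = \int_0^{\infty} t^{a-1} e^{-t}\,\mathrm{d}t$.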

[Plot: events per 9 SPD hits versus the number of SPD hits, showing data, fit, signal and background, with the pulls below]

Figure 4.37: Distribution of the number of hits in the SPD, in 2012 data at the Z-peak. Overlaid is a fit (red) of a sum of a Gamma distribution (sky-blue) and a Gaussian distribution (orange). Below are the resulting pulls.

The fit result to the MagUp dataset can be seen in Fig. 4.37. From the fitted PDF, the efficiency of the GEC at 900 SPD hits can be extracted as the fraction of the total integral of the PDF that lies below the cut value. The value obtained this way is (99.776 ± 0.015)%, where the uncertainty is statistical only.
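Extracting an efficiency as the fraction of a PDF integral below a cut can be sketched in a few lines. The example below uses the Gamma parametrisation of Eq. (4.16) plus a Gaussian, with hypothetical shape parameters that are not the fitted values of this analysis. The numerical trapezoidal integration, the finite upper limit and the function names are choices made for this illustration only.

```python
import math

def gamma_pdf(x, alpha, beta):
    """Gamma density in the parametrisation of Eq. (4.16)."""
    if x <= 0.0:
        return 0.0
    log_f = (alpha * math.log(beta) - math.lgamma(alpha)
             + (alpha - 1.0) * math.log(x) - beta * x)
    return math.exp(log_f)

def gauss_pdf(x, mean, sigma):
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def integrate(func, lo, hi, steps=20000):
    """Composite trapezoidal rule on an equidistant grid."""
    h = (hi - lo) / steps
    total = 0.5 * (func(lo) + func(hi))
    total += sum(func(lo + k * h) for k in range(1, steps))
    return total * h

def gec_efficiency(pdf, cut, upper=3000.0):
    """Fraction of the PDF integral below `cut`; `upper` stands in for infinity."""
    return integrate(pdf, 0.0, cut) / integrate(pdf, 0.0, upper)

# Hypothetical shape parameters, NOT the fitted values of this analysis.
def model(x):
    return 0.9 * gamma_pdf(x, alpha=4.0, beta=0.015) + 0.1 * gauss_pdf(x, 250.0, 60.0)

eff_900 = gec_efficiency(model, 900.0)
eff_600 = gec_efficiency(model, 600.0)
print(eff_900, eff_600)
```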

In order to determine a systematic uncertainty on this value, the dependency of the efficiency on several kinematic variables is studied. This is only possible in MC, since in data there is a significant contribution of background outside of the region of the Z-peak. However, data and MC disagree significantly in the number of SPD hits, as can be seen in Fig. 4.38. What is also visible, however, is that the functional shape is very similar. Therefore, a cut on the MC distribution can be found where the efficiency obtained from MC agrees with the one obtained from data (both at the Z-peak). It is found that a cut at 644 SPD hits in MC corresponds to the GEC in data at 900 SPD hits.
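One way to find such an equivalent cut is to scan the cut value on the MC distribution until the MC efficiency reaches the efficiency measured in data. The sketch below does this on a toy list of SPD multiplicities. The scan over integer cut values, the function names and the toy input are assumptions of this illustration and need not match the procedure used in the analysis.

```python
from bisect import bisect_right

def efficiency(sorted_values, cut):
    """Fraction of events with at most `cut` SPD hits."""
    return bisect_right(sorted_values, cut) / len(sorted_values)

def equivalent_cut(values, target_eff, max_cut=2000):
    """Smallest integer cut whose efficiency on `values` reaches `target_eff`."""
    ordered = sorted(values)
    for cut in range(max_cut + 1):
        if efficiency(ordered, cut) >= target_eff:
            return cut
    raise ValueError("target efficiency not reached below max_cut")

# Toy multiplicities 1..1000, one event each (hypothetical, not MC from the analysis).
toy_nspd = list(range(1, 1001))
cut = equivalent_cut(toy_nspd, target_eff=0.8955)
print(cut)
```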

[Plot: density of events versus the number of SPD hits for data and MC]

Figure 4.38: Comparison of the SPD hits distribution in data and MC, within 2 GeV/c² of the Z-peak.

With this equivalent cut value, the dependency of the efficiency can be studied also outside of the region of the Z-peak using the MC sample. Shown in Fig. 4.39 is the dependency of the GEC efficiency in MC on the invariant mass Mµµ and the transverse momentum and rapidity of the Z, as well as the angular separation of the two muons, ∆φ.

Note that in general, the efficiency in MC is a bit lower than the one obtained from data, which is also shown. This is probably due to the mass dependence visible in the upper left plot, since the MC cut was tuned to correspond to the cut in data only at the Z-peak, where the efficiency is the highest. Nevertheless, the dependence on the four variables is assumed to be close enough to the real dependence. The maximum observed difference for each of the four variables is taken into account as a systematic uncertainty, including their correlation with each other. This results in an overall uncertainty of 0.16%, where the original statistical uncertainty of 0.015% can be neglected. As an additional cross-check, the GEC efficiency is also calculated with a cut in data at 600 SPD hits, to be compared to the GEC efficiency calculated in Ref. [44] for Z → µ+µ− events. There the efficiency for 2012 data was determined to be (93.0 ± 0.3)%. With the method used here, a consistent value of (93.23 ± 0.10)% is determined.

In total, the global event cut efficiency is determined to be εGEC = (99.78 ± 0.16)%. It was checked that the same efficiency is obtained from MagUp and MagDown data.

[Plots: $\varepsilon_{\mathrm{GEC}}$ in data and MC (a) as a function of invariant mass $M_{\mu\mu}$ [GeV/c²]; (b) as a function of transverse momentum pT(Z0) [GeV/c]; (c) as a function of rapidity $y$; (d) as a function of the angular separation of the two muons ∆φ]

Figure 4.39: GEC efficiency as a function of various variables. Also shown is the efficiency evaluated in data at the Z-peak (grey dashed line). The uncertainty in εGEC is statistical only and the uncertainty of the independent variables denotes only the bin width.

4.6.6 Vertex χ² cut efficiency

In order to enrich the data sample with Drell-Yan events, a cut on the χ²_vtx/ndf of the Z-vertex was applied (see Section 4.2), requiring that this value be less than five. Conversely, a χ²_vtx/ndf > 15 was required for the heavy-flavour sample to enrich it with heavy-flavour background events.

However, as can be seen for example in Fig. 4.11, some residual signal is left in the heavy-flavour sample. There it was necessary to subtract this signal component in order to get a correct estimation of the amount of signal and not wrongly assign part of the signal to the background.

Here it is now necessary to correct for the fact that by making this selection cut, some of the signal is lost. Both the data and MC datasets are reproduced with the cut on the vertex χ² removed. From this the efficiency of the cut can be determined, both in data at the Z-peak (from 89 to 93 GeV/c²) and in MC across the full mass range.

In order to ensure that this efficiency is independent of the other efficiencies, all other selection cuts are applied and, in addition, both muons are required to be reconstructed in the fully-instrumented region of LHCb.

By removing the cut on the vertex χ2, many normally rejected background events are added, which dilutes the purity in data. In order to at least reduce this effect, especially from random combinations of two muons, only events with exactly one primary vertex are used for this study. This, naturally, ensures that the two muons come from the same primary vertex (or at least only from a secondary vertex associated with the same primary vertex in the case of a heavy-flavour event).

In data this procedure yields an efficiency of $\varepsilon_{\mathrm{vtx}}^{Z} = (95.95 \pm 0.23)\%$, while in MC a value of $\varepsilon_{\mathrm{vtx}}^{\mathrm{MC}} = (96.00 \pm 0.04)\%$ is obtained, where the uncertainty is statistical only and the MC efficiency is integrated over all mass and rapidity bins. Both values agree within their uncertainties.
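How such a cut efficiency and its statistical uncertainty can be obtained from event counts is illustrated by the following minimal sketch. It assumes unweighted events and a simple binomial uncertainty, which the text does not specify; the function name and the event counts are hypothetical placeholders, not numbers from this analysis.

```python
import math

def cut_efficiency(n_pass, n_total):
    """Return (efficiency, binomial uncertainty) for n_pass out of n_total events."""
    if n_total <= 0 or not 0 <= n_pass <= n_total:
        raise ValueError("need n_total > 0 and 0 <= n_pass <= n_total")
    eff = n_pass / n_total
    # Assumption: unweighted events, so the pass fraction is binomially distributed.
    err = math.sqrt(eff * (1.0 - eff) / n_total)
    return eff, err

# Hypothetical counts with and without the vertex chi2 cut (illustration only).
eff, err = cut_efficiency(n_pass=9600, n_total=10000)
print(f"eff_vtx = ({100 * eff:.2f} +- {100 * err:.2f})%")
```

With these placeholder counts the sketch gives (96.00 ± 0.20)%. For efficiencies very close to 0 or 1 a different interval estimate would be more appropriate than this simple formula.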

In addition to the statistical uncertainty, there are systematic effects to be taken into account. In MC both the full mass dependence and the rapidity dependence can be studied. In data studying the mass dependence is not possible, because the contribution from background is negligible only at the Z-peak. However, the rapidity dependence (at the Z-peak) can be studied in data.

The mass dependence, as observed in MC, as well as the baseline value for εvtx obtained in data at the Z-peak, are shown in Fig. 4.40. In Fig. 4.41 the rapidity dependence is shown, both in MC as well as in data (both at the Z-peak, in order to be able to compare them). The dependence seems to be reproduced in MC reasonably well. While there is a strong mass dependence, the rapidity dependence is much less pronounced. Both the mass and rapidity dependence of the efficiency are taken into account by assigning each bin an individual efficiency. The efficiency in each mass bin is shown in Table 4.16 and the efficiency in each mass and rapidity bin in Table 4.17 and Fig. 4.42. The uncertainties are due to the statistics of the MC sample and are taken as the systematic uncertainty due to the vertex χ² efficiency.

Table 4.16: Vertex χ2 efficiency determined in MC in the different mass bins. The uncertainty is statistical only.

Mµµ [GeV/c²]      εvtx [%]
10.5 – 11.0       94.41 ± 0.05
11.0 – 11.5       94.41 ± 0.05
11.5 – 12.0       94.55 ± 0.06
12.0 – 13.0       94.44 ± 0.04
13.0 – 14.0       94.47 ± 0.05
14.0 – 15.0       94.68 ± 0.05
15.0 – 17.5       94.69 ± 0.04
17.5 – 20.0       94.69 ± 0.05
20.0 – 25.0       94.83 ± 0.05
25.0 – 30.0       95.09 ± 0.07
30.0 – 40.0       95.19 ± 0.07
40.0 – 60.0       95.60 ± 0.09
60.0 – 110.0      95.87 ± 0.03

Table 4.17: Vertex χ² efficiency εvtx in % determined in MC in the different mass and rapidity bins. The uncertainty is statistical only.

y             Mµµ [GeV/c²]:   10.5 - 12.0     12.0 - 15.0     15.0 - 20.0     20.0 - 60.0
2.0 - 2.25                    93.35 ± 0.13    93.55 ± 0.11    94.02 ± 0.12    93.98 ± 0.13
2.25 - 2.5                    93.38 ± 0.23    93.26 ± 0.21    92.73 ± 0.25    94.21 ± 0.23
2.5 - 2.75                    93.71 ± 0.10    93.83 ± 0.09    93.80 ± 0.10    93.98 ± 0.10
2.75 - 3.0                    93.82 ± 0.09    93.94 ± 0.08    93.84 ± 0.09    94.21 ± 0.08
3.0 - 3.25                    93.78 ± 0.08    93.91 ± 0.07    93.98 ± 0.08    94.61 ± 0.08
3.25 - 3.5                    94.02 ± 0.08    94.44 ± 0.07    94.16 ± 0.08    94.82 ± 0.08
3.75 - 4.5                    94.39 ± 0.07    94.27 ± 0.06    94.49 ± 0.07    95.24 ± 0.07

The origin of the mass dependence was also studied further, since it is quite pronounced. A priori it could be due to one of (at least) three reasons:

1. It is actually a rapidity dependence that is also visible in the mass due to different rapidity distributions in the different mass bins.

2. It is a kinematic effect, where muons with different transverse momenta are not equally well reconstructed, which leads to a different vertex χ2 and therefore also to a different efficiency of the cut.

3. It is an effect of the event multiplicity, where the reconstruction becomes on the one hand easier with more tracks (it is easier to identify where the PV was) and on the other hand harder (more hits in the trackers make it easier to misreconstruct a track, e.g. by using a hit from a different particle). In order for this to be able to produce a mass dependence, the multiplicities need to be different in the different mass bins, which they are, as can be seen in Fig. 4.43.

[Plot: εvtx versus Mµµ [GeV/c²]. Legend: Data, at Z-peak; MC, at Z-peak; MC.]

Figure 4.40: Vertex χ² efficiency in the different mass bins, obtained from MC. The uncertainty is statistical only. Also shown are the efficiencies at the Z-peak both in data and MC.

[Plot: εvtx versus rapidity y. Legend: Data, at Z-peak, integrated; MC, at Z-peak, integrated; MC, at Z-peak; Data, at Z-peak.]

Figure 4.41: Vertex χ² efficiency in the different rapidity bins at the Z-peak (89 - 93 GeV/c²) in data and MC. The uncertainties are statistical only. Also shown are the efficiencies integrated over the rapidity.

[Plot: εvtx versus rapidity y for the mass bins 10.5 - 12.0, 12.0 - 15.0, 15.0 - 20.0 and 20.0 - 60.0 GeV/c².]

Figure 4.42: Vertex χ² efficiency as a function of rapidity in the different coarse mass bins in MC. The uncertainties denoted by the shaded areas are statistical only and the markers are drawn at the centres of the rapidity bins.

[Plot: average number of SPD hits ⟨nSPD⟩ versus Mµµ [GeV/c²].]

Figure 4.43: Average number of SPD hits in MC as a function of mass.

The first possibility is easily excluded by restricting the data to a central rapidity bin (3 < y < 3.5) and observing that the mass dependence is still visible. If this had been the origin of the mass dependence, a rapidity dependence would also have to be taken into account for the efficiency.

In order to study the other two possibilities the following procedure is performed.

1. Calculate the efficiency as a function of the variable under study (either a variable related to the transverse momenta of the two muons or the multiplicity in the SPD).

2. Assign a per-event efficiency from this dependence (if any exists).

3. For each mass bin calculate the average efficiency and compare it to the mass dependence.
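The three steps above can be sketched as follows. This is a minimal illustration in plain Python, not the analysis code: the function names, the binning and the toy events are hypothetical, and the variable under study stands for either the SPD multiplicity or a muon transverse momentum.

```python
from bisect import bisect_right
from collections import defaultdict

def binned_efficiency(values, passed, edges):
    """Step 1: efficiency of the cut in bins of the variable under study."""
    n_all = [0] * (len(edges) - 1)
    n_pass = [0] * (len(edges) - 1)
    for v, ok in zip(values, passed):
        i = bisect_right(edges, v) - 1
        if 0 <= i < len(n_all):
            n_all[i] += 1
            n_pass[i] += int(ok)
    # Empty bins carry no information and are marked with None.
    return [p / a if a else None for p, a in zip(n_pass, n_all)]

def project_to_mass(values, masses, eff_by_bin, edges, mass_edges):
    """Steps 2 and 3: per-event efficiency, averaged in each mass bin."""
    sums, counts = defaultdict(float), defaultdict(int)
    for v, m in zip(values, masses):
        i = bisect_right(edges, v) - 1
        j = bisect_right(mass_edges, m) - 1
        in_range = 0 <= i < len(eff_by_bin) and 0 <= j < len(mass_edges) - 1
        if not in_range or eff_by_bin[i] is None:
            continue
        sums[j] += eff_by_bin[i]   # step 2: efficiency assigned to this event
        counts[j] += 1
    return {j: sums[j] / counts[j] for j in counts}   # step 3

# Hypothetical toy events: variable under study, cut decision, dimuon mass.
values = [5.0, 5.0, 15.0, 15.0, 15.0, 15.0]
passed = [True, False, True, True, True, True]
masses = [11.0, 11.0, 11.0, 50.0, 50.0, 50.0]
edges, mass_edges = [0.0, 10.0, 20.0], [10.5, 20.0, 60.0]

eff = binned_efficiency(values, passed, edges)
projection = project_to_mass(values, masses, eff, edges, mass_edges)
print(eff, projection)
```

The projected values per mass bin can then be compared with the efficiency measured directly as a function of mass. If the two curves agree, the variable under study is a plausible origin of the mass dependence.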

For the SPD multiplicity the results of this can be seen in Fig. 4.44. As is shown in the top figure, the vertex χ² efficiency does depend on the number of hits in the SPD. However, when using the SPD dependence as a per-event efficiency and averaging the resulting efficiency in each mass bin, no dependence is observed, as is shown in the lower figure. Therefore this cannot be the cause of the mass dependence: the change in the number of SPD hits as a function of mass is not large enough.

The vertex χ² efficiency also depends on the transverse momenta of the two muons, specifically the lower transverse momentum of the two. In contrast to the number of SPD hits, this dependence is also visible when projecting it into the mass dimension. It even closely resembles the observed mass dependence, as is shown in Fig. 4.45. Therefore the mass dependence of the vertex χ² efficiency is probably due to a transverse momentum dependence. No additional systematic uncertainty is assigned due to this dependence, as it is already included in the mass dependence of the efficiency.

[Plots: (a) εvtx versus the number of SPD hits. Legend: Data, at Z-peak; MC, at Z-peak; MC. (b) εvtx versus Mµµ [GeV/c²]. Legend: Data, at Z-peak; MC, at Z-peak; MC; MC from nSPD dep.]

(a) Dependence of the vertex χ² efficiency on the number of SPD hits in MC. (b) The projection of this dependence to the invariant mass.

Figure 4.44: Study to see if the dependence of the vertex χ² efficiency on the number of SPD hits can be responsible for the mass dependence using MC. The efficiency observed in data and in MC at the Z-peak is also shown.

[Plots: (a) εvtx versus min pT(µ). Legend: Data, at Z-peak; MC, at Z-peak; MC. (b) εvtx versus Mµµ [GeV/c²]. Legend: Data, at Z-peak; MC, at Z-peak; MC; MC from pT dep.]

(a) Dependence of the vertex χ² efficiency on the transverse momentum of the lower-momentum muon. (b) The projection of this dependence to the invariant mass.

Figure 4.45: Study to see if the dependence of the vertex χ² efficiency on the transverse momentum of the lower-momentum muon can be responsible for the mass dependence using MC. The efficiency observed in data and in MC at the Z-peak is also shown.

CHAPTER 5

SYSTEMATIC UNCERTAINTIES

As we know, there are known knowns. There are things we know that we know. There are known unknowns. That is to say, there are things that we know we don’t know. But there are also unknown unknowns. There are things we don’t know we don’t know.

Donald H. Rumsfeld

Essentially all of the steps used to determine the (double-) differential Drell-Yan cross-section have some uncertainty associated with them. In this chapter the different contributions to the systematic uncertainty are presented. These uncertainties can be either statistical in nature, such as the size of the MC sample which is used to determine some of the efficiencies, or entirely systematic, such as the choice of the binning of the fitting variable. All these uncertainties are considered as systematic uncertainties; only the uncertainty reported by the fit is used as the statistical uncertainty. For an overview of the different contributions to the total systematic uncertainty, see Section 5.5.

5.1 Fitting

In order to study the systematic effect of the fitting procedure on the results, a number of checks are performed. First, toy studies are used to validate the fitting procedure in general and determine the bias, if any, of the final fitting procedure.

In addition, both an alternate signal template (using information from signal MC) and an alternate heavy-flavour template (using the impact parameter instead of the vertex χ2) are used. Finally, the number of bins in the isolation variable is varied since the baseline choice is essentially arbitrary.

5.1.1 Signal template

The signal template is generated from data around the Z-resonance. This assumes that the events there are representative of the events in the other mass bins. That this is not entirely the case could already be seen in Section 4.3.2. In order to mitigate this, a rapidity reweighting was applied there, which accounts for the fact that the rapidity distribution changes as a function of mass. For this the rapidity distribution of the MC sample in the different mass bins was used. Since this procedure relies entirely on the rapidity distribution being reproduced correctly in MC, this is a possible source of systematic uncertainty. In order to estimate this systematic uncertainty, two different approaches are used:

1. Model the mass dependence in a different way.

2. Disregard the rapidity reweighting entirely.

For both approaches it is helpful to consider the mass dependencies of the different templates, namely the MC sample, the rapidity reweighted signal template and a signal template where no rapidity reweighting is performed. For this the bulk mean of the three templates is shown in Fig. 5.1. The bulk mean of the MC sample clearly has a mass dependence. While the signal template also has a mass dependence, which it gained from the rapidity reweighting, the two mass dependencies are not the same. In contrast, the unweighted signal template has no mass dependence, by construction.

In addition it is visible that the isolation variable is not modelled properly in MC. The origin lies in the imperfect modelling of the event activity, as already observed in the number of SPD hits. The MC sample generally tends to underestimate the activity in the detector. This affects also the isolation variable since it is just the sum of the transverse momenta of all tracks in a cone around the muon. Because of this the isolation variable is susceptible to both the number of tracks not being correctly modelled and also to the momentum being incorrectly simulated.

For the first approach, using a different model of the mass dependence, the procedure is described in the following. The idea is to modify the data used to generate the signal template such that the isolation fraction and the bulk mean and width follow the mass dependence of the MC sample while retaining the overall differences between data and MC.

[Plot: bulk mean versus Mµµ [GeV/c²]. Legend: MC; Signal template; Signal template (unw.).]

Figure 5.1: Bulk mean of the MC sample and the reweighted and unweighted signal template as a function of mass.

In order to compensate for the difference between data and MC, the shift of the bulk mean with respect to the value at the Z-peak is used instead of the bulk mean of the MC sample directly. If the isolation variable is wrong by an offset in MC, using the shift of the bulk mean from the value at the Z-peak cancels this offset. This would be the case if either the number of tracks or their transverse momentum is wrong by a multiplicative factor, since a multiplicative factor in the isolation becomes, to a good approximation, an additive offset in its logarithm. In that case there would be no change of the bulk width; however, the bulk width is also different in data and MC. Nevertheless, Figs. 4.6c and 4.6d show that a constant offset between data and MC is a good approximation.

Similarly, the ratio of the bulk width and the ratio of the isolation fraction with respect to the Z-peak are used instead of the actual values of the bulk width and isolation fraction in MC. Also here it can be seen in Figs. 4.6a, 4.6b, 4.6e and 4.6f that assuming a constant ratio between the data and MC bulk width and isolation fraction is a reasonable approximation.

The bulk mean and width of the data used to generate the signal template can be adjusted to follow the mass dependence seen in MC by modifying the values of the isolation variable of each event. In order to do so, the bulk is moved such that it has mean zero, the values are scaled to have the desired bulk width and then the rescaled bulk is moved to the requested new bulk mean, defined as the old bulk mean plus the shift observed in MC in that bin. Defining the shift in bulk mean and the ratio of the bulk widths in bin i using the template j as

$$\Delta\mu_i^{j} = \mu_i^{j} - \mu_Z^{j}, \tag{5.1}$$

$$s_i^{j} = \sigma_i^{j} / \sigma_Z^{j}, \tag{5.2}$$

this variation can be expressed as

$$x' = \frac{s_i^{\mathrm{MC}}}{s_i^{\mathrm{Data}}} \cdot \left(x - \mu_i^{\mathrm{Data}}\right) + \mu_Z^{\mathrm{Data}} + \Delta\mu_i^{\mathrm{MC}}, \tag{5.3}$$

with x = log(isolation + 1). Here the requested changes in the bulk mean and width, with respect to their values at the Z-peak, have been determined in MC in each bin.

This can be simplified further because $X_Z$ is the value of variable X in the mass bin containing the Z-peak and in the current rapidity bin. Since the unweighted signal template is used as a baseline for the adjusted template, there is no mass dependence for the data. This means that $\sigma_i^{\mathrm{Data}} = \sigma_Z^{\mathrm{Data}} \Rightarrow s_i^{\mathrm{Data}} = 1$ and $\mu_i^{\mathrm{Data}} = \mu_Z^{\mathrm{Data}}$ in each rapidity bin, leading to:

$$x' = s_i^{\mathrm{MC}} \cdot \left(x - \mu_Z^{\mathrm{Data}}\right) + \mu_Z^{\mathrm{Data}} + \Delta\mu_i^{\mathrm{MC}}. \tag{5.4}$$
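The effect of this transformation on the bulk can be checked with a few lines of Python. This is only a sketch under stated assumptions: the function name, the bulk values and the requested variations are hypothetical, and the fully isolated events at x = 0 are assumed to have been set aside beforehand.

```python
import statistics

def adjust_bulk(x_values, s_mc, delta_mu_mc, mu_z_data):
    """Apply x' = s_mc * (x - mu_z_data) + mu_z_data + delta_mu_mc to each event."""
    return [s_mc * (x - mu_z_data) + mu_z_data + delta_mu_mc for x in x_values]

# Hypothetical bulk values of x = log(isolation + 1) at the Z-peak.
bulk = [6.8, 7.0, 7.2, 7.4, 7.6]
mu_z = statistics.fmean(bulk)
sigma_z = statistics.pstdev(bulk)

# Hypothetical requested variations for one mass bin.
adjusted = adjust_bulk(bulk, s_mc=0.95, delta_mu_mc=-0.05, mu_z_data=mu_z)

mean_shift = statistics.fmean(adjusted) - mu_z
width_ratio = statistics.pstdev(adjusted) / sigma_z
print(mean_shift, width_ratio)
```

The observed mean shift and width ratio reproduce the requested values up to floating-point precision, which is the behaviour expected of a pure shift and scale of the bulk.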

Afterwards the fully isolated events need to be restored to lie again at log(isolation + 1) = 0, since they have also been shifted and scaled. This is done manually by recording the fraction of fully isolated events. In addition, the fraction of fully isolated events is varied as well. In order to scale the isolation fraction with a factor $r^{\mathrm{MC}}$, the content of the first bin is multiplied by the factor

$$\tilde{r} = r^{\mathrm{MC}} \cdot \frac{\sum_{i=1}^{n} x_i}{\left(1 - r^{\mathrm{MC}}\right) \cdot x_0}, \qquad x_0 \neq 0. \tag{5.5}$$

Here $x_0$ denotes the content of the first bin and $x_i$ the contents of the remaining n bins. This takes into account that the isolation fraction was defined in Eq. (4.5) as the number of events in the first bin divided by the number of events in all bins, including the first bin.
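A quick numerical check of Eq. (5.5) can be made with a toy template. The sketch below is an illustration only: the bin contents and function names are hypothetical, and r_mc is read as the isolation fraction that the template has after the first bin is rescaled, which is the reading under which the formula as printed is self-consistent. That reading is an assumption about the notation.

```python
def first_bin_scale(contents, r_mc):
    """Scale factor for the first bin: r * sum(x_1..x_n) / ((1 - r) * x_0)."""
    x0, rest = contents[0], sum(contents[1:])
    if x0 == 0 or not 0.0 < r_mc < 1.0:
        raise ValueError("requires x_0 != 0 and 0 < r_mc < 1")
    return r_mc * rest / ((1.0 - r_mc) * x0)

def isolation_fraction(contents):
    """First bin divided by the sum over all bins, including the first."""
    return contents[0] / sum(contents)

# Hypothetical template; the first bin holds the fully isolated events.
template = [20.0, 30.0, 40.0, 10.0]
r_tilde = first_bin_scale(template, r_mc=0.18)
scaled = [template[0] * r_tilde] + template[1:]

print(isolation_fraction(template), isolation_fraction(scaled))
```

In this toy case the isolation fraction changes from 0.20 to 0.18 after the first bin is multiplied by the computed factor.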

That these variations change the template in the requested way is shown in Fig. 5.2, where the requested variation of the bulk mean, width and isolation fraction is compared to the actual variation observed after performing this procedure. The different points correspond to all mass and rapidity bins in which the fit is being run. In general these variations are smaller than 20%. The small deviations from the expected straight line are probably due to statistics, since the mean and width were determined on the binned templates instead of the underlying distributions.

As an example of how this affects the template shape, in Fig. 5.3 the adjusted signal template in the lowest mass bin is shown together with the signal template with and without the rapidity reweighting. The rapidity reweighting seems to mostly affect the number of fully isolated events, while the template adjustment affects mostly the bulk.

By construction the adjusted signal template follows quite closely the trends of the MC sample in bulk mean and standard deviation and in the isolation fraction, as is shown in Fig. 5.4. While the bulk mean and isolation fraction are relatively similar between the signal template and the MC sample, the bulk width is most affected by the adjustment.

In Fig. 5.5 the ratio between the fits with the two alternative signal templates and the baseline fit is shown as a function of mass. In each bin, half of the larger of the two differences is taken as a conservative estimate of the systematic uncertainty due to the signal template. The resulting relative uncertainty from this source is shown in the overview in Section 5.5.
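The rule for this systematic uncertainty can be written down compactly. The following sketch assumes that the difference in each bin is expressed as the deviation of the yield ratio from unity; the function name and the ratios are hypothetical and only serve as an illustration.

```python
def signal_template_systematic(ratio_adjusted, ratio_unweighted):
    """Per bin: half of the larger deviation from unity of the two yield ratios."""
    return [
        0.5 * max(abs(a - 1.0), abs(u - 1.0))
        for a, u in zip(ratio_adjusted, ratio_unweighted)
    ]

# Hypothetical yield ratios (alternative fit / baseline fit) in three mass bins.
ratio_adjusted = [1.04, 0.99, 1.00]
ratio_unweighted = [0.98, 1.03, 1.01]

syst = signal_template_systematic(ratio_adjusted, ratio_unweighted)
print(syst)
```

Taking the larger of the two deviations rather than combining them avoids counting the same effect twice, since both alternative templates probe the same modelling assumption.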

[Plot: actual variation versus requested variation. Legend: Isolation fraction ratio; Bulk mean shift + 1; Bulk width ratio.]

Figure 5.2: Relationship between the requested variation of the isolation fraction, the bulk mean and standard deviation of the signal template versus the observed change.

[Plot: normalized entries versus log(isolation + 1). Legend: Signal; Signal (adjusted); Signal (unweighted).]

Figure 5.3: Baseline signal template as well as the two signal templates used for the systematic uncertainty, the adjusted and the unweighted signal template, in the mass bin 10.5 – 11 GeV/c².

[Plots versus Mµµ [GeV/c²]. Legend: MC; Signal template; SignalAdjust template. (a) Bulk mean as a function of mass (bulk mean shift, µ − µZ). (b) Bulk width as a function of mass (bulk std ratio, σ/σZ). (c) Isolation fraction as a function of mass (isolation fraction ratio).]

Figure 5.4: Properties of the signal, signal MC and adjusted signal template as a function of invariant mass.

5.1.2 Heavy-flavour template

As the heavy-flavour background is the dominant background, see e.g. Fig. 4.16, it is also studied how the particular choice of the heavy-flavour background template affects the cross-section measurement. Instead of using the requirement for the two muons to form a vertex of low quality (using the Z vertex χ²vtx), the HF (IPCHI2) sample, as defined in Section 4.2.2, can be used to generate the heavy-flavour background template. The difference between the fit result using the two different heavy-flavour templates can then be used to estimate the systematic uncertainty due to the background template.

[Plot: ratio to the baseline versus Mµµ [GeV/c²]. Legend: Baseline; SignalUnweighted; SignalAdjust.]

Figure 5.5: Ratio between the signal yield using the adjusted template and the unweighted template with respect to using the baseline signal template, as a function of mass. The uncertainties assume ρ = 1, see Appendix E.

Recall that the HF (IPCHI2) sample used the quality of the vertex of the two muons, the χ²IP, as a selection criterion. The cuts for both samples are designed to select events where one of the two muons originates not from a PV or at least not from the same PV as the other muon. This means mostly background like B± → Xµ± is selected. The alternative heavy-flavour sample with the impact parameter quality cut also contains e.g. B → Xµ±µ∓ events, where both muons originate from a common secondary vertex, which also leads to a higher χ²IP. About 47% of the events present in the baseline heavy-flavour sample are also present in the IPCHI2 heavy-flavour sample. The number of events in each sample, and their overlap, are shown in Fig. 5.6.

This overlap between the two samples can also be studied as a function of invariant mass and rapidity. The easiest way to do so is to calculate the correlation between the two samples, which is given by $\rho = \sigma_1 \sigma_2 / \sigma_{1\cap 2}^{2}$. Since this would mean running the fit with two additional samples, in order to obtain both $\sigma_2$ and $\sigma_{1\cap 2}$, we can instead estimate the correlation coefficient using the fact that in general $\sigma \propto 1/\sqrt{N}$, leading to $\rho = \sqrt{N_{1\cap 2}^{2} / (N_1 N_2)}$. This correlation varies in the different mass bins, but not in the different rapidity bins, as can be seen in Fig. 5.7. The different correlation in the different mass bins can be explained by the fact that the IPCHI2 heavy-flavour sample contains more residual signal than the baseline heavy-flavour sample. This becomes especially important in the bin containing the Z-peak, since there both background samples are almost entirely signal, but apparently different signal events, because the correlation almost disappears in the bin containing the Z-peak.
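As a worked example, the count-based estimate of the correlation can be evaluated with the event numbers given in Fig. 5.6. The sketch assumes that the three numbers in that figure are the exclusive regions of the diagram (HF only, overlap, IPCHI2 only); this reading is an assumption, motivated by the fact that it reproduces the quoted overlap of about 47%. The function name is hypothetical.

```python
import math

def overlap_correlation(n_only_1, n_only_2, n_both):
    """rho = N_both / sqrt(N_1 * N_2), with N_1 and N_2 the full sample sizes."""
    n_1 = n_only_1 + n_both
    n_2 = n_only_2 + n_both
    return n_both / math.sqrt(n_1 * n_2)

# Event counts from Fig. 5.6, read as exclusive regions (assumption).
n_hf_only, n_both, n_ipchi2_only = 188_521, 165_223, 460_203

overlap_fraction = n_both / (n_hf_only + n_both)
rho = overlap_correlation(n_hf_only, n_ipchi2_only, n_both)
print(f"overlap fraction = {overlap_fraction:.3f}, rho = {rho:.3f}")
```

Under this assumption the integrated samples give an overlap fraction of about 0.47 and a correlation of about 0.35. The same function can be applied bin by bin to obtain the mass and rapidity dependence.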

The fit is run again with this alternative heavy-flavour background template. The ratio between the signal yields using the two different heavy-flavour background samples is shown in Fig. 5.8, with the correlation shown above taken into account for the uncertainty of the ratio. While the two fits mostly agree at high masses, a larger discrepancy towards low masses is found. Note that the toy study determining the bias of the fit is not rerun for the HF IPCHI2 template and only the raw signal yields are compared. A preliminary study using a reduced number of toy fits in each mass bin did not show any large differences between the biases, though. Therefore, half of the difference between the two background models is taken as systematic uncertainty. In the end, this uncertainty is the limiting factor of the cross-section measurement, at least at low masses.

[Diagram of event counts. HF: 188,521; HF & HF (IPCHI2): 165,223; HF (IPCHI2): 460,203.]

Figure 5.6: Overlap between the HF and HF IPCHI2 samples and the number of events present in each.

[Plots: correlation factor ρ. (a) As a function of the invariant mass Mµµ [GeV/c²]. (b) As a function of rapidity y, integrated over the invariant mass.]

Figure 5.7: Correlation factor ρ between the HF and HF IPCHI2 samples.

[Plot: ratio versus Mµµ [GeV/c²]. Legend: HF; HF IPCHI2.]

Figure 5.8: Ratio between the signal yields using the baseline HF sample and the HF IPCHI2 sample, as a function of invariant mass. The uncertainty of the ratio takes the correlation between the two samples into account, see Appendix E.

5.1.3 Toy studies and bin migration

The two corrections which are directly applied to the signal yield, the toy correction and the bin migration factor, both have systematic uncertainties assigned due to the statistical uncertainties from their determination. See Sections 4.4.4 and 4.5 for the values of the systematic uncertainties, respectively.

5.2 Efficiencies

The systematic uncertainties that are assigned to the different efficiencies needed to obtain the absolute cross-section have already been described in Section 4.6 together with the efficiencies themselves. There a combined reconstruction efficiency was determined, which includes the trigger, tracking and muon ID efficiencies. The systematic uncertainties of these three efficiencies are combined in a total systematic efficiency uncertainty (see Section 5.5). In order to see which part of the reconstruction efficiency contributes most to its overall systematic uncertainty, the relative systematic uncertainties due to the different components of the reconstruction efficiency are shown in Fig. 5.9 as a function of invariant mass.

[Plot: relative uncertainty versus Mµµ [GeV/c²]. Legend: Tot. Syst.; muon ID syst; trigger eff factorizability; tracking eff; trigger eff correction; trigger eff; muon ID.]

Figure 5.9: Relative systematic uncertainty due to the different components of the reconstruction efficiency, as a function of invariant mass.

5.3 Luminosity

The absolute luminosity scale was measured during dedicated data taking periods, using both van der Meer scans and beam-gas imaging methods, as described in Section 3.4. Both methods are combined to determine the final luminosity estimate for data recorded by LHCb in 2012 with a relative uncertainty of 1.16% [88].

5.4 Cross-checks

In addition to the systematic uncertainties, some consistency checks were performed in order to rule out obvious errors. These include changing some of the arbitrarily chosen parameters of the fitting procedure (such as the number of bins of the isolation variable) and analysing statistically independent samples, such as the MagUp and MagDown samples. These cross-checks were performed only as a function of mass. No additional uncertainties are applied.

5.4.1 Number of bins

Since the number of bins used for the isolation template was arbitrarily chosen to be 40, a study is performed to see if this number has any effect on the result. The fit is run again, but this time with only 30 bins in the isolation variable. The results as a function of mass are shown in Fig. 5.10, where the uncertainty is determined according to Eq. (E.2), with ρ = 1. The results agree within less than half a percent with the baseline result.

[Plot: ratio versus Mµµ [GeV/c²]. Legend: 40 bins; 30 bins.]

Figure 5.10: Ratio between the fit using 30 bins in the isolation variable to the baseline fit which uses 40 bins. The uncertainty is the difference between the (relative) statistical uncertainties of the two results, see Appendix E.

5.4.2 Magnetic polarity

As mentioned in Section 4.2.3, the data taking in 2012 was performed with the magnetic field either pointing up or down (MagUp and MagDown). About half of the data-taking time was dedicated to each magnetic field configuration. This is especially important for measurements of CP-violation, since oppositely charged particles end up in different regions of the detector, depending on the orientation of the magnetic field. Having data taken with both orientations allows studying asymmetries of the detector. Since the final state in this analysis is CP-even, this is of no concern here, but the two data samples allow the whole data set to be split into two statistically independent data samples.

This was already exploited when calculating the bin migrations due to the finite detector resolution, bremsstrahlung and FSR in Section 4.5. Here it is used as a cross-check, to see if any significant deviation is observed. However, it has to be kept in mind that both the toy studies and most efficiencies always used the combined data sample and not the two samples separately. Therefore, the raw signal yields are directly compared here, only scaled by the luminosities of the two samples, $\mathcal{L}_{\mathrm{up}} = 1016 \pm 5\ \mathrm{pb}^{-1}$ and $\mathcal{L}_{\mathrm{down}} = 981.9 \pm 4.2\ \mathrm{pb}^{-1}$, respectively. As can be seen in Fig. 5.11, the ratio between the signal fractions using these two data sets is roughly compatible with unity (χ²/ndf = 24.84/12) and no dependency is visible as a function of invariant mass.
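To put the quoted χ²/ndf into perspective, it can be converted into a p-value. The sketch below uses the closed-form survival function of the χ² distribution that holds for an even number of degrees of freedom; the p-value itself is not quoted in the text, and the function name is hypothetical.

```python
import math

def chi2_survival_even_ndf(chi2, ndf):
    """P(X >= chi2) for a chi-square distribution with an even, positive ndf.

    Uses exp(-x/2) * sum_{k=0}^{ndf/2 - 1} (x/2)^k / k!.
    """
    if ndf <= 0 or ndf % 2:
        raise ValueError("this closed form needs a positive even ndf")
    half = chi2 / 2.0
    term, total = 1.0, 0.0
    for k in range(ndf // 2):
        total += term            # adds (x/2)^k / k!
        term *= half / (k + 1)   # next term of the series
    return math.exp(-half) * total

p_value = chi2_survival_even_ndf(24.84, 12)
print(f"p-value = {p_value:.4f}")
```

For the quoted values this gives a p-value of roughly 1.6%, under the assumption that the twelve points are independent and their uncertainties Gaussian.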

[Plot: ratio MagUp / MagDown versus Mµµ [GeV/c²].]

Figure 5.11: Ratio of the signal fractions of separate fits to the two magnetic field polarities MagUp and MagDown, compatible with no dependency.

5.5 Total systematic uncertainty

5.5.1 As a function of mass

Since there are many parts which contribute to the systematic uncertainty, an overview is given here. In Fig. 5.12 the relative systematic uncertainty is shown as a function of invariant mass, for each different source considered. Note that the systematic uncertainty due to the reconstruction efficiency is combined, as described in Section 5.2. Both the total systematic uncertainty, which is just the sum in quadrature of the individual relative systematic uncertainties, and the statistical uncertainty are also included for comparison. The numerical values can be found in Table F.1. As can be seen from this, the measurement of the DY cross-section is dominated by the systematic uncertainty in all mass bins. The dominant source of systematic uncertainty is the uncertainty of the heavy-flavour background template at low masses and the uncertainties of the reconstruction efficiency and luminosity at high masses.
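The sum in quadrature can be illustrated with a short sketch. Only the luminosity and GEC entries below are taken from the text (1.16% and 0.16%); all other values are hypothetical placeholders for a single mass bin, and the function name is an assumption made for this example.

```python
import math

def total_systematic(relative_uncertainties):
    """Sum in quadrature of the individual relative systematic uncertainties."""
    return math.sqrt(sum(u * u for u in relative_uncertainties.values()))

contributions = {
    "luminosity": 0.0116,                # quoted in Section 5.3
    "GEC": 0.0016,                       # quoted for the GEC efficiency
    "HF template": 0.030,                # hypothetical
    "signal template": 0.010,            # hypothetical
    "reconstruction efficiency": 0.012,  # hypothetical
}

total = total_systematic(contributions)
print(f"total relative systematic uncertainty = {100 * total:.2f}%")
```

A sum in quadrature treats the sources as uncorrelated, so the largest contribution dominates the total, which is why the heavy-flavour template drives the result at low masses.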

5.5.2 As a function of mass and rapidity

Similarly, Fig. 5.13 shows the relative systematic uncertainty from the different sources including the total systematic uncertainty in each mass and rapidity bin. Again, the statistical uncertainty is included as a reference. The numerical values can be found in Table F.2. For most of the rapidity range the systematic uncertainty due to the heavy-flavour background template is also dominant here. Where it is not, the uncertainty due to the signal template mostly dominates the systematic uncertainty.

[Plot: relative uncertainty versus Mµµ [GeV/c²]. Legend: Total Syst.; HF template; Reconstruction efficiency; Statistical; Luminosity; Bin migration; Signal template; GEC; Toys; Vertex Chi2.]

Figure 5.12: Relative contributions to the total systematic uncertainty as a function of mass. The statistical uncertainty is also shown, as a comparison.

[Plots: relative uncertainty versus rapidity y in four panels. (a) In the mass bin 10.5 – 12 GeV/c². (b) In the mass bin 12 – 15 GeV/c². (c) In the mass bin 15 – 20 GeV/c². (d) In the mass bin 20 – 60 GeV/c².]

Figure 5.13: Relative contributions to the total systematic uncertainty as a function of rapidity in the different mass bins. The statistical uncertainty is shown as a comparison. See Fig. 5.12 for the legend.

CHAPTER 6

RESULTS

There are two possible outcomes: if the result confirms the hypothesis, then you’ve made a measurement. If the result is contrary to the hypothesis, then you’ve made a discovery.

Enrico Fermi

In order to properly retrieve the cross-section from the number of observed signal events, the time-integrated version of Eq. (3.7) is used:

$$\sigma = \frac{N}{\varepsilon L}, \tag{6.1}$$

where N is the number of signal events (after all corrections), L the integrated luminosity and ε the efficiency.

The cross-section is determined separately in each mass- (and rapidity-) bin i. There are two types of corrections on the number of signal events, the correction due to the toy fits and a correction factor due to the bin migration. The cross-section in each mass- and rapidity-bin is then determined with the master formula

\[ \sigma_i = \frac{N_i^{\mathrm{sig}} - c_i^{\mathrm{toy}} \cdot \sigma_i^{\mathrm{sig}}}{L} \cdot \frac{f_i^{\mathrm{MIG}}}{\varepsilon^{\mathrm{GEC}}} \cdot \left\langle \frac{1}{\varepsilon^{\mathrm{vtx}} \cdot \varepsilon^{\mathrm{reco}}} \right\rangle_i , \tag{6.2} \]

where $N_i^{\mathrm{sig}}$ is the number of signal events retrieved from the fit, $c_i^{\mathrm{toy}}$ the correction due to the toy fits in units of standard deviations, $\sigma_i^{\mathrm{sig}}$ the statistical uncertainty from the fit, $f_i^{\mathrm{MIG}}$ the bin migration correction factor, $\varepsilon^{\mathrm{GEC}}$ the GEC efficiency, $\varepsilon^{\mathrm{vtx}}$ the efficiency of the vertex χ² cut and $\varepsilon^{\mathrm{reco}}$ the reconstruction efficiency containing the trigger, tracking and muon ID efficiencies, which were obtained as per-event efficiencies and averaged over the bin i as indicated. The luminosity and the GEC efficiency are the same in all bins, whereas the other quantities depend on the bin.
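As a rough sketch of how Eq. (6.2) could be evaluated for a single bin, the following Python snippet applies the toy-fit and bin-migration corrections and then divides by the luminosity and efficiencies. It assumes that the per-event efficiencies enter as the bin average of 1/(ε_vtx · ε_reco); the function names and all numerical inputs are hypothetical and only serve as an illustration.

```python
from statistics import mean


def mean_inverse_efficiency(per_event_efficiencies):
    """Average 1 / (eff_vtx * eff_reco) over the events of one bin.

    Assumption: this is how the per-event efficiencies are averaged.
    """
    return mean(1.0 / (eff_vtx * eff_reco)
                for eff_vtx, eff_reco in per_event_efficiencies)


def cross_section(n_sig, c_toy, sigma_sig, f_mig, lumi, eff_gec, inv_eff):
    """Cross-section in one bin; lumi in pb^-1 gives a result in pb."""
    corrected_yield = n_sig - c_toy * sigma_sig  # toy-fit bias correction
    return corrected_yield / lumi * f_mig / eff_gec * inv_eff


# Hypothetical inputs for one bin (illustration only).
events = [(0.90, 0.70), (0.95, 0.75)]  # (eff_vtx, eff_reco) per event
sigma_bin = cross_section(
    n_sig=5000.0, c_toy=0.1, sigma_sig=80.0, f_mig=0.98,
    lumi=2000.0, eff_gec=0.95, inv_eff=mean_inverse_efficiency(events),
)
print(f"cross-section in this bin: {sigma_bin:.3f} pb")
```

Keeping the luminosity and the GEC efficiency as plain arguments mirrors the fact that they are common to all bins, while the remaining inputs change from bin to bin.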

6.1 Total cross-section at the Z-peak

In order to ascertain that the results obtained in this analysis, especially some of the efficiencies, are reasonable, the total cross-section at the Z-peak can be compared to the one obtained in the stand-alone Z → µ+µ− analysis:

\[ \sigma^{\mathrm{ref}}_{Z \to \mu^+\mu^-} = 95.0 \pm 0.3 \pm 0.7 \pm 1.1 \pm 1.1~\mathrm{pb}, \tag{6.3} \]

where the first uncertainty is statistical, the second systematic, the third due to the knowledge of the LHC beam energy and the fourth due to the luminosity determination [128]. The analysis was performed in the range 2 < y < 4.5 and 60 < Mµµ < 120 GeV/c² using a template fit to the muon pT distribution.

In this analysis the cross-section is only determined in the range 2 < y < 4.5 and 60 < Mµµ < 110 GeV/c². To allow a comparison with the cross-section measured in the Z-analysis, this is corrected by adding the average cross-section in the region 110 < Mµµ < 120 GeV/c², as predicted by the four different PDF sets. The average predicted value is

\[ \sigma^{\mathrm{PDF}} = 0.60 \pm 0.03~\mathrm{pb}, \tag{6.4} \]

where the uncertainty is mostly due to the theory uncertainty associated with all predicted values. With this correction added, the Drell-Yan cross-section at the Z-peak is measured to be

\[ \sigma^{\mathrm{DY}}_{Z \to \mu^+\mu^-} = 93.81 \pm 0.33 \pm 1.62 \pm 0.03~\mathrm{pb}, \tag{6.5} \]

where the first uncertainty is statistical, the second systematic and the last due to the different mass range used in this analysis.

In order to know if this difference is significant, the correlation between the two measurements needs to be taken into account. The luminosity measurement is certainly fully correlated between the two measurements. In addition, the tracking efficiency is also fully correlated and the muon ID efficiency, GEC efficiency and bin migration use the same (or similar) data and methods. In order to obtain a conservative estimate of the uncertainty of the difference between the two measurements, these are all taken to be fully correlated as well. Note that in this analysis only the combined reconstruction efficiency is available. This efficiency also contains the trigger efficiency, which is mostly uncorrelated with the trigger efficiency in the Z-analysis, since different trigger lines were used. The uncertainties due to the fitting procedure and the statistical uncertainties are taken to be uncorrelated. The correlation matrix used can also be seen in Fig. G.1.

When using these correlations and combining all uncertainties, the difference between the two measurements is

\[ \sigma^{\mathrm{ref}}_{Z \to \mu^+\mu^-} - \sigma^{\mathrm{DY}}_{Z \to \mu^+\mu^-} = 1.2 \pm 1.6~\mathrm{pb}, \tag{6.6} \]

which is compatible with zero.
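The way correlations enter the uncertainty of a difference can be sketched as follows. For each source k with uncertainties a_k and b_k on the two measurements and correlation ρ_k, the variance of the difference receives a_k² + b_k² − 2 ρ_k a_k b_k, assuming the sources are independent of each other. The per-source breakdown used in the example is hypothetical; only the central values and the final uncertainty of 1.6 pb are taken from the text.

```python
import math


def difference_uncertainty(sources):
    """Uncertainty of (A - B) from per-source (unc_a, unc_b, rho) triples.

    Sources are assumed to be mutually independent; rho is the correlation
    of one source between the two measurements.
    """
    variance = sum(a * a + b * b - 2.0 * rho * a * b for a, b, rho in sources)
    return math.sqrt(max(variance, 0.0))  # guard against rounding below zero


# Hypothetical breakdown (illustration only): a fully correlated source
# largely cancels, an uncorrelated one adds in quadrature.
example = [
    (1.1, 1.1, 1.0),   # fully correlated, e.g. a common luminosity
    (0.3, 0.33, 0.0),  # uncorrelated, e.g. statistical
]
example_unc = difference_uncertainty(example)

# Values quoted in Eqs. (6.3), (6.5) and (6.6).
pull = (95.0 - 93.81) / 1.6
print(f"example uncertainty: {example_unc:.2f} pb, quoted pull: {pull:.2f}")
```

With the quoted numbers the difference corresponds to well below one standard deviation, in line with the statement that it is compatible with zero.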

6.2 Differential cross-section as a function of mass

The results of the differential DY cross-section dσ/dMµµ as a function of the invariant mass of the muon pair are shown in Fig. 6.1, while the numerical values can be found in Table B.1. In general the agreement between the theoretical predictions and the measured values is quite good. As shown above, the agreement with the independent measurement of the Z cross-section is also good. A better comparison between the different predictions and the measured values is possible by taking the ratio between the predictions and the measured values, which is shown in Fig. 6.2. There it is visible that the predictions made using the CT14 PDF differ both from the predictions by the other PDF sets and the results obtained in this thesis, albeit not statistically significantly.

[Figure 6.1: plot of dσ/dMµµ [pb/(GeV/c²)] versus Mµµ [GeV/c²], labelled "LHCb unofficial, √s = 8 TeV". Legend: Stat. & Syst., Stat., MMHT2014, MSTW08, NNPDF30, CT14, Z-analysis [128].]

Figure 6.1: Differential Drell-Yan cross-section as a function of the invariant mass of the muon pair, together with theory predictions using different PDFs with the 68% CL shown as error bars. The yellow boxes show the statistical uncertainty obtained from the fit and the orange boxes show the combined statistical and systematic uncertainty.

[Figure 6.2: plot of the ratio Theory / Data versus Mµµ [GeV/c²], labelled "LHCb unofficial, √s = 8 TeV, 2 < y < 4.5". Legend: Stat. & Syst., Stat., MMHT2014, MSTW08, NNPDF30, CT14.]

Figure 6.2: Ratio of the differential Drell-Yan cross-section, as a function of the invariant mass of the muon pair, obtained from theory predictions using different PDFs (68% CL shown as error bars) to the one obtained from the fit. The yellow boxes show the statistical uncertainty obtained from the fit and the orange boxes show the combined statistical and systematic uncertainty. The data uncertainties are not included in the uncertainty of the ratio for the predictions, only the theory and PDF uncertainties.
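The Theory / Data ratio can be illustrated with the lowest mass bin (10.5 – 11.0 GeV/c²), using the predictions from Table A.1 and the measured value from Table B.1. This is a minimal sketch: the function name is hypothetical, and, following the figure caption, only the theory uncertainties are propagated to the ratio.

```python
# Measured value in pb/(GeV/c^2), lowest mass bin of Table B.1.
data_value = 71.3

# (central, upper, lower) predictions in pb/(GeV/c^2) from Table A.1.
predictions = {
    "MMHT2014": (76.90, 6.27, 4.03),
    "MSTW08": (76.20, 7.20, 4.13),
    "NNPDF30": (70.44, 5.96, 4.42),
    "CT14": (84.00, 13.14, 6.52),
}


def theory_over_data(prediction, measured):
    """Return (ratio, upper, lower), scaling only the theory uncertainties."""
    central, upper, lower = prediction
    return central / measured, upper / measured, lower / measured


ratios = {name: theory_over_data(pred, data_value)
          for name, pred in predictions.items()}
for name, (ratio, upper, lower) in ratios.items():
    print(f"{name:9s} {ratio:.3f} +{upper:.3f} -{lower:.3f}")
```

In this bin the CT14 prediction lies furthest above the measurement, while its lower uncertainty still brings it close to unity, consistent with the statement that the difference is not statistically significant.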

6.3 Double-differential cross-section as a function of mass and rapidity

Figure 6.3 shows the double-differential DY cross-section as a function of the rapidity in the different mass bins. The statistical uncertainties and the combined statistical and systematic uncertainties are also shown. The corresponding values, with the statistical and systematic uncertainties reported separately, can be found in Table B.2. Here too, the ratio between the predictions and the measured values is shown, in Fig. 6.4. Two discrepancies between the theoretical predictions and the measured values can be seen: the measurement shows a higher cross-section at low to medium rapidities and a lower cross-section in the high-rapidity bin than the NNLO predictions. Unfortunately, the low statistics of the HF sample did not allow a comparison in the bin containing the Z-peak, where the Z-analysis [128] showed only a small discrepancy at low rapidity in the cross-section as a function of rapidity when compared to a similar set of PDFs.

[Figure 6.3: four panels of d²σ/(dy dMµµ) [pb/(GeV/c²)] versus y (2.0 to 4.5), one per mass bin: 10.5 < Mµµ < 12.0, 12.0 < Mµµ < 15.0, 15.0 < Mµµ < 20.0 and 20.0 < Mµµ < 60.0, each labelled "LHCb unofficial, √s = 8 TeV". Legend: Stat. & Syst., Stat., MMHT2014, MSTW08, NNPDF30, CT14.]

Figure 6.3: Double-differential Drell-Yan cross-section as a function of the rapidity in the different mass bins as obtained from the fit and theory predictions using different PDFs. The yellow boxes show the statistical uncertainty obtained from the fit and the orange boxes include the systematic uncertainty.

[Figure 6.4: four panels of the ratio Theory / Data versus y (2.0 to 4.5), one per mass bin: 10.5 < Mµµ < 12.0, 12.0 < Mµµ < 15.0, 15.0 < Mµµ < 20.0 and 20.0 < Mµµ < 60.0, each labelled "LHCb unofficial, √s = 8 TeV". Legend: Stat. & Syst., Stat., MMHT2014, MSTW08, NNPDF30, CT14.]

Figure 6.4: Ratio of the double-differential Drell-Yan cross-section as a function of the rapidity in the different mass bins as obtained from theory predictions using different PDFs and from the fit. The yellow boxes show the statistical uncertainty obtained from the fit and the orange boxes include the systematic uncertainty. The data uncertainties are not included in the uncertainty of the ratio, only the theory and PDF uncertainties.

CHAPTER 7

CONCLUSIONS & OUTLOOK

One never notices what has been done; one can only see what remains to be done.

Marie Curie

In conclusion, a measurement of the single-differential and double-differential Drell-Yan cross-section using data collected by the LHCb experiment in 2012 at √s = 8 TeV has been presented in this thesis. A template fit was performed using the isolation variable, which distinguishes between signal and multiple backgrounds that are eventually combined into one background template. The measurement of the cross-section has been made possible by the excellent precision of the luminosity determination for the 2012 data sample by LHCb. The differential cross-section as a function of invariant mass shows very good agreement with three of the four PDF sets used, while the CT14 PDF shows a discrepancy at low mass. For the double-differential cross-section as a function of invariant mass and rapidity, the predictions show tensions with the values measured in this thesis at both low and high rapidities.

In Ref. [154] it was said that at low masses (Mµµ < 20 GeV/c²) the effects from higher order contributions to the perturbative DY cross-section can be sizeable, which makes this region particularly interesting. Unfortunately, the low mass region is also the region where the method used in this thesis has the largest systematic uncertainty, mostly due to the heavy-flavour background template. In future analyses (e.g. at √s = 13 or 14 TeV) this can be avoided by choosing a different fitting variable, namely by directly fitting in $\chi^2_{\mathrm{vtx}}$, the variable used here to separate the signal and heavy-flavour background samples. This leaves more statistics available, because the region between the two selection cuts, which is enriched in neither signal nor background, can also be included. For the data sample used in this thesis, about 18% of the data falls in this region between the two selection cuts.

With future measurements at different centre-of-mass energies available, a measurement of ratios of cross-sections will have significantly reduced systematic uncertainties, as it will benefit from various cancellations. This is especially true when taking a double-ratio and normalising each cross-section to the Z-production cross-section. In the double-ratio, the uncertainty due to the luminosity, one of the leading systematic uncertainties, cancels. Furthermore, some of the theoretical uncertainties cancel as well when taking ratios. This is because various input parameters enter the theoretical calculations of the cross-sections, such as masses, running coupling constants, PDFs and renormalisation and factorisation scales [155], which are correlated between different centre-of-mass energies. Since the scale and mass uncertainties largely cancel, the sensitivity to the PDFs is enhanced in the ratio measurements.
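The cancellation of the luminosity in the double-ratio can be made explicit with a small numerical sketch. It assumes the simple form σ = N / (ε L) of Eq. (6.1) for both the Drell-Yan and the Z cross-section of a given data sample; all yields, efficiencies and luminosities below are hypothetical.

```python
def cross_section(n_events, efficiency, lumi):
    """Cross-section following Eq. (6.1): N / (efficiency * luminosity)."""
    return n_events / (efficiency * lumi)


def normalised_ratio(n_dy, eff_dy, n_z, eff_z, lumi):
    """DY cross-section normalised to the Z cross-section of the same sample."""
    return cross_section(n_dy, eff_dy, lumi) / cross_section(n_z, eff_z, lumi)


def double_ratio(sample_a, sample_b):
    """Ratio of the normalised ratios at two centre-of-mass energies."""
    return normalised_ratio(**sample_a) / normalised_ratio(**sample_b)


# Hypothetical samples at two centre-of-mass energies (illustration only).
sample_low = dict(n_dy=5000.0, eff_dy=0.60, n_z=90000.0, eff_z=0.80, lumi=2000.0)
sample_high = dict(n_dy=7000.0, eff_dy=0.62, n_z=120000.0, eff_z=0.81, lumi=1500.0)

nominal = double_ratio(sample_low, sample_high)
# Shift the luminosity of one sample by an assumed 1.2% calibration error.
shifted = double_ratio(dict(sample_low, lumi=sample_low["lumi"] * 1.012), sample_high)
print(f"nominal: {nominal:.6f}, with shifted luminosity: {shifted:.6f}")
```

The two printed values agree because the luminosity appears in both the numerator and the denominator of each normalised ratio, so any common miscalibration drops out.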

In addition to using a different fitting variable, there are other ways to reduce the systematic uncertainty of the measurement, which is the limiting factor of this analysis. The systematic uncertainty of the muon-ID efficiency is the major driving factor of the uncertainty of the reconstruction efficiency, which is the second largest systematic uncertainty, after the uncertainty due to the heavy-flavour template. This can be reduced by redoing the muon-ID and tracking efficiencies also for low pT tracks using MC.

Finally, the difference between MC and data regarding the trigger efficiency needs to be further investigated.

In the end, only time will tell if the deviations observed here will also be seen in the analysis of all data collected by LHCb so far. Regardless of this, inclusion of the DY cross-section at LHCb at low masses into future PDF fits will improve the description of the low-x gluon and anti-quark density.

APPENDIX A

THEORETICAL PREDICTIONS OF THE DRELL-YAN CROSS-SECTION

It’s difficult to make predictions, especially about the future.

Niels Bohr

Table A.1 contains the predictions of the differential cross-section with respect to the invariant mass, as a function of mass, and Table A.2 contains the double-differential cross-section as a function of mass and rapidity. For simplicity, the different uncertainties are summed in quadrature. All four PDF sets produce similar predictions. See Section 2.3.2 for an introduction to the different PDF sets and Section 2.4.3 for an explanation of how FEWZ generally works.


Table A.1: Differential cross-section dσ/dMµµ in pb/(GeV/c²) as predicted using NNLO FEWZ with the different PDF sets. The uncertainties are the sum in quadrature of the statistical uncertainty from the numerical integration, the PDF set used, and the theoretical input. [156]

Mµµ [GeV/c²]    y         MMHT2014             MSTW08               NNPDF30              CT14
10.5 – 11.0     2 – 4.5   76.90 +6.27/−4.03    76.20 +7.20/−4.13    70.44 +5.96/−4.42    84.00 +13.14/−6.52
11.0 – 11.5     2 – 4.5   68.04 +6.33/−3.64    68.12 +7.06/−3.75    65.70 +6.19/−4.10    74.48 +11.78/−5.78
11.5 – 12.0     2 – 4.5   61.48 +4.47/−4.03    59.40 +5.05/−4.02    58.36 +4.25/−4.31    65.96 +9.74/−5.64
12.0 – 13.0     2 – 4.5   51.68 +3.55/−4.27    51.54 +3.70/−4.31    50.02 +3.50/−4.55    55.30 +7.75/−5.38
13.0 – 14.0     2 – 4.5   41.38 +2.25/−3.23    41.01 +2.38/−3.28    40.00 +2.20/−3.46    43.84 +5.58/−4.12
14.0 – 15.0     2 – 4.5   33.42 +1.61/−2.97    33.28 +1.75/−3.00    32.45 +1.57/−3.12    35.21 +4.19/−3.60
15.0 – 17.5     2 – 4.5   23.63 +1.20/−1.55    23.13 +1.33/−1.53    23.10 +1.18/−1.69    24.71 +2.78/−2.09
17.5 – 20.0     2 – 4.5   14.68 +0.76/−0.90    14.54 +0.81/−0.89    14.23 +0.76/−0.98    15.42 +1.60/−1.24
20.0 – 25.0     2 – 4.5   8.22 +0.33/−0.39     8.13 +0.35/−0.41     7.91 +0.33/−0.47     8.52 +0.76/−0.48
25.0 – 30.0     2 – 4.5   4.05 +0.23/−0.23     4.06 +0.15/−0.23     3.92 +0.15/−0.25     4.12 +0.51/−0.32
30.0 – 40.0     2 – 4.5   1.79 +0.06/−0.07     1.79 +0.05/−0.07     1.73 +0.06/−0.08     1.84 +0.13/−0.12
40.0 – 60.0     2 – 4.5   0.53 +0.02/−0.02     0.53 +0.01/−0.01     0.52 +0.02/−0.02     0.54 +0.03/−0.03
60.0 – 120.0    2 – 4.5   1.60 +0.04/−0.04     1.59 +0.04/−0.04     1.56 +0.04/−0.04     1.58 +0.07/−0.06

Table A.2: Double-differential cross-section d²σ/(dy dMµµ) in pb/(GeV/c²) as predicted using NNLO FEWZ with the different PDF sets. The uncertainties are the sum in quadrature of the statistical uncertainty from the numerical integration, the PDF set used, and the theoretical input. [156]

Mµµ [GeV/c²]    y             MSTW08               MMHT2014             NNPDF30              CT14
10.5 – 12.0     2.00 – 2.25   2.63 +0.53/−0.18     3.01 +0.54/−0.18     1.61 +0.99/−0.86     2.94 +0.55/−0.21
                2.25 – 2.50   8.08 +0.43/−1.47     8.05 +0.48/−1.47     6.35 +2.14/−2.56     8.60 +0.57/−1.49
                2.50 – 2.75   11.99 +0.94/−2.14    11.34 +1.01/−2.14    11.16 +1.91/−2.73    12.68 +1.12/−2.18
                2.75 – 3.00   14.45 +1.11/−2.21    15.41 +1.34/−2.24    11.35 +2.36/−3.07    15.07 +1.39/−2.26
                3.00 – 3.25   15.89 +1.12/−2.44    13.82 +1.33/−2.45    15.55 +3.70/−4.32    16.81 +1.54/−2.49
                3.25 – 3.50   16.14 +1.08/−2.50    17.12 +1.63/−2.58    7.46 +4.86/−5.39     17.31 +1.70/−2.53
                3.50 – 3.75   13.59 +1.20/−1.88    11.00 +1.62/−1.96    9.84 +2.70/−3.11     14.84 +1.79/−1.89
                3.75 – 4.00   11.06 +1.20/−1.95    6.53 +1.30/−1.95     7.39 +2.44/−2.93     12.65 +1.81/−1.95
                4.00 – 4.50   9.46 +1.03/−1.27     8.83 +1.61/−1.48     11.96 +1.74/−1.95    9.83 +1.65/−1.27
12.0 – 15.0     2.00 – 2.25   3.44 +0.23/−0.26     5.16 +0.90/−0.91     3.20 +0.24/−0.27     3.72 +0.27/−0.29
                2.25 – 2.50   9.44 +0.69/−0.65     9.55 +1.13/−1.09     8.99 +0.71/−0.69     10.01 +0.79/−0.72
                2.50 – 2.75   14.90 +0.76/−1.69    13.98 +1.19/−1.90    13.88 +0.78/−1.73    15.01 +0.98/−1.74
                2.75 – 3.00   17.30 +1.52/−1.22    17.67 +1.93/−1.65    17.74 +1.56/−1.34    18.81 +1.78/−1.35
                3.00 – 3.25   20.02 +0.98/−1.70    20.24 +1.72/−2.13    18.78 +0.95/−1.79    21.10 +1.53/−1.80
                3.25 – 3.50   20.13 +1.34/−2.02    20.83 +2.00/−2.37    18.97 +1.28/−2.10    20.85 +1.89/−2.08
                3.50 – 3.75   17.32 +1.52/−1.29    17.61 +2.29/−1.96    17.04 +1.51/−1.44    18.54 +2.08/−1.33
                3.75 – 4.00   12.66 +0.89/−0.85    11.50 +1.87/−1.74    12.74 +0.96/−1.06    13.80 +1.56/−0.85
                4.00 – 4.50   11.25 +1.13/−0.96    11.41 +1.98/−1.81    11.40 +1.27/−1.22    12.60 +1.77/−0.91
15.0 – 20.0     2.00 – 2.25   2.60 +0.15/−0.17     2.57 +0.14/−0.16     2.28 +0.20/−0.22     2.64 +0.17/−0.19
                2.25 – 2.50   7.12 +0.38/−0.41     7.02 +0.36/−0.39     7.10 +0.44/−0.48     7.29 +0.44/−0.46
                2.50 – 2.75   11.17 +0.53/−1.01    10.90 +0.50/−1.00    11.12 +0.61/−1.08    11.23 +0.65/−1.06
                2.75 – 3.00   13.44 +0.74/−0.70    13.64 +0.75/−0.67    12.78 +0.78/−0.80    14.25 +0.93/−0.81
                3.00 – 3.25   15.40 +0.67/−1.01    14.91 +0.70/−0.99    14.71 +0.70/−1.11    16.00 +0.98/−1.10
                3.25 – 3.50   15.04 +0.78/−0.78    14.92 +0.87/−0.77    14.25 +0.77/−0.90    15.85 +1.12/−0.88
                3.50 – 3.75   13.12 +0.96/−0.90    13.21 +1.09/−0.91    12.80 +0.96/−1.01    13.94 +1.29/−0.95
                3.75 – 4.00   10.05 +0.60/−0.72    9.80 +0.76/−0.73     9.96 +0.63/−0.83     10.64 +0.99/−0.74
                4.00 – 4.50   8.01 +0.51/−0.75     7.26 +0.59/−0.76     8.17 +0.61/−0.88     8.36 +0.91/−0.73
20.0 – 60.0     2.00 – 2.25   2.41 +0.12/−0.13     2.40 +0.11/−0.13     2.40 +0.12/−0.14     2.46 +0.13/−0.14
                2.25 – 2.50   6.78 +0.21/−0.52     6.85 +0.19/−0.52     6.70 +0.22/−0.53     6.94 +0.26/−0.55
                2.50 – 2.75   10.65 +0.27/−0.37    10.59 +0.22/−0.34    9.87 +0.27/−0.39     10.61 +0.35/−0.44
                2.75 – 3.00   13.31 +0.38/−0.48    13.41 +0.35/−0.38    13.14 +0.40/−0.47    13.72 +0.51/−0.52
                3.00 – 3.25   14.92 +0.46/−0.61    14.82 +0.43/−0.59    14.29 +0.45/−0.67    15.25 +0.60/−0.69
                3.25 – 3.50   14.43 +0.56/−0.51    14.45 +0.58/−0.48    13.96 +0.55/−0.58    14.99 +0.72/−0.60
                3.50 – 3.75   12.12 +0.51/−0.44    12.23 +0.56/−0.42    11.85 +0.50/−0.52    12.51 +0.69/−0.50
                3.75 – 4.00   8.81 +0.39/−0.51     8.66 +0.45/−0.50     8.52 +0.38/−0.56     9.28 +0.58/−0.54
                4.00 – 4.50   6.32 +0.25/−0.37     6.08 +0.36/−0.38     6.21 +0.32/−0.42     6.45 +0.50/−0.40

APPENDIX B

MEASURED VALUES OF THE DRELL-YAN CROSS-SECTION

Table B.1: Differential Drell-Yan cross-section as a function of mass. Statistical and Systematic uncertainty quoted separately.

Mµµ [GeV/c²]    y           dσ/dMµµ [pb/(GeV/c²)]   stat     syst
10.5 – 11.0     2.0 – 4.5   71.3                    1.5      4.5
11.0 – 11.5     2.0 – 4.5   66.9                    1.3      3.8
11.5 – 12.0     2.0 – 4.5   61.5                    1.1      3.2
12.0 – 13.0     2.0 – 4.5   52.2                    0.7      2.2
13.0 – 14.0     2.0 – 4.5   42.9                    0.5      1.7
14.0 – 15.0     2.0 – 4.5   33.8                    0.4      1.0
15.0 – 17.5     2.0 – 4.5   22.73                   0.21     0.57
17.5 – 20.0     2.0 – 4.5   14.39                   0.13     0.37
20.0 – 25.0     2.0 – 4.5   8.03                    0.06     0.19
25.0 – 30.0     2.0 – 4.5   3.98                    0.04     0.08
30.0 – 40.0     2.0 – 4.5   1.761                   0.017    0.037
40.0 – 60.0     2.0 – 4.5   0.546                   0.006    0.012
60.0 – 110.0    2.0 – 4.5   1.554                   0.006    0.030


Table B.2: Double-differential Drell-Yan cross-section as a function of mass and rapidity. Statistical and systematic uncertainties are quoted separately.

Mµµ [GeV/c²]    y             d²σ/(dy dMµµ) [pb/(GeV/c²)]   stat     syst
10.5 – 12.0     2.0 – 2.25    9.0                           0.6      0.5
                2.25 – 2.5    27.8                          0.9      1.5
                2.5 – 2.75    42.9                          1.1      3.6
                2.75 – 3.0    42.1                          1.5      2.9
                3.0 – 3.25    43.9                          2.0      5.6
                3.25 – 3.5    42.1                          1.4      3.8
                3.5 – 3.75    36.2                          1.0      2.6
                3.75 – 4.5    15.3                          0.5      0.9
12.0 – 15.0     2.0 – 2.25    5.47                          0.22     0.30
                2.25 – 2.5    18.5                          0.4      0.9
                2.5 – 2.75    22.7                          0.5      1.0
                2.75 – 3.0    26.0                          0.5      1.5
                3.0 – 3.25    27.1                          0.7      1.9
                3.25 – 3.5    25.8                          0.7      1.8
                3.5 – 3.75    25.1                          0.6      1.4
                3.75 – 4.5    9.07                          0.24     0.45
15.0 – 20.0     2.0 – 2.25    2.23                          0.08     0.09
                2.25 – 2.5    6.37                          0.13     0.19
                2.5 – 2.75    9.93                          0.18     0.35
                2.75 – 3.0    12.63                         0.22     0.35
                3.0 – 3.25    12.19                         0.24     0.57
                3.25 – 3.5    11.46                         0.23     0.48
                3.5 – 3.75    9.96                          0.21     0.45
                3.75 – 4.5    4.19                          0.10     0.17
20.0 – 60.0     2.0 – 2.25    0.263                         0.008    0.010
                2.25 – 2.5    0.722                         0.013    0.018
                2.5 – 2.75    1.096                         0.017    0.028
                2.75 – 3.0    1.374                         0.019    0.030
                3.0 – 3.25    1.513                         0.021    0.042
                3.25 – 3.5    1.592                         0.024    0.039
                3.5 – 3.75    1.128                         0.020    0.024
                3.75 – 4.5    0.389                         0.010    0.032

APPENDIX C

INDIVIDUAL FIT RESULTS

Here the results of all baseline fits performed in order to determine the signal yields, as described in Sect. 4.4, are shown. All fits are the results of the last fit step, where the residual signal has already been subtracted from the heavy-flavour template and the two background templates are combined with a fixed fraction. The signal contribution is shown in orange, while the background is shown in blue. The uncertainties on the templates are shown using error bars on the distributions, and the data being fitted are shown on top as black markers.

The first part contains the fits as a function of invariant mass and integrated over the rapidity acceptance of LHCb (2 < y < 4.5), while the second part contains the fits as a function of both invariant mass and rapidity, both in the bins defined in Sect. 4.4.


C.1 As a function of invariant mass

[Figure C.1: six panels, one per mass bin, each for 2.0 < y ≤ 4.5, showing the number of events versus log(isolation) for Signal, Background and Data. Label: LHCb unofficial.]

Figure C.1: Fit result to combined MagUp and MagDown data using the signal template and the background template with fixed fractions, in the mass bins 10.5–11, 11–11.5, 11.5–12, 12–13, 13–14 and 14–15 GeV/c².

[Figure C.2: seven panels, one per mass bin, each for 2.0 < y ≤ 4.5, showing the number of events versus log(isolation) for Signal, Background and Data. Label: LHCb unofficial.]

Figure C.2: Fit result to combined MagUp and MagDown data using the signal template and the background template with fixed fractions, in the mass bins 15–17.5, 17.5–20, 20–25, 25–30, 30–40, 40–60 and 60–110 GeV/c².

C.2 As a function of invariant mass and rapidity

[Figure C.3: eight panels, one per rapidity bin (2.0–2.25, 2.25–2.5, 2.5–2.75, 2.75–3.0, 3.0–3.25, 3.25–3.5, 3.5–3.75 and 3.75–4.5), showing the number of events versus log(isolation) for Signal, Background and Data. Label: LHCb unofficial.]

Figure C.3: Fit result to MagUp and MagDown data using the signal template and the background template with fixed fractions, in the mass bin 10.5 < Mµµ ≤ 12 GeV/c², as a function of rapidity.

[Figure C.4: eight panels, one per rapidity bin (2.0–2.25, 2.25–2.5, 2.5–2.75, 2.75–3.0, 3.0–3.25, 3.25–3.5, 3.5–3.75 and 3.75–4.5), showing the number of events versus log(isolation) for Signal, Background and Data. Label: LHCb unofficial.]

Figure C.4: Fit result to MagUp and MagDown data using the signal template and the background template with fixed fractions, in the mass bin 12 < Mµµ ≤ 15 GeV/c², as a function of rapidity.

[Figure C.5: eight panels, one per rapidity bin (2.0–2.25, 2.25–2.5, 2.5–2.75, 2.75–3.0, 3.0–3.25, 3.25–3.5, 3.5–3.75 and 3.75–4.5), showing the number of events versus log(isolation) for Signal, Background and Data. Label: LHCb unofficial.]

Figure C.5: Fit result to MagUp and MagDown data using the signal template and the background template with fixed fractions, in the mass bin 15 < Mµµ ≤ 20 GeV/c², as a function of rapidity.

[Figure C.6: eight panels, one per rapidity bin (2.0–2.25, 2.25–2.5, 2.5–2.75, 2.75–3.0, 3.0–3.25, 3.25–3.5, 3.5–3.75 and 3.75–4.5), showing the number of events versus log(isolation) for Signal, Background and Data. Label: LHCb unofficial.]

Figure C.6: Fit result to MagUp and MagDown data using the signal template and the background template with fixed fractions, in the mass bin 20 < Mµµ ≤ 60 GeV/c², as a function of rapidity.

APPENDIX D

USING A FEED FORWARD NEURAL NETWORK TO IDENTIFY DRELL-YAN EVENTS AT THE LHCB EXPERIMENT

The analysis as presented here used a classic template fit to the isolation variable in order to separate the Drell-Yan events from any background. However, machine learning currently has a high rate of success in classification problems, and in the end this is a classification problem as well. To explore whether machine learning could also be used for this analysis, the separation of signal events from heavy-flavour background was studied in a Bachelor thesis [157].

As a first step, multiple simple feed-forward neural networks were trained using signal MC for Drell-Yan events and events with $\chi^2_{\mathrm{vtx}}/\mathrm{ndf} > 15$ as heavy-flavour events. The networks differed in the number of hidden layers, the number of neurons per layer and the choice of features used.

The main goal of this approach was to explore whether a neural network could learn to distinguish Drell-Yan events from background relying solely on information from the event itself, without needing to construct, for example, the isolation variable. For this, the momentum and charge of all charged tracks originating from the same PV as the Drell-Yan candidate were given as input, together with the momentum of the two muons and some general event information such as the total number of tracks in the event, the number of tracks from the same PV and the activity in the SPD. Since the input to a neural network generally needs to have a fixed size, only the 150 tracks with the largest transverse momentum were chosen, dropping additional tracks or padding with zeros if necessary. As a comparison, some networks were also trained directly with the momenta of the two muons and their isolation.
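The fixed-size input described above can be sketched as follows. This is not the code used in Ref. [157]: the per-track layout (px, py, pz, charge) and the function name are assumptions made for illustration. Tracks are sorted by transverse momentum, truncated to the requested number and padded with zeros.

```python
import math


def build_track_input(tracks, max_tracks=150):
    """Flatten the highest-pT tracks into a fixed-length feature list.

    tracks: list of (px, py, pz, charge) tuples (assumed layout).
    """
    def transverse_momentum(track):
        px, py, _pz, _charge = track
        return math.hypot(px, py)

    # Keep only the max_tracks tracks with the largest transverse momentum.
    selected = sorted(tracks, key=transverse_momentum, reverse=True)[:max_tracks]
    # Pad with all-zero tracks so the output length is always the same.
    padding = [(0.0, 0.0, 0.0, 0.0)] * (max_tracks - len(selected))
    return [value for track in selected + padding for value in track]


# Two hypothetical tracks, padded to three slots of four values each.
features = build_track_input([(1, 0, 5, 1), (3, 4, 0, -1)], max_tracks=3)
print(len(features), features[:4])
```

Sorting before truncation ensures that the tracks that are dropped are always the softest ones, so the ordering of the input slots carries the same meaning in every event.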

In Fig. D.1 it can be seen that the networks did successfully learn to distinguish the MC events from the heavy-flavour events, with accuracies of 90.13% when using tracks from the same PV and 97.61% when directly using the isolation feature.

(a) Feed-forward neural network using the underlying event. (b) Feed-forward neural network using the isolation variable.

Figure D.1: Distribution of the outputs of the neural networks for signal and heavy-flavour events and the corresponding ROC curves. [157]

To compare this to the results obtained in this thesis, the network was then applied to the data sample (events with $\chi^2_{\mathrm{vtx}}/\mathrm{ndf} < 5$), with the additional requirement of $n_{\mathrm{SPD}} < 600$ to mitigate the effect of the Data-MC difference also visible in Fig. 4.38. The result of this comparison is shown in Fig. D.2. While the networks successfully learned the difference between MC signal events and heavy-flavour events from data, this does not directly translate into being able to predict the signal fraction. However, both at very low mass (which contains mostly heavy-flavour background) and for events under the Z-peak (which are almost entirely signal), the networks give the same results as the template fit.

The two most likely reasons for the difference observed in the intermediate mass regime are that the random/Mis-ID background was completely neglected, and that the Data-MC difference is too large to be overcome by a simple cut on the number of SPD hits, so that at least a reweighting (as described in Section 5.1.1) is also necessary here. In addition, as seen e.g. in Section 4.6.1, there seem to be some unexplained Data-MC discrepancies, which could also be responsible for this.

Nevertheless, this study shows that using machine learning for this analysis in the future is not a hopeless endeavour, even if simple feed-forward neural networks may not be sufficient. Therefore, as a next step a more complex network type should be considered. In Ref. [158] it was shown that a convolutional neural network is able to distinguish di-muon events with different invariant mass from each other when a circle is drawn for each track from the same PV as the Z-candidate, with the centre of the circle given by the η and φ of the track and the radius by ln(pT). Figure D.3 contains two pictures created in this fashion, one for an event under the Z-peak and one for a heavy-flavour candidate. In addition to the kinematics, the identity of the two muons forming the Drell-Yan candidate, the $\chi^2_{\mathrm{IP}}$ and some PID information are encoded in the three colour channels.

(a) Neural network trained using the underlying event directly.

(b) Neural network trained using the isolation variable.

Figure D.2: Difference of the signal and background fraction between the neural network and the template fit for a feed-forward neural network. The uncertainties of the network are estimated from the number of falsely classified events in the validation samples. The uncertainty of the difference assumes full correlation with the results from the template fit. [157]

This setup has the advantage that the input to the neural network is automatically always the same size. Using this approach might lead to a better performance on real data than the approach used in this bachelor thesis.
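The image encoding of Ref. [158] can be illustrated with a small rasterisation sketch. It is not the implementation of Ref. [158]: the function name `draw_event_image`, the image size of 64 x 64 pixels, the radius scale and the restriction to a single channel are assumptions made here, and the axis ranges 2 < η < 5 and −π < φ < π are read off Figure D.3. Tracks are assumed to have pT > 1 in the chosen units so that ln(pT) is positive.

```python
import numpy as np


def draw_event_image(eta, phi, pt, size=64, eta_range=(2.0, 5.0),
                     radius_scale=1.0):
    """Rasterise one circle outline per track into a (size, size) image.

    Centre: (eta, phi) of the track; radius: radius_scale * ln(pt) in pixels.
    Image size, axis ranges and radius scale are illustrative choices.
    """
    image = np.zeros((size, size))
    # Pixel coordinate grids: rows run along phi, columns along eta.
    rows, cols = np.mgrid[0:size, 0:size]
    for e, p, t in zip(eta, phi, pt):
        # Map (eta, phi) to pixel coordinates.
        cx = (e - eta_range[0]) / (eta_range[1] - eta_range[0]) * (size - 1)
        cy = (p + np.pi) / (2.0 * np.pi) * (size - 1)
        radius = radius_scale * np.log(t)
        # Mark pixels whose distance to the centre is within half a pixel
        # of the radius, i.e. the outline of the circle.
        dist = np.hypot(cols - cx, rows - cy)
        image[np.abs(dist - radius) <= 0.5] = 1.0
    return image
```

Whatever the number of tracks in an event, the returned array always has the shape (size, size), which is the property the text refers to. The additional information mentioned above (muon identity, $\chi^2_{\mathrm{IP}}$, PID) would go into separate colour channels, which this sketch omits.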

[Figure D.3; panels: Z candidate, HF candidate; horizontal axis: η from 2 to 5; vertical axis: ϕ from −π to π]

Figure D.3: One circle per track from the same PV as the Z-candidate; x-y: η-ϕ, radius: ln(pT), red: signal candidate muons, green: $\chi^2_{\mathrm{IP}}$, blue: isMuon

APPENDIX E

UNCERTAINTIES OF DIFFERENCES AND RATIOS FOR CORRELATED VARIABLES

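In general, when performing a systematic study, one compares the original estimate of a parameter, $\hat{a}_1$ with a statistical error $\sigma_1$, with a second estimate of the same parameter, $\hat{a}_2$ with a statistical error $\sigma_2$. The difference $\Delta$ between the estimates and its variance are

$$\Delta = \hat{a}_1 - \hat{a}_2, \qquad \sigma_\Delta^2 = \sigma_1^2 + \sigma_2^2 - 2\rho\,\sigma_1\sigma_2, \tag{E.1}$$

with $\rho$ the correlation coefficient, $-1 \le \rho \le +1$. Similarly, for the ratio of the values $r$,

$$r = \frac{\hat{a}_2}{\hat{a}_1}, \qquad \left(\frac{\sigma_r}{r}\right)^2 = \left(\frac{\sigma_1}{\hat{a}_1}\right)^2 + \left(\frac{\sigma_2}{\hat{a}_2}\right)^2 - 2\rho\,\frac{\sigma_1}{\hat{a}_1}\frac{\sigma_2}{\hat{a}_2}. \tag{E.2}$$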
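As a numerical illustration of Eqs. (E.1) and (E.2), the two variances can be evaluated with a few lines of Python. The function names `sigma_difference` and `relative_sigma_ratio` are hypothetical and any input numbers are placeholders, not values from this analysis.

```python
import math


def sigma_difference(sigma1, sigma2, rho):
    """Uncertainty of Delta = a1 - a2 following Eq. (E.1)."""
    variance = sigma1**2 + sigma2**2 - 2.0 * rho * sigma1 * sigma2
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(variance, 0.0))


def relative_sigma_ratio(a1, sigma1, a2, sigma2, rho):
    """Relative uncertainty sigma_r / r of r = a2 / a1 following Eq. (E.2)."""
    rel1 = sigma1 / a1
    rel2 = sigma2 / a2
    variance = rel1**2 + rel2**2 - 2.0 * rho * rel1 * rel2
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(variance, 0.0))
```

For uncorrelated estimates (ρ = 0) the first function reduces to the usual quadrature sum, while for ρ = 1 it returns |σ1 − σ2|.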

There are two simple classes of systematic studies. The first one is the repetition of the analysis with changed conditions, but with the same data samples. Examples are the use of a different fitter, different binning schemes or different fitting variables. In this case the correlation is ρ = 1 and the variances become

$$\sigma_\Delta^2 = \sigma_1^2 + \sigma_2^2 - 2\sigma_1\sigma_2 = (\sigma_1 - \sigma_2)^2, \tag{E.3}$$

$$\left(\frac{\sigma_r}{r}\right)^2 = \left(\frac{\sigma_1}{\hat{a}_1}\right)^2 + \left(\frac{\sigma_2}{\hat{a}_2}\right)^2 - 2\,\frac{\sigma_1}{\hat{a}_1}\frac{\sigma_2}{\hat{a}_2} = \left(\frac{\sigma_1}{\hat{a}_1} - \frac{\sigma_2}{\hat{a}_2}\right)^2. \tag{E.4}$$

The other simple class is when the analysis is repeated on a subsample of the full data (for example binned in some variable like the run number). In this case the correlation between the two samples can be estimated from their individual variances as $\rho = \sigma_1/\sigma_2$ and the variance on $\Delta$ becomes

$$\sigma_\Delta^2 = \sigma_1^2 + \sigma_2^2 - 2\rho\,\sigma_1\sigma_2 = \sigma_2^2 - \sigma_1^2. \tag{E.5}$$
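One way to see where $\rho = \sigma_1/\sigma_2$ comes from is the following short derivation. It rests on an assumption not spelled out in the text: the full-sample estimate $\hat{a}_1$ is taken to be the inverse-variance weighted average of the subsample estimate $\hat{a}_2$ and an independent estimate $\hat{a}_3$ (with uncertainty $\sigma_3$) from the remaining events. The symbols $\hat{a}_3$ and $\sigma_3$ are introduced here for illustration only.

```latex
\begin{align}
  \hat{a}_1 &= \sigma_1^2 \left( \frac{\hat{a}_2}{\sigma_2^2}
               + \frac{\hat{a}_3}{\sigma_3^2} \right),
  \qquad
  \frac{1}{\sigma_1^2} = \frac{1}{\sigma_2^2} + \frac{1}{\sigma_3^2}, \\
  \operatorname{cov}(\hat{a}_1, \hat{a}_2)
            &= \frac{\sigma_1^2}{\sigma_2^2}\,\operatorname{var}(\hat{a}_2)
             = \sigma_1^2, \\
  \rho      &= \frac{\operatorname{cov}(\hat{a}_1, \hat{a}_2)}{\sigma_1 \sigma_2}
             = \frac{\sigma_1}{\sigma_2}.
\end{align}
```

Under this assumption, sample 1 is the full sample and sample 2 the subsample, so that $\sigma_1 \le \sigma_2$ and the variance in Eq. (E.5) is non-negative.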

And lastly, there is the more complicated case of partial overlap. This happens if, for example, different samples which are not independent of each other are used to generate the templates, as is the case for the heavy-flavour background. Both samples have events which are not present in the other, while they also contain events which are present in both. The correlation coefficient in this case is $\rho = \sigma_1\sigma_2/\sigma_{1\cap2}^2$, where $1\cap2$ denotes a sample where both cuts are applied, i.e. the intersection between the two samples. The variance of $\Delta$ becomes

$$\sigma_\Delta^2 = \sigma_1^2 + \sigma_2^2 - 2\rho\,\sigma_1\sigma_2 = \sigma_1^2 + \sigma_2^2 - 2\,\frac{\sigma_1^2\sigma_2^2}{\sigma_{1\cap2}^2}. \tag{E.6}$$

One wants to avoid having to make three measurements, $\hat{a}_1 \pm \sigma_1$, $\hat{a}_2 \pm \sigma_2$ and $\hat{a}_{1\cap2} \pm \sigma_{1\cap2}$, though, because it might be impossible in practice. In order to achieve this, the correlation coefficient can be estimated from the number of events in each sample $N_i$ by making the assumption that $\sigma_i \propto 1/\sqrt{N_i}$. In that case the correlation factor can be estimated to be

$$\rho = \frac{N_{1\cap2}}{\sqrt{N_1 \cdot N_2}}. \tag{E.7}$$

APPENDIX F

INDIVIDUAL CONTRIBUTIONS TO THE SYSTEMATIC UNCERTAINTY

In this appendix the numerical values of the relative contributions to the systematic uncertainty are presented as a function of mass in Table F.1 and as a function of mass and rapidity in Table F.2.

Table F.1: Relative systematic uncertainty due to the different sources in the different mass bins, in %.

Mµµ [GeV/c²]    y            εvtx    Toys    εGEC    Signal  fMIG    L       εreco   HF      Total
10.5 – 11.0     2.0 – 4.5    0.05    0.21    0.16    0.99    0.49    1.16    1.29    5.91    6.27
11.0 – 11.5     2.0 – 4.5    0.06    0.22    0.16    0.27    0.52    1.16    1.29    5.34    5.65
11.5 – 12.0     2.0 – 4.5    0.06    0.41    0.16    0.20    0.55    1.16    1.29    4.79    5.15
12.0 – 13.0     2.0 – 4.5    0.05    0.24    0.16    0.05    0.41    1.16    1.29    3.68    4.10
13.0 – 14.0     2.0 – 4.5    0.05    0.16    0.16    0.31    0.46    1.16    1.29    3.47    3.92
14.0 – 15.0     2.0 – 4.5    0.06    0.14    0.16    0.46    0.51    1.16    1.29    2.13    2.84
15.0 – 17.5     2.0 – 4.5    0.04    0.08    0.16    0.78    0.37    1.16    1.28    1.48    2.44
17.5 – 20.0     2.0 – 4.5    0.05    0.07    0.16    0.78    0.47    1.16    1.28    1.62    2.54
20.0 – 25.0     2.0 – 4.5    0.05    0.03    0.16    0.84    0.44    1.16    1.28    1.31    2.37
25.0 – 30.0     2.0 – 4.5    0.07    0.03    0.16    0.85    0.63    1.16    1.27    0.40    2.07
30.0 – 40.0     2.0 – 4.5    0.07    0.02    0.16    0.69    0.67    1.16    1.27    0.22    1.99
40.0 – 60.0     2.0 – 4.5    0.09    0.03    0.16    0.43    0.88    1.16    1.26    0.03    1.98
60.0 – 110.0    2.0 – 4.5    0.03    0.03    0.16    0.01    0.32    1.16    1.26    0.04    1.75
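The "Total" column of Table F.1 appears to be numerically consistent with adding the individual sources in quadrature. The sketch below checks this for three rows. The quadrature combination is an assumption inferred from the numbers rather than a statement taken from this appendix, and the helper name `quadrature_sum` is illustrative.

```python
import math


def quadrature_sum(values):
    """Combine independent relative uncertainties in quadrature."""
    return math.sqrt(sum(v * v for v in values))


# (individual sources in %, quoted total in %) for three rows of Table F.1
rows = {
    "10.5 - 11.0": ([0.05, 0.21, 0.16, 0.99, 0.49, 1.16, 1.29, 5.91], 6.27),
    "20.0 - 25.0": ([0.05, 0.03, 0.16, 0.84, 0.44, 1.16, 1.28, 1.31], 2.37),
    "60.0 - 110.0": ([0.03, 0.03, 0.16, 0.01, 0.32, 1.16, 1.26, 0.04], 1.75),
}

for mass_bin, (sources, quoted) in rows.items():
    combined = quadrature_sum(sources)
    # The inputs are rounded to two decimals, so small differences are expected.
    print(f"{mass_bin}: combined = {combined:.3f} %, quoted = {quoted:.2f} %")
```

Because the tabulated inputs are rounded, agreement is only expected at the level of about 0.01 to 0.02 percentage points.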


Table F.2: Relative systematic uncertainty in %, individually for each source and combined, as a function of mass and rapidity.

Mµµ [GeV/c²]    y              εvtx    εGEC    Toys    fMIG    L       εreco   Signal  HF      Total
10.5 – 12.0     2.00 – 2.25    0.25    0.16    0.79    2.26    1.16    1.58    3.82    0.95    5.01
                2.25 – 2.50    0.14    0.16    0.28    1.13    1.16    1.57    3.71    3.36    5.50
                2.50 – 2.75    0.10    0.16    0.20    0.88    1.16    1.56    3.66    7.18    8.34
                2.75 – 3.00    0.09    0.16    0.22    0.80    1.16    1.56    2.63    6.11    6.98
                3.00 – 3.25    0.09    0.16    0.61    0.76    1.16    1.57    1.55    12.49   12.78
                3.25 – 3.50    0.08    0.16    0.54    0.75    1.16    1.35    0.33    8.83    9.06
                3.50 – 3.75    0.09    0.16    0.29    0.79    1.16    1.35    3.74    5.69    7.09
                3.75 – 4.50    0.07    0.16    0.46    0.67    1.16    1.34    4.63    3.38    6.06
12.0 – 15.0     2.00 – 2.25    0.23    0.16    0.37    2.06    1.16    1.34    3.12    3.54    5.46
                2.25 – 2.50    0.12    0.16    0.16    1.00    1.16    1.33    2.86    2.96    4.59
                2.50 – 2.75    0.09    0.16    0.14    0.79    1.16    1.30    3.63    1.64    4.42
                2.75 – 3.00    0.08    0.16    0.22    0.71    1.16    1.30    2.05    4.86    5.61
                3.00 – 3.25    0.08    0.16    0.26    0.67    1.16    1.29    0.85    6.85    7.15
                3.25 – 3.50    0.07    0.16    0.33    0.67    1.16    1.28    0.32    6.53    6.81
                3.50 – 3.75    0.08    0.16    0.24    0.70    1.16    1.27    3.98    3.40    5.56
                3.75 – 4.50    0.06    0.16    0.38    0.61    1.16    1.29    4.05    2.17    4.96
15.0 – 20.0     2.00 – 2.25    0.27    0.16    0.15    2.20    1.16    1.28    2.86    1.09    4.16
                2.25 – 2.50    0.13    0.16    0.08    1.16    1.16    1.28    2.14    0.24    3.00
                2.50 – 2.75    0.10    0.16    0.09    0.91    1.16    1.27    2.89    0.46    3.52
                2.75 – 3.00    0.09    0.16    0.23    0.81    1.16    1.25    1.52    1.32    2.78
                3.00 – 3.25    0.08    0.16    0.30    0.76    1.16    1.29    0.75    4.15    4.64
                3.25 – 3.50    0.08    0.16    0.18    0.76    1.16    1.29    0.43    3.69    4.17
                3.50 – 3.75    0.09    0.16    0.17    0.79    1.16    1.28    3.10    2.68    4.52
                3.75 – 4.50    0.07    0.16    0.20    0.70    1.16    1.26    3.22    1.63    4.06
20.0 – 60.0     2.00 – 2.25    0.25    0.16    0.13    3.01    1.16    1.25    1.66    0.30    3.86
                2.25 – 2.50    0.13    0.16    0.06    1.36    1.16    1.29    1.06    0.07    2.46
                2.50 – 2.75    0.11    0.16    0.08    0.98    1.16    1.29    1.55    0.47    2.58
                2.75 – 3.00    0.09    0.16    0.05    0.82    1.16    1.28    0.90    0.37    2.16
                3.00 – 3.25    0.08    0.16    0.07    0.78    1.16    1.27    0.39    2.01    2.79
                3.25 – 3.50    0.08    0.16    0.07    0.78    1.16    1.25    0.17    1.57    2.46
                3.50 – 3.75    0.09    0.16    0.23    0.82    1.16    1.30    0.94    0.00    2.16
                3.75 – 4.50    0.08    0.16    0.11    0.75    1.16    1.30    7.84    1.41    8.19

APPENDIX G

CORRELATION WITH Z-MEASUREMENT

Some of the uncertainties of this analysis are correlated with the uncertainties of the dedicated measurement of the Z cross-section [128], see Section 6.1. As a conservative estimate, the correlation matrix shown in Fig. G.1 is used.

[Figure G.1; axis labels, DY: Stat., Signal, Background, Toys, εreco, GEC, χ²vtx, Bin mig., Mass range, Luminosity; Z: Stat., Purity, Tracking, muon ID, Trigger, GEC, Bin mig., Beam energy, Luminosity]

Figure G.1: Correlation of the individual uncertainty components between the DY and the Z-analysis. Black entries denote an assumed full correlation, which is a conservative estimate.

BIBLIOGRAPHY

[1] E. Noether, Invariant Variation Problems, Gott. Nachr. 1918 (1918) 235, arXiv:physics/0503066, [Transp. Theory Statist. Phys.1,186(1971)].

[2] Particle Data Group, M. Tanabashi et al., Review of particle physics, Phys. Rev. D98 (2018) 030001.

[3] F. Capozzi et al., Neutrino masses and mixings: Status of known and unknown 3ν parameters, Nuclear Physics B 908 (2016) 218–234, arXiv:1601.07777.

[4] A. G. Riess et al., Observational evidence from supernovae for an accelerating universe and a cosmological constant, The Astronomical Journal 116 (1998) 1009, arXiv:astro-ph/9805201.

[5] S. Perlmutter et al., Measurements of Ω and Λ from 42 high-redshift supernovae, The Astrophysical Journal 517 (1999) 565, arXiv:astro-ph/9812133.

[6] BELLE collaboration, S. K. Choi et al., Observation of a resonance-like structure in the π±ψ′ mass distribution in exclusive B → Kπ±ψ′ decays, Phys. Rev. Lett. 100 (2008) 142001, arXiv:0708.1790.

[7] LHCb collaboration, R. Aaij et al., Observation of the resonant character of the Z(4430)− state, Phys. Rev. Lett. 112 (2014) 222002, arXiv:1404.1903.

[8] LHCb collaboration, R. Aaij et al., Observation of J/ψp resonances consistent with pentaquark states in Λ0b → J/ψpK− decays, Phys. Rev. Lett. 115 (2015) 072001, arXiv:1507.03414.

[9] LHCb collaboration, R. Aaij et al., Observation of a narrow pentaquark state, Pc(4312)+, and of two-peak structure of the Pc(4450)+, Phys. Rev. Lett. 122 (2019) 222001, arXiv:1904.03947.

[10] WASA-at-COSY collaboration, P. Adlarson et al., Evidence for a new resonance from polarized neutron-proton scattering, Phys. Rev. Lett. 112 (2014) 202301, arXiv:1402.6844.


[11] E598 collaboration, J. J. Aubert et al., Experimental Observation of a Heavy Particle J, Phys. Rev. Lett. 33 (1974) 1404.

[12] SLAC-SP-017 collaboration, J. E. Augustin et al., Discovery of a Narrow Resonance in e+e− Annihilation, Phys. Rev. Lett. 33 (1974) 1406, [Adv. Exp. Phys.5,141(1976)].

[13] S. L. Glashow, J. Iliopoulos, and L. Maiani, Weak Interactions with Lepton-Hadron Symmetry, Phys. Rev. D2 (1970) 1285.

[14] S. W. Herb et al., Observation of a Dimuon Resonance at 9.5-GeV in 400-GeV Proton-Nucleus Collisions, Phys. Rev. Lett. 39 (1977) 252.

[15] PLUTO, C. Berger et al., Jet Analysis of the Υ (9.46) Decay Into Charged Hadrons, Phys. Lett. 82B (1979) 449.

[16] PLUTO, C. Berger et al., Topology of the Υ Decay, Z. Phys. C8 (1981) 101.

[17] UA1 collaboration, G. Arnison et al., Experimental Observation of Lepton Pairs of Invariant Mass Around 95-GeV/c2 at the CERN SPS Collider, Phys. Lett. 126B (1983) 398.

[18] UA2 collaboration, P. Bagnaia et al., Evidence for Z0 → e+e− at the CERN pp̄ Collider, Phys. Lett. 129B (1983) 130.

[19] ATLAS collaboration, G. Aad et al., Observation of a new particle in the search for the Standard Model Higgs boson with the ATLAS detector at the LHC, Phys. Lett. B716 (2012) 1, arXiv:1207.7214.

[20] CMS collaboration, S. Chatrchyan et al., Observation of a new boson at a mass of 125 GeV with the CMS experiment at the LHC, Phys. Lett. B716 (2012) 30, arXiv:1207.7235.

[21] P. M. Watkins, DISCOVERY OF THE W AND Z BOSONS, Contemp. Phys. 27 (1986) 291.

[22] S. L. Glashow, Partial-symmetries of weak interactions, Nuclear Physics 22 (1961) 579.

[23] A. Salam and J. C. Ward, Electromagnetic and weak interactions, Physics Letters 13 (1964) 168.

[24] S. Weinberg, A model of leptons, Phys. Rev. Lett. 19 (1967) 1264.

[25] P. W. Higgs, Broken symmetries, massless particles and gauge fields, Phys. Lett. 12 (1964) 132.

[26] P. W. Higgs, Spontaneous Symmetry Breakdown without Massless Bosons, Phys. Rev. 145 (1966) 1156.

[27] F. Englert and R. Brout, Broken Symmetry and the Mass of Gauge Vector Mesons, Phys. Rev. Lett. 13 (1964) 321.

[28] G. S. Guralnik, C. R. Hagen, and T. W. B. Kibble, Global Conservation Laws and Massless Particles, Phys. Rev. Lett. 13 (1964) 585.

[29] Stack Exchange network, Jake, Answer to: How to draw a ’Mexican hat potential’ using 3dplot?, https://tex.stackexchange.com/a/95252, Accessed at 21. Aug 2019.

[30] S. Dawson, Introduction to electroweak symmetry breaking, in Proceedings, Summer School in High-energy physics and cosmology: Trieste, Italy, June 29-July 17, 1998, pp. 1–83, 1998, arXiv:hep-ph/9901280.

[31] A. Pich, Electroweak Symmetry Breaking and the Higgs Boson, Acta Phys. Polon. B47 (2016) 151, arXiv:1512.08749.

[32] LHCb collaboration, R. Aaij et al., Measurement of the forward-backward asymmetry in Z/γ∗ → µ+µ− decays and determination of the effective weak mixing angle, JHEP 11 (2015) 190, arXiv:1509.07645.

[33] H. Yukawa, On the Interaction of Elementary Particles I, Proc. Phys. Math. Soc. Jap. 17 (1935) 48, [Prog. Theor. Phys. Suppl.1,1(1935)].

[34] I. J. R. Aitchison and A. J. G. Hey, Non-Abelian gauge theories: QCD and the electroweak theory, vol. 2 of Gauge theories in particle physics: A practical introduction, Taylor & Francis Group, New York, NY 10016, 3 ed., 2004.

[35] M. Gell-Mann, Symmetries of baryons and mesons, Phys. Rev. 125 (1962) 1067.

[36] R. K. Ellis, W. J. Stirling, and B. R. Webber, QCD and Collider Physics, Camb. Monogr. Part. Phys. Nucl. Phys. Cosmol. 8 (1996) 1.

[37] D. J. Griffiths, Introduction to elementary particles, Wiley, New York, USA, 1987.

[38] T. Plehn, Lectures on LHC Physics, vol. 886 of Lect. Notes Phys., Springer, 2015.

[39] L. A. Harland-Lang, A. D. Martin, P. Motylinski, and R. S. Thorne, Parton distributions in the LHC era: MMHT 2014 PDFs, Eur. Phys. J. C75 (2015) 204, arXiv:1412.3989.

[40] Y. L. Dokshitzer, Calculation of the Structure Functions for Deep Inelastic Scattering and e+ e- Annihilation by Perturbation Theory in ., Sov. Phys. JETP 46 (1977) 641, [Zh. Eksp. Teor. Fiz.73,1216(1977)].

[41] V. N. Gribov and L. N. Lipatov, e+ e- pair annihilation and deep inelastic e p scattering in perturbation theory, Sov. J. Nucl. Phys. 15 (1972) 675, [Yad. Fiz.15,1218(1972)].

[42] L. N. Lipatov, The parton model and perturbation theory, Sov. J. Nucl. Phys. 20 (1975) 94, [Yad. Fiz.20,181(1974)].

[43] G. Altarelli and G. Parisi, Asymptotic Freedom in Parton Language, Nucl. Phys. B126 (1977) 298.

[44] R. J. Wallace and R. McNulty, A precise measurement of the Z boson cross-section and a test of the Standard Model using the LHCb detector, PhD thesis, University College Dublin, Nov, 2015, Presented 16 Oct 2015, CERN-THESIS-2015-223.

[45] L. N. Lipatov, Reggeization of the Vector Meson and the Vacuum Singularity in Non-abelian Gauge Theories, Sov. J. Nucl. Phys. 23 (1976) 338, [Yad. Fiz.23,642(1976)].

[46] E. A. Kuraev, L. N. Lipatov, and V. S. Fadin, Multi - reggeon processes in the yang-mills theory, Sov. Phys. JETP 44 (1976) 443.

[47] I. I. Balitsky and L. N. Lipatov, The pomeranchuk singularity in quantum chromodynamics, Sov. J. Nucl. Phys. 28 (1978) 822.

[48] G. Chachamis, BFKL phenomenology, in Proceedings, New Trends in High-Energy Physics and QCD: Natal, Rio Grande do Norte, Brazil, October 21 Oct - November 06, 2014, (2016), pp. 4–24, 2016, arXiv:1512.04430.

[49] F. De Lorenzi, Parton Distribution Function Studies and a Measurement of Drell-Yan Produced Muon Pairs at LHCb, PhD thesis, University College Dublin, May, 2011, Presented 21 Mar 2011, CERN-THESIS-2011-237.

[50] A. D. Martin, W. J. Stirling, R. S. Thorne, and G. Watt, Parton distributions for the LHC, Eur. Phys. J. C63 (2009) 189, arXiv:0901.0002.

[51] P. L. Chebyshev, Théorie des mécanismes connus sous le nom de parallélogrammes, in Mémoires des Savants étrangers présentés à l’Académie de Saint-Pétersbourg, vol. 7, pp. 539–586, 1854.

[52] A. D. Martin et al., Extended Parameterisations for MSTW PDFs and their effect on Lepton Charge Asymmetry from W Decays, Eur. Phys. J. C73 (2013) 2318, arXiv:1211.1215.

[53] NNPDF, R. D. Ball et al., Parton distributions for the LHC Run II, JHEP 04 (2015) 040, arXiv:1410.8849.

[54] S. Dulat et al., New parton distribution functions from a global analysis of quantum chromodynamics, Phys. Rev. D93 (2016) 033006, arXiv:1506.07443.

[55] S. D. Drell and T.-M. Yan, Massive Lepton Pair Production in Hadron-Hadron Collisions at High-Energies, Phys. Rev. Lett. 25 (1970) 316, Erratum: Phys. Rev. Lett. 25 (1970) 902.

[56] L. Tomlinson, The resummation of the low-phistar domain of Z production, PoS DIS2013 (2013) 130, arXiv:1306.0919.

[57] D. H. Perkins, Introduction to high energy physics, Cambridge Univ. Pr., Cambridge, UK, 2000 ed., 1982.

[58] Stack Exchange network, jub0bs, Answer to: How to draw and annotate a spherical coordinate system, https://tex.stackexchange.com/a/116215/51399, Accessed at 22. Aug 2019.

[59] I. Neutelings, CMS Wiki Pages, https://wiki.physik.uzh.ch/cms/latex:exampe_eta, Accessed at 22. Aug 2019.

[60] R. Gavin, Y. Li, F. Petriello, and S. Quackenbush, FEWZ 2.0: A code for hadronic Z production at next-to-next-to-leading order, Comput. Phys. Commun. 182 (2011) 2388, arXiv:1011.3540.

[61] Y. Li and F. Petriello, Combining QCD and electroweak corrections to dilepton production in FEWZ, Phys. Rev. D86 (2012) 094034, arXiv:1208.5967.

[62] J. M. Keaveney, A measurement of the Z cross-section at LHCb, PhD thesis, University College Dublin, Aug, 2011, Presented 17 Oct 2011, CERN-THESIS-2011-202.

[63] J. Campbell, J. Huston, and F. Krauss, The black book of quantum chromodynamics: a primer for the LHC era, Oxford University Press, Oxford, 2018.

[64] J. S. Schwinger, On radiation by electrons in a betatron: Transcription of a paper by J. Schwinger, 1945,.

[65] L. R. Evans and P. Bryant, LHC Machine, JINST 3 (2008) S08001. 164 p, This report is an abridged version of the LHC Design Report (CERN-2004-003).

[66] CERN, The accelerator complex, https://home.cern/about/accelerators, Accessed at 24. Oct 2018.

[67] E. Mobs, The CERN accelerator complex. Complexe des accélérateurs du CERN, https://cds.cern.ch/record/2197559.

[68] CERN, LHC Guide, CERN-Brochure-2017-002-Eng, Mar, 2017.

[69] T. Pieloni, A study of beam-beam effects in hadron colliders with a large number of bunches, PhD thesis, Ecole Polytechnique, Lausanne, 2008, Presented on 4 Dec 2008, CERN-THESIS-2010-056.

[70] CERN, LHC Accelerator Performance and Statistics, https://acc-stats.web.cern.ch/acc-stats/#lhc/super-table, Accessed at 16. Sept 2018.

[71] ATLAS collaboration, G. Aad et al., The ATLAS Experiment at the CERN Large Hadron Collider, JINST 3 (2008) S08003.

[72] ALICE collaboration, K. Aamodt et al., The ALICE experiment at the CERN LHC, JINST 3 (2008) S08002.

[73] CMS collaboration, S. Chatrchyan et al., The CMS Experiment at the CERN LHC, JINST 3 (2008) S08004.

[74] LHCb collaboration, A. A. Alves Jr. et al., The LHCb detector at the LHC, JINST 3 (2008) S08005.

[75] LHCb collaboration, CMS collaboration, S. Farry, Forward EW Physics at the LHC, in Proceedings, 3rd Large Hadron Collider Physics Conference (LHCP 2015): St. Petersburg, Russia, August 31-September 5, 2015, pp. 254–265, 2016, arXiv:1602.09006.

[76] LHCb collaboration, C. Elsasser, bb production angle plots, https://lhcb.web.cern.ch/lhcb/speakersbureau/html/bb_ProductionAngles.html, Accessed at 7. Jan 2019.

[77] LHCb collaboration, R. Aaij et al., LHCb detector performance, Int. J. Mod. Phys. A30 (2015) 1530022, arXiv:1412.6352.

[78] LHCb collaboration, LHCb Operations Plots Webpage, https://lbgroups.cern.ch/online/OperationsPlots/index.htm, Accessed at 3. Dec 2018.

[79] LHCb collaboration, LHCb muon system: Technical Design Report, CERN-LHCC-2001-010.

[80] LHCb Collaboration, LHCb muon system: second addendum to the Technical Design Report, CERN-LHCC-2005-012.

[81] G. Sabatino et al., Cluster size measurements for the LHCb Muon System M5R4 MWPCs using cosmic rays, Tech. Rep. LHCb-2006-011. CERN-LHCb-2006-011, CERN, Geneva, Mar, 2006. revised version submitted on 2006-04-04 16:24:37.

[82] M. Adinolfi et al., Performance of the LHCb RICH detector at the LHC, Eur. Phys. J. C73 (2013) 2431, arXiv:1211.6759.

[83] LHCb collaboration, LHCb VELO Upgrade Technical Design Report, CERN-LHCC-2013-021.

[84] LHCb collaboration, LHCb PID Upgrade Technical Design Report, CERN-LHCC-2013-022.

[85] LHCb collaboration, LHCb Tracker Upgrade Technical Design Report, CERN-LHCC-2014-001.

[86] LHCb collaboration, LHCb Trigger and Online Technical Design Report, CERN-LHCC-2014-016.

[87] LHCb collaboration, Trigger Schemes, http://lhcb.web.cern.ch/lhcb/speakersbureau/html/TriggerScheme.html, Accessed at 7. Jan 2019.

[88] LHCb collaboration, R. Aaij et al., Precision luminosity measurements at LHCb, JINST 9 (2014) P12005, arXiv:1410.0149.

[89] A. Puig, The LHCb trigger in 2011 and 2012, LHCb-PUB-2014-046.

[90] R. E. Kalman, A New Approach to Linear Filtering and Prediction Problems, Journal of Basic Engineering 82 (1960) 35.

[91] Frühwirth, R. and Bock, R. K., Data analysis techniques for high-energy physics experiments, Camb. Monogr. Part. Phys. Nucl. Phys. Cosmol. 11 (2000) 1.

[92] T. Sjöstrand, S. Mrenna, and P. Skands, A brief introduction to PYTHIA 8.1, Comput. Phys. Commun. 178 (2008) 852, arXiv:0710.3820.

[93] T. Sjöstrand, S. Mrenna, and P. Skands, PYTHIA 6.4 physics and manual, JHEP 05 (2006) 026, arXiv:hep-ph/0603175.

[94] I. Belyaev et al., Handling of the generation of primary events in Gauss, the LHCb simulation framework, J. Phys. Conf. Ser. 331 (2011) 032047.

[95] D. J. Lange, The EvtGen particle decay simulation package, Nucl. Instrum. Meth. A462 (2001) 152.

[96] P. Golonka and Z. Was, PHOTOS Monte Carlo: A precision tool for QED corrections in Z and W decays, Eur. Phys. J. C45 (2006) 97, arXiv:hep-ph/0506026.

[97] Geant4 collaboration, J. Allison et al., Geant4 developments and applications, IEEE Trans. Nucl. Sci. 53 (2006) 270; Geant4 collaboration, S. Agostinelli et al., Geant4: A simulation toolkit, Nucl. Instrum. Meth. A506 (2003) 250.

[98] M. Clemencic et al., The LHCb simulation application, Gauss: Design, evolution and experience, J. Phys. Conf. Ser. 331 (2011) 032023.

[99] C. Møller, General properties of the characteristic matrix in the theory of elementary particles, K. Danske Vidensk. Selsk. Mat. -Fys. Medd. 23 (1945).

[100] C. Barschel and M. Ferro-Luzzi, Precision luminosity measurement at LHCb with beam-gas imaging, PhD thesis, RWTH Aachen, 2014, Presented 05 Mar 2014, CERN-THESIS-2013-301.

[101] A. W. Chao, K. H. Mess, M. Tigner, and F. Zimmermann, eds., Handbook of accelerator physics and engineering, World Scientific, Hackensack, USA, 2013.

[102] W. Herr and B. Muratori, Concept of luminosity, in Intermediate accelerator physics. Proceedings, CERN Accelerator School, Zeuthen, Germany, September 15-26, 2003, pp. 361–377, 9, 2003.

[103] S. van der Meer, Calibration of the effective beam height in the ISR, Tech. Rep. CERN-ISR-PO-68-31. ISR-PO-68-31, CERN, Geneva, 1968.

[104] V. Balagura, Notes on van der Meer Scan for Absolute Luminosity Measurement, Nucl. Instrum. Meth. A654 (2011) 634, arXiv:1103.1129.

[105] D. Belohrad, S. Longo, P. Odier, and S. Thoulet, Mechanical Design of the Intensity Measurement Devices for the LHC, Tech. Rep. CERN-AB-2007-026, CERN, Geneva, 2007.

[106] D. Belohrad et al., Implementation of the Electronics Chain for the Bunch by Bunch Intensity Measurement Devices for the LHC, Tech. Rep. CERN-BE-2009-018, CERN, Geneva, May, 2009.

[107] M. Ferro-Luzzi, Proposal for an absolute luminosity determination in colliding beam experiments using vertex detection of beam-gas interactions, Nucl. Instrum. Meth. A553 (2005) 388.

[108] LHCb collaboration, R. Aaij et al., Measurement of antiproton production in pHe collisions at √sNN = 110 GeV, Phys. Rev. Lett. 121 (2019) 222001, arXiv:1808.06127.

[109] LHCb collaboration, R. Aaij et al., First measurement of charm production in fixed-target configuration at the LHC, Phys. Rev. Lett. 122 (2019) 132002, arXiv:1810.07907.

[110] LHCb, A. Poluektov, First Look at 13 TeV Data and Highlights from the Most Recent Analyses, in Proceedings, 3rd Large Hadron Collider Physics Conference (LHCP 2015): St. Petersburg, Russia, August 31-September 5, 2015, (Gatchina), pp. 13–17, Kurchatov Institute, Kurchatov Institute, 2016.

[111] LHCb collaboration, A. Weiden et al., Preliminary luminosity calibration at √s = 13 TeV (p-p), LHCb-ANA-2015-036, (internal analysis note).

[112] LHCb collaboration, M. Schmelling, Luminosity calibration for the leading bunch nobias-triggered data from the 2015 early measurement runs, LHCb-INT-2017-015. CERN-LHCb-INT-2017-015, (internal note).

[113] LHCb collaboration, R. Aaij et al., Measurement of the inelastic pp cross-section at a centre-of-mass energy of √s = 13 TeV, JHEP 06 (2018) 100, arXiv:1803.10974.

[114] J. H. Christenson et al., Observation of massive muon pairs in hadron collisions, Phys. Rev. Lett. 25 (1970) 1523.

[115] G. Moreno et al., Dimuon production in proton - copper collisions at √s = 38.8-GeV, Phys. Rev. D43 (1991) 2815.

[116] NuSea collaboration, J. C. Webb et al., Absolute Drell-Yan dimuon cross-sections in 800 GeV / c pp and pd collisions, arXiv:hep-ex/0302019.

[117] J. C. Webb, Measurement of continuum dimuon production in 800-GeV/C proton nucleon collisions, PhD thesis, New Mexico State U., 2003, arXiv:hep-ex/0301031, doi: 10.2172/1155678, FERMILAB-THESIS-2002-56.

[118] ATLAS collaboration, G. Aad et al., Measurement of the low-mass Drell-Yan differential cross section at √s = 7 TeV using the ATLAS detector, JHEP 06 (2014) 112, arXiv:1404.1212.

[119] CMS collaboration, S. Chatrchyan et al., Measurement of the Drell-Yan Cross Section in pp Collisions at √s = 7 TeV, JHEP 10 (2011) 007, arXiv:1108.0566.

[120] ATLAS, M. Aaboud et al., Measurement of the Drell-Yan triple-differential cross section in pp collisions at √s = 8 TeV, JHEP 12 (2017) 059, arXiv:1710.05167.

[121] CMS, V. Khachatryan et al., Measurements of differential and double-differential Drell-Yan cross sections in proton-proton collisions at 8 TeV, Eur. Phys. J. C 75 (2015) 147, arXiv:1412.1115.

[122] CMS, A. M. Sirunyan et al., Measurement of the differential Drell-Yan cross section in proton-proton collisions at √s = 13 TeV, JHEP 12 (2019) 059, arXiv:1812.10529.

[123] N. Chiapolini, Low-Mass Drell-Yan Cross-Section Measurements with the LHCb Experiment, PhD thesis, University Zurich, Nov, 2014, CERN-THESIS-2014-299.

[124] LHCb collaboration, J. Anderson and K. Müller, Inclusive low mass Drell-Yan production in the forward region at √s = 7 TeV, LHCb-CONF-2012-013.

[125] LHCb collaboration, R. Aaij et al., Measurement of the cross-section for Z → e+e− production in pp collisions at √s = 7 TeV, JHEP 02 (2013) 106, arXiv:1212.4620.

[126] LHCb collaboration, R. Aaij et al., Measurement of the forward Z boson cross-section in pp collisions at √s = 7 TeV, JHEP 08 (2015) 039, arXiv:1505.07024.

[127] LHCb collaboration, R. Aaij et al., A study of the Z production cross-section in pp collisions at √s = 7 TeV using tau final states, JHEP 01 (2013) 111, arXiv:1210.6289.

[128] LHCb collaboration, R. Aaij et al., Measurement of forward W and Z boson production in pp collisions at √s = 8 TeV, JHEP 01 (2016) 155, arXiv:1511.08039.

[129] LHCb collaboration, R. Aaij et al., Measurement of Z → e+e− production at √s = 8 TeV, JHEP 05 (2015) 109, arXiv:1503.00963.

[130] LHCb collaboration, R. Aaij et al., Measurement of the forward Z boson production cross-section in pp collisions at √s = 13 TeV, JHEP 09 (2016) 136, arXiv:1607.06495.

[131] J. S. Anderson and R. McNulty, Testing the electroweak sector and determining the absolute luminosity at LHCb using dimuon final states, PhD thesis, University College Dublin, Nov, 2008, Presented on 18 Nov 2008, CERN-THESIS-2009-020.

[132] P. Dierckx, An algorithm for smoothing, differentiation and integration of experimental data using spline functions, Journal of Computational and Applied Mathematics 1 (1975) 165.

[133] P. Dierckx, A fast algorithm for smoothing data on a rectangular grid while using spline functions, SIAM J. Numer. Anal. 19 (1982) 1286–1304.

[134] P. Dierckx, Curve and Surface Fitting with Splines, Clarendon Press, 1995.

[135] R. Brun and F. Rademakers, ROOT: An object oriented data analysis framework, Nucl. Instrum. Meth. A389 (1997) 81.

[136] R. J. Barlow and C. Beeston, Fitting using finite Monte Carlo samples, Comput. Phys. Commun. 77 (1993) 219.

[137] A. Nappi, A Pitfall in the use of extended likelihood for fitting fractions of pure samples in mixed samples, Comput. Phys. Commun. 180 (2009) 269, arXiv:0803.2711.

[138] D. Martínez Santos and F. Dupertuis, Mass distributions marginalized over per-event errors, Nucl. Instrum. Meth. A764 (2014) 150, arXiv:1312.5000.

[139] B. Efron and R. J. Tibshirani, An introduction to the bootstrap, Mono. Stat. Appl. Probab., Chapman and Hall, 1993.

[140] M. Waskom et al., mwaskom/seaborn: v0.9.0 (july 2018), July, 2018. doi: 10.5281/zenodo.1313201.

[141] S. Schmitt, Data Unfolding Methods in High Energy Physics, EPJ Web Conf. 137 (2017) 11008, arXiv:1611.01927.

[142] S. Schmitt, TUnfold: an algorithm for correcting migration effects in high energy physics, JINST 7 (2012) T10003, arXiv:1205.6201.

[143] A. Hocker and V. Kartvelishvili, SVD approach to data unfolding, Nucl. Instrum. Meth. A372 (1996) 469, arXiv:hep-ph/9509307.

[144] V. Blobel, An Unfolding method for high-energy physics experiments, in Advanced Statistical Techniques in Particle Physics. Proceedings, Conference, Durham, UK, March 18-22, 2002, pp. 258–267, 2002, arXiv:hep-ex/0208022. http://www.ippp.dur.ac.uk/Workshops/02/statistics/proceedings//blobel2.pdf.

[145] G. D’Agostini, A Multidimensional unfolding method based on Bayes’ theorem, Nucl. Instrum. Meth. A362 (1995) 487.

[146] G. D’Agostini, Improved iterative Bayesian unfolding, arXiv:1010.0632.

[147] G. Choudalakis, Fully Bayesian Unfolding, arXiv:1201.4612.

[148] A. N. Tikhonov, Solution of incorrectly formulated problems and the regularization method, Soviet Math. Dokl. 4 (1963) 1035.

[149] V. Blobel, Unfolding Methods in Particle Physics, in Proceedings, PHYSTAT 2011 Workshop on Statistical Issues Related to Discovery Claims in Search Experiments and Unfolding, CERN, Geneva, Switzerland 17-20 January 2011, (Geneva), pp. 240–251, CERN, 2011. doi: 10.5170/CERN-2011-006.240, https://cds.cern.ch/record/2203257.

[150] M. De Cian, Track Reconstruction Efficiency and Analysis of B0 → K∗0µ+µ− at the LHCb Experiment, PhD thesis, University Zurich, Sep, 2013, Presented 14 Mar 2013, CERN-THESIS-2013-145.

[151] LHCb collaboration, S. Farry and N. Chiapolini, A measurement of high-pT muon reconstruction efficiencies in 2011 and 2012 data, LHCb-INT-2014-030. CERN-LHCb-INT-2014-030, (internal note).

[152] J. Van Tilburg, Track simulation and reconstruction in LHCb, PhD thesis, Vrije Universiteit Amsterdam, 2005, Presented on 01 Sep 2005, CERN-THESIS-2005-040.

[153] W. Verkerke and D. P. Kirkby, The RooFit toolkit for data modeling, eConf C0303241 (2003) MOLT007, arXiv:physics/0306116.

[154] R. S. Thorne, A. D. Martin, W. J. Stirling, and G. Watt, Parton Distributions and QCD at LHCb, in Proceedings, 16th International Workshop on Deep Inelastic Scattering and Related Subjects (DIS 2008): London, UK, April 7-11, 2008, p. 30, 2008, arXiv:0808.1847. doi: 10.3360/dis.2008.30.

[155] M. L. Mangano and J. Rojo, Cross Section Ratios between different CM energies at the LHC: opportunities for precision measurements and BSM sensitivity, JHEP 08 (2012) 010, arXiv:1206.3557.

[156] K. Müller, private communication.

[157] T. Neuer, Using a feed forward neural network to identify Drell-Yan events at the LHCb experiment, Bachelor’s thesis, University Zurich, Aug, 2018. https://www.physik.uzh.ch/dam/jcr:8c51b680-19f3-4c0a-8b6c-2e61b5582681/Bachelor_Thomas_Neuer.pdf.

[158] C. Fernández Madrazo, I. Heredia Cacha, L. Lloret Iglesias, and J. Marco de Lucas, Application of a Convolutional Neural Network for image classification to the analysis of collisions in High Energy Physics, arXiv:1708.07034.