
CERN-THESIS-2018-044 (26/03/2018)

Evidence for the production of a Higgs boson in association with two top quarks with the ATLAS detector

A search in the $H \to b\bar{b}$ channel and in combination with other Higgs boson decays at $\sqrt{s} = 13$ TeV

A thesis submitted to the University of Manchester for the degree of Doctor of Philosophy in the Faculty of Science and Engineering

2018

John Andrew Raine

School of Physics and Astronomy

“It is not our part to master all the tides of the world, but to do what is in us for the succour of those years wherein we are set, uprooting the evil in the fields that we know, so that those who live after may have clean earth to till. What weather they shall have is not ours to rule.”

- Gandalf the White

Contents

1 Introduction

2 The Standard Model
    2.1 Overview of the Standard Model
    2.2 Theoretical Motivation for Measuring $t\bar{t}H$
    2.3 Monte Carlo Simulation

3 Statistical Methods
    3.1 Statistical Analysis
    3.2 Boosted Decision Trees

4 LHC and the ATLAS Experiment
    4.1 The Large Hadron Collider
    4.2 The ATLAS Detector

5 Object Reconstruction

6 Modelling and Event Selection
    6.1 Signal and Background Modelling
    6.2 Event Preselection

7 Search for $t\bar{t}H\,(H \to b\bar{b})$ at 13 TeV
    7.1 Analysis Strategy
    7.2 Systematic Uncertainties

8 Results
    8.1 Search for $t\bar{t}H\,(H \to b\bar{b})$
    8.2 Combination with other Searches

9 Colour Connection of b-quarks
    9.1 Phenomenological Motivation
    9.2 Colour Flow in $t\bar{t}H\,(H \to b\bar{b})$

10 Conclusion

References

Total Word Count: 44600

Abstract

In this thesis, the search for the production of the Higgs boson in association with two top quarks is presented. The main focus of this work is on the analysis optimised for the decay of the Higgs boson to a b-quark pair. The analysis is performed using 36.1 fb$^{-1}$ of $pp$ collision data at a centre of mass energy $\sqrt{s} = 13$ TeV collected by the ATLAS detector at the Large Hadron Collider during 2015 and 2016. The signal strength of $t\bar{t}H$ in relation to the Standard Model prediction for a Higgs boson with a mass of 125 GeV is measured to be $\mu_{t\bar{t}H} = 0.87^{+0.64}_{-0.61}$, with signal strengths greater than 2.0 excluded at the 95% confidence level.

The combination of this analysis with searches targeting additional Higgs boson decay modes is subsequently presented. The measured signal strength in relation to the Standard Model prediction is $\mu_{t\bar{t}H} = 1.2 \pm 0.3$. This corresponds to an observed (expected) significance for $t\bar{t}H$ of 4.2σ (3.8σ), constituting evidence for the $t\bar{t}H$ production mode.

Finally, a study into the ability to observe and model the colour connection of b-quarks in $t\bar{t}H\,(H \to b\bar{b})$ and $t\bar{t}$ + jets events is presented. The jet pull angle observable is used to investigate the effect of colour connection on jet substructure. This observable is found to be sensitive to the underlying colour structure in events, showing differences between b-quarks which originate from the decay of a colour singlet and those which originate from a colour octet. However, the effect is found to be small and a larger dataset is required to measure it in $t\bar{t}H$ events.

Declaration and Copyright

No portion of the work referred to in the thesis has been submitted in support of an application for another degree or qualification of this or any other university or other institute of learning.

i. The author of this thesis (including any appendices and/or schedules to this thesis) owns certain copyright or related rights in it (the Copyright) and he has given The University of Manchester certain rights to use such Copyright, including for administrative purposes.

ii. Copies of this thesis, either in full or in extracts and whether in hard or electronic copy, may be made only in accordance with the Copyright, Designs and Patents Act 1988 (as amended) and regulations issued under it or, where appropriate, in accordance with licensing agreements which the University has from time to time. This page must form part of any such copies made.

iii. The ownership of certain Copyright, patents, designs, trade marks and other intellectual property (the Intellectual Property) and any reproductions of copyright works in the thesis, for example graphs and tables (Reproductions), which may be described in this thesis, may not be owned by the author and may be owned by third parties. Such Intellectual Property and Reproductions cannot and must not be made available for use without the prior written permission of the owner(s) of the relevant Intellectual Property and/or Reproductions.

iv. Further information on the conditions under which disclosure, publication and commercialisation of this thesis, the Copyright and any Intellectual Property and/or Reproductions described in it may take place is available in the University IP Policy,¹ in any relevant Thesis restriction declarations deposited in the University Library, The University Library’s regulations² and in The University’s policy on Presentation of Theses.

¹ http://documents.manchester.ac.uk/display.aspx?DocID=24420
² http://www.library.manchester.ac.uk/about/regulations/

Acknowledgements

First of all, I want to say a big thank you to everyone I have had the pleasure to meet and work with over the past three and a half years. Without the support and encouragement from so many people, as well as the countless experiences I’ve been fortunate to enjoy, I can’t imagine I’d be in the same position as I am now.

To my family I owe the biggest thanks. You’ve always been there for me, driving me to pursue my goals in life. It’s not always been the easiest journey but knowing you’re there has helped me get through the bumps in the road.

To all my friends, you’ve made the past few years far more than just working towards a PhD. Be it with beers on a balcony, barbecuing in the rain, skiing in incredible alpine vistas, the countless coffee breaks, or enjoying the Swiss diet of wine and cheese, it wouldn’t have been the same without you. And to Marie, I can’t thank you enough for helping me relax and stay focused. Writing this thesis would have been a lot harder without you.

To Yvonne, thank you for all the guidance and feedback you’ve provided. You’ve encouraged me to consider new ideas and I’ve learnt a lot under your supervision. And of course, thank you for encouraging the group to bring in chocolates and cake! And to everyone in Manchester, as well as those based out at CERN, the atmosphere you created was an absolute joy to work in.

I would also like to thank everyone I worked with on the $t\bar{t}H\,(H \to b\bar{b})$ analysis, with additional thanks to Lisa and Georges. It has been great fun working on such a complex analysis with such a fantastic group of people.

Preface

In 2014 the author joined the High Energy Physics group at the University of Manchester, becoming a member of the ATLAS Collaboration. In the time since then the author has contributed to the operation of the ATLAS experiment and worked on the search for $t\bar{t}H$ in the $H \to b\bar{b}$ decay channel during Run 2 of the LHC. The author worked on the analysis from its start to its conclusion. This analysis has since been published in PRD [1] and has been combined with three other searches for $t\bar{t}H$ using the same dataset collected by the ATLAS experiment [2].

The notable contributions from the author in this thesis are as follows. An online monitoring algorithm for the efficiency of reconstructing tracks used in the ATLAS trigger system was developed by the author and is described in Section 4.2.5. In the search for $t\bar{t}H\,(H \to b\bar{b})$, the author focused on the dilepton channel, working on and developing the whole analysis strategy. In addition, he performed the full chain of the analysis, from preparing the datasets to running the statistical analysis. Several background modelling studies in the dilepton channel were performed by the author. These mostly concerned the Z+jets background, using a data-driven method to derive the correction factors for Z+Heavy Flavour jets. In addition, the author investigated the modelling of the $t\bar{t}$ + HF background and its impact on the sensitivity and stability of the statistical analysis.

With regard to the analysis strategy, the author performed and optimised the event categorisation presented in Chapter 7. The reconstruction BDT in the dilepton channel, described in the same chapter, was also developed by the author. On top of this, he optimised the choice of variables and trained the classification BDTs used as the final discriminants in all dilepton signal regions. The statistical analysis in the dilepton channel was performed by the author, before entering into combination with the semileptonic channel for the final $t\bar{t}H\,(H \to b\bar{b})$ measurement. It subsequently entered into the final $t\bar{t}H$ combination, the results of which are presented in Chapter 8. For the $t\bar{t}H$ combination, orthogonality checks were carried out to ensure that there was no overlap between channels in the $H \to b\bar{b}$ and multilepton analyses.

Finally, the author implemented the pull angle in $t\bar{t}H\,(H \to b\bar{b})$, an observable used previously in $t\bar{t}$ analyses to look at the modelling of colour structure in collisions, and studied whether such an effect is observable with the current dataset and whether it could be useful in future searches.

Although not presented in this thesis, one of the major contributions the author made during the previous four years has been a substantial effort in developing the analysis framework used in the ATLAS HTop(bb) working group, to which the $t\bar{t}H\,(H \to b\bar{b})$ analysis belongs. This framework is used by various analyses and continues to be used and supported by the working group.

1. Introduction

The Standard Model of particle physics is a theory which has been in development over the past five decades. However, it was only with the discovery of the Higgs boson in 2012, a particle first theorised in 1964, that the theory became complete. Since its discovery, the fundamental properties of the Higgs boson have been under investigation, predominantly by the ATLAS and CMS collaborations. Great interest surrounds its production and decay modes, in particular concerning how it couples to other elementary particles. As more data is collected by the experiments at the Large Hadron Collider, processes with a very low rate become possible to observe. This offers new areas of phase space in which to rigorously test the Standard Model and search for potential deviations arising from new physics.

The main topic of this thesis is the search for the production of the Higgs boson in association with a top quark pair, $t\bar{t}H$. The analysis is performed using data taken with the ATLAS experiment at a centre of mass energy $\sqrt{s} = 13$ TeV during 2015 and 2016. The search presented in this thesis targets decays of the Higgs boson into a b-quark pair, focusing on events with a dilepton final state. The measurement obtained from combining the analysis with three additional searches, each optimised for different Higgs boson decay modes, is also presented. In the analysis, a wide range of techniques are used to maximise the sensitivity to the $t\bar{t}H$ signal, in particular multivariate techniques that enhance the separation of $t\bar{t}H$ events from the dominant $t\bar{t}$ + jets background. One such method employs a Boosted Decision Tree to assign reconstructed jets to partons in the $t\bar{t}H\,(H \to b\bar{b})$ hard scatter.

A study of the differences in colour structure of $H \to b\bar{b}$ and $g \to b\bar{b}$ in $t\bar{t}H\,(H \to b\bar{b})$ and $t\bar{t} + b\bar{b}$ events is also presented. The effect of the colour structure on the substructure of jets provides a potential source of additional information in an event that is otherwise unused in separating the two processes.

This thesis is structured in the following manner. In Chapter 2 the Standard Model is introduced, along with the motivation behind the search for $t\bar{t}H$ production. The chapter concludes with an overview of the Monte Carlo method, the main technique used in high energy particle physics to model particle interactions. Chapter 3 subsequently describes the statistical formulation forming the backbone of the analysis. In Chapter 4 the accelerator complex at CERN and the ATLAS detector are described, before discussing how physics objects are reconstructed by the detector in Chapter 5. The following three chapters describe the $t\bar{t}H\,(H \to b\bar{b})$ analysis in detail. This starts with the modelling of signal and background processes in Chapter 6, as well as the selection criteria employed in the analysis, before moving on to the analysis strategy in Chapter 7. It is in this chapter that the categorisation of events and the multivariate techniques used to separate the signal from background are described, before providing information regarding the sources of systematic uncertainties in the analysis. The results of the analysis and the subsequent combination with other searches for $t\bar{t}H$ are then presented in Chapter 8. The study of the colour connection of b-quarks and its effect on jet substructure in $t\bar{t}H\,(H \to b\bar{b})$ and $t\bar{t}$ + jets events is presented in Chapter 9. Finally, Chapter 10 contains concluding remarks on the work contained in this thesis.

In this thesis all units are given using natural units with $\hbar = c = 1$, where $\hbar$ is the reduced Planck constant and $c$ is the speed of light in a vacuum. Electric charge is given in units of electron charge, where $Q_e = 1.6 \times 10^{-19}$ C.

2. The Standard Model

In this chapter the theoretical framework used to describe the fundamental laws of particle physics and provide high precision predictions of particle interactions is described. These predictions are used to model the signal and background processes used in the analysis presented in this thesis and are a fundamental part of high energy particle physics. An overview of the theoretical model, called the Standard Model, is given in Section 2.1, though more comprehensive explanations and the history of the model can be found, for example, in Refs. [3, 4]. Section 2.2 discusses the motivation for the analysis presented in Chapter 7, and an overview of how the simulation of interactions is treated using the Monte Carlo method is given in Section 2.3.

2.1 Overview of the Standard Model

The Standard Model of particle physics (SM) is one of the most successful scientific theories ever developed. Thus far, it has been able to describe almost all observed particle physics data with high accuracy. For example, the $g-2$ experiment aims to measure the magnetic moment anomaly of the muon to a precision of 0.14 parts per million [5]. The SM describes the interactions between all elementary particles via three of the four fundamental forces of the universe: the electromagnetic, weak, and strong interactions. The SM is a fully renormalisable, Lorentz-invariant quantum field theory (QFT) built upon gauge symmetries. The internal symmetries of the symmetry group

$$SU(3)_C \times SU(2)_L \times U(1)_Y \tag{2.1}$$

provide a basis for the SM, from which different components can be attributed to the three interactions. The gravitational force, which is the weakest of the four fundamental forces, is not present in the SM as it cannot currently be described by QFT, though there are many theories which seek to describe it at a quantum level together with the SM in a Theory of Everything.


Elementary particles are categorised into two different types, fermions and bosons, which are defined by the value of their spin, an intrinsic property of all particles. Bosons are particles with integer spin whereas fermions are particles with half-integer spin. Fermions are the constituents of matter in the observable universe. Bosons are so-called “force carriers”, elementary particles that mediate the interactions described by the SM. All particles are treated as field quanta in QFT, and in the SM the theory is required to be invariant under the local gauge transformations of each symmetry group. The fundamental particles of the SM are shown in Fig. 2.1, and are discussed in more detail in the following sections.

Type      Symbol    Name                 Mass         Charge
Quark     u         up                   2.2 MeV      +2/3
Quark     c         charm                1.28 GeV     +2/3
Quark     t         top                  173.1 GeV    +2/3
Quark     d         down                 4.7 MeV      −1/3
Quark     s         strange              96 MeV       −1/3
Quark     b         bottom               4.18 GeV     −1/3
Lepton    e         electron             0.511 MeV    −1
Lepton    µ         muon                 105.7 MeV    −1
Lepton    τ         tau                  1.78 GeV     −1
Lepton    νe        electron neutrino    < 2 eV       0
Lepton    νµ        muon neutrino        < 2 eV       0
Lepton    ντ        tau neutrino         < 2 eV       0
Boson     γ         photon               0            0
Boson     g         gluon                0            0
Boson     W±        W boson              80.4 GeV     ±1
Boson     Z0        Z boson              91.2 GeV     0
Boson     H0        Higgs boson          125.1 GeV    0

Figure 2.1: The fundamental particles in the Standard Model. Values taken from Ref. [6]. Mass and electric charge are shown for each entry; other quantum numbers, such as colour, are not shown in this figure. The spin of a particle is denoted by the colour of its background. Green and red represent spin-1/2, blue represents spin-1, and orange represents spin-0.

Its phenomenal success aside, the SM is not without its failings. It is known to be an incomplete theory as it does not describe gravity, nor can it explain the abundance of matter over anti-matter in the universe. It also fails to provide an explanation for the phenomena of dark matter or dark energy, which are implied to exist from cosmological observations.


2.1.1 Fermions

In the SM, elementary fermions are further categorised into two different types, leptons and quarks. Quarks are fermions with fractional electric charge and carry a property known as colour charge. Leptons, on the other hand, have integer electric charge and do not carry colour charge. Electric charge and colour charge are the charges associated with the $U(1)$ and $SU(3)$ gauge groups respectively. The quantum number of the weak force is the weak isospin, which is related to the $SU(2)$ gauge group and is non-zero for fermions with negative (left-handed) chirality and anti-fermions with positive (right-handed) chirality. The quantum charges of each symmetry group are conserved in every interaction. The weak isospin, $I_3$, and electric charge, $Q$, are related by the weak hypercharge $Y = 2(Q - I_3)$.

Quarks interact via all three forces in the SM. Of the leptons, the charged leptons interact via the electromagnetic and weak forces, while the neutral leptons, called neutrinos, interact only via the weak force. Each particle also has its own antiparticle, which shares the same quantum numbers as the particle but has opposite-sign electric charge. For neutrinos, both the particle and antiparticle have zero charge.

In total there are six types of quarks, commonly referred to as flavours. The quarks are split into two categories based on their electric charge. The up-type quarks, the up, charm and top, have electric charge $Q = +2/3$; the down-type quarks, the down, strange and bottom, have $Q = -1/3$. As a consequence of being colour charged, quarks cannot be observed as free particles but instead are found in bound states called hadrons. Hadrons with three bound quarks are called baryons and bound quark-antiquark pairs are called mesons. These hadrons are what compose matter in the observable universe. Protons and neutrons are the two most common baryons, which together build all atomic nuclei.

Leptons are also divided into two categories based on their electric charge, as well as being split into three flavours. The charged leptons, the electron $e^-$, the muon $\mu^-$ and the tau $\tau^-$, have $Q = -1$, whereas the three neutrinos have $Q = 0$. For each charged lepton there is a neutrino of the corresponding lepton flavour: the electron neutrino $\nu_e$, the muon neutrino $\nu_\mu$ and the tau neutrino $\nu_\tau$.

The quarks and leptons are typically grouped into three generations, each containing an up-type quark, a down-type quark, a charged lepton and a neutrino. These generations are ordered by particle mass, which is also the order in which the particles were discovered. Each generation forms one of the first three columns of particles in Fig. 2.1. Between generations, the particles within the same row have identical quantum numbers and differ only in their mass. The heaviest fermion, the top quark, was the last fermion to be observed and is discussed in more detail in Section 2.1.4.
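The relation $Y = 2(Q - I_3)$ can be checked with a few lines of code. The sketch below is illustrative only: the $I_3$ assignments for the left-handed first-generation fermions are the conventional textbook values and are an assumption here, as they are not listed in this section, and the names `weak_hypercharge` and `LEFT_HANDED` are hypothetical.

```python
from fractions import Fraction


def weak_hypercharge(charge, isospin3):
    """Weak hypercharge from the relation Y = 2 (Q - I3)."""
    return 2 * (Fraction(charge) - Fraction(isospin3))


# (Q, I3) for left-handed first-generation fermions.
# The I3 values are the conventional assignments (assumed, not from the text).
LEFT_HANDED = {
    "nu_e": (Fraction(0), Fraction(1, 2)),
    "e-": (Fraction(-1), Fraction(-1, 2)),
    "u": (Fraction(2, 3), Fraction(1, 2)),
    "d": (Fraction(-1, 3), Fraction(-1, 2)),
}

hypercharges = {
    name: weak_hypercharge(q, i3) for name, (q, i3) in LEFT_HANDED.items()
}

for name, y in hypercharges.items():
    print(f"{name}: Y = {y}")
```

Under these assumptions the two members of each weak isospin doublet come out with the same hypercharge, even though their electric charges differ by one unit.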

2.1.2 Bosons and the Forces

In the SM there are four vector gauge bosons and one scalar boson. The gauge bosons are all spin-1 and mediate the fundamental interactions in the SM. The scalar spin-0 boson is the Higgs boson, which is discussed in Section 2.1.3. The gauge bosons and the interactions in the SM are all consequences of the symmetries of the $SU(3)_C \times SU(2)_L \times U(1)_Y$ group. The components of this theory can be evaluated separately, breaking up the group product into complementary theories. The $SU(3)_C$ group describes the strong interaction, which has the conserved colour charge $C$. This theory is formulated and evaluated in Quantum Chromodynamics (QCD). The group product $SU(2)_L \times U(1)_Y$ describes the unified electromagnetic and weak interactions, which couple to the conserved properties of weak isospin and hypercharge ($Y$). In addition, the electromagnetic force can also be described without unification with the weak force by the $U(1)_{\mathrm{QED}}$ group; this is often a starting point for describing the interactions in the SM.

Electromagnetic Interaction $U(1)_{\mathrm{QED}}$

The electromagnetic interaction occurs between all charged particles and has an infinite range. It originates from the $U(1)$ symmetry, which predicts a single mediator. The gauge boson that mediates this interaction is the photon $\gamma$; it is massless and carries neither electric nor colour charge. In QFT, this process is described by Quantum Electrodynamics (QED) and can be represented by the Feynman diagram shown in Fig. 2.2. In this diagram $f$ and $\bar{f}$ represent a charged fermion and charged anti-fermion respectively, with the wavy line representing the photon. The convention used in these Feynman diagrams has time flowing from left to right, with spatial separation from top to bottom. Fermion lines have an arrow indicating direction of travel, with fermions travelling forwards and anti-fermions travelling backwards in time. As energy and momentum must be conserved at each vertex and the photon is massless, a single interaction vertex by itself does not represent a physical process.


Figure 2.2: The Feynman diagram of the QED interaction vertex.

Strong Interaction $SU(3)_C$

The strong force is represented by the $SU(3)_C$ group in the SM and it is described by QCD. It describes the interactions between all particles carrying colour charge. The $SU(3)_C$ group provides three distinct colour charges and predicts a colour octet of eight gauge bosons, which mediate the interactions of the strong force, each constructed as an orthogonal colour charged state. The neutral colour singlet state, which is orthogonal to the colour octet, is not observed to exist in nature as it would result in an infinite range of the strong interaction. The three colours (anti-colours for antiquarks) are (anti-)red, (anti-)green and (anti-)blue. The eight gauge bosons are gluons, which carry orthogonal combinations of colour and anti-colour. Self-interactions of gluons are also a consequence of the $SU(3)_C$ group. Although there are more colour charges and gluons than there are electric charges and photons, the QCD interactions between quarks and gluons can be described in a very similar way to QED, with the gluons and colour charge analogous to photons and electric charge in QED. The interaction vertex in QCD between quarks and gluons is shown in Fig. 2.3 and the self-interactions of gluons are similarly shown in Fig. 2.4.

Figure 2.3: The Feynman diagram representing the quark gluon interaction vertex in QCD.


Figure 2.4: The Feynman diagrams representing the gluon self-interactions in QCD.

Another aspect in which QCD differs from QED is that colour-charged particles cannot exist in unbound states, a property of QCD known as confinement. In the SM, the resulting colour-neutral bound states are baryons and mesons. In QCD, interactions between quarks become much weaker and approach zero as the distance between them decreases and the energy scale increases. This property is called asymptotic freedom. Due to the asymptotic freedom of QCD, quarks can be modelled as bare quarks at high energies, as opposed to in bound states. This makes it possible to use perturbative calculations to make highly accurate predictions. At low energies the QCD interaction is much stronger, leading to confinement. As a result of confinement, the range of the strong interaction is limited to $\sim 10^{-15}$ m.
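The weakening of the strong coupling with increasing energy scale can be illustrated with the standard one-loop expression for the running coupling. This is a minimal sketch under stated assumptions, not the treatment used in the analysis: the reference value $\alpha_s(M_Z) = 0.118$ and the fixed number of five active quark flavours are assumptions, and the function name is hypothetical.

```python
import math

M_Z = 91.2           # GeV, Z boson mass quoted in this chapter
ALPHA_S_MZ = 0.118   # assumed reference value of alpha_s at the Z mass
N_FLAVOURS = 5       # assumed fixed number of active quark flavours


def alpha_s_one_loop(q_gev):
    """One-loop running strong coupling at the energy scale q_gev (in GeV)."""
    b0 = (33 - 2 * N_FLAVOURS) / (12 * math.pi)
    log_term = math.log(q_gev**2 / M_Z**2)
    return ALPHA_S_MZ / (1 + ALPHA_S_MZ * b0 * log_term)


for scale in (10.0, M_Z, 1000.0):
    print(f"Q = {scale:7.1f} GeV  ->  alpha_s = {alpha_s_one_loop(scale):.4f}")
```

The coupling decreases monotonically as the scale grows, which is the behaviour that makes perturbative calculations reliable at high energies. The one-loop formula breaks down at low scales, where the coupling becomes large.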

Electroweak Interaction $SU(2)_L \times U(1)_Y$

The final interaction in the SM is the weak force. The $SU(2)_L$ group describes the chiral symmetry of the weak interactions, which occur between all left-handed fermions and right-handed anti-fermions. All fermions interact via the weak interaction, mediated by the $W^\pm$ and $Z^0$ bosons; however, it has a much shorter range than the other fundamental forces, acting over distances of around $10^{-17}$ to $10^{-16}$ m. Unlike the gauge bosons of QED and QCD, both the $W^\pm$ and $Z^0$ bosons have mass. The weak interaction is also unique in the SM in that it allows for flavour changing interactions. There are two aspects to the weak interaction, the charged and neutral currents, and it is only via the weak force that neutrinos interact. Furthermore, the $SU(2)_L$ group alone is not able to describe the weak force, due to the presence of the neutral current and massive gauge bosons.

To explain the presence of the weak neutral current, the electromagnetic and weak interactions were shown to be aspects of a unified force, the electroweak force. This was first theorised by Glashow in 1961 [7], and was subsequently formulated as a spontaneously broken gauge symmetry by Weinberg and Salam [8, 9]. It is from this spontaneously broken symmetry that the $W^\pm$ and $Z^0$ gauge bosons obtain their mass. At low energy scales the electromagnetic and weak forces can be identified as two separate processes; however, at energies above the unification energy they merge into one force. This unified interaction is described by the $SU(2)_L \times U(1)_Y$ group. The $L$ denotes that the weak component of the electroweak interaction couples only to left-handed particles and right-handed anti-particles, and $Y$ is the weak hypercharge. The electromagnetic interaction, however, still couples to both left- and right-handed fermions with electric charge.

The weak charged current is mediated by the $W^\pm$ bosons, which have a mass of 80.4 GeV and $Q = \pm 1$ [6]. There are two main interaction vertices in the charged weak interactions, shown in Fig. 2.5, as well as the corresponding diagrams where the particles are exchanged for their antiparticles. An example of the weak charged current in the lepton sector is the decay of the muon

$$\mu^- \to e^- + \bar{\nu}_e + \nu_\mu ; \tag{2.2}$$

and the semileptonic process of β decay

$$n(d) \to p(u) + e^- + \bar{\nu}_e , \tag{2.3}$$

where $n$ and $p$ are the neutron and proton respectively. The weakly interacting quark constituent is shown in parentheses. The Feynman diagrams for these processes are shown in Fig. 2.6.

Figure 2.5: The Feynman diagrams representing the charged weak interaction. Here $q$ and $q'$ are used to distinguish between up- and down-type quarks and $\ell$ represents any charged lepton.

In charged weak interactions, the $W^\pm$ bosons do not couple directly to the mass eigenstates of freely propagating quarks, as is the case in QCD. Instead they couple to the weak flavour eigenstates, which are combinations of the mass eigenstates. The weak eigenstates of the down-type quarks $(d'\ s'\ b')$ are given by

$$
\begin{pmatrix} d' \\ s' \\ b' \end{pmatrix}
= V_{\mathrm{CKM}} \begin{pmatrix} d \\ s \\ b \end{pmatrix}
= \begin{pmatrix}
|V_{ud}| & |V_{us}| & |V_{ub}| \\
|V_{cd}| & |V_{cs}| & |V_{cb}| \\
|V_{td}| & |V_{ts}| & |V_{tb}|
\end{pmatrix}
\begin{pmatrix} d \\ s \\ b \end{pmatrix} ,
\tag{2.4}
$$

where $V_{\mathrm{CKM}}$ is the unitary CKM matrix, and $(d\ s\ b)$ are the mass eigenstates. The magnitudes of the CKM matrix elements determine the probability of a transition between a given up-type quark and a given down-type quark. As the weak eigenstates are combinations of the mass eigenstates, in a charged weak interaction an up-type quark can interact with any of the three down-type quarks at the same vertex. As a result of this, flavour changing processes across generations can occur, for example in the decay $\Lambda^0 \to p + \pi^-$, where a strange quark decays into an up quark and a $W^-$ boson.

Figure 2.6: The Feynman diagrams for muon decay (a) and β decay (b).
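The mixing of mass eigenstates into weak eigenstates can be illustrated with a simplified two-generation version of the quark mixing matrix, parametrised by a single rotation angle (the Cabibbo angle). This is a sketch under stated assumptions rather than the full three-generation matrix: the value $\sin\theta_C \approx 0.225$ is an assumed, rounded input, and the function names are hypothetical.

```python
import math

# Assumed Cabibbo angle, with sin(theta_C) ~ 0.225 (rounded illustrative value).
THETA_C = math.asin(0.225)


def cabibbo_matrix(theta):
    """2x2 rotation relating (d', s') to the mass eigenstates (d, s)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [-s, c]]


def transition_probabilities(matrix):
    """Squared magnitudes of the mixing matrix elements."""
    return [[element**2 for element in row] for row in matrix]


V = cabibbo_matrix(THETA_C)
P = transition_probabilities(V)

print(f"|V_ud|^2 = {P[0][0]:.3f}, |V_us|^2 = {P[0][1]:.3f}")
print(f"row sums: {sum(P[0]):.3f}, {sum(P[1]):.3f}")
```

Because the matrix is unitary, the squared magnitudes in each row sum to one: an up quark couples mostly to the down quark, with a small cross-generation coupling to the strange quark.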

The weak neutral current is mediated by the $Z^0$ boson, which has a mass of 91.2 GeV [6]. The weak neutral current vertex is shown in Fig. 2.7. This vertex is analogous to the interaction vertex between charged fermions in QED, with a $Z^0$ instead of a $\gamma$ mediating the process. At low energy scales, where $q^2 \ll M_Z^2$, the weak neutral current processes with charged fermions are completely dominated by the QED interaction. However, in addition to charged fermions, neutrinos can also interact at this vertex.
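The dominance of QED at low energies can be illustrated by comparing the tree-level propagator factors for photon and $Z^0$ exchange, $1/q^2$ and $1/(q^2 - M_Z^2)$. This is a rough sketch only: it assumes the Z boson width and all coupling factors can be ignored, and the function name is hypothetical.

```python
M_Z = 91.2  # GeV, Z boson mass quoted in the text


def z_to_photon_propagator_ratio(q_gev, m_z=M_Z):
    """Magnitude of the Z propagator relative to the photon propagator.

    Uses |q^2 / (q^2 - M_Z^2)|, neglecting the Z width and all couplings.
    Not valid at q = M_Z, where the neglected width matters.
    """
    q2 = q_gev**2
    return abs(q2 / (q2 - m_z**2))


for q in (1.0, 10.0, 50.0):
    ratio = z_to_photon_propagator_ratio(q)
    print(f"q = {q:5.1f} GeV  ->  Z/photon propagator ratio = {ratio:.2e}")
```

For $q^2 \ll M_Z^2$ the ratio scales approximately as $q^2 / M_Z^2$, so the $Z^0$ contribution is strongly suppressed relative to photon exchange.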

The gauge bosons of the electroweak interaction, like the gluons in QCD, are self interacting. These self interaction vertices are shown in Fig. 2.8.


Figure 2.7: The Feynman diagram for the weak neutral current interaction.

Figure 2.8: The Feynman diagrams of the electroweak self-interactions.

2.1.3 The Englert-Brout-Higgs Mechanism

Spontaneous Symmetry Breaking

As mentioned in the previous section, the weak and electromagnetic interactions are unified into a single gauge invariant electroweak interaction. In an exact gauge invariant theory, spin-1 gauge bosons must be massless for the theory to be fully renormalisable. However, this is not observed in nature, where the weak gauge bosons are massive.

To accommodate gauge boson masses in the SM without losing gauge invariance in the weak interactions, a complex scalar field is introduced by the addition of a scalar $SU(2)$ doublet [10–12]. This scalar field has an $SU(2) \times U(1)$ invariant scalar potential of the form

$$V(\phi) = \mu^2 \phi^\dagger \phi + \lambda \left( \phi^\dagger \phi \right)^2 . \tag{2.5}$$

By requiring $\lambda$ to be positive, so that the potential energy density is bounded from below, and choosing a value of $\mu^2$ which is negative, the minimum of the scalar potential becomes non-unique and lies at a non-zero value of the field. This potential can be visualised by the diagram in Fig. 2.9, illustrating a Mexican Hat potential for a complex field $\phi$.

As a consequence of choosing a minimum in the potential as the ground state of the field, the symmetry of the system is broken and the vacuum expectation value ($v$) becomes non-zero. This can be visualised by starting with a perfect gauge symmetry when the vacuum sits at the origin, on top of the peak in the potential; by moving to a minimum the overall symmetry is broken. By introducing the scalar doublet in the SM, the gauge symmetry can be broken while the weak interaction remains fully symmetric. The gauge bosons acquire mass from their interaction with the non-zero vacuum state of the scalar field. Through electroweak symmetry breaking, the $SU(2)_L \times U(1)_Y$ group is broken to the $U(1)_{\mathrm{QED}}$ subgroup, which is constructed to remain a true symmetry. Thus, the photon does not acquire mass like the $W^\pm$ and $Z^0$ bosons.

Four new degrees of freedom (d.o.f.) are introduced as a consequence of including a scalar doublet in the SM. The four new d.o.f. are in addition to the six from the massless spin-1 $W^\pm$ and $Z^0$ bosons before spontaneous symmetry breaking (SSB) occurs in the weak sector. After the breaking of the electroweak symmetry, three Goldstone bosons appear as a consequence of the broken internal symmetries. These Goldstone bosons, which are massless and spin-0, couple to the gauge fields and are absorbed by the weak gauge bosons. In comparison to massless particles, massive particles have a longitudinal polarisation, and thus an additional degree of freedom. Through absorbing the Goldstone bosons, the weak gauge bosons each acquire the additional d.o.f. they require as massive particles.
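The effect of the sign of $\mu^2$ on the location of the minimum of Eq. 2.5 can be checked numerically. The sketch below treats the potential as a function of $\rho = \phi^\dagger\phi \geq 0$ and scans for its minimum; the parameter values are arbitrary illustrative numbers in arbitrary units, and the function names are hypothetical.

```python
def potential(rho, mu2, lam):
    """Scalar potential of Eq. 2.5 as a function of rho = phi^dagger phi."""
    return mu2 * rho + lam * rho**2


def scan_minimum(mu2, lam, rho_max=5.0, steps=50001):
    """Brute-force scan for the value of rho >= 0 that minimises the potential."""
    best_rho, best_v = 0.0, potential(0.0, mu2, lam)
    for i in range(steps):
        rho = rho_max * i / (steps - 1)
        v = potential(rho, mu2, lam)
        if v < best_v:
            best_rho, best_v = rho, v
    return best_rho


# Arbitrary illustrative parameters (arbitrary units), with lam > 0.
print("mu^2 > 0: minimum at rho =", scan_minimum(mu2=1.0, lam=0.5))
print("mu^2 < 0: minimum at rho =", scan_minimum(mu2=-1.0, lam=0.5))
```

For positive $\mu^2$ the minimum stays at the origin, while for negative $\mu^2$ it moves to the non-zero value $\rho = -\mu^2 / (2\lambda)$. Since the potential depends only on $\phi^\dagger\phi$, every field configuration with that value is an equally valid ground state, which is the non-uniqueness described above.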

The Higgs Boson

After the weak bosons absorb three of the additional d.o.f. from the introduction of the scalar field, one still remains unaccounted for. This degree of freedom becomes a new particle quantising the scalar field $\phi$ that has been introduced to the SM. This particle is the Higgs boson. It has spin-0 and is both electrically and colour neutral. In the SM, the potential in Eq. 2.5 can be rewritten in terms of the Higgs mass $M_H$ and the vacuum expectation value $v$ of the potential, where

$$M_H = \sqrt{-2\mu^2}, \tag{2.6}$$

and

$$v^2 = -\frac{\mu^2}{2\lambda} = (246~\mathrm{GeV})^2. \tag{2.7}$$

The value for $v$ is calculated with high precision from its relation to the $W^\pm$ boson mass and the coupling constant of the weak interaction $g$, with

$$v = \frac{2M_W}{g}. \tag{2.8}$$

Figure 2.9: The so-called “Mexican Hat” effective potential of a complex scalar field $V(\phi)$ [13]. The symmetry is broken as the lowest energy state moves from the crest to an arbitrary coordinate in the bottom of the hat.

The mass of the Higgs boson is not predicted a priori and in the SM could take any value. Experimentally, the Higgs mass has been measured to be $125.09 \pm 0.24$ GeV [6]. The couplings of the Higgs boson to the $W^\pm$ and $Z^0$ bosons are proportional to their mass squared.
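As a numerical cross-check of Eqs. 2.6 to 2.8, the short sketch below inverts the relations to obtain $v$, $\mu^2$ and $\lambda$. The W boson mass and the value of the weak coupling constant $g$ are not quoted in the text and are assumed here for illustration only; the function names are likewise hypothetical.

```python
import math

M_H = 125.09    # GeV, measured Higgs mass quoted in the text
M_W = 80.379    # GeV, assumed W boson mass (not quoted in the text)
G_WEAK = 0.653  # assumed approximate value of the weak coupling constant g


def vev_from_w_mass(m_w, g):
    """Eq. 2.8: v = 2 M_W / g."""
    return 2.0 * m_w / g


def mu_squared(m_h):
    """Invert Eq. 2.6, M_H = sqrt(-2 mu^2); the result is negative."""
    return -0.5 * m_h ** 2


def quartic_coupling(m_h, v):
    """Invert Eq. 2.7, v^2 = -mu^2 / (2 lambda), for lambda."""
    return -mu_squared(m_h) / (2.0 * v ** 2)


v = vev_from_w_mass(M_W, G_WEAK)
print(f"v      = {v:.1f} GeV")
print(f"mu^2   = {mu_squared(M_H):.1f} GeV^2")
print(f"lambda = {quartic_coupling(M_H, v):.4f}")
```

With these assumed inputs, $v$ comes out close to the 246 GeV of Eq. 2.7, $\mu^2$ is negative and $\lambda$ is positive, as required for the symmetry-breaking potential described above.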

Fermion Masses

With the addition of the scalar Higgs field the masses of the gauge bosons are explained, and the same field can also explain how the elementary fermions acquire their mass. Without any modification to the SM, all spin-1/2 particles involved in parity-violating weak interactions would have to be massless in order to preserve gauge symmetry. This is a consequence of needing to decouple the left- and right-handed states of particles in order to treat them separately in interactions. In contrast, with the addition of the Higgs mechanism, fermions can also couple to the scalar field and acquire mass in a way which preserves gauge invariance in the weak interaction. Through these couplings the fermion masses are generated as

$$m_f = y_f \frac{v}{\sqrt{2}}. \tag{2.9}$$

The constants $y_f$ are the Yukawa coupling constants, where $f$ can be substituted for any fermion in the SM. They describe the strength of the coupling of the fermion to the Higgs field. Since the $y_f$ are unknown, the masses of the fermions are arbitrary and can only be measured from experiments. It is therefore important to look at the Higgs-fermion vertices in order to measure the Yukawa coupling constants in the SM and validate the relationship in Eq. 2.9. The Higgs couplings to the gauge bosons and the fermions are shown in Fig. 2.10. It should be noted that the Higgs does not couple to the neutrinos which, in the SM, remain massless. This is due to the requirement that neutrinos always be left-handed and anti-neutrinos always be right-handed.

Figure 2.10: The couplings of the Higgs boson to massive gauge bosons (a) and fermions (b).
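To make Eq. 2.9 concrete, the sketch below inverts it to obtain the Yukawa coupling implied by a given fermion mass. The top quark mass is the value quoted in this chapter; the bottom quark and tau lepton masses are assumed illustrative values that do not appear in the text, and the function name is hypothetical.

```python
import math

V = 246.0  # GeV, vacuum expectation value from Eq. 2.7

# Fermion masses in GeV. Only the top mass is quoted in the text;
# the bottom and tau masses are assumed values for illustration.
FERMION_MASSES = {
    "top": 173.1,
    "bottom": 4.18,
    "tau": 1.777,
}


def yukawa_coupling(mass, v=V):
    """Invert Eq. 2.9, m_f = y_f v / sqrt(2), for y_f."""
    return math.sqrt(2.0) * mass / v


for name, mass in FERMION_MASSES.items():
    print(f"y_{name:<6} = {yukawa_coupling(mass):.4f}")
```

Under these assumptions the top quark is the only fermion with a Yukawa coupling of order one, which is why it dominates the loop processes discussed in the following sections.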

Higgs Production Mechanisms and Decays

At hadron colliders, such as the Large Hadron Collider at CERN, there are four main production mechanisms by which the Higgs boson is produced in collisions. In decreasing order of cross section at a centre of mass energy of 13 TeV, these are gluon gluon fusion (ggF), vector boson fusion (VBF), associated vector boson production (VH) and associated top quark pair production ($t\bar{t}H$). The Feynman diagrams for these production mechanisms are shown in Fig. 2.11. In VBF and VH the direct coupling of the Higgs boson to the weak bosons is probed, which provides a stringent test for spontaneous symmetry breaking. Both ggF and $t\bar{t}H$ probe the coupling of the Higgs boson to the heaviest fermion, the top quark. In ggF the massless gluons couple to the Higgs through a virtual quark loop dominated by the top quark (b-quarks contribute ∼5%), whereas in $t\bar{t}H$ there is a direct coupling of the top quarks to the Higgs.

Figure 2.11: The four main Higgs production mechanisms at hadron colliders: (a) ggF, (b) VH, (c) VBF and (d) $t\bar{t}H$.

The branching ratios (BR) for the main decay modes of the Higgs are shown in Fig. 2.12. For a Higgs with a mass of 125 GeV, the dominant decay mode is to a $b\bar{b}$ pair, which has a BR of 58%, followed by the decay to $W^+W^-$ at a rate of 22%. Both of these decay modes, however, are difficult to distinguish from background processes. One of the cleanest signatures of the Higgs boson is its decay to two photons. This process, like ggF, is possible due to loop effects involving virtual heavy particles, dominated by $W^\pm$ bosons and the top quark. However, this decay has a very low BR of just 0.2%.

2.1.4 The Top Quark

The top quark is the heaviest particle in the SM, with a mass of 173.1 GeV [6]. As a result of this its properties differ from those of the other quarks. An estimate of the decay width of the top quark predicts a lifetime of the order $10^{-25}$ s. This makes the top quark unique as its lifetime is two orders of magnitude shorter than the time scale of hadronisation; the top quark decays before it can form any hadrons. Therefore it can decay directly through the process

$$t \to W^+ + q', \tag{2.10}$$

where $q' = d, s, b$ as mentioned in Section 2.1.2 for flavour changing weak decays. However, the couplings of the top quark to $d$ and $s$ quarks are negligible, with the value of $|V_{tb}|$ in the CKM matrix approximately equal to 1. Thus the only significant decay mode is

$$t \to W^+ + b. \tag{2.11}$$

Figure 2.12: The main decay modes of the Higgs boson as a function of $M_H$ [14]. (Axes: branching ratio on a logarithmic scale from $10^{-4}$ to 1 against $M_H$ from 120 to 130 GeV; curves for $b\bar{b}$, $WW$, $gg$, $\tau\tau$, $c\bar{c}$, $ZZ$, $\gamma\gamma$, $Z\gamma$ and $\mu\mu$.)
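The order of magnitude of the top quark lifetime quoted in this section follows from the relation $\tau = \hbar / \Gamma$ between lifetime and decay width. The sketch below evaluates it for an assumed width of 1.4 GeV; this width is not given in the text and is used purely as an illustrative input.

```python
HBAR_GEV_S = 6.582119569e-25  # reduced Planck constant in GeV s
GAMMA_TOP = 1.4               # GeV, assumed top quark decay width (illustrative)


def lifetime_from_width(width_gev):
    """Mean lifetime in seconds, tau = hbar / Gamma."""
    return HBAR_GEV_S / width_gev


tau_top = lifetime_from_width(GAMMA_TOP)
print(f"tau_top = {tau_top:.2e} s")
```

For any width of order 1 GeV the result is a few times $10^{-25}$ s, consistent with the order of magnitude stated above.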

By looking at the decay of top quarks, the properties of a quark, such as its spin, can be measured. Due to these unique properties of the top, it has a very recognisable decay signature in events at collider experiments and provides a window into the properties of quarks before hadronisation.

In experimental analyses, top quarks are typically categorised by the decay channel of the $W^\pm$ boson. For top quarks where the $W^\pm$ boson decays into a quark pair, the top quarks are categorised as hadronic tops; for the case of the $W^\pm$ boson decaying to an electron-neutrino or muon-neutrino pair, the top is labelled as a leptonic top. In the cases where the $W^\pm$ boson decays into a tau-neutrino pair, if the $\tau$ decays leptonically they are typically treated the same as leptonic tops. For hadronically decaying taus, the top decays are either categorised as hadronic tops, with the hadronic taus treated as jets, or handled separately with the hadronic taus reconstructed experimentally. These decay channels are further used to classify $t\bar{t}$ and $t\bar{t}+X$ events using the decays of the two top quarks:


• All hadronic: both tops decay hadronically;

• Semileptonic: one top decays hadronically, the other leptonically - also referred to as lepton+jets or single lepton;

• Dilepton: both tops decay leptonically.

Keeping $t\bar{t}$ events with taus separate, which comprise approximately 20% of all $t\bar{t}$ events, the all hadronic channel represents 46% of all $t\bar{t}$ events, the semileptonic channel represents 30%, and the dilepton channel contributes the least at only 4%.

The high mass of the top quark indicates that it has a very strong coupling to the Higgs boson, as shown by the relation in Eq. 2.9. The large value of the top-Yukawa coupling, $y_t$, leads the top quark to be dominant in loop effects in Higgs production and decays such as ggF and $H \to \gamma\gamma$. As these loops contain virtual massive particles, there is a large model dependence; processes involving loop effects would be sensitive to new physics which introduces additional heavy particles.
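The channel fractions quoted above can be approximately reproduced from the $W$ boson branching ratios alone, since each top quark decays to $Wb$ with a branching ratio close to 1. The sketch below assumes an average leptonic branching ratio of 10.86% per lepton flavour, a value that is not given in the text, and assigns the remainder to hadronic decays.

```python
# Assumed input (not quoted in the text): average BR(W -> l nu) per lepton
# flavour; the remainder of the W decays is assigned to quark pairs.
BR_W_PER_LEPTON = 0.1086
BR_W_E_OR_MU = 2.0 * BR_W_PER_LEPTON
BR_W_TAU = BR_W_PER_LEPTON
BR_W_HADRONIC = 1.0 - 3.0 * BR_W_PER_LEPTON


def ttbar_channel_fractions():
    """Fractions of ttbar events per channel, keeping tau events separate."""
    return {
        "all hadronic": BR_W_HADRONIC ** 2,
        "semileptonic": 2.0 * BR_W_HADRONIC * BR_W_E_OR_MU,
        "dilepton": BR_W_E_OR_MU ** 2,
        "with taus": 1.0 - (1.0 - BR_W_TAU) ** 2,
    }


fractions = ttbar_channel_fractions()
for channel, fraction in fractions.items():
    print(f"{channel:>12}: {100.0 * fraction:.1f}%")
```

The four categories are mutually exclusive and sum to one by construction. With this assumed input they agree with the rounded percentages in the text to within about one percentage point.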

2.2 Theoretical Motivation for Measuring $t\bar{t}H$

The majority of the properties of the Higgs boson which have been measured since its discovery have concerned the coupling of the Higgs boson to bosons, with the only fermionic coupling observed in $H \to \tau\tau$ decays [15]. Evidence of the decay of the Higgs boson to two b-quarks has also been reported by the ATLAS and CMS collaborations [16, 17]. Measurement of the Yukawa couplings of the Higgs boson to fermions remains a very important goal of the LHC, with the Yukawa interaction predicted to be the source of fermion masses. Any deviations found between measurements of $y_f$ and the expected values extracted using the fermion masses in Eq. 2.9 would be strong evidence for new physics.

The top quark, the heaviest SM particle, is predicted to have a value of $y_t \sim 1$, with the latest experimental measurement of $y_t = 0.87 \pm 0.15$ [15] in good agreement with the SM prediction. In comparison to the couplings of the Higgs boson to other fermions, it is almost two orders of magnitude greater than the next largest coupling, $y_b$. Measurements of $y_t$ can be extracted from processes involving loop effects, such as ggF production, shown in Fig. 2.11a, and $H \to \gamma\gamma$ decays, shown in Fig. 2.13. Notably, these channels only provide an indirect measurement, in which the top quark mediates the interactions in the loops and no BSM effects are assumed. Instead, the top-Higgs vertex in $t\bar{t}H$ production, as shown in Fig. 2.11d, provides a direct measurement of $y_t$, significantly reducing the model dependence. A direct measurement of $y_t$ enables BSM searches in the coupling of the top to the Higgs and precision tests of the SM. Measurements from direct and indirect searches can also be compared, which would probe the presence of BSM particles mediating loops in indirect processes.

Figure 2.13: The decay of the Higgs boson to two photons showing the two main contributors in the loop, the $W^\pm$ boson and the top quark. The decay is mainly driven by the $W^\pm$ boson; however, the contribution from fermions interferes destructively with the $W^\pm$ boson loop process.

The top Yukawa coupling also provides a window into the scale of new physics. The effective potential of the Higgs field is extremely sensitive to $y_t$. Small changes in $y_t$ can change the effective potential from monotonic behaviour to one with an extra minimum at very large field values. The effect of $y_t$ on the effective potential is shown in Fig. 2.14. For some critical value $y_t = y_t^{\mathrm{crit}}$, this extra minimum has the same potential energy as the vacuum state of the universe. For $y_t < y_t^{\mathrm{crit}}$ the vacuum of the universe is deeper than the second minimum and so the universe is stable. For $y_t > y_t^{\mathrm{crit}}$ the second minimum has a lower potential energy, causing the universe to be unstable. For small excesses the lifetime of the current vacuum is larger than the age of the universe and so the universe is considered to be metastable. However, as $y_t$ becomes larger the universe becomes unstable, with the lifetime of the vacuum shorter than the age of the universe. In the case of an unstable universe, new physics would be required at the scale where the Higgs self-coupling becomes zero. This scale could be as low as $\sim 10^7$ GeV [18]. Figure 2.15 shows the stability of the universe as a function of the mass of the top quark, directly related to $y_t$, and the mass of the Higgs boson. The latest calculations suggest the electroweak vacuum is metastable.


In addition to measuring the top Yukawa coupling, the $t\bar{t}H$ production channel can also be used to enhance the discrimination of certain Higgs decays from background processes. At present, the decay of the Higgs boson to a $b\bar{b}$ pair has not been observed despite having the highest branching ratio. By requiring either a vector boson or a top quark pair in the final state, from VH or $t\bar{t}H$ production, the overwhelming multijet background is reduced. In 2017, ATLAS and CMS reported evidence for the decay of the Higgs boson to a b-quark pair in the VH channel [16, 17].

Figure 2.14: The effect on the Higgs effective potential of small variations of $y_t$ for a scale $\mu = 173.2$ GeV, showing the appearance of a new minimum in the potential [18]. (Axes: $V_{\mathrm{eff}}$ in GeV$^4$ against $\phi$ in GeV; curves for $y_t$ = 0.92447924, 0.92448161, 0.92448279 and 0.92448293.)

Figure 2.15: SM phase diagram showing the stability of the electroweak vacuum in terms of the Higgs and top pole mass [19]. The coloured areas denote the three regions of stability, with the red dashed lines showing the instability scale in GeV. The ellipses show the experimental limits at up to 3σ uncertainty. (Axes: top pole mass $M_t$ in GeV against Higgs pole mass $M_h$ in GeV; regions labelled Instability, Meta-stability and Stability.)


2.3 Monte Carlo Simulation

An important aspect in high energy particle physics is the ability to predict the final states of processes occurring in particle collisions, such as those at the LHC. The full calculation of the matrix element (ME) of different processes is a very difficult challenge and calculations are normally only performed up to a few orders in perturbation theory. Monte Carlo (MC) simulations provide the majority of predictions for signal and background processes in collider physics. In these simulations the full interactions are factorised into more manageable processes within an event. There are four main components into which the simulation is factorised: the choice of parton distribution functions; the hard scatter of an event; the subsequent parton shower and hadronisation; and the underlying event. This QCD factorisation is the general treatment when calculating the cross sections of processes. The factorisation scale $\mu_F$ is defined as the energy at which the calculation is split. Above $\mu_F$ the partonic processes are treated with the ME, whereas below this energy scale the processes are described by the parton shower and parton distribution functions.

For use in particle physics analyses where data are recorded with a detector, the MC events are interfaced to a simulation of the detector in order to model the interactions with the detector and the subsequent recorded signals.

2.3.1 Parton distribution functions

The parton distribution function (PDF) describes the momentum distribution of the quarks and gluons in the colliding protons. This information enters the MC simulation in the hard scatter, parton shower and underlying event. PDFs are expressed as a probability that a given parton will be found inside a proton carrying a given fraction of the total momentum and are defined at the factorisation scale. PDFs are determined experimentally through deep inelastic scattering (DIS), in which a beam of leptons is incident on hadrons to probe their inner structure, and from collider experiments. From these measurements, the inner structure and momentum share of partons are measured at different virtuality scales of the exchanged electroweak gauge boson. There are various PDF sets which are used in MC simulations, with NNPDF [20] and CTEQ [21] two of the most commonly used at the LHC. The DIS measurements performed at the electron-proton collider HERA provide the backbone for the relevant PDFs used in the $pp \to t\bar{t}H$ processes presented in this thesis, which are dominated by the gluon density of the proton [22]. PDFs can be determined to different orders in QCD, with next to leading order (NLO) PDFs used when calculating the hard scatter to NLO, and leading order (LO) PDF sets used for all calculations performed at LO, such as the parton shower and underlying event. Complementary PDF sets are chosen for the hard scatter and parton shower, such as NNPDF3.0NLO in the hard scatter and NNPDF2.3LO in the parton shower.

PDFs in $pp$ collisions can be calculated assuming a different number of flavours of quark being present in the proton. The most common distinction made is between the four flavour (4F) and five flavour (5F) scheme. In the 5F scheme b-quarks are included in the partons which can be found within the proton. In 4F scheme PDFs the partons are limited to gluons and the four lighter quarks. In calculations using the 4F scheme, b-quarks are treated as massive particles; however, they cannot enter the interaction as constituents from the proton. The 5F scheme provides for b-quarks in the initial state; however, they are treated as massless and mass effects are not taken into account. This introduces a modelling dependence on gluon splitting in the parton shower. The optimal choice of flavour scheme is based on the process being modelled.

2.3.2 Hard scatter

The hard scatter in an event handles the calculation of the process of interest up to a fixed order in perturbative QCD. This process occurs at the highest scale in an event, and describes how the constituent partons in the colliding protons interact to produce a small number of outgoing elementary particles. An example hard process is the production of a Higgs boson through ggF which subsequently decays into two b-quarks. The matrix elements of the interactions in the hard scatter are calculated perturbatively to a fixed order in QCD and QED, with most processes now calculated to NLO precision. These processes are typically represented by Feynman diagrams, such as the Higgs production mechanisms in Fig. 2.11. As additional higher order corrections are calculated for each process, the modelling becomes more precise; however, this is much more computationally intensive and lower order calculations can prove sufficient. To calculate the cross section of a process, the matrix element, PDF and phase space of the interaction are required. Cross sections can therefore be calculated up to higher order in perturbative QCD than the precision used to simulate events. Lower order calculations are rescaled to higher order cross section calculations to improve the accuracy of the prediction.


2.3.3 Parton shower and hadronisation

Moving on from the hard scatter, the evolution of outgoing particles from the hard scatter is simulated. In the parton shower (PS), the effect of higher order calculations which were not included in the hard scatter event can be simulated. The decays of particles which were not handled in the hard scatter are also performed in this step. In the parton shower the evolution of partons is performed, including soft and collinear emissions. This is done for partons in the initial state, before the hard scatter process, in addition to partons in the final state after the hard scatter. The evolution is simulated from the scale of the hard scatter down to an infrared scale at which point confinement takes over. The evolution of partons in the PS can result in identical final states being created as those which are calculated in matrix elements in the hard scatter. This overlap needs to be removed when interfacing the matrix element and parton shower. Calculations of the parton shower and matrix element are combined to provide the best description of multi-parton states.

After the evolution of partons, hadronisation occurs. In this step the final state partons are hadronised into colour neutral particles due to colour confinement in QCD. Two main techniques are used to simulate this process. With string fragmentation, the partons transform without intermediate states into colour neutral hadrons; with cluster fragmentation the partons can proceed through intermediary clusters of objects before forming the final hadrons. In high energy particle physics, neighbouring hadrons are clustered into topological objects known as jets.

Parton shower algorithms perform an important part of the simulation of the structure of an event. However, they are built on soft and collinear approximations, whereas many observables of interest are explicitly sensitive to wide opening angles or multijet processes. In these cases, more accurate predictions can only come from higher order calculations of the matrix element.

2.3.4 Underlying event

The final component of the simulation of the event is the underlying event (UE). The UE describes the additional activity in an event which is not directly associated with the interaction of interest. In a collision, the radiation and additional partons in the initial state can also interact, leading to multi-parton interactions (MPI). MPI can lead to multiple hard interactions with large $p_T$, such as double parton scattering. At lower $p_T$ these interactions contribute significantly to the underlying event in an interaction. These additional parton interactions can lead to a higher multiplicity of jets in an event. The colour reconnection between the partons in an interaction, coming from the PS and MPI, also needs to be considered. Experimentally, it is observed that some amount of colour reconnection is required to describe the average $p_T$ of charged particles in minimum bias events [23].

3 Statistical Methods

The production of the Higgs boson in association with two top quarks is a very interesting process, as discussed in Section 2.2. However, this process has not yet been observed with high enough significance to rule out the background-only hypothesis. In this chapter, the statistical method used to extract the signal strength from the data is introduced, from which the observed significance can be calculated and limits can be set on $t\bar{t}H$ production.

In order to maximise the significance of the result, several techniques are used to enhance the separation of signal and background processes. Multivariate techniques are used to utilise as much information as possible in an event to separate the signal from background. Section 3.2 introduces the multivariate technique used in this analysis, Boosted Decision Trees (BDTs). BDTs combine many input variables, building a classifier which is used to separate signal from background. They are used in a two-stage strategy in the analysis.

3.1 Statistical Analysis

In order to observe or set limits on a process, a statistical analysis needs to be performed comparing the data recorded by an experiment to the expectation. The expected signal and background processes are either described by simulation or are derived using data-driven techniques. Test statistics are defined in order to analytically compare different models, so that hypotheses can be tested. From this the observation of a process can be declared or upper limits can be set on models.

3.1.1 Test Statistic and Hypotheses

In a search for an unobserved process, referred to as the signal, two different hypotheses must be tested. The first of these is the background-only hypothesis, in which there is no signal present in the model; this is often called the null hypothesis $H_{\mathrm{null}}$. The second hypothesis is the signal-plus-background model, referred to as the test hypothesis $H_{\mathrm{test}}$. In searches for an unobserved Standard Model process, as is the case for $t\bar{t}H$, the signal-plus-background (S+B) model, which is used for the signal hypothesis, is the Standard Model prediction multiplied by a signal strength parameter $\mu$. The signal strength $\mu$ is defined as the ratio of the hypothesised and predicted cross sections of the signal process,

$$\mu = \frac{\sigma_{\mathrm{hypothesis}}}{\sigma_{\mathrm{SM}}}. \tag{3.1}$$

In this case the hypothesis is written as $H_\mu$. The SM prediction, which by definition has $\mu = 1$, is written as $H_1$ and the null hypothesis, where $\mu = 0$, is written as $H_0$. The first step in observing a new process is to reject the null hypothesis; the second is to reject it in favour of an alternate hypothesis. In order to test the hypothesis one needs to use a test statistic $q$, which under $H_0$ follows a predicted distribution $f(q|H_0)$. In data, a value for the test statistic will be measured ($q_{\mathrm{obs}}$). From this measured test statistic, the hypothesis is tested by calculating the probability to observe a result at least as extreme as the measurement given the background-only hypothesis. This probability is called the p-value and can be calculated as

$$p\text{-value} = \int_{q_{\mathrm{obs}}}^{\infty} f(q|H_0)\,\mathrm{d}q. \tag{3.2}$$

From the p-value, the significance of the observed result $Z$ can be extracted using the relation

$$Z = \Phi^{-1}(1 - p), \tag{3.3}$$

where $\Phi^{-1}$ is the inverse of the cumulative distribution function of the standard Gaussian distribution. The significance is often reported as a number of standard deviations of a Gaussian distribution, $\sigma$. The relationship between the observed test statistic and the p-value is demonstrated by the two distributions in Fig. 3.1. The shaded area in both distributions corresponds to the p-value. It is taken from the test statistic distribution and applied to the Gaussian distribution, from which $Z$ is extracted. In particle physics, the significance required to reject the background-only hypothesis is set to $Z = 5\sigma$, corresponding to a p-value of $2.87 \times 10^{-7}$. To exclude an alternate hypothesis, a 95% confidence level is required. This corresponds to a p-value of 0.05 and $Z = 1.64\sigma$. Another interesting value of significance reported in particle physics occurs at $Z = 3\sigma$, which is when it may be reported that there is evidence for a given process.
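The conversion between p-value and significance in Eq. 3.3 can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names are illustrative and not taken from any analysis software. It uses the symmetry of the Gaussian, $\Phi^{-1}(1 - p) = -\Phi^{-1}(p)$, to avoid loss of precision for very small p-values.

```python
from statistics import NormalDist

_STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def significance_from_p_value(p_value):
    """Eq. 3.3, Z = Phi^-1(1 - p), evaluated as -Phi^-1(p) for precision."""
    return -_STANDARD_NORMAL.inv_cdf(p_value)


def p_value_from_significance(z):
    """One-sided Gaussian tail probability above Z, p = 1 - Phi(Z) = Phi(-Z)."""
    return _STANDARD_NORMAL.cdf(-z)


for z in (1.64, 3.0, 5.0):
    print(f"Z = {z:.2f} sigma  ->  p-value = {p_value_from_significance(z):.3g}")
print(f"p-value = 0.05  ->  Z = {significance_from_p_value(0.05):.3f} sigma")
```

The thresholds used in this chapter can be read off directly: the discovery threshold of $5\sigma$, the evidence threshold of $3\sigma$, and the 95% confidence level used for exclusion.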


When considering a simple analysis with data in a single bin, the expected discovery significance is approximately equal to $S/\sqrt{B}$, where $S$ and $B$ are the number of signal and background events in the bin respectively. This metric holds in the limit $S \ll B$ and where $B$ is sufficiently large.

Figure 3.1: A graphical representation of the p-value and how it corresponds to the significance. In (a) the p-value is extracted from the shaded area of $f(q_\mu|H_0)$ above the observed test statistic $q_\mu^{\mathrm{obs}}$. The significance $Z$ is calculated from the p-value using the standard Gaussian distribution in (b).

3.1.2 Profile Likelihood

In particle physics the common approach used in searches for new processes is to construct a test statistic using the ratio of likelihoods of the signal and null hypotheses, where the likelihood is defined as the probability of the observed data for a given hypothesis. The statistical methods used by the ATLAS and CMS collaborations are presented in Refs. [24, 25]. The likelihood ratio approach derives from the Neyman-Pearson lemma [26], which states that when comparing two simple hypotheses, such as $H_1$ and $H_0$, the most powerful test to reject $H_0$ in favour of $H_1$ at a given confidence level is the ratio of the likelihoods. Given a set of data in a binned distribution with $m$ bins, the likelihood $L$ is defined as the product of Poisson probabilities across all bins as a function of $\mu$ and the set of nuisance parameters $\boldsymbol{\theta}$. Nuisance parameters are additional parameters which need to be taken into account in the fit in addition to the parameter of interest. In particle physics these cover the systematic uncertainties in an analysis. The likelihood is written as

$$L(\mu, \boldsymbol{\theta}) = \prod_{i=1}^{m} \frac{\left(\mu s_i(\boldsymbol{\theta}) + b_i(\boldsymbol{\theta})\right)^{n_i}}{n_i!}\, e^{-\left(\mu s_i(\boldsymbol{\theta}) + b_i(\boldsymbol{\theta})\right)} \prod_{\theta_j \in \boldsymbol{\theta}} f(\theta_j), \tag{3.4}$$

where $s_i(\boldsymbol{\theta})$ and $b_i(\boldsymbol{\theta})$ are the predicted signal and background events in a given bin and $n_i$ is the number of observed data events. The functions $f(\theta_j)$ are the probability density functions for each nuisance parameter, also known as penalty terms. There are three types of penalty terms used for the nuisance parameter constraint terms. The Gaussian distribution is the most common choice; however, log-normal distributions are used for uncertainties on overall normalisations and gamma distributions are used for the uncertainties on numbers of events [27].

There are two common test statistics which are used in particle physics analyses, based on the Neyman-Pearson lemma. The first is the Neyman-Pearson test statistic, which is defined as
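The structure of Eq. 3.4 can be illustrated with a minimal sketch of the negative log-likelihood for a binned model. To keep the example short it assumes a single nuisance parameter $\theta$ with a unit Gaussian penalty term that scales every background bin by $(1 + \delta\,\theta)$, with the signal prediction independent of $\theta$. This parametrisation, the function name and the numerical inputs are illustrative assumptions rather than the model used in the analysis.

```python
import math


def negative_log_likelihood(mu, theta, observed, signal, background,
                            bkg_rel_unc=0.1):
    """-ln L for Eq. 3.4 with one Gaussian-constrained nuisance parameter.

    Assumption: theta shifts each background bin to b_i * (1 + bkg_rel_unc * theta)
    and the signal prediction does not depend on theta.
    """
    nll = 0.0
    for n_i, s_i, b_i in zip(observed, signal, background):
        expected = mu * s_i + b_i * (1.0 + bkg_rel_unc * theta)
        if expected <= 0.0:
            return math.inf  # unphysical expectation, zero likelihood
        # Poisson term: ln[ expected^n e^(-expected) / n! ]
        nll -= n_i * math.log(expected) - expected - math.lgamma(n_i + 1)
    # Penalty term f(theta): unit Gaussian centred on zero
    nll += 0.5 * theta ** 2 + 0.5 * math.log(2.0 * math.pi)
    return nll


# Hypothetical three-bin example in which the data equal the S+B prediction
signal = [2.0, 5.0, 10.0]
background = [10.0, 20.0, 30.0]
observed = [12, 25, 40]
for mu in (0.0, 1.0, 2.0):
    nll = negative_log_likelihood(mu, 0.0, observed, signal, background)
    print(f"mu = {mu:.1f}: -ln L = {nll:.3f}")
```

Because the hypothetical data were chosen to equal the signal-plus-background prediction, the negative log-likelihood is smallest at $\mu = 1$, $\theta = 0$. Pulling $\theta$ away from zero is penalised both by the worse description of the data and by the Gaussian constraint term.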

$$q_{\mathrm{NP}} = -2\ln\frac{L(H_0)}{L(H_1)}, \tag{3.5}$$

where the negative logarithm is added for convenience. However, as most analyses are dealing with a large number of nuisance parameters in addition to the parameter (or parameters) of interest in a hypothesis, another approach is to use a profile likelihood. This is the test statistic which is used in the $t\bar{t}H$ analysis. These nuisance parameters represent systematic uncertainties which arise from theoretical modelling uncertainties, as well as experimental uncertainties coming from the detector calibration and response. The values of nuisance parameters are not known explicitly and must be extracted from the fit at the same time as the parameter, or parameters, of interest. The profile likelihood ratio is given by

$$\lambda(\mu) = \frac{L(\mu, \hat{\hat{\boldsymbol{\theta}}})}{L(\hat{\mu}, \hat{\boldsymbol{\theta}})}. \tag{3.6}$$

In this formalism a single hat represents the value of a parameter which maximises the likelihood; a double hat represents the value which maximises the likelihood for a given value of $\mu$. Again for convenience, the test statistic of the profile likelihood used in the analysis is given by

$$q_\mu = -2\ln\lambda(\mu). \tag{3.7}$$

In this form the higher the value of $q_\mu$, the more incompatible the data are with the given hypothesis. By setting $\mu = 0$ in the test statistic, the discovery statistic $q_0$ can be calculated. Under the background-only hypothesis with data falling either side of the prediction with equal probability, Wilks' theorem states that in the asymptotic region, where the sample size is large and treated as approaching infinity, $q_0$ is distributed like a $\chi^2$ distribution. Equation 3.2 leads to a p-value of $1 - \Phi(\sqrt{q_0})$. By comparing this to Equation 3.3 the following relation is derived,

$$Z = \sqrt{q_0} = \sqrt{-2\ln\lambda(0)}, \tag{3.8}$$

and the discovery significance can be calculated directly for a given observed value [24].
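Equation 3.8 can be evaluated in closed form for the simplest possible case: a single-bin counting experiment with a known background $b$ and no nuisance parameters. This is an illustrative simplification, not the fit model of the analysis. In that case the best-fit expectation equals the observed count $n$, so that $q_0 = 2\left[n\ln(n/b) - (n - b)\right]$. The sketch also assumes the common one-sided convention in which $q_0 = 0$ when the data fall below the background prediction.

```python
import math


def discovery_test_statistic(n_obs, b):
    """q_0 = -2 ln lambda(0) for one bin with known background b.

    Assumptions: no nuisance parameters, and the one-sided convention that
    a deficit (n_obs <= b) gives q_0 = 0.
    """
    if n_obs <= b:
        return 0.0
    return 2.0 * (n_obs * math.log(n_obs / b) - (n_obs - b))


def discovery_significance(n_obs, b):
    """Eq. 3.8: Z = sqrt(q_0)."""
    return math.sqrt(discovery_test_statistic(n_obs, b))


# Hypothetical example: 130 events observed on an expected background of 100
print(f"q_0 = {discovery_test_statistic(130, 100.0):.3f}")
print(f"Z   = {discovery_significance(130, 100.0):.3f} sigma")
```

For this hypothetical excess the significance is just under $3\sigma$, slightly below the naive estimate $(n - b)/\sqrt{b} = 3$.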

3.1.3 Expected Discovery Significance

An important ability in any analysis is to calculate the expected discovery significance before performing a measurement on data. One method is to generate a large number of events based on the null and alternate hypotheses and repeatedly run the analysis chain in order to evaluate the median significance. This would be a very computationally intensive and slow process. In statistical analyses, a single representative dataset can be used instead. This dataset is called the Asimov dataset, deriving its name from the author of the short story Franchise, in which elections are held by selecting the single most representative voter [24, 28]. With the Asimov dataset, the need to perform an ensemble of experiments is replaced by using the dataset which best represents the typical experiment, delivering the desired median sensitivity. It is constructed from the true values of all parameters in a model, for example building the $H_1$ dataset where the content in each bin is equal to the predicted number of signal and background events, $n_i = s_i + b_i$.

The expected significance can be calculated for both the $H_0$ and $H_1$ hypotheses by building two separate Asimov datasets. These are used to place an expected upper limit on the alternate hypothesis and the expected discovery significance respectively.
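For a single-bin counting experiment with a known background and no nuisance parameters (an illustrative simplification), inserting the Asimov dataset $n = s + b$ into $q_0 = -2\ln\lambda(0)$ gives the closed-form median significance $Z_A = \sqrt{2\left[(s + b)\ln(1 + s/b) - s\right]}$. The sketch below compares this with the $S/\sqrt{B}$ approximation from Section 3.1.1; the event yields are hypothetical.

```python
import math


def asimov_significance(s, b):
    """Median discovery significance from the Asimov dataset n = s + b.

    Assumes one bin, a known background b > 0 and no nuisance parameters.
    """
    return math.sqrt(2.0 * ((s + b) * math.log1p(s / b) - s))


def simple_significance(s, b):
    """The S / sqrt(B) approximation, valid for S << B."""
    return s / math.sqrt(b)


# Hypothetical yields: from S << B to S > B
for s, b in ((1.0, 10000.0), (10.0, 100.0), (50.0, 10.0)):
    print(f"s = {s:5.0f}, b = {b:6.0f}: "
          f"Z_A = {asimov_significance(s, b):6.3f}, "
          f"s/sqrt(b) = {simple_significance(s, b):6.3f}")
```

The two estimates agree when $S \ll B$, while $S/\sqrt{B}$ increasingly overestimates the expected significance as the signal becomes comparable to or larger than the background.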

3.1.4 Limit Setting and the CLs Method

In many experiments it is not possible to reject the null hypothesis. In these cases an upper limit is set on the alternate hypothesis, excluding areas of phase space. The method used in particle physics to apply an upper limit is the CLs method, which is used to identify the values of µ which are excluded at a confidence level of 95% [29, 30].

41 3. Statistical Methods 3.1. Statistical Analysis

In the CLs method, a modified p−value is introduced to be used in limit setting instead of using the p−value from the alternate hypothesis directly. This quantity is defined as

\[ \mathrm{CL}_s = \frac{p_{s+b}}{1 - p_b}, \tag{3.9} \]
where $p_{s+b}$ is the probability to obtain a result which is less compatible with the signal-plus-background hypothesis than the observed result, and $p_b$ is the probability to obtain a result which is less compatible with the background-only hypothesis than the observed result. This value is compared to the significance level instead of $p_{s+b}$ to avoid excluding or discovering signals in areas where an analysis has no sensitivity. The motivation for the CLs method is demonstrated in Fig. 3.2. For a signal hypothesis that has very few expected signal events, the distributions of the test statistics for $H_b$ and $H_{s+b}$ will overlap by a large amount. Should the background fluctuate downwards from the expected number of events, it could lead to a very small value of $p_{s+b}$, which would lead to a rejection of the signal-plus-background hypothesis. The CLs method therefore penalises the p−value based on the background probability, making the test more conservative.

When setting upper limits, an alternate test statistic is used for models which predict |µ| > 0. This test statistic is given by

\[
\tilde{q}_\mu =
\begin{cases}
-2\ln\tilde{\lambda}(\mu) & \hat{\mu} \le \mu, \\
0 & \hat{\mu} > \mu
\end{cases}
=
\begin{cases}
-2\ln\dfrac{L(\mu, \hat{\hat{\theta}})}{L(0, \hat{\hat{\theta}})} & \hat{\mu} < 0, \\[2ex]
-2\ln\dfrac{L(\mu, \hat{\hat{\theta}})}{L(\hat{\mu}, \hat{\theta})} & 0 \le \hat{\mu} \le \mu, \\[2ex]
0 & \hat{\mu} > \mu.
\end{cases}
\tag{3.10}
\]

The modified version of the profile likelihood test statistic, $\tilde{q}_\mu$, is used so that an observed µ which is more signal-like than the hypothesised signal is treated the same as the signal hypothesis, and to prevent a negative signal from a downwards fluctuation being used to set a limit.
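The three cases of Eq. 3.10 can be illustrated with a single-bin counting experiment. This is a sketch under the assumption of a Poisson likelihood with expectation µs + b and no nuisance parameters, so that the best-fit signal strength is simply (n − b)/s. The function and argument names are hypothetical.

```python
import math


def q_tilde(mu, n_obs, s, b):
    """Test statistic of Eq. 3.10 for one Poisson bin with mean mu*s + b.

    Assumes b > 0 and no nuisance parameters.
    """
    mu_hat = (n_obs - b) / s  # unconstrained best-fit signal strength
    if mu_hat > mu:
        # Data more signal-like than the hypothesis: no evidence against it
        return 0.0
    # A negative mu_hat is replaced by 0 in the denominator of the ratio
    mu_ref = max(mu_hat, 0.0)
    nu_mu = mu * s + b       # expectation under the tested hypothesis
    nu_ref = mu_ref * s + b  # expectation at the reference signal strength
    # -2 ln[Pois(n | nu_mu) / Pois(n | nu_ref)]
    return 2.0 * ((nu_mu - nu_ref) - n_obs * math.log(nu_mu / nu_ref))


# Illustrative yields only: s = 10, b = 100, testing mu = 1
for n in (120, 100, 90):
    print(n, round(q_tilde(1.0, n, 10.0, 100.0), 3))
```

A downward fluctuation (n = 90) gives a larger value than n = 100, while an upward fluctuation beyond the tested hypothesis (n = 120) gives zero.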


Figure 3.2: Comparison of the test statistics for the signal+background (red) and background-only (blue) hypotheses in two cases for setting the limit on the signal hypothesis from an observed test statistic $q_\mu^{\mathrm{obs}}$. In (a) there is little separation between the signal and background, which can result from a low expected signal yield. In (b) there is a large separation in the distributions of test statistics for the S+B and background-only hypotheses. In both cases $p_{s+b}$ is the same, but in (a) the S+B hypothesis would be rejected even though there is little sensitivity. By using the CLs method, the shaded area in blue in both cases is used to penalise the p−value to prevent this. In the case of (a) this results in the hypothesis not being excluded, as there is no sensitivity; in (b) it has only a small effect and the hypothesis is rejected.
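The effect of the CLs penalty can be sketched with a single-bin counting experiment. The assumptions here are that the observed event count itself is used as the test statistic, with fewer events being less signal-like, and that there are no systematic uncertainties. Under those assumptions $p_{s+b} = P(n \le n_{\mathrm{obs}} \mid s+b)$ and $1 - p_b = P(n \le n_{\mathrm{obs}} \mid b)$. The function names and yields are hypothetical.

```python
import math


def poisson_cdf(n, mean):
    """P(N <= n) for a Poisson distribution with the given mean."""
    term = math.exp(-mean)  # P(N = 0)
    total = term
    for k in range(1, n + 1):
        term *= mean / k    # P(N = k) from P(N = k - 1)
        total += term
    return total


def cls(n_obs, s, b):
    """CLs of Eq. 3.9 for a counting experiment without systematics."""
    p_sb = poisson_cdf(n_obs, s + b)       # p_{s+b}
    one_minus_pb = poisson_cdf(n_obs, b)   # 1 - p_b
    return p_sb / one_minus_pb


# Illustrative case: no events observed, s = 3, b = 5
print(f"p_s+b = {poisson_cdf(0, 8.0):.2e}")
print(f"CLs   = {cls(0, 3.0, 5.0):.4f}")
```

With zero observed events the ratio reduces to exp(−s), independent of b, so the background expectation alone can no longer drive the exclusion of a signal.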


3.2 Boosted Decision Trees

In analyses with a low signal to background ratio, multivariate techniques are used to extract the most information from an event in order to separate the signal events from the background processes. By separating the signal events from the background the sensitivity of the analysis is improved. One of the most widely used techniques in experimental particle physics is the boosted decision tree (BDT). BDTs are used in object reconstruction as well as at analysis level, for example in determining whether a jet originates from a b-hadron, commonly referred to as b-tagging.

A decision tree is a binary tree structure which performs a series of cuts in order to classify events. An example of a decision tree is shown in Fig. 3.3, demonstrating the possible progression of events through a series of cuts in order to classify them as signal-like or background-like. Cuts are applied in succession until a stopping criterion has been reached, such as a maximum depth of the tree or too few events remaining in a node. In the decision trees, cuts are optimised at each node to achieve the best splitting of events. In all trainings performed in the analysis presented in this thesis, the Gini index is used to optimise the separation at each node. The Gini index is defined as

\[ G = P(1 - P), \tag{3.11} \]
where P is the purity of signal in a node, given by the fraction of total events which are signal. For a node which consists entirely of signal or background events, the Gini index is equal to 0. At each node the variable and cut are chosen to maximise the decrease in G between the parent node and the sum of the two daughter nodes after applying the cut. In the calculation, G of each daughter node is weighted by its fraction of the events in the parent node.

Figure 3.3: A schematic of a very basic decision tree. An event starts at the root node at the top and passes through a series of cuts until it ends up in one of the leaf nodes at the bottom. At each node in the tree, shown by the orange circles, a binary cut is applied on the best variable to separate signal from background. A variable may be used multiple times in the same tree. At the leaf nodes at the end of the tree, the events are classified as being more signal or more background-like using the proportion of signal and background, represented by the areas shaded red and blue.

A single decision tree does not necessarily provide the best separation of signal and background events, nor stability against statistical fluctuations in the training dataset used. In a boosted decision tree, the concept of a decision tree is expanded to a forest which combines a large number of decision trees into the same classifier. Each tree in the forest is added iteratively, using information from the previous tree to improve the training. The training events are reweighted for each subsequent training using the output of the classifier in the previous step, with events which were misclassified in the tree assigned larger weights. At each training step a score is assigned to each tree, and the final BDT classifier is constructed from a weighted average over all the trees in the forest. The resulting classifier is more stable against fluctuations in the training sample and should provide an enhanced performance compared to using a single tree.

To mitigate overtraining on a training sample, the decision trees are kept to a short depth with only a few nodes so that the BDT is an ensemble of weak classifiers. Overtraining can occur when a classifier is over-optimised on the training set and has learned features which are specific to it, for example statistical variations and noise. A two-fold cross-validation strategy is also used to remove overtraining on events which would subsequently be used in the evaluation of the BDT. In this strategy, the training sample is split into two equal sized samples and parallel trainings are performed over the two sets. Each training sample is then used as the test sample for the other training. The effect of this is that an event used in the training of a BDT is never evaluated with the BDT from that training.

To evaluate the level of overtraining in the BDT, the ROC curves of the individual testing sets are compared to each other and to the merged sample of both test sets. All three ROC curves are required to show the same characteristics. An additional check is performed on every training, evaluating the BDT using both the training and test samples. The distributions of the signal and background are compared between the responses of the test and training samples to check that they are in agreement with each other and no large differences are observed. In an overtrained BDT, the distributions of the signal and background events in the training samples are skewed in relation to the test samples, showing artificially greater separation.

In the $t\bar{t}H\,(H \to b\bar{b})$ analysis BDTs are utilised as part of a two stage strategy.
The first stage attempts to reconstruct the $t\bar{t}H\,(H \to b\bar{b})$ final state, matching the jets in an event to the partons from the top quark and Higgs boson decays. This then feeds into an event classifier, which is used to separate the signal and background events in the final discriminant variables that are used in the statistical fit. For both stages of this strategy the Toolkit for Multivariate Analysis [31] is used to train and evaluate the BDTs.

Two of the methods by which the performance of a BDT can be evaluated are the separation $\langle S^2 \rangle$, and the area under the Receiver Operating Characteristic (ROC) curve (AUC). The ROC curve for a BDT is constructed as the background rejection efficiency as a function of the signal acceptance efficiency. It illustrates the sensitivity of a classifier in discriminating between signal and background processes, with each point on the curve describing the discrimination ability of the classifier at a given threshold. The larger the area under the ROC curve, the greater the probability that the classifier correctly assigns a signal or background event. It is related to the Gini coefficient of the classifier (not the node-level Gini index of Eq. 3.11) by G = 2 · AUC − 1. For a perfect classifier AUC = 1 and for a uniform classifier with no discriminatory power AUC = 0.5. The separation is used to quantify the performance of a binned classifier and is defined as

\[ \langle S^2 \rangle = \frac{1}{2} \sum_{\mathrm{bins}} \frac{(n_s - n_b)^2}{n_s + n_b}, \tag{3.12} \]
where $n_s$ and $n_b$ are the number of signal and background events in each bin. The ROC curve and the area underneath it are a better description of the overall performance of the BDT, as the bin by bin separation is heavily dependent on binning.
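The quantities defined in this section can be sketched in a few lines: the Gini index of Eq. 3.11 with the daughter-weighted split criterion, the separation of Eq. 3.12, and the area under the ROC curve. This is an illustration rather than the TMVA implementation, and the function names are hypothetical. Two assumptions are made: the histograms entering the separation are first normalised to unit sum, so that the result lies between 0 and 1, and the AUC is computed as the probability that a signal event scores higher than a background event, which equals the area under the background rejection versus signal efficiency curve.

```python
def gini(n_sig, n_bkg):
    """Gini index G = P(1 - P) of a node, with P the signal purity."""
    total = n_sig + n_bkg
    if total == 0:
        return 0.0
    purity = n_sig / total
    return purity * (1.0 - purity)


def gini_decrease(parent, left, right):
    """Decrease in G from a parent node to its two daughters.

    Each node is a (n_sig, n_bkg) pair; daughters are weighted by their
    fraction of the parent's events.
    """
    n_parent = sum(parent)
    weighted = sum(sum(node) / n_parent * gini(*node) for node in (left, right))
    return gini(*parent) - weighted


def separation(sig_hist, bkg_hist):
    """Separation <S^2> for two histograms with identical binning.

    Assumption: each histogram is normalised to unit sum before use.
    """
    sig_total, bkg_total = sum(sig_hist), sum(bkg_hist)
    total = 0.0
    for n_s, n_b in zip(sig_hist, bkg_hist):
        f_s, f_b = n_s / sig_total, n_b / bkg_total
        if f_s + f_b > 0.0:  # skip bins that are empty in both histograms
            total += (f_s - f_b) ** 2 / (f_s + f_b)
    return 0.5 * total


def roc_auc(sig_scores, bkg_scores):
    """AUC as P(signal score > background score), ties counted as 1/2."""
    wins = 0.0
    for s in sig_scores:
        for b in bkg_scores:
            if s > b:
                wins += 1.0
            elif s == b:
                wins += 0.5
    return wins / (len(sig_scores) * len(bkg_scores))


# A cut that isolates all signal removes the full parent Gini index
print(gini_decrease((50, 50), (50, 0), (0, 50)))
# Fully separated and identical distributions bracket the separation
print(separation([4, 0], [0, 7]), separation([1, 1], [5, 5]))
```

The pairwise loop in `roc_auc` scales with the product of the sample sizes, so it is suitable only for small illustrative inputs.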

4. LHC and the ATLAS Experiment

All data used in the analysis presented in this thesis have been collected with the ATLAS experiment at the Large Hadron Collider (LHC). This chapter aims to provide an outline of the CERN accelerator complex and more detailed information about the ATLAS detector.

4.1 The Large Hadron Collider

The Large Hadron Collider is a high energy particle collider based at the European Organisation for Nuclear Research (CERN). Located in the tunnel previously used for the Large Electron Positron collider, the LHC has a circumference of 27 km [32]. The primary function of the LHC is to accelerate and collide bunches of protons p, currently with a centre of mass collision energy of 13 TeV and a design specification of 14 TeV. Proton-proton (pp) runs are used for the study of the elementary particles and interactions in the Standard Model, as well as the search for BSM physics. At the operational centre of mass energy of the LHC of 13 TeV, the total interaction cross section of pp collisions of ∼100 mb is comparable to that of $p\bar{p}$ collisions, which were used in the Tevatron, and using protons has the benefit that the beams are much easier to produce. Furthermore, bunches of heavy lead ions (Pb) are also used during dedicated periods of either Pb-Pb or p-Pb collisions to study quark-gluon plasma, which dominated the first 30 µs of the early universe. During 2017 the first xenon-xenon (Xe-Xe) collisions were performed at the LHC.

The LHC houses four main experiments, ALICE, ATLAS, CMS and LHCb. The ATLAS and CMS detectors are general purpose detectors designed to cover a wide ranging physics programme, including tests of the SM and the search for BSM physics. The major goal of their designs was the search for the Higgs boson [33, 34] following on from the searches undertaken at the Large Electron Positron collider and the Tevatron. The LHCb detector [35] is an asymmetric forward detector. It has a strong focus on particle identification for use in precision measurements of rare hadron decays, and the search for physics beyond the Standard Model. In particular, LHCb is searching for CP violation in b-hadron and c-hadron interactions. The fourth experiment, ALICE [36], runs a physics programme focusing on heavy ion collisions in order to understand the properties of quark-gluon plasma.

The LHC is divided into eight equal octants which are labelled clockwise around the ring from one to eight. ATLAS and CMS are located on opposite sides of the ring at octants one and five respectively. The LHCb detector is located in octant eight and ALICE is in octant two. Injection of the proton and heavy ion beams used in collisions occurs at two points on the ring, one in octant two, between the ATLAS and ALICE detectors, and the other in octant eight between the ATLAS and LHCb detectors. Before the beams are injected into the LHC they are accelerated in a series of sequentially larger accelerators.

4.1.1 The Accelerator Complex at CERN

The accelerator complex at CERN consists of several different linear and synchrotron accelerators. These are used to isolate and accelerate protons, as well as heavy ions, to the required energy for use in LHC collisions. The protons begin as hydrogen before being ionised in an electric field, separating the protons and the electrons in the hydrogen atoms, and accelerated by a 90 kV supply. They are then accelerated to 50 MeV in Linac 2, a linear accelerator. From here they are injected into successively larger synchrotron rings, the first of which is the Proton Synchrotron Booster (PSB). The PSB accelerates the protons to an energy of 1400 MeV before injecting them into the Proton Synchrotron (PS) which in turn increases their energy to 25 GeV. The protons then enter the Super Proton Synchrotron (SPS), which was formerly used as a proton anti-proton collider. It was in this collider where the UA1 and UA2 experiments discovered the $W^\pm$ and $Z^0$ bosons [37–40]. Here the protons are accelerated to 450 GeV before being injected into the LHC.

The beams of protons are injected into two points in the LHC, allowing for two beams to travel clockwise and anticlockwise around the ring in separate beam pipes. The protons are injected in bunches and it takes a total of 15 minutes to inject the beams of protons into the LHC. Once completely filled, the beams are accelerated to 6.5 TeV per beam using superconducting Radio Frequency (RF) cavities. The RF cavities are located in octant four of the LHC. During pp collisions, bunches of protons are separated by 25 ns.

A similar process is used for the acceleration and injection of heavy ions into the LHC. The ions are initially accelerated in Linac 3 and the Low Energy Ion Ring (LEIR) before being injected into the PS. After the PS the ions follow the same steps as described for the protons. The layout of the accelerator complex at CERN and its relation to the LHC is shown in Fig. 4.1.

Figure 4.1: Layout of the CERN accelerator complex [41]. The four main experiments are shown on the LHC ring.

4.1.2 Luminosity

The intensity of beam collisions at any one time is given by the instantaneous luminosity, defined as
\[ L = \frac{n_b f_r n_1 n_2}{2\pi \Sigma_x \Sigma_y}. \tag{4.1} \]

Here $n_b$ is the number of colliding bunch pairs, $f_r$ is the revolution frequency of the beams, $n_i$ is the number of protons per bunch in beam i, and $\Sigma_{x(y)}$ is the beam width along the x(y)-axis. The total delivered luminosity is calculated as the integral of the instantaneous luminosity, $\int L\,dt$, and represents the total amount of data delivered in beam collisions over a given time period. The expected number of occurrences of a process in collisions is calculated as its interaction cross section multiplied by the integrated luminosity. The integrated luminosity is often referred to simply as the luminosity.

During 2016 data taking, the ATLAS detector recorded a peak instantaneous luminosity of $1.38\times10^{34}\ \mathrm{cm^{-2}\,s^{-1}}$, in comparison to the peak instantaneous luminosity recorded during Run 1 of $7.73\times10^{33}\ \mathrm{cm^{-2}\,s^{-1}}$. The longest fill at the LHC also occurred during 2016, with a single beam fill lasting 1 day, 13 hours and 2 minutes; this was fourteen hours longer than the longest fill in Run 1. This fill produced an expected 363 $t\bar{t}H$ events, assuming the SM cross section at 13 TeV of $507.1^{+35}_{-50}$ fb and using the delivered integrated luminosity in the fill of 716.0 $\mathrm{pb^{-1}}$.
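The two relations in this section, Eq. 4.1 and the expected event count given by cross section times integrated luminosity, can be sketched as follows. The function names are hypothetical, and only the cross section and integrated luminosity quoted above for the longest 2016 fill are used as inputs.

```python
import math


def instantaneous_luminosity(n_bunch_pairs, f_rev, n1, n2, sigma_x, sigma_y):
    """Eq. 4.1; the result is in units of 1 / ([sigma]^2 [time])."""
    return n_bunch_pairs * f_rev * n1 * n2 / (2.0 * math.pi * sigma_x * sigma_y)


def expected_events(cross_section_fb, integrated_lumi_inv_pb):
    """Expected number of events, N = sigma x integrated luminosity.

    The cross section is in fb and the luminosity in pb^-1, so the
    luminosity is converted using 1 fb^-1 = 1000 pb^-1.
    """
    return cross_section_fb * integrated_lumi_inv_pb / 1000.0


# ttH yield for the longest 2016 fill, using the values quoted in the text
n_tth = expected_events(507.1, 716.0)
print(f"Expected ttH events: {n_tth:.0f}")
```

The unit conversion is the step most easily overlooked: the cross section is quoted in fb while the fill luminosity is quoted in inverse pb.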

4.2 The ATLAS Detector

The ATLAS detector is located at point one on the LHC. It was designed to encompass a vast programme of high energy particle physics. This includes precision measurements of the SM and searches for new physics beyond the Standard Model (BSM). One of the main goals of the detector was to search for the Higgs boson, the final particle in the SM which had yet to be discovered. In July 2012 both the ATLAS and CMS collaborations announced that they had observed a particle consistent with a Higgs boson [42, 43]. Subsequent measurements of the CP nature and spin of this boson have confirmed it as the discovery of the Higgs boson. This has led to new areas of physics carried out by both experiments, in particular precision measurements of the Higgs boson.

A schematic of the ATLAS detector is shown in Fig. 4.2. In this figure the various subdetectors of the ATLAS detector can be seen. The subdetectors are arranged in cylindrical layers moving radially away from the beampipe. Most subdetectors are split into the barrel region, which is cylindrical about the beam pipe, and endcaps, which are discs at each end of the barrel region, perpendicular to the beam pipe. The barrel region covers the central region with the endcaps providing coverage in the forward regions of the detector.

4.2.1 Coordinate System

The ATLAS detector uses a right-handed coordinate system. The origin is at the nominal interaction point (IP) in the centre of the detector where the two beams collide. The z-axis lies along the direction of the beampipe with the positive direction anticlockwise around the LHC ring. The x-axis points from the IP to the centre of the ring; the y-axis points upwards from the IP. The side of the ATLAS detector which falls in the positive z direction is labelled as the A side; the opposite side is labelled as the C side.

Figure 4.2: A schematic representation of the ATLAS detector [33]. For reference, a 7.5 m high TIE/LN starfighter [44, 45] is shown to scale.

A cylindrical coordinate system is used to describe the paths of particles created in collisions. The azimuthal angle φ is used in the x-y plane and the angle θ is used to measure the angle from the beamline in the y-z plane. More typically, physicists use rapidity y to describe the angle in the y-z plane as the separation in rapidity of two particles is an invariant quantity under a Lorentz boost along the z direction. Rapidity is defined as
\[ y = \frac{1}{2}\ln\left(\frac{E + p_z}{E - p_z}\right). \tag{4.2} \]
For massless particles, and in the relativistic limit for massive particles, another quantity called the pseudorapidity, given by
\[ \eta = -\ln\tan\left(\frac{\theta}{2}\right), \tag{4.3} \]
is used instead. A benefit of using pseudorapidity as opposed to rapidity is that the momentum of the particle is not required in the calculation, and in high energy collisions the values of both y and η are nearly identical. In both instances the quantities are equal to ∞ for a path along the beamline, and for trajectories where θ = π/2, i.e. along the y-axis, they are equal to 0. The distance between objects in the η-φ plane, ∆R, is given by
\[ \Delta R = \sqrt{(\Delta\phi)^2 + (\Delta\eta)^2}. \tag{4.4} \]
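The coordinate definitions of Eqs. 4.2 to 4.4 can be sketched as follows. The function names are hypothetical. One step is added that the text does not spell out: ∆φ is wrapped into the range [−π, π) before ∆R is computed, which is an assumption reflecting the periodicity of the azimuthal angle.

```python
import math


def rapidity(energy, pz):
    """Rapidity y of Eq. 4.2."""
    return 0.5 * math.log((energy + pz) / (energy - pz))


def pseudorapidity(theta):
    """Pseudorapidity eta of Eq. 4.3, with theta measured from the beamline."""
    return -math.log(math.tan(theta / 2.0))


def delta_phi(phi1, phi2):
    """Azimuthal separation wrapped into [-pi, pi) (assumed convention)."""
    return (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance Delta R of Eq. 4.4."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))


# For a massless particle (E = |p|) rapidity and pseudorapidity coincide
theta = 0.5
energy = 10.0
print(rapidity(energy, energy * math.cos(theta)), pseudorapidity(theta))

# Perpendicular to the beamline the pseudorapidity vanishes
print(pseudorapidity(math.pi / 2.0))
```

Without the wrapping, two objects at φ = 3.0 and φ = −3.0 would appear separated by 6.0 rad rather than by the true 2π − 6.0 ≈ 0.28 rad.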

The kinematics of objects in the transverse x-y plane are also useful in particle physics, with the transverse momentum (energy) $p_T$ ($E_T$) being the component of a particle's momentum (energy) which lies in this plane. These can be calculated from the 3-momentum vector p and energy E of an object using the following relations
\[ p_T = |\mathbf{p}| \sin\theta, \qquad E_T = E \sin\theta. \]

These are useful quantities as the vector sum of the $p_T$ ($E_T$) of all objects is zero due to the conservation of energy and momentum. In addition, a quantity called the missing transverse energy, $E_T^{\mathrm{miss}}$, is defined using the negative vector sum of the momenta of all objects in the transverse plane,
\[ E_T^{\mathrm{miss}} = -\sum_{i \in \mathrm{objects}} p_i \sin\theta_i. \tag{4.5} \]
Due to the conservation of energy and momentum, $E_T^{\mathrm{miss}}$ is used to describe the missing transverse energy carried by particles which are not detected by the ATLAS detector, such as neutrinos.
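The transverse quantities above can be sketched for objects given by their Cartesian momentum components. This is an illustration with hypothetical function names, in which each object is an assumed (px, py, pz) tuple and the vector sum in the transverse plane is taken component by component.

```python
import math


def transverse_momentum(px, py, pz):
    """pT = |p| sin(theta), with theta the angle from the z-axis."""
    p = math.sqrt(px * px + py * py + pz * pz)
    theta = math.atan2(math.hypot(px, py), pz)
    return p * math.sin(theta)


def missing_et(objects):
    """Negative vector sum of the transverse momenta of all objects.

    `objects` is a list of (px, py, pz) tuples. Returns the x and y
    components and the magnitude of the missing transverse momentum.
    """
    mex = -sum(px for px, _, _ in objects)
    mey = -sum(py for _, py, _ in objects)
    return mex, mey, math.hypot(mex, mey)


# Illustrative event: two visible objects that do not balance in x
visible = [(30.0, 0.0, 10.0), (-10.0, 0.0, 5.0)]
print(missing_et(visible))  # the imbalance points in the negative x direction
```

The longitudinal components play no role in the sum, reflecting the fact that only the transverse momentum of the initial state is known to be zero.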

4.2.2 Inner Detector

The Inner Detector (ID) is the innermost subdetector of ATLAS and is used to reconstruct the tracks of charged particles created in collisions. The ID comprises three distinct and complementary systems: the Pixel Detector, the Semiconductor Tracker (SCT) and the Transition Radiation Tracker (TRT). With these systems the tracks of charged particles with energies as low as 0.4 GeV can be reconstructed. Calculating the coordinates of the origins of tracks in the detector, known as vertices, is also performed with the ID.

The ID covers the full φ angle with the Pixel Detector and SCT extending to |η| < 2.5 and the TRT covering |η| < 2.0. A cut out schematic of the ID can be seen in Fig. 4.3 with the barrel and endcap components of each subdetector labelled. The ID is surrounded by a 2 T solenoid magnet in the barrel region, with its axis parallel to the beam axis. This magnetic field deflects charged particles, creating a curvature in their trajectory. This makes an accurate measurement of their momenta possible.
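The link between curvature and momentum can be made concrete with the standard relation for a charged particle in a uniform magnetic field, $p_T\,[\mathrm{GeV}] \approx 0.3\, |q|\, B\,[\mathrm{T}]\, R\,[\mathrm{m}]$. This relation is not quoted in the text and is introduced here as an assumption, treating the solenoid field as uniform; the function names are hypothetical.

```python
# Conversion factor c / 1e9, in GeV / (T m) for a particle of unit charge
GEV_PER_TESLA_METRE = 0.299792458


def transverse_momentum_gev(b_field_t, radius_m, charge=1):
    """pT in GeV of a particle bending with radius R in a uniform field B."""
    return GEV_PER_TESLA_METRE * abs(charge) * b_field_t * radius_m


def curvature_radius_m(pt_gev, b_field_t, charge=1):
    """Radius of curvature in metres for a given pT in a uniform field B."""
    return pt_gev / (GEV_PER_TESLA_METRE * abs(charge) * b_field_t)


# A 0.4 GeV track in the 2 T solenoid field quoted above
print(f"R = {curvature_radius_m(0.4, 2.0):.3f} m")
```

Higher momentum tracks bend less, so their radius of curvature, and with it their momentum, is measured with a larger relative uncertainty.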


Figure 4.3: A cut away schematic of the ATLAS Inner Detector [46] showing the location of the barrel and endcap components of each subsystem.

Pixel Detector

One of the most important features of a detector is the ability to reconstruct the primary and secondary vertices in an event with as high a precision as possible. This is especially important for the detection of long lived particles, such as those containing b-quarks and particles in BSM models. For example, the lifetime of ground state B-mesons is $\sim 1.5\times10^{-12}$ s, and such a b-hadron with $p_T$ = 50 GeV travels on average 4.5 mm in the transverse plane before decaying. This results in a vertex of tracks being displaced from the primary interaction vertex. To be able to resolve the separation of vertices to a high precision, it is desirable to have a detector as close to the beamline as possible.

The Pixel Detector is the innermost system in the ID, with the Insertable B-layer (IBL) forming the layer closest to the beam line at a distance of 3.2 cm [47]. The IBL was a new addition to the ATLAS detector for the start of Run 2 of the LHC in 2015. It surrounds the beam pipe, which has an inner radius of 25 mm, with a very small gap of about 2 mm between the envelopes of the beam pipe and the IBL. The next closest layer, the B-layer, is located 5 cm from the beam pipe. In total, the Pixel Detector has four distinct layers of pixel modules in the barrel region, and three discs of pixel modules in each of the endcap regions.
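The flight distance quoted above for a b-hadron follows from the mean transverse decay length, $L_{xy} = (p_T / m)\, c\tau$. In the sketch below the function name is hypothetical and the mass is an assumption, since it is not quoted in the text; a value of about 5.28 GeV, typical of ground state B-mesons, is used.

```python
C_M_PER_S = 299_792_458.0  # speed of light in vacuum


def mean_transverse_flight_mm(pt_gev, mass_gev, lifetime_s):
    """Mean transverse decay length (pT / m) * c * tau, in millimetres."""
    return (pt_gev / mass_gev) * C_M_PER_S * lifetime_s * 1000.0


# pT and lifetime from the text; the mass is an assumed B-meson mass
print(f"{mean_transverse_flight_mm(50.0, 5.28, 1.5e-12):.2f} mm")
```

With these inputs the result is about 4.3 mm, the same scale as the value quoted in the text; rounding the mass down to 5 GeV gives 4.5 mm.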


The addition of the IBL has increased the performance of reconstructing the vertices of an event, allowing the detector to operate at higher luminosities with increased levels of pileup. It has also improved the b-tagging performance of jets due to the increased vertex resolution [48].

The modules in each of the four layers use n+-in-n sensors, which are able to provide a space point in (η, φ, r) space. These space points are used to reconstruct the trajectories of each track. When a particle passes through a silicon sensor it liberates electrons in the material, creating electron-hole pairs. By applying a bias voltage across the silicon the electrons and holes flow towards the positively and negatively doped regions of the silicon. This produces a current which can be read out by electronics and is registered as a hit. On average, charged particles register eight hits in the four distinct layers of the Pixel Detector.

Semiconductor Tracker

Due to the high cost of the pixel modules, the Pixel Detector only covers the region closest to the beampipe. The next component of the ID, the SCT, also utilises silicon modules. However, in the SCT microstrip technology is used instead of pixels. In total the SCT comprises four layers of strip modules in the barrel region and nine planar discs in each of the endcaps. Each layer of the SCT has double sided strip modules, with the sides rotated by 40 mrad with respect to each other. A track crosses on average four layers of the SCT, providing four space points thanks to the stereo angle in each layer.

Transition Radiation Tracker

The last system in the ID is the TRT. Unlike the two systems closest to the beampipe, the TRT does not use silicon based technology. Instead the TRT utilises drift tubes filled with gas. Compared to the other two inner detector subsystems, this technology has a much lower material cost. The drift tubes in the TRT are cylindrical and contain an ionising gas mixture with an anode wire along the central axis of each tube. The walls of each tube are held at a high negative voltage and act as cathodes. When a charged particle passes through the tube it ionises the gas, causing electrons and ions to drift towards the central anode and the tube walls. From the current in each wire hits are recorded.


Although it has lower precision than the two silicon based systems, the TRT contributes to the measurement of a particle's momentum due to the high multiplicity of hits in the detector. A track typically crosses 35-40 tubes. The TRT also aids in particle identification, especially between electrons and pions, due to the transition radiation of incident particles. The energy depositions from transition radiation of electrons are much larger than those from the transition radiation of pions.

For the ATLAS upgrade scheduled to take place in 2025, it is planned to replace the ATLAS ID with a completely silicon based tracking subsystem. This upgrade is known as the Inner Tracker (ITk) [49].

4.2.3 Calorimetry

In order to determine the energy of particles created in collisions, especially of neutral particles which are not detected in the inner detector, calorimeters are required. In ATLAS the calorimeters cover |η| < 4.9 and the full φ angle. The calorimetry system in ATLAS is split into several components which capture information from different processes. The Electromagnetic (EM) calorimeter is used to make measurements of the energies of electrons and photons incident in the detector and covers |η| < 3.2. The Hadronic calorimeter extends up to |η| < 3.2 and is designed to contain collimated showers of charged and neutral particles, known as jets, produced in collisions in order to measure the energy deposited by the showers accurately. Finally, the Forward calorimeter (FCal) covers a phase space close to the beampipe with 3.2 < |η| < 4.9, and is important for calculating $E_T^{\mathrm{miss}}$.

Two technologies are used in the calorimeters, both of which use alternating layers of absorbing material and detection medium. The absorbing materials create showers of particles in the calorimeter which deposit charge in the detection medium, which is used to measure this deposited energy. The two technologies are distinguished by the use of either Liquid Argon (LAr) as the medium with copper or lead absorbers, or scintillating tiles (Tile) with steel as the absorbing material. Figure 4.4 shows the calorimeters in the ATLAS detector, surrounding the ID.

Electromagnetic Calorimeters

In the electromagnetic calorimeters, LAr calorimeters are used in both the barrel and endcap regions. In the barrel region the calorimeter is constructed to provide higher granularity than the endcap region. This high granularity region, covering the same η range as the ID, allows precision measurements of the energy clusters of electrons and photons. In the case of electrons these clusters can be matched to tracks in the ID. The barrel covers |η| < 1.475 with the endcaps covering 1.375 < |η| < 3.2. The design energy resolution of the EM calorimeter is given in Table 4.1, with the relative resolution improving for higher energy deposits.

Figure 4.4: A cut away schematic of the ATLAS calorimeters, shown around the ID [50].

Hadronic Calorimeters

In the barrel region, the hadronic calorimeters use Tile calorimeters with the region further split into the barrel and extended barrel region. The endcaps and FCal both use LAr calorimeters. The hadronic calorimeters need to provide enough depth to cover the entire collimated shower of the jets of particles. To fulfil this requirement, the barrel and the extended barrel region, which together cover |η| < 1.7, have a depth in the radial direction of just under 2 m. The LAr hadronic endcap then covers 1.5 < |η| < 3.2 and the FCal covers the remaining phase space 3.2 < |η| < 4.9. Additionally, the designed energy resolutions of the electromagnetic and hadronic calorimeters are shown in Table 4.1.


Calorimeter                   Design resolution σ_E/E
EM                            10%/√E ⊕ 0.7%
Hadronic barrel & endcaps     50%/√E ⊕ 3%
FCal                          100%/√E ⊕ 10%

Table 4.1: The designed energy resolution performance of the ATLAS calorimeters [33] as a function of energy in GeV.
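The design resolutions in Table 4.1 can be evaluated at a given energy by adding the stochastic term, which falls as 1/√E, and the constant term in quadrature, which is the meaning of the ⊕ symbol. The sketch below uses the values from the table; the dictionary and function names are hypothetical.

```python
import math

# (stochastic term, constant term) as fractions, from Table 4.1
DESIGN_RESOLUTION = {
    "EM": (0.10, 0.007),
    "Hadronic barrel & endcaps": (0.50, 0.03),
    "FCal": (1.00, 0.10),
}


def relative_resolution(energy_gev, stochastic, constant):
    """sigma_E / E = stochastic / sqrt(E) (+) constant, added in quadrature."""
    return math.hypot(stochastic / math.sqrt(energy_gev), constant)


for name, (stochastic, constant) in DESIGN_RESOLUTION.items():
    for energy in (10.0, 100.0, 1000.0):
        res = relative_resolution(energy, stochastic, constant)
        print(f"{name:28s} E = {energy:6.0f} GeV  sigma_E/E = {100 * res:.2f}%")
```

At low energies the stochastic term dominates, while at high energies the relative resolution approaches the constant term.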

4.2.4 Muon System

After the calorimeters, the only SM particles which have not decayed and deposited all their energy as they pass through the ID and calorimeters are muons and neutrinos. As neutrinos are extremely weakly interacting, they hardly interact with the detector and so only the transverse energy they take with them can be calculated. However, a dedicated subdetector exists for muons.

Due to their long lifetime and low interaction rate, muons produced in collisions or as decay products in the ATLAS detector travel through the entirety of the detector without depositing much of their energy. For this reason, dedicated muon chambers are constructed to measure their momenta. In total there are four different muon chamber types employed in ATLAS. These are Thin Gap Chambers (TGCs), Resistive Plate Chambers (RPCs), Monitored Drift Tubes (MDTs) and Cathode Strip Chambers (CSCs). The chambers are arranged in three cylindrical planes in the central region and three discs at each end of the detector.

The MDTs and CSCs are both used for precision measurements of the muon tracks. The MDTs are used over the majority of the η range covered by the detector. They provide a robust measurement of a muon's momentum due to the large number of drift tubes through which the muons pass. The CSCs are multiwire proportional chambers which provide a higher granularity measurement and are used only in the innermost plane of the detector in the range 2.0 < |η| < 2.7.

Both the TGCs and RPCs are used as part of the trigger system and both provide a second coordinate measurement of the muon track in the bending φ plane, which complements the coordinate measurement from the MDTs. The TGCs are used in the endcap regions and the RPCs are used in the central barrel region. The triggering aspect of these chambers covers the region |η| < 2.4 with the full range being |η| < 2.7.

ATLAS includes three air core toroidal magnets producing a large magnetic field surrounding the muon chambers and calorimeters. This field deflects the paths of charged muons as they travel through the detector, allowing their momenta to be accurately measured. The barrel toroid covers |η| < 1.4 and the two endcap toroids cover 1.6 < |η| < 2.7. In the transition region of 1.4 < |η| < 1.6 the deflection of muons is provided by the combination of the fields from the endcap and barrel toroids. The magnetic field strength of the barrel and endcap toroids is 0.5 T and 1.0 T respectively. Figure 4.5 shows a schematic view of the muon system in ATLAS, with the four chamber types and magnets labelled.

Figure 4.5: A cut away schematic of the ATLAS muon system [51].

4.2.5 Trigger System

A crucial component of data acquisition with the ATLAS detector is the trigger system. Due to the limit on how many events can be saved per second and the high rate of pp collisions, it is vital to be able to identify and process only interesting events. The ATLAS trigger system consists of two levels of event selection, Level 1 (L1) and the High Level Trigger (HLT) [52]. For an event to be processed and subsequently saved it must pass the trigger requirements.

The L1 trigger is implemented in hardware using very fast electronics in order to be able to make a decision on whether an event is interesting within 2.5 µs. This reduces the rate from 40 MHz to a stream at 100 kHz. Signatures of events with high pT objects as well as events with large ETmiss are identified using the L1 trigger, which selects regions of interest (RoIs) in the muon spectrometer and the calorimeters with a coarse resolution. L1 trigger signatures from the calorimeter systems consist of clusters with a large sum of transverse energy, and provide RoIs for electrons/photons, taus and jets. The muon L1 trigger systems use the dedicated muon trigger chambers to quickly measure the pT of muons at a low granularity.

Events that pass the L1 trigger are then fed into the HLT, which has an output rate of 1.0 kHz. The HLT uses the regions of interest from the L1 trigger but makes use of the full detector information to reconstruct the physics objects using dedicated offline-like trigger algorithms. The HLT is implemented in software with the reconstruction algorithms running on a large processing farm. These algorithms typically take up to 300 ms to make a decision on whether to keep an event. It is possible to perform full event reconstruction with the HLT. From the HLT, events are saved into several data streams. Different streams are used for physics analysis, trigger level analysis, monitoring of the detector, and detector calibration.

To ensure the highest efficiency for data taking in the varying conditions in the ATLAS detector, a set of trigger menus is used. These trigger menus define the list of L1 and HLT triggers enabled at any given time, with each HLT trigger paired with an L1 trigger as a seed. One of the goals of the trigger menu is to maximise the rate of events with interesting objects whilst keeping the threshold for satisfying the trigger as low as possible. In addition, some triggers have a prescale applied, which reduces the rate at which the trigger accepts events. For example, if an L1 trigger has a prescale of 10, then only one in 10 events which satisfy the trigger is passed to the HLT. Primary triggers used for the main physics analyses are typically unprescaled, whereas additional triggers which only require a lower rate are prescaled to reduce their impact on the number of events processed.
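The rate reductions and the prescale example above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the rates are those quoted in the text, while the function names and the counter-based prescale are assumptions made for this example and do not describe the actual ATLAS trigger implementation.

```python
# Rates quoted in the text, in Hz.
COLLISION_RATE_HZ = 40e6    # rate entering the L1 trigger
L1_OUTPUT_RATE_HZ = 100e3   # rate passed from L1 to the HLT
HLT_OUTPUT_RATE_HZ = 1.0e3  # rate written out by the HLT


def reduction_factor(rate_in_hz, rate_out_hz):
    """Number of input events per accepted event."""
    return rate_in_hz / rate_out_hz


def prescaled_rate(raw_rate_hz, prescale):
    """Average accept rate once a prescale of N (keep 1 in N) is applied."""
    if prescale < 1:
        raise ValueError("prescale must be at least 1")
    return raw_rate_hz / prescale


def keep_event(firing_index, prescale):
    """Hypothetical counter-based prescale: keep every N-th firing.

    firing_index counts, from 0, the events that satisfied the trigger.
    """
    return firing_index % prescale == 0


if __name__ == "__main__":
    print(reduction_factor(COLLISION_RATE_HZ, L1_OUTPUT_RATE_HZ))   # L1 reduction
    print(reduction_factor(L1_OUTPUT_RATE_HZ, HLT_OUTPUT_RATE_HZ))  # HLT reduction
    print(sum(keep_event(i, 10) for i in range(1000)))              # kept firings
```

With the quoted rates, L1 keeps one event in 400 and the HLT keeps one in 100 of those, an overall reduction by a factor of 40 000.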

Monitoring of the Trigger Tracking Algorithms

In the HLT, dedicated algorithms are used to reconstruct objects in order to process events as quickly as possible. To monitor the performance of the tracking algorithms in the ATLAS HLT, special monitoring algorithms are used. These algorithms run over events passing dedicated monitoring triggers with reduced track requirements.

For the 2015 and 2016 data taking runs the author was responsible for writing and maintaining a monitoring algorithm for the ID trigger tracking. In the monitoring algorithm, muons from Z → µµ decays are used to measure the performance of two track reconstruction algorithms with the Tag and Probe method. Events are taken from a dedicated trigger stream selecting events with two muons. One muon is required to pass the full reconstruction with tracks in both the ID and MS; the second muon is triggered only on an MS track. To ensure a high purity of Z → µ+µ− events, the invariant mass of the two muons is required to be within 5 GeV of the Z mass and the muons are required to have opposite-sign charges. The muon which passes the full track requirements is labelled as the tag muon and the muon which is reconstructed with an MS track is labelled as the probe muon. To monitor the efficiency of two different track reconstruction algorithms, tracks from the ID using each algorithm are matched to the probe track from the MS. The two algorithms are the Fast Track Finder (FTF) and Precision Tracking (PT). The FTF is an algorithm designed to quickly provide tracks of a medium quality, ignoring hits in the TRT. PT takes FTF tracks as seeds and includes the TRT hits. The full track is subsequently recalculated using a global χ2 fit. This two-step trigger strategy is used to reduce the required computation time.

Tracks are matched by looking within a region of interest around the probe track. If an ID track falls within a tolerable separation in ∆R of the probe, the invariant mass of the ID track and the tag track is computed to see if it is compatible with the Z boson mass. If an ID track is found for an algorithm which is within 5 GeV of the Z boson mass and has been matched to the MS probe track, then the algorithm has been successful in reconstructing the track.
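The matching logic described above can be sketched as follows. This is a minimal illustration, not the monitoring algorithm itself: tracks are represented as (pT, η, φ) tuples in GeV, the ∆R tolerance `MAX_DELTA_R` is a hypothetical value (the text does not quote one), and all function names are invented for this example.

```python
import math

Z_MASS_GEV = 91.1876       # Z boson mass
MUON_MASS_GEV = 0.10566    # muon mass
MASS_WINDOW_GEV = 5.0      # window quoted in the text
MAX_DELTA_R = 0.1          # hypothetical matching tolerance, not from the text


def four_momentum(track):
    """Convert a (pT, eta, phi) track in GeV to (E, px, py, pz), assuming a muon."""
    pt, eta, phi = track
    px, py, pz = pt * math.cos(phi), pt * math.sin(phi), pt * math.sinh(eta)
    energy = math.sqrt(px * px + py * py + pz * pz + MUON_MASS_GEV ** 2)
    return energy, px, py, pz


def invariant_mass(track_a, track_b):
    """Invariant mass of the two-track system in GeV."""
    e, px, py, pz = (a + b for a, b in zip(four_momentum(track_a), four_momentum(track_b)))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))


def delta_r(track_a, track_b):
    """Angular separation, with the phi difference wrapped into [-pi, pi]."""
    d_eta = track_a[1] - track_b[1]
    d_phi = math.remainder(track_a[2] - track_b[2], 2.0 * math.pi)
    return math.hypot(d_eta, d_phi)


def probe_found(tag, probe, id_tracks):
    """True if any ID track matches the MS probe and pairs with the tag near the Z mass."""
    return any(
        delta_r(track, probe) < MAX_DELTA_R
        and abs(invariant_mass(tag, track) - Z_MASS_GEV) < MASS_WINDOW_GEV
        for track in id_tracks
    )


def efficiency(events):
    """events: list of (tag, probe, id_tracks) for one algorithm, e.g. FTF or PT."""
    if not events:
        raise ValueError("no tag and probe pairs")
    return sum(probe_found(*event) for event in events) / len(events)
```

Running `efficiency` once with the FTF track collection and once with the PT track collection, in bins of the probe pT, η or φ, would give efficiency histograms of the kind described in the text.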

For each tracking algorithm the performance was calculated as a function of several tracking parameters, such as the pT, η and φ of the probe muon. Histograms showing the efficiencies were generated during each data taking run and were available in the online monitoring system. Two such histograms are shown for the PT and FTF tracks in Fig. 4.6, using online data taken during a single run in 2015. The histograms were monitored during data taking to detect issues with the performance of the tracking algorithms, for example a drop in efficiency for certain values of track η or track pT, which would impact the performance of the triggers.

Figure 4.6: The performance of the Precision Tracking (PT) and Fast Track Finder (FTF) algorithms, shown as a function of probe muon track pT (a) and probe muon track η (b). Efficiencies are calculated online using the tag and probe method with events from Z → µ+µ−. [Plots from IDTPMonitor, Data 15, Run 283429: efficiency of FTF and PT tracks versus track pT in MeV (a) and track η (b).]

4.2.6 Luminosity

The delivered luminosity is monitored by the ATLAS detector using several different luminometers which measure the visible interaction rate per bunch crossing. The visible interaction rate is the true interaction rate multiplied by the efficiency of the detector and method used to measure it. The two main luminometers, BCM (Beam Condition Monitor) and LUCID (LUminosity measurement using a Cherenkov Integrating Detector), are both dedicated subdetectors which measure the bunch-by-bunch interaction rate [53].

The BCM consists of four diamond sensors arranged in a cross pattern around the ATLAS beampipe in the forward regions at |η| = 4.2. A hit is recorded if any sensor produces a signal above a preset threshold, which provides a low acceptance signal of the bunch-by-bunch luminosity. The LUCID detector is a Cherenkov detector consisting of aluminium photomultiplier tubes (PMTs) in the very forward regions of the ATLAS detector, 5.6 < |η| < 6.0. If one of the LUCID PMTs records a Cherenkov photon with an energy above a preset threshold during collisions, a hit is registered for that bunch crossing, similarly to the BCM. Both the BCM and LUCID subdetectors are situated at both ends of the ATLAS detector, with each side treated as an independent measurement. In addition to the dedicated systems, complementary measurements of the interaction rate of pp collisions are performed using other ATLAS subdetectors, for example by counting the number of tracks reconstructed by the inner detector.


The measurements of the interaction rate are calibrated to the true instantaneous luminosity using the van der Meer method, as described in Ref. [53]. The van der Meer method is performed using data collected from van der Meer scans performed in dedicated fills of the LHC. In these fills there are fewer colliding bunches with lower intensities, and the beams are scanned in both the horizontal and vertical directions.

5. Object Reconstruction

This chapter aims to give an overview of how the physics objects used in analyses are reconstructed from their interactions with the ATLAS subdetectors. In the ATLAS collaboration, dedicated combined performance groups provide recommendations for the reconstruction of physics objects and the quality requirements that should be applied to them. Associated systematic uncertainties and corrections are also provided.

Tracks and Vertices

Tracks in the ID are used to reconstruct physics objects and, alongside vertices, are used to apply object quality requirements on reconstructed objects. Tracks are crucial for electron, muon and vertex reconstruction, with vertex reconstruction vital in jet flavour tagging.

ID tracks are reconstructed within |η| < 2.5, corresponding to the η coverage of both the Pixel and SCT subdetectors. Spacepoints of the recorded hits in each layer of the Pixel and SCT are used as seeds in a pattern recognition algorithm to reconstruct tracks. The algorithm reconstructs tracks starting from the hits in the innermost layer and progresses radially outwards, matching hits in the next layer to the track as it moves away from the beamline. Once the algorithm has progressed through all silicon layers, the tracks are projected into the TRT where the full track is refit using the additional hits recorded in the TRT. As a quality requirement, a track needs to be matched to at least seven hits in the silicon detectors with at most one hole in the pixel detector and at most two in the SCT [54]. A hole is defined as an active sensor traversed by a reconstructed track that does not register any hits, but which is located in between two layers with hits successfully assigned to the track.

Vertices are reconstructed by grouping tracks based on their impact parameters with respect to the beamline. A minimum of two tracks is required for a vertex. The transverse impact parameter, d0, is defined as the closest approach of a track to the primary vertex in the transverse plane. The longitudinal impact parameter, z0, is defined as the distance between the z coordinate of the primary vertex and the location of closest approach of a track in the transverse plane. The transverse impact parameter is illustrated by the cartoon in Fig. 5.1. Tracks are tested for consistency with a vertex using a χ2 fit. They are discarded from a vertex if they are incompatible by more than 7σ in the χ2 fit. Vertices are built iteratively from the tracks until all tracks are associated to a vertex or it is no longer possible to construct additional vertices. For tracks to be considered in constructing a vertex, they must have pT > 400 MeV and fall within |η| < 2.5. Additional requirements on the number of hits in the SCT are applied, as well as the requirement that there is at least one hit in the innermost two pixel layers. The number of allowed holes in the SCT is reduced to a maximum of one, with any track with a hole in the pixel detector vetoed [55]. The primary vertex of an event is defined as the vertex with the highest sum of associated track pT.

The track resolution is quantified by the resolution of the impact parameters measured with respect to the primary vertex. The d0 and z0 resolutions are strongly dependent on the track kinematics; tracks with lower pT and larger values of |η| have a much larger uncertainty.

In a single bunch crossing there are multiple proton-proton interactions, resulting in multiple primary vertices per crossing. In the 2015+2016 pp dataset recorded by the ATLAS detector there was an average of 23.7 interactions per crossing. These multiple interactions are referred to as pileup, and can result in tracks and vertices from additional interactions being included in an event of interest which has passed the trigger requirements.
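The primary vertex definition above lends itself to a short sketch. The code below is an illustration under simplifying assumptions: each vertex is just a list of (pT in MeV, η) track tuples, only the pT, |η| and track multiplicity requirements quoted in the text are applied (the hit and hole requirements are omitted), and the function names are invented for this example.

```python
MIN_TRACK_PT_MEV = 400.0     # from the text
MAX_TRACK_ABS_ETA = 2.5      # from the text
MIN_TRACKS_PER_VERTEX = 2    # from the text


def track_is_usable(pt_mev, eta):
    """Kinematic requirements for a track to be used in vertex construction."""
    return pt_mev > MIN_TRACK_PT_MEV and abs(eta) < MAX_TRACK_ABS_ETA


def select_primary_vertex(vertices):
    """Return the index of the vertex with the highest sum of track pT.

    vertices: list of vertices, each a list of (pT [MeV], eta) tuples.
    Returns None if no vertex has enough usable tracks.
    """
    best_index, best_sum = None, float("-inf")
    for index, tracks in enumerate(vertices):
        usable = [pt for pt, eta in tracks if track_is_usable(pt, eta)]
        if len(usable) < MIN_TRACKS_PER_VERTEX:
            continue
        if sum(usable) > best_sum:
            best_index, best_sum = index, sum(usable)
    return best_index
```

In an event with pileup, the other vertices returned by the vertex finder would then be treated as additional interactions in the same bunch crossing.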

Muons

Muons are reconstructed by combining tracks in the muon spectrometer with those in the inner detector. Tracks are reconstructed in the MS by searching for hit patterns inside each chamber to form segments. Tracks in the MDT chambers are searched for in a trajectory aligned with the bending plane of the toroidal magnet, with segments from each chamber reconstructed with a straight line fit through the hits in each layer. In the RPCs and TGCs hits are measured as a coordinate orthogonal to the bending plane, and segments in the CSCs are built separately in the η-φ plane. Muon candidates are built by fitting together segments from hits in different layers, starting with segments in the middle layers of the MS where more trigger hits are recorded. These segments are used as seeds for the muon reconstruction algorithm. At least two segments are required to build a muon candidate. However, in the transition region between the barrel and endcaps a single high quality segment with η and φ information can be used to build a track. Segments can be used to build multiple muon track candidates, and an overlap removal algorithm is employed to select the best track, or to assign a segment to be shared between two tracks [56].

Figure 5.1: A cartoon representing the definition of the transverse impact parameter d0 of a track with respect to the primary vertex of an event, shown in the x-y plane. The longitudinal impact parameter z0 is the distance of closest approach along the z-axis.

In the analysis presented in this thesis, muons are required to be in the range |η| < 2.5 and are reconstructed by combining tracks from both the MS and ID; these are known as combined muons. For combined muons, the tracks in the ID and MS are reconstructed independently before being matched. A global fit is then performed over all the hit information. Hits from the MS may be added or removed during the reconstruction to improve the quality of the fit. Most muons are reconstructed by taking the MS track and extrapolating inwards to match it to an ID track, but a complementary method starting with an ID track and extrapolating outwards is also employed. This inside-out reconstruction accounts for roughly 0.5% of the reconstructed muons used in the analysis [56].

As a quality requirement, the muon candidates are required to have segments with at least three hits in at least two MDT layers, except in the very central region |η| < 0.1. In the central region, tracks with at least one MDT layer but no more than one MDT hole layer are allowed. The ID tracks are required to have at least one Pixel hit, five SCT hits, fewer than three Pixel and SCT holes, and at least 10% of the TRT hits originally assigned to the ID track included in the final combined muon fit. The TRT requirement is dropped outside the TRT acceptance region (2.0 < |η| < 2.5). In order to reject hadrons misidentified as muons, the ratios of charge to momentum measured in the MS and ID are required to be compatible within 7σ, where the significance is the absolute difference between the two measurements divided by the sum in quadrature of their uncertainties. A cut on the muon transverse impact parameter significance of d0/σ(d0) < 3 is applied to remove muons that are not matched to the primary vertex in an event. Muons which pass these quality requirements are called Medium quality muons.

Muons are additionally required to pass an isolation requirement. Isolation quantifies the energy belonging to the muon in comparison to the total energy in a cone surrounding the muon track. A gradient isolation requirement is used to reject background objects; it cuts on the pT of tracks around the muon, with a threshold that scales with the pT of the reconstructed muon. This results in a tighter isolation requirement on muons with lower pT.
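The charge-to-momentum compatibility requirement described above can be written as a small function. This is a sketch under the assumption that the significance is the absolute difference of the two q/p measurements divided by their uncertainties summed in quadrature; the function and argument names are invented for this example.

```python
import math

MAX_QP_SIGNIFICANCE = 7.0   # threshold quoted in the text


def q_over_p_significance(qp_id, sigma_id, qp_ms, sigma_ms):
    """|q/p(ID) - q/p(MS)| divided by the uncertainties summed in quadrature."""
    return abs(qp_id - qp_ms) / math.hypot(sigma_id, sigma_ms)


def passes_q_over_p(qp_id, sigma_id, qp_ms, sigma_ms):
    """True if the ID and MS measurements are compatible within the threshold."""
    return q_over_p_significance(qp_id, sigma_id, qp_ms, sigma_ms) < MAX_QP_SIGNIFICANCE


if __name__ == "__main__":
    # Illustrative numbers in GeV^-1: a 40 GeV ID track and a 42 GeV MS track.
    print(passes_q_over_p(1 / 40.0, 5e-4, 1 / 42.0, 5e-4))
    # A large ID/MS mismatch, as might arise from a hadron decaying in flight.
    print(passes_q_over_p(1 / 40.0, 5e-4, 1 / 20.0, 5e-4))
```

The first candidate is kept and the second rejected, since a momentum measured in the MS that differs strongly from the ID measurement gives a large significance.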

The reconstruction efficiency of muons as a function of pT and η is compared to simulation in Fig. 5.2. Muons from Z-boson and J/ψ decays are used to measure the efficiencies, with J/ψ → µµ decays used in addition to Z → µµ to cover the low pT region. Discrepancies are seen between the efficiencies observed in data and those from simulation, with a lower reconstruction efficiency seen in data. The overall drop in efficiency seen for Medium quality muons in the region |η| < 0.1 is due to an acceptance gap in the MS. Scale factors are extracted from the differences in the efficiency of muons measured in data and simulation, and are applied to events from Monte Carlo simulation used in analyses. Uncertainties from the extraction of the scale factors are considered as sources of systematic uncertainty.
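As a minimal sketch of how such a correction works, the scale factor can be taken as the ratio of the efficiency in data to that in simulation and applied as a multiplicative weight on simulated events. The efficiencies used below are invented for illustration, and the function names are assumptions of this example rather than part of any ATLAS software.

```python
import math


def scale_factor(efficiency_data, efficiency_mc):
    """Ratio of the efficiency measured in data to that found in simulation."""
    if efficiency_mc <= 0.0:
        raise ValueError("simulated efficiency must be positive")
    return efficiency_data / efficiency_mc


def corrected_event_weight(event_weight, scale_factors):
    """Multiply a simulated event weight by the scale factor of each lepton."""
    return event_weight * math.prod(scale_factors)


if __name__ == "__main__":
    # Hypothetical efficiencies for two muons in different (pT, eta) bins.
    sfs = [scale_factor(0.980, 0.990), scale_factor(0.985, 0.990)]
    print(corrected_event_weight(1.0, sfs))
```

Because the efficiency in data is lower than in simulation, each scale factor is slightly below one and the corrected event weight is reduced accordingly.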

Jets

Jets are reconstructed from energy deposits in the EM and hadronic calorimeters using the anti-kt jet clustering algorithm [57] with a radius parameter of R = 0.4. Energy deposits in neighbouring cells in the calorimeters form topological clusters. The algorithm is seeded by a large energy deposit in the calorimeter and groups neighbouring clusters together to build the jets. The four momenta of jets are constructed from the energy and location of clusters in the calorimeters, using the direction of the cluster with respect to the primary vertex. Jets are reconstructed using the clusters in the hadronic and electromagnetic calorimeters, with the energy of the jet calibrated to the electromagnetic scale using the measured energy deposited by the shower in the EM calorimeters.

Figure 5.2: The muon reconstruction efficiency for combined muons with the Medium quality requirements. The efficiency is measured as a function of η (a) and pT (b) in both simulation and data [56]. Events are selected using the tag and probe method from Z and J/ψ dimuon decays. In (a) Loose muons, which have reduced requirements for matching tracks in the MS and ID, are also compared for |η| < 0.1. [Plots: ATLAS, √s = 13 TeV, 3.2 fb−1; efficiency and data/MC ratio versus η (a) and pT in GeV (b).]

The calibrations are performed using the jet energy scale calculated from simulation and applying in situ corrections derived from 13 TeV measurements. These are performed in order to correct for the response of the calorimeters in measuring the total energy of the jet. The response of the calorimeters for jets of five different energies, measured from simulated dijet events, can be seen in Fig. 5.3. A reduced energy response in the calorimeters is measured for jets with lower energies at truth level. Additionally, it can be seen that lower energy jets have a more central distribution in the detector. Dips in the average energy response correspond to gaps and transition regions between different subdetectors. The calibration factor applied to the jets is parametrised as a function of the reconstructed jet energy, and is taken as the inverse of the average energy response [58]. A large number of systematic uncertainties are assigned to account for the calculation and calibration of the jet energy scale.
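The idea of a calibration factor taken as the inverse of the average response can be sketched as below. The response points are invented purely for illustration and are not ATLAS values; the linear interpolation in reconstructed energy and the function names are likewise assumptions of this example, and the η dependence and in situ corrections are ignored.

```python
from bisect import bisect_left

# Hypothetical (reconstructed energy [GeV], average response) points for one
# eta bin. These numbers are illustrative only and are not ATLAS measurements.
RESPONSE_POINTS = [(20.0, 0.60), (50.0, 0.70), (100.0, 0.75), (400.0, 0.82), (1000.0, 0.85)]


def average_response(e_reco):
    """Linearly interpolate the average response, clamped at the end points."""
    energies = [energy for energy, _ in RESPONSE_POINTS]
    if e_reco <= energies[0]:
        return RESPONSE_POINTS[0][1]
    if e_reco >= energies[-1]:
        return RESPONSE_POINTS[-1][1]
    upper = bisect_left(energies, e_reco)
    (e0, r0), (e1, r1) = RESPONSE_POINTS[upper - 1], RESPONSE_POINTS[upper]
    return r0 + (r1 - r0) * (e_reco - e0) / (e1 - e0)


def calibrated_energy(e_reco):
    """Apply the calibration factor, i.e. the inverse of the average response."""
    return e_reco / average_response(e_reco)
```

With these illustrative numbers, a jet reconstructed at 50 GeV has a response of 0.70 and is therefore scaled up by a factor of about 1.43.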

Jets are required to be within |η| < 2.5 and have pT > 25 GeV. Quality requirements are imposed on jets to remove those not originating from collisions as well as detector noise. To reduce the selection of jets originating from neighbouring bunch crossings and other pp collisions in the same bunch crossing, known as pileup, a requirement is imposed on jets with pT < 50 GeV and |η| < 2.4 using a multivariate discriminant, the Jet Vertex Tagger [59]. The Jet Vertex Tagger combines the number of primary vertices in an event with track based variables to reject the majority of these jets.

Figure 5.3: The energy response of the ATLAS calorimeters as a function of jet η for truth jets with different energies. The response is the fraction of energy measured by the calorimeters given the truth energy of the jet, taken from simulated dijet events [58]. [Plot: ATLAS Simulation, √s = 13 TeV, Pythia dijet, anti-kt R = 0.4, EM scale; energy response versus ηdet for Etruth = 30, 60, 110, 400 and 1200 GeV.]

Jets with a larger radius parameter of R = 1.0 are constructed using a technique called reclustering [60]. These large-R jets are constructed using the anti-kt algorithm, taking smaller radius jets with R = 0.4 that have already been reconstructed as inputs into the algorithm. This is instead of running the anti-kt algorithm on the clusters in the same way as the jets with R = 0.4 are constructed. The benefit of this is that the large-R jets can use the energy calibration of the smaller constituent jets instead of using separate energy calibrations performed for jets with R = 1.0. Large-R jets are primarily used for the decay of high pT particles where the jets from the decay products have substantial overlap, resulting in a single large merged jet.
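Reclustering can be illustrated with a naive implementation of the anti-kt algorithm that takes already reconstructed small-R jets as its inputs. The sketch below is written for clarity rather than speed and is not the FastJet implementation used in practice; it uses the standard anti-kt distance measures, d_ij = min(1/pT,i², 1/pT,j²) ∆R²/R² and d_iB = 1/pT,i², and all function names are invented for this example. Inputs are assumed to have non-zero pT.

```python
import math


def four_momentum(pt, eta, phi, mass=0.0):
    """(pT, eta, phi, m) in GeV to [E, px, py, pz]."""
    px, py, pz = pt * math.cos(phi), pt * math.sin(phi), pt * math.sinh(eta)
    return [math.sqrt(px * px + py * py + pz * pz + mass * mass), px, py, pz]


def pt_of(p):
    return math.hypot(p[1], p[2])


def rapidity(p):
    return 0.5 * math.log((p[0] + p[3]) / (p[0] - p[3]))


def delta_r2(p, q):
    """Squared separation in rapidity and azimuth, with phi wrapped to [-pi, pi]."""
    d_y = rapidity(p) - rapidity(q)
    d_phi = math.remainder(math.atan2(p[2], p[1]) - math.atan2(q[2], q[1]), 2.0 * math.pi)
    return d_y * d_y + d_phi * d_phi


def anti_kt(inputs, radius):
    """Naive O(N^3) anti-kt clustering of four-momenta; returns jets sorted by pT."""
    objects = [list(p) for p in inputs]
    jets = []
    while objects:
        # Start from the beam distance of the first object, then look for anything smaller.
        best = (1.0 / pt_of(objects[0]) ** 2, 0, None)
        for i, p in enumerate(objects):
            d_ib = 1.0 / pt_of(p) ** 2
            if d_ib < best[0]:
                best = (d_ib, i, None)
            for j in range(i + 1, len(objects)):
                q = objects[j]
                d_ij = min(d_ib, 1.0 / pt_of(q) ** 2) * delta_r2(p, q) / radius ** 2
                if d_ij < best[0]:
                    best = (d_ij, i, j)
        _, i, j = best
        if j is None:
            jets.append(objects.pop(i))   # closest to the beam: promote to a final jet
        else:
            merged = [a + b for a, b in zip(objects[i], objects[j])]
            objects.pop(j)                # j > i, so index i is unaffected
            objects[i] = merged
    return sorted(jets, key=pt_of, reverse=True)


if __name__ == "__main__":
    # Three hypothetical calibrated R = 0.4 jets given as (pT, eta, phi).
    small_r_jets = [four_momentum(100.0, 0.0, 0.0),
                    four_momentum(80.0, 0.5, 0.3),
                    four_momentum(60.0, -2.0, 2.5)]
    for jet in anti_kt(small_r_jets, radius=1.0):
        print(round(pt_of(jet), 1))
```

In this example the two nearby small-R jets merge into a single large-R jet, while the third, well separated jet is left on its own. Since the inputs are calibrated small-R jets, the large-R jet inherits their calibration, which is the benefit described above.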

Jet flavour tagging

An important feature of many analyses is the identification of the original parton type of a jet in an event, in particular determining whether a jet originates from a b-quark (b-jets). Together with jets originating from c-quarks (c-jets), these are collectively referred to as Heavy Flavour (HF) jets. Due to the relatively long lifetimes of b-hadrons, it is possible to distinguish b-jets from jets originating from light quarks (u, d, s) and gluons, collectively referred to as light-flavour jets, and to a degree from c-jets. The distance the b-hadrons travel in the detector before decaying results in large impact parameters of tracks from secondary vertices. These can subsequently be matched to jets. This property is exploited by several algorithms, the outputs of which are combined into a single multivariate discriminant.

There are three classes of algorithms combined together: impact parameter based, secondary vertex finding and decay chain reconstruction. The impact parameter algorithms rely on the flight path of the b-hadrons, which results in tracks having large impact parameters. Information from the transverse plane and additionally the longitudinal displacement are used in two separate algorithms. The secondary vertex finder is used to reconstruct secondary vertices within jets, from which multiple discriminatory variables are used to identify heavy flavour jets, for example the invariant mass of the secondary vertex and its flight path from the primary vertex. In this method, tracks associated to a jet which are significantly displaced from the primary vertex are paired together to form all possible two-track secondary vertex candidates. Vertices which are likely to originate from long lived particles or material interactions are rejected, as well as vertices with invariant masses compatible with Λ, KS and photon conversions. The remaining two-track vertices are combined into an inclusive secondary vertex. The decay chain reconstruction attempts to exploit the topological structure of b- and c-hadron decays to reconstruct the full decay chain of the b-hadron by approximating the flight path of the hadron using the coordinates of the primary and secondary vertices. The algorithm used is based on a modified Kalman Filter, and uses the intercepts of particle tracks with the jet axis together with single prong vertices to reconstruct the full decay topology.

Variables from the reconstruction algorithms are combined into a class of BDTs, which are optimised for selecting and rejecting different flavour jets. The flavour tagging discriminant used in this analysis to identify b-jets is called the MV2c10 tagger [61]. This discriminant is trained on jets from simulated tt̄ events, with b-jets classed as signal and light and c-jets as background. In addition to the reconstruction algorithms, jet kinematics are included in the training to exploit additional correlations. To prevent the training from exploiting differences in the kinematic distributions of the signal and background, b-jets treated as signal are reweighted in pT and |η| to match the distributions of light-flavour jets. The MV2c10 tagger is calibrated at four b-jet efficiency working points, 85%, 77%, 70% and 60%, and is used as a binned discriminant for all jets. The light and c-jet rejection for each working point is given in Table 5.1.

Figure 5.4: The MV2c10 output score for light, c and b-flavour jets, evaluated on simulated tt̄ events [61]. [Plot: ATLAS Simulation Preliminary, √s = 13 TeV, tt̄; arbitrary units versus MV2c10 BDT output for b-jets, c-jets and light-flavour jets.]

The rejection factors of light and c-jets increase for tighter b-tagging working points, and there is a greater rejection of light-flavour jets than c-jets. In the analysis presented in this thesis, the working points are referred to as loose, medium, tight and very tight respectively. The binned discriminant has five bins corresponding to these four working points along with a bin for jets which do not pass the loose working point. The unbinned MV2c10 BDT output for jets of different flavours, evaluated on tt̄ events, is shown in Fig. 5.4. When tagging jets, the distribution is binned into five bins using the b-tagging efficiencies given in Table 5.1. Large differences are visible in the shapes of the output for each of the three flavours of jets, with c-jets predicted to appear more similar to b-jets than the light-flavour jets. Additionally, a clear separation between the majority of b- and light-flavour jets is observed, with two clear peaks in the discriminant output. Scale factors for simulated events are derived to correct for the disagreement observed in the modelling when compared to data. Different complementary methods are used to calibrate the light-flavour and c-jet mistag rates, in addition to the b-jet tagging efficiency [61]. A large set of systematic uncertainties is assigned to each of the three flavours to take into account the uncertainties derived in the calibrations.

Working point    b-jet efficiency    c-jet rejection    Light jet rejection
Loose            85%                 3.1                33
Medium           77%                 6                  134
Tight            70%                 12                 381
Very tight       60%                 34                 1538

Table 5.1: The light-flavour and c-jet rejection factors for the b-jet efficiency working points with the MV2c10 b-tagging discriminant. Values are extracted from simulated tt̄ events [61].
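To make the rejection factors in Table 5.1 more concrete, they can be converted into mistag rates, assuming the usual definition of a rejection factor as the inverse of the mistag efficiency. The sketch below transcribes the table and estimates the expected number of tagged jets for a given true flavour composition; the function names and the example jet counts are assumptions made for illustration.

```python
# Table 5.1, transcribed: b-jet efficiency and rejection factors per working point.
WORKING_POINTS = {
    "loose":      {"b_eff": 0.85, "c_rej": 3.1,  "light_rej": 33.0},
    "medium":     {"b_eff": 0.77, "c_rej": 6.0,  "light_rej": 134.0},
    "tight":      {"b_eff": 0.70, "c_rej": 12.0, "light_rej": 381.0},
    "very_tight": {"b_eff": 0.60, "c_rej": 34.0, "light_rej": 1538.0},
}


def mistag_rate(rejection):
    """Mistag efficiency, assuming rejection is defined as its inverse."""
    return 1.0 / rejection


def expected_tagged_jets(n_b, n_c, n_light, working_point):
    """Expected number of b-tagged jets given the true flavour composition."""
    wp = WORKING_POINTS[working_point]
    return (n_b * wp["b_eff"]
            + n_c * mistag_rate(wp["c_rej"])
            + n_light * mistag_rate(wp["light_rej"]))


if __name__ == "__main__":
    # Hypothetical event with two b-jets, one c-jet and three light-flavour jets.
    for name in WORKING_POINTS:
        print(name, round(expected_tagged_jets(2, 1, 3, name), 3))
```

At the loose working point roughly one in three c-jets and one in 33 light-flavour jets is tagged, whereas at the very tight working point these fall to about one in 34 and one in 1538.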

Electrons and Photons

Electrons are reconstructed from energy deposits in the EM calorimeter, known as clusters, within the range |η| < 2.47. These are subsequently matched to reconstructed tracks in the ID. Tracks are first reconstructed assuming a pion hypothesis for the energy they deposit as they traverse the detector. If the pion hypothesis fails but the track falls within a region of interest of an EM cluster, a second complementary reconstruction attempt for the track is performed using an electron-specific hypothesis. In this reconstruction attempt, additional bremsstrahlung energy losses of up to 30% at each intersection of the electron track with detector material are considered [62]. If several tracks are matched to a cluster, a primary track is chosen based on the quality of the track and how well it matches the cluster for various momentum hypotheses. The primary track is used for electron kinematics and charge identification. If an electron candidate is not matched to any tracks with hits in the pixel detector, it is considered to be an unconverted photon.

Electron identification algorithms are applied to the reconstructed electrons to separate real electrons from background processes, such as misidentified hadronic jets and converted photons. A likelihood discriminant is used to separate the signal from background. This approach combines information from the cluster shape; track properties, including the number of hits in the IBL; and the likelihood that the transition radiation measured in the TRT comes from an electron [62]. For electrons, a larger fraction of hits in the TRT are above a given threshold due to transition radiation than for pions.

Different operating points are calibrated for electrons with increasing levels of background rejection, ranging from Loose to Medium to Tight. Due to the dependence on ET and η of the electron cluster shape, the working points are optimised in bins of η and ET.

To further reject background objects such as misidentified hadronic jets, a gradient isolation requirement is applied to the electrons. As is the case for muons, this is implemented as a sliding cut on the pT of tracks and the ET of cluster deposits, and scales with the ET of the reconstructed electrons. This results in a tighter isolation requirement on electrons with lower ET. Electrons are also required to match the primary vertex of an event, with cuts on the longitudinal impact parameter of |z0 sin θ| < 0.5 mm and on the transverse impact parameter significance d0/σ(d0) < 5 applied to all electrons. Reconstructed electrons are required to have pT > 10 GeV.

If a cluster does not pass the electron likelihood requirement and is matched to a track that is missing silicon hits, it can be reconstructed as a photon. These candidates are labelled as converted photons. The most discriminatory variable between electrons and converted photons in the likelihood discriminant is the number of hits in the innermost layer of the Pixel detector [62].

The reconstruction efficiency of electrons is measured in bins of |η| and ET from Z and J/ψ dielectron decays. The ratio of the efficiency measured in data to that extracted from simulation is applied to the simulation as a scale factor, in the same way as is done for muons. The scale factors are measured by comparing data to simulated Z → ee decays for electrons with ET > 15 GeV and J/ψ → ee decays for electrons with 7 GeV < ET < 20 GeV [62]. The scale factors as a function of η are shown in Fig. 5.5 for two ranges of ET. As for muons, the scale factors are found to be less than 1, with the reconstruction efficiency being lower in data than in simulation, and systematic uncertainties are assigned to take into account the uncertainties on the calculation of the scale factors.

Taus

Tau leptons which decay leptonically into either an electron or muon and two neutrinos are treated as electrons or muons. However, taus which decay hadronically (τhad) can be reconstructed using a multivariate technique. Although not used in the analysis presented in this thesis, tau candidates are reconstructed in order to preserve orthogonality with other tt̄H search channels. A veto is applied to events passing a selection optimised for tt̄H events containing hadronic taus by counting the candidate multiplicity per event.

Figure 5.5: The scale factors for reconstructed electrons calculated from a comparison of data to simulation in Z → e+e− events, shown as a function of η in two bins of ET, 25 < ET < 30 GeV (a) and 40 < ET < 45 GeV (b) [62]. The scale factor is the ratio between the reconstruction efficiency measured in data and MC. Two measurements are used in combination to measure the efficiency, each using a different method to subtract the background in Z → e+e− events. The combined measurement is shown by the grey points. [Plots: ATLAS Preliminary, Tight identification; scale factor versus η for the Ziso, Zmass and combined Zmass/Ziso measurements (stat ⊕ syst).]

Hadronic tau decays are primarily one- or three-pronged, classified by the number of charged pions among the daughter particles. These decays account for approximately 65% of all tau decays, of which 45% are one pronged processes [6]. Hadronically decaying tau candidates are reconstructed by selecting jets reconstructed using the anti-kt algorithm with R = 0.4, and requiring the jet to contain either one or three ID tracks, which when summed have a total electric charge of Q = ±1. Tau candidates are identified using a BDT that is trained to separate τhad from hadronic jets [63], using information about the tracks and clusters associated to the jet. Reconstructed taus are required to have pT > 25 GeV and to be in the range |η| < 2.5. A veto is applied to the transition region between the barrel and endcaps (1.37 < |η| < 1.52).
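The kinematic and track multiplicity requirements on hadronic tau candidates can be collected into a single predicate, sketched below. The BDT identification step is not modelled, and the function and argument names are assumptions of this example.

```python
def is_tau_candidate(pt_gev, eta, track_charges):
    """Pre-identification requirements on a hadronic tau candidate.

    track_charges: charges (+1 or -1) of the ID tracks associated to the jet.
    The BDT-based identification described in the text is not applied here.
    """
    if len(track_charges) not in (1, 3):     # one or three prongs
        return False
    if abs(sum(track_charges)) != 1:         # total charge Q = +1 or -1
        return False
    if pt_gev <= 25.0:
        return False
    abs_eta = abs(eta)
    if abs_eta >= 2.5:
        return False
    if 1.37 < abs_eta < 1.52:                # barrel-endcap transition region veto
        return False
    return True


if __name__ == "__main__":
    print(is_tau_candidate(30.0, 0.5, [+1]))           # one-prong candidate
    print(is_tau_candidate(30.0, 0.5, [+1, -1, +1]))   # three-prong candidate
    print(is_tau_candidate(30.0, 1.45, [+1]))          # in the transition region
```

Counting the candidates that pass such a selection (together with the identification BDT) in each event would give the multiplicity used for the veto mentioned above.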

Missing transverse energy

The ETmiss in an event is calculated from all the reconstructed objects, the jets and leptons passing the criteria described above, using Equation 4.5. In addition, an extra soft term is built to take into account the energy deposited in an event which is not associated to any object. The soft term is constructed from tracks in the ID that are matched to the primary vertex in order to be more resilient to pileup effects. Tracks are required to have pT > 0.4 GeV and pass vertex association cuts of d0/σ(d0) < 2 and |z0 sin θ| < 3 mm.

6. Modelling and Event Selection

This chapter provides information on the choice of samples used to model the signal and background processes in the tt̄H (H → bb̄) analysis. The data driven techniques and corrections which are used to improve the modelling of V+jets, as well as the fake and non-prompt lepton backgrounds, are also described. Furthermore, the selection requirements are provided. The selection is applied to select events that correspond to the tt̄H (H → bb̄) final state and the tt̄+jets background.

6.1 Signal and Background Modelling

The Monte Carlo method described in Section 2.3 is used to model the signal and background processes in the analysis, with the exception of the fake lepton estimate in the single lepton channel. To model how the particles in each event interact with the material in the ATLAS detector, and the signals which are subsequently recorded, the generated samples are passed through a simulation of the ATLAS detector using Geant 4 [64, 65]. However, for some samples, such as those used in evaluating the systematic uncertainties relating to modelling, only a partial simulation of the calorimeters is performed. This uses parametrisations of the shower shapes and has a much reduced computation time in comparison to using Geant [66]. The parametrisation is provided by the FastCaloSim package [67]. This fast simulation is referred to as ATLAS FAST II (AFII), and the full simulation using Geant 4 is known as Full Sim (FS). For a particle which decays within 10 mm of the interaction point, the event generator handles all the simulation and no interactions with the ATLAS detector are considered. All nominal samples for signal and background modelling in the analysis are FS. All simulated samples are processed using the same reconstruction software as used for recorded data. Object reconstruction efficiencies, energy scales and resolutions are corrected to what is observed in data using scale factors.


The effects of pileup are simulated by overlaying inelastic collisions generated with Pythia 8.186 [68] onto the hard scatter event. All samples are reweighted to the pileup distribution observed in data to account for the impact of pileup in events as accurately as possible. The mass of the top quark used in simulation is 172.5 GeV and the mass of the Higgs boson is set to 125 GeV. The decays of all b- and c-hadrons are performed by EvtGen v1.2.0 [69] in all samples except those with events simulated using the Sherpa [70] generator.

6.1.1 Signal Modelling

The tt̄H signal process is simulated using MadGraph5_aMC@NLO version 2.3.2 [71] for the matrix element calculation, which is accurate up to NLO in QCD. The PDF set used in simulation is NNPDF3.0NLO. The matrix element is interfaced to Pythia 8.210 for the parton shower, tuned with the A14 parameter set, a central tune that has been tuned to most ATLAS jet and underlying event observables [72], and uses the NNPDF2.3LO PDF set. The factorisation and renormalisation scales are set to µF = µR = HT/2, where HT is defined as the scalar sum of the transverse masses $\sqrt{p_T^2 + m^2}$ of all final state objects. A systematic uncertainty is assigned to account for this choice of parton shower generator. The cross section of tt̄H production, $\sigma_{t\bar{t}H} = 507.1^{+35}_{-50}$ fb, is computed to NLO accuracy in QCD and includes electroweak corrections; the Higgs branching ratios are calculated using HDECAY [14, 73].

6.1.2 Background Modelling

tt̄+jets

The largest background process in the search for tt̄H with the Higgs boson decaying to a b-quark pair is tt̄+jets, comprising over 90% of the total predicted background in the tt̄H (H → bb̄) analysis. Of particular importance is the irreducible tt̄+bb̄ background, which has the same final state as tt̄H (H → bb̄). The nominal sample used to model the tt̄ background is generated with Powheg-Box v2 [74] with the NNPDF3.0NLO PDF set. The hdamp parameter is set to 1.5 times the top quark mass [75]. It acts as a resummation damping factor that controls the matrix element to parton shower matching in Powheg, and with it the pT of additional radiation. The parton shower and hadronisation are performed using Pythia 8.210 with the A14 tunable parameter set [76] and the NNPDF2.3LO PDF set. The renormalisation and factorisation scales are set to the transverse top mass. The sample is normalised to the predicted cross section of $\sigma_{t\bar{t}} = 832^{+46}_{-52}$ pb, calculated using Top++2.0 [82] to NNLO precision in QCD and including resummation of the next-to-next-to-leading log (NNLL) soft gluon terms [77–81].

Extensive studies of the modelling of tt̄+jets and the optimisation of MC samples have been performed by the ATLAS collaboration [75, 83]. Different choices of PS and ME generators, as well as optimisations of the tuning parameters for samples, were compared to unfolded data from the ATLAS experiment taken during Run 1 and Run 2 at centre of mass energies of 7, 8 and 13 TeV. The choice of PS generator and the choice of ME generator are each considered as a systematic uncertainty on the tt̄ modelling in the analysis presented in this thesis.

To compare the choice of parton shower and hadronisation, the nominal sample is compared to events generated using Powheg interfaced to Herwig 7 [84] and also to Pythia 6 [85]. The configurations of these two samples are provided in Table 6.1, including the choice of PDF sets and additional information regarding the tuning and matching used. In Ref. [75] these samples are compared to unfolded data at √s = 13 TeV for different tt̄ kinematics in addition to the jet multiplicity and respective pT distributions. The Powheg+Pythia 8 sample models the distributions well and falls within the experimental uncertainties for all variables except for the pT of the hadronic top. Powheg+Herwig 7 in comparison models the top and tt̄ kinematic distributions well, in particular the top pT, which falls within the experimental uncertainties. However, it does not model the jet multiplicity well, where a clear trend for higher multiplicities of jets can be seen.

In addition to comparing different PS and hadronisation models, different event generators to Powheg are compared.
Events are generated using MadGraph5_aMC@NLO+Pythia 8 and Sherpa 2.2. Both Sherpa and MadGraph5_aMC@NLO use the MC@NLO matching scheme, in contrast to Powheg. The PDF sets and additional settings for the samples are listed in Table 6.1. The choice of generators is compared to unfolded data in Fig. 6.1 for the same tt̄ and jet kinematic variables which were compared for the choice of the parton shower. Good agreement with the data is seen for the Sherpa sample in addition to Powheg+Pythia 8; however, some disagreement is seen between MadGraph5_aMC@NLO+Pythia 8 and the data, especially in the leading jet pT and predicted jet multiplicity. In all the samples, the pT of the top quark proves to be mismodelled, with a harder top pT distribution predicted by all choices of matrix element generator. Studies into applying event by event weights to the tt̄ samples to correct the modelling of top pT were performed using a sequential reweighting procedure. However, after applying the weights no improvement in the overall modelling of the background was found, and the top pT distribution was found to still be mismodelled after applying the weights to simulated events. Therefore in the analysis presented in this thesis no attempt is made to correct the modelling of this variable in the nominal Powheg+Pythia 8 sample or to apply it as a source of systematic uncertainty.

ME generator         PS/UE generator   ME (PS/UE) PDF               Additional information

Default
Powheg-Box v2        Pythia 8          NNPDF3.0NLO (NNPDF2.3LO)     hdamp = 1.5 mtop; A14 tune

Alternative PS/UE models
Powheg-Box v1        Herwig 7          CT10 (MMHT2014lo68cl)        hdamp = 1.5 mtop; Colour reconstruction; H7-UE-MMHT tune
Powheg-Box v1        Pythia 6          CT10 (CTEQ6L1)               hdamp = mtop; Perugia 2012 tune

Alternative ME models
MadGraph5_aMC@NLO    Pythia 8          NNPDF3.0NLO (NNPDF2.3LO)     A14 tune
Sherpa 2.2           Sherpa            NNPDF3.0NNLO                 MEPS@NLO matching

Table 6.1: The different Monte Carlo samples studied for tt̄ modelling, providing an overview of the settings of each sample. Alternative samples using the same ME generator as the nominal sample but with different PS/UE settings are grouped together in one category, with samples comparing different ME generators in a second category.

In order to assess the impact of the additional radiation in tt̄ events, two additional Powheg+Pythia 8 tt̄ samples were generated, adjusting the amount of additional radiation in tt̄ events. These samples vary the hdamp parameter and change the A14 tune using the Var3c variation, chosen as it covers the size of the other available tuning configurations for the sample. These samples also cover the impact of the different tuning options in Powheg+Pythia 8, and are used to evaluate the systematic uncertainty on the additional radiation in the analysis.

[Figure 6.1, panels (a) to (c): particle level, absolute cross-section, each with an Expected/Data ratio panel. (a) dσ/dnjets [pb] versus njets (jet pT > 25 GeV); (b) dσ/dpT [pb/GeV] versus 1st jet pT [GeV]; (c) dσ/dpT [pb/GeV] versus 5th jet pT [GeV]. Legend: ATLAS Data, √s = 7 TeV; Powheg+Py8, hdamp = 1.5 mtop; MG5_aMC@NLO+Py8; MG5_aMC@NLO+Py8 (FxFx); Sherpa 2.2 MEPS@NLO.]

Figure 6.1: Comparison of unfolded data to tt̄ samples with different matrix element generators at 7 TeV. The comparison is performed for the jet multiplicity of the events (a) and the pT of the first (b) and fifth (c) leading jets ordered by pT [75].

tt̄ + Heavy Flavour

Events in tt̄+jets samples are categorised based on the flavour of the additional jets. The flavour of jets is defined using the same procedure as described in Ref. [86]. Generator particle jets are reconstructed using the anti-kt algorithm for all stable particles that have mean lifetimes greater than 3×10⁻¹¹ seconds, using a radius parameter of R = 0.4. A selection cut of pT > 15 GeV and |η| < 2.5 is applied. The flavour of jets is determined by counting the number of b- or c-hadrons within ∆R < 0.4 of the jet axis. Events which do not contain any jets matched to b- or c-hadrons are labelled as tt̄+light. Events containing at least one jet matched to b- or c-hadrons, with pT > 5 GeV, are labelled as tt̄ + Heavy Flavour (tt̄+HF). These are categorised further into tt̄+≥1b for events where at least one additional jet originates from a b-hadron and tt̄+≥1c for the remaining events with at least one additional jet originating from a c-hadron. A finer categorisation of tt̄+≥1b is also defined, with the definitions of the subcategories described in Table 6.2. The modelling of the tt̄+HF background is the leading source of uncertainty in the tt̄H (H → bb̄) analysis presented in this thesis.

Category             Description
tt̄+b                 One additional jet matched to a single b-hadron
tt̄+bb̄                Two additional jets, each matched to a single b-hadron
tt̄+B                 One additional jet matched to two b-hadrons
tt̄+≥3b               At least three b-hadrons matched to additional jets
tt̄+≥1b (MPI/FSR)     Additional b-jets matched to hadrons from multi parton interactions (MPI) and final state radiation (FSR)

Table 6.2: The definitions of the subcategories of tt̄+≥1b events, using hadrons matched to particle level jets per event in the tt̄ simulation.
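The categorisation described above and in Table 6.2 can be illustrated with a short sketch. It is not the analysis code: the `AdditionalJet` container and the category strings are hypothetical, the hadron-to-jet matching (∆R < 0.4, hadron pT > 5 GeV) is assumed to have been done upstream, and the MPI/FSR subcategory is deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class AdditionalJet:
    """Hypothetical container for one additional particle-level jet."""
    n_b_hadrons: int  # matched b-hadrons (pT > 5 GeV, dR < 0.4 of the jet axis)
    n_c_hadrons: int  # matched c-hadrons (same matching criteria)


def classify_ttbar_event(additional_jets):
    """Return the tt+jets flavour category of an event (MPI/FSR case ignored)."""
    # Number of matched b-hadrons in each jet that contains at least one
    b_counts = [j.n_b_hadrons for j in additional_jets if j.n_b_hadrons > 0]
    has_c = any(j.n_c_hadrons > 0 for j in additional_jets)

    if not b_counts:
        # No b-hadrons: tt+>=1c if any c-hadron is matched, otherwise tt+light
        return "tt+>=1c" if has_c else "tt+light"

    n_b_hadrons = sum(b_counts)
    if n_b_hadrons >= 3:
        return "tt+>=3b"
    if n_b_hadrons == 2:
        # Both hadrons in one jet -> tt+B, one hadron in each of two jets -> tt+bb
        return "tt+B" if len(b_counts) == 1 else "tt+bb"
    return "tt+b"
```

The b-hadron check is made before the c-hadron check, so that an event with both b- and c-matched jets is assigned to tt̄+≥1b, in line with tt̄+≥1c being defined for the remaining events.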

In order to model the tt̄+≥1b background as precisely as possible, the relative fractions of the tt̄+≥1b categories, excluding tt̄+b (MPI/FSR), are scaled to the predicted fractions from a dedicated NLO tt̄+bb̄ sample generated with Sherpa+OpenLoops. The overall tt̄+≥1b normalisation is kept at the prediction from Powheg+Pythia 8. The differences in the shapes of the distributions for each of the fractions are subsequently considered as a source of systematic uncertainty. This sample is expected to provide the best theoretical prediction for tt̄+bb̄, with the additional b-quarks calculated with higher accuracy than in the Powheg+Pythia 8 sample. The events are generated with Sherpa 2.1.1 and the CT10 4F scheme PDF set [87, 88]. This tt̄+bb̄ sample is referred to as Sherpa4F, in contrast to the inclusive Sherpa tt̄ sample introduced in the previous section, which uses the 5F scheme PDF set. The inclusive sample is subsequently referred to as Sherpa5F. The renormalisation scale is set to the CMMPS value [89], $\mu_{\mathrm{CMMPS}} = \prod_{i=t,\bar{t},b,\bar{b}} E_{T,i}^{1/4}$. The factorisation and resummation scales are both set to HT/2, where HT here is the scalar sum of the top quark and b-quark pT. The relative fractions of tt̄+≥1b in Powheg+Pythia 8 and Sherpa4F are shown in Fig. 6.2. The largest difference can be seen in the tt̄+≥3b fraction.
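A minimal sketch of the fraction rescaling described above is given here, assuming that it is applied as one weight per subcategory. The function name and all yields and fractions in the usage note are hypothetical and are not taken from the analysis.

```python
def subcategory_weights(nominal_yields, target_fractions):
    """Per-category weights that move the nominal relative fractions onto the
    target fractions while leaving the summed yield unchanged.

    nominal_yields   -- dict: category -> yield in the nominal sample
    target_fractions -- dict: category -> relative fraction in the target sample
    """
    total = sum(nominal_yields.values())
    target_total = sum(target_fractions.values())
    weights = {}
    for category, nominal_yield in nominal_yields.items():
        nominal_fraction = nominal_yield / total
        # Normalise the target fractions so that they sum to one
        target_fraction = target_fractions[category] / target_total
        weights[category] = target_fraction / nominal_fraction
    return weights
```

For example, with illustrative yields `{"tt+b": 500, "tt+bb": 300, "tt+B": 150, "tt+>=3b": 50}` and target fractions `{"tt+b": 0.45, "tt+bb": 0.30, "tt+B": 0.13, "tt+>=3b": 0.12}`, the tt̄+≥3b weight is 0.12/0.05 = 2.4, while the weighted yields still sum to the nominal total of 1000.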

[Figure 6.2: ATLAS Simulation. Fraction of events (logarithmic scale) in the tt̄+b, tt̄+bb̄, tt̄+B and tt̄+≥3b categories for Powheg+Pythia 8 and Sherpa4F, with a lower panel showing the ratio of Sherpa4F to Powheg+Pythia 8.]

Figure 6.2: The relative predicted fractions of tt̄+≥1b events from the Sherpa4F and Powheg+Pythia 8 samples. The fractions correspond to the values before selection. The uncertainties on the Sherpa4F values are calculated from varying the factorisation and renormalisation scales, the rate of MPI, the choice of the parton shower recoil scheme, changing the PDF set, and varying the UE tune.

V+jets

The production of W/Z bosons in association with jets is generated with Sherpa 2.2.1, with two additional partons in the matrix element calculated up to NLO in QCD and four additional partons calculated at leading order. This is done using the Comix [90] and OpenLoops matrix element generators and merged with the Sherpa parton shower using the MEPS@NLO prescription. The NNPDF3.0NNLO PDF set is used alongside dedicated parton shower tuning. The V+jets background is the second largest background in the tt̄H (H → bb̄) analysis, contributing several percent to the total predicted background.

The Z+jets background has a correction applied to events with additional jets originating from b- and c-hadrons. This correction factor is derived using a data driven method in a selection orthogonal to that of the analysis described below; the analysis applies a veto on same-flavour events with a dilepton invariant mass falling within 8 GeV of the Z mass. To derive the correction, opposite-sign same-flavour dilepton events with at least two jets are selected, requiring the invariant mass of the dilepton pair to fall within 83–99 GeV. The correction factor is used to correct for the observed mismodelling of the Z+HF jets background seen in dedicated control regions enriched in Z+jets. This mismodelling is apparent in Fig. 6.3a, where the simulation, dominated by Z+jets, underestimates the data in events with at least one b-tagged jet.

To derive the correction factor, Z+jets events are divided into three categories using the number of jets at the reconstruction level which have been matched to heavy flavour hadrons. These categories are called ZHF for events with at least two heavy flavour jets; Zbl/cl for events with exactly one heavy flavour jet; and ZLF for events with no heavy flavour jets. Events are split into three regions by cutting on the number of b-tagged jets in the event. These regions are labelled as ≥2j0b, ≥2j1b and ≥2j≥2b, and contain zero, exactly one and at least two b-tagged jets at the tight b-tagging working point respectively. These regions are each enriched in a different Z+jets category; ≥2j0b is mostly ZLF, ≥2j1b is dominated by Zbl/cl, and ≥2j≥2b is mostly ZHF, though the tt̄ background also starts to contribute. By subtracting the event yields in MC not coming from the Z+jets samples from the data yield in each region, the total Z+jets contribution to the data is derived. This is equal to the sum of the yields from the Z+jets categories, with each category multiplied by a correction factor. The three linear equations shown in equations 6.1–6.3 are used to calculate the normalisation factors for each Z+jets category, kLF, kbl/cl and kHF. The region from which the yields are taken is shown in superscript.

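$$k_{\mathrm{LF}} N^{\geq 2j0b}_{Z_{\mathrm{LF}}} + k_{bl/cl} N^{\geq 2j0b}_{Z_{bl/cl}} + k_{\mathrm{HF}} N^{\geq 2j0b}_{Z_{\mathrm{HF}}} = N^{\geq 2j0b}_{\mathrm{data}} - N^{\geq 2j0b}_{\mathrm{nonZ}} \tag{6.1}$$

$$k_{\mathrm{LF}} N^{\geq 2j1b}_{Z_{\mathrm{LF}}} + k_{bl/cl} N^{\geq 2j1b}_{Z_{bl/cl}} + k_{\mathrm{HF}} N^{\geq 2j1b}_{Z_{\mathrm{HF}}} = N^{\geq 2j1b}_{\mathrm{data}} - N^{\geq 2j1b}_{\mathrm{nonZ}} \tag{6.2}$$

$$k_{\mathrm{LF}} N^{\geq 2j\geq 2b}_{Z_{\mathrm{LF}}} + k_{bl/cl} N^{\geq 2j\geq 2b}_{Z_{bl/cl}} + k_{\mathrm{HF}} N^{\geq 2j\geq 2b}_{Z_{\mathrm{HF}}} = N^{\geq 2j\geq 2b}_{\mathrm{data}} - N^{\geq 2j\geq 2b}_{\mathrm{nonZ}} \tag{6.3}$$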


The values of the correction factors are kLF = 1.085, kbl/cl = 1.302 and kHF = 1.291. As the ZLF contribution is negligible in events with high b-tag multiplicities, the kLF correction factor is not applied. A global correction factor of 1.3 is applied to all Z+jets events with at least one HF jet. This value is in agreement with the normalisation factor extracted for the Z+HF background in other ATLAS analyses, such as the search for VH production with H → bb̄ [16]. The effect of applying this correction factor is visible in Fig. 6.3b, where much better agreement is seen than in Fig. 6.3a.
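Equations 6.1–6.3 form a 3×3 linear system in kLF, kbl/cl and kHF. The sketch below solves such a system with Cramer's rule in plain Python. The function names are hypothetical, and the yields used in the usage note are invented for a closure check; they are not the yields of the analysis.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of three rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve_correction_factors(z_yields, data_minus_nonz):
    """Solve for (k_LF, k_bl/cl, k_HF).

    z_yields        -- one row per region (>=2j0b, >=2j1b, >=2j>=2b), each row
                       holding the simulated yields [N_ZLF, N_Zbl/cl, N_ZHF]
    data_minus_nonz -- data yield minus non-Z simulation, one entry per region
    """
    determinant = det3(z_yields)
    if abs(determinant) < 1e-12:
        raise ValueError("regions do not constrain the three factors independently")
    factors = []
    for column in range(3):
        # Cramer's rule: replace one column by the right-hand side
        replaced = [list(row[:column]) + [rhs] + list(row[column + 1:])
                    for row, rhs in zip(z_yields, data_minus_nonz)]
        factors.append(det3(replaced) / determinant)
    return tuple(factors)
```

As a closure check, building the right-hand side from invented yields `[[9000, 800, 200], [600, 1500, 400], [30, 200, 700]]` and the factors (1.085, 1.302, 1.291) and then solving returns those factors. The system is only well conditioned because each region is dominated by a different Z+jets category.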

[Figure 6.3, rows (a) and (b): Events versus Mll [GeV] between 84 and 98 GeV in the dilepton ≥2j0b, ≥2j1b and ≥2j≥2b regions, pre-fit, √s = 13 TeV, 36.1 fb⁻¹, each with a Data / Pred. ratio panel. Legend: Data, tt̄, Z+jets (HF), Z+jets (bl/cl), Z+jets (LF), other, Uncertainty.]

(a) Before applying data driven Z+jets correction.

(b) After applying data driven Z+jets correction.

Figure 6.3: The invariant mass of the dilepton pair in the three regions used to derive a correction factor. The left column shows the ≥2j0b region, the middle column shows the ≥2j1b region and the right column shows the ≥2j≥2b region. Shown before (a) and after (b) applying a correction factor of 1.3 to events containing an HF jet.

Other backgrounds

Additionally there are other backgrounds which in total contribute a small fraction to the tt̄H (H → bb̄) analysis. Events containing two top quarks and a vector boson in the final state (tt̄V) are generated using MadGraph5_aMC@NLO interfaced to Pythia 8.210 tuned with the A14 parameter set. The NNPDF3.0NLO PDF set is used for the ME and the NNPDF2.3LO set for the PS and UE.

Single top events in the Wt and s-channels are generated with Powheg-Box v1 at NLO with the CT10 PDF set. At NLO in QCD, corrections to the Wt production can have the same final state as tt̄ events at LO, and the interference effects between the two need to be considered. There are two methods considered to handle this in MC samples, the diagram removal and diagram subtraction schemes [91]. The diagram removal scheme is used as the default to handle the interference. Samples which use the diagram subtraction scheme are also produced, and a comparison between both methods is a source of systematic uncertainty in ATLAS analyses. Single top events produced in the t-channel are generated using Powheg-Box v1 at NLO accuracy using the four flavour CT10 PDF set. All single top events are interfaced to Pythia 6.428 using the Perugia 2012 set of tuned parameters, and all cross sections are normalised to NNLO theoretical predictions.

Events containing two vector bosons (W or Z), labelled as diboson events, are generated using Sherpa 2.1.1 and the CT10 PDF set with up to three additional partons in the final state.

Higgs boson production in association with a single top is a rare process in the SM and is treated as background in the analysis. Samples of single top quarks produced in association with a Higgs boson and a W boson, tWH, are generated at NLO with MadGraph5_aMC@NLO using the CT10 PDF set interfaced to Herwig++ [92]. The single top plus Higgs and jets events, tHjb, are generated at LO with MadGraph5_aMC@NLO using the CT10 4F scheme PDF set. The tHjb events are interfaced to Pythia 8 for the parton shower and hadronisation. Both tWH and tHjb events use the CTEQ6L1 PDF set in the PS and UE. The remaining Higgs production mechanisms are not considered in the analysis as they have a negligible contribution.

Additional backgrounds containing top quarks from four top (tt̄tt̄) and tt̄WW production are generated with MadGraph5_aMC@NLO at LO and interfaced to Pythia 8 using the A14 tune and the NNPDF2.3LO PDF set. tZj events are also generated with MadGraph5_aMC@NLO at LO but interfaced to Pythia 6 with the Perugia 2012 tune. tWZ events are generated at NLO with MadGraph5_aMC@NLO and interfaced to Pythia 8.

Fakes and non prompt leptons

In the semileptonic channel, events containing a fake or non prompt lepton (together referred to as fakes) are estimated using a data driven technique. Fake leptons are objects which are incorrectly reconstructed as leptons, for example converted photons or jets, whereas non prompt leptons are those which originate from the decays of long lived particles, such as B-mesons. In tt̄H signal enriched selections these events provide a negligible contribution; however, in the semileptonic channel they contribute more significantly at lower jet and b-tagged jet multiplicities. The matrix method (MM) is used to calculate an event weight which is applied to data to model the fake contribution. MM compares events passing the nominal tight lepton selection and those passing a looser lepton definition in a fakes-enriched control region. The events passing the loose selection requirements are a superset of the events passing the tight requirement, and contain a larger contribution from fake and non-prompt leptons. The number of events in the loose and tight lepton selections can be represented as linear combinations of the number of events which have either a real or fake lepton, Nr and Nf respectively, by

$$N^{\mathrm{loose}} = N_r^{\mathrm{loose}} + N_f^{\mathrm{loose}} \tag{6.4}$$

and

$$N^{\mathrm{tight}} = r N_r^{\mathrm{loose}} + f N_f^{\mathrm{loose}}. \tag{6.5}$$

The efficiencies r and f are the fractions of the events in the loose selection, containing a real or fake lepton respectively, which also pass the tight lepton selection. Rearranging equations 6.4 and 6.5, the number of tight events containing fakes can be expressed as

$$N_f^{\mathrm{tight}} = \frac{f}{r - f} \left( r N^{\mathrm{loose}} - N^{\mathrm{tight}} \right), \tag{6.6}$$

and

$$N_f^{\mathrm{tight}} = f N_f^{\mathrm{loose}}. \tag{6.7}$$

The efficiencies r and f are dependent on the lepton kinematics as well as other event variables such as the jet and b-jet multiplicities. An event weight is calculated from efficiencies which are parametrised as a function of chosen kinematics to provide an estimate of the yields and also kinematics of the fakes background. For efficiencies given as a function of the event kinematics $\vec{k}_i$ for event i, the event weight $w_i$ is given by

$$w_i = \frac{f(\vec{k}_i)}{r(\vec{k}_i) - f(\vec{k}_i)} \left( r(\vec{k}_i) - T_i \right), \tag{6.8}$$

where Ti = 1 for events which pass both the loose and tight selections, and Ti = 0 for events passing only the loose selection. The event weights are applied to data to build a dataset for the estimated contribution from fakes.

In the dilepton channel, fakes are estimated from the nominal MC background samples which do not contain two opposite-sign prompt leptons. To improve the prediction from directly using the yields from MC samples, additional normalisation factors are calculated. Events where both leptons have the same electric charge are selected and split into two channels based on the flavour of the lepton with subleading pT. In this control region the normalisation factors are calculated as the ratio of the same-sign data to the predicted same-sign fake contribution from Monte Carlo. The contribution from non-fake backgrounds in the Monte Carlo samples, where both same-sign leptons are real prompt leptons, is first subtracted from the data before the normalisation factor is derived. For events with a subleading electron (muon), the normalisation factor is 1.18 (1.31).
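The matrix-method weight of Equation 6.8 can be sketched as follows. The function names are hypothetical, and the efficiencies are passed in per event as plain numbers rather than looked up from a parametrisation, which is a simplification of what is described above.

```python
def fake_weight(r, f, passes_tight):
    """Matrix-method weight w_i for one loose event (Equation 6.8).

    r, f         -- real and fake efficiencies evaluated for this event
    passes_tight -- True if the event also passes the tight selection (T_i = 1)
    """
    if r <= f:
        raise ValueError("the real efficiency must exceed the fake efficiency")
    t = 1.0 if passes_tight else 0.0
    return f / (r - f) * (r - t)


def estimate_tight_fakes(events):
    """Sum the weights over all loose events; each event is (r, f, passes_tight)."""
    return sum(fake_weight(r, f, tight) for r, f, tight in events)
```

For constant efficiencies the sum of weights reduces to Equation 6.6. With illustrative values r = 0.9 and f = 0.2, 100 loose events of which 70 are tight give (0.2/0.7) × (0.9 × 100 − 70) ≈ 5.7 fake events in the tight selection. Tight events carry a negative weight, since r − 1 < 0.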

6.2 Event Preselection

6.2.1 Data

Events are collected from pp collisions with a centre of mass energy of √s = 13 TeV recorded by the ATLAS detector in 2015 and 2016. Only events where all relevant detector subsystems were operational and which have at least one vertex have been selected, representing a total integrated luminosity of 36.1 fb⁻¹.

6.2.2 Trigger

For an event to be recorded by the detector and used in this analysis, it is required to contain a reconstructed electron or a muon with pT > 27 GeV, which is matched within ∆R < 0.15 to the same-flavour lepton candidate that passed the single lepton trigger. Only unprescaled single lepton triggers are used; these have either a low pT threshold and isolation requirements on the lepton, or have a high pT threshold and reduced or no isolation requirements. In 2015 the lowest pT threshold was 24 (20) GeV for electrons (muons); in 2016 it was 26 GeV for both electrons and muons. The leading lepton in the event is required to have pT > 27 GeV so that it lies above the threshold of the lowest pT trigger for all events over the data taking period.


6.2.3 Overlap Removal

To prevent counting detector responses and information as more than one reconstructed object, an overlap removal procedure is used. Overlap removal is a step-by-step procedure which removes objects falling within a predefined spatial separation of another object. The order and criteria of the overlap procedure are:

1. Remove taus within ∆R < 0.2 of a lepton,
2. Remove electrons sharing a track with a muon,
3. Remove single jet within ∆R < 0.2 of an electron,
4. Remove electrons within ∆R < 0.4 of a jet,
5. Remove muons within ∆R < 0.4 of a jet.

Step one is to prevent a reconstructed lepton being used as a hadronic tau, reducing the number of fake reconstructed taus, and step two is used to make sure all tracks are only used for one final state object each. Step three is necessary as all electrons are also reconstructed as jets due to the energy clusters in the EM calorimeter. Therefore, the jet within ∆R < 0.2 which corresponds to the electron is removed and the electron candidate is kept. Conversely, for any additional jets within ∆R < 0.4 of the electron, the electron is removed. This is in order to remove cases where the neighbouring electron and jet may bias the reconstructed energy or position of each other, as well as to remove non-prompt electrons. Step five is performed to suppress the number of muons originating from non-prompt decays of heavy flavour hadrons inside jets.

In order to have an event selection which is orthogonal to other analyses searching for Higgs boson production with the ATLAS experiment, a loose lepton veto is imposed. In this, leptons reconstructed with looser quality requirements are used in the overlap removal procedure instead of the nominal tighter definitions.
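The five steps above can be sketched as a sequence of list filters. The `Obj` container and its `track_id` field are hypothetical. Interpreting step three as removing the single closest jet within ∆R < 0.2 of each electron is an assumption of this sketch, and the distance uses η rather than rapidity for simplicity.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class Obj:
    """Hypothetical reconstructed object with a direction and an optional track."""
    eta: float
    phi: float
    track_id: Optional[int] = None


def delta_r(a, b):
    """Angular distance, with the azimuthal difference wrapped into [-pi, pi]."""
    dphi = math.remainder(a.phi - b.phi, 2.0 * math.pi)
    return math.hypot(a.eta - b.eta, dphi)


def overlap_removal(electrons, muons, jets, taus):
    """Apply the five overlap-removal steps in order and return the survivors."""
    # Step 1: remove taus within dR < 0.2 of a lepton
    leptons = list(electrons) + list(muons)
    taus = [t for t in taus if all(delta_r(t, l) >= 0.2 for l in leptons)]
    # Step 2: remove electrons sharing a track with a muon
    muon_tracks = {m.track_id for m in muons if m.track_id is not None}
    electrons = [e for e in electrons if e.track_id not in muon_tracks]
    # Step 3: remove the single (closest) jet within dR < 0.2 of each electron
    jets = list(jets)
    for e in electrons:
        close = [j for j in jets if delta_r(j, e) < 0.2]
        if close:
            jets.remove(min(close, key=lambda j: delta_r(j, e)))
    # Step 4: remove electrons within dR < 0.4 of a remaining jet
    electrons = [e for e in electrons if all(delta_r(e, j) >= 0.4 for j in jets)]
    # Step 5: remove muons within dR < 0.4 of a jet
    muons = [m for m in muons if all(delta_r(m, j) >= 0.4 for j in jets)]
    return electrons, muons, jets, taus
```

The order of the steps matters: step four uses the jet collection after step three, so an isolated electron is not removed by its own calorimeter jet.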

6.2.4 Analysis Selection

In the ttH¯ analysis with the Higgs boson decaying to a b-quark pair, the event selection is split into two orthogonal selections, the semileptonic and dilepton channels. No attempt is made to study events with fully hadronic tt¯ decays. In the semileptonic channel there are two selections, a boosted selection which exploits the kinematics of high pT objects with large R jets, and a resolved se- lection which uses the standard anti-kt jets. The boosted selection is optimised

89 6. Modelling and Event Selection 6.2. Event Preselection

to select events where the decays of the high pT top quark and Higgs boson produce overlapping jets of hadrons. For both the dilepton and semileptonic channel, events are selected with a loose preselection before being categorised into regions. This is in order to control the different tt¯ +jets backgrounds in dedicated regions. To select ttH¯ events with the Higgs boson decaying to a b-quark pair, events are selected with high jet and b-tagged jet multiplicity. The event selection aims to reduce the non-tt¯ background by selecting objects in a final state consistent with the tt¯+ X. This is done by requiring there to be at least two b-tagged jets, for the b-quarks from the top decay, and the decay products from the W -bosons, of which at least one is required to decay into a lepton-neutrino pair. At least one more additional jet is required on top of the objects from tt¯ events. This results in preselection criteria of one lepton and at least five jets, of which at least two are b-tagged, in the semileptonic channel. In the dilepton channel this results in at least three jets, of which two are b-tagged, and two opposite-sign leptons. The event selection is tightened to reject reducible backgrounds further and to be closer to the   ttH¯ H → b¯b signal with tighter b-tagging requirements. The semileptonic channel selects events with exactly one electron or muon with pT > 27 GeV. The boosted selection requires at least two large R jets, from which one must be tagged as a Higgs candidate and another as a top can- didate. Boosted Higgs candidates are defined as a large R jet with pT > 200 GeV containing at least two anti-kt jets with ∆R = 0.4, both of which pass the loose b-tagging working point. Boosted top candidates are defined as large R jets with pT > 250 GeV containing at least two anti-kt jets with ∆R = 0.4, of which exactly one passes the loose b-tagging working point. 
The resolved semileptonic selection requires events to have at least five jets; at least two jets are required to be b-tagged at the very tight working point or at least three at the medium working point. Events that fall into both the boosted and resolved selections are assigned to the boosted selection.

The dilepton channel only has a resolved selection. Events with exactly two electrons or muons with opposite electric charge are selected. The leading lepton is required to have pT > 27 GeV, and the subleading lepton to have pT > 10 (15) GeV in the µµ and eµ (ee) channels. In events with two same-flavour leptons, ee and µµ, events are vetoed if the invariant mass of the two leptons is less than 15 GeV or falls within the Z-boson mass range 83 – 99 GeV. This is to reduce the contribution from the Z+jets background and hadronic resonances. Events are required to have at least three jets, at least two of which pass the medium b-tagging working point. A summary of the event selection in the dilepton and semileptonic channels is provided in Table 6.3.

Dilepton                         Semileptonic (resolved)      Semileptonic (boosted)
-------------------------------  ---------------------------  ---------------------------
Exactly 2 opposite sign e/µ      Exactly 1 e/µ                Exactly 1 e/µ
pT leading ℓ > 27 GeV            lepton pT > 27 GeV           lepton pT > 27 GeV
pT subleading ℓ > 10 (15) GeV
ee/µµ: Mℓℓ > 15 GeV
ee/µµ: |Mℓℓ − 91| > 8 GeV
≥3 jets                          ≥5 jets                      ≥2 large R jets
≥2 b-jets (77% wp.)              ≥2 b-jets (70% wp.)          1 Top candidate
                                 or ≥3 b-jets (77% wp.)       1 Higgs candidate
Veto on ≥1 τhad candidate        Veto on ≥2 τhad candidates   Veto on ≥2 τhad candidates

Table 6.3: Summary of the preselection of events in each of the dilepton and semileptonic channels. In the dilepton channel, the pT requirement on the subleading lepton is higher in the ee channel, and is shown in parentheses.

To preserve orthogonality with other ttH¯ channels which select events with hadronic taus in the final state, all events containing τhad candidates in the dilepton channel are vetoed. In the semileptonic channel events with at least two hadronic tau candidates are vetoed. Events that contain only one candidate are kept and the τhad is not used in the analysis. Instead, the anti-kt jet used to reconstruct the tau candidate is treated as a normal jet. In this way, only the multiplicity of taus is considered in the event, and the objects in the event are otherwise the same as though no tau reconstruction had been performed.

Distributions of events after the preselection in the semileptonic and dilepton channels are shown in Fig. 6.4, showing the jet and b-jet multiplicities, using the tight b-tagging working point. The tt¯+jets background has been separated into the three categories mentioned before. The remaining backgrounds are separated into ttV¯ and the non-tt¯ background, which contains all remaining backgrounds such as V+jets and fakes. Relatively good agreement is seen for both variables in the two channels, though a slight slope can be seen in the b-tagged jet multiplicity in the dilepton channel.
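The dilepton requirements summarised in Table 6.3 can be illustrated with a short Python sketch. This is not the analysis code: the `Lepton` class, the function name and its arguments are hypothetical, the dilepton invariant mass and the jet, b-jet and τhad multiplicities are assumed to be computed elsewhere, and the strict inequalities follow the table rather than any stated treatment of boundary values.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Lepton:
    flavour: str  # "e" or "mu"
    charge: int   # +1 or -1
    pt: float     # transverse momentum in GeV


def passes_dilepton_preselection(leptons: List[Lepton], mll: float,
                                 n_jets: int, n_bjets_medium: int,
                                 n_tau_had: int) -> bool:
    """Hypothetical helper applying the dilepton cuts listed in Table 6.3."""
    if len(leptons) != 2:
        return False
    lead, sub = sorted(leptons, key=lambda lep: lep.pt, reverse=True)
    if lead.charge * sub.charge >= 0:  # require opposite electric charge
        return False
    same_flavour = lead.flavour == sub.flavour
    is_ee = same_flavour and lead.flavour == "e"
    sub_threshold = 15.0 if is_ee else 10.0  # 15 GeV in ee, 10 GeV in mumu and emu
    if lead.pt <= 27.0 or sub.pt <= sub_threshold:
        return False
    # Low-mass and Z-window vetoes apply to same-flavour pairs only.
    if same_flavour and (mll <= 15.0 or abs(mll - 91.0) <= 8.0):
        return False
    if n_tau_had >= 1:  # veto any hadronic tau candidate
        return False
    return n_jets >= 3 and n_bjets_medium >= 2
```

For instance, an eµ event with a dilepton mass of 91 GeV is not affected by the Z-boson veto, whereas a µµ event with the same mass is rejected.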

[Figure 6.4, panels (a) to (d): pre-fit event yields and Data / Pred. ratio for the inclusive selection at √s = 13 TeV, 36.1 fb−1. (a) Dilepton, number of jets; (b) dilepton, number of b-tagged jets at 70%; (c) single lepton, number of jets; (d) single lepton, number of b-tagged jets at 70%.]

Figure 6.4: The events in the dilepton (top row) and semileptonic (bottom row) channels after preselection. The jet multiplicity (a,c) and b-tag multiplicity at the 70% working point (b,d) are shown. The uncertainty band on the MC includes statistical and systematic uncertainties. The systematic uncertainties considered include the theoretical and modelling uncertainties on the signal and background predictions and uncertainties relating to the simulation of the ATLAS detector, and are described later. The distributions are shown before performing a statistical analysis (pre-fit).

7 Search for ttH¯(H → b¯b) at 13 TeV

In this chapter the search for the production of the Higgs boson in association with two top quarks with the Higgs boson decaying into a b-quark pair, using data recorded with the ATLAS detector during Run 2 of the LHC, is presented [1]. The overall analysis strategy is presented in the following sections, with particular emphasis on the dilepton channel, and the systematic uncertainties which are considered are described.

This analysis follows on from previous searches for ttH¯(H → b¯b) performed with data taken at centre of mass energies of √s = 7 and 8 TeV. In the searches performed by the ATLAS and CMS collaborations during Run 1 of the LHC, the signal strength parameter was measured as $\mu_{t\bar{t}H} = 1.4 \pm 1.0$ and $0.7 \pm 1.9$ respectively [93–95]. These results were combined together with other searches for ttH¯ using additional Higgs decay channels, resulting in an overall combined signal strength of $\mu_{t\bar{t}H} = 2.2^{+0.7}_{-0.6}$. This corresponds to an observed (expected) significance of 4.4σ (2.0σ) [96].

The search for ttH¯(H → b¯b) in this chapter is performed using the full dataset of pp collisions taken at a centre of mass energy √s = 13 TeV with the ATLAS detector during 2015 and 2016. This dataset represents an integrated luminosity of 36.1 fb−1. The preselection of events and modelling of the signal and background have been described previously in Chapter 6. In order to reduce the QCD multijet background and to provide a clean trigger signature, events where one or both of the top quarks decay leptonically are selected. The event selection is targeted at ttH¯ events with the Higgs decaying to two b-quarks, though all ttH¯ events in the selection are treated as signal regardless of the decay mode. The dominant background comes from tt¯ with additional jets, especially tt¯+b¯b, which shares the same final state as the ttH¯(H → b¯b) signal at tree level. The tt¯+≥1b background has a cross section approximately two orders of magnitude larger than ttH¯(H → b¯b).


7.1 Analysis Strategy

The search for ttH¯ in the H → b¯b decay channel takes advantage of the highest Higgs branching ratio. Unfortunately, it is dominated by large background processes. In order to improve the sensitivity to the signal process, ttH¯ events are separated from the background processes using a variety of techniques. Controlling and understanding the modelling of the dominant backgrounds is also an important part of the analysis, as mismodelling of the backgrounds can have a large impact on the final result extracted in a fit.

In this analysis, events are categorised into regions to improve the sensitivity to the ttH¯ signal, separating signal events from reducible backgrounds. These regions are also used to control the modelling of the dominant background processes coming from tt¯+jets. The regions which are most enriched in signal events are labelled as signal regions. In the signal regions, the signal is separated further from the background processes using multivariate techniques. A two-stage multivariate strategy is employed in the analysis. In the first stage, attempts are made to reconstruct the final state of the signal and background processes from the objects in an event. In the dilepton signal regions, a BDT is used to reconstruct the ttH¯(H → b¯b) final state, and in the semileptonic channel two additional methods which also reconstruct the tt¯+b¯b final state are used. This stage feeds into the second stage of the multivariate strategy. In the second stage, multiple variables are combined using multivariate methods into a final classification discriminant, trained to maximise the separation of signal events from the background. A simultaneous fit over all signal and control regions is performed using a binned profile likelihood in order to extract the signal strength parameter µ.
The sensitivity of the analysis comes mainly from the signal regions; however, as signal is also found in the control regions, which in addition have larger expected yields, some sensitivity also comes from the control regions. Nevertheless, they are mainly used to control the uncertainty on the background modelling and other systematic uncertainties which enter into the fit as nuisance parameters. This is extremely important in the modelling of the tt¯+HF background, which is not well understood theoretically and has large systematic uncertainties. The signal and control regions are fit simultaneously in a profile likelihood fit and no distinction is made between them. In the fit, the final discriminant used in the control regions is either a single bin or the scalar sum of jet pT ($H_T^{\mathrm{had}}$). For the signal regions, the classification discriminant from the two-stage strategy enters the fit. The binning of the final discriminant is optimised to maximise the sensitivity to the ttH¯ signal while keeping the total Monte Carlo statistical uncertainty in each bin low, in order to avoid statistical fluctuations on the predicted yield per bin having an impact on the result of the fit.

7.1.1 Event Categorisation

Following the event selection described in Section 6.2, all events are categorised by lepton (electron or muon) multiplicity into the dilepton and semileptonic channels. Within each channel events are further categorised into regions. The events passing the boosted selection in the semileptonic channel constitute one region. The events passing the resolved selections in both channels are further categorised using the jet multiplicity and b-tagging information of the jets in the event. The resolved regions are constructed to be enriched in different tt¯+jets backgrounds and to maximise the expected significance in the signal regions. As tighter b-tagging requirements are applied on the jets, the dominant background becomes tt¯+b¯b and the purity of ttH¯(H → b¯b) increases, as both of these processes have four b-quarks in the final state. At looser requirements the tt¯+light and tt¯+≥1c backgrounds dominate; however, some signal and tt¯+≥1b events migrate to lower jet and looser b-tag multiplicities due to detector efficiencies.

In the dilepton channel events are grouped by jet multiplicity into events with three jets (3j) and at least four jets (≥4j); in the semileptonic channel events are grouped into those with exactly five jets (5j) and at least six jets (≥6j). Within each jet multiplicity, events are divided into a fine binning based on the b-tagging information of the jets. The jets are sorted by the binned b-tagging discriminant and the first four jets are selected. Events are separated into exclusive bins using the working points of each of the four jets. In the case of the dilepton 3j events the b-tagging working points of the three jets are used. These bins form all the combinations of working points for the four jets, cutting exclusively on the working point bins. The bins are labelled by the bins of the b-tagging discriminant of the four (or three) jets.
An example bin is all four jets passing the tight working point but not the very tight working point; or three very tight and one loose b-tagged jet. All possible bins in the dilepton and semileptonic selections are shown in Fig. 7.1 and Fig. 7.2 respectively, which also show the region definitions explained below.

Regions are constructed using the background composition of the bins in this fine splitting. Requirements are made on the composition of the expected signal and background events in each bin, and all bins which pass the requirement are grouped together to form a region. The different tt¯+jets categories are used in defining the majority of the regions, with the finer categorisation of tt¯+≥1b events used. As several of the bins are very low in simulated events, these are not included when constructing the regions. Instead, after the regions have been defined and all bins grouped together, they are incorporated into neighbouring regions. An example region is constructed by selecting all bins where the tt¯+b¯b background forms >60% of the entire expected background. As a bin can pass several of the requirements and thus enter into multiple regions, priority is given to regions by the order in which they are defined. In the dilepton channel, the selection of the regions is optimised by constructing separate regions for each of the main backgrounds, and maximising the expected significance in the two most signal enriched regions using S/√B as an estimate.

Semileptonic regions
  ≥6 jets                                       5 jets
  Region name            Definition             Region name            Definition
  SR1^{≥6j}              > 60% tt¯+≥2b          SR1^{5j}               > 60% tt¯+≥2b
  SR2^{≥6j}              > 45% tt¯+≥2b          SR2^{5j}               > 20% tt¯+≥2b
  SR3^{≥6j}              > 30% tt¯+≥2b
  CR^{≥6j}_{tt¯+b}       > 30% tt¯+1b           CR^{5j}_{tt¯+b}        > 20% tt¯+1b
  CR^{≥6j}_{tt¯+≥1c}     > 30% tt¯+≥1c          CR^{5j}_{tt¯+≥1c}      > 20% tt¯+≥1c
  CR^{≥6j}_{tt¯+light}   Remaining events       CR^{5j}_{tt¯+light}    Remaining events

Dilepton regions
  ≥4 jets                                       3 jets
  Region name            Definition             Region name            Definition
  SR1^{≥4j}              > 70% tt¯+≥2b
  SR2^{≥4j}              > 1.5% ttH¯
  SR3^{≥4j}              > 30% tt¯+1b
  CR^{≥4j}_{tt¯+≥1c}     > 25% tt¯+≥1c          CR^{3j}_{tt¯+≥1b}      > 30% tt¯+≥1b
  CR^{≥4j}_{tt¯+light}   Remaining events       CR^{3j}_{tt¯+light}    Remaining events

Table 7.1: The definitions used to define the regions in the resolved semileptonic and dilepton channels for each jet multiplicity group. The order of the regions denotes the priority for an event to enter a region. Signal regions and control regions are denoted by the capitalised letters in the name. Signal regions are numbered in order of signal purity per jet multiplicity, with control regions named after the dominant background. All selection criteria are based on the expected fractions for each sample. The tt¯+≥2b category includes the tt¯+B, tt¯+b¯b and tt¯+≥3b backgrounds.

The region definitions in the resolved dilepton and semileptonic channels are given in Table 7.1. The order in Table 7.1 is the same order as the priority of a bin to enter a region. How the definitions relate to the b-tagging information of the jets is depicted in Fig. 7.1 for the dilepton regions and Fig. 7.2 for the semileptonic regions. The events are classified into exclusive bins using the b-tagging working points of the constituent jets, with the b-tagging working point of each jet given on the x and y-axes. After the definition of the regions, an additional adjustment of the boundaries is performed by hand. A bin can be moved between two regions to reduce the impact of statistical fluctuations on the region definitions. For example, a bin is moved if it is isolated from other bins in the same region.
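The priority ordering described above can be sketched as follows for the dilepton ≥4j bins, using the thresholds of Table 7.1. This is an illustrative sketch only: the dictionary keys and function names are hypothetical, the expected fractions per bin are assumed to be supplied from simulation, and neither the merging of low-statistics bins nor the manual boundary adjustment is modelled.

```python
from typing import Callable, Dict, List, Tuple

Fractions = Dict[str, float]  # expected fraction of each sample in one b-tagging bin

# Ordered (region name, requirement) pairs; the order encodes the priority.
DILEPTON_4J_RULES: List[Tuple[str, Callable[[Fractions], bool]]] = [
    ("SR1", lambda f: f.get("tt+>=2b", 0.0) > 0.70),
    ("SR2", lambda f: f.get("ttH", 0.0) > 0.015),
    ("SR3", lambda f: f.get("tt+1b", 0.0) > 0.30),
    ("CR_tt+>=1c", lambda f: f.get("tt+>=1c", 0.0) > 0.25),
]


def assign_region(fractions: Fractions,
                  rules: List[Tuple[str, Callable[[Fractions], bool]]] = DILEPTON_4J_RULES,
                  default: str = "CR_tt+light") -> str:
    """Return the first region whose requirement the bin passes, else the default."""
    for name, requirement in rules:
        if requirement(fractions):
            return name
    return default
```

A bin that satisfies both the SR1 and SR2 requirements is assigned to SR1, because SR1 is defined first; bins passing none of the requirements fall into the tt¯+light control region.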

The two most signal enriched regions in the resolved semileptonic channel with 5j and ≥6j, SR1^{5j} and SR1^{≥6j} respectively, are defined by requiring at least four jets tagged at the very tight b-tagging working point. In the dilepton channel the most signal enriched region, SR1^{≥4j}, is defined by requiring at least three jets tagged at the very tight working point, and an additional fourth jet passing at least the tight b-tagging working point. The other signal regions are defined with looser b-tagging requirements. The boosted semileptonic region is also considered as a signal region. In total there are 19 regions in the analysis, of which nine are signal regions and ten are control regions. In the dilepton channel the 3j events are split into two control regions, CR^{3j}_{tt¯+light} and CR^{3j}_{tt¯+≥1b}. For events with ≥4 jets in the dilepton channel, there are two control regions in addition to the signal regions, CR^{≥4j}_{tt¯+light} and CR^{≥4j}_{tt¯+≥1c}. In the semileptonic channel, events which are not in the signal regions are divided into six control regions, with control regions for tt¯+light, tt¯+≥1c, and tt¯+b for each of the two jet multiplicities.

The expected background composition of the regions in the analysis is shown by the pie charts in Fig. 7.3 for the dilepton regions and in Fig. 7.4 for the semileptonic regions. The backgrounds are divided into the different tt¯+jets components, with tt¯+≥1b further categorised into the subcategories introduced in Chapter 6; the ttV¯ background; and all remaining backgrounds grouped together as non-tt¯, which includes the background from fakes and non-prompt leptons. The signal regions, with the exception of the boosted region, are all dominated by the tt¯+≥1b background. The different background compositions in the control regions optimised for different tt¯+jets backgrounds are also shown. In Fig. 7.5 the expected fraction of signal events coming from the different Higgs decay modes is shown for all regions in the analysis. The H → b¯b decay represents 89% of all ttH¯ events in the dilepton signal regions, 96% in the resolved semileptonic signal regions, and 86% in the boosted semileptonic signal region. The expected signal to background ratio for the regions is shown in Fig. 7.6 alongside S/√B, calculated using the expected signal and background yields per region for the 36.1 fb−1 dataset. From these plots, the main sensitivity can be expected to come from the semileptonic channel, especially in the ≥6j signal regions. However, the most signal enriched region is in the dilepton channel, and further sensitivity comes from the separation of signal from background in the classification discriminant which enters the fit.
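The difference between the purity S/B and the sensitivity estimate S/√B can be made concrete with a small sketch. The function names are hypothetical and the yields in the usage note are invented for illustration; they are not taken from the analysis.

```python
import math


def signal_to_background(s: float, b: float) -> float:
    """Purity estimate S/B for expected signal and background yields."""
    return s / b


def approx_significance(s: float, b: float) -> float:
    """Simple sensitivity estimate S/sqrt(B), which ignores systematic uncertainties."""
    return s / math.sqrt(b)
```

With made-up yields, a small pure region with S = 10 and B = 200 has S/B = 0.05 and S/√B ≈ 0.71, whereas a larger region with S = 50 and B = 2500 has a lower S/B of 0.02 but a higher S/√B of 1.0. This is the pattern noted above, where the most signal enriched region is not necessarily the one contributing the most sensitivity.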

The overall yields in data and the background composition for the regions in the analysis after categorisation are shown in Fig. 7.7, with the expected ttH¯ contribution shown in red and also separately by the red dashed line. The total uncertainty band per bin includes the systematic uncertainties which are discussed in detail in Section 7.2. An underestimation of events in simulation is seen in several regions, especially for some of the tt¯+HF control regions and the less signal enriched signal regions. The event yields in each control region enter as a single bin in the profile likelihood fit, with the exception of the tt¯+≥1c enriched regions in the semileptonic channel. In these regions $H_T^{\mathrm{had}}$, defined as the scalar sum of the pT of all jets in the event, is used as the final discriminant to help control the modelling of the tt¯+≥1c background. In the signal regions, multivariate techniques are used to construct the final discriminants.

[Figure 7.1: grids of b-tagging discriminant bins. Vertical axis: (1st, 2nd) jet b-tagging discriminant, from (3, 3) to (5, 5). Horizontal axis: 3rd jet b-tagging discriminant in (a), dilepton 3j, and (3rd, 4th) jet b-tagging discriminant in (b), dilepton ≥4j. Panel (a) is divided into CRtt¯+light and CRtt¯+≥1b; panel (b) into SR1, SR2, SR3, CRtt¯+≥1c and CRtt¯+light.]

Figure 7.1: The definition of the 3j (a) and ≥4j (b) dilepton regions using the binned b-tagging discriminant of the jets in the event. The vertical axis shows the bin in the b-tagging discriminant of the first two jets in the event. The horizontal axis shows the bin in the b-tagging discriminant of the third (a) or third and fourth (b) jets in the event. In this labelling, 1 refers to the untagged working point bin and 5 refers to the very tight b-tagging working point bin, as described in Table 5.1. The jets are ordered by their b-tagging scores with the empty areas representing combinations of working points which are not possible.

[Figure 7.2: grids of b-tagging discriminant bins. Vertical axis: (1st, 2nd) jet b-tagging discriminant, from (3, 3) to (5, 5). Horizontal axis: (3rd, 4th) jet b-tagging discriminant. Panel (a), single lepton 5j, is divided into SR1, SR2, CRtt¯+b, CRtt¯+≥1c and CRtt¯+light; panel (b), single lepton ≥6j, into SR1, SR2, SR3, CRtt¯+b, CRtt¯+≥1c and CRtt¯+light.]

Figure 7.2: The definition of the 5j (a) and ≥6j (b) semileptonic resolved regions using the binned b-tagging discriminant of the jets in the event. The vertical axis shows the bin in the b-tagging discriminant of the first two jets in the event. The horizontal axis shows the bin in the b-tagging discriminant of the third and fourth jets in the event. In this labelling, 1 refers to the untagged working point bin and 5 refers to the very tight b-tagging working point bin, as described in Table 5.1. The jets are ordered by their b-tagging scores with the empty squares representing combinations of working points which are not possible, or bins which would contain events which do not pass the event selection.

[Figure 7.3: pie charts for the dilepton regions CR^{3j}_{tt¯+light}, CR^{3j}_{tt¯+≥1b}, CR^{≥4j}_{tt¯+light}, CR^{≥4j}_{tt¯+≥1c}, SR3^{≥4j}, SR2^{≥4j} and SR1^{≥4j} at √s = 13 TeV. Legend: tt¯+light, tt¯+≥1c, MPI/FSR, tt¯+b, tt¯+B, tt¯+b¯b, tt¯+≥3b, tt¯+V, non-tt¯.]

Figure 7.3: The fractional background composition of the regions in the dilepton channel. The tt¯+jets background is separated into tt¯+light, tt¯+≥1c and the finer tt¯+≥1b categorisation. MPI/FSR refers to the tt¯+b (MPI/FSR) contribution. The ttV¯ background is shown separately, with the remaining non-tt¯ backgrounds grouped together.

[Figure 7.4: pie charts for the single lepton regions CR^{5j}_{tt¯+light}, CR^{5j}_{tt¯+≥1c}, CR^{5j}_{tt¯+b}, SR2^{5j}, SR1^{5j}, SR^{boosted}, CR^{≥6j}_{tt¯+light}, CR^{≥6j}_{tt¯+≥1c}, CR^{≥6j}_{tt¯+b}, SR3^{≥6j}, SR2^{≥6j} and SR1^{≥6j} at √s = 13 TeV. Legend: tt¯+light, tt¯+≥1c, MPI/FSR, tt¯+b, tt¯+B, tt¯+b¯b, tt¯+≥3b, tt¯+V, non-tt¯.]

Figure 7.4: The fractional background composition of the regions in the semileptonic channel. The tt¯ + jets background is separated into tt¯ + light, tt¯+≥1c and the finer tt¯+≥1b categorisation. MPI/FSR refers to the tt¯ + b(MPI/FSR) contribution. The ttV¯ background is shown separately, with the remaining non-tt¯ backgrounds grouped together.

[Figure 7.5: pie charts of the ttH¯ signal composition per region at √s = 13 TeV (ATLAS Simulation), for the dilepton regions (a) and the single lepton regions (b). Legend: H → b¯b, H → WW, H → other.]

Figure 7.5: The fractional composition of the ttH¯ signal in the dilepton (a) and semileptonic (b) regions. The H → other contribution is primarily H → ZZ(∗) → 4ℓ and H → ττ. In the signal regions where there is sensitivity to the ttH¯ signal, the events are dominated by the H → b¯b decay mode.

[Figure 7.6: S/B and S/√B per region for the dilepton (a) and single lepton (b) channels, √s = 13 TeV, 36.1 fb−1.]

Figure 7.6: The expected ratio of signal to background S/B (solid black line) and the expected sensitivity of each region, calculated using S/√B (dashed red line), for each region in the dilepton (a) and semileptonic (b) channels. Both S/B and S/√B are calculated from the expected number of signal (S) and background (B) yields per region, taken from the Monte Carlo simulation and data driven background estimates, as described in Chapter 6, and using a luminosity of 36.1 fb−1.

[Figure 7.7: pre-fit events per bin and Data / Pred. ratio for each region in the dilepton (a) and single lepton (b) channels, √s = 13 TeV, 36.1 fb−1. Legend: Data, ttH¯, tt¯+light, tt¯+≥1c, tt¯+≥1b, tt¯+V, non-tt¯, total uncertainty.]

Figure 7.7: Summary of all regions in the dilepton (a) and semileptonic (b) channels, showing the predicted yields and observed data per region. The signal contribution, normalised to the expected SM cross section, is shown by the solid red contribution in each bin and additionally by the red dashed line for clarity. The recorded data are compared to the expected contribution from simulation and data driven techniques, with the total uncertainty on the expected events coming from all systematic uncertainties.


7.1.2 Overview of Multivariate Methods

After the selection and categorisation of events into regions, there are nine separate signal regions in which the ttH¯ signal and tt¯+≥1b background have been enhanced with respect to the reducible backgrounds. To further improve the sensitivity of the analysis, signal events are separated from the background in the signal regions by exploiting as much of the information in each event as possible with multivariate techniques. Classification BDTs are used as the final discriminants in all signal regions, combining multiple variables to achieve an enhanced separation of signal and background. In addition to the final discriminant, multivariate techniques are used to exploit topological information in the event and reconstruct the particles in the final state of the event. From this, powerful variables can be constructed which are used as inputs to the classification BDTs, greatly improving the sensitivity of the analysis.

In the resolved regions, the first stage reconstructs the signal or background hypothesis, rebuilding the particles in the ttH¯(H → b¯b) and tt¯+jets final states from the jets, leptons and $E_T^{\mathrm{miss}}$ in the event. Three complementary techniques are used to reconstruct the final state. In contrast, no explicit reconstruction method is attempted in the boosted region, as one of the large R jets is already tagged as a Higgs boson candidate in the event selection. The Higgs boson candidate large R jet in boosted events is correctly matched to the true Higgs boson in 47% of ttH¯(H → b¯b) simulated events, with the top candidate containing the Higgs boson in less than 0.5% of ttH¯(H → b¯b) events. In the remaining events the two b-quarks from the Higgs boson decay are not both fully contained within a single large R jet.

In both the dilepton and semileptonic channels, BDTs are used to reconstruct the ttH¯(H → b¯b) final state from the jets and leptons in the event by matching the recorded objects to those in the final state at tree level.
In this method, a BDT is used to assign jets to the four b-quarks coming from the top and Higgs boson decays in ttH¯(H → b¯b) events. The reconstruction BDT is trained to separate the correct jet-parton assignment from the combinatoric background. This is explained in more detail in Section 7.1.3.

Complementary to the reconstruction BDT, the likelihood discriminant (LHD) is employed in all resolved semileptonic signal regions. It is computed analogously to the method described in Ref. [97]. The LHD is constructed as a product of one dimensional probability density functions for the ttH¯(H → b¯b) and tt¯+≥1b event hypotheses, with the particles in each hypothesis constructed from jet assignments to the Higgs boson and top quarks. The probability density functions are built for various invariant masses, such as that of the reconstructed Higgs boson and top quarks, and additionally for angular distributions using the reconstructed jets, leptons and $E_T^{\mathrm{miss}}$ in each event. All possible jet-parton assignments are used in the calculation, with the combinations weighted by the b-tagging discriminants of the individual jet assignments. This is performed to penalise unlikely combinations, for example where b-tagged jets are matched to light quarks. The LHD is defined as the ratio

\begin{equation}
  \mathrm{LHD} = \frac{p_{\mathrm{sig}}}{p_{\mathrm{sig}} + p_{\mathrm{bkg}}}, \tag{7.1}
\end{equation}

where $p_{\mathrm{sig}}$ and $p_{\mathrm{bkg}}$ are the probabilities for each event under the ttH¯ signal and tt¯+≥1b background hypotheses respectively, calculated from the weighted average of the products of the probability density functions per combination. The LHD uses information from all possible combinations in an event to construct the discriminant; however, unlike the reconstruction BDT, it does not consider the correlations between variables within individual combinations.

Finally, a discriminant based on the Matrix Element Method (MEM) is computed for events in the most signal enriched semileptonic region, calculated using a similar method to the one described in Ref. [86]. The MEM discriminant is similar to the LHD; however, the hypothesis testing is performed at parton level using a transfer function instead of using the reconstructed objects directly. The discriminant is defined as

\begin{equation}
  \mathrm{MEM}_{D1} = \log_{10}(L_S) - \log_{10}(L_B), \tag{7.2}
\end{equation}

where $L_S$ and $L_B$ are the computed likelihoods for the event under the signal and background hypotheses. The MEM takes the four-vectors of the jets, leptons and $E_T^{\mathrm{miss}}$ in the event as inputs.

The second stage of the multivariate strategy is classifying the events as signal or background. In this step the ttH¯ signal events are separated from the background processes by combining variables in a classification BDT. Variables from the reconstruction techniques are combined with additional variables to maximise the sensitivity of the analysis to the signal in the statistical analysis.
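The two discriminants of Eqs. (7.1) and (7.2) can be written down directly, as in the sketch below. The function names are hypothetical; the per-combination probability density values and weights are assumed to be provided, and the construction of the probability density functions, the b-tagging weights and the matrix element likelihoods themselves is outside the scope of this illustration.

```python
import math
from typing import Sequence


def hypothesis_probability(pdf_values_per_combination: Sequence[Sequence[float]],
                           weights: Sequence[float]) -> float:
    """Weighted average over combinations of the product of one dimensional pdf values."""
    total_weight = sum(weights)
    weighted_sum = sum(w * math.prod(values)
                       for w, values in zip(weights, pdf_values_per_combination))
    return weighted_sum / total_weight


def lhd(p_sig: float, p_bkg: float) -> float:
    """Likelihood discriminant of Eq. (7.1)."""
    return p_sig / (p_sig + p_bkg)


def mem_d1(l_sig: float, l_bkg: float) -> float:
    """Matrix element discriminant of Eq. (7.2)."""
    return math.log10(l_sig) - math.log10(l_bkg)
```

By construction the LHD lies between 0 and 1, with values near 1 for signal-like events, while the MEM discriminant is positive when the signal likelihood exceeds the background likelihood.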

7.1.3 Reconstruction BDT

In ttH¯(H → b¯b) events with both tops decaying leptonically there are four b-quarks, two leptons and two neutrinos from the decays of the Higgs boson and top quarks. However, it is not possible to detect and reconstruct these particles directly, or determine the particles from which they have decayed. In the ATLAS detector they are reconstructed as four jets, which should be b-tagged, two leptons and $E_T^{\mathrm{miss}}$. The aim of the reconstruction BDT is to correctly assign the reconstructed jets to the partons in a ttH¯(H → b¯b) event. Due to the large jet multiplicity in the event there are a large number of possible permutations, and the jets cannot be assigned to the four b-quarks trivially. Once the jets have been assigned to the partons in the ttH¯(H → b¯b) event, it is possible to use the jets, leptons and $E_T^{\mathrm{miss}}$ to reconstruct the objects in the ttH¯ process and their kinematics. For example, using the two jets matched to the b-quarks from the Higgs boson decay, one can calculate the invariant mass of the reconstructed Higgs boson candidate. As the reconstruction BDT assigns jets to partons assuming a ttH¯ hypothesis for all events, the distributions of variables built using the jet assignments have different shapes for background events compared to the ttH¯ signal. These variables are very powerful in separating ttH¯ and tt¯ events and feed into the next stage of the multivariate strategy in the analysis.
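As an illustration of building such a variable, the invariant mass of a Higgs boson candidate can be computed from the two assigned jets as sketched below. The `(pt, eta, phi, mass)` tuple layout and the function names are assumptions made for this example and do not correspond to the analysis software.

```python
import math
from typing import Tuple

Jet = Tuple[float, float, float, float]  # (pt [GeV], eta, phi, mass [GeV])


def four_vector(jet: Jet) -> Tuple[float, float, float, float]:
    """Convert (pt, eta, phi, m) to Cartesian components (E, px, py, pz)."""
    pt, eta, phi, mass = jet
    px = pt * math.cos(phi)
    py = pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return energy, px, py, pz


def invariant_mass(jet_a: Jet, jet_b: Jet) -> float:
    """Invariant mass of the two-jet system, e.g. a Higgs boson candidate."""
    e, px, py, pz = (a + b for a, b in zip(four_vector(jet_a), four_vector(jet_b)))
    m2 = e * e - px * px - py * py - pz * pz
    return math.sqrt(max(m2, 0.0))  # guard against tiny negative rounding errors
```

Two massless jets with pT = 62.5 GeV each, emitted back to back in the transverse plane, give an invariant mass of 125 GeV in this sketch.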

In order to match the jets to the four b-quarks, a classification BDT is trained treating the correct jet-parton assignment as the signal and training it against the combinatoric jet background, where the jets are assigned to partons incorrectly. The training is performed using simulated ttH¯ events, requiring both tops to decay leptonically and for the Higgs boson to decay into two b-quarks. To determine the truth origin of the jets in the event, final state partons are matched to a jet if they are within ∆R < 0.3. It can happen that a parton is matched to multiple jets or two partons to the same jet. The parton history is taken from the Monte Carlo simulation and is used to identify the partons which originate from the top quark and Higgs boson decays. As there is very little difference in the kinematics of the two b-quarks from the Higgs boson decay, when determining whether a jet is correctly assigned to a b-quark from the Higgs boson the truth origin of the jet is allowed to be either of the two b-quarks. A similar relaxed constraint is applied to the assignment of jets to the top quarks. The jet and lepton from the same top decay are required to be matched to the same top in the reconstruction, however it does not need to be the correct top in the truth information. Due to this relaxation, the notation used for the top quarks is not top and anti-top, but rather top1 and top2 in the trainings. For variables subsequently reconstructed using the jet assignments from the reconstruction BDT, the charge of the lepton is used to determine whether the reconstructed top quark is the top or anti-top.
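The ∆R matching of partons to jets described above can be sketched as follows. The data layout (dictionaries of `(eta, phi)` pairs) and the function names are hypothetical. All jets within the cone are returned for each parton, since a parton can be matched to several jets and several partons to the same jet.

```python
import math
from typing import Dict, List, Sequence, Tuple


def delta_r(eta1: float, phi1: float, eta2: float, phi2: float) -> float:
    """Angular separation, with the azimuthal difference wrapped into [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)


def match_partons_to_jets(partons: Dict[str, Tuple[float, float]],
                          jets: Sequence[Tuple[float, float]],
                          max_dr: float = 0.3) -> Dict[str, List[int]]:
    """Map each parton name to the indices of all jets within max_dr of it."""
    return {
        name: [i for i, (jet_eta, jet_phi) in enumerate(jets)
               if delta_r(parton_eta, parton_phi, jet_eta, jet_phi) < max_dr]
        for name, (parton_eta, parton_phi) in partons.items()
    }
```

The wrapping of the azimuthal difference matters for objects on either side of φ = ±π, which are close in angle even though their φ values differ by almost 2π.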

In every event all possible combinations of jet-parton matching are constructed using the jets in the event and the four b-quarks in the ttH¯ (H → b¯b) hard scatter. The combinations where the two jets matched to the Higgs have been swapped in comparison to a previously used combination are removed due to the kinematic similarities. When training and evaluating the reconstruction BDTs, only four jets from each event are used for the parton assignments, in order to reduce the total number of possible combinations in events with more than four reconstructed jets. For example, when using five jets there would be 60 combinations per event, whereas restricting this to four jets reduces this to 12. To select the jets, they are first sorted by the binned b-tag discriminant, from tightest to loosest; jets which are in the same bin are subsequently ordered in descending value of pT. The four leading jets are selected and used in the training and evaluation of the reconstruction BDTs. The correctly assigned jets for each combination are determined, and combinations are labelled as signal if all jets are correctly assigned to the four b-quarks, otherwise they are labelled as the combinatoric background. For each combination, kinematic and topological variables are constructed using the jet assignments. These variables are used to discriminate between combinations with the correct assignment and the combinatorial background, and are used as inputs in the training of the reconstruction BDT. The majority of the variables used are angular separations and invariant masses of objects, such as the invariant mass of the two jets matched to the Higgs boson, or the separation in ∆R of the jet and lepton matched to the same top quark, as these variables have a much higher discrimination power than the kinematic properties of individual jets. The reconstructed Higgs boson mass for combinations with the jets correctly matched to the Higgs boson is shown in Fig. 7.8 in comparison to the distribution from the combinatoric background. For the correct assignments the distribution has a clear peak structure whereas the combinatoric background has a much broader distribution which peaks at a lower invariant mass.
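The jet ordering and the counting of combinations quoted above (60 for five jets, 12 for four) can be checked with a short sketch. The role names and the jet dictionary keys `btag_bin` and `pt` are hypothetical, and a larger `btag_bin` is assumed to mean a tighter b-tag bin.

```python
from itertools import permutations

# Hypothetical labels for the four b-quarks of the hard scatter.
ROLES = ("b_top1", "b_top2", "b_higgs_1", "b_higgs_2")


def select_jets(jets, n=4):
    """Sort by b-tag bin (tightest first), then by descending pT, and keep n jets."""
    return sorted(jets, key=lambda j: (-j["btag_bin"], -j["pt"]))[:n]


def enumerate_assignments(n_jets):
    """All assignments of jet indices to the four roles, with Higgs swaps removed."""
    combos = []
    for top1, top2, h1, h2 in permutations(range(n_jets), 4):
        if h1 < h2:  # keep only one ordering of the two Higgs jets
            combos.append(dict(zip(ROLES, (top1, top2, h1, h2))))
    return combos
```

With four jets there are 4! = 24 ordered assignments, halved to 12 once the two Higgs jets are treated as interchangeable; with five jets the same counting gives 120 / 2 = 60.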

Once trained, the reconstruction BDT is evaluated for all combinations per event. The combination which has the highest score from the BDT output discriminant is chosen as the correct jet assignment for the event. Using the jet assignments, event properties can be reconstructed such as the reconstructed Higgs boson mass and the angular separation of the b-tagged jets matched to the Higgs boson.
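As an illustration of this step, the sketch below picks the highest scoring combination and builds the Higgs candidate mass from the two assigned jets. The `score` callable stands in for the trained BDT and, like the dictionary keys, is an assumption made for the example.

```python
import math


def four_vector(pt, eta, phi, mass=0.0):
    """Convert (pT, eta, phi, m) into Cartesian components (E, px, py, pz)."""
    px = pt * math.cos(phi)
    py = pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return (energy, px, py, pz)


def invariant_mass(*vectors):
    """Invariant mass of the sum of (E, px, py, pz) four-vectors."""
    energy, px, py, pz = (sum(component) for component in zip(*vectors))
    return math.sqrt(max(energy * energy - px * px - py * py - pz * pz, 0.0))


def best_combination(combos, score):
    """Return the combination with the highest score (stand-in for the BDT output)."""
    return max(combos, key=score)


def higgs_candidate_mass(combo, jets):
    """Mass of the two jets assigned to the Higgs boson in the chosen combination."""
    return invariant_mass(jets[combo["b_higgs_1"]], jets[combo["b_higgs_2"]])
```

Here `jets` is a list of four-vectors produced by `four_vector`, indexed by the jet indices stored in each combination.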

In addition to variables constructed from the objects, the BDT score of the best combination per event is used to discriminate between the ttH¯ signal and background processes. For signal-like events it should be more likely to have a combination of jet assignments which describes the kinematics of ttH¯ (H → b¯b) events well, whereas for background processes this should be less likely. For example, the best scoring combination from a tt¯ + b¯b event should be able to correctly assign jets to the b-quarks from the top decays, as is the case for ttH¯, but the additional b-quark pair will have very different kinematics to the b-quarks from Higgs boson decays. This results in tt¯ + b¯b events, and other backgrounds, having lower output scores from the BDT for the best performing combination in comparison to ttH¯ signal events. This variable can be interpreted as how signal-like an event is, and has a large discrimination between signal and background events.

[Figure 7.8 plot: normalised entries against the reconstructed Higgs mass M_Higgs^reco [GeV]; legend: correct Higgs matching, incorrect Higgs matching.]

Figure 7.8: The reconstructed Higgs mass of all possible combinations per event, built using the two jets assigned to the b-quarks from the Higgs boson decay in simulated ttH¯ (H → b¯b) events with both top quarks decaying leptonically. Combinations with the correct jet assignment to the Higgs boson are shown in red, and those without in blue. All combinations are constructed from the ttH¯ signal sample, using only events where the Higgs boson decays into a b-quark pair and requiring that both top quarks decay leptonically. The distributions have been normalised to unity.

To evaluate the performance of this method, the efficiency of the reconstruction assignments for signal events is calculated. In this, the reconstruction matching is tested by calculating the fraction of jets which are correctly assigned to the four b-quarks when choosing the best scoring combination per event. These efficiencies are compared to the fraction of events which contain the final state partons matched to unique jets, as the partons can fall out of the acceptance of the detector, or equally two partons can be matched to the same jet due to the ∆R matching. In these cases it is not possible to correctly assign all jets in the event.

Achieving the best matching performance comes at a cost, however. The variables constructed using the best scoring combination per event are biased towards the signal distributions of the input variables used in the training. For example, the invariant mass of the two jets matched to the Higgs boson will be biased to mH = 125 GeV even in tt¯ + jets events, as this looks the most signal-like. To prevent the distributions of the reconstructed variables in background processes from appearing more signal-like after the jet assignments, two separate sets of BDTs are trained to assign jets to the partons. The first uses all information in the event to find the best combination possible. This results in the BDT output score having the best separation between signal and background, and also provides the highest reconstruction purity for ttH¯ (H → b¯b) events. The second BDT is trained without including variables which use the jets matched to the Higgs boson. In this training, all combinations where the jets have been correctly matched to the b-quarks from the top decays are treated as signal, with the remaining combinations being treated as background, dropping the requirement of the jets to be correctly assigned to the Higgs boson. As there are only four jets used in the combination, the two jets not assigned to the top quark decays can be inferred to come from the Higgs boson. The best scoring combination per event from this second BDT is used to reconstruct kinematic variables, such as the Higgs boson mass, which can be used to separate ttH¯ events from background processes. The two BDTs are referred to as the BDT with Higgs and BDT without Higgs information. A comparison of the reconstructed Higgs boson mass from these two BDTs is depicted in Fig. 7.9. The shape of the distribution for background events differs quite drastically, depending on which BDT is used to select the jet assignments. A much narrower distribution is observed when using the BDT with Higgs information in comparison to the BDT without Higgs information. For signal events the Higgs mass peak is clear in both cases; however, in the BDT without Higgs information there is a broader distribution with a tail to higher masses. Variables related to the Higgs boson, which are used to discriminate between signal and background events, are reconstructed using the BDT without Higgs jet assignments to improve the separation. This also reduces correlations between the reconstructed kinematics and the BDT output score of the best scoring combination from the BDT with Higgs.

In addition to the two methods, both with and without the Higgs information, the reconstruction BDTs are trained in two distinct regions. The first requires at least four jets to be b-tagged at the loose working point. This is to match the expected final state objects for the ttH¯ (H → b¯b) events, assuming all partons are matched to unique jets. This training is used in SR_1^{≥4j} and SR_2^{≥4j}. In the remaining signal region, SR_3^{≥4j}, there are exactly three jets which pass the very tight b-tag working point and all remaining jets do not pass the loose working point. Here a second reconstruction training is used. This training is optimised for events with only three b-jets, and is trained on events with exactly three jets which pass the tight b-tag working point. In this region, the efficiency of the partons to be in the final state is much reduced and a dedicated training is found to provide improved performance. The b-tagging working point in this training is loosened with respect to the definition of SR_3^{≥4j} in order to increase the available statistics for training the BDT.

The variables which are used in the reconstruction BDTs are optimised by maximising the area under the ROC curve of the test samples, whilst minimising the correlations between input variables. However, in instances where there is a large difference in the correlation between two variables for signal and background jet assignments the variables are not removed from the training, for example if the variables are correlated in signal events and anti-correlated in background events. Although the optimisation is performed using the ROC curve and variable correlations, the final selection of variables is chosen by maximising the efficiency and overall performance of the BDT method. This is evaluated by choosing the best scoring combination per event, and determining the fraction of events where the BDT has correctly assigned jets to the four b-quarks. Table 7.2 lists the variables used in the reconstruction BDT per region. In the table there are two sets of variables for each region, with Higgs and without Higgs. The distributions of all variables used in the reconstruction BDTs in the dilepton channel are compared in Fig. 7.10 and Fig. 7.11 for events where all jets are correctly matched to the partons (signal) and the combinatoric background. Variables which only use topological information from the tt¯ system are compared in Fig. 7.10 and variables which use topological information from the jets matched to the Higgs boson are compared in Fig. 7.11.

[Figure 7.9 plots: normalised entries against M_Higgs^{reco, withH} [GeV] (a) and M_Higgs^{reco, noH} [GeV] (b); legend: ttH(bb), tt+jets.]

Figure 7.9: Comparing the reconstructed Higgs mass of tt¯ and ttH¯ events after evaluating both the reconstruction BDTs, where Higgs information is used in the jet assignment (a) and without the Higgs information (b). The bias towards a more signal-like mass distribution for the background is seen in (a) with both peaking between 100–125 GeV. In (b) a shape difference is seen for the reconstructed Higgs boson mass with the signal still peaking between 100–125 GeV but the background now peaking at a lower mass and a longer tail in the distribution, like for the combinatoric background in Fig. 7.8. Events are selected with ≥ 4 jets, of which ≥ 4 are b-tagged at the loose working point. The signal is ttH¯ (H → b¯b) events where both tops decay leptonically, and the background is the tt¯ + jets background, both normalised to unity.

| Variable | Reco BDT with Higgs, SR_{1,2}^{≥4j} | Reco BDT with Higgs, SR_3^{≥4j} | Reco BDT without Higgs, SR_{1,2}^{≥4j} | Reco BDT without Higgs, SR_3^{≥4j} |
|---|---|---|---|---|
| Topological information from tt¯ | | | | |
| Mass of top | X | X | X | X |
| Mass of anti-top | X | X | X | X |
| Mass difference between top and anti-top | X | X | X | X |
| ∆R(ℓ, b) from top | X | X | X | X |
| ∆R(ℓ, b) from anti-top | X | X | X | X |
| \|∆R(ℓ, b) from top - ∆R(ℓ, b) from anti-top\| | - | - | X | X |
| ∆φ(b from top, b from anti-top) | - | X | X | X |
| ∆R(b from top, b from anti-top) | X | - | - | - |
| pT b from top | - | - | X | X |
| pT b from anti-top | - | - | X | X |
| Min. ∆η(ℓ, b from top or anti-top) | - | - | X | X |
| Topological information from the Higgs-boson candidate | | | | |
| Min. ∆R(b from Higgs, ℓ) | - | X | - | - |
| Max. ∆R(Higgs, b from top or anti-top) | X | - | - | - |
| Mass of Higgs | X | X | - | - |
| ∆φ(Higgs, tt¯) | - | X | - | - |
| ∆R(Higgs, tt¯) | X | - | - | - |
| pT b from Higgs with lowest b-tagging discriminant | - | X | - | - |
| ∆R(b1 from Higgs, b2 from Higgs) | X | X | - | - |

Table 7.2: The input variables used in the reconstruction BDTs in the dilepton channel. The variables are listed for both training regions, shown by the regions in which they are applied, and both reconstruction BDT methods. The topological information from the Higgs boson candidate is only used in the Reco BDT with Higgs. In the variable definitions, top and antitop refer to the objects reconstructed using only the b-jet and lepton matched to the (anti-)top.

[Figure 7.10 plots: normalised entries for signal and background combinations in eleven panels. Variable and bin-by-bin separation per panel: M_{lep,b}^{top} [GeV], 13.01%; M_{lep,b}^{antitop} [GeV], 12.38%; ∆M_{lep,b}^{tops} [GeV], 12.74%; ∆R_{lep,b}^{top}, 9.22%; ∆R_{lep,b}^{antitop}, 7.66%; ∆(∆R_{lep,b})^{tops}, 1.99%; pT^{b, top} [GeV], 0.20%; pT^{b, antitop} [GeV], 0.07%; Min ∆η(b^{tops}, l), 0.80%; ∆φ_{bb}^{tops}, 1.49%; ∆R_{bb}^{tops}, 1.59%.]

Figure 7.10: Comparison of signal and background distributions for the input variables entering the reconstruction BDTs in the dilepton channel which use topological information only from the tt¯ system. The entries are taken from all possible jet-parton combinations in the ttH¯ sample, with the Higgs boson decaying to two b-quarks and requiring both top quarks to decay leptonically. The signal is all the combinations with all four jets correctly matched to the four b-quarks, and the background contains all other combinations. Events are selected with at least four jets b-tagged at the loose working point. Events where all jets are correctly matched to the partons are labelled as signal, with all remaining events labelled as background. All events falling outside the range of the axis are included in the final bin.

[Figure 7.11 plots: normalised entries for signal and background combinations in six panels. Variable and bin-by-bin separation per panel: Max ∆R(b^{Higgs}, b), 1.79%; M_Higgs [GeV], 32.77%; ∆φ(Higgs, tt), 1.45%; ∆R(Higgs, tt), 1.64%; pT^{subleading b, Higgs} [GeV], 4.31%; ∆R_{bb}^{Higgs}, 6.36%.]

Figure 7.11: Comparison of signal and background distributions for the input variables entering the reconstruction BDTs in the dilepton channel which use topological information from the Higgs boson. The entries are taken from all possible jet-parton combinations in the ttH¯ sample, with the Higgs boson decaying to two b-quarks and requiring both top quarks to decay leptonically. The signal is all the combinations with all four jets correctly matched to the four b-quarks, and the background contains all other combinations. Events are selected with at least four jets b-tagged at the loose working point. Events where all jets are correctly matched to the partons are labelled as signal, with all remaining events labelled as background. All events falling outside the range of the axis are included in the final bin.

The resulting performance of the reconstruction BDT method to correctly assign jets to the final state partons in the dilepton channel is shown in Fig. 7.12 for SR_1^{≥4j} and SR_2^{≥4j}, which use the same training, and in Fig. 7.13 for SR_3^{≥4j}, which has a dedicated training for events with one untagged jet. In each plot the efficiency of each parton to be matched to a reconstructed jet is also shown, in order to demonstrate the maximum achievable performance. In SR_1^{≥4j}, the reconstruction BDT with (without) Higgs correctly matches the jets from the H → b¯b decay in 50% (32%) of simulated ttH¯ (H → b¯b) events. In these plots, the acceptance of the subleading b from the Higgs boson is lower than that of the leading b. This can be attributed to the minimum pT cut applied on b-hadrons in the truth object reconstruction in the Monte Carlo samples. Additionally, the efficiency to match the jets to the b-quarks from the tt¯ decays is observed to be higher in SR_3^{≥4j} for the BDT without Higgs than the BDT with Higgs information. This can be attributed to the dedicated training in this region, with variables optimised to correctly match the tt¯ decays in this region. Furthermore, the reduced acceptance of the Higgs boson decay products limits the overall performance of the BDT with Higgs, and impacts the ability of the BDT with Higgs to correctly match the jets to the b-quarks from top1 and top2. To demonstrate the ability of the reconstruction BDT without Higgs to correctly reconstruct the Higgs boson kinematics, the Higgs candidate mass and minimum separation in ∆R of the Higgs candidate from a lepton are shown in Fig. 7.14 for all ttH¯ events in the three signal regions. Events which have both jets correctly matched to the b-quarks from the Higgs boson decay are represented by the red shaded area. The reconstructed variables and BDT output scores are used in separating ttH¯ from tt¯ + jets in the classification BDT. Comparisons of the distribution of the reconstruction BDT output score for simulated ttH¯ and tt¯ + jets events are shown in Fig. 7.16 for all three signal regions.

[Figure 7.12 plots: fraction of events against the partons matched to jets (all, Higgs, two tops, top1 b, top2 b, Higgs leading b, Higgs subleading b) for Dilepton SR_1^{≥4j} (a) and Dilepton SR_2^{≥4j} (b); legend: Truth Acceptance, BDT with Higgs, BDT without Higgs.]

Figure 7.12: The performance of the reconstruction BDTs in the dilepton signal regions SR_1^{≥4j} (a) and SR_2^{≥4j} (b). The performance is shown for the BDT with Higgs (green) and the BDT without Higgs (blue), showing the fraction of events where the jets have been correctly assigned to the partons. The jets labelled as the leading and subleading b from Higgs do not correspond to the truth pT but are sorted by the ordering of the jets entering in the BDT. The truth acceptance (red) shows the fraction of events in which the partons are able to be matched to a jet in each region. For the truth acceptance, top1 refers to the b-quark from the top decay and top2 refers to the b-quark from the anti-top decay. The same trainings on ttH¯ signal events are used in both regions.

[Figure 7.13 plot: fraction of events against the partons matched to jets (same categories as Fig. 7.12) for Dilepton SR_3^{≥4j}; legend: Truth Acceptance, BDT with Higgs, BDT without Higgs.]

Figure 7.13: The performance of the reconstruction BDTs in the dilepton signal region SR_3^{≥4j}. The performance is shown for the BDT with Higgs (green) and the BDT without Higgs (blue), showing the fraction of events where the jets have been correctly assigned to the partons. The jets labelled as the leading and subleading b from Higgs do not correspond to the truth pT but are sorted by the ordering of the jets entering in the BDT. The truth acceptance (red) shows the fraction of events in which the partons are able to be matched to a jet in each region. For the truth acceptance, top1 refers to the b-quark from the top decay and top2 refers to the b-quark from the anti-top decay. Dedicated trainings on ttH¯ events are used for this region.

[Figure 7.14 plots: normalised entries against M_Higgs^reco [GeV] (left) and ∆R_{Higgs,l}^{Min} (right) for Dilepton SR_1^{≥4j}; legend: ttH, all events; ttH, correct Higgs matching.]

Figure 7.14: The mass of the Higgs candidate (left) and the minimum separation in ∆R between the Higgs candidate and a lepton (right). The distributions are shown in SR_1^{≥4j} for the jets assigned using the reconstruction BDT without Higgs information, for all simulated ttH¯ dilepton events, with the subset of events which have jets correctly matched to the Higgs boson represented by the red shaded area.

The reconstruction BDT technique is also used in all resolved semileptonic signal regions, with dedicated trainings for the 5j and ≥6j multiplicities. The four leading jets sorted by the binned b-tagging discriminant are used in the jet-parton assignments for the b-quarks in the event. The remaining jets are used to reconstruct the hadronically decaying W-boson. In 5j events, as 70% of events do not contain both jets from the W decay, the requirement to reconstruct the hadronic W-boson is dropped. In the semileptonic channel the leptonically decaying W-boson is fully reconstructed, unlike in the dilepton channel. The leptonic W is constructed from the lepton and neutrino four-momenta (p_ℓ and p_ν respectively), with the z component of p_ν inferred by solving m_W^2 = (p_ℓ + p_ν)^2 given the known W-boson mass m_W. Both solutions of this quadratic equation are used in the combinations; for events where no real solution exists, the discriminant of the quadratic equation is set to zero to create a unique, real solution. The hadronically and leptonically decaying top quarks are reconstructed using the reconstructed W-bosons and additional jets matched to the b-quarks from the top decays. The variables which are used in the training of the semileptonic BDTs are similar to those used in the dilepton channel, consisting of invariant masses and angular separations of objects using the jet matching. In SR_1^{≥6j}, the reconstruction BDT with Higgs (without Higgs) correctly matches jets to the b-quarks from the Higgs decay in 48% (32%) of ttH¯ (H → b¯b) events.
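The quadratic constraint on the neutrino longitudinal momentum can be written out explicitly. The sketch below is illustrative only: it assumes a massless lepton and neutrino, takes the neutrino transverse momentum from the missing transverse momentum components (`met_x`, `met_y`), and uses m_W ≈ 80.4 GeV as a default. The function name and argument layout are hypothetical. With μ = m_W²/2 + p_ℓx p_νx + p_ℓy p_νy, the solutions are p_νz = [μ p_ℓz ± E_ℓ √(μ² − pT_ℓ² pT_ν²)] / pT_ℓ².

```python
import math


def neutrino_pz(lep_px, lep_py, lep_pz, met_x, met_y, m_w=80.4):
    """Solve m_W^2 = (p_l + p_nu)^2 for the neutrino p_z, in GeV.

    Assumes a massless lepton and neutrino. Returns two solutions, or a
    single one when the discriminant is negative and is set to zero.
    """
    lep_pt2 = lep_px ** 2 + lep_py ** 2
    lep_e = math.sqrt(lep_pt2 + lep_pz ** 2)
    met2 = met_x ** 2 + met_y ** 2
    mu = 0.5 * m_w ** 2 + lep_px * met_x + lep_py * met_y
    disc = mu ** 2 - lep_pt2 * met2
    if disc < 0.0:
        # No real solution: set the discriminant to zero for a unique answer.
        return [mu * lep_pz / lep_pt2]
    root = lep_e * math.sqrt(disc)
    return [(mu * lep_pz - root) / lep_pt2, (mu * lep_pz + root) / lep_pt2]
```

A negative discriminant arises when the transverse mass of the lepton and missing transverse momentum exceeds m_W, typically because of detector resolution.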

7.1.4 Classification BDT

Following on from the reconstruction techniques, the next stage of the analysis strategy is separating the ttH¯ signal events from the dominant tt¯ background. In the most signal-rich region in the ttH¯ (H → b¯b) analysis, the fraction of signal over background is only ∼5% in the dilepton channel. BDTs are trained to separate ttH¯ events from the tt¯ background and are used as the final discriminants in all of the signal regions. This approach is chosen over using a single discriminatory variable in each signal region, or simply counting the number of events in each region, in order to maximise the sensitivity of the analysis. The classification BDTs combine multiple input variables which each have some separating power between signal and background events. This enhances the separation of signal and background by extracting as much information from the event as possible, improving the overall sensitivity of the analysis. The classification BDTs are optimised for each signal region in the analysis, taking advantage of the different kinematics in the background composition. In the dilepton channel, individual BDTs are trained to separate ttH¯ from tt¯ + jets in each of the signal regions. This is done in order to maximise the sensitivity by selecting variables which better discriminate between the different background compositions in each region and the signal. Several types of input variables are investigated and used to train the BDTs, which can be grouped into five categories:

• Reconstruction variables,

• Global event variables,

• Object pair properties,

• Event shape variables,

• Binned b-tagging discriminant of jets.

The first of these categories includes all variables which come from the reconstruction step of the two stage strategy. In the dilepton channel, these variables all come from the reconstruction BDT. In the semileptonic channel the matrix element method and likelihood discriminant methods are both used to reconstruct the final state of the event, in addition to the reconstruction BDT. Global event variables are used to describe the whole event, such as the jet multiplicity above a pT threshold, the kinematics of individual objects, or the scalar sum of the pT of final state objects (HT). The properties of object pairs also provide separation between signal and background processes. Dijet and lepton-jet pairs are constructed from the types of jets, both b-tagged and untagged, and leptons in the event. The pairings which maximise or minimise a kinematic property are selected. The selection criteria can be related, for example, to the invariant mass of the system, the vector sum of object pT or the separation in ∆R. Properties of the paired system are calculated and used as input variables. An example is the invariant mass of the b-jet pair which has minimum separation in ∆η. In addition, variables describing the average value of a property for all pairings of a given type are considered, such as the average separation in ∆R of all b-jet pairs. Event shape variables are used to describe the flow of energy and momentum in an event. They are defined using combinations of the eigenvalues of the energy-momentum tensor [98] for each event. Examples of these variables are aplanarity and sphericity. In addition, the Fox-Wolfram moments [99], which describe the geometrical correlation of objects in an event using spherical harmonics, and the centrality of the objects in an event are considered. In all cases, event shape variables are calculated using three different collections of objects: all jets and leptons, all jets, or only the b-tagged jets in an event. Finally, in addition to the kinematics of an event, the bin of each jet from the b-tagging discriminant is considered as an input variable in the training of the classification BDTs.
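Some of the global and object-pair variables described above can be sketched in a few lines. The example assumes massless jets stored as dictionaries with hypothetical keys `pt`, `eta` and `phi`; it is not the analysis implementation.

```python
import math
from itertools import combinations


def delta_eta(a, b):
    """Absolute pseudorapidity difference between two objects."""
    return abs(a["eta"] - b["eta"])


def delta_r(a, b):
    """Angular separation with the azimuthal difference wrapped into [-pi, pi)."""
    dphi = (a["phi"] - b["phi"] + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(a["eta"] - b["eta"], dphi)


def pair_mass(a, b):
    """Invariant mass of two massless objects: m^2 = 2 pT1 pT2 (cosh d_eta - cos d_phi)."""
    m2 = 2.0 * a["pt"] * b["pt"] * (
        math.cosh(a["eta"] - b["eta"]) - math.cos(a["phi"] - b["phi"])
    )
    return math.sqrt(max(m2, 0.0))


def average_over_pairs(objects, quantity):
    """Average of a pair quantity (for example delta_eta) over all pairs."""
    pairs = list(combinations(objects, 2))
    return sum(quantity(a, b) for a, b in pairs) / len(pairs)


def mass_of_closest_pair(objects, distance):
    """Invariant mass of the pair that minimises the given distance measure."""
    a, b = min(combinations(objects, 2), key=lambda pair: distance(*pair))
    return pair_mass(a, b)


def scalar_ht(objects):
    """Scalar sum of the transverse momenta of the given objects."""
    return sum(o["pt"] for o in objects)
```

For instance, `average_over_pairs(bjets, delta_eta)` gives the average ∆η of all b-jet pairs and `mass_of_closest_pair(bjets, delta_r)` the invariant mass of the b-jet pair with minimum ∆R.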

For each of the signal regions in the dilepton channel, the variables which enter the BDT are selected by maximising the separation of the ttH¯ signal from the tt¯ background using the area under the ROC curve. To evaluate the performance of the BDT, both of the test samples from the cross-validation training are merged. An initial set of variables for training each BDT is selected by ranking variables by their separating power, calculated using Equation 3.12. Very strongly correlated variables are removed in favour of the variable with the higher separating power. In the cases where a high correlation between two input variables is very different in signal and background events, both variables are kept. Using this variable set, the BDTs are optimised by adding each variable in turn to the training and evaluating the AUC. The variable which maximises the AUC is kept and the procedure is subsequently repeated, increasing the number of variables in the training by one. Trainings are performed for up to 20 variables and the optimum number and set of input variables is chosen by observing where the performance reaches a plateau. The AUC and separation power for the BDTs in each of the signal regions are shown in Fig. 7.15, with each point showing the maximised AUC for the number of variables in the training. The final variables chosen for the classification BDT in each region are listed in Table 7.3. In total there are 12 variables in the classification BDT in SR_1^{≥4j}, 14 in SR_2^{≥4j} and 11 in SR_3^{≥4j}. Half of the input variables in SR_1^{≥4j} and SR_2^{≥4j} originate from variables built using the outputs of the reconstruction BDTs. In all the signal regions, the two best performing variables are the best reconstruction BDT score and the average separation in η of all b-jet pairs in the event, ∆η_bb^avg. These variables are shown in Fig. 7.16 for all three signal regions.
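The iterative variable selection described above is a greedy forward selection, which can be sketched as follows. The `evaluate_auc` callable is a stand-in for training a BDT on a given variable list and returning its AUC, and the numerical plateau criterion `min_gain` is an assumption made for the example; in the text the plateau is identified by inspection.

```python
def greedy_forward_selection(candidates, evaluate_auc, max_vars=20):
    """Add one variable at a time, keeping the one that maximises the AUC.

    evaluate_auc: hypothetical callable taking a list of variable names and
    returning the AUC of a classifier trained on those variables.
    Returns a list of (variables selected so far, AUC), one entry per step.
    """
    selected, remaining, history = [], list(candidates), []
    while remaining and len(selected) < max_vars:
        best_auc, best_var = max(
            (evaluate_auc(selected + [var]), var) for var in remaining
        )
        selected.append(best_var)
        remaining.remove(best_var)
        history.append((list(selected), best_auc))
    return history


def plateau_size(history, min_gain=0.002):
    """Number of variables after which the next addition gains less than min_gain."""
    for step, ((_, auc), (_, next_auc)) in enumerate(zip(history, history[1:])):
        if next_auc - auc < min_gain:
            return step + 1
    return len(history)
```

Each step requires one training per remaining candidate, so the number of trainings grows roughly quadratically with the size of the candidate set.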

[Figure 7.15 plots: area under ROC (left axis) and TMVA separation (right axis) against the number of variables, panels (a), (b) and (c).]

Figure 7.15: The AUC (black) and separation (red) achieved for the classification BDT trainings in the dilepton signal regions SR_1^{≥4j} (a), SR_2^{≥4j} (b) and SR_3^{≥4j} (c), by iteratively adding additional variables. The signal and background events are taken from ttH¯ and tt¯ + jets simulation. The final number of variables used in each region is shown by the dashed orange line.

The classification BDTs are compared in Fig. 7.17 for signal and background events, alongside the corresponding ROC curves. In the ROC curve plots, the combined response is shown against the two validation sets. The three ROC curves are compared to check that there is no bias towards features in either of the data sets coming from statistical fluctuations. In all cases, the ROC curves for the two test samples for each training and the ROC curve when combining the two test samples show no large differences. As the performance of the statistical analysis is dependent on the binning used in the discriminants, the separation ⟨S²⟩ for each of the BDTs is also given.

Classification BDTs are trained and optimised for the semileptonic signal regions, with two inclusive trainings used for the resolved signal regions within each jet multiplicity, 5j and ≥6j. The same types of input variables are used

| Variable | Definition | SR_1^{≥4j} | SR_2^{≥4j} | SR_3^{≥4j} |
|---|---|---|---|---|
| General kinematic variables | | | | |
| m_bb^{min} | Minimum invariant mass of a b-tagged jet pair | X | X | - |
| m_bb^{max} | Maximum invariant mass of a b-tagged jet pair | - | - | X |
| m_bb^{min ∆R} | Invariant mass of the b-tagged jet pair with minimum ∆R | X | - | X |
| m_jj^{max pT} | Invariant mass of the jet pair with maximum pT | X | - | - |
| m_bb^{max pT} | Invariant mass of the b-tagged jet pair with maximum pT | X | - | X |
| ∆η_bb^{avg} | Average ∆η for all b-tagged jet pairs | X | X | X |
| ∆η_{ℓ,j}^{max} | Maximum ∆η between a jet and a lepton | - | X | X |
| ∆R_bb^{max pT} | ∆R between the b-tagged jet pair with maximum pT | - | X | X |
| N_bb^{Higgs 30} | Number of b-tagged jet pairs with invariant mass within 30 GeV of the Higgs-boson mass | X | X | - |
| n_jets^{pT>40} | Number of jets with pT > 40 GeV | - | X | X |
| Aplanarity_{b-jet} | 1.5λ2, where λ2 is the second eigenvalue of the momentum tensor built with all b-tagged jets | - | X | - |
| H_T^{all} | Scalar sum of pT of all jets and leptons | - | - | X |
| Variables from reconstruction BDT | | | | |
| BDT output | Output of the reconstruction BDT with Higgs | X** | X** | X |
| m_bb^{Higgs} | Higgs candidate mass | X | - | X |
| ∆R_{H,tt¯} | ∆R between Higgs candidate and tt¯ candidate system | X* | - | - |
| ∆R_{H,ℓ}^{min} | Minimum ∆R between Higgs candidate and lepton | X | X | X |
| ∆R_{H,b}^{min} | Minimum ∆R between Higgs candidate and b-jet from top | X | X | - |
| ∆R_{H,b}^{max} | Maximum ∆R between Higgs candidate and b-jet from top | - | X | - |
| ∆R_bb^{Higgs} | ∆R between the two jets matched to the Higgs candidate | - | X | - |
| Variables from b-tagging | | | | |
| w_{b-tag}^{Higgs} | Sum of the binned b-tagging discriminants of jets from best Higgs candidate from the reconstruction BDT | - | X | - |

Table 7.3: The variables used in the classification BDTs in the dilepton signal regions. For the variables which come from the reconstruction BDT, those with (without) a * are constructed using the best scoring combination from the reconstruction BDT with (without) Higgs, and for ** both versions are included, each using the different combination. In SR_{1,2}^{≥4j} the four leading jets sorted by the b-tagging discriminant are chosen as the b-jets in an event, and in SR_3^{≥4j} the three jets which pass the very tight working point are used.


[Figure 7.16, panels (a) to (c): normalised entries of signal and background as a function of the reconstruction BDT score and of $\Delta\eta_{bb}^{\mathrm{avg}}$ in the dilepton signal regions. Bin-by-bin separations quoted on the two panels of each row: 13.14% and 11.94% in $\mathrm{SR}_1^{\geq 4j}$, 9.22% and 11.66% in $\mathrm{SR}_2^{\geq 4j}$, 4.50% and 10.53% in $\mathrm{SR}_3^{\geq 4j}$.]

(a) Reconstruction BDT with Higgs output (left) and $\Delta\eta_{bb}^{\mathrm{avg}}$ (right) in $\mathrm{SR}_1^{\geq 4j}$.
(b) Reconstruction BDT with Higgs output (left) and $\Delta\eta_{bb}^{\mathrm{avg}}$ (right) in $\mathrm{SR}_2^{\geq 4j}$.
(c) Reconstruction BDT with Higgs output (left) and $\Delta\eta_{bb}^{\mathrm{avg}}$ (right) in $\mathrm{SR}_3^{\geq 4j}$.

Figure 7.16: Comparison of the distributions of the simulated $t\bar{t}H$ signal (red) and $t\bar{t}$ background (blue) for the two most important input variables in the classification BDTs in all three signal regions. Both processes are shown normalised to unity.


[Figure 7.17, panels (a) to (c): classification BDT output for the dilepton $t\bar{t}H(b\bar{b})$ signal and the total background in arbitrary units (left), and ROC curves of background rejection efficiency against signal acceptance efficiency for test sample 1, test sample 2 and the combined test samples (right), at $\sqrt{s}$ = 13 TeV with 36.1 fb$^{-1}$. Values quoted on the panels: (a) $\mathrm{SR}_1^{\geq 4j}$, area under curve 0.816 and separation 22.5%; (b) $\mathrm{SR}_2^{\geq 4j}$, area under curve 0.776 and separation 24%; (c) $\mathrm{SR}_3^{\geq 4j}$, area under curve 0.737 and separation 16.7%.]

Figure 7.17: The classification BDT distributions for signal and background events in all three dilepton signal regions (left), and the corresponding ROC curves from the test samples used to validate the training (right). The ROC curve for the combined test samples is equivalent to calculating the ROC curve using the unbinned distributions of the discriminants shown in the left column.

in training the BDTs used in the semileptonic channel as those which were used in the dilepton channel, with the addition of the likelihood discriminant and the MEM discriminant. The individual variables with the best separation power in the semileptonic channel are the likelihood discriminant output, the highest reconstruction BDT score per event, and the maximum separation in η of two b-tagged jets. In $\mathrm{SR}_1^{\geq 6j}$, a dedicated BDT is trained which uses the variables from the ≥6j BDT with the addition of the MEM discriminant to achieve the best separation power in this signal region. A separate BDT is optimised in the boosted signal region using jet substructure variables and kinematic information from the boosted Higgs boson candidate.

7.2 Systematic Uncertainties

A large number of sources of systematic uncertainty affect the modelling of the signal and background processes and need to be taken into account in this analysis. In this section, the systematic uncertainties which have the largest impact on the analysis are discussed. The systematic uncertainties can be split into two broad categories: experimental uncertainties, which relate to the performance and simulation of the ATLAS detector; and theoretical uncertainties, which cover the modelling of signal and background processes using Monte Carlo simulation and data-driven estimates, as well as the uncertainties on the overall cross sections of different processes.

Nuisance parameters are assigned to each source of uncertainty in the profile likelihood fit. Each systematic can alter the overall normalisation and shape of the final distributions of the signal and background processes, and is evaluated for each relevant sample in every region. In order to reduce the computation time when performing the statistical analysis, the shape or normalisation information of a systematic is kept only if it has an effect on the distribution which is greater than 1%. This pruning is performed on each sample and region separately, removing the information only for the sample and region in which the criterion is not passed. Studies were performed on the pruning threshold to verify that it has no impact on the result of the analysis.

In addition, statistical fluctuations in the distributions of systematic uncertainties can cause non-regular shape differences. To alleviate the resulting problems, a smoothing procedure is applied to the shape of the variation in the distributions used to evaluate the systematic uncertainties. This helps reduce the double counting of statistical uncertainties on the samples, as well as artificial pulls and constraints resulting from the non-regular shape of systematic variations. The smoothing is performed by rebinning the systematic variation distribution until the relative uncertainty on each bin is below a threshold, nominally 8%. Should the slope of the systematic distribution change direction more than four times, the tolerance is halved and the process is repeated. Finally, the histogram smoothing function provided by ROOT [100] using running medians is applied to the distribution to avoid having artificially flat uncertainties as a result of rebinning.

The majority of the sources of systematic uncertainty are kept correlated across all samples and regions. However, some systematic uncertainties are decorrelated into multiple components, notably the experimental uncertainties. Additionally, several systematic uncertainties relating to the simulation of backgrounds are decorrelated across samples or decomposed into multiple components within samples.

An additional source of systematic uncertainty is considered from the statistical component of all the simulation used in the analysis. For each bin entering the profile likelihood fit, a nuisance parameter is assigned to the total bin content of all signal and background processes, using the uncertainty from the total number of Monte Carlo events. Gamma penalty terms are applied to each nuisance parameter, with all bins treated as uncorrelated.
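The 1% pruning criterion described above can be sketched as follows. The text does not define precisely how the effect on the distribution is measured. As an assumption, the normalisation effect is taken here as the relative change in total yield, and the shape effect as the largest per-bin relative change after rescaling the variation to the nominal yield. The function name and return format are hypothetical.

```python
def prune_systematic(nominal, varied, threshold=0.01):
    """Decide whether to keep the normalisation and shape parts of a variation.

    nominal, varied: lists of bin contents for one sample in one region.
    Returns a dict of booleans; False means the component would be pruned.
    """
    nom_total = sum(nominal)
    var_total = sum(varied)

    # Normalisation effect: relative change in the total yield.
    norm_effect = abs(var_total / nom_total - 1.0)

    # Shape effect (assumed definition): rescale the variation to the nominal
    # yield, then take the largest relative deviation in any populated bin.
    scale = nom_total / var_total
    shape_effect = max(
        abs(v * scale / n - 1.0) for n, v in zip(nominal, varied) if n > 0
    )

    return {
        "keep_norm": norm_effect > threshold,
        "keep_shape": shape_effect > threshold,
    }
```

Applying the decision per sample and per region, as the text describes, amounts to calling such a function once for each (sample, region, systematic) combination.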

7.2.1 Experimental Uncertainties

Experimental uncertainties relate to the modelling of the ATLAS detector and how physics objects are reconstructed in simulated events in comparison to recorded data. In addition, there is an uncertainty on the overall integrated luminosity of the dataset used in the analysis. The uncertainty is derived from data taken from dedicated van der Meer scans performed at $\sqrt{s}$ = 13 TeV during 2015 and 2016, using a methodology similar to that detailed in Ref. [101]. The total uncertainty on the integrated luminosity of the 2015+2016 dataset used in the analysis is 2.1%.

An uncertainty on the modelling of the pileup is also considered, arising from the corrections applied to simulated events to adjust the profile of pileup to match the distribution in recorded data. The uncertainty is applied to all simulated events and covers the uncertainty in the ratio of the predicted and measured cross sections of inelastic collisions. The remaining systematic uncertainties on the physics objects used in the analysis are described in this section.


Jets

Uncertainties on the reconstructed jets arise from three main sources: the jet energy scale (JES), the jet energy resolution (JER), and the jet vertex tagger (JVT) [102]. Both the JES and JER uncertainties have a sizeable effect on the sensitivity of the analysis due to the large number of jets in the events, though the individual uncertainties are of the order of a few percent. For samples which use the ATLFAST-II simulation of the ATLAS detector, an additional uncertainty is considered, which covers the non-closure between these samples and using the full Geant simulation.

The uncertainties on JES come from several independent sources. The majority originate from the in situ corrections applied to the jets; these are subsequently decomposed into a reduced set of eight separate nuisance parameters. They take into account assumptions made on the event topology, the MC simulated samples used, including the number of simulated events in samples, and uncertainties which are propagated from the energy scales of other physics objects. Additional uncertainties are derived from pileup modelling of jets, the η-intercalibration in the range 2.0 < |η| < 2.6, and a correction on the punch-through of jets into the muon spectrometer. The remaining uncertainties on JES cover the differences in jet response, and how jets originating from different flavour quarks are simulated; this includes the predicted compositions of light quark, b-quark, and gluon initiated jets.

The uncertainty on JER is evaluated by smearing the energy of each jet in an event in simulation. The magnitude of smearing is extrapolated from JER measurements performed at $\sqrt{s}$ = 8 TeV to the current centre of mass energy of 13 TeV. The uncertainty is divided into two separate components in the analysis. Finally, the uncertainty on the efficiency of jets passing the JVT cut is taken into account.
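The idea of evaluating the JER uncertainty by smearing jet energies can be illustrated with a toy sketch. It assumes a Gaussian smearing with a single relative width for all jets, whereas the text only states that the magnitude is extrapolated from 8 TeV measurements. The function name, the width parameter and the seed handling are all hypothetical.

```python
import random


def smear_jet_energies(energies, rel_width, seed=0):
    """Return jet energies multiplied by Gaussian random factors.

    energies:  list of jet energies in GeV for one event.
    rel_width: assumed relative width of the extra smearing (e.g. 0.05).
    seed:      fixed seed so that the systematic variation is reproducible.
    """
    rng = random.Random(seed)
    # Each jet gets an independent factor drawn around 1; negative factors
    # are clipped to zero so that energies stay physical.
    return [e * max(0.0, rng.gauss(1.0, rel_width)) for e in energies]
```

The smeared sample is then passed through the full selection, and the resulting change in the final distributions defines the variation.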

Flavour tagging

Uncertainties from the tagging efficiency of b-jets and the mistag rates of light-flavour and c-jets are evaluated from the scale factors applied to correct for differences between data and simulation. The efficiency to correctly tag b-jets is measured in data using $t\bar{t}$ dilepton events; the mistag rate of c-jets is also measured in $t\bar{t}$ events, identifying hadronic decays of the W bosons which include c-jets. The light-flavour jet mistag rate is measured from multijet events, using secondary vertices and tracks whose impact parameters are consistent with having negative lifetimes. The efficiencies and mistag rates are calculated for each b-tagging working point and as a function of jet kinematics. The uncertainties are extracted considering correlations across multiple working points.

In total there are 30 uncertainties associated with the b-tag efficiency, extracted as a function of the b-tagging working point and jet $p_T$. For the mistag rates of light-flavour and c-jets, there are 80 and 15 associated uncertainties respectively, extracted as a function of the b-tagging working point and jet $p_T$. In the case of light-flavour jet mistags, the uncertainties are extracted with an additional dependence on jet η. The c-jet mistag rates are also applied to $\tau_{\mathrm{had}}$ candidates, with an additional source of uncertainty on the extrapolation from c- to $\tau_{\mathrm{had}}$-jets taken into consideration.

Leptons

The main sources of uncertainty on the leptons arise from the efficiencies in reconstruction and identification, as well as the efficiencies of the lepton triggers used in the analysis. The efficiencies are derived as scale factors from the comparison of data to simulation in tag and probe events, with the associated uncertainties evaluated in the calculation of the scale factors [56, 62]. In addition, there are systematic uncertainties relating to the energy scale and resolution of the leptons. In the case of muons, the uncertainties on the energy scale and resolution are evaluated separately for tracks in the inner detector and the muon spectrometer.

Others

The systematic uncertainties on the energy scales and resolutions of all physics objects are propagated to $E_T^{\mathrm{miss}}$, with three additional sources of uncertainty on the calculation of the $E_T^{\mathrm{miss}}$ soft term. Tau energy scale systematic uncertainties are also evaluated, but have a negligible impact on the analysis.

7.2.2 Theoretical Uncertainties

The theoretical uncertainties in the analysis come from the theoretical predictions and models chosen for the prediction of signal and background processes using Monte Carlo as well as data-driven techniques. The main contributions to the theoretical uncertainties of a process arise from the choice of event generator and the parton shower model chosen, as well as the scale and PDF set used in the generation of the events. These uncertainties are evaluated by comparing the nominal samples used in the analysis with alternative samples where different configurations have been used. For these systematic uncertainties, the difference between the nominal and alternative predictions is symmetrised and used as a two-sided systematic uncertainty. The difference between the two samples represents one standard deviation in the systematic uncertainty. Additionally, the uncertainties on the theoretical cross sections and branching ratios need to be considered.

By far the largest contribution to the theoretical uncertainties comes from the modelling of the $t\bar{t}$+jets background, in particular $t\bar{t}$+HF processes, with a large number of dedicated systematic uncertainties considered when handling this background. In this section the evaluation of systematic uncertainties on the signal and background modelling is discussed.
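The symmetrisation of a nominal-versus-alternative comparison described above can be sketched as follows. The sketch assumes the simplest reading of the text: the alternative prediction defines the +1σ variation and its mirror image about the nominal defines the −1σ variation. The function name is hypothetical.

```python
def symmetrise(nominal, alternative):
    """Build a two-sided variation from a single alternative prediction.

    nominal, alternative: lists of bin contents with identical binning.
    Returns (up, down), where up is the alternative itself and down is the
    nominal shifted by the same difference in the opposite direction.
    """
    up = list(alternative)
    down = [2.0 * n - a for n, a in zip(nominal, alternative)]
    return up, down
```

For example, a bin with a nominal yield of 10 and an alternative yield of 12 gives variations of 12 and 8.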

Signal modelling

Three main sources of uncertainty are considered on the modelling of $t\bar{t}H$. The first comes from the theoretical calculation of the total production cross section at $\sqrt{s}$ = 13 TeV. This uncertainty is split into two independent contributions, coming from the QCD scale in the calculation and the uncertainty from the PDF choice. Secondly, the uncertainty on the branching ratios of the various Higgs decay channels is considered. In total there is a $^{+5.9}_{-9.2}\%$ uncertainty on the cross section from the QCD scale and ±3.6% from the uncertainties on the PDF. There is additionally an uncertainty of 2.2% on the $H \to b\bar{b}$ decay mode [14]. The QCD scale uncertainty is estimated by independent variations of $\mu_F$ and $\mu_R$, and the uncertainty on the PDF choice originates from the treatment to include QED evolution effects in the PDFs, as described in Ref. [14]. The final source of uncertainty is taken from the choice of parton shower and hadronisation model used to simulate $t\bar{t}H$ events. To evaluate this uncertainty, the nominal Madgraph5_aMC@NLO+Pythia 8 sample is compared to events simulated with Madgraph5_aMC@NLO and interfaced to Herwig++ for the parton shower and hadronisation.

$t\bar{t}$ modelling

As the analysis is extremely sensitive to the modelling of the $t\bar{t}$+jets background, a comprehensive set of systematic uncertainties is evaluated in order to encompass all relevant sources of uncertainty. These cover the choice of Monte Carlo generator, parton shower and hadronisation model, and the scale settings which were compared in Section 6.1.2. Additional systematic uncertainties from the modelling of the $t\bar{t}$+HF background and the theoretical cross section of $t\bar{t}$ production are considered. As the $t\bar{t}$+light, $t\bar{t}$+≥1c and $t\bar{t}$+≥1b components have different uncertainties related to their modelling, systematic uncertainties are treated as uncorrelated between them and separate independent nuisance parameters are assigned to each $t\bar{t}$+jets component in the fit. The exception to this is the uncertainty on the inclusive NNLO+NNLL $t\bar{t}$ cross section, which has an uncertainty of ±6% and is kept correlated. All sources of systematic uncertainty on the various $t\bar{t}$+jets components are summarised in Table 7.4, alongside the categories to which they apply.

| Systematic source | Description | $t\bar{t}$ categories |
|---|---|---|
| $t\bar{t}$ cross-section | Up or down by 6% | All, correlated |
| $k(t\bar{t}$+≥1c) | Free-floating $t\bar{t}$+≥1c normalisation | $t\bar{t}$+≥1c |
| $k(t\bar{t}$+≥1b) | Free-floating $t\bar{t}$+≥1b normalisation | $t\bar{t}$+≥1b |
| Sherpa5F vs. nominal | Related to the choice of NLO event generator | All, uncorrelated |
| PS & hadronisation | Powheg+Herwig 7 vs. Powheg+Pythia 8 | All, uncorrelated |
| ISR / FSR | Variations of $\mu_R$, $\mu_F$, $h_{\mathrm{damp}}$ and the A14 Var3c parameters | All, uncorrelated |
| $t\bar{t}$+≥1c ME vs. inclusive | MG5_aMC@NLO + Herwig++: ME prediction (3F) vs. incl. (5F) | $t\bar{t}$+≥1c |
| $t\bar{t}$+≥1b Sherpa4F vs. nominal | Comparison of $t\bar{t}+b\bar{b}$ NLO (4F) vs. Powheg+Pythia 8 (5F) | $t\bar{t}$+≥1b |
| $t\bar{t}$+≥1b renorm. scale | Up or down by a factor of two | $t\bar{t}$+≥1b |
| $t\bar{t}$+≥1b resumm. scale | Vary $\mu_Q$ from $H_T/2$ to $\mu_{\mathrm{CMMPS}}$ | $t\bar{t}$+≥1b |
| $t\bar{t}$+≥1b global scales | Set $\mu_Q$, $\mu_R$, and $\mu_F$ to $\mu_{\mathrm{CMMPS}}$ | $t\bar{t}$+≥1b |
| $t\bar{t}$+≥1b shower recoil scheme | Alternative model scheme | $t\bar{t}$+≥1b |
| $t\bar{t}$+≥1b PDF (MSTW) | MSTW vs. CT10 | $t\bar{t}$+≥1b |
| $t\bar{t}$+≥1b PDF (NNPDF) | NNPDF vs. CT10 | $t\bar{t}$+≥1b |
| $t\bar{t}$+≥1b UE | Alternative set of tuned parameters for the underlying event | $t\bar{t}$+≥1b |
| $t\bar{t}$+≥1b MPI | Up or down by 50% | $t\bar{t}$+≥1b |
| $t\bar{t}$+≥3b normalisation | Up or down by 50% | $t\bar{t}$+≥1b |

Table 7.4: A summary of the evaluation of systematic uncertainties on the $t\bar{t}$+jets backgrounds. The component(s) to which each systematic applies is shown in the third column. Where a systematic is applied inclusively, the column also states whether it is treated as a correlated or uncorrelated uncertainty.

To evaluate the systematic uncertainty arising from the choice of NLO event generator, the nominal Powheg+Pythia 8 sample is compared to events generated with Sherpa 2.2, which uses the MC@NLO matching scheme in contrast to Powheg. This comparison is preferred to comparing the nominal sample to events generated with Madgraph5_aMC@NLO and interfaced to the same parton shower, due to the limited effective number of events available in the alternative sample. The systematic uncertainty from the choice of parton shower and hadronisation model is extracted from a comparison of the nominal sample to Powheg interfaced to Herwig 7. The Powheg+Herwig 7 and Sherpa 2.2 samples are both configured as described in Section 6.1.2.

In order to disentangle the shape of these systematic uncertainties from effects caused by the different relative fractions of the $t\bar{t}$+jets categories in each sample, the fractions of $t\bar{t}$+light, $t\bar{t}$+≥1c, and $t\bar{t}$+≥1b are adjusted to match the prediction coming from Powheg+Pythia 8 in all alternative $t\bar{t}$ samples. Additionally, the $t\bar{t}$+≥1b subcategories, excluding $t\bar{t}$+b(MPI/FSR), are adjusted to match the prediction from the dedicated Sherpa4F sample.

The final source of uncertainty on the inclusive $t\bar{t}$+jets process arises from the amount of initial and final state radiation. This uncertainty is evaluated by comparing two alternative Powheg+Pythia 8 samples, in which the amount of radiation has been modified by changing the tuning settings, to the nominal Powheg+Pythia 8 sample. In the alternative radiation samples, the A14 tunable parameters are varied using the Var3c set, in addition to scaling the renormalisation and factorisation scales ($\mu_R$ and $\mu_F$ respectively), and the value of the $h_{\mathrm{damp}}$ parameter. For one sample, the amount of radiation is increased by using the Var3c up variation set, decreasing both $\mu_R$ and $\mu_F$ by a factor of two and doubling the value of $h_{\mathrm{damp}}$. For the second sample, the amount of radiation is decreased by using the Var3c down variation set, doubling $\mu_F$ and $\mu_R$, but keeping the value of $h_{\mathrm{damp}}$ fixed to the nominal value of $1.5 \cdot m_{\mathrm{top}}$, with $m_{\mathrm{top}}$ = 172.5 GeV.
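The adjustment of the $t\bar{t}$+jets category fractions in an alternative sample to those of the nominal sample can be sketched as a set of per-category weights. The sketch assumes that the total yield of the alternative sample is preserved, which the text does not state. The function name, category labels and yields are hypothetical.

```python
def match_fractions(nominal_yields, alternative_yields):
    """Per-category weights that give the alternative sample the nominal fractions.

    nominal_yields, alternative_yields: dicts mapping a category name
    (e.g. "light", ">=1c", ">=1b") to its yield in each sample.
    The total yield of the alternative sample is left unchanged (assumption).
    """
    nom_total = sum(nominal_yields.values())
    alt_total = sum(alternative_yields.values())
    weights = {}
    for category, alt_yield in alternative_yields.items():
        target_fraction = nominal_yields[category] / nom_total
        # Weight = desired yield in this category / current yield.
        weights[category] = target_fraction * alt_total / alt_yield
    return weights


# Illustrative numbers only, not taken from the analysis.
nominal = {"light": 80.0, ">=1c": 12.0, ">=1b": 8.0}
alternative = {"light": 140.0, ">=1c": 40.0, ">=1b": 20.0}
weights = match_fractions(nominal, alternative)
```

After applying the weights, any remaining difference between the two samples within a category reflects shape effects rather than the category composition.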
As $t\bar{t}$+HF production is not well understood in theory and has little guidance from experimental measurements, the normalisation factors of $t\bar{t}$+≥1c and $t\bar{t}$+≥1b are both allowed to float freely and uncorrelated in the fit; this is in addition to the uncertainty on the inclusive $t\bar{t}$ cross section. The free floating parameters are labelled $k_{t\bar{t}+\geq 1c}$ and $k_{t\bar{t}+\geq 1b}$, with values set to 1 in all pre-fit distributions and calculations. Further sources of uncertainty are also considered for both the $t\bar{t}$+≥1c and $t\bar{t}$+≥1b processes.

For the $t\bar{t}$+≥1c background, it is unclear whether the nominal approach to consider c-quarks primarily produced in the parton shower is more or less accurate than a prediction with $t\bar{t}+c\bar{c}$ calculated in the matrix element with NLO precision in QCD. In order to evaluate the uncertainty coming from the choice of approach, $t\bar{t}$+≥1c events from two separate samples are compared. In the first sample, events are generated with $t\bar{t}+c\bar{c}$ in the matrix element using a three flavour scheme PDF, which includes massive c-quarks. The events are generated using Madgraph5_aMC@NLO interfaced to Herwig++, as described in Ref. [103]. In the second sample, inclusive $t\bar{t}$ events are produced using the same generator and parton shower, but using a 5F scheme PDF with massless c-quarks. In this sample, the c-quarks in $t\bar{t}$+≥1c events are produced primarily in the parton shower, as opposed to in the hard scatter. The shape difference in the modelling of $t\bar{t}$+≥1c events seen when comparing the two samples is applied as a systematic uncertainty on the $t\bar{t}$+≥1c background in the Powheg+Pythia 8 sample.

For the $t\bar{t}$+≥1b process, the difference between the predictions from the inclusive Powheg+Pythia 8 sample and the dedicated Sherpa4F $t\bar{t}+b\bar{b}$ sample is applied as a source of uncertainty. As the relative fractions of all $t\bar{t}$+≥1b subcomponents are adjusted to match the prediction from the Sherpa4F sample, this systematic only takes into account the shape differences in the distributions. The differences arise from the origin of the additional b-quarks in $t\bar{t}+b\bar{b}$ events. In the 5F scheme sample, b-quarks mostly originate from gluon splitting in the parton shower, but in the 4F scheme sample, the additional b-quarks are treated as massive in the matrix element and are described to NLO in QCD. This uncertainty is not applied to $t\bar{t}$+b(MPI/FSR).

Additional sources of uncertainty on the $t\bar{t}$+≥1b modelling come from the various subcomponents, whose relative fractions in all other samples remain fixed to the prediction from Sherpa4F. For this, seven sources of systematic uncertainty are derived from modifying the configuration of the Sherpa4F sample to cover the uncertainties on the relative fractions of the $t\bar{t}$+≥1b components. Of these, three systematic uncertainties are evaluated by scaling the renormalisation scale, $\mu_R$, up and down by a factor of two (as listed in Table 7.4); changing the resummation scale, $\mu_Q$, from the nominal value of $H_T/2$ to $\mu_{\mathrm{CMMPS}}$; and by choosing a global scale choice of $\mu_Q = \mu_F = \mu_R = \mu_{\mathrm{CMMPS}}$. In addition, two uncertainties are evaluated by using two alternative PDF sets, MSTW2008NLO and NNPDF2.3NLO, in the event generation, compared to the nominal CT10 PDF set. The remaining two uncertainties are evaluated by choosing an alternative shower recoil scheme and an alternative set of tuned parameters for the underlying event. The relative size of these uncertainties on each of the $t\bar{t}$+≥1b subcomponents is represented by the red uncertainty band on the Sherpa4F prediction in Fig. 6.2.
Due to the large difference seen in the fraction of $t\bar{t}$+≥3b events predicted by Powheg+Pythia 8 and Sherpa4F, an additional 50% normalisation uncertainty is applied on $t\bar{t}$+≥3b. As the $t\bar{t}$+b(MPI/FSR) fraction is not fixed to the prediction from Sherpa4F in the samples used to assess the systematic uncertainties coming from the modelling of the $t\bar{t}$+jets background, the effect on the shape and normalisation of MPI and FSR in $t\bar{t}$+≥1b events is already incorporated in the systematic model. Furthermore, an additional 50% uncertainty on the normalisation of $t\bar{t}$+≥1b events coming from MPI is considered, motivated by studies of different tunable sets of parameters in the calculation of the underlying event.

Other backgrounds

For the smaller backgrounds in the analysis, the main sources of uncertainty come from the theoretical uncertainty on the production cross sections. For the single-top Higgs, $t\bar{t}V$ and $t\bar{t}WW$ backgrounds, these are split into two components, taking into account the uncertainty from the QCD scale and PDF choice. For the V+jets backgrounds, the uncertainty on the cross section is absorbed into a total normalisation uncertainty, calculated from varying $\mu_F$ and $\mu_Q$. Additional uncertainties on V+HF jets production are considered, with mismodelling observed in the overall normalisation of these processes. In W+jets, this is implemented as two additional normalisation uncertainties of 30% on W+HF, one on events with exactly two HF jets and one on events with at least two HF jets. These uncertainties are treated as uncorrelated. In Z+jets this is included in the total normalisation uncertainty, which is subsequently decorrelated across jet multiplicities in the dilepton channel. For the fake and non-prompt lepton backgrounds, an overall normalisation uncertainty of 25% is applied in the dilepton channel, and a 50% uncertainty is assigned in the semileptonic channel, decorrelated across jet multiplicity and lepton flavour. The systematic uncertainties applied on the cross sections and the total normalisation uncertainties are summarised in Table 7.5.

In addition to the uncertainties on their cross sections, sources of uncertainty on the $t\bar{t}V$ and single top processes which originate from the choice of Monte Carlo simulation are also considered. To assess the uncertainty from the modelling of $t\bar{t}V$ coming from the choice of the event generator, parton shower and hadronisation, events generated with Sherpa 2.1 are compared to the nominal prediction. The uncertainties on $t\bar{t}W$ and $t\bar{t}Z$ are treated as uncorrelated.
For the single top backgrounds, the source of uncertainty from the choice of parton shower and radiation is assessed by comparing the nominal Powheg+Pythia 6 to alternative samples, analogous to those used to estimate the $t\bar{t}$ parton shower, hadronisation and radiation uncertainties. This is performed independently for t-channel, s-channel and Wt single top production. Finally, the choice between using the diagram removal or diagram subtraction schemes to handle the interference effects between the overlapping $t\bar{t}$ and Wt events is considered as a source of systematic uncertainty.

| Process | Assigned uncertainty: Dilepton | Assigned uncertainty: Semileptonic |
|---|---|---|
| Z+jets | 35% | 35% |
| W+jets | - | 40% |
| W+HF | - | 30% |
| $t\bar{t}Z$ | QCD: $^{+9.6}_{-11.3}$%; PDF: 4.0% | same |
| $t\bar{t}W$ | QCD: $^{+12.9}_{-11.5}$%; PDF: 3.4% | same |
| Single top | $^{+5}_{-4}$% | same |
| tHjb | QCD: $^{+6.5}_{-14.9}$%; PDF: 3.7% | same |
| WtH | QCD: $^{+6.5}_{-6.7}$%; PDF: 6.3% | same |
| $t\bar{t}WW$ | QCD: $^{+10.9}_{-11.8}$%; PDF: 2.1% | same |
| tZ | 50% | same |
| tWZ | 50% | same |
| $t\bar{t}t\bar{t}$ | 50% | same |
| Diboson | 50% | same |
| Fakes | 25% | 50% |

Table 7.5: The assigned uncertainties on the total normalisation and cross sections of the non-$t\bar{t}$+jets backgrounds.

A summary of all the sources of systematic uncertainty is provided in Table 7.6. The systematic uncertainties are grouped into categories, and the number of individual nuisance parameters per source of uncertainty is provided. For each group of systematic uncertainties, the table lists whether the uncertainties affect only the normalisation of individual distributions, or both shape and normalisation.


Table 7.6: Summary of the sources of systematic uncertainty included in the analysis. For each source of uncertainty, the number of individual nuisance parameters is listed in the column 'Comp.'. For systematic uncertainties where only the normalisation effect is taken into account, the systematic is listed as 'N' in the column 'Type'; systematic uncertainties where both normalisation and shape effects are considered are listed as 'SN'. The penalty terms on the nuisance parameters for each source of systematic uncertainty are listed in the last column, with Gaussian penalty terms listed as 'G', log-normal penalty terms listed as 'LN', and free floating parameters without penalty terms listed as 'Free'. The final entry in the table refers to the cross sections of the tZ, tWZ, $t\bar{t}WW$, tHjb and WtH background processes.

| Systematic uncertainty | Type | Comp. | f(θ) |
|---|---|---|---|
| Experimental uncertainties | | | |
| Luminosity | N | 1 | G |
| Pileup modelling | SN | 1 | G |
| Physics Objects | | | |
| Electron | SN | 6 | G |
| Muon | SN | 15 | G |
| Taus | SN | 3 | G |
| Jet energy scale | SN | 20 | G |
| Jet energy resolution | SN | 2 | G |
| Jet vertex tagger | SN | 1 | G |
| $E_T^{\mathrm{miss}}$ | SN | 3 | G |
| b-tagging | | | |
| Efficiency | SN | 30 | G |
| Mis-tag rate (c) | SN | 15 | G |
| Mis-tag rate (light) | SN | 80 | G |
| Mis-tag rate (c → τ) | SN | 1 | G |
| Theoretical uncertainties | | | |
| Signal | | | |
| $t\bar{t}H$ cross section | N | 2 | LN |
| H branching fractions | N | 3 | G |
| $t\bar{t}H$ modelling | SN | 1 | G |
| $t\bar{t}$ Background | | | |
| $t\bar{t}$ cross section | N | 1 | LN |
| $t\bar{t}$+≥1c normalisation | N | 1 | Free |
| $t\bar{t}$+≥1b normalisation | N | 1 | Free |
| $t\bar{t}$+light modelling | SN | 3 | G |
| $t\bar{t}$+≥1c modelling | SN | 4 | G |
| $t\bar{t}$+≥1b modelling | SN | 13 | G |
| Other Backgrounds | | | |
| $t\bar{t}W$ cross section | N | 2 | LN |
| $t\bar{t}Z$ cross section | N | 2 | LN |
| $t\bar{t}W$ modelling | SN | 1 | G |
| $t\bar{t}Z$ modelling | SN | 1 | G |
| Single top cross section | N | 3 | LN |
| Single top modelling | SN | 5 | G |
| W+jets normalisation | N | 3 | LN |
| Z+jets normalisation | N | 3 | LN |
| Diboson normalisation | N | 1 | LN |
| Fakes normalisation | N | 7 | LN |
| $t\bar{t}t\bar{t}$ cross section | N | 1 | LN |
| Small bkg cross sections | N | 9 | LN |
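The three entries in the f(θ) column can be illustrated with a toy sketch of how a nuisance parameter might scale a yield and contribute to the fit. The functional forms below are an assumption, following a common convention in binned likelihood fits: a linear response for 'G', an exponential response for 'LN', both with a unit Gaussian constraint on θ, and a direct scale factor with no constraint for 'Free'. They are not taken from the text, and the function names are hypothetical.

```python
def yield_factor(kind, theta, rel_unc=0.0):
    """Multiplicative effect of a nuisance parameter on a yield (assumed forms).

    kind:    "G", "LN" or "Free", as in the f(theta) column of Table 7.6.
    theta:   value of the nuisance parameter (or of the free scale factor).
    rel_unc: relative size of the one standard deviation effect.
    """
    if kind == "G":
        return 1.0 + rel_unc * theta      # linear response
    if kind == "LN":
        return (1.0 + rel_unc) ** theta   # always positive
    if kind == "Free":
        return theta                      # e.g. a free normalisation factor
    raise ValueError("unknown penalty type: " + kind)


def penalty(kind, theta):
    """Contribution of the constraint on theta to -2 ln L (assumed forms)."""
    if kind in ("G", "LN"):
        return theta ** 2                 # unit Gaussian constraint
    if kind == "Free":
        return 0.0                        # unconstrained parameter
    raise ValueError("unknown penalty type: " + kind)
```

In this convention a log-normal term cannot drive a yield negative, which is one reason it is often preferred for pure normalisation uncertainties.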

8. Results

In this chapter, the results of the $t\bar{t}H\,(H \to b\bar{b})$ analysis are presented. The signal strength of $t\bar{t}H$ is extracted using a binned profile likelihood fit, performed over all regions simultaneously. The systematic uncertainties described in Chapter 7 enter the fit as nuisance parameters. The fit is performed using the RooStats framework [104], with Minuit [105] used to determine the best-fit values of µ, $k_{t\bar{t}+\geq1c}$ and $k_{t\bar{t}+\geq1b}$ and their associated errors. MINOS errors calculated with Minuit are subsequently used to extract the confidence intervals of the parameters of interest.

From the measured value of µ, the observed significance can be compared to the expected significance from a fit to the Asimov dataset, and limits at the 95% CL are set using the CLs method. This result is combined with several other searches for $t\bar{t}H$ which have been optimised for other Higgs boson decay modes [2]. The result of the combined fit is presented in Section 8.2.
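The idea of extracting µ from a binned likelihood and comparing it with an Asimov expectation can be illustrated with a minimal sketch. It assumes a Poisson likelihood with no nuisance parameters and the asymptotic relation Z = √q0. The yields, the function names and the golden-section minimiser are illustrative choices made here; they are not the RooStats/Minuit configuration used in the analysis.

```python
import math

def nll(mu, data, signal, background):
    """Binned Poisson negative log-likelihood in mu (constant terms dropped)."""
    total = 0.0
    for n, s, b in zip(data, signal, background):
        expected = mu * s + b
        if expected <= 0.0:
            return math.inf  # unphysical: non-positive expected yield
        total += expected - n * math.log(expected)
    return total

def fit_mu(data, signal, background, lo=-2.0, hi=10.0, tol=1e-7):
    """Golden-section minimisation of the NLL, which is convex in mu."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    while hi - lo > tol:
        left = hi - ratio * (hi - lo)
        right = lo + ratio * (hi - lo)
        if nll(left, data, signal, background) < nll(right, data, signal, background):
            hi = right  # minimum lies in [lo, right]
        else:
            lo = left   # minimum lies in [left, hi]
    return 0.5 * (lo + hi)

def discovery_significance(data, signal, background):
    """Asymptotic Z = sqrt(q0), q0 = 2 [NLL(mu=0) - NLL(mu_hat)], zero if mu_hat <= 0."""
    mu_hat = fit_mu(data, signal, background)
    if mu_hat <= 0.0:
        return 0.0
    q0 = 2.0 * (nll(0.0, data, signal, background)
                - nll(mu_hat, data, signal, background))
    return math.sqrt(max(q0, 0.0))

# Hypothetical per-bin yields, chosen only for illustration.
signal = [2.0, 5.0, 10.0]
background = [200.0, 100.0, 30.0]
# Asimov dataset for mu = 1: the data are set equal to the expectation.
asimov = [s + b for s, b in zip(signal, background)]

mu_hat = fit_mu(asimov, signal, background)
z_expected = discovery_significance(asimov, signal, background)
print(f"mu_hat = {mu_hat:.3f}, expected Z = {z_expected:.2f}")
```

Fitting the Asimov dataset returns the injected value µ = 1 by construction, so the significance obtained from it is the expected one. In the full analysis the nuisance parameters are profiled as well, which this sketch omits.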

8.1 Search for $t\bar{t}H\,(H \to b\bar{b})$

The observed event yields in each region are compared to the prediction in Fig. 8.1, showing the agreement seen after performing the fit to data assuming the signal-plus-background hypothesis (post-fit). The total systematic uncertainty is shown by the shaded band. An improved agreement is seen compared to the yields before the fit (pre-fit), shown previously in Fig. 7.7, with all regions showing good agreement within the total uncertainty band. The contribution from $t\bar{t}H$ is shown in red on top of the background, and is also drawn separately as a dashed line for clarity.

Figure 8.1: Data and expected yields in all regions of the analysis for the dilepton (a) and semileptonic (b) channels after performing the fit to data under the signal-plus-background hypothesis. The measured signal contribution is shown by the solid red area in each bin and separately by the red dashed line for clarity. The recorded data are compared to the expected contribution from simulation and data-driven techniques, with the total uncertainty on the expected events coming from all systematic uncertainties shown by the error band. The lower panels show the ratio of data to prediction.

The distributions of the dilepton signal regions which enter the fit are shown in Fig. 8.2, where the distributions before and after performing the fit are compared. In these figures, the fitted contribution from $t\bar{t}H$ is in red, with the signal prediction also normalised to the total background and overlaid as a dashed red line, in order to better illustrate the shape of the signal distribution. In the pre-fit predictions, the normalisation factors of the $t\bar{t}+\geq1c$ and $t\bar{t}+\geq1b$ processes and the signal strength parameter are all set to unity; no uncertainty on the normalisation factors is included in the total pre-fit systematic uncertainty band. In both the pre- and post-fit distributions, the data are reasonably well modelled within the total uncertainties, with the exception of $\mathrm{SR}_3^{\geq4j}$, which sees an excess of data above the post-fit prediction in the more background-like bins.

After the fit to data, the level of agreement is improved, with a reduction in the overall size of the systematic uncertainties. This is a result of the nuisance parameters being modified from their nominal values in the fit, in particular the $t\bar{t}+\geq1c$ and $t\bar{t}+\geq1b$ normalisation factors, whose best-fit values are 1.63±0.23 and 1.24±0.10 respectively. These values do not include the theoretical uncertainties in the calculation of the $t\bar{t}+\geq1c$ and $t\bar{t}+\geq1b$ cross sections. They also should not be read directly as a change in the cross section of the $t\bar{t}$+HF processes, since other sources of systematic uncertainty on the $t\bar{t}$+HF background can also affect its normalisation. The reduction in overall uncertainty is due to the constraint of nuisance parameters and the correlations between them when performing the fit.

Figure 8.2: Comparison between data and prediction for the classification BDT output used in the fit in the dilepton signal regions (a) $\mathrm{SR}_1^{\geq4j}$, (b) $\mathrm{SR}_2^{\geq4j}$ and (c) $\mathrm{SR}_3^{\geq4j}$, shown before (left) and after (right) the combined fit across both channels. The $t\bar{t}H$ signal contribution is shown in solid red on top of the background prediction, and also by the red dashed line which has been normalised to the total background. The total uncertainty is shown by the shaded area, with the uncertainty on the $t\bar{t}+\geq1b$ and $t\bar{t}+\geq1c$ normalisation factors not included in the pre-fit distributions. The lower panels show the ratio of data to prediction.

The best-fit value of the signal strength, extracted from the simultaneous fit over both channels, is

$$\mu_{t\bar{t}H} = 0.84 \pm 0.29\,\text{(stat.)}\,^{+0.57}_{-0.54}\,\text{(syst.)} = 0.84^{+0.64}_{-0.61},$$

with the total uncertainty in the observed measurement the same as the expected uncertainty. The statistical-only uncertainty is calculated by fixing the nuisance parameters to their best-fit values, with the exception of the normalisation factors and the $t\bar{t}H$ signal strength, and performing the fit a second time. The systematic component is then calculated by subtracting the statistical component from the total uncertainty in quadrature. The systematic uncertainty has a much greater contribution to the overall uncertainty in the analysis than the statistical component. The measured value of µ corresponds to an observed significance of 1.4σ, in comparison to an expected significance of 1.6σ assuming a SM signal with $\mu_{t\bar{t}H} = 1$, calculated using the methods described previously in Chapter 3.

An additional fit is performed with the signal strength decorrelated between the dilepton and semileptonic channels. In this fit the values of the signal strength are measured to be $\mu_{t\bar{t}H} = -0.24^{+1.02}_{-1.05}$ and $\mu_{t\bar{t}H} = 0.95^{+0.65}_{-0.62}$ in the dilepton and semileptonic channels respectively. The compatibility of these two measurements with each other is 1.3σ, calculated by comparing the p-values of the two individual fits. The signal strengths measured in this fit are compared to the nominal fit in Fig. 8.3, with the overall uncertainty separated into statistical and systematic components. When performing the fit in the two channels separately, the signal strength parameter is measured as $0.11^{+1.36}_{-1.41}$ in the dilepton channel and $0.67^{+0.71}_{-0.69}$ in the semileptonic channel. These signal strengths are both lower than the best-fit value from the combined fit due to large correlations in systematic uncertainties of the background modelling across the two channels.
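The quadrature subtraction used to obtain the systematic component can be checked against the quoted numbers. The sketch below assumes that the total, statistical and systematic uncertainties satisfy total² = stat² + syst² separately for the upward and downward variations; the function name is illustrative.

```python
import math

def syst_component(total, stat):
    """Systematic part of an uncertainty, assuming total^2 = stat^2 + syst^2."""
    if total < stat:
        raise ValueError("total uncertainty must not be smaller than the statistical part")
    return math.sqrt(total**2 - stat**2)

# Total and statistical uncertainties on mu quoted for the combined fit.
syst_up = syst_component(0.64, 0.29)
syst_down = syst_component(0.61, 0.29)
print(f"syst: +{syst_up:.2f} / -{syst_down:.2f}")  # prints "syst: +0.57 / -0.54"
```

The result reproduces the quoted systematic uncertainty of +0.57 and −0.54.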

Using the CLs method, upper limits are set on $t\bar{t}H$ production, with values of the signal strength greater than 2.0 excluded at the 95% confidence level. Upper limits are also set using the decorrelated signal strengths in the individual channels and are shown alongside the upper limit extracted from the nominal combined fit in Fig. 8.4. The fits used to set the upper limits are consistent with the method used to measure the best-fit values of the decorrelated signal strengths.

Figure 8.3: Summary of the measured signal strengths in both the dilepton and semileptonic channels, and the combined best-fit measurement, for $m_H = 125$ GeV. The numbers are obtained from combined fits across all regions, with the single-channel measurements obtained from a fit with the signal strength decorrelated between the two channels but the remaining nuisance parameters kept correlated.

| Fit | Best-fit µ (tot.) | stat. | syst. |
|---|---|---|---|
| Dilepton (two-µ combined fit) | $-0.24^{+1.02}_{-1.05}$ | $^{+0.54}_{-0.52}$ | $^{+0.87}_{-0.91}$ |
| Single lepton (two-µ combined fit) | $0.95^{+0.65}_{-0.62}$ | $^{+0.31}_{-0.31}$ | $^{+0.57}_{-0.54}$ |
| Combined | $0.84^{+0.64}_{-0.61}$ | $^{+0.29}_{-0.29}$ | $^{+0.57}_{-0.54}$ |

Figure 8.4: Summary of the 95% confidence level upper limits set on the $t\bar{t}H$ signal strength in both the dilepton and semileptonic channels, and from the combined fit. The method used to set all the upper limits in the individual channels is consistent with the method used for the values shown in Fig. 8.3. The observed limits are shown by the solid black lines. The expected limits are shown for the SM background-only hypothesis (dashed black lines) and the SM signal-plus-background hypothesis (red dashed lines, µ = 1). For the background-only hypothesis, uncertainty bands of 1σ and 2σ are shown.

The post-fit yield is compared to the data in Fig. 8.5 as a function of the logarithm of the signal-to-background ratio, $\log_{10}(S/B)$. In this plot, all bins from the final discriminants in the analysis regions of both channels are grouped into individual bins of S/B. The signal is shown on top of all backgrounds, with µ set both to the best-fit value and to the value excluded at the 95% confidence level. In the lower panel, the excess of data above the background is compared to the signal contribution for both values of µ, in addition to the total background from a fit performed assuming the background-only hypothesis. The data are in good agreement with both the observed signal strength and the background-only hypothesis.
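The regrouping of discriminant bins by $\log_{10}(S/B)$ described above can be sketched as follows. The binning (−2.6 to −0.8 in steps of 0.2) is read from the axis of Fig. 8.5. The function name, the input format and the choice to add out-of-range values to the nearest edge bin are assumptions of this sketch, and the yields are hypothetical.

```python
import math

def regroup_by_log_sb(bins, lo=-2.6, hi=-0.8, width=0.2):
    """Sum (signal, background, data) yields into bins of log10(S/B).

    `bins` is an iterable of (signal, background, data) tuples, one per bin
    of the final discriminants. Values outside [lo, hi) are added to the
    nearest edge bin (an assumption made for this sketch).
    """
    n_out = round((hi - lo) / width)
    grouped = [[0.0, 0.0, 0.0] for _ in range(n_out)]
    for s, b, n in bins:
        if s <= 0.0 or b <= 0.0:
            continue  # log10(S/B) is undefined for this bin
        index = math.floor((math.log10(s / b) - lo) / width)
        index = min(max(index, 0), n_out - 1)
        grouped[index][0] += s
        grouped[index][1] += b
        grouped[index][2] += n
    return grouped

# Hypothetical post-fit yields: (signal, background, data) per discriminant bin.
example_bins = [(0.3, 100.0, 98.0), (5.0, 100.0, 110.0),
                (10.0, 20.0, 33.0), (6.0, 120.0, 125.0)]
for i, (s, b, n) in enumerate(regroup_by_log_sb(example_bins)):
    if b > 0.0:
        print(f"bin {i}: S = {s:.1f}, B = {b:.1f}, data = {n:.0f}")
```

Bins with a similar signal purity from different regions and channels end up in the same output bin, which is what makes the most signal-like bins of the whole analysis visible in a single plot.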

Figure 8.5: Post-fit yields of the signal (S) and total background (B) as a function of $\log_{10}(S/B)$, with all bins from the final discriminants in each region grouped by the post-fit values of S/B. The data are compared to the total background yield with the signal shown normalised to the best-fit value of µ = 0.84 (red) and the value excluded at the 95% CL of µ = 2.0 (orange). In both cases, the signal is shown in addition to the total background yield from the fit. In the lower panel, the excess of data over the total background yield from the signal-plus-background fit is shown against the two hypotheses, the best-fit signal strength (solid red line) and the excluded limit (dashed orange line). The total background yield from a fit assuming the background-only hypothesis is shown by the dashed black line. The shaded area represents the total systematic uncertainty on the background content of each bin.

The dominant systematic uncertainties in the fit originate from the modelling of the $t\bar{t}$+HF background. This can be seen in Table 8.1, which summarises the impact of the sources of systematic uncertainty in the fit on the $t\bar{t}H$ signal strength. For a single nuisance parameter, the contribution to the total uncertainty on µ is calculated by taking the difference between the nominal best-fit value of µ and the value measured when fixing the nuisance parameter in the fit to its best-fit value ($\hat{\theta}$) shifted by its pre-fit (post-fit) uncertainties, $\hat{\theta} \pm \Delta\theta$ ($\hat{\theta} \pm \Delta\hat{\theta}$). The systematic uncertainties with the largest impact are shown in Fig. 8.6. Each nuisance parameter is shown with its pre- and post-fit impact on µ, as well as the pulls and constraints on the nuisance parameters after the fit. The pulls on the nuisance parameters show their variation in the fit from their nominal value $\theta_0$, with the constraints showing the post-fit uncertainty on the nuisance parameter relative to the pre-fit uncertainty.
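The definitions of impact, pull and constraint can be made concrete with a toy model. The sketch assumes one signal-region bin with a free signal strength, one control-region bin treated in the Gaussian approximation, and a single nuisance parameter θ with nominal value 0 and pre-fit uncertainty 1 that scales the background in both bins. All yields and function names are hypothetical and are not taken from the analysis.

```python
import math

def fit_theta(n_cr, b_cr, delta_cr):
    """Best-fit theta and its post-fit uncertainty from the control region.

    Gaussian approximation with a unit-width Gaussian penalty term on theta
    (nominal value 0, pre-fit uncertainty 1). The signal region does not
    constrain theta here because the free mu absorbs any shift in that bin.
    """
    slope = b_cr * delta_cr    # change in control-region yield per unit theta
    variance = n_cr            # Poisson variance approximated by the observed count
    information = 1.0 + slope**2 / variance
    theta_hat = slope * (n_cr - b_cr) / variance / information
    return theta_hat, 1.0 / math.sqrt(information)

def mu_hat(theta, n_sr, s_sr, b_sr, delta_sr):
    """Best-fit mu in the single signal-region bin with theta held fixed."""
    return (n_sr - b_sr * (1.0 + delta_sr * theta)) / s_sr

def impact(theta_hat, shift, **sr):
    """Change in the best-fit mu when theta is fixed to theta_hat + shift."""
    return mu_hat(theta_hat + shift, **sr) - mu_hat(theta_hat, **sr)

# Hypothetical yields and a 10% background variation per unit of theta.
sr = dict(n_sr=125.0, s_sr=20.0, b_sr=100.0, delta_sr=0.10)
theta_hat, post_fit_unc = fit_theta(n_cr=420.0, b_cr=400.0, delta_cr=0.10)

pull = (theta_hat - 0.0) / 1.0     # (theta_hat - theta_0) / pre-fit uncertainty
constraint = post_fit_unc / 1.0    # post-fit uncertainty / pre-fit uncertainty
pre_fit_impact = impact(theta_hat, +1.0, **sr)
post_fit_impact = impact(theta_hat, +post_fit_unc, **sr)
print(f"pull = {pull:.2f}, constraint = {constraint:.2f}")
print(f"pre-fit impact = {pre_fit_impact:+.2f}, post-fit impact = {post_fit_impact:+.2f}")
```

In this toy the control region constrains θ to less than half of its pre-fit uncertainty, so the post-fit impact on µ is correspondingly smaller than the pre-fit impact. This is the same pattern as the empty and filled rectangles of a ranking plot.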

Several systematic uncertainties relating to the modelling of the $t\bar{t}$+HF background have relatively large constraints, though no systematic is pulled by more than ∆θ from the nominal value. Other systematic uncertainties which have a relatively large impact on µ come from flavour tagging, which comprises four of the top 20 systematic uncertainties when ranked by impact, and jet energy resolution, of which both sources of uncertainty feature in the top 20 ranking. Nuisance parameters are constrained in the profile likelihood fit where a systematic variation would impact the distributions of the discriminants in the fit and result in a large deviation from the data.

To validate the pulls and constraints seen in the systematic uncertainties, and to ensure no significant bias on the signal model, a background-only fit is also performed. In this fit, the same discriminants and nuisance parameters are used, but the most signal-enriched bins are removed. Pulls and constraints similar to those of the nominal fit are seen. Fits where individual systematic uncertainties are decorrelated across all regions or samples are also performed to determine the origin of pulls and constraints in the background-only fit. The pulls are found to correct the modelling of the $t\bar{t}$+jets background to the observed data, with no one region being a major contributor to the constraint of the systematic uncertainties. The capability of the fit to constrain the uncertainties is validated by comparing the resulting constraints in the fit to those observed when performing a fit on the Asimov dataset. No additional or larger constraints are seen when fitting on data compared to the Asimov dataset. In addition, the same systematic uncertainties as seen in Fig. 8.6 are found to have the largest impact on µ when fitting the Asimov dataset, and only small differences in the values of ∆µ are observed.

Figure 8.6: Ranking of the 20 most significant nuisance parameters, sorted by their impact on the value of the signal strength, µ. Nuisance parameters corresponding to Monte Carlo statistical uncertainties are not shown. The empty blue rectangles correspond to the impact on µ of the pre-fit systematic uncertainties and the filled blue rectangles correspond to the post-fit impact on µ. The impact of each systematic uncertainty, ∆µ, is calculated by taking the difference between the best-fit value of µ and the value measured when fixing the systematic to its post-fit value ($\hat{\theta}$) shifted by its pre-fit (post-fit) uncertainties, $\hat{\theta} \pm \Delta\theta$ ($\hat{\theta} \pm \Delta\hat{\theta}$). The scale of the impact is shown on the upper axis. The pulls on each systematic are represented by the black points, shown as a deviation relative to the nominal pre-fit values, $(\hat{\theta} - \theta_0)/\Delta\theta$. The constraints are the relative uncertainties on the post-fit values of each systematic. The scale of the pulls and constraints is shown on the bottom axis. The nuisance parameters, in the order shown in the figure, are:

1. $t\bar{t}+\geq1b$: SHERPA5F vs. nominal
2. $t\bar{t}+\geq1b$: SHERPA4F vs. nominal
3. $t\bar{t}+\geq1b$: PS & hadronization
4. $t\bar{t}+\geq1b$: ISR / FSR
5. $t\bar{t}H$: PS & hadronization
6. b-tagging: mis-tag (light) NP I
7. $k(t\bar{t}+\geq1b) = 1.24 \pm 0.10$
8. Jet energy resolution: NP I
9. $t\bar{t}H$: cross section (QCD scale)
10. $t\bar{t}+\geq1b$: $t\bar{t}+\geq3b$ normalization
11. $t\bar{t}+\geq1c$: SHERPA5F vs. nominal
12. $t\bar{t}+\geq1b$: shower recoil scheme
13. $t\bar{t}+\geq1c$: ISR / FSR
14. Jet energy resolution: NP II
15. $t\bar{t}$+light: PS & hadronization
16. $Wt$: diagram subtr. vs. nominal
17. b-tagging: efficiency NP I
18. b-tagging: mis-tag (c) NP I
19. $E_{\mathrm{T}}^{\mathrm{miss}}$: soft-term resolution
20. b-tagging: efficiency NP II

| Uncertainty source | ∆µ (up) | ∆µ (down) |
|---|---|---|
| $t\bar{t}+\geq1b$ modelling | +0.46 | −0.46 |
| Background-model stat. unc. | +0.29 | −0.31 |
| b-tagging efficiency and mis-tag rates | +0.16 | −0.16 |
| Jet energy scale and resolution | +0.14 | −0.14 |
| $t\bar{t}H$ modelling | +0.22 | −0.05 |
| $t\bar{t}+\geq1c$ modelling | +0.09 | −0.11 |
| JVT, pileup modelling | +0.03 | −0.05 |
| Other background modelling | +0.08 | −0.08 |
| $t\bar{t}$ + light modelling | +0.06 | −0.03 |
| Luminosity | +0.03 | −0.02 |
| Light lepton (e, µ) id., isolation, trigger | +0.03 | −0.04 |
| Total systematic uncertainty | +0.57 | −0.54 |
| $t\bar{t}+\geq1b$ normalisation | +0.09 | −0.10 |
| $t\bar{t}+\geq1c$ normalisation | +0.02 | −0.03 |
| Intrinsic statistical uncertainty | +0.21 | −0.20 |
| Total statistical uncertainty | +0.29 | −0.29 |
| Total uncertainty | +0.64 | −0.61 |

Table 8.1: A breakdown of the contribution of different sources of systematic uncertainty to the total uncertainty on the measured $t\bar{t}H$ signal strength. Different sources of systematic uncertainty are grouped together into broader categories. The impact of each category is evaluated by fixing the nuisance parameters in that category to their best-fit values and repeating the fit. The resulting uncertainty on µ is then subtracted in quadrature from the total uncertainty from the full fit to determine the contribution ∆µ. The "background-model stat. unc." refers to the statistical uncertainties on the MC events and from the data-driven estimate of the non-prompt and fake lepton contribution in the semileptonic channel. The intrinsic statistical uncertainty refers to the total statistical uncertainty when repeating the fit with the $t\bar{t}+\geq1c$ and $t\bar{t}+\geq1b$ normalisation factors fixed to their best-fit values. The total uncertainty is different from the sum in quadrature of all the individual components due to correlations between nuisance parameters in the fit.

In addition to validating the stability of the fit and the pulls seen in systematic uncertainties, the ability of the fit to accommodate mismodelling in $t\bar{t}$+jets is studied by injecting an alternative $t\bar{t}$ model into the fit. The alternative model is built by replacing the $t\bar{t}$ contribution in the Asimov dataset by the $t\bar{t}$ model predicted by Powheg interfaced to Pythia 6. In this sample, the $t\bar{t}+\geq1b$ contribution is modified to match the prediction from the Sherpa4F $t\bar{t}+b\bar{b}$ sample. By replacing the $t\bar{t}$ events in the Asimov dataset with an alternative model and performing a fit on this new dataset, it can be observed which systematic uncertainties are pulled to compensate for the differences between the nominal prediction and the injected model. If the $t\bar{t}$ systematic model is sufficient to cover the differences in the models, other sources of uncertainty should not be pulled.

The fit is performed using the signal-plus-background hypothesis on the modified Asimov dataset. It is found to be able to recover the difference in modelling using the systematic uncertainties relating to $t\bar{t}$ modelling instead of introducing a bias on µ. The pulls on the nuisance parameters relating to modelling uncertainties for this dataset are shown in Fig. 8.7; this fit is performed including only the dilepton regions. The value of µ measured in this fit is in agreement with its true value µ = 1 within the statistical uncertainty of the modified dataset. The largest pull in this study comes from the systematic uncertainty on the inclusive $t\bar{t}$ cross section. This is a result of the lower jet multiplicity and b-tagged jet multiplicity in events predicted by the Powheg+Pythia 6 $t\bar{t}$ sample in comparison to the nominal Powheg+Pythia 8 sample.

Further validation of the result from the $t\bar{t}H\,(H \to b\bar{b})$ analysis is performed by looking at the post-fit modelling of the individual variables which enter the classification BDTs in the signal regions. The most important variables which enter the BDTs in each of the signal regions are shown in Fig. 8.8 for $\mathrm{SR}_1^{\geq4j}$, Fig. 8.9 for $\mathrm{SR}_2^{\geq4j}$, and Fig. 8.10 for $\mathrm{SR}_3^{\geq4j}$. The chosen variables include the best reconstruction BDT score per event and the Higgs boson candidate mass, using the reconstruction BDT to select the two jets matched to the Higgs boson. Good agreement is seen between the data and prediction in the variables which enter the classification BDTs used as the final discriminants in the signal regions.

Figure 8.7: The pulls and constraints on modelling uncertainties in a fit performed using a modified Asimov dataset, where the contribution from $t\bar{t}$ events is replaced by events from a sample generated with Powheg interfaced to Pythia 6. The pulls seen in the fit are correcting for the difference in the $t\bar{t}$ model between Powheg+Pythia 8 and Powheg+Pythia 6. The systematic uncertainties from experimental sources are not shown, with only the luminosity exhibiting a small pull.

Figure 8.8: Comparison between data and the post-fit prediction for four of the most important variables in the classification BDT used as the final discriminant in the dilepton signal region $\mathrm{SR}_1^{\geq4j}$: (a) the Higgs reconstruction BDT output (with Higgs information), (b) $m_{bb}$ from the reconstruction BDT, (c) $\Delta\eta_{bb}^{\mathrm{avg}}$ and (d) $\Delta R_{H,t\bar{t}}$ from the reconstruction BDT with Higgs information. The $t\bar{t}H$ signal is shown in solid red, normalised to the measured best-fit value µ = 0.84, and additionally as a dashed red line normalised to the total background prediction. The dashed black line represents the total pre-fit background prediction.

Figure 8.9: Comparison between data and the post-fit prediction for the reconstructed Higgs mass from the reconstruction BDT and three of the most important variables in the classification BDT used as the final discriminant in the dilepton signal region $\mathrm{SR}_2^{\geq4j}$. Panels (a) to (c) show the Higgs reconstruction BDT output (with Higgs information), $m_{bb}$ from the reconstruction BDT, and $\Delta\eta_{bb}^{\mathrm{avg}}$; panel (d) shows a minimum $m_{bb}$ variable in GeV. The $t\bar{t}H$ signal is shown in solid red, normalised to the measured best-fit value µ = 0.84, and additionally as a dashed red line normalised to the total background prediction. The dashed black line represents the total pre-fit background prediction.

Figure 8.10: Comparison between data and the post-fit prediction for four of the most important variables in the classification BDT used as the final discriminant in the dilepton signal region $\mathrm{SR}_3^{\geq4j}$: (a) the Higgs reconstruction BDT output (with Higgs information), (b) $m_{bb}$ from the reconstruction BDT, (c) $\Delta\eta_{bb}^{\mathrm{avg}}$ and (d) $\Delta R_{bb}^{\max p_{\mathrm{T}}}$. The $t\bar{t}H$ signal is shown in solid red, normalised to the measured best-fit value µ = 0.84, and additionally as a dashed red line normalised to the total background prediction. The dashed black line represents the total pre-fit background prediction.


8.2 Combination with other Searches

In addition to searching for $t\bar{t}H$ production in leptonic final states with the Higgs boson decaying to a b-quark pair, searches for $t\bar{t}H$ production have been carried out at $\sqrt{s} = 13$ TeV in three other analyses with orthogonal event selections. The additional searches are optimised for Higgs boson production with the Higgs boson decaying to two photons [106]; $t\bar{t}H$ events with final states containing multiple leptons arising from $H \to WW^*/ZZ^*/\tau\tau$ decays (multilepton) [2]; and a dedicated search for $H \to ZZ^* \to 4\ell$ [107]. In the diphoton and resonant four-lepton searches, the $t\bar{t}H$ signal strength is measured using an enriched region as part of a global analysis measuring the VBF, VH and $t\bar{t}H$ Higgs boson production modes. All analyses are performed using 36.1 fb$^{-1}$ of pp data collected by the ATLAS detector. Overlap between the signal and control regions in each analysis has been studied and minimised using preselection cuts so that the overlap is negligible. In addition, the modelling of the signal $t\bar{t}H$ process is consistent across all analyses, using the same theoretical prediction and associated uncertainties, with the same choice of Monte Carlo simulation used to generate the samples. These separate analyses are subsequently combined into a single analysis to extract the most precise value of µ. This is performed using a combined likelihood function, constructed as a product of the likelihood functions from the individual analyses.

The majority of nuisance parameters associated with the same sources of systematic uncertainty are treated as correlated between the different analyses. For example, all uncertainties arising from the Higgs boson production and decay modes, as well as the MC-estimated backgrounds in each analysis, are fully correlated across all analyses. However, there are some notable exceptions.
The additional modelling uncertainties on the $t\bar{t}$+HF backgrounds are not propagated outside of the $H \to b\bar{b}$ analysis, as the relevant regions of phase space in the other searches are not as sensitive to the modelling of these backgrounds. In the other analyses, an independent set of systematic uncertainties for the modelling of $t\bar{t}$ events is considered instead. The dominant experimental uncertainties in each analysis are associated with the jet energy resolution, jet energy scale, and flavour tagging efficiencies. The nuisance parameters related to the jet energy scale are correlated across all analyses, with the exception of the nuisance parameter associated with the fraction of jets initiated by quarks and gluons. This parameter is left uncorrelated because the different event selections lead to different fractions of quark- and gluon-initiated jets in each analysis. The jet energy resolution is correlated across all analyses, except for the control regions in the $H \to b\bar{b}$ analysis where the systematic is decorrelated into two independent sources. Finally, the $H \to ZZ^* \to 4\ell$ and $H \to \gamma\gamma$ analyses utilise a different calibration scheme for the flavour tagging efficiencies and mistag rates compared to the scheme used in the $H \to b\bar{b}$ and multilepton analyses. Therefore, the nuisance parameters associated with the flavour tagging uncertainties in each scheme are correlated, but the two different schemes are treated as decorrelated. As a result, the flavour tagging uncertainties are only correlated between $H \to b\bar{b}$ and multilepton, and between $H \to ZZ^* \to 4\ell$ and $H \to \gamma\gamma$. None of the nuisance parameters in the fit are strongly constrained by more than one of the individual channels, and the signal strength extracted does not depend on the choice of correlation scheme used for the systematic uncertainties.

As mentioned previously, the signal strength of $t\bar{t}H$ in the $H \to ZZ^* \to 4\ell$ and $H \to \gamma\gamma$ channels is extracted as part of combined analyses targeting multiple Higgs production modes. However, in the combined $t\bar{t}H$ analysis only the $t\bar{t}H$-enriched categories from these searches are included. These enriched categories contain significant contamination from other Higgs production modes, which is not the case in the $H \to b\bar{b}$ and multilepton searches. In the combined fit, the non-$t\bar{t}H$ Higgs production modes and all Higgs branching ratios are set to the SM expectation values, and all associated theoretical uncertainties are included in the statistical analysis. Furthermore, in the $t\bar{t}H\,(H \to \gamma\gamma)$ analysis, additional categories are optimised to measure the signal strength of Higgs production in the $WtH$ and $tHjb$ channels.
However, when extracting the signal strength µ in the combined analysis, these processes are treated as background and set to the SM predictions with their associated theoretical uncertainties. Both of these considerations result in slight differences in the values of µ extracted in each individual channel in the combined analysis compared to the values reported in the standalone analyses. The best fit value of the signal strength, extracted from the combined fit over all channels, is

$$\mu_{t\bar{t}H} = 1.17 \pm 0.19\,(\mathrm{stat.})\;^{+0.27}_{-0.23}\,(\mathrm{syst.}) = 1.17^{+0.33}_{-0.30}.$$

The background-only hypothesis is excluded at a level of 4.2σ, with an expected significance of 3.8σ in the case of the SM ttH¯ prediction, using the methods described in Chapter 3. This result constitutes evidence for the production of the Higgs boson in association with a top quark pair. The breakdown of the extracted values of the signal strength per analysis and the combined result are shown in Fig. 8.11. The measured signal strength corresponds to a ttH¯ production cross section of $\sigma_{t\bar{t}H} = 590^{+160}_{-150}$ fb, compared to the SM prediction of $\sigma_{\mathrm{SM}}^{t\bar{t}H} = 507^{+35}_{-50}$ fb.
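The arithmetic behind these numbers can be checked with a short sketch. It assumes that the statistical and systematic components are combined in quadrature and that the cross section follows from a simple scaling of the SM prediction by µ; both are simplifications of the full likelihood fit, and the function and variable names are illustrative only. The uncertainties on the cross section are not reproduced here, as they depend on how the theoretical uncertainties are treated in the fit.

```python
import math


def combine_in_quadrature(stat, syst):
    """Total uncertainty from two components assumed to be uncorrelated."""
    return math.sqrt(stat ** 2 + syst ** 2)


# Values quoted in the text for the combined fit.
MU = 1.17
STAT = 0.19
SYST_UP, SYST_DOWN = 0.27, 0.23
SIGMA_SM_FB = 507.0  # SM prediction for the ttH cross section in fb

total_up = combine_in_quadrature(STAT, SYST_UP)
total_down = combine_in_quadrature(STAT, SYST_DOWN)

# Simplified scaling (assumption): sigma = mu * sigma_SM.
sigma_fb = MU * SIGMA_SM_FB

print(f"total uncertainty: +{total_up:.2f} / -{total_down:.2f}")
print(f"scaled cross section: {sigma_fb:.0f} fb")
```

The quadrature sums reproduce the quoted total uncertainty, and the scaled central value is consistent with the quoted cross section once rounded to two significant figures.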

[Figure 8.11, plot: best-fit µttH¯ for mH = 125 GeV, shown with total and statistical uncertainty bars. ATLAS, √s = 13 TeV, 36.1 fb−1. Values printed in the figure:]

Channel          µ (tot.)          (stat., syst.)
ttH¯ (ZZ)        < 1.9 (68% CL)
ttH¯ (γγ)        0.6 +0.7/−0.6     (+0.7/−0.6, +0.2/−0.2)
ttH¯ (b¯b)       0.8 +0.6/−0.6     (+0.3/−0.3, +0.6/−0.5)
ttH¯ (ML)        1.6 +0.5/−0.4     (+0.3/−0.3, +0.4/−0.3)
ttH¯ combined    1.2 +0.3/−0.3     (+0.2/−0.2, +0.3/−0.2)

Figure 8.11: Summary of the measurements of the ttH¯ signal strength from each individual analysis and the combined measurement. ZZ refers to the resonant H → ZZ∗ → 4ℓ channel and ML refers to the multilepton analysis. As no events are observed in the H → ZZ∗ → 4ℓ channel, the upper limit on µ at the 68% confidence level computed using the CLs method is shown.

Taking advantage of the different decay modes of the Higgs boson in the various regions used in the combination, the coupling strength of the Higgs boson to different elementary particles can be measured. In combination, the ttH¯ analyses are sensitive to the Htt, Hbb, Hττ, HWW and HZZ couplings, as well as the indirect Hγγ coupling. The deviations of individual couplings can be measured in relation to the SM prediction. The coupling strengths are interpreted using the κ-parameterisation, where the couplings between the Higgs boson and a particle species i are all scaled linearly by a common factor κi. For a given process j, the coupling modifier κj is defined such that

$$\kappa_j^2 = \frac{\sigma_j^{\mathrm{obs}}}{\sigma_j^{\mathrm{SM}}}, \tag{8.1}$$

where by construction κ = 1 for all couplings in the SM. In this interpretation all couplings between fermions and the Higgs boson are scaled by a factor κF with respect to the SM prediction, and all couplings to vector bosons are scaled by a factor κV. The relevant parameterisation is given in Ref. [15], with the coupling of the Higgs boson to gluons set to κF and the coupling to photons expressed in terms of κF and κV. Only the relative sign between the two factors κF and κV is relevant when calculating the effective couplings of loop processes, and so by convention κV ≥ 0 is chosen. For κF < 0, the interference of the W boson and top quark in the H → γγ loop becomes constructive. Modifications to loop induced processes, such as the H → γγ decay mode, are determined by multiplying the relevant SM amplitudes in the loop process by the corresponding κ-factors. Contributions from non-SM physics are not included in the interpretation and only SM particles are included in the loops and Higgs decay modes. In the κ-parameterisation, an additional modifier is introduced to change the total width of the Higgs boson. This κ-factor is used to correct for the overall change in the Higgs decay rate which results from modifying the fermion and vector boson couplings.

In the H → γγ and H → ZZ∗ → 4ℓ channels, the combined analysis has sensitivity to the cross sections of the tHjb and WtH processes. The amplitude of the Hγγ decay mode and the production of tHjb and WtH all involve interference between the couplings of the Higgs boson to the top quark and to the W boson. This interference is destructive in the SM, and suppresses the rate of single-top Higgs production. As a result, the combined ttH¯ analysis is able to resolve the relative sign between the two couplings and thus determine the sign of κF in this parameterisation.
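The sign sensitivity of the loop induced H → γγ coupling can be illustrated with a minimal sketch. The coefficients below are the approximate values commonly quoted for the effective photon coupling at mH ≈ 125 GeV when only the W boson and top quark loops are retained; they are an assumption brought in for illustration and are not taken from this thesis or from Ref. [15]. The function name is hypothetical.

```python
def kappa_gamma_squared(kappa_f, kappa_v):
    """Approximate effective (H -> gamma gamma) coupling modifier squared.

    Assumed coefficients (W-loop, top-loop, interference) for m_H ~ 125 GeV;
    smaller contributions from other fermions are neglected.
    """
    w_loop = 1.59 * kappa_v ** 2
    top_loop = 0.07 * kappa_f ** 2
    interference = -0.66 * kappa_v * kappa_f  # destructive when signs agree
    return w_loop + top_loop + interference


# SM point: the three terms sum to one.
print(kappa_gamma_squared(1.0, 1.0))
# Flipped sign of kappa_F: the interference term becomes constructive.
print(kappa_gamma_squared(-1.0, 1.0))
```

With κF = −1 the interference term changes sign and the effective H → γγ rate modifier more than doubles in this approximation, which is why channels sensitive to this interference can distinguish the sign of κF.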

A likelihood scan is performed over the κF-κV plane to find the best fit values of the coupling modifiers. The acceptances of all Higgs production and decay modes in each of the individual channels are assumed to be constant as the values of κF and κV vary. The results of the scan are shown in Fig. 8.12 and are in very good agreement with the SM. In this parameterisation, a negative coupling modifier of the Higgs boson to fermions is excluded at the 95% confidence level.
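The mechanics of a two dimensional likelihood scan can be sketched with a toy example. The likelihood below is not the one used in the combination: it is built from two hypothetical Gaussian rate measurements whose central values and widths are invented for illustration, and because it depends only on κF² it carries no sensitivity to the sign of κF. The thresholds of 2.30 and 5.99 on Δ(−2 ln L) are the standard asymptotic values for 68% and 95% confidence regions with two parameters of interest.

```python
def toy_minus_two_log_l(kappa_f, kappa_v):
    """Toy -2 ln L (up to a constant) from two hypothetical measurements."""
    # Hypothetical inputs: a fermion-driven rate ~ kappa_F^2 measured as 1.0 +- 0.3
    # and a vector-boson-driven rate ~ kappa_V^2 measured as 1.0 +- 0.2.
    term_f = ((kappa_f ** 2 - 1.0) / 0.3) ** 2
    term_v = ((kappa_v ** 2 - 1.0) / 0.2) ** 2
    return term_f + term_v


def scan_plane(n_points=81, lo=0.0, hi=2.0):
    """Evaluate the toy likelihood on a regular grid and classify each point."""
    axis = [lo + (hi - lo) * i / (n_points - 1) for i in range(n_points)]
    grid = [(kf, kv, toy_minus_two_log_l(kf, kv)) for kf in axis for kv in axis]
    best_kf, best_kv, best_val = min(grid, key=lambda point: point[2])
    # Thresholds on Delta(-2 ln L) for two parameters of interest.
    inside_68 = [(kf, kv) for kf, kv, val in grid if val - best_val <= 2.30]
    inside_95 = [(kf, kv) for kf, kv, val in grid if val - best_val <= 5.99]
    return (best_kf, best_kv), inside_68, inside_95


best, region_68, region_95 = scan_plane()
print("best fit (kappa_F, kappa_V):", best)
print("grid points inside 68% / 95% CL:", len(region_68), len(region_95))
```

In a real analysis the grid evaluation would be replaced by a profile likelihood in which all nuisance parameters are refitted at each point of the plane.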

[Figure 8.12, plot: κF (vertical axis) versus κV (horizontal axis), showing the Standard Model point, the best fit point and the 68% and 95% CL contours. ATLAS, √s = 13 TeV, 36.1 fb−1, ttH¯ → γγ, ZZ, b¯b, ML.]

Figure 8.12: A two dimensional likelihood scan in the κF-κV plane of the Higgs boson couplings to fermions and vector bosons using all channels in the ttH¯ combination [2]. The allowed regions at the 68% and 95% confidence levels are shown by the solid and dashed lines respectively. In this parameterisation, the coupling of the Higgs boson to photons is expressed in terms of κF and κV.

9. Colour Connection of b-quarks

As the total integrated luminosity delivered by the LHC increases, subtler phenomenological effects in high energy particle collisions can be investigated. Precision measurement of these effects can help improve the overall modelling of interactions as well as provide new observables with which to study the properties of individual processes.

In hadron collisions performed at the LHC, jets are built from the evolution of colour charged particles produced in the collisions. This results in a collimated shower of hadrons due to the confining nature of QCD which does not allow colour charged particles to exist freely. Nevertheless, the underlying colour connections between the initial particles affect the overall structure of the emitted radiation in the evolution of partons and subsequent hadronisation. The colour charge carried by partons in the hard scatter must be considered at every subsequent interaction vertex. This in turn affects the relative substructure of jets due to the distribution of the resulting hadrons. Colour connection and its impact on the underlying structure of jets has been measured in tt¯ events by the ATLAS experiment at centre of mass energies of √s = 8 TeV and 13 TeV using jet pull observables [108, 109] and previously in pp¯ collisions by the D0 experiment [110]. In this chapter the ability to observe such an effect in a busy environment with many jets, namely the ttH¯ (H → b¯b) final state, is studied. Jet pull observables [111] are constructed using jet substructure information from jets selected using the reconstruction BDT developed in the ttH¯ (H → b¯b) analysis. The distributions are compared for signal and background processes. Such variables could be used to improve the discrimination between signal and background processes by utilising colour connection and jet substructure information which so far has not been exploited.


9.1 Phenomenological Motivation

In QCD interactions, the colour charge carried by individual partons must be conserved at each vertex. As a consequence of this, the flow of colour in successive interactions can be followed, which results in a colour connection between colour charged partons in the hard scatter and the colour neutral hadrons after hadronisation. The colour propagation rules for the QCD interaction vertices involving three partons are shown in Fig. 9.1. In the evolution of partons, the emission of soft radiation is dynamically constrained to lie between colour connected lines. Therefore, additional radiation is expected to fall in between the colour connected partons. This results in a structure in the emission of partons which persists after hadronisation [112]. Thus, the overall structure of the hadrons which are clustered together to form a jet is influenced by the colour connection of the partons in the hard scatter prior to hadronisation.


Figure 9.1: Feynman diagrams illustrating the conservation of colour charge at QCD vertices with three partons [113]. Black lines represent the propagation lines of Feynman diagrams, coloured lines show the QCD colour connection. Analogous to the arrows on the Feynman propagators, the arrows on the colour lines represent the direction of colour flow with (anti-)colours flowing forwards (backwards) in time.

As the colour connection information from the hard scatter is propagated to the structure of jets recorded in the detector, it is possible to construct observables using the jet substructure that are sensitive to the colour flow in the parton interactions. These observables could be used to distinguish between event topologies with different colour structure. Such information is complementary to the kinematic properties of reconstructed jets, and could provide additional information to help distinguish between processes with the same final state objects. A relevant example of differences in the colour structure of two similar event topologies is the comparison of ttH¯ (H → b¯b) and tt¯ + b¯b, where the two processes have the same final state in the hard scatter. There are four colour charged b-quarks in the decay products in addition to the W bosons and their decay products. In both cases, two b-quarks originate from the decay of a tt¯ pair, but in contrast the two additional b-quarks are the decay products of very different particles, the Higgs boson and the gluon. The Higgs boson is a colour singlet, and thus does not carry colour charge, whereas the gluon is a colour octet and the colour carrying mediator of the strong interaction. As a result, the two b-quarks are colour connected in ttH¯ (H → b¯b) events but in tt¯ + b¯b they are each connected to other partons via the gluon. This is demonstrated in Fig. 9.2 which shows the hard scatter of ttH¯ (H → b¯b) and tt¯ + b¯b processes with the colour connections illustrated alongside the propagators. Therefore, if one is able to select the jets which correspond to the two b-quarks not originating from the tt¯ decays and compare the substructure of the jets, the soft radiation should be different in the two processes due to the underlying colour structure.

[Figure 9.2, diagrams: panel (a) ttH¯ (H → b¯b), panel (b) tt¯ + b¯b.]

Figure 9.2: Feynman diagrams illustrating the conservation of colour charge at each interaction vertex in ttH¯ (H → b¯b) (a) and tt¯ + b¯b (b). Black lines represent the regular Feynman diagrams, coloured lines show the QCD colour connection.

An observable which is designed to exploit the differences in radiation patterns within the jet substructure is the jet pull vector $\vec{P}$ [111], a pT weighted radial moment of a jet. The pull vector of a jet J with transverse momentum $p_{\mathrm{T}}^{J}$ is defined as

$$\vec{P}(J) = \sum_{i \in J} \frac{|\Delta\vec{r}_i| \cdot p_{\mathrm{T}}^{i}}{p_{\mathrm{T}}^{J}} \, \Delta\vec{r}_i. \tag{9.1}$$

The summation is performed over a collection of jet constituents, with each constituent i having transverse momentum $p_{\mathrm{T}}^{i}$ and separated by $\Delta\vec{r}_i = (\Delta y_i, \Delta\phi_i)$ from the jet axis in rapidity-azimuth space. The jet axis is the momentum weighted centre of the jet in y-φ space, and is calculated by performing the sum of the constituents in Equation 9.1 without the factor $|\Delta\vec{r}_i|$. Examples of jet constituent collections used to construct $\vec{P}$ include inner detector tracks and calorimeter clusters. In the following study, the constituent collection contains all charged particles matched to a jet. Calorimeter clusters are not considered as they have a lower spatial resolution, as shown in Ref. [108]. Tracks in the jet collection must satisfy |η| < 2.5 and pT > 0.5 GeV. In addition, tracks must pass similar quality requirements to those applied on reconstructed electron and muon tracks to ensure they come from the primary vertex, requiring |d0| < 2 mm and |z0 sin θ| < 3 mm. Tracks are individually matched to jets using a technique called ghost association [114]. In this method tracks are included in the clustering algorithm used to reconstruct jets, though with their four vectors set to infinitesimally small magnitudes. Using this method, tracks are associated with the most appropriate jet according to the chosen clustering algorithm but have no impact on the reconstruction of jets.
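The definition in Equation 9.1 and the track requirements listed above can be written out as a minimal sketch. Constituents are represented here as plain (pT, y, φ) tuples, the jet axis is taken as the pT weighted centre of those constituents, and the jet pT is passed in separately; this data layout and all function names are assumptions made for illustration rather than the implementation used in the analysis.

```python
import math


def delta_phi(phi1, phi2):
    """Signed azimuthal difference phi1 - phi2 wrapped into (-pi, pi]."""
    d = (phi1 - phi2) % (2.0 * math.pi)
    return d - 2.0 * math.pi if d > math.pi else d


def passes_track_selection(pt_gev, eta, d0_mm, z0_sin_theta_mm):
    """Track requirements quoted in the text."""
    return (pt_gev > 0.5 and abs(eta) < 2.5
            and abs(d0_mm) < 2.0 and abs(z0_sin_theta_mm) < 3.0)


def jet_axis(constituents):
    """pT-weighted centre (y, phi) of a list of (pt, y, phi) constituents."""
    total_pt = sum(pt for pt, _, _ in constituents)
    phi_ref = constituents[0][2]  # reference angle avoids the 2*pi wrap-around
    y_axis = sum(pt * y for pt, y, _ in constituents) / total_pt
    dphi_mean = sum(pt * delta_phi(phi, phi_ref) for pt, _, phi in constituents) / total_pt
    return y_axis, phi_ref + dphi_mean


def pull_vector(constituents, jet_pt):
    """Jet pull vector (P_y, P_phi) following Equation 9.1."""
    y0, phi0 = jet_axis(constituents)
    pull_y = pull_phi = 0.0
    for pt, y, phi in constituents:
        dy, dphi = y - y0, delta_phi(phi, phi0)
        weight = math.hypot(dy, dphi) * pt / jet_pt  # |dr_i| * pT_i / pT_J
        pull_y += weight * dy
        pull_phi += weight * dphi
    return pull_y, pull_phi


# Toy jet: a hard constituent and a softer one displaced towards positive rapidity.
toy_constituents = [(20.0, 0.0, 0.0), (10.0, 0.3, 0.0)]
toy_pull = pull_vector(toy_constituents, jet_pt=30.0)
print(toy_pull)
```

Because each term is weighted by the distance from the axis, the softer but more displaced constituent dominates and the toy pull vector points towards positive rapidity.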

Figure 9.3: An illustration of the jet pull observables in a dijet system. The jet pull vector of jet J1 is shown by the dashed blue line, constructed from the constituents of the jet. The pull angle between J1 and a second jet J2 is defined as the angle in red between the pull vector and the vector formed by connecting the jet axes of J1 and J2 in y-φ space, centred on the jet axis of J1 [109].


From $\vec{P}$, information regarding the direction and magnitude of the internal colour structure of a single jet is accessible; however, the colour connection between two jets is of more interest. Using the pull vector, a second observable, the jet pull angle θP(J1, J2), can be constructed for two given jets J1 and J2. The jet pull angle relates the direction of the pull vector of J1 to the relative location of J2; it is defined as the angle between $\vec{P}(J_1)$ and the vector formed by connecting the jet axes of J1 and J2 in y-φ space. For a system in which the two jets are colour connected, the direction of soft radiation is expected to fall between the two jets, thus the pull vectors of each jet should align with the axis between the two jets. In this case θP for each jet is predicted to be approximately zero. In contrast, for a system where the two jets are not colour connected, the distribution of θP is predicted to be uniform. In Fig. 9.3 the definition of $\vec{P}$ is illustrated for a jet, alongside the definition of θP in a dijet system. The pull angle observable has been demonstrated to be able to observe the colour connection in hadronic top decays. However, the differences in the shape of the pull angle for colour connected and colour isolated dijet pairs are found to be a subtle effect which is not well modelled in Monte Carlo simulation, with large variations seen when comparing different predictions [109].
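The pull angle definition can be sketched as follows. The function takes a pull vector and the two jet axes as (y, φ) pairs and returns the unsigned angle in [0, π]; using the absolute value is a choice made here to match the range shown in the figures of this chapter, and the names are illustrative only.

```python
import math


def wrap_phi(dphi):
    """Wrap an azimuthal difference into (-pi, pi]."""
    d = dphi % (2.0 * math.pi)
    return d - 2.0 * math.pi if d > math.pi else d


def pull_angle(pull, axis_j1, axis_j2):
    """Unsigned angle between the pull vector of J1 and the J1 -> J2 connection.

    pull    : (P_y, P_phi) pull vector of J1
    axis_j1 : (y, phi) axis of J1
    axis_j2 : (y, phi) axis of J2
    """
    conn_y = axis_j2[0] - axis_j1[0]
    conn_phi = wrap_phi(axis_j2[1] - axis_j1[1])
    if math.hypot(*pull) == 0.0 or math.hypot(conn_y, conn_phi) == 0.0:
        raise ValueError("pull angle is undefined for a zero-length vector")
    dot = pull[0] * conn_y + pull[1] * conn_phi
    cross = pull[0] * conn_phi - pull[1] * conn_y
    return abs(math.atan2(cross, dot))


# Pull pointing straight at the partner jet: angle of zero.
print(pull_angle((1.0, 0.0), (0.0, 0.0), (1.0, 0.0)))
# Pull perpendicular to the connecting vector: angle of pi/2.
print(pull_angle((0.0, 1.0), (0.0, 0.0), (1.0, 0.0)))
```

For colour connected jets the values are expected to cluster near zero, while a flat distribution between 0 and π corresponds to no preferred direction.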

9.2 Colour Flow in ttH¯ (H → b¯b)

Using an observable which is able to probe the colour connection between two jets, the differences in the colour flow of two processes can be probed. In this study, the $\mathrm{SR}_1^{\geq 4j}$ dilepton signal region from the ttH¯ (H → b¯b) analysis is used to investigate the feasibility of measuring the different colour connections between jets in ttH¯ (H → b¯b) and tt¯ + b¯b events. In $\mathrm{SR}_1^{\geq 4j}$ there is the largest contribution from ttH¯ (H → b¯b) in comparison to the total background, and the background is dominated by tt¯ + b¯b as described in previous chapters. These two processes correspond to the diagrams shown in Fig. 9.2, where the colour flow between partons is illustrated using the rules in Fig. 9.1. Therefore, by selecting the jets which have a different colour structure in ttH¯ (H → b¯b) and tt¯ + b¯b, namely the additional b-jets not from the tt¯ decays, the distributions of the jet pull angle observable should show a distinction between the two processes.

In order to select the b-jets not originating from the tt¯ decays, the reconstruction BDT is employed to select the two b-jets hypothesised to originate from the Higgs boson decay. The BDT with Higgs information is utilised to match the jets as it provides the highest performance. Unlike for observables such as MHiggs, no bias should be introduced in the reconstructed observables, as the jet substructure does not enter the BDT training and no correlations with the input variables are observed. As shown previously, in $\mathrm{SR}_1^{\geq 4j}$ all four jets are correctly matched to the b-quarks in ttH¯ (H → b¯b) events for just over 40% of all events. In tt¯ + jets events, the BDT with Higgs information correctly matches jets to the two b-quarks from the tt¯ decay in 29.4% of all events. Furthermore, for tt¯ + b¯b events the two jets matched to the Higgs boson should subsequently correspond to the additional b-quarks in the event.

In order to compare the colour structure in ttH¯ (H → b¯b) with tt¯ + b¯b, the pull angle is constructed between the two jets matched to the Higgs boson. As the pull angle is calculated from one jet with relation to another, and the distributions of θP(J1, J2) and θP(J2, J1) probe the same substructure information, only θP from the leading jet to the subleading jet is reconstructed. The ordering of the jets is consistent with the method used in the reconstruction BDT, with the two jets sorted by the binned b-tagging discriminant, and subsequently by pT for jets in the same b-tagging discriminant bin. The leading jet is chosen as it typically has higher pT and more jet constituents. The comparison of θP is performed in $\mathrm{SR}_1^{\geq 4j}$ where the background is predominantly tt¯ + b¯b and there is the greatest sensitivity to the ttH¯ signal, as well as the highest matching performance. Due to the low efficiency of the reconstruction BDT to correctly assign jets, it can be expected that the observed performance of θP is not optimal. However, in Fig. 9.4a the non-uniform distribution of θP for simulated ttH¯ events visibly comes from the events where the jets are both correctly matched to the Higgs boson. The events with jets incorrectly matched to the Higgs boson show a more uniform distribution in θP.
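The jet ordering described above amounts to a two-key sort, which can be sketched as follows. The integer encoding of the b-tagging discriminant bin, with larger values meaning a tighter working point is passed, is an assumption made for illustration, as are the class and function names.

```python
from typing import NamedTuple


class Jet(NamedTuple):
    pt: float      # transverse momentum in GeV
    btag_bin: int  # binned b-tagging discriminant (assumed: higher = more b-like)


def order_jets(jets):
    """Sort by b-tagging bin first, then by pT within the same bin, both descending."""
    return sorted(jets, key=lambda jet: (jet.btag_bin, jet.pt), reverse=True)


higgs_candidates = [Jet(pt=80.0, btag_bin=3), Jet(pt=40.0, btag_bin=5), Jet(pt=60.0, btag_bin=5)]
leading, subleading = order_jets(higgs_candidates)[:2]
print(leading, subleading)
```

With this ordering a softer jet in a tighter b-tagging bin ranks ahead of a harder jet in a looser bin, and pT only breaks ties within a bin.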
To validate the behaviour of the observable when comparing the colour structure of H → b¯b and g → b¯b in signal and background events, it is useful to also compare the pull angle between two jets with no colour connection in both ttH¯ (H → b¯b) and tt¯ + b¯b. Here the pull angle between the jets matched to the b-quarks from the tt¯ decays is calculated. In addition, due to the statistical limitation of the $\mathrm{SR}_1^{\geq 4j}$ signal region with the current 36.1 fb−1 dataset, and the large statistical uncertainty from the Monte Carlo samples in relation to the expected size of the effect, the two pull angles are calculated for events in a looser selection region. In this looser selection ≥4 jets are required, of which ≥4 are required to pass the loose b-tagging working point (≥4j ≥4b). In this region the comparison is not between the pull angles in ttH¯ (H → b¯b) and tt¯ + b¯b, but rather ttH¯ (H → b¯b) and tt¯ + jets. In spite of this, the same underlying colour structure is present in tt¯ + jets events with two additional quarks in the hard scatter, with the b-quarks in tt¯ + b¯b replaced by quarks of any flavour. Another consequence of the looser selection is that the efficiency of the reconstruction BDT in this region is reduced, with all jets correctly matched in 30.8% of ttH¯ events and both jets correctly matched to the tt¯ decay in 25.5% of all tt¯ + jets events. Nevertheless, in Fig. 9.4b the shape of the observable in simulated ttH¯ events is also observed to come from events with correctly matched jets, consistent with the behaviour in $\mathrm{SR}_1^{\geq 4j}$.

[Figure 9.4, plots: normalised entries versus the charged particle pull angle θP(b^{l,Higgs}, b^{sl,Higgs}) [rad] for all ttH¯ events and for ttH¯ events with incorrect Higgs matching; panel (a) dilepton $\mathrm{SR}_1^{\geq 4j}$, panel (b) dilepton ≥4j ≥4b (loose working point).]

Figure 9.4: The pull angle θP between the two jets matched to the b-quarks from the Higgs boson decay in ttH¯ (H → b¯b) events using the reconstruction BDT. The angle is calculated using all charged particle constituents in the jets. The events where the reconstruction BDT has incorrectly matched both jets to the Higgs boson decay are shaded in red, with the distribution for all ttH¯ events in blue. Shown for all simulated ttH¯ events in $\mathrm{SR}_1^{\geq 4j}$ (a) and in a looser selection requiring ≥4j ≥4b with the loose b-tagging working point (b).

The pull angle between the two jets matched to the Higgs boson is compared for the ttH¯ signal and all background events in Fig. 9.5 for both $\mathrm{SR}_1^{\geq 4j}$ and the loose selection. Although the effect is small, and the lower number of events from Monte Carlo simulation in $\mathrm{SR}_1^{\geq 4j}$ is apparent, a shape difference is visible between signal and background events. However, the distribution of the background is not perfectly uniform, as would be expected for jets with no colour connection. Instead the background distribution in the ≥4j ≥4b region in particular can be interpreted as having a slight slope. In both regions, there is no difference observed in distributions for tt¯ events where jets are correctly assigned to the two b-quarks from the tt¯ decays and those with incorrect assignments. This is shown by the distributions in Fig. 9.6. The departure of the background from a uniform distribution could be the result of the origin of the additional jets in the event, with a larger expected contribution to the background from non-tt¯ + b¯b events. In the case of the pull angle of the jets matched to the b-quarks from the tt¯ decay, one would expect both the signal and background events to have a uniform distribution in θP as there is no colour connection. In Fig. 9.7 the distributions are compared for both regions, and the behaviour is consistent with expectation.

[Figure 9.5, plots: arbitrary units versus the charged particle pull angle θP(b^{l,Higgs}, b^{sl,Higgs}) [rad] for ttH¯ (b¯b) and the total background, √s = 13 TeV, 36.1 fb−1, dilepton; panel (a) $\mathrm{SR}_1^{\geq 4j}$, separation 0.143%; panel (b) ≥4j ≥4b (loose working point), separation 0.103%.]

Figure 9.5: The θP between the two jets matched to the Higgs boson by the reconstruction BDT in $\mathrm{SR}_1^{\geq 4j}$ (a) and ≥4j ≥4b (b). Comparing simulated ttH¯ events (red) with the total expected background (blue) in each region, normalised to unity.
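The size of a shape difference between two normalised histograms, such as the separation values printed on the figure panels, can be quantified with a short sketch. It assumes the commonly used TMVA-style definition, half the sum over bins of (s − b)²/(s + b) for unit-normalised bin contents; whether the quoted values follow exactly this definition is an assumption, and the function name is illustrative.

```python
def separation(signal_bins, background_bins):
    """Separation between two binned distributions (assumed TMVA-style definition).

    Returns 0 for identical shapes and 1 for shapes with no overlap.
    """
    if len(signal_bins) != len(background_bins):
        raise ValueError("histograms must have the same binning")
    s_total, b_total = sum(signal_bins), sum(background_bins)
    total = 0.0
    for s, b in zip(signal_bins, background_bins):
        s_norm, b_norm = s / s_total, b / b_total  # normalise each shape to unity
        if s_norm + b_norm > 0.0:
            total += (s_norm - b_norm) ** 2 / (s_norm + b_norm)
    return 0.5 * total


# Toy shapes: a flat background and a signal slightly enhanced at small angles.
flat = [1.0, 1.0, 1.0, 1.0]
tilted = [1.3, 1.1, 0.9, 0.7]
print(f"{100.0 * separation(tilted, flat):.2f}%")
```

Values well below one percent, as seen here for modest shape differences, indicate that a single such observable provides only limited discrimination on its own.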

In order to assess how well the colour connection is currently modelled in these regions, the agreement between data and prediction is compared for the pull angle observables. The same samples and systematic uncertainties used in the ttH¯ (H → b¯b) analysis are utilised for this comparison. As the modelling of the tt¯ + HF background is not well understood, the results of the ttH¯ (H → b¯b) statistical analysis are applied to the distributions. In this way, the best current understanding of the background processes extracted from the same dataset is used to help the overall modelling. The comparison to data is performed both before and after applying the results of the statistical analysis. The pull angle between the jets matched to the Higgs boson is shown in Fig. 9.8, and the pull angle between the jets matched to the b-quarks from the tt¯ decay is shown in Fig. 9.9. In all distributions no large disagreement is seen between the data and prediction. Before applying the results from the ttH¯ (H → b¯b) statistical analysis all data points fall within the total uncertainty band. No large shape differences are observed, though a slight normalisation offset can be seen in ≥4j ≥4b. This is consistent with the excess seen in the dilepton ≥4j regions in the ttH¯ (H → b¯b) analysis. After applying the fit results, the normalisation is improved in this region, though the prediction in $\mathrm{SR}_1^{\geq 4j}$ overestimates the data as already seen in the post-fit distributions in the search for ttH¯ (H → b¯b). Furthermore, almost all data points fall within the total uncertainty after applying the changes to the modelling from the results of the ttH¯ (H → b¯b) analysis.

The amount of data available in ≥4j ≥4b suggests that observing the colour structure in tt¯ + jets events is possible; however, good control of the background modelling is required to reduce the total systematic uncertainty. In contrast, more data are required to look at a tt¯ + b¯b enriched region such as $\mathrm{SR}_1^{\geq 4j}$. Another limitation is the number of events currently available in Monte Carlo simulation for the background prediction in such a region. In addition, even after applying the fit result to control the background modelling, the total uncertainty band is larger than the ttH¯ contribution in all bins. This makes it hard to discern any differences arising from the ttH¯ contribution in the overall distributions. Nevertheless, differences in jet substructure between ttH¯ events and the tt¯ + jets backgrounds are visible. In addition to the apparent shape difference between the two different θP observables, which are well modelled in data, the effects of colour connection on jet substructure in ttH¯ (H → b¯b) and tt¯ + b¯b events should be observable in future as more data are collected and the background modelling is improved. The additional information provided by these observables could further improve the separation of signal and background events in a classification discriminant, or be used to discriminate against unlikely jet assignments in reconstructing the final state of processes with many jets.

[Figure 9.6, plots: normalised entries versus the charged particle pull angle θP(b^{l,Higgs}, b^{sl,Higgs}) [rad] for tt¯ events (all events, correct tt¯ matching, incorrect tt¯ matching); panel (a) dilepton $\mathrm{SR}_1^{\geq 4j}$, panel (b) dilepton ≥4j ≥4b.]

Figure 9.6: The pull angle θP between the two jets matched to the Higgs boson decay in tt¯ + jets events using the reconstruction BDT. The angle is calculated using all charged particle constituents in the jets. The events where the reconstruction BDT has correctly matched both jets to the b-quarks from the tt¯ decay are shaded in green, events incorrectly matched are shown in red and the distribution for all tt¯ + jets events is shown by the blue line. The distributions are evaluated for all simulated tt¯ + jets events in $\mathrm{SR}_1^{\geq 4j}$ (a) and ≥4j ≥4b (b). The green contribution is shown on top of the red contribution, and the sum of the two in each bin is equal to the blue line.

[Figure 9.7, plots: arbitrary units versus the charged particle pull angle θP(b^{t}, b^{t¯}) [rad] for ttH¯ (b¯b) and the total background, √s = 13 TeV, 36.1 fb−1, dilepton; panel (a) $\mathrm{SR}_1^{\geq 4j}$, separation 0.00703%; panel (b) ≥4j ≥4b (loose working point), separation 0.0406%.]

Figure 9.7: The θP between the two jets matched to the b-quarks from tt¯ decays by the reconstruction BDT in $\mathrm{SR}_1^{\geq 4j}$ (a) and ≥4j ≥4b (b). Comparing simulated ttH¯ events (red) with the total expected background (blue) in each region, normalised to unity.

[Figure 9.8, plots: event yields and data/prediction ratio versus the charged particle pull angle θP(b^{l,Higgs}, b^{sl,Higgs}) [rad], √s = 13 TeV, 36.1 fb−1, dilepton, in $\mathrm{SR}_1^{\geq 4j}$ (left) and ≥4j ≥4b with the loose working point (right); legend: data, ttH¯, tt¯ + light, tt¯ + ≥1c, tt¯ + ≥1b, tt¯ + V, non-tt¯, uncertainty band; row (a) pre-fit, row (b) post-fit.]

Figure 9.8: Comparison between data and prediction for θP between the jets matched to the Higgs boson using the reconstruction BDT. Shown in $\mathrm{SR}_1^{\geq 4j}$ (left) and ≥4j ≥4b (right). The comparison is performed using the prediction from both before (a) and after (b) applying the result of the ttH¯ (H → b¯b) analysis.

[Figure 9.9, plots: event yields and data/prediction ratio versus the charged particle pull angle θP(b^{t}, b^{t¯}) [rad], √s = 13 TeV, 36.1 fb−1, dilepton, in $\mathrm{SR}_1^{\geq 4j}$ (left) and ≥4j ≥4b with the loose working point (right); legend: data, ttH¯, tt¯ + light, tt¯ + ≥1c, tt¯ + ≥1b, tt¯ + V, non-tt¯, uncertainty band; row (a) pre-fit, row (b) post-fit.]

Figure 9.9: Comparison between data and prediction for θP between the jets matched to the b-quarks from the tt¯ decays using the reconstruction BDT. Shown in $\mathrm{SR}_1^{\geq 4j}$ (left) and ≥4j ≥4b (right). The comparison is performed using the prediction from both before (a) and after (b) applying the result of the ttH¯ (H → b¯b) analysis.

10. Conclusion

In this thesis, the search for the Standard Model Higgs boson produced in association with a top quark pair has been presented. The search was performed using 36.1 fb−1 of pp collision data, taken at √s = 13 TeV by the ATLAS detector during Run 2 of the LHC, and follows on from previous searches performed at a centre of mass energy of 8 TeV. The search targeted the decay of the Higgs boson into a pair of b-quarks, the decay mode with the highest branching ratio. Furthermore, events were selected with leptonic tt¯ decays, providing a clean trigger signature. The ttH¯ production mechanism has a very low cross section, and like the H → b¯b decay mode has not yet been observed in data. In order to improve the sensitivity to the signal in the analysis, a wide range of techniques were used to discriminate ttH¯ events from the dominant backgrounds. This included a new approach to defining the analysis regions by using the binned b-tagging discriminant of the jets in an event, as opposed to applying a cut on the b-tag multiplicity at a single working point.

Another method employed to enhance the sensitivity of the analysis was to reconstruct the events assuming a ttH¯ (H → b¯b) hypothesis. The final state was reconstructed in the dilepton channel using a BDT trained to match jets to the b-quarks in the hard scatter. This method was able to correctly reconstruct the Higgs boson in up to 50% of all ttH¯ (H → b¯b) events. Using the event reconstruction, variables were constructed that provide separating power between the signal and background processes. Variables from the reconstruction BDT constituted half of the variables entering the final classification BDT trainings in the two most signal rich regions. In addition, by using the reconstruction BDT to match the jets to the Higgs boson and tt¯ decays, the effect of the colour connection between jets on jet substructure was probed.
Looking at the pull angles between jets matched to the Higgs boson, a difference is observed between events where the two jets are predicted to have a colour connection and events where there is no connection. Nevertheless, the differences are found to be small, and an increase in data and in the number of simulated events would be required to exploit the information and further understand the modelling of these observables in $t\bar{t}H$.

Performing a simultaneous likelihood fit on all regions in the $t\bar{t}H\,(H \to b\bar{b})$ analysis, the signal strength of $t\bar{t}H$ was measured to be $\mu = 0.84^{+0.64}_{-0.61}$, with values greater than 2.0 excluded at the 95% confidence level. This analysis was subsequently combined with three other analyses searching for $t\bar{t}H$, each targeting different Higgs boson decay modes. This combined result measured the signal strength of $t\bar{t}H$ as

$$\mu_{t\bar{t}H} = 1.17 \pm 0.19\,(\text{stat.})\,^{+0.27}_{-0.23}\,(\text{syst.}) = 1.17^{+0.33}_{-0.30}.$$

This corresponds to an observed significance of 4.2σ, constituting evidence for the production of the Higgs boson in association with a top quark pair. The measured cross section of $\sigma_{t\bar{t}H} = 790^{+230}_{-210}$ fb is in agreement with the SM prediction of $507^{+35}_{-50}$ fb. Both of these measurements are improvements on the results from searches performed by the ATLAS experiment during Run 1 of the LHC, which used 20.1 fb$^{-1}$ of $pp$ collisions at $\sqrt{s} = 8$ TeV [93, 94, 115].

Due to the sensitivity of the combined analysis to multiple Higgs boson couplings, a further study has been presented which uses the κ-parameterisation. In this study, the couplings of the Higgs boson to fermions and vector bosons are scaled linearly by two common factors, $\kappa_F$ and $\kappa_V$. Best fit values extracted for the two modifiers are in good agreement with the Standard Model prediction of $\kappa_F = \kappa_V = 1$. Furthermore, a negative value of $\kappa_F$ has been excluded at the 95% confidence level. As the dataset collected by the ATLAS experiment increases, and measurements of the $t\bar{t}H$ production mode become more precise, further interpretations will become possible. These include measuring the coupling strengths of individual particles to the Higgs boson and extracting a value of $y_t$ using only the direct coupling of the Higgs boson to top quarks.

Moving forward, the significance for $t\bar{t}H$ production with the ATLAS detector is expected to exceed 5σ in the combination after the addition of the $pp$ data collected during 2017, which increases the total integrated luminosity to 80 fb$^{-1}$. Similarly, the $H \to b\bar{b}$ decay mode has not yet been observed by either the ATLAS or CMS collaborations. However, with the additional data collected during 2017, observation is a likely prospect, with evidence already reported in the $VH$ channel alone by both the ATLAS and CMS experiments [16, 17] using the current Run 2 dataset.
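Two of the numbers quoted above can be cross-checked with simple arithmetic. The sketch below assumes that the statistical and systematic components of the combined signal strength add in quadrature, and it projects the expected significance of 3.8σ quoted in the abstract to 80 fb$^{-1}$ under a naive, statistics-only $\sqrt{L}$ scaling. That scaling is an illustrative assumption rather than the projection method used in this thesis, it ignores systematic uncertainties, and the function names are hypothetical.

```python
import math


def quadrature(*components):
    """Combine independent uncertainty components in quadrature."""
    return math.sqrt(sum(c * c for c in components))


def scale_significance(z_now, lumi_now, lumi_future):
    """Naive statistics-only projection: significance grows as sqrt(luminosity).

    Assumption for illustration only; systematic uncertainties are ignored.
    """
    return z_now * math.sqrt(lumi_future / lumi_now)


# Components of the combined signal strength quoted in the text.
stat = 0.19
syst_up, syst_down = 0.27, 0.23

total_up = quadrature(stat, syst_up)
total_down = quadrature(stat, syst_down)

# Expected significance (3.8 sigma, from the abstract) at 36.1 fb^-1,
# projected to the 80 fb^-1 mentioned in the text.
projected = scale_significance(3.8, 36.1, 80.0)

print(f"total uncertainty: +{total_up:.2f} / -{total_down:.2f}")
print(f"projected expected significance: {projected:.1f} sigma")
```

Under these assumptions the quadrature sums reproduce the quoted total uncertainty of $^{+0.33}_{-0.30}$, and the naive projection lies above 5σ, in line with the expectation stated in the text.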
Furthermore, projections can be made on the expected sensitivity to the $t\bar{t}H$ production mode that can be achieved with the full dataset from the LHC after the High Luminosity LHC (HL-LHC) upgrade. The HL-LHC aims to deliver a $pp$ dataset with a total integrated luminosity of 3000 fb$^{-1}$ over the course of its running, which is set to finish in 2035 [116]. Extrapolating the sensitivity of the current analyses to the total dataset, the total uncertainty on the total $t\bar{t}H$ production cross section is expected to reach 1%. Most of the gain in sensitivity is expected to come from the $H \to \gamma\gamma$ decay channel, followed by the multilepton channels.

However, the $t\bar{t}H\,(H \to b\bar{b})$ analysis is currently limited by systematic uncertainties. In order to improve the sensitivity in this channel, new techniques will need to be employed to enhance the signal contribution. Furthermore, a more comprehensive understanding of the dominant $t\bar{t}$ + HF background is required, as evidenced by the large number of high-impact systematic uncertainties which see pulls and constraints in the profile likelihood fit.

References

[1] ATLAS Collaboration. Search for the Standard Model Higgs boson produced in association with top quarks and decaying into a $b\bar{b}$ pair in $pp$ collisions at $\sqrt{s} = 13$ TeV with the ATLAS detector. In: Phys. Rev. D (2017). arXiv: 1712.08895 [hep-ex].
[2] ATLAS Collaboration. Evidence for the associated production of the Higgs boson and a top quark pair with the ATLAS detector. In: Phys. Rev. D97.7 (2018), p. 072003. arXiv: 1712.08891 [hep-ex].
[3] Griffiths, D. Introduction to Elementary Particles. 2nd edition. Wiley-VCH, 2008.
[4] Martin, B. R. and Shaw, G. Particle Physics. 3rd edition. Manchester Physics Series. Wiley-Blackwell, 2013.
[5] Grange, J. et al. Muon (g-2) Technical Design Report. In: (2015). arXiv: 1501.06858 [physics.ins-det].
[6] Patrignani, C. et al. Review of Particle Physics. In: Chin. Phys. C40.10 (2016), p. 100001.
[7] Glashow, S. L. Partial symmetries of weak interactions. In: Nucl. Phys. 22 (1961), p. 579.
[8] Weinberg, S. A model of leptons. In: Phys. Rev. Lett. 19 (1967), p. 1264.
[9] Salam, A. Weak and electromagnetic interactions. In: Proc. of the 8th Nobel Symposium (1969), p. 367.
[10] Englert, F. and Brout, R. Broken Symmetry and the Mass of Gauge Vector Mesons. In: Phys. Rev. Lett. 13 (1964), p. 321.
[11] Higgs, P. W. Broken Symmetries and the Masses of Gauge Bosons. In: Phys. Rev. Lett. 13 (1964), p. 508.
[12] Guralnik, G., Hagen, C., and Kibble, T. Global Conservation Laws and Mass-less Particles. In: Phys. Rev. Lett. 13 (1964), p. 585.


[13] Alvarez-Gaume, L. and Ellis, J. Eyes on a prize particle. In: Nature Phys. 7.1 (2011). Editorial Material, pp. 2–3. url: https://cds.cern.ch/record/1399903.
[14] LHC Higgs Cross Section Working Group. Handbook of LHC Higgs Cross Sections: 4. Deciphering the Nature of the Higgs Sector. 2016. arXiv: 1610.07922 [hep-ph].
[15] ATLAS and CMS Collaborations. Measurements of the Higgs boson production and decay rates and constraints on its couplings from a combined ATLAS and CMS analysis of the LHC $pp$ collision data at $\sqrt{s} = 7$ and 8 TeV. ATLAS-CONF-2015-044. 2015. url: https://cds.cern.ch/record/2052552.
[16] Aaboud, M. et al. Evidence for the $H \to b\bar{b}$ decay with the ATLAS detector. In: JHEP 12 (2017), p. 024. arXiv: 1708.03299 [hep-ex].
[17] CMS Collaboration. Evidence for the Higgs boson decay to a bottom quark-antiquark pair. In: Phys. Lett. B780 (2018), pp. 501–532. arXiv: 1709.07497 [hep-ex].
[18] Bezrukov, F. and Shaposhnikov, M. Why should we care about the top quark Yukawa coupling? 2014. arXiv: 1411.1923 [hep-ph].
[19] Buttazzo, D. et al. Investigating the near-criticality of the Higgs boson. In: JHEP 12 (2013), p. 089. arXiv: 1307.3536 [hep-ph].
[20] Ball, R. D. et al. Parton distributions for the LHC Run II. In: JHEP 04 (2015), p. 040. arXiv: 1410.8849 [hep-ph].
[21] Lai, H.-L. et al. New parton distributions for collider physics. In: Phys. Rev. D 82 (2010), p. 074024. arXiv: 1007.2241 [hep-ph].
[22] Hautmann, F. and Jung, H. Transverse momentum dependent gluon density from DIS precision data. In: Nucl. Phys. B883 (2014), pp. 1–19. arXiv: 1312.7875 [hep-ph].
[23] Buckley, A. et al. General-purpose event generators for LHC physics. In: Phys. Rept. 504 (2011), pp. 145–233. arXiv: 1101.2599 [hep-ph].
[24] Cowan, G. et al. Asymptotic formulae for likelihood-based tests of new physics. In: Eur. Phys. J. C 71 (2011), p. 1554. arXiv: 1007.1727 [physics.data-an].
[25] Procedure for the LHC Higgs boson search combination in summer 2011. ATL-PHYS-PUB-2011-011, CMS-NOTE-2011-005. 2011.


[26] Neyman, J. and Pearson, E. S. On the Problem of the Most Efficient Tests of Statistical Hypotheses. In: Philosophical Transactions of the Royal Society of London. Series A, Containing Papers of a Mathematical or Physical Character 231 (1933), pp. 289–337. url: http://www.jstor.org/stable/91247.
[27] ATLAS Collaboration. Procedure for the LHC Higgs boson search combination in summer 2011. ATL-PHYS-PUB-2011-011. 2011. url: https://cds.cern.ch/record/1375842.
[28] Asimov, I. Franchise. Isaac Asimov: The Complete Stories, Vol. 1. Broadway Books, 1990.
[29] Read, A. L. Presentation of search results: The CL(s) technique. In: J. Phys. G 28 (2002), p. 2693.
[30] Junk, T. Confidence level computation for combining searches with small statistics. In: Nucl. Instrum. Meth. A434 (1999), pp. 435–443. arXiv: hep-ex/9902006 [hep-ex].
[31] Hocker, A. et al. TMVA - Toolkit for Multivariate Data Analysis. In: PoS ACAT (2007), p. 040. arXiv: physics/0703039 [PHYSICS].
[32] Brüning, O. S. et al. LHC Design Report. CERN Yellow Reports: Monographs. Geneva: CERN, 2004.
[33] ATLAS Collaboration. The ATLAS Experiment at the CERN Large Hadron Collider. In: JINST 3 (2008), S08003.
[34] CMS Collaboration. The CMS experiment at the CERN LHC. In: JINST 3 (2008), S08004.
[35] LHCb Collaboration. The LHCb Detector at the LHC. In: JINST 3 (2008), S08005.
[36] ALICE Collaboration. The ALICE experiment at the CERN LHC. In: JINST 3 (2008), S08002.
[37] UA1 Collaboration. Experimental Observation of Lepton Pairs of Invariant Mass Around 95 GeV/$c^2$ at the CERN SPS Collider. In: Phys. Lett. 126B (1983), pp. 398–410.
[38] UA1 Collaboration. Experimental Observation of Isolated Large Transverse Energy Electrons with Associated Missing Energy at $s^{1/2} = 540$ GeV. In: Phys. Lett. 122B (1983). [611(1983)], pp. 103–116.


[39] UA2 Collaboration. Evidence for $Z^0 \to e^+e^-$ at the CERN $\bar{p}p$ Collider. In: Phys. Lett. 129B (1983), pp. 130–140.
[40] UA2 Collaboration. Observation of Single Isolated Electrons of High Transverse Momentum in Events with Missing Transverse Energy at the CERN $\bar{p}p$ Collider. In: Phys. Lett. 122B (1983), pp. 476–485.
[41] Mobs, E. The CERN accelerator complex. Complexe des accélérateurs du CERN. General Photo. July 2016. url: https://cds.cern.ch/record/2197559.
[42] ATLAS Collaboration. Observation of a new particle in the search for the Standard Model Higgs boson with the ATLAS detector at the LHC. In: Phys. Lett. B 716 (2012), p. 1. arXiv: 1207.7214 [hep-ex].
[43] CMS Collaboration. Observation of a new boson at a mass of 125 GeV with the CMS experiment at the LHC. In: Phys. Lett. B 716 (2012), p. 30. arXiv: 1207.7235 [hep-ex].
[44] Gorden, G. Star Wars Imperial Sourcebook. West End Games, 1989.

[45] Fathead, LLC. TIE/LN Starfighter. url: http://starwars.wikia.com/wiki/File:TIEfighter2-Fathead.png (visited on 02/27/2018).
[46] Pequenao, J. Computer generated image of the ATLAS inner detector. Mar. 2008. url: https://cds.cern.ch/record/1095926.
[47] ATLAS Collaboration. ATLAS Insertable B-Layer Technical Design Report. In: ATLAS-TDR-19 (2010). url: https://cds.cern.ch/record/1291633.
[48] ATLAS Collaboration. Commissioning of the ATLAS b-tagging algorithms using $t\bar{t}$ events in early Run 2 data. ATL-PHYS-PUB-2015-039. 2015. url: https://cds.cern.ch/record/2047871.
[49] ATLAS Collaboration. Technical Design Report for the ATLAS Inner Tracker Strip Detector. CERN-LHCC-2017-005, ATLAS-TDR-025. 2017.
[50] Pequenao, J. Computer Generated image of the ATLAS calorimeter. Mar. 2008. url: https://cds.cern.ch/record/1095927.
[51] Pequenao, J. Computer generated image of the ATLAS Muons subsystem. Mar. 2008. url: https://cds.cern.ch/record/1095929.
[52] ATLAS Collaboration. Performance of the ATLAS Trigger System in 2015. In: Eur. Phys. J. C77.5 (2017), p. 317.


[53] Aaboud, M. et al. Luminosity determination in $pp$ collisions at $\sqrt{s} = 8$ TeV using the ATLAS detector at the LHC. In: Eur. Phys. J. C76.12 (2016), p. 653. arXiv: 1608.03953 [hep-ex].
[54] ATLAS Collaboration. Early Inner Detector Tracking Performance in the 2015 Data at $\sqrt{s} = 13$ TeV. ATL-PHYS-PUB-2015-051. 2015. url: https://cds.cern.ch/record/2110140.
[55] ATLAS Collaboration. Vertex Reconstruction Performance of the ATLAS Detector at $\sqrt{s} = 13$ TeV. ATL-PHYS-PUB-2015-026. 2015. url: https://cds.cern.ch/record/2037717.
[56] ATLAS Collaboration. Muon reconstruction performance of the ATLAS detector in proton-proton collision data at $\sqrt{s} = 13$ TeV. In: Eur. Phys. J. C76.5 (2016), p. 292. arXiv: 1603.05598 [hep-ex].

[57] Cacciari, M., Salam, G., and Soyez, G. The anti-$k_t$ jet clustering algorithm. In: JHEP 04 (2008), p. 063. arXiv: 0802.1189 [hep-ph].
[58] Aaboud, M. et al. Jet energy scale measurements and their systematic uncertainties in proton-proton collisions at $\sqrt{s} = 13$ TeV with the ATLAS detector. In: Phys. Rev. D96.7 (2017), p. 072002. arXiv: 1703.09665 [hep-ex].
[59] Aad, G. et al. Performance of pile-up mitigation techniques for jets in $pp$ collisions at $\sqrt{s} = 8$ TeV using the ATLAS detector. In: Eur. Phys. J. C76.11 (2016), p. 581. arXiv: 1510.03823 [hep-ex].
[60] Nachman, B. et al. Jets from Jets: Re-clustering as a tool for large radius jet reconstruction and grooming at the LHC. In: JHEP 02 (2015), p. 075. arXiv: 1407.2922 [hep-ph].
[61] ATLAS Collaboration. Optimisation of the ATLAS b-tagging performance for the 2016 LHC Run. ATL-PHYS-PUB-2016-012. 2016. url: https://cds.cern.ch/record/2160731.
[62] ATLAS Collaboration. Electron efficiency measurements with the ATLAS detector using the 2015 LHC proton-proton collision data. ATLAS-CONF-2016-024 (2016). url: https://cds.cern.ch/record/2157687.


[63] ATLAS Collaboration. Reconstruction, Energy Calibration, and Identification of Hadronically Decaying Tau Leptons in the ATLAS Experiment for Run-2 of the LHC. ATL-PHYS-PUB-2015-045. 2015. url: https://atlas.web.cern.ch/Atlas/GROUPS/PHYSICS/PUBNOTES/ATL-PHYS-PUB-2015-045.
[64] ATLAS Collaboration. The ATLAS Simulation Infrastructure. In: Eur. Phys. J. C 70 (2010), p. 823. arXiv: 1005.4568 [physics.ins-det].
[65] GEANT4 Collaboration, S. Agostinelli et al. GEANT4: A Simulation toolkit. In: Nucl. Instrum. Meth. A 506 (2003), p. 250.
[66] ATLAS Collaboration. The simulation principle and performance of the ATLAS fast calorimeter simulation FastCaloSim. 2010. url: https://cds.cern.ch/record/1300517.
[67] Cacciari, M., Salam, G. P., and Soyez, G. FastJet User Manual. In: Eur. Phys. J. C 72 (2012), p. 1896. arXiv: 1111.6097 [hep-ph].
[68] Sjostrand, T., Mrenna, S., and Skands, P. Z. A Brief Introduction to PYTHIA 8.1. In: Comput. Phys. Commun. 178 (2008), p. 852. arXiv: 0710.3820 [hep-ph].
[69] Lange, D. J. The EvtGen particle decay simulation package. In: Nucl. Instrum. Meth. A462 (2001), pp. 152–155.
[70] Gleisberg, T. et al. Event generation with SHERPA 1.1. In: JHEP 0902 (2009), p. 007. arXiv: 0811.4622 [hep-ph].
[71] Alwall, J. et al. The automated computation of tree-level and next-to-leading order differential cross sections, and their matching to parton shower simulations. In: JHEP 1407 (2014), p. 079. arXiv: 1405.0301 [hep-ph].
[72] ATLAS Collaboration. ATLAS Pythia 8 tunes to 7 TeV data. ATL-PHYS-PUB-2014-021. 2014. url: https://cds.cern.ch/record/1966419.
[73] Djouadi, A., Kalinowski, J., and Spira, M. HDECAY: A Program for Higgs boson decays in the standard model and its supersymmetric extension. In: Comput. Phys. Commun. 108 (1998), pp. 56–74. arXiv: hep-ph/9704448 [hep-ph].
[74] Alioli, S. et al. A general framework for implementing NLO calculations in shower Monte Carlo programs: the POWHEG BOX. In: JHEP 1006 (2010), p. 043. arXiv: 1002.2581 [hep-ph].


[75] ATLAS Collaboration. Studies on top-quark Monte Carlo modelling for Top2016. ATL-PHYS-PUB-2016-020. 2016. url: https://cds.cern.ch/record/2216168.
[76] ATLAS Collaboration. ATLAS Run 1 Pythia8 tunes. ATL-PHYS-PUB-2014-021. 2011. url: http://cdsweb.cern.ch/record/1966419.
[77] Cacciari, M. et al. Top-pair production at hadron colliders with next-to-next-to-leading logarithmic soft-gluon resummation. In: Phys. Lett. B710 (2012), p. 612. arXiv: 1111.5869 [hep-ph].
[78] Bärnreuther, P., Czakon, M., and Mitov, A. Percent Level Precision Physics at the Tevatron: First Genuine NNLO QCD Corrections to $q\bar{q} \to t\bar{t}$. In: Phys. Rev. Lett. 109 (2012), p. 132001. arXiv: 1204.5201 [hep-ph].
[79] Czakon, M. and Mitov, A. NNLO corrections to top-pair production at hadron colliders: the all-fermionic scattering channels. In: JHEP 1212 (2012), p. 054. arXiv: 1207.0236 [hep-ph].
[80] Czakon, M. and Mitov, A. NNLO corrections to top-pair production at hadron colliders: the quark-gluon reaction. In: JHEP 1301 (2013), p. 080. arXiv: 1210.6832 [hep-ph].
[81] Czakon, M., Fiedler, P., and Mitov, A. The total top quark pair production cross-section at hadron colliders through $O(\alpha_S^4)$. In: Phys. Rev. Lett. 110 (2013), p. 252004. arXiv: 1303.6254 [hep-ph].
[82] Czakon, M. and Mitov, A. Top++: A Program for the Calculation of the Top-Pair Cross-Section at Hadron Colliders. In: Comput. Phys. Commun. 185 (2014), p. 2930. arXiv: 1112.5675 [hep-ph].
[83] ATLAS Collaboration. Further studies on simulation of top-quark production for the ATLAS experiment at $\sqrt{s} = 13$ TeV. ATL-PHYS-PUB-2016-016. 2016. url: https://cds.cern.ch/record/2205262.
[84] Bellm, J. et al. Herwig 7.0/Herwig++ 3.0 release note. In: Eur. Phys. J. C 76.4 (2016), p. 196. arXiv: 1512.01178 [hep-ph].
[85] Sjöstrand, T., Mrenna, S., and Skands, P. Z. PYTHIA 6.4 Physics and Manual. In: JHEP 0605 (2006), p. 026. arXiv: hep-ph/0603175.
[86] ATLAS Collaboration. Search for the Standard Model Higgs boson produced in association with top quarks and decaying into $b\bar{b}$ in $pp$ collisions at $\sqrt{s} = 8$ TeV with the ATLAS detector. In: Eur. Phys. J. C 75 (2015), p. 349. arXiv: 1503.05066 [hep-ex].


[87] Re, E. Single-top Wt-channel production matched with parton showers using the POWHEG method. In: Eur. Phys. J. C71 (2011), p. 1547. arXiv: 1009.2450 [hep-ph].
[88] Alioli, S. et al. NLO single-top production matched with shower in POWHEG: s- and t-channel contributions. In: JHEP 09 (2009), p. 111. arXiv: 0907.4076 [hep-ph].
[89] Cascioli, F. et al. NLO matching for $t\bar{t}b\bar{b}$ production with massive b-quarks. In: Phys. Lett. B734 (2014), pp. 210–214. arXiv: 1309.5912 [hep-ph].
[90] Gleisberg, T. and Höche, S. Comix, a new matrix element generator. In: JHEP 0812 (2008), p. 039. arXiv: 0808.3674 [hep-ph].
[91] Frixione, S. et al. Single-top hadroproduction in association with a W boson. In: JHEP 0807 (2008), p. 029. arXiv: 0805.3067 [hep-ph].
[92] Bahr, M. et al. Herwig++ Physics and Manual. In: Eur. Phys. J. C58 (2008), pp. 639–707. arXiv: 0803.0883 [hep-ph].
[93] ATLAS Collaboration. Search for the Standard Model Higgs boson produced in association with top quarks and decaying into $b\bar{b}$ in $pp$ collisions at $\sqrt{s} = 8$ TeV with the ATLAS detector. In: Eur. Phys. J. C75.7 (2015), p. 349. arXiv: 1503.05066 [hep-ex].
[94] ATLAS Collaboration. Search for the Standard Model Higgs boson decaying into $b\bar{b}$ produced in association with top quarks decaying hadronically in $pp$ collisions at $\sqrt{s} = 8$ TeV with the ATLAS detector. In: JHEP 05 (2016), p. 160. arXiv: 1604.03812 [hep-ex].
[95] CMS Collaboration. Search for the associated production of the Higgs boson with a top-quark pair. In: JHEP 09 (2014), p. 087. arXiv: 1408.1682 [hep-ex].
[96] ATLAS and CMS Collaborations. Measurements of the Higgs boson production and decay rates and constraints on its couplings from a combined ATLAS and CMS analysis of the LHC $pp$ collision data at $\sqrt{s} = 7$ and 8 TeV. In: JHEP 08 (2016), p. 045. arXiv: 1606.02266 [hep-ex].
[97] ATLAS Collaboration. Search for flavour-changing neutral current top quark decays $t \to Hq$ in $pp$ collisions at $\sqrt{s} = 8$ TeV with the ATLAS detector. In: JHEP 12 (2015), p. 061. arXiv: 1509.06047 [hep-ex].


[98] ATLAS Collaboration. Measurement of event shapes at large momentum transfer with the ATLAS detector in $pp$ collisions at $\sqrt{s} = 7$ TeV. In: Eur. Phys. J. C 72 (2012), p. 2211. arXiv: 1206.2135 [hep-ex].
[99] Bernaciak, C. et al. Fox-Wolfram Moments in Higgs Physics. In: Phys. Rev. D87 (2013), p. 073014. arXiv: 1212.4436 [hep-ph].
[100] Brun, R. and Rademakers, F. ROOT: An object oriented data analysis framework. In: Nucl. Instrum. Meth. A389 (1997), pp. 81–86.
[101] ATLAS Collaboration. Physics at a High-Luminosity LHC with ATLAS (Update). ATL-PHYS-PUB-2012-004. 2012. url: https://cds.cern.ch/record/1484890.
[102] ATLAS Collaboration. Jet calibration and systematic uncertainties for jets reconstructed in the ATLAS detector at $\sqrt{s} = 13$ TeV. In: ATLAS-PHYS-PUB-2015-015 (2015). url: http://cdsweb.cern.ch/record/2037613.
[103] ATLAS Collaboration. Studies of $t\bar{t}+c\bar{c}$ production with MadGraph5_aMC@NLO and Herwig++ for the ATLAS experiment. ATL-PHYS-PUB-2016-011. 2016. url: https://cds.cern.ch/record/2153876.
[104] Moneta, L. et al. The RooStats Project. In: PoS ACAT2010 (2010), p. 057. arXiv: 1009.1003 [physics.data-an].
[105] James, F. and Roos, M. Minuit: A System for Function Minimization and Analysis of the Parameter Errors and Correlations. In: Comput. Phys. Commun. 10 (1975), pp. 343–367.
[106] ATLAS Collaboration. Measurements of Higgs boson properties in the diphoton decay channel with 36 fb$^{-1}$ of $pp$ collision data at $\sqrt{s} = 13$ TeV with the ATLAS detector. CERN-EP-2017-288. 2017.
[107] ATLAS Collaboration. Measurement of the Higgs boson coupling properties in the $H \to ZZ^* \to 4\ell$ decay channel at $\sqrt{s} = 13$ TeV with the ATLAS detector. In: JHEP 03 (2018), p. 095. arXiv: 1712.02304 [hep-ex].
[108] ATLAS Collaboration. Measurement of colour flow with the jet pull angle in $t\bar{t}$ events using the ATLAS detector at $\sqrt{s} = 8$ TeV. In: Phys. Lett. B 750 (2015), p. 475. arXiv: 1506.05629 [hep-ex].


[109] ATLAS Collaboration. Measurement of colour flow using jet-pull observables in $t\bar{t}$ events with the ATLAS experiment at $\sqrt{s} = 13$ TeV. ATLAS-CONF-2017-069. 2017. url: https://cds.cern.ch/record/2285807.
[110] D0 Collaboration. Measurement of color flow in $t\bar{t}$ events from $p\bar{p}$ collisions at $\sqrt{s} = 1.96$ TeV. In: Phys. Rev. D83 (2011), p. 092002. arXiv: 1101.0648 [hep-ex].
[111] Gallicchio, J. and Schwartz, M. D. Seeing in Color: Jet Superstructure. In: Phys. Rev. Lett. 105 (2010), p. 022001. arXiv: 1001.5027 [hep-ph].
[112] Ellis, R. K. and Stirling, W. J. Introduction to Perturbative QCD. In: Proceedings, 1990 Summer School in High-energy Physics and Cosmology. Ed. by J. C. Pati et al. Vol. 7. World Scientific, 1991, pp. 275–279.
[113] Wilk, F. Private communication. Feb. 14, 2018.
[114] Cacciari, M., Salam, G. P., and Soyez, G. The Catchment Area of Jets. In: JHEP 04 (2008), p. 005. arXiv: 0802.1188 [hep-ph].
[115] ATLAS and CMS Collaborations. Combined Measurement of the Higgs Boson Mass in $pp$ Collisions at $\sqrt{s} = 7$ and 8 TeV with the ATLAS and CMS Experiments. In: Phys. Rev. Lett. 114 (2015), p. 191803. arXiv: 1503.07589 [hep-ex].
[116] Apollinari, G. et al. High Luminosity Large Hadron Collider HL-LHC. In: CERN Yellow Report 5 (2015), pp. 1–19. arXiv: 1705.08830.
