
Master-Thesis University of Hamburg

Search for a Singly Produced Excited Bottom Quark Decaying to a Top Quark and a W Boson at √s = 13 TeV with the CMS Experiment

Alexander Fröhlich
Born on 15.02.1991 in Hamburg

08.11.2017
1st reviewer: Prof. Dr. Johannes Haller
2nd reviewer: Dr. Roman Kogler

Abstract

A search for a singly produced excited bottom quark b∗ decaying to a top quark and a W boson in the muon+jets decay channel in proton-proton collisions at √s = 13 TeV is presented. The analyzed dataset was recorded by the CMS experiment in 2016 and corresponds to an integrated luminosity of 35.87 fb−1.

This is the first search for new physics to use the HOTVR top tagger. The capabilities of the HOTVR top tagger are demonstrated and a comparison to another top tagging algorithm is given.

No significant excess of data over the prediction is observed. Upper limits on the b∗ production cross section at 95 % confidence-level are set. These limits are used to exclude b∗ for purely left-handed, purely right-handed and vector-like benchmark couplings up to masses of 2050 GeV/c2, 2150 GeV/c2 and 2350 GeV/c2, respectively.

Furthermore, the sensitivity of this search to a singly produced vector-like B0 quark decaying to a top quark and a W boson in the muon+jets decay channel is demonstrated and upper cross section limits are set.

Summary (Zusammenfassung)

A search for a singly produced excited bottom quark b∗ decaying to a top quark and a W boson in the muon+jets decay channel is presented. The search was performed in proton-proton collisions at √s = 13 TeV recorded by the CMS experiment in 2016. The analyzed dataset corresponds to an integrated luminosity of 35.87 fb−1.

This is the first search for new physics to use the HOTVR algorithm for the identification of top jets. The capabilities of HOTVR are demonstrated and a comparison with another algorithm for top-jet identification is given.

No significant excess of data over the Standard Model prediction was observed. Upper limits on the b∗ production cross section at 95 % confidence level were set. These limits were then used to exclude b∗ quarks for purely left-handed, purely right-handed and vector-like benchmark couplings up to masses of 2050 GeV/c2, 2150 GeV/c2 and 2350 GeV/c2, respectively.

Furthermore, the sensitivity of this search to a singly produced vector-like B0 quark decaying to a top quark and a W boson in the muon+jets decay channel is demonstrated and upper limits on the production cross section are set.

Contents

1 Introduction

2 Theory
  2.1 The Standard Model
  2.2 Shortcomings of the Standard Model
  2.3 Beyond the Standard Model
  2.4 Heavy Bottom Quarks
    2.4.1 Excited Bottom Quarks
    2.4.2 Vector-Like B Quarks
  2.5 Standard Model Backgrounds
  2.6 Proton-Proton Collisions

3 Jet Clustering and Top Tagging
  3.1 Jet Clustering Algorithms
    3.1.1 Heavy Object Tagger with Variable R
  3.2 Top Tagging
    3.2.1 Jet Substructure
    3.2.2 b Tagging
    3.2.3 CMS Top Tagger
    3.2.4 HOTVR Top Tagger

4 Experimental Setup
  4.1 The Large Hadron Collider
  4.2 The Compact Muon Solenoid Experiment
    4.2.1 Coordinate System
    4.2.2 Inner Tracking System
    4.2.3 Electromagnetic Calorimeter
    4.2.4 Hadronic Calorimeter
    4.2.5 Solenoid Magnet
    4.2.6 Muon System
    4.2.7 Trigger

5 Object Reconstruction and Identification
  5.1 Particle Flow Algorithm
  5.2 Primary Vertex
  5.3 Muons
  5.4 Jets
    5.4.1 Jet Energy Corrections
  5.5 Other Variables
    5.5.1 E_T^miss
    5.5.2 S_T

6 Studies on the HOTVR Algorithm
  6.1 Jet Energy Corrections for HOTVR Jets
  6.2 Top Tagging Performance

7 Analysis
  7.1 Dataset and Simulated Events
  7.2 Event Selection
    7.2.1 Pre-Selection
    7.2.2 Top Jet Selection using HOTVR
    7.2.3 Top Jet Selection using the CMS top tagger
  7.3 Reconstruction of the Invariant tW Mass
    7.3.1 Neutrino Reconstruction
    7.3.2 Reconstruction of the tW System
    7.3.3 Determining the Goodness of the Hypothesis
    7.3.4 Comparison of the Different Top Tagging Methods in the Reconstruction
  7.4 Systematic Uncertainties

8 Results
  8.1 Statistical Interpretation
  8.2 Further improvements of the Analysis

9 Summary

1 Introduction

The Standard Model of elementary particle physics is a theory describing the fundamental building blocks of our universe, the elementary particles, and their interactions. It is extremely successful in providing precise predictions of experimental results. With the discovery of the Higgs boson in 2012 [1,2], the last missing piece of the Standard Model was found. There are, however, still some questions left unanswered by the Standard Model, for example the question of naturalness, which arises from the hierarchy problem and the large number of free parameters in the Standard Model. In order to approach those open questions, many possible extensions to the Standard Model have been proposed. Many of these extensions predict the existence of new, heavy quarks. With the discovery of the Higgs boson, a fourth generation of chiral quarks is ruled out [3]. There are, however, other possibilities to introduce new heavy quarks, either as vector-like quarks or as excited states of the already existing quarks.

The analysis presented in this thesis focuses on the search for excited bottom quarks b∗ decaying to a top quark and a W boson in the final state with one muon and jets. Due to the similarity of their signatures, the search is also sensitive to heavy vector-like B0 quarks in the same final state. The analysis is performed using data from proton-proton collisions at a center-of-mass energy of 13 TeV, recorded by the Compact Muon Solenoid (CMS) experiment at the Large Hadron Collider (LHC). The dataset used was recorded in 2016 and corresponds to an integrated luminosity of 35.87 fb−1. Events with a highly energetic muon and missing transverse energy are selected to reconstruct the leptonically decaying W boson. Then, top tagging is used to identify the jet originating from the hadronically decaying top quark. The tW system is reconstructed using a χ² method, where the back-to-back signature of the signal process is exploited to discriminate signal from the dominant tt̄ background. The invariant mass of the tW system is used to search for excited bottom quarks and heavy vector-like B0 quarks over a wide range of possible masses, reaching from 700 GeV/c2 to 3000 GeV/c2. The reliable identification of top quarks over such a wide kinematic range is important to maintain sensitivity over the whole mass spectrum. In this search, top tagging is performed with the Heavy Object Tagger with Variable R (HOTVR) [4]. By adapting the jet size to its transverse momentum pT, HOTVR achieves a stable top tagging efficiency over a wide pT spectrum. This is the first search for new physics to implement this new top tagging algorithm. A performance comparison of the HOTVR top tagger with the soft drop top tagger is presented.

This thesis is structured as follows. The theoretical background for this thesis is given in chapter 2, where the Standard Model and possible extensions are discussed. Furthermore, an introduction to excited bottom quarks, as well as heavy vector-like quarks is given. In chapter 3, the jet clustering algorithms and top tagging methods used in this analysis are described. A brief description of the LHC and the CMS detector is given in chapter 4. The reconstruction and identification of physics objects from detector data is described in chapter 5. In chapter 6, the performance of the HOTVR algorithm is demonstrated and a comparison with the CMS top tagger is shown. The analysis is presented in chapter 7 and the results are discussed in chapter 8. Finally, a summary is given in chapter 9.

2 Theory

2.1 The Standard Model

The Standard Model of particle physics (SM) is a quantum field theory describing all known elementary particles and their interactions via three fundamental forces, the strong force, the weak force, and the electromagnetic force, see for example [5, 6]. According to the SM, all observed matter is fundamentally made up of fermions. Fermions are particles carrying half-integer spin. There are twelve different fermions and their respective antifermions contained in the SM. They are further categorized into quarks and leptons. The interactions between the fermions are mediated by the exchange of spin-1 particles called gauge bosons.

The SM is formulated as a gauge theory with internal symmetry of the SU(3)C × SU(2)L × U(1)Y symmetry group. That means the SM Lagrangian is invariant under local gauge transformations of this group. This invariance eventually gives rise to the three fundamental interactions and their respective gauge bosons.

The electromagnetic interaction is described by the theory of quantum electrodynamics (QED). It postulates a local gauge symmetry of the Lagrangian under phase transformations of the U(1)Q group. QED introduces a massless gauge boson, the photon, that mediates the electromagnetic interaction between particles carrying electromagnetic charge Q.

The theory of the strong interaction, quantum chromodynamics (QCD), is formulated similarly to QED. It postulates local gauge invariance under transformations of the non-abelian SU(3)C symmetry group and describes the interaction between particles carrying color charge C. Color charge can have three states called red, green and blue. The only fermions in the SM that carry color are quarks. Their interaction via the strong force is mediated by the exchange of massless gauge bosons called gluons. In total, QCD introduces eight gluons, which also have color charge, since they arise from the generators of a non-abelian symmetry group. This leads to a property of the strong force called color confinement. As a result of the interaction between gluons, the potential between quarks increases with the distance between them. If the distance becomes large enough, new quark-antiquark pairs are formed from the potential energy stored in the field. Due to this process, called hadronization, the quarks can only appear in bound, colorless states called hadrons. Because of the hadronization, quarks and gluons cannot be observed directly. When these particles are produced, their final states consist of cascades of many charged and neutral hadrons, called jets. The top quark is the only exception to this. It is the only quark that decays before the hadronization takes place. This is due to the high mass of the top quark of mtop = 172.44 ± 0.13(stat.) ± 0.47(syst.) GeV/c2 [7], and the resulting large phase space for the decay.

On the other hand, the strong interaction becomes insignificant at short distances, so that the quarks inside of hadrons can be considered free particles. This effect is called asymptotic freedom. It is an important feature of QCD, because it allows the calculation of processes of the strong interaction at high energies in perturbation theory. This further allows the factorization ansatz in the calculation of deep inelastic proton-proton scattering cross sections described later in this chapter.

The weak interaction arises from the invariance under SU(2)L transformations. However, for a consistent description of the weak interaction as a gauge theory, it has to be unified with the electromagnetic interaction into the electroweak interaction [8, 9]. Above the electroweak unification energy, the electromagnetic interaction and the weak interaction are combined into a single interaction. Below this energy, the electroweak symmetry is broken, and the two separate interactions are obtained. The underlying gauge symmetry of the electroweak theory is of the SU(2)L × U(1)Y symmetry group. The U(1)Y sector gives rise to the B gauge boson. Instead of the electromagnetic charge Q, the hypercharge Y, given by

\[ Y = 2(Q - T_3) , \tag{2.1} \]

is conserved. Here, T3 is the third component of the weak isospin. The SU(2)L sector gives rise to three gauge bosons W1, W2 and W3. The W± bosons of the weak interaction are obtained by the superposition of W1 and W2:

\[ W^{\pm} = \frac{1}{\sqrt{2}} \left( W_1 \mp i W_2 \right) . \tag{2.2} \]

The Z0 boson and the photon (γ) are mixed states of the W3 and B bosons. The mixing is represented by a rotation

\[ \begin{pmatrix} \gamma \\ Z^0 \end{pmatrix} = \begin{pmatrix} \cos\theta_W & \sin\theta_W \\ -\sin\theta_W & \cos\theta_W \end{pmatrix} \begin{pmatrix} B \\ W_3 \end{pmatrix} , \tag{2.3} \]

where θW is the Weinberg angle.

The W± bosons couple only to particles with weak isospin. A special feature of the weak interaction is that in the SM only fermions with left-handed chirality and anti-fermions with right-handed chirality carry weak isospin. This property of the weak interaction is called parity violation. Left-handed fermions form isospin doublets, while right-handed fermions are isospin singlets. Leptons can thus be arranged as shown here:

\[ \begin{pmatrix} \nu_e \\ e \end{pmatrix}_L , \quad \begin{pmatrix} \nu_\mu \\ \mu \end{pmatrix}_L , \quad \begin{pmatrix} \nu_\tau \\ \tau \end{pmatrix}_L , \quad e_R, \ \mu_R, \ \tau_R . \tag{2.4} \]

In the lower row the charged leptons, electrons (e), muons (µ) and tauons (τ), are shown. They carry electric charge of Q = −e, where e is the elementary charge, and weak isospin of T3 = −1/2. In the upper row of the doublets are the electron, muon and tauon neutrinos (νe, νµ and ντ). They have weak isospin of T3 = 1/2 and do not carry electric charge. This means neutrinos interact only weakly, which makes them difficult to detect. Since neutrinos are assumed to be massless in the SM, there are no right-handed neutrinos.

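Quarks can be arranged similarly to the leptons:

\[ \begin{pmatrix} u \\ d \end{pmatrix}_L , \quad \begin{pmatrix} c \\ s \end{pmatrix}_L , \quad \begin{pmatrix} t \\ b \end{pmatrix}_L , \quad u_R, \ d_R, \ c_R, \ s_R, \ t_R, \ b_R . \tag{2.5} \]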

In the upper row the so-called up-type quarks, the up, charm and top quark, are shown. They have weak isospin of T3 = 1/2 and electric charge of Q = 2/3 e. The down-type quarks, shown in the lower row, are the down, strange and bottom quark, which have weak isospin of T3 = −1/2 and electric charge of Q = −1/3 e.
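As a quick cross-check of Eq. (2.1), the hypercharge of each fermion field can be computed from the charges and isospin components quoted above. The following is a minimal sketch; the function name and the dictionary layout are illustrative choices, and T3 = 0 is used for the right-handed singlets.

```python
from fractions import Fraction as F


def hypercharge(q, t3):
    """Hypercharge from Eq. (2.1): Y = 2 (Q - T3), with Q in units of e."""
    return 2 * (q - t3)


# (electric charge Q in units of e, third component of the weak isospin T3)
fields = {
    "nu_L": (F(0), F(1, 2)),
    "e_L": (F(-1), F(-1, 2)),
    "e_R": (F(-1), F(0)),
    "u_L": (F(2, 3), F(1, 2)),
    "d_L": (F(-1, 3), F(-1, 2)),
    "u_R": (F(2, 3), F(0)),
    "d_R": (F(-1, 3), F(0)),
}

Y = {name: hypercharge(q, t3) for name, (q, t3) in fields.items()}
for name, y in Y.items():
    print(f"{name}: Y = {y}")
```

Both members of a left-handed doublet end up with the same hypercharge (−1 for the leptons, 1/3 for the quarks), as expected for a quantum number that commutes with SU(2)L.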

The different isospin doublets are referred to as generations. The charged gauge bosons of the weak interaction, W±, mediate the transition within one generation. It is, however, observed in experiments that the W± also mediate transitions between different generations of quarks. These transitions happen because the mass eigenstates of the quarks are not the same as their weak eigenstates. The mixing of the weak eigenstates is implemented by rotating the down-type quark mass eigenstates with a unitary matrix called the Cabibbo-Kobayashi-Maskawa matrix (CKM matrix) VCKM,

\[ \begin{pmatrix} d' \\ s' \\ b' \end{pmatrix} = V_{\mathrm{CKM}} \begin{pmatrix} d \\ s \\ b \end{pmatrix} = \begin{pmatrix} V_{ud} & V_{us} & V_{ub} \\ V_{cd} & V_{cs} & V_{cb} \\ V_{td} & V_{ts} & V_{tb} \end{pmatrix} \begin{pmatrix} d \\ s \\ b \end{pmatrix} , \tag{2.6} \]

with [10]

\[ |V_{\mathrm{CKM}}| = \begin{pmatrix} |V_{ud}| & |V_{us}| & |V_{ub}| \\ |V_{cd}| & |V_{cs}| & |V_{cb}| \\ |V_{td}| & |V_{ts}| & |V_{tb}| \end{pmatrix} = \begin{pmatrix} 0.97434^{+0.00011}_{-0.00012} & 0.22506 \pm 0.0005 & 0.00357 \pm 0.00015 \\ 0.22492 \pm 0.0005 & 0.97351 \pm 0.00013 & 0.0411 \pm 0.0013 \\ 0.00875^{+0.00032}_{-0.00033} & 0.0403 \pm 0.0013 & 0.99915 \pm 0.00005 \end{pmatrix} . \tag{2.7} \]

The squares of the matrix elements give the probability of a certain transition.
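To make this statement concrete, the squared magnitudes can be evaluated from the central values quoted in Eq. (2.7). The sketch below ignores the uncertainties; the variable names are illustrative. Since VCKM is unitary, the squares in each row are expected to sum to one within the precision of the quoted values.

```python
# Central values of |V_ij| from Eq. (2.7); the quoted uncertainties are ignored.
V_ABS = [
    [0.97434, 0.22506, 0.00357],  # u -> d, s, b
    [0.22492, 0.97351, 0.0411],   # c -> d, s, b
    [0.00875, 0.0403, 0.99915],   # t -> d, s, b
]

# Squared magnitudes |V_ij|^2, i.e. the relative weights of the transitions.
weights = [[v * v for v in row] for row in V_ABS]
row_sums = [sum(row) for row in weights]

for up_type, total in zip("uct", row_sums):
    print(f"sum_j |V_{up_type}j|^2 = {total:.5f}")
print(f"|V_tb|^2 = {weights[2][2]:.4f}")
```

The value of |Vtb|² close to one reflects that the top quark decays almost exclusively into a b quark and a W boson, which is the decay used for top tagging in this analysis.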

Without an additional mechanism, all SM particles would have to be massless, since explicit mass terms would violate the gauge invariance of the Lagrangian. This is in contradiction to experimental results. This problem can be solved by introducing the mechanism of spontaneous symmetry breaking, developed independently by Brout, Englert and Higgs [11–13]. They introduced a new complex scalar field φ. The gauge-invariant Lagrangian of this new field contains a new potential, called the Higgs potential. The form of this potential leads to a non-zero vacuum expectation value of the scalar field, which spontaneously breaks the electroweak symmetry. Excitations of the field around the minimum of the potential are interpreted as a new boson with spin 0, the Higgs boson. The mechanism of spontaneous symmetry breaking introduces new terms to the SM Lagrangian, giving mass to the Higgs boson and the W± and Z0 bosons and describing their interactions with the Higgs boson. Furthermore, fermions obtain their masses via the interaction with the Higgs field. In the SM, the masses are free parameters.

The Higgs boson was discovered in 2012 at the LHC [1,2] with a mass of mH = 125.09 ± 0.21(stat.) ± 0.11(syst.) GeV/c2 [14], completing the SM.

2.2 Shortcomings of the Standard Model

The SM is very successful at giving precise predictions of experimental measurements. There are, however, some phenomena observed in nature that cannot be explained with the SM. This section gives a brief description of some of these shortcomings.

Gravity is the only known fundamental interaction that is not described by the SM. This is due to the fact that there is to this point no consistent formulation of the theory of general relativity as a quantum field theory. The gravitational interaction is many orders of magnitude weaker than the other fundamental interactions. It is therefore negligible in current particle physics experiments. Nevertheless, for a complete and consistent description of the fundamental interactions in the universe, gravity has to be included.

Another problem concerning the scale of gravity is the hierarchy problem. It considers the large difference between the energy scale of the electroweak interaction (O(10^3 GeV)) and the energy scale of gravity, which is the Planck scale (O(10^19 GeV)). All particles in the SM receive quantum corrections to their masses. For the Higgs boson, these corrections can become huge, since they depend on the scale up to which the SM is valid, e.g. the Planck scale. In order to explain the light Higgs boson in terms of the SM, the corrections would have to be fine-tuned to cancel out in a way that they result in the observed Higgs mass. Furthermore, since all fermions obtain their masses by the coupling to the Higgs field, the fermion masses cannot be derived from the SM, but are free parameters. The large number of free SM parameters, as well as the fine-tuning of these parameters, are considered unnatural.

2.3 Beyond the Standard Model

Various possible extensions to the SM have been proposed to address the shortcomings discussed above. All extensions postulate the existence of new particles. A selection of possible extensions is introduced in this section.

Supersymmetry A famous type of extension to the SM is Supersymmetry (SUSY) [15]. It assumes a fundamental symmetry between fermions and bosons. In SUSY models, each elementary particle has a supersymmetric partner with the same quantum numbers, but with different spin. The partner particles of fermions are bosons and vice versa. The contributions of the new particles to the loop corrections cancel out with the contributions of their SM partners, thus solving the hierarchy problem. To this point, there is no evidence for SUSY at low energies, indicating that SUSY particles would have higher masses than their SM partners. As a consequence, the loop corrections would not cancel completely, resulting in a smaller version of the hierarchy problem.

Warped Extra Dimensions The physics described in the SM takes place in four dimensions, three in space and one in time. However, there is no reason to limit the physics to these four dimensions. The Randall-Sundrum (RS) model [16] proposes a five-dimensional universe, where the fifth dimension is warped and spans between two (3+1)-dimensional branes. All SM particles are localized on one of the branes, while gravitons, the mediators of gravity, are localized on the other brane. Gravitons can propagate through the fifth dimension towards the SM brane. Traveling along the warped dimension reduces their energy, explaining the weakness of gravity and thus addressing the hierarchy problem.

Compositeness In compositeness models [17,18], the SM fermions are assumed to be not elementary particles, but composed of new, more fundamental particles. This drastically reduces the number of free parameters in the SM, since the fermion masses are no longer fundamental quantities. Compositeness would also give an explanation for the seeming symmetry between leptons and quarks.

Composite Higgs Another type of compositeness models, the composite Higgs models (CHM) [19], address the hierarchy problem by assuming that the Higgs boson is not elementary, but composed of new, undiscovered particles. CHMs predict new particles with masses around O(1 TeV), which are excited states of the composite Higgs boson. This avoids the problem of unnaturalness by introducing a new physics scale and explains the small mass of the Higgs boson.

2.4 Heavy Bottom Quarks

Many of the above described extensions to the SM predict the existence of excited quark states or heavy vector-like quarks. This analysis focuses on the search for excited bottom quarks (b∗), but due to the similarity of the processes, it is expected to be also sensitive to heavy vector-like B0 quarks. This section focuses on the description of the b∗, while the B0 is briefly introduced in the end.

2.4.1 Excited Bottom Quarks

The b∗ is an excited state of the b quark, assuming a composite b quark. It is a spin-1/2 particle, has, like the b quark, a charge of −1/3 e and forms a color triplet. Its interactions are described by an effective Lagrangian [20]. As described above, the W bosons couple only to SM fermions with left-handed chirality, but this need not hold for new particles. Instead, the coupling of the b∗ to the W boson is a free parameter. Three different benchmark models are considered. In the purely left-handed and purely right-handed models, only the left-handed or right-handed chiral projections of the b∗ couple to the W boson, respectively. The vector-like model assumes the left-handed and right-handed couplings to be equally strong. The predicted cross sections from these three models are compared to experimentally observed cross sections. The main production process of b∗ at the LHC would be single production via the interaction of a b quark and a gluon. Pair production of b∗ is also possible, but it is highly suppressed due to the high mass of the b∗. Possible decay modes of the b∗ are gb, bZ, bH and tW. The branching ratios of these decay modes are shown in Fig. 2.1 as a function of the b∗ mass. The decay to tW is the dominant process at high b∗ masses, reaching a plateau of almost 40 %. The search for b∗ presented in this thesis is therefore performed in the tW decay channel. The corresponding leading order Feynman diagram is shown in Fig. 2.2.

Searches for b∗ decaying to tW have already been performed by CMS [22] and ATLAS [23]. The latest analysis was done by CMS at √s = 8 TeV. They were able to set mass exclusion limits for the three different benchmark models. The current limits exclude b∗ masses up to 1390 GeV/c2, 1430 GeV/c2 and 1530 GeV/c2 for the left-handed, right-handed and vector-like benchmark models, respectively.

Figure 2.1: Branching ratio of the heavy bottom quark (B0) decay modes as a function of B0 mass [21].

Figure 2.2: Leading order Feynman diagram of the process bg → b∗ → tW.

2.4.2 Vector-Like B Quarks

Unlike the b∗, the B0 is a completely new kind of quark [24,25], rather than an excited state. It is a spin-1/2 particle, has a charge of −1/3 e and forms a color triplet. It is called vector-like, because the left-handed and right-handed components both transform in the same way under transformations of the electroweak symmetry group.

Vector-like B0s are singly produced at the LHC via the weak interaction as seen in Fig. 2.3. The possible decay modes of the B0 include gb, bZ, bH and tW. The branching fractions to the different decay modes depend on the specific model. For this analysis, a benchmark model with BR(B0 → bZ) = BR(B0 → tW) = 0.5 is assumed. The final states of the B0 decay all contain an additional quark and either a top or a b quark, due to the production process. In this analysis only the production with an associated b quark is considered.

Figure 2.3: Leading order Feynman diagram of the considered production and decay process of the vector-like B0.

2.5 Standard Model Backgrounds

The presented search focuses on the muon+jets final state of the b∗ decay into tW. The leading order Feynman diagram for this decay is shown in Fig. 2.4. In this specific final state, the W boson from the top quark decays hadronically, while the W boson from the b∗ decays into a muon and a neutrino. There are various SM processes that can result in the same or in a similar final state. Processes with the same signature are called irreducible backgrounds, while all other processes are called reducible backgrounds. The considered SM processes contributing to the expected background are discussed in the following.

tt̄ + jets The production of top-antitop (tt̄) pairs in association with jets constitutes the main background in this analysis. If one of the top quarks decays leptonically, this decay mode is hard to distinguish from signal events. Only constraints on the extra b jet and on the event topology can reduce this background.

W + jets The production of a W boson and extra jets can have a similar signature as the signal, when the W boson decays leptonically. The requirement for heavy flavor jets (top-jets and b-jets) reduces this background.

Figure 2.4: Feynman diagram of the b∗ decay into tW in the muon+jets final state.

Single top The production of a single top quark in association with a W boson has the same signature as the signal. This makes it an irreducible background.

Other backgrounds The other considered SM backgrounds are the production of a Z boson with additional jets (Z+jets), diboson production (WW, ZZ, WZ) and the production of multiple jets via the strong interaction (QCD). All of these backgrounds are reducible and make only small contributions to the total expected background.

There are of course more background processes, but their contribution is negligible due to their small cross-sections.

2.6 Proton-Proton Collisions

The dynamics of elementary particles are studied in scattering experiments. The presented search analyzes data from proton-proton (pp) collisions. This section gives a brief description of the physics of pp collisions and the inner structure of the proton.

The proton structure was studied in deep inelastic electron-proton scattering experiments. These experiments found that the proton is not an elementary particle, but a complex formation of quarks and gluons. Every proton consists of three valence quarks (u, u, d). Arising from the strong interaction between the valence quarks, the proton additionally contains several gluons and short-lived quark-antiquark pairs, the so-called sea quarks.

In this context, all proton constituents are generally referred to as partons. High energy scattering of protons is hence described as the scattering of partons. The partons each carry only a fraction x of the proton's momentum. At leading order, this fraction is described by the Bjorken x,

\[ x = \frac{Q^2}{2 P \cdot q} . \tag{2.8} \]

Here, Q² = −q² is the transferred momentum squared and P is the proton's four-momentum. Consequently, the center-of-mass energy of a scattering event in pp collisions is reduced by the value of x of both participating partons. The effective center-of-mass energy √ŝ is given by

\[ \sqrt{\hat{s}} = \sqrt{x_1 x_2 s} . \tag{2.9} \]

In general, the values of x of the two interacting partons are not equal, resulting in a boost of the event in z direction, where z is the direction of the beam axis (see section 4.2.1). As a result, the z component of the total momentum vector of a single interaction is unknown. Instead, the transverse momentum pT, given by

\[ p_T = \sqrt{p_x^2 + p_y^2} , \tag{2.10} \]

is used in pp collision experiments, because it is invariant under Lorentz transformations in z direction.
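The two relations above can be illustrated numerically. The following sketch evaluates Eq. (2.9) for an arbitrary pair of momentum fractions and checks that pT from Eq. (2.10) is unchanged by a boost along the beam axis. The function names, the chosen x values and the example four-momentum are assumptions made purely for illustration.

```python
import math


def effective_cm_energy(x1, x2, sqrt_s):
    """Eq. (2.9): sqrt(s_hat) = sqrt(x1 * x2 * s)."""
    return math.sqrt(x1 * x2) * sqrt_s


def transverse_momentum(p):
    """Eq. (2.10) for a four-momentum p = (E, px, py, pz)."""
    return math.hypot(p[1], p[2])


def boost_along_z(p, beta):
    """Lorentz boost along the beam axis with velocity beta (units with c = 1)."""
    e, px, py, pz = p
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return (gamma * (e - beta * pz), px, py, gamma * (pz - beta * e))


sqrt_s = 13000.0  # GeV, proton-proton center-of-mass energy
sqrt_s_hat = effective_cm_energy(0.2, 0.05, sqrt_s)  # illustrative x values

p = (50.0, 30.0, 40.0, 0.0)  # massless example four-momentum in GeV
p_boosted = boost_along_z(p, 0.6)

print(f"sqrt(s_hat) = {sqrt_s_hat:.1f} GeV")
print(f"pT before boost: {transverse_momentum(p):.1f} GeV")
print(f"pT after boost:  {transverse_momentum(p_boosted):.1f} GeV")
print(f"pz after boost:  {p_boosted[3]:.1f} GeV")
```

With these inputs only a tenth of the nominal 13 TeV is available to the hard interaction, and the boost changes E and pz while leaving pT untouched.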

The cross section σ of a certain process in pp collisions is given by the cross section of the parton interaction σ̂, convoluted with the parton density functions (PDFs) fi and fj,

\[ \sigma_{pp} = \sum_{i,j} \iint \mathrm{d}x_1 \, \mathrm{d}x_2 \, f_i(x_1, Q^2) \, f_j(x_2, Q^2) \, \hat{\sigma}_{i,j}(x_1, x_2, Q^2) . \tag{2.11} \]

The PDFs give the probability to encounter a certain parton as a function of x and Q². A measurement of the PDFs can be seen in Fig. 2.5. At low values of x, the pp scattering is dominated by gluon interactions, while the valence quark interaction becomes more likely in the high x region.

Figure 2.5: PDFs for quarks, antiquarks and gluons as a function of x at Q² = 10 GeV²/c² and Q² = 10^4 GeV²/c² [26]. Shown are the MSTW 2008 NLO PDFs (68% C.L.) as x f(x, Q²) versus x; the gluon distribution is scaled by 1/10.

3 Jet Clustering and Top Tagging

Quarks and gluons cannot be observed directly due to color confinement. At high energies, they result in final states of many charged and neutral hadrons. These hadrons can be clustered into jets to simplify the final state of a process. Since the investigated final state in this analysis features a top quark, the identification of jets originating from top quark decays plays an important role. This chapter gives a description of the jet clustering algorithms and the top tagging methods used in this analysis.

3.1 Jet Clustering Algorithms

In general there are two types of jet clustering algorithms, cone algorithms and sequential recombination algorithms. The focus of this section will be on sequential recombination algorithms, since the algorithms used are of this type.

Sequential recombination algorithms build jet candidates by pairing four-momenta, also referred to as pseudojets, using the distance parameters dij and diB, defined as

\[ d_{ij} = \min\left( p_{T,i}^{2k}, \, p_{T,j}^{2k} \right) \frac{\Delta R_{ij}^2}{R^2} , \tag{3.1a} \]

\[ d_{iB} = p_{T,i}^{2k} . \tag{3.1b} \]

Here, dij describes the distance between two pseudojets i and j and diB describes the distance between a pseudojet and the beam, pTi is the transverse momentum of the pseudojet i, ∆Rij = √((yi − yj)² + (φi − φj)²) is the distance between the pseudojets i and j in the y-φ-plane, and R is the cone size parameter. In this case, y = ½ ln((E + pz)/(E − pz)) denotes the rapidity. For each combination of pseudojets i and j, the values of dij and diB are calculated. If one of the dij is the smallest, the pseudojets i and j are paired, i.e. their four-momenta are added. Otherwise, if the smallest value is one of the diB, the pseudojet i is added to the jet collection and removed from the list of pseudojets.

Figure 3.1: A simulated event clustered with different jet algorithms. The found jets and their respective area are shown [27].

In Eq. (3.1), k is a free parameter. The choice of k changes the behavior of the clustering algorithm with respect to the pT of the pseudojets. Three prominent algorithms, which are defined by their choice of k, are briefly introduced. The kT algorithm [28,29] chooses k = 1, meaning softer pseudojets are paired preferably. The resulting jets have a more irregular shape, see Fig. 3.1. The Cambridge/Aachen algorithm (CA) [30,31] uses k = 0, i.e. jets are combined according solely to their geometrical distance. This approach makes it most suitable for many substructure algorithms. For the anti-kT algorithm [27] k = −1 is chosen. In this case, hard pseudojets are combined first, resulting in more uniformly shaped jets.
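The procedure described above can be sketched in a few lines. The following is a naive, unoptimized illustration of sequential recombination with the distance measures of Eq. (3.1); it is not the implementation used in the analysis. The function names, the tuple layout (px, py, pz, E), the wrap-around of the azimuthal difference and the three example particles are assumptions made for this sketch, and all pseudojets are assumed to have pT > 0.

```python
import math


def pt(p):
    """Transverse momentum of a four-momentum p = (px, py, pz, E)."""
    return math.hypot(p[0], p[1])


def rapidity(p):
    return 0.5 * math.log((p[3] + p[2]) / (p[3] - p[2]))


def delta_r2(a, b):
    """Squared distance in the y-phi plane, with phi wrapped to [-pi, pi)."""
    dy = rapidity(a) - rapidity(b)
    dphi = math.atan2(a[1], a[0]) - math.atan2(b[1], b[0])
    dphi = (dphi + math.pi) % (2.0 * math.pi) - math.pi
    return dy * dy + dphi * dphi


def cluster(particles, R=0.4, k=-1):
    """Sequential recombination; k = 1 (kT), 0 (CA), -1 (anti-kT)."""
    pseudojets = [tuple(p) for p in particles]
    jets = []
    while pseudojets:
        d_min, pair = None, None
        for i, pi in enumerate(pseudojets):
            d_ib = pt(pi) ** (2 * k)  # Eq. (3.1b)
            if d_min is None or d_ib < d_min:
                d_min, pair = d_ib, (i, None)
            for j in range(i + 1, len(pseudojets)):
                pj = pseudojets[j]
                d_ij = (min(pt(pi) ** (2 * k), pt(pj) ** (2 * k))
                        * delta_r2(pi, pj) / R ** 2)  # Eq. (3.1a)
                if d_ij < d_min:
                    d_min, pair = d_ij, (i, j)
        i, j = pair
        if j is None:
            jets.append(pseudojets.pop(i))  # beam distance is smallest
        else:
            merged = tuple(a + b for a, b in zip(pseudojets[i], pseudojets[j]))
            pseudojets.pop(j)  # j > i, so remove j first
            pseudojets.pop(i)
            pseudojets.append(merged)
    return sorted(jets, key=pt, reverse=True)


def massless(pt_, y, phi):
    """Helper: massless four-momentum from (pT, y, phi)."""
    return (pt_ * math.cos(phi), pt_ * math.sin(phi),
            pt_ * math.sinh(y), pt_ * math.cosh(y))


# Two nearby particles and one recoiling particle (pT in GeV).
event = [massless(100.0, 0.0, 0.0), massless(20.0, 0.1, 0.0),
         massless(50.0, 0.0, 3.0)]
for k in (1, 0, -1):
    print(k, [round(pt(j), 1) for j in cluster(event, R=0.4, k=k)])
```

For this simple event all three choices of k give the same two jets; differences between the algorithms only show up in busier events, where the order of the recombination steps affects the jet shapes.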

All jet algorithms have to be infrared and collinear (IRC) safe. That means the calculated observables should not change for final states with or without infinitely soft (infrared) radiation of gluons or collinear splittings. The clustering algorithms used in this search fulfill these requirements.

3.1.1 Heavy Object Tagger with Variable R

The HOTVR algorithm [4] is a sequential clustering algorithm specialized in the identification of boosted, hadronically decaying, heavy particles. Unlike the sequential clustering algorithms described above, HOTVR does not use a fixed cone radius, but adapts the cone size to the jet's pT. HOTVR also introduces a mass jump criterion to identify subjets within the jet.

HOTVR uses distance parameters dij and diB as defined in Eq. (3.1), but instead of a fixed cone radius R an effective radius Reff,

$R_\mathrm{eff} = \begin{cases} R_\mathrm{min} & \text{for } \rho/p_T < R_\mathrm{min}, \\ R_\mathrm{max} & \text{for } \rho/p_T > R_\mathrm{max}, \\ \rho/p_T & \text{else,} \end{cases}$ (3.2)

is used, where ρ is a parameter describing the slope of Reff. Like the CA algorithm, HOTVR uses k = 0 for the clustering. Before combining two pseudojets, HOTVR requires them to fulfill a mass jump criterion,

θ · mij > max (mi, mj) , (3.3)

if their combined mass mij is above a certain mass threshold µ. If the criterion is not fulfilled, the lighter pseudojet is removed from the list of pseudojets. Additionally, if the mass jump criterion is fulfilled, the pseudojets have to have pT ≥ pT,sub in order to be combined, where pT,sub is a parameter of the algorithm; otherwise they are removed. Finally, if the pseudojets are combined, they are stored as subjets of the combined pseudojet.
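The decision taken for a pair of pseudojets can be summarized in a short sketch. It follows the description above. The function name and the returned labels are hypothetical, the default values are those of the default top tagging mode [4], and treating pairs with mij ≤ µ as combined without a mass jump test is an interpretation of the text rather than a statement about the reference implementation.

```python
def mass_jump_step(m_i, m_j, m_ij, pt_i, pt_j, mu=30.0, theta=0.7, pt_sub=30.0):
    """Return a label for what happens to the pseudojet pair (i, j).

    Masses in GeV/c^2, transverse momenta in GeV/c.
    """
    if m_ij <= mu:
        # Assumption: below the mass threshold no mass jump is required.
        return "combine"
    if theta * m_ij <= max(m_i, m_j):
        # Mass jump criterion of Eq. (3.3) not fulfilled.
        return "remove_lighter"
    if pt_i >= pt_sub and pt_j >= pt_sub:
        # Mass jump found and both pseudojets hard enough: keep them as subjets.
        return "combine_as_subjets"
    # Mass jump found, but the minimum subjet pT requirement is not met.
    return "remove_soft"


print(mass_jump_step(80.0, 5.0, 175.0, 200.0, 100.0))
```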

HOTVR is used in the default top tagging mode [4]; the parameter settings are shown in Table 3.1.

Parameter   Default      Description
Rmin        0.1          Minimum value of Reff.
Rmax        1.5          Maximum value of Reff.
ρ           600 GeV/c    Slope of Reff.
µ           30 GeV/c²    Mass jump threshold.
θ           0.7          Mass jump strength.
pT,sub      30 GeV/c     Minimum pT of subjets.

Table 3.1: Parameters of the HOTVR algorithm in default top-tagging mode [4].
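As an illustration of Eq. (3.2) with the default parameters of Table 3.1, the effective radius can be written as a clamped function of the jet pT. This is a minimal sketch; the function name is hypothetical.

```python
def effective_radius(pt, rho=600.0, r_min=0.1, r_max=1.5):
    """Effective radius of Eq. (3.2): rho / pT, clamped to [r_min, r_max].

    pt and rho are given in GeV/c.
    """
    return min(max(rho / pt, r_min), r_max)


for pt in (200.0, 600.0, 1200.0, 10000.0):
    print(pt, effective_radius(pt))
```

With the default values, jets with pT below 400 GeV/c are clustered with Rmax = 1.5, and the lower bound Rmin = 0.1 is only reached above 6000 GeV/c.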

3.2 Top Tagging

An important discriminator against many SM backgrounds in this analysis is the presence of a jet originating from a boosted top quark. In order to identify this top-jet, a method called top tagging is used. There are various top tagging algorithms, which identify top-jets based on different variables, like the jet mass and the number of subjets. This analysis features a new top tagging method called the HOTVR top tagger. The performance of the new algorithm is studied and compared to the standard top tagging method in CMS.

3.2.1 Jet Substructure

At the high center-of-mass energy of √s = 13 TeV even heavy objects like top quarks can be created with high transverse momentum pT. The decay products of those boosted objects are therefore collimated, such that they are likely to be clustered into a single large jet. It is useful to resolve the substructure of those large jets in order to discriminate jets originating from different initial particles. In this analysis two substructure algorithms are used to obtain top tagging variables. They are described below.

Soft Drop

The soft drop algorithm [32] is used to remove soft, wide-angle radiation from jets (grooming), leading to a better mass resolution and therefore improving the top tagging performance. Information on the jet substructure is obtained by reverting the last clustering iteration. In this step, called declustering, the jet is divided into two subjets with a distance of ∆R1,2. The algorithm then checks the soft drop criterion

$\frac{\min(p_{T,1}, p_{T,2})}{p_{T,1} + p_{T,2}} > z_\mathrm{cut} \left(\frac{\Delta R_{1,2}}{R_0}\right)^{\beta}$ , (3.4)

where R0 is the cone size of the jet, and zcut and β are free parameters to control the behavior of the algorithm. If the criterion is not fulfilled, the softer subjet is removed and the declustering continues with the harder subjet. The procedure is repeated until the soft drop criterion is fulfilled.

For optimal performance of the soft drop algorithm, jets clustered with the CA algorithm are required. The so-called soft drop mass is obtained for anti-kT jets by clustering the jet constituents again with the CA algorithm and taking the invariant mass of the groomed CA jet.
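The soft drop criterion of Eq. (3.4) and its repeated application can be sketched as follows. The declustering itself is not implemented here: the list of splittings along the harder branch is assumed to be given, and the values zcut = 0.1 and β = 0 in the example are illustrative choices that are not taken from this section.

```python
def passes_soft_drop(pt1, pt2, dr12, r0, z_cut, beta):
    """Soft drop criterion of Eq. (3.4) for one declustering step."""
    z = min(pt1, pt2) / (pt1 + pt2)
    return z > z_cut * (dr12 / r0) ** beta


def first_passing_split(splittings, r0, z_cut, beta):
    """Index of the first splitting that fulfills the criterion, or None.

    splittings: declustering steps along the harder branch, each given as
    (pt1, pt2, dr12), starting with the last clustering iteration.
    """
    for index, (pt1, pt2, dr12) in enumerate(splittings):
        if passes_soft_drop(pt1, pt2, dr12, r0, z_cut, beta):
            return index
    return None


# Hypothetical declustering history of a jet with R0 = 0.8.
history = [(390.0, 10.0, 0.7), (300.0, 90.0, 0.4)]
print(first_passing_split(history, r0=0.8, z_cut=0.1, beta=0.0))
```

In this example the first splitting is dropped as soft, wide-angle radiation, and the grooming stops at the second one.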

N-subjettiness

N-subjettiness [33] is a jet substructure variable that measures the compatibility of the jet with having up to N subjets. It is defined as

$\tau_N = \frac{1}{d_0} \sum_k p_{T,k} \min(\Delta R_{1,k}, \Delta R_{2,k}, \ldots, \Delta R_{N,k})$ , (3.5)

where ∆Ri,k is the distance between a hypothetical subjet axis i and the jet constituent k. A normalization factor

$d_0 = \sum_k p_{T,k} R_0$ (3.6)

is used to restrict τN to values between 0 and 1. A value of τN close to 0 indicates good compatibility with the jet having up to N subjets. A value close to 1 means that the jet is likely to have more than N subjets.

The ratio τ3/2 = τ3/τ2 is often used as a discriminating variable for identifying jets originating from top quarks. Jets from top quarks are expected to have three subjets, i.e. τ3 is expected to be close to 0 while τ2 is expected to be close to 1, resulting in τ3/2 < 1. Jets from light quarks, gluons or vector bosons on the other hand are expected to have fewer than three subjets. For these jets, τ2 and τ3 are expected to be of similar size, resulting in τ3/2 ≈ 1.
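The definition of τN can be made concrete with a small sketch. It assumes that the subjet axes are already known; in practice they have to be determined by a separate procedure that is not shown here. The function names and the toy jet are hypothetical.

```python
import math


def _delta_r(y1, phi1, y2, phi2):
    """Distance in the y-phi plane with the azimuthal difference wrapped."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(y1 - y2, dphi)


def n_subjettiness(constituents, axes, r0):
    """tau_N of Eq. (3.5) for N = len(axes).

    constituents: list of (pt, y, phi); axes: list of (y, phi).
    """
    d0 = sum(pt for pt, _, _ in constituents) * r0  # Eq. (3.6)
    weighted = sum(
        pt * min(_delta_r(y, phi, ay, aphi) for ay, aphi in axes)
        for pt, y, phi in constituents
    )
    return weighted / d0


# Toy jet with two well separated constituents and R0 = 0.8.
jet = [(50.0, 0.0, 0.0), (50.0, 0.0, 0.4)]
tau1 = n_subjettiness(jet, [(0.0, 0.2)], r0=0.8)
tau2 = n_subjettiness(jet, [(0.0, 0.0), (0.0, 0.4)], r0=0.8)
print(tau1, tau2)
```

For this two-prong toy jet, τ2 vanishes while τ1 does not, so the ratio τ2/τ1 is small. The same logic applies to τ3/2 for three-prong top-jets.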

3.2.2 b Tagging

The top quark decays almost exclusively into a b quark and a W boson. It is therefore useful to be able to distinguish between b-jets and jets originating from light quarks or gluons.

The b-jet identification exploits the longevity of B mesons. Due to their long lifetime, the B mesons travel an average distance of 1 mm in the detector before they decay. This results in a secondary vertex, measurable with the pixel detector. In this analysis the optimized Combined Secondary Vertex (CSVv2) algorithm [34, 35] is used for b tagging. CSVv2 uses a multivariate approach to combine information from jet tracks and reconstructed secondary vertices into a single variable. The CSVv2 variable ranges from 0 to 1, where a value close to 1 means the jet is likely to originate from a b quark.

In this analysis a jet is considered to originate from a b quark if the CSVv2 variable is larger than 0.935. This corresponds to the tight identification working point. The misidentification rate at this working point is 0.1 %, while the efficiency of the tagger is 49 % [35].

3.2.3 CMS Top Tagger

The top tagging method recommended by CMS uses the soft drop mass as the discriminating variable. It requires groomed AK8 jets with pT > 400 GeV/c. Four different working points are provided for the CMS top tagger [36]. Top-jet candidates are required to have a soft drop mass of 105 GeV/c² < mjet < 220 GeV/c². The different working points are obtained by varying the threshold on the N-subjettiness variable τ3/2. The top tagging performance can be further improved by requiring one of the subjets to fulfill the loose working point of the CSVv2 b-tagger.

3.2.4 HOTVR Top Tagger

The HOTVR algorithm offers its own set of substructure variables, which can be calculated from the HOTVR subjets. The following criteria are used for the standard HOTVR top tagging working point [4]:

• The number of subjets has to be Nsubjet ≥ 3.

• The jet mass is required to be within 140 GeV/c² < Mjet < 220 GeV/c².

• The fraction of the leading subjet's pT with respect to the jet pT is fpT = pT,1/pT < 0.8.

• The lowest combined mass of two of the first three subjets has to fulfill Mpair > 50 GeV/c².

This selection on its own should be sufficient to achieve a good top tagging performance. However, for a better comparison to the CMS top tagger, an additional requirement on the N-subjettiness variable τ3/2 is imposed.
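The criteria of the standard HOTVR working point can be collected in a single predicate. This is a minimal sketch with hypothetical argument names. Since the threshold of the additional τ3/2 requirement is not quoted in this section, it is left as an optional parameter and no value is assumed.

```python
def passes_hotvr_top_tag(n_subjets, m_jet, pt_jet, pt_leading_subjet, m_pair_min,
                         tau32=None, tau32_max=None):
    """Standard HOTVR top tagging working point.

    Masses in GeV/c^2, transverse momenta in GeV/c. The tau32 requirement is
    applied only if a threshold tau32_max is passed by the caller.
    """
    if n_subjets < 3:
        return False
    if not 140.0 < m_jet < 220.0:
        return False
    if pt_leading_subjet / pt_jet >= 0.8:
        return False
    if m_pair_min <= 50.0:
        return False
    if tau32_max is not None and (tau32 is None or tau32 >= tau32_max):
        return False
    return True


print(passes_hotvr_top_tag(3, 175.0, 500.0, 300.0, 80.0))
```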

4 Experimental Setup

The data used in this thesis was recorded with the CMS experiment at the LHC. This chapter gives a brief description of the LHC and the components of the CMS detector.

4.1 The Large Hadron Collider

The LHC is a superconducting synchrotron and storage ring with a circumference of 27 km. It was designed for proton-proton collisions at a center-of-mass energy of √s = 14 TeV with an instantaneous luminosity of $10^{34}\,\mathrm{cm}^{-2}\,\mathrm{s}^{-1}$ [37]. Collisions with heavy ions (e.g. Pb-Pb) are also possible. It is located at the European Organization for Nuclear Research ("Conseil Européen pour la Recherche Nucléaire" - CERN) near Geneva, Switzerland. The purpose of the LHC is to perform precision measurements on the SM including the search for the Higgs boson and to search for new phenomena beyond the SM to solve questions left open by the SM.

Before the protons are injected into the LHC they are prepared and accelerated in a chain of pre-accelerators. Those are the Linac 2, the Proton Synchrotron Booster (PSB), the Proton Synchrotron (PS) and the Super Proton Synchrotron (SPS). In the SPS the protons reach the required energy of 450 GeV to be injected into the two beam pipes of the LHC. The protons are then accelerated to the final collision energy using superconducting radio-frequency (RF) cavities. The proton beam is divided into bunches due to the RF used.

The protons travel in the two beam pipes in opposite directions around the ring. They are bent on their circular path by 1232 superconducting dipole magnets. The beam pipes intersect at four points along the ring, where the experiments are located. These experiments are ALICE (A Large Ion Collider Experiment) [38], ATLAS (A Toroidal LHC ApparatuS) [39], CMS (Compact Muon Solenoid) [40] and LHCb (LHC-beauty) [41]. ATLAS and CMS are multi-purpose detectors used for proton-proton collision experiments. LHCb is a detector focused on B physics and ALICE is a detector used for heavy ion collision experiments. The dataset analyzed in this thesis was recorded with the CMS experiment. It is described in more detail in section 4.2.

Figure 4.1: Integrated luminosity of proton-proton collisions at √s = 13 TeV delivered by LHC (blue), and recorded by CMS (orange) in 2016 as a function of time [42].

One of the main goals of the LHC is the discovery of new physics. The expected number of events N of a certain process occurring in a collider experiment is given by

N = Lintσ , (4.1)

where σ is the cross-section of the process and Lint is the integrated luminosity. Since the cross-section of new physics processes is expected to be small compared to SM processes, according to Eq. (4.1), a high integrated luminosity is required to produce a sufficient number of events, i.e. a lot of proton-proton collisions have to be recorded. The integrated luminosity is given by

$L_\mathrm{int} = \int L \, \mathrm{d}t$ , (4.2)

where L is the instantaneous luminosity. The instantaneous luminosity is a measure for the number of events per cross section and time. It is given by

$L = N_b f_\mathrm{rev} \frac{n_1 n_2}{4 \pi \sigma_x \sigma_y}$ . (4.3)

Here, Nb is the number of bunches, frev is the revolution frequency, n1 and n2 are the numbers of protons in the colliding bunches, and σx and σy are the transverse beam widths in x- and y-direction, respectively. At the design luminosity of L = $10^{34}\,\mathrm{cm}^{-2}\,\mathrm{s}^{-1}$ the LHC is operated with Nb = 2808 bunches per beam, n1 = n2 = 1.15 × $10^{11}$ protons per bunch, and a revolution frequency of frev = 11.25 kHz [37]. The beam profile is measured in special runs with Van der Meer scans [43]. In these runs, the proton beams are scanned through each other and the event rate is measured as a function of the displacement, to calculate the beam width. The luminosity measurement in CMS is described in detail in [44].

The integrated luminosity in proton-proton collisions at √s = 13 TeV delivered by the LHC and recorded by CMS in 2016 can be seen in Fig. 4.1. The LHC delivered an integrated luminosity of 40.82 fb−1 of which 37.76 fb−1 were recorded by CMS.
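Equations (4.1) and (4.3) can be evaluated numerically with the design values quoted above. The following sketch assumes equal, round beams colliding head-on, so that Eq. (4.3) can be solved for a single beam width; the function names and the cross section of 1 pb are hypothetical and serve as an example only.

```python
import math


def instantaneous_luminosity(n_bunches, f_rev, n1, n2, sigma_x, sigma_y):
    """Eq. (4.3). With lengths in cm and f_rev in Hz the result is in cm^-2 s^-1."""
    return n_bunches * f_rev * n1 * n2 / (4.0 * math.pi * sigma_x * sigma_y)


def round_beam_width(luminosity, n_bunches, f_rev, n_protons):
    """Beam width in cm for sigma_x = sigma_y and n1 = n2, from Eq. (4.3)."""
    return math.sqrt(n_bunches * f_rev * n_protons ** 2 / (4.0 * math.pi * luminosity))


def expected_events(l_int_fb, cross_section_fb):
    """Eq. (4.1) with the integrated luminosity in fb^-1 and the cross section in fb."""
    return l_int_fb * cross_section_fb


width_cm = round_beam_width(1e34, 2808, 11.25e3, 1.15e11)
print(f"beam width: {width_cm * 1e4:.1f} um")

# Hypothetical process with a cross section of 1 pb = 1000 fb in the recorded dataset.
print(f"expected events: {expected_events(37.76, 1000.0):.0f}")
```

Under these simplifying assumptions the design parameters correspond to a beam width of roughly 18 µm at the interaction point.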

4.2 The Compact Muon Solenoid Experiment

The CMS experiment is a multi-purpose detector to record particles produced in proton-proton collisions in the LHC. The CMS detector is located at one of the four interaction points of the LHC. It is 21.6 m long and has a diameter of 14.6 m. The detector is divided into the central barrel part and the endcaps at each side of the barrel, as shown in Fig. 4.2.

The CMS detector consists of several different sub-detector layers that each serve a special purpose for measuring and identifying particles produced in proton-proton collisions. Listing from the inside out, these layers are the tracking system, the electromagnetic and the hadronic calorimeters (ECAL and HCAL), the superconducting solenoid, and several alternating layers of the return yoke and the muon chambers. The following sections describe these sub-systems based on information from [40,45].

Figure 4.2: Schematic overview of the CMS detector [45].

4.2.1 Coordinate System

The coordinate system used by CMS to describe a position within the detector is centered at the nominal interaction point of the proton beams. The y-axis points upwards, the x-axis points inwards toward the center of the LHC, and the z-axis points in counterclockwise direction along the beam axis. It is convenient to use spherical coordinates when referring to the position of particles within the detector. The azimuthal angle φ is measured in the x-y-plane starting from the x-axis, the polar angle θ is measured starting from the z-axis and r is measured as the radial distance to the beam axis. In addition, the pseudorapidity η is introduced, because, unlike for θ, differences in η are invariant under Lorentz transformations in z-direction. The relation between η and θ is given by

$\eta = -\ln\left[\tan\left(\frac{\theta}{2}\right)\right]$ . (4.4)

Figure 4.3: Schematic cross section view of the CMS tracking system [40].

Using these coordinates, a Lorentz invariant distance measure in the φ-η-plane can be expressed as

$\Delta R = \sqrt{(\Delta\phi)^2 + (\Delta\eta)^2}$ . (4.5)
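Equations (4.4) and (4.5) translate directly into code. The sketch below uses hypothetical function names and additionally wraps the azimuthal difference into the interval [−π, π), which is an implementation detail not spelled out in the text.

```python
import math


def pseudorapidity(theta):
    """Eq. (4.4): eta = -ln(tan(theta / 2)), with theta in radians."""
    return -math.log(math.tan(theta / 2.0))


def theta_from_eta(eta):
    """Inverse of Eq. (4.4)."""
    return 2.0 * math.atan(math.exp(-eta))


def delta_r(eta1, phi1, eta2, phi2):
    """Eq. (4.5), with the azimuthal difference wrapped to [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)


# Polar angle, in degrees, that corresponds to |eta| = 2.5.
print(math.degrees(theta_from_eta(2.5)))
```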

4.2.2 Inner Tracking System

The inner tracking system is located in the heart of the CMS detector. Its purpose is the measurement of the trajectories of charged particles in order to reconstruct their charge and momentum. It is also used to reconstruct secondary vertices. The tracking system is 5.8 m long and 2.5 m in diameter. Its first layer is only 4.4 cm away from the proton beam. The high particle flux and the fast rate of bunch crossing require a high granularity, a quick response and radiation hardness of the detector components. On the other hand the material budget should be kept as low as possible to minimize the energy loss in the tracking system. Therefore the layout of the inner tracking system features an inner part consisting of a silicon pixel detector with 66 million pixels, and an outer part consisting of silicon strip detectors with a total number of 9.6 million strips. The layout is shown in Fig. 4.3.

The pixel detector (PIXEL) is composed of three cylindrical layers in the barrel part and two endcap discs on each side. Each pixel has a size of 100 µm × 150 µm. The strip detector is divided into three subsystems. The tracker inner barrel (TIB) and tracker inner disc (TID) make up the first subsystem. They consist of four layers in the TIB and three discs at each end in the TID. The strips in TIB and TID have a pitch of 320 µm. Around the TIB and TID the tracker outer barrel (TOB) is located. The TOB is made up of six layers with 500 µm pitch. At each end of the TOB are the tracker endcaps (TEC+ and TEC-, where the sign indicates the orientation on the z-axis). The TECs are each composed of 9 discs with up to seven rings of silicon strips with a pitch of 320 µm in the inner four rings and 500 µm in the outer three rings.

The inner tracking system has an overall acceptance of up to |η| = 2.5.

4.2.3 Electromagnetic Calorimeter

The ECAL is built directly around the inner tracking system. It is used to measure the energy of particles that interact mostly via the electromagnetic interaction, i.e. electrons and photons. This is done by stopping those particles within the ECAL volume, such that they deposit all their energy in the calorimeter. When high-energy particles interact with the ECAL they cause cascades of secondary particles, so-called electromagnetic showers. The amount of material needed to contain the electromagnetic shower is dictated by two material properties, the radiation length X0 and the Molière radius. The radiation length is defined as the mean distance an electron has to traverse in the material until its energy is reduced to a fraction 1/e of its initial value. The Molière radius is the radius of the cylinder containing 90 % of the energy of the electromagnetic shower. Distances in a calorimeter are usually given in units of X0.

The ECAL used in the CMS detector is a homogeneous calorimeter made of lead tungstate crystals. Lead tungstate has a radiation length X0 = 0.89 cm and a Molière radius of 2.2 cm. The energy deposited in the calorimeter is measured by exploiting the scintillation process. When particles interact with the ECAL, they excite the lead tungstate, which re-emits the energy as photons that can be detected using photo detectors. The number of photons produced is proportional to the deposited energy. In lead tungstate, 80 % of the scintillation light is emitted within 25 ns. These properties allow a compact construction design of the ECAL with a fine granularity.

The barrel section of the ECAL (EB) is composed of crystals with a front face cross section of 22 mm × 22 mm and a length of 230 mm (25.8 X0). It has an inner radius of 1.29 m and covers a pseudorapidity interval of 0 < |η| < 1.479. The ECAL endcaps (EE) consist of crystals with a front face cross section of 28.6 mm × 28.6 mm and a length of 220 mm (24.7 X0). They are located at a distance of 3.14 m in z from the nominal interaction point and cover a pseudorapidity interval of 1.479 < |η| < 3.0. An overview of the ECAL layout is shown in Fig. 4.4.

Figure 4.4: Schematic overview of the ECAL layout [40].

The ECAL has a relative energy resolution of [40]

$\frac{\sigma_E}{E} = \frac{2.8\,\%}{\sqrt{E/\mathrm{GeV}}} \oplus \frac{12\,\%}{E/\mathrm{GeV}} \oplus 0.3\,\%$ . (4.6)

There are three different contributions to the energy resolution, as shown in Eq. (4.6). The first term is the stochastic term, describing statistical fluctuations in the shower development. The second term is called noise term and describes the contribution of electronic noise. The last term describes constant contributions to the relative energy resolution, such as calibration errors and non-linearity in the readout elements.
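The symbol ⊕ in Eq. (4.6) denotes a sum in quadrature. A short numerical sketch, with a hypothetical function name, shows how the three terms combine at a given energy.

```python
import math


def ecal_relative_resolution(energy_gev, stochastic=0.028, noise=0.12, constant=0.003):
    """Relative energy resolution sigma_E / E of Eq. (4.6) as a fraction."""
    return math.sqrt(
        (stochastic / math.sqrt(energy_gev)) ** 2
        + (noise / energy_gev) ** 2
        + constant ** 2
    )


for energy in (10.0, 100.0, 1000.0):
    print(energy, f"{100.0 * ecal_relative_resolution(energy):.2f} %")
```

At 100 GeV the stochastic and the constant term are of similar size and the resolution is about 0.43 %. At higher energies the constant term dominates.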

4.2.4 Hadronic Calorimeter

Hadrons lose only a fraction of their energy in the ECAL, since the hadronic interaction length λI is much longer than X0. The HCAL is built around the ECAL to stop the hadrons and measure their energy. It is also a crucial component in the measurement of missing transverse energy $\slashed{E}_T$. Unlike the ECAL, the HCAL is a sampling calorimeter. It consists of alternating layers of active medium and passive absorber material. This choice was made to fit most of the HCAL into the solenoid. The HCAL uses plastic scintillator tiles as the active medium. They are embedded between layers of brass, which acts as the absorber material. The absorber material has a short interaction length; for brass it is λI = 16.42 cm. Hadrons interacting with the absorber material develop a shower of secondary particles. These particle showers can then be measured in the scintillators. Sampling calorimeters have a worse energy resolution because they only measure a fraction of the deposited energy. On the other hand they allow the measurement of the longitudinal shower profile. The overall energy resolution is [46]

$\frac{\sigma_E}{E} = \frac{115.3\,\%}{\sqrt{E/\mathrm{GeV}}} \oplus 5.5\,\%$ . (4.7)

Figure 4.5: Schematic overview of the HCAL layout [45].

The HCAL features four different parts, shown in Fig. 4.5. In the barrel region the hadron barrel (HB) and the hadron outer barrel (HO) calorimeters are located. The HB is placed directly above the EB and covers a region of 0 < |η| < 1.3. The HO is placed outside the solenoid. It is used to improve the shower containment in the central region of 0 < |η| < 1.26. The endcaps consist of the hadron endcap (HE) and the hadron forward (HF) calorimeters. HE covers a region of 1.3 < |η| < 3.0, while HF covers a region of 2.9 < |η| < 5.0.

4.2.5 Solenoid Magnet

The superconducting solenoid is used to produce a homogeneous magnetic field of B = 4 T in z-direction on its inside. It is 12.9 m long and has an inner diameter of 5.9 m. Contained in this volume are the inner tracking system, as well as the ECAL and HCAL. This is done to maximize the magnetic field in the inner tracking system, while ensuring that most of the energy is absorbed in the calorimeters and not in the solenoid.

The magnetic field bends the trajectories of charged particles via the Lorentz force. The curvature of the trajectories is used to determine the momentum and the charge of those particles. The magnetic field is returned through the iron yokes on the outside of the solenoid. Embedded between the iron yoke layers is the muon system. This improves the measurement of muon momenta.

4.2.6 Muon System

Muons can traverse the ECAL, HCAL and even the solenoid without considerable loss of energy. Nearly all other particles are absorbed before they reach the muon system. It therefore makes up the outermost part of the CMS detector. The detectors of the muon system are placed inside the return yoke of the solenoid. The muon trajectories are bent due to the magnetic field, which allows a momentum measurement like in the tracking system. This additional measurement improves the resolution of the muon momenta.

Three different kinds of gaseous detectors are used in the muon system: drift tubes (DT), resistive plate chambers (RPC) and cathode strip chambers (CSC). The layout of the muon system can be seen in Fig. 4.6. It consists of two parts, the muon barrel (MB) and the muon endcaps (ME). The MB uses DTs and RPCs. It covers a region of 0 < |η| < 1.2. The MEs use CSCs and RPCs and they cover a region of up to |η| < 2.4.

Figure 4.6: Schematic overview of the muon system [40].

4.2.7 Trigger

Operating at design luminosity the LHC provides a bunch crossing rate of 40 MHz. Considering the average disk space needed to store an event, which is O(MB), the rate of events written to disk has to be reduced. The trigger system is used for this purpose and saves only potentially interesting events. The trigger system works in two levels. The first level (L1) is a hardware based trigger that reduces the event rate to about 100 kHz. The events can be saved in a buffer for 3.2 µs. In this time the L1 trigger decides whether to keep an event based on information from the muon system and the calorimeters. Events that fulfill the L1 trigger criteria are then passed to the software based high level trigger (HLT). The HLT reconstructs the whole event with full detector granularity and resolution. It further reduces the rate to about 100 Hz based on that information. The events passing the HLT are then stored for offline analysis.
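The rate reduction achieved by the two trigger levels follows from simple arithmetic on the numbers quoted above. In the following sketch the event size of O(MB) is taken to be exactly 1 MB, which is an assumption made only to obtain an order-of-magnitude estimate of the data rate.

```python
INPUT_RATE_HZ = 40e6       # rate provided by the LHC at design luminosity
L1_OUTPUT_RATE_HZ = 100e3  # rate after the hardware based L1 trigger
HLT_OUTPUT_RATE_HZ = 100.0 # rate after the software based HLT
EVENT_SIZE_MB = 1.0        # assumption: O(MB) taken as exactly 1 MB

l1_reduction = INPUT_RATE_HZ / L1_OUTPUT_RATE_HZ
hlt_reduction = L1_OUTPUT_RATE_HZ / HLT_OUTPUT_RATE_HZ
untriggered_tb_per_s = INPUT_RATE_HZ * EVENT_SIZE_MB / 1e6
stored_mb_per_s = HLT_OUTPUT_RATE_HZ * EVENT_SIZE_MB

print(f"L1 reduction factor:  {l1_reduction:.0f}")
print(f"HLT reduction factor: {hlt_reduction:.0f}")
print(f"without trigger: {untriggered_tb_per_s:.0f} TB/s, stored: {stored_mb_per_s:.0f} MB/s")
```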

5 Object Reconstruction and Identification

The physical objects used by physics analyses have to be reconstructed from signals in the detector and interpreted as particles. In CMS stable particles are reconstructed and identified using the particle flow (PF) algorithm [47]. Quarks and gluons appear as cascades of neutral and charged hadrons, so-called jets, due to the hadronization process. Jets can be used to identify the initial particles. Jet clustering algorithms are used to identify jets using the objects from the PF algorithm.

This chapter gives a description of the PF algorithm and the jet algorithms used. Additional identification criteria for the physical objects used in this analysis are also given.

5.1 Particle Flow Algorithm

The particle flow algorithm is used to reconstruct all stable particles of a proton-proton collision using the signals (hits) of the full detector. The different particles can be identified by their characteristic interaction with the detector, as shown in Fig. 5.1. Charged particles produce signals in the tracking system while neutral particles pass it undetected. Photons and electrons lose all their energy in the ECAL and cause electromagnetic showers in the process. Charged and neutral hadrons are stopped in the HCAL and deposit their energy in the form of hadron showers. Muons and neutrinos are the only particles that can pass through the whole detector. While neutrinos leave the detector undetected, the muons produce hits in the muon detectors.

The reconstruction of an event with the PF algorithm is done in three steps. First it analyzes all sub-detectors independently by combining hits to particle trajectories (tracks) or clusters. In the second step the tracks and clusters are linked to each other. Finally the PF algorithm identifies these objects as particles by their characteristic behavior in the detector.

Figure 5.1: Sketch of signatures of different particles in a cross-section of the CMS detector [47].

Hits in the tracking system and muon detectors are combined to tracks using an iterative approach. First a seed for the track is generated using a few consecutive hits. Then, a track is built using hits in all tracker layers that are compatible with the seed track. Finally, the trajectory is fitted using a χ2-method to determine the transverse momentum, charge, and origin of the charged particle. If the track then fulfills the reconstruction criteria, it is kept for further analysis and the associated hits are removed. This procedure is repeated in several iterations to increase the efficiency of this method while keeping the misreconstruction rate low. The reconstruction criteria are loosened after every iteration.

Hits in the calorimeter cells are combined to clusters using a clustering algorithm. Like in the track reconstruction the first step is to generate a seed. Local energy maxima with respect to either four or eight calorimeter cells are selected as cluster seeds if they are above a certain energy threshold. Topological clusters are grown from the cluster seeds by adding neighboring cells to the cluster if their energy is above a given threshold. Finally the energy and position of the clusters are determined using a Gaussian mixture model with N Gaussian distributions, where N is the number of cluster seeds in the topological cluster.

After the subsystems are analyzed separately, the PF algorithm links them together into so-called PF blocks. Tracks in the inner tracking system and calorimeter clusters are linked by extrapolating the tracks into the calorimeters up to a certain depth. If the track ends in a cluster, the two are linked. Additionally, at each layer of the tracking system, tangents to the track are extrapolated to the ECAL and linked to clusters to collect photons from bremsstrahlung. Tracks in the muon detector are linked to tracks in the inner tracking system if the extrapolated trajectories match. The combined track is used in a χ2-fit to reconstruct a so-called global muon.

The final step in the PF algorithm is to identify the PF blocks as specific particles. The first particles to be identified are muons, followed by electrons and photons and finally charged and neutral hadrons. The specific identification criteria are described in detail in [47]. In this step, the PF algorithm also decides which sub-detector measurements are used to achieve the most precise reconstruction. For example, if a muon has a transverse momentum of pT < 200 GeV/c, the measurement from the inner tracking system is used. Otherwise, the momentum is determined from the track fit with the smallest χ2 value.

5.2 Primary Vertex

The primary interaction vertices are reconstructed as a part of the track reconstruction. The primary vertex reconstruction algorithm [48] uses the tracks reconstructed by the PF algorithm. The primary vertex reconstruction works in three steps. First, the algorithm selects tracks that fulfill certain quality criteria. The selected tracks are then clustered to primary vertex candidates according to their z-coordinate at their closest approach to the center of the beam spot. In the last step, the tracks in each cluster are used in a fit to determine the position of the primary vertex.

In this analysis only primary vertices fulfilling the following criteria are considered:

• The primary vertex has to be in the center of the interaction region, i.e. $\sqrt{x^2 + y^2} < 2$ cm and |z| < 24 cm.

• The number of degrees of freedom of the fit must be ≥ 4.

It is likely that there is more than one primary vertex in an event, due to the high number of protons in the colliding bunches. In this case, the vertex with the largest value of the sum of all transverse momenta squared of the associated track-level physics objects is considered the main interaction vertex. All other primary vertices are pile-up vertices.
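The vertex quality criteria and the choice of the main interaction vertex can be sketched as follows. The dictionary layout of a vertex and the function names are hypothetical and only serve to illustrate the selection logic.

```python
import math


def is_good_vertex(vertex):
    """Quality criteria: sqrt(x^2 + y^2) < 2 cm, |z| < 24 cm and ndof >= 4."""
    return (
        math.hypot(vertex["x"], vertex["y"]) < 2.0
        and abs(vertex["z"]) < 24.0
        and vertex["ndof"] >= 4
    )


def select_main_vertex(vertices):
    """Good vertex with the largest sum of squared transverse momenta, or None."""
    good = [v for v in vertices if is_good_vertex(v)]
    if not good:
        return None
    return max(good, key=lambda v: sum(pt ** 2 for pt in v["track_pts"]))


# Hypothetical event with three vertex candidates (positions in cm, pT in GeV/c).
candidates = [
    {"x": 0.01, "y": 0.00, "z": 1.0, "ndof": 10, "track_pts": [10.0, 20.0]},
    {"x": 0.00, "y": 0.02, "z": -3.0, "ndof": 8, "track_pts": [50.0, 1.0]},
    {"x": 0.00, "y": 0.00, "z": 30.0, "ndof": 20, "track_pts": [500.0]},
]
main = select_main_vertex(candidates)
print(main["z"])
```

The third candidate has the hardest track but fails the |z| requirement, so the second candidate is chosen as the main interaction vertex.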

5.3 Muons

Muons identified and reconstructed from the PF algorithm have to fulfill additional criteria to be considered as muons in this analysis. CMS provides different working points for identification of muons depending on the efficiency and misidentification rate. The tight working point was chosen for this analysis. It has the following requirements on muon candidates [49]:

• It must be reconstructed as a global muon.

• The global track fit must have χ2/d.o.f. < 10.

• At least one hit in the muon chamber must be included in the fit.

• The track in the inner tracking system must be matched to at least two muon stations.

• It must have at least 10 hits in the inner tracking system, including at least one hit in the pixel detector.

• It must have a transverse impact parameter of |dxy| < 2 mm with respect to the primary vertex.

The muon candidate is also required to have a minimum transverse momentum of pT > 50 GeV/c and to be within a pseudorapidity range of |η| < 2.4. Furthermore, the muon candidate has to be isolated, i.e. its relative isolation

$I_\mathrm{rel} = \frac{\sum p_T^{h^\pm} + \max\left(0, \sum p_T^{\gamma} + \sum p_T^{h^0} - 0.5 \sum p_T^{h^\pm,\mathrm{PU}}\right)}{p_T^{\mu,\mathrm{cand}}}$ , (5.1)

considering particles within a cone of ∆R < 0.4 around the muon candidate, must be Irel < 0.15. Here, $p_T^{\gamma}$ is the photon transverse momentum and $p_T^{h^\pm}$ and $p_T^{h^0}$ are the charged and neutral hadron transverse momenta, respectively. The transverse momentum of charged hadrons originating from a different primary vertex is denoted by $p_T^{h^\pm,\mathrm{PU}}$. It is used to approximate the contribution of neutral hadrons from different primary vertices.
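The relative isolation of Eq. (5.1) can be evaluated with a few lines of code. The sketch assumes that the maximum in Eq. (5.1) is taken with respect to zero, so that the pile-up corrected neutral contribution cannot become negative, and that all inputs are already restricted to the cone of ∆R < 0.4. The function name is hypothetical.

```python
def relative_isolation(pt_muon, charged_pts, photon_pts, neutral_pts, pileup_charged_pts):
    """Relative isolation of a muon candidate, all transverse momenta in GeV/c.

    The neutral contribution is corrected by half of the charged pile-up sum
    and is bounded from below by zero (assumption, see the text above).
    """
    neutral = sum(photon_pts) + sum(neutral_pts) - 0.5 * sum(pileup_charged_pts)
    return (sum(charged_pts) + max(0.0, neutral)) / pt_muon


i_rel = relative_isolation(100.0, [2.0, 3.0], [4.0], [1.0], [6.0])
print(i_rel, i_rel < 0.15)
```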

5.4 Jets

The PF candidates are passed to jet clustering algorithms, which combine them into jets. This analysis uses three different jet definitions. So-called AK4 jets and AK8 jets are clustered using the anti-kT algorithm [27] with a cone radius of R = 0.4 and R = 0.8, respectively. HOTVR jets are clustered using the Heavy Object Tagger with Variable R (HOTVR) algorithm [4]. After clustering, jets have to fulfill certain criteria to be kept for further analysis.

HOTVR jets and AK8 jets are required to have pT > 200 GeV/c and |η| < 2.5. AK4 jets have to have pT > 30 GeV/c and |η| < 2.4. The |η| requirement is recommended for b tagging [35]. Additionally, the AK4 jets have to fulfill the requirements of the loose identification working point [50]:

• The neutral hadron energy fraction has to be < 0.99.

• The neutral electromagnetic energy fraction has to be < 0.99.

• The jet has to have at least two constituents.

• The charged hadron energy fraction has to be > 0.

• The charged electromagnetic energy fraction has to be < 0.99.

• The charged multiplicity has to be > 0.
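The requirements of the loose identification working point listed above can be expressed as a single predicate. The argument names are hypothetical and do not correspond to a particular software interface.

```python
def passes_loose_jet_id(neutral_hadron_fraction, neutral_em_fraction, n_constituents,
                        charged_hadron_fraction, charged_em_fraction,
                        charged_multiplicity):
    """Loose jet identification working point for AK4 jets as listed above."""
    return (
        neutral_hadron_fraction < 0.99
        and neutral_em_fraction < 0.99
        and n_constituents >= 2
        and charged_hadron_fraction > 0.0
        and charged_em_fraction < 0.99
        and charged_multiplicity > 0
    )


print(passes_loose_jet_id(0.20, 0.25, 12, 0.50, 0.05, 7))
```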

5.4.1 Jet Energy Corrections

Pile-up and non-linear response of the calorimeter affect the measurement of the jet four-momentum. In order to account for those effects the jets have to be corrected. The CMS collaboration provides several different jet energy corrections (JEC) [51] for anti-kT jets that each account for a specific effect. The first corrections (L1 corrections) account for particles from pile-up events that are clustered into the jets. The next corrections (L2L3 corrections) account for non-linear response of the detector. The final corrections (L2L3Res corrections) account for residual differences between simulated events and data.

5.5 Other Variables

5.5.1 E̸T

Neutrinos and hypothetical particles that cannot be detected with the CMS detector appear as missing transverse energy (E̸T). It is defined as

$$\not{E}_T = \left|\vec{\not{E}}_T\right| = \left| -\sum_i \vec{p}_{T,i} \right| . \qquad (5.2)$$

All particle flow objects are considered in this sum. This variable is used to reconstruct neutrinos from a leptonically decaying W boson in this analysis and as a discriminating variable to reduce background events without neutrinos.
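As an illustration of Eq. (5.2), the following minimal sketch computes the missing transverse energy from a list of particle flow candidates. The representation of a candidate as a `(pt, phi)` pair is an assumption made for this example only.

```python
import math


def missing_et(pf_candidates):
    """Return (magnitude, phi) of the missing transverse energy.

    pf_candidates: iterable of (pt, phi) pairs, one per particle flow
    object (hypothetical input format chosen for this sketch).
    """
    candidates = list(pf_candidates)
    # Negative vector sum of all transverse momenta.
    px = -sum(pt * math.cos(phi) for pt, phi in candidates)
    py = -sum(pt * math.sin(phi) for pt, phi in candidates)
    return math.hypot(px, py), math.atan2(py, px)
```

Two candidates with equal pT that are back to back in φ give a vanishing missing transverse energy, while a single candidate gives a missing transverse energy of the same magnitude pointing in the opposite direction.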

5.5.2 ST

The transverse activity in an event, ST, is defined as the sum of the transverse momenta of all electrons, muons and AK4 jets plus the missing transverse energy:

$$S_T = \sum_{e,\mu,\mathrm{jets}} p_T + \not{E}_T . \qquad (5.3)$$

It is a good indicator for heavy particles in an event, because they result in final states with high pT objects. The ST of such an event is expected to have a value around the mass of the heavy particle. It is used as a discriminating variable to select potential signal events.
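A minimal sketch of Eq. (5.3) is given below. The function name and the plain lists of transverse momenta are assumptions for illustration; all inputs are taken to be in the same units.

```python
def transverse_activity(electron_pts, muon_pts, ak4_jet_pts, met):
    """Scalar sum of lepton and AK4 jet transverse momenta plus the
    missing transverse energy, following Eq. (5.3)."""
    return sum(electron_pts) + sum(muon_pts) + sum(ak4_jet_pts) + met


# Example: one muon, two AK4 jets and some missing transverse energy.
st = transverse_activity([], [60.0], [250.0, 80.0], 70.0)
```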

6 Studies on the HOTVR Algorithm

In the presented analysis, the HOTVR algorithm is implemented for the first time in a search for new physics. A method to calibrate the HOTVR jets is developed. Afterwards the performance of the HOTVR algorithm is demonstrated and compared to the CMS top tagger.

6.1 Jet Energy Corrections for HOTVR Jets

Up to now, no JEC are provided for jets clustered with HOTVR. The following approach is used to calibrate the jets. The effect of pile-up is expected to be dominant for HOTVR jets in the low and intermediate pT region, since these jets can become very large. It is studied in data and in simulated events of SM processes. A description of the event simulation using Monte-Carlo generators can be found in section 7.1. Simulations of all SM processes discussed in section 2.5 were used to obtain a reasonable prediction that can be compared to the observed data. A baseline selection, as described in section 7.2.1, is applied. The selection requires the presence of exactly one highly energetic isolated muon, E̸T > 50 GeV/c and ST > 400 GeV/c. Afterwards, at least one HOTVR jet with pT > 200 GeV/c and |η| < 2.5 is required. The effect of pile-up on the HOTVR jets can be seen by measuring the selection efficiency of the pT > 200 GeV/c requirement as a function of the number of primary vertices NPV. The selection efficiency is given by the ratio of the number of events after the selection to the number of events before it. In Fig. 6.1a, the selection efficiency as a function of NPV is shown. The red markers correspond to the selection efficiency in data, while the blue markers correspond to the selection efficiency in simulated SM events. In the lower part, the ratio between data and simulation is shown. The error bars depict the statistical uncertainties. A trend towards higher selection efficiency with higher NPV, i.e. more pile-up, is clearly visible. The pile-up dependency is described well by the simulation, as only small deviations between data and simulation are observed.

Figure 6.1: Efficiency of the selection of HOTVR jets with pT > 200 GeV/c as a function of the number of primary vertices NPV in data (red) and simulated SM events (blue) for (a) uncorrected HOTVR jets and (b) HOTVR jets with pile-up correction.
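The efficiency measurement shown in Fig. 6.1 can be sketched as follows. The input format, a list of `(npv, passed)` pairs for events passing the baseline selection, is hypothetical, and the simple binomial uncertainty is an assumption of this sketch rather than the method used for the figure.

```python
import math
from collections import defaultdict


def efficiency_vs_npv(events):
    """Selection efficiency per number of primary vertices.

    events: iterable of (npv, passed) pairs, where `passed` states whether
    the event contains a HOTVR jet fulfilling the pT requirement.
    Returns {npv: (efficiency, uncertainty)}.
    """
    n_before = defaultdict(int)
    n_after = defaultdict(int)
    for npv, passed in events:
        n_before[npv] += 1
        if passed:
            n_after[npv] += 1

    result = {}
    for npv in sorted(n_before):
        eff = n_after[npv] / n_before[npv]
        # Simple binomial approximation of the statistical uncertainty.
        err = math.sqrt(eff * (1.0 - eff) / n_before[npv])
        result[npv] = (eff, err)
    return result
```

A pile-up dependence of the jet pT shows up as a slope of the efficiency as a function of NPV; after a successful correction the efficiency is expected to be approximately flat.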

In order to account for the pile-up, an offset method is used. Under the assumption that the activity of the pile-up is distributed uniformly over the whole detector, a correction factor $a_{\mathrm{pile-up}}$ is derived with

$$a_{\mathrm{pile-up}} = 1 - \frac{\rho \cdot A_{\mathrm{jet}}}{p_T^{\mathrm{raw}}} , \qquad (6.1)$$

where ρ is the mean pT per unit area arising from pile-up, $A_{\mathrm{jet}}$ is the jet area, and $p_T^{\mathrm{raw}}$ is the uncorrected jet pT. Control distributions of ρ and the subjet area after the event selection are shown in Fig. 6.2. Data is shown as black markers and the different simulated processes are shown as colored areas. The simulated events are stacked to allow a comparison between data and simulation. In the lower parts, the ratio between data and simulation is shown. The error bars correspond to the statistical uncertainties. The calculated corrections are applied to the subjets of each HOTVR jet. As shown in Fig. 6.1b, the pile-up dependency is removed almost completely using this offset method.
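The offset correction of Eq. (6.1) can be illustrated with a short sketch. The function names are hypothetical, and flooring the factor at zero is an assumption of this example that is not stated in the text.

```python
def pileup_correction_factor(rho, area, pt_raw):
    """Correction factor 1 - rho * A / pT_raw, following Eq. (6.1)."""
    if pt_raw <= 0.0:
        raise ValueError("pt_raw must be positive")
    factor = 1.0 - rho * area / pt_raw
    # Assumption of this sketch: never scale a subjet to negative pT.
    return max(factor, 0.0)


def corrected_subjet_pt(rho, area, pt_raw):
    """Apply the pile-up correction factor to the raw subjet pT."""
    return pt_raw * pileup_correction_factor(rho, area, pt_raw)
```

For a subjet with a raw pT of 100 GeV/c, an area of 0.5 and ρ = 20 GeV/c per unit area, the expected pile-up contribution is 10 GeV/c, so the factor is about 0.9.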

[Figure 6.2 panels: event yields for data and stacked simulated backgrounds (TTbar, WJets, DYJets, SingleTop, DiBoson, QCD) with the ratio DATA / BG below, as a function of (a) ρ [GeV/c] and (b) the subjet area A [a.u.]; 35.9 fb⁻¹ (13 TeV).]

Figure 6.2: Control distributions of (a) ρ and (b) the subjet area.

It is shown in [51] that the response is similar for anti-kT jets with distance parameters in the range R = 0.3 to 1.0 after they have been corrected for pile-up. To approximate the L2L3 and L2L3Res corrections for HOTVR, the L2L3 and L2L3Res corrections provided for anti-kT jets are applied to the HOTVR subjets as well. Finally, to account for differences in the response between anti-kT jets and HOTVR jets, an additional correction factor r is derived by calculating

$$r = \frac{p_T^{\mathrm{rec}}}{p_T^{\mathrm{gen}}} \qquad (6.2)$$

from simulated t¯t events. Here $p_T^{\mathrm{rec}}$ is the pT of the subjet with full detector simulation and pile-up, and $p_T^{\mathrm{gen}}$ is the pT of the corresponding subjet on particle level, i.e. without detector simulation and pile-up. This ratio is calculated for several bins in η and pT. A correction factor $a_{\mathrm{res}}$ is then derived using

$$a_{\mathrm{res}} = \frac{1}{\langle r \rangle} , \qquad (6.3)$$

where $\langle r \rangle$ denotes the mean value of r in one (η, pT) bin. Finally, a correction function $f_\eta(p_T)$ is obtained via a fit to the correction factors for a fixed η, using the ansatz

$$f_\eta(p_T) = a + e^{b \cdot p_T + c} . \qquad (6.4)$$

The corresponding correction function is then used for the final correction of the HOTVR subjets. The fits can be seen in Figs. 6.3 and 6.4 for all considered η regions.
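The residual correction described by Eqs. (6.2) to (6.4) can be sketched as follows. The function names and the example parameter values are hypothetical; the fit itself is not reproduced here, the sketch only shows how the correction factor of one bin is obtained and how the fitted ansatz would be evaluated.

```python
import math
from statistics import mean


def residual_correction_factor(pt_rec, pt_gen):
    """Inverse mean response 1 / <r> for the subjets of one (eta, pT) bin.

    pt_rec, pt_gen: equally long sequences of reconstructed and
    particle-level subjet pT values.
    """
    responses = [rec / gen for rec, gen in zip(pt_rec, pt_gen)]
    return 1.0 / mean(responses)


def correction_function(pt, a, b, c):
    """Ansatz of Eq. (6.4): a + exp(b * pT + c)."""
    return a + math.exp(b * pt + c)


# Example with made-up fit parameters for one eta region.
corrected_pt = 150.0 * correction_function(150.0, a=1.0, b=-0.01, c=-2.0)
```

With a negative parameter b the exponential term dies out at high pT, so that the correction approaches the constant a.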

[Figure 6.3 panels: correction factor and correction factor fit versus $p_T^{\mathrm{rec}}$ [GeV/c] (0 to 500) for −4.0 < η < −1.5, −1.5 < η < −1.0, −1.0 < η < −0.7, −0.7 < η < −0.4, −0.4 < η < −0.2 and −0.2 < η < 0.0.]

Figure 6.3: Fits to the calculated correction factor for different calibrations between HOTVR subjets and anti-kT jets as a function of pT in different η regions.

[Figure 6.4 panels: correction factor and correction factor fit versus $p_T^{\mathrm{rec}}$ [GeV/c] (0 to 500) for 0.0 < η < 0.2, 0.2 < η < 0.4, 0.4 < η < 0.7, 0.7 < η < 1.0, 1.0 < η < 1.5 and 1.5 < η < 4.0.]

Figure 6.4: Fits to the calculated correction factor for different calibrations between HOTVR subjets and anti-kT jets as a function of pT in different η regions.

[Figure 6.5 axes: ∆Rmax (0 to 3) versus pT,top [GeV/c] (0 to 2000); colour scale from 0 to 0.16.]

Figure 6.5: Maximum distance between the top quark decay products as a function of the top pT. The bins are normalized to the total number of entries per pT slice.

6.2 Top Tagging Performance

The presented analysis searches for new particles over a large mass spectrum, ranging from 700 GeV/c² (lowest considered B′ mass) up to 3000 GeV/c² (highest considered b∗ mass). Ensuring efficient top tagging over the whole mass spectrum is challenging, because the kinematics of the resulting top quark and its decay products depend on the mass of the new particle. Higher masses lead to higher top pT, and the higher the top pT, the closer together the top quark decay products tend to be in the detector, resulting in narrower top-jets. Figure 6.5 shows the maximum distance between two of the three top quark decay products (the b quark and the two quarks from the W boson) as a function of the top pT. All bins in one pT slice are normalized to the number of entries in that slice. At low values of the top pT the decay products are farther away from each other, while at high values they are closer to each other.

In order to achieve good signal sensitivity over the whole considered mass spectrum, the top tagging algorithm used has to be able to identify top-jets over a large pT range. Therefore, the performance of the two top tagging algorithms described above is studied in different pT regions.

The top tagging algorithms are used to identify top-jet candidates in simulated events. The tagged jets are then matched to generator-level particles. A jet is labeled "matched" if the distance ∆R between the jet and the generator particle in the (η, φ)-plane is ∆R < 0.4.
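The matching criterion can be sketched as follows. The function names are hypothetical; the only subtlety is that the difference in φ has to be wrapped into the interval [−π, π) before the distance is calculated.

```python
import math


def delta_phi(phi1, phi2):
    """Difference in azimuthal angle, wrapped into [-pi, pi)."""
    return (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi


def delta_r(eta1, phi1, eta2, phi2):
    """Distance in the (eta, phi)-plane."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))


def is_matched(jet_eta, jet_phi, gen_eta, gen_phi, max_dr=0.4):
    """A jet counts as matched if it lies within max_dr of the particle."""
    return delta_r(jet_eta, jet_phi, gen_eta, gen_phi) < max_dr
```

Without the wrapping, a jet at φ = 3.1 and a generator particle at φ = −3.1 would appear to be far apart, although they are separated by less than 0.1 in φ.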

The performance of a top tagger is characterized by two quantities, the signal efficiency εS and the mistag rate εB. The signal efficiency is defined as the number of tagged top-jets that were matched to generator-level top quarks divided by the number of all hadronically decaying top quarks. It is measured in simulations of expected signal events. The mistag rate is defined as the number of tagged top-jets that were matched to generator-level jets divided by the number of all generator-level jets. The generator-level jets were clustered with the anti-kT algorithm using the distance parameter R = 0.4. A description of the simulated samples used can be found in section 7.1.

Both rates are measured in six different kinematic regions based on the pT of the matched generator particle. Receiver operating characteristic (ROC) curves are obtained for each pT region by varying the N-subjettiness variable τ3/2. They are shown in Fig. 6.6. The markers show the different identification working points of the top tagging algorithms. The working point shown for HOTVR corresponds to the setting used in the analysis in chapter 7.
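A scan of the τ3/2 requirement, as used to obtain the ROC curves, can be sketched as follows. The input format is an assumption of this example: the lists contain the τ3/2 values of matched jets that fulfill all other tagging requirements, and the denominators are the numbers of all hadronically decaying top quarks and of all generator-level jets in the considered pT region.

```python
def roc_points(signal_tau32, background_tau32,
               n_signal_total, n_background_total, thresholds):
    """Return a list of (threshold, eps_S, eps_B) tuples.

    A jet counts as tagged for a given threshold if tau32 < threshold.
    """
    points = []
    for cut in thresholds:
        eps_s = sum(1 for t in signal_tau32 if t < cut) / n_signal_total
        eps_b = sum(1 for t in background_tau32 if t < cut) / n_background_total
        points.append((cut, eps_s, eps_b))
    return points
```

Loosening the τ3/2 requirement can only increase both εS and εB, so the points move monotonically along the curve.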

[Figure 6.6 panels: εB (logarithmic scale, 10⁻⁴ to 10⁻¹) versus εS (0 to 0.8) for the HOTVR and CMS top taggers in the regions 200 GeV/c < pT < 400 GeV/c, 400 GeV/c < pT < 600 GeV/c, 600 GeV/c < pT < 800 GeV/c, 800 GeV/c < pT < 1000 GeV/c, 1000 GeV/c < pT < 1500 GeV/c and 1500 GeV/c < pT < 2000 GeV/c. Marked working points: CMS top tagger with τ3/2 < 0.81, 0.67, 0.57 and 0.50; HOTVR with τ3/2 < 0.56.]

Figure 6.6: Receiver operating characteristic (ROC) curves for HOTVR and CMS top tagger from a scan of the N-subjettiness variable τ3/2 in six different pT regions.

In the 200 GeV/c < pT < 400 GeV/c region, the HOTVR algorithm shows a significantly better performance than the CMS top tagger. This is expected, because the CMS top tagger is not suited for this region: it uses jets with a fixed cone size of R = 0.8 and can therefore not contain all top quark decay products in this region. HOTVR, on the other hand, adapts the cone size to the pT of the clustered object and can thus capture the whole top quark decay more often, also in the lower pT region. In the 400 GeV/c < pT < 600 GeV/c and 600 GeV/c < pT < 800 GeV/c regions, the CMS top tagger performs slightly better than HOTVR. The top quark decays in these regions are captured well by jets with cone sizes of R = 0.8, as seen in Fig. 6.5. The CMS top tagger is hence optimized specifically for this region. The HOTVR algorithm is still able to deliver a comparable performance. Going to even higher pT regions, the HOTVR algorithm improves further, while the CMS top tagger becomes slightly less efficient. Here, the distances between the top quark decay products become very small and the cone size of the AK8 jet is now too large. The top quark decay is contained in the jet, but the soft drop criterion becomes less effective, since it depends on the size of the jet and the distance between the obtained subjets. The improvement of the HOTVR algorithm comes from the fact that the effect of pile-up and additional radiation on jets becomes less significant as the jet radius decreases.

This also means that HOTVR can be further improved in the low-pT region with more sophisticated pile-up suppression techniques such as PUPPI [52].

7 Analysis

This analysis focuses on the search for a singly produced b∗ decaying to tW in the muon+jets final state. For this search, the top quark is assumed to decay hadronically, while the W boson decays into a muon and a muon neutrino. A sketch of the event signature can be seen in Fig. 7.1. Due to the similar signature, the analysis is also expected to be sensitive to the process B′ → tW with an associated b quark in the muon+jets final state.

In this chapter, a detailed description of the search for b∗ and B′ in the decay channel described above is given. The dataset, as well as the simulated events used in this analysis, are described in section 7.1. The event selection is described in section 7.2. The method of reconstructing the invariant mass of the tW system is described in section 7.3. The results obtained using either HOTVR or the CMS top tagger for the reconstruction are compared. The systematic uncertainties are discussed in section 7.4.

[Figure 7.1 sketch; particle labels: p, p, b∗, t, W, b, q, q′, µ, ν.]

Figure 7.1: Sketch of the event signature of the b∗ decay to tW in the µ+jets final state.

7.1 Dataset and Simulated Events

The dataset used in this analysis was recorded in 2016 with the CMS detector. It contains events from proton-proton collisions at a center-of-mass energy of √s = 13 TeV and corresponds to an integrated luminosity of 35.87 fb⁻¹. The single muon dataset was chosen, since the investigated event signature features a muon.

The expected SM events are simulated using Monte-Carlo (MC) simulations to compare the data to the theoretical predictions. Discrepancies between data and simulated backgrounds could be evidence for new physics.

MC simulations of the expected signal processes are used as well. For the process b∗ → tW, samples with masses ranging from 1200 GeV/c² to 3000 GeV/c² in steps of 200 GeV/c² are used. For the process B′ → tW with an associated b quark, MC simulations with B′ masses ranging from 700 GeV/c² to 1800 GeV/c² in steps of 100 GeV/c² are used. Simulations of purely left-handed (lh) and purely right-handed (rh) samples are available for each process. Samples for the vector-like b∗ coupling are obtained by combining the two respective samples.

The matrix element of a specific process is calculated up to a certain order of perturbation theory using sophisticated Monte-Carlo generators. The generators used in this analysis are MadGraph5_aMC@NLO (MadGraph) [53] and POWHEG [54–58]. After the matrix element is calculated, the parton shower and hadronization have to be simulated and then matched to the matrix element calculation. While POWHEG performs these actions on its own, MadGraph relies on external algorithms to perform these tasks. The parton shower and hadronization process is simulated using Pythia8 [59,60]. The parton shower is then matched to the matrix element calculation using the MLM algorithm [61]. Finally, the interaction of the produced particles with the CMS detector is simulated with Geant4 [62].

The full list of the signal samples used in this analysis is given in tables 7.1 and 7.2. The SM background samples are listed in table 7.3. All simulated samples were normalized to the integrated luminosity recorded in the dataset by weighting each event with a weight

$$\omega_i = \frac{\sigma_i}{N_i} L_{\mathrm{int}} . \qquad (7.1)$$

Here, Ni is the number of simulated events, σi denotes the cross section of the process i, and Lint is the integrated luminosity.

Mb∗ [GeV/c²]    σ [pb]    N (lh)    N (rh)
1200            1.94      99 984    99 987
1400            0.783     99 979    99 785
1600            0.342     99 969    99 171
1800            0.159     99 971    97 568
2000            0.077     99 969    99 956
2200            0.039     96 957    99 957
2400            0.020     99 926    99 941
2600            0.011     99 937    79 971
2800            0.006     79 952    99 936
3000            0.003     99 916    99 936

Table 7.1: Overview of all signal samples of the process b∗ → tW used in this analysis. The theoretical cross section is denoted as σ in the second column. The number of generated events N is given in the third and fourth columns. The coupling to the W boson is set to be purely left-handed (lh) or purely right-handed (rh). The vector-like couplings are obtained by combining the lh and rh samples. All samples were generated using MadGraph.
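The normalization of Eq. (7.1) can be illustrated with a short sketch. The function name is hypothetical; the integrated luminosity of 35.87 fb⁻¹ is converted to 35 870 pb⁻¹, so that cross sections can be given in pb.

```python
LUMI_PB = 35.87 * 1000.0  # integrated luminosity in pb^-1


def event_weight(cross_section_pb, n_generated, lumi_pb=LUMI_PB):
    """Per-event weight sigma * L_int / N, following Eq. (7.1)."""
    return cross_section_pb * lumi_pb / n_generated


# Example: left-handed b* sample with a mass of 1200 GeV/c^2 (Table 7.1).
w = event_weight(1.94, 99984)
```

By construction, the sum of the weights of all generated events of a sample equals the expected number of events σ · Lint for that process.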

The theoretical production cross sections for gb → b∗ times the branching fraction for b∗ → tW are obtained from the matrix element calculation performed with MadGraph, based on [20,21]. The cross section for the vector-like coupling is twice as large.

The theoretical production cross sections for the vector-like B′ were calculated in [25,63]. A branching fraction of BR(B′ → bZ) = BR(B′ → tW) = 0.5 is assumed.

7.2 Event Selection

As can be seen in Fig. 7.1, the event signature can be divided into two hemispheres, since the top quark and the W boson are expected to be back to back in the (η-φ) plane. The hemisphere containing the W boson features a highly energetic, isolated muon, as well as missing transverse energy from the muon neutrino. The other hemisphere contains a jet originating from the top quark. The event selection based on this signature is presented in the following.

MB′ [GeV/c²]    σ [pb]    N (lh)     N (rh)
700             1.085     99 994     94 795
800             0.754     99 989     99 994
900             0.555     99 992     99 993
1000            0.413     99 989     99 987
1100            0.298     198 380    99 986
1200            0.224     89 777     99 988
1300            0.170     99 779     99 983
1400            0.132     99 785     99 979
1500            0.104     99 778     99 973
1600            0.080     99 984     99 975
1700            0.062     94 960     99 968
1800            0.049     99 971     99 963

Table 7.2: Overview of all signal samples of the process B′ → tW with an associated b quark used in this analysis. The theoretical cross section is denoted as σ in the second column. The number of generated events N is given in the third and fourth columns. The coupling to the W boson is set to be purely left-handed (lh) or purely right-handed (rh). All samples were generated using MadGraph.

7.2.1 Pre-Selection

The first step in this analysis is to reduce the number of background events, while keeping as many signal events as possible. This is done by imposing a number of requirements on the events to select those featuring the above described signature.

The final state of the b∗ decay features an isolated muon. The starting point of the pre-selection is hence to select events fulfilling the trigger combination recommended by CMS for analyses with an isolated muon. The recommendation is to use the logical OR of the HLT_IsoMu24_v* and HLT_IsoTkMu24_v* triggers. Both triggers require an isolated muon with pT > 24 GeV/c. The only difference between the two triggers is the method by which the muon is reconstructed in the HLT. Trigger scale factors are applied to simulated events to account for the different trigger efficiency in MC and data.

After the trigger selection, all objects not fulfilling the quality criteria described in chapter 5 are removed from the events. Differences in the muon identification efficiency between data and MC are addressed by applying scale factors to the MC events. Additionally, a pile-up reweighting is applied to each MC event to accurately describe the number of interactions in the events. For this, a correction factor is applied to each MC event based on the number of interactions in that event. The correction factor is calculated from the minimum-bias cross section of 69.2 mb and the recorded integrated luminosity. Finally, top pT reweighting is applied to simulated t¯t events [64], to account for the observed shape difference in the top pT spectrum between data and MC [65].

Process                                                σ [pb]          MC generator    N
t¯t                                                    831.76          POWHEG          155 151 287
W+jets (W → lν, pT ∈ [100, 250) GeV/c)                 676.3           MadGraph        119 167 959
W+jets (W → lν, pT ∈ [250, 400) GeV/c)                 23.94           MadGraph        12 021 695
W+jets (W → lν, pT ∈ [400, 600) GeV/c)                 3.031           MadGraph        1 939 698
W+jets (W → lν, pT ∈ [600, ∞) GeV/c)                   0.4524          MadGraph        1 914 241
Single top (s-channel)                                 3.36            POWHEG          999 976
Single top (t, t-channel)                              136.02          POWHEG          5 993 570
Single top (¯t, t-channel)                             80.95           POWHEG          3 927 980
Single top (tW)                                        35.6            POWHEG          6 942 907
Single top (¯tW)                                       35.6            POWHEG          6 932 903
Z+jets (Z → ll, mll ≥ 50, HT ∈ [70, 100) GeV/c)        215.62          MadGraph        9 608 508
Z+jets (Z → ll, mll ≥ 50, HT ∈ [100, 200) GeV/c)       181.30          MadGraph        10 606 926
Z+jets (Z → ll, mll ≥ 50, HT ∈ [200, 400) GeV/c)       50.42           MadGraph        9 646 008
Z+jets (Z → ll, mll ≥ 50, HT ∈ [400, 600) GeV/c)       6.98            MadGraph        10 008 141
Z+jets (Z → ll, mll ≥ 50, HT ∈ [600, 800) GeV/c)       1.68            MadGraph        8 292 160
Z+jets (Z → ll, mll ≥ 50, HT ∈ [800, 1200) GeV/c)      0.775           MadGraph        2 668 311
Z+jets (Z → ll, mll ≥ 50, HT ∈ [1200, 2500) GeV/c)     0.186           MadGraph        595 906
Z+jets (Z → ll, mll ≥ 50, HT ∈ [2500, ∞) GeV/c)        0.0044          MadGraph        399 147
Diboson, WW production                                 118.7           Pythia8         993 997
Diboson, WZ production                                 47.13           Pythia8         990 003
Diboson, ZZ production                                 16.52           Pythia8         988 222
QCD (muon enriched, pT ∈ [15, 20) GeV/c)               3 819 570       Pythia8         4 141 208
QCD (muon enriched, pT ∈ [20, 30) GeV/c)               2 960 198.4     Pythia8         31 474 692
QCD (muon enriched, pT ∈ [30, 50) GeV/c)               1 652 471.46    Pythia8         29 944 322
QCD (muon enriched, pT ∈ [50, 80) GeV/c)               437 504.1       Pythia8         19 806 515
QCD (muon enriched, pT ∈ [80, 120) GeV/c)              106 033.66      Pythia8         13 776 823
QCD (muon enriched, pT ∈ [120, 170) GeV/c)             25 190.52       Pythia8         8 042 452
QCD (muon enriched, pT ∈ [170, 300) GeV/c)             8654.49         Pythia8         7 946 703
QCD (muon enriched, pT ∈ [300, 470) GeV/c)             797.35          Pythia8         7 936 465
QCD (muon enriched, pT ∈ [470, 600) GeV/c)             79.03           Pythia8         3 850 452
QCD (muon enriched, pT ∈ [600, 800) GeV/c)             25.10           Pythia8         4 008 200
QCD (muon enriched, pT ∈ [800, 1000) GeV/c)            4.71            Pythia8         3 959 757
QCD (muon enriched, pT ∈ [1000, ∞) GeV/c)              1.62            Pythia8         3 976 075

Table 7.3: Overview of all SM background samples used in this analysis. The theoretical cross sections σ are given in the second column. The third column gives the name of the MC generator used. The number of generated events N is given in the fourth column.

Afterwards, the following selections are applied:

• The event must have exactly one isolated muon with pT > 50 GeV/c.

• The missing transverse energy is required to be E̸T > 50 GeV/c.

• The sum of all transverse momenta has to be ST > 400 GeV/c.

• At least one HOTVR jet, or one AK8 jet (see below), with pT > 200 GeV/c and |η| < 2.5 has to be present.

The first selection ensures the presence of an isolated, high energy muon. It is also a good discriminator against several SM backgrounds. The QCD multijet background should already be greatly reduced by the trigger, but the additional requirement of the tight muon identification working point and pT > 50 GeV/c further reduces this background. Furthermore, all processes including more than one lepton are greatly suppressed. The second requirement reflects the presence of a neutrino in the decay. Since the neutrino passes the detector undetected, it carries away energy, resulting in missing transverse energy. The next selection is the strongest discriminator against SM backgrounds in the pre-selection. All SM backgrounds peak at low values of ST, while the signal samples have a broad peak around the value of their hypothetical masses. This requirement is kept rather loose, to maintain sensitivity for signal events with lower masses, while still rejecting a fair amount of background events. The last requirement in the pre-selection ensures the presence of a top-jet candidate to be used for top tagging in the next step. The pre-selection requires HOTVR jets for the main analysis, but a second pre-selection is performed, where AK8 jets are required instead, to be able to compare both top tagging algorithms later in the reconstruction.
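The pre-selection requirements can be summarized in a short sketch. The `Event` container and its field names are hypothetical. The sketch assumes that the list of isolated muons has already been cleaned with the quality criteria of chapter 5 and interprets the first requirement as exactly one isolated muon in the event, which has to have pT > 50 GeV/c.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Event:
    # Hypothetical container; field names are illustrative only.
    isolated_muon_pts: List[float]         # pT of isolated muons in GeV/c
    met: float                             # missing transverse energy
    st: float                              # transverse activity S_T
    large_jets: List[Tuple[float, float]]  # (pT, eta) of HOTVR or AK8 jets


def passes_preselection(event: Event) -> bool:
    muons = event.isolated_muon_pts
    has_one_muon = len(muons) == 1 and muons[0] > 50.0
    has_top_jet_candidate = any(
        pt > 200.0 and abs(eta) < 2.5 for pt, eta in event.large_jets
    )
    return (
        has_one_muon
        and event.met > 50.0
        and event.st > 400.0
        and has_top_jet_candidate
    )
```

The same function can be used for the second pre-selection by filling `large_jets` with AK8 jets instead of HOTVR jets.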

Figure 7.2a shows the number of events per 1 fb⁻¹ passing the pre-selection. The solid line shows the average of all data points and the dashed lines represent one standard deviation. There is a trend toward more events per integrated luminosity at higher luminosity intervals. This is due to the higher instantaneous luminosity and therefore higher pile-up in the later data taking runs. Since the pile-up dependency is not fully corrected in HOTVR, as discussed in section 5.4.1, this results in more events passing the HOTVR jet requirement of pT > 200 GeV/c. The trend is not visible in Fig. 7.2b, where the number of events per 1 fb⁻¹ after the ST requirement is shown. The step in this plot around 20 fb⁻¹ comes from the HIP effect [66]. Highly ionizing particles (HIP) can momentarily saturate the readout chips in the silicon tracker, reducing the efficiency of the muon trigger. A fix for this effect was implemented around this time, resulting in an overall increase of recorded events per integrated luminosity. The last bin, corresponding to the luminosity interval between 35 fb⁻¹ and 36 fb⁻¹, contains fewer events in both plots, since only 35.87 fb⁻¹ have been recorded in the dataset used.

[Figure 7.2 panels: events per 1000 pb⁻¹ versus integrated luminosity [pb⁻¹]. Fit results shown in the panels: (a) χ²/ndf = 27.0/5, average 26768.2 ± 347.5; (b) χ²/ndf = 97.4/18, average 40311.8 ± 453.4.]

Figure 7.2: Number of events after the pre-selection as a function of the integrated luminosity in bins of 1000 pb⁻¹.

Control distributions of all important objects for this analysis after the pre-selection are shown in Fig. 7.3. Data points are depicted by the black markers; the error bars represent the statistical uncertainty. The simulated SM backgrounds are shown in the upper parts of each plot as filled areas. They are stacked to allow for a comparison between data and expected background. Additionally, three signal samples of the purely left-handed model with different b∗ masses are shown as black lines. The other two benchmark models result in similar distributions. The lower parts of the plots show the ratio between data and simulated background events. The grey area shows the statistical uncertainties of the background samples and the error bars represent the statistical uncertainty of the data. The dominant SM backgrounds after the pre-selection are t¯t+jets and W+jets production.

[Figure 7.3 legend: Data, TTbar, WJets, DYJets, SingleTop, DiBoson, QCD, and signal samples with Mb∗ = 1200, 2000 and 3000; each panel shows the event yields with the ratio DATA / BG below; 35.9 fb⁻¹ (13 TeV).]

Figure 7.3: Control distributions of events passing the pre-selection for the following variables: (a) pT of the muon, (b) relative muon isolation, (c) ST, (d) E̸T, (e) number of HOTVR jets, (f) pT of the HOTVR jets.

Overall, data and expected background are in good agreement. There is a slight excess of data at low muon pT (Fig. 7.3a). In this region, the contribution of the QCD multijet background is still considerable, indicating that the excess comes from jets that include a muon from hadronic decays. This process is not well described in MC simulation, resulting in the observed excess. The following top jet selection will further reduce the QCD multijet background and thus remove most of these events. The deviation in the last bin of the number of HOTVR jets (Fig. 7.3e) is due to the inaccurate description of events with high jet multiplicity in MC simulation [65]. Furthermore, there is a slight trend in the ratio between data and MC in the E̸T distribution. This trend is also observed by other analyses. The same trend can be seen in the ST distribution, since E̸T contributes to the ST variable.

7.2.2 Top Jet Selection using HOTVR

The events passing the pre-selection are divided into two categories depending on the number of AK4-jets passing the tight working point of the CSVv2 b tagging algorithm in the event. The distribution of the number of b-tagged AK4-jets is shown in Fig. 7.4. Differences in the b-jet identification efficiency between data and MC are addressed by applying a scale factor provided by the CMS collaboration to all MC samples containing b-tagged AK4-jets. If the number of b-tagged AK4-jets is Nb ≤ 1, the event is sorted into the signal region. Events with Nb > 1 are sorted into the control region. The control region is used to measure the efficiency of the top tagger in data and MC, to check if the efficiencies are compatible. This region was chosen, because it has kinematics very similar to the signal region and features mainly t¯t events. The amount of possible signal events in this region is negligible, so the comparison between data and MC is not biased by signal contamination.

[Figure 7.4 panel: event yields for data, stacked simulated backgrounds and three b∗ signal samples with the ratio DATA / BG below, as a function of Nbjets (0 to 10), with the regions labeled SR and CR; 35.9 fb⁻¹ (13 TeV).]

Figure 7.4: Number of AK4-jets fulfilling the tight CSVv2 b-tag working point. The dashed line depicts the separation of the signal region (SR) and the t¯t control region (CR).

The HOTVR top tagging algorithm is then used to identify top-jets in both regions. In addition to the standard working point described in section 3.2.4, two requirements are imposed on the top-jet candidates in order to enhance the top tagging performance in signal events. First, the top-jet candidates have to have an N-subjettiness ratio of τ3/2 < 0.56. This requirement reduces the number of misidentified top-jets and thus helps to reduce background events without top quarks. Second, the top-jet candidate has to have an angular distance ∆φ to the muon of ∆φ > π/2, to select events with the expected back-to-back signature.
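The region assignment and the additional requirements on the top-jet candidates can be sketched as follows. The function names are hypothetical, and the flag `passes_standard_wp` stands for the standard HOTVR working point of section 3.2.4, which is not reproduced here.

```python
import math


def delta_phi(phi1, phi2):
    """Absolute difference in azimuthal angle, in the range [0, pi]."""
    return abs((phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi)


def assign_region(n_btagged_ak4_jets):
    """Signal region for at most one b-tagged AK4 jet, control region otherwise."""
    return "SR" if n_btagged_ak4_jets <= 1 else "CR"


def is_selected_top_jet(passes_standard_wp, tau32, jet_phi, muon_phi):
    """Standard working point plus the tau32 and back-to-back requirements."""
    return (
        passes_standard_wp
        and tau32 < 0.56
        and delta_phi(jet_phi, muon_phi) > math.pi / 2.0
    )
```

The wrapping of ∆φ matters for this requirement: a jet at φ = 3.0 and a muon at φ = −3.0 are close to each other in the detector and must not be counted as back to back.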

Figure 7.5 shows control distributions for the HOTVR jets in the t¯t control region. The distributions show pT and η of the HOTVR jets before and after top tagging. A coarse binning was chosen to maintain a statistically meaningful number of events in each bin, because these distributions are later used to calculate a top tagging scale factor. The selected control region contains predominantly t¯t events.

The top tagging efficiency is calculated for different pT and η bins in the control region. The efficiency is given by the number of top tagged HOTVR jets, divided by the number of all HOTVR jets in that region. All simulated non-t¯t backgrounds are subtracted from data, to get an approximation of the number of t¯t events in data. The statistical uncertainties of the background samples are propagated accordingly. The top tagging efficiency is then compared in data and simulated t¯t events. The top tagging efficiencies are shown in the upper parts of Fig. 7.6. The lower parts of the plots show the ratio between the efficiencies in data and MC. The efficiencies are, within the statistical uncertainties, in good agreement, except for the first pT bin and two η bins. To account for the slight

Figure 7.5: Control distributions of the HOTVR jets in the t¯t control region: (a) pT of the HOTVR jets, (b) η of the HOTVR jets, (c) pT of the tagged HOTVR jets, (d) η of the tagged HOTVR jets. [Plots: events versus pT [GeV/c] or η of the top-jet candidate, each with a data/background ratio panel; 35.9 fb−1 (13 TeV).]

difference, a scale factor $a_{\mathrm{toptag}}$ is calculated with

$$ a_{\mathrm{toptag}} = \frac{\varepsilon_{\mathrm{data}}}{\varepsilon_{\mathrm{MC}}}, \qquad (7.2) $$

where $\varepsilon_{\mathrm{data}}$ is the efficiency in data and $\varepsilon_{\mathrm{MC}}$ is the efficiency in MC. The scale factor is calculated for each pT bin and is applied to the simulated events accordingly. Afterwards, the scale factor is verified in a closure test. The top tagging efficiency in data is now compared to the efficiency in the scaled t¯t sample. The closure test, seen in Fig. 7.7, shows that after applying the top tagging scale factors to the t¯t sample, the differences of the top tagging efficiency in data and MC are corrected. The ratio between data and MC in

Figure 7.6: Top tagging efficiency in t¯t events measured in the t¯t control region as a function of (a) pT and (b) η. The upper parts of the plots show the efficiency in data (red) and simulation (blue). The lower parts show the ratio between data and simulation.

the pT binned efficiency is, by definition, set to one by the scale factor, but also the ratios of the η binned efficiencies are now compatible with one.
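The efficiency measurement and the scale factor of Eq. (7.2) can be sketched in a few lines of Python. All event counts below are hypothetical numbers chosen for illustration, the function names are not from the analysis code, and the Poisson uncertainty assigned to the data count is an assumption of this sketch.

```python
import math


def subtract_background(n_data, n_bkg, err_bkg):
    """Data minus simulated non-ttbar background.

    Assumes a Poisson uncertainty sqrt(n_data) on the data count and adds
    the statistical uncertainty of the background sample in quadrature.
    """
    value = n_data - n_bkg
    error = math.sqrt(n_data + err_bkg ** 2)
    return value, error


def tagging_efficiency(n_tagged, n_all):
    """Fraction of HOTVR jets that pass the top tag."""
    return n_tagged / n_all


def scale_factors(eff_data, eff_mc):
    """Per-bin scale factor a_toptag = eff_data / eff_MC, as in Eq. (7.2)."""
    return [d / m for d, m in zip(eff_data, eff_mc)]


# Hypothetical counts in two pT bins (illustration only, not thesis numbers).
data_all, data_tag = [1100.0, 560.0], [120.0, 70.0]
bkg_all, bkg_tag = [100.0, 60.0], [20.0, 10.0]
mc_all, mc_tag = [1000.0, 500.0], [110.0, 50.0]

eff_data = [
    tagging_efficiency(subtract_background(t, bt, 0.0)[0],
                       subtract_background(a, ba, 0.0)[0])
    for t, bt, a, ba in zip(data_tag, bkg_tag, data_all, bkg_all)
]
eff_mc = [tagging_efficiency(t, a) for t, a in zip(mc_tag, mc_all)]
sf = scale_factors(eff_data, eff_mc)

# Closure: the scaled MC efficiency reproduces the data efficiency per pT bin.
closure = [m * s / d for m, s, d in zip(eff_mc, sf, eff_data)]
```

By construction the closure ratio is one in every pT bin, which is why the η-binned comparison is the informative part of the closure test.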

Events in the signal region are required to have exactly one top tagged HOTVR jet. The top tagging scale factor from the control region is applied to all t¯t and signal events passing this selection. Distributions of the variables used for top tagging are shown in Fig. 7.8. The mass distribution of the tagged top jets shows a sharp peak at the top mass, indicating that the kinematics of the top quark are captured well by the HOTVR jet. The number of subjets has its maximum at three, as expected, since the hadronic decay of the top quark results in three subjets. The distribution of the N-subjettiness ratio τ3/2 verifies that the tagged jet is compatible with a three-prong decay. The fpT distribution peaks around 0.5, as expected from momentum conservation. The distribution of the minimum pairwise mass Mpair shows a sharp peak around 80 GeV/c2, corresponding to the mass peak of the W boson from the top decay. The back-to-back signature of the events can be seen in the ∆φµ,t distribution. The contributions of the different SM processes to the expected background show that t¯t production is now by far the dominant background, while all other backgrounds are heavily reduced. This demonstrates how powerful the top tagging requirement is.

Figure 7.7: Closure test of the top tagging scale factor in t¯t events as a function of (a) pT and (b) η. The upper parts of the plots show the efficiency in data (red) and simulation (blue). The lower parts show the ratio between data and simulation.

7.2.3 Top Jet Selection using the CMS top tagger

To compare the performance of the HOTVR top tagger with that of the CMS top tagger in the reconstruction, a second set of events is selected using the CMS top tagger. For this set, AK8-jets are required in the pre-selection instead of HOTVR jets. As in the selection described in section 7.2.2, a requirement on the number of b-tagged AK4-jets is imposed. The CMS collaboration provides top tagging scale factors for the CMS top tagger, so only events with Nb ≤ 1 are considered. Afterwards, exactly one top tagged AK8-jet is required. The working point corresponding to τ3/2 < 0.57 is used, as its performance is comparable to that of the HOTVR working point used (see section 6.2). Additionally, the requirement on the angular distance ∆φ between the top and the muon of ∆φ > π/2 is imposed.

7.3 Reconstruction of the Invariant tW Mass

The discriminating variable in this search is the invariant mass of the tW system, MtW, since the signal processes are expected to result in resonances in the invariant mass spectrum. To obtain MtW, the tW system is reconstructed from the objects available in the event. The reconstruction uses the top tagged jet, the muon and the

Figure 7.8: Distributions of the variables used by the HOTVR top tagging algorithm: (a) mass of the HOTVR jet, (b) number of subjets, (c) pT fraction of the first subjet fpT, (d) minimum pairwise mass of the subjets, (e) N-subjettiness ratio τ3/2, (f) angular distance between top jet and muon ∆φµ,t. [Plots: events versus the respective variable, each with a data/background ratio panel; 35.9 fb−1 (13 TeV).]

missing transverse energy to build MtW hypotheses. Afterwards, the goodness of the reconstruction is estimated with a χ2-method. All events passing the event selection are guaranteed to contain the objects required for the reconstruction.

7.3.1 Neutrino Reconstruction

The neutrino itself cannot be measured with the CMS detector and therefore has to be reconstructed from other available information. Assuming that the W boson is produced on its mass shell, the neutrino can be reconstructed by solving the quadratic equation

$$ M_W^2 = \left(p_W^{\mu}\right)^2 = \left(p_{\mu}^{\mu} + p_{\nu}^{\mu}\right)^2 \qquad (7.3) $$

for the z component of the neutrino four-momentum. The solutions of this equation are given by

$$ p_{z,\nu}^{\pm} = \frac{a\,p_{z,\mu}}{p_{T,\mu}^2} \pm \sqrt{\frac{a^2 p_{z,\mu}^2}{p_{T,\mu}^4} - \frac{E_{\mu}^2\, p_{T,\nu}^2 - a^2}{p_{T,\mu}^2}}, \qquad (7.4) $$

with $a = \frac{M_W^2}{2} + p_{T,\nu}\, p_{T,\mu} \cos(\Delta\phi)$, where $E_{\mu}$ is the muon energy and $\Delta\phi$ is the azimuthal angle between the muon and the neutrino. The transverse momentum of the neutrino is obtained by assuming that the neutrino is the only source of missing transverse energy in the event, i.e.

$$ p_{T,\nu} = E_{T}^{\mathrm{miss}}. \qquad (7.5) $$

The equation can have zero, one or two real solutions. If no real solution is found, the real part of the complex solution is taken.
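A minimal Python sketch of this neutrino reconstruction is given below. It evaluates Eq. (7.4) under the assumption of a massless muon and a W mass of 80.4 GeV/c2; the function name, argument list and the chosen W mass value are assumptions of this sketch and not taken from the analysis code.

```python
import math

M_W = 80.4  # assumed W boson mass in GeV/c^2


def neutrino_pz(mu_pt, mu_phi, mu_pz, met_pt, met_phi, m_w=M_W):
    """Return the neutrino pz solutions of the W mass constraint.

    The muon is treated as massless, so E_mu^2 = pT_mu^2 + pz_mu^2.
    Two real solutions are returned if the discriminant is positive.
    Otherwise a single value is returned: the real part of the complex
    solutions (which is also the double root for a zero discriminant).
    """
    mu_e = math.hypot(mu_pt, mu_pz)
    a = 0.5 * m_w ** 2 + met_pt * mu_pt * math.cos(mu_phi - met_phi)
    center = a * mu_pz / mu_pt ** 2
    discriminant = center ** 2 - (mu_e ** 2 * met_pt ** 2 - a ** 2) / mu_pt ** 2
    if discriminant <= 0.0:
        return [center]
    root = math.sqrt(discriminant)
    return [center + root, center - root]
```

The complex case occurs when the transverse mass of the muon and the missing transverse energy exceeds the assumed W mass, for example because of detector resolution.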

7.3.2 Reconstruction of the tW System

The tW system is reconstructed by combining the four-momenta of the top and the W hypotheses. The top tagged jet is used as the top hypothesis, and the W hypothesis is built by combining the four-momenta of the muon and the reconstructed neutrino. If two real solutions were found in the neutrino reconstruction, a tW hypothesis is built for each W hypothesis. The best tW hypothesis is then chosen based on the χ2-method discussed below.

In total, three different reconstructions are performed, each using a different top tagging criterion. The first reconstruction uses top-jets selected with HOTVR in section 7.2.2, while the other two reconstructions use top-jets selected with the CMS top tagger in section 7.2.3. The first reconstruction using the CMS top tagger is performed using AK8-jets with pT > 200 GeV/c and |η| < 2.5, to compare both top tagging algorithms in the same kinematic range. The second reconstruction imposes an additional requirement of pT > 400 GeV/c on the AK8-jets, to demonstrate the performance of the CMS top tagger at its recommended working point.

The invariant mass distributions of the reconstructed tW system in signal samples for the three different reconstructions are shown in Fig. 7.9. They are normalized to their integral to allow a better comparison of their shapes. The plots on the left side show the MtW distribution of three B′ samples with hypothetical masses of 700 GeV/c2, 1200 GeV/c2 and 1800 GeV/c2. The plots on the right side show the MtW distribution of three b∗ samples with hypothetical masses of 1200 GeV/c2, 2000 GeV/c2 and 3000 GeV/c2. All distributions show sharp peaks at their expected masses. The shapes of all three reconstruction methods are very similar.

7.3.3 Determining the Goodness of the Hypothesis

The goodness of the reconstructed MtW hypothesis is determined by a χ2 estimator. The estimator is calculated with

$$ \chi^2 = \left(\frac{\Delta\phi_{t,W}^{\mathrm{reco}} - \Delta\phi_{t,W}^{\mathrm{mean}}}{\sigma_{\Delta\phi_{t,W}}}\right)^2 + \left(\frac{\Delta p_{T,\mathrm{rel}}^{\mathrm{reco}} - \Delta p_{T,\mathrm{rel}}^{\mathrm{mean}}}{\sigma_{\Delta p_{T,\mathrm{rel}}}}\right)^2. \qquad (7.6) $$

The first term reflects the back-to-back signature of the signal. It becomes small if the reconstructed angular distance between the top and the W hypothesis, $\Delta\phi_{t,W}^{\mathrm{reco}}$, is close to the expected mean value $\Delta\phi_{t,W}^{\mathrm{mean}}$. The second term comes from momentum conservation. Since the top and the W in signal events come from the decay of a heavy particle, their transverse momenta should be of the same order. The second term hence becomes small if the difference between the top pT and the W pT relative to the top pT, $\Delta p_{T,\mathrm{rel}} = (p_{T,\mathrm{top}} - p_{T,W}) / p_{T,\mathrm{top}}$, is close to the expected value. The mean values and the standard deviations ($\sigma_{\Delta p_{T,\mathrm{rel}}}$ and $\sigma_{\Delta\phi_{t,W}}$) are obtained by matching the reconstructed objects to the generator objects in MC signal events and then fitting a Gaussian function to the peak of the respective distributions, as shown in the upper distributions in Fig. 7.10. The values obtained from the fits are:

Figure 7.9: Distribution of the reconstructed invariant mass of the tW system for three different b∗ mass hypotheses. The distributions are normalized to their integral. [Panels: (a), (b) HOTVR, pT > 200 GeV/c; (c), (d) CMS top tagger, pT > 200 GeV/c; (e), (f) CMS top tagger, pT > 400 GeV/c. Left column: B′ samples with M = 700, 1200 and 1800 GeV/c2; right column: b∗ samples with M = 1200, 2000 and 3000 GeV/c2. Axes: ∆N/N versus MtW [GeV/c2].]

Figure 7.10: Distributions of (a),(c) ∆φt,W and (b),(d) ∆pT,rel; (a),(b) show distributions of signal events, where the reconstructed top and W were matched on generator level. Gaussian functions were fitted to the peaks. [Panels (c),(d): events versus the respective variable for data, SM backgrounds and b∗ signal samples, each with a data/background ratio panel; 35.9 fb−1 (13 TeV).]

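$$ \Delta\phi_{t,W}^{\mathrm{mean}} = 3.14, \quad \sigma_{\Delta\phi_{t,W}} = 0.0589, \quad \Delta p_{T,\mathrm{rel}}^{\mathrm{mean}} = 0.0267, \quad \sigma_{\Delta p_{T,\mathrm{rel}}} = 0.0611. \qquad (7.7) $$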

The lower two histograms in Fig. 7.10 show the distributions of ∆φt,W and ∆pT,rel for all events after the reconstruction. The signal events peak around the expected values. The expected background events show a much broader distribution around the expected value in ∆φt,W. In t¯t events, the top-jet is expected to have a higher pT than the W. This

Figure 7.11: Distribution of the χ2 estimator of the b∗ reconstruction. [Plot: events versus χ2 with a data/background ratio panel; 35.9 fb−1 (13 TeV).]

can be seen in the ∆pT,rel distribution, where t¯t is prevalent at values greater than zero, making this variable a good discriminator against the t¯t background.

The distribution of the χ2 estimator is shown in Fig. 7.11. The distribution peaks around zero for the signal, but also for the expected background. However, going to higher values, the signal distribution falls much faster. Hence, only events with a χ2 < 20 are considered after the reconstruction.
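The χ2 estimator of Eq. (7.6), the choice of the best tW hypothesis and the requirement χ2 < 20 can be sketched as follows. The constants are the fit results of Eq. (7.7); the function names and the tuple layout of a hypothesis are assumptions made for this illustration.

```python
# Fit results from Eq. (7.7).
DPHI_MEAN, DPHI_SIGMA = 3.14, 0.0589
DPT_REL_MEAN, DPT_REL_SIGMA = 0.0267, 0.0611


def chi2(dphi_tw, top_pt, w_pt):
    """Chi-square estimator of one tW hypothesis, following Eq. (7.6)."""
    dpt_rel = (top_pt - w_pt) / top_pt
    return (((dphi_tw - DPHI_MEAN) / DPHI_SIGMA) ** 2
            + ((dpt_rel - DPT_REL_MEAN) / DPT_REL_SIGMA) ** 2)


def best_hypothesis(hypotheses, chi2_max=20.0):
    """Pick the hypothesis with the smallest chi-square.

    hypotheses : list of (dphi_tw, top_pt, w_pt) tuples, one per W hypothesis
    Returns (index, chi2) of the best hypothesis, or None if the list is
    empty or the best hypothesis fails the chi2 < chi2_max requirement.
    """
    if not hypotheses:
        return None
    best_value, best_index = min(
        (chi2(*hyp), index) for index, hyp in enumerate(hypotheses)
    )
    if best_value >= chi2_max:
        return None
    return best_index, best_value
```

With at most two neutrino solutions per event, `hypotheses` contains one or two entries.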

7.3.4 Comparison of the Different Top Tagging Methods in the Reconstruction

The final distributions of the reconstructed invariant tW mass are shown in Fig. 7.12. The binning becomes coarser at higher values of MtW to maintain a statistically meaningful background prediction. The last bin is used as a so-called overflow bin, i.e. all events with values above this bin's upper edge are also filled into this bin. Hence, the displayed width of the last bin is meaningless. In the left plots, the expected signals of three B′ samples with hypothetical masses of 700 GeV/c2, 1200 GeV/c2 and 1800 GeV/c2 are drawn as lines, while in the right plots three b∗ samples with hypothetical masses of 1200 GeV/c2, 2000 GeV/c2 and 3000 GeV/c2 are drawn as lines. At values of MtW > 800 GeV/c2, all three reconstructions show very similar results. Going to lower masses, the reconstruction using the CMS top tagger becomes less efficient. In this mass region, the top-jets have, on

Figure 7.12: Invariant mass distribution of the reconstructed tW system after the final selection. Signal events are drawn as black lines for B′ on the left plots and for b∗ on the right plots. [Panels: (a), (b) HOTVR, pT > 200 GeV/c; (c), (d) CMS top tagger, pT > 200 GeV/c; (e), (f) CMS top tagger, pT > 400 GeV/c. Axes: events versus MtW [GeV/c2], each with a data/background ratio panel; 35.9 fb−1 (13 TeV).]

Figure 7.13: Comparison of the expected cross section limits for (a) B′ and (b) b∗ derived from the distributions in Fig. 7.12. [Plots: expected limit at 95 % C.L. on σ × BR [pb] versus (a) MB′ [GeV/c2] and (b) Mb∗ [GeV/c2] for HOTVR, the CMS top tagger with pT > 200 GeV/c and the CMS top tagger with pT > 400 GeV/c.]

average, pT < 400 GeV/c. The top quark decay is not captured well by AK8-jets in this kinematic region, resulting in a drop of efficiency for the CMS top tagger, as demonstrated in section 6.2.
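The overflow-bin convention used for the distributions in Fig. 7.12 can be illustrated with a short Python sketch. The bin edges in the usage example are hypothetical and do not reproduce the binning of the analysis.

```python
import bisect


def fill_with_overflow(values, edges):
    """Histogram values into variable-width bins, folding overflow into the last bin.

    edges : ascending bin edges; bin i covers [edges[i], edges[i + 1])
    Values at or above the last edge are counted in the last bin.
    Values below the first edge are dropped.
    """
    counts = [0] * (len(edges) - 1)
    for value in values:
        if value < edges[0]:
            continue
        index = bisect.bisect_right(edges, value) - 1
        counts[min(index, len(counts) - 1)] += 1
    return counts


# Hypothetical MtW values and bin edges in GeV/c^2 (illustration only).
example_counts = fill_with_overflow(
    [100.0, 600.0, 1500.0, 4999.0, 5000.0, 7000.0],
    [0.0, 500.0, 1000.0, 2000.0, 5000.0],
)
```

Because the last bin also collects all events above its upper edge, its content does not correspond to the mass range suggested by its drawn width.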

The expected cross section limits derived from the invariant mass distributions are compared in Fig. 7.13. All three reconstruction methods result in comparable cross section limits for masses above 900 GeV/c2. At lower masses, the sensitivity of the methods using the CMS top tagger drops drastically, while the method using HOTVR maintains good sensitivity.

7.4 Systematic Uncertainties

All systematic uncertainties that vary the normalization and/or shape of the final distribution of MtW have to be taken into account for the interpretation of this distribution. Each uncertainty is treated as recommended by CMS. The considered uncertainties are discussed in the following.

• The uncertainty on the luminosity measurement of the 2016 dataset is 2.5 % [67].

• For each sample produced with MadGraph and powheg, a set of slightly varied PDFs is provided. These varied PDFs are used to estimate the uncertainty on the PDF. The final invariant mass distribution is filled into a separate histogram for each PDF. From these histograms, the symmetric standard deviation with respect to the nominal value is calculated for each bin. The nominal distribution is then varied up and down by one standard deviation in each bin simultaneously to estimate the systematic uncertainty arising from the PDF uncertainty.

• The uncertainties on the production cross section for all MC samples, except the QCD sample, are measured by the CMS collaboration. The QCD production cross section uncertainty is estimated conservatively. The following uncertainties are applied to the respective samples:

– t¯t: 5.6 % [68]

– W+jets and Z+jets: 10 % [69]

– Single top: 10 % [70,71]

– Diboson: 20 % [72,73]

– QCD: 100 %

• The renormalization and factorization scales µr and µf can be varied in MC samples produced with MadGraph and powheg. These samples contain weights corresponding to different choices of µr and µf. Each scale can be varied up by a factor of 2 and down by a factor of 0.5. Excluding the cases where one scale is varied up while the other is varied down, this results in six combinations of scale variations in addition to the nominal values. For each combination, a separate invariant mass histogram is filled. The envelope of the histograms is taken as the systematic uncertainty arising from the renormalization and factorization scale uncertainties.

• The uncertainties on the jet energy correction (JEC) for HOTVR are treated in two parts. The uncertainties from the L2L3 and L2L3res corrections are provided by CMS and applied as recommended [74]. The uncertainty on the pile-up correction presented in section 5.4.1 is addressed separately. It is estimated by calculating the relative deviation of the subjet pT from the pT of the matched generator level subjet as a function of ρ and varying the subjet pT up and down by this deviation. The relative deviation $p_T^{\mathrm{rec}}/p_T^{\mathrm{gen}}$ as a function of ρ is shown in Fig. 7.14.

• The uncertainties on the muon efficiency scale factors were estimated by the CMS collaboration as functions of pT and η of the identified muon candidate. The statistical uncertainties on the scale factors are added in quadrature to a constant value provided by the CMS collaboration. The following constant values were added in quadrature to the statistical uncertainties:

– Uncertainty on muon reconstruction efficiency: 1.0 %

– Uncertainty on muon trigger scale factor: 0.5 %

– Uncertainty on relative muon isolation scale factor: 1.0 %

– Uncertainty on muon ID scale factor: 1.0 %

The scale factors were each varied up and down by the resulting uncertainty and the deviation in the final invariant mass spectrum is taken as a systematic uncertainty.

• The minimum bias cross section of 69.2 mb is used for the pile-up reweighting. This cross section is varied up and down by 5 %, as recommended and the resulting cross sections are used for pile-up reweighting instead [75]. The resulting variations of the final distribution are taken as systematic uncertainties.

• The systematic uncertainties from the top pT reweighting are determined from the difference in the final invariant mass spectrum between applying and not applying the reweighting [64].

• The b tagging scale factors provided by CMS are varied up and down within their statistical uncertainties, as recommended [35]. Here, the b- and c-tagging uncertainties are taken as fully correlated, while the tagging efficiency of jets from light quarks and gluons is taken as uncorrelated to the b- and c-tagging efficiency. Therefore, these two uncertainties are considered separately. The resulting deviation in the final invariant mass distribution is taken as a systematic uncertainty.

• The measured top tagging scale factors are varied within their statistical uncertainties. Again, the deviation in the final invariant mass spectrum is taken as a systematic uncertainty.

All systematic uncertainties discussed above are assumed to be uncorrelated and are added in quadrature.
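Three of the recipes above (the per-bin PDF standard deviation, the scale envelope and the quadrature sum) can be sketched in Python as follows. Histograms are represented as plain lists of bin contents. Dividing by the number of PDF variations in the standard deviation is an assumption of this sketch, since the normalization is not specified in the text, and all function names are illustrative.

```python
import math


def pdf_uncertainty(nominal, variations):
    """Per-bin standard deviation of the PDF-varied histograms around the nominal.

    Assumes the deviation is normalized by the number of variations.
    """
    n_var = len(variations)
    return [
        math.sqrt(sum((var[i] - nominal[i]) ** 2 for var in variations) / n_var)
        for i in range(len(nominal))
    ]


def scale_envelope(nominal, variations):
    """Per-bin minimum and maximum over the nominal and all scale variations."""
    down = [min([nominal[i]] + [var[i] for var in variations])
            for i in range(len(nominal))]
    up = [max([nominal[i]] + [var[i] for var in variations])
          for i in range(len(nominal))]
    return down, up


def add_in_quadrature(*uncertainties):
    """Combine uncorrelated uncertainties."""
    return math.sqrt(sum(u ** 2 for u in uncertainties))
```

For example, `add_in_quadrature(0.025, 0.056)` combines the luminosity uncertainty with the t¯t cross section uncertainty for a quantity that is affected by both.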

Figure 7.14: Relative deviation of the subjet pT to the pT of the matched generator level subjet, $p_T^{\mathrm{rec}}/p_T^{\mathrm{gen}}$, as a function of ρ. The error bars show the statistical uncertainties. [Plot: $p_T^{\mathrm{rec}}/p_T^{\mathrm{gen}}$ versus ρ [GeV/c].]

8 Results

The statistical interpretation of the invariant tW mass distribution reconstructed in chapter 7 is discussed in this chapter. Afterwards, possible improvements of the presented analysis are discussed.

8.1 Statistical Interpretation

The number of events passing the full selection is given in table 8.1. The statistical and systematic uncertainties are given separately. The invariant mass distribution of the tW system MtW is shown in Fig. 8.1. The total uncertainty of the SM prediction is depicted as the dashed area in the upper part. The dark grey area in the lower part of the plot shows the statistical uncertainty, while the light grey area depicts the total uncertainty. The data is in good agreement with the SM prediction within the uncertainties.

Since no excess of data is observed, the MtW distribution is used to calculate upper limits on the production cross section of a singly produced b∗ decaying to tW and of a singly produced B′ decaying to tW with an associated b quark. This is done by performing a binned likelihood fit using the theta framework [76]. Statistical uncertainties, as well as all systematic uncertainties discussed in section 7.4, are taken into account as nuisance parameters in the fit. All systematic uncertainties affecting only the normalization of the distribution are accounted for as nuisance parameters with a log-normal prior. For systematic uncertainties affecting the shape, a Gaussian prior is used to interpolate between the shifted and the nominal shape. The statistical uncertainties of the simulated processes are addressed using the Barlow-Beeston lite method [77,78]. Here, nuisance parameters with a Gaussian distribution are used for each bin, and the likelihood is maximized with respect to these parameters.
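Schematically, a binned likelihood of the kind described above can be written as shown below. The notation is an assumption of this sketch and is not taken from the thesis or from the theta documentation: $n_i$ is the observed number of events in bin $i$, $s_i$ and $b_{k,i}$ are the signal and background templates, $\beta_s$ scales the signal cross section, $\theta$ collects the nuisance parameters, and $\pi_j$ are their priors (log-normal for normalization uncertainties, Gaussian for shape uncertainties).

```latex
\begin{equation*}
  L(\beta_s, \theta) \;=\;
  \prod_{i \,\in\, \mathrm{bins}}
    \mathrm{Pois}\!\left( n_i \,\middle|\,
      \beta_s\, s_i(\theta) + \sum_{k} b_{k,i}(\theta) \right)
  \;\times\; \prod_{j} \pi_j(\theta_j)
\end{equation*}
```

An upper limit on $\beta_s$ then translates directly into an upper limit on the signal cross section times branching ratio.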

The results of the fits are shown in Figs. 8.2 and 8.3 for the considered benchmark models as a function of the particle mass. The dotted line depicts the expected upper cross section limits at the 95 % confidence level. The green and orange areas correspond to one and two standard deviations of the expected limit, respectively. The solid line shows the observed upper limit at 95 % confidence level in data. The dashed line shows the theoretical prediction of the cross section for the respective benchmark model. The observed limits are in agreement with the expected limits within two standard deviations over the whole considered mass range.

Process                Event yield ± stat. +syst. up/−syst. down

b∗ → tW                left-handed                 right-handed                vector-like
M = 1200 GeV/c2        771 ± 23.9 +148/−143        926 ± 26.3 +177/−173        1697 ± 35.5 +325/−316
M = 1400 GeV/c2        352 ± 10.3 +76.1/−74.8      411 ± 11.2 +87.6/−85.7      704 ± 14.5 +152/−150
M = 1600 GeV/c2        172 ± 4.75 +43.6/−43.0      211 ± 5.31 +52.2/−51.3      383 ± 7.12 +95.6/−94.2
M = 1800 GeV/c2        83.9 ± 2.26 +23.7/−23.3     101 ± 2.51 +28.7/−28.2      184 ± 3.37 +52.3/−51.4
M = 2000 GeV/c2        43.3 ± 1.13 +14.3/−14.1     51.7 ± 1.24 +17.1/−16.9     95.0 ± 1.68 +31.4/−31.0
M = 2200 GeV/c2        22.5 ± 0.59 +8.33/−8.22     27.0 ± 0.63 +10.3/−10.1     49.6 ± 0.87 +18.6/−18.4
M = 2400 GeV/c2        11.8 ± 0.30 +4.91/−4.86     14.5 ± 0.33 +5.99/−5.91     26.3 ± 0.45 +10.9/−10.8
M = 2600 GeV/c2        6.46 ± 0.16 +3.01/−2.97     7.78 ± 0.20 +3.62/−3.58     14.1 ± 0.25 +6.56/−6.48
M = 2800 GeV/c2        3.65 ± 0.10 +1.97/−1.95     4.23 ± 0.10 +2.22/−2.20     7.95 ± 0.14 +4.22/−4.18
M = 3000 GeV/c2        2.08 ± 0.05 +1.21/−1.20     2.31 ± 0.05 +1.39/−1.38     4.39 ± 0.07 +2.59/−2.57

B′ → tW                left-handed                 right-handed                -
M = 700 GeV/c2         141 ± 7.61 +39.4/−30.77     182 ± 8.90 +50.7/−38.1      -
M = 800 GeV/c2         145 ± 6.45 +42.6/−30.5      164 ± 6.85 +48.1/−35.2      -
M = 900 GeV/c2         135 ± 5.33 +41.1/−30.5      159 ± 5.76 +47.6/−35.2      -
M = 1000 GeV/c2        118 ± 4.30 +37.5/−27.4      140 ± 4.68 +44.2/−32.7      -
M = 1100 GeV/c2        96.6 ± 2.34 +31.2/−22.6     121 ± 3.69 +39.0/−28.3      -
M = 1200 GeV/c2        81.1 ± 2.75 +26.9/−19.4     103 ± 2.95 +34.5/−24.8      -
M = 1300 GeV/c2        67.5 ± 2.08 +23.0/−16.4     86.1 ± 2.34 +29.3/−21.0     -
M = 1400 GeV/c2        53.7 ± 1.63 +18.9/−13.5     71.5 ± 1.89 +24.8/−17.7     -
M = 1500 GeV/c2        47.2 ± 1.36 +17.0/−12.1     60.5 ± 1.53 +21.7/−15.3     -
M = 1600 GeV/c2        39.0 ± 1.08 +11.9/−8.4      46.4 ± 1.18 +17.1/−12.1     -
M = 1700 GeV/c2        31.8 ± 0.88 +11.9/−8.38     40.6 ± 0.97 +15.2/−10.7     -
M = 1800 GeV/c2        27.9 ± 0.72 +10.7/−7.55     32.0 ± 0.77 +12.4/−8.75     -

t¯t                    10 043 ± 43.2 +1521/−1376
W+jets                 3544 ± 57.6 +794/−693
Single top             1190 ± 15.7 +181/−184
Z+jets                 132 ± 5.07 +23.3/−20.6
Diboson                264 ± 32.6 +57.5/−55.8
QCD                    37.2 ± 13.0 +39.9/−37.2
total SM background    15 210 ± 81.8 +1727/−1553
Data                   13 327

Table 8.1: Total number of events after the full selection for the signal processes b∗ → tW and B′ → tW, expected SM background and data. Statistical and systematic uncertainties are given separately.

Figure 8.1: Distribution of the invariant tW mass after the final selection. Statistical and systematic uncertainties are shown in the ratio as dark and light gray areas, respectively. [Plot: events versus MtW [GeV/c2] for data, SM backgrounds and b∗ signal samples with M = 1200, 2000 and 3000 GeV/c2, with a data/background ratio panel; 35.9 fb−1 (13 TeV).]

The observed limits for the b∗ are compared to the predicted cross sections for the different benchmark models to set mass exclusion limits for these models. The left-handed, right-handed and vector-like b∗ quarks can be excluded up to masses of 2050 GeV/c2, 2150 GeV/c2 and 2350 GeV/c2, respectively.
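The conversion of a cross section limit into a mass exclusion limit can be sketched as the search for the mass at which the predicted cross section falls below the observed limit. The interpolation method is not stated in the text; the log-linear interpolation between neighbouring mass points used here, the function name and any input numbers are assumptions of this sketch.

```python
import math


def exclusion_mass(masses, observed_limit, theory_xsec):
    """Mass at which the theory curve first drops below the observed limit.

    masses         : ascending list of mass points
    observed_limit : observed upper cross section limit at each mass point
    theory_xsec    : predicted cross section at each mass point
    Both curves are interpolated linearly in the logarithm of the cross
    section. Returns None if no crossing is found.
    """
    for i in range(len(masses) - 1):
        d0 = math.log(theory_xsec[i]) - math.log(observed_limit[i])
        d1 = math.log(theory_xsec[i + 1]) - math.log(observed_limit[i + 1])
        if d0 > 0.0 and d1 <= 0.0:
            fraction = d0 / (d0 - d1)
            return masses[i] + fraction * (masses[i + 1] - masses[i])
    return None
```

Masses below the returned value are excluded as long as the predicted cross section stays above the observed limit for all lower mass points.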

8.2 Further Improvements of the Analysis

The presented analysis is able to set stringent cross section limits on both the excited bottom quark b∗ and the vector-like B′ quark over a wide mass range. There is, however, still room for improvement. A few possible improvements are discussed briefly.

• The signal sensitivity, especially in the low-mass region, could be improved by using a more sophisticated method to remove the pile-up dependency of the HOTVR algorithm. The Pileup Per Particle Identification (PUPPI) [52] algorithm could be used to remove pile-up from the HOTVR jets in the clustering process. Thus, not only the effect of pile-up on the jet four-momentum, but also its effect on the jet shape could be avoided.

• The background prediction is completely taken from MC simulations. A data-driven background estimation for the dominant t¯t background would give a more reliable estimation for the expected background in this region.

• In this analysis, only the case where the top decays hadronically and the W decays into a muon and a muon neutrino is considered. The signal efficiency could be increased if another category were added, in which a hadronically decaying W and a leptonically decaying top are assumed. In this case, HOTVR could be used as a W tagger, while the top would be reconstructed from a b tagged jet, the muon and the missing transverse energy, similar to the W reconstruction in this analysis.

• The analysis could be extended to the electron+jets final state by introducing another category, where an isolated electron is required, instead of an isolated muon.

Figure 8.2: Expected and observed upper limits on the production cross section for singly produced excited bottom quarks b∗ decaying to tW at the 95 % confidence level. The dashed line represents the theoretical prediction of the production cross section. The green and orange areas correspond to one and two standard deviations of the expected limit, respectively. The three benchmark models shown have (a) purely left-handed, (b) purely right-handed and (c) vector-like coupling to the W boson. [Plots: σ(bg → b∗ → tW) × BR [pb] versus Mb∗ [GeV/c2].]

Figure 8.3: Expected and observed upper limits on the production cross section for singly produced vector-like B′ quark decaying to tW at the 95 % confidence level. The dashed line represents the theoretical prediction of the production cross section. The green and orange areas correspond to one and two standard deviations of the expected limit, respectively. The two benchmark models shown have (a) purely left-handed and (b) purely right-handed coupling to the W boson. [Plots: σ(pp → B′ → tW) × BR [pb] versus MB′ [GeV/c2], with BR(B′ → bZ) = BR(B′ → tW) = 0.5.]

9 Summary

A search for a singly produced excited bottom quark b∗ decaying to a top quark and a W boson in the muon+jets final state was presented. It was demonstrated that the search is also sensitive to a singly produced vector-like B′ quark decaying to a top quark and a W boson in the muon+jets final state. The dataset analyzed was recorded with the CMS detector in 2016 at √s = 13 TeV and corresponds to an integrated luminosity of 35.87 fb−1.

Two different top tagging methods were introduced and compared. The HOTVR top tagging algorithm was implemented for the first time in a search for new physics. A jet energy correction method for HOTVR jets was developed and the stable performance of the tagger over a wide kinematic range was demonstrated. The identified top-jets are used to reconstruct the invariant mass of the tW system, which is used as the discriminating variable between signal and background events in this analysis.

Good agreement between data and the standard model prediction within the estimated uncertainties was observed. Since there was no excess of data above the expected background, upper limits on the production cross sections for the searched processes were set.

The production cross section limits on the excited bottom quark b∗ were converted into mass exclusion limits. Excited bottom quarks with masses up to 2050 GeV/c2, 2150 GeV/c2 and 2350 GeV/c2 can be excluded at 95 % confidence level for purely left-handed, purely right-handed and vector-like benchmark couplings, respectively. These are the most stringent limits on the b∗ mass to date.

Bibliography

[1] CMS Collaboration, “Observation of a new boson at a mass of 125 GeV with the CMS experiment at the LHC,” Phys. Lett. B716 (2012) 30–61, arXiv:1207.7235.

[2] ATLAS Collaboration, “Observation of a new particle in the search for the Standard Model Higgs boson with the ATLAS detector at the LHC,” Phys. Lett. B716 (2012) 1–29, arXiv:1207.7214.

[3] A. Djouadi and A. Lenz, “Sealing the fate of a fourth generation of fermions,” Phys. Lett. B715 (2012) 310–314, arXiv:1204.1252.

[4] T. Lapsien, R. Kogler, and J. Haller, “A new tagger for hadronically decaying heavy particles at the LHC,” Eur. Phys. J. C76 no. 11, (2016) 600, arXiv:1606.04961.

[5] D. J. Griffiths, Introduction to elementary particles; 2nd rev. version. Physics textbook. Wiley, New York, NY, 2008. https://cds.cern.ch/record/111880.

[6] M. E. Peskin and D. V. Schroeder, An Introduction to Quantum Field Theory; 1995 ed. Westview, Boulder, CO, 1995. https://cds.cern.ch/record/257493.

[7] CMS Collaboration, “Measurement of the top quark mass using proton-proton data at √s = 7 and 8 TeV,” Phys. Rev. D93 no. 7, (2016) 072004, arXiv:1509.04044.

[8] S. L. Glashow, “Partial Symmetries of Weak Interactions,” Nucl. Phys. 22 (1961) 579–588.

[9] S. Weinberg, “A Model of Leptons,” Phys. Rev. Lett. 19 (1967) 1264–1266.

[10] Particle Data Group Collaboration, “Review of Particle Physics,” Chin. Phys. C40 no. 10, (2016) 100001.

[11] P. W. Higgs, “Broken Symmetries and the Masses of Gauge Bosons,” Phys. Rev. Lett. 13 (1964) 508–509.

[12] P. W. Higgs, “Broken symmetries, massless particles and gauge fields,” Phys. Lett. 12 (1964) 132–133.

[13] F. Englert and R. Brout, “Broken Symmetry and the Mass of Gauge Vector Mesons,” Phys. Rev. Lett. 13 (1964) 321–323.

[14] ATLAS, CMS Collaboration, “Combined Measurement of the Higgs Boson Mass in pp Collisions at √s = 7 and 8 TeV with the ATLAS and CMS Experiments,” Phys. Rev. Lett. 114 (2015) 191803, arXiv:1503.07589.

[15] S. P. Martin, “A Supersymmetry primer,” arXiv:hep-ph/9709356. [Adv. Ser. Direct. High Energy Phys.18,1(1998)].

[16] L. Randall and R. Sundrum, “A Large mass hierarchy from a small extra dimension,” Phys. Rev. Lett. 83 (1999) 3370–3373, arXiv:hep-ph/9905221.

[17] H. Harari, “A Schematic Model of Quarks and Leptons,” Phys. Lett. 86B (1979) 83–86.

[18] M. A. Shupe, “A Composite Model of Leptons and Quarks,” Phys. Lett. 86B (1979) 87–92.

[19] M. J. Dugan, H. Georgi, and D. B. Kaplan, “Anatomy of a Composite Higgs Model,” Nucl. Phys. B254 (1985) 299–326.

[20] U. Baur, I. Hinchliffe, and D. Zeppenfeld, “Excited Quark Production at Hadron Colliders,” Int. J. Mod. Phys. A2 (1987) 1285.

[21] J. Nutter, R. Schwienhorst, D. G. E. Walker, and J.-H. Yu, “Single Top Production as a Probe of B-prime Quarks,” Phys. Rev. D86 (2012) 094006, arXiv:1207.5179.

[22] CMS Collaboration, “Search for the production of an excited bottom quark decaying to tW in proton-proton collisions at √s = 8 TeV,” JHEP 01 (2016) 166, arXiv:1509.08141.

[23] ATLAS Collaboration, “Search for single b∗-quark production with the ATLAS detector at √s = 7 TeV,” Phys. Lett. B721 (2013) 171–189, arXiv:1301.1583.

[24] J. A. Aguilar-Saavedra, R. Benbrik, S. Heinemeyer, and M. Pérez-Victoria, “Handbook of vectorlike quarks: Mixing and single production,” Phys. Rev. D88 no. 9, (2013) 094010, arXiv:1306.0572.

[25] O. Matsedonskyi, G. Panico, and A. Wulzer, “On the Interpretation of Top Partners Searches,” JHEP 12 (2014) 097, arXiv:1409.0100.

[26] A. D. Martin, W. J. Stirling, R. S. Thorne, and G. Watt, “Parton distributions for the LHC,” Eur. Phys. J. C63 (2009) 189–285, arXiv:0901.0002.

[27] M. Cacciari, G. P. Salam, and G. Soyez, “The Anti-k(t) jet clustering algorithm,” JHEP 04 (2008) 063, arXiv:0802.1189.

[28] S. D. Ellis and D. E. Soper, “Successive combination jet algorithm for hadron collisions,” Phys. Rev. D48 (1993) 3160–3166, arXiv:hep-ph/9305266.

[29] S. Catani, Y. L. Dokshitzer, M. H. Seymour, and B. R. Webber, “Longitudinally invariant Kt clustering algorithms for hadron hadron collisions,” Nucl. Phys. B406 (1993) 187–224.

[30] Y. L. Dokshitzer, G. D. Leder, S. Moretti, and B. R. Webber, “Better jet clustering algorithms,” JHEP 08 (1997) 001, arXiv:hep-ph/9707323.

[31] M. Wobisch and T. Wengler, “Hadronization corrections to jet cross-sections in deep inelastic scattering,” in Monte Carlo generators for HERA physics. Proceedings, Workshop, Hamburg, Germany, 1998-1999, pp. 270–279. 1998. arXiv:hep-ph/9907280. https://inspirehep.net/record/484872/files/arXiv:hep-ph_9907280.pdf.

[32] A. J. Larkoski, S. Marzani, G. Soyez, and J. Thaler, “Soft Drop,” JHEP 05 (2014) 146, arXiv:1402.2657.

[33] J. Thaler and K. Van Tilburg, “Identifying Boosted Objects with N-subjettiness,” JHEP 03 (2011) 015, arXiv:1011.2268.

[34] CMS Collaboration, “Identification of b-quark jets with the CMS experiment,” JINST 8 (2013) P04013, arXiv:1211.4462.

[35] CMS Collaboration, “Identification of b quark jets at the CMS Experiment in the LHC Run 2,” Tech. Rep. CMS-PAS-BTV-15-001, CERN, Geneva, 2016. https://cds.cern.ch/record/2138504.

[36] CMS Collaboration, “Boosted Top Jet Tagging.” https://twiki.cern.ch/twiki/bin/view/CMS/JetTopTagging#13_TeV_working_points_CMSSW_8_0. Accessed: 2017-09-13.

[37] L. Evans and P. Bryant, “LHC Machine,” JINST 3 (2008) S08001.

[38] ALICE Collaboration, “The ALICE experiment at the CERN LHC,” JINST 3 (2008) S08002.

[39] ATLAS Collaboration, “The ATLAS Experiment at the CERN Large Hadron Collider,” JINST 3 (2008) S08003.

[40] CMS Collaboration, “The CMS Experiment at the CERN LHC,” JINST 3 (2008) S08004.

[41] LHCb Collaboration, “The LHCb Detector at the LHC,” JINST 3 (2008) S08005.

[42] CMS Collaboration, “Public CMS Luminosity Information.” https://twiki.cern.ch/twiki/bin/view/CMSPublic/LumiPublicResults#Luminosity_versus_day_AN1. Accessed: 2017-08-23.

[43] S. van der Meer, “Calibration of the Effective Beam Height in the ISR,”.

[44] CMS Collaboration, “Absolute Calibration of Luminosity Measurement at CMS: Summer 2011 Update,”.

[45] CMS Collaboration, “CMS physics: Technical design report,”.

[46] CMS HCAL Collaboration, “Design, Performance, and Calibration of CMS Hadron-Barrel Calorimeter Wedges,”.

[47] CMS Collaboration, “Particle-flow reconstruction and global event description with the CMS detector,” JINST 12 (2017) P10003, arXiv:1706.04965.

[48] CMS Collaboration, “Description and performance of track and primary-vertex reconstruction with the CMS tracker,” JINST 9 no. 10, (2014) P10009, arXiv:1405.6569.

[49] CMS Collaboration, “Performance of CMS muon reconstruction in pp collision events at √s = 7 TeV,” JINST 7 (2012) P10002, arXiv:1206.4071.

[50] CMS Collaboration, “Jet Identification for the 13 TeV data Run2016.” https://twiki.cern.ch/twiki/bin/view/CMS/JetID13TeVRun2016. Accessed: 2017-08-23.

[51] CMS Collaboration, “Jet energy scale and resolution in the CMS experiment in pp collisions at 8 TeV,” JINST 12 no. 02, (2017) P02014, arXiv:1607.03663.

[52] D. Bertolini, P. Harris, M. Low, and N. Tran, “Pileup Per Particle Identification,” JHEP 10 (2014) 059, arXiv:1407.6013.

[53] J. Alwall, R. Frederix, S. Frixione, V. Hirschi, F. Maltoni, O. Mattelaer, H. S. Shao, T. Stelzer, P. Torrielli, and M. Zaro, “The automated computation of tree-level and next-to-leading order differential cross sections, and their matching to parton shower simulations,” JHEP 07 (2014) 079, arXiv:1405.0301.

[54] P. Nason, “A New method for combining NLO QCD with shower Monte Carlo algorithms,” JHEP 11 (2004) 040, arXiv:hep-ph/0409146.

[55] S. Frixione, P. Nason, and C. Oleari, “Matching NLO QCD computations with Parton Shower simulations: the POWHEG method,” JHEP 11 (2007) 070, arXiv:0709.2092.

[56] S. Frixione, P. Nason, and G. Ridolfi, “A Positive-weight next-to-leading-order Monte Carlo for heavy flavour hadroproduction,” JHEP 09 (2007) 126, arXiv:0707.3088.

[57] E. Re, “Single-top Wt-channel production matched with parton showers using the POWHEG method,” Eur. Phys. J. C71 (2011) 1547, arXiv:1009.2450.

[58] S. Alioli, P. Nason, C. Oleari, and E. Re, “A general framework for implementing NLO calculations in shower Monte Carlo programs: the POWHEG BOX,” JHEP 06 (2010) 043, arXiv:1002.2581.

[59] T. Sjöstrand, S. Ask, J. R. Christiansen, R. Corke, N. Desai, P. Ilten, S. Mrenna, S. Prestel, C. O. Rasmussen, and P. Z. Skands, “An Introduction to PYTHIA 8.2,” Comput. Phys. Commun. 191 (2015) 159–177, arXiv:1410.3012.

[60] T. Sjöstrand, S. Mrenna, and P. Z. Skands, “PYTHIA 6.4 Physics and Manual,” JHEP 05 (2006) 026, arXiv:hep-ph/0603175.

[61] M. L. Mangano, M. Moretti, F. Piccinini, and M. Treccani, “Matching matrix elements and shower evolution for top-quark production in hadronic collisions,” JHEP 01 (2007) 013, arXiv:hep-ph/0611129.

[62] GEANT4 Collaboration, “GEANT4: A Simulation toolkit,” Nucl. Instrum. Meth. A506 (2003) 250–303.

[63] J. M. Campbell, R. K. Ellis, and F. Tramontano, “Single top production and decay at next-to-leading order,” Phys. Rev. D70 (2004) 094012, arXiv:hep-ph/0408158.

[64] CMS Collaboration, “pt(top-quark) based reweighting of ttbar MC.” https://twiki.cern.ch/twiki/bin/view/CMS/TopPtReweighting. Accessed: 2017-10-20.

[65] CMS Collaboration, “Measurement of the differential cross section for tt̄ production in the dilepton final state at √s = 13 TeV,”.

[66] CMS Tracker Collaboration, “The effect of highly ionising particles on the CMS silicon strip tracker,” Nucl. Instrum. Meth. A543 no. 2-3, (2005) 463–482.

[67] CMS Collaboration, “CMS Luminosity Measurements for the 2016 Data Taking Period,”.

[68] CMS Collaboration, “Measurement of the tt̄ production cross section using events in the eµ final state in pp collisions at √s = 13 TeV,” Eur. Phys. J. C77 (2017) 172, arXiv:1611.04040.

[69] CMS Collaboration, “Measurement of inclusive W and Z boson production cross sections in pp collisions at √s = 13 TeV,”.

[70] CMS Collaboration, “Cross section measurement of t-channel single top quark production in pp collisions at √s = 13 TeV,” Phys. Lett. B772 (2017) 752–776, arXiv:1610.00678.

[71] N. Kidonakis, “Top Quark Production,” in Proceedings, Helmholtz International Summer School on Physics of Heavy Quarks and Hadrons (HQ 2013): JINR, Dubna, Russia, July 15-28, 2013, pp. 139–168. 2014. arXiv:1311.0283. https://inspirehep.net/record/1263209/files/arXiv:1311.0283.pdf.

[72] J. M. Campbell, R. K. Ellis, and C. Williams, “Vector boson pair production at the LHC,” JHEP 07 (2011) 018, arXiv:1105.0020.

[73] T. Gehrmann, M. Grazzini, S. Kallweit, P. Maierhöfer, A. von Manteuffel, S. Pozzorini, D. Rathlev, and L. Tancredi, “W+W− Production at Hadron Colliders in Next to Next to Leading Order QCD,” Phys. Rev. Lett. 113 no. 21, (2014) 212001, arXiv:1408.5243.

[74] CMS Collaboration, “Recommended Jet Energy Corrections and Uncertainties For Data and MC.” https://twiki.cern.ch/twiki/bin/view/CMS/JECDataMC. Accessed: 2017-10-20.

[75] CMS Collaboration, “Utilities for Accessing Pileup Information for Data.” https://twiki.cern.ch/twiki/bin/viewauth/CMS/PileupJSONFileforData#Pileup_JSON_Files_For_Run_II. Accessed: 2017-10-20.

[76] T. Müller, J. Ott, and J. Wagner-Kuhr, “theta a framework for template-based modeling and inference.” http://www-ekp.physik.uni-karlsruhe.de/~ott/theta/theta.pdf. Accessed: 2017-10-27.

[77] R. J. Barlow and C. Beeston, “Fitting using finite Monte Carlo samples,” Comput. Phys. Commun. 77 (1993) 219–228.

[78] J. S. Conway, “Incorporating Nuisance Parameters in Likelihoods for Multisource Spectra,” in Proceedings, PHYSTAT 2011 Workshop on Statistical Issues Related to Discovery Claims in Search Experiments and Unfolding, CERN, Geneva, Switzerland 17-20 January 2011, pp. 115–120. 2011. arXiv:1103.0354. https://inspirehep.net/record/891252/files/arXiv:1103.0354.pdf.

Declaration

I hereby confirm that I wrote the present thesis independently and that I used no aids other than those stated, in particular no internet sources that are not named in the list of references, and that I have not previously submitted this thesis in any other examination procedure. The submitted written version corresponds to the version on the electronic storage medium. I agree to the publication of this Master's thesis.

Place, date Alexander Fröhlich

Acknowledgements

At this point I would like to sincerely thank everyone who supported me during the preparation of this thesis and throughout my studies.

First, I would like to thank Prof. Dr. Johannes Haller for the opportunity to write my Master's thesis in his group, and for the many helpful comments and suggestions in the weekly group meetings.

I thank Dr. Roman Kogler for the good supervision, the many very helpful discussions and suggestions, and also for agreeing to act as second examiner.

I thank the entire working group for the pleasant working atmosphere and for the many discussions and the help with problems or questions. I especially thank my office colleagues Arne Reimers, Marc Stöver and Irene Zoi for the consistently good mood in the office; I really enjoyed my time with you.

I would like to thank my girlfriend for the wonderful time together and for cheering me up during the stressful phases of this work.

Finally, I would like to sincerely thank my friends and my family for their support and motivation throughout my studies. I could always count on finding an open ear and good advice from you. I would especially like to thank my parents, who always believed in me and whose moral and financial support made these studies possible for me in the first place.