Signature Redacted Department of Physics May 7, 2015

Exclusive Cone Jet Algorithms for High Energy Particle Colliders
by Thomas Frederick Wilkason, Jr.
Submitted to the Department of Physics in partial fulfillment of the requirements for the degree of Bachelor of Science in Physics at the MASSACHUSETTS INSTITUTE OF TECHNOLOGY June 2015
© Massachusetts Institute of Technology 2015. All rights reserved.

Author ...... Signature redacted Department of Physics May 7, 2015

Certified by ...... Signature redacted Jesse Thaler Assistant Professor of Physics Thesis Supervisor

Accepted by ...... Signature redacted Nergis Mavalvala Senior Thesis Coordinator, Department of Physics

Exclusive Cone Jet Algorithms for High Energy Particle Colliders by Thomas Frederick Wilkason, Jr.

Submitted to the Department of Physics on May 7, 2015, in partial fulfillment of the requirements for the degree of Bachelor of Science in Physics

Abstract

In this thesis, I develop an exclusive cone jet algorithm based on the principles of jet substructure and demonstrate its use for physics analyses at the Large Hadron Collider. Based on the event shape N-jettiness, this algorithm, called "XCone," partitions the event into a fixed number of conical jets of size $R_0$ in the rapidity-azimuth plane. This algorithm is designed to locate substructure independent of momentum, allowing accurate resolution of jets at both low and high energy scales. I present three potential analyses using XCone to search for heavy resonances, Higgs bosons, and top quarks at various momenta and show that it reconstructs these particles with efficiencies between 60% and 80% without any additional substructure techniques, and maintains this efficiency over a wide kinematic range. This algorithm provides many key advantages over traditional jet algorithms that make it appealing for use at the LHC and other high energy particle colliders.

Thesis Supervisor: Jesse Thaler Title: Assistant Professor of Physics

Acknowledgments

First and foremost, I'd like to thank my advisor Jesse Thaler for his incredible mentorship over the past two years. Working on this and other projects under his guidance has taught me an enormous amount about how to conduct physics research, and I will be forever grateful for his efforts in teaching me how to become a better scientist. I will always admire his excitement and passion for the field, and I sincerely hope that we will have the opportunity to work more in the future.

I'd also like to thank my first advisor Christoph Paus for offering me a UROP position in the MIT CMS group during my freshman year and for taking me to CERN during the summer of 2013. The opportunity to learn about real particle physics research at such an early stage in my career helped solidify my own excitement and passion for the field.

I'd like to thank my friends in the physics department, in East Campus, and all over MIT for making my time here so enjoyable. I'd also like to thank my family for always supporting me in my life decisions, however strange they may seem.

This work is based on two upcoming publications. Chapter 2 is based primarily on "XCone: N-jettiness as an Exclusive Cone Jet Algorithm" by Iain Stewart, Frank Tackmann, Jesse Thaler, Christopher Vermilion, and myself. Chapter 3 is essentially identical to "Resolving Boosted Jets with XCone" by Jesse Thaler and myself. I would like to thank my collaborators Jesse, Iain, Frank, and Chris for allowing me to use this material for my thesis. I would also like to thank the MIT Undergraduate Research Opportunities Program for providing funding for this research through the Paul E. Gray Endowed Fund.

Contents

1 Introduction and Background
1.1 Physics of the Large Hadron Collider
1.2 Jet Definitions and Jet Algorithms

2 Introducing the XCone Jet Algorithm
2.1 N-jettiness as a Jet Algorithm
2.2 Choice of Measure
2.2.1 The Conical Measure
2.2.2 The Geometric Measure
2.2.3 The Conical Geometric Measure
2.2.4 The XCone Default Measure
2.3 Minimization of N-jettiness
2.3.1 One-Pass Minimization
2.3.2 Update Step for General Measures
2.3.3 Seed Axes for Minimization

3 Resolving Boosted Jets with XCone
3.1 Dijet Resonances and Comparison to Anti-$k_T$
3.1.1 N = 1
3.1.2 N = 2
3.1.3 N = 3
3.2 Boosted Higgs Bosons and Dynamic Split/Merge
3.2.1 N = 1
3.2.2 N = 2
3.2.3 N = 3
3.3 Boosted Top Quarks and High-Multiplicity Final States
3.3.1 N = 6
3.3.2 N = 2 x 3

4 Conclusion

List of Figures

2-1 Event displays and jet regions with various N-jettiness measures for N = 6.
2-2 Event displays and jet regions with XCone measures for N = 2.
2-3 Density plots showing alignment of global $\tau_N$ minimized axes and generalized $k_T$ axes.
3-1 Single jet kinematics of N = 1 XCone versus hardest anti-$k_T$ jet, measured on the dijet resonance sample.
3-2 Same N = 1 comparison as Fig. 3-1 with area distributions.
3-3 Dijet kinematics of N = 2 XCone versus the two hardest anti-$k_T$ jets, measured on the dijet resonance sample.
3-4 Comparing the third hardest jet kinematics between XCone N = 3 and anti-$k_T$, measured on the dijet resonance sample.
3-5 Same N = 3 comparison as Fig. 3-4 for area distributions.
3-6 Comparing mass and area of fat jet Higgs reconstruction between XCone N = 1 and hardest anti-$k_T$ jet.
3-7 Efficiency for N = 1 fat jet Higgs reconstruction as a function of Higgs $p_T$, with the mass window $m_j \in (100, 150)$ GeV.
3-8 Comparing mass and area of dijet Higgs reconstruction between XCone N = 2 and the two hardest anti-$k_T$ jets.
3-9 Efficiency for N = 2 resolved dijet Higgs reconstruction with and without IRS tagging as a function of Higgs $p_T$.
3-10 Comparing mass and area of three-jet top reconstruction between XCone with N = 6 and the six hardest anti-$k_T$ jets.
3-11 Comparing boosted top signal efficiency, background efficiency, and signal significance for XCone with N = 6 as a function of top $p_T$.
3-12 Comparing mass of three-jet top reconstruction between XCone with N = 2 x 3 and the boosted and resolved anti-$k_T$ strategies.
3-13 Comparing boosted top signal efficiency, background efficiency, and signal significance for XCone with N = 2 x 3 as a function of top $p_T$.

List of Tables

2.1 N-jettiness measures studied in this thesis.
2.2 Fraction of jets and events found with the heuristic minimum that are aligned with the "true minimum".
3.1 Fraction of events where the nth hardest XCone jet is aligned with the nth hardest anti-$k_T$ jet.

Chapter 1

Introduction and Background

The high energy collisions of the Large Hadron Collider (LHC) create an environment that is well-suited for the discovery of new physics. This environment is also dominated by the emission of jets, collimated collections of hadrons that result from the production of quarks or gluons. As a result, our ability to correctly understand and interpret signals of new physics relies heavily on effective and powerful tools for analyzing jets. In this thesis, I will present the idea of an exclusive cone jet algorithm, a new jet algorithm based on the principles of jet substructure that is designed to improve jet finding and reconstruction. I will also demonstrate the use of this algorithm in several key LHC channels to show its advantages.

1.1 Physics of the Large Hadron Collider

The Large Hadron Collider is a 27-km particle accelerator located outside of Geneva, Switzerland that is capable of producing proton-proton collisions at a center-of-mass energy of $\sqrt{s} = 14$ TeV. The LHC is designed to probe the energy frontier of particle physics and search for new particles both in and beyond the Standard Model. With the discovery of the Higgs boson in 2012, the LHC has effectively confirmed the validity of the Standard Model in describing particle physics at low energies. However, the Standard Model has a number of theoretical failings, including but not limited to the hierarchy problem and the nature of dark matter. It is expected that new physics should appear at the TeV scale to answer some of these questions, and so new results from the LHC are expected to help explain what lies beyond the Standard Model.

Since the LHC is designed to search for incredibly rare and short-lived particles, many of the events in the LHC are produced as a result of common, Standard Model processes. Some of the most prevalent and important types of events in the LHC are those containing jets. Due to the property of asymptotic freedom in the strong interaction, these high energy collisions are capable of producing individual partons (i.e. quarks and gluons) in the collision. However, due to color confinement, these partons cannot be individually measured. After the partons are produced in the collision, they quickly begin radiating off additional quarks and gluons, creating a showering effect known as "fragmentation." At the same time, the partons excite the vacuum into creating quark/anti-quark pairs that then couple to the original partons to create colorless bound states in a process called "hadronization." The combination of these two processes leads to the creation of jets in these collisions.

Jets serve a dual purpose at the LHC, as they can result both from interesting signal processes and from expected background processes. Partons are often the final product of the decays from both Standard Model and beyond the Standard Model particles, and as such, jets are a key component in the discovery of new physics. For example, supersymmetric models often include long decay chains down into a large multiplicity of Standard Model partons, leading to a large number of jets in the central region of the detector [14]. Indeed, one of the key search channels for the Higgs boson is the decay into two bottom quarks, resulting in two b-jets [23, 4]. The ability to accurately tag and resolve each of these different jets is thus required to accurately detect and measure the properties of these new particles.

In addition, due to the fact that the LHC is a hadron collider, there are a large number of soft interactions between the two protons that lead to a huge number of background jets. These include the 1) underlying event due to the unscattered partons in the proton and soft interactions between the two colliding protons that produce a large number of low-energy jets, and 2) pile-up, which occurs due to the large number of simultaneous proton-proton collisions that further produces a large number of low-energy jets. As a result, the key to accurately resolving important signal jets lies in the ability to discriminate signal jets from the background jets and the ability to mitigate these backgrounds so they have minimal effect on the final data.

1.2 Jet Definitions and Jet Algorithms

The dual processes of fragmentation and hadronization leave the definition of jets fundamentally ambiguous, as the evolution from a single parton into a final-state jet cannot be fully predicted or understood using conventional methods of quantum field theory. As a result, the analysis of jets requires the use of jet algorithms that define the jet based on characteristic properties such as direction, momentum, and size. These algorithms all must fulfill basic physical requirements, such as infrared and collinear (IRC) safety to guarantee the cancellation of divergences at each order in perturbative QCD, but take different approaches to defining and classifying jets [31, 46]. As a result, different algorithms will produce different jets for the same event, each with different locations, momenta, and shapes, and each with their own key analysis advantages and disadvantages. Although there are a large number of existing jet algorithms, the most popular at the LHC are the $k_T$-like sequential clustering algorithms [18, 20]. These algorithms create jets by taking each pairwise combination of particles and calculating a jet metric ($d_{ij}$) and a beam metric ($d_{iB}$). If $d_{ij}$

[29, 57, 56]. The above two approaches have the advantage of creating jets that retain characteristic information about the distribution of energy within the jet, but have the disadvantage that they are oddly shaped and difficult to calibrate in the experiment. The most commonly used algorithm at the LHC is the anti-$k_T$ algorithm, which clusters particles from hard to soft [18]. This opposes the original intuition of the showering process, but it was found that this algorithm accurately finds the hard centers of jets with ease. Further, for well-separated jets, the anti-$k_T$ algorithm creates nearly conical boundaries, making jets easier to calibrate in the detector and calculate using theoretical tools.
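The sequential clustering procedure discussed above can be illustrated with a short sketch. It assumes the standard generalized-$k_T$ distances, $d_{ij} = \min(p_{Ti}^{2p}, p_{Tj}^{2p})\,\Delta R_{ij}^2/R^2$ and $d_{iB} = p_{Ti}^{2p}$, with $p = 1$ for $k_T$, $p = 0$ for Cambridge/Aachen, and $p = -1$ for anti-$k_T$. The function names and the simplified $p_T$-weighted recombination are assumptions made for illustration only; a real analysis would use FastJet with four-vector recombination.

```python
import math

def delta_r2(a, b):
    """Squared rapidity-azimuth distance between two (pt, y, phi) tuples."""
    dphi = abs(a[2] - b[2]) % (2 * math.pi)
    if dphi > math.pi:
        dphi = 2 * math.pi - dphi
    return (a[1] - b[1]) ** 2 + dphi ** 2

def merge(a, b):
    """pT-weighted recombination (a simplification of four-vector addition)."""
    pt = a[0] + b[0]
    y = (a[0] * a[1] + b[0] * b[1]) / pt
    # Shift b's azimuth next to a's so that the average respects wraparound.
    dphi = (b[2] - a[2] + math.pi) % (2 * math.pi) - math.pi
    return (pt, y, a[2] + b[0] * dphi / pt)

def cluster(particles, R=0.4, p=-1):
    """Inclusive generalized-kT clustering of (pt, y, phi) tuples."""
    objs = list(particles)
    jets = []
    while objs:
        # Smallest beam distance d_iB = pt^(2p).
        best = min(range(len(objs)), key=lambda i: objs[i][0] ** (2 * p))
        d_min, pair = objs[best][0] ** (2 * p), None
        # Smallest pairwise distance d_ij, if it beats the beam distance.
        for i in range(len(objs)):
            for j in range(i + 1, len(objs)):
                d = (min(objs[i][0] ** (2 * p), objs[j][0] ** (2 * p))
                     * delta_r2(objs[i], objs[j]) / R ** 2)
                if d < d_min:
                    d_min, pair = d, (i, j)
        if pair is None:
            jets.append(objs.pop(best))      # promote object to a final jet
        else:
            i, j = pair
            merged = merge(objs[i], objs[j])
            objs.pop(j)                      # pop the larger index first
            objs.pop(i)
            objs.append(merged)
    return sorted(jets, key=lambda jet: -jet[0])

event = [(100.0, 0.0, 0.0), (10.0, 0.1, 0.1), (80.0, 1.5, 2.0), (5.0, 1.6, 2.1)]
jets = cluster(event, R=0.4, p=-1)
```

With `p=-1` the hardest particles have the smallest distances, so soft particles are absorbed by nearby hard ones first, which is the hard-to-soft ordering described above. The number of jets returned is not fixed in advance, which is the inclusive behavior that an exclusive algorithm is meant to avoid.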

The anti-$k_T$ algorithm, while useful for finding large jets at the LHC, has certain limitations that make some LHC analyses more difficult. In order to maintain sensible results with this algorithm, jets should be found only using the inclusive version [18], which means that the same type of event can lead to a different number of jets. Also, if two jets are closer than the characteristic radius, the anti-$k_T$ algorithm will merge them into a single jet. Further, because the particles are clustered from hard to soft, much of the structure inside the jet, which can often give interesting and useful information, is lost. This becomes a significant problem in the highly boosted regime (i.e. for large jet $p_T$), as many of the interesting decay products become highly collimated and lost inside the same anti-$k_T$ jet. As a result, jets resulting from heavy particles such as W/Z bosons and top quarks, which typically have a multi-prong structure inside the jet, become virtually indistinguishable from background QCD jets [25, 27, 2].

This problem has led to the development of the field of jet substructure. Jet substructure is a recent development in jet physics that utilizes the internal information of the jet, such as the relative distribution of energy, in order to find out more about the event. Substructure techniques, such as jet groomers [38, 34], as well as substructure observables, such as N-subjettiness [53, 54], both modify the jets produced from anti-$k_T$ and other algorithms to find interesting structure within the jet to improve mass resolution and background discrimination. These techniques have also been useful in improving jet resolution in the face of the effects of pile-up and the underlying event [50, 17]. However, all analyses that utilize these techniques first require jet finding with a standard algorithm and then performing additional techniques to extract this information.

This motivates the development of a new jet algorithm derived from the principles of jet substructure. By treating each of the important decay products of the original particle as separate, independent jets that can be found independent of $p_T$ scale, the original particle can be reconstructed more accurately. Further, the principles behind substructure techniques, such as recoil insensitivity and background suppression, can be built into the algorithm itself so that the jets remain insensitive to effects from pileup and other soft interactions. In this way, similar types of events can be analyzed as a group, which gives a stronger means of identifying events with new and interesting physics. This also allows identification of jets at different energy scales, reducing the necessity of tuning jet radii for different energy scales. In particular, these algorithms allow improved resolution on the substructure of the jet to capture more information about all the separate decay products contained in the jet, resulting in a more accurate measurement. These algorithms would thus simplify analyses by accurately reconstructing the original particles without requiring any additional machinery.

The primary novelty of this thesis is the development of an exclusive cone jet algorithm based on the principles described above. The exclusive cone algorithm has a number of novel features that give it great power over existing algorithms. In channels where the number of jets is known in advance, such as Higgs $\to b\bar{b}$, the exclusive cone algorithm allows the expected number of jets to be found regardless of the $p_T$ scale of the event. As a result, even for large boosts, this algorithm can locate the hard decay products of the event, where they would otherwise be lost in an anti-$k_T$ algorithm. Further, these jets remain conical regardless of their separation, and dynamically partition via Voronoi tessellation at the scale where the jets begin to overlap. This allows smooth interpolation between the highly resolved regime at low $p_T$ and the highly boosted regime at high $p_T$, where currently different algorithms and different techniques have to be used [24, 28]. It also has the flexibility to be used in cases where the number of jets in the final state differs from the expected value, such as in the case of initial state radiation or final state radiation.

The remainder of this thesis is organized as follows. In Chapter 2, we present a concrete implementation of an exclusive cone jet algorithm, called "XCone", based on the principle of minimization of the N-jettiness event shape [52]. We start by briefly reviewing the N-jettiness shape and its use as a jet finding technique. We will discuss the different possible measures that can be used to define N-jettiness, and outline a new measure that utilizes insights from previous uses of N-jettiness. We will then discuss the implementation of the minimization routine and how we can approximate the minimum of N-jettiness with a sequential clustering algorithm. In Chapter 3, we outline three potential analyses for searches at the LHC made possible with the XCone algorithm. We will first study the dijet decays of heavy bosons to show XCone's similarity to standard algorithms in the case of resolved jets. We will then show how XCone's unique properties allow jets to be accurately resolved both at low $p_T$ and high $p_T$. We will demonstrate this both with a Higgs $\to b\bar{b}$ decay and a boosted top pair to demonstrate XCone's usage both in low and high multiplicity final states. These three analyses will demonstrate how XCone can be successfully used in a wide variety of analyses across different kinematic regimes with high performance.

Chapter 2

Introducing the XCone Jet Algorithm

In this chapter, we present a new jet algorithm based on the event shape N-jettiness [52], using insights from the jet shape N-subjettiness [53, 54]. The key novelty is that N-jettiness defines an exclusive cone jet algorithm (or "XCone"). Like the exclusive $k_T$ algorithm [21], the XCone algorithm returns a fixed number of jets, relevant for physics applications where the number of jets is known in advance. Like anti-$k_T$ jets [18], XCone jets are nearly conical for well-separated partons, such that they have a fixed active jet area [17, 19]. Unlike previous jet algorithms, XCone can smoothly interpolate between widely-separated conical jets and merged jets with substructure [5, 8, 7, 6], a possibility that will be explored in Chapter 3.

As we will see, there is considerable flexibility in how one defines N-(sub)jettiness. Because N-jettiness and N-subjettiness had different original intended applications, they ended up having different default definitions. Here, as the XCone default, we propose a "conical geometric" measure that incorporates lessons from previous work [52, 53]. This measure is based on the dot product between particles and light-like axes as in Ref. [52] but incorporates an angular exponent $\beta$ as in Ref. [54], as well as a beam exponent $\gamma$ for additional flexibility. Crucially for the purposes of jet finding at the LHC, this measure yields (nearly) conical jets over a wide rapidity range, and the user can choose the desired jet radius $R_0$.

For the physics applications, we propose a default setting of $\beta = 2$ and $\gamma = 1$, which acts similarly to existing cone algorithms (see e.g. [51, 11, 32, 47]) in that the resulting jet regions are (approximately) stable cones where the jet momentum and the jet axis align. The key difference with algorithms like SISCone [47] is that XCone does not require a split/merge step. In particular, typical inclusive cone algorithms have an overlap parameter which determines whether two abutting stable cones should be joined or remain separate. However, XCone has the advantage that the split/merge decision is determined dynamically through N-jettiness minimization. In the following chapter, we will show examples of quasi-boosted kinematics that capitalize on this exclusive approach to cone jet finding.

Our practical XCone implementation will use recursive clustering algorithms [21, 33, 29, 57, 56] to approximate N-jettiness minima. Roughly speaking, we run a generalized $k_T$ clustering algorithm to determine the jet axes, and then use N-jettiness to define the jet regions. Separating jet axes finding from jet region finding appeared previously in the context of recoil-free jets [40, 42], where a fixed radius cone was centered on winner-take-all axes [10, 40, 45] or broadening axes [54, 40]. XCone allows us to extend this strategy to N jet events, with $\beta = 1$ yielding recoil-free jets and $\beta = 2$ yielding traditional cones where the jet axes and jet momenta are (nearly) aligned.

The remainder of this chapter is organized as follows. In Sec. 2.1, we review how to define an exclusive jet algorithm via minimizing N-jettiness. We then discuss a variety of N-jettiness measures in Sec. 2.2, including the conical geometric measure that is the basis for XCone. In Sec. 2.3, we discuss some details of our XCone implementation, most especially the choice of seed axes for finding a (local) N-jettiness minimum. The XCone algorithm will be available through the Nsubjettiness FastJet contrib [20, 1].

2.1 N-jettiness as a Jet Algorithm

Given a set of normalized light-like axes $n_A = \{1, \hat{n}_A\}$, a generic definition of N-jettiness is

$$
\tilde{\tau}_N = \sum_i \min\left\{\rho_{\text{jet}}(p_i, n_1), \ldots, \rho_{\text{jet}}(p_i, n_N), \rho_{\text{beam}}(p_i)\right\}, \tag{2.1}
$$

where the sum runs over particles $i$ in the event with four-vector $p_i$, $\rho_{\text{jet}}(p_i, n_A)$ is a distance measure to the $A$-th jet direction, and $\rho_{\text{beam}}(p_i)$ is a distance measure to the beam. For a given form of $\rho_{\text{jet}}$ and $\rho_{\text{beam}}$, the minimum inside of $\tilde{\tau}_N$ partitions the event into N jet regions and one unclustered beam region. To use N-jettiness as a jet algorithm, one minimizes $\tilde{\tau}_N$ over all possible light-like axes directions:

$$
\tau_N = \min_{n_1, n_2, \ldots, n_N} \tilde{\tau}_N. \tag{2.2}
$$

The locations of the axes at the minimum define the centers of the jet regions, and we discuss methods to approximate the $\tau_N$ minimum in Sec. 2.3.

| Name | $\rho_{\text{jet}}(p_i, n_A)$ | $\rho_{\text{beam}}(p_i)$ |
|---|---|---|
| Conical [54] | $p_{Ti}\,\Delta R_{iA}^{\beta}$ | $p_{Ti} R_0^{\beta}$ |
| General Conical | $p_{Ti}\,\Delta R_{iA}^{\beta} f(p_i)$ | $p_{Ti} R_0^{\beta} f(p_i)$ |
| Geometric [36] | $n_A \cdot p_i$ | $m_{Ti} R_0^2 \min\{e^{y_i}, e^{-y_i}\}$ |
| Modified Geometric | $n_A \cdot p_i$ | $m_{Ti} R_0^2 (2\cosh y_i)^{-1}$ |
| Conical Geometric | $\frac{p_{Ti}}{(2\cosh y_i)^{\gamma-1}} \left(\frac{2 n_A \cdot p_i}{n_{TA}\, p_{Ti}}\right)^{\beta/2}$ | $\frac{p_{Ti} R_0^{\beta}}{(2\cosh y_i)^{\gamma-1}}$ |
| XCone Default ($\beta = 2$, $\gamma = 1$) | $2 n_A \cdot p_i / n_{TA}$ | $p_{Ti} R_0^2$ |
| Recoil-Free Default ($\beta = 1$, $\gamma = 1$) | $\sqrt{2 n_A \cdot p_i\, p_{Ti} / n_{TA}}$ | $p_{Ti} R_0$ |

Table 2.1: N-jettiness measures studied in this thesis. The conical geometric measure with $\beta = 2$ and $\gamma = 1$ is the suggested XCone default, giving stable cone jets (like the conical measure) through dot-product distances (like the geometric measure). The recoil-free variant with $\beta = 1$ centers the jet around its hardest cluster, making the jet regions less sensitive to soft contamination.
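As an illustration of the sum in Eq. (2.1) for a fixed set of axes, the following minimal sketch evaluates the un-minimized N-jettiness and the resulting partition using the XCone default measure of Table 2.1. The helper names are hypothetical, particles are taken to be massless, and no minimization over axes is attempted; this is not the XCone implementation itself.

```python
import math

def light_like_axis(y, phi):
    """n_A = {1, n_hat} with |n_TA| = 1/cosh(y_A) and n_z = tanh(y_A)."""
    c = math.cosh(y)
    return (1.0, math.cos(phi) / c, math.sin(phi) / c, math.tanh(y))

def four_vector(pt, y, phi):
    """Massless particle four-vector (E, px, py, pz)."""
    return (pt * math.cosh(y), pt * math.cos(phi), pt * math.sin(phi), pt * math.sinh(y))

def dot(a, b):
    """Lorentz dot product with signature (+, -, -, -)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]

def rho_jet(p, n):
    """XCone default jet measure: 2 n_A.p_i / n_TA."""
    n_t = math.hypot(n[1], n[2])
    return 2.0 * dot(n, p) / n_t

def rho_beam(p, r0):
    """XCone default beam measure: p_Ti R_0^2."""
    return math.hypot(p[1], p[2]) * r0 ** 2

def tau_tilde(particles, axes, r0):
    """Return (sum of minimum measures, region index per particle; None = beam)."""
    total, regions = 0.0, []
    for p in particles:
        measures = [rho_jet(p, n) for n in axes] + [rho_beam(p, r0)]
        k = min(range(len(measures)), key=measures.__getitem__)
        total += measures[k]
        regions.append(k if k < len(axes) else None)
    return total, regions

axes = [light_like_axis(0.0, 0.0)]
particles = [four_vector(50.0, 0.0, 0.0),   # exactly on the axis
             four_vector(10.0, 0.2, 0.1),   # inside the cone
             four_vector(5.0, 2.0, 3.0)]    # far away, left unclustered
total, regions = tau_tilde(particles, axes, r0=0.5)
```

In this toy event the first two particles fall in the single jet region and the third is assigned to the beam, so only its $p_T R_0^2$ and the small jet-measure contribution of the second particle enter the sum.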

For any choice of measure, N-jettiness minimization defines an exclusive jet algorithm. In particular, $\tau_N$ always identifies N jet regions (and one beam region), regardless of how close the axes $n_A$ might be to each other. When the axes are well separated, the boundary of the jet regions is determined through competition between $\rho_{\text{jet}}$ and $\rho_{\text{beam}}$. When the axes are close together, the jet regions are determined by the competition between different $\rho_{\text{jet}}$.

To go from an exclusive jet algorithm to an exclusive cone jet algorithm (i.e. XCone), one wants the jet boundaries to be approximately circles in the rapidity-azimuth plane, which requires an appropriate choice of jet and beam measures. In Sec. 2.2 we study a variety of measures summarized in Table 2.1, including a general conical measure (see Eq. (2.4)) that yields exact cones for widely separated jets.

In addition to partitioning the event into regions, $\tau_N$ is a quality criterion that measures how well an event is characterized by N jets. For narrow jets (i.e. small effective jet radius), $\tau_N$ is typically dominated by the beam measure. Thus, for LHC applications, one typically wants $\rho_{\text{beam}}(p_i)$ to be proportional to $p_{Ti}$ (the transverse momentum of particle $i$) such that N-jettiness minimization results in the least unclustered $p_T$.

2.2 Choice of Measure

Every choice of jet and beam measure defines some kind of N-jettiness jet algorithm. We now review previous measures in the literature en route to explaining the logic behind the new XCone default measure. Example jet regions found from these measures are shown in Figs. 2-1 and 2-2.

2.2.1 The Conical Measure

The first conical N-jettiness measure was proposed in Ref. [54]:^1

$$
\begin{aligned}
\text{Conical Measure:}\quad \rho_{\text{jet}}(p_i, n_A) &= p_{Ti}\,\Delta R_{iA}^{\beta}, \\
\rho_{\text{beam}}(p_i) &= p_{Ti} R_0^{\beta},
\end{aligned}
\tag{2.3}
$$

where $R_0$ is the characteristic jet radius and $\beta$ is an angular weighting exponent. The parameter $R_0$ acts just like the radius in a cone algorithm, since particle $i$ can only be clustered into jet $A$ if $\rho_{\text{jet}}(p_i, n_A) < \rho_{\text{beam}}(p_i)$, which is equivalent to $\Delta R_{iA} < R_0$. Thus, the measure in Eq. (2.3) yields perfect conical jets in the rapidity/azimuth plane, unless two jet axes are closer than $R_0$, in which case the jet regions are determined by Voronoi partitioning (i.e. nearest neighbor). This yields "clover jet" configurations as shown in Fig. 2-1a.

^1 Strictly speaking, the measure in Ref. [54] has an extra rapidity cut and a slightly different normalization.
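The cone-plus-nearest-neighbor behavior of the conical measure can be seen in a minimal sketch. The function names and the toy inputs are assumptions for illustration; particles are (pt, y, phi) tuples, the axes are held fixed, and the two axes are placed 0.6 apart with $R_0 = 0.5$ so that their cones overlap.

```python
import math

def delta_r(y1, phi1, y2, phi2):
    """Rapidity-azimuth distance with azimuthal wraparound."""
    dphi = (phi1 - phi2 + math.pi) % (2 * math.pi) - math.pi
    return math.hypot(y1 - y2, dphi)

def conical_partition(particles, axes, r0=0.5, beta=2.0):
    """Assign each particle to the region with the smallest conical measure.

    Returns the axis index for each particle, or None for the beam region.
    """
    labels = []
    for pt, y, phi in particles:
        rho = [pt * delta_r(y, phi, ya, pa) ** beta for ya, pa in axes]
        rho.append(pt * r0 ** beta)          # beam measure goes last
        k = min(range(len(rho)), key=rho.__getitem__)
        labels.append(k if k < len(axes) else None)
    return labels

axes = [(0.0, 0.0), (0.6, 0.0)]
particles = [(10.0, 0.1, 0.0),    # close to axis 0
             (10.0, 0.4, 0.0),    # within R_0 of both axes, nearer to axis 1
             (10.0, 2.0, 1.0),    # outside both cones
             (10.0, -0.45, 0.0)]  # inside cone 0 only
labels = conical_partition(particles, axes)
```

The second particle lies within $R_0$ of both axes and is assigned to the nearer one, which is the Voronoi partitioning described above; the factor of $p_{Ti}$ is common to every measure and so never affects the assignment.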

[Figure 2-1: four event displays in the rapidity-azimuth plane, each labeled N = 6 and $R_0 = 0.5$. Panel titles: (a) Conical Measure ($\beta = 2$), (b) Geometric Measure, (c) XCone Default ($\beta = 2$), (d) Recoil-free Default ($\beta = 1$).]

Figure 2-1: Jet regions found with various N-jettiness measures. This is a boosted $t\bar{t}$ event from the Boost 2010 event sample [5], and every measure has N = 6 and $R_0 = 0.5$. (a) Conical measure with $\beta = 2$. (b) Geometric measure. (c) The XCone default, conical geometric measure with $\beta = 2$. (d) The recoil-free default, conical geometric measure with $\beta = 1$. Both the conical and conical geometric measures yield (approximately) circular jets, with the overlap region partitioned by nearest neighbor.

[Figure 2-2: two event displays in the rapidity-azimuth plane, each labeled N = 2 and $R_0 = 1.0$. Panel titles: (a) XCone Default ($\beta = 2$), (b) Recoil-free Default ($\beta = 1$).]

Figure 2-2: Same as Figs. 2-1c and 2-1d, but for N = 2 and $R_0 = 1.0$. Here we see that the XCone default ($\beta = 2$, left) yields jets roughly centered along the total jet momentum while the recoil-free default ($\beta = 1$, right) yields jets centered along the hardest cluster within the jet. Note that the jets are elongated in azimuth compared to rapidity, which is a consequence of the trigonometric functions in Eq. (2.8).

For small $R_0$, $\tau_N$ is dominated by the beam measure, which is just the unclustered $p_T$ in an event (multiplied by $R_0^{\beta}$). Thus, this measure typically finds the N hardest jets in an event. By adjusting the exponent $\beta$, the jet axis can be varied to point along the jet direction ($\beta = 2$) or along the hardest cluster inside a jet ($\beta = 1$) (see also Refs. [54, 40, 42]).

Naively, the conical measure would seem to be the only measure that would lead to conical jets. After all, any change to the measure should change the style of event partitioning since it would affect the competition between $\rho_{\text{jet}}$ and $\rho_{\text{beam}}$. One can maintain conical jets, however, if one deforms Eq. (2.3) via

$$
\begin{aligned}
\text{General Conical Measure:}\quad \rho_{\text{jet}}(p_i, n_A) &= p_{Ti}\,\Delta R_{iA}^{\beta} f(p_i), \\
\rho_{\text{beam}}(p_i) &= p_{Ti} R_0^{\beta} f(p_i),
\end{aligned}
\tag{2.4}
$$

where $f(p_i)$ is any function of the particle four-momentum. This still returns conical jets since the factor of $f(p_i)$ drops out when comparing $\rho_{\text{jet}}$ to $\rho_{\text{beam}}$, and overlapping jets still have Voronoi partitioning since $f(p_i)$ also drops out when comparing two different $\rho_{\text{jet}}$. Of course, the $f(p_i)$ factor does play a role in determining the minimum in Eq. (2.2), so the final jets will be different depending on the choice of $f(p_i)$. We will exploit this $f(p_i)$ function when defining the conical geometric measure in Sec. 2.2.3.
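The cancellation of $f(p_i)$ can be written out explicitly. The short derivation below is a sketch that assumes $f(p_i) > 0$, $p_{Ti} > 0$, and $\beta > 0$ (positivity is an assumption added here so that the inequalities keep their direction); the label $B$ for a second axis is introduced only for this illustration.

```latex
\begin{align*}
\rho_{\text{jet}}(p_i, n_A) < \rho_{\text{beam}}(p_i)
  &\iff p_{Ti}\,\Delta R_{iA}^{\beta}\, f(p_i) < p_{Ti}\, R_0^{\beta}\, f(p_i)
   \iff \Delta R_{iA} < R_0, \\
\rho_{\text{jet}}(p_i, n_A) < \rho_{\text{jet}}(p_i, n_B)
  &\iff p_{Ti}\,\Delta R_{iA}^{\beta}\, f(p_i) < p_{Ti}\,\Delta R_{iB}^{\beta}\, f(p_i)
   \iff \Delta R_{iA} < \Delta R_{iB}.
\end{align*}
```

The first line reproduces the cone condition of Eq. (2.3) and the second the nearest-neighbor rule, so for fixed axes the partition does not depend on $f(p_i)$. The value of the summed measure does, which is why the minimizing axes can shift with the choice of $f(p_i)$.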

2.2.2 The Geometric Measure

A variety of N-jettiness measures were proposed and studied in Ref. [36]. For the purposes of defining a cone jet algorithm, the most promising choice is the geometric measure:

$$
\begin{aligned}
\text{Geometric Measure:}\quad \rho_{\text{jet}}(p_i, n_A) &= n_A \cdot p_i, \\
\rho_{\text{beam}}(p_i) &= R_0^2 \min\{n_a \cdot p_i,\; n_b \cdot p_i\},
\end{aligned}
\tag{2.5}
$$

where $n_{a,b} = \{1, 0, 0, \pm 1\}$ are the beam directions. Despite the presence of the Lorentz dot product $n_A \cdot p_i$, this measure actually behaves quite similarly to the conical measure. To see this, note that the four-vectors for particle $i$ and light-like axis $n_A$ can be expressed as

$$
p_i = \{m_{Ti} \cosh y_i,\; \vec{p}_{Ti},\; m_{Ti} \sinh y_i\}, \tag{2.6}
$$

$$
n_A = \{1,\; \vec{n}_{TA},\; \tanh y_A\}, \tag{2.7}
$$

where $m_{Ti} = \sqrt{p_{Ti}^2 + m_i^2}$, and $|\vec{n}_{TA}| = 1/\cosh y_A$. Taking the dot product, we have

$$
\frac{n_A \cdot p_i}{n_{TA}\, p_{Ti}} = \frac{m_{Ti}}{p_{Ti}} \cosh y_{iA} - \cos \phi_{iA} \approx \frac{\Delta R_{iA}^2}{2}, \tag{2.8}
$$

where the last approximation holds in the small angle limit for massless particles.^2 Thus, the $\rho_{\text{jet}}$ for the geometric measure acts similarly to the general conical measure with $\beta = 2$ and $f(p_i) = 1/(2 \cosh y_i)$.

^2 This dot product form was also suggested in Ref. [21] in the context of recursive clustering algorithms, though to our knowledge it has never been used in a physics study.

The presence of the $n_A \cdot p_i$ dot product is very natural from a theoretical perspective, since soft and collinear modes are usually defined in terms of their light-like momenta. But because the geometric measure does not take the precise form of Eq. (2.4), it yields football-like jets in the central region ($y \approx 0$, see Fig. 2-1b), which is rather unnatural from a collider physics perspective. To yield more conical jets, though, we just need to ensure that this same factor of $f(p_i)$ appears in the beam measure. This motivates a modified geometric measure:

$$
\begin{aligned}
\text{Modified Geometric Measure:}\quad \rho_{\text{jet}}(p_i, n_A) &= n_A \cdot p_i \approx \frac{p_{Ti}\,\Delta R_{iA}^2}{2 \cosh y_i}, \\
\rho_{\text{beam}}(p_i) &= \frac{m_{Ti} R_0^2}{2 \cosh y_i} \approx \frac{p_{Ti} R_0^2}{2 \cosh y_i}.
\end{aligned}
\tag{2.9}
$$

Here, the approximations are the same as for Eq. (2.8), such that $1/n_{TA} = \cosh y_A \approx \cosh y_i$. Since this modified measure is approximately the same as Eq. (2.4) with $\beta = 2$ and $f(p_i) = 1/(2 \cosh y_i)$, it yields nearly conical jet regions.^3
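The relation in Eq. (2.8) can be checked numerically. The sketch below builds the four-vectors of Eqs. (2.6) and (2.7) explicitly and compares the scaled dot product with the closed form and with the small-angle expression; the function names are hypothetical, and $y_{iA}$ and $\phi_{iA}$ are taken to mean the differences $y_i - y_A$ and $\phi_i - \phi_A$.

```python
import math

def scaled_dot_product(pt, y, phi, y_axis, phi_axis, mass=0.0):
    """Left-hand side of Eq. (2.8), n_A.p_i / (n_TA p_Ti), from explicit four-vectors."""
    m_t = math.hypot(pt, mass)                       # m_T = sqrt(pT^2 + m^2)
    p = (m_t * math.cosh(y), pt * math.cos(phi), pt * math.sin(phi), m_t * math.sinh(y))
    n_t = 1.0 / math.cosh(y_axis)                    # |n_TA| = 1 / cosh(y_A)
    n = (1.0, n_t * math.cos(phi_axis), n_t * math.sin(phi_axis), math.tanh(y_axis))
    dot = n[0] * p[0] - n[1] * p[1] - n[2] * p[2] - n[3] * p[3]
    return dot / (n_t * pt)

def closed_form(pt, y, phi, y_axis, phi_axis, mass=0.0):
    """Middle expression of Eq. (2.8): (m_T / p_T) cosh(dy) - cos(dphi)."""
    m_t = math.hypot(pt, mass)
    return (m_t / pt) * math.cosh(y - y_axis) - math.cos(phi - phi_axis)

def small_angle(y, phi, y_axis, phi_axis):
    """Right-hand side of Eq. (2.8): Delta R^2 / 2."""
    return 0.5 * ((y - y_axis) ** 2 + (phi - phi_axis) ** 2)

args = (30.0, 1.2, 0.7, 1.0, 0.5)                    # pt, y, phi, y_A, phi_A
lhs = scaled_dot_product(*args)
exact = closed_form(*args)
approx = small_angle(*args[1:])
lhs_massive = scaled_dot_product(*args, mass=5.0)
exact_massive = closed_form(*args, mass=5.0)
```

The first equality in Eq. (2.8) holds for any mass and angle, while the agreement with $\Delta R_{iA}^2/2$ degrades as the particle moves away from the axis or acquires a mass comparable to its $p_T$.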

2.2.3 The Conical Geometric Measure

Combining the lessons of the conical and geometric measures, we now introduce the "conical geometric" measure. For a specific choice of parameters, this will be the recommended XCone default measure. Like the conical measure, we want a measure that returns (nearly) conical jets, and we also want a parameter 3 in the jet measure to adjust the jet axes behavior. Like the geometric measure, we want a measure that depends on the dot products between light-like axes and particles, since that is the natural distance to use in theoretical calculations. For additional flexibility, we will introduce a separate -y parameter to adjust the beam measure. These requirements lead us to

$$\rho_{\text{jet}}(p_i, n_A) = \frac{p_{Ti}}{(2\cosh y_i)^{\gamma-1}} \left(\frac{2\, n_A \cdot p_i}{n_{TA}\, p_{Ti}}\right)^{\beta/2}, \qquad \rho_{\text{beam}}(p_i) = \frac{p_{Ti}}{(2\cosh y_i)^{\gamma-1}}\, R_0^{\beta}. \qquad \text{(Conical Geometric Measure)} \tag{2.10}$$

In the jet measure, we recognize the last factor in parentheses as the approximate form for $R_{iA}$ in Eq. (2.8), such that $\beta$ acts just like the $\beta$ factor in the conical measure. We have chosen $f(p_i) = (2\cosh y_i)^{1-\gamma}$, such that we maintain (nearly) conical jets. For the beam measure, $\gamma = 1$ corresponds to the original conical measure and $\gamma = 2$ corresponds to the modified geometric measure.

3 In the next subsection (Sec. 2.2.3), we will make the jets even more conical by changing the jet measure as well.
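As a rough illustration of the conical geometric measure, the following Python sketch evaluates $\rho_{\text{jet}}$ and $\rho_{\text{beam}}$ for a single massless particle. The function name `conical_geometric` and the `(pt, y, phi)` representation are assumptions made for this example and are not taken from the XCone code. For a massless particle and a light-like axis, the combination $2\, n_A \cdot p_i / (n_{TA}\, p_{Ti})$ reduces to $2(\cosh \Delta y - \cos \Delta\phi)$, which is what the sketch uses.

```python
import math

def conical_geometric(pt, y, phi, y_axis, phi_axis, beta=2.0, gamma=1.0, R0=0.5):
    """Return (rho_jet, rho_beam) for one massless particle and one light-like axis."""
    # 2 n_A.p_i / (n_TA p_Ti) = 2 (cosh(dy) - cos(dphi)), close to Delta R^2 at small angles.
    r2 = max(0.0, 2.0 * (math.cosh(y - y_axis) - math.cos(phi - phi_axis)))
    f = (2.0 * math.cosh(y)) ** (1.0 - gamma)   # f(p_i) = (2 cosh y_i)^(1 - gamma)
    rho_jet = pt * f * r2 ** (beta / 2.0)
    rho_beam = pt * f * R0 ** beta
    return rho_jet, rho_beam

# A particle is assigned to this axis rather than the beam when rho_jet < rho_beam.
near = conical_geometric(10.0, 0.3, 0.4, 0.0, 0.0)   # just inside Delta R = 0.5
far = conical_geometric(10.0, 1.0, 0.0, 0.0, 0.0)    # well outside the cone
```

With $\gamma = 1$ the rapidity-dependent prefactor drops out, so the comparison between the two measures depends only on the angular distance to the axis and on $R_0$.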

2.2.4 The XCone Default Measure

For LHC applications, our recommended XCone default is $\beta = 2$ and $\gamma = 1$:

$$\rho_{\text{jet}}(p_i, n_A) = \frac{2\, n_A \cdot p_i}{n_{TA}}, \qquad \rho_{\text{beam}}(p_i) = p_{Ti} R_0^2. \qquad \text{(XCone Default Measure, } \beta = 2\text{)} \tag{2.11}$$

By choosing $\gamma = 1$, the beam measure is the same as the conical measure, so minimizing $\tau_N$ will tend to minimize the unclustered $p_T$. By choosing $\beta = 2$, the jet axis (approximately) aligns with the jet momentum, as is typical for traditional stable cone algorithms. Alternatively, we use $\beta = 1$ to handle cases where recoil-sensitivity [22, 30, 9, 41] is an issue (such as in high pileup environments [42]):

$$\rho_{\text{jet}}(p_i, n_A) = \sqrt{\frac{2\, n_A \cdot p_i\, p_{Ti}}{n_{TA}}}, \qquad \rho_{\text{beam}}(p_i) = p_{Ti} R_0. \qquad \text{(Recoil-Free Default Measure, } \beta = 1\text{)} \tag{2.12}$$

Here, the jet center aligns approximately along the broadening axis of the jet [54, 40], which is the axis that minimizes the summed transverse momentum relative to it. This is similar to finding the "median" jet energy, and $n_A$ tends to point along the hardest cluster within a given jet. When the number of clusters is the same as the chosen value of N, the difference between $\beta = 2$ and $\beta = 1$ is very small (in the absence of jet contamination), analogous to the way that the mean and median of a peaked distribution are very similar. This is demonstrated in Figs. 2-1c and 2-1d for a boosted top event with N = 6. When a jet has substructure, the "mean" ($\beta = 2$) and "median" ($\beta = 1$) axes are offset, as shown in Figs. 2-2a and 2-2b for a boosted top event with N = 2. These measures will be the basis for our LHC case studies in the following chapter, where we find that both $\beta = 2$ and $\beta = 1$ give comparable results (again, in the absence of jet contamination).

2.3 Minimization of N-jettiness

For a given N-jettiness measure, we need to implement the minimization procedure in Eq. (2.2). In general, the only guaranteed method to find the global minimum of $\tau_N$ is to brute force test all possible partitions of the particles into N jet regions and one beam region. Since that is computationally prohibitive, we will rely on approximate methods that are only guaranteed to find local minima of $\tau_N$.

2.3.1 One-Pass Minimization

Our minimization routine is based on iteratively improving the axis directions $n_A$, starting from a set of seed axes. In order to be infrared and collinear (IRC) safe, this procedure must be deterministic, so we always work with one-pass minimization algorithms where a set of IRC safe seed axes are iteratively improved to a local minimum of $\tau_N$, without any stochastic elements. For the conical measure in Eq. (2.3), Ref. [54] introduced a modification of Lloyd's method [43] that finds a local minimum of $\tau_N$ for $1 < \beta < 3$. We can adopt this same strategy for more general measures. This iterative procedure has two pieces: an assignment step and an update step. In the assignment step, particles are assigned to one of the N jet regions or to the beam region via the $\tau_N$ partitioning in Eq. (2.1). The assignment step can be easily implemented for any choice of measure since it depends directly on the competition between the jet measures $\rho_{\text{jet}}(n_A, p_i)$ and the beam measure $\rho_{\text{beam}}(p_i)$.

In the update step, the axes $n_A$ are improved to minimize the value of $\tau_N$ within each jet region, keeping the jet constituents fixed. Different update steps are needed for different measures, since there is no general procedure to find the axes $n_A$ that minimize $\rho_{\text{jet}}(n_A, p_i)$.4 Once an appropriate update step is found, the assignment and update steps can be iterated until the axes converge to within some specified accuracy. In Sec. 2.3.2, we give a general update step that works well for the measures studied in this thesis.
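The assignment step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: particles are massless and stored as `(pt, y, phi)`, axes are stored as `(y, phi)`, the jet measure is the conical geometric form with $\gamma = 1$, and the names `rho_jet` and `assign` are hypothetical rather than part of any existing package.

```python
import math

def rho_jet(particle, axis, beta):
    """Jet measure with gamma = 1 for a massless particle (pt, y, phi) and an axis (y, phi)."""
    pt, y, phi = particle
    r2 = max(0.0, 2.0 * (math.cosh(y - axis[0]) - math.cos(phi - axis[1])))
    return pt * r2 ** (beta / 2.0)

def assign(particles, axes, R0=0.5, beta=2.0):
    """Assignment step: label each particle with its jet index, or -1 for the beam."""
    labels, tau = [], 0.0
    for particle in particles:
        costs = [rho_jet(particle, axis, beta) for axis in axes]
        costs.append(particle[0] * R0 ** beta)      # beam measure goes in the last slot
        best = min(range(len(costs)), key=costs.__getitem__)
        labels.append(best if best < len(axes) else -1)
        tau += costs[best]                          # running value of the partition sum
    return labels, tau

particles = [(100.0, 0.0, 0.0), (50.0, 0.1, 0.1), (5.0, 2.0, 3.0)]
labels, tau = assign(particles, axes=[(0.0, 0.0)])
```

In this toy event the two nearby particles are assigned to the single jet and the distant soft particle goes to the beam, so the returned sum is dominated by the unclustered $p_T$ weighted by $R_0^{\beta}$.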

As discussed in Ref. [54], these one-pass minimization procedures are quite effective for N-subjettiness, often converging to the global minimum. There are additional complications, however, for N-jettiness. The reason is that N-jettiness has a beam region, and particles in the bulk of the beam region are insensitive to small changes to the location of the jet axes $n_A$. Even minimization routines that try to go "uphill" to escape local minima may never find the right jet regions. Keep in mind that $\tau_N$ corresponds roughly to the unclustered $p_T$ in an event (for $\gamma = 1$), so failing to find a decent $\tau_N$ minimum means that one will identify too many soft jets.

Therefore, for XCone to be a practical jet algorithm, one has to find a good set of seed axes for one-pass minimization. In Sec. 2.3.3, we show how to find such seed axes through recursive clustering algorithms.

2.3.2 Update Step for General Measures

We now construct a general update step that converges to a local minimum of $\rho_{\text{jet}}(n_A, p_i)$ for a fixed set of jet constituents. This approach works for a wide variety of jet measures, including the XCone defaults.

4 Even if one does find such a procedure, one has to check on a case-by-case basis whether this procedure actually converges, and some pathological cases were discussed in Ref. [54].

To motivate our general procedure, we start with the special case of the modified geometric measure, where finding a local minimum of $\rho_{\text{jet}}$ is particularly straightforward. Within a given jet region, we want to find the axis $n_A$ that minimizes

$$\sum_{i \in A} n_A \cdot p_i = n_A \cdot P_A, \tag{2.13}$$

where $P_A = \sum_{i \in A} p_i$ is the total jet four-momentum. Introducing a Lagrange multiplier $\lambda$, the quantity

$$n_A \cdot P_A + \lambda \left(\vec{n}_A^{\,2} - 1\right) \tag{2.14}$$

is minimized for

$$\underset{n_A}{\operatorname{argmin}}\; n_A \cdot P_A = \frac{\vec{P}_A}{|\vec{P}_A|}, \tag{2.15}$$

such that the jet axis exactly aligns with the jet three-momentum. Thus, minimizing the modified geometric measure is equivalent to finding N mutually stable (Voronoi-bounded) cones. In this same spirit, any measure of the form

$$\rho_{\text{jet}}(n_A, p_i) = 2\, n_A \cdot p_i\, f(p_i) \tag{2.16}$$

will be minimized by

$$\vec{n}_A = \frac{\vec{q}_A}{|\vec{q}_A|}, \qquad q_A = \sum_{i \in A} p_i\, f(p_i), \tag{2.17}$$

where $q_A$ is an effective four-vector for the f-rescaled jet constituents. For both of these cases, one-pass minimization will terminate in a finite number of assignment/update steps.5

The conical geometric measure does not take the form of Eq. (2.16), but rather takes the general form

$$\rho_{\text{jet}}(n_A, p_i) = 2\, n_A \cdot p_i\, g(p_i, n_A), \tag{2.18}$$

where the jet measure has non-linear dependence on $n_A$. This means that the jet axis and the jet three-momentum do not in general align. For the XCone default measure in particular, the extra factor of $1/n_{TA}$ in the jet measure means that there is an offset between the axis and the momentum proportional to the jet mass. Thus, we cannot directly use the stable cone finding logic to minimize $\rho_{\text{jet}}$. Just like in Ref. [54], though, we can simply define an update step based on the previous $n_A$ value:

$$\vec{n}_A^{\,\text{new}} = \frac{\vec{q}_A}{|\vec{q}_A|}, \qquad q_A = \sum_{i \in A} p_i\, g(p_i, n_A^{\text{old}}). \tag{2.19}$$

As long as the dependence on $n_A$ is mild enough (roughly $1 \le \beta < 3$ for the conical geometric measure), this procedure will converge within a desired accuracy in a reasonable number of iterations. We adopt this one-pass minimization strategy for the XCone default measures.

5 For practical purposes, it is sometimes necessary to include an "effective mass" term by changing $2\, n_A \cdot p_i \to 2\, n_A \cdot p_i + \epsilon$ to avoid potential divide by zero errors.
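The update step based on the previous axis can be illustrated with a small Python sketch for the conical geometric measure with $\gamma = 1$. In that case the weight $g(p_i, n_A^{\text{old}})$ is proportional to $R_{iA}^{\beta - 2}$, up to a factor of $1/n_{TA}$ that is common to all constituents and therefore does not change the direction of $\vec{q}_A$. The function names, the massless `(pt, y, phi)` inputs, and the small regulator `eps` (a stand-in for the "effective mass" idea, not the thesis implementation) are all assumptions of this example.

```python
import math

def update_axis(particles, axis, beta=2.0, eps=1e-9):
    """One update step: new axis along q = sum_i w_i p_i, with w_i from the old axis."""
    qx = qy = qz = 0.0
    for pt, y, phi in particles:
        r2 = 2.0 * (math.cosh(y - axis[0]) - math.cos(phi - axis[1]))
        # g(p_i, n_old) up to a jet-wide constant: R_iA^(beta - 2), regulated by eps.
        w = (max(r2, 0.0) + eps) ** (beta / 2.0 - 1.0)
        qx += w * pt * math.cos(phi)
        qy += w * pt * math.sin(phi)
        qz += w * pt * math.sinh(y)
    return math.asinh(qz / math.hypot(qx, qy)), math.atan2(qy, qx)

def minimize_axis(particles, seed, beta=2.0, tol=1e-10, max_iter=1000):
    """Iterate the update step for fixed constituents until the axis stops moving."""
    axis = seed
    for _ in range(max_iter):
        new = update_axis(particles, axis, beta)
        if abs(new[0] - axis[0]) + abs(new[1] - axis[1]) < tol:
            return new
        axis = new
    return axis

jet = [(60.0, 0.0, -0.2), (40.0, 0.0, 0.3)]
mean_axis = minimize_axis(jet, seed=(0.0, 0.0), beta=2.0)     # follows the jet momentum
median_axis = minimize_axis(jet, seed=(0.0, 0.0), beta=1.0)   # drifts to the harder particle
```

For this two-particle jet the $\beta = 2$ axis settles along the summed momentum, while the $\beta = 1$ axis moves toward the harder constituent, which mirrors the "mean" versus "median" picture described in Sec. 2.2.4.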

2.3.3 Seed Axes for Minimization

Recursive clustering algorithms are particularly effective to find seed axes for one-pass minimization. When run in exclusive mode, a recursive clustering algorithm returns exactly N jets which can then be interpreted as N light-like seed axes. In fact, the axes are often so good in practice that the iterative improvement step is unnecessary. One could even imagine a more general strategy that separates jet axis finding (here using recursive clustering) from jet region finding (here using N-jettiness partitions). Unlike generic cluster optimization, recursive clustering algorithms are computationally efficient, and this efficiency is inherited by our XCone implementation (at the expense of only guaranteeing a local minimum for $\tau_N$).6

For the conical geometric measures with $\gamma = 1$ (including the XCone defaults), good seed axes can be found by running the generalized kT clustering algorithm with a generalized $E_t$ recombination scheme. The generalized kT clustering measure [18, 20] is parametrized by an exponent p and a jet radius $R_0$:

$$d_{ij} = \min\left(p_{Ti}^{2p},\, p_{Tj}^{2p}\right) \frac{\Delta R_{ij}^2}{R_0^2}, \qquad d_{iB} = p_{Ti}^{2p}, \tag{2.20}$$

where p = 1 is the kT algorithm [21, 33] and p = 0 is the Cambridge/Aachen (C/A) algorithm [29, 57, 56].
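For concreteness, the generalized kT distances, in their standard form $d_{ij} = \min(p_{Ti}^{2p}, p_{Tj}^{2p})\, \Delta R_{ij}^2 / R_0^2$ and $d_{iB} = p_{Ti}^{2p}$, can be written as a short Python function. The name `gen_kt_distances` and the `(pt, y, phi)` tuples are assumptions of this sketch; a real analysis would rely on FASTJET rather than on hand-written code.

```python
import math

def gen_kt_distances(a, b, p=1.0, R0=0.5):
    """Return (d_ij, d_iB) for particles a and b given as (pt, y, phi)."""
    dphi = abs(a[2] - b[2]) % (2.0 * math.pi)
    dphi = min(dphi, 2.0 * math.pi - dphi)          # wrap the azimuthal difference
    dr2 = (a[1] - b[1]) ** 2 + dphi ** 2
    d_ij = min(a[0] ** (2.0 * p), b[0] ** (2.0 * p)) * dr2 / R0 ** 2
    d_iB = a[0] ** (2.0 * p)                        # beam distance of the first particle
    return d_ij, d_iB

# Two particles separated by Delta R = 0.5, compared for kT (p = 1) and C/A (p = 0).
kt = gen_kt_distances((10.0, 0.0, 0.0), (20.0, 0.3, 0.4), p=1.0)
ca = gen_kt_distances((10.0, 0.0, 0.0), (20.0, 0.3, 0.4), p=0.0)
```

At a separation equal to $R_0$ the pairwise and beam distances of the softer particle coincide for any p, which is the boundary between merging with a neighbor and being removed as a jet of its own.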

6 One can further improve $\tau_N$ minimization by running an exclusive jet algorithm to find N + n axes and then testing N + n choose N options to find the best minimum. This option is available in the NSUBJETTINESS code (though not recommended by default for reasons of speed).

The generalized $E_t$ recombination scheme is parametrized by an energy weighting power $\delta$, such that one obtains a massless recombined four-vector with

$$p_{Tr} = p_{Ti} + p_{Tj}, \qquad \phi_r = \frac{p_{Ti}^{\delta} \phi_i + p_{Tj}^{\delta} \phi_j}{p_{Ti}^{\delta} + p_{Tj}^{\delta}}, \qquad y_r = \frac{p_{Ti}^{\delta} y_i + p_{Tj}^{\delta} y_j}{p_{Ti}^{\delta} + p_{Tj}^{\delta}}, \tag{2.21}$$

where $\delta = 1$ is the original $E_t$ scheme, $\delta = 2$ is the $E_t^2$ scheme [21, 13], and $\delta = \infty$ is the winner-take-all scheme [10, 40, 45]. It is possible to handle the $\gamma \neq 1$ case by further modifications of Eq. (2.20) (such as those proposed in Ref. [12]), but we have not attempted that for the present XCone implementation. For finding seed axes, the recommended parameters for $0 < \beta < 2$ are

$$p = \frac{1}{\beta}, \qquad \delta = \frac{1}{\beta - 1}, \tag{2.22}$$

with matching radius parameter $R_0$. To understand this heuristic, consider starting with a final state of N + 1 particles and running one iteration of exclusive generalized kT to find N axes. For this procedure to give good seed axes for $\tau_N$ minimization, we want to choose the values of p and $\delta$ that match the N-jettiness behavior as closely as possible. Essentially, we want $d_{iB}$ to match the beam measure $\rho_{\text{beam}}$, $d_{ij}$ to match the jet measure $\rho_{\text{jet}}$, and the recombination scheme to appropriately place the merged axis in the desired location. We perform this heuristic analysis for the conical measure, which is a bit easier to understand than the conical geometric measure with $\gamma = 1$, though the same conclusions hold. To match the conical beam measure, generalized kT with p > 0 already gives the right behavior, since the softest particle further than $R_0$ from any other particle is merged with the beam. To match the conical jet measure, we want $d_{ij}$ to depend on the combination $p_{Ti} R_{ij}^{\beta}$, which is achieved for

$$p = \frac{1}{\beta}. \tag{2.23}$$

Figure 2-3: Fraction of events where all generalized kT jets align with the axes from global $\tau_N$ minimization after one-pass minimization. This is for a boosted top sample, using the conical geometric measure with N = 6 and $R_0 = 0.5$. (a) The XCone default ($\beta = 2$). (b) The recoil-free default ($\beta = 1$). Here, p and $\delta$ parametrize the generalized kT metric and recombination scheme, respectively. The boxes indicate the preferred values of p and $\delta$ predicted by the heuristic.

To match the conical axis behavior, we have to know which axis minimizes the $\tau_N$ value for a jet region consisting of two particles. Labeling the particles 1 and 2 and simplifying to one dimension $\phi$ without loss of generality, we have

$$\tau_N \supset p_{T1} |\phi_1 - \phi_A|^{\beta} + p_{T2} |\phi_2 - \phi_A|^{\beta}, \tag{2.24}$$

where $\phi_A$ is the location of the axis. Solving $d\tau_N / d\phi_A = 0$ to find the location of the minimum, we have

$$\phi_A = \frac{p_{T1}^{\delta} \phi_1 + p_{T2}^{\delta} \phi_2}{p_{T1}^{\delta} + p_{T2}^{\delta}}, \qquad \delta = \frac{1}{\beta - 1}, \tag{2.25}$$

which is exactly the generalized $E_t$ recombination scheme. This is the logic behind the heuristic given in Eq. (2.22).
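The two-particle argument can be checked numerically. The sketch below scans the one-dimensional axis position between two particles, finds the minimum of $p_{T1}|\phi_1 - \phi_A|^{\beta} + p_{T2}|\phi_2 - \phi_A|^{\beta}$ on a grid, and compares it with the $p_T^{\delta}$-weighted average for $\delta = 1/(\beta - 1)$. All function names and the chosen particle values are hypothetical, and the check only covers $\beta > 1$, where the derivative condition applies.

```python
def tau_two_particles(phi_axis, pt1, phi1, pt2, phi2, beta):
    """Two-particle contribution to tau_N in one angular dimension."""
    return pt1 * abs(phi1 - phi_axis) ** beta + pt2 * abs(phi2 - phi_axis) ** beta

def recombined_phi(pt1, phi1, pt2, phi2, delta):
    """pT^delta weighted average of the two positions."""
    w1, w2 = pt1 ** delta, pt2 ** delta
    return (w1 * phi1 + w2 * phi2) / (w1 + w2)

def grid_minimum(pt1, phi1, pt2, phi2, beta, steps=20000):
    """Brute-force scan of the axis position between the two particles."""
    grid = [phi1 + (phi2 - phi1) * k / steps for k in range(steps + 1)]
    return min(grid, key=lambda phi: tau_two_particles(phi, pt1, phi1, pt2, phi2, beta))

results = {}
for beta in (2.0, 1.5, 1.25):
    delta = 1.0 / (beta - 1.0)
    predicted = recombined_phi(60.0, 0.0, 40.0, 0.5, delta)
    scanned = grid_minimum(60.0, 0.0, 40.0, 0.5, beta)
    results[beta] = (predicted, scanned)
```

As $\beta$ decreases toward 1, the weighting power grows and the minimum moves toward the harder particle, consistent with the winner-take-all limit $\delta = \infty$.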

To validate this heuristic, we consider a sample of boosted top quarks from the Boost 2010 report [5], using N = 6 and $R_0 = 0.5$. We first brute force determine (as best we can) the global $\tau_N$ minimum by performing one-pass minimization on a wide range of seed axes. We then run the generalized kT procedure for a range of p and $\delta$ values and find the fraction of generalized kT jets (without any one-pass minimization) that are within $\Delta R < 0.1$ of the axes found from global $\tau_N$ minimization. The results are shown in Fig. 2-3 for the XCone default measures. As shown by the density plots, the heuristic choice in Eq. (2.22) is satisfied for the two $\beta$ values presented, indicating that this heuristic provides a simple and accurate means of locating seed axes. The percentage of aligned jets (with and without one-pass minimization), as well as the percentage of events with 4 or more, 5 or more, and all 6 jets aligned (with and without one-pass minimization) are shown in Tables 2.2a and 2.2b. Even without one-pass minimization, roughly 95% of the jets are aligned for the heuristic choice in Eq. (2.22) (0.96 for $\beta = 2$ and 0.93 for $\beta = 1$ in Table 2.2), with a range of p and $\delta$ values giving similar results. This suggests that finding local $\tau_N$ minima from generalized kT seed axes is a robust procedure that often results in a global $\tau_N$ minimum.

Table 2.2: (a) Fraction of jets found with the heuristic minimum that are aligned with the "true" minimum both before and after one-pass minimization, along with the fraction of events with 4 or more, 5 or more, and all 6 jets aligned with the true minimum, for axes found with $\beta = 2$ minimization. (b) Same as (a) with $\beta = 1$ minimization.

(a) β = 2         No Min   With Min
Jets              0.96     0.96
Events (≥ 4)      0.99     0.99
Events (≥ 5)      0.93     0.93
Events (6)        0.80     0.83

(b) β = 1         No Min   With Min
Jets              0.93     0.95
Events (≥ 4)      0.97     0.97
Events (≥ 5)      0.94     0.96
Events (6)        0.71     0.78
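The alignment fractions quoted above could be computed along the following lines. This is a minimal sketch under assumptions not spelled out in the text: axes are `(y, phi)` pairs, a seed axis counts as aligned if any reference axis lies within the $\Delta R$ threshold (no one-to-one matching is enforced), and the names `delta_r`, `count_aligned`, and `event_fractions` are hypothetical.

```python
import math

def delta_r(a, b):
    """Distance in the rapidity-azimuth plane between two (y, phi) axes."""
    dphi = abs(a[1] - b[1]) % (2.0 * math.pi)
    dphi = min(dphi, 2.0 * math.pi - dphi)
    return math.hypot(a[0] - b[0], dphi)

def count_aligned(seed_axes, reference_axes, max_dr=0.1):
    """Number of seed axes lying within max_dr of some reference axis."""
    return sum(1 for s in seed_axes
               if min(delta_r(s, r) for r in reference_axes) < max_dr)

def event_fractions(events, n_jets=6, max_dr=0.1):
    """Fraction of events with at least k aligned axes, for k = n_jets - 2, ..., n_jets."""
    counts = [count_aligned(seeds, refs, max_dr) for seeds, refs in events]
    return {k: sum(c >= k for c in counts) / len(counts)
            for k in range(n_jets - 2, n_jets + 1)}
```

For N = 6 the returned dictionary has keys 4, 5, and 6, corresponding to the "4 or more", "5 or more", and "all 6" rows of Table 2.2.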

Chapter 3

Resolving Boosted Jets with XCone

In the previous chapter, we introduced a new jet algorithm called XCone that blurs the boundary between resolved and boosted kinematics. As an exclusive cone algorithm, XCone always returns a fixed number of jets N. When jets are well-separated, XCone yields nearly conical jet regions with radius $R_0$. When jets are overlapping, XCone dynamically splits the jet regions into nearest neighbor partitions. Thus, XCone smoothly interpolates between isolated conical jets and merged jets with substructure, making it ideally suited for studying the boosted and quasi-boosted regimes.

In this chapter, we present three applications of XCone which are relevant for LHC physics in and beyond the standard model. In Sec. 3.1, we study high-mass dijet resonances with isolated final state jets, showing that XCone has nearly identical performance to the popular anti-kT algorithm [18]. In Sec. 3.2, we study associated Higgs boson production, showing that XCone can resolve $h \to b\bar{b}$ decays, even when the $b\bar{b}$ splitting angle $R_{bb}$ is less than the radius parameter $R_0$. In Sec. 3.3, we study the classical example of boosted top quarks, showing how XCone can simultaneously identify jets and subjets in a high multiplicity final state. These three case studies highlight the versatility of the XCone algorithm across a wide kinematic range.

3.1 Dijet Resonances and Comparison to Anti-kT

For our first case study, we compare the performance of XCone to anti-kT in the resolved regime of well-separated jets. Inclusive jet algorithms like anti-kT identify a variable number of jets above some $p_T$ threshold, which is useful for classifying events into different jet multiplicity bins [18, 21, 33]. Exclusive jet algorithms like XCone always return a fixed number of jets N, which is useful if the number of desired jets is known in advance [21, 33]. For widely separated cone jets, however, the distinction between inclusive and exclusive cone jet algorithms is rather mild, since for typical $R_0$ values, an exclusive cone jet algorithm will just return the N hardest jets from an inclusive cone jet algorithm. Since anti-kT acts like an idealized cone algorithm for well-separated jets [18], XCone jets should be quite similar to the hardest N anti-kT jets. When we study overlapping jets in Secs. 3.2 and 3.3, the inclusive/exclusive distinction will become quite important.

A good setting to study the resolved regime is a heavy resonance decay to dijets, where the two resulting jets are back-to-back and isolated. Here, we consider the scenario

$$pp \to Z' \to q\bar{q}, \tag{3.1}$$

where $Z'$ is a heavy boson with mass $m_{Z'}$ and q is a u, d, or s quark. We start with N = 1 and show that XCone typically matches the hardest anti-kT jet, up to an expected two-fold ambiguity when the jets are nearly degenerate in $p_T$. Going to N = 2, both XCone and anti-kT can successfully reconstruct the dijet resonance peak. Even at N = 3, the found jets are quite similar for typical choices of jet parameters, though XCone will identify final state jets with substructure. Overall, XCone has comparable performance to anti-kT in this basic jet reconstruction scenario.

In the following study, we use Pythia 8.176 [49, 48] to simulate $Z'$ events at the 14 TeV LHC. We take $m_{Z'} = 1$ TeV and assume equal couplings to the three light quarks. All of the final-state particles (except neutrinos) with $|\eta| < 3.0$ are considered for analysis. Anti-kT jets are found using FASTJET 3.0.4 [20]. XCone jets are found using the NSUBJETTINESS 2.2.0 contrib [53, 54, 1] (footnote 1), using the XCone default measure with $\beta = 2$ and $\beta = 1$. For all algorithms, the jet radius parameter is $R_0 = 0.5$.

Figure 3-1: Single jet kinematics of N = 1 XCone versus the hardest anti-kT jet, measured on the dijet resonance sample. Shown are both the XCone default ($\beta = 2$) as well as the recoil-free variant ($\beta = 1$). (a) Single jet $p_T$ spectrum. (b) Jet $p_T$ difference between XCone and anti-kT jet, showing that anti-kT jets are slightly harder on average. (c) Jet $p_T$ difference versus azimuth difference, showing the expected two-fold $\phi \leftrightarrow \phi + \pi$ ambiguity for dijets of comparable $p_T$.

3.1.1 N = 1

For N = 1, the XCone jet will tend to align with the hardest anti-kT jet in the event. The reason is that the XCone measure in Eq. (2.10) penalizes unclustered $p_T$ by design. In Fig. 3-1a, we see that anti-kT and XCone yield nearly identical single jet $p_T$ spectra, with the expected structure at $m_{Z'}/2$ from a dijet resonance decay. In Fig. 3-1b, we compare the found jet $p_T$ on an event-by-event basis, and find a sharp peak at $\Delta p_T = p_T^{\text{anti-}k_T} - p_T^{\text{XCone}} = 0$. On a logarithmic scale, one can see a small tail extending to $\mathcal{O}(50\ \text{GeV})$ for $\beta = 2$, with larger deviations possible in the $\beta = 1$ case. As shown in Fig. 3-1c, there is a two-fold $\phi$ ambiguity in the found jets, as expected from dijet events where both jets have similar $p_T$ values. Note that the box sizes are logarithmic in bin counts, and in the majority of cases, XCone and anti-kT find very similar jet regions.

1 At the time of this thesis, this version of NSUBJETTINESS is not yet publicly available. It will be available by the time this work is officially published.

Figure 3-2: Same N = 1 comparison as Fig. 3-1. (a) Single jet areas, showing the expected peak at $\pi R_0^2$. (b) Jet area difference, showing that anti-kT jets occasionally have a higher area, explaining the $p_T$ asymmetry seen in Fig. 3-1b.

It is interesting that the $\Delta p_T$ distribution is not symmetric, such that anti-kT jets tend to have a larger $p_T$ than XCone jets. In Fig. 3-2a, we plot the active jet area [17, 19] (footnote 2), which is quite similar between the two algorithms and peaked at the expected value of $\pi R_0^2$. On an event-by-event basis, though, there is a population of anti-kT jets that are systematically larger than XCone jets, as shown in Fig. 3-2b. This occurs because anti-kT clustering can yield jets that extend beyond the conical boundary [18]. The $\Delta p_T$ asymmetry then arises because these slightly bigger anti-kT jets contain more particles.

Comparing the performance of different $\beta$ values, $\beta = 2$ jets are more similar to anti-kT jets since both methods align the jet axis with the jet momentum. The $\beta = 1$ jets are slightly softer, since they do not recoil away from the hard jet center to absorb soft radiation. In the absence of pileup, however, the $\beta = 2$ and $\beta = 1$ performance is quite similar on single jet reconstruction.

3.1.2 N = 2

For dijet resonance reconstruction, N = 2 is the most natural choice for running an exclusive cone jet algorithm. As shown in Figs. 3-3a and 3-3b, both anti-kT and XCone give a good reconstruction of the resonance peak, and they largely agree on the $m_{jj}$ value on an event-by-event basis, without much asymmetry in the $m_{jj}^{\text{anti-}k_T} - m_{jj}^{\text{XCone}}$ distribution. XCone can therefore act as a replacement for anti-kT for dijet resonance reconstruction, with comparable performance.3

2 We use the built-in FASTJET area determination routines using active ghosts. For the general conical measure introduced in Ref. [53, 54], the active jet area can be determined analytically, though this is not as straightforward for the conical geometric measure used here.

Figure 3-3: Dijet kinematics of N = 2 XCone versus the two hardest anti-kT jets, measured on the dijet resonance sample. (a) Dijet mass, showing the expected peak at $m_{Z'} = 1$ TeV. (b) Dijet mass difference between XCone and anti-kT jet, showing comparable reconstruction.

3.1.3 N = 3

Thus far, XCone and anti-kT have exhibited very similar behavior, but differences start to appear when considering N = 3. There are two main ways to achieve three jet configurations: either there is sufficient initial state radiation (ISR) to form an additional widely-separated jet, or there is sufficient final state radiation (FSR) to give one of the primary jets some two-prong substructure. In the ISR case, anti-kT and XCone should still give very similar results since the jets are non-overlapping. In the FSR case, we expect XCone will often identify two separate prongs inside a fat clover jet, whereas anti-kT can only identify FSR if it is further away than $R_0$ from the hard jet core.

From the $p_T$ spectra in Fig. 3-4a, we see that the third jet is often softer in the anti-kT case, as expected if anti-kT tends to identify ISR jets. This is highlighted in Figs. 3-4b and 3-4c, which show that the third anti-kT jet can have completely uncorrelated kinematics from the third XCone jet. We can gain further insight in Fig. 3-5a, which shows the distance between the third jet and the closest harder jet. In the anti-kT case, the third jet is forced to be further away than $R_0$ from the jet core, whereas in the XCone case, the third jet can go nearly to $\Delta R = 0$, corresponding to XCone finding substructure within a fat clover jet instead of finding a separate ISR jet.

3 It is known that the $R_0 = 0.5$ cone size is typically too small to capture all of the dijet decay products [16]. While we did find that better performance could be obtained with somewhat larger $R_0$, we wanted all of the plots in this chapter to have a common cone size for ease of comparison.

Figure 3-4: Comparing the third hardest jet kinematics between XCone N = 3 and anti-kT, measured on the dijet resonance sample. (a) Third jet $p_T$ spectrum. (b) Third jet $p_T$ difference between XCone and anti-kT, showing that XCone has a somewhat harder spectrum due to its ability to identify FSR subjets. (c) Third jet $p_T$ difference versus azimuth difference, showing a population of events where the third jet kinematics are completely different between the algorithms.

The same effects are visible in the area distributions in Figs. 3-5b and 3-5c. While the overall area distributions are not so dissimilar (particularly in the $\beta = 2$ case), on an event-by-event basis, there is a population of events where the third XCone jet has substantially smaller jet area, indicative of jet overlap. This is the flip side of the area distributions for N = 1 in Fig. 3-2b, where anti-kT jets could grow larger in size by incorporating a neighboring subjet. In the XCone case, that subjet is separately identified as its own jet for N = 3.

Despite these differences, the overall jet reconstruction is still rather similar between XCone and anti-kT. In Tables 3.1a and 3.1b, we show the fraction of events for which the nth hardest XCone jet is within $R_0/2 = 0.25$ of the nth hardest anti-kT jet.

Figure 3-5: Same N = 3 comparison as Fig. 3-4. (a) Angle of the third hardest jet to the nearest harder jet, showing that XCone jets can be located as close as $\Delta R = 0$ whereas anti-kT jets are forced to have $\Delta R > R_0$. (b) Third jet area distribution, showing the expected peak at $\pi R_0^2$, but larger tails than in the N = 1 case in Fig. 3-2a. (c) Third jet area difference, showing a population of XCone jets with much smaller areas due to the lack of jet merging.

Table 3.1: Comparing XCone with N = 3 to anti-kT. Shown is the fraction of events where the nth hardest XCone jet is within $R_0/2 = 0.25$ of the nth hardest anti-kT jet. (a) The $\beta = 2$ default which behaves most similarly to anti-kT. (b) The $\beta = 1$ recoil-free variant where larger differences are possible.

(a) β = 2    XC 1     XC 2     XC 3
AKT 1        0.918    0.084    0.037
AKT 2        0.082    0.909    0.037
AKT 3        0.000    0.002    0.805
AKT 4        0.000    0.001    0.038

(b) β = 1    XC 1     XC 2     XC 3
AKT 1        0.882    0.123    0.078
AKT 2        0.118    0.864    0.083
AKT 3        0.000    0.012    0.705
AKT 4        0.000    0.000    0.051

For $\beta = 2$, the three hardest jets are well aligned 80% to 90% of the time. For $\beta = 1$, there are larger deviations, though often this is just because the first and second jets are reversed in $p_T$ ordering. Thus, the choice of using anti-kT or XCone for N = 3 depends on the intended application.

3.2 Boosted Higgs Bosons and Dynamic Split/Merge

To highlight the distinct advantages of XCone, we now consider physics situations where resolving substructure is a key element of the analysis. Because XCone always identifies N jets, it is well-suited to physics applications with a fixed number of expected (sub)jets. This is particularly interesting for cases involving jet substructure, where traditional jet algorithms yield merged fat jets, but XCone can identify jets and subjets simultaneously.

As one key example at the LHC [23, 4], consider associated Higgs boson production where the Higgs decays to bottom pairs:

$$pp \to HZ \to b\bar{b}\,\nu\bar{\nu}. \tag{3.2}$$

Apart from possible ISR, the final state consists only of two b-jets. To fight QCD backgrounds, the $p_T$ of the Higgs boson must be reasonably large [15]. However, in this regime, the Higgs decay products are more collimated, often resulting in jet merging. Roughly speaking, the two b-jets will be merged into a single fat jet when the Higgs boson is at the scale

$$p_T^{\text{merge}} = \frac{2\, m_H}{R_0}. \tag{3.3}$$

In order to counteract this effect, either $R_0$ can be decreased until it is small enough to resolve two separate b-jets, or jet substructure techniques can be used [15].

This (quasi-)boosted Higgs analysis is well-suited for XCone. At minimum, N = 1 can identify a single fat jet, after which existing jet substructure techniques can be applied. More intuitively, N = 2 can be used to identify the two hard b-jets in the event at all $p_T$ scales. Here, we describe an analysis in each of these cases to show the advantages of XCone.

Like the previous dijet study, we use Pythia 8.176 [49, 48] to simulate $pp \to HZ$ at the $\sqrt{s} = 14$ TeV LHC. For simplicity, we force the decays $Z \to \nu\bar{\nu}$ and $H \to b\bar{b}$. In order to analyze the properties of the algorithm in different $p_T$ regimes, we place generator-level $p_T$ cuts on the Higgs boson between 200 and 1000 GeV. We use the same $R_0 = 0.5$ jet radius for all analyses in this section, such that $p_T^{\text{merge}} \simeq 500$ GeV is in the middle of our studied $p_T$ range.
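The merging scale quoted here follows directly from the relation $p_T^{\text{merge}} = 2 m_H / R_0$. The short Python sketch below just carries out that arithmetic; the function names are hypothetical, and `opening_angle` is the same rough relation solved for the angular separation instead of the transverse momentum.

```python
def merge_scale(mass, R0):
    """Approximate pT (GeV) above which a two-body decay fits inside one cone of radius R0."""
    return 2.0 * mass / R0

def opening_angle(mass, pt):
    """Rough angular separation of the decay products, from the same relation."""
    return 2.0 * mass / pt

p_merge = merge_scale(125.0, 0.5)            # Higgs mass in GeV, R0 = 0.5
angle_low = opening_angle(125.0, 200.0)      # lower edge of the studied pT range
angle_high = opening_angle(125.0, 1000.0)    # upper edge of the studied pT range
```

At the lower edge of the range the estimated separation exceeds $R_0 = 0.5$ (resolved b-jets), while at the upper edge it is well below $R_0$ (merged b-jets), so the chosen $p_T$ cuts straddle the transition.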

Figure 3-6: Top row: Comparing fat jet Higgs reconstruction between XCone with N = 1 and the hardest anti-kT jet. As the Higgs $p_T$ increases from (a) 200 GeV to (b) 500 GeV to (c) 800 GeV, both methods capture the merged Higgs decay products, yielding a growing mass peak at $m_H = 125$ GeV. Bottom row, panels (d) to (f): Comparing area of Higgs fat jet between XCone with N = 1 and the hardest anti-kT jet. All distributions show a large peak at $A = \pi (0.5)^2$, as expected, with anti-kT jets showing a slightly larger area. This also helps explain the slight mass offset seen in the first row of figures.

3.2.1 N = 1

Since the pioneering work in Ref. [15], the boosted Higgs channel has often been analyzed by finding one fat jet with a large radius parameter, and then using substructure techniques to analyze its properties [23]. The original BDRS paper used the C/A algorithm to find this fat jet, and XCone with N = 1 can serve a similar purpose.4 Here, we compare XCone to anti-kT with the same jet radius.

4 We do not explicitly compare the Higgs 1-jet channel to the BDRS method since the philosophy of the two approaches is very different. However, one could in principle use XCone N = 1 to find a jet and then use the Cambridge/Aachen algorithm to create a clustering sequence from which the BDRS method could be used.

Figure 3-7: Efficiency for N = 1 fat jet Higgs reconstruction as a function of Higgs $p_T$, with the mass window $m_j \in (100, 150)$ GeV. The efficiency grows when the $p_T$ is above the merging scale $2 m_H / R_0$, and XCone $\beta = 2$ outperforms $\beta = 1$ in the transition region since the former centers the jet along the Higgs momentum.

Unlike in the dijet resonance study, the difference between $\beta = 1$ and $\beta = 2$ is rather important for quasi-boosted Higgs bosons. As described in Ref. [54], $\beta = 1$ minimization aligns the jet axis with the hardest cluster within a jet, whereas $\beta = 2$ minimization places the jet axis approximately in the direction of the jet momentum. Roughly speaking, $\beta = 1$ finds the "median" jet axis direction whereas $\beta = 2$ finds the "mean" jet axis direction. For the boosted Higgs case, the $\beta = 1$ jet is more likely to point in the direction of one of the two b-jets, while the $\beta = 2$ jet is more likely to lie in between the two b-jets. Anti-kT acts like $\beta = 2$ since it also aligns the jet axis with the jet momentum.

In Figures 3-6a, 3-6b, and 3-6c we show the single jet invariant mass as the minimum Higgs p_T (at generator level) is adjusted from 200 GeV to 500 GeV to 800 GeV. With increasing p_T, more of the Higgs decay products are contained inside a single jet and the peak at m_j = 125 GeV grows. By eye, the anti-kT and β = 2 distributions are quite similar, whereas the β = 1 case has a somewhat worse performance since the jet axis is misaligned from the Higgs boson momentum.

We quantify the Higgs reconstruction efficiency in Fig. 3-7, which shows the fraction of jets in the Higgs mass window m_j ∈ (100, 150) GeV. At very high p_T values, all algorithms have very similar performance, but β = 2 does better in the vicinity of p_T^merge. This is because the β = 2 jet axis is more likely to lie in between the two b-jets, so the jet is more likely to capture the full Higgs decay products. As expected, anti-kT and β = 2 are nearly identical.
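As a quick numerical check of the scales involved, the merging scale 2m_H/R_0 quoted in the caption of Fig. 3-7 and the mass-window efficiency can be written as two small helpers. This is a minimal sketch: the function names and the toy jet masses are hypothetical and are not taken from the analysis code.

```python
def merging_scale(mass, radius):
    """p_T above which a two-body decay of the given mass fits inside a cone
    of the given radius, using the estimate p_T^merge = 2 m / R."""
    return 2.0 * mass / radius

def window_efficiency(jet_masses, low=100.0, high=150.0):
    """Fraction of jets whose mass (in GeV) lies in the open window (low, high)."""
    if not jet_masses:
        return 0.0
    return sum(1 for m in jet_masses if low < m < high) / len(jet_masses)

pt_merge = merging_scale(mass=125.0, radius=0.5)   # 500 GeV for m_H = 125 GeV, R_0 = 0.5
toy_masses = [92.0, 118.0, 124.0, 131.0, 160.0]    # hypothetical jet masses in GeV
eff = window_efficiency(toy_masses)                 # 3 of the 5 toy jets pass
```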

3.2.2 N = 2

For this associated Higgs scenario, the real power of XCone comes from using N = 2. In the unboosted regime, the standard analysis strategy is to find two b-jets, reconstruct their invariant mass, and look for a peak at the known Higgs mass. In the boosted regime with jet merging, though, algorithms like anti-kT are likely to find one fat Higgs jet and one ISR jet elsewhere in the event, so a dijet reconstruction strategy is no longer effective. By contrast, since XCone is an exclusive cone algorithm, it will always identify two jets regardless of the Higgs p_T. To find boosted Higgs bosons with XCone, we can simply run with N = 2 and perform a standard resolved jet analysis.

In Figures 3-8a, 3-8b, and 3-8c we show the reconstructed dijet invariant mass comparing XCone with anti-kT. For low Higgs p_T, all algorithms find the Higgs peak with roughly the same line shape.⁵ As the Higgs p_T increases, the anti-kT distributions move to higher dijet masses because of jet mergers, whereas XCone maintains good performance regardless of p_T.

We show the Higgs reconstruction efficiency in Fig. 3-9a as a function of the Higgs p_T. Anti-kT jets start to merge around p_T = 300 GeV and the Higgs efficiency drops significantly. XCone has nearly flat efficiency as a function of Higgs p_T, even as the p_T crosses beyond the p_T^merge scale. At higher p_T values, the β = 2 jets see a performance degradation, since the β = 2 jets are susceptible to radiation at wide angles. The β = 1 jets are able to maintain their performance since the jet axes tend to always align with the momentum of the Higgs decay products. Overall, the XCone reconstruction efficiency is around 65% for R_0 = 0.5.

5 The low mass tail in each of the plots can be explained by neutrinos from B meson decays within the b-jet.

[Figure 3-8 panels: "Dijet Mass (p_T > 200)", "Dijet Mass (p_T > 500)", "Dijet Mass (p_T > 800)" with horizontal axis m_jj (GeV); "Area of Jet (p_T > 200)", "Area of Jet (p_T > 500)", "Area of Jet (p_T > 800)" with horizontal axis Area; R = 0.5.]

Figure 3-8: Top row: Comparing resolved dijet Higgs reconstruction between XCone with N = 2 and the two hardest anti-kT jets. As the Higgs p_T increases from (a) 200 GeV to (b) 500 GeV to (c) 800 GeV, anti-kT suffers from jet mergers, whereas XCone yields a dijet Higgs peak across the entire p_T spectrum. Bottom row: Comparing area between XCone with N = 2 and the two hardest anti-kT jets. At low p_T, both distributions show a large peak at A = π(0.5)², as expected. As p_T increases, the XCone area falls to roughly half its original value, indicating that the jets begin to overlap, while anti-kT jets remain at their original value.

[Figure 3-9 panels: "Higgs Eff. for 2-jet" and "Higgs Eff. for 2-jet + ISR tag"; R = 0.5; horizontal axis: p_T,min (GeV).]

Figure 3-9: (a) Efficiency for N = 2 resolved dijet Higgs reconstruction as a function of Higgs p_T, with the mass window m_jj ∈ (100, 150) GeV. The anti-kT efficiency degrades at higher p_T due to jet merging, while XCone maintains a roughly constant efficiency of around 65% across the spectrum. Here, β = 1 outperforms β = 2, since the former is less susceptible to wide-angle jet contamination. (b) Same as Fig. 3-9a, but now allowing the Higgs to be reconstructed with either N = 2 or N = 3 (with the minimum pairwise mass). The N = 3 method allows ISR jets to be vetoed, improving the performance of all methods, such that XCone β = 1 and β = 2 have comparable performance.

3.2.3 N = 3

To improve the XCone performance, we have to account for ISR, which is the leading cause of misreconstruction. In the presence of hard ISR, XCone can identify an ISR jet instead of finding one of the two b-jets. To address this issue, we can explicitly identify the ISR jet using N = 3 and find the best reconstruction among the N = 2 and N = 3 options. We first run XCone with N = 2 and check whether the dijet mass is in the m_jj ∈ (100, 150) GeV window. If not, we run XCone with N = 3 and apply the Higgs mass test on the pair of jets with the smallest invariant mass, as these are kinematically the most likely candidates to be the Higgs decay products. Allowing two pathways for Higgs reconstruction gives improved signal efficiency, and in Fig. 3-9b, we see that both β = 1 and β = 2 now have efficiencies around 75%. Applying the same 2- and 3-jet technique to anti-kT does improve the signal efficiency somewhat, though XCone still has better performance for p_T > 300 GeV.
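The two-pathway logic described above can be sketched as follows. This is an illustration under stated assumptions rather than the thesis code: jets are represented as (E, px, py, pz) tuples in GeV, the jets for N = 2 and N = 3 are assumed to come from a jet finder that is not shown, and the names `invariant_mass` and `reconstruct_higgs` are hypothetical.

```python
import math
from itertools import combinations

def invariant_mass(*jets):
    """Invariant mass of the summed four-vectors, each given as (E, px, py, pz) in GeV."""
    e, px, py, pz = (sum(j[i] for j in jets) for i in range(4))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))

def reconstruct_higgs(jets_n2, jets_n3, window=(100.0, 150.0)):
    """Return the candidate dijet mass if it passes the window, else None.

    jets_n2 and jets_n3 are the jets found with N = 2 and N = 3; running the
    jet finder itself is outside this sketch."""
    low, high = window
    m2 = invariant_mass(*jets_n2)
    if low < m2 < high:
        return m2
    # Fall back to N = 3: test the pair of jets with the smallest invariant mass.
    m3 = min(invariant_mass(a, b) for a, b in combinations(jets_n3, 2))
    return m3 if low < m3 < high else None

# Toy event (made-up numbers): two massless b-jets from a Higgs at rest plus a hard ISR jet.
b1 = (62.5, 62.5, 0.0, 0.0)
b2 = (62.5, -62.5, 0.0, 0.0)
isr = (300.0, 0.0, 300.0, 0.0)

# Suppose N = 2 picked up the ISR jet in place of b2; the N = 3 pathway recovers the pair.
candidate = reconstruct_higgs([isr, b1], [b1, b2, isr])
```

As noted in the text, background events also gain a second pathway into the signal window, so a sketch like this would normally be paired with b-tagging.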

We conclude that XCone is highly efficient in reconstructing Higgs bosons across a range of kinematics, from the resolved to quasi-boosted to boosted regimes. This method could be further improved by using b-tagging to help identify and veto ISR jets. Using b-tagging is also important for mitigating backgrounds, since a potential downside of using the combined N = 2,3 method is that background events also have two pathways to land in the signal window. But the main takeaway from this study is that XCone allows traditional resolved analyses to be extended into the boosted regime, providing a p_T-independent method for Higgs reconstruction.

3.3 Boosted Top Quarks and High-Multiplicity Final States

Given the success of XCone in reconstructing boosted Higgs bosons, we now test whether XCone can handle the increasingly complex final states possible at LHC collision energies. An important process at the LHC is pair production of top quarks with fully hadronic decays:

$$ pp \to t\bar{t}, \qquad t \to Wb \to q\bar{q}'b. \qquad (3.4) $$

At low m_tt̄, the final state consists of six resolved jets. At high m_tt̄, the jets are arranged into two fat jets with three-prong substructure, and a variety of substructure techniques have been developed to tag these boosted tops [37, 55, 44, 54, 39]. Here, we show that XCone with N = 6 can identify each of the six individual (sub)jets, regardless of the m_tt̄ value, allowing the same analysis strategy to be effective in both the resolved and boosted regimes. We also show a more efficient N = 2 x 3 method where the event is first partitioned into hemispheres using N = 2 and then separated into subjets by applying N = 3 in each hemisphere.

Our study is based on the BOOST 2010 event samples [5] for √s = 7 TeV collisions.⁶ For the boosted top signal, we use the Herwig tt̄ → hadrons samples where the generator-level top p_T ranges from 200-800 GeV in bins of 100 GeV. We also apply XCone to the Herwig QCD dijet background sample with the same p_T bins. As in the boosted Higgs study, we take R_0 = 0.5. When comparing to traditional fat jet studies, we use anti-kT jets with R_0 = 1.0 as recommended in the BOOST 2010 report [5]. For brevity, we do not include a straight N = 2 fat jet study for XCone, since the results are similar to those found in Sec. 3.1.

At the outset, we want to emphasize that XCone is able to handle partially overlapping jets, as expected in the boosted and quasi-boosted regimes. In the highly boosted limit, however, the subjets are fully overlapping, so substructure methods based on fat jets are typically more effective at signal/background separation. While it is possible to combine XCone with jet shapes like N-subjettiness for improved performance in the highly boosted limit, we find in preliminary studies that there is no real advantage to using N = 6 over a more traditional fat jet analysis with N = 2. The key advantage of XCone is that it yields relatively uniform performance over a broad p_T range, and while specialized techniques can achieve better performance at extreme kinematics, XCone allows resolved techniques to be applied even when jets are overlapping.

6 While the BOOST 2011 samples would be more realistic for our studies, they unfortunately seem to be lost forever.

3.3.1 N = 6

The most straightforward application of XCone for hadronic tops is using N = 6 to resolve six jets. Like in Sec. 3.2.3, one can try to improve the performance by explicitly identifying ISR jets using N = 7, but we find that the N = 2 x 3 method shown below is more effective at dealing with the combinatorial complexity of this final state. For these studies, we have not incorporated b-tagging information, though it would be straightforward to use XCone in conjunction with recent subjet b-tagging methods [26, 3].

[Figure 3-10 panels: "Top Mass", "Top Area", "QCD Mass", "QCD Area", all for 400 < p_T < 500; R = 0.5; horizontal axes: m (GeV) and Area.]

Figure 3-10: Top row: Comparing resolved three-jet top reconstruction between XCone with N = 6 and the six hardest anti-kT jets, in the p_T ∈ (400, 500) GeV bin. Here, top candidates are identified by minimizing the sum of the three-jet masses. (a) Candidate top mass distributions, showing that XCone does not have as pronounced a high-mass tail due to ISR. (b) Area of all six jets, showing a peak at (2/3)πR² for XCone, as expected for a clover jet configuration, compared to a peak at πR² for anti-kT, as expected for separated jets. Bottom row: Same for the QCD background.

[Figure 3-11 panels: "Top Eff. (6-jets)", "QCD Eff. (6-jets)", "Signal Signif. (6-jets)"; R = 0.5; horizontal axis: p_T,min (GeV).]

Figure 3-11: Comparing boosted top and XCone performance with N = 6 as a function of top p_T for (a) signal efficiency, (b) background mistag, and (c) signal significance gain. XCone surpasses the performance of anti-kT across the entire p_T range.

After running XCone with N = 6, we want to partition the jets into two top candidates in a way that is p_T-independent. We do this by finding all ₆C₃ = 20 ways of partitioning the jets into two three-jet clusters, and then finding the configuration that minimizes the scalar sum of the three-jet masses. Much like in the boosted Higgs case, we expect that minimizing the mass is most likely to yield the correct top candidates. In a more sophisticated analysis, one could use a χ²-minimization approach to find the best top candidates, also incorporating W-mass and b-tagging information. For an apples-to-apples comparison below, we apply the same analysis strategy on the six hardest anti-kT jets.
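The partitioning step can be sketched in a few lines. This is a minimal illustration, assuming jets are (E, px, py, pz) tuples in GeV; the function names are hypothetical. Note that choosing three of six jets gives 20 triplets, but each unordered split into two triplets appears twice among them, so there are 10 distinct configurations.

```python
import math
from itertools import combinations

def invariant_mass(jets):
    """Invariant mass of the summed four-vectors, each given as (E, px, py, pz) in GeV."""
    e, px, py, pz = (sum(j[i] for j in jets) for i in range(4))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))

def best_top_pairing(jets):
    """Split six jets into two three-jet candidates, minimizing the scalar sum
    of the two three-jet masses. Returns (mass_sum, triplet_indices, rest_indices)."""
    if len(jets) != 6:
        raise ValueError("expected exactly six jets")
    indices = range(6)
    best = None
    for triplet in combinations(indices, 3):  # 20 choices; each split is visited twice
        rest = tuple(i for i in indices if i not in triplet)
        m1 = invariant_mass([jets[i] for i in triplet])
        m2 = invariant_mass([jets[i] for i in rest])
        if best is None or m1 + m2 < best[0]:
            best = (m1 + m2, triplet, rest)
    return best
```

A χ²-based variant, as suggested in the text, would replace the mass sum with a figure of merit that also uses the W mass and b-tagging information.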

In Fig. 3-10a, we show the reconstructed top jet mass in the p_T ∈ (400, 500) GeV bin, comparing XCone and anti-kT with N = 6. The XCone distributions show a much better resolved top peak, which is more symmetric around the top mass and has a substantially reduced high-mass tail. In the area distributions in Fig. 3-10b, we see that XCone jets are peaked at roughly (2/3)πR², where the factor of 2/3 is expected since the jets are arranged in a clover configuration around the boosted top direction. The anti-kT jets are peaked around πR², as expected since anti-kT jets do not typically overlap. Because of subjet mergers, the six found anti-kT jets are not all associated with the top quarks, leading to large invariant mass values from ISR jets.

The equivalent dijet background distributions are shown in Figs. 3-10c and 3-10d. In the absence of genuine three-prong substructure, the XCone jets tend to be scattered throughout the event, leading to large reconstructed invariant masses and πR² areas. The effect is even stronger in the anti-kT jets, since they avoid overlaps.

To define the top signal region, we take a top mass window of m_jjj ∈ (150, 200) GeV.⁷ We also apply a W-tagging cut as described in the CMS analysis [27], by analyzing each pairwise combination of the three subjets and requiring a minimum pairwise invariant mass of m_jj,min > 50 GeV.

In Figs. 3-11a and 3-11b we show the efficiency and mistag rates for the top and dijet samples as a function of (generator-level) p_T. XCone has a signal efficiency of around 60% across the entire p_T range, showing the desired scale invariance. While anti-kT starts with the same 60% efficiency in the resolved regime, the efficiency drops considerably with p_T due to jet mergers, analogous to what was found in Sec. 3.2.2. Both methods have around a 10% background mistag rate, which is relatively stable as a function of p_T. The improvement in signal significance (S²/B) is shown in Fig. 3-11c, where we see that the XCone performance exceeds that of anti-kT across the entire p_T range, and remains relatively flat. Note that we have not included b-tagging information nor additional jet shape information, so background rejection factors can be much larger in practice. From this simple study, we see that a straightforward application of XCone allows for a p_T-independent analysis strategy even for complicated final states.
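The selection and the significance measure above can be written compactly. The sketch below assumes that the improvement in S²/B is the ratio (signal efficiency)² / (background efficiency), which follows from scaling S and B by their respective efficiencies; the function names are hypothetical, and jets are again (E, px, py, pz) tuples in GeV.

```python
import math
from itertools import combinations

def invariant_mass(jets):
    """Invariant mass of the summed four-vectors, each given as (E, px, py, pz) in GeV."""
    e, px, py, pz = (sum(j[i] for j in jets) for i in range(4))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))

def passes_top_selection(subjets, top_window=(150.0, 200.0), min_pair_mass=50.0):
    """Three-subjet top candidate: top mass window plus minimum pairwise mass cut (GeV)."""
    m_jjj = invariant_mass(subjets)
    m_jj_min = min(invariant_mass(pair) for pair in combinations(subjets, 2))
    return top_window[0] < m_jjj < top_window[1] and m_jj_min > min_pair_mass

def significance_gain(signal_eff, background_eff):
    """Factor by which S^2/B changes when S and B are scaled by these efficiencies."""
    return signal_eff ** 2 / background_eff

# With the approximate rates quoted in the text (60% signal, 10% mistag):
gain = significance_gain(0.60, 0.10)  # roughly 3.6
```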

3.3.2 N = 2 x 3

To further improve on the performance of XCone, we can take into account the event topology. Even with a moderate boost, the top decay products tend to arrange themselves into two hemispheres, a feature that is exploited, for example, in the HEPTopTagger [44]. Thus, we can use XCone in multiple stages, first dividing the event into separate top candidate regions with N = 2 and R_0 → ∞, and then finding jets in each of those regions using N = 3 and R_0 = 0.5.
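The control flow of this two-stage strategy is simple enough to sketch. The example below is only a schematic: `find_jets(particles, n_jets, radius)` stands for a hypothetical, user-supplied exclusive jet finder that returns a list of jets, each a list of constituents, and `toy_finder` is a stand-in acting on one-dimensional positions so that the snippet runs on its own. Neither is the XCone implementation.

```python
import math

def two_stage_jets(particles, find_jets, n_regions=2, n_subjets=3, radius=0.5):
    """Partition the event into regions, then find jets inside each region.

    find_jets(particles, n_jets, radius) is a hypothetical exclusive jet finder
    returning a list of jets, each a list of constituent particles."""
    regions = find_jets(particles, n_regions, math.inf)  # stage 1: N = 2, R_0 -> infinity
    return [find_jets(region, n_subjets, radius) for region in regions]  # stage 2: N = 3

def toy_finder(particles, n_jets, radius):
    """Stand-in finder for illustration only: sorts 1D positions and cuts them
    into n_jets equal contiguous groups, ignoring radius."""
    ordered = sorted(particles)
    size = len(ordered) // n_jets
    return [ordered[k * size:(k + 1) * size] for k in range(n_jets)]

event = [-3.0, -2.0, -1.0, 1.0, 2.0, 3.0]
jets_by_region = two_stage_jets(event, toy_finder)
```

Because the second stage only sees the constituents of one region, each top candidate is guaranteed its own set of three jets, which is the point of the staged approach.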

There are two advantages of this N = 2 x 3 approach over the N = 6 approach. First, it reduces combinatorial confusion and increases computational efficiency. Second, it ensures that each top candidate has a chance to get three jets. Even without ISR, the N = 6 method can yield, for example, one four-leaf top clover and one two-leaf top clover, something that is avoided with N = 2 x 3. While it is possible to get even higher signal efficiencies by applying N = 2 x 4 and vetoing ISR jets (analogous to Sec. 3.2.3), such an approach tends to also increase the background mistag rate, so we will not show N = 2 x 4 results here.

We can compare XCone to traditional top reconstruction methods in two different ways. For a boosted strategy, we can run anti-kT with R_0 = 1.0 to find two fat jets and then run exclusive kT with N = 3 and R_0 = 0.5 on the fat jet constituents to identify subjets. We denote this as anti-kT (Bst) in the figures. For a resolved strategy, we can run exclusive kT with N = 2 and R_0 → ∞ to find hemisphere regions, and then run anti-kT with R_0 = 0.5 to find the three hardest jets in each hemisphere. We denote this as anti-kT (Res) in the figures. As we will see, XCone N = 2 x 3 effectively interpolates between these behaviors as a function of p_T, reproducing the best performance of the traditional strategies in their respective domains.

7 Since the peak from XCone is narrower and more symmetric around the top mass, we can use a smaller and more symmetric mass window without losing much signal efficiency.

[Figure 3-12 panels: "Top Mass (400 < p_T < 500)" and "QCD Mass (400 < p_T < 500)"; R = 0.5; horizontal axis: m (GeV); legend includes anti-kT (Bst) and anti-kT (Res).]

Figure 3-12: Reconstructed mass distributions with the N = 2 x 3 strategy, for (a) top signal events and (b) QCD background events. Here we compare the XCone N = 2 x 3 method to two traditional methods: a boosted strategy where two anti-kT R_0 = 1.0 jets have three exclusive kT subjets, and a resolved strategy where two exclusive kT hemispheres have three anti-kT R_0 = 0.5 jets.

[Figure 3-13 panels: "Top Eff. (2x3-jets)", "QCD Eff. (2x3-jets)", "Signal Signif. (2x3-jets)"; R = 0.5; horizontal axis: p_T,min (GeV); legend includes anti-kT (Bst) and anti-kT (Res).]

Figure 3-13: Comparing boosted top and XCone performance with N = 2 x 3 as a function of top p_T for (a) signal efficiency, (b) background mistag, and (c) signal significance gain. XCone interpolates between the traditional resolved strategy at low p_T and the traditional boosted strategy at high p_T.

The resulting top mass distributions are shown in Fig. 3-12a, again in the p_T ∈ (400, 500) GeV bin. The top mass distribution for XCone N = 2 x 3 jets is similar to the N = 6 case, continuing to maintain a peaked, symmetric shape around the top mass. Crucially, N = 2 x 3 reduces the high-mass tail since there are more correctly reconstructed top quarks. The anti-kT boosted strategy results in a W shelf caused by the R_0 = 1.0 jet radius not containing the full top decay products, while the anti-kT resolved strategy has a high-side mass tail from the inclusion of ISR jets. XCone avoids both of these pitfalls, giving an excellent overall reconstruction. Similarly, as shown in Fig. 3-12b, the background mass distribution for XCone falls in between the anti-kT boosted and resolved strategies. Like in the N = 6 case, both XCone jets and anti-kT N = 6 jets tend to be scattered throughout the hemisphere, leading to large invariant masses. However, the additional constraint of the hemisphere helps to control this effect, giving smaller masses on average than N = 6.

The signal efficiency and background mistag rates are given in Figs. 3-13a and 3-13b, again for the m_jjj ∈ (160, 240) GeV top mass window and m_jj,min > 50 GeV W mass cut. For the signal efficiency, it is clear that XCone interpolates between the good anti-kT resolved performance at low p_T and the good anti-kT boosted performance at high p_T, yielding approximately a 60% reconstruction efficiency throughout the p_T range. The improvement over anti-kT is likely due to the inclusion of soft subjets that fall outside of the normal anti-kT clustering radius, as described in [35]. For the background mistag rate, XCone holds steady at 10%, where again, further improvements are possible using b-tagging or substructure information. The improvement in signal significance (S²/B) is shown in Fig. 3-13c, where we see that the performance exceeds both the boosted and resolved strategies of anti-kT across the entire p_T range. Further, the performance increases as a function of p_T, indicating a very strong performance of this algorithm in the highly boosted regime. We conclude that XCone gives a powerful way to extend conventional resolved analysis strategies into the boosted regime.

Chapter 4

Conclusion

In this thesis, we have presented the concept of the exclusive cone algorithm designed for improved analysis of jets at the Large Hadron Collider, and demonstrated its use in several key LHC analyses. In Chapter 2, we introduced a new jet algorithm "XCone" based on the N-jettiness event shape that is capable of resolving highly boosted jets by dynamically splitting the jets into equal partitions while still maintaining the conical boundaries. We also defined a new conical geometric measure based on previous uses of N-jettiness that has many key theoretical and experimental advantages. We also outlined the implementation of a simple minimization routine that approximates stable cone finding, and discussed a straightforward and accurate way of finding seed axes through sequential clustering algorithms.

In Chapter 3, we presented three case studies to show how XCone can be used in a variety of LHC analyses, producing comparable or better results than conventional methods. Remarkably, a single benchmark cone size of R_0 = 0.5 was able to successfully reconstruct heavy Z' resonances, boosted Higgs bosons, and boosted top quarks to a high degree of accuracy across a wide range of jet p_T. This shows that XCone is well-equipped to handle a variety of search channels at the LHC over the kinematic regime probed in current colliders. Although we did not test it in this thesis, we expect that the use of XCone in conjunction with other existing jet substructure techniques will allow for even greater improvement in signal reconstruction.

There are many future directions that one could take in exploring the wide parameter space that XCone offers. In particular, the freedom offered by both the choice of measure and the axis finding and minimization routine allows future implementations of XCone that may be helpful in vetoing ISR jets, mitigating the QCD background, and more. Possible important extensions include a study of the role of γ in the definition of the conical geometric measure, or a new implementation where the strategies for finding the axes and the jet regions are completely separated. XCone can also be extended to many other channels beyond those shown in the thesis, including complex multi-jet searches for supersymmetry and other beyond the Standard Model physics. At the moment, the XCone code is limited to the options presented in this thesis, but these other options are still left open for future consideration. Given both the experimental and theoretical advantages that XCone provides, as well as the promising results of this thesis, we hope that XCone and other exclusive cone jet algorithms will continue to be researched and implemented for physics studies at the LHC and other colliders. We look forward to potential results with this algorithm extremely soon as the LHC begins ramping up for Run 2.

Bibliography

[1] Fastjet contrib. http://fastjet.hepforge.org/contrib/.

[2] Performance of boosted top quark identification in 2012 ATLAS data. Technical Report ATLAS-CONF-2013-084, ATLAS-COM-CONF-2013-074, 2013.

[3] Calibration of the performance of b-tagging for c and light-flavour jets in the 2012 ATLAS data. Technical Report ATLAS-CONF-2014-046, CERN, Geneva, Jul 2014.

[4] Georges Aad et al. Search for the bb̄ decay of the Standard Model Higgs boson in associated (W/Z)H production with the ATLAS detector. JHEP, 1501:069, 2015.

[5] A. Abdesselam, E. Bergeaas Kuutmann, U. Bitenc, G. Brooijmans, J. Butterworth, et al. Boosted objects: A Probe of beyond the Standard Model physics. Eur.Phys.J., C71:1661, 2011.

[6] D. Adams, A. Arce, L. Asquith, M. Backovic, T. Barillari, et al. Towards an Understanding of the Correlations in Jet Substructure. 2015.

[7] A. Altheimer, A. Arce, L. Asquith, J. Backus Mayes, E. Bergeaas Kuutmann, et al. Boosted objects and jet substructure at the LHC. 2013.

[8] A. Altheimer, S. Arora, L. Asquith, G. Brooijmans, J. Butterworth, et al. Jet Substructure at the Tevatron and LHC: New results, new tools, new benchmarks. J.Phys., G39:063001, 2012.

[9] Andrea Banfi, Gavin P. Salam, and Giulia Zanderighi. Principles of general final-state resummation and automated implementation. JHEP, 0503:073, 2005.

[10] Daniele Bertolini, Tucker Chan, and Jesse Thaler. Jet Observables Without Jet Algorithms. JHEP, 1404:013, 2014.

[11] Gerald C. Blazey, Jay R. Dittmann, Stephen D. Ellis, V. Daniel Elvira, K. Frame, et al. Run II jet physics. pages 47-77, 2000.

[12] Marça Boronat, Ignacio Garcia, and Marcel Vos. A new jet reconstruction algorithm for lepton colliders. 2014.

[13] J.M. Butterworth, J.P. Couchman, B.E. Cox, and B.M. Waugh. KtJet: A C++ implementation of the K-perpendicular clustering algorithm. Comput.Phys.Commun., 153:85-96, 2003.

[14] J.M. Butterworth, John R. Ellis, and A.R. Raklev. Reconstructing sparticle mass spectra using hadronic decays. JHEP, 0705:033, 2007.

[15] Jonathan M. Butterworth, Adam R. Davison, Mathieu Rubin, and Gavin P. Salam. Jet substructure as a new Higgs search channel at the LHC. Phys.Rev.Lett., 100:242001, 2008.

[16] Matteo Cacciari, Juan Rojo, Gavin P. Salam, and Gregory Soyez. Quantifying the performance of jet definitions for kinematic reconstruction at the LHC. JHEP, 0812:032, 2008.

[17] Matteo Cacciari and Gavin P. Salam. Pileup subtraction using jet areas. Phys.Lett., B659:119-126, 2008.

[18] Matteo Cacciari, Gavin P. Salam, and Gregory Soyez. The Anti-k(t) jet clustering algorithm. JHEP, 0804:063, 2008.

[19] Matteo Cacciari, Gavin P. Salam, and Gregory Soyez. The Catchment Area of Jets. JHEP, 0804:005, 2008.

[20] Matteo Cacciari, Gavin P. Salam, and Gregory Soyez. FastJet User Manual. Eur.Phys.J., C72:1896, 2012.

[21] S. Catani, Yuri L. Dokshitzer, M.H. Seymour, and B.R. Webber. Longitudinally invariant Kt clustering algorithms for hadron hadron collisions. Nucl.Phys., B406:187-224, 1993.

[22] S. Catani, G. Turnock, and B.R. Webber. Jet broadening measures in e+e- annihilation. Phys.Lett., B295:269-276, 1992.

[23] Serguei Chatrchyan et al. Search for the standard model Higgs boson produced in association with a W or a Z boson and decaying to bottom quarks. Phys.Rev., D89(1):012003, 2014.

[24] CMS Collaboration. Search for ttbar resonances in semileptonic final state. 2012.

[25] CMS Collaboration. Identifying Hadronically Decaying Vector Bosons Merged into a Single Jet. Technical Report CMS-PAS-JME-13-006, 2013.

[26] CMS Collaboration. Performance of b tagging at sqrt(s)=8 TeV in multijet, ttbar and boosted topology events. Technical Report CMS-PAS-BTV-13-001, 2013.

[27] CMS Collaboration. Boosted Top Jet Tagging at CMS. 2014.

[28] The ATLAS collaboration. A search for tt̄ resonances in the lepton plus jets final state with ATLAS using 14 fb⁻¹ of pp collisions at √s = 8 TeV. 2013.

[29] Yuri L. Dokshitzer, G.D. Leder, S. Moretti, and B.R. Webber. Better jet clustering algorithms. JHEP, 9708:001, 1997.

[30] Yuri L. Dokshitzer, A. Lucenti, G. Marchesini, and G.P. Salam. On the QCD analysis of jet broadening. JHEP, 9801:011, 1998.

[31] S.D. Ellis, J. Huston, K. Hatakeyama, P. Loch, and M. Tonnesmann. Jets in hadron-hadron collisions. Prog.Part.Nucl.Phys., 60:484-551, 2008.

[32] S.D. Ellis, J. Huston, and M. Tonnesmann. On building better cone jet algorithms. page P513, 2001.

[33] Stephen D. Ellis and Davison E. Soper. Successive combination jet algorithm for hadron collisions. Phys.Rev., D48:3160-3166, 1993.

[34] Stephen D. Ellis, Christopher K. Vermilion, and Jonathan R. Walsh. Recombination Algorithms and Jet Substructure: Pruning as a Tool for Heavy Particle Searches. Phys.Rev., D81:094023, 2010.

[35] Marat Freytsis, Tomer Volansky, and Jonathan R. Walsh. Tagging Partially Reconstructed Objects with Jet Substructure. 2014.

[36] Teppo T. Jouttenus, Iain W. Stewart, Frank J. Tackmann, and Wouter J. Waalewijn. Jet Mass Spectra in Higgs + One Jet at NNLL. Phys.Rev., D88:054031, 2013.

[37] David E. Kaplan, Keith Rehermann, Matthew D. Schwartz, and Brock Tweedie. Top Tagging: A Method for Identifying Boosted Hadronically Decaying Top Quarks. Phys.Rev.Lett., 101:142001, 2008.

[38] David Krohn, Jesse Thaler, and Lian-Tao Wang. Jet Trimming. JHEP, 1002:084, 2010.

[39] Andrew J. Larkoski, Ian Moult, and Duff Neill. Building a Better Boosted Top Tagger. Phys.Rev., D91(3):034035, 2015.

[40] Andrew J. Larkoski, Duff Neill, and Jesse Thaler. Jet Shapes with the Broadening Axis. JHEP, 1404:017, 2014.

[41] Andrew J. Larkoski, Gavin P. Salam, and Jesse Thaler. Energy Correlation Functions for Jet Substructure. JHEP, 1306:108, 2013.

[42] Andrew J. Larkoski and Jesse Thaler. Aspects of jets at 100 TeV. Phys.Rev., D90(3):034010, 2014.

[43] Stuart P. Lloyd. Least squares quantization in PCM. IEEE Transactions on Information Theory, 28:129-137, 1982.

[44] Tilman Plehn, Michael Spannowsky, Michihisa Takeuchi, and Dirk Zerwas. Stop Reconstruction with Tagged Tops. JHEP, 1010:078, 2010.

[45] Gavin Salam. Unpublished.

[46] Gavin P. Salam. Towards Jetography. Eur.Phys.J., C67:637-686, 2010.

[47] Gavin P. Salam and Gregory Soyez. A Practical Seedless Infrared-Safe Cone jet algorithm. JHEP, 0705:086, 2007.

[48] Torbjorn Sjostrand, Stephen Mrenna, and Peter Skands. A Brief Introduction to PYTHIA 8.1. Comput.Phys.Commun., 178:852-867, 2008.

[49] Torbjorn Sjostrand, Stephen Mrenna, and Peter Z. Skands. PYTHIA 6.4 Physics and Manual. JHEP, 0605:026, 2006.

[50] Gregory Soyez, Gavin P. Salam, Jihun Kim, Souvik Dutta, and Matteo Cacciari. Pileup subtraction for jet shapes. Phys.Rev.Lett., 110(16):162001, 2013.

[51] George F. Sterman and Steven Weinberg. Jets from Quantum Chromodynamics. Phys.Rev.Lett., 39:1436, 1977.

[52] Iain W. Stewart, Frank J. Tackmann, and Wouter J. Waalewijn. N-Jettiness: An Inclusive Event Shape to Veto Jets. Phys.Rev.Lett., 105:092002, 2010.

[53] Jesse Thaler and Ken Van Tilburg. Identifying Boosted Objects with N-subjettiness. JHEP, 1103:015, 2011.

[54] Jesse Thaler and Ken Van Tilburg. Maximizing Boosted Top Identification by Minimizing N-subjettiness. JHEP, 1202:093, 2012.

[55] Jesse Thaler and Lian-Tao Wang. Strategies to Identify Boosted Tops. JHEP, 0807:092, 2008.

[56] M. Wobisch. Measurement and QCD analysis of jet cross-sections in deep inelastic positron proton collisions at √s = 300 GeV. DESY-THESIS-2000-049, 2000.

[57] M. Wobisch and T. Wengler. Hadronization corrections to jet cross-sections in deep inelastic scattering. 1998.