
Bb4l generator, interferences and off-shell effects

Leo Peyruchat
University of Strasbourg
Supervised by Jeremy Andrea, CMS, CERN
(Dated: 9 August 2017)

Proton-proton collisions at the LHC produce large amounts of data. To understand the physics underlying these events, the real data must be compared to simulated events. A new generator, called the bb4l model, simulates LHC collisions with interesting new features for processes that create two W bosons and two b quarks. One of them is that it takes interferences between different processes into account. Such effects have always been neglected in top pair and single top production, but with the increasing sensitivity of the detectors it is becoming important to know their size precisely. The goal of this study is to separate events generated with bb4l into different categories, and then to compare many variables between these categories.

STANDARD MODEL

The Standard Model (SM) of particle physics is the theoretical framework that describes the behaviour of elementary particles and their interactions (except gravity) [1]. Particles are divided into different subsets. First, one distinguishes fermions (half-integer spin), which are the building blocks of matter, from bosons (integer spin), which carry the interactions (except for the Higgs boson).

Among fermions, one distinguishes quarks from leptons. The former carry a colour charge and therefore interact via the strong interaction, while the latter do not. Each group has six particles and six antiparticles, divided into three generations. Each quark generation has two members, with electric charge +2/3 (e.g. the up quark) and -1/3 (e.g. the down quark). A lepton generation consists of an electrically charged particle (e.g. the electron) and its associated neutral particle, the neutrino.

The different gauge bosons have distinct properties that specify how the associated interaction works. Photons mediate the electromagnetic force between electrically charged particles, via processes described by quantum electrodynamics (QED). The W± and Z bosons carry the weak force between all fermions, via charged and neutral currents respectively. These bosons are combined with the photon in the electroweak (EW) interaction. The strong force is mediated by eight gluons and is described by quantum chromodynamics (QCD).

With the help of quantum field theory, and by imposing a local gauge invariance, one can build the SM Lagrangian that describes all three fundamental interactions [2]. On its own, this construction predicts only massless particles, which contradicts experimental observations. The Higgs mechanism must be introduced to give mass to the particles. The associated particle is the Higgs boson, which has a unique role in the SM. It is the only elementary scalar particle (spin 0), and because it gives the other particles their mass it can be viewed as a cornerstone of the SM.

But this theory is not perfect. It has some flaws which indicate that it is only a low-energy effective theory of a more general theory. This is one of the main motivations for current research in particle physics. Because of its properties, the top quark is a good tool to probe new physics.

TOP QUARK PHYSICS

With its heavy mass ($\approx 173$ GeV) and its short lifetime ($\approx 10^{-25}$ s), the top quark has a peculiar role in the SM [3].

Because the strong force increases with distance, quarks are always found in bound states (a property called confinement). So when high-energy quarks are created and pulled apart, they create many composite particles (called hadrons). This process is called hadronization, and is at the origin of the tight cones of particles (called jets) created in high-energy collisions. But the time required to hadronize ($\approx 10^{-24}$ s) is longer than the lifetime of the top quark. So top quarks, unlike other quarks, decay before hadronizing. They can therefore be seen as 'bare quarks', which allows us to measure their mass precisely and to study their properties directly from their decay products.

A single top quark decays into a W boson and a b quark in almost all cases. The b quark forms a jet, and the W decays either into a charged lepton and a neutrino, or into a quark-antiquark pair. For a top pair ('double top') process, the final state therefore consists of two b quarks and either two lepton-neutrino pairs ('dileptons'), two quark pairs ('all jets'), or one of each ('lepton+jets').

LHC AND CMS

The LHC, which first ran in 2008, is a hadron accelerator that can reach a centre-of-mass energy as high as 13 TeV [4]. It uses radiofrequency cavities to accelerate particles, and superconducting electromagnets to steer and focus the beam. The beam consists of many bunches of protons (about $10^{11}$ protons each). The current record is 2556 simultaneous bunches, with a 25 ns separation between consecutive bunches.

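As a rough cross-check of the bunch numbers quoted above, the sketch below estimates how many 25 ns slots fit in one turn of the ring. The ring circumference (26 659 m) is not given in the text and is an assumed value, the protons are taken to travel at the speed of light, and the constant names are purely illustrative.

```python
# Rough estimate of how many 25 ns bunch slots fit in one LHC turn.
# ASSUMPTION: ring circumference of 26 659 m (not quoted in the text).
RING_CIRCUMFERENCE_M = 26_659.0
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0  # protons treated as ultra-relativistic
BUNCH_SPACING_S = 25e-9                 # 25 ns separation, from the text
FILLED_BUNCHES = 2556                   # record number of bunches, from the text

# Time needed by a bunch to go once around the ring.
revolution_period_s = RING_CIRCUMFERENCE_M / SPEED_OF_LIGHT_M_PER_S

# Number of 25 ns slots available during one revolution.
available_slots = round(revolution_period_s / BUNCH_SPACING_S)

# Fraction of those slots that actually hold a bunch.
filled_fraction = FILLED_BUNCHES / available_slots

print(f"revolution period: {revolution_period_s * 1e6:.1f} microseconds")
print(f"25 ns slots per turn: {available_slots}")
print(f"fraction of slots filled: {filled_fraction:.2f}")
```

Under these assumptions, 2556 bunches fill only part of the available slots: the 25 ns figure is the spacing between neighbouring bunches, not the result of spreading 2556 bunches evenly around the ring.
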
Four main detectors are placed along the ring, at the points where the bunches are brought together to collide. One of them is CMS, the Compact Muon Solenoid. It is a large cylinder (29 x 15 meters) made of several layers of detectors. The innermost part is a silicon tracker [5], placed in a strong magnetic field that bends the trajectories of charged particles. This makes it possible to reconstruct the trajectory and the momentum of the particles. Then come two calorimeters, which measure the energy of the particles: the first one for electrons and photons, the second one for hadrons. The outermost part is the muon system. Muons need a dedicated detector because they are the only charged particles of the SM that reach the outermost part of the detector.

Let us define two widely used quantities. The first one is the transverse momentum (pt), which is the projection of the momentum of a particle onto the plane transverse to the beam. The second one is the pseudorapidity (η), defined as η = −ln(tan(θ/2)), with θ the angle between the momentum of the particle and the direction of the beam. A small pseudorapidity therefore corresponds to a particle emitted close to the transverse plane, whereas a large pseudorapidity (in absolute value) corresponds to a particle travelling close to the beam axis.

Raw data from the detectors need to be carefully processed to determine which particles were created in a collision. Among the many reconstruction algorithms in use, one is characteristic of CMS: particle flow. It combines information from the different parts of the detector, which improves the identification of the different types of particles.

But experimental data alone are not enough to understand what happens in the collisions. One needs to compare them to numerical simulations.

MONTE CARLO EVENT GENERATION

The purpose of simulating events is to compare them to the experimental data. If the two agree, then the model used for the simulation describes well what is happening. If not, there might be issues in the simulation or in the detector, but it could also be a hint of new physics.

To simulate a certain process, the first step is to compute the associated matrix element [6], which is directly related to the probability of the process occurring. Feynman diagrams are used to represent interactions and make it possible to compute the associated matrix element. Time runs from left to right and space from bottom to top. The points where lines meet correspond to interactions and are called vertices. Particles are drawn with an arrow pointing forward in time, whereas antiparticles are drawn with an arrow pointing backward in time. The example in figure 1 shows the annihilation of a quark-antiquark pair into a photon, which then converts into an electron and a positron.

Figure 1: Feynman diagram of a quark-antiquark pair annihilation leading to the creation of an electron-positron pair.

A basic diagram like this represents what is called a 'hard' process, and describes particles with high momentum. Hard processes are calculated with exact fixed-order perturbation theory. But corrections are needed to describe the experimental collisions well.

Any particle with a colour charge (quark or gluon) is called a parton, and can radiate virtual gluons. These gluons can themselves split into a quark-antiquark pair or emit another gluon. This effect is particularly important at hadron colliders, because there are always several partons in the initial state. All these possible contributions are computed with a procedure called 'parton shower', which uses approximate perturbation theory. All the quarks and gluons produced then hadronize because of QCD confinement, creating additional jets.

Another particular aspect of hadron collisions is that the initial state is not well known. Because of vacuum fluctuations, a proton of the beam, which nominally contains two up quarks and one down quark, can also contain virtual partons. These may take part in the collision. The probability to find a given parton in the proton is described by the parton density function, which depends on the fraction of the proton momentum carried by the parton and on the energy scale of the collision. So the probability to reach a certain final state is the sum, over all possible initial states, of the probability to find that initial state in the protons times the probability of the process itself.

Now one has to generate many events according to the previously calculated probabilities, using pseudo-random numbers to reproduce the fluctuations inherent to quantum mechanics. Programs of this kind are called Monte Carlo (MC) generators; they compute the integral over the phase space associated with a given process. If the number of generated events is sufficient, the probability to find a certain configuration in the simulation should correspond to the probability of finding it in the detector. But detectors suffer from resolution effects that need to be taken into account.

The final step is to simulate the response of the detector. For example, a particle with too low an energy may be confused with intrinsic electronic noise. Another example is that the detector, because of its geometry, only detects particles within a certain angular range. After taking many effects like these into account, one finally gets simulated data that should reflect reality. Let us now look at a particular event generator, the bb4l model.

THE BB4L MODEL

A widely used MC generator is POWHEG. The bb4l model [7] is a new version of POWHEG, which is being implemented to simulate, with greater accuracy than before, the interaction of two protons producing two b quarks and four leptons: $pp \to \ell^+ \nu_\ell \, \ell^- \bar{\nu}_\ell \, b \bar{b}$. The name bb4l describes the final state, which can be reached through several processes. Three examples are shown in figure 2.

Figure 2: Different Feynman diagrams corresponding to bb4l events: (a) double top process, (b) single top process, (c) Z boson process.

Because these interactions are governed by quantum mechanics, the different processes interfere. This means that the total probability of finding a bb4l final state differs from the sum of the probabilities of the individual processes. Until now, interference effects between single top and top pair processes were expected to be small. But the experimental precision keeps improving, thanks to both detector upgrades and the increasing amount of data [8]. The interferences might therefore become non-negligible in the future, and it is becoming necessary to determine their size. But given the way the generator is built, the intermediate state is lost: in every event there are two top quarks, even though the event might have come from a single top or a Z boson process. So there is no direct way to know whether an event involved a single or a double top process. My goal was therefore to find an indirect way to retrieve the interference information.

If a particle satisfies Einstein's energy-momentum relation $E^2 = m^2 c^4 + p^2 c^2$, it is called 'on-shell'. But in quantum field theory, virtual particles can be created. They do not satisfy this relation and are called 'off-shell'. So instead of distinguishing processes by the number of top quarks involved, I looked at the number and mass of off-shell top quarks. The mass of an off-shell particle can be seen as an effective mass that describes its behaviour.

EVENTS ANALYSIS

We used the bb4l generator to create 1.15 million events on one of CERN's computing clusters (LXPLUS). These data are at particle level, which means that the detector response was not simulated. I then used the CMS software collection, CMSSW, to analyze the data. It relies primarily on ROOT (a scientific framework written in C++) and Python.

The first step was to iterate over all particles in every event, to find the top quarks just before they decay into a W boson and a b quark. I then looked at the invariant mass of the top quarks, which is defined as the mass in the rest frame of the particle. From now on I adopt natural units (c = 1), which allows invariant masses to be written in GeV.

I plotted the invariant mass of all top quarks in the histogram shown in figure 3. I then fitted the central part (red curve), between 130 and 230 GeV, with a Breit-Wigner function (dark line), which corresponds to the expected distribution for an on-shell top. For masses below and above the central part, the number of top quarks is about twice as large as predicted by the fit. These top quarks are in fact mostly off-shell, and about 4% of the events contain one off-shell top; they are shown by the blue curve. I will now distinguish off-shell tops with a mass below the central part (called 'off-shell inf') from off-shell tops with a mass above the central part (called 'off-shell sup'). Sometimes both tops are off-shell (in about 0.05% of the cases); these events are shown in yellow. There is also a clear drop in the number of top quarks with a mass below 80 GeV, a value that corresponds to the mass of the W boson. After separating the events into these categories, I compared many variables to search for differences.

First I looked at the transverse momentum (pt) and the pseudorapidity (η) of the top quarks and of their decay products (W boson, charged lepton, neutrino and b quark). All of them show differences in their kinematic distributions, but let us focus on the b quarks, which show the largest differences. The pt distribution, shown in figure 4a, peaks much lower for b quarks coming from off-shell inf and double off-shell events.
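The quantities used in this analysis can be sketched in a few lines of Python. The snippet below is only an illustration, not the CMSSW code used for the study: it computes the invariant mass, pt and η from a four-momentum (E, px, py, pz) in natural units, and labels an event according to the masses of its top quarks. All function names are hypothetical, and using the 130-230 GeV fit range as the on-shell window is an assumption.

```python
import math

# ASSUMPTION: the on-shell window is taken to be the 130-230 GeV fit range.
ON_SHELL_MIN_GEV = 130.0
ON_SHELL_MAX_GEV = 230.0


def invariant_mass(energy, px, py, pz):
    """Invariant mass in natural units (c = 1): m^2 = E^2 - |p|^2."""
    mass_squared = energy**2 - (px**2 + py**2 + pz**2)
    # Clamp tiny negative values caused by floating-point rounding.
    return math.sqrt(max(mass_squared, 0.0))


def transverse_momentum(px, py):
    """Momentum component in the plane transverse to the beam (z) axis."""
    return math.hypot(px, py)


def pseudorapidity(px, py, pz):
    """eta = -ln(tan(theta / 2)), with theta measured from the beam axis.

    Not meaningful for a momentum exactly along the beam axis.
    """
    theta = math.atan2(math.hypot(px, py), pz)
    return -math.log(math.tan(theta / 2.0))


def classify_top(mass_gev):
    """Label one top quark from its invariant mass."""
    if mass_gev < ON_SHELL_MIN_GEV:
        return "off-shell inf"
    if mass_gev > ON_SHELL_MAX_GEV:
        return "off-shell sup"
    return "on-shell"


def classify_event(top_masses_gev):
    """Label an event from the invariant masses of its top quarks."""
    off_shell = [
        label
        for label in map(classify_top, top_masses_gev)
        if label != "on-shell"
    ]
    if not off_shell:
        return "on-shell"
    if len(off_shell) == 1:
        return off_shell[0]
    return "two off-shell"


if __name__ == "__main__":
    # Hypothetical top-quark four-momenta (E, px, py, pz) in GeV.
    tops = [(200.0, 60.0, 0.0, 80.0), (150.0, -60.0, 0.0, 70.0)]
    masses = [invariant_mass(*momentum) for momentum in tops]
    print(masses, classify_event(masses))
```

The same helper functions can be applied to the decay products (for instance the b quarks) to fill one pt or η histogram per event category.
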

Figure 3: Histogram of the invariant mass of the top quarks (in GeV). The central part of the peak was fitted with a Breit-Wigner function. On-shell tops are shown in red, single off-shell tops in blue, and double off-shell tops in yellow.

The low pt peak for b quarks coming from off-shell inf and double off-shell events is expected if the off-shell top has a low mass. We can also see that b quarks coming from off-shell sup tops have a distribution that looks like a combination of the off-shell inf and on-shell behaviours. The pseudorapidity distribution is wider for off-shell events than for on-shell events. This means that b quarks coming from off-shell tops tend to be emitted more towards the forward region (large |η|) than b quarks coming from on-shell tops.

I also looked at many other variables, such as the number of jets or the angle between leptons and jets. They all show differences between on-shell and off-shell top events. So I have the distributions of about 15 variables that can be used to determine how many off-shell top quarks there are in a given event. To do so, one has to combine several 'cuts', each of which consists in discarding an event if a certain variable is above or below a certain value.

Figure 4: Normalized histograms of (a) the b quark pt in GeV and (b) the b quark pseudorapidity, separated between events with two on-shell tops (red), one off-shell sup top (dark blue), one off-shell inf top (light blue) and two off-shell tops (yellow).

CONCLUSION

The goal of this project was to study a new MC generator, named bb4l, which simulates $pp \to \ell^+ \nu_\ell \, \ell^- \bar{\nu}_\ell \, b \bar{b}$ interactions with greater accuracy than before. One of its interesting new features is that it takes interferences between processes into account. But the intermediate state of the events is lost in the generation, so one cannot directly deduce the effect of the interferences.

So I used the invariant mass of the top quarks to divide them into different categories. Top quarks with a mass around the mean value ($\approx 172.5$ GeV) are flagged as on-shell, whereas the others are flagged as off-shell inf or off-shell sup, depending on whether their mass is smaller or larger than that of on-shell tops. I then looked at many variables, such as the pt and η of the particles involved in the process, and compared them between categories.

But the study is not finished yet. The next step would be to use a machine learning algorithm. Such an algorithm has to be trained on simulations (in our case produced with the bb4l generator). The more training data it receives, the better it generally performs. An example is the boosted decision tree (BDT), commonly used in high-energy physics analyses. Based on the data it is trained on, the BDT effectively determines which cuts to apply to decide whether a top quark is on-shell or off-shell. Such an algorithm could then be run on experimental data, and would tell us to which category a top quark belongs without knowing its mass.

But to do so, one needs to simulate the resolution effects of the CMS detector, because the BDT must be trained on simulated data that are as close as possible to real data. This would also allow us to compare bb4l events directly with real data, to check the validity of the generator.

Finally, the way to show whether interferences are important would be to compare the results obtained with bb4l to those of previous generators, which computed the probabilities of single and double top events separately, without taking interferences into account. The difference between the two should then be the effect of the interference.

Thanks to Jeremy Andrea for his supervision and for everything I learned, and thanks to CERN for giving me the opportunity to do such a great project.

[1] Michael Kramer. Summer student lecture, of the standard model. 2017.
[2] Michael Buttignol. "Search for monotops in the leptonic channel at the LHC". PhD thesis, University of Strasbourg, 2016.
[3] Gautier Hamel de Monchenault. Summer student lecture, standard model physics at hadron colliders. Volume 4, 2017.
[4] Verena Kain. Summer student lecture, accelerators. 2017.
[5] Isabelle Wingerter. Summer student lecture, detectors. 2017.
[6] Bryan Webber. Summer student lecture, introduction to Monte Carlo techniques. 2017.
[7] Tomas Jezo, Jonas M. Lindert, Paolo Nason, Carlo Oleari and Stefano Pozzorini. An NLO+PS generator for $t\bar{t}$ and $Wt$ production and decay including non-resonant and interference effects. 2016.
[8] CMS Collaboration. Measurement of the $t\bar{t}$ production cross section in the $e\mu$ channel in proton-proton collisions at $\sqrt{s} = 7$ and 8 TeV. 2016.