Large-Scale Data Analysis for Higgs Boson Mass Reconstruction in Tth Production

Large-Scale Data Analysis for Higgs Boson Mass Reconstruction in Tth Production

Master’s thesis Large-Scale Data Analysis for Higgs Boson Mass Reconstruction in ttH Production Bc. Petr Urban CERN-THESIS-2019-195 03/06/2019 Department of Theoretical Computer Science Supervisor: doc. Dr. Andr´e Sopczak May 7, 2019 Acknowledgements I would like to thank my supervisor doc. Dr. Andr´e Sopczak. He was willing to help me whenever I had a question or was stuck during my research. He also contributed by important remarks and recommending quality sources. Declaration I hereby declare that the presented thesis is my own work and that I have cited all sources of information in accordance with the Guideline for adhering to ethical principles when elaborating an academic final thesis. I acknowledge that my thesis is subject to the rights and obligations stip- ulated by the Act No. 121/2000 Coll., the Copyright Act, as amended, in particular that the Czech Technical University in Prague has the right to con- clude a license agreement on the utilization of this thesis as school work under the provisions of Article 60(1) of the Act. In Prague on May 7, 2019 ..................... Czech Technical University in Prague Faculty of Information Technology !c 2019 Petr Urban. All rights reserved. This thesis is school work as defined by Copyright Act of the Czech Republic. It has been submitted at Czech Technical University in Prague, Faculty of Information Technology. The thesis is protected by the Copyright Act and its usage without author’s permission is prohibited (with exceptions defined by the Copyright Act). Citation of this thesis Urban, Petr. Large-Scale Data Analysis for Higgs Boson Mass Reconstruction in ttH Production. Master’s thesis. Czech Technical University in Prague, Faculty of Information Technology, 2019. Abstrakt Tato pr´ace se zabyv´ ´a probl´emem rekonstrukce klidov´e hmotnosti ˇc´astice zn´am´e jako Higgs˚uv boson pomoc´ı technik strojov´eho uˇcen´ı – neuronovy´ch s´ıt´ı. Je zamˇeˇrena na rozpadovy´ kan´al 2ℓSS +1τhad. V prvn´ı ˇc´asti je vysvˇetlen probl´em rekonstrukce klidov´e hmotnosti a pops´any nˇekter´e pˇr´ıstupy k jeho ˇreˇsen´ı. Dalˇs´ı ˇc´ast popisuje f´azi rekonstrukce klidov´e hmotnosti vˇsech ˇc´astic pomoc´ı exaktn´ıch vy´poˇct˚u a algoritm˚u. Pot´e jsou pouˇzita data z´ıskan´a bˇehem rekonstrukce ostatn´ıch ˇc´astic ttH¯ syst´emu. Na r˚uznˇe velky´ch datasetech, je natr´enov´ano a otestov´ano nˇekolik neuronovy´ s´ıt´ı pro predikci/odhad klidov´e hmotnosti Higgsova bosonu nau ´rovni simulace. Nakonec jsou pˇripraveny neuronov´e s´ıtˇe pro odhad klidov´e hmotnosti nau ´rovni detektoru a pro rozliˇsen´ı sign´alu a pozad´ı. Kl´ıˇcov´aslova ATLAS, CERN, Higgs˚uv boson, klidov´a hmotnost, neuronov´e s´ıtˇe vii Abstract This thesis deals with the problem of reconstruction of the invariant mass of the Higgs boson using machine learning techniques – neural networks. It focuses on the 2ℓSS +1τhad decay channel. In the first part, the problem of mass reconstruction is explained and some used approaches to the problem are described. The next part describes the phase of reconstructing the invariant mass of all the particles using exact formulas and algorithms. Then, the data obtained during reconstruction of the other particles of the ttH¯ system are used. Several neural networks are trained and tested on dif- ferent datasets to predict/estimate the invariant mass of the Higgs boson on truth level. Finally, neural networks for estimating the invariant mass of the Higgs bo- son on detector level and distinguishing signal events from background are prepared. Keywords ATLAS, CERN, Higgs boson, invariant mass, neural networks viii Contents Citation of this thesis ....................... vi Introduction 1 1 Searching for the Higgs boson 3 1.1 Standard model........................... 3 1.2 CERN ................................ 4 1.3 ATLAS Detector .......................... 5 1.3.1 Inner Detector ....................... 5 1.3.2 Calorimeter ......................... 5 1.3.3 Muon Spectrometer .................... 6 1.3.4 Magnets ........................... 6 1.4 Decay channels ........................... 6 2 Artificial neural networks 9 2.1 Origin ................................ 9 2.2 Formal definitions ......................... 10 2.2.1 Neuron ........................... 10 2.2.2 Neural network ....................... 11 2.2.3 Types of networks ..................... 17 3 Related research 21 3.1 Invariant mass ........................... 21 3.2 Other particles ........................... 22 3.3 Other channels ........................... 22 3.4 2ℓSS +1τhad channel........................ 22 3.5 Summary – feature selection .................... 23 4 Implementation 25 4.1 Software ............................... 25 4.1.1 ROOT ............................ 25 ix 4.1.2 Python ........................... 26 4.2 Data ................................. 27 4.3 Truth level ............................. 27 4.3.1 Particle codes ........................ 27 4.3.2 Algorithm .......................... 28 4.3.3 Dataset features ...................... 31 4.4 Detector level............................ 32 4.4.1 Algorithm .......................... 32 4.4.2 Dataset features ...................... 33 4.5 Reconstruction of the invariant mass ............... 34 4.5.1 Exact formula ........................ 34 4.5.2 Neural networks ...................... 38 4.6 Further research .......................... 55 Conclusion 57 Bibliography 59 AAcronyms 63 B Contents of enclosed CD 65 x List of Figures 1.1 Standard model of particle physics .................. 4 1.2 CERN complex ............................. 5 1.3 ATLAS scheme ............................. 6 1.4 Decay channels ............................. 7 2.1 Comparison of biological and artificial neuron ............ 10 2.2 Identity/Linear activation function .................. 11 2.3 ReLU activation function ....................... 12 2.4 Softplus activation function ...................... 12 2.5 Sigmoid activation function ...................... 13 2.6 Neuron scheme ............................. 13 2.7 Backpropagation of errors through the network............ 16 2.8 Perceptron possibilities ........................ 17 2.9 Multilayer perceptron ......................... 18 2.10 Convolutional NN – example ..................... 19 3.1 ttH¯ system decay – 2ℓSS +1τhad channel.............. 23 4.1 ttH¯ system decay – 2ℓSS +1τhad channel, with PDG codes .... 29 4.2 Histogram of W boson (from top quark) invariant mass – truth level 35 4.3 Histogram of top quark invariant mass – truth level . 35 4.4 Histogram of antitop quark invariant mass – truth level . 36 4.5 Histogram of Higgs boson invariant mass – truth level . 36 4.6 Dependence of percentage error on shuffling/splitting the dataset . 40 4.7 Slower convergence with larger batchsize ............... 41 4.8 The best configuration ......................... 42 4.9 Feature importance – full truth dataset ............... 43 4.10 Higgs invariant mass – full truth dataset ............... 44 4.11 Feature importance – visible part of the truth dataset ....... 45 4.12 Higgs invariant mass – visible part of the truth dataset ...... 46 4.13 Feature importance – detector available part of the truth dataset . 47 xi 4.14 Higgs invariant mass – detector available part of the truth dataset 48 4.15 Feature importance – full truth dataset (Higgs branch) ....... 49 4.16 Higgs invariant mass – full truth dataset (Higgs branch) ...... 50 4.17 Feature importance – visible part of the truth dataset (Higgs branch) 51 4.18 Higgs invariant mass – visible part of the truth dataset (Higgs branch) 52 4.19 Feature importance – detector available part of the truth dataset (Higgs branch) ............................. 53 4.20 Higgs invariant mass – detector available part of the truth dataset (Higgs branch) ............................. 54 xii List of Tables 2.1 Comparison of a biological and an artificial neuron ......... 9 2.2 Comparison of a biological and an artificial network ........ 10 2.3 Activation functions .......................... 14 4.1 Selected hyper-parameters ....................... 39 4.2 Summary of the truth level results .................. 55 xiii Introduction Apart from the basic elementary particles forming atoms, the Higgs boson is probably the best known particle by the public. It is a part of the Standard Model – a theory describing elementary particles and various forces between them that make up matter. And the Higgs boson is the particle that gives hope to many physicists to extend the model and make it more precise. Although the existence of a new particle was predicted in 1960s it took more than 50 years to confirm the hypothesis, because the observation of the Higgs boson is complicated. One reason for this long waiting is, that production of the Higgs boson, requires high energy proton-proton collisions and the Large Hadron Collider, where such collisions can be achieved, was put in the operation relatively recently (2008). However, despite all these difficulties, in 2012 two experiments (ATLAS and CMS) at LHC recorded in that time unknown particle which was later identified as the Higgs boson. Since then many physicists have been focused on its research and believe that its mass is around 125 GeV/c2. This goal of this work is to train a neural network that will be able to reconstruct the invariant mass of the Higgs boson from the final products of its decay and distinguish signal events from background data, which should enable physicists more detailed and precise studies of the correct collisions which might help to get more information about the properties

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    79 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us