Insights Into Cosmological Structure Formation with Machine Learning
Total Page:16
File Type:pdf, Size:1020Kb
Insights into cosmological structure formation with machine learning Luisa Lucie-Smith Submitted for the degree of Doctor of Philosophy Department of Physics and Astronomy University College London Supervisors: Examiners: Prof. Hiranya V. Peiris Prof. Benjamin Wandelt Prof. Andrew Pontzen Prof. Ofer Lahav 31st of October, 2019 I, Luisa Lucie-Smith, confirm that the work presented in this thesis is my own. Where information has been derived from other sources, I confirm that this has been indicated in the thesis. 1 Abstract Our modern understanding of cosmological structure formation posits that small matter density fluctuations present in the early Universe, as traced by the cosmic microwave background, grow via gravitational instability to form extended haloes of dark matter. A theoretical understanding of the structure, evolution and formation of dark matter haloes is an essential step towards unravelling the intricate connection between halo and galaxy formation, needed to test our cosmological model against data from upcoming galaxy surveys. Physical understanding of the process of dark matter halo formation is made difficult by the highly non-linear nature of the haloes’ evolution. I describe a new approach to gain physical insight into cosmological structure formation based on machine learning. This approach combines the ability of machine learning algorithms to learn non-linear relationships, with techniques that enable us to physically interpret the learnt mapping. I describe applications of the method, with the aim of investigating which aspects of the early universe density field impact the later formation of dark matter haloes. First I present a case where the process of halo formation is turned into a binary classification problem; the algorithm predicts whether or not dark matter ‘particles’ in the initial conditions of a simulation will collapse into haloes of a given mass range. Second, I present its generalization to regression, where the algorithm infers the final mass of the halo to which each particle will later belong. I show that the initial tidal shear does not play a significant role compared to the initial density field in establishing final halo masses. Finally, I demonstrate that extending the framework to deep learning algorithms such as convolutional neural networks allows us to explore connections between the early universe and late time haloes beyond those studied by existing analytic approximations of halo collapse. 2 Impact Statement This thesis presents a new approach based on machine learning, aimed at deepening our under- standing of the formation of dark matter haloes in the Universe. Our goal is to understand what information is learnt by the machine learning algorithm about the underlying connection between the early universe and the late-time dark matter haloes in cosmological simulations; this differs from common approaches where machine learning is utilized as a black-box tool to obtain fast and automated mappings. Our method led to a re-interpretation of the existing understanding of halo formation over the last decades, in particular in relation to the role of the tidal shear tensor in establishing the final mass of dark matter haloes (Chapters3 & 4). This work achieved academic impact through two scientific publications, cited by independent researchers around the world, and over ten professional presentations, including invited talks, at international conferences for broad and specialized audiences from the cosmology and machine learning communities. Thanks to the broad applicability of our method, there has been an upsurge of interest in the community to apply our method to other aspects of dark matter haloes, leading to new international collaborations for the PhD candidate. The advances in machine learning presented in this thesis can be directly applied to industrial and commercial applications of artificial intelligence (AI). One of the key problems faced in the AI community is the issue of interpretability; without a deeper understanding of how deep learning algorithms make their predictions, we cannot trust AI tools in scientific, industrial and commercial applications. Our methods are specifically designed to turn black-box algorithms into interpretable ones, allowing one to better understand the complex systems these algorithms describe. This is particularly relevant for convolutional neural networks (Chapter5), which are used extensively in industry. In addition to being interpretable, our deep learning architecture allows for broader 3 applicability to three-dimensional data sets than conventional architectures applied to 2D images. Finally, this thesis points toward a new branch of machine learning research known as knowledge extraction (Chapter6), where deep learning algorithms are constructed in a manner that allows for the discovery of fundamental properties of the underlying data sets. This work has initiated an international collaboration of experts in the fields of brain sciences, computational psychiatry, crime sciences and physics, as well as Google DeepMind. We expect this collaboration to expedite progress in the ability of humans to extract knowledge from machine learning algorithms. 4 Acknowledgements First, I owe my deepest gratitude to my amazing supervisors, Hiranya Peiris and Andrew Pontzen, for investing an infinite amount of time in making me a better researcher. Thank you for always being there for me, for being extremely attentive mentors, for the endless support and understanding at difficult times and for sharing with me your scientific expertise and insights. Working with you and being your student has been a privilege. Thank you to all the other brilliant cosmologists I was lucky enough to collaborate with – Nina Roth, Michelle Lochner, Brian Nord, Risa Wechsler and Susmita Adhikari – for the all the help in the fun projects carried out together. I am extremely grateful to Edd Edmondson and John Deacon for solving all my IT problems in no time. Special thanks also go to Corentin Cadou, Martin Rey and Sam Witte for proof reading parts of this thesis. Thank you to all the friends I have made at UCL for the laughs, pub trips and occasional trips to Borderline. Special thoughts go to Felix Priestley for listening to extensive monologues starting with ‘about me now’, Arthur Loureiro for keeping me up to date on Brazilian politics, Arianna Sorba for always having an answer to my questions, Niall Jeffrey for that magnum I still owe him, Martin Rey for our morning coffees and paper discussions, Keir Rogers for the wisest advices on surviving a PhD, Michelle Lochner for buying all the rounds at the pub, Andreu Font-Ribera for rigorously drinking only half pints at the pub, Roger Wesson for still not accepting that I won, Jonathan Braden for laughing at the words baryons and stars, Davide Gualdi for excellent coffees in his office, Will Hartley for always reminding us to never work on photo-z and Sam Witte, for introducing me to my newest big loves, Jim Croce and pad thai. Thank you to all the very good friends I made during my time at Fermilab, especially for the numerous pitchers of Aperol spritz drank at the village pub. Special thanks go to Antonella Palmese for making me feel at home and Federico Speranza for our shared passion for country music and for grating parmigiano on my pasta. Thanks to all the Italian 5 songwriters that kept me company during long hours of work, Francesco De Gregori, Lucio Battisti, Brunori Sas and Tiromancino, and to Pyotr Ilyich Tchaikovsky, for the most productive hours of thesis writing. A big thank you and all my love goes to my family. Thanks to my dad, for teaching me special relativity when I was way too young to understand it, my brother, for always being there for all of us, and my sister for the “sclero times” together. Thank you to Martina for being my best friend. My deepest love goes to my mum, Antonella, for being the strongest and most inspiring woman I know on this planet and for all the love and support she has given me. Thank you for your efforts in trying to learn about my research, although I know you still think I work on black holes. I dedicate this thesis to you, cocca. 6 Contents 1 Introduction 19 1.1 The Universe....................................... 19 1.1.1 General relativity................................. 20 1.1.2 The energy content................................ 23 1.1.3 Expansion history................................. 24 1.2 Seeds of cosmic structures................................ 26 1.2.1 The cosmic microwave background....................... 27 1.2.2 Inflation...................................... 29 1.3 Cosmological structure formation............................ 32 1.3.1 Linear growth................................... 33 1.3.2 Non-linear growth................................ 38 1.3.3 Spherical collapse model............................. 38 1.3.4 Press-Schechter Formalism............................ 40 1.4 Numerical simulations.................................. 43 1.4.1 The GADGET code................................. 46 1.4.2 Initial conditions................................. 47 1.4.3 Finding dark matter haloes........................... 48 1.5 Outline of the thesis................................... 49 2 Method 51 2.1 Machine learning algorithms............................... 51 2.1.1 Decision trees................................... 54 7 2.1.2 Ensembles of decision trees........................... 56 2.1.3 Feature importances............................... 60 2.2 Deep learning algorithms...............................