<<

Deep Learning in Radio Astronomy

Dissertation

submitted in fulfilment of the requirements for the doctoral degree at the Faculty of Mathematics, Informatics and Natural Sciences, Department of Physics, Universität Hamburg

submitted by Vesna Lukic

Hamburg, winter semester 2019


Dissertation reviewers: Prof. Dr. Marcus Brüggen

Members of the examination commission: Prof. Dr. Marcus Brüggen, Prof. Dr. Gregor Kasieczka, Prof. Dr. Francesco De Gasperin, Prof. Dr. Jochen Liske, Prof. Dr. Robi Banerjee

Chair of the examination commission: Prof. Dr. Jochen Liske

Date of the disputation: 26.11.2019

Chair of the Subject Doctoral Committee PHYSICS: Prof. Dr. Günter H. W. Sigl

Head of the Department of PHYSICS: Prof. Dr. Michael Potthoff

Dean of the MIN Faculty: Prof. Dr. Heinrich Graener

Zusammenfassung (Summary)

Current and forthcoming radio surveys continue to deliver new insights into the formation and evolution of galaxies, our cosmological model and its parameters. The present thesis summarises our work on deep learning techniques for radio astronomy. The volume of data produced by radio surveys is enormous and constantly growing owing to technological improvements. This leads to an increasing demand for the development of more sophisticated tools to analyse the data, as manual analyses are no longer feasible. Machine learning techniques have been developed to simplify this process, relying on the premise that they can be used to identify patterns and features in data. The focus of the present thesis is the analysis of image-based radio data using deep learning techniques, an approach that has proven successful on high-dimensional data. Using radio galaxy images from the Radio Galaxy Zoo project, we demonstrate that it is possible to classify between compact sources and three classes of extended sources, and we then test our trained network on Data Release 1. We compare the performance of traditional deep convolutional neural networks with that of capsule networks. The latter are a more recently developed technique in which groups of neurons are used to describe properties of an image, including the relative spatial positions of features. Using images from the LOFAR Two-metre Sky Survey (LoTSS), we show that the traditional convolutional neural networks perform better for the type of radio galaxy data at hand. Finally, we develop a source finder based on a convolutional autoencoder and compare it with a state-of-the-art source finder using simulated Square Kilometre Array data. We find that the performance of the source finders varies depending on exposure time, frequency and signal-to-noise ratio.

This thesis is based on the following publications:

• V. Lukic, M. Brüggen, J. K. Banfield, O. I. Wong, L. Rudnick, R. P. Norris, B. Simmons, 'Radio Galaxy Zoo: Compact and Extended Source Classification with Deep Learning', Mon. Not. R. Astron. Soc. 476, 1, pp. 246-260 (2018)

• V. Lukic, M. Brüggen, B. Mingo, J. H. Croston, G. Kasieczka, P. N. Best, 'Morphological classification of radio galaxies: capsule networks versus convolutional neural networks', Mon. Not. R. Astron. Soc. 487, 2, pp. 1729-1744 (2019)

• V. Lukic, F. De Gasperin, M. Brüggen, 'AutoSource: Radio-astronomical source-finding with convolutional autoencoders', submitted to Galaxies, arXiv:1910.03631 (2019)

Abstract

Current and forthcoming radio surveys continue to provide new insights into the formation and evolution of galaxies, our cosmological model and its parameters. The current thesis summarises our work on deep learning techniques applied to radio astronomy. The volume of data produced by radio surveys is vast and constantly growing due to improvements in technology. This results in an increasing demand to develop more sophisticated tools to analyse the data, as manual analyses will become unfeasible. Machine learning techniques have been developed to facilitate this process, and rely on the premise that they can be trained to recognise patterns and features in data. The focus of the current thesis is analysing image-based radio data using deep learning techniques, an approach which has proven successful on high-dimensional data. Using images from the citizen science project Radio Galaxy Zoo, we show that it is possible to classify between compact sources and three classes of extended sources, and test our trained network on Data Release 1. We compare the performance of traditional deep convolutional neural networks against capsule networks. The latter are a more recently developed technique using groups of neurons that describe properties of an image, including the relative spatial locations of features. Using images from the LOFAR Two-metre Sky Survey (LoTSS), we show that for the type of radio galaxy data at hand, the traditional convolutional neural networks perform better. Finally, we develop a source finder based on a convolutional autoencoder, and compare its performance against a state-of-the-art source finder using simulated Square Kilometre Array data. We find the performance varies between the source finders based on exposure time, frequency and signal-to-noise ratio.

Declaration on Oath (Eidesstattliche Versicherung)

I hereby declare on oath that I have written the present dissertation myself and have not used any aids or sources other than those indicated.

The submitted written version corresponds to the version on the electronic storage medium.

The dissertation has not previously been accepted or assessed as insufficient, in the submitted or a similar form, in an earlier doctoral procedure.

(Vesna Lukic) Brussels, 10.10.2019

Contents

1 Introduction
   1.1 Sources in the Radio spectrum
      1.1.1 Introduction
      1.1.2 Spectral line emission
      1.1.3 Thermal emission
      1.1.4 Non-thermal emission
      1.1.5 Galaxies with significant radio emission
      1.1.6 Supernova remnants
   1.2 Radio astronomical telescopes and surveys
      1.2.1 Main parameters in surveys
      1.2.2 Science goals and purpose of surveys
      1.2.3 Past and present (large-scale) surveys
   1.3 Machine learning and applications to astronomy
      1.3.1 Supervised learning and common techniques
      1.3.2 Unsupervised learning and common techniques
      1.3.3 Machine learning theory
      1.3.4 Machine learning techniques applied to astronomy

2 Compact and Extended Radio Source Classification
   2.1 Introduction
   2.2 Deep neural networks
   2.3 Methods
      2.3.1 Pre-processing
      2.3.2 PyBDSF
      2.3.3 Image augmentation
      2.3.4 Deep learning algorithms
      2.3.5 Selection of sources for two-class classification
      2.3.6 Selection of sources for four-class classification
   2.4 Results for two classes
   2.5 Results for four classes
      2.5.1 Comparing results with Data Release 1 of the Radio Galaxy Zoo
   2.6 Conclusions

3 Convolutional vs Capsule networks
   3.1 Introduction
   3.2 LOFAR HETDEX v1.0 dataset
      3.2.1 Source cutouts
      3.2.2 Classifications
   3.3 Methods
      3.3.1 Pre-processing
      3.3.2 Image augmentation
   3.4 Deep Learning algorithms
      3.4.1 Convolutional Neural Networks
      3.4.2 Capsule networks
      3.4.3 Deep learning parameters
   3.5 Results
      3.5.1 LOFAR original images
      3.5.2 LOFAR original and augmented images
      3.5.3 Sigma-clipped images
      3.5.4 Additional results
   3.6 Conclusions

4 Source-finding with convolutional autoencoders
   4.1 Introduction
      4.1.1 Source-finding at radio frequencies
      4.1.2 Types of radio sources
      4.1.3 Deep learning
      4.1.4 Simulated SKA data
   4.2 Methods
      4.2.1 Convolutional autoencoders
      4.2.2 Pre-processing
      4.2.3 Dataset generation
      4.2.4 Image augmentation
   4.3 Results
      4.3.1 Very low significance source metrics at SNR=1
      4.3.2 Low significance source metrics at SNR=2
      4.3.3 High significance source metrics at SNR=5
      4.3.4 Execution times
   4.4 Discussion and Conclusions
   4.5 Appendix

5 Conclusions and Outlook

6 Bibliography

1 Introduction

The introductory section is composed of three parts: the physics of sources in the radio spectrum, radio surveys, and finally machine learning and its applications in astronomy, with particular emphasis on its use on radio images.

1.1 Sources in the Radio spectrum

1.1.1 Introduction

Radio astronomy is the branch of astronomy concerned with the study of celestial bodies at radio frequencies, which range from 15 MHz to 300 GHz (20 m to 1 mm in wavelength). The low-frequency limit is set by the opacity of the ionosphere, and the high-frequency limit is due to the strong absorption from oxygen and water bands in the lower atmosphere. Observing in the radio enables us to see objects that are otherwise invisible at other wavelengths (Field & Chaisson, 1985).

In the radio regime, there are two types of emission: spectral line emission and continuum emission. Line emission refers to radiation emitted in very narrow (discrete) frequency bands, whereas continuum emission covers a broad range of radio wavelengths. Two types of continuum emission are possible: thermal and non-thermal emission. The distinguishing feature is the shape of the spectrum: thermal emission follows a black-body law, whereas non-thermal emission shows a power-law spectrum.
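To make the distinction concrete, the following minimal Python sketch (illustrative only: the normalisation is arbitrary and the non-thermal spectral index of -0.7 is an assumed, typical value) evaluates the two spectral shapes over 0.1-100 GHz. In the Rayleigh-Jeans regime a thermal source rises as the square of the frequency, whereas optically thin non-thermal emission falls as a power law.

    import numpy as np

    # Illustrative spectral shapes only (arbitrary normalisation).
    # Thermal (Rayleigh-Jeans regime): S ~ nu**2; non-thermal synchrotron:
    # S ~ nu**alpha with an assumed alpha = -0.7 (typical, not universal).
    nu = np.logspace(8, 11, 4)  # 0.1, 1, 10, 100 GHz
    thermal = (nu / 1e9) ** 2.0
    nonthermal = (nu / 1e9) ** -0.7
    for f, s_t, s_nt in zip(nu, thermal, nonthermal):
        print(f"{f / 1e9:7.1f} GHz   thermal: {s_t:10.2f}   non-thermal: {s_nt:8.2f}")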

Figure 1.1 shows an impression of what we could observe if we looked up at the sky and had the ability to see radio emission. The small white spots are radio galaxies, and the larger structures are supernova remnants.

1.1.2 Spectral line emission

In the radio regime, the 21 cm line is the most important line emission.

The 21 cm line was predicted by H. C. van de Hulst in 1945, after being asked by his supervisor Jan Hendrik Oort to identify and determine the frequencies of the different types of spectral lines that might exist at radio wavelengths.

Figure 1.1: An impression of what the sky would look like at radio wavelengths. The small individual white spots are radio galaxies; the larger distortions are due to supernova remnants. Source: National Radio Astronomy Observatory, Associated Universities, Inc., National Science Foundation.

Van de Hulst chose to study hydrogen, given its high abundance in interstellar space, after which he predicted that transitions between the energy levels in hydrogen should produce radiation at a wavelength of 21 cm (de Hulst, 1945). The radiation was observed for the first time in 1951 using a microwave radiometer, appearing as emission with a width of approximately 80 kc/s. The detected source appeared to be extended and approximately centred on the galactic plane (Ewen & Purcell, 1951).

The observation of the 21 cm line provided much knowledge about the structure and interaction of galaxies, both within our Galaxy and in other galaxies in the Universe. For example, observations of the 21 cm line of neutral hydrogen in the Milky Way disk showed that the disk extends to at least two to three times the radius of the solar circle, and that the HI disk in the outer Galaxy is warped (see Dickey et al. (2009) and references therein). Detectable HI is an exceptionally sensitive tracer of tidal interactions between galaxies (e.g. Lelli et al. (2015) and references therein).

In terms of extragalactic observations, the manifestation of dark matter through galaxy rotation curves has been observed using the 21 cm line, one example being with the use of the Westerbork Synthesis Radio Telescope (WSRT) (Rogstad & Shostak, 1972; Bosma, 1978). The accurate detection of the 21 cm line from the epoch of reionisation is a powerful tool to investigate the neutral intergalactic medium (IGM), which can be used to probe information at high redshift such as matter density fluctuations and the thermal history of the IGM, thereby informing us about the evolution of the early universe (Colafrancesco et al., 2016; Pritchard & Loeb, 2012).

1.1.3 Thermal emission

Thermal emission, also known as black-body radiation, is emission due to the temperature of the body. Radio emission from planets in our solar system was found to be thermal in origin. One example of a study involved the use of the CSIRO 210-ft radio telescope in measuring the thermal radio emission from several planets in the solar system between wavelengths of 6 and 48 cm (Kellermann & Pauliny-Toth, 1966).

1.1.4 Non-thermal emission

Non-thermal emission, which follows a power law, is independent of the temperature of the source. Extragalactic non-thermal radio sources are the focus of the current work.

Two of the main mechanisms behind non-thermal emission are synchrotron radiation and Compton/inverse Compton radiation.

Synchrotron radiation

Synchrotron radiation is a result of relativistic electrons spiraling around magnetic field lines.

A single electron will emit radiation over a range of frequencies that peaks at some frequency $\nu_{\rm max}$, also known as the critical frequency. In realistic cases, the observed spectrum is produced by an ensemble of electron energies, each with its own individual peak frequency, as shown in Figure 1.2.

Since the energy spectrum of electrons emitting synchrotron radiation does not follow a Maxwellian distribution, the emission is non-thermal and the distribution of electron energies follows a power law. If the relativistic plasma is transparent (optically thin) to its own radiation, which occurs in the extended regions of radio sources, the power law takes on the form $N(E) = K E^{-p}$ (Kellermann & Owen, 1988), where $N(E)$ is the relativistic electron energy distribution, $p$ is the index of the electron energy distribution, and $E = \gamma m c^2$, where the Lorentz factor $\gamma = 1/\sqrt{1 - v^2/c^2}$. The radiation spectrum is a power law with

$$ S(\nu) \propto \nu^{\alpha}, \qquad (1.1) $$

where $\alpha \approx -(p-1)/2$ is the radio spectral index. The slope of the spectrum is determined by the electron energy distribution. Figure 1.3 shows the variation of the radio spectral index (flux density with respect to frequency) across four radio sources. This variation was first observed in Cygnus A (Mitton & Ryle, 1969), where the radio spectral index flattens towards the hotspots of the source, which have the second-flattest spectra after the compact core (Hargrave & Ryle, 1976). The effect has been explained as the ageing of the relativistic electrons due to synchrotron emission. Electrons emitting at higher frequencies have a higher energy and will be depleted first, resulting in a steeper slope over time (Scheuer & Williams, 1968). The age of the electrons that generated the emission can therefore be determined from the slope of the spectrum.
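In practice, the spectral index is often estimated from flux densities measured at two frequencies. The following minimal Python sketch (the flux densities and frequencies are hypothetical values) computes a two-point spectral index and the electron energy index p implied by the optically thin relation above.

    import numpy as np

    def spectral_index(s1, s2, nu1, nu2):
        """Two-point spectral index alpha, defined via S(nu) ~ nu**alpha."""
        return np.log(s1 / s2) / np.log(nu1 / nu2)

    # Hypothetical flux densities (Jy) of an optically thin lobe at 150 MHz and 1.4 GHz
    alpha = spectral_index(s1=2.0, s2=1.0, nu1=150e6, nu2=1.4e9)
    p = 1.0 - 2.0 * alpha  # inverting alpha = -(p - 1)/2
    print(f"alpha = {alpha:.2f}, implied electron energy index p = {p:.2f}")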

Synchrotron emission accounts for most of the radio emission from Active Galactic Nuclei (AGN), thought to be powered by supermassive black holes in galaxies and quasars (Burbidge, 1956), and appears to be responsible for generating the morphologies observed in radio sources (Fanaroff & Riley, 1974).

Re-absorption of the synchrotron radiation becomes important if the intensity of synchrotron radiation within a source rises above a certain threshold, an effect referred to as synchrotron self-absorption (Kellermann & Verschuur, 1988). It results in a drastic modification of the spectrum at low frequencies.

Figure 1.2: The spectrum of an individual electron (top right), and how the spectrum from an ensemble of electrons is obtained. Figure taken from Carroll & Ostlie (2006).

Figure 1.3 shows the spectra (flux density against frequency) of four radio sources.

The magnetic fields responsible for synchrotron radiation have an effect on the interstellar medium. Magnetic fields are an important factor driving star formation within star-forming clouds. Models generated on a supercomputer showed that stellar winds interacting with the magnetic field of the cloud generated energy and influenced gas at far greater distances than previously thought (Offner & Liu, 2018).

Magnetic fields in interstellar and intergalactic space have traditionally been measured in the four following ways (Beck & Wielebinski, 2013): (i) observing starlight polarization and polarized dust emission (Davis & Greenstein, 1951; Hoang & Lazarian, 2008). Interstellar space contains tiny dust grains, generally aspherical in shape and rotated by the ambient radiation, which prefer to align their short axes with the local magnetic field, thereby making it possible to measure a component of the interstellar magnetic field. (ii) Zeeman splitting of emission or absorption lines (Troland & Heiles, 1982). The strength and direction of the magnetic field is provided by the circular polarisation driven by Zeeman splitting. (iii) Detecting synchrotron radiation, as it requires both relativistic electrons and a magnetic field (Webber et al., 1980), by invoking the equipartition argument (Beck & Krause, 2005), where it is assumed that the total energy is at a minimum when the magnetic field strengths and the energies of the relativistic particles are approximately equal. (iv) Faraday rotation as a result of a magnetic field amongst relativistic electrons (Burn, 1966).

Figure 1.3: Showing the spectra of four radio sources. The top left panel shows the radio source 3C 84, which has a very compact component; 3C 123 in the top right panel is transparent across all frequencies; 3C 48 in the bottom left shows self-absorption below 100 MHz and is transparent above this value; and the bottom right panel shows opaqueness at different frequencies. Figure taken from Kellermann & Verschuur (1988).

Measurements of the Faraday rotation measure of increasing numbers of extragalactic radio sources have led to a more refined modelling of the large-scale magnetic field structure.

Compton and Inverse Compton radiation

Compton scattering refers to the energy transferred from a photon to a charged particle, usually an electron. Inverse Compton (IC) scattering occurs when the electron has significant kinetic energy compared to the energy of the photon, which causes the scattered photon to have a higher frequency, and energy is transferred from the electron to the photon. If photons propagate through a distribution of energetic electrons, they can gain or lose energy depending on their energy relative to the electron temperature.

Despite the low radiation density of ultra-relativistic electrons in intergalactic space, the electrons might lose significant amounts of energy via the IC process (Feenberg & Primakoff, 1948). The IC process has been under consideration in regard to whether it can account for the observed X-ray background, as well as the role it plays in the synchrotron radiation of compact and intense radio sources, such as quasars (see Rees (1967) and references therein).

It is inferred that IC scattering must occur in the lobes of radio sources: as the cosmic microwave background permeates all of space, every synchrotron radio structure should have a corresponding X-ray structure. Feigelson et al. (1995) were the first to detect IC radiation in X-rays from the lobes of the nearby radio galaxy Fornax A.

1.1.5 Galaxies with significant radio emission

The early universe contained many small lumps of matter which coalesced to form galaxies. This forms the basis of hierarchical cosmological models. A traditional approach to understanding the physics underlying galaxy formation is to use physical observables of galaxies and their relations to each other in order to test models (Conselice, 2012). The ability to perform simulations has played a major role in furthering our progress in understanding galaxy formation.

A few established facts about galaxy formation are as follows: galaxy formation is strongly influenced by the interstellar medium (ISM); black holes also play a major part; and galaxies form through the cooling of gas at the centres of dark matter halos, where the gas condenses into stars (White & Rees, 1978).

Despite the existing knowledge, several key questions regarding galaxy formation remain. For example, how did the first stars and galaxies evolve to produce the galactic structures that exist today? What are the underlying physical processes regulating the structure of the ISM? The processes driving star formation and galactic outflows are not yet understood. Another major theoretical challenge is understanding stellar and AGN feedback in detail and identifying physically correct sub-resolution models that take all relevant physical processes into account.

Radio astronomy has made important contributions to the study of galaxy formation and evolution. Through the production of the CMB map, it has provided information on the origins of large-scale structure, as well as refined the values of cosmological parameters (Partridge, 2011). The following subsections describe the extragalactic sources detectable at radio wavelengths, namely star-forming galaxies (SFGs) and Active Galactic Nuclei (AGN), with regard to the knowledge gained about them from radio surveys. The study and separation of these two galaxy populations is important because they are fundamentally different classes of objects with different properties.

Star-forming galaxies

There are two fundamental types of galaxies: red sequence and blue cloud galaxies, which occupy different regions on a colour-magnitude diagram plotting the relationship between the luminosity and mass of the galaxy. Galaxies in the red sequence category tend to be elliptical and contain relatively little gas and dust compared to the galaxies found in the blue cloud category, which tend to be spiral and have higher star formation rates (SFR) (Strateva et al., 2001). Therefore, galaxy populations can be traced through star formation. The SFR is key in characterising galaxies, and observing how the average SFR changes is an important means of tracing the evolution of the universe.

Star-forming galaxies (SFGs) are defined as those having a high SFR that is strongly correlated with total stellar mass (Maragkoudakis et al., 2016). They can be used to help measure the star formation rate in the cosmos. There are three types of SFGs: normal late-type, starburst and proto-spheroidal galaxies. SFGs show both thermal and non-thermal emission, emitting at radio wavelengths due to both synchrotron and free-free radiation processes; they are characterised by steep GHz radio spectra, but also have a flat free-free component (Padovani, 2016). Thermal emission can also be due to the dust grains in these galaxies.

A fundamental ingredient for star formation is neutral atomic hydrogen (HI), making it an ideal candidate to probe the rate of star formation and therefore whether or not a galaxy can be classed as an SFG (Rhee et al., 2018). The HI content can also be used to separate SFGs from other types of galaxies that contain less HI.

Although past radio surveys have detected a greater number of AGN, the increased sensitivity of surveys over time has enabled the detection of an increasing number of SFGs, which are found to constitute a significant fraction of the faint radio sky (Norris et al., 2006; Padovani et al., 2014). This makes SFGs promising for charting the cosmic history of star formation. The study of SFGs in the radio regime has led to the discovery that their radio emission is tightly correlated with the SFR (Kennicutt & Evans, 2012). As such, radio observations have been used to measure the rate at which galaxies form new stars (Tabatabaei et al., 2017).

There also exists a correlation between the far-infrared (FIR) and radio emission of ordinary and star-forming galaxies (Helou et al., 1985), which can be expressed as $S_{1.4} = 10^{-q}\,S_{\rm FIR}$, where $S_{1.4}$ is the radio flux density at 1.4 GHz and $S_{\rm FIR}$ is the FIR flux. The exponent $q$ appears to be independent of the source luminosity as well as redshift, which has interesting consequences for star and galaxy formation. The correlation holds over a very wide luminosity range, although care should be taken to exclude potential AGN components of the radio emission, as these would violate the relation. It is presumed that star formation produces both the dust re-emission that dominates the FIR luminosity, as well as the supernovae that produce relativistic electrons and hence the synchrotron radiation (Partridge, 2011).
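Rearranging the relation gives the correlation parameter as $q = \log_{10}(S_{\rm FIR}/S_{1.4})$. A minimal Python sketch of this arithmetic follows; the flux values are hypothetical, and the frequently quoted mean of $q \approx 2.3$ is only indicative.

    import numpy as np

    def q_parameter(s_fir, s_radio):
        """FIR-radio correlation parameter q = log10(S_FIR / S_1.4).

        Both fluxes must be given in the same units for the ratio to be meaningful.
        """
        return np.log10(s_fir / s_radio)

    # Hypothetical fluxes for a star-forming galaxy
    q = q_parameter(s_fir=5.0, s_radio=0.02)
    print(f"q = {q:.2f}")  # ~2.40, close to the commonly quoted mean of ~2.3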

Radio observations are able to probe very recent star formation activity and to some extent trace its location (Padovani, 2016), since the synchrotron emission from SFGs results from relativistic plasma accelerated in the supernova remnants that are associated with massive star formation.

Figure 1.4: The optical image superimposed with the radio emission in the 3 cm band (red) in the galaxy NGC 6946. The radio emission indicates regions of star formation. Image taken from Tabatabaei (2017).

Active Galactic Nuclei (AGN)

AGN are the most luminous objects in the Universe, reaching luminosities of up to $L = 10^{48}\,\mathrm{erg\,s^{-1}}$. They are powered by a supermassive black hole (SMBH), which accretes gas and dust from its surroundings. AGN come in different types, such as quasars, Seyfert galaxies and radio galaxies. The two main categories that AGN fall into are radio-loud, displaying powerful jets, and radio-quiet, with very weak jets. There are many other ways of classifying AGN; the different ways constitute an entire AGN 'zoo' (Padovani, 2017). The majority of the current thesis is concerned with classifying the morphologies arising from radio-loud AGN, which in the broadest sense can be compact or extended in morphology.

In the past, AGN were mainly studied as laboratories in which to probe exotic high-energy processes. Despite attempts to understand the role that the environment might play in triggering or fueling AGN, there was virtually no notion that AGN played any significant role in the evolution of typical galaxies. The current status is very different: there is very strong evidence in support of the idea that the evolution of galaxies and AGN is strongly linked (Heckman & Best, 2014).

There are two modes of AGN feedback: radiative mode and jet mode. The radiative mode occurs when most of the energy is released by radiation or strong winds originating from the black hole's accretion disk, and is usually associated with high-luminosity AGN. The jet mode is most likely the dominant mode in low-power AGN and is due to the presence of jets of relativistic plasma depositing kinetic energy into the surrounding medium. AGN feedback mechanisms are used in both semi-analytic models and numerical simulations to successfully reproduce the properties observed in massive galaxies (Cattaneo et al., 2009; Fabian, 2012).

Due to their large luminosity, AGN are observable to very high redshifts (Mortlock et al., 2011); therefore they serve as a cosmological probe of the early universe. For example, the absorption lines of some AGN types show hints of cosmic reionisation (Keel, 2007). AGN provided evidence for the existence of supermassive black holes, such as through the study of emission-line variability data in Seyfert 1 galaxies (Peterson & Wandel, 2000). AGN are key players in the process of galaxy formation and evolution, as evidenced by the discovery of the tight correlation between galaxy properties and those of the central nucleus (Kormendy & Ho, 2013), and similar evolutionary trends between the growth histories of supermassive black holes and galaxies (e.g. Boyle & Terlevich (1998), Marconi et al. (2004)). It is also possible to probe galaxy evolution by investigating the role of supermassive black holes by measuring the host galaxy stellar mass function (Bongiorno et al., 2016). AGN also have the ability to heat, relocate and in some cases remove the surrounding gas from the host galaxy (Brienza, 2018).

The study of AGN has enabled the derivation of the SMBH mass function, defined as the comoving number density of black holes per bin in log mass at a given redshift. Several previous analyses reviewed in Heckman & Best (2014) indicate that there is a population of more massive black holes produced at higher redshifts. Since a redshift of $z \sim 1$, only the population of black holes with masses below approximately $10^8$ solar masses has been growing significantly, while the population of black holes with masses $> 10^9$ solar masses was largely in place by approximately $z \sim 2$.

The use of AGN to study SMBHs, together with using star formation as a tracer for galaxy populations, revealed that their evolution is very similar: a steep rise in both the star formation rate (SFR) and the SMBH growth rate by about a factor of 10 from redshift $z = 0$ to $z = 1$, a broad maximum in both rates at $z \sim 2$-$3$, and then a relatively steep decline at higher redshifts (see Shankar et al. (2009) and references therein).

Radio galaxies

Radio galaxies fall into the category of radio-loud AGN and are some of the most unusual and powerful objects in the Universe, with extents varying between the order of 1 pc or less up to the Mpc scale. They are powered mainly through the synchrotron emission mechanism and tend to be hosted by elliptical galaxies (Rogstad & Ekers, 1969).

The main components of radio galaxies tend to be a core, jets and lobes. Energy is generated in the core of the AGN and expelled from it in the form of two opposing relativistic beams (Longair & Riley, 1979), which can travel vast distances before spreading out into giant, radio-emitting lobes. The components (core, jets and lobes) have different spectral shapes, as their relative strength depends on frequency. The structure and spectra of the radio emission from radio galaxies contain information on the history of AGN activity in the source (Saikia & Jamrozy, 2009).

The jets are very well collimated structures (Blandford & Rees, 1974), with opening angles of no more than a few degrees. They can be one-sided (Cohen & Unwin, 1984), meaning that only one jet can be observed. There have been a couple of long-unresolved questions regarding the origin of radio jets, such as from what energy reservoir the large radio luminosities (up to $10^{38}$ W between 10 MHz and 100 GHz) are drawn, and how the AGN supplies such high-luminosity relativistic particles and fields to the radio lobes (Bridle & Perley, 1984). There have been mainly two theories regarding the origin of jets (Rees, 1982): (i) an ion-supported torus (accreting disk) anchors magnetic fields which extract rotational energy from the hole in the form of two collimated beams of relativistic particles and fields (e.g. Rees et al., 1982), and (ii) a radiation-supported torus, where, due to centrifugal effects greatly enhancing the effective gravity along the cylindrical walls surrounding the axis of rotation, the radiation might cause the ejection of jets (e.g. Jaroszynski et al., 1980). Almost all the information available in regard to jets relates to their morphology and luminosity.

Fanaroff & Riley (1974) were the first to notice the correlation between radio luminosity and the relative positions of high and low surface brightness in the lobes of extended extragalactic radio sources. The pattern of the emission can be characterised into two main morphological types: (1) FRI, where the brightest part of the emission is closest to the core of the source, with typical luminosities of $L_{\nu=1.4\,\mathrm{GHz}} \leq 10^{32}\,\mathrm{erg\,s^{-1}\,Hz^{-1}}$, and (2) FRII, where the lobes are the components having the brightest emission, with typical luminosities of $L_{\nu=1.4\,\mathrm{GHz}} > 10^{32}\,\mathrm{erg\,s^{-1}\,Hz^{-1}}$. However, there is not a clear separation between the two classes.
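As a toy illustration of this luminosity-based division (the defining criterion is morphological, and as noted above the separation is not clean), a minimal Python sketch using the canonical threshold of $10^{32}\,\mathrm{erg\,s^{-1}\,Hz^{-1}}$ at 1.4 GHz:

    # Toy Fanaroff-Riley divider based only on 1.4 GHz luminosity.
    # Real classification rests on morphology (core-brightened vs
    # lobe-brightened), and sources overlap across this threshold.
    FR_THRESHOLD = 1e32  # erg/s/Hz at 1.4 GHz

    def fr_class(l_nu: float) -> str:
        return "FRII (lobe-brightened)" if l_nu > FR_THRESHOLD else "FRI (core-brightened)"

    print(fr_class(3e33))  # -> FRII (lobe-brightened)
    print(fr_class(5e31))  # -> FRI (core-brightened)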

The jets of FRI radio sources tend to be less collimated than those of FRIIs, indicating that they are weaker and decelerate (Laing & Bridle, 2002). The regions of high brightness tend to be located closer to the host galaxy, and the sources become fainter towards the outer parts of the lobes, where the spectra are the steepest, indicating that the emitting particles have aged the most (Kembhavi & Narlikar, 1999). Figure 1.5 shows the different appearances that FRI radio galaxies can have.

FRII radio sources often have jets between the compact core and the radio lobes, which often show some internal structure, as well as hotspots at the outer edges of the radio lobes. The jets tend to be more powerful compared to those of FRIs, and therefore have smaller opening angles (Bridle & Perley, 1984). FRIIs are likely to have supersonic jets, and the hotspots might be where the jets meet the ambient medium and decelerate through a shock transition. The emission pattern of the FRIIs suggests that energy is carried away from the core to the lobes. Figure 1.6 shows the variety of morphologies available to FRIIs.

There are also smaller subgroups of morphologies such as hybrids (Gopal-Krishna & Wiita, 2000), which display a mixture of FRI and FRII morphologies. Further classes include the Wide-Angle Tail (WAT) (Begelman et al., 1979), Narrow-Angle Tail (NAT), double-double (Schoenmakers et al., 2000) and HyMORs (Kapińska et al., 2017) morphologies.

Figure 1.5: Common appearances of FRI radio galaxies. Source: Judith Croston and the LOFAR Surveys team (private communication).

Figure 1.6: Common appearances of FRII radio galaxies. Source: Judith Croston and the LOFAR Surveys team (private communication).

Additionally, some extended sources may have a compact morphology, and there has been some debate as to whether they should form a class in their own right, whether they are FRI/FRII radio galaxies in early stages of evolution, or whether they are short-lived sources (Miraghaei & Best, 2017).

One unresolved question is the origin of the FRI and FRII sources, also known as the FRI/FRII dichotomy; its resolution will help us to understand the processes which initiate nuclear activity (Ferrarese & Celotti, 2002). The differences between the FRI/FRII jets may be associated with the central engine and/or the external medium. There are two competing theories for explaining the dichotomy. The first is that there could be intrinsic differences in the jets' kinetic power between the sources (Baum et al., 1995) or in fundamental AGN parameters. The second is that the deceleration process of the jets causes the differences between the sources (Kawakatu et al., 2009; Meliani & Keppens, 2009), attributed to the different physical conditions in the environment in which the radio source propagates. Gopal-Krishna & Wiita (2000) present a more detailed summary of the dichotomy.

The classification of radio galaxies, which can be done according to their morphology, is important because the morphologies can provide information about the surrounding environment of the radio galaxy at different redshifts. Some examples include the discovery that tailed radio galaxies occur preferentially in high-density regions of the intracluster medium (Burns et al., 1994), and can also trace its weather. At nearby redshifts (z < 0.5), FRIIs are often found to be hosted by field galaxies, while FRIs tend to be located in galaxy clusters (Saripalli, 2012). At higher redshifts, both FRI and FRII type radio sources are found in rich environments (Hill & Lilly, 1991).

The broadest categories that radio galaxies can be divided into are compact and extended sources. Sources may be compact due to being unresolved by the telescope; however, it is also possible to have a class of resolved compact sources (Baldi et al., 2016b). In terms of spectral indices, compact sources tend to have flat spectra, whereas extended sources have steep spectra. The way radio galaxies are orientated must account for some observed differences in their appearance. For example, the jets may radiate anisotropically, giving the appearance that one jet is brighter than the other, whereas if the radio galaxy were orientated differently the jets might appear equally bright.

A recent work by Hardcastle (2018) notes that the difference between the two morphological classes is not the same as the difference between the two types of jets seen in the FRI and FRII sources. Therefore, the distinction between the classes is much less clear than previously thought.

There are other radio galaxy types that display different morphologies. One such example is remnant radio galaxies, where the remnant phase refers to the end of the radio galaxy's life cycle when the jets switch off; these usually display curved steep spectra and a relaxed morphology without compact components, although overall a range of morphologies is available to them (see Brienza et al. (2016) and references therein). Another source type is restarted radio galaxies, which display remnant lobes associated with active jets. They usually have a 'double-double' morphology, where the centre of the radio galaxy is in line with the two radio lobes (Schoenmakers et al., 2000).

The duty cycle of radio galaxies refers to phases of high and low jet activity. Different duty cycles are likely to have resulted from the two AGN modes (radiative mode and jet mode), as they are also related to two accretion modes acting on different timescales. Radiative-mode AGN produce highly energetic but short-lived AGN activity, whereas jet-mode AGN are believed to go through a self-regulated feeding and feedback loop, where the same gas fuelling the black hole gets regularly heated by it, which stops it being accreted, resulting in cyclically active behaviour for most of the galaxy's life (Best et al., 2005).

Flux-limited radio surveys detect powerful radio sources such as FRIIs at relatively higher redshifts, whereas low-power radio sources such as FRIs are identified at lower redshifts. The slope of the radio luminosity function leads to the prevalence of low-power sources at lower redshifts, resulting in a bias and therefore limited knowledge of the relative abundance of low-power sources at higher redshifts (Saripalli, 2012). Radio luminosity functions suggest that FRIs, which are located mainly in galaxy clusters, are frequently triggered and spend over a quarter of their time in an active state, whereas FRIIs are more rarely triggered, remaining active for short periods of time (Best et al., 2005; Shabala et al., 2008).

The probability of a galaxy being radio-loud is a strong function of its optical luminosity, a result shown using the bivariate radio-optical luminosity function (Auriemma et al., 1977; Sadler et al., 1989; Ledlow & Owen, 1996). Other statistical studies of luminosity functions, such as Best et al. (2005), show that the fraction of galaxies which host radio-loud AGN is very strongly related to both stellar and black hole mass. The distribution of radio luminosities does not generally depend on black hole mass, and, within the range of radio luminosities studied, radio and emission-line AGN activity are independent of each other. Shabala et al. (2008) constructed a bivariate luminosity function and used it to constrain the time a radio source is inactive. They found that radio and emission-line AGN activity are independent phenomena, and that the radio source lifetime and the duration of the quiet phase of AGN activity depend strongly on mass, where massive hosts possess longer-lived sources that are triggered more frequently. Sabater et al. (2019) show that stellar mass is a more important driver of radio-AGN activity than black hole mass, which suggests a possible connection between the fuelling gas and the surrounding halo.

In regard to optical spectra, almost all FRI radio galaxies are low-excitation AGN, while the optical hosts of FRIIs usually have strong emission lines and are thus classified as high-excitation AGN. However, there is not a direct correspondence between the FR and emission-line classes, as many FRII radio galaxies are also low-excitation AGN (Evans et al., 2006).

1.1.6 Supernova remnants

Supernova remnants (SNRs) are the remains of a supernova explosion, and emit synchrotron radiation. They are made up of material expanding from the explosion, bounded by a shock wave. There are two ways that SNRs can form: either a massive star exhausts its fuel and collapses under its own gravity, or a star, likely a white dwarf in a binary system, accretes material and undergoes a thermonuclear explosion upon reaching a critical mass. Approximately $10^{51}$ erg of mechanical energy is ejected into the interstellar medium, as well as several tens of solar masses of stellar material, regardless of the origin of the explosion. The first SNRs were detected in our Galaxy from radio observations.

1.2 Radio astronomical telescopes and surveys

The aim of performing astronomical surveys in general is to produce a catalogue of astronomical objects, using measurements from telescopes, that describes properties of interest such as total flux, size and position. The subsequent analysis of individual sources in the catalogue and/or source populations with the use of statistics serves to improve our current understanding of the Universe.

In contrast to optical telescopes, there is more freedom in choosing the locations where radio telescopes are built; usually the constraint is that they should be located in uninhabited places, away from radio transmission and television broadcasting, to reduce interference effects.

The subsequent subsections discuss radio all-sky surveys, in particular their parameters, instruments and science goals, as well as discoveries from past and present radio surveys. We also discuss surveys planned for the future and what they might find.

In addition to continuum surveys, three other types of surveys are possible at radio wave- lengths, namely spectral line, polarisation and galactic plane surveys. The current thesis is concerned with radio sources detected using continuum surveys.

1.2.1 Main parameters in surveys

Prior to exploring the different radio surveys that have been completed, are currently in progress, or are planned for the future, it is important to know the parameters that define them, as differences in these help highlight a survey's strengths and weaknesses as well as differences in the properties of the detected sources. The key parameters involved in continuum surveys are the frequency, sensitivity (depth), angular resolution and noise, as well as the survey area, speed and field of view.

Frequency

At radio frequencies, the non-thermal synchrotron emission is enhanced whereas the thermal (black-body) emission is dampened. The choice of frequency or range of frequencies in radio surveys affects the type of radio sources that are selected (Simpson, 2017), as well as their measured properties. For example, low frequencies reveal extended structures in greater detail compared to higher frequencies; therefore the same source observed at the two frequencies will differ in size.

Early radio continuum surveys utilised low frequencies owing to technical limitations, but also because most radio sources are stronger at low frequencies. However, low frequencies also tend to deliver lower resolution and are affected by radio-frequency interference. Higher frequencies offer a higher resolution, positional accuracy and dynamic range, and are sensitive to free-free emission, at the expense of being less sensitive to extended structures.

More recently, there has been renewed interest in utilising low frequencies for purposes such as detecting neutral hydrogen at cosmological distances and investigating ultra-steep spectrum radio galaxies at high redshifts (Brienza, 2018).

Sensitivity

Radio signals are relatively weak and typically measured in janskys ($1\,\mathrm{Jy} = 10^{-26}\,\mathrm{W\,m^{-2}\,Hz^{-1}}$). Sensitivity is a measure of the weakest possible detectable source of radio emission (Wrobel & Walker, 1999), and is synonymous with depth.

The radio sensitivity is proportional to the signal-to-noise ratio, which depends on the effective area of the telescope and the system temperature, all of which are described in more detail in the subsections below.

Angular resolution

The resolution determines the smallest angular separation at which two objects that are known to be separate can also be seen as such. It depends on the aperture of the telescope. The radio waves diffract around the telescope aperture and produce a diffraction pattern.

Resolution is defined in the same way as for optical telescopes: $\theta = \lambda/D$, where $\lambda$ is the wavelength and $D$ is the diameter of the radio telescope, or the baseline length if radio interferometry is used. The orders-of-magnitude difference in wavelength between the optical and radio regimes causes a large difference in resolution. Due to this difference, in the early days of radio astronomy the angular resolution of a radio telescope was much poorer than that achieved in optical astronomy.

For single radio telescopes, antennas with diameters of several kilometres would be needed to achieve the same resolution as an optical telescope, which is unfeasible. This is what brought about the development of radio interferometers, which are arrays of radio antennas used simultaneously in astronomical observations to mimic a single telescope with a very large aperture.
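As a worked example of $\theta = \lambda/D$, the Python sketch below compares the diffraction-limited resolution of a single dish with that of an interferometer baseline; the 100 m dish and 100 km baseline are illustrative values rather than any specific instrument.

    import numpy as np

    def resolution_arcsec(wavelength_m: float, aperture_m: float) -> float:
        """Diffraction-limited resolution theta = lambda / D, in arcseconds."""
        return np.degrees(wavelength_m / aperture_m) * 3600.0

    # The 21 cm line observed with a 100 m single dish vs. a 100 km baseline
    print(resolution_arcsec(0.21, 100.0))   # ~433 arcsec (single dish)
    print(resolution_arcsec(0.21, 100e3))   # ~0.43 arcsec (interferometer)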

Noise

The noise can be broadly defined as the uncertainty in the output signal; it should be dominated by statistical fluctuations in the photons that produce the signal and ideally be free of systematic effects.

The noise is expressed in terms of temperatures, where the noise terms add linearly:

$$ T_{\rm sys} = T_{\rm cmb} + T_{\rm rsb} + T_{\rm source} + [1 - e^{-\tau_A}]\,T_{\rm atm} + T_{\rm spill} + T_{\rm r} + \ldots, \qquad (1.2) $$

where $T_{\rm cmb}$ is the cosmic microwave background, $T_{\rm rsb}$ is the radio source background, $T_{\rm source}$ is due to the astrophysical source, $\tau_A$ is the optical depth, $T_{\rm atm}$ accounts for the opacity of the atmosphere due to the absorption of the signal and extra thermal emission, $T_{\rm spill}$ is due to the imperfect illumination of the dish or subreflector, and $T_{\rm r}$ is the receiver background.

It is expected that longer integration times should reduce the noise levels. The radiometer equation can be used to estimate the noise with respect to integration time (Johnson, 1928):

$$ \sigma_T \approx \frac{T_{\rm sys}}{\sqrt{\tau \Delta\nu}}, \qquad (1.3) $$

where $\sigma_T$ is the temperature fluctuation, $T_{\rm sys}$ is the system noise temperature, $\tau$ is the integration time and $\Delta\nu$ is the bandwidth.
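As a worked example of equation (1.3), the Python sketch below estimates the expected noise fluctuation; the system temperature, bandwidth and integration time are illustrative values only, not those of any particular instrument.

    import numpy as np

    def radiometer_noise_K(t_sys_K: float, bandwidth_hz: float, integration_s: float) -> float:
        """Thermal noise fluctuation sigma_T ~ T_sys / sqrt(tau * delta_nu)."""
        return t_sys_K / np.sqrt(integration_s * bandwidth_hz)

    # Illustrative values: T_sys = 50 K, 100 MHz bandwidth, 1 hour integration
    sigma = radiometer_noise_K(50.0, 100e6, 3600.0)
    print(f"sigma_T ~ {sigma * 1e6:.0f} microkelvin")  # noise falls as 1/sqrt(tau)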

Source confusion is also considered to belong under the category of noise. Confusion is defined as the inability to measure faint sources due to the presence of other sources. Confusion usually limits the sensitivity of single-dish continuum observations at frequencies below approximately 10 GHz.

Field of view

For a single-dish antenna of diameter D, the width of the field of view of the antenna is equal to its angular resolution. The field of view is also referred to as the beam, and determines whether a source appears as a point source or an extended one. For point sources, the flux density measured by a telescope is the brightness integrated over the entire source. For extended sources, some part of the source will fall beyond the field of view, in which case the brightness is calculated over the beam size.

Survey speed, depth, area

The survey speed determines the amount of time required to complete a survey; it is proportional to the number of sub-arrays in the collecting area as well as the field of view and their number, and inversely proportional to the integration time per sky position.

Different radio surveys detect sources in different areas of the sky. The area of a survey is usually measured in deg². The smaller the survey area, the greater the cosmic variance and the likelihood of missing intrinsically rare objects. On the other hand, surveys that have covered wider areas have been relatively shallow, and so may have missed the most active epochs of galaxy formation (Norris et al., 2006).

The depth of a survey is synonymous with its sensitivity. The number of sources detected in a survey is affected by the survey area and sensitivity. There is usually a trade-off between the two: the number of sources detected can be increased by focusing on a smaller area at greater sensitivity, or on a larger area at decreased sensitivity. Technological limitations, cost and the time required are obstacles to maximising both the sensitivity and the area.

Surveys can cover parts of the two hemispheres; for example, FIRST and NVSS examined the northern hemisphere, which is currently being surveyed with LOFAR. The Parkes telescope examined the southern hemisphere, which will be covered by the SKA in the future. There have also been several all-sky surveys, such as the TIFR GMRT Sky Survey (TGSS), the GaLactic and Extragalactic All-sky MWA survey (GLEAM), and at present VLASS.

Figure 1.7 is a plot of the logarithm of survey area, in units of square degrees, versus the logarithm of limiting sensitivity, in units of mJy, across a selection of radio surveys. It shows the trend of increasing sensitivity with respect to area.

1.2.2 Science goals and purpose of surveys

Undertaking continuum surveys in the radio regime helps us to understand the formation and evolution of galaxies in the early Universe. In particular, continuum surveys shed light on the evolution of star formation and AGN, galaxy clusters, and the detection of dark matter and the determination of its parameters, as well as producing unexpected discoveries, all of which are discussed in further detail in the following subsections.

Figure 1.7: The logarithm of area versus limiting sensitivity across the large-scale surveys. Figure taken from Norris (2017a).

The physics and evolution of star formation

There are several scientific aims that radio surveys will attempt to address in regard to learning about SFGs, such as a better understanding of the cosmic history of star formation and the star formation rate density (SFRD) curve, uncovering the physical basis of the FIR-radio correlation, better distinguishing between radio-loud AGN and SFGs using the FIR correlation in SFGs, measuring the star formation rate (SFR) accurately (Norris, 2017a), as well as modelling the evolution of the luminosity function of SFGs (Bonato et al., 2017).

The cosmic history of star formation is one of the most fundamental observables in astrophysical cosmology. Its study can help astronomers to map the transformation of gas into stars, and to investigate the production of heavy elements as well as the reionisation of the Universe from the dark ages to the present epoch. Determining the early history of star formation is required to help establish whether massive stars in young SFGs are responsible for cosmic reionisation (Madau & Dickinson, 2014).

Different methods of measuring the SFR have given different values, largely due to inaccurate determinations of extinction corrections (Madau et al., 1998). Future surveys will enable a better characterisation of the SFRD curve, as many more SFGs will be discovered than in previous surveys.

Future radio surveys will detect more SFGs than FIR surveys, therefore giving larger sample sizes, extending up to high redshifts. Radio waves provide a more reliable way to measure the SFR, as they are not affected by dust extinction, in contrast to FIR emission (Calzetti, 2001; Mancuso et al., 2015). As such, radio surveys offer a good way to measure the cosmic star formation history of the universe. However, special care needs to be taken to disregard radio emission from a potential AGN component. Discovering the physical basis of the FIR correlation will shed light on how the many physical processes, such as the propagation of relativistic electrons, the strength and structure of the magnetic field, and the size and composition of the dust grains, must work together to produce the observed relation (see Mancuso et al. (2015) and references therein).

Assessing the relationship between radio continuum luminosity and SFR is of crucial importance to make reliable predictions for upcoming ultra-deep radio surveys and to use their results to measure the cosmic star formation history (Bonato et al., 2017).

The physics and evolution of AGN

One of the biggest questions yet to be resolved in astronomy is understanding the difference between radio-loud and radio-quiet AGN. For example, investigating why the host properties of radio-loud AGN depend on redshift is of great interest. Many papers have discussed the potential mechanisms responsible for the radio emission observed in radio-quiet AGN (see Padovani et al. (2014) and references therein). Despite evidence that SMBHs and their host galaxies have co-evolved, the mechanisms driving this co-evolution remain uncertain (Shankar et al., 2009; Kormendy & Ho, 2013; Heckman & Best, 2014).

The brightness of an SMBH in an AGN can be influenced by the host galaxy's environment, making AGN important tools for understanding the evolution and formation of structure in the Universe (see Bieri (2016) and references therein).

Future radio surveys will hopefully help uncover the physics driving the origin of the FRI/FRII dichotomy and the mechanism driving jets.

The study of the duty cycle (phases of high and low jet activity) in AGN has recently gained new relevance due to the role of AGN feedback in galaxy evolution (Brienza et al., 2018), as jets can have a major impact on the interstellar and intergalactic medium. Future radio surveys should help to reveal a more detailed picture of the duty cycle.

Radio galaxies appear to follow a life cycle in which they pass through different evolutionary states, ending when the nuclear activity drops or ceases and the jet fuel supply is exhausted. At this stage they are referred to as remnant radio galaxies. Understanding the evolution of remnants is important for determining whether or not they can have a long-term effect on the cluster (Basson & Alexander, 2003). There are relatively few continuum observations of remnants (Giovannini et al. (1988), Brienza (2018) and references therein), which limits the possibility of studying them in a statistical sense (Mahatma et al., 2018). Future surveys will detect a larger sample of remnant radio galaxies, enabling better-quality population studies and helping to develop more accurate models that describe how radio galaxies evolve after the jets switch off. Larger samples will also reveal higher numbers of unusual remnant sources, such as those displaying a steep spectrum at low radio frequencies, which is important in understanding their rarity and the role they play in feedback processes (Brüggen et al., 2018).

Restarted radio galaxies, which display a 'double-double' radio morphology composed of remnant lobes associated with active jets, are among the most indicative signs of episodic activity in radio galaxies and provide a way to investigate their duty cycle. It is important to identify more examples of such sources to understand the episodic activity of the jets and constrain the time scales over which it occurs, and for studying the propagation of jets in different media (Saikia & Jamrozy, 2009).

Giant Radio Galaxies (GRGs) are defined as those having sizes greater than or equal to a Mpc (Willis et al., 1974; Ishwara-Chandra & Saikia, 1999). Despite the discovery of hundreds of thousands of radio galaxies to date, only a few hundred GRGs have been found. The mechanism that explains their enormous size is still unknown (Dabhade et al., 2019). Although several hypotheses have been proposed, larger samples of GRGs, which will become available from future surveys, are required to provide further evidence for any hypothesis. Analyses of smaller samples have led to contradictory results. For example, Subrahmanyan et al. (1996) find that GRGs are very old radio galaxies that have had sufficient time to expand over large distances, whereas Mack et al. (1998) found evidence that the ages of the GRGs in their sample are similar to those of normal-sized radio galaxies.

Other open questions are the connection between the accretion disk and the jet, and the origin and launching mechanism of the ionised winds in AGN (Mehdipour & Costantini, 2019). Hopefully, upcoming radio surveys will shed some light on the mechanisms driving these phenomena.

Future surveys will help to understand the relationship between AGN and star formation evolution and their role in AGN feedback. At high redshift, the dominant mode of accretion onto a SMBH is cold mode accretion, whereas the dominant mode at low redshift is hot mode accretion (Kereš et al., 2005). These differing mechanisms cause a change in the space density and luminosity functions of radio sources (Norris, 2017a). AGN feedback may explain the close correlation between the high evolution rate of cosmic star formation and the peak of AGN activity at redshifts between z = 1 and z = 2 (Silk, 2011). Many more millions of AGN sources will be observed in future surveys, and they will make up an important part of multiwavelength studies of galaxy evolution (Padovani et al., 2014).

Clusters of galaxies

Galaxy clusters are the largest organised structures in the Universe that appear bound by gravity. They are made up of galaxies in a cloud of intergalactic gas, located at the intersections of filaments and sheets of the cosmic web. Synchrotron radiation in the intracluster magnetic field causes the diffuse radio emission observed in galaxy clusters (Feretti et al., 2012). Depending on its morphology, location, and size, a diffuse cluster radio source can be classified as an elongated object (radio relic), a large diffuse halo, or a mini halo (Brunetti & Jones, 2014; van Weeren et al., 2019). Galaxy clusters should contain very large populations of relativistic electrons and ions as a result of particle acceleration in shocks (Sarazin, 2002; Brüggen et al., 2012; Kang, 2018).

Diffuse synchrotron radiation from the ICM is detected in a variety of radio observations (e.g. Brown & Rudnick (2011)), providing evidence for non-thermal particles, which introduced fundamental questions about their origins and their impact on the physics of the ICM and the evolution of galaxy clusters (Kaastra et al., 2008). Upcoming radio telescopes will explore new parameter space and reach unrivalled sensitivities to cluster-scale emission over a wide frequency range, enabling more detailed studies of non-thermal components in galaxy clusters. These telescopes will also probe cluster-scale magnetic fields, through polarisation and Faraday Rotation studies of background and cluster radio sources with unprecedented statistics, frequency coverage and dynamic range (Brunetti & Jones, 2014).

Narrow-Angle Tail (NAT) and Wide-Angle Tail (WAT) galaxies, which are a subset of extended (radio-loud AGN) structures, are tracers of weather in the ICM (Clarke et al., 2014). Therefore, radio surveys can be used to investigate the AGN population in galaxy clusters. Observing the radio emission originating from dominant galaxies in galaxy clusters enables study of the feedback between the ICM and AGN (Gitti et al., 2012; Blanton et al., 2010).

The radio data currently available contain only a handful of galaxy clusters, which may be a biased sample as they were initially discovered at X-ray wavelengths. Large samples of clusters are important in the study of cosmology and large-scale structure formation; to this effect, upcoming radio surveys will discover hundreds more clusters (Norris, 2017a).

Dark matter annihilation can result in the production of stable Standard Model particles that lose energy through synchrotron radiation due to the presence of magnetic fields, which can be observed through radio emission. Galaxy clusters are excellent targets to search for or constrain the rate of dark matter annihilation, due to their large structure and dark matter composition (Storm et al., 2013). The results of future radio surveys should achieve better constraints on the rate of dark matter annihilation.

Future surveys may help to address the open questions surrounding the origin of cosmic magnetism, such as when and how the first fields were generated, whether present-day magnetic fields are a result of dynamo action or whether they represent persistent primordial magnetism, and the role that magnetic fields play in turbulence, cosmic ray acceleration and galaxy formation (Gaensler et al., 2004). However, the particle acceleration mechanism in galaxy clusters remains uncertain, and future surveys are hoped to shed light on it. Several potential mechanisms are reviewed in Petrosian & Bykov (2008).

Cosmology and cosmological parameters

Two of the first successes of radio surveys with respect to cosmology were ruling out the steady-state theory of the Universe (Shakeshaft et al., 1955; Schmidt, 1963) and measuring the cosmic dipole (Blake & Wall, 2002).

Radio continuum surveys in the future are expected to reveal more about cosmology, such as the source autocorrelation function, the cross-correlation of radio sources with foreground objects resulting from cosmic magnification, and a joint analysis involving the cosmic microwave background (CMB) power spectrum and supernovae. The cross-correlation can be used to test and constrain cosmological issues, such as the evolution and clustering of structures, models of dark energy and alternative models for the gravitational potential (see Raccanelli et al. (2012) and references therein). Future radio surveys are expected to produce the most significant measurement of the integrated Sachs-Wolfe effect (the cross-correlation between radio sources and CMB maps), as well as bring complementary measurements to other experiments (Raccanelli et al., 2012).

Cosmological reionisation is key to understanding the early phases of structure formation and evolution. Radio surveys, together with CMB surveys, will offer unique and specific ways of studying the evolution and energetics of the reionisation process simultaneously, which will provide both global and cross-sectional structure information (Trombetti & Burigana, 2018). The WEAVE/LOFAR survey is expected to find tens of radio galaxies at z > 6 and, together with the use of the 21cm forest, it will enable examination of the IGM structure during the Epoch of Reionisation (Simpson, 2017).

A long-standing problem in cosmology is that of the missing baryons, where fewer than 60% of the predicted baryons have been observationally confirmed (see Nicastro (2016) and references therein). The missing baryons are expected to be located in the hot and tenuous filamentary gas connecting galaxies, also known as the warm-hot intergalactic medium (WHIM). To date, they have been observed in very few absorption lines; for example, there have been X-ray observations of filamentary structures of gas at $10^7$ K associated with the galaxy cluster Abell 2744 (Eckert et al., 2015), and Nicastro et al. (2018) used observations of two highly ionized oxygen (OVII) intervening absorbers in the exceptionally high signal-to-noise X-ray spectrum of a blazar at z > 0.4. It should be possible to use upcoming radio polarisation observations, such as those of the SKA, whose increased sensitivity will enable improved detections of the cosmic web and therefore increase our knowledge about its constituents. As such, Locatelli et al. (2018) use numerical simulations of galaxy clusters to address the possible detection of the terminal part of the magnetised cosmic web, by focusing on observations of intracluster filaments connected to massive galaxy clusters via the Faraday Rotation effect.

Unexpected discoveries

When telescopes enter previously uncharted areas of observational space, they make unexpected discoveries, which often outshine the original goals for which they were built (Norris, 2017b). In fact, out of 18 major astronomical discoveries in the past 60 years, only 7 were planned (Ekers, 2010). One example is pulsars, which were discovered when the radio sky was studied for the first time with high time resolution, in order to study interstellar scintillation. The use of high time resolution represented new parameter space at the time. Intimate knowledge of the instrument enabled the recognition that the observed noise was not due to terrestrial interference (Bell Burnell, 1969).

In light of future surveys, unexpected discoveries require the development of algorithms that can mine the data for the unexpected. One such example is searching for unusual spectra to enable ‘weird’ galaxies to be identified in Sloan Digital Sky Survey (SDSS) data (Baron & Poznanski, 2016).

1.2.3 Past and present (large-scale) surveys

We describe a selection of large-scale radio surveys performed over time that achieved major results in radio astronomy.

2C and 3C surveys

The first large-scale astronomical survey performed was the 2C survey (Shakeshaft et al., 1955), the first of the ten Cambridge surveys. The survey detected O(1000) sources at a frequency of 81.5 MHz. Despite the detection of spurious sources in the 2C survey, which were subsequently corrected for, it was the first survey to give evidence for the Big Bang model of the Universe as opposed to the steady-state theory.

The subsequent survey (3C), published in 1959, was performed at 159 MHz. The revised 3C survey used observations at 178 MHz, which listed the brighter radio sources in the Northern Hemisphere, detecting several hundred sources in total. The updated version included more recent information about the sources already detected by the original 2C survey, as well as newer sources (Bennett & Smith, 1962). The 3C surveys were also the first without major errors that showed evidence of the strong evolution of radio sources in favour of the Big Bang model.

FIRST/NVSS

Approximately four decades after the 3C survey, there was a monumental increase in the number of sources detected, from the order of tens of thousands up to a few million. Two examples of such surveys were FIRST and NVSS, both operating in parallel, using the Very Large Array (VLA).

The NRAO VLA Sky Survey (NVSS) covered the northern sky, was conducted at a frequency of 1.4 GHz and resulted in a catalogue of ∼1.8 million sources (Condon et al., 1998). Although it is the largest radio survey to date, it is relatively shallow.

A survey performed in parallel with the NVSS survey is the VLA Faint Images of the Radio Sky at Twenty-cm (FIRST) survey (Becker et al., 1995). FIRST covers a smaller area of the Northern sky and contains over 800,000 unique sources.

The FIRST survey has a higher angular resolution compared to NVSS, as well as a greater depth. The resolution of the survey was chosen to enable identification of the optical counterparts to the radio sources as well as their radio morphology. Additionally, it should enable the multiple components of a radio source to be resolved (Becker et al., 1994).

In light of current surveys, one drawback of the FIRST survey was the relatively poor angular resolution and the low sensitivity to extended low surface brightness structures, which limited the ability to determine both source sizes and peak locations, particularly for smaller sources (Miraghaei & Best, 2017).

LOFAR

Whereas the FIRST/NVSS radio surveys were conducted at GHz frequencies, LOFAR (van Haarlem et al., 2013) operates in the MHz regime. Due to the compact core and long baselines of LOFAR, the images provide excellent sensitivity to both highly extended and compact emission (Shimwell et al., 2016).

The LOFAR Two-metre Sky Survey (LoTSS; Shimwell et al., 2019) continuum survey consists of three tiers. The survey has detected more than 325,000 sources to date, at a significance of five times the noise. The source density is a factor of approximately 10 higher than in the most sensitive existing very wide-area radio-continuum surveys.

VLASS

The VLA Sky Survey (VLASS; https://science.nrao.edu/science/surveys/vlass) is currently underway, with preliminary results available. The survey is performed using the Very Large Array (VLA) at a frequency of 2-4 GHz. Over the last few years, half of the sky north of -40 degrees declination has been observed. By completion, it is expected to detect approximately 5 million sources.


Future Surveys

A selection of upcoming surveys includes the SKA (https://www.skatelescope.org), the Evolutionary Map of the Universe (EMU; Norris et al., 2011) and MeerKAT (http://public.ska.ac.za/meerkat/meerkat-large-survey-projects).

The SKA will survey part of the Southern hemisphere in great detail, as its angular resolution and survey speed are expected to exceed those of current surveys by thousands of times. It will make extremely deep observations over small areas of sky (Norris, 2017a), where the sensitivity is expected to be two orders of magnitude greater than that of current surveys. The SKA will be built in two phases, with the first planned for completion around 2023 and the second following a decade later.

EMU, the continuum survey of ASKAP, is considered an ‘all-sky’ survey and is expected to find around 70 million radio sources, including most of the SFGs in the Universe and the first black holes (Norris et al., 2011). It will have an order of magnitude greater sensitivity and five times the angular resolution of NVSS (Norris & the EMU team, 2009). Although the reliability of identifying the optical/infrared counterparts to radio sources does not suffer significantly at the resolution used, the depth of the complementary data has a major effect, which will be the main limitation in trying to extract the maximum from all-sky surveys such as EMU (Simpson, 2017).

MeerKAT is the South African SKA precursor telescope, with 64 antennas spanning an area 8 km in diameter. Continuum surveys with MeerKAT will probably include the MIGHTEE survey (Jarvis et al., 2016), which will be conducted at about 1.4 GHz and detect about 200,000 sources.

Cross-matching surveys

One of the main advantages of modern large radio surveys is the ability to perform the essential task of cross-matching with surveys at other wavelengths in order to identify the multiwavelength counterparts of radio sources, enabling statistical studies of their source populations and host galaxy properties (Williams et al., 2019). The majority of radio sources are compact, which allows for relative ease of cross-matching with the optical counterpart using automated statistical methods, such as the Likelihood Ratio test (Sutherland & Saunders, 1992).

However, there remains a smaller fraction of radio sources, usually belonging to the radio-loud AGN class, that can be difficult to classify. In particular, some of these radio-loud AGN display large and complicated source structures. One example that illustrates obstacles to the

cross-identification of radio sources is when there are two unresolved nearby radio components, where the optical host lies halfway in between. The two components may be the lobes of an FRII, or they may be two independent sources (Norris, 2017a). One way to resolve such a case would be to visually inspect the radio emission pattern and classify it accordingly.

For surveys detecting relatively few, O(1000), sources, visual inspection is commonly used to perform the cross-identifications of such sources. This is where citizen scientists have proven useful: non-expert astronomers are trained to classify objects based on a few simple rules, as in the Radio Galaxy Zoo (Banfield et al., 2015). However, there is an increased need to automate the cross-matching procedure, as many surveys are detecting on the order of many thousands, up to several million, sources. The classifications made by citizen scientists can be used to train a machine learning algorithm. In particular, a convolutional neural network has been used on Radio Galaxy Zoo and was found to produce results comparable to the cross-identifications performed by experts. However, the performance was limited by sample size (Alger et al., 2018).

Another major difficulty in performing cross-matching is being able to distinguish between radio-loud AGN and SFGs. One way to address this is to use the far-infrared (FIR)-radio correlation obeyed by SFGs.

Summary of radio surveys

The current section has described the parameters on which surveys depend and how they affect the properties of the sources detected. A small selection of large surveys that represented significant milestones in radio astronomy has been reviewed. Their achievements include providing increasing evidence for the Big Bang model of the Universe and detecting larger populations of faint radio sources, including at higher redshifts, which has given a clearer picture of the composition of the Universe and how it has evolved.

Over time, radio surveys have achieved substantial increases in the number of sources detected, as a result of significant improvements in sensitivity, resolution and/or area. Figure 1.8 shows the logarithm of the number of sources versus the date a particular survey was first published, from near the beginning of radio astronomy up to the year 2020. The details of a few future surveys, as well as what is expected to be found from them, have also been discussed.

The ability to cross-match the results of surveys across different wavelengths enables more accurate studies of individual source properties and populations.

Figure 1.8: The logarithm of the number of sources detected (or expected to be detected) versus year, from 1940 up to 2020, across the large-scale radio surveys. The number of sources has grown exponentially since radio astronomy began. Figure taken from Norris (2017a).

1.3 Machine learning and applications to astronomy

Machine learning involves the use of algorithms and statistical models which aim to learn, infer structure and recognise patterns in data. It falls under the category of Artificial Intelligence (AI), which can be broadly defined as the intelligence demonstrated by machines, whereby they have the capacity to perform functions associated with the human mind, such as learning or problem solving (Russell & Norvig, 2009).

A major goal of machine learning is to construct a model, or hypothesis, that can make predictions based on the data provided. The learning process usually takes place on a sample set of (training) data, and after the models are trained they are applied to a sample of test data to evaluate their performance. In some cases, the progress of learning can be evaluated on a separate part of the data called a validation set, which in some circumstances can also be used as the test data.

Machine learning is different from data mining, although both stem from the field of data science. Data mining aims to use intelligent methods to extract information from a dataset and transform it into a usable form for future purposes (Hastie et al., 2009), whereas machine learning aims to learn from data.

There are many applications of machine learning, especially as the development of technology yields increasing amounts of rich and complex data, requiring sophisticated algorithms to extract and analyse information. It is used in most industries that work with data, including across all areas of science, for example in genomics, where it can be used to identify patterns in genetic sequences (e.g. Libbrecht, 2015). Machine learning is heavily utilised in social media platforms such as Facebook and YouTube, where the algorithms can, for example, predict which advertisements to show based on the items that people click and their reactions to them. Another example of its use is in financial services, for example in detecting fraud (Ngai et al., 2011). Machine learning is becoming increasingly important in astronomical applications, as a result of continuously developing telescope and survey technologies (Ball & Brunner, 2010). The application of machine learning to astronomy is the main focus of the current section.

The main categories that machine learning techniques fall into are supervised and unsupervised learning. In supervised learning, labels (outputs) are provided with the training data (inputs). Unsupervised learning does not involve the use of labels, so the algorithm must infer structure using the input data only. A third, smaller category, semi-supervised learning, is when labels are provided for only some of the inputs.

1.3.1 Supervised learning and common techniques

Examples of tasks involved in supervised learning are classification and regression. In classification, discrete labels are given and the algorithms learn to separate the data into at least two distinct classes, based on their features. A common example is detecting spam in emails, where one can use a set of emails labelled as ‘spam’ or ‘not spam’ as training data for a machine learning algorithm to learn the features that differentiate between the two classes.

In regression problems, continuous values such as measurements of properties are provided, which the algorithm learns from and predicts continuous values for unseen samples. An example is the prediction of house prices, based on properties such as location, size and number of rooms.

Several common supervised machine learning methods that have also had numerous astronomical applications are decision trees (Belson, 1959), random forests (Breiman, 2001) and support vector machines (SVMs; Boser et al. (1992)). Comprehensive reviews include Kotsiantis (2013) for decision trees, Parmar et al. (2019) for random forests and Nalepa & Kawulok (2019) for SVMs. These methods are not described in further detail here, since they are not utilised in the current thesis; however, we discuss their use in astronomy in a later section.

The following two subsections describe the methods that form the basis of this thesis, namely deep learning techniques and the neural networks from which they stem.

Figure 1.9: The basic architecture of a neural network, with an input layer, a hidden layer and an output layer. Figure taken from Hong (2019).

Artificial Neural networks

Artificial Neural networks (ANNs), in the context of machine learning or artificial intelligence, are inspired by the neural networks in the brains of animals (McCulloch & Pitts, 1943). They can be used in a supervised or unsupervised setting. The terms that underlie the functioning of neural networks are in boldface.

Neural networks consist of an input layer, an output layer and at least one intermediate (hidden) layer. The layers consist of nodes, which in traditional neural networks are completely interconnected between adjacent layers. A weight, and generally a bias, governs the connections between nodes. The output of nodes is mediated through the use of an activation function, which produces an output signal based on some usually nonlinear combination of weights, inputs and biases. The weights are the strengths of the connections between the nodes in subsequent layers, usually initialised from a random distribution. A simple neural network architecture is shown in Figure 1.9, containing an input layer with input matrix $X$, a hidden layer and an output layer, which outputs predictions $\hat{y}$. The weights, which connect the nodes between consecutive layers, are shown with subscripts. The inputs to a particular node are a sum (referred to as $z$) of the inputs from the previous layer multiplied by the weights. The activation function of a particular node is denoted using $f$. The activity of the third layer $z^{(3)}$ is expressed as the activity of the second layer $a^{(2)}$ multiplied by the weights in the second layer $W^{(2)}$.
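To make the notation concrete, the forward pass through a network like that of Figure 1.9 can be sketched in a few lines of Python/NumPy. This is a minimal sketch: the layer sizes, the random toy inputs and the choice of a sigmoid activation are illustrative assumptions, and biases are omitted for brevity.

import numpy as np

# A toy network with 2 inputs, 3 hidden nodes and 1 output (hypothetical sizes).
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 2))   # input matrix X: 5 samples, 2 features

W1 = rng.normal(size=(2, 3))  # weights between input and hidden layer
W2 = rng.normal(size=(3, 1))  # weights W(2) between hidden and output layer

def f(z):
    # Sigmoid activation function (one common choice among many).
    return 1.0 / (1.0 + np.exp(-z))

z2 = X @ W1    # weighted sum z entering the hidden layer
a2 = f(z2)     # activity a(2) of the hidden layer
z3 = a2 @ W2   # z(3) = a(2) W(2)
y_hat = f(z3)  # predictions, one per sample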

The data, and the corresponding labels in the case of supervised learning, are fed into the input layer and propagated through the network, where a prediction is given at the output layer. The difference between the prediction and the true label is computed, which constitutes the error. This error is sent back through the network, and the weights and biases are adjusted in order to minimise the error, using the gradient descent algorithm (Cauchy, 1847), discussed in more detail in a later subsection. The process of sending the error back through the network and adjusting the weights to minimise the cost function is called backpropagation (Rumelhart et al., 1986). The procedure is repeated for a certain number of samples at a time, as specified by the batch size, until the entire training set is cycled through, after which the weights are applied to the validation set. The network is said to be trained for a certain number of epochs, referring to the number of times the entire training set is cycled through. Ideally, the network should keep being trained until the training and validation losses both reach a (global) minimum. If the training loss continues to decrease while the validation loss starts to increase, this signals that the network has begun to overfit, which can be remedied with the use of early stopping (Caruana et al., 2000), where one specifies a number of epochs to wait and see whether or not the validation loss will decrease once more. If not, training is stopped.
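Such a training loop with early stopping can be sketched using the Keras library; the toy data, the small dense architecture and the patience value below are illustrative assumptions, not a prescription.

import numpy as np
from tensorflow import keras

# Toy data standing in for a real training/validation split (assumption).
rng = np.random.default_rng(0)
x_train, y_train = rng.normal(size=(200, 8)), rng.integers(0, 2, 200)
x_val, y_val = rng.normal(size=(50, 8)), rng.integers(0, 2, 50)

model = keras.Sequential([
    keras.layers.Dense(16, activation='relu', input_shape=(8,)),
    keras.layers.Dense(1, activation='sigmoid'),
])
model.compile(optimizer='adam', loss='binary_crossentropy')

# Stop when the validation loss has not decreased for `patience` epochs,
# rolling the weights back to the best epoch seen.
early_stop = keras.callbacks.EarlyStopping(monitor='val_loss', patience=10,
                                           restore_best_weights=True)

model.fit(x_train, y_train, validation_data=(x_val, y_val),
          epochs=200, batch_size=32, callbacks=[early_stop])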

The more complex the data is, the larger the neural networks need to be (in terms of number of layers and nodes per layer), in order to capture the patterns better. At some point the network will reach a level of complexity such that the signals propagated through the layers continue to decrease and can eventually approach zero. This is known as the vanishing gradient problem (Hochreiter, 1998). It is remedied by reducing the number of parameters in the network, through techniques such as deep learning, discussed in the following subsection.

Deep learning

Deep neural networks (DNNs) are neural networks containing many layers. Traditionally, all the nodes in neural networks are interconnected between subsequent layers. When the data become high-dimensional, this can lead to the vanishing gradient problem, as discussed in the previous subsection. The problem can be remedied with the use of convolutional neural networks (CNNs), which use convolutional layers containing a number of user-defined filters: small 2D windows of a user-specified size that scan across the image and detect features. The use of filters greatly reduces the number of parameters in the network, while simultaneously enforcing translational invariance. The number of pixels by which the filters are moved across the image is specified by the stride. The number of parameters can also be reduced by using pooling layers, which summarise the pixel intensities over a small region of the image, as specified by the user. However, pooling results in the loss of some information.

An example CNN architecture, shown in Figure 1.10, has stacked convolutional and pooling layers, which achieve a hierarchical extraction of features, followed by up to a few dense layers at the end, which serve to tie together the global features in the image before outputting the classification or regression scores at the final layer.
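A minimal sketch of such an architecture, written with the Keras library, might look as follows; the filter counts, filter sizes, input shape and three-class output are illustrative assumptions.

from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Conv2D(16, (3, 3), strides=1, activation='relu',
                  input_shape=(64, 64, 1)),  # filters scan the image for features
    layers.MaxPooling2D((2, 2)),             # pooling summarises local intensities
    layers.Conv2D(32, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(64, activation='relu'),     # dense layer ties global features together
    layers.Dense(3, activation='softmax'),   # classification scores for 3 classes
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.summary()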

CNNs are very useful in high-dimensional problems such as image classification. One advantage is that the original data can be input into the network, which performs the feature extraction, rather than using features extracted by another method. The data usually have some form of pre-processing applied, for example to ensure that they are all on the same scale.

A powerful method to boost the performance of CNNs is to artificially increase the training set size using image augmentation (Krizhevsky et al., 2012), where many more training samples can be generated through the use of label-preserving affine transformations, such as translations, rotations and flipping, as well as adding random noise and whitening.
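A sketch of on-the-fly augmentation using the Keras ImageDataGenerator follows; the transformation ranges and the toy image array are illustrative assumptions.

import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Label-preserving transformations applied on the fly during training.
augmenter = ImageDataGenerator(
    rotation_range=180,     # radio sources have no preferred orientation
    width_shift_range=0.1,  # small translations
    height_shift_range=0.1,
    horizontal_flip=True,
    vertical_flip=True,
)

# Toy images standing in for real training data (assumption).
x_train = np.random.rand(100, 64, 64, 1)
y_train = np.random.randint(0, 3, 100)

# Each batch drawn from the generator is a freshly transformed set of images.
batches = augmenter.flow(x_train, y_train, batch_size=32)
x_batch, y_batch = next(batches)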

One perceived drawback of using CNNs is that despite being translationally invariant, they are not rotationally invariant, as the pooling operation causes the loss of relative feature information within the image. The local features are preserved, however their relationship on a global scale tends to degrade. This problem motivated the introduction of Capsule networks (Sabour et al., 2017), which consist of groups of neurons that aim to preserve all relative feature locations in the data.

Many CNN architectures have been developed for specific applications; some examples follow. Recurrent CNNs incorporate recurrent connections into each convolutional layer (Liang & Hu, 2015). ResNets utilise skip connections between convolutional layers if the classification accuracy does not improve with the addition of layers (He et al., 2015b). Generative Adversarial Networks (GANs) are made up of an image generator and a discriminator, which has the function of differentiating between real and generated images (Isola et al., 2016). Conditional Adversarial Networks are based on GANs; they condition the generative model on additional information and have been investigated as practical solutions for image-to-image translation problems. Region proposal networks extract regions of interest within images prior to computing the CNN features (Girshick et al., 2013), and localise objects.

Instead of building a new neural network for every specific task, it is possible to initialise a network using the weights from networks that have been trained on large image databases such as ImageNet (Russakovsky et al., 2014), a method known as transfer learning (Pratt et al., 1991). Transfer learning is popular given the computing power and time required to train a deep learning model from scratch, and usually increases the rate of error convergence and achieves improved classification metrics. The application of a transfer learning model must take into account that the earlier layers will detect features common to different datasets, whereas the last few layers are problem specific (Yosinski et al., 2014). As such, it has traditionally been used between datasets that are similar in nature (Tan et al., 2018). Transfer learning can boost the classification performance for small training sets (Quattoni et al., 2008).

Figure 1.10: An example of a convolutional network architecture. Convolutional layers are generally stacked with pooling layers in between to achieve a hierarchical extraction of features. Figure taken from Britz (2019).
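The transfer-learning recipe described above can be sketched with the Keras library, assuming (for illustration) a VGG16 base pre-trained on ImageNet and a new three-class problem; the head layout is also an assumption.

from tensorflow import keras
from tensorflow.keras import layers

# Initialise the convolutional base with ImageNet weights and freeze it,
# so that only the new problem-specific layers on top are trained.
base = keras.applications.VGG16(weights='imagenet', include_top=False,
                                input_shape=(64, 64, 3))
base.trainable = False  # earlier layers detect features common to many datasets

model = keras.Sequential([
    base,
    layers.Flatten(),
    layers.Dense(64, activation='relu'),
    layers.Dense(3, activation='softmax'),  # problem-specific final layer
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')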

A criticism of machine learning techniques, and in particular of DNNs, is that they are considered a ‘black box’ approach, in that one is uncertain how the network came to a particular decision given the data provided. As the use of machine learning becomes more integral to real-world data problems, it is becoming increasingly important to attempt to unravel the inner workings of the network. For example, not understanding the mechanism of the algorithm behind self-driving cars can have fatal consequences.

Several approaches have been developed to address these issues, which work on two different levels, based on interpretability and explanation of a deep learning model (Montavon et al., 2017). Direct interpretability models incorporate interpretability into the structure of the model. Post-hoc interpretability models (which seek to understand what the model predicts based on the input) include those based on activation maximisation, which searches for an input pattern that produces a maximum response for a quantity of interest and extends to the use of GANs. Techniques to explain the decisions made by a machine learning model include those based on backpropagation, such as deconvolution (a visualisation technique that gives insight into the function of intermediate feature layers and the operation of the classifier) (Zeiler & Fergus, 2013) and layer-wise relevance propagation (LRP) (Bach et al., 2015). LRP allows one to visualise the contributions of single pixels to predictions.

Another criticism specific to DNNs has concerned their generalisation ability. In some cases, DNNs can misclassify samples when small perturbations are applied to the inputs (Szegedy et al., 2013; Goodfellow et al., 2014). Zahavy et al. (2016) find that a technique called ensemble robustness, which centres on the robustness of a collection of hypotheses, improves their ability to generalise.

1.3.2 Unsupervised learning and common techniques

Unsupervised learning techniques seek to infer structure by forming groups from the data, when labels are not provided. They can be used to reduce the dimensionality of the input data and extract the discriminating features between the groups.

Common unsupervised learning techniques include Principal Component Analysis (PCA) (Pearson, 1901), hierarchical clustering (Rokach & Maimon, 2005), k-means clustering (Lloyd, 1982) and self-organising maps (SOMs; Kohonen, 1989). Neural network architectures can also be used in the context of unsupervised learning, with autoencoders being one example; they are discussed next in further detail as they are used in the current thesis.

For more details on the aforementioned methods, useful reviews include Jolliffe & Cadima (2016) for PCA, Murtagh & Contreras (2012) for hierarchical clustering, Ankita Dubey (2017) for k-means and Miljkovic (2017) for SOMs.

Autoencoders

Autoencoders are neural networks that are made up of an encoder, whose purpose is to compress the input into a hidden representation of lower dimensionality (the bottleneck) (Tishby et al., 1999), and a decoder, which aims to reconstruct the input from the hidden representation. In mathematical terms, if the encoder is expressed using the function $h = f(x)$ and the decoder is $r = g(h)$, the autoencoder function is $r = g(f(x))$. The autoencoder aims to use the hidden representation to achieve as close a reconstruction $r$ as possible to the original data $x$ (Vincent et al., 2008). When the most accurate possible reconstruction is achieved, the loss (error), which is some function of the difference between the reconstruction and the input, will be minimised. It is common to use the mean squared error (MSE) or the sum of squares error (SSE).

Autoencoders were first used to perform unsupervised pre-training of neural networks (Ballard, 1987), that is, to initialise the weights in a neural network rather than initialising them from a random distribution, a purpose for which they continue to be used. Other common applications of autoencoders are non-linear dimensionality reduction, denoising and visualisation. A simple autoencoder architecture is shown in Figure 1.11. To help avoid the vanishing gradient problem for high-complexity datasets, one can construct an autoencoder using a combination of convolutional layers and dense layers, instead of using only dense layers.

Figure 1.11: A simple autoencoder structure having an input layer, one hidden layer and an output layer. Figure taken from Khandelwal (2018).
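A minimal convolutional autoencoder of the form $r = g(f(x))$ can be sketched with the Keras library; the layer sizes and the 64x64 single-channel input are illustrative assumptions.

from tensorflow import keras
from tensorflow.keras import layers

inputs = keras.Input(shape=(64, 64, 1))

# Encoder f(x): compress the input towards a low-dimensional bottleneck.
h = layers.Conv2D(16, (3, 3), activation='relu', padding='same')(inputs)
h = layers.MaxPooling2D((2, 2))(h)
h = layers.Conv2D(8, (3, 3), activation='relu', padding='same')(h)
h = layers.MaxPooling2D((2, 2))(h)  # hidden representation h = f(x)

# Decoder g(h): reconstruct the input from the hidden representation.
r = layers.Conv2D(8, (3, 3), activation='relu', padding='same')(h)
r = layers.UpSampling2D((2, 2))(r)
r = layers.Conv2D(16, (3, 3), activation='relu', padding='same')(r)
r = layers.UpSampling2D((2, 2))(r)
outputs = layers.Conv2D(1, (3, 3), activation='sigmoid', padding='same')(r)

autoencoder = keras.Model(inputs, outputs)
autoencoder.compile(optimizer='adam', loss='mse')  # minimise the MSE between r and x
# Training reconstructs the inputs themselves: autoencoder.fit(x, x, ...)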

1.3.3 Machine learning theory

Machine learning techniques have several aspects in common. As mentioned previously, the aim is to construct a model that can make predictions based on the data provided, or map inputs to outputs. The success of many machine learning algorithms largely depends on the training set size, and the choice of a particular algorithm depends on the task at hand, the data available and its complexity. From here on, the underlying concepts common to many machine learning techniques are highlighted in boldface.

Prior to the application of any machine learning technique, it is necessary to pre-process the data to convert it into a form amenable to analysis. A common method is normalisation, which ensures all the data are on the same scale. The type or sequence of steps involved in pre-processing usually depends on the data at hand and the problem to be solved (Ball & Brunner, 2010).
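As a simple illustration, min-max normalisation rescales an array of images onto the interval [0, 1]; the array below is a hypothetical stand-in for real data.

import numpy as np

# Hypothetical stack of 100 images with arbitrary pixel units.
images = np.random.rand(100, 64, 64) * 5000.0
images = (images - images.min()) / (images.max() - images.min())  # now in [0, 1]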

In constructing a machine learning model, we can assume that in the simplest case our prediction $\hat{y}$ has a linear dependence on the input data $x$, and that our data are continuous, which can be expressed as follows:

\hat{y} = \beta_0 + \beta_1 x    (1.4)

where $\hat{y}$ is the prediction vector, $\beta_0$ and $\beta_1$ represent the coefficient estimates, and $x$ is the input data vector.

One important quantity is the difference between the true value $y$ (given data $x$) and the predicted value $\hat{y}$, which forms the basis of any cost function (Wald, 1949), where the error can be expressed as $\hat{y} - y$. The cost function is some function of the difference between $\hat{y}$ and $y$, and there are many available for use. A common cost function used in machine learning is the mean squared error (MSE), which is the average of the sum of squared differences:

\mathrm{MSE} = \frac{1}{2m} \sum_{i=1}^{m} \left( \hat{y}^{(i)} - y^{(i)} \right)^2,    (1.5)

where $m$ is the number of samples in the dataset, $i$ is a particular sample in the dataset, $\hat{y}^{(i)}$ is the prediction for sample $i$, and $y^{(i)}$ is the true value for sample $i$.

Other common cost functions are the mean absolute error, cross-entropy and log-loss. The choice of method depends on the problem at hand and the type of data available. One usually wants to minimise the cost function such that the errors approach 0. Consequently, the prediction $\hat{y}$ will be as close as possible to the true $y$.

Substituting $\hat{y}^{(i)} = \beta_0 + \beta_1 x^{(i)}$ and denoting the cost function with parameters $\beta_0$ and $\beta_1$ as $J(\beta_0, \beta_1)$:

J(\beta_0, \beta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( (\beta_0 + \beta_1 x^{(i)}) - y^{(i)} \right)^2    (1.6)

To find the minimum of the cost function, we can differentiate with respect to the parameters $\beta_0$ and $\beta_1$ and repeatedly update the parameters in the direction of the negative gradient; this method is known as gradient descent. The partial derivatives $\partial J(\beta_0, \beta_1) / \partial \beta_0$ and $\partial J(\beta_0, \beta_1) / \partial \beta_1$ express how much the cost function changes with small modifications to each parameter.

Seeing which modifications to $\beta_0$ and $\beta_1$ will result in a reduction of the cost function $J(\beta_0, \beta_1)$, the updated parameters $\beta_0^*$ and $\beta_1^*$ are calculated as follows:

\beta_0^* = \beta_0 - \alpha \frac{\partial J(\beta_0, \beta_1)}{\partial \beta_0}    (1.7)

and

\beta_1^* = \beta_1 - \alpha \frac{\partial J(\beta_0, \beta_1)}{\partial \beta_1}    (1.8)

where $\alpha$ is the learning rate, which is important to tune correctly: values that are too small make the cost function take longer to converge to a minimum, whereas values that are too large cause erratic changes to the cost function and may cause it to overshoot the minimum.
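The update rules of Equations 1.7 and 1.8 can be implemented directly; in this sketch the toy data are drawn from a known linear relation, and the learning rate and number of iterations are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)
m = 100
x = rng.uniform(0, 10, m)
y = 2.0 + 3.0 * x + rng.normal(0, 1, m)  # true beta0 = 2, beta1 = 3 (assumed)

beta0, beta1 = 0.0, 0.0
alpha = 0.01  # learning rate

for epoch in range(5000):
    y_hat = beta0 + beta1 * x
    error = y_hat - y
    # Partial derivatives of J (Equation 1.6) with respect to beta0 and beta1:
    grad0 = error.mean()
    grad1 = (error * x).mean()
    # Update each parameter against its gradient (Equations 1.7 and 1.8):
    beta0 -= alpha * grad0
    beta1 -= alpha * grad1

print(beta0, beta1)  # should approach the true values 2 and 3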

Depending on the number of parameters, the cost function can be thought of as residing in a high-dimensional space consisting of many minima, as shown in Figure 1.12. It is possible for the cost function to converge on a local minimum rather than a global one.

Figure 1.12: The cost function can be viewed as a higher-dimensional space that depends on the number of parameters. Figure taken from Daniel (2019).

It can be difficult to manually tune the various parameters to find the global minimum of the cost function. One way to make the search easier is to use a grid search (Bergstra & Bengio, 2012).
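A sketch of a grid search using the scikit-learn library follows; the choice of a support vector classifier, the toy dataset and the grid values are illustrative assumptions.

from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC
from sklearn.datasets import make_classification

# Exhaustively evaluate a small grid of hyper-parameter values.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

grid = GridSearchCV(
    SVC(),
    param_grid={'C': [0.1, 1, 10], 'gamma': [0.01, 0.1, 1]},
    cv=5,  # 5-fold cross-validation for each grid point
)
grid.fit(X, y)
print(grid.best_params_, grid.best_score_)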

One fundamental function of machine learning algorithms is the detection and extraction of features, which are individual measurable properties or observed characteristics of the data (Deng & Yu, 2014). Choosing discriminative and independent features is important in being able to distinguish between classes of interest. Features can be chosen (extracted) by hand, based on one's domain expertise, or extracted automatically by a machine learning algorithm. However, given that the process reduces the dimensionality of the data, feature extraction inevitably results in some information loss.

The curse of dimensionality is a problematic phenomenon that occurs in machine learning, as well as in other fields, when analysing and organising data in high-dimensional spaces, rather than in low-dimensional spaces such as that of the everyday world (Bellman, 2003). To combat this effect, one needs to use a sufficient amount of training data, to ensure that there are several data examples for each dimension in the representation.

Another undesired effect which can result from training a machine learning algorithm is overfitting, where the model memorises the training samples on which it achieves good results but is not able to achieve equivalent performance on a set of test/validation data; the model fails to generalise to an independent data set (Baum & Haussler, 1989). This generally occurs when there is an excess of parameters compared to the complexity and/or size of the dataset. There are different methods available that can help reduce the effect of overfitting, but some are common only to certain machine learning approaches. For example, there are ways to reduce overfitting in neural networks which do not apply to other machine learning methods. A graphical example of overfitting is shown in Figure 1.13.

Figure 1.13: An example of overfitting when training a neural network. Training should continue while the training and validation losses both decrease; if the validation loss starts to increase while the training loss is still decreasing, this indicates that overfitting has begun. The dashed line represents the ideal time at which training should be stopped. Figure taken from Avendi (2018).

However, one way to reduce overfitting that is common to many machine learning methods is regularisation (Buehlmann & van de Geer, 2011), explained in further detail below.

Consider the learned relation $Y$, where $\beta$ represents the coefficient estimates for the different predictors $X$:

Y \approx \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \ldots + \beta_n X_n \approx \beta_0 + \sum_{p=1}^{n} \beta_p X_p    (1.9)

Fitting such a model requires the introduction of a loss function, known as the residual sum of squares or RSS, which is minimised with respect to the coefficients $\beta$.

\mathrm{RSS} = \left( Y - \left( \beta_0 + \sum_{p=1}^{n} \beta_p X_p \right) \right)^2    (1.10)

Real-life data contain noise, which means Equation 1.10 may not generalise well. To account for this, an extra (regularisation) term is added to the RSS loss function:

\mathrm{RSS} = \left( Y - \left( \beta_0 + \sum_{p=1}^{n} \beta_p X_p \right) \right)^2 + \lambda \sum_{i=1}^{n} \beta_i^2,    (1.11)

where $\lambda$ represents the shrinkage quantity, which determines the extent to which flexibility in the model is penalised. Making the model more flexible requires the $\beta$ coefficients to take higher values, but at the same time the cost function should be minimised; the added term therefore prevents these coefficients from becoming too large. As $\lambda \rightarrow 0$, the equation reduces to the simple sum of squares, whereas as $\lambda \rightarrow \infty$ the coefficients $\beta_i$ become very small, so the model will not be as flexible. Formula 1.11 is referred to as weight decay in machine learning, or ridge regression in statistics (Tikhonov & Arsenin, 1977).

The regularisation method helps to reduce overfitting by ensuring that the predictor coefficients do not become too large.
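A sketch comparing an unregularised least-squares fit with ridge regression, using the scikit-learn library; the toy data and the value of the shrinkage parameter are illustrative assumptions.

import numpy as np
from sklearn.linear_model import Ridge, LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 20))       # few samples, many predictors (assumed)
y = X[:, 0] + rng.normal(0, 1, 50)  # only the first predictor actually matters

ols = LinearRegression().fit(X, y)
ridge = Ridge(alpha=10.0).fit(X, y)  # `alpha` plays the role of lambda here

# The ridge coefficients are shrunk towards zero, reducing overfitting.
print(np.abs(ols.coef_).max(), np.abs(ridge.coef_).max())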

Evaluating performance

There are several methods that can be used to evaluate the performance of machine learning techniques designed for classification. Some popular metrics are precision, recall, F1 score and accuracy. These metrics are some function of at least two of the following: true positives, true negatives, false positives and false negatives.

\mathrm{Precision} = \frac{TP}{TP + FP}    (1.12)

\mathrm{Recall} = \frac{TP}{TP + FN}    (1.13)

\mathrm{F1\;score} = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}    (1.14)

\mathrm{Accuracy} = \frac{TP + TN}{TP + FP + TN + FN},    (1.15)

where TP refers to the true positives (when the positive class is predicted and matches the label), TN refers to the true negatives (when the negative class is predicted and matches the label), FP refers to the false positives (when the positive class is incorrectly predicted) and FN refers to the false negatives (when the positive class is predicted to be in another class).

Precision refers to the fraction of true positives among all returned positive instances, and is also commonly called ‘reliability’. Recall is the fraction of true positives that are identified correctly, which gives an indication of the sensitivity of the classifier, and is also commonly called ‘completeness’. The F1 score can be interpreted as the harmonic mean of the precision and recall values. The accuracy is the total proportion of correct predictions.
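The metrics of Equations 1.12-1.15 can be computed directly from a set of predictions; the binary labels below are illustrative values.

import numpy as np

y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0])

TP = np.sum((y_pred == 1) & (y_true == 1))  # true positives
TN = np.sum((y_pred == 0) & (y_true == 0))  # true negatives
FP = np.sum((y_pred == 1) & (y_true == 0))  # false positives
FN = np.sum((y_pred == 0) & (y_true == 1))  # false negatives

precision = TP / (TP + FP)
recall = TP / (TP + FN)
f1 = 2 * precision * recall / (precision + recall)
accuracy = (TP + TN) / (TP + FP + TN + FN)
print(precision, recall, f1, accuracy)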

Many classifiers use the classification accuracy alone to evaluate their performance. However, its shortcomings include reduced distinctiveness, discriminability and informativeness, and a bias towards the class with the most data (Hossin & M.N, 2015). A more complete picture of the classifier performance is given by additionally considering the precision, recall and F1 score metrics, as they take the minority class into account.

Another way to evaluate classifier performance is to plot a Receiver Operating Characteristic (ROC) curve, which is the true positive rate versus the false positive rate (Fawcett, 2006). A larger area under the ROC curve indicates better classifier performance. However, evaluating classifier performance using ROC plots may be problematic when datasets are imbalanced (Saito & Rehmsmeier, 2015). Precision-Recall (PR) curves have been proposed as an alternative to ROC curves for tasks with a large class imbalance (see Davis & Goadrich (2006) and references therein).
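A sketch of computing ROC and Precision-Recall curves with the scikit-learn library; the imbalanced toy dataset and the logistic-regression classifier are illustrative assumptions.

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve, auc, precision_recall_curve

# Toy dataset with a large class imbalance (90%/10%).
X, y = make_classification(n_samples=500, weights=[0.9, 0.1], random_state=0)
scores = LogisticRegression().fit(X, y).predict_proba(X)[:, 1]

fpr, tpr, _ = roc_curve(y, scores)  # false vs true positive rates
print('Area under ROC curve:', auc(fpr, tpr))

prec, rec, _ = precision_recall_curve(y, scores)  # better suited to imbalance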

To handle the issue of imbalanced datasets, one can oversample items in the minority class. For example, in the context of image classification, image augmentation can be used to generate supplementary images if there are relatively few in a given class. It is also possible to use a larger weight in the cost function for incorrect classifications of positive instances (Vluymans, 2018).

A confusion matrix can also be used to evaluate the performance of a classifier. It is a 2D matrix of the number of matches between the true label and the predicted label across all categories (Stehman, 1997).

1.3.4 Machine learning techniques applied to astronomy

A selection of both supervised and unsupervised machine learning techniques have been applied to astronomical datasets, where a few of the most popular applications include the estimation of photometric redshifts and the classification of objects such as stars and galaxies (including their morphology), as well as different types of AGN (Du Buisson, 2015). The following subsections describe the application of particular machine learning techniques in the astronomical data setting.

Decision trees

Decision trees have been useful in interpreting astronomical data, owing to the simplicity with which their results are presented.

In astronomy, decision trees have been applied to the star/galaxy separation problem, with a few examples as follows. Weir et al. (1995) determine a set of highly informative object attributes on which decision tree induction was applied, to infer the rules for distinguishing between objects in the high-dimensional space, using images. Another study (Ball et al., 2006) involves the classification of more than a million photometric objects in SDSS-DR3, where they demonstrate that the star/galaxy classifications performed using decision trees are expected to be reliable for a substantial fraction of objects, over a range of magnitudes. Comparable or improved metrics were observed over the same range of object magnitudes by Vasconcellos et al. (2011), who applied a number of different decision tree algorithms to hundreds of thousands of photometric objects in SDSS-DR7, and find that the Functional Tree algorithm (which combines a standard univariate decision tree with linear functions of the attributes using linear regressions) works best on the star/galaxy classification problem.

Another application of decision trees has been in selecting quasars for surveys. One such example is White et al. (2000), who obtained the first optically bright radio-selected sample of quasars that was competitive in size with optically selected quasar surveys at the time.

Random forest

Random forests use a collection of decision trees and as a result, tend to be more robust. Several applications of random forests to astronomical data are as follows.

Random forests have been used to identify quasars from the FIRST survey (Breiman et al., 2003). A related application has been in studying the separation of quasars from stars, as well as the classification of quasars, stars and galaxies, using multiwavelength data (Gao et al., 2009).

They have also been used to estimate photometric redshifts, for example Carliles et al. (2008) employ a random forest trained on color features and spectroscopic redshifts from tens of thousands of randomly chosen primary galaxies from SDSS-DR6, to produce a mapping from color to redshift. They achieve redshift estimates with a tight RMS scatter, which is consistent with the results from previous studies using similar datasets. In Carliles et al. (2010), they use random forest regression to provide independent constraints on the redshift of each galaxy, and find that it improves the utility of redshift estimates by giving good measurements of the estimation error.

Several other uses of random forests include the construction of a large, deep catalogue of point sources utilising Pan-STARRS1 (PS1) 3π survey data, where they are found to outperform several alternative models (Tachibana & Miller, 2018). In the context of galaxy mergers, Snyder et al. (2019) present image-based merger classifications from the Illustris cosmological simulation at different time-steps for redshifts between 0.5 and 5. They train redshift-dependent random forests, yielding improved measurements of the late-stage merger fraction compared to conventional approaches. An outlier detection technique has also been developed using an unsupervised random forest algorithm (Baron & Poznanski, 2016) and found to be successful in detecting unusual objects, such as those having extreme emission line ratios, abnormally strong absorption lines or unusual continua, as well as extremely reddened galaxies.

SVMs

Due to their ability to generalise well, excellent performance and ease of adjusting the model parameters (Zhang & Zhao, 2014), SVMs have proven to be useful in numerous astronomical contexts, both in classification and regression tasks.

SVMs have been used to estimate photometric redshifts. For example, Wadadekar (2005) use SVMs on a combination of many thousands of galaxy samples from the SDSS-DR2 and the hybrid galaxy formation code GalICS. Another use is in the identification of potential supernovae using photometric and geometric features obtained from astronomical imagery (Romano et al., 2006). SVMs have also been used to quantify the morphologies of seeing-limited galaxies from a simulated sample built from a local visually classified sample from the SDSS, showing that a qualitative separation between late-type and early-type galaxies, which have different morphologies, can be obtained (Huertas-Company et al., 2008).

Across broader astrophysical scales, Hartley et al. (2017) apply SVMs in the context of gravitational lensing, where they are used to provide learning criteria for the separation of lenses and non-lenses in simulated data. Systems with small Einstein radii are found to constitute most of the lensing objects in the sky; however, they are difficult to detect by eye without very careful subtraction of the potential lensing galaxy, whereas the SVM is able to find these objects successfully. An SVM approach has also been taken to study imprints of environmental effects on the mass assembly of haloes. As such, Hui et al. (2018) discover strong connections between the cosmic environment and the shape and depth of the merger tree, formation time and galaxy density.

ANNs

ANNs have been very useful in astronomical applications, as their flexible structure allows them to perform many different tasks such as clustering and dimensionality reduction, in addition to the usual classification and regression (Baron, 2019). Their flexibility also offers greater control over the parameters, which tend to be more fixed in other machine learning methods. Shallow ANNs, having relatively few layers, are generally applied to problems with dimensions of the order that would also be analysed using lower-level machine learning algorithms such as decision trees and random forests.

Several applications of shallow ANN architectures are as follows. In addition to other machine learning methods, neural networks have also been used to estimate photometric redshifts, for example on SDSS data (Firth et al., 2003; Tagliaferri et al., 2003). The mapping from photometry to spectroscopic redshifts is derived, and the best-fit solution is applied to the photometry data only. However, this requires well-controlled and representative training sets. To this effect, Bilicki et al. (2018) apply two neural-network based methods to survey data which is a precursor to the data from future surveys that will have limited spectroscopic data available. They show that at the bright, low-redshift end, the two neural network methods perform better than the Bayesian Photometric Redshift approach in most statistics.

Storrie-Lombardi et al. (1992) use an artificial neural network (ANN) to classify galaxies morphologically into several categories, using a set of galaxy parameters. The trained network outperformed another popular method used at the time, using photometric and structural parameters (Lauberts & Valentijn, 1989).

Neural networks have also been used in the star/galaxy separation problem (Andreon et al., 1999) and found to be competitive with traditional methodologies.

Another application has been the classification of stellar spectra. As such, Weaver & Torres-Dodgen (1997) use ANNs to classify stellar spectra across all temperature and luminosity classes with the same accuracy as human experts.

CNNs

CNNs, belonging to the category of deep learning, are a type of artificial neural network having many layers. CNNs have been used across varying astronomical scales, from stars up to grand cosmological scales, at different wavelengths, mainly on image data.

Applications to stars

The application of CNNs in the stellar regime includes the implementation of a DNN architecture to identify signatures of stellar feedback in simulated molecular clouds (Van Oort et al., 2019). The network is applied to the two tasks of dense regression and segmentation, on simulated density and synthetic 12CO observations, and found to perform well on two segmentation tasks and related regression tasks. A related study is the use of a CNN for the recognition of predefined magnetic types in sunspot groups, using three different models that take magnetograms, continuum images, and two-channel pictures as input (Fang et al., 2019). They show that the CNN is able to identify the magnetic types in solar active regions.

Recurrent convolutional neural networks (RCNNs) have been useful for time-varying data. As such, Carrasco-Davis et al. (2019) propose the use of such a sequential classification model for time-varying astronomical objects such as variable stars and supernovae, using sequences of images as inputs instead of light curves. They also generate synthetic image sequences, aiming to mimic real data, which they then use to train the RCNN classifier, and evaluate the model on real images. The RCNN model showed good classification performance on the simulated test set as well as on the real dataset after fine tuning. This work was the first time that a sequential classifier was used in time-domain astronomy.

Applications to optical galaxy classification

CNNs have been used for optical galaxy classification, with a few examples as follows. The use of a CNN architecture was the winning solution to the Kaggle Galaxy Zoo challenge, where the competitors were tasked with implementing a regression technique to predict how citizen scientists would respond to questions about optical galaxy morphologies, such as whether a galaxy is round, smooth or has a bulge (Dieleman et al., 2015b). Following this, Domínguez Sánchez et al. (2018) use CNNs to obtain classifications for a morphological catalogue of galaxies in SDSS, based on two classification schemes: T-type, based on the Hubble sequence, and the Galaxy Zoo 2 (GZ2) classification scheme. On T-type galaxies, the results show a smaller offset and scatter compared to previous models using SVMs. High performance metrics are achieved using the GZ2 classification scheme. Using similar galaxy morphology classifications, Zhu et al. (2019) propose a variant of residual networks (ResNets) applied to images from the GZ2 dataset. Various performance metrics show that the proposed network achieves state-of-the-art classification performance among networks such as Dieleman's network and other ResNets.

Application to merger classification

Pearson et al. (2019) develop a CNN architecture in the context of galaxy mergers, to test whether it can reproduce visual classification of observations and physical classification of simulations, and to highlight any differences between the two. Their results overall suggest that most of the simulated mergers do not have obvious merger features, and that visually identified merger catalogues from observations are incomplete and biased towards certain types of mergers.

Applications to radio galaxy classification

Applications of CNNs to radio astronomy are the focus of the current thesis. The first work to use CNNs in classifying radio galaxy morphologies was that of Aniyan & Thorat (2017), who classify radio galaxies into FRI, FRII and Bent-tailed classes; despite the use of aggressive augmentation, however, there were problems with overfitting, due to an insufficient number of original training samples. Following this, in Lukic et al. (2018) we show it is possible to classify RGZ data into three classes of extended objects and one class of compact objects, applying the trained network to the RGZ DR1 dataset and finding that it can reproduce most of the citizen scientists' classifications. In a similar application, Alhassan et al. (2018) train a CNN to classify radio sources from FIRST into the classes of Compact, FRI, FRII and Bent-tailed morphologies. A more specialised neural network approach, trained to pay attention to relevant regions in radio galaxy images, is that of Wu et al. (2018), who use a region-based convolutional network (ClaRAN) to recognize, detect and classify radio galaxies, achieving optimal results when using radio contour data overlaid on infrared data and masking out unassociated emission. In Lukic et al. (2019) we take a different specialised approach using Capsule networks (a type of network designed to preserve the relative location of features) and compare their performance against CNNs, finding that CNNs always outperform the Capsule networks for the given dataset. This may be due to the pooling operation in CNNs offering increased robustness to the noise present in the images, and more freedom for the variations in morphology within the classes. It could also be the case that Capsule networks require a larger original sample size than traditional convolutional networks.

Applications across extragalactic scales

Additional uses of CNNs span wider astronomical scales. For example, Gheller et al. (2018) develop a CNN called COSMODEEP to detect extended extragalactic radio sources in existing and upcoming surveys, which proves to be accurate and fast in detecting very faint sources in simulated radio images, with performance comparable to that of a standard source-finding technique such as PyBDSF.

CNNs have also been applied to gravitational lensing. For instance, they have been used to quantify image distortions caused by strong gravitational lensing and to estimate the lensing parameters (Hezaveh et al., 2017). In contrast, George & Huerta (2018) use time-series inputs to a CNN to rapidly detect and characterise gravitational wave signals. A different problem is the treatment of noise in weak gravitational lensing maps. Here, Shirasaki et al. (2019) explore the use of an image-to-image translation technique with conditional adversarial networks (CANs) to reduce the noise in such maps, achieving a noise reduction of 30-40% when using observational data from ongoing and upcoming galaxy imaging surveys.

Ntampaka et al. (2019) estimate galaxy cluster masses from Chandra mock images using a CNN trained and tested on a sample of Chandra X-ray mock observations, based on 329 massive clusters from the IllustrisTNG simulation. The CNN learns from a low-resolution spatial distribution of photon counts, instead of spectral information. They find that the CNN has learned to ignore the central regions of clusters, which are known to have high scatter with mass.

Application to noise removal

CNNs have also been investigated in the context of noise removal. For example, Kerrigan et al. (2019) explore the use of deep learning methods for the identification and removal of Radio Frequency Interference (RFI). They apply a deep fully convolutional neural network to interferometric data, using both amplitude and phase information simultaneously to identify RFI.

Transfer learning

Transfer learning is a very useful approach for training neural networks that speeds up the rate of convergence and improves metrics. This method has not been utilised as much in astronomy as building and training a neural network from scratch.
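To make the idea concrete, the following is a minimal fine-tuning sketch in Keras; the weights file, layer split and hyper-parameters are hypothetical and purely illustrative, not any published model.

    # A minimal transfer-learning sketch; the file name, layer split and
    # hyper-parameters are hypothetical and purely illustrative.
    from tensorflow.keras.models import load_model
    from tensorflow.keras.optimizers import SGD

    model = load_model("cnn_pretrained_first.h5")  # hypothetical pre-trained CNN

    # Freeze the early layers so only the final layers adapt to the new survey.
    for layer in model.layers[:-2]:
        layer.trainable = False

    model.compile(optimizer=SGD(learning_rate=1e-4, momentum=0.9),
                  loss="categorical_crossentropy", metrics=["accuracy"])

    # x_new, y_new: a small labelled sample from the target survey
    # model.fit(x_new, y_new, epochs=10, batch_size=8)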

In the context of optical galaxy classification, Domínguez Sánchez et al. (2019) test the performance of deep learning models trained with SDSS data on the Dark Energy Survey (DES), using images of several thousand galaxies with a similar redshift distribution to SDSS. Loading weights pre-trained on SDSS data and fine-tuning them by training the models with a small DES sample of several hundred galaxies outperforms a direct application of the models to DES data. In a very similar application, Khan et al. (2019) demonstrate that DNNs pre-trained on real-object images can be transferred to classify galaxies that overlap both SDSS and DES, achieving state-of-the-art accuracy. The network is used to label DES galaxies that are not present in previous surveys, and is also used as a feature extractor for unsupervised clustering, showing that unlabelled DES images can be grouped into two distinct galaxy morphology classes.

Tang et al. (2019) employ a transfer learning approach for the radio galaxy classification problem. Their machine learning models trained using a random initialisation achieve similar accuracies to those of other radio galaxy classification algorithms. When applying transfer learning, they find that weights pre-trained on FIRST images can improve the performance of the model applied to lower-resolution NVSS data; however, using weights pre-trained on NVSS impairs the performance of the classifier on FIRST data.

Ackermann et al. (2018) investigate the use of deep CNNs and transfer learning for the automatic visual detection of galaxy mergers, and find them to perform significantly better than current state-of-the-art merger detection methods. The resulting metrics are also slightly better than the results obtained using a normal CNN architecture without transfer learning (Pearson et al., 2019), although the authors note differences between the studies.

Autoencoders

Autoencoders are useful in astronomical applications given their ability to achieve a non-linear dimensionality reduction, forming a generalization of linear methods such as principal component analysis (PCA) (Graff et al., 2014).

A neural network training algorithm for astronomy called SKYNET makes use of autoencoders for the purposes of compressing and denoising galaxy images, and for unsupervised pretraining (Graff et al., 2014). Lucas et al. (2018) describe the use of an autoencoder in a purely denoising application, where it is trained on a selection of noiseless and noisy astronomical images and applied to previously unseen data in order to reconstruct corrected bispectra. Another denoising application has been in gravitational wave data, where Shen et al. (2017) find that their autoencoder model achieves superior recovery performance for gravitational wave signals embedded in real non-Gaussian LIGO noise.

Regier et al. (2015) describe the first use of an autoencoder as a generative model for optical galaxy images. In addition to their capacity to generate images, autoencoders can also be used for feature extraction. For example, Iwasaki et al. (2019) apply an autoencoder architecture to X-ray data from Tycho's supernova remnant. They apply a variational autoencoder, which reduces the dimensionality of the observed spectral data, together with a Gaussian mixture model, which performs clustering in feature space, to the spatio-spectral analysis of the X-ray data. They find that the use of spectral properties alone is enough for the method to automatically recognise characteristic spatial structures.

In Lukic et al. (2019), submitted to Galaxies, we use a convolutional autoencoder (AutoSource) as a novel approach to source-finding in radio astronomical data at different signal-to-noise ratios, and find its performance competitive with the state-of-the-art Gaussian-fitting technique PyBDSF.
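As a brief illustration of the kind of architecture involved, the following is a minimal convolutional autoencoder sketch in Keras; it assumes single-channel images padded to 128x128 pixels so the pooling and upsampling stages match, and the layer sizes are illustrative rather than those of AutoSource.

    # A minimal convolutional autoencoder sketch; the layer sizes are
    # illustrative and assume single-channel images padded to 128x128 pixels.
    from tensorflow.keras import layers, models

    inp = layers.Input(shape=(128, 128, 1))
    x = layers.Conv2D(16, 3, activation="relu", padding="same")(inp)
    x = layers.MaxPooling2D(2)(x)                    # encoder: compress spatially
    x = layers.Conv2D(8, 3, activation="relu", padding="same")(x)
    encoded = layers.MaxPooling2D(2)(x)              # compact latent representation

    x = layers.Conv2D(8, 3, activation="relu", padding="same")(encoded)
    x = layers.UpSampling2D(2)(x)                    # decoder: expand back
    x = layers.Conv2D(16, 3, activation="relu", padding="same")(x)
    x = layers.UpSampling2D(2)(x)
    out = layers.Conv2D(1, 3, activation="sigmoid", padding="same")(x)

    autoencoder = models.Model(inp, out)
    autoencoder.compile(optimizer="adam", loss="binary_crossentropy")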

Principal Components Analysis (PCA)

Given its ability to perform dimensionality reduction of relatively high-dimensional data, PCA is able to extract meaningful information from astronomical datasets.
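For illustration, a brief sketch of PCA-based dimensionality reduction applied to a stack of spectra using scikit-learn is shown below; the array shapes and component count are placeholders.

    # A brief PCA sketch; `spectra` is a placeholder (n_objects, n_wavelengths)
    # array standing in for real galaxy spectra.
    import numpy as np
    from sklearn.decomposition import PCA

    rng = np.random.default_rng(0)
    spectra = rng.normal(size=(500, 4000))

    pca = PCA(n_components=10)            # keep the 10 strongest components
    coeffs = pca.fit_transform(spectra)   # per-object expansion coefficients

    # Fraction of total variance captured by the retained components:
    print(pca.explained_variance_ratio_.sum())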

A popular application of PCA in astronomical tasks has been in spectral classification. Connolly et al. (1995) find a strong correlation in the mean between the spectral classifications obtained from applying PCA and those from morphological classifications in the literature. A similar spectral classification application is in Ronen et al. (1999), who identify principal components of variance in the synthetic spectra of galaxies and find that, with a sufficiently high signal-to-noise ratio, the age and star formation history can be derived. Yip et al. (2004) perform classification of galaxy spectra from SDSS and find that the galaxy populations can be divided into three classes, ranging from early to intermediate and late types.

PCA has also been applied to characterising aspects of quasars, with two examples as follows. Bailey (2012) uses PCA on quasar and quasi-stellar source spectra from the SDSS, a comparatively noisy dataset containing missing values. The measurement error is estimated and used to weight the input data such that the resulting eigenvectors are more sensitive to the underlying signal variations. Delchambre (2016) determines the redshift of quasars from the twelfth SDSS quasar catalog and derives the appropriate spectral reduction and redshift selection methods using PCA. The redshift uncertainty and associated confidence are derived, and the results of this application are found to be similar to the performance of the SDSS pipeline.

Self-Organising Maps (SOMs)

SOMs are another dimensionality reduction technique; they group similar objects together on a map, which makes them useful for organising astronomical data.

They have been applied to the visualisation, exploration and mining of catalogues in large astronomical surveys, for example COSMOS (Geach, 2012).

Another application of SOMs has been in automatically classifying light curves to identify variable stars. To this effect, Brett et al. (2004) find that their maps successfully distinguish between light curve types in both synthetic and real datasets and are robust to the chosen learning parameters. Sarro et al. (2006) present a refined SOM for classifying light curves of eclipsing binaries.

SOMs have also been used to characterise AGN types. For example, Torniainen et al. (2008) apply an SOM to gigahertz peaked spectrum sources, and find that the sources form distinctive clusters on the map, indicating the presence of different subpopulations beyond the expected galaxy-quasar dualism. Investigating AGN of a different nature, Faisst et al. (2019) find that SOMs can successfully identify galaxies with brightness-variable AGN.

Particular applications to radio astronomy include Parallelized rotation/flipping Invariant Kohonen-maps (PINK; Polsterer et al. (2016)), which uses a rotation- and flipping-invariant similarity measure on self-organizing maps (SOMs) to obtain a visual representation of the input data. The method is applied to Radio Galaxy Zoo image data, where the SOM is trained on hundreds of thousands of images. The PINK software has also been used to produce a multi-channel SOM using images based on RGZ DR1 (Galvin et al., 2019). The resulting SOM exhibits a range of morphologies that are representative prototypes of the training objects used, and the authors are able to achieve a visible clustering of the labels given by RGZ volunteers on the surface of the SOM.

Comparisons/combinations of machine learning techniques

There are many examples in the astronomical literature that apply several machine learning techniques and compare the results to see which performs best for the dataset in question. It is also possible to combine a number of approaches into a single classifier optimised for a particular dataset. Some such examples are outlined below.

Wright et al. (2015) apply artificial neural networks, support vector machines and random forests to the identification of astronomical transients, and find that the random forest classifier outperforms the others.

In regard to galaxy applications, Hocking et al. (2018) present an unsupervised machine learning technique, composed of a combination of three unsupervised algorithms, to automatically segment and label galaxies in astronomical imaging surveys using only pixel data. The algorithm is able to clearly separate early- and late-type galaxies by training on galaxies from one field and applying the result to another field. Liu et al. (2019) provide another example applied to galaxies, instead comparing the performance of different machine learning techniques on multiwavelength images. A training set consisting of a combination of magnitudes and other derived features is constructed and used to determine how to identify submillimetre galaxy counterparts. They find that a DNN performs best, compared to other machine learning approaches such as SVMs, decision trees, random forests and normal neural networks.

With respect to studying galaxies in the radio, Ralph et al. (2019) use a combination of a self-organising map and a convolutional autoencoder to perform unsupervised clustering on radio astronomical images from the Radio Galaxy Zoo. The method is capable of accurately separating outliers on a SOM with neighbourhood similarity, and achieves a K-means clustering with a distinct class comprising a small number of highly complex sources.

A study that utilises galaxy properties is that of Wu & Boada (2019), who train a deep residual CNN to predict the gas-phase metallicity of galaxies, derived from spectroscopic information, using images from SDSS. The CNN outperforms a trained random forest algorithm. They were able to use the metallicity predicted by the CNN together with independently measured stellar masses to recover a mass-metallicity relation. Their results suggest that, by utilising optical imaging, the CNN has learned a representation of the gas-phase metallicity, which would not be available from using oxygen spectral lines.

2 Compact and Extended Radio Source Classification

The following chapter presents work as published in Lukic et al. (2018).

2.1 Introduction

Extragalactic radio sources are among the most unusual and powerful objects in the universe. With sizes sometimes exceeding a megaparsec, they have radio luminosities that are typically 100 times those of star-forming galaxies (van Velzen et al., 2012), and display a wide range of morphologies. A new generation of wide-field radio interferometers is undertaking efforts to survey the entire radio sky to unprecedented depths, making manual classification of sources an impossible task. Among the current and upcoming radio surveys that will detect such high numbers of radio sources are the LOw Frequency ARray (LOFAR¹) surveys, the Evolutionary Map of the Universe, the largest such survey in the foreseeable future (Norris et al., 2011), the VLA Sky Survey (VLASS²) and surveys planned with the Square Kilometre Array (SKA³). The SKA alone will discover up to 500 million sources down to a sensitivity of 2 µJy/beam rms (Prandoni & Seymour, 2015). Radio interferometry data often display high levels of noise and artefacts (Yatawatta, 2008), which presents additional challenges to any method of obtaining information from the data, such as extracting sources, detecting extended emission or detecting features through deep learning.

Machine learning techniques have been increasingly employed in data-rich areas of science. They have been used in high-energy physics, for example in inferring whether the substructure of an observed jet produced as a result of a high-energy collision is due to a low-mass single particle or due to multiple decay objects (Baldi et al., 2016a). Some examples in astronomy are the detection of ‘weird’ galaxies using Random Forests on Sloan data (Baron & Poznanski, 2016), Gravity Spy (Zevin et al., 2017) for LIGO detections, optimizing the performance and probability distribution function of photo-z estimation (Sadeh et al., 2016), differentiating between real and fake transients in difference imaging using artificial neural networks, random forests and boosted decision trees (Wright et al., 2015), and using convolutional neural networks to identify strong lenses in imaging data (Jacobs et al., 2017).

¹ http://www.lofar.org
² https://science.nrao.edu/science/surveys/vlass
³ https://www.skatelescope.org

Traditional machine learning approaches require features to be extracted from the data before being input into the classifier. Convolutional neural networks, a more recent machine learning method falling within the realm of deep learning, are able to perform automatic feature extraction. They suffer less information loss compared to the traditional machine learning approaches, and are more suited to high-dimensional datasets (LeCun et al., 2015). They are based on neural networks that contain more than one hidden layer (Nielsen, 2015). Each layer extracts increasingly complex features from the data before a classification or regression task is performed. The raw data can be input into the network, so minimal to no feature engineering is required (LeCun et al., 1989), and the network learns to extract the features through training. However, it should still be noted that convolutional neural networks do not always capture the data features.

The classification of optical galaxy morphologies is based on a few simple rules, which makes it suitable for machine learning. It also lends itself to citizen science, where these rules can be taught to non-experts. The Kaggle Galaxy Zoo (Willett et al., 2013) was a competition where the aim was to predict the probability distribution of the responses of citizen scientists about a galaxy's morphology using optical galaxy image data, and the winning solution used convolutional neural networks (Dieleman et al., 2015b).

The convolutional neural network approach has only very recently started to be applied to radio galaxy images. One example is the use of convolutional neural networks to infer the presence of a black hole in a radio galaxy (Alger, 2016). Another example is a recently published paper by Aniyan & Thorat (2017), where the authors present their results on classifying radio galaxy images using convolutional neural networks into the classes of Fanaroff & Riley Type 1 or 2 (FRI/FRII) (Fanaroff & Riley, 1974) and bent-tailed radio galaxies, using a few hundred original images in each class and producing a highly augmented dataset. They use a fusion classifier to combine the results of the three groups, because poor results were achieved when attempting to classify all three together. Despite obtaining classification accuracies above 90% on the FRI and FRII classes, the authors comment on issues with overfitting, due to having few representative samples in each class prior to augmentation, resulting in a small feature space, and on the fact that the network was highly sensitive to the preprocessing applied to the images.

In the case that outputs or labels are not provided alongside the input data to train on, one can use unsupervised machine learning techniques. In regard to machine learning with radio galaxy images, one method uses an unsupervised learning approach involving Kohonen maps (Parallelized rotation/flipping INvariant Kohonen maps, abbreviated to PINK) to construct prototypes of radio galaxy morphologies (Polsterer et al., 2016).

There are also automated methods that can help to generate labels, so that the task becomes a supervised machine learning problem. In the astronomical context, for example, there are source-finding tools that can provide structure to data; one such tool is PyBDSF (Rafferty, 2016). This is the approach taken in the current work to provide the training labels.

The current work initially aims to classify radio galaxy morphologies into two very distinct classes, consisting of compact sources in one class and multiple-component extended sources in the other, using convolutional neural networks. We call this setup the two-class problem. Once an optimal set of parameters is found, we test how it works for the four-class problem of classifying into compact, single-component extended, two-component extended and multiple-component extended sources.

A compact source is an unresolved single component or point source, and an extended source is a resolved source, having at least one component. The detection of point sources is important because they are used for calibration purposes and are also easier to match to their host galaxy. Making a proper census of unresolved sources is important for mapping out phase calibrators for radio interferometry (Jackson et al., 2016). Although there are more conventional techniques to detect point sources, deep learning provides an alternative approach.

The Lasagne 0.2.dev1 library⁴ is used to build a deep neural network to differentiate between different classes of radio galaxy images. We compare the classifier metrics obtained on test samples between the different models.

This paper is outlined as follows: Section 2.2 covers some basic theory about neural networks, and the advantages of using deep neural networks. In Section 2.3 we discuss the data provided by Radio Galaxy Zoo, the minor pre-processing steps, and the use of algorithms to help select an image dataset consisting of compact and extended sources. Section 2.4 explores the two-class problem of distinguishing between compact and multiple-component extended sources, and documents the parameters and classifier metrics used. Section 2.5 applies the optimal setup and parameters identified in Section 2.4 to the four-class problem of classifying between compact sources and three classes of extended sources. The best-performing setup is also tested to see how well it replicates the findings in Data Release 1 (DR1; Wong et al. in preparation) of the citizen science project Radio Galaxy Zoo.

⁴ https://lasagne.readthedocs.io/en/latest/

2.2 Deep neural networks

Neural networks can be used to perform classifications of data. If the input data is in the form of pixels of an image, along with corresponding labels for the image, this information is fed into the input layer of the network (Nielsen, 2015). Neural networks are initialised with a set of weights and biases in the hidden layers (Bishop, 1995). The data is propagated through the network and the output layer computes a prediction. An error is calculated at the output layer using a cost or loss function, which is based on the difference between the true output and the predicted output (LeCun et al., 2012). This error is back-propagated through the network, and the network adjusts the weights and biases to reduce the error (Rumelhart et al., 1986). These steps are iterated a number of times until the cost function is minimised. This is known as training a neural network.
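These steps can be made concrete with a toy sketch: a single-layer network trained by gradient descent on random placeholder data, showing the forward pass, the loss at the output layer, the back-propagated error and the weight update. This is illustrative only and is not the architecture used in this work.

    # A toy numpy sketch of the training loop: forward pass, loss,
    # back-propagation and weight update; all data are random placeholders.
    import numpy as np

    rng = np.random.default_rng(1)
    x = rng.normal(size=(32, 100))             # 32 flattened "images", 100 pixels
    t = rng.integers(0, 2, size=(32, 1))       # binary labels

    W = rng.normal(scale=0.1, size=(100, 1))   # initial weights
    b = np.zeros(1)                            # initial bias
    lr = 0.1                                   # learning rate

    for epoch in range(100):
        z = x @ W + b
        p = 1.0 / (1.0 + np.exp(-z))           # forward pass (sigmoid output)
        loss = -np.mean(t * np.log(p) + (1 - t) * np.log(1 - p))  # cross-entropy
        grad_z = (p - t) / len(x)              # error back-propagated to the output
        W -= lr * (x.T @ grad_z)               # adjust weights to reduce the error
        b -= lr * grad_z.sum(axis=0)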

In feed-forward neural networks, the nodes in the hidden layers are fully connected to the nodes in the adjacent layers. Therefore, the deeper the network becomes, the more computationally intensive and time-consuming it is to train, which often leads to the vanishing gradient problem (Hochreiter, 1991). Convolutional neural networks have been shown to work much more efficiently on high-dimensional data such as images (Krizhevsky et al., 2012), and although they still suffer from the vanishing gradient problem, its impact can be lessened by proper initialisation of the weights and biases, choosing an appropriate activation function and performing layer-wise pre-training. Such networks employ a number of filters of a certain size, as specified by the user. The receptive field is also referred to as the filter size. The filters are initialised with weights and biases from some distribution, and are connected to a small spatial portion of the input data. Features of the input data are learned through training. In image data, one can achieve a dramatic reduction in the number of parameters through parameter sharing, under the assumption of translational invariance: if one feature is useful to compute at a particular spatial position, it should also be useful to compute at a different spatial position. Parameter sharing is achieved through the use of filters (Karpathy, 2016). One can reduce the computational complexity through data reduction with the use of pooling, in essence a subsampling method. There are several ways of implementing pooling, such as max pooling and average pooling (Lee et al., 2016). The current work uses max pooling, where the maximum value within a window of the input feature map is chosen. The convolutional and pooling layers are stacked, with the end result being a hierarchical extraction of features. These layers are usually followed by one or more fully-connected layers, before finishing at the output layer, where a prediction is given (Karpathy, 2016).
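A minimal Keras sketch of the stacked convolution/pooling pattern described above is given below; the filter counts and layer sizes are illustrative and are not the architecture used in this work.

    # A minimal CNN sketch showing stacked convolutional, max-pooling and
    # dense layers; all sizes are illustrative only.
    from tensorflow.keras import layers, models

    model = models.Sequential([
        layers.Input(shape=(110, 110, 1)),             # single-channel image
        layers.Conv2D(32, (3, 3), activation="relu"),  # filters share weights
        layers.Conv2D(32, (3, 3), activation="relu"),  # two adjacent conv layers
        layers.MaxPooling2D((2, 2)),                   # max pooling subsamples
        layers.Flatten(),
        layers.Dense(64, activation="relu"),
        layers.Dropout(0.5),                           # dropout in the dense layer
        layers.Dense(2, activation="softmax"),         # class probabilities
    ])
    model.compile(optimizer="sgd", loss="categorical_crossentropy",
                  metrics=["accuracy"])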

One problem that occurs with neural networks is overfitting, which is when the architecture and parameters of the neural network fail to generalise to a separate dataset extracted from the same source that has not been trained on. In this case, the model captures the noise in the data rather than the underlying signal, or there are real features in the training set that may be peculiar to individual sources but not common to the class as a whole. Overfitting is evident if the validation error is higher than the training error. To reduce the effect of overfitting, one can use image augmentation to artificially generate more images from the original data (Krizhevsky et al., 2012). Another method is to use dropout in the dense or fully-connected layers, where a certain proportion of connections to nodes in adjacent layers are dropped to stop the network relying on the presence of particular neurons, making it more robust (Srivastava et al., 2014). Although early stopping is recommended to address the behaviour exhibited by deep neural networks trained on noise, defined as the memorization effect by Arpit et al. (2017), they find that such networks trained with Stochastic Gradient Descent learn patterns before memorizing, even in the presence of noise examples.

2.3 Methods

We utilise the radio galaxy images from the Radio Galaxy Zoo project (Banfield et al., 2015), which uses 1.4 GHz radio galaxy images from the Faint Images of the Radio Sky at Twenty Centimeters (FIRST) survey. The original FIRST data reached a 1σ noise level of 150 µJy beam⁻¹ at 5″ resolution (Becker et al., 1995). There are 206399 FITS files in total that contain single-channel image data. The images are mainly (132,132) pixels in size, resampled to a pixel size of 1.37″.

2.3.1 Pre-processing

The pixel values, representing brightness in mJy/beam, were normalised by dividing by 255 such that the values are contained within the [0,1] range. Any ‘NaN’ pixel value was converted to 0. The images were cropped to (110,110) pixels in order to slightly reduce the amount of data fed into the neural network. We were reluctant to crop further because some of the extended sources lie very close to the image boundaries, and this is information we did not want to remove. These were the only pre-processing steps applied to the original data. Later on we explore the effect of sigma clipping⁵ with a standard deviation of 3 to remove the background noise. This involves calculating the median (m) and standard deviation (σ) of the pixel values, and removing any value above m + 3σ and below m − 3σ. However, deep neural networks should be able to account for the noise in the data without performing additional background noise removal. No procedure has been performed to remove artefacts in the data. As strong sidelobe emission is observed more often in the synthesis imaging of compact radio sources, sidelobe artefacts are expected to be minimal in RGZ, and similarly so for the purposes of this paper. Banfield et al. (2015) added 5% of the total sources as compact radio sources, thus resulting in a smaller number where the sidelobe pattern could pose an issue. Therefore, we do not expect large numbers of artefacts in the images to be misidentified as radio sources or components, or to cause an issue with our method. RGZ has a selection that is biased towards extended sources from the FIRST catalogue.

⁵ http://docs.astropy.org/en/stable/api/astropy.stats.sigma_clip.html
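A brief sketch of these pre-processing steps is shown below, assuming a single FIRST cutout loaded from a FITS file; the file name is hypothetical, and the sigma-clipping step is the optional variant described above.

    # A pre-processing sketch: NaN replacement, normalisation, cropping and
    # optional 3-sigma clipping; the input file name is hypothetical.
    import numpy as np
    from astropy.io import fits
    from astropy.stats import sigma_clip

    with fits.open("rgz_source.fits") as hdul:
        image = hdul[0].data.astype(float)

    image = np.nan_to_num(image, nan=0.0)      # convert NaN pixels to 0
    image = image / 255.0                      # normalise into the [0, 1] range

    # Crop the central (110, 110) pixels of the (132, 132) image.
    r0 = (image.shape[0] - 110) // 2
    c0 = (image.shape[1] - 110) // 2
    image = image[r0:r0 + 110, c0:c0 + 110]

    # Optional: clip pixels beyond 3 standard deviations of the median.
    image = sigma_clip(image, sigma=3, maxiters=5).filled(0.0)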

In order to provide an estimate of the presence of artefacts, we considered the sources in the two-component extended class from the four-class problem and found that 18 out of 11939 sources (0.15%) contained one component with a total flux at least 50 times that of the other component.

2.3.2 PyBDSF

PyBDSF (the Python Blob Detector and Source Finder, formerly PyBDSM) by Mohan & Rafferty (2015a) is a tool designed to decompose radio interferometry images into sources composed of a set of Gaussians, shapelets, or wavelets. For the purposes of the current work, we assume that each image is of a single source or radio galaxy. Therefore, PyBDSF will detect the components belonging to the source.

In order to provide some initial structure to the data, we used the default settings of the PyBDSF version 1.8.11 ‘process_image’ task to help count the number of components in each image. The default settings include using 5-sigma for the pixel threshold and 3-sigma for island boundaries. The number of output lines in the resulting .srl file from running PyBDSF gives the number of components that could be fit using Gaussian fitting. All of the images were initially run through PyBDSF. Out of the original 206399 images, 30945 produced an error, either because the image had all blanked pixel values, presenting as NaNs (94.6%), or because no components were detected in the image (5.4%). 175454 images were successfully processed by PyBDSF and produced source list (srl) files containing information about each detected source. Of the successfully processed images, 99.7% had a NaN pixel percentage between 0 and 10%. The highest percentage of NaN pixel values was 93.2% and the median was 1.9%. The NaN values occur only along the edges of the images and are due to observations at the edges of fields. Table 2.1 lists the number of components detected in each image by PyBDSF, showing the results for up to eleven components.
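A minimal sketch of this procedure using the PyBDSF Python interface is shown below; the input file name is hypothetical, and the thresholds shown correspond to the default settings described above.

    # Running PyBDSF on a single image and counting the fitted components;
    # the file name is hypothetical.
    import bdsf

    img = bdsf.process_image("rgz_source.fits",
                             thresh_pix=5.0,    # 5-sigma pixel threshold
                             thresh_isl=3.0)    # 3-sigma island boundary

    # Write the source list (srl); one line per fitted component.
    img.write_catalog(outfile="rgz_source.srl", format="ascii",
                      catalog_type="srl", clobber=True)

    print(len(img.sources))                     # number of fitted components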

There are sometimes discrepancies between the number of components that PyBDSF detected and how many there visually appear to be in the image; PyBDSF therefore does not always perform as a human would in counting the number of components. These inconsistencies remained even when the grouping parameters were altered. The number of inconsistencies was found to increase with the number of components in the image.

Table 2.1: The number of components detected by PyBDSF and the corresponding number of sources, for up to 11 components.

PyBDSF number of components    Number of sources
1                              63051
2                              66589
3                              29482
4                              10437
5                               3517
6                               1136
7                                510
8                                264
9                                163
10                                79
11                                48

2.3.3 Image augmentation

The classification accuracy of deep neural networks increases with the size of the training set. It is possible to generate more images through label-preserving transformations such as horizontal and vertical translations and rotations (Krizhevsky et al., 2012). This method is called augmentation and reduces the amount of overfitting to the data. It can also improve performance in imbalanced class problems (Wong et al., 2016).

We augmented our images using translations, rotations and flips, but not skewing or shearing, since such transformations applied to compact sources can make them appear to have extended emission, which would render the label incorrect. The images are translated by between 0 and 22 pixels of the image width and height. Since no boundary conditions have been applied to the images, it is likely that 2.9% of images in the two-class problem and 1.0% in the four-class problem have components that have been shifted out of the image. The images are rotated by a random angle between 0 and 360 degrees. The Keras⁶ package 2.0.3 was used to produce the augmented images. Keras is a high-level neural networks API, developed with the aim of enabling fast experimentation. It is written in the Python language and able to run on top of either TensorFlow⁷ or Theano⁸.
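A sketch of these augmentation settings using the Keras ImageDataGenerator is shown below; the shift range is expressed as a fraction of the image size (22/110 = 0.2), and the array names are placeholders.

    # Augmentation with random rotations, shifts and flips, mirroring the
    # settings described above; `images` and `labels` are placeholders.
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    datagen = ImageDataGenerator(rotation_range=360,     # any random angle
                                 width_shift_range=0.2,  # up to ~22 pixels
                                 height_shift_range=0.2,
                                 horizontal_flip=True,
                                 vertical_flip=True)

    # images: array of shape (n, 110, 110, 1); labels: matching class labels
    # for batch, batch_labels in datagen.flow(images, labels, batch_size=8):
    #     ...train on the augmented batch...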

Fig. 2.1 shows examples of rotation, shifting and flipping applied to a source with extended emission. The image is an example of how some extended sources with a small amount of extended emission can look similar to compact sources, presenting challenges for deep learning methods or other programs used to extract information from images.

⁶ https://keras.io/preprocessing/image/
⁷ https://www.tensorflow.org
⁸ http://deeplearning.net/software/theano/

Figure 2.1: Examples of image augmentations with an extended source. The original image is shown on the top left. The transformations, from left to right and top to bottom, are a random rotation, shift and flip. The size of the images is (110,110) pixels, with an angular resolution of 1.14″. The colour bar represents the normalised flux densities.

2.3.4 Deep learning algorithms

There are several deep learning implementations currently available. The present work uses Lasagne 0.2.dev1 (Dieleman et al., 2015a), a lightweight library to build and train neural networks in Theano using the Python language. Python version 2.7.12 is used and the Theano version is 0.9.0dev2. Theano is a Python library that allows the user to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently. Some features of Theano include the ability to use Numpy arrays in Theano-compiled functions, the transparent use of a graphics processing unit (GPU), which enables data-intensive computations to be accomplished much faster than on a CPU, and the ability to compute derivatives for functions with one or many inputs. The Python library Lasagne is built on top of Theano but leaves the Theano symbolic variables transparent, so they can be easily altered to make changes to the model or learning procedure, as desired. This provides the user with more flexibility than other libraries.

Several network models built using the Lasagne library have been trained, in order to find the setup of parameters that results in the optimal test classification accuracy. The learning rate was set to 0.001 at the beginning and reduced by a factor of 10 at four points during training. 1000 training epochs in total were used for all the models shown, and the network parameters at the 1000th epoch were chosen for the final validation of the results. Training was stopped at this point because the training and validation losses appeared to have reached their minimum and only fluctuated around this value, without the validation loss becoming higher than the training loss, in order to avoid overfitting, unless otherwise stated.

A simple manual tuning strategy was used to optimise the hyper-parameters, which involved experimenting with batch sizes of 8, 16 and 32 against different chunk sizes and learning rates. A batch size of 8 was found to give optimal results. The batch Stochastic Gradient Descent method (Bottou, 1998) was used, where the gradient is computed using the input dataset with a specified batch size, rather than a single sample. The momentum update method used was Nesterov, with a momentum of 0.9 and a weight decay of 0. The Nesterov momentum update evaluates the gradient at the future rather than the current position, resulting in better-informed momentum updates and hence improved performance (Sutskever, 2013). The validation step is performed every 10 training epochs. The networks were trained on a single NVIDIA Tesla K20m GPU, with CUDA version 8.0.61. The categorical cross-entropy⁹ cost function was used, which has the following form:

L_i = -\sum_j t_{i,j} \log(p_{i,j}) \, , \qquad (2.1)

where i and j denote the classes and observations respectively, t_{i,j} represents the targets and p_{i,j} represents the predictions. Equation (2.1) is used for predictions falling in the range (0,1), such as the softmax output of a neural network. The outputs of the softmax function represent the probabilities that the images belong to the given classes, and add up to 1. The predictions are clipped to lie between 10^{-7} and 1 - 10^{-7} in order to make sure that they fall within the (0,1) range. There are over 1.6M parameters to train in total.

⁹ http://lasagne.readthedocs.io/en/latest/modules/objectives.html#lasagne.objectives.categorical_crossentropy
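A small numpy sketch of Equation (2.1), including the clipping of the predictions into the open interval (0,1), is given below.

    # Categorical cross-entropy with clipped predictions, as in Eq. (2.1).
    import numpy as np

    def categorical_crossentropy(targets, predictions, eps=1e-7):
        """Mean cross-entropy over a batch of one-hot targets."""
        p = np.clip(predictions, eps, 1.0 - eps)  # keep p strictly inside (0, 1)
        return -np.sum(targets * np.log(p), axis=1).mean()

    t = np.array([[1, 0], [0, 1]])                # one-hot labels
    p = np.array([[0.9, 0.1], [0.2, 0.8]])        # softmax outputs
    print(categorical_crossentropy(t, p))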

At the conclusion of training, the predictions at the final layer of the network are rounded to 0 or 1. In the two-class problem, the output [1,0] represents a compact source and [0,1] represents a multiple-component extended source. Training, validation and test classification accuracies are calculated as the proportion of rounded predictions that match the labels. The rows of the image and label data were shuffled at two stages to make sure that there was no sampling bias when choosing the training, validation and test sets. A dropout of 50% has been applied to the dense layers. The ReLU activation function (Glorot et al., 2011) was used in the convolutional layers. The ReLU function is max(0, x), so only positive inputs are passed forward and negative ones are set to 0. This makes the network sparser and therefore more efficient. Since the output is linear only in parts of the network, this ensures that the gradients flow across the active neurons, hence avoiding the vanishing gradient problem. The PReLU activation function (He et al., 2015a), which uses a negative linear function with a coefficient to control the negative part of the function, was also tried; however, it resulted in slightly worse accuracies than the ReLU. In the dense layers, the identity activation function was used. The weights were initialised with the Glorot uniform distribution (Glorot & Bengio, 2010), which has the following form when the ReLU activation function is used:

\sigma = \sqrt{\frac{2}{(n_1 + n_2) \cdot f}} \, , \qquad (2.2)

where n_1 and n_2 are the numbers of connections coming into and out of the layer respectively, and f is the receptive field size. The biases were initialised with the constant 0.
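A short numpy sketch of this initialisation is shown below; it draws uniform weights whose standard deviation follows Equation (2.2), using the fact that a uniform distribution on [-a, a] has standard deviation a/sqrt(3).

    # Glorot-style uniform initialisation with the ReLU gain of Eq. (2.2).
    import numpy as np

    def glorot_relu_init(n_in, n_out, field_size, rng=None):
        """Uniform weights with sigma = sqrt(2 / ((n_in + n_out) * field_size))."""
        if rng is None:
            rng = np.random.default_rng(0)
        sigma = np.sqrt(2.0 / ((n_in + n_out) * field_size))
        limit = sigma * np.sqrt(3.0)   # uniform on [-a, a] has std a / sqrt(3)
        return rng.uniform(-limit, limit, size=(n_in, n_out))

    W = glorot_relu_init(n_in=32, n_out=64, field_size=9)  # e.g. a 3x3 filter
    print(W.std())                     # close to the target sigma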

In Section 2.4, we explore the effect of varying the number of convolutional layers, the effect of adding augmented data for varying chunk sizes, and the effect of using only a subset of the originally provided images. The chunk size refers to the number of data examples per iteration and should be divisible by the batch size for optimal performance.

2.3.5 Selection of sources for two-class classification

In a first step, we applied a deep learning approach to two very distinct classes of radio sources: compact sources and multiple-component extended sources. Once this setup is optimised, we consider classification involving four classes.

In the current work, we define our sample of compact sources from the images where PyBDSF detected a single component, additionally using Equation (2.3) from Kimball & Ivezić (2008) as follows:

\theta = \left( \frac{F_{\mathrm{int}}}{F_{\mathrm{peak}}} \right)^{1/2} , \qquad (2.3)

where F_int and F_peak are the integrated and peak flux intensities, respectively. According to this definition, values of θ ∼ 1 are highly concentrated (unresolved), while components with larger θ are extended (resolved). Kimball & Ivezić (2008) adopt θ ≈ 1.06 as the value separating resolved and unresolved components, where components above θ ≈ 1.06 are resolved. We therefore define our compact components as those having values θ < 1.06. If there was only one compact component in the image, it was classified as a ‘compact’ source; there were 2682 such cases. The F_int and F_peak values were extracted from the provided FITS files, using the ‘imfit’ function from CASA Version 4.7.2-REL. Several batches of samples assigned to the ‘compact’ class were additionally examined visually to verify that they truly appeared to be compact sources.
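A small sketch of this selection criterion is given below; the flux values are placeholders.

    # Compactness criterion of Eq. (2.3); flux values are placeholders.
    import numpy as np

    f_int = np.array([1.20, 3.40, 0.80])    # integrated fluxes
    f_peak = np.array([1.15, 2.10, 0.79])   # peak fluxes

    theta = np.sqrt(f_int / f_peak)         # theta = (F_int / F_peak)^(1/2)
    is_compact = theta < 1.06               # Kimball & Ivezic (2008) threshold
    print(theta, is_compact)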

Table 2.2: Summary of the number of images used for the two-class problem.

Source/Image type      # Original    # Augmented
Compact                      2682          15558
Multiple-extended           18000         144633
Total                       20682         160191

The multiple-component extended sources were taken from a random sample of 18000 images where PyBDSF had detected at least 3 components. This sample can include images of multi-component compact sources.

Taking this sample of compact and extended sources, there are 20682 images altogether that can be divided into training, validation and test data sets for the initial deep learning approach. The number of images used for the two-class problem is summarised in Table 2.2. Fig. 2.2 shows some typical examples of compact and multiple-component extended sources. It should be noted that there are many more examples of multiple-extended sources than compact sources; however, the compact sources display a very well-defined morphology, whereas the multiple-extended sources can assume an almost infinite number of unique morphologies.

In examining the images where PyBDSF had detected at least three components, it appears that some of the images contain superpositions, or have fewer than three components. Upon closer inspection of a random sample of 250 images where PyBDSF detected at least three sources, roughly 44% appeared to be superpositions or visually had fewer than three components. This means that a substantial number of images assigned to the multi-component class do not truly belong there; however, there is still a stark contrast in morphology with the sources chosen for the compact class, so this should not have an overly detrimental effect on the classification accuracies. We attempt to eliminate these contaminated images when choosing sources for the four-class problem.

2.3.6 Selection of sources for four-class classification

Assuming there is an optimal choice of hyper-parameters that results in a high classification accuracy for the two-class problem of distinguishing between compact and multiple-component extended sources, we also wanted to see how such a setup would distinguish between sources belonging to four classes. We chose the images belonging to the categories of compact sources, single-component extended, two-component extended and multiple-component extended sources. The compact sources are the same as those used for the two-class problem, and the multiple-component extended sources are a subset of the ones used for the two-class problem. The single-component extended and two-component extended classes are new, and the images belonging to them have not previously been used for the deep learning approach.

Figure 2.2: Examples of compact and multiple-component extended source classifications that we initially aim to make our deep neural networks differentiate between. The top row of images represents compact sources, whereas the bottom row represents multiple-component extended sources. The colour bar represents the normalised flux densities.

The labels for the images were generated with the help of the ‘S_code’ output of PyBDSF. The S_code quantity describes the component structure (Mohan & Rafferty, 2015a) and its values are defined as follows:

• ‘S’ = a single-Gaussian component that is the only component in the island
• ‘C’ = a single-Gaussian component in an island with other components
• ‘M’ = a multi-Gaussian component

The four classes are described below:

• Compact sources: Sources where PyBDSF has detected one component, chosen as defined by Equation (2.3) from Kimball & Ivezić (2008). The same set of compact sources was used as for the two-class problem.

• Single-component extended sources: Sources where PyBDSF has detected one component, and the S_code quantity contains an ‘M’ (multi-Gaussian component).

• Two-component extended sources: Sources where PyBDSF has detected two components, and the S_code quantity contains an ‘M’ (multi-Gaussian component) for both components.

• At-least-three-component extended sources: Sources where PyBDSF has detected at least three components. We started with the set of 18000 images used for the two-class problem and required that the S_code quantity contain at least two ‘M’s and any number of ‘C’s. Additionally, two blob-detection algorithms (Laplacian of Gaussian and difference of Gaussian) were run using the scikit-image 0.17.1 package in Python¹⁰; a brief sketch follows this list. The images were also all inspected visually in an attempt to ensure that each image contained at least three extended components that appeared to be part of the same source, rather than being superpositions of sources. After this, 577 images remained. However, upon cross-checking with several optical/IR images, more than 40% of this subset still appeared to contain superpositions of components associated with more than one AGN. Therefore, although the classification successfully identifies multiple-component structures, the images are contaminated by such superpositions in comparison with Radio Galaxy Zoo classifications.
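The sketch below illustrates the two scikit-image blob detectors mentioned in the last item, applied to a normalised 2D image array; the image and detector parameters are placeholders.

    # Laplacian-of-Gaussian and difference-of-Gaussian blob detection with
    # scikit-image; the image and parameters are placeholders.
    import numpy as np
    from skimage.feature import blob_log, blob_dog

    rng = np.random.default_rng(0)
    image = rng.random((110, 110))   # stands in for a normalised radio image

    # Each output row is (row, column, sigma) for one detected blob.
    blobs_log = blob_log(image, max_sigma=10, num_sigma=10, threshold=0.1)
    blobs_dog = blob_dog(image, max_sigma=10, threshold=0.1)

    print(len(blobs_log), len(blobs_dog))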

The condition ‘S_code=S’ was not found to be useful in characterising components. Occasionally a source appeared as though it should belong to another class, so a small level of label contamination must be accepted. The number of images used for the four-class problem is summarised in Table 2.3, and Fig. 2.3 shows some example images for each of the four classes. The four-class classification scenario also contains an imbalance in the number of original images for each class; however, this can be alleviated by augmenting the classes displaying richer morphologies (single-component, two-component and multiple-component extended sources) more than the compact sources.

The existence of the remaining superpositions of sources in the multiple-component extended class of the training set means that the deep learning algorithm will not be able to distinguish between images that contain superpositions and images with components that are likely to be part of the same source. Even radio galaxy experts cannot always reach a consensus about these differences.

The fact that the compact and single-component extended sources all come from the set of images where PyBDSF detected a single source means that the deep learning algorithm is doing more than just learning the method by which PyBDSF counts components. It is also performing the source-structure functions of PyBDSF, with the advantage that it uses solely image data to learn about the differences in morphology between compact sources and single-component extended sources.

¹⁰ http://scikit-image.org/docs/dev/auto_examples/features_detection/plot_blob.html

Table 2.3: Summary of the number of images used for the four-class problem.

Source/Image type      # Original    # Augmented
Compact                      2682          15558
Single-extended              6735          43099
Two-extended                11939          35994
Multiple-extended             577          46381
Total                       21933         141032

Figure 2.3: Examples of compact, single-component extended, two-component extended and multiple-component extended sources, for a deep neural network to differentiate between. Top row: compact sources. Second row: single-component extended sources. Third row: two-component extended sources. Fourth row: multiple-component extended sources.

2.4 Results for two classes

Our first aim is to see how well a deep neural network is able to distinguish between two morphologically very distinct classes of data: compact sources and multiple-component sources. There were 2682 compact and 18000 multiple-component extended sources, giving a total of 20682 images provided as input data for classification by the convolutional neural network designed in Lasagne. When the augmented data is used as well, there are a total of 180873 images. The numbers of sources and augmented images used are summarised in Table 2.2.

The results shown are the classifier metrics on the validation and test data sets. The extended source class is used as the positive class for the metrics; therefore, a true positive (TP) is defined as an extended source being predicted for a source that is also labelled as extended. A false positive (FP) occurs when an extended source is predicted but the source is labelled as compact. A false negative (FN) occurs when a point source is predicted but the source is labelled as extended. The following four metrics are used to evaluate the performance of the classifier:

• Precision = TP/(TP + FP)
• Recall = TP/(TP + FN)
• F1 score = (2 × Precision × Recall)/(Precision + Recall)
• Accuracy = (TP + TN)/(TP + FP + TN + FN)

where FP, FN and TN denote false positives, false negatives and true negatives respectively. For the current task of classifying between extended and point sources, precision represents the classifier's ability not to classify point sources as extended sources. Recall evaluates the classifier's ability not to classify extended sources as point sources, and hence provides an estimate of the sensitivity of the classifier, i.e. whether it can correctly predict the labelled extended sources. It is worth noting that in the literature, precision is often called "reliability" and recall is often called "completeness" (e.g. Hopkins et al. (2015)). The F1 score can be interpreted as the weighted average of precision and recall. The accuracy is the overall classification accuracy across the classes: how many correct predictions were made overall for the labelled point and extended sources. The F1 and accuracy scores tend to correlate highly. The precision, recall, F1 score and accuracy metrics were calculated for both the validation and test data sets to assess the performance of each deep neural network model. It should be noted that in machine learning theory, the precision scores are a better assessor of performance than accuracy in imbalanced dataset problems. However, we address the imbalance in our dataset through augmentation, and therefore use the classification accuracy to assess the performance of the classifiers. The training and validation losses are also plotted as a function of epochs for several chosen models.
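A short sketch computing these four metrics from binary predictions, with the extended class as the positive class, is shown below; the label arrays are placeholders.

    # Precision, recall, F1 and accuracy from binary predictions
    # (1 = extended, 0 = compact); the arrays are placeholders.
    import numpy as np

    def classifier_metrics(y_true, y_pred):
        tp = np.sum((y_pred == 1) & (y_true == 1))
        fp = np.sum((y_pred == 1) & (y_true == 0))
        fn = np.sum((y_pred == 0) & (y_true == 1))
        tn = np.sum((y_pred == 0) & (y_true == 0))
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        f1 = 2 * precision * recall / (precision + recall)
        accuracy = (tp + tn) / (tp + fp + fn + tn)
        return precision, recall, f1, accuracy

    y_true = np.array([1, 1, 0, 1, 0, 1])
    y_pred = np.array([1, 0, 0, 1, 1, 1])
    print(classifier_metrics(y_true, y_pred))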

In order to assess which models are significantly better than others, rather than differing as a result of random fluctuations, we use the root mean square error (RMSE) to quantify the scatter in the overall accuracies for the original data. The RMSE is defined according to Equation (2.4).

\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_j \left( p_{i,j} - t_{i,j} \right)^2} \, , \qquad (2.4)

where i and j denote the classes and observations respectively, n is the total number of observations, t_{i,j} represents the targets and p_{i,j} represents the predictions. We consider any value beyond two times the RMSE to be significant in terms of metrics. This error estimate is conservative in that it is a measure of the actual scatter, as opposed to the derived error in the mean accuracies.

Effect of increasing convolutional layers

We first explore the effect of increasing the number of convolutional layers, in order to see the effect the model complexity has on obtaining better classification accuracies, without excessive overfitting.

The results in Tab. 2.5 and Fig. 2.4 show the effect of adding an increasing number of layers to the network. Simply using two dense layers results in precision, recall and F1 scores above 0.95 and a test accuracy above 93%. The addition of two adjacent convolutional layers and the use of sigma clipping produces a classification accuracy of 97.0%. Taking into account the RMSE values to establish random fluctuations in accuracy, the model that is significantly better than all others is the three-convolutional and two-dense-layer model with sigma clipping (model F), achieving the optimal accuracy of 97.5%. However, this setup results in overfitting, as shown in Fig. 2.5, and hence we exclude this model. The next best-performing models that are significantly better than the others, without causing overfitting, are models D and E.

Using two adjacent convolutional layers followed by a pooling layer, as opposed to a single convolutional layer followed by a pooling layer, reduces the number of parameters, given that the two filter sizes of the adjacent convolutional layers are smaller than a single larger one (Simonyan & Zisserman, 2014). Putting a max pooling layer between the first and second convolutional layers had a detrimental effect on the test accuracy, reducing it by almost 1%, which is significant given the RMSE values, and it took more training epochs to attain a smaller training loss (results not shown). The radio galaxy images with extended emission generally have structure that spans large portions of the image, yet using a single convolutional layer with a very large receptive field, or filter size, would increase the number of parameters by too large a factor. Therefore, it is better to combine two adjacent convolutional layers that have smaller receptive fields.

Table 2.4: The deep learning models that were explored.

Code    Model                              # Pooling layers
A       2 dense                            0
B       1 conv + 2 dense                   1
C       2 conv + 2 dense                   1
D       2 conv + 2 dense, sigma clip       1
E       3 conv + 2 dense                   2
F       3 conv + 2 dense, sigma clip       2


The deep learning algorithm appears to be robust to the classes being imbalanced; there are approximately 9 times more examples of the multiple-extended class images than of the compact source images. However, the compact sources have a much more stable morphology, largely consisting of a source in the centre of the image, compared to the multiple-component extended class images, which can be spread out all over the image.

Considering the results for the test data set and taking into account the RMSE values, the precision (reliability) values are on average significantly higher than the recall (completeness) values. This implies that the classifier is better at not classifying the multiple-component extended sources as point sources than it is at identifying all of the labelled multiple-component extended sources. The training losses begin at a low value of around 0.27 and quickly settle to their minimal value for a particular model by 200 epochs. A likely reason why the losses begin and remain low during training is that the classes contain images that are morphologically very different; one contains a single concentrated source in the centre of the image and the other generally contains multiple sources spread throughout the image.

The fact that a very substantial number of images belonging to the multiple-component extended class appear to contain superpositions, or visually appear to contain fewer than three components, probably does not hinder the classification accuracies significantly, since the contents of the images are very different between the two classes.

The memory requirements for a typical run using the three convolutional layer architecture are 1.87 GB, with a computational time of 192 minutes using a single NVIDIA Tesla K20m GPU, with CUDA version 8.0.61.

Table 2.5: Effect of increasing the number of convolutional layers for the original images. The precision, recall, F1 score and accuracy values are shown for both the validation and test data sets, calculated over 1000 training epochs. The validation set is used every 10 epochs, and the final trained parameters are used on the test data set after training is complete. 20682 images were used in total, with a chunk size of 6000, and the training samples make up 60% of the total data.

    Valid.   Prec.    Recall   F1 score   Accuracy   RMSE
    A        96.6%    95.6%    96.1%      93.3%      0.27
    B        97.9%    97.0%    97.4%      95.6%      0.22
    C        97.4%    97.5%    97.4%      95.6%      0.20
    D        98.2%    96.9%    97.5%      95.7%      0.21
    E        98.6%    97.5%    98.0%      96.6%      0.19
    F        98.4%    97.5%    97.9%      96.4%      0.19

    Test     Prec.    Recall   F1 score   Accuracy   RMSE
    A        97.4%    95.7%    96.6%      94.0%      0.26
    B        98.2%    96.3%    97.3%      95.3%      0.22
    C        97.7%    96.7%    97.2%      95.1%      0.21
    D        98.1%    98.4%    98.3%      97.0%      0.19
    E        98.2%    97.8%    98.0%      96.5%      0.21
    F        98.7%    98.3%    98.5%      97.5%      0.18

Figure 2.4: Plot of training and validation losses as a function of training epochs, for models A and D in Table 2.4. The higher training and validation losses are from using only 2 dense layers and no convolutional layers (model A), which produced the highest losses amongst the six models and consequently the lowest classification accuracies. Adding 2 convolutional layers produces lower training and validation losses, and therefore improved classification accuracies. The 2 convolutional and 2 dense layer architecture with sigma clipping was one of the two models that performed best out of all the models considered for this set of images.

Figure 2.5: Plot of training and validation losses as a function of training epochs, for a three convo- lutional and two dense layer model with sigma clipping (model F). Despite this model achieving the highest test accuracy for the original set of images, overfitting is evident as the validation losses are higher than the training losses.

Effect of including augmented data

Next we studied the effect of image augmentation on the classification accuracies. Table 2.6 and Fig. 2.6 show that when the full set of augmented data is used in addition to the original images, the F1 scores and the validation and test accuracies improve significantly compared to when only the original data is used. The use of the augmented images enables the choice of a larger chunk size, leading to improved accuracy without causing the network to overfit; hence a chunk size of 20000 is used, compared to the previous size of 6000. The chunk size should be made as large as possible for a given set of data, since the more training examples are seen simultaneously, the more accurately the weights can be adjusted to produce a lower training loss. The best performing architecture with the original and augmented images is the three convolutional and two dense layer architecture with no sigma clipping (model E). This setup achieves the highest observed test accuracy for the two-class problem of 97.4%. Model D performs equally well in terms of overall accuracy when taking into account the RMSE values; however, there is a greater difference between the training loss and the validation loss, as is evident in Fig. 2.6. Therefore, model E is the overall best-performing model, since its training and validation losses are closer together.

The most likely reason why a higher accuracy cannot be achieved is that there is a small amount of label contamination; for example, a few of the images in the multiple-component extended class may look more like compact sources. This is due to PyBDSF detecting multiple components in an image even though visually the image appears to only contain a compact source, as shown in Fig. 2.7.

Table 2.6: Effect of using all augmented images in addition to the original data. The precision, recall, F1 score and accuracy values are shown for both the validation and test data sets, calculated over 1000 training epochs. The validation set is used every 10 epochs, and the final trained parameters are used on the test data set after training is complete. 180873 images were used in total, with a chunk size of 20000, and the training samples make up 60% of the total data.

    Valid.   Precision   Recall   F1      Accuracy
    C        97.0%       98.2%    97.6%   95.7%
    D        98.3%       98.3%    98.3%   96.9%
    E        98.8%       97.9%    98.4%   97.0%
    F        98.6%       97.8%    98.2%   96.8%

    Test     Precision   Recall   F1      Accuracy
    C        96.6%       98.5%    97.6%   95.6%
    D        98.7%       98.4%    98.5%   97.4%
    E        99.2%       97.9%    98.6%   97.4%
    F        98.7%       97.8%    98.2%   96.9%

Table 2.7: Details of the layer parameters used for the best-performing model. The # of parameters gives a cumulative sum at each layer. There are 1676914 trainable parameters in total.

    Layer       Depth   Filter size   Stride length   # Parameters
    Conv2D      16      8             3               1040
    Conv2D      32      7             2               26160
    MaxPool2D   32      3             -               26160
    Conv2D      64      2             1               34416
    MaxPool2D   64      2             -               34416
    Dense       1024    -             -               625264
    Dense       1024    -             -               1674864
    Softmax     2       -             -               1676914

The three convolutional and two dense layer architecture is shown in Fig. 2.8, and the details of the layers, with the number of parameters used, are shown in Tab. 2.7.
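A minimal sketch of how this architecture could be written in Lasagne, the library used in this chapter. The (150,150) single-channel input shape is our assumption; with it, the layer shapes reproduce the cumulative parameter counts of Tab. 2.7. The training loop, updates and regularisation are omitted.

    from lasagne.layers import (InputLayer, Conv2DLayer, MaxPool2DLayer,
                                DenseLayer, count_params)
    from lasagne.nonlinearities import rectify, softmax

    # Input shape (1 channel, 150x150 pixels) is an assumption
    net = InputLayer(shape=(None, 1, 150, 150))
    net = Conv2DLayer(net, num_filters=16, filter_size=8, stride=3,
                      nonlinearity=rectify)        # 1040 parameters
    net = Conv2DLayer(net, num_filters=32, filter_size=7, stride=2,
                      nonlinearity=rectify)        # cumulative 26160
    net = MaxPool2DLayer(net, pool_size=3)
    net = Conv2DLayer(net, num_filters=64, filter_size=2, stride=1,
                      nonlinearity=rectify)        # cumulative 34416
    net = MaxPool2DLayer(net, pool_size=2)
    net = DenseLayer(net, num_units=1024, nonlinearity=rectify)   # 625264
    net = DenseLayer(net, num_units=1024, nonlinearity=rectify)   # 1674864
    net = DenseLayer(net, num_units=2, nonlinearity=softmax)      # 1676914
    print(count_params(net))                       # 1676914, as in Tab. 2.7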

Fig. 2.10 shows the features that are learnt in the first and third convolutional layers for the three convolutional layer architecture, halfway through training at 500 epochs.

Effect of using a subset of images

Next we explored the effect of using only a subset of images. Using a subset of the available images (1000 original and 1000 augmented) tended to significantly reduce the validation and test scores compared to using the full set of original images, as shown in Tab. 2.8, caused greater fluctuations during training, and introduced a higher level of overfitting, as shown in Fig. 2.9. The larger fluctuations during training are most likely due to the algorithm not seeing as large a number of samples at a time as when the full set of images is used, hence the weights cannot be estimated as accurately for each subsequent training epoch.

Figure 2.6: Plot of training and validation losses for the 2 conv + 2 dense layer model with sigma clipping (model D) and the 3 conv + 2 dense layer model (model E), when using the original and augmented data. The training and validation losses are higher and fluctuate more for model D, and there is a greater difference between the training and validation losses compared to model E, despite a similar test classification accuracy. Taking these factors into account, model E performs better overall.

Figure 2.7: Example of an image where PyBDSF has detected 3 components, even though the image appears to be that of a point source.


Figure 2.8: The 3 conv + 2 dense architecture, which constituted the best-performing model. The colours are arbitrarily chosen to represent the different layers used.

Table 2.8: Effect of using a subset of the original and augmented images. The precision, recall, F1 score and accuracy values are shown for both the validation and test data sets, calculated over 1000 training epochs. The validation set is used every 10 epochs, and the final trained parameters are used on the test data set after training is complete. 1000 original and 1000 augmented images were used (2000 in total) with a chunk size of 400, and the training samples make up 60% of the total data.

    Valid.   Precision   Recall   F1      Accuracy
    C        94.1%       97.8%    95.9%   92.7%
    D        95.2%       96.4%    95.8%   92.6%
    E        95.3%       96.2%    95.7%   92.4%
    F        95.3%       96.3%    95.8%   92.5%

    Test     Precision   Recall   F1      Accuracy
    C        96.5%       94.0%    95.2%   91.4%
    D        93.8%       98.1%    95.9%   93.0%
    E        95.3%       95.3%    95.3%   92.2%
    F        94.4%       91.8%    93.1%   88.3%

The validation and test accuracies nevertheless remained above 90%, with the exception of model F (the three convolutional and two dense layer setup with sigma clipping).

Tensorflow for Poets

‘Tensorflow for Poets’ uses the ‘Inception v3’ network, a deep neural network pre-trained for the ImageNet Large Scale Visual Recognition Challenge, which is able to differentiate between 1000 different classes. We used this approach to perform classifications and compared the results to the custom-designed networks using the Lasagne library; however, we found the results to be inferior. This poorer performance can be explained by the fact that the class types trained on are mainly examples of everyday objects and animals rather than scientific images. Another reason is that a custom-designed network allows much more freedom in adjusting parameters compared to a ‘black-box’ approach, where more parameters are fixed.

Figure 2.9: Training and validation losses when using only 1000 original and 1000 augmented images, with the 2 convolutional and 2 dense layer setup with sigma clipping (model D). The training losses are similar to those obtained when using the full set of 20682 images, but the fluctuations are greater. There is also some amount of overfitting.


Figure 2.10: The input image and the first and third convolutional feature map activations at 100 epochs into training, using the three convolutional and two dense layer architecture. The colours in the architecture are arbitrarily chosen to represent the different layers used.

Table 2.9: Results for the four-class model. The difference between using sigma clipping or not is very minor, and can be attributed to random fluctuations for each subsequent run.

    Valid.   Precision   Recall   F1      Accuracy
    E        92.6%       92.7%    92.7%   92.0%
    F        93.2%       93.3%    93.2%   92.7%

    Test     Precision   Recall   F1      Accuracy
    E        94.0%       94.1%    94.0%   93.5%
    F        94.0%       93.9%    93.9%   93.5%

2.5 Results for four classes

In the previous section we explored varying several parameters of the custom-designed network in Lasagne and found the configuration that results in the highest test classification accuracy for the two-class problem of distinguishing between compact and multiple-component extended sources, namely the 3 convolutional and 2 dense layer architecture without sigma clipping (model E), using both original and augmented images. Given these results, we wanted to see how well such a deep neural network setup could distinguish between an additional two classes of data, consisting of single-component extended and two-component extended sources.

The same parameters were used as described in Section 2.3.4. Two models were explored for the task of four-class classification: the 3 convolutional and 2 dense layer architecture with and without sigma clipping, using both original and augmented images. This architecture and set of images performed best on the two-class problem, which is why they were chosen for the four-class problem. The numbers of images used are summarised in Table 2.3. The issue of class imbalance was addressed by augmenting the single, two and multiple-component images to achieve roughly the same number of images for these extended sources. We used the same set of original and augmented images for the compact sources as for the two-class problem. The results shown in Table 2.9 are the classifier metrics on the validation and test data sets, as for the two-class problem, but applying a ‘macro’ average over the four classes to obtain an overall summary of the numbers of true and false positives and negatives across the confusion matrix.
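A minimal sketch of the macro averaging with scikit-learn, using made-up labels for the four classes (0 = compact, 1 = single-, 2 = two-, 3 = multiple-component extended); each class contributes equally to the average, regardless of its size:

    from sklearn.metrics import precision_score, recall_score, f1_score

    y_true = [0, 0, 1, 1, 2, 2, 3, 3]   # toy ground-truth labels
    y_pred = [0, 0, 1, 2, 2, 2, 3, 1]   # toy classifier output
    # 'macro' computes each metric per class, then takes the unweighted mean
    print(precision_score(y_true, y_pred, average='macro'))
    print(recall_score(y_true, y_pred, average='macro'))
    print(f1_score(y_true, y_pred, average='macro'))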

The inclusion of an additional two classes of data results in significantly reduced performance compared to when only two classes are used. This is likely due to the low-level label contamination in the new classes, in addition to the level already present in the previous two classes. The manually chosen images for the multiple-component extended source class still contain a substantial number of images that are superpositions. Despite this, the accuracies remain above 93%.

Table 2.10: Individual precision and recall values computed from the confusion matrix for the 4-class test set, using the original and augmented images, for model E.

    Class                         Precision   Recall
    Compact                       96.9%       97.4%
    Single-extended               93.4%       95.3%
    Two-component extended        91.1%       87.6%
    Multiple-component extended   94.6%       96.1%

Figure 2.11: Training and validation losses when using a chunk size of 20000 with a 3 conv + 2 dense model with no sigma clipping (model E), for the four-class problem. The training losses are much higher at the start compared to what was observed in the 2-class problem, and settle to a loss of around 0.2 by 400 training epochs.

Note that our machine learning algorithms are still making the correct decision in determining membership in each of the four classes, in that they were trained simply to recognise the number of extended components in close proximity. Identifying some of these images as superpositions of more than one physical radio galaxy requires more detailed information about both the radio morphology and the location of possible optical/IR counterparts. A future task for machine learning algorithms is to make use of labels with higher-level information, for example from Radio Galaxy Zoo.

The individual precision and recall values computed for each of the four classes are shown in Table 2.10. The results show that the precision and recall values are highest for the compact sources, meaning that the deep learning algorithm identifies compact sources most accurately and rarely confuses them with any other class. This is to be expected, since they have a very well-defined morphology with the least variability amongst the classes. The deep learning algorithm produces the lowest scores for the two-component extended sources, likely because there is the most overlap between these sources and the two classes on either side, namely the single-extended and multiple-extended sources.

These higher-level classes of data can be used as an initial step to facilitate the generation of more specific radio morphology classes of scientific interest.

2.5.1 Comparing results with Data Release 1 of the Radio Galaxy Zoo

Radio Galaxy Zoo classification is a two-step process. For a single classification, users first select all radio emission they consider to originate from a single radio galaxy. After selecting the radio components, users then try to match them with a galaxy in the near-IR data. If there are multiple radio sources in the image, users can repeat both steps to identify other radio galaxies. Individual classifications are aggregated to provide a consensus classification of the image based on the majority vote (Willett, 2015).

Data Release 1 of the Radio Galaxy Zoo (DR1; Wong et al. in preparation) was made with the purpose of obtaining citizen scientists' input in identifying which components belong together in a source. The ‘Number of components’ is defined as the number of discrete radio components that a source encompasses, which also depends on the lowest radio contour level chosen. The ‘Number of peaks’, which examines the components identified by RGZ participants, refers to how many peaks are present in the radio source, as determined by an automatic pipeline processor. DR1 consists of 74627 sources where user weightings have been applied to the consensus levels, retaining the sources which have a consensus level of 0.65 or higher. The minimum reliability of DR1 is 75% for a minimum weighted consensus level of 0.65 for the classifications of the FIRST survey. Using the dataset for the four-class problem consisting of 21933 images, there are 10722 (14.4%) images in common with the DR1 dataset, where the matching is done based on the source name. After removing the sources that contained invalid entries in the ‘matchComponents’ and ‘WISECATmismatch’ columns, there were 9537 remaining (12.8%).

Using the ‘Number of components’ and ‘Number of peaks’ information that originated from the citizen scientists and the post-processing pipeline, along with the images in the DR1 dataset, we were able to generate labels for the overlapping dataset of 9537 images. Since there is no way of distinguishing between compact sources and extended sources based on this information alone, we decided to make a single class composed of compact and single-component extended images. The labels for the classes were generated using the rules shown in Table 2.11. These sources make up the test set, to assess how well the custom-designed network in Lasagne is able to reproduce the labels generated from the citizen scientists' input. The sources where no class could be assigned were removed, leaving 6966 sources. The remaining images that were not part of the test set of intersected images formed the training and validation set. These numbers are summarised in Table 2.12. It is worth noting that the original set of images again contains an imbalance in the number of sources belonging to each class: there are fewer compact/single-extended sources and the fewest multiple-component extended sources. This imbalance is compensated for by augmenting these classes more.

The architecture used is the three convolutional and two dense layer architecture, since this is the overall best-performing architecture. Two datasets are used: the first contains just the original images, with imbalanced classes, and the second contains the original and augmented images, with much more balanced numbers of images in the classes. The parameters used for these two datasets are summarised in Table 2.13.

The results show that when using just the original images, the precision and recall metrics are quite low overall, as shown in the first row of Table 2.14. Upon exploring the individual metrics for the three classes in Table 2.15, the deep learning algorithm is able to identify the compact/single-extended sources effectively; however, it struggles more with identifying the two-component extended sources, despite there being more examples of this class to train on. The most likely reason is that the DR1 data contains more information for each source compared to what the deep learning algorithm is trained on. Based on the input from the citizen scientists, the DR1 pipeline may take a two-component extended source and divide it into two one-component sources, depending on the WISE ID status. The deep learning algorithm performs exceptionally poorly with the multiple-component extended images, which is not surprising given that there are only several hundred examples of this class of images to train and validate on.

In using the augmented images that have been generated to even out the class imbalance, in addition to the original images, all the average metrics are improved, as can be seen in the second row of Table 2.14. Upon examining the individual metrics for each class in Table 2.16, the precision values are improved across all the classes. The recall values are improved for the compact/single-extended class, and are substantially higher for the multiple-component extended class compared to when only the original images are used; however, they still remain quite low for this class. The deep learning algorithm is therefore much less precise and sensitive in identifying the images belonging to the multiple-component extended class, when the labels are generated according to citizen scientists' input, compared to the other two classes. It does not perform as well in detecting the images that are labelled as multiple-component extended sources, and it also predicts images as being in this class when they are labelled as belonging to another class. A couple of reasons are as follows. There were only on the order of a few hundred (475) original images to train on for the multiple-component extended source class, and although they are augmented to generate a set of images of roughly the same number as the other classes, there are perhaps not enough original examples of the different morphologies that can exist, making the feature space smaller for this class. Additionally, although the multiple-component extended sources in the training and validation set were inspected in an attempt to ensure that the images contain at least three components that are part of the same source, which was the classification scheme used by RGZ users, a substantial number of images containing source superpositions were still found upon cross-checking with several optical/IR images. However, it is important to keep in mind that the deep learning algorithm was trained to recognise the number of extended components in close proximity, using radio galaxy images only. It should be noted that all multi-component sources, whether they are superpositions or not, belong in the multi-component class.

Presumably, the higher the number of components an image appears to contain, the more likely it is that the image is a superposition of sources. This would explain why the compact/single-extended and two-component extended sources are not affected as much in terms of the precision and recall metrics as the multiple-component extended class. It should further be noted that 77.6% of images belong to the compact/single-component extended class, which explains the overall high classification accuracies in Table 2.14.

The generation of augmented images to even out the class imbalance in the original data improves the overall metrics in predicting the labels that are generated using citizen scientists' input.

Table 2.11: Rules by which labels were generated for the DR1 dataset, based on citizen scientists' input, to test the best-performing Lasagne convolutional neural network architecture. The number given refers to both the number of components and number of peaks in a given image. For example, the Compact/Single-extended class is defined as having 1 component and 1 peak.

    # components and # peaks   Label
    1                          Compact/Single-extended
    2                          Two-component extended
    3                          Multiple-component extended
    otherwise                  Ø (no class assigned)
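A sketch of the rules in Table 2.11 as a labelling function; treating any disagreement between the component and peak counts as unassignable (the Ø entry) is our reading of the table, since a single number is quoted for both counts:

    def dr1_label(n_components, n_peaks):
        """Map DR1 component/peak counts to a class, or None if unassignable."""
        if n_components != n_peaks:      # counts must agree (our assumption)
            return None
        if n_components == 1:
            return 'Compact/Single-extended'
        if n_components == 2:
            return 'Two-component extended'
        if n_components >= 3:
            return 'Multiple-component extended'
        return None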

Table 2.12: Summary of the numbers of sources used for training, validation and testing of the labels generated from the DR1 data, for both the original (Orig.) and augmented (Aug.) images.

    Data                               # Orig.   # Orig. + Aug.
    DR1                                74627     -
    Final intersected dataset (Test)   6966      -
    Compact/Single-extended (Train)    4147      14588
    Two-component extended (")         10306     14306
    Multiple-component extended (")    475       15177

Table 2.13: Chunk sizes and percentage of data used for training, validation and testing for the Lasagne deep learning network in the DR1 cross-check analysis.

                 Chunk size   % Train.   % Valid.   % Test
    Orig.        1000         59%        9%         32%
    Orig.+Aug.   3000         78%        8%         14%

Table 2.14: Validation and test metrics for the DR1 cross-check analysis.

    Valid.       Precision   Recall   F1      Accuracy
    Orig.        89.7%       58.7%    58.2%   86.4%
    Orig.+Aug.   90.6%       90.6%    90.5%   90.7%

    Test         Precision   Recall   F1      Accuracy
    Orig.        75.6%       62.6%    61.6%   92.8%
    Orig.+Aug.   79.6%       81.6%    80.6%   94.8%

Table 2.15: Individual precision and recall values computed from the confusion matrix for the DR1 test set of 6966 images, when training on just the original images.

    Class                         Precision   Recall
    Compact/Single-extended       97.2%       95.0%
    Two-component extended        79.5%       90.7%
    Multiple-component extended   50.0%       2.1%

Table 2.16: Individual precision and recall values computed from the confusion matrix for the DR1 test set of 6966 images, when training on both the original and augmented images.

    Class                         Precision   Recall
    Compact/Single-extended       97.5%       96.9%
    Two-component extended        88.0%       89.5%
    Multiple-component extended   53.4%       58.5%

2.6 Conclusions

This is a methods paper that explored the use of deep neural networks for classifying compact and various classes of extended sources in radio astronomical data. We found an optimal set of parameters from examining the two-class problem of distinguishing between two well-defined classes of data, composed of compact and multiple-component extended sources, applied this setup to a classification scenario involving more classes, and showed that the classification accuracies remain high without excessive overfitting. The results were cross-checked on the Radio Galaxy Zoo DR1 dataset, where the generation of augmented images to address the class imbalance strongly influenced the accuracy of predicting the labels generated from the citizen scientists' input. However, the predictions for the multiple-component extended class remained poor, most likely because this class contained the fewest original images to train on, and because the network did not have the additional information provided in the DR1 dataset, namely which components made up a radio source and how many peaks the source contained.

The first part of the results explored various architectures and identified the optimal parameters for distinguishing between the two morphological extremes of compact and multiple-component extended sources. We found that the three convolutional and two dense layer architecture using the original and augmented images with no sigma clipping produced the maximal accuracy of 97.4% for the two-class problem, which is significantly better than using just the original images with the same architecture. Although the equivalent architecture with sigma clipping produced an accuracy in the same range, the difference between the training and validation loss was greater. A better model is ensured if the training and validation losses are closer together. The largest influence on performance, other than the model architecture, was using a relatively large chunk size, since the more examples that are seen simultaneously, the better the estimate for adjusting the weights to achieve a lower cost function. This is where augmented data is useful, as it allows one to use a larger chunk size. Other important factors for the performance of the deep neural network are to use quite a small learning rate at the start, decreasing it by a factor of 10 at certain points during training, and to use a small batch size of 8 samples.

When training deep neural networks with a large enough number of images, removing noise through the use of sigma clipping appears to offer no significant benefit. Given an adequate number of images belonging to the available classes in question, with varying levels of noise, the deep learning network can learn these properties and become robust to them.

Using the knowledge gained from the factors that influence the performance of the classifier in the two-class problem, we assumed that the setup would perform similarly for distinguishing between an additional two classes of images. It is unclear what the effect would have been had two classes been chosen that were not extreme examples of morphologies. For the four-class problem of distinguishing between compact, single-component, two-component and multiple-component extended sources, using the three convolutional and two dense layer setup with original and augmented images, we were able to achieve a classification accuracy of 93.5%. The fact that the compact and single-component extended sources are both chosen from where PyBDSF has detected one component, and that the deep learning algorithm is able to achieve high precision and recall values for these two classes, means that the deep learning algorithm is doing more than just counting the number of components in the images.

Both the two-class and four-class problems contain different numbers of original images in each class. This did not appear to dramatically affect the performance of the classifier when using the original set of images in the two-class problem, most likely because the minority set of images comprised compact sources, which have a very specific morphology and are almost always found in the centre of the image.

It is worth noting that at least 44% of images in the multiple-component extended class in the two-class problem appeared to contain superpositions, or fewer than three components. Although we attempted to remove these images in the four-class problem by manually selecting the sources, a substantial number of images with superpositions remained, upon cross-checking with several optical/IR images. However, the deep learning algorithm was trained to identify extended components in close proximity in a radio galaxy image, so it is still making the correct decisions in determining class membership based on the image data alone.

The other classes explored, apart from compact sources, display a much richer variety of morphologies, which is why it is important to augment those images much more than the compact sources. Roughly equal augmented datasets were generated for the extended source classes in the four-class problem, to make up for the class imbalance present in the original images. This was especially important for the DR1 analysis, where the deep learning algorithm was much better able to predict the labels generated from citizen scientists' input when the augmented data was used in addition to the original data, to compensate for uneven classes. Although the precision and recall values for the compact/single-component extended sources are quite high, it is possible to use linear regression and simple positional matches to identify such sources. The metrics were moderately high for the two-component extended sources. The deep learning algorithm, however, struggles more to identify the multiple-component sources when the labels are generated using input from the citizen scientists, as evidenced by the poorer precision and recall values for this class of images. This indicates the need for both more original images and labels with higher-level information from citizen scientists to make up the training and validation set, in order to predict these sources more accurately. The value in using both data from the RGZ and the help of computer algorithms is the ability to connect discrete individual components that may be associated with a source.

The first example of using convolutional neural networks to classify radio morphologies was Aniyan & Thorat (2017), where the authors choose a couple of hundred examples of FRI, FRII and bent-tailed galaxy morphologies, perform sigma clipping, apply a high amount of augmentation, and build a fusion classifier to combine the results for the three classes. However, the authors run into problems of substantial overfitting, due to not using enough examples of different varieties of the FRI, FRII and bent-tailed classes. An earlier study using an unsupervised learning approach consisting of Kohonen maps has shown that when categorising radio galaxies into FRI and FRII type sources, sigma clipping and other pre-processing may be necessary (Polsterer et al., 2016). In contrast, the current work has shown that with enough examples of broad classes of radio galaxy morphologies, pre-processing and noise removal through sigma clipping do not offer a significant advantage, and that it is possible to classify radio galaxy morphologies into more than two classes using only convolutional networks, without a high level of overfitting.

The use of deep learning networks appears to be very well suited to source classification in radio surveys. However, one must keep in mind that a deep learning algorithm will only be able to make predictions that are as good as the level of complexity of the information that is input into it. When there are a limited number of people to make the classifications, one option to sift through the vast amount of data is to use automated techniques such as PyBDSF or blob-detection algorithms, to assist in providing structure. However, these techniques do not always reflect how humans would classify images; they are poorer at making the distinction between images containing superpositions and images containing sources that have multiple components associated with each other. They can also detect components that a human would identify as noise, as shown in Fig. 2.7. Therefore it is more likely that there will be contamination in the training set. However, given access to the classifications from an increasing consensus of people trained to identify which components belong together in a particular image, the training labels will be more accurate, as will the predictions. Citizen science projects like RGZ are an excellent way of generating training sets, and appear to have a reliability similar to that of trained astronomers.

When there are few people available to make classifications, there are limitations in the extent of human intervention that can be applied to reduce contamination in the data. In this case, the results shown indicate that it is better to devote more time to further classifying the images where PyBDSF has detected only up to a few components, as they are less likely to contain superpositions.

The labels generated with the help of algorithms such as PyBDSF are able to attain a certain level of concordance with the labels from citizen scientists. However, they appear unable as yet to replace input from humans, who are able to detect finer-scale structures and subtle aspects of morphologies, such as the amount and direction in which the bulges in the edges of radio components are pointing and how far apart they are, which influence whether the components are associated with each other for a source in question. With the availability of higher-level training labels provided by humans, as opposed to the lower-level ones provided by automated techniques such as PyBDSF, deep-learning techniques should exceed the performance of PyBDSF in the future.

Another consideration is the identification of rare sources, such as radio relics, that make up a small fraction of the overall observed morphologies. Although they are more likely to be found in those images where PyBDSF has detected a multitude of components, these images contain an increasing number of source superpositions, so it is still necessary for humans to visually inspect the sources to see whether they are true relics, since PyBDSF groups the Gaussians fitted to the sources in ways that may not match how a person would associate them, even when changing the parameters that control how the components are grouped.

In future work, we aim to optimise deep neural network setups for more complex morphological classifications and will use them on LOFAR survey data (LoTSS; Shimwell et al., 2016). We will also explore neural networks that perform cross-identification with optical/IR surveys (Norris, 2016).

3 Convolutional vs Capsule networks

The following chapter presents work as it is published by Lukic et al. (2019).

3.1 Introduction

Active Galactic Nuclei (AGN) are energetic astrophysical sources powered by accretion onto super-massive black holes in galaxies (Padovani, 2017; Fabian, 1999). There are many classes of AGN, of which one subset is radio-loud AGN, also known as radio galaxies. The two main ways of classifying radio galaxies are by the properties of optical emission lines (Hine & Longair, 1979) or by the radio morphology of the jets (Bicknell, 1995). The classification of radio galaxy morphology is of research interest in wide-field radio surveys as it correlates with physical properties of the galaxy, such as the total power, dust distribution, surrounding environment, and galaxy and cluster evolution (Saripalli, 2012). Radio galaxies can present compact or extended radio morphologies (Miraghaei & Best, 2017) and are often classified into either FRI (core-bright) or FRII (edge-bright) galaxies (Fanaroff & Riley, 1974). Rarer are hybrid galaxies, which fall in between FRI and FRII galaxies (Gopal-Krishna & Wiita, 2000). There are physical differences between the two classes. The jets of FRIs are less powerful and are disrupted quite close to the core of the radio galaxy, while the jets of FRIIs are more powerful and stay relativistic for much larger distances, terminating in a shock (Contopoulos et al., 2015). The transition from FRII to FRI radio galaxies is thought to occur as the jet becomes sub-relativistic (Bicknell, 1994). As the environment plays a large role in the morphology of radio galaxies, it is not unusual for the two lobes to have different appearances, especially for FRIs. The dynamics of the ambient gas and the motion of the host galaxy can create tails or distort the jets through ram pressure stripping (Feretti, 2003). Compact radio sources may be either scaled-down (young) versions of the FRI or FRII sources, or may represent a physically distinct population (Baldi et al., 2015).

Radio surveys map ever-increasing numbers of radio sources. The visual classification of such sources becomes increasingly time-consuming and will be completely unfeasible with the rapidly increasing data volumes. Recent and upcoming surveys, such as the LOFAR Two-Metre Sky Survey (LoTSS; Shimwell, T. W. et al., 2017), the Evolutionary Map of the Universe (EMU; Norris et al., 2011) and surveys with the Square Kilometre Array (SKA; Prandoni & Seymour, 2015), will detect many millions of galaxies.

Citizen science projects have been used for classifying astronomical sources, for example in Galaxy Zoo 2 (Willett et al., 2013) and Radio Galaxy Zoo (Banfield et al., 2015). It is also possible to use automated techniques to classify images. Ultimately, these classifications can be used as training sets for machine learning algorithms, in particular deep learning algorithms, when the data is high-dimensional (Wu et al., 2018).

The most prominent wide-area radio surveys, such as the Faint Images of the Radio Sky at Twenty centimetres (FIRST; Becker et al., 1995) and the NRAO VLA Sky Survey (NVSS; Condon et al., 1998), have mostly been conducted at GHz frequencies. In contrast, the LoTSS survey, which is the focus of the current work, has been carried out at 150 MHz with the Low Frequency Array (LOFAR). As such, LOFAR can detect synchrotron emission from older populations of relativistic electrons (which have steeper spectra) found in the extended regions of sources. Furthermore, with its combination of long and short baselines, LoTSS offers both a high angular resolution (≈ 6″) for detailed mapping and a high sensitivity to extended emission.

The cross-identification of radio sources with their optical or infrared hosts helps to associate radio components with sources and to determine properties such as host galaxy redshift and mass. Previously, cross-identification has been done using visual input from citizen scientists in Radio Galaxy Zoo (Banfield et al., 2015), and automated methods for cross-identifying radio emission with infrared counterparts have been explored (Alger et al., 2018). In the LoTSS survey (Shimwell, T. W. et al., 2019) the radio sources have been cross-matched with their optical counterparts. For the majority of sources a maximum-likelihood ratio test was adequate, because the sources are small and unresolved. For sources that are too large or complex, visual host identification has been applied (Williams, W. L. et al., 2019).

The first published work on the automated image classification of radio sources using deep learning algorithms was Aniyan & Thorat (2017), where the authors use a limited number of original radio galaxy images and apply aggressive augmentation to classify sources into FRI, FRII and bent-tailed classes. In previous work, we have shown that it is possible to classify radio sources into four categories based on the number of components belonging to the radio source, with a classification accuracy of 94.8% (Lukic et al., 2018) on the Radio Galaxy Zoo (RGZ) DR1 catalogue (Wong et al, in prep). Alhassan et al. (2018) developed a convolutional neural network model to classify FIRST sources into four classes, including compact, FRI, FRII and bent-tail sources, achieving overall accuracies >90%. Wu et al. (2018) use regional convolutional networks to localise, recognise and classify sources, the best model obtaining a final mean average precision of 83.4%, using the number of peaks and number of components of a particular radio source. This approach, however, does not always lend itself easily to clear morphological classifications in the FRI or FRII cases, because the relative orientations of components are not taken into account.

The aim of the current work is to compare the performance of two setups of deep learning networks (capsule networks and convolutional networks) in the classification of radio sources. As a data set, we used the first data release of the LoTSS survey (Shimwell, T. W. et al., 2019). Capsule networks are a more recently developed deep learning technique, invented to help preserve the local feature information within an image, which can be degraded in traditional convolutional networks owing to the pooling operation. In the context of radio galaxies, the orientation and pattern of the emission is important, as it determines the morphological classification. The data from the LOFAR LoTSS survey reveals sources in unprecedented detail, therefore a source that had a particular morphology in an earlier survey may be revealed to have a different one when imaged with LOFAR.

This paper is outlined as follows: Section 3.2 describes the LOFAR dataset, including catalogue information and image data, as well as how the classifications are generated. Section 3.3 discusses the pre-processing and augmentation applied to the original images. Section 3.4 describes the theory behind the two deep learning approaches explored, namely convolutional neural networks and capsule networks. Section 3.5 explores the performance of different capsule network models against standard convolutional neural network setups, including transfer learning on the LOFAR data, when training on different sets of images. The results are also discussed in Section 3.5. Section 3.6 summarises our overall findings.

3.2 LOFAR HETDEX v1.0 dataset

3.2.1 Source cutouts

The sources in our dataset originate from a 424 square degree region of the HETDEX Spring Field, mapped by the LOFAR Two-metre Sky Survey (LoTSS) and released as Data Release 1 (Shimwell, T. W. et al., 2019). The LoTSS survey detects a total of 325,694 sources where the signal is five times that of the noise, and the density of sources is a factor of approximately 10 higher than in the most sensitive existing very wide-area radio-continuum surveys. We use v1.0 of the value-added catalogue for the HETDEX-area data release of LoTSS. The first step in creating the value-added catalogue involved using PyBDSF1 to produce a radio source catalogue for the field, after which a decision tree was used to further categorise the sources, with details provided in Williams, W. L. et al. (2019). Filtering the 325,694 sources to include only those classified as resolved leaves 24,096 sources (Shimwell, T. W. et al., 2019). The catalogue also contains 180 columns describing the properties of the sources, such as redshift and position.

1 http://www.astron.nl/citt/pybdsf/

Figure 3.1: Histogram of sizes (in pixels per side) of the filtered cutout images. The total number of images is 6708.

In order to exclude star-forming galaxies and sources with less certain redshift values, we made use of the AGN subsample of the LoTSS catalogue derived by Hardcastle et al. (2019), leaving 6708 sources. We note that this is a substantial limitation of the machine learning approach when using radio galaxy image data only, as it is generally not possible to filter out the star-forming galaxies without the use of additional data at other wavelengths. The source classifications were only available for the 6708 sources classified as AGN and with known redshifts, therefore the analysis is restricted to this set. However, accurate knowledge of the redshift is not strictly required for morphological classification.

Finally, we assume that there is one source per image. Square cutouts of each source are produced from the fits images, where the cutout size is determined by the catalogued size of the radio source. These range from (66,66) pixels up to (2342,2342) pixels. The size of the pixels is roughly 1.5″ × 1.5″. Figure 3.1 shows the histogram of the side length in pixels of the images for these 6708 samples.

3.2.2 Classifications

The LoTSS association and cross-identification effort (Williams, W. L. et al., 2019) was a project in which expert astronomers were tasked with characterising the radio emission for sources larger than 15″. They indicated the locations of the peaks and extents of the emission, and whether one or more sources were present.

The 6708-source sample (see Section 3.2.1) was classified into 6 classes using an automated technique (Mingo et al. in prep). The 6 classes are Unresolved-1, FRI, FRII, Hybrid-1,

Hybrid-2 and Unresolved-2, all of which are described in further detail as follows. After the host galaxy location had been identified through the LoTSS identification effort (Williams, W. L. et al., 2019), the distances d1 and d2 were determined as the distances in pixels from the host galaxy to the brightest peaks of emission on either side of the source (shown with points marked with Y/inverted Y in Figure 3.2). Similarly, Maxd1 and Maxd2 were determined as the maximum extents of the source in each direction (marked with triangles on the plots), out to the masked 4rms limit. A 120 degree aperture cone is used to find the peaks along the directions of d1 and d2. The comparison of d1/Maxd1 and d2/Maxd2 is then used to classify the sources. If, on both sides, the peak is at less than half of the distance between the position of the host galaxy and the maximum extent of the emission (i.e. d1/Maxd1 < 0.5 and d2/Maxd2 < 0.5), then the source is classified as an FRI; these make up 15% of the total sources. Likewise, if it is at more than half of the distance (d1/Maxd1 > 0.5 and d2/Maxd2 > 0.5), then the source is classified as an FRII. The FRIIs make up 7% of the total sources.
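A minimal sketch of this decision rule; the actual pipeline (Mingo et al. in prep) additionally handles the aperture cones, masking and unresolved cases, which are not reproduced here:

    def fr_class(d1, d2, maxd1, maxd2):
        """Classify a source from the peak-to-extent ratios on both sides."""
        r1, r2 = d1 / maxd1, d2 / maxd2
        if r1 < 0.5 and r2 < 0.5:
            return 'FRI'     # peaks close to the core on both sides
        if r1 > 0.5 and r2 > 0.5:
            return 'FRII'    # peaks near the outer edges on both sides
        return 'Hybrid'      # FRI-like on one side, FRII-like on the other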

In addition to the FRI and FRII labels, four further labels were defined. The Hybrid-1 and Hybrid-2 classes refer to sources which show FRI morphology on one side of the source and FRII on the other, with the ‘1’ or ‘2’ reflecting the classification of the brighter of the two sides. The Hybrid classes together make up 6% of the sources. Unresolved-1 sources correspond to those images that have fewer than 5 pixels of signal above 4rms, making up 22% of the sources. This class is useful as it indicates which images are too noisy to be characterised into a particular class (note that it is different from the Unresolved sources previously discussed, which were based on the extent of the overall radio emission). Finally, the Unresolved-2 class contains a collection of mostly FRI and FRII sources that were unable to be classified accurately by the automated algorithm as they were too small, which makes up 50% of the sources. Figure 3.2 shows an example image source, demonstrating how the classification labels were generated.

In the current work, we have chosen the Unresolved-1 (henceforth called Unresolved), FRI and FRII classes to evaluate the performance of our deep learning algorithms, as these had the most confident classifications. There are 2901 original images in total, as shown in Table 3.1.

The automated classification technique (Mingo et al. in prep) involved using masked 4rms arrays (where emission below 4rms is removed and potential unassociated emission is masked), rather than the raw fits data. We define unassociated emission as radio emission which does not appear to belong to the radio source in question. A flood-filling algorithm2 and masking techniques have additionally been applied in order to identify and use associated structures and consequently remove unassociated emission from the image (Mingo et al. in prep). On the other hand, the current work emphasises using the raw fits images as the input to the deep learning algorithms, to see if they could be trained to cope with unassociated emission

2 http://scikit-image.org/docs/dev/api/skimage.measure.html#skimage.measure.label

Figure 3.2: The masked array from which classifications are generated. The red cross indicates the position of the optical source, the black Y’s indicate the peaks of the emission and the blue triangles indicate the maximum extents of emission. The optical position is calculated from the user’s clicks on the LOFAR Two-metre Sky Survey images, or from the maximum likelihood method. The Y’s and blue triangles are outputs from the automated classification code.

and unfiltered noise. After visual inspection, we found that approximately 1% of images contain potentially unassociated emission, whereas the majority of the images contain varying levels of noise.

In cases where the calibration did not perform as expected, the source will not be deconvolved accurately, causing flux leakage. This could result in the source being misclassified, leading to label errors. After inspecting several batches of images, we estimated the fraction of labels containing errors to be less than 6%, when considering both FRIs and FRIIs. Since larger sources are easier to classify, there is a decreased likelihood that they will be mislabeled; therefore, the size of the source affects the presence of noisy labels. However, pre-filtering is applied to ensure the effect is not very large.

Figure 3.3 shows typical examples of source types across the three classes. It is evident that there are varying levels of noise present in the images, presenting the largest hindrance to the deep learning algorithms' ability to classify the sources accurately. One of the aims of the current work is to see how well the algorithms can classify the sources in the presence of such undesirable features, present in the original radio images (fits files). We also compare the results obtained when using the masked 4rms clipped arrays (see Section 3.5.3), where emission below 4rms is removed and potential unassociated emission is masked.

Figure 3.3: Morphology samples of the fits cutouts when converted to png images using the ‘hot’ colormap. The top row shows the Unresolved class, the middle row the FRI class, and the bottom row the FRII class. There are varying levels of noise and occasional potentially unassociated emission present in the images.

3.3 Methods

We use the radio galaxy image fits cutouts from version 1.0 of the LoTSS DR1 value-added catalogue (Williams, W. L. et al., 2019). The extended source identifications do not differ from the final version to a large extent.

3.3.1 Pre-processing

Since the size of each cutout varies, they first need to be made the same size. The fits images have been resized to (200,200) pixels, where the smaller images have been padded with zeros around the edges and the larger images have been downsampled using bicubic interpolation. The sizes of the arrays vary across all three classes. Following this, the images are centred on the position of the optical source, ensuring its position is at (100,100). We crop to the inner (100,100) pixel part of the image, as the source is likely to be contained in this interval, and to reduce the amount of data input into the network. The pixel values, representing brightness in mJy/beam, were normalised by dividing by the maximum value in each image, so that the values are contained within the [0,1] range. The images are taken at 150 MHz. We apply the ‘hot’ colormap from the python matplotlib library, which converts the images from a single-channel numpy array to an RGB png image. This is done by assigning a colour (RGB vector) according to the value in the single-channel array. For example, values close to 1 are bright yellow in the ‘hot’ colormap scheme, therefore (r,g,b) ≈ (1,1,0.99). The conversions to the RGB vector are provided in footnote 3. The conversion is done to make the arrays more amenable to deep learning analysis and has no bearing on the flux values. The number of sources in each class is given in Table 3.1.
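A minimal sketch of the normalisation and colormap conversion using matplotlib; the function name is ours, and the surrounding file I/O is omitted:

    import numpy as np
    from matplotlib import cm

    def to_rgb(cutout):
        """Normalise a single-channel cutout to [0,1] and map it to RGB
        with the 'hot' colormap."""
        scaled = cutout / cutout.max()
        rgba = cm.hot(scaled)        # (H, W, 4) RGBA floats in [0, 1]
        return rgba[..., :3]         # drop the alpha channel -> (H, W, 3)

    rgb = to_rgb(np.random.rand(100, 100))   # e.g. a (100,100) crop -> (100,100,3)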

Cropping the images to (100,100) pixels, instead of using the originally resized images of (200,200) pixels, reduces the impact of radio emission that is potentially unassociated with the main source in the centre. We have also experimented with using central sizes other than (100,100) pixels; however, they resulted in worsened performance metrics. Smaller images tended to have some associated emission truncated, whereas larger images encapsulated more unassociated emission. The cropping still preserved the general noise characteristics surrounding the source.

The upsizing of images should not have any detrimental effects on image quality; however, the downsizing may cause effects such as a slight distortion of the radio emission due to the interpolation.

3 y ∈ (0, 0.36): (r,g,b) ≈ (x = y/0.36, 0, 0)
  y ∈ (0.36, 0.74): (r,g,b) ≈ (1, x = (y−0.37)/0.37, 0)
  y ∈ (0.74, 1): (r,g,b) ≈ (1, 1, x = (y−0.75)/0.25)

Table 3.1: The number of original and augmented sources, divided into training and testing sets. The percentage of samples in each class is also given for the test set. Since only original images should be used in the test set, the augmented images are used for training only.

    Class    # Orig. (Train)   # Orig. (Test)   # Aug.   # Total
    Unres.   1156              301 (50.2%)      4371     5828
    FRI      765               219 (36.5%)      5904     6888
    FRII     380               80 (13.3%)       2760     3220
    Total    2301              600              13035    15936

3.3.2 Image augmentation

Deep learning algorithms generally require large numbers of labelled images in order to make predictions successfully and to reduce the effect of overfitting, in which the algorithm memorises the training samples and the model therefore fails to generalise on an independent dataset. More images can be generated artificially by performing simple transformations on the original data (Krizhevsky et al., 2012). As such, we apply translation, rotation and flipping to generate more images. For translation, we initially draw a random shift of between 0 and 20 pixels in any of the four directions, with the condition that if such a translation moves the brightest pixel out of the image, the translation is reduced to 10% of the original value. This reduces the possibility that part of a radio component will be shifted out of the image. The images have been rotated randomly in multiples of 90 degrees only, in order to avoid interpolation artefacts. We note that since there is a limited range of rotation applied, it is not enough to ensure complete rotational invariance in our models. Both horizontal and vertical flipping have been applied at random. The augmentation of the FRI and FRII sources has been done keeping their overall proportions similar in number to the original dataset, as this resulted in improved performance. The number of original and augmented images used in the current work is given in Table 3.1. Image augmentation is applied to both the original LOFAR images and the masked 4rms arrays. A sketch of one such augmentation pass is given below.
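The following is a sketch under stated assumptions (the function name and the use of scipy for the shift are ours; the original code may differ in detail):

    import numpy as np
    from scipy.ndimage import shift

    def augment(img, max_shift=20, rng=np.random):
        """Random 90-degree rotation, flips and a bounded translation."""
        img = np.rot90(img, k=rng.randint(4))    # multiples of 90 degrees only
        if rng.rand() < 0.5:
            img = img[:, ::-1]                   # random horizontal flip
        if rng.rand() < 0.5:
            img = img[::-1, :]                   # random vertical flip
        dy, dx = rng.randint(-max_shift, max_shift + 1, size=2)
        py, px = np.unravel_index(img.argmax(), img.shape)
        # if the shift would move the brightest pixel out of the image,
        # reduce the translation to 10% of its original value
        if not (0 <= py + dy < img.shape[0] and 0 <= px + dx < img.shape[1]):
            dy, dx = int(0.1 * dy), int(0.1 * dx)
        return shift(img, (dy, dx), order=0, cval=0.0)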

3.4 Deep Learning algorithms

The most successful class of machine learning methods in the context of extracting information from high-dimensional data is deep learning, which has achieved unprecedented performance in a variety of domains, such as image recognition, sentiment analysis and genomics (LeCun et al., 2015). The ability of deep networks to learn multiple representations of the data lies in their stacked layer architecture. The most commonly used implementation of deep learning has to date been convolutional neural networks. However, more recent advances have been made in addressing the lack of rotational invariance in convolutional neural networks through the development of capsule networks.

3.4.1 Convolutional Neural Networks

Neural networks and deep learning algorithms are generally trained using the backpropagation algorithm, where a gradient descent optimisation algorithm is used to minimise the error between the predictions of the network and the input labels by calculating the gradients and adjusting the weights accordingly (Rumelhart et al., 1986). A deep fully connected neural network becomes time-consuming and computationally intensive to train. Convolutional neural networks employ smaller-sized filters that scan across the image and extract features, which greatly reduces the dimensionality compared to using adjacent layers of fully connected neurons and enforces parameter sharing and therefore translational invariance (Karpathy, 2016). Spatial pooling layers are typically inserted between convolutional layers, which further reduces the dimensionality of features propagated through the network. In max pooling, the maximum value of a certain region of the image is output into the next layer. However, since the pooling operation summarises the information in a local part of the image, the global feature information within the image tends to degrade.

3.4.2 Capsule networks

Capsule networks (Sabour et al., 2017) have been developed to preserve the relative locations of features within images and thus model the hierarchical relationships better. Whereas traditional neural networks output a single activation value, capsule networks are higher dimensional and output a vector representing a group of parameters such as orientation, skew, thickness etc., depending on the input. The overall length of these vectors gives the probability that the entity exists. Capsule networks have achieved state-of-the-art performance on the MNIST dataset (Lecun et al., 1998) without data augmentation (Xi et al., 2017).

In the context of radio galaxy classification, capsule networks should be able to preserve the emission pattern features over a large spatial extent, given an adequate training set size.

Below we summarize the theory behind capsule networks, but see Sabour et al. (2017) for a detailed description. For all capsules above the first layer of capsules, the input to a capsule s_j is a weighted sum over all prediction vectors from the capsules in the layer below, obtained by multiplying the output u_i of a capsule in the layer below by a weight matrix W_ij and the coupling coefficient c_ij, as shown in Equation 3.1

s_j = \sum_i c_{ij} W_{ij} u_i \qquad (3.1)

The coupling coefficients c_ij are determined by a routing softmax function, given by Equation 3.2

c_{ij} = \frac{\exp(b_{ij})}{\sum_k \exp(b_{ik})} \qquad (3.2)

The coupling coefficient c_ij quantifies the level of agreement between the predicted output of capsules in a layer and their parent capsules in the layer above. b_ij gives the log prior probability that capsule i should be coupled to capsule j.

The vector output is calculated as shown in Equation (3.3)

v_j = \frac{\|s_j\|^2}{1 + \|s_j\|^2} \, \frac{s_j}{\|s_j\|}, \qquad (3.3)

where v_j is the vector output of capsule j and s_j is its total input. The length of this output gives the probability that the specific property represented by the capsule exists in its input. The mapping from s_j to v_j is an activation function, also referred to as a squashing function, as it shrinks short vectors to lengths near zero if a property is not present in the capsule, and long vectors to lengths close to 1 if the property exists.
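For illustration, the squashing function of Equation (3.3) can be written in a few lines of numpy; the epsilon guard against division by zero is our addition:

```python
import numpy as np

def squash(s, axis=-1, eps=1e-8):
    """Squashing non-linearity of Eq. (3.3): shrinks short vectors towards
    zero and long vectors towards unit length, preserving orientation."""
    sq_norm = np.sum(np.square(s), axis=axis, keepdims=True)
    scale = sq_norm / (1.0 + sq_norm) / np.sqrt(sq_norm + eps)
    return scale * s
```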

The agreement a_ij used for updating the log probabilities and coupling coefficients is given by Equation (3.4)

a_{ij} = v_j \cdot W_{ij} u_i \qquad (3.4)

A margin loss function is used in order to determine whether a radio galaxy of a particular class is present, which has the form given by Equation (3.5):

L_k = T_k \max(0, m^{+} - \|v_k\|)^2 + \lambda (1 - T_k) \max(0, \|v_k\| - m^{-})^2, \qquad (3.5)

where T_k = 1 if a radio galaxy of class k is present, and m^{+} = 0.9 and m^{-} = 0.1 ensure that the vector length remains within reasonable bounds. The down-weighting factor \lambda is introduced for numerical stability and is suggested to be set to 0.5.
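A sketch of Equation (3.5) as a Keras-style loss function, assuming y_pred contains the capsule vector lengths ||v_k|| and y_true the one-hot labels T_k; this mirrors, but is not copied from, the reference CapsNet-Keras implementation:

```python
import keras.backend as K

def margin_loss(y_true, y_pred, m_plus=0.9, m_minus=0.1, lam=0.5):
    """Margin loss of Eq. (3.5), summed over the three classes."""
    L = y_true * K.square(K.maximum(0.0, m_plus - y_pred)) + \
        lam * (1.0 - y_true) * K.square(K.maximum(0.0, y_pred - m_minus))
    return K.sum(L, axis=-1)
```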

The mean squared error difference between the reconstructed image from the decoder (the part of the capsule network after LabelCaps) and the input image acts as a regulariser for the capsule network, such that near-perfect reconstructions produce a near-zero error and poor reconstructions produce a large error. The reconstruction loss is scaled down by 0.0005 so it does not dominate the margin loss during training. This coefficient in the default model is designed for the MNIST digits, which have an image size of 28x28; the resulting decoder weight is thus 0.0005 × 28 × 28 = 0.392.


Figure 3.4: The default architecture for CapsNet, using three classes. The input to the network is a 100x100x3 image. The encoder is the part of the network that encapsulates the convolutional layer up to and including the LabelCaps layer. The decoder refers to the final three dense layers. An example of features detected by the PrimaryCaps layer prior to reshaping and squashing is shown, for the given input image. There is a small amount of extended emission to the top right of the image that appears to be unassociated with the main source in the centre, which the capsule network preserves, suggesting that it is not robust to potential unassociated sources. Additionally, the feature maps appear to show extra distortion in the core of the source.

Table 3.2: Architecture of the default capsule network model.

Layer                Output shape          # Params
Input_1              (None, 100, 100, 3)   0
conv2d               (None, 92, 92, 256)   62464
PrimaryCap_conv2d    (None, 42, 42, 6)     124422
PrimaryCap_reshape   (None, 3528, 3)       -
PrimaryCap_squash    (None, 3528, 3)       -
LabelCaps            (None, 3, 3)          95256
Input_2              (None, 3)             -
mask                 (None, 9)             -
capsnet              (None, 3)             -
decoder              (None, 100, 100, 3)   3878960
Total                                      4,161,102

3.4.3 Deep learning parameters

There are several deep learning implementations currently available for use. The present work uses Keras4 with the TensorFlow5 backend and Python version 2.7.14.

We use the Adam optimiser (Kingma & Ba, 2014) with the default learning rate of 0.001. In order to keep more parameters the same between the models, both the convolutional and capsule network models are trained using a batch size of 100, for 50 epochs.

The deep-learning task is a multi-class classification problem, where the models output a 3-dimensional vector representing the probability that the object belongs to each class. The predicted class is chosen as the one with the largest probability value. As the probabilities are independent, there is no constraint that they need to add to unity.

The models are trained using CPUs from 27 Intel XEON CPU nodes, with six cores available per node, on a computing cluster at the University of Hamburg.

ConvNet-4 parameters

We use an architecture of two pairs of stacked convolutional layers with pooling layers in between, as shown in Figure 3.5, with parameters given in Table 3.3. This model is referred to as ConvNet-4. Using two adjacent convolutional layers with smaller filter sizes obtained improved results compared to using a single larger convolutional layer, and also reduced the number of parameters (Simonyan & Zisserman, 2014). We use the categorical cross-entropy cost function6 and 16 filters of size 5x5 across all layers, as well as the default learning rate decay of 0. In order to reduce the effect of overfitting, dropout layers are used. A dropout value of 0.25 is used after each pair of convolutional layers, and a value of 0.5 in between the dense layers. A penalty term is added to the cost function using L2 regularisation (Ng, 2004) in the first dense layer. All the convolutional layers use the ReLU activation function (Nair & Hinton, 2010), and the softmax activation function is used at the final layer where classifications are made. There are 5,022,467 trainable parameters in total.
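The ConvNet-4 architecture of Table 3.3 can be sketched in Keras as follows. The 'same' padding is inferred from the output shapes in Table 3.3; the L2 strength (0.01) and the ReLU activation on the first dense layer are our assumptions, as they are not specified in the text:

```python
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense
from keras.regularizers import l2

model = Sequential([
    Conv2D(16, (5, 5), activation='relu', padding='same',
           input_shape=(100, 100, 3)),
    Conv2D(16, (5, 5), activation='relu', padding='same'),
    MaxPooling2D((2, 2)),
    Dropout(0.25),
    Conv2D(16, (5, 5), activation='relu', padding='same'),
    Conv2D(16, (5, 5), activation='relu', padding='same'),
    MaxPooling2D((2, 2)),
    Dropout(0.25),
    Flatten(),
    Dense(500, activation='relu', kernel_regularizer=l2(0.01)),
    Dropout(0.5),
    Dense(3, activation='softmax'),
])
model.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=['accuracy'])
# Training as described in Section 3.4.3 (x_train/y_train are placeholders):
# model.fit(x_train, y_train, batch_size=100, epochs=50,
#           validation_data=(x_test, y_test))
```

This layer stack reproduces the parameter counts of Table 3.3 (1216 + 6416 + 6416 + 6416 + 5000500 + 1503 = 5,022,467).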

ConvNet-8 parameters

In order to investigate the performance of more complex convolutional networks, we can add additional layers. The ConvNet-8 model uses an architecture of four pairs of stacked convolutional layers with pooling layers in between. There is also an increasing number of feature maps with each subsequent double stacking of convolutional layers, as shown in Table 3.4. The architecture also uses smaller filters, of size 3x3. There are 7,446,259 trainable parameters in total.

4 https://keras.io/preprocessing/image/
5 https://www.tensorflow.org
6 https://keras.io/losses/#categorical_crossentropy

CapsNet parameters

Finally, we explore several variations of capsule network models. We downloaded the original CapsuleNet7 code implemented in Keras that was built for the MNIST dataset (Sabour et al., 2017), and modified the code to use our datasets, to vary the models from the original architecture and to calculate the metrics. The original architecture contains approximately 58M parameters, which is more than 14 times the number of parameters of the ConvNet-4 model. We therefore simplified the architecture to one having just over 4M parameters, and refer to this as the default model. The original CapsuleNet model is simplified in order to have the same order of magnitude of parameters as the ConvNets and to help prevent overfitting.

The default architecture of CapsNet and decoder is illustrated in Figure 3.4 and the number of parameters is given in Table 3.2. In essence, it comprises an encoder and a decoder. The encoder consists of a convolutional layer, which extracts features from the image; these are then input into the first capsule layer (PrimaryCaps), whose function is to take the output of the convolutional layer (256 feature maps produced with 9x9 filters) and produce combinations of the detected features. The output of the PrimaryCaps layer is then sent to the LabelCaps layer, which produces one 3D capsule for each of the three radio galaxy classes. Routing is used between the PrimaryCaps layer and the LabelCaps layer such that the level of agreement of feature existence can be quantified and contribute to the vector length of the capsule. The decoder refers to the part of the network after the LabelCaps layer (the three dense layers at the end). There are 4,161,102 free parameters in the default CapsNet model.

We use 256 filters in the first convolutional layer, a filter size of 9 in both the first convolutional layer and the PrimaryCaps layer, 3 capsules in the PrimaryCaps and LabelCaps layers, and 2 channels in the PrimaryCaps layer; the decoder contains (64,128) nodes. We use the default setup of three routings and a learning rate of 0.001 with a decay of 0.9. The first convolutional layer uses the ReLU activation function. CapsNet has image augmentation built into the training of the model, which we disable in order to use our own augmentation technique, which allows more control over which classes are augmented and the types of transformations used. The number of parameters in the default CapsNet model (4,161,102) is thus very similar to that of ConvNet-4.

In addition to the default CapsNet model, we experiment with two other CapsNet models.

7 https://github.com/XifengGuo/CapsNet-Keras/blob/master/capsulenet.py

In the first of these models (Inc. filtersize), we set the filter size to 24 and 18 in the first convolutional layer and PrimaryCaps layer respectively, and slide the filters across using a stride of 4 in the convolutional layer. The inc. filtersize model has 4,819,470 parameters. In the second model (Inc. decoder), we increase the complexity of the decoder to (128,256) nodes in the dense layers, and the decoder loss weight is increased from 0.392 to 5. The weight is calculated by taking the scaled-down reconstruction loss and multiplying it by the size of the images: 0.0005 × 100 × 100 = 5. There are 8,026,446 parameters in the inc. decoder model.

We chose to increase the filters from a size of 9 pixels in the inc. filtersize model because the original filter sizes, designed for the MNIST image size of (28,28) pixels, are likely too small compared to what would be needed for our (100,100) pixel images. We also experimented with increasing the number of nodes and the loss weight of the decoder in the inc. decoder model, to better account for the noise and potential unassociated emission in the dataset, as well as the greater variability within and between classes.

3.5 Results

Due to the inherent stochasticity of training deep learning models, each run can produce slightly different results. We therefore train each model five times. The training data is also shuffled for each run to ensure there is no correlation between subsequent samples. Several classification metrics can help evaluate the performance of a classifier. In imbalanced class problems, the classification accuracy alone has several weaknesses in distinguishing between the performance of models (Hossin & M.N, 2015). The precision, recall and F1 scores are more informative measures of performance than the classification accuracy alone. Precision refers to the fraction of true positives among all returned positive instances; recall is the fraction of true positives that are identified correctly, which also gives an indication of the sensitivity of the classifier. The F1 score is the harmonic mean of precision and recall, providing a single measure that balances the two. The accuracy is the total proportion of correct predictions. Precision, recall, F1 score and accuracy are defined in Eqs. (3.6)-(3.9).

\mathrm{Precision} = \frac{TP}{TP + FP} \qquad (3.6)

\mathrm{Recall} = \frac{TP}{TP + FN} \qquad (3.7)

\mathrm{F1\ score} = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \qquad (3.8)

\mathrm{Accuracy} = \frac{TP + TN}{TP + FP + TN + FN}, \qquad (3.9)

where TP refers to the true positives, FP refers to the false positives and FN refers to the false negatives. A true positive is when the prediction matches the label. A false positive is when the positive class is incorrectly predicted. A false negative is when the positive class is predicted to be in another class.
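These metrics are available in scikit-learn; a minimal sketch, assuming y_true and y_pred are integer-encoded class labels (0 = Unres., 1 = FRI, 2 = FRII):

```python
from sklearn.metrics import precision_recall_fscore_support, accuracy_score

# Per-class precision, recall and F1 of Eqs. (3.6)-(3.8), e.g. with
# y_pred = model.predict(x_test).argmax(axis=1)
precision, recall, f1, support = precision_recall_fscore_support(
    y_true, y_pred, labels=[0, 1, 2])
# Overall accuracy of Eq. (3.9):
accuracy = accuracy_score(y_true, y_pred)
```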

We also calculate the 95% confidence interval using the mean and standard deviation of the metrics, to account for the variability in performance across the runs. We declare a model to be statistically significantly better than another model if the mean of its metrics is higher than the 95% confidence interval of the other model's metrics. In order to ensure a fair comparison, the same training and testing sets were used for the ConvNet and CapsNet architectures.
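A sketch of the confidence interval computation, assuming the interval is taken as mean ± 1.96σ over the runs (the text does not state whether σ is additionally scaled by the square root of the number of runs):

```python
import numpy as np

def ci95(metric_runs):
    """Mean and 95% half-width of a metric over repeated runs
    (normal approximation)."""
    m = np.asarray(metric_runs, dtype=float)
    return m.mean(), 1.96 * m.std(ddof=1)

# Hypothetical example: the average recall from five runs.
mean, half_width = ci95([88.1, 89.3, 88.7, 88.9, 88.5])
```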

The same set of data is used for both validation and testing when running the models, with the exception of the application of early stopping (results shown in Section 3.5.4). When early stopping is used, the validation data is used to determine when to stop the training. Otherwise, the use of the same dataset for validation and testing is of no consequence, as the weights that are modified using the training set are applied to the validation/test set to calculate the loss. No adjustment is made to the weights using the validation set. At the conclusion of training, the final weights are applied to the validation/test set and the metrics are calculated.

Section 3.5.1 of the results shows the classification metrics across the two deep learning techniques when using the original data only, with 2301 (79%) samples for training, and 600 (21%) samples for both validation and testing. The fraction of samples in each class in the test set is given in Table 3.1. Section 3.5.2 makes use of augmented images in addition to the original images, and Section 3.5.3 explores the effects when the 4rms sigma-clipped data is used.

Table 3.3: ConvNet-4 architecture. A filter size of 5 is used in the convolutional layers.

Layer       Output shape           # Params
Input       (None, 100, 100, 3)    0
conv2d      (None, 100, 100, 16)   1216
conv2d      (None, 100, 100, 16)   6416
maxpool2d   (None, 50, 50, 16)     -
dropout     (None, 50, 50, 16)     -
conv2d      (None, 50, 50, 16)     6416
conv2d      (None, 50, 50, 16)     6416
maxpool2d   (None, 25, 25, 16)     -
dropout     (None, 25, 25, 16)     -
flatten     (None, 10000)          -
dense       (None, 500)            5000500
dropout     (None, 500)            -
dense       (None, 3)              1503
Total                              5,022,467

3.5.1 LOFAR original images

ConvNet-4 and ConvNet-8 models

We use the ConvNet-4 and ConvNet-8 models on the original 2901 images from LOFAR, which have been classified into Unresolved, FRI and FRII sources. The results are shown in Table 3.5 and Table 3.6. Each epoch consisting of 2301 training samples takes approximately 32 and 66 seconds to train for ConvNet-4 and ConvNet-8 respectively.

The models perform best in recovering the images in the Unresolved class, which could be due to these images being generally noisier and the sources smaller, compared to the other images. The recovery of FRIIs is poorer, however, compared to the FRIs. This may be because there are fewer examples of images in this class (460 FRIIs compared to 984 FRIs). Although it can be argued that the morphological diversity is greater for the FRI class, as they can be straight, bent, or one-sided with a peak at one end, FRIIs contain lobes that may or may not be connected, so the source can contain either one or two components. We experimented with using different weights for the classes, giving proportionally greater weights to the FRIIs such that wrong predictions are penalised more; however, the performance remained the same as before, across all classes. The recall (accuracy) tends to be higher than the precision for the FRIs, whereas it is lower than the precision for the FRIIs. This is likely because it is easier to recover sources containing emission that is concentrated in one place (in the case of the FRIs) than emission whose components are further apart.

Examples of detected features in the ConvNet-4 model at the output of the second and fourth convolutional layers, after max pooling, are shown in Figure 3.5. The training and validation losses for a single run with the ConvNet-4 architecture are shown in Figure 3.6.

Table 3.4: ConvNet-8 architecture. A filter size of 3 is used in the convolutional layers.

Layer       Output shape            # Params
Input       (None, 100, 100, 3)     0
conv2d      (None, 100, 100, 32)    896
conv2d      (None, 100, 100, 32)    9248
maxpool2d   (None, 50, 50, 32)      -
dropout     (None, 50, 50, 32)      -
conv2d      (None, 50, 50, 64)      18496
conv2d      (None, 50, 50, 64)      36928
maxpool2d   (None, 25, 25, 64)      -
dropout     (None, 25, 25, 64)      -
conv2d      (None, 25, 25, 128)     73856
conv2d      (None, 25, 25, 128)     147584
maxpool2d   (None, 13, 13, 128)     -
dropout     (None, 13, 13, 128)     -
conv2d      (None, 13, 13, 256)     295168
conv2d      (None, 13, 13, 256)     590080
maxpool2d   (None, 7, 7, 256)       -
dropout     (None, 7, 7, 256)       -
flatten     (None, 12544)           -
dense       (None, 500)             6272500
dropout     (None, 500)             -
dense       (None, 3)               1503
Total                               7,446,259

Table 3.5: The average metrics (in percentages) across each of the classes in (1) the original LOFAR dataset, (2) the original and augmented dataset, (3) the original 4rms clipped dataset, and (4) the original and augmented 4rms clipped dataset, for the ConvNet-4 model. Five runs were done in total, using 600 samples in the test set.

    Class    Precision    Recall       F1 score     Accuracy
(1) Unres.   95.7 ± 0.9   96.7 ± 1.4   96.2 ± 0.9   95.9 ± 0.9
    FRI      86.2 ± 2.4   86.8 ± 1.1   86.5 ± 1.0   89.9 ± 0.9
    FRII     68.0 ± 1.1   63.5 ± 2.1   65.6 ± 1.0   90.9 ± 0.2
    Avg.     88.5 ± 0.8   88.7 ± 0.8   88.6 ± 0.9   93.1 ± 0.8
(2) Unres.   98.1 ± 0.4   98.2 ± 0.5   98.1 ± 0.4   98.0 ± 0.4
    FRI      92.3 ± 0.9   93.3 ± 1.3   92.3 ± 0.2   94.2 ± 0.1
    FRII     80.9 ± 2.0   75.2 ± 4.9   77.8 ± 1.9   94.2 ± 0.2
    Avg.     93.3 ± 0.2   93.4 ± 0.2   93.3 ± 0.2   96.2 ± 0.2
(3) Unres.   97.9 ± 0.3   98.1 ± 0.5   98.0 ± 0.2   97.9 ± 0.2
    FRI      90.4 ± 0.7   90.0 ± 0.6   90.2 ± 0.4   92.8 ± 0.3
    FRII     72.1 ± 0.6   72.2 ± 1.6   72.1 ± 0.8   92.5 ± 0.2
    Avg.     91.8 ± 0.2   91.9 ± 0.3   91.8 ± 0.3   95.5 ± 0.2
(4) Unres.   98.7 ± 0.6   99.7 ± 0.2   99.2 ± 0.2   99.2 ± 0.2
    FRI      91.5 ± 0.9   94.9 ± 0.6   93.1 ± 0.4   94.9 ± 0.3
    FRII     88.1 ± 1.3   75.5 ± 2.3   81.3 ± 1.4   95.3 ± 0.3
    Avg.     94.9 ± 0.2   94.7 ± 0.3   94.7 ± 0.2   97.3 ± 0.1

Table 3.6: The average metrics (in percentages) across each of the classes in (1) the original LOFAR dataset, (2) the original and augmented dataset, (3) the original 4rms clipped dataset, and (4) the original and augmented 4rms clipped dataset, for the ConvNet-8 model. Five runs were done in total, using 600 samples in the test set.

    Class    Precision    Recall       F1 score     Accuracy
(1) Unres.   96.6 ± 1.1   98.8 ± 0.3   97.7 ± 0.5   97.5 ± 0.6
    FRI      88.7 ± 1.0   90.4 ± 0.7   89.6 ± 0.5   92.2 ± 0.4
    FRII     75.2 ± 4.1   64.5 ± 2.8   69.3 ± 1.1   92.3 ± 0.4
    Avg.     90.9 ± 0.4   91.2 ± 0.4   90.9 ± 0.5   94.9 ± 0.4
(2) Unres.   98.2 ± 0.7   98.4 ± 0.2   98.3 ± 0.3   98.2 ± 0.3
    FRI      92.5 ± 0.6   94.0 ± 0.5   93.2 ± 0.4   95.0 ± 0.3
    FRII     84.5 ± 1.9   80.0 ± 1.0   82.2 ± 1.1   95.3 ± 0.3
    Avg.     94.3 ± 0.2   94.3 ± 0.2   94.3 ± 0.2   96.7 ± 0.1
(3) Unres.   99.6 ± 0.3   98.8 ± 1.0   99.2 ± 0.5   99.1 ± 0.5
    FRI      92.7 ± 1.0   93.4 ± 3.3   93.0 ± 2.1   95.2 ± 1.7
    FRII     83.4 ± 9.3   83.4 ± 2.8   83.1 ± 5.8   95.2 ± 1.8
    Avg.     95.0 ± 1.6   94.9 ± 1.8   94.9 ± 1.7   97.3 ± 1.0
(4) Unres.   99.6 ± 0.1   99.1 ± 0.4   99.3 ± 0.2   99.3 ± 0.2
    FRI      94.4 ± 0.4   95.2 ± 0.7   94.8 ± 0.4   96.2 ± 0.3
    FRII     86.0 ± 1.0   85.8 ± 1.5   85.9 ± 0.6   96.2 ± 0.1
    Avg.     96.0 ± 0.2   95.9 ± 0.2   95.9 ± 0.2   97.9 ± 0.2

Figure 3.5: The ConvNet-4 architecture. The input to the network is a 100x100x3 image. Shown is an example input image with features detected at the second and fourth convolutional layers, after pooling, at the end of training (50 epochs). We show 4 feature maps for each of the two outputs.

The use of a more complex architecture (ConvNet-8 compared to ConvNet-4) appears to improve the classification metrics (Avg. Recall = 91.2 compared to 88.7).

CapsNet model

Each epoch consisting of 2301 training samples takes approximately 3.4 minutes for the default model, 14 seconds for the inc. filtersize model and 3.5 minutes for the inc. decoder model. The faster time for the inc. filtersize model is due to the filters being moved across the image by 4 pixels (stride of 4) in the first convolutional layer, as opposed to a stride of 1, so the filters scan through the image faster.

Examples of detected features at the PrimaryCaps layer, prior to the reshape and squashing functions, are shown in Figure 3.4 for the default model. Figure 3.7 shows the training and validation loss curve for the default model. Table 3.7 shows that the default model attains higher overall metrics compared to the other two CapsNet models (although this is not always significant).

Figure 3.6: The training and validation losses for a single run with the ConvNet-4 architecture using the cross-entropy loss, with 2301 (79%) samples for training and 600 (21%) samples for testing.

Table 3.7: The average metrics (in percentages) across each of the classes in (1) the original LOFAR dataset, (2) the original and augmented dataset, (3) the original 4rms clipped dataset, and (4) the original and augmented 4rms clipped dataset, for the default CapsNet model. Five runs were done in total, using 600 samples in the test set.

    Class    Precision    Recall        F1 score      Accuracy
(1) Unres.   92.7 ± 1.4   95.7 ± 0.7    94.2 ± 1.0    93.4 ± 1.2
    FRI      78.3 ± 3.1   87.7 ± 1.6    82.7 ± 1.1    86.3 ± 1.3
    FRII     66.6 ± 5.1   35.0 ± 13.0   43.1 ± 12.7   88.2 ± 0.8
    Avg.     84.0 ± 1.3   84.7 ± 1.5    83.2 ± 2.6    90.1 ± 1.2
(2) Unres.   96.4 ± 0.6   96.4 ± 0.9    96.4 ± 0.2    96.1 ± 0.2
    FRI      85.5 ± 1.4   90.2 ± 0.2    87.8 ± 0.7    90.7 ± 0.6
    FRII     75.8 ± 1.8   64.2 ± 0.5    69.6 ± 1.4    92.3 ± 0.4
    Avg.     89.7 ± 0.5   89.9 ± 0.5    89.7 ± 0.5    93.7 ± 0.3
(3) Unres.   97.3 ± 0.5   98.1 ± 0.1    97.7 ± 0.3    97.5 ± 0.3
    FRI      90.9 ± 0.7   88.4 ± 0.8    89.6 ± 0.6    92.5 ± 0.5
    FRII     72.0 ± 2.6   75.2 ± 3.3    73.6 ± 2.8    92.7 ± 0.8
    Avg.     91.6 ± 0.7   91.5 ± 0.7    91.5 ± 0.7    95.0 ± 0.4
(4) Unres.   98.4 ± 0.1   98.3 ± 0.1    98.3 ± 0.0    98.3 ± 0.0
    FRI      92.0 ± 0.6   91.3 ± 1.2    91.7 ± 0.5    93.9 ± 0.4
    FRII     80.4 ± 2.4   82.3 ± 1.8    81.2 ± 1.1    94.9 ± 0.4
    Avg.     93.7 ± 0.3   93.6 ± 0.4    93.6 ± 0.3    96.2 ± 0.2

Table 3.8: The average metrics (in percentages) across each of the classes in the original LOFAR dataset, for the inc. filtersize CapsNet model. Five runs were done in total, using 600 samples in the test set.

      Class    Precision    Recall        F1 score     Accuracy
Orig. Unres.   89.6 ± 0.7   94.2 ± 0.3    91.8 ± 0.5   90.8 ± 0.5
      FRI      80.4 ± 2.5   79.6 ± 2.9    79.9 ± 0.1   85.0 ± 0.5
      FRII     63.2 ± 6.4   50.5 ± 10.8   54.2 ± 6.7   88.4 ± 0.2
      Avg.     82.7 ± 0.5   83.0 ± 0.5    82.5 ± 1.1   88.4 ± 0.4

Table 3.9: The average metrics (in percentages) across each of the classes in the original LOFAR dataset, for the inc. decoder CapsNet model. Five runs were done in total, using 600 samples in the test set.

      Class    Precision    Recall       F1 score      Accuracy
Orig. Unres.   90.6 ± 2.7   95.0 ± 0.8   92.7 ± 1.8    91.6 ± 2.2
      FRI      75.1 ± 2.5   87.8 ± 1.9   80.9 ± 2.2    84.5 ± 1.9
      FRII     65.8 ± 2.9   22.7 ± 9.0   32.3 ± 10.7   87.4 ± 0.8
      Avg.     81.6 ± 2.0   82.7 ± 2.1   80.3 ± 3.0    88.5 ± 1.9

The inc. filtersize model, which was designed with larger filters to capture more extended emission, for the most part performs as well as the default model, and the metrics for the FRIIs are improved. However, the metrics tend to be lower for the Unresolved and FRI classes, which make up the majority of samples. The results are shown in Table 3.8.

The inc. decoder model, which uses a more complex decoder, performs as well as the default model in the metrics for the Unresolved and FRI classes. However, it performs worse overall for the FRIIs, as shown in Table 3.9. This may be due to the more complex decoder confusing radio emission from the FRIIs with noise.

As the default CapsNet model performs better overall compared to the other two CapsNet models, it is chosen as the basis of comparison against the two ConvNet models across the original fits and masked 4rms sigma-clipped datasets.

The default CapsNet model still performs significantly worse than the two ConvNets, as its metrics lie beyond both their 95% confidence intervals, across all metrics. The variability in the metrics on the original dataset is also higher than for the two ConvNets, as is evident in the generally larger confidence intervals of the CapsNet model in Table 3.7, particularly for the FRIIs.

Figure 3.7: The training and validation losses for a single run with the default capsule network architecture, using the margin loss as defined in Equation 3.5, with 2301 (79%) samples for training and 600 (21%) samples for testing. The total loss is obtained by adding the capsule network loss to the decoder weight multiplied by the decoder loss.

Figure 3.8: ROC curves for both a single run with the default CapsNet model and the ConvNet-4 model. The curves show that ConvNet-4 outperforms the default CapsNet across all the classes. CHAPTER 3. CONVOLUTIONAL VS CAPSULE NETWORKS 121

Figure 3.9: The real and reconstructed images using the default capsule network setup when training on the original and augmented images, annotated with the corresponding labels. The top row shows the real images, the second row shows the corresponding reconstructions. The third row shows the real images and the final row shows the reconstructions. The decoder always detects that there is an object in the centre of the image, however it is unable to reconstruct the object accurately. Based on the reconstruction, we see that CapsNet is determining class membership based on the characteristics of the sphere in the centre.

Table 3.10: The labels and corresponding probability vector of the default CapsNet network predictions, using the four examples of sources shown in Figure 3.10, having probabilities greater than 50% across two classes.

Source   Label    CapsNet prediction
1        FRI      41% Unres., 50% FRI, 51% FRII
2        Unres.   51% Unres., 36% FRI, 62% FRII
3        FRII     34% Unres., 59% FRI, 57% FRII
4        FRII     16% Unres., 72% FRI, 70% FRII

Figure 3.8 shows the Receiver Operating Characteristic (ROC) curves across the default capsule network and ConvNet-4. ROC curves plot the true positive rate (recall) against the false positive rate.
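One-vs-rest ROC curves such as those in Figure 3.8 can be computed from the predicted class probabilities; a sketch using scikit-learn (model, x_test and y_true are placeholders for the trained model and test data):

```python
from keras.utils import to_categorical
from sklearn.metrics import roc_curve, auc

y_score = model.predict(x_test)          # predicted probabilities, shape (N, 3)
y_onehot = to_categorical(y_true, 3)     # one-hot true labels
for k, name in enumerate(['Unres.', 'FRI', 'FRII']):
    # True positive rate vs false positive rate for class k vs the rest.
    fpr, tpr, _ = roc_curve(y_onehot[:, k], y_score[:, k])
    print('%s AUC = %.3f' % (name, auc(fpr, tpr)))
```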

In a first attempt, we used the original CapsNet model (containing 58M free parameters) and observed clear overfitting, owing to the large number of free parameters compared to the number of training images. Despite this, the model still achieved very similar results to the models with many fewer parameters quoted in the current work.

Figure 3.10 shows four examples of radio galaxies for which the probabilities are greater than 50% across two classes, and which the CapsNet therefore could not reliably classify. There are 55 such cases out of 600 (9.2%). Table 3.10 shows the CapsNet probability vector for the four examples. For Source 1, CapsNet gives similar probabilities for the FRI and FRII classes, which could be because the source is quite faint, so the network has trouble extracting the morphology. Source 2 is predicted more confidently as an FRII than as an Unresolved source, perhaps because it appears to have two lobes close together. Sources 3 and 4 are labeled as FRIIs; however, the CapsNet predicts them more confidently as FRIs, as it may not detect the lobes.

3.5.2 LOFAR original and augmented images

We augmented the images with translation, rotation and flipping as outlined in Section 3.3.2, keeping the distribution of FRI and FRII sources the same as in the original dataset. Table 3.1 gives the number of original and augmented images. There are again 79% and 21% of the original samples used in training and testing respectively.

ConvNet-4 and ConvNet-8

We applied both the ConvNet-4 and ConvNet-8 models to the original and augmented dataset, with the results shown in Table 3.5 and Table 3.6. The overall metrics are significantly better (Avg. Recall = 93.4 and 94.3) than was observed when the same models were used on the original images (Avg. Recall = 88.7 and 91.2 for ConvNet-4 and ConvNet-8 respectively); therefore, both models benefit from data augmentation. The confidence intervals are also usually reduced.

Figure 3.10: Examples of radio galaxies having probabilities greater than 0.5 in two classes in the default CapsNet architecture, that are also incorrectly predicted. The labels and predictions from left to right, top to bottom are [FRI, Unres., FRII, FRII] and [FRII, FRII, FRI, FRI] respectively. These sources are labeled as (1,2,3,4) in Table 3.10.

Table 3.11: Confusion matrix for a single run with the ConvNet-4 architecture, after training on the original and augmented images. The predictions are along the columns and the labels are along the rows.

         Unres.   FRI   FRII   Total
Unres.   294      6     1      301
FRI      3        202   14     219
FRII     5        12    63     80
Total    302      220   78     600

Although the classification metrics remain the poorest for the FRII class, they improved the most when using the augmented data, despite the fact that there were more examples of FRIs.

A confusion matrix is provided in Table 3.11 for the ConvNet-4 model, to see the numbers of samples that are both correctly and incorrectly predicted.
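Such a matrix can be produced directly with scikit-learn (a sketch, assuming integer-encoded labels as before):

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Rows are the true labels, columns the predictions, as in Table 3.11.
cm = confusion_matrix(y_true, y_pred, labels=[0, 1, 2])
print(np.c_[cm, cm.sum(axis=1)])   # append the per-label totals column
```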

CapsNet

The best-performing capsule network (the default model) was used to see whether an improvement in overall metrics could be obtained when using augmented images in addition to the original images. The results are shown in Table 3.7. The confusion matrix for a single run with the default CapsNet architecture, after training on the original and augmented images, is given in Table 3.12.

The classification metrics are significantly improved when using the augmented images (Avg. Recall = 89.3 with augmentation, compared to Avg. Recall = 84.2 ± 0.2 without); therefore, the capsule network also benefits from training on additional images. Despite the fact that capsule networks output a vector describing the properties of images across the classes and aim to extract the underlying patterns, they still benefit from the use of additional augmented images for the fits file dataset. The noise in the images could be preventing the network from seeing the underlying morphology in the signal, and there is an insufficient number of images available across the classes, hence improved results are observed when more examples are provided. Despite CapsNet benefiting from augmentation, its classification metrics are still significantly lower than those of the two ConvNets trained with augmentation.

Figure 3.9 shows the real and reconstructed images for a single run of the default CapsNet model when training on the original and augmented images. The labels match the predictions with the exception of the third and fourth images in the top two rows, where the true labels are FRIIs but the predictions are FRIs.

Table 3.12: Confusion matrix for a single run with the default CapsNet architecture, after training on the original and augmented images. The predictions are along the columns and the labels are along the rows.

         Unres.   FRI   FRII   Total
Unres.   289      12    0      301
FRI      4        198   17     219
FRII     6        24    50     80
Total    299      234   67     600

The reconstructions of the images are inaccurate, giving the appearance that CapsNet is determining class membership based on the blurriness of the reconstructed spheres. The images in the 'Unresolved' class are represented as concentrated spheres, FRIs are less concentrated, blurrier spheres, and FRIIs are the most diffuse. The inaccuracy of the reconstructions is most likely due to the fact that CapsNet appears to have trouble distinguishing signal from noise. Despite this, the average metrics are still above 89% when training on the original and augmented images, as accurate reconstructions do not appear to be necessary to determine class membership. Overall, the FRII source predictions appear to be the most affected by the noise level and/or potential unassociated emission in the images; since the reconstructions tend to be blurrier spheres with only one component, FRIIs become confused with FRIs, as FRIIs can have their two lobes either connected or disconnected.

Similar to what was observed in the ConvNet architectures, the metrics across the FRII class are the poorest. However, after training with the original and augmented images, the FRII metrics improved the most. The FRII class has the fewest examples of images compared to the other two classes.

Despite the use of image augmentation, it is likely that the number of original training samples available is insufficient to train a capsule network.

3.5.3 Sigma-clipped images

In order to test whether the CapsNet performance could be improved by removing noise and occasional unassociated emission, we used the sigma-clipped images that mask out pixels below 4rms. A flood-filling algorithm and masking techniques have additionally been applied to the dataset to identify and connect associated emission (Mingo et al., in prep.). We analyse the results obtained from using the original sigma-clipped images, as well as both the original and augmented images.
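The exact flood-filling and masking pipeline is that of Mingo et al. (in prep.); the sketch below only illustrates the idea, keeping emission above 4×rms that is connected to the brightest (central) component, using scipy's connected-component labelling:

```python
import numpy as np
from scipy import ndimage

def mask_4rms(image, rms):
    """Simplified stand-in for the 4rms sigma-clipping and flood-filling:
    keep only the island of emission above 4*rms containing the peak."""
    above = image > 4.0 * rms
    labels, n = ndimage.label(above)            # connected components
    peak = np.unravel_index(np.argmax(image), image.shape)
    source = labels[peak]                       # island of the main source
    if source == 0:                             # nothing above 4*rms
        return np.zeros_like(image)
    return np.where(labels == source, image, 0.0)
```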

The performance of both ConvNets is significantly improved, as shown in Tables 3.5 and 3.6

(Avg. Recall = 91.9% compared to 88.7% for ConvNet-4, and 94.9% compared to 91.2% for ConvNet-8), when using the original sigma-clipped images, compared to using the original fits files that include noise and potential unassociated sources. The use of the original sigma-clipped images is significantly worse than using the original and augmented fits images for the ConvNet-4 model (Avg. Recall = 91.9% compared to 93.4%), and is not significantly better for the ConvNet-8 model. The inclusion of augmented images on the sigma-clipped dataset appears to benefit the ConvNet-4 model more than the ConvNet-8 model.

The performance of CapsNet is significantly improved, as shown in Table 3.7, when using the sigma-clipped original images (Avg. Recall = 91.5% compared to 84.7% with the original fits images, and compared to 89.9% with the original and augmented fits images). However, CapsNet still performs worse than both ConvNet-4 and ConvNet-8. The use of image augmentation on the sigma-clipped images appears to improve the performance (Avg. Recall = 93.6% compared to 91.5%). The confidence intervals are also generally smaller than when the fits images are used; therefore, the performance is slightly more stable.

The use of the sigma clipped and masked arrays is also significantly better than using the fits images, when comparing the performance within the original, and the original and augmented datasets, across both ConvNet models and CapsNet models. Therefore, none of the deep learning models can be trained to be completely robust to noise and potentially unassociated emission.

In considering the results of one particular run with the ConvNet-8 model, out of 600 test samples there are 20 where the predictions do not match the labels. Figure 3.11 shows four such examples of images from the 4rms sigma-clipped dataset. Upon inspection of all the incorrectly predicted radio galaxies using the ConvNet-8 model, all 12 images that have been labelled as an FRII are predicted to be an FRI. Out of 3 images labeled as 'Unresolved', two are predicted to be an FRI and one is predicted to be an FRII. The remaining 5 images labelled as FRI are predicted to be FRIIs. The wrongly classified galaxies mostly appear to have an ambiguous morphology, and therefore it could be argued that they are mis-classified by the automated algorithm used to label them (see Section 3.2.2 and Mingo et al., in prep.). For example, the top right and bottom left panels in Figure 3.11 do not appear to be representative examples of an FRII, and the bottom right panel appears more as an FRII, whereas it is labeled as an FRI.

We note that the larger the proportion of sources that are mis-classified by the automated algorithm, the more difficult it will be for the models to learn.

Figure 3.11: Examples of incorrectly classified radio galaxies from the 4rms sigma-clipped dataset using the ConvNet-8 layer architecture. The labels and predictions from left to right, top to bottom are [Unres., FRII, FRII, FRI] and [FRI, FRI, FRI, FRII] respectively. The top left image appears to have too few pixels to be reliably classified, thus belonging to the 'unresolved' class, however the remaining three may have been misclassified by the automated algorithm.

Table 3.13: The average metrics (in percentages) across each of the classes in the (2) original and augmented LOFAR dataset, using a 5-convolutional-layer model with no intermediate dense layers. Five runs were done in total, using 600 samples in the test set.

    Class    Precision    Recall       F1 score     Accuracy
(2) Unres.   97.7 ± 0.6   96.9 ± 0.9   97.3 ± 0.3   97.2 ± 0.3
    FRI      88.4 ± 1.7   92.8 ± 1.9   90.5 ± 0.1   92.8 ± 0.1
    FRII     77.8 ± 2.9   69.0 ± 2.5   73.0 ± 1.3   93.1 ± 0.4
    Avg.     91.6 ± 0.2   91.7 ± 0.1   91.6 ± 0.1   95.0 ± 0.1

3.5.4 Additional results

This section summarises other convolutional and capsule network architectures, as well as parameters, that were tried. These include transfer learning, the application of early stopping, and a comparison of results with similar work.

ConvNet models

We also wanted to test the performance of a simple, purely convolutional architecture using 5 layers (with no intermediate dense layers following the convolutions). The purpose of these dense layers is to help model complex global patterns in the data. The metrics were significantly lower than those of both ConvNet models, as shown in Table 3.13. Therefore, at least one intermediate dense layer could be necessary for optimal performance in convolutional networks. We also tested an architecture using 4 convolutional layers with no pooling layers, and found the results to be inferior to those of the ConvNet-4 model. Therefore, the use of pooling appears to be advantageous on the current dataset, perhaps because it allows more degrees of freedom for the morphology within classes.

CapsNet models

Other variations on capsule network models included stacking two convolutional layers instead of one, using 90% training data and 10% testing data, using an ensemble of capsule network models, increasing the number of routing iterations, decreasing the filter size, changing the batch size, adjusting the learning rate, using different activation functions, applying dropout and pooling, and using a combination of increased filter sizes together with a more complex decoder, all of which resulted in similar or worsened performance metrics. The only possible improvement could come from the use of a larger sample of original training images.

Transfer learning

Transfer learning (Pratt et al., 1991) involves applying the knowledge from one trained neural network to help another learn a related task. In the deep learning context, weights are typically pre-loaded from a network trained on a large dataset with many classes to another unseen dataset.

We used the Inception ResNet model v2 (Szegedy et al., 2016), which combines the Inception and Residual network architectures. An inception network consists of a convolutional network using filters of various sizes and pooling within the same layer, and a residual network utilises skip connections between convolutional layers if the classification accuracy becomes saturated with the subsequent stacking of layers. The Inception ResNet model is trained on the ImageNet dataset (Deng et al., 2009) to classify over 14M images into 1000 categories. Although the nature of the ImageNet dataset is different to the radio galaxy images, pre-loading weights from a network trained on such a dataset is better than initialising the weights from a random distribution.

To use the pre-trained ResNet model in Keras requires images of size at least 139x139 pixels. As such, we padded our images with zeros by 20 pixels along the horizontal and vertical directions, resulting in images of 140x140 pixels.
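A sketch of the padding and of loading the pre-trained network in Keras; the classification head (global average pooling plus a softmax layer) is our assumption, as the text does not specify it:

```python
import numpy as np
from keras.applications.inception_resnet_v2 import InceptionResNetV2
from keras.layers import GlobalAveragePooling2D, Dense
from keras.models import Model

# Zero-pad the (100,100,3) images by 20 pixels per side to reach 140x140.
x_padded = np.pad(x_train, ((0, 0), (20, 20), (20, 20), (0, 0)),
                  mode='constant')

# Load ImageNet weights without the original 1000-class top layer.
base = InceptionResNetV2(weights='imagenet', include_top=False,
                         input_shape=(140, 140, 3))
out = GlobalAveragePooling2D()(base.output)
out = Dense(3, activation='softmax')(out)   # three radio galaxy classes
model = Model(inputs=base.input, outputs=out)
model.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=['accuracy'])
```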

The pre-trained ResNet model is applied to the LOFAR original and augmented fits images, to verify whether the classification metrics could be improved from those of our other models. The results in Table 3.14 show that the classification metrics are not significantly better (Avg. Recall = 94.5%) compared to when training on the same set of images from randomly initialised weights with the ConvNet-8 architecture (Avg. Recall = 94.3%). The metrics are significantly better than for the ConvNet-4 architecture (Avg. Recall = 93.4%). Optimal results are still obtained when using the sigma clipped dataset, where noise and potentially unassociated sources are removed.

We note that the results obtained with transfer learning may be improved if there is a neural network trained on a similar astronomical classification task from which pre-trained weights can be loaded. A successful implementation of transfer learning in classifying optical galaxy morphology is given in Domínguez Sánchez et al. (2019), and most recently in radio galaxy morphology classification (Tang et al., 2019).

The pre-trained network converges faster; ConvNet-4 required 40 epochs of training to reach the optimal validation accuracy, as opposed to 30 epochs for the transfer learning model, when averaged over five runs.

Table 3.14: The average metrics (in percentages) across each of the classes in the (2) original and augmented LOFAR dataset, for the transfer learning model. Five runs were done in total, using 600 samples in the test set.

    Class    Precision    Recall       F1 score     Accuracy
(2) Unres.   98.7 ± 0.2   98.3 ± 0.4   98.5 ± 0.2   98.4 ± 0.2
    FRI      91.8 ± 0.5   95.0 ± 0.4   93.4 ± 0.2   95.0 ± 0.2
    FRII     85.4 ± 1.2   78.7 ± 2.7   81.9 ± 1.2   95.3 ± 0.2
    Avg.     94.4 ± 0.2   94.5 ± 0.2   94.4 ± 0.2   96.8 ± 0.1

Early stopping

We also experimented with applying early stopping in the training of both the CapsNet and ConvNet models. The implementation was such that if the validation accuracy did not improve for 10 subsequent epochs, training was stopped and the metrics on the test set were calculated. However, we found the performance to be the same for the ConvNet model, and worse for the CapsNet model, compared to training for a pre-defined number of 50 epochs (results not shown). In a work focused on the usage of early stopping, Montavon et al. (2012) used a mix of more than 1000 training runs across 12 different problems and 24 different architectures, and concluded that slower stopping criteria allow for an average improvement of approximately 4% in generalisation, at the cost of around a factor of four longer training time.
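In Keras this corresponds to the EarlyStopping callback; a sketch, where the monitor name 'val_acc' follows the Keras 2 convention and x_train/y_train etc. are placeholders:

```python
from keras.callbacks import EarlyStopping

# Stop if the validation accuracy has not improved for 10 epochs.
stopper = EarlyStopping(monitor='val_acc', patience=10)
model.fit(x_train, y_train, batch_size=100, epochs=50,
          validation_data=(x_val, y_val), callbacks=[stopper])
```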

Recent similar work

Recently, Katebi et al. (2019) applied a capsule network to classify optical galaxies based on morphology, using the classes of spiral, elliptical and star/artefact. They find that their capsule network classification accuracy surpasses that of their baseline convolutional network (98.77% versus 96.96% respectively). Their capsule network architecture has over 124M parameters, for a total of 61,578 images. In contrast, our best-performing capsule network uses just over 4M parameters with up to 15,936 images using the original and augmented dataset.

We note that the difference in morphology between their classes is starker than in our case. Additionally, the optical images show a much better contrast between object and background, where noise is less prominent. The optical galaxy classifications were crowd-sourced, whereas our labels originated from an automated algorithm, which comes with some limitations, as outlined in Section 3.2.2. The radio emission also produces sparser images compared to the optical galaxy images.

It is difficult to compare their work to ours, as the number of images in each of their 3 classes is unknown. Hence, it is uncertain whether the classification accuracy is the best discriminator to use between the models (Hossin & M.N, 2015). Other classification metrics, such as precision and recall, which may be more powerful in discriminating between models, are not provided. There is also no indication of the variability between runs, or of the degree of overfitting in the networks during training.

3.6 Conclusions

This paper explored two deep learning approaches to the classification of radio data from the LoTSS HETDEX field across three classes of radio galaxies: Unresolved sources, FRI and FRII galaxies. The labels were generated using an automated algorithm, which used a catalogue of sources from the LoTSS DR1 source catalogue with optical IDs and associations (Williams, W. L. et al., 2019). The radio galaxies belonging to the FRI and FRII classes were additionally cross-checked to eliminate galaxies in which the radio emission is likely to be dominated by star formation (Hardcastle et al., 2019). Despite the classifications being generated using masked images that remove potentially unassociated sources and emission below 4rms, one of our aims was to test how robust our deep learning algorithms could be when such effects were present.

We tested the performance of a four and eight layer convolutional neural network (ConvNet-4 and ConvNet-8) against various architectures of capsule networks (CapsNet), using the pre- cision, recall, F1 score and accuracy, to evaluate the performance of the models. Python code implementing v1.0 of the algorithms can be obtained from github8. Automated classifications of LoTSS sources obtained with the algorithms will be presented in a future paper (Mingo et al., in preparation).

The first CapsNet model explored was the default model, a simplified architecture of the original model designed for the MNIST dataset; the second used larger filter sizes in the first convolutional layer and Primary capsule layer, and a larger stride in the convolutional layer. The third model used a more complex decoder and a higher loss weight for the decoder. The second and third models were designed to better account for the increased complexity of the data. Four different sets of data were used to train and test the two ConvNets and the variations on the CapsNet architecture: (i) the original fits images only, (ii) the original and augmented fits images, (iii) the original masked arrays that remove emission below 4rms and potential unassociated sources, and (iv) the original and augmented masked 4rms arrays.

We found that the optimal CapsNet performance was obtained when using the default model, in terms of the overall classification metrics.

The results showed that the ConvNet architectures always exceeded the performance of the chosen CapsNet model, and ConvNet-8 always performed better than ConvNet-4, most likely because the ConvNet-8 model has twice the number of convolutional layers and parameters as ConvNet-4 and is therefore able to extract higher-dimensional features that are particular to each class.

8 https://github.com/vlukic973/RadioGalaxy_Conv_Caps

The use of transfer learning on the original and augmented images achieved the same results as ConvNet-8. The performance of all deep learning models was optimised when using the 4rms sigma-clipped numpy arrays, which is expected as the noise and potential unassociated emission are removed. Some observations of the differences in results between the ConvNet and CapsNet architectures, and the likely reasons, are as follows:

• As CapsNet tends to capture and preserve the relative locations of features in the images, it is not as successful as the ConvNet architectures in distinguishing signal from noise, or in dealing with the presence of potentially unassociated emission.

• The use of pooling in the ConvNet architectures generally appears to be advantageous in two respects: (i) an increased likelihood that noise and potential unassociated sources will be filtered out; (ii) allowing more degrees of freedom for variability in morphology within the classes, once the undesirable effects have been removed through use of the 4rms dataset.

• The removal of noise and potentially unassociated emission through the use of sigma-clipped and masked arrays improves the performance of both deep learning approaches, when considering the metrics within the original, and the original and augmented, datasets.

• The use of image augmentation appears to benefit both the ConvNets and CapsNet when using the fits files, which contain the original radio emission.

The LoTSS survey is the first wide-area survey to contain such faint sources. It is sensitive to a larger range of source evolutionary states, and can also see structure on a wider range of spatial scales due to the combination of well-sampled UV coverage and long baselines. These features result in images having richer, more varied and sometimes ambiguous morphologies that are more difficult to categorise into distinct classes.

Across both deep learning algorithms, the 'Unresolved' class is recovered most successfully, followed by the FRI class. The FRIIs tend to be the least well recovered. Although FRIs display morphological diversity, as they can be straight or bent, FRIIs have two peaks at varying distances that may or may not be connected to the host galaxy by extended emission. Therefore, FRIs are more likely to contain a single connected component, whereas FRIIs can contain either one or two connected components. There are also fewer examples of FRIIs in the dataset compared to FRIs. When we inspected some incorrectly predicted galaxies using the sigma-clipped dataset, we found the morphologies to be ambiguous in most cases, as shown in Figure 3.11.

Traditional convolutional neural networks generally contain pooling layers in their architecture in order to reduce the number of parameters. However, this can cause the relative locations of features within the image to degrade, which capsule networks are designed to preserve. Our results indicate that, for the radio galaxy data in the current work, the performance of capsule networks is inferior to that of convolutional neural networks. This could be due to the number of original samples being insufficient to train the capsule network. Another reason may be that, since they attempt to preserve the relative locations of features, capsule networks appear to interpret noise as signal and introduce extra distortion into the image, as shown in Figure 3.4. This aspect has proven most detrimental in the recovery of FRII sources, as they are more susceptible to the mingling of signal with noise due to the fact that they are comprised of either one or two components. Additionally, the FRII class contains the fewest examples of images.

In comparison with previous works that use convolutional neural networks to classify radio galaxy morphologies (Aniyan & Thorat (2017), Lukic et al. (2018), Wu et al. (2018) and Alhassan et al. (2018)), the current work explored the use of capsule networks, which are designed to preserve the hierarchical feature information in an image, and finds their performance to be inferior to that of standard convolutional network architectures. The data from the LOFAR LoTSS survey reveals fainter and more detailed emission compared to the data from the surveys which the previous works analysed, providing additional challenges for classification. As such, our findings hold for surveys having a comparable setup, provided they produce images with similar morphologies and noise profiles.

Based on the current results, it appears that convolutional neural networks remain the deep learning technique of choice for future surveys. They are also faster to train, as they use fewer parameters. Capsule networks, in their present form, are generally slower and require further development to be made more robust to noisy real data; however, their current performance may be improved by explicitly training them on cleaned data with various examples of the morphologies present within each class.

There are several limitations that would need to be overcome to apply these methods to large samples, such as the need for ancillary data to separate star-forming galaxies. This exclusion cannot be performed based purely on the radio galaxy morphology. The classes should also be extended to encompass hybrid sources, as well as other rare sources such as bent-tailed and double-double sources.

4 Source-finding with convolutional autoencoders

The following chapter presents work as it is submitted by Lukic et al. (2019) to Galaxies.

4.1 Introduction

An ongoing task in astronomy is the ability to find astronomical sources. This is important because it forms the basis on which a radio astronomical catalogue can be built. Modern radio telescopes can observe many millions of radio sources, and this number will only increase in time owing to rapidly developing technologies (Norris, 2017a). It is therefore important that the methods developed to find sources can keep up with the capabilities of the technology, with respect to the quality of sources that are detected by the telescope.

In this section, we give a brief summary of the main factors affecting the ability to find sources in radio data, the different types of radio sources (star-forming galaxy or type of AGN), how a machine learning approach can work, details about the simulated Square Kilometre Array (SKA; Prandoni & Seymour, 2015) data used, as well as a brief review of previous work in this area.

4.1.1 Source-finding at radio frequencies

Radio telescopes measure the surface brightness of the radio sky across some frequency or range of frequencies and produce a map of the surface brightness. What constitutes a source is a collection of pixels above some value, which is determined by estimating the background, or noise. The noise is usually composed of a combination of instrumental noise, observed background emission and leftover system uncertainties (Savage & Oliver, 2007).

The first step involved in source-finding is usually pre-processing the image containing the radio sources. This involves some transformations of the image, such as scaling the pixel intensities, to facilitate the source-finding method by suppressing undesired distortions or enhancing features (Miljković, 2009), while preserving the physics of the radio sources in the image. The second step is to estimate the background, after which a threshold can be chosen that defines where the sources are. Contiguous pixels above a certain threshold are considered to form part of an object (Hopkins et al., 2002), after which a local peak search is performed where maximum-value pixels are isolated.
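A toy version of these steps, using the median absolute deviation as a robust background estimate (our choice for illustration; real source-finders such as PyBDSF are considerably more sophisticated):

```python
import numpy as np
from scipy import ndimage

def find_sources(image, nsigma=5.0):
    """Threshold-based source-finding: estimate the background, keep
    contiguous pixels above the threshold, then locate each island's peak."""
    background = np.median(image)
    rms = 1.4826 * np.median(np.abs(image - background))  # robust sigma (MAD)
    islands, n = ndimage.label(image > background + nsigma * rms)
    peaks = ndimage.maximum_position(image, islands, range(1, n + 1))
    return peaks  # list of (y, x) peak positions, one per island
```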

In the presence of low SNR, which occurs when there is a relatively high background compared to the surface brightness (signal) from the source, it can be difficult to group the pixels belonging to a particular source. Additionally, the sizes and intensities of the astronomical bodies can vary significantly (Zheng et al., 2015). As the SNR is increased, finding and extracting the sources becomes easier, as the pixels belonging to the source show a greater contrast compared to the background. However, it is more frequently the case that shorter integration times are used, which results in noisier data, and it is not always easy to capture the background signal, which may also vary across regions in the image. Another problem to consider is that of source confusion, which is the inability to measure faint sources due to the presence of other sources nearby. Also, at radio frequencies the noise tends to be more correlated compared to other frequencies (Radhakrishnan, 1999; Ellingson, 2011; Hale et al., 2019), posing further challenges for source-finding and extraction.

Many algorithms have been developed to perform source-finding across different wavelengths such as optical, radio, infrared or X-ray, some of which use a combination of techniques. Masias et al. (2012) present a comprehensive overview of the most common techniques, although there have been more recent developments. For example, a source extractor originally developed for source-finding in optical images (ProFound; Robotham et al. (2018)) can also successfully be used at radio wavelengths (Hale et al., 2019).

One state-of-the-art source-finding algorithm is the Python Blob Detector and Source-Finder (PyBDSF; Mohan & Rafferty, 2015b; https://www.astron.nl/citt/pybdsf/), which works as follows: after reading in the image, it performs some pre-processing, for example computing the image statistics. Using a constant threshold for separating the source and noise pixels, the local background rms and mean images are computed. Adjacent islands of source emission are identified, after which each island is fit with multiple Gaussians, or Cartesian shapelets. The fitted Gaussians or shapelets are flagged to indicate whether they are acceptable or not. The residual fits images are computed for both Gaussians and shapelets. Gaussians within a given island are then grouped into discrete sources.

There have been a couple of recent works on using deep learning methods to perform source-finding. ClaRAN (Wu et al., 2018) trained a source-finder on Radio Galaxy Zoo data (Banfield et al., 2015) to learn two separate tasks, localisation and recognition, after which the source is classified according to the number of peaks and components, with accuracies >90%. More recently, DeepSource (Vafaei et al., 2019) presents a deep-learning algorithm to find point sources in simulated images and compares the results against PyBDSF, using different signal-to-noise ratios. In contrast, the current work examines the recovery of SFGs and two classes of AGN, as well as all sources combined, at different SNRs using a convolutional autoencoder (AutoSource), compares the results against PyBDSF, and shows in which circumstances one performs better than the other and the likely reasons why. DeepSource requires the tuning of more hyperparameters, which are variables that need to be defined prior to applying a machine learning algorithm. AutoSource requires only the usual deep learning parameters such as the number and type of layers, batch size, cost function and gradient-descent method.

4.1.2 Types of radio sources

Galaxies exhibiting significant radio emission usually fall into one of two groups: star-forming galaxies (SFGs) or Active Galactic Nuclei (AGN). Radio-loud AGN can be grouped based on their appearance; they can be either 'compact' or 'extended'. The two most influential factors that govern whether a source will appear point-like, elongated or very resolved are the distance of the source and the resolution. Different radio source types can be characterised by a different spectral index α, which relates the flux density S to the frequency ν through S(ν) ∝ ν^α. The slope of the spectrum is determined by the electron energy distribution. Extended radio sources generally have a steep radio spectrum (typical values are α ≲ −0.8; de Gasperin et al., 2018) and can be referred to as steep-spectrum AGN (AGN-SS), where the majority of sources can be divided into two distinct classes depending on the morphology of the radio lobes: FRI (core-dominated) and FRII (lobe-dominated) (Fanaroff & Riley, 1974). Compact radio sources tend to exhibit a flat radio spectrum (typical values are α ≥ −0.5) and are denoted flat-spectrum AGN (AGN-FS) (Peterson, 1997). It should also be noted that some steep-spectrum sources can be compact.

Since the relative strength of the emission from radio sources depends on frequency, different components of a radio source can have different spectral shapes.

4.1.3 Deep learning

Deep learning methods have been successful in extracting information from high-dimensional data such as images (Krizhevsky et al., 2012; Farabet et al., 2013; Zeiler & Fergus, 2013). The filters in convolutional layers, which scan across the images and detect features, typically have sizes of a few pixels across and therefore greatly reduce the number of parameters compared to the fully connected layers in traditional neural networks. The stacking of convolutional layers results in a hierarchical extraction of features. Convolutional networks are more successful in avoiding the vanishing gradient problem compared to fully connected neural networks, while simultaneously enforcing translational invariance (Goodfellow et al., 2016).

The current work explores a novel approach to source-finding by training an autoencoder on a solution map derived from knowledge of the source locations. Autoencoders are generally an unsupervised learning technique (Liou et al., 2008), and are made up of an encoder, which aims to compress the input into a lower-dimensional representation, and a decoder, whose original function is to reconstruct the input image from the compressed representation (Rumelhart et al., 1986). For our purpose of source-finding, the output images we aim to produce are those of the locations of the sources, rather than the original input source maps. Given that the source locations can be transformed into image data, the source location map, along with the original source map, can be segmented into smaller square images (having size 50x50 pixels in the current work), which are then used as the inputs to train the autoencoder to predict the source locations.

We note that we only focus on the source-finding aspect in this work, rather than the source characterisation and classification.

4.1.4 Simulated SKA data

The SKA aims to be the largest radio telescope built to date. It will eventually have a collecting area of more than one square kilometre, operate over a wide range of frequencies (50 MHz - 14 GHz in the first two phases of construction) and be 50 times more sensitive than any other radio instrument to date. In the meantime it is possible to use simulated data products to generate data similar to what would be expected to be observed by the SKA. The SKA Science Data Challenge 1 (SKA SDC1; Bonaldi & Braun, 2018; https://astronomers.skatelescope.org/ska-science-data-challenge-1/) was a recent challenge set for the community to develop or use existing source-finders to perform source-finding, characterisation of the sources and source population identification (either SFG, AGN-steep or AGN-flat).

Catalogues of objects to be included in the simulated maps were generated using the Tiered Radio Extragalactic Continuum Simulation (T-RECS) simulation code (Bonaldi et al., 2018). The radio sky was modelled in continuum, over the 150 MHz - 20 GHz range, with two main populations of radio galaxies, Active Galactic Nuclei (AGN) and Star-Forming Galaxies (SFGs), and their corresponding sub-populations. The wide frequency range has been enabled by allowing specific conditions for the spectral modelling. Across the AGN, the sources are allowed to have a different spectral index below and above ∼5 GHz, constrained by the modelled counts from Massardi et al. (2010) for the lower frequency range and de Zotti et al. (2005) for the higher frequency range. In the SFG population, the spectral modelling includes synchrotron, free-free and thermal dust emission, all expressed as a function of the star-forming rate. The redshift range of the simulation is z = 0-8. The T-RECS simulation output used for SDC1 contains all the sources in a 3x3 Field of View (FoV) with integrated flux density at 1.4 GHz > 100 nJy (Bonaldi et al., 2018).

The data used in the current work are based on the simulated data products generated for SDC1. There are three available frequencies (B1: 560 MHz, B2: 1400 MHz and B5: 9200 MHz) at 3 integration times (8 h, 100 h and 1000 h) for each frequency, giving 9 maps altogether in the form of fits files. The size of the maps is 32768 x 32768 pixels. The FoV was chosen for each frequency to contain the primary beam for a single telescope pointing out to the first null, giving a map size of 5.5, 2.2 and 0.33 degrees on a side for B1, B2 and B5 respectively, with corresponding pixel sizes of 0.60, 0.24, and 0.037 arcsec. Properties of sources in a training set area are also provided, across the three frequencies, to show how the sources are characterised in a particular area so that source-finders can be calibrated or trained. We use the generated data and focus on the source-finding aspect only.

In constructing the SDC1 image corresponding to the T-RECS source catalogue, sources have been injected with a different procedure depending on whether they were extended or compact (major axis greater or smaller than 3 pixels respectively) with respect to the adopted frequency-dependent pixel size. The SFGs have been modelled using an exponential Sersic profile (Sérsic, 1963), projected into an ellipsoid using a given axis ratio and position angle. The AGN populations (steep-spectrum and flat-spectrum) are treated as the same object type viewed from a different angle. Steep-spectrum AGN assume FRI/FRII morphologies, and flat-spectrum AGN are composed of a compact core with a single lobe pointing in the direction of view. The steep-spectrum sources have been generated as postage stamps (that include affine transformations) from a library of scaled real high-resolution images. They have also had a correction applied to the flux of the core in order to give it a flat spectral index, thus the same AGN can have a different core-to-lobe fraction when viewed at different frequencies. The flat-spectrum AGN are added as a pair of circular Gaussian components: a compact core with a more extended end-on lobe.

A mild Gaussian convolution has been applied to the extended source images, using a FWHM of two pixels. The three catalogues (SFGs, steep spectrum AGN and flat spectrum AGN) of compact objects were added to the image as elliptical Gaussian components.

All the compact sources that belong to the classes of SFGs, steep- and flat-spectrum AGN are described by an integrated flux density and a major and minor axis size. The compact flat-spectrum AGN are additionally described by a core fraction that indicates the proportion of emission belonging to the core of the source relative to the total emission of the source.

Visibility data files were generated using the SKA1-Mid configuration. Two cases were explored: (1) when the 64 MeerKAT dishes were included, 197 antenna locations were specified at B2, and (2) when the MeerKAT dishes were not included, 133 antenna locations were used at B1 and B5. Both cases are frequency-dependent and reflect the fact that MeerKAT will most likely not be equipped with feeds for B1 and B5.

The visibility sampling is based on 91 spectral channels that span a 30% fractional bandwidth, using a time sampling that spanned -4 h to 4 h of Local Sidereal Time with an increment of 30 s integration time at an assumed Declination of -30 degrees. The visibility files were used to generate the noise images and the point spread functions. The gridding weights for the visibility data were determined by first accumulating the visibility samples in the visibility grid with their natural weights. After this, an FFT-based convolution was applied to the visibility density grid using a Gaussian convolving function with FWHM of 178 m. The convolving function width was manually tuned to match as closely as possible the sampling provided by the array configuration. Uniform weights for the visibilities were formed by using the inverse of the local smoothed data density. After this, a Gaussian taper was applied such that it resulted in the most Gaussian possible dirty beam, with a target FWHM of (1.5, 0.60 and 0.0913) arcsec at (560, 1400 and 9200) MHz. The actual dirty beam dimensions were closely matched to the target specification. There was a degradation of image noise compared to the naturally weighted image noise, therefore the noise images were rescaled in amplitude to represent realistic variations in RMS for the different integration times. Adding the various noise images to the convolved sky model resulted in the final data products.

Additional files provided include the Primary Beam images, which are used to correct the flux values in the original maps, the synthesised beam images, and the training set files, which include the properties of the sources such as flux, size and class for a particular area in the entire map. There are three training set files, one for each frequency. Therefore, the same training set file is used across the 3 different integration times within one frequency.

For more specific details on the generation of the simulated SKA data, please refer to Bonaldi & Braun (2018).

The paper is outlined as follows: in Section 4.2 we discuss the specifics of the SKA simulated dataset, the pre-processing steps on the raw data, the parameters with which PyBDSF is run, how the dataset has been generated prior to undertaking source-finding with AutoSource and PyBDSF, the background of autoencoders, as well as how the images have been augmented for AutoSource. Section 4.3 describes the major results, summarised in F1 scores that combine precision and recall. We also provide confusion matrices for some data subsets. Section 4.4 summarises our overall findings. Appendix 4.5 contains the precision and recall classification metrics.

4.2 Methods

4.2.1 Convolutional autoencoders

In the context of neural networks, an autoencoder is made up of an encoder and a decoder. The encoder compresses the input into a latent space representation that usually has a smaller dimensionality (referred to as a bottleneck) compared to the input data (Tishby et al., 1999). The encoder can be represented by the function h = f(x). The aim of the decoder is to reconstruct the input from the latent space representation, which can be represented by r = g(h). The complete autoencoder function, which can be expressed as r = g(f(x)), aims to achieve a reconstruction r that is as close as possible to the original input data x (Vincent et al., 2008).

Autoencoders are generally an unsupervised learning technique (Liou et al., 2008), as they are usually trained without labels. One of the aims is to detect determining features in the data that could help to characterise similar types of images, and therefore infer properties that are common to groups.

Some applications of autoencoders include denoising and dimensionality reduction, as well as unsupervised pre-training to better initialise the weights of the neural network compared to using a random distribution (Schmidhuber, 2015).

Autoencoders can be made up of fully connected layers; however, this can be impractical for image data, as it is generally high-dimensional and could cause learning to stall owing to saturating/declining gradients (Mao et al., 2016). Using convolutional layers reduces the number of parameters, as they employ filters of a smaller size that scan across the image and detect features.

In the most basic case, one possible application of an autoencoder on radio astronomy data of the type explored in the current work is to reconstruct the original input maps, which contain the sources as well as the background noise, the latter being an undesired feature. A better application could be to investigate whether it is possible to derive maps similar to the 1000 h maps from the 8 h emission maps, because the shorter integration time maps can be viewed as noisier versions of the longer integration time maps. This could be the subject of future work.

The key idea behind AutoSource is to train an autoencoder r = g(f(x)), where x is the input map data and r is the reconstruction of the solution map, in which the sources of varying sizes and emission patterns are collapsed to individual pixel locations and the remainder of the image is blank. The autoencoder is trained to do source-finding using segmented real maps and the corresponding solution maps, both having sizes of 50x50 pixels, across three SNR ratios of 1, 2 and 5, and the results are compared with the sources found using PyBDSF.

It should be noted that this method of source-finding can be posed as an image-to-image translation problem, as are many problems in the computer vision field (Isola et al., 2016), which are generally solved using Generative Adversarial Networks (GANs). GANs consist of a generator of images (usually an autoencoder), as well as a discriminator, whose job it is to differentiate between real and generated images. The aim of using GANs is to generate images for which the discriminator fails to distinguish between them and real images. In contrast, our method requires only the use of an autoencoder, as we are generating images having a much lower level of complexity compared to the input maps, where the radio emission of sources is collapsed to between one and a few pixels.

The present work uses Keras with the TensorFlow backend and Python version 2.7.15 (see https://keras.io/preprocessing/image/ and https://keras.io/losses/ for the Keras image preprocessing and loss function documentation). We use a convolutional network architecture of three consecutive convolutional layers and one dense layer, with a total of 32,193 parameters.

Early stopping was used with a patience of 5 training epochs. A single training epoch is when all training samples are passed through the network. 80% of the data is used for training, and the remaining 20% is used for testing.

The AutoSource architecture, as shown in Table 4.1 and Figure 4.1, is made up of 3 convolutional layers and one dense (fully-connected) layer. There are 16, 32 and 64 filters, with filter sizes of 7, 5 and 3, in the first, second and third convolutional layers respectively. A dropout layer with a dropout fraction of 0.25 is inserted between the first and second convolutional layers to make the network more robust. We slide the filters along by one pixel in each layer to ensure maximal information extraction. The batch size is set to 128. We use the Adadelta optimiser (Zeiler, 2012) with a default learning rate of 1.0, decay of 0 and a rho of 0.99. Adadelta is based on Adagrad (Duchi et al., 2011), an optimizer with parameter-specific learning rates; however, Adadelta adapts the learning rates based on a moving window of gradient updates. We also use the binary cross-entropy cost function (Mannor et al., 2005). The architecture shown in Figure 4.1 also contains an example of a real input map and solution map, the features detected, and the corresponding reconstructed map.
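The following is a minimal Keras sketch of this architecture, written to reproduce the layer shapes and parameter counts of Table 4.1. It is expressed against the modern tf.keras API rather than the Keras/Python 2.7 setup used originally, and the activation functions (relu in the hidden layers, sigmoid at the output) are assumptions not stated explicitly above; the filter counts and sizes, dropout fraction, stride, optimiser settings, loss, batch size and early stopping follow the text.

```python
# Minimal sketch of the AutoSource model (Table 4.1); activations are assumptions.
from tensorflow.keras import layers, models, optimizers, callbacks

def build_autosource(input_shape=(50, 50, 1)):
    inp = layers.Input(shape=input_shape)
    # Three convolutional layers; stride 1 and 'same' padding keep the 50x50 size.
    x = layers.Conv2D(16, 7, strides=1, padding='same', activation='relu')(inp)
    x = layers.Dropout(0.25)(x)                       # robustness to noise
    x = layers.Conv2D(32, 5, strides=1, padding='same', activation='relu')(x)
    x = layers.Conv2D(64, 3, strides=1, padding='same', activation='relu')(x)
    # Dense layer acts on the last (filter) axis: (50, 50, 64) -> (50, 50, 1).
    out = layers.Dense(1, activation='sigmoid')(x)    # 65 parameters
    model = models.Model(inp, out)                    # 32,193 parameters in total
    model.compile(optimizer=optimizers.Adadelta(learning_rate=1.0, rho=0.99),
                  loss='binary_crossentropy')
    return model

model = build_autosource()
stop = callbacks.EarlyStopping(monitor='val_loss', patience=5)
# model.fit(train_X, train_Y, batch_size=128, epochs=100,
#           validation_split=0.2, callbacks=[stop])
```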

We note the absence of a bottleneck (a compressed representation of the inputs) in the architecture. A bottleneck is not used because we are not attempting to see whether the original map can be reconstructed from the input data using a lower-dimensional projection, but to train the network to predict the location of the sources. The stacked convolutional layers extract the signal from the noise, where each layer produces an output having the same dimensions as the input, in order to directly see the detected signals that are propagated through the network.



Figure 4.1: AutoSource architecture and examples of inputs (real maps and solution maps), features detected at the output of the first and third convolutional layer, as well as the resulting reconstructed image of the solution map.

Table 4.1: Architecture of the AutoSource model.

Layer       Output shape          # Params
Input_1     (None, 50, 50, 1)     0
conv2d_1    (None, 50, 50, 16)    800
dropout_1   (None, 50, 50, 16)    0
conv2d_2    (None, 50, 50, 32)    12,832
conv2d_3    (None, 50, 50, 64)    18,496
dense_1     (None, 50, 50, 1)     65
Total                             32,193

4.2.2 Pre-processing

In order to ensure accurate flux values, we used the Primary Beam image and the raw fits files provided, and ran CASA (McMullin et al., 2007) to regrid the image and correct for the Primary Beam. The resulting fits file was the one used to perform source-finding with AutoSource and PyBDSF. Some tests were done using the non-primary-beam-corrected fits files, and we observed no change in source-finding performance.
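A sketch of this pre-processing step is given below using the standard CASA imregrid and impbcor tasks. The file names are placeholders and the exact invocation used in this work is not documented, so this should be read as an illustrative assumption rather than the actual commands.

```python
# Hypothetical CASA pre-processing sketch; file names are placeholders.
from casatasks import imregrid, impbcor   # modular CASA 6; built in to a CASA session

# Regrid so that the sky map and primary beam image share the same coordinate grid.
imregrid(imagename='sky_map.fits',
         template='primary_beam.fits',
         output='sky_map_regrid.im')

# Divide out the primary beam response to obtain corrected flux values.
impbcor(imagename='sky_map_regrid.im',
        pbimage='primary_beam.fits',
        outfile='sky_map_pbcor.im')
```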

To determine the background noise level in the image, we output the background rms maps when we ran PyBDSF, by specifying rms_map=True using the process_image com- mand.

To perform source-finding in PyBDSF, we ran the process_image command using the default parameters of thresh_isl=3.0 and thresh_pix=5.0.
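The corresponding PyBDSF call is sketched below; the input file name is a placeholder, while the parameters are the defaults quoted above.

```python
# Sketch of the PyBDSF run described above; the file name is a placeholder.
import bdsf

img = bdsf.process_image('sky_map_pbcor.fits',
                         thresh_isl=3.0,   # island boundary threshold (sigma)
                         thresh_pix=5.0,   # source detection threshold (sigma)
                         rms_map=True)     # compute a 2D background rms map
img.export_image(outfile='background_rms.fits', img_type='rms')
```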

4.2.3 Dataset generation

The solution maps have been generated using the training set files for each frequency. Since we are only interested in the source-finding, we took note of the corresponding (x, y) positions of each source in the training set. We focused only on the sources that could be found given the noise: the source locations have been inserted into the solution maps as single pixels, under the condition that the sources have a flux above a certain threshold (a flux greater than one, two, or five times the mean noise level, referred to as SNR=1, 2 and 5).

The SS, FS and SFG sources are encoded using the integers 1, 2 and 3 respectively in the original solution map, so that we can calculate the recovery of each of these classes of sources when testing AutoSource and PyBDSF. However, the solution map used for training has all sources encoded with 1, irrespective of class; a sketch of this construction is given below. Tables 4.2 and 4.3 show the number of each class of sources at SNR=2 and SNR=5 respectively. It should be noted that there are many more SFG sources than SS and FS sources, which is why we focus on augmenting the latter source types to see if it improves the performance of AutoSource. There are fewer sources available at higher SNRs than at lower SNRs, since the threshold for inserting sources into the solution map is much higher.
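The sketch below illustrates this construction under stated assumptions: the catalogue rows, array names and class encoding follow the description above, but the function and variable names are hypothetical.

```python
# Hypothetical sketch of solution-map construction: each catalogued source
# brighter than snr times the mean background noise becomes a single pixel.
import numpy as np

def build_solution_maps(shape, catalogue, noise_mean, snr=2):
    class_map = np.zeros(shape)
    for x, y, flux, cls in catalogue:          # cls: 1=SS, 2=FS, 3=SFG
        if flux > snr * noise_mean:
            class_map[int(round(y)), int(round(x))] = cls
    # Training targets collapse all classes to 1; class_map is kept for evaluation.
    train_map = (class_map > 0).astype(np.float32)
    return class_map, train_map
```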

Given there are very few sources available in the B5 dataset as shown in Tables 4.2 and 4.3, we focus our attention on the B1 and B2 datasets only.

We verified that the noise level in the maps is uniform: Table 4.4 shows that there are only small proportional differences in the number of solutions obtained when taking the individual quartile cut-offs versus using the cut-offs derived from the whole training set area.

Figure 4.2: Left panel: Real map of a panel containing a combination of SFGs, SS and FS sources at B1 at 1000 h. Middle panel: Solutions at SNR=2. Right panel: Solutions at SNR=5. The yellow, blue and green pixels indicate SFGs, SS and FS sources respectively. In this particular case both the SS and FS sources are very close together and very faint, which presents a challenge for both source-finders. The panels have a side length of 50 pixels.

Figure 4.3: Left panel: Real map of a panel containing a combination of SFGs, SS and FS sources at B2 at 8 h. Middle panel: Solutions at SNR=2; there are two SFGs and one each of SS and FS galaxies. Right panel: Solutions at SNR=5; at this SNR, only one SFG and one SS source remain, as the other SFG and the FS source had a total flux lower than the cut-off threshold at this SNR. The yellow, blue and green pixels indicate SFGs, SS and FS sources respectively. The panels have a side length of 50 pixels.

The left panels of Figures 4.2 and 4.3 show a section from a real map of B1 at 1000 h and B2 at 8 h respectively, containing SFGs, SS and FS sources, along with the solutions injected at an SNR of 2 and 5. The smaller the SNR, the more sources will appear in the solution map, and these become increasingly less obvious as they mix with the noise background. Conversely, the larger the SNR, the fewer sources in the solution map, and only increasingly large and/or bright sources will appear.

To generate input image data for AutoSource, we divided the training set area (4000x4000 pixels) into 50x50 pixel blocks and moved these blocks across by increments of 50 pixels (resulting in 6,400 images), ensuring that the segmented blocks cover the entire area; a sketch of this segmentation is given below. The blocks may contain sources located on their boundaries, however all parts of the sources are accounted for. We also experimented with using 20 pixel increments (resulting in 39,204 images) instead of 50 pixel increments, such that the same part of a source is seen across at least one other block and therefore sources on the boundary in one block will not be on the boundary in a neighbouring one, and noted no significant improvement in results.
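A minimal sketch of this tiling, assuming the maps are held as 2D NumPy arrays; the function name is hypothetical. With a 4000x4000 pixel area, a 50-pixel step yields 80x80 = 6,400 blocks and a 20-pixel step yields 198x198 = 39,204 blocks, matching the counts above.

```python
# Minimal sketch of the block segmentation described above.
import numpy as np

def segment(area, block=50, step=50):
    blocks = []
    for i in range(0, area.shape[0] - block + 1, step):
        for j in range(0, area.shape[1] - block + 1, step):
            blocks.append(area[i:i + block, j:j + block])
    return np.asarray(blocks)[..., np.newaxis]   # (N, 50, 50, 1) for Keras

# train_X = segment(real_map); train_Y = segment(solution_map)
```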

In the 4000x4000 pixel area, there are 6400 images for training and testing altogether, with 5120 (80%) for training and 1280 (20%) for testing. We also investigate how much the results can be improved when using image augmentation, so 5120 is the minimum number of images with which we train. Similarly, we generated the input solution data for AutoSource by inserting individual pixels to represent the true location of the source, with the position obtained from the training set.

The image data have been multiplied by 10^6 because the pixel values are the surface brightness, which can be very small, of order 10^-6 in magnitude, and can also be negative. Applying a scaling to the original values ensures there is sufficient contrast between them, which facilitates detection by the autoencoder. The scaling is also done to match the order of magnitude of the values in the solution maps, which were generated by inserting '1' against a background of '0'. We note that we also experimented with multiplying the data by 10^9 and found no noticeable difference.

We use only the training set region of the whole map, which consists of a 4000x4000 pixel area for the B1 and B5 datasets, and a 4200x4200 pixel area for B2, as shown in Table 4.5. It should be noted that the same area is not covered across the three frequencies; however, within a given frequency the area is the same across the three integration times.

The solution maps have been generated in the same way as the input image maps, using 50x50 pixel blocks with increments of 50 pixels, where a '1' has been inserted at the centroid position of the source. The blocks that contain no solutions are empty 50x50 blocks. The source selection has been subject to a flux threshold, where only sources having a flux greater than 1, 2 or 5 times the background for each map have been selected. The background maps have been determined using PyBDSF. Figure 4.4 shows the segmentation of part of the training area into 50x50 pixel blocks, for both the original primary-beam-corrected fits file as well as the corresponding solution map generated.

We ran PyBDSF with the default parameters in order to perform source finding across the whole map, which was later subset to only include the training set area in the images.

4.2.4 Image augmentation

Deep learning techniques are able to take advantage of image augmentation, as it generates more training samples, which should improve the performance up to some threshold (Krizhevsky et al., 2012). Since there are many fewer steep-spectrum (SS) and flat-spectrum (FS) AGN sources compared to star-forming galaxies (SFGs), we wanted to see whether we could improve on the metrics for these types of sources if we augmented the images that contained them. We employed vertical and horizontal flipping, and rotation by 90, 180 and 270 degrees; a sketch of this scheme is given below. The results show the metrics when applying no augmentation, when augmenting the SS and FS sources, and when augmenting all sources.
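A minimal sketch of the augmentation scheme, assuming each 50x50 input block and its solution block are NumPy arrays; the flips and rotations are applied jointly to both so that the encoded source positions stay aligned. The function name is hypothetical.

```python
# Sketch of the flip/rotation augmentation, applied jointly to an input
# block x and its solution block y so source positions remain aligned.
import numpy as np

def augment_pair(x, y):
    pairs = [(x, y),
             (np.flipud(x), np.flipud(y)),    # vertical flip
             (np.fliplr(x), np.fliplr(y))]    # horizontal flip
    pairs += [(np.rot90(x, k), np.rot90(y, k)) for k in (1, 2, 3)]  # 90/180/270 deg
    return pairs
```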

Figure 4.4: Left: Segmentation of a portion of the primary-beam-corrected images in the training set area. Right: Segmentation of the solution map in the same area. These images are generated from the B1 1000 h dataset, using SNR=5 to determine the flux threshold for injecting the solutions. Each block forms a single 50x50 pixel image that is input into the AutoSource algorithm; the blocks on the left make up the training set images (train_X), and the blocks on the right make up the solution set images (train_Y).

Table 4.2: The total number of steep-spectrum AGN, flat-spectrum AGN and star-forming galaxies for each integration time across all frequencies, when using SNR=2.

Dataset      # SS-AGN   # FS-AGN   # SFG
B1 8 h       342        117        13920
B1 100 h     644        386        34158
B1 1000 h    957        682        57797
B2 8 h       91         64         4028
B2 100 h     166        151        9423
B2 1000 h    278        294        17283
B5 8 h       3          1          26
B5 100 h     4          2          103
B5 1000 h    6          6          223

Table 4.3: The total number of steep-spectrum AGN, flat-spectrum AGN and star-forming galaxies for each integration time across all frequencies, when using SNR=5.

Dataset      # SS-AGN   # FS-AGN   # SFG
B1 8 h       213        94         5717
B1 100 h     395        208        16885
B1 1000 h    605        366        31597
B2 8 h       59         25         1877
B2 100 h     101        73         5096
B2 1000 h    178        155        10251
B5 8 h       3          1          7
B5 100 h     4          1          43
B5 1000 h    4          3          114

Table 4.4: Percentage difference in the number of sources depending on whether the quartile thresholds from the training set are taken versus using the threshold obtained from the training set as a whole, at SNR=5.

Frequency   8 h    100 h   1000 h
B1          4.4    3.7     2.6
B2          4.5    3.6     2.8

Table 4.5: The x and y ranges of the training area, according to the locations within the whole map.

Frequency   x range         y range         Area
B1          16300 - 20300   16300 - 20300   4000 pixels sq.
B2          16300 - 20500   16300 - 20500   4200 pixels sq.

There would be little merit in explicitly augmenting the SFGs because they tend to appear more point-like.

4.3 Results

The results presented are the summary metrics for the different classes of sources (SS, FS and SFG), as well as for all sources as a whole.

The original reconstructed image output of AutoSource is composed of continuous pixel values that are mainly close to zero. To determine the output predictions for the source locations, we define a reconstruction threshold that ranges between 0 and 1, and report, for all metrics, the values obtained at the reconstruction threshold that produces the highest F1 score; a sketch of this sweep is given below. PyBDSF outputs only 0's for sources not found and 1's for sources found.
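The threshold selection can be sketched as a simple sweep, shown below under stated assumptions: score_fn stands for the F1 computation (including the 3-pixel position leniency described next), and the grid of candidate thresholds is hypothetical.

```python
# Hypothetical sketch of the reconstruction-threshold sweep: the threshold
# in [0, 1] that maximises the F1 score on the test set is selected.
import numpy as np

def best_threshold(recon, truth, score_fn, grid=np.linspace(0.05, 0.95, 19)):
    scores = [(t, score_fn(recon > t, truth)) for t in grid]
    return max(scores, key=lambda pair: pair[1])   # (threshold, best F1)
```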

We allow a leniency of 3 pixels for the positions of sources found. To calculate the metrics, we make the following definitions:

• TP: sum of pixels with values greater than the reconstruction threshold in the reconstructed solution map that are within 3 pixels of a source in the true solution map

• FP: sum of pixels with values greater than the reconstruction threshold in the reconstructed solution map that are 3 or more pixels away from any source in the true solution map

• TN: sum of pixels with values lower than the reconstruction threshold in the reconstructed solution map that are also empty in the true solution map

• FN: sum of pixels with values lower than the reconstruction threshold in the reconstructed solution map that are not empty in the true solution map

where TP refers to the true positives, TN refers to true negatives, FP refers to the false positives and FN refers to false negatives.

Given that source-finding in the current work is defined directly in terms of the sum of pixels output by the source-finders, the total number of sources detected is not expected to be the same between the two source-finders.

Since the true catalogue is quite richly populated with sources, and given the 3 pixel leniency, some sources could be found by both algorithms by chance. Ideally there should be zero chance findings, but in reality there will be some small fraction. We use the SNR=1 dataset to test this effect, as this dataset has the highest population of sources in the solution map (and also in the reconstructed map); the SNR=1 dataset can therefore be considered the worst case scenario for chance matches. The effect of chance matches is tested by randomly rotating the reconstructed solution maps, comparing them with the real solution map and calculating the metrics, as for the rest of the results.
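A minimal sketch of this chance-match test is shown below; the function name is hypothetical, and the rotation angle is drawn at random from 90, 180 and 270 degrees.

```python
# Sketch of the chance-match test: randomly rotate the reconstructed map so
# that any remaining "matches" against the true solution map are chance only.
import numpy as np

def chance_rotated(recon_map, rng=np.random):
    return np.rot90(recon_map, k=rng.randint(1, 4))   # k in {1, 2, 3}
```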

The precision, recall and F1 score metrics are provided in the current work in the form of bar plots. We do not include the accuracy because of the way the true negatives are defined; their value is always very high, leading to accuracies greater than 99% for both source-finders. The metrics are defined in Equations 4.1 - 4.4.

Precision = TP / (TP + FP)                                    (4.1)

Recall = TP / (TP + FN)                                       (4.2)

F1 score = 2 × Precision × Recall / (Precision + Recall)      (4.3)

Accuracy = (TP + TN) / (TP + FP + TN + FN)                    (4.4)

The bar plots of the F1 score, defined as the harmonic mean of precision and recall, are provided in the main text, along with a subset of the confusion matrices. The precision and recall bar plots are given in Appendix 4.5.
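As a concrete check of Equations 4.1-4.3, the sketch below computes the three metrics from pixel-level counts; the example numbers are taken from the 'All' columns of Table 4.7.

```python
# Precision, recall and F1 (Equations 4.1-4.3) from pixel-level counts.
def prf1(tp, fp, fn):
    precision = tp / float(tp + fp) if tp + fp else 0.0
    recall = tp / float(tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# AutoSource (augment all), all sources, B2 8 h at SNR=5 (Table 4.7):
print(prf1(347, 47, 80))   # -> approximately (0.88, 0.81, 0.85)
```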

We experimented with using pooling layers, applying them to the B1 8 h dataset with no augmentation. Pooling reduces the dimensionality of the layer by outputting the average pixel value across some area whose size is defined by the user. Two different architectures were considered: placing a pooling layer after the first, or after the second, convolutional layer. Due to the halved dimensions in the architecture as a result of pooling, an upsampling layer had to be inserted prior to obtaining the output. In both cases, the resulting metrics were all inferior to the equivalent model without pooling. The source locations tended to be less precise and generally spanned an area of 4 square pixels, most likely because the pooling operation lost the precise source location.

The use of pooling resulted in AutoSource identifying no true positives, and it generated a few false positives due to the reconstruction of the source positions along the edges of the image only. The likely explanation is that the true signal from the source only occupies a small area, therefore when pooling is used it can ‘wash out’ these pixels, in some cases causing the source to become lost among the background.

We also omit the results for the B5 dataset because both source-finders failed to recover any sources at any integration time. This is most likely because this dataset is the noisiest, as well as having very few sources in the catalogue.

4.3.1 Very low significance source metrics at SNR=1

Figure 4.5 shows the F1 score metrics for the different classes of sources, as well as when all are considered together. AutoSource almost always performs better across the SFGs and all sources in the B1 dataset, for all integration times, whereas PyBDSF performs better for the remaining datasets (SS and FS sources across B1 and B2, and SFGs and all sources at B2).

The better performance of AutoSource across the SFGs and all sources at B1 is most likely due to the effect of chance matches, as shown in Figure 4.6, which shows the source-finding metrics when the predicted source locations are randomised, to see how many sources are found due to chance. On average, AutoSource is more affected by chance matches across the SFGs and all sources. Some possible causes of the increased chance matches in AutoSource are that the SFGs are highest in number, and that many sources found tend to be spread over several pixels rather than confined to one. On the other hand, PyBDSF is more affected on average by chance findings across the SS and FS sources. In AutoSource, at worst the chance matches reach up to ∼26% of real findings, whereas in PyBDSF the effect is more pronounced in the datasets where fewer sources are found overall. The worst case for PyBDSF is across the SS sources in the B2 100 h dataset, where there are barely more real matches than chance matches. It should also be noted that the SNR=1 dataset is the noisiest one and also has the most densely populated solution and reconstruction maps, which maximises the risk of chance findings, and therefore represents the worst case scenario in terms of datasets. We further note that the sources have very low significance at this SNR.

An improvement in the F1 score across the SFGs can be observed when augmenting the images containing all sources, since the vast majority of all sources are SFGs. However, augmenting the SS and FS sources does not improve the SFG scores by much, since we are not giving the network more examples of SFGs to train on. Image augmentation does not have the same effect on the randomised data, as shown in Figure 4.6.

Figure 4.7 may indicate possible reasons why the AutoSource performance is poorer overall than that of PyBDSF at SNR=1. Since the solutions are injected into the map at the threshold of the mean background noise level, there are solutions that are not obvious by eye and can become confused with the background noise. It is therefore possible that AutoSource has not successfully learnt to extract the sources at this SNR. For the examples given, there is one SS source in each map, while the rest are SFGs. Both PyBDSF and AutoSource recover the SS source in the top row, whereas AutoSource finds a false positive and misses other SFGs; PyBDSF recovers one of the SFGs successfully but misses the others. Neither PyBDSF nor AutoSource recover the SS source in the bottom row, and AutoSource partially recovers one of the SFGs, however the location is spread out over several pixels. It misses the other SFGs and produces a couple of false positives. PyBDSF recovers only one SFG in this example, and misses the others, which results in a number of false negatives.

Figure 4.5: F1 scores at SNR=1, across the two frequencies B1 (560 MHz) and B2 (1400 MHz) and the three integration times. There are three results given from AutoSource, depending on the augmentation used when training: the blue bar represents no augmentation, orange represents augmenting the SS and FS sources, and the green bar represents augmenting all sources. The graphs show that PyBDSF usually performs better than AutoSource at this SNR. Although AutoSource appears to perform better across the SFGs and all sources in the B1 dataset for all integration times, this better performance appears to be explained by the increased proportion of chance matches at this SNR, as shown in Figure 4.6. However, it should be noted that these sources have very low significance given the SNR.

4.3.2 Low significance source metrics at SNR=2

Figure 4.8 shows the F1 score metrics for the different classes of sources, as well as when we consider them all together.

Across the SFGs/all sources at SNR=2, AutoSource performs better on average, now also recovering these sources better in the B2 8 h dataset; one example is shown in Figure 4.9. However, PyBDSF generally performs better across the SS and FS sources.

Considering the F1 scores of the SS sources in the 8 h datasets, the augmentation of either the SS/FS or all sources worsens the score, most likely because this dataset is the noisiest of the three, and therefore some signal becomes lost in the noise. There are generally slight improvements with augmentation of the SS/FS sources at the other two integration times, as the noise is reduced.

Figure 4.6: The effect of randomly rotating the reconstructed matrix of source locations, used to investigate the proportion of chance findings.

Figure 4.7: The top and bottom rows show a couple of examples of (first column) a real map for B1 at 8 h, (second column) the solutions injected into the map given the SNR=1 threshold, (third column) the locations predicted by AutoSource after training on the original images only, and (fourth column) the locations predicted by PyBDSF.

The SS sources have the most varied morphology within their class (they have the greatest amount of extended emission and give rise to FRI/FRII type structures), however there are not many original examples of these. Additionally, the signal threshold is set at only twice the noise level, so there are more solutions in the map, increasing the risk of sources being contaminated with noise. PyBDSF clearly outperforms AutoSource across the SS sources.

Across the FS sources, there are two datasets in which AutoSource performs better than PyBDSF (B2 at 8 h and B2 at 100 h); for the remainder it does slightly worse than PyBDSF. The augmentation of the SS/FS sources always improves the F1 score across the FS sources, however augmenting 'all' sources does not always improve it, since most of these sources are SFGs and therefore proportionally there are fewer SS/FS sources to train on. We note that AutoSource performs better across the FS sources, as these sources have a more defined morphology, tending to be more compact than the SS sources.

A similar pattern to SNR=1 is observed at SNR=2 in regard to the effects of augmentation, where augmenting sources of the same type results in improved metrics for those sources.

Table 4.6 shows the confusion matrices for the B1 and B2 datasets at the 8 h and 1000 h integration times, comparing the test results of AutoSource trained with augmentation of all sources against PyBDSF. We exclude the true negative counts for brevity, as these denote the total number of pixels where there is no solution as well as no predicted source. Given that AutoSource sometimes produces reconstructed solutions that are spread over several pixels, and that true positives are defined as matches within 3 pixels of the true solution locations, AutoSource detects more true positives. However, it also detects more false positives compared to PyBDSF, but fewer false negatives. Therefore, it misses fewer sources compared to PyBDSF.

4.3.3 High significance source metrics at SNR=5

The opposite trend to what was observed at SNR=2 is seen at SNR=5, as shown in Figure 4.10, where now PyBDSF performs better on the SFGs/all sources on average, whereas AutoSource performs better on average on the SS and FS sources.

Therefore, when there is a higher signal-to-noise ratio, AutoSource can better extract the SS/FS sources compared to the SFG/point sources. In the majority of cases, better results are achieved when augmenting either the SS and FS sources, or all sources. When the signal-to-noise is lower, the performance of AutoSource across these extended sources suffers, probably because the emission from them tends to become lost in the noise, whereas the SFGs are recovered better than with PyBDSF at lower SNR ratios.

Figure 4.8: F1 scores at SNR=2, across the two frequencies B1 (560 MHz) and B2 (1400 MHz) and the three integration times. There are three results given from AutoSource, depending on the augmentation used when training: the blue bar represents no augmentation, orange represents augmenting the SS and FS sources, and the green bar represents augmenting all sources.

Figure 4.9: The top and bottom rows show a couple of examples of (first column) a real map for B2 at 8 h, (second column) the solutions injected into the map given the SNR=2 threshold, (third column) the locations predicted by AutoSource after training on the original images only, and (fourth column) the locations predicted by PyBDSF.

Table 4.6: TP, FP and FN counts for (3) = AutoSource augment all and (4) = PyBDSF, at B1 8 h, B1 1000 h, B2 8 h and B2 1000 h, at SNR=2. The last two columns give the FP/TP and FN/TP ratios for all sources.

Source type       SFG               SS             FS            All
Method            tp   fp   fn      tp  fp   fn    tp  fp   fn   tp   fp   fn    fp/tp  fn/tp
B1 8 h     (3)    1473 635  261     23  282  0     26  163  0    1522 561  316   0.37   0.21
           (4)    314  73   611     19  57   1     8   35   0    341  46   663   0.14   1.94
B1 1000 h  (3)    5351 2735 2026    58  2722 0     68  1551 0    5477 2592 2235  0.47   0.41
           (4)    3326 506  4333    57  1306 0     50  765  0    3433 429  4555  0.13   1.33
B2 8 h     (3)    628  52   319     3   22   0     12  3    0    643  56   340   0.09   0.53
           (4)    130  13   79      4   9    0     9   3    0    143  8    89    0.06   0.62
B2 1000 h  (3)    2476 1593 1608    11  330  1     42  289  0    2529 1531 1734  0.61   0.69
           (4)    1897 290  1932    12  226  1     31  199  0    1940 245  2050  0.13   1.06

Figure 4.10: F1 scores at SNR=5, across the two frequencies B1 (560 MHz) and B2 (1400 MHz) and the three integration times. There are three results given from AutoSource, depending on the augmentation used when training: the blue bar represents no augmentation, orange represents augmenting the SS and FS sources, and the green bar represents augmenting all sources.

Figure 4.11: The top and bottom rows show a couple of examples of (first column) a real map for B2 at 8 h, (second column) the solutions injected into the map given the SNR=5 threshold, (third column) the locations predicted by AutoSource after training on the original images only, and (fourth column) the locations predicted by PyBDSF.

Figure 4.11 shows that AutoSource can recover the SFG and SS source in the top row, as well as the SFG in the bottom row, however at the expense of a couple of false positives. Meanwhile, PyBDSF does not recover any sources.

Table 4.7 shows the confusion matrices at SNR=5, for the same datasets and runs as were included for SNR=2. Fewer sources are found by the source-finders overall, as the SNR is higher than before; however, a similar trend is seen, where AutoSource finds more true positives and false positives, whereas PyBDSF finds fewer true positives but more false negatives. The ratios are not as pronounced as at SNR=2, as the signal-to-noise is now higher.

Figure 4.12 shows the training and validation losses across the B1 and B2 frequencies, for all integration times. In the left panel (B1; 560 MHz) the training and validation losses are roughly at the same level for the 8 h and 100 h integration times, whereas some underfitting is observed in the 1000 h dataset. Underfitting generally indicates that a more complex architecture should be tried. In the right panel (B2; 1400 MHz) the 8 h loss curves are at the same level, whereas some overfitting is observed for the 100 h and 1000 h integration times. The B2 dataset is on average noisier than the B1 dataset. It is interesting to note that for the same integration times across the two different frequencies, the same model tends to underfit on one dataset and overfit on the other. This indicates that using the same model across all frequencies and integration times is not ideal; instead, each model should be tuned to the specific dataset at hand. Nonetheless, the resulting metrics are still competitive with those of PyBDSF, and have the potential to outperform PyBDSF given a more optimally tuned model.

Table 4.7: TP, FP and FN counts for (3) = AutoSource augment all and (4) = PyBDSF, at B1 8 h, B1 1000 h, B2 8 h and B2 1000 h, at SNR=5. The last two columns give the FP/TP and FN/TP ratios for all sources.

Source type       SFG               SS            FS           All
Method            tp   fp   fn      tp  fp   fn   tp  fp  fn   tp   fp   fn    fp/tp  fn/tp
B1 8 h     (3)    444  175  225     20  41   1    11  11  0    475  164  256   0.35   0.54
           (4)    304  60   172     18  38   0    8   15  0    330  40   193   0.12   0.58
B1 1000 h  (3)    3478 1422 629     35  1075 0    45  573 0    3558 1311 738   0.37   0.21
           (4)    3070 663  1124    57  892  0    45  473 0    3172 554  1247  0.18   0.39
B2 8 h     (3)    332  44   70      7   13   0    8   0   0    347  47   80    0.14   0.23
           (4)    128  13   29      4   8    0    8   1   0    140  8    34    0.06   0.24
B2 1000 h  (3)    1980 974  514     13  168  0    29  115 0    2022 923  567   0.46   0.28
           (4)    1857 280  587     12  149  0    28  102 0    1897 224  637   0.12   0.34

Figure 4.12: Training and validation losses across the 3 integration times at SNR=5 for the B1 and B2 datasets, in the left and right panels respectively.

4.3.4 Execution times

Both source-finders were run on a computing cluster using CPUs from 27 available Intel XEON nodes with 3.5 GHz processors; there are six cores available per node.

Table 4.8 shows the source-finding execution times for AutoSource when performing no augmentation, when augmenting the FS and SS sources, and when augmenting all sources. PyBDSF takes more than twice as long to run as the AutoSource run in which all images are augmented. The execution times of AutoSource are subject to variability depending on how many sources there are to augment, as well as the total training time, which depends on the total number of images and epochs.

Table 4.8: Running times of AutoSource when augmenting different sets of images, as well as of PyBDSF, on the B1 8 h dataset.

Method              B1 8 h (mins)
AutoSource none     27.5
AutoSource SS+FS    23.2
AutoSource all      63.6
PyBDSF              167.0

The run where the SS and FS sources were augmented took a shorter time to train and test compared to the one where no augmentation was used, because there were more epochs of training completed; the run that did not utilise augmentation was affected by the early stopping condition at an earlier point during training.

4.4 Discussion and Conclusions

In the current work we have shown how a simple autoencoder composed of three convolutional layers, a dropout layer and a dense layer, as shown in Figure 4.1 and Table 4.1, can be competitive with a state-of-the-art source-finder, PyBDSF. Both approaches have been tested across different frequencies, integration times and signal-to-noise ratios, and the recovery metrics for the different source types (SFGs, SS-AGN and FS-AGN) were derived. The code used to obtain both the AutoSource and PyBDSF results is available on GitHub (https://github.com/vlukic973/AutoSource). Given that AutoSource outputs continuous values in the reconstruction of the solution map, thresholded by a reconstruction threshold between 0 and 1, whereas PyBDSF uses a fixed threshold, AutoSource could be the more flexible method, as it attributes a probability to finding a source at a particular location.

AutoSource also sometimes outputs the source location spread over a few pixels rather than localised to a single one, which may provide additional information about the source; for example, it could be more extended or diffuse. The spreading of the source location over several pixels, which occurs more frequently at the lower SNR ratios and at shorter exposure times, where there are more sources present and their emission is more likely to become mixed with the noise, results in more true positives and fewer false negatives. However, at the same time AutoSource also produces a larger number of false positives compared to PyBDSF. A similar trend is seen at higher SNR ratios, although fewer true positives, false positives and false negatives are found by both source-finders in comparison; for example, the SNR=5 dataset has fewer solutions but also the strongest signal. On the other hand,


PyBDSF misses many more sources compared to AutoSource, as the false negative counts are almost always higher.

It is interesting to note that the metrics for the SS and FS sources tend to be relatively low for both PyBDSF and AutoSource. In fact, they decrease with increasing integration time, across all SNRs, with the dataset at the lowest frequency (B1) attaining the lowest metrics overall. Possible reasons could be that the SS and FS sources are the smallest in number, and that their morphology is revealed as increasingly variable as more extended emission is detected with the longer integration times.

In regard to how well the two methods extract SFGs, SS, FS and all source types combined across the SNRs, we see that PyBDSF performs better on average than AutoSource at SNR=1. AutoSource appears to be more severely affected by chance matches at this SNR than PyBDSF, however the sources have very low significance. In contrast, AutoSource is better at extracting the SFGs and all sources at SNR=2, whereas PyBDSF is better at extracting these at SNR=5. AutoSource is better at extracting the FS sources at an SNR of 5, whereas PyBDSF is better for the FS sources at SNR=2. AutoSource is worse at extracting the SS sources at an SNR of 2, however half the time it is better than PyBDSF at extracting them at an SNR of 5.

We have seen that image augmentation improves the AutoSource performance when the relevant sources are augmented: generating more 'all' sources tends to improve the metrics across SFGs and 'all' sources, as these are largely made up of SFGs, and generating more SS and FS sources tends to improve their recovery, but not that of SFGs and all sources. Augmentation may also fail to improve the results as expected when the datasets are noisier, the sources are few in number, or their morphology is ambiguous.

PyBDSF takes longer to run in total (167 mins for the B1 dataset at 8 h), however it outputs characteristics of the source such as size and flux, among other properties, whereas AutoSource outputs the positions only.

Across the results for the low significance source metrics at SNR=2 and the high significance source metrics at SNR=5, AutoSource usually outperforms, or has very similar performance metrics to, PyBDSF across the shortest integration time datasets (8 h). This may indicate that it can more successfully model the noise at these SNRs and integration time compared to PyBDSF. The only times that AutoSource performs visibly worse are in B2 at 8 h across the SS sources at SNR=2, and across all of B2 at 8 h at SNR=1. It appears that AutoSource has trouble modelling the noise as the SNR decreases, especially for sources with more extended emission. Potential ways to improve the performance of AutoSource at lower SNR ratios could be to use a more complex network, and to train for more epochs with a greater reconstruction threshold when using early stopping. However, one of the purposes of the current work was to show how a simple convolutional network architecture can be used for source-finding in radio astronomy.

The injection of sources and, in turn, their ability to be found by the source-finders largely depends on the characterisation of the background noise signal. In the current work, we use PyBDSF to estimate the background noise; therefore, if there are more false negatives/positives, these missed/extra sources will contaminate the background signal to some extent. Sources displaying a more compact morphology are unlikely to affect the background signal by much, since the emission is localised to a very small area. However, the effect will be larger the more extended the source is. Some extended sources may have very faint and/or diffuse emission which can mingle with the noise.

It appears that AutoSource performs better overall at larger SNRs and shorter integration times compared to PyBDSF, most likely because it has learned to model the noise in these images better and the sources show a greater contrast against the background. The ratio of false positives to true positives is larger for AutoSource, whereas the ratio of false negatives to true positives is larger for PyBDSF; therefore, AutoSource and PyBDSF perform better in terms of recall and precision respectively. As the SNR increases, AutoSource becomes increasingly better at recovering the extended (SS and FS) sources and tends to outperform PyBDSF across most datasets at the highest SNR of 5, while at the same time PyBDSF becomes increasingly better at recovering the SFGs and the sources as a whole. With decreasing SNR, AutoSource is increasingly successful at recovering the more compact sources (SFGs) and all sources, whereas it performs worse with the extended sources, most probably because it has not successfully learnt to extract the extended source signals from the noise at lower SNRs, where PyBDSF does better.

Given that AutoSource tends to perform better overall in terms of recall compared to PyBDSF (as shown in Appendix 4.5), and therefore produces fewer false negatives and picks up some sources that PyBDSF has missed, it could be used as part of a pipeline in which AutoSource is run first to find the sources, and PyBDSF is then run to improve the precision of these detections, perform further filtering, and characterise the sources.
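A sketch of such a two-stage pipeline follows; the functions autosource_find and crossmatch are hypothetical placeholders for the stages described above, while bdsf.process_image and write_catalog are real PyBDSF calls:

    import bdsf

    # Stage 1 (hypothetical helper): run the autoencoder to propose candidate positions.
    candidates = autosource_find('map_8h_B1.fits')   # e.g. a list of (ra, dec) tuples

    # Stage 2: run PyBDSF on the same map to fit and characterise sources.
    img = bdsf.process_image('map_8h_B1.fits')
    img.write_catalog(outfile='pybdsf_catalog.csv', format='csv',
                      catalog_type='srl', clobber=True)

    # Keep only PyBDSF sources near an AutoSource candidate (hypothetical helper),
    # combining AutoSource's recall with PyBDSF's precision and characterisation.
    final_catalog = crossmatch(candidates, 'pybdsf_catalog.csv', radius_arcsec=5.0)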

The next step in developing AutoSource would be to derive properties of the sources found. One way to do this may be to correlate the features detected by the lower layers with the values given in the catalogue, for example matching the total flux of a source to the emission detected by one of the feature maps. We previously attempted a regression technique to see whether the network could learn the continuous values provided in the catalogue, but it failed to learn any property successfully. AutoSource could also be made up of individual models targeted to the dataset at hand, where the training and validation losses are better matched. Another possible extension of the current work would be to train an autoencoder to remove noise from data by generating 1000 h maps from 8 h or 100 h ones.

4.5 Appendix

Figure 4.13: Precision values at SNR=1.

Figure 4.14: Recall values at SNR=1.

Figure 4.15: Precision values at SNR=2.

Figure 4.16: Recall values at SNR=2.

Figure 4.17: Precision values at SNR=5.

Figure 4.18: Recall values at SNR=5.

5 Conclusions and Outlook

Radio astronomical continuum surveys are fundamental in furthering our understanding of the Universe, as they facilitate the study of galaxy formation and evolution, and of cosmology in general. Early radio surveys provided decisive evidence in favour of the Big Bang model. Over the last few decades there has been an increase in the number of radio sources detected, owing to the increased capabilities of radio telescopes in terms of sensitivity and survey area. Larger samples of radio sources enable better population studies of the two fundamental galaxy types in terms of radio emission, AGN and SFGs, as well as an improved understanding of the evolution of the radio luminosity function. Surveys over the last few decades have provided larger samples of faint radio sources at higher redshifts, enabling a view into the earliest stages of cosmological evolution.

The cross-correlation of surveys between different wavelengths such as optical, infrared and radio is important in gaining a more complete understanding of galaxies. For example, it is necessary to overlay radio and infrared emission to help determine whether components of radio emission belong to a single radio source, or whether they are independent sources. One important task in astronomy is that of classification. In terms of radio sources, the range of radio galaxy classes in existence is indicative of different physical processes, such as those governing the power of the jets, how the radio galaxy is formed, what environment it exists in, and what events preceded it.

The large number of radio sources detected will only increase over time, making manual analyses by astronomers increasingly obsolete. Large numbers of citizen scientists, who are not expert astronomers, can be taught to recognise patterns and therefore assist in characterising the properties of astronomical objects in order to classify them. However, even the use of citizen scientists will be insufficient to keep up with the pace of increasing data volumes. Machine learning tools can be used instead, given their capacity to learn and distinguish features in data. The input from citizen scientists can serve as a training set from which machine learning algorithms can learn.

With regard to astronomy, machine learning tools have proven useful in the estimation of photometric redshifts, in classifying objects such as stars and galaxies, as well as different types of AGN, and in the morphological classification of galaxies. A particular type of neural network, the convolutional neural network (CNN), has had numerous successes in classification and regression tasks on high-dimensional data such as images. The use of traditional neural networks is generally unfeasible in the analysis of such data due to the size and complexity of the network, and hence the number of parameters, needed to capture the patterns that discern features. CNNs, on the other hand, employ filters of smaller size that scan across the image and detect features, which greatly reduces the number of parameters. Additionally, the use of filters allows for parameter sharing and leads to translational invariance. Pooling layers summarise the information over a range of pixels before sending it to the following layer, which further decreases the number of parameters. The stacking of convolutional and pooling layers leads to a hierarchical extraction of features in images. The information is merged at the end, and the network can calculate the error based on the prediction and the label. Machine learning will be an indispensable tool for analysing large amounts of astronomical data in the future.
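To make the stacking of convolutional, pooling and dense layers concrete, a minimal Keras sketch is given below (the layer widths, image size and class count are illustrative only):

    from tensorflow.keras import layers, models

    model = models.Sequential([
        # 32 filters of size 3x3 scan the image; the same weights are shared
        # across all positions (parameter sharing, translational invariance).
        layers.Conv2D(32, (3, 3), activation='relu', input_shape=(64, 64, 1)),
        # Pooling summarises each 2x2 neighbourhood, halving the resolution
        # and further reducing the number of parameters downstream.
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3), activation='relu'),
        layers.MaxPooling2D((2, 2)),
        # Merge the hierarchically extracted features and map them to classes.
        layers.Flatten(),
        layers.Dense(64, activation='relu'),
        layers.Dense(4, activation='softmax'),  # e.g. four morphology classes
    ])
    model.compile(optimizer='adam', loss='categorical_crossentropy',
                  metrics=['accuracy'])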

There are fundamental differences between images from astronomical and non-astronomical domains. Many neural network architectures have been developed to distinguish between everyday subjects such as dogs, cats and cars, of which there are thousands of classes and many examples within each class, and which tend to be well defined in terms of appearance. Astronomical subjects, on the other hand, tend to have fewer classes and examples, and the subjects within the classes are usually more complex and abstract, as they can have very different emission patterns that also depend on the wavelength. As such, astronomical subjects generally need larger and/or higher-resolution images to capture the differences within and between the classes. The fact that there are fewer examples of astronomical subjects leads to smaller training sets, hence motivating the use of citizen scientists to provide more labels. The difference in nature between astronomical and non-astronomical images is also an important consideration for applications of transfer learning, as the higher-level layers in neural networks trained on everyday images will need to be fine-tuned to capture the subtleties in classes based on the specific varieties of astronomical subjects.

The main focus of the current thesis has been the classification of radio-loud AGN, namely radio galaxies. The fundamental categories used to separate radio galaxies are compact and extended sources. Compact sources, also known as point sources, may be unresolved by the telescope, or they may be a radio galaxy in their own right (FR0), or another type of radio galaxy in the early stages of evolution. Extended sources are resolved by the telescope and display a much larger morphological variety. In Chapter 2, we explored the use of deep neural networks for classifying between compact and various classes of extended sources in radio astronomical data from the Radio Galaxy Zoo, consisting mainly of images from the FIRST survey. Beginning with a sample of more than 200,000 galaxies, we used the Python Blob Detector and Source-Finder (PyBDSF) to separate them according to the number of components they possess, under the assumption that there is one source per image (a sketch of this step follows below). The first analysis involved the two-class problem of distinguishing between compact sources and multiple-component extended sources. A network with three convolutional and two dense layers produced the highest test classification accuracy. Using the knowledge gained from the two-class problem, we applied the same architecture to the four-class problem of classifying between compact sources and three classes of extended (1-, 2- and >3-component) sources, achieving a slightly lower test classification accuracy. Finally, we used the four-class trained network on DR1 of the Radio Galaxy Zoo to see how well it matched predictions from citizen scientists, based on the number of peaks and components of the radio sources, and were able to slightly improve on the accuracies obtained for the four-class problem. All three analyses also involved generating augmented images, which served to increase the classification metrics. We note that the overall high accuracy on the RGZ DR1 is mainly influenced by the fact that the sources in the compact/single-component class have the simplest morphology and only one component, and are therefore the easiest to identify. The classification metrics tend to decrease as the number of components increases; hence the classification metrics on the >3-component sources are the poorest, most probably because these sources have the most complex morphology and the fewest original examples. Although the use of image augmentation improves results, it is limited by the number of original examples one has in a particular class, especially if the class displays increased morphological variety and complexity. Additionally, the images were classified solely on the radio galaxy data, without integrating the infrared data; it is therefore not possible to determine more conclusively whether or not the radio emission in a particular image belongs to the same source. Furthermore, since the majority of the images originated from the FIRST survey, it is known that the survey failed to show extended sources accurately, due to its relatively poor angular resolution and low sensitivity to extended low surface brightness structures.
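For orientation, the component-counting step could look roughly as follows with PyBDSF (a sketch only; the exact thresholds used in Chapter 2 are not reproduced here):

    import bdsf

    # Fit Gaussian components to a cutout; the island and pixel thresholds
    # (in sigma) are illustrative values, not those used in the thesis.
    img = bdsf.process_image('first_cutout.fits', thresh_isl=3.0, thresh_pix=5.0)

    # Under the one-source-per-image assumption, the number of fitted
    # Gaussians decides the class bin: 1, 2, or >3 components.
    n_components = len(img.gaussians)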

The limitations of the FIRST survey prompted the development of radio surveys that achieve greater sensitivity and resolution. One such survey is the LOFAR survey. Owing to its compact core and longer baselines, it is better able to capture faint and highly detailed extended emission from further away. This reveals a richer range of morphologies compared to previous surveys, and it is important to develop methods that preserve these fine-grained details. Although convolutional neural networks are translationally invariant, they are not rotationally invariant. The local features within an image are preserved, but how the features relate to each other on a global scale increasingly disintegrates due to the use of the pooling layer, as it summarises the pixel information over a small localised area and therefore results in some information loss. This shortcoming of convolutional networks motivated the development of Capsule networks, which are composed of groups of neurons whose aim is to preserve the relative location of features on a local and global scale, and hence achieve both translational and rotational invariance. In the context of radio galaxies, the orientation and pattern of the emission are important as they affect the morphological classification. As such, Chapter 3 compares the performance of Capsule networks against simpler 4- and 8-layer convolutional neural network setups using the classes of Unresolved, FRI and FRII radio galaxies from the LOFAR LoTSS HETDEX survey, using 2900 original sources, which were augmented to over 15,000 in total. In contrast to the sources in Chapter 2, these sources have optical IDs, and galaxies with radio emission due to star formation have been removed. We explored using the original FITS file data as well as the sigma-clipped numpy arrays. The CNNs always supersede the performance of Capsule networks, with or without image augmentation. It appears that the pooling operation is advantageous for the radio galaxy images at hand, as it makes the CNNs more robust to the presence of noise in images, as well as allowing more freedom in how the morphology can vary within classes. The relatively poorer performance of the Capsule networks may also be explained by the fact that the original sample size was insufficient for them to capture the underlying emission patterns.

In Chapter 4, we consider the problem of source-finding in radio astronomical data, using simulated images from the SKA across three integration times, each at three different frequencies. The data were originally intended for the SKA data challenge, to develop algorithms for finding and characterising sources that are suited to the type of data the SKA will actually produce. Three classes of galaxies are simulated, namely flat-spectrum AGN, steep-spectrum AGN and SFGs. The source-finder we developed is based on a deep learning approach, consisting of a simple convolutional autoencoder (AutoSource) with three convolutional layers, trained on the simulated maps segmented into 50x50 pixel blocks, using the known source locations to generate a solution map; these provide the training ‘data’ and ‘labels’ respectively. The performance of AutoSource is compared to that of PyBDSF, a state-of-the-art source-finder based on fitting Gaussian components to sources, across different signal-to-noise ratios. We find that one method performs better than the other depending on the type of source, the dataset (varying in integration time and frequency), and the signal-to-noise ratio; no single method performs better in all circumstances. AutoSource tends to detect more true positives as well as true negatives, but it also detects more false positives compared to PyBDSF. The competitive results produced by AutoSource indicate that deep learning techniques such as convolutional autoencoders hold promise in locating astronomical sources, especially with the use of image augmentation, which can produce more instances of rarer radio galaxy classes and therefore improve the source-finding ability for those source types. Furthermore, the technique of generating training images is very flexible, as it is possible to produce segmented blocks that overlap to varying degrees, as well as to change the sizes of the blocks.
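A minimal sketch of such an encoder-decoder architecture in Keras is given below; the layer widths are illustrative and not necessarily those of AutoSource:

    from tensorflow.keras import layers, models

    autoencoder = models.Sequential([
        # Encoder: compress each 50x50 block into a coarser feature map.
        layers.Conv2D(16, (3, 3), activation='relu', padding='same',
                      input_shape=(50, 50, 1)),
        layers.MaxPooling2D((2, 2), padding='same'),
        layers.Conv2D(8, (3, 3), activation='relu', padding='same'),
        # Decoder: upsample back to the block size and predict the
        # solution map marking the source positions.
        layers.UpSampling2D((2, 2)),
        layers.Conv2D(1, (3, 3), activation='sigmoid', padding='same'),
    ])
    autoencoder.compile(optimizer='adam', loss='binary_crossentropy')

    # autoencoder.fit(blocks, solution_maps, epochs=100, validation_split=0.2)

Because the output has the same 50x50 shape as the input, the segmented image blocks and the corresponding solution maps can be fed in directly as the ‘data’ and ‘labels’.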

As is evident from the introductory section discussing the application of machine learning techniques to astronomical datasets, the type of machine learning approach to take depends on the type of data, its complexity, the number of classes or variables, and the context. Deep learning algorithms have shown the most promise so far in performing classification and regression tasks on high-dimensional data such as images. Despite this, there is not yet a single architecture to use in all circumstances; it is necessary to experiment with different architectures and hyper-parameters to see which ones perform best. The use of transfer learning has also proven useful, whereby the stored weights from one network are applied to a similar dataset.

The development of deep learning algorithms in the current work has largely been through trial and error; we begin with a simple neural network architecture and gradually increase the complexity to see whether the metrics improve. The validation and training losses were monitored in order to see whether the network was underfitting (indicating the data are more complex than the model can describe) or overfitting (the model has too many parameters relative to the data complexity). Additionally, we examined the features being learned by the network on a layer-by-layer level by outputting the activations. A future development of our methods would involve looking at their interpretability, for example examining which nodes are most relevant in assigning a class to a particular image.
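Both diagnostics are straightforward to reproduce in Keras; the sketch below assumes a compiled model and training arrays (all variable names are illustrative):

    import matplotlib.pyplot as plt
    from tensorflow.keras.models import Model

    history = model.fit(x_train, y_train, epochs=50, validation_split=0.2)

    # Diverging curves indicate overfitting; two high, flat curves suggest
    # the model is too simple for the data (underfitting).
    plt.plot(history.history['loss'], label='training loss')
    plt.plot(history.history['val_loss'], label='validation loss')
    plt.legend()

    # Inspect what a given layer has learned by outputting its activations
    # for a sample image (here, the third layer of the network).
    activation_model = Model(inputs=model.input,
                             outputs=model.layers[2].output)
    feature_maps = activation_model.predict(x_train[:1])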

In conclusion, we have developed deep learning algorithms to classify radio galaxies based on the number of components as well as the Fanaroff-Riley class, and have also shown that deep learning algorithms can be competitive source-finders. This application is novel and deserves further development, for example by incorporating the ability to attribute properties such as flux, size and angle to radio sources through regression.

In the future, the continual development and refinement of deep learning models will be necessary as more surveys are performed, detecting sources at higher sensitivities and resolutions. The more we know about existing sources, the better we can tailor the algorithms to mine data for unusual sources.

The use of transfer learning from networks trained on radio astronomical data should provide a more efficient and accurate way to obtain results from future astronomical datasets, compared to always training from scratch. The development of more algorithms that can perform cross-matching of sources between surveys will be of increasing importance as more sources are detected. Existing region-based convolutional networks should also be improved so they can be trained to pay attention to relevant parts of an image and to distinguish whether radio emission belongs to one or multiple sources in an image. An integrated approach that accurately combines source-finding, cross-correlation, characterisation and classification would be an ideal development.

Other potential future developments include the generation or simulation of realistic radio images based on an exhaustive study of existing ones, to see whether a machine learning algorithm is able to detect the difference between the real and simulated images. If the algorithm is unable to make the distinction, the simulated images can be used to increase the number of images in a training set, which is essentially another image augmentation technique.

6 Bibliography

Ackermann S., Schawinski K., Zhang C., Weigel A. K., Turp M. D., 2018, MNRAS, 479, 415

Alger M., 2016, Master's thesis, Australian National University

Alger M. J., et al., 2018, MNRAS, 478, 5547

Alhassan W., Taylor A. R., Vaccari M., 2018, MNRAS, 480, 2085

Andreon S., Gargiulo G., Longo G., Tagliaferri R., Capuano N., 1999, in International Joint Conference on Neural Networks (IJCNN), Vol. 6, pp 3810-3815, doi:10.1109/IJCNN.1999.830761

Aniyan A. K., Thorat K., 2017, ApJS, 230, 20

Ankita Dubey A. C., 2017, International Journal of Scientific Research Engineering Technology (IJSRET), 6, 624

Arpit D., et al., 2017, in Precup D., Teh Y. W., eds, Proceedings of Machine Learning Research Vol. 70, Proceedings of the 34th International Conference on Machine Learning. PMLR, Sydney, Australia, pp 233-242, http://proceedings.mlr.press/v70/arpit17a.html

Auriemma C., Perola G. C., Ekers R. D., Fanti R., Lari C., Jaffe W. J., Ulrich M. H., 1977, A&A

Avendi M., 2018, Another look into overfitting

Bach S., Binder A., Montavon G., Klauschen F., Müller K., Samek W., 2015, PLoS ONE, 10, e0130140

Bailey S., 2012, PASP, 124, 1015

Baldi R. D., Capetti A., Giovannini G., 2015, A&A, 576, A38

Baldi P., Bauer K., Eng C., Sadowski P., Whiteson D., 2016a, Phys. Rev. D, 93, 094034

Baldi R. D., Capetti A., Giovannini G., 2016b, Astronomische Nachrichten, 337, 114

Ball N. M., Brunner R. J., 2010, International Journal of Modern Physics D, 19, 1049

Ball N. M., Brunner R. J., Myers A. D., Tcheng D., 2006, ApJ, 650, 497

Ballard D. H., 1987, in Proceedings of the Sixth National Conference on Artificial Intelligence - Volume 1. AAAI'87. AAAI Press, pp 279-284, http://dl.acm.org/citation.cfm?id=1863696.1863746

Banfield J. K., et al., 2015, MNRAS, 453, 2326

Baron D., 2019, arXiv e-prints, p. arXiv:1904.07248

Baron D., Poznanski D., 2016, MNRAS, 465, 4530

Basson J. F., Alexander P., 2003, MNRAS, 339, 353

Baum E. B., Haussler D., 1989, in Touretzky D. S., ed., Advances in Neural Information Processing Systems 1. Morgan-Kaufmann, pp 81-90, http://papers.nips.cc/paper/154-what-size-net-gives-valid-generalization.pdf

Baum S. A., Zirbel E. L., O'Dea C. P., 1995, ApJ, 451, 88

Beck R., Krause M., 2005, Astronomische Nachrichten, 326, 414

Beck R., Wielebinski R., 2013, Magnetic Fields in Galaxies. p. 641, doi:10.1007/978-94-007-5612-0_13

Becker R. H., White R. L., Helfand D. J., 1994, in Crabtree D. R., Hanisch R. J., Barnes J., eds, Astronomical Society of the Pacific Conference Series Vol. 61, Astronomical Data Analysis Software and Systems III. p. 165

Becker R. H., White R. L., Helfand D. J., 1995, ApJ, 450, 559

Begelman M. C., Rees M. J., Blandford R. D., 1979, Nature, 279, 770

Bell Burnell J., 1969, PhD thesis, University of Cambridge

Bellman R., 2003, Dynamic Programming. Dover Books on Computer Science Series, Dover Publications

Belson W. A., 1959, Journal of the Royal Statistical Society, Series C (Applied Statistics), 8, 65

Bennett A. S., Smith F. G., 1962, MNRAS, 125, 75

Bergstra J., Bengio Y., 2012, J. Mach. Learn. Res., 13, 281

Best P. N., Kauffmann G., Heckman T. M., Brinchmann J., Charlot S., Ivezić Ž., White S. D. M., 2005, MNRAS, 362, 25

Bicknell G. V., 1994, ApJ, 422, 542

Bicknell G. V., 1995, ApJS, 101, 29

Bieri R., 2016, PhD thesis, Université Pierre et Marie Curie - Paris VI, https://tel.archives-ouvertes.fr/tel-01496864

Bilicki M., et al., 2018, A&A, 616, A69

Bishop C. M., 1995, Neural Networks for Pattern Recognition. Oxford University Press, New York, NY, USA

Blake C., Wall J., 2002, Nature, 416, 150

Blandford R. D., Rees M. J., 1974, MNRAS, 169, 395

Blanton E. L., Clarke T. E., Sarazin C. L., Randall S. W., McNamara B. R., 2010, Proceedings of the National Academy of Science, 107, 7174

Bonaldi A., Braun R., 2018, Square Kilometre Array Science Data Challenge 1 (arXiv:1811.10454)

Bonaldi A., Bonato M., Galluzzi V., Harrison I., Massardi M., Kay S., De Zotti G., Brown M. L., 2018, MNRAS, 482, 2

Bonato M., et al., 2017, MNRAS, 469, 1912

Bongiorno A., et al., 2016, A&A, 588, A78

Boser B. E., Guyon I. M., Vapnik V. N., 1992, in Proceedings of the Fifth Annual Workshop on Computational Learning Theory. COLT '92. ACM, New York, NY, USA, pp 144-152, doi:10.1145/130385.130401

Bosma A., 1978, PhD thesis, Groningen Univ.

Bottou L., 1998, Online Learning and Stochastic Approximations

Boyle B. J., Terlevich R. J., 1998, MNRAS, 293, L49

Breiman L., 2001, Machine Learning, 45, 5

Breiman L., Last M., Rice J., 2003, Random Forests: finding quasars. pp 243-254

Brett D. R., West R. G., Wheatley P. J., 2004, MNRAS, 353, 369

Bridle A. H., Perley R. A., 1984, ARA&A, 22, 319

Brienza M., 2018, PhD thesis, University of Groningen

Brienza M., et al., 2016, A&A, 585, A29

Brienza M., et al., 2018, A&A, 618, A45

Britz D., 2019, Understanding Convolutional Neural Networks for NLP

Brown S., Rudnick L., 2011, MNRAS, 412, 2

Brüggen M., Bykov A., Ryu D., Röttgering H., 2012, Space Science Reviews, 166, 187

Brüggen M., et al., 2018, MNRAS, 477, 3461

Brunetti G., Jones T. W., 2014, International Journal of Modern Physics D, 23, 1430007

Buehlmann P., van de Geer S., 2011, Statistics for High-Dimensional Data: Methods, Theory and Applications, 1st edn. Springer Publishing Company, Incorporated

Burbidge G. R., 1956, ApJ, 124, 416

Burn B. J., 1966, MNRAS, 133, 67

Burns J. O., Rhee G., Owen F. N., Pinkney J., 1994, ApJ, 423, 94

Calzetti D., 2001, New Astronomy Reviews, 45, 601

Carliles S., Budavári T., Heinis S., Priebe C., Szalay A., 2008, in Argyle R. W., Bunclark P. S., Lewis J. R., eds, Astronomical Society of the Pacific Conference Series Vol. 394, Astronomical Data Analysis Software and Systems XVII. p. 521 (arXiv:0711.2477)

Carliles S., Budavári T., Heinis S., Priebe C., Szalay A. S., 2010, ApJ, 712, 511

Carrasco-Davis R., et al., 2019, PASP, 131, 108006

Carroll B. W., Ostlie D. A., 2006, An Introduction to Modern Astrophysics and Cosmology

Caruana R., Lawrence S., Giles L., 2000, in Proceedings of the 13th International Conference on Neural Information Processing Systems. NIPS'00. MIT Press, Cambridge, MA, USA, pp 381-387, http://dl.acm.org/citation.cfm?id=3008751.3008807

Cattaneo A., et al., 2009, Nature, 460, 213

Cauchy A.-L., 1847, Œuvres complètes, 1, 399

Clarke T., et al., 2014, arXiv e-prints, p. arXiv:1401.0329

Cohen M. H., Unwin S. C., 1984, in Fanti R., Kellermann K. I., Setti G., eds, IAU Symposium Vol. 110, VLBI and Compact Radio Sources. p. 95

Colafrancesco S., Marchegiani P., Emritte M. S., 2016, A&A, 595, A21

Condon J. J., Cotton W. D., Greisen E. W., Yin Q. F., Perley R. A., Taylor G. B., Broderick J. J., 1998, AJ, 115, 1693

Connolly A. J., Szalay A. S., Bershady M. A., Kinney A. L., Calzetti D., 1995, AJ, 110, 1071

Conselice C. J., 2012, arXiv e-prints, p. arXiv:1212.5641

Contopoulos I., Gabuzda D., Kylafis N., eds, 2015, The Formation and Disruption of Black Hole Jets, Astrophysics and Space Science Library Vol. 414, doi:10.1007/978-3-319-10356-3

Dabhade P., et al., 2019, arXiv e-prints, p. arXiv:1904.00409

Daniel, 2019, Intro to optimization in deep learning: Gradient Descent

Davis J., Goadrich M., 2006, in Proceedings of the 23rd International Conference on Machine Learning. ICML '06. ACM, New York, NY, USA, pp 233-240, doi:10.1145/1143844.1143874

Davis Jr. L., Greenstein J. L., 1951, ApJ, 114, 206

Delchambre L., 2016, MNRAS, 460, 2811

Deng L., Yu D., 2014, Foundations and Trends in Signal Processing, 7, 197

Deng J., Dong W., Socher R., Li L.-J., Li K., Fei-Fei L., 2009, in CVPR

Dickey J. M., Strasser S., Gaensler B. M., Haverkorn M., Kavars D., McClure-Griffiths N. M., Stil J., Taylor A. R., 2009, ApJ, 693, 1250

Dieleman S., et al., 2015a, Lasagne: First release, doi:10.5281/zenodo.27878

Dieleman S., Willett K. W., Dambre J., 2015b, MNRAS, 450, 1441

Domínguez Sánchez H., Huertas-Company M., Bernardi M., Tuccillo D., Fischer J. L., 2018, MNRAS, 476, 3661

Domínguez Sánchez H., et al., 2019, MNRAS, 484, 93

Du Buisson L., 2015, Master's thesis, University of Cape Town

Duchi J., Hazan E., Singer Y., 2011, J. Mach. Learn. Res., 12, 2121

Eckert D., et al., 2015, Nature, 528, 105

Ekers R. D., 2010, arXiv e-prints, p. arXiv:1004.4279

Ellingson S. W., 2011, IEEE Transactions on Antennas and Propagation, 59, 1855

Evans D. A., Worrall D. M., Hardcastle M. J., Kraft R. P., Birkinshaw M., 2006, ApJ, 642, 96

Ewen H. I., Purcell E. M., 1951, Nature, 168, 356

Fabian A. C., 1999, Proceedings of the National Academy of Sciences, 96, 4749

Fabian A. C., 2012, ARA&A, 50, 455

Faisst A. L., Prakash A., Capak P. L., Lee B., 2019, ApJ, 881, L9

Fanaroff B. L., Riley J. M., 1974, MNRAS, 167, 31P

Fang Y., Cui Y., Ao X., 2019, Advances in Astronomy, 2019, 9196234

Farabet C., Couprie C., Najman L., LeCun Y., 2013, IEEE Trans. Pattern Anal. Mach. Intell., 35, 1915

Fawcett T., 2006, Pattern Recognition Letters, 27, 861

Feenberg E., Primakoff H., 1948, Phys. Rev., 73, 449

Feigelson E. D., Laurent-Muehleisen S. A., Kollgaard R. I., Fomalont E. B., 1995, ApJ, 449, L149

Feretti L., 2003

Ferrarese L., Celotti A., 2002, What Causes the FRI - FRII Dichotomy?, NOAO Proposal

Field G., Chaisson E., 1985, The Invisible Universe: Probing the Frontiers of Astrophysics. Birkhäuser, Boston

Firth A. E., Lahav O., Somerville R. S., 2003, MNRAS, 339, 1195

Gaensler B. M., Beck R., Feretti L., 2004, New Astronomy Reviews, 48, 1003

Galvin T. J., et al., 2019, arXiv e-prints, p. arXiv:1904.02876

Gao D., Zhang Y.-X., Zhao Y.-H., 2009, Research in Astronomy and Astrophysics, 9, 220

Geach J. E., 2012, MNRAS, 419, 2633

George D., Huerta E., 2018, Physics Letters B, 778, 64

Gheller C., Vazza F., Bonafede A., 2018, MNRAS, 480, 3749

Giovannini G., Feretti L., Gregorini L., Parma P., 1988, A&A

Girshick R., Donahue J., Darrell T., Malik J., 2013, arXiv e-prints, p. arXiv:1311.2524

Gitti M., Brighenti F., McNamara B. R., 2012, Advances in Astronomy, 2012, 950641

Glorot X., Bengio Y., 2010, in Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS'10). Society for Artificial Intelligence and Statistics

Glorot X., Bordes A., Bengio Y., 2011, in Gordon G., Dunson D., Dudík M., eds, Proceedings of Machine Learning Research Vol. 15, Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics. PMLR, Fort Lauderdale, FL, USA, pp 315-323, http://proceedings.mlr.press/v15/glorot11a.html

Goodfellow I. J., Shlens J., Szegedy C., 2014, arXiv e-prints, p. arXiv:1412.6572

Goodfellow I., Bengio Y., Courville A., 2016, Deep Learning. MIT Press

Gopal-Krishna, Wiita P. J., 2000, A&A

Graff P., Feroz F., Hobson M. P., Lasenby A., 2014, MNRAS, 441, 1741

Hale C. L., Robotham A. S. G., Davies L. J. M., Jarvis M. J., Driver S., Heywood I., 2019, MNRAS, 487, 3971

Hardcastle M. J., 2018, MNRAS, 475, 2768

Hardcastle M. J., et al., 2019, A&A, 622, A12

Hargrave P. J., Ryle M., 1976, MNRAS, 175, 481

Hartley P., Flamary R., Jackson N., Tagore A. S., Metcalf R. B., 2017, MNRAS, 471, 3378

Hastie T., Tibshirani R., Friedman J., 2009, The Elements of Statistical Learning: Data Mining, Inference and Prediction, 2nd edn. Springer, http://www-stat.stanford.edu/~tibs/ElemStatLearn/

He K., Zhang X., Ren S., Sun J., 2015a, in The IEEE International Conference on Computer Vision (ICCV)

He K., Zhang X., Ren S., Sun J., 2015b, arXiv e-prints, p. arXiv:1512.03385

Heckman T. M., Best P. N., 2014, ARA&A, 52, 589

Helou G., Soifer B. T., Rowan-Robinson M., 1985, ApJ, 298, L7

Hezaveh Y. D., Levasseur L. P., Marshall P. J., 2017, Nature, 548, 555

Hill G. J., Lilly S. J., 1991, ApJ, 367, 1

Hine R. G., Longair M. S., 1979, MNRAS, 188, 111

Hoang T., Lazarian A., 2008, MNRAS, 388, 117

Hochreiter S., 1991, Untersuchungen zu dynamischen neuronalen Netzen. Diploma thesis, Institut für Informatik, Lehrstuhl Prof. Brauer, Technische Universität München

Hochreiter S., 1998, Int. J. Uncertain. Fuzziness Knowl.-Based Syst., 6, 107

Hocking A., Geach J. E., Sun Y., Davey N., 2018, MNRAS, 473, 1108

Hong K., 2019, Artificial Neural Network (ANN)

Hopkins A. M., Miller C. J., Connolly A. J., Genovese C., Nichol R. C., Wasserman L., 2002, AJ, 123, 1086

Hopkins A. M., et al., 2015, PASA, 32, e037

Hossin M., Sulaiman M. N., 2015, International Journal of Data Mining & Knowledge Management Process, 5, 01

Huertas-Company M., Rouan D., Tasca L., Soucail G., Le Fèvre O., 2008, A&A, 478, 971

Hui J., Aragon M., Cui X., Flegal J. M., 2018, MNRAS, 475, 4494

Ishwara-Chandra C. H., Saikia D. J., 1999, MNRAS, 309, 100

Isola P., Zhu J.-Y., Zhou T., Efros A. A., 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp 5967-5976

Iwasaki H., Ichinohe Y., Uchiyama Y., 2019, MNRAS, 488, 4106

Jackson N., et al., 2016, A&A, 595, A86

Jacobs C., Glazebrook K., Collett T., More A., McCarthy C., 2017, MNRAS, 471, 167

Jaroszynski M., Abramowicz M. A., Paczynski B., 1980, Acta Astronomica, 30, 1

Jarvis M., et al., 2016, in Proceedings of MeerKAT Science: On the Pathway to the SKA, 25-27 May. p. 6 (arXiv:1709.01901)

Johnson J. B., 1928, Phys. Rev., 32, 97

Jolliffe I., Cadima J., 2016, Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, 374, 20150202

Kaastra J. S., et al., 2008, Space Science Reviews, 134, 1

Kang H., 2018, Nuclear and Particle Physics Proceedings, 297-299, 259

Kapińska A. D., et al., 2017, AJ, 154, 253

Karpathy A., 2016, CS231n: Convolutional Neural Networks for Visual Recognition

Katebi R., Zhou Y., Chornock R., Bunescu R., 2019, MNRAS, 486, 1539

Kawakatu N., Kino M., Nagai H., 2009, ApJ, 697, L173

Keel W., 2007, Active galactic nuclei in the early universe. Springer Berlin Heidelberg, Berlin, Heidelberg, pp 159-175, doi:10.1007/978-3-540-72535-0_8

Kellermann K. I., Owen F. N., 1988, Radio galaxies and quasars. pp 563-602

Kellermann K. I., Pauliny-Toth I. I. K., 1966, ApJ, 145, 954

Kellermann K. I., Verschuur G. L., 1988, Galactic and Extragalactic Radio Astronomy (2nd edition)

Kembhavi A. K., Narlikar J. V., 1999, Quasars and Active Galactic Nuclei: An Introduction

Kennicutt R. C., Evans N. J., 2012, ARA&A, 50, 531

Kereš D., Katz N., Weinberg D. H., Davé R., 2005, MNRAS, 363, 2

Kerrigan J., et al., 2019, MNRAS, 488, 2605

Khan A., Huerta E., Wang S., Gruendl R., Jennings E., Zheng H., 2019, Physics Letters B, 795, 248

Khandelwal R., 2018, Deep Learning Autoencoders

Kimball A. E., Ivezić Ž., 2008, AJ, 136, 684

Kingma D. P., Ba J., 2014, arXiv e-prints, p. arXiv:1412.6980

Kohonen T., 1989, Self-Organization and Associative Memory, 3rd edn. Springer-Verlag, Berlin, Heidelberg

Kormendy J., Ho L. C., 2013, ARA&A, 51, 511

Kotsiantis S. B., 2013, Artificial Intelligence Review, 39, 261

Krizhevsky A., Sutskever I., Hinton G. E., 2012, in Pereira F., Burges C. J. C., Bottou L., Weinberger K. Q., eds, Advances in Neural Information Processing Systems 25. Curran Associates, Inc., pp 1097-1105, http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf

Laing R. A., Bridle A. H., 2002, MNRAS, 336, 1161

Lauberts A., Valentijn E. A., 1989, The Surface Photometry Catalogue of the ESO-Uppsala Galaxies

LeCun Y., Boser B., Denker J. S., Henderson D., Howard R. E., Hubbard W., Jackel L. D., 1989, Neural Comput., 1, 541

LeCun Y., Bottou L., Orr G., Müller K., 2012, Efficient BackProp. pp 9-48, doi:10.1007/978-3-642-35289-8-3

LeCun Y., Bengio Y., Hinton G., 2015, Nature, 521, 436

LeCun Y., Bottou L., Bengio Y., Haffner P., 1998, Proceedings of the IEEE, 86, 2278

Ledlow M. J., Owen F. N., 1996, AJ, 112, 9

Lee C.-Y., Gallagher P. W., Tu Z., 2016, in Gretton A., Robert C. C., eds, Proceedings of Machine Learning Research Vol. 51, Proceedings of the 19th International Conference on Artificial Intelligence and Statistics. PMLR, Cadiz, Spain, pp 464-472, http://proceedings.mlr.press/v51/lee16a.html

Lelli F., et al., 2015, A&A

Liang M., Hu X., 2015, in The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Libbrecht M. W., Noble W. S., 2015, Nat Rev Genet, 16, 321

Liou C.-Y., Huang J.-C., Yang W.-C., 2008, Neurocomputing, 71, 3150

Liu R. H., et al., 2019, MNRAS, 489, 1770

Lloyd S. P., 1982, IEEE Transactions on Information Theory, 28, 129

Locatelli N., Vazza F., Domínguez-Fernández P., 2018, Galaxies, 6, 128

Longair M. S., Riley J. M., 1979, MNRAS, 188, 625

Lucas J., Calef B., Kyono T., 2018, in The Advanced Maui Optical and Space Surveillance Technologies Conference. p. 51

Lukic V., Brüggen M., Banfield J. K., Wong O. I., Rudnick L., Norris R. P., Simmons B., 2018, MNRAS, 476, 246

Lukic V., Brüggen M., Mingo B., Croston J. H., Kasieczka G., Best P. N., 2019, MNRAS, 487, 1729

Mack K.-H., Klein U., O'Dea C. P., Willis A. G., Saripalli L., 1998, A&A, 329, 431

Madau P., Dickinson M., 2014, ARA&A, 52, 415

Madau P., Pozzetti L., Dickinson M., 1998, ApJ, 498, 106

Mahatma V. H., et al., 2018, MNRAS, 475, 4557

Mancuso C., Lapi A., Cai Z. Y., Negrello M., De Zotti G., Perrotta F., Danese L., 2015, in Advancing Astrophysics with the Square Kilometre Array (AASKA14). p. 82 (arXiv:1412.5827)

Mannor S., Peleg D., Rubinstein R., 2005, in Proceedings of the 22nd International Conference on Machine Learning. ICML '05. ACM, New York, NY, USA, pp 561-568, doi:10.1145/1102351.1102422

Mao X., Shen C., Yang Y.-B., 2016, in Lee D. D., Sugiyama M., Luxburg U. V., Guyon I., Garnett R., eds, Advances in Neural Information Processing Systems 29. Curran Associates, Inc., pp 2802-2810

Maragkoudakis A., Zezas A., Ashby M. L. N., Willner S. P., 2016, MNRAS, 466, 1192

Marconi A., Risaliti G., Gilli R., Hunt L. K., Maiolino R., Salvati M., 2004, MNRAS, 351, 169

Masias M., Freixenet J., Lladó X., Peracaula M., 2012, MNRAS, 422, 1674

Massardi M., Bonaldi A., Negrello M., Ricciardi S., Raccanelli A., de Zotti G., 2010, MNRAS, 404, 532

McCulloch W. S., Pitts W., 1943, The Bulletin of Mathematical Biophysics, 5, 115

McMullin J. P., Waters B., Schiebel D., Young W., Golap K., 2007, in Shaw R. A., Hill F., Bell D. J., eds, Astronomical Society of the Pacific Conference Series Vol. 376, Astronomical Data Analysis Software and Systems XVI. p. 127

Mehdipour M., Costantini E., 2019, A&A, 625, A25

Meliani Z., Keppens R., 2009, ApJ, 705, 1594

Miljković O., 2009

Miljkovic D., 2017, doi:10.23919/MIPRO.2017.7973581

Miraghaei H., Best P. N., 2017, MNRAS, 466, 4346

Mitton S., Ryle M., 1969, MNRAS, 146, 221

Mohan N., Rafferty D., 2015a, PyBDSF: Python Blob Detection and Source Finder, Astrophysics Source Code Library (ascl:1502.007)

Mohan N., Rafferty D., 2015b, PyBDSF: Python Blob Detection and Source Finder, Astrophysics Source Code Library (ascl:1502.007)

Montavon G., Orr G., Müller K., 2012, Neural Networks: Tricks of the Trade. Lecture Notes in Computer Science Vol. 7700, Springer Verlag

Montavon G., Samek W., Müller K.-R., 2017, arXiv e-prints, p. arXiv:1706.07979

Mortlock D. J., et al., 2011, Nature, 474, 616

Murtagh F., Contreras P., 2012, Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 2, 86

Nair V., Hinton G. E., 2010, in Proceedings of the 27th International Conference on International Conference on Machine Learning. ICML'10. Omnipress, USA, pp 807-814, http://dl.acm.org/citation.cfm?id=3104322.3104425

Nalepa J., Kawulok M., 2019, Artificial Intelligence Review, 52, 857

Ng A. Y., 2004, in Proceedings of the Twenty-first International Conference on Machine Learning. ICML '04. ACM, New York, NY, USA, p. 78, doi:10.1145/1015330.1015435

Ngai E., Hu Y., Wong Y., Chen Y., Sun X., 2011, Decision Support Systems, 50, 559

Nicastro F., 2016, in XMM-Newton: The Next Decade. p. 27 (arXiv:1611.03722)

Nicastro F., et al., 2018, Nature, 558, 406

Nielsen M., 2015, Neural Networks and Deep Learning. Determination Press

Norris R. P., 2016, Proceedings of the International Astronomical Union, 12, 103-113

Norris R. P., 2017a, Nature Astronomy, 1, 671

Norris R. P., 2017b, PASA, 34, e007

Norris R. P., the EMU team, 2009

Norris R. P., et al., 2006, AJ, 132, 2409

Norris R. P., et al., 2011, PASA, 28, 215

Ntampaka M., et al., 2019, ApJ, 876, 82

Padovani P., 2016, The Astronomy and Astrophysics Review, 24, 13

Padovani P., 2017, Nature Astronomy, 1, 0194

Padovani P., Bonzini M., Miller N., Kellermann K. I., Mainieri V., Rosati P., Tozzi P., Vattakunnel S., 2014, in Mickaelian A. M., Sanders D. B., eds, IAU Symposium Vol. 304, Multiwavelength AGN Surveys and Studies. pp 79-85 (arXiv:1401.1342), doi:10.1017/S1743921314003391

Parmar A., Katariya R., Patel V., 2019, in Hemanth J., Fernando X., Lafata P., Baig Z., eds, International Conference on Intelligent Data Communication Technologies and Internet of Things (ICICI) 2018. Springer International Publishing, Cham, pp 758-763

Partridge B., 2011, in Carignan C., Combes F., Freeman K. C., eds, IAU Symposium Vol. 277, Tracing the Ancestry of Galaxies. pp 75-78, doi:10.1017/S1743921311022502

Pearson K., 1901, The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science, 2, 559

Pearson W. J., Wang L., Trayford J. W., Petrillo C. E., van der Tak F. F. S., 2019, A&A, 626, A49

Peterson B. M., 1997, An Introduction to Active Galactic Nuclei

Peterson B. M., Wandel A., 2000, ApJ, 540, L13

Petrosian V., Bykov A. M., 2008, Space Science Reviews, 134, 207

Polsterer K., Gieseke F., Igel C., Doser B., Gianniotis N., 2016, in Proceedings of the 24th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN 2016). i6doc.com, pp 405-410

Prandoni I., Seymour N., 2015, in Advancing Astrophysics with the Square Kilometre Array (AASKA14). p. 67 (arXiv:1412.6512)

Pratt L. Y., Mostow J., Kamm C. A., 1991, in Proceedings of the Ninth National Conference on Artificial Intelligence - Volume 2. AAAI'91. AAAI Press, pp 584-589, http://dl.acm.org/citation.cfm?id=1865756.1865767

Pritchard J. R., Loeb A., 2012, Reports on Progress in Physics, 75, 086901

Quattoni A., Collins M., Darrell T., 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition, pp 1-8

Raccanelli A., et al., 2012, MNRAS, 424, 801

Radhakrishnan V., 1999, in Taylor G. B., Carilli C. L., Perley R. A., eds, Astronomical Society of the Pacific Conference Series Vol. 180, Synthesis Imaging in Radio Astronomy II. p. 671

Rafferty D., Mohan N., 2016, PyBDSF Documentation

Ralph N. O., et al., 2019, arXiv e-prints, p. arXiv:1906.02864

Rees M. J., 1967, MNRAS, 137, 429

Rees M. J., 1982, in Heeschen D. S., Wade C. M., eds, IAU Symposium Vol. 97, Extragalactic Radio Sources. pp 211-221

Rees M. J., Phinney E. S., Begelman M. C., Blandford R. D., 1982, Nature, 295, 17

Regier J., McAuliffe J., Prabhat M., 2015

Rhee J., Lah P., Briggs F. H., Chengalur J. N., Colless M., Willner S. P., Ashby M. L. N., Le Fèvre O., 2018, MNRAS, 473, 1879

Robotham A., Davies L., Driver S., Koushan S., Taranu D., Casura S., Liske J., 2018, ProFound: Source Extraction and Application to Modern Survey Data. https://github.com/asgr/ProFound

Rogstad D. H., Ekers R. D., 1969, ApJ, 157, 481

Rogstad D. H., Shostak G. S., 1972, ApJ, 176, 315

Rokach L., Maimon O., 2005, Clustering Methods. Springer US, Boston, MA, pp 321– 352, http://dx.doi.org/10.1007/0-387-25465-X_15 doi:10.1007/0-387-25465-X_15, https: //doi.org/10.1007/0-387-25465-X_15

Romano R. A., Aragon C. R., Ding C., 2006, in Proceedings of the 5th Interna- tional Conference on Machine Learning and Applications. ICMLA ’06. IEEE Computer Society, Washington, DC, USA, pp 77–82, http://dx.doi.org/10.1109/ICMLA.2006.49 doi:10.1109/ICMLA.2006.49, http://dx.doi.org/10.1109/ICMLA.2006.49

Ronen S., Aragón-Salamanca A., Lahav O., 1999, http://dx.doi.org/10.1046/j.1365- 8711.1999.02222.x Monthly Notices of the Royal Astronomical Society, 303, 284

Rumelhart D. E., Hinton G. E., Williams R. J., 1986, http://dx.doi.org/10.1038/323533a0 , https://ui.adsabs.harvard.edu/abs/1986Natur.323..533R 323, 533

Russakovsky O., et al., 2014, arXiv e-prints, https://ui.adsabs.harvard.edu/abs/2014arXiv1409.0575R p. arXiv:1409.0575

Russell S., Norvig P., 2009, Artificial Intelligence: A Modern Approach, 3rd edn. Prentice Hall Press, Upper Saddle River, NJ, USA

Offner S. S. R., Liu Y., 2018, Nature Astronomy, 2, 896–900

Sabater J., et al., 2019, http://dx.doi.org/10.1051/0004-6361/201833883 , https://ui.adsabs.harvard.edu/abs/2019AA...622A..17S 622, A17

Sabour S., Frosst N., Hinton G. E., 2017, in Proceedings of the 31st International Conference on Neural Information Processing Systems. NIPS’17. Curran Associates Inc., USA, pp 3859–3869, http://dl.acm.org/citation.cfm?id=3294996.3295142

Sadeh I., Abdalla F. B., Lahav O., 2016, http://dx.doi.org/10.1088/1538-3873/128/968/104502 , https://ui.adsabs.harvard.edu/abs/2016PASP..128j4502S 128, 104502

Sadler E. M., Jenkins C. R., Kotanyi C. G., 1989, http://dx.doi.org/10.1093/mnras/240.3.591 , https://ui.adsabs.harvard.edu/abs/1989MNRAS.240..591S 240, 591

Saikia D. J., Jamrozy M., 2009, Bulletin of the Astronomical Society of India, https://ui.adsabs.harvard.edu/abs/2009BASI...37...63S 37, 63

Saito T., Rehmsmeier M., 2015, http://dx.doi.org/10.1371/journal.pone.0118432 PLOS ONE, 10, 1

Sarazin C. L., 2002, in Feretti L., Gioia I. M., Giovannini G., eds, Astrophysics and Space Science Library Vol. 272, Merging Processes in Galaxy Clusters. pp 1–38 (http://arxiv.org/abs/astro-ph/0105418 arXiv:astro-ph/0105418), http://dx.doi.org/10.1007/0-306-48096-4_1 doi:10.1007/0-306-48096-4_1

Saripalli L., 2012, http://dx.doi.org/10.1088/0004-6256/144/3/85 The Astronomical Journal, 144, 85

Sarro L. M., Sánchez-Fernández C., Giménez Á., 2006, http://dx.doi.org/10.1051/0004-6361:20052830 , https://ui.adsabs.harvard.edu/abs/2006A

Savage R. S., Oliver S., 2007, http://dx.doi.org/10.1086/515393 The Astrophysical Journal, 661, 1339

Scheuer P. A. G., Williams P. J. S., 1968, http://dx.doi.org/10.1146/annurev.aa.06.090168.001541 Annual Review of Astronomy and Astrophysics, 6, 321

Schmidhuber J., 2015, http://dx.doi.org/10.1016/j.neunet.2014.09.003 Neural Networks, 61, 85

Schmidt M., 1963, http://dx.doi.org/10.1038/1971040a0 , https://ui.adsabs.harvard.edu/abs/1963Natur.197.1040S 197, 1040

Schoenmakers A. P., de Bruyn A. G., Röttgering H. J. A., van der Laan H., Kaiser C. R., 2000, http://dx.doi.org/10.1046/j.1365-8711.2000.03430.x Monthly Notices of the Royal Astronomical Society, 315, 371

Sérsic J. L., 1963, Boletin de la Asociacion Argentina de Astronomia La Plata Argentina, https://ui.adsabs.harvard.edu/abs/1963BAAA....6...41S 6, 41

Shabala S. S., Ash S., Alexander P., Riley J. M., 2008, http://dx.doi.org/10.1111/j.1365-2966.2008.13459.x , https://ui.adsabs.harvard.edu/abs/2008MNRAS.388..625S 388, 625

Shakeshaft J. R., Ryle M., Baldwin J. E., Elsmore B., Thomson J. H., 1955, https://ui.adsabs.harvard.edu/abs/1955MmRAS..67..106S 67, 106

Shankar F., Weinberg D. H., Miralda-Escudé J., 2009, http://dx.doi.org/10.1088/0004-637X/690/1/20 , https://ui.adsabs.harvard.edu/abs/2009ApJ...690...20S 690, 20

Shen H., George D., Huerta E. A., Zhao Z., 2017, arXiv e-prints, https://ui.adsabs.harvard.edu/abs/2017arXiv171109919S p. arXiv:1711.09919

Shimwell T. W., et al., 2017, http://dx.doi.org/10.1051/0004-6361/201629313 A&A, 598, A104

Shimwell T. W., et al., 2019, http://dx.doi.org/10.1051/0004-6361/201833559 A&A, 622, A1

Shirasaki M., Yoshida N., Ikeda S., 2019, http://dx.doi.org/10.1103/PhysRevD.100.043527 , https://ui.adsabs.harvard.edu/abs/2019PhRvD.100d3527S 100, 043527

Silk J., 2011, in Carignan C., Combes F., Freeman K. C., eds, IAU Symposium Vol. 277, Tracing the Ancestry of Galaxies. pp 273–281 (http://arxiv.org/abs/1102.0283 arXiv:1102.0283), http://dx.doi.org/10.1017/S1743921311022939 doi:10.1017/S1743921311022939

Simonyan K., Zisserman A., 2014, CoRR, abs/1409.1556

Simpson C., 2017, http://dx.doi.org/10.1098/rsos.170522 Royal Society Open Science, 4, 170522

Snyder G. F., Rodriguez-Gomez V., Lotz J. M., Torrey P., Quirk A. C. N., Hernquist L., Vogelsberger M., Freeman P. E., 2019, http://dx.doi.org/10.1093/mnras/stz1059 , https://ui.adsabs.harvard.edu/abs/2019MNRAS.486.3702S 486, 3702

Srivastava N., Hinton G., Krizhevsky A., Sutskever I., Salakhutdinov R., 2014, Journal of Machine Learning Research, 15, 1929

Stehman S. V., 1997, http://dx.doi.org/10.1016/S0034-4257(97)00083-7 Remote Sensing of Environment, 62, 77

Storm E., Jeltema T. E., Profumo S., Rudnick L., 2013, http://dx.doi.org/10.1088/0004-637X/768/2/106 , https://ui.adsabs.harvard.edu/abs/2013ApJ...768..106S 768, 106

Storrie-Lombardi M. C., Lahav O., Sodre Jr. L., Storrie-Lombardi L. J., 1992, http://dx.doi.org/10.1093/mnras/259.1.8P , https://ui.adsabs.harvard.edu/abs/1992MNRAS.259P...8S 259, 8P

Strateva I., et al., 2001, http://dx.doi.org/10.1086/323301 , https://ui.adsabs.harvard.edu/abs/2001AJ....122.1861S 122, 1861

Subrahmanyan R., Saripalli L., Hunstead R. W., 1996, http://dx.doi.org/10.1093/mnras/279.1.257 Monthly Notices of the Royal Astronomical Society, 279, 257

Sutherland W., Saunders W., 1992, http://dx.doi.org/10.1093/mnras/259.3.413 , https://ui.adsabs.harvard.edu/abs/1992MNRAS.259..413S 259, 413

Sutskever I., 2013, PhD thesis, University of Toronto, Toronto, Ont., Canada

Szegedy C., Zaremba W., Sutskever I., Bruna J., Erhan D., Goodfellow I., Fergus R., 2013, arXiv e-prints, https://ui.adsabs.harvard.edu/abs/2013arXiv1312.6199S p. arXiv:1312.6199

Szegedy C., Ioffe S., Vanhoucke V., Alemi A., 2016, arXiv e-prints, https://ui.adsabs.harvard.edu/abs/2016arXiv160207261S p. arXiv:1602.07261

Tabatabaei F., 2017, Radio astronomers tune in to the star formation channel

Tabatabaei F. S., et al., 2017, http://dx.doi.org/10.3847/1538-4357/836/2/185 , https://ui.adsabs.harvard.edu/abs/2017ApJ...836..185T 836, 185

Tachibana Y., Miller A. A., 2018, http://dx.doi.org/10.1088/1538-3873/aae3d9 , https://ui.adsabs.harvard.edu/abs/2018PASP..130l8001T 130, 128001

Tagliaferri R., et al., 2003, http://dx.doi.org/10.1016/S0893-6080(03)00028-5 Neural Networks, 16, 297

Tan C., Sun F., Kong T., Zhang W., Yang C., Liu C., 2018, arXiv e-prints, https://ui.adsabs.harvard.edu/abs/2018arXiv180801974T p. arXiv:1808.01974

Tang H., Scaife A. M. M., Leahy J. P., 2019

Tikhonov A. N., Arsenin V. Y., 1977, Solutions of ill-posed problems. V. H. Winston & Sons, Washington, D.C.: John Wiley & Sons, New York

Tishby N., Pereira F. C., Bialek W., 1999, The information bottleneck method. pp 368–377

Torniainen I., et al., 2008, http://dx.doi.org/10.1051/0004-6361:20079222 , https://ui.adsabs.harvard.edu/abs/2008A

Troland T. H., Heiles C., 1982, http://dx.doi.org/10.1086/159544 , https://ui.adsabs.harvard.edu/abs/1982ApJ...252..179T 252, 179

Trombetti T., Burigana C., 2018, http://dx.doi.org/10.3389/fspas.2018.00033 Frontiers in Astronomy and Space Sciences, 5, 33

Vafaei Sadr A., Vos E. E., Bassett B. A., Hosenie Z., Oozeer N., Lochner M., 2019, http://dx.doi.org/10.1093/mnras/stz131 Monthly Notices of the Royal Astronomical Society, 484, 2793

Van Oort C. M., Xu D., Offner S. S. R., Gutermuth R. A., 2019, http://dx.doi.org/10.3847/1538-4357/ab275e , https://ui.adsabs.harvard.edu/abs/2019ApJ...880...83V 880, 83

Vasconcellos E. C., de Carvalho R. R., Gal R. R., LaBarbera F. L., Capelato H. V., Velho H. F. C., Trevisan M., Ruiz R. S. R., 2011, http://dx.doi.org/10.1088/0004-6256/141/6/189 The Astronomical Journal, 141, 189

Vincent P., Larochelle H., Bengio Y., Manzagol P.-A., 2008, in Proceedings of the 25th International Conference on Machine Learning. ICML '08. ACM, New York, NY, USA, pp 1096–1103, http://dx.doi.org/10.1145/1390156.1390294 doi:10.1145/1390156.1390294

Vluymans S., 2018, PhD thesis, Ghent University

Shimwell T. W., et al., 2016

Wadadekar Y., 2005, Publications of the Astronomical Society of the Pacific, pp 79–85

Wald A., 1949, http://dx.doi.org/10.1214/aoms/1177730030 Ann. Math. Statist., 20, 165

Weaver W. B., Torres-Dodgen A. V., 1997, http://dx.doi.org/10.1086/304651 The Astrophysical Journal, 487, 847

Webber W. R., Simpson G. A., Cane H. V., 1980, http://dx.doi.org/10.1086/157761 , https://ui.adsabs.harvard.edu/abs/1980ApJ...236..448W 236, 448

Weir N., Fayyad U. M., Djorgovski S., 1995, http://dx.doi.org/10.1086/117459 , https://ui.adsabs.harvard.edu/abs/1995AJ....109.2401W 109, 2401

White S. D. M., Rees M. J., 1978, http://dx.doi.org/10.1093/mnras/183.3.341 , https://ui.adsabs.harvard.edu/abs/1978MNRAS.183..341W 183, 341

White R. L., et al., 2000, http://dx.doi.org/10.1086/313300 , https://ui.adsabs.harvard.edu/abs/2000ApJS..126..133W 126, 133

Willett K., 2015, in The Many Facets of Extragalactic Radio Surveys: Towards New Scientific Challenges. p. 8

Willett K. W., et al., 2013, http://dx.doi.org/10.1093/mnras/stt1458 , https://ui.adsabs.harvard.edu/abs/2013MNRAS.435.2835W 435, 2835

Williams W. L., et al., 2019, http://dx.doi.org/10.1051/0004-6361/201833564 A&A, 622, A2

Willis A. G., Strom R. G., Wilson A. S., 1974, http://dx.doi.org/10.1038/250625a0 , https://ui.adsabs.harvard.edu/abs/1974Natur.250..625W 250, 625

Wong S. C., Gatt A., Stamatescu V., McDonnell M. D., 2016, 2016 International Conference on Digital Image Computing: Techniques and Applications (DICTA), pp 1–6

Wright D. E., et al., 2015, http://dx.doi.org/10.1093/mnras/stv292 , https://ui.adsabs.harvard.edu/abs/2015MNRAS.449..451W 449, 451

Wrobel J. M., Walker R. C., 1999, in Taylor G. B., Carilli C. L., Perley R. A., eds, Astronomical Society of the Pacific Conference Series Vol. 180, Synthesis Imaging in Radio Astronomy II. p. 171

Wu J. F., Boada S., 2019, http://dx.doi.org/10.1093/mnras/stz333 , https://ui.adsabs.harvard.edu/abs/2019MNRAS.484.4683W 484, 4683

Wu C., et al., 2018, http://dx.doi.org/10.1093/mnras/sty2646 Monthly Notices of the Royal Astronomical Society, 482, 1211

Xi E., Bing S., Jin Y., 2017, arXiv e-prints, https://ui.adsabs.harvard.edu/abs/2017arXiv171203480X p. arXiv:1712.03480

Yatawatta S., 2008

Yip C. W., et al., 2004, http://dx.doi.org/10.1086/422429 , https://ui.adsabs.harvard.edu/abs/2004AJ....128..585Y 128, 585

Yosinski J., Clune J., Bengio Y., Lipson H., 2014, arXiv e-prints, https://ui.adsabs.harvard.edu/abs/2014arXiv1411.1792Y p. arXiv:1411.1792

Zahavy T., Kang B., Sivak A., Feng J., Xu H., Mannor S., 2016, arXiv e-prints, https://ui.adsabs.harvard.edu/abs/2016arXiv160202389Z p. arXiv:1602.02389

Zeiler M. D., 2012, arXiv e-prints, https://ui.adsabs.harvard.edu/abs/2012arXiv1212.5701Z p. arXiv:1212.5701

Zeiler M. D., Fergus R., 2013, arXiv e-prints, https://ui.adsabs.harvard.edu/abs/2013arXiv1311.2901Z p. arXiv:1311.2901

Zevin M., et al., 2017, http://dx.doi.org/10.1088/1361-6382/aa5cea Classical and Quantum Gravity, 34, 064003

Zhang Y., Zhao Y., 2014, in Manset N., Forshay P., eds, Astronomical Society of the Pacific Conference Series Vol. 485, Astronomical Data Analysis Software and Systems XXIII. p. 239

Zheng C., Pulido J., Thorman P., Hamann B., 2015, http://dx.doi.org/10.1093/mnras/stv1237 Monthly Notices of the Royal Astronomical Society, 451, 4445

Zhu X.-P., Dai J.-M., Bian C.-J., Chen Y., Chen S., Hu C., 2019, http://dx.doi.org/10.1007/s10509-019-3540-1 , https://ui.adsabs.harvard.edu/abs/2019Ap

de Gasperin F., Intema H. T., Frail D. A., 2018, http://dx.doi.org/10.1093/mnras/stx3125 , https://ui.adsabs.harvard.edu/abs/2018MNRAS.474.5008D 474, 5008

van de Hulst H. C., 1945, Ned. Tijd. Natuurkunde, 11, 210

de Zotti G., Ricci R., Mesa D., Silva L., Mazzotta P., Toffolatti L., González-Nuevo J., 2005, http://dx.doi.org/10.1051/0004-6361:20042108 , https://ui.adsabs.harvard.edu/abs/2005AA...431..893D 431, 893

van Haarlem M. P., et al., 2013, http://dx.doi.org/10.1051/0004-6361/201220873 A&A, 556, A2

van Velzen S., Falcke H., Schellart P., Nierstenhöfer N., Kampert K.-H., 2012, http://dx.doi.org/10.1051/0004-6361/201219389 A&A, 544, A18

van Weeren R. J., de Gasperin F., Akamatsu H., Brüggen M., Feretti L., Kang H., Stroe A., Zandanel F., 2019, http://dx.doi.org/10.1007/s11214-019-0584-z Space Science Reviews, 215, 16