Applying Convolutional Neural Networks to Classify Fast Radio Bursts Detected by The CHIME Telescope

by

Prateek Yadav

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF

Master of Science

in

THE FACULTY OF GRADUATE AND POSTDOCTORAL STUDIES (Physics)

The University of British Columbia (Vancouver)

April 2020

© Prateek Yadav, 2020

The following individuals certify that they have read, and recommend to the Faculty of Graduate and Postdoctoral Studies for acceptance, the thesis entitled:

Applying Convolutional Neural Networks to Classify Fast Radio Bursts Detected by The CHIME Telescope

submitted by Prateek Yadav in partial fulfillment of the requirements for the degree of Master of Science in Physics.

Examining Committee:

Dr. Ingrid H. Stairs, Astronomy
Supervisor

Dr. Gary F. Hinshaw, Astronomy
Supervisory Committee Member

Abstract

The Canadian Hydrogen Intensity Mapping Experiment (CHIME) is a novel radio telescope that is predicted to detect up to several dozen Fast Radio Bursts (FRBs) per day. However, CHIME’s FRB detection software pipeline is susceptible to a large number of false positive triggers from terrestrial sources of Radio Frequency Interference (RFI). This thesis describes intensityML, a software pipeline designed to generate waterfall plots and automatically classify radio bursts detected by CHIME without explicit RFI-masking and DM-refinement. The pipeline uses a convolutional neural network based classifier trained exclusively on the events detected by CHIME, and the classifier has an accuracy, precision and recall of over 99%. It has also successfully discovered several FRBs, both in real-time and from archival data. The ideas presented in this thesis may play a key role in designing future machine-learning models for FRB classification.

Lay Summary

Fast radio bursts are bright bursts of radio waves that last a few milliseconds.
These radio bursts originate from outside of our galaxy, but their exact origins remain a mystery. CHIME is a novel radio telescope which aims to detect a large number of these radio bursts in order to better understand their origins. However, radio telescopes are extremely susceptible to picking up radio signals from terrestrial sources, such as airplanes and mobile phones. This thesis presents an automated classifier, which can look at the radio bursts detected by CHIME and tell whether they had a terrestrial or an astrophysical origin.

Preface

The work presented in this thesis was primarily done by me, with contributions from various members of The CHIME/FRB collaboration. All members of the collaboration have contributed to this thesis in some way or other, through instrument and software development or through data acquisition and verification.

The description of The CHIME/FRB system in Section 1.2 has been mostly summarised from previous publications by The CHIME/FRB Collaboration [4][5]. Chapter 2 presents some basic background knowledge for understanding convolutional neural networks, which can be readily found in most deep-learning textbooks such as Goodfellow et al., 2016 [19].

Dr. Shriharsh Tendulkar played a significant role in supervising and providing insights into the development of the plotting routine described in Section 3.2. The scripts were primarily coded by me, with some contributions from Dr. Shriharsh Tendulkar and Mr. Chitrang Patel. These scripts extensively utilise modules from Intensity Analysis Utilities, a library developed by The CHIME/FRB collaboration to analyze intensity data from radio telescopes. The idea of DM-augmentation, described in Subsection 3.2.2, also emerged from discussions within the collaboration.

The classifier described in Section 3.1 was developed and trained independently by me. While most of the events used were labelled by the members of the collaboration, they were also double-checked by me. Mr.
Charanjot Brar and Mr. Chitrang Patel assisted me with the real-time deployment of intensityML. Dr. Ingrid Stairs supervised the overall project and assisted with editing this thesis.

Table of Contents

Abstract
Lay Summary
Preface
Table of Contents
List of Tables
List of Figures
Glossary
Acknowledgments
1 Introduction
  1.1 Fast Radio Bursts
  1.2 The CHIME/FRB Project
  1.3 Related Work: Use of Machine Learning in FRB Classification
    1.3.1 Hybrid Deep Neural Network
    1.3.2 Transfer Learning on ImageNet Models
  1.4 Thesis Organisation
2 Introduction to Convolutional Neural Networks
  2.1 Feed-Forward Neural-Networks
  2.2 Convolutional Neural Networks
3 Description of intensityML
  3.1 The FRBNet Architecture
  3.2 Generating Waterfall Plots
    3.2.1 Automated Plotting Scripts
    3.2.2 Data Augmentation
4 Results
  4.1 Training
  4.2 Results
5 Discussion
  5.1 Discussion and Future Work
    5.1.1 Discussion
    5.1.2 Future Work
    5.1.3 Science Goals
  5.2 Conclusion
Bibliography

List of Tables

Table 4.1 Accuracy, precision, recall and F1-score computed on the test set.
Table 4.2 L1 SNRs and DMs for events shown in Figures 4.1 and 4.2.

List of Figures

Figure 1.1 (a) The plot on the left shows the waterfall plot for an FRB after correcting for the effects from electromagnetic dispersion. (b) The plot on the right shows the same waterfall plot, but with partial dedispersion to demonstrate the effects of the quadratic dispersive delay in Equation 1.1. The masked frequency channels in both plots are due to interference from the LTE band.
Figure 1.2 CHIME radio telescope located at The Dominion Radio Astrophysical Observatory (DRAO) in Canada. Photo taken by Mark Halpern and used with permission.
Figure 1.3 A schematic of CHIME’s signal path.
The raw data collected by the four reflectors (red arcs) is transferred to the F- and X-Engines at a rate of 13 Tb/s. The F-Engine consists of FPGA boards to digitise and channelise the data. The X-Engine utilises a GPU cluster for Fast Fourier Transform beam-forming. The CHIME/FRB backend receives the 1024 stationary intensity beams at 1 ms cadence and 16k frequency channels. Image adapted from [4].
Figure 1.4 A high-level overview of the CHIME/FRB software pipeline showing different stages of processing. Image adapted from [4].
Figure 1.5 A schematic diagram showing the hybrid deep neural-network developed by Connor et al. Image adapted from [9]. In the original figure, the authors seem to have incorrectly labelled the operation on the waterfall plot as 1D convolution.
Figure 1.6 Diagram showing an example of a network architecture used by Agarwal et al. Image adapted from [2].
Figure 2.1 Schematic diagram of a fully-connected n-layer neural-network. An input vector $x \in \mathbb{R}^d$ is transformed to a vector $h^{(1)} \in \mathbb{R}^m$ in the first hidden layer. This transformation is repeated $n$ times as shown in Equation 2.1. The final output vector $\hat{y} \in \mathbb{R}^k$ is obtained by the transformation shown in Equation 2.3, where $\hat{y}_i$ represents the probability for class $i$ if a softmax function (Equation 2.4) is used. Image adapted from [19].
Figure 2.2 The figures show how convolution can be used to extract the edges from an input image with a single colour channel. The red pixels in the convolution kernels represent negative numbers, while the blue pixels represent positive numbers.
Figure 2.3 Schematic diagram showing the operation performed by the convolution layer. See Equation 2.8. Image adapted from [19].
Figure 2.4 Diagram shows the typical sequence of operations in a CNN. Image adapted from [19].
Figure 3.1 Architecture of FRBNet. The model takes a single-channel 256 × 256 pixel image as input.
Layer 0 of the model convolves the input with a fixed set of thirteen 7 × 7 kernels. The resulting thirteen-channel 250 × 250 image is then down-sampled with 2 × 2 max-pooling, followed by a non-linear ReLU activation function. The next three layers each perform a 5 × 5 convolution with a stride of two, followed by a ReLU activation. At the end of Layer 3, the ten-channel image is down-sampled using max-pooling to give a one-dimensional vector of size ten. Finally, a fully-connected layer performs the operation in Equation 2.3 to give an output vector of size 2. Image style adapted from [23].
Figure 3.2 The top left plot shows the waterfall plot of an FRB. The horizontal streaks in this plot are RFI contamination. The remaining plots show the thirteen convolution kernels and their corresponding Layer 0 transformations on the original plot. The kernel on the top right is simply the identity kernel. The remaining kernels show the Prewitt (left) and Sobel (right) kernels embedded in a 7 × 7 grid. These help enhance the vertical pulse shape while wiping away the horizontal RFI streaks.
Figure 3.3 The top left plot shows the waterfall plot of an RFI event (with no astrophysical pulse present). Similar to Figure 3.2, the remaining plots show the thirteen convolution kernels and their corresponding Layer 0 transformations on the original plot.
Figure 3.4 Some examples of plots generated by the automated script for events that were classified as astrophysical by the CHIME/FRB pipeline. The top five plots are pulses from FRBs and known pulsars that were correctly classified as astrophysical by the pipeline.