
Deep Learning for Astronomical Object Classification: A Case Study

Ana Martinazzo, Mateus Espadoto (a) and Nina S. T. Hirata (b)
Institute of Mathematics and Statistics, University of São Paulo, São Paulo, Brazil
{amartina, mespadot, nina}@ime.usp.br
(a) https://orcid.org/0000-0002-1922-4309
(b) https://orcid.org/0000-0001-9722-5764

Keywords: Deep Learning, Astronomy, Galaxy Morphology, Merging Galaxies, Transfer Learning, Neural Networks, ImageNet.

Abstract: With the emergence of photometric surveys in astronomy came the challenge of processing and understanding an enormous amount of image data. In this paper, we systematically compare the performance of five popular ConvNet architectures when applied to three different image classification problems in astronomy, to determine which architecture works best for each problem. We show that a VGG-style architecture pre-trained on ImageNet yields the best results on all studied problems, even when compared to architectures which perform much better on the ImageNet competition.

1 INTRODUCTION

Traditionally, astronomy relied on spectroscopy to gather data from objects in the sky, which is very accurate, since it can collect data from thousands of frequency bands in the electromagnetic spectrum, but very time-consuming, since it usually works for a single object at a time and requires long exposure times. In recent years, with advances in telescope and sensor technology, there was a shift towards photometric surveys, where a trade-off was made: data from only a few frequency bands are collected, typically fewer than a dozen, but for many objects at a time. With this huge amount of data available came the problem of making sense of all of it, and this is where machine learning (ML) comes into play. There are many works on how to use ML in astronomy, such as (Ball and Brunner, 2010; Ivezić et al., 2014), but many of them focus on processing astronomical catalog data, which is pre-processed data in table format based on features extracted from photometric data. In particular, there are quite a few works that address the problem of object classification, which can be of different types, such as star/galaxy classification (Moore et al., 2006) or galaxy morphology (Gauci et al., 2010).

More recently, with the growing popularity of Deep Learning techniques applied to images, fueled by the development of the AlexNet architecture (Krizhevsky et al., 2012), works based on Deep Learning for astronomy started to appear. Examples include star/galaxy classification (Kim and Brunner, 2016), galaxy morphology (Dieleman et al., 2015) and merging galaxy detection (Ackermann et al., 2018), all based on images, as well as galaxy classification based on data from radio telescopes (Aniyan and Thorat, 2017). However, as we understand it, Deep Learning techniques have not yet been systematically explored in astronomy. This might have to do with the fact that a large amount of labeled data is typically required to train deep neural networks, and labeling astronomical data is costly, as it requires expert knowledge or crowdsourcing efforts, such as Galaxy Zoo (Lintott et al., 2008).

To help alleviate the lack of labeled data inherent to many fields, the idea of Transfer Learning (Pan and Yang, 2009) was applied to Deep Learning (Bengio, 2012). Deep neural networks trained on huge datasets such as ImageNet (Deng et al., 2009), which contains millions of images, can be used as generic image feature extractors, or be fine-tuned for particular, much smaller datasets.

In this paper, we present a systematic study of Deep Learning models applied to different image classification problems in astronomy. We compare the accuracy obtained by five popular models from the literature, when trained from scratch or pre-trained on ImageNet and fine-tuned for each problem. The problems we consider are star/galaxy classification, detection of merging galaxies, and galaxy morphology, on different datasets. Our aim is to answer the following questions:

• Which of the five models is more appropriate to each classification problem?
• Which set of hyperparameters is the best for each model and for each problem?
• When is ImageNet pre-training beneficial?
• How do the number of classes and observations affect model performance?

The structure of this paper is as follows. Section 2 discusses related work on astronomical object classification. Section 3 details our experimental setup. Section 4 presents our results, which are then discussed in Section 5. Section 6 concludes the paper.

2 RELATED WORK

Related work can be split into astronomical object classification and Deep Learning, as follows.

Astronomical Object Classification: There are some recurring classification problems in astronomy, which we describe next. Star/galaxy classification, as implied by the name, is the identification of an object from a set of images as a star or as a galaxy. This is fairly straightforward for bright objects, but becomes challenging as astronomical surveys probe deeper into the sky and fainter galaxies begin to appear as point-like objects. Separating stars from galaxies is important for deriving more accurate notions of the true size and true scale of the objects. Detection of merging galaxies from a set of galaxy images is the identification of two or more galaxies that are colliding. It is important for understanding galaxy evolution, as it can give clues about how galaxies agglomerated over time. Galaxy morphology is the study and categorization of the shapes of galaxies. It provides information on the structure of galaxies and is essential for understanding how galaxies form and evolve. Galaxies may be separated into four classes – elliptical, spiral, lenticular and irregular – or into multiple sub-classes. The classification problem becomes harder as more fine-grained sub-classes are considered. It is worth noting that merging galaxies are in fact a subclass of irregular galaxies.

The use of Machine Learning techniques for object classification in astronomy is well established, with many works based on neural networks (Odewahn, 1995; Bertin, 1994; Moore et al., 2006) or other well-known classifiers, such as decision trees (Gauci et al., 2010), and with most works being based on hand-crafted feature extraction techniques, typically by using specific tools created for and by astronomers (Bertin and Arnouts, 1996; Ferrari et al., 2015).

More recently, there have been works that successfully used Deep Learning techniques, such as Convolutional Neural Networks (ConvNets), to classify astronomical objects (Kim and Brunner, 2016; Dieleman et al., 2015; Aniyan and Thorat, 2017). These networks take raw pixel values as input and learn how to extract features during training.

There are also works which leverage the fact that ConvNets are good feature extractors to do so-called Transfer Learning, that is, taking a ConvNet trained to solve one problem and applying it to images of different domains. Regarding astronomical data processing, we are aware of works which implement this idea for the detection of merging galaxies (Ackermann et al., 2018) and for galaxy morphology (Sanchez et al., 2018), with good results. To the best of our knowledge, our work is novel in that it presents a systematic comparison of how different architectures perform in Transfer Learning setups with various settings, and when presented with different classification problems.

Deep Learning: The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) (Russakovsky et al., 2015) is a competition where different image classification techniques are evaluated over a very large dataset, and it was a main driving factor for the development of ConvNets and Deep Learning in general. Since the development of AlexNet (Krizhevsky et al., 2012), many network architectures have been proposed, with the twofold objective of improving accuracy on ImageNet and reducing model complexity, in terms of number of parameters. In Table 1 we show the number of parameters for a few selected models over the years; the improvement is dramatic in some cases.

Table 1: Selected CNN architectures for ImageNet.
Model | Year | Top-1 Acc | Parameters
AlexNet | 2012 | 0.570 | 62,378,344
VGG16 | 2014 | 0.713 | 138,357,544
InceptionResNetV2 | 2016 | 0.803 | 55,873,736
InceptionV3 | 2016 | 0.779 | 23,851,784
ResNext50 | 2017 | 0.778 | 22,979,904
DenseNet121 | 2017 | 0.750 | 8,062,504

We next describe how each selected model improved on AlexNet. VGG16 (Simonyan and Zisserman, 2014) introduced the idea of using more layers with smaller receptive fields, which improved training speed. InceptionV3 (Szegedy et al., 2016b) introduced factorized convolutions, organized as mini-networks with smaller filters, also called Inception blocks, which reduced the total number of parameters in the network. InceptionResNetV2 (Szegedy et al., 2016a) added residual, or skip, connections (He et al., 2016) to the Inception architecture, with the goal of solving the problem of vanishing gradients, which can occur in very deep networks. ResNext50 (Xie et al., 2017) introduced the ideas of aggregated transformations and cardinality, which are respectively a network building block and the number of paths inside each block, and showed improvements in accuracy. DenseNet121 (Huang et al., 2017) took the idea of residual connections further and connected each layer to every following layer, which significantly reduced the number of parameters in the network without losing much accuracy. This area is under active development, but the aforementioned ideas are the ones that have had the most significant impact so far.
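As an illustration only, the sketch below loads some of the architectures from Table 1 as headless feature extractors and reports their parameter counts. It assumes a TensorFlow/Keras environment, which is not specified in this paper; ResNeXt50 is omitted because it is not bundled with keras.applications, the 76x76 input size anticipates the image size used in Section 3.1, and the counts differ from Table 1 because the fully connected top layers are excluded.

```python
# Hypothetical sketch (TensorFlow/Keras assumed): load headless ImageNet
# backbones and compare their parameter counts. ResNeXt50 is not available
# in tf.keras.applications and is therefore omitted here.
import tensorflow as tf

BACKBONES = {
    "VGG16": tf.keras.applications.VGG16,
    "InceptionV3": tf.keras.applications.InceptionV3,
    "InceptionResNetV2": tf.keras.applications.InceptionResNetV2,
    "DenseNet121": tf.keras.applications.DenseNet121,
}

for name, ctor in BACKBONES.items():
    # include_top=False keeps only the convolutional (feature extraction) part;
    # weights="imagenet" loads the ImageNet pre-trained weights.
    base = ctor(include_top=False, weights="imagenet", input_shape=(76, 76, 3))
    print(f"{name}: {base.count_params():,} parameters (convolutional part only)")
```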
3 METHOD

3.1 Datasets

For this experiment, we select three datasets, each representing a different problem in astronomy that can be addressed by image classification: Star/Galaxy classification, detection of Merging Galaxies and Galaxy Morphology. Since the selected Galaxy Morphology dataset has many classes and is highly imbalanced, we grouped its classes into three different views, which enables the evaluation of Galaxy Morphology problems of varying difficulty: easy (2 classes), medium (4 classes) and hard (15 classes). The number of observations varies among those dataset views because we only included classes with more than 100 observations. Another challenge imposed by those datasets is the number of observations vs. the number of classes: the higher the number of classes, the smaller the number of observations in each class. Each dataset is described next in detail:

Star/Galaxy (SG): 50090 images divided into two classes: Stars (27981) and Galaxies (22109), extracted from the Southern Photometric Local Universe Survey (S-PLUS), Data Release 1 (Oliveira et al., 2019). Since this dataset is quite large, its classes are balanced, and Stars and Galaxies tend to be more easily identifiable, we consider it to be the easiest among the evaluated datasets (see Figure 1);

Merging Galaxies (MG): 15766 images divided into two classes: Merging (5778) and Non-interacting (9988) galaxies, extracted from the Sloan Digital Sky Survey (SDSS), Data Release 7 (Abazajian et al., 2009). This dataset is reasonably large, but the objects are not as clearly identifiable as in the case of Star/Galaxy classification (see Figure 2);

Galaxy Morphology, 2-class (EF-2): 3604 images divided into two classes: Elliptical (289) and Spiral (3315) galaxies, extracted from the EFIGI (Baillard et al., 2011) dataset. This dataset is highly unbalanced towards images of Spiral galaxies;

Galaxy Morphology, 4-class (EF-4): 4389 images divided into four classes: Elliptical (289), Spiral (3315), Lenticular (537) and Irregular (248) galaxies, extracted from the EFIGI dataset. The additional classes make the classification problem harder, since the objects are not as clearly identifiable as in the 2-class subset (see Figure 3), and the classes are highly unbalanced as well;

Galaxy Morphology, 15-class (EF-15): 4327 images divided into fifteen classes: Elliptical:-5 (227), Spiral:0 (196), Spiral:1 (257), Spiral:2 (219), Spiral:3 (517), Spiral:4 (472), Spiral:5 (303), Spiral:6 (448), Spiral:7 (285), Spiral:8 (355), Spiral:9 (263), Lenticular:-3 (189), Lenticular:-2 (196), Lenticular:-1 (152), and Irregular:10 (248) galaxies. The numbers after the names come from the Hubble Sequence (Hubble, 1982), which is a standard taxonomy used in astronomy for galaxy morphology. This is the hardest dataset among the evaluated, since it has only a few hundred observations for each class.

The images from all datasets were reduced to 76x76 pixels, for implementation simplicity, and the datasets were split into train (80%), validation (10%) and test (10%) sets.
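A minimal sketch of this preprocessing step is given below. The `images` and `labels` arrays are hypothetical placeholders, and the stratified splitting is an assumption made for illustration; the paper only states the 76x76 target size and the 80/10/10 proportions.

```python
# Minimal sketch of the preprocessing described above: resize every image to
# 76x76 pixels and split each dataset 80/10/10 into train/validation/test.
# `images` and `labels` are hypothetical in-memory arrays; stratification is
# an assumption, since only the proportions are stated in the paper.
import numpy as np
from skimage.transform import resize
from sklearn.model_selection import train_test_split

def preprocess_and_split(images, labels, size=(76, 76), seed=42):
    resized = np.array([resize(img, size, anti_aliasing=True) for img in images])
    # Hold out 20% first, then split it evenly into validation and test sets.
    x_train, x_rest, y_train, y_rest = train_test_split(
        resized, labels, test_size=0.2, stratify=labels, random_state=seed)
    x_val, x_test, y_val, y_test = train_test_split(
        x_rest, y_rest, test_size=0.5, stratify=y_rest, random_state=seed)
    return (x_train, y_train), (x_val, y_val), (x_test, y_test)
```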
Figure 1: Sample images from the SG dataset (panels: Star, Galaxy).

Figure 2: Sample images from the MG dataset (panels: Merging, Non-interacting).

Table 2: Effect of training setup: training from scratch vs. using networks pre-trained with ImageNet on validation accuracy for all models and all datasets, with λ = 0.1, γ = 0 and the ADAM optimizer. Bold shows best values for each dataset and model. Each model column lists accuracy when trained from scratch / pre-trained.
Dataset | DenseNet121 | InceptionResNetV2 | InceptionV3 | ResNext50 | VGG16
EF-15 | 0.329 / 0.408 | 0.350 / 0.373 | 0.249 / 0.303 | 0.282 / 0.333 | 0.126 / 0.459
EF-4 | 0.782 / 0.878 | 0.802 / 0.869 | 0.756 / 0.830 | 0.772 / 0.841 | 0.756 / 0.878
EF-2 | 0.958 / 0.994 | 0.958 / 0.989 | 0.945 / 0.992 | 0.952 / 0.983 | 0.919 / 0.994
MG | 0.918 / 0.857 | 0.875 / 0.799 | 0.634 / 0.784 | 0.913 / 0.826 | 0.634 / 0.948
SG | 0.986 / 0.961 | 0.562 / 0.951 | 0.782 / 0.912 | 0.972 / 0.951 | 0.562 / 0.990

Table 3: Effect of different regularization settings on validation accuracy for all models and all datasets, with optimal values for λ and γ inside parentheses and the ADAM optimizer. Bold shows best values for each dataset and model. Each model column lists accuracy when trained from scratch / pre-trained.
Dataset | DenseNet121 | InceptionResNetV2 | InceptionV3 | ResNext50 | VGG16
EF-15 | 0.350 (0, 2) / 0.408 (0.1, 0) | 0.350 (0.1, 0) / 0.373 (0.1, 0) | 0.249 (0.1, 0) / 0.303 (0.1, 0) | 0.301 (0.1, 2) / 0.333 (0.1, 0) | 0.422 (0, 2) / 0.459 (0.1, 0)
EF-4 | 0.835 (0, 2) / 0.881 (0.1, 2) | 0.802 (0.1, 0) / 0.869 (0.1, 0) | 0.756 (0, 0) / 0.830 (0.1, 0) | 0.786 (0.1, 2) / 0.841 (0.1, 0) | 0.883 (0, 2) / 0.883 (0.1, 2)
EF-2 | 0.961 (0, 0) / 0.994 (0.1, 0) | 0.962 (0, 2) / 0.989 (0.1, 0) | 0.945 (0.1, 0) / 0.992 (0.1, 0) | 0.962 (0, 0) / 0.986 (0.1, 2) | 0.950 (0, 2) / 0.994 (0.1, 0)
MG | 0.918 (0.1, 0) / 0.892 (0, 2) | 0.901 (0, 0) / 0.799 (0.1, 0) | 0.883 (0, 2) / 0.792 (0, 0) | 0.935 (0, 0) / 0.826 (0.1, 0) | 0.634 (0, 0) / 0.952 (0, 0)
SG | 0.989 (0, 0) / 0.973 (0, 2) | 0.959 (0, 0) / 0.952 (0.1, 2) | 0.914 (0, 2) / 0.921 (0, 0) | 0.990 (0, 0) / 0.952 (0.1, 2) | 0.989 (0, 0) / 0.990 (0.1, 2)

Table 4: Effect of different regularization settings: difference between the validation accuracy shown in Table 2 and in Table 3. Bold indicates differences greater than 0.1. Each model column lists the difference when trained from scratch / pre-trained.
Dataset | DenseNet121 | InceptionResNetV2 | InceptionV3 | ResNext50 | VGG16
EF-15 | 0.021 / 0.000 | 0.000 / 0.000 | 0.000 / 0.000 | 0.019 / 0.000 | 0.296 / 0.000
EF-4 | 0.053 / 0.002 | 0.000 / 0.000 | 0.000 / 0.000 | 0.014 / 0.000 | 0.127 / 0.005
EF-2 | 0.003 / 0.000 | 0.004 / 0.000 | 0.000 / 0.000 | 0.010 / 0.003 | 0.031 / 0.000
MG | 0.000 / 0.036 | 0.026 / 0.000 | 0.250 / 0.008 | 0.022 / 0.000 | 0.000 / 0.004
SG | 0.003 / 0.012 | 0.397 / 0.002 | 0.133 / 0.009 | 0.019 / 0.001 | 0.428 / 0.000

Figure 3: Sample images from the EF-2, EF-4 and EF-15 datasets (panels: Elliptical, Spiral, Irregular, Lenticular).

3.2 Training Setup

We select five popular ConvNet architectures for this experiment, namely VGG16, InceptionV3, InceptionResNetV2, ResNext50 and DenseNet121, as our starting point. We only use the convolutional part, i.e., the feature extraction part of each architecture, and add one 2048-unit dense layer with Glorot weight initialization (Glorot and Bengio, 2010), constant bias initialization of 0.01 and Leaky ReLU activation with default settings (Maas et al., 2013), followed by a Dropout layer (Srivastava et al., 2014) with 0.5 probability of dropping connections, followed by the softmax layer for classification.
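One possible realization of this head is sketched below, written against the Keras functional API as an illustration only (the framework actually used is not stated in the paper). The `lam` and `gamma` arguments anticipate the L2 and max-norm settings explored later in this section; the use of Flatten and of the Glorot uniform variant are assumptions made for the sketch.

```python
# Illustrative sketch (TensorFlow/Keras assumed) of the classification head
# described above: a 2048-unit dense layer with Glorot initialization,
# constant bias 0.01, Leaky ReLU with default settings, Dropout 0.5, and a
# softmax output. lam (L2) and gamma (max-norm) of 0 disable the mechanism.
import tensorflow as tf
from tensorflow.keras import constraints, initializers, layers, regularizers

def build_classifier(base, num_classes, lam=0.0, gamma=0):
    reg = {}
    if lam > 0:
        reg["kernel_regularizer"] = regularizers.l2(lam)
    if gamma > 0:
        reg["kernel_constraint"] = constraints.MaxNorm(gamma)
    x = layers.Flatten()(base.output)  # flattening the conv features is an assumption
    x = layers.Dense(2048,
                     kernel_initializer=initializers.GlorotUniform(),
                     bias_initializer=initializers.Constant(0.01),
                     **reg)(x)
    x = layers.LeakyReLU()(x)
    x = layers.Dropout(0.5)(x)
    outputs = layers.Dense(num_classes, activation="softmax")(x)
    return tf.keras.Model(inputs=base.input, outputs=outputs)
```

Here `base` would be one of the headless convolutional backbones loaded with ImageNet weights, or with random weights when training from scratch.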

Given this setup, we explore the following directions with the goal of assessing their effect on the overall network performance:

Training Setup: we train the networks in two distinct ways:

• From scratch, based on the data alone – we train the entire network for up to 200 epochs with Early Stopping, so that training stops automatically if the validation loss diverges from the training loss for more than 10 epochs;

• Fine-tuning based on weights from pre-training on ImageNet – we first train the newly added top layers of the network for 10 epochs, with the convolutional part frozen, meaning only the weights of the top layers are updated; we then unfreeze the last 7 layers of the convolutional part and train the entire network for up to 200 epochs, with the same Early Stopping setup described above and using the Stochastic Gradient Descent optimizer with a learning rate of 0.001 and momentum of 0.9 (Qian, 1999).

Regularization: in addition to Dropout, we use the following regularization techniques in the hidden layer:

• L2 regularization (Krogh and Hertz, 1992) (λ): we set λ ∈ {0.0, 0.1}, with λ = 0.0 meaning no L2 regularization;

• Max-norm constraint (Srebro and Shraibman, 2005) (γ): we set γ ∈ {0, 2}, with γ = 0 meaning no max-norm constraint.

Optimizers: we use the following adaptive optimizers, which are popular in the Deep Learning literature:

• RMSprop (RM) (Tieleman and Hinton, 2012);
• ADAM (AD) (Kingma and Ba, 2014);
• RADAM (RA) (Liu et al., 2019).

With this combination of hyperparameter values and training procedures, we expect to achieve a good compromise between completeness and a reasonable time frame for executing the full experiment, which comprises 24 training sessions for each model and dataset, thus yielding a total of 600 runs (see the sketch below).
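The following sketch illustrates how the two training regimes and the hyperparameter grid above could be wired together. It is not the code used in this work: it assumes Keras, reuses the hypothetical `build_classifier` helper from the earlier sketch, pulls RectifiedAdam (RADAM) from the tensorflow_addons package, and approximates the early-stopping criterion with Keras EarlyStopping on the validation loss; the loss function and the stage in which the adaptive optimizer is applied during fine-tuning are likewise assumptions.

```python
# Illustrative sketch (TensorFlow/Keras assumed) of the two training regimes
# and of the hyperparameter grid described in Section 3.2.
import itertools
import tensorflow as tf
import tensorflow_addons as tfa  # provides RectifiedAdam (RADAM)

OPTIMIZERS = {
    "RM": tf.keras.optimizers.RMSprop,   # RMSprop
    "AD": tf.keras.optimizers.Adam,      # ADAM
    "RA": tfa.optimizers.RectifiedAdam,  # RADAM
}

def train(model, base, pretrained, opt_name, train_ds, val_ds):
    # Early Stopping with patience 10, approximating the criterion above.
    early_stop = tf.keras.callbacks.EarlyStopping(
        monitor="val_loss", patience=10, restore_best_weights=True)
    loss = "categorical_crossentropy"  # assumes one-hot encoded labels
    if pretrained:
        # Stage 1: train only the newly added top layers for 10 epochs.
        base.trainable = False
        model.compile(optimizer=OPTIMIZERS[opt_name](), loss=loss,
                      metrics=["accuracy"])
        model.fit(train_ds, validation_data=val_ds, epochs=10)
        # Stage 2: unfreeze the last 7 convolutional layers and fine-tune
        # with SGD (learning rate 0.001, momentum 0.9) and Early Stopping.
        for layer in base.layers[-7:]:
            layer.trainable = True
        optimizer = tf.keras.optimizers.SGD(learning_rate=0.001, momentum=0.9)
    else:
        # From scratch: train the whole network with the optimizer under study.
        optimizer = OPTIMIZERS[opt_name]()
    model.compile(optimizer=optimizer, loss=loss, metrics=["accuracy"])
    model.fit(train_ds, validation_data=val_ds, epochs=200,
              callbacks=[early_stop])

# The grid of Section 3.2: 2 training setups x 2 values of lambda x 2 values
# of gamma x 3 optimizers = 24 sessions per (model, dataset) pair; with 5
# models and 5 datasets this gives the 600 runs reported above.
grid = list(itertools.product([False, True],        # from scratch / pre-trained
                              [0.0, 0.1],           # lambda (L2)
                              [0, 2],               # gamma (max-norm)
                              ["RM", "AD", "RA"]))  # optimizer
assert len(grid) == 24
```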

Table 5: Effect of different optimizer settings on validation accuracy for all models and all datasets, with λ = 0.1, γ = 0, and the best optimizer indicated inside parentheses (AD indicates ADAM, RA indicates RADAM, RM indicates RMSprop). Bold shows best values for each dataset and model. Each model column lists accuracy when trained from scratch / pre-trained.
Dataset | DenseNet121 | InceptionResNetV2 | InceptionV3 | ResNext50 | VGG16
EF-15 | 0.350 (RA) / 0.408 (AD) | 0.375 (RA) / 0.380 (RM) | 0.301 (RA) / 0.305 (RA) | 0.364 (RA) / 0.333 (AD) | 0.417 (RA) / 0.459 (AD)
EF-4 | 0.786 (RA) / 0.885 (RA) | 0.825 (RA) / 0.874 (RM) | 0.779 (RA) / 0.830 (AD) | 0.812 (RA) / 0.841 (RM) | 0.874 (RA) / 0.892 (RM)
EF-2 | 0.958 (RM) / 0.994 (RM) | 0.961 (RA) / 0.989 (RM) | 0.945 (AD) / 0.992 (RM) | 0.952 (AD) / 0.985 (RA) | 0.965 (RA) / 0.994 (RM)
MG | 0.918 (AD) / 0.858 (RA) | 0.923 (RA) / 0.799 (AD) | 0.827 (RA) / 0.784 (AD) | 0.913 (AD) / 0.827 (RM) | 0.634 (RM) / 0.949 (RM)
SG | 0.992 (RM) / 0.965 (RM) | 0.975 (RM) / 0.952 (RM) | 0.782 (AD) / 0.912 (AD) | 0.990 (RM) / 0.952 (RA) | 0.562 (RM) / 0.990 (RM)

Table 6: Effect of different optimizer settings: difference between the validation accuracy shown in Table 2 and in Table 5. Bold indicates differences greater than 0.1. Each model column lists the difference when trained from scratch / pre-trained.
Dataset | DenseNet121 | InceptionResNetV2 | InceptionV3 | ResNext50 | VGG16
EF-15 | 0.021 / 0.000 | 0.026 / 0.007 | 0.051 / 0.002 | 0.082 / 0.000 | 0.291 / 0.000
EF-4 | 0.005 / 0.007 | 0.023 / 0.005 | 0.023 / 0.000 | 0.039 / 0.000 | 0.117 / 0.014
EF-2 | 0.000 / 0.000 | 0.003 / 0.000 | 0.000 / 0.000 | 0.000 / 0.001 | 0.046 / 0.000
MG | 0.000 / 0.001 | 0.047 / 0.000 | 0.193 / 0.000 | 0.000 / 0.001 | 0.000 / 0.002
SG | 0.005 / 0.005 | 0.413 / 0.002 | 0.000 / 0.000 | 0.019 / 0.001 | 0.000 / 0.000

Table 7: Effect of different optimizer settings: average number of epochs until convergence for all optimizers evaluated. AD indicates ADAM, RA indicates RADAM, RM indicates RMSprop. Bold values indicate the fastest. Each model column lists RM / AD / RA.
Dataset | DenseNet121 | InceptionResNetV2 | InceptionV3 | ResNext50 | VGG16
EF-15 | 2.25 / 4.5 / 4.5 | 7 / 2.75 / 4.25 | 6 / 2 / 2.5 | 8.5 / 2 / 3.25 | 13.75 / 23 / 17.5
EF-4 | 10 / 4.5 / 4.5 | 2.5 / 2.25 / 3 | 1.25 / 1.5 / 2 | 1 / 1.75 / 5 | 1 / 12.75 / 28
EF-2 | 13.25 / 6 / 8.25 | 9.75 / 1.5 / 2.5 | 1.25 / 5 / 1.75 | 3.25 / 13.25 / 4.75 | 1 / 2.5 / 12.75
MG | 18.5 / 7.25 / 6.25 | 1.75 / 1.25 / 1.75 | 2.5 / 15.75 / 2 | 11.25 / 50.25 / 5 | 9 / 1 / 1
SG | 27.5 / 27.5 / 26 | 50.75 / 5 / 2 | 3 / 4.5 / 3.5 | 26.5 / 17.25 / 26.5 | 14.5 / 3.5 / 8.5

4 RESULTS

We next present the obtained results for each experiment. First, the results corresponding to a preset training configuration are presented, and then the effects of varying the hyperparameters are assessed taking this preset configuration as reference.

4.1 Preset Training

We compare the validation accuracy obtained on each dataset with models trained from scratch and with pre-trained models plus fine-tuning. For this assessment, among all results, we selected the ones from the experiments that used λ = 0.1, γ = 0 and the ADAM optimizer, as these are commonly used preset values for those parameters. We see in Table 2 that, with a few exceptions for the larger datasets (SG and MG), using pre-training yields models with higher validation accuracy for all studied datasets, in some cases by a large margin.

4.2 Regularization

We train all models with different regularization settings to assess their effect depending on the training setup (from scratch or pre-trained), network architecture/model, and dataset. In Table 3 we see the validation accuracy obtained with the best regularization settings, for all models and datasets, and in Table 4 we see the difference between the results obtained with a fixed preset (see Table 2) and the optimal ones. In most cases the difference is very small, but in a few cases finding the optimal regularization settings made a significant difference. Another interesting finding is that the significant improvements were achieved only when training from scratch, which indicates that the pre-trained networks are less sensitive to those settings.

Table 8: Best models overall: accuracy achieved for each dataset and model with optimal settings, as described in Table 10. Bold shows best values for each dataset and model. Each model column lists accuracy when trained from scratch / pre-trained.
Dataset | DenseNet121 | InceptionResNetV2 | InceptionV3 | ResNext50 | VGG16
EF-15 | 0.350 / 0.408 | 0.392 / 0.389 | 0.301 / 0.310 | 0.364 / 0.338 | 0.436 / 0.476
EF-4 | 0.835 / 0.885 | 0.835 / 0.874 | 0.782 / 0.830 | 0.812 / 0.841 | 0.883 / 0.903
EF-2 | 0.962 / 0.994 | 0.969 / 0.989 | 0.959 / 0.992 | 0.962 / 0.986 | 0.994 / 0.994
MG | 0.923 / 0.894 | 0.929 / 0.803 | 0.896 / 0.797 | 0.935 / 0.827 | 0.941 / 0.952
SG | 0.992 / 0.973 | 0.975 / 0.952 | 0.914 / 0.921 | 0.992 / 0.952 | 0.992 / 0.991

Table 9: Best models overall: difference between the validation accuracy shown in Table 2 and in Table 8. Bold indicates differences greater than 0.1. Each model column lists the difference when trained from scratch / pre-trained.
Dataset | DenseNet121 | InceptionResNetV2 | InceptionV3 | ResNext50 | VGG16
EF-15 | 0.021 / 0.000 | 0.042 / 0.016 | 0.051 / 0.007 | 0.082 / 0.005 | 0.310 / 0.016
EF-4 | 0.053 / 0.007 | 0.032 / 0.005 | 0.025 / 0.000 | 0.039 / 0.000 | 0.127 / 0.025
EF-2 | 0.004 / 0.000 | 0.011 / 0.000 | 0.014 / 0.000 | 0.010 / 0.003 | 0.076 / 0.000
MG | 0.005 / 0.038 | 0.054 / 0.004 | 0.263 / 0.013 | 0.022 / 0.001 | 0.308 / 0.004
SG | 0.005 / 0.012 | 0.413 / 0.002 | 0.133 / 0.010 | 0.020 / 0.001 | 0.430 / 0.001

Table 10: Best models overall: optimal settings (λ, γ and optimizer) used for achieving the results in Table 8. Each model column lists the settings when trained from scratch / pre-trained.
Dataset | DenseNet121 | InceptionResNetV2 | InceptionV3 | ResNext50 | VGG16
EF-15 | (0.0, 2, AD) / (0.1, 0, AD) | (0.1, 2, RA) / (0.1, 2, RA) | (0.1, 0, RA) / (0.1, 2, RM) | (0.1, 0, RA) / (0.1, 2, RM) | (0.0, 0, RA) / (0.1, 2, RA)
EF-4 | (0.0, 2, AD) / (0.1, 0, RA) | (0.0, 2, RA) / (0.1, 0, RM) | (0.0, 0, RA) / (0.1, 0, AD) | (0.1, 0, RA) / (0.0, 0, RM) | (0.0, 2, AD) / (0.0, 0, RM)
EF-2 | (0.0, 2, RM) / (0.1, 0, RM) | (0.0, 2, RA) / (0.1, 0, RM) | (0.1, 2, RA) / (0.1, 0, RM) | (0.0, 0, AD) / (0.1, 2, AD) | (0.0, 2, RA) / (0.0, 2, RM)
MG | (0.0, 2, RA) / (0.0, 2, RA) | (0.0, 0, RA) / (0.1, 2, RM) | (0.1, 2, RA) / (0.0, 2, RA) | (0.0, 0, AD) / (0.1, 0, RM) | (0.0, 2, RM) / (0.0, 0, AD)
SG | (0.1, 0, RM) / (0.0, 2, AD) | (0.1, 0, RM) / (0.1, 0, RM) | (0.0, 2, AD) / (0.0, 2, RM) | (0.1, 2, RM) / (0.1, 2, AD) | (0.0, 2, RA) / (0.1, 2, RA)

4.3 Optimizers

We experiment with different optimizers to assess the effect they have on accuracy and convergence speed. In Table 5 we see the validation accuracy obtained with the best optimizer for each training setup, model and dataset, and in Table 6 we see the difference between the results obtained with a fixed preset (Table 2) and the optimal ones. In this case, the difference is less noticeable than in the case of the regularization settings, with only a few combinations showing significant improvement. As in the case of regularization, the significant improvement, where it exists, is achieved only when training from scratch.

Regarding convergence speed, in Table 7 we see that in most cases there is an association between the fastest optimizer and a certain architecture, regardless of the dataset. This may indicate that some optimizers are more suitable to certain types of models, according to intrinsic characteristics such as the number of parameters, the existence of residual connections, and so on.

4.4 Best Models Overall

Finally, by combining the best optimizer settings with the best regularization settings, we get the best models overall for each dataset, as seen in Tables 8 and 9. We see that significant improvement can be achieved by finding the right combination of model, optimizer and regularization settings for a specific problem; however, finding those optimal settings can be very time-consuming. It also seems that there are only a few cases where performance is significantly better than in the preset case. Table 10 shows the optimal settings for each model and dataset.

5 DISCUSSION

We next summarize the obtained insights, as follows.

Which of the five models is more appropriate to which classification problem?

In Tables 11 and 12 we see the best models for each dataset, ranked by accuracy, in both training setups. We notice that the VGG16 architecture yielded the best performance for all datasets, in both training setups and in some cases by a large margin. Since astronomy images are considerably simpler than the ones from ImageNet, usually with an object in the center surrounded by a dark background, we believe that VGG16's much larger number of parameters might have been instrumental in helping the network discriminate images that vary little per class. When looking at the second-best model, we see different trends depending on the training setup: when trained from scratch, InceptionResNetV2 produced the best results for the smaller datasets (EF-2, EF-4 and EF-15), while the larger datasets (SG and MG) achieved the best results with ResNext50; when pre-trained on ImageNet, DenseNet121 is the overall second-best. We also see that InceptionV3 was the worst-performing model in almost every case. In summary, we see a trend emerge, with models containing many parameters performing best, followed by models with residual connections, followed by models without residual connections. Since models with residual connections are faster to train and run inference on than VGG-style models, a user could trade off some accuracy for speed if required.

Table 11: Best models for each dataset, ranked by accuracy, trained from scratch.
EF-15 | EF-2 | EF-4 | MG | SG
VGG16 (0.436) | VGG16 (0.994) | VGG16 (0.883) | VGG16 (0.941) | VGG16 (0.992)
InceptionResNetV2 (0.392) | InceptionResNetV2 (0.969) | InceptionResNetV2 (0.835) | ResNext50 (0.935) | ResNext50 (0.992)
ResNext50 (0.364) | ResNext50 (0.959) | DenseNet121 (0.835) | InceptionResNetV2 (0.929) | DenseNet121 (0.992)
DenseNet121 (0.350) | DenseNet121 (0.962) | ResNext50 (0.812) | DenseNet121 (0.923) | InceptionResNetV2 (0.975)
InceptionV3 (0.301) | InceptionV3 (0.959) | InceptionV3 (0.782) | InceptionV3 (0.896) | InceptionV3 (0.914)

Table 12: Best models for each dataset, ranked by accuracy, pre-trained on ImageNet.
EF-15 | EF-2 | EF-4 | MG | SG
VGG16 (0.476) | VGG16 (0.994) | VGG16 (0.903) | VGG16 (0.991) | VGG16 (0.991)
DenseNet121 (0.408) | DenseNet121 (0.994) | DenseNet121 (0.885) | DenseNet121 (0.894) | DenseNet121 (0.973)
InceptionResNetV2 (0.389) | InceptionV3 (0.992) | InceptionResNetV2 (0.874) | ResNext50 (0.827) | ResNext50 (0.952)
ResNext50 (0.338) | InceptionResNetV2 (0.989) | ResNext50 (0.841) | InceptionResNetV2 (0.803) | InceptionResNetV2 (0.952)
InceptionV3 (0.310) | ResNext50 (0.986) | InceptionV3 (0.830) | InceptionV3 (0.797) | InceptionV3 (0.921)

Which set of hyperparameters is the best for each model and for each problem?

In Table 10 we see that the regularization settings are highly dependent on the model and training setup, and less dependent on the data. For example, when trained from scratch, DenseNet121 achieves top accuracy in most cases with λ = 0.0 and γ = 2, whereas when pre-trained on ImageNet, it achieves top accuracy in most cases with λ = 0.1 and γ = 2. InceptionV3 presents a similar pattern, while VGG16 works best in most cases with λ = 0.0 and γ = 2, regardless of training setup. In the specific case of VGG16, we expected such a large model to require stronger regularization, but we found that max-norm and dropout work best, which corroborates the results of (Srivastava et al., 2014). Regarding the optimizer, the pattern is less clear, with some models working best with a single optimizer depending on the training setup, like InceptionV3 with RADAM (from scratch) and RMSprop (pre-trained). The RADAM optimizer, which is a fairly new development in the area, performed well if we look at how many times it appeared as the best choice, but it does not seem to provide a very big improvement over ADAM and RMSprop.

When is ImageNet pre-training beneficial?

In all cases except one (an almost-tie, for the SG dataset), the highest accuracy was achieved by using models pre-trained on ImageNet, with a five-percentage-point improvement for the MG dataset. We see this as confirmation that models pre-trained on ImageNet can be used as generic feature extractors for images in very different domains, such as astronomy, with little effort.

How do the number of classes and observations affect model performance?

We see that VGG16 achieved almost perfect scores for the larger datasets (SG and MG) in both training setups. This was expected, since these datasets are reasonably large and have only two classes each. The datasets EF-2 and EF-4, despite being quite small for ConvNets, also worked very well with VGG16, which might be due to the fact that their images are of higher quality when compared to the others, and that they also have only a few classes. However, in the case of EF-15, even the best model achieves accuracy below 50%. In this case, we believe the higher number of classes along with the small number of observations per class played a central role.

6 CONCLUSION

We presented an in-depth study aimed at evaluating different ConvNet architectures for different astronomical object classification problems. We presented results for five popular architectures and five datasets representing three different problems in astronomy. We showed how training setup, optimizer and regularization settings affect the accuracy for different architectures and datasets, and we conclude that, for all problems studied here, VGG-style architectures pre-trained on ImageNet perform the best, even with smaller datasets, followed by models with residual connections such as DenseNet, InceptionResNetV2 and ResNext.

We plan to extend this work in the direction of exploring different feature extraction techniques, in particular, by finding ways of leveraging the enormous amount of unlabeled data which exists in astronomy.

All experiments were executed on a computer with a quad-core Intel E3-1240v6 running at 3.7 GHz, 64 GB of RAM and an NVidia GTX 1080 Ti GPU with 11 GB of Video RAM.
ACKNOWLEDGMENTS

This study was financed in part by FAPESP (2015/22308-2, 2017/25835-9 and 2018/25671-9) and the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001.

REFERENCES

Abazajian, K. N. et al. (2009). The Seventh Data Release of the Sloan Digital Sky Survey. The Astrophysical Journal Supplement Series, 182(2):543–558.

Ackermann, S., Schawinski, K., Zhang, C., Weigel, A. K., and Turp, M. D. (2018). Using transfer learning to detect galaxy mergers. Monthly Notices of the Royal Astronomical Society, 479(1):415–425.

Aniyan, A. and Thorat, K. (2017). Classifying radio galaxies with the convolutional neural network. The Astrophysical Journal Supplement Series, 230(2):20.

Baillard, A., Bertin, E., De Lapparent, V., Fouqué, P., Arnouts, S., Mellier, Y., Pelló, R., Leborgne, J.-F., Prugniel, P., Makarov, D., et al. (2011). The EFIGI catalogue of 4458 nearby galaxies with detailed morphology. Astronomy & Astrophysics, 532:A74.

Ball, N. M. and Brunner, R. J. (2010). Data mining and machine learning in astronomy. International Journal of Modern Physics D, 19(07):1049–1106.

Bengio, Y. (2012). Deep learning of representations for unsupervised and transfer learning. In Proc. ICML, pages 17–36.

Bertin, E. (1994). Classification of astronomical images with a neural network. Astrophysics and Space Science, 217(1-2):49–51.

Bertin, E. and Arnouts, S. (1996). SExtractor: Software for source extraction. Astronomy and Astrophysics Supplement Series, 117(2):393–404.

Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., and Fei-Fei, L. (2009). ImageNet: A large-scale hierarchical image database. In Proc. CVPR, pages 248–255.

Dieleman, S., Willett, K. W., and Dambre, J. (2015). Rotation-invariant convolutional neural networks for galaxy morphology prediction. Monthly Notices of the Royal Astronomical Society, 450(2):1441–1459.

Ferrari, F., de Carvalho, R. R., and Trevisan, M. (2015). Morfometryka – a new way of establishing morphological classification of galaxies. The Astrophysical Journal, 814(1):55.

Gauci, A., Adami, K. Z., and Abela, J. (2010). Machine Learning for Galaxy Morphology Classification.

Glorot, X. and Bengio, Y. (2010). Understanding the difficulty of training deep feedforward neural networks. In Proc. AISTATS, pages 249–256.

He, K., Zhang, X., Ren, S., and Sun, J. (2016). Deep residual learning for image recognition. Proc. CVPR.

Huang, G., Liu, Z., Maaten, L. v. d., and Weinberger, K. Q. (2017). Densely connected convolutional networks. Proc. CVPR.

Hubble, E. P. (1982). The realm of the nebulae, volume 25. Yale University Press.

Ivezić, Ž., Connolly, A. J., VanderPlas, J. T., and Gray, A. (2014). Statistics, Data Mining, and Machine Learning in Astronomy: A Practical Python Guide for the Analysis of Survey Data. Princeton University Press.

Kim, E. J. and Brunner, R. J. (2016). Star-galaxy classification using deep convolutional neural networks. Monthly Notices of the Royal Astronomical Society, page 2672.

Kingma, D. and Ba, J. (2014). Adam: A method for stochastic optimization. arXiv:1412.6980.

Krizhevsky, A., Sutskever, I., and Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems, pages 1097–1105.

Krogh, A. and Hertz, J. A. (1992). A simple weight decay can improve generalization. In Proc. NIPS, pages 950–957.

Lintott, C. J., Schawinski, K., Slosar, A., Land, K., Bamford, S., Thomas, D., Raddick, M. J., Nichol, R. C., Szalay, A., Andreescu, D., Murray, P., and Vandenberg, J. (2008). Galaxy Zoo: morphologies derived from visual inspection of galaxies from the Sloan Digital Sky Survey. Monthly Notices of the Royal Astronomical Society, 389(3):1179–1189.

Liu, L., Jiang, H., He, P., Chen, W., Liu, X., Gao, J., and Han, J. (2019). On the variance of the adaptive learning rate and beyond.
Maas, A. L., Hannun, A. Y., and Ng, A. Y. (2013). Rectifier nonlinearities improve neural network acoustic models. In Proc. ICML, volume 30, page 3.

Moore, J. A., Pimbblet, K. A., and Drinkwater, M. J. (2006). Mathematical morphology: Star/galaxy differentiation & galaxy morphology classification. Publications of the Astronomical Society of Australia, 23(04):135–146.

Odewahn, S. (1995). Automated classification of astronomical images. Publications of the Astronomical Society of the Pacific, 107(714):770.

Oliveira, C. M., Ribeiro, T., Schoenell, W., Kanaan, A., Overzier, R. A., Molino, A., Sampedro, L., Coelho, P., Barbosa, C. E., Cortesi, A., Costa-Duarte, M. V., Herpich, F. R., et al. (2019). The Southern Photometric Local Universe Survey (S-PLUS): improved SEDs, morphologies and redshifts with 12 optical filters. arXiv:1907.01567.

Pan, S. J. and Yang, Q. (2009). A survey on transfer learning. Proc. TKDE, 22(10):1345–1359.

Qian, N. (1999). On the momentum term in gradient descent learning algorithms. Neural Networks, 12(1):145–151.

Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bernstein, M., Berg, A. C., and Fei-Fei, L. (2015). ImageNet Large Scale Visual Recognition Challenge. International Journal of Computer Vision (IJCV), 115(3):211–252.

Sanchez, H. et al. (2018). Transfer learning for galaxy morphology from one survey to another. Monthly Notices of the Royal Astronomical Society, 484(1):93–100.

Simonyan, K. and Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition.

Srebro, N. and Shraibman, A. (2005). Rank, trace-norm and max-norm. In Proc. COLT, pages 545–560. Springer.

Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., and Salakhutdinov, R. (2014). Dropout: a simple way to prevent neural networks from overfitting. JMLR, 15(1):1929–1958.
Szegedy, C., Ioffe, S., Vanhoucke, V., and Alemi, A. (2016a). Inception-v4, Inception-ResNet and the impact of residual connections on learning.

Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., and Wojna, Z. (2016b). Rethinking the Inception architecture for computer vision. Proc. CVPR.

Tieleman, T. and Hinton, G. (2012). RMSprop: Divide the gradient by a running average of its recent magnitude. COURSERA: Neural Networks for Machine Learning. Technical report, page 31.

Xie, S., Girshick, R., Dollár, P., Tu, Z., and He, K. (2017). Aggregated residual transformations for deep neural networks. In Proc. CVPR, pages 1492–1500. IEEE.