Automatic Detection and Quantitative Assessment of Peculiar Galaxy Pairs in Sloan Digital Sky Survey
Lior Shamir1*, John Wallin2
1. Dept. of Comp. Sci., Lawrence Technological University, 21000 W Ten Mile Rd., Southfield, MI 48075, USA
2. Dept. of Physics and Astronomy, Middle Tennessee State University, 1301 E Main St, Murfreesboro, TN 37130, USA
* E-mail: [email protected]

ABSTRACT

We applied computational tools for automatic detection of peculiar galaxy pairs. We first detected in SDSS DR7 ∼400,000 galaxy images with i magnitude < 18 that had more than one point spread function, and then applied a machine learning algorithm that detected ∼26,000 galaxy images that had morphology similar to the morphology of galaxy mergers. That dataset was mined using a novelty detection algorithm, producing a short list of the 500 most peculiar galaxies as quantitatively determined by the algorithm. Manual examination of these galaxies showed that while most of the galaxy pairs in the list were not necessarily peculiar, numerous unusual galaxy pairs were detected. In this paper we describe the protocol and computational tools used for the detection of peculiar mergers, and provide examples of peculiar galaxy pairs that were detected.

Key words: Galaxy: general – galaxies: interactions – galaxies: peculiar – astronomical databases: catalogs

1 INTRODUCTION

Interactions between galaxies are associated with galaxy morphology (Springel & Hernquist 2005; Bower et al. 2006), quasars (Hopkins et al. 2005), enhanced rates of star formation (Di Matteo et al. 2007; Bridge et al. 2007), and activity in galactic nuclei (Springel et al. 2008). Cotini et al. (2013) developed and utilized a method for automatic detection of galaxy mergers, and studied the galaxy merger population to show a link between mergers and galaxies with supermassive black holes.

More recent work on interacting systems has focused on pairs of galaxies with similar redshifts and small projected distances taken from the Sloan Digital Sky Survey. In these papers, the authors have systematically examined these close pairs for evidence of an increased star formation rate (Ellison et al. 2008), elevated nuclear activity (Ellison et al. 2010), and other measurable effects that might be associated with interaction. The confounding effect in these studies is the possibility of superpositions between galaxies in the same group. If the galaxies in a close pair have peculiar morphologies, there is a high confidence that there has been a recent interaction and that the system is not just a chance superposition. However, since there has been no clear objective way to define when a galaxy is “peculiar” (Naim & Lahav 1996), the objective criteria of velocity and projected distance have been the best way of analyzing a large sample of interacting pairs.

Numerous manually crafted catalogues and classification schemes of galaxy mergers have been proposed and published (Arp 1966; Struck 1999; Schombert, Wallin & Struck 1990; Vorontsov-Velyaminov 1959, 1977; Arp & Madore 1987). Early efforts to identify and catalog peculiar galaxies used photographic surveys. The difficulty of defining a set of objective criteria for peculiar galaxies can be illustrated by the classification schemes that have been used in these catalogs.

The Catalog of Interacting Galaxies (Vorontsov-Velyaminov 1959, 1977) contained 335 objects and placed peculiar galaxies into six primary categories: “HII regions”, “M51 type”, “Nests”, “Pairs”, “Pseudo Rings”, “Comets”, and “Enigmatic.” The Arp Atlas of Peculiar Galaxies (Arp 1966) was a catalog of 332 peculiar and interacting systems. There were four primary overlapping categories for these objects including Spirals, Galaxies, E and E-like Galaxies, and Double Galaxies. Within these groups, there were 37 other subgroups including “ring galaxies”, “three arm spirals”, “galaxies with jets”, and “double galaxies with wind effects.” Given the range of naming conventions and morphologies, the question of when a galaxy is “peculiar” has remained difficult to quantify.

Catalogues of galaxy interactions have traditionally been produced by manual inspection of galaxy images by a few dedicated scientists. However, as data sets have grown, the Galaxy Zoo project (including Galaxy Zoo II and the broader Zooniverse efforts) has incorporated “citizen scientist” volunteers to classify galaxy morphologies from SDSS data (Lintott et al. 2008, 2011). Such manual analysis of galaxies using crowdsourcing was used for analyzing properties of merging galaxies (Darg et al. 2010; Casteels et al. 2013).

Aside from the questions about the completeness of these catalogs, the time needed to construct them is immense. Arp & Madore (1987) reported that it took ∼14 years to compile and produce their catalog of ∼6400 mergers in the southern hemisphere. In the era of robotic telescopes and digital sky surveys acquiring images of many billions of galaxies (Djorgovski et al. 2013; Borne 2013), manually crafting catalogues of mergers becomes impractical. While the morphology of most galaxy mergers is known, some galaxy mergers have peculiar morphology. Here we describe the detection of peculiar galaxy mergers by a computer algorithm mining galaxy images acquired by the Sloan Digital Sky Survey.

2 METHOD

Galaxy mergers feature complex morphology that involves the shape of two or more interacting galaxies, as well as the distance and position of the galaxies in the system. Here we analyze images from the Sloan Digital Sky Survey (Schneider et al. 2003).

In the first step, we downloaded ∼3.7×10⁶ objects identified by SDSS as galaxies (object type = 3) with i magnitude < 18. Each galaxy image had a dimensionality of 120×120 pixels, and was downloaded directly using the DR7 Catalog Archive Server (CAS) as a jpg image and converted to 8-bit TIFF format. The magnitude threshold is used to reduce the number of images to a smaller set of the brightest objects. Downloading all of these images took 28 days.

After the images were downloaded, a preliminary test was applied to each image to filter possible artefacts. The preliminary test rejected images in which 80% or more of the pixels were brighter than 120, or images in which 80% or more of the pixels were of the same colour (green, blue or red) after applying a fuzzy logic-based colour classification transform (Shamir 2006). The test was based on the SDSS colour mapping (Lupton et al. 2004), in which large swathes of a single colour are often signs of detector saturation, and can therefore be considered artefacts. After rejecting the artefacts, ∼3.2×10⁶ objects were left.

Then, we applied an algorithm to determine whether a certain image has two neighboring objects, or whether an object had more than one point spread function in it. The detection of two separate objects was performed by first applying the Otsu binary transform (Otsu 1979) to separate the foreground from the background pixels. The transform begins by computing the Otsu threshold. The Otsu method determines the threshold above which a pixel is considered a foreground pixel by iteratively testing each gray value, and computing the variance of the pixels dimmer than that value and the variance of the pixels brighter than the candidate threshold. The gray value that provides the minimum of the sum of the variances is the Otsu threshold (Otsu 1979). The Otsu threshold separates the foreground and background pixels regardless of linear mapping of the pixel intensity values.

The set of foreground pixels is then separated into objects by counting the 4-connected objects (Shamir 2011a). The 4-connected objects are simply groups of foreground pixels such that each foreground pixel $I_{x,y}$ in group $O$ satisfies the condition

$\forall I_{x,y} \in O \;\; \exists \left( I_{x,y+1} \in O \mid I_{x,y-1} \in O \mid I_{x+1,y} \in O \mid I_{x-1,y} \in O \right)$.

That is, each foreground pixel in the group has at least one neighboring foreground pixel.

If more than one object is found, the image is flagged as a candidate for a galaxy merger. If only one object is found, the object is scanned for peaks using a point spread function detection algorithm (Shamir & Nemiroff 2005a,b), and if more than one peak is found the image is considered a potential galaxy merger. The peak detection code is part of the Wolf open source image analysis package (Shamir et al. 2006; Shamir 2012a). It should be noted that the same technique can also be used for automatic detection of recoiling supermassive black holes.

The separation of objects with more than one peak reduced the set of ∼3.2×10⁶ images to ∼4.32×10⁵ potential galaxy mergers. However, many of these images are not images of interacting galaxies. Figure 1 shows a few examples of objects that were classified as galaxies by the SDSS pipeline and were also detected as potential galaxy mergers.

As the figure shows, images with two objects or with objects with two detected PSFs are not necessarily galaxy mergers. To find galaxy mergers we used the Wndchrm image analysis software tool (Shamir et al. 2008a; Shamir 2013b), which was originally developed for analysis of microscopy images (Shamir et al. 2008b, 2010a), but was also found informative for the analysis of galaxy images (Shamir 2009), and in particular for analyzing the morphology of galaxy mergers (Shamir et al. 2013a). Wndchrm works by first extracting a very large set of numerical image content descriptors for each image, so that each image is represented by a vector of 2883 numerical values. These content descriptors provide a comprehensive set that reflects the shape, colour, textures, fractals, polynomial decomposition of the image, and statistics of the pixel value distribution. The content descriptors are extracted from the raw images, but also from transforms of the images (e.g., FFT, Wavelet, Chebyshev, gradient), as well as combinations of transforms (e.g., FFT transform of the Wavelet transform). A detailed description of Wndchrm can be found in Shamir et al. (2008a, 2010b, 2013a).

Wndchrm performs colour analysis by using the RGB channels (Shamir et al. 2010b). That type of analysis might be less accurate than analyzing the FITS images of each colour channel separately, but it allows colour analysis without the need to download multiple FITS files for each celestial object, and