Discovering the Unexpected in Astronomical Survey Data

Publications of the Astronomical Society of Australia (PASA), Vol. 34, e007, 10 pages (2017). © Astronomical Society of Australia 2017; published by Cambridge University Press. doi:10.1017/pasa.2016.63 Discovering the Unexpected in Astronomical Survey Data Ray P. Norris1,2 1Western Sydney University, Locked Bag 1797, Penrith South, NSW 1797, Australia 2CSIRO Astronomy & Space Science, PO Box 76, Epping, NSW 1710, Australia Email: [email protected] (RECEIVED November 17, 2016; ACCEPTED December 19, 2016) Abstract Most major discoveries in astronomy are unplanned, and result from surveying the Universe in a new way, rather than by testing a hypothesis or conducting an investigation with planned outcomes. For example, of the ten greatest discoveries made by the Hubble Space Telescope, only one was listed in its key science goals. So a telescope that merely achieves its stated science goals is not achieving its potential scientific productivity. Several next-generation astronomical survey telescopes are currently being designed and constructed that will significantly expand the volume of observational parameter space, and should in principle discover unexpected new phenomena and new types of object. However, the complexity of the telescopes and the large data volumes mean that these discoveries are unlikely to be found by chance. Therefore, it is necessary to plan explicitly for unexpected discoveries in the design and construction. Two types of discovery are recognised: unexpected objects and unexpected phenomena. This paper argues that next-generation astronomical surveys require an explicit process for detecting the unexpected, and proposes an implementation of this process. This implementation addresses both types of discovery, and relies heavily on machine-learning techniques, and also on theory-based simulations that encapsulate our current understanding of the Universe. Keywords: surveys, telescopes 1 INTRODUCTION representing data, that led to the development of models of stellar evolution and ultimately nuclear fusion. In another ex- Popper (1959) described the scientific method as a process in ample, the expanding Universe was discovered when Hubble which theory is used to make a prediction which is then tested plotted redshifts of galaxies against their brightness (Hub- by experiment. That model, and its principle of ‘falsifiabil- ble 1929). More recently, the Hubble Deep Fields (Williams ity’ remains the gold standard of the scientific method, and et al. 1996, 2000) were primarily motivated by a desire to ex- probably drives the majority of scientific progress. Notable plore the early Universe, rather than testing specific models recent successes include the discovery of the Higgs boson or hypotheses. (ATLAS 2012) and the detection of gravitational waves (Ab- bottetal.2016). Conversely, models such as string theory are sometimes criticised (e.g. Woit 2011) for being unfalsi- 1.1. The history of astronomical discovery fiable, and thus failing to adhere to this Popperian scientific method. Astronomical discovery has often occurred as a result of tech- However, the Popperian scientific method is not the only nical innovation, resulting in the Universe being observed in one, and a number of other modes of scientific discovery a way that was not previously possible. Examples include the have been proposed, notably by Kuhn (1962). For example, development of larger telescopes, or the opening up of a new science may also proceed through a process of ‘exploration’ window of the electromagnetic spectrum. More generally, (e.g. Harwit 1981), in which experiments or observations are we may define an n-dimensional parameter space whose n carried out in the absence of a compelling theory, in order to orthogonal axes correspond to observable quantities (e.g. fre- guide the development of theory. quency, sensitivity, polarisation, colour, spatial scale, tempo- Astronomy has largely developed through a process of ral scale). Some parts of this parameter space have been well- exploration. For example, the Hertzsprung–Russell diagram observed and have already yielded their discoveries, whereas (Hertzsprung 1908) was an observationally driven idea of some parts of this space have not yet been observed. New 1 Downloaded from https://www.cambridge.org/core. IP address: 170.106.33.14, on 01 Oct 2021 at 10:40:22, subject to the Cambridge Core terms of use, available at https://www.cambridge.org/core/terms. https://doi.org/10.1017/pasa.2016.63 2 Norris 1.2. This paper In Section 2 of this paper, I discuss the opportunities and challenges to making unexpected discoveries in the high data volumes and high complexity of next-generation astronomical surveys, and argue that surveys need to plan explicitly for these discoveries if they are to be successful. Section 3 proposes a process for discovering unexpected objects in astronomical surveys, and Section 4 proposes a process for discovering unexpected phenomena in astronomical surveys. Section 5 describes some preliminary attempts to implement and test some of these approaches and suggests some future Figure 1. A plot of recent major astronomical discoveries, taken from Ekers directions. (2009), of which seven were ‘known–unknowns’ (i.e. discoveries made by To focus the discussion, this paper uses the ‘Evolutionary testing a prediction) and ten were ‘unknown–unknowns’ (i.e. a serendipitous Map of the Universe’ survey (EMU: Norris et al. 2011)as result found by chance while performing an experiment with different goals). an exemplar of next-generation surveys, but the broad con- The data in this plot are taken from Wilkinson et al. (2004). clusions and process will be relevant to all next-generation astronomical surveys. discoveries may lie in those unsampled parts of the param- 2 THE PROCESS OF ASTRONOMICAL eter space, presumably available to new instruments able to DISCOVERY sample that region of the parameter space. Most ‘accidental’ or ‘serendipitous’ discoveries result from observing a new Astronomy is currently enjoying a boom in new surveys, part of this parameter space (Harwit 2003). with several next-generation astronomical survey telescopes We may therefore broadly divide astronomical discover- planned, which will undoubtedly open up large new swathes ies into (a) those which were made according to the Poppe- of observational parameter space, potentially resulting in a rian model, in which a model or hypothesis is being tested large number of unexpected discoveries. (the known–unknowns), and (b) those which have resulted There are two quite different types of unexpected from observing the Universe in a new part of the parame- discovery: ter space, resulting in unexpected discoveries (the unknown– unknowns). Of course, an experiment may often be planned • Type 1: Discoveries of new types of object (e.g. pulsars, to test a hypothesis, but in doing so stumbles across an unex- quasars), identified as anomalies or unexpected objects pected discovery. A classic example of this is the discovery in images or catalogues; of pulsars (Hewish et al. 1968) discussed in Section 2.2.Al- • Type 2: Discoveries of new phenomena (e.g. HR dia- ternatively, data taken for an unrelated purpose may be mined gram, the expanding Universe, dark energy), identified for unexpected discoveries, such as the outlier detection al- as anomalies in the distributions of properties of objects. gorithm described by Baron & Poznanski (2016) that finds These are identified when the results of experiment are ‘weird’ galaxies by searching for unusual spectra from the compared to theory (or perhaps to other observations) in Sloan Digital Sky Survey. some suitable parameter space. Several studies (Harwit 1981; Wilkinson et al. 2004; Wilkinson 2007; Fabian 2010; Kellermann 2009;Ekers2009; 2.1. Case study 1: The Hubble space telescope Wilkinson 2015) have shown that at least half the major discoveries in astronomy are unexpected, and are typically made The science goals that drove the funding, construction, and by surveying the Universe in a new way, rather than by test- launch of the Hubble Space Telescope (HST) are listed in the ing a hypothesis or conducting an investigation with planned HST funding proposal (Lallo 2012). A further four projects outcomes. For example, Figure 1 shows the result of an ex- were planned in advance by individual scientists but not listed amination (Ekers 2009) of 17 major astronomical discoveries as key projects in the HST proposal. Conveniently, the Na- in the last 60 yrs. Ekers concluded that only seven resulted tional Geographic magazine selected the ten major discov- from systematic observations designed to test a hypothesis or eries of the HST (Handwerk 2005), resulting in an admit- probe the nature of a type of object. The remaining ten were tedly subjective ‘top ten’ list of HST discoveries (shown in unexpected discoveries resulting either from new technology, Table 1). So we may compare the actual achievements of the or from observing the sky in an innovative way, exploring un- HST against its planned achievements. Of these ten greatest charted parameter space. In particular, experience has shown discoveries by HST, only one was listed in its key science that unexpected discoveries often result when the sky is ob- goals. In particular, the unplanned discoveries include two of served to a significantly greater sensitivity, or a significantly the three most cited discoveries, and the only HST discovery new volume of observational parameter space is explored. (Dark Energy) to win a Nobel prize. PASA, 34, e007 (2017) doi:10.1017/pasa.2016.63 Downloaded from https://www.cambridge.org/core. IP address: 170.106.33.14, on 01 Oct 2021 at 10:40:22, subject to the Cambridge Core terms of use, available at https://www.cambridge.org/core/terms. https://doi.org/10.1017/pasa.2016.63 Discovering the Unexpected 3 Table 1. Major discoveries made by the Hubble Space Telescope ( HST ). Of the HST ’s ‘top ten’ discoveries (as ranked by National Geographic magazine), only one was a key project used in the HST funding proposal (Lallo 2012). A further four projects were planned in advance by individual scientists but not listed as key projects in the HST proposal.

Discovering the Unexpected in Astronomical Survey Data

2004-2005 Annual Report

ASTRONET IR Final Word Doc for Printing with Logo

Inferring Kilonova Population Properties with a Hierarchical Bayesian Framework I : Non-Detection Methodology and Single-Event Analyses

J-PLUS: on the Identification of New Cluster

Combining Dark Energy Survey Science Verification Data with Near

Fast and Accurate Estimation for Astrophysical Problems in Large Databases

Astro2010: the Astronomy and Astrophysics Decadal Survey

Atlas: a High-Cadence All-Sky Survey System

Astronet SV for Publication New Cover Final

THE ALHAMBRA SURVEY: EVOLUTION of GALAXY SPECTRAL SEGREGATION Ll

Ground Based, Space Based, Infrastructure, Technological Development, and State of the Profession Activities

Assessing the Photometric Redshift Precision of the S-PLUS Survey: the Stripe-82 As a Test-Case