Discriminating Topology in Galaxy Distributions Using Network Analysis
Citation: Hong, Sungryong, Bruno C. Coutinho, Arjun Dey, Albert-L. Barabási, Mark Vogelsberger, Lars Hernquist, and Karl Gebhardt. 2016. “Discriminating Topology in Galaxy Distributions Using Network Analysis.” Monthly Notices of the Royal Astronomical Society 459 (3): 2690–2700. https://doi.org/10.1093/mnras/stw803.
Citable link: http://nrs.harvard.edu/urn-3:HUL.InstRepos:41381855
Terms of Use: This article was downloaded from Harvard University’s DASH repository, and is made available under the terms and conditions applicable to Open Access Policy Articles, as set forth at http://nrs.harvard.edu/urn-3:HUL.InstRepos:dash.current.terms-of-use#OAP

Preprint typeset using LaTeX style emulateapj v. 2/16/10

Sungryong Hong (1, 2), Bruno Coutinho (3), Arjun Dey (1), Albert-L. Barabási (3, 4, 5), Mark Vogelsberger (6), Lars Hernquist (7), and Karl Gebhardt (2)

ABSTRACT
The large-scale distribution of galaxies is generally analyzed using the two-point correlation function. However, this statistic does not capture the topology of the distribution, and it is necessary to resort to higher order correlations to break degeneracies. We demonstrate that an alternate approach using network analysis can discriminate between topologically different distributions that have similar two-point correlations. We investigate two galaxy point distributions, one produced by a cosmological simulation and the other by a Lévy walk, that have different topologies but yield the same power-law two-point correlation function. For the cosmological simulation, we adopt the redshift z = 0.58 slice from Illustris (Vogelsberger et al. 2014A) and select galaxies with stellar masses greater than 10^8 M⊙.
The two-point correlation function of these simulated galaxies follows a single power law, ξ(r) ∼ r^{-1.5}. Then, we generate Lévy walks matching the correlation function and abundance of the simulated galaxies. We find that, while the two simulated galaxy point distributions have the same abundance and two-point correlation function, their spatial distributions are very different; most prominently, filamentary structures, which are present in the simulation, are absent in Lévy fractals. To quantify these missing topologies, we adopt network analysis tools and measure diameter, giant component, and transitivity from networks built by a conventional friends-of-friends recipe with various linking lengths. Unlike the abundance and two-point correlation function, these network quantities reveal a clear separation between the two simulated distributions; therefore, the galaxy distribution simulated by Illustris is quantitatively not a Lévy fractal. We find that the described network quantities offer an efficient tool for discriminating topologies and for comparing observed and theoretical distributions.

Subject headings: methods: data analysis – galaxies: formation – galaxies: evolution – large-scale structure of Universe: network science

arXiv:1603.02285v1 [astro-ph.CO] 7 Mar 2016

Author affiliations:
1 National Optical Astronomy Observatory, Tucson, AZ 85719, USA
2 Department of Astronomy, The University of Texas at Austin, 2515 Speedway, Stop C1400, Austin, TX 78712, USA
3 Center for Complex Network Research and Department of Physics, Northeastern University, Boston, Massachusetts 02115, USA
4 Department of Medicine and Channing Division of Network Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, Massachusetts 02115, USA
5 Center for Network Science, Central European University, 1051, Budapest, Hungary
6 Department of Physics, Kavli Institute for Astrophysics and Space Research, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
7 Harvard-Smithsonian Center for Astrophysics, 60 Garden Street, Cambridge, Massachusetts 02138, USA

1. INTRODUCTION

Throughout the history of the Universe, various geometrical and topological features have formed, evolved, and vanished in the cosmic energy and matter distribution. It is undeniably critical to quantify and measure such features, since many of them can provide definitive probes for constraining important cosmological parameters.

During the past two decades, studies of anisotropic features in the cosmic microwave background (CMB), specifically acoustic peaks, have motivated the so-called Λ cold dark matter (ΛCDM) cosmology as a standard paradigm (e.g., Hinshaw et al. 2013, Aghanim et al. 2015) and marked a new step forward in precision cosmology. Various experiments (Levi et al. 2013, Delubac et al. 2015, Zhao et al. 2015), currently beginning or underway, are mapping out the expansion history of the Universe with unprecedented accuracy by measuring baryon acoustic oscillations (BAO). These experiments will also result in the most detailed maps of the large-scale galaxy distribution over a wide range in redshifts, from z ∼ 0 to z ∼ 3.

The successes of measuring CMB acoustic peaks and BAO features demonstrate how important the two-point correlation functions (or power spectra) are for quantifying cosmic structures. Higher order n-point correlation statistics are essential for analyzing cosmic structures. For example, the three- and four-point correlation functions (or bi- and tri-spectra) can constrain the non-Gaussianity of primordial quantum fluctuations (Barkats et al. 2014, Ade et al. 2015); however, these measures are computationally challenging (e.g., Kulkarni et al. 2007, Gil-Marín et al. 2015).

Along with the successful n-point statistics, many topological measurements have been introduced, such as genus numbers and Minkowski functionals (Gott, Weinberg & Melott 1987, Eriksen et al. 2004). To identify voids and filaments, various methods have been adapted from other fields of science, including minimum-spanning trees, watersheds, Morse theory, wavelets, and smoothed Hessian matrices (e.g., Barrow, Bhavsar & Sonoda 1985, Sheth et al. 2003, Martínez et al. 2005, Aragón-Calvo et al. 2007, Colberg 2007, Sousbie et al. 2008, Bond, Strauss & Cen 2010, Lidz et al. 2010, Cautun et al. 2013). While these topological diagnostics have provided important insights into the nature of structure in the Universe, this wide but heterogeneous range of applied methodologies reflects how difficult it is to find a consistent and comprehensive framework for quantifying and measuring the topology of the Universe, in contrast to the successful n-point statistics.

Many of these studies generate a continuous density field by smoothing the galaxy point distribution and then measure geometric topologies via genus numbers and Minkowski functionals. Our approach, which we term “network cosmology”, is to characterize the topology of the discrete point distribution directly using graph theory and network algorithms.

As a pilot study to explore new ways to quantify cosmic topologies, Hong & Dey (2015; hereafter, HD15) applied the analysis tools developed for the study of complex networks (e.g., Albert & Barabási 2002, Newman 2010) to the study of the large-scale galaxy distribution. The basic idea is to generate a graph (i.e., a “network”) composed of vertices (nodes) and edges (links) from a galaxy distribution, and then measure network quantities used in graph theory. In this paper, we demonstrate the utility of these techniques for differentiating between point distributions that have identical two-point correlations but different spatial distributions and topologies.

Our paper is organized as follows. In §2, as a more specific introduction to this paper, we offer a general discussion about what types of features can be measured from galaxy survey data, the strong and weak points of n-point statistics, and how network representations of galaxy distributions can improve our ability to quantify topological features in the Universe. In §3, we describe

invariant in the strict mathematical sense in order to be topologically meaningful. For example, the letters of the Roman alphabet are topological. We can consistently recognize letters irrespective of font or handwriting since each letter has its own distinct topology. “i”, “k”, and “l” are topologically very different even in mathematically rigorous measures. However, “i” and “j” are topologically indistinct. They are discerned instead by the differences in length and curve (angle). The process of reading, i.e., visually measuring the characteristics of each letter, is predominantly topological but includes geometric aspects.

In galaxy surveys, n-point statistics are typical measurements, as presented in §1. These are geometrically driven measurements; n-point correlation functions contain specific information about distances and angles between galaxies. From a practical standpoint, this renders n-point statistics computationally challenging, since computation times are dominated by the handling of geometric information.

If we are only interested in topological features, much of the geometric information is redundant. For example, if we need to count all triangles in a friends-of-friends network from a certain galaxy distribution, we can run a network algorithm to count all triangular subgraphs. We do not need to measure the three-point statistic for the problem of only counting triangles. Likewise, if we are interested in the number of holes for an object, we do not need to know whether it looks like