The Illustristng Simulations: Public Data Release
Total Page:16
File Type:pdf, Size:1020Kb
Dylan Nelson et al. RESEARCH The IllustrisTNG Simulations: Public Data Releasey Dylan Nelson1*, Volker Springel1,6,7, Annalisa Pillepich2, Vicente Rodriguez-Gomez8, Paul Torrey9,5, Shy Genel4, Mark Vogelsberger5, Ruediger Pakmor1,6, Federico Marinacci10,5, Rainer Weinberger3, Luke Kelley11, Mark Lovell12,13, Benedikt Diemer3 and Lars Hernquist3 y Permanently available at www.tng-project.org/data Main Text 1 Introduction Some of our most powerful tools for understanding the origin and evolution of large-scale cosmic structure and Abstract the galaxies which form therein are cosmological simu- We present the full public release of all data lations. From pioneering beginnings (Davis et al., 1985; from the TNG50, TNG100 and TNG300 simu- Press and Schechter, 1974), dark matter, gravity-only lations of the IllustrisTNG project. IllustrisTNG simulations have evolved into cosmological hydrody- is a suite of large volume, cosmological, gravo- namical simulations (Katz et al., 1992). These aim to magnetohydrodynamical simulations run with the self-consistently model the coupled evolution of dark moving-mesh code Arepo. TNG includes a com- matter, gas, stars, and blackholes at a minimum, and prehensive model for galaxy formation physics, and are now being extended to also include magnetic fields, each TNG simulation self-consistently solves for the radiation, cosmic rays, and other fundamental physi- coupled evolution of dark matter, cosmic gas, lumi- cal components. Such simulations provide foundational nous stars, and supermassive blackholes from early support in our understanding of the ΛCDM cosmolog- time to the present day, z = 0. Each of the flagship ical model, including the nature of both dark matter runs { TNG50, TNG100, and TNG300 { are ac- and dark energy. companied by halo/subhalo catalogs, merger trees, Modern large-volume simulations now capture cos- lower-resolution and dark-matter only counterparts, mological scales of tens to hundreds of comoving mega- all available with 100 snapshots. We discuss scien- parsecs, while simultaneously resolving the internal tific and numerical cautions and caveats relevant structure of individual galaxies at . 1 kpc scales. Re- when using TNG. cent examples reaching z = 0 include Illustris (Genel The data volume now directly accessible online et al., 2014; Vogelsberger et al., 2014b), EAGLE (Crain is ∼1.1 PB, including 2,000 full volume snapshots et al., 2015; Schaye et al., 2015), Horizon-AGN (Dubois and ∼115,000 high time-resolution subbox snap- et al., 2014), Romulus (Tremmel et al., 2017), Simba shots. Data access and analysis examples are avail- (Dav´eet al., 2019), Magneticum (Dolag et al., 2016), able in IDL, Python, and Matlab. We describe im- among others. These simulations produce observation- provements and new functionality in the web-based ally verifiable outcomes across a diverse range of astro- API, including on-demand visualization and analysis physical regimes, from the stellar and gaseous proper- arXiv:1812.05609v3 [astro-ph.GA] 29 Jan 2021 of galaxies and halos, exploratory plotting of scal- ties of galaxies, galaxy populations, and the supermas- ing relations and other relationships between galac- sive blackholes they host, to the expected distribution tic and halo properties, and a new JupyterLab inter- of molecular, neutral, and ionized gas tracers across face. This provides an online, browser-based, near- interstellar, circumgalactic, and intergalactic scales, in native data analysis platform enabling user compu- addition to the expected distribution of the dark mat- tation with local access to TNG data, alleviating the ter component itself. need to download large datasets. Complementary efforts, although not the focus of this data release, include high redshift reionization- Keywords: methods: data analysis; methods: era simulations such as BlueTides (Feng et al., 2016), numerical; galaxies: formation; galaxies: evolution; data management systems; data access methods, *Correspondence: [email protected] 1Max-Planck-Institut f¨urAstrophysik, Karl-Schwarzschild Str. 1, 85741 distributed architectures Garching, Germany Full list of author information is available at the end of the article Dylan Nelson et al. Page 2 of 30 Sphinx (Rosdahl et al., 2018), and CoDa II (Ocvirk Other dark-matter only simulations have released et al., 2018), among others. In addition, `zoom' simu- data with similar approaches, including Bolshoi and lation campaigns include NIHAO (Wang et al., 2015), MultiDark (CosmoSim; Klypin et al., 2011; Riebe FIRE-2 (Hopkins et al., 2018), and Auriga (Grand et al., 2013), DEUS (Rasera et al., 2010), and MICE et al., 2017), in addition to many others. These have (Cosmohub; Crocce et al., 2010). In contrast, some provided numerous additional insights into many ques- recent simulation projects have made group catalogs tions in galaxy evolution (recent progress reviewed in and/or snapshots available for direct download, in- Faucher-Gigu`ere, 2018). For instance, reionization sim- cluding MassiveBlack-II (Khandai et al., 2014), the ulations may be able to include explicit radiative trans- Dark Sky simulation (Skillman et al., 2014), ν2GC fer, and zoom simulations may be able to reach higher (Makiya et al., 2016), and Abacus (Garrison et al., resolutions and/or more rapidly explore model varia- 2018). Skies and Universes (Klypin et al., 2017) or- tions, in comparison to large cosmological volume sim- ganizes a number of such data releases. With respect ulations. to Illustris, the most comparable in simulation type, Observational efforts studying the properties of data complexity, and scientific scope is the recent pub- galaxies across cosmic time provide ever richer datasets. lic data release of the Eagle simulation, described in Surveys such as SDSS (York et al., 2000), CANDELS McAlpine et al.(2016) (see also Camps et al., 2018). (Grogin et al., 2011), 3D-HST (Brammer et al., 2012), The initial group catalog release was modeled on the LEGA-C (van der Wel et al., 2016), SINS/zC-SINF Millennium database architecture, and the raw snap- and KMOS3D (Genzel et al., 2014; Wisnioski et al., shot data was also subsequently made available (The 2015), KBSS (Steidel et al., 2014), and MOSDEF EAGLE team, 2017). More recently, significant in- (Kriek et al., 2015) provide local and high redshift frastructure research and development has focused on measurements of the statistical properties of galaxy providing remote computational resources, including populations. Complementary, spatially-resolved data the NOAO Data Lab (Fitzpatrick et al., 2014) and has recently become available from large, z = 0 IFU the SciServer project (Medvedev et al., 2016; Raddick surveys such as MANGA (Bundy et al., 2015), CAL- et al., 2017). Web-based orchestration projects also in- IFA (S´anchez et al., 2012) and SAMI (Bryant et al., clude Ragagnin et al.(2017), Tangos (Pontzen and 2015). Tremmel, 2018), and Jovial (Araya et al., 2018). In order to inform theoretical models using obser- The public release of IllustrisTNG (hereafter, TNG) vational constraints, as well as to interpret observa- follows upon and further develops tools and ideas pi- tional results using realistic cosmological models, pub- oneered in the original Illustris data release. We of- lic data dissemination from both observational and fer direct online access to all snapshot, group cata- simulation campaigns is required. Observational data release has a successful history dating back at least log, merger tree, and supplementary data catalog files. to the SDSS SkyServer (Szalay et al., 2000, 2002), In addition, we develop a web-based API which al- which provides tools for remote users to query and lows users to perform many common tasks without acquire large datasets (Gray et al., 2002; Szalay et al., the need to download any full data files. These include 2002). The still-in-use approach is based on user writ- searching over the group catalogs, extracting particle ten SQL queries, which provide search results as well data from the snapshots, accessing individual merger as data acquisition. From the theoretical community, trees, and requesting visualization and further data the public data release of the Millennium simulation analysis functions. Extensive documentation and pro- (Springel et al., 2005) was the first attempt of simi- grammatic examples (in the IDL, Python, and Matlab lar scale. Modeled on the SDSS approach, data was languages) are provided. stored in a relational database, with an immediately This paper is intended primarily as an overview recognizable SQL-query interface (Lemson and Virgo guide for TNG data users, describing updates and new Consortium, 2006). It has since been extended to in- features, while exhaustive documentation will be main- clude additional simulations, including Millennium-II tained online. In Section2 we give an overview of the (Boylan-Kolchin et al., 2009; Guo et al., 2011), and simulations. Section3 describes the data products, and a first attempt at the idea of a \virtual observatory" Section4 discusses methods for data access. In Sec- (VO) was realized (Overzier et al., 2013). The Theo- tion5 we present some scientific remarks and cautions, retical Astrophysical Observatory (TAO; Bernyk et al., while in Section6 we discuss community considera- 2014) was similarly focused around mock observations tions including citation requests. Section7 describes of simulated galaxy and galaxy survey data. Explo- technical details related to the data release architec- rations continue on how to best