The Illustristng Simulations: Public Data Release
Total Page:16
File Type:pdf, Size:1020Kb
Nelson et al. Computational Astrophysics and Cosmology (2019)6:2 https://doi.org/10.1186/s40668-019-0028-x R E S E A R C H Open Access The IllustrisTNG simulations: public data release Dylan Nelson1*,VolkerSpringel1,6,7, Annalisa Pillepich2, Vicente Rodriguez-Gomez8,PaulTorrey9,5, Shy Genel4, Mark Vogelsberger5, Ruediger Pakmor1,6, Federico Marinacci10,5, Rainer Weinberger3, Luke Kelley11,MarkLovell12,13, Benedikt Diemer3 and Lars Hernquist3 Abstract We present the full public release of all data from the TNG100 and TNG300 simulations of the IllustrisTNG project. IllustrisTNG is a suite of large volume, cosmological, gravo-magnetohydrodynamical simulations run with the moving-mesh code AREPO. TNG includes a comprehensive model for galaxy formation physics, and each TNG simulation self-consistently solves for the coupled evolution of dark matter, cosmic gas, luminous stars, and supermassive black holes from early time to the present day, z = 0. Each of the flagship runs—TNG50, TNG100, and TNG300—are accompanied by halo/subhalo catalogs, merger trees, lower-resolution and dark-matter only counterparts, all available with 100 snapshots. We discuss scientific and numerical cautions and caveats relevant when using TNG. The data volume now directly accessible online is ∼750 TB, including 1200 full volume snapshots and ∼80,000 high time-resolution subbox snapshots. This will increase to ∼1.1 PB with the future release of TNG50. Data access and analysis examples are available in IDL, Python, and Matlab. We describe improvements and new functionality in the web-based API, including on-demand visualization and analysis of galaxies and halos, exploratory plotting of scaling relations and other relationships between galactic and halo properties, and a new JupyterLab interface. This provides an online, browser-based, near-native data analysis platform enabling user computation with local access to TNG data, alleviating the need to download large datasets. Keywords: Methods: data analysis; Methods: numerical; Galaxies: formation; Galaxies: evolution; Data management systems; Data access methods; Distributed architectures 1 Introduction simulations provide foundational support in our under- Some of our most powerful tools for understanding the standing of the ΛCDM cosmological model, including the origin and evolution of large-scale cosmic structure and nature of both dark matter and dark energy. the galaxies which form therein are cosmological simula- Modern large-volume simulations now capture cosmo- tions. From pioneering beginnings (Press and Schechter logical scales of tens to hundreds of comoving mega- 1974;Davisetal.1985), dark matter, gravity-only simula- parsecs, while simultaneously resolving the internal struc- tions have evolved into cosmological hydrodynamical sim- ture of individual galaxies at 1 kpc scales. Recent exam- ulations (Katz et al. 1992). These aim to self-consistently ples reaching z = 0 include Illustris (Vogelsberger et al. model the coupled evolution of dark matter, gas, stars, 2014a;Geneletal.2014), EAGLE (Schaye et al. 2015;Crain and black holes at a minimum, and are now being ex- tended to also include magnetic fields, radiation, cosmic et al. 2015), Horizon-AGN (Dubois et al. 2014), Romu- rays, and other fundamental physical components. Such lus (Tremmel et al. 2017), Simba (Davé et al. 2019), Mag- neticum (Dolag et al. 2016), among others. These simula- tions produce observationally verifiable outcomes across a *Correspondence: [email protected] 1Max-Planck-Institut für Astrophysik, Garching, Germany diverse range of astrophysical regimes, from the stellar and Full list of author information is available at the end of the article gaseous properties of galaxies, galaxy populations, and the © The Author(s) 2019. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. Nelson et al. Computational Astrophysics and Cosmology (2019)6:2 Page 2 of 29 supermassive black holes they host, to the expected distri- Explorations continue on how to best deliver theoretical bution of molecular, neutral, and ionized gas tracers across results within the existing VO framework (Lemson and interstellar, circumgalactic, and intergalactic scales, in ad- Zuther 2009;Lemsonetal.2014). dition to the expected distribution of the dark matter com- Other dark-matter only simulations have released data ponent itself. with similar approaches, including Bolshoi and MultiDark Complementary efforts, although not the focus of this (CosmoSim; Klypin et al. 2011;Riebeetal.2013), DEUS data release, include high redshift reionization-era simu- (Rasera et al. 2010), and MICE (Cosmohub; Crocce et al. lationssuchasBlueTides(Fengetal.2016), Sphinx (Ros- 2010). In contrast, some recent simulation projects have dahl et al. 2018), and CoDa II (Ocvirk et al. 2018), among made group catalogs and/or snapshots available for di- others. In addition, ‘zoom’ simulation campaigns include rect download, including MassiveBlack-II (Khandai et al. NIHAO (Wang et al. 2015), FIRE-2 (Hopkins et al. 2018), 2014), the Dark Sky simulation (Skillman et al. 2014), ν2GC and Auriga (Grand et al. 2017), in addition to many others. (Makiya et al. 2016), and Abacus (Garrison et al. 2018). These have provided numerous additional insights into Skies and Universes (Klypin et al. 2017)organizesanum- many questions in galaxy evolution (recent progress re- ber of such data releases. With respect to Illustris, the most viewed in Faucher-Giguère 2018). For instance, reioniza- comparable in simulation type, data complexity, and sci- tion simulations may be able to include explicit radiative entific scope is the recent public data release of the Eagle transfer, and zoom simulations may be able to reach higher resolutions and/or more rapidly explore model variations, simulation, described in (McAlpine et al. 2016)(seealso in comparison to large cosmological volume simulations. Camps et al. 2018). The initial group catalog release was Observational efforts studying the properties of galax- modeled on the Millennium database architecture, and the ies across cosmic time provide ever richer datasets. Sur- raw snapshot data was also subsequently made available veys such as SDSS (York et al. 2000), CANDELS (Grogin (The EAGLE team 2017). More recently, significant infras- et al. 2011), 3D-HST (Brammer et al. 2012), LEGA-C (van tructure research and development has focused on provid- der Wel et al. 2016), SINS/zC-SINF and KMOS3D (Gen- ing remote computational resources, including the NOAO zel et al. 2014; Wisnioski et al. 2015), KBSS (Steidel et al. Data Lab (Fitzpatrick et al. 2014)andtheSciServerproject 2014), and MOSDEF (Kriek et al. 2015) provide local and (Medvedev et al. 2016; Raddick et al. 2017). Web-based high redshift measurements of the statistical properties orchestration projects also include (Ragagnin et al. 2017), of galaxy populations. Complementary, spatially-resolved Tangos (Pontzen et al. 2018), and Jovial (Araya et al. 2018). data has recently become available from large, z =0IFU The public release of IllustrisTNG (hereafter, TNG) fol- surveys such as MANGA (Bundy et al. 2015), CALIFA lows upon and further develops tools and ideas pioneered (Sánchez et al. 2012) and SAMI (Bryant et al. 2015). in the original Illustris data release. We offer direct on- In order to inform theoretical models using observa- line access to all snapshot, group catalog, merger tree, and tional constraints, as well as to interpret observational re- supplementary data catalog files. In addition, we develop a sults using realistic cosmological models, public data dis- web-based API which allows users to perform many com- semination from both observational and simulation cam- mon tasks without the need to download any full data files. paigns is required. Observational data release has a suc- These include searching over the group catalogs, extract- cessful history dating back at least to the SDSS SkyServer ing particle data from the snapshots, accessing individ- (Szalay et al. 2000, 2002), which provides tools for remote ual merger trees, and requesting visualization and further users to query and acquire large datasets (Gray et al. 2002; data analysis functions. Extensive documentation and pro- Szalay et al. 2002). The still-in-use approach is based on grammatic examples (in the IDL, Python, and Matlab lan- user written SQL queries, which provide search results guages) are provided. as well as data acquisition. From the theoretical commu- This paper is intended primarily as an overview guide nity, the public data release of the Millennium simula- tion (Springel et al. 2005)wasthefirstattemptofsimi- for TNG data users, describing updates and new features, lar scale. Modeled on the SDSS approach, data was stored while exhaustive documentation will be maintained on- in a relational database, with an immediately recogniz- line. In Sect. 2 we give an overview of the simulations. Sec- able SQL-query interface (Lemson and VirgoConsortium tion 3 describes the data products, and Sect. 4 discusses 2006). It has since been extended to include additional sim- methods for data access. In Sect. 5 we present some scien- ulations, including Millennium-II (Boylan-Kolchin et al. tific remarks