
The Millennium-XXL Project: Simulating the Galaxy Population of Dark Energy Universes

Modern cosmology as encoded in the leading ΛCDM model confronts astronomers with two major puzzles. One is that the main matter component in today's Universe appears to be a yet undiscovered elementary particle whose contribution to the cosmic density is more than 5 times that of ordinary baryonic matter. This cold dark matter (CDM) interacts extremely weakly with regular atoms and photons, so that gravity alone has affected its distribution since very early times. The other is a mysterious dark energy force field, which dominates the energy content of today's Universe and has led to its accelerated expansion in recent times. In the standard model, this component is described in terms of Einstein's cosmological constant ('Λ'). Uncovering the nature of dark energy has become one of the most actively pursued goals of observational cosmology.

In particular, the arrival of the largest galaxy surveys ever made is imminent, offering enormous scientific potential for new discoveries. Experiments like SDSS-III/BOSS or Pan-STARRS have started to scan the sky with unprecedented detail, considerably improving the accuracy of existing cosmological probes. This will likely lead to challenges of the standard ΛCDM paradigm for cosmic structure formation, and perhaps even to the discovery of new physics.

One of the most important aims of these galaxy surveys is to shed light on the nature of the dark energy via measurements of the redshift evolution of its equation of state.

However, the ability of these surveys to achieve this major scientific goal crucially depends on an accurate understanding of systematic effects and on a precise way to physically model the observations, in particular the scale-dependent bias between luminous red galaxies and the underlying dark matter distribution, or the impact of mildly non-linear evolution on the so-called baryonic acoustic oscillations (BAOs) measured in the power spectrum of galaxy clustering.

Simulations of the galaxy formation process are arguably the most powerful technique to accurately quantify and understand these effects. However, this is an extremely tough computational problem, because it requires ultra-large volume N-body simulations with sufficient mass resolution to identify the halos likely to host the galaxies seen in the surveys, and a realistic model to populate these halos with galaxies. Given the significant investments involved in the ongoing galaxy surveys, it is imperative to tackle these numerical challenges to ensure that accurate theoretical predictions become available, both to help to quantify and understand the systematic effects, and to extract the maximum amount of information from the observational data.

Figure 1: Dark matter distribution in the MXXL simulation on different scales. Each panel shows the projected density of dark matter in a slice of thickness 20 Mpc.

The State of the Art

The N-body method for the collisionless dynamics of dark matter is a long-established computational technique used to follow the growth of cosmic structure through gravitational instability. The Boltzmann-Vlasov system of equations is here discretized in terms of N fiducial simulation particles, whose motion is followed under their mutual gravitational forces in an expanding background space-time. While conceptually simple, calculating the long-range gravitational forces exactly represents an N²-problem, which quickly becomes prohibitively expensive for interesting problem sizes. Fortunately, it is sufficient to calculate the forces only approximately, and a variety of algorithms for doing so have been developed over the years.
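
To make the cost argument concrete, a direct pairwise summation of the forces could be sketched as follows. This is only a minimal illustration with Plummer softening; the names, units and softening choice are ours for illustration and do not reproduce the scheme of any production code.

#include <math.h>

/* Minimal direct-summation gravity: O(N^2) pairwise forces with
 * Plummer softening eps to avoid singularities at small separations.
 * pos[3*i..3*i+2] and acc[3*i..3*i+2] hold positions/accelerations,
 * mass[i] the particle masses; G is Newton's constant in code units. */
void direct_summation(int n, const double *pos, const double *mass,
                      double *acc, double G, double eps)
{
    for (int i = 0; i < n; i++) {
        acc[3*i] = acc[3*i+1] = acc[3*i+2] = 0.0;
        for (int j = 0; j < n; j++) {      /* every pair: ~N^2 operations */
            if (j == i)
                continue;
            double dx = pos[3*j]   - pos[3*i];
            double dy = pos[3*j+1] - pos[3*i+1];
            double dz = pos[3*j+2] - pos[3*i+2];
            double r2 = dx*dx + dy*dy + dz*dz + eps*eps;
            double f  = G * mass[j] / (r2 * sqrt(r2));
            acc[3*i]   += f * dx;
            acc[3*i+1] += f * dy;
            acc[3*i+2] += f * dz;
        }
    }
}

With 3 × 10¹¹ particles, the nested loop implies of order 10²³ pairwise interactions per time step, which is why tree and mesh approximations are indispensable.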

This allowed the sizes of cosmological simulations to steadily increase since the early 1980s, roughly doubling the particle number every 17 months and hence providing progressively more faithful models for the real Universe [1, 2, 3, 4]. Such simulations have proven to be an indispensable tool to understand the low- and high-redshift Universe by comparing the predictions of CDM to observations, since these calculations are the only way to accurately calculate the outcome of non-linear cosmic structure formation.

A milestone in the development of cosmological simulations was set by the Millennium Run (MR), performed by our group in 2004 [3]. This simulation was the first, and for many years the only, run with more than 10¹⁰ particles, exceeding the size of previous simulations by almost an order of magnitude. Its success was not only computational but most importantly scientific: more than 300 research articles in the fields of theoretical and observational cosmology have used the MR data-products since. The MR has an exquisite mass resolution and accuracy but, unfortunately, its volume is insufficient to get reliable statistics on large scales at the level needed for future surveys.


Figure 2: Differential halo abundance as a function of mass at the present epoch in the MXXL simulation. Note the good sampling far into the exponential tail. Apart from the last point, the error bars from counting statistics are smaller than the plotted symbols.
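
For orientation, the differential abundance plotted in Figure 2 is conceptually just a histogram of halo masses in logarithmic bins, normalized by the simulation volume and the bin width; the counting-statistics error bars mentioned in the caption follow from the raw counts per bin. A minimal sketch (bin edges, units and function names are illustrative):

#include <math.h>

/* Differential halo mass function dn/dlog10(M): count halos in
 * logarithmic mass bins and normalize by the simulation volume
 * and the bin width.  masses[] in solar masses, volume in Mpc^3. */
void mass_function(int nhalos, const double *masses, double volume,
                   double log_m_min, double dlogm, int nbins, double *dndlogm)
{
    for (int b = 0; b < nbins; b++)
        dndlogm[b] = 0.0;

    for (int i = 0; i < nhalos; i++) {
        int b = (int) floor((log10(masses[i]) - log_m_min) / dlogm);
        if (b >= 0 && b < nbins)
            dndlogm[b] += 1.0;          /* raw counts give the Poisson errors */
    }

    for (int b = 0; b < nbins; b++)
        dndlogm[b] /= volume * dlogm;   /* halos per Mpc^3 per dex in mass */
}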

Recently, French and Korean collaborations [5, 6] have successfully carried out simulations containing almost 70 billion particles, but at considerably worse mass and spatial resolution than the MR, which did not allow them to robustly identify Milky-Way sized halos. Also, the need to manipulate and analyze a huge volume of data has proven to be a non-trivial challenge in working with simulations of this size, a fact that has made scientific exploitation of these projects difficult.

We have therefore set out to perform a new ultra-large N-body simulation of the hierarchical clustering of dark matter, featuring a new strategy for dealing with the data volume, and combining it with semi-analytical modelling of galaxy formation, which allows a prediction of all the luminous properties of the galaxies that form in the simulation. We designed the simulation project, dubbed Millennium-XXL (MXXL), to follow more than 303 billion particles (6720³) in a cosmological box 4.2 Gpc across, resolving the cosmic web with an unprecedented combination of volume and resolution. While the particle mass of the MXXL is slightly worse than that of the MR, its resolution is sufficient to accurately measure dark matter merger histories for halos hosting luminous galaxies, within a volume more than 200 times larger than that of the MR. In this way the simulation can provide extremely accurate statistics of the large-scale structure of the Universe by resolving around 500 million galaxies at the present epoch, allowing for example highly detailed clustering studies based on galaxy or cluster samples selected in a variety of different ways. This comprehensive information is indispensable for the correct analysis of future observational datasets.

The Computational Challenge

However, it was clear from the start that performing a simulation with these characteristics poses severe challenges, involving raw execution time, scalability of the algorithms employed, as well as their memory consumption and the disk space required for the output data. For example, simply storing the positions and velocities of the simulation particles in single precision consumes of order 10 TB of RAM. This figure, of course, is greatly enlarged by the extra memory required by the complex data structures and algorithms employed in the simulation code for the force calculation, domain decomposition, and halo and subhalo finding.

The code we used is a novel version of GADGET-3 that we wrote specifically for the MXXL project. GADGET computes short-range gravitational forces with a hierarchical tree algorithm, and long-range forces with an FFT-based particle-mesh scheme [7]. Both the force computation and the time stepping of GADGET are fully adaptive. The code is written in highly portable C and uses a spatial domain decomposition to map different parts of the computational domain to individual processors. In 2005, we publicly released GADGET-2, which presently is the most widely employed code for cosmic structure formation.

Ultimately, memory requirements are the most serious limiting factor for cosmological simulations of the kind studied here. We have therefore put significant effort into developing algorithms and strategies that minimize memory consumption in our code, while at the same time retaining high integration accuracy and calculational speed.
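
The idea behind such a tree/mesh force split can be illustrated with the standard Gaussian (Ewald-like) splitting of the pairwise force: the tree handles a short-range part that is smoothly truncated beyond a split scale r_s, while the complementary smooth part is obtained from the FFT mesh. The sketch below shows this textbook split only; the exact factors and scales used inside GADGET-3 are not reproduced here.

#include <math.h>

#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif

/* Standard TreePM force split: the exact 1/r^2 pair force is multiplied
 * by a short-range factor that decays to zero for r >> r_s, so the tree
 * only has to handle nearby particles; the complementary smooth part is
 * supplied by an FFT-based particle-mesh solver on large scales.
 * (Gaussian/Ewald-like split; r_s is the split scale.)                 */
double shortrange_factor(double r, double r_s)
{
    double x = r / (2.0 * r_s);
    return erfc(x) + (2.0 * x / sqrt(M_PI)) * exp(-x * x);
}

/* Short-range contribution to the acceleration from a particle of mass m
 * at distance r (magnitude only); the PM force supplies the remainder.  */
double shortrange_accel(double G, double m, double r, double r_s)
{
    return (G * m / (r * r)) * shortrange_factor(r, r_s);
}
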
The special "lean" version of our code that we developed for MXXL requires only 72 bytes per particle at the peak for the ordinary dynamical evolution of MXXL. The sophisticated in-lined group and substructure finders add a further 26 bytes to the peak memory consumption. This means that the memory requirement for the target size of 303 billion particles of our simulation amounts to slightly more than 29 TB of RAM.

The JuRoPa machine at the Jülich Supercomputing Centre (JSC) appeared as one of the best suited supercomputers to fulfill our computing requirements, thanks to its 3 GB of main memory per compute core. Still, on JuRoPa the simulation demanded a partition of 1,536 nodes, each equipped with two quad-core X5570 processors and 24 GB of RAM, translating to 12,288 cores in total. This represents a substantial fraction (70%) of the whole supercomputer installation. In fact, normal user operation on such a large partition is still unusual and not done on a regular basis yet. It hence required substantial support from the system administrators at JSC to allow us to carry out the MXXL production calculation on JuRoPa. In particular, severe problems with the memory usage of our simulation were encountered initially, as the code required essentially all the available physical memory of the compute nodes, leaving very little room for memory buffers allocated by the MPI library or by the parallel Lustre filesystem attached to JuRoPa. Fortunately, with help from JSC and ParTec, the software maker behind the MPI software stack on JuRoPa, these problems could be overcome.

For the MXXL simulation, we decided to try a new hybrid parallelization scheme that we recently developed for our GADGET code. Instead of using 12,288 distributed-memory MPI tasks, we only employed one MPI task per processor (3,072 in total), exploiting the 4 cores of each socket via threads. This mixture of distributed and shared memory parallelism proved ideal to limit memory and work-load imbalance losses, because a smaller number of MPI tasks reduces the number of independent spatial domains into which the volume needs to be decomposed, as well as the data volume that needs to be shuffled around with MPI calls. The thread-based parallelization itself was partially done via explicit Posix thread programming, and partially via OpenMP directives for the simpler parallel constructs. The Posix-threads approach has allowed us to achieve effectively perfect parallelizability of the gravitational tree-walk in our code, which dominates the computational cost.

Still, the total CPU time required for the simulation was substantial. This originates both from the huge number of particles and from the very dissimilar time scales and matter densities involved in the problem. Very different density configurations are encountered in the clustered state produced by the simulation, and consequently there are structures featuring a large variety of timescales. Our code tackles this problem by using spatially and temporally adaptive time-stepping, so that short time-steps are used only when particles enter the localized dense regions where dynamical times are short. In addition, we use a new domain decomposition scheme that produces almost ideal scaling on massively parallel computers.
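
A minimal sketch of such a hybrid scheme is shown below: the particles are divided among MPI ranks (here by trivial striding, standing in for the real domain decomposition), and each rank spreads its local tree walks over the cores of its socket with OpenMP threads. The function name and the flat loop are placeholders; in the production code the tree walk itself is threaded with explicit Posix threads.

#include <mpi.h>
#include <omp.h>

/* Hypothetical per-particle tree walk; in the real code this recursively
 * opens tree nodes and accumulates partial forces.                      */
extern void tree_walk_single(int target_particle);

/* Hybrid parallel force loop: the particle set is split across MPI ranks
 * (trivial striding here for brevity), and each rank then spreads its
 * local work over the cores of its socket with OpenMP threads, so that
 * only one MPI task per socket is needed.                               */
void compute_forces(int ntot)
{
    int rank, nranks;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nranks);

    #pragma omp parallel for schedule(dynamic, 1024)
    for (int i = rank; i < ntot; i += nranks)
        tree_walk_single(i);

    MPI_Barrier(MPI_COMM_WORLD);   /* wait for all ranks before the kick */
}

With one MPI task per quad-core socket instead of one per core, the number of independent spatial domains and the volume of data shuffled around in MPI calls shrink accordingly.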

In spite of these optimizations, the scope of the computational problem remained huge. The final production run of MXXL required more than 86 trillion force calculations and took slightly more than 2.7 million CPU hours (roughly 300 CPU years), corresponding to about 9.3 days of runtime on 12,288 cores.

A significant fraction, 15%, of the time was however spent on running our sophisticated on-the-fly postprocessing software, notably the group finding, the substructure finding, and the power spectrum calculation, and another 14% was needed for the I/O operations. The long-range force calculations, based on 9216³-sized FFTs, consumed only about 3% of the time. We note that the parallel "friends-of-friends" group finding for the 303 billion particles at the final output time took just 470 seconds, which we think is a remarkable achievement. Determining all gravitationally bound substructures within these halos took a further 2,600 seconds. It is also during this substructure finding that the peak memory consumption of the code is reached.

Another demanding aspect of the MXXL project lies in the data handling. The production code is able to write data in parallel to disk. Despite the huge size of our data sets, the restart files, amounting to 10 TB in total, could be written to or read from disk in about 40 minutes on the JuRoPa system. However, of larger practical importance for us is the aggregated size of the outputs produced by the simulation. If an approach similar to the MR was taken for the MXXL project, then the disk space requirement would approach 700 TB for the storage of the dark matter particle positions, velocities and group catalogues, a prohibitively large amount of data. To cope with this problem, we have therefore carried out a significant part of the post-processing on-the-fly during the simulation, avoiding the need to store the particle phase-space variables on disk for most of the output times. This postprocessing included the identification and characterization of halos and subhalos, and the calculation of the dark matter density field and its time derivative. As a result, the total disk space required for the data products of the simulation could be shrunk to about 100 TB.

First Results and Outlook

In Figure 1, we show the projected density field of the MXXL simulation in a series of thin slices through the simulation volume. This gives a hint of the enormous statistical resolving power of the simulation, and of its large dynamic range everywhere in the simulation cube. The gravitational spatial resolution of the simulation translates to a dynamic range of 300,000 per dimension, or formally more than 10¹⁶ resolution elements in 3D.

We have found 650 million halos at redshift zero in the simulation, binding 44% of all particles into non-linear structures. In total there are more than 25 billion halos in the group catalogues that we produced and that are linked together in our merger trees. In Figure 2, we show the halo mass function at the present epoch, which is one of the most basic and important simulation predictions, describing the abundance of objects as a function of mass and epoch.
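
The "friends-of-friends" linking used in the group finding mentioned above can be written down very compactly with a union-find structure, as in the serial sketch below; the O(N²) pair search is kept only for clarity, whereas the actual parallel finder restricts the neighbour search with the tree and the distributed domains.

/* Union-find with path halving: parent[] initially points to self. */
static int find_root(int *parent, int i)
{
    while (parent[i] != i) {
        parent[i] = parent[parent[i]];
        i = parent[i];
    }
    return i;
}

/* Friends-of-friends: link every pair of particles with separation below
 * the linking length b (typically a fixed fraction of the mean particle
 * spacing) and merge them into the same group.  Serial O(N^2) for clarity
 * only; real finders prune the pair search with a tree or a mesh.        */
void fof_groups(int n, const double *pos, double b, int *parent)
{
    for (int i = 0; i < n; i++)
        parent[i] = i;

    for (int i = 0; i < n; i++)
        for (int j = i + 1; j < n; j++) {
            double dx = pos[3*i]   - pos[3*j];
            double dy = pos[3*i+1] - pos[3*j+1];
            double dz = pos[3*i+2] - pos[3*j+2];
            if (dx*dx + dy*dy + dz*dz < b*b)
                parent[find_root(parent, i)] = find_root(parent, j);
        }

    /* afterwards, particles sharing the same root belong to the same halo */
}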

The largest cluster of galaxies at z=0 has a mass of 9 × 10¹⁵ solar masses. Such extreme objects are so rare that they can only be found in volumes as large as that of the MXXL. In Figure 3, we show a map of the time derivative of the gravitational potential of the simulation at redshift z=12. We produced such maps on-the-fly during the simulation in order to study the integrated Sachs-Wolfe effect, which describes distortions of the cosmic microwave background by foreground structures.

The full analysis of the MXXL in terms of its predicted galaxy distributions has just begun, and will still take some time to complete. We can, for example, use the MXXL to predict the spatial distribution of galaxies in the past light-cone of a fiducial observer. Without any replication of the box, we will be able to construct all-sky synthetic galaxy catalogues up to redshift 0.57.
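
The quoted limit of z = 0.57 is simple geometry: an observer placed at the centre of the box can see out to half the box side, about 2.1 Gpc, before replication would be needed, and in a flat ΛCDM model the comoving distance reaches this value near z ≈ 0.57. The small check below assumes Millennium-like parameters (Ωm ≈ 0.25, h ≈ 0.73), chosen purely for illustration.

#include <math.h>
#include <stdio.h>

/* Dimensionless Hubble rate E(z) = H(z)/H0 for a flat LCDM cosmology. */
static double E(double z, double om)
{
    return sqrt(om * pow(1.0 + z, 3.0) + (1.0 - om));
}

/* Comoving distance D_C(z) = (c/H0) * integral from 0 to z of dz'/E(z'),
 * evaluated with a simple trapezoidal rule.                             */
static double comoving_distance(double z, double om, double h)
{
    const double c_over_H0 = 299792.458 / (100.0 * h);  /* in Mpc */
    const int nsteps = 1000;
    double dz = z / nsteps, sum = 0.0;

    for (int i = 0; i < nsteps; i++) {
        double z0 = i * dz, z1 = (i + 1) * dz;
        sum += 0.5 * (1.0 / E(z0, om) + 1.0 / E(z1, om)) * dz;
    }
    return c_over_H0 * sum;
}

int main(void)
{
    /* Illustrative Millennium-like parameters, not the exact MXXL values. */
    double d = comoving_distance(0.57, 0.25, 0.73);
    printf("D_C(z=0.57) = %.0f Mpc, about half of the 4.2 Gpc box side\n", d);
    return 0;
}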


Figure 3: Map of the time-derivative of the gravitational potential at z=12 in the MXXL simulation. Such maps can be used to calculate the integrated Sachs-Wolfe effect along the backwards light-cone to the last scattering surface of the cosmic microwave background.

These mock observations are crucial for future surveys to study cluster finding algorithms, the influence of projection effects, the evolution of the clustering along the line-of-sight, and, most importantly, the systematic effects in the cosmological constraints derived from BAO, weak lensing, or cluster counts. Using a novel scaling technique we have just developed [8], it is furthermore possible to cast the simulation results into theoretical models for the galaxy distribution that cover a significant range of cosmologies.

The Millennium-XXL is by far the largest cosmological simulation ever performed and the first multi-hundred billion particle run. The scope of the simulation project has pushed the envelope of what is feasible on current world-class HPC resources, but the expected rich scientific return from the project makes it well worth the effort. At the same time, the successful scaling of the cosmological code to well beyond ten thousand cores is an encouraging sign for computational research in cosmology, which will address yet more demanding problems on future HPC machines.

References

[1] Davis, M., Efstathiou, G., Frenk, C. S., White, S. D. M., 1985, The evolution of large-scale structure in a universe dominated by cold dark matter, Astrophysical Journal, 292, 371

[2] Navarro, J. F., Frenk, C. S., White, S. D. M., 1996, The Structure of Cold Dark Matter Halos, Astrophysical Journal, 462, 563

[3] Springel, V., White, S. D. M., Jenkins, A., et al., 2005, Simulations of the formation, evolution and clustering of galaxies and quasars, Nature, 435, 629

[4] Springel, V., Wang, J., Vogelsberger, M., et al., 2008, The Aquarius Project: the subhaloes of galactic haloes, Monthly Notices of the Royal Astronomical Society, 391, 1685

[5] Teyssier, R., Pires, S., Prunet, S., et al., 2009, Full-sky weak-lensing simulation with 70 billion particles, Astronomy and Astrophysics, 497, 335

[6] Kim, J., Park, C., Gott, J. R., Dubinski, J., 2009, The Horizon Run N-Body Simulation: Baryon Acoustic Oscillations and Topology of Large-scale Structure of the Universe, Astrophysical Journal, 701, 1547

[7] Springel, V., 2005, The cosmological simulation code GADGET-2, Monthly Notices of the Royal Astronomical Society, 364, 1105

[8] Angulo, R. E., White, S. D. M., 2010, One simulation to fit them all - changing the background parameters of a cosmological N-body simulation, Monthly Notices of the Royal Astronomical Society, 405, 143

• Volker Springel 1
• Raul Angulo, Simon D. M. White 2
• Adrian Jenkins, Carlos S. Frenk, Carlton Baugh, Shaun Cole 3

1 Heidelberg Institute for Theoretical Studies, Heidelberg, Germany
2 Max-Planck-Institute for Astrophysics, Garching, Germany
3 Institute for Computational Cosmology, University of Durham, Durham, UK
