Spring 2014 • Vol

Editorial is published two times a year by The GAUSS Centre Editorial for Supercomputing (HLRS, LRZ, JSC) Welcome to this new issue of inside, tive of the German Research Society the journal on Innovative Supercomput- (DFG) start to create interesting re- Publishers ing in Germany published by the Gauss sults as is shown in our project section. Centre for Supercomputing (GCS). In Prof. Dr. A. Bode | Prof. Dr. Dr. Th. Lippert | Prof. Dr.-Ing. Dr. h.c. Dr. h.c. M. M. Resch this issue, there is a clear focus on With respect to hardware GCS is con- Editor applications. While the race for ever tinuously following its roadmap. New faster systems is accelerated by the systems will become available at HLRS F. Rainer Klank, HLRS [email protected] challenge to build an Exascale system in 2014 and at LRZ in 2015. System within a reasonable power budget, the shipment for HLRS is planned to start Design increased sustained performance allows on August 8 with a chance to start Linda Weinmann, HLRS [email protected] for the solution of ever more complex general operation as early as October Sebastian Zeeden, HLRS [email protected] problems. The range of applications 2014. We will report on further Pia Heusel, HLRS [email protected] presented is impressive covering a progress. variety of different fields. As usual, this issue includes informa- GCS has been a strong player in PRACE tion about events in supercomputing over the last years. In 2015 will con- in Germany over the last months and tinue to provide access to leading edge gives an outlook of workshops in the systems for all European researchers field. Readers are invited to participate through the evaluation process of in these workshops which are now PRACE. By the end of 2015 GCS will part of the PRACE Advanced Training have delivered CPU time worth 100 Center and hence open to all European Million Euro. It is noteworthy that com- researchers. putational costs as offered by GCS are • Prof. Dr. A. Bode even lower than comparable Cloud (LRZ) offerings. The concept of dedicated • Prof. Dr. Dr. Th. Lippert centers for research is working well (JSC) both for the user community and the funding agencies. For PRACE we will • Prof. Dr.-Ing. Dr. h.c. give a review on the results of the last Dr. h.c. M. M. Resch (HLRS) call or simulation proposals. HPC continues to be an important issue in the German research community. The working group of the Wissenschaftsrat (Science Council), which is currently evaluating the exist- ing German funding scheme for HPC and discussing possible improvements, is expected to deliver its report in July. At the same time a number of software initiatives like the HPC Software Initiative of the Federal Ministry of Science (BMBF) or the SPPEXA initia- 2 Spring 2014 • Vol. 12 No. 1 • inSiDE Spring 2014 • Vol. 12 No. 1 • inSiDE 3 News News PRACE: Results of the Optimizing a Seismic 8th Regular Call Wave Propagation Code for PetaScale-Simulations The Partnership for Advanced Computing granted a total of 100 million compute in Europe (PRACE) is continuously of- core hours on JUQUEEN. One research fering supercomputing resources on project has been granted resources The accurate numerical simulation of One of the biggest challenges today is to the highest level (tier-0) to European both on Hermit and SuperMUC. seismic wave propagation through a real- correctly simulate the high frequencies researchers. istic three-dimensional Earth model is in the wave field spectrum. These high Four of the newly awarded research pro- key to a better understanding of the frequencies can be in resonance to The Gauss Centre for Supercomputing jects are from Germany and the UK, Earth’s interior and is used for earth- buildings or enable a clearer image of (GCS) is currently dedicating shares each, three are from Italy, two are from quake simulation as well as hydrocarbon the subsurface due to their sensitivity of its IBM iDataPlex system SuperMUC France and Sweden, each, and one is exploration. Seismic waves generated to smaller structures. To match reality in Garching, of its Cray XE6 system from Belgium, Finland, the Netherlands, by an active (e.g. explosions or air-guns) and simulated results, high accuracy Hermit in Stuttgart, and of its IBM Blue Spain, and Switzerland, each. The re- or passive (e.g. earthquakes) source and fine resolution in combination with Gene/Q system JUQUEEN in Jülich. search projects awarded computing carry valuable information about the geometrical flexibility are the demands France, Italy, and Spain are dedicating time cover again many scientific areas, subsurface structure. Geophysicists on modern numerical schemes. Our shares on their systems CURIE, hosted from Astrophysics to Medicine and interpret the recorded waveforms to software package SeisSol implements by GENCI at CEA-TGCC in Bruyères- Life Sciences. More details, also on the create a geological model (Fig. 1). For- a highly optimized Arbitrary high-order Le-Châtel, FERMI, hosted by CINECA in projects granted access to the ma- ward simulation of waves support the DERivatives Discontinuous Galerkin Casalecchio di Reno, and MareNostrum, chines in France, Italy, and Spain, can survey setup assembly and hypothesis (ADER-DG) method that works on un- hosted by BSC in Barcelona. be found via the PRACE web page testing to expand our knowledge of structured tetrahedral meshes. The www.prace-ri.eu/PRACE-8th-Regular-Call. complicated wave propagation phenom- ADER-DG algorithm is high-order accu- The 8th call for proposals for computing ena. One of the most exciting applica- rate in space and time to ensure lowest time for the allocation time period The 9th call for proposals for the alloca- tions is the full simulation of the frictional dispersion errors for simulating waves March 4, 2014 to March 3, 2015 on tion time period September 2, 2014 to sliding during an earthquake and the propagating over large distances. • Walter Nadler the above systems closed October 15, September 1, 2015 closed March 25, excited waves that may cause damage 2013. Nine research projects have been 2014 and evaluation is still under way, on built infrastructure. Jülich granted a total of about 170 million as of this writing. The 10th call for Supercomputing compute core hours on Hermit, seven proposals is exptected to open in Sep- Centre (JSC) have been granted a total of about 120 tember 2015. million compute core hours on Super- MUC, and five proposals have been Details on calls can be found on www.prace-ri.eu/Call-Announcements. contact: Walter Nadler, [email protected] Figure 1: Waveforms of the 2009 L’Aquila earthquake (yellow star). Synthetic seismograms produced by SeisSol are depicted as red lines and compared to recorded real data (black lines) at 9 stations (red triangles). The seismograms span 800 seconds and are filtered to periods of T ≥ 33 s. Figure obtained from Wenk et al. (2013). 4 Spring 2014 • Vol. 12 No. 1 • inSiDE Spring 2014 • Vol. 12 No. 1 • inSiDE 5 News News In a collaboration of the research groups Optimizing SeisSol for 1 PetaFlop/s Sustained SuperMUC runs, and Mikhail Smelyanskiy on Geophysics at LMU Munich (Dr. PetaScale-Simulations Performance (Intel Labs) for his support in hardware- Christian Pelties, Dr. Alice-Agnes Gabriel, With this goal in mind, three key per- In cooperation with the Leibniz Super- aware optimization. Financial support Stefan Wenk) and HPC at TUM, Munich formance and scalability issues were computing Centre, several benchmark was provided by DFG, KONWIHR and (Prof. Dr. Michael Bader, Alexander addressed in SeisSol: runs were executed with SeisSol on Volkswagen Foundation (project ASCETE). Breuer, Sebastian Rettenberger, SuperMUC. In an Extreme Scaling Period Alexander Heinecke), SeisSol has been • Optimized Matrix Kernels: end of January 2014, a weak-scaling References re-engineered with the explicit goal In SeisSol’s ADER-DG method, the test with 60,000 grid cells per core [1] Breuer, A., Heinecke, A., Rettenberger, S., Bader, M., Gabriel, A.-A., Pelties, C. to realize petascale simulations and to element-local numerical operations are was conducted: SeisSol achieved a per- Sustained Petascale Performance of Seismic prepare SeisSol for the next generation implemented via matrix multiplications. formance of 1.4 PetaFlop/s computing Simulations with SeisSol on SuperMUC, of supercomputers. For these operations, highly optimized, a problem with nearly 9 billion grid cells International Supercomputing Conference hardware-aware kernels were generated and over 1,012 degrees of freedom. ISC’14, accepted. by a custom-made code generator. However, as the true highlight a simula- Depending on the sparsity pattern of tion under production conditions was [2] Breuer, A., Heinecke, A., Bader, M., the matrices and on the achieved time performed. We modelled seismic wave Pelties, C. Accelerating SeisSol by Generating Vectorized to solution, sparse or dense kernels propagation in the topographically com- Code for Sparse Matrix Operators, Parallel were selected for each individual opera- plex volcano Mount Merapi (Fig. 2) Computing–Accelerating Computational tion. Due to more efficient exploitation discretized by a mesh with close to 100 Science and Engineering (CSE), Advances in of current CPUs’ vector computing million cells. In 3 hours of computing Parallel Computing 25, IOS Press, 2014. features, speedups of 5 and higher were time the first 5 seconds of this process achieved. were simulated. During this simulation [3] Pelties, C., de la Punte, J., Ampuero, J.-P., SeisSol achieved a sustained perfor- Brietzke, G.B., Käser, M. Three-dimensional dynamic rupture simu- • Hybrid MPI+OpenMP-Parallelization: mance of 1.08 PetaFlop/s (equivalent lation with a high-order discontinuous SeisSol’s parallelization was carefully to 35% of peak performance, see the Galerkin method on unstructured tetra- Figure 2: Tetrahedral mesh and simulated wave field (after 4 seconds simulated changed towards a hybrid MPI+OpenMP performance plot in Fig.

Spring 2014 • Vol

Base Performance for Beegfs with Data Protection

Data Integration and Analysis 10 Distributed Data Storage

Globalfs: a Strongly Consistent Multi-Site File System

Towards Transparent Data Access with Context Awareness

The Leading Parallel Cluster File System, Developed with a St Rong Focus on Perf Orm Ance and Designed for Very Ea Sy Inst All Ati on and Management

Maximizing the Performance of Scientific Data Transfer By

Understanding and Finding Crash-Consistency Bugs in Parallel File Systems Jinghan Sun, Chen Wang, Jian Huang, and Marc Snir University of Illinois at Urbana-Champaign

Using On-Demand File Systems in HPC Environments

Performance and Scalability of a Sensor Data Storage Framework

Towards Transparent Data Access with Context Awareness

Beegfs Unofficial Documentation

Filesystems in IO500 3% Gridscaler Datawarp 2% Wekaio Matrix 2% Total Submissions Across All Lists: 147 5% SUSE Enterprise Storage 1% Lustre 29%