Accelerating Computational Science Symposium 2012

Accelerating Computational Science Symposium 2012 Hosted by: The Oak Ridge Leadership Computing Facility at Oak Ridge National Laboratory The National Center for Supercomputing Applications at the University of Illinois at Urbana-Champaign The Swiss National Supercomputing Centre at ETH Zurich Scientists Assess a New Supercomputing Approach and Judge It a Win Graphics processing units speed up research codes on hybrid computing platforms Can scientists and engineers benefit from extreme-scale supercomputers that use application-code accelerators called GPUs (for graphics processing units)? Comparing GPU accelerators with today’s fastest central processing units, early results from diverse areas of research show 1.5- to 3-fold speedups for most codes, which in turn increase the realism of simulations and decrease time to results. With the availability of new, higher-performance GPUs later this year, such as the Kepler GPU chip to be installed in the 20-petaflop Titan supercomputer, application speedups are expected to be even more substantial. n innovation disrupts the The Accelerating Computational marketplace when it confers Science Symposium 2012 (ACSS Aan undeniable advantage 2012), held March 28–30 in Wash- over earlier technologies. Develop- ington, D.C., brought together about ing a disruptive innovation—such 100 experts in science, engineering, as the telephone that so long ago and computing from around the world replaced the telegraph—can be a to share transformational advances high-risk, high-rewards venture. enabled by hybrid supercomputers in Early in a product’s life, it may be diverse domains including chemistry, unclear whether customers will derive combustion, biology, nuclear fusion enough gain from a new technology and fission, and seismology. Partici- to justify its developers’ pain. Then pants also discussed the new breadth unprecedented success compels a and scope of research possible as the community to embrace the innovation. performance of petascale systems, With supercomputers, which must which execute quadrillions of calcula- balance performance and energy tions per second, continues to in- consumption, hybrid architectures crease. At the start of the symposium, incorporate high-performance, energy- neither the scientific community nor Jack Wells, ORNL efficient scientific-application-code the government officials charged with accelerators called graphics process- providing supercomputing facilities to sors to speed discoveries in fields ing units (GPUs) as well as traditional meet the needs of researchers knew from seismology to space and aid central processing units (CPUs). For whether hybrid computing architec- innovations such as next-generation scientists who must beat competitors tures conferred incontestable benefits catalysts, drugs, materials, engines, to discoveries in a publish-or-perish to the scientists and engineers using and reactors. The overwhelming world or engineers who must invent leadership-class computing systems consensus: GPUs accelerate a broad and improve products for the global to conduct digital experiments. range of computationally intensive ap- marketplace, the ability to more fully To understand the impact hybrid plications exploring the natural world, exploit massive parallelism is proving architectures are making on science, from subatomic particles to the vast advantageous enough for GPU/ attendees heard from two dozen cosmos, and the engineered world, CPU hybrids to displace CPU-only researchers who use a combination from turbines to advanced fuels. architectures. of traditional and accelerated proces- 1 “The symposium [was] motivated by society’s great need for advances in energy technologies and by the demonstrated achievements and tremendous potential for computational science and engineering,” said meeting program chair Jack Wells, director of science at the Oak Ridge Leadership Computing Facility (OLCF), which co-hosted ACSS 2012 with the National Center for Supercomputing Applications (NCSA) and the Swiss National Supercomputing Centre (CSCS). Supercomputer vendor Cray Inc. and GPU inventor NVIDIA helped sponsor the event. “Compu- tational science on extreme-scale Monte Rosa, the Swiss National Supercomputing Centre’s 402-teraflop XE6 Cray hybrid computing architectures will supercomputer, uses only CPUs. The entire computational-science community benefits advance research and development in from efforts to optimize CPU performance. Image credit: Michele De Lorenzi/CSCS this decade, increase our understand- ing of the natural world, accelerate innovation, and as a result, increase economic opportunity,” Wells said. Concurring was James Hack, director of the National Center for Compu- tational Sciences, which houses the U.S. Department of Energy’s (DOE’s) Jaguar, the National Science Founda- tion’s (NSF’s) Kraken, and the National Oceanic and Atmospheric Administra- tion’s (NOAA’s) Gaea supercomputers. “As a community we’re really at a point where we need to embrace this disruptive approach to supercomputing,” he said. “This meeting was an Tödi, the Swiss National Supercomputing Centre’s 137-teraflop XK6 Cray opportunity to catalyze a broad dis- supercomputer, is the first GPU/CPU hybrid supercomputing system. cussion of how extreme-scale hybrid Image credit: Michele De Lorenzi/CSCS architectures are already accelerating progress in computationally focused system from Cray with Interlagos computational giant called Titan. So science research.” At its conclusion, CPUs and NVIDIA Fermi GPUs, has far the upgrade has converted the attendees had a better idea of where been in operation at CSCS since fall Cray XT5 system, an all-CPU model the community stood in exploiting 2011. Third, NCSA’s Blue Waters is a with an older, less-capable version of hybrid architectures. hybrid Cray system under construction the AMD processor, into a Cray XK6 at the University of Illinois at Urbana- system with Interlagos CPUs, twice The ACSS 2012 researchers Champaign that gives more weight to the memory of its predecessor, and presented results obtained with four CPUs than GPUs. It will have only the a more capable network. After the high-performance systems employ- Interlagos CPUs in 235 XE6 cabinets cabinets have been equipped with ing different computing strategies. and both CPUs and the NVIDIA Fermi NVIDIA’s newest Kepler GPUs by the First, CSCS’s 402-teraflop XE6 Cray GPUs in 30 XK6 cabinets. A testbed of end of 2012, the peak performance supercomputer, called Monte Rosa, at 48 XE6 cabinets has been operational will be at least 20 petaflops. In the the Swiss Federal Institute of Technol- since mid-March. Fourth, the OLCF’s meantime, ten of Jaguar’s cabinets ogy in Zurich (ETH Zurich) uses two 200-cabinet Jaguar supercomputer, have already gone hybrid to create a modern 16-core CPUs from AMD America’s fastest, is undergoing an testbed system called TitanDev. named Interlagos per compute node. upgrade at its Oak Ridge National Second, a two-cabinet XK6 super- Laboratory (ORNL) home that will computer called Tödi, the first hybrid transform it into a mostly hybrid 2 ALCC 2012 Workshop Report Compute Node Architectures of Select Cray Supercomputers XK6 Next XT5 XE6 (w/o XK6 -Gen GPU) XK AMD Istanbul CPU 2 (6 cores) AMD Interlagos CPU (16 cores) 2 1 1 1 NVIDIA Fermi 1 NVIDIA Kepler 1 Added Schulthess: “Given the a big avalanche coming that’s really The National Center for Supercomputing Applications’ Blue Waters, under applications that we’ve seen and the going to revolutionize computational construction at the University of Illinois benchmarks that were representative science. I think for these reasons we at Urbana-Champaign, is a hybrid of real science runs, I haven’t seen the really want to invest in hybrid comput- Cray system that gives more weight to case where the GPU doesn’t provide a ing from an application developer CPUs than GPUs. Image courtesy of performance increase. Really, it makes point of view.” the National Center for Supercomputing no sense not to have the GPU.” Applications and the Board of Trustees Work remains to determine if the of the University of Illinois The scientific results presented at the payoff for other applications and symposium were way ahead of what algorithms is great enough to develop At the start of ACSS 2012, managers the high-performance computing com- and port codes to hybrid, multicore of extreme-scale supercomputing munity had expected, Schulthess said. systems. “What we’re watching is a centers and program managers from Scientists from diverse fields of study bunch of pioneers, people who have DOE’s Office of Science wanted to were able to demonstrate impressive taken the first steps in embracing know if researchers would rather use results using hybrid architectures. and adopting what amounts to an an all-CPU XE6 or a hybrid GPU/ architectural sea change,” Hack said. CPU XK6. The results presented at “Compared to the past when we made the meeting laid that question to rest. major architectural changes, this one Most scientific application codes is different because the vendors are showed a 1.5- to 3-fold speedup on part of the solution. The partnering the GPU-equipped XK6 compared to that’s going on in terms of allowing the CPU-only XE6, with evidence of early adaptation or adoption of this a 6-fold speedup for one code. The new architecture includes working with greater computing speed makes pos- the vendors, something I hope will sible simulations of greater complexity continue. The kind of work that’s going and realism and quicker solutions to on now is identifying where the holes computationally demanding problems. are and where the opportunities are for “The Fermi [GPU from NVIDIA] delivers making progress.” higher performance than Interlagos [CPU from AMD],” said meeting NCSA Deputy Director Rob Penning- co-chair Thomas Schulthess, who is ton, a meeting co-chair, added that head of CSCS and a professor at ETH Thomas Schulthess, ETH Zurich and the partnership between domain and Zurich. “And with Fermi we’re talking CSCS computational scientists is also crucial about a GPU that is 1 or 2 years old “The XK architecture is looking like to providing whole solutions.

Accelerating Computational Science Symposium 2012

Workload Management and Application Placement for the Cray Linux Environment™

Pubtex Output 2011.12.12:1229

Use Style: Paper Title

Titan Introduction and Timeline Bronson Messer

Introduction to the Oak Ridge Leadership Computing Facility for CSGF Fellows Bronson Messer

Leadership Computing at the National Center for Computational Science: Transitioning to Heterogeneous Architectures

Warwick.Ac.Uk/Lib-Publications MENS T a T a G I MOLEM

Cray Linux Environment™ (CLE) Software Release Overview Supplement

Overview of ORNL Leadership Computing Facility and Usage

Cray Linux Environment™ (CLE) 4.0 Software Release Overview Supplement

Early Experiences with the Cray XK6 Hybrid CPU and GPU MPP Platform

AMD Opteron™ 6000 Series Platform Increases Reach, Demonstrates Virtualization Prowess, Drives Toward Core-Rich “Interlagos”