Vol. 20 | No. 2 | 2013
Editor's column
"End of the road for Roadrunner," Los Alamos National Laboratory, March 29, 2013.(a)

Five years after becoming the fastest supercomputer in the world, Roadrunner was decommissioned by the Los Alamos National Laboratory on March 31, 2013. It was the first supercomputer to reach the petaflop barrier—one million billion calculations per second. In addition, Roadrunner's unique design combined two different kinds of processors, making it the first "hybrid" supercomputer. And it still held the number 22 spot on the TOP500 list when it was turned off.

Essentially, Roadrunner became too power inefficient for Los Alamos to keep running. As of November 2012, Roadrunner required 2,345 kilowatts to hit 1.042 petaflops, or 444 megaflops per watt. In contrast, Oak Ridge National Laboratory's Titan, which was number one on the November 2012 TOP500 list, was 18 times faster yet five times more efficient.

In addition, data-intensive applications for supercomputers are becoming increasingly important. According to the developers of the Graph500 benchmarks, these data-intensive applications are "ill-suited for platforms designed for 3D physics simulations," the very purpose for which Roadrunner was designed. New supercomputer architectures and software systems must be designed to support such applications.

These questions of power efficiency and changing computational models are at the core of moving supercomputers toward exascale computing, which industry experts estimate will occur sometime between 2020 and 2030. They are also the questions that are addressed in this issue of The Next Wave (TNW).

Look for articles on emerging technologies in supercomputing centers and the development of new supercomputer architectures, as well as a brief introduction to quantum computing. While this column takes the reader to the recent past of supercomputing, the remainder of the issue will propel you "beyond digital" to the future of advanced computing systems.
Managing Editor, The Next Wave
a. Press release, Los Alamos National Laboratory. 29 March 2013. Available at: http://www.lanl.gov/newsroom/news-releases/2013/March/03.29-end-of-roadrunner.php.

Contents
Defining the future with modeling, simulation, and emulation, by Benjamin Payne and Noel Wheeler

Predicting the performance of extreme-scale supercomputer networks, by Scott Pakin, Xin Yuan, and Michael Lang

Doing more with less: Cooling computers with oil pays off, by David Prucnal

Energy-efficient superconducting computing coming up to speed, by M. A. M.

Beyond digital: A brief introduction to quantum computing, by P. L.

GLOBE AT A GLANCE

POINTERS

SPINOUTS
The Next Wave is published to disseminate technical advancements and research activities in telecommunications and information technologies. Mentions of company names or commercial products do not imply endorsement by the US Government. This publication is available online at http://www.nsa.gov/research/tnw/tnw202/article1.shtml. For more information, please contact us at [email protected].
Defining the future with modeling, simulation, and emulation
Benjamin Payne, PhD, and Noel Wheeler
Information has become a key currency and driving force in modern society. Keeping our leaders apprised of the most current information enables them to make the decisions essential to keep our country safe. The National Security Agency's reputation as the "nation's guardian" relies heavily on the flow of information, all of which is handled by an awe-inspiring array of computers and networks. Some of the problems encountered by NSA require a special breed of machines known as high-performance computers or supercomputers. Employing these powerful machines comes with a considerable price tag to the US government. When acquiring supercomputers, decision makers need to have a degree of confidence that a new computer will be suitable, even in cases when the machine does not yet exist.
What is a supercomputer?

Although supercomputers are unique, custom-built machines, they fundamentally share the design of the computers you use at home—a processor (i.e., a central processing unit or CPU), small and fast memory (i.e., random-access memory or RAM), storage (i.e., hard disk drive/CD/DVD), and a network to communicate with other computers. A typical high-performance computing (HPC) system could be considered a personal computer on a much grander scale, with tens of thousands of processors, terabytes (i.e., trillions of bytes) of memory, and petabytes (i.e., quadrillions of bytes) of storage (see figure 1). High-performance computers can readily fill a large room, if not a whole building, have customized cooling infrastructure, use enough electricity to power a small town, and take an act of Congress to purchase. Such an investment is not made without a great deal of study and thought.

FIGURE 1. A high-performance computer is like a personal computer on a much grander scale—it has tens of thousands of processors, terabytes of memory, and petabytes of storage. [Diagram: two 200+ gigaflop processors joined by an interconnect, each with local memory (capacity 0.1 byte per flop) and shared disk storage (capacity 10 bytes per flop); link rates range from 0.01 to 1 byte per second per flop per second.]

Simulating a supercomputer

Although HPC technology is not unique to NSA, the specialized problems faced by the Agency can necessitate unique customizations. Because NSA's applications and software are often classified, they cannot be shared with the architects and engineers developing supercomputers. At the same time, an investment of this magnitude requires confidence that a proposed system will offer the performance sought.

Currently, benchmarks, simplified unclassified software that exercises important attributes of a computer system, are developed and used to evaluate the performance of potential computing system hardware. However, these benchmarks may not paint the complete picture. To better understand this problem, there is substantial value to the construction of a model. Architects, engineers, and scientists have a long history of building models to study complex objects, such as buildings, bridges, and aircraft.

A new team—the Modeling, Simulation, and Emulation (MSE) team—within the Laboratory for Physical Sciences' Advanced Computing Systems Research Program [1] has been assembled to address this gap between classified software, which cannot be distributed to vendors, and the vendors' hardware systems, which have not been purchased by NSA. As an additional twist, the proposed hardware may be built from prototype components such as the hybrid memory cube (HMC; see figure 2), a three-dimensional stacked memory device designed by a consortium of industry leaders and researchers [2]. The core objectives of the MSE team include exploration of system architectures, analysis of emerging technologies, and analysis of optimization techniques.

Owners of HPC systems desire a computer that is infinitely fast, has infinite memory, takes up no space, and requires no energy. None of these attributes are truly realizable, and when considering a practical HPC system, trade-offs must be considered. When analyzing a prospective HPC system, four primary metrics are customarily considered: financial cost, system
resilience, time-to-solution, and energy efficiency. These metrics are interdependent. For example, increasing the speed of an HPC system will increase the amount of power it consumes and ultimately increase the cost necessary to operate it. In order to measure these metrics, one could build the system and test it. However, this would be extremely expensive and difficult to optimize. A model simulating the computer can be developed in far less time, and design parameters can be adjusted in software to achieve the desired balance of power, performance, reliability, and cost.

Any simulation or model of a computer should address the metrics listed above. If any are not addressed, then the model could yield incomplete results because optimizing for fewer than all relevant variables potentially leads to non-global extrema. Many scalar benchmarks, for example the TOP500 and the Graph500, focus exclusively on one characteristic, like time-to-solution, to the neglect of the other parameters of interest. The MSE team is collaborating to evangelize a more balanced approach to multiple facets of HPC system characterization, assuring an optimal solution to the Agency's needs.

FIGURE 2. NSA collaborated with the University of Maryland and Micron to develop a simulation tool for Micron's Hybrid Memory Cube that is helping to advance supercomputing applications. Micron now is sampling the three-dimensional package that combines logic and memory functions onto a single chip.

The use of benchmarking software allows for a more comprehensive evaluation of a proposed computer architecture's performance. This enables HPC system architects to better target their designs to serve NSA's needs. Simply stated, NSA has to work within budgetary and power (i.e., electricity) constraints, and it is vital to maximize the return on investment of money and time.

While this description is somewhat generic to all HPC system purchasers, NSA is willing to build special-purpose hardware devices and to employ specially developed programming languages if a cost-benefit analysis demonstrates noteworthy benefits. Unlike developers in the scientific community whose expertise usually does not span science, computer programming, and computer architecture, developers at NSA access and understand the full software and hardware stack—algorithm, source code, processors, memory, network topology, and system architecture. Compute efficiency is often lost in the process of separating these abstraction layers; as a result, NSA makes an effort to comprehend the full solution.

This approach to mission work is reflected in the work of the MSE team. A simulation or model should take a holistic approach, targeting the network, CPU, memory hierarchy, and accelerators (e.g., a graphics processing unit or field-programmable gate array). Multiple levels of detail for a simulation are required to accomplish this. A simulation may be compute-cycle or functionally accurate; it may range from an abstract model to a hardware simulation including device physics.

Simulation technology

To accomplish the objective of enabling HPC system simulation within NSA, the MSE group carried out a survey of existing simulators from academia, industry, and national labs. Although many simulators exist for HPC systems, few attempt to model a complete architecture. There have been previous efforts like the University of California, Los Angeles's POEMS and Hewlett-Packard's COTSon, but these projects are no longer actively supported. Two simulation frameworks, the Structural Simulation Toolkit (SST; see figure 3) [3] from Sandia National Laboratories and Manifold from the Georgia Institute of Technology, represent today's most promising candidates. Additionally, NSA researchers have been crafting simulation tools which are also being considered for application in the HPC problem space.
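Component frameworks such as SST and Manifold are built around discrete-event simulation: components exchange timestamped events through a central engine that processes them in time order. The sketch below illustrates that general pattern in Python; it is only an illustration, not the API of SST or Manifold, and the memory component and its 50-nanosecond latency are invented for the example.

```python
import heapq

# A minimal discrete-event engine, illustrating the general pattern
# behind component-based simulators such as SST and Manifold. This is
# an invented sketch, not the API of either framework; the component
# and its latency below are purely illustrative.
class Engine:
    def __init__(self):
        self.now = 0       # current simulated time, in nanoseconds
        self.queue = []    # heap of (time, seq, handler, payload)
        self.seq = 0       # tie-breaker for events at equal times

    def post(self, delay, handler, payload):
        heapq.heappush(self.queue,
                       (self.now + delay, self.seq, handler, payload))
        self.seq += 1

    def run(self):
        while self.queue:
            self.now, _, handler, payload = heapq.heappop(self.queue)
            handler(payload)

engine = Engine()

def memory_respond(request):
    # Model a fixed 50 ns access latency before the CPU sees the reply.
    engine.post(50, cpu_receive, request)

def cpu_receive(request):
    print(f"t={engine.now} ns: CPU received reply to {request}")

engine.post(0, memory_respond, "load 0x1000")
engine.run()   # prints: t=50 ns: CPU received reply to load 0x1000
```

Because every interaction is an event with a timestamp, a framework built this way can swap a cycle-accurate component for an abstract one without changing the engine, which is what makes the multiple levels of detail described above practical.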
Both SST and Manifold use component simulators to construct a larger-scale system. For example, SST can use the gem5 [4] CPU simulator along with the University of Maryland's DRAMSim2 to capture performance characterization of processor-to-memory latency and bandwidth. Since simulating a full-scale HPC system would require an even larger supercomputer to run in a reasonable time, SST breaks the simulation into two components: SST/micro, a node-level simulation (e.g., CPU, memory), and SST/macro, which handles network communication between nodes. With the emerging HMC memory technology, the MSE team is making plans to employ the SST family of tools to extensively model an HPC system and gain perspective on its potential capabilities. This will place NSA's HPC programs on the leading edge in understanding the application potential for this new technology.

At this time, SST/micro is capable of simulating the execution of programs in a single processor core and of monitoring the application's use of the simulated CPU and memory. By 2014, the development team at Sandia plans on parallelizing the simulation, enabling multiple processor cores to be simultaneously simulated. This would allow parallel applications (i.e., software designed to simultaneously run on multiple processor cores) to be run in a realistic compute node configuration (i.e., multiple cores concurrently accessing the same memory hierarchy) while potentially reducing the time needed to complete a simulation.

SST/macro, combined with NSA's benchmarking software, has already been used to demonstrate how different network topologies, used to connect an HPC system's processing cores, can affect the time-to-solution metric. SST/macro allowed researchers to specify data-routing algorithms used in the network configuration and to study how a modified network topology serves to optimize the performance of a system. The clear benefit of this research is in the ability to enable application and network codesign to create an optimal and cost-effective architecture.

The SST and its counterpart, Manifold, are being actively developed and are useful for research, but they are not yet ready for use as decision-making tools by NSA. The MSE team is actively collaborating with Sandia and the Georgia Institute of Technology, providing feedback, guidance, and assistance to the simulation framework developers. Multiple other national labs, academic researchers, and vendors are also participating in the effort driven by the MSE team. Other potential applications for simulation techniques could be codesign of software before the actual hardware is available, software performance analysis/optimization, and debugging of software.

FIGURE 3. Sandia National Laboratories' Structural Simulation Toolkit is one of today's most promising high-performance computing system simulators. [Diagram: simulator core services, including configuration, statistics, checkpointing, and power/area/cost projections, supporting both vendor and open components.]

About the authors

Noel Wheeler is the lead of the Modeling, Simulation, and Emulation (MSE) team in the Advanced Computing Systems group at NSA's Laboratory for Physical Sciences. Ben Payne, PhD, is a physicist working as a postdoctoral researcher for the MSE team.

References

[1] Klomparens W. "50 years of research at the Laboratory for Physical Sciences." The Next Wave. 2007;16(1):4–5.

[2] Hybrid Memory Cube Consortium [homepage; updated 2013]. Available at: http://www.hybridmemorycube.org.

[3] Sandia Corporation. SST: The Structural Simulation Toolkit [homepage; accessed 4 Apr 2013]. Available at: http://sst.sandia.gov.

[4] Binkert N, Beckmann B, Black G, Reinhardt SK, Saidi A, Basu A, Hestness J, Hower DR, Krishna T, Sardashti S, et al. "The gem5 simulator." ACM SIGARCH Computer Architecture News. 2011;39(2):1–7. doi: 10.1145/2024716.2024718.
Predicting the performance of extreme-scale supercomputer networks

Scott Pakin, Xin Yuan, and Michael Lang
A modern supercomputer is the Internet in a microcosm, with tens of thousands of nodes—computers not much different from the one you may be using to read this article—all hooked together via a high-speed network. However, while computers on the Internet operate largely independently of each other, supercomputers regularly harness the power of thousands to many tens of thousands of nodes at once to run a single application significantly faster than any lone computer could. Coordinating the efforts of so many nodes requires massive amounts of communication, making the design of the interconnection network critical to the performance of the supercomputer as a whole. In this article we present technology we are developing to predict the impact of various network-design alternatives on the overall performance of supercomputing applications before the supercomputer is even built. This is important because a large supercomputer can easily cost tens to hundreds of millions of dollars (and in the case of Japan's K supercomputer, over a billion dollars). Being able to evaluate network technologies during their design phase helps ensure that the supercomputer will provide as much performance as possible to applications.
Network topologies

Supercomputers gain their performance edge from parallelism, the ability to perform many pieces of work at the same time. Taking advantage of a supercomputer consequently requires an application to divide up the work it has to perform into small chunks that can be spread over a supercomputer's nodes. In practice, some of these chunks of work depend on other chunks.

Consider, for example, an arithmetic expression such as (5+5)×(6+8). The two sums can be computed concurrently, but the product cannot be computed until after both sums have been computed. This necessitates communication, commonly taking the form of inter-node messages sent over a communication network. The node computing one sum has to tell the other node when it has finished and what sum it computed so the latter node can perform the multiplication. (Alternatively, both nodes can communicate their sum to a third node, which can multiply the two sums.)

Network speed is critical to application performance. If the network is too slow relative to the time spent in computation, which is likely the case for our simple arithmetic example, there will be no performance gain to be had from parallelism, and the supercomputer's performance capabilities will be wasted.

While the Internet is composed of a motley collection of subnetworks haphazardly linked together, supercomputer networks gain some of their speed advantage by exploiting homogeneous hardware arranged into regular patterns. This avoids some nodes lying in the boondocks of the network and slowing down the entire application whenever distant nodes need to communicate with them. Figure 1 illustrates three topologies out of endless possibilities. Contrast the irregular structure of figure 1(a), which illustrates the graph nature of the Internet's topology, with the symmetry in each of figures 1(b) and (c), which illustrate two common supercomputer topologies: a fat tree and a three-dimensional torus (i.e., 3-D torus). Nodes in the figure are shown as blue spheres. (Each of a modern supercomputer's nodes typically contains 10–100 processor cores, making a node a powerful computer in its own right.) Network links are portrayed in the figure as lavender tubes and switches are portrayed as salmon-colored boxes. A switch receives data on one link and, based on where the data is to be delivered, sends it out on another link.

FIGURE 1. Three examples of network topologies. Figure (a) shows an example of a small-world network topology. Figure (b), which depicts a fat tree, and figure (c), which depicts a 3-D torus, are two common supercomputer network topologies.
For example, if the leftmost node in the fat tree depicted by figure 1(b) needed to communicate with the rightmost node, it could send data to the switch above it, which could forward the data to the switch above it and then to the switch above it. The topmost switch could then forward the data diagonally down and to the right, then diagonally down and to the right again, and finally down to the destination node.

An alternative route would be to start with a couple of diagonally upward-and-rightward hops followed by vertically downward hops. (As an exercise, see if you can find a third path from the leftmost node to the rightmost node. Are there any more routes?) We use the term routing algorithm to describe the process by which switches select one route among the alternatives.

The importance of a topology such as a fat tree is that there are multiple ways to get from any node to any other node. Hence, if one route is congested, data can proceed along a different route. Consider an analogy to cars and roads, with cars representing data, roads representing links, and intersections representing switches. The more roads connecting a residential neighborhood to a commercial district, the less traffic is likely to appear on any given road. At the extreme, one could connect every node to every other node in a supercomputer to eliminate all congestion. In practice, this is not done for the same reason that there are not private roads connecting every house to every other house in a town—cost. Switches and links are expensive; hence, a network designer must simultaneously minimize the number of switches and links while maximizing the number of alternative routes between pairs of nodes. A 10,000-node supercomputer with all-to-all connectivity would require one hundred million links. At even a dollar apiece (an unrealistically small amount), this would dominate the cost of the supercomputer.

Figure 1(c) illustrates a 3-D torus, another common supercomputer network topology and one that makes different trade-offs from a fat tree with respect to switch and link count and alternative paths. In this topology, nodes and switches are arranged in a cube (or rather, rectangular cuboid) formation, and wraparound links enable data sent out one side of the network to re-enter on the other side. For example, if the node in the lower left of figure 1(c) needed to communicate with the node in the upper right, the long way would be to travel up-up-up-right-right-right-back-back-back. However, the wraparound links enable the data to travel down to the topmost position, then left to the rightmost position, and finally forward to the backmost position, taking three hops instead of nine.
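The wraparound arithmetic is easy to check in code. The sketch below is our own illustration, assuming four switches per dimension as the three-versus-nine example above implies: in each dimension, data may travel the direct way or the wraparound way, whichever is shorter.

```python
# A sketch of minimum hop counts on a torus with wraparound links.
# Our own illustration of the arithmetic in the text, not code from
# any particular simulator; a 4 x 4 x 4 torus is assumed.
def torus_hops(src, dst, dims):
    """Minimum hops between switch coordinates src and dst on a torus
    whose dimension sizes are given by dims, e.g., (4, 4, 4)."""
    hops = 0
    for s, d, size in zip(src, dst, dims):
        direct = abs(d - s)
        hops += min(direct, size - direct)  # direct way vs. wraparound way
    return hops

# Corner to opposite corner: three hops with wraparound links
# instead of nine without them.
print(torus_hops((0, 0, 0), (3, 3, 3), (4, 4, 4)))            # 3
print(sum(abs(d - s) for s, d in zip((0, 0, 0), (3, 3, 3))))  # 9
```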
Putting cost arguments aside for the moment and assuming the same node count in both networks, could a fat tree be expected to outperform a 3-D torus, or would the 3-D torus likely be the faster network? In the next section, we discuss how to answer this question.

Simulating networks

As creating a new network is expensive and time-consuming, we want to be able to gauge how well a given network might perform in advance of its construction. This is commonly done via network simulation—mimicking hardware's behavior with slower but vastly more malleable software. We again turn to a car-and-road analogy. Consider the situation of bumper-to-bumper traffic on two single-lane roads that merge into one single-lane road, as shown in figure 2. It would be prohibitively expensive to construct the roads and hire drivers to drive in the specified pattern just to determine the speed at which traffic can move. Instead, one could write a computer program that moves virtual cars on virtual roads and measures how much time elapses in this virtual world. In networking terms, this approach is called flit-level simulation because it tracks every flit (a unit of data, typically a byte) as it moves from switch to switch throughout the network.

FIGURE 2. To determine the speed at which vehicles can move in bumper-to-bumper traffic on two single-lane roads that merge into one single-lane road, one can simulate this "network" using a computer program. There are different approaches to network simulation, which vary in speed and degree of realism.
At each point in (virtual) time, the simulator considers the current location of each flit in the network; the routing algorithm, which is used to decide where each flit should go next; the internal switch architecture, which controls how link contention is resolved (for example, a simple alternating of flits as illustrated by the cars in figure 2); and all the myriad other characteristics that determine performance. With regard to figure 2, a simulator would need to take into consideration not only the speed limits and layout of the road system but also the decision-making process of each driver on the road to know where the driver wants to go and how he will negotiate with other drivers as to who gets to go first when lanes merge.

There are two main problems with flit-level simulation, one inherent and one artificial. The inherent problem is that simulating a large network at such a fine level of detail is necessarily slow—vastly slower than real network hardware could run. Thousandfold slowdowns are not uncommon. In other words, the simulator might need to run for an hour to report how a network might behave over the course of a single second of execution. To put that slowdown in perspective, consider that many of the scientific applications commonly run on supercomputers at Los Alamos National Laboratory take hours to days to run; a few even require months to over a year to complete. Dilating such times by a factor of a thousand clearly limits the practicality of simulating such applications. Consequently, flit-level simulations must by necessity whittle down their inputs to a more manageable size, simulating only small networks and for only brief periods of time, which limits realism.
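To see how this per-flit bookkeeping adds up, consider the toy sketch below (our own illustration, not any real simulator). It tracks every flit individually along a hypothetical four-switch path and lets only one flit cross a given link per cycle, so the simulator must make a decision for every flit on every cycle.

```python
# A toy flit-level simulation, purely illustrative: every flit is
# tracked individually, and each link carries at most one flit per
# cycle, so contending flits take turns.
route = ["A", "B", "E", "H"]          # a hypothetical switch path

flits = [{"id": i, "pos": 0} for i in range(4)]  # four flits at switch A
cycle = 0
while any(f["pos"] < len(route) - 1 for f in flits):
    cycle += 1
    links_used = set()
    for f in flits:                   # one decision per flit per cycle
        if f["pos"] < len(route) - 1:
            link = (route[f["pos"]], route[f["pos"] + 1])
            if link not in links_used:    # one flit per link per cycle
                links_used.add(link)
                f["pos"] += 1
print(f"All flits delivered after {cycle} cycles")
```

Multiply this loop by billions of flits and millions of links and the thousandfold slowdowns described above become unsurprising.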
The artificial problem is that for simplicity of operation, simulators are typically fed synthetic communication patterns rather than communication patterns derived from actual supercomputing applications. For example, two common test patterns are uniformly random traffic, in which each node sends data to some number of other nodes selected at random, and hot-spot traffic, in which all nodes send data to a small subset of the nodes selected at random. Second, almost all simulation studies presented in the supercomputer-network literature assume that communication begins at fixed points in time, typically exclusively at the start of the simulation. Third, computation time is almost universally ignored, even though this can greatly affect the severity and impact of link contention.

Returning to our car-and-road metaphor, typical simulator usage would be analogous to gauging the quality of a layout of a city street under assumptions like the following:

1. People drive randomly from one place to another as opposed to, say, a bias to drive to the kids' school at the beginning of the day, then to the office, then to the kids' school again, and finally back home.

2. Everyone leaves home at exactly 9:00 a.m., drives directly to his destination, and leaves the car there. A less-common variation on this assumption is that Alice picks up Bob at exactly 8:15 a.m., Carol at exactly 8:30 a.m., and Dave at exactly 8:45 a.m. for their carpool to work—all regardless of how heavy or light the traffic happened to be at the time or whether a new highway had just been built to speed up their commute.

3. No one stops to work, shop, or relax; all anyone in the city does is drive.

It would be hard to lend much credence to any result of such a study, yet this is very much how supercomputer networks are analyzed today. Again, this is an artificial problem. There is no fundamental reason that such assumptions must be made; they are merely a convenience to simplify the simulation effort. In the next section, we describe how we are improving the state of the art in network-simulation technology, both in terms of simulation speed and simulation realism.

A new approach to network simulation

Our goal is to address all of the shortcomings discussed above; in particular, our aim is to simulate all of the following:

- Full-sized applications, not synthetic communication patterns;
- Hours of application-execution time, not seconds;
- Tens of thousands of nodes, not hundreds to low thousands;
- Communication interleaved with computation, not treated as independent; and
- Communication beginning when prior communication or computation ends, not at fixed points in time.
The two mechanisms that underlie our approach are flow-based simulation and logical clocks. We now describe each of these in turn.

Flow-based simulation

The reason that flit-based simulation is so slow is that supercomputer networks contain a massive number of components, and each of these must be simulated individually. Logically, if one were to simulate large groups of components as single entities, this would greatly reduce the amount of work, and therefore time, required to run the simulation. We therefore choose to consider a complete, end-to-end communication operation as a single unit of simulation rather than the numerous flits that get transmitted as part of that operation.

Before we explain the details precisely, we present the intuition behind our approach in terms of our running car-and-road analogy. Assuming a 40 mph speed limit and that the distance from the front of one car to the front of the next is 29 feet, the math works out to two cars per second passing any particular point on the road. Hence, if we knew that 100 cars wanted to go from point A to point B and that there was no other traffic on the road, the first car in that sequence would arrive after some given length of time (i.e., however long it takes to drive from point A to point B on an empty road, say three minutes), and the last car would arrive 100 ÷ 2 = 50 seconds later.

We now consider the variation indicated by figure 2: 100 yellow cars want to go from point A to point B at the same time that 100 red cars want to go from point C to point B. What impact does the shared segment of road have on the time it takes each of those two flows of cars to reach their destination? As before, two cars per second are reaching point B, but because the two flows are interleaved, only one yellow car per second and one red car per second can reach that location. The first car in each flow is not delayed, so it still takes our assumed three minutes to arrive at point B, but the last car in each flow arrives not 50 seconds later but 100 ÷ 1 = 100 seconds later.

The point of this exercise is to demonstrate that, unlike with flit-level simulation, we do not have to consider each individual car's behavior. Instead, we can analyze an entire sequence of cars at once, regardless of whether there are a hundred cars in each flow or a million. Furthermore, we do not need to consider how the drivers negotiate the merge. All that matters is that there is an even 50–50 split between red and yellow cars on the merged segment of road, not that it went red–yellow–red–yellow versus red–red–yellow–yellow.

Our approach to network simulation works in very much the same way as the preceding analysis of traffic speeds. As in the above instance, instead of working with communication times directly, we work with communication rates, which we can easily relate back to time by noting that time = latency + (data size ÷ communication rate), where latency is the time it would take a single flit to move from the source node to the destination node in the absence of any other traffic. For example, suppose that the latency between node A and node B is 0.6 seconds and that all of the links between node A and node B are capable of transmitting 5 gigabytes per second. If node A were to transmit 1 gigabyte of data to node B, this communication would take a total of 0.6 + (1.0 ÷ 5.0) = 0.8 seconds.
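This timing rule fits in a few lines of code. The sketch below (our own illustration) reproduces the 0.8-second example and, in the spirit of the merged lane in figure 2, also shows the effect of two flows splitting a link's rate evenly.

```python
# A sketch of the flow-based timing rule from the text:
# time = latency + (data size / communication rate).
def transfer_time(latency_s, size_gb, rate_gb_per_s):
    return latency_s + size_gb / rate_gb_per_s

# 1 gigabyte from node A to node B at 5 gigabytes per second,
# with 0.6 seconds of latency:
print(transfer_time(0.6, 1.0, 5.0))        # 0.8 seconds

# If two flows share a link evenly (like the merged lane in figure 2),
# each flow proceeds at half the link's rate:
print(transfer_time(0.6, 1.0, 5.0 / 2))    # 1.0 seconds
```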
While latencies are essentially constant and data sizes can be extracted from an application (as we will discuss further when we discuss logical clocks), communication rates vary dynamically based on the amount of link contention, the number of communications sharing a network link at any given time. Consider the network topology shown in figure 3 (i.e., a 2-D mesh).

FIGURE 3. An illustration of link contention on a 2-D mesh network topology.
If node A sends data to node H via the route A–B–E–H (cyan links) at the same time as node B sends data to node F via the route B–E–F (magenta links), the B–E link will be shared by the two routes. Supposing the link is capable of transmitting at a rate of 5 gigabytes per second, 2.5 gigabytes per second will be allocated to each of the two communications. Because data cannot enter a link faster than it can exit, this slow link then exerts back-pressure all the way to the source nodes, slowing down the entire communication to 2.5 gigabytes per second. Using the previously mentioned sample numbers, each of the two communications will now take 0.6 + (1.0 ÷ 2.5) = 1.0 seconds instead of the contention-free 0.8 seconds computed earlier—slower but notably not twice as slow, even though the link speed effectively halved.

Logical clocks

We criticized prior simulation efforts for relying on synthetic communication patterns instead of actual communication patterns derived from supercomputing applications. Our question is therefore how we can acquire an application's communication pattern so that it can be analyzed by a simulator. The enumeration of all communication that an application performs during its execution—which node sent how many bytes to whom when—is called a communication trace. Fortunately, intercepting and logging an application's communication operations is fairly straightforward, and there exist numerous tools for collecting communication traces.

The issue is not with collecting the trace but with interpreting it. Figure 4 helps clarify the problem. Figure 4(a) presents a trace of a communication pattern in which node A sent a message to node C, then node B sent a message to node C, then, after a brief interlude, node C sent a message to node A, and finally, node C sent a message to node B. A graphical view of this trace is shown in figure 4(b). Send and receive times are reported from the perspective of each node's clock. For example, the first line of the table in figure 4(a) indicates that node A reported that it sent a message to node C at time 10 and that node C reported that it received node A's message at time 16.

The first problem with this type of communication trace is that supercomputer nodes seldom include per-node clocks that are globally synchronized to within half a message latency (i.e., the tolerance needed to avoid erroneous readings, as discussed below).
FIGURE 4. Example of a communication pattern. Figures (a) and (b) illustrate a communication trace of nodes with perfectly synchronized clocks (an unrealistic condition). Figures (c) and (d) illustrate a communication trace of nodes with poorly synchronized clocks.

(a) Communication trace:

    Source Node   Destination Node   Sent Time   Received Time
    A             C                  10          16
    B             C                  14          20
    C             A                  28          34
    C             B                  30          36

(b) Timeline view of the trace (nodes A, B, and C; time 0 to 40).

(c) Communication trace with poorly synchronized clocks:

    Source Node   Destination Node   Sent Time   Received Time
    A             C                  14          12
    B             C                  14          16
    C             A                  24          38
    C             B                  26          36

(d) Timeline view of the trace with poorly synchronized clocks (nodes A, B, and C; time 0 to 40).
These would represent a costly but rarely useful expense. Furthermore, access to a single, centralized clock would be devastating to performance—imagine figure 2 with tens of thousands of lanes merging into one. Hence, some node clocks may run slightly ahead or behind others, and even worse, some node clocks may run slightly faster or slower than others. This is called clock drift. Although clock-synchronization algorithms exist, software implementations are unable to synchronize clocks to a granularity fine enough to measure network-communication time.

Figure 4(c) represents the same trace as figure 4(a) but as measured with node A's clock running four time units late and node C's clock running four time units early. As the graphical depiction of this trace in figure 4(d) clarifies, the faulty clocks make the first message appear to have been received before it was sent, a physical impossibility. Furthermore, instead of each message taking a constant six time units to get from source to destination as indicated by the "perfect" trace in figure 4(a), the B–C communication in figure 4(c) appears to take only two time units while the C–B communication appears to take ten.

The second problem with using figure 4-style communication traces involves how the simulator replays the traced communication pattern. Suppose we wanted to simulate a network that runs twice as fast as the one on which the communication trace was acquired or perhaps the same network attached to processors running three times as fast as on the measurement system. It would be unreasonable in either case to expect all of the messages to be sent at the same times shown in figure 4(a). A node that receives a message sooner or finishes some computation faster may then be able to send a message earlier. We therefore do not want our simulator necessarily to simulate messages being sent at the times listed in the input trace but rather at the times that the simulated supercomputer would actually send them.

The solution to both of the preceding problems is an abstraction called a logical clock, first proposed by Lamport in 1978 [2] and sometimes called a Lamport clock after its inventor. A logical clock is a simple, integer counter that "ticks" as follows:

1. When a node performs any operation (communication or computation), its clock advances its logical time by one.

2. When a node sends a message, its clock includes the current logical time along with the normal data.

3. When a node receives a message, it sets its logical clock to the maximum of its current logical time and one plus the logical time included in the message.

These rules help define a "happened before" relation (mathematically, a partial ordering) on communication operations. If one operation occurred at a smaller logical time than another, then the simulator cannot perform the second operation until the first one finishes. In contrast, if two operations occur at the same logical time, the simulator has no restrictions on the order it performs them: it can run A then B, B then A, or both simultaneously. In essence, a logical clock provides a way to globally order communication operations regardless of the locally observed time at which each operation may appear to have occurred.
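These three rules translate almost directly into code. Below is a minimal sketch of a Lamport clock following the rules as stated; the two-node exchange at the end is an invented example.

```python
# A minimal Lamport clock following the three rules in the text.
class Node:
    def __init__(self, name):
        self.name = name
        self.time = 0                 # the logical clock

    def step(self):                   # rule 1: any local operation
        self.time += 1

    def send(self):                   # rule 2: sending is an operation,
        self.step()                   # and the message carries the
        return self.time              # current logical time

    def receive(self, msg_time):      # rule 3: jump past the sender
        self.time = max(self.time, msg_time + 1)

a, c = Node("A"), Node("C")
stamp = a.send()                      # A's clock ticks to 1
c.receive(stamp)                      # C's clock becomes max(0, 1+1) = 2
print(a.time, c.time)                 # 1 2
```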
To clarify using yet another driving analogy, consider Alice's and Carol's sequences of events, presented in figure 5.

FIGURE 5. The ordering of events from Alice's and Carol's perspective.

Alice's journal: "I left home at 3:30 p.m. to drive my son to soccer practice. When I got to the soccer field my watch read 3:55 p.m. I left the soccer field five minutes later, at 4:00 p.m., to join Carol for tea. I arrived at the tea house at the same time as Carol, at 4:30 p.m. according to my watch. After a lovely hour of tea with Carol, we both drove home at the same time, when my watch read 5:30 p.m., just in time to make dinner."

Carol's journal: "I left home at 6:30 p.m. to join Alice for tea. I arrived at the tea house at 7:00 p.m. according to my watch, which is right when Alice arrived. I had an enjoyable tea time with Alice. We both left 45 minutes later, at 7:45 p.m. on my watch, to drive home. I got home at 8:05 p.m."

In what order did those events happen? It would be incorrect to sort them by the times listed in the event descriptions because Alice and Carol may not have synchronized their watches beforehand and because either watch may run faster or slower than the other.
Logical time spent at various locations:

    Event                 Alice's House   Carol's House   Soccer Field   Tea House
    (Our story begins)    1               1               1              1
    Alice's → Soccer      1               1               2              1
    Soccer → Tea          1               1               2              3
    Tea → Alice's         4               1               2              3
    Carol's → Tea         4               1               2              3
    Tea → Carol's         4               4               2              3

    Logical Time   Observable Events
    1              Alice and Carol are both at home.
    2              Alice is at the soccer field. Carol may be either at home or en route to the tea house. We have insufficient information to determine which.
    3              Alice and Carol are both at the tea house.
    4              Alice and Carol are both at home. We have insufficient information to determine who arrived first.

FIGURE 6. The ordering of events based on logical time.
Nevertheless, we can intuitively rely on what makes sense to order the specified events. Specifically, we know that Alice must have driven to the soccer field before driving from the soccer field; we know that both Alice and Carol were at the tea house at the same time; and we know that both Alice and Carol left the tea house at the same time after having tea together.

Figure 6 shows how to express that "what makes sense" intuition as formal statements of changes to logical time. The table assigns one logical clock to each location (as opposed to each person) mentioned and lists the events that Alice observed, in order, followed by the events that Carol observed, in order. (The results would be the same if we swapped or even interleaved Alice's and Carol's journals, as long as the events were not reordered relative to how they appear in either journal.) In our network-simulation framework, locations correspond to nodes, and a person driving from location to location corresponds to communication.

At the beginning, all locations are at logical time 1, and Alice and Carol are both in their respective homes. When Alice drives to the soccer field, she must arrive some time after she was at home. The soccer field therefore increments its logical time to 2, the maximum of its current time (1) and one plus the time at Alice's house (1 + 1). When Alice drives to the tea house, she must arrive some time after she left the soccer field. The tea house therefore increments its logical time to 3, the maximum of its current time (1) and one plus the time at the soccer field (1 + 2). When Alice drives home, she must arrive both after the last time she was there (1) and after she left the tea house (3), that is to say, at time 4.

Turning our attention to Carol, Carol must arrive at the tea house at a time later than when she was at home. However, the tea house's clock does not change because the maximum of its current time (3) and one plus the time at Carol's house (1 + 1) is already 3. Finally, when Carol drives home, she must arrive both after the last time she was there (1) and after she left the tea house (3), that is to say, at time 4.

For clarity, the bottom part of figure 6 re-sorts the data by logical time, showing which events happened at each time. From this presentation, one can infer that despite the physical times stated in the event descriptions, Alice could not possibly have returned home before Carol arrived at the tea house (time 4 versus time 3). However, the logical-clock readings in figure 6 say nothing about whether Alice arrived back at her home before Carol arrived back at her home (time 4 for both events). More subtly, the readings do not indicate which of Alice or Carol arrived first at the tea house (as both soccer → tea and Carol's → tea completed at time 3); we know only that neither left (time 4) before both arrived (time 3).
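As a check, the walkthrough above can be replayed mechanically. The sketch below (our own illustration) applies the update rule described in the text, where a destination's clock becomes the maximum of its current time and one plus the source's time, and it reproduces the clock values in figure 6.

```python
# Replaying figure 6: one logical clock per location, updated as
# described in the text when a person drives from src to dst.
clocks = {"Alice's": 1, "Carol's": 1, "Soccer": 1, "Tea": 1}

journeys = [("Alice's", "Soccer"), ("Soccer", "Tea"), ("Tea", "Alice's"),
            ("Carol's", "Tea"), ("Tea", "Carol's")]

for src, dst in journeys:
    clocks[dst] = max(clocks[dst], clocks[src] + 1)
    print(f"{src} -> {dst}: {clocks}")

# Final clocks: Alice's=4, Carol's=4, Soccer=2, Tea=3, matching the
# last row of figure 6.
```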
Logical clocks provide an important mechanism for fulfilling the goals stated at the beginning of this section in that they enable a network simulator to reason about communication dependencies—what must happen before what—rather than physical times. One additional innovation of our network-simulation methodology is that we record computation time in physical time. In figure 6's analogy, this would be like a waiter at the tea house reporting how long Alice and Carol spent there. Maintaining this information enables the simulator to honor computation time, which may have substantial impact on communication time. Consider, for example, how much faster the cars in figure 2 would move if the yellow cars were on the road only in the morning and the red cars were on the road only in the afternoon.

Even without perfectly synchronized, drift-free node clocks, combining physical computation time with logical communication time enables us to accurately reproduce application timing measurements and provide some confidence that varying hardware parameters will lead to accurate predictions of performance. In the following section we quantify how well this works by presenting an early evaluation of our simulation methodology.

Initial results

Our simulation project is still in its early stages. However, the logical-time trace acquisition software and the simulator itself are operational and support a sufficient set of features for an initial evaluation of our approach.

As a sample application, we use a hydrodynamics code developed at Los Alamos National Laboratory called PAGOSA. PAGOSA is designed to simulate high-speed fluid flow and high-rate material deformation [1]. The application comprises approximately 67,000 lines of code (about 1,000 printed pages), mostly written in Fortran but with some C. PAGOSA's constituent processes are logically arranged in a three-dimensional layout and communicate primarily with their immediate north, south, east, west, front, and back neighbors. This is an ideal structure for a three-dimensional network such as the one shown in figure 1(c) if the application's coordinates directly map to the network's coordinates. For example, mapping a 6 × 6 × 6 PAGOSA layout onto a 6 × 6 × 6 network could be expected to perform well. In contrast, mapping it onto a 6 × 4 × 9 network would in fact make some "neighbors" not adjacent to each other, leading to link contention. In practice, users are rarely given control over the set of nodes allocated to their applications.
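The cost of a mismatched mapping can be quantified with a few lines of code. The sketch below is our own illustration; it assumes a row-major assignment of the 6 × 6 × 6 process grid to the nodes of a 6 × 4 × 9 torus (the text does not specify the mapping) and counts how many logical neighbor pairs land on adjacent network nodes.

```python
import itertools

# Our own illustration: fold a 6 x 6 x 6 logical layout onto a
# 6 x 4 x 9 torus via a row-major assignment of processes to nodes,
# then count how many logical neighbor pairs remain network neighbors.
APP = (6, 6, 6)   # the application's logical process grid
NET = (6, 4, 9)   # a mismatched network shape with the same node count

def linearize(c, dims):
    x, y, z = c
    return (x * dims[1] + y) * dims[2] + z

def delinearize(i, dims):
    return (i // (dims[1] * dims[2]), (i // dims[2]) % dims[1], i % dims[2])

def torus_distance(a, b, dims):
    # Shortest hop count per dimension: direct way or wraparound way.
    return sum(min(abs(p - q), d - abs(p - q)) for p, q, d in zip(a, b, dims))

adjacent = total = 0
for p in itertools.product(*(range(d) for d in APP)):
    for axis in range(3):
        q = list(p)
        q[axis] += 1
        if q[axis] < APP[axis]:          # interior logical neighbors only
            total += 1
            a = delinearize(linearize(p, APP), NET)
            b = delinearize(linearize(tuple(q), APP), NET)
            if torus_distance(a, b, NET) == 1:
                adjacent += 1
print(f"{adjacent} of {total} logical neighbor pairs remain adjacent")
```

Every pair that fails the adjacency test forces its messages across intermediate links, which is exactly the link contention the text warns about.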
We ran PAGOSA on 1,000 nodes of a 1,600-node supercomputer called Mustang. Mustang is based on a fat-tree network such as the one shown in figure 1(b), but 200 times larger. More precisely, figure 1(b) represents what is often called a 2-ary 3-tree, because each switch connects to two switches in each adjacent row and there are three rows of switches. Mustang uses an 18-ary 3-tree(a) so each switch connects to 18 switches in each adjacent row, but there are still only three rows in the network, just as in figure 1(b). As of June 2013, Mustang was rated the 137th fastest supercomputer in the world [3].

Full-application simulation at scale

PAGOSA was configured to execute a canonical hydrodynamics test problem, the simulation of a spherical shell of beryllium being subjected from all directions to a given amount of kinetic energy, which compresses the shell. Figure 7 presents the results of simulating this PAGOSA execution using the sets of network parameters listed in table 1.

The first bar, labeled "Fat tree, measured," indicates that the PAGOSA test problem normally takes an hour and a half to complete on Mustang. The second bar, "Fat tree," demonstrates that our simulator is quite accurate, being only 6.4% above the correct value. Recall that our work is still in its early stages; we hope in the near future to improve simulation accuracy. The second and subsequent bars each represent between 14½ and 15½ hours of time running the simulator on a single desktop computer. This is a noteworthy success: Even though we used a thousandth of the number of nodes as in the real execution, our simulator took only tenfold the time to run. And, unlike real execution, our simulator enables limitless "what if" experimentation with different network topologies and network performance characteristics.

(a) Mustang in fact contains an incomplete 18-ary 3-tree—an XGFT(3; 18, 6, 16; 1, 6, 18) in Öhring's notation [4]—and this is what we simulate.
As a demonstration of that capability, the remaining bars in figure 7 show the results of simulating different networks from Mustang's actual network. As detailed in table 1, "Fat tree, slow" represents a substantially slower network than "Fat tree." "3-D torus" uses the same network speeds as "Fat tree" but with a 3-D torus topology instead of a fat tree. Likewise, "3-D torus, slow" uses the same network speeds as "Fat tree, slow" but with a 3-D torus topology instead of a fat tree. "3-D torus, shuffled" represents the same topology and network speeds as "3-D torus" but randomly shuffles the mapping of PAGOSA processes to torus nodes. Torus networks are notoriously sensitive to process placement, and we can use our simulation technology to evaluate how sensitive a given application is to the placement of its constituent processes.

The clear implication of figure 7 is that PAGOSA's overall performance is almost completely oblivious to network performance. Despite the simulated variations in network topologies and speeds, the difference in execution time from one network to another is a tiny fraction of a percent. Although the 1,000-node run of PAGOSA communicated an aggregate of two billion messages comprising a total of 14 terabytes of data, communication time is so dominated by computation time that network speed is largely inconsequential.

FIGURE 7. Simulated PAGOSA execution time using the sets of network parameters in table 1. [Bar chart: execution time in seconds (0 to 6,000, lower is better) for "Fat tree, measured," "Fat tree," "Fat tree, slow," "3-D torus," "3-D torus, slow," and "3-D torus, shuffled."]

Comparison with simplistic simulators

We have shown that flow-based simulation delivers simulation speed and that logical clocks provide high fidelity to actual application execution time. The next question to consider is how our approach compares to the more simplistic approach employed by most network-simulation studies. While our simulator honors both communication dependencies and computation time, it is far more typical in the simulation literature to pretend that all messages are sent simultaneously at time 0 and to simulate the time it takes all messages to reach their destination in the absence of computation.

We configured our simulator to disregard communication dependencies and computation time, in essence dumbing down our simulator to the capabilities of a more traditional network simulator. The results, shown in figure 8, paint a very different picture of performance from figure 7.

The total height of each bar represents the time for the last message in the corresponding simulation to complete. The light purple region represents the average time for a message to complete. While figure 7 indicates that PAGOSA's total execution time is almost completely independent of communication time, figure 8 exaggerates the differences. Specifically, the 3-D torus requires 70% more time than the fat tree to transfer PAGOSA's two billion messages. For both network topologies, quartering the bandwidth exactly quadruples the communication time.
TABLE 1. Simulation parameters

    Simulation           Topology            Link Speed (Gbps)   Switch Latency (ns)   Software Overhead (ns)
    Fat tree             18-ary 3-tree       40                  100                   1,500
    Fat tree, slow       18-ary 3-tree       10                  400                   4,000
    3-D torus            8 x 16 x 16 torus   40                  100                   1,500
    3-D torus, slow      8 x 16 x 16 torus   10                  400                   4,000
    3-D torus, shuffled  8 x 16 x 16 torus   40                  100                   1,500
This study demonstrates that it is critical to include communication dependencies and computation time in a network simulation. Otherwise, differences in network topology and basic performance characteristics appear more significant than they really are. This misleading information could persuade a supercomputing center to pay extra for a faster network when a slower, less expensive network may deliver almost exactly the same performance to applications.

FIGURE 8. Differences in simulated communication time only. [Bar chart: communication time in seconds (0 to 18, lower is better) for "Fat tree," "Fat tree, slow," "3-D torus," and "3-D torus, slow," showing maximum and average message-completion times.]

Conclusions

Modern supercomputers are architected as vast aggregations of processors interconnected with high-speed networks. Because scientific applications are generally composed of myriad processes working together to simulate natural phenomena, communication speed is critical for efficiently coordinating all of those processes. However, engineering a high-speed network involves inevitable cost/performance trade-offs. Furthermore, all applications use the network differently, contraindicating a one-size-fits-all solution. Some applications transmit a large number of small messages; others transmit a small number of large messages. In some applications, each node communicates with only a small set of other nodes; in others, all nodes communicate with all of the others. Some applications communicate continuously throughout their execution; others alternate communication and computation phases.

Supercomputing centers want to maximize the overall performance delivered to the applications they expect to run but without overpaying for unnecessary network performance. One way to predict how well a given application will perform on a particular network in advance of its procurement is via a technique called network simulation. With simulation, one mimics hardware's behavior and performance characteristics using a software test bed. Simulating hardware is slower—typically many thousands of times slower—than running on true hardware but is cheap to deploy and easy to modify to investigate different design alternatives.

The problem with existing network simulators and simulation studies is that they tend to incorporate so much detail that they cannot handle large numbers of nodes or substantial lengths of time. Furthermore, for simplicity of implementation they ignore the juxtaposition of communication with computation and with other communication, unrealistically assuming that all messages are initiated in a single burst. In this article we proposed addressing the speed issue with flow-based simulation and the realistic-usage issue with logical clocks that are augmented with physical computation time. To demonstrate the potential of this approach we implemented a tool to derive logical-time traces from parallel applications and a flow-based simulator to replay those traces on different simulated network topologies and with different network performance characteristics.

One can draw the following conclusions from the experimental data we presented. First, our approach accurately simulates real execution time. Although our implementation is in its nascent stages, we already saw less than 7% error when simulating a scientific application, PAGOSA, running for an hour and a half across a 1,000-node network. Second, flow-based simulation runs at reasonable speeds. We replayed that 1,000-node, hour-and-a-half run on different simulated networks using only a single node, and it ran only 10 times slower than real time, not thousands or
tens of thousands, which is what is typical for a more November 2012 special issue of Elsevier’s Journal of traditional network simulator. ird, the common Parallel and Distributed Computing, which focused on simplication of ignoring communication dependen- interconnection networks. Dr. Pakin received a BS in cies and computation time in network simulations mathematics/computer science with research honors exaggerates the pressure the application applies to from Carnegie Mellon University in 1992, an MS in the network and leads to incorrect assessments of computer science from the University of Illinois at network performance. Urbana-Champaign in 1995, and a PhD in computer science from the University of Illinois at Urbana- In our experiments, we found that PAGOSA per- Champaign in 2001. forms so much computation relative to communica- tion that the network topology and basic performance Xin Yuan is a full professor in the Department of characteristics are largely inconsequential. In contrast, Computer Science at Florida State University and a more traditional network simulator would incorrect- recently took a research sabbatical at Los Alamos Na- ly claim 70% more performance for a fat-tree topology tional Laboratory. His research interests include paral- than for a 3-D torus topology when replaying PAGO- lel and distributed systems, interconnection networks, SA’s communication pattern. communication optimizations, and networking. He obtained his BS and MS degrees in computer science In summary, combining logical time with ow- from Shanghai Jiaotong University in 1989 and 1992, based simulation opens up new avenues for evaluating respectively. He earned his PhD degree in computer how fast applications will run on dierent super- science from the University of Pittsburgh in 1998. He computer networks, most notably supercomputer publishes extensively on interconnection networks networks that have not yet been built. is capability and communication-library implementation and can inform network design decisions—or even simply optimizations. a selection from multiple existing networks—to help provide applications with the best communication e Self-Tuned Adaptive Routines for Message performance that the supercomputer budget allows. Passing Interface (STAR-MPI) soware package that he and his students developed has been incorporated About the authors into the soware stack of IBM’s Blue Gene/P super- computer. Professor Yuan is currently serving on the Scott Pakin is a research scientist at Los Alamos editorial boards of several international journals. He National Laboratory. He has been actively working has also served as the program chair and vice chair for in the area of high-performance network research several international conferences and workshops, such for over 15 years, beginning with the development as the International Conference on Parallel Processing of Fast Messages, one of the rst high-speed messag- (ICPP) and the Institute of Electrical and Electronics ing layers for a commodity supercomputing network, Engineers (IEEE) International Conference on High Myrinet; and more recently including the Cell Messag- Performance Computing (HiPC), and as a program ing Layer, which makes it practical for computational committee member for many international confer- accelerators to communicate directly across a deep, ences and workshops. 
About the authors

Scott Pakin is a research scientist at Los Alamos National Laboratory. He has been actively working in the area of high-performance network research for over 15 years, beginning with the development of Fast Messages, one of the first high-speed messaging layers for a commodity supercomputing network, Myrinet; more recently including the Cell Messaging Layer, which makes it practical for computational accelerators to communicate directly across a deep, heterogeneous network hierarchy; and coNCePTuaL, a domain-specific language, compiler, and run-time system that facilitate the rapid generation of custom network speed tests with repeatable results. Dr. Pakin has served on numerous network-related national and international conference and workshop program committees, including the position of area cochair for the Architecture and Networks track of this year's annual Supercomputing Conference (SC'13) and continuing cochair service for the annual Communication Architecture for Scalable Systems (CASS) workshop. He also served as a guest editor for the November 2012 special issue of Elsevier's Journal of Parallel and Distributed Computing, which focused on interconnection networks. Dr. Pakin received a BS in mathematics/computer science with research honors from Carnegie Mellon University in 1992, an MS in computer science from the University of Illinois at Urbana-Champaign in 1995, and a PhD in computer science from the University of Illinois at Urbana-Champaign in 2001.

Xin Yuan is a full professor in the Department of Computer Science at Florida State University and recently took a research sabbatical at Los Alamos National Laboratory. His research interests include parallel and distributed systems, interconnection networks, communication optimizations, and networking. He obtained his BS and MS degrees in computer science from Shanghai Jiaotong University in 1989 and 1992, respectively. He earned his PhD degree in computer science from the University of Pittsburgh in 1998. He publishes extensively on interconnection networks and communication-library implementation and optimizations. The Self-Tuned Adaptive Routines for Message Passing Interface (STAR-MPI) software package that he and his students developed has been incorporated into the software stack of IBM's Blue Gene/P supercomputer. Professor Yuan is currently serving on the editorial boards of several international journals. He has also served as the program chair and vice chair for several international conferences and workshops, such as the International Conference on Parallel Processing (ICPP) and the Institute of Electrical and Electronics Engineers (IEEE) International Conference on High Performance Computing (HiPC), and as a program committee member for many international conferences and workshops. He is a senior member of both the Association for Computing Machinery (ACM) and IEEE.

Michael Lang is the team leader of Ultrascale Systems Research at Los Alamos National Laboratory. His research interests include distributed services, performance of large-scale systems, operating-system and run-time issues for supercomputers, and interconnects for large-scale systems. He has published work on the application-specific optimization of routing on InfiniBand interconnects for large-scale systems; notably, this algorithm is currently included in OpenSM in the OpenFabrics software stack. Lang was formerly a member of Los Alamos National Laboratory's Performance and Architecture team, involved in performance analysis of new large-scale systems for the US Department of Energy. He received a BS in computer engineering and an MS in electrical engineering in 1988 and 1993, respectively, both from the University of New Mexico.
References

[1] Kothe DB, Baumgardner JR, Cerutti JH, Daly BJ, Holian KS, Kober EM, Mosso SJ, Painter JW, Smith RD, Torrey MD. "PAGOSA: A massively parallel, multi-material hydrodynamics model for three-dimensional high-speed flow and high-rate material deformation." In: Tentner A, editor. High Performance Computing Symposium 1993: Grand Challenges in Computer Simulation (Proceedings of the 1993 Simulation Multiconference on the High Performance Computing Symposium; Mar 29–Apr 1, 1993, Arlington, VA). San Diego (CA): Society for Computer Simulation; 1993. p. 9–14. ISBN: 978-1565550520.

[2] Lamport L. "Time, clocks, and the ordering of events in a distributed system." Communications of the ACM. 1978;21(7):558–565. doi: 10.1145/359545.359563.

[3] Meuer H, Strohmaier E, Dongarra J, Simon H. TOP500 Supercomputer Sites: June 2013 [accessed 2013 Aug 2]. Available at: http://www.top500.org/lists/2013/06.

[4] Öhring SR, Ibel M, Das SK, Mohan JK. "On generalized fat trees." In: Proceedings of the 9th International Parallel Processing Symposium; Apr 1995, Santa Barbara (CA). p. 37–44. doi: 10.1109/IPPS.1995.395911.
Doing more with less: Cooling computers with oil pays off
David Prucnal, PE
A consequence of doing useful work with computers is the production of heat. Every watt of energy that goes into a computer is converted to a watt of heat that needs to be removed, or else the computer will melt, burst into flames, or meet some other undesirable end. Most computer systems in data centers are cooled with air conditioning, while some high-performance systems use contained liquid cooling systems where cooling fluid is typically piped into a cold plate or some other heat exchanger. Immersion cooling works by directly immersing IT equipment into a bath of cooling fluid. The National Security Agency's Laboratory for Physical Sciences (LPS) acquired and installed an oil-immersion cooling system in 2012 and has evaluated its pros and cons. Cooling computer equipment by using oil immersion can substantially reduce cooling costs; in fact, this method has the potential to cut in half the construction costs of future data centers.
Network servers are submerged into a tank of mineral oil. (Photo used with permission from Green Revolution Cooling: www.grcooling.com.)
The fundamental problem

Before getting into the details of immersion cooling, let's talk about the production of heat by computers and the challenge of effectively moving that heat from a data center to the atmosphere or somewhere else where the heat can be reused.

In order for computers to do useful work, they require energy. The efficiency of the work that they do can be measured as the ratio of the number of operations that they perform to the amount of energy that they consume. There are quite a few metrics used to measure computer energy efficiency, but the most basic is operations per watt (OPS/W). Optimizing this metric has been the topic of many PhD theses and will continue to be the subject of future dissertations. Over the years, there has been progress against this metric, but that progress has slowed because much of the low-hanging fruit has been harvested and some of the key drivers, Moore's Law and Dennard scaling, have approached the limits of their benefit. Improvements to the OPS/W metric can still be made, but they usually come at the expense of performance.

The problem is not unlike miles per gallon for cars. The internal combustion engine is well understood and has been optimized to the nth degree. For a given engine, car weight, and frontal area, the gas mileage is essentially fixed. The only way to improve the miles per gallon is to reduce the performance or exploit external benefits. In other words, drive slower, accelerate less, drift down hills, find a tailwind, etc. Even after doing all of these things, the improvement in gas mileage is only marginal. So it is, too, with computers. Processor clock frequencies and voltages can be reduced, sleep modes can be used, memory accesses and communications can be juggled to amortize their energy costs, but even with all of this, the improvement in OPS/W is limited.

A natural consequence of doing useful work with computers is the production of heat. Every watt of energy that goes into a computer is converted into a watt of heat that needs to be removed from the computer, or else it will melt, burst into flames, or meet some other undesirable end. Another metric, which until recently was less researched than OPS/W, is kilowatts per ton (kW/ton), which has nothing to do with the weight of the computer system that is using up the energy. Here, ton refers to an amount of air conditioning; hence, kW/ton has to do with the amount of energy used to expel the heat that the computer generates by consuming energy (see figure 1).

FIGURE 1. There are two halves to the computer power efficiency problem: efficiency of the actual computation (green sector) and efficiency of the cooling infrastructure (blue sector).

In fact, many traditional data centers consume as much energy expelling heat as they do performing useful computation. This is reflected in a common data center metric called power usage effectiveness (PUE), which in its simplest form is the ratio of the power coming into a data center to the power used to run the computers inside. A data center with a PUE of 2.0 uses as much power to support cooling, lighting, and miscellaneous loads as it does powering the computers. Of these other loads, cooling is by far the dominant component. So, another way to improve data center efficiency is to improve cooling efficiency.
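As a back-of-the-envelope illustration of the PUE definition (the numbers below are invented for the example, not measurements from any particular facility):

```python
# PUE = total power entering the data center / power running the computers.
def pue(total_facility_kw, it_load_kw):
    return total_facility_kw / it_load_kw

print(pue(2000, 1000))   # 2.0: as much power for cooling, lighting, and
                         # miscellaneous loads as for the computers themselves
print(pue(1050, 1000))   # 1.05: close to the ideal PUE of 1.0
```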
The best case scenario would be to achieve a PUE of 1.0. One way to achieve this would be to build a data center in a location where the environmental conditions allow for free cooling. Some commercial companies have taken this approach and built data centers in northern latitudes with walls that can be opened to let in outside air to cool the computers when the outside temperature and humidity are within allowable limits. However, for those of us who are tied to the mid-Atlantic region where summers are typically hot and humid, year-round free cooling is not a viable option. How can data centers in this type of environment improve their kW/ton and PUE?
How computers are cooled

There are many different ways that computers are kept cool in data centers today; however, the most common method is to circulate cool air through the chassis of the computer. Anyone who has ever turned on a computer knows that the computer makes noise when it is powered on. Central processing units (CPUs), memory, and any other solid-state components are completely silent, so what makes the noise? Spinning disk drives can make a little noise, but by far, the dominant noisemakers are the fans that are used to keep air moving across the solid-state devices that are all busily doing work and converting electrical power input into computation and heat. Even the power supply in a computer has a fan because the simple act of converting the incoming alternating current (ac) power to usable direct current (dc) power and stepping that power down to a voltage that is usable by the computer creates heat. All of the fans in a computer require power to run, and because they are not perfectly efficient, they too create a little heat when they run. The power used to run these fans is usually counted as computer load, so it ends up in the denominator of the PUE calculation, even though it does nothing toward actual computation.

But how do all of these fans actually cool the computer? Think of cooling as heat transfer. In other words, when an object is cooled, heat is transferred away from that object. What do people do when they burn their finger? They blow on it, and if they are near a sink, they run cold water on it. In both cases they are actually transferring heat away from their burnt finger. By blowing, they are using air to push heat away from their finger, and by running water, they are immersing their hot finger in a cool fluid that is absorbing and carrying the heat away. Anyone who has burned a finger knows that cold water brings much more relief than hot breath. But why? The answer depends on principles like thermal conductivity and heat capacity of fluids. It also helps to understand how heat moves.

Heat on the go: Radiation, conduction, convection, and advection

Imagine a campfire on a cool evening. The heat from the fire can be used to keep warm and to roast marshmallows, but how does the heat move from the fire? There are three modes of heat transfer at work around a campfire: radiation, conduction, and convection (see figure 2). As you sit around the fire, the heat that moves out laterally is primarily radiant heat. Now, assume you have a metal poker for stirring the coals and moving logs on the fire. If you hold the poker in the fire too long it will start to get hot in your hand. This is because the metal is conducting heat from the fire to your hand. To a much lesser extent the air around the fire is also conducting heat from the fire to you. If you place your hands over the fire, you will feel very warm air rising up from the fire. This heat transfer, which results from the heated air rising, is convection. Now, if an external source, such as a breeze, blows across the fire in your direction, in addition to getting smoke in your eyes, you will feel heat in the air blowing towards you. This is advection. In a computer, a CPU creates heat that is typically conducted through a heat spreader and then into the surrounding air. Convection causes the air to rise from the heat spreader, where it is then blown, or advected, away by the computer's cooling fan.

FIGURE 2. There are three modes of heat transfer at work around a campfire: radiation, conduction, and convection.

Now that we know how heat moves, why is it that it feels so much better to dunk a burnt finger in water than to blow on it? This is where thermal conductivity and heat capacity of the cooling fluid come into play. First, a few definitions:

- Thermal conductivity is the ability of a material to conduct heat; it is measured in watts per meter degree Celsius, or W/(m·°C).
- Heat capacity is the amount of heat required to change a substance's temperature by a given amount, or the amount of heat that a substance can absorb for a given temperature increase; it is measured in joules per degree Celsius (J/°C).
- Specific heat capacity is the heat capacity per unit mass or volume; it is typically given per unit mass and simply called specific heat (Cp); it is measured in joules per gram degree Celsius, or J/(g·°C).

The answer to why it feels so much better to dunk a burnt finger into water than to blow on it can be found in table 1. First, water is a much better conductor of heat than air, by a factor of 24. Think of it as having 24 times more bandwidth for moving heat. Second, water can hold far more heat than air. In fact, 3,200 times more. So, water provides 24 times more heat transfer bandwidth and 3,200 times more heat storage than air. No wonder the finger feels so much better in the water.

One more thing to consider about heat transfer: heat naturally flows from hot to cold, and the rate of heat transfer is proportional to the temperature difference. This is why the colder the water, the better that burnt finger is going to feel.

Cooling computers

By now it should be apparent that the fans in a computer are there to advect (i.e., move) a cooling fluid (e.g., air) across the heat producing parts (e.g., CPUs, memories, and peripheral component interconnect cards) so that the cooling fluid can absorb heat through conduction and carry it away. This can be described by the following mass flow heat transfer equation:

Q̇ = ṁ · cp · ΔT

In this equation, Q̇ is the rate of heat transfer in watts, ṁ is the mass flow rate of the cooling fluid in grams per second, cp is the specific heat of the cooling fluid, and ΔT is the change in temperature of the cooling fluid. What it says is that the cooling depends on the product of the amount of coolant flowing over the heat source, the ability of the coolant to hold heat, and the temperature rise in the coolant as it flows across the heat source.

How much air does it take to keep a computer cool? There is a rule of thumb used in the data center design world that 400 cubic feet per minute (CFM) of air is required to provide 1 ton of refrigeration. One ton of refrigeration is defined as 12,000 British thermal units per hour (Btu/h). Given that 1 kilowatt-hour is equivalent to 3,412 British thermal units, it can be seen that a ton of refrigeration will cool a load of 3,517 W, or approximately 3.5 kW. The mass flow heat transfer equation can be used to confirm the rule of thumb. Air is supplied from a computer room air conditioning (CRAC) unit in a typical data center at about 18°C (64°F). Now, 400 CFM of air at 18°C is equivalent to 228 grams per second, and the specific heat of air is equivalent to 1 J/(g·°C). Solving the mass flow heat transfer equation above with this information yields a change in temperature of 15°C. What all this confirms (in Fahrenheit) is that when 64°F cooling air is supplied at a rate of 400 CFM per 3.5 kW of computer load, the exhaust air from the computers is 91°F. Anyone who has stood in the "hot aisle" directly behind a rack of servers will know that this rule of thumb is confirmed.

It is not unusual for a server rack to consume over 10 kW. Using the rule of thumb above, a 10.5 kW server rack requires 1,200 cubic feet of cooling air per minute, enough air to fill a 150 square foot office space with an 8 foot ceiling. That's a whole lot of air! Simply moving all of that air requires a significant amount of energy. In fact, for racks of typical one-unit (1U) servers, the energy required to move cooling air from the CRAC units and through the servers is on the order of 15% of the total energy consumed by the computers. Remember: this is just the energy to move the cooling air; it does not include the energy required to make the cold air.
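The rule-of-thumb check in the preceding paragraphs can be reproduced directly from the mass flow heat transfer equation; a short sketch using the article's own numbers (the unit conversions are standard):

```python
# Confirm the 400-CFM-per-ton rule of thumb with Q = m_dot * cp * dT.
ton_of_refrigeration_w = 12000 / 3412 * 1000   # 12,000 Btu/h is ~3,517 W

m_dot = 228.0     # g/s: roughly 400 CFM of air at 18 degrees C
cp_air = 1.0      # J/(g*degC): specific heat of air

delta_t = ton_of_refrigeration_w / (m_dot * cp_air)
print(round(delta_t))        # ~15 degC rise across the servers
print(round(18 + delta_t))   # ~33 degC (~91 degF) hot-aisle exhaust
```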
TABLE 1. Thermal conductivity and heat capacity of common substances

Substance    | Thermal Conductivity, W/(m·°C) at 25°C | Specific Heat (Cp), J/(g·°C) | Volumetric Heat Capacity (Cv), J/(cm³·°C)
Air          | 0.024                                  | 1                            | 0.001297
Water        | 0.58                                   | 4.20                         | 4.20
Mineral Oil  | 0.138                                  | 1.67                         | 1.34
Aluminum     | 205                                    | 0.91                         | 2.42
Copper       | 401                                    | 0.39                         | 3.45
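The factors quoted in the text (24, 3,200, and, for oil, "over 1,000") fall straight out of table 1; a quick check:

```python
# Ratios from table 1 that drive the air/water/oil comparisons.
conductivity = {"air": 0.024, "water": 0.58, "oil": 0.138}      # W/(m*degC)
heat_capacity = {"air": 0.001297, "water": 4.20, "oil": 1.34}   # J/(cm^3*degC)

print(round(conductivity["water"] / conductivity["air"]))    # 24x the "bandwidth"
print(round(heat_capacity["water"] / heat_capacity["air"]))  # ~3,200x the storage
print(round(heat_capacity["oil"] / heat_capacity["air"]))    # ~1,000x: why ~1 CFM
                                                             # of oil can replace
                                                             # ~1,200 CFM of air
```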
If there was a way to cool computers without moving exorbitant quantities of air, it could reduce energy consumption by up to 15%. This may not seem like much, but consider that a 15% improvement in OPS/W is almost unheard of, and for a moderate 10 megawatt (MW) data center, a 15% reduction in energy consumption translates into a savings of $1.5 million per year.

Pumping oil versus blowing air

Unfortunately, we cannot dunk a computer in water like a burnt finger since electricity and water do not play well together. Mineral oil, on the other hand, has been used by electric utilities to cool electrical power distribution equipment, such as transformers and circuit breakers, for over 100 years. Mineral oil only has about 40% of the heat holding capacity and about one quarter the thermal conductivity of water, but it has one huge advantage over water: it is an electrical insulator. This means that electrical devices can operate while submerged in oil without shorting out.

While mineral oil does not have the heat capacity of water, it still holds over 1,000 times more heat than air. This means that the server rack discussed earlier that needed 1,200 CFM of air to keep from burning up could be kept cool with just about 1 CFM of oil. The energy required to pump 1 CFM of oil is dramatically less than the energy required to blow 1,200 CFM of air. In a perfectly designed data center, where the amount of air blown or oil pumped is matched exactly to the heat load, the energy required to blow air is five times that required to pump oil for the same amount of heat removed. In reality, the amount of air moved through a data center is far more than that required to satisfy the load. This is due to the fact that not all of the air blown into a data center passes through a computer before it returns to the CRAC unit. Since the air is not ducted directly to the computers' air intakes, it is free to find its own path back to the CRAC unit, which is frequently over, around, or otherwise not through a server rack. As we will soon see, it is much easier to direct the path of oil and to pump just the right amount of oil to satisfy a given computer heat load. Thus, the energy required to circulate oil can be more than 10 times less than the energy required to circulate air.

Immersion cooling system

Now that we have established that mineral oil would be a far more efficient fluid to use for removing heat from computers, let's look at how a system could be built to take advantage of this fact.

Imagine a rack of servers. Now imagine that the rack is tipped over onto its face. Now convert the rack into a tub full of servers. Now fill the tub with mineral oil.

Figures 3 and 4 show the system that LPS acquired and is using in its Research Park facility. The system is comprised of a tank filled with mineral oil that holds the servers and a pump module that contains an oil-to-water heat exchanger and oil circulation pump. In this installation, the heat exchanger is tied to the facility's chilled water loop; however, this is not a necessity, as will be discussed later. The oil is circulated between the tank and the heat exchanger by a small pump. The pump speed is modulated to maintain a constant temperature in the tank. This matches the cooling fluid supply directly to the load. The design of the tank interior is such that the cool oil coming from the heat exchanger is directed so that most of it must pass through the servers before returning to the heat exchanger. The combination of pump speed modulation and oil ducting means that the cooling fluid is used very efficiently. The system only pumps the amount needed to satisfy the load, and almost all of what is pumped passes through the load.

FIGURE 3. The immersion cooling system at the Laboratory for Physical Sciences uses mineral oil to cool IT equipment. (Photo used with permission from Green Revolution Cooling: www.grcooling.com.)

FIGURE 4. Network servers are submerged into a tank of mineral oil and hooked up to a pump that circulates the oil. (Photo used with permission from Green Revolution Cooling: www.grcooling.com.)

There are three interesting side benefits to immersion cooling in addition to its efficiency. The first is due to the fact that the system is designed to maintain a constant temperature inside the tank. Because the pump is modulated to maintain a set point temperature regardless of changes in server workload, the servers live in an isothermal environment. One of the causes of circuit board failures is the mismatch in the coefficients of thermal expansion, or CTEs. The CTEs for the silicon, metal, solder, plastic, and fiberglass used in a circuit board are all different, which means that these materials expand and contract at different rates in response to temperature changes. In an environment where the temperature is changing frequently due to load changes, this difference in CTEs can eventually lead to mechanical failures on the circuit board. Oil immersion reduces this problem by creating a temperature-stable environment.

The second side benefit is server cleanliness. Air-cooled servers are essentially data center air cleaners. While data centers are relatively clean environments, there is still some dust and dirt present. Remember, a typical server rack is drawing in a large office space full of air every minute. Any dust or dirt in that air tends to accumulate in the chassis of the servers.

The final side benefit of immersion cooling is silence. Immersion cooling systems make virtually no noise. This is not an insignificant benefit, as many modern air-cooled data centers operate near or above the Occupational Safety and Health Administration's allowable limits for hearing protection.

In addition to efficient use of cooling fluid and the side benefits mentioned above, there is another advantage to immersion cooling: server density. As mentioned earlier, a typical air-cooled server rack consumes about 10 kW. In some carefully engineered HPC racks, 15–20 kW of load can be cooled with air. In comparison, the standard off-the-shelf immersion cooling system shown in figure 3 is rated to hold 30 kW of server load with no special engineering or operating considerations.

Doing more with less

Let's take a look at how immersion cooling can enable more computation using less energy and infrastructure.

Air cooling infrastructure

Cooling air is typically supplied in a computer room with CRAC units. CRAC units sit on the computer room raised floor and blow cold air into the underfloor plenum. This cold air then enters the computer room through perforated floor tiles that are placed in front of racks of computers. Warm exhaust air from the computers then travels back to the top of the CRAC units where it is drawn in, cooled, and blown back under the floor. In order to cool the air, CRAC units typically use a chilled-water coil, which means that the computer room needs a source of chilled water. The chilled water (usually 45–55°F) is supplied by the data center chiller plant. Finally, the computer room heat is exhausted to the atmosphere outside, usually via evaporative cooling towers.
Oil-immersion systems also need to expel heat, and one way is through the use of an oil-to-water heat exchanger; this means that oil-immersion systems, like CRAC units, need a source of cooling water. The big difference, however, is that CRAC units need 45–55°F water, whereas oil-immersion systems can operate with cooling water as warm as 85°F. Cooling towers alone, even in August in the mid-Atlantic area, can supply 85°F water without using power-hungry chillers. Because oil-immersion systems can function with warm cooling water, they can take advantage of various passive heat sinks, including radiators, geothermal wells, or nearby bodies of water.

The takeaway here is that there is a significant amount of expensive, energy-hungry infrastructure required to make and distribute cold air to keep computers in a data center cool. Much of this infrastructure is not required for immersion cooling.

Fan power

One of the primary benefits of immersion cooling is the removal of cooling fans from the data center. Not only are the energy savings that result from the removal of cooling fans significant, they are compounded by potentially removing the necessity for CRAC units and chillers.

Cooling fans in a typical 1U rack-mounted server consume roughly 10% of the power used by the server. Servers that are cooled in an oil-immersion system do not require cooling fans. This fact alone means that immersion cooling requires approximately 10% less energy than air cooling. Internal server fans, however, are not the only fans required for air-cooled computers. CRAC unit fans are also necessary in order to distribute cold air throughout the data center and present it to the inlet side of the server racks.

This CRAC unit fan power must be considered when determining the actual fan-power savings that can be realized by immersion cooling systems. Table 2 compares the power required to move 1 W of exhaust heat into a data center's chilled water loop for fan-blown air cooling versus pump-driven oil-immersion cooling. The third column shows this power as a percentage of IT technical load. It shows that the power required to run all fans in an air-cooled system is equal to 13% of the technical load that is being cooled. This is contrasted with the power required to run pumps in an oil-immersion cooling system, which is equal to 2.5% of the technical load that is being cooled. The difference, 10.5%, represents the net fan-power savings achieved by switching from an air-cooled to an immersion-cooled data center. This analysis assumes that in both the air-cooled and immersion-cooled cases, the cooling infrastructure is matched exactly to the load. The last column in table 2 uses a similar analysis but assumes that the cooling infrastructure capacity is provisioned at twice the load. It shows that overprovisioned fan power grows faster than overprovisioned pump power. This is further illustrated in figure 5.

Lower operating expenses

Table 3 compares the fan power versus pump power required to serve a 1 MW technical load, assuming the cooling infrastructure is sized to serve 150% of the load. It shows that the fan power to circulate cold air exceeds the pump power to circulate oil by 158 kW per megawatt of technical load. At one million dollars per megawatt-year, this equates to $158,000 a year in additional cooling energy operating expense. This represents the savings due solely to circulating cooling fluid. When the cost of making cold air is considered, the energy savings of immersion cooling becomes much more significant.

Table 4 summarizes the energy required for air cooling that is not needed for immersion cooling. The values in table 4 are typical for reasonably efficient data centers.
TABLE 2. Power usage for air-cooled versus immersion-cooled data centers

Method of Cooling              | Power Required to Move 1 W of Waste Heat into Chilled Water Loop (W) | Percentage of Technical Load to Power Fan or Pump (at 100%) | Percentage of Technical Load to Power Fan or Pump (at 200%)
Fan-Powered Air                | 0.13                                                                 | 13%                                                         | 26%
Pump-Powered Oil Immersion     | 0.025                                                                | 2.5%                                                        | 5%
Net savings due to fan removal |                                                                      | 10.5%                                                       | 21%
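The entries in tables 2 and 3 are linked by simple arithmetic: the fan and pump percentages scale with the installed capacity, and the delta converts to dollars at the stated rate. A sketch using the article's figures:

```python
# Fan versus pump power for a 1 MW technical load (from tables 2 and 3).
load_kw = 1000.0
fan_frac, pump_frac = 0.13, 0.025    # fractions of load at 100% provisioning
capacity = 1.5                       # infrastructure sized at 150% of load

fan_kw = load_kw * fan_frac * capacity      # 195 kW   (19.5% of load)
pump_kw = load_kw * pump_frac * capacity    # 37.5 kW  (3.75% of load)
print(fan_kw - pump_kw)                     # 157.5 kW, table 3's ~158 kW delta

# At roughly one million dollars per megawatt-year of energy:
print((fan_kw - pump_kw) / 1000 * 1e6)      # ~$158,000 saved per MW per year
```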
TABLE 3. Power usage for air-cooled versus immersion-cooled data centers with 1 MW of technical load

Method of Cooling          | Fan or Pump Power as a Percentage of Technical Load (at 150% capacity) | Total Power (at 150% capacity)
Fan-Powered Air            | 19.5%                                                                  | 1.195 MW
Pump-Powered Oil Immersion | 3.75%                                                                  | 1.0375 MW
Delta                      |                                                                        | 158 kW

TABLE 4. Power usage of cooling equipment in air-cooled data centers

Cooling Equipment | Power Usage (kW/ton)
Chillers          | 0.7 kW/ton
CRAC Units        | 1.1 kW/ton
Server Fans       | 0.2 kW/ton
Total             | 2.1 kW/ton

FIGURE 5. The power required to run the fans in an air-cooled data center (purple line) accounts for about 13% of the center's technical load (26% if run at twice the technical load); whereas, the power required to run the pumps in an immersion-cooled data center (green line) accounts for about 2.5% of the center's technical load (5% if run at twice the technical load). As is illustrated, overprovisioned fan power grows faster than overprovisioned pump power.

One ton of refrigeration will cool approximately 3,500 W of technical load; therefore, 1 MW of technical load requires a minimum of 285 tons of refrigeration. At 2.1 kW/ton, the air-cooled data center cooling infrastructure consumes about 600 kW to cool 1 MW worth of technical load. This equates to $600,000 per year per megawatt of technical load. Almost all of this energy cost can be eliminated by immersion cooling since chillers, CRAC units, and server fans are not required.

Lower capital expenses

Immersion cooling requires far less infrastructure than air cooling; therefore, building data centers dedicated to immersion cooling is substantially less expensive.

Cooling infrastructure accounts for a major portion of data center construction costs. In high reliability/availability data centers, it is not uncommon for the cooling infrastructure to account for half of the overall construction cost. According to the American Power Conversion Data Center Capital Cost Calculator, cooling infrastructure accounts for at least 43% of data center construction cost.

For large data centers, where the technical load is in the neighborhood of 60 MW, construction costs can approach one billion dollars. This means that about 500 million dollars is being spent on cooling infrastructure per data center. Since immersion-cooled systems do not require chillers, CRAC units, raised flooring, temperature and humidity controls, etc., they offer a substantial reduction in capital expenditures over air-cooled systems.

Immersion cooling FAQs

Several recurring questions have emerged over the many tours and demonstrations of the LPS immersion cooling system. Here are answers to these frequently asked questions.

Q: What server modifications are required for immersion?

There are three modifications that are typically required:
1. Removing the cooling fans. Since some power supplies will shut down upon loss of cooling, a small emulator is installed to trick the power supply into thinking the fan is still there.
2. Sealing the hard drives. This step is not required for solid-state drives or for newer sealed helium-filled drives.
3. Replacing the thermal interface paste between chips and heat spreaders with indium foil.

Some server vendors are already looking at providing immersion-ready servers which will be shipped with these modifications already made.

Q: Are there hazards associated with the oil (e.g., fire, health, spillage)?

With regard to flammability, the mineral oil is a Class IIIB liquid with a flammability rating of 1 on a scale of 4. Accordingly, immersion cooling does not require any supplemental fire suppression systems beyond what is normally used in a data center. The health effects are negligible; the oil is essentially the same as baby oil. Spills and leaks are considered a low probability; however, for large installations, some form of spill containment is recommended. Spill decks, berms, curbs, or some other form of perimeter containment is sufficient.

Q: How much does the system weigh?

A 42U tank fully loaded with servers and oil weighs about 3,300 pounds, of which the oil accounts for about 1,700 pounds. This weight is spread over a footprint of approximately 13 square feet for a floor loading of approximately 250 pounds per square foot. A comparably loaded air-cooled server rack weighs about 1,600 pounds with a footprint of 6 square feet, which also translates to a floor loading of about 250 pounds per square foot.

Q: How is the equipment serviced or repaired?

Basic services such as device and board-level replacements are not significantly different than for air-cooled equipment. Hot-swaps can be done in the oil. For services requiring internal access, the server can be lifted out of the tank and placed on drainage rails above the surface of the oil. After the oil drains, component replacement is carried out the same way as for air-cooled servers. For rework at the circuit board level that requires removal of the oil, there are simple methods available to ultrasonically remove oil from circuit boards and components.

Q: Are there other types of immersion-cooling systems besides oil immersion?

Yes. What this article has covered is called single-phase immersion; that is, the oil remains in the liquid phase throughout the cooling cycle. There are some people looking into two-phase immersion-cooling systems. In a two-phase cooling process, the cooling liquid is boiled off. The resulting vapor is captured and condensed before being recirculated. The phase change from liquid to gas allows for higher heat removal but adds to the complexity of the system. Also, the liquid used in two-phase systems is extremely expensive compared to mineral oil. At this time, there are no two-phase immersion-cooling systems commercially available.

Conclusion

Computers consume energy and produce computation and heat. In many data centers, the energy required to remove the heat produced by the computers can be nearly the same as the energy consumed performing useful computation. Energy efficiency in the data center can therefore be improved either by making computation more energy efficient or by making heat removal more efficient.

Immersion cooling is one way to dramatically improve the energy efficiency of the heat removal process. The operating energy required for immersion cooling can be over 15% less than that of air cooling. Immersion cooling can eliminate the need for infrastructure that can account for half of the construction cost of a data center. In addition, immersion cooling can reduce server failures and is cleaner and quieter than air cooling.

Immersion cooling can enable more computation using less energy and infrastructure, and in these times of fiscal uncertainty, the path to success is all about finding ways to do more with less.
About the author

David Prucnal has been active as a Professional Engineer in the field of power engineering for over 25 years. Prior to joining NSA, he was involved with designing, building, and optimizing high-reliability data centers. He joined the Agency as a power systems engineer 10 years ago and was one of the first to recognize the power, space, and cooling problem in high-performance computing (HPC). He moved from facilities engineering to research to pursue solutions to the HPC power problem from the demand side versus the infrastructure supply side. Prucnal leads the energy efficiency thrust within the Agency's Advanced Computing Research team at the Laboratory for Physical Sciences. His current work includes power-aware data center operation and immersion cooling. He also oversees projects investigating single/few electron transistors, three-dimensional chip packaging, low-power electrical and optical interconnects, and power efficiency through enhanced data locality.
Energy-efficient superconducting computing coming up to speed

Marc A. Manheimer

Power and energy use by large-scale computing systems is a significant and growing problem. The growth of large, centralized computing facilities is being driven by several factors including cloud computing, support of mobile devices, Internet traffic growth, and computation-intensive applications. Classes of large-scale computing systems include supercomputers, data centers, and special purpose machines. Energy-efficient computers based on superconducting logic may be an answer to this problem.
Introduction

Supercomputers are also known as high-performance or high-end systems. Information about the supercomputers on the TOP500 list is readily available [1, 2]. The cumulative power demand of the TOP500 supercomputers was about 0.25 gigawatts (GW) in 2012. The Defense Advanced Research Projects Agency and the Department of Energy have both put forth efforts to improve the energy efficiency of supercomputers with the goal of reaching 1 exaflops for 20 megawatts (MW) by 2020. The flops metric (i.e., floating point operations per second) is based on Linpack, which uses double-precision floating point operations, and 1 exaflops is equivalent to 10^18 flops.

Data centers numbered roughly 500,000 worldwide in 2011 and drew an estimated 31 GW of electric power [3–5]. Information about data centers is harder to find than that of supercomputers, as there is no comprehensive list and much of the information is not public. Exceptions include colocation data centers [6], which are available for hire and include about 5% of data centers by number, and the Open Compute Project led by Facebook [7]. Part of the Open Compute Project, Facebook's first European data center under construction in Lulea, Sweden will be three times the size of its Prineville, Oregon data center, which has been using an average of 28 MW of power [8, 9]. Facebook has been a leader in efforts to reduce power consumption in data centers, and Lulea's location just below the Arctic Circle with an average temperature of 1.3°C helps with cooling, but average power usage is still expected to exceed 50 MW.

A 2010 study by Bronk et al. projected that US data center energy use would rise from 72 to 176 terawatt hours (TWh) between 2009 and 2020, assuming no constraints on energy availability [10]. The potential benefit to the US of technology that reduces energy requirements by a factor of 10 is on the order of $15 billion annually by the year 2020, assuming an energy cost of $0.10 per kilowatt hour (kWh). Note that this counts only the benefit of energy savings and does not include the potential economic benefits resulting from increased data center operation or savings due to reduced construction costs.

Conventional computing technology based on semiconductor switching devices and normal metal interconnects may not be able to increase energy efficiency fast enough to keep up with increasing demand for computing. Superconducting computing is an alternative that makes use of low temperature phenomena with potential advantages. Superconducting switches based on the Josephson effect switch quickly (i.e., ~1 picosecond), dissipate very little energy per switch (i.e., less than 10^-19 joules), and produce small current pulses which travel along superconducting passive transmission lines at about one third the speed of light with very low loss. Superconducting computing circuits typically operate in the 4–10 kelvin temperature range.

Earlier technologies for superconducting computing were not competitive due to the lack of adequate cryogenic memory, interconnects between the cryogenic and room temperature environments capable of high data transmission rates, and fabrication capability for superconducting electronic circuits.

Superconducting computing

Recent developments in superconducting computing circuits include variants with greatly improved energy efficiency [11]. Prospects for cryogenic memories have also improved with the discovery of memory elements which combine some of the features of Josephson junctions and magnetic random access memory (MRAM). The ability to operate both logic and memory within the cold environment, rather than with the main memory out at room temperature, decreases demands on the interconnects to room temperature to the point that engineering solutions can be found.

Superconducting computers are being evaluated for potential energy efficiency benefits relative to conventional technology. The total benefit of such an energy-saving technology would scale as the number of systems multiplied by the energy savings per system.
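The $15 billion figure quoted above follows directly from the numbers in the preceding paragraphs; a quick check (a rough estimate, as the study's assumptions make clear):

```python
# Annual US savings if data center use reaches 176 TWh in 2020 and a new
# technology reduces energy requirements by a factor of 10 (90% avoided).
kwh_2020 = 176 * 1e9              # 176 TWh expressed in kWh
saved_kwh = kwh_2020 * (1 - 0.1)  # only one tenth of the energy remains
print(saved_kwh * 0.10 / 1e9)     # 15.84 -> on the order of $15B per year
```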
My group at NSA's Laboratory for Physical Sciences conducted a feasibility study of a range of superconducting computer systems from petascale to exascale (10^15–10^18 flops) for performance, computation efficiency, and architecture. Our results indicate that a superconducting processor might be competitive for supercomputing [11]. Figure 1 shows a conventional computer in comparison with a conceptual superconducting computer with the same computing performance, but much better energy efficiency. On the left is Jaguar, the supercomputer that held the performance record on the TOP500 list from 2009 to 2010. The conceptual superconducting supercomputer shown on the right is much smaller and uses much less power (i.e., 25 kW versus over 7 MW).

FIGURE 1. The Jaguar XT5 supercomputer at Oak Ridge National Laboratory (on left) and the conceptual superconducting supercomputer (on right) both perform at 1.76 petaflops, but the Jaguar XT5 consumes over 7 MW; whereas, the superconducting one consumes 25 kW. (Jaguar XT5 image credit: Cray Inc.)

Conclusion

Superconducting computing shows promise for large-scale applications. The technologies required to build such computers are under development in the areas of memories, circuit density, computer architecture, fabrication, packaging, testing, and system integration. The Intelligence Advanced Research Projects Activity (IARPA) recently initiated the Cryogenic Computing Complexity (C3) Program with the goal of demonstrating a scalable, energy-efficient superconducting computer [12]. The results of this program should tell us if superconducting computing can live up to its promise.

About the author

Marc Manheimer is a physicist at NSA's Laboratory for Physical Sciences. His research interests include magnetic materials and devices, and cryogenic phenomena, devices, and systems. He recently became interested in superconducting computing as a solution to the power-space-cooling problem facing supercomputing. Manheimer is currently serving as the program manager for the new C3 program at IARPA.

References

[1] TOP500. Available at: http://www.top500.org.

[2] The Green500. Available at: http://www.green500.org.

[3] Koomey JG. "Worldwide electricity used in data centers." Environmental Research Letters. 2008;3(3). doi: 10.1088/1748-9326/3/3/034008.

[4] Koomey JG, Belady C, Patterson M, Santos A, Lange K-D. "Assessing trends over time in performance, costs, and energy use for servers." 2009 Aug 17. Final report to Microsoft Corporation and Intel Corporation. Available at: http://download.intel.com/pressroom/pdf/servertrendsrelease.pdf.

[5] Koomey JG. "Growth in data center electricity use 2005 to 2010." 2011 Aug 1. Oakland (CA): Analytics Press. Available at: http://www.analyticspress.com/datacenters.html.

[6] Information on colocation data centers is available at: http://www.datacentermap.com/datacenters.html.

[7] Open Compute Project. Available at: http://www.opencompute.org.

[8] Ritter K. "Facebook data center to be built in Sweden." The Huffington Post. 2011 Oct 27. Available at: http://www.huffingtonpost.com/2011/10/27/facebook-data-center-sweden_n_1034780.html.

[9] McDougall D. "Facebook keeps your photos in the freezer: Arctic town now world data hub." The Sun. 2013 Jan 24. Available at: http://www.thesun.co.uk/sol/homepage/features/4759932/New-Facebook-data-hub-freezing-Swedish-town-Lulea.html.

[10] Bronk C, Lingamneni A, Palem K. "Innovation for sustainability in information and communication technologies (ICT)." James A. Baker III Institute for Public Policy, Rice University. 2010 Oct 26. Available at: http://bakerinstitute.org/publications/ITP-pub-SustainabilityinICT-102510.pdf.

[11] Holmes DS, Ripple AL, Manheimer MA. "Energy-efficient superconducting computing—power budgets and requirements." IEEE Transactions on Applied Superconductivity. 2013;23(3). doi: 10.1109/TASC.2013.2244634.

[12] IARPA. Cryogenic Computing Complexity (C3) Program. Available at: http://www.iarpa.gov/Programs/sso/C3/c3.html.
Beyond digital: A brief introduction to quantum computing

Paul Lopata
Introduction

Computers are based on logic. These fundamental rules of logic dictate the types of problems that can be solved on a computing machine. These rules of logic also determine the resources required to complete a calculation. From the early years of computing machines to the present, the most successful computer designs have utilized two-level digital logic. The amazing success of modern-day computing technology is based on the algorithmic strengths of digital logic paired with the stunning technological advances of silicon complementary metal-oxide semiconductor (CMOS) chip technology. Processor chips and memory chips built out of silicon CMOS technology have provided a continually improving platform on which to perform digital logic.

Despite the well-known successes of computing machines based on digital logic, some algorithms continue to be difficult to perform, and some problems are intractable not only on existing machines but on any practical digital-logic machine in the foreseeable future! These intractable problems serve as both a curse and a blessing: a curse because solutions to many of these intractable problems have significant scientific and practical interest; a blessing because the computational difficulty of these intractable problems can serve as a safeguard for secure data storage and secure data transmission through the use of modern encryption schemes.

It is clear that the only algorithmic way to solve these intractable problems is to utilize a computing machine that is based on something other than standard digital logic.

One such path toward developing a "beyond-digital logic" machine is in the field of quantum computing. Quantum computing is still in the early stages of its development, and most of its advances are being reported from universities and basic research labs. Three major insights have led to the current understanding that quantum computing technology may have a significant potential for solving some of these algorithmically intractable problems:

1. Specific algorithms have been developed to solve mathematical problems on a (yet-to-be-developed) quantum computer that are otherwise intractable using standard digital logic;
2. Physical systems exist that can be used as the basic building blocks for a machine to implement these quantum algorithms;
3. There are ways to effectively handle errors that will inevitably occur when running an algorithm on one of these quantum computing machines.

This article introduces quantum computing through a discussion of these three insights and the technical literature that underpins this exciting and fast-moving field.

Quantum algorithm discoveries

When discussing the speed of an algorithm, it is useful to break the algorithm down to a basic set of steps, or gate operations, that can be repeated over and over again to complete the calculation. Once an algorithm has been written down in terms of a fixed gate set, all that remains is counting up the number of gates required for a particular problem size to determine how many resources are required to finish the calculation. When a problem is said to be intractable, it is because the number of gates required to complete the calculation is so overwhelmingly large that the algorithm will not finish in a practical amount of time.

The first quantum algorithm discovered to have a speedup over algorithms based on digital logic came from David Deutsch in the first of a series of two papers in the Proceedings of the Royal Society of London A (from 1985 and 1989). The algorithm Deutsch devised to demonstrate this speedup over digital logic is something of a toy problem: it involves two narrowly defined classes of functions and tries to determine whether a function falls into one or the other of these two classes. While this toy problem has extremely limited practical interest, it was very useful in demonstrating that there is potential for a quantum computer, based on its beyond-digital logic, to solve problems faster than computers based on digital logic. This algorithmic discovery, along with the quantum circuit formalism spelled out by David Deutsch, spurred on further algorithm development.

Whereas Deutsch's algorithm had extremely limited utility beyond a first demonstration, an algorithm later developed by Peter Shor proved to be of more widespread interest. What became known as Shor's Algorithm provides a speedup for finding the unique prime factors of a number, a problem of historic interest that gets extremely difficult as the number to be factored gets larger and larger. Shor's Algorithm for factoring remains one of the best-known examples of a seemingly intractable problem that is potentially solved using a quantum computer. Many other quantum algorithms have been invented, each for tackling some difficult mathematics problem. (See the further reading section for further details on the Deutsch algorithm and Shor's Algorithm, as well as details on the many other algorithmic discoveries and their advantages.)

It must be noted that not all algorithms achieve a speedup when tackling the problem with a quantum computer. That is, a quantum computer will provide an improvement on solving some problems, but will not provide an improvement on solving all problems. As with the other aspects of quantum computing described below, quantum algorithm development remains a vigorous open field of investigation.
Physical implementations of a quantum computer

Building and operating devices to implement the beyond-digital logic of quantum computing has been the focus of intense effort since the early quantum algorithm discoveries. A great deal of progress has been made in several different technologies toward these goals, with many impressive early demonstrations. This includes demonstrating some small algorithms with a handful of logic operations.

Demonstrating the basics of beyond-digital logic requires exquisite control over the tiniest parts of a physical system. At this small scale, the behavior of these systems is described by the laws of quantum physics. By utilizing a system governed by the laws of quantum physics, beyond-digital logic becomes possible.

Exquisite control is needed to prevent the introduction of damaging noise into the system during the control process because as noise is introduced the rules of quantum physics that describe the behavior of these small systems get washed out. (This is, in some sense, why the broader world around us is seen to obey the everyday rules of classical physics rather than the quantum rules that dominate the behavior of very small systems. The jostling of the many small systems against one another contributes to the overall noise that washes out the quantum effects at a large scale.) The term coherence time is used in the field of quantum computing to describe how long the regular behavior of a quantum system survives before an irreversible connection to the outside world sets in and the quantum effects required for beyond-digital logic are washed out.

The first step in operating on a system capable of going beyond digital logic is to identify a suitable small subsystem that is isolated enough to have a long coherence time but, at the same time, can be fully controlled without introducing too much extra noise. These contradictory system requirements, isolation (for a long coherence time) and connection to the outside world (for full control), make the demonstration of beyond-digital logic such a challenge. (See the further reading section for more details on additional requirements on physical systems to perform quantum logic.)

Several physical systems have been used for early demonstrations of beyond-digital quantum logic. These include the following:

- Optical and microwave operations on the electronic and motional states of ionized atoms trapped in radio-frequency electric traps,
- Microwave operations on superconducting resonator circuits,
- Microwave and direct-current operations on the spin of a single electron isolated in a semiconductor,
- Linear and nonlinear optical operations on single photons,
- Optical operations on electrons in quantum dots grown into semiconductors, and
- Nuclear magnetic resonance operations on various states of a molecular ensemble.

Each of these technologies has different setup, control, and measurement techniques. Furthermore, each technology is at a different level of development, and the outlooks for future development vary wildly between technologies. While impressive strides have been made, no technology has successfully implemented more than a handful of quantum logic operations before succumbing to its limited coherence time. (See the further reading section for information about recent review articles in Science Magazine that describe several of these technologies in more detail.)

Dealing with errors in a quantum machine

Every complex machine demonstrates unexpected behavior. The challenge for an engineer designing any computing machine is to design it so that the final answer at the end of an algorithm will not be wrong, even if errors creep in along the way.

The beyond-digital logic of quantum algorithms requires the data to remain isolated from external disturbances through the course of a calculation. This requirement imposes a difficult restriction on any error protection scheme that is implemented on a quantum computing machine: How do you check for errors in a way that does not impose a disturbance on the system that is too great to allow for the beyond-digital logic to be performed?

The key insight into this problem is to couple the small subsystem being used to perform the calculation to another small subsystem that also is capable of performing beyond-digital quantum logic. These two subsystems together can be used to cleverly encode the data for the algorithm so that tests can be performed on the second subsystem to check for errors on the original subsystem. And, if done correctly, these tests on the second subsystem will not disturb the original subsystem too much. Furthermore, if an error is detected, there are ways to correct this error on the original subsystem to allow the algorithm to finish without corrupting the final result.

For this quantum error protection protocol to work: 1) the two subsystems must be encoded so that the data remains intact while encoded, and 2) a subroutine algorithm to perform on these two subsystems must be devised that will robustly correct errors on the original subsystem, even if an error occurs while running this subroutine.

Several different schemes have been developed that accomplish these two requirements of quantum error protection, some of which are the most interesting and elegant results within the field of quantum computing. A serious challenge is the significant overhead required for encoding the data and performing these subroutines. There is also typically a very low threshold in error rates required before these schemes become effective. No experimental groups have yet demonstrated a quantum computing system of sufficient size and quality that can successfully demonstrate the full power of these quantum error protection schemes. Research continues to develop encodings that fix more errors while requiring fewer resources.
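The smallest textbook illustration of this encode-and-check idea is the three-bit repetition (bit-flip) code, which appears in the introductory references listed under further reading; it is not a scheme analyzed in this article. The classical caricature below shows only the bookkeeping: pairwise parity checks locate a single flipped bit without ever reading the data value itself, which is the property a genuine quantum code preserves for superpositions:

```python
# Classical caricature of the 3-bit repetition (bit-flip) code; a real
# quantum code measures these parities without observing the data itself.
def encode(bit):
    return [bit, bit, bit]              # one logical bit -> three physical bits

def correct(word):
    a, b, c = word
    syndrome = (a ^ b, b ^ c)           # parity checks, not the data value
    position = {(1, 0): 0, (1, 1): 1, (0, 1): 2}.get(syndrome)
    if position is not None:
        word[position] ^= 1             # undo the single bit flip
    return word

word = encode(1)
word[2] ^= 1                            # noise flips the third bit
print(correct(word))                    # [1, 1, 1]: the data survives
```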
Conclusion

The rapid growth in the field of quantum computing has been a result of three key insights: discoveries of novel quantum algorithms based on beyond-digital logic, demonstration of physical systems capable of implementing beyond-digital logic, and discovery of quantum error correction. And the field of quantum computing based on its beyond-digital logic remains a fast-moving and exciting field of study.

About the author

Paul Lopata is a physicist at the Laboratory for Physical Sciences in College Park, Maryland.
Further reading

Introduction to quantum computing

Nielsen MA, Chuang IL. Quantum Computation and Quantum Information. Cambridge (UK): Cambridge University Press; 2000. ISBN-13: 978-0521635035.
This is a comprehensive and classic reference in quantum computing. It includes an introduction to the mathematics involved, algorithms, error correction, and other topics in quantum information theory. Chapter 7 on physical realizations is out of date, but the book clearly lays out the physical requirements needed for operating beyond-digital logic on a physical system.

Mermin ND. Quantum Computer Science. New York: Cambridge University Press; 2007. ISBN-13: 978-0521876582.
This is a readable, high-quality introduction and reference. It is not as comprehensive as Nielsen and Chuang, but the choice of topics is well considered.

Kitaev AY, Shen AH, Vyalyi MN. Classical and Quantum Computation. Providence (RI): American Mathematical Society; 2002.
This is another nice introduction to major results in the field of quantum computing. More mathematical sophistication is expected from the reader.

Algorithmic developments

All three books in the Introduction to quantum computing section of this list contain introductions to quantum algorithms.

Jordan S. Quantum Algorithm Zoo [updated 2013 May 23]. Available at: http://math.nist.gov/quantum/zoo/.
Stephen Jordan at the National Institute of Standards and Technology maintains a comprehensive online catalog of quantum algorithms. This useful resource includes original references along with descriptions of the algorithms.

Experimental progress

Special feature: Quantum information processing. Science Magazine. 2013;339(6124):1163–84.
This recent special section in Science Magazine covers several technologies in the field of experimental quantum computing. It includes review articles on ion traps, superconducting circuits, spins in semiconductors, and topological quantum computation. All of the articles are written by leaders in their respective subfield and include brief insights into the history, current state of the art, and outlook on future developments.

Quantum error correction

All three books in the Introduction to quantum computing section of this list contain introductions to quantum error correction.

Gaitan F. Quantum Error Correction and Fault Tolerant Quantum Computing. Boca Raton (FL): CRC Press; 2008. ISBN: 978-0-8493-7199-8.
This book provides a comprehensive discussion of many of the major results in the field, with a focus on stabilizer codes and their fault tolerant operation.
GLOBE AT A GLANCE

The Green500 top 10 supercomputers

The Green500 provides a ranking of the most energy-efficient supercomputers in the world. For decades, supercomputer performance has been synonymous with speed as measured in floating-point operations per second. This particular focus has led to supercomputers that consume enormous amounts of power and require complex cooling facilities to operate. The rising cost of power consumption has caused an extraordinary increase in the total cost of ownership of a supercomputer. (See "Doing more with less: Cooling supercomputers with oil pays off" for additional information on supercomputers and power consumption.) In order to increase awareness and use of supercomputer performance metrics based on efficiency and reliability, the Green500 list puts a premium on energy-efficient performance for sustainable supercomputing. The following ranking is from June 2013; the list in its entirety as well as information about the measurement methodology is available at www.green500.org.

LEGEND. Mflops/W: mega (i.e., million) floating-point operations per second per watt. kW: kilowatts (i.e., thousand watts).

1. Eurora
Specs: Eurotech Aurora HPC 10-20, Xeon E5-2687W 8C 3.1 GHz, Infiniband QDR, NVIDIA K20
Country: Italy | Site: Cineca
Cores: 2,688 | Mflops/W: 3,208.83 | Power (kW): 30.70 | TOP500 Rank: 467

2. Aurora Tigon
Specs: Eurotech Aurora HPC 10-20, Xeon E5-2687W 8C 3.1 GHz, Infiniband QDR, NVIDIA K20
Country: Italy | Site: Selex ES Chieti
Cores: 2,688 | Mflops/W: 3,179.88 | Power (kW): 31.02 | TOP500 Rank: —

3. Beacon
Specs: Cray Appro GreenBlade GB824M, Xeon E5-2670 8C 2.6 GHz, Infiniband FDR, Intel Xeon Phi 5110P
Country: US | Site: National Institute for Computational Sciences
Cores: 9,216 | Mflops/W: 2,449.57 | Power (kW): 45.11 | TOP500 Rank: 397

4. SANAM
Specs: Adtech, ASUS ESC4000/FDR G2, Xeon E5-2650 8C 2.0 GHz, Infiniband FDR, AMD FirePro S10000
Country: Saudi Arabia | Site: King Abdulaziz City for Science and Technology
Cores: 38,400 | Mflops/W: 2,351.10 | Power (kW): 179.20 | TOP500 Rank: 52

5. Blue Gene/Q
Specs: IBM BlueGene/Q, Power BQC 16C 1.6 GHz, Custom
Country: US | Site: IBM Thomas J. Watson Research Center
Cores: 16,384 | Mflops/W: 2,299.15 | Power (kW): 82.19 | TOP500 Rank: 169

6. Cetus
Specs: IBM BlueGene/Q, Power BQC 16C 1.6 GHz, Custom Interconnect
Country: US | Site: Argonne National Laboratory
Cores: 16,384 | Mflops/W: 2,299.15 | Power (kW): 82.19 | TOP500 Rank: 167

7. CADMOS BG/Q
Specs: IBM BlueGene/Q, Power BQC 16C 1.6 GHz, Custom Interconnect
Country: Switzerland | Site: Ecole Polytechnique Federale de Lausanne
Cores: 16,384 | Mflops/W: 2,299.15 | Power (kW): 82.19 | TOP500 Rank: 168

8. Blue Gene/Q
Specs: IBM BlueGene/Q, Power BQC 16C 1.6 GHz, Custom Interconnect
Country: Poland | Site: Interdisciplinary Centre for Mathematical and Computational Modelling
Cores: 16,384 | Mflops/W: 2,299.15 | Power (kW): 82.19 | TOP500 Rank: 170

9. Vesta
Specs: IBM BlueGene/Q, Power BQC 16C 1.6 GHz, Custom
Country: US | Site: Argonne National Laboratory
Cores: 16,384 | Mflops/W: 2,299.15 | Power (kW): 82.19 | TOP500 Rank: 166

10. Blue Gene/Q
Specs: IBM BlueGene/Q, Power BQC 16C 1.6 GHz, Custom
Country: US | Site: University of Rochester
Cores: 16,384 | Mflops/W: 2,299.15 | Power (kW): 82.19 | TOP500 Rank: 171
POINTERS

The Green500 announces the most energy-efficient supercomputers of June 2013

The Green500 list of June 2013 is dominated by heterogeneous supercomputers—those that combine two or more types of processing elements, such as a traditional central processing unit (CPU) combined with a graphical processing unit (GPU) or a coprocessor.

Eurotech manufactured the top two supercomputers on the list, Eurora and Aurora Tigon. Eurora, located in Italy at Cineca, performs at 3.21 gigaflops per watt, while Aurora Tigon, located in Italy at Selex ES Chieti, performs at 3.18 gigaflops per watt. These supercomputers are nearly 30% more energy efficient than the previous top supercomputer on the Green500 list. The fastest supercomputer of June 2013, Tianhe-2, performed at 1.9 gigaflops per watt, placing it in the number 32 spot on the Green500 list.

"Overall, the performance of machines on the Green500 List has increased at a higher rate than their power consumption. That's why the machines' efficiencies are going up," says Wu Feng, founder of the Green500. For machines built with off-the-shelf components, a great deal of their efficiency gains can be attributed to heterogeneous designs; such a design allows these systems to keep pace with, and in some cases even outpace, custom systems (e.g., IBM's Blue Gene/Q). "While the gains at the top end of the Green500 appear impressive, overall the improvements have been much more modest," says Feng (see figure 1). "This clearly indicates that there is still work to be done."
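To connect the ranking's Mflops/W figures with the gigaflops-per-watt numbers quoted above, the short sketch below shows the arithmetic. Two caveats: the Eurora performance figure is inferred from the efficiency and power numbers in the list above, and the Tianhe-2 figures are the publicly reported June 2013 TOP500 values; both are used here as illustrative assumptions, not as official Green500 inputs.

def mflops_per_watt(rmax_tflops: float, power_kw: float) -> float:
    """Green500-style efficiency: sustained flop/s divided by watts,
    expressed in megaflops per watt."""
    megaflops = rmax_tflops * 1e6  # 1 Tflop/s = 1e6 Mflop/s
    watts = power_kw * 1e3         # 1 kW = 1e3 W
    return megaflops / watts

# Eurora: ~98.5 Tflop/s at 30.70 kW (performance inferred from the list
# above) -> ~3,209 Mflops/W, i.e., the ~3.21 gigaflops per watt in the text.
print(mflops_per_watt(98.51, 30.70))      # ~3208.8

# Tianhe-2: 33,862.7 Tflop/s at 17,808 kW (reported June 2013 TOP500
# figures, assumed here) -> ~1,902 Mflops/W, i.e., ~1.9 gigaflops per watt.
print(mflops_per_watt(33862.7, 17808.0))  # ~1901.5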
[Figure 1 chart: Green500 energy efficiency in Mflops/W from 11/2007 through 06/2013, plotting the top-ranked machine (N=1), the mean, the median, and the lowest-ranked machine (N=500). As of June 2013: N=1 at 3,208.83 Mflops/W, mean at 490.20 Mflops/W, median at 250.18 Mflops/W, and N=500 at 42.33 Mflops/W.]
FIGURE 1. The energy efficiency of the highest-ranked supercomputers on the Green500 list (green circles) has been improving much more rapidly than the mean (brown triangles) and the median (pink squares). For instance, while the energy efficiency of the greenest supercomputer improved by nearly 30%, the median improved by only about 14%, and the mean by only about 11%.
SPINOUTS
News from the Technology Transfer Program
Novel methods for manufacturing photonic logic devices
Traditional integrated electronic components have not kept pace with the performance demands of applications such as high-speed encryption, video on demand, and broadband television. Because optoelectronics (integrated photonics) promise to deliver greater speed and bandwidth than their electronic counterparts, some experts anticipate that they could one day make traditional electronic semiconductors obsolete. The widespread adoption of optical switching has increased pressure on industry to create fully photonic components to replace traditional electronic devices.

Even though research on photonic logic devices has been ongoing for several years, an integrated photonic device that could rival today's integrated electronic circuit does not yet exist. In particular, the ability to easily manufacture laser-based devices and waveguides at the microcircuit level has presented challenges in the wafer fabrication stage. One specific issue has been the ability to develop optical interfaces, such as laser to waveguide, that do not have impedance mismatches.

NSA engineers within the Trusted Systems Research group in the Research Directorate took on the challenge. Their research resulted in methods to precisely manufacture photonic devices that use air gaps to tune the reflectance between optical devices (e.g., figure 1). These air gaps are formed by making a wafer mask with very precise regions that allow the deposition of sacrificial material onto the wafer, forming spacers. This material is removed later by chemical etching processes. Engineers can now adjust the reflectance by varying the sacrificial spacer layers.

Another challenge facing photonic device developers is the specialized equipment required to manufacture sacrificial layers within a wafer. Working with the Laboratory for Physical Sciences (LPS), NSA engineers were able to develop methods of producing photonic devices using standard wafer manufacturing equipment such as Plasma Enhanced Chemical Vapor Deposition (PECVD) and later Biased Target Ion Beam Deposition (BTIBD). These novel methods opened up the potential for even more advanced devices, since custom or highly specialized manufacturing equipment is not required. Another key to this technology is LPS's Projection Lithography Stepper tool (see figure 2), which projects the circuit image onto the wafer.

In late 2011, NSA's Technology Transfer Program (TTP) licensed 16 patented photonics manufacturing methods to industry. This technology transfer was one of the largest bundled patent deals in the history of TTP and reemphasized NSA's commitment to returning taxpayer-funded research and technology to private industry.
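The article does not spell out the optics behind this tuning, but the underlying effect is standard thin-film interference: the reflectance of a thin air gap between two facets depends on how the gap thickness compares to the operating wavelength. The sketch below is a minimal, illustrative single-layer transfer-matrix calculation under assumed values (1,550 nm wavelength, refractive index 3.5 for the semiconductor); it is textbook physics, not a description of the patented methods.

import numpy as np

def air_gap_reflectance(gap_nm: float, wavelength_nm: float = 1550.0,
                        n_semi: float = 3.5, n_gap: float = 1.0) -> float:
    """Reflectance of a thin air gap between two identical semiconductor
    facets at normal incidence, using the standard single-layer
    characteristic (transfer) matrix."""
    delta = 2.0 * np.pi * n_gap * gap_nm / wavelength_nm  # phase thickness
    m11 = np.cos(delta)
    m12 = 1j * np.sin(delta) / n_gap
    m21 = 1j * n_gap * np.sin(delta)
    m22 = np.cos(delta)
    n0, ns = n_semi, n_semi  # incident medium and substrate are the same
    num = n0 * m11 + n0 * ns * m12 - m21 - ns * m22
    den = n0 * m11 + n0 * ns * m12 + m21 + ns * m22
    return float(abs(num / den) ** 2)

# Sweeping the sacrificial-spacer (air gap) thickness tunes the reflectance
# between ~0 (vanishing gap) and ~72% (quarter-wave gap) for these indices.
for gap in (0.0, 100.0, 387.5, 775.0):  # 387.5 nm = quarter wave at 1550 nm
    print(f"{gap:7.1f} nm gap -> R = {air_gap_reflectance(gap):.3f}")

Under these assumptions, moving the spacer thickness from zero to a quarter wave sweeps the interface reflectance from essentially zero to roughly 72%, which illustrates the design freedom that precisely controlled sacrificial spacer layers provide.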
FIGURE 1. An image of a fabricated mode transition-discrimination (MTD) photonic logic device with semiconductor laser etched facets and etched waveguide trenches. [Figure labels: MTD photonic logic device; direction of light propagation; etched trenches for input waveguides; etched trenches for dual output waveguides; etched trenches for single output waveguide; single output facet ready for thin-film deposition.]
FIGURE 2. LPS's Projection Lithography Stepper tool.
NSA/CSS: Defending Our Nation. Securing The Future.