Exascale Computing Study: Technology Challenges in Achieving Exascale Systems
Peter Kogge, Editor & Study Lead

Keren Bergman, Shekhar Borkar, Dan Campbell, William Carlson, William Dally, Monty Denneau, Paul Franzon, William Harrod, Kerry Hill, Jon Hiller, Sherman Karp, Stephen Keckler, Dean Klein, Robert Lucas, Mark Richards, Al Scarpelli, Steven Scott, Allan Snavely, Thomas Sterling, R. Stanley Williams, Katherine Yelick

September 28, 2008

This work was sponsored by DARPA IPTO in the ExaScale Computing Study with Dr. William Harrod as Program Manager; AFRL contract number FA8650-07-C-7724.

This report is published in the interest of scientific and technical information exchange and its publication does not constitute the Government’s approval or disapproval of its ideas or findings.

NOTICE

Using Government drawings, specifications, or other data included in this document for any purpose other than Government procurement does not in any way obligate the U.S. Government. The fact that the Government formulated or supplied the drawings, specifications, or other data does not license the holder or any other person or corporation; or convey any rights or permission to manufacture, use, or sell any patented invention that may relate to them.

APPROVED FOR PUBLIC RELEASE, DISTRIBUTION UNLIMITED.

DISCLAIMER

The following disclaimer was signed by all members of the Exascale Study Group (listed below):

I agree that the material in this document reflects the collective views, ideas, opinions and findings of the study participants only, and not those of any of the universities, corporations, or other institutions with which they are affiliated. Furthermore, the material in this document does not reflect the official views, ideas, opinions and/or findings of DARPA, the Department of Defense, or of the United States government.
Keren Bergman, Shekhar Borkar, Dan Campbell, William Carlson, William Dally, Monty Denneau, Paul Franzon, William Harrod, Kerry Hill, Jon Hiller, Sherman Karp, Stephen Keckler, Dean Klein, Peter Kogge, Robert Lucas, Mark Richards, Al Scarpelli, Steven Scott, Allan Snavely, Thomas Sterling, R. Stanley Williams, Katherine Yelick

FOREWORD

This document reflects the thoughts of a group of highly talented individuals from universities, industry, and research labs on what might be the challenges in advancing computing by a thousand-fold by 2015. The work was sponsored by DARPA IPTO with Dr. William Harrod as Program Manager, under AFRL contract #FA8650-07-C-7724.

The report itself was drawn from the results of a series of meetings over the second half of 2007, and as such reflects a snapshot in time. The goal of the study was to assay the state of the art, and not to either propose a potential system or prepare and propose a detailed roadmap for its development.

Further, the report itself was assembled in just a few months at the beginning of 2008 from input by the participants. As such, all inconsistencies reflect either areas where there really are significant open research questions, or misunderstandings by the editor. There was, however, virtually complete agreement about the key challenges that surfaced from the study, and the potential value that solving them may have towards advancing the field of high performance computing.

I am honored to have been part of this study, and wish to thank the study members for their passion for the subject, and for contributing far more of their precious time than they expected.

Peter M. Kogge, Editor and Study Lead
University of Notre Dame
May 1, 2008.

Contents

1 Executive Overview
2 Defining an Exascale System
2.1 Attributes
2.1.1 Functional Metrics
2.1.2 Physical Attributes
2.1.3 Balanced Designs
2.1.4 Application Performance
2.2 Classes of Exascale Systems
2.2.1 Data Center System
2.2.2 Exascale and HPC
2.2.3 Departmental Systems
2.2.4 Embedded Systems
2.2.5 Cross-class Applications
2.3 Systems Classes and Matching Attributes
2.3.1 Capacity Data Center-sized Exa Systems
2.3.2 Capability Data Center-sized Exa Systems
2.3.3 Departmental Peta Systems
2.3.4 Embedded Tera Systems
2.4 Prioritizing the Attributes
3 Background
3.1 Prehistory
3.2 Trends
3.3 Overall Observations
3.4 This Study
3.5 Target Timeframes and Tipping Points
3.6 Companion Studies
3.7 Prior Relevant Studies
3.7.1 1999 PITAC Report to the President
3.7.2 2000 DSB Report on DoD Supercomputing Needs
3.7.3 2001 Survey of National Security HPC Architectural Requirements
3.7.4 2001 DoD R&D Agenda For High Productivity Computing Systems
3.7.5 2002 HPC for the National Security Community
3.7.6 2003 Jason Study on Requirements for ASCI
3.7.7 2003 Roadmap for the Revitalization of High-End Computing
3.7.8 2004 Getting Up to Speed: The Future of Supercomputing
3.7.9 2005 Revitalizing Computer Architecture Research
3.7.10 2006 DSB Task Force on Defense Critical Technologies
3.7.11 2006 The Landscape of Parallel Computing Research
4 Computing as We Know It
4.1 Today's Architectures and Execution Models
4.1.1 Today's Microarchitectural Trends
4.1.1.1 Conventional Microprocessors
4.1.1.2 Graphics Processors
4.1.1.3 Multi-core Microprocessors
4.1.2 Today's Memory Systems
4.1.3 Unconventional Architectures
4.1.4 Data Center/Supercomputing Systems
4.1.4.1 Data Center Architectures
4.1.4.2 Data Center Power
4.1.4.2.1 Mitigation
4.1.4.3 Other Data Center Challenges
4.1.5 Departmental Systems
4.1.6 Embedded Systems
4.1.7 Summary of the State of the Art
4.2 Today's Operating Environments
4.2.1 Unix
4.2.2 Windows NT Kernel
4.2.3 Microkernels
4.2.4 Middleware
4.2.5 Summary of the State of the Art
4.3 Today's Programming Models
4.3.1 Automatic Parallelization
4.3.2 Data Parallel Languages
4.3.3 Shared Memory
4.3.3.1 OpenMP
4.3.3.2 Threads
4.3.4 Message Passing
4.3.5 PGAS Languages
4.3.6 The HPCS Languages
4.4 Today's Microprocessors
4.4.1 Basic Technology Parameters
4.4.2 Overall Chip Parameters
4.4.3 Summary of the State of the Art
4.5 Today's Top 500 Supercomputers
4.5.1 Aggregate Performance
4.5.2 Efficiency
4.5.3 Performance Components
4.5.3.1 Processor Parallelism
4.5.3.2 Clock
4.5.3.3 Thread Level Concurrency
4.5.3.4 Total Concurrency
4.5.4 Main Memory Capacity
5 Exascale Application Characteristics
5.1 Kiviat Diagrams
5.2 Balance and the von Neumann Bottleneck
5.3 A Typical Application
5.4 Exascale Application Characteristics
5.5 Memory Intensive Applications of Today
5.5.1 Latency-Sensitive Applications
5.5.2 Locality Sensitive Applications
5.5.3 Communication Costs - Bisection Bandwidth
5.6 Exascale Applications Scaling
5.6.1 Application Categories
5.6.2 Memory Requirements
5.6.3 Increasing Non-Main Memory Storage Capacity
5.6.3.1 Scratch Storage