Japan's 10 Peta FLOPS Supercomputer Development

Japan’s 10 Peta FLOPS Supercomputer Development Project and its energy saving designs Ryutaro Himeno 1) Group Director of R&D 2) Deputy Program Director 3) Director of Advance group, NGS R&D Center for Computational Science Center for Computer & Communication Contents 1. Self introduction 2. Japan’s next generation supercomputer R&D project: 1. Before the project started 2. Introduction and current status 3. Energy issues 3. Introduction of focused application area 2 Who am I? 1985 1987 1988 1995 1993 1990 Courtesy of Nissan 1992 3 4 Hardware development at RIKEN Special purpose computers MD-GRAPE2 Theoretical Peak In 1989, GRAPE Performance developed by U. Tokyo 100Peta MD-GRAPE3 MDM:MD-GRAPE2 develop by RIKEN 10Peta Japan US GRAPE-DR MDM was integrated in (U.Tokyo/RIKEN) BlueGene/Q installed 1PetaFLOPS RSCC, 2004 MD-GRAPE3 1Peta Planed HPCS MD-GRAPE3 developed in (RIKEN) NLCF BlueGene/L 2004 367TeraFLOPS 100 ASC nd Purple 2 fastest record on MD- Tera Earth Simulator ASC Tsubame 41TeraFLOPS Red (TITech) GRAPE3: Gordon Bell Storm ASCI Q PACS-CS Prize, 2006. 10Tera RSCC ASCI (Tsukuba) Blue Pacific ASCI White (Riken) MD-GRAPE-3 will be ASCI Red integrated in RSCC soon. ASCI Blue Mountain 1Tera *ASCI: Accelerated Strategic CP-PACS Computing Initiative '96 '98 '00 '02 '04 '06 '08 '10 [year] 5 Current RSCC System installed in 2004 at Riken Advance Center for Computing & Communication HDD User User Tape Storage User 20 TB User 200 TB POWER4 Server: 7 Itanium2 Server: 2 Front end Tape Drive: 4 (16) Server Disk Cache: 2TB Myrinet InfiniBand 128Nodes Vector InfiniBand Parallel (256CPUs) Cluster×4 512Nodes (1024CPUs) NEC SX-7/32 Cluster Intel Xeon 3.06GHz 282.5GFLOPS 2GB/Node,Myrinet/IB, 256GB 1.56TFLOPS×4 Intel Xeon 3.06GHz MD-GRAPE3 4GB/Node, IB 64TFLOPS 6.26TFLOPS 6 Basic concept of RSCC Each computer architecture has each advantage and disadvantage There is no single architecture which fits wide variety of applications RSCC is composed of 3 subsystem, Linux cluster with Xeon, Vector Parallel computer and MD-GRAPE3 We made a proposal of the NGSP based on this concept to MEXT 7 Our Proposal Needs of Multi- Needs of Integration of scale Multi- multiple multiple physic simulation computation architecture components Tightly-coupled Present Switch heterogeneous computer Slower connection Faster Faster Faster interconnect interconnect interconnect Proposing Faster Vector Scalar MD architecture interconnect Node Node Node Vector Scalar MD Node Node Node FPGA Node 8 Policy and Outline of A Next Generation Supercomputer Project Purpose of policy: development, installation and application of an advanced high performance supercomputer system, as one of Japan’s “Key Technologies of National Importance” Total Budget: about 115 billion Yen ( 1.15 billion US dollars ) Period of Project: FY2006 – FY2012 9 9 Formation of the 10 Peta R&D Project MEXT: Policy & Funding Office for Supercomputer Development Planning Industries Industrial Forum Project Committee for Promotion of Supercomputing R&D Scheme RIKEN: Project HQ Next-Generation NII: Grid Middleware Advisory Supercomputer R&D Center and Infrastructure (Ryoji Noyori) Evaluation Scheme Board IMS: Nano Science (MEXT and CSTP) Project Leader: Tadashi Watanabe Simulation R&D G.: Ryutaro Himeno Evaluation Riken Wako Institute: Life Science Simulation Committees Visiting Researchers from Univ. & National Labs. (Note) NII: National Institute of Informatics, Computer Companies IMS: Institute for Molecular Science 10 Goals of the Next Generation Supercomputer Project 1. Development and installation of the most advanced high performance supercomputer system 2. Development and wide use of application software to utilize the supercomputer to the maximum extent 3. Provision of flexible computing environment by sharing the next generation supercomputer through connection with other supercomputers located at universities and research institutes 4. Establishment of “Advanced Computational Science and Technology Center (tentative name)” 11 11 Expansion of Highest Computer for Global Usage National Leadership Sustained Performance (FLOPS) Government National Investment Leadership National 100P National Next-next- Infrastructure 100P Leadership next Institute, （The Next Generation Supercomputer） Generation University Project The Next Generation Supercomputer Project Next-next National Infrastructure 1P Generation 1P National Project Institute, Leadership University (Earth Simulator) National Earth Simulator Project Infrastructure Institute, Enterprise National University Company, Leadership 10T Laboratory 10T CP-PACS, Enterprise NWT Company, National Laboratory Infrastructure Enterprise Institute Company, University Laboratory 100G Personal 100G Personal Entertainment Personal National Entertainment PC, Home Server Infrastructure Entertainment Enterprise PC, Home Server Workstation, Institute, PC, Home Server Company, Workstation, Game Machine, Digital TV University Workstation, Game Machine, Digital TV Laboratory Game Machine, Digital TV 1990 2000 2010 2015 2020 2025 12 Schedule of Project FY2006 FY2007 FY2008 FY2009 FY2010 FY2011 FY2012 Conceptual Prototype and Production,Production, installation,installation, Processing unit DetailedDetailed designdesign design evaluation andand adjustmentadjustment System Buildings Front-end unit Basic Basic Detailed design Tuning and improvement (total system design Detailed design Production and evaluation Tuning and improvement software) design Shared file BasicBasic DetailedDetailed designdesign Production,Production, installation,installation, andand adjustmentadjustment system designdesign Applications Next-Generation Integrated Nanoscience Development,Development, production,production, andand evaluationevaluation VerificationVerification Simulation Next-Generation Integrated Development, production, and evaluation Verification Life Simulation Development, production, and evaluation Verification Computer building DesignDesign ConstructionConstruction Research building DesignDesign ConstructionConstruction present 13 Optimization of Computer Design Requirement from Grand Challenges Requirement from Computer Centers 1400 Power, Space 1200 Computer center in 1000 Japan 800 600 Others 設置面積（㎡) 400 Nano Sci. Life Sci. Reliability, operability 200 (Engineering etc) 0 0 500 1000 1500 2000 2500 3000 3500 電力（kW） Cost (development, manufacturing, maintenance Popular numerical scheme FMO MO FVM Fast transform Searching algorism Optimum system Other Project WatchTechnology Survey Real space Linear RISM density function Monte Carlo FEM Solver method Fastest in the world Operation & Utilization 21 Selected Target applications Essential Element technologies Power savving SOI Light transmit. Software LSI process Low-k tech. OS, compiler/// High-k Nano Science Life ScienceEnv./ DesasterEngineering.Physics, Astro. Spin off to the consumer electronics （6) （6） prev. （3）（4）（2） Technology Limit 14 Target applications and BM suit The application software committee started selecting target applications for making a benchmark suit in January, 2006. 21 codes were selected in March and NextBMT suit was developed. 15 NextBMT and Peta-scale BMT(1/2) Peta-scale BMT Fields Program name Description SimFold Prediction of Protein Structure GNISC Inference of Genetic Networks from Experimental Data of Gene Expression MLTest Validation of statistical significance for development of individualized medicine Life (6) MC-Bflow Simulation for blood-flow analysis sievgene/myPresto Docking simulation of protein and drug ProteinDF Ab-initio Molecular Dynamics Calculation in Large-Scale Protein Systems GAMESS/FMO FMO Molecular Orbital Calculation Modylas Massively-parallel multipurpose software for molecular dynamics calculation RSDFT Ab-initio Molecular Dynamics Calculation in Real Space Nono (6) RISM/3D-RISM Analysis of Electron States of Proteins in Solution with the 3D-RISM/FMO Method PHASE First-Principles Molecular Dynamics Simulation within the Plane-Wave Pseudopotential formalism Octa Coarse-Grained Molecular Dynamics Calculation 16 NextBMT and Peta-scale BMT(2/2) Peta-scale BMT LatticeQCD Study of elementary particle and nuclear physics based on Lattice QCD Physics simulation Astronomy NINJA/ASURA Super large-scale gravitational many-body simulation for finding the celestial (2) origin NICAM Nonhydrostatic ICosahedral Atmospheric Model (NICAM) for Global-Cloud Resolving Simulations Geophysics Seism3D Simulation of Seismic-Wave Propagation and Strong Ground Motions (3) COCO Super-resolution Ocean General Circulation Model Cavitation Computation of Unsteady Cavitation Flow by the Finite Difference Method LANS Computation of compressible fluid in aircraft and spacecraft analysis Engineering(4) FrontSTR Structural Calculation by Finite Element Method (FEM) FrontFlow/Blue Unsteady Flow Analysis based on Large Eddy Simulation (LES) 17 System Configuration The Next-Generation Supercomputer is designed as hybrid general-purpose supercomputer that provides the optimum computing environment for a wide range of simulations. •Calculations will be performed in processing units that are suitable for the particular simulation. •Parallel processing in a hybrid configuration of scalar and vector units will make larger and more complex simulations possible. 18 Characteristics Hybrid Supercomputer System composed of Scalar and Vector Units Vector unit is suitable for continuum physics simulation Scalar unit is suitable for Data Base application, particle-based simulation Hybrid system has both strong points for

Japan's 10 Peta FLOPS Supercomputer Development

An Overview of the Blue Gene/L System Software Organization

Advances in Ultrashort-Pulse Lasers • Modeling Dispersions of Biological and Chemical Agents • Centennial of E

2017 HPC Annual Report Team Would Like to Acknowledge the Invaluable Assistance Provided by John Noe

Performance Modeling the Earth Simulator and ASCI Q

Project Aurora 2017 Vector Inheritance

2020 Global High-Performance Computing Product Leadership Award

Hardware Technology of the Earth Simulator 1

R00456--FM Getting up to Speed

Blue Gene/L Architecture

Parallel Gaussian Elimination

75 YEARS Trinity Test the Dawn of America’S Scientific Innovation CONTENTS

Supercomputers – Prestige Objects Or Crucial Tools for Science and Industry?