High-Performance Computing

James B. White III (Trey)
Cyberinfrastructure for Science and Engineering
February 24, 2003

Acknowledgement

Sponsored by the Mathematical, Information, and Computational Sciences Division, Office of Advanced Scientific Computing Research, U.S. Department of Energy, under Contract No. DE-AC05-00OR22725 with UT-Battelle, LLC.

High-Performance Computing

• Where are we?
• Where have we been?
• Where are we going?

ORNL Center for Computational Sciences

• Established in 1992 to support Grand Challenge projects
• Evaluates and hardens new HPC systems
• Resulting systems used for production
• Focus on capability computing
• Primary HPC resource for DOE SciDAC (Scientific Discovery through Advanced Computing)

CCS: A National Resource for Academic and DOE Communities

• FY2002 utilization
  − 46% DOE lab
  − 41% university
• Connectivity
  − OC-12 to DOE
  − OC-192 to Internet2 universities
• Academic partnerships
  − UT-Battelle Core Universities
  − Joint Institute for Computational Sciences
  − UT/CCS Computational Sciences Initiative
  − MOU with PSC for NSF access to CCS resources

National HPC community: Where are we? Answer #1: CoGS

CoGS

• Clusters of General-purpose SMPs
• Clusters
  − Multiple independent systems
  − Interconnected
• General purpose
  − Not designed specifically for HPC
• SMPs
  − Multiple processors per system
  − Shared memory
(A minimal programming sketch follows.)
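Not from the original slides: a minimal sketch of how a cluster of general-purpose SMPs is typically programmed, assuming the common hybrid model of MPI message passing between nodes and OpenMP threads within each node's shared memory. The file name and compile command in the comment are illustrative assumptions, not something the deck specifies.

/* hybrid.c -- minimal hybrid MPI + OpenMP sketch for a cluster of SMPs.
 * Build (example only; flags vary by compiler): mpicc -fopenmp hybrid.c -o hybrid */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, nranks;
    MPI_Init(&argc, &argv);                 /* one MPI process per SMP node */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nranks);

    #pragma omp parallel                    /* one thread per processor, sharing the node's memory */
    {
        printf("node %d of %d, thread %d of %d\n",
               rank, nranks, omp_get_thread_num(), omp_get_num_threads());
    }

    MPI_Finalize();
    return 0;
}

Each MPI process here stands in for one SMP: its OpenMP threads share that node's memory, while data on other nodes must be moved with explicit messages over the interconnect.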

CoGS example

• Georgia Tech Jedi Cluster
• 17 systems
  − 8 processors each (550 MHz Pentium III Xeon)
  − 4 GB of shared memory each
• Gigabit and Myrinet interconnects

http://www.cc.gatech.edu/projects/ihpcl/resources/hardware.html

CoGS example

• ORNL CCS Cheetah
• 27 IBM p690 systems
  − 32 processors each (1.3 GHz Power4)
  − 32-128 GB of shared memory
• IBM Switch2 interconnect
• 4.5 TF of peak performance (arithmetic below)
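Not on the original slide, but the peak figure is consistent with the parts listed: each 1.3 GHz Power4 processor has two fused multiply-add units, giving 4 flops per cycle, so

  27 systems × 32 processors × 1.3 GHz × 4 flops/cycle = 864 × 5.2 Gflop/s ≈ 4.5 Tflop/s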

CoGS dominate US HPC

• Largest DOE systems
  − NNSA: LANL (HP), LLNL (IBM, Intel)
  − SC: LBL (IBM), ORNL (IBM), ANL (Intel), PNL (Intel)
• Largest NSF systems
  − PSC (HP), NCAR (IBM), SDSC (IBM), NCSA (Intel)
• Largest (known) DOD systems
  − NAVO (IBM), ARL (IBM)
• Largest of other US agencies
  − NOAA (Intel), NASA (HP)
• Largest state systems
  − LSU (Intel), SUNY (Intel), FSU (IBM), NCSC (IBM)

National HPC community: Where are we? Answer #2: behind

[Chart: teraflops, January 1995 through July 2002, comparing the leadership-scale computer with the leading DOE-SC computer; the top DOE-SC machine trails badly ("Ouch!"). Courtesy Thomas Zacharia.]

Japanese Earth Simulator

• 40.8 TF peak
  − 35.6 TF LINPACK (87% of peak)
  − 26.6 TF on climate benchmark
  − 14.9 TF on fusion
  − 16.4 TF on turbulence
• Faster than all DOE and NSF combined!
(SC2002 Gordon Bell Awards; courtesy Thomas Zacharia)

ES quotes

• "Japanese Computer Is World's Fastest, As U.S. Falls Back" NY Times headline • "These guys are blowing us out of the

water…" Thomas Sterling, Cal Tech CBS News • "In some sense we have a Computenik

on our hands." Jack Dongarra, UT/ORNL CBS News

But what is it?

• A building
• An SMP cluster!
  − 640 NEC SX-6 systems
  − 8 processors each
  − 16 GB of shared memory each
• Not general purpose
  − Vector processors
  − Ultra-high memory and interconnect bandwidths
(A vector-loop sketch follows.)

http://www.es.jamstec.go.jp/esc/eng/
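An illustration not taken from the slides: the kind of long, unit-stride, dependence-free loop that a vector machine like the SX-6 is built for. The function name, array names, and problem size are invented for the example; a vectorizing compiler turns the inner loop into vector instructions that stream operands from memory, which is why the Earth Simulator's memory bandwidth matters so much.

#include <stdlib.h>

/* Dependence-free, unit-stride loop: a natural fit for vector hardware. */
void daxpy(long n, double a, const double *restrict x, double *restrict y)
{
    for (long i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}

int main(void)
{
    long n = 1L << 24;                      /* long vectors amortize pipeline startup */
    double *x = malloc(n * sizeof *x);
    double *y = malloc(n * sizeof *y);
    if (!x || !y) return 1;
    for (long i = 0; i < n; i++) { x[i] = 1.0; y[i] = 2.0; }
    daxpy(n, 3.0, x, y);                    /* y[i] = 3*1 + 2 = 5 for every i */
    free(x); free(y);
    return 0;
}

A cache-based general-purpose processor runs the same loop correctly, but its sustained rate is limited by how fast main memory can feed it; that is the contrast the slide is drawing.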

High-Performance Computing

• Where are we?
• Where have we been?
• Where are we going?

Where have we been?

[Chart: operations per second, 1940 to 2010, spanning electro-mechanical accounting machines and the serial, vector, parallel, and scalable vector eras, with systems including MANIAC, SEAC, IBM 704, IBM 7030, CDC 6600, CDC 7600, CRAY 1, CRAY X-MP, CRAY Y-MP, early commercial parallel computers, CM-200, Intel Delta, CRAY T3D, CM-5, the ORNL/CCS Intel Paragon, ASCI Red (SNL), ASCI Blue (LANL, LLNL), ASCI White, ASCI Q, and the Earth Simulator, plotted against the projection published in 1983. Courtesy Thomas Zacharia.]

• HPC crisis in late 1980's
• HPCRC program advanced massively parallel processing (MPP)
• CoGS took over in 1990's
  − Adaptation of general-purpose systems
  − Attrition and consolidation of vendors
• Computenik!

Where has HPC been for US open science?

[Chart: teraflops, January 1995 through July 2002, leadership-scale computer vs. leading DOE-SC computer. The Earth Simulator is an extreme symptom of a chronic problem: the top DOE-SC machine (from the ORNL-CCS Paragon onward) has consistently trailed the Japanese or ASCI #1 system.]

What we have seen on the way

• Increasing
  − System complexity
  − Processor speed
  − Software complexity
  − Parallelism
  − Algorithm convergence rates
  − Computational requirements for scientific simulation
  − Relative memory and interconnect latencies
  − Power consumption
  − Heat generation
• Decreasing
  − Hardware cost
  − Relative memory bandwidth
  − Relative interconnect bandwidth
  − Relative I/O speed
  − % of peak performance (see the sketch below)
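A rough sketch, not from the slides, of why shrinking relative memory bandwidth drags down the percentage of peak: the STREAM-style triad below performs 2 flops for every 24 bytes of memory traffic (12 bytes per flop), so on a cache-based microprocessor its flop rate is capped near memory bandwidth divided by 12. The loop and sizes are invented; the Power4 number in the comment reuses the Cheetah figures quoted earlier.

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    long n = 1L << 24;                      /* far larger than any cache */
    double *a = malloc(n * sizeof *a);
    double *b = malloc(n * sizeof *b);
    double *c = malloc(n * sizeof *c);
    if (!a || !b || !c) return 1;

    for (long i = 0; i < n; i++) { b[i] = 1.0; c[i] = 2.0; }

    double s = 3.0;
    for (long i = 0; i < n; i++)            /* triad: 2 flops, 3 doubles of traffic */
        a[i] = b[i] + s * c[i];

    /* Balance arithmetic: a 1.3 GHz Power4 peaks near 5.2 Gflop/s, so holding
     * peak on this loop would need about 5.2e9 * 12 = 62 GB/s of memory
     * bandwidth per processor -- far beyond a general-purpose memory system. */
    printf("a[0] = %f (expect 7.0)\n", a[0]);
    free(a); free(b); free(c);
    return 0;
}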

High-Performance Computing

• Where are we?
• Where have we been?
• Where are we going?

Open science needs leadership-scale HPC

• Climate science
  − Chemistry of atmosphere, rivers, vegetation
  − To guide US policy decisions
• Magnetic fusion energy
  − Optimize self-healing of plasma with turbulence
  − To quantify prospects for commercial fusion
• Environmental molecular science
  − Predict properties of radioactive substances
  − To develop innovative remediation technology
• Astrophysics
  − Realistically simulate supernova explosions
  − To measure size, age, and expansion rate of the Universe
• …

How do we get there?

[Chart: the same operations-per-second history, 1940 to 2010, from electro-mechanical accounting machines through the serial, vector, parallel, and scalable vector eras to ASCI Q and the Earth Simulator, against the projection published in 1983.]

Bigger CoGS? ASCI says Yes

• History of CoGS
  − Blue (IBM, SGI), White (IBM), Q (HP)
• Success for ASCI applications
• ASCI Purple
  − 100 TF IBM Power5 cluster
  − Complete in 2005

http://www.llnl.gov/asci/purple/

Bigger CoGS for open science?

• CCS experience with CoGS evaluations
  − Compaq/HP AlphaServer SC
  − IBM Winterhawk I & II, Nighthawk I & II, p630, p690 (Cheetah), p655 soon
  − IBM Federation interconnect later in 2003
  − …
• Limited scalability for open science?
  − Increasing system imbalance?
  − Decreasing application efficiency?

CoGS not the whole story for ASCI

• ASCI Red (Sandia)
  − 9000+ processor MPP
  − General-purpose processors (Intel)
  − Special-purpose software and system
  − Fastest computer 1997-2000
• ASCI Red Storm (Sandia)
  − 40 TF MPP due in 2004
  − General-purpose processors (AMD)
  − Special-purpose system (Cray)
  − Special-purpose software (Sandia)

http://www.sandia.gov/ASCI/images/RedPictures.htm

Other half of ASCI Purple

• Blue Gene/L
  − 65,536 nodes, 360 TF peak
    • Dual-processor PowerPC system-on-a-chip
    • Ultra-low power (and heat)
  − Integrated mesh and tree interconnects
  − Single-process OS kernels on application nodes
  − Linux on I/O nodes
• ORNL also part of Blue Gene team
  − Joint research with IBM
  − Scheduled to receive BG/D

http://www.research.ibm.com/bluegene/

CCS Vision

• Computational science is the third leg of Scientific Discovery
• The nation's open scientific enterprise is critically behind in leadership-scale computing
• The DOE Office of Science has the experience to lead in scientific computing
• CCS offers the best opportunity to launch a High-End Computing Initiative

A Decade of Firsts (1992-2002)

[Timeline graphic, courtesy Thomas Zacharia: CCS formed, first Paragon XP/35, KSR1-64 (1992); OC-12 to Sandia, PVM used to create first international Grid (1993); PVM wins R&D 100 (1994); Paragon XP/150 installed, world's fastest computer, connected by fastest network, and construction starts on new CCS building, a world-class DOE facility (1995); ORNL-SNL create first high-performance computational Grid; R&D 100 Award for successful development and deployment of HPSS; first Human Genome application developed; NetSolve wins R&D 100; ATLAS wins R&D 100; first SC TeraFLOP-peak system; longstanding climate simulation milestone first met on CCS Compaq; first IBM Power4; SciDAC leadership application sustains 1 TF; IBM Blue Gene CRADA to develop superscalable algorithms begins (2002); partnership with Cray on X1 begins (2003).]

CCS is ready

• Experience
• Facilities
  − 40,000 ft² machine rooms
  − Power and cooling
• Connectivity
• Partnerships
  − Core universities
  − Joint Institute for Computational Sciences

CCS Goals

• Deliver Leadership-Class Computing
• By 2005: 50x performance on major scientific simulations
• By 2008: 1000x performance
• Step 1: Evaluate potential architecture

Potential architecture?


Vector power of Earth Simulator


+

Scalability of Red Storm

Potential architecture!

Cray X series

Cray X1 evaluation

• 32-processor system ships March 14
• 256-processor system by end of September
• Pending evaluation
  − Breakthrough performance on challenging scientific applications?
  − Scalability to leadership-level systems?

Maintaining leadership

• Cray's future products?
  − Company goal of a sustained PF by 2010
• DARPA High Productivity Computing Systems
  − "Performance, Programmability, Portability, Robustness"
  − Cray, IBM, HP, SGI, Sun selected for Phase I
  − ~2 chosen for Phase III
  − Full systems developed by 2010

http://www.darpa.mil/ipto/research/hpcs/

Maintaining leadership: HPC research

• System architecture
  − ~50 projects in 1992, a handful today (Horst Simon)
  − DARPA HPCS is a good start
• Software science
  − Parallel programming
  − Performance-friendly software architecture
• Academic collaborations

Conclusions

• Where are we? Behind
• Where have we been? Behind
• Where are we going? Ahead!
