Build It and They Will Come: How Hardware Influences Large-Scale Simulation Science at Extreme Scales

Where Big Data Meets Large-Scale Computing Tutorials, 14 Sep 2018
Jeffrey A. F. Hittinger, Director

LLNL-PRES-758289. This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under contract DE-AC52-07NA27344. Lawrence Livermore National Security, LLC.

Thanks to
§ David Keyes, KAUST
§ Rob Neely, LLNL
§ And others…

Advancements in HPC have happened over three major epochs, and we're settling on a fourth (heterogeneity)

[Chart: peak speed (flops) vs. year, 1952–2025, rising from the kiloscale through the mega-, giga-, tera-, and petascale toward the exascale, with machines from ASCI Red, Blue Pacific, White, Purple, Red Storm, BlueGene/L, Roadrunner, Cielo, Sequoia, Trinity, and Sierra through Crossroads and the planned ATS-4/ATS-5 exascale systems; the timeline is divided into the Serial, Vector, Distributed Memory, and Heterogeneous eras.]

§ Transitions are essential for progress: between 1952 and 2012 (Sequoia), NNSA saw a factor of ~1 trillion improvement in peak speed
§ After a decade or so of uncertainty, history will most likely refer to this next era as the heterogeneous era
§ Each of these eras defines a new common programming model, and transitions are taking longer; we are entering a fourth era

Pre-WWII – the theoretical foundations of modern computing are built in the U.S. and England

§ 1830s – The Difference Engine
— Charles Babbage / Ada Lovelace
— Not built during his lifetime
§ 1930s – The Turing Machine
— Alan Turing (England / Princeton)
— Theory of programmable algorithms
§ 1940s – ENIAC
— Mauchly/Eckert – Moore School of Eng. (Philadelphia)
— 1st general programmable electronic computer
§ 1940s – EDVAC design
— John von Neumann (Los Alamos)
— Basis for the "stored program architecture" still used today

Ballistics projectile calculations were the first "killer app" for computers, and were historically done by a pipeline of human "computers"

The "mainframe" era – general-purpose computers designed for scientific computing

§ 1953 – Univac 1
— First machine installed at LLNL in 1953
§ 1954 – IBM 701
— Installed at Los Alamos also in 1953
— Williams tubes (fast memory, but unreliable)
§ 1956 – IBM 704
— Core memory, floating-point arithmetic, CRT
— Commercially successful
§ 1958 – IBM 709
— Seamless porting
§ IBM 7090
— Transistor-based
— Large speed increases
§ 1960 – Univac LARC
— Co-designed with, and for, LLNL
— One of the first transistor-based machines

The "mainframe" era – continued

§ 1961 – IBM 7030 (Stretch)
• Competitor to LARC
• Considered a failure at the time (only achieved 50% of performance goals)
• Introduced many concepts that went into the IBM System/360
§ 1962 – CDC 1604
• First designed by Seymour Cray
§ CDC 3600
• 48-bit words (higher precision)
§ 1964 – CDC 6600
• Considered the first real "supercomputer" (1st at CERN)
• Full separation of input/output from computing (precursor to RISC designs)
§ 1969 – CDC 7600
• Hierarchical memory design
• Instruction pipeline (allowed for modest parallelism)
• Fastest computer in the world from 1969–1975

Programming in the mainframe era

§ Programming was often an intimate exercise with the particular computer you were targeting
§ Entirely serial programming
§ Lack of sufficient memory was often the limiting factor in scientific software design
§ This era saw huge innovations in hardware and software, despite computer science not yet being an established discipline
— Development of compilers (e.g., Fortran)
— Operating systems (e.g., CTSS)
— Batch environments and time-sharing
— Use of terminals for viewing output
— LLNL's "Octopus" network
§ The U.S. stood largely alone in dominating the HPC market

The vector era

§ 1976 – CDC Star 100
• First CDC design not led by Seymour Cray
• One of the first commercial vector machines
• Hard to achieve anything near peak performance
• Large disparity between serial and vector code
§ 1975 – Illiac IV (Burroughs & Univ. of Illinois)
• First massively parallel machine; portends future trends
• Moved to NASA Ames in 1975 (defense at a public university? Bah)
§ 1978 – Cray 1
• First machine developed at Seymour Cray's new company
• Applications saw immediate gains in performance (vs. CDC 7600), and could incrementally improve with additional vectorization
§ 1983 – Cray X-MP
• Designed by Steve Chen at Cray
• First multi-processor supercomputer

The vector era – soon dominated by Cray in the U.S.

§ 1985 – Cray 2
• Seymour's 2nd design
• Fluorinert-cooled
• Most performance gains were due to very fast memory
§ 1988 – Cray Y-MP
• Successor to X-MP
• First delivered with Unix-based UNICOS OS
§ 1991 – Cray C90
§ 1994 – Cray J90
• "Mini" supercomputer – helped bridge the gap into the MPP era
§ 1995 – Cray T90
• End of an era!

Other players in vector computer design:
• US: Convex, Thinking Machines, ETA (spinoff of CDC), IBM
• Non-US: NEC, Fujitsu, Hitachi

Programming vector computers in that era

§ Machines were often delivered with no operating system, just hardware
— Labs created their own software environments
— Took years for the Cray compiler to catch up to the NNSA lab compilers
§ Vector computers are designed to work with arrays and loops
— Lots of those in scientific codes!
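The arrays-and-loops point is worth making concrete. The sketch below (my own illustration, not from the talk; function names are invented) contrasts an element-by-element loop with the equivalent whole-array operation, using NumPy array arithmetic as a stand-in for what vector hardware executes directly, and also shows the fib recurrence cited below as a classic loop-carried dependence that resists vectorization.

```python
import numpy as np

def saxpy_loop(a, x, y):
    # Element-by-element loop: the scalar (serial) formulation.
    out = np.empty_like(y)
    for i in range(len(y)):
        out[i] = a * x[i] + y[i]
    return out

def saxpy_vector(a, x, y):
    # Whole-array form: one operation over all elements at once,
    # the pattern vector machines (and SIMD units) execute directly,
    # because no iteration depends on any other.
    return a * x + y

def fib_loop(n):
    # Loop-carried dependence: fib[i] needs fib[i-1] and fib[i-2],
    # so iterations cannot run in lockstep on vector hardware,
    # no matter how clever the vectorizing compiler is.
    fib = [0, 1] + [0] * (n - 2)
    for i in range(2, n):
        fib[i] = fib[i - 1] + fib[i - 2]
    return fib

x = np.arange(4, dtype=float)      # [0., 1., 2., 3.]
y = np.ones(4)
print(saxpy_vector(2.0, x, y))     # [1. 3. 5. 7.]
print(fib_loop(8))                 # [0, 1, 1, 2, 3, 5, 8, 13]
```

The two SAXPY forms compute identical results; the difference is purely in how the work is expressed to the machine, which is exactly the gap the vector-era compilers and programmers had to bridge.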
§ Vectorizing was difficult, and many codes did not lend themselves to this
— Loop dependencies (e.g., fib[i] = fib[i-1] + fib[i-2])
— Conditional statements
— Non-constant stride through memory
— I/O or system calls
— Function calls

Machine details matter for effective use…

Programming in the vector era – cont'd

§ Compilers became increasingly sophisticated at being able to perform some code transformations
— Yet "automatic parallelization" never fully achieved its academic promises
§ Vectorized portions of code could realize anywhere from 2–50x speedups over a serial counterpart
§ Vectorization (via SIMD units) eventually made its way into commodity CPUs with the Pentium III (SSE)
— Also AVX, ARM Neon, Power QPX
— (Not quite the same, but close enough…)
§ Stable programming model, but
— Lots of effort for potentially limited payoff
— Vectorized coding could get ugly!

Vectorization lives on in the form of SIMD units (e.g., Intel AVX)

Soon… mainframes (vector or not) faced the "attack of the killer micros"

§ Late 1980s; heavy influence by the PC/workstation market
— Reduced instruction set computing (RISC) processors dominated price-performance
— Mainframes became "dinosaurs running COBOL"
§ Revolution driven largely by cost and scaling
— Cost: computing went mainstream
— Scaling: commodity processors could be interconnected
— Programming: rewrite from vector code to commodity processors
§ Two paths: penetrate the lower-end market through low-cost commodity CMOS processors, and penetrate the high-end (HPC) market through parallel scaling

With industry focusing its investment on commodity processors, the writing was on the wall for the labs: leverage or fall behind the curve…

Commodity hardware components thus define the distributed memory era

§ Workstations (e.g., Silicon Graphics) were doing the work of mainframes for 50x cheaper!
— SGI acquires Cray Research
§ The relative ease of building distributed memory systems launched a wave of innovation and new supercomputer companies in the 90s
— IBM, Cray [Research, Computer, Inc], SGI, Intel, nCube, MasPar, Alliant, Multiflow, Kendall Square, BBN, DEC, Sequent, Convex, Encore, HP, Meiko, Supertek, Floating Point Systems, Tera, MIT J-machine, …
§ Fast commodity Ethernet and Linux eventually led to the "Beowulf" explosion
— Anyone can build a supercomputer now!
— Great boon to universities and small companies
— By the 2000s, the list of dead or dying U.S. supercomputing companies exceeded those still alive

The distributed computing or MPP era

§ ASCI program launches in ~1995
— Replace underground nuclear testing with science-based stockpile stewardship (simulation and experimental facilities)
§ Meanwhile, the Japanese "stay the course" with vectors
— Gov't funding pumped up their market
— Some concern that the US had forever lost its edge when the Japanese Earth Simulator was deployed in 2002
— Many here deemed it a Pyrrhic victory
• It had a HUGE footprint and power use
§ BlueGene/L (@ LLNL) retakes the crown in 2004 for 3.5 straight years, putting a nail in the coffin of Japanese dominance
§ DOE Leadership Computing Facilities (ANL/ORNL) launched in ~2004 to help reestablish the US lead in open scientific computing
— Currently housing Summit (#1 @ ORNL), Mira (#17 @ ANL)
— Argonne scheduled to receive the first exascale computer in 2021–22 (A21)

Programming in the Distributed Computing Era

§ Going on 20+ years of stability in the programming model
— PVM, and then ultimately MPI, provide performance portability
— Unix becomes the de facto standard for HPC
— Object-oriented design helps balance complexity and performance
— Programming languages and compilers continue to improve
— Emergence of grid and distance computing
§ Mission delivery and scientific advancement have progressed rapidly, …
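The message-passing model behind MPI centers on private address spaces that share nothing except explicit messages. As a minimal sketch (my own serial toy, not from the talk and not real MPI), the function below mimics the classic halo (ghost-cell) exchange that distributed-memory simulation codes perform each timestep; each "rank" here is just a Python list, and each assignment stands in for an MPI send/receive pair.

```python
def exchange_halos(subdomains):
    """Fill each subdomain's ghost cells from its neighbors.

    Each subdomain is laid out as [left_ghost, interior..., right_ghost].
    In a real distributed-memory code each subdomain lives on a different
    node, and each assignment below would be an MPI send/receive pair;
    in this serial toy they are plain copies within one address space.
    """
    last = len(subdomains) - 1
    for r, sub in enumerate(subdomains):
        if r > 0:        # left ghost <- left neighbor's rightmost interior
            sub[0] = subdomains[r - 1][-2]
        if r < last:     # right ghost <- right neighbor's leftmost interior
            sub[-1] = subdomains[r + 1][1]
    return subdomains

# Two "ranks", each owning two interior cells plus two ghost slots.
parts = [[0, 10, 11, 0], [0, 20, 21, 0]]
print(exchange_halos(parts))   # [[0, 10, 11, 20], [11, 20, 21, 0]]
```

After the exchange, each rank can apply a local stencil to its interior points with no further communication, which is the locality that makes the distributed-memory model scale.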
