Parallel Gaussian Elimination

CS 267: Dense Linear Algebra - Parallel Gaussian Elimination
James Demmel
www.cs.berkeley.edu/~demmel/cs267_Spr14
CS267 Lecture 13, 03/04/2014

Outline

• Review Gaussian Elimination (GE) for solving Ax = b
• Optimizing GE for caches on sequential machines
  - using matrix-matrix multiplication (BLAS and LAPACK)
• Minimizing communication for sequential GE
  - Not LAPACK, but recursive LU minimizes bandwidth (minimizing latency is possible)
• Data layouts on parallel machines
• Parallel Gaussian Elimination (ScaLAPACK)
• Minimizing communication for parallel GE
  - Not ScaLAPACK (yet), but "Comm-Avoiding LU" (CALU)
  - The same idea minimizes bandwidth and latency in the sequential case
• Summarize the rest of dense linear algebra
• Dynamically scheduled LU for multicore
• LU for heterogeneous computers (CPU + GPU)

Summary of Matrix Multiplication

• Goal: multiply n x n matrices C = A·B using O(n^3) arithmetic operations, while minimizing data movement
• Sequential
  - Assume a fast memory of size M < 3n^2, and count slow-memory references
  - Theorem: need Ω(n^3/M^(1/2)) slow-memory references and Ω(n^3/M^(3/2)) messages
  - Attainable using "blocked" or "recursive" matrix multiply
• Parallel
  - Assume P processors, with O(n^2/P) data per processor
  - Theorem: need Ω(n^2/P^(1/2)) words sent and Ω(P^(1/2)) messages
  - Attainable by Cannon's algorithm, and nearly by SUMMA
  - SUMMA is the one used in practice (PBLAS)
  - Keeping c copies of the data ⇒ c^(1/2) times fewer words, c^(3/2) times fewer messages
• Which other linear algebra problems can we do with as little data movement?
  - Today: solve Ax = b in detail, and summarize what is known and what is open

Gaussian Elimination (GE) for solving Ax = b

• Add multiples of each row to later rows to make A upper triangular
• Solve the resulting triangular system Ux = c by substitution
• Initial version:

    for i = 1 to n-1        ... for each column i, zero it out below the diagonal
                            ... by adding multiples of row i to later rows
      for j = i+1 to n      ... for each row j below row i
                            ... add a multiple of row i to row j
        tmp = A(j,i)
        for k = i to n
          A(j,k) = A(j,k) - (tmp/A(i,i)) * A(i,k)

Refine GE Algorithm (1)

• Remove the computation of the constant tmp/A(i,i) from the inner loop:

    for i = 1 to n-1
      for j = i+1 to n
        m = A(j,i)/A(i,i)
        for k = i to n
          A(j,k) = A(j,k) - m * A(i,k)

(Figure: sparsity pattern of A after steps i = 1, 2, 3, ..., n-1, showing the zeros accumulating below the diagonal as multipliers m are applied to rows j > i.)
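As a concrete reference point, the triple loop above translates directly into Python/NumPy. This is a minimal illustrative sketch, not LAPACK: the function name ge_naive is ours, and it assumes every pivot A(i,i) is nonzero, since pivoting has not been introduced yet.

    import numpy as np

    def ge_naive(A):
        # Refined GE, version (1): reduce A to upper triangular form in place.
        # Assumes every pivot A[i, i] is nonzero (no pivoting yet).
        A = np.array(A, dtype=float)       # work on a copy
        n = A.shape[0]
        for i in range(n - 1):             # for each column i
            for j in range(i + 1, n):      # for each row j below row i
                m = A[j, i] / A[i, i]      # multiplier, hoisted out of the k loop
                for k in range(i, n):      # subtract m times row i from row j
                    A[j, k] -= m * A[i, k]
        return A                           # the upper triangle now holds U

    A = np.array([[4.0, 3.0],
                  [6.0, 3.0]])
    print(ge_naive(A))                     # [[ 4.   3. ]  [ 0.  -1.5]]

The three nested loops make this an O(n^3) algorithm, which is exactly the flop count the analysis below assumes.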
Refine GE Algorithm (2)

• Don't compute what we already know: the zeros below the diagonal in column i

    for i = 1 to n-1
      for j = i+1 to n
        m = A(j,i)/A(i,i)
        for k = i+1 to n          ... k now starts at i+1: do not compute the zeros
          A(j,k) = A(j,k) - m * A(i,k)

Refine GE Algorithm (3)

• Store the multipliers m below the diagonal, in the entries just zeroed out, for later use

    for i = 1 to n-1
      for j = i+1 to n
        A(j,i) = A(j,i)/A(i,i)    ... store m here
        for k = i+1 to n
          A(j,k) = A(j,k) - A(j,i) * A(i,k)

Refine GE Algorithm (4)

• Split the loop: store all the m's in column i before updating the rest of the matrix

    for i = 1 to n-1
      for j = i+1 to n
        A(j,i) = A(j,i)/A(i,i)
      for j = i+1 to n
        for k = i+1 to n
          A(j,k) = A(j,k) - A(j,i) * A(i,k)

Refine GE Algorithm (5)

• Express using matrix operations (BLAS)

    for i = 1 to n-1
      A(i+1:n, i) = A(i+1:n, i) * ( 1 / A(i,i) )      ... BLAS 1 (scale a vector)
      A(i+1:n, i+1:n) = A(i+1:n, i+1:n)
                        - A(i+1:n, i) * A(i, i+1:n)   ... BLAS 2 (rank-1 update)

What GE really computes

• Call the strictly lower triangular matrix of multipliers M, and let L = I + M
• Call the upper triangle of the final matrix U
• Lemma (LU factorization): if the above algorithm terminates (does not divide by zero), then A = L*U
• Solving A*x = b using GE:
  - Factorize A = L*U using GE             (cost = (2/3)n^3 flops)
  - Solve L*y = b for y, by substitution   (cost = n^2 flops)
  - Solve U*x = y for x, by substitution   (cost = n^2 flops)
• Thus A*x = (L*U)*x = L*(U*x) = L*y = b, as desired

Problems with the basic GE algorithm

• What if some A(i,i) is zero? Or very small?
  - The result may not exist, or may be "unstable", so we need to pivot
• The current computation is all BLAS 1 or BLAS 2, but we know that BLAS 3 (matrix multiply) is fastest (see the earlier lectures)

(Figure: performance vs. matrix size for BLAS 1, BLAS 2, and BLAS 3 - only BLAS 3 approaches machine peak.)
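The BLAS2 form and the lemma are easy to check numerically. Below is a minimal NumPy sketch; the name lu_blas2 is ours, and the diagonal shift in the test matrix is an assumption we add to keep all pivots safely nonzero, since this version still has no pivoting.

    import numpy as np

    def lu_blas2(A):
        # BLAS2 form of GE (no pivoting): overwrite A so that the strictly
        # lower triangle holds the multipliers M and the upper triangle holds U.
        A = np.array(A, dtype=float)
        n = A.shape[0]
        for i in range(n - 1):
            A[i+1:, i] /= A[i, i]                              # BLAS 1: scale a vector
            A[i+1:, i+1:] -= np.outer(A[i+1:, i], A[i, i+1:])  # BLAS 2: rank-1 update
        return A

    # Check the lemma A = L*U on a matrix with safe pivots
    rng = np.random.default_rng(0)
    n = 5
    A = rng.standard_normal((n, n)) + 5.0 * np.eye(n)  # shift: no pivoting needed
    F = lu_blas2(A)
    L = np.tril(F, -1) + np.eye(n)    # L = I + M
    U = np.triu(F)
    print(np.allclose(L @ U, A))      # True, up to roundoff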
Pivoting in Gaussian Elimination

• A = [ 0 1 ]  fails completely, because we cannot divide by A(1,1) = 0
      [ 1 0 ]
• But solving Ax = b for this A should be easy!
• When the diagonal entry A(i,i) is tiny (not just zero), the algorithm may terminate but compute a completely wrong answer
  - This is numerical instability, and roundoff error is the cause
• Cure: pivot (swap rows of A) so that A(i,i) is large

Gaussian Elimination with Partial Pivoting (GEPP)

• Partial pivoting: swap rows so that A(i,i) is the largest entry in the rest of column i

    for i = 1 to n-1
      find and record k where |A(k,i)| = max{i ≤ j ≤ n} |A(j,i)|
          ... i.e. the largest entry in the rest of column i
      if |A(k,i)| = 0
        exit with a warning that A is singular, or nearly so
      elseif k ≠ i
        swap rows i and k of A
      end if
      A(i+1:n, i) = A(i+1:n, i) / A(i,i)        ... each |quotient| ≤ 1
      A(i+1:n, i+1:n) = A(i+1:n, i+1:n) - A(i+1:n, i) * A(i, i+1:n)

• Lemma: this algorithm computes A = P*L*U, where P is a permutation matrix
• The algorithm is numerically stable in practice
• For details, see the LAPACK code at http://www.netlib.org/lapack/single/sgetf2.f
• This is the standard approach - but what are its communication costs?

Converting BLAS2 to BLAS3 in GEPP

• Blocking
  - Used to optimize matrix multiplication
  - Harder here because of the data dependencies in GEPP
• BIG IDEA: delayed updates
  - Save the updates to the "trailing matrix" from several consecutive BLAS2 (rank-1) updates
  - Apply many updates simultaneously in one BLAS3 (matmul) operation
• The same idea works for much of dense linear algebra
  - but not for eigenvalue problems or the SVD, which need more ideas
• First approach: choose a block size b
  - The algorithm will save and apply b updates
  - b should be small enough that the active submatrix of b columns of A fits in cache
  - b should be large enough to make BLAS3 (matmul) fast

Blocked GEPP (www.netlib.org/lapack/single/sgetrf.f)

(All parallelism is "hidden" inside the BLAS.)

    for ib = 1 to n-1 step b    ... process the matrix b columns at a time
      end = ib + b - 1          ... point to the end of the block of b columns
      apply the BLAS2 version of GEPP to get A(ib:n, ib:end) = P' * L' * U'
      ... let LL denote the strictly lower triangular part of A(ib:end, ib:end) + I
      A(ib:end, end+1:n) = LL^(-1) * A(ib:end, end+1:n)
                                ... update the next b rows of U
      A(end+1:n, end+1:n) = A(end+1:n, end+1:n)
                            - A(end+1:n, ib:end) * A(ib:end, end+1:n)
                                ... apply the delayed updates with a single
                                ... matrix multiply with inner dimension b

(For a correctness proof, see the on-line notes from CS267 / 1996.)

Efficiency of Blocked GEPP

(Figure: efficiency of LAPACK LU relative to best effort, of matmul relative to hardware peak, and of LAPACK LU relative to matmul, on Convex C4 (1 and 4 processors), Cray C90 (1 and 16 processors), Alpha, RS6000, SGI, and a PC.)
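Under the simplifying assumption that no pivoting is needed (the real sgetrf pivots within each panel), the delayed-update structure can be sketched in a few lines of NumPy. The name lu_blocked is ours, for illustration only:

    import numpy as np

    def lu_blocked(A, b=32):
        # Right-looking blocked LU: a sketch of the delayed-update idea.
        # Pivoting is omitted for brevity, unlike LAPACK's sgetrf.
        A = np.array(A, dtype=float)
        n = A.shape[0]
        for ib in range(0, n, b):
            end = min(ib + b, n)
            # 1. Panel factorization: BLAS2 GE on the b-column panel A[ib:, ib:end]
            for i in range(ib, end):
                A[i+1:, i] /= A[i, i]
                A[i+1:, i+1:end] -= np.outer(A[i+1:, i], A[i, i+1:end])
            # 2. Update the next b rows of U: forward-substitute with the unit
            #    lower triangular LL held in A[ib:end, ib:end]
            for i in range(ib, end):
                A[i+1:end, end:] -= np.outer(A[i+1:end, i], A[i, end:])
            # 3. Apply all b delayed updates in ONE matrix multiply (BLAS3)
            A[end:, end:] -= A[end:, ib:end] @ A[ib:end, end:]
        return A

    rng = np.random.default_rng(1)
    n = 100
    A = rng.standard_normal((n, n)) + 100.0 * np.eye(n)  # safe without pivoting
    F = lu_blocked(A, b=32)
    L = np.tril(F, -1) + np.eye(n)
    U = np.triu(F)
    print(np.allclose(L @ U, A))   # True: blocking changes the schedule, not the result

Steps 1 and 2 touch only O(nb) data per block step, while step 3 performs almost all of the flops as a single matmul with inner dimension b, which is what makes the blocked algorithm fast.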
Communication Lower Bound for GE

• Matrix multiplication can be "reduced to" GE:

    [ I 0 -B ]   [ I     ]   [ I 0  -B  ]
    [ A I  0 ] = [ A I   ] · [   I  A·B ]
    [ 0 0  I ]   [     I ]   [        I ]

• This is not a good way to do matmul, but it shows that GE needs at least as much communication as matmul

Does LAPACK's GEPP Minimize Communication?

(Recall the blocked GEPP algorithm above.)

• Case 1: n ≥ M - huge matrix - attains the lower bound
  - b = M^(1/2) is optimal; the cost is dominated by matmul
• Case 2: n ≤ M^(1/2) - small matrix - attains the lower bound
  - The whole matrix fits in fast memory
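As a concrete check of the reduction, the sketch below builds the 3n x 3n block matrix and eliminates it with the unpivoted lu_blas2 routine defined in the earlier sketch. Unpivoted elimination is essential here: every pivot is exactly 1, and row swaps would destroy the block structure.

    import numpy as np

    # Reduction of matmul to GE: eliminate the 3n x 3n block matrix
    # and read A·B out of the resulting U factor. Uses lu_blas2 from above.
    rng = np.random.default_rng(2)
    n = 4
    A = rng.standard_normal((n, n))
    B = rng.standard_normal((n, n))

    I, Z = np.eye(n), np.zeros((n, n))
    M = np.block([[I, Z, -B],
                  [A, I,  Z],
                  [Z, Z,  I]])

    F = lu_blas2(M)     # unpivoted GE: every pivot stays equal to 1
    U = np.triu(F)
    print(np.allclose(U[n:2*n, 2*n:], A @ B))   # True: middle-right block of U is A·B

Since GE on this matrix performs the multiplication A·B as a side effect, any way of doing GE with less communication than the matmul lower bound would contradict the lower bound for matmul itself.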