<<

A SCIENCE-BASED CASE FOR LARGE-SCALE SIMULATION

Office of Science, U.S. Department of Energy

July 30, 2003

“There will be opened a gateway and a road to a large and excellent science, into which minds more piercing than mine shall penetrate to recesses still deeper.” Galileo (1564–1642)

[on the “experimental mathematical analysis of nature,” appropriated here for “computational simulation”]

TRANSMITTAL

July 30, 2003

Dr. Raymond L. Orbach, Director
Office of Science
U.S. Department of Energy
Washington, DC

Dear Dr. Orbach,

On behalf of more than 300 contributing computational scientists from dozens of leading universities, all of the multiprogrammatic laboratories of the Department of Energy, other federal agencies, and industry, I am pleased to deliver Volume 1 of a two-volume report that builds a Science-Based Case for Large-Scale Simulation.

We find herein (and detail further in the companion volume) that favorable trends in computational power and networking, scientific software engineering and infrastructure, and modeling and algorithms are allowing several applications to approach thresholds beyond which lies new knowledge of both fundamental and practical kinds. A major increase in investment in computational modeling and simulation is appropriate at this time, so that our citizens are the first to benefit from particular new fruits of scientific simulation, and indeed, from an evolving culture of simulation science.

Based on the encouraging first two years of the Scientific Discovery through Advanced Computing initiative, we believe that balanced investment in scientific applications, applied mathematics, and computer science, with cross-accountabilities between these constituents, is a program structure worthy of extension. Through it, techniques from applied mathematics and computer science are systematically being migrated into scientific computer codes. As this happens, new challenges in the use of these techniques flow back to the mathematics and computer science communities, rejuvenating their own basic research. Now is an especially opportune time to increase the number of groups working in this multidisciplinary way, and to provide the computational platforms and environments that will enable these teams to push their simulations to the next insight-yielding levels of high resolution and large ensembles.

The Department of Energy can draw satisfaction today from its history of pathfinding in each of the areas that are brought together in large-scale simulation. However, from the vantage point of tomorrow, the Department’s hand in their systematic fusion will be regarded as even more profound.

Best regards,

David E. Keyes
Fu Foundation Professor of Applied Mathematics
Columbia University
New York, NY


EXECUTIVE SUMMARY

Important advances in basic science crucial to the national well-being have been brought near by a “perfect fusion” of sustained advances in scientific models, mathematical algorithms, computer architecture, and scientific software engineering. Computational simulation—a means of scientific discovery that employs a computer system to simulate a physical system according to laws derived from theory and experiment—has attained peer status with theory and experiment in many areas of science. The United States is currently a world leader in computational simulation, a position that confers both an opportunity and a responsibility to mount a vigorous campaign of research that brings the advancing power of simulation to many scientific frontiers.

Computational simulation offers to enhance, as well as leapfrog, theoretical and experimental progress in many areas of science critical to the scientific mission of the U.S. Department of Energy (DOE). Successes have been documented in such areas as advanced energy systems (e.g., fuel cells, fusion), biotechnology (e.g., genomics, cellular dynamics), nanotechnology (e.g., sensors, storage devices), and environmental modeling (e.g., climate prediction, pollution remediation). Computational simulation also offers the best near-term hope for progress in answering a number of scientific questions in such areas as the fundamental structure of matter, the production of heavy elements in supernovae, and the functions of enzymes.

The ingredients required for success in advancing scientific discovery are insights, models, and applications from scientists; theory, methods, and algorithms from mathematicians; and software and hardware infrastructure from computer scientists. Only major new investment in these activities across the board, in the program areas of DOE’s Office of Science and other agencies, will enable the United States to be the first to realize the promise of the scientific advances to be wrought by computational simulation.

In this two-volume report, prepared with direct input from more than 300 of the nation’s leading computational scientists, a science-based case is presented for major, new, carefully balanced investments in

• scientific applications
• algorithm research and development
• computing system software infrastructure
• network infrastructure for access and resource sharing
• computational facilities
• innovative computer architecture research, for the facilities of the future
• proactive recruitment and training of a new generation of multidisciplinary computational scientists

The two-year-old Scientific Discovery through Advanced Computing (SciDAC) initiative in the Office of Science provides a template for such science-directed, multidisciplinary research campaigns. SciDAC’s successes in the first four of these seven thrusts have illustrated the advances possible with coordinated investments. It is now time to take full advantage of the revolution in computational science with new investments that address the most challenging scientific problems faced by DOE.



A SCIENCE-BASED CASE FOR LARGE-SCALE SIMULATION

VOLUME 1

Attributions

List of Participants

1. Introduction

2. Scientific Discovery through Advanced Computing: A Successful Pilot Program

3. Anatomy of a Large-Scale Simulation

4. Opportunities at the Scientific Horizon

5. Enabling Mathematics and Computer Science Tools

6. Recommendations and Discussion

Postscript and Acknowledgments

Appendix 1: A Brief Chronology of “A Science-Based Case for Large-Scale Simulation”

Appendix 2: Charge for “A Science-Based Case for Large-Scale Simulation”

Acronyms and Abbreviations



ATTRIBUTIONS

Report Editors
Phillip Colella, Lawrence Berkeley National Laboratory
Thom H. Dunning, Jr., University of Tennessee and Oak Ridge National Laboratory
William D. Gropp, Argonne National Laboratory
David E. Keyes, Columbia University, Editor-in-Chief

Workshop Organizers
David L. Brown, Lawrence Livermore National Laboratory
Phillip Colella, Lawrence Berkeley National Laboratory
Lori Freitag Diachin, Sandia National Laboratories
James Glimm, State University of New York–Stony Brook and Brookhaven National Laboratory
William D. Gropp, Argonne National Laboratory
Steven W. Hammond, National Renewable Energy Laboratory
Robert Harrison, Oak Ridge National Laboratory
Stephen C. Jardin, Princeton Plasma Physics Laboratory
David E. Keyes, Columbia University, Chair
Paul C. Messina, Argonne National Laboratory
Juan C. Meza, Lawrence Berkeley National Laboratory
Anthony Mezzacappa, Oak Ridge National Laboratory
Robert Rosner, University of Chicago and Argonne National Laboratory
R. Roy Whitney, Thomas Jefferson National Accelerator Facility
Theresa L. Windus, Pacific Northwest National Laboratory

Topical Group Leaders

Science Applications

Accelerators
Kwok Ko, Stanford Linear Accelerator Center
Robert D. Ryne, Lawrence Berkeley National Laboratory

Astrophysics
Anthony Mezzacappa, Oak Ridge National Laboratory
Robert Rosner, University of Chicago and Argonne National Laboratory

Biology
Michael Colvin, Lawrence Livermore National Laboratory
George Michaels, Pacific Northwest National Laboratory

Chemistry
Robert Harrison, Oak Ridge National Laboratory
Theresa L. Windus, Pacific Northwest National Laboratory

Climate and Earth Science
John B. Drake, Oak Ridge National Laboratory
Phillip W. Jones, Los Alamos National Laboratory
Robert C. Malone, Los Alamos National Laboratory


Combustion
John B. Bell, Lawrence Berkeley National Laboratory
Larry A. Rahn, Sandia National Laboratories

Environmental Remediation and Processes
Mary F. Wheeler, University of Texas–Austin
Steve Yabusaki, Pacific Northwest National Laboratory

Materials Science
Francois Gygi, Lawrence Livermore National Laboratory
G. Malcolm Stocks, Oak Ridge National Laboratory

Nanoscience
Peter T. Cummings, Vanderbilt University and Oak Ridge National Laboratory
Lin-wang Wang, Lawrence Berkeley National Laboratory

Plasma Science
Stephen C. Jardin, Princeton Plasma Physics Laboratory
William M. Nevins, Lawrence Livermore National Laboratory

Quantum Chromodynamics
Robert Sugar, University of California–Santa Barbara

Mathematical Methods and Algorithms

Computational Fluid Dynamics
Philip Colella, Lawrence Berkeley National Laboratory
Paul F. Fischer, Argonne National Laboratory

Discrete Mathematics and Algorithms
Bruce A. Hendrickson, Sandia National Laboratories
Alex Pothen, Old Dominion University

Meshing Methods
Lori Freitag Diachin, Sandia National Laboratories
David B. Serafini, Lawrence Berkeley National Laboratory
David L. Brown, Lawrence Livermore National Laboratory

Multi-physics Solution Techniques
Dana A. Knoll, Los Alamos National Laboratory
John N. Shadid, Sandia National Laboratories

Multiscale Techniques
Thomas J. R. Hughes, University of Texas–Austin
Mark S. Shephard, Rensselaer Polytechnic Institute

Solvers and “Fast” Algorithms
Van E. Henson, Lawrence Livermore National Laboratory
Juan C. Meza, Lawrence Berkeley National Laboratory

Transport Methods
Frank R. Graziani, Lawrence Livermore National Laboratory
Gordon L. Olson, Los Alamos National Laboratory


Uncertainty Quantification
James Glimm, State University of New York–Stony Brook and Brookhaven National Laboratory
Sallie Keller-McNulty, Los Alamos National Laboratory

Computer Science and Infrastructure

Access and Resource Sharing
Ian Foster, Argonne National Laboratory
William E. Johnston, Lawrence Berkeley National Laboratory

Architecture
William D. Gropp, Argonne National Laboratory

Data Management and Analysis
Arie Shoshani, Lawrence Berkeley National Laboratory
Doron Rotem, Lawrence Berkeley National Laboratory

Frameworks and Environments
Robert Armstrong, Sandia National Laboratories
Kathy Yelick, University of California–Berkeley

Performance Tools and Evaluation
David H. Bailey, Lawrence Berkeley National Laboratory

Software Engineering and Management
Steven W. Hammond, National Renewable Energy Laboratory
Ewing Lusk, Argonne National Laboratory

System Software
Al Geist, Oak Ridge National Laboratory

Visualization
E. Wes Bethel, Lawrence Berkeley National Laboratory
Charles Hansen, University of Utah



WORKSHOP PARTICIPANTS

Tom Abel Michael A. Bender Pennsylvania State University State University of New York–Stony Brook Marvin Lee Adams Jerry Bernholc Texas A&M University North Carolina State University Srinivas Aluru David E. Bernholdt Iowa State University Oak Ridge National Laboratory James F. Amundson Wes Bethel Fermi National Accelerator Laboratory Lawrence Berkeley National Laboratory Carl W. Anderson Tom Bettge Brookhaven National Laboratory National Center for Atmospheric Research Kurt Scott Anderson Amitava Bhattacharjee Rensselaer Polytechnic Institute University of Iowa Adam P. Arkin Eugene W. Bierly University of California–Berkeley/Lawrence American Geophysical Union Berkeley National Laboratory John Blondin Rob Armstrong North Carolina State University Sandia National Laboratories Randall Bramley Miguel Arqaez Indiana University University of Texas–El Paso Marcia Branstetter Steven F. Ashby Oak Ridge National Laboratory Lawrence Livermore National Laboratory Ron Brightwell Paul R. Avery Sandia National Laboratories University of Florida Mark Boslough Dave Bader Sandia National Laboratories DOE Office of Science Richard Comfort Brower David H. Bailey Boston University Lawrence Berkeley National Laboratory David L. Brown Ray Bair Lawrence Livermore National Laboratory Pacific Northwest National Laboratory Steven Bryant Kim K. Baldridge University of Texas–Austin San Diego Supercomputer Center Darius Buntinas Samuel Joseph Barish Ohio State University DOE Office of Science Randall D. Burris Paul Bayer Oak Ridge National Laboratory DOE Office of Science Lawrence Buja Grant Bazan National Center for Atmospheric Research Lawrence Livermore National Laboratory Phillip Cameron-Smith Brian Behlendorf Lawrence Livermore National Laboratory CollabNet Andrew Canning John B. Bell Lawrence Berkeley National Laboratory Lawrence Berkeley National Laboratory


Christian Y. Cardall Eduardo F. D’Azevedo Oak Ridge National Laboratory Oak Ridge National Laboratory Mark Carpenter Cecelia Deluca NASA Langley Research Center National Center for Atmospheric Research John R. Cary Joel Eugene Dendy University of Colorado–Boulder Los Alamos National Laboratory Fausto Cattaneo Narayan Desai University of Chicago Argonne National Laboratory Joan Centrella Carleton DeTar NASA Goddard Space Flight Center University of Utah Kyle Khem Chand Chris Ding Lawrence Livermore National Laboratory Lawrence Berkeley National Laboratory Jacqueline H. Chen David Adams Dixon Sandia National Laboratories Pacific Northwest National Laboratory Shiyi Chen John Drake Johns Hopkins University Oak Ridge National Laboratory George Liang-Tai Chiu Phil Dude IBM T. J. Watson Research Center Oak Ridge National Laboratory K.-J. Cho Jason Duell Stanford University Lawrence Berkeley National Laboratory Alok Choudhary Phil Duffy Northwestern University Lawrence Livermore National Laboratory Norman H. Christ Thom H. Dunning, Jr. Columbia University University of Tennessee/Oak Ridge National Laboratory Todd S. Coffey Sandia National Laboratories Stephen Alan Eckstrand DOE Office of Science Phillip Colella Lawrence Berkeley National Laboratory Robert Glenn Edwards Thomas Jefferson National Accelerator Michael Colvin Facility Lawrence Livermore National Laboratory Howard C. Elman Sidney A. Coon University of Maryland DOE Office of Science David Erickson Tony Craig Oak Ridge National Laboratory National Center for Atmospheric Research George Fann Michael Creutz Oak Ridge National Laboratory Brookhaven National Laboratory Andrew R. Felmy Robert Keith Crockett Pacific Northwest National Laboratory University of California–Berkeley Thomas A. Ferryman Peter Thomas Cummings Pacific Northwest National Laboratory Vanderbilt University Lee S. Finn Dibyendu K. Datta Pennsylvania State University Rensselaer Polytechnic University Paul F. Fischer James W. Davenport Argonne National Laboratory Brookhaven National Laboratory xii A Science-Based Case for Large-Scale Simulation Workshop Participants

Jacob Fish Sharon C. Glotzer Rensselaer Polytechnic University University of Michigan Ian Foster Tom Goodale Argonne National Laboratory Max Planck Institut für Gravitationsphysik Leopoldo P. Franca Brent Gorda University of Colorado–Denver Lawrence Berkeley National Laboratory Randall Frank Mark S. Gordon Lawrence Livermore National Laboratory Iowa State University Lori A. Freitag Diachin Steven Arthur Gottlieb Sandia National Laboratories Indiana University Alex Friedman Debbie K. Gracio Lawrence Livermore and Lawrence Berkeley Pacific Northwest National Laboratory National Laboratories Frank Reno Graziani Teresa Fryberger Lawrence Livermore National Laboratory DOE Office of Science Joseph F. Grcar Chris Fryer Lawrence Berkeley National Laboratory Los Alamos National Laboratory William D. Gropp Sam Fulcomer Argonne National Laboratory Brown University Francois Gygi Giulia Galli Lawrence Livermore National Laboratory Lawrence Livermore National Laboratory James Hack Dennis Gannon National Center for Atmospheric Research Indiana University Steve Hammond Angel E. Garcia National Renewable Energy Laboratory Los Alamos National Laboratory Chuck Hansen Al Geist University of Utah Oak Ridge National Laboratory Bruce Harmon Robert Steven Germain Ames Laboratory IBM T. J. Watson Research Center Robert J. Harrison Steve Ghan Oak Ridge National Laboratory Pacific Northwest National Laboratory Teresa Head-Gordon Roger G. Ghanem University of California–Berkeley/Lawrence Johns Hopkins University Berkeley National Laboratory Omar Ghattas Martin Head-Gordon Carnegie Mellon University Lawrence Berkeley National Laboratory Ahmed F. Ghoniem Helen He Massachusetts Institute of Technology Lawrence Berkeley National Laboratory John R. Gilbert Tom Henderson University of California–Santa Barbara National Center for Atmospheric Research Roscoe C. Giles Van Emden Henson Boston University Lawrence Livermore National Laboratory James Glimm Richard L. Hilderbrandt Brookhaven National Laboratory National Science Foundation


Rich Hirsh Alan F. Karr National Science Foundation National Institute of Statistical Services Daniel Hitchcock Tom Katsouleas DOE Office of Science University of Southern California Forrest Hoffman Brian Kauffman Oak Ridge National Laboratory National Center for Atmospheric Research Adolfy Hoisie Sallie Keller-McNulty Los Alamos National Laboratory Los Alamos National Laboratory Jeff Hollingsworth Carl Timothy Kelley University of Maryland North Carolina State University Patricia D. Hough Stephen Kent Sandia National Laboratories Fermi National Accelerator Laboratory John C. Houghton Darren James Kerbyson DOE Office of Science Los Alamos National Laboratory Paul D. Hovland Jon Kettenring Argonne National Laboratory Telcordia Technologies Ivan Hubeny David E. Keyes NASA/Goddard Space Flight Center Columbia University Thomas J. R. Hughes Alexei M. Khokhlov University of Texas–Austin Naval Research Laboratory Rob Jacob Jeff Kiehl Argonne National Laboratory National Center for Atmospheric Research Curtis L. Janssen William H. Kirchhoff Sandia National Laboratories DOE Office of Science Stephen C. Jardin Stephen J. Klippenstein Princeton Plasma Physics Laboratory Sandia National Laboratories Philip Michael Jardine Dana Knoll Oak Ridge National Laboratory Los Alamos National Laboratory Ken Jarman Michael L. Knotek Pacific Northwest National Laboratory DOE Office of Science Fred Johnson Kwok Ko DOE Office of Science Stanford Linear Accelerator Center William E. Johnston Karl Koehler Lawrence Berkeley National Laboratory National Science Foundation Philip Jones Dale D. Koelling Los Alamos National Laboratory DOE Office of Science Kenneth I. Joy Peter Michael Kogge University of California–Davis University of Notre Dame Andreas C. Kabel James Arthur Kohl Stanford Linear Accelerator Center Oak Ridge National Laboratory Laxmikant V. Kale Tammy Kolda University of Illinois at Urbana-Champaign Sandia National Laboratories Kab Seok Kang Marc Edward Kowalski Old Dominion University Stanford Linear Accelerator Center xiv A Science-Based Case for Large-Scale Simulation Workshop Participants

Jean-Francois Lamarque Richard Matzner National Center for Atmospheric Research University of Texas–Austin David P. Landau Andrew McIlroy University of Georgia Sandia National Laboratories Jay Larson Lois Curfman McInnes Argonne National Laboratory Argonne National Laboratory Peter D. Lax Sally A. McKee New York University Cornell University Lie-Quan Lee Gregory J. McRae Stanford Linear Accelerator Center Massachusetts Institute of Technology W. W. Lee Lia Merminga Princeton Plasma Physics Laboratory Thomas Jefferson National Accelerator Facility John Wesley Lewellen Argonne National Laboratory Bronson Messer University of Tennessee/Oak Ridge National Sven Leyffer Laboratory Argonne National Laboratory Juan M. Meza Timothy Carl Lillestolen Lawrence Berkeley National Laboratory University of Tennessee Anthony Mezzacappa Zhihong Lin Oak Ridge National Laboratory University of California–Irvine George Michaels Timur Linde Pacific Northwest National Laboratory University of Chicago John Michalakes Philip LoCascio National Center for Atmospheric Research Oak Ridge National Laboratory Don Middleton Steven G. Louie National Center for Atmospheric Research University of California–Berkeley/Lawrence Berkeley National Laboratory William H. Miner, Jr. DOE Office of Science Steve Louis Lawrence Livermore National Laboratory Art Mirin Lawrence Livermore National Laboratory Ewing L. Lusk Argonne National laboratory Lubos Mitas North Carolina State University Mordecai-Mark Mac Low American Museum of Natural History Shirley V. Moore University of Tennessee Arthur B. Maccabe University of New Mexico Jorge J. Moré Argonne National Laboratory David Malon Argonne National Laboratory Warren B. Mori University of California–Los Angeles Bob Malone Los Alamos National Laboratory Jose L. Munoz National Nuclear Security Administration Madhav V. Marathe Los Alamos National Laboratory Jim Myers Pacific Northwest National Laboratory William R. Martin University of Michigan Pavan K. Naicker Vanderbilt University


Habib Nasri Najm Karsten Pruess Sandia National Laboratories Lawrence Berkeley National Laboratory John W. Negele Hong Qin Massachusetts Institute of Technology Princeton University Bill Nevins Larry A. Rahn Lawrence Livermore National Laboratory Sandia National Laboratories Esmond G. Ng Craig E. Rasmussen Lawrence Berkeley National Laboratory Los Alamos National Laboratory Michael L. Norman Katherine M. Riley University of California–San Diego University of Chicago Peter Nugent Martei Ripeanu Lawrence Berkeley National Laboratory University of Chicago Joseph C. Oefelein Charles H. Romine Sandia National Laboratories DOE Office of Science Rod Oldehoeft John Van Rosendale Los Alamos National Laboratory DOE Office of Science Gordon L. Olson Robert Rosner Los Alamos National Laboratory University of Chicago George Ostrouchov Doron Rotem Oak Ridge National Laboratory Lawrence Berkeley National Laboratory Jack Parker Doug Rotman Oak Ridge National Laboratory Lawrence Livermore National Laboratory Scott Edward Parker Robert Douglas Ryne University of Colorado Lawrence Berkeley National Laboratory Bahram Parvin P. Sadayappan Lawrence Berkeley National Laboratory Ohio State University Valerio Pascucci Roman Samulyak Lawrence Livermore National Laboratory Brookhaven National Laboratory Peter Paul Todd Satogata Brookhaven National Laboratory Brookhaven National Laboratory William Pennell Dolores A. Shaffer Pacific Northwest National Laboratory Science and Technology Associates, Inc. Arnie Peskin George C. Schatz Brookhaven National Laboratory Northwestern University Anders Petersson David Paul Schissel Lawrence Livermore National Laboratory General Atomics Ali Pinar Thomas Christoph Schulthess Lawrence Berkeley National Laboratory Oak Ridge National Laboratory Tomasz Plewa Mary Anne Scott University of Chicago DOE Office of Science Douglas E. Post Stephen L. Scott Los Alamos National Laboratory Oak Ridge National Laboratory Alex Pothen David B. Serafini Old Dominion University Lawrence Berkeley National Laboratory xvi A Science-Based Case for Large-Scale Simulation Workshop Participants

James Sethian Michael Robert Strayer University of California–Berkeley Oak Ridge National Laboratory John Nicholas Shadid Scott Studham Sandia National Laboratories Pacific Northwest National Laboratory Mikhail Shashkov Robert Sugar Los Alamos National Laboratory University of California–Santa Barbara John Shalf Shuyu Sun Lawrence Berkeley National Laboratory University of Texas–Austin Henry Shaw F. Douglas Swesty DOE Office of Science State University of New York–Stony Brook Mark S. Shephard John Taylor Rensselaer Polytechnic Institute Argonne National Laboratory Jason F. Shepherd Ani R. Thakar Sandia National Laboratories Johns Hopkins University Timothy Shippert Andrew F. B. Tompson Pacific Northwest National Laboratory Lawrence Livermore National Laboratory Andrei P. Shishlo Charles H. Tong Oak Ridge National Laboratory Lawrence Livermore National Laboratory Arie Shoshani Josep Torrellas Lawrence Berkeley National Laboratory University of Illinois Andrew R. Siegel Harold Eugene Trease Argonne National Laboratory/University of Pacific Northwest National Laboratory Chicago Albert J. Valocchi Sonya Teresa Smith University of Illinois Howard University Debra van Opstal Mitchell D. Smooke Council on Competitiveness Yale University Leticia Velazquez Allan Edward Snavely University of Texas–El Paso San Diego Supercomputer Center Roelof Jan Versteeg Matthew Sottile Idaho National Energy and Engineering Los Alamos National Laboratory Laboratory Carl R. Sovinec Jeffrey Scott Vetter University of Wisconsin Lawrence Livermore National Laboratory Panagiotis Spentzouris Angela Violi Fermi National Accelerator Facility University of Utah Daniel Shields Spicer Robert Voigt NASA/Goddard Space Flight Center College of William and Mary Bill Spotz Gregory A. Voth National Center for Atmospheric Research University of Utah David J. Srolovitz Albert F. Wagner Princeton University Argonne National Laboratory T. P. Straatsma Homer F. Walker Pacific Northwest National Laboratory Worcester Polytechnic Institute


Lin-Wang Wang John Wu Lawrence Berkeley National Laboratory Lawrence Berkeley National Laboratory Warren Washington Ying Xu National Center for Atmospheric Research Oak Ridge National Laboratory Chip Watson Zhiliang Xu Thomas Jefferson National Accelerator State University of New York–Stony Brook Facility Steve Yabusaki Michael Wehner Pacific Northwest National Laboratory Lawrence Livermore National Laboratory Woo-Sun Yang Mary F. Wheeler Lawrence Berkeley National Laboratory University of Texas–Austin Katherine Yelick Roy Whitney University of California–Berkeley Thomas Jefferson National Accelerator Thomas Zacharia Facility Oak Ridge National Laboratory Theresa L. Windus Bernard Zak Pacific Northwest National Laboratory Sandia National Laboratories Steve Wojtkiewicz Shujia Zhou Zhou Sandia National Laboratories Northrop Grumman/TASC Pat Worley John P. Ziebarth Oak Ridge National Laboratory Los Alamos National Laboratory Margaret H. Wright New York University


1. INTRODUCTION

National investment in large-scale computing can be justified on numerous grounds. These include national security; economic competitiveness; technological leadership; formulation of strategic policy in defense, energy, environmental, health care, and transportation systems; impact on education and workforce development; impact on culture in its increasingly digital forms; and, not least, supercomputing’s ability to capture the imagination of the nation’s citizens. The public expects all to see the fruits of these historically established benefits of large-scale computing, and it expects that the United States will have the earliest access to future, as yet unimagined benefits.

This report touches upon many of these motivations for investment in large-scale computing. However, it is directly concerned with a deeper one, which in many ways controls the rest. The question this report seeks to answer is: What case can be made for a national investment in large-scale computational modeling and simulation from the perspective of basic science? This can be broken down into several smaller questions: What scientific results are likely, given a significantly more powerful simulation capability, say, a hundred to a thousand times present capability? What principal hurdles can be identified along the path to realizing these new results? How can these hurdles be surmounted?

A WORKSHOP TO FIND FRESH ANSWERS

These questions and others were posed to a large gathering of the nation’s leading computational scientists at a workshop convened on June 24–25, 2003, in the shadow of the Pentagon. They were answered in 27 topical breakout groups, whose responses constitute the bulk of the second volume of this two-volume report.

The computational scientists participating in the workshop included those who regard themselves primarily as natural scientists—e.g., physicists, chemists, biologists—but also many others. By design, about one-third of the participants were computational mathematicians. Another third were computer scientists who are oriented toward research on the infrastructure on which computational modeling and simulation depends.

A CASCADIC STRUCTURE

Computer scientists and mathematicians are scientists whose research in the area of large-scale computing has direction and value of its own. However, the structure of the workshop, and of this report, emphasizes a cascadic flow from natural scientists to mathematicians and from both to computer scientists (Figure 1). We are primarily concerned herein with the opportunities for large-scale simulation to achieve new understanding in the natural sciences. In this context, the role of the mathematical sciences is to provide a means of passing from a physical model to a discrete computational representation of that model and of efficiently manipulating that representation to obtain results whose validity is well understood. In turn, the computer sciences allow the algorithms that perform these manipulations to be executed efficiently on leading-edge computer systems. They also provide ways for the often-monumental human efforts required in this process to be effectively abstracted, leveraged across other applications, and propagated over a complex and ever-evolving set of hardware platforms.
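A minimal sketch of this passage from physical model to discrete representation, using the one-dimensional heat equation as the physical model (the equation, grid size, and coefficients below are illustrative choices only, not drawn from the report):

    import numpy as np

    # Physical model: u_t = alpha * u_xx on (0, 1), with u = 0 at both ends.
    # Discrete representation: n interior grid values, advanced by an explicit
    # finite-difference update (centered in space, forward Euler in time).
    n = 200                                  # illustrative grid resolution
    alpha, length, t_final = 1.0, 1.0, 0.05  # illustrative physical parameters
    dx = length / (n + 1)
    dt = 0.4 * dx**2 / alpha                 # time step inside the explicit stability limit
    x = np.linspace(dx, length - dx, n)
    u = np.sin(np.pi * x)                    # illustrative initial temperature profile

    for _ in range(int(t_final / dt)):
        # centered second difference; the zero boundary values are implicit
        u_left = np.concatenate(([0.0], u[:-1]))
        u_right = np.concatenate((u[1:], [0.0]))
        u = u + alpha * dt / dx**2 * (u_left - 2.0 * u + u_right)

    print(f"peak temperature at t = {t_final}: {u.max():.4f}")

Refining the grid improves fidelity, but it also multiplies the storage and—because the stable time step shrinks with the square of the grid spacing—the number of update sweeps, a cost that returns as the “curse of dimensionality” later in this chapter.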


Figure 1. The layered, cascadic structure of computational science.

Most computational scientists understand that this layered set of dependencies—of natural science on mathematics and of both on computer science—is an oversimplification of the full picture of information flows that drive computational science. Indeed, one of the most exciting of many rapidly evolving trends in computational science is a “phase transition” (a concept to which we return in Chapter 2) from a field of solo scientific investigators, who fulfill their needs by finding computer software “thrown over the wall” by computational tool builders, to large, coordinated groups of natural scientists, mathematicians, and computer scientists who carry on a constant conversation about what can be done now, what is possible in the future, and how to achieve these goals most effectively.

Almost always, there are numerous ways to translate a physical model into mathematical algorithms and to implement a computational program on a given computer. Uninformed decisions made early in the process can cut off highly productive options that may arise later on. Only a bi-directional dialog, up and down the cascade, can systematically ensure that the best resources and technologies are employed to solve the most challenging scientific problems.

Its acknowledged limitations aside, the cascadic model of contemporary computational science is useful for understanding the importance of several practices in the contemporary computational research community. These include striving towards the abstraction of technologies and towards identification of universal interfaces between well-defined functional layers (enabling reuse across many applications). It also motivates sustained investment in the cross-cutting technologies of computational mathematics and computer science, so that increases in power and capability may be continually delivered to a broad portfolio of scientific applications on the other side of a standardized software interface.

SIMULATION AS A PEER METHODOLOGY TO EXPERIMENT AND THEORY

Historians of science may pick different dates for the development of the twin pillars of theory and experiment in the “scientific method,” but both are ancient, and their interrelationship is well established. Each takes a turn leading the other as they mutually refine our understanding of the natural world—experiment confronting theory with new puzzles to explain and theory showing experiment where to look next to test its explanations. Unfortunately, theory and experiment both possess recognized limitations at the frontier of contemporary science.

The strains on theory were apparent to John von Neumann (1903–1957) and drove him to develop the computational sciences of fluid dynamics, radiation transport, weather prediction, and other fields. Models of these phenomena, when expressed as mathematical equations, are inevitably large-scale and nonlinear, while the bulk of the edifice of mathematical theory (for algebraic, differential, and integral equations) is linear. Computation was to von Neumann, and remains today, the only truly systematic means of making progress in these and many other scientific arenas. Breakthroughs in the theory of nonlinear systems come occasionally, but computational gains come steadily with increased computational power and resolution.

The strains on experimentation, the gold standard of scientific truth, have grown along with expectations for it (Figure 2). Unfortunately, many systems and many questions are nearly inaccessible to experiment. (We subsume under “experiment” both experiments designed and conducted by scientists, and observations of natural phenomena out of the direct control of scientists, such as celestial events.) The experiments that scientists need to perform to answer the most pressing questions are sometimes deemed unethical (e.g., because of their impact on living beings), hazardous (e.g., because of their impact on the life-sustaining environment we have on earth), politically untenable (e.g., prohibited by treaties to which the United States is a party), difficult (e.g., requiring measurements that are too rapid or numerous to be instrumentable or time periods too long to complete), or simply expensive.

The cost of experimentation is particularly important to consider when weighing simulation as an alternative, for it cannot be denied that simulation can also be expensive. Cost has not categorically denied experimentalists their favored tools, such as accelerators, orbital telescopes, high-power lasers, and the like, at prices sometimes in the billions of dollars per facility, although such facilities are carefully vetted and planned.

Figure 2. Practical strains on experimentation that invite simulation as a peer modality of scientific investigation.

Cost must also not categorically deny computational scientists their foremost scientific instrument—the supercomputer and its full complement of software, networking, and peripheral devices. It will be argued in the body of this report that simulation is, in fact, a highly cost-effective lever for the experimental process, allowing the same, or better, science to be accomplished with fewer, better-conceived experiments.

Simulation has aspects in common with both theory and experiment. It is fundamentally theoretical, in that it starts with a theoretical model—typically a set of mathematical equations. A powerful simulation capability breathes new life into theory by creating a demand for improvements in mathematical models. Simulation is also fundamentally experimental, in that upon constructing and implementing a model, one observes the transformation of inputs (or controls) to outputs (or observables).

Simulation effectively bridges theory and experiment by allowing the execution of “theoretical experiments” on systems, including those that could never exist in the physical world, such as a fluid without viscosity. Computation also bridges theory and experiment by virtue of the computer’s serving as a universal and versatile data host. Once experimental data have been digitized, they can be compared side-by-side with simulated results in visualization systems built for the latter, reliably transmitted, and retrievably archived. Moreover, simulation and experiment can complement each other by allowing a complete picture of a system that neither can provide as well alone. Some data may not be measurable with the best experimental techniques available, and some mathematical models may be too sensitive to unknown parameters to invoke with confidence. Simulation can be used to “fill in” the missing experimental fields, using experimentally measured fields as input data. Data assimilation can also be systematically employed throughout a simulation to keep it from “drifting” from measurements, thus overcoming the effect of modeling uncertainty.

Further comparing simulation to experiment, one observes a sociological or political advantage to increased investment in the tools of simulation: they are widely usable across the entire scientific community. A micro-array analyzer is of limited use to a plasma physicist and a tokamak of limited use to a biologist. However, a large computer or an optimization algorithm in the form of portable software may be of use to both.

HURDLES TO SIMULATION

While the promise of simulation is profound, so are its limitations. Those limitations often come down to a question of resolution. Though of vast size, computers are triply finite: they represent individual quantities only to a finite precision, they keep track of only a finite number of such quantities, and they operate at a finite rate.

Although all matter is, in fact, composed of a finite number of particles (atoms), the number of particles in a macroscopic sample of matter, on the order of a trillion trillion particles, places simulations at macroscopic scales from “first principles” (i.e., from the quantum theory of electronic structure) well beyond any conceivable computational capability. Similar problems arise when time scales are considered. For example, the range of time scales in protein folding is 12 orders of magnitude, since a process that takes milliseconds occurs in molecular dance steps that last femtoseconds (a trillion times shorter)—again, far too wide a range to routinely simulate using first principles.
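The 12-orders-of-magnitude figure is simple arithmetic relating the two time scales:

    \frac{t_{\text{folding}}}{t_{\text{molecular step}}} \approx \frac{10^{-3}\ \text{s}}{10^{-15}\ \text{s}} = 10^{12}

so a first-principles trajectory would have to take on the order of a trillion steps to span a single folding event.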

4 A Science-Based Case for Large-Scale Simulation 1. Introduction range to routinely simulate using first an increase in computer performance by a principles. factor of 100 provides an increase in resolution in each spatial and temporal Even simulation of systems adequately dimension by a factor of only about 3. described by equations of the macroscopic continuum—fluids such as air or water— One way to bring more computing power can be daunting for the most powerful to bear on a problem is to divide the work computers available today. Such computers to be done over many processors. are capable of execution rates in the tens of Parallelism in computer architecture—the teraflop/s (one teraflop/s is one trillion concurrent use of many processors—can arithmetic operations per second) and can provide additional factors of hundreds or cost tens to hundreds of millions of dollars more, beyond Moore’s Law alone, to the to purchase and millions of dollars per year aggregate performance available for the to operate. However, to simulate fluid- solution of a given problem. However, the mechanical turbulence in the boundary and desire for increased resolution and, hence, wake regions of a typical vehicle using accuracy is seemingly insatiable. This “first principles” of continuum modeling simply expressed and simply understood would tie up such a computer for months, “curse of dimensionality” means that which makes this level of simulation too “business as usual” in scaling up today’s expensive for routine use. simulations by riding the computer performance curve will never be cost- One way to describe matter at the effective—and is too slow! macroscopic scale is to divide space up into small cubes and to ascribe values for The “curse of knowledge explosion”— density, momentum, temperature, and so namely, that no one computational forth, to each cube. As the cubes become can hope to track advances in all of the smaller and smaller, a more and more facets of the mathematical theories, accurate description is obtained, at a cost of computational models and algorithms, increasing memory to store the values and applications and computing systems time to visit and update each value in software, and computing hardware accordance with natural laws, such as the (computers, data stores, and networks) that conservation of energy applied to the cube, may be needed in a successful simulation given fluxes in and out of neighboring effort—is another substantial hurdle to the cubes. If the number of cubes along each progress of simulation. Both the curse of side is doubled, the cost of the simulation, dimensionality and the curse of knowledge which is proportional to the total number of explosion can be addressed. In fact, they cubes, increases by a factor of 23 or 8. This is have begun to be addressed in a the “curse of dimensionality”: increasing remarkably successful initiative called the resolution of a simulation quickly eats Scientific Discovery through Advanced up any increases in processor power Computing (SciDAC), which was launched resulting from the well-known Moore’s by the U.S. Department of Energy’s (DOE’s) Law (a doubling of computer speed every Office of Science in 2001. SciDAC was just a 18–24 months). Furthermore, for time- start; many difficult challenges remain. dependent problems in three dimensions, Nonetheless, SciDAC (discussed in more the cost of a simulation grows like the detail in Chapter 2) established a model for fourth power of the resolution. 
Therefore, tackling these “curses” that promises to
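The scaling arguments above can be stated in a few lines. A minimal sketch (illustrative only) of the two rules—doubling three-dimensional resolution multiplies the cell count by 2³, and total work for a time-dependent problem grows roughly as the fourth power of the resolution, so usable resolution grows only as the fourth root of the machine speedup:

    def cells_after_refinement(refinement_factor, dimensions=3):
        """Growth in cell count when resolution rises in every spatial dimension."""
        return refinement_factor ** dimensions

    def resolution_gain(speedup, work_exponent=4):
        """Resolution gain per dimension if total work scales as resolution**work_exponent
        (three space dimensions plus one time dimension for an explicit scheme)."""
        return speedup ** (1.0 / work_exponent)

    print(cells_after_refinement(2))        # 8: doubling the resolution of a 3-D grid
    print(round(resolution_gain(100), 2))   # ~3.16: a 100x faster machine
    print(round(resolution_gain(1000), 2))  # ~5.62: even 1000x buys under 6x resolution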

In addition to the technical challenges in scientific computing, too few computational cycles are currently available to fully capitalize upon the progress of the SciDAC initiative. Moreover, the computational science talent emerging from the national educational pipeline is just a trickle in comparison to what is needed to replicate the successes of SciDAC throughout the many applications of science and engineering that lag behind. Cycle-starvation and human resource–starvation are serious problems that are felt across all of the computational science programs surveyed in creating this report. However, they are easier hurdles to overcome than the above-mentioned “curses.”

SUMMARY OF RECOMMENDATIONS

We believe that the most effective program to deliver new science by means of large-scale simulation will be one built over a broad base of disciplines in the natural sciences and the associated enabling technologies, since there are numerous synergisms between them, which are readily apparent in the companion volume. We also believe that a program to realize the promises outlined in the companion volume requires a balance of investments in hardware, software, and human resources. While quantifying and extrapolating the thirst for computing cycles and network bandwidth is relatively straightforward, the scheduling of breakthroughs in mathematical theories and computational models and algorithms has always been difficult. Breakthroughs are more likely, however, where there is a critical mass of all types of computational scientists and good communication between them.

The following recommendations are elaborated upon in Chapter 6 of this volume:

• Major new investments in computational science are needed in all of the mission areas of the Office of Science in DOE, as well as those of many other agencies, so that the United States may be the first, or among the first, to capture the new opportunities presented by the continuing advances in computing power. Such investments will extend the important scientific opportunities that have been attained by a fusion of sustained advances in scientific models, mathematical algorithms, computer architecture, and scientific software engineering.

• Multidisciplinary teams, with carefully selected leadership, should be assembled to provide the broad range of expertise needed to address the intellectual challenges associated with translating advances in science, mathematics, and computer science into simulations that can take full advantage of advanced computers.

• Extensive investment in new computational facilities is strongly recommended, since simulation now cost-effectively complements experimentation in the pursuit of the answers to numerous scientific questions. New facilities should strike a balance between capability computing for those “heroic simulations” that cannot be performed any other way and capacity computing for “production” simulations that contribute to the steady stream of progress.


• Investment in hardware facilities should be accompanied by sustained collateral investment in software infrastructure for them. The efficient use of expensive computational facilities and the data they produce depends directly upon multiple layers of system software and scientific software, which, together with the hardware, are the “engines of scientific discovery” across a broad portfolio of scientific applications.

• Additional investments in hardware facilities and software infrastructure should be accompanied by sustained collateral investments in algorithm research and theoretical development. Improvements in basic theory and algorithms have contributed as much to increases in computational simulation capability as improvements in hardware and software over the first six decades of scientific computing.

• Computational scientists of all types should be proactively recruited with improved reward structures and opportunities as early as possible in the educational process so that the number of trained computational science professionals is sufficient to meet present and future demands.

• Sustained investments must be made in network infrastructure for access and resource sharing as well as in the software needed to support collaborations among distributed teams of scientists, in recognition of the fact that the best possible computational science teams will be widely separated geographically and that researchers will generally not be co-located with facilities and data.

• Federal investment in innovative, high-risk computer architectures that are well suited to scientific and engineering simulations is both appropriate and needed to complement commercial research and development. The commercial computing marketplace is no longer effectively driven by the needs of computational science.

SCOPE

The purpose of this report is to capture the projections of the nation’s leading computational scientists and providers of enabling technologies concerning what new science lies around the corner with the next increase in delivered computational power by a factor of 100 to 1000. It also identifies the issues that must be addressed if these increases in computing power are to lead to remarkable new scientific discoveries.

It is beyond the charter and scope of this report to form policy recommendations on how this increased capability and capacity should be procured, or how to prioritize among numerous apparently equally ripe scientific opportunities.



2. SCIENTIFIC DISCOVERY THROUGH ADVANCED COMPUTING: A SUCCESSFUL PILOT PROGRAM

THE BACKGROUND OF SciDAC

For more than half a century, visionaries anticipated the emergence of computational modeling and simulation as a peer to theory and experiment in pushing forward the frontiers of science, and DOE and its antecedent agencies over this period have invested accordingly. Computational modeling and simulation made great strides between the late 1970s and the early 1990s as Seymour Cray and the company that he founded produced ever-faster versions of Cray supercomputers. Between 1976, when the Cray 1 was delivered to DOE’s Los Alamos National Laboratory, and 1994, when the last traditional Cray supercomputer, the Cray T90, was introduced, the speed of these computers increased from 160 megaflop/s to 32 gigaflop/s, a factor of 200! However, by the early 1990s it became clear that building ever-faster versions of Cray supercomputers using traditional electronics technologies was not sustainable.

In the 1980s DOE and other federal agencies began investing in alternative computer architectures for scientific computing featuring multiple processors connected to multiple memory units through a wide variety of mechanisms. A wild period of experimentation ensued during which many different physical designs were explored, as well as many programming models for controlling the flow of data in them, from memory to processor and from one memory unit to another. The mid-1990s saw a substantial convergence, which found commercial microprocessors, each with direct local access to only a relatively small fraction of the total memory, connected in clusters numbering in the thousands through dedicated networks or fast switches. The speeds of the commercial microprocessors in these designs increased according to Moore’s Law, producing distributed-memory parallel computers that rivaled the power of traditional Cray supercomputers. These new machines required a major recasting of the algorithms that scientists had developed and polished for more than two decades on Cray supercomputers, but those who invested the effort found that the new massively parallel computers provided impressive levels of computing capability.

By the late 1990s it was evident to many that the continuing, dramatic advances in computing technologies had the potential to revolutionize scientific computing. DOE’s National Nuclear Security Administration (DOE-NNSA) launched the Accelerated Strategic Computing Initiative (ASCI) to help ensure the safety and reliability of the nation’s nuclear stockpile in the absence of testing. In 1998, DOE’s Office of Science (DOE-SC) and the National Science Foundation (NSF) sponsored a workshop at the National Academy of Sciences to identify the opportunities and challenges of realizing major advances in computational modeling and simulation. The report from this workshop identified scientific advances across a broad range of science, ranging in scale from cosmology through nanoscience, which would be made possible by advances in computational modeling and simulation. However, the report also noted challenges to be met if these advances were to be realized.
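The “major recasting of the algorithms” for distributed-memory machines described above amounted to restructuring computations around explicit message passing, with each processor owning only its local fraction of the data. A minimal sketch of that style, assuming the mpi4py package (an illustration only; run with, e.g., mpiexec -n 4 python partial_sum.py):

    from mpi4py import MPI
    import numpy as np

    comm = MPI.COMM_WORLD
    rank, size = comm.Get_rank(), comm.Get_size()

    # Each processor owns only its slice of a conceptually global array.
    n_global = 1_000_000
    counts = [n_global // size + (1 if r < n_global % size else 0) for r in range(size)]
    start = sum(counts[:rank])
    local = np.arange(start, start + counts[rank], dtype=np.float64)

    # Local work proceeds independently; a collective combines the partial results.
    local_sum = local.sum()
    global_sum = comm.allreduce(local_sum, op=MPI.SUM)

    if rank == 0:
        print(f"global sum computed by {size} processors: {global_sum:.0f}")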


By the time the workshop report was published, the Office of Science was well along the path to developing an Office-wide program designed to address the challenges identified in the DOE-SC/NSF workshop. The response to the National Academy report, as well as to the 1999 report from the President’s Information Technology Advisory Committee—the Scientific Discovery through Advanced Computing (SciDAC) program—was described in a 22-page March 2000 report by the Office of Science to the House and Senate Appropriations Committees. Congress funded the program in the following fiscal year, and by the end of FY 2001, DOE-SC had established 51 scientific projects, following a national peer-reviewed competition.

SciDAC explicitly recognizes that the advances in computing technologies that provide the foundation for the scientific opportunities identified in the DOE-SC/NSF report (as well as in Volume 2 of the current report) are not, in themselves, sufficient for the “coming of age” of computational simulation. Close integration of three disciplines—science, computer science, and applied mathematics—and a new discipline at their intersection called computational science are needed to realize the full benefits of the fast-moving advances in computing technologies. A change in the sociology of this area of science was necessary for the fusion of these streams of disciplinary developments. While not uniquely American in its origins, the “new kind of computational science” promoted in the SciDAC report found fertile soil in institutional arrangements for the conduct of scientific research that are highly developed in the United States, and seemingly much less so elsewhere: multiprogram, multidisciplinary national laboratories, and strong laboratory-university partnerships in research and graduate education. SciDAC is built on these institutional foundations as well as on the integration of computational modeling, computer algorithms, and computing systems software.

In this chapter, we briefly examine the short but substantial legacy of SciDAC. Our purpose is to understand the implications of its successes, which will help us identify the logical next steps in advancing computational modeling and simulation.

SciDAC’S INGREDIENTS FOR SUCCESS

Aimed at Discovery

SciDAC boldly affirms the importance of computational simulation for new scientific discovery, not just for “rationalizing” the results of experiments. The latter role for simulation is a time-honored one from the days when it was largely subordinate to experimentation, rather than peer to it. That relationship has been steadily evolving. In a frontispiece quotation from J. S. Langer, the chair of the July 1998 multiagency National Workshop on Advanced Scientific Computing, the SciDAC report declared: “The computer literally is providing a new window through which we can observe the natural world in exquisite detail.”

Through advances in computational modeling and simulation, it will be possible to extend dramatically the exploration of the fundamental processes of nature—e.g., the interactions between elementary particles that give rise to the matter around us, the interactions between proteins and other molecules that sustain life—as well as advance our ability to predict the behavior of a broad range of complex natural and engineered systems—e.g., nanoscale devices, microbial cells, fusion energy reactors, and the earth’s climate.


Multidisciplinary

The structure of SciDAC boldly reflects the fact that the development and application of leading-edge simulation capabilities has become a multidisciplinary activity. This implies, for instance, that physicists are able to focus on the physics and not on developing parallel libraries for core mathematical operations. It also implies that mathematicians and computer scientists have direct access to substantive problems of high impact, and not only models loosely related to application-specific difficulties. The deliverables and accountabilities of the collaborating parties are cross-linked. The message of SciDAC is a “declaration of interdependence.”

Specifically, the SciDAC report called for

• creation of a new generation of scientific simulation codes that take full advantage of the extraordinary computing capabilities of terascale computers;
• creation of the mathematical and systems software to enable the scientific simulation codes to effectively and efficiently use terascale computers; and
• creation of collaboratory software infrastructure to enable geographically separated scientists to work together effectively as a team and to facilitate remote access to both facilities and data.

The $57 million per year made available under SciDAC is divided roughly evenly into these three categories. SciDAC also called for upgrades to the scientific computing hardware infrastructure of the Office of Science and outlined a prescription for building a robust, agile, and cost-effective computing infrastructure. These recommendations have thus far been modestly supported outside of the SciDAC budget.

Figure 3. Relationship between the science application teams, the Integrated Software Infrastructure Centers (ISICs), and networking and computing facilities in the SciDAC program.

implemented, particularizing Figure 1 of this report to the SciDAC program. All five program offices in the Office of Science are participating in SciDAC. The disciplinary computational science groups are being funded by the Offices of Basic Energy Sciences, Biological and Environmental Research, Fusion Energy Sciences, and High Energy and Nuclear Physics, and the integrated software infrastructure centers and networking infrastructure centers are being funded by the Office of Advanced Scientific Computing Research.

Multi-Institutional

Recognizing that the expertise needed to make major advances in computational modeling and algorithms and computing systems software was distributed among various institutions across the United States, SciDAC established scientifically natural collaborations without regard for geographical or institutional boundaries—laboratory-university collaborations and laboratory-laboratory collaborations. Laboratories and universities have complementary strengths in research. Laboratories are agile in reorganizing their research assets to respond to missions and new opportunities. Universities respond on a much slower time scale to structural change, but they are hotbeds of innovation where graduate students explore novel ideas as an integral part of their education. In the area of simulation, universities are often sources of new models, algorithms, and software and hardware concepts that can be evaluated in a dissertation-scale effort. However, such developments may not survive the graduation of the doctoral candidate and are rarely rendered into reusable, portable, extensible infrastructure in the confines of the academic environment, where the reward structure emphasizes the original discovery and proof of concept. Laboratory teams excel in creating, refining, "hardening," and deploying computational scientific infrastructure. Thirteen laboratories and 50 universities were funded in the initial round of SciDAC projects.

Setting New Standards for Software

For many centuries, publication of original results has been the gold standard of accomplishment for scientific researchers. During the first 50 years of scientific computing, most research codes targeted a few applications on a small range of hardware, and were designed to be used by experts only (often just the author). Code development projects emphasized finding the shortest path to the next publishable result. Packaging code for reuse was not a job for serious scientists. As the role of computational modeling and simulation in advancing scientific discovery grows, this culture is changing. Scientists now realize the value of high-quality, reusable, extensible, portable software, and the producers of widely used packages are highly respected.

Unfortunately, funding structures in the federal agencies have evolved more slowly, and it has not always been easy for scientists to find support for the development of software for the research community. By dedicating more than half its resources to the development of high-value scientific applications software and the establishment of integrated software infrastructure centers, SciDAC has supported efforts to make advanced scientific packages easy to use, port them to new hardware, maintain them, and make them capable of interoperating with other related packages that achieve critical mass of use and offer complementary features.


SciDAC recognizes that large scientific simulation codes have many needs in common: modules that construct and adapt the computational mesh on which their models are based, modules that produce the discrete equations from the mathematical model, modules that solve the resulting equations, and modules that visualize the results and manage, mine, and analyze the large data sets. By constructing these modules as interoperable components and supporting the resulting component libraries so that they stay up to date with the latest algorithmic advances and work on the latest hardware, SciDAC amortizes costs and creates specialized points of contact for a portfolio of applications.

The integrated software infrastructure centers that develop these libraries are motivated to ensure that their users have available the full range of algorithmic options under a common interface, along with a complete description of their resource tradeoffs (e.g., memory versus time). Their users can try all reasonable options easily and adapt to different computational platforms without recoding the applications. Furthermore, when an application group intelligently drives software infrastructure research in a new and useful direction, the result is soon available to all users.

Large Scale

Because computational science is driven toward the large scale by requirements of model fidelity and resolution, SciDAC emphasizes the development of software for the most capable computers available. In the United States, these are now primarily hierarchical distributed-memory computers. Enough experience was acquired on such machines in heroic mission-driven simulation programs of the latter half of the 1990s (such as DOE's ASCI) to justify them as cost-effective simulation hardware, provided that certain generic difficulties in programming are amortized over many user groups and many generations of hardware. Still, many challenges remain to making efficient and effective use of these computers for the broad range of scientific applications important to the mission of DOE's Office of Science.

Supporting Community Codes

As examples of simulations from Office of Science programs, the SciDAC report highlighted problems of ascertaining chemical structure and the massively parallel simulation code NWCHEM, winner of a 1999 R&D 100 Award and a Federal Laboratory Consortium Excellence in Technology Transfer Award in 2000.

NWCHEM, currently installed at nearly 900 research institutions worldwide, was created by an international team of scientists from twelve institutions (two U.S. national laboratories, two European national laboratories, five U.S. universities, and three European universities) over the course of five years. The core team consisted of seven full-time theoretical and computational chemists, computer scientists, and applied mathematicians, ably assisted by ten postdoctoral fellows on temporary assignment. Overall, NWCHEM represents an investment in excess of 100 person-years. Yet, the layered structure of NWCHEM permits it to easily absorb new physics or algorithm modules, to run on environments from laptops to the fastest parallel architectures, and to be ported to a new supercomputing environment with isolated minor modifications, typically in less than a week. The scientific legacy embodied in NWCHEM goes back three

decades to the Gaussian code, for which John Pople won the 1998 Nobel Prize in chemistry.

Community codes such as Gaussian and NWCHEM can consume hundreds of person-years of development (Gaussian probably has 500–1000 person-years invested in it), run at hundreds of installations, are given large fractions of community computing resources for decades, and acquire an authority that can enable or limit what is done and accepted as science in their respective communities. Yet, historically, not all community codes have been crafted with the care of NWCHEM to achieve performance, portability, and extensibility. Except at the beginning, it is difficult to promote major algorithmic changes in such codes, since change is expensive and pulls resources from core progress. SciDAC application groups have been chartered to build new and improved community codes in areas such as accelerator physics, materials science, ocean circulation, and plasma physics. The integrated software infrastructure centers of SciDAC have a chance, due to the interdependence built into the SciDAC program structure, to simultaneously influence all of these new community codes by delivering software incorporating optimal algorithms that may be reused across many applications. Improvements driven by one application will be available to all. While SciDAC application groups are developing codes for communities of users, SciDAC enabling technology groups are building codes for communities of developers.

Enabling Success

The first two years of SciDAC provide ample evidence that constructing multidisciplinary teams around challenge problems and providing centralized points of contact for mathematical and computational technologies is a sound investment strategy. For instance, Dalton Schnack of General Atomics, whose group operates and maintains the simulation code NIMROD to support experimental design of magnetically confined fusion energy devices, states that software recently developed by Xiaoye (Sherry) Li of Lawrence Berkeley National Laboratory has enhanced his group's ability to perform fusion simulations to a degree equivalent to three to five years of riding the wave of hardware improvements alone. Ironically, the uniprocessor version of Li's solver, originally developed while she was a graduate student at the University of California–Berkeley, predated the SciDAC initiative; but it was not known to the NIMROD team until it was recommended during discussions between SciDAC fusion and solver software teams. In the meantime, SciDAC-funded research improved the parallel performance of Li's solver in time for use in NIMROD's current context.

M3D, a fusion simulation code developed at the Princeton Plasma Physics Laboratory, is complementary to NIMROD; it uses a different mathematical formulation. Like NIMROD and many simulation codes, M3D spends a large fraction of its time solving sparse linear algebraic systems. In a collaboration that predated SciDAC, the M3D code group was using PETSc, a portable toolkit of sparse solvers developed at Argonne National Laboratory and in wide use around the world. Under SciDAC, the Terascale Optimal PDE Solvers integrated software center provided another world-renowned class of solvers, Hypre, from Lawrence Livermore National Laboratory "underneath" the same code interface that M3D was already using to call PETSc's solvers.
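The mechanics of delivering a new solver "underneath" an existing interface can be sketched in a few lines against PETSc's KSP API. The fragment below is only an illustration of the idea and is not taken from M3D; the matrix A and vectors b and x are assumed to have been assembled by the application, and exact PETSc calling conventions (for example, the argument list of KSPSetOperators) have varied slightly across releases.

/* Minimal sketch (not M3D's actual code): solve A x = b through PETSc's KSP
 * interface, with Hypre's BoomerAMG algebraic multigrid plugged in as the
 * preconditioner.  A, b, and x are assumed to be assembled by the caller. */
#include <petscksp.h>

PetscErrorCode solve_with_hypre(Mat A, Vec b, Vec x)
{
  KSP            ksp;
  PC             pc;
  PetscErrorCode ierr;

  ierr = KSPCreate(PETSC_COMM_WORLD, &ksp); CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp, A, A); CHKERRQ(ierr);     /* same matrix preconditions itself */
  ierr = KSPSetType(ksp, KSPGMRES); CHKERRQ(ierr);      /* Krylov method; any registered type works */
  ierr = KSPGetPC(ksp, &pc); CHKERRQ(ierr);
  ierr = PCSetType(pc, PCHYPRE); CHKERRQ(ierr);         /* hand preconditioning to Hypre ...        */
  ierr = PCHYPRESetType(pc, "boomeramg"); CHKERRQ(ierr);/* ... specifically BoomerAMG               */
  ierr = KSPSetFromOptions(ksp); CHKERRQ(ierr);         /* run-time options may override the above  */
  ierr = KSPSolve(ksp, b, x); CHKERRQ(ierr);
  ierr = KSPDestroy(&ksp); CHKERRQ(ierr);
  return 0;
}

Because KSPSetFromOptions is honored at run time, the same executable can switch Krylov methods or preconditioners from the command line (for example, -pc_type hypre), which is one concrete sense in which users of such component libraries can try many algorithmic options without recoding their applications.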


The combined PETSc-Hypre solver library allows M3D to solve its current linear systems two to three times faster than previously, and this SciDAC collaboration has identified several avenues for improving the PETSc-Hypre code support for M3D and other users.

While these are stories of performance improvement, allowing faster turnaround and more production runs, there are also striking stories of altogether new physics enabled by SciDAC collaborations. The Princeton fusion team has combined forces with the Applied Partial Differential Equations Center, an integrated software infrastructure center, to produce a simulation code capable of resolving the fine structure of the dynamic interface between two fluids over which a shock wave passes—the so-called Richtmyer-Meshkov instability—in the presence of a magnetic field. This team has found that a strong magnetic field stabilizes the interface, a result not merely of scientific interest but also of potential importance in many devices of practical interest to DOE.

SciDAC's Terascale Supernova Initiative (TSI), led by Tony Mezzacappa of Oak Ridge National Laboratory, has achieved the first stellar core collapse simulations for supernovae with state-of-the-art neutrino transport. The team has also produced state-of-the-art models of nuclei in a stellar core. These simulations have brought together two fundamental fields at their respective frontiers, demonstrated the importance of accurate nuclear physics in supernova models, and motivated planned experiments to measure the nuclear processes occurring in stars. This group has also performed simulations of a rapid neutron capture process believed to occur in supernovae and to be responsible for the creation of half the heavy elements in the universe. This has led to a surprising and important result: this process can occur in environments very different than heretofore thought, perhaps providing a way around some of the fundamental difficulties encountered in past supernova models in explaining element synthesis. To extend their model to higher dimensions, the TSI team has engaged mathematicians and computer scientists to help them overcome the "curse of dimensionality" and exploit massively parallel machines. Mezzacappa has declared that he would never go back to doing computational physics outside of a multidisciplinary team.

Setting a New Organizational Paradigm

By acknowledging sociological barriers to multidisciplinary research up front and building accountabilities into the project awards to force the mixing of "scientific cultures" (e.g., physics with computer science), SciDAC took a risk that has paid off handsomely in technical results. As a result, new DOE-SC initiatives are taking SciDAC-like forms—for instance, the FY 2003 initiative Nanoscience Theory, Modeling and Simulation. The roadmap for DOE's Fusion Simulation Program likewise envisions multidisciplinary teams constructing complex simulations linking together what are today various freestanding simulation codes for different portions of a fusion reactor. Such complex simulations are deemed essential to the on-time construction of a working demonstration International Thermonuclear Experimental Reactor.

In many critical application domains, such teams are starting to self-assemble as a natural response to the demands of the next level of scientific opportunity. This self-assembly represents a phase transition in the computational science community, from autonomous individual investigators to

coordinated multidisciplinary teams. When experimental physicists transitioned from solo investigators to members of large teams centered at major facilities, they learned to include the cost of other professionals (statisticians, engineers, computer scientists, technicians, administrators, and others) in their projects. Computational physicists are now learning to build their projects in large teams built around major computational facilities and to count the cost of other professionals. All such transitions are sociologically difficult. In the case of computational simulation, there is sufficient infrastructure in common between different scientific endeavors that DOE and other science agencies can amortize its expense and provide organizational paradigms for it. SciDAC is such a pilot paradigm. It is time to extend it.


3. ANATOMY OF A LARGE-SCALE SIMULATION

Because the methodologies and processes of computational simulation are still not familiar to many scientists and laypersons, we devote this brief chapter to their illustration. Our goal is to acquaint the reader with a number of issues that arise in computational simulation and to highlight the benefits of collaboration across the traditional boundaries of science. The structure of any program aimed at advancing computational science must be based on this understanding.

Scientific simulation is a vertically integrated process, as indicated in Figures 1 and 3. It is also an iterative process, in which computational scientists traverse several times a hierarchy of issues in modeling, discretization, solution, code implementation, hardware execution, visualization and interpretation of results, and comparison to expectations (theories, experiments, or other simulations). They do this to develop a reliable "scientific instrument" for simulation, namely a marriage of a code with a computing and user environment. This iteration is indicated in the two loops (arrows) of Figure 4 (reproduced from the SciDAC report).

One loop over this computational process is related to validation of the model and verification that the process is correctly executing the model. With slight violence to grammar and with some oversimplification, validation poses the question "Are we solving the right equations?" and verification, the question "Are we solving the equations right?"

The other loop over the process of simulation is related to algorithm and performance tuning. A simulation that yields high-fidelity results is of little use if it is too expensive to run or if it cannot be scaled up to the resolutions, simulation times, or ensemble sizes required to describe the real-world phenomena of interest. Again with some oversimplification, algorithm tuning poses the question, "Are we getting enough science per operation?" and performance tuning, "Are we getting enough operations per cycle?"

Figure 4. The process of scientific simulation, showing the validation and verification loop (left) and algorithm and performance tuning loop (right).
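One concrete instance of the verification question, "Are we solving the equations right?", is a grid-convergence study. The sketch below is illustrative only and is tied to no particular code in this report: it estimates the observed order of accuracy of a discretization from a scalar output computed on three grids related by a fixed refinement ratio, using Richardson extrapolation. The numerical values are placeholders.

/* Sketch of an a posteriori verification check: estimate the observed order
 * of accuracy from a scalar output computed on three successively refined
 * grids.  The sample values are illustrative placeholders only. */
#include <math.h>
#include <stdio.h>

int main(void)
{
  const double r  = 2.0;      /* refinement ratio between successive grids   */
  const double fc = 0.97400;  /* output on the coarse grid (spacing h)       */
  const double fm = 0.99350;  /* output on the medium grid (spacing h/r)     */
  const double ff = 0.99838;  /* output on the fine grid   (spacing h/r^2)   */

  /* Observed order of accuracy from the ratio of successive differences. */
  double p = log(fabs(fc - fm) / fabs(fm - ff)) / log(r);

  /* Richardson-extrapolated value and an error estimate for the fine grid. */
  double f_exact  = ff + (ff - fm) / (pow(r, p) - 1.0);
  double fine_err = fabs(f_exact - ff);

  printf("observed order of accuracy : %.2f\n", p);
  printf("extrapolated value         : %.5f\n", f_exact);
  printf("estimated fine-grid error  : %.2e\n", fine_err);
  /* If p falls well below the scheme's formal order, the verification loop
   * of Figure 4 sends the team back to the discretization or its coding. */
  return 0;
}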

It is clear from Figure 4 that a scientific simulation depends upon numerous components, all of which have to work properly. For example, the theoretical model must not overlook or misrepresent any physically important effect. The model must not be parameterized with data of uncertain quality or provenance. The mathematical translation of the theoretical model (which may be posed as a differential equation, for instance) to a discrete representation that a computer can manipulate (a large system of algebraic relationships based on the differential equation, for instance) inevitably involves some approximation. This approximation must be controlled by a priori analysis or a posteriori checks. The algorithm that solves the discrete representation of the system being modeled must reliably converge. The compiler, operating system, and message-passing software that translate the arithmetic manipulations of the solution algorithm into loads and stores from memory to registers, or into messages that traverse communication links in a parallel computer, must be reliable and fault-tolerant. They must be employed correctly by programmers and simulation scientists so that the results are well defined and repeatable. In a simulation that relies on multiple interacting models (such as fluids and structures in adjacent domains, or radiation and combustion in the same domain), information fluxes at the interface of the respective models, which represent physical fluxes in the simulated phenomenon, must be consistent. In simulations that manipulate or produce enormous quantities of data on multiple disks, files must not be lost or mistakenly overwritten, nor misfiled results extracted for visualization. Each potential pitfall may require a different expert to detect it, control it at an acceptable level, or completely eliminate it as a possibility in future designs. These are some of the activities that exercise the left-hand loop in Figure 4.

During or after the validation of a simulation code, attention must be paid to the performance of the code in each hardware/system software environment available to the research team. Efforts to control discretization and solution error in a simulation may grossly increase the number of operations required to execute a simulation, far out of proportion to their benefit. Similarly, "conservative" approaches to moving data within a processor or between processors, to reduce the possibility of errors or miscommunication, may require extensive synchronization or minimal data replication and may be inefficient on a modern supercomputer. More elegant, often adaptive, strategies must be adopted for the simulation to be execution-worthy on an expensive massively parallel computer, even though these strategies may require highly complex implementation. These are some of the activities that exercise the right-hand loop in Figure 4.

ILLUSTRATION: SIMULATION OF A TURBULENT REACTING FLOW

The dynamics of turbulent reacting flows have long constituted an important topic in combustion research. Apart from the scientific richness of the combination of fluid turbulence and chemistry, turbulent flames are at the heart of the design of equipment such as reciprocating engines, turbines, furnaces, and incinerators, for efficiency and minimal environmental impact. Effective properties of turbulent flame dynamics are also required as model inputs in a broad range of larger-scale simulation challenges, including fire spread in buildings or wildfires, stellar dynamics, and chemical processing.

Figure 5 shows a photograph of a laboratory-scale premixed methane flame, stabilized by a metal rod across the outlet of a cylindrical burner. Although experimental and theoretical investigations have been able to provide substantial insights into the dynamics of these types of flames, there are still a number of outstanding questions about their behavior. Recently, a team of computational scientists supported by SciDAC, in collaboration with experimentalists at Lawrence Berkeley National Laboratory, performed the first simulations of these types of flames using detailed chemistry and transport. Figure 6 shows an instantaneous image of the flame surface from the simulation. (Many such instantaneous configurations are averaged in the photographic image of Figure 5.)

Figure 5. Photograph of an experimental turbulent premixed rod-stabilized "V" methane flame.

Figure 6. Instantaneous surface of a turbulent premixed rod-stabilized V-flame, from simulation.

The first step in simulations such as these is to specify the computational models that will be used to describe the reacting flows. The essential feature of reacting flows is the set of chemical reactions taking place in the fluid. Besides chemical products, these reactions produce both temperature and pressure changes, which couple to the dynamics of the flow. Thus, an accurate description of the reactions is critical to predicting the properties of the flame, including turbulence. Conversely, it is the fluid flow that transports the reacting chemical species into the reaction zone and transports the products of the reaction and the released energy away from the reaction zone. The location and shape of the reaction zone are determined by a delicate balance of species, energy, and momentum fluxes and are highly sensitive to how these fluxes are specified at the boundaries of the computational domain. Turbulence can wrinkle the reaction zone, giving it much

more area than it would have in its laminar state, without turbulence. Hence, incorrect prediction of turbulence intensity may under- or over-represent the extent of reaction.

From first principles, the reactions of molecules are described by the Schrödinger equation, and fluid flow by the Navier-Stokes equations. However, each of these equation sets is too difficult to solve directly for the turbulent premixed rod-stabilized "V" methane flame illustrated in Figure 5. So we must rely on approximations to these equations. These approximations define the computational models used to describe the methane flame. For the current simulations, the set of 84 reactions describing the methane flame involves 21 chemical species (methane, oxygen, water, carbon dioxide, and many trace products and reaction intermediates). The particular version of the compressible Navier-Stokes model employed here allows detailed consideration of convection, diffusion, and expansion effects that shape the reaction; but it is "filtered" to remove sound waves, which pose complications that do not measurably affect combustion in this regime. The selection of models such as these requires context-specific expert judgment. For instance, in a jet turbine simulation, sound waves might represent a non-negligible portion of the energy budget of the problem and would need to be modeled; but it might be valid (and very cost-effective) to use a less intricate chemical reaction mechanism to answer the scientific question of interest.

Development of the chemistry model and validation of its predictions, which was funded in part by DOE's Chemical Sciences Program, is an interesting scientific story that is too long to be told here. Suffice it to say that this process took many years. In some scientific fields, high-fidelity models of many important processes involved in the simulations are not available; and additional scientific research will be needed to achieve a comparable level of confidence.

The computations pictured here were performed on the IBM SP, named "Seaborg," at the National Energy Research Scientific Computing Center, using up to 2048 processors. They are among the most demanding combustion simulations ever performed. However, the ability to do them is not merely a result of improvements in computer speed. Improvements in algorithm technology funded by the DOE Applied Mathematical Sciences Program over the past decade have been instrumental in making these computations feasible. Mathematical analysis is used to reformulate the equations describing the fluid flow so that high-speed acoustic transients are removed analytically, while compressibility effects due to chemical reactions are retained. The mathematical model of the fluid flow is discretized using high-resolution finite difference methods, combined with local adaptive mesh refinement by which regions of the finite difference grid are automatically refined or coarsened to maximize overall computational efficiency. The implementation of this methodology uses an object-oriented, message-passing software framework that handles the complex data distribution and dynamic load balancing needed to effectively exploit modern parallel supercomputers. The data analysis framework used to explore the results of the simulation and create visual images that lead to understanding is based on recent developments in "scripting languages" from computer science. The combination of these algorithmic innovations reduces the computational cost by a factor of 10,000 compared with a standard uniform-grid compressible-flow approach.
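The payoff from removing acoustic transients analytically can be illustrated with a simple time-step estimate. In the sketch below, the flow speed, sound speed, grid spacing, and CFL factor are hypothetical values chosen only to show the mechanics of the comparison; they are not parameters of the flame simulation described above.

/* Back-of-envelope estimate of the time-step advantage of a sound-filtered
 * (low-Mach-number) formulation over a fully compressible one.  An explicit
 * compressible scheme must resolve signals moving at |u| + c; the filtered
 * scheme needs only to resolve advection at |u|.  All values are
 * illustrative placeholders. */
#include <stdio.h>

int main(void)
{
  const double dx  = 1.0e-4;  /* grid spacing, m (finest level)            */
  const double u   = 10.0;    /* representative flow speed, m/s            */
  const double c   = 350.0;   /* representative sound speed, m/s           */
  const double cfl = 0.5;     /* CFL safety factor                         */

  double dt_compressible = cfl * dx / (u + c); /* must track acoustic waves */
  double dt_filtered     = cfl * dx / u;       /* only advection limits dt  */

  printf("compressible time step : %.3e s\n", dt_compressible);
  printf("filtered time step     : %.3e s\n", dt_filtered);
  printf("time steps saved       : factor of %.0f\n",
         dt_filtered / dt_compressible);
  return 0;
}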

Researchers have just begun to explore the data generated in these simulations. Figure 7 presents a comparison between a laboratory measurement and a simulation. The early stages of the analysis of this data indicate that excellent agreement is obtained with observable quantities. Further analyses promise to shed new light on the morphology and structure of these types of flames.

Figure 7. Comparison of experimental image velocimetry crossview of the turbulent "V" flame (top) and a simulation (bottom).

This simulation achieves its remarkable resolution of the thin flame edge by means of adaptive refinement, which places fine resolution around those features that need it and uses coarse resolution away from such features, saving orders of magnitude in computer memory and processing time. As the reaction zone shifts, the refinement automatically tracks it, adding and removing resolution dynamically. As described earlier, the simulation also employs a mathematical transformation that enables it to skip over tiny time steps that would otherwise be required to resolve sound waves in the flame domain. The four orders of magnitude saved by these algorithmic improvements, compared with previous practices, is a larger factor than the performance gain from parallelization on a computer system that costs tens of millions of dollars! It will be many years before such a high-fidelity simulation can be cost-effectively run using previous practices on any computer.

Unfortunately, the adaptivity and mathematical filtering employed to save memory and operations complicate the software and throw the execution of the code into a regime of computation that processes fewer operations per second and uses the thousands of processors of the Seaborg system less uniformly than a "business as usual" algorithm would. As a result, the simulation runs at a small percentage of the theoretical peak rate of the Seaborg machine. However, since it models the phenomenon of interest in much less execution time and delivers it many years earlier than would otherwise be possible, it is difficult to argue against its "scientific efficiency." In this case, a simplistic efficiency metric like "percentage of theoretical peak" is misleading.
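At the heart of the adaptivity just described is a flagging step that decides where fine resolution is needed. The fragment below is a deliberately simplified, one-dimensional sketch of such a criterion, using a synthetic temperature profile and an arbitrary jump tolerance; a production framework adds error estimation, buffering, clustering of flagged cells into patches, and dynamic regridding as features move.

/* Simplified sketch of the "flag for refinement" step in adaptive mesh
 * refinement: mark cells where the solution changes sharply (here, a large
 * temperature jump between neighbors), so that only those cells receive a
 * finer grid. */
#include <math.h>
#include <stdio.h>

#define N 1000

int main(void)
{
  double T[N];
  int    flag[N];
  int    i, nflagged = 0;
  const double tol = 50.0;   /* flag if |T[i+1] - T[i-1]| exceeds this (K) */

  /* Synthetic temperature field with a thin "flame front" near i = 600. */
  for (i = 0; i < N; i++)
    T[i] = 300.0 + 1500.0 / (1.0 + exp(-(i - 600) / 3.0));

  for (i = 0; i < N; i++) flag[i] = 0;
  for (i = 1; i < N - 1; i++) {
    if (fabs(T[i + 1] - T[i - 1]) > tol) {
      flag[i] = 1;
      nflagged++;
    }
  }

  printf("cells flagged for refinement: %d of %d (%.1f%%)\n",
         nflagged, N, 100.0 * nflagged / N);
  /* Only the flagged few percent of cells carry the cost of fine resolution;
   * the rest of the domain stays coarse. */
  return 0;
}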


The simulation highlighted earlier is remarkable for its physical fidelity in a complex, multiscale physics problem. Experimentalists are easily attracted to fruitful collaborations when simulations reach this threshold of fidelity. Resulting comparisons help refine both experiment and simulation in successive leaps. However, the simulation is equally noteworthy for the combination of the sophisticated mathematical techniques it employs and its parallel implementation on thousands of processors. Breakthroughs such as this have caused scientists to refer to leading-edge large-scale simulation environments as "time machines." They enable us to pull into the present scientific understanding for which we would otherwise wait years if we were dependent on experiment or on commodity computing capability alone.

The work that led to this simulation recently won recognition for one of its creators, John Bell. He and his colleague Phil Colella, both of Lawrence Berkeley National Laboratory, were awarded, in June 2003, the first Prize in Computational Science and Engineering given by the Society for Industrial and Applied Mathematics and the Association for Computing Machinery. The prize is awarded "in the area of computational science in recognition of outstanding contributions to the development and use of mathematical and computational tools and methods for the solution of science and engineering problems." Bell and Colella have applied the automated adaptive meshing methodologies they developed in such diverse areas as shock physics, astrophysics, and flow in porous media, as well as turbulent combustion.

The software used in this combustion simulation is currently being extended in many directions under SciDAC. The geometries that it can accurately accommodate are being generalized for accelerator applications. For magnetically confined fusion applications, a particle dynamics module is being added.

Software of this complexity and versatility could never have been assembled in the traditional mode of development, whereby individual researchers with separate concerns asynchronously toss packages written to arbitrary specifications "over the transom." Behind any simulation as complex as the one discussed in this chapter stands a tightly coordinated, vertically integrated team. This process illustrates the computational science "phase transition" that has found proof of concept under the SciDAC program.


4. OPPORTUNITIES AT THE SCIENTIFIC HORIZON

The availability of computers 100 to 1000 times more powerful than those currently available will have a profound impact on computational scientists' ability to simulate the fundamental physical, chemical, and biological processes that underlie the behavior of natural and engineered systems. Chemists will be able to model the diverse set of molecular processes involved in the combustion of hydrocarbon fuels and the catalytic production of chemicals. Materials scientists will be able to predict the properties of materials from knowledge of their structure, and then, inverting the process, design materials with a targeted set of properties. Physicists will be able to model a broad range of complex phenomena—from the fundamental interactions between elementary particles, to high-energy nuclear processes and particle-electromagnetic field interactions, to the interactions that govern the behavior of fusion plasmas. Biologists will be able to predict the structure and, eventually, the function of the 30,000–40,000 proteins coded in human DNA. They will also mine the vast reservoirs of quantitative data being accumulated using high-throughput experimental techniques to obtain new insights into the processes of life.

In this section, we highlight some of the most exciting opportunities for scientific discovery that will be made possible by an integrated set of investments, following the SciDAC paradigm described in Chapter 2, in the programmatic offices of the Office of Science in DOE—Advanced Scientific Computing Research, Basic Energy Sciences, Biological and Environmental Research, Fusion Energy Sciences, High Energy Physics, and Nuclear Physics. In Table 1 we provide a list of the major scientific accomplishments expected in each scientific area from the investments described in this report. The table is followed by an expanded description of one of these accomplishments in each category. The progress described in these paragraphs illustrates the advances that will be made toward solving many other scientific problems, in many other scientific disciplines, as the methodologies of computational science are refined through application to these primary Office of Science missions. Volume 2 of this report provides additional information on the many advances that, scientists predict, these investments will make possible.

PREDICTING FUTURE CLIMATES

While existing general circulation models (GCMs) used to simulate climate can provide good estimates of continental- to global-scale climatic variability and change, they are less able to produce estimates of regional changes that are essential for assessing environmental and economic consequences of climatic variability and change. For example, predictions of global average atmospheric temperature changes are of limited utility unless the effects of these changes on regional space-heating needs, rainfall, and growing seasons can be ascertained. As a result of the investments described here, it will be possible to markedly improve both regional predictions of long-term climatic variability and change, and assessments of the potential consequences of such variability and change on natural and managed ecosystems and resources of value to humans (see Figure 8).

Accurate prediction of regional climatic variability and change would have


Table 1. If additional funding becomes available for Office of Science scientific, computer science, and mathematics research programs, as well as for the needed additional computing resources, scientists predict the following major advances in the research programs supported by the Office of Science within the next 5–10 years.

Biological and Environmental Sciences
• Provide global forecasts of Earth's future climate at regional scales using high-resolution, fully coupled and interactive climate, chemistry, land cover, and carbon cycle models.
• Develop predictive understanding of subsurface contaminant behavior that provides the basis for scientifically sound and defensible cleanup decisions and remediation strategies.
• Establish a mathematical and computational foundation for the study of cellular and molecular systems and use computational modeling and simulation to predict and simulate the behavior of complex microbial systems for use in mission application areas.

Chemical and Materials Sciences
• Provide direct 3-dimensional simulations of a turbulent methane-air jet flame with detailed chemistry and direct 3-dimensional simulations of autoignition of n-heptane at high pressure, leading to more-efficient, lower-emission combustion devices.
• Develop an understanding of the growth and structural, mechanical, and electronic properties of carbon nanotubes, for such applications as strengtheners of polymers and alloys, emitters for display screens, and conductors and nonlinear circuit elements for nanoelectronics.
• Provide simulations of the magnetic properties of nanoparticles and arrays of nanoparticles for use in the design of ultra-high-density magnetic information storage. (Nanomagnetism is the simplest of many important advances in nanoscience that will impact future electronics, tailored magnetic response devices, and even catalysis.)
• Resolve the current disparity between theory and experiment for conduction through molecules with attached electronic leads. (This is the very basis of molecular electronics and may well point to opportunities for chemical sensor technology.)

Fusion Energy Sciences
• Improve understanding of fundamental physical phenomena in high-temperature plasmas, including transport of energy and particles, turbulence, global equilibrium and stability, magnetic reconnection, electromagnetic wave/particle interactions, boundary layer effects in plasmas, and plasma/material interactions.
• Simulate individual aspects of plasma behavior, such as energy and particle confinement times, high-pressure stability limits in magnetically confined plasmas, efficiency of electromagnetic wave heating and current drive, and heat and particle transport in the edge region of a plasma, for parameters relevant to magnetically confined fusion plasmas.
• Develop a fully integrated capability for predicting the performance of magnetically confined fusion plasmas with high physics fidelity, initially for tokamak configurations and ultimately for a broad range of practical energy-producing magnetic confinement configurations.
• Advance the fundamental understanding and predictability of high-energy-density plasmas for inertial fusion energy. (Inertial fusion and magnetically confined fusion are complementary technological approaches to unlocking the power of the atomic nucleus.)


High Energy Physics
• Establish the limits of the Standard Model of Elementary Particle Physics by achieving a detailed understanding of the effects of strong nuclear interactions in many different processes, so that the equality of Standard Model parameters measured in different experiments can be verified. (Or, if verification fails, signal the discovery of new physical phenomena at extreme sub-nuclear distances.)
• Develop realistic simulations of the performance of particle accelerators, the large and complex core scientific instruments of high-energy physics research, both to optimize the design, technology, and cost of future accelerators and to use existing accelerators more effectively and efficiently.

Nuclear Physics
• Understand the characteristics of the quark-gluon plasma, especially in the temperature-density region of the phase transition expected from quantum chromodynamics (QCD) and currently sought experimentally.
• Obtain a quantitative, predictive understanding of the quark-gluon structure of the nucleon and of interactions of nucleons.
• Understand the mechanism of core collapse supernovae and the nature of the nucleosynthesis in these spectacular stellar explosions.

tremendous value to society, and that value could be realized if the most significant limitations in the application of GCMs were overcome. The proposed investments would allow climate scientists to
• refine the horizontal resolution of the atmospheric component from 160 to 40 km and the resolution of the ocean component from 105 to 15 km (both improvements are feasible now, but there is insufficient computing power to routinely execute GCMs at these resolutions; a rough estimate of the added cost appears in the sketch below);
• add interactive atmospheric chemistry, carbon cycle, and dynamic terrestrial vegetation components to simulate global and regional carbon, nitrogen, and sulfur cycles and simulate the effects of land-cover and land-use changes on climate (these cycles and effects are known to be important to climate, but their implementation in GCMs is rudimentary because of computational limitations); and
• improve the representation of important subgrid processes, especially clouds and atmospheric radiative transfer.

Figure 8. Surface chlorophyll distributions simulated by the Parallel Ocean Program for conditions in late 1996 (a La Niña year), showing intense biological activity across the equatorial Pacific due to upwelling and transport of nutrient-rich cold water. Comparisons with satellite ocean color measurements validate model accuracy and inform our understanding of the regional environmental response to global climate change. [S. Chu, S. Elliott, and M. Maltrud]
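A rough sense of why these resolution refinements demand so much more computing power comes from counting grid columns and time steps. The sketch below assumes, purely for illustration, that cost grows with the cube of the horizontal refinement factor (two horizontal dimensions plus a proportionally shorter time step); vertical resolution, the added chemistry and carbon-cycle components, and ensemble size are ignored, so these are indicative lower bounds, not budget estimates.

/* Back-of-envelope scaling of climate-model cost with horizontal resolution,
 * assuming cost ~ (old_dx / new_dx)^3: two horizontal dimensions plus a
 * proportionally smaller time step.  Illustrative only. */
#include <math.h>
#include <stdio.h>

static double cost_factor(double old_dx_km, double new_dx_km)
{
  double refinement = old_dx_km / new_dx_km;
  return pow(refinement, 3.0);   /* x-resolution * y-resolution * time steps */
}

int main(void)
{
  printf("atmosphere 160 km -> 40 km : ~%.0fx more work\n",
         cost_factor(160.0, 40.0));
  printf("ocean      105 km -> 15 km : ~%.0fx more work\n",
         cost_factor(105.0, 15.0));
  return 0;
}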


Even at 40-km resolution, cloud dynamics are incompletely resolved. Recent advances in the development of cloud sub-models, i.e., two-dimensional and quasi three-dimensional cloud-resolving models, can overcome much of this limitation; but their implementation in global GCMs will require substantially more computing power.

SCIENCE FROM THE NANOSCALE UP

Our ability to characterize small molecules or perfect solids as collections of atoms is very advanced. So is our ability to characterize materials of engineering dimensions. However, our abilities fall far short of what is necessary, because many of the properties and processes needed to develop new materials and optimize existing ones occur at characteristic sizes (or scales) that lie between the atomistic world (less than 1 nanometer) and the world perceived by humans (meters or more), a spatial range of at least a billion. Bridging these different scales, while keeping the essential quantum description at the smallest scale, will require unprecedented theoretical innovation and computational effort, but it is an essential step toward a complete fundamental theory of matter.

The nanoscale, 10–100 nanometers or so—the first step in the hierarchy of scales leading from the atomic scale to macroscopic scales—is also important in its own right because, at the nanoscale, new properties emerge that have no parallel at either larger or smaller scales. Thus, the new world of nanoscience offers tremendous opportunities for developing innovative technological solutions to a broad range of real-world problems using scientists' increasing abilities to manipulate matter at this scale. Theory, modeling, and simulation are essential to developing an understanding of matter at the nanoscale, as well as to developing the new nanotechnologies (see Figure 9). At the nanoscale, experiment often cannot stand alone, requiring sophisticated modeling and simulation to deconvolute experimental results and yield scientific insight. To simulate nanoscale entities, scientists will have to perform ab initio atomistic calculations for more atoms than ever before, directly treat the correlated behavior of the electrons associated with those atoms, and model nanoscale systems as open systems rather than closed (isolated) ones. The computational effort required to simulate nanoscale systems far exceeds any computational effort in materials and molecular science to date by a factor of at least 100 to 1000.

Figure 9. To continue annual doubling of hard drive storage density (top: IBM micro-drive), improved understanding of existing materials and development of new ones are needed. First-principles spin dynamics and the use of massively parallel computing resources are enabling researchers at Oak Ridge National Laboratory to understand magnetism at ferromagnetic-antiferromagnetic interfaces (bottom: magnetic moments at a Co/FeMn interface), with the goal of learning how to store binary data in ever smaller domains, resulting in further miniaturization breakthroughs. [M. Stocks]

The investments envisioned in this area will make nanoscale simulations possible by advancing the theoretical understanding of nanoscale systems and supporting the development of computational models and software for simulating nanoscale systems and processes, as well as allowing detailed simulations of nanoscale phenomena and their interaction with the macroscopic world.

DESIGNING AND OPTIMIZING FUSION REACTORS

Fusion, the energy source that fuels the sun and stars, is a potentially inexhaustible and environmentally safe source of energy. However, to create the conditions under which isotopes undergo nuclear fusion and release energy, high-temperature (100 million degrees centigrade) plasmas—a fluid of ions and electrons—must be produced, sustained, and confined. Plasma confinement is recognized internationally as a grand scientific challenge and a formidable technical task. With the research program and computing resources outlined in this report, it will be possible to develop a fully integrated simulation capability for predicting the performance of magnetic confinement systems, such as tokamaks.

In the past, specialized simulations were used to explore selected key aspects of the physics of magnetically confined plasmas, e.g., turbulence or micro-scale instabilities. Only an integrated model can bring all of the strongly interacting physical processes together to make comprehensive, reliable predictions about the real-world behavior of fusion plasmas. The simulation of an entire magnetic fusion confinement system involves simultaneous modeling of the core and edge plasmas, as well as the plasma-wall interactions. In each region of the plasma, turbulence can cause anomalies in the transport of the ionic species in the plasma; there can also be abrupt changes in the form of the plasma caused by large-scale instabilities (see Figure 10). Computational modeling of these key processes requires large-scale simulations covering a broad range of space and time scales. An integrated simulation capability will dramatically enhance the efficient utilization of a burning fusion device, in particular, and the optimization of fusion energy development in general; and it would serve as an intellectual integrator of physics knowledge ranging from advanced tokamaks to innovative confinement concepts.

Figure 10. Simulation of the interchange of plasma between the hot (red) and cold (green) regions of a tokamak via magnetic reconnection. [W. Park]

PREDICTING THE PROPERTIES OF MATTER

Fundamental questions in high-energy physics and nuclear physics pivot on our ability to simulate so-called strong interactions on a computational lattice via the theory of quantum chromodynamics (QCD).

The ongoing goal of research in high-energy physics is to find the elementary constituents of matter and to determine how they interact. The current state of our knowledge is summarized by the Standard


Model of Elementary Particle Physics. A central focus of experiments at U.S. high-energy physics facilities is precision testing of the Standard Model. The ultimate aim of this work is to find deviations from this model—a discovery that would require the introduction of new forms of matter or new physical principles to describe matter at the shortest distances. Determining whether or not each new measurement is in accord with the Standard Model requires an accurate evaluation of the effects of the strong interactions on the process under study. These computations are a crucial companion to any experimental program in high-energy physics. QCD is the only known method of systematically reducing all sources of theoretical error.

Lattice calculations lag well behind experiment at present, and the precision of experiments under way or being planned will make the imbalance even worse. The impact of the magnitude of the lattice errors is shown in Figure 11. For the Standard Model to be correct, the parameters ρ and η must lie in the region of overlap of all the colored bands. Experiments measure various combinations of ρ and η, as shown by the colored bands. The reduction of theory errors shown will require tens of teraflop-years of simulations, which will be available through the construction of new types of computers optimized for the problems at hand. A further reduction, when current experiments are complete, will need another order of magnitude of computational power.

A primary goal of the nuclear physics program at the Relativistic Heavy Ion Collider (RHIC) at Brookhaven National Laboratory is the discovery and characterization of the quark–gluon plasma—a state of matter predicted by QCD to exist at high densities and temperatures. To confirm an observation of a new state of matter, it is important to determine the nature of the phase transition between ordinary matter and a quark–gluon plasma, the properties of the plasma, and the equation of state. Numerical simulations of QCD on a lattice have proved to be the only source of a priori predictions about this form of matter in the vicinity of the phase transition.

Recent work suggests that a major increase in computing power would result in definitive predictions of the properties of the quark–gluon plasma, providing timely and invaluable input into the RHIC heavy ion program.

Figure 11. Constraints on the Standard Model parameters ρ and η. For the Standard Model to be correct, they must be restricted to the region of overlap of the solidly colored bands. The figure on the left shows the constraints as they exist today. The figure on the right shows the constraints as they would exist with no improvement in the experimental errors, but with lattice gauge theory uncertainties reduced to 3%. [R. Patterson]


Similar lattice calculations are capable of predicting and providing an understanding of the inner structure of the nucleon, where the quarks and gluons are confined, another major nuclear physics goal. Precise lattice calculations of nucleon observables can play a pivotal role in helping to guide experiments. This methodology is well understood, but a "typical" calculation of a single observable requires over a year with all the computers presently available to theorists. A thousandfold increase in computing power would make the same result available in about 10 hours. Making the turn-around time for simulations commensurate with that of experiments would lead to major progress in the quantitative, predictive understanding of the structure of the nucleon and of the interactions between nucleons.

BRINGING A STAR TO EARTH

Nuclear astrophysics is a major research area within nuclear physics—in particular, the study of stellar explosions and their element production. Explosions of massive stars, known as core collapse supernovae, are the dominant source of elements in the Periodic Table between oxygen and iron, and they are believed to be responsible for the production of half the heavy elements (heavier than iron). Life in the universe would not exist without these catastrophic stellar deaths.

Understanding these explosive events, their byproducts, and the phenomena associated with them will require large-scale, multidimensional, multiphysics simulations (see Figure 12). Current supercomputers are enabling the first detailed two-dimensional simulations. Future petascale computers would provide, for the first time, the opportunity to simulate supernovae realistically in three dimensions.

Figure 12. Development of a newly discovered shock wave instability and resulting stellar explosions (supernovae), in two- and three-dimensional simulations. [J. Blondin, A. Mezzacappa, and C. DeMarino; visualization by R. Toedte]

While observations from space are critical, the development of theoretical supernova models will also be grounded in terrestrial experiment. Supernova science has played a significant role in motivating the construction of the Rare Isotope Accelerator (RIA) and the National Underground Laboratory for Science and Engineering (NUSEL). As planned, RIA will perform measurements that will shed light on models for heavy element production in supernovae, and NUSEL will house numerous neutrino experiments, including a half-megaton neutrino detector capable of detecting extra-galactic supernova neutrinos out to Andromeda.

Supernovae have importance beyond their place in the cosmic hierarchy: the extremes of density, temperature, and composition encountered in supernovae provide an opportunity to explore fundamental nuclear and particle physics that would otherwise be inaccessible in terrestrial experiments. Supernovae therefore serve as cosmic laboratories, and computational models of supernovae constitute the bridge between the observations, which bring us information about these explosions, and the fundamental physics we seek.



5. ENABLING MATHEMATICS AND COMPUTER SCIENCE TOOLS

MATHEMATICAL TOOLS FOR LARGE-SCALE SIMULATION

Mathematics is the bridge between physical reality and computer simulations of physical reality. The starting point for a simulation is a mathematical model, expressed in terms of a system of equations and based on scientific understanding of the problem being investigated, such as knowledge of the forces acting on a particle or a parcel of fluid. For such a model, there are three types of mathematical tools that may be brought to bear in order to produce and use computer simulations.

• Model Analysis. Although the mathematical model is motivated by science, it must stand on its own as an internally consistent mathematical object for its computer representation to be well defined. For that reason, it is necessary to resolve a number of issues regarding the mathematical structure of the model. Do the equations have a unique solution for all physically meaningful inputs? Do small changes in the inputs lead only to small changes in the solution, or, as an indication of potential trouble, could a small uncertainty be magnified by the model into large variability in the outcome?

• Approximation and Discretization. For many systems, the number of unknowns in the model is infinite, as when the solution is a function of continuum variables such as space and time. In such cases, it is necessary to approximate the infinite number of unknowns with a large but finite number of unknowns in order to represent it on a computer. The mathematical issues for this process, called "discretization," include (1) the extent to which the finite approximation better agrees with the solution to the original equations as the number of computational unknowns increases and (2) the relationship between the choice of approximation and qualitative mathematical properties of the solution, such as singularities.

• Solvers and Software. Once one has represented the physical problem as the solution to a finite number of equations for a finite number of unknowns, how does one make best use of the computational resources to calculate the solution to those equations? Issues at this stage include the development of optimally efficient algorithms and the mapping of computations onto a complex hierarchy of processors and memory systems.

Although these three facets represent distinct mathematical disciplines, they are typically employed in concert to build complete simulations. The choice of appropriate mathematical tools can make or break a simulation code. For example, over a four-decade period of our brief simulation era, algorithms alone have brought a speed increase of a factor of more than a million to computing the electrostatic potential induced by a charge distribution, a computational kernel found in a wide variety of problems in the sciences. The improvement resulting from this algorithmic speedup is comparable to that resulting from the hardware speedup due to Moore's Law over the same length of time (see Figure 13).
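The million-fold algorithmic gain cited above can be made concrete by comparing standard asymptotic operation counts for solving the discretized potential (Poisson) equation on an n × n × n grid. The exponents in the sketch below are the usual textbook estimates with all constants ignored; they are indicative only and do not reproduce the entries of the table in Figure 13.

/* Asymptotic work estimates (constants ignored) for solving the discrete
 * electrostatic potential (Poisson) equation on an n x n x n grid with
 * N = n^3 unknowns.  Exponents are standard textbook estimates. */
#include <math.h>
#include <stdio.h>

int main(void)
{
  const double n = 64.0;          /* grid points per dimension (as in Figure 13) */
  const double N = n * n * n;     /* total unknowns                               */

  double banded_ge  = pow(n, 7.0);  /* banded Gaussian elimination      ~ N^(7/3) */
  double relaxation = pow(n, 5.0);  /* Jacobi / Gauss-Seidel relaxation ~ N^(5/3) */
  double conj_grad  = pow(n, 4.0);  /* unpreconditioned conjugate gradients ~ N^(4/3) */
  double multigrid  = N;            /* full multigrid ("optimal")       ~ N       */

  printf("unknowns N                : %.2e\n", N);
  printf("banded elimination        : %.2e ops\n", banded_ge);
  printf("Jacobi/Gauss-Seidel       : %.2e ops\n", relaxation);
  printf("conjugate gradients       : %.2e ops\n", conj_grad);
  printf("full multigrid            : %.2e ops\n", multigrid);
  printf("elimination vs. multigrid : factor of %.1e\n", banded_ge / multigrid);
  return 0;
}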

A Science-Based Case for Large-Scale Simulation 31 5. Enabling Mathematics and Computer Science Tools over the same length of time (see successive redesigns of the discretization Figure 13). The series of algorithmic methods and solver algorithms to better improvements producing this factor are all exploit this mathematical property of the based on a fundamental mathematical model. property of the underlying model—namely, that the function describing electrostatic Another trend in computational science is coupling between disjoint regions in space the steady increase in the intellectual is very smooth. Expressed in the right way, complexity of virtually all aspects of the this coupling can therefore be resolved modeling process. The scientists accurately with little computational effort. contributing to this report identified and The various improvements in algorithms developed (in Volume 2) eight cross-cutting for solving this problem came about by areas of applied mathematics for which research and development is needed to accommodate the rapidly increasing complexity of state-of-the-art computational science. They fall into the following three categories.
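
To make the scale of these algorithmic gains concrete, the electrostatic kernel just described can be posed as a Poisson problem for the potential, and textbook operation counts convey the orders of magnitude involved. (The estimates below are standard asymptotic counts offered as an illustrative sketch; they are not a reproduction of the exact entries in Figure 13.)

\[
-\nabla^2 \phi = \rho \quad \text{(in suitable units), discretized on an } n \times n \times n \text{ grid with } N = n^3 \text{ unknowns;}
\]
\[
\text{banded Gaussian elimination: } O(n^7) = O(N^{7/3}) \text{ flops,} \qquad \text{full multigrid: } O(n^3) = O(N) \text{ flops.}
\]

The ratio of the two costs grows like n^4, so a grid with n in the tens already accounts for a factor of about a million, consistent with the gains cited above.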

Another trend in computational science is the steady increase in the intellectual complexity of virtually all aspects of the modeling process. The scientists contributing to this report identified and developed (in Volume 2) eight cross-cutting areas for which research and development is needed to accommodate the rapidly increasing complexity of state-of-the-art computational science. They fall into the following three categories.

• Managing Model Complexity. Scientists want to use increasing computing capability to improve the fidelity of their models. For many problems, this means introducing models with more physical effects, more equations, and more unknowns. In Multiphysics Modeling, the goal is to develop a combination of analytical and numerical techniques to better represent problems with multiple physical processes. These techniques may range from analytical methods to determine how to break a problem up into weakly interacting components, to new numerical methods for exploiting such a decomposition of the problem to obtain efficient and accurate discretizations in time. A similar set of issues arises from the fact that many systems of interest have processes that operate on length and time scales that vary over many orders of magnitude. Multiscale Modeling addresses the representation and interaction of behaviors on multiple scales so that results of interest are recovered without the (unaffordable) expense of representing all behaviors at uniformly fine scales. Approaches include the development of adaptive methods, i.e., discretization methods that can represent directly many orders of magnitude in length scales that might appear in a single mathematical model, and hybrid methods for coupling radically different models (continuum vs. discrete, or stochastic vs. deterministic), each of which represents the behavior on a different scale. Uncertainty Quantification addresses issues connected with mathematical models that involve fits to experimental data, or that are derived from heuristics that may not be directly connected to physical principles. Uncertainty quantification uses techniques from fields such as statistics and optimization to determine the sensitivity of models to inputs with errors and to design models to minimize the effect of those errors.

• Discretizations of Spatial Models. Many of the applications described in this document have, as core components of their mathematical models, the equations of fluid dynamics or radiation transport, or both. Computational Fluid Dynamics and Transport and Kinetic Methods have as their goal the development of the next generation of spatial discretization methods for these problems. Issues include the development of discretization methods that are well suited for use in multiphysics applications without loss of accuracy or robustness. Meshing Methods specifically address the process of discretizing the computational domain, itself, into a union of simple elements. Meshing is usually a prerequisite for discretizing the equations defined over the domain. This process includes the management of complex geometrical objects arising in technological devices, as well as in some areas of science, such as biology.

• Managing Computational Complexity. Once the mathematical model has been converted into a system of equations for a finite collection of unknowns, it is necessary to solve the equations. The goal of efforts in the area of Solvers and "Fast" Algorithms is to develop algorithms for solving these systems of equations that balance computational efficiency on hierarchical multiprocessor systems, scalability (the ability to use effectively additional computational resources to solve increasingly larger problems), and robustness (insensitivity of the computational cost to details of the inputs). An algorithm is said to be "fast" if its cost grows, roughly, only proportionally to the size of the problem. This is an ideal algorithmic property that is being obtained for more and more types of equations. Discrete Mathematics and Algorithms make up a complementary set of tools for managing the computational complexity of the interactions of discrete objects. Such issues arise, for example, in traversing data structures for calculations on unstructured grids, in optimizing resource allocation on multiprocessor architectures, or in scientific problems in areas such as bioinformatics that are posed directly as "combinatorial" problems.

COMPUTER SCIENCE TOOLS FOR LARGE-SCALE SIMULATION

The role of computer science in ultrascale simulation is to provide tools that address the issues of complexity, performance, and understanding. These issues cut across the computer science disciplines. An integrated approach is required to solve the problems faced by applications.

COMPLEXITY

One of the major challenges for applications is the complexity of turning a mathematical model into an effective simulation. There are many reasons for this, as indicated in Figure 4. Often, even after the principles of a model and simulation algorithm are well understood, too much effort still is required to turn this understanding into practice because of the complexity of code design. Current programming models and frameworks do not provide sufficient support to allow domain experts to be shielded from details of data structures and computer architecture. Even after an application code produces correct scientific results, too much effort still is required to obtain high performance. Code tuning efforts needed to match algorithms to current computer architectures require lengthy analysis and experimentation.

Once an application runs effectively, the next hurdle is often saving, accessing, and sharing data. Because ultrascale simulations often produce ultrascale-sized datasets, once the data are stored it is still too difficult to process, investigate, and visualize them in order to accomplish the purpose of the simulation—to advance science. These difficulties are compounded by the problems faced in sharing resources, both human and computer hardware.

Despite this grim picture, prospects for placing usable computing environments into the hands of scientific domain experts are improving. In the last few years, there has been a growing understanding of the problems of managing complexity in computer science, and therefore of their potential solutions. For example, there is a deeper understanding of how to make programming languages expressive and easy to use without sacrificing high performance on sophisticated, adaptive algorithms.

Another example is the success of component-oriented software in some application domains; such "components" have allowed computational scientists to focus their own expertise on their science while exploiting the newest algorithmic developments. Many groups in high-performance computing have tackled these issues with significant leadership from DOE. Fully integrated efforts are required to produce a qualitative change in the way application groups cope with the complexity of designing, building, and using ultrascale simulation codes.

PERFORMANCE

One of the drivers of software complexity is the premium on performance. The most obvious aspect of the performance problem is the performance of the computer hardware. Although there have been astounding gains in arithmetic processing rates over the last five decades, users often receive only a small fraction of the theoretical peak of processing performance. There is a perception that this fraction is, in fact, declining. This perception is correct in some respects. For many applications, the reason for a declining percentage of peak performance is the relative imbalance in the performance of the subsystems of high-end computers. While the raw performance of commodity processors has followed Moore's Law and doubled every 18 months, the performance of other critical parts of the system, such as memory and interconnect, has improved much less rapidly, leading to less-balanced overall systems. Solving this problem requires attention to system-scale architectural issues.


As with code complexity issues, there are multiple on-going efforts to address hardware architecture issues. Different architectural solutions may be required for different algorithms and applications. A single architectural convergence point, such as that occupied by current commodity-based terascale systems, may not be the most cost-effective solution for all users. A comprehensive simulation program requires that several candidate architectural approaches receive sustained support to explore their promise.

Performance is a cross-cutting issue, and computer science offers automated approaches to developing codes in ways that allow computational scientists to concentrate on their science. For example, techniques that allow a programmer to automatically generate code for an application that is tuned to a specific computer architecture address both the issue of managing the complexity of highly tuned code and the problem of providing effective portability between high-performance computing platforms. Such techniques begin with separate analyses of the "signature" of the application (e.g., the patterns of local memory accesses and inter-processor communication) and the parameters of the hardware (e.g., cache sizes, latencies, bandwidths). There is usually plenty of algorithmic freedom in scheduling and ordering operations and exchanges while preserving correctness. This freedom should not be arbitrarily limited by a particular expression in a low-level language; rather, the choice should be made close to runtime to best match a given application–architecture pair. Similarly, the performance of I/O and dataset operations can be improved significantly through the use of well-designed and adaptive algorithms.
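
A minimal sketch of the empirical, signature-driven tuning just described is given below in Python. It is illustrative only: the kernel, the candidate block sizes, and the function names are invented for this example and do not correspond to any particular DOE code or tuning framework, and a real system would generate and compile architecture-specific variants rather than merely re-blocking a loop.

    # Toy empirical autotuner: time several candidate blockings of a simple
    # kernel on the machine at hand and keep the fastest. All names and
    # parameters here are hypothetical, chosen only to illustrate the idea.
    import time

    def blocked_dot(a, b, block):
        """Compute the sum of a[i]*b[i], traversing the data in chunks of `block`."""
        total = 0.0
        n = len(a)
        for start in range(0, n, block):
            end = min(start + block, n)
            for i in range(start, end):
                total += a[i] * b[i]
        return total

    def autotune(a, b, candidate_blocks, repeats=3):
        """Return (block, seconds) for the candidate with the smallest measured runtime."""
        best_block, best_time = None, float("inf")
        for block in candidate_blocks:
            t0 = time.perf_counter()
            for _ in range(repeats):
                blocked_dot(a, b, block)
            elapsed = (time.perf_counter() - t0) / repeats
            if elapsed < best_time:
                best_block, best_time = block, elapsed
        return best_block, best_time

    if __name__ == "__main__":
        n = 1_000_000
        a, b = [1.0] * n, [2.0] * n
        block, seconds = autotune(a, b, candidate_blocks=[256, 4096, 65536])
        print(f"selected block size {block} ({seconds:.4f} s per run)")

The point of the sketch is the separation described above: the candidate implementations encode the algorithmic freedom in ordering the work, while the timing loop supplies the machine's parameters empirically rather than from an analytical hardware model.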

UNDERSTANDING

Computer science also addresses the issue of understanding the results of a computation. Ultrascale datasets are too large to be grasped directly. Applications that generate such sets already rely on a variety of tools to attempt to extract patterns and features from the data. Computer science offers techniques in data management and understanding that can be used to explore data sets, searching for particular patterns. Visualization techniques help scientists explore their data, taking advantage of the unique human visual cortex and visually stimulated human insight. Current efforts in this area are often limited by the lack of resources, in terms of both staffing and hardware.

Understanding requires harnessing the skills of many scientists. Collaboration technologies help scientists at different institutions work together. Grid technologies that simplify data sharing and provide access to both experimental and ultrascale computing facilities allow computational scientists to work together to solve the most difficult problems facing the nation today. Although these technologies have been demonstrated, much work remains to make them a part of every scientist's toolbox. Key challenges are in scheduling multiple resources and in data security.

In Volume 2 of this report, scientists from many disciplines discuss critical computer science technologies needed across applications. Visual Data Exploration and Analysis considers understanding the results of a simulation through visualization. Computer Architecture looks at the systems on which ultrascale simulations run and for which programming environments and algorithms (described in the section on computational mathematics) must provide software tools. Programming Models and Component Technology for High-End Computing discusses new techniques for turning algorithms into efficient, maintainable programs. Access and Resource Sharing looks at how to both make resources available to the entire research community and promote collaboration. Software Engineering and Management is about disciplines and tools—many adapted from the industrial software world—to manage the complexity of code development. Data Management and Analysis addresses the challenges of managing and understanding the increasingly staggering volumes of data produced by ultrascale simulations. Performance Engineering is the art of achieving high performance, which has the potential of becoming a science. System Software considers the basic tools that support all other software.

The universality of these issues calls for focused research campaigns. At the same time, the inter-relatedness of these issues to one another and to the application and algorithmic issues above them in the simulation hierarchy points to the potential synergy of integrated simulation efforts.


6. RECOMMENDATIONS AND DISCUSSION

The preceding chapters of this volume have presented a science-based case for large-scale simulation. Chapter 2 presents an on-going successful initiative, SciDAC—a collection of 51 projects operating at a level of approximately $57 million per year—as a template for a much-expanded investment in this area. To fully realize the promises of SciDAC, even at its present size, requires immediate new investment in computer facilities. Overall, the computational science research enterprise could make effective use of as much as an order of magnitude increase in annual investment at this time. Funding at that level would enable an immediate tenfold increase in computer cycles available and a doubling or tripling of the researchers in all contributing disciplines. A longer, 5-year view should allow for another tenfold increase in computing power (at less than a tenfold increase in expense). Meanwhile, the emerging workforce should be primed now for expansion by attracting and developing new talent.

Chapter 3 describes the anatomy of a state-of-the-art large-scale simulation. A simulation such as this one goes beyond established practices by exploiting advances from many areas of computational science to marry raw computational power to algorithmic intelligence. It conveys the multidisciplinary spirit intended in the expanded initiative. Chapter 4 lists some compelling scientific opportunities that are just over the horizon—new scientific advances that are predicted to be accessible with an increase in computational power of a hundred- to a thousand-fold. Chapter 5 lists some advances in the enabling technologies of computational mathematics and computer science that will be required or that could, if developed for widespread use, reduce the computational power required to achieve a given scientific simulation objective. The companion volume details these opportunities more extensively. This concluding section revisits each of the recommendations listed at the end of Chapter 1 and expands upon them with additional clarifying discussion.

1. Major new investments in computational science are needed in all of the mission areas of DOE's Office of Science, as well as those of many other agencies, so that the United States may be the first, or among the first, to capture the new opportunities presented by the continuing advances in computing power. Such investments will extend the important scientific opportunities that have been attained by a fusion of sustained advances in scientific models, mathematical algorithms, computer architecture, and scientific software engineering.

The United States is an acknowledged international leader in computational science, both in its national laboratories and in its research universities. However, the United States could lose its leadership position in a short time because the opportunities represented by computational science for economic, military, and other types of leadership are well recognized; the underlying technology and training are available without restrictions; and the technology continues to evolve rapidly. As Professor Jack Dongarra from the University of Tennessee recently noted:


The rising tide of change [resulting from advances in information technology] shows no respect for the established order. Those who are unwilling or unable to adapt in response to this profound movement not only lose access to the opportunities that the information technology revolution is creating, they risk being rendered obsolete by smarter, more agile, or more daring competitors.

The four streams of development brought together in a "perfect fusion" to create the current opportunities are of different ages, but their confluence is recent. Modern scientific modeling still owes as much to Newton's Principia (1686) as to anything since, although new models are continually proposed and tested. Modern simulation and attention to floating point arithmetic have been pursued with intensity since the pioneering work of Von Neumann at the end of World War II. Computational scientists have been riding the technological curve subsequently identified as "Moore's Law" for nearly three decades now. However, scientific software engineering as a discipline focused on componentization, extensibility, reusability, and platform portability within the realm of high-performance computing is scarcely a decade old. Multidisciplinary teams that are savvy with respect to all four aspects of computational simulation, and that are actively updating their codes with optimal results from each aspect, have evolved only recently. Not every contemporary "community code" satisfies the standard envisioned for this new kind of computational science, but the United States is host to a number of the leading efforts and should capitalize upon them.

2. Multidisciplinary teams, with carefully selected leadership, should be assembled to provide the broad range of expertise needed to address the intellectual challenges associated with translating advances in science, mathematics, and computer science into simulations that can take full advantage of advanced computers.

Part of the genius of the SciDAC initiative is the cross-linking of accountabilities between applications groups and the groups providing enabling technologies. This accountability implies, for instance, that physicists are concentrating on physics, not on developing parallel libraries for core mathematical operations; and it implies that mathematicians are turning their attention to methods directly related to application-specific difficulties, moving beyond the model problems that motivated earlier generations. Another aspect of SciDAC is the cross-linking of laboratory and academic investigators. Professors and their graduate students can inexpensively explore many promising research directions in parallel; however, the codes they produce rarely have much shelf life. On the other hand, laboratory-based scientists can effectively interact with many academic partners, testing, absorbing, and seeding numerous research ideas, and effectively "hardening" the best of the ideas into long-lived projects and tested, maintained codes. This complementarity should be built into future computational science initiatives.

Future initiatives can go farther than SciDAC by adding self-contained, vertically integrated teams to the existing SciDAC structure of specialist groups. The SciDAC Integrated Software Infrastructure Centers (ISICs), for instance, should continue to be supported, and even enhanced, to develop infrastructure that supports diverse science portfolios. However, mathematicians and computer scientists capable of understanding and harnessing the output of these ISICs in specific scientific areas should also be bona fide members of the next generation of computational science teams recommended by this report, just as they are now part of a few of the larger SciDAC applications teams.

3. Extensive investment in new computational facilities is strongly recommended, since simulation now cost-effectively complements experimentation in the pursuit of the answers to numerous scientific questions. New facilities should strike a balance between capability computing for those "heroic simulations" that cannot be performed any other way, and capacity computing for "production" simulations that contribute to the steady stream of progress.

The topical breakout groups assessing opportunities in the sciences at the June 2003 SCaLeS Workshop have universally pleaded for more computational resources dedicated to their use. These pleas are quantified in Volume 2 and highlighted in Chapter 4 of this volume, together with (partial) lists of the scientific investigations that can be undertaken with current know-how and an immediate influx of computing resources.

It is fiscally impossible to meet all of these needs, nor does this report recommend that available resources be spent entirely on the computational technology existing in a given year. Rather, steady investments should be made over evolving generations of computer technology, since each succeeding generation packages more capability per dollar and evolves toward greater usability.

To the extent that the needs for computing resources go unmet, there will remain a driving incentive to get more science done with fewer cycles. This is a healthy incentive for multidisciplinary approaches that involve increasingly optimal algorithms and software solutions. However, this incentive exists independently of the desperate need for computing resources. U.S. investigators are shockingly short on cycles to capitalize on their current know-how. Some workshop participants expressed a need for very large facilities in which to perform "heroic" simulations to provide high-fidelity descriptions of the phenomena of interest, or to establish the validity of a computational model. With current facilities providing peak processing rates in the tens of teraflop/s, a hundred- to thousand-fold improvement in processing rate would imply a petaflop/s machine. Other participants noted the need to perform ensembles of simulations of more modest sizes, to develop quantitative statistics, to perform parameter identifications, or to optimize designs. A carefully balanced set of investments in capability and capacity computing will be required to meet these needs.

Since the most cost-effective cycles for the capacity requirements are not necessarily the same as the most cost-effective cycles for the capability requirements, a mix of procurements is natural and optimal. In the short term, we recommend pursuing, for use by unclassified scientific applications, at least one machine capable of sustaining one petaflop/s, plus the capacity for another sustained petaflop/s distributed over tens of installations nationwide.

In earlier, more innocent times, recommendations such as these could have been justified on the principle of "build it and they will come," with the hope of encouraging the development of a larger computational science activity carrying certain anticipated benefits. In the current atmosphere, the audience for this hardware already exists in large numbers; and the operative principle is, instead, "build it or they will go!" Two plenary speakers at the workshop noted that a major U.S.-based computational geophysicist has already taken his leading research program to Japan, where he has been made a guest of the Earth Simulator facility; this facility offers him a computational capability that he cannot find in the United States. Because of the commissioning of the Japanese Earth Simulator last spring, this problem is most acute in earth science. However, the problem will rapidly spread to other disciplines.

4. Investment in hardware facilities should be accompanied by sustained collateral investment in software infrastructure for them. The efficient use of expensive computational facilities and the data they produce depends directly upon multiple layers of system software and scientific software, which, together with the hardware, are the engines of scientific discovery across a broad portfolio of scientific applications.

The President's Information Technology Advisory Committee (PITAC) report of 1999 focused on the tendency to under-invest in software in many sectors, and this is unfortunately also true in computational science. Vendors have sometimes delivered leading-edge computing platforms with immature or incomplete software bases; and computational science users, as well as third-party vendors, have had to rise to fill this gap. In addition, scientific applications groups have often found it difficult to make effective use of these computing platforms from day one, and revising their software for the new machine can take several months to a year or more. Many industry-standard libraries of systems software and scientific software originated and are maintained at national laboratories, a situation that is a natural adaptation to the paucity of scientific focus in the commercial marketplace.

The SciDAC initiative has identified four areas of scientific software technology for focused investment—scientific simulation codes, mathematical software, computing systems software, and collaboratory software. Support for development and maintenance of portable, extensible software is yet another aspect of the genius of the SciDAC initiative, as the reward structure for these vital tasks is otherwise absent in university-based research and difficult to defend over the long term in laboratory-based research. Although SciDAC is an excellent start, the current funding level cannot support many types of software that computational science users require.

It will be necessary and cost-effective, for the foreseeable future, for the agencies that sponsor computational science to make balanced investments in several of the layers of software that come between the scientific problems they face and the computer hardware they own.

5. Additional investments in hardware facilities and software infrastructure should be accompanied by sustained collateral investments in algorithm research and theoretical development. Improvements in basic theory and algorithms have contributed as much to increases in computational simulation capability as improvements in hardware and software over the first six decades of scientific computing.

A recurring theme among the reports from the topical breakout groups for scientific applications and mathematical methods is that increases in computational capability comparable to those coming directly from Moore's Law have come from improved models and algorithms. This point was noted over 10 years ago in the famous "Figure 4" of a booklet entitled Grand Challenges: High-Performance Computing and Communications, published by FCCSET and distributed as a supplement to the President's FY 1992 budget. A contemporary reworking of this chart appears as Figure 13 of this report. It shows that the same factor of 16 million in performance improvement that would be accumulated by Moore's Law in a 36-year span is achieved by improvements in numerical analysis for solving the electrostatic potential equation on a cube over a similar period, stretching from the demonstration that Gaussian elimination was feasible in the limited arithmetic of digital computers to the publication of the so-called "full multigrid" algorithm.
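
That factor admits a simple back-of-the-envelope check (an illustrative calculation using textbook operation counts for this problem, not the precise data behind Figure 13). Moore's Law doubling every 18 months over 36 years, and the replacement of banded Gaussian elimination (roughly n^7 operations on an n × n × n grid) by full multigrid (roughly n^3), give the same factor for the n = 64 case illustrated in the figure:

\[
2^{\,36/1.5} = 2^{24} \approx 1.7 \times 10^{7}, \qquad \frac{n^{7}}{n^{3}} = n^{4} = 64^{4} = 2^{24} \approx 1.7 \times 10^{7}.
\]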

There are numerous reasons to believe that advances in applied and computational mathematics will make it possible to model a range of spatial and temporal scales much wider than those reachable by established practice. This has already been demonstrated with automated adaptive meshing, although the latter still requires too much specialized expertise to have achieved universal use. Furthermore, multiscale theory can be used to communicate the net effect of processes occurring on unresolvable scales to models that are discretized on resolvable scales. Multiphysics techniques may permit codes to "step over" stability-limiting time scales that are dynamically irrelevant at larger scales.

Even for models using fully resolved scales, new research in algorithms promises to transform methods whose costs increase rapidly with increasing grid resolution into methods that increase linearly or log-linearly with complexity, with reliable bounds on the approximations that are needed to achieve these simulation-liberating gains in performance. It is essential to keep computational science codes well stoked with the latest algorithmic technology available. In turn, computational science applications must actively drive research in algorithms.

6. Computational scientists of all types should be proactively recruited with improved reward structures and opportunities as early as possible in the educational process so that the number of trained computational science professionals is sufficient to meet present and future demands.

The United States is exemplary in integrating its computational science graduate students into internships in the national laboratory system, and it has had the foresight to establish a variety of fellowship programs to make the relatively new career of "computational scientist" visible outside departmental boundaries at universities. These internships and fellowship programs are estimated to be under-funded, relative to the number of well-qualified students, by a factor of 3. Even if all presently qualified and interested U.S. citizens who are doctoral candidates could be supported—and even if this pool were doubled by aggressive recruitment among groups in the native-born population who are underrepresented in science and engineering, and/or the addition of foreign-born and domestically trained graduate students—the range of newly available scientific opportunities would only begin to be grazed in new computational science doctoral dissertations.


The need to dedicate attention to earlier stages of the academic pipeline is well documented elsewhere and is not the central subject of this focused report, although it is a related concern. It is time to thoroughly assess the role of computational simulation in the undergraduate curriculum. Today's students of science and engineering, whose careers will extend to the middle of the twenty-first century, must understand the foundations of computational simulation, must be aware of the advantages and limitations of this approach to scientific inquiry, and must know how to use computational techniques to help solve scientific and engineering problems. Few universities can make this claim of their current graduates. Academic programs in computational simulation can be attractors for undergraduates who are increasingly computer savvy but presently perceive more exciting career opportunities in non-scientific applications of information technology.

7. Sustained investments must be made in network infrastructure for access and resource sharing, as well as in the software needed to support collaboration among distributed teams of scientists, recognizing that the best possible computational science teams will be widely separated geographically and that researchers will generally not be co-located with facilities and data.

The topical breakout group on access and resource sharing addressed the question of what technology must be developed and/or deployed to make a petaflop/s-scale computer useful to users not located in the same building. Resource sharing constitutes a multilayered research endeavor that is presently accorded approximately one-third of the resources available to SciDAC, which are distributed over ten projects. Needs addressed include network infrastructure, grid/networking middleware, and application-specific remote user environments.

The combination of central facilities and a widely distributed scientific workforce requires, at a minimum, remote access to computation and data, as well as the integration of data distributed over several sites.

Network infrastructure research is also recommended for its promise of leading-edge capabilities that will revolutionize computational science, including environments for coupling simulation and experiment; environments for multidisciplinary simulation; orchestration of multidisciplinary, multiscale, end-to-end science processes; and collaboration in support of all these activities.

8. Federal investment in innovative, high-risk computer architectures that are well suited to scientific and engineering simulations is both appropriate and needed to complement commercial research and development. The commercial computing marketplace is no longer effectively driven by the needs of computational science.

Scientific computing—which ushered in the information revolution over the past half-century by demanding digital products that eventually had successful commercial spin-offs—has now virtually been orphaned by the information technology industry. Scientific computing occupies only a few percentage points of the total marketplace revenue volume of information technology. This stark economic fact is a reminder of one of the socially beneficial by-products of computational science. It is also a wake-up call regarding the necessity that computational science attend to its own needs vis-à-vis future computer hardware requirements. The most noteworthy neglect of the scientific computing industry is, arguably, the deteriorating ratio of the memory bandwidth between processors and main memory to the computational speed of the processors. It does not matter how fast a processor is if it cannot get the data that it needs from memory fast enough—a fact recognized very early by Seymour Cray, whom many regard as the father of the supercomputer. Most commercial applications do not demand high memory bandwidth; many scientific applications do.

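The arithmetic behind this observation is elementary; the figures below are round numbers assumed for the sketch, not measurements cited in this report. If a kernel performs f floating-point operations for each byte it must move to or from main memory, then its sustained rate is bounded by

\[
\text{sustained flop/s} \;\le\; \min\bigl(\text{peak flop/s},\; f \times \text{memory bandwidth in bytes/s}\bigr).
\]

A memory-bound scientific kernel with f near 0.1 to 0.2 flop/byte, running on a processor with a multi-gigaflop/s peak but only about one gigabyte/s of sustained memory bandwidth, therefore delivers a few percent of peak however fast its arithmetic units are, whereas a cache-friendly commercial workload is far less sensitive to the same imbalance.
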
There has been a reticence to accord computational science the same consideration given other types of experimental science when it comes to federal assistance in the development of facilities. High-end physics facilities—such as accelerators, colliders, lasers, and telescopes—are, understandably, designed, constructed, and installed entirely at taxpayer expense. However, the design and fabrication of high-end computational facilities are expected to occur in the course of profit-making enterprises; only the science-specific purchase or lease and installation are covered at taxpayer expense. To the extent that the marketplace automatically takes care of the needs of the computational science community, this represents a wonderful savings for the taxpayer. To the extent that it does not, it stunts the opportunity to accomplish important scientific objectives and to lead the U.S. computer industry to further innovation and greatness.

The need to dedicate attention to research in advanced computer architecture is documented elsewhere in many places, including the concurrent interagency High-End Computing Revitalization Task Force (HECRTF) study, and is not the central concern of this application-focused report. However, it is appropriate to reinforce this recommendation, which is as true of our era as it was true of the era of the "Lax Report" (see the discussion immediately following). The SciDAC report called for the establishment of experimental computing facilities whose mission was to "provide critical experimental equipment for computer science and applied mathematics researchers as well as computational science pioneers to assess the potential of new computer systems and architectures for scientific computing." Coupling these facilities with university-based research projects could be a very effective means to fill the architecture research gap.

CLOSING REMARKS

The contributors to this report would like to note that they claim little original insight for the eight recommendations presented. In fact, our last six recommendations may be found, slightly regrouped, in the four prescient recommendations of the Lax Report (Large-Scale Computing in Science and Engineering, National Science Board, 1982), which was written over two decades ago, in very different times, before the introduction of pervasive personal computing and well before the emergence of the World Wide Web!

Our first two recommendations reflect the beginnings of a "phase transition" in the way certain aspects of computational science are performed, from solo investigators to multidisciplinary groups. Most members of the Lax panel—Nobelists and members of the National Academies included—wrote their own codes, perhaps in their entirety, and were capable of understanding nearly every step of their compilation, execution, and interpretation. This is no longer possible: the scientific applications, computer hardware, and computing systems software are now each so complex that combining all three into today's engines of scientific discovery requires the talents of many. The phase transition to multidisciplinary groups, though not first crystallized by the SciDAC report (Scientific Discovery through Advanced Computing, U.S. Department of Energy, March 2000), was recognized and eloquently and persuasively articulated there.

The original aspect of the current report is not, therefore, its recommendations. It is, rather, its elaboration in the companion volume of what the nation's leading computational scientists see just over the computational horizon, and the descriptions by cross-cutting enabling technologists from mathematics and computer science of how to reach these new scientific vistas.

It is encouraging that the general recommendations of generations of reports such as this one, which indicate a desired direction along a path of progress, are relatively constant. They certainly adapt to local changes in the scientific terrain, but they do not swing wildly in response to the siren calls of academic fashion, the international politics of information technology, or the vagaries of science funding. Meanwhile, the supporting arguments for the recommendations, which indicate the distance traveled along this path, have changed dramatically. As the companion volume illustrates, the two decades since the Lax Report have been extremely exciting and fruitful for computational science. The next 10 to 20 years will see computational science firmly embedded in the fabric of science—the most profound development in the scientific method in over three centuries!

The constancy of direction provides reassurance that the community is well guided, while the dramatic advances in computational science indicate that it is well paced and making solid progress. The Lax Report measured needs in megaflop/s, the current report in teraflop/s—a unit one million times larger. Neither report, however, neglects to emphasize that it is not the instruction cycles per second that measure the quality of the computational effort, but the product of this rating times the scientific results per instruction cycle. The latter requires better computational models, better simulation algorithms, and better software implementation.


POSTSCRIPT AND ACKNOWLEDGMENTS

This report is built on the efforts of a community of computational scientists, DOE program administrators, and technical and logistical professionals situated throughout the country. Bringing together so many talented individuals from so many diverse endeavors was stimulating. Doing so and reaching closure in less than four months, mainly summer months, was nothing short of exhilarating.

The participants were reminded often during the planning and execution of this report that the process of creating it was as important as the product itself. If resources for a multidisciplinary initiative in scientific simulation were to be presented to the scientists, mathematicians, and computer scientists required for its successful execution—as individuals distributed across universities and government and industrial laboratories, lacking familiarity with each other's work across disciplinary and sectorial boundaries—the return to the taxpayer certainly would be positive, yet short of its full potential. Investments in science are traditionally made this way, and in this way, sound scientific building blocks are accumulated and tested. But as a pile of blocks does not constitute a house, theory and models from science, methods and algorithms from mathematics, and software and hardware systems from computer science do not constitute a scientific simulation. The scientists, mathematicians, and computer scientists who met at their own initiative to build this report are now ready to build greater things. While advancing science through simulation, the greatest thing they will build is the next generation of computational scientists, whose training in multidisciplinary environments will enable them to see much further and live up to the expectations of Galileo, invoked in the quotation on the title page of this volume.

The three plenary speakers for the workshop during which the input for this report primarily was gathered—Raymond Orbach, Peter Lax, and John Grosh—perfectly set the perspective of legacy, opportunity, and practical challenges for large-scale simulation. In many ways, the challenges we face come from the same directions as those reported by the National Science Board panel chaired by Peter Lax in 1982. However, progress along the path in the past two decades is astounding—literally factors of millions in computer capabilities (storage capacity, processor speed) and further factors of millions from working smarter (better models, optimal and adaptive algorithms). Without the measurable progress of which the speakers reminded us, the aggressive extrapolations of this report might be less credible. With each order-of-magnitude increase in capabilities comes increased ability to understand or control aspects of the universe and to mitigate or improve our lot in it.

The DOE staff members who facilitated this report often put in the same unusual hours as the scientific participants in breakfast meetings, late-evening discussions, and long teleconferences. They did not steer the outcome, but they shared their knowledge of how to get things accomplished in projects that involve hundreds of people; and they were incredibly effective. Many provided assistance along the way, but Dan Hitchcock and Chuck Romine, in particular, labored with the scientific core committee for four months.


Other specialists have left their imprints on this report. Devon Streit of DOE Headquarters is a discussion leader nonpareil. Joanne Stover and Mark Bayless of Pacific Northwest National Laboratory built and maintained web pages, often updated daily, that anchored workshop communication among hundreds of people. Vanessa Jones of Oak Ridge National Laboratory's Washington office provided cheerful hospitality to the core planners. Jody Shumpert of Oak Ridge Institute for Science and Education (ORISE) single-handedly did the work of a conference team in preparing the workshop and a pre-workshop meeting in Washington, D.C. The necessity to shoehorn late-registering participants into Arlington hotels during high convention season, when the response to the workshop outstripped planning expectations, only enhanced her display of unflappability. Rachel Smith and Bruce Warford, also of ORISE, kept participants well directed and well equipped in ten-way parallel breakouts. Listed last, but of first importance for readers of this report, members of the Creative Media group at Oak Ridge National Laboratory iterated tirelessly with the editors in meeting production deadlines and creating a presentation that outplays the content.

The editor-in-chief gratefully acknowledges departmental chairs and administrators at two different universities who freed his schedule of standard obligations and thus directly subsidized this report. This project consumed his last weeks at Old Dominion University and his first weeks at Columbia University and definitely took his mind off of the transition.

All who have labored to compress their (or their colleagues') scientific labors of love into a few paragraphs, or to express themselves with minimal use of equations and jargon, will find ten additional readers for each co-specialist colleague they thereby de-targeted. All involved in the preparation of this report understand that it is a snapshot of opportunities that will soon be outdated in their particulars, and that it may prove either optimistic or pessimistic in its extrapolations. None of these outcomes detracts from the confidence of its contributors that it captures the current state of scientific simulation at a historic inflection point and is worth every sacrifice and risk.


APPENDIX 1: A BRIEF CHRONOLOGY OF "A SCIENCE-BASED CASE FOR LARGE-SCALE SIMULATION"

Under the auspices of the Office of Science of the U.S. Department of Energy (DOE), a group of 12 computational scientists from across the DOE laboratory complex met on April 23, 2003, with 6 administrators from DOE's Washington, D.C., offices and David Keyes, chairman, to work out the organization of a report to demonstrate the case for increased investment in computational simulation. They also planned a workshop to be held in June to gather input from the computational science community to be used to assemble the report. The group was responding to a charge given three weeks earlier by Dr. Walt Polansky, the Chief Information Officer of the Office of Science (see Appendix 2). The working acronym "SCaLeS" was chosen for this initiative.

From the April meeting, a slate of group leaders for 27 breakout sessions (11 in applications, 8 in mathematical methods, and 8 in computer science and infrastructure) was assembled and recruited. The short notice of the chosen dates in June meant that two important sub-communities, devoted to climate research and to Grid infrastructure, could not avoid conflicts with their own meetings. They agreed to conduct their breakout sessions off-site at the same time as the main workshop meeting. Thus 25 of the breakout groups would meet in greater Washington, D.C., and 2 would meet elsewhere on the same dates and participate synchronously in the workshop information flows.

Acting in consultation with their scientific peers, the group leaders drew up lists of suggested invitees to the workshop so as to guarantee a critical mass in each breakout group, in the context of a fully open workshop. Group leaders also typically recruited one junior investigator "scribe" for each breakout session. This arrangement was designed both to facilitate the assembly of quick-look plenary reports from breakouts at the meeting and to provide a unique educational experience to members of the newest generation of DOE-oriented computational scientists.

Group leaders, invitees, and interested members of the community were then invited to register at the SCaLeS workshop website, which was maintained as a service to the community by Pacific Northwest National Laboratory.

A core planning group of approximately a dozen people, including Daniel Hitchcock and Charles Romine of the DOE Office of Advanced Scientific Computing Research, participated in weekly teleconferences to attend to logistics and to organize the individual breakout sessions at the workshop in a way that guaranteed multidisciplinary viewpoints in each discussion.

More than 280 computer scientists, computational mathematicians, and computational scientists attended the June 24 and 25, 2003, workshop in Arlington, Virginia, while approximately 25 others met remotely. The workshop consisted of three plenary talks and three sessions of topical breakout groups, followed by plenary summary reports from each session, and discussion periods. Approximately 50% of the participant-contributors were from DOE laboratories, approximately 35% from universities, and the remainder from other agencies or from industry.

In the plenary talks, Raymond Orbach, director of the DOE Office of Science, charged the participants to document the scientific opportunities at the frontier of simulation. He also reviewed other Office of Science investments in simulation, including a new program at the National Energy Research Scientific Computing Center that allocates time on the 10 Tflop/s Seaborg machine for grand challenge runs. Peter D. Lax of the Courant Institute, editor of the 1982 report Large-Scale Computing in Science and Engineering, provided a personal historical perspective on the impact of that earlier report and advised workshop participants that they have a similar calling in a new computational era. To further set the stage for SCaLeS, John Grosh, co-chair of the High-End Computing Revitalization Task Force (HECRTF), addressed the participants on the context of this interagency task force, which had met the preceding week. (While the SCaLeS workshop emphasized applications, the HECRTF workshop emphasized computer hardware and software.) Finally, Devon Streit of DOE's Washington staff advised the participants on how to preserve their focus in time-constrained multidisciplinary discussion groups so as to arrive at useful research roadmaps.

From the brief breakout group plenary reports at the SCaLeS workshop, a total of 53 co-authors drafted 27 chapters documenting scientific opportunities through simulation and how to surmount barriers to realizing them through collaboration among scientists, mathematicians, and computer scientists. Written under an extremely tight deadline, these chapters were handed off to an editorial team consisting of Thom Dunning, Phil Colella, William Gropp, and David Keyes. This team likewise worked under an extremely tight schedule, iterating with authors to encourage and enforce a measure of consistency while allowing for variations that reflect the intrinsically different natures of simulation throughout science.

Following a heroic job of technical production by Creative Media at Oak Ridge National Laboratory, A Science-Based Case for Large-Scale Simulation is now in the hands of the public that is expected to be the ultimate beneficiary not merely of an interesting report but of the scientific gains toward which it points.


APPENDIX 2: CHARGE FOR “A SCIENCE-BASED CASE FOR LARGE-SCALE SIMULATION”

Date: Wed, 2 Apr 2003 09:49:30 -0500
From: "Polansky, Walt"
Subject: Computational sciences—Exploration of New Directions

Distribution:

I am pleased to report that David Keyes ([email protected]), of Old Dominion University, has agreed to lead a broad-based effort on behalf of the Office of Science to identify rich and fruitful directions for the computational sciences from the perspective of scientific and engineering applications. One early, expected outcome from this effort will be a strong science case for an ultra-scale simulation capability for the Office of Science.

David will be organizing a workshop (early summer 2003, Washington, D.C. metro area) to formally launch this effort. I expect this workshop will address the major opportunities and challenges facing computational sciences in areas of strategic importance to the Office of Science. I envision a report from this workshop by July 30, 2003. Furthermore, this workshop should foster additional workshops, meetings, and discussions on specific topics that can be identified and analyzed over the course of the next year.

Please mention this undertaking to your colleagues. I am looking forward to their support and their participation, along with researchers from academia and industry, in making this effort a success.

Thanks.

Walt P.


ACRONYMS AND ABBREVIATIONS

ASCI	Accelerated Strategic Computing Initiative
ASCR	Office of Advanced Scientific Computing Research (DOE)
BER	Office of Biology and Environmental Research (DOE)
BES	Office of Basic Energy Sciences (DOE)
DOE	U.S. Department of Energy
FCCSET	Federal Coordinating Council for Science, Engineering, and Technology
FES	Office of Fusion Energy Sciences (DOE)
GCM	general circulation model
HECRTF	High-End Computing Revitalization Task Force
HENP	Office of High Energy and Nuclear Physics (DOE)
Hypre	a solver library
ISIC	Integrated Software Infrastructure Center
ITER	International Thermonuclear Experimental Reactor
M3D	a fusion simulation code
NIMROD	a fusion simulation code
NNSA	National Nuclear Security Administration (DOE)
NSF	National Science Foundation
NUSEL	National Underground Laboratory for Science and Engineering
NWCHEM	a chemistry simulation code
ORISE	Oak Ridge Institute for Science and Education
PETSc	Portable, Extensible Toolkit for Scientific Computing, a solver library
PITAC	President's Information Technology Advisory Committee
QCD	quantum chromodynamics
RHIC	Relativistic Heavy Ion Collider
RIA	Rare Isotope Accelerator
SC	Office of Science (DOE)
SciDAC	Scientific Discovery through Advanced Computing
TSI	Terascale Supernova Initiative


“We can only see a short distance ahead, but we can see plenty there that needs to be done.”

Alan Turing (1912–1954)