
NSF Blue Ribbon Panel on High Performance Computing


August 1993

Dedication

This report is dedicated to one of the nation's most distinguished computer scientists, a builder of important academic institutions, and a devoted and effective public servant: Professor Nico Habermann. Dr. Habermann took responsibility for organizing this Panel's work and saw it through to completion, but passed away just a few days before it was presented to the National Science Board. The members of the Panel deeply feel the loss of his creativity, wisdom, and friendship.

FROM DESKTOP TO TERAFLOP:
EXPLOITING THE U.S. LEAD IN HIGH PERFORMANCE COMPUTING

NSF Blue Ribbon Panel on High Performance Computing
August 1993

Lewis Branscomb (Chairman)
Theodore Belytschko
Peter Bridenbaugh
Teresa Chay
Jeff Dozier
Gary S. Grest
Edward F. Hayes
Barry Honig
Neal Lane (resigned from Panel, July 1993)
William Lester, Jr.
Gregory J. McRae
James A. Sethian
Burton Smith
Mary Vernon

"It is easier to invent the future than to predict it" - Alan Kay

EXECUTIVE SUMMARY

An Introductory Remark: Many reports are prepared for the National Science Board and the National Science Foundation that make an eloquent case for more resources for one discipline or another. This is not such a report. This report addresses an opportunity to accelerate progress in virtually every branch of science and engineering concurrently, while also giving a shot in the arm to the entire American economy as business firms also learn to exploit these new capabilities. The way much of science and engineering is practiced will be transformed if our recommendations are implemented.

The National Science Board can take pride in the Foundation's accomplishments in the decade since it implemented the recommendations of the Peter Lax Report on high performance computing (HPC). The Foundation's High Performance Computing Centers continue to play a central role in this successful strategy, creating an enthusiastic and demanding set of sophisticated users who have acquired the specialized computational skills required to use the fast-advancing but still immature high performance computing technology.

Stimulated by this growing user community, the HPC industry finds itself in a state of excitement and transition. The very success of the NSF program, together with those of sister agencies, has given rise to a growing variety of new experimental computing environments, from massively parallel systems to networks of coupled workstations, that could, with the right research investments, produce entirely new levels of computing power, economy, and usability. The U.S. enjoys a substantial lead in computational science and in the emerging technology; it is urgent that the NSF capitalize on this lead, which not only offers scientific preeminence but also the industrial lead in a growing world market.

The vision of the rapid advances in both science and technology that the new generation of supercomputers could make possible has been shown to be realistic. This very success, measured in terms of new discoveries, the thousands of researchers and engineers who have gained experience in HPC, and the extraordinary technical progress in realizing new computing environments, creates its own challenges. We invite the Board to consider four such challenges:

Challenge 1: How can NSF, as the nation's premier agency funding basic research, remove existing barriers to the rapid evolution of high performance computing, making it truly usable by all the nation's scientists and engineers? These barriers are of two kinds: technological barriers (primarily to realizing the promise of highly parallel machines, workstations, and networks) and implementation barriers (new mathematical methods and new ways to formulate science and engineering problems for efficient and effective computation). An aggressive commitment by NSF to leadership in research and prototype development, in both computer science and in computational science, will be required.

Challenge 2: How can NSF provide scalable access to a pyramid of computing resources, from the high performance workstations needed by most scientists to the critically needed teraflop-and-beyond capability required for solving Grand Challenge problems? What balance among high performance desktop workstations, mid-range or mini-supercomputers, networks of workstations, and remote, shared supercomputers of very high performance should NSF anticipate and encourage?

Challenge 3: The third challenge is to encourage the continued broadening of the base of participation in HPC, both in terms of institutions and in terms of skill levels and disciplines. This calls for expanded education and training, and participation by state-based and other HPC institutions.

Challenge 4: How can NSF best create the intellectual and management leadership for the future of high performance computing in the U.S.? What role should NSF play within the scope of the nationally coordinated HPCC program? What relationships should NSF's activities in HPC have to the activities of other federal agencies?

This report recommends significant expansion in NSF investments, both in accelerating progress in high performance computing through computer and computational science research and in providing the balanced pyramid of computing facilities to the science and engineering communities. The cost estimates are only approximate, but in total they do not exceed the Administration's stated intent to double the investments in HPCC during the next 5 years. We believe these investments are not only justified but are compatible with stated national plans, both in absolute amount and in their distribution.

RECOMMENDATIONS:

We have four sets of interdependent recommendations. The first implements a balanced pyramid of computing environments (see Figure A following this Summary). Each element in the pyramid supports the others; whatever resources are applied to the whole, the balance in the pyramid should be sustained. The second set addresses the essential research investments and other steps to remove the obstacles to realizing the technologies in the pyramid and the barriers to the effective use of these environments.

The third set addresses the institutional structure for delivery of HPC capabilities, and consists itself of a pyramid (see Figure B following this Summary), of which the NSF Centers are an important part. At the base of the institutional pyramid is the diverse array of investigators in their universities and other settings, who use all the facilities at all levels of the pyramid.

At the next level are departments and research groups devoted to specific areas of computer science or computational science and engineering. At the next level are the NSF HPC Centers, which must continue to be providers of shared high capability computing systems and to provide aggregations of specialized capability for all aspects of use and advance of high performance computing. At the apex is the national teraflop-class facility, which we recommend as a multi-agency facility pushing the frontiers of high performance into the next decade. A final recommendation addresses the NSF's role at the national level and its relationship with the states in HPC.

A. CENTRAL GOAL FOR NSF HPC POLICY

Recommendation A-1: The National Science Board should take the lead, under OSTP guidance and in collaboration with ARPA, DoE and other agencies, to expand access to all levels of the dynamically evolving pyramid of high performance computing capability for all sectors of the whole nation. The realization of this pyramid depends, of course, on rapid progress in the pyramid's technologies. The computational capability we envision includes not only the research capability for which NSF has special stewardship, but also includes a rapid expansion of capability in business and industry to use HPC profitably, and many operational uses of HPC in commercial and military activities.

VISION OF THE HPC PYRAMID

Recommendation A-2: At the apex of the pyramid is the need for a national capability at the highest level of computing power the industry can support with both efficient software and hardware. A reasonable goal would be the design, development, and realization of a national teraflop-class capability, subject to the successful development of software and computational tools for such a large machine (Recommendation B-1). NSF should initiate, through OSTP, an interagency plan to make this investment, anticipating multi-agency funding and usage.

Recommendation A-3: Over a period of 5 years the research universities should be assisted to acquire mid-range machines. These mid-sized machines are the underfunded element of the pyramid today - about 10% of NSF's FY92 HPC budget is devoted to their acquisition. They are needed both for demanding science and engineering problems that do not require the very maximum in computing capacity, and for use by the computer science and computational mathematics community in addressing the architectural, software, and algorithmic issues that are the primary barriers to progress with massively parallel processor architectures.

Recommendation A-4: We recommend that NSF double the current annual level of investment ($22 million) providing scientific and engineering workstations to its 20,000 principal investigators. Within 4 or 5 years workstations delivering up to 400 megaflops and costing no more than $15,000 to $20,000 should be widely available. For education and a large fraction of the computational needs of science and engineering, these facilities will be adequate.

Recommendation A-5: We recommend that the NSF expand its New Technologies program to support expanded testing of the new parallel configurations for HPC applications. For example, the use of Gigabit local area networks to link workstations may meet a significant segment of mid-range HPC science and engineering applications. A significant supplement to HPC applications research capacity can be had with minimal additional cost if such collections of workstations prove practical and efficient.

B. RECOMMENDATIONS TO IMPLEMENT THESE GOALS

REMOVING BARRIERS TO HPC TECHNICAL PROGRESS AND HPC USAGE

Recommendation B-1: To accelerate progress in developing the HPC technology needed by users, NSF should create, in the Directorate for Computer and Information Science and Engineering, a challenge program in computer science with grant size and equipment access sufficient to support the systems and algorithm research needed for more rapid progress in HPC capability. The Centers, in collaboration with hardware and software vendors, can provide test platforms for much of this work, and Recommendation A-3 provides the hardware support required for initial development of prototypes.

Recommendation B-2: A significant barrier to rapid progress in HPC application lies in the formulation of the computational strategy for solving a scientific or engineering problem. In response to Challenge 1, the NSF should focus attention, both through CISE and through its disciplinary program offices, on support for the design and development of computational techniques, algorithmic methodology, and mathematical, physical and engineering models to make efficient use of the machines.

BALANCING THE PYRAMID OF HPC ACCESS

Recommendation B-3: We recommend NSF set up a task force to develop a way to ameliorate the imbalance in the HPC "pyramid" - the under-investment in the emerging mid-range scalable, parallel computers and the inequality of access to stand-alone (but potentially networked) workstations in the disciplines. This implementation plan should involve a combination of funding by disciplinary program offices and some form of more centralized allocation of NSF resources.

C. THE NSF HPC CENTERS

Recommendation C-1: The Centers should be retained and their missions should be reaffirmed. However, the NSF HPC effort now embraces a variety of institutions and programs - HPC Centers, Engineering Research Centers, and Science & Technology Centers devoted to HPC research, and disciplinary investments in computer and computational science and applied mathematics - all of which are essential elements of the HPC effort needed for the next decade. Furthermore, HPC institutions outside the NSF orbit also contribute to the goals for which the NSF Centers are chartered. Thus we ask the Board to recognize that the overall structure of the HPC program at NSF will have more institutional diversity, more flexibility, and more interdependence with other agencies and private institutions than was possible in the early years of the HPC initiative.

The NSF should continue its current practice of encouraging HPC Center collaboration, both with one another and with other entities engaged in HPC work. The division of the support budget into one component committed to the Centers and another for multi-center activities is a useful management tool, even though it may have the effect of reducing competition among Centers. The National Consortium for HPC (NCHPC), formed by NSF and ARPA, is a welcome measure as well.

Recommendation C-2: The current situation in HPC is more exciting, more turbulent, and more filled with promise of really big benefits to the nation than at any time since the Lax report; this is not the time to "sunset" a successful, changing venture, of which the Centers remain an important part. Furthermore, we also recommend against re-competition of the four Centers at this time, favoring periodic performance evaluation and competition for some elements of their activities, both among Centers and, when appropriate, with other HPC Centers such as those operated by states (see Recommendation D-1).

Recommendation C-3: The mission of the Centers is to foster rapid progress in the use of HPC by scientists and engineers, to accelerate progress in the usability and economy of HPC, and to diffuse HPC capability throughout the technical community, including industry. Provision to scientists and engineers of access to leading edge supercomputer resources will continue to be a primary purpose of the Centers. The following additional components of the Center missions should be affirmed:

• Supporting computational science, by research and demonstration in the solution of significant science and engineering problems.

• Fostering interdisciplinary collaboration - across sciences and between sciences and computational science and computer science - as in the Grand Challenge programs.

• Prototyping and evaluating software, new architectures, and the uses of high speed data communications in collaboration with computer and computational scientists, disciplinary scientists exploiting HPC resources, the HPC industry, and business firms exploring expanded use of HPC.

• Training and education, from post-docs and faculty specialists, to the introduction of less experienced researchers to HPC methods, to collaboration with state and regional HPC centers working with high schools and community colleges.

ALLOCATION OF CENTER HPC RESOURCES TO INVESTIGATORS

Recommendation C-4: The NSF should continue to monitor the administrative procedures used to allocate Center resources, and the relationship of this process to the initial funding of the research by the disciplinary program offices, to ensure that the burden on scientists applying for research support is minimized. NSF should continue to provide HPC resources to the research community through allocation committees that competitively evaluate proposals for use of Center resources.

EDUCATION AND TRAINING

Recommendation C-5: The NSF should give strong emphasis to its education mission in HPC, and should actively seek collaboration with state-sponsored and other HPC centers not supported primarily on NSF funding. Supercomputing regional affiliates should be candidates for NSF support, with education as a key role. HPC will also figure in the Administration's industrial extension program, in which the states have the primary operational role.

D. NSF AND THE NATIONAL HPC EFFORT; RELATIONSHIPS WITH THE STATES

Recommendation D-1: We recommend that NSF urge OSTP to establish an advisory committee representing the states, HPC users, NSF Centers, computer manufacturers, and computer and computational scientists (similar to the Federal Networking Council's Advisory Committee), which should report to HPCCIT. A particularly important role for this body would be to facilitate state-federal planning related to high performance computing.

Figure A. PYRAMID OF HIGH PERFORMANCE COMPUTING ENVIRONMENTS (apex to base):
• Teraflop class
• Center supercomputers
• Mid-range parallel processors; networked workstations
• High performance workstations

Figure B. PYRAMID OF HIGH PERFORMANCE COMPUTING INSTITUTIONS (apex to base):
• National Teraflop facility
• NSF HPC Centers; other agency and state Centers
• Departments, institutes, laboratories
• Subject-specific Computer Science and Computational Science and Engineering groups
• Individual investigators and small groups

INTRODUCTION AND BACKGROUND

A revolution is underway in the practice of science and engineering, arising from advances in computational science and new models for scientific phenomena, and made possible by advances in computer science and technology. The importance of this revolution is not yet fully appreciated because of the limited fraction of the technical community that has developed the skills required and has access to high performance computational resources. These skill and access barriers can be dramatically lowered, and if they are, a new level of creativity and progress in science and engineering may be realized which will be quite different from that known in the past. This report is about that opportunity for all of science and engineering; it is not about the needs of one or two specialized disciplines.

A little over a decade ago, the National Science Board convened a panel chaired by Prof. Peter Lax to explore what NSF should do to exploit the potential for science and industry of the rapid advances in high performance computing.[1] The actions taken by the NSF with the encouragement of the Board to implement the "Large Scale Computing in Science and Engineering" Report of 1982 have helped computing foster a revolution in science and engineering research and practice, in academic institutions and to a lesser extent in industrial applications. At the time, centralized facilities were the only way to provide access to high performance computing, which compelled the Lax panel to recommend the establishment of NSF Supercomputer Centers interconnected by a high speed network. The new revolution is characterized both by advances in the power of supercomputers and by the diffusion throughout the nation of access to and experience with using high performance computing.[2] This success has opened up a vast set of new research and applications problems amenable to solution through high levels of computational power and better computational tools.

The key features of the new capabilities include:

• The power of the big, multiprocessing vector supercomputers, today's workhorse of supercomputing, has increased by a factor of 100 to 200 since the Lax Report.[3]

• An exciting array of massively parallel processors (MPP) have appeared in the market, offering three possibilities: an acceleration in the rate of advance of peak processing power, an improvement in the ratio of performance to cost, and the option to grow the power of an installation incrementally as the need arises.[4]

• Switched networks based on high speed digital communications are extending access to major computational facilities, permitting the dynamic redeployment of computing power to suit the users' needs, and improving connectivity among collaborating users.

• Technical progress in computer science and microelectronics has transformed yesterday's supercomputers into today's emerging desktop workstations. These workstations offer more flexible tradeoffs between ease of access and inherent computing power and can be coupled to the largest supercomputers over a national network, used in locally-networked clusters, or as stand-alone processors.

• Advances in computer architectures, computational mathematics, algorithmic modeling, and software are solving some of the most intractable but important scientific, technical, and economic problems facing our society.

[1] Report of the Panel on Large Scale Computing in Science and Engineering, Peter Lax, chairman, commissioned by the National Science Board in cooperation with the U.S. Department of Defense, Department of Energy, and the National Aeronautics and Space Administration, December 26, 1982.

[2] With every new generation of computing machines, the capability associated with "high performance computing" changes. High performance computing (HPC) may be defined as "a computation and communications capability that allows individuals and groups to extend their ability to solve research, design, and modelling problems substantially beyond that available to them before." This definition recognizes that HPC is a relative and changing concept. For the PC user a scientific workstation is high performance computing. For the technical people with specialized skill in computational science and access to high performance facilities, a reasonable level for 1992-1993 might be 1 Gflop for a vector machine and 2 Gflops for an MPP system.

[3] As noted in Appendix C, the clock speed of a single processor has only increased by a factor of 5 to 6 since 1976, but a 16-way C-90 with one additional vector pipe multiplies the effective speed by the estimated factor of a hundred or more.

[4] The promise (not yet realized) of massively parallel systems is a much higher degree of installed capacity expandability with minimal disruption to the user's programming.

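As a rough consistency check, footnote 3's figures can be combined directly, assuming the extra vector pipe roughly doubles per-processor throughput (the doubling is an assumption made here for illustration, not a figure stated in the footnote):

$$ (5\ \text{to}\ 6)_{\text{clock}} \times 16_{\text{processors}} \times 2_{\text{vector pipes}} \approx 160\ \text{to}\ 190, $$

which is broadly in line with the factor of 100 to 200 quoted in the first bullet above.
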
To address these changes, the National Science Board charged this panel with taking a fresh look at the current situation and new directions that might be required. (See Appendix A for institutional identification of the panel membership and Appendix B for historical background leading to the present study and the Charge to the Panel.)

To provide both direction and potential to exploit these advances, a leadership role for the NSF continues to be required. The goal of this report is to suggest how NSF should evolve its role in high performance computing. Our belief that NSF can and should continue to exert influence in these fields is based in part on its past successes achieved through the NSF Program in High Performance Computing and Communications.

Achievements Since the Lax Report

In the past 10 years, the NSF Program in High Performance Computing and Communications has:

• Facilitated many new scientific discoveries and new industrial processes, and supported fundamental work which has led to advances in architectures, tools and algorithms for computational science.

In Appendix E of this report several panel members describe examples of those accomplishments and suggest their personal visions for what may be even more dramatic progress in the future.

• Supported fundamental work in computer science and engineering which has led to advances in architectures, tools, and algorithms for computational science.

• Initiated collaborations with many companies to help them realize the economic and technological benefits of high performance computing.

Caterpillar Inc. uses supercomputing to model diesel engines in an attempt to reduce emissions.

Dow Chemical Company simulates and visualizes fluid flow in chemical processes to ensure complete mixing.

USX has turned to supercomputing to improve the hot rolling process-control systems used in steel manufacturing.

Solar Turbine, Inc. applies computational finite-element methods to the design of very complex mechanical systems.

• Opened up supercomputer access to a wide range of researchers and industrial scientists and engineers.

This was one of the key recommendations of the Lax Report. The establishment of the four NSF Supercomputer Centers (in addition to NCAR) has been extraordinarily successful. By providing network access, through the NSFNET and Internet linkages, NSF has put these computing resources at the fingertips of scientists, engineers, mathematicians and other professionals all over the nation. Users seldom need to go personally to these Centers; in fact, the distribution of computational cycles by the four NSF Supercomputer Centers shows surprisingly little geographic bias. This extension of compute power, away from dedicated, on-site facilities and towards a seamless national computing environment, has been instrumental in creating the conditions required for advances on a broad front in science, engineering, and the tools of computational science. There seems to be a lack of geographic bias in users - Figure 1 in Appendix D shows users widely distributed across the United States.

• Educated literally thousands of scientists, engineers and students, as well as a new generation of researchers who now use computational science equally with theory and experiment.

At the time of the Lax Report, access to the most advanced facilities was restricted to a relatively small set of users. Furthermore, supercomputing was regarded by many scientists as either an inaccessible tool or as an inelegantly brute-force approach to science. The NSF program successfully inoculated virtually all of the disciplines with the realization that HPC is both a powerful and a practical tool for many purposes. These NSF initiatives have not only pushed the technology and computational science ahead in sophistication and power, they have helped bring high performance computing to a large fraction of the technical community. There has been a 5-fold increase in the number of NSF-funded scientists using HPC and a 5-fold increase in the ratio of graduate students to faculty using HPC through the NSF Supercomputer Centers. (See Figure 2 of Appendix D.)

• Provided the HPC industry a committed, enthusiastic, and dedicated class of expert users who share their experience and ideas with vendors, accelerating the evolutionary improvement in the technology and its software.

One of the problems in the migration of new technologies from experimental environments to production modes is the inherent risk in committing substantial resources towards converting existing codes and developing software tools. The NSF Supercomputer Centers have provided a proving ground for these new technologies; various industrial players have entered into partnerships with the Centers aimed at accelerating this migration while maintaining solid and reliable underpinnings.

• Encouraged the Supercomputer Centers to leverage their relationship with HPC producers to reduce the cost of bringing innovation to the scientific and engineering communities.

In recognition of Center activity in improving early versions of hardware and software for high performance computing systems, the computer industry has provided equipment at favorable prices and important technical support. This has allowed researchers earlier and more useful access to HPC facilities than might have been the case under commercial terms.

• Joined into successful partnerships with other agencies to make coordinated contributions to the U.S. capability in HPC.

A decade ago the United States enjoyed a world-wide commercial lead in vector systems. In part as the result of more recent development and procurement actions of the Advanced Research Projects Agency, the Department of Energy, and the National Science Foundation, the U.S. now has the dominant lead in providing new Massively Parallel Processing (MPP) systems.[5] As an example, the NSF has enabled NSF Supercomputer Center acquisitions of scalable parallel systems first developed under seed money provided by ARPA, and thus has been instrumental in leveraging ARPA projects into the mainstream.[6] (Figure 3 of Appendix D shows data on the uptake of advanced computing by sector across the world.)

[5] Massively parallel computers are constructed from large numbers of separate processors linked by high speed communications providing access to each other and to shared I/O devices and/or computer memory. There are many different architectural forms of MPP machines, but they have in common economies of scale from the use of microprocessors produced at high volumes and the ability to combine them at many levels of aggregation. The challenge in using such machines is to formulate the problem so that it can be decomposed and run efficiently on most or all of the processors concurrently. Some scientific problems lend themselves to parallel computation much more easily than others, suggesting that the improved utility of MPP machines will not be available in all fields of science at once.

[6] Scalable parallel machines are those in which the number of processor nodes can be expanded over a wide range without substantial changes in either the shared hardware or the application interfaces of the operating system.

The Lax Report

All of these accomplishments have, in large part, arisen from the response by NSF to the recommendations of the 1982 Lax report, "Large Scale Computing in Science and Engineering". These recommendations included:

• Increase access to regularly upgraded supercomputing facilities via high bandwidth networks.

• Increase research in computational mathematics, software, and algorithms.

• Train people in scientific computing.

• Invest in research on new supercomputer systems.

For several reasons, NSF's investment in computational research and training has been a startling success. First, there has been a widespread acceptance of computational science as a vital component and tool in scientific and technological understanding. Second, there have been revolutionary advances in computing technology in the past decade. And third, the demonstrated ability to solve key critical problems has advanced the progress of mathematics, science and engineering in many important ways, and has created great demand for additional HPC resources.

The New Opportunities in Science and in Industry

As discussed in detail in the Appendix E essays, the prospects are for dramatic progress in science and engineering and for rapid adoption of computational science in industry. The next major HPC revolution may well be in industry, which is still seriously under-utilizing HPC (with some exceptions such as aerospace, automotive, and microelectronics).

The success of the chemical industry in designing and simulating pilot plants, of the aircraft industry in simulating wind tunnels and performing dynamic design evaluation, and of the electronics industry in designing integrated circuits and modelling the performance of computers and networks suggests the scale of available opportunities. The most important requirements are (a) improving the usability and efficiency problems of high performance machines, and (b) training in HPC for people going into industry. The Supercomputer Centers have demonstrated they can introduce the commercial sector to HPC at little cost, and with high potential benefits to the economy (productivity of industry and stimulation of markets for U.S. HPC vendors). Success in stimulating HPC usage in industry will also accelerate the need for HPC education and technology, thus exploiting the benefits of collaboration with universities and vendors. The Centers' role can be a catalytic one, but often rises to the level of a true collaborative partnership with industry, to the mutual advantage of the firm and the NSF Centers. As industrial uses of HPC grow, the scientists, mathematicians, and engineers benefit from the falling costs and rising usability of the new equipment. In addition, the technological uses of HPC spur new and interesting problems in science. The following chart indicates the increasing importance of advanced computing in industry.

Cray Research Inc. supercomputer sales

Era            Percent to government   Percent to industry   Percent to universities
Early 1980s             70                     25                       5
Late 1980s              60                     25                      15
Today                   40                     40                      20

The New Technology

Most HPC production work being done today uses big vector machines in single processor (or loosely coupled multiprocessor) mode. Vectorizing compilers and other software tools are well tested and many people have been trained in their use. These big shared memory machines will continue to be the mainstay of high performance computing, at least for the next 5 years or so, and perhaps beyond if the promise of massively parallel supercomputing is delayed longer than many expect.

New desktop computers have made extraordinary gains in cost-performance (driven by competition in commodity microprocessor production). Justin Rattner of Intel estimated that in 1996 microprocessors with clock speeds of 200 MHz may power an 800 Mflops peak speed workstation.[7] He, and others from the industry, predicted the convergence of the clock speeds of microprocessor chips and the large vector machines such as the Cray C90, perhaps as soon as 1995. They held out the likelihood that in 1997 microprocessors may be available at 1 gigaflop; a desktop PC might be available with this speed for $10,000 or less. Mid-range workstations will also show great growth in capacity; today one can purchase a mid-range workstation with a clock speed of 200 MHz for an entry price of $40,000 to $50,000.

[7] The instruction execution speeds of scientific computers are generally reckoned in the number of floating point instructions that can be executed in one second. Thus a 1 Megaflop machine executes 1 million floating point instructions per second, a Gigaflop would be one billion instructions per second, and a Teraflop 10^12 floating point instructions per second. Since different computer architectures may have quite different instruction sets, one "flop" may not be the same as another, either in application power or in the number of machine cycles required. To avoid such difficulties, those who want to compare machines of different architectures generally use a benchmark suite of test cases to measure overall performance on each machine.

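As an illustration of how such a measurement is made, a minimal timing kernel might look like the sketch below. The vector length, repetition count, and the choice of a DAXPY loop are arbitrary illustrative assumptions, not any particular benchmark suite; production suites of the period, such as LINPACK or the NAS kernels, are far more elaborate, but the principle is the same: measure a representative workload rather than compare peak ratings.

    /* Minimal sketch: estimate sustained Mflops with a DAXPY kernel
     * (y = a*x + y), counting 2 floating point operations per element.
     * Vector length and repetition count are illustrative choices only. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    #define N     1000000L   /* vector length */
    #define REPS  50         /* repetitions   */

    int main(void)
    {
        double *x = malloc(N * sizeof *x);
        double *y = malloc(N * sizeof *y);
        if (x == NULL || y == NULL)
            return 1;

        for (long i = 0; i < N; i++) {
            x[i] = 1.0;
            y[i] = 2.0;
        }

        const double a = 3.14159;
        clock_t start = clock();
        for (int r = 0; r < REPS; r++)
            for (long i = 0; i < N; i++)
                y[i] = a * x[i] + y[i];      /* 2 flops per element */
        double seconds = (double)(clock() - start) / CLOCKS_PER_SEC;
        if (seconds <= 0.0)
            seconds = 1.0 / CLOCKS_PER_SEC;  /* guard against timer granularity */

        /* Use the result so the compiler cannot discard the loop. */
        double checksum = 0.0;
        for (long i = 0; i < N; i += N / 10)
            checksum += y[i];

        double mflops = (2.0 * N * REPS) / seconds / 1.0e6;
        printf("DAXPY: %.1f Mflops sustained (checksum %.3f)\n", mflops, checksum);

        free(x);
        free(y);
        return 0;
    }

The reported rate is sustained performance for this one kernel only; different kernels, and different architectures, will give different figures, which is exactly why whole suites of test cases are used.
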
Thus a technical transition is underway from a world in which uniprocessor supercomputers were distinguished from desktop machines by having much faster cycle times, to a world in which cycle times converge and the highest levels of computer power will be delivered through parallelism, memory size and bandwidth, and I/O speed. The widespread availability of scientific workstations will accelerate the introduction of more scientists and engineers to high performance computing, resulting in a further acceleration of the need for higher performance machines. Early exploration of message-passing distributed operating systems gives promise of loosely-coupled arrays of workstations being used to process large problems in the background and when the workstations are unused at night, as well as coupling the workstations (on which problems are initially designed and tested) to the supercomputers located at remote facilities.

Of course, the faster microprocessors also make possible new MPP machines of ever increasing peak processing speed. MPP is catching on fast, as researchers with sufficient expertise (and diligence) in computational science are solving a growing number of applications that lend themselves to highly parallel architectures. In some cases those investigators are realizing a ratio of achieved to theoretical peak performance approaching that of vector machines, with significant cost-performance advantages. Efficient use of MPP on the broad range of scientific and engineering problems is still beyond the reach of most investigators, however, because of the expertise and effort required. Thus the first speculative phase of MPP HPC is coming to an end, but its ultimate potential is still uncertain and largely unrealized.

Limiting progress in all three of these technologies is a set of architecture and software issues that are discussed below in Recommendations B. Principal among them is the evolution of a programming model that can allow portability of applications software across architectures. These technical issues are discussed at greater length in Appendix C.

FOUR CHALLENGES FOR NSF

High performance computing is changing very fast, and NSF policy must chase a moving target. For that reason, the strategy adopted must be agile and flexible in order to capitalize on past investments and adapt to the emerging opportunities. The Board and the Foundation face four central challenges, on which we will make specific recommendations for policy and action. These challenges are:

• Removing barriers to the rapid evolution of HPC capability

• Providing scalable access to all levels of HPC capability

• Finding the right incentives to promote access to all three levels of the computational power pyramid

• Creating NSF's intellectual and management leadership for the future of high performance computing in the U.S.

CHALLENGE NO. 1: Removing barriers to the rapid evolution of HPC capability

How can NSF, as the nation's premier agency funding basic research, remove existing barriers to the rapid evolution of High Performance Computing? These barriers are of two kinds: technological barriers (primarily to realizing the promise of highly parallel machines, workstations, and networks) and exploitation barriers (new mathematical methods and new ways to formulate science and engineering problems for efficient and effective computation). An aggressive commitment by NSF to leadership in research and prototype development, in both computer science and computational science, will be required. Indeed, NSF's position as the leading provider of HPC capability to the nation's scientists and engineers will be strengthened if it commands a leadership role in technical advances in both areas, which will contribute to the nation's economic position as well as its position as a world leader in research.

Computer Science and Engineering. The first challenge is to accelerate the development of the technology underlying high performance computing. Among the largest barriers to effective use of the emerging HPC technologies are parallel architectures from which it is easy to extract peak performance, system software (operating systems, databases of massive size, compilers, and programming models) to take advantage of these architectures and provide portability of end-user applications, parallel algorithms, and advances in visualization techniques to aid in the interpretation of results. The technical barriers to progress are discussed in Appendix C. What steps will most effectively reduce these barriers?

Computational Tools for Advancing Science and Engineering. Research in the development of computational models, the design of algorithmic techniques, and their accompanying mathematical and numerical analysis is required in order to ensure the continued evolution of efficient and accurate computational algorithms designed to make optimal use of these emerging technologies.

In the past ten years, exciting developments in computer architectures, hardware and software have come in tandem with stunning breakthroughs in computational techniques, mathematical analysis, and scientific models. For example, the potential of parallel machines has been realized in part through new versions of numerical linear algebra routines and multigrid techniques; rethinking and reformulating algorithms for computational physics within the domain of parallel machines has posed significant and challenging research questions. Advances in such areas as N-body solvers, fast special function techniques, wavelets, high resolution fluid solvers, adaptive mesh techniques, and approximation theory have generated highly sophisticated algorithms to handle complex problems. At the same time, important theoretical advances in the modelling of underlying physical and engineering problems have led to new, efficient and accurate discretization techniques. Indeed, in the evolution to scalable computing across a range of levels, designing appropriate numerical and computational techniques is of paramount importance. The challenge facing NSF is to weave together existing work in these areas, as well as fostering new bridges between pure, applied and computational techniques, engaging the talents of disciplinary scientists, engineers, and mathematicians.

CHALLENGE NO. 2: Providing scalable access to all levels of HPC capability

How can NSF provide scalable access to computing resources, from the high performance workstations needed by most scientists to the critically needed teraflop-and-beyond capability required for solving Grand Challenge problems?[8] What balance should NSF anticipate and encourage among high performance desktop workstations, mid-range or mini-supercomputers, networks of workstations, and remote, shared supercomputers of very high performance?

[8] By scalable access we mean the ability to develop a problem on a workstation or intermediate sized machine and migrate the problem with relative efficiency to larger machines as increased complexity requires it. Scalable access implies scalable architectures and software.

Flexible strategy. NSF must ensure that adequate additional computational capacity is available to a steadily growing user community to solve the next generation of more complex science and engineering problems. A flexible and responsive strategy that can support the large number of evolving options for HPC and can adapt to the outcomes of major current development efforts (for example in MPP systems and in networked workstations) is required.

A pyramid of computational capability. There will continue to be an available spectrum spanning almost five orders of magnitude of computer capabilities and prices.[9] NSF, as a leader in the national effort in high performance computing, should support a "pyramid" of computing capability. At the apex of the pyramid are the highest performance systems that affordable technology permits, established at national facilities. At the next level, every major research university should have access to one, or a few, intermediate-scale high-performance systems and/or aggregated workstation clusters.[10] At the lowest level are workstations with visualization capabilities in sufficient numbers to support computational scientists and engineers.

Mid-range computational requirements. Over the next five years, the middle range of scientific computing and computational engineering will be handled by an amazing variety of moderately parallel systems. In some cases, these will be scaled-down versions of the highest performance systems available; in other cases, they will be systems targeted at the midrange computing market. The architecture will vary from shared memory at one end of the spectrum to workstation networks at the other, depending on the types of parallelism in the local spectrum of applications. Loosely coupled networks of workstations will compete with mid-range systems for performance of production HPC work. At the same time, autonomous mid-range systems are needed to support the development of next-generation architectures and software by computer science groups.

The panel perceives that there are imbalances in access to the pyramid of HPC resources (see the following table). The disciplinary NSF program offices have not been uniformly effective in responding to the need for a desktop environment for their supported researchers, and there is serious under-investment in the mid-sized machines. The distribution of investment tends to be bimodal, to the disadvantage of mid-range systems. The incentive structures internal to the Foundation do not address this distortion. NSF's HPCC coordinating mechanism needs to address this distortion in a more direct manner.

Computational Infrastructure at NSF (FY92 $, M)

                   Other NSF     ASC
Workstations          20.1       3.2
Small Parallel         2.1       0.5
Large Parallel         9.4       3.2
Mainframe              9.1      16.3
Total                 40.8      23.2

[9] A Paragon machine of 300 Gigaflops peak performance would be five orders of magnitude faster than a 3 megaflop entry workstation. Effective performance in most science applications would, however, be perhaps a factor of ten lower.

[10] As discussed in the recommendations, dedicated mid-range systems are required not only for science and engineering applications but also for research to improve HPC hardware and software, and for interactive usage. For science and engineering batch applications, networks of workstations will likely develop into an alternative.

CHALLENGE NO. 3: The right incentives to promote access to all three levels of the computational institution pyramid

The third challenge is to encourage the continued broadening of the base of participation in HPC, both in terms of institutions and in terms of skill levels and disciplines.

Lax Report incentives. At the time of the Lax report, relatively few people were interested in HPC; even fewer had access to supercomputers. Some users were fortunate to have contacts with someone at one of a few select government laboratories where computer resources were available. Most, however, were less fortunate and were forced to carry out their research on small departmental machines. This severely limited the research that could be carried out to problems that would "fit" into available resources. NSF addressed this problem by concentrating supercomputer resources in Centers; by this means those in the academic community most prepared and motivated were provided with access to machine cycles.

Need for expanded scope of access. Now that these resources are available on a peer review basis to everyone no matter where they work, it is clear the research community cannot accept a return to the previous mode of operation. The high performance computing community has grown to depend on NSF to make the necessary resources available to continually upgrade the Supercomputer Centers in support of their computational science and engineering applications. NSF needs to broaden the base of participation in HPC through NSF program offices as well as through the Supercomputer Centers. There is no question that HPC has broken out of its original narrow group of privileged HPC specialists. The SuperQuest competition for high school students already demonstrates how quickly young people can master the effective use of HPC facilities. Other agencies, states, and private HPC centers are springing up, making major contributions not only to science but to K-12 education and to regional economies. NSF's policies on expanding access and training must take advantage of the leverage these Supercomputer Centers can provide.

Allocation of HPC resources. There remains the question of the best way to allocate HPC resources. Should Supercomputer Centers continue to be funded to allocate HPC cycles competitively, or should NSF depend on the "market" of funded investigators for allocation of HPC resources? This question gets at two other issues: (a) the future role of the Centers and (b) the best means for insuring adequate funding of workstations and other means of HPC access throughout the NSF. The Centers have peer review committees which allocate HPC resources on the basis of competitive project selection. The Panel believes these allocations are fairly made and reflect solid professional evaluation of computational merit. The only remaining issue is whether there continues to be a need for protected funding for HPC access in NSF, including access to shared Supercomputer Center facilities. We believe strongly that there is such a need. The panel does have suggestions for broadening the support for the remainder of the HPC pyramid; these are articulated in the recommendations below.

Education and training. A major requirement for education and training continues to exist. Even though most disciplines have been inoculated with successful uses of HPC (see the Appendix D essays), and even though graduate student and postdoctoral use of HPC resources is rising faster than faculty usage, only a minority of scientists have the training to allow them to overcome the initial barrier to proficiency, especially in the use of MPP machines, which require a high level of computational sophistication for most problems.

CHALLENGE NO. 4: How can NSF best create the intellectual and management leadership for the future of high performance computing in the U.S.?

NSF is a major player. What role should NSF play within the scope of the nationally coordinated HPCC program and budget, as indicated in the following chart? What relationships should NSF's activities in HPC have to the activities of other federal agencies?

HPCC Agency Budgets

Agency      FY92 Funding ($, M)
ARPA              232.2
NSF               200.9
DOE                92.3
NASA               71.2
HHS/NIH            41.3
DOC/NOAA            9.8
EPA                 5.0
DOC/NIST            2.1

NSF leadership in HPCC. The voice of HPCC users needs to be more effectively felt in the national program; NSF has the best contact with this community. NSF has played, and continues to play, a leadership role in the NREN program and the evolution of the Internet. Its initiative in creating the "meta-center" concept establishes an NSF role in the sharing and coordination of resources (not only in NSF but in other cooperating agencies as well), and the concept can be usefully extended to cooperating facilities at the state level and in private firms. The question is, does the current structure in CISE, the HPCC coordination office, the Supercomputer Centers, and the science and engineering directorates constitute the most favorable arrangement for that leadership? The panel does not attempt to suggest the best ways to manage the relationships among these important functions, but asks the NSF leadership to assure the level of attention and coordination required to implement the broad goals of this report.

Networking. The third barrier is the need for network access with adequate bandwidth. For wide area networks, this is addressed in the NSF HPCC NREN strategy. In the future, NSF will focus its network subsidies on HPC applications and their supporting infrastructure, while support for basic Internet connectivity shifts to the research and education institutions.[11]

[11] NREN is the National Research and Education Network, envisioned in the High Performance Computing Act of 1991. NREN is not a network so much as it is a program of activities including the evolution of the Internet to serve the needs of HPC as well as other information activities.

RECOMMENDATIONS

We have four sets of interdependent recommendations for the National Science Board and the Foundation. The first implements a balanced pyramid of computing environments; each element supports the others, and as priorities are applied the balance in the pyramid should be sustained. The second set addresses the essential research investments and other steps to remove the obstacles to realizing the technologies of the pyramid and the barriers to the effective use of these environments. The third set addresses the institutional structure for the delivery of HPC capabilities, and consists itself of a pyramid. At the base of the institutional pyramid is the diverse array of investigators in their universities and other settings who use all the facilities at all levels of the pyramid. At the next level are departments and research groups devoted to specific areas of computer science or computational science and engineering. Continuing upward are the NSF HPC Centers, which must continue to play a very important role, both as providers of the major resources of high capability computing systems and as aggregations of specialized capability for all aspects of use and advance of high performance computing. At the apex is the national teraflop facility, which we recommend as a multi-agency facility pushing the frontiers of high performance into the next decade. A final recommendation addresses the NSF's role at the national level and its relationship with the states in HPC.

This report recommends significant expansion in NSF investments, both in accelerating progress in high performance computing through computer and computational science research and in providing the balanced pyramid of computing facilities to the science and engineering communities, but in total they do not exceed the Administration's stated intent to double the investments in HPCC during the next 5 years. We believe these investments are not only justified, but are compatible with stated national plans, both in absolute amount and in their distribution.

A. CENTRAL GOAL FOR NSF HPC POLICY

Recommendation A-1: We strongly recommend that NSF build on its success in helping the U.S. achieve its preeminent world position in high performance computing by taking the lead, under OSTP guidance and in collaboration with ARPA, DoE and other agencies, to expand access to all levels of the rapidly evolving pyramid of high performance computing for all sectors of the nation. The realization of this pyramid depends, of course, on rapid progress in the pyramid's technologies.

High performance computing is essential to the leading edge of U.S. research and development. It will provide the intelligence and power that justifies the breadth of connectivity and access promised by the NREN and the National Information Infrastructure. The computational capability we envision includes not only the research capability for which NSF has special stewardship, but also includes a rapid expansion of capability in business and industry to use HPC profitably and the many operational uses of HPC in commercial and military activities. The panel is concerned that if the government fails to implement the planned HPCC investments to support the National Information Infrastructure, the momentum of the U.S. industry, which blossomed in the first phase of the national effort, will be lost. Supercomputers are only a $2 billion industry, but an industry that provides critical tools for innovation across all areas of U.S. competitiveness, including pharmaceuticals, oil, aerospace, automotive, and others. The administration's planned new investment of $250 million in HPCC is fully justified. Japanese competitors could easily close the gap in the HPC sectors in which the U.S. enjoys that lead; they are continuing to invest and could capture much of the market the U.S. government has been helping to create.

VISION OF THE HPC PYRAMID

Recommendation A-2: At the apex of the HPC pyramid is a need for a national capability at the highest level of computing power the industry can support with both efficient software and hardware.

A reasonable goal for the next 2-3 years would be the design, development, and realization of a national teraflop-class capability, subject to the effective implementation of Recommendation B-1 and the development of effective software and computational tools for such a large machine.[12] Such a capability would provide a significant stimulus to commercial development of a prototype high-end commercial HPC system of the future. We believe the importance of NSF's mission in HPC justifies NSF initiating an interagency plan to make this investment, and further that NSF should propose to operate the facility in support of national goals in science and technology. For budgetary and interagency collaboration reasons OSTP should invoke a FCCSET project to establish such a capability on a government-wide basis with multi-agency funding and usage.

If development begins in 1995 or 1996, a reasonable guess at the cost of a teraflop machine is $50/megaflop for delivery in 1997 to 1998. If so, $50 million a year might buy one such machine per year.[13] Development cost would be substantial, perhaps in excess of the production cost of one machine; although it is not clear to what extent government support would be required, this is a further reason to suggest a multi-agency program.[14] Support costs would also be additional, but one can assume that one or more of the NSF Supercomputer Centers could host such a facility with something like the current staff.

Such a nationally shared machine, or machines, must be open to competitive merit-evaluated proposals for science and engineering computation, although it could share this mission of responding to the community's research priorities with mission-directed work of the sponsoring agencies. The investment is justified by (a) the existence of problems whose solution awaits a teraflop machine, (b) the importance of driving the HPC industry's innovation rate, and (c) the need for early and concrete experience with the scalability of software environments to higher speeds and larger arrays of processors, since software development time is the limiting factor in hardware acceptance in the market.

12 Some panel members have reservations about the urgency of this recommendation, are pessimistic about the likelihood of realizing the effective performance in applications, or are concerned about the possible opportunity cost to NSF of such a large project. The majority notes that the recommendation is intended to drive solutions to those architectural and software problems. Intel's Paragon machine is on the market today with 0.3 teraflops peak speed, but without the support to deliver that speed in most applications. The panel also recommends a multi-agency federal effort; NSF's share of cost and role in managing such a project are left to a proposed FCCSET review.

13 The cost estimates in this report cannot be much more than informed guesses. We have assumed a cost of $50/megaflop for purchase of a one teraflop machine in 1997 or 1998. We suspect that this cost might be reached earlier, say in 1995 or 1996, in a mid-range machine, because a tightly-coupled massively parallel machine may have costs rising more than linearly with the number of processors, overcoming the scale economies that might make the cost rise less than linearly. The cost estimates in Recommendations A-2 through A-4 are intended to indicate that the scale of investment we recommend is not incompatible with the published plans of the administration for investment in HPCC in the next 5 years, and further that roughly equal levels of incremental expenditures in the three levels of the HPC pyramid could produce the balance among these levels that we recommend.

14 The Departments of Energy and Defense and NASA might share a major portion of the development cost and might also acquire such machines in the future as well.
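As a back-of-envelope restatement of the cost figures above (a sketch that uses only the Panel's assumed price of $50 per megaflop, not an independent estimate):

\[
1\ \text{teraflop} = 10^{6}\ \text{megaflops}, \qquad 10^{6}\ \text{megaflops} \times \$50/\text{megaflop} = \$50\ \text{million per machine},
\]

so a budget on the order of $50 million a year corresponds to roughly one teraflop-class machine per year at the assumed 1997-1998 price point, exclusive of development and support costs.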

Recommendation A-3: Over a period of 5 years the research universities should be assisted to acquire mid-range machines.

This will bring a rapid expansion in access to very robust capability, reducing pressure on the Supercomputer Centers' largest facilities, and allowing the variety of vendor solutions to be exercised extensively. If the new MPP architectures prove robust, usable, and scalable, these institutions will be able to grow the capacity of such systems in proportion to need and with whatever incremental resources are available. This capability is also needed to provide testbeds for computer and computational science research and testing.

These mid-sized machines are the underfunded element today - less than 5% of NSF's FY92 HPC budget is devoted to their acquisition. They are needed both for demanding science and engineering problems that do not require the very maximum in computing capacity, and importantly for use by the computer science and computational mathematics community in addressing the architectural, software, and algorithmic issues that are the primary barriers to progress with MPP architectures.15

Engineering is also a key candidate for their use. There are 1050 University-Industry Research Centers in the U.S. Those UIRCs that are properly equipped with computational facilities can increase the coupling with industrial computation, adding greatly to what the NSF HPC Supercomputer Centers are doing. Many engineering applications, such as robotics research, require "real time" interactive computation which is incompatible with the batch environment on the highest performance machines.

If we assume a cost in three or four years of $50/megaflop for mid-sized MPP machines, an annual expenditure of $10 million would fund the annual acquisition of one hundred 2-gigaflop (peak) computers. Support costs for users would be additional.

Recommendation A-4: We recommend that NSF double the current annual level of investment ($22 million) in scientific and engineering workstations for its 20,000 principal investigators.

Many researchers strongly prefer the new high performance workstations that are under their control and find them adequate to meet many of their initial needs. Those without access to the new workstations may apply for remote access to a supercomputer in a Center, but often they do not need all the I/O and other capabilities of the large shared facilities. NSF needs a strategy to off-load work not requiring the highest level machines in the Centers. The justification is not economy of scale, but economy of talent and time.

When the Lax report was written, a 160-Mflop peak Cray 1 was a high performance supercomputer. Within 4 or 5 years, workstations delivering up to 400 megaflops and costing no more than $15,000 to $20,000 should be widely available. For education and a large fraction of the computational needs of science and engineering, these facilities will be adequate. However, once visualization of computational output becomes routinely required they will be ubiquitously needed. With the rapid pace of improvement, the useful lifetimes of workstations are decreasing rapidly; they often cannot cope with the latest software. Researchers face escalating costs to upgrade their computers. NSF supports perhaps some 20,000 principal investigators. Equipping an additional 10 percent of this number each year (2,000 machines) at $20,000 each requires an incremental $20 million. Recommendation B-3 addresses how this investment might be managed.

15 The development of prototypes of architectures and operating systems for parallel computation requires access to a machine whose hardware and software can be experimentally modified. This research often cannot be done on machines dedicated to full time production.
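The mid-range figure above follows from the same arithmetic as the teraflop estimate (a sketch based only on the $50/megaflop cost assumed in the text):

\[
2\ \text{gigaflops} = 2{,}000\ \text{megaflops} \times \$50/\text{megaflop} = \$100{,}000\ \text{per machine}, \qquad
\frac{\$10\ \text{million/year}}{\$100{,}000} = 100\ \text{machines per year}.
\]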
Recommendation A-5: We recommend that NSF expand its New Technologies program to support expanded testing of the practicality of new parallel configurations for HPC applications.16 For example, networks of workstations may meet a significant part of midrange HPC science and engineering applications. As progress is made in the development of this and other technologies, experimental use of the new configurations should be encouraged. A significant supplement to HPC applications research capacity can be had with minimal additional cost if such collections of workstations prove practical and efficient.

There have already been sufficient experiments with the use of distributed file systems and loosely coupled workstations to encourage the belief that many compute-intensive problems are amenable to this approach. For those problems that do not suffer from the latency inherent in this approach the incremental costs can be very low indeed, for the problems run in background and at times the workstations are otherwise unengaged. There are those who strongly believe that in combination with object-oriented programming this approach can create a revolution in software and algorithm sharing as well as more economical machine cycles.17

16 Today NSF CISE has a "new technologies" program that co-funds with disciplinary program offices perhaps 50 projects/yr. This program is in the division that funds the Centers, but is focused on projects which can ultimately benefit all users of parallel systems. This program funds perhaps 15 methods and tools projects annually, in addition to those co-funded with science programs.

17 MP Corporation, among others, is pursuing this vision.

B. RECOMMENDATIONS TO IMPLEMENT THESE GOALS

REMOVING BARRIERS TO HPC TECHNICAL PROGRESS AND HPC USAGE

Recommendation B-1: To accelerate progress in developing the HPC technology needed by users, NSF should create, in CISE, a challenge program in computer science with grant size and equipment access sufficient to support the systems and algorithm research needed for more rapid progress in HPC. The Supercomputer Centers, in collaboration with hardware and software vendors, can provide test platforms for much of this work. Recommendation A-3 provides the hardware support required for initial development of prototypes.

There is consensus that the absence of sufficient funding for systems and algorithms work which is not mission-oriented is the primary barrier to lower cost, more widely accessible, and more usable massively parallel systems. This work, including bringing the most promising ideas to prototype stage for effective transfer to the HPC industry, would address the most significant barriers to the ultimate penetration of parallel architectures in workstations. Advances on the horizon that could be accelerated include more advanced network interface architectures and operating systems technologies to provide low overhead communications in collections of workstations, and advances in algorithms and software for distributed databases of massive size. Computer science has made, and continues to make, important contributions to both hard and soft parallel machine technology, and has effectively transferred these ideas to the industry.

Two problems impede the full contribution of computer science to rapid advance in MPP development: grant sizes in the discipline are typically too small to allow enough concentrated effort to build and test prototypes, and too few computer science departments have access to a mid-sized machine on which systems development can be done.

The Board should ask for a proposal from CISE to effectively mobilize the best computer science and computational mathematics talent to address these problems, both through improved operating systems, architectures, compilers, and algorithms for existing systems and through research on next-generation systems. We recommend establishing a number of major projects, with higher levels of annual funding than is typical in Computer Science and assured duration of up to five years, for a total annual incremental investment of $10 million. We recommend that this challenge fund be managed by CISE, and be accessible to all disciplinary program offices who wish to forward team proposals for add-on funding in response to specific proposals from the community.

Recommendation B-2: A significant barrier to rapid progress in the application of HPC lies in formulating a computational strategy to solve a problem. In response to Challenge 1 above, NSF should focus attention, both through CISE and through its disciplinary program offices, on support for the design and development of computational techniques, algorithmic methodology, and mathematical, physical and engineering models to make efficient use of the machines.

Without such work in both theoretical and applied areas of numerical analysis, applied mathematics, and computational algorithms, the full benefit of advances in architecture and systems software will not be realized. In particular, significantly increased funding of collaborative and individual state-of-the-art methodology is warranted, and is crucial to the success of high performance computing. Some of this can be done through the individual directorates with funds supplemented by HPCC funds; the Grand Challenge Applications Group awards are a good first step.

Recommendation B-3: We recommend NSF set up an agency-wide task force to develop a way to ameliorate the imbalance in the HPC pyramid - the under-investment in the emerging mid-range scalable, parallel computers and the inequality of access to stand-alone (but potentially networked) workstations in the disciplines. This implementation plan should involve a combination of funding by disciplinary program offices and some form of more centralized allocation of NSF resources.

Some directorates have "infrastructure" programs; others do not. Still others fund workstations until they reach the "target" set by the HPCC coordination office. We believe that individual disciplinary program managers should consider it their responsibility to fund purchase of workstations out of their equipment funds. But we recognize that these funds need to be supplemented by HPCC funds. CISE has an office which co-funds interdisciplinary applications of HPC workstations. We believe this office may require more budgetary authority than it now enjoys, to ensure the proper balance of program and CISE budgets for workstations.

Scientific value must be a primary criterion for resource allocation. It would be unwise to support mediocre projects just because they require supercomputers. The strategy of application approval will depend very heavily on funding scenarios. If sufficient HPCC funds are made available to individual programs for computer usage, then the Supercomputer Centers should be reserved for applications that cannot be carried out elsewhere, with particular priority to novel applications. If individual science programs continue to be underfunded relative to large centers, the Supercomputer Centers may be forced into a role of supporting less novel or demanding computing applications. Under these circumstances, less stringent funding criteria should be applied.
C. THE NSF SUPERCOMPUTER CENTERS

Recommendation C-1: The Supercomputer Centers should be retained and their missions, as they have evolved since the Lax Report, should be reaffirmed. However, the NSF HPC effort now embraces a variety of institutions and programs - HPC Centers, Engineering Research Centers (ERC) and Science and Technology Centers (STC) devoted to HPC research, and disciplinary investments in computer and computational science and applied mathematics - all of which are essential elements of the HPC effort needed for the next decade. NSF plays a primary but not necessarily dominant role in each of them (see Figure 4 of Appendix D). Furthermore, HPC institutions outside the NSF orbit also contribute to the goals for which the NSF Supercomputer Centers are chartered. Thus we ask the Board to recognize that the overall structure of the HPC program at NSF will have more institutional diversity, more flexibility, and more interdependence with other agencies and private institutions than in the early years of the HPC initiative.

We anticipate an evolution, which has already begun, in which the NSF Supercomputer Centers increasingly broaden their base of support, and NSF expands its support in collaboration with other institutional settings for HPC. Center-like groups, especially NSF S&T Centers, are an important instrument for focusing on solving barriers to HPC, although they do not provide HPC resources to users. An excellent example is the multi-institutional Center for Research in Parallel Computation at Rice University, which is supported at about $4M/yr, with additional support from ARPA. Another example is the Center for Computer Graphics and Scientific Visualization, an S&T Center award to the University of Utah with participation of the University of N. Carolina, Brown, Caltech, and Cornell. Still another example is the Discrete Mathematics and Computational Science Center (DIMACS) at Rutgers and Princeton. These centers fill important roles today, and the ERC and S&T Center structures provide a necessary addition to the Supercomputer Centers for institutionalizing the programmatic work required for HPC.

The NSF should continue its current practice of encouraging HPC Center collaboration, both with one another and with other entities engaged in HPC work. The division of the support budget into one component committed to the Supercomputer Centers and another for multi-center activities is a useful management tool, even though it may have the effect of reducing competition among Supercomputer Centers. The National Consortium for HPC (NCHPC), formed by NSF and ARPA, is a welcome measure as well.

Recommendation C-2: The current situation in HPC is more exciting, more turbulent, and more filled with promise of really big benefits to the nation than at any time since the Lax report; this is not the time to "sunset" a successful, changing venture, of which Supercomputer Centers remain an essential part. Furthermore, we also recommend against an open recompetition of the four Supercomputer Centers at this time, favoring instead periodic performance evaluation and competition for some elements of their activities, both among the Centers themselves and when appropriate with other HPC Centers such as those operated by states (see Recommendation D-1).

Continuing evaluation of each Center's performance, as well as the performance of the overall program, is, of course, an essential part of good management of the Supercomputer Centers program. Such evaluations must take place on a regular basis in order to develop a sound basis for adjustments in support levels, to provide incentives for quality performance, and to recognize the need to encourage other institutions such as S&T Centers that are attacking HPC barriers and state-based centers with attractive programs in education and training. While recompetition of existing Supercomputer Centers does not appear to be appropriate at this time, if regular review of the Centers and the Centers program identifies shortcomings in a Center or the total program, a recompetition of that element of the program should be initiated.

Supercomputer Centers are highly leveraged through investments by industry, vendors, and states. This diversification of support impedes unilateral action by NSF, since the Centers' other sponsors must be consulted before decisions important to the Center are made.18 It also suggests that the issue of recompetition may, in future, become moot as the formal designation "NSF HPC Center" erodes in significance. There is a form of recompetition already in place; the Centers compete for support for new machine acquisition and for roles in multi-center projects.

18 Each year each center gets a cooperative agreement level which is negotiated. Each center gets about $14M; about 15% is flexible. NSF centers have also received help from ARPA to buy new MPP machines. Most of the Centers have important outside sources of support, which imply obligations NSF must respect, such as the Cornell Center relationship with IBM and the San Diego Center's activities with the State of California.
Recommendation C-3: The NSF should continue to provide funding to support the Supercomputer Centers' HPC capacity. Any distortion in the uses of the computing pyramid that results from this dedicated funding is best offset by the recommendations we make for other elements in the pyramid. Provision to scientists and engineers of access to leading edge supercomputer resources will continue to be a primary purpose of the Centers, but it is a means to a broader mission: to foster rapid progress in the use of HPC by scientists and engineers, to accelerate progress in usability and economy of HPC, and to diffuse capability throughout the technical community, including industry. The following additional components of the Center missions should be affirmed:

• Supporting computational science, by research and demonstration in the solution of significant science and engineering problems.

• Fostering interdisciplinary collaboration - across sciences and between sciences and computational science and computer science - as in the Grand Challenge programs.

• Prototyping and evaluating software, new architectures, and the uses of high speed data communications in collaboration with three groups: computer and computational scientists; disciplinary scientists exploiting HPC resources; and the HPC industry and business firms exploring expanded use of HPC.

• Training and education, from post-docs and faculty specialists, to introduction of less experienced researchers to HPC methods, to collaboration with state and regional HPC centers working with high schools, community colleges, colleges, and universities.

The role of a Supercomputer Center should, therefore, continue to be primarily one of a facilitator, pursuing the goals just listed by making the hardware and human resources available to computational scientists, who themselves are intellectual leaders. In this way the Centers will participate in leadership but will not necessarily be its primary source. With certain notable exceptions, intellectual leadership in computational science has come from scientists around the country who have at times used the resources available at the Centers. This situation is unlikely to change, nor should it change. It would be unrealistic to place this type of demand on the Supercomputer Centers and it would certainly not be in the successful tradition of American science.

The Supercomputer Centers facilitate interdisciplinary collaborations because they support HPC users from a variety of disciplines, and are aware of their particular strengths. The Centers have been deeply involved in nucleating Grand Challenge teams, and particularly in reaching out to bring computer scientists together with computational scientists. Visualization, for example, is no longer just in the realm of the computational scientist; experimentalists use the same tools for designing and simulating experiments in advance of actual data generation. This common ground should not be separated from the enabling technologies which have made this work possible. Rather, high performance computing and the new science it has enabled have seeded advances that would not have happened any other way.

ALLOCATION OF CENTER HPC RESOURCES TO INVESTIGATORS

Recommendation C-4: The NSF should review the administrative procedures used to allocate Center resources, and the relationship of this process to the initial funding of the research by the disciplinary program offices, to ensure that the burden on scientists applying for research support is minimized when that research also requires access to the facilities of the Centers, or perhaps access to other elements of the HPC pyramid that will be established pursuant to our recommendations. However, we believe the NSF should continue to provide HPC resources to the research community through allocation committees that competitively evaluate proposals for use of Center resources.19

At the present time, the allocation of resources in the Supercomputer Centers for all users is handled by requiring principal investigators to submit annual proposals to a specified Center for access to specific equipment. The NSF should not require a duplicate peer review of the substantive scientific merit of the proposed scientific investigation, first by disciplinary program offices and then again by the Center Allocation Committees. For this reason, it is proposed that the allocation of supercomputer time be combined with the allocation of research funds to the investigator.

Although this panel is not in a position to give administrative details of such a procedure, it is suggested that requests for computer time be attached to the original regular NSF proposal, with (a) experts in computational science included among the peer reviewers, or (b) that portion of the proposal reviewed in parallel by a peer review process established by the Centers. In either case only one set of peer reviewers should evaluate scientific merits, and only one set of reviewers should determine that the research task is being formulated properly for use of HPC resources.

Second, we recommend that the Centers collectively establish the review and allocation mechanism, so that while investigators might express a preference for a particular computer or Center for their work, all Centers' facilities would be in the pool from which each investigator receives allocations.

We recognize, of course, that the specific allocation of machine time often cannot be made at the time of the original proposal for NSF research support, since in some cases the work has not progressed to the point that the mathematical approach, algorithms, etc., are available for Center experts to evaluate and translate into estimates of machine time. Nor is the demand function for facilities known at that time.

19 For NSF funded investigators, allocation committees at Supercomputer Centers should evaluate requests for HPC resources only on the appropriateness of the computational plans, choice of machine, and amount of resource requested. Centers should rely on disciplinary program office determinations of scientific merit, based on their peer review. In this way a two-level review of the merits of the science is avoided. A further simplification might be for the application for computer time at the Centers to be included in the original disciplinary proposal, and forwarded to the Centers when the proposal is approved. For non-NSF funded investigators an alternative form of peer review of the research is required.

EDUCATION AND TRAINING

Recommendation C-5: The NSF should give strong emphasis to its education mission in HPC, and should actively seek collaboration with state-sponsored and other HPC centers not supported primarily on NSF funding. Supercomputing regional affiliates should be candidates for NSF support, with education as a key role. HPC will also figure in the Administration's industrial extension program, in which the states have the primary operational role.

The serious difficulties associated with the use of parallel computers pose a new training burden. In the past it was expected that individual investigators would port their code to new computers, and this could usually be done with limited effort. This is no longer the case. The Supercomputer Centers should see their future mission as providing direct aid to the rewriting of code for parallel processors.

Computational science is proving to be an effective way to generate new knowledge. As part of its basic mission, NSF needs to teach scientists, engineers, mathematicians, and even computer scientists how high performance computing can be used to produce new scientific results. The role of the Supercomputer Centers is critical to such a mission since the Centers have expertise on existing hardware and software systems, modelling, and algorithms, as well as knowledge of useful high performance computing application packages, awareness of trends in high performance computing, and the requisite staff.
D. NSF AND THE NATIONAL HPC EFFORT; RELATIONSHIPS WITH STATES

Recommendation D-1: We recommend that the National Science Board urge OSTP to establish an advisory committee representing the states, HPC users, NSF Supercomputer Centers, computer manufacturers, and computer and computational scientists (similar to the Federal Networking Council's Advisory Committee), which should report to HPCCIT. A particularly important role for this body would be to facilitate state-federal planning related to high performance computing.

Congress required an advisory committee reporting to the PMES, but the committee has not yet been implemented. The committee we propose would provide policy level advice and coordination with the states. The main components of HPCC are networking and HPC, although the networks seem to be receiving priority attention. The Panel believes it is important to continue to emphasize the importance of ensuring adequate compute power in the network to support the National Information Infrastructure applications. We also believe that as participation in HPC continues to broaden through initiatives by the states and by industry, the NSF (and other federal agencies) should encourage their collaboration in the national effort.

The Coalition of Academic Supercomputer Centers (CASC) was founded in 1989 to provide a forum to encourage support for high performance computing and networking. Unlike the FCCSET task force, CASC is dependent on others to bring the money to support high performance computing - usually their own State government or university. The result is a valuable discussion group for exchanging information and developing a common agenda, and CASC should be encouraged. However, CASC is not a substitute for a more formal federal advisory body.

This recommendation is consistent with a recent Carnegie Commission Report entitled "Science, Technology and the States in America's Third Century," which recommends the creation of a system of joint advisory and consultative bodies to foster federal-state exchanges and to create a partnership in policy development, especially for construction of national information infrastructure and provision of services based on it. Because of the importance of high performance computing to future economic development, we need a new balance of cooperation between federal and state government in this area, as in a number of others.

Appendix A

MEMBERSHIP OF THE BLUE RIBBON PANEL ON HIGH PERFORMANCE COMPUTING

Lewis Branscomb, John F. Kennedy School of Government, Harvard University (Chairman)
Lewis Branscomb is a physicist, formerly chairman of the National Science Board (1980-1984) and Chief Scientist of IBM Corp. (1972-1986).

Theodore Belytschko, Department of Civil Engineering, Northwestern University
Ted Belytschko's research interests are in computational mechanics, particularly in the modeling of nonlinear problems, such as failure, crashworthiness, and manufacturing processes.

Peter R. Bridenbaugh, Executive Vice President - Science, Engineering, Environment, Safety & Health, Aluminum Company of America
Peter Bridenbaugh serves on a number of university advisory boards, and is a member of the National Academy of Engineering's Industrial Ecology Committee. He also serves on the NSF Task Force 1994 Budget Committee and is a Fellow of ASM International.

Teresa Chay, Professor, Department of Biological Sciences, University of Pittsburgh
Teresa Chay's research interests are in modelling biological phenomena such as nonlinear dynamics and chaos theory in excitable cells, cardiac arrhythmias by bifurcation analysis, mathematical modeling of the electrical activity of insulin-secreting pancreatic B-cells and agonist-induced cytosolic calcium oscillations, and elucidation of the kinetic properties of ion channels.

Jeff Dozier, Center for Remote Sensing, University of California, Santa Barbara
Jeff Dozier is a hydrologist and remote sensing specialist. From 1990-1992 he was Senior Project Scientist on NASA's Earth Observing System.

Gary Grest, Exxon Corporate Research Science Laboratory
Gary Grest's research interests are in the areas of computational physics and material science, emphasizing the modeling of the properties of polymers and complex fluids.

Edward Hayes, Vice President for Research, Ohio State University
Edward F. Hayes is a computational chemist, formerly Controller and Division Director for Chemistry at NSF.

Barry Honig, Department of Biochemistry and Molecular Biology, Columbia University
Barry Honig's research interests are in theoretical and computational studies of biological macromolecules. He is an associate editor of the Journal of Molecular Biology and is a former president of the Biophysical Society (1990-1991).

Neal Lane, Provost, Rice University (resigned from the Panel July 1993)

William A. Lester, Jr., Professor and Associate Dean, Department of Chemistry, University of California, Berkeley
William A. Lester, Jr., is a theoretical chemist, formerly Director of the National Resource for Computation in Chemistry (1978-81) and Chairman of the NSF Joint Advisory Committees for Advanced Scientific Computing and Networking and Communications Research and Infrastructure (1987).

Gregory McRae, Professor, Department of Chemical Engineering, MIT

James Sethian, Professor, University of California at Berkeley
James Sethian is an applied mathematician in the Mathematics Department at the University of California at Berkeley and in the Physics Division of the Lawrence Berkeley Laboratory.

Burton Smith, Tera Computer Company
Burton Smith is Chairman and Chief Scientist of Tera Computer Company, a manufacturer of high performance computer systems.

Mary Vernon, Department of Computer Science, University of Wisconsin
Mary Vernon is a computer scientist who has recently received the NSF Presidential Young Investigator Award and the NSF Faculty Award for Women Scientists and Engineers in recognition of her research in parallel computer architectures and their performance.


Appendix B

NSF AND HIGH PERFORMANCE COMPUTING: HISTORY AND ORIGIN OF THIS STUDY

Introduction

This report of the Blue Ribbon Panel on High Performance Computing follows a number of separate, but related, activities in this area by the NSF, the computational science community, and the Federal Government in general acting in concert through the Federal Coordinating Committee on Science, Engineering, and Technology. The Panel's findings and recommendations must be viewed within this broad context of HPC. This section provides a description of the way in which the panel has conducted its work and a brief overview of the preceding accomplishments which were used as the starting point for the Panel's deliberations.

The Origin of the Present Panel and Charter

Following the renewal of four of the five NSF Supercomputer Centers in 1990, the National Science Board (NSB) maintained an interest in the Centers' operations and activities. Given the national scope of the Centers, and the possible implications for them contained in the HPCC Act of 1992, the NSB commissioned the formation of a blue ribbon panel to investigate the future changes in the overall scientific environment due to the rapid advances occurring in the field of computers and scientific computing. The panel was instructed to investigate the way science will be practiced in the next decade, and recommend an appropriate role for NSF to enable research in the overall computing environment of the future. The panel consists of representatives from a wide spectrum of the computer and computational science communities in industry and academia. The role expected of the Panel is reflected by its Charter:

A. Assess the contributions of high performance computing to scientific and engineering research and education, including ancillary benefits, such as the stimulus to the pace of innovation in U.S. industries and the public sector.

Project what hardware, software and communication resources may be available in the next five to ten years to further these advances and identify elements that may be particularly important to the development of HPC.

Assess the variety of institutional forms through which access to high performance computing may be gained, including funding of equipment acquisition, shared access through local centers, and shared access through broad band telecommunications.

Project sources, other than NSF, for support of such capabilities, and potential cooperative relationships with states, the private sector, other federal agencies, and international programs.

Identify barriers to the development of more efficient, usable, and powerful means for applying high performance computing, and means for overcoming them.

Provide recommendations to help guide the development of NSF's participation in supercomputing and its relation to the federal interagency High Performance Computing and Communications Program.

Recommend policies and managerial structures needed to achieve NSF program goals, including clarification of the peer review procedures and suggesting appropriate processes and mechanisms to assess program effectiveness necessary for insuring the highest quality science and engineering research.

At its first meeting in January 1993, the panel approved its Charter, and established a scope of work which would allow a final report to be presented to the NSB in Summer 1993.
A large number of questions were raised amplifying the Charter's directions. Prior to its second meeting in March 1993 the Panel solicited input from the national research community; a response to the following four questions was requested:

• How would you project the emerging high performance computing environment and market forces over the next five years and the implications for change in the way scientists and engineers will conduct R&D, design and production modeling?

• What do you see as the largest barriers to the effective use of these emergent technologies by scientists and engineers and what efforts will be needed to remove these barriers?

• What is the proper role of government, and, in particular, the NSF to foster progress? To what extent do you believe there is a future role for government-supported supercomputer centers? What role should NSF play in this spectrum of capabilities?

• To what extent should NSF use its resources to encourage use of high performance computing in commercial industrial applications through collaboration between high performance computing centers, academic users and industrial groups?

Over fifty responses were received and were considered and discussed by the Panel at its March meeting. The Panel also received presentations, based on these questions, from vendors of high performance computing equipment and representatives from non-NSF supercomputer centers.

NSF's Early Participation in High Performance Computing

Although the National Science Foundation is now a major partner in the nation's high performance computing effort, this was not always the case. In the early 1970s the NSF ceased its support of campus computing centers, and by the mid-1970s there were no "supercomputers" on any campus available to the academic community. Certainly computers of this capability were available through other government agency (DoE and NASA) laboratories, but NSF did not play a role, and hence many of its academic researchers did not have the ability to perform computational research on anything other than a departmental minicomputer, thereby limiting the scope of their research.

This lack of NSF participation in the high performance computing environment began to be noted in the early 1980s with the publication of a growing number of reports on the subject. A report to the NSF Division of Physics Advisory Committee in March 1981 entitled "Prospectus for Computational Physics", edited by W. Press, identified a "crisis" in computational physics, and recommended support for facilities. Subsequent to this report a joint agency study, "Large Scale Computing in Science and Engineering", edited by P. Lax, appeared in December 1982 and acted as the catalyst for NSF's reemergence in the support of high performance computing. The Lax Report presented four recommendations for a government-wide program:

• Increased access to regularly upgraded supercomputing facilities via high bandwidth networks

• Increased research in computational mathematics, software, and algorithms

• Training of personnel in scientific computing

• R&D of new supercomputer systems

The key suggestions contained in the Lax Report were studied by an internal NSF working group, and the findings were issued in July 1983 as "A National Computing Environment for Academic Research", a report edited by M. Bardon and K. Curtis. The report studied NSF supported scientists' needs for academic computing, and validated the conclusions of the Lax Report for the NSF supported research community. The findings of Bardon/Curtis reformulated the four recommendations of the Lax Report into a six point implementation plan for the NSF. Part of this action plan was a recommendation to establish ten academic supercomputer centers.

The immediate NSF response was to set up a means for academic researchers to have access, at existing sites, to the most powerful computers of the day. This was an interim step prior to a solicitation for the formation of academic supercomputer centers directly supported by the NSF. By 1987, five NSF Supercomputer Centers had been established, and all had completed at least one year of operation.

During this phase the Centers were essentially isolated "islands of supercomputing" whose role was to provide supercomputer access to the academic community. This aspect of the Centers' activities has changed considerably. The NSF concept of the Centers' activities was mandated to be much broader, as indicated by the Centers' original objectives:

• Access to state of the art supercomputers

• Training of computational scientists and engineers

• Stimulate the U.S. supercomputer industry

• Nurture computational science and engineering

• Encourage collaboration among researchers in academia, industry and government

In 1988-1989 NSF conducted a review to determine whether support was justified beyond 1990. In developing proposals, the Centers were advised to increase their scope of responsibilities. Quoting from the solicitation:

"To insure the long term health and value of a supercomputer center, an intellectual environment, as well as first class service, is necessary. Centers should identify an intellectual component and research agenda."

In 1989 NSF approved continuation through 1995 of the Cornell Theory Center, the National Center for Supercomputing Applications, the Pittsburgh Supercomputing Center, and the San Diego Supercomputer Center. Support for the John von Neumann Center was not continued.

The Federal High Performance Computing and Communications Initiative

At the same time the NSF Supercomputer Centers were beginning the early phases of their operations, the Federal Coordinating Committee for Science, Engineering, and Technology began a study in 1987 on the status and direction of high performance computing, and its relationship to federal research and development. The results were "A Research and Development Strategy for High Performance Computing", issued by the Office of Science and Technology Policy (OSTP) in November 1987, followed in September 1989 by another OSTP document, "The Federal High Performance Computing Program". These two reports set the framework for the inter-governmental agency cooperation on high performance computing which led to the High Performance Computing and Communications (HPCC) Act of 1991.

HPCC focuses on four integrated components* of computer research and applications which very closely echo the Lax Report conclusions:

• High Performance Computing Systems - technology development for scalable parallel systems to achieve teraflop speed

• Advanced Software Technology and Algorithms - generic software and algorithm development to support Grand Challenge projects, including early access to production scalable systems

• National Research and Education Network - to further develop the national network and networking tools, and to support the research and development of gigabit networks

• Basic Research and Human Resources - to support individual investigator research on fundamental and novel science and to initiate activities to significantly increase the pool of trained personnel

*At the time of writing this Report, a fifth component, entitled Information Infrastructure, Technology, and Applications, is being defined for inclusion in the HPCC Program.

With this common structure across all the participating agencies, the Program outlines each agency's roles and responsibilities. NSF is the lead agency in the National Research and Education Network, and has major roles in Advanced Software Technology and Algorithms, and in Basic Research and Human Resources.

The Sugar Report

After the renewal of the four NSF Supercomputer Centers, the NSF Division of Advanced Scientific Computing recognized that the computing environment within the nation had changed considerably from that which existed at the inception of the Centers Program. The Division's Advisory Committee was asked to survey the future possibilities for high performance computing, and report back to the Division. Two workshops were held in the Fall of 1991 and Spring of 1992. Thirty-one participants with expertise in computational science, computer science and the operation of major supercomputer centers were involved.

The final report, edited by R. Sugar of the U. of California at Santa Barbara, recommended future directions for the Supercomputer Centers Program which would "enable it to take advantage of these (HPCC) opportunities and to meet its responsibilities to the national research community". The committee's recommendations can be summarized as:

• Decisions and planning by the Division need to be made in a programmatic way, rather than on an individual Center by Center basis - the meta-center concept provides a vehicle for this management capability which goes beyond the existing Centers.

• Access to stable computing platforms (currently vector supercomputers) needs to be augmented by access to state of the art technology (currently massively parallel computers) - but the former cannot be sacrificed to provide the latter.

• The Supercomputer Centers can be focal points for enabling collaborative efforts across many communities - computational and computer science, private sector and academia, vendors and academia.

Appendix C

TECHNOLOGY TRENDS and BARRIERS to FURTHER PROGRESS

BACKGROUND:

What is the state of the HPC industry here and abroad? What is its prognosis?

The high performance computer industry is in a state of turmoil, excitement and opportunity. On the one hand, the vector multiprocessors manufactured by many firms, large and small, have continued to improve in capability over the years. These systems are now quite mature, as measured by the fact that delivered performance is a significant fraction of the theoretical peak performance of the hardware, and are still the preferred platform for many computational scientists and engineers. They are the workhorses of high performance computing today and will continue in that role even as alternatives mature.

On the other hand, dramatic improvements in microprocessor performance and advances in microprocessor-based parallel architectures have resulted in "massively parallel" systems that offer the potential for high performance at lower cost.20 For example, $10 million in 1993 buys over 40 gigaflops of peak processing power in a multicomputer but only 5 gigaflops in a vector multiprocessor. As a result, increasing numbers of computational scientists and engineers are turning to the highly parallel systems manufactured by companies such as Cray Research Inc., IBM, Intel, Kendall Square, Thinking Machines Inc., MasPar, and nCUBE.

Microprocessor performance has increased by 4X every three years, matching the rate of integrated circuit logic density improvement as predicted by Moore's law. For example, the microprocessors of 1993 are around 200 times faster than those of 1981. By contrast, the clock rates of vector processors have improved much more slowly; today's fastest vector processors are only five or six times faster than 1976's Cray 1. Thus, the performance gap between these two technologies is quickly disappearing in spite of other performance improvements in vector processor architecture.

Although microprocessor-based massively parallel systems hold considerable promise for the future, they have not yet reached maturity in terms of ease of programming and ability to deliver high performance routinely to large classes of applications. Unfortunately, the programming technology that has evolved for the vector multiprocessors does not directly transfer to highly parallel systems. New mechanisms must be devised for high performance communication and coordination among the processors. These mechanisms must be efficiently supported in the hardware and effectively embodied in programming models.

Currently, vendors are providing a variety of systems based on different approaches, each of which has the potential to evolve into the method of choice. Vector multiprocessors support a simple shared memory model which demands no particular attention to data arrangement in memory. Many of the currently available highly parallel architectures are based on the "multicomputer" architecture, which provides only a message-passing interface for inter-processor communication. Emerging architectures, including the Kendall Square KSR-1 and systems being developed by Convex, Cray Research, and Silicon Graphics, have shared address spaces with varying degrees of hardware support and different refinements of the shared memory programming model. These computers represent a compromise in that they offer much of the programming simplicity of shared memory yet still (at least so far) require careful data arrangement to achieve good performance. (The data parallel language on the CM-5 has similar properties.) A true shared memory parallel architecture, based on mechanisms that hide memory access latency, is under development at Tera Computer.

The size of the high performance computer market worldwide is about $2 billion (excluding sales of the IBM add-on vector hardware), with Cray Research accounting for roughly $800 million of it. IBM and Fujitsu are also significant contributors to this total, but most companies engaged in this business have sales of $100 million or less. Some companies engaged in high speed computing have other, larger sources of revenue (IBM, Fujitsu, Intel, NEC, Hitachi); other companies both large (Cray Research) and small (Thinking Machines, Kendall Square, Meiko, Tera Computer) are high performance computer manufacturers exclusively. There are certainly more companies in the business than can possibly be successful, and no doubt there are new competitors that will appear. Helping to sustain this high level of competitive innovation should be an important objective for NSB policy in HPC.

FINDINGS

Where is the hardware going to be in 5 years? What will be the performance and cost of the most powerful machines, the best workstations, the mid-range computers?

The next five years will continue to see improvements in hardware price/performance ratios. Since microprocessor speeds now closely approach those of vector processors, it is unclear whether microprocessor performance improvement can maintain its current pace. Still, as long as integrated circuits continue to quadruple in density but only double in cost every three years, we can probably expect a fourfold price/performance improvement in both processors and memory by 1998. Estimating in constant 1993 dollars, the most powerful machines ($50 million) will have peak performance of nearly a teraflop21; mini-supercomputers ($1 million) will advertise 20 gigaflops peak performance; workstations ($50,000) will approach 1 gigaflop, and personal computers ($10,000) will approach 200 megaflops.22

20 Note that the higher cost of vector machines is partly caused by their extensive use of static memory chips for main memory and the interconnection networks they use for high shared-memory bandwidth. These attributes contribute to increased programmability and the realization of a high fraction of peak performance on user applications. Realized performance on MPP machines is still uncertain. A comparison of today's vector machines versus MPP systems based on realized performance per dollar reveals much less difference in cost-performance than comparisons based on peak performance.

21 One teraflop is 1000 gigaflops or 10^12 floating point operations per second.

22 Spokesmen from Intel, Convex and Silicon Graphics, in addressing the panel, all made even higher estimates than this.
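The growth rates and price points quoted above can be cross-checked with a short calculation (a sketch that simply compounds the rates assumed in the text; actual machines will deviate from it):

\[
4^{(1993-1981)/3} = 4^{4} = 256 \approx 200\times \quad \text{(microprocessor speedup, 1981--1993)},
\]
\[
\frac{\text{density}\times 4}{\text{cost}\times 2} = 2\times \ \text{per 3 years} \;\Rightarrow\; 2^{5/3} \approx 3.2\times \ \text{price/performance gain, 1993--1998},
\]

roughly the fourfold improvement anticipated above. Applied to the 1993 figure of 40 peak gigaflops per $10 million, that factor gives about \(4 \times 3.2 \approx 13\) peak gigaflops per million dollars, or on the order of 0.6-0.7 peak teraflops for a $50 million machine - consistent with the "nearly a teraflop" projection.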
During this period, parallel architectures will continue to emerge and evolve. Just as the CM-5 represented a convergence between SIMD and MIMD parallel architectures and brought about a generalization of the data-parallel programming model, it is likely that the architectures will continue to converge and better user-level programming models will continue to emerge. These developments will improve software portability and reduce the variety of architectures that are required for computational science and engineering research, although there will likely still be some diversity of approaches at the end of this 5-year horizon. Questions that may be resolved by 1998 include:

• Which varieties of shared memory architecture provide the most effective tradeoff between hardware simplicity, system performance, and programming convenience? and

• What special synchronization mechanisms for processor coordination should be supported in the hardware?

Most current systems are evolving in these directions, and answers to these issues will provide a more stable base for software efforts. Furthermore, much of the current computer science research in shared memory architectures is looking for cost-effective hardware support that can be implemented in multiprocessor workstations that are interconnected by general-purpose local area networks.

Thus, technology from high performance parallel systems may be expected to migrate to workstation networks, further improving the capabilities of these systems to deliver high-performance computing to particular applications. It is possible that in the end the only substantial difference between the supercomputers of tomorrow and the workstation networks of tomorrow will be the installed network bandwidth.23

Where is the software/programmability going to be in 5 years? What new programming models will emerge for the new technology? How transparent will parallel computers be to users?

While the architectural issues are being resolved, parallel languages and their compilers will need to continue to improve the programmability of new high performance computer systems. Implementations of "data parallel" language dialects like High Performance Fortran, Fortran D, and High Performance C will steadily improve in quality over the next five years and will simplify programming of both multi-computers and shared address systems for many applications. For the applications that are not helped by these languages, new languages and programming models will emerge, although at a slower pace. Despite strong efforts addressing the problem from the language research community, the general purpose parallel programming language is an elusive and difficult quarry, especially if the existing Fortran software base must be accommodated, because of difficulties with the correct and efficient use of shared variables.

Support tools for software development have also been making progress, with emphasis on visualization of a program's communication and synchronization behavior. Vendors are increasingly recognizing the need for sophisticated performance tuning tools, with most now developing or beginning to develop such tools for their machines. The increasing number of computer scientists who are also using these tools could lead to even more rapid improvement in the quality and usability of these support tools. Operating systems for high performance computers are increasingly ill suited to the demands placed on them. Virtualization of processors and memory often leads to poor performance, whereas relatively fixed resource partitioning produces inefficiency, especially when parallelism within the application varies. High performance I/O is another area of shortfall in many systems, especially the multicomputers. Research is needed in nearly every aspect of operating systems for highly parallel computers.

What market forces or technology investments drive HPC technologies and products?

Future high performance systems will continue to be built using technologies and components built for the rest of the computer industry. Since integrated circuit fabrication facilities now represent billion dollar capital investments, integrated circuits benefit from very large scale economies; accordingly it has been predicted that only mass-market microprocessors will prove to have acceptable costs in future high performance systems. Certainly current use of workstation microprocessors such as Sparc, Alpha and the RS-6000 chips suggests this trend. Even so, the cost of memory chips is likely to be a major factor in the costs of massively parallel systems, which require massive amounts of fast memory. Thus the technology available for both custom designs and industry standard processors will increasingly be driven by the requirements of much larger markets, including consumer electronics.

23 While parallel architectures mature, vector multiprocessors will continue to evolve. Scaling to larger numbers of processors ultimately involves solving the same issues as for the microprocessor-based systems.

The health of the HPC vendors and the structure of their products will be heavily influenced by demand from industrial customers. Business applications represent the most rapidly growing market for HPC products; they have much higher potential growth than government or academic uses. Quite apart from NSF's obligation to contribute to the nation's economic health through its research activities, this fact motivates the importance of cooperation with industry users in expanding HPC usage. This reality means that NSF should be attentive to the value of throughput as a figure of merit in HPC systems (in contrast with turnaround time, which academic researchers usually favor), as well as the speed with which large volumes of data can be accessed. Industry won't put up with a stand-alone, idiosyncratic environment.

How practical will be the loose coupling of desktop workstations to aggregate their unused compute power?

Networks of workstations will become an important resource for the many computations that perform well on them. The probable success of these loosely coupled systems will inevitably raise the standard for communication capabilities in the multicomputer arena. Many observers believe that competition from workstation networks on one side and shared address space systems on the other will drive multi-computers from the scene entirely; in any event, the network bandwidth and latency of multi-computers must improve to differentiate them from workstation networks. Many large institutions have 1000 or more workstations already installed; the utilization rate of their processors on a 24 hour basis is probably only a few percent. An efficient way to use the power of such heterogeneous networks would be financially attractive. It will, however, raise serious questions about security, control, virus prevention, and accounting programs.
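To give a rough sense of the scale of this unused capacity (an illustrative sketch combining the 1000-workstation installed base and few-percent utilization noted above with the roughly 400-megaflop workstations this report projects for the mid-1990s; none of these figures are measurements):

\[
1{,}000\ \text{workstations} \times 400\ \text{megaflops (peak)} = 400\ \text{gigaflops aggregate peak},
\]

of which a utilization rate of a few percent implies that well over 90 percent - several hundred peak gigaflops - sits idle over a 24-hour day, comparable to the peak capability of a large shared machine of the same era.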
Are there some emerging HPC technologies of interest other than parallel processing? What is their significance?

Neural networks have recently become popular and have been successfully applied to many pattern recognition and classification problems. Fuzzy logic has enjoyed an analogous renaissance. Technologies of this sort are both interesting and important in a broad engineering context and are also having an impact on computational science and engineering. Machine learning approaches, such as neural networks, are most appropriate in scientific disciplines where there is insufficient theory to support accurate computer modeling and simulation.

How important are simulation and visualization capabilities?

Simulation will play an ever increasing role in science and engineering. Much of this work will be able to be carried out on workstations or intermediate-scale systems, but it will continue to be appropriate to share the highest performance systems (and the expertise in using them) on a national scale, to accomplish large simulations within human time scales. Smaller configurations of these machines should be provided to individual research universities for application software development and for research that involves modifying the operating system and/or hardware.

Personal computer capabilities will improve, and visualization on the desktop will become more routine. Scientists and engineers in increasing numbers will need to be equipped with visualization capabilities. The usefulness of high performance computing depends on them, because printed lists of numbers (or printed sheaves of pictures, for that matter) are increasingly unsatisfactory as an output medium, even for moderately sized simulations.

BARRIERS TO CONTINUED RAPID PROGRESS

What software and/or hardware "inventions" are needed? Who will address these needs?

The most important impediment to the use of new highly parallel systems has been the difficulty of programming these machines and the wide variation that exists in communication capabilities across generations of machines as well as among the machines in a given generation. Application software developers are understandably reluctant to re-implement their large scale production codes on multicomputers when significant effort is required to port the codes across parallel systems as they evolve. In theory, any programming model can be implemented (by appropriate compilers) on any machine. However, the inefficiency of certain models on certain architectures is so great as to render them impractical.[24] What is needed in high performance computing is an architectural consensus and a simple model that summarizes and abstracts the machine interface, allowing compilers to be ported more easily across systems and facilitating the portability of application programs. Ideally, the consensus interface should efficiently support existing programming models (even the multicomputers have created their own dusty decks) as well as more powerful models. Considerable research in the computer science community is currently devoted to these issues. It is unlikely that the diversity of programming models will decrease within the next five years, but it is likely that models will become more portable.

[Footnote 24: For example, it is not practical to implement data-parallel compilers on the Intel iPSC/860.]
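To make the idea of a portable, consensus machine interface concrete, here is a minimal sketch of a message-passing program written against MPI (through the mpi4py binding). MPI is used here purely as an illustration of such an abstraction rather than anything endorsed by the report; the program and its parameters are hypothetical, and the same source would run unchanged on shared-memory multiprocessors, multicomputers, and workstation networks.

```python
# Minimal portable message-passing sketch: each process sums part of a
# conceptual global array and the partial sums are combined by a reduction.
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()        # this process's identifier
size = comm.Get_size()        # total number of processes

N = 1_000_000
# Each rank owns a contiguous slice of the global index range.
lo = rank * N // size
hi = (rank + 1) * N // size
local_sum = sum(float(i) for i in range(lo, hi))

# Combine partial results; every rank receives the global sum.
global_sum = comm.allreduce(local_sum, op=MPI.SUM)

if rank == 0:
    print("global sum =", global_sum)   # equals N*(N-1)/2
```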
How important will be access to data, and data management?

Besides needing high performance I/O, some fields of computational science need widely distributed access to data bases that are extremely large and constantly growing. The need is particularly felt in the earth and planetary sciences, although the requirements are also great in cellular biology, high energy physics, and other disciplines. Large scale storage hierarchies and the software to manage them must be developed, and means to distribute the data nationally and internationally are also required. Although this area of high performance computing has been relatively neglected in the past, these problems are now receiving significantly more attention.

ROLES FOR GOVERNMENT AGENCIES

What should government agencies (NSF, DoD, DoE) do to advance HPC beyond today's state of the art? What more might they be doing?

The National Science Foundation plays several critical roles in advancing high performance computing. First, NSF's support of basic research and human resources in all areas of science and engineering (and particularly in mathematics, computer science, and engineering) has been responsible for many of the advances in our ability to tackle grand challenge problems successfully. The Supercomputer Centers and the NSFnet have been essential to the growth of high performance computing as a basic paradigm in science and engineering. These efforts have been successful and should be continued.

However, NSF has done too little in supporting computational engineering in the computer science community. For example, the NSF Supercomputer Centers were slow in providing experimental parallel computing facilities and are currently not responding adequately to integrating emerging technologies from the computer science community. Although this situation is gradually changing, the pace of the change should be accelerated.

Many advances in high performance computer systems have been funded and encouraged by the Advanced Research Projects Agency (ARPA), the major supporter of large scale projects in computer science and engineering research and development in the US. ARPA has been charged by Congress to champion "dual use" technology; in so doing it is addressing many of the needs of computational science and engineering, even in the mathematical software arena, that are common to defense and commercial applications, and to the science that underlies both.

The Department of Energy has traditionally provided substantial support to computer science and engineering research within its national laboratories and at universities, with strong impetus provided by national defense requirements and resources.

More recently, the focus has shifted to the high performance computing and communications needs of the unclassified Energy Research programs within DoE. The National Energy Research Supercomputer Center (NERSC) and the Energy Sciences Network (ESnet) provide production services similar to the NSF supercomputer centers and the NSFnet. Under the DoE HPCC component, "grand challenge" applications are supported at NERSC and also at two High Performance Computing Research Centers (HPCRCs), which offer selected access for grand challenge applications to leading edge parallel computing machines. DoE also sponsors a variety of graduate fellowships in the computational sciences. The computational science infrastructure and traditions of DoE remain sound; however, the ability of the Department to advance the state of the art in high performance computing systems will be paced by its share of the funding available through the Federal High Performance Computing Initiative or through defense conversion funds.

The Department of Commerce has not been a significant source of funds for computer system research and development since the very early days of the computer industry, when the National Bureau of Standards built one of the first digital computers. NBS has been an important factor in supporting standards development, particularly for the Federal Information Processing Standards issued by GSA. The expanded role of the National Institute of Standards and Technology (as NBS is now called) under the Clinton administration may include this kind of activity, especially when industrial participation is a desired component.
NASA is embarked on a number of projects of potential importance, especially the development of a shared data system for the global climate change program, which will generate massive amounts of data from the Earth Observing Satellite Program.

What is the role of the NSF computer science and applied mathematics research program? Is it relevant to the availability of HPC resources in a five year time span?

Investments in mathematics and computer science research provide the foundation for attacking today's problems in high performance computing and must continue. NSF continues to be the primary U.S. source of funds for mathematics and computer science research within the scope of what one or two investigators and several graduate assistants can do. Many fundamental advances in algorithms, programming languages, operating systems, and computer architecture have been NSF funded. This mission has been just as vital as ARPA's and is complementary to it.

Among the largest barriers to the effective use of emergent computing technologies is the current lack of parallel architectures from which it is relatively easy to extract peak performance, of system software (operating systems, databases, compilers, programming models) that takes advantage of these architectures, and of parallel algorithms, mathematical modeling, and efficient, high order numerical techniques. These are core computational mathematics and computer science/engineering research issues, many of which are best tackled through NSF's traditional peer-reviewed model. NSF should increase its support of this work. Increased investment in basic research and human resources in mathematics and computer science/engineering could significantly accelerate the pace of HPC technology development. In addition, technology transfer can be increased by supporting a new scale of research not currently being funded by any agency: small teams with annual budgets in the $250K to $1M range.

Such projects were once supported by DoE, and also indirectly by ARPA at the former "block grant" universities.[25] These enterprises are now generally too small for ARPA and seem to be too big for current NSF budget levels in computer science. A project of this scale could develop and release an innovative piece of software to the high performance computing community at large, or build a modest hardware prototype as a stepping stone to more significant funding. A project of this scale would also allow multidisciplinary collaboration, either within mathematics and computer science/engineering (architecture, operating systems, compilers, algorithms) or with related disciplines (for example, astronomers or chemists working with computer scientists and mathematicians interested in innovative programming or architectural support for that problem domain).

[Footnote 25: ARPA made an enormous contribution to the maturing of computer science as a discipline in U.S. universities by consistently funding research at about $1 million per year at MIT, Carnegie Mellon, Stanford University, and Berkeley. At this level of consistent support these universities could build up a critical mass of faculty and train the next generation of faculty leadership for departments being set up at every substantial research university. This targeted investment played a role in computer science not unlike what NSF has done in computational science at the four Centers.]

Appendix D

[Figure 1: Supercomputer Usage by State, Fiscal Year 1992, at the four National Supercomputer Centers. Map shading indicates usage per state: under 10 CPU hours; 10-100 CPU hours; 101-1,000 CPU hours; 1,001-10,000 CPU hours.]

[Figure 2: Trends in user status at the four National Supercomputer Centers (data from use of vector supercomputers only). Two panels plot the number and the percentage of users by academic status (Faculty, Postdoc, Graduate, Undergraduate, High School, Industry) for FY86 through FY92.]

[Figure 3: Installed supercomputer base (U.S. vendors only), broken down by region (US, Japan, Europe, Other) and by sector (Academic, Government, Industry).]

[Figure 4: Trend in NSF Supercomputer Center leverage and growth in number of users, FY88 through FY92: number of active users plotted against NSF and other funding ($M).]
Appendix E

REVIEW AND PROSPECTUS OF COMPUTATIONAL AND COMPUTER SCIENCE AND ENGINEERING

Personal Statements by Panel Members

Computational Mechanics and Structural Analysis
by Theodore Belytschko

High performance computing has had a dramatic impact on structural analysis and computational mechanics, with significant benefits for various industries. The finite element method, which was developed at aerospace companies such as Boeing in the late 1950's and subsequently at the universities, has become one of the key tools for the mechanical design of almost all industrial products, including aircraft, automobiles, power plants, packaging, etc. The original applications of finite element methods were primarily in linear analysis, which is useful for determining the behavior of engineering products in normal operating conditions. Most linear finite element analyses are today performed on workstations, except for problems with on the order of 1 million unknowns. Supercomputers are used primarily for nonlinear analysis, where they replace prototype testing.

One rapidly developing area has been automobile crashworthiness analysis, where models of automobiles are used to design for occupant safety and for features such as accelerometer placement for air bag deployment. The models currently used are generally on the order of 100,000 to 250,000 unknowns, and even on the latest supercomputers such as the CRAY C90 they require on the order of 10 to 20 hours of computer time. Nevertheless these models are still often too coarse to permit true prediction, and hence they must be tuned by tests.

Such models have had a tremendous impact on reducing the design time for automobiles, since they eliminate the need for building numerous prototypes. Almost all major automobile manufacturers have undertaken extensive programs in crashworthiness simulation by computer on high performance machines, and many manufacturers have bought supercomputers almost expressly for crashworthiness simulation.

Because of the increasing concern with safety among manufacturers of many other products, nonlinear analyses are also emerging in many other industries: trucks and construction equipment, where the product must be certified for safety in accidents such as overturning or impact from falling construction equipment; railroad car safety; and the safety of aircraft, where the FAA has recently undertaken programs to simulate the response of aircraft to small weapons so that damage from such explosives can be minimized. Techniques of this type are also being used to analyze the safety of jet engines subject to bird impact, the containment of fragments in case of jet engine failure, and bird impact on aircraft canopies. In several cases, NSF Supercomputer Centers have introduced industry to the potential of this type of simulation. In all of these, highly nonlinear analyses requiring on the order of 10x floating point operations per simulation must be made; even on today's supercomputers, such simulations are still often so time consuming that decisions cannot be reached fast enough. Therefore an urgent need exists for increasing the speed with which such simulations can be made.

Nonlinear finite element analysis is also becoming increasingly important in the simulation of manufacturing processes. For example, tremendous improvements can be made in processes such as sheet metal forming, extrusion, and machining if these are carefully designed through nonlinear finite element simulation. These simulations offer large cost reductions and reduce design time. The design of materials can also be improved if computers are first used to examine how materials fail and then to design the material so that failure is either reduced or occurs in a less catastrophic manner. Such simulations require great resolution, and at the tips of cracks phenomena at the atomic scale must be considered.

Most of the calculations mentioned above are not made with sufficient resolution because of limitations in computational power and speed. Also, important physical phenomena are omitted for reasons of expediency, and their computational modeling is not well understood. Therefore, the availability of more computational power will increase our understanding of modeling nonlinear structural response and provide industry with more effective tools for design.
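As a toy illustration of the finite element machinery discussed in this statement, the sketch below assembles and solves a one-dimensional linear elastic bar with a handful of elements. It is a hypothetical, minimal example: the material constants, loading, and element count are invented for illustration and bear only a structural resemblance to the million-unknown nonlinear production analyses described above.

```python
# Minimal 1D linear finite element sketch: axial bar, fixed at x=0,
# point load at x=L.  Illustrative values only.
import numpy as np

n_el, L, EA, P = 8, 1.0, 1.0, 1.0     # elements, length, stiffness, end load
n_nodes = n_el + 1
h = L / n_el

K = np.zeros((n_nodes, n_nodes))      # global stiffness matrix
f = np.zeros(n_nodes)                 # global load vector
ke = (EA / h) * np.array([[1.0, -1.0], [-1.0, 1.0]])   # element stiffness

for e in range(n_el):                 # assemble element contributions
    dofs = [e, e + 1]
    K[np.ix_(dofs, dofs)] += ke
f[-1] = P                             # point load at the free end

# Apply the essential boundary condition u(0)=0 and solve the reduced system.
u = np.zeros(n_nodes)
u[1:] = np.linalg.solve(K[1:, 1:], f[1:])

print("tip displacement:", u[-1], "(exact P*L/EA =", P * L / EA, ")")
```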

Cellular and Systemic Biology
by Teresa Chay

HPC has made a great impact on a variety of biological disciplines, such as physiology, biological macromolecules, and genetics. I will discuss below three vital organs in our body where our understanding has greatly benefitted from high performance computing and will continue to do so in the future.

Computer Models for Vital Organs in Our Body

Although the heart, brain, and pancreas function differently in our body (i.e., the heart circulates the blood, the brain stores and transfers information, and the pancreas secretes vital hormones such as insulin), the mechanisms underlying their functioning are quite similar - "excitable" cells that are coupled electrically and chemically, forming a network.

Ion channels in the cell membranes are involved in information transfer. The ion channels receive stimuli from neighboring cells and from cells in other organs. Upon receiving stimuli, some ion channels open while others close. When these channels are open, they pass ions into or out of the cells, creating an electrical difference (membrane potential) between the outside and the inside of the cell. Some of these ion channels are sensitive to the voltage (i.e., membrane potential) and others are responsive to chemical substances (e.g., neurotransmitters and hormones).

Opening of ion channels creates the "action potential" which spreads from cell to cell, either directly or via chemical mediators. Electrical transmission and chemical transmission are interdependent in that chemical substances can influence the ionic currents and vice versa. For example, the arrival of the action potential at a presynaptic terminal may cause a release of chemical substances; in turn these chemicals can open or close the ion channels in the postsynaptic cell.

Why is high performance computing needed? How the signals are passed from cell to cell is a nonlinear dynamical problem and can be treated mathematically by solving simultaneous differential equations. These equations involve voltage, conductance of the ionic current, and concentration of those chemical substances that influence conductances. Depending on the model, each network can be represented by a set of several million differential equations. The need for parallel processors is obvious - the organs process information just the same way as the most powerful parallel supercomputers do. Since the mechanisms involved in these three organs are essentially the same, algorithms developed for one can be easily modified to solve for another.

Three specific areas in which high performance computing is central are cardiac research, neural networks, and insulin secretion. These are detailed below.

Cardiac Research

It would be a great benefit to cardiac research if a realistic computer model of the heart, its valves, and the nearby major vessels were to become available. With such a model the general public would be able to see how the heart generates its rhythm, how this rhythm leads to contraction, and how the contraction leads to blood circulation. Scientists, on the other hand, could study normal and diseased heart function without the limitations of using human and animal subjects. With future HPC and parallel processing, it may be possible to build a model heart without consuming too many hours of computer time. As a step toward achieving this goal, scientists in computational cardiology have thus far accomplished the following three objectives.

A computer model of blood flow in the heart: Researchers have used supercomputers to design an artificial mitral valve (the valve that controls blood flow between the left atrium and left ventricle) which is less likely to induce clots. The computer simulated mitral valve has been patented and licensed to a corporation developing it for clinical use. With parallel processors, this technique is now being expanded in order to construct a realistic three dimensional heart.

Constructing an accurate map of the electrical potential of the heart surface (epicardium): Arrhythmia in the heart is caused by a breakdown in the normal pattern of cardiac electrical activity. Many arrhythmias occur because of abnormal tissue inside the heart. Bioengineers have been developing a technique with which to obtain the epicardial potential map from the coarse information of it that can be recorded on the surface of the body via electrocardiogram. With such a map, clinicians can accurately locate the problem tissue and remove it with a relatively simple surgical procedure instead of with drastic open-heart surgery.

Controlling sudden cardiac death: Sudden cardiac death is triggered by an extra heart beat. Such a beat is believed to initiate spiral waves (i.e., reentrant arrhythmias) on the main pumping muscle of the heart, known as the ventricular myocardium. With HPC it is possible to simulate how this part of the heart can generate reentrant arrhythmia upon receiving a premature pulse. Computer modelling of reentrant arrhythmia is very important clinically since it can be used as a tool to predict the onset of this type of deadly arrhythmia and to find a means to cure it by properly administering antiarrhythmia drugs (instead of actually carrying out experiments on animals).

Parallel computing and the development of better software will soon enable researchers to extend their simulations to a more realistic three-dimensional system which includes the detailed geometry of ventricular muscle.

Neural Networks

Learning how the brain works is a grand challenge in science and engineering. Artificial neural nets are based largely on their connection patterns, but they have very simple processing elements or nodes (either ON or OFF). That is, a simple network consists of a layered network with an input layer (sensory receptors), one or more "hidden" layers (representing the interneurons which allow animals to have complex behaviors), and an output layer (motor neurons). Each unit in a neural net receives inputs, both excitatory and inhibitory, from a number of other units and, if the strength of the signal exceeds a given threshold, the "on-unit" sends signals to other units.

The real nervous system, however, is a complex organ that cannot be viewed simply as an artificial neural net. Real neural nets are not hard-wired but are made of neurons which are connected by synapses. There are at least 10 billion active neurons in the brain. There are thousands of synapses per neuron, and hundreds of active chemicals which can modify the properties of ion channels in the membrane.

With HPC and massive parallel computing, neuroscientists are moving into a new phase of investigation which focuses on biological neural nets, incorporating features of real neurons and the connectivity of real neural nets. Some of these models are capable of simulating patterns of electrical activity, which can be compared to actual neuronal activity observed in experiments. With the biological neural nets, we begin to understand the operation of the nervous system in terms of the structure, function, and synaptic connectivity of the individual neurons.

Insulin Secretion

Insulin is secreted from the beta cells in the pancreas. To cure diabetes it is essential to understand how beta cells release insulin. The beta cells are located in a part of the pancreas known as the islet of Langerhans. In islets, beta cells are coupled by a special channel (the gap junctional channel) which connects one cell to the next. Gap junctional channels allow small ions such as calcium ions to pass through from cell to cell. In the plasma membrane of beta cells, there are ion channels whose properties change when the content of calcium ions changes. There are other types of cells in the islet which secrete hormones. These hormones in turn influence insulin secretion by altering the properties of the receptors bound in the membrane of a beta cell. Thus the study of how beta cells release insulin involves very complex non-linear dynamics. With a supercomputer it is possible to construct a model of the islet of Langerhans. With this model, researchers would learn how beta cells release insulin in response to external signals such as glucose, neurotransmitters, and hormones. They would also learn the roles of other cell types in the islet of Langerhans and how they influence the functional properties of beta cells. A model in which beta cells function as a cluster has already been constructed.
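To give a flavor of the coupled-cell computations described above, the sketch below integrates a small chain of electrically coupled FitzHugh-Nagumo cells, a standard two-variable caricature of an excitable cell used here as a stand-in for the detailed ionic models the author has in mind. The cell count, coupling strength, and stimulus are arbitrary illustrative choices; stimulating the first cell launches activity down the chain.

```python
# Minimal sketch: a chain of electrically coupled excitable cells
# (FitzHugh-Nagumo kinetics), integrated with forward Euler.
import numpy as np

n_cells, dt, steps = 50, 0.05, 4000
a, b, eps, g = 0.7, 0.8, 0.08, 0.5          # kinetics and coupling strength

v = -1.2 * np.ones(n_cells)                 # membrane-potential-like variable
w = -0.6 * np.ones(n_cells)                 # recovery variable

for step in range(steps):
    stim = np.zeros(n_cells)
    if step < 40:
        stim[0] = 0.8                       # brief stimulus to the first cell
    # nearest-neighbour electrical (gap-junction-like) coupling
    coupling = g * (np.roll(v, 1) + np.roll(v, -1) - 2 * v)
    coupling[0] = g * (v[1] - v[0])         # no-flux ends
    coupling[-1] = g * (v[-2] - v[-1])
    dv = v - v**3 / 3 - w + stim + coupling
    dw = eps * (v + a - b * w)
    v += dt * dv
    w += dt * dw

print("final potentials (first, middle, last cell):", v[0], v[n_cells // 2], v[-1])
```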

Material Science and Condensed Matter Physics
by Gary S. Grest

The impact of high performance computing on material science and condensed matter physics has been enormous. Major developments in the sixties and seventies set the stage for the establishment of computational material science as a third discipline, equal to, yet distinct from, analytic theory and experiment. These developments include the introduction of molecular dynamics and Monte Carlo methods to simulate the properties of liquids and solids under a variety of conditions. Density functional theory was developed to model the electron-electron interactions, and pseudopotential methods to model the electron-ion interactions. These methods were crucial in computing the electronic structure for a wide variety of solids. Later, the development of path integral and Green's function Monte Carlo methods allowed one to begin to simulate quantum many-body problems. Quantum molecular dynamics, which combines well-established electronic methods based on local density theory with molecular dynamics for the atoms, has recently been introduced. On a more macroscopic scale, computational mechanics, which was discussed above by T. Belytschko, was developed to study structure-property relations.

Current usage of high performance computing in material science and condensed matter physics can be broadly classified as Classical Many-Body and Statistical Mechanics, Electronic Structure and Quantum Molecular Dynamics, and Quantum Many-Body, which are discussed below.

Classical Many-Body and Statistical Mechanics

Classical statistical mechanics, where one treats a huge number of atoms collectively, dates back to Boltzmann and Gibbs. In these systems, quantum mechanics plays only a subsidiary role. While it is needed to determine the interaction between atoms, in practice these interactions are often replaced by phenomenologically determined pairwise forces between the atoms. This allows one to treat large ensembles of atoms by molecular dynamics and Monte Carlo methods. Successes of this approach include insight into the properties of liquids, phase transitions and critical phenomena, crystallization of solids, and compositional ordering in alloys. For systems where one needs a quantitative comparison to experiment, embedded atom methods have been developed in which empirically determined functions are employed to evaluate the energy and forces. Although the details of the electronic structure are lost, these empirical methods have been successful in giving reasonable descriptions of the physical processes in many systems in which directional bonding is not important. Theoretical work in the mid-70's on renormalization group methods showed that a wide variety of different kinds of phase transitions could be classified according to the symmetry of the order parameter and the range of the interaction, and did not depend on the details of the interaction potential. This allowed one to use relatively simple models, usually on a lattice, to study critical phenomena and phase behavior.

While the basic computational techniques used in classical many-body theory are now well established, there remain a large number of important problems in material science which can only be addressed with these techniques. At present, with Cray YMP class computers, one can typically handle thousands of atoms for hundreds of picoseconds. With the next generation of massively parallel machines, this can be extended to millions of particles for microseconds. While not all problems require this large number of particles or long times, many do. Problems which will benefit from the faster computational speed typically involve large length and/or long time scales. Examples include polymers and macromolecular liquids, where typical length scales of each molecule can be hundreds of angstroms and relaxation time scales extend from microseconds upward; liquids near their glass transition, where relaxation times diverge exponentially; nucleation and phase separation, which require both large systems and long times; and the effects of shear. Macromolecular liquids typically contain objects of very different sizes. For example, in most colloidal suspensions, the colloid particles are hundreds of angstroms in size while the solvent is only a few angstroms. At present, the solvent molecules must be treated as a continuum background. While this allows one to study the static properties of the system, the dynamics are incorrect. Faster computers will allow us to study flocculation, sedimentation, and the effects of shear on order. Non-equilibrium molecular dynamics methods have been developed to simulate particles under shear. However, due to the lack of adequate computational power, simulations at present can only be carried out at unphysically high shear rates. Access to HPC will enable one to understand the origins of shear thinning and thickening in a variety of technologically important systems. While molecular dynamics simulations are inherently difficult to vectorize, recent efforts to run them on parallel computers have been very encouraging, with increases in speed of nearly a factor of 30 in comparison to the Cray YMP.

Monte Carlo simulations on a lattice remain a very powerful computational technique. Simulations of this type have been very successful in understanding critical phenomena, phase separation, growth kinetics, and disordered magnetic systems. Successes include accurate determination of universal critical exponents, both static and dynamic, and evidence for the existence of a phase transition in spin glasses. Future work using massively parallel computers will be essential to understand wetting and surface critical exponents as well as systems with complex order parameters. Direct numerical integration of the Langevin equations that describe nonlinear fluctuating hydrodynamics can be carried out in two dimensions on a Cray YMP class supercomputer, but the extension to three dimensions requires HPC. Finally, cellular automata solutions of the Navier-Stokes and Boltzmann equations are a powerful method for studying hydrodynamics. All of these methods, because of their inherent locality, run very efficiently on parallel computers.

Electronic Structure and Quantum Molecular Dynamics

The ability of quantum mechanics to predict the total energy of a system of electrons and nuclei enables one to reap tremendous benefits from quantum-mechanical calculations. Since many physical properties can be related to the total energy of a system or to differences in total energy, tremendous theoretical effort has gone into developing accurate local density functional total energy techniques. These methods have been very successful in accurately predicting equilibrium constants, bulk moduli, phonons, piezoelectric constants, and phase-transition pressures and temperatures for a variety of materials. These methods have recently been applied to study the structural, vibrational, mechanical, and other ground state properties of systems containing up to several hundred atoms. Some recent successes include the unraveling of the normal state properties of high T_c superconducting oxides, predictions of new phases of materials under high pressure, predictions of superhard materials, determination of the structure and properties of surfaces, interfaces, and clusters, and calculations of properties of fullerenes and fullerites.

Particularly important are the developments of the past few years which make it possible to carry out "first principles" computations of complex atomic arrangements in materials starting from nothing more than the identities of the atoms and the rules of quantum mechanics. Recent developments in new iterative diagonalization algorithms, coupled with increases in the computational efficiency of modern high performance computers, have made it possible to use quantum mechanical calculations of the dynamics of systems in the solid, liquid, and gaseous states. The basic idea of these methods, which are known as ab initio methods, is to minimize the total energy of the system by allowing both the electronic and the ionic degrees of freedom to relax towards equilibrium simultaneously. While ab initio methods have been around for more than a decade, only recently have they been applied to systems of more than a few atoms. Now, however, this method can be used to model systems of a few hundred atoms, and this number will increase by at least a factor of 10 within the next five years. The method has already led to new insights into the structure of amorphous materials, finite temperature simulations of the new C60 solid, computation of the atomic and electronic structures of the 7x7 reconstruction of the Si(111) surface, melting of carbon, and studies of step geometries on semiconductor surfaces. In the future, it will be possible to address many important materials phenomena including phase transformations, grain boundaries, dislocations, disorder, and melting.

The problem of understanding and improving the methods of growth of complicated materials, such as multi-component heterostructures produced by epitaxial growth using molecular beam or chemical vapor deposition techniques, stands out as one very important technological application of this method. Although a "brute-force" simulation of atomic deposition on experimental time scales will not be possible for some time, one can learn a great deal from studying the mechanisms of reactive film growth. Combining atomistic calculations for the structure of an interface with continuum theories of elasticity and plastic deformation is also an important area for the future.

One of the most obvious areas for future application is biological systems, where key reaction sequences would be simulated in ab initio fashion. These calculations would not replace existing molecular mechanics approaches, but rather supplement them in those areas where they are not sufficiently reliable. This includes enzymatic reactions involving transition metal centers and other multi-center bond-reforming processes. A related area is catalysis, where the various proposed reaction mechanisms could be explicitly evaluated. Short-time finite temperature simulations can also be explored to search for unforeseen reaction patterns. The potential for new discoveries in these areas is high.

Important progress has also been made in understanding the excitation properties of solids, in particular the prediction of band offsets and optical properties. This requires the evaluation of the electron self-energy and is computationally much heavier than the local density approaches discussed above. This first principles quasiparticle approach has allowed for the first time the ab initio calculation of electron excitation energies in solids valid for quantitative interpretation of spectroscopic measurements. The excitations of systems as complex as C60 fullerites have been computed. Although the quasiparticle calculations have yet to be implemented on massively parallel machines, it is doable, and the gain in efficiency and power is expected to be similar to that for the ab initio molecular dynamics types of calculations.

Much effort has been devoted in the past several years to algorithm development to extend the applicability of these new methods to ever larger systems. Ab initio molecular dynamics has been successfully implemented on massively parallel machines for systems as large as 700 atoms. Tight binding molecular dynamics methods are an accurate, empirical way to include the electronic degrees of freedom which are important for covalently bonded materials, at speeds 300 times faster than ab initio methods. This method has already been used to simulate 2000 atoms, and with the new massively parallel machines this number will easily increase to 10,000 within a year. Another very exciting recent development in this area is work on the so-called order-N methods for electronic structure calculations. At present, quantum mechanical calculations scale at least as N^3 in the large N limit, where N is the number of atoms in a unit cell. Significant progress has been made recently by several groups in developing methods which would scale as N. The success of these approaches would further enhance our ability to study very large molecular and materials systems, including systems with perhaps thousands of atoms, in the near future.

Quantum Many-Body

The quantum many-body problem lies at the core of understanding many properties of materials. Over the last decade much of the classical methodology discussed above has been extended into the quantum regime, in particular with the development of the path integral and Green's function Monte Carlo methods. Early calculations of the correlation energy of the electron gas are extensively used in local density theory to estimate the correlation energy in solids. The low temperature properties of liquid and solid helium, three and four, the simplest strongly correlated many-body quantum systems, are now well understood thanks in large part to computer simulations. These quantum simulations have required thousands of hours on Cray-YMP class computers.

While there still remain very difficult algorithmic issues, exact fermion methods and quantum dynamical methods to name two, the progress in the next decade should parallel the previous developments in classical statistical mechanics. Computer simulations of quantum many-body systems will become a ubiquitous tool, integrated into theory and experiment. The software and hardware have reached a state where much larger, more complex, and more realistic systems can be studied. Some particular examples are electrons in transition metals, in restricted geometries, at high temperatures and pressures, or in strong magnetic fields. Mean field theory is unreliable in many of these situations. However, these applications, if they are to become routine and widely distributed in the materials science community, will require high performance hardware. Quantum simulations are naturally parallel and are likely to be among the first applications using massively parallel computers.

Thanks to J. Bernholc, D. Ceperley, J. Joannopoulos, B. Harmon, and S. Louie for their help in preparing this subsection.
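The lattice Monte Carlo calculations described above can be illustrated with a minimal Metropolis simulation of the two-dimensional Ising model. The lattice size, temperature, and sweep count below are arbitrary illustrative choices, not parameters from any study cited in this statement.

```python
# Minimal Metropolis Monte Carlo sketch for the 2D Ising model.
import random
import math

L, T, sweeps = 16, 2.27, 200          # lattice size, temperature, MC sweeps
spins = [[random.choice((-1, 1)) for _ in range(L)] for _ in range(L)]

def neighbor_sum(i, j):
    """Sum of the four nearest-neighbour spins with periodic boundaries."""
    return (spins[(i + 1) % L][j] + spins[(i - 1) % L][j]
            + spins[i][(j + 1) % L] + spins[i][(j - 1) % L])

for _ in range(sweeps):
    for _ in range(L * L):            # one sweep = L*L attempted flips
        i, j = random.randrange(L), random.randrange(L)
        dE = 2 * spins[i][j] * neighbor_sum(i, j)   # energy cost of a flip
        if dE <= 0 or random.random() < math.exp(-dE / T):
            spins[i][j] = -spins[i][j]              # accept the flip

m = sum(sum(row) for row in spins) / (L * L)
print("magnetization per spin:", m)
```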

Computational Molecular Biology/Chemistry/Biochemistry
by Barry Honig

Background

There have been a number of revolutionary developments in molecular biology that have greatly expanded the need for high performance computing within the biological community. First, there has been an exponential growth in the number of gene sequences that have been determined, and no end is in sight. Second, there has been a parallel (although slower) growth in the number of proteins whose structures have been determined from x-ray crystallography and, increasingly, from multidimensional NMR. This literal explosion in new information has led to developments in areas such as statistical analysis of gene sequences and of three dimensional structural data, new databases for sequence and structural information, molecular modeling of proteins and nucleic acids, and three-dimensional pattern recognition. The recognition of grand challenge problems such as protein folding or drug design has resulted in large part from these developments. Moreover, the continuing interest both in sequencing the human genome and in the field of structural biology guarantees that computational requirements will continue to grow rapidly in the coming decade.

To illustrate the type of problems that can arise, consider the case where a new gene has been isolated and its sequence is known. In order to fully exploit this information it is necessary to first obtain maximum information about the protein this gene encodes. This can be accomplished by searching a nucleic acid sequence data base for structurally or functionally related proteins, and/or by detecting sequence patterns characteristic of the three dimensional fold of a particular class of proteins. There are numerous complexities that arise in such searches, and the computational demands imposed by the increasingly sophisticated statistical techniques that are being used can be imposing.

A variety of methods, all of them requiring vast computational resources, are currently being applied to the protein folding problem (predicting three dimensional structure from amino acid sequence). Methods include statistical analyses (including neural nets) of homologies to known structures, approaches based on physical chemical principles, and simplified lattice models of the type used in polymer physics. A major problem in understanding the physical principles of protein and nucleic acid conformation is the treatment of the surrounding solvent. Molecular dynamics techniques are widely used to model the solvent, but their accuracy depends on the potential functions that are used as well as the number of solvent molecules that can be included in a simulation. Thus, the technique is limited by the available computational power. Continuum solvent models offer an alternative approach, but these too are highly computer intensive.

Even assuming a reliable method to evaluate free energies, the problem of conformational search is daunting. There are a large number of possible conformations available to a macromolecule, and it is necessary to develop methods, such as Monte Carlo techniques with simulated annealing, to ensure that the correct one has been included in the generated set of possibilities. A similar set of problems arises, for example, in structure-based drug design. In this case one may know the three-dimensional structure of a protein, and it is necessary to design a molecule that binds tightly to a particular site on the surface. Efficient conformational search, energy evaluation, and pattern recognition are requirements of this problem, all demanding significant computational power.

Significant progress has been made in these and many other related areas. Ten years ago most calculations were made without including the effects of solvent. This situation has changed dramatically due to scientific progress that has been potentiated by the availability of significant computational power. Some of this has been provided by Supercomputer Centers, while some has been made available by increasingly powerful workstations. Fast computers have also been crucial in the very process of three dimensional structure determination. Both x-ray and NMR data analysis have exploited methods such as molecular dynamics and simulated annealing to yield atomic coordinates of macromolecules. More generally, the new discipline of structural biology, which involves the structure determination and analysis of biological macromolecules, has been able to evolve due to the increased availability of high performance computing.

Future

Despite the enormous progress that has been made, the field is just beginning to take off. Gene sequence analysis will continue to become more effective as the available data continue to grow and as increasingly sophisticated data analysis techniques are applied. It will be necessary to make state-of-the-art sequence analysis available to individual investigators, presumably through distributed workstations and through access to centralized resources. This will require a significant training effort as well as the development of user-friendly programs for the biological community.

There is enormous potential in the area of three dimensional structure analysis. There is certain to be major progress in understanding the physical chemical basis of biological structure and function. Improved energy functionals resulting from progress in quantum mechanics will become available. Indeed, a combination of quantum mechanics and reaction field methods will make it possible to obtain accurate descriptions of molecules in the condensed phase. The impact of such work will be felt in chemistry as well as in biology. Improved descriptions of the solvent, through a combination of continuum treatments and detailed molecular dynamics simulations at the atomic level, will lead to truly quantitative descriptions of the conformational free energies and binding free energies of biological macromolecules. When combined with sophisticated conformational search techniques, simplified lattice models, and sophisticated statistical techniques that identify sequence and structural homologies, there is every reason to expect major progress on the protein folding problem.

There will be parallel improvements in structure-based design of biologically active compounds such as pharmaceuticals. Moreover, the development of new compounds based on biomimetic chemistry and new materials based on polymer design principles deduced from biomolecules should become a reality. All of this progress will require increased access to high performance computing for the reasons given above. The various simulation and conformational search techniques will continue to benefit dramatically from increased computational power. This will be true at the level of individual workstations, which a bench chemist, for example, might use to design a new drug. Work of this type often requires sophisticated three dimensional graphics and will benefit from progress in this area. Massively parallel machines will certainly be required for the most ambitious projects. Indeed, it is likely that for some applications the need for raw computing power will exceed what is available in the foreseeable future.

New developments in the areas covered in this section will have major economic impact. The biotechnology, pharmaceutical, and general health industries are obvious beneficiaries, but there will be considerable spin-off in materials science as well.

Recommendations

• Support the development of software in computational biology and chemistry. This should take the form of improved software and algorithms for workstations as well as the porting of existing programs and the development of new ones on massively parallel machines.

• Make funds available for training that will exploit new technologies and for familiarizing biologists with existing technologies.

• Funding should be divided between large centers, smaller centers involving a group of investigators at a few sites developing new technologies, and individual investigators.
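As a cartoon of the conformational search problem described in this statement, the sketch below uses Metropolis Monte Carlo with simulated annealing to find a low-energy set of torsion angles for a toy energy function. The energy model, angle count, and cooling schedule are invented purely for illustration and have no connection to any real force field.

```python
# Minimal simulated-annealing conformational search over torsion angles.
import math
import random

N_ANGLES = 6                     # toy "molecule" with six torsion angles

def energy(angles):
    """Toy torsional energy: cosine terms plus a neighbour coupling."""
    e = sum(1.0 + math.cos(3.0 * a) for a in angles)
    e += 0.5 * sum(math.cos(angles[i] - angles[i - 1]) for i in range(1, N_ANGLES))
    return e

angles = [random.uniform(-math.pi, math.pi) for _ in range(N_ANGLES)]
e = energy(angles)
T = 2.0                          # initial "temperature"

for step in range(20000):
    i = random.randrange(N_ANGLES)
    trial = list(angles)
    trial[i] += random.gauss(0.0, 0.3)          # perturb one torsion
    e_trial = energy(trial)
    # Metropolis criterion: always accept downhill, sometimes accept uphill.
    if e_trial < e or random.random() < math.exp(-(e_trial - e) / T):
        angles, e = trial, e_trial
    T *= 0.9997                                  # slow geometric cooling

print("final energy of the toy landscape:", e)
```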
Molecular Modeling and Quantum Chemistry
by William Lester

The need for high performance computing has been met historically by large vector supercomputers. It is generally agreed in the computer and computational science communities that significant improvements in computational efficiency will arise from parallelism. The advent of conventional parallel computer systems has generally required major computer code restructuring to move applications from vector serial computers to parallel architectures. The move to parallelism has occurred in two forms: distributed MIMD machines and clusters of workstations, with the former receiving the focus of attention in large multi-user center facilities and the latter in local research installations.

The tremendous interest in the simulation of biological processes at the molecular level using molecular mechanics and molecular dynamics methods has led to a continuing increase in demand for computational power. Applications have potentially high practical value and include, for example, the design of inhibitors for enzymes that are suspected to play a role in disease states and the effect of various carcinogens on the structure of DNA.

In the first case, one expects that a molecule designed to conform to the three-dimensional arrangement of the enzyme structure should be bound tightly to the enzyme in solution. This requirement, and others, make it desirable to know the tertiary structure of the enzyme. The use of computation for this purpose is contributing significantly to the understanding of those structures, which are then used to guide organic synthesis.

In the second case, serial vector supercomputers typically can carry out molecular dynamics simulations of DNA for time frames of only picoseconds to nanoseconds. A recent 200-ps calculation involving 3542 water molecules and 16 sodium ions took 140 hours of Cray Y-MP time. Extending such calculations to the millisecond or even the second range, where important motions can occur, remains a major computational challenge that will require the use of massively parallel computer systems.

Although in molecular mechanics or force field methods the computational effort is dominated by the evaluation of the force field that gives the potential energy as a function of the internal coordinates of the molecules and the non-bonded interactions between atoms, a popular approach for small organic molecules is the ab initio Hartree-Fock (HF) method, which has come into routine use by organic and medicinal chemists to study compounds and drugs. The HF method is ab initio because the calculation depends only on basic information about the molecule, including the number and types of atoms and the total charge. The computational effort of HF computations scales as N^4, where N is the number of basis functions used to describe the atoms of the molecule. Because the HF method describes only the "average" behavior of electrons, it typically provides a better description of relative geometries than of energetics. The accurate treatment of the latter requires a proper account of the instantaneous correlated motions of electrons, which inherently is not described by the HF method.

For systems larger than those accessible with HF methods, one has, in addition to molecular mechanics methods, semiempirical approaches. Their name arises from the use of experimental data to parameterize integrals and other simplifications of the HF method, leading to a reduction of computational effort to order N^3. Results of these methods can be informative for systems where the parametrization has been performed.

Recently, the density functional (DF) method has become popular, overcoming deficiencies in accuracy for chemical applications that limited earlier use. Improvements have come in the form of better basis sets, advances in computational algorithms for solving the DF equations, and the development of analytical geometry optimization methods.

The DF method is an ab initio approach that takes electron correlation into account. In view of the latter capability, it can be used to study a wide variety of systems, including metals and inorganic species.

The move to parallel systems has turned out to be a major undertaking in software development for the approaches described. Serious impediments have been encountered in algorithm modification for methods that go beyond the ab initio HF method, and in steps to maximize efficiency with increased numbers of processors. These circumstances have increased interest in quantum Monte Carlo (QMC) methods for electronic structure. In addition, QMC methods have been used with considerable success for the calculation of vibrational eigenvalues and in statistical mechanics studies.

QMC, as used here in the context of electronic structure, is an ab initio method for solving the Schroedinger equation stochastically, based on the formal similarity between the Schroedinger equation and the classical diffusion equation. The power of the method is that it is inherently an N-body method that can capture all of the instantaneous correlation of the electrons. The QMC method is readily ported to parallel computer systems, with orders of magnitude savings in computational effort over serial vector supercomputers.

In the statistical mechanical studies of complex systems, one is often interested in the spontaneous formation and energetics of structure over large length scales. The mesoscopic structures, such as vesicles and lamellae, formed from self assembly in oil/water-surfactant mixtures are important examples. The systematic analysis of these phenomena has only recently begun, and computer simulation is one of the important tools in this analysis. Due to the large length scales involved, simulation is necessarily confined to very simple classes of models. Even so, the work presses the capabilities of current computational equipment to their limits. While the stability of various nontrivial structures has been documented, we are still far from understanding the rich phase diagram in such systems. The work of Smit and coworkers using transputers demonstrates the utility of parallelization in these simulations. Future equipment should carry us much further toward understanding.

Competing interactions and the concomitant "frustration" characterize complex fluids and the resulting mesoscopic structures. Such competition is also a central feature of "random polymers" - a model for proteins and also for manufactured polymers. Computer simulation studies of random polymers can be extremely useful, though here too the computations of even the simplest models press the limits of current technology. To treat this class of systems, Binder and coworkers and Frenkel and coworkers have developed new algorithms, some of which are manifestly parallelizable. Thus, this area is one where the new computer technology should be very helpful.

Along with large length scale fluctuations, as in self assembly and polymers, simulations press current computational equipment where relaxation occurs over many orders of magnitude. This is the phenomenon of glasses. Here, the work of Fredrickson on the spin-facilitated Ising system demonstrates the feasibility of parallelization in studying long time relaxation and the glass transition by simulation.

Polymers and glasses are examples pertinent to the understanding and design of advanced materials. In addition, one needs to understand the electronic and magnetic behavior of these and other condensed matter systems. In recent years, a few methods have appeared, especially the Car-Parrinello approach, which now make feasible the calculation of electronic properties of complex materials. The calculations are intensive. For example, studying the dynamics and electronic structure of a system with only 64 atoms, periodically replicated, for only a picosecond is at the limits of current capabilities. With parallelization and simplified models one can imagine, however, significant progress in our understanding of metal-insulator transitions and localization in correlated disordered systems.
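The stochastic solution of the Schroedinger equation mentioned above can be illustrated, in a deliberately simplified form, by a variational Monte Carlo estimate of the ground-state energy of a one-dimensional harmonic oscillator with a Gaussian trial wavefunction. This toy example (the trial exponent and sampling parameters are arbitrary) only hints at the diffusion and Green's function Monte Carlo machinery used for real electronic structure work.

```python
# Minimal variational Monte Carlo sketch: ground state of the 1D harmonic
# oscillator (hbar = m = omega = 1) with trial wavefunction exp(-alpha x^2).
# Metropolis sampling of |psi|^2, averaging the local energy.
import math
import random

alpha = 0.45                      # trial parameter (exact ground state: 0.5)

def local_energy(x):
    # E_L = -(1/2) psi''/psi + (1/2) x^2  for psi = exp(-alpha x^2)
    return alpha + (0.5 - 2.0 * alpha * alpha) * x * x

x, step = 0.0, 1.0
energies = []
for sweep in range(200000):
    x_new = x + random.uniform(-step, step)
    # acceptance ratio |psi(x_new)|^2 / |psi(x)|^2
    ratio = math.exp(-2.0 * alpha * (x_new * x_new - x * x))
    if random.random() < min(1.0, ratio):
        x = x_new
    if sweep > 10000:             # discard equilibration sweeps
        energies.append(local_energy(x))

print("estimated ground-state energy:", sum(energies) / len(energies))
# With alpha = 0.5 this gives exactly 0.5; alpha = 0.45 gives a slightly
# higher estimate, as the variational principle requires.
```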
Mathematics and High Performance Computing
by James Sethian

A. Introduction
Mathematics underlies much, if not all, of high performance computing. At first glance, it might seem that mathematics, with its emphasis on theorems and proofs, might have little to contribute to solving large problems in the physical sciences and engineering. On the contrary, in the same way that mathematics contributes the underlying language for problems in the sciences, engineering, discrete systems, etc., mathematical theory underlies such factors as the design and understanding of algorithms, error analysis, approximation accuracy, and optimal execution. Mathematics plays a key role in the drive to produce faster and more accurate algorithms which, in tandem with hardware advances, produce state-of-the-art simulations across the wide spectrum of the sciences.

At the same time, high performance computing provides a valuable laboratory tool for many areas of theoretical mathematics such as number theory and differential geometry. At the heart of most simulations lies a mathematical model and an algorithmic technique for approximating the solution to this model. Aspects of such areas as approximation theory, functional analysis, numerical analysis, probability theory, and the theory of differential equations provide valuable tools for designing effective algorithms, assessing their accuracy and stability, and suggesting new techniques.

What is so fascinating about the intertwining of computing and mathematics is that each invigorates the other. For example, understanding of entropy properties of differential equations has led to new methods for high resolution shock dynamics, approximation theory using multipoles has led to fast methods for N-body problems, methods from hyperbolic conservation laws and differential geometry have produced exciting schemes for image processing, parallel computing has spawned new schemes for numerical linear algebra and multi-grid techniques, and methods designed for tracking physical interfaces have launched new theoretical investigations in differential geometry, to name just a few. Along the way, this interrelation between mathematics and computing has brought breakthroughs in such areas as material science (such as new schemes for solidification and fracture problems), computational fluid dynamics (e.g., high order projection methods and sophisticated particle schemes), computational physics (such as new schemes for Ising models and percolation problems), environmental modeling (such as new schemes for groundwater transport and pollutant modeling), and combustion (e.g., new approximation models and algorithmic techniques for flame chemistry/fluid mechanical interactions).

B. Current State
Mathematical research which contributes to high performance computing exists across a wide range. On one end are individual investigators or small, joint collaborations. In these settings, the work takes a myriad of forms: brand-new algorithms are invented which can yield an order-of-magnitude speedup in computer resources, existing techniques are analyzed for convergence properties and accuracy, and model problems are posed which can isolate particular phenomena. For example, such work includes analysts working on fundamental aspects of the Navier-Stokes equations and turbulence theory (cf. the following section on Computational Fluid Dynamics), applied mathematicians designing new algorithms for model equations, discrete mathematicians focussed on combinatorics problems, and numerical linear algebraists working in optimization theory. At the other end are mathematicians working in focussed teams on particular problems, for example, in combustion, oil recovery, aerodynamics, material science, computational fluid dynamics, operations research, cryptography, and computational biology.

Institutionally, mathematical work in high performance computing is undertaken at universities, the National Laboratories, the NSF High Performance Computing Centers, NSF Mathematics Centers (such as the Institute for Mathematical Analysis and the Mathematical Sciences Research Institute, and the Institute for Advanced Study), and across a spectrum of industries. In recent years, high performance computing has become a valuable tool for understanding subtle aspects of theoretical mathematics. For example, computing has revolutionized the ability to visualize and evolve complex geometric surfaces, provided techniques to untie knots, and helped compute algebraic structures.

C. Recommendations
Mathematical research is critical to ensure state-of-the-art computational and algorithmic techniques which foster the efficient use of national computing resources. In order to promote this work, it is important that:

• Mathematicians be supported in their need to have access to the most advanced computing systems available, both through networks to supercomputer facilities, on-site experimental machines (such as parallel processors), and individual high-speed workstations.

• State-of-the-art research in modeling, new algorithms, applied mathematics, numerical analysis, and associated theoretical analysis be amply supported; it is this work that continually and continuously rejuvenates computational techniques. Without it, yesterday's algorithms will be running on tomorrow's machines.

• Such research be supported on all levels: the individual investigator, small joint collaborations, interdisciplinary teams, and large projects.

• Funding be significantly increased in the above areas, both to foster frontier research in computational techniques, and to use computation as a bridge to bring mathematics and the sciences closer together.

Computational Fluid Dynamics
by James Sethian

Introduction
The central goal of computational fluid dynamics (CFD) is to follow the evolution of a fluid by solving the appropriate equations of motion on a numerical computer. The fundamental equations require that the mass, momentum, and energy of a liquid/gas are conserved as the fluid moves. In all but the simplest cases, these equations are too difficult to solve mathematically, and instead one resorts to computer algorithms which approximate the equations of motion. The yardstick of success is how well the results of numerical simulation agree with experiment in cases where careful laboratory experiments can be established, and how well the simulations can predict highly complex phenomena that cannot be isolated in the laboratory. The effectiveness and versatility of a computational fluid dynamics simulation rests on several factors. First, the underlying model must adequately describe the essential physics. Second, the algorithm must accurately approximate the equations of motion. Third, the computer program must be constructed to execute efficiently. And fourth, the computer equipment must be fast enough and large enough to calculate the answers sufficiently rapidly to be of use. Weaving these factors together, so that answers are accurate, reliable, and obtained with acceptable cost in an acceptable amount of time, is both an art and a science. Current uses of computational fluid dynamics range from analysis of basic research into fundamental physics to commercial applications. While the boundaries are not sharp, CFD work may be roughly categorized in three ways: Fundamental Research, Applied Science, and Industrial Design and Manufacturing.
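For reference, one standard way of writing the conservation laws referred to above (inviscid, compressible form; the notation is an editorial choice and does not appear in the report) is

\[
\frac{\partial \rho}{\partial t} + \nabla\!\cdot(\rho\mathbf{u}) = 0,
\qquad
\frac{\partial (\rho\mathbf{u})}{\partial t} + \nabla\!\cdot(\rho\,\mathbf{u}\otimes\mathbf{u}) + \nabla p = 0,
\qquad
\frac{\partial E}{\partial t} + \nabla\!\cdot\bigl[(E+p)\,\mathbf{u}\bigr] = 0,
\]

where rho is the density, u the velocity, p the pressure, and E the total energy per unit volume. Adding viscous stress and heat-conduction terms to the momentum and energy equations yields the Navier-Stokes equations discussed later in this section.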
CFD and Fundamental Research
At many of the nation's research universities and national laboratories, much of the focus of CFD work is on fundamental research into fluid flow phenomena. The goal is to understand the role that fluid motion plays in such areas as the evolution of turbulence in the atmosphere and in the oceans, the birth and evolution of galaxies, atmospherics on other planets, the formation of polymers, physiological fluid flow in the body, and the interplay of fluid mechanics and material science, such as in the physics of superconductors. In these simulations, often employing the most advanced and sophisticated algorithms, the emphasis is on accurate solutions and basic insight. These calculations are often among the most expensive of all CFD simulations, requiring many hundreds of hours of computer time on the most advanced machines available for a single simulation. The modeling and algorithmic techniques for such problems are constantly under revision and refinement. For the most part, the major advances in new algorithmic tools, from schemes to handle the associated numerical linear algebra to high order methods to approximate difference equations, have their roots in basic research into CFD applied to fundamental physics.

CFD and Applied Science
Here, the main emphasis is on the application of the tools of computational fluid dynamics to problems motivated by specific questions such as might occur in natural phenomena or physical processes. Such work might include detailed studies of the propagation of flames in engines or fire research in closed rooms, the fluid mechanics involved in the dispersal of pollutants or toxic groundwater transport, the hydraulic response of a proposed heart valve, the development of severe storms in the atmosphere, and the aerodynamic properties of a proposed space shuttle design. For the most part, this work is also carried out throughout the national laboratories and universities with government support. While cost is not a major issue in these investigations, the focus is more on obtaining answers to directed questions. Less concerned with the algorithm for its own sake, this work links basic research with commercial CFD applications, and provides a stepping stone for advances in algorithms to propagate into industrial sectors.

CFD and Industrial Design and Manufacturing
The focus in this stage of the process is on applying the tools of computational fluid dynamics to solve problems that directly relate to technology. A vast array of examples exist, such as the development of a high-speed inkjet plotter, the action of slurry beds for processing minerals, analysis of the aerodynamic characteristics of an automobile or airplane, efficiency analysis of an internal combustion engine, performance of high-speed computer disk drives, and optimal pouring and packaging techniques in manufacturing. For the most part, such work is carried out in private industry, often with only informal ties to academic and government scientists. Communication of new ideas rests loosely on the influx of new employees trained in the latest techniques, journal articles, and professional conferences. A distinguishing characteristic of this work is its emphasis on turnaround time and cost. Here, cost is not only the cost of the equipment to perform the calculation, but the people-years involved in developing the computer code, and the time involved in performing many hundreds of simulations as part of a detailed parameter study. This motivation is quite different from that in the other two areas. The need to perform a large number of simulations under extremely general circumstances may mean that a simple and fast technique that attacks only a highly simplified version of the problem is preferable to a highly sophisticated and accurate technique that requires many orders of magnitude more computational effort. This orientation lies at the heart of the applicability and suitability of computational techniques to a competitive industry.

Future of CFD

• Modeling and Algorithmic Issues. In many ways, the research, applied science, and industry agenda in CFD has changed, mostly in response to increased computational power coupled to significant algorithmic and theoretical/numerical advances. On the research side, up through the early 1980's, the emphasis was on basic discretization methods. In that setting, it was possible to develop methods based on looking at simple problems in two, or even one, space dimension, in simple geometries, and in a fair degree of isolation from the fluid dynamics applications. Over the last five years, however, there has been a transition to the next generation of problems. These problems are more difficult in part because they are attempting more refined and detailed simulations, necessitating finer grids and more computational elements. However, a more fundamental issue is that these problems are qualitatively different from those previously considered. To begin, they often involve complex and less well-understood physical models - chemically reacting fluid flows, flows involving multiphase or multicomponent mixtures of fluids, or other complex constitutive behavior. They are often set in three dimensions, in which both the solution geometry and the boundary geometry are more complicated than in two dimensions. Finally, they often involve resolving multiple length and time scales, such as boundary and interior layers coming from small diffusive terms, and intermittent large variations that arise in fluid turbulence. Algorithmically, these require work in several areas. Complex physical behavior makes it necessary to develop a deeper understanding of the physics and modeling than had previously been required. Additional physics requires the theory and design of complex boundary conditions to couple the fluid mechanics to the rest of the problem. Multiple length scales and complex geometries lead to dynamically adaptive methods, since memory and compute power are still insufficient to brute-force through most problems. For example, the wide variation in length and time scales in turbulent combustion requires a host of iterative techniques, stiff ODE solvers, and adaptive techniques (see the sketch following this discussion). Further algorithmic advances are required in areas such as domain decomposition, grid partitioning, error estimation, and preconditioners and iterative solvers for non-symmetric, non-diagonally dominant matrices. Mesh generation, while critical, has not progressed far. All in all, to accomplish in three-dimensional complex flow what is now routine in two-dimensional basic flow will require theory, numerics, and considerable cleverness. The net effect of these developments is to make the buy-in to perform CFD research much higher. Complex physical behavior makes it necessary to become more involved with the physics modeling than had previously been the case. The problems are sufficiently difficult that one cannot blindly throw them at the computer and overwhelm them. A considerable degree of mathematical, numerical, and physical understanding must be obtained about the problems in order to obtain efficient and accurate solution techniques. On the industrial side, truly complex problems are still out of reach. For example, in the aircraft industry, we are still a long way from a full Navier-Stokes, high Reynolds number, unsteady flow around a commercial aircraft. Off in the distance are problems of takeoff and landing, multiple wings in close relation to each other, and flight recovery from sudden changes in conditions. In the automotive industry, a solid numerical simulation of the complete combustion cycle (as opposed to a time-averaged transport model) is still many years away. Other automotive CFD problems include analysis of coolant flows, thermal heat transfer, plastic mold problems, and sheet metal formation.
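The stiff-solver ingredient mentioned above is easiest to see on a toy problem. The sketch below is an editorial illustration, not a production combustion integrator: a backward (implicit) Euler step, solved by Newton iteration, applied to the single stiff equation y' = -k(y - cos t) with k = 10^4. An explicit method would need a step size of order 1/k to remain stable, while the implicit step below follows the slow solution y ~ cos t with a step several hundred times larger.

    /* Backward (implicit) Euler with Newton iteration for a stiff test
       problem  y' = -k (y - cos t),  k >> 1.  A minimal illustration of
       the "stiff ODE solver" ingredient, not a production integrator. */
    #include <stdio.h>
    #include <math.h>

    static const double k = 1.0e4;                  /* stiffness parameter */

    static double f(double t, double y)    { return -k * (y - cos(t)); }
    static double dfdy(double t, double y) { (void)t; (void)y; return -k; }

    int main(void)
    {
        double t = 0.0, y = 2.0, h = 0.05;          /* step far larger than 1/k */
        for (int n = 0; n < 200; n++) {
            /* Solve  y_new - y - h*f(t+h, y_new) = 0  by Newton iteration. */
            double y_new = y;                       /* initial guess */
            for (int it = 0; it < 20; it++) {
                double g  = y_new - y - h * f(t + h, y_new);
                double dg = 1.0 - h * dfdy(t + h, y_new);
                double dy = g / dg;
                y_new -= dy;
                if (fabs(dy) < 1.0e-12) break;
            }
            y = y_new;
            t += h;
        }
        /* The solution relaxes onto the slow manifold y ~ cos(t). */
        printf("t = %g   y = %g   cos(t) = %g\n", t, y, cos(t));
        return 0;
    }

In a reacting-flow code the scalar Newton solve becomes a linear system over the chemical species in every grid cell, which is where the preconditioners and iterative solvers mentioned above come in.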
• High Performance Computing Issues. The computing needs for the next five years of CFD work are substantial. As an example, an unforced Navier-Stokes simulation might require 1000 cells in each of three space dimensions. Presuming 100 flops per cell and 25,000 time steps, this yields 10^9 x 10^2 x 2.5 x 10^4 = 2.5 x 10^15 flops; a typical compute time of 2 hours would then require a machine sustaining roughly 350 gigaflops (the arithmetic is restated after the recommendations below). Adding forcing, combustion, or other physics severely extends this calculation. As a related issue, memory requirements pose an additional problem. Ultimately, more compute power and memory are needed. The promise of a single, much faster, larger vector machine is not being made convincingly, and CFD is attempting to adapt accordingly. Here is an area where the dream of parallelism is both tantalizing and frustrating. To begin, parallel computing has naturally caused emphasis on issues related to processor allocation and load balancing. To this end, communications cost accounting (as opposed to simply accounting for floating point costs) has become important in program design. For example, parallel machines can often have poor cache memory management and a limited number of paths to/from main memory; this can imply a long memory fetch/store time, which can result in actual computational speeds for real CFD problems far below the optimal peak speed performance. To compensate, parallel computers tend to be less memory efficient than vector machines, as space is exchanged for communication time (duplicating data where possible rather than sending it between processors). The move to parallel machines is complicated by the fact that millions of lines of CFD codes have been written in the serial/vector format. The instability of the hardware platforms, the lack of a standard global high performance Fortran and C, the lack of complete libraries, and insecurity associated with a volatile industry all contribute to the caution and reluctance of all but the most advanced research practitioners of CFD.

Recommendations
In order to tackle the next generation of CFD problems, the field will require:

• Significant accessibility to the fastest current coarse-grained parallel machines.

• Massively parallel machines with large memory, programmable under stable programming environments, including high performance Fortran and C, mathematical libraries, functioning I/O systems, and advanced visualization systems.

• Algorithmic advances in adaptive meshing, grid generation, load balancing, and high order difference, element, and particle schemes.

• Modeling and theoretical advances coupling fluid mechanics to other related physics.
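The computing-needs estimate above works out as follows (the cell count, the 100 flops per cell per step, the 25,000 steps, and the 2-hour wall-clock target are all the estimate's own assumptions):

\[
10^{9}\ \text{cells}\times 10^{2}\ \frac{\text{flops}}{\text{cell}\cdot\text{step}}\times 2.5\times 10^{4}\ \text{steps}
= 2.5\times 10^{15}\ \text{flops},
\qquad
\frac{2.5\times 10^{15}\ \text{flops}}{2\times 3600\ \text{s}}
\approx 3.5\times 10^{11}\ \frac{\text{flops}}{\text{s}}
\approx 350\ \text{gigaflops sustained}.
\]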

High Performance Computing in Physics
by James Sethian and Neal Lane

INTRODUCTION AND BACKGROUND

High Energy Physics
Two areas in which high performance computing plays a crucial role are lattice gauge theory and the analysis of experimental data. Lattice gauge theory addresses some of the fundamental theoretical problems in high energy physics, and is relevant to experimental programs in high energy and nuclear physics. In the standard model of high energy physics, the strong interactions are described by quantum chromodynamics (QCD). In this theory the forces are so strong that the fundamental entities, the quarks and gluons, are not observed directly under ordinary laboratory conditions.

Instead, one observes their bound states, protons and neutrons, the basic constituents of the atomic nucleus, and a host of short-lived particles produced in high energy accelerator collisions. One of the major objectives of lattice gauge theory is to calculate the masses and other basic properties of these strongly interacting particles from first principles, and provide a test of QCD, as well as suggest that the same tools could be used to calculate additional physical quantities which may not be so well determined experimentally. In addition, lattice gauge theory provides an avenue for making first-principle calculations of the effects of strong interactions on weak interaction processes, and thus holds the promise of providing crucial tests of the standard model at its most vulnerable points. And, although quarks and gluons are not directly observed in the laboratory, it is expected that at extremely high temperatures one would find a new state of matter consisting of a plasma of these particles. The questions being addressed by lattice gauge theorists are the nature of the transition between the lower temperature state of ordinary matter and the high temperature quark-gluon plasma, the temperature at which this transition occurs, and the properties of the plasma. In the general area of experimental high energy physics, there are three primary areas of computing:

• The processing of the raw data that is usually accumulated at a central accelerator laboratory, such as the Wilson Laboratory at Cornell.

• The simulation of physical processes of interest, and the simulation of the behavior of the final states in the detector.

• The analysis of the compressed data that results from processing of the raw data and the simulations.

Atomic and Molecular Physics
In contrast to some areas of theoretical physics, the AM theorist has the advantage that he understands the basic equations governing the evolution of the system of particles under consideration. However, the wealth of phenomena that derive from the many-body interactions of the constituents and their interactions with external probes such as electric and magnetic fields is truly astounding. Computation now provides a practical and useful alternative method to study these problems. Most importantly, it is now possible to perform calculations sophisticated enough to have a real impact on AM science. These include high precision computations of the ground and excited states of small molecules, scattering of electrons from atoms, atomic ions and small polyatomic molecules, simple chemical reactions involving atom-diatom collisions, photoionization and photodissociation, and various time dependent processes such as multiphoton ionization and the interaction of atoms with ultra-strong (or short) electromagnetic fields.

Gravitational Physics
The computational goal of classical and astrophysical relativity is the solution of the associated nonlinear partial differential equations. For example, simulations have been performed of the critical behavior in black hole formation, using high accuracy adaptive grid methods which follow the collapse of spherical scalar wave pulses over at least fourteen orders of magnitude; of the structure we currently see in the universe (galaxies, clusters of galaxies arranged in sheets, voids, etc.) and how it may have arisen through fluctuations generated during an inflationary epoch; of head-on collisions of black holes; and of horizon behavior in a number of black hole configurations.

CURRENT STATE

Definitive calculations in all of the areas mentioned above would require significantly greater computing resources than have been available up to now. Nevertheless, steady progress has been made during the last decade due to important improvements in algorithms and calculational techniques, and very rapid increases in available computing power. For example, among the major achievements in lattice gauge theory have been: a demonstration that quarks and gluons are confined at low temperatures; steady improvements in spectrum calculations, which have accelerated markedly in the last year; an estimate of the transition temperature between the ordinary state of matter and the quark-gluon plasma; a bound on the mass of the Higgs boson; calculations of weak decay parameters, including a determination of the mixing parameter; and a determination of the strong coupling constant at the energy scale of 5 GeV from a study of the charmonium spectrum. Much of this work has been carried out at the NSF Supercomputer Centers.

In the area of atomic and molecular physics, until quite recently most AM theorists were required to compute using simplified models or, if the research was computationally intensive, to use vector supercomputers. This has changed with the widespread availability of cheap, fast, Unix-based RISC workstations. These "boxes" are now capable of performing at the 40-50 megaflop level and can have as much as 256 megabytes of memory. In addition, it is possible to cluster these workstations and distribute the computational task among the CPUs. There are a few researchers in the US who have as many as twenty or thirty of these workstations for their own group. This has enabled computational experiments using a loosely coupled, parallel model on selected problems in AM physics. However, this is not the norm in AM theory. More typically, our most computationally intensive calculations are still performed on the large mainframe vector supercomputers available only to a limited number of users. The majority of the AM researchers in the country are still computing on single workstations or (outdated) mainframes of one kind or another.
FUTURE COMPUTING NEEDS

In the field of high energy physics, the processing of raw data and the simulation of generic mixing processes are activities that are well suited to a centralized computer center. The processing of the raw data should be done in a consistent, organized, and reliable manner. Often, in the middle of the data processing, special features in some data are discovered, and these need to be treated quickly and in a manner consistent with the entire sample. It is usual and logical that the processing take place at the central accelerator center where the apparatus sits, because the complete records of the data taking usually reside there. In contrast, the high energy physics work in simulation of specific processes and analysis of compressed data are well matched to individual university groups. In terms of computing needs, all of the above are well served either by a large, powerful, central computer system, or by a cluster or farm of workstations. For example, CERN does the majority of its computing on central computers, while the Wilson Lab has a farm of DECstation 5000/240's to process its raw data. High energy computing usually proceeds one event at a time; each event, whether from simulation or raw data, can be dealt to an individual workstation for processing (a sketch of this pattern follows this section). There is no need to put the entire resources of a supercomputer on one event. However, massively parallel computers have the potential to handle large numbers of events simultaneously in an efficient manner. These machines will also be of importance in the work on lattice gauge theory. Conversely, university groups are well served by the powerful workstations now available, which has freed them from dependence on the central laboratory to study the physics signals of interest to them. These groups need both fast CPUs, for simulation of data, as well as relatively large and fast disk farms, for repeated processing of the compressed data.

In the field of atomic and molecular modeling, the future lies in the use of massively parallel, scalable multicomputers. AM theorists, with rare exceptions, have not been as active as other disciplines in moving to these platforms. This is to be contrasted with the quantum chemists, the lattice QCD theorists, and many materials scientists, who are becoming active users of these computers. The lack of portability of typical AM codes and the need to expend a great deal of time and effort in rewriting or rethinking algorithms has prevented a mass migration to these platforms.
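As a sketch of the event-level parallelism described above, the code below deals events round-robin to the processes of a message-passing job and combines the results at the end. It is an editorial illustration in MPI terms (a standard that postdates this report), and process_event is a hypothetical stand-in for reconstructing or simulating one event; the essential point is that the event loop itself requires no communication.

    /* Sketch of event-parallel processing in message-passing terms:
       events are independent, so each process simply takes every
       "size"-th event.  process_event() is a hypothetical stand-in for
       the reconstruction or simulation of one event; the report itself
       is not tied to any particular message-passing library. */
    #include <stdio.h>
    #include <mpi.h>

    #define N_EVENTS 100000

    /* Hypothetical per-event work: reconstruct or simulate one event. */
    static double process_event(long id)
    {
        return (double)(id % 97);    /* placeholder computation */
    }

    int main(int argc, char **argv)
    {
        int rank, size;
        double local = 0.0, total = 0.0;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        /* Deal events round-robin: no communication inside the event loop. */
        for (long id = rank; id < N_EVENTS; id += size)
            local += process_event(id);

        /* Combine per-process summaries (e.g., histograms) at the end. */
        MPI_Reduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
        if (rank == 0)
            printf("processed %d events, checksum %g\n", N_EVENTS, total);

        MPI_Finalize();
        return 0;
    }

The same structure describes a farm of workstations drawing events from a common queue; a massively parallel machine simply supplies many more such workers.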

Accomplishments in Computer Science and Engineering since the Lax Report
by Mary K. Vernon

While vector multiprocessors have been the workhorses for many fields of computational science and engineering over the past ten years, research in computer science and engineering has been focused both on improving the capabilities of these systems, and on developing the next generation of high-performance computing systems - namely the scalable, highly parallel computers which have recently been commercially realized as systems such as the Intel Paragon, the Kendall Square Research KSR-1, and the Thinking Machines Corporation CM-5.

A variety of factors make scalable, highly parallel computers the only viable way to achieve the teraflop capability required by Grand Challenge applications. These systems represent far more than an evolutionary step from their modestly parallel vector predecessors. Realizing the teraflop potential of massively parallel systems requires advances in a broad range of computer science and engineering subareas, including VLSI, computer architecture, operating systems, runtime systems, compilers, programming languages, and algorithms. The development of the new capabilities in turn requires computationally intensive experimentation and/or simulations that have been carried out on experimental prototypes (e.g., the NYU Ultracomputer), early commercial parallel machines (such as the BBN Butterfly or the Intel iPSC/2), and more recently on high-performance workstations as well as the emerging massively parallel systems such as the Thinking Machines CM-5 and the Intel Paragon.

Computer science and engineering researchers have made tremendous progress in the past ten years in the development of high performance computing technology, including the development of ALL of the major technologies in massively parallel systems. Among the specific accomplishments are:

• development of RISC processor technology and the compiler technology for RISC processors, which is used in all high performance workstations as well as in the massively parallel machines.

• development of computer-aided tools to facilitate the design, testing, and fabrication of complex digital systems and their constituent components.

• invention of the multicomputer and development of the message-passing programming paradigm which is used in many of today's massively parallel systems.

• refinement of shared memory architectures and the shared memory programming model which is used in the KSR-1, the Cray T3D, and other emerging massively parallel machines.

• invention of the hypercube interconnection network and refinement of this network to lower-dimensional, 2-d and 3-d, mesh networks that are currently used in the Intel Paragon and the Cray T3D.

• invention of the fat-tree interconnection network which is currently used in the Thinking Machines CM-5.

• refinement of the SIMD architecture which is used, for example, in the Thinking Machines CM-2 and the MasPar MP-1.

• invention and refinement of the SPMD and data parallel programming models which are supported in several massively parallel systems.

• development of the technology underlying the mature compilers for vector machines (i.e., compilers that give delivered performance that is a substantial fraction of the theoretical peak performance of these machines).




NSB 93-205