Accelerating Scientific Discovery Through Computation and Visualization
Volume 105, Number 6, November–December 2000
Journal of Research of the National Institute of Standards and Technology
[J. Res. Natl. Inst. Stand. Technol. 105, 875-894 (2000)]

Accelerating Scientific Discovery Through Computation and Visualization

James S. Sims, John G. Hagedorn, Peter M. Ketcham, Steven G. Satterfield, Terence J. Griffin, William L. George, Howland A. Fowler, Barbara A. am Ende, Howard K. Hung, Robert B. Bohn, John E. Koontz, Nicos S. Martys, Charles E. Bouldin, James A. Warren, David L. Feder, Charles W. Clark, B. James Filla, and Judith E. Devaney

National Institute of Standards and Technology, Gaithersburg, MD 20899-0001

The rate of scientific discovery can be accelerated through computation and visualization. This acceleration results from the synergy of expertise, computing tools, and hardware for enabling high-performance computation, information science, and visualization that is provided by a team of computation and visualization scientists collaborating in a peer-to-peer effort with the research scientists. In the context of this discussion, high performance refers to capabilities beyond the current state of the art in desktop computing. To be effective in this arena, a team comprising a critical mass of talent, parallel computing techniques, visualization algorithms, advanced visualization hardware, and a recurring investment is required to stay beyond desktop capabilities. This article describes, through examples, how the Scientific Applications and Visualization Group (SAVG) at NIST has utilized high performance parallel computing and visualization to accelerate (1) Bose-Einstein condensate modeling, (2) fluid flow in porous materials and in other complex geometries, (3) flows in suspensions, (4) x-ray absorption, (5) dielectric breakdown modeling, and (6) dendritic growth in alloys.

Key words: discovery science; distributed processing; immersive environments; IMPI; interoperable MPI; message passing interface; MPI; parallel processing; scientific visualization.

Accepted: December 22, 2000

Available online: http://www.nist.gov/jres

Please note that although all figures in the printed version of this paper are in black and white, Figs. 1-3, 7-9, 12-14, and 17-19 in the online version are in color.

1. Introduction

Science advances through iterations of theory and experiment. Increasingly, computation and visualization are an integral part of this process. New discoveries obtained from an experiment or a computational model are enhanced and accelerated by the use of parallel computing techniques, visualization algorithms, and advanced visualization hardware.

A scientist who specializes in a field such as chemistry or physics is often not simultaneously an expert in computation or visualization. The Scientific Applications and Visualization Group (SAVG [1]) at NIST provides a framework of hardware, software, and complementary expertise which the application scientist can use to facilitate meaningful discoveries.

Parallel computing allows a computer code to use the resources of multiple computers simultaneously. A variety of parallel techniques are available and can be used depending upon the needs of the application. Generally, parallel computing is thought of in terms of speeding up an application. While this is true, experience shows that users often prefer to use this increased capability to do more computation within the same amount of time. This may mean more runs of the same complexity or runs with more complex models. For example, parallel computing can use the combined memory of multiple computers to solve larger problems than were previously possible. An example of this is described in Sec. 8, Dendritic Growth in Alloys.

Visualization of scientific data can provide an intuitive understanding of the phenomenon or data being studied. One way it contributes to theory validation is through demonstration of qualitative effects seen in experiments, such as the Jeffery orbits described in Sec. 5, Flow of Suspensions. Proper visualization can also exhibit structure where no structure was previously known. In the Bose-Einstein condensate (BEC) example (Sec. 3), visualization was key to the discovery of a vortex array. Current visualization technology provides a full range of hardware and techniques, from static two-dimensional plots, to interactive three-dimensional images projected onto a monitor, to large-screen fully immersive systems allowing the user to interact on a human scale.

Immersive virtual reality (IVR) [2] is an emerging technique with the potential for handling the growing amount of data from large parallel computations or advanced data acquisitions. IVR systems take advantage of human skills at pattern recognition by providing a more natural environment, where a stereoscopic display improves depth perception and peripheral vision provides more context for human intuition.

The techniques used for parallel computing and visualization, as well as the knowledge of hardware, are specialized and outside the experience of most scientists. SAVG makes use of our experience in solving computational and visualization problems as we collaborate with scientists to enhance and interpret their data. Results of this work include theory validation, experiment validation, new analysis tools, new insights, standard reference codes and data, new parallel algorithms, and new visualization techniques.

2. Tools

SAVG has worked with many scientists at NIST on a wide variety of problems, and makes use of an array of resources that it can bring to bear on these diverse projects. We make use of the central computing resources, which include several SGI Origin 2000 systems,1 an IBM SP system, and a cluster of PCs running Linux, as well as virtual parallel machines created from workstation clusters. Each of these systems can be used for parallel as well as sequential implementations of scientific algorithms. In addition to these central computing resources, SAVG uses commercial tools and freely available tools where appropriate, augmenting these with locally developed tools when necessary. The following are some tools in common use by SAVG.

1 Certain commercial equipment, instruments, or materials are identified in this paper to foster understanding. Such identification does not imply recommendation or endorsement by the National Institute of Standards and Technology, nor does it imply that the materials or equipment identified are necessarily the best available for the purpose.

2.1 Computation

MPI—Message Passing Interface

The majority of our parallel applications are written using the message-passing model for parallel programs. In the message-passing model, each process has exclusive access to some amount of local memory and only indirect access to the rest of the memory. Any process that needs data that is not in its local memory obtains that data through calls to message-passing routines. MPI is a specification for a library of these message-passing routines. Since its introduction in 1994, MPI has become the de facto standard for message-passing programming and is well supported on high-performance machines as well as on clusters of workstations and PCs.

MPI was designed to support its direct use by applications programmers as well as to support the development of parallel programming libraries. We have used MPI in both of these contexts (see the descriptions of C-DParLib, F-DParLib, and AutoMap/AutoLink below).
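The message-passing model is easiest to see in a small example. The following sketch was written for this discussion and is not code from the paper or from SAVG's libraries; it uses only standard MPI routines. One process fills an array in its own local memory and sends it to a second process, which has no direct access to that memory and must obtain the data through a matching receive.

    /* Minimal message-passing sketch (illustrative, not from the paper).
       Each process owns only its local array; data moves between
       processes solely through explicit MPI calls. */
    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char *argv[])
    {
        int rank, size, i;
        double local[4];        /* this process's private, local memory */
        MPI_Status status;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        if (size < 2) {         /* the example needs a sender and a receiver */
            if (rank == 0)
                printf("run with at least 2 processes\n");
            MPI_Finalize();
            return 1;
        }

        if (rank == 0) {
            /* Process 0 fills its local array and sends it to process 1. */
            for (i = 0; i < 4; i++)
                local[i] = i + 1.0;
            MPI_Send(local, 4, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            /* Process 1 cannot read process 0's memory directly; the data
               arrives only through a matching message-passing call. */
            MPI_Recv(local, 4, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, &status);
            printf("process 1 received: %g %g %g %g\n",
                   local[0], local[1], local[2], local[3]);
        }

        MPI_Finalize();
        return 0;
    }

Launched with two or more processes (for example, mpirun -np 2), every process runs the same program; the rank returned by MPI_Comm_rank determines each process's role.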
Interoperable MPI (IMPI) [6, 7] is a cross-implementation communication protocol for MPI that greatly facilitates heterogeneous computing. IMPI enables the use of two or more parallel machines, regardless of architecture or operating system, as a single multiprocessor machine for running any MPI program. SAVG was instrumental in the development of the IMPI protocol.

C-DParLib and F-DParLib

The libraries C-DParLib and F-DParLib [8, 9, 10, 11], developed by SAVG, support data-parallel style programming in C and Fortran 90, respectively. These libraries make heavy use of MPI to handle all communication. Data-parallel programming refers to parallel programming in which operations on entire arrays are supported, such as A = A + B, where A and B are arrays of values. C-DParLib and F-DParLib were developed specifically to support parallel applications that derive parallelism primarily from the distribution of large arrays among processing nodes, such as in most finite-difference based parallel applications.

Both libraries support basic data-parallel operations such as array initialization, array shifting, and the exchanging of array data between adjacent processing nodes.
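The C-DParLib and F-DParLib interfaces themselves are not reproduced here, so the sketch below shows the underlying pattern in plain MPI rather than the libraries' API: each process owns one block of the distributed arrays, applies the data-parallel statement A = A + B to its block, and then exchanges edge values with adjacent processing nodes, as in the array shifting and exchange operations just described. The one-dimensional decomposition, the array names, and the NLOCAL constant are illustrative choices made for this example.

    /* Sketch of a block-distributed data-parallel update in plain MPI
       (illustrative; not the C-DParLib or F-DParLib API). */
    #include <mpi.h>

    #define NLOCAL 100    /* block of the global array owned by each process */

    int main(int argc, char *argv[])
    {
        int rank, size, left, right, i;
        /* one extra "ghost" cell at each end holds neighbor data */
        double A[NLOCAL + 2], B[NLOCAL + 2];
        MPI_Status status;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        /* neighbors in a non-periodic one-dimensional decomposition;
           MPI_PROC_NULL turns boundary transfers into no-ops */
        left  = (rank == 0)        ? MPI_PROC_NULL : rank - 1;
        right = (rank == size - 1) ? MPI_PROC_NULL : rank + 1;

        /* initialize the locally owned block (cells 1..NLOCAL) */
        for (i = 1; i <= NLOCAL; i++) {
            A[i] = rank * NLOCAL + i;
            B[i] = 1.0;
        }

        /* the data-parallel statement A = A + B, applied blockwise */
        for (i = 1; i <= NLOCAL; i++)
            A[i] = A[i] + B[i];

        /* exchange edge values with adjacent nodes (array shifting):
           send the last owned cell right while filling the left ghost,
           then send the first owned cell left while filling the right ghost */
        MPI_Sendrecv(&A[NLOCAL], 1, MPI_DOUBLE, right, 0,
                     &A[0],      1, MPI_DOUBLE, left,  0,
                     MPI_COMM_WORLD, &status);
        MPI_Sendrecv(&A[1],          1, MPI_DOUBLE, left,  1,
                     &A[NLOCAL + 1], 1, MPI_DOUBLE, right, 1,
                     MPI_COMM_WORLD, &status);

        MPI_Finalize();
        return 0;
    }

A data-parallel library hides this bookkeeping: the application states the whole-array operation, and the library performs the block updates and neighbor exchanges through MPI on the application's behalf.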
In addition, C-DParLib provides additional [...]

[...] facility for NIST scientists to run a wide range of interactive computational and visualization software.

OpenGL—Performer