
Windows on Computing
New Initiatives at Los Alamos

David W. Forslund, Charles A. Slocomb, and Ira A. Agins

No aspect of technology is changing more rapidly than the world of computing and communications. It is among the fastest growing and most competitive arenas in our global economy. Each year, more of the items we use contain tiny chips on which are etched hundreds or thousands or millions of electronic circuit elements. Those chips direct various operations and adjustments—automatic braking in cars; automatic focusing in cameras; automatic collection of sales data in cash registers; automatic message-taking by answering machines; automatic operation of washers, dryers, and other appliances; automatic production of goods in manufacturing plants; the list could go on and on. Those inconspicuous devices that perform micro-scale computing are profoundly shaping our lives and our culture.



Opening illustration: Elements of high-performance computing: left, the CM-5 Connection Machine, the most powerful computer at Los Alamos; center, the foyer in the Laboratory Data Communications Center (LDCC); upper right, digitized images of Washington, D.C., from a wavelet-based multiresolution database developed at Los Alamos; and lower right, a portion of a "metacomputer," a 950,000-transistor special-purpose chip for analyzing the behavior of digital circuits. The chip is being developed by a team of graduate students at the University of Michigan.

More visible and no less important are the ways microprocessors are changing the way we communicate with each other and even the kinds of tasks we do. Industries such as desktop publishing, electronic mail, multimedia systems, and financial accounting systems have been created by the ubiquitous microprocessor. It is nothing short of the engine of the information age.

The computer chip was invented in 1958 when Jack Kilby figured out how to fabricate several transistors on a single-crystal silicon substrate and thereby created the integrated circuit. Since then more and more transistors have been integrated on a single chip. By the early 1980s the technology of VLSI, or very-large-scale integration (hundreds of thousands of transistors on a chip), had led to dramatic reductions in the cost of producing powerful microprocessors and large memory units. As a result, affordable personal computers and powerful workstations have become commonplace in science, in business, and in the home.

New microprocessors continue to be incorporated into various products at an increasing rate; development cycles are down to months rather than years as the current generation of processors is used to aid in the design and manufacture of the next generation. Because of their low cost, off-the-shelf microprocessors are expanding the use of micro- and medium-scale computing in business and in the home. They are also motivating changes in the design of large-scale scientific computers.

History of Computing at Los Alamos

1943-45: Desktop calculators and punched-card accounting machines are used as calculating tools in the Manhattan Project.

1945: ENIAC, the world's first large-scale electronic computer, is completed at the University of Pennsylvania. Its "shakedown" calculation is the "Los Alamos problem," a calculation needed for the design of thermonuclear weapons.

1949: IBM's first Card Programmable Calculators are installed at the Laboratory.

1952: MANIAC is built at the Laboratory under the direction of Nick Metropolis. It is the first computer designed from the start in accordance with John von Neumann's stored-program ideas.

1953: The Laboratory gets serial number 2 of the IBM 701. This "Defense Calculator" is approximately equal in power to the MANIAC.

1955: The MANIAC II project, a computer featuring floating-point arithmetic, is started. The Laboratory begins working closely with computer manufacturers to ensure that its future computing needs will be satisfied.

1956: MANIAC II is completed. The Laboratory installs serial number 1 of the IBM 704, which has about the same power as MANIAC II. From this point on, the Laboratory acquires supercomputers from industry.

Late 1950s: The Laboratory and IBM enter into a joint project to build STRETCH, a computer based on transistors rather than vacuum tubes, to meet the needs of the nuclear-weapons program.

1961: STRETCH is completed and is about thirty-five times as powerful as the IBM 704. IBM used much of the technology developed for STRETCH in its computers for years afterward.

1966: The first on-line mass-storage system with a capacity of over 10^12 bits, the IBM 1360 Photo Store, is installed at the Laboratory. Control Data Corporation introduces the first "pipelined" computer, the CDC 6600, designed by Seymour Cray. The Laboratory buys a few.

1971: The Laboratory buys its first CDC 7600, the successor to the CDC 6600. These machines are the main computers in use at the Laboratory during much of the 1970s.

1972: Cray Research, Inc., is founded. The Laboratory consults on the design of the Cray-1.

1975: Laboratory scientists design and build a high-speed network that uses 50-megabit-per-second channels.


The focus has shifted from single, very fast processors to very fast networks that allow hundreds to thousands of microprocessors to cooperate on a single problem. Large-scale computing is a critical technology in scientific research, in major industries, and in the maintenance of national security. It is also the area of computing in which Los Alamos has played a major role. The microprocessor has opened up the possibility of continual increases in the power of supercomputers through the architecture of the MPP, the massively parallel processor that can consist of thousands of off-the-shelf microprocessors. In 1989, seeing the potential of that new technology for addressing the "Grand Challenge" computational problems in science and engineering, Los Alamos set up the Advanced Computing Laboratory as a kind of proving ground for testing MPPs on real problems. Ironically, just as their enormous potential is being clearly demonstrated at Los Alamos and elsewhere, economic forces stemming from reduced federal budgets and slow acceptance into the commercial marketplace are threatening the viability of the supercomputing industry.

As a leader in scientific computing, Los Alamos National Laboratory has always understood the importance of supercomputing for maintaining national security and economic strength. At this critical juncture the Laboratory plans to continue working with the supercomputing industry and to help expand the contributions of computer modeling and simulation to all areas of society. Here, we will briefly review a few of our past contributions to the high end of computing, outline some new initiatives in large-scale parallel computing, and then introduce a relatively new area of involvement, our support of the information revolution and the National Information Infrastructure initiative.

Since the days of the Manhattan Project, Los Alamos has been a driver of and a major participant in the development of large-scale scientific computation. It was here that Nick Metropolis directed the construction of MANIAC I and II.

1976: Serial number 1 of the Cray-1 is delivered to the Laboratory.

1977: A Common File System, composed of IBM mass-storage components, is installed and provides storage for all central and remote Laboratory computer systems.

1980: The Laboratory begins its parallel-processing efforts.

1981: An early parallel processor (PuPS) is fabricated at the Laboratory but never completed.

1983: Denelcor's HEP, an early commercially available parallel processor, is installed, as is the first of five Cray X-MP computers.

1985: The Ultra-High-Speed Graphics Project is started. It pioneers animation as a visualization tool and requires gigabit-per-second communication capacity. A massively parallel (128-node) computer is installed.

1987: The need for higher communication capacity is answered by the development of the High-Performance Parallel Interface (HIPPI), an 800-megabit/second channel, which becomes an ANSI standard.

1988: The Laboratory obtains the first of its six Cray Y-MP computers. It also installs, studies, and evaluates a number of massively parallel computers. The Advanced Computing Laboratory (ACL) is established.

1989: The ACL purchases the CM-2 Connection Machine from Thinking Machines. It has 65,536 parallel processors.

1990: A device for HIPPI ports is transferred to industry. The Laboratory, the Jet Propulsion Laboratory, and the San Diego Supercomputer Center start the Casa Gigabit Test Project.

1991: The Laboratory transfers to industry the HIPPI framebuffer, an important component for visualization of complex images.

1992: A 1024-processor Thinking Machines CM-5, the most powerful computer at the Laboratory, is installed at the ACL.

1994: A massively parallel Cray T3D is installed at the ACL for use in collaborations with industry.

(Photographs: the Cray-1; the Cray T3D.)




MANIAC I (1952) was among the first general-purpose digital computers to realize von Neumann's concept of a stored-program computer—one that could go from step to step in a computation by using a set of instructions that was stored electronically in its own memory in the same way that data are stored. (In contrast, the ENIAC (1945), the very first general-purpose electronic computer, had to be programmed mechanically by inserting cables into a plugboard.) MANIAC II (1956) embodied another major advancement in computer design, the ability to perform floating-point arithmetic—the kind that automatically keeps track of the position of the decimal point. The peak speed of MANIAC II was 10,000 arithmetic operations per second, an impressive factor of 1000 higher than that of the electromechanical accounting machines used to perform numerical calculations at Los Alamos during the Manhattan Project (see Figure 1).

The main function of those accounting machines and very early electronic computers was to simulate through numerical computation the extreme physical conditions and complex nonlinear processes that occur within a nuclear weapon. The continued role of computer simulation as the primary tool for designing and predicting the performance and safety of nuclear weapons has been a major driver behind the development of increasingly powerful computers. The larger the computer, the greater the complexity that could be simulated accurately. That is still true today as the largest supercomputers are being used to simulate realistically very complex phenomena including the interaction of the oceans and the atmosphere on global scales, the basis of bulk properties of materials in the motions of individual atoms, the flow of oil and gas through the porous rock of underground reservoirs, the dynamics of the internal combustion engine, the behavior of biological macromolecules as pharmaceuticals, and so forth.

Figure 1. The Growth of Computing Power
This plot shows the number of operations per second versus year for the fastest available "supercomputers." Different shapes are used to distinguish serial, vector, and parallel architectures. All computers up until and including the Cray-1 were single-processor machines. The dashed line is a projection of supercomputing speed through 1990. That projection was published in a 1983 issue of Los Alamos Science—before massively parallel processors became practical. [The plot spans roughly 10^1 to 10^11 operations per second over the years 1940 to 2000; machines labeled include the electromechanical accounting machine, SEAC, MANIAC, IBM 704, IBM 7030, CDC 6600, CDC 7600, Cray-1, Cray X-MP, Cray Y-MP, early commercial parallel computers, CM-1, CM-2, CM-200, Intel Gamma, Intel Delta, CM-5, and Cray T3D.]


After 1956 and MANIAC II the Laboratory stopped building computers and instead relied on close working relationships with IBM and other vendors to ensure that the industry would be able to supply the necessary computing power. A particularly important collaborative effort was STRETCH, a project with IBM started in 1956 to design and build the fastest computer possible with existing technology. For the first time the new transistor technology would be used in computer design. Transistors have a much faster response time and are much more reliable than the traditional vacuum-tube elements of digital circuits. The STRETCH, or IBM 7030, computer was made from 150,000 transistors and was delivered to Los Alamos in 1961. It was about thirty-five times faster than the IBM 704, a commercial vacuum-tube machine similar to the MANIAC II.

In the early 1970s Los Alamos became a consultant to Cray Research on the design of the Cray-1, the first successful vector computer. Vector architecture increases computational speed by enabling the computer to perform many machine instructions at once on linear data arrays (vectors). In contrast, computers with traditional serial architecture perform one machine instruction at a time on individual pieces of data (see "How Computers Work" in this volume and the short loop sketch below).

Los Alamos was not only a consultant on the design of the Cray but was also the first purchaser of that innovative hardware. The delivery of the first Cray computer to Los Alamos in 1976 might be said to mark the beginning of the modern era of high-performance computing. The Cray-1 supercomputer had a speed of tens of megaflops (one megaflop equals a million floating-point operations per second) and a memory capacity of 4 megabytes.

In all these developments at the high end of computing, the Laboratory has taken part in the struggles associated with bringing new technology into the marketplace, testing new machines, developing the operating systems, and developing new computational techniques, or algorithms, to make full use of the potential computing power presented by each new supercomputer. That type of involvement continues today. The Laboratory was one of the first institutions to demonstrate the general usefulness of the CM-2 Connection Machine, a massively parallel computer built by Thinking Machines Corporation. The CM-2 was originally designed to investigate problems in artificial intelligence, and when it arrived in Los Alamos in 1988 to be tried out on problems related to atmospheric effects and other hydrodynamic problems of interest in the weapons program, it did not contain the special processors needed for efficient computation of floating-point arithmetic. Laboratory scientists responded by working in collaboration with Thinking Machines on the installation and testing of the appropriate floating-point processing units and went on to develop the first successful code for performing hydrodynamic problems on the Connection Machine's parallel architecture.

The question of whether parallel architectures would be a practical, cost-effective path to ever-increasing computer power has been hanging in the balance since the early 1980s. By then it was apparent that the speed of a single processor was limited by the physical size and density of the elements in the integrated circuits and the time required for propagation of electronic signals across the machine. To increase the speed or overall performance of supercomputers by more than the factor associated with the speed-up in the individual processors, one could use parallel systems, for example, a number of Cray-1-type vector processors working together or massively parallel systems in which thousands of less powerful processors communicate with each other. In the early 1980s the latter approach appeared to be intractable.
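The loop below is a minimal sketch of the vector-versus-serial distinction described above; it is our illustration, not code from any Laboratory program. On a serial machine each pass through the loop is executed one element at a time, whereas a vector machine such as the Cray-1 streams the arrays through pipelined arithmetic units so that many elements are processed per vector instruction.

    /* "axpy" loop, y = a*x + y, over linear data arrays (vectors).
       The iterations are independent, so a vectorizing compiler for a
       machine like the Cray-1 can turn the whole loop into a few vector
       instructions operating on long chunks of x and y; a serial machine
       must issue one multiply and one add per element, in order. */
    #include <stddef.h>

    void axpy(size_t n, double a, const double *x, double *y)
    {
        for (size_t i = 0; i < n; i++)
            y[i] = a * x[i] + y[i];
    }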



At the same time that supercomputer manufacturers were worrying about how to improve on the standards of performance set by the Cray-1, an effort was begun to apply VLSI technology to the design of a "Cray on a chip." That effort to produce very powerful microprocessors was motivated in the early 1980s by the rapid growth of and standardization in the PC and workstation marketplace. As microprocessor technology advanced and commercial applications increased, production costs of microprocessors decreased by orders of magnitude and it became possible to produce very high-performance workstations with price-to-performance ratios much lower than those associated with conventional vector supercomputers. The workstations produced by Sun, Silicon Graphics, IBM, and Hewlett-Packard can sustain speeds comparable to the Cray-1 (see Figure 2). Such performance is sufficient for a wide range of scientific problems, and many researchers are choosing to adapt their computing to high-end workstations and workstations networked together to form workstation clusters. The microprocessor revolution has led to the workstation revolution in scientific computing.

Supercomputer manufacturers have also tried to build on the cost savings and technology advances afforded by the microprocessor revolution. Thinking Machines, Intel, Cray Research, and others have been designing MPPs, each containing hundreds or thousands of off-the-shelf microprocessors of the kind found in high-end workstations. Those processors are connected in parallel through various network architectures and are meant to work simultaneously on different parts of a single large problem. As indicated in Figure 2, MPPs hold the promise of increasing performance by factors of thousands because many of their designs are scalable, that is, their computational speed increases in proportion to the number of processors in the machine. One of the largest scalable MPPs in use is at the Los Alamos Advanced Computing Laboratory—a CM-5 Connection Machine from Thinking Machines containing 1024 SPARC microprocessors. This machine has a theoretical peak speed of 130 gigaflops, more than a factor of 1000 over the Cray-1; some of the applications (for example, the molecular dynamics calculation in "State-of-the-Art Parallel Computing") run on the machine have already achieved 40 percent of that theoretical maximum.

For over fifteen years the Laboratory has had much success with conventional vector supercomputers, particularly the Cray X-MP and Cray Y-MP.

Figure 2. Supercomputers Today and Tomorrow
The speeds of workstations, vector supercomputers, and massively parallel supercomputers are compared by plotting the number of processors versus the speed per processor measured in number of floating-point operations per second (flops). The solid diagonal lines are lines of constant total speed. The maximum theoretical speed of the 1024-node CM-5 Connection Machine is 131 gigaflops. Supercomputer manufacturers are hoping to make MPPs with teraflop speeds available within the next few years. Increases in speed will probably be achieved by increasing both the number of processors in an MPP and the speed per processor. [The plot spans roughly 10^4 to 10^8 flops per processor and 1 to 10^10 processors, with diagonals marking megaflop, gigaflop, teraflop, and petaflop total speeds; machines plotted include personal computers, SPARC and SGI workstations, the IBM 3090, CYBER 205, Cray-1, Cray-2, Cray X-MP, Cray Y-MP, Cray C90, CM-2, CM-200, CM-5, and Cray T3D, along with regions marked "TODAY (1994)" and "THE FUTURE" and application targets such as 3-D seismic analysis and climate modeling.]
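As a rough consistency check on the numbers quoted above (the per-processor figure is simply inferred from the quoted totals; it is not stated in the article):

    1024 processors x 128 megaflops per processor = 131,072 megaflops, or about 131 gigaflops,

which is consistent with the factor-of-1000 advantage over the Cray-1 cited in the text.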


Los Alamos scientists have grown accustomed to "vectorizing" their computer programs, particularly hydrodynamics codes, and have been quite successful at designing codes that run very efficiently on the vector machines. Nevertheless, during the last five years the Laboratory has been strongly supporting the shift to parallel computing. It is our belief that the best hope of achieving the highest performance at the lowest cost lies with the massively parallel approach. In 1989 the Laboratory purchased its first Connection Machine, a CM-2, and in 1992 the Laboratory acquired the CM-5 mentioned above.

We have also been collaborating with manufacturers of high-end workstations on another approach to parallel computing, the development of workstation clusters. A cluster consists of many stand-alone workstations connected in a network that combines the computing power and memory capacity of the members of the cluster. We are helping to develop the networks and the software necessary to make such a cluster appear to the user as a single computational resource rather than a collection of individual workstations. Clusters are likely to come into greater and greater use because they provide scalable computing power at an excellent price-to-performance ratio starting from a small number of processors. At the higher-performance end of computing, where hundreds of processors are involved, clusters do not necessarily compete well with vector or massively parallel supercomputers because the management of the processors becomes very complex and the efficiency of the interconnections becomes more difficult to achieve.

Figure 3 shows two workstation clusters at Los Alamos. One, the IBM cluster consisting of eight IBM RS/6000 560 workstations, is now in general use. The other, which is still in development, consists of eight Hewlett-Packard 735 workstations, each of which is faster at scalar calculations than is a single processor of a Cray Y-MP. We are now working closely with Hewlett-Packard on the software for that new cluster.

Figure 3. Workstation Clusters
Workstation clusters are becoming more popular for scientific computing because they have excellent price/performance ratios and can be upgraded as new microprocessors come on the market. The eight RS/6000 workstations in the IBM cluster at Los Alamos (top) are soon to be upgraded from model 560 to model 590. We are collaborating with Hewlett-Packard in the development of software for the very high-performance HP 735 workstation cluster recently assembled here (bottom). This cluster is being outfitted with HIPPI, the networking interface developed at the Laboratory for very high data transmission rates (see "HIPPI—The First Standard for High-Performance Networking").

Achieving high performance on a massively parallel machine or a workstation cluster requires the use of entirely different programming models. The most flexible approach is the MIMD, or multiple-instruction, multiple-data, model in which different operations are performed simultaneously by different processors on different data. The challenge is to achieve communication and coordination among the processors without losing too much computational time (a minimal sketch of one such approach appears below). "State-of-the-Art Parallel Computing" presents a good look at what is involved in achieving high performance on MPPs.

The rich array of programming possibilities offered by parallel architectures has brought back into use many algorithms that had been ignored during the fifteen years or so that vector supercomputers dominated the scientific marketplace. Today the most active area in scientific programming and the one that will have a long-range impact on high-performance computing is the development of algorithms for parallel architectures. The Laboratory is a leader in algorithm development, and this volume presents a few outstanding examples of new parallel algorithms.

The Laboratory has also taken a leadership role in creating an advanced computing environment needed to achieve sustained high performance on the new MPPs and to store and view the enormous amounts of data generated by those machines. The focal point for research on and development and implementation of the elements of the new computing environment is the Advanced Computing Laboratory set up at Los Alamos in 1989.


The goal of the ACL is to handle the most challenging, computationally intensive problems in science and technology, the so-called Grand Challenge problems. Our 1024-processor CM-5, with its enormous speed and very large memory capacity, is the centerpiece of the Advanced Computing Laboratory. The ACL also houses advanced storage facilities developed at Los Alamos to rapidly store and retrieve the terabytes of data generated by running Grand Challenge problems on the CM-5. A special "HIPPI" network, developed at Los Alamos specifically to handle very high rates of data transmission, connects the CM-5 to the advanced storage facilities and vice versa. The HIPPI protocol for supercomputer networks has since become an industry standard (see "HIPPI").

Five DOE Grand Challenge problems are being investigated at the ACL: global climate modeling, multiphase flow, new materials technology, quantum chromodynamics, and the tokamak fusion reactor. Other very interesting calculations (see "Experimental Cosmology and the Puzzle of Large-scale Structures") are being performed on our CM-5 and have demonstrated the great potential of MPPs for scientific and engineering research. Two outstanding examples of new parallel algorithms described in this volume are the fast code for many-body problems, developed initially to trace the formation of structure in the early universe, and the lattice-Boltzmann method, an intrinsically parallel approach to the solution of complex multiphase flow problems of interest to the oil and gas industry.

As we make parallel computing work for Grand Challenge and other problems of fundamental interest, we are in a good position to help industry take advantage of the new computing opportunities. Our work with Mobil Corporation on modeling multiphase flow through porous media (see "Toward Improved Reservoir Flow Performance") uses an intrinsically parallel, Los Alamos-developed algorithm known as the lattice-Boltzmann method to model the flow of oil and water at the scale of the individual pores in oil-bearing rock. The success of this collaboration provided a working example on which to base ACTI, the Advanced Computing Technology Initiative for establishing research collaborations between the oil and gas industry and the DOE national laboratories.

Another new collaborative effort, led by Los Alamos National Laboratory, Lawrence Livermore National Laboratory, and Cray Research, Inc., is specifically designed to transfer high-performance parallel-processing technology to U.S. industry with the goal of increasing industrial competitiveness. The $52 million collaborative program is under the auspices of the DOE's High Performance Parallel Processor program and will involve fifteen industrial partners. Over seventy scientists will be involved in creating computer algorithms for massively parallel machines that are of direct use in simulating complex industrial processes and their environmental impact.

In addition, two networked 128-processor Cray T3D MPPs, one at Los Alamos and one at Livermore, will be used in the industrial collaborations. They will be the first government-owned machines to be focused primarily on solving industrial problems.

The program includes fifteen projects in the fields of environmental modeling, petroleum exploration, materials design, advanced manufacturing, and new MPP systems software. Los Alamos will be involved in seven of the projects, two of which are devoted to developing general software tools and diagnostics that cut across particular applications and address important issues of portability of software from one machine to another. Those projects in which Los Alamos will participate are listed in the box "Collaborations with Industry on Parallel Computing."

So far we have concentrated on the Laboratory's initiatives in high-performance computing. But the digital revolution and the culture of the Internet have led us into a second major area—the National Information Infrastructure (NII) initiative. The goal of this federal initiative is to build the electronic superhighways as well as to develop the software and the hardware needed to bring to all groups in the population and all areas of the economy the benefits of the information age.

One feature of the information age is the rapid accumulation of very large sets of data—in the gigabyte and terabyte range.


A digitized fingerprint image consisting of 768 by 768 8-bit pixels.

The corresponding data-compressed image resulting from application of the WSQ algorithm. The compression ratio is 21:1.

Detail of the center image after WSQ compression.

Detail of a digitized fingerprint image.

Detail of the center image after "JPEG" compression.

Figure 4. Wavelet Compression of Fingerprint Data
The FBI has over 29 million active cards in its criminal-fingerprint files. These cards are now being digitized at a spatial resolution of 500 pixels per inch and a gray-scale resolution of 8 bits. Each card yields about 10 megabytes of data, and the potential size of the entire database is thousands of terabytes. The FBI came to the Laboratory for help in organizing the data. The Laboratory's Computer Research and Applications Group collaborated with the FBI to develop the Wavelet/Scalar Quantization (WSQ) algorithm, which has been made a public standard for archival-quality compression of fingerprint images. (The algorithm involves a discrete wavelet transform decomposition into 64 frequency sub-bands followed by adaptive uniform scalar quantization and Huffman coding.) WSQ compression introduces some distortion in the image. The figures demonstrate, however, that the important features of the fingerprint, including branches and ends of ridges, are preserved. In contrast, the international "JPEG" image-compression standard, based on a cosine transform, is applied not to the original image as a whole but rather to square blocks into which the image has been divided. The image resulting from "JPEG" data compression shows artifacts of this blocking procedure.
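The caption above outlines the WSQ pipeline: a discrete wavelet decomposition, scalar quantization of the subband coefficients, and entropy coding. (At the quoted 21:1 ratio, the 768 by 768, 8-bit image above, about 590 kilobytes, shrinks to roughly 28 kilobytes.) As a rough illustration of the first two stages only, here is one level of a one-dimensional transform using the simple Haar wavelet followed by uniform scalar quantization; this is our sketch, not the standard itself, which uses a smoother biorthogonal wavelet pair, a two-dimensional decomposition into 64 subbands, and Huffman coding of the quantized coefficients.

    /* Illustrative only: one level of a 1-D Haar wavelet transform
       (scaled averages and differences) followed by uniform scalar
       quantization.  Smooth regions produce small "detail" coefficients
       that quantize to zero, which is where the compression comes from;
       the rounding performed here is the only information that is lost. */
    #include <math.h>
    #include <stdio.h>

    #define N 8                       /* signal length (must be even) */

    int main(void)
    {
        double signal[N] = {10, 12, 11, 13, 40, 42, 41, 43};
        double coarse[N / 2], detail[N / 2];
        double step = 4.0;            /* quantization step size */

        for (int i = 0; i < N / 2; i++) {
            coarse[i] = (signal[2*i] + signal[2*i + 1]) / sqrt(2.0);
            detail[i] = (signal[2*i] - signal[2*i + 1]) / sqrt(2.0);
        }

        for (int i = 0; i < N / 2; i++) {
            int q_coarse = (int)floor(coarse[i] / step + 0.5);  /* round */
            int q_detail = (int)floor(detail[i] / step + 0.5);
            printf("coarse %7.3f -> %3d    detail %7.3f -> %3d\n",
                   coarse[i], q_coarse, detail[i], q_detail);
        }
        return 0;
    }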


Such data sets are generated in a wide variety of contexts: large-scale scientific computation, medical procedures such as MRI and CAT scans, environmental surveillance and geoexploration by satellites, financial transactions, consumer profiling, law enforcement, and so on. The abilities to "mine" the data—to analyze them for meaningful correlations—and to compress the data for rapid communication and ease of storage are of increasing interest. Also needed are intuitive user interfaces for manipulation of those very large and complex data sets.

The Laboratory has initiated several data-mining projects that use sophisticated mathematical tools to solve practical problems of data analysis, storage, and transmission. One project has resulted in a new national standard, created in collaboration with the FBI, for compressing digital fingerprint data with little loss of information.

Figure 4 shows an original and a compressed fingerprint image. It also compares the method adopted by the FBI, known as the Wavelet/Scalar Quantization (WSQ) algorithm, with a traditional data-compression method. The compressed images will be transmitted between local law-enforcement agencies and the FBI to assist in retrieving past records of suspected criminals. Because the data have been compressed by a factor of 20, the images can be transmitted in minutes rather than hours.

The WSQ algorithm transforms each image into a superposition of overlapping wavelets, localized functions that vanish outside a short domain (see Figure 5), in contrast to the sine and cosine functions of the standard Fourier transform, which oscillate without end. The discrete wavelet transform includes wavelets on many length scales and automatically produces a multiresolution representation of the image. Thus an image can be retrieved at whatever resolution is appropriate for a particular application.

The algorithm developed for the fingerprint data is also being used to create a multiresolution database for the efficient storage and retrieval of very large images. Aerial photographs of the Washington, D.C., area, supplied by the United States Geological Survey (USGS), were first digitized. Through the discrete wavelet transform, the many separate digital images were put together into a continuous image so that no seams are visible. The resulting multiresolution database is illustrated in Figure 6, which shows the area around the Lincoln Memorial in Washington, D.C., at seven decreasing levels of resolution. At the coarsest resolution (64 meters/pixel) the entire D.C. area can be displayed on a workstation monitor. At the finest resolution (1 meter/pixel) the user is able to distinguish features as small as an automobile. The USGS has an interest in making such data available for the whole United States and disseminating the data on CD-ROM as well as on the electronic superhighways.

Figure 5. Mother Wavelets for the FBI Fingerprint Image Compression Standard
Shown here are the mother wavelets used in the WSQ algorithm for fingerprint data compression. The mother wavelets are translated and dilated to form the set of basis functions used to decompose and reconstruct the original image. The top wavelet is used for image decomposition and the bottom wavelet is used for image reconstruction.



Figure 6. A Database of Seamless Multiresolution Images
The area around the Lincoln Memorial in Washington, D.C., is shown at seven resolutions. The digitized images were retrieved from a database constructed by using the discrete wavelet transform.
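A dyadic pyramid of this kind stores the mosaic at resolutions that differ by factors of two, so seven levels span the range from 1 meter per pixel to 64 meters per pixel (1, 2, 4, 8, 16, 32, 64). The sketch below is our illustration, not part of the Laboratory's database software, of how a display program might pick which stored level to fetch for a requested ground resolution.

    /* Choose the coarsest stored pyramid level whose pixels are still at
       least as fine as the requested resolution.  Level 0 is 1 meter per
       pixel; each higher level is coarser by a factor of 2, up to level 6
       at 64 meters per pixel. */
    #include <stdio.h>

    int choose_level(double requested_m_per_pixel)
    {
        int level = 0;
        double res = 1.0;                 /* finest stored resolution */
        while (level < 6 && res * 2.0 <= requested_m_per_pixel) {
            res *= 2.0;
            level++;
        }
        return level;
    }

    int main(void)
    {
        double wanted[] = {1.0, 3.0, 10.0, 64.0, 500.0};
        for (int i = 0; i < 5; i++)
            printf("requested %6.1f m/pixel -> fetch level %d\n",
                   wanted[i], choose_level(wanted[i]));
        return 0;
    }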

The multiresolution database project is part of a larger effort called Sunrise. The Sunrise project is a unique attempt to create an integrated software and hardware system that can handle many diverse applications of the kind envisioned for the nation's information superhighways—from access to very large databases, to secure electronic commercial transactions, to efficient communication of medical information in a national health-care system. The Sunrise strategy is to use a set of such diverse applications as a starting point for the development of software solutions, the elements of which are general enough to be used in many other applications.

The collaboration with radiologists at the National Jewish Center for Immunology and Respiratory Medicine on developing tools for telemedicine is illustrative of the Sunrise approach. The tools include a multimedia data management system that will display and analyze medical images, manage patient records, provide easy data entry, and facilitate generation of medical reports. The system is also designed to provide transparent access to medical information located anywhere on the information superhighway. Tools for interactive collaboration among physicians, efficient data compression, transmission, and storage, remote data storage and retrieval, and automated data analysis are also being developed in the Sunrise approach to telemedicine (see "Concept Extraction" for a discussion of the quantitative analysis of LAM disease).

The Laboratory is going through dynamic change, and nowhere is the change more visible than in the area of scientific computing, communications, and information systems. Because of the revolution in electronic communications, many people are doing things they never thought possible and in ways that could not have been anticipated (see "@XXX.lanl.gov" for a look into the future of research communication). Los Alamos is no longer focused solely on the high end of scientific computing for national security and basic research. We have become heavily involved in bringing advances in computing and information systems to all members of our Laboratory, to business and industry, and to the general public. We also expect that during the latter half of this decade advanced computer modeling and simulation will make increasingly direct and significant contributions to society.

Acknowledgements
We are grateful to the following people for their help with this article: Jonathan N. Bradley, Christopher M. Brislawn, John H. Cerutti, Salman Habib, Michael A. Riepe, David H. Sharp, Pablo Tamayo, and Bruce R. Wienke.

Further Reading
Yuefan Deng, James Glimm, and David H. Sharp. 1992. Perspectives on Parallel Computing. Daedalus, winter 1992 issue. Also published in A New Era in Computation, edited by N. Metropolis and Gian-Carlo Rota. MIT Press, 1993.

N. Metropolis, J. Howlett, and Gian-Carlo Rota. 1980. A History of Computing in the Twentieth Century. Academic Press, Inc.

David W. Forslund is a Laboratory Fellow and Deputy Director of the Advanced Computing Laboratory. He is a theoretical plasma physicist who has contributed to controlled fusion and space plasma research and now specializes in distributed high-performance computing and NII technologies.

Charles A. Slocomb is the Deputy Division Director of the Computing, Information, and Communications Division. His primary interest is the future of high-performance computing and its application to scientific problems.

Ira A. Agins is a specialist staff member in the Computing, Information, and Communications Division. His avocation is computer history, particularly the history of computing at Los Alamos.

Collaborations with Industry on Parallel Computing

Bruce R. Wienke

The Computational Testbed for Industry (CTI) was established at the Laboratory in 1991 to provide U.S. industry with access to the computing environment at our Advanced Computing Laboratory and to the technical expertise of Los Alamos scientists and engineers. During this past year the CTI was designated officially as a Department of Energy User Facility. That designation affords us greater flexibility in establishing and implementing collaborative agreements with industry. The number of collaborations has been increasing steadily and will soon total about thirty. The seven projects described here are being established at the CTI through the new cooperative agreement between the DOE and Cray Research, Inc., under the auspices of the DOE's High Performance Parallel Processor program. The projects focus on developing scientific and commercial software for massively parallel processing.

Portability Tools for Massively Parallel Applications Development
Partners: Cray Research, Inc.; Thinking Machines Corporation
Goals: At present, software developed for one vendor's massively parallel computer system is not portable, that is, able to be run on other vendors' computers. The lack of portable programs has slowed the development of applications for every kind of massively parallel computer and the adoption of such computers by industry. This project will work toward removing that barrier by creating common programming conventions for massively parallel machines.

Massively-Parallel-Processing Performance-Measurement and Enhancement Tools
Partner: Cray Research, Inc.
Goals: Create a set of software tools to improve analysis of the system-level performance of massively parallel systems, to maximize their operating efficiency, and to enhance the design of future systems. Plans include using this sophisticated automated toolkit to enhance the performance of applications developed in other projects under the cooperative agreement between the Department of Energy and Cray Research, Inc.

Lithology Characterization for Remediation of Underground Pollution
Partner: Schlumberger-Doll Research
Goals: Develop three-dimensional modeling software to cut the costs of characterizing and cleaning up underground environmental contamination. The software is intended for use by the petroleum and environmental industries on the next generation of massively parallel supercomputers.

Development of a General Reservoir Simulation for Massively Parallel Computers
Partners: Amoco Production Company; Cray Research, Inc.
Goals: Oil and gas exploration requires simulations of flow at several million points in reservoirs. Well-developed reservoir simulations exist for multiprocessor vector supercomputers but not for massively parallel systems, so exploiting the potential of massively parallel computers is a high priority. The goal of this project is to adapt Amoco's field-tested reservoir-simulation software so that it performs efficiently on the massively parallel Cray T3D. The resulting program, which will allow much better management of reservoirs, will be made available to the entire petroleum industry.

Materials Modeling
Partner: Biosym Technologies Incorporated
Goals: In the competitive global marketplace for advanced materials, the traditional experimental approach to designing new materials needs to be complemented by materials modeling using high-performance computers. This project is aimed at creating powerful new visual-modeling software tools to improve casting and welding processes and to calculate the fracture properties of new materials designs, including composites.


At left: The Cray T3D was delivered to our Advanced Computing Laboratory in June 1994. It is a 128-processor machine that will be used primarily for collabora- tive research with industry. Below: Scientists and engineers from Cray Research, Inc. and Los Alamos at a dedication of the new machine.

Application of the Los Alamos National Laboratory Hydrocode (CFDLIB) to Problems in Oil Refining, Waste Remediation, Chemical Manufacturing, and Manufacturing Technology
Partners: Exxon Research and Engineering Company; General Motors Power Train Group; Rocket Research Company; Cray Research, Inc.
Goals: The speed and memory size of massively parallel systems will make it possible for U.S. industry to accurately model and improve the efficiency of chemical reactions that involve substances in more than one phase (solid, liquid, or gas). The project with Exxon will advance the simulation of multiphase reactors, which are heavily used in hydrocarbon refining, chemical manufacturing, gas conversion, and coal liquefaction and conversion. The goal of the General Motors project is to improve analysis of important foundry processes. One of the Rocket Research projects is aimed at improving the small rockets used in satellite station-keeping and has potential applications to microelectronics manufacturing and to neutralizing wastes in flue-gas emissions. Another Rocket Research project involves analysis of the performance, safety, and environmental impact of propellants used in automotive air bags and in fire-suppression systems of aircraft and other mass-transportation vehicles.

Massively Parallel Implicit Hydrodynamics on Dynamic Unstructured Grids with Applications to Chemically Reacting Flows and Groundwater Pollution Assessment and Remediation
Partners: Berea Incorporated; Cray Research, Inc.
Goals: Develop advanced software models to help U.S. industry better address problems involving combustion, pollution, and the treatment of contaminated groundwater and surface waters. These models could also be applied to designing engines, extracting more oil and gas from fields that have been drilled over, and assessing the structural integrity of buildings after a severe fire.
