
High Performance Distributed Computing

Geoffrey C. Fox

[email protected]

http://www.npac.syr.edu

Northeast Parallel Architectures Center

111 College Place

Syracuse University

Syracuse, New York 13244-4100

Abstract

High Performance Distributed Computing (HPDC) is driven by the rapid advance of two related technologies: those underlying computing and communications, respectively. These technology pushes are linked to application pulls, which vary from the use of a cluster of some 20 workstations simulating fluid flow around an aircraft, to the complex linkage of several hundred million advanced PCs around the globe to deliver and receive multimedia information. The review of base technologies and exemplar applications is followed by a brief discussion of software models for HPDC, which are illustrated by two extremes: PVM and the conjectured future based on the WebWork concept. The narrative is supplemented by a glossary describing the diverse concepts used in HPDC.

1 Motivation and Overview

Advances in computing are driven by VLSI, or very large scale integration, the technology that has created the personal computer, workstation, and parallel computing markets over the last decade. In 1980, the Intel 8086 used 50,000 transistors, while today's (1995) "hot" chips have some five million transistors, a factor of 100 increase. The dramatic improvement in chip density comes together with an increase in clock speed and improved design, so that today's workstations and PCs (depending on the function) have a factor of 50 to 1,000 better performance than the early 8086-based PCs. This performance increase enables new applications. In particular, it allows real-time multimedia decoding and display; this capability will be exploited in the next generation of video game controllers and set-top boxes, and will be key to implementing digital video delivery to the home.

The increasing density of transistors on a chip follows directly from a decreasing feature size, which in 1995 is 0.5 microns for the latest Intel Pentium. Feature size will continue to decrease, and by the year 2000, chips with 50,000,000 transistors are expected to be available.

Communications advances have been driven by a set of physical transport technologies that can carry much larger volumes of data to a much wider range of places. Central is the use of optical fiber, which is now competitive in price with the staple twisted pair and coaxial cable used by the telephone and cable industries, respectively. The widespread deployment of optical fiber also builds on laser technology to generate the light, as well as on the same VLSI advances that drive computing. The latter are critical to the high-performance digital switches needed to route signals between arbitrary sources and destinations. Optics is not the only physical medium of importance: continuing and critical communications advances can be assumed for the satellite and wireless links used for mobile (cellular) phones or, more generally, the future personal digital assistant (PDA).

One way of exploiting these technologies is seen in parallel processing. VLSI is reducing the size of computers, and this directly increases performance because reduced size corresponds to increased clock speed, which is approximately proportional to the inverse of the feature size. Crudely, the cost of a given number of transistors is proportional to the silicon area used, which scales as the square of the feature size, and so the cost performance improves as the inverse cube of the feature size. This allows personal computers and workstations to deliver today, for a few thousand dollars, the same performance that required a supercomputer costing several million dollars just ten years ago. However, we can exploit the technology advances in a different way, by increasing performance instead of (just) decreasing cost. Here, as illustrated in Figure 1, we build computers consisting of several of the basic VLSI building blocks. Integrated parallel computers require high speed links between the individual processors, called nodes. In Figure 1, these links are etched on a printed circuit board, but in larger systems one would also use cable or perhaps optical fiber to interconnect nodes. As shown in Figure 2, parallel computers have become the dominant supercomputer technology, with current high end systems capable of performance of up to 100 GigaFLOPS, or 10^11 floating point operations per second. One expects to routinely install parallel machines capable of TeraFLOPS sustained performance by the year 2000.
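As a worked restatement of this scaling argument (the symbol σ for feature size and the sample numbers are introduced here for illustration; they are not in the original text):

    \mathrm{clock\ speed} \propto 1/\sigma, \qquad \mathrm{cost} \propto \sigma^{2}, \qquad \mathrm{cost\ performance} \propto 1/\sigma^{3}.

Halving the feature size, say from 1.0 to 0.5 microns, therefore roughly doubles the clock speed and improves cost performance by a factor of 2^3 = 8.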

Figure 1: The nCUBE-2 Node and Its Integration into a Board. Up to 128 of these boards can be combined into a single supercomputer.

Figure 2: Performance of Parallel and Sequential Supercomputers. The figure plots performance (MEGAFLOP, GIGAFLOP, and TERAFLOP, i.e., millions, billions, and trillions of operations per second) against year from 1940 to 2000, with separate curves for massively parallel, modestly parallel, and sequential supercomputer nodes. Machines shown range from early accounting machines, SEAC, MANIAC, the IBM 704, STRETCH, the CDC 6600 and 7600, and the IBM 3090VF, through the CRAY-1S, CRAY-XMP, CRAY-YMP, CRAY-2S, C-90, ETA-10E, ETA-10G, and high end workstations, to parallel systems such as the Transputer, nCUBE-1, nCUBE-2, TMC CM5 (1024), Cray T3D (1024), IBM SP2 (512), Intel Paragon (6768), Fujitsu VPP500 (80), the parallel Cray C-90, and the planned Intel DoE (9000 P6) TERAFLOP machine.

Figure 3: Concurrent Construction of a Wall using N = 8 Bricklayers

Often, one compares such highly coupled parallel machines with the human brain, which achieves its remarkable capabilities by the linkage of some 10^12 nodes (neurons in the case of the brain) to solve individual complex problems, such as reasoning and pattern recognition. These nodes have individually mediocre capabilities and slow cycle times (about 0.001 seconds), but together they exhibit remarkable capabilities. Parallel (silicon) computers use fewer, faster processors (current MPPs have at most a few thousand microprocessor nodes), but the principle is the same.

However, society exhibits another form of collective computing, whereby it joins several brains together, with perhaps 100,000 people linked in the design and production of a major new aircraft system. This collective computer is "loosely coupled": the individual components (people) are often separated by large distances and connected by modest performance "links" (voice, memos, etc.).

Figure 3 shows HPDC at work in society, with a group of masons building a wall. This example is explained in detail in [Fox:88a], while Figure 4 shows the parallel neural computer that makes up each node of the HPDC system of Figure 3.

We have the same two choices in the computer world. Parallel processing is obtained with a tightly coupled set of nodes as in Figure 1 or Figure 4. High Performance Distributed Computing, analogously to Figure 3, is obtained from a geographically distributed set of computers linked together with "longer wires," but still coordinated (a software issue discussed later) to solve a "single problem" (see the application discussion later). HPDC is also known as metacomputing, NOWs (Networks of Workstations), and COWs (Clusters of Workstations), where each acronym has a slightly different focus in the broad HPDC area.

Figure 4: Three Strategies Found in the Brain (of a Rat). Each figure depicts brain activity corresponding to various functions: (A) continuous map of tactile inputs in somatosensory cortex, (B) patchy map of tactile inputs to cerebellar cortex, and (C) scattered mapping of olfactory cortex as represented by the unstructured pattern of 2DG uptake in a single section of this cortex [Nelson:90b].

Notice that the network (the so-called "longer wires" above) which links the individual nodes of an HPDC metacomputer can be of various forms; a few of the many choices are a local area network (LAN), such as Ethernet or FDDI; a high-performance (supercomputer) interconnect, such as HIPPI; or a wide area network (WAN) with ATM technology. The physical connectivity can be copper, optical fiber, and/or satellite. The geographic distribution can be a single room; the set of computers in the four NSF supercomputer centers linked by the vBNS; or, most grandiosely, the several hundred million computers linked by the Global Information Infrastructure (GII) in the year 2010.

HPDC is a very broad field; one can view client-server enterprise computing and parallel processing as special cases of it. As we describe later, it exploits and generalizes the software built for these other systems.

By definition, HPDC has no precise architecture or implementation at the hardware level; one can use whatever set of networks and computers is available to the problem at hand. Thus, in the remainder of the article, we focus first on applications and then on some of the software models.

In the application arena, we go through three areas: aircraft design and manufacture, military command and control, and multimedia information systems. In each case, we contrast the roles of parallel computing and HPDC. Parallel computing is used for the application components that can be broken into modules (such as grid points for differential equation solvers or pixels for images), but these modules are linked closely in the algorithms used to manipulate them. Correspondingly, one needs the low latency and high internode communication bandwidth of parallel machines. HPDC is typically used for coarser grain decompositions (e.g., the different convolutions on a single image rather than the different blocks of pixels in an image). Correspondingly, larger latencies and lower bandwidths can be tolerated. HPDC and parallel computing often deal with similar issues of synchronization and parallel data decomposition, but with different tradeoffs and problem characteristics. Indeed, there is no sharp division between these two concepts: clusters of workstations can be used for large scale parallel computing (as in CFD for engine simulation at Pratt and Whitney), while tightly coupled MPPs can be used to support multiple uncoupled users, a classic HPDC application.

2 Applications

Wehavechosen three examples to discuss HPDC and contrast distributed

and parallel computing. In the rst and second, manufacturing and com-

mand and control, we see classic parallel computing for simulations, and

signal pro cessing resp ectively linked to geographically distributed HPDC.

In the last, InfoVISiON, the parallel computing is seen in parallel multime-

dia , and the distributed HPDC asp ects are more pronounced.

2.1 Manufacturing and Computational Fluid Dynamics

HPDC is used today, and can be expected to play a growing role, in manufacturing and, more generally, engineering. For instance, the popular concept of agile manufacturing supposes a model where virtual corporations generate "products-on-demand." The NII is used to link collaborating organizations. HPDC is needed to support instant design (or, more accurately, redesign or customization) and sophisticated "test drives" for the customer. At the corporate infrastructure level, concurrent engineering involves integration of the different component disciplines involved in engineering, such as design, manufacturing, and product life cycle support. These general ideas are tested severely when they are applied to the design and manufacturing of complex systems such as automobiles, aircraft, and space vehicles such as shuttles. Both the complexity of these products, and in some sense the maturity of their design, place special constraints and challenges on HPDC.

High-performance computing is important in all aspects of the design of a new aircraft. However, it is worth noting that less than 5% of the initial costs of the Boeing 777 aircraft were incurred in computational fluid dynamics (CFD) airflow simulations, the "classic" Grand Challenge in this field. On the other hand, over 50% of these sunk costs could be attributed to overall systems issues. Thus, it is useful but not sufficient to study parallel computing for large scale CFD. This is "Amdahl's law for practical HPDC": if only 5% of a problem is parallelized, one can at best speed up, and impact one's goals (affordability, time to market), by this small amount. HPDC, thus, must be fully integrated into the entire engineering enterprise to be effective. Very roughly, we can view the ratio of 5% to 50% as a measure of the 1:10 relative relevance of parallel and distributed computing in this case.
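To make this "Amdahl's law for practical HPDC" concrete (the formula below is the standard Amdahl expression, stated here for illustration rather than taken from the text): if a fraction p of total cost or time is addressed by a component accelerated by a factor s, the overall gain is

    S = \frac{1}{(1 - p) + p/s}.

With p = 0.05, as for the CFD share of the Boeing 777 costs quoted above, even an unlimited speedup of the CFD component gives S ≈ 1/0.95 ≈ 1.05, about a 5% overall improvement, whereas attacking the 50% of costs tied to overall systems issues allows gains of up to a factor of two.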

The maturity of the field is illustrated by the design criteria used today. In the past, much effort was spent on improving performance: more speed, range, altitude, size. These are still critical under extreme conditions, but basically they just form a given design framework that suffices to buy you a place at the table (on the short-list). Rather, the key design criteria are competitiveness, including time to market, and total affordability. Although the design phase is not itself a major cost item, decisions made at this stage lock in most of the full life cycle cost of an aircraft, with perhaps 80% of total cost split roughly equally between maintenance and manufacturing. Thus, it certainly would be important to apply HPDC at the design phase to both shorten the design cycle (time to market) and lower the later ongoing costs of manufacturing and maintenance.

We take as an example the design of a future military aircraft, perhaps 10 years from now. This analysis is taken from a set of NASA sponsored activities centered on a study of ASOP, the Affordable Systems Optimization Process. This involved an industrial team including Rockwell International, Northrop Grumman, McDonnell Douglas, General Electric, and General Motors. ASOP is one of several possible approaches to multidisciplinary analysis and design (MAD), and the results of the study should be generally valid for these other MAD systems. The hypothetical aircraft design and construction project could involve six major companies and 20,000 smaller subcontractors. This impressive virtual corporation would be very geographically dispersed on both a national and probably international scale. The project could involve some 50 engineers at the first conceptual design phase. The later preliminary and detailed design stages could involve 200 and 2,000 engineers, respectively. The design would be fully electronic and demand major computing, information systems, and networking resources. For instance, some 10,000 separate programs would be involved in the design. These would range from a parallel CFD airflow simulation around the plane to an expert system to plan the location of an inspection port to optimize maintainability. There is a corresponding wide range of computing platforms, from PCs to MPPs, and a range of languages, from spreadsheets to High Performance Fortran. The integrated multidisciplinary optimization does not involve blindly linking all these programs together, but rather a large number of suboptimizations, each involving at any one time a small cluster of these base programs. Here we see clearly an essential role of HPDC: to implement this set of geographically distributed optimizations. However, these clusters could well need linking of geographically separated compute and information systems.

Figure 5: Affordable Systems Optimization Process (ASOP) Implemented on the NII for Aeronautics Systems. The figure shows the ASOP design environment (1.0) with its eight engines and toolkits (1.1–1.8: analysis, cost modeling, design, visualization, geometry, process modeling, simulation, and optimization), layered on the ASOP object backplane (2.0) of collaboration, metacomputing, data, security and access, and object services (2.1–2.5), which in turn rests on organization-specific ASOP infrastructure and on the NII infrastructure and services.

An aircraft is, of course, a very precise system, which must work essentially flawlessly. This requirement implies very strict coordination and control of the many different components of the aircraft design. Typically, there will be a master systems design to which all activities are synchronized at regular intervals, perhaps every month. The clustered suboptimizations represent a set of limited excursions from this base design that are managed in a loosely synchronous fashion on a monthly basis. The configuration management and database system are both critical, and represent a major difference between manufacturing and command and control, where in the latter case a real time "as good as you can do" response is more important than a set of precisely controlled activities. These issues are characteristic of HPDC where, although loosely coupled, the computers on our global network are linked to "solve a single problem."

ASOP is designed as a software backplane (the NII) linking eight major services or modules shown in Figure 5. These are the design (process controller) engine, visualization, optimization engine, simulation engine, process (manufacturing, producibility, supportability) modeling toolkit, costing toolkit, analytic modeling toolkit, and geometry toolkit. These are linked to a set of databases defining both the product and the component properties. Parallel computing is important in many of the base services, but HPDC is seen in the full system.

2.2 Command and Control

Command and Control (sometimes adding in Computing, Communications, Intelligence, Surveillance, and Battle Management, with the abbreviations lumped together as BMC4IS) is the task of managing and planning a military operation. It is very similar to the civilian area of crisis management, where the operations involve combating the effects of hurricanes, earthquakes, chemical spills, forest fires, etc. Both the military and civilian cases have computational "nuggets" where parallel computing is relevant. These include processing sensor data (signal and image processing) and simulations of such things as expected weather patterns and chemical plumes. One also needs large-scale multimedia databases, with HPDC issues related to those described for InfoVISiON in Section 2.3.

HPDC is needed to link military planners and decision makers, crisis managers, experts at so-called anchor desks, workers (warriors) in the field, information sources such as cable news feeds, and large-scale database and simulation engines.

A key characteristic of the required HPDC support is adaptivity. Crises and battles can occur anywhere and destroy an arbitrary fraction of the existing infrastructure. Adaptivity means making the best use of the remaining links, but also deploying and integrating mobile enhancements well. The information infrastructure must exhibit security and reliability, or at least excellent adaptivity. Network management must deal with unexpected capacity demands and real time constraints. Priority schemes must, when needed, give critical information (such as chemical plume monitoring and military sensor data) precedence over less time critical information, such as background network video footage.

Needed computing resources will vary from portable handheld systems to large backend MPPs. As there will be unpredictable battery (power) and bandwidth constraints, it is important that uniform user interfaces and similar services be available on all platforms, with, of course, the fidelity and quality of a service reflecting the intrinsic power of a given computer. As with the communications infrastructure, we must cope with unexpected capacity demands. As long as the NII is deployed nationally, computational capacity can be exploited in remote sites. The Department of Defense envisages using the basic NII (GII) infrastructure for command and control, augmented by "theater extensions" to bring needed communications into critical areas. The "take it as it is" characteristic of command and control requires that operating systems and programming models support a general adaptive mix (metacomputer) of coordinated, geographically distributed, networked computers. This network will adaptively link available people (using perhaps personal digital assistants) to large-scale computation on MPPs and other platforms. There are large computational requirements when forecasting physical phenomena in real time, such as the weather effects on a projected military action, forest fires, hurricanes, and the structure of damaged buildings. On a longer time scale, simulation can be used for contingency planning and capability assessment. Training with simulated virtual worlds supporting televirtuality requires major computational resources. In the information arena, applications include data mining to detect anomalous entries (outliers) in large federated multimedia databases. Data fusion including sensor input and processing, geographical information systems (with perhaps three-dimensional terrain rendering), and stereo reconstruction from multiple video streams are examples of the compute intensive image processing forming part of the needed HPDC environment.

A critical need for information management involves the best possible high-level extraction of knowledge from databanks; the crisis manager must make judgments in unexpected, urgent situations, and we cannot carefully tailor and massage data ahead of time. Rather, we need to search a disparate set of multimedia databases. As well as knowledge extraction from particular information sources, the systematic use of metadata allowing fast coarse grain searching is very important. This is a specific example of the importance of standards in expediting access to "unexpected" databases. One requires access to databases specific to the crisis region or battlefield, and widespread availability of such geographic and community information in electronic form is essential. There are very difficult policy and security issues, for many of these databases need to be made instantly available in a hassle-free fashion to the military commander or crisis manager; this could run counter to proprietary and security classification constraints. The information system should allow both network news and warriors in the field to deposit, in near real-time, digital versions of their crisis and battlefield videos and images.

As mentioned, we expect human and computer expertise to be available at "anchor desks" to support instant decisions in the heat of the battle. These have been used in a set of military exercises called JWID (Joint Warrior Interoperability Demonstrations). We note that this information scenario is a real-time version of the InfoVISiON scenario described in the next section.

Figure 6: The basic InfoVISiON scenario as seen by a home in the year 2000, with an intelligent set-top box interfacing the digital home to a hierarchical network of InfoVISiON servers. The set-top box and any other home devices (PCs, CD-ROMs, disks, etc.) on the storage-poor but compute-rich home side connect over a 1–20 megabits/sec (e.g., microwave) link to a storage-rich but comparatively compute-poor local InfoVISiON server, which is itself linked to central and other distributed InfoVISiON servers.

Command and Control has historically used HPDC, as the relevant computer and communication resources are naturally distributed and not centralized into a single MPP. We see this HPDC model growing into the standard information support environment for all the nation's enterprises, including business, education, and society. We now explore this in the following section.

2.3 Application Exemplar: InfoVISiON

High-performance distributed computers solve problems in science and engineering. We think of these problems as simulations of airflow, galaxies, bridges, and such things. However, entertaining, informing, and educating society is presumably an equally valid problem. Computers and networks of the NII will be able (see Figure 6) to deliver information at many megabits/second to "every" home, business (office), and school "desk." This network can be considered an HPDC system because one expects the information to be stored in a set of distributed multimedia servers that could vary from PCs to large MPPs, and to be delivered to a larger set of clients. As shown in Figure 7, one can estimate that the total compute capability in these servers and clients will be some hundred times greater than that of the entire set of supercomputers in the nation.

Figure 7: An estimate of the communication bandwidth and compute capability contained in the NII and supercomputer industries ("NII Compute & Communications Capability in Year 2005–2020"). The estimates are: 100 supercomputers at a teraflop each, or 10^14 (Fl)ops/sec at 100% duty cycle; 100 million NII offramps or connections at real-time video speeds, or about 10^14 bits to words per second, at about 10% to 30% duty cycle; 100 million home PCs, videogames, or set-top boxes at 100–1000 Mega(Fl)ops each, or 10^16 to 10^17 (Fl)ops/sec, at about 10% to 30% duty cycle; and 1,000 to 10,000 high performance multimedia (parallel) servers, each with some 1,000 to 10,000 nodes, or 10^15 to 10^16 ops/sec, at 100% duty cycle. Each of the three components (network connections, clients, servers) has a capital value of order $10 to $100 billion.
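The factor of one hundred quoted in the text can be read off these figure entries directly (the arithmetic below simply combines them):

    10^{8}\ \mathrm{clients} \times (10^{8}\mathrm{-}10^{9})\ \mathrm{(Fl)ops\ each} \approx 10^{16}\mathrm{-}10^{17}\ \mathrm{(Fl)ops/sec},

versus roughly 100 supercomputers at 10^12 (Fl)ops/sec each, or 10^14 (Fl)ops/sec in total. Even after allowing for the clients' 10–30% duty cycle, the aggregate client capability exceeds the supercomputer total by a factor of order ten to a few hundred, consistent with the "some hundred times" estimate above.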

Figure 8: A typical hierarchical server network, depicted for a master system in Hollywood cascading down with a fragment of node systems shown for central New York. A national master server (Hollywood for movies) feeds state servers (such as a New York State server) for all the different states, which feed county servers (such as an Onondaga County or central New York server) for all the different counties, which in turn feed servers at school districts, universities, shopping malls, and corporations; the whole hierarchy sits within the world-wide Global Information Infrastructure.

The computational issues in this application are somewhat different from those for the previous cases we considered. Classic parallel computing and languages, such as High Performance Fortran, are not critical. Large-scale distributed databases are the heart of this application, and they are accessed through the exploding set of Web technologies. Presumably, these will need to migrate from today's clients (PCs and workstations) to more user friendly, and at least initially less flexible, set-top box implementations controlling home entertainment systems. We will find the same need for data locality as in large scale simulations. As shown in Figure 8, when the latest Hollywood movie is released on the NII, one will not have half the nation directly connected to Hollywood. Rather, data is automatically replicated or cached on local servers, so that one only needs to communicate such "hot" information over a distance of a few miles. As in the simulation examples, communication bandwidth will be limited, and such steps are needed to reduce demand.

InfoVISiON will require simulation, but it will be more loosely coupled than for, say, large-scale CFD, and will consist of very many smaller problems. Interactive videogaming with multiple players sharing a virtual world is one clear need for simulation on the NII, and for this the three-dimensional database VRML has been introduced. However, another example that can use the same technology is remote viewing and exploration of consumer products, such as cars, furniture, and large appliances. Simulation will support virtual reality-like exploration and the automatic customization of such products for particular customers.

3 Software for HPDC

As in all areas of computing, the software support for HPDC is built up in a layered fashion, with no clearly agreed architecture for these layers. We will describe software for a target metacomputer consisting of a set of workstations. The issues are similar if you include more general nodes, such as PCs or MPPs. Many of the existing software systems for HPDC only support a subset of possible nodes, although "in principle" all could be extended to a general set.

In a layered approach, we start with a set of workstations running a particular variant of UNIX with a communication capability set up on top of TCP/IP. As high-speed networks such as ATM become available, there has been substantial research into optimizing communication protocols so as to take advantage of the increasing hardware capability. It seems likely that the flexibility of TCP/IP will ensure it will be a building block of all but the most highly optimized and specialized HPDC communication systems. On top of these base distributed computing operating system services, two classes of HPDC software support have been developed.

The most important of these we can call the MCPE (Metacomputing Programming Environment), which is the basic application program interface (API). The second class of software can be termed the MCMS (Metacomputing Management System), which provides the overall management and related services.

For MCMS software, the best known products are probably LoadLeveler (from IBM, produced for the SP-2 parallel machine), LSF (Load Sharing Facility from Platform Computing), DQS (Distributed Queuing System), and Condor from the University of Wisconsin. Facilities provided by this software class could include batch and other queues, scheduling and node allocation, process migration, load balancing, and fault tolerance. None of the current systems is very mature, and they do not offer integration of HPDC with parallel computing and such standards as HPF or MPI.

We can illustrate two possible MCPEs, or APIs, with PVM and the World Wide Web. PVM offers basic message passing support for heterogeneous nodes. It also has the necessary utilities to run "complete jobs," with the appropriate processes spawned.
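As a concrete illustration of this PVM programming style, here is a minimal master-side sketch in C (illustrative only: it assumes PVM 3 with its pvm3.h header is installed, and that a separate, hypothetical executable named "worker" has been built and placed where PVM can find it).

    /* master.c: spawn workers and collect one integer result from each. */
    #include <stdio.h>
    #include "pvm3.h"

    int main(void)
    {
        int tids[4], result, i, nspawned;

        pvm_mytid();                                /* enroll in the virtual machine */
        nspawned = pvm_spawn("worker", (char **)0,  /* start copies of "worker"      */
                             PvmTaskDefault, "", 4, tids);

        for (i = 0; i < nspawned; i++) {
            pvm_recv(-1, 1);                        /* receive from any task, tag 1  */
            pvm_upkint(&result, 1, 1);              /* unpack one integer            */
            printf("received %d\n", result);
        }
        pvm_exit();                                 /* leave the virtual machine     */
        return 0;
    }

Each worker would obtain the master's task identifier with pvm_parent(), pack its result with pvm_initsend(PvmDataDefault) and pvm_pkint(), and return it with pvm_send().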

The World Wide Web is, of course, a much more sophisticated environment, which currently offers HPDC support in the information sector. However, recent developments, such as Java, support embedded downloaded applications. This naturally enables embarrassingly parallel HPDC computational problems. We have proposed a more ambitious approach, termed WebWork, where the most complex HPDC computations would be supported. This approach has the nice feature that one can integrate HPDC support for database and compute capabilities. All three examples discussed in Section 2 required this. One can be concerned that the World Wide Web will have too much overhead with its use of the HTTP communication protocol and substantial Web server processing overhead. We expect that the benefits of the flexibility of WebWork will outweigh the disadvantages of these additional overheads. We can also expect the major Web technology development around the world to lead to much more efficient server and communication systems.

Glossary

Applets An application interface where referencing (perhaps by a mouse click) a remote application as a hyperlink to a server causes it to be downloaded and run on the client.

Asynchronous Transfer Mode (ATM) ATM is expected to be the primary networking technology for the NII to support multimedia communications. ATM has fixed length 53-byte messages (cells) and can run over any media, with the cells asynchronously transmitted. Typically, ATM is associated with Synchronous Optical Network (SONET) optical fiber digital networks running at rates from OC-1 (51.84 megabits/sec) and OC-3 (155.52 megabits/sec) to OC-48 (2,488.32 megabits/sec).

Bandwidth The communications capacity (measured in bits per second) of a transmission line or of a specific path through the network.

Clustered Computing A commonly found computing environment consists of many workstations connected together by a local area network. The workstations, which have become increasingly powerful over the years, can together be viewed as a significant computing resource. This resource is commonly known as a cluster of workstations, and can be generalized to a heterogeneous collection of machines with arbitrary architectures.

Command and Control This refers to the computer-supported decision making environment used by military commanders and intelligence officers. It is described in Section 2.2.

COW or NOW Clusters of Workstations (COWs) are a particular HPDC environment where one will often use optimized network links and interfaces to achieve high performance. A COW, if homogeneous, is particularly close to a classic homogeneous MPP built with the same CPU chipsets as workstations. Proponents of COWs claim that the use of commodity workstation nodes allows them to track technology better than MPPs. MPP proponents note that their optimized designs deliver higher performance, which outweighs the increased cost of low-volume designs and the effective performance loss due to later (maybe only by months) adoption of a given technology by the MPP compared to commodity markets.

Network of Workstations (NOW, at http://now.cs.berkeley.edu/) and SHRIMP (Scalable High-Performance Really Inexpensive Multi-Processor, at http://www.cs.princeton.edu/Shrimp/) are well-known research projects developing COWs.

Data Locality and Caching A key to sequential, parallel, and distributed computing is data locality. This concept involves minimizing the "distance" between processor and data. In sequential computing, this implies "caching" data in fast memory and arranging the computation to minimize access to data not in cache. In parallel and distributed computing, one uses data migration and replication to minimize the time a given node spends accessing data stored on another node.
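A minimal C sketch of the sequential caching point (illustrative, not from the original text): a two-dimensional C array is stored row by row, so traversing it in row order reuses each fetched cache line, while column order does not.

    #include <stdio.h>
    #define N 1024

    static double a[N][N];              /* zero-initialized static array */

    int main(void)
    {
        double sum = 0.0;
        int i, j;

        /* Row order: consecutive accesses touch adjacent memory, so each
           cache line brought in from memory is fully used.               */
        for (i = 0; i < N; i++)
            for (j = 0; j < N; j++)
                sum += a[i][j];

        /* Column order: successive accesses are N doubles apart, so cache
           lines are poorly reused and this loop is typically much slower. */
        for (j = 0; j < N; j++)
            for (i = 0; i < N; i++)
                sum += a[i][j];

        printf("%f\n", sum);
        return 0;
    }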

Data Mining This describes the search for and extraction of unexpected information from large databases. In a database of credit card transactions, a conventional database search will generate monthly statements for each customer. Data mining will discover, using ingenious algorithms, a linked set of records corresponding to fraudulent activity.

Data Parallelism A model of parallel or distributed computing in which a single operation can be applied to all elements of a data structure simultaneously. Often, these structures are arrays.
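A minimal C sketch of a data parallel operation (illustrative, not from the original text); in a data parallel language such as HPF, the loop below is written as a single whole-array statement (C = A + B) that the compiler can distribute across nodes.

    /* The same operation is applied to every element, with no dependence
       between elements, so all elements could be updated simultaneously. */
    void vector_add(int n, const double *a, const double *b, double *c)
    {
        int i;
        for (i = 0; i < n; i++)
            c[i] = a[i] + b[i];
    }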

Data Fusion A common command and control approach where the disparate sources of information available to a military or civilian commander or planner are integrated (or fused) together. Often, a GIS is used as the underlying environment.

Distributed Computing The use of networked heterogeneous computers to solve a single problem. The nodes (individual computers) are typically loosely coupled.

Distributed Computing Environment The OSF Distributed Computing Environment (DCE) is a comprehensive, integrated set of services that supports the development, use, and maintenance of distributed applications. It provides a uniform set of services, usable anywhere in the network, enabling applications to utilize the power of a heterogeneous network of computers. (http://www.osf.org/dce/)

Distributed Memory A computer architecture in which the memory of the nodes is distributed as separate units. Distributed memory hardware can support either a distributed memory programming model, such as message passing, or a shared memory programming model.

Distributed Queuing System (DQS) An experimental UNIX-based queuing system being developed at the Supercomputer Computations Research Institute (SCRI) at The Florida State University. DQS is designed as a management tool to aid in computational resource distribution across a network, and provides architecture transparency for both users and administrators across a heterogeneous environment. (http://www.scri.fsu.edu/~pasko/dqs.html)

Embarrassingly Parallel A class of problems that can be broken up into parts which can be executed essentially independently on a parallel or distributed computer.

Geographical Information System (GIS) A database system where information is displayed at locations on a digital map. Typically, this involves several possible overlays with different types of information. Functions such as image processing and planning (such as shortest path) can be invoked.

Gigabit A measure of network performance: one gigabit/sec is a bandwidth of 10^9 bits per second.

Gigaflop A measure of computer performance: one gigaflop is 10^9 floating point operations per second.

Global Information Infrastructure (GII) The GII is the natural worldwide extension of the NII, with a comparably exciting vision and an uncertain, vague definition.

High-Performance Computing and Communications (HPCC) Refers generically to the federal initiatives and associated projects and technologies that encompass parallel computing, HPDC, and the NII.

High-Performance Distributed Computing (HPDC) The use of distributed networked computers to achieve high performance on a single problem, i.e., the computers are coordinated and synchronized to achieve a common goal.

HPF A language specification published in 1993 by experts in compiler writing and parallel computation, the aim of which is to define a set of directives which will allow a Fortran 90 program to run efficiently on a distributed memory machine. At the time of writing, many hardware vendors have expressed interest, a few have preliminary compilers, and a few independent compiler producers also have early releases. If successful, HPF would mean data parallel programs can be written portably for various multiprocessor platforms.

Hyperlink The user level mechanism (a remote address specified in an HTML or VRML object) by which remote services are accessed by Web clients or servers.

Hypertext Markup Language (HTML) A syntax for describing documents to be displayed on the World Wide Web.

Hypertext Transport Protocol (HTTP) The protocol used in the communication between Web servers and clients.

InfoVISiON Information, Video, Imagery, and Simulation ON demand is the scenario described in Section 2.3 where multimedia servers deliver multimedia information to clients on demand, at the click of the user's mouse.

Integrated Services Digital Network (ISDN) A digital multimedia service standard with a performance of typically 128 kilobits/sec, but with the possibility of higher performance. ISDN can be implemented using existing telephone (POTS) wiring, but does not have the 1–20 megabits/second performance needed for full screen TV display at either VHS or high definition TV (HDTV) resolution. Digital video can be usefully sent with ISDN by using quarter screen resolution and/or a lower (than 30 frames per second) frame rate.

Internet A complex set of interlinked national and global networks using the IP messaging protocol, and transferring data, electronic mail, and World Wide Web traffic. In 1995, some 20 million people could access the Internet, typically by POTS. The Internet has some high-speed links, but the majority of transmissions achieve (1995) bandwidths of at best 100 kilobytes/sec. The Internet could be used as the network to support a metacomputer, but the limited bandwidth indicates that HPDC could only be achieved for embarrassingly parallel problems.

Internet Protocol (IP) The network-layer communication protocol used in the DARPA Internet. IP is responsible for host-to-host addressing and routing, packet forwarding, and packet fragmentation and reassembly.

Java A distributed computing language (Web technology) developed by Sun, which is based on C++ but supports Applets.

Latency The time taken to service a request or deliver a message, which is independent of the size or nature of the operation. The latency of a message passing system is the minimum time to deliver a message, even one of zero length that does not have to leave the source processor. The latency of a file system is the time required to decode and execute a null operation.
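A common first-order model combining latency with bandwidth (the formula and the sample numbers are illustrative, not part of the original glossary) gives the time to deliver a message of n bytes over a link of bandwidth B as

    T(n) = t_{\mathrm{latency}} + n/B.

For example, with a latency of 1 millisecond and a bandwidth of 10 megabytes/sec, a 10 kilobyte message takes about 1 ms of latency plus 1 ms of transfer, so short messages are dominated by latency and long messages by bandwidth.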

LAN, MAN, WAN Local, Metropolitan, and Wide Area Networks can be made from any or many of the different physical network media, and can run the different protocols. LANs are typically confined to departments (less than a kilometer), MANs to distances of order 10 kilometers, and WANs can extend worldwide.

Loose and Tight Coupling Here, coupling refers to the linking of computers in a network. Tight refers to low latency and high bandwidth; loose to high latency and/or low bandwidth. There is no clear dividing line between "loose" and "tight."

Massively Parallel Processing (MPP) The strict definition of MPP is a machine with many interconnected processors, where "many" is dependent on the state of the art. Currently, the majority of high-end machines have fewer than 256 processors. A more practical definition of an MPP is a machine whose architecture is capable of having many processors, that is, it is scalable. In particular, machines with a distributed memory design (in comparison with shared memory designs) are usually synonymous with MPPs, since they are not limited to a certain number of processors. In this sense, "many" is a number larger than the current largest number of processors in a shared-memory machine.

Megabit A measure of network performance: one megabit/sec is a bandwidth of 10^6 bits per second. Note that eight bits represent one character, called a byte.

Message Passing A style of inter-process communication in which processes send discrete messages to one another. Some computer architectures are called message passing architectures because they support this model in hardware, although message passing has often been used to construct operating systems and network software for sequential processors, shared memory machines, and distributed computers.

Message Passing Interface (MPI) The parallel programming community recently organized an effort to standardize the communication subroutine libraries used for programming on massively parallel computers such as Intel's Paragon and Cray's T3D, as well as networks of workstations. MPI not only unifies within a common framework programs written in a variety of existing (and currently incompatible) parallel languages, but allows for future portability of programs between machines.
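A minimal MPI sketch in C (illustrative, not from the original text), showing the basic point-to-point calls the entry refers to:

    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        int rank, size, value;
        MPI_Status status;

        MPI_Init(&argc, &argv);                   /* start the MPI environment */
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);     /* this process's number     */
        MPI_Comm_size(MPI_COMM_WORLD, &size);     /* total number of processes */

        if (size >= 2) {
            if (rank == 0) {
                value = 42;
                MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
            } else if (rank == 1) {
                MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);
                printf("process 1 received %d\n", value);
            }
        }
        MPI_Finalize();                           /* shut down MPI */
        return 0;
    }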

Metacomputer This term describes a collection of heterogeneous computers networked by a high-speed wide area network. Such an environment would recognize the strengths of each machine in the metacomputer, and use it accordingly to efficiently solve so-called metaproblems. The World Wide Web has the potential to be a physical realization of a metacomputer.

Metaproblem This term describes a class of problem which is outside the scope of a single computer architecture, but is instead best run on a metacomputer with many disparate designs. These problems consist of many constituent subproblems. An example is the design and manufacture of a modern aircraft, which presents problems in geometry, grid generation, fluid flow, acoustics, structural analysis, operational research, visualization, and database management. The metacomputer for such a metaproblem would be networked workstations, array processors, vector supercomputers, massively parallel processors, and visualization engines.

Multimedia Server or Client Multimedia refers to information (digital data) with different modalities, including text, images, video, and computer generated simulation. Servers dispense this data, and clients receive it. Some form of browsing or searching establishes which data is to be transferred. See also InfoVISiON.

Multiple-Instruction/Multiple-Data (MIMD) A parallel computer architecture where the nodes have separate instruction streams that can address separate memory locations on each clock cycle. All HPDC systems of interest are MIMD when viewed as a metacomputer, although the nodes of this metacomputer could have SIMD architectures.

Multipurpose Internet Mail Extension (MIME) The format used in sending multimedia messages between Web clients and servers, which is borrowed from that defined for electronic mail.

National Information Infrastructure (NII) The collection of ATM, cable, ISDN, POTS, satellite, and wireless networks connecting the collection of 10^8 to 10^9 computers that will be deployed across the U.S.A. as set-top boxes, PCs, workstations, and MPPs in the future. The NII can be viewed as just the network infrastructure, or as the full collection of networks, computers, and overlaid software services. The Internet and World Wide Web are a prototype of the NII.

Network A physical communication medium. A network may consist of one or more buses, a switch, or the links joining processors in a multicomputer.

Node A parallel or distributed system is made of a bunch of nodes, or fundamental computing units; these are typically fully fledged computers in the MIMD architecture.

(N)UMA UMA (Uniform Memory Access) refers to shared memory in which all locations have the same access characteristics, including the same access time. NUMA (Non-Uniform Memory Access) refers to the opposite scenario.

Parallel Computer A computer in which several functional units are executing independently. The architecture can vary from SMP to MPP, and the nodes (functional units) are tightly coupled.

POTS The conventional twisted pair based Plain Old Telephone Service.

Protocol A set of conventions and implementation methodologies defining the communication between nodes on a network. There is a famous seven layer OSI standard model going from the physical link (optical fiber to satellite) to the application layer (such as Fortran subroutine calls). Any given system, such as PVM or the Web, implements a particular set of protocols.

PVM PVM (Parallel Virtual Machine) was developed at Emory and Tennessee Universities and Oak Ridge National Laboratory. It supports the message passing programming model on a network of heterogeneous computers (http://www.epm.ornl.gov/pvm/).

Shared Memory Memory that appears to the user to be contained in a single address space that can be accessed by any process or any node (functional unit) of the computer. Shared memory may have UMA or NUMA structure. Distributed computers can have a shared memory model implemented in either hardware or software; this would always be NUMA. Shared memory parallel computers can be either NUMA or UMA.

Virtual or Distributed Shared Memory is (the illusion of) a shared memory built with physically distributed memory.

Single-Instruction/Multiple-Data (SIMD) A parallel computer architecture in which every node runs in lockstep, accessing a single global instruction stream but with different memory locations addressed by each node. Such synchronous operation is very unnatural for the nodes of an HPDC system.

Supercomputer The most powerful computer that is available at any given time. As performance is roughly proportional to cost, this is not very well defined for a scalable parallel computer. Traditionally, computers costing some $10–$30 M are termed supercomputers.

Symmetric Multiprocessor (SMP) A Symmetric Multiprocessor supports a shared memory programming model, typically with a UMA memory system and a collection of up to 32 nodes connected with a bus.

Televirtual The ultimate computer illusion, where the user is fully integrated into a simulated environment and so can interact naturally with fellow users distributed around the globe.

Teraflop A measure of computer performance: one teraflop is 10^12 floating point operations per second.

Transmission Control Protocol (TCP) A connection-oriented transport protocol used in the DARPA Internet. TCP provides for the reliable transfer of data, as well as the out-of-band indication of urgent data.

vBNS A high speed ATM experimental network (WAN) maintained by the National Science Foundation (NSF) to link its four supercomputer centers at Cornell, Illinois, Pittsburgh, and San Diego, as well as the Boulder National Center for Atmospheric Research (NCAR).

Virtual Reality Modeling Language (VRML) A "three-dimensional" HTML that can be used to give a universal description of three-dimensional objects and that supports hyperlinks to additional information.

Web Clients and Servers A distributed set of clients (requesters and receivers of services) and servers (receiving and satisfying requests from clients) using Web technologies.

WebWindows The operating environment created on the World Wide Web to manage a distributed set of networked computers. WebWindows is built from Web clients and Web servers.

WebWork [Fox:95a] An environment proposed by Boston University, Cooperating Systems Corporation, and Syracuse University, which integrates computing and information services to support a rich distributed programming environment.

World Wide Web and Web Technologies A very important software model for accessing information on the Internet, based on hyperlinks supported by Web technologies such as HTTP, HTML, MIME, Java, Applets, and VRML.

References

[Almasi:94a] Almasi, G. S., and Gottlieb, A. Highly Parallel Computing. The Benjamin/Cummings Publishing Company, Inc., Redwood City, CA, 1994. Second edition.

[Andrews:91a] Andrews, G. R. Concurrent Programming: Principles and Practice. The Benjamin/Cummings Publishing Company, Inc., Redwood City, CA, 1991.

[Angus:90a] Angus, I. G., Fox, G. C., Kim, J. S., and Walker, D. W. Solving Problems on Concurrent Processors: Software for Concurrent Processors, volume 2. Prentice-Hall, Inc., Englewood Cliffs, NJ, 1990.

[Birman:92a] Birman, K., et al. ISIS User Guide and Reference Manual. Isis Distributed Systems, Inc., Ithaca, NY, 1992.

[Foster:95a] Foster, I. Designing and Building Parallel Programs. Addison-Wesley, 1995. http://www.mcs.acl.gov/dbpp/.

[Fox:88a] Fox, G. C., Johnson, M. A., Lyzenga, G. A., Otto, S. W., Salmon, J. K., and Walker, D. W. Solving Problems on Concurrent Processors, volume 1. Prentice-Hall, Inc., Englewood Cliffs, NJ, 1988.

[Fox:89y] Fox, G. C. "Parallel computing." Technical Report C3P-830, California Institute of Technology, Pasadena, CA, September 1989. Chapter in Encyclopedia of Physical Science and Technology 1991 Yearbook, Academic Press, Inc.

[Fox:90o] Fox, G. C. "Applications of parallel supercomputers: Scientific results and computer science lessons," in M. A. Arbib and J. A. Robinson, editors, Natural and Artificial Parallel Computation, chapter 4, pages 47-90. MIT Press, Cambridge, MA, 1990. SCCS-23. Caltech Report C3P-806b.

[Fox:92b] Fox, G. C. "Parallel supercomputers," in C. H. Chen, editor, Computer Engineering Handbook, chapter 17. McGraw-Hill Publishing Company, New York, 1992. Caltech Report C3P-451d.

[Fox:92c] Fox, G. C. "The use of physics concepts in computation," in B. A. Huberman, editor, Computation: The Micro and the Macro View, chapter 3, pages 103-154. World Scientific Publishing Co. Ltd., 1992. SCCS-237, CRPC-TR92198. Caltech Report C3P-974.

[Fox:94a] Fox, G. C., Messina, P. C., and Williams, R. D., editors. Parallel Computing Works! Morgan Kaufmann Publishers, San Francisco, CA, 1994. http://www.infomall.org/npac/pcw/.

[Fox:95a] Fox, G. C., Furmanski, W., Chen, M., Rebbi, C., and Cowie, J. H. "WebWork: integrated programming environment tools for national and grand challenges." Technical Report SCCS-715, Syracuse University, NPAC, Syracuse, NY, June 1995. Joint Boston-CSC-NPAC Project Plan to Develop WebWork.

[Fox:95c] Fox, G. C. "Software and hardware requirements for some applications of parallel computing to industrial problems." Technical Report SCCS-717, Syracuse University, NPAC, Syracuse, NY, June 1995. Submitted to ICASE for publication (6/22/95); revised SCCS-134c.

[Fox:95d] Fox, G. C., and Furmanski, W. "The use of the national information infrastructure and high performance computers in industry," in Proceedings of the Second International Conference on Massively Parallel Processing using Optical Interconnections, pages 298-312, Los Alamitos, CA, October 1995. IEEE Computer Society Press. Syracuse University Technical Report SCCS-732.

[Fox:95e] Fox, G. C. "Basic issues and current status of parallel computing." Technical Report SCCS-736, Syracuse University, NPAC, Syracuse, NY, November 1995.

[Hariri:95a] Hariri, S., and Lu, B. ATM-based High Performance Distributed Computing. McGraw Hill, 1995. Zomaya, Albert (editor).

[Hariri:96a] Hariri, S. High Performance Distributed Computing: Network, Architecture and Programming. Prentice Hall, 1996. To be published.

[Hillis:85a] Hillis, W. D. The Connection Machine. MIT Press, Cambridge, MA, 1985.

[Hockney:81b] Hockney, R. W., and Jesshope, C. R. Parallel Computers. Adam Hilger, Ltd., Bristol, Great Britain, 1981.

[HPCC:96a] National Science and Technology Council, "High performance computing and communications," 1996. 1996 Federal Blue Book. A report by the Committee on Information and Communications. Web address http://www.hpcc.gov/blue96/.

[Kung:92a] Kung, H. T. "Gigabit local area networks: A systems perspective," IEEE Communications Magazine, April 1992.

[McBryan:94a] McBryan, O. "An overview of message passing environments," Parallel Computing, 20(4):417-444, 1994.

[Morse:94a] Morse, H. S. Practical Parallel Computing. Academic Press, Cambridge, Massachusetts, 1994.

[Narayan:95a] Narayan, S., Hwang, Y., Kodkani, N., Park, S., Yu, H., and Hariri, S. "ISDN: an efficient network technology to deliver information services," in International Conference on Computer Applications in Industry and Engineering, 1995.

[Nelson:90b] Nelson, M. E., Furmanski, W., and Bower, J. M. "Brain maps and parallel computers," Trends Neurosci., 10:403-408, 1990.

[Pfister:95a] Pfister, G. F. In Search of Clusters: The Coming Battle in Lowly Parallel Computing. Prentice Hall, 1995.

[Stone:91a] Stone, H. S., and Cocke, J. "Computer architecture in the 1990s," IEEE Computer, pages 30-38, 1991.

[SuperC:91a] Proceedings of Supercomputing '91, Los Alamitos, California, 1991. IEEE Computer Society Press.

[SuperC:92a] Proceedings of Supercomputing '92, Los Alamitos, California, November 1992. IEEE Computer Society Press. Held in Minneapolis, Minnesota.

[SuperC:93a] Proceedings of Supercomputing '93, Los Alamitos, California, November 1993. IEEE Computer Society Press. Held in Portland, Oregon.

[SuperC:94a] Proceedings of Supercomputing '94, Los Alamitos, California, November 1994. IEEE Computer Society Press. Held in Washington, D.C.

[Tanenbaum:95a] Tanenbaum, A. S. Distributed Operating Systems. Prentice Hall, 1995.

[Trew:91a] Trew, A., and Wilson, G. Past, Present, Parallel: A Survey of Available Parallel Computing Systems. Springer-Verlag, Berlin, 1991.