<<

students to learning on their own. Both computational power and scientific ad- 9. j. r. Anderson. The ttfMr^q " ACT* is the most recent"*in a series of theories ed the lessons. However, students work- intelligent tutoring methods for pedago- ff^lSt U^t^eTSr^If ing witn the tutor spent 30 gy. However, we should stress that data Thought, a name that has only historical signifi- percent less time doing the problems collected in these pedagogical experi- n. STerown and K. VanLehn. Costive Sci 4 associated with the lemons and scored ments advance the science of human „ "\mQ) - -43 percent better on the final exam than cognition. SiS^^^^ the Students Working On their own. Press. New York. 1982). References and Notes 13. J. R. Anderson and R. Jeffries. Hum. Comput. , in press. i ? c r. wn a fro rid J Grecno, in Research Brief- 14. B. S. Bloom.Educ. Researcher 13. 3 (1984). demy Prt>s Washing- 15> c Boyle and J R Anderson Overalloverall Assessment ofDC ' and7" , ' ncu " mTi'> n ,-., automated instruction of geometry"^"«"proof 2. D Sleeman and J S. Brown. Eds., Intelligent paperpresented at theAmerican Educa- g Sys,ems Academic , „ __ ioo-n < New York, tion Researchiwsi.aii.il rvs>utiauunAssociation meetings, New Or-ur- Researcht> . on "...... , iS*°?"982) leans. intelligenttutoring is now v . A April 1984. on the threshold of a methodology that ' ' L2^^i£SSS^JKSKS'te will make qualitativechanges abil- M^.r^-S""hlske New York Times < 9 December*' "' gence

ically sell for $10,000 to $100,000 and sell for $100,000 to $500,000. 3) A new computer class can be pro- ducedby combining parts in a new way. Multis: A New Class of The and various micro- processor-based including Multiprocessor Computers and multis emerged in this fashion. In 1964. the supercomputer class was introduced with the CDC-6600, C. Gordon Bell although large computers, including IBM's Stretch, had been built earlier. Seymour Cray's CDC-6600 contained rfx „ . . about a half-million denselypacked, Fre- cost and selling pnce, thereby increasing on-cooled transistors connected with g ictecZToevthafrr th£ effectlveness for cu «*«. up For discrete circuits. Since the CDC-6600. «"uSs I ITJ aCCOrdmgly'TT thef : examPle < computers" were es- Cray has designed nearlyall ofthe super- v" tnrt t ?' ,meg ciTTd,^ " tabHshed in 195 With the Univac computers, which have madeuse ofvari- ge U which cost between $300,000 $5 ° and mil-*' ous forms of parallel computation to T>pical> one technology is used until Tni^r"^7 ""T \ hon. The best known family of main- execute a single instruction In Cray's mtr dUCed by IBM m 1%4 *«*■« 'speed is obtained by" o- or anotheftltol "" the"SyStem° 360 Which em ' WaS based Cessine vectors a < over million float- eLTo *"" tech"^V- In the early ing-pomt operationsper second. Cray's manTas' S"JT-^ g? ?" " IBM imrodUCed the 3? SeriesS suPerc"P"ters sell for $4 million to siHconsilicon chip, and *" ' chin^ni con- which used integratedcircuits; and° most million V nte" t,y tHe 43XX~3BOX S6rieS' WhiCh In 19?K a Single Chl rocessor the W tS WUh , > P "e^Ta^ S^-l ES r Use f hlgh-Perf°"nance integral- Intel 4004 , was intro-- mant «">m ir , an St Sll n Cd drCUitS ° intr dUCed duced and used in a nmge of H ° K° ""*" ' - apph- cZeJ^l Z. 2) ° St d3SS f COmpUt - Cations from calculato Tcost teeT, V"'" n ers WlthV'"'the ' "*' controSL acteris^ d!SSiP same'^Perf° °^nce° as a pre- for microwave ovens. In*1975, Altair packinlden tv rTKT !!7H ' !nH vlous computer can be produced, which introduced the first home- computer, different ways in which technolog.es Seremwa^ the will result in new applications for com- based on Intel's 8080 microprocessor ha t£d na ! PUterS- a d Phonal Today, microprocessors have" the La" vaTetv of\ZnntmpUter daSSeSr basedJ on comP"ters are the best known" products tures necessary nrice n° for building high-speed fFif of this design path. The first minicom- computers with u/VWhen a new virtual memory that technology becomes puter, the PDP-8, which was the result compare favorably with minicomputers available,there are usually three ways to of second-generation technology, was This high-performance component of lU tW f WhlCh gCnerate introduced in 1965 Djgital Equip- negligible cost has permitted the Ve i ° intro- aSSeS:° nt C By 1972, 91 compa- duction of many new n Th?,ecme hnn v , u classes of comput- ii technology can be used to in- "nies had°?°T*tion.formed to build minicomputers __ crease the performance of an existing with the third-generation technology,in- _c - Gordon Bell is vice Chairman ofTechnology ofcomputers ration class while maintainingthe tegratedcircuits (2). Minicomputers typ- Ma"sachu^ensPSiCorP° - We,les,ey Hills 462 SCIENCE. VOL. 228

3*

j Interact., ■

«, skills",

Press,■— »-v.^ ium, nuii r>ew

,T ,"

'

a/., 8,

t

S

C!I

' ,atCSt

trans,Stor S $20

reC

rS W3S C

reSU

'

-ft ers (2), including lap and home comput- asymmetric processing is that all the permits a trade-off between expensive ers ($2OO to personal computers operating-system functions are per- high-speed memory, and large, cheap to multiple-user formed by a single processor, which can low-speed memory. Since the content of (shared) ($6OOO to create a bottleneck in that processor, each cache is associated with a proces- workstations to IBM developed both symmetric and sor, extra logic must be added to the and microprocessor-based asymmetric dual processors, including system to delete "cached" copies of multiprocessors, or multis to the System 370 and System 308Xseries data or instructions that have been modi- It is with this newest class of computers (4). The IBM Attached Pro- fied by anotherprocessor. If a processor computers, the multis, that we will be cessor however, is asymmetric uses stale data, incorrect calculations i concerned. and, addition, requires f in the master can occur. i processor to perform the input-output function. Multiprocessors, Multicomputers, In the past, uniprocessors had a cost The Structure of a Multi and Multis advantageover multiprocessorsbecause the cost of switching and cabling is pro- Multis are based on advances in mi- Multiprocessors are computers that portional to the number of processors croprocessor technology and the cache two or moreprocessors capable the (5). memory 2). ! contain and memories in system As a (Fig. They use a single set of of independently executing instructions result ofprogram sharing, a multiproces- wires called a for all communication and gainingaccess to programs and data held in a common memory. A processor is the part of the computer that carries Summary. Multis are a new class of computers based on multiplemicroprocessors. out computational work by retrieving The small size, low cost, and highperformance of microprocessors allowthe design instructions from memory and perform- and construction of computer structures that offer significant advantages in manufac- ing operations on data that were also ture, price-performance ratio, and reliability over traditional computer families. Cur- retrieved from memory. Thus informa- rently, commercialmultis consist of 4 to 28 modules, which include microprocessors, tion in a multiprocessor can be freely common memories, and input-output devices, all of which communicatethrough a used by and exchanged among all pro- single set of wires called a bus. Adding microprocessors together increases the cessors. performance of multis in direct proportion to their price and allows multis to offer a In contrast, multicomputers consist of performancerange that spans that of smallminicomputersto mainframe computers. ! interconnected, independentcomputers, Multis are commercially available for applications ranging from real-time industrial each ofwhich has a processor, thatcom- control to transaction processing. Traditional batch, time-sharing, and transaction j municate by passing messages through systems process a number of independent jobs that can be distributedamong the fixed links or a switch such as a local- microprocessors of a multi with a resulting increased throughput (number of jobs area network. Intel recently introduced a completedper unit of time). Many scientific applications(such as the solvingof partial series of multicomputers,called the Intel differentialequations) and engineeringapplications(such as thechecking of integrat- iPSC Family of Concurrent Supercom- ed circuit designs) are speededup by this parallelcomputation; thus, multis produce puters, which have 32, 64, or 128 com- results at supercomputer speedbut at a fraction of the cost. Multis are likelyto be the puters connected in a hypercube config- basis for the next, the generationof computers a generationbased on parallel uration. Within a model, each computer processing. — is connected to five, six, or seven other computers through message-passing links that can transfer dataat arate of 10 sor with N processors requires less between processors, memories, and in- megabits per second. memory than N independentcomputers, put-output devices. This bus and corn- Since a multiprocessor can be parti- but the multiprocessor's memory must puter structure was pioneered in Digital tioned into independent computers and be faster and larger than a uniproces- Equipment Corporation's single-proces- use common memoryfor intercommuni- sor's. Unfortunately, a larger memory sor PDP-11 Unibus. which was intro- cation, it is a more general machine than that is k times faster costs more than k duced in 1970 (8). In a multi, the cache a multicomputer and, indeed, can simu- times as much per bit. memory associated with a processor late one. Althoughall mainframe manu- Several multiprocessors have been services approximately95 percent of the facturers sold multiprocessors with two built that containa single, central switch, processor's requests for memory. Since tofour processors before the fourth gen- The advantageof a central switch is low only 5 percent of a processor's requests eration of computers was developed,the cost, because the cost ofcable is directly reach the commonbus. therequirements multiprocessor structure has not been dependenton the number of processors for the bus bandwidth are reduced by a used in the low-cost classes of comput- and memories. Two single-switch com- factor of20. This allowsup to 20 timesas ers. puter systems with 16 processors have many processors as areavailable in corn- Multiprocessors are generally sym- been built: oneforhigh-performancemil- puters without cache memories. In addi- metric; that is, any processor in them itaryuse (6) and the otherfor experimen- tion, because all processor requests for can operate on any jobwithin the memo- tation in parallel processing (5). primary memory appear on the common ry. The Burroughs 85000, introduced in The introduction of cache memory (7) bus, each cache can independentlymoni- 1961, was a dual symmetric multiproces- in the late 1960's led 20 years later to the tor the bus and delete any data it con- sor (5). Some multiprocessors, however, design of microprocessor-based multi- tains that has been modified in primary are asymmetric; that is, one master pro- processors, or multis. Cache memory is memory by anotherprocessor (9, 10). cessor performs the operating system a small, high-speed memory that stores Standard microprocessor buses such functions and some applications work frequently used instructions and data. as Intel's 11, Motorola's VME, while all the other processors only do Placed between the processor and the Texas Instrument's Nu Bus, and the applications work. The disadvantage of large, main memory, the cache memory proposed lEEE 896 provide 26 APRIL 1985 463

$1500), ($lOOO $10,000),

$20,000), ($lO,OOO $60,000), ($20,000 $500,000). Scheme,

fifth, r? > 7 10 mils theconstruction ofphysically small multis, akey factor in their highperform- Fig. 1. The price of ance. Consideration of signal propaga- classes of computers plotted against time tion and packaging leads to a roughly 10 ( illustrates the cluster- cube-shaped system with 10 to 20 mod- ing of classes such as ules. The processors (both central and super- input-out) and memory occupy about computers, minicom- :| 5 half the volume, the heightis propor- .1 10 puters and supermini- and i o computers, tional to the bus width and, roughly, the "D personal i and home computers, bus-bandwidth capacity. The length i .Workstations I shared micros, work- (number of modules)and (the 4 depth area 10 stations, and multis. emergence of each module) of multis increase in * The of proportion new classes of com- to the bus capacity.Amdahl's puters coincides with Rule (75) describes how memory size I 103 the advent of new and input-output rate need to grow in technologies, as proportion to processing rate. marked by the gener- a built a number ations of technolo- Because multi is from ofmodules, 2 gies. Shaded areas it has an inherent redundan- 10 identify microproces- cy. Typically, a multi consists of four 1950 1960 1970 T 1980 1990 sor-based systems. module types: processor, memory, and Ist 2nd 3rd 4th two types of in-out controllers for disks i Computer generations and terminals. From these components, multis of 10 to 20 modules are built. The inherent redundancy provides 4 for multiple processors and operate at a circuits of transistor-transistor logic multis with greaterreliability(probabili- rate of about 10 million transactions per (TTL) and emitter-coupledlogic (ECL), ty that the system is operational), avail- second. A typical bus with time-shared respectively. The speed of bipolar cir- ability (fraction of time the system is and data dataat i address lines will deliver cuits has only increased by 15 percent available for use), and maintainability 3 20 megabytesper second or allowaccess per yearover the last 10 years (2), and it (the time to repair) ! than traditionalcom- ! to 5 million four-byte memoryvaluesper has been estimated that the speed of the puters of the same size. Indeed, the I second. If separate address and 64-bit largest single processor will improveby increase in these characteristics is great- 1 datalines areprovidedand simultaneous only 23 percent per year to reach 80 er than an i orderof magnitude. requests for memory are permitted, a million by 1990 All parameters can be adjusted at any large multi can be constructed to deliver (14). Microprocessors, however, are time during the computer's lifetime by I data at 100 megabytes per second or based on metal-oxide semiconductor selectingthe appropriatenumberofeach 1 allowaccess to 12.5 million 64-bit memo- (MOS) technology, which has increased type of module. With the appropriate ry values per second. in internal speed by 40 percent per year various modules can be System performance is linearly pro- for the last 10 years. The speed of TTL marked as faulty and taken out of ser- portional to the number of processors and MOS logic are now about the same. vice, which allows the system to func- until the bus is saturated with requests High-performance minicomputers use tion even in the presence offailed mod- for access to the memory (//). Because additional hardwarefor increased paral- ules. Maintainability is increased even ofthis linear proportionality,multis pro- lelism and faster floating-point arithme- more by the practice of having spare vide the means for producing a range of tic to outperform microcomputers, but modules within the individual comput- compatible computers without a specific the gapin theirperformance should close ers. The owner should be able to main- processordesign and specific technology in the next few years. tain a multi by simply replacing faulty for each model(72). With multis, aclient The microprocessor's small size per- modules. can select the computer that meets his needs by selecting the appropriate num- ber ofprocessors or memories insteadof Modules r 1 r i Manufacturing, Cost, and Longevity having to select from different family I > Central 1 < | ; processor i j members. Primary ; i Cache ' memory | I Hardware for multis is inherently sim- ' i memory '» Current microprocessor-chip sets (75) ' * ple because of the few module types and such as the National 32032 and Motorola the one explicitly defined interface stan- 68020 have the key characteristics of dard, the bus. A multi consisting of 20 past mainframes and minicomputers in- modules offour types can be designed cluding: 32-bit addressing, virtual memo- with less than one-fifth the effort ofa 20- ry control, complete instruction sets, in- -module , because in mini- cluding floating-point data, and perform- Input- computers each processor moduleis dif- ance levels per single-chipset that equal i ferent and all modulescommunicatewith those of a minicomputer. The gap be- L_ J one another in different ways. Conven- ! tween mainframe-processor and micro- To communications equipment, tional minicomputersrequire 2 or 3 years local area other buses processor performance will continue to of design (16, 17). close owing to the difference in the rate Fig. 2. Multis, multimicroprocessor comput- ers, are a Since the cost of product tooling is of improvement of the underlying tech- organized around single, uniform bus that connects central and input-output also proportional to design cost, the nologies. Minicomputers and main- processors and common memory. Each pro- manufacturing cost is reduced by build- frames are a based on bipolar integrated cessor requires cache memory. ing only four module types, each of 464 VOL. 228

mainframes,

software,

;

output computer

disks, tapes, networks,

SCIENCE, * which can be tested independently. The Encore's Multimax is designed for performance point. Furthermore, the cost of manufacturing also decreases as high-performance processing and high multi can be expanded to cover a wide more modules are manufactured, follow- input-output data rates; it is equipped range of requirements. ing a traditional manufacturing learning with the Nanobus. which transmits 100 curve. megabytes per second. The Multimax The product life of a multi is deter- permits up to 20 central processors (on Work Throughput and Speedup mined by the bus speed. Higher speed ten modules) or 10 input-output comput- means greater longevity because it en- ers to address 32 megabytes of memory Two characteristics specify the per- ables the bus to accept more and faster on eight 4-megabyte modules. A single formance of a multiprocessor computer: processors before the bus becomes satu- modulecontrols requestsfor thebus and throughput and speedup. Throughput is rated. The use of cache memory further provides centralized services including the number of jobs completed per unit increases longevityby allowingmemory timers, system initialization, and diag- time. Speedup is the number of times speed and size to be increased indepen- nostics. Each module, including memo- faster a single job can be completed by dently. ry, has a computer that is used for stand- using multiprocessors. For uniproces- alone and on-line diagnostics and main- sors, any speedupprovided by new tech- tenance. The bus and system architec- nology translates directly into higher The Proof of the Pudding ture permits 1024processorsto address4 throughput. Since a multi can be expand- gigabytes of physical and virtual memo- ed by the incrementaladdition of proces- Several multis have alreadybeen de- ry. Terminal and access is sors, both its throughput and speedup veloped and introduced. Masscomp has provided by one or morelocal-area net- can be improved without changing the introducedan asymmetrical dual proces- works ( orthe lEEE 802.3). The underlying technology. sorfor laboratory use. Unidot is offering operating system is UNIX 4.2 rede- Traditional uniprocessorsoperate on a a dual processor that uses National mi- signed for operation on a multi. stream of independent jobs. Dividing i croprocessors connectedby Intel's Mul- Figure 3 compares the performance these jobs (79) among the independent tibus. Arete provides a quadruple pro- and price of two multis with various processors in a multi results in improve- cessor that uses a proprietary, 16-mega- minicomputers. The multis compare fa- ment in throughput. This form ofparallel byte-per-second bus to connect Motor- vorably with the traditional approach of processing can be appliedto a numberof ola 68000 microprocessors. Sequent has buildinga family of products from sepa- transactions; and, it is introduced the Balance 8000, which can rate technologies. A single multi is inher- transparent to users. contain as many as 12 National 32032 ently a family that covers a traditional All modernbatch and time-sharingop- microprocessorsconnected by abus that product line family at every price and crating systems (such as UNIX) are mul- I can transmit 27 megabytes per second. Though not a multi per se, Elexsi's six- processor is a multiprocessor imple- 15 mented with ECL technologyand a bus that can transfer data at 320 megabytes per second. Introduced in 1983, the Synapse 13 4 *7V+1" computer is a relatively large ■o multi that uses the Motorola 68000 mi- c o croprocessor with special memory-man- v Fig. Performance is CU 3. CO 1 1 agement hardware. The Synapse struc- plotted against price ture has one morecomponent (bus, pow- for two multis (A and a B) er supply, processor, memory, in-out and several con- cto controller, and in-out device) than is ventional computer o families. C, D. and E 9 required for operation. Designed for are families of mini- 3 to fault-tolerant transaction processing, the computers; F is a c Synapse system is composed of as many family of mainframe as 28 arithmetic or in-out processors that computers. The cost- o effectiveness of a £ 7 communicate with as many as four mem- multi is achieved by o ory modules for a total of 32 megabytes the its use of same E I of memory. All are connected components to cover i modules a> t with full-duplexed buses that transfer a range offunctions as ! the indi- data at 32 megabytes per second. The opposed to vidual design, semi- c cache memories and main memory have CA conductor technolo- (0 CD control circuitry to determine the loca- gy, and packaging re- v tion of the "real" data when multiple quired for each mem- o o. 3 processors write to the same memory ber of a traditional computer family. location. This scheme minimizes the bus traffic caused by writing (18) compared with simple write-through schemes that always write in memory (9, 10). The processing rate is 11 to 14 million in- structions per second, a rate comparable to state-of-the-art uniprocessor main- List price of central processing units frames. (thousands of dollars) 26 APRIL 1985 465

furthermore,