The Illiac IV System

~ROCFEDINOS OF THE IEEE, VOL. 60, NO. 4, APRIL 1972 369 49, p. 439, 1966. 'H E. W. Wagner, British Patent No. 1,065796 1967. W. H. F. Talbot, "Photogem Drawing," The Athenaeum, No. 589, p. 114, 1839. lL9 L. E Walkup, US. Patent No. 2,777,957, 1957. I59 K. Tamarlbuchl.and M. L. Smlth."Charge Determiling Species in Non-Aqueous ''O L. E. Walkup, U.S. Patent No. 2.825.614. 1954. Solvents." J. Colloid Interlace Sci., VoI. 22, p. 404, 1966. P. J. Warter, Jr.. ' FactorsDeterminlng Xerographlc Photoreceptor Performance," '60 T. Taniand S. Klkuchi."Spectral Sensitlzarion in Photographyand Electrophotog- Appl. Optics: Suppl. =3 on Electrophotography. p 65, 1969. raphy," Report lnst. Industrial SCI., Univ. Tokyo, Vol. 18, p. 51, 1968. H.Watanabe. et al."The Activation Energy !or OxygenDesorptlon from Zinc Oxide A. Tereninand I. Akimov."Some Experiments on thePhotosensitization MechaTism Surfaces." Jap. J. Appl. Phys., Vol. 4, p. 945. 1965. of Semiconductors by Dyes," J. Phys. Chem., VoI 69. p. 730, 1965. "' J. W. Weigl. Photographic Science, Ed. by W. F. Berg, New York: Focal Press, 1963. IiZ V. D Tughan and R. C. Pink, "Solutions of Metal Soaps in Organic Solvents Part 11," 'I4 R. D. Weiss, "Electrolytic Photography," Phot. Sci. and Enq., Vol. 11. p. 287, 1967. J. Chem. p. 1604, 1951 Soc., 'I5 P. H. Wersema,et al, "Calculatlon of the Electrophoretic Mobility of aSpherical ' h3 V. Tuiagin."Imaging Method Based on Photoelectrophoresis," J. Opt Soc. Amer., Collold Particle," J. CoUoid Interface Sci., Vol. 22. p. 78. 1966. VoI. 59, p. 328. 1969. 'leH. Wiellcki. private communicatlon. ld4 E. J Verwey and J. Th. G. Overbeek, The- of theStability of LyophobicColloids, Amsterdam: Elsevler, 1946. 117 N. E. Wolff. "A PhotoconductiveThermoplastic Recording System," RCA Review, Vol. 25, p. 200. 1964. J. Viscakas.et al. "Recombination in Selenium Electrophotographic Layers," Awl. Optics: Suppl. -3 on Electrophotography, p. 27, 1969. 'leW. Yellin, et al, British Patelt No. 1,016,072, 1966. W. C. York, U.S. Patent No. 3,135.695. 1964. Ibb J. Viscakasand V. Gaidelis."The Role of IntercrystallineBarriers in ZnO Electro- photographicLayers," Reprographle 11, llnd Int.Conf. Cologne, Helwich Darmstadt, 'do C. J. Youngand H. G. Grieg."Electrofax: Direct Electrophotographic Printing on 1969. Paper," RCA Review, Vol. 15. p, 469, 1954. Ib7 0. Von Bronk, British Patent No. 188.030, 1922 '8' Anon., Electronic News, p. 68, February 16, 1970. The Illiac IV System W. J. BOUKNIGHT, STEWART A. DENENBERG, DAVID E. McINTYRE, J. M. RANDALL, AMED H. SmH,AND DANIEL L. SLOTNICK, SENIOR MEMBER, IEEE Invited Paper -Tho masons for the creation of llliac IV an described and Wn historyof tho liiiac IV pt+ct is mwntod. The archihchrn or hardware struetun of * llliac IV is discuss- llllac IV array is an army processor with a rp.dalied controlunit (CUI that urn be vkwed as a small stand-alone computer. The llliac IV sohare strategy is described in krms of current user habits and needs. Brief descriptions are given of the systems software itself, its history, and the maior lessons leaddur- ingits development. Some ideas forfuture development are suggested. Applications of lliiac N are discussed in terms of evaluating the fundon ffxJ rimultanoously on up to 64 distinct argument sets x<.Many of the time- consuming probkms In scientiAc computation invoke repeated evaludon of the same function on dihrenl agument soh. The agumenl sets which compose the problem data base must be structured in such a fashion tho+ theycan bo dimtbuted omong 64 separatememories. Two matrix ap plications: Jacobi's algorithm for Anding the eigenvalues and eigenvectors MEMORY of mal symmetric matrices, and reducing a mal nonsymmetric matrix to Fig. 1. Functional relationsa within a conventional computer. CU the uppor-Hawnberg form usingHouseholder's transformotions are dis- The cussed in detail. The ARPA networlr, a highly sophisticated and wide has thefunction of fetchinginstructions which arestored in memory, ranging experiment in the remote access and sharingof computer n- decoding or interpreting these instndons, and linally generating SOUKOS, is brieAydescribed and itscurrent status discussed. Many re- the microsequences of electronic pulses which cause the instruction marchers located about thecountry who will use Illiac IV in solving to be performed. The performance of the instruction may entail the problems will do so via the network. Thevarious systems, hardware, use or "driving" of oneof the three other components. TheCU may also a small registers can and procedures they will use is discussed. contain amount of memory called that be accessed faster than the main memory. The ALU contains the elec- I. INTRODUCI-ION tronic circuitry necess~~yto perform arithmetic and logical operations. The ALU may also contain register storage. Memory is the T ALL BEGAN in the early 1950's shortly after EDVAC [l] medium by which information (instructions or data) is stored. The became operational. Hundreds, then thousands of compu- 1/0accepts information which is input to or output from Memory. 1ters were manufactured, and they were generally organized The 1/0 hardware may also take care of converting the information from one coding scheme to another. The CU andALU taken on Von Neumann's concepts, as shown and described in Fig. 1. together are sometimes called a CPU. In the decade betwetm 1950 and 190, memories became cheaper and faster, andthe conceptof archival storage was evolved; control-and-arithmetic and logic units became more sophisticated: 1/0 devices expanded from typewriter to magnetic tape units, disks, drums, and remote terminals. But the four basic components of a conventional computer (control unit (0,arithmetic- Manuscript received December 10, 1971; revised January 17, 1972. and-logic unit (ALLJ), memory, and I/O) were all present in one Ttris inoitedpper is one of a series phned on topics of general interest- The Mtor. form or another. The authors are with the Center for Advanced Computation andthe The turning away fromthe conventional organization came in Illiac N Project, University of Illinois, Urbana, Ill. 61801. the middle lm's, when the law of diminishing returns began to Authorized licensed use limited to: The University of Auckland. Downloaded on May 27,2010 at 01:22:55 UTC from IEEE Xplore. Restrictions apply. 370 PR~INGSOF THE IEEE, APRIL 1972 Timing Cycle 1 2 3 4 5 Fig. 2. Pipelined operation. The large boxes representthe circuits required to transform the operands A and B into the quantity O(A, B) (some function of A and B, say the sum of A and B). The smaller boxes represent storage stages for the intermediate results O(A, B), and @A, B)2, and the desired resultO(A, B). The operation0 has been broken.down intothree stages, each of which acceptsas input the output of the previous stage, and all of which perform a stage of the operation at the same time. At esch step of the timing cycle, the pipeline accepts a new pair of operands (A, B) and the previous pair moves to the next stage. This mode of operation causes results (the sum in this example) to appear at the end of the pipeline at time intervals equal to the time of operation of the slowest stage of the pipeline. take effectin the effort to increase the operational speedof a com- The replication philosophy is exemplified by the general multi- puter. Up until this point the approach was simply to speed up processor which replicates three of the four major components the operationof the electronic circuitry which comprised the four (all but the I/O)many times.The cost of a general multiprocessor major functional components. (See Fig. 1.) is, however, very high and further design options were considered Electronic circuits are ultimatelylimited in their speed of which would decrease the cost without seriously degrading the operation by the speed of light (light travels about one foot in a power or efficiency of the system. The options consist merely of nanosecond) and many of the circuits were already operating in recentralizing one of the three major componentswhich had been the nanosecond time range. So, although faster circuits could be previously replicated inthe general multiprocessor-the memory, made, the amount of money necessary to produce these faster the ALU, or the CU. Centralizing the CU gives rise to the basic circuits was not justifiable in terms of the small percentage in- organization of a vector or array processor such as Illiac IV. This crease of speed. particular option was chosen for two main reasons. At this stage of the problem two new approaches evolved. 1) Cost: A very high percentage of the cost within a digital 1) Owrlup: The hardware structure of the conventional or- computer is associated with CU circuitry. Replication of this ganization was modified so that two or more of the major func- component is particularly expensive, and therefore centralizing tional components (or subcomponents within a major compo- the CU saves more moneythan can be saved by centralizing either nent) could overlap their operations. Overlap means that more of the other twocomponents. than one operation is occurring during the same time interval, 2) Structure: There is a large class of both scientific and and thus total operation timeis decreased. business problemsthat can be solved by a computer with oneCU Before operations could be overlapped, control sequences be- (one instruction stream) and many ALUs.The same algorithm is tween the components had to be decoupled. Certainly the CU performed repetitively on mzny sets of different data: the data could at least be fetching the next instruction while the ALU was are structured as a vector, and the vector processor of Illiac IV executing the present one.

The Illiac IV System

Early Stored Program Computers

Computer Development at the National Bureau of Standards

Performing the Shuffle with the PM2I and Illiac SIMD Interconnection Networks

Simon E. Gluck Collection of Photographs of EDVAC and MSAC Computers 1990.232

P the Pioneers and Their Computers

Online Sec 6.15.Indd

SIMD1 Ñ Illiac IV

John William Mauchly

The Manchester University "Baby" Computer and Its Derivatives, 1948 – 1951”

Oral History Interview with David J. Wheeler

Vector Machines  Vector Machines Today Introduction  a Vector Processor Is a CPU That Can Run One Instructiononanentire Vector of Data

John W. Mauchly Papers Ms