US005794059A
United States Patent [19]
Barker et al.
[11] Patent Number: 5,794,059
[45] Date of Patent: Aug. 11, 1998

[54] N-DIMENSIONAL MODIFIED HYPERCUBE

[75] Inventors: Thomas Norman Barker, Vestal; Clive Allan Collins, Poughkeepsie; Michael Charles Dapp, Endwell; James Warren Dieffenderfer, Owego; Billy Jack Knowles, Kingston; David Bruce Rolfe, West Hurley, all of N.Y.

[73] Assignee: International Business Machines Corporation, Armonk, N.Y.

[21] Appl. No.: 282,101

[22] Filed: Jul. 28, 1994

Related U.S. Application Data

[63] Continuation of Ser. No. 888,684, May 22, 1992, abandoned, which is a continuation-in-part of Ser. No. 301,278, Sep. 6, 1994, Ser. No. 698,866, May 13, 1991, Pat. No. 5,313,645, and Ser. No. 324,295, Oct. 17, 1994, Pat. No. 5,475,856, which is a continuation of Ser. No. 798,788, Nov. 27, 1991, said Ser. No. 301,278, is a continuation of Ser. No. 611,594, Nov. 13, 1990.

[51] Int. Cl.⁶ ...... G06F 15/16
[52] U.S. Cl. ...... 395/800.1; 395/800.11; 395/800.12; 395/800.13; 395/800.15; 395/800.01
[58] Field of Search ...... 395/800.1, 800.11, 800.12, 800.13, 800.15, 800.01

Primary Examiner: Ayaz R. Sheikh
Assistant Examiner: Valerie Darbe
Attorney, Agent, or Firm: Lynn Augspurger; Morgan & Finnegan, L.L.P.

[57] ABSTRACT

A parallel array processor for massively parallel applications is formed with low power CMOS with DRAM processing while incorporating processing elements on a single chip, with nodes connected in an n-dimensional modified non-binary hypercube. In a 4-dimensional modified non-binary hypercube embodiment, each node includes eight processor memory elements on a single chip, each processor memory element having its own associated processing element, significant memory, and I/O, with each processor memory element supporting an external port. Pairs of ports are associated with each dimension, labeled X, Y, W, and Z. Eight nodes are connected in the X dimension to form a ring. Corresponding nodes from eight such rings are connected into rings in the Y dimension to form an 8×8 array of nodes, referred to as a cluster. Corresponding nodes of eight clusters are connected into rings (64 rings) in the Z dimension, forming an 8×8×8 array of nodes referred to as a "cluster ring". Corresponding nodes of eight cluster rings are connected into rings in the W dimension.

29 Claims, 24 Drawing Sheets
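The ring construction in the abstract can be illustrated with a short sketch. This is only an illustration under stated assumptions, not the patented implementation: it assumes every dimension (X, Y, Z, W) is a ring of eight nodes, assumes eight processor memory elements per node, and uses hypothetical names (`RING`, `DIMS`, `PMES_PER_NODE`, `neighbors`) that do not appear in the patent.

```python
from itertools import product

RING = 8                      # nodes per ring in each dimension (from the abstract)
DIMS = ("X", "Y", "Z", "W")   # one pair of ports (+/-) per dimension
PMES_PER_NODE = 8             # assumption: eight processor memory elements per node


def neighbors(node):
    """Return {port_name: neighbor_coords} for a node given as (x, y, z, w)."""
    result = {}
    for axis, name in enumerate(DIMS):
        for sign, step in (("+", 1), ("-", -1)):
            coords = list(node)
            # Modulo arithmetic closes each line of eight nodes into a ring.
            coords[axis] = (coords[axis] + step) % RING
            result[sign + name] = tuple(coords)
    return result


# Every node of the 8 x 8 x 8 x 8 array, addressed by its four ring positions.
nodes = list(product(range(RING), repeat=len(DIMS)))
```

Under these assumptions the array has 8^4 = 4,096 nodes and 4,096 × 8 = 32,768 processor memory elements. Each node has eight neighbors, one per port, which is why a node with eight processor memory elements can dedicate one external port to each.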

[Front-page figure: flat layout of a 4D torus of 32,768 processing elements. Legible labels include a processor unit with memory and a 16-bit processor, a network node with router and I/O ports, a cluster of 64 nodes (512 processors), and the W, X, Y, and Z dimensions.]

U.S. Patent    Aug. 11, 1998    Sheet 1 of 24    5,794,059

[FIG. 1A (Prior Art): processor block diagram. Legible labels include FPU, ALU, exponent and normalize/rounding units, A, B, and C registers, instruction streamer, instruction pointer, operand register, scheduler, workspace pointer, data in and data out registers, and data channel register.]

[FIG. 1B (Prior Art): processor block diagram, continued. Legible labels include configuration register, timers, timing control, ALU, external memory interface, links 0 through 3 (each with pointer, input data, output data, and count registers plus link logic), address registers, instruction fetch address, channel address, and data address.]


[Figure (number not legible): system block diagram. Legible labels include application processors 0 through 2, host application processor, application interface, array director, array controller, synchronizer, and arrays 00 through N.]

[FIG. 19: software development environment. Legible labels include demo application, host interface, device library, monitor interface, PME array interface, RS/6000 test and debug monitor, debugger, performance monitor and analysis, test and MPP diagnostics interface, simulator, and assembler/linker and loader.]

[FIG. 10: node of processing elements (PEs), each with a CPU and 64 KB of memory, joined by an internal network, with external ports +X, -X, +W, -W, +Y, -Y, +Z, and -Z.]

[Figure (number not legible): node interconnection. Eight PEs, labeled (+w), (+x), (+y), (+z), (-w), (-x), (-y), and (-z), each drive the external port of the same name. Other legible labels include command/instruction input, service request, broadcast, inter-PE bus, and serial loops.]


[Figure (number not legible): compare-exchange pattern of a parallel sort, drawn as stages 1 through 10 and captioned "How would a 16 element sort repeat the pattern?"]

For sorting n data elements (n ∈ {2^i : i ∈ N, 2^i ≤ # of PEs}):

do I = 0 to (log2 n) - 1
  do J = 0 to I
    if (PE# / 2^(I-J)) % 2 = 0 then TARGET = PE# + 2^(I-J) else TARGET = PE# - 2^(I-J)
    send DATA to TARGET
    receive data, store in TEMP (if data is not available, wait)
    if ((PE# / 2^(I+1)) % 2 + (PE# / 2^(I-J)) % 2) % 2 = 0
      then if TEMP < DATA then DATA = TEMP else NOP
      else if TEMP > DATA then DATA = TEMP else NOP
end both do's
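The per-PE exchange loop in the sort figure has the shape of a bitonic sort, and a small simulation may make the schedule easier to follow. The sketch below is a reading of that pseudocode, not the patent's code: it assumes the partner distance is 2^(I-J) and the sort direction comes from bit I+1 of the PE number (the exponents are not fully legible in the source), it models the simultaneous send/receive with a snapshot list, and the function name `bitonic_sort_on_pes` is hypothetical.

```python
def bitonic_sort_on_pes(values):
    """Simulate the compare-exchange schedule with one value per simulated PE.

    Assumes len(values) is a power of two (one data element per PE).
    """
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("number of elements must be a power of two")
    data = list(values)
    log_n = n.bit_length() - 1
    for i in range(log_n):                 # outer loop: I = 0 .. log2(n) - 1
        for j in range(i + 1):             # inner loop: J = 0 .. I
            dist = 1 << (i - j)            # assumed partner distance 2^(I-J)
            # Snapshot of what each PE receives from its partner (TEMP).
            # XOR picks PE# + dist when bit (I-J) is 0, else PE# - dist.
            temp = [data[pe ^ dist] for pe in range(n)]
            for pe in range(n):
                upper = (pe >> (i - j)) & 1       # 1 if this PE is the upper partner
                descending = (pe >> (i + 1)) & 1  # assumed direction bit of the block
                if (upper + descending) % 2 == 0:
                    data[pe] = min(data[pe], temp[pe])   # keep the smaller value
                else:
                    data[pe] = max(data[pe], temp[pe])   # keep the larger value
    return data
```

With 16 elements the loops run 1 + 2 + 3 + 4 = 10 compare-exchange steps, which appears consistent with the ten stages labeled in the figure. For example, `bitonic_sort_on_pes([3, 1, 4, 2])` passes through `[1, 3, 4, 2]` and `[1, 2, 4, 3]` before ending at `[1, 2, 3, 4]`.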

[Figure (number not legible): system control hierarchy. A host processor and host memory pass data and commands to the application processor interface (API), which feeds the array controller and cluster synchronizer (CS). Cluster controllers (CC) 0 through N, with optional channel devices such as DASD, gateways, and displays, drive clusters 0 through N of the PME array over data, broadcast, and status paths. Each cluster holds 64 nodes (512 PMEs).]