
A Parallel Processing Architecture for CAD of Integrated Circuits

Bruce A. Tonkin, B.E.(Hons.)

Department of Electrical and Electronic Engineering, The University of Adelaide

Adelaide, South Australia.

September 1990

Contents

Abstract x

Declaration xi

Acknowledgements xii

1 Introduction 1
  1.1 Computer Aided Design of Integrated Circuits 1
  1.2 Synthesis Tools 6
    1.2.1 Algorithms 6
      1.2.1.1 Heuristic Algorithms 7
      1.2.1.2 Simulated Annealing 8
      1.2.1.3 Knowledge-based Systems 10
    1.2.2 Examples 10
      1.2.2.1 Cell generators 10
      1.2.2.2 Compactors 11
      1.2.2.3 Routers 12
      1.2.2.4 Placement tools 12
      1.2.2.5 Silicon compilers 12
  1.3 Static Analysis Tools 13
    1.3.1 Algorithms 13
      1.3.1.1 Layout analysis algorithms 13
      1.3.1.2 Network analysis algorithms 14
      1.3.1.3 Test generation algorithms 15
      1.3.1.4 Hierarchical analysis 17
    1.3.2 Examples 17
      1.3.2.1 Circuit extractors 17
      1.3.2.2 Network comparators 18
      1.3.2.3 Design rule checkers 18
      1.3.2.4 Electrical rule checkers 19
      1.3.2.5 Timing verifiers 19
      1.3.2.6 Test generation tools 20
  1.4 Dynamic Analysis Tools 20
    1.4.1 Algorithms 21
      1.4.1.1 Direct methods 21
      1.4.1.2 Relaxation methods 24
      1.4.1.3 Event driven simulators 25
      1.4.1.4 Device models 26
    1.4.2 Examples 26
      1.4.2.1 Circuit simulators 27
      1.4.2.2 Timing simulators 27
      1.4.2.3 Logic simulators 28
      1.4.2.4 Fault simulation 29
      1.4.2.5 Functional simulators 30
      1.4.2.6 Behavioural simulators 31
      1.4.2.7 Mixed level simulators 31
      1.4.2.8 Mixed mode simulators 31
  1.5 Summary 32

2 Background 35
  2.1 Real Time Subsystem (RTS) 36
    2.1.1 Introduction 36
    2.1.2 Real Time versus General Purpose Computer Systems 38
      2.1.2.1 Real time computer systems 38
      2.1.2.2 General purpose computer systems 38
    2.1.3 Real Time System Overview 41
    2.1.4 Conclusions 44
  2.2 Parallel Circuit Extraction 48
    2.2.1 Introduction 48
    2.2.2 Overview of Circuit Extraction 50
    2.2.3 Goalie 51
    2.2.4 Strategy for Parallel Circuit Extraction 55
    2.2.5 Parallel Goalie 58
      2.2.5.1 General purpose multicomputer 58
      2.2.5.2 Serial disk accesses 61
      2.2.5.3 Parallel disk accesses 61
      2.2.5.4 Circuit partitioning 64
      2.2.5.5 Layout analysis 65
      2.2.5.6 Merging region numbers 66
    2.2.6 Results 70
    2.2.7 Conclusions 77
    2.2.8 Acknowledgements 81

3 Approaches to Accelerating Computer-Aided Design 82
  3.1 Introduction 82
  3.2 Strategies for High Performance 84
    3.2.1 SISD 85
      3.2.1.1 RISC versus CISC 85
    3.2.2 SIMD 86
      3.2.2.1 Processor arrays 87
      3.2.2.2 Vector processors 88
      3.2.2.3 Multi-operation 89
    3.2.3 MIMD 91
    3.2.4 Summary 92
  3.3 Characteristics of MIMD architectures 93
    3.3.1 Circuit Technology 93
    3.3.2 Multiprocessor 95
    3.3.3 Multicomputer 96
    3.3.4 Grain Size 97
    3.3.5 Type of Processing Node 99
    3.3.6 Interconnection Network 100
  3.4 Parallel processing for CAD 103
    3.4.1 Synthesis Tools - Simulated Annealing 103
      3.4.1.1 Summary 111
    3.4.2 Static Analysis Tools 112
      3.4.2.1 Design rule checking 112
      3.4.2.2 Circuit extraction 115
      3.4.2.3 Test generation 115
      3.4.2.4 Summary 117
    3.4.3 Dynamic Analysis Tools 119
      3.4.3.1 Circuit simulation 119
      3.4.3.2 Timing simulation 123
      3.4.3.3 Logic simulation 124
      3.4.3.4 Fault simulation 132
      3.4.3.5 Summary 133
  3.5 Conclusions 134

4 Computer architecture for VLSI CAD 137
  4.1 Introduction 137
  4.2 Workstations 140
  4.3 Computer Node Architecture 145
    4.3.1 VLSI Microprocessor Technology 145
    4.3.2 Floating Point Support 148
    4.3.3 Virtual Memory Management 148
    4.3.4 Local Memory 149
  4.4 Interconnection Network 150
    4.4.1 Nectar 151
    4.4.2 Network Topology 156
    4.4.3 Scalable Coherent Interconnect (SCI) 157
    4.4.4 High Speed Channel (HSC) standard 161
    4.4.5 Fiber Distributed Data Interface (FDDI) 162
    4.4.6 Interconnection Network Requirements 163
  4.5 Secondary storage 166
    4.5.1 Disk Arrays 167
    4.5.2 Bridge File System 172
    4.5.3 Secondary storage Requirements 173
  4.6 Operating system 177
    4.6.1 Amoeba 178
    4.6.2 Mach 181
    4.6.3 Operating System Requirements 183
  4.7 Parallel Programming 187
    4.7.1 Message versus Shared Variable Programming Paradigms 188
    4.7.2 Shared variables in a Loosely Coupled Environment 190
    4.7.3 Linda 190
    4.7.4 Distributed Shared Virtual Memory 192
    4.7.5 Shared Data Object Model 195
    4.7.6 Computer-aided Programming 198
      4.7.6.1 CAPER 199
      4.7.6.2 Hypertool 200
    4.7.7 Programming Requirements 201

5 Conclusions 206

A Real Time Subsystem Architecture 211
  A.1 Commercial Alternatives 211
  A.2 Specification for the Real Time Subsystem 214
  A.3 Architecture of the Real Time Subsystem 218
    A.3.1 VMEbus Backplane 219
    A.3.2 . . . 220
    A.3.3 On-board Software 223
    A.3.4 A/D Converter Module 223
    A.3.5 D/A Converter Module 225
    A.3.6 Timer/Parallel Interface Module 227
    A.3.7 Data Link 227
    A.3.8 Software Support 231
    A.3.9 Summary 231
    A.3.10 Acknowledgements 232

Bibliography 233

List of Figures

1.1 Design Hierarchy 2
1.2 Design Representation Hierarchy 3
1.3 Ideal Design Cycle 5
1.4 Simulated Annealing Algorithm 9
1.5 Direct Method Algorithm 23

2.1 System Overview 42
2.2 Real Time Subsystem (RTS) Architecture 43
2.3 Example of a region merge 54
2.4 Partitioning strategies 55
2.5 HPC Architecture 59
2.6 Programmer's Model of the HPC 60
2.7 Programming Model used in Parallel Goalie 63
2.8 Example of merging across slice boundaries 67
2.9 Example of merging across slice boundaries (ctd.) 68
2.10 Performance results for the detection and numbering of transistor regions 70
2.11 Partitioning for circuit 3 with 12 processors 73
2.12 Load variation for circuit 3 with 12 processors 73
2.13 Number of scanline stops for circuit 3 with 12 processors 74
2.14 Partitioning for circuit 2 with 12 processors 75
2.15 Load variation for circuit 2 with 12 processors 75
2.16 Number of scanline stops for circuit 2 with 12 processors 76

4.1 System Architecture 138
4.2 Leopard multiprocessor workstation [Ashenden et al., 1989] 142
4.3 i860 microprocessor layout [Kohn & Margulis, 1989] 147
4.4 Single HUB Cluster of Nectar network [Arnould et al., 1989] 155
4.5 SCI applications 159
4.6 SCI Transaction phases [IEEE P1596, 1990] 160
4.7 HSC Interface signals [Chlamtac & Franta, 1990] 162
4.8 Interconnection system 164
4.9 Disk array 168
4.10 Interleaved parity sectors [Katz et al., 1989a] 170
4.11 Secondary storage architecture 176

A.1 MC500 Minicomputer Architecture [Masscomp, ] 212
A.2 System Overview 215
A.3 Real Time Subsystem (RTS) Architecture 219
A.4 Data Link structure 229

List of Tables

2.1 Characteristics of the circuits in Figure 2.10 71
2.2 Times for the slowest node out of 12 nodes 72

Abstract

Designers use computer-aided design (CAD) software to check that an integrated circuit will function correctly and meet performance specifications. The software must produce results quickly enough for a designer to interactively evaluate many design iterations before fabricating the circuit. For large circuits, present workstations are not fast enough to run the software at the rate needed by designers. This has led to research into parallel processing techniques to reduce the running time of CAD software. The goal of this thesis is to describe a general purpose system that should be capable of supporting the full range of design tools, and offer a substantial speed-up to the design verification step in the design cycle.

The thesis begins by discussing the computational characteristics of modern CAD software. It then discusses initial ideas for the hardware and software attributes of a multiple processor computer system suitable for running CAD software. These ideas derive from the author's experiences with the design and construction of a high speed processing system, which can acquire and generate analog signals in real time, and the implementation of a parallel circuit extraction algorithm on a multicomputer.

The thesis goes on to review different architectural approaches to designing faster computers, before recommending a Multiple Instruction stream / Multiple Data stream (MIMD) architecture because it offers the flexibility required by CAD algorithms. It then analyses recent research into parallel CAD algorithms for running on MIMD architectures. On the basis of this analysis, the thesis recommends using a heterogeneous multicomputer MIMD architecture. The thesis concludes by discussing the hardware and software requirements of a multicomputer architecture suitable for developing and running CAD software. The conclusions come from an analysis of recent research on the following components of the multicomputer: the processing node architecture, the interconnection network, the secondary storage system, the operating system, and the programming environment.

Declaration

This thesis has been submitted to the Faculty of Engineering at the University of Adelaide for examination in respect of the Degree of Doctor of Philosophy. This thesis contains no material which has been accepted for the award of any other degree or diploma in any University, and to the best of the author's knowledge and belief contains no material previously published or written by another person, except where due reference is made in the text of the thesis. The author hereby consents to this thesis being made available for photocopying and for loans as the University deems fitting, should the thesis be accepted for the award of the Degree.

Bruce Tonkin September 1, 1990

Acknowledgements

I would like to acknowledge the financial assistance of a Priority Commonwealth Postgraduate Scholarship and an Australian Telecommunications and Electronics Research Board (ATERB) Scholarship. Without this assistance I would never have persevered with my PhD research. I thank the VLSI Systems research department at AT&T Bell Laboratories, Holmdel, NJ, USA for providing the opportunity to get experience with programming an existing parallel machine. Thanks to Bob Gaglianello, Howard Katseff, Beth Robinson and Teri Lindstrom for their help and advice in using the HPC multicomputer. Thanks to the Maple Ave. group including Alex Dickinson, Mary Keefe, Richard Reed and Fiona MacDonald for providing a stimulating living environment while in the US. Special thanks to Bryan Ackland for arranging my year at the Labs, and providing encouragement and advice. I thank the Association for Computing Machinery's (ACM) Special Interest Group on Design Automation (SIGDA) for providing travel support to allow me to present a paper at the 1990 Design Automation Conference in Florida.

I thank both the technical and administrative staff of the Department of Electrical and Electronic Engineering, particularly Norman Blockley and Mary Parry for their day to day assistance. Special thanks to Michael Liebelt for taking the time to listen to my ideas and proof reading my thesis. Special thanks also to my supervisor Doug Pucknell for providing guidance and encouragement.

Finally, and most importantly, thank you to my parents for putting up with a perpetual student.

B.A.T


Chapter 1

Introduction

This introduction will give an overview of the computer aided design tools currently in use, and will discuss the computational characteristics of the algorithms that these tools use.

1.1 Computer Aided Design of Integrated Circuits

Computer aided design shortens the time for a designer to design an integrated circuit that meets its specifications. The objectives of computer aided design are:

• to shorten the time for design synthesis,

• to minimize design mistakes,

• to aid design changes,

• and to shorten the time for design verification.

Figure 1.1: Design Hierarchy (Level 0: Top Level; Level 1: Major blocks; Level 2: Sub-blocks; ...; Level m: Leaf Cells)

The time required to design and develop integrated circuits is increasing rapidly as they become more complex. Computer aided design reduces this time, making it economical to take advantage of improving technology, and to produce competitively priced products. It has made it possible to use integrated circuit (IC) technology in low production volume applications.

Usually a designer develops a design hierarchically to help minimize the design detail that needs to be considered at once. Figure 1.1 illustrates this hierarchy. The designer divides the top level of the design into several blocks, which are as self contained as possible, with few connections between them. They represent major functional units such as arithmetic units, memory units, and control units. For a large design, separate designers may develop these units. Designers may further subdivide the major blocks until they obtain a set of simple leaf cells. Typically each block consists of between 5 and 9 sub-blocks. The number of sub-blocks used is a trade off between too many objects for a designer to consider at once, and a hierarchy that is too deep. The leaf cells usually have less than 100 gates, and their design takes into account the analog characteristics of the chosen fabrication technology. Each level of design consists of one or more instances of any of the blocks at lower levels of the hierarchy. Standard cell libraries contain the blocks in the lower levels of the hierarchy that fabrication companies, or other designers, may have previously designed. Computer aided design software can automatically generate the information required for fabrication of the complete circuit, from the hierarchical description of the design.

Figure 1.2: Design Representation Hierarchy (Functional, Logic Gate, Transistor, Mask Level, in order of increasing detail)

Figure 1.2 shows a hierarchy of design representations available to describe each block of a design as its design progresses. A functional specification defines the behaviour of the block in response to its input signals. Logic gates describe a block of digital circuitry. The final result of the design process is a description of the masks to be used in the fabrication process. The designer can use the computer to check each representation against the higher levels for consistency.

Design tools are the computer programs that aid the designer. Complete automation of the design task is unlikely to be achieved for the general case, because technology changes before tools can reliably produce an optimum design. Instead, computers relieve the designer from the repetitive and time consuming tasks of circuit design. The designer still needs to make the complex and critical decisions.

Design tools should allow the designer to remove the costly and time consuming process of circuit fabrication from the design cycle, as Figure 1.3 illustrates. For high production volume applications, however, further refinements may improve the yield sufficiently to justify further design iterations after initial fabrication.

Rubin [Rubin, 1987] gives a good overview of the design tools available, and their application to circuit design. Case studies [Tredennick, 1987; Ditzel & Berenbaum, 1987; Bosshart, 1987; Bays et al., 1989] describe the use of some design tools in commercial designs. There are two main classes of design tools: synthesis tools and analysis tools.

Synthesis tools are used to aid the construction of a circuit. These include tools to automatically generate the design of specific blocks, tools to arrange the placement of sub-blocks within a block, and tools to route the interconnections between sub-blocks. They shorten the design time and allow design changes late in the development process. Preas and Lorenzetti [Preas & Lorenzetti, 1988], and Ohtsuki [Ohtsuki, 1986], bring together recent work by researchers in the development of these tools.

Analysis tools verify that the circuit construction is correct. They minimize the number of design mistakes, and they shorten the time required for design verification. There are two broad classes of analysis: static analysis and dynamic analysis.

Figure 1.3: Ideal Design Cycle (Synthesis, Static Analysis, Dynamic Analysis, Fabrication)

Static analysis tools analyse circuit properties that are independent of the electrical state of the circuit. These include tools to check design rules, tools to extract circuit parameters, and tools to generate test vectors.

Dynamic analysis tools analyse the operation of the circuit in response to input signals. These tools check that the representations of the circuit are functionally correct, and they check the performance characteristics of the circuit.

The following sections describe the algorithms used in synthesis tools, static analysis tools, and dynamic analysis tools, along with examples of their use.

1.2 Synthesis Tools

Synthesis tools try to automate the task of circuit layout. The goal is to optimize the design given a set of design constraints, such as the maximum power dissipation, the minimum speed of operation, and the maximum surface area.

1.2.1 Algorithms

Most of the tasks associated with automating circuit layout are combinatorial optimization problems. They each consist of a finite set of solutions along with a cost function. The cost function assigns a numerical value to each solution that relates its quality to the other solutions. The task of finding an optimal solution to the layout problem is NP-hard [Garey & Johnson, 1979]. This means that there is no known algorithm that guarantees to find the optimal solution in a time that increases polynomially with problem size.

Because of the many possible solutions, it is impracticable to generate each solution and evaluate its cost function to find the optimal solution. Instead, computers use heuristic algorithms to try to find a good solution that is not necessarily the optimum solution. Lengauer [Lengauer, 1988], and Shing and Hu [Shing & Hu, 1986], discuss the complexity of layout algorithms in more detail.

The software developer chooses the method to solve a particular layout problem by considering the computational resources available, and what effect the quality of the solution has on the overall system.

1.2.1.1 Heuristic Algorithms

Heuristic algorithms can be constructive or iterative [Breuer, 1972]. For example, consider the task of laying out a group of standard cells to minimize both the total length of the interconnection wires, and the net area occupied by the cells and wiring.

A constructive algorithm successively selects an unplaced cell, and adds this to a nucleus of already placed cells, until it has placed them all. Once it places a cell, the cell remains fixed. It chooses the next cell to be added to the nucleus by evaluating the length of wires needed to connect each cell to the nucleus, and evaluating the associated increase in area.

An iterative algorithm starts with an initial solution. It selects a set of cells to be moved along with a set of candidate locations, and evaluates the quality of the resulting solution. If the quality has improved, it accepts the solution. It continues this procedure until either the rate of improvement is low, or several iterations have occurred without improvement. Iterative algorithms have the potential of producing a better solution than constructive algorithms, at the expense of additional computer time. Both techniques may only find a locally optimum solution that could be far from the globally optimum solution.
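As a concrete illustration (not from the thesis), the following Python sketch applies the iterative approach to a toy one-dimensional placement problem; the cells, nets, pairwise-swap move and wire-length cost are all invented for the example:

    import random

    def wire_length(order, nets):
        # Cost: total span of each net across the current cell ordering.
        pos = {cell: i for i, cell in enumerate(order)}
        return sum(max(pos[c] for c in net) - min(pos[c] for c in net)
                   for net in nets)

    def iterative_placement(cells, nets, iterations=1000, seed=0):
        rng = random.Random(seed)
        order = list(cells)                 # initial solution (e.g. constructive)
        cost = wire_length(order, nets)
        for _ in range(iterations):
            i, j = rng.sample(range(len(order)), 2)
            order[i], order[j] = order[j], order[i]      # candidate move: swap
            new_cost = wire_length(order, nets)
            if new_cost < cost:
                cost = new_cost                          # accept the improvement
            else:
                order[i], order[j] = order[j], order[i]  # reject: undo the swap
        return order, cost

    cells = ["a", "b", "c", "d", "e"]
    nets = [("a", "c"), ("b", "e"), ("c", "d"), ("a", "e")]
    print(iterative_placement(cells, nets))

A constructive variant would instead add one cell at a time to the placed nucleus and never revisit a placement decision.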

1.2.1.2 Simulated Annealing

Simulated annealing is a technique that tries to avoid the local minima that conventional iterative methods produce [Kirkpatrick et al., 1983; Johnson et al., 1987; Rutenbar, 1989; Sechen, 1988]. It derives from the process used to strengthen materials. This process raises the temperature of a material so that the molecules of the material can move over greater distances. It then slowly cools the material down, causing the molecules to lose their energy and restrict their movement. The resulting material has a more stable configuration. If the material cools quickly, irregularities in the material structure lock into place and result in a weaker material. This weaker material is analogous to the local minima obtained in the optimization problem.

Figure 1.4 shows the general form of the simulated annealing algorithm. The key feature of the algorithm is that there is a probability that it may accept a new solution with a higher cost. The current temperature and the magnitude of the cost difference determine what the probability will be. Hence at high temperatures it is possible to jump out of a local minimum. The algorithm slowly decreases the temperature until the chance of escaping from the current minimum is low. The resulting solution may still not be the optimum solution, but has a better chance of being closer to it than conventional iterative techniques. To get the best results, however, the algorithm needs to reduce the temperature slowly, leading to long computation times.

1. Find an initial solution S (e.g. using a constructive algorithm)

2. Calculate the cost C of S

3. Start with an initial temperature T

4. While T is greater than the final temperature do the following:

   (a) For a chosen number of iterations:

       i. Make a random, small change to the solution S
       ii. Calculate the cost C* of the new solution S*
       iii. Calculate the cost difference Δ = C* − C
       iv. If Δ < 0, set S = S*; else set S = S* with probability e^(−Δ/T)

   (b) Decrease T

Figure 1.4: Simulated Annealing Algorithm
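Rendered as runnable code, the algorithm of Figure 1.4 might look like the following Python sketch; the geometric cooling schedule, the move set, and the toy cost function are assumptions for illustration, not details taken from the thesis:

    import math, random

    def simulated_annealing(initial, cost, perturb,
                            t_start=10.0, t_final=0.01, cooling=0.95,
                            moves_per_temp=100, seed=0):
        rng = random.Random(seed)
        s, c = initial, cost(initial)
        t = t_start
        while t > t_final:
            for _ in range(moves_per_temp):
                s_new = perturb(s, rng)            # random, small change
                delta = cost(s_new) - c            # cost difference
                # Always accept improvements; accept uphill moves with
                # probability exp(-delta / t), so high temperatures allow
                # escapes from local minima.
                if delta < 0 or rng.random() < math.exp(-delta / t):
                    s, c = s_new, c + delta
            t *= cooling                           # decrease the temperature
        return s, c

    # Toy use: order five cells to minimize total net span.
    def perturb(order, rng):
        i, j = rng.sample(range(len(order)), 2)
        new = list(order)
        new[i], new[j] = new[j], new[i]
        return new

    cells = ["a", "b", "c", "d", "e"]
    nets = [("a", "c"), ("b", "e"), ("c", "d"), ("a", "e")]
    def cost(order):
        pos = {cell: k for k, cell in enumerate(order)}
        return sum(max(pos[x] for x in net) - min(pos[x] for x in net)
                   for net in nets)

    print(simulated_annealing(cells, cost, perturb))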

1.2.1.3 Knowledge-based Systems

Knowledge-based, or expert, systems help solve problems that have a complex cost function [Holden, 1987; Ackland, 1988; D&T, 1989a; Rosenstiel & Bergsträsser, 1986]. They try to emulate the methods used by an expert human designer. A collection of rules may represent these methods. They are not as efficient as mathematical approaches to the problem, and run slowly with large rule sets. Often, once an expert system is working, a software developer can gain additional understanding of the problem, and come up with a simple cost function that defines the mathematical relationship between the objects to be laid out. Hence, an efficient optimization method can use the cost function as a basis to lay out the objects. Expert systems also provide high level functions such as management of the overall design task, and evaluation of the overall manufacturability, testability and quality of the design.

1.2.2 Examples

The following is a brief overview of some of the design synthesis tools available. See Rubin [Rubin, 1987] for a more extensive discussion.

1.2.2.1 Cell generators

Cell generators produce a layout, from a set of specifications, based on a regular structure such as a Programmable Logic Array (PLA), a gate array, or a gate matrix.

PLA's consist of AND and OR planes of transistors. They generally occupy less area than purely random logic, but may be slower. Their main advantage is that software can generate them automatically. Techniques are available to reduce the number of redundant gates to save area, and to alter the size of transistors to meet speed specifications.

Gate arrays are more flexible than PLA's, and designers often use them for special purpose IC design. They consist of a fixed array of blocks, which contain transistor pairs, that the designer customizes to form logic gates such as AND, OR, and NOT. These blocks then interconnect to produce the circuit. A gate array generator produces the interconnection masks to customize the gate array design given a high level description of the circuit. The main disadvantage is that the interconnect routing can get complex, and long wire lengths can slow the circuit down.

Gate matrices have fixed vertical lines of polysilicon material that form the gates of transistors. The gate matrix generator produces the horizontal metal and diffusion wires to form transistors and their interconnections. The aim is to place transistors to reduce the length of the interconnection wires.

1.2.2.2 Compactors

Compactors minimize the area occupied by circuits designed using other design tools. They can achieve this, with minimal computational overhead, by squeezing the layout together along the x and y directions. They can also use optimization methods, as discussed in Section 1.2.1, to rearrange components to try to improve on the solution at the expense of extra time. Compactors free the designer, and higher level design tools such as cell generators, from the task of considering in detail the design rules of a particular fabrication process. Wolf and Dunlop [Wolf & Dunlop, 1988] discuss the compaction process in more detail.

1.2.2.3 Routers

Routers lay out the wires between cells, given a list of points to be connected. Simple techniques consider one wire at a time, and route it through a maze of obstacles. Other techniques consider multiple wires at a time and aim to reduce the area occupied by the wiring. Routers can use optimization techniques to make random changes to try to improve the solution. Lorenzetti and Baeder [Lorenzetti & Baeder, 1988] discuss routing in more detail.

1.2.2.4 Placement tools

Placement tools position cells so that they connect easily and occupy minimum area. Floorplanners allocate rectangular regions on the chip surface for cells that have yet to be fully designed. These may be expert systems that apply a series of rules, obtained from experienced designers, to decide how to arrange the cells. Expert systems can recognize well known structures in the design for which efficient structured floorplans are available. Standard cell placement tools place a set of standard cells, which may have a constant height and variable width, in rows with routing channels in between. They use optimization techniques such as simulated annealing to get a good solution [Sechen & Sangiovanni-Vincentelli, 1985]. Preas and Karger [Preas & Karger, 1988] discuss placement and floorplanning in more detail.

1.2.2.5 Silicon compilers

Silicon compilers convert a behavioural description of the circuit into a mask layout for fabrication. They start by converting the input description into a structural description such as a floorplan. Specialized cell generators produce each block of circuitry, and automatic routers generate the connections between them. Most of these compilers specialize in a particular type of design, although they may have multiple floorplans to choose from. For example, they may specialize in signal processing using a specification of the algorithm as the input. They allow people without VLSI design expertise to produce IC designs. A truly general purpose silicon compiler has yet to be developed. Gajski and Lin [Gajski & Lin, 1988] discuss silicon compilation in more detail.

1.3 Static Analysis Tools

Static analysis tools analyse circuit properties that stay constant with changes in the electrical state of the circuit. They check fundamental electrical and geometrical rules of design, such as unintentional short circuits and minimum wire separations, as well as help the designer optimize the time critical sections of the circuit.

1.3.1 Algorithms

1.3.1.1 Layout analysis algorithms

Tools such as circuit extractors and geometric design rule checkers analyse the mask layout that serves as input to the fabrication process. The mask layout consists of a set of polygons for each mask layer. Its analysis falls into the field of computational geometry [Asano et al., 1986]. Algorithms to do this analysis must take into account the amount of main memory available to store the data structures, as well as their running time. Restrictions on the form of the layout, such as the use of Manhattan (or orthogonal) geometry, help to reduce the complexity of the task.

Pixelmap images (also called bitmap or raster images) can represent the mask layout. Each pixel has several bits: one for each mask layer. Algorithms to analyse layout in this form are simple, but they require substantial storage space for a fine grid of pixels.

Tile-based (or corner-based) representations use the corners of regions, which consist of a combination of mask layers, to partition the plane into rectangular tiles. The layout analysis algorithms are similar to those that the pixelmap representation uses, but with lower storage needs. The storage requirements rise as the complexity of the layout rises. Both pixelmap and tile-based methods are only useful for Manhattan geometry.

Edge-based representations store the edges of the mask polygons, and sort them in order of their left end points and their slope. A scanline, which is a vertical line that scans across the plane from left to right, passes over these edges and analyses their relationships. Scanline algorithms only consider the edges that currently intersect with the scanline, thus reducing the number of edges that the main memory must contain in one instant.

Szymanski and Van Wyk [Szymanski & Van Wyk, 1988] give a good discussion of the above algorithms.
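The following Python sketch illustrates the scanline idea on a toy problem, finding pairs of overlapping Manhattan rectangles on a single layer; a real extractor or design rule checker works on polygon edges over many layers, so treat this as illustrative only:

    # Minimal scanline sketch: find overlapping pairs of Manhattan rectangles.
    # Each rectangle is (x1, y1, x2, y2). Events are sorted by x; only the
    # rectangles currently intersecting the scanline are kept in memory.

    def overlapping_pairs(rects):
        events = []
        for i, (x1, _, x2, _) in enumerate(rects):
            events.append((x1, "enter", i))
            events.append((x2, "leave", i))
        events.sort()

        active = set()   # rectangles cut by the current scanline position
        pairs = set()
        for _, kind, i in events:
            if kind == "leave":
                active.discard(i)
                continue
            for j in active:
                # The scanline guarantees x-overlap; check y-overlap explicitly.
                if rects[i][1] < rects[j][3] and rects[j][1] < rects[i][3]:
                    pairs.add(tuple(sorted((i, j))))
            active.add(i)
        return pairs

    rects = [(0, 0, 4, 2), (3, 1, 6, 5), (5, 4, 8, 6), (10, 0, 12, 2)]
    print(overlapping_pairs(rects))   # {(0, 1), (1, 2)}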

1.3.1.2 Network analysis algorithms

Tools such as network comparators and electrical rule checkers analyse the electrical network of a circuit [Rubin, 1987]. Network comparison is an instance of the graph isomorphism problem. This problem has no efficient solution in the general case. Graph partitioning is the most common method of solving this problem. Some design systems restrict the designer, by placing limits on such things as the fan-in and fan-out of logic gates, to reduce the complexity of the network analysis. Szymanski and Van Wyk [Szymanski & Van Wyk, 1988] discuss network comparison algorithms in more detail.

1.3.1.3 Test generation algorithms

The number of possible combinations of circuit inputs grows exponentially with the number of inputs. It is impracticable to apply all possible input vectors to a circuit after manufacture to check that it has no faults. In general, the task of finding the minimal set of test vectors that will find all faults is NP-Complete [Breuer & Friedman, 1976]. Test generation algorithms try to find the smallest set of test vectors that will detect most faulty circuits.

There are several types of circuit faults [Gerner & Johansson, 1987]:

• Static faults change the function of a circuit and result from such faults as signals stuck high or low.

• Dynamic faults change the function of a circuit at certain frequencies or certain timing conditions.

• Parametric faults change the magnitude of circuit parameters such as voltages and currents.

Designers can choose test vectors by hand. This can be time consuming and prone to error. Another alternative is to use random sequences of test vectors, but their effectiveness decreases as the complexity of circuits increases [Bottorff, 1981].

The D-Algorithm [Roth, 1966; Roth, 1980] finds a test vector to detect a specified fault in a combinational logic gate circuit, by analysing the circuit description. It consists of three steps (a small code illustration follows the list):

• Fault sensitization: find a set of input values that will produce a difference in the signal values between the faulty circuit and the good circuit at the fault location.

• Fault propagation: find a sensitized path through the circuit that propagates the difference to a circuit output.

• Line justification: find the values of the inputs required to meet the requirements of the first two steps by backtracing from the difference output to the circuit inputs. If a conflict in input values occurs, the algorithm must backtrack through the decisions it made in the first two steps.
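The D-algorithm's backtracking search is too long to sketch here, but its goal can be illustrated by brute force: find an input vector on which a circuit with a single stuck-at fault produces a different output from the good circuit. The three-gate netlist below is an invented example, and the helper names are hypothetical:

    from itertools import product

    # Gate-level netlist: name -> (gate type, input names). Inputs are a, b, c.
    NETLIST = {
        "n1": ("AND", ("a", "b")),
        "n2": ("OR",  ("n1", "c")),
        "out": ("NOT", ("n2",)),
    }
    OPS = {"AND": lambda x, y: x & y, "OR": lambda x, y: x | y,
           "NOT": lambda x: 1 - x}

    def simulate(inputs, fault=None):
        # fault = (node, stuck_value) forces one node to a constant.
        values = dict(inputs)
        if fault and fault[0] in values:
            values[fault[0]] = fault[1]
        for node, (op, args) in NETLIST.items():   # assumes topological order
            values[node] = OPS[op](*(values[a] for a in args))
            if fault and node == fault[0]:
                values[node] = fault[1]
        return values["out"]

    def find_test_vector(fault):
        # Look for a vector that sensitizes the fault and propagates the
        # difference to the output; the D-algorithm achieves this without
        # exhaustive enumeration.
        for bits in product((0, 1), repeat=3):
            vec = dict(zip("abc", bits))
            if simulate(vec) != simulate(vec, fault):
                return vec
        return None   # redundant fault: no vector detects it

    print(find_test_vector(("n1", 0)))   # {'a': 1, 'b': 1, 'c': 0}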

The problem with the D-algorithm is that it can spend substantial time backtracking through decisions to resolve conflicts [Kirkland & Mercer, 1988]. Many circuits have many paths between the fault and a circuit output. Redundant faults, which are undetectable from outside the circuit, cause the D-algorithm to search all possible paths, which is time consuming. Jain and Agrawal [Jain & Agrawal, 1985] have modified the D-algorithm to handle transistor level circuit descriptions. Sequential circuits are hard to deal with as the algorithm needs to convert them into combinational form.

Some algorithms combine simulation with the procedure described above to reduce the time to generate the test vectors. For example, a fault simulator could find the other faults a particular test vector detects, hence reducing the number of times the D-algorithm needs to be run. Test generators use heuristics to try to reduce the amount of searching of the circuit structure needed to find a test vector. Kirkland and Mercer [Kirkland & Mercer, 1988], Fujiwara and Shimono [Fujiwara & Shimono, 1983], and Bottorff [Bottorff, 1981] give good discussions of some of these techniques.

Williams [Williams, 1981] states that the computational complexity of the task of finding test vectors, and finding their fault coverage, is O(N³), where N is the number of gates.

1.3.1.4 Hierarchical analysis

Static analysis tools can make use of the hierarchy in designs [Newell & Fitzpatrick, 1982; Scheffer & Soetarman, 1985; Stevens & McCabe, 1985]. By only analysing each cell once, tools can avoid needless and time consuming repetition. The technique relies on the designs being highly regular to offset the increased overhead in managing the hierarchy. Problems can occur when cells overlap, or wires pass through the cells, affecting the properties of a replicated cell in different positions in the design. Design systems often restrict the designer to using non-overlapping cells to improve the efficiency of the analysis tools.

Many of the static analysis tools flatten the hierarchy and treat the design as a single cell. This technique requires more space to store the complete design, but does not restrict the designer.

1.3.2 Examples

The following is a brief overview of some of the static analysis tools available. See Rubin [Rubin, 1987] for a more extensive discussion.

1.3.2.1 Circuit extractors

Circuit extractors derive from the mask layout an electrical network consisting of a list of transistors with their interconnections. This information is particularly useful when a designer has generated the mask layout manually from a higher level description, and wishes to confirm that it is correct. It is also useful for checking the output of an automatic synthesis tool. Circuit extractors can obtain detailed circuit information, such as parasitic capacitances and interconnect resistances, to use with analog circuit simulators. For large circuits it can take a long time to extract the circuit.

1.3.2.2 Network comparators

Network comparators check that two networks from different sources are equivalent. For example, a designer can extract an electrical network from the mask layout of a circuit, and use a comparator to compare this with the desired network. They help the designer check a manually laid out circuit, and validate layout synthesis tools such as cell generators. They avoid the time consuming task of simulating the behaviour of the two networks, and checking that their results are equivalent.

1.3.2.3 Design rule checkers

Geometrical design rule checkers analyse the mask layout to check for problems that may lead to a low yield during manufacture. For example, wires that are too close together in the mask layout may become short circuits after fabrication. Each fabrication process has a set of rules for the mask geometry that allow for process errors; these include rules specifying minimum wire widths and minimum separation between features. Design rule checkers are essential for mask layouts that designers have produced manually. As processes become more complex, however, it has become increasingly hard for the designer to cope with these rules manually.

There has been a recent trend towards symbolic design, where designers use symbols for transistors and their interconnections, and use an automatic synthesis tool to generate the final mask layout. Although designers using automatic synthesis tools to generate the mask layout should not need design rule checkers, they still need them to validate the output of these tools. This is because the fabrication process technology often changes before the synthesis tools are fully reliable.

1.3.2.4 Electrical rule checkers

Electrical rule checkers check for circuit errors that are detectable from an analysis of the electrical network of the circuit. Labels can identify each electrical node in the circuit. If an electrical node has two or more labels, this suggests a short circuit; alternatively, if a label refers to two or more nodes, this suggests an open circuit. If the labels include information on the electrical nature of the nodes, such as whether they are power rails, inputs, or outputs, more extensive checks are possible. Synthesis tools can generate much of this information automatically.

Electrical rule checkers can estimate the power consumption of the circuit by looking at the power requirements of each device, and estimating how many devices will be active at one time. The designer can use this information to check that the geometric size of the power rails is adequate, and to check that it meets the design constraints on power consumption.

1.3.2.5 Timing verifiers

Timing verifiers provide information on the critical paths of a circuit; these paths have the longest signal propagation times and determine the maximum speed at which a circuit can operate [Jouppi, 1987]. They do this by analysing the electrical network of a circuit and calculating how long a change in a signal will take to propagate through a block of circuitry. Timing analysis is independent of the state of the circuit and hence only needs to be run once for each design iteration. This contrasts with a simulator's approach, where the state of the circuit determines whether a particular signal transition propagates through a block of circuitry. A simulation must run many times with different inputs (or test vectors), and for most circuits it is impracticable to run all combinations of inputs to guarantee that it will find the critical paths.
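A minimal illustration of the idea (the netlist and delays below are invented): treating the circuit as a directed acyclic graph of blocks with fixed delays, a single topological pass yields the latest signal arrival time, and hence the critical path delay, without applying any test vectors:

    # Illustrative sketch: longest (critical) path through a DAG of circuit
    # blocks with fixed propagation delays, independent of circuit state.
    DELAY = {"in": 0.0, "g1": 1.2, "g2": 0.8, "g3": 1.5, "out": 0.3}
    FANIN = {"g1": ["in"], "g2": ["in"], "g3": ["g1", "g2"], "out": ["g3"]}

    arrival = {"in": 0.0}
    for node in ["g1", "g2", "g3", "out"]:          # topological order
        arrival[node] = DELAY[node] + max(arrival[p] for p in FANIN[node])

    print(arrival["out"])   # latest arrival = critical path delay: 3.0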

1.3.2.6 Test generation tools

Test vector generators aid the designer in choosing a set of test vectors to apply to a circuit after manufacture. Automatic test equipment uses these test vectors to detect circuits that have manufacturing errors, to minimize the number of faulty circuits delivered to customers.

Automatic test generators are available for combinational circuits. They produce test vectors to find static faults, especially stuck faults. Redundant faults that test generators find can aid the designer in simplifying the circuit. There are design techniques that incorporate circuitry to make it possible to test sequential circuits in a combinatorial mode [Gerner & Johansson, 1987; Williams, 1981]. Other tests, such as running the circuit at different clock rates, can find dynamic and parametric faults.

The time spent in finding a good set of test vectors can pay off in reducing the time needed to find faulty circuits, and reducing costs in repairing entire systems affected by a faulty circuit.

1.4 Dynamic Analysis Tools

Dynamic analysis tools (or simulators) determine a circuit's behaviour as a function of time after applying a set of inputs (test vector) to the circuit.

They allow the designer to check the functional behaviour of a circuit, and to pinpoint timing problems, before fabricating the IC. They have become increasingly important as the complexity of IC designs has increased. Their goal is to remove the expensive and time consuming IC fabrication step from the design cycle. Hon [Hon, 1987] and Terman [Terman, 1987] give good overviews of these tools.

1.4.1 Algorithms

There is a trade off between speed and accuracy in choosing an appropriate algorithm for dynamic analysis. A circuit consists of a set of nodes and branches; a branch forms when a circuit element connects two nodes. In high level representations of the circuit, the circuit elements correspond to functional blocks, whereas in low level representations they correspond to devices such as transistors. A simulator uses models for the circuit elements to evaluate the node voltages at a given time. The accuracy of a simulator depends on:

• the accuracy of the circuit element models,

• the accuracy of the solution to the set of equations describing the voltage at each node,

• and the size of the time step between node evaluations.

By taking advantage of certain characteristics of a design technology, a simulator can improve its speed without losing accuracy. For example, many digital MOS circuits only have a small percentage of nodes with changing voltages at any time; this allows some sections of a circuit to be simulated with larger time steps, reducing the total time to simulate the circuit.

1.4.1.1 Direct methods

The most accurate simulators work with the device level representation of a circuit, and produce analog waveforms for the voltages at specified nodes. They use direct methods, as Figure 1.5 illustrates, to solve a set of circuit equations based on Kirchhoff's voltage law, Kirchhoff's current law, and the device characteristics [Sangiovanni-Vincentelli, 1981]. These are generally non-linear differential equations because of factors such as the non-linear characteristics of the pn junction in transistors, and the energy storage characteristics of capacitors.

Simulators cannot find a closed form solution to the circuit equations, so they use a numerical integration scheme that evaluates the node voltages at a series of time steps [Terman, 1987]. Explicit schemes, such as Forward Euler, rely only on values calculated at previous time steps; however, they are numerically unstable. For accuracy, simulators usually use implicit schemes, such as Backward Euler. These schemes use an initial guess for the node values at the current time step, based on information from a previous time step, then improve this guess by iteration. This results in a set of non-linear difference equations, which are coupled because each equation may depend on the solution of the other equations.

Direct methods solve the set of coupled equations directly using the Newton-Raphson method, which is a fixed point iteration scheme. The Newton-Raphson method converts the non-linear difference equations [Chua & Lin, 1975] into a set of coupled, linear equations represented in matrix form. Forming this matrix involves evaluating the device models, and can be time consuming when using complex device models. The matrix is sparse, as for large circuits typically less than 2% of entries are non-zero. Simulators use sparse matrix versions of Gaussian Elimination or LU Decomposition, which try to avoid storing zero matrix elements and avoid performing operations with zero entries, to solve these equations in O(N^α) time, where α lies between 1.1 and 1.5 [Newton & Sangiovanni-Vincentelli, 1984]. They check the solution for convergence and, if necessary, do another Newton-Raphson iteration. After the solution has converged, they select the next time step and re-solve the equations.

1. Apply nodal analysis to the circuit description and form the set of equations with node voltages as unknowns.

2. Begin with time t = 0.

3. While t is less than the final time do the following:

   (a) Update the value of the circuit inputs.

   (b) Update the circuit parameters for the new time step.

   (c) Apply numerical integration to the linear differential equations.

   (d) While the solution has not converged, do the following:

       i. Apply numerical integration to the non-linear differential equations to form non-linear difference equations.
       ii. Apply the Newton-Raphson method to the non-linear difference equations.
       iii. Form the sparse matrix describing the set of linear equations.
       iv. Solve the linear equations by applying either Gaussian Elimination or LU Decomposition.
       v. Test the solution for convergence.

   (e) Check the local truncation error and save the solution if it is within bounds.

   (f) Choose the next time step.

Figure 1.5: Direct Method Algorithm

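For a single node the direct method reduces to a scalar version of Figure 1.5. The Python sketch below (the component values and convergence tolerance are assumed for illustration) applies Backward Euler and Newton-Raphson to a resistor charging a capacitor that is clamped by a diode:

    import math

    # Illustrative one-node circuit (values assumed): resistor R from a fixed
    # source VIN to node v, a diode from v to ground, and a capacitor C at v.
    # Backward Euler turns the differential equation into a non-linear
    # equation in the new node voltage, solved by Newton-Raphson each step.
    R, C, VIN = 1e3, 1e-9, 5.0       # ohms, farads, volts
    IS, VT = 1e-14, 0.0259           # diode saturation current, thermal voltage

    def step(v_prev, h):
        v = v_prev                   # initial guess from the previous time step
        for _ in range(50):          # Newton-Raphson iterations
            f = C * (v - v_prev) / h - (VIN - v) / R \
                + IS * (math.exp(v / VT) - 1)
            df = C / h + 1 / R + (IS / VT) * math.exp(v / VT)
            dv = f / df
            v -= dv
            if abs(dv) < 1e-9:       # convergence test
                return v
        raise RuntimeError("Newton-Raphson failed to converge")

    v, h = 0.0, 1e-9                 # start at 0 V, 1 ns time steps
    for n in range(200):
        v = step(v, h)
    print(round(v, 4))               # approaches the diode clamp voltage (~0.7 V)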

1.4.1.2 Relaxation methods

The direct method for finding the node voltages requires extensive computation, hence it is only capable of handling designs with a few thousand transistors. Its main advantage is that it copes with any circuit design. Relaxation methods attempt to reduce simulation time by decoupling the simultaneous circuit equations, solving each equation independently, and iterating the solutions until they converge. Decoupling the equations allows a simulator to evaluate some sections of the circuit with different time steps, and hence take advantage of circuit latencies. Simulators can apply these techniques at various stages of the solution of the circuit equations. Newton and Sangiovanni-Vincentelli [Newton & Sangiovanni-Vincentelli, 1984; Newton, 1981] give a good overview of these methods.

The two main relaxation algorithms are Gauss-Jacobi and Gauss-Seidel. Gauss-Seidel uses node voltages that it has already calculated in the current time step, and thus converges faster than Gauss-Jacobi, which only uses values from the previous time step in solving each equation. The order in which the Gauss-Seidel method evaluates the node voltages affects how fast the solution converges. Gauss-Jacobi is suitable for executing in parallel, because at each time step it solves the equations independently in any order. These methods can decouple the set of linear equations, resulting from the application of the Newton-Raphson method, and solve them in O(N) time, instead of directly solving the sparse matrix. Usually, however, they decouple the equations at the non-linear equation level; this eliminates the time consuming step of applying the Newton-Raphson method to a set of coupled equations. They then use the Newton-Raphson method to solve each of these equations without the need for matrix arithmetic.
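The difference between the two algorithms is visible in a few lines of Python; this sketch (the small diagonally dominant system is invented for the example) applies one sweep of each to a linear system Ax = b:

    # Illustrative sketch: one relaxation sweep for the linear system Ax = b.
    # Gauss-Seidel reuses values already updated in the current sweep;
    # Gauss-Jacobi uses only the previous sweep, so its updates are
    # independent and could be computed in parallel.

    def jacobi_sweep(A, b, x):
        n = len(b)
        return [(b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i))
                / A[i][i] for i in range(n)]

    def gauss_seidel_sweep(A, b, x):
        x = list(x)
        n = len(b)
        for i in range(n):
            x[i] = (b[i] - sum(A[i][j] * x[j]
                               for j in range(n) if j != i)) / A[i][i]
        return x

    A = [[4.0, 1.0, 0.0], [1.0, 4.0, 1.0], [0.0, 1.0, 4.0]]
    b = [5.0, 6.0, 5.0]
    x = [0.0, 0.0, 0.0]
    for _ in range(20):              # iterate the sweeps to convergence
        x = gauss_seidel_sweep(A, b, x)
    print([round(v, 4) for v in x])  # -> [1.0, 1.0, 1.0]

In a circuit simulator the same sweeps would be applied to the node equations rather than to a fixed matrix.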

Some simulators get substantial speed improvements by only applying one relaxation iteration, and using a small time step to reduce errors. They are effective for simple circuits, such as in gate array designs, which are without feedback loops, pass transistors and floating capacitors [Newton & Sangiovanni-Vincentelli, 1984].

Iterated timing analysis uses one Newton-Raphson iteration for each non-linear equation, and carries the relaxation method to convergence [Newton & Sangiovanni-Vincentelli, 1984]. This gives accurate results, but for tightly coupled circuits may take longer than direct methods. The convergence criterion determines the speed and accuracy properties of these methods.

Waveform relaxation applies relaxation techniques to the differential equations by decoupling the equations, and solving each equation over the entire interval of the simulation [Lelarasmee et al., 1982].

1.4.1.3 Event driven simulators

Simulators can use a time step that the designer sets, or they can set the time step dynamically depending on the convergence properties of the solution. Event driven simulators maintain a time ordered queue of events. They remove from the queue all events that occur at the same time as the event at the head of the queue, and evaluate a new node voltage for each node that an event refers to. If the voltage of a node changes, they place a new event on the queue for each node that it connects to, with a time equal to the current time plus the node's response time to the current event. This process continues until there are no more events on the queue or it reaches the end of the simulation interval. Event driven simulators do not spend time evaluating nodes that are stable; however, they may spend excessive time evaluating events that are unimportant. They suit large digital circuits where only a small percentage of the nodes are changing at any point in time. Hon [Hon, 1987] discusses them in more detail.
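The queue mechanism can be sketched in Python as follows, under strong simplifying assumptions (two-valued logic, fixed per-gate delays, and an invented three-gate circuit):

    import heapq

    # Illustrative event-driven gate simulator with fixed delays.
    GATES = {"n1": ("AND", ("a", "b"), 2), "out": ("OR", ("n1", "c"), 1)}
    FANOUT = {"a": ["n1"], "b": ["n1"], "c": ["out"], "n1": ["out"]}
    OPS = {"AND": lambda x, y: x & y, "OR": lambda x, y: x | y}

    def simulate(stimuli, end_time=20):
        value = {n: 0 for n in ("a", "b", "c", "n1", "out")}
        queue = list(stimuli)        # (time, node, new_value) events
        heapq.heapify(queue)
        while queue and queue[0][0] <= end_time:
            t, node, v = heapq.heappop(queue)
            if value[node] == v:
                continue             # stable node: nothing to evaluate
            value[node] = v
            print(f"t={t}: {node} -> {v}")
            for g in FANOUT.get(node, []):
                op, ins, delay = GATES[g]
                new = OPS[op](*(value[i] for i in ins))
                # Schedule the fanout gate's response after its delay.
                heapq.heappush(queue, (t + delay, g, new))

    simulate([(0, "a", 1), (0, "b", 1), (5, "c", 1)])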

1.4.1.4 Device models

The choice of device model affects the speed and accuracy of a simulator [Terman, 1987].

Analytic models are highly accurate and need detailed information on such things as physical device geometries, electrical properties of the materials used in the fabrication process, and temperature of operation. They are suitable for looking at the effects of fabrication parameters on the circuit performance, and can produce data for approximate models. Simplified models assume many parameters are constant; for example, they may treat the voltage dependent gate capacitance of transistors as constant, but they still require the evaluation of complex functions.

Empirical models try to find simpler functions to speed the evaluation of the device model by using curve fitting techniques.

Table driven models do not require the evaluation of functions, which improves the speed substantially, but they require more memory to store the tables. Simulators use interpolation between the table entries to achieve improved accuracy.
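As an illustration of the trade off (the diode characteristic and grid spacing below are assumed for the example), a table driven model precomputes an expensive analytic curve once, then answers each query by interpolation:

    import bisect, math

    # Illustrative table driven device model: precompute an analytic diode
    # I-V curve on a voltage grid, then answer queries by linear
    # interpolation instead of calling exp() for every evaluation.
    IS, VT = 1e-14, 0.0259
    VOLTS = [i * 0.01 for i in range(81)]                # 0.00 .. 0.80 V grid
    AMPS = [IS * (math.exp(v / VT) - 1) for v in VOLTS]  # built once, up front

    def diode_current(v):
        i = bisect.bisect_right(VOLTS, v) - 1            # locate table interval
        i = max(0, min(i, len(VOLTS) - 2))
        frac = (v - VOLTS[i]) / (VOLTS[i + 1] - VOLTS[i])
        return AMPS[i] + frac * (AMPS[i + 1] - AMPS[i])  # interpolate

    print(diode_current(0.653))   # close to the analytic value, no exp() here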

1.4.2 Examples

The following is a brief overview of some of the types of simulators available. See Hon [Hon, 1987] for a more extensive discussion.

26 1.4.2.1 Circuit simulators

Circuit simulators produce detailed analog waveforms of the node voltages of a circuit in response to a set of inputs. The designer chooses which nodes are of interest, and how much real time the simulator shall cover. A circuit extractor can supply the simulator with the necessary circuit parameters, such as a list of transistors with their connections, the size of the transistors, the gate capacitances, and the parasitic capacitances. Simulators that use direct methods are the most accurate and reliable, but their long running times may limit them to small circuits.

Other simulators use relaxation techniques, such as iterated timing analysis and waveform relaxation, to reduce the running time for typical digital MOS circuits [Newton, 1979; Newton & Sangiovanni-Vincentelli, 1984]. For simple circuits, they run at least ten times faster than simulators that use direct methods. They break the circuit up into decoupled subcircuits, which they can simulate with different time steps depending on activity, and use event driven simulation to take advantage of circuit latencies.

1.4.2.2 Timing simulators

Timing simulators produce the same information as circuit simulators, but at reduced accuracy [Chawla et al., 1975; Newton, 1981; Ackland & Clark, 1989]. They can run more than a hundred times faster than direct method simulators. They use relaxation methods, usually without iterating the solutions to convergence, with approximate device models that are often table-driven. Designers have successfully used them on digital MOS circuits, with limitations such as requiring all capacitors in the circuit description to have one terminal connected to a reference, and restricting the use of pass transistors and feedback loops. They are useful for checking the operation of the complete circuit, and pinpointing areas that may need a circuit simulator to check in more detail.

1.4.2.3 Logic simulators

Logic simulators evaluate the logic states of each node of a digital circuit in response to a set of inputs. The logic states usually include high (1), low (0), undefined (X), and high impedance (Z). Murai [Murai et al., 1986], and Newton [Newton, 1981], discuss the types of logic simulators and the algorithms that they use.

Switch level logic simulators treat the circuit as a collection of transistors and wires [Bryant, 1984]. They model transistors as simple switches with a simple delay model, and they model wires as zero delay conductors. They provide simple timing information to help designers find possible timing problems in a circuit. Transistor level descriptions are available from either circuit synthesis tools or circuit extractors.

Gate level logic simulators treat the circuit as a collection of interconnected logic gates. They are simpler and hence faster than switch level simulators, but cannot easily handle pass transistors and tri-state bus structures. They model the behaviour of gates using truth tables. Their main use is in verifying that the circuit is functionally correct, making them useful for gate array design. For large circuits, designers have usually created a schematic description that they can use as input to a logic simulator. Logic simulators are usually event driven to reduce the simulation time. Internally they can use extra states to model the effects of pass transistors and wired gate logic. Some simulators distinguish between states (such as high and low) and strengths (such as driving, resistive and high impedance).
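As a small illustration of multi-valued logic (the encoding is an assumption, not a description of any particular simulator), an AND function over the states 0, 1, X and Z can be written as a truth table:

    # Illustrative four-valued logic: 0, 1, X (unknown), Z (high impedance).
    # An undriven (Z) gate input is treated as unknown here; a real simulator
    # may also track signal strengths (driving, resistive, high impedance).
    def and4(a, b):
        a = "X" if a == "Z" else a
        b = "X" if b == "Z" else b
        if a == "0" or b == "0":
            return "0"        # a 0 on either input dominates, even against X
        if a == "X" or b == "X":
            return "X"
        return "1"

    for a in "01XZ":
        print([and4(a, b) for b in "01XZ"])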

28 1.4.2.4 Fault simulation

Fault simulation checks how many manufacturing defects, such as circuit nodes shorted to ground, a given set of test vectors would pick up during the testing phase of IC manufacture. By deliberately introducing defects into a logic-level circuit description, a fault simulator can run a simulation with the set of test vectors to see if the outputs of the circuit show any indication of a fault. The fault coverage of a set of test vectors is the ratio of errors discovered to the number of possible errors. A designer can use this information to improve the testability of a circuit before detailed design, and to choose a suitable set of test vectors for use by the manufacturing process.

Fault simulators use similar techniques to event-driven logic simulators, with a few extensions to simulate many faults during a single simulation run. Bottorff [Bottorff, 1981] describes three different approaches to fault simulation: parallel, deductive and concurrent.

Parallel fault simulation uses bits in a computer word to represent the signal values of the fault-free and several faulty circuits. Word operations allow a computer to simulate several circuits with different faults simultaneously; large numbers of faults require multiple simulation runs. Parallel fault simulation has a time complexity of O(N³), where N is the number of logic gates [Bottorff, 1981].
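The bit-packing idea can be sketched in Python (the three-gate circuit and fault list are invented; a real fault simulator packs as many faulty machines as the word width allows):

    # Illustrative bit-parallel fault simulation for an assumed circuit:
    # out = NOT(OR(AND(a, b), c)). Bit 0 of each word carries the fault-free
    # machine; bits 1..n each carry one faulty machine, so one pass of
    # word-wide logic simulates all of them at once.
    FAULTS = [("n1", 0), ("n1", 1), ("c", 1)]     # (node, stuck-at value)
    WIDTH = 1 + len(FAULTS)
    ALL = (1 << WIDTH) - 1

    def inject(node, word):
        # Force the stuck-at value into the bit position of each fault on node.
        for k, (n, v) in enumerate(FAULTS, start=1):
            if n == node:
                word = (word | (1 << k)) if v else (word & ~(1 << k))
        return word

    def simulate(a, b, c):
        # Replicate each scalar input across all bit positions, then inject.
        a, b, c = (inject(n, -x & ALL) for n, x in (("a", a), ("b", b), ("c", c)))
        n1 = inject("n1", a & b)
        n2 = inject("n2", n1 | c)
        return inject("out", ~n2 & ALL)

    out = simulate(1, 1, 0)
    good = out & 1
    detected = [FAULTS[k - 1] for k in range(1, WIDTH)
                if ((out >> k) & 1) != good]
    print(detected)   # faults whose output bit differs from the fault-free bit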

Deductive fault simulation [Armstrong, 1972] only simulates the behaviour of the good circuit, and deduces from the current state all the faults detectable at any internal terminal or circuit output terminal. The algorithm calculates fault lists at the output of each logic element. Faults in this list arise from faults associated with the element's inputs, that for the current logic state cause an erroneous element output. The algorithm also adds faults, occurring within the element, to the resultant fault list. When unknown or undefined states (X state) are present, however, the calculation of the fault lists is more difficult. Deductive fault simulation has a time complexity of O(N²), where N is the number of logic gates. It uses more memory than parallel fault simulation, and may need several runs with truncated fault lists.

Concurrent fault simulation combines features of parallel and deductive simulation [Bottorff, 1981]. The algorithm simulates faulty circuits only when a fault causes a gate input or output to differ from the good machine. This contrasts with the parallel method, which simulates both good and faulty circuits even when the logic values are equal, and contrasts with the deductive method, which only simulates the good circuit and deduces the fault machine behaviour from the current good machine state. Concurrent fault simulation uses fault lists for each gate, and schedules events when changes in logic values occur in the good machine, or in the fault machines in the fault lists. It uses more storage than the deductive technique, but can handle mixed level representations and nominal delay simulation [Gai et al., 1988]. It also has a time complexity of O(N²).

Some test generators and fault simulators can use functional-level circuit descriptions [Bottorff, 1981]. They require that the functional primitives have a high structural regularity, such as RAM arrays.

1.4.2.5 Functional simulators

Functional simulators analyse high level descriptions of a circuit, where the circuit elements consist of functional blocks in the design hierarchy. These functional blocks have a mapping to the hardware construction of the circuit. These simulators provide similar information to a logic simulator; they check that the circuit will function correctly for a range of input data. They are fast and allow the designer to quickly simulate the complete circuit at early stages in the design process. The input to a simulator is a high level programming language description of the circuit. Structured programming techniques can represent the functional hierarchy by using subroutines to describe the function of each block, and using parameter passing to specify the block interconnections. The functional descriptions are simple, and are analogous to the truth tables of gate level simulators.
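For instance, a functional block might look like the following C sketch, where the subroutine body describes the block's function and the parameters stand for its interconnections; the 4-bit adder block is an invented example.

    #include <stdio.h>

    /* Functional block: a 4-bit adder with carry in and carry out. */
    void adder4(int a, int b, int carry_in, int *sum, int *carry_out)
    {
        int result = a + b + carry_in;
        *sum = result & 0xF;
        *carry_out = (result >> 4) & 1;
    }

    int main(void)
    {
        int sum, carry;
        adder4(9, 8, 0, &sum, &carry);   /* arguments act as the wires */
        printf("sum=%d carry=%d\n", sum, carry);   /* sum=1 carry=1 */
        return 0;
    }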

1.4.2.6 Behavioural simulators

Behavioural simulators analyse detailed high level descriptions of a circuit's behaviour. The structure of the behavioural description may be without a direct mapping to the hardware construction of the circuit. Like functional simulators, the circuit description can be in a high level language, but it contains more detailed information, including timing information. Manufacturers sometimes supply behavioural descriptions of their circuits to allow designers to integrate them into their own systems.

1.4.2.7 Mixed level simulators

Mixed level simulators accept circuit descriptions with different levels of representation [Rammig, 1986; De Man et al., 1981]. This allows a designer to examine in detail the response of particular sections of a circuit to a set of external input signals, without incurring the overhead of simulating the whole circuit at that detail. A designer may combine a recently designed transistor level description of a circuit sub-block with a functional description for the rest of the circuit. A mixed level simulator can simulate this sub-block at the switch level in the context of the whole circuit.

1.4.2.8 Mixed mode simulators

Mixed mode simulators simulate a circuit description at different levels of detail [Schindler, 1987; Newton, 1981; D&T, 1989b; Overhauser et al., 1989].

For example, given a transistor level description of a circuit as input, a mixed mode simulator could simulate some sections of it at the analog level, and other sections at the digital, or logic, level. They have the same advantages as mixed level simulators; often simulators support both features and use the two terms interchangeably. Mixed mode simulators are useful for circuits that combine analog functions, such as amplifiers, with digital functions, such as registers.

1.5 Summary

There are many computer aided design tools available to aid the designer in designing a circuit. They achieve the computer aided design objectives of minimizing design errors, and aiding design changes. As design tools automate more of the tasks traditionally done by the designer, the time taken to produce the design will become dominated by the running time of the design tools.

Ideally a circuit will meet the design specifications the first time the designer gets it fabricated, but typically the designer requires two passes of the fabrication step before the circuit is available for use in its intended application. For large volume production, however, it is economical to further refine the design to increase the yield, and hence reduce the production costs.

To avoid multiple passes of the fabrication step, a circuit requires thorough design verification. Although static analysis tools such as timing verifiers can do some of this, verification requires extensive simulation of the circuit at all stages of design.

Simulation is the most time consuming step in the design cycle. Each simulation run must evaluate the state of the circuit at many time points, and many simulation runs with different input data are necessary for complete verification. Each time a design changes, the designer must resimulate the circuit to ensure the change was successful. The time required to obtain the results of a simulation should be short enough that the details of the design are still fresh in the mind of the designer. Present uniprocessors can accurately simulate only small sections of a large circuit in a sufficiently short time.

To meet the computer aided design objectives of shortening the time for design synthesis and verification, design tools must run on computers faster than present general purpose workstations. Faster computers will reduce the time a designer needs to design a circuit that meets its specifications, and thus make it more economical to use complex integrated circuits in low production volume applications.

Attempts to increase the speed of computer systems for circuit design have ranged from using special purpose hardware, to using distributed workstations working together in parallel. General purpose parallel processing systems have become cost effective because of advances in microprocessor and interconnection techniques. These systems have the advantage of flexibility, and offer substantial reductions in running time for design tools that exploit the parallelism available in their algorithms.

The design tools available for computer aided design cover a wide range of computational techniques. Although special purpose hardware may aid a few of these techniques, the design system must be a flexible environment that can efficiently run a wide range of design tools.

The goal of this thesis is to propose a general purpose parallel computing system that will support the full range of design tools, and offer a substantial speed up to the design verification step in the design cycle.

Chapter 2 will describe work done by the author on building a processing system for the real time acquisition and control of external signals, and the implementation of a parallel circuit extraction algorithm on a multicomputer. This work generated initial ideas for a multiple processor computer system for running CAD software.

Chapter 3 will begin by overviewing the various approaches to designing fast computers, and indicate which approaches are best for accelerating CAD software. It concludes that a Multiple Instruction stream / Multiple Data stream (MIMD) architecture offers the highest flexibility. It goes on to analyse recent research into parallel CAD algorithms on MIMD architectures, before recommending a heterogeneous multicomputer architecture for running CAD algorithms.

Chapter 4 discusses hardware and software requirements of the multicomputer proposed for running CAD algorithms. It includes sections on the computing node architecture, the interconnection network, the secondary storage system, the operating system, and the programming environment.

Chapter 5 presents the conclusions reached in the thesis.

Chapter 2

Background

The ideas for a general purpose parallel computing environment for computer aided design, to be presented in Chapter 4, derive from the author's experiences with both hardware design for high performance processing, and parallel processing software development for computer aided design.

The hardware design project, carried out early in the thesis work, consisted of the design and construction of a real time subsystem (RTS) for a VAX 11/780 super-minicomputer used for University research. This subsystem provides researchers with low cost real-time acquisition and control of external signals, which allows computer control of laboratory experiments and direct processing of results. The design provided experience with a specialized distributed processing system, and with high speed communications.

The hardware development led to an interest in parallel computer architectures for providing cost effective performance beyond that achievable by single low cost microprocessors. The University of Adelaide's strong background in teaching VLSI design and developing computer-aided design software provided an incentive to look at using parallel processing to speed up the time consuming stages of computer-aided design.

This led to an investigation into the feasibility of using a general purpose, message based multiprocessor to speed up the time consuming circuit extraction phase of circuit simulation. This investigation provided experience with programming in a practical parallel processing environment, and highlighted some of the limitations of parallel processing architectures.

This chapter details the work on the RTS and on the parallel circuit extraction software, and points out the ideas for a general purpose parallel computing system for computer-aided design that arose from the work.

2.1 Real Time Subsystem (RTS)

2.1.1 Introduction

The Department of Electrical and Electronic Engineering at the University of Adelaide, Australia required a general purpose computer system capable of acquiring, and responding to, large bandwidth external signals. Researchers wanted a system for conducting experiments that could

o acquire data from external systems at high sampling rates,

o provide real-time pre-processing of external data before storing it for later off-line analysis,

o output external control signals that can respond to external events in real time,

o and provide a user friendly computing environment for later analysis of results.

At the time, the Department had a DEC VAX¹ 11/780 running the UNIX² 4.3 BSD operating system [Quarterman et al., 1985]. This system included

o secondary storage for storing results for future analysis,

o mathematical software for doing matrix arithmetic,

o graphics packages for displaying results on monochrome graphics terminals and producing hardcopy output on plotters,

o and software development support for writing programs to analyse results.

Researchers used this system for general text processing, software development, and simulating complex systems.

The VAX has a bus called the UNIBUS for interfacing with external peripherals such as disk drives, printers, and terminals. Researchers could have added modules to this bus to allow the input and output of external signals for use in experiments, but the system was unsuitable for responding to external events that occur at high frequency (greater than 100 kHz). Real-time processes would have had to compete with other users for CPU time and I/O access. If these real-time processes had higher CPU priority than other processes, they would significantly degrade the response time for interactive users, and the real-time response would still be unpredictable.

The next section discusses the characteristics of real-time systems and the overheads in a general purpose system that prevent it from being suitable for responding to critical external events.

¹DEC, UNIBUS and VAX are trademarks of Digital Equipment Corporation.
²UNIX is a trademark of AT&T Bell Laboratories.

2.1.2 Real Time versus General Purpose Computer Systems

2.1.2.1 Real time computer systems

The correct operation of a real-time computer system depends not only on the calculation of a correct response to an external event, but also on the time it takes to respond to the event [Stankovic, 1988]. For example, in a real-time control system a computer might generate a control signal, and monitor some parameter of an external system. After comparing this parameter with a reference value, the computer may decide to change the value of the control signal to correct the parameter before damage occurs in the external system.

If the computer does not calculate the new value for the control signal quickly enough, the system will fail.
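A minimal sketch of such a loop in C follows; the reference value, the gain of the correction and the simulated external parameter are all invented for the illustration, and the essential real-time constraint appears only as a comment.

    #include <stdio.h>

    #define REFERENCE 100

    static int plant = 0;   /* stands in for the external parameter */

    static int  read_parameter(void) { return plant; }
    static void write_control(int u) { plant += u / 4; }  /* toy response */

    int main(void)
    {
        int step;
        for (step = 0; step < 10; step++) {
            int error = REFERENCE - read_parameter();
            write_control(error);   /* proportional correction; in a real
                                       system this must complete before a
                                       hard deadline, or the system fails */
            printf("step %d: parameter = %d\n", step, plant);
        }
        return 0;
    }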

It is important that a computer in a real-time environment can guarantee to meet the timing requirements of an external system; hence the computer's timing behaviour needs to be predictable.

Stankovic [Stankovic, 1988] gives a good overview of the issues involved in real-time computing.

2.1.2.2 General purpose computer systems

General purpose computer systems, such as the VAX 11/780 running the UNIX operating system, try to minimize the average response time for many tasks.

Operating systems achieve a high CPU utilization rate by switching to another task when the current task is waiting for I/O. This may happen when a task is waiting for data from secondary storage, or a response from a user's terminal.

To make effective use of limited primary memory, operating systems swap processes which have been idle from primary memory to secondary storage, and retrieve them when the processes are ready for execution again. Operating systems also use paging techniques to limit the primary memory required by a process to the set of pages needed for the current series of operations. As new pages are read in from secondary storage, the least recently accessed pages are written back.

Real-time processes may remain idle for long periods of time before receiving an interrupt. When they do receive an interrupt, they may need to service the interrupt quickly. The operating system, however, may need to swap the process back in or retrieve pages from secondary storage before servicing the interrupt. This leads to unpredictable response times.

In a multiuser system, the operating system tries to distribute the CPU time fairly among a set of tasks. It does this by allocating time slices to each process; when a process's time slice is up, the operating system schedules another process for execution. This prevents a user from stopping other users getting responses from interactive programs, such as editors, by running a CPU intensive program that does not need frequent access to an external device. The priority of a task determines the size of a time slice, and the frequency at which the operating system assigns one to the task. The operating system can dynamically alter this priority during the execution of a task, leading to unpredictable response times.

Some multiuser operating systems include features to make the response time to external events more predictable. For example, Masscomp's modified version of the UNIX operating system allows real-time processes to have fixed CPU priorities [Blackett, 1983]. The highest priority real-time task gets unlimited CPU time until it completes, pauses, waits for I/O, or a higher priority process interrupts it. When the current real-time process waits for I/O, the operating system will schedule the real-time process with the next highest priority. While there are no real-time processes waiting for execution, the operating system will distribute CPU time fairly among time-sharing processes in the normal way. To prevent memory management uncertainties, a real-time process can prevent the operating system swapping it to disk, or paging parts of it to disk. Real-time processes can also be assigned contiguous areas of storage on disks, for storing and retrieving information at a predictable, sustained data rate.

The following paragraphs discuss the main overheads incurred when switching between processes [Blackett, 1983] in response to an interrupt.

Interrupt latency within the operating system is the time between an external system asserting an interrupt, and the operating system accepting the interrupt. This latency results from the operating system switching off interrupts while it executes a critical section of code.

Context switch time is the time the operating system takes to stop one process and start another. During this time, the operating system saves enough information about the state of the current process to allow it to restart where it left off, and restores any previous state for the next process. The state of a process may include the contents of the CPU's registers, memory management information, and the status of open files. If the operating system needs to retrieve a process, or parts of a process, from secondary storage during a context switch, the context switch time increases significantly.

Although there are techniques for providing predictable response times for real-time events in a general purpose operating system, the overheads of interrupt latency and context switch time limit the rate at which the operating system can respond to interrupts. This rate is typically less than 10,000 interrupts/sec, which is inadequate for responding to external events that occur at rates above 100 kHz. General purpose file systems also limit the rate at which the system can store data on disk.

2.1.3 Real Time System Overview

The goal of the real time subsystem (RTS) was to preserve the existing computing environment, provided by the VAX 11/780 general purpose computing system (GPS), and provide the real-time facilities required for laboratory research. Figure 2.1 shows a block diagram of this system.

The RTS consists of a single board computer (SBC) that communicates with external interfacing modules over a standard backplane. These modules include a 16-channel Analog/Digital (A/D) converter module, an 8-channel Digital/Analog (D/A) converter module, a Timer/Parallel digital Interface (TPI) module, and a datalink to the GPS. Figure 2.2 shows a block diagram of the architecture of the RTS.

The SBC contains no operating system. The GPS provides the facilities for program development, and downloads programs to the SBC. A series of software routines is available on the GPS to assist programmers to interface with the hardware available on the RTS. The simple software system removes the overheads that are present with general purpose operating systems, such as file system management, memory management, and process scheduling. To respond to an external interrupt, the RTS only needs to save the CPU registers and program counter, thus minimizing the response time to an interrupt. Only one user at a time will use the RTS, so only programs needed for the current experiment reside in the RTS. Programs remain in main memory for the duration of an experiment, thus removing paging and swapping overheads.

The RTS can acquire results and send them to the GPS for later analysis using the mathematical software available on the GPS.

Figure 2.1: System Overview (the GPS, a VAX 11/780 with disk, plotter, printer and graphics terminals, connected to the RTS with its A/D, D/A and digital I/O modules)

Figure 2.2: Real Time Subsystem (RTS) Architecture (the SBC and interface modules on the VMEbus, with a data link to the GPS)

Appendix A describes the system in more detail.

The next section will discuss ideas for the hardware and software characteristics of a multiple processor computer system suitable for running CAD software, which arose from the work of developing the RTS.

2.1.4 Conclusions

The development of the RTS has provided some initial ideas for, and insights into, the structure of a high performance computing system for computer aided design. The basic idea is that the system should consist of both a general purpose computing section (GPCS), and a special purpose computing section (SPCS). Designers will use the GPCS for the user interactive stages of design, such as circuit layout, and also for the general management of the design using facilities such as text editors. The SPCS will allow designers to off-load the compute intensive design tasks, such as circuit simulation, from the GPCS, to reduce the running time of these tasks. The aim is to provide a high performance system that is affordable for small design teams. This is in contrast to many large scale supercomputers that provide high performance that only large corporations can afford.

The following is a list of recommendations for the high performance computer system derived from the development of the RTS:

o use high performance, high production volume, commercial microprocessors,

o use as much local memory as possible,

o use open standards where possible to take advantage of other manufacturers' expertise and to reduce costs,

o place the main user interaction facilities on the GPCS,

o use a widespread operating system for the GPCS,

o support well known and understood programming languages,

o keep the operating system software on the special purpose section as simple as possible,

o allow the user software full access to the low level facilities available on the special purpose section as well as offering simple high level routines,

o and provide high speed communications between the GPCS and the SPCS to allow the exchange of large programs and data.

The use of commercially designed microprocessors substantially reduces the development costs of a high performance design. The best performance/price ratio usually occurs when using a microprocessor that is in high volume production. This implies that the microprocessor will be general purpose, and not necessarily optimum for a particular application. It will have the advantage of a large base of development tools from different manufacturers, and have readily available compilers for a range of languages. There are often support integrated circuits (ICs) available for high production microprocessors, including ICs for memory management and floating point acceleration. To achieve the same performance as a highly optimized single processor supercomputer, a system will require multiple microprocessors working together in parallel.

A system with a large amount of local memory minimizes references to non local storage, such as disk storage, and reduces the need for complex memory management. To put the maximum amount of memory on the same board as the CPU, hence reducing access delays, a system needs to use the highest density memory available. The access speed of high density memory is too slow for recent microprocessors, which can operate with clock rates above 40 MHz, but the use of cache memory can overcome this mismatch.

The use of commonly used open standards in the design takes advantage of the large base of compatible products available. Parts of a design that can use these standards include the bus structure used to connect the different elements of the design, and the interfaces to standard devices such as disk drives. The competition between manufacturers producing equivalent items, and the larger volumes of production, reduce the price/performance ratio for essential items such as memory boards, CPU boards, disk controllers and disk drives. Incorporating these items into the design saves on design time, and allows designers to take advantage of specialist expertise from other manufacturers. The disadvantage is that these products are not optimum for a particular design, and may contain functions that are unnecessary. By customizing a design, higher performance is possible at the expense of the extra design and development time.

By placing the main user interaction facilities on the GPCS, the system can optimize the design of the software system on the SPCS for executing the compute intensive stages of design. This allows a user to benefit from a sophisticated user interface without compromising system performance.

By using a widespread operating system, such as the UNIX operating system [Ritchie & Thompson, 1974; Quarterman et al., 1985; Blair et al., 1985], as the basis of the operating system for the GPCS, the high performance computing system can take advantage of an environment that many users are familiar with, and for which there is a large software base. Features that such an environment provides include command interpreters, text editors, text formatters, file management, and software development facilities. Users are more likely to use a system that has a human/computer interface that they are familiar with. The underlying details of the operating system can be different from the generic version, but the programmer's interface should appear the same as for a standard operating system.

By supporting well known languages, such as C and FORTRAN, for programming both the GPCS and the SPCS, the system can encourage programmers, who are already expert in these languages, to produce software for the system. It also allows programmers to easily port popular software, already written in these languages, to the new machine. Designers can add extensions to these languages to harness the special abilities of the high performance system. The compilers for these languages produce reliable output, and incorporate advanced optimization techniques.

By keeping the operating system software on the SPCS as simple as possible, the system allows software developers to develop their own optimized versions of facilities such as file management and memory management for running specific applications. The operating system does not need to have facilities for general purpose computing, such as fair process scheduling with automatic swapping out of idle processes, that can add to the running time of applications.

By allowing software developers full access to the low level facilities available on the SPCS, the system lets them get the full benefit of the processing power available. A library of software routines can simplify access to these low level resources. High level software routines that behave at the level of general purpose operating system calls will allow software developers to produce prototype software quickly.

In addition to providing the software development facilities for the SPCS, the GPCS will provide for downloading programs with their input data, and receiving their results. This interaction will require high speed communications to prevent the data transfer time from overcoming the speed advantages of the SPCS.

2.2 Parallel Circuit Extraction

2.2.1 Introduction

Present uniprocessor general purpose computers, such as uniprocessor workstations, are inadequate for the compute intensive stages of computer aided design of VLSI circuits, such as circuit simulation. Although large scale single processor supercomputers can provide large speed ups over general purpose computers for vector based calculations, their cost is too high for most designers. For calculations that do not involve vector arithmetic, these supercomputers are not cost effective.

The alternatives available to designers range from using special purpose accelerators to using networked workstations working together in parallel. Special purpose accelerators usually only accelerate a few of the most time consuming stages of design, such as logic simulation [Blank, 1984; SIGDA, 1988; Lewis, 1988]. They offer the best performance but are inflexible; a new algorithm can make them redundant if it provides equivalent performance on a general purpose computer.

Networked workstations allow designers to use the excess computing capacity of other workstations. Typically each designer may have sole use of a workstation to provide both a fast interactive response and good graphics for the interactive stages of design, such as circuit layout. Often these workstations have idle computing capacity while waiting for designers to input their next instruction. By using this idle capacity in parallel, the compute intensive stages of design can run in the background at a faster rate than if a single workstation ran them exclusively. For this to be effective, it must be possible to partition the task in a way that minimizes the amount of communication over the network. Since this is not possible for all applications, distributed workstations do not offer a general performance improvement for compute intensive tasks. Also, running compute intensive applications in the background can degrade the response time for the interactive user, such as when pages of the interactive task are pushed out of primary memory. There are also problems of load balancing among workstations that have a dynamically varying load. Widdowson and Ferguson [Widdowson & Ferguson, 1988] discuss an application of networked workstations. Agrawal and Jagadish discuss methods for optimally partitioning a problem for running on a network of workstations [Agrawal & Jagadish, 1988].

The development of the RTS provided ideas for the structure of a high performance computing system for computer-aided design. This system relies on using high production volume commercial microprocessors, coupled together with a fast interconnection network, to achieve cost effective high performance. It has the advantage of flexibility, and the potential for providing large speed up factors when software developers exploit the parallelism of an application. Gordon Bell [Bell, 1989] presents a good overview of the cost effectiveness of multiprocessors versus traditional supercomputers.

Ackland [Ackland et al., 1985; Ackland et al., 1986] has shown that message-based general purpose multiprocessors can effectively speed up the simulation of a circuit. These multiprocessors adapt well to the task of circuit simulation because circuits operate by passing signals between subcircuits; by assigning these subcircuits to different processors, the evaluation of circuit operation can proceed in parallel.

Circuit extractors derive accurate circuit parameters for input into circuit simulators by analysing the circuit mask layout, which serves as input to the fabrication stage of design. The circuit extraction for a large circuit (greater than 100,000 transistors) takes several hours on a conventional workstation (such as a SUN 3/260). A designer must rerun the extractor after each circuit modification. Thus if circuit extraction is amenable to parallel processing, a computer system could significantly reduce the time for designing a large circuit.

This section describes the use of a general purpose message based multiprocessor (also called a multicomputer, as multiprocessor usually refers to a shared memory multiple processor architecture) to speed up the task of VLSI circuit extraction [Tonkin, 1990]. This work began in February 1988 at AT&T Bell Laboratories, Holmdel, NJ, USA. At the time, researchers had not done much work in this area since circuit extraction is not as amenable to parallel processing as other design tasks such as circuit simulation. The aim was to show that a multicomputer could offer significant speedups (i.e. > 10) for tasks other than circuit simulation. This will help show the potential of using parallel processing for speeding up the compute intensive integrated circuit design tasks, and will provide additional information on the facilities needed on a multiple processor machine to support these tasks.

2.2.2 Overview of Circuit Extraction

Circuit extractors derive from the circuit mask layout a list of transistors along with net numbers for each transistor's terminals, and the length and width of each transistor's channel. Connected transistors will have the same net number on corresponding terminals. Extractors match the labels assigned to electrical nodes by the designer with the extracted electrical nodes. Logic simulators can use this information to verify the logical operation of a circuit.

The electrical path between transistor terminals comprises several regions that consist of different combinations of mask layers. These regions have differing electrical properties. Extractors calculate the area of each region, along with the perimeter the region shares with each region touching its boundary. A simulator can use this information to determine the capacitance associated with each electrical node in the circuit.
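As a worked illustration (the coefficients here are invented, not process data): a region of area 40 µm² and perimeter 28 µm, with an area capacitance of 0.1 fF/µm² and a perimeter (fringing) capacitance of 0.05 fF/µm, would contribute 40 × 0.1 + 28 × 0.05 = 5.4 fF to its node.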

The circuit layout usually consists of a hierarchy of cells. Cells consist of other smaller cells, and collections of polygons representing different mask layers. Researchers have taken advantage of this hierarchy to speed up layout analysis [Gregoretti & Segall, 1984], by analysing each cell only once. Problems occur when the cells overlap, hence altering the internal behaviour of the cells. Solutions to this either involve restrictions on the designer, such as requiring the use of non-overlapping cells [Stevens & McCabe, 1985; Scheffer & Soetarman, 1985], or involve techniques that may result in a longer running time for irregular designs [Newell & Fitzpatrick, 1982].

The alternative is to flatten out the hierarchy and work on the mask layers of a single cell. This has the disadvantage of requiring a large amount of memory to store the input mask description in its flattened form. For example, a 130,000 transistor design required 6 megabytes to store the hierarchical description, and 56 megabytes to store the flattened description. This work uses the flattened description of a circuit, as it applies to all design techniques, and shows how a multicomputer can handle such large quantities of data through the use of secondary storage.

2.2.3 Goalie

Goalie is a circuit extractor developed by Szymanski and Van Wyk [Szymanski & Van Wyk, 1985] to extract circuit information from large mask layouts. It uses edges to represent the mask polygons. Each edge has information on the coordinates of its left and right endpoints, its electrical net number, its mask layer, and the side of the polygon it represents. Goalie supports non-orthogonal geometry.
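In C, such an edge record might be declared as in the following sketch; the field names and types are invented here, since the text does not give Goalie's actual declarations.

    /* Sketch of an edge record (invented field names and types). */
    struct edge {
        int x_left,  y_left;    /* left endpoint                       */
        int x_right, y_right;   /* right endpoint                      */
        int net;                /* electrical net number               */
        int layer;              /* mask layer                          */
        int side;               /* which side of the polygon it bounds */
    };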

The first step in the extraction flattens the hierarchical circuit description into a set of edge files, one for each mask level. It sorts the edges in order of their left endpoints and on their slope.

The second step creates a series of intermediate layers that represent the resultant electrical characteristics produced by the boolean combinations of the original mask layers. Examples of the intermediate layers (for a MOS circuit) are transistor layers, polysilicon wire layers, diffusion wire layers, metal to polysilicon contact cut layers, and metal to diffusion contact cut layers. It can assign unique numbers to the edges of each separate transistor region.

The third step assigns unique net numbers to the wires connecting transistors by considering the contact cuts between the mask layers.

The fourth step identifies the net numbers of the terminals of each transistor, and determines the area of each transistor along with its channel type.

The final step does region analysis on the wires used to connect the transistors to obtain area and perimeter information for calculating capacitances.

The fundamental algorithm used by Goalie is the scanline algorithm [Lauther, 1981]. It sweeps a vertical line, called a scanline, across the layout. The layout is read in as a list of edge records sorted in order of their left endpoints. The scanline stops at each point where an edge begins, ends or intersects with another edge.

Goalie conducts the necessary processing by running up the scanline, accumulating information about the edges crossing the scanline. It only needs to keep the edges that currently intersect with the scanline in memory; this is useful because the amount of main memory in most single processor computers is insufficient to store and process the entire layout for a large circuit.
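The skeleton of such a sweep might look like this C sketch, which keeps only the active edges in memory as the scanline advances. The flat array and all names are invented; Goalie's real data structures are more elaborate.

    struct edge { int x_left, x_right; /* ... net, layer, side ... */ };

    #define MAX_ACTIVE 4096

    static struct edge active[MAX_ACTIVE];  /* edges crossing the scanline */
    static int n_active = 0;

    /* Advance the scanline to x: drop edges that have ended, then add
       the edges beginning at x (supplied by the sorted input stream). */
    void advance_scanline(int x, const struct edge *begin, int n_begin)
    {
        int i, j = 0;
        for (i = 0; i < n_active; i++)
            if (active[i].x_right > x)      /* edge still crosses scanline */
                active[j++] = active[i];
        n_active = j;
        for (i = 0; i < n_begin && n_active < MAX_ACTIVE; i++)
            active[n_active++] = begin[i];
        /* ... process the active edges at this scanline stop ... */
    }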

For the second and third steps of the extraction, Goalie writes the output edges, with temporary net numbers, out to disk as the scanline passes their right end points. If it finds two edges with different net numbers belonging to the same net, it writes a record to disk marking a merge between the nets. Figure 2.3 shows a case where this occurs. Goalie sorts the output edge records in order of their left endpoints so that it can use them as input for the next step of the extraction. As it does this, it assigns final net numbers to the edges. Szymanski and Van Wyk developed efficient techniques to achieve fast sorting and region numbering without requiring that all the output edges remain in memory.

Large circuits typically require the input and output of over 50 Megabytes of data. Goalie needs fast and efficient access to disk storage to prevent the I/O requirements of large circuits from seriously affecting its performance.

Researchers [Chiang, K-W et al., 1989] have made improvements to Goalie in GOALIE2. They have reduced the time taken to run up the scanline at each scanline stop by using a trapezoidal representation for polygons, which enables GOALIE2 to predict the range of y coordinates affected by a change in the status of an edge. This work involves the algorithms used in Goalie, but the techniques are easily transferable to GOALIE2.

This work makes use of the techniques used in Goalie that allow large circuits to be extracted by keeping most of the data on secondary storage. It allows multiple processor machines with a few nodes to extract large circuits without needing a large amount of expensive random access memory. It describes how to access secondary storage efficiently to prevent accesses from becoming a bottleneck. The techniques allow small scale multicomputers to achieve cost effective speedups compared to the serial machines currently in use.

At scanline position 1 there are two distinct regions. At scanline position 2 there is only one distinct region after merging the original two regions.

Figure 2.3: Example of a region merge

2.2.4 Strategy for Parallel Circuit Extraction

To show the feasibility of applying parallel processing techniques to the Goalie circuit extraction algorithms, this work will concentrate on the second step of the extraction. The second step involves the boolean combination of mask layers using a scanline algorithm, and applying unique numbers to the transistor regions. The third step of the extraction uses similar scanline techniques. The final two steps are simpler because they do not require a separate sorting and renumbering pass over the data.

The basic strategy is similar to the strategies of MACE [Levitin, 1986; Marantz, 1986] and PACE [Belkhale & Banerjee, 1988]. It involves splitting the flattened layout into sections, operating on these sections independently and in parallel, and combining the results in a merging stage. Figure 2.4 illustrates three strategies for partitioning the circuit. This work investigates different strategies for splitting up the layout, and the use of secondary storage to extract complete VLSI circuits.

MACE split the design up into equal size, horizontal slices. The number of slices corresponded to the number of processors available to do the extraction. By using horizontal slices, MACE minimized the amount of data required to be scanned at any scanline stop for a vertical scanline moving from left to right. The use of slices of equal size leads to load balancing problems among the processors handling the slices if the density of the circuit varies. MACE had problems merging transistors across slice boundaries. It ran in a distributed computing environment using a VAXcluster [Kronenberg et al., 1986]. It did not achieve any performance improvements over the serial circuit extractor that served as its basis, because of overheads in merging complete devices across boundaries.

PACE split the design into slices in a grid formation. The number of slices corresponded to the number of processors available in an Intel iPSC D4/MX hypercube. It selected the number of slices in the x and y directions to minimize the perimeter of the rectangular regions, and hence reduce the amount of work required to merge the results across slices. It used an efficient merging strategy, and achieved speedups of around 11-12 on a 16 processor hypercube compared to the running time on a single processor. The amount of main memory available on the processors in the hypercube limits the size of circuit that PACE can process. The largest circuit extracted contained 24,000 transistors. Load balancing would also cause a problem, as in MACE, if the density of the design was highly irregular.

Figure 2.4: Partitioning strategies (MACE: horizontal slices; PACE: grid; Parallel Goalie: vertical slices)

Parallel Goalie split the circuit up into vertical slices to take advantage of the techniques used in Goalie that optimize the speed of processing the edges crossing a scanline. Even for large designs, it is feasible for processors to have enough memory to store the information associated with all the edges crossing a scanline. The amount of time to merge the results across the boundary is small enough compared to the time to process a slice that it is not worth adopting a grid based scheme to minimize the perimeter.

More recently, Belkhale and Banerjee used a revised partitioning algorithm for PACE2, which incorporated a parallel algorithm for extracting the capacitances and resistances associated with the electrical nets and transistors [Belkhale & Banerjee, 1989]. They partitioned the circuit into vertical slices with equal numbers of mask rectangles, and further divided the slices into segments containing equal numbers of rectangles. They achieved speedups of 13-14 for large circuits, and showed results with an improved load balance over their original partitioning method in PACE.

2.2.5 Parallel Goalie

2.2.5.1 General purpose multicomputer

Parallel Goalie, the parallel version of Goalie, uses the HPC/VORX local area multicomputer system as the general purpose, message based multiprocessor [Gaglianello et al., 1989]. HPC/VORX is a multiple instruction stream, multiple data stream (MIMD) architecture with distributed memory, which consists of clusters of communication links that provide low latency communications between processing nodes. Figure 2.5 illustrates this communications architecture.

The links, which have a bandwidth of 100 Mbits/sec, can either connect directly to a processing node, or to another cluster. They consisted of twisted pair ribbon cable, transmitting 8 bits with a 12.5 MHz clock to give a communications rate of 100 Mbits/sec over distances up to 150 feet.

The clusters consist of 12 ports that buffer and route messages, and they interconnect in a hypercube topology. As of October 1988, the system contained four SUN 3 workstations and 70 adjunct processors, with about 1.8 megabytes of memory available on each node, which is enough to support medium grain parallelism. The adjunct processors are Motorola 68020 based Multibus single board computers, some of which run at 16.67 MHz and others at 25 MHz. It takes about 1 ms to send a 1024 byte message using the software protocols provided by the VORX operating system [Gaglianello et al., 1989].

VORX, which is a direct descendant of Meglos [Gaglianello & Katseff, 1985; Gaglianello & Katseff, 1986], an operating system for the S/NET multicomputer system [Ahuja, 1983], provides a UNIX operating system programming interface. The SUN workstations (3/260s) act as host machines, running SUN's version of the UNIX operating system. They provide software development facilities, and download code onto the adjunct processors. While carrying out this work, the only secondary storage available was through the SUN workstations.

Figure 2.5: HPC Architecture (clusters of 100 Mbits/sec links interconnecting adjunct processing nodes and SUN workstations)

Figure 2.6 shows the programmer's model of the HPC multicomputer. The software development facilities used in the development of Parallel Goalie include a C compiler, a symbolic debugger and a communications debugger. The C compiler interacts with the facilities of the HPC through C library functions. Programs communicate through channels which appear to the programmer like UNIX operating system files. There are library routines for opening, closing, reading and writing to these channels.

Figure 2.6: Programmer's Model of the HPC (processing nodes with CPU, FPU and memory, connected through communications interfaces to the HPC interconnect and to SUN workstations with secondary storage and user interfaces)

2.2.5.2 Serial disk accesses

Parallel Goalie initially used a SUN workstation to flatten the hierarchical mask layout into the same set of edge files as the original Goalie. It divided each edge file into subfiles, one for each processor on the HPC participating in the extraction. These subfiles contained the edges for equal area slices of the layout. The SUN workstation serving as the host machine provided disk storage for these files. Parallel Goalie ran the section of the original Goalie code that created the intermediate layers on each of the participating processors. This test showed that the time taken to access files from a single disk limits the maximum speed up possible. When using one processor, Parallel Goalie spent about 30% of the running time in disk accesses. The maximum speed up was a factor of 3 without merging the results.

The load balance was particularly bad for layouts with an irregular distribution of mask geometry. This occurs often because even though most of the layout consists of blocks of circuitry that are regular, there is also a significant amount of area used for routing. The routing often concentrates in one area. The geometry in this area is usually sparse, and a processor assigned to this area finishes processing the geometry well before the other processors. This causes a load imbalance that results in a low efficiency for the number of processors used to solve the problem. To improve the load balance, Parallel Goalie now divides the layout into slices that contain almost equal numbers of edges.

2.2.5.3 Parallel disk accesses

To achieve significant speed ups, Parallel Goalie needed to improve the time spent in accessing disk storage. One option was to split the files across multiple hosts, instead of using a single host to process the requests from the adjunct processing nodes, to allow Parallel Goalie to access disks in parallel. The problem with this technique is that the operating system on the workstations must handle the read and write requests. Parallel Goalie needs high speed, sequential access to large files; the workstation's file system is suboptimal for this task.

A more efficient technique is to use disks attached to the adjunct nodes, so that Parallel Goalie can optimize the file system on these disks. This assumes that Parallel Goalie has exclusive access to these disks while the program is running. The high production volumes of disks for personal computers have brought the cost of disks down. This makes it economical to provide separate disks for each processing node. Accessing a large contiguous block of data at a time is faster than incurring disk latencies to find several smaller blocks. Reducing the number of disk transactions also reduces the software overhead to begin the disk transfers. The optimum size of the block depends on the parameters of the system, such as: the layout of the disk, the amount of main memory available for buffers on the processing node, the hardware available to control the disk transfers, and the processing time for one block of data. If an I/O processor is available to manage the transfers from disk into main memory, the disk transfers will occur while the CPU is processing the previous block of data. Multiple buffers [Itai & Raz, 1988] for input and output ensure that the I/O processor and the CPU do not interfere with each other while transferring data. When data are being read from one large file and the results sent to another large file after a small amount of processing, which occurs in the sorting and region numbering phase of Goalie, two disk drives can reduce the number of seeks, and hence improve performance.
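A minimal sketch of the double buffering idea in C follows; plain fread() stands in for the asynchronous transfer that an I/O processor would provide, and the block size and names are invented.

    #include <stdio.h>

    #define BLOCK_SIZE (64 * 1024)   /* invented; tune to the system */

    static char buffer[2][BLOCK_SIZE];

    extern void process_block(const char *data, size_t n);

    void process_file(FILE *fp)
    {
        int current = 0;
        size_t n = fread(buffer[current], 1, BLOCK_SIZE, fp);
        while (n > 0) {
            int next = 1 - current;
            /* With an I/O processor this read would be started here,
               overlapping with process_block() on the other buffer. */
            size_t n_next = fread(buffer[next], 1, BLOCK_SIZE, fp);
            process_block(buffer[current], n);
            current = next;
            n = n_next;
        }
    }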

To show the advantage of using disks attached to the adjunct nodes, spare adjunct nodes on the HPC were programmed to behave as storage nodes emulating disks. Figure 2.7 shows this programming model. The storage nodes used their local memory to store blocks of data. A simple file server ran on each of these nodes; it supported the standard UNIX operating system file operations such as open, read, write, unlink, and lseek. Each participating processing node had access to two storage nodes. One storage node supplied the input data, while the other node received the output data. This provided 3.5 Megabytes of secondary storage per processing node. The HPC has some nodes that run at 25 MHz, and some that run at 16 MHz. The 25 MHz nodes ran as processing nodes, while the 16 MHz nodes ran as storage nodes.

Figure 2.7: Programming Model used in Parallel Goalie (each processing node paired with two storage nodes over the HPC network, with a workstation as host)

There was about a 4 ms latency in setting up the communications whenever accessing a block of storage from a storage node. The storage nodes provided a data transfer rate of 0.85 Mbytes/sec. A conventional disk drive should be able to provide higher performance.

2.2.5.4 Circuit partitioning

The first step in the extraction required processing the hierarchical mask layout and determining the slices to be sent to the processors. Parallel Goalie did this with the following steps:

o Convert hierarchical description into a single data structure. A SUN workstation read the hierarchical description from a file, and translated it into a single data structure kept in main memory. An adjunct processing node could not do this step, as the data structure was too large to store for typical designs.

o Produce edge based flattened description. Parallel Goalie traversed the data structure to produce an edge based, flattened description in a single file, in contrast to Goalie, which produces one file for each mask layer. It created a table containing a mapping between layer numbers, associated with each edge in the file, and layer names, provided by the user on the command line.

o Sort edges. Parallel Goalie sorted the edge based description file into a format suitable for use in the scanline algorithms.

o Find slice boundaries. By seeking into the file to locations corresponding to the end of equal size file partitions, where the number of partitions equals the number of processing nodes, Parallel Goalie found the x coordinates of the slice boundaries (see the sketch after this list).

o Send slices to storage nodes. Parallel Goalie read the file sequentially, and wrote out the edges for each slice to the corresponding storage nodes. It split edges that cross slice boundaries, and stored the halves on separate storage nodes.
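The boundary search might be sketched in C as follows, assuming (as an invention for this example) fixed-size binary edge records in a file sorted on the left endpoint; error handling is omitted.

    #include <stdio.h>

    struct edge { int x_left, y_left, x_right, y_right, net, layer, side; };

    /* Fill bounds[0..nodes-2] with the x coordinates that split the
       file into `nodes` slices of roughly equal numbers of records. */
    void find_slice_boundaries(FILE *fp, int nodes, int *bounds)
    {
        long n_records, i;
        struct edge e;
        fseek(fp, 0L, SEEK_END);
        n_records = ftell(fp) / (long)sizeof(struct edge);
        for (i = 1; i < nodes; i++) {
            fseek(fp, (n_records * i / nodes) * (long)sizeof(struct edge),
                  SEEK_SET);
            fread(&e, sizeof e, 1, fp);
            bounds[i - 1] = e.x_left;   /* file is sorted on x_left */
        }
    }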

Researchers have found by measurement that the expected number of edges in a circuit design cut by any vertical line is O(√N), where N is the number of edges [Bentley et al., 1980]. Thus the expected number of new edges that Parallel Goalie creates at each slice boundary is also O(√N), which for large N will be a small percentage of N.

2.2.5.5 Layout analysis

Each processing node has a single file, stored in one of its storage nodes, containing the layout description for the slice assigned to the processor. Accessing a single file stored on disk is more efficient than having many separate files for each layer.

During the second step of the extraction, Parallel Goalie makes two passes over the data stored in secondary storage.

The first pass scans the layout, with the same scanline algorithm used in Goalie, to produce a file on its other storage node containing the edges describing the intermediate layers produced by the boolean combination of the input layers.

The second pass reads the edges from this file, sorts them in preparation for the third step of extraction, and assigns unique numbers to the transistor regions. It collects and stores the edges that touch the left and right boundaries of the slice in local memory. The output of the second pass goes to the storage node that stored the original edge file, which is no longer needed.

The second pass takes about 25% of the total time for the second step of the extraction.

2.2.5.6 Merging region numbers

Parallel Goalie needs to merge the results from each of the slices before beginning the third step of the extraction. Each processing node assigns a unique series of numbers, derived from its node number, to each region. Thus the region numbers are unique across the design. The object of the merge is to detect regions that belong to the same unique region, and merge the region numbers associated with these regions. The merging phase takes advantage of most of the geometry being local to a slice. Most unique regions do not span more than two slices.

The following paragraphs describe the merging procedure. Figures 2.8 and 2.9 show an example of merging a set of wires split across 5 slices.

In the first stage of the merge, each processing node transmits the edges of its right boundary to the processor handling the slice on the right. The processor receiving the edges from the left slice compares them with its own edges touching that boundary. It produces a merge record for the region numbers of each matching pair of edges. It uses a hash table to index the merge records, and links records with the same hash key. Each merge record can link to a parent merge record. The parent record is the record containing the least region number associated with a region. At a later time Parallel Goalie can link the parent record to a new parent. This stage detects most of the region number merges, and creates an array of mapping records, each of which contains a child region number and its corresponding parent region number.

In the second stage, each processor with an array of mapping records sends the array to the processor handling the slice to the right. The processor receiving the array checks to see if it has any merge records, associated with its left boundary, with a child region number matching any of the ones contained in the mapping records. If it finds a match, it creates a merge record for the parent region number in the mapping record, provided one does not already exist. It links the parent merge records corresponding to the matching child region numbers, according to which has the least region number. When it finds a match, it enters the mapping record into an array of common mappings. This array represents merges that have propagated across two boundaries. It also contains merges between two region numbers found to merge within the slice on the current processing node.

Figure 2.8: Example of merging across slice boundaries

Figure 2.9: Example of merging across slice boundaries (ctd.)

In the third stage, each processor with an array of common mappings sends the array to all the other processors that are processing a boundary between slices. Usually this array will be small, so the communication overhead of this stage will be small. When each processor receives this array it creates new merge records for any region numbers that do not have a corresponding merge record. It links these records as for the second stage of the merge.

In the final stage, each processor that is processing a boundary finds the final parent record for each of its merge records by traversing the linked lists of records. The processing node handling the first boundary sends the merges for records to the processing node handling the first slice. Each processing node now has a collection of merge records indexed by a hash table. When the data stored on disk is read in for the next step of the circuit extraction, Parallel Goalie can use the data stored in the hash table to adjust the region numbers associated with the incoming edges. Alternatively, Parallel Goalie could store the merging information in an array and write it out to disk with the edges it refers to.

The merging procedure described above is applicable to both the second and third steps of the circuit extraction, which assign unique region numbers to transistors and the wires that interconnect them.
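
The merge records behave like a union-find structure over region numbers. The following sketch in C is a minimal illustration of that mechanism, not the actual Parallel Goalie source; the names (MergeRec, merge_lookup, merge_find, merge_union) and the table size are invented for the example. Records are indexed by a hash table, linked towards the record with the least region number, and resolved by traversing the parent links, as in the final stage of the merge.

    #include <stdio.h>
    #include <stdlib.h>

    #define HASH_SIZE 1024

    /* One merge record: maps a child region number to a parent link.
     * Records with the same hash key are chained via 'next'. */
    typedef struct MergeRec {
        long child;               /* region number this record describes */
        struct MergeRec *parent;  /* record with a lesser region number, or NULL */
        struct MergeRec *next;    /* hash chain */
    } MergeRec;

    static MergeRec *table[HASH_SIZE];

    static unsigned hash(long region) { return (unsigned)region % HASH_SIZE; }

    /* Find an existing merge record for a region number, or create one. */
    static MergeRec *merge_lookup(long region)
    {
        unsigned h = hash(region);
        MergeRec *r;
        for (r = table[h]; r != NULL; r = r->next)
            if (r->child == region)
                return r;
        r = malloc(sizeof *r);
        r->child = region;
        r->parent = NULL;
        r->next = table[h];
        table[h] = r;
        return r;
    }

    /* Traverse the parent links to the final parent record. */
    static MergeRec *merge_find(MergeRec *r)
    {
        while (r->parent != NULL)
            r = r->parent;
        return r;
    }

    /* Merge two region numbers: the record holding the least region
     * number becomes the parent of the other, as in stage 1. */
    static void merge_union(long a, long b)
    {
        MergeRec *ra = merge_find(merge_lookup(a));
        MergeRec *rb = merge_find(merge_lookup(b));
        if (ra == rb) return;
        if (ra->child < rb->child) rb->parent = ra;
        else                       ra->parent = rb;
    }

    int main(void)
    {
        merge_union(10, 1);   /* boundary match: child 10 -> parent 1 */
        merge_union(20, 10);  /* propagated across the next boundary  */
        printf("final parent of region 20: %ld\n",
               merge_find(merge_lookup(20))->child);
        return 0;
    }

Run, the example reports that region 20, merged first with 10 and then transitively with 1, resolves to final parent 1, mirroring the propagation shown in Figures 2.8 and 2.9.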

[Figure: speedup of Parallel Goalie versus number of processing nodes for circuits 1 to 4, plotted against the linear speedup line.]

Figure 2.10: Performance results for the detection and numbering of transistor regions

2.2.6 Results

Figure 2.10 shows the performance of Parallel Goalie, for various MOS circuits described in Table 2.1, relative to the linear speedup line. The results are for the detection of diffusion wires and polysilicon wires, as well as the detection and numbering of transistor regions. The speedups are relative to the predicted processing time on one adjunct processing node, as there was insufficient storage capacity on the storage nodes to measure experimental results for a single processor. The processing time on a single node is within 1% of the time the original Goalie would take on a single node. Table 2.1 shows the size of the intermediate file produced from the first pass over the input data. The input edge records can describe rectangles by storing the bottom left and top right coordinates. Parallel Goalie splits rectangles into separate edges during the processing.

                                          circuit 1   circuit 2   circuit 3   circuit 4

  Number of transistors                       53700       35400       27450         900
  Number of input edge records               340830      148600      157220        4940
  Size of input file (kbytes)                  8180        3567        3773         118
  Size of intermediate file (kbytes)          18872       12324       11031         333
  Number of scanline stops                    17222       14960       14179         899
  Average no. of edges in scanline              958         805         648         105
  Maximum no. of edges in scanline             2351        1631        2064         294
  Time for pass 1 on Sun 3/260 (secs)           410         259         247           7
  Time for pass 2 on Sun 3/260 (secs)           161         106          91           3
  Elapsed time on Sun 3/260 (secs)              631         365         338          10
  Est. time for 1 processing node (secs)        950         560         400          10

Table 2.1: Characteristics of the circuits in Figure 2.10

The average number of edges intersecting a scanline agreed with the assumption that the number is proportional to the square root of the total number of edge records. The maximum number of edges that intersect with the scanline varied between 2 and 3 times the average. The position of the boundaries between slices can affect the speed of the merging stage. Table 2.1 includes the times for a Sun 3/260 executing the original Goalie for comparison.

Table 2.2 shows results for the slowest processing node when using 12 processing nodes. The total number of new edge records created by splitting up the circuit into 12 slices is small, relative to the total number of input edge records, for all the circuits except circuit 4. The times for the second pass over the data, which does the region numbering and sorting, are a smaller percentage of the total time compared to the results of the single processor Sun. This is because the second pass is more I/O intensive than the first pass, and the advantages of parallel disk accesses are more pronounced.

                               circuit 1   circuit 2   circuit 3   circuit 4

  Time for pass 1 (secs)              73          46          37         0.8
  Time for pass 2 (secs)              13           6           7         0.3
  Time for merge (secs)             0.09        0.04        0.04        0.08
  Number of messages                  33          18          20          15
  Total number of new edges         8805        5606        7048         811

Table 2.2: Times for the slowest node out of 12 nodes

The results show that the merging overhead is small. As the number of participating processing nodes increases, the number of new edges created will become of the same order as the original number of input edges. The overhead in accessing these edges on secondary storage will affect the efficiency. For small circuits such as circuit 4 there is little advantage in using multiple processors, as the extraction time on a single processor is small.

The efficiency of Parallel Goalie is dependent on the quality of the load balance. Circuit 2 and circuit 3 are about the same size, but show different performance characteristics. Circuit 3 has areas of dense circuitry and sparse areas of routing. Figure 2.11 shows the partitioning for circuit 3, and Figure 2.12 shows the variation in load among 12 participating processing nodes. The last two processing nodes have the most load and are responsible for the poorer performance of the program for circuit 3.

From Figure 2.13 the number of scanline stops in a partition appears to correlate with the processing time for a slice. Figures 2.14 to 2.16 show the results for circuit 2. These results also show a correlation between the number of scanline stops and the processing time for a slice. The variation in processing load is still small, so the technique of assigning equal numbers of edges to each slice is effective.

[Figure: partitioning of the circuit 3 layout into 12 slices.]

Figure 2.11: Partitioning for circuit 3 with 12 processors

[Figure: percentage of total load carried by each of the 12 processing nodes for circuit 3, plotted against the ideal load line.]

Figure 2.12: Load variation for circuit 3 with 12 processors

[Figure: number of scanline stops in each of the 12 partitions for circuit 3.]

Figure 2.13: Number of scanline stops for circuit 3 with 12 processors

The results show that the software overhead in scheduling and processing the results of each scanline stop is significant, whereas the processing of the data in the scanline is efficient. Thus, to further improve the load balance, Parallel Goalie could take into account the estimated number of scanline stops in a slice.
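
A simple way to express this refinement is a per-slice cost estimate that weights edge count and scanline stops together. The sketch below is illustrative C only, not part of Parallel Goalie; the weights ALPHA and BETA are hypothetical and would have to be calibrated against measured runs such as those in Figures 2.13 and 2.16.

    /* Hypothetical slice cost model: the partitioner would grow each
     * slice until its estimated cost reaches (total cost) / (number of
     * nodes), instead of balancing edge counts alone. */
    #define ALPHA 1.0   /* assumed cost per edge in the slice  */
    #define BETA  5.0   /* assumed overhead per scanline stop  */

    double slice_cost(long edges, long scanline_stops)
    {
        return ALPHA * (double)edges + BETA * (double)scanline_stops;
    }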

[Figure: partitioning of the circuit 2 layout into 12 slices.]

Figure 2.14: Partitioning for circuit 2 with 12 processors

[Figure: percentage of total load carried by each of the 12 processing nodes for circuit 2, plotted against the ideal load line.]

Figure 2.15: Load variation for circuit 2 with 12 processors

[Figure: number of scanline stops in each of the 12 partitions for circuit 2.]

Figure 2.16: Number of scanline stops for circuit 2 with 12 processors

2.2.7 Conclusions

A multiple processor architecture must provide speedups for a range of compute-intensive tasks with differing characteristics, to be cost effective in meeting the needs of computer aided design. The minimization of execution time for only one task, such as circuit simulation, may result in another task becoming the design bottleneck.

The results from Parallel Goalie, which for large circuits achieved speedups of 15 when using 17 processing nodes, show that it is feasible to carry out circuit extraction on a general purpose multicomputer. The task of extracting a circuit from the flattened mask description of a complete circuit involves large quantities of data. To process this data, the multicomputer requires either a large amount of primary memory, or the use of secondary storage to hold intermediate results. The cost per megabyte of primary memory packaged on a board is typically more than one hundred times the cost of magnetic disk storage [Hennessy & Patterson, 1990, pp. 518, 556]. Costs will limit the amount of main memory such that it is likely to always lag behind the increasing needs of applications that use a large amount of data. Thus secondary storage will always be necessary for these applications.

Parallel Goalie has simulated the use of secondary storage accessed in parallel, to show how to achieve speedups on a general purpose multicomputer with parallel access to secondary storage. The techniques allow a small scale multicomputer with few nodes to achieve useful speedups in extracting a large circuit, provided there is enough secondary storage available. As processors become faster, disks for secondary storage will become more of a bottleneck.

The use of disks dedicated for use by the multicomputer, which can optimize the file system on these disks for particular applications, allows the full use of the available disk bandwidth.

The efficiency of a multiple processor machine executing a parallel algorithm is a measure of how close its speedups are to the ideal linear case where speedup equals the number of processing nodes. Parallel Goalie has achieved efficiencies as high as 88% for circuits with a regular mask geometry. Results indicate that the quality of the load balance governs the efficiency of the extraction, rather than the communications overhead in merging the results. Thus an area for further research is finding the most efficient way to partition the circuit.
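
Written as a formula (a restatement of the definition above, not a new result): for a speedup S on N processing nodes, the efficiency is

    E = S / N

so the speedup of 15 on 17 processing nodes quoted above corresponds to E = 15/17, or approximately 88%.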

The development of Parallel Goalie has provided experience in using a multicomputer. This has led to the following recommendations for a high performance computing system, which consists of a general purpose computing section (GPCS) and a special purpose computing section (SPCS):

o use a multiple processor architecture,

o provide a high speed interconnection structure,

o use processing nodes with large amounts of local primary memory,

o provide parallel access to secondary storage,

o use a separate file system on the SPCS,

o use the GPCS to provide the user interface,

o provide a familiar software development environment,

o support conventional programming languages,

o provide debugging facilities to aid parallel programming.

Multiple processor computers are a cost effective way of providing speedup over uniprocessor general purpose computers. They are more flexible than special purpose accelerators, as they can execute a variety of different applications, and can take advantage of advances in algorithms. They also provide faster communications, and are easier to load balance, than a distributed network of workstations.

Both the SPCS and the GPCS can take advantage of parallel processing. For the GPCS, which provides software development facilities, and support for the user interactive stages of design, small scale parallel processing is suitable. This is because most of the user interactive applications cannot make effective use of more than 10 processors. A small scale multiprocessor can consist of up to 10 processors interconnected with an industry standard bus. The use of shared memory will ease the development of a general purpose operating system. This simple architecture will soon be cost effective enough to use as a single user workstation. Although many applications will not use more than a few processors simultaneously, independent applications can run in parallel. For example, a compiler may run on one processor while an editor runs on another.

For the SPCS, which provides the support for the compute intensive stages of design, medium scale (with between 10 and 100 processors) parallel processing is suitable. This is because compute intensive applications can use over 20 processors simultaneously. A machine with up to 100 processors allows several designers to execute applications on separate parts of the machine simultaneously. The processors can communicate using a sophisticated communications network. The use of distributed memory simplifies the design of the interconnection network by avoiding the shared memory problem of maintaining coherency between data stored in the local memory of different processors. A distributed memory multiprocessor communicating via messages suits the requirements of event driven circuit simulation algorithms.

High speed communications are essential to allow a multiple processor computer to efficiently execute parallel algorithms that require regular exchanges of information. For many parallel algorithms, the communications speed can be the limiting factor in the speedup available from the multiprocessor.

The major goal in most parallel algorithms is to maximize the amount of work a processor can do with local data before needing to communicate with another node. This requires large amounts of local memory to allow processing nodes to maintain large data structures and keep intermediate results.

To allow the SPCS to process large amounts of data, without having to rely on the GPCS to provide secondary storage, the SPCS needs its own secondary storage system. Distributing this storage among the processing nodes of the SPCS allows parallel access to secondary storage. By using its own file system on this storage independently of the GPCS, the SPCS can optimize the available bandwidth to secondary storage.

By using the GPCS to provide a user interface for the SPCS, the system can use a simple operating system for the SPCS, while taking advantage of the sophisticated user-friendly facilities available on the GPCS. A familiar software development environment on the GPCS, such as the UNIX operating system provides, speeds the development of software for the SPCS. A system providing conventional programming languages, such as C, allows programmers to get immediate use from the system, and take advantage of many existing software development utilities. Although several parallel programming languages are available, none of them are widely used by programmers.

Programmers need additional debugging facilities to help develop parallel programs. Useful aids would include tools to show the activity and status of each of the processing nodes to detect communication problems.

2.2.8 Acknowledgements

I thank Bob Gaglianello, Howard Katseff, Beth Robinson, and Teri Lindstrom for their help and advice in using the HPC/VORX local area multicomputer system, Binay Sugla for his discussions on algorithms, Bruce Hillyer for his discussions on disk I/O, and Bob Clark for providing test circuits. Special thanks to Bryan Ackland for his advice and encouragement.

Chapter 3

Approaches to Accelerating Computer-Aided Design

This chapter will briefly review the different approaches to making faster computers, before it recommends using a Multiple Instruction stream/Multiple Data stream (MIMD) approach for accelerating computer-aided design. It will then discuss the main characteristics of MIMD architectures. Finally it will analyse recent research into parallel CAD algorithms for running on MIMD architectures. On the basis of this analysis, the thesis will recommend using a heterogeneous multicomputer architecture for accelerating computer-aided design.

3.1 Introduction

To reduce the design time of complex circuits, computer-aided design tools need faster computers. As simulation can take many hours, ideally designers would like computers faster than current workstations by factors of a thousand or more, to allow interactive evaluation of circuit performance. By reducing design time, computers will allow designers to take advantage of the benefits of high levels of integration for low production volume applications.

Faster computers are always in demand, because they provide:

o more accurate results in a given time span,

o improved response time for interactive tasks,

o and allow more design iterations in a given time span.

For the same period of time, a fast computer can take calculations closer to convergence than a slower computer, thus providing more accurate results. High accuracy is important to designers who want a high degree of confidence that their computer simulations closely match the behaviour of the fabricated circuit.

For interactive tasks, a fast computer can bring the time to respond to a user's request closer to being instantaneous. This ensures that a designer can maintain concentration on the task at hand, hence improving productivity. Brady discusses how distractions can lower a designer's productivity [Brady, 1986].

Fast computers speed up the time to evaluate the quality of each design iteration. This encourages designers to experiment with more design options, to provide a better chance of finding an optimal solution.

Thus a large amount of research has gone into developing fast computers. The next section will briefly review some of the different approaches to achieving high performance.

3.2 Strategies for High Performance

Researchers have come up with various taxonomies for classifying computers [Giloi, 1988; Hwang, 1987; Haynes et al., 1982; Johnson, 1988; Solchenbach & Trottenberg, 1988; Skillicorn, 1988; Kuck, 1982; Duncan, 1990]. This section follows the classification used in Helin and Kaski's report on the performance of high speed computers [Helin & Kaski, 1989]. They based their classification on Flynn's taxonomy.

Flynn [Flynn, 1966; Flynn, 1972] introduced the following simple terminology for classifying high performance computers:

o Single Instruction Stream - Single Data Stream (SISD)

o Single Instruction Stream - Multiple Data Stream (SIMD)

o Multiple Instruction Stream - Single Data Stream (MISD)

o Multiple Instruction Stream - Multiple Data Stream (MIMD)

A stream is the sequence of data or instructions that a machine sees as it executes a program. There is often disagreement among researchers about the class to which a particular computer design belongs.

Conventional uniprocessor computers, which execute one arithmetic operation for each arithmetic instruction, fall into the SISD category, whereas array processors, which execute the same arithmetic operation on different elements of an array simultaneously, fall into the SIMD category. Finally, multiprocessors that can execute independent streams of instructions on different data fall into the MIMD category. There are no typical examples of the MISD category.

The following sections discuss common types of high performance computer, organized according to the above classification. Some of these types may appear under more than one of Flynn's classifications, but for convenience, this thesis discusses them in the context of only one of these classifications.

3.2.1 SISD

Most conventional general purpose computers are SISD machines. They often use pipelining techniques to improve performance. By breaking the task of executing an instruction into several subtasks, such as instruction fetching, instruction decoding, operand fetching, and operating on the operands, an SISD machine can execute a series of instructions like a production line. The design trade off is between a deep pipeline, to overlap the execution of as many instructions as possible, and a shallow pipeline, to reduce the time to refill the pipeline when there is a branch in the stream of instructions. Ramamoorthy and Li [Ramamoorthy & Li, 1977] give a thorough treatment of pipelining.

SISD machines use the fastest available circuit technology to achieve the highest performance. By increasing the execution speed of simple instructions, such as integer addition, most programs will run faster without software modifications. This protects the investment in existing software.

Modern microprocessors use high levels of integration to incorporate many of the features, such as pipelining, used by traditional SISD mainframes. There are two philosophies in microprocessor design: RISC (Reduced Instruction Set Computer) and CISC (Complex Instruction Set Computer).

3.2.1.1 RISC versus CISC

RISC (Reduced Instruction Set Computer) based microprocessors only execute simple instructions, such as loading a register from memory and adding the contents of two registers, rather than more complex instructions such as adding the contents of two memory locations. They execute most of these instructions in a single clock cycle. CISC (Complex Instruction Set Computer) based microprocessors execute many complex instructions, which often consist of a sequence of simple operations. These complex instructions support many addressing modes, and ease the task of writing high level language compilers.

CISC processors try to improve performance by using additional hardware to replace common sequences of instructions with a single complex instruction that takes less time to execute. Often these complex instructions are too general, however, and a compiler can replace them with several simpler instructions that carry out a particular action in less time. Thus RISC processors concentrate on improving the execution speed of the fundamental instructions by taking advantage of simpler circuitry. RISC processors rely on efficient compilers to take advantage of the simple instruction set. They use the circuit area saved by not supporting complex instructions for additional registers, to avoid going off-chip to access variables. Many high performance workstations are now using RISC technology. CISC processors are incorporating many of the features of RISC processors, so that the distinctions between the two approaches are becoming blurred.

Gimarc and Milutinovic [Gimarc & Milutinovic, 1987] discuss the features of RISC processors in more detail. Stallings also presents a tutorial on the RISC approach [Stallings, 1988].

3.2.2 SIMD

There are several subcategories of processing architectures that fall into the SIMD category:

o processor arrays

  - array processors
  - associative arrays
  - systolic arrays

o vector processors

o multi-operation processors

  - VLIW (Very Long Instruction Word) processors
  - dataflow processors

3.2.2.1 Processor arrays

Most processor arrays are designed for special purposes, such as image and signal processing, where the algorithms possess properties such as regularity, recursiveness, and locality, which a processor array can exploit [Kung et al., 1987].

Array processors consist of multiple arithmetic units executing the same operation on different data synchronously. A control unit broadcasts an instruction to all the processing elements. These elements may make minor modifications to the instruction, such as altering operand addresses, or they may ignore the instruction depending on the state of a flag bit. Array processors are suitable for well structured problems involving large arrays of data. The MPP (Massively Parallel Processor) and the Connection Machine are examples of such machines [Batcher, 1980; Tucker & Robertson, 1988].

Associative array processors contain one processing unit for each word, or group of words, in memory. They operate on the memory as a whole, on a bit by bit basis, allowing them to search the entire memory simultaneously. They can select memory words by their contents rather than by their address, hence the term associative memory. This makes them suitable for database applications. The STARAN is an example of such a machine [Batcher, 1974]; it uses 256 bit words. The Connection Machine can also provide an associative memory function [Waltz, 1987].

Systolic arrays consist of arrays of special purpose arithmetic units. Data moves through these units in a single direction. Different sets of data can enter the array from different sides. Systolic arrays pump data in a way analogous to a heart beat: in one phase all the arithmetic units do a calculation; in the next phase they pass the result on to the next unit. They achieve speedup via the pipeline principle. Their interconnection network is optimal for the application. They are ideal for doing matrix multiplication and Fourier transformations, and can support high input data rates. They can provide a useful preprocessing function before passing data on to a more conventional general purpose computer.

Both SIMD and MIMD implementations of systolic arrays are possible. Matrix-1 is an example of an SIMD systolic array [Foulser & Schreiber, 1987]. Wavefront array processors work on the same principle as systolic arrays, except they work asynchronously. Kung presents a good overview of the principle of systolic arrays [Kung, 1982]; Fortes and Wah give a more recent overview [Fortes & Wah, 1987].

3.2.2.2 Vector processors

Supercomputers make use of multiple pipelined functional units that can operate in parallel to achieve high speed beyond that possible with an SISD architecture. For example, the CRAY-1 has 12 functional units, including separate units for vector addition, floating point addition for both vectors and scalars, scalar addition, and memory address addition [Russell, 1978].

A vector is an ordered set of elements, whereas a scalar is a single independent element. Vector computers minimize the control, decoding, and reconfiguration overhead of executing many identical operations, such as vector addition and multiplication, by issuing a single vector instruction. This instruction sets up the functional units once, and specifies the start and end of the vector. Once a vector operation produces its first result, it produces subsequent results at the rate of one per clock cycle. The initial latency comes from the time taken to pass through the pipe of a functional unit. Chaining allows the result of one functional unit to be fed into another functional unit, instead of back to a register.
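
As an illustration, a single vector-add instruction performs, in effect, the work of the whole scalar loop below (written in C for clarity; it is not any particular machine's instruction set). The loop bounds correspond to the start and end of the vector, specified once in the instruction:

    /* The work one vector-add instruction replaces: after the functional
     * unit's pipeline fills, one element of c emerges per clock cycle. */
    void vector_add(const double *a, const double *b, double *c, int n)
    {
        for (int i = 0; i < n; i++)
            c[i] = a[i] + b[i];
    }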

While a vector operation is in progress, a vector computer can initiate another instruction that uses unassigned functional units. In this way, a vector computer can execute multiple operations in parallel from a single instruction stream.

These computers are very efficient for applications that involve many repetitive operations with regular structures, such as vectors and matrices, but are not cost effective for applications that involve mainly scalar operations. This leads to Amdahl's famous law, which states that a machine will waste a major performance improvement in processing vectors, unless a similar performance improvement occurs executing scalar operations [Amdahl, 1967]. This is because most applications spend a significant percentage of their running time on non-vectorizable data management tasks. These data management tasks will dominate the running time, if only the vectorizable sections of code benefit from performance improvements.
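
Amdahl's argument can be stated as a formula (the standard form, supplied here for clarity): if a fraction f of the running time is vectorizable and the vector hardware speeds that fraction up by a factor s, the overall speedup is

    S = 1 / ((1 - f) + f / s)

For example, with f = 0.8 and s = 10 the overall speedup is only about 3.6, and even an infinitely fast vector unit cannot push it past 1 / (1 - f) = 5.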

3.2.2.3 Multi-operation

Multi-operation processors can issue multiple operations per cycle. This is distinct from vector processors, which can only issue one instruction per cycle, and achieve parallelism by issuing more instructions during the execution of the previous instruction. The goal is to take advantage of instruction level parallelism, by using multiple functional units to process multiple instructions in parallel.

VLIW (Very Long Instruction Word) machines typically have instruction words hundreds of bits long, to specify operations with multiple functional units. They need an efficient and complex compiler to rearrange program code to extract the maximum parallelism from the resources available. They are suitable for code that is not vectorizable, but may have several instructions in an inner loop that can execute in parallel. Researchers have found that the average instruction-level parallelism in unoptimized code is about 2 [Jouppi & Wall, 1989; Smith et al., 1989]. The iWarp microprocessor is a recent example of this type of architecture [Cohn et al., 1989].

Jouppi and Wall define the term superscalar to describe a machine that can issue multiple instructions per cycle [Jouppi & Wall, 1989; Murakami et al., 1989]. The intention is to provide performance improvements for ordinary non-vectorizable code. The instruction decode unit of a superscalar machine looks at the incoming sequence of program instructions, and decides which instructions can execute in parallel. It does this by considering the availability of functional units, and the data dependencies between instructions. The concept differs from VLIW machines, where the multiple operations specifiable in one long instruction word do not exceed the resources of the machine. A superscalar machine selects the operations to perform in parallel at run time, whereas this is done at compile time for a VLIW processor. The potential performance improvements with superscalar machines are similar to those mentioned earlier for VLIW machines.

A recent review paper by Weiss discusses the above multi-operation architectures in more detail [Weiss, 1989].

Dataflow machines work on the principle of executing an instruction when its operands are available. They use multiple functional units to execute these instructions in parallel. Their goal is to extract all the instruction parallelism available. Data flow graphs show the parallelism available in a program. Dennis describes the basic concepts of dataflow machines [Dennis, 1979]. A pure dataflow machine extracts the parallelism at run time. Other approaches rely on the programmer or the compiler to specify the order of execution. Dataflow is not cost effective where most of the parallel processing gain can come from exploiting data parallelism [Giloi, 1988]. The Cydra 5 supercomputer is a recent example of an SIMD implementation of a dataflow machine [Rau et al., 1989]. It uses a modified version of dataflow called directed-dataflow, and uses a Fortran compiler to allow existing FORTRAN source code to run on the machine.

3.2.3 MIMD

MIMD architectures can process multiple instruction streams simultaneously. They are the most flexible approach to high performance processing, and can simulate many of the architectures mentioned previously. They can exploit data parallelism by partitioning a problem into sections, and solving each section independently in the same way as array processors. Each processor operates independently, however, and can follow data dependent program branches. They can also use program (or functional) parallelism by partitioning a program into subprograms, and distributing these independent subprograms among the processors. They are not as fast or efficient as special purpose hardware based on array or pipeline processing principles, but offer the opportunity of accelerating the running time of a wide range of problems within one application area. Each processor is more complex than an equivalent processor in an SIMD architecture, because of the need to do independent instruction fetches.

MIMD machines allow the use of a more cost effective circuit technology (e.g. CMOS) to achieve the same level, or higher levels, of performance at less cost, compared to a conventional SISD machine using the fastest available circuit technology (e.g. ECL). Where cost is no object, they allow calculation rates above what is available from a uniprocessor approach using the fastest available technology.

3.2.4 Summary

The algorithms used in computer-aided design cover a wide range of computational techniques (see Chapter 1). Many of these algorithms do not have sufficient regularity to make use of array processors or vector processors. The most flexible environment that will support a wide range of design tools is the MIMD environment. Within each independent processor in an MIMD machine, techniques used in multi-operation architectures can take advantage of instruction level parallelism.

The next section will describe the characteristics of MIMD architectures in more detail. This will provide some background for understanding the terms used in the discussion on parallel CAD algorithms for MIMD machines in the following section.

3.3 Characteristics of MIMD architectures

The next sections will discuss the following parameters of MIMD architectures:

o circuit technology

o the two main subcategories of MIMD computers

  - multiprocessor - shared memory
  - multicomputer - distributed memory

o number of processors - grain size

o type of processing node

o interconnection network

Several papers discuss the performance of commercially available MIMD machines [Gehringer et al., 1988; Wasserman et al., 1988; Helin & Kaski, 1989]. Dongarra brings together several papers on experimental parallel architectures [Dongarra, 1987].

3.3.1 Circuit Technology

This section will briefly review the main circuit technology options for building computers.

Giloi [Giloi, 1988] has identified the two main competing circuit technologies for integrated circuits for high performance computers as VHSIC (Very High Speed Integrated Circuit) and VLSI.

VHSIC technology has the following characteristics:

o ECL/GaAs fabrication,

o low to medium scale integrated circuits (up to 1000 transistors per circuit),

o high power consumption - requires expensive cooling,

o high packaging costs,

o high design cost,

o and 3 - 10 ns clock cycle time.

VLSI technology has the following characteristics:

o NMOS/CMOS fabrication,

o very large scale integrated circuits (100,000 to 1,000,000 transistors per chip),

o low power consumption,

o many standard components - decreasing design cost,

o high production volume components - low cost,

o compact packaging,

o and 30 - 100 ns clock cycle time.

Until recently, the fastest supercomputers have used VHSIC technology, along with vector processing, to achieve high performance. This is a uniprocessor approach: even though there are multiple functional units, these units cannot carry out independent processing. Successive machines have achieved higher speeds by increased clock rates and improved packaging. The limits in vector processing have forced manufacturers, such as CRAY, to use small scale (< 10 processors) parallel processing techniques to get the next advance in speed [August et al., 1989]. These machines still use expensive VHSIC technology, and use parallelizing compilers to improve the performance of existing software. The cost of these machines is normally beyond the budget of everyone except the largest government organizations and corporations.

The characteristics of VLSI technology have allowed the concept of microprocessor based personal computers. As manufacturers have increased levels of integration and circuit speed, these computers now rival the early mainframes for raw integer processing power. Many new manufacturers are now attempting to take on the current VHSIC supercomputers, by using medium to large scale parallel processing architectures based on cost effective VLSI technology, to provide the same performance at much less cost. Bell [Bell, 1985] discusses the advantages of microprocessor based multiprocessors. To compete with current VHSIC supercomputers they must achieve at least a factor of 10 speedup from the use of parallelism [Giloi, 1988]. This can be hard to achieve with some problems. As levels of integration increase, adding supercomputer features such as vector processing to mass produced VLSI processing nodes will become more cost effective.

3 .3 .2 M ult ipro cessor

A Multiprocessor consists of multiple processols that execute independent streams of instructions, and share a comnlon memoly address space' If tlvo or more processors access the same memory location, hardware and software protocols must arbitrate among the plocessors. This shared memoly can become a bottleneck to per{ormance fStenström, i988]. Often the plocessors use cache memoly to reduce the frec¡uency of main memory accesses, and provide plocessors u,ith faster access to memory. The cost of providing high

speed memory access and efficient arbitration limits the number of processors in this type of computer. A system rvith shared memory is easier to plogram,

as processors can share variables ancl data structures. Mechanisms ale nec-

essary to coorclinate access to sharecl r-ariables. Dinning gives an overview of

some of these synchronization techniques [Dinning, 1989]'

95 Recent supercomputels such as the CRAY X-IV{P [August et al., 1989] and the Conver C2 [Jones. 19E9] have used multiplocessor architectures with up to 4 processing nodes. The processors use high speed interconnection nettvorks to the shared memory. Lolver cost mini-supelcomputers such as the Sequent Balance [Thakkar et al., 1988] use a shared bus to interconnect the processors and shared memory.

3.3.3 Multicomputer

A \{ulticomputer (also called a message based multiprocessor) consists of in- terconnected independent computers. Each computer has its own processor ancl memor1,, rvith an independent address space. Pt*ocessors communicate

þ1, passing messages. Softu'are developels arrange for each processor to have most of the infolmation it needs, to minimize the number of messages. When there are large nunbers of processors, each processor only connects directly to a subset of other pr-ocessors, to reduce the number of connections needed. lvlessages l-retrçeen processols rrot directly connectecl pass through either in- termecliate plocessors ol routing nodes. Tasks that recluire extensive inter- communication leside on neighbouring processors for efficiency. These com- puters har-e less hards'are complexity than multiprocessors, ailorving mol'e processors. but thel,- are harder to program. They are a loosely coupled architecture. in contrast to the tightly coupled multiprocessor architecture. Athas and Seitz cliscuss these computers in more detail fAthas & Seitz, 1988].

Commercial examples include the NCUBE [Hayes et a,1., 1986] and the In- tel iPSC/2 computels: Ranka et al. discuss some of the considerations in

programming such machines [Ranka et al., 1988]. The BBN Butterfly com- puter has a multicomputer architecture, but allos's the processors to treat the memot'\' on the other nodes as palt of a single address space' as for a multiprocessor' [Clol'ther et al., 1985; Waterman, 1987]'

96 3.3 .4 Grain S ize

A parallel program consists of tasks that can run simultaneously. Stone

[Stone, 1987] defines granularity as a measure of the size of an individual task on a parallel machine. Howe and Moxon [Horve & Moxon, 1987a] define it as a way of expressing the ratio of computation to communication in a parallel program. In coarse-grained parallelism, the ratio is high, and the individual tasks are large and requile little communication. For example. these tasks may consist of several subroutines, and may be as large as a typical uniprocessor program. The tasks in medium-grained parallelism are smaller, for example simple loops of a subroutine. In fine-grained parallelism, the ratio is lolv, and the tasks are very small and simple. These tasks ma¡' consist of a single instruction, and have a high level of inter-communication. A softwale developer trades off bets'een getting the highest degree of par- allelism in a problem with fine granularity, rvhich offers the potential of the highest speedup, and minimizing the amount of communication, to reduce the overall run time of a task.

The number of processors in a parallel conputer determines the grain of parallelism available to application programs. The follolving classification shows the correlation between the grain of parallelism and the number of

available processors:

o Very Coarse: 2 - I0 processors

¡ Coarse: 10 - i00 processors

o Medium: 100 - 1,000 plocessors

o Fine: 1,000 - 10,000 processols

o Very Fine: ) 10,000 pÌocessors

97 Very coarse glained systems use a few (< 10) high pelfolmance complex pro- cessing nodes. 'Ihese plocessing nodes may be lecent 32-bit microprocessols incorporating floating point support and internal caches. They are often bus based, ancl execute tasks that easily divide into subtasks rvith little inter'- communication. Coarse grained s1'stems may still use a bus, but there may be several processors per bus module.

Fine grain systems use sirnple plocessing nodes. and need tasks that are capable o{ fine subdivision. These processing nodes usuaìly use custom de- signed circuits, and can integlate ser-eral processing ncloes into one VLSI circuit. Ferv ver-y fine systems are available. The Connection N4achine, with an SIMD architecture, is t,he most notable erample [Hillis, 1982; Tucker & Robertson. 1988; Hillis & Steele, 19S6]. It uses 64,000 one bit processors that each run the same instluction stream on diffelent pieces of data. The

MasPar is a more r-ecent erample rvhich can support r-rp to 16,384 ptocessors with 4-bit rvide integer ALUs [Blank, 1990; Nickolls, 1990; Christy, 1990].

The finei- the grain of the sr-stem, the hardel it is to subdivide a task. At high levels of subdivision the subtasks usually need to communicate often. This implies that flne grain alchitectures need efficient comrnunication netlvorks.

Eager eú a1. discuss the tracle offs betrveen speeclup ancl efficiency for the number of processors applied to a particular ploblem [Eager et al., 1989]. They conclude that the a\-erage parallelism ai'ailable in a ploblem, n'hich is the avera,ge number of processols bus¡' dur-ing pÌoglam execution assuming an unlimited number of processors. rvill help cletelmine horv many processors to use. Applications using too many processor-s rvill have lorv efficiencies (ratios of speedup to the number- of processors), as some processors will be idle waiting for otheÌ processoLs, hence r-educing the cost effectiveness of a palticulal implementation. Applications using a 1èrl' plocessors u,ill have high efficiencies, but rvill not achieve the marimum potential speedup, as

9S some calculations that could proceed in parallel will be waiting for a free processor.

Curves sholving the speedup in the execution time versus the number of processors often show an asymptotic characteristic. Once the number of pro- cessors passes the knee in the curve, the speedup only increases marginally. A number of processors equal to the average parallelism rvill yield a speedup close to the knee of the cu-rve.

3.3.5 Type of Processing Node

Coarse grained palallel machines tend to use complex, high performance pro- cessing nodes. Some machines use processing nodes that incorporate support for vector- plocessing. Vector processing offers substantial performance gains for calculations that involve operations with large vectors and matrices. For- application aleas that cannot use this facility, it is better to allocate the cost of pror-iding vector processing to provicling faster or additional memory.

Some machines provide demand paged virtual memory support, at the cost of additional software and hardware complexity. Machines with this facil- ity are usually coarse grained, shared memory, bus-based systems. Demand paged virtual memory allows applications to execute assuming there is mole memory than is physically available. Secondary storage provides this illu- sion b¡'storing parts of an application's memory space, and swapping these parts in and out of main memory rvhen required by an application. Denning describes the concepts of virtual memory in detail [Denning, 1970].

Medium grained machines often use standard microprocessors to reduce the cost per processing node. Most machines have used CISC microprocessors rvith scalar floating point support. Now that RISC technology has matured, the next generation of machines may use micropÌocessors incorporating RISC

99 features

Fine grained machinæ use very'simple processing nocles that require custom design. The aim is to provide huge numbers of these nodes by taking advan- tage of a simple design, enaÌ,ling multiple processing nodes per integrated circuit.

3.3.6 Interconnection Network

A parallel n-iachine's intercorrrection netrvork has a major influence on how the system l'ill perform for -uarticular applications. Nlany of the concepts of telephone srvitching appll' :o processor interconnection networks' A fully inter-connected system using a crossbar srvitch guarantees that every proces-

sor can access any resour-ce e: any time. This scheme is expensive and the

number of connections lise a,s -V2. u'here N is the number of nodes. Alterna- tives to this scheme include nultistage srvitching netrvorks (as for telephone switching), shared buses, and point to point connection schemes.

In point to point connection schemes. proce-sors connect to a subset of the other processots. A networi can arlange the topology of point to point connections to slit the comrnunications structure of particular applications. Special purpose parallel machines often use structutes such as two and three dimensional meshes. rvhele ca.culations onlS'involve nearest neighbour inter- actions. Other sttuctures, such as h1-percubes' try to reduce the maximum number of inter-mediate nod.: that separate any trto nodes in the network

[Wiley, 1987: Hayes .\: ÀIudge. 19S9]-

The choice of connection net\Çork depends on the number of nodes, the type of application to run on the nachine. and the cost'

networks' Feng [Feng. 1981] provides a thorough sllr'\-ev of intelconnection Broomell ancl Heath fBroon:ell & Heath, 19S3] discnss the categolies and

100 historical development of cilcuit slvitching topologies. Seigel fSiegel, 1 'J presents a study of interconnection netrvorks for large-scale parallel process- ing in text book form.

Feng identifies four design decisions for a netrvork:

o operation mode - asynchronous' svnchronous or combined

o control strategy - centralized or distributed

o su'itching method - circuit, packet or integrated

o netrvork topology - static, or dynamic

Synchronous operation provides for instruction or data broadca-'t. Asyn- chronous operation allows connection requests to be issued dynamically dur- ing the running of a program. Some netrvorks combine these trvo approaches.

Centralized control uses a centra,l controller to set up the netu'ork switching elements to route a message. Distributed control uses the individual switch- ing elen'rents to route the message according to infolmation in the message

header.

Circuit srvitching establishes a physical path betu-een a source and destina- tion before sending data. Packet switching sends clata in packets that pass through intermediate switches, without ever establishing a complete physi- cal connection between souÌce and destination. It allorvs a source node to send a packet into the netrvork, and let the netrvork route the packet to the destination.

A static topology uses dedicated links betu'een netrvork nodes. Exampies in- clude mesh, hypelcube and ling topologies. A dynamic topology uses active srvitches to dynamically configure the links betrveen netu'ork nodes. Exam- ples inclucle multi-stage and crossbar netu'orks.

101 Jesshope et al. revierv packet routing techniques [Jesshope et al., 1989], for processor networks including:

o store and forrtard,

o worrrihole,

o virtual cut-through,

o and double-buffering

Store and forward net'*,orks completely store each packet in an intermedi- ate node before transmitting to the next node. This causes a hìgh latency in transmission, 'rvhich is proportional to the length of the packet and the number links traversed.

Wormhole r-outing reduces the latency of store and forward networks. It does this by advancing the head of an incoming packet to the next node, without waiting for the rest of the message to arrive, thus sending messages in a pipelined fashion. \odes only need to buffer the smallest unit of a packet called a fl.it. If the header flit blocks, all flits in the netrvork stop advancing, which ties up the communication links'

Virtual cut-through routing is similar to wormhole routing, but uses buffers to store messages rvhen they block. This frees up the pr-eceding communication links for other messages, hence increasing thloughput.

Double-buffering routing uses buffers at the sender and receiver. If a message blocks, the sencler rtill try to retransmit the whole message after a delay. The receiver buffers a nessage until the local processor is ready to process it.

r02 9.4 Parallel processing for CAD

This section u,ill revierv parallel processing algorithms for solving some of the compute intensive problems in the computer-aided design of integrated circuits. \{ost of these algorithms make use oi general purpose MIN'ID parallel processing architectures. From a study of these algorithms, the thesis can determine the important characteristics needed by a NIIMD architecture to support the design of integrated circuits'

The section is o¡ganized, in a similar rvav to the discussion of CAD algorithms in Chapter 1, as follo'rvs:

o Synthesis tools

o Static anall'sis tools

o Dynamic analysis tools

3.4.L Synthesis Tools - Sirnulated Annealing

Researchers have designed several special purpose architectures to aid the time consuming steps of design synthesis. such as placement and routing' which Blank fBlank, 1981] discusses several examples of these ar-chitectures, mostly use pipeline and arra¡i processing techniques'

In the area of general purpose parallel processing, simulated annealing (Sec- tion 1.2.1.2) has attracted the most attention from researchers' It is an extremely time consuming method for optimizing designs, but can offer the highest quality solution; thus making it an i,leal problem to show the advan-

tages of parallel processing'

Kravitz ancl Rutenbar [I{ravitz & Rutenba:. 1987] gir-e an overvierv of the strategies foÌ parallel simulatecl annealing in the context of stanclard cell 103 placement (Section 7.2.2'4). The t-ivo main approaches are

o move-decomposition,

o and parallel-move. lVIoue-d,ecomposition works by decomposing the task of evaluating the cost function, after moving a cell or swapping the positions of two cells, into sev- eral subtasks that can run in parallel. Both functional and data parallelism are possible. In functional parallelism each plocessor performs a specific subtask, such as rvire length evaluation, whereas in data parallelism each processor cloes all the calculations for a subset of the cells in a design. The decomposition of a task is either static or dynamic. Static decomposition allocates cells or functions to processors at initialization, wheleas dynamic decomposition allorvs the allocation to vary in response to the load on each processor. The dynamic strategy suffers increased overhead during run time, but avoicls load balancing problems. Move-decomposition has limited poten- tial data parallelism, since each move only influences a small proportion of cells.

The pamllel-moue strategy executes multiple, complete moves in parallel' If

each processor chooses moves independently, it is possible for the moves to

affect each other. A processor may accept a move because the cost function is better locally, but the effect on the global solution may be worse' The trvo approaches are to allow interacting parallel moves, even though this rvill affect the convergence of the simulating annealing algoÌithm, or to prohibit the parallel acceptance of interacting moves. The first approach assumes that moYing only a few cells out of many cells will have a negligible effect on the convergence properties of the algorithm' The second approach can either bias the initial selection of moves, to ensttre processols only evaluate non-intelacting moves, or filter out the effects of accepting interacting moves'

104 I(ravitz a,ncl Rutenbar clid experiments on a VAX"' IIl784 multiproces- of sor, consisting of four VÄX"nn 11/7S0 plocessors connected to S-N4bytes en- sharecl memory, running the MACH operating system. They stored the tire database for the placement problem in shared memory. By using static functional clecomposition rvith 3 processors, they achieved a speedup over a serial algoÌithm of about 2. By splitting each move into up to 15 subtasks, and using a woÌk queue to implement dynamic decomposition on up to 4 processor-s, they founcl that the scheduling overhead outweighed the advan- tages of clynamic scheduling. They developed a parallel move str-ategy that pr.eserved the integlity of the simulated annealing algorithm by only accept- ing one move at once, ancl aboÌting any other move evaluations in progress at the time of accepting the move. This strategy is only effective at lorv of temperatures in the annealing algorithm rvhere the algolithm rejects most the moves. Thus they pr-oposed using a move decomposition strategy at high temperatures, ancl a parallel mo\/e decomposition at low temperatures. Their net strategy achievecl speeclups of about 2 rvith 4 plocessors' They concluded that with a large multiprocessor-, clusters of plocessors could execute each move, ancl multiple clustels could erecute sevetal moves in parallel'

Durand gives an extensile overviett' of some of the other approaches used by

researchels for stancìard cell placement fDurand, 1989]'

Rose eú ¿/. usecl a multipÌocessor consisting of six National Semiconduc- a common tor 32016 microprocessot's, each rvith I Nlbyte of local memory' MULTIBUS backplane, ancl 1 X'Ibyte of global memory [Rose et al', 1988]' They replacecl the high ternperature part of the annealing algorithm with a heuristic, rvhich usecl the pÌocessor-s in parallel to generate about 10 diverse' but plausible, solutions. Even $,ith one pr-.ocessor, this method is about 1'5 the lo$'est times as fast as stanclar-cì annealing. They chose the solution with cost function to anneal fr-rrther at lorver tempeÌatuÌes.

105 During the low temperature annealing, each processol' had its own copy of the database containing the cell locations, and is responsible for a portion of the chip area. A processor can displace cells from its assigned area to another processor s area. but it can only exchange cells rvithin its own area' If the number of cells in a processor's area drops belorç 75% of its original level, the algorithm lvill rebalance the cells among processors. Each processor performs its own nìoves asvnchronouslyl rvhen a processor accepts a move it transmits this information globaily to the other processors. At low temperatures the number of morre acceptances is lorv, hence leducing the amount of global communication needed. Errors occur betrveen the time a processor mo\€s a cell, and rvhen it informs other processors of the move; if another processor makes a move during this time it is acting on inaccurate information.

Rose ef a/. found that their solutìon still conr-erged to the same final cost function as regular annealing. They have predicted a total speedup over standard simulated annealing of betrveen 10 and 13 using 10 processors. As

Durand has concluded. the con\-ergence of the annealing technique is resistant to small Iocal errors in calculation.

Casotto ¿l ¿/. used an eight processor Sequent Balance [Thakkar et al', 1988], rvhich is a shared memory multiprocessor [Casotto et al., 1987]. They stored the cell data base in shared memor-J¡, and used a single parallel annealing al- gorithm tbr all temperatures that uses processors to do moves asynchronously and in parallel. They dynamically varl' the assignment of cells to processors to reduce the effects of errors. They achieve a speedup of about 6 using 8

processors.

Darema and Pfister [Darema & Pfister, 1987] used a six processor IBN{ 3090 multiprocessor to simulate the operation of their parallel simulated annealing algorithm on the expelimental IBNI RP3 (Research Parallel Processing Pro- totype) iG.F. Pfister et al., 19S71. The RP3 is an I\'IIMD multiprocessor that

106 can accorrìmodate up to 512 processing nodes. Each processing node has a 32-bit microprocessor- from the IBM PC RT, vector and scalar floating point support, a 32 kilobyte high-speed cache memory, and up to 8 Mbytes of main storage. The entire memory system, with up to 4 Gb1'tes of main memoly' is clynamically configurable into global and local sections during application execution. Accesses to any other processor)s local memor\'' take a uniform

1.6 times as ìong as local references. This is a feature of the interconnection network that avoicìs applications having to take into account the physical location of clata outside local memory. Darema and Pfister point out that this flexible memory system allows the nachine to adapt to verr- different algorithn'r characteristics. Greater peak speed is possible rvith local compu- tations and message passing, but access to large, compler data structures, and the er-en distribution of irregular amounts of rtork. are easier to achieve rvith shared memorY.

Darema and Pfister store a single copy of the cell placement data in shared memory, and use multiple processors to do moves in parallel. Each processor randomly selects two cells, locks them, and evaluates the effect on the cost function of exchanging their locations. When a processor accepts a move it updates the shared database. Darema and Pfister evaluated several techniques to reduce the error introduced by allowing simultaneous moves: the best technique synchronized the database each time it changed the temperature. Their simulation predicted speedups of 14 using 32 processors, but this was with only 81 cells; they expect better efficiencies for larger circuits.

Jones and Banerjee developed a parallel annealing algorithm for standard cell placement for a 64-node Intel Hypercube using the Intel iPSC Simulator running on a Sun 3/50 workstation [Jones & Banerjee, 1987]. The Intel Hypercube is a distributed memory multicomputer that uses message passing to communicate between processors interconnected in a hypercube topology. They assigned an equal area portion of the chip to each processor, and initially assigned cells randomly to different processors. At each time step, pairs of processors participate in a move operation. A master processor can exchange cells, or displace a cell, within its own area; or it can exchange cells with, or displace a cell to, a slave processor. Between iterations, the algorithm alternates a processor's role as slave or master. When processors accept a move, they send the move information to other processors to update local databases. Jones and Banerjee have estimated a speedup of between 11 and 21, using a 64-node machine, over a uniprocessor version of the algorithm.

More recent work by Sargent and Banerjee improved on the work of Jones and Banerjee for the Intel hypercube [Sargent & Banerjee, 1989; Banerjee et al., 1990]. They used a different partitioning scheme that allocated to each processor in the hypercube an entire row or set of rows of the standard cell layout. This scheme localized the computations needed to calculate the cost function. The simulated annealing algorithm used an adaptive cooling schedule based on the characteristics of the cost distribution and the annealing curve. Sargent and Banerjee included techniques to control the error associated with allowing several parallel moves to occur in sequence before doing a global cell position update. Sargent and Banerjee reported that the resulting algorithm performed better than the Jones and Banerjee algorithm, and experimental results showed close to linear speedups on the hypercube for large circuits.

Durand [Durand, 1989] reported work by Jayaraman and Rutenbar [Jayaraman & Rutenbar, 1987], who developed an algorithm for floorplanning on the Intel Hypercube. They used a combination of move decomposition and parallel moves to achieve speedups between 4 and 7.5 with 16 processors.

Durand [Durand, 1989] also reported research on parallel simulated annealing algorithms on the massively parallel, SIMD Connection Machine using 16,000 processors [Casotto & Sangiovanni-Vincentelli, 1987; Wong & Fiebrich, 1987]. Sequences of adjacent processors store data structures for each cell and net. Each processor is responsible for maintaining its portion of the data structure. At each iteration several hundred cell data structures consider a move simultaneously. The researchers reported that their algorithm converged despite the errors induced by simultaneous moves.

Kling and Banerjee developed a new algorithm called Simulated Evolution, which is analogous to the process of natural selection in a biological environment, for standard cell placement [Kling & Banerjee, 1989; Kling & Banerjee, 1990]. The algorithm determines a goodness value for each cell in a placement, which is high when neighbouring cells cluster close together and indicates the survival chance of a cell in its present location. The algorithm removes from the placement all cells with a goodness below a random number, and then constructively re-inserts these cells into the placement to generate the next placement in the iteration. Periodically the algorithm randomly changes, or mutates, the placement to avoid falling into local minima.
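
The selection step described above can be sketched as follows, assuming a precomputed goodness value in [0,1] for each cell; the array names and the uniform random threshold are illustrative only, and the actual goodness measure and re-insertion heuristics are those of Kling and Banerjee's papers.

    #include <stdlib.h>

    #define NCELLS 1000

    double goodness[NCELLS]; /* survival chance of cell i at its location */
    int    removed[NCELLS];  /* set to 1 if cell i must be re-inserted */

    /* Selection step of Simulated Evolution: a cell survives only if its
     * goodness exceeds a fresh uniform random number, so poorly placed
     * cells are the most likely to be ripped up and re-inserted. */
    void select_cells_to_move(void)
    {
        int i;
        for (i = 0; i < NCELLS; i++) {
            double u = (double)rand() / (double)RAND_MAX;
            removed[i] = (goodness[i] < u);
        }
    }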

Kling and Banerjee implemented a parallel version of the Simulated Evolution algorithm on a network of Sun workstations. A master processor sends the locations of all the cells on a placement grid to slave processors at the beginning of each iteration. Each slave processor executes the simulated evolution algorithm on a subset of the cells, and transfers the new placement back to the master processor ready for the next iteration. Kling and Banerjee reported speedups close to 4 using 4 workstations for large circuits.

Brouwer and Banerjee implemented a parallel simulated annealing algorithm on an Intel Hypercube with 16 processors to solve the channel routing problem [Brouwer & Banerjee, 1988]. Each processor was responsible for an equal number of routing tracks in a routing channel. The algorithm distributed the nets to route among the processors. The processors pair up to allow net exchanges and net displacements in a similar way to Jones and Banerjee's standard cell placement technique. During the operation of the algorithm, connections between nets can overlap over the same routing track; the objective is to reduce this overlap to zero. Brouwer and Banerjee reported near linear speedups for large circuits.

Zargham implemented a parallel channel routing algorithm on a Sequent Balance 8000 (a shared memory multiprocessor) using up to 6 processors [Zargham, 1988]. The results for one example showed a speedup of 2.6, with no improvement beyond 4 processors.

Brouwer and Banerjee have recently implemented a parallel version of a hierarchical global router, on an Encore Multimax (an eight processor shared memory multiprocessor), that produces high quality routes [Brouwer & Banerjee, 1990]. They achieved speedups greater than 6 for up to 8 processors, and predict that their algorithm is scalable to larger numbers of processors.

Rose evaluated several techniques for speeding up global routing on the Encore Multimax [Rose, 1988]. The two most promising methods were distributing the task of routing complete multi-terminal wires among processors, and distributing the task of evaluating the alternative routes of a two terminal segment. Both methods used a shared cost array that contained, for each routing position in a channel, the number of wires passing horizontally and the cost of traversing a row of circuit elements into the next channel. Processors update this array as they lay or rip up wires. Rose achieved speedups of 7.6 with 8 processors for large circuits with wire-based parallelism, and 4.6 using 8 processors with route-based parallelism. Rose stated that the two techniques can be combined when more processors are available.
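
The shared cost array at the centre of Rose's scheme can be pictured with the sketch below; the field and function names are invented for the illustration, and the real data structure is described in [Rose, 1988]. Each processor sums the congestion along a candidate route before committing to it, and updates the array as it lays or rips up wires.

    #define NCHAN 64    /* routing channels (rows) */
    #define NCOL  256   /* routing positions per channel */

    /* One entry per routing position, shared by all processors. */
    struct cost_cell {
        int wires;      /* wires passing horizontally at this position */
        int cross_cost; /* cost of traversing into the next channel */
    };
    struct cost_cell cost[NCHAN][NCOL];

    /* Estimate the cost of routing straight along channel `chan' from
     * column c0 to c1 inclusive; a router would evaluate each
     * alternative route this way and pick the cheapest. */
    int route_cost(int chan, int c0, int c1)
    {
        int c, total = 0;
        for (c = c0; c <= c1; c++)
            total += cost[chan][c].wires;
        return total;
    }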

3.4.1.1 Summary

The most time consuming computation steps in circuit synthesis are in the optimization tasks such as placement and routing. Optimizing a design can lead to substantial gains in performance and yield.

Simulated annealing provides high quality solutions, but with a high computation cost. Researchers have implemented parallel algorithms on both shared memory multiprocessors and distributed memory multicomputers. The most common technique is to allow processors to evaluate changes to the design in parallel. The machines with shared memory tend to keep a single copy of the current cell layout in shared memory, whereas distributed memory machines tend to distribute this information among processors, because no single processor typically can accommodate the entire design. Results quoted for shared memory machines are typically for up to 10 processors, where good processor utilization ratios are evident. Distributed memory implementations quote results for up to 64 processors, but suffer from low utilization ratios for high numbers of processors, presumably because of the communications overheads of keeping other processors informed of design decisions. It would be useful to see results for algorithms implemented on tightly coupled shared memory machines with up to 100 processors, to see how the algorithms scale with the number of processors.

The best architectural solution, as Darema and Pfister pointed out, appears to be to support both shared and local memory. This supports the use of large shared data structures, and allows processors to have their own local memory for storing their own copy of program code and for local computations.

So far parallel algorithms for simulated annealing on MIMD processors have shown maximum levels of concurrency of about 20 for large circuits. The overall computation time will depend heavily on the power of the individual processors. Clearly, for this level of available parallelism, a parallel machine must consist of processors that each at least match the performance of current uniprocessor workstations, to provide a performance advantage over a modern uniprocessor workstation.

Techniques other than simulated annealing have shown promising results for routing.

For other design synthesis tasks, such as interactive design input, individual cell generation, and compaction, which are not as compute-intensive, a modern uniprocessor workstation is appropriate.

3.4.2 Static Analysis Tools

3.4.2.1 Design rule checking

Geometrical design rule checking is amenable to parallel processing because of the locality of the design rules, allowing multiple processors to check different areas of a circuit independently. The main function of design rule checkers is checking the clearance between mask polygons, which is a local operation. This is in contrast to the task of circuit extraction, where a single net, such as a power net, may extend over large areas of the circuit.

Researchers have investigated several special purpose approaches to speeding up this time consuming task. Macomber and Mott discussed the architecture of the FAST-MASK Engine for layout verification, which uses Motorola 68020 based processing boards (each with 2 megabytes of local memory) that share a common bus called QBUS [Macomber & Mott, 1985]. Each processor checks different pieces of a design simultaneously.

Blank et al. described an array architecture with single bit processors for processing mask layouts represented as bit maps [Blank et al., 1981]. Rutenbar et al. gave a good overview of array processing architectures for layout analysis, and described a class of architectures called raster pipeline subarrays that consist of a pipeline of subarray stages [Rutenbar et al., 1984]. These subarrays are small arrays, such as 3x3, that act like a window on a bit map image, which feeds into the pipeline in a serial stream (raster order). They can identify a pattern in a bit map image of a mask layout that corresponds to a design rule violation [Seiler, 1982].

Kane and Sahni discussed an edge-based systolic array architecture for design rule checking [Kane & Sahni, 1987]. Carlson and Rutenbar discussed the design of hardware to accelerate the scanline maintenance and scanline processing stages of edge-based layout analysis [Carlson & Rutenbar, 1987]. Their scanline processing hardware design makes use of pipeline processing techniques.

Researchers have also developed design rule checkers for general purpose parallel machines. Gregoretti and Segall used the Cm* multiprocessor system, containing 50 processors with access to shared memory, to implement their design rule checker [Gregoretti & Segall, 1984; Gregoretti & Segall, 1987]. The Cm* multiprocessor consists of clusters of processor/memory modules, where each processor has its own local memory, and can address the local memory of other processors [Swan et al., 1977]. There are buses for both interconnecting modules in a cluster and interconnecting clusters. Gregoretti and Segall make use of the hierarchical description of a circuit, and use shared memory to store both the description of the circuit and the task queue. Initially the algorithm places a task on the queue for finding the intersections between elements in each cell of the design. Processors take tasks from the queue and execute them simultaneously. When a processor finds an intersection between elements of a cell involving a subcell, it creates a new task on the queue. Gregoretti and Segall achieved nearly ideal speedups for highly regular circuits, using up to 16 processors, but the performance dropped off for irregular circuits where a few large leaf cells dominated the running time.

Bier and Pleszkun made use of the locality in design rule checking by identifying the maximum design rule interaction distance (DRID) [Bier & Pleszkun, 1985]. They divide a flattened circuit description into partitions such that each partition overlaps its neighbours by at least the DRID. Multiple processors can process each of these partitions independently. A post processing step removes the design rule violations found in the overlap area of each partition. The variation in the processing time for each of these partitions limits the potential speedup of using multiple processors because of load balancing problems. Bier and Pleszkun predicted a potential speedup of 8 for a large circuit that they checked on a uniprocessor.
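
The overlap construction is simple to state: each partition is grown by the DRID on both sides, so that any violation lies wholly inside at least one partition. The sketch below divides a layout into vertical strips this way; the parameter names are invented for the illustration.

    #include <stdio.h>

    /* Divide a layout of width chip_width into n vertical strips, each
     * expanded by the maximum design rule interaction distance (drid)
     * so that neighbouring strips overlap.  Each strip can then be
     * checked by a separate processor, with a post processing step
     * handling the violations reported in the overlap regions. */
    void drid_partitions(int chip_width, int n, int drid)
    {
        int i;
        for (i = 0; i < n; i++) {
            int lo = i * (chip_width / n) - drid;
            int hi = (i + 1) * (chip_width / n) + drid;
            if (lo < 0) lo = 0;
            if (hi > chip_width) hi = chip_width;
            printf("partition %d: x in [%d, %d)\n", i, lo, hi);
        }
    }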

Carlson and Rutenbar developed algorithms for design rule checking on the massively parallel, SIMD Connection Machine [Carlson & Rutenbar, 1988]. Their algorithm assigns a separate edge of a flat, edge-based layout to each processor. If the number of edges exceeds the number of physical processors, the operating system can partition the physical memory of each node into segments to allow virtual processors to share a physical processor. Each processor is responsible for operations involving its edge, and can communicate with processors managing neighbouring edges. Carlson and Rutenbar put together a benchmark of tasks needed for layout verification, including sorting, boolean combination of mask layers and region numbering, to show the potential speedups available. They achieved speedups of between 59 and 244 (compared to a serial version of the benchmark running on a Vax 11/785 running BSD4.2 UNIX) for a 16,000 one bit processor machine, ignoring the time to load the data onto the machine. Note however that the serial version uses file I/O between each operation, because of limited main memory, so the speedups don't necessarily represent the level of parallelism achieved by the algorithm.

Carlson and Rutenbar extended the work to produce a complete mask checking system called JIGSAW [Carlson & Rutenbar, 1990]. They compared the performance of their program on the Connection Machine with the DRACULA package from Cadence Design Systems running on an Apollo DN4000 workstation. They showed speedups of 112, using parallel I/O and 64,000 processors, for an industrial circuit with 76,000 edges.

3.4.2.2 Circuit extraction

Researchers have made use of general purpose parallel machines for circuit extraction, using similar techniques to those used in design rule checking [Levitin, 1986; Belkhale & Banerjee, 1988; Belkhale & Banerjee, 1989; Carlson & Rutenbar, 1988; Tonkin, 1990]. Section 2.2 discusses this in more detail.

3.4.2.3 Test generation

Fujiwara and Motohara used a 64 node multicomputer called LINKS-1 to develop a parallel test generation algorithm [Fujiwara & Motohara, 1988]. The processing nodes contained a Z8000 CPU with 1 MB of memory. The parallel algorithm interleaved test generation with fault simulation. The strategy involved allocating different faults to separate processors, and allowing each processor to generate a test vector for its fault. Other processors apply fault simulation for each generated test vector to determine which other faults the vector can detect. Both fault simulation and test generation can occur in parallel. Fujiwara and Motohara programmed the processors to act as either test generators or fault simulators; a single processor acted as the fault table manager and allocated tasks to the other processors. They obtained speedups of 20 using 40 processors. To avoid bottlenecks at the fault table manager processor they suggested using independent groups of processors with their own fault manager, each handling a subset of all the faults.

Fujiwara and Inoue analysed a parallel algorithm for distributed test simulation that allocated clusters of faults to networked server computers, which perform test generation and fault simulation [Fujiwara & Inoue, 1989]. They investigated the optimum size of the clusters, taking into account varying communication overheads.

Banerjee [Banerjee, 1988] discussed methods for parallel test simulation and referred to previous work [Kramer, 1983; Chandra & Patel, 1988].

Patil and Banerjee give a good overview of parallel test generation techniques [Patil & Banerjee, 1990]. They mention fault parallelism, heuristic parallelism, simulation parallelism, AND parallelism and OR parallelism:

o Fault parallelism divides the fault set among processors [Fujiwara & Motohara, 1988]. The use of independent fault sets reduces the chance of a processor deriving a test vector that could also be used for a fault assigned to another processor, and gives good speedup. The faults that are hard to detect with a uniprocessor algorithm will still cause long computation times, because of the large number of backtracks necessary. Typically the algorithm aborts generating a test after exceeding a backtrack limit.

o Heuristic parallelism involves using several different heuristics (up to 5 or 6) in parallel to find a test vector for a fault, hence taking advantage of the best heuristic for a particular case.

o Simulation parallelism partitions the circuit among the processors, and uses parallel logic simulation techniques to speed up the forward implication step of test generation, which checks for conflicts between the primary input values. This will reduce the time to evaluate backtracks, improving the chances of finding a test for hard-to-detect faults. The level of circuit activity in test generation can be low, however, limiting the amount of parallelism.

o AND parallelism involves dividing test generation into subtasks, which must all be completed, in a divide and conquer approach. Multiple processors can execute these subtasks in parallel, but if there is a large amount of shared data, the access overheads will limit the speedup.

o OR parallelism refers to evaluating in parallel different choice points of the decision tree used to find a test vector. Searching disjoint portions of the search space concurrently leads to high processor utilization ratios, as the amount of interprocessor communication is low.

Patil and Banerjee implemented a parallel version of the branch and bound search of the input space of a circuit used in the PODEM algorithm. When a conflict occurs in the choice of circuit inputs, the algorithm must backtrack through the choices made in previously assigned inputs. Multiple processors can evaluate in parallel alternative choices for the previous assignments to find the choice which led to the conflict. The algorithm ensures that processors are searching disjoint search spaces. Patil and Banerjee implemented the algorithm on an Intel iPSC/2 hypercube, and used a single processor to act as a dynamic scheduler. The communications requirements are low, which suits the distributed memory hypercube. For the same total backtrack limit, the parallel algorithm has a much better chance of finding a test vector for a hard-to-detect fault than the uniprocessor algorithm. Patil and Banerjee quoted near linear speedups, over a uniprocessor algorithm with the same net backtrack limit, using up to 16 processors.
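
The uniprocessor search that Patil and Banerjee parallelized can be sketched as a depth-first assignment of binary inputs with a global backtrack budget. This is a generic stand-in rather than PODEM itself; conflict detection is abstracted into a hypothetical callback, and a parallel version would hand the alternative branches at a conflict point to different processors.

    /* Depth-first search over binary input assignments with a backtrack
     * limit.  `conflict' reports whether the first `depth' assigned
     * inputs already prevent detection of the target fault.  Returns
     * 1 if a test vector is found, 0 if the backtrack limit is hit,
     * and -1 to tell the caller to try its other branch. */
    int backtracks_left = 100;

    int search(int *in, int depth, int nin,
               int (*conflict)(const int *in, int depth))
    {
        int v, r;
        if (conflict(in, depth)) {
            if (--backtracks_left < 0)
                return 0;          /* abort: backtrack limit exceeded */
            return -1;             /* backtrack to the previous choice */
        }
        if (depth == nin)
            return 1;              /* all inputs assigned: test found */
        for (v = 0; v <= 1; v++) {
            in[depth] = v;
            r = search(in, depth + 1, nin, conflict);
            if (r != -1)
                return r;          /* test found, or limit exceeded */
        }
        return -1;                 /* both choices failed: backtrack */
    }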

3.4.2.4 Summary

Techniques for both design rule checking and circuit extraction mainly involve partitioning the circuit into sections, and operating on these sections independently. Similar techniques should apply to electrical rule checking. The main limitation to performance is the regularity of the circuit. Near ideal speedups are possible with highly regular circuits, but the algorithms show much poorer performance for irregular circuits.

Researchers have used both shared memory and distributed memory machines. Distributed memory machines seem more appropriate for this form of data partitioning, which involves mainly local operations with low communications overhead. The high efficiencies achieved with regular circuits offer hope for effective use of larger numbers (e.g. 50 - 100) of conventional processors, with a cheaper interconnection network than for shared memory machines.

Further research into better partitioning schemes may allow more effective use of larger numbers of processors when circuits are irregular.

Test generation can make use of many processors for large designs, by partitioning the faults to be detected among the processors. Each processor can generate tests for its set of faults. Once a processor finds a test, it can use another processor to find which other faults the test will detect, using fault simulation. Again a distributed memory arrangement seems appropriate because of the localized computations involved.

Timing analysis could benefit from parallel processing, and further work is necessary in this area. Techniques similar to those used in parallel logic simulation may be applicable.

3.4.3 Dynamic Analysis Tools

3.4.3.1 Circuit simulation

Saleh et al. gave a good overview of the parallel approaches to both direct and relaxation methods of circuit simulation, with an emphasis on the use of shared memory multiprocessors [Saleh et al., 1989]. They concluded that techniques that improve the amount of parallelism in an algorithm come at the expense of more computation (which may negate the effect of increased parallelism), increased storage, and increased overhead. Frequently a software developer must modify an algorithm to fully exploit the performance of a particular computer architecture, paying special attention to the data access scheme.

In the direct method of circuit simulation, the most time consuming steps are forming the matrix for the set of coupled linear equations and solving this sparse matrix. For large circuits, the solution step dominates the execution time. Multiprocessors can speed up the matrix forming step by evaluating the models of individual devices in parallel, and updating the shared matrix data structure [Cox et al., 1986]. Solving the sparse matrix in parallel is more difficult. Multiprocessors can exploit fine-grain parallelism in a single pivot of LU decomposition, or they can carry out several independent pivot operations in parallel, with possibly limited parallelism. Other methods include node tearing techniques that partition the circuit into subcircuits connected by interconnection nodes. Multiple processors can evaluate the subcircuits in parallel, with the disadvantage of additional matrix terms [Cox et al., 1986; Yeh & Rao, 1984].
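
Forming the matrix amounts to each device "stamping" its contribution into the system of nodal equations, which is why device evaluation parallelizes so naturally. The sketch below shows the classical stamp for a linear conductance; it uses a small dense matrix for clarity, whereas real simulators use sparse storage, and a parallel version would lock or privatize the entries being updated.

    #define NNODES 8

    double G[NNODES][NNODES];    /* nodal admittance matrix */

    /* Stamp a conductance g connected between nodes i and j.  Pass a
     * negative index for a terminal connected to ground, which is not
     * stored in the matrix. */
    void stamp_conductance(int i, int j, double g)
    {
        if (i >= 0) G[i][i] += g;
        if (j >= 0) G[j][j] += g;
        if (i >= 0 && j >= 0) {
            G[i][j] -= g;
            G[j][i] -= g;
        }
    }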

Chang and Hajj reported speedups of 7 using an 8 processor Alliant FX/8 for direct method simulation [Chang & Hajj, 1988]. Sadayappan and Visvanathan discussed the use of shared-memory vector multiprocessing, and showed speedups between 6 and 7 over a single scalar processor, using 6 vector processors [Sadayappan & Visvanathan, 1988]. Yang has shown speedups of 4.5 (for circuits up to 330 transistors) over the SPICE2 direct method simulator on an 8 processor Alliant FX/8 [Yang, 1990]. Yang reports that the speedup improves as circuits become larger, generating more parallel tasks.

Lewis described the Awsim-3 hardware accelerator for circuit simulation, which consists of a special purpose device evaluator and a programmable general purpose processor [Lewis, 1988]. The general purpose processor uses a VLIW architecture, which can execute up to 5 instructions per cycle, with a pipeline 13 clock cycles long.

Deutsch and Newton discussed the MSPLICE program, which uses the Iterated Timing Analysis relaxation method [Deutsch & Newton, 1984; Waterman, 1987]. They partition circuits into subcircuits for parallel evaluation, and can use direct methods to solve tightly coupled subcircuits. They reported speedups of 7 using a 10 processor BBN Butterfly multiprocessor, making use of local memory for nodes internal to a subcircuit, and shared memory for nodes referenced by multiple processors.

Jacob et al. carried out further work with MSPLICE using up to 99 processors on the BBN Butterfly [Jacob et al., 1986]. They identified bottlenecks resulting from contention for shared data structures as the number of processors increased. They state that effort must go into reducing the size of dynamically varying global data structures, which the algorithm cannot simply copy and store in the local memory of processors, as well as reducing the number of references to these structures. Their results show near linear speedups up to about 25 processors, with the speedup levelling off at about 20.

Saleh et al. reported results for parallel implementations of iterated timing analysis and waveform relaxation algorithms showing speedups between 4 and 5 on an 8 processor Alliant FX/8 [Saleh et al., 1989]. They used a simulator called PARASITE to predict the performance of their algorithm for larger numbers of processors. Their predictions show speedup levelling off at about 10 for their test circuits.

Hung et al. discussed relaxation techniques that exploit the hierarchical memory organization of the Cedar machine [Hung et al., 1990]. The Cedar machine consists of several shared memory Alliant FX/80 machines connected to a shared global memory. Thus the machine has two levels of shared memory. The strategy was to divide a circuit into large subcircuits, and distribute them among the Alliant machines. These machines cooperate to solve the circuit with parallel waveform relaxation techniques, using the shared global memory to pass the waveforms between subcircuits. Each Alliant machine uses parallel processing internally to solve its subcircuits.

Hung et al. compared the performance of parallel implementations of direct method, iterated timing analysis (PSLICE), and waveform relaxation (PRELAX) algorithms. They concluded that direct methods performed best for very tightly coupled subcircuits, PSLICE performed best for large moderately coupled subcircuits, and PRELAX performed best for large loosely coupled subcircuits (making the method suitable for use at the Cedar level of the machine). Their solution was to allow the Alliant machines to dynamically choose between the parallel direct method and PSLICE at run time, based on statistics gathered during the circuit partitioning process. Their results showed a speedup of about 9 using 4 Alliant machines with 4 processors each.

Deutsch et al. discussed a commercial software package called PowerSPICE, which runs on a Sequent multiprocessor with hardware acceleration modules (SPICEngines), and uses both direct and parallel relaxation methods [Deutsch et al., 1986].

Odent et al. discussed a multiprocessor version of the CSWAN program, which uses the waveform relaxation method [Odent et al., 1989; Dumlugöl et al., 1987]. They discuss techniques for partitioning feedback loops into small subcircuits, so that they can simulate several subcircuits in parallel using direct methods. This contrasts with previous approaches to partitioning circuits, which result in large feedback loops assigned to one subcircuit; the simulation of the large subcircuit can then dominate the simulation time. Odent et al. iterate the simulation of the subcircuits until the node waveforms in the feedback loop converge. They use dataflow techniques to schedule the simulation of subcircuits so as to achieve a good processing load balance. In addition, they use parallel element evaluation and time-segment pipelining. Parallel element evaluation speeds up the formation of the matrix used in the direct method simulation of subcircuits. Time-segment pipelining involves dividing the simulation interval into time-segments and scheduling a subcircuit for evaluation over a time-interval when its inputs are available over the same time-interval. Odent et al. achieved near linear speedups for large circuits, using up to 8 processors on a shared memory Sequent Balance. Parallel element evaluation was more useful for small circuits, where the number of subcircuits that the algorithm could execute in parallel was small.

Webber and Sangiovanni-Vincentelli discussed a parallel relaxation algorithm for the massively parallel Connection Machine [Webber & Sangiovanni-Vincentelli, 1987]. The algorithm uses Gauss-Jacobi relaxation with point decomposition, where the algorithm decouples each equation from the others, in contrast to block methods where equations group into subsystems. It maps each data item to a unique processor, and uses pointers to other processors to indicate the interconnections between devices and nodes. The Connection Machine allows a programmer to specify the number of bits of floating point resolution needed for each variable, hence providing a trade off between execution speed and memory usage, and the level of precision. The disadvantage of Gauss-Jacobi is that it can take a long time to converge for some circuits. The running time of the algorithm is independent of the circuit size, but a 65,536 processor Connection Machine limits the largest circuit to about 10,000 nodes. Webber and Sangiovanni-Vincentelli quoted results for a 630 node EPROM circuit, compared to both a waveform relaxation and a direct method simulator running on a MicroVax under Ultrix, with speedups of 8 and 30 respectively.
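
One Gauss-Jacobi sweep with point decomposition is sketched below (a textbook formulation, not Webber and Sangiovanni-Vincentelli's code): every unknown is updated from the previous iterate only, so all N updates are independent and can be mapped one per processor on a SIMD machine.

    #define N 4

    /* One Gauss-Jacobi sweep for the linear system A x = b.  Each
     * x_new[i] depends only on the previous iterate x_old, so all N
     * updates can proceed in parallel (point decomposition). */
    void jacobi_sweep(double A[N][N], double b[N],
                      double x_old[N], double x_new[N])
    {
        int i, j;
        for (i = 0; i < N; i++) {
            double s = b[i];
            for (j = 0; j < N; j++)
                if (j != i)
                    s -= A[i][j] * x_old[j];
            x_new[i] = s / A[i][i];
        }
    }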

3.4.3.2 Timing simulation

Lewis described both a uniprocessor (Awsim-1) and a multiprocessor (Awsim-2) hardware accelerator for timing simulation [Lewis, 1988]. They both use hardware table look-up techniques for device evaluation, and use a forward Euler integration algorithm with a small time step. The multiprocessor accelerator partitions a circuit into subcircuits with arbitrary collections of nodes, allowing each processor to simulate a separate subcircuit in parallel. When the simulation of a subcircuit requires knowledge of the node voltages from a subcircuit on another processor, the algorithm replicates the voltages of the shared nodes on each sharing processor. At the end of each time step, each processor managing a shared node broadcasts its updated value to the other processors sharing that node. This method relies on the calculation time at a time step being much larger than the interprocessor communication time. Allowing arbitrary collections of nodes in a subcircuit gives better control of load balance, but there is still a trade off with communication bandwidth between subcircuits. Lewis designed the hardware accelerator with 32 special purpose processors, and simulated a speedup of 19 for a 13,317 transistor test circuit.

Ackland et al. used a message based multiprocessor (or multicomputer) to implement a parallel version of a MOS timing simulator [Ackland et al., 1985; Ackland et al., 1986]. The parallel algorithm generates a data flow graph for the circuit description, which shows the dependencies and potential concurrency [Ashok et al., 1985a; Ashok et al., 1985b]. It uses a forward integration scheme that reduces the time step whenever a node voltage changes by more than a preset threshold, or increases the time step when a node is relatively inactive. This method ensures accuracy and stability of the integration scheme, and takes advantage of circuit inactivity. The algorithm partitions the circuit into regions containing tightly coupled nodes, which have a bidirectional constant current source, such as a pass transistor, joining them together. The regions connect to one another using uni-directional signals; this maps well onto a message passing architecture. The algorithm sets the time step for each region independently, and uses a separate process to simulate each region as inputs become available. Each processor handles several regions, and uses buffering for messages between regions to allow source nodes to proceed ahead of their destination nodes. When a process simulating a region runs out of input data, or has a blocked output channel, a processor suspends the process and resumes a process for a different region. Ackland et al. tested the algorithm on an M68000 based, 12 processor multicomputer and achieved speedups of about 5 with 7 processors. The disadvantage of this method is that some regions can be much larger than others, causing load balancing problems [Lewis, 1988].
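
Both simulators rest on the same integration primitive: a forward Euler update of each node voltage, with Ackland et al. adapting the time step to node activity. A sketch for a single node follows; the names and thresholds are illustrative, not taken from either implementation.

    #include <math.h>

    /* One forward Euler step for a node of capacitance c charged by a
     * current i_of_v(v).  If the voltage change exceeds v_thresh the
     * step is halved and retried; if the node is quiet the next step
     * is doubled, up to dt_max. */
    double euler_step(double v, double *dt, double c,
                      double (*i_of_v)(double), double v_thresh,
                      double dt_max)
    {
        double dv = *dt * i_of_v(v) / c;
        while (fabs(dv) > v_thresh) {   /* active node: shrink the step */
            *dt *= 0.5;
            dv = *dt * i_of_v(v) / c;
        }
        if (fabs(dv) < 0.1 * v_thresh && *dt * 2.0 <= dt_max)
            *dt *= 2.0;                 /* quiet node: grow the step */
        return v + dv;
    }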

3.4.3.3 Logic simulation

Most logic simulators (see Section 1.4.2.3) use a form of event-driven simulation (see Section 1.4.1.3) to take advantage of low levels of activity in a circuit. Smith [Smith, II, 1986] gives a thorough treatment of the design issues for implementing a parallel logic simulator. Franklin et al. [Franklin et al., 1984] also describe parallel logic simulation techniques.

Blank [Blank, 1984] and Franklin et al. [Franklin et al., 1984] survey several hardware simulators including HAL [Takasaki et al., 1987] and the Zycad logic evaluator [van Brunt, 1983]. These accelerators use multiple, special purpose, pipelined units to achieve high performance. Ashok et al. describe the possibility of using a special purpose dataflow machine for event-driven simulation [Ashok et al., 1985a; Ashok et al., 1985b].

Kravitz et al. developed a parallel implementation of the COSMOS (COmpiled Simulator for MOS) switch level simulator for the Connection Machine [Kravitz et al., 1989a; Kravitz et al., 1989b; Bryant, 1988]. COSMOS breaks the circuit into subnetworks, and generates a boolean description for each subnetwork. The parallel algorithm divides the boolean operations for a subnetwork into logical ranks, and assigns a processor for each operation in a rank. The time to evaluate a module is proportional to the number of ranks. The algorithm schedules the modules for execution taking into account the number of available processors. The module with the most ranks determines the limit to the parallelism. In contrast to most parallel logic simulation techniques, the algorithm does not use an event driven model: it evaluates the entire circuit model each simulation step. Low levels of circuit activity, however, cause most of the evaluations to be redundant. Kravitz et al. point out that in the presence of massively parallel resources this is not a concern. They have achieved levels of parallelism in the thousands, but have not compared the running time with a serial algorithm on a conventional processor.

Chung et al. discussed an implementation of the asynchronous time-warp [Jefferson, 1985] event driven algorithm for gate level simulation on the Connection Machine [Chung & Chung, 1989]. They assigned a separate processor for each gate, and dynamically assigned the management of each event to a separate processor. They also did not compare results with the running time of a similar algorithm on a conventional processor.

The MARS (Microprogrammable Accelerator for Rapid Simulations) accelerator consists of multiple units called clusters [Agrawal et al., 1987; Agrawal & Dally, 1990]. Each cluster handles a partition of a circuit, and contains 14 special purpose processing elements interconnected by a crossbar switch, which are microprogrammable to form a multistage pipeline for a particular algorithm. In addition a cluster contains an M68020 based housekeeping processor for loading circuits, managing local disk storage for circuit partitions, and handling exceptions. The designers initially designed the accelerator for event-driven logic simulation, where the pipeline stages may handle steps such as event scheduling, fanout updating, or logic device evaluation. This simulator is an attempt to provide a special purpose architecture that is programmable to execute several different CAD tasks. However, it does not have the flexibility and ease of programming of a general purpose multiprocessor.

The key to maximizing the performance of a multiple processor machine that simulates separate partitions of a circuit in parallel is the partitioning scheme. An optimal partitioning scheme maximizes the concurrency by keeping all processors busy at each time step, and minimizes the interprocessor communication. Finding this optimal partition is an NP-complete problem, and could require more time than the simulation itself, hence heuristics are necessary for this task. Several researchers have examined the concurrency available in test logic circuits, and evaluated the effects of different partitioning schemes on architectures with different communication costs [Agrawal, 1986; Frank, 1986; Briner et al., 1988; Chamberlain & Franklin, 1988; Smith et al., 1988; Mueller-Thuns et al., 1989].

Static partitioning heuristics include:

o random,

o trace,

o and minimum distance partitioning [Deutsch & Newton, 1984].

A random partitioning heuristic randomly allocates circuit nodes to processors with an even distribution.

A trace partitioning heuristic allocates nodes connected in a serial path to the same processor; if more than one fanout element is present, it allocates the other fanout nodes to different processors.

A minimum distance heuristic allocates adjacent, non-local nodes, such as other fanout nodes, to adjacent processors in a communication network where the interprocessor communication time varies with the processor location.
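
A sketch of the trace heuristic on a gate-level netlist follows; the data layout is invented for the illustration. A serial path of gates stays on one processor, and the remaining fanout branches seed new traces on the other processors in round-robin order.

    #define MAXFAN 8

    struct gate {
        int nfanout;
        struct gate *fanout[MAXFAN];
        int proc;                  /* assigned processor, -1 initially */
    };

    /* Trace partitioning: keep a serial path of gates on processor p;
     * at a fanout point, follow the first branch and start new traces
     * for the other branches on successive processors. */
    void trace(struct gate *g, int p, int *next, int nprocs)
    {
        int k;
        while (g != 0 && g->proc == -1) {
            g->proc = p;
            for (k = 1; k < g->nfanout; k++) {
                *next = (*next + 1) % nprocs;
                trace(g->fanout[k], *next, next, nprocs);
            }
            g = (g->nfanout > 0) ? g->fanout[0] : 0;
        }
    }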

The overhead of dynamic partitioning to balance the load can be excessive, because of either movement of the circuit description data or additional remote memory references.

Smith et al. ran computer simulations on 11 test circuits for different message based (local memory only) parallel computer architecture models with both 4 and 16 processor configurations [Smith et al., 1988]. They tested varying ratios of logic element evaluation time to message transmission time, and three different partitioning heuristics, with a separate partition for each processor. One of the heuristics involved adding a circuit element to the partition with the most fan-in connections, while another worked with the most fan-out connections. These heuristics sought to group connected devices in the same partition to minimize interprocessor communications. The third heuristic used random partitioning. Smith et al. quoted results as an average for the 11 test circuits. They found that with ideal communications the random partitioning gave the best speedup, with values of 3.5 for 4 partitions, and 10.5 for 16 partitions. However, the random partitioning caused many interpartition messages. When using higher communication to device evaluation cost ratios, the other heuristics performed better. Smith et al. found that the concurrency fell dramatically as the communications cost increased.

Agrawal also ran computer simulations on several test circuits using random partitioning, and concluded that the level of concurrency was dependent on the level of circuit activity, with values ranging from 3.7 to 7.6 for 10 partitions [Agrawal, 1986]. For large numbers of partitions the concurrency saturated because of the higher variance in circuit activity among the partitions. Agrawal described a partitioning heuristic, similar to trace partitioning, for multiple-delay simulation, where multiple fan out gates can reside in the same partition if they have different delays. The heuristic also tries to balance the number of gates in each partition.

Soulé and Blank implemented three parallel algorithms for logic simulation on a 16 processor, shared memory Encore Multimax [Soulé & Blank, 1988]. The shared memory stores the circuit description. The first algorithm was a synchronous event-driven algorithm (also called time wheel simulation), where processors process all events in one time step before moving on to the next time step. When using a single event queue the speedup was 2 with 8 processors. To avoid queue contention, Soulé and Blank used distributed queues, where each processor had one queue for each of the other processors. When a processor generates a new event for a node, it picks another processor in round-robin fashion, and adds the event to its queue on that processor. If a processor runs out of events on its incoming queues, it can take events from queues on other processors. Soulé and Blank achieved speedups of about 6 with 8 processors, and found that using larger numbers of processors to increase speedup required high levels of circuit activity.
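
The distributed queue structure can be sketched as an n x n array of single-producer queues, so that producers never contend with one another; the layout and names below are invented for the illustration, and overflow handling is omitted.

    #define NPROC 8
    #define QLEN  1024

    struct event { int node; int value; long time; };

    /* queue[dst][src] holds events produced by processor src for
     * processor dst; head/tail index each circular queue. */
    struct event queue[NPROC][NPROC][QLEN];
    int head[NPROC][NPROC], tail[NPROC][NPROC];
    int next_target[NPROC];   /* per-producer round-robin pointer */

    /* Processor `me' posts a new event, choosing the consumer in
     * round-robin fashion. */
    void post_event(int me, struct event ev)
    {
        int dst = next_target[me];
        next_target[me] = (dst + 1) % NPROC;
        queue[dst][me][tail[dst][me]] = ev;
        tail[dst][me] = (tail[dst][me] + 1) % QLEN;
    }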

Soulé and Blank implemented a compiled mode parallel simulator that evaluated every element at every time step. They statically partitioned the elements among the processors of the Encore Multimax and achieved high processor utilization rates, but most of this is unnecessary work, as the outputs of most of the elements remain unchanged each time step. They also developed an asynchronous event driven algorithm to avoid the overheads, and waste of idle processors, associated with synchronizing at every time step. Normally asynchronous algorithms have problems when a processor simulates too far ahead in time, and receives an event from the past. When this happens a processor must roll back the state of the circuit to that time, and cancel any events that it has generated since then [Jefferson, 1985]. Handling the rollbacks limits the advantage of simulating ahead of time. The method also needs more memory for state storage. Soulé and Blank's method works on a logic element by logic element basis. A processor evaluates the output of an element as far into the future as possible over the time where it knows the behaviour of all the input nodes. It then queues all the fan-out elements for evaluation on queues located on other processors. Feedback paths reduce the algorithm to a standard synchronous event-driven algorithm, which reduces the parallelism if the feedback path includes a large portion of the circuit. Soulé and Blank quoted experimental results that show a performance improvement using the asynchronous algorithm, with speedups up to 11 using 16 processors.

Bailey and Snyder discussed synchronous and asynchronous synchronization methods, and concluded that asynchronous algorithms will perform at least as well as synchronous algorithms [Bailey & Snyder, 1989].

Soulé and Gupta evaluated the suitability of the Chandy-Misra algorithm [Chandy & Misra, 1981] for parallel logic simulation, using a shared memory Encore Multimax with 16 processors [Soulé & Gupta, 1989]. In the Chandy-Misra distributed-time discrete-event algorithm, each logic element uses a local clock and sends time-stamped messages to the elements connected to it. A logic element updates its local clock and output when it receives time-stamped messages on all its inputs. A logic element only sends a message when its output changes. Deadlock occurs when each element has an input with no pending events. The algorithm can resolve this by scanning all unprocessed events in the system and finding their minimum time-stamp. It uses this to update the valid-time of all logic elements with no inputs at that time. Thus the algorithm oscillates between a compute phase and a deadlock resolution phase. Soulé and Gupta discussed the selective use of NULL messages (sent when an output remains the same after advancing the local clock) to reduce the number of deadlocks and improve performance. They reported that the original Chandy-Misra algorithm generated an average theoretical concurrency of 68 for industrial test circuits. When the overheads of synchronization and scheduling are included, however, only a 20-fold speedup is likely. Soulé and Gupta discussed techniques that may improve this by a factor of 3 or 4.
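
The heart of the Chandy-Misra scheme is the local clock rule: an element may safely advance its clock to the minimum time stamp received on its input channels, and sends a message only when its output changes. The sketch below shows this rule for one element; the data layout is invented, and NULL messages and deadlock resolution are omitted.

    #define MAXIN 4

    struct element {
        int  nin;                 /* number of inputs */
        long in_time[MAXIN];      /* latest time stamp on each input */
        int  in_val[MAXIN];       /* latest value on each input */
        long clock;               /* local clock */
        int  out;                 /* current output value */
    };

    /* The element may simulate up to the minimum input time stamp. */
    long safe_time(const struct element *e)
    {
        long t = e->in_time[0];
        int i;
        for (i = 1; i < e->nin; i++)
            if (e->in_time[i] < t)
                t = e->in_time[i];
        return t;
    }

    void advance(struct element *e, int (*eval)(const int *in))
    {
        long t = safe_time(e);
        if (t > e->clock) {
            int v = eval(e->in_val);
            e->clock = t;
            if (v != e->out) {
                e->out = v;
                /* send (v, t) on all output channels here */
            }
        }
    }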

Subramanian and Zargham discussed the use of demand driven techniques for parallel logic simulation, which they implemented on a shared memory Sequent Balance 8000 multiprocessor [Subramanian & Zargham, 1990]. Conventional discrete-event simulation evaluates the values at all nodes in the circuit, even though the designer may only be interested in the signal values at a particular node at a particular time. Demand driven techniques only evaluate the nodes that are necessary to produce the node signals requested by the designer. They can save substantial computation when the number of output pins of interest is small and the time intervals at which values are needed are small. Demand driven techniques have the disadvantage of requiring memory to store the output values of intermediate nodes for the duration of the simulation.

The algorithm used by Subramanian and Zargham uses demand driven techniques to preprocess the circuit information, in response to the required output information, to reduce the number of input events and select the logic elements that it needs to evaluate. It then carries out distributed-time discrete-event simulation (the Chandy-Misra technique) using the preprocessed information, avoiding the large memory requirements of pure demand driven simulation. Their results indicated a two to three times performance improvement over pure discrete event simulation. Using up to 5 processors, they achieved parallel speedups of about 3.

Mueller-Thuns et al. looked at the effect of communications speed on the speedup available from parallel logic simulation [Mueller-Thuns et al., 1989]. They defined a communication to computation ratio r to be the ratio of the cost of synchronizing two processors (achieved by sending a short message in a multicomputer) to the cost of evaluating a logic element. They plotted theoretical curves of speedup versus the number of processors (up to 16), for r = 3, 5, and 10, with maximum speedups of 4.5, 6.5, and 8.5 respectively. For r = 10, the speedup reached a maximum with about 10 processors and then began to decline.
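
The shape of such curves can be reproduced with a crude model; the one below is an illustration only, not the model used by Mueller-Thuns et al. If E element evaluations of unit cost are divided over p processors, and each simulated cycle costs r*p in pairwise synchronization, the speedup peaks near sqrt(E/r) and then declines, much as their r = 10 curve does.

    #include <stdio.h>

    double speedup(double E, double r, int p)
    {
        return E / (E / p + r * p);   /* compute time + sync time */
    }

    int main(void)
    {
        int p;
        for (p = 1; p <= 16; p++)
            printf("p=%2d  speedup=%.2f\n", p, speedup(1000.0, 10.0, p));
        return 0;
    }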

Chamberlain and Franklin plotted similar theoretical curves for hypercubes with up to 60 processors, with equivalent values for r of 0.75, 3, and 9 [Chamberlain & Franklin, 1988]. Their results showed a 50% speedup improvement for large numbers of processors when reducing r from 3 to 0.75. Curves with r = 9 reached a maximum speedup of less than half those of r = 3; this maximum occurred with about 40 processors.

Smith et al. derived theoretical speedup results for several communication networks, with equivalent values for r of 1/3, 1, and 3 [Smith et al., 1988]. Their results showed speedups with 16 processors connected with a crossbar switch, using random partitioning, of 10.1, 4.7 and 1.6 respectively. With values of r above 1, they found that more complex partitioning heuristics were necessary to improve the speedup (e.g. from 1.6 to 2.4 with r = 3). With r = 1/3, random partitioning gave the best results. This was most noticeable when using a crossbar switch to handle the extra interpartition messages; the crossbar switch improved speedup by about 30% over a hypercube connection.

Using the above definition of the communication to computation ratio r, the HPC architecture (discussed in Section 2.2.5.1) has an r of about 3. This assumes that short messages are sent, where the latency to set up the message dominates the message time. Better performance will result if the data for several messages is sent as a single message, taking advantage of the available communications bandwidth.

The theoretical simulations of the effect of communication speed show that r must be less than 3 to make effective use of more than 10 processors. Much lower values of r can simplify the task of partitioning the logic circuit among the processors, and can allow for a better load balance.

3.4.3.4 Fault simulation

Multiple processors can speed up fault simulation using circuit partitioning techniques similar to those for logic simulation, except that the amount of data transferred between partitions is much greater. Other techniques include partitioning the list of faults to simulate, or partitioning the test vectors.

Levendel et al. discussed a multiple, special purpose processor implementation of a parallel fault simulation algorithm (which simulates multiple faults in one simulation pass, see Section 1.4.2.4) [Levendel et al., 1983]. Their strategy was to partition the circuit into subcircuits for each processor, using similar considerations as in parallel logic simulation.

Markas et al. discussed using a network of workstations to speed up concurrent fault simulation (see Section 1.4.2.4) [Markas et al., 1990]. They looked at either partitioning the fault lists or partitioning the test vector list. Both methods involve minimal interprocessor communication, which suits a distributed network implementation. Markas et al. pointed out that partitioning the test set can cause problems for sequential circuits, where the order of applying test patterns is crucial. Test list partitioning may also result in redundant work when more than one test vector can detect a particular fault. Markas et al. chose fault list partitioning, and evaluated several partitioning schemes. They incorporated techniques to dynamically load balance calculations, and to recover from breakdowns in the distributed computing network. They achieved near linear speedups for up to 4 workstations, but the speedup dropped off with more workstations, because of smaller fault lists per processor.

3.4.3.5 Summary

Researchers have used shared memory machines for parallel direct-method circuit simulators. The machines use multiple processors to operate on a shared matrix representing the set of circuit equations. They have also taken advantage of vector processing facilities on the individual processors. These implementations have used fewer than 8 processors and, although speedups are close to linear, they do not offer hope for simulating large circuits in reasonable time.

Relaxation techniques allow partitioning of the circuit into subcircuits, so that multiple processors can simulate each subcircuit independently. These techniques suit a distributed memory implementation, where the limitations to performance are the interprocess communication speed and the quality of the load balance. Circuits that have large groups of tightly coupled nodes are hard to partition for uniform load balance, and subcircuits with many external interconnections place loads on the communication network.

Parallel logic simulation involves partitioning the circuit, and simulating each partition using an event driven approach that takes advantage of the low levels of activity in a circuit. This again suits a distributed memory architecture. Measurements assuming ideal communication of results between circuit partitions show a maximum level of concurrency of about 10. The level of concurrency depended on the level of circuit activity. Using large numbers of partitions led to a saturation of concurrency because of a higher variance of circuit activity.

Parallel fault simulation can use circuit partitioning techniques similar to those for logic simulation, although the communication requirements are more demanding, and may not suit a distributed memory architecture. Alternative techniques include partitioning either the fault list or the test vector set among processors, which suits a distributed implementation, and provides potential for high speedups for large problems.

3.5 Conclusions

MIMD machines are the most flexible approach to accelerating the computer-aided design process. Recent research has developed efficient parallel algorithms suitable for running on MIMD machines. In the area of computer-aided design of integrated circuits, however, the speedups obtained are generally less than a factor of 20. This is not sufficient to revolutionize the current computational limitations. For example, direct method circuit simulation is still not feasible for an integrated circuit with over 100,000 transistors, which implies that mixed-level and mixed-mode simulators will become increasingly important as circuits become larger and more complex. However, a speedup of 20 can still substantially increase the efficiency of a circuit designer. For example, tasks that normally take an hour can complete in a few minutes, allowing a more interactive design development cycle.

Special purpose accelerators now offer better performance than multiprocessors that use standard VLSI technology, for particular applications such as logic simulation. Large companies can afford to use these machines in addition to general purpose machines.

Advances in microprocessor design allow a MIMD machine to be built with cost effective and readily available components, to provide a substantial performance improvement over a uniprocessor design workstation for a large range of CAD applications. These machines also offer performance advantages in other application areas, such as simulations of natural phenomena, allowing the development costs of communication networks, operating systems, and program development facilities to be spread across applications.

MIMD machines with a few (fewer than 20) powerful processors will perform better than ones with many less powerful processors for most circuit design tasks, because of the limited concurrency available. When larger numbers of processors are present, multiple users can partition the machine into independent submachines, to make effective use of the resources. Accurate simulation also requires high speed floating point support.

Many design tasks performed on large VLSI circuits require a considerable amount of data. Many existing parallel machines only perform well when they keep all the circuit information in main memory for the duration of the algorithm, limiting their use to medium size circuits. A MIMD machine should support fast access to secondary storage to support operations on circuits with well over 100,000 transistors.

The programming model of a MIMD machine can be different from the underlying architecture, although a shared memory multiprocessor can provide a message passing programming environment more easily than a distributed memory multicomputer can provide a shared memory programming environment. Distributed memory multicomputers are simpler to build, and allow more processors for a given cost. Many parallel CAD algorithms use a distributed memory model that encourages localized computations, with a small amount of interprocess communication, implying the use of a distributed memory machine. Some applications, however, such as simulated annealing and circuit simulation, favour large shared data structures that would be hard to store on one server processor. For these applications, a distributed memory machine should support a shared memory programming paradigm that allows the manipulation of large distributed data structures. A programming environment supporting both shared memory and distributed memory programming techniques will enhance the portability of parallel algorithms. Providing a standard user interface will also allow designers to use different machines without being aware of the underlying hardware structure.

Tasks such as logic simulation, which use a significant amount of interprocess communication, are highly dependent on the speed of the interprocessor communications. This implies that a distributed memory machine should use an efficient hardware supported communications structure. The use of a fast communications structure also allows the programmer to ignore the interconnection topology, greatly simplifying the programming task, and improving portability.

Many integrated circuit design teams have access to a distributed computing environment consisting of multiple workstations connected via a local area network such as Ethernet. These workstations could be part of the communications network of a MIMD machine to make efficient use of existing facilities, and take advantage of mature user interfaces. Design tasks requiring interactive input, such as schematic entry, still suit the capabilities of a uniprocessor workstation.

Chapter 4

Computer architecture for VLSI CAD

4.1 Introduction

This section will describe aspects of a computer system proposed for the computer-aided design of large integrated circuits. The system architecture consists of a communications network interconnecting high performance computer nodes and workstations to form an MIMD parallel computer. Figure 4.1 illustrates this architecture.

The high performance computing nodes will cooperate in parallel to speed up many of the compute intensive design tasks such as placement, routing, design rule checking, test generation, and simulation. Each node will have its own memory, and can operate as an independent computer. The nodes will have binary compatibility: the same machine code will run on each of the nodes, although some of the nodes may run at different clock speeds. Software support will allow programmers to use both shared memory and message passing programming techniques.

[Figure 4.1: System Architecture. Computer nodes and workstations connected by an interconnection network.]

Special purpose accelerators, such as those available for logic simulation, could connect into the network if high production volumes lead to low prices.

The workstations will provide the support for the user interactive stages of design such as schematic entry. Initially these workstations will be standard uniprocessor workstations, allowing an organization to take advantage of an existing base of workstations. As multiprocessing becomes more cost effective with high production volumes, these workstations will contain multiple processors. This will increase the demand for efficient parallel algorithms for the interactive stages of design to take advantage of the available computing resources.

Designers can have exclusive access to a workstation while sharing the pool of high performance computing nodes. Although many design tasks cannot cost effectively use more than 20 processors, greater numbers of processors can support multiple design tasks simultaneously.

The following sections will discuss the system architecture in more detail:

• workstations,

• computer node architecture,

• interconnection network,

• secondary storage,

• operating system,

• and parallel programming facilities.

4.2 Workstations

The workstations will support the wide range of existing design tools that use a graphical interface to improve the productivity of designers. Designers will use the workstations to enter circuit design details using such abstractions as block diagrams and logic gates. The workstations will provide visualization facilities for the display of diagnostic and simulation data generated by the pool of processors, and for the display of the design at different levels of the representation hierarchy. Color graphics are necessary for distinguishing features of the design.

Workstations that meet the user interactive needs of most designers are already available as high production volume items. The cost effectiveness of these machines has allowed designers to use them as single user machines. Exclusive use of a workstation allows a designer to get a fast response to commands, hence improving the designer's productivity.

Several researchers have examined the effect of the response time to commands on the productivity of a user. Thadhani collected results that show that as the system response time for user interactive tasks decreases below 1 second, productivity begins to rise sharply [Thadhani, 1981]. A user's productivity with a 0.3 second response time was about double that with a 3 second response time. Further work at IBM [Brady, 1986][Hennessy & Patterson, 1990, pp. 509-510] has shown that an expert user can improve productivity by a factor of 8 at a 0.3 second response time, compared to the productivity at a 1.5 second response time. A novice user can equal the productivity of an expert when the novice has access to a workstation with a faster response time. Brady has developed a model that attempts to explain the effect of response time on a user's behaviour [Brady, 1986]. Brady also notes that as the response time decreases, the efficiency with which a user can enter a command becomes more important.

As the cost of workstation hardware has diminished, the time a designer spends on a design becomes more dominant in the overall cost of the design. Thus the improvement in productivity available with a fast response time justifies the exclusive use of a workstation by a designer, even though the workstation may lie idle while the designer thinks about the next step in the design process.

The workstations will also provide the facilities necessary to develop software to run on the pool of processors. Computer-aided design techniques, such as automatic synthesis, which have greatly improved the productivity of circuit design, are now improving the productivity of software development [Murphy & Voelcker, 1990]. These Computer Aided Software Engineering (CASE) tools also make use of a graphical interface to help a software developer manage the complexity of software.

Architectural advances in microprocessors and their low cost (relative to the multiple chip central processing units used in super-minicomputers) have led to research into the use of tightly coupled shared memory multiprocessing to improve the performance of workstations [Thacker et al., 1988; Hill et al., 1986]. These multiprocessor workstations consist of multiple processors accessing shared memory via a shared bus. The processors each have a cache to minimize the traffic across the bus. The machines use cache coherency techniques to ensure that all processors have a consistent view of memory [Stenström, 1990].

The Leopard workstation, implemented at Adelaide University, is an example of such a multiprocessor workstation [Ashenden et al., 1987]. Figure 4.2 illustrates the architecture of the Leopard workstation. Up to four General Data Processors do general computation tasks. The operating system can allocate a task to any of these processors, and the operating system kernel is distributed across the processors [Vaughan et al., 1988].

[Figure 4.2: Leopard multiprocessor workstation [Ashenden et al., 1989]. The figure shows the Leopard Workstation Architecture (Special Data, General Data, and Device Processors with Shared Memory and Device Controllers) and the Leopard-2 Configuration (General Data and Storage & Comm's Processors, Shared Memory, Graphics Controller, and Monitor on Futurebus, with Ethernet, SCSI, and serial interfaces).]

The processors consist of National Semiconductor NS32532 microprocessors with 512 Kbytes of cache memory, and 4 Mbytes of local memory for local operating system code and data. The processors access 16 Mbytes of shared memory over a bus designed to the IEEE Futurebus standard. They use caches to reduce

bus traffic, with bus snooping hardware to maintain coherency [Ashenden & Marlin, 1988]. The Special Data Processors can be additional special purpose processors for accelerating a particular task, such as image processing or logic simulation. The Device Processors are processors for handling particular external devices such as disk drives, network interfaces, and graphics terminals. The Device Controllers are non-intelligent device interfaces controlled by tasks running on the General Data Processors.

Multiprocessor workstations are becoming commercially available (such as the SPARC¹ compatible workstations from Solbourne Computer, Longmont, Colorado). Most of the software on these machines does not take advantage of the multiple processors to reduce the running time of a single application. The multiple processors mostly provide a faster response time for a group of independent applications running at once. As higher numbers of these workstations reach the marketplace, there will be incentive for software developers to begin producing software that uses parallel processing techniques to reduce the running time of single applications. This will also provide incentive for developing computer-aided software engineering techniques to ease the task of writing parallel software.

The UNIX™ operating system is becoming the dominant operating system for workstations used for computer aided design. There have been recent moves to try to standardize the operating system [Murphy & Voelcker, 1990]. At the same time there has been much emphasis on providing a standard graphical user interface to prevent users having to learn a new interface for

¹SPARC is a trademark of Sun Microsystems, Inc.

each new software application [Harrison et al., 1990].

As multiprocessor workstations come onto the marketplace, efforts are also underway to provide an efficient multiprocessor version of the UNIX™ operating system. Early attempts at this simply ran the operating system kernel on one processor, which becomes a bottleneck when the other processors need to make frequent calls to operating system functions. More recently the operating system kernel has been distributed among the processors [Cajani, 1987; Russell & Waterman, 1987].

Thus workstations suitable for supporting the user interactive stages of integrated circuit design will need to support industry standard operating systems and graphical user interfaces [Harrison et al., 1990]. This allows designers to take advantage of the large base of software developed to run on such workstations. The performance of these workstations must be sufficient to allow designers to receive responses to their commands before they become distracted from their design task. To make the graphical interface productive, the workstations must be able to update the graphics display almost instantaneously in response to simple graphical interface commands (such as moving windows). Otherwise the time spent waiting for window updates will become a distraction to the designer, and remove many of the advantages of a graphical interface.

As the costs of multiprocessor workstations decrease, and the software technology for making use of multiple processors improves, multiprocessor workstations will become the workstations of choice.

4.3 Computer Node Architecture

As discussed in Chapter 3, most current parallel algorithms for the design of integrated circuits have only achieved small scale speedups over uniprocessor algorithms, generally less than a factor of 20. Thus the system will perform most efficiently with low numbers of high performance processing nodes, rather than with hundreds of simple processing nodes. It is economical to spend more on the performance of these nodes, because designers in a design team share the high performance nodes.

The requirements of the computing nodes suitable for CAD of integrated circuits are:

• VLSI microprocessor technology,

• high speed floating point support,

• memory management to provide virtual address spaces,

• and large amounts of local memory.

The following subsections will discuss these requirements in more detail.

4.3.1 VLSI Microprocessor Technology

To remain cost effective the nodes should take advantage of the developments in VLSI microprocessor design, rather than use expensive VHSIC technology (see Section 3.3.1). These developments have allowed microprocessor manufacturers to incorporate many of the functions and techniques used in mainframes and supercomputers onto a single chip at much lower cost.

Typical functions found on a single chip microprocessor include data and instruction caches, and memory management. Recent RISC based microprocessors have included multiple pipelined functional units, and can issue multiple operations per cycle (see Sections 3.2.2.2 and 3.2.2.3) [Jouppi et al., 1989]. These microprocessors make use of instruction level parallelism, independently of the programmer, with the aid of compilers. They achieve high performance on program code that is amenable to vector processing techniques for little extra cost, while still providing good performance for scalar code.

The Intel i860 microprocessor is a recent example of these microprocessors [Kohn & Margulis, 1989; Piepho & Wu, 1989; Margulis, 1990]. Figure 4.3 shows the layout of the 1,000,000 transistor, 10 mm x 15 mm chip. It has a RISC based integer core, a 3-stage pipelined floating point multiplier, a 3-stage pipelined floating point adder, an instruction cache, a data cache, a memory management unit that supports paged virtual memory, and a 64-bit external data bus. Using the 64-bit wide instruction cache it can execute one 32-bit integer instruction and one 32-bit floating point instruction per clock cycle. The floating point adder and multiplier can operate in parallel. A compiler can make use of the pipelining to get results from floating point adds and multiplies at the rate of one per clock cycle once the pipeline is full. The 64-bit wide external data bus can read in a 64-bit word every two clock cycles.
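The kind of loop that benefits is one of independent multiply-adds, as in the sketch below. The routine is a generic example written for illustration, not i860-specific code; a software-pipelining compiler can overlap its iterations so that, once the adder and multiplier pipelines are full, one add and one multiply complete per clock cycle.

    #include <stdio.h>

    #define N 1000

    /* Independent multiply-add iterations: nothing in iteration i
       depends on iteration i-1, so a compiler can keep both the
       floating point adder and multiplier pipelines full.        */
    void daxpy(int n, double a, const double *x, double *y)
    {
        int i;
        for (i = 0; i < n; i++)
            y[i] = y[i] + a * x[i];
    }

    int main(void)
    {
        static double x[N], y[N];
        int i;

        for (i = 0; i < N; i++) { x[i] = i; y[i] = 1.0; }
        daxpy(N, 2.0, x, y);
        printf("y[%d] = %g\n", N - 1, y[N - 1]);
        return 0;
    }

A loop with a dependence between iterations (for example, a running sum) could not be overlapped in this way, which is why scalar code sees less benefit.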

The performance of the i860 integer unit is about 30 MIPS when running with a clock rate of 40 MHz (where one MIP is equivalent to the performance of a VAX 11/780; actual numbers will vary depending on the code being run). The performance of the i860 on floating point code is highly dependent on the quality of the compiler. Margulis claims performance above 10 MFLOPS for the LINPACK benchmark [Margulis, 1989]. With a non-vectorizing C compiler the performance may be half that. To get the best performance, a library of hand optimized routines may be necessary to exploit the internal parallelism of the microprocessor.

[Figure 4.3: Intel i860 microprocessor layout [Kohn & Margulis, 1989]. The labelled blocks include the bus control unit, floating point multiplier, floating point adder, floating point registers, 3D graphics unit, RISC integer core, paging unit with TLB, data cache, and instruction cache.]

Thus recent microprocessors, running at higher clock speeds, run integer code about 8 times faster than the microprocessors operating at 16.6 MHz on the HPC (see Section 2.2.5.1). Floating point performance has increased by about 30 to 60 times that achievable with the 68881 coprocessor used with the 68020 microprocessor. Most of this performance improvement is a result of integrating the floating point unit onto the microprocessor.

The most important parameter in choosing a microprocessor for the high performance nodes is the performance on scalar code. However, the potential for higher performance on vectorizable code is advantageous for applications such as direct method circuit simulation if the additional cost is minimal.

4.3.2 Floating Point Support

High speed floating point support is necessary for accurate circuit simulation. Previously high performance microcomputers used separate chips for this, but increasing levels of integration have made it possible to incorporate this on the microprocessor chip. Incorporating the floating point hardware on chip allows for larger bandwidth data paths between cache memory and floating point registers. Multiported register sets allow the integer processing unit to update the contents of some floating point registers while floating point operations are taking place.

4.3.3 Virtual Memory Management

Memory management facilities allow programs to use large virtual address spaces without knowledge of their physical location in memory. An operating system can then decide where to load them in memory and can relocate them as necessary. Memory management hardware can reduce the overhead of translating from virtual to physical addresses by using a cache of recently used mappings, called a translation lookaside buffer (TLB) [Denning, 1970]. This hardware is now commonly available within a microprocessor. Providing these facilities on a high performance node gives the operating system flexibility in managing the physical memory available, and allows for the use of a disk to provide demand paged virtual memory.
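As an illustration of why a TLB is effective, assume (the figures are invented for the example) a hit ratio h = 0.98 and a page table walk costing t_walk = 20 cycles on a TLB miss. The average translation overhead per memory access is then

    t_overhead = (1 - h) t_walk = 0.02 x 20 = 0.4 cycles,

compared with 20 cycles per access if every translation required a table walk.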

Milenkovic provides a good overview of virtual memory management techniques, and reviews several recent examples of commercial memory management hardware [Milenkovic, 1990].

4.3.4 Local Memory

Each processing node should have a large amount of local memory to support large programs, and allow large data structures. This means that high density dynamic RAM is necessary. Although recent microprocessors have internal caches to prevent external memory accesses becoming a bottleneck, another level of caching between the microprocessor and the dynamic RAM can be beneficial. Dynamic RAM chips (such as those supporting page mode and static column accesses) can achieve bandwidths matching recent microprocessor bus bandwidths for multiple accesses to single rows in their memory arrays. Dynamic RAMs still suffer higher latency to access the first bit compared to a static RAM. High speed static RAM can act as a zero latency cache for the microprocessor to minimize the effects of the dynamic RAM latencies.

The goal is to provide as much local memory as possible given area and cost constraints [Kung, 1986]. Keeping local memory on the same board as the microprocessor avoids the overheads of accessing memory over a bus. Commercially available single board computer modules (double-eurocard size), such as those available for VMEbus, typically have 4 Mbytes of on-board local memory using 1 Mbit DRAMs. 4 Mbit DRAMs offer the potential for up to 16 Mbytes of on-board local memory. Larger board sizes used with standards such as IEEE Futurebus can support additional memory.

The steady improvement in storage densities (DRAM density has quadrupled about every 3 years, and magnetic disk density has doubled about every 2 years) shows that additional address bits will be necessary on the next generation of microprocessors to allow the possibility of address spaces above 4 Gigabytes. Hennessy and Patterson report that the average memory required by a program has increased by a factor of 1.5 to 2 per year [Hennessy & Patterson, 1990, pp. 16-17].
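To see how quickly a 32-bit (4 Gigabyte) address space can be exhausted, assume for illustration a program that needs 16 Mbytes today and grows at the midpoint factor g = 1.75 per year. Its requirement exceeds 4 Gbytes after

    n = log(4096/16) / log(1.75) = log(256) / log(1.75), approximately 9.9 years,

well within the design lifetime of an architecture.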

4.4 Interconnection Network

This section will start by discussing a network design for heterogeneous multicomputers called Nectar. The Nectar design goals closely match the design requirements of the network required to interconnect the computing nodes in Figure 4.1. In addition, the Nectar design has addressed the main sources of inefficiency that cause long message latencies.

This section then goes on to discuss the trend in network topology back to networks with higher diameter, and dimensions that suit physical construction in 2 or 3 dimensions. Next, it discusses an emerging interconnection standard called the Scalable Coherent Interconnect (SCI), for connecting processors in a large high performance multiprocessor system. It will also introduce two more emerging standards for interconnecting computers: the HSC standard and the FDDI standard. Finally it will discuss the requirements of the interconnection network for interconnecting the workstations and computing nodes; the discussion takes into account the recent advances discussed in the preceding sections.

This section consists of the following subsections:

• Nectar,

• Network topology,

• Scalable Coherent Interconnect (SCI),

• High Speed Channel (HSC) standard,

• Fiber Distributed Data Interface (FDDI) standard,

• and Interconnection network requirements.

4.4.1 Nectar

Arnould et al. discussed the design of a network for heterogeneous multicomputers called Nectar [Arnould et al., 1989]. The three major goals for their interconnect network were:

• to support heterogeneity,

• to provide scalability,

• and to provide low latency, high bandwidth communication.

Different parts of an application may have different computing requirements, therefore a network should support different types of nodes, hence making it heterogeneous. For example, in computer aided design the computing requirements of design specification are different from those of design simulation. A conventional workstation can support the entry of design details, and a high performance computing node can support simulation.

A scalable network allows the addition of nodes without major alterations to the computing system. The addition should not affect the communication latency and bandwidth of existing nodes, and should not require major alterations to the system software. Most networks are only scalable within a limited range. For example, a network designed to handle about 100 nodes may not scale to 100,000 nodes, because of the effects of the extra distances signals must travel.

The latency to set up the communication path and handle the communication protocols usually dominates the time to send a short message between nodes. The bandwidth of the communication link dominates the time to send a long message. Both low latency and high bandwidth are necessary to support communications for a variety of applications.

Annaratone et al. have simulated the effect of different communication speeds on the performance of a variety of applications [Annaratone et al., 1989]. They defined a communication ratio q as the ratio of the time to send or receive a double precision floating point number over the time to do a double-precision multiply-add operation. They examined the effect of varying this ratio for common applications using 64 processors connected together in a torus arrangement. They concluded that a q < 8 was necessary for high performance. The performance degrades quickly for ratios above 8, whereas ratios below 8 only achieve marginally improved performance.

On the HPC architecture discussed in Section 2.2.5.1, q is about 60 using the above definition. The latency to set up the message dominates this ratio, rather than the network transmission time. If a processor node can do several floating point operations and send the results as a single message, q can drop to about 1.3. Thus an application should try to send data in a few large messages, rather than many small messages, to make use of the communication bandwidth available.
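The improvement from batching follows directly from the definition. If a message costs a fixed setup time t_setup plus t_word per word sent, and k results are batched into one message, the effective ratio is

    q_eff = (t_setup + k t_word) / (k t_fp),

which, reusing the HPC figures quoted above, falls from about 60 at k = 1 towards the asymptote t_word / t_fp, about 1.3, for large k.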

A communications system can increase its bandwidth more easily than it can reduce its latency. A system can increase the width of its data path, or use high bandwidth media like fibre optics, to increase the bandwidth. Communication software tends to dominate the latency, and this is hard to speed up.

For example, the transfer rate between two processes running on separate processors (25 MHz 68020) of the HPC, using the VORX communication primitives, is about 1 Mbyte/sec over a communication link capable of 12.5 Mbytes/sec [Gaglianello et al., 1989]. The time to transfer 4 bytes is about 300 microseconds; most of this time is for the software overheads.
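A first-order model of the transfer time, assuming a fixed software overhead t_sw and an effective bandwidth B (and ignoring any per-byte software cost), makes the latency dominance explicit:

    T(N) = t_sw + N / B.

With t_sw of about 300 microseconds and B of about 1 Mbyte/sec, a 4 byte message achieves an effective rate of only about 13 Kbytes/sec, and a transfer must exceed roughly t_sw x B, about 300 bytes, before the transmission term even equals the fixed software cost.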

Arnould et al. state the three main sources of network inefficiency as:

• context switching and data copying,

• overhead of high level protocols,

• and interrupt handling and header processing for each packet.

Context switching overhead occurs when a user process invokes an operating system function to do a network transfer. Often the operating system function copies data from user space to kernel space before doing a transfer. A communications function uses high level protocols to ensure reliable communications; these add to latency. Normally the node processor executes the high level protocol software. The reliable transmission of a message may require handling several control packets, resulting in interrupting the node processor to do the necessary processing.

Nectar provides a communications accelerator board (CAB) to handle the network tasks for each network node. User processes have direct access to a high-level network interface mapped into their address spaces, removing the need for system calls. The CAB handles the communication protocols, and informs the node processor when a complete transaction is successful. The disadvantage is the added cost of the CAB at each processing node, as the CAB contains a SPARC™ processor along with memory and a DMA controller.

A recent paper by Ramachandran et al. also suggests using hardware acceleration for the communications [Ramachandran et al., 1990]. They suggest that hardware should accelerate local communications as well as network communications. This is because recent message-based operating systems benefit from faster local communications between client processes and server processes.

The Nectar network consists of HUB crossbar networks with fibre optic lines (with 100 Mbits/sec peak bandwidth) to the CAB boards. Figure 4.4 illustrates a single HUB cluster. Initially each HUB consisted of an 8-bit wide 16 x 16 crossbar switch. Arnould et al. report that 8-bit wide 32 x 32 crossbars are practical using off-the-shelf components, and 128 x 128 crossbars are possible using custom VLSI. The HUB clusters can interconnect in any topology to suit the application. Communication between nodes on separate HUBs can use either circuit or packet switching (see Section 3.3.6) to route data through the HUBs.

The HPC network (see Section 2.2.5.1) uses similar ideas, with clusters of 12 x 12 crossbar connected nodes interconnected in a hypercube topology [Gaglianello et al., 1989]. The network uses hardware routing using virtual cut-through (Section 3.3.6), but relies on the node processor for processing the communication protocols.

As the processing speed of microprocessors has increased (see Section 4.3.1), the speed of communications must also increase to prevent communications becoming the bottleneck. For example, processing speeds for integer code have improved by about a factor of 8 over the processors used in the HPC. As the microprocessor on an HPC node handles the communications protocols, the software latencies will also scale with advances in processing speed (if memory speed has also increased). The bandwidth of the communications links will need to increase to match increases in processing speed, or it will

[Figure 4.4: Single HUB Cluster of the Nectar network [Arnould et al., 1989]. Nodes attach through CABs to a central HUB via fiber lines.]

become the bottleneck instead of the communications software.

The following subsection discusses trends in the network topologies used to build multiple processor machines.

4.4.2 Network Topology

Most commercial multicomputers such as the Intel iPSC and the NCUBE have used a hypercube interconnection topology. The topology has the advantage of reducing the number of intermediate nodes that a message must traverse in the worst case, compared to networks such as meshes and trees. Programs written for these machines must often take into account the network topology to maximize performance.

Many real world problems, including static analysis problems (see Section 3.4.2), suit a grid-like mapping onto a parallel machine. Annaratone et al. pointed out that this mapping onto a hardware hypercube topology is only efficient if the communication time is low relative to the calculation time (i.e. a ratio of communication time to calculation time less than 8) [Annaratone et al., 1989].

Ideally parallel programs should run on any MIMD machine, irrespective of the underlying network topology. Athas and Seitz pointed out that advances in network design, such as the use of wormhole routing techniques, are making networks fast enough for the programmer to ignore the underlying architecture [Athas & Seitz, 1988]. Efficient routing techniques have removed some of the problems of traversing many intermediate nodes. Recently researchers have favoured network topologies such as rings and toruses, which have higher diameter, and dimensions that suit physical construction in two or three dimensions (higher radix versions of k-ary n-cubes) [Annaratone et al., 1989; Vitányi, 1988]. Athas and Seitz reported a result derived by Dally [Dally, 1987] stating that a two dimensional network minimizes network latency for 256 processors.

Two dimensional networks such as meshes are easier to wire together, because individual wires are short and do not need to pass around or over other processors. This allows more parallel wires for the same path, providing higher bandwidth, in contrast to hypercube topologies that need longer wires. Breaking toruses into 2-D meshes makes it easier to route messages and partition the network into equivalent subnetworks for multiple users. Athas and Seitz predict that 2-D networks will be standard for the next generation of multicomputers [Athas & Seitz, 1988].

The following subsection discusses an emerging interconnection standard, which provides high bandwidth point to point links to allow systems to use simple topologies such as rings.

4.4.3 Scalable Coherent Interconnect (SCI)

Two standards are emerging for high performance multiprocessor systems with shared memory:

• Futurebus+,

• and the Scalable Coherent Interface.

Futurebus+ is a high performance bus, which supports cache coherence, for small scale multiprocessor systems [IEEE P896.1, 1990]. Using asynchronous handshaking, Futurebus+ is expected to support burst transfers of between 20 and 25 million transfers/sec. In source-synchronous mode, Futurebus+ should support greater than 50 million transfers/sec, where the standard supports bus widths of 32, 64, 128, and 256 bits [Hawley, 1989].

The SCI (Scalable Coherent Interface) standard defines a transaction set and protocols that support a variety of communication technologies [IEEE P1596, 1990; Warren, 1990]. The aim of SCI is to provide a coherent shared memory model scalable to 64,000 nodes.

SCI uses unidirectional, point-to-point links instead of a backplane bus such as Futurebus. Modules connecting to a shared bus disturb the electrical transmission characteristics of the bus, limiting the bandwidth of the bus. Also a bus only supports one transaction at a time, which can cause a bottleneck as the number of processors increases. Point-to-point links allow a higher clock rate, hence saving on the number of signals in the link for the same effective bandwidth. Reducing the number of signals in a link reduces the pin count for bus interface logic. This allows the integration of the entire bus interface on one chip, effectively improving the matching between signal lines, thus reducing the skew and allowing higher clock rates. An experimental implementation of the SCI standard uses 16-bit wide point-to-point links on an ECL/copper backplane to achieve a link speed of 1 Gigabyte/sec, using both edges of a 250 MHz clock.
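The quoted figure is consistent with simple arithmetic: 16 bits is 2 bytes per transfer, and clocking on both edges of a 250 MHz clock gives 500 x 10^6 transfers/sec, so the link moves 2 x 500 x 10^6 = 10^9 bytes/sec, i.e. 1 Gigabyte/sec.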

A low cost SCI system would use a passive backplane with the processors connected in a ring topology using unidirectional links. A large scale, high performance system would use an active, multistage switching network with SCI interfaces at the node and at the node's port into the switching network. Each interface includes a response and a request buffer. Figure 4.5 illustrates these two possible applications of the SCI standard.

The SCI protocols are packet based and support shared memory transactions. The protocol uses 64-bit wide addresses, with 16 bits used as node identifiers and the remaining 48 bits available for addressing within a node. Transactions include reads and writes. A requester begins a transaction and a responder completes it. A read transaction begins with the requester sending a request packet into the network.

[Figure 4.5: SCI applications. A low cost system connects nodes in a backplane ring (an SCI ringlet); a larger system connects SCI ringlets with an active switch.]

[Figure 4.6: SCI transaction phases [IEEE P1596, 1990]. The requester and responder exchange request-send, request-echo, response-send, and response-echo packets.]

Intermediate nodes or switches route the packet by examining the destination address. If the destination node has space for the packet, it will send a done-echo packet; otherwise it will send a retry-echo packet to the requester, informing the requester that the request packet must be resent. When the destination is ready, it will send a response packet containing the requested data. The requester can either confirm receipt of the data, or request that the responder resend the data packet. Thus the protocol uses a minimum of four packets for a read transaction. Write transactions occur in a similar manner. Figure 4.6 illustrates this protocol.
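A small C sketch of the requester's half of this read protocol appears below. The packet names follow the text, but the two link functions are stubs, invented so the example is self-contained; they simulate a responder whose input queue is full on the first attempt.

    #include <stdio.h>

    enum echo { DONE_ECHO, RETRY_ECHO };

    static int attempts;

    /* Stub for phases 1 and 2 (request-send, request-echo): the
       simulated destination has no buffer space the first time. */
    static enum echo send_request(int node)
    {
        (void)node;
        return (++attempts == 1) ? RETRY_ECHO : DONE_ECHO;
    }

    /* Stub for phases 3 and 4 (response-send, response-echo). */
    static double recv_response(void)
    {
        return 42.0;
    }

    double sci_read(int node)
    {
        while (send_request(node) == RETRY_ECHO)
            ;                   /* retry-echo received: resend request */
        return recv_response(); /* done-echo received: await response  */
    }

    int main(void)
    {
        double d = sci_read(5);
        printf("read %.1f using %d request packet(s)\n", d, attempts);
        return 0;
    }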

SCI maintains cache coherence in a shared memory environment by using a directory-based cache coherence protocol [Agarwal et al., 1988; Chaiken et al., 1990; James et al., 1990]. Every memory block (usually the size of a cache line) has a directory, with a pointer to a linked list of the processors sharing it. When a processor wants to modify a cached block, it inserts itself at the head of the list and purges the remaining list entries. Thus only one processor can have write access at a time.
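The sketch below shows, in much-simplified form, the directory operations just described: a sharing list per memory block, read misses inserting at the head, and a writer purging the list so that it becomes the sole entry. The real SCI sharing lists are doubly linked and distributed among the caches themselves; this single linked list is only illustrative.

    #include <stdio.h>
    #include <stdlib.h>

    struct sharer {
        int            cache_id;
        struct sharer *next;
    };

    struct directory {
        struct sharer *head;    /* caches currently sharing the block */
    };

    /* A read miss adds the cache to the head of the sharing list. */
    void add_sharer(struct directory *d, int cache_id)
    {
        struct sharer *s = malloc(sizeof *s);
        s->cache_id = cache_id;
        s->next     = d->head;
        d->head     = s;
    }

    /* A write purges the old sharers and leaves the writer as the
       sole entry, so only one processor holds the block writable. */
    void obtain_write_access(struct directory *d, int cache_id)
    {
        struct sharer *s = d->head, *next;
        while (s != NULL) {
            next = s->next;
            printf("invalidate cache %d\n", s->cache_id);
            free(s);
            s = next;
        }
        d->head = NULL;
        add_sharer(d, cache_id);
    }

    int main(void)
    {
        struct directory d = { NULL };
        add_sharer(&d, 1);           /* caches 1 and 2 read the block  */
        add_sharer(&d, 2);
        obtain_write_access(&d, 3);  /* cache 3 writes: 1 and 2 purged */
        return 0;
    }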

Protocols proposed for SCI allow for addressing processing nodes and their memory. A message passing environment is simple to implement with these protocols. There is the possibility of partitioning the memory space on each node into local and shared memory. The cache coherency protocols will allow for sharing of the distributed memory.

In a recent article, Chlamtac and Franta review recent work in high speed networks [Chlamtac & Franta, 1990]. They discuss two emerging standards for high speed networks: HSC (High Speed Channel) and FDDI (Fiber Distributed Data Interface). The next two subsections will overview these standards.

4.4.4 High Speed Channel (HSC) standard

The HSC channel standard derives from the proprietary HSX channel used in Cray computers. It uses twisted pair cable to transmit data at 800 Mbits/sec in a single direction over distances up to 25 metres. The use of fibre optics could improve on this distance. It is suitable for a circuit switching environment with networks made up of crossbar switches, and, once it establishes a connection, transmits data in bursts of 256 x 32-bit words followed by an error check. Figure 4.7 shows the signals used in an 800 Mbit/sec interface.

Chlamtac and Franta report that HSC interfaces are beginning to appear on supercomputers and superminis [Chlamtac & Franta, 1990].

The next subsection will discuss a lower performance standard that allows connection over much longer distances.

[Figure 4.7: HSC interface signals [Chlamtac & Franta, 1990]. Source and destination exchange REQUEST, CONNECT, READY, PACKET, and BURST signals over a 32-bit data bus, with a clock and an interconnect line in each direction.]

4.4.5 Fiber Distributed Data Interface (FDDI)

Currently the 10 Mbits/sec Ethernet is the most common local area network for interconnecting workstations and their file servers [Metcalfe & Boggs, 1976]. As the performance of workstations improves, this network is becoming a bottleneck. The 100 Mbits/sec FDDI network standard [Ross, 1986] may replace Ethernet in the workstation environment. It uses token ring techniques with a fibre optic medium, and can operate over distances up to 2 km. Chlamtac and Franta report on the use of two counter-rotating token rings to provide fault tolerance [Chlamtac & Franta, 1990]. They also point out that, at present, FDDI interfaces for workstations are expensive, but prices should come down as their use becomes more widespread.

4.4.6 Interconnection Network Requirements

The goals for the interconnection network suitable for CAD of circuits are similar to the design goals of the Nectar network. They are as follows:

• allow heterogeneity,

• provide scalability,

• adhere to applicable standards, e.g. SCI, HSC, FDDI,

• use a crossbar switch,

• and allow long distance links.

Although it is possible to build networks using ECL technology, which operate with very high clock frequencies, the cost is prohibitive. Systems with very high communication rates can use simple connection networks such as the ring without concern for the number of intermediate nodes a message must pass through. Using more complex interconnections, networks such as the Nectar and the HPC networks (see Section 2.2.5.1) can achieve high throughput using slower communication link rates. For example, a system with several hundred nodes may use a mesh connection network.

The design decision to use low numbers of high performance nodes allows the use of a crossbar interconnect for connecting the nodes. Larger scale systems can consist of a network of clusters, where each cluster consists of crossbar connected nodes. For low numbers of these clusters, a completely connected topology could also be cost effective. This will allow the network to scale up to about 100 nodes. A crossbar connected system reduces the chances of network contention for a heavily loaded system. The ability to multicast messages would also be useful. Figure 4.8 illustrates this architecture.

[Figure 4.8: Interconnection system. Computing nodes connect through communications interfaces to a crossbar switch over short high speed links (less than 25 metres, about 800 Mb/s, e.g. HSC or SCI); workstations connect over longer, slower links (less than 1 km, about 100 Mb/s, e.g. FDDI).]

A system that uses connection standards ensures compatibility with high volume production components such as interfaces for workstations. Many of the features of the proposed SCI standard are compatible with the goals above. The network can use point to point links, compatible with the SCI standard, between the crossbar interconnect and the nodes. This requires an SCI interface at both ends of the link, with two unidirectional links making up a bidirectional communication link. The computing nodes may

connect over a backplane to the crossbar interconnect to achieve high speed communication, although this makes packaging the system more difficult. Alternatively, if HSC interfaces become widely available, the network could use HSC links operating at up to 800 Mbits/sec. Twisted pair cable would be adequate as the computing nodes will not need to be more than 25 metres apart. For connecting distributed workstations, the links may be serial fibre optic links, as used in the Nectar network. These links may make use of standard 100 Mbits/sec FDDI interfaces if they become common on workstations. The system can use special purpose accelerators connected into the network with the short high speed links.

The network should provide distributed hardware routing. Programmable routing tables to route packets through the network, such as used in the HPC multicomputer, allow flexibility in adding or removing nodes from the network.

As levels of integration get higher it will become more cost effective to provide a programmable communications chip for handling the protocols. At present a high performance processing node with efficient interrupt handling can provide processing of the communication protocols. The system should give user programs access to the network interface to allow programmers to develop communications primitives specific to an application if required.

With an 800 Mbits/sec communication link, the communications software will still dominate the transfer time across the network for small transfers. The system will only achieve the communication to computation ratios required for logic simulation (see Section 3.4.3.3) and many other applications [Annaratone et al., 1989] when sending data with large transfers (i.e. above 1 kilobyte). Large transfers take advantage of the available bandwidth.
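The 1 kilobyte figure can be checked against the model used earlier, assuming for illustration a fixed software overhead of 100 microseconds per message: the wire time for 1 kilobyte at 800 Mbits/sec is only 1024 x 8 / (800 x 10^6) seconds, about 10 microseconds, so even at that size the software term still dominates, and transfers well above a kilobyte are needed before the link bandwidth becomes the limiting factor.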

4.5 Secondary Storage

A parallel application needs to receive data and instructions from secondary storage, and then return the results to secondary storage. For CAD of circuits with over 100,000 transistors, this data is large, and the time to transfer this data can form a significant proportion of the total computing time. When the size of the problem is too large for all its data to remain in the primary memory of the machine during processing, secondary storage must hold intermediate results as the calculation progresses (see Section 2.2 on Parallel Goalie). If the I/O system performance remains constant as the processing speed of the system increases, the I/O system will limit the net performance improvement.

For a distributed workstation environment, the trend has been towards using a high performance centralized file server for a network of diskless workstations. Smith and Maguire have presented results showing that accessing files on a high performance file server over a local area network is faster than accessing files on a standard local disk [Smith & Maguire, 1989].

There are several techniques, such as the use of disk caches and dynamic scheduling of accesses, available to increase the performance of a file server [Smith, 1981; Kim, 1986; Kim, 1985; Katz et al., 1989a]. The mechanical characteristics of a disk, such as the seek time, rotational latency, and transfer rate, provide an upper limit to the performance of a single disk. Optical disks with larger capacity than magnetic disks may offer high transfer rates in the future, but their access and write times may be slow [Berra et al., 1989].

The traditional approach to improving the performance of a file system is to use several disks, with each file residing on only one disk, to provide a uniform spread of file access requests. This system is suitable for applications making many requests to different files, such as in a transaction processing system, as it reduces the average time that a request waits in a device queue. The method has poor performance if the I/O requests are nonuniform across the disks. It does not help an application that makes infrequent accesses to a single large file, such as in many scientific applications.

The following subsection will describe techniques to improve the raw I/O bandwidth of secondary storage with disk arrays. The next subsection will discuss the techniques used in the Bridge file system for managing data interleaved across disks attached to separate processors. Finally there will be a discussion of the secondary storage requirements derived from the recent research in secondary storage systems mentioned in the preceding subsections.

The organization of the following subsections is as follows:

• Disk arrays,

• Bridge file system,

• and Secondary storage requirements.

4.5.1 Disk Arrays

The large scale production of disks for the personal computer and workstation market has led to the availability of high-density hard disk assemblies at very low cost [Katz et al., 1989b; Katz et al., 1989a]. This has led to

[Figure 4.9: Disk array. Data elements are interleaved across n disks of m bytes/sec each, for a combined bandwidth of n x m bytes/sec.]

research in using arrays of disks in parallel to achieve high performance. The basic concept involves interleaving data across the disks (called disk striping), and accessing this data in parallel to provide an increased I/O bandwidth [Salem & Garcia-Molina, 1986]. Figure 4.9 illustrates this concept, showing the bandwidth of n disks combined together.
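The striping function itself is trivial, which is part of the technique's appeal. The sketch below places logical element i on disk i mod n at local offset i / n, matching the round-robin layout of Figure 4.9 (the disk count of 4 is arbitrary, chosen for illustration).

    #include <stdio.h>

    #define NDISKS 4

    /* Round-robin striping: consecutive elements go to consecutive
       disks, so a sequential scan keeps all disks busy at once.   */
    void locate(long element, int *disk, long *offset)
    {
        *disk   = (int)(element % NDISKS);
        *offset = element / NDISKS;
    }

    int main(void)
    {
        long e, off;
        int disk;

        for (e = 0; e < 8; e++) {
            locate(e, &disk, &off);
            printf("element %ld -> disk %d, offset %ld\n", e, disk, off);
        }
        return 0;
    }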

Kim describes the use of a synchronized array of disks with byte interleaving [Kim, 1986; Kim, 1985]. A synchronous motor controls each disk. A central clock, along with a feedback control loop, synchronizes the disks. The technique offers substantial improvements in I/O bandwidth, which suits scientific applications that read large sections of a single file infrequently. It does not provide improvements in the initial seek and rotational latency overheads, hence it will not help applications with many small requests to

different files, where the transfer times are small. Fault tolerance is easy to provide by adding extra disks to allow error correction and continued operation if a disk fails. The disadvantage is the difficulty, and hence expense, of building large synchronous arrays.

Katz et al. have looked at using asynchronous arrays of disks, which allow cheap off-the-shelf drives to operate as an array with a single controller [Katz et al., 1989b; Katz et al., 1989a; Patterson et al., 1988]. The drives independently return interleaved data into a shared buffer. The disadvantage of this scheme is that each transfer incurs the worst case seek and rotational latencies for each access across the group. For large transfers the additional disk latencies will have less effect. The advantage is the saving in cost over a synchronized array.

One disadvantage of spreading each data transfer across several disks is that it does not give any advantage for large numbers of independent small transfers. Patterson et al. suggested interleaving data on a disk sector basis (rather than byte), allowing several independent small transfers to occur in parallel [Patterson et al., 1988]. For fault tolerance, they rely on the controller to identify which disk has an error, and use an extra disk to allow the recording of parities for correcting the error [Gibson et al., 1989]. Distributing the parity sectors across the disks prevents one disk becoming a bottleneck in write operations. A single sector read operation only involves one disk. Figure 4.10 illustrates the arrangement of parity blocks among the disks. Note that a write to block 0 of disk 2 can occur in parallel with a write to block 1 of disk 1. A read from block 0 of disks 1 and 2 can occur in parallel.
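The rotated placement can be captured in a few lines. The sketch below is a hedged illustration of the idea (a five-disk group is assumed for concreteness), not the exact layout function of any published array.

    #include <stdio.h>

    #define NDISKS 5    /* per stripe: 4 data sectors plus 1 parity */

    /* Rotate the parity sector so no single disk serves every write. */
    int parity_disk(int stripe)
    {
        return stripe % NDISKS;
    }

    /* The k-th data sector of a stripe skips over the parity disk. */
    int data_disk(int stripe, int k)
    {
        int d = k % (NDISKS - 1);
        return (d >= parity_disk(stripe)) ? d + 1 : d;
    }

    int main(void)
    {
        int b, k;

        for (b = 0; b < 3; b++) {
            printf("stripe %d: parity on disk %d, data on disks", b,
                   parity_disk(b));
            for (k = 0; k < NDISKS - 1; k++)
                printf(" %d", data_disk(b, k));
            printf("\n");
        }
        return 0;
    }

Because the parity location differs from stripe to stripe, two small writes to different stripes usually touch disjoint pairs of disks and can proceed in parallel, which is the property the text notes for writes to blocks 0 and 1.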

Reddy and Banerjee have simulated the performance of various combinations of traditional multiple disk systems, byte interleaved synchronized arrays, and block interleaved asynchronous arrays (which they call declustered arrays)

[Figure 4.10: Interleaved parity sectors [Katz et al., 1989a]. The parity block for each block row (0 to M-1) rotates across the disks 1 to n.]

[Reddy & Banerjee, 1989]. They predict that disk synchronization is impracticable beyond 8 or 16 disks; however, the individual disks in a traditional or declustered system could consist of synchronous arrays. They obtained results for both a typical UNIX™ file system workload, and a scientific workload consisting of matrix arithmetic algorithms.

For the UNIX™ file system workload, synchronized arrays gave better results at low request rates, and conversely, declustered systems gave better results at high request rates. Combinations of declustered and synchronous systems worked best with large block sizes.

For the scientific workload, the best performance occurred when all available disks participated in large transfers by having each file spread across all the disks, maximizing the available parallelism. There was little performance difference between the various combinations of synchronized and asynchronous systems for a scientific workload.

Recent work by Reddy and Banerjee extended the above work to simulate using multiple disks distributed among the nodes in a hypercube multicomputer (with the communication characteristics of an Intel iPSC/2 hypercube) [Reddy & Banerjee, 1990]. They found that when separate processors accessed blocks in a file in random order with the file system workload, the declustered system gave better results even at low request rates. This was because the synchronous array could not take advantage of one processor requesting multiple blocks in a contiguous order, thus incurring the disk latency only once.

Reddy and Banerjee also found that with a scientific workload the time to read a file did not scale linearly with the number of disks in the array [Reddy & Banerjee, 1990]. This was because the transfer time across the inter-processor communication links became the bottleneck rather than the disk transfer time.

Olson has also shown that disk arrays can improve the performance for random I/O environments, such as transaction processing and file servers [Olson, 1989].

Parallel arrays of disks can offer substantial performance improvements for a file server. A high speed network allows the file server to provide the benefits of increased I/O bandwidth to processors on the network. In a multiple processor environment, however, the file server may become a bottleneck when many simultaneous requests for data occur.

The approach used in Parallel Goalie was to allow each processor to access its own local files on local disks (see Section 2.2.5.3). This gives the programmer the task of data partitioning and organization for each application. Consider a design verification task that begins by extracting an electrical circuit from a mask layout, and then simulating this circuit for different input combinations.

The optimal partitioning of data for these tasks is usually different. Running multiple tasks on the same set of initial data may require a rearrangement of data for each task. Predicting the optimal arrangement of data before running a task is often difficult, leading to load balancing problems.

The following subsection discusses the Bridge file system, which simplifies the problem of organizing the data into local files by interleaving the data across all the disks in the system.

4.5.2 Bridge File System

The Bridge file system provides an environment for managing data on a multiple processor system [Dibble et al., 1988; Dibble & Scott, 1989]. It interleaves blocks of data among storage devices attached to separate processors. A local file system (LFS) runs on each processor that has an attached storage device. For high performance, these storage devices can consist of arrays of disks operating in parallel. A Bridge Server forms the interface between the file system and user programs. Users have a choice of three interfaces: accessing data through standard sequential operations, using the Bridge Server to forward requests transparently to the appropriate LFS; using the Bridge Server to transfer data in a single parallel operation to a group of processes working on one task; or using user defined tools to interact with the LFS level of the system, after obtaining details from the Bridge Server. Tools can create processes on the storage nodes to preprocess information before sending it across the network to the requesting processor.

The assignment of interleaved blocks to disks depends on the application. In scientific applications, where applications typically access files sequentially, a round robin scheme is suitable. For transaction processing, however, when a transaction deletes a record in the middle of a file, all the following blocks would need to move. Bridge therefore supports a linked list representation for block distributions other than round robin.

The implementation of the Bridge Server depends on the frequency of requests to the server. Where most of the compute intensive processing tasks use tools to interface with the LFS level of the system, programs only request information from the Server when they first access files. This implies that a centralized server would not become a bottleneck [Dibble et al., 1988].

A disadvantage of the system is that a disk failure will corrupt all the files. Dibble and Scott suggest that the improved reliability of storage arrays, via the use of extra disks for error correction, will ensure overall reliability [Dibble & Scott, 1989].

Reddy et al. have analysed the problem of embedding I/O nodes into a network of processing nodes, such that each processing node is adjacent to at least one I/O node (a processing node with I/O facilities) [Reddy et al., 1988; Reddy & Banerjee, 1990].

4.5.3 Secondary Storage Requirements

The two main options for an I/O system suitable for a multicomputer architecture for CAD of circuits are:

• a centralized high performance file server,

• or a file system distributed among the processing nodes.

A centralized file server is the simplest option. It has the advantage of making the use of synchronized storage arrays more cost effective, by centralizing the storage costs. For the compute intensive CAD tasks, such as circuit extraction, sequential file access to large blocks of data is most common. For low numbers of disks a synchronized organization offers the best performance. For large numbers of disks, groups of synchronized disks operating

A system using a file system distlibuted among the processing nodes, as in the Bridge file system. avoids the bottlenecks of centralized file system soft- 'ware. The use of standard disks (rvith bandrvidths about 3 Megabytes/sec) would keep the costs dorvn, but for better performance the storage devices should consist of arral.s of disks. Asynchronous disks appear to offer the best

cost trade off betu'een a centlalized disk sïstem using expensive synchronous disk arrays, and a distlibuted system of cheap standard disks."Processes lo- cal to the storage nodes can preptocess data, by extracting only the requirecl information, to reduce the interprocess communication overheads. Process- ing nodes rvith attached disks would need to allorv I/O requests to interrupt the current process to allorv local i/O processing. Alternatively, an attachecl I/O processor could manage the disk, as in the arrangement discussed by

Reddy and Banerjee [Reddy & Banerjee, 1990]. Ideally in a system of 20 processing nodes, each node would have an attached storage device to ensure maximum paralleìism. For a crossbar interconnection the placing of fewer storage devices than processors is not important. The operating system soft- ware can initially allocate processes to plocessing nodes without storage to try to overlap storage operations rvith processing.

Figure 4.11 illustrates the arrangement of disks in the system. The workstations will use their own disks with a file system supported by an industry standard operating system. The workstation disks may be local to each workstation (as Figure 4.11 shows), or they may be at some central file server. This file server could be on either a local network connecting the workstations, or connected directly to the crossbar switch with a high speed (e.g. 800 Mb/s) link.

The workstation disks will store the master copies of the design description and copies of the application software for the pool of processing nodes. For non-interactive tasks, the system downloads the necessary programs and data to the processing nodes and their disks. A series of applications can work on the data stored amongst the pool of processing nodes without sending intermediate results back to the workstation disks. For example, a user could interactively design a cell, specify parameters of some cells ready for automatic generation, and specify the interconnection requirements of the cells. The user can send this information to the pool of processing nodes, which can do cell generation, cell placement, routing and compaction, using multiple processors as necessary. If the user is happy with the area of the generated design, the pool of processing nodes could do design rule checking and simulation tasks. After a processing session on the pool of processors, the system updates the data stored on the workstation disks. Other designers

will access the consistent data base on the workstation disks.

If a fault occurs on a processing node, or on a disk amongst the processing nodes, the system can reconfigure to avoid the fault. The user, however, will need to rerun the applications in progress at the time of system failure. The overhead required to ensure that an application can continue where it halted would not be cost effective. Computer aided design tasks do not need the fault tolerance necessary for a transaction processing system (such as a system handling bank accounts).

[Figure 4.11: Secondary storage architecture. Computer nodes with local disks and workstations with disks connect through the crossbar switch.]

Again, to take advantage of bandwidth improvements, large data transfers are necessary to spread the software latencies over a larger amount of data. High bandwidth links ensure that the data transfer between the disks attached to the storage nodes and the workstation disks does not become a serious bottleneck.

4.6 Operating System

This section will briefly review two prototype distributed operating systems in advanced stages of development: the Amoeba distributed operating system and the Mach operating system. It will then discuss aspects of these systems suitable for CAD of integrated circuits.

With the trend towards distributed workstation computing environments, there has been much incentive for research into distributed operating systems. Tanenbaum and van Renesse have written a good overview of the key concepts of distributed operating systems [Tanenbaum & Renesse, 1985]. They define a distributed operating system as one that looks to its users like an ordinary centralized operating system, but runs on multiple independent processors. The distributed operating system manages the available computing resources automatically. This is distinct from a network operating system, which allows users to access files and processing facilities on other computers on a network, but makes the user aware of the location of files in the network and makes the user choose a processor to run a particular application.

Most of the research on distributed operating systems has aimed at general purpose uses of computers, such as text editing and program development, rather than at scientific applications that use multiple processors on one problem. A typical use of a distributed operating system would be to split up automatically the task of compiling several modules of a program among several processors. This is in effect multiprogramming, rather than multiprocessing. There is a need to provide programming facilities that allow a software developer to take advantage of parallelism provided by the underlying hardware for a single application. The operating system can provide low level support to implement these facilities, such as providing for interprocess communication.

The rest of this section consists of the following subsections:

• Amoeba,
• Mach,
• and Operating system requirements for the CAD computer system.

4.6.1 Amoeba

The Amoeba distributed operating system [Mullender et al., 1990; van Renesse et al., 1989b] supports a hardware architecture consisting of

• a processor pool,
• workstations,
• specialized servers,
• and gateways connected to a local area network.

The pool of processors, consisting of single board computers, provides most of the computing power. The workstations behave as terminals, and only execute processes that interact intensively with the user, such as window managers that accept commands from a keyboard and a mouse. These workstations could be X-terminals designed to run X Window software [Scheifler & Gettys, 1986]. The specialized servers provide access to dedicated resources such as disk drives. The gateways provide access to computing facilities located large distances away.

This is a similar architecture to the one this thesis proposes for designing integrated circuits. Thus some of the techniques used in Amoeba may be applicable to the operating system needed for integrated circuit design.

Amoeba is object based, and uses a client server model to operate on these objects. Examples of objects include memory segments and files. A server process can apply operations to an object on a client process's behalf. These operations must be in an object's predefined set of allowable operations. For each object it wishes to access, a process must possess a capability, which is a ticket specifying the object operations the ticket holder has access to. Client processes use remote procedure calls to send requests to servers for operating on objects [Birrell & Nelson, 1984]. Amoeba determines the server to send the request to from the contents of the capability used to access the object. The client need not be aware of the location of an object in the network. A directory service provides a mapping between ASCII object names and their corresponding capabilities.
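
A minimal sketch of this style of access follows. The field layout, the rights bit, and the rpc_trans() stub are illustrative assumptions, not Amoeba's actual interface: the point is that the capability both names the object and carries the rights that are checked before the operation proceeds.

    #include <string.h>

    #define RIGHT_READ 0x01   /* hypothetical rights bit */
    #define OP_READ    1      /* hypothetical operation code */

    typedef struct {
        unsigned char port[6];   /* identifies the server managing the object */
        long          object;    /* object number within that server */
        unsigned char rights;    /* bit mask of permitted operations */
        unsigned char check[6];  /* protects the rights field from forgery */
    } capability;

    /* Stub standing in for the kernel's remote procedure call primitive;
       a real kernel would locate the server from cap->port and transact. */
    static long rpc_trans(capability *cap, int op, long offset,
                          void *buf, long len)
    {
        memset(buf, 0, (size_t)len);
        return len;
    }

    /* Read from a file object on behalf of a client. */
    long read_object(capability *cap, long offset, void *buf, long len)
    {
        if (!(cap->rights & RIGHT_READ))
            return -1;    /* the capability does not permit reads */
        return rpc_trans(cap, OP_READ, offset, buf, len);
    }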

The Amoeba designers have kept the kernel as small as possible. User level processes do most operations, making the system flexible. For example, users can use their own file systems. The Amoeba kernel handles

• the sending and receiving of messages,
• the scheduling of processes,
• I/O,
• and low level memory management.

Processes in Amoeba have a segmented virtual address space with one or more threads. Threads are lightweight subprocesses, with separate program counters and stacks, which share segments in the process's address space. When one thread blocks waiting for a remote procedure call to complete, the kernel can schedule another thread to execute.

Processes can explicitly map segments in and out of their address space. They can also map files into their address space to allow mapped file I/O. This allows a process to access the contents of a file like any other software data structure without explicitly performing file reads and writes.

Amoeba researchers have written a file server called Bullet that uses immutable files [van Renesse et al., 1989a]. The file server can only create, read, or delete files. Bullet stores files in contiguous areas on disk. File reads involve reading the whole file into the memory of the processor running the file server. Modifying a file requires reading in the old file from disk, modifying it, creating a new file on disk with the modifications, and deleting the old file. The Bullet server tries to take advantage of statistics that show that for general purpose computing, 99% of files contain less than 64 kbytes. The simplicity of file operations, and the use of contiguous areas in memory and on disk, leads to high performance. It is not so suitable for scientific uses such as integrated circuit design (with over 100,000 transistors), where files containing more than 10 Megabytes of data are common.

Amoeba has a UNIX emulation package to allow many of the UNIX™ operating system utilities to run.

The following section discusses the Mach operating system, which the Open Software Foundation (OSF) is using as a basis for their variant of the UNIX™ operating system [Murphy & Voelcker, 1990]. Most applications of the Mach operating system seem to be in tightly coupled shared memory multiprocessor systems, in contrast to the Amoeba distributed environment.

4.6.2 Mach

Mach is an operating system designed to be compatible with the 4.3BSD UNIX™ operating system [Quarterman et al., 1985], with support for both distributed computing and multiprocessing [Russell & Waterman, 1987; Rashid et al., 1988; Young et al., 1987]. The Open Software Foundation is using it as a basis for their version of the UNIX™ operating system, called the OSF/1 operating system [Black, 1990]. It uses virtual memory management techniques based on the Accent network operating system [Rashid & Robertson, 1981].

Mach uses five basic abstractions [Rashid et al., 1988]:

• task,
• thread,
• port,
• message,
• and memory object.

A task consists of a paged virtual address space in which multiple threads of control are possible. Threads are lightweight processes with independent program counters that operate within a task. All threads within a task share the same task resources, such as the task address space and open files.

The Mach kernel uses an object oriented interface to system objects, and uses messages for communication. To apply an operation to an object, an application must send a message to the port associated with the object.

Ports are communication channels that respond to two primitives: send and receive. A port has only one receiver, but may have multiple senders. Mach uses capabilities to control access to ports, as does Amoeba. A process can give another process access to a port by sending a message containing the port's capability. Access to most Mach facilities is through remote procedure calls on ports [Birrell & Nelson, 1984].
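
The schematic below illustrates this style of interaction. The types and the in-memory queue are simplifications invented for illustration, not the real Mach interface: an operation on an object becomes a message sent to the object's port, serviced by the port's single receiver.

    #include <stdio.h>
    #include <string.h>

    typedef int port_t;              /* a port names exactly one receiver */

    typedef struct {
        port_t reply_port;           /* where the server sends its result */
        int    operation;            /* requested operation on the object */
        char   data[64];             /* Mach messages carry typed data */
    } message_t;

    /* Toy in-memory queue standing in for the kernel's message queues. */
    static message_t queue[16];
    static int head = 0, tail = 0;

    static void msg_send(port_t p, message_t *m)    { queue[tail++ % 16] = *m; }
    static void msg_receive(port_t p, message_t *m) { *m = queue[head++ % 16]; }

    int main(void)
    {
        message_t request, received;

        request.reply_port = 1;
        request.operation  = 7;
        strcpy(request.data, "read block 3");

        msg_send(0, &request);       /* client: operation becomes a message */
        msg_receive(0, &received);   /* server: the port's single receiver */
        printf("server got op %d: %s\n", received.operation, received.data);
        return 0;
    }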

A message contains a typed collection of data objects. It can transfer the entire contents of a task's address space. Mach uses copy-on-write memory mapping techniques to send large messages between tasks on the same processor in a distributed memory system. With this technique data copies only occur when either of the two tasks alters the data in the message. These data copies occur on a page by page basis; the rest of the message remains shared.

A memory object is a collection of data managed by a server. A task's address space consists of an ordered collection of mappings to memory objects. A user-level task can create a memory object and service requests. Tasks or threads can access objects through ports without knowing their physical location in the network. Mach treats secondary storage in the same way as other objects in the system, by creating a memory object for the storage. It can map files into the address space of a task, hence providing a single level store. This allows file data to be read from disk directly into a task's address space, avoiding the copying operation from kernel buffers typically used in UNIX systems.

Mach supports sharing of memory between tasks with common ancestors on the same processor. When a task spawns a child task, the child task can either receive copies of data objects, using copy-on-write techniques for efficiency, or it can receive access to shared data.

Young et al. describe an example of a shared memory implementation for a distributed network of processors [Young et al., 1987]. They use a shared memory server to implement shared memory. The server allows multiple read copies, but when a write occurs to one copy, the server sends an invalidation message to the other copies.

The Mach designers have kept the kernel as simple as possible, in common with Amoeba. The kernel is a multiple threaded task, which acts as a server that implements tasks, threads, and memory objects. When the kernel creates a new task, thread, or memory object for a client, it returns access rights to a port representing the new object. When a page fault occurs while accessing non-memory-resident memory object data, the kernel fault handler sends a message to the port for that object, requesting the data from secondary storage. The server process associated with the port can be user defined. The kernel effectively acts like a cache manager, using physical memory as a cache for the contents of memory objects.

A recent paper by Black discusses techniques for scheduling tasks on a shared memory multiprocessor in a time sharing multi-user environment [Black, 1990].

Recent versions of the UNIX operating system such as SunOS also include more advanced virtual memory management facilities such as copy-on-write memory mapping, and the ability to map files into the address space [Gingell et al., 1987].

4.6.3 Operating System Requirements

The computer system this thesis recommends for CAD of integrated circuits has the following resources:

• multiple high performance processing nodes,
• special purpose nodes such as hardware logic simulators,
• multiple workstations for interactive use,
• high performance interconnection network,
• and distributed secondary storage.

These resources are similar to those of the model used for the Amoeba operating system.

The computer system needs an operating system to manage these resources.

Existing workstations will have their own reliable industry-standard operating system, such as a variant of the UNIX™ operating system. Most of the user interactive activities, such as editing and graphical entry of design details, will occur on these workstations.

The high performance nodes need a simple operating system kernel to manage memory allocation, process creation, process scheduling and interprocess communication. Limiting the kernel to essentials will allow for the greatest flexibility. Additional facilities such as a file system can run as user processes. Software developers can develop and debug these user processes without affecting the kernel.

Typically each processing node will be dedicated to a single user, with the user running a compute intensive application. This application will usually involve using other nodes in parallel. Exclusive access to a set of processing nodes ensures that an application can predict an optimal load balance. It also minimizes the response time for a user waiting for the results of an application, thus maximizing user productivity.

The part of an application running on each node may consist of several processes. These processes may be running in independent virtual address spaces, or they may run as subprocesses sharing a single address space. By allowing processes to run in independent address spaces, the system can let software developers develop and run utilities, such as a file system, independently of other application software. By allowing subprocesses of a single application to share a single address space, the system can simplify the sharing of data between these subprocesses. A method to synchronize access to this shared data is necessary, such as the semaphores used in the VORX system (Section 2.2.5.1). The operating system can keep minimal state information for each individual subprocess, allowing for fast context switches between subprocesses (or threads), as implemented by operating systems such as Amoeba, Mach and VORX.

The kernel for each node must schedule the processes and subprocesses running on the node. For a single user environment, time-slice scheduling, which gives a reasonable response time for independent processes, is unnecessary. The kernel only needs to schedule processes from a user selectable priority. For example, if a node has an attached disk drive, a file server process may have higher priority than the application process currently running, to prevent processes running on other nodes from waiting too long for disk data. While a process or subprocess waits (blocks) for a response from a process on another node, the kernel will schedule another, possibly lower priority, process for execution.
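
A minimal sketch of this scheduling policy follows (the process table and names are hypothetical): the kernel always runs the highest priority runnable process, so a file server can take over from an application, and a blocked process lets a lower priority one run.

    #include <stdio.h>

    struct proc {
        const char *name;
        int priority;     /* higher number means higher priority */
        int blocked;      /* waiting for a remote reply or disk transfer */
    };

    /* Pick the highest priority process that is not blocked. */
    static struct proc *schedule(struct proc table[], int n)
    {
        struct proc *best = NULL;
        int i;

        for (i = 0; i < n; i++)
            if (!table[i].blocked &&
                (best == NULL || table[i].priority > best->priority))
                best = &table[i];
        return best;      /* NULL would mean the node is idle */
    }

    int main(void)
    {
        struct proc table[2] = {
            { "application", 1, 0 },
            { "file server", 2, 1 },  /* blocked until a request arrives */
        };

        printf("run: %s\n", schedule(table, 2)->name);  /* application */
        table[1].blocked = 0;                           /* disk request */
        printf("run: %s\n", schedule(table, 2)->name);  /* file server */
        return 0;
    }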

An operating system kernel that allows software developers to develop software without considering the exact amount of primary memory available will lead to improved portability of the software. It will allow programs to run on nodes with different primary memory sizes. The kernel can provide memory management facilities to use secondary storage as part of the memory hierarchy available to a program. The primary memory can act as a cache for secondary storage. Applications will be able to handle large problem sizes by allowing the memory usage to increase dynamically in proportion to problem size.

The system may distribute secondary storage among the nodes in the system, with data interleaving, for higher performance. User processes can service requests for data, as allowed in the Amoeba and Mach operating systems.

When a page fault occurs, the kernel can direct a request to a user process to retrieve the data from secondary storage. The user process will decide where to get the data from. The kernel can also send stale pages to a user process to store on disk.

The operating system can support memory mapped files, as supported in Amoeba and Mach, where a process can request the kernel to map file data on disk into its virtual address space. This allows a program to access the contents of a file in the same way as any other data structure. A user process can transfer the data from disk directly into the program's address space.
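
The fragment below sketches mapped file access using the UNIX mmap() interface cited earlier [Gingell et al., 1987]; the file name is hypothetical, and Amoeba and Mach provide the same facility through their own calls. Once mapped, the file behaves like an ordinary in-memory array.

    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("layout.cif", O_RDONLY);  /* hypothetical design file */
        struct stat st;
        char *data;

        if (fd < 0 || fstat(fd, &st) < 0)
            return 1;

        /* Map the whole file; pages are brought in from disk on demand. */
        data = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        if (data == MAP_FAILED)
            return 1;

        /* The file contents are now read like any other data structure. */
        printf("first byte: %c, size: %ld\n", data[0], (long)st.st_size);

        munmap(data, (size_t)st.st_size);
        close(fd);
        return 0;
    }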

To reduce the overhead of sending data between processes sharing a processor with independent address spaces, the kernel can use copy-on-write memory mapping techniques, as Mach uses, to avoid unnecessary copying of data.

Processes need access to communication primitives for interprocess communication. The kernel can implement low level primitives such as send, receive, and remote procedure call, such as Amoeba and Mach provide. The send and receive primitives are normally synchronous, so that they wait until the other process responds. The kernel can also provide a multicast primitive as an efficient implementation for sending a message to a list of other processes [Katseff, 1987]. With some communication networks, separate methods can substantially improve the performance of a multicast relative to several independent send operations. By allowing users access to the network interface, as allowed in VORX, user programs can use their own primitives for special purpose applications. The performance of the software implementing the communications primitives has a major effect on the system performance (Section 4.4.1).
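
As a sketch (send_msg() is a hypothetical kernel primitive, not an existing interface), a multicast built only from point-to-point sends looks like the loop below; a kernel with network support for multicast can replace the loop with a single, cheaper operation.

    #include <stdio.h>

    typedef int node_t;

    /* Stub standing in for the kernel's point-to-point send primitive. */
    static void send_msg(node_t dest, const void *buf, long len)
    {
        printf("send %ld bytes to node %d\n", len, dest);
    }

    /* Fallback multicast: one independent send per destination. */
    static void multicast(const node_t *dests, int ndests,
                          const void *buf, long len)
    {
        int i;
        for (i = 0; i < ndests; i++)
            send_msg(dests[i], buf, len);
    }

    int main(void)
    {
        node_t group[3] = { 2, 5, 7 };
        multicast(group, 3, "update", 7L);
        return 0;
    }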

Processes that run on workstations under control of their local operating system can use the same software communication interface, to allow interaction with processes running on the high performance nodes.

4.7 Parallel Programming

Parallel software development would be simple if compilers could automatically extract parallelism from a program written for a serial machine in a conventional serial language. However, the algorithms that work best on multicomputers are often different from the algorithms that work well on a serial processor. This implies that new programs should express parallelism explicitly [Carriero & Gelernter, 1989b]; Carriero and Gelernter discuss techniques for writing parallel programs [Carriero & Gelernter, 1989a]. The aim of current research is to make this task as simple as possible. Compilers are still useful, however, for trying to extract some parallelism from existing serial programs.

Traditionally, tightly coupled multiple processor systems, which have access to a shared memory with one machine instruction, have used a shared variable parallel programming model. In contrast, loosely coupled systems, which send data between processors on a network via messages, have used a message based programming model [Howe & Moxon, 1987b; Karp, 1987; Ranka et al., 1988].

Leler has compared programming parallel machines to programming in assembly language [Leler, 1990]. Parallel programs have been hard to write, since the programmer had to deal with the hardware details, such as interconnection topology. The resulting programs are not portable between systems, removing the incentive for software developers to develop parallel versions of compute intensive CAD software and other applications.

The remainder of this section will begin by comparing the message passing and shared variable programming paradigms. Next, it will discuss three approaches to supporting shared data structures in a distributed memory environment. This section will then briefly mention research into computer-aided programming tools. It will conclude by discussing the programming facilities needed for the system.

The rest of this section consists of the following subsections:

• Message versus shared variable based programming paradigms,
• Shared variables in a loosely coupled environment,
• Linda,
• Distributed shared virtual memory,
• Shared data object model,
• Computer-aided programming,
• and Programming requirements for the CAD computer system.

4.7.1 Message versus Shared Variable Programming Paradigms

Bal and Tanenbaum give a good overview of the characteristics of the two programming paradigms [Bal & Tanenbaum, 1988].

When a message transfers data between two processes, they must both exist and the sender must know the identity of the receiver. In contrast, data in a shared variable is accessible to any process knowing the address of the shared variable. Processes do not have to exist at the same time, and a process placing data in a shared variable does not need to know which other processes will access it. This also allows the use of persistent data, which can exist outside the process that created it.

In a tightly coupled system, modifying a shared variable has immediate effect. In a distributed system, however, the nondeterministic delays of several processes sending messages at the same time make it hard to determine in what order a process will receive messages from several processes.

Message based programming can help make programs more reliable and easier to debug, since processes explicitly send messages to those processes that should access the data. With shared variables, in contrast, there is the possibility of corrupting other processes through bad pointer references, or through synchronization problems [Leler, 1990].

Passing a message between two processes also synchronizes them. In synchronous message passing, the sender waits for acknowledgement from the receiver. In asynchronous message passing, the sender can continue processing after sending the message, but a receiver must wait until it has received a message before continuing. Shared variable programming requires separate mechanisms to synchronize the activities of processes. Mutual exclusion prevents simultaneous accesses to the same variable. In conditional synchronization, a process waits until a condition is true. For example, a process may wait for a semaphore to become clear before doing something [Dinning, 1989].
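
The sketch below illustrates both mechanisms with a busy-waiting semaphore. It is illustrative only; a real implementation needs an atomic test-and-set instruction so the test and the set cannot be interleaved.

    volatile int lock = 0;        /* 0 = clear, 1 = held */
    int shared_total = 0;         /* the shared variable being protected */

    /* Acquire: wait until the semaphore is clear, then set it.
       The wait is the conditional synchronization described above. */
    void P(volatile int *s)
    {
        while (*s != 0)
            ;                     /* busy wait; must be atomic with the set */
        *s = 1;
    }

    /* Release: mark the semaphore clear again. */
    void V(volatile int *s)
    {
        *s = 0;
    }

    /* Mutual exclusion around an update to the shared variable. */
    void add_to_total(int amount)
    {
        P(&lock);
        shared_total += amount;
        V(&lock);
    }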

In a loosely coupled system, passing large complex data structures between processes executing on different nodes is difficult. For example, a process must pass parameters in a remote procedure call by value, which is inefficient for large structures, rather than passing a pointer to the parameter data structures. This also makes it hard to migrate processes between processors to balance the processing load, because a processor must copy all the process information to another processor.

The next subsection introduces the problem of supporting shared variables in a distributed environment.

4.7.2 Shared Variables in a Loosely Coupled Environment

To ease programming in a loosely coupled environment, researchers have investigated ways to simulate a shared variable environment. Bal and Tanenbaum discuss recent attempts to provide some of the advantages of shared variable programming in a loosely coupled system [Bal & Tanenbaum, 1988].

The simplest technique to implement shared variables in a distributed system is to access a single copy of the data, stored on a processor in the network, with a remote procedure call. However, this will be inefficient if processes make frequent requests to the same variable across the network. The alternative is to replicate the shared variable and store it in local memory. The problem comes when a process modifies the shared variable, causing inconsistent copies of the variable to exist.

Some programming environments rely on the programmer to explicitly maintain consistency among multiple copies, as in a standard message passing programming environment. Programming languages such as Ada allow inconsistent copies of data to exist between synchronization points. Recent systems let a programmer use a shared variable programming environment with operating system support to maintain consistent copies.

The following subsections discuss three different approaches to supporting shared variables.

4.7.3 Linda

The Linda parallel programming paradigm provides an abstraction for sharing data called a tuple space [Ahuja et al., 1986; Carriero & Gelernter, 1989b; Leler, 1990]. Tuple space acts like an associative memory. Each tuple consists of a typed collection of data. It is architecture independent, thus allowing portable programs. There are four fundamental operations:

out    Place a tuple into tuple space.
in     Match a tuple and remove it from tuple space.
rd     Match a tuple and return a copy of it from tuple space.
eval   Place an active tuple into tuple space.

These operations easily add to the definitions of established languages, such as the c Programming Language, to support parallel programming. \Mhen a process wishes to send data to another process, it creates a tuplefor the data, and adds the tuple to tuple space. This is a non-blocking action, similar to an asynchronous message send. A process ivishing to receive the data creates a template for the tuple, usually using the first field as a key to search on and using the other fields to specif¡. the expected data types. The process then reads data that matches the template in from the tuple space. If the process fincls no matching tuple, it blocks rvaiting until one is present. There is no direct communication betrveen processes. To create new processes to execute concurrentlg a process can create an active tuple and place it in tuple space. The active tuple rvill do some task and produce a normal passive data tuple as a result. Tuples created by a process can still exist after the process terminates
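
A small example in C-Linda style follows. The worker() and generate_cell() names are invented for illustration, and the ?var formals require a Linda preprocessor, so this is not plain C: a master places task tuples into tuple space, workers created with eval remove them, and the master collects the result tuples.

    /* Master: create workers, emit tasks, gather results. */
    real_main()
    {
        int i, cell, area;

        for (i = 0; i < 4; i++)
            eval("worker", worker());       /* four active tuples */

        for (cell = 0; cell < 100; cell++)
            out("task", cell);              /* non-blocking, like async send */

        for (i = 0; i < 100; i++)
            in("result", ?cell, ?area);     /* blocks until a match appears */
    }

    /* Worker: repeatedly remove a task tuple and emit a result tuple. */
    int worker()
    {
        int cell;

        for (;;) {
            in("task", ?cell);              /* removes the tuple, or blocks */
            out("result", cell, generate_cell(cell));  /* hypothetical */
        }
    }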

To modify shared data in Linda, a process removes the corresponding tuple from tuple space, preventing other processes from accessing it. After modifying the data, the process places a new tuple into the tuple space.

A Linda system consists of a run time kernel for implementing interprocess communication and process management, and a preprocessor for optimizing accesses to tuples at compile time [Ahuja et al., 1986]. The implementation depends on the underlying architecture. For example, each node could store a complete copy of the tuple space. An out operation and an in operation would then require broadcasts to all the processor nodes, whereas a rd operation would only need a local search. Alternatively, an out operation could place a tuple only on the local processor node. An in operation would then require broadcasting the template to all other nodes and allowing each node to do its own search through its portion of the tuple space. Another technique uses a hashing function on the first field of the tuple to choose which node to send a new tuple to, and where to conduct a search for an existing one.

Ahuja et al. discuss the architecture of the Linda Machine, which contains hardware support for tuple space and the Linda primitives [Ahuja et al., 1988; Krishnaswamy et al., 1988]. They stress the importance of providing hardware support for the high level abstraction provided by Linda. Leler discusses the use of Linda in the QIX operating system [Leler, 1990; Merrow & Henson, 1989].

4.7.4 Distributed Shared Virtual Memory

Li has developed the concept of a shared virtual memory that allows multiple processors on a network to share a single address space [Li, 1986; Li & Hudak, 1989]. Li implemented an experimental version called IVY on an Apollo DOMAIN system consisting of a collection of processors interconnected with a ring network.

A parallel program can consist of a set of lightweight processes sharing a single virtual address space, where the cost of context switching and process creation is small. Some systems, such as VORX [Gaglianello et al., 1989; Gaglianello & Katseff, 1985], support lightweight processes on a single processor, but Li's system allows them to run on separate processors in parallel. Processes can access shared data, distributed in the network, as though it were on the same processor. This is unlike most distributed systems, where processes running on separate processors have independent address spaces, and access shared data through separate primitives, such as in Linda, or via remote procedure calls.

The system provides an event count primitive for synchronizing access to shared data. A process can advance an event count, or wait until the event count reaches a certain point.

Li's system divides the shared virtual memory into pages. When a process accesses a page which is non-resident in local memory, a fault handler retrieves the page from either secondary storage, or from the local memory of another processor.

Several processors can have private copies of read-only pages. If a processor needs to modify a read-only page, a write page fault occurs. This causes the other processors to invalidate their copies, thus maintaining coherence. Only one copy of a page with read-write access can exist. If a process needs to read a page it does not have a copy of, a read page fault occurs. If another processor has read-write access to this page, it changes its access to read-only, and sends a copy to the requesting processor.
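
The sketch below captures these page-state transitions. The structures and helper stubs are illustrative, not IVY's own code: a write fault invalidates the other copies and takes exclusive ownership, while a read fault fetches a copy and demotes the owner to read-only.

    enum access { NONE, READ_ONLY, READ_WRITE };

    struct page_entry {
        enum access mode;        /* this processor's access to the page */
        int probable_owner;      /* hint used to locate the true owner */
    };

    /* Stubs standing in for the network protocol. */
    static void invalidate_copies(struct page_entry *pg)     { (void)pg; }
    static void fetch_copy_from_owner(struct page_entry *pg) { (void)pg; }

    /* Write fault: gain the single read-write copy of the page. */
    void write_fault(struct page_entry *pg, int self)
    {
        invalidate_copies(pg);        /* remote read-only copies are dropped */
        pg->mode = READ_WRITE;
        pg->probable_owner = self;    /* this processor now owns the page */
    }

    /* Read fault: fetch a copy; the owner demotes itself to read-only. */
    void read_fault(struct page_entry *pg)
    {
        fetch_copy_from_owner(pg);
        pg->mode = READ_ONLY;
    }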

Shared virtual memory simplifies the migration of processes between processors to balance the load. It can also remove the separate data-exchange phase found in many parallel applications (such as in Parallel Goalie (Section 2.2)), which places a concentrated load onto the communication network. Shared virtual memory moves data on demand as processes access it, thus spreading the communication load over a longer period of time [Stumm & Zhou, 1990].

Li suggests a page size of about 1 kbyte to reduce the possibility of multiple processes requesting access to the same page simultaneously. Most memory management units now support 4 kbyte pages [Milenkovic, 1990]. Larger page sizes can better use the bandwidth of high speed communications. A processor can use a hardware memory management unit to generate page faults for write accesses to shared pages. Once a shared page is resident for read operations, the cost of access is the same as for local data, unlike other techniques where a program incurs the overhead of calling a routine to handle the shared access every time.

When a read fault occurs, the system must have a way of finding the page owner. Li allows the page owner to vary dynamically for programming flexibility. Using a centralized page manager to locate a page's owner could create a bottleneck. Li instead uses a distributed technique that maintains a field, containing the probable owner of a page, for each page in a processor's page table. When a page owner sends a new copy to another processor, it adds that processor to its copy set, which the memory manager uses for the invalidation operation.

Li used four synthetic test programs to evaluate the performance of shared virtual memory. These programs consisted of solving a set of three dimensional partial differential equations in parallel, parallel sorting, parallel matrix multiply, and parallel dot-product. Near-linear speedups occurred for the differential equation solution and the parallel matrix multiply, which have a high degree of locality in their calculations. The dot-product program achieved poor results because the calculation uses each data item only once, and if the data is randomly distributed among the processors, the cost of accessing the data dominates the computation time.

Li concluded that the performance is good for programs that exhibit locality of data references, as long as most of the references are read-only. Performance falls off for applications that require frequent updates to shared data, or that use large amounts of data only once with little calculation (such as a dot-product).

Fleisch and Popek have performed similar research using 3 VAX 11/750s connected via Ethernet. They experimented with using a window of time over which a processor writing to a page can maintain exclusive access to the page, independent of read requests from other processors [Fleisch & Popek, 1989].

Tam et al. provide further discussion of distributed systems that support a shared address space [Tam et al., 1990]. These systems allow processors to access data on other nodes by address, as in a shared memory multiprocessor.

A recent paper by Wu and Fuchs describes techniques for allowing a system to recover from hardware faults without restarting the whole system and having to rerun applications with a long computation time [Wu & Fuchs, 1990].

4.7.5 Shared Data Object Model

Bal et al. have developed a shared data object model that they have used as a basis for a parallel programming language called Orca [Bal & Tanenbaum, 1988; Bal et al., 1990]. They have used Orca to develop parallel application programs for distributed systems running the Amoeba distributed operating system [Mullender et al., 1990]. Processes may only access the data in a data object through a set of predefined operations. Data objects shield processes from both the implementation details of the arrangement of data in the object, and the code used to operate on the data. The operations appear as high level primitives to a process. Programs can address and modify the data objects directly, as for traditional shared variables.

Orca expresses parallelism by allowing single threaded processes to create single threaded child processes using a fork operation. A process can choose between having a child process run on another processor, or having it run on the same processor. A process can pass any of its data objects as shared parameters when creating a child process. If a process changes a shared data object, all other processes sharing the object see the change. Sharing of objects is only possible between a parent process and its descendants.

All operations on objects occur indivisibly; hence if two processes apply an increment operation to an object, the result will be as if the increments occurred sequentially. This provides for mutual exclusion on a shared data object. Operations on objects can block until an internal condition becomes true, thus providing conditional synchronization.
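
The C sketch below mimics such an object. Orca itself is a separate language; the lock-based indivisibility here is only one possible run time realization, and the acquire/release primitives are the busy-waiting kind sketched in Section 4.7.1. The data is reachable only through two predefined operations, one of which blocks until its condition holds.

    /* Busy-waiting lock primitives (illustrative; need atomic hardware). */
    static void acquire(volatile int *l) { while (*l) ; *l = 1; }
    static void release(volatile int *l) { *l = 0; }

    typedef struct {
        volatile int lock;   /* guards indivisible execution of operations */
        int count;           /* the encapsulated data: a shared job count */
    } job_object;

    /* Operation 1: indivisible increment; two concurrent calls behave as
       if they ran one after the other. */
    void add_job(job_object *obj)
    {
        acquire(&obj->lock);
        obj->count++;
        release(&obj->lock);
    }

    /* Operation 2: conditional synchronization; blocks until a job is
       available, then removes it indivisibly. */
    void take_job(job_object *obj)
    {
        for (;;) {
            acquire(&obj->lock);
            if (obj->count > 0) {
                obj->count--;
                release(&obj->lock);
                return;
            }
            release(&obj->lock);   /* condition false: retry later */
        }
    }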

Bal et al. have investigated various implementations of the shared data object model [Bal & Tanenbaum, 1988; Bal et al., 1989a].

Processes running on the same processor can read from a shared data object directly. To prevent processes running on another processor from having to execute a remote procedure call across the network to read a shared object, a run time system creates local copies of the shared object. Instead of always creating a local copy, a run time system can use compile time and run time statistics to decide when to create a local copy. The disadvantage of local copies is that when a process changes a local copy, it must inform all other processors of the change, so that they all see a consistent shared data object.

There are two approaches to propagating changes to distributed copies of shared data objects:

• invalidate all copies of the data except one,
• and update all copies.

Invalidation has high overheads if processes make small changes to large, frequently accessed shared data objects. After a processor makes a change to the data, the system would need to recreate the local copies as the other processors continue to make frequent accesses. The update method avoids this overhead at the expense of more complicated protocols to keep the data consistent. The other disadvantage of the update method is that when a single process executes several write operations on a shared data object, it incurs the update overhead for each write, even though other processes may not execute a read operation during this time.

Bal et al. have described an implementation that replicates each data object on all processors with processes that access it, and updates local copies by applying write operations to them all [Bal et al., 1989a]. The Orca language implementation consists of a compiler and a run time system. The run time system runs an object manager process on each processor. This manager process is responsible for altering the contents of data objects. The manager process shares the part of the address space that stores data objects with processes on the same processor. The compiler generates a call to a run time system routine each time a program applies an operation to a shared data object. If it is a read operation, the run time routine calls the routine that implements the read operation in the object definition. The read operation accesses the object directly through the shared address space. If the operation requires altering the contents of the data object, the run time routine sends a message to all object managers to update their copies of the shared object. The user program blocks while waiting for the local object manager to update its local copy.

The shared data object implementation described above works well for applications that require frequent reading of a shared variable by multiple processes in parallel, with occasional updates. It would be suitable for low temperature, parallel simulated annealing, where the algorithm tests many changes to a shared circuit description, but only accepts a few of them. Bal et al. point out that providing the high level abstraction of shared data results in a loss in efficiency compared to message passing programming. Parallel programs will perform better if they minimize the accesses to shared data, and do most of the work with local data.

A recent paper by Stumm and Zhou [Stumm & Zhou, 1990] analyses algorithms for sharing data in a distributed system, similar to those used in Li's shared virtual memory [Li & Hudak, 1989] and Bal's shared data object model. Stumm and Zhou compare the advantages and disadvantages of multiple-readers/single-writer replication (where a writer invalidates copies of data on other processors) and multiple-readers/multiple-writers replication (where each writer sends updates to shared data on other processors). They concluded, as expected, that the performance of each scheme depended on the memory access characteristics of a particular application. They state that many applications have their input data widely read-shared by processors, and have updates to a particular region of data made by a single processor (hence little write contention). Thus multiple-readers/single-writer replication seems a good solution.

Stumm and Zhou recommend that application developers have available a choice of which distributed shared memory scheme they use, according to their application [Stumm & Zhou, 1990].

The next subsection will briefly introduce some recent research into computer-aided programming tools for easing the task of writing parallel programs.

4.7.6 Computer-aided Programming

Many of the ideas associated with computer-aided design are applicable to computer-aided programming. The aim is to increase the productivity of a software developer by automating the simple and repetitive tasks, and providing facilities to manage the complexity of the program. With the variety of parallel computer architectures and the difficulty of debugging programs with errors, the development of computer-aided programming tools will be essential to the acceptance of parallel programming.

This subsection will briefly overview two approaches to computer-aided programming tools: CAPER and Hypertool.

4.7.6.1 CAPER

Sugla et al. have described the CAPER Concurrent Application Programming Environment, which takes a building block approach to developing parallel programs [Sugla et al., 1989] on the message based HPC (described in Section 2.2.5.1). The approach is analogous to the use of standard cells in circuit design, where the reuse of highly optimized cells saves design time. Sugla et al. mention the development of a VLSI design rule checker as one application of the environment.

CAPER consists of a working set of commonly needed parallel algorithms (such as sorting algorithms), and a set of data transformation routines. The data transformation routines allow data output from one parallel algorithm to be rearranged (and possibly redistributed among multiple processors) into a form suitable for another parallel algorithm. CAPER also provides a mechanism to support shared data by the creation of a shared-data manager process.

Programming in CAPER can consist of using a graphical interface to draw a flow graph of the program, consisting of parallel algorithm blocks interconnected with data transformation blocks. The system automatically sets up the communication channels between the blocks of software. Software developers can add their own sections of either sequential or parallel code to the flow graph. A developer can include program blocks that will monitor the performance of the program when it runs on the target machine, and send performance information to the workstation screen.

4.7.6.2 Hypertool

Wu and Gajski discuss the problems of partitioning a program into sections that can run in parallel, and scheduling these sections so that processors do not become idle waiting for another processor [Wu & Gajski, 1989]. They have developed a tool called Hypertool [Wu & Gajski, 1990] that adopts a partitioning and merging approach to parallel programming.

One approach to partitioning a program is to break it into a number of partitions equal to the number of processors, as implemented in Parallel Goalie (see Section 2.2). This is suitable for problems with regular structures, but can cause load balancing problems for irregular structures. The other approach is to partition the problem into many partitions and assign sets of partitions to processors. When one partition becomes blocked, a processor can schedule another partition for execution to prevent standing idle.

Hypertool requires a user to develop a parallel algorithm and partition it into many partitions (called processes). Hypertool then creates a macro dataflow graph for the program that shows the dependencies of the partitions. It uses a scheduler to merge partitions into tasks that can run on one processor. The scheduler aims to reduce interprocessor communication and processor idle time, thus minimizing the total execution time. Hypertool assigns partitions that require a large amount of intercommunication to one task. The problem is similar to the problem of partitioning a logic circuit among processors for parallel simulation. The scheduler can use information about the underlying architecture to optimize the merging step. After the merging stage, Hypertool adds communication primitives to the code for communicating with processors over a network.

Hypertool provides an estimate of the performance of the parallel program by using a simulator that models the execution of the program on a message based system. The simulator provides information such as the number of message transmissions, message lengths, the transmission time for each message, and the network delay for each communication channel. An explanation facility can display graphical information showing the flow of data among processors and the load distribution.

Wu and Gajski claim that Hypertool increases programmer productivity by an order of magnitude. By automating the addition of communication primitives, Hypertool reduces communication errors and avoids deadlock. Good scheduling algorithms improve the performance of parallel programs. Hypertool also provides for portable code, because it handles the translation to machine code.

4.7.7 Programming Requirements

The computer system should support widely used programming languages, such as C (and object oriented C++) and FORTRAN. Software developers can use these languages for developing software for the user interactive applications on the workstations, and for the compute intensive applications on the high performance nodes. They can easily port sections of software already written for other machines.

For explicitly specifying parallel execution of applications on the high performance nodes, there are two choices:

• a new parallel programming language,
• or an existing language with extensions.

Software developers are more likely to develop software for the system if they can use a language that they are already expert in. Well developed sets of standard routines for doing many common operations are available for established languages. Ideally a language tailored to parallel programming would be best, but there are currently no widely used parallel languages available; Bal et al. present a thorough survey of the parallel languages available for distributed systems [Bal et al., 1989b]. Thus the best solution for now is to provide extensions for an existing programming language. The disadvantage of this solution is the lack of standard extensions, making it difficult to provide software that is portable between different systems. Software developers will benefit from having available both message based and shared variable programming extensions when developing CAD software, because of the wide range of techniques used in CAD applications. A recent paper by Gabber discusses a package of routines for extending an existing programming language that supports both message passing and shared variable programming [Gabber, 1990].

Message based programming benefits from lower communication overheads. Software developers can easily develop many parallel CAD applications using this model. Existing languages can call routines to do the message passing. These routines may be kernel supported, or user developed. The basic routines would include asynchronous send and receive of messages. Additionally, remote procedure call routines allow for client/server operations, where a typical server would be a file server. Multicast routines are useful for sending the same message to several processors [Katseff, 1987].

To ease programming for applications that suit a shared data structure model, a software developer should have the option of using a shared variable programming abstraction. The two alternatives are:

• to provide a set of primitives to manage a shared address space (e.g. Linda),
• or to provide shared memory as part of virtual memory management.

The Linda paradigm offers the potential of portability, by providing the tuple space abstraction, but it has high overheads that may make it inefficient for some programs on architectures without hardware support for the Linda primitives. If Linda becomes widely used in the CAD community, the computer system should support an implementation of Linda to allow software from other systems to run.

By providing the ability to run subprocesses sharing part of their address spaces across distributed processors, the system provides a simple extension of the subprocess (or thread) model used on a single processor (such as in VORX and Mach). Li has shown how this can work, using memory management hardware to provide the distributed virtual memory [Li & Hudak, 1989]. When a process creates a subprocess, it can pass the data structures it wishes to share to the routine that creates the subprocess. This is similar to how Bal et al. pass data objects to child processes in their shared data object model [Bal & Tanenbaum, 1988]. The routine that creates the subprocess will execute virtual memory management routines that mark the pages associated with the data objects as shared pages. When a subprocess attempts to write to one of these pages, the fault handler will take care of memory coherency problems, as Li has described.
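
A sketch of such a creation routine follows. The interface is entirely hypothetical, since no such routines exist yet: the caller lists the regions it wishes to share, the routine marks the corresponding pages shared, and the fault handler then keeps them coherent as in Li's scheme.

    struct region {
        void *base;     /* start of a shared data structure */
        long  length;   /* its size in bytes */
    };

    /* Stubs standing in for kernel primitives (assumed, not existing). */
    static void mark_pages_shared(void *base, long length)
    {
        (void)base; (void)length;   /* would set page table bits */
    }
    static int start_on_node(int node, void (*func)(void))
    {
        (void)node; func(); return 0;
    }

    /* Start func() on the given processor node, sharing the listed
       regions of the caller's address space with the new subprocess. */
    int create_subprocess(int node, void (*func)(void),
                          struct region *shared, int nshared)
    {
        int i;
        for (i = 0; i < nshared; i++)
            mark_pages_shared(shared[i].base, shared[i].length);
        return start_on_node(node, func);
    }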

Unlike in the shared data object model, there is no need to call a routine (handled by the compiler) each time a subprocess accesses a shared data structure. The memory management software handles accesses to data that can affect the coherency. This has advantages when accessing shared data in frequently run inner loops. If the data is read only, or the processor has write access, the process accesses the data as for any other local data.

By basing access to shared data structures on a page by page basis, the system can use memory management hardware to assist in creating the shared memory model. This technique also reduces the chance of contention for the same item of data, compared to basing access at the data object level.

An invalidation protocol is suitable because the granularity of data access is smaller. When a write operation occurs to one page, the system only needs to invalidate copies of this page, not the whole data structure. Thus the update protocol used in the shared data object model is unnecessary.

To make the best use of the shared variable abstraction, software developers must be aware of the limitations. Applications should minimize the number of accesses to shared data, and do most of the calculations with local data. The shared variable abstraction works well where processes make multiple read accesses to the same shared data, and where updates to the shared data are infrequent.

CAD algorithms can require access to large data structures, such as a circuit description, beyond the capacity of the entire machine's primary memory. Virtual memory management techniques simplify the task of managing these large data structures. Subprocesses only need to keep the pages of data structures that they are currently using in local memory. Often a subprocess will be the only one to carry out a write operation on a particular part of a data structure. The memory management software can support the swapping out of rarely used portions of a large data structure to secondary storage. This allows software to manipulate data structures larger than the total primary memory available on the machine. Clearly performance will deteriorate as the accesses to secondary storage increase with problem size.

As parallel programming becomes more widespread, the common use of message passing and subprocess creation primitives should provide incentive for standardizing the names and characteristics of these routines.

The use of computer-aided programming techniques for parallel machines will improve the productivity of software developers. This will be a major step towards putting parallel programming into the mainstream of software development, by providing for portability of software. These tools require a good graphical interface, which the system architecture provides using the workstations. A graphical interface is also useful for debugging parallel programs. McDowell and Helmbold survey techniques for debugging parallel programs and point out the importance of graphical interfaces [McDowell & Helmbold, 1989].

Chapter 5

Conclusions

Recent advances in design tools have enabled designers to effectively manage the complexity of integrated circuits with over 100,000 transistors. Now designers need faster computing systems on which to run these tools, to reduce the overall design time necessary to produce a circuit that meets the requirements of its application. Only by reducing the design time is it possible to make use of advanced integrated circuit technology in low production volume applications.

This research has found that there is a wide range of algorithms with different computational characteristics used in computer-aided design (CAD) of integrated circuits. The wide diversity of these algorithms makes it difficult to find a computer architecture that will be optimum for them all. The thesis concludes that a computing system architecture consisting of a pool of computer nodes accessed through a set of workstations offers the greatest flexibility.

Commercially available workstations are inexpensive enough to allow a designer to have exclusive use of one for running the user interactive design tools. The graphical interfaces of workstations allow designers to enter design details in a variety of representations, and provide visualization facilities for evaluating the output of design verification tools such as circuit simulators. Exclusive use of a modern workstation ensures that the response time to design input is fast enough to maintain a designer's concentration. The workstations should support an industry standard operating system and graphical interface, to allow a designer to benefit from the wide range of quality design software already available.

The pool of computer nodes can provide designers with a shared resource for running the compute intensive applications such as simulated annealing for logic synthesis, design rule checking, circuit simulation, and fault simulation. These computer nodes can reduce the running time of these applications by cooperating to solve different parts of a problem in parallel.

This thesis has discussed recent work in parallel algorithms for using multiple processors to run CAD tools. It has found that most parallel CAD algorithms use a model where each processor executes calculations independently on separate data, using a small amount of communication with other processors to exchange intermediate results. This thesis thus recommends the use of a pool of computer nodes with independent memories that cooperate by passing messages across an interconnection network. This architecture is simpler and cheaper to build than a shared memory architecture that needs special hardware to support a global shared memory.

The execution time of most parallel algorithms on multiple processors begins to reach a lower limit as the algorithm uses additional processors. For most current parallel CAD algorithms, the factor by which multiple processors reduce the execution time compared to a single processor reaches a limit at around 20. Thus this thesis recommends the use of a few (e.g. 20) complex computer nodes rather than many (e.g. 1000) simple computer nodes.

To minimize the cost of the system, the computer nodes should take advantage of high production volume microprocessors. Recent innovations in these microprocessors use multiple functional units and pipelining to exploit instruction level parallelism. These innovations can double the performance over a conventional microprocessor design. The computer nodes need good floating point performance to support tasks such as analog circuit simulation. This performance is available from floating point units integrated into the microprocessor, thus exploiting the wider bandwidth paths and multiported registers available on-chip. The large amount of data needed to describe large integrated circuits places heavy demands on the amount of physical memory. Thus the computer nodes should maximize the amount of local memory, to minimize the number of references to secondary storage. This requires the use of high density dynamic RAMs with static RAM caching.

Parallel applications such as logic simulation frequently send intermediate results to other processors during execution. The interconnection network must be fast enough to prevent the communication becoming a bottleneck. Thus this thesis recommends using a crossbar switch with 800 Mbit/sec links for interconnecting the computing nodes. Slower links, such as 100 Mbit/sec, that operate over longer distances would be appropriate for connecting to the workstations. The use of around 20 computer nodes allows the use of a crossbar switch. Larger scale systems that support additional users and other application areas, which can make use of additional processors, could consist of several of these crossbar switches interconnected together.

Secondary storage is necessary to store application programs and circuit description data. Also, when the size of a problem is too large for all its data to remain in the primary memory of the computer nodes, secondary storage must hold the intermediate data. Thus the access time to secondary storage must be low enough to avoid the I/O time dominating the processing time, and limiting the performance benefit of using multiple processors. This thesis recommends distributing disks among the computer nodes, and interleaving file data across these disks. This allows high I/O data rates, and distributes the I/O requests across multiple processors to prevent one processor becoming a bottleneck.
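As a minimal sketch of this interleaving, the round-robin mapping below shows how a file block can be assigned to a disk; the disk count and block size are assumptions for illustration only, not parameters of the proposed system.

    #include <stdio.h>

    #define NUM_DISKS  20    /* assumed: one disk per computer node */
    #define BLOCK_SIZE 4096  /* assumed file block size in bytes    */

    /* Map a byte offset within a file to the disk that stores it and
     * the block index on that disk. */
    void locate_block(long offset, int *disk, long *block)
    {
        long file_block = offset / BLOCK_SIZE;
        *disk  = (int)(file_block % NUM_DISKS); /* round-robin across disks */
        *block = file_block / NUM_DISKS;        /* position on that disk    */
    }

    int main(void)
    {
        int disk;
        long block;
        locate_block(5L * BLOCK_SIZE + 100, &disk, &block);
        printf("disk %d, block %ld\n", disk, block); /* prints: disk 5, block 0 */
        return 0;
    }

Because consecutive blocks land on different disks, a large sequential transfer keeps all the disks, and hence all the processors serving I/O requests, busy at once.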

The thesis recommends the use of a simple operating system kernel running on each of the processors. The kernel should manage memory allocation, process creation, process scheduling, and interprocess communication. Additional facilities such as file systems can run as user level processes. Processes should be able to run in independent virtual address spaces, or run as subprocesses sharing a single address space to ease the sharing of data.

Good program development facilities are essential to maximizing the potential of multiple processors to cooperate in parallel to reduce the running time of an application. This thesis recommends supporting programming languages widely used in industry such as C and C++. Extensions to these languages, in the form of library routines for creating subprocesses and communicating between processes, can provide access to the parallel processing capabilities of the system, as the sketch below suggests. Computer-aided programming techniques for programming parallel machines are an important area of further research. These techniques have the potential of offering similar advantages to software developers as computer-aided design tools have provided for circuit designers. The graphical user interface on the workstations will provide a productive platform for these techniques.
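A hedged sketch of the kind of library extension this recommendation implies follows. The routine names (rts_spawn, rts_send, rts_receive) are hypothetical stand-ins for whatever process-creation and message-passing calls the system provides, so the fragment will not link without such a library.

    #include <stdio.h>

    /* Hypothetical library routines (assumed, not an actual API). */
    extern int rts_spawn(void (*entry)(void));          /* create a subprocess   */
    extern int rts_send(int pid, const void *b, int n); /* message to a process  */
    extern int rts_receive(void *b, int max);           /* block for any message */

    static void worker(void)
    {
        double partial = 0.0;
        /* ... compute a partial result on this node's share of the data ... */
        rts_send(0, &partial, sizeof partial);  /* report to the master */
    }

    int main(void)
    {
        int i;
        double partial, total = 0.0;

        for (i = 0; i < 4; i++)        /* start four cooperating workers */
            rts_spawn(worker);
        for (i = 0; i < 4; i++) {      /* gather the partial results     */
            rts_receive(&partial, sizeof partial);
            total += partial;
        }
        printf("total = %g\n", total);
        return 0;
    }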

Parallel algorithms for applications such as simulated annealing and circuit simulation favour the use of large shared data structures. A message passing programming paradigm makes it difficult for a software developer to coordinate access to shared data. Thus this thesis recommends providing support for the shared variable programming paradigm, normally found on shared memory machines, at the memory management level. This high level abstraction will result in a loss in efficiency compared to message passing, particularly if an algorithm makes frequent updates to a shared structure. However it allows for faster prototyping of parallel software.
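The contrast can be made concrete with a small sketch: under the shared variable paradigm the developer updates a structure directly under a lock, while the memory management level turns those operations into messages. The calls shared_alloc, shared_lock, and shared_unlock are hypothetical names for the support described above.

    extern void *shared_alloc(unsigned size); /* allocate a shared region        */
    extern void  shared_lock(void *region);   /* gain exclusive access           */
    extern void  shared_unlock(void *region); /* release; underneath, each lock/
                                                 unlock may cost message traffic */

    struct net_table { int count; int node[1024]; };

    /* Add a node to a net table as if memory were shared; frequent calls
     * here are where the efficiency loss against message passing appears. */
    void add_net(struct net_table *nets, int n)
    {
        shared_lock(nets);
        nets->node[nets->count++] = n;
        shared_unlock(nets);
    }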

In summary, this thesis provides a framework for either building a new system or modifying an existing system to reduce the running time of CAD applications through the use of parallel processing. It also provides a survey of the techniques used in parallel CAD algorithms that would assist in developing applications for the system.

Although detailed simulations of the effect of hardware and software parameters of the proposed system on the running time of CAD applications used in developing large integrated circuits would be useful, this would be very time consuming. The alternative, which others have already done, is to make simplifying assumptions and simulate parameters of sections of the system with workloads representative of those used in practical applications. Thus the next step is to develop a suite of industrial quality CAD applications on an implementation of the system, and evaluate the effects of system parameters on the performance of these applications, when used to design integrated circuits with over 100,000 transistors.

Appendix A

Real Time Subsystem Architecture

This appendix discusses a commercial alternative that the Electrical and Electronic Engineering Department looked at before deciding to fund the design and construction of the real-time subsystem. It then goes on to detail the specifications and resulting architecture of the real-time subsystem.

A.1 Commercial Alternatives

The Department looked into buying a commercial laboratory computer to meet its needs, in particular the MASSCOMP MC-500. Figure A.1 shows the architecture of this computer. The design aims to meet the requirements of both general purpose and real-time applications. It achieves this by dividing the tasks of data processing, data acquisition/control, and data display among three main processors. (MASSCOMP, RTU, Performance Architecture, STD+Bus, and Triple Bus are trademarks of the Massachusetts Computer Corporation.)

[Figure A.1: MC500 Minicomputer Architecture [Masscomp, ] — block diagram of the 500-Series system bus (32-bit CPU, up to 64 MB RAM, floating point and array processors), the Multibus (graphics, disk, tape, I/O, Ethernet, and DA/CP controllers, with CRT, keyboard, mouse, disks, and VDT attached), and the STD+Bus (clocks, digital I/O, serial line, and IEEE-488 modules).]

The data acquisition and control processor (DA/CP) attaches to a modified version of the industry standard STD bus. The STD+Bus consists of two standard 8-bit wide STD buses connected in parallel to provide a combined bandwidth of 2 Mbytes/sec; this allows analog data acquisition at sampling rates up to 1 million samples/sec with 12 bits of resolution. Bus modules are available to provide analog and digital input/output (I/O), for interacting with real world systems. Interfaces are also available for connecting to other laboratory instruments that often have either a serial line, or General Purpose Interface Bus (GPIB, IEEE-488), connection. The DA/CP is responsible for the management of these facilities, and has direct memory access (DMA) to the CPU's main memory. The main memory serves as a data buffer for the DA/CP, as it transfers data to disk over the 16-bit wide Multibus (IEEE-796), which has a bandwidth of 3 Mbytes/sec. The DA/CP provides simple preprocessing of the data before storing it in memory.

Multibus provides the interface for system peripherals such as disk drives, terminal controllers and local area network interfaces. It also connects to up to four graphics processors that provide a window based graphics display for presenting data. These processors are Motorola 68000 based.

The main central processing unit (CPU) attaches to a proprietary, 32-bit wide system bus that has a bandwidth of 8 Mbytes/sec. There is enough bandwidth to allow the CPU to provide general purpose processing tasks while the DA/CP carries out high speed transfers. The CPU uses a 16-bit Motorola MC68010 processor to provide general purpose processing. Optional accelerator cards are available for providing high speed floating point and array processing.

The MC500 uses a modified version of the UNIX operating system, called the RTU operating system, to support real-time applications. It allows fixed-priority real-time tasks.

(Multibus is a trademark of Intel Corporation.)

The highest priority task gets unlimited CPU time until it completes, pauses, waits for I/O, or a higher priority process interrupts it. A real-time process can prevent the operating system swapping it to disk, or paging parts of it to disk. A task can use contiguous areas on disk to store data. This ensures that the system maintains a predictable, sustained data rate to disk; normally an operating system creates fragmented files to maximize the disk storage available. Blackett [Blackett, 1983] describes the characteristics of the operating system in more detail.

Masscomp designers describe this system in more detail [Boxer, 1983; Cane & Mullen, 1984; Dewey Jr., 1983; Levreault Jr., 1983].

A.2 Specification for the Real Time Subsystem

The Department decided against purchasing a commercial system such as the Masscomp MC500, because it duplicated many of the facilities already available for general purpose computing, and hence was too expensive for the data acquisition and control facilities it provided. Instead, the Department decided to fund the design and construction of a real-time subsystem (RTS), for the VAX 11/780 general purpose computer system (GPS), to provide the real-time facilities required for laboratory research. Figure A.2 shows a block diagram for this system.

The initial specifications for the RTS were:

• minimal operating system overheads,
• a 32-bit processing architecture,
• at least 1 Megabyte of memory,
• at least a 24-bit address space,
• A/D and D/A facilities with at least 8-bit resolution, 4 channels, and a 500 kHz sampling rate,
• A/D and D/A facilities with at least a 20 kHz sampling rate, 12-bit resolution, and 4 channels,
• a high speed (200 kbytes/sec) data link to the VAX 11/780,
• and an industry standard bus backplane.

[Figure A.2: System Overview — the GPS (VAX 11/780) with its plotter, printer, graphics terminals, and disk, connected to the RTS with its A/D, D/A, and digital I/O modules.]

Without operating system overheads, such as file system management, memory management, and process scheduling, the RTS will minimize its response time to external events. The GPS will supply the environment and utilities for RTS software development and analysis of results. As the RTS will be without virtual memory, the available memory of the RTS will limit the size of real-time processes. Only one user at a time will operate the RTS, so the GPS will only load programs needed for the current experiment into RTS memory. All these programs will remain resident in the RTS memory for the duration of the experiment; facilities for swapping idle processes out of memory will be unnecessary. The timing behaviour of the RTS will be predictable, allowing it to meet the real-time needs of experiments. A library of routines will be available on the GPS for hiding the complexities of controlling the I/O hardware on the RTS.

The main processor of the RTS shall take advantage of a modern 32-bit wide microprocessor architecture. The microprocessor will provide local processing for real-time control, and preprocessing of acquired data before transferring it to the GPS for further non real-time processing. For real-time control applications, it is desirable to have some floating point processing support, perhaps with a coprocessor chip. The main processor will provide the same functions as the DA/CP of the MC500 (see Section A.1) for controlling the operation of I/O interfaces to the external world.

The RTS needs enough local memory to store user programs, and buffer data during data acquisition. At high sampling rates this memory will fill rapidly; for example, 1 second of 8-bit sampling, at 1 million samples/second, will fill 1 Megabyte of memory. At least 1 Megabyte of local memory is desirable, with room for further expansion if necessary. To allow for future expansion, the RTS requires at least 24 bits of addressing, which can address up to 16 Megabytes of memory.

For high speed data acquisition applications, the RTS needs to acquire data at a rate of at least 500,000 samples/second, with at least 4 channels. The higher the sampling rate, the harder it is to achieve high resolution in the samples. For most applications that require high sampling rates, 8 bits of resolution will suffice.

For applications that need accurate values for I/O, such as some real-time control applications, the RTS I/O needs a resolution of 12 bits/sample. At this level of accuracy, a sampling rate of 20,000 samples/sec is acceptable.

The RTS needs a high speed data link to the GPS to receive programs downloaded from the GPS, and to send back acquired data for further analysis. A conventional 9600 baud serial line takes about 15 minutes to transfer a Megabyte of data. This is too slow if several blocks of measurements need to be taken. The data link should allow the RTS to transfer data at the maximum rate that the VAX can store data onto disk, which is about 200 kilobytes/sec.

To avoid having to design and build each component of the system, the RTS shall use an industry standard bus backplane. This allows the RTS to take advantage of bus modules from different manufacturers who each specialize in different areas of module design. Bus modules include analog-to-digital (A/D) converter, digital-to-analog (D/A) converter, digital input/output (I/O), central processing unit (CPU), and memory modules. Modules incorporating analog components are particularly hard to design to achieve high resolution. The object of CPU module design is to cram as many facilities as possible onto the available board area. This maximizes the performance of the CPU as it does not need to access as many facilities over the bus, but makes the design more complex. Thus the design of different types of modules requires different design skills, and designers benefit from purchasing modules from manufacturers who specialize in particular areas of module design. Purchasing modules also saves time, and hence reduces costs for the overall design.

A.3 Architecture of the Real Time Subsystem

The Real Time Subsystem comprises:

• a 12-slot VMEbus backplane enclosed in a rugged card cage with cooling fans and power supply,
• a single board computer (SBC) module,
• a debug/diagnostic monitor for the SBC module,
• a 16-bit wide data link to the GPS (VAX 11/780),
• a 16-channel high speed analog-to-digital (A/D) converter module,
• an 8-channel high speed digital-to-analog (D/A) converter module,
• and a 32-bit digital I/O and programmable timer module.

Figure A.3 illustrates this architecture.

[Figure A.3: Real Time Subsystem (RTS) Architecture — the SBC and the 16-bit parallel data link interface on the VMEbus, together with the A/D (16 channel), D/A (8 channel), and TPI (16 channel input, 16 channel output) modules.]

A.3.1 VMEbus Backplane

The RTS uses VMEbus as its industry standard backplane bus [Micrology, 1985]. VMEbus is one of several 32-bit wide buses that were available at the time of the RTS's design. The other buses were Multibus II, NuBus, and Futurebus. Borrill has written several papers describing the characteristics of these buses [Borrill, 1989; Borrill, 1985; Borrill, 1986].

VMEbus is the most widely used bus in industry, and hence has many bus modules available for interfacing to external systems. It uses an asynchronous protocol for data transfers across the bus; this allows modules of different speeds to operate together, and allows speed improvements to be made to individual modules without disturbing the rest of the system design. VMEbus modules are of two sizes that adhere to the Eurocard standard (IEEE 1101): the double height, standard depth Eurocard (233.4mm x 160mm) and the single height, standard depth Eurocard (100mm x 160mm). The smaller size allows for a smaller scale bus that only supports 16-bit wide transfers. The bus connectors are reliable, two-part connectors sealed against corrosion. The mechanical characteristics of the bus system make it ideal for use in an industrial environment subject to vibration and dust. The maximum quoted bandwidth of the bus is 40 Mbytes/sec [Borrill, 1989].

The wide availability of bus modules for VMEbus, its asynchronous protocol, its rugged construction, and its high bandwidth compared to existing facilities, led to using VMEbus as the foundation for the RTS.

A.3.2 Central Processing Unit

The Motorola MVME-133 single board computer (SBC) module serves as the central processing unit of the RTS [Motorola, 1986]. It consists of:

• a 16 MHz Motorola MC68020 microprocessor coupled with a 16 MHz MC68881 floating point coprocessor,
• 1 Megabyte of local memory,
• facilities for terminal I/O,
• facilities to handle on-board and VMEbus interrupts,
• four 28-pin sockets for ROM/PROM/EPROM/EEPROM,
• four 8-bit timers,
• a real-time clock,
• and a VMEbus system controller interface.

The Department already had facilities for doing software development work for the Motorola MC68000 microprocessor on the VAX 11/780, as part of its teaching program. This led to choosing the MC68020, which is object code compatible with the MC68000 processor, as the basis for the 32-bit processing architecture of the RTS [Motorola, 1984].

The MC68020 microprocessor has a full 32-bit architecture. It has 32-bit wide registers and data paths, with 32-bit addressing. The chip provides separate 32-bit address and data buses for interfacing to external circuits. It has a few added features compared to the MC68000, such as new addressing modes, an on-chip instruction cache, and a coprocessor interface.

The MC68881 floating point coprocessor can speed up floating point calculations performed by the MC68020 by up to 100 times [Motorola, 1985]. The coprocessor fully implements the IEEE standard for Binary Floating-Point Arithmetic. It supports several data formats including integers and single, double, and extended precision real numbers. Before operating on any data, it converts all data formats into 80-bit extended precision format. It supports most mathematical operations including: arithmetic (e.g. add, subtract, multiply, divide, and square root), trigonometric (e.g. sine, cosine, tangent), and logarithmic (in bases 2, e, and 10).

Both the MC68020 and the MC68881 operate at a clock frequency of 16.67 MHz on the SBC. The MC68020 can access memory in a minimum of 3 clock cycles. The 1 Megabyte of memory on the SBC contains 120ns, 256K x 1 dynamic RAM ZIPs (zig-zag in-line package). Access to the memory array requires 4 clock cycles, thereby introducing one wait state for memory accesses by the MC68020.

The on-board real-time clock provides the time from tenths of seconds to tens of years. It can interrupt the MC68020 at intervals of 0.1, 0.5, 1, 5, 10, 30, or 60 seconds.

The SBC provides an A24/D32 VMEbus interface. This means that it drives only the least significant 24 bits of the 32-bit VMEbus address bus, whereas it can access all 32 bits of the data bus. In any one address space it can only access 16 Megabytes of memory. Instructions and data can occupy separate address spaces.

The SBC can generate an interrupt to other modules on the VMEbus. The on-board interrupt handler services and arbitrates all 7 priority levels of interrupt on the VMEbus.

The SBC can request control of the VMEbus if another module is currently in control. An on-board single level arbiter arbitrates requests from modules requiring control of the VMEbus. It relies on the VMEbus bus grant/bus request daisy chain to do the arbitration.

The SBC also provides the 16 MHz VMEbus clock, and a system reset signal that will reset all other modules on the VMEbus.

These facilities provide a powerful system for processing of data before sending it to the GPS, or for real-time processing of data.

The SBC acts as the bus master on the RTS; all the other modules in the system behave as bus slaves. They can either contact the bus master by generating an interrupt, or the master can poll them to determine when they need to be serviced. Researchers can alter the priority of the interrupts to suit each application. The other bus modules use a simple VMEbus interface, which allows more room on the slave modules for extra functions.

A.3.3 On-board Software

The MVME-133 single board computer module has an on-board monitor program stored in ROM that provides:

• an interface to a standard RS-232-C ASCII terminal, which connects to the front panel of the SBC to act as a system console terminal,
• facilities for loading and executing user programs under complete operator control,
• commands for displaying and changing memory,
• an EEPROM (Electrically Erasable PROM) programmer,
• hardware diagnostic and debug commands,
• a one-line assembler/disassembler for program monitoring and changing, with support for the floating point coprocessor,
• a power-up self test,
• terminal I/O routines callable from user-written programs,
• and a transparent mode for connecting the console terminal through to a remote host connected to a rear serial port.

Additional code is available in ROM to control the communication over the data link to the VAX 11/780. The EEPROM programmer aided the development of this additional code. The monitor is the only operating system facility available on the RTS.

A.3.4 A/D Converter Module

The Burr-Brown MPV-950D module provides the analog to digital conversion facilities for the RTS [Burr-Brown, 1985]. It consists of:

• a 12-bit A/D converter,

• 16 analog input channels,
• three analog input ranges: 0 to +10V, ±5V, or ±10V,
• 8 configurable input operational amplifiers,
• and an external trigger input.

The 16 analog input channels feed into an analog multiplexer, which allows a single 12-bit A/D converter to convert each of them into a digital result. Eight of the channels feed through operational amplifiers that researchers can configure to accept differential analog inputs and provide various gains.

At 12 bits of resolution the module can sustain a sampling rate of 330,000 samples/sec, and at 8 bits of resolution it can sustain 500,000 samples/sec. The number of channels in use at a time determines the sampling rate per channel; for example, with four channels active at 12 bits of resolution, each channel can sample at about 82,500 samples/sec.

The A/D module appears as 64 consecutive memory locations in the memory map of the SBC; each channel has a separate address. The SBC maintains control of the A/D module by accessing a control and status register. To select a particular input to be sampled, the SBC reads from the module using the address of the next input channel to be converted. The SBC will receive the digital result of the previous conversion when it does this. If the SBC has enabled external triggering, the converter will wait for the trigger before starting the conversion; otherwise it begins conversion immediately after the previous digital result has been read. The SBC can either poll the status register, or wait for an interrupt, to determine when a conversion is complete.

At high sampling rates it is necessary for the SBC to poll the A/D converter, to prevent the overheads of responding to a VMEbus interrupt from slowing down the sampling rate. At lower sampling rates, such as in slower real-time control applications, using interrupts allows the SBC to spend most of its time processing previous data.
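The following sketch illustrates this style of program-controlled, polled sampling. The base address, register offset, and status bit are assumptions for illustration only; the real memory map is defined in the Burr-Brown manual.

    #define AD_BASE   ((volatile unsigned short *)0xFF0000) /* assumed base address */
    #define AD_STATUS (AD_BASE[32])                         /* assumed CSR location */
    #define AD_DONE   0x0001                                /* assumed 'done' bit   */

    /* Sample n values from one input channel, polling for completion.
     * Reading a channel's address returns the previous result and starts
     * the next conversion, as described above. */
    void ad_sample(int channel, unsigned short *buf, int n)
    {
        int i;
        (void)AD_BASE[channel];              /* start the first conversion */
        for (i = 0; i < n; i++) {
            while (!(AD_STATUS & AD_DONE))   /* poll: avoids interrupt latency */
                ;
            buf[i] = AD_BASE[channel];       /* collect result, trigger next */
        }
    }

At lower sampling rates, the same loop could instead enable the module's VMEbus interrupt and collect each result in an interrupt handler.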

Initially the A/D converter module would not function correctly when operating with the MVME-133. This was because the module designers had designed the module to operate with an earlier VMEbus specification, when CPU modules operated at much slower clock rates. After several modifications, the module now functions correctly. Thus even though asynchronous buses such as VMEbus should handle modules of different speeds, high speed modules may expose timing bugs present in slower modules.

A.3.5 D/A Converter Module

The Burr-Brown MPV-954 module provides the digital to analog conversion facilities for the RTS [Burr-Brown, 1986]. It features:

• eight 12-bit D/A output channels,
• four analog output ranges: 0 to +5V, 0 to +10V, ±5V, or ±10V,
• 4 kwords of on-board RAM,
• a programmable conversion rate,
• continuous or one-shot mode,
• an on-board timer and an external trigger to control conversion rates,
• and a reconstruction filter on each output.

The on-board RAM stores the yet-to-be-converted digital data. The D/A converters access the data from RAM, and output it in analog form at a rate determined by a software-configurable on-board timer or an external trigger.

The maximum rate at which the D/A converters can update their analog outputs is 857,000 samples/second. The D/A converters take 2 µs to settle to within 0.1% of their final value after a full-scale voltage step.

The start and stop address registers define the section of data memory from which the D/A converters read the data. The SBC selects the number of channels required by an application by writing to the control register. To start the conversion process, the SBC can either read from, or write to, the start register.

The converters can run in either single shot or continuous mode. In single shot mode, the module outputs the data stored in data memory between the start and stop addresses, and stops after it outputs the data at the stop address. The D/A converters will hold the analog outputs corresponding to the last digital value they have converted. In continuous mode, the module continuously cycles through the data between the start and stop addresses. In this mode the D/A converter module behaves like a signal generator, with the waveform stored in digital form in the data memory. The data memory is dual-ported, enabling the SBC to update the digital data while the D/A converters are updating their analog outputs. The SBC stops the conversion cycle by writing to the control register.
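A short sketch of driving the module as a signal generator in continuous mode follows. The register addresses and control bit are assumptions for illustration, not the module's documented memory map.

    #define DA_RAM   ((volatile unsigned short *)0xFF2000)  /* assumed data RAM base  */
    #define DA_CTRL  (*(volatile unsigned short *)0xFF3000) /* assumed control reg    */
    #define DA_STOP  (*(volatile unsigned short *)0xFF3002) /* assumed stop address   */
    #define DA_START (*(volatile unsigned short *)0xFF3004) /* assumed start address  */
    #define DA_CONT  0x0001                                 /* assumed continuous bit */

    /* Load a digital waveform and replay it continuously; the dual-ported
     * data memory lets the SBC update the waveform while conversion runs. */
    void da_play_waveform(const unsigned short *wave, int len)
    {
        int i;
        for (i = 0; i < len; i++)
            DA_RAM[i] = wave[i];    /* waveform into on-board data memory */
        DA_STOP  = len - 1;         /* last address of the waveform       */
        DA_CTRL  = DA_CONT;         /* one channel, continuous mode       */
        DA_START = 0;               /* writing the start register begins
                                       the conversion cycle               */
    }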

Both an internal trigger and an external trigger mode are available to control the conversion rate. If the module uses internal trigger mode, a rate timer controls the conversion rate. The rate timer is programmable in 500 ns increments from 1.5 µs to 127.5 µs. Thus the slowest rate at which the converters will update their outputs is 980 samples/second under internal trigger mode. Alternatively, an external trigger can start each conversion. This allows slower sampling rates, and allows the module to synchronize outputs with an external experiment. The external trigger can also act as an event trigger to start the conversions under the control of the rate timer. The module provides an external trigger output to help synchronize conversions with an external experiment. This is particularly useful when the module uses internal triggering under the control of the rate timer.

The SBC can either program the D/A module to send an interrupt, or poll the status register, to determine when the module has completed a conversion cycle.

A.3.6 Timer/Parallel Interface Module

The RTS uses a locally designed and constructed timer/parallel interface (TPI) module to send digital (TTL) control signals to external experiments and to monitor external digital signals. The TPI also provides a programmable timer, in response to a request by users of the RTS for a more flexible timer for generating interrupts at regular time intervals. The TPI features:

• a single 8-bit timer/counter with an 8-decade prescaler, which allows programmable time intervals from 1 microsecond to 2550 seconds,
• a single 16-bit parallel output port, with separately addressable upper and lower bytes,
• and a single 16-bit parallel input port, with separately addressable upper and lower bytes.

A.3.7 Data Link

The RTS uses a custom-designed 16-bit wide parallel interface to provide a high speed data link to the VAX 11/780 general purpose computer system. The following describes some of the reasons behind building a custom link, and goes on to describe the data link.

Ethernet is a commonly used data link for the transfer of large blocks of information, such as program source files, between computers located in a local area such as a University campus [Metcalfe & Boggs, 1976]. The VAX 11/780 is part of a large network of campus computers that use Ethernet as their local area network. One option was to add the RTS to the existing Ethernet network. This would have inconvenienced other users of the Ethernet, by saturating the Ethernet network whenever the RTS transferred large amounts of data. Typical data rates for transferring files over the local network using the file transfer program ftp are about 30 kilobytes/sec. Another alternative was to create a separate Ethernet link between the VAX 11/780 and the RTS. This alternative would have required two new Ethernet interfaces and network software to run on the RTS, which made the option too expensive for the funds available for building the RTS. The remaining option was to build a simple low cost data link that would provide a transfer rate at least as fast as the VAX 11/780 could transfer information to disk, which is about 200 kilobytes/sec.

The custom-designed data link is an extension of the DR11-C General Device Interface available for the VAX UNIBUS [Digital, 1976]. This interface provides the logic and buffer registers necessary for program-controlled 16-bit wide parallel data transfers between the VAX and an external device. Figure A.4 shows the structure of the data link. The Department already had a commercial interface board that provided the necessary interface logic for managing UNIBUS transfers. It does not take advantage of the VAX's direct memory access (DMA) facilities. The main use of the interface is to transfer data to and from files on the VAX's disk; DMA transfers will not overcome the disk transfer bottleneck. The additional design complexity of a DMA interface was not cost effective for the small increase in throughput it provides by off-loading the data transfer task from the VAX's CPU.

[Figure A.4: Data Link structure — the DR11-C interface on the GPS UNIBUS and the data link interface on the RTS VMEbus, each with 16-bit input and output buffers, connected by 16-bit data paths with data strobe (Ds), acknowledge (Dtack), and status lines in each direction.]

The parallel interface for both the GPS and RTS consists of a 16-bit wide

input buffer, a 16-bit wide output buffer, and a status register. The data link sends 16-bit wide words of data in parallel between the two interfaces, along with two status bits. The RTS controls one status bit, and the GPS controls the other. They provide a means of controlling the execution of the data transfer programs. When the RTS sends a 16-bit word of data, a control signal notifies the GPS that there is fresh data. Another control signal interrupts the RTS when the GPS has read the data. The RTS can then write another word of data into the output buffer. The transfer protocol is asynchronous; the transmitter does not send the next word of data until the receiver acknowledges receipt of the previous word of data. There is no hardware error checking; the data transfer software is responsible for checking for data errors.
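A sketch of the RTS side of this word-at-a-time protocol appears below. The register addresses and the acknowledge bit are assumptions, and the trailing software checksum stands in for the error checking that the hardware omits.

    #define DL_OUT    (*(volatile unsigned short *)0xFF4000) /* assumed output buffer   */
    #define DL_STATUS (*(volatile unsigned short *)0xFF4004) /* assumed status register */
    #define DL_ACK    0x0001   /* assumed: set when the GPS has read the word */

    /* Send n words to the GPS; each word waits for the receiver's
     * acknowledgement before the next is written, as described above. */
    void dl_send_block(const unsigned short *buf, int n)
    {
        unsigned short checksum = 0;
        int i;
        for (i = 0; i < n; i++) {
            DL_OUT = buf[i];                /* fresh data notifies the GPS  */
            checksum += buf[i];
            while (!(DL_STATUS & DL_ACK))   /* wait for receipt of the word */
                ;
        }
        DL_OUT = checksum;                  /* software, not hardware,      */
        while (!(DL_STATUS & DL_ACK))       /* checks for data errors       */
            ;
    }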

A distance of about 100m separates the RTS from the GPS. To minimize the effect of electrical noise over this distance, the data link uses differential signals over standard twisted pair telephone cable. The data link cabling is cheap, and has a measured delay of about 100ns. The parallel interfaces deskew the parallel data lines before storing data in buffers. As researchers use the RTS in an electrically clean room, the data link includes optical isolation at the RTS end of the link.

The data transfer rate between the GPS's disk storage and the RTS is currently 50 kilobytes/sec. This is fast enough to transfer programs and data to the RTS. The data rate can probably increase by a factor of 4 by allowing the data transfer software to poll the parallel interfaces to determine when they need service, thereby removing interrupt latencies. This would bring the data transfer rate up to the limit of the disk transfer rate, and would be useful for applications that need to acquire several megabytes of data in real time.

A.3.8 Software Support

The Department purchased the Amsterdam Compiler Kit to provide cross compiler facilities on the GPS for generating MC68020 machine code for use on the RTS [Tanenbaum et al., 1983]. Users of the RTS mainly program in the C programming language. A set of C library routines is available for simplifying the programmer's interface to the various facilities available on the RTS. There are routines for configuring the RTS for a particular experiment, and simple routines for controlling the input and output of information during an experiment. The compiler has locally added extensions for using the floating point capabilities of the MC68881.
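The sketch below suggests how a simple experiment might look from the programmer's side. The routine names and arguments are invented for illustration; the actual library interface on the GPS and RTS differs.

    /* Hypothetical RTS library interface (names are assumptions). */
    extern int rts_config(int channels, long rate, int bits); /* set up the A/D   */
    extern int rts_acquire(unsigned short *buf, long n);      /* sample to memory */
    extern int rts_to_gps(const void *buf, long nbytes);      /* over the link    */

    #define NSAMP 100000L

    int main(void)
    {
        static unsigned short samples[NSAMP];

        rts_config(4, 20000L, 12);        /* 4 channels, 20 kHz, 12-bit  */
        rts_acquire(samples, NSAMP);      /* buffer data in local memory */
        rts_to_gps(samples, NSAMP * sizeof(unsigned short)); /* analyse on GPS */
        return 0;
    }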

A.3.9 Summary

The RTS is now in use by researchers experimenting with techniques for identifying the parameters of an industrial system, to allow the design of an efficient system controller. The RTS induces changes in the operating point of an external system and observes the system's response. Researchers use the mathematical software available on the GPS to analyse experimental data from the RTS and determine the parameters of the system. Researchers will eventually use the RTS for evaluating algorithms, based on experimental parameters, for controlling a system.

The design has met the original specifications at a cost of about A$20,000. Most of the cost came from the purchase of the VMEbus backplane with associated power supply and cooling, the single board computer, and the analog I/O boards. These analog I/O boards represented the best compromise between flexibility and cost. They have ample channels, and have adjustable conversion rates to achieve different levels of resolution. Conversion rates exceeding 500 kilosamples/sec are possible. The single board computer was among the highest performance boards available at the time of purchase. The simple, low cost data link to the GPS kept costs down by allowing the GPS to provide software development facilities, and off-line analysis of results for the RTS.

Now that smaller portable workstations can achieve similar performance to the VAX 11/780, using a workstation as the general purpose computer will make the RTS portable. This is useful for researchers wanting to test control algorithms outside the laboratory environment.

A.3.10 Acknowledgements

I wish to thank Norman Blockley and John Hunt for their work in constructing circuits, and Alf Grasso for his work on the software library for the RTS. Special thanks to Michael Liebelt for providing general encouragement and support for the work, and for developing the TPI module. Recognition must also go to the undergraduate students who have worked with the early versions of the RTS: J. Schutz, Richard Middelmann, and Andrew Beaumont-Smith.

Bibliography

Ackland, B. D. and Clark, R. A., 1989. Event-EMU: An Event Driven Timing Simulator for MOS VLSI Circuits. In Proceedings of the International Conference on Computer-Aided Design (ICCAD), pages 80-83, IEEE Computer Society, November.

Ackland, B. D., 1988. Knowledge-Based Physical Design Automation. In Preas, B. T. and Lorenzetti, M. J., editors, Physical Design Automation of VLSI Systems, chapter 9, Benjamin/Cummings, Menlo Park, CA, USA.

Ackland, B. D., Ahuja, S. R., Lindstrom, T. L., and Romero, D. J., 1985. CEMU - A Concurrent Timing Simulator. In Proceedings of the International Conference on Computer-Aided Design (ICCAD), pages 722-724, IEEE Computer Society, November.

Ackland, B., Ahuja, S., DeBenedictis, E., London, T., Lucco, S., and Romero, D., 1986. MOS Timing Simulation on a Message Based Multiprocessor. In Proceedings of the International Conference on Computer Design (ICCD), pages 446-450, IEEE Computer Society, October.

Agarwal, A., Simoni, R., Hennessy, J., and Horowitz, M., 1988. An Evaluation of Directory Schemes for Cache Coherence. In Proceedings of the 15th Annual International Symposium on Computer Architecture (ISCA), pages 280-289, ACM SIGARCH, Computer Architecture News 16(2), May.

Agrawal, P. and Dally, W. J., 1990. A Hardware Logic Simulation System. IEEE Transactions on Computer-Aided Design, 9(1):19-29, January.

Agrawal, R. and Jagadish, H. V., 1988. Partitioning Techniques for Large-Grained Parallelism. IEEE Transactions on Computers, 37(12):1627-1634, December.

Agrawal, P., 1986. Concurrency and Communication in Hardware Simulators. IEEE Transactions on Computer-Aided Design, CAD-5(4):617-623, October.

Agrawal, P., Dally, W. J., Fischer, W. C., Jagadish, H. V., Krishnakumar, A. S., and Tutundjian, R., 1987. MARS: A Multiprocessor-based Programmable Accelerator. IEEE Design & Test of Computers, 4(5):28-37, October.

Ahuja, S. R., 1983. S/NET: A High-Speed Interconnect for Multiple Computers. IEEE Journal on Selected Areas in Communications, SAC-1(5):751-756, November.

Ahuja, S., Carriero, N., and Gelernter, D., 1986. Linda and Friends. Computer, 19(8):26-34, August.

Ahuja, S., Carriero, N. J., Gelernter, D. H., and Krishnaswamy, V., 1988. Matching Language and Hardware for Parallel Computation in the Linda Machine. IEEE Transactions on Computers, 37(8):921-929, August.

Amdahl, G. M., 1967. Validity of the single processor approach to achieving large scale computing capabilities. AFIPS Conference Proceedings, 30:483-485, Spring.

Annaratone, M., Pommerell, C., and Rühl, R., 1989. Interprocessor Communication Speed and Performance in Distributed-memory Parallel Processors. In Proceedings of the 16th Annual International Symposium on Computer Architecture (ISCA), pages 315-324, ACM SIGARCH, Computer Architecture News 17(3), June.

Armstrong, D. B., 1972. A Deductive Method for Simulating Faults in Logic Circuits. IEEE Transactions on Computers, C-21(5):464-471, May.

Arnould, E. A., Bitz, F. J., Cooper, E. C., Kung, H. T., Sansom, R. D., and Steenkiste, P. A., 1989. The Design of Nectar: A Network Backplane for Heterogeneous Multicomputers. In Proceedings of the 3rd International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-III), pages 205-216, ACM, Computer Architecture News 17(2), April.

Asano, T., Sato, M., and Ohtsuki, T., 1986. Computational Geometry Algorithms. In Ohtsuki, T., editor, Layout Design and Verification, chapter 9, Elsevier Science, Amsterdam, The Netherlands.

Ashenden, P. J. and Marlin, C. D., 1988. A Behavioural Specification of Cache Coherence. The Australian Computer Journal, 20(2):50-57, May.

Ashenden, P. J., Barter, C. J., and Marlin, C. D., 1987. The Leopard Workstation Project. Computer Architecture News, 15(4):40-50, September.

Ashenden, P. J., Barter, C. J., and Petty, M. A., 1989. The Leopard Multiprocessor Workstation Project. Technical Report LW-1, University of Adelaide, Centre for Computer Systems and Software Engineering, November.

Ashok, V., Costello, R., and Sadayappan, P., 1985. Distributed Discrete Event Simulation using Dataflow. In Proceedings of the International Conference on Parallel Processing (ICPP), pages 503-510, IEEE Computer Society, August.

Ashok, V., Costello, R., and Sadayappan, P., 1985. Modeling Switch-Level Simulation using Dataflow. In Proceedings of the 22nd Design Automation Conference, pages 637-644, ACM/IEEE, June.

Athas, W. C. and Seitz, C. L., 1988. Multicomputers: Message-Passing Concurrent Computers. Computer, 21(8):9-24, August.

August, M. C., Brost, G. M., Hsiung, C. C., and Schiffleger, A. J., 1989. Cray X-MP: The Birth of a Supercomputer. Computer, 22(1):45-52, January.

Bailey, M. L. and Snyder, L., 1989. A Model for Comparing Synchronization Strategies for Parallel Logic-Level Simulation. In Proceedings of the IEEE International Conference on Computer-Aided Design (ICCAD), pages 502-505, IEEE Computer Society, November.

Bal, H. E. and Tanenbaum, A. S., 1988. Distributed Programming with Shared Data. In Proceedings of the International Conference on Computer Languages, pages 82-91, IEEE Computer Society, October.

Bal, H. E., Kaashoek, M. F., and Tanenbaum, A. S., 1989. A Distributed Implementation of the Shared Data-Object Model. In Proceedings of the 1st USENIX Workshop on Experiences with Building Distributed and Multiprocessor Systems, pages 1-19, USENIX, October.

Bal, H. E., Steiner, J. G., and Tanenbaum, A. S., 1989. Programming Languages for Distributed Computing Systems. ACM Computing Surveys, 21(3):261-322, September.

Bal, H. E., Kaashoek, M. F., and Tanenbaum, A. S., 1990. Experience with Distributed Programming in Orca. In Proceedings of the International Conference on Computer Languages, pages 79-89, IEEE Computer Society, March.

Banerjee, P., 1988. Parallel Algorithms for Design Verification. In A Tutorial on the use of Parallel Processing in VLSI CAD Applications, IEEE Computer Society, International Conference on Computer-Aided Design (ICCAD), November.

Banerjee, P., Jones, M. H., and Sargent, J. S., 1990. Parallel Simulated Annealing Algorithms for Cell Placement on Hypercube Multiprocessors. IEEE Transactions on Parallel and Distributed Systems, 1(1):91-106, January.

Batcher, K. E., 1974. STARAN parallel processor system hardware. Proceedings of the National Computer Conference (AFIPS), 43:405-410.

Batcher, K. E., 1980. Design of a Massively Parallel Processor. IEEE Transactions on Computers, C-29(9):836-840, September.

Bays, L. E., Chen, C., Fields, E. M., Gadenz, R. N., Hays, W. P., Moscovitz, H. S., and Szymanski, T. G., 1989. Post-layout Verification of the WE DSP32 Digital Signal Processor. IEEE Design and Test of Computers, 6(1):56-65, February.

Belkhale, K. P. and Banerjee, P., 1988. PACE: A Parallel VLSI Extractor on the Intel Hypercube. In Proceedings of the International Conference on Computer-Aided Design (ICCAD), pages 326-329, IEEE Computer Society, November.

Belkhale, K. P. and Banerjee, P., 1989. PACE2: An Improved Parallel VLSI Extractor with Parameter Extraction. In Proceedings of the International Conference on Computer-Aided Design (ICCAD), pages 526-529, IEEE Computer Society, November.

Bell, C. G., 1985. Multis: A New Class of Multiprocessor Computers. Science, 228:462-467, 26 April.

Bell, G., 1989. The Future of High Performance Computers in Science and Engineering. Communications of the ACM, 32(9):1091-1101, September.

Bentley, J. L., Haken, D., and Hon, R. W., 1980. Statistics on VLSI Designs. Technical Report CMU-CS-80-111, Carnegie-Mellon University, Pittsburgh, Penn., USA, April.

Berra, P. B., Ghafoor, A., Guizani, M., Marcinkowski, S. J., and Mitkas, P. A., 1989. Optics and Supercomputing. Proceedings of the IEEE, 77(12):1797-1815, December.

Bier, G. B. and Pleszkun, A. R., 1985. An Algorithm for Design Rule Checking on a Multiprocessor. In Proceedings of the 22nd Design Automation Conference, ACM/IEEE, June.

Birrell, A. D. and Nelson, B. J., 1984. Implementing Remote Procedure Calls. ACM Transactions on Computer Systems, 2(1):39-59, February.

Black, D. L., 1990. Scheduling Support for Concurrency and Parallelism in the Mach Operating System. Computer, 23(5):35-43, May.

Blackett, R. K., 1983. UNIX-based system runs real-time applications. Mini-Micro Systems, May.

Blair, G. S., Malone, J. R., and Mariani, J. A., 1985. A Critique of UNIX. Software - Practice and Experience, 15(12):1125-1139, December.

Blank, T., 1984. A Survey of Hardware Accelerators used in Computer-Aided Design. IEEE Design & Test, 1:21-39, August.

Blank, T., 1990. The MasPar MP-1 Architecture. In Proceedings of the COMPCON Spring Conference, pages 20-24, IEEE Computer Society, February.

Blank, T., Stefik, M., and VanCleemput, W., 1981. A Parallel Bit Map Processor Architecture for DA Algorithms. In Proceedings of the 18th Design Automation Conference, ACM/IEEE, June.

Borrill, P. L., 1985. Microstandards Special Feature: A Comparison of 32-Bit Buses. IEEE Micro, :71-79, December.

Borrill, P. L., 1986. Objective comparison of 32-bit buses. Microprocessors and Microsystems, 10(2):94-100, March.

Borrill, P. L., 1989. High-speed 32-bit buses for forward-looking computers. IEEE Spectrum, :34-37, July.

Bosshart, P., 1987. Overview of a 32-bit Microprocessor Design Project. In Fichtner, W. and Morf, M., editors, VLSI CAD Tools and Applications, chapter 12, Kluwer, Norwell, Mass., USA.

Bottorff, P. S., 1981. Computer Aids to Testing - An Overview. In Antognetti, P., Pederson, D. O., and De Man, H., editors, Computer Design Aids for VLSI Circuits, pages 417-463, Sijthoff & Noordhoff, Alphen aan den Rijn, The Netherlands.

Boxer, A., 1983. Designing for High Performance Data Acquisition. Computer Design, September.

Brady, J. T., 1986. A Theory of Productivity in the Creative Process. IEEE Computer Graphics and Applications (CG&A), :25-34, May.

Breuer, M. A. and Friedman, A. D., 1976. Diagnosis & Reliable Design of Digital Systems. Computer Science Press, Woodland Hills, CA, USA.

Breuer, M. A., 1972. Design Automation of Digital Systems. Volume 1: Theory and Techniques. Prentice-Hall, Englewood Cliffs, NJ, USA.

Briner Jr., J. V., Ellis, J. L., and Kedem, G., 1988. Taking Advantage of Optimal On-Chip Parallelism for Parallel Discrete Event Simulation. In Proceedings of the International Conference on Computer-Aided Design (ICCAD), pages 312-315, IEEE Computer Society, November.

Broomell, G. and Heath, J. R., 1983. Classification Categories and Historical Development of Circuit Switching Topologies. ACM Computing Surveys, 15(2):95-133, June.

Brouwer, R. and Banerjee, P., 1988. A Parallel Simulated Annealing Algorithm for Channel Routing on a Hypercube Multiprocessor. In Proceedings of the International Conference on Computer Design (ICCD), IEEE Computer Society, October.

Brouwer, R. J. and Banerjee, P., 1990. PHIGURE: A Parallel Hierarchical Global Router. In Proceedings of the 27th Design Automation Conference (DAC), pages 650-653, ACM/IEEE, June.

Bryant, R. E., 1984. A Switch-Level Model and Simulator for MOS Digital Systems. IEEE Transactions on Computers, C-33(2):160-177, February.

Bryant, R. E., 1988. Data Parallel Switch-Level Simulation. In Proceedings of the International Conference on Computer-Aided Design (ICCAD), pages 354-357, IEEE Computer Society, November.

Burr-Brown, 1985. MPV950D High Speed Analog Input Board Operating Manual. Burr-Brown Limited, DSP Group, Livingston, Scotland.

Burr-Brown, 1986. MPV954 High Speed Analog Output Board Operating Manual. Burr-Brown Limited, DSP Systems Division, Livingston, Scotland, 2.5 edition.

Cajani, V., 1987. Architectural Issues in Designing a UNIX Multiprocessor System. Microprocessing and Microprogramming, 20:79-84.

Cane, D. and Mullen, S., 1984. Triple-bus Architecture Gains Speed, Versatility. Computer Design, February.

Carlson, E. C. and Rutenbar, R. A., 1987. A Scanline Data Structure Processor for VLSI Geometry Checking. IEEE Transactions on Computer-Aided Design, CAD-6(5):780-794, September.

Carlson, E. C. and Rutenbar, R. A., 1988. Mask Verification on the Connection Machine. In Proceedings of the 25th Design Automation Conference, pages 134-140, ACM/IEEE, June.

Carlson, E. C. and Rutenbar, R. A., 1990. Design and Performance Evaluation of New Massively Parallel VLSI Mask Verification Algorithms in JIGSAW. In Proceedings of the 27th Design Automation Conference (DAC), pages 253-259, ACM/IEEE, June.

Carriero, N. and Gelernter, D., 1989. How to Write Parallel Programs: A Guide to the Perplexed. ACM Computing Surveys, 21(3):323-357, September.

Carriero, N. and Gelernter, D., 1989. Linda in Context. Communications of the ACM, 32(4):444-458, April.

Casotto, A. and Sangiovanni-Vincentelli, A., 1987. Placement of Standard Cells Using Simulated Annealing on the Connection Machine. In Proceedings of the International Conference on Computer-Aided Design (ICCAD), pages 350-353, IEEE Computer Society, November.

Casotto, A., Romeo, F., and Sangiovanni-Vincentelli, A., 1987. A Parallel Simulated Annealing Algorithm for the Placement of Macro-Cells. IEEE Transactions on Computer-Aided Design, CAD-6(5):838-847, September.

Chaiken, D., Fields, C., Kurihara, K., and Agarwal, A., 1990. Directory-Based Cache Coherence in Large-Scale Multiprocessors. Computer, 23(6):49-58, June.

Chamberlain, R. D. and Franklin, M. A., 1988. Discrete-Event Simulation on Hypercube Architectures. In Proceedings of the International Conference on Computer-Aided Design (ICCAD), pages 272-275, IEEE Computer Society, November.

Chandra, S. and Patel, J. H., 1988. Test Generation in a Parallel Processing Environment. In Proceedings of the International Conference on Computer Design (ICCD), IEEE Computer Society, October.

Chandy, K. M. and Misra, J., 1981. Asynchronous Distributed Simulation via a Sequence of Parallel Computations. Communications of the ACM, 24(4):198-206, April.

Chang, M. and Hajj, I. N., 1988. iPRIDE: A Parallel Integrated Circuit Simulator Using Direct Method. In Proceedings of the International Conference on Computer-Aided Design (ICCAD), pages 304-307, IEEE Computer Society, November.

Chawla, B. R., Gummel, H. K., and Kozak, P., 1975. MOTIS - An MOS Timing Simulator. IEEE Transactions on Circuits and Systems, CAS-22(12):901-910, December.

Chiang, K-W., Nahar, S., and Lo, C-Y., 1989. Time Efficient VLSI Artwork Analysis Algorithms in GOALIE2. IEEE Transactions on Computer-Aided Design, 8(6):640-648, June.

Chlamtac, I. and Franta, W. R., 1990. Rationale, Directions, and Issues Surrounding High Speed Networks. Proceedings of the IEEE, 78(1):94-120, January.

Christy, P., 1990. Software To Support Massively Parallel Computing on the MasPar MP-1. In Proceedings of the COMPCON Spring Conference, pages 29-33, IEEE Computer Society, February.

Chua, L. O. and Lin, P., 1975. Computer-aided Analysis of Electronic Circuits. Prentice-Hall, Englewood Cliffs, NJ, USA.

Chung, M. J. and Chung, Y., 1989. Data Parallel Simulation using Time-Warp on the Connection Machine. In Proceedings of the 26th Design Automation Conference (DAC), pages 98-102, ACM/IEEE, June.

Cohn, R., Gross, T., Lam, M., and Tseng, P., 1989. Architecture and Compiler Tradeoffs for a Long Instruction Word Microprocessor. In Proceedings of the 3rd International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-III), pages 2-14, ACM, Computer Architecture News 17(2), April.

Cox, P., Burch, R., and Epler, B., 1986. Circuit Partitioning for Parallel Processing. In Proceedings of the International Conference on Computer-Aided Design (ICCAD), pages 186-189, IEEE Computer Society, November.

Crowther, W., Goodhue, J., Starr, E., Thomas, R., Milliken, W., and Blackadar, T., 1985. Performance Measurements on a 128-node Butterfly Parallel Processor. In Proceedings of the International Conference on Parallel Processing (ICPP), pages 531-540, IEEE Computer Society, August.

Dally, W. J., 1987. A VLSI Architecture for Concurrent Data Structures. Kluwer Academic Publishers.

Darema, F. and Pfister, G. F., 1987. Multipurpose Parallelism for VLSI CAD on the RP3. IEEE Design & Test of Computers, 4(5):19-27, October.

De Man, H., Arnout, G., and Reynaert, P., 1981. Mixed-mode Circuit Simulation Techniques and Their Implementation in DIANA. In Antognetti, P., Pederson, D. O., and De Man, H., editors, Computer Design Aids for VLSI Circuits, pages 113-174, Sijthoff & Noordhoff, Alphen aan den Rijn, The Netherlands.

Denning, P. J., 1970. Virtual Memory. ACM Computing Surveys, 2(3):153-189, September.

Dennis, J. B., 1979. The Varieties of Data Flow Computers. In Proceedings of the 1st International Conference on Distributed Computing Systems, pages 430-439, IEEE Computer Society, October.

Deutsch, J. T. and Newton, A. R., 1984. A Multiprocessor Implementation of Relaxation-Based Electrical Circuit Simulation. In Proceedings of the 21st Design Automation Conference (DAC), pages 350-357, ACM/IEEE, June.

Deutsch, J. T., Lovett, T. D., and Squires, M. L., 1986. Parallel Computing for VLSI Circuit Simulation. VLSI Systems Design, :46-52, July.

Dewey Jr., C. F., 1983. Multiprocessor system shifts data acquisition into overdrive. Industrial Research & Development, May.

Dibble, P. C. and Scott, M. L., 1989. Beyond Striping: The Bridge Multiprocessor File System. Computer Architecture News, 17(5):32-39, September.

Dibble, P. C., Scott, M. L., and Ellis, C. S., 1988. Bridge: A High-Performance File System for Parallel Processors. In Proceedings of the 8th International Conference on Distributed Computing Systems (ICDCS), pages 154-161, IEEE Computer Society, June.

Digital, 1976. General Device Interface, DR11-C. In PDP-11 Peripherals Handbook, pages 4-217 to 4-226, Digital Equipment Corporation, Maynard, Massachusetts, USA.

Dinning, A., 1989. A Survey of Synchronization Methods for Parallel Computers. Computer, 22(7):66-77, July.

Ditzel, D. R. and Berenbaum, A., 1987. Experience with CAD Tools for a 32-Bit VLSI Microprocessor. In Fichtner, W. and Morf, M., editors, VLSI CAD Tools and Applications, chapter 11, Kluwer, Norwell, Mass., USA.

Dongarra, J. J., editor. Experimental Parallel Computing Architectures. Volume 1 of Special Topics in Supercomputing, Elsevier Science, Amsterdam, The Netherlands.

D&T, 1989. A D&T Roundtable: Expert Systems in CAD. IEEE Design and Test of Computers, 6(4):61-68, August.

D&T, 1989. A D&T Roundtable: Mixed-Mode Simulation. IEEE Design and Test of Computers, 6(1):67-75, February.

Dumlugöl, D., Odent, P., Cockx, J. P., and De Man, H. J., 1987. Switch-Electrical Segmented Waveform Relaxation for Digital MOS VLSI and Its Acceleration on Parallel Computers. IEEE Transactions on Computer-Aided Design, CAD-6(6):992-1005, November.

Duncan, R., 1990. A Survey of Parallel Computer Architectures. Computer, 23(2):5-16, February.

Durand, M. D., 1989. Parallel Simulated Annealing: Accuracy vs. Speed in Placement. IEEE Design & Test of Computers, 6(3):8-34, June.

Eager, D. L., Zahorjan, J., and Lazowska, E. D., 1989. Speedup Versus Efficiency in Parallel Systems. IEEE Transactions on Computers, 38(3):408-423, March.

Feng, T., 1981. A Survey of Interconnection Networks. Computer, :12-27, December.

Fleisch, B. D. and Popek, G. J., 1989. Mirage: A Coherent Distributed Shared Memory Design. In Proceedings of the 12th Symposium on Operating Systems Principles, pages 217-223, ACM SIGOPS, Operating Systems Review 23(5), December.

Flynn, M. J., 1966. Very High-Speed Computing Systems. Proceedings of the IEEE, 54(12):1901-1909, December.

Flynn, M. J., 1972. Some Computer Organizations and Their Effectiveness. IEEE Transactions on Computers, C-21(9):948-960, September.

Fortes, J. A. B. and Wah, B. W., 1987. Systolic Arrays - From Concept to Implementation. Computer, 20(7):12-17, July.

Foulser, D. E. and Schreiber, R., 1987. The Saxpy Matrix-1: A General-Purpose Systolic Computer. Computer, 20(7):35-43, July.

Frank, E. H., 1986. Exploiting Parallelism in a Switch-Level Simulation Machine. In Proceedings of the 23rd Design Automation Conference (DAC), pages 20-26, June.

Franklin, M. A., Wann, D. F., and Wong, K. F., 1984. Parallel Machines and Algorithms for Discrete-Event Simulation. In Proceedings of the International Conference on Parallel Processing (ICPP), pages 449-458, IEEE Computer Society, August.

Fujiwara, H. and Inoue, T., 1989. Optimal Granularity of Test Generation in a Distributed System. In Proceedings of the IEEE International Conference on Computer-Aided Design (ICCAD), pages 158-161, IEEE Computer Society, November.

242 Fujiwara, H. and Motohara, 4., 19S8. Fast Test Pattern Generation Using a Multiprocessor System. Transaction-q of the Institute of Electronics, In- formatio n, and Communication En gineers ( I EICE) ( J apan)., E-71 (a) :aai- 447, April.

Fujiwara. H. and Shimono, T., 1983. On the Acceleration of Test Genera- tion Algorithms. IEEE Transactions on computer-s, c-32(I2):i137-1144, December.

Gabber, E., 1990. Vl\fMP: A Practical Tool for the Development of Portable and Efficient Programs for Multiprocessors. IEEE Transactions on Parallel and Distributed System-*, 1(3):30a-317, July.

Gaglianello, R. D. and liatseff, H. P., 1985' Meglos: An Operating System for a Multiprocessor Environment. In Proceedings of the íth International Conference on Distributed Computing Systerns, pages 35-42',IEEE Com- puter Society, May.

Gaglianello, R. D. and I{atseff. H. P., 1986. Communications in Meglos. Software - Practice and Erperience- 16(10):9a5-963, October.

Gaglianello, R. D., Robinson,8.S., Lindstrom, T.L., and Sampieri, E' E', 1989. HPC/VORX: A Local Area \4ulticomputer system. In Proceedings of the 9th International Conference on Distributed Cornputing Systerns, IEEE Computer Society, June.

Gai, S., Montessoro, P. L., and Somenzi,F.,1988. MOZART: Ä Concur- rent Multilevel Simulator. IEEE Transactions on Computer-Aided Design, 7(9) :1 00õ-1 01 6, September.

Gajski, D. D. and Lin, Y. s., 1988. Module Generation and silicon compi- lation. In Preas, B. T. and Lorenzerti, \I. J.' editors, Physical Design Au- tornation of VLil Sgsterns, chapter 7, Benjamin/Cummings, I\'Ienlo Park, CA, USA.

Garey, I,L R. and Johnson, D. s., 1979. computers and Intractability. w. H. Freeman, San Francisco.

Gehringer, E. F., Abullarade, J., and Gulyn, M. H., 1988' A Survey of Commercial Parallel Processors . Computer Architecture New-q, 16(4):75- 107, September.

Gerner, ÙI. and Johansson, M.. 19S7. \'LSI Testing: DFT Strategies and CAD Tools. In Fichtner, W. and NIorf. \4., editors. VLil CAD Tools and Applications, chapter 19, I{luu'eL, Nots'ell, Mass', LISA'

Pfister, G. F., Brantley, W. C., George, D. A., Harvey, S. L., Kleinfelder, W. J., McAuliffe, K. P., Melton, E. A., Norton, V. A., and Weiss, J., 1987. An Introduction to the IBM Research Parallel Processor Prototype (RP3). In Dongarra, J. J., editor, Experimental Parallel Computing Architectures, pages 123-140, Elsevier Science, Amsterdam, The Netherlands.

Gibson, G. A., Hellerstein, L., Karp, R. M., Katz, R. H., and Patterson, D. A., 1989. Failure Correction Techniques for Large Disk Arrays. In Proceedings of the 3rd International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-III), pages 123-132, ACM, Computer Architecture News 17(2), April.

Giloi, W. K., 1988. SUPRENUM: A trendsetter in modern supercomputer development. Parallel Computing, 7:283-296.

Gimarc, C. E. and Milutinovic, V. M., 1987. A Survey of RISC Processors and Computers of the Mid-1980's. Computer, 20(9):59-69, September.

Gingell, R. A., Moran, J. P., and Shannon, W. A., 1987. Virtual Memory Architecture in SunOS. In Summer Conference Proceedings, Phoenix, pages 81-94, USENIX Association.

Gregoretti, F. and Segall, Z., 1984. Analysis and Evaluation of VLSI Design Rule Checking Implementation in a Multiprocessor. In Proceedings of the International Conference on Parallel Processing, pages 7-14, IEEE Computer Society, August.

Gregoretti, F. and Segall, Z., 1987. Analysis and Evaluation of Parallel Rectangle Intersection for VLSI Design Rule Checking. Microprocessing and Microprogramming, 19(2):85-100, February.

Harrison, D. S., Newton, A. R., Spickelmier, R. L., and Barnes, T. J., 1990. Electronic CAD Frameworks. Proceedings of the IEEE, 78(2):393-417, February.

Hawley, D., 1989. Superfast Bus Supports Sophisticated Transactions. High Performance Systems, :90-94, September.

Hayes, J. P. and Mudge, T., 1989. Hypercube Supercomputers. Proceedings of the IEEE, 77(12):1829-1841, December.

Hayes, J. P., Mudge, T. N., Stout, Q. F., Colley, S., and Palmer, J., 1986. Architecture of a Hypercube Supercomputer. In Proceedings of the International Conference on Parallel Processing (ICPP), pages 653-660, IEEE Computer Society.

Haynes, L. S., Lau, R. L., Siewiorek, D. P., and Mizell, D. W., 1982. A Survey of Highly Parallel Computing. Computer, :9-24, January.

Helin, J. and Kaski, K., 1989. Performance Analysis of High-Speed Computers II. Technical Report, Electronics Laboratory Report 3-89, Tampere University of Technology, Finland, June.

Hennessy, J. L. and Patterson, D. A., 1990. Computer Architecture: A Quantitative Approach. Morgan Kaufmann, San Mateo, CA, USA.

Hill, M., et al., 1986. Design Decisions in SPUR. Computer, 19(11):8-22, November.

Hillis, W. D. and Steele, Jr., G. L., 1986. Data Parallel Algorithms. Communications of the ACM, 29(12):1170-1183, December.

Hillis, W. D., 1982. New Computer Architectures and Their Relationship to Physics or Why Computer Science Is No Good. International Journal of Theoretical Physics, 21(3/4):255-262.

Holden, T., 1987. Knowledge Based CAD and Micro-electronics. Elsevier Science, Amsterdam, The Netherlands.

Hon, R. W., 1987. Dynamic Analysis Tools. In Rubin, S. M., editor, Computer Aids for VLSI Design, chapter 6, Addison-Wesley, Reading, Mass., USA.

Howe, C. D. and Moxon, B., 1987. How to program parallel processors. IEEE Spectrum, 24(9):36-41, September.

Hung, G. G., Wen, Y. C., Gallivan, K., and Saleh, R., 1990. Parallel Circuit Simulation Using Hierarchical Relaxation. In Proceedings of the 27th Design Automation Conference (DAC), pages 394-399, ACM/IEEE, June.

Hwang, K., 1987. Advanced Parallel Processing with Supercomputer Architectures. Proceedings of the IEEE, 75(10):1348-1379, October.

IEEE P1596., 1990. Part 1: An Overview of the Scalable Coherent Interface. IEEE Computer Society, draft 0.59 edition, February.

IEEE P896.1., 1990. Futurebus+. IEEE Computer Society, Washington, DC, draft 8.2 edition, February.

Itai, A. and Raz, Y., 1988. The Number of Buffers Required for Sequential Processing of a Disk File. Communications of the ACM, 31(11):1338-1342, November.

Jacob, G. K., Newton, A. R., and Pederson, D. O., 1986. An Empirical Analysis of the Performance of a Multiprocessor-Based Circuit Simulator. In Proceedings of the 23rd Design Automation Conference (DAC), pages 588-593, ACM/IEEE, June.

Jain, S. K. and Agrawal, V. D., 1985. Modeling and Test Generation Algorithms for MOS Circuits. IEEE Transactions on Computers, C-34(5):426-433, May.

James, D. V., Laundrie, A. T., Gjessing, S., and Sohi, G. S., 1990. Distributed-Directory Scheme: Scalable Coherent Interface. Computer, 23(6):74-77, June.

Jayaraman, R. and Rutenbar, R., 1987. Floorplanning by Annealing on a Hypercube Multiprocessor. In Proceedings of the International Conference on Computer-Aided Design (ICCAD), pages 346-349, IEEE Computer Society, November.

Jefferson, D. R., 1985. Virtual Time. ACM Transactions on Programming Languages and Systems, 7(3):404-425, July.

Jesshope, C. R., Miller, P. R., and Yantchev, J. T., 1989. High Performance Communications in Processor Networks. In Proceedings of the 16th Annual International Symposium on Computer Architecture (ISCA), pages 150-157, ACM SIGARCH, Computer Architecture News 17(3), June.

Johnson, E. E., 1988. Completing an MIMD Multiprocessor Taxonomy. Computer Architecture News, 16:44-47, July.

Johnson, D. S., Aragon, C. R., McGeoch, L. A., and Schevon, C., 1987. Optimization by Simulated Annealing: An Experimental Evaluation (Part 1). Technical Report, AT&T Bell Laboratories, Murray Hill, NJ, USA.

Jones, M. and Banerjee, P., 1987. Performance of a Parallel Algorithm for Standard Cell Placement on the Intel Hypercube. In Proceedings of the 24th Design Automation Conference (DAC), pages 807-813, ACM/IEEE, June.

Jones, T., 1989. Engineering Design of the Convex C2. Computer, 22(1):36-44, January.

Jouppi, N. P. and Wall, D. W., 1989. Available Instruction-Level Parallelism for Superscalar and Superpipelined Machines. In Proceedings of the 3rd International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-III), pages 272-282, ACM, Computer Architecture News 17(2), April.

Jouppi, N. P., 1987. Timing Analysis and Performance Improvement of MOS VLSI Designs. IEEE Transactions on Computer-Aided Design, CAD-6(4):650-665, July.

Jouppi, N. P., Bertoni, J., and Wall, D. W., 1989. A Unified Vector/Scalar Floating-Point Architecture. In Proceedings of the 3rd International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-III), pages 134-143, ACM, Computer Architecture News 17(2), April.

Kane, R. and Sahni, S., 1987. A Systolic Design-Rule Checker. IEEE Transactions on Computer-Aided Design, CAD-6(1):22-32, January.

Karp, A. H., 1987. Programming for Parallelism. IEEE Computer, 20(5):43-57, May.

Katseff, H. P., 1987. Flow-Controlled Multicast in Multiprocessor Systems. In Proceedings of the 6th Phoenix Conference on Computers and Communication, pages 8-13, IEEE Computer Society, February.

Katz, R. H., Gibson, G. A., and Patterson, D. A., 1989. Disk System Architectures for High Performance Computing. Proceedings of the IEEE, 77(12):1842-1858, December.

Katz, R. H., Ousterhout, J. K., Patterson, D. A., Chen, P., Chervenak, A., Drewes, R., Gibson, G., Lee, E., Lutz, K., Miller, B., and Rosenblum, M., 1989. A Project on High Performance I/O Subsystems. Computer Architecture News, 17(5):24-31, September.

Kim, M. Y., 1985. Parallel Operation of Magnetic Disk Storage Devices: Synchronized Disk Interleaving. In Proceedings of the 4th International Workshop on Database Machines, pages 300-329, March.

Kim, M. Y., 1986. Synchronized Disk Interleaving. IEEE Transactions on Computers, C-35(11):978-988, November.

Kirkland, T. and Mercer, M. R., 1988. Algorithms for Automatic Test Pattern Generation. IEEE Design and Test of Computers, 5(3):43-55, June.

Kirkpatrick, S., Gelatt, Jr., C. D., and Vecchi, M. P., 1983. Optimization by Simulated Annealing. Science, 220(4598):671-680, 13 May.

Kling, R. M. and Banerjee, P., 1989. ESP: Placement by Simulated Evolution. IEEE Transactions on Computer-Aided Design, 8(3):245-256, March.

Kling, R. and Banerjee, P., 1990. Optimization by Simulated Evolution with Applications to Standard Cell Placement. In Proceedings of the 27th Design Automation Conference (DAC), pages 20-25, ACM/IEEE, June.

Kohn, L. and Margulis, N., 1989. Introducing the Intel i860 64-Bit Microprocessor. IEEE Micro, 9(4):15-30, August.

Kramer, G. A., 1983. Employing Massive Parallelism in Digital ATPG Algorithms. In Proceedings of the International Test Conference, pages 108-111, IEEE Computer Society, October.

Kravitz, S. A. and Rutenbar, R. A., 1987. Placement by Simulated Annealing on a Multiprocessor. IEEE Transactions on Computer-Aided Design, CAD-6(4):534-549, July.

Kravitz, S. A., Bryant, R. E., and Rutenbar, R. A., 1989. Logic Simulation on Massively Parallel Architectures. In Proceedings of the 16th Annual International Symposium on Computer Architecture (ISCA), pages 336-343, ACM SIGARCH, Computer Architecture News 17(3), June.

Kravitz, S. A., Bryant, R. E., and Rutenbar, R. A., 1989. Massively Parallel Switch-Level Simulation: A Feasibility Study. In Proceedings of the 26th Design Automation Conference (DAC), pages 91-97, ACM/IEEE, June.

Krishnaswamy, V., Ahuja, S., Carriero, N., and Gelernter, D., 1988. The Architecture of a Linda Coprocessor. In Proceedings of the 15th Annual International Symposium on Computer Architecture (ISCA), pages 240-249, ACM SIGARCH, Computer Architecture News 16(2), May.

Kronenberg, N. P., Levy, H. M., and Strecker, W. D., 1986. VAXclusters: A Closely-Coupled Distributed System. ACM Transactions on Computer Systems, 4(2):130-146, May.

Kuck, D. J., 1982. High-speed machines and their compilers. In Evans, D. J., editor, Parallel Processing Systems, chapter 11, Cambridge University Press, Cambridge, UK.

Kung, H. T., 1982. Why Systolic Architectures? Computer, 15(1):37-46, January.

Kung, H. T., 1986. Memory Requirements for Balanced Computer Architectures. In Proceedings of the 13th Annual International Symposium on Computer Architecture (ISCA), pages 49-54, ACM SIGARCH, June.

Kung, S., Lo, S., Jean, S., and Hwang, J., 1987. Wavefront Array Processors - Concept to Implementation. Computer, 20(7):18-33, July.

Lauther, U., 1981. An O(N log N) Algorithm for Boolean Mask Operations. In Proceedings of the 18th Design Automation Conference, pages 555-562, ACM/IEEE, June.

Lelarasmee, E., Ruehli, A., and Sangiovanni-Vincentelli, A. L., 1982. The waveform relaxation method for the time-domain analysis of large scale integrated circuits. IEEE Transactions on Computer-Aided Design of ICAS, CAD-1(3):131-145, August.

Leler, W., 1990. Linda Meets Unix. Computer, 23(2):43-54, February.

Lengauer, T., 1988. The Combinatorial Complexity of Layout Problems. In Preas, B. T. and Lorenzetti, M. J., editors, Physical Design Automation of VLSI Systems, chapter 10, Benjamin/Cummings, Menlo Park, CA, USA.

Levendel, Y. H., Menon, P. R., and Patel, S. H., 1983. Parallel Fault Simulation Using Distributed Processing. The Bell System Technical Journal, 62(10):3107-3137, December.

Levitin, S. M., 1986. MACE: A Multiprocessor Approach to Circuit Extraction. Master's thesis, Massachusetts Institute of Technology (MIT), Cambridge, Mass., USA, June.

Levreault Jr., J. E., 1983. High-speed data acquisition design system. Electronic Products Magazine, :89-92, 4 March.

Lewis, D. M., 1988. Hardware Accelerators for Timing Simulation of VLSI Digital Circuits. IEEE Transactions on Computer-Aided Design, 7(11):1134-1149, November.

Li, K. and Hudak, P., 1989. Memory Coherence in Shared Virtual Memory Systems. ACM Transactions on Computer Systems, 7(4):321-359, November.

Li, K., 1986. Shared Virtual Memory on Loosely Coupled Multiprocessors. PhD thesis, Yale University, New Haven, CT, USA.

Lorenzetti, M. J. and Baeder, D. S., 1988. Routing. In Preas, B. T. and Lorenzetti, M. J., editors, Physical Design Automation of VLSI Systems, chapter 5, Benjamin/Cummings, Menlo Park, CA, USA.

Macomber, S. and Mott, G., 1985. Hardware Acceleration for Layout Verification. VLSI Design, :18-27, July.

Marantz, J. D., 1986. Exploiting Parallelism in VLSI CAD. In Proceedings of the International Conference on Computer Design (ICCD), pages 442-445, IEEE Computer Society, October.

Margulis, N., 1989. The Intel 80860. Byte, 14(13):333-340, December.

Margulis, N., 1990. i860 Microprocessor Architecture. Osborne McGraw-Hill, Berkeley, Cal., USA.

Markas, T., Royals, M., and Kanopoulos, N., 1990. On Distributed Fault Simulation. Computer, 23(1):40-52, January.

Masscomp., Performance Architecture Hardware. Massachusetts Computer Corporation, Westford, MA, USA.

McDowell, C. E. and Helmbold, D. P., 1989. Debugging Concurrent Programs. ACM Computing Surveys, 21(4):593-622, December.

Merrow, T. and Henson, K., 1989. System Design for Parallel Computing. High Performance Systems, :36-44, January.

Metcalfe, R. M. and Boggs, D. R., 1976. Ethernet: Distributed Packet Switching for Local Computer Networks. Communications of the ACM, 19(7):395-404, July.

Micrology., 1985. VMEbus Specification Manual. Micrology pbt, Inc., Tempe, Arizona, USA, revision C.1 edition, October.

Milenkovic, M., 1990. Microprocessor Memory Management Units. IEEE Micro, 10(2):70-85, April.

Motorola., 1984. MC68020 32-Bit Microprocessor User's Manual. Prentice-Hall.

Motorola., 1985. MC68881 Floating-Point Coprocessor User's Manual. Motorola.

Motorola., 1986. MVME133 VMEmodule 32-Bit Monoboard Microcomputer User's Manual. Motorola Semiconductor Products Inc., Phoenix, Arizona, USA, first edition, September.

Mueller-Thuns, R. B., Saab, D. G., Damiano, R. F., and Abraham, J. A., 1989. Portable Parallel Logic and Fault Simulation. In Proceedings of the IEEE International Conference on Computer-Aided Design (ICCAD), pages 506-509, IEEE Computer Society, November.

Mullender, S. J., van Rossum, G., Tanenbaum, A. S., van Renesse, R., and van Staveren, H., 1990. Amoeba: A Distributed Operating System for the 1990s. Computer, 23(5):44-53, May.

Murai, S., Matsushita, H., and Enomoto, K., 1986. Logic Simulation Programs. In Hörbst, E., editor, Logic Design and Simulation, chapter 5, Elsevier Science, Amsterdam, The Netherlands.

Murakami, K., Irie, N., Kuga, M., and Tomita, S., 1989. SIMP (Single Instruction Stream/Multiple Instruction Pipelining): A Novel High-Speed Single-Processor Architecture. In Proceedings of the 16th Annual International Symposium on Computer Architecture (ISCA), pages 78-85, ACM SIGARCH, Computer Architecture News 17(3), June.

Murphy, E. E. and Voelcker, J., 1990. Systems Software. IEEE Spectrum, 27(1):38-40, January.

Newell, M. E. and Fitzpatrick, D. T., 1982. Exploitation of Hierarchy in Analyses of Integrated Circuit Artwork. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, CAD-1(4):192-200, October.

Newton, A. R. and Sangiovanni-Vincentelli, A. L., 1984. Relaxation-Based Electrical Simulation. IEEE Transactions on Computer-Aided Design, CAD-3(4):308-331, October.

Newton, A. R., 1979. Techniques for the Simulation of Large-Scale Integrated Circuits. IEEE Transactions on Circuits and Systems, CAS-26(9):741-749, September.

Newton, A. R., 1981. Timing, Logic and Mixed-Mode Simulation for Large MOS Integrated Circuits. In Antognetti, P., Pederson, D. O., and De Man, H., editors, Computer Design Aids for VLSI Circuits, pages 19-112, Sijthoff & Noordhoff, Alphen aan den Rijn, The Netherlands.

Nickolls, J. R., 1990. The Design of the MasPar MP-1: A Cost Effective Massively Parallel Computer. In Proceedings of the COMPCON Spring Conference, pages 25-28, IEEE Computer Society, February.

Odent, P., Claesen, L., and De Man, H., 1989. Feedback loops and large subcircuits in the multiprocessor implementation of a relaxation based circuit simulator. In Proceedings of the 26th Design Automation Conference (DAC), pages 25-30, ACM/IEEE, June.

Ohtsuki, T., editor. Layout Design and Verification. Volume 4 of Advances in CAD for VLSI, Elsevier Science, Amsterdam, The Netherlands.

Olson, T. M., 1989. Disk Array Performance in a Random IO Environment. Computer Architecture News, 17(5):71-77, September.

Overhauser, D., Hajj, I., and Hsu, Y., 1989. Automatic Mixed-Mode Timing Simulation. In Proceedings of the International Conference on Computer-Aided Design (ICCAD), pages 84-87, IEEE Computer Society, November.

Patil, S. and Banerjee, P., 1990. A Parallel Branch and Bound Algorithm for Test Generation. IEEE Transactions on Computer-Aided Design, 9(3):313-322, March.

Patterson, D. A., Gibson, G., and Katz, R. H., 1988. A Case for Redundant Arrays of Inexpensive Disks (RAID). In Proceedings of the International Conference on Management of Data, pages 109-116, ACM SIGMOD, June.

Piepho, R. S. and Wu, W. S., 1989. A Comparison of RISC Architectures. IEEE Micro, 9(4):51-62, August.

Preas, B. T. and Karger, P. G., 1988. Placement, Assignment and Floorplanning. In Preas, B. T. and Lorenzetti, M. J., editors, Physical Design Automation of VLSI Systems, chapter 4, Benjamin/Cummings, Menlo Park, CA, USA.

Preas, B. T. and Lorenzetti, M. J., editors. Physical Design Automation of VLSI Systems. Benjamin/Cummings, Menlo Park, CA, USA.

Quarterman, J. S., Silberschatz, A., and Peterson, J. L., 1985. 4.2BSD and 4.3BSD as Examples of the UNIX System. ACM Computing Surveys, 17(4):379-418, December.

Ramachandran, U., Solomon, M., and Vernon, M. K., 1990. Hardware Support for Interprocess Communication. IEEE Transactions on Parallel and Distributed Systems, 1(3):318-329, July.

Ramamoorthy, C. V. and Li, H. F., 1977. Pipeline Architecture. ACM Computing Surveys, 9(1):61-101, March.

Rammig, F. J., 1986. Mixed Level Modelling and Simulation of VLSI Systems. In Hörbst, E., editor, Logic Design and Simulation, chapter 4, Elsevier Science, Amsterdam, The Netherlands.

Ranka, S., Won, Y., and Sahni, S., 1988. Programming a Hypercube Multicomputer. IEEE Software, 5(5):69-77, September.

Rashid, R. F. and Robertson, G. G., 1981. Accent: A communication oriented network operating system kernel. In Proceedings of the 8th Symposium on Operating Systems Principles, pages 64-75, ACM SIGOPS, December.

Rashid, R., Tevanian, Jr., A., Young, M., Golub, D., Baron, R., Black, D., Bolosky, W. J., and Chew, J., 1988. Machine-Independent Virtual Memory Management for Paged Uniprocessor and Multiprocessor Architectures. IEEE Transactions on Computers, 37(8):896-907, August.

Rau, B. R., Yen, D. W. L., Yen, W., and Towle, R. A., 1989. The Cydra 5 Departmental Supercomputer. Computer, 22(1):12-35, January.

Reddy, A. L. N. and Banerjee, P., 1989. An Evaluation of Multiple-Disk I/O Systems. IEEE Transactions on Computers, 38(12):1680-1690, December.

Reddy, A. L. N. and Banerjee, P., 1990. Design, Analysis, and Simulation of I/O Architectures for Hypercube Multiprocessors. IEEE Transactions on Parallel and Distributed Systems, 1(2):140-151, April.

Reddy, A. L. N., Banerjee, P., and Abraham, S. G., 1988. I/O Embedding in Hypercubes. In Proceedings of the International Conference on Parallel Processing (ICPP), pages 331-338, IEEE Computer Society, August.

Ritchie, D. M. and Thompson, K., 1974. The UNIX Time-Sharing System. Communications of the ACM, 17(7):365-375, July.

Rose, J., 1988. LocusRoute: A Parallel Global Router for Standard Cells. In Proceedings of the 25th Design Automation Conference (DAC), pages 189-195, ACM/IEEE, June.

Rose, J. S., Snelgrove, W. M., and Vranesic, Z. G., 1988. Parallel Standard Cell Placement Algorithms with Quality Equivalent to Simulated Annealing. IEEE Transactions on Computer-Aided Design, 7(3):387-397, March.

Rosenstiel, W. and Bergsträsser, T., 1986. Artificial Intelligence for Logic Design. In Hörbst, E., editor, Logic Design and Simulation, chapter 8, Elsevier Science, Amsterdam, The Netherlands.

Ross, F. E., 1986. FDDI - a Tutorial. IEEE Communications, 24(5):10-17, May.

Roth, J. P., 1966. Diagnosis of Automata Failures: A Calculus and a Method. IBM Journal of Research and Development, 10(4):278-291, July.

Roth, J. P., 1980. Computer Logic, Testing and Verification. Computer Science Press, Potomac, Maryland, USA.

Rubin, S. M., 1987. Computer Aids for VLSI Design. Addison-Wesley, Reading, Mass., USA.

Russell, C. H. and Waterman, P. J., 1987. Variations on UNIX for Parallel-Processing Computers. Communications of the ACM, 30(12):1048-1055, December.

Russell, R. M., 1978. The CRAY-1 Computer System. Communications of the ACM, 21(1):63-72, January.

Rutenbar, R. A., 1989. Simulated Annealing Algorithms: An Overview. IEEE Circuits and Devices Magazine, :19-26, January.

Rutenbar, R. A., Mudge, T. N., and Atkins, D. E., 1984. A Class of Cellular Architectures to Support Physical Design Automation. IEEE Transactions on Computer-Aided Design, CAD-3(4):264-278, October.

Sadayappan, P. and Visvanathan, V., 1988. Circuit Simulation on Shared-Memory Multiprocessors. IEEE Transactions on Computers, 37(12):1634-1642, December.

Saleh, R. A., Gallivan, K. A., Chang, M., Hajj, I. N., Smart, D., and Trick, T. N., 1989. Parallel Circuit Simulation on Supercomputers. Proceedings of the IEEE, 77(12):1915-1931, December.

Salem, K. and Garcia-Molina, H., 1986. Disk Striping. In Proceedings of the International Conference on Data Engineering, pages 336-342, IEEE Computer Society.

Sangiovanni-Vincentelli, A. L., 1981. Circuit Simulation. In Antognetti, P., Pederson, D. O., and De Man, H., editors, Computer Design Aids for VLSI Circuits, pages 19-112, Sijthoff & Noordhoff, Alphen aan den Rijn, The Netherlands.

Sargent, J. S. and Banerjee, P., 1989. A Parallel Row-Based Algorithm for Standard Cell Placement with Integrated Error Control. In Proceedings of the 26th Design Automation Conference (DAC), pages 590-593, ACM/IEEE, June.

Scheffer, L. K. and Soetarman, R., 1985. Hierarchical Analysis of IC Artwork with User Defined Abstraction Rules. In Proceedings of the 22nd Design Automation Conference (DAC), pages 293-298, ACM/IEEE, June.

Scheifler, R. W. and Gettys, J., 1986. The X Window System. ACM Transactions on Graphics, 5(2):79-109, April.

Schindler, M., 1987. Demands on simulators escalate as circuit complexity explodes. Electronic Design, :11-32, October.

Sechen, C. and Sangiovanni-Vincentelli, A., 1985. The TimberWolf Placement and Routing Package. IEEE Journal of Solid State Circuits, SC-20(2):510-522, April.

Sechen, C., 1988. Chip-Planning, Placement, and Global Routing of Macro/Custom Cell Integrated Circuits Using Simulated Annealing. In Proceedings of the 25th Design Automation Conference, pages 73-80, ACM/IEEE, June.

Seiler, L., 1982. A Hardware Assisted Design Rule Check Architecture. In Proceedings of the 19th Design Automation Conference (DAC), pages 232-238, ACM/IEEE, June.

Shing, M. T. and Hu, T. C., 1986. Computational Complexity of Layout Problems. In Ohtsuki, T., editor, Layout Design and Verification, chapter 8, Elsevier Science, Amsterdam, The Netherlands.

Siegel, H. J., 1985. Interconnection Networks for Large-Scale Parallel Processing. Lexington Books, Lexington, Mass., USA.

SIGDA., 1988. SIGDA/DATC Roundtable: Point Accelerators vs. General-Purpose Machines for Electronic CAD. SIGDA Newsletter, 18(1):33-38, March.

Skillicorn, D. B., 1988. A Taxonomy for Computer Architectures. Computer, 21(11):46-57, November.

Smith, J. M. and Maguire, Jr., G. Q., 1989. Measured Response Times for Page-Sized Fetches on a Network. Computer Architecture News, 17(5):48-54, September.

Smith, A. J., 1981. Input/Output Optimization and Disk Architectures: A Survey. Performance and Evaluation, :104-117.

Smith, S. P., Underwood, B., and Newman, J., 1988. An Analysis of Parallel Logic Simulation on Several Architectures. In Proceedings of the International Conference on Parallel Processing, pages 65-68, IEEE Computer Society, August.

Smith, M. D., Johnson, M., and Horowitz, M. A., 1989. Limits on Multiple Instruction Issue. In Proceedings of the 3rd International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-III), pages 290-302, ACM, Computer Architecture News 17(2), April.

Smith, II, R. J., 1986. Fundamentals of Parallel Logic Simulation. In Proceedings of the 23rd Design Automation Conference (DAC), pages 2-12, ACM/IEEE, June.

Solchenbach, K. and Trottenberg, U., 1988. SUPRENUM: System essentials and grid applications. Parallel Computing, 7:265-281.

Soulé, L. and Blank, T., 1988. Parallel Logic Simulation on General Purpose Machines. In Proceedings of the 25th Design Automation Conference (DAC), pages 166-171, ACM/IEEE, June.

Soulé, L. and Gupta, A., 1989. Characterization of Parallelism and Deadlocks in Distributed Digital Logic Simulation. In Proceedings of the 26th Design Automation Conference (DAC), pages 81-86, ACM/IEEE, June.

Stallings, W., 1988. Reduced Instruction Set Computer Architecture. Proceedings of the IEEE, 76(1):38-55, January.

Stankovic, J. A., 1988. Misconceptions About Real-Time Computing: A Serious Problem for Next-Generation Systems. IEEE Computer, 21(10):10-19, October.

Stenström, P., 1988. Reducing Contention in Shared-Memory Multiprocessors. Computer, 21(11):26-37, November.

Stenström, P., 1990. A Survey of Cache Coherence Schemes for Multiprocessors. Computer, 23(6):12-24, June.

Stevens, S. N. and McCabe, S., 1985. An Integrated Approach for Hierarchical Verification of VLSI Mask Artwork. IEEE Journal of Solid State Circuits, SC-20(2):501-509, April.

Stone, H. S., 1987. High-Performance Computer Architecture. Addison-Wesley, Reading, Mass., USA.

Stumm, M. and Zhou, S., 1990. Algorithms Implementing Distributed Shared Memory. Computer, 23(5):54-64, May.

Subramanian, K. and Zargham, M. R., 1990. Distributed and Parallel Demand Driven Logic Simulation. In Proceedings of the 27th Design Automation Conference (DAC), pages 485-490, ACM/IEEE, June.

Sugla, B., Edmark, J., and Robinson, B. S., 1989. An Introduction to the CAPER Concurrent Application Programming Environment. In Proceedings of the International Conference on Parallel Processing (ICPP), IEEE Computer Society, August.

Swan, R. J., Fuller, S. H., and Siewiorek, D. P., 1977. Cm* - A modular, multi-microprocessor. In AFIPS Conference Proceedings, pages 637-644.

Szymanski, T. G. and Van Wyk, C. J., 1985. Goalie: A Space Efficient System for VLSI Artwork Analysis. IEEE Design & Test, 2(3):64-72, June.

Szymanski, T. G. and Van Wyk, C. J., 1988. Layout Analysis and Verification. In Preas, B. T. and Lorenzetti, M. J., editors, Physical Design Automation of VLSI Systems, chapter 8, Benjamin/Cummings, Menlo Park, CA, USA.

Takasaki, S., Sasaki, T., Nomizu, N., Koike, N., and Ohmori, K., 1987. Block-Level Hardware Logic Simulation Machine. IEEE Transactions on Computer-Aided Design, CAD-6(1):46-54, January.

Tam, M., Smith, J. M., and Farber, D. J., 1990. A Survey of Distributed Shared Memory Systems. Technical Report, Distributed Systems Laboratory, University of Pennsylvania, Philadelphia, PA, USA, March.

Tanenbaum, A. S. and van Renesse, R., 1985. Distributed Operating Systems. ACM Computing Surveys, 17(4):419-470, December.

Tanenbaum, A. S., van Staveren, H., Keizer, E. G., and Stevenson, J. W., 1983. A Practical Tool Kit for Making Portable Compilers. Communications of the ACM, 26(9):654-660, September.

Terman, C. J., 1987. Simulation Tools for VLSI. In Fichtner, W. and Morf, M., editors, VLSI CAD Tools and Applications, chapter 3, Kluwer, Norwell, Mass., USA.

Thacker, C. P., Stewart, L. C., and Satterthwaite, Jr., E. H., 1988. Firefly: A Multiprocessor Workstation. IEEE Transactions on Computers, 37(8):909-920, August.

Thadhani, A. J., 1981. Interactive user productivity. IBM Systems Journal, 20(4):407-423.

Thakkar, S., Gifford, P., and Fielland, G., 1988. The Balance Multiprocessor System. Computer, 21(2):57-69, February.

Tonkin, B. A., 1990. Circuit Extraction on a Message Based Multiprocessor. In Proceedings of the 27th Design Automation Conference, pages 260-265, ACM/IEEE, June.

Tredennick, N., 1987. Trends in Commercial VLSI Microprocessor Design. In Fichtner, W. and Morf, M., editors, VLSI CAD Tools and Applications, chapter 10, Kluwer, Norwell, Mass., USA.

Tucker, L. W. and Robertson, G. G., 1988. Architecture and Applications of the Connection Machine. Computer, 21(8):26-38, August.

van Brunt, N., 1983. The Zycad Logic Evaluator and its Application to Modern System Design. In Proceedings of the International Conference on Computer Design (ICCD), pages 232-233, IEEE Computer Society, October.

van Renesse, R., Tanenbaum, A. S., and Wilschut, A., 1989. The Design of a High-Performance File Server. In Proceedings of the 9th IEEE International Conference on Distributed Computing Systems, pages 22-27, IEEE Computer Society, June.

van Renesse, R., van Staveren, H., and Tanenbaum, A. S., 1989. Performance of the Amoeba Distributed Operating System. Software - Practice and Experience, 19:223-234, March.

Vaughan, F. A., Marlin, C. D., and Barter, C. J., 1988. A Distributed Operating System Kernel for a Closely-Coupled Multiprocessor. The Australian Computer Journal, 20(2):58-64, May.

Vitányi, P. M. B., 1988. Locality, Communication, and Interconnect Length in Multicomputers. SIAM Journal on Computing, 17(4):659-672, August.

Waltz, D. L., 1987. Applications of the Connection Machine. Computer, 20(1):85-97, January.

Warren, C., 1990. Micro Standards: The Scalable Coherent Interface. IEEE Micro, 10(3):80-82, June.

Wasserman, H. J., Simmons, M. L., and Lubeck, O. M., 1988. The performance of minisupercomputers: Alliant FX/8, Convex C-1, and SCS-40. Parallel Computing, 8:285-293.

Waterman, P. J., 1987. On Parallel Circuit Simulation. VLSI Systems Design, :56-61, July.

Webber, D. M. and Sangiovanni-Vincentelli, A., 1987. Circuit Simulation on the Connection Machine. In Proceedings of the 24th Design Automation Conference, pages 108-113, ACM/IEEE, June.

Weiss, S., 1989. Scalar Supercomputer Architecture. Proceedings of the IEEE, 77(12):1970-1982, December.

Widdowson, R. and Ferguson, K., 1988. Parallel Polygon Operations Using Loosely Coupled Workstations. In Proceedings of the International Conference on Computer-Aided Design (ICCAD), pages 276-279, IEEE Computer Society, November.

Wiley, P., 1987. A parallel architecture comes of age at last. IEEE Spectrum, :46-50, June.

Williams, T. W., 1981. Design for Testability. In Antognetti, P., Pederson, D. O., and De Man, H., editors, Computer Design Aids for VLSI Circuits, pages 359-416, Sijthoff & Noordhoff, Alphen aan den Rijn, The Netherlands.

Wolf, W. H. and Dunlop, A. E., 1988. Symbolic Layout and Compaction. In Preas, B. T. and Lorenzetti, M. J., editors, Physical Design Automation of VLSI Systems, chapter 6, Benjamin/Cummings, Menlo Park, CA, USA.

Wong, C. P. and Fiebrich, R. D., 1987. Simulated Annealing-Based Circuit Placement Algorithm on the Connection Machine System. In Proceedings of the International Conference on Computer Design (ICCD), pages 78-82, IEEE Computer Society, October.

Wu, K. and Fuchs, W. K., 1990. Recoverable Distributed Shared Virtual Memory. IEEE Transactions on Computers, 39(4):460-469, April.

Wu, M. and Gajski, D. D., 1989. Computer-Aided Programming for Message Passing Systems: Problems and a Solution. Proceedings of the IEEE, 77(12):1983-1991, December.

Wu, M. and Gajski, D. D., 1990. Hypertool: A Programming Aid for Message-Passing Systems. IEEE Transactions on Parallel and Distributed Systems, 1(3):330-343, July.

Yang, G., 1990. PARASPICE: A Parallel Circuit Simulator for Shared-Memory Multiprocessors. In Proceedings of the 27th Design Automation Conference (DAC), pages 400-405, ACM/IEEE, June.

Yeh, D. C. and Rao, V. B., 1988. Partitioning Issues in Circuit Simulation on Multiprocessors. In Proceedings of the International Conference on Computer-Aided Design (ICCAD), pages 300-303, IEEE Computer Society, November.

Young, M., Tevanian, A., Rashid, R., Golub, D., Eppinger, J., Chew, J., Bolosky, W., Black, D., and Baron, R., 1987. The Duality of Memory and Communication in the Implementation of a Multiprocessor Operating System. In Proceedings of the 11th Symposium on Operating Systems Principles, pages 63-76, ACM SIGOPS, Operating Systems Review, November.

Zargham, M. R., 1988. Parallel Channel Routing. In Proceedings of the 25th Design Automation Conference (DAC), pages 128-133, ACM/IEEE, June.
