Fast Integrated Tools for Circuit Design with FPGAs z Stephan W. Gehring y , Stefan H.-M. Ludwig

Institute for Computer Systems Swiss Federal Institute of Technology (ETH) Zurich,¨ Switzerland

[email protected], [email protected] t Abstrac desiring complete control over the circuit implementation and of- fers a highly interactive and fast design cycle. It consists of an To implement high-density and high-speed FPGA circuits, design- architecture-independent framework (front-end) which is comple- ers need tight control over the circuit implementation process. How- mented by architecture-dependent back-ends. This enhances tool ever, current design tools are unsuited for this purpose as they lack maintenance as the framework is reused for each back-end. A com- fast turnaround times, interactiveness, and integration. We present plete back-end including layout synthesis has been developed for a system for the XC6200 FPGA, which addresses these is- the Xilinx XC6200 architecture [23]. A further back-end for the sues. It consists of a suite of tightly integrated tools for the XC6200 AT6000 architecture [3] exists which comprises a layout ed- architecture centered around an architecture-independent tool frame- itor for manual circuit implementation and a bitstream generator. work. The system lets the designer easily intervene at various In the following, we first describe how the tight integration stages of the design process and features design cycle times (from of our tools is achieved. Then we explain how the designer can an HDL specification to a complete layout) in the order of seconds. influence the implementation process and discuss the techniques used to achieve the necessary speedy design cycle. We support our

claims by measurements and compare our system with the vendor-

Intro duction 1 supplied development system for the XC6200. A more detailed Fully automatic circuit synthesis from an HDL description is a presentation of the described system can be found in [8] and [14].

difficult and computationally intensive task, especially for Field-

To ol Integrat ion Programmable Gate Arrays (FPGAs). Ideally, circuits are mapped, 2 placed and routed without human intervention. However, to im- plement high-density or high-speed circuits with FPGAs, today’s One of the major goals of this work was to tightly integrate the designers are faced with the need to manually intervene in the de- various circuit design tools into a single development environment. sign process and reiterate the implementation cycle until the circuit Traditionally, the tools used in the circuit design process are very implementation meets the requirements [4, 20]. Unfortunately, de- loosely coupled and are often developed by a variety of independent signers cannot expect much support from current design tools as the companies. For instance, the tools used to capture designs may latter do not support fast iterative design cycles and offer only lim- come from a CAD tool developer, while the layout synthesis and ited interactivity. This effectively hinders the designer from con- bitstream generation tools are provided by the chip manufacturer. trolling the implementation process. Typically, tool communication is achieved by exchanging disk files Developers of circuit design tools face yet a different problem: containing the design data in standard formats. the still growing number of new FPGA architectures pressures them We feel that this loose coupling bears several disadvantages. to create new tool suites at a rapid pace. Since the tools are typi- First, it makes the seamless integration of the tools a difficult, if not cally tailored to a specific architecture, they are rewritten almost impossible, task. This, however, is a prerequisite to quickly switch completely for every new architecture. Over time this results in a back and forth between design tools during the design process. Sec- huge collection of difficult to maintain tools. ond, file-based inter-tool communication necessitates conversions The system we present addresses these problems. It tightly in- between internal and external formats on both ends, which often tegrates the circuit design process in a single environment, from results in design information loss and makes data exchange a slow the initial circuit specification in an HDL down to bitstream gener- and vulnerable process. Third, decoupled tool development does ation. At all stages of the process, the user can exert control over not take into account that a significant part of a tool suite is inde- the circuit implementation and thus efficiently guide the tools to- pendent of any specific architecture and could thus profitably be wards the desired solution. The system targets the experienced user reused by several tools. Taking advantage of this reduces the size and memory requirements of tools and eases system maintenance. y Interval Research Corporation, Palo Alto, California

z We have addressed these problems by capturing the common DEC Systems Research Center, Palo Alto, California traits of design tools in an application framework for circuit design tools [9, 8]. To create tools for a specific FPGA architecture, this architecture-independent framework is extended with architecture- specific components. The foundation for the tight integration of the framework and its extensions lies in the use of a single data structure to represent circuits throughout the system [9, 8]. That is, all tools, whether architecture-independent or -dependent, use a universal circuit rep- ends. The use of a single circuit representation resulted in the de- resentation defined in the core of the framework (Figure 1). The sired tightly integrated system with short response times and high design tools operate in a shared memory environment and modify interactivity. the data structure in situ. Data exchange between tools can thus

be realized efficiently just by passing a reference to the data struc-

Controlling the D esign Process ture. This avoids data structure conversions altogether and simpli- 3 fies both framework and tool implementations considerably. Keep- A key issue in the implementation of high-speed and high-density ing even large circuits entirely in main memory is possible due to circuits with FPGAs is allowing the (experienced) designer com- the compactness and simplicity of the representation. plete control over the circuit implementation. This is because, un- like in the software world, resources in an FPGA are scarce and

HDL Bitstream therefore need to be tightly managed. Being the creator of the cir-

Placer Router cuit, the designer has the most intimate knowledge of the circuit’s Compiler Generator structure and therefore knows best how it should be laid out. How- ever, the complexity of today’s circuits requires support by layout synthesis tools. The designer’s goal is therefore to guide the syn- thesis tools towards the desired solution. To enable this, our system

Central allows the designer to influence the outcome of the tools at various

Data Structure levels. At the circuit specification level, the designer can specify the placement of circuit components using position assignments in the Figure 1: Single Circuit Representation Accessed by Multiple De- Lola HDL. These placement hints are passed on to the automatic sign Tools placer which pre-places the parts prior to placing the remaining circuit parts. After placement, the layout editor can be used to en- The data structure represents circuits as a hierarchical forest of hance the placement manually. Since the hierarchy of the circuit is binary trees. Each tree specifies the Boolean equation for a sin- preserved by all tools and visualized by the layout editor, the de- gle circuit output. To accommodate for the needs of placers and signer is able to quickly identify and rearrange individual cells or routers, positional information and net lists can be attached to the entire subcircuits. As the layout editor is used frequently, we have nodes. All tools involved in the implementation process support taken care to make circuit manipulation as simple and as fast as and maintain the hierarchical circuit structure.Wehavefoundthat possible. Once a satisfactory placement is achieved, more place- this greatly improves the efficiency and quality of the layout syn- ments hints can be added to the HDL specification to reflect the thesis tools (see Section 4). new placement. These will constrain the placer during the next de- To describe the hierarchy of a circuit, the framework uses pa- sign iteration and will thus relieve the user from having to make the rameterizable templates. A template is a representation of a sub- same changes again during subsequent iterations. A compile-place circuit. Every instance of such a template is an exact copy of the cycle takes only a few seconds and the design quickly converges to template, i.e. it contains the same gates and net lists and they are at the desired placement. the same (relative) positions as in the master copy. Tools, such as a The designer can also influence the routing process. The router placer, can exploit this additional knowledge to efficiently produce allows individual nets to be routed and also supports the record- regular layouts for regular circuits, such as bit-sliced designs. ing and playback of routing scripts, which define the sequence in For textual design specification, the framework comprises a which the nets are routed. These scripts can be conveniently ap- compiler for the hardware description language Lola [21, 22]. The pended to the HDL specification text. Furthermore, the router can compiler translates a Lola circuit specification into the universal be constrained to only use certain types of routing resources, e.g. data structure ready to be processed further. The Lola HDL sup- only use the local interconnect. Should the automatic router fail to ports parameterized descriptions of subcircuits, e.g. N-bit adders, route a design successfully, the designer can use the layout editor and allows the designer to pass placement hints to back-end tools, to route certain nets manually. i.e. to constrain the placement of circuit components. Typically, the final layout is achieved only after a number of it- Also provided is a layout editor framework, which can be cus- erations. The ability to predict the outcome of changes made to the tomized to create layout editors for specific FPGA architectures. circuit specification is therefore of utmost importance. Our tools The layout editors fully support hierarchical layout and allow cir- comply with this requirement by employing only deterministic al- cuits to be constructed manually. The latter option is typically used gorithms. Stochastic algorithms, such as simulated annealing, are only for small to medium sized circuits and in education. The lay- ruled out, as minor changes to the circuit specification may result out editor framework has proven so versatile that a schematics edi- in drastically different layouts. tor was implemented with the framework.

To aid in the secure manual construction of circuits, the frame- Sp ee d of Design Cycle work features a design checker, which checks two circuit descrip- 4 tions, e.g. an HDL specification and a layout, for Boolean equiv- Current CAD tools take a considerable amount of runtime (several alence. The design checker is mainly used in education, where minutes to hours) to compile, place and route a design. One goal circuits are laid out manually by the student and are then checked of the presented tools is to achieve design cycle times that lie in against a Lola specification. It has also been used to prove the cor- the same range as those common in software development, namely rectness of the layout synthesis tools themselves. To prove equiv- minutes at most. The Lola front-end and the XC6200 back-end [14] alence between two Boolean expressions, the design checker uses achieve this through various techniques discussed in this section. ordered binary decision diagrams (OBDDs) [5].

In the course of our work, we have found that significant parts

L ola HDL Compila tion of design tools are indeed architecture-independent and are there- 4.1 fore profitably implemented by the framework. The framework- Due to the simplicity of the Lola language, most notably the re- based approach also increases software reliability, as common func- striction to a structural description style, the compilation process is tionality is implemented once only and reused by several back- straightforward. No elaboration process has to be performed, i.e. the mapping of source language constructs to implementable gates is obvious. A two-pass compiler is used, which first translates the source code into a syntax tree and then interprets this syntax tree,

generating the universal data structure directly into main memory. TYPE Example1; Forest The first pass takes time linear in the size of the source code and the second pass takes time linear in the size of the described circuit. IN a, b, c, d, e: [N] BIT;

OUT z: [N] BIT;

BEGIN

4.2 Technology Mapping to the XC620 0

FORi:=0..N−1DO

The Xilinx XC6200 FPGA consists of an array of simple cells and z.i := ˜((a.i * b.i) − (c.i + ˜d.i * e.i))

a hierarchical routing network [23]. Each cell can implement any END function of one or two inputs or a multiplexer, and contains an optional register, possibly with feedback. The match between the END Example1; universal data structure and the possible cell configurations of the c.1 d.1 XC6200 is almost perfect. Only simple transformation steps have e.1 to be performed, such as translating an SR-latch into two cross-

coupled Nand-gates. Packing multiple components into a single a.1

z.1 cell is deferred to the placement phase, which deals with the ge- b.1 ometric properties of the circuit. Therefore, no time-consuming packing step has to be performed, as is the case for most coarse-

c.0 d.0 grained FPGAs like the XC4000 series. The mapping process takes e.0 time linear in the size of the circuit.

a.0

z.0

b.0

4.3 Placement

0/0 The XC6200 back-end uses a constructive, deterministic placement e2 algorithm. For the same input it produces the same output, which is a very important and desirable property of a tool that is used it- Figure 2: Compact Placement of Expressions and Arrays eratively and interactively. It gives the designer the least surprises when a design is recompiled. Stochastic placement algorithms such as simulated annealing [12] are inappropriate for this task, as they typically produce different results every time they are run and ex- hibit long runtimes. The placing algorithm proceeds bottom-up, placing the inner- most subcircuits (templates) first. It is similar to the algorithm de- scribed in [15]. Within a template, it proceeds as follows: first, TYPE Example2(N); Selector

instances and expression trees with associated position hints are IN a, b, c, d: BIT; q, r, s, t: [N] BIT; placed. Second, array structures are placed using a simple heuris- OUT p: [N] BIT; tic that places elements of an array either from left to right or from bottom to top, depending on the aspect ratio of the element. A BEGIN good heuristic for arrays is essential for the placement of regular, FORi:=0..N−1DO bit-sliced designs which frequently occur in data paths. Finally, in- p.i:=a*q.i+b*r.i+c*s.i+d*t.i dividual instances and expression trees are placed using a recursive END

algorithm: the root of an expression is placed into the first avail- END Example2;

able cell. The expression trees, from which the root cell reads, are d placed recursively to the right of the root cell and above it. The tree t.1 above is offset by the vertical height of the tree to the right of the

c cell. Free space is managed using a bitmap. This simple placement s.1 strategy does not produce dense layouts, but is fast and guarantees

b a routable design. If necessary, the user can optimize this initial r.1 placement manually. Figures 2 and 3 give two examples of how

a

p.1 expressions and arrays are placed. q.1 Once a template is placed, the derived positional information is 0/4 propagated to all instances of that same template. This preserves d

t.0 the invariant of the front-end, which requires that all instances of

the same template have the same structural, placement, and wiring c information. Moreover, it also speeds up the placement algorithm, s.0

which uses time linear in the size of the circuit. b For larger circuits, which cannot be placed using this simple r.0 strategy, a floor-planner can be used to place subcircuits by hand a

p.0

q.0

and optimize their layouts individually. 0/0

e3

Routing 4.4 Figure 3: Placement Leaving Empty Cells The routing algorithm used in the back-end is a maze-running rou- ter based on the algorithm presented in [13]. It finds a path between twocells(terminals)byspreadingawavefromthedestinationto- Windows NT , version 4.0. The Lola system is wards the source until it finds the source cell or a wire segment implemented in the Oberon-2 [16] and runs driven by that source cell. A cost function determines the shape of within ETH’s Oberon for Windows System V4.0 [11] using the im- the spreading wave. plementation from the University of Linz, version 2.0. The router proceeds bottom-up, routing the innermost templates

first. Since all instances of a given template must have identical

Archit ecture Inde p endence wiring, the router must take into account the different positions 5.1 of the instances when determining the routing resources available The usefulness of the framework approach with an architecture-in- for routing the given template. It sorts all nets according to their dependent front-end and several architecture-dependent back-ends length and routes the shortest nets first. This simple, but effective, manifests itself in the size of the tools. Table 1 lists the software scheduling policy achieves quite satisfactory results. complexity of the front-end and of two back-ends, one for the Xil- To limit the size of the wave expansion, a bounding rectangle is inx XC6200 and one for the Atmel AT6000 FPGA. The AT6000 used, which bounds the size of the wave. If a subcircuit is routed, back-end consists of a layout editor and bitstream generator. It was

the size of this rectangle is the size of the subcircuit’s bounding developed by a student and proves that a layout editor back-end

=4 box. If the net is in the top-level the rectangle is made 1 larger can be developed by a programmer with no prior knowledge of the on each side than the bounding box spanned by the source (S) and framework within reasonable time (3 months). destination (D) nodes (cf. Figure 4). If routing fails within this It is interesting to compare the sizes of the framework front- rectangle, it is enlarged to the size of the chip and a new attempt is end to the XC6200 back-end. Note that in addition to the tools made. The effectiveness of this two-phase approach is dramatic, as described in Section 4, the XC6200 back-end includes a bitstream it can reduce the routing times by an order of magnitude. generator and a timing analyzer, as well as hardware driver soft- ware. The front-end is nearly of the same size as the back-end. Put in another way, approximately 40% of the circuit design system is architecture-independent. This fact clearly supports our propo- sition that a common architecture-independent front-end substan- 2 tially reduces the effort to develop new back-ends.

Subsystem Lines Object (KB) 1 Front-End 11300 199 XC6200 Back-End 18100 296

S Total (Front-End+XC6200) 29400 495

Bounding Box AT6000 Layout Editor 4400 80

of S and D

D Table 1: Software Size

25% Wider and Taller For comparison, the commercial tool XACT running under Win- dows NT has an object code size of 1192 KB. It is more than Full Chip Size twice as large as our system and features neither an HDL compiler nor does it contain architecture-independent parts, which may be reused for different architectures. Figure 4: Two-Phase Growing of Router Bounding Box Memory requirements of our tools are modest. The biggest de- sign of the next section is compiled, placed and routed using no Once a template is routed, the routing information is propa- more than 16 MB, while the memory requirement of XACT for the gated to all instances of that template occurring in the design. This same design is 46 MB.

propagation process speeds up the routing time of the whole de-

Sp ee d of D esign Cycle sign considerably, as the costly wave spreading is only performed 5.2 once for each net in each template. In designs with many repeti- tive structures, the speedup is directly proportional to the number We use two designs to evaluate the tools. The first is a floating point of instances of the same template. adder for 16-bit operands. The second is a pattern matcher,which To allow for manual intervention by the user, the router takes consists of a regular datapath, and a small amount of random logic. already routed nets into account by extracting the connectivity in- It matches 5-bit characters stored in the FPGA (the patterns) against formation prior to routing. Critical nets can therefore be pre-routed a stream of 5-bit characters (the text) and signals a match. Three manually. design variants are evaluated: one with 2 parallel pattern matchers of 4 characters each, one with 16 pattern matchers of 12 charac-

ters each and one with 32 parallel pattern matchers of 24 characters Evaluat ion 5 each. Both designs are data-path intensive circuits, for which the XC6200 is particularly well suited. Table 2 lists the characteristics We evaluate the Lola front-end and the XC6200 back-end and com- of the four designs after successful layout synthesis. The biggest pare it to the XACT Step 6000 V1.1.2 software available from the design is implemented on an XC6264 (128x128 cells) while all chip vendor. While our system offers an integrated design flow other designs are implemented on an XC6216 (64x64 cells). Fig- from HDL specification to bitstream generation, XACT provides ures 5 and 6 show the placed and routed layouts of the floating point the back-end functionality and relies on third party front-ends. The adder and the medium pattern matcher, respectively. measurements were carried out on a Digital Celebris GL 5166ST Table 3 shows the time spent in each phase of the Lola system. PC, equipped with a 166 MHz Pentium processor, 256 KB of Note that routing is not involved in the design cycle until the user second-level cache, 128 MB of main memory, running Microsoft’s is satisfied with the placement of the circuit. Therefore the first column lists the combined times of HDL compilation, mapping and placement. The (nh)-rows list the results for Lola code without CLBs Nets Bounding Box Util. placement hints. The number of unrouted connections is listed as well and clearly indicates how the quality of the routing is affected FP-Adder 542 1283 64x34 25% by the placement. The time for routing dominates the total design Small PM 248 630 18x46 30% cycle time. Medium PM 3048 6310 60x61 83% Large PM 11748 23830 107x121 90% Compile+ Table 2: Design Characteristics Place Route Total Unroutes FP-Adder (nh) 1.1 16.3 17.4 32 FP-Adder 1.1 4.5 5.6 0 Small PM (nh) 0.2 6.4 6.6 21 Small PM 0.3 2.0 2.3 0 Medium PM 3.5 17.1 20.6 0 Large PM 33.5 162.6 196.1 0

Table 3: Speed of Lola System (Times in Seconds)

Table 4 lists the times spent in the XACT tool. Since the HDL compiler is not integrated into XACT, the first column lists the com- bined times for reading the mapped netlist and placement. The mapped netlists were produced with our Lola compiler and a con- version tool. With no hints, XACT uses more time in the placement phase, trying to produce a good placement using stochastic algo- rithms. Repetitive runs on the same input do not yield the same re- sult, which can affect the result of the routing phase as well. Hence, little progress can be made between iterations. The placement and the routing can be influenced by the user through various switches. To produce the data presented in Table 4, those settings were cho- sen that ran the fastest, or achieved a completed design. Figure 5: Layout of Floating Point Adder Read+ Place Route Total Unroutes FP-Adder (nh) 58.4 1046.2 1104.6 81 FP-Adder 5.9 125.2 131.1 0 Small PM (nh) 6.6 683.1 689.7 9 Small PM 3.0 281.1 284.1 0 Medium PM 22.2 216.5 238.7 0 Large PM 273.9 2543.1 2817.0 0

Table 4: Speed of XACT (Times in Seconds)

Table 5 lists the total times and the speedup obtained by using the Lola system. As the table shows, the Lola system is one to two orders of magnitude faster than the commercial tool. In contrast to XACT, our router does not support the “Magic” routing resource of the XC6200 — but still routes the designs — and only supports one global clock signal. However, this does not fully explain the difference in execution time. Moreover, the times for our tools in- clude HDL compilation. Commercial HDL compilers are typically at least an order of magnitude slower than our Lola compiler. The tables do not show quantitative results of the layouts be- cause our timing analyzer is not yet fully functional. Inspection of the layouts, however, reveals critical paths of similar lengths. Our system exhibits exceptionally fast turnaround times and is therefore well suited for interactive circuit development. The fast turnaround times of compile, map and place are especially notice- able when many design iterations are being performed. Figure 6: Layout of Medium Pattern Matcher HDL level throughout the system proved to be valuable both to en- XACT Lola Speedup hance the predictability of the synthesized layouts and to reduce FP-Adder (nh) 1104.6 17.4 63.5 execution times of the design tools. FP-Adder 131.1 5.6 23.4 The resulting design cycle is one to two orders of magnitude Small PM (nh) 689.7 6.6 104.5 faster than what can be achieved with vendor-supplied tools and Small PM 284.1 2.3 123.5 approaches that of tools for software development. We hope that Medium PM 238.7 20.6 11.6 this enabling technology will spark interest in the software commu-

Large PM 2817.0 196.1 14.3 nity to use FPGAs for custom computing applications.

knowl edgme nts Table 5: Speed of Lola System vs. XACT (Times in Seconds) Ac We would like to thank Marco Sanvido for implementing a timing analyzer for the XC6200 FPGA and Daniel Hofmann for the im-

plementation of the Atmel AT6000 layout editor. We are grateful

Rela ted Work 6 to Xilinx Development Corporation in Scotland for their continu- Many different frameworks for circuit design are reported in the ing support and to Virtual Computer Corporation in California for literature, and the term “framework” is defined in various contexts. making our tools available to a wider community. For instance, the CAD Framework Initiative (CFI), an international TheLolatoolscanbedownloadedfreeofchargefromhttp://www.- consortium developing framework standards, defines a framework lola.ethz.ch and a version supporting the H.O.T. Works develop- as “a software infrastructure that provides a common operating en- ment system of Virtual Computer Corporation is available from vironment for CAD tools” [6]. This definition targets the interop- http://www.vcc.com. erability of loosely coupled design tools through a standard layer which is separate from the tools. It encapsulates existing tools which have been developed independently and thus focuses on the References management rather than the implementation of design tools. Ex- [1]R.Amerson,R.J.Carter,W.B.Culbertson,P.Kuekes,G. emplary for this philosophy is the Nelsis framework [17, 19]. Snider. Teramac — Configurable Custom Computing. Proc. Our front-end differs from this approach in that it closely cou- IEEE Symposium on FPGAs for Custom Computing Ma- ples extensible design tools. It does not need any data translators chines. IEEE Computer Society Press, 1995. since a common data structure is used. More in the spirit of our tools is the FACE environment [18], which is a framework contain- [2] J. M. Arnold, D. A. Buell, E. G. Davis. Splash 2. Proc. 4th ing a design manager and a user interface toolkit centered around a Annual ACM Symposium on Parallel Algorithms and Archi- common design representation model. tectures, 1992. Because the bitstream formats of most FPGAs are kept propri- etary by their respective vendors, CAD tools developed by third [3] Atmel. Configurable Logic: Design & Application Book, parties generate netlists that are then read by commercial tools. 1995. Therefore, these third party tools suffer from the lack of integra- tion and speed and the design cycle times are slow, although these [4] P. Bertin, H. Touati. PAM Programming Environments: times are seldom published in the literature. Examples for such Practice and Experience. Proc. IEEE Symposium on FPGAs tools are PamDC from Digital’s Paris Research Laboratory [4] and for Custom Computing Machines. IEEE Computer Society the SPLASH-tools from the Supercomputing Research Center in Press, 1994. Maryland [2]. The only systems we know of, which have compa- [5] R. E. Bryant. Symbolic Boolean Manipulation with Ordered rably fast turnaround times are Tsutsuji [7] for the Teramac cus- Binary Decision Diagrams. ACM Computing Surveys,Vol. tom computing machine from Hewlett-Packard Laboratories [1], 24, 293–318, 1992. and the Debora/CALLAS tools for the Algotronix CAL architec- ture [10]. The bitstream formats of both FPGAs were available [6] CFI Architecture Technical Subcommittee. CAD Framework to the tool developers. Compared to our system, the CALLAS syn- Users, Goals, and Objectives, Version 0.91, CAD Frame- thesis tools make no use of hierarchical information and the Debora work Initiative, 1990. HDL can not be annotated with placement information. [7] B. Culbertson, T. Osame, Y. Otsuru, J. B. Shackleford, M.

Tanaka. The HP Tsutsuji Logic Synthesis System. Hewlett-

Summary and Conclusions 7 Packard Journal, August 1993. We have developed circuit design tools for FPGAs which feature [8] S. Gehring. An Integrated Framework for Structured Cir- a very fast design cycle. They give the user tight control over the cuit Design with Field-Programmable Gate Arrays. Disser- implementation process and encourage an iterative, exploratory de- tation No. 12188, ETH Z¨urich, 1997 (http://www.inf.ethz.- sign style. This is particularly useful for experienced users who ch/publications/diss.html). push the technology to its limits. Our tools should be seen as com- plements, rather than replacements, to sophisticated design tools [9] S. Gehring, S. Ludwig. The Trianus System and its Appli- that offer fully automatic synthesis at the expense of execution cation to Custom Computing.Proc. 6th Intl. Workshop on time. Field-Programmable Logic and Applications. LNCS 1142, The separation into an architecture-independent front-end and Springer, 1996. architecture-dependent back-ends is beneficial, as it relieves the [10] B. Heeb and C. Pfister. Chameleon: A of a Dif- tool implementor from having to (re)implement common behavior ferent Colour. 2nd Intl. Workshop on Field-Programmable for each back-end anew. Logic and Applications. LNCS 705, Springer, 1992. By centering all tools around a universal data structure, the speed of the tools is improved and the memory requirements are [11] Institute for Computer Systems. The Oberon Archive. ftp:- lowered. Preserving the hierarchical information available on the //ftp.inf.ethz.ch/pub/software/Oberon. [12] S. Kirkpatrick, C. D. Gelatt, Jr., M. P. Vecci. Optimization by Simulated Annealing. Science, Vol. 220, May, 1983. [13] C. Y. Lee. An Algorithm for Path Connections and its Ap- plications. IRE Trans. Electronic Computer, Vol. EC-10, September 1961. [14] S. Ludwig. Hades — Fast Hardware Synthesis Tools and a Reconfigurable Coprocessor. Dissertation No. 12276, ETH Z¨urich, 1997 (http://www.inf.ethz.ch/publications/- diss.html). [15] L. M. Monier, J. Dion. Recursive Layout Generation. Proc. 16th Conference on Advanced Research in VLSI. IEEE Com- puter Society Press, 1995. [16] H. Mossenb¨ ock,¨ N. Wirth. The Programming Language Oberon-2. . Vol.12,No.4,1991. [17] Technical University of Delft. The Nelsis CAD Framework. http://www.ddtc.dimes.tudelft.nl, 1996. [18] W. Smith, D. Duff, M. Dragomirecky, J. Caldwell, M. Hartman, J. Jasica, M. d’Abreu. FACE Core Environment: The Model and its Application in CAE/CAD Tool Devel- opment.Proc. 23rd Design Automation Conference, IEEE, 1989. [19] P. van der Wolf. CAD Frameworks — Principles and Archi- tecture. Kluwer Academic Publishers, 1994. [20] R. Woods, A. Cassidy, J. Gray. VLSI Architectures for Field Programmable Gate Arrays: A Case Study.Proc. of the IEEE Symposium on FPGAs for Custom Computing Machines. IEEE Computer Society Press, 1996. [21] N. Wirth. Digital Circuit Design. An Introductory Textbook. Springer, 1995. [22] N. Wirth. The Language Lola and Programmable Devices in Teaching Digital Circuit Design. Proc. of the 2nd Intl. Andrei Ershov Memorial Conference. LNCS 1181, Springer, 1996. [23] Xilinx. The Programmable Logic Data Book, September 1996.