STEPHEN BROWN
JONATHAN ROSE

IEEE DESIGN & TEST OF COMPUTERS, SUMMER 1996. FIELD-PROGRAMMABLE DEVICES. 0740-7475/96/$05.00 © 1996 IEEE

RECENTLY, the development of new types of sophisticated field-programmable devices (FPDs) has dramatically changed the process of designing digital hardware. Unlike previous generations of hardware technology, in which board-level designs included large numbers of SSI (small-scale integration) chips containing basic gates, virtually every digital design produced today consists mostly of high-density devices. This is true not only of custom devices such as processors and memory but also of logic circuits such as state machine controllers, counters, registers, and decoders. When such circuits are destined for high-volume systems, designers integrate them into high-density gate arrays. However, the high nonrecurring engineering costs and long manufacturing time of gate arrays make them unsuitable for prototyping or other low-volume scenarios. Therefore, most prototypes and many production designs now use FPDs. The most compelling advantages of FPDs are low startup cost, low financial risk, and, because the end user programs the device, quick manufacturing turnaround and easy design changes.

The FPD market has grown over the past decade to the point where there is now a wide assortment of devices to choose from. To choose a product, designers face the daunting task of researching the best uses of the various chips and learning the intricacies of vendor-specific software. Adding to the difficulty is the complexity of the more sophisticated devices. To help sort out the confusion, we provide an overview of the various FPD architectures and discuss the most important commercial products, emphasizing devices with relatively high logic capacity.

Evolution of FPDs

The first user-programmable chip that could implement logic circuits was the programmable read-only memory (PROM), in which address lines serve as logic circuit inputs and data lines as outputs. Logic functions, however, rarely require more than a few product terms, and a PROM contains a full decoder for its address inputs. PROMs are thus inefficient for realizing logic circuits, so designers rarely use them for that purpose.

The first device developed specifically for implementing logic circuits was the field-programmable logic array, or simply PLA for short. A PLA consists of two levels of logic gates: a programmable, wired-AND plane followed by a programmable, wired-OR plane. A PLA's structure allows any of its inputs (or their complements) to be ANDed together in the AND plane; each AND plane output can thus correspond to any product term of the inputs. Similarly, users can configure each OR plane output to produce the logical sum of any AND plane outputs. With this structure, PLAs are well suited for implementing logic functions in sum-of-products form. They are also quite versatile, since both the AND and OR terms can have many inputs (product literature often calls this feature "wide AND and OR gates").

When PLAs were introduced in the early 1970s, their main drawbacks were expense of manufacturing and somewhat poor speed performance. Both disadvantages arose from the two levels of configurable logic; programmable logic planes were difficult to manufacture and introduced significant propagation delays. To overcome these weaknesses, Monolithic Memories (MMI, later merged with Advanced Micro Devices) developed PAL devices. As Figure 1 shows, PALs feature only a single level of programmability: a programmable, wired-AND plane that feeds fixed-OR gates. To compensate for the lack of generality incurred by the fixed-OR plane, PALs come in variants with different numbers of inputs and outputs and various sizes of OR gates. To implement sequential circuits, PALs usually contain flip-flops connected to the OR gate outputs.

Figure 1. PAL structure. (Labels: inputs and flip-flop feedbacks; outputs.)

The introduction of PAL devices profoundly affected digital hardware design, and they are the basis of some of the newer, more sophisticated architectures that we will describe shortly. Variants of the basic PAL architecture appear in several products known by various acronyms. We group all small FPDs, including PLAs, PALs, and PAL-like devices, into the single category of simple programmable-logic devices (SPLDs), whose most important characteristics are low cost and very high pin-to-pin speed performance.

Advances in technology have produced devices with higher capacities than SPLDs. The difficulty with increasing a strict SPLD architecture's capacity is that the programmable logic plane structure grows too quickly as the number of inputs increases. The only feasible way to provide large-capacity devices based on SPLD architectures is to programmably interconnect multiple SPLDs on a single chip. Many FPD products on the market today have this basic structure and are known as complex programmable-logic devices (CPLDs). Altera pioneered CPLDs, first in their Classic EPLD chips, and then in the Max 5000, 7000, and 9000 series. Because of a rapidly growing market for large FPDs, other manufacturers developed CPLD devices, and many choices are now available. CPLDs provide logic capacity up to the equivalent of about 50 typical SPLD devices, but extending these architectures to higher densities is difficult.

Building FPDs with very high logic capacity requires a different approach. The highest capacity general-purpose logic chips available today are the traditional gate arrays, sometimes referred to as mask-programmable gate arrays (MPGAs). An MPGA consists of an array of pre[...]

Figure 3. FPD logic capacities. (Categories: SPLDs, CPLDs, FPGAs. Legible product labels: Altera Flex 10K, AT&T ORCA 2, Mach, Lattice (p)LSI, Cypress Flash370.)

As we explain later, each type of FPD is inherently better suited for some applications than for others. There [...] devices have limited use, we do not describe them here.

User-programmable switch technologies

User-programmable switches are the key to user customization of FPDs. The first user-programmable switch developed was the fuse used in PLAs. Although some smaller devices still use fuses, we will not discuss them here because newer technology is quickly replacing them. For higher density devices, CMOS dominates the IC industry, and different approaches to implementing programmable switches are necessary. For CPLDs, the main switch technologies (in commercial products) are floating-gate transistors like those used in EPROM (erasable programmable read-only memory) and EEPROM (electrically erasable PROM). For FPGAs, they are SRAM (static RAM) and antifuse. Table 1 lists the most important characteristics of these programming technologies.

Table 1. Summary of FPD programming technologies.

Switch type   Reprogrammable?        Volatile?   Technology
Fuse          No                     No          Bipolar
EPROM         Yes (out of circuit)   No          UVCMOS
EEPROM        Yes (in circuit)       No          EECMOS
SRAM          Yes (in circuit)       Yes         CMOS
Antifuse      No                     No          CMOS+

To use an EPROM or EEPROM transistor as a programmable switch for CPLDs (and many SPLDs), the manufacturer places the transistor between two wires to facilitate implementation of wired-AND functions. Figure 4 shows EPROM transistors connected in a CPLD's AND plane. An input to the AND plane can drive a product wire to logic level 0 through an EPROM transistor, if that input is part of the corresponding product term. For inputs not involved in a product term, the appropriate EPROM transistors are programmed as permanently turned off. The diagram of an EEPROM-based device would look similar to the one in Figure 4.

Figure 4. EPROM programmable switches. (Labels: input wires; product wire; EPROM.)

Although no technical reason prevents application of EPROM or EEPROM to FPGAs, current commercial FPGA products use either SRAM or antifuse technologies. The example of SRAM-controlled switches in Figure 5 illustrates two applications, one to control the gate nodes of pass-transistor switches and the other, the select lines of multiplexers that drive logic block inputs. The figure shows the connection of one logic block (represented by the AND gate in the upper left corner) to another through two pass-transistor switches and then a multiplexer, all controlled by SRAM cells. Whether an FPGA uses pass transistors, multiplexers, or both depends on the particular product.

Figure 5. SRAM-controlled programmable switches.

Antifuses are originally open circuits that take on low resistance only when programmed. Antifuses are manufactured using modified CMOS technology. As an example, Figure 6 depicts Actel's PLICE (programmable-logic interconnect circuit element), an antifuse structure [1]. The antifuse, positioned between two interconnect wires, consists of three sandwiched layers: conductors at top and bottom and an insulator in the middle. Unprogrammed, the insulator isolates the top and bottom layers; programmed, the insulator becomes a low-resistance link.
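As a rough illustration of the two-plane PLA structure described under Evolution of FPDs, the following Python sketch models a programmable AND plane feeding a programmable OR plane. The names `pla_eval`, `AND_PLANE`, and `OR_PLANE`, and the data layout, are assumptions chosen for illustration; they do not correspond to any vendor's programming format.

```python
from itertools import product

def pla_eval(inputs, and_plane, or_plane):
    """Evaluate a PLA-style sum-of-products network.

    inputs:    tuple of 0/1 input values
    and_plane: one dict per product term, mapping input index to the value
               that input must have (1 = true literal, 0 = complemented
               literal); inputs absent from the dict are not connected
    or_plane:  one list per output, naming the product terms it sums
    """
    terms = [int(all(inputs[i] == v for i, v in term.items()))
             for term in and_plane]
    return tuple(int(any(terms[t] for t in out)) for out in or_plane)

# Hypothetical programming with inputs a (index 0) and b (index 1):
AND_PLANE = [{0: 1, 1: 0},   # a * b'
             {0: 0, 1: 1},   # a' * b
             {0: 1, 1: 1}]   # a * b
OR_PLANE = [[0, 1],          # f0 = a*b' + a'*b  (exclusive OR)
            [2]]             # f1 = a*b

if __name__ == "__main__":
    for a, b in product((0, 1), repeat=2):
        print(a, b, pla_eval((a, b), AND_PLANE, OR_PLANE))
```

In this framing, a PAL would correspond to fixing `OR_PLANE` at manufacture and leaving only `AND_PLANE` programmable.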

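The wired-AND behavior described for the EPROM switches of Figure 4 can be sketched at the logic level as follows. This is a behavioral model only, under simplifying assumptions of our own: each input line carries one literal, an active switch lets a line at logic 0 pull the product wire low, and a wire that nothing pulls down reads 1. The names `product_wire`, `literals`, and `SWITCHES` are hypothetical.

```python
def product_wire(lines, switch_on):
    """Behavioral model of one wired-AND product wire.

    lines:     list of 0/1 values on the input lines crossing the wire
    switch_on: list of booleans; True means the programmable switch at that
               crossing is active, False means it is programmed permanently off
    The wire reads 1 unless some active switch sees a 0 on its input line.
    """
    if len(lines) != len(switch_on):
        raise ValueError("one switch setting is needed per input line")
    for value, active in zip(lines, switch_on):
        if active and value == 0:
            return 0      # an involved input drives the wire to logic 0
    return 1              # no involved input pulls the wire down

def literals(a, b):
    """Input lines carrying a, a', b, b' in that order."""
    return [a, 1 - a, b, 1 - b]

# The product term a * b' involves only lines 0 and 3.
SWITCHES = [True, False, False, True]
```

With this programming, `product_wire(literals(a, b), SWITCHES)` is 1 only when a is 1 and b is 0.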

PLICE uses polysilicon and n+ diffusion as conductors and a custom-developed compound, ONO (oxide-nitride-oxide) [1], as an insulator. Other antifuses rely on metal for conductors, with amorphous silicon as the middle layer [2, 3].

Figure 6. Actel's PLICE antifuse structure.

CAD for FPDs

Computer-aided design programs are essential in designing circuits for implementation in FPDs. Such software tools are important not only for CPLDs [...]

[...] language, or combines these methods. Since initial logic entry is not usually in an optimized form, the system applies algorithms to optimize the circuits. Then additional algorithms analyze the resulting logic equations and fit them into the SPLD. Simulation verifies correct operation, and the designer returns to the design entry step to fix errors. When a design simulates correctly, the designer loads it into a programming unit to configure an SPLD. In most CAD systems, the designer performs the orig[...]

Figure 7. CAD process for SPLDs. (Labels: fix errors; manual; automatic.)

[...] might use a small hardware description language such as ABEL for some modules, a symbolic schematic-capture tool for others, and a full-featured hardware description language such as VHDL for still others. Also, the device-fitting process may require steps similar to those described next for FPGAs, depending on the CPLD's sophistication. Either the CPLD manufacturer or a third party supplies the necessary software for these tasks.

The FPGA design process is similar to that of CPLDs but requires additional tools to support increased chip complexity. The major difference is in device fitting, for which FPGAs need at least three tools: a technology mapper to transform basic logic gates into the FPGA's logic blocks; a placement tool to choose the specific logic blocks; and a router to allocate wire segments to interconnect the logic blocks. With this added complexity, the CAD tools take a fairly long time (often more than an hour, or even several hours) to complete their tasks.

Commercially available FPDs

This overview provides examples of commercial FPD products and their applications. We encourage readers interested in more details to contact the manufacturers or distributors for the latest data sheets. Most FPD manufacturers provide data sheets on the World Wide Web at http://www.companyname.com.

SPLDs. As a staple of digital hardware designers for the [...] and sourced by other companies. The designation 16R8 means that the PAL has a maximum of 16 inputs (eight dedicated inputs and eight input/outputs) and a maximum of eight outputs, and that each output is registered (R) by a D flip-flop. Similarly, the 22V10 has a maximum of 22 inputs and ten outputs. The V means versatile; that is, each output can be registered or combinational.

Another widely used and second-sourced SPLD is the Altera Classic EP610. This device is similar in complexity to PALs, but offers more flexibility in the production of outputs and has larger AND and OR planes. The EP610's outputs can be registered, and the flip-flops are configurable as D, T, JK, or SR.

Many other SPLD products are available from a wide array of companies. All share common characteristics such as logic planes (AND, OR, NOR, or NAND), but each offers unique features suitable for particular applications. A partial list of companies that offer SPLDs includes AMD, Altera, ICT, Lattice, Cypress, and Philips. The complexity of some of these SPLDs approaches that of CPLDs.

Figure 8. Altera Max 7000 series architecture. (Labels: I/O block; logic array block.)

CPLDs. As we said earlier, CPLDs consist of multiple SPLD-like blocks on a single chip. However, CPLD products are much more sophisticated than SPLDs, even at the level of their basic SPLD-like blocks. In the following descriptions, we present sufficient details to compare competing products, emphasizing the most widely used devices.
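The naming convention explained in the SPLD discussion for the 16R8 and 22V10 can be captured in a small parser. This is only a sketch: the function `decode_pal` is hypothetical, and it handles just the two output types (R and V) that the text describes, not every PAL part number in existence.

```python
import re

OUTPUT_TYPES = {
    "R": "registered (each output passes through a D flip-flop)",
    "V": "versatile (each output can be registered or combinational)",
}

def decode_pal(designation):
    """Split a PAL designation such as '16R8' into its three fields."""
    match = re.fullmatch(r"(\d+)([A-Z])(\d+)", designation.strip().upper())
    if match is None:
        raise ValueError(f"unrecognized designation: {designation!r}")
    max_inputs, letter, max_outputs = match.groups()
    if letter not in OUTPUT_TYPES:
        raise ValueError(f"output type {letter!r} is not covered by this sketch")
    return {
        "max_inputs": int(max_inputs),
        "max_outputs": int(max_outputs),
        "output_type": OUTPUT_TYPES[letter],
    }

if __name__ == "__main__":
    for name in ("16R8", "22V10"):
        print(name, decode_pal(name))
```

Note that the first number is a maximum: as the text says for the 16R8, some of those inputs are shared input/output pins.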

Altera Max. Altera has developed three families of CPLD chips: Max 5000, 7000, and 9000. We focus on the 7000 series because of its wide use and state-of-the-art logic capacity and speed performance. Max 5000 represents an older technology that offers a cost-effective solution; Max 9000 is similar to Max 7000 but offers higher logic capacity (the industry's highest for CPLDs).

Figure 8 depicts the general architecture of the Altera Max 7000 series. It consists of an array of logic array blocks and a set of interconnect wires called a programmable interconnect array (PIA). The PIA can connect any logic array block input or output to any other logic array block. The chip's inputs and outputs connect directly to the PIA and to logic array blocks. A logic array block is a complex, SPLD-like structure, and so we can consider the entire chip an array of SPLDs.

Figure 9. Altera Max 7000 logic array block. (Labels: array of 16 macrocells; PIA; to I/O cells.)

Figure 9 shows the structure of a logic array block. Each logic array block consists of two sets of eight macrocells (shown in Figure 10). A macrocell is a set of programmable product terms (part of an AND plane) that feeds an OR gate and a flip-flop. The flip-flops can be D, JK, T, or SR, or can be transparent. As Figure 10 shows, the product select matrix allows a variable number of inputs to the OR gate in a macrocell. Any or all of the five product terms in the macrocell can feed the OR gate, which can have up to 15 extra product terms from macrocells in the same logic array block. This product term flexibility makes the Max 7000 series more efficient in chip area than classic SPLDs, because typical logic functions need no more than five product terms, and the architecture supports wider functions when necessary. Variable-size OR gates of this sort are not available in basic SPLDs (see Figure 1), but similar features exist in other CPLD architectures.

Figure 10. Max 7000 macrocell. (Labels: global clear not shown; to PIA.)

Max 7000 devices are available in both EPROM and EEPROM technologies. Until recently, even with EEPROM, Max 7000 chips were programmable only out of circuit in a special-purpose programming unit; in 1996, however, Altera released the 7000S series, which is in-circuit reprogrammable.

AMD Mach. AMD offers a CPLD family comprising five subfamilies called Mach. Each Mach device consists of multiple PAL-like blocks (or optimized PALs). Mach 1 and 2 consist of optimized 22V16 PALs, Mach 3 and 4 consist of several optimized 34V16 PALs, and Mach 5 is similar to Mach 3 and 4 but offers enhanced speed performance. All Mach chips use EEPROM technology, and together the five subfamilies provide a wide range of selection, from small, inexpensive chips to larger, state-of-the-art ones. We will focus on Mach 4 because it represents the most advanced currently available parts in the family.

Figure 11 depicts a Mach 4 chip, showing the multiple 34V16 PAL-like blocks and the interconnect, called the central switch matrix. The in-circuit programmable chips range in size from 6 to 16 PAL-like blocks, corresponding roughly to 2,000 to 5,000 equivalent gates. All connections between PAL-like blocks (even from a PAL-like block to itself) pass through the central switch matrix. Thus, the device is not merely a collection of PAL-like blocks but a single, large device. Since all connections travel through the same path, circuit timing delays are predictable.

Figure 11. AMD Mach 4 structure. (Labels: 34V16 PAL-like block; I/O (8); I/O (32).)

Figure 12 illustrates a Mach 4 PAL-like block. It has 16 outputs and a total of 34 inputs (16 of which are the fed-back outputs), so it corresponds to a 34V16 PAL. However, there are two key differences between this block and a normal PAL: 1) a product term (PT) allocator between the AND plane and the macrocells (the macrocells comprise an OR gate, an EXOR gate, and a flip-flop), and 2) an output switch matrix between the OR gates and the I/O pins. These features make a Mach 4 chip easier to use because they decouple sections of the PAL-like block. More specifically, the product term allocator distributes and shares product terms from the AND plane to OR gates that require them, allowing much more flexibility than the fixed-size OR gates in regular PALs. The output switch matrix enables any macrocell output (OR gate or flip-flop) to drive any I/O pin connected to the PAL-like block, again providing greater flexibility than a PAL, in which each macrocell can drive only one specific I/O pin. Mach 4's combination of in-system programmability and high flexibility allows easy hardware design changes.

Figure 12. Mach 4 34V16 PAL-like block.

Lattice pLSI and ispLSI. Lattice offers a complete range of CPLDs, with two main product lines: the pLSI and the ispLSI. Each consists of three families of EEPROM CPLDs with different logic capacities and speed performance. The ispLSI devices are in-system programmable.

Lattice's earliest generation of CPLDs is the pLSI and ispLSI 1000 series. Each chip consists of a collection of SPLD-like blocks and a global routing pool to connect the blocks. Logic capacity ranges from about 1,200 to 4,000 gates, and pin-to-pin delays are 10 ns. Lattice also offers the 2000 series: relatively small CPLDs with between 600 and 2,000 gates. The 2000 series features a higher ratio of macrocells to I/O pins and higher speed performance than the 1000 series. At 5.5-ns pin-to-pin delays, the 2000 series provides state-of-the-art speed.

Lattice's 3000 series consists of the company's largest CPLDs, with up to 5,000 gates and 10- to 15-ns pin-to-pin delays. Compared with the chips discussed so far, the functionality of the 3000 series is most similar to that of the Mach 4. Unlike the other Lattice CPLDs, the 3000 series offers enhancements to support more recent design styles, such as IEEE Std 1149.1 boundary scan.

Figure 13. Lattice pLSI and ispLSI architecture. (Labels: generic logic blocks; product term allocator; macrocells; input bus; I/O pads.)

Figure 13 shows the general structure of a Lattice pLSI or ispLSI device. Around the chip's outside edges are bidirectional I/Os, which connect to both the generic logic blocks and the global routing pool. As the magnified view on the right side of the figure shows, the generic logic blocks are small PAL-like blocks consisting of an AND plane, a product term allocator, and macrocells. The global routing pool is a set of wires that span the chip to connect generic logic block inputs and outputs. All interconnects pass through the global routing pool, so timing between logic levels is fully predictable, as it is for the AMD Mach devices.

Cypress Flash370. Cypress has recently developed CPLD products similar to the AMD and Lattice devices in several ways. Cypress Flash370 CPLDs use flash EEPROM technology and offer speed performance of 8.5 to 15 ns pin-to-pin delays. The Flash370s are not in-system programmable. To meet the needs of larger chips, the devices provide more I/O pins than competing products, with a linear relationship between the number of macrocells and the number of bidirectional I/O pins. The smallest parts have 32 macrocells and 32 I/O pins; the largest have 256 macrocells and 256 pins.

Figure 14 shows that Flash370s have a typical CPLD architecture with multiple PAL-like blocks connected by a programmable interconnect matrix. Each PAL-like block contains an AND plane that feeds a product term allocator that [...]

Figure 14. Cypress Flash370 architecture. (PIM: programmable interconnect matrix.)

Figure 15. Altera Flashlogic CPLD: general architecture (a); CFB in PAL mode (b); CFB in SRAM mode (c). (Labels: data in; address; control; clock; data out; (D, T, latch) flip-flop; tristate buffer.)
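The product-term arithmetic given earlier for the Max 7000 macrocell (five product terms of its own, plus up to 15 more from macrocells in the same logic array block) can be checked with a few lines of Python. The two constants come from the text; the function name and the notion of counting terms "needed from neighbors" are our own illustrative framing, not Altera terminology.

```python
LOCAL_TERMS = 5    # product terms inside one macrocell (from the text)
EXTRA_TERMS = 15   # additional terms available from the same logic array block

def terms_needed_from_neighbors(num_product_terms):
    """Return how many product terms must come from other macrocells.

    Raises ValueError if the function is too wide for a single OR gate
    under the limits stated above.
    """
    if 0 > num_product_terms:
        raise ValueError("product term count cannot be negative")
    if num_product_terms > LOCAL_TERMS + EXTRA_TERMS:
        raise ValueError("function exceeds the 20-term OR gate limit")
    return max(0, num_product_terms - LOCAL_TERMS)

if __name__ == "__main__":
    for n in (3, 5, 8, 20):
        print(n, "terms:", terms_needed_from_neighbors(n), "from neighbors")
```

A function with five or fewer product terms, which the text calls typical, needs nothing from its neighbors; only wider functions consume terms elsewhere in the block.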

50 all other CPLDs:Instead of containing AND/OR logic, a CFB can serve as a IO-nsSRAM block. Figure 15b shows a CFB configured as a PAL, and Figure Input 15c shows another configured as an pins SRAM. In the SRAM configuration, the PAL block becomes a 12%word by lO- bit read/write memory. Inputs that would normally feed the AND plane in the PAL become address lines, data Figure 77. /CT PEELArray /ogic ce// lines, or control signals for the memo- Group of four structure. ry. Flip-flops and tristate buffers are still sum terms available in the SRAM configuration. Figure 16. /CT PEELArray architecture. In the Flashlogic device, the AND/OR zan exploit wide AND/OR gatesand do logic plane’s configuration bits are rot need a large number of flip-flops are SRAM cells connected to EPROM or well into the CPLD category. good candidates for CPLD implemen- EEPROMcells. Applying power loads Nevertheless,we include them here be :ation. Finite state machines are an ex- the SRAMcells with a copy of the non- causethey exemplify PLA-based(rather :ellent example of this classof circuits. volatile EPROM or EEPROM, but the than PAL-based)devices and offer larg- 4 significant advantage of CPLDsis that SRAM cells control the chip’s configu- er capacity than a typical SPLD. :hey allow simple design changes ration. The user can reconfigure the The PEELArray logic cell, shown in :hrough reprogramming (all commer- chips in systemby downloading new in- Figure 17, includes a flip-flop, config- :ial CPLD products are reprogramma- formation into the SRAMcells. The user urable as D, T, or JK, and two multi- lie). Insystem programmable CPLDs can make the SRAM cell reprogram- plexers. Each multiplexer produces a zven make it possible to reconfigure ming nonvolatile by writing the SRAM logic cell output, either registered or rardware (for example, change a pro- cell contents back to the EPROMcells. combinational. 
One logic cell output :ocol for a communications circuit) can connect to an I/O pin, and the oth- Nithout powering down. ICT PEEL Arruys. ICT PEEL (pro- er output is buried. An interesting fea- Designsoften partition naturally into grammable, electrically-erasablelogic) ture of the logic cell is that the flip-flop :he SPLD-like blocks in a CPLD, pro- Arrays are large PLAsthat include logic clock, preset, and clear are full sum-of- jucing more predictable speed perfor- macrocells with flop-flops and feed- product logic functions. Distinguishing nance than a design split into many back to the logic planes. Figure 16 il- PEEL Arrays from all other CPLDs, small pieces mapped into different ar- lustrates this structure, which consists which simply provide product terms for ?asof the chip. Predictability of circuit of a programmable AND plane that thesesignals, this feature is attractive for mplementation is one of the strongest feeds a programmable OR plane. The some applications. Because of their idvantages of CPLDarchitectures. OR plane’s outputs are partitioned into PLAlike OR plane, PEELArrays are es- groups of four, and each group can be pecially well suited to applications that FPGAs. As one of the fastestgrowing input to any of the logic cells. The log- require very wide sum terms. segmentsof the semiconductor indus- ic cells provide registers for the sum :ry, the FPGA marketplace is volatile. terms and can feed back the sum terms CPLD applications. Their high The pool of companies involved to the AND plane. Also, the logic cells speeds and wide range of capacities changesrapidly, and it is difficult to say connect sum terms to I/O pins. make CPLDsuseful for many applica- yYhichproducts will be most significant Because they have a PLA-like struc- tions, from implementing random glue Nhen the industry reaches a stable ture, the logic capacity of PEELArrays logic to prototyping small gate arrays. ;tate. 
We focus here on products cur- is difficult to measure compared to the An important reason for the growth of centlyin widespread use. In describing CPLDsdiscussed so far, but we estimate the CPLD market is the conversion of ?ach device, we list its capacity in two- a capacity of 1,600to 2,800 equivalent designs that consist of multiple SPLDs nput NAND gates as given by the ven- gates.Containing relatively few I/O pins, into a smaller number of CPLDs. lor. Gate count is an especially the largestPEEL Array comes in a 4C-pin CPLDscan realize complex designs contentious issue in the FPGAindustry, package. Since they do not consist of such as graphics, LAN, and cache con- and so the numbers given should not SPLD-likeblocks, PEELArrays do not fit trollers. As a rule of thumb, circuits that 3e taken too seriously. In fact, wags

SUMMER 1996 51 LD - P R 0 G R A M M A B E ,D E VIC E S

/ 2,000 to more t.han, 15;0OO~equ&aleht gates.The XC50OOfamilyprovid& sini- ilarfeatures at a’more attr&ctive price with some penalty inspeed.: has recently.announced an-antifuse-based FPGA family;the XC81OO:The XC8’l’OO h&many interesting features;but :sinde it is notyet in wi,despread use, wae ,d6 notdiscuss it here.. j 1’ The XC4000 features.a configurable logic block (CLB] based onlookupta- bles. A lookun table is a l-bit-wide,. rnem- ory array; the me’mbry address hnes are lo@ block inputs; ‘and the i-bit’mem- ory on~tputisthe lookup table:dutI:)ut: A lookup’table with Kinputs correspon& to a 2Kx-l-bit memary; and the,user can Figure 18. Xilinx XC4000 CLB. realize!any K-input logic, function by programming thelog@ function% truth table directly into the memory. ‘1-nthe Vertical configuration shown in .Fig.ure 18, an XC4000:CLB contains two four-input lookuptables fedlby CLB inputs, and a third lookup table fed,by the’other two. ’ Thisarrangement alloivs the,CLB to im- plement a wide range of logicfunctions Length 1 of up to nine inputs, two separate four- I wires input functions, or,other possibi,lities. Length 2 Each CLB also containstwo flipflops. I wires The~XC4000.chips~havefeamres de- Long signedto.support the integration of en- wires tire systems. For:instwde;ieach-CLB contains circuitrythat allows it to effi- cientlyperform,arithmetie (th*atis,a dir- cuit. that implements -a fast carry Figure 19. Xihnx XC4000 wire segments. operation for add.er-like circuits).: Also, users can-configure the hookup tables as read/write RAM ,,eells..A new addi- have coined the term “dog gates,” a ref- Xilinx FPGAs.Xilinx FPGAs have an tion, the 4000E allows-configuration as erence to the often-cited ratio between array-based structure, each chip com- a dual-port RAM with a singlewrite and human and dog years, to indicate the prising a two-dimensional array of log- tworread,ports, and RAM.blo&kscan be dubiousness of vendors’ figures. 
ic blocks interconnected by horizontal synchronous RAM.:Each 1XC-4000~chip The two basic categories of FPGAson and vertical routing channels (see includes very wideAND.planes around the market today are SRAM- and anti- Figure 2). Xilinx introduced the first the periphery of the-logic blodkarrayto fuse-based FPGAs.In the first category, FPGAseries, the XC2000,in about 1985 facilitate implementation of’ circuit Xilinx and Altera lead in number of and now offers three more generations: blocks such as wide,decoders., users, their major competitor being XC3000,XC4000, and XC5000.Although Besidesits logic,‘the other key feature AT&T. For antifuse-based products, the XC3000devices are still widely used, thatdistinguish,es’:.an FPGA-is its inter- Actel, Quicklogic, and Cypress are the we focus on the more recent and more connect structure.:.Morizohtakand ver- leading manufacturers. popularXC4000 family. The XC4000de- tical;channels characterize-the XC4000 vices range in capacity from about interconnect.!Eaoh channel,contiains

short wire segments that span a single CLB (the number of segments in each channel varies for each member of the XC4000 family), longer segments that span two CLBs, and very long segments that span the chip’s entire length or width. Programmable switches are available (see Figure 5) to connect CLB inputs and outputs to the wire segments or to connect one wire segment to another. A small section of an XC4000 routing channel appears in Figure 19. The figure shows only the wire segments in a horizontal channel; it omits the vertical routing channels, CLB inputs and outputs, and the routing switches. An important point about the Xilinx interconnect is that signals must pass through switches to reach one CLB from another, and the total number of switches traversed depends on the particular set of wire segments used. Thus, an implemented circuit’s speed performance depends in part on how CAD tools allocate the wire segments to individual signals.

Altera Flex 8000 and Flex 10K. Altera’s Flex 8000 series combines FPGA and CPLD technologies. The devices consist of a three-level hierarchy much like that of CPLDs. However, the lowest level of the hierarchy is a set of lookup tables, rather than an SPLD-like block, and so we categorize the Flex 8000 as an FPGA. The SRAM-based Flex 8000 features a four-input lookup table as its basic logic block. Logic capacity of the 8000 series ranges from about 4,000 to more than 15,000 gates.

Figure 20. Altera Flex 8000 architecture.

Figure 20 illustrates the overall Flex 8000 architecture. The basic logic block, called a logic element, contains a four-input lookup table, a flip-flop, and special-purpose carry circuitry for arithmetic circuits (similar to the Xilinx XC4000). The logic element also includes cascade circuitry that allows efficient implementation of wide AND functions. Figure 21 shows details of the logic element.

Figure 21. Flex 8000 logic element.

This design groups logic elements into sets of eight, called logic array blocks (a term borrowed from Altera’s CPLDs). As shown in Figure 22, each logic array block contains local interconnection, and each local wire can connect any logic element to any other logic element within the same logic array block. The local interconnect also connects to the Flex 8000’s FastTrack global interconnect. Like the long wires in the Xilinx XC4000, each FastTrack wire extends the full width or height of the device. However, a major difference between Flex 8000 and Xilinx chips is that FastTrack consists only of long lines, making the Flex 8000 easy for CAD tools to configure automatically. All FastTrack horizontal wires are identical. Therefore, interconnect delays in the Flex 8000 are more predictable than in FPGAs that employ many shorter segments because the longer paths contain fewer programmable switches. Moreover, connections between horizontal and vertical lines pass through active buffers, further enhancing predictability.

Figure 22. Flex 8000 logic array block.

The Flex 10K family offers all the Flex 8000 features with the addition of variable-size blocks of SRAM called embedded array blocks. As Figure 23 shows, each row of a Flex 10K chip has an embedded array block on one end. Users can configure each embedded array block to serve as an SRAM block with a variable aspect ratio: 256x8, 512x4, 1Kx2, or 2Kx1. Alternatively, CAD tools can configure an embedded array block to implement a complex logic circuit, such as a multiplier, by employing it as a large, multioutput lookup table. Altera CAD tools provide several macrofunctions that implement useful logic circuits in embedded array blocks. Counting embedded array blocks as logic gates, Flex 10K offers the highest logic capacity of any FPGA, although obtaining an accurate number is difficult.

Figure 23. Altera Flex 10K architecture.

AT&T ORCA. AT&T’s SRAM-based FPGAs, called Optimized Reconfigurable Cell Arrays (ORCAs), feature an overall structure similar to that of Xilinx FPGAs. The ORCA logic block contains an array of programmable-function units (Figure 24) based on lookup tables. A programmable-function unit is unique among lookup-table-based logic blocks: It is configurable as four 4-input lookup tables, two 5-input lookup tables, or one 6-input lookup table. A key element of this architecture is that when the programmable-function unit [...] units based on the original ORCA architecture.

Figure 24. AT&T ORCA programmable-function unit.
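The Flex 10K aspect ratios and the ORCA programmable-function-unit modes listed above can be checked with a little arithmetic. The sketch below is illustrative only: it assumes that a k-input lookup table stores one bit per input combination (2^k bits) and that "1K" and "2K" mean 1,024 and 2,048 words. The helper names (`lut_bits`, `total_bits`, `eval_lut`) are hypothetical and do not come from any vendor tool. Under those assumptions, each configuration uses the same number of storage bits, which suggests why one block can be reshaped in these ways.

```python
def lut_bits(inputs: int) -> int:
    """Bits needed by one lookup table: one stored bit per input combination."""
    return 2 ** inputs


def total_bits(count: int, inputs: int) -> int:
    """Bits needed by `count` lookup tables of `inputs` inputs each."""
    return count * lut_bits(inputs)


# ORCA programmable-function-unit modes from the text: (number of LUTs, inputs per LUT)
PFU_MODES = [(4, 4), (2, 5), (1, 6)]

# Flex 10K embedded-array-block aspect ratios from the text: (words, bits per word)
# Assumption: 1K = 1024 words, 2K = 2048 words.
EAB_SHAPES = [(256, 8), (512, 4), (1024, 2), (2048, 1)]

pfu_totals = [total_bits(n, k) for n, k in PFU_MODES]
eab_totals = [words * width for words, width in EAB_SHAPES]


def eval_lut(table_bits, inputs):
    """Evaluate a lookup table: the input bits form an address into the stored truth table."""
    address = 0
    for bit in inputs:
        address = (address << 1) | bit
    return table_bits[address]


# A four-input AND stored as a 16-entry truth table: only address 15 (all ones) holds a 1.
and4 = [0] * 15 + [1]

if __name__ == "__main__":
    print("PFU bits per mode:", pfu_totals)
    print("EAB bits per shape:", eab_totals)
    print("AND(1,1,1,1) =", eval_lut(and4, [1, 1, 1, 1]))
    print("AND(1,0,1,1) =", eval_lut(and4, [1, 0, 1, 1]))
```

The same `eval_lut` idea extends to a multioutput lookup table by storing several bits per address, which is how the text describes an embedded array block being used for a circuit such as a multiplier.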

Actel FPGAs. Actel offers three main FPGA families: Act 1, Act 2, and Act 3. Although the three generations have similar features, we focus on the most recent devices. Unlike the FPGAs described so far, Actel’s devices use antifuse technology and a structure similar to traditional gate arrays. Their design arranges logic blocks in rows with horizontal routing channels between adjacent rows (Figure 25). Actel logic blocks, based on multiplexers, are small compared to those based on lookup tables. Figure 26 illustrates the Act 3 logic block, which consists of an AND and an OR gate connected to a multiplexer-based circuit block. In combination with the two logic gates, the arrangement of the multiplexer circuit enables a single logic block to realize a wide range of functions. About half the logic blocks in an Act 3 device also contain a flip-flop.

Figure 25. Actel FPGA structure.

Figure 26. Actel Act 3 logic module.

Actel’s horizontal routing channels consist of various-length wire segments with antifuses to connect logic blocks to wire segments or one wire to another. Although not shown in Figure 25, vertical wires also overlie the logic blocks, forming signal paths that span multiple rows. The speed performance of Actel chips is not fully predictable because the number of antifuses traversed by a signal depends on how CAD tools allocate the wire segments during circuit implementation. However, a rich selection of wire segment lengths in each channel and algorithms that guarantee strict limits on the number of antifuses traversed by any two-point connection improve speed performance significantly.

Quicklogic pASIC. Actel’s main competitor in antifuse-based FPGAs is Quicklogic, which has two device families, pASIC and pASIC2. The pASIC, illustrated in Figure 27a, has similarities to several other FPGAs: Like Xilinx FPGAs, it has an array-based structure; like Actel FPGAs, its logic blocks use multiplexers; and like Altera Flex 8000s, its interconnect consists only of long lines. The pASIC2 is a recently introduced enhanced version, which we will not discuss here. Cypress also offers devices using the pASIC architecture, but we discuss only Quicklogic’s version.

Figure 27. Quicklogic pASIC structure (a) and ViaLink (b).

Quicklogic’s ViaLink antifuse structure (see Figure 27b) consists of a metal top layer, an amorphous silicon insulating layer, and a metal bottom layer. Compared to Actel’s PLICE antifuse, ViaLink offers very low on-resistance, about 50 ohms (PLICE’s is about 300 ohms), and a low parasitic capacitance. ViaLink antifuses are present at every crossing of logic block pins and interconnect wires, providing generous connectivity. Figure 28 shows the pASIC multiplexer-based logic block. It is more complex than Actel’s logic module, with more inputs and wide (six-input) AND gates on the multiplexer select lines. Every logic block also contains a flip-flop.

Figure 28. Quicklogic pASIC logic cell.

FPGA applications. FPGAs have gained rapid acceptance over the past decade because users can apply them to a wide range of applications: random logic, integrating multiple SPLDs, device controllers, communication encoding and filtering, small- to medium-size systems with SRAM blocks, and many more.

Another interesting FPGA application is prototyping designs to be implemented in gate arrays by using one or more large FPGAs. (A large FPGA corresponds to a small gate array in terms of capacity.) Still another application is the emulation of entire large hardware systems via the use of many interconnected FPGAs. QuickTurn and others have developed products consisting of the FPGAs and software necessary to partition and map circuits for hardware emulation.

An application only beginning development is the use of FPGAs as custom computing machines. This involves using the programmable parts to execute software, rather than compiling the software for execution on a regular CPU. For information, we refer readers to the proceedings of the IEEE Workshop on FPGAs for Custom Computing Machines, held for the last four years [5].

As mentioned earlier, pieces of designs often map naturally to the SPLD-like blocks of CPLDs. However, designs mapped into an FPGA break up into logic-block-size pieces distributed through an area of the FPGA. Depending on the FPGA’s interconnect structure, the logic block interconnections may produce delays. Thus, FPGA performance often depends more on how CAD tools map circuits into the chip than does CPLD performance.

The low cost of FPDs makes them attractive to small firms and large companies alike. Their fast manufacturing turnaround is an essential element of their market success. Although their large, slow programmable switches prevent FPDs from providing the speed performance and logic capacity of MPGAs, improvements in architecture and CAD tools will overcome these disadvantages. Over time FPDs will become the dominant technology for implementing digital circuits.

Acknowledgments
We acknowledge students, colleagues, and acquaintances in industry who have contributed to our knowledge.

References
1. E. Hamdy et al., “Dielectric-Based Antifuse for Logic and Memory ICs,” Tech. Digest IEEE Int’l Electron Devices Meeting, IEEE, Piscataway, N.J., 1988, pp. 786-789.
2. J. Birkner et al., “A Very-High-Speed Field-Programmable Gate Array Using Metal-to-Metal Antifuse Programmable Elements,” Microelectronics J., Vol. 23, 1992, pp. 561-568.
3. D. Marple and L. Cooke, “Programming Antifuses in CrossPoint’s FPGA,” Proc. IEEE Int’l Custom Integrated Circuits Conf., IEEE, Piscataway, N.J., 1994, pp. 185-188.
4. H. Wolff, “How QuickTurn Is Filling the Gap,” Electronics, Apr. 1990.
5. Proc. IEEE Symp. FPGAs for Custom Computing Machines, IEEE Computer Society Press, Los Alamitos, Calif., 1993-1996.

Suggested reading
S. Brown et al., Field-Programmable Gate Arrays, Kluwer Academic Publishers, Norwell, Mass., 1992. A general introduction to FPGAs.

J. Oldfield and R. Dorf, Field Programmable Gate Arrays, John Wiley & Sons, New York, 1995. A textbook-like treatment, including digital logic design based on the Xilinx 3000 series and the Algotronix CAL chip.

J. Rose, A. El Gamal, and A. Sangiovanni-Vincentelli, “Architecture of Field-Programmable Gate Arrays,” Proc. IEEE, Vol. 81, No. 7, July 1993, pp. 1013-1029. Detailed discussion of architectural trade-offs.

Field-Programmable Gate Array Technology, S. Trimberger, ed., Kluwer Academic Publishers, Norwell, Mass., 1994. Discussion of three FPGA/CPLD architectures.

Up-to-date FPD research appears in the published proceedings of several conferences:
Proc. IEEE Int’l Custom Integrated Circuits Conf., IEEE.
Proc. Int’l Conf. Computer-Aided Design (ICCAD), IEEE CS Press, Los Alamitos, Calif.

Proc. Design Automation Conference (DAC), IEEE CS Press.
FPGA Symp. Series: Third Int’l ACM Symp. Field-Programmable Gate Arrays (FPGA 95) and Fourth Int’l ACM Symp. Field-Programmable Gate Arrays (FPGA 96), Assoc. for Computing Machinery, New York.

Stephen Brown is an assistant professor of electrical and computer engineering at the University of Toronto. He holds a PhD in electrical engineering from that university; his dissertation (on architecture and CAD for FPGAs) won him the Canadian NSERC’s 1992 prize for the best doctoral thesis in Canada. In 1990, the International Conference on Computer-Aided Design awarded him and coauthor Jonathan Rose a Best Paper award. A coauthor of the book Field-Programmable Gate Arrays, he has also won four awards for excellence in teaching electrical engineering, computer engineering, and computer science courses. Brown is the general and program chair for the Fourth Canadian Workshop on Field-Programmable Devices (FPD 96), and is on the Technical Program Committee for the Sixth International Workshop on Field-Programmable Logic (FPL 96). He is a member of the IEEE and the Computer Society.

Jonathan Rose is an associate professor of electrical and computer engineering at the University of Toronto. His research interests are in the area of architecture and CAD for field-programmable gate arrays and systems. He coauthored the book Field-Programmable Gate Arrays. Rose holds a PhD in electrical engineering from the University of Toronto. He is the general chair of the Fourth International Symposium on FPGAs (FPGA 96) and serves on the technical program committee for the Sixth International Workshop on Field-Programmable Logic. In 1990, ICCAD awarded him and coauthor Stephen Brown a Best Paper award. He is a member of the IEEE, the Computer Society, the Association for Computing Machinery, and SIGDA.

Direct questions concerning this article to Stephen Brown, Dept. of Electrical and Computer Engineering, Univ. of Toronto, 10 Kings College Rd., Toronto, ONT, Canada M5S 3G4; [email protected].

