TABLE OF CONTENTS

Page 3 - Hard-Wired Forth: Harris's RTX-2000 (Embedded Systems Programming, February 1989).

Page 13 - A New Breed of Microcontroller (Embedded Systems Programming, March 1989).

Page 15 - Three RTX Development Systems (Embedded Systems Programming, August 1989).


BY ERNEST L. MEYER

Reprinted from Embedded Systems Programming, February 1989. Miller Freeman Publications. All rights reserved.

C is fashionable these days, but for some real-time tasks Forth could be a better choice. In fact, rumor has it that a number of C programmers actually write C code during the day and then rewrite the functions in Forth at night. The managers think the code is in C, but the programmers can get the application working much more easily using Forth.

With the advent of Forth-based microcontrollers, programmers no longer need to delude management in such a devious manner. The RTX microcontroller family, now available in production quantities from Harris Semiconductor, uses a subset of Forth instructions as op codes. Because there's no intermediate assembly language between the high-level constructs and the final machine code, the Forth instructions written by the programmer have direct and predictable machine-code equivalences.

Not only does this simplify the construction of efficient real-time routines, but once it's decided that the RTX really is the best choice for an application, the programmer is free to write the application code in Forth.

The microcontroller itself can be customized in hardware for a wide range of specific application requirements. Additional ALUs, multipliers, registers, and stacks can be added right inside the microcontroller's main data paths. Alternatively, on-chip stacks could be extended or a [...] or UART added; D/A and A/D interfaces could even be added on-chip, although the analog circuitry would make the chip somewhat more difficult to manufacture.

EMBEDDED SYSTEMS PROGRAMMING • FEBRUARY 1989 Page 3

Harris Semiconductor's family of RTX microcontrollers uses a subset of Forth instructions as op codes. There's no intermediate assembly language, so Forth instructions written by the programmer have direct and predictable machine-code equivalences.


FORTH-BASED HARDWARE ARCHITECTURE

In the abstract, a chip with a subset of Forth instructions as op codes has a number of advantages for real-time embedded applications. For example, think of the benefits Forth offers for real-time tasks. It's fully reentrant. Direct control of the data and instruction stacks allows more intimate contact with branches, conditional loops, and interrupt functions than other languages usually offer. And reverse Polish notation, because it more closely represents the actual activity of the processor, allows the designer to think more clearly about what's happening than do the "easier" code- and data-ordering mechanisms used by other high-level languages.

On the other hand, no chip will ever reach the ideal, however close it manages to approach it. A chip that uses Forth instructions as op codes is bound to have some drawbacks. Most obviously, it's not the ideal choice for other high-level languages. With a more generic architecture, it's certainly possible to use different high-level languages for different applications and still use the same chip. Although cross-compilers will no doubt become available for the Forth chip at some stage, the chip is unlikely to offer the same level of performance with other high-level languages as it does with Forth.

IRRESISTIBLE FEATURES

At a pragmatic level, the 16-bit RTX microprocessor is not cheap, with a unit cost of $190 in 1,000-piece quantities and a price tag of $3,000 for the complete development system (not including the host computer). And the clock speed, at 10 MHz, is comparatively low.

Other manufacturers offer 16-bit microcontrollers with similar clock speeds at a substantially lower cost. Higher-speed microcontrollers are also available at a debatably lower cost. And Forth compilers can be used for the more generic hardware.

Even if the compiled Forth code uses twice as many clock cycles on a generic 25-MHz microcontroller as would the same application on the 10-MHz RTX microcontroller, the final system still works faster with a generic processor running at a significantly higher clock speed.

The RTX microcontroller does offer four advantages over generic microcontrollers (predictability, code language uniformity, lower system speed, and potential customization) that can offset the disadvantages of high cost and reduced performance when used with other high-level languages.

First, because high-level instructions with direct machine-code correspondences are used, it's much easier for the code developer to determine how long a task will take. This is particularly important in real-time applications, where operations must be completed within strict time slots.

Second, the developer can use the same language for critical functions as for the rest of the code. Since applications tend to use 10% of the code 90% of the time, the critical sections are usually hand-coded in assembler. If the machine code is simply a subset of the high-level code, however, there's no need to link routines in different languages. Writing all the code in Forth also simplifies debugging.

Unfortunately, these advantages mean very little to hardware designers.
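The clock-speed comparison made under IRRESISTIBLE FEATURES can be checked with a little arithmetic. The following is only a sketch in standard Forth for a host system. The word names and the 100-cycle routine length are assumptions chosen for illustration; only the 10-MHz and 25-MHz clock rates and the "twice as many clock cycles" ratio come from the article.

```forth
\ Hypothetical helper words; only the clock rates come from the article.
: RTX-NS      ( cycles -- ns )  100 * ;      \ 10 MHz: 100 ns per cycle
: GENERIC-NS  ( cycles -- ns )  2* 40 * ;    \ 25 MHz: 40 ns per cycle, twice the cycles

100 RTX-NS .       \ prints 10000 (ns on the RTX for a 100-cycle routine)
100 GENERIC-NS .   \ prints 8000  (ns for the same work on the generic part)
```

Under these assumptions the generic part finishes in 80% of the RTX's time, which is the article's point: the RTX has to win on predictability, language uniformity, system speed, and customization rather than on raw throughput.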

Luckily, two other benefits offered by the RTX methodology should sway them. First, the use of a 10-MHz rather than 25-MHz microcontroller results in lower system speed, so the other chips in the system don't need to run at ultra-high speeds. And since cache memories aren't necessary at 10 MHz, system cost is lower.

Lower-speed components also simplify board design. Even CMOS becomes difficult to handle at speeds over 20 MHz, when transmission line effects can cause erroneous device triggering and corners in the wire traces can cause signal reflections. The hardware designer will therefore prefer a lower-speed processor from the standpoints of board design effort and overall system cost, even if the processor itself is more expensive.

A second advantage for hardware designers is that the structure of the RTX chip permits them to customize hardware for particular requirements by adding functional units to the microcontroller core. The functions can then be accessed directly as an integral part of the microcontroller.

Figure 1 is a high-level diagram of the chip core. The user can hang additional functional units on the ASIC bus (visible on the right side of the figure). If speed isn't a critical issue, functional units can be hung on the ASIC bus off-chip. Up to eight devices can be directly addressed on the ASIC bus via address lines A0-A2. A full bidirectional 16-bit data bus is then available between the additional functional units on the ASIC bus and the microcontroller core.

Figure 1 High-level diagram of RTX microcontroller core. (Labeled blocks include the MULH and MULR registers, the NEXT register, the 255-word-deep data stack and its stack pointer, the scratchpad register, the interrupt vector and interrupt base, the data, code, and user page registers, the index base register, and the 255-word return stack and its stack pointer.)

If speed is a paramount issue and cost isn't a severe limitation, the additional resources can be placed on-chip. Although this raises chip cost substantially, the added peripheral functionality works about an order of magnitude faster. System cost may also be lower due to the higher degree of functional integration, which decreases the necessary PCB real estate for the system and increases system reliability.

Of course, a similar hardware customization approach is available from other vendors. Two examples are the HPC core from National Semiconductor and the superintegrated products from Zilog. The register-based structure of Forth nevertheless yields a somewhat higher ratio of performance improvement with RTX customization than with most other processor architectures, where added functionality on-chip can't be treated as simply an integral part of the processor itself.

WHY IS THE HARDWARE THE WAY IT IS?

Let's take a closer look at the RTX architecture shown in Figure 1. The chip revolves around registers and stacks; there's no cache, and pipelines are minimized.

Instructions that don't go off-chip are either register-to-stack or stack-to-register. All such on-chip operations, with the exception of complete multiplies, take a single clock cycle (100 nsec). Operations that manipulate off-chip data and hardware resources take two clock cycles.

Virtually all the registers, buses, and stack words on the chip are a full 16 bits long. There are two stacks in the chip hardware, both of which are 255 words deep. The stack limit and stack pointer, both eight-bit values that indicate stack status, are available in separate registers. Since the stack limit register is writable, stack depth can be expanded beyond 255 words in off-chip memory. The top two items on the data stack, TOP and NEXT, are addressable as registers. The single top item on the address stack, INDEX, is also addressable as a register.

The ALU, at the top of the figure, always stores its results in the TOP register. When there are consecutive ALU operations, TOP is pushed into NEXT. After any ALU operation, TOP can be transferred to any register except the multiplier. The ALU can perform adds, subtracts, shifts, and Boolean operations; a version of the RTX microcontroller with an enhanced ALU that supports additional single-cycle operations, such as ROTATE, is in the works.

Data is always shifted on- and off-chip through the NEXT register. Not shown in the figure is the SWAP unit, which can swap the first and second bytes in the NEXT register at any time without cycle overhead. As a result, the chip can be used with both "big endian" and "little endian" memory architectures, permitting compatibility with both Motorola- and Intel-like memory-addressing schemes.

Multiplies and divides can be performed by loading the two multiplicands sequentially into the multiplier unit from the NEXT register. The least significant byte of the 32-bit result is readable from the MULH register in the next clock cycle. The upper eight bits of the result can then be read in the subsequent clock cycle by pushing the contents of MULR into TOP. (The old contents of TOP, the eight MSBs of the result, are automatically pushed into NEXT.) Two scratchpad registers are available for storing intermediate values in divide and negative-exponent calculations.

The chip also contains three counters that can function in various modes. The counters can be read independently into TOP and can generate a separate maskable interrupt. Interrupts from each counter shift program control to an interrupt routine pointed to by separate interrupt vectors. The vector for the current interrupt level is readable in the interrupt vector register; it can be changed by writing to that register. Because interrupt vectors are offsets from the address stored in the interrupt base register, counter interrupts can have different effects at different program levels (ensuring a reentrant program).

The five page registers expand the addressable memory space to 1 Mbyte. Four of the 16 nonoverlapping pages are directly addressable at any one time for code, data, user, and interrupt base addresses. A five-bit index base register contains the five MSBs of the index register address, expanding the addressable return stack space to 21 bits.

The instruction address is stored in the program counter. The instruction is then loaded into the instruction register, decoded, and executed. If no branches or loops intervene, the address is incremented and stored in the program counter. Loop counts and branches can be stored in the return stack; when a
loop is executing, the index register automatically decrements the remaining loop count. Eight-bit stack limit and stack pointer registers are also available for the return stack.

MAXIMIZING CODE PERFORMANCE

The architecture of the RTX microcontroller is designed to allow the execution of as many instructions as possible in one clock cycle. Unfortunately, it's not always possible to complete all the instructions within the 100-nsec duration of one clock cycle. In fact, there are three kinds of long instructions: off-chip reads and writes, ALU-dependent instructions, and multiply operations. The chip's internal hardware is organized to permit parallel instruction execution in these cases. By understanding how to use the chip's internal parallelism, you can write highly efficient code.

Figure 2 is a state diagram for the chip's instruction execution sequence. Each 100-nsec clock cycle is divided into two 50-nsec machine cycles separated by the rising and falling edges of the clock pulse. Only one off-chip data access (instruction fetch, data read, or data write) can take place in each cycle.

Figure 2 Each state bubble takes 1/2 clock cycle to execute. (State labels include "Latch instruction into instruction register and decode" and "OFF-CHIP DATA ACCESS.")

Instruction decoding always takes half a clock cycle, and nothing else can happen at the same time. At the left of the state diagram is one instruction decode operation. If the instruction that's decoded in the first half of the clock cycle doesn't require an off-chip data read or write operation and isn't a multiply operation, it can probably be completed in the second half of the cycle. At the same time, the next instruction can be fetched from off-chip and be ready for decoding in the next clock cycle.

It's clear that the chip does in fact have a two-stage pipeline for instruction prefetch, despite company literature that claims otherwise. Since the pipe is no more than a clock cycle in length, however, the impact of this pipeline isn't as great as that of the seven-stage pipelines found in some of the more complex CISC cores.

ON- AND OFF-CHIP OPERATIONS

Since instructions can be fetched and executed at the same time, at least two parallel operations can be executed in one clock cycle. Indeed, because there are four buses inside the chip, it's possible to have up to four instructions working in parallel. Data can move on all four buses simultaneously.

As mentioned earlier, on-chip operations are either stack-to-register or register-to-stack. Because there are two stacks, a write to one stack can occur in parallel with a read to the other stack. Two off-chip data buses (the main data bus and the ASIC bus) make it possible to move data to or from the two stacks and on- or off-chip, all at the same time.

For example, it's possible to concurrently fetch a data item from a peripheral on the ASIC bus, push it onto the data stack, force a subroutine return (which pops a return address into the program counter), and fetch the next instruction. (These instructions are actually executed in parallel by the Forth word 8 G@ ;)

This leads to the first rule of RTX programming: try to alternate use of the two stacks and two data buses to ensure the highest possible level of task concurrency. The direct correspondence of instructions to the actual machine code used by the RTX microcontroller makes it relatively easy to improve task concurrency in the time-critical regions of an application.

We can see that many internal instructions are highly "parallelizable." There are, however, limitations to the degree of parallelism that can be attained. For example, remember how an instruction fetch and a memory read or write can't occur in the same clock cycle? Because of this, if the instruction involves a memory read or write, then the total execution time will be two clock cycles. However, the next instruction can be fetched during the second clock cycle, during which the fetched data can be processed (as shown in Figure 2).

This leads to a second rule: only fetch data when it's going to be processed immediately by the ALU. Just fetching data and flinging it onto the stack for future use wastes hardware parallelism. It's best to keep a static variable out of the stack until it's needed for some arithmetical manipulation by the next immediate instruction. That way, use of the instruction prefetch pipeline is maximized.

MULTIPLIES

Multiplies are special cases that don't obey these generic rules. The company literature states that the chip takes one clock cycle to perform a multiply. This is true, but additional clock cycles are needed
to load the multiplier and read the results.

To perform a multiply, the programmer loads the two multiplicands into the MULH and MULR registers from the NEXT register. The eight LSBs of the result can then be transferred into TOP in the next cycle by the instruction READML. As noted earlier, the eight LSBs are pushed into NEXT and the eight MSBs are pushed into TOP in the following clock cycle. This can be accomplished using the instruction READMH.

A full multiply therefore takes four clock cycles if the multiplicands are already resident on-chip. If one multiplicand is fetched from off-chip, a full multiply takes five clock cycles. If both multiplicands are fetched from off-chip, a full multiply takes six clock cycles. If one of the multiplicands is already resident in the multiplier registers and the other multiplicand is resident in a scratch register or stack on-chip, the multiply can be completed in three clock cycles.

This disparity in multiplier performance between operations for on-chip and off-chip data is especially important when using a data item from main memory as one multiplicand and on-chip data as the other. It makes sense to load the on-chip multiplicand into the multiplier first. The multiply operation can then be carried out in the next clock cycle rather than waiting for the next instruction to load the multiplier with the on-chip multiplicand (a practice that degrades on-chip parallelism).

CODE DEVELOPMENT SUPPORT

Harris supplies a development system for the RTX called the RTXDS, which runs on an XT, AT, or compatible and a development board. The system has a real-time symbolic monitor for debugging, a disassembler, a DOS file interface utility, a Forth-83 compiler for the 8086/286, and an RTX cross-compiler. Code is written in the 80xx environment and cross-compiled onto the RTX board.

Code development is enhanced by the availability of function libraries. The development system includes a library of words for basic I/O manipulation; a library of Forth words for complex arithmetic functions, including transcendentals, should be available by press time. Harris is also preparing a library of advanced functions, such as vector and Fourier transforms, that should be available over the next year.

Debugging Forth code using the RTX processor differs from using a generic processor in that the RTX contains no trace or single-step utilities. However, since all the internal registers are static, you can stop program execution at any time and examine the contents of the registers by setting a breakpoint with the RTX monitor. During debugging, the monitor can be used for examining and changing target memory locations, registers, and I/O ports. Forth names are automatically inserted by the monitor in place of numeric memory locations.

The development board features 32 kbytes of high-speed static RAM that can be used with the chip for real-time code execution. System control resides in 16 kbytes of EPROM; there are two sockets for 16-kbyte EPROMs as well as a breadboarding area and facilities for adding I/O ports for test purposes. This permits the addition of physical memory maps, wait state generation circuitry, and I/O functions that fully emulate the target hardware.

USING THE RTX MICROCONTROLLER

RTX microcontrollers are being used for medical image scanning, to control robotic arms in cannery lines, and in aerospace applications. In fact, they can perform virtually any high-performance real-time task. RTX microcontrollers are particularly effective for tasks that have a high degree of context switching and where predictability is a foremost issue.

Listing 1 illustrates the use of the RTX microcontroller for real-time tasks. The code allows the RTX to function as a UART without any additional hardware and takes up less than 1% of the available microcontroller run time.


It treats one of the ASIC bus locations as a UART running at a user-settable speed. This example illustrates the power of the RTX microcontroller without added hardware. (Although the code does use up all the on-chip timer resources, it can be restructured to use only one timer.)

Full UART functionality can also be achieved by adding a UART onto the ASIC bus. The hardware UART can then be run with the minute code portion shown in Listing 2. As you can see, the necessary code is minimal.

However, since the software-based UART uses only 1% of the processor's total run time and the code is fully reentrant, it may be less sensible to embed the UART in hardware. And the software UART only uses about 200 bytes of EPROM space.

Ernest Meyer, an independent technical writer and former editor of VLSI Systems Design, is also a contributing editor and columnist for Embedded Systems Programming.

Listing 1 A UART function can be embedded in software while using only 1% of the processor's available time.

    : SETBRG
       VALUE TIMER1 G!
       16 VALUE * 1BIT !
    \ ......
    \ UART RECEPTION
    \ ......
    \ Sample incoming data 16 times per bit.
    : TIMER1INT
       UART G@
       0001 AND IF NODATA ELSE CHECKSTARTBIT
       THEN
    \ tells us there's no start bit
    : NODATA
       0 2NDTIME !
    \ Wait before interrupting again.
    : WAIT1
       1BIT TIMER2 G!
    : CHECKSTARTBIT
       2NDTIME @ IF GETCHAR ELSE
    \ There's a start bit: enable interrupt,
    \ set a flag, and check 1/2 cycle later
    \ to recheck start bit.
       FFFF 2NDTIME G!
    \ Set timer 1 interrupt interval.
       1BIT 2/ TIMER1 G!
       THEN
    \ Fetch character from UART register and
    \ shift left: merge data into receive register.
    : GETCHAR
       UART G@
       COUNT @
       OF( 2*
       RXCHAR @ OR RXCHAR !
       COUNT @ 1 +
       DUP COUNT !
       8- IF ENDCHAR
       ELSE WAIT2
    \ Set timer 2 interrupt interval.
    : WAIT2
       1BIT TIMER2 G@
    \ Finish up the character.
    \ Disable interrupt until new character
    \ is detected.
    : ENDCHAR
       0 COUNT !
       FFDF INTMASK G!
    \ ......
    \ UART TRANSMISSION
    \ ......
    \ Prepare a word for transmission
    \ by adding stop bits, then check CTS bit.
    : TRANSWORD
       2*1+2*1*+
       TWORD !
       UART G@
       0040 AND IF NOT CLEAR ELSE TRANSMIT
       THEN
    \ Transmit a word from the top of the stack.
    : TRANSMIT
       0 UART G!
       FFFF INTMASK G!
       1BIT WAIT3
       0 TCOUNT !
    \ transmitting individual characters
    : TRANSMIT-INT
       TWORD @ DUP UART G! 2 * TWORD !
       TCOUNT @ 1 + 9 = IF DISABLE ELSE WAIT3

Listing 2 Running a discrete UART in on-chip or off-chip hardware takes far less code.

    \ Initialize and clear data port.
    : INIT
       PARAM1 UCMD G!
       PARAM2 UCMD G!
       UDAT G@ DROP
    \ Read status port. ( -- n )
    : @STATUS
       UCMD G@
    \ Poll for and read character
    : RECEIVE ( -- n )
       BEGIN @STATUS RRDY AND UNTIL
       UDAT G@ 007F AND
    \ Transmit when ready.
    : TRANSMIT
       BEGIN
       @STATUS TXRDY AND UNTIL
       UDATA G!
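Listing 1 samples the incoming line 16 times per bit, so the timer interval has to be derived from the baud rate. The sketch below shows one way that arithmetic might look in standard Forth on a host system. It is not taken from the RTX code: the word names are hypothetical, the timer is assumed to tick at the 10-MHz processor clock, and cells are assumed to be at least 32 bits wide (a 16-bit Forth would need double-cell arithmetic).

```forth
\ Assumptions (not from the article): the timer ticks at the 10-MHz CPU
\ clock, cells are at least 32 bits, and every word name is hypothetical.
10000000 CONSTANT CLOCK-HZ
16 CONSTANT SAMPLES/BIT              \ the listing's 16 samples per bit

: TICKS/SAMPLE ( baud -- ticks )  SAMPLES/BIT *  CLOCK-HZ SWAP / ;
: TICKS/BIT    ( baud -- ticks )  TICKS/SAMPLE SAMPLES/BIT * ;

9600 TICKS/SAMPLE .   \ prints 65   (timer interval between samples)
9600 TICKS/BIT .      \ prints 1040 (one bit time as 16 sample intervals)
```

Because the division truncates, the 1040-tick bit time is slightly shorter than the exact 1041.67 ticks at 9600 baud, an error of under 0.2% under these assumptions.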

WHEN IN ROM
by Ray Duncan

A New Breed of Microcontroller

Reprinted from Embedded Systems Programming, March 1989. Miller Freeman Publications. All rights reserved.

If you're one of the lucky programmers who received the premiere issue of Embedded Systems Programming, you could hardly have missed the four-page color advertisement by Harris Semiconductor for something called the Real Time Express family of microcontrollers. This advertisement made some impressive claims for speed, interrupt response times, and power consumption of the RTX-2000, in particular, but gave few clues, other than the enigmatic phrase "innovative dual-stack architecture," as to how such impressive numbers were achieved or what the RTX-2000's relationship to other, more familiar microcontrollers might be.

The RTX-2000 bears little resemblance to any processor you're likely to be familiar with (unless you're grizzled enough to have worked on Burroughs mainframes). It's a member of that rare species: the stack machine. Regardless of your current affiliation with one processor school of thought or another (RISC or CISC, Intel or Motorola, and so on), the Harris chip is worth a closer look. It represents a radical divergence from the conventional wisdom on how to wring more computing power and shorter development cycles out of the hardware/software duo.

Of course, almost any CPU will support a stack these days, but a conventional CPU's only real need for a stack is to save return addresses and register contents during execution of a subroutine or interrupt handler. The stack typically doesn't participate in arithmetic/logical operations; operands for such instructions are taken from registers or memory, and the results are returned to registers or memory. Although some high-level languages, such as C, also use the stack for parameter passing, this is only a convention; stack usage may vary drastically from one compiler to another.

In a true stack machine, the stack is the center of the action rather than an auxiliary area of storage. Operands for arithmetic/logical operations are always taken from the stack, and results are returned to the stack. (If you've ever programmed an Intel 80x87 numeric coprocessor or used an HP calculator, you're already familiar with this concept.) Such machines typically have few or no registers in the traditional sense (that is, registers used as accumulators or indexes), although special-purpose registers to control I/O and to point to the base of the stack and the current instruction are still needed.

The RTX-2000 actually carries the stack-machine concept a bit further by supporting two hardware stacks, the parameter and the return. The parameter stack is used for the operands and results of arithmetic/logical operations, as already described. It's also the source or destination of any instruction that accesses main memory. That is, a value in main memory must be loaded onto the parameter stack before it can be manipulated, and a value can only be written into main memory from the top of the parameter stack. The return stack holds loop counters and return addresses during execution of subroutines and occasionally provides temporary storage for other data.

What else is unusual about the Harris chip? For starters, the RTX-2000 is not a classic Von Neumann machine. It has three distinct address spaces: main memory, which can hold either code or data, and the two stacks, each of which is 256 words deep and has its own address and data buses. The stacks can't be positioned at arbitrary addresses in main memory as they can in conventional CPUs, and arbitrary stack locations can't be accessed with memory fetch and store instructions. This idiosyncrasy has interesting implications that you can appreciate only after you begin to code for such a CPU and discover that your faithful old programming idioms (or cliches) no longer work!

Another novel aspect of the RTX-2000, at least in the microcontroller world, is its use of a sort of horizontal microcoding scheme. The chip has five internal buses that carry data between the tops of the two stacks, the instruction pointer, the ALU, the multiplier, the shifter, and so on. These buses are controlled by separate bit fields in the op codes for arithmetic/logical instructions. Consequently, it's often possible to merge two operations into the same instruction and thus into the same machine cycle. For example, you can duplicate the number on top of the parameter stack and then divide the new top of the stack by two.
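In Forth source, the duplicate-then-halve example above is simply the word pair DUP 2/. The sketch below shows it in standard Forth on a host system, together with the rule that memory values must pass through the parameter stack. The names HALF-COPY and SENSOR are made up for illustration. On a host Forth the pair executes as two separate words, and the merging into one instruction described above is specific to the RTX hardware.

```forth
\ HALF-COPY and SENSOR are hypothetical names; DUP, 2/, @ and ! are standard.
: HALF-COPY ( n -- n n/2 )  DUP 2/ ;   \ duplicate the top item, then halve the copy

10 HALF-COPY . .        \ prints 5 then 10 (top of stack is printed first)

VARIABLE SENSOR         \ a cell in main memory
12 SENSOR !             \ store: the value is written from the top of the stack
SENSOR @ 3 + SENSOR !   \ fetch onto the stack, add 3, store the result back
SENSOR @ .              \ prints 15
```

Nothing in this fragment touches memory except through @ and !, which mirrors the article's point that the parameter stack is the source or destination of every memory access.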

EMBEDDED SYSTEMS PROGRAMMING • MARCH 1989 Page 13 WHEN IN ROM

This horizontal microcoding also allows the Harris chip to have a one-cycle overhead for invocation of a subroutine. This might sound like preposterous marketing hype, but it turns out to be on the level. The call to a subroutine is, of course, a distinct machine instruction and requires a complete machine cycle, but the return from the subroutine comes for free. How is this possible? All machine instructions (other than the call instruction itself) contain a reserved "return" bit field. If the bit is clear, the instruction pointer is incremented during execution of the op code in the normal manner. If the bit is set, the return stack is popped into the instruction pointer in parallel with whatever else the instruction is supposed to be doing, at no additional cost in execution time. Thus, the return from any subroutine can simply be folded into the last instruction of that subroutine.

By this point, those of you who know something about Forth are probably thinking that this description of the Harris chip has a familiar ring to it. That's not too surprising, since the lineage of the RTX-2000 can be traced directly to Charles Moore, who invented the Forth programming language in the late 1960s. Moore was one of the founders of FORTH Inc., the flagship software house for Forth development tools and applications. As Forth matured and stabilized, however, his interests turned to hardware and he became increasingly dissatisfied with the mismatch between the Forth virtual machine and the architecture of conventional CPUs.

In 1980, Moore left FORTH Inc. and began to devote most of his time to experimentation with dual-stack machines. His efforts led to the formation in 1984 of a Silicon Valley start-up called Novix Inc. (funded by Sysorex International) and the design of a new processor called the NC4000 by Moore and Bob Murphy. The first working NC4000 chips, fabricated by Mostek using a three-micron HCMOS process and packaged in a 128-pin grid array, were delivered to Novix in March 1985. The chips turned out to be buggy but usable, which was fortunate since Novix was undercapitalized and overextended almost from the outset and never revised the NC4000 mask to eliminate some severe problems with interrupt handling and step-wise multiplication. Novix did manage to sell over a thousand chips before it faded into obscurity.

By a strange quirk of fate, the Novix chip caught the attention of some engineers in the CAD/CAM division of Harris, who hoped to use it as a high-speed number-crunching coprocessor in graphics applications. After looking more closely at the chip's capabilities, Harris decided to license the NC4000 design, fix the bugs, add some new capabilities, and incorporate the CPU into its standard cell library. Thus, a modified form of the Novix CPU, together with some timers, on-board high-speed stack memory, and a 16-by-16 multiplier, constitutes the heart of the Harris RTX-2000 microcontroller as we know it today. In retrospect, the most amazing part of this whole history is the remarkable open-mindedness of Harris Semiconductor when most of its competitors are rushing headlong into RISC technology.

The Forth ancestry of the RTX-2000 doesn't mean that it can only be programmed in Forth, however; quite the contrary. Recursive descent compilers for just about any modern, block-structured language generate Forth-like, postfix intermediate code at some point during translation from source code to object code, making the chip an excellent target for such compilers. Harris is beta testing a C compiler for the RTX-2000 that should be officially released by the time you read this; a PROLOG compiler is also under development. In fact, it seems safe to predict that the only kind of translator you'll never find offered for the Harris chip is an assembler. That's because Forth is, in a sense, the RTX-2000's assembler language: all of the chip's machine instructions map directly onto Forth language elements.

Ray Duncan has written columns for Dr. Dobb's Journal, Softalk/PC, and PC Magazine. He's the author of Advanced MS-DOS Programming (Redmond, Wash.: Microsoft Press, 1986) and Advanced OS/2 (Microsoft Press, 1989). He owns Laboratory Microsystems Inc. (Marina del Rey, Calif.), a software house specializing in Forth interpreters and compilers.
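The "free" subroutine return described in this column is easier to see in a toy model. The following is only a sketch under stated assumptions, not the RTX-2000 instruction set: the `Instr` class, the mnemonics (`call`, `push`, `add`, `nop`), and the one-cycle-per-instruction cost are all hypothetical simplifications chosen to show how a return bit folds the return into the last instruction of a subroutine.

```python
from dataclasses import dataclass


@dataclass
class Instr:
    op: str            # hypothetical mnemonic: "call", "push", "add", or "nop"
    arg: int = 0       # call target or literal value
    ret: bool = False  # the reserved "return" bit (never set on a call)


def run(program, entry=0, max_cycles=100):
    """Run until a return is taken with an empty return stack."""
    data, rstack = [], []
    ip, cycles = entry, 0
    while cycles < max_cycles:
        ins = program[ip]
        cycles += 1                    # assumption: one cycle per instruction
        if ins.op == "call":
            rstack.append(ip + 1)      # save the return address
            ip = ins.arg
            continue
        if ins.op == "push":
            data.append(ins.arg)
        elif ins.op == "add":
            b, a = data.pop(), data.pop()
            data.append(a + b)
        # "nop" does no work of its own
        if ins.ret:
            if not rstack:
                break                  # top-level return: stop the model
            ip = rstack.pop()          # return happens alongside the operation
        else:
            ip += 1
    return data, cycles


program = [
    Instr("push", 2),
    Instr("push", 3),
    Instr("call", 4),
    Instr("nop", ret=True),   # end of the main routine
    Instr("add", ret=True),   # one-instruction subroutine with a folded return
]
result, cycles = run(program)
print(result, cycles)  # [5] 5
```

In this model the subroutine costs two cycles in total (the call plus the `add`); a design with a separate return instruction would need a third.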

Reprinted from Embedded Systems Programming, August 1989. Miller Freeman Publications. All rights reserved.

PRODUCT EVALUATION
by Tyler Sperry

Three RTX Development Systems

When Harris Semiconductor launched its "Real-Time Express" campaign last year and unveiled the RTX 2000 processor, it was greeted with mixed reactions from developers. Almost everyone was intrigued by Harris's claim that it had the world's fastest 16-bit microcontroller; an average of 15 MIPS from a 10-MHz clock isn't easily ignored. Those unfamiliar with the chip's genesis, though, were a trifle confused by the dual-stack architecture and the chip's "near-RISC" description. Developers in the Forth community, on the other hand, were much more enthusiastic, as it was clear the processor had been specifically designed to run Forth. But for both groups, the response immediately following the introduction was simple: "Sure it's fast, but what development tools are available?"

A few months back I asked that same question and quickly wound up evaluating three different development systems for the RTX 2000: the SC/FOX SBC from Silicon Composers, the Harris RTXDS, and the Innovative Integration FB2000. The review scenario was the same for all the systems; the goal was to develop an RTX-based solution to a hardware design involving real-time constraints. To even things out, all the systems were to use an IBM PC as the host and communicate via an RS-232 serial link. While all three systems demonstrated admirable hardware performance, their design rationales and software support varied substantially.

THE RTX BASICS
It's no surprise that all these development systems use the stock RTX 2000 processor. The RTX 2000 has been discussed before in this magazine (see "Hard-Wired Forth," Feb. 1989, pp. 58-71, and "A New Breed of Microcontroller," When in ROM, Mar. 1989, pp. 15-18), so I won't go into the processor's design here; suffice to say that the RTX 2000 is a 16-bit RISClike processor designed to implement Forth primitives in its instruction set. Almost all instructions execute in a single cycle, and the processor doesn't have a cache. All the systems I examined were available with a clock speed of either 8 or 10 MHz; all the processor boards used high-speed CMOS static RAMs and EPROMs.

One RISClike aspect of the RTX processor is its lack of microcode: all the bits of an instruction actually toggle transistors and perform operations. Unlike most RISC processors, however, the RTX has several internal buses that connect the various registers to the two internal stacks and the outside world. The multiple-bus arrangement actually allows the programmer (or, more precisely, the compiler) to overlap several instructions into one word. All three of the development systems I examined support this optimization of code.

SC/FOX SBC
Available From: Silicon Composers Inc., 210 California Ave., Ste. K, Palo Alto, Calif. 94306, (415) 322-8763
Price: $1,495 (includes software development system)
Support: Free telephone support, 90-day warranty, contract services
System Requirements: Serial cable, terminal, five-volt power supply

Silicon Composers was a solid supporter of the Novix Forth processor (the chip that served as the original model for the RTX development team), so it's no surprise that it also supports the Harris chip. The company is probably best known for its coprocessor PC cards, but it also provides stand-alone development cards. Third-party Forth packages are available for all the cards from Forth Inc. (Manhattan Beach, Calif.) and Laboratory Microsystems Inc. (Los Angeles, Calif.).

The SBC is a small (100 mm by 160 mm) board that contains the RTX processor, EPROMs, CMOS static RAM, expansion connectors, and miscellaneous "glue" parts. The board is available in a number of memory configurations ranging from 64 kbytes to 512 kbytes, and the bus connectors give you

full access to the ASIC bus for hardware expansion. If judged solely on clean design and hardware functionality, this system would be the winner. Unfortunately, there's a lot more to a development system than just the hardware.

ON THE SOFT SIDE
While the software is adequate for developing applications, it's Spartan in comparison with the alternatives. It took up about 240 kbytes on my hard disk and consisted of the compiler, the loader, and a few libraries and utilities. Installing the software was a simple operation.

The software development cycle on the SC/FOX SBC was a shock: I was used to the seductive, interactive nature of Forth systems running directly on the target system and using the PC strictly for I/O. Alas, the SBC has more in common with traditional C cross-compiling than with traditional Forth development. You edit your source files with your favorite editor; the use of straight ASCII text files and the C-like extension of #include for library usage was a nice alternative to the traditional block files. Once you've edited the source code, however, you have to compile it and load it into the target board's RAM. Does this process sound familiar? Does it sound slow? Correct on both counts.

Some explanation of the SBC philosophy came when I contacted Silicon Composers for help. (I'd been converting some old code and had discovered that the Forth compiler didn't understand words like PICK and CASE.) George Nicol, the company's president, explained that while Silicon Composers had a resident Forth working with its coprocessor boards, the stand-alone board's compiler was designed more as an optimizing assembler/compiler for the RTX processor than as a Forth development system.

The debugging assistance is minimal and not particularly interactive. One of the library files contains a few traditional display words like .s and .wdump, joined by the more sophisticated .break, which performs a breakpoint. At least it's better than sprinkling print statements throughout your code.

A FEW MISSING DETAILS
I found the SBC users manual to be good, with only a couple of annoying problems. I discovered the worst of them indirectly when I tried to use the parallel printer port. The folks at Silicon Composers told me I could get an adapter cable for the board from a local software/hardware emporium. Unfortunately, the salespeople were unaware of their support role for Silicon Composers and didn't have the appropriate cable in stock. I managed to manufacture my own cable without a great deal of effort, or a great deal of success. Another call to Silicon Composers straightened out the problem (the pinout diagram reflected only the pins for the printer end of the cable, not the header-pin assignments on the SBC) but left me with a lingering irritation with companies that don't feel they need to supply schematics or cables for their boards.

The SC/FOX is unique among the systems I tested in two important ways. First, no schematics are available for the board, implying (to me, at least) that the company would like you to develop applications and then buy your hardware from them. This is certainly no crime, but it also doesn't make the SC/FOX SBC my first choice as a development platform where custom hardware would be involved.

The second unique aspect is that the SC/FOX SBC was clearly designed for applications requiring more than the usual amount of RAM. To the credit of Dr. Nicol and crew, they've made it easy for customers to upgrade the board by either swapping boards or installing bigger RAM chips. That's good from the user's perspective, but if I were developing applications that used a lot of RAM (and had a budget to match) I'd be tempted to opt for development using Silicon Composers' coprocessor boards.

HARRIS RTXDS
Available From: Harris Corp., P.O. Box 883, Melbourne, Fla. 32902-0883, (407) 724-7302
Price: $2,995
Support: Free telephone support, 60-day warranty, bulletin board service, training courses
System Requirements: IBM PC, DOS v. 2.1 or later, serial port

Anxiety was my initial reaction to the arrival of the Harris development system: the shipping box measured 12 inches by 24 inches by six inches and weighed enough to remind me of the massive crates of documentation that came with Microsoft's infamous OS/2 SDK. Happily, this package is nothing like the OS/2 SDK; in almost every respect, it's a pleasure to work with.

The RTXDS I received consisted of the RTXDB development board, a Harris binder filled to overflowing with documentation, a modified version of Laboratory Microsystems' PC/Forth, and a power supply. The RTXDB itself was clearly devised with the hardware designer in mind: the roughly nine-by-14-inch board contains a ZIF socket for the processor, plenty of space between components, lots of test points, and two separate areas for breadboarding (one area is specifically designed for memory circuits). There's little doubt that this board combined with the Harris hardware documentation is a great hardware development package. As I said before, though, there's more to a development system than just hardware.

The software part of the package worked well, with one noticeable exception: the installation. The PC/Forth installation process for the RTXDS was conceptually very simple. All the installation batch file had to do was load a binary overlay for the appropriate video display, load the RTXDS overlay, and save the resulting images to disk.

In practice, the automatic installation blew up several times. I don't know if the blame lies with Laboratory Microsystems or with Harris; I do know that it's inexcusable for an installation batch file to be shipped without first being checked.

I'm not a Forth guru, but I was able to manually install the system rather quickly. This is due in no small part to the excellent PC/Forth documentation; every time I consulted the manual, I was able to find the information I needed quickly and easily.

I'd rate the documentation from Harris as good. There were a few times when I wasn't able to find things as easily as I should have, and there were times when I found contradictory remarks about the software. Harris has promised to work on the documentation and produce a separate, bound programmer's reference in the next revision (due out sometime this summer).

Once properly installed, the software worked very well. Like the Silicon Composers product, the PC/Forth package doesn't implement Forth running on the target system but instead compiles code on the PC and then loads it into the target's RAM. That's about the extent of the similarities between the two packages, though. Consider some of the advantages of the PC/Forth RTXDS package:
• Forth-83 compatibility.
• A dual-window programming environment that emulates (to a degree) a target-resident Forth system.
• A true RTX emulator for developing I/O-independent code without using the target system.
• Debugging help that includes a memory-access monitor for tracing variable assignments, as well as interactive disassembly and symbol table access.
• Sample code including UART and interrupt handlers.

The combined files of the PC/Forth RTXDS software package took up 520 kbytes on my hard disk, not a bad balance between size and utility.

INNOVATIVE INTEGRATION FB2000
Available From: Innovative Integration, 4086 Little Hollow Pl., Moorpark, Calif. 93021, (805) 529-7570
Price: $995 (or $1,500 for system with full source code)
Support: Free telephone support for 90 days, one-year warranty on board, one year free software updates
System Requirements: IBM PC with serial port (color graphics adapter desirable)

In many ways the FB2000 is both the easiest and the hardest development system to review. If I could do a decent imitation of Bill Machrone, I'd give it an Editor's Choice award. As it stands, I'll have to settle for congratulating Jim Henderson and the folks at Innovative Integration for winning our unofficial Bang-for-the-Buck award. At $1,075 for a developers package that includes a complete polyFORTH system and an EPROM programmer, the FB2000 is a bargain.

For people outside the field of embedded systems, there might not be anything remarkable in the appearance of the FB2000. If you've grown used to development systems the size of a home stereo, though, it can come as a pleasant surprise to pick up a little card (4.2 inches square) and consider its power. The small size of the FB2000 (let alone the accompanying EPROM burner, which is even smaller) is a nice hardware demonstration of one of Chuck Moore's guiding principles: small is beautiful.

Apart from the form factor, though, there's nothing small about this package. For example, the various source, documentation, and executable files take up over 2 Mbytes on my hard disk. Libraries and routines for such features as graphics, multitasking, and floating-point support are casually included in the source code as though they're no big deal. (By way of contrast, Silicon Composers sells a four-function IEEE floating-point RTX package for a mere $500.) I hate to be repetitive, but have I mentioned the word bargain yet?

In contrast to the SC/FOX SBC and the RTXDS, the FB2000 is a resident Forth package. That is, the PC is used for terminal and disk I/O via the serial link and the developer interacts with a completely functional polyFORTH system running on the target. While there's some delay in loading files via serial link, it's nowhere near as painful as it is with the Silicon Composers SBC. That probably has something to do with the fact that the FB2000 system cranks up the serial rate to 38 kbps or faster.

Despite all the talk about clock rates and wait states from the hardware labs, the measure of our productivity on the software side most often relies on the quality and speed of our programming tools. In that regard, Forth programmers have traditionally had an advantage in embedded systems development; the small size of most Forth systems has allowed developers to do their work quickly and interactively on the target system instead of wading through the delays of cross-development. If ever there were a processor where that advantage was worth serious consideration, it would be the RTX 2000.

LOOKING FOR FAULTS
I tried to find some, really I did, but there weren't that many faults. Perhaps the worst thing you could say about the

FB2000 is that it's the least stable of the systems mentioned here. (It's ironic that the system with the most changes was the system I had the least problems with.) No sooner did I get a system for review than Jim Henderson was on the phone uploading a new READ.ME file with improvements to the port I/O.

The system I got for review had a wire-wrapped EPROM programmer, an improvement added to the developers package because the people at Innovative Integration didn't think their customers wanted to shell out $1,000 for a premium ROM burner compatible with the speedy Cypress EPROMs they use. I called Innovative Integration's offices a month or two later to confirm a few details and it turns out that, yes, the EPROM burner now exists as a finished board. By the way, they're speeding up the disk-file access by adding in-line compression and decompression.

The two documentation manuals are among the best I've seen in Forth systems, so there's no problem there (unless you count all the READ.ME files from upgrades).

One potential shortcoming of the board is the limitation of 64 kbytes of static RAM, but expansion connectors (and schematics) would enable an engineer friend to add another board if you had an application that needed the extra memory.

The biggest short-term problem for developers is the fact that Innovative Integration is, like Silicon Composers, a small company. If you call with a support problem, there's a good chance you'll have to leave a message and wait a couple of hours for a reply. As with the best companies, though, when you get a response you wind up talking with someone who knows the product very well.

THE BUCK STOPS HERE
All these systems have an RTX under the hood, so to speak, so they're all screamers compared to the Forth controllers most of us have seen before. And while there were occasional problems, these are well-engineered products. So how do you choose?

As hinted at in the "Bang-for-the-Buck" comment, the Innovative Integration product is my pick as the best investment for developers interested in investigating the RTX processor. You get excellent performance and value and huge amounts of source code for your applications.

If you have friends in hardware who are seriously considering building their own boards based on the RTX 2000, however, I don't see a real alternative to the Harris RTXDS. The documentation and the board's design alone will pay for the system in the time they save the engineering staff. The software development tools aren't too shabby either.


SALES OFFICE HEADQUARTERS

United States: Harris Semiconductor, 1301 Woody Burke Road, Melbourne, Florida 32902. TEL: (407) 724-3739
European: Harris Semiconductor, Mercure Centre, Rue de la Fusse 100, Brussels, Belgium 1130. TEL: (32) 246-2201
North Asia: Harris K.K., Shinjuku NS Bldg. Box 6153, 2-4-1 Nishi-Shinjuku, Shinjuku-Ku, Tokyo 163 Japan. TEL: 81-3-345-8911
South Asia: Harris Semiconductor H.K. Ltd., 13/F Fourseas Building, 208-212 Nathan Road, Tsimshatsui, Kowloon, Hong Kong. TEL: (852) 3-723-6339

HARRIS SEMICONDUCTOR (HARRIS, RCA, GE, INTERSIL)
© Harris Corporation 1990. Reorder Number: AR-RTX-002

TABLE OF CONTENTS

Page 3 - Stack-Based Processor Speeds Target Recognition (Design News, September 4, 1989).

Page 5 - High-Speed Microprocessor Makes Star Tracker Functional (Design News, May 22, 1989).

Page 7 - Embedded Controls Improve Exercise For Handicapped (Design News, August 21, 1989).


DESIGN IDEAS - From The Regional Editors
Reprinted from Design News, September 4, 1989. Copyright 1989. Cahners Publishing Company. All rights reserved.

Stack-Based Processor Speeds Target Recognition
Chip executes all instructions in a single machine cycle

David J. Bak, East Coast Editor

Melbourne, FL - You're flying an F-16. Infrared sensors suddenly lock onto an object approaching at 1000 mph. Friend or foe? You've got 10, maybe 20 msecs to extract an edge, identify, and respond. The almost-instantaneous code execution needed to perform such a task demands a processor that addresses events with a predetermined, repeatable sequence of operations.

Most processors sold today boost speed at the expense of predictability. Based on RISC (Reduced Instruction Set Computer) architectures, they assign data and instructions to separate on-chip register banks, i.e., sets of flip-flops clocked by the processor and used for storage. These register banks, in turn, permit the simultaneous execution of independent operations using simple fetch-and-store commands. Good for straight-line program code, the architecture presents problems when used in real-time control applications.

Their ability to rapidly respond to internal and external interrupts, for example, depends on the state of the individual registers when the interrupt is acknowledged. Before servicing the subroutine, the processor must "clean itself out," transferring everything in the registers to the main memory. This process, called context saving, slows execution time. Worse yet, because the processor is never in the same state when the interrupt occurs, execution times are unpredictable, a condition that's often unacceptable (as seen in the F-16 scenario).

A new processor, based on four parallel internal buses, reflects the most desirable features of an ideal RISC processor. Its internal data paths are directly routable to the ALU (arithmetic logic unit) in a single machine cycle, with no sacrifice in execution predictability. And, with the exception of memory reference instructions that consume two machine cycles each, the processor executes all of its instructions in a single clock cycle. Like RISC, instructions are decoded in logic, rather than translated into a sequence of microcoded primitives. Unlike RISC, context saving is not necessary. The processor does not discriminate between direct or branched flow, treating interrupts as simple subroutine calls.

Built by Harris Corp., the RTX-2000 includes a main memory bus, a high-speed, bidirectional Application Specific Integrated Circuit (ASIC) bus, and two on-chip stack-memory buses. The main memory bus carries the sequence of instructions needed to execute a given task. Additionally, main memory holds all data not needed for on-chip storage. The ASIC bus allows easy interface with application specific ICs to further enhance system processing throughput.

Two parallel on-chip LIFO memories (a parameter stack and a return stack) take care of internal storage. The parameter stack provides storage for all operands. Addresses of all subroutine returns are stored in the return stack. Use of the two-stack system minimizes overhead associated with procedural calls, branches, and jumps. In addition, this approach reduces programming complexity.

Due to its high level of parallelism, the RTX-2000 can execute up to five FORTH primitives concurrently within a single, 16-bit machine instruction cycle. Consider the sequence of instructions involved in the last cycle of an image-processing task, such as edge extraction. The processor can:
• Read a 16-bit data value from the ASIC bus.
• Perform an ALU operation.
• Store the result on the parameter stack.
• Perform a subroutine return.
• Fetch the first instruction of a subroutine returned to by the processor.

Comparatively, such concurrent execution would equate to three or four RISC-equivalent instructions. On most RISC processors, executing that many instructions would require 6 to 8 bytes of program code and as many as 16 clock cycles.

Stack controllers, timers, a multiplier, and an interrupt controller complete the on-chip package. Stack controllers internally maintain the absolute address of the stack location's current 16-bit "top" value, onto which the next datum will be pushed. This value, stored in the controller's stack-pointer register (SPR), can be read by the processor at any time, such as during the process of switching a task context. The SPR can also be written into in the process of restoring the context of a new task.

Implementation of the interrupt controller allows 14 prioritized interrupts. Of these, nine are used by the processor's internal peripherals. The remaining five are available as inputs to the processor. The RTX-2000's on-chip 16- x 16-bit parallel multiplier functions as the processor's major source of arithmetic power, while three on-chip, 16-bit down-counters perform all timing and event-counting tasks.

As an integral part of the Ball Aerospace Daylight Star Tracker, the RTX 2000 processes 900 CCD pixels in 9.1 msecs. The Tracker, described on page 132 of the May 22, 1989 issue of Design News, measures the bearing and azimuth of a star to determine the point on the earth's surface occupied by the observer. Other applications, besides image processing, avionics and space guidance systems, include robotic motion/effector control, speech processing, smart munitions, and all areas of graphics processing. Embedded expert systems is another area of potential use.

Additional details ... Contact David Williams, Harris Corp., Semiconductor Products Div., Box 883, Melbourne, FL 32902, 407-729-4629.

Caption: Crucial to avionics target-tracking, high-speed embedded real-time systems can perform single-target extraction operations in as little as 10 msecs, an amount of time in which most general purpose processors can only execute a maximum of around 100 instructions. (Photo: ©Fred Ward, Black Star.)

[Figure: RTX processor block diagram. Labels recoverable from the original: interrupt inputs, timer inputs, clock and configuration control inputs, memory bus interface to main memory and off-chip peripherals, parameter stack, return stack, stack controllers.]

Caption: Internal, highly parallel busing gives the RTX-2000 extremely high throughput capability. The only class of two-cycle instructions needed are memory-reference instructions, i.e., one cycle to fetch an instruction, and one to fetch/store the data.
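The timing rule in the caption (one cycle per instruction, two for memory-reference instructions) lends itself to a quick worked example. This is a minimal sketch, assuming a hypothetical labeling of each instruction as either "mem" or anything else, and a 10-MHz clock; the function names are illustrative and not part of any Harris tool.

```python
def cycle_count(instruction_kinds):
    """Total machine cycles: 2 for each "mem" instruction, 1 for all others."""
    return sum(2 if kind == "mem" else 1 for kind in instruction_kinds)


def execution_time_us(instruction_kinds, clock_mhz=10.0):
    """Execution time in microseconds at the given clock (assumed 10 MHz)."""
    return cycle_count(instruction_kinds) / clock_mhz


# A hypothetical five-instruction routine with two memory references.
routine = ["alu", "mem", "alu", "alu", "mem"]
print(cycle_count(routine))        # 7
print(execution_time_us(routine))  # 0.7
```

Because the cycle count depends only on the instruction mix, and not on cache or pipeline state, the same routine takes the same time on every run, which is the predictability argument the article makes.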

DESIGN IDEAS
Reprinted from Design News, May 22, 1989. Copyright 1989. Cahners Publishing Company. All rights reserved.

High-Speed Microprocessor Makes Star Tracker Functional
Unit processes 900 pixels in 10 msecs and consumes just 3.1W

Lyle H. McCarty, Western Editor

Boulder, CO - Star trackers do what navigators did in days of yore: measure the bearing and azimuth of a star to determine the point on the earth's surface occupied by the observer. That information is then put to work, correcting the navigator's dead-reckoning calculations in the old days, providing updated zero references for the inertial navigation system today. Of course, star trackers work much more quickly than the busiest of navigators, so fast that the equipment shown here was required to process signals from 900 Charge Coupled Device (CCD) pixels in 10 msecs. Conventional multiclock cycle processors can execute only five to ten instructions in this time interval, not nearly enough computing power for the job.

Caption: Star tracker electronics assembly occupies little space, weighs just 14.2 oz, and operates on 3.1W. Its Harris microprocessor processes 900 CCD pixels in 9.1 msecs.

To solve this problem, Ball Aerospace engineer Mike Hubbard turned to a Harris Corporation microprocessor known as a Forth engine. This device runs at a clock speed of 10 MHz and can execute more than 100 instructions/msec on each pixel. It is also small (one-inch square by 1/4-inch high) and uses just 25 mW/MHz. In addition, the chip requires only a small amount of support circuitry to form a complete computer system.

[Figure: Model LN-20 Star Tracker block diagram. Labels recoverable from the original: star, telescope, CCD, analog processor, 4k x 16 bit RAM memory, Forth engine, time base generator, oscillator, host interface; outputs: elevation error, relative bearing error.]
1. Time base generator reads charge coupled device (CCD) pixel data.
2. Analog processor loads Forth engine memory with pixel array.
3. Forth engine searches pixel data for star position.
4. Host interface provides control and status to user.

Tracker electronics are functionally partitioned into camera and processing areas. The camera is subdivided into a CCD focal plane, a correlated double sampler analog processor, and a Time Base Generator (TBG). A state machine sequencer running at 12 MHz, the TBG provides CCD control signals to read out the 900 pixels in 10 msecs. This clever circuitry includes three programmable array logic devices and a 32k x 16 bit EPROM. The waveform required to read out the CCD is determined by the bit patterns that are burned into the EPROM.

A camera interface, dual D/A converters for X and Y analog output error signals, and digital input and output ports comprise the analog processor. The unique camera interface loads pixel data into RAM with no processor overhead. Major components of the interface are a 12-bit A/D converter, dual-port 4k x 16 bit RAM, and counter. While the A/D converter converts data and writes it into RAM, the counter provides RAM addresses.

The Forth Engine accesses pixel data on the other RAM port to determine star position. The RAM also provides variable and array data storage. Constants and the star tracker program are stored in an 8k x 16 bit EPROM. A host interface enables a user to determine system status and control the star tracker.

Additional details ... Contact Robert Baker, Ball Aerospace Systems Division, Box 1062, Boulder, CO 80306-1062, 303-939-4917.
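The figures quoted in this article (900 pixels, 9.1 msecs, a 10-MHz clock) imply a per-pixel processing budget that can be checked with simple arithmetic. The sketch below is only that check; the function name is hypothetical, and it assumes roughly one instruction per clock cycle, which the article's single-cycle claim suggests but which memory-reference instructions would reduce.

```python
def cycles_per_pixel(pixels, frame_time_us, clock_mhz):
    """Clock cycles available per pixel within one frame."""
    total_cycles = frame_time_us * clock_mhz  # microseconds x cycles per microsecond
    return total_cycles / pixels


# Figures from the article: 900 CCD pixels, 9.1 msecs, 10-MHz clock.
budget = cycles_per_pixel(pixels=900, frame_time_us=9100, clock_mhz=10)
print(round(budget, 1))  # 101.1
```

At about 101 cycles per pixel, the result is in line with the article's statement that the device can execute more than 100 instructions on each pixel.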

DESIGN FEATURE
Reprinted from Design News, August 21, 1989. Copyright 1989. Cahners Publishing Company. All rights reserved.

EMBEDDED CONTROLS IMPROVE EXERCISE FOR HANDICAPPED Advanced microprocessor triggers electrical stimulation of paralyzed muscles in real time

Gail M. Robinson, Associate Editor

ike many of us who are listen­ ing to the advice of our doc­ Ltors, Arthur Schlesinger works out three times a week. Taking about 30 minutes, his workout in­ volves a complete cardiovascular program. But there is one major difference. Schlesinger is a paraplegic. He is a volunteer in a program at Wright State . University Medical School, Dayton, OH, geared to developing complete exercise programs for handicapped people. Though he has no use of his legs, Schlesinger gets an aerobic workout by riding a bicy­ cle while turning a gear crank. The key to the workout is a sys- 1 tern that controls electrical stimula­ tions which, when applied to Sch­ lesinger's legs, produce functional muscle contractions. The controller is a new type of microprocessor that takes on the normal stimulus and response functions of the nervous system. Its job is to synchronize the lower and upper extremities so the individual gets all the benefits of a cardiovascular workout. Called the RTX 2000, this high­ speed 16-bit programmable micro­ controller may prove to be a major contender in the microprocessor market for real-time applications. Designed by Harris Semiconductor, Melbourne, FL, the system offers interrupt response times of 400 nsecs, context times of 2 µsecs , and an instruction execution

rate of over 10 MIPS. And, it features a new way of dealing with real-time events. Instead of a simple interrupt, it can merge responses from different channels to make up a unique set of instructions that handle more than one instruction at a time.

DESIGN NEWS • AUGUST 1989

[Photo: The RTX 2000 is a sophisticated microcontroller optimized for real-time applications.]

[Photo: Space Shuttle uses RTX 2000 in solid-state Star Tracker device.]

Fitting a need

"Everyone knows that without activity the muscles go through atrophy," says Bertram Ezenwa, director of technical development in the Div. of Research and Development at Wright State. "Unfortunately, people with paralyzed limbs can't do anything about it. Their muscles go through complete deterioration. Not only are the physical problems difficult to deal with, but there are psychological problems as well."

Ezenwa's goal is to design a system that gives these people the opportunity to be able to do a complete workout, rather than exercise only certain parts of the body. "The control system plays a major role in this kind of exercise program," says Ezenwa. "It must determine the condition of the lower extremity musculature, for example, the fatigue condition, and the phase and rate of the arm cranking. It then applies the necessary current in real time to specific lower extremity muscles so they are synchronized with the arm cranking."

Most conventional microprocessors are optimized for office computers or computer-aided workstation environments. In these cases a "real-time" response may take a few seconds, probably making little difference to the user.

However, these processors are not designed for applications where even a 1/2-sec response time to an external event is too slow. When dealing with more than one task, a conventional microprocessor must jump back and forth between the different instruction groups.

For example, in this case one set of instructions is set for pedaling the bicycle, while a completely separate set of instructions deals with the turning of the crankshaft. Though both of these tasks must occur simultaneously, a conventional microprocessor can only process one set of instructions at a time in the sequence. It must jump from one task to another, interrupting the other task in the process. If the controller is well-matched to the job, it can jump back and forth fast enough to look as though it were doing both tasks simultaneously. But the chance of an overload is very high.

With the RTX 2000, two different sets of instructions for each task are not required. Instead, the microcontroller makes one set of instructions that deals with the two different events. Therefore, it deals with a greater variety of tasks using the same level of processing power.

Taking a closer look

"There is a growing need for processors with higher performance standards than existing microcontrollers," says Dave Williams, manager of RTX product marketing at Harris. "Even RISC processor technology with its promise of adding even more sophisticated products to the marketplace may not be suitable for many embedded control applications."

Though RISC processors are making strides in real-time processing, especially in respect to speed and performance, Williams notes that most real-time system environments are driven by asynchronous external events, which require fast interrupt response and predictable system-level timing.

Williams adds that RISC processors rely on pipelining and cache memory schemes to provide improvements in the statistical speed of the processor, so response times are often unpredictable.

Though the RTX does exhibit RISC-like behavior in its performance of one instruction per cycle, there is an important difference in how the data and control paths are configured. The microcontroller does not use pipelines to achieve its speed, but instead it uses parallel techniques in the form of an RTX Quad Bus. This enables several different kinds of information transfer to occur at the same time as part of one instruction.

The chip also has no microcode


sequencer or microcode. All instructions except memory access instructions execute in one cycle. The system minimizes address calculation delays by incorporating a simplified memory paging mechanism, and eliminates the complexity of multiple addressing modes and memory arrangement.

The RTX is also a stack machine. Stacks facilitate the evaluation of expressions and minimize the control overhead needed to organize data. A stack machine not only uses a stack for temporary data storage, but executes all operations on data from the stack. Thus, the ALU finds all of its data in a pre-defined location, and can get that data without an address specification.

Besides two 256-word stacks, the chip has several other features that may keep it ahead of the pack, including the interrupt controller, three on-chip 16-bit timer/counters, an ASIC bus for off-chip extension of the architecture, and 1M byte of memory address space.

Putting it to work

So far, parts of the real-time data acquisition and display for the system have been tested on an Intel 386-based microcomputer. "After we applied the RTX 2000 real-time , the data acquisition and display were much faster," says Ezenwa.

To initiate the program, an exercise model is programmed into the microcomputer. The controller then accesses the required muscle for contraction through surface electrodes attached to the body. The model or algorithm is downloaded in FORTH from the microcomputer to the RTX chip.

[Figure: ARM CRANK/BICYCLE SYSTEM. An exercise profile is programmed into a computer that interfaces with the RTX 2000. The controller accesses the required muscle contractions and synchronizes the lower and upper extremities of the body.]

An integral part of the equipment is a specialized stationary bicycle designed by Therapeutic Technologies Inc., Dayton, OH. The ERGYS I is designed to combine functional electrical stimulation with such advances in microprocessor development to control the muscle contractions. The system on the market now is directed towards home rehabilitation. It features a programmable electronic cartridge for individualized treatment.

For Schlesinger's particular workout, the system stimulates three different muscle groups: the quadriceps, the hamstrings and the gluteus maximus. This enables him to ride the stationary bicycle as he turns the arm crank. After 36 sessions, he undergoes strength, muscle, and cardiovascular testing.

"The physical results have been great," says Schlesinger. "My muscle mass and strength have increased, I have more circulation, and noticeably less fat. But the psychological effects are just as important. My muscles have tone and look normal. I can't tell you how good that makes me feel. I look better and I feel better about myself."

Researchers are currently developing simulators for selective stimulation of deep muscles from surface electrodes. And, Dr. Ezenwa is also designing a system for weight lifting. "By applying the proper amount of electrical current to the quadricep muscles, the leg can lift weight against gravity," says Ezenwa. "But several conditions must be factored into the control design, such as applying only the necessary current intensity for each phase of contraction, determining muscle fatigue condition, and keeping the weight lifting to predetermined safe limits."
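The stack-machine idea described above, in which every operation takes its operands from the top of a stack and so needs no operand addresses, can be illustrated with a minimal sketch. This is a toy model written for this purpose, not RTX 2000 code: the operator names, the `run` function and the way the 256-word depth limit is enforced are all assumptions made for illustration.

```python
# Illustrative sketch only: a toy 16-bit stack machine.
# The operator names below are hypothetical, not RTX 2000 instructions.

MASK16 = 0xFFFF  # keep every value within a 16-bit word


def run(program, depth=256):
    """Run a postfix program and return the final stack (bottom first).

    Integers are pushed; operators pop their operands from the top of
    the stack and push the result back, so no operand addresses appear.
    """
    stack = []

    def push(value):
        # Refuse to grow past the assumed stack depth.
        if len(stack) >= depth:
            raise OverflowError("stack is full")
        stack.append(value & MASK16)

    for token in program:
        if isinstance(token, int):
            push(token)
        elif token == "+":
            b, a = stack.pop(), stack.pop()
            push(a + b)
        elif token == "-":
            b, a = stack.pop(), stack.pop()
            push(a - b)
        elif token == "*":
            b, a = stack.pop(), stack.pop()
            push(a * b)
        elif token == "dup":
            push(stack[-1])
        elif token == "swap":
            stack[-1], stack[-2] = stack[-2], stack[-1]
        else:
            raise ValueError(f"unknown token: {token!r}")
    return stack


# (2 + 3) * 4 written in postfix order
print(run([2, 3, "+", 4, "*"]))  # [20]
```

Because the operands are always in the same place, each operator is a fixed pop-compute-push pattern, which is the property the article credits for the reduced control overhead.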

To follow these parameters, Dr. Ezenwa is developing an automated stimulator for the adaptive control of knee extension exercise for spinal cord injured individuals. The RTX 2000 development system will carry out the real-time control algorithms.

Beyond the health field

There are other research applications in need of such systems as well, including many outside the medical field. Fred Sias, professor in the electrical engineering dept. at Clemson University, Clemson, SC, is designing wheeled mobile robots for surveillance and radiation environments. He expects to apply the RTX 2000 "because it offers optimized control for multitasking and interrupts, and it is easy to interface with the real world."

Ball Aerospace, Boulder, CO, is using the RTX 2000 in its newest solid-state Daylight Space Tracker. The Tracker is an electro-optical device used in aircraft and the space shuttle that analyzes stars through a scope. The signals it generates control the body surrounding the spinning wheels of the gyroscope. A computer responds to the Tracker's signals and moves a platform to keep the Tracker aligned.

The system mounts on a guidance gimbal and uses a CCD sensor and the RTX 2000 to process a 900 pixel frame in 10 msec. The RTX scans the field and locates the stars, and then sends the tracking information to the computer.

Exceptional instruction speed was the main reason the company is using the chip, notes design engineer Mike Hubbard. "I doubt if the job would have been done without the RTX," he explains. "If so, it would have had to be custom-made, which would have been more expensive and taken more time."
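As a rough check on the star-tracker figures quoted above, the per-pixel instruction budget can be worked out from the numbers in the article. The sketch below assumes a flat 10 MIPS (the article says "over 10 MIPS", so this is a lower bound) and assumes the whole 10 msec frame time is available for pixel processing; both are simplifications.

```python
# Back-of-envelope budget using figures quoted in the article.
# Assumption: a flat 10 MIPS and the full frame time spent on pixels.

MIPS = 10                # million instructions per second (lower bound)
FRAME_TIME_US = 10_000   # 10 msec expressed in microseconds
PIXELS_PER_FRAME = 900

# 1 MIPS is one instruction per microsecond, so the product is a count.
instructions_per_frame = MIPS * FRAME_TIME_US
instructions_per_pixel = instructions_per_frame // PIXELS_PER_FRAME

print(instructions_per_frame)   # 100000
print(instructions_per_pixel)   # 111
```

On these assumptions the processor has a little over a hundred instructions per pixel in which to scan the field and locate stars.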


© Harris Corporation 1990. Reorder Number: AR-RTX-001. HARRIS SEMICONDUCTOR (HARRIS, RCA, GE, INTERSIL)

FORTH Processor Core for Integrated 16-Bit Systems

Peter S. Danile and Christopher W. Malinowski, Harris Corp. Semiconductor Division, Melbourne, FL

Reprinted from VLSI Systems Design, June 1987. Copyright 1987 CMP Publications Inc. All rights reserved.

The development of a high-performance dedicated 16-bit processor using an industry-standard microcontroller or bit-slice processor calls not only for an extensive board-level design effort, but also for a long-term development program for software and firmware. It has been difficult to use semicustom techniques for such development because core processors have been scarce and microcode development for custom ICs is very difficult.

However, in a growing number of high-performance systems for digital signal processing, control, and arithmetic, application-specific processors are bringing forth the advantages of integration, including high throughput, low power dissipation, and much higher density than board-level implementations.

Harris Semiconductor now has a semicustom technology, called the Processor Toolbox, that eases the development of high-performance 16-bit integrated processors. Toolbox features include a very small core-processor cell, a highly parallel architecture for maximum throughput, easy programming and code development, code portability, a full set of core-compatible peripheral cells that can support processor clock frequencies as high as 15 MHz, and a set of high-speed arithmetic and logic cells, such as a 16-bit multiplier, for further customization.

The processor architecture derives from one conceived by Charles Moore, inventor of the FORTH language. This RISC-like, highly parallel architecture meets the size and throughput requirements. The combination of the processor's instruction set (a directly executable set of FORTH high-level primitives) and its reliance on two stacks that reflect a FORTH virtual machine results in a compact core processor with fewer than 2500 gates. This core is called the FORTH Optimized RISC Computing Engine, or Force.

Because each instruction comprises more than one FORTH primitive (opcode), data-manipulation throughput can exceed the processor's clock frequency, often by a factor of three. Consequently, for instruction sets rich in multiple-opcode instructions, peak processing throughput can exceed 30 MIPS, with a steady throughput of 10 to 20 MIPS.

Users of the Toolbox can develop application code in a high-level FORTH language working with an interactive and interpretive environment. FORTH is not only portable but also offers expeditious debugging tools and easy target-compilation from one environment to another. Therefore, a wealth of code written for DSP, artificial intelligence, control, number-crunching, and real-time data-processing applications can be ported to the Harris engine.

The principal advantage to designers of dedicated processors stems from the host of proprietary LSI and VLSI cells being developed to support the FORTH core processor. The Toolbox provides designers of Force-based products with a set of packaged Force circuits identical to the cells available in Harris' standard-cell library (Figure 1). The designer can use these parts to create a breadboard and experiment with different configurations of the Force processor before moving to an ASIC. He gains greater confidence in the design's functionality, because it can be exercised in a real environment in addition to a CAE simulation environment. Moreover, the designer can use the prototype breadboard to generate a reliable and accurate set of functional test vectors, a task both formidable and error-prone otherwise.

FIGURE 1. The Force Toolbox contains a FORTH processor and support peripherals, all available as cells and packaged parts.

Another advantage of creating a breadboard is that the designer can compile the applications code and run it before committing it to a ROM pattern. Running the application code lets the designer make trade-offs between implementing functions in hardware and coding them in software. The core's execution speed allows many functions, such as memory swaps, arithmetic and logic functions, shifts, and masking, to be implemented in firmware without sacrificing performance, shrinking die size considerably. Such trade-offs can be investigated only if the designer has access to the processor's functional blocks and can exercise alternatives in real time, that is, on a breadboard.

Finally, the Toolbox approach makes it easier to design a testable circuit. During breadboarding, the designer can observe the timing of signal paths, such as the processor's data paths, which may not be directly accessible in an integrated system. Identifying buried trouble spots increases the understanding of the circuit's testability requirements and makes it easier to implement such testability features as scan paths and memory-check routines.

The Toolbox version of the core processor (Figure 2) comes in a 144-pin pin-grid array with all the I/O signals bound to pins. Besides the core, the Force Toolbox also contains packaged versions of a stack controller with an on-chip 64 x 16-bit stack RAM, an interrupt controller, and a 16 x 16-bit multiplier. By early 1988, Harris also will offer a hardware-development system and a Force target compiler that, in combination with a breadboard system, will support firmware-code development.

FIGURE 2. The packaged version of the Force core processor.

Once hardware and firmware are defined, the design is implemented in a semicustom IC, for which the designer customizes the program, data, and stack memories by using RAM and ROM module compilers. As much as 64K of on-chip firmware ROM can be integrated in addition to 16K of data RAM. Because FORTH code is so compact, very extensive application-specific code can be implemented in firmware along with the kernel code. To replace the discrete logic on the breadboard, the designer uses 7400-type SSI and MSI cells from the Harris standard-cell library.

The Force Core

The Toolbox's Force core processor is a bare control engine with 123 I/O lines. These signals include three parallel 16-bit data buses (two for stack memories and one for main memory), a 16-bit main-memory address bus, and a dual-purpose 5-bit address-extension bus. In addition, a general-purpose 16-bit bus (G-bus) acts as the processor's primary I/O signal path.

A principle of RISC philosophy is to bring execution speed as close as possible to the maximum memory-access speed. As a corollary, the number of multicycle instructions in the processor's instruction set is reduced to maximize the processor's data-bus throughput and bandwidth.

The processor's independent buses account for the Force core's high throughput. Because each instruction executes in no more than two clock cycles, at least three of the five buses are active during any clock cycle. The G-bus transfers data at up to 30 MB/s when the processor operates at 15 MHz; the main-memory bus transfers data at up to 30 MB/s when in a streamed-move mode.

The Force core processor's highly parallel architecture (Figure 3) reflects the structure of its horizontal instructions. Its eight main registers provide parallel storage and access to the parameter stack's top two locations (TOP and NEXT), the top location of the return stack (I), the instruction register (IR), the program counter (PC), and two arithmetic-instruction registers (MD and SR). The MD register stores the partial results of step-multiply and step-divide instructions; the SR register stores the partial results of hardware-assisted square-root operations.

The minimal overhead to support subroutines also reflects the nature of the FORTH language, which is heavily oriented toward the use of subroutines. The core processor is optimized for the minimum number of cycles necessary to execute a subroutine call and return. Through instruction partitioning and architectural refinement, the execution of a subroutine call requires only a single clock cycle; the return requires no added clock cycles. All interrupts that are interpreted as subroutine calls therefore require only one clock cycle of overhead.

Stack Controller Cell

Because the stack-oriented FORTH instructions employ stacks in every command, the most critical peripheral used with the Force core processor is the stack controller. Controlling the stacks with software routines would degrade the system's performance. The core directs the stack controller through the RW and SA signals. The stack controller responds to the SA signal with either a push (data write) or pop (data read), according to the status of the RW signal.

In the packaged version of the stack controller are 64 words of memory with an access time of approximately 30 ns, so the designer can operate the core at 15 MHz without worrying about the access time of data in the stack. In an ASIC implementation, the stack's memory size can be altered and the access time improves to about 20 ns.

The stack controller operates with the core processor's parameter stack (through SAS and RWS signals) and the return stack (SAR and RWR), giving a typical system two stack controllers (Figure 4). To enhance system flexibility, the signals OVER and UNDER generate interrupts when the stacks are ready to overflow or underflow. UNDER occurs when the

stack is popped more than pushed and the stack address lines reach 00. The assertion should initiate a routine that either resets the controller (if more returns than routines were called) or tries to recover from other sources of underflow. The OVER signal occurs when the number of words pushed onto the stack exceeds a user-defined maximum. This maximum can be programmed into the ASIC implementation by writing into an offset register through the G-bus.

FIGURE 3. Parallel architecture lets three of five processor buses be active for all instructions.

To implement multitasking versions of the Force processor, Harris is developing a multitasking stack controller (MSC). The MSC enables the user to partition the 256-word stack into eight separate stacks via a 3-bit address size register. For example, if the system must run two concurrent jobs, the size register is programmed to divide the stack RAM into two 128-word stacks. To switch between tasks, the processor enables a task-select register through the G-bus. When this register is written to, the stack pointer of the current task is saved and the stack pointer for the new task is restored at the new task's current address.

When the stack controllers and core processor are integrated on a single chip, options for increasing performance are available. Not only can the access times of the controller and the data RAMs be reduced merely by integrating, but access time can also be reduced further by separating the bidirectional data buses into read (POP) and write (PUSH) buses (Figure 5). In the discrete versions, the data buses are bidirectional to allow them to connect directly to standard RAMs with bidirectional data buses. Separating the buses adds some additional routing area; on the other hand, the time required to set up the buses for either a read or a write is eliminated, improving system response time substantially.

Interrupt Controller and Host Interface

The interrupt controller and the host interface help the core processor interact efficiently with the surrounding system. First, the interrupt controller contains 15 prioritized interrupt-request inputs and a separate input for nonmaskable interrupts (NMI). The interrupt priorities are fixed (to decrease response time) but can be defeated by writing in the interrupt controller's mask register (which has a discrete address on the G-bus).

The interrupt controller samples the request inputs on opposite edges of the system clock. When two consecutive samples confirm that an interrupt is present, the INT line is asserted. The core processor responds with the INTA signal, which directs the interrupt controller to generate the appropriate vector for the interrupt. This vector comprises a 7-bit user-defined field for the location of the interrupt-vector table and a 5-bit field that designates the appropriate interrupt.
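The vector format and fixed-priority selection described above can be sketched in a few lines. The article gives only the field widths (7 bits and 5 bits) and the number of request inputs; the bit layout (table field in the upper bits), the priority order (lowest-numbered line wins) and the mask polarity (a set bit disables a line) are assumptions made here for illustration, and the function names are hypothetical.

```python
# Illustrative sketch only. Bit layout, priority order and mask polarity
# are assumptions; the article states just the two field widths and the
# number of prioritized request inputs.

TABLE_BITS = 7      # user-defined field locating the vector table
SOURCE_BITS = 5     # field designating the interrupt source
NUM_REQUESTS = 15   # prioritized request inputs (NMI handled separately)


def make_vector(table_base, source):
    """Pack the two fields into one vector, table field in the upper bits."""
    if not 0 <= table_base < (1 << TABLE_BITS):
        raise ValueError("table_base must fit in 7 bits")
    if not 0 <= source < (1 << SOURCE_BITS):
        raise ValueError("source must fit in 5 bits")
    return (table_base << SOURCE_BITS) | source


def highest_priority(pending, masked):
    """Return the lowest-numbered pending line that is not masked, or None."""
    active = pending & ~masked
    for line in range(NUM_REQUESTS):
        if active & (1 << line):
            return line
    return None


# Lines 1 and 3 are pending; line 1 is masked, so line 3 is serviced.
line = highest_priority(pending=0b1010, masked=0b0010)
print(line)                          # 3
print(hex(make_vector(0x55, line)))  # 0xaa3
```

Masking a higher-priority line lets a lower-priority request through, which is one way to read the article's remark that the fixed priorities can be defeated through the mask register.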


FIGURE 4. Breadboard design of Force system.

In the packaged version of the interrupt controller, the interrupt vector for the valid highest-priority interrupt is presented to the core processor within 40 ns of the INTA pulse. Thus the controller can operate with the core processor at system frequencies greater than 15 MHz. Once the system has entered an interrupt routine, the core's INTE signal inhibits any new interrupt from generating a new INTA. During INTE assertion, the system can clear the interrupt source and rewrite the Force configuration register to re-enable the interrupts. The interrupt controller does not automatically nest interrupts, so the system does not need to modify the interrupt controller until it must mask or unmask any interrupt line.

To create an interface between the Force core and a host controller that does not degrade the core's performance, the Toolbox includes a host interface cell. The interface allows another processor to read and write data in the Force core's memory-address space; it receives as inputs the address and data lines of the shared memory, the command lines from the processors, the core's data signals, and a MEMORY_READY signal that indicates when the data in the shared memory is valid. It provides a READY signal to the host, to indicate when it can read or write data, and a signal to the core. Figure 6 shows a typical configuration of the interface, the core processor, and the shared memory.

The host interface suspends the core processor when it attempts to read invalid data. When the data is not in the shared memory, the MEMORY_READY input to the host interface is de-asserted until the data in the memory becomes valid. While the signal is low, the interface suspends execution by the core processor; the processor continues when MEMORY_READY is re-asserted.

When the host processor wants to write to the shared memory, the host interface checks the FORCE_LOCK signal to determine if the Force processor has priority on the memory bus. If not, the interface suspends the core processor and hands control over to the host. If wait states are necessary, MEMORY_READY becomes low and the host interface stalls the host with the HOST_READY signal. When HOST_READY is asserted, the host can relinquish priority on the bus.

Writing host-processor data into the shared memory is simpler than reading from it. The host writes directly to the host interface, which holds the data in a buffer. When the memory bus becomes free, the interface writes the data into the memory. A HOST_READY signal is set when the buffer is full to prevent the host from writing over new data.

If the host processor wants to do some house-cleaning in the shared memory, it requests priority on the memory bus by asserting the HOST_LOCK pin. This signal suspends the core processor so the host can have exclusive access to the memory. In normal operation, the use of LOCK signals should be minimized so performance is not degraded.

FIGURE 5. Integrated configuration of core, stack controller, and multiplier.

The host interface is designed to work with an asynchronous host by synchronizing all commands from the host with the Force clock signal. The core processor can work with synchronous or asynchronous RAMs, even if the host interface is used in a completely synchronous system. If the interface is integrated with the core, minor modifications can improve response time in the host read/write cycle. Also, if the shared memory is integrated with the other components, the MEMORY_READY line is not necessary because the on-chip compiled RAM is fast enough to always contain valid data.

FIGURE 6. Configuration of core and shared memory with host interface.

Multiplier

Harris has designed a 16 x 16-bit multiplier, using its proprietary MPS algorithm, for use with the Force core processor. The peripheral performs full 16 x 16 multiplication, generating a 32-bit product in as little as two clock cycles when the peripheral is integrated with the core; a breadboard system can complete a multiply in five clock cycles.

The multiplier operates either clocked or unclocked and provides tristate signals at the output control and data (MSP and LSP) lines. It has data latches at its inputs to allow clocked operation independent of the core. On-chip results are generated in less than 50 ns from the edge of the clocking signal, and the two 16-bit words in the product can be read simultaneously or in sequence.

The designer can incorporate the multiplier into the Force architecture in several ways. First, he could designate several G-bus addresses to identify the multiplier, multiplicand, and the product's most-significant word (MSP) and least-significant word (LSP). Using this configuration, the core processor would write the addresses and read back the results to receive the result in four clock cycles.

He also could designate one G-bus address to identify either the multiplier or the multiplicand. The second operand can attach directly to the processor's TOP bus. To perform a multiplication, the multiplier would be written to the G-bus and the multiplicand placed on the TOP bus. Upon completion, the processor executes a FETCH_SWAP from the G-bus to receive either the MSP or LSP of the result (depending on the multiplier's configuration), placing it in the NEXT register. A G-bus FETCH then retrieves the remaining product word. This operation requires only three clock cycles to multiply two 16-bit numbers, and when many numbers need to be scaled by a constant (the multiplicand), the multiply takes only two clock cycles once the initial multiplier is written to the G-bus.

When the multiplier is integrated with the Force processor, the multiplier and multiplicand can be placed directly on the TOP and NEXT buses. The multiplier would be configured in its feedthrough mode, and a 32-bit product would be available every clock cycle. To execute a multiplication, the processor needs to read only the appropriate G-bus addresses.

About the Authors

Peter S. Danile is currently a section head of semicustom design at Harris Semiconductor. Prior to joining Harris in 1981, he worked for both Northern Telecom and Motorola Communications. A graduate of the University of South Florida in 1976, Peter went on to receive his MSE with honors from Florida Atlantic University in 1980.

Christopher W. Malinowski is a senior scientist for Harris' semiconductor research and development department, and a program manager for the Force project. He holds an MS degree in nuclear electronics and a PhD in solid-state physics from Warsaw Technical University.
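The split of the multiplier's 32-bit product into MSP and LSP words, described in the Multiplier section above, can be sketched as follows. This models only the arithmetic, not the G-bus protocol or cycle counts, and it assumes unsigned operands because the article does not say whether the peripheral is signed; the function name is hypothetical.

```python
# Illustrative sketch only: arithmetic of a 16 x 16 multiply whose 32-bit
# product is read back as two 16-bit words. Unsigned operands are assumed.

WORD_MASK = 0xFFFF


def multiply_16x16(multiplier, multiplicand):
    """Return (MSP, LSP): the upper and lower 16-bit words of the product."""
    for operand in (multiplier, multiplicand):
        if not 0 <= operand <= WORD_MASK:
            raise ValueError("operands must be unsigned 16-bit values")
    product = multiplier * multiplicand
    msp = (product >> 16) & WORD_MASK   # most-significant product word
    lsp = product & WORD_MASK           # least-significant product word
    return msp, lsp


msp, lsp = multiply_16x16(0xFFFF, 0xFFFF)
print(hex(msp), hex(lsp))  # 0xfffe 0x1

# Scaling several values by one constant reuses the same operand each time.
scaled = [multiply_16x16(3, sample) for sample in (4, 0x8000)]
print(scaled)  # [(0, 12), (1, 32768)]
```

The second example mirrors the scaling case in the text, where one operand is written once and only the other changes between multiplies.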


Reorder Number: SAR-8019. © Harris Corp., June 1987. Printed in U.S.A.

DESIGN ENTRY

Standard-cell CPU toolkit crafts potent processors

Todd Jones, Christopher Malinowski, and Stanley Zepp, Harris Semiconductor Sector, P.O. Box 883, Melbourne, FL 32901; (305) 724-7000.

Reprinted with permission from Electronic Design (Vol 35, No. 12), May 14, 1987. Copyright 1987 Hayden Publishing Co. Inc., a subsidiary of VNU.

A 16-bit, RISC-like CPU and complex support functions help a cell library called Force set the ideas of real-time designers into silicon.

Real-time-system designers demanding the highest performance are bound by hardware and software chains. System throughputs are shackled by the limitations of standard microprocessors, a poor match between popular languages and real-time control applications, and the lack of a powerful, general-purpose 16-bit microprocessor that can share one chip with application-specific logic.

Designers can overcome the speed limitations with bit-slice processors, which are hard to program, or by coupling custom logic and bipolar microcontrollers, which are power-hungry. Such solutions are expensive to develop and build, difficult to document and maintain, and often cannot be applied to new technologies.

Out to break those chains is a Forth optimized reduced-instruction-set computing engine (Force) and a standard-cell toolkit. The CMOS engine is a 16-bit processor in standard-cell form, and the toolbox is a set of complex cells, among them a stack controller, interrupt controller, multiplier, and multiplier-accumulator. The cells fit into a computer-aided-design package for developing real-time products.

The architecture of the reduced-instruction-set processor puts to work in hardware an existing Forth-language virtual machine (see "Forth: A Language for Real-Time Control," p. 94). The silicon version grows out of technology licensed from Novix Inc., of Cupertino, Calif., and makes the virtual machine available as an embedded CPU and standard cell-based product. The low gate count, however, does not sacrifice performance.

The RISC-like architecture avoids a high gate count, a problem that blocks other processors from serving as standard cells. The CPU has only 2500 gates, a very low number that owes to the processor's simple, directly executable set of Forth words and to heavy reliance on the two memory stacks. The CPU's low gate count leaves plenty of chip for stack memory, external interfaces, and specialized I/O to be created with cell-library CAD tools.

Depending on the task, the core processor can operate at clock rates in excess of 15 MHz, with an average execution throughput of between 1 and 1.5 clock cycles per instruction. That corresponds to a sustained throughput of 10 to 15 million high-level instructions per second (MIPS). Also, because each instruction executes the equivalent of several Forth primitives, peak processing throughput can exceed 30 million Forth primitives per second.

A typical form that the Force machine would take starts with the core processor and adds three memories: a main program memory, a return stack for dealing with subroutine calls, and a parameter stack for storing data (Fig. 1). Inside the core are three key registers that keep the operations moving at high speed: the I register, which is the logical top of the return stack; and the Top and Next registers, which are the two top locations of a parameter stack.

Thanks to the small core size, designs can rely heavily on on-chip stack memories. For some tasks, data and program memories could also be included on chip. However, when large amounts of memory become too costly, fast, off-chip ROMs, EPROMs, and static RAMs enable designers to build systems operating at clock speeds of 10 MHz or greater.

HIGHLY PARALLEL ARCHITECTURE

All instructions execute in one or two clock cycles, and all three memory spaces can be accessed simultaneously. Although the concepts of RISC design apply to the processor, the architecture differs significantly from other CPUs of that type. The Force core puts to work a high-level language as its native instructions, giving the programmer a compact set of very powerful commands.

Other RISC processors use a reduced set of low-level instructions that help the chips optimize throughput. Often, however, programs are big and development time is long. In contrast, the Force core needs less code for a given application, increasing programmer productivity and cutting software-development costs.

The processor core executes instructions fetched out of main memory. These instructions closely mirror the Forth language primitives. Arithmetic and logic commands operate on the Top and Next registers and return the result to both those units, especially the Top register. Similarly, if operations must be performed on the return stack, the I register comes into play. There is also a fast I/O bus that can be accessed in parallel with the memories. For extra speed, arithmetic or logic operations, if needed, can be performed on read I/O data during, rather than after, the read cycle.

The main memory holds data, instructions, and the traditional Forth "dictionary" structures. Dual stacks are formed by two dedicated RAMs, which appear to the processor as last-in, first-out (LIFO) structures controlled by the stack-controller subsections (one for each stack). The stack controllers generate stack memory addresses under the direction of the processor.

In Forth, subroutine calls and operations on the data stack are most important. For this reason, the architecture is geared to these operations. For instance, the subroutine call takes one clock, as do the "top-of-stack" arithmetic or logic operations. In addition, a subroutine return can occur in the same clock cycle as most other instructions. As a result, the total subroutine call-and-return overhead is cut to just one clock cycle with no extra time needed for the return.

The processor's highly parallel architecture executes the equivalent of several Forth language primitives at

Forth: a language for real-time control

Designing today's real-time control applications requires a high-level language that interfaces easily with custom hardware. The ideal language would need little or no dedicated memory outside that required for the application; it would present few, if any, restrictions on application-memory locations. The language must also lend itself to testing and debugging the application in its real-time environment. Finally, compiled application code should be compact and execute fast.

One language that easily meets all those conditions is Forth. No ancillary libraries or executives take up valuable memory space, and because Forth is quite compact, much of the development can actually take place on the application hardware. This ability greatly aids the integration and testing phase of the program. Run-time diagnostics are similarly aided by a small interpreter that does real-time monitoring and control in stand-alone tasks.

Forth is an integrated software-development environment incorporating an editor, compiler, and debugger, as well as a host of other development utilities that, in other environments, are usually separate. Because Forth is interpretive, the programmer can directly compile and execute code as soon as it is entered. Being interpretive, the language speeds prototyping. It lets designers find out if the basic algorithm is correct before committing much time and resources to generating code.

Forth is a software environment that embodies a "virtual machine," which is the heart of the development system. Much of the virtual machine is the dictionary, which is a linked list of procedures called words. Stacks communicate parameters between procedures and link subroutine calls during execution; they are the hub of all machine activity. The parameter stack maintains the program data, and the return stack maintains the return addresses during execution.

A typical application is built upon subroutines, and each word that is defined in Forth is treated like a procedure call. An application is built up of words predefined by the programmer. These words can in turn be used again to define more complex words in the application. As each new word is defined, it is entered into the dictionary, from which it is pulled as needed. This process continues until the final application program exists as a single word.

The internal structure of a Forth word consists of a header, containing the name field and dictionary link, and the body. The body is a list of subroutine calls to the words that make up the definition. During execution of a word, its internal list of procedures is executed, each entry calling another word with its own internal subroutine list. This process of subroutine calls continues until a low-level primitive is encountered. That primitive is then executed, as it contains low-level machine instructions.

This process of layered subroutine calls is often referred to as threaded code. Because of this form of execution, the virtual machine depends heavily on modular programming techniques and subroutine calls.

ments, the cells can be cascaded using their Carry and Borrow signals. The cells generate the Stack Underflow and Stack Overflow outputs to indicate an empty or full status of the stack. Usually, these outputs would interrupt the processor.

For math-intensive applications, the toolbox includes a 16-by-16-bit fast multiplier that executes a proprietary algorithm and operates at cycle times below 50 ns. The multiplier performs a signed-magnitude multiplication within one clock cycle of the engine. Also available is a 16-by-16-bit multiplier-accumulator (MAC) cell for jobs like digital filtering.

For dedicated tasks calling for parallel external ALU or concurrent external arithmetic operations, a set of fast 16- and 32-bit arithmetic cells is available. The library also includes a 32-bit and a leading-zero-detector cell for floating-point math operations that need fast normalization and denormalization.

Giving priority encoding of up to 15 asynchronous maskable interrupts, the interrupt vector-generator (IVG) cell also delivers a 16th, nonmaskable interrupt (NMI) directly to the processor's NMI input. The IVG's mask register has a bit for each of the prioritized interrupt inputs and is loaded from the processor's T bus. That bus connects to the top of the parameter stack.

Any true interrupt input that is not masked will set INT, informing the processor of the pending interrupt. At the same time, the IVG produces a vector location corresponding to the highest-priority unmasked interrupting input. This location establishes the point at which interrupt-code execution starts when the interrupt is acknowledged. The interrupt-vector location is read onto the core I/O-input (G) bus when the signal INTA emanates from the core during an interrupt-acknowledge cycle. That vector can also be read by subsequent I/O read instructions.

An asynchronous host-interface cell lets the core communicate with slower hosts, typically general-purpose processors, in a master-slave environment. In the interface scheme, the host processor can access either part or all of the chip's data or program RAM, which would appear to the host as a block of its own memory space. The interface cell's arbitration logic allows the host only one access at a time, interleaved with one or more core-processor accesses. This limit, however, may be bypassed with lock options by either processor. Those options grant temporary exclusive access to the interface by one processor or the other.

PUTTING THE PIECES TOGETHER

To see how all the macrocells come together, examine their work in a high-performance processor aimed at closed-loop control and other math-intensive tasks. Such jobs include robotics, instrumentation, flight control, sig-

[Figure 2: block diagram of the Force core. Labels: main-memory address, input, and output buses; I/O input bus; parameter-stack and return-stack buses; Top and Next registers; instruction, configuration, multiply-divide, and square-root registers; decrementer; byte swap; Y bus; instruction decode.]

2. A simple processor, the 16-bit Force core contains three key registers, the Top, Next, and I, that hold the most time-critical information. Other registers in the core take care of status-handling operations. Special registers and logic help multiply, divide, and find square roots.

once. In only two cycles, one such multifunction instruction performs the Forth equivalent of OVER SWAP - (which exchanges, duplicates, and subtracts the values in two registers). It is executed just like classic Forth code.

The complex macrocells in the Force toolbox are based on an advanced cell library and computer-aided-design capability developed by Harris and SDA Systems Inc., of Santa Clara, Calif. The tools define cells and macrocells and efficiently place and route completed designs. They route designs incorporating fixed blocks of logic or memory, macrocells, and standard cells. Also included are tools to verify and simulate completed designs. The SDA software can compile RAMs and ROMs with variable size, configuration, and layout shape.

Besides the specially developed Force toolbox macrocells, the SDA design environment also contains a number of microprocessor-support peripherals, such as industry-standard and proprietary serial asynchronous transmitter-receivers, baud-rate generators, clock generators, programmable interval timers, and Manchester encoder-decoders. Not only that, the toolbox ties directly into Harris's already available standard-cell library. As a result, the designer can glue the complex blocks together or develop added logic functions using the 7400-series logic elements.

For breadboarding of systems, Harris has developed a 144-lead version of the Force core only. The core is included in a demonstration board from Logical Devices Inc., of Ft. Lauderdale. With this board, designers can access all of the processor's I/O, making it possible to breadboard a full system. It can interface to any CRT terminal or serial-communication port, and contains ROM and RAM space for the application code. Extra space is available should a particular task require more memory.

A development system configured as an IBM PC plug-in board and aimed at a PC-integrated development environment is now in final development by Silicon Composers Inc. of Palo Alto, Calif. Moreover, a Forth target compiler is now hosted on the IBM PC family. This target compiler makes possible the development of software in the PC environment, generating executable code for the Force processor and for use as a development tool for software vendors.

A CLOSE LOOK AT THE MACROCELLS

The Force processor core is the key element of the toolbox. That simple yet powerful control engine is tied to other circuits by means of parallel, 16-bit data paths to the parameter stack, return stack, main memory, and the general-purpose I/O bus. There is also a 16-bit main memory address bus and a 5-bit address extension, which also functions as an I/O address bus. Inside the processor core are eight registers, four of which are independently accessible in parallel (Fig. 2).

The Top, Next, I, and Instruction registers are all separately accessible so that multiple operations can be done simultaneously. The other four are the program counter, square-root, configuration, and multiply-divide registers. With its byte-swap logic on the main memory buses, the processor can rapidly reorder or perform byte reads or writes.

The core's two main work areas, the parameter-stack and return-stack memories, are addressed through identical stack-controller cells (Fig. 3), which generate address pointers within 5 ns from the time that the core's stack, read, and write signals become valid. The ability to quickly generate the stack addresses is critical to maximizing throughput.

The stack-controller cells can also be externally preloaded from the processor's data bus with a predetermined stack address. For deep stack-memory require-

[Figure 1: block diagram of a Force-based system on one chip. Labels: Force core; parameter-stack RAM and return-stack RAM, each with a stack-pointer controller; 16x16-bit multiplier; clock generator; interrupt logic; host interface; data RAM; program ROM. Footnote: "User-configurable for depth."]

1. Harris Semiconductor's Forth-optimized, reduced-instruction-set computing engine (Force) core can be surrounded by the stacks and various support functions and memories to build a complete system on just one chip.

[Figure 3: block diagram of the stack controller. Labels: Stack, Read, and Write inputs; address increment/decrement logic; stack address register; stack address steering logic; top-of-stack limiting register; underflow/overflow-detection logic; Underflow and Overflow outputs; stack address output.]

3. One of the key support cells in the Force toolbox is the stack controller, which turns ordinary RAMs into last-in, first-out stacks for the processor. Two versions of the stack controller address 64 or 256 words of memory. Multiple controllers can be cascaded to handle larger memory spaces.

nal processing, graphics, and image processing. (That processor is, in fact, the first planned product to be built with the toolbox.)

The control processor takes advantage of the Force engine, the high-performance proprietary multiplier, and the normalize-shift macrocells (Fig. 4). It can function as a stand-alone processor, but because it includes a host interface, it can share external main memory with any host processor. Provisions are also made to handle interrupts to and from the host processor.

Highly integrated, the chip includes two 128-word stack memories and controllers, one for the parameter stack and one for the return stack. The address registers of each of these macrocells can be loaded and read as I/O devices. On top of that, the overflow and underflow output lines from each of these macrocells drive interrupt inputs on the IVG.

Arithmetic hardware sits on-chip to speed computations. That hardware includes a 16-by-16-bit multiplier and the normalize-denormalize-shift macrocell. The latter simplifies software development by delivering all the normalization that modern control systems typically demand for fast floating-point operations.

For control tasks, three 16-bit timer-counters on the chip supply a programmable time base, internal timing, or event-counting functions. These timers are clocked by a prescaled internal clock or by external inputs. A 16-input IVG macrocell obtains a fast response to internal or external events. The internal events flagged include the four overflow and underflow conditions of the two stacks and the three timeouts for the timers. Provisions are also made for nine external interrupts, including a host interrupt, a nonmaskable interrupt, and seven maskable interrupts. The interrupt mask register in the IVG cell can enable or disable any of these interrupts except the nonmaskable one.

The general-purpose coprocessor has the ability to address up to 16 Mbytes of memory for code and data, all of which is external. Such a large range matches the addressing capability of most general-purpose host processors and supplies the space needed for software development, graphics, and image-processing jobs. It also makes possible complete flexibility with respect to memory configuration.

Because the core processor itself can produce only a 16-bit address (capable of addressing 64 kbytes of code or data memory), it is supplemented by memory-address-

[Figure 4: block diagram of the general-purpose coprocessor. Labels: Force core; data stack (128 words) and data-stack control; return stack (128 words) and return-stack control; interrupt control with interrupt inputs (I = interrupt input generators, including the nonmaskable interrupt); multiplier; timer; normalize/shift; address extension; configuration register; current-task register; host interface; host interrupt and address control logic; bidirectional buffers; user drivers; G bus; memory data, address, and control lines.]

4. A general-purpose coprocessor, with on-chip resources to handle fast integer multiplication and floating-point math, is easily assembled from the cells in the Force toolbox. Both the parameter and return stacks, as well as three timer-counters and an interrupt controller, are on the chip.
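The interrupt handling of this coprocessor can be pictured with a small software model. The sketch below is only an illustration of priority encoding with a mask register, written in Python rather than as hardware. The article gives the number of inputs (16) and the fact that the nonmaskable interrupt cannot be disabled; everything else here is an assumption made for the example: the choice of input 0 as the NMI and as highest priority, the convention that a set mask bit disables an input, and the `VECTOR_BASE` and `VECTOR_STRIDE` values.

```python
# Hypothetical model of a 16-input interrupt vector generator (IVG).
# Input numbering, priority order, mask polarity, and vector addresses
# are assumptions for illustration; the article does not specify them.

NUM_INPUTS = 16
STACK_EVENTS = 4      # overflow and underflow for each of the two stacks
TIMER_EVENTS = 3      # one timeout per timer-counter
EXTERNAL_EVENTS = 9   # host interrupt, NMI, and seven maskable inputs

NMI_INPUT = 0         # assumed: input 0 is the nonmaskable interrupt
VECTOR_BASE = 0x0100  # assumed base address of the vector table
VECTOR_STRIDE = 0x20  # assumed spacing between vector entries


def pending_vector(requests, mask):
    """Return the vector of the highest-priority unmasked request.

    requests: 16-bit word, bit n set when input n is requesting service.
    mask:     16-bit word, bit n set to disable input n (NMI bit ignored).
    Returns None when no enabled request is pending.
    """
    # The NMI bit is forced to "enabled" regardless of the mask register.
    enable = (~mask | (1 << NMI_INPUT)) & 0xFFFF
    active = requests & enable
    # Lower input numbers are assumed to have higher priority.
    for line in range(NUM_INPUTS):
        if active & (1 << line):
            return VECTOR_BASE + line * VECTOR_STRIDE
    return None


if __name__ == "__main__":
    # Inputs 1 and 3 request service, input 1 is masked: input 3 wins.
    print(hex(pending_vector(0b1010, 0b0010)))
    # The NMI gets through even with every mask bit set.
    print(hex(pending_vector(0b0001, 0xFFFF)))
```

The event counts at the top simply restate the article's figures (four stack conditions, three timer timeouts, nine external interrupts), which together account for all 16 inputs.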

extension logic to achieve the 16-Mbyte range. Thanks to two 8-bit memory-extension registers within the address-extension logic, independent address extension is supplied for code and data to generate 24-bit addresses (16 Mbytes). Because the address extension for code and data is independent, maximum memory flexibility is gained.

An external processor works with a host-interface macrocell to gain access to the entire coprocessor memory (16 Mbytes, if needed). This interface looks like a block of memory to the host, making it very easy to interface to any processor. The host interrupt to the Force processor is put into effect by a host write to a particular memory address. For stand-alone tasks, the host interface need not be used.

Also included on the chip is a clock oscillator and clock generator, which supplies timing signals to the Force core, I/O devices, stack RAMs, and to the rest of the application system.

The core processor's I/O bus is brought off-chip to connect to specialized I/O devices required by the job. These I/O devices can also be built with the Harris-SDA standard cell gate-array design systems. For tasks that do not need all the internal I/O devices, an internal-configuration register selectively disables timers, math hardware, or address extension hardware, making their I/O addresses available to external devices.

PRICE AND AVAILABILITY

The Force library is a part of standard product designs. The first such design is the coprocessor described in this article, which will be released in the fourth quarter. Prices for the coprocessor will be set then. It is also anticipated that the Force library will be released for semicustom design, but no date for that has yet been set.

CIRCLE 501

Todd Jones is a senior engineer at Harris, responsible for Force software planning and development. He has a BS degree in computer science from the University of Idaho and an MS in computer science from the Florida Institute of Technology.

Christopher Malinowski is a senior scientist for Harris's semiconductor research and development department, and program manager for the Force project. He holds an MS degree in nuclear electronics and a Ph.D. in solid-state physics from Warsaw Technical University.

Stanley Zepp, a senior scientist, is Harris's manager of business development for the microprocessor product line. He holds a BEE degree from the University of Florida and an MEE from New York University.

We're backing you up with products, support, and solutions!

SEMICUSTOM/CUSTOM
• CMOS Programmable Logic
• Gate Arrays
• Standard Cells
• Full Custom

TECHNOLOGIES
• CMOS Digital
• CMOS Analog
• Bipolar Analog
• Dielectric Isolation
• Gallium Arsenide
• Radiation Hardened

LINEAR
• Op Amps
• Comparators
• Analog Switches
• Buffers

DATA ACQUISITION
• Analog Multiplexers
• D/A Converters
• A/D Converters
• Sample-and-Hold Amplifiers

TELECOMMUNICATION
• SLICs
• PCM and Univ. Active Filters
• CVSDs
• T-1 and ISDN Circuits

DIGITAL COMMUNICATION
• CMOS 1553 Bus Interface
• CMOS UARTs
• CMOS Manchester Encoder/Decoder
• CMOS ARINC Bus Interface

MICROPROCESSOR
• CMOS 80C86 16-Bit
• CMOS 80C88 8/16-Bit
• CMOS 80C85 RH 8-Bit
• CMOS 80C86 RH 16-Bit
• CMOS 8/16-Bit Peripherals

MEMORY
• CMOS RAMs
• CMOS PROMs
• CMOS Memory Modules

GALLIUM ARSENIDE
• Microwave FETs
• Digital ICs
• Microwave Monolithic ICs
• Microwave Amplifiers
• Custom/Fabrication Services

RADIATION HARDENED
• SRAMs/PROMs
• Microprocessors
• Gate Arrays/Standard Cells
• Op Amps/Multiplexers
• Full Custom

Company Headquarters 2401 Palm Bay Road Palm Bay, FL 32905 (305) 724-7418

International OEM Sales

Europe Headquarters (UK) 44-734-698-787 Japan (Tokyo) 81-3-345-8911

National Distributors

Anthem Electronics Falcon Electronics Hall-Mark Electronics Hamilton-Avnet Corporation RC. Components Schweber Electronics

In Canada Hamilton/Avnet Corporation Semad Electronics

Reorder Number: 6AR-8018 © Harris Corp., June 1987 Printed in U.S.A.

Standard Cell

September 1987

HSC 250 CMOS Cell Library

Features
• 1.5 Micron Effective Channel Length, 2-Layer Metal CMOS
• 1.2ns Typical Gate Delay Through 2-Input NAND
• Up to 100MHz Flip-Flop Toggle Rate
• Over 200 Primitive and Macrocell Functions
• Complex Function Megacells
• Customer Definable RAM and ROM
• Supported on Multiple CAE Platforms
• CMOS/TTL Compatible I/O's
• Commercial-Industrial-Military Temperature Ranges
• Proven Reliable and Manufacturable Process
• Extensive Range of Packaging Options
• Minimum 4kV ESD Protection
• Screening and Qualification to Mil-Std-883 Method 5004/5005, Class B
• Fully Compatible with the HSC200-RH Rad-Hard Library

Description
The HSC 250 STANDARD CELL LIBRARY is a proven, high performance dual-level metal library. The library offers a broad range of predesigned and fully characterized cells, macros, complex megacells and compilable RAM and ROM for developing reliable, cost effective customer specific IC's.

Die Photo


Copyright © Harris Corporation 1987 This information is current as of August 1987. Updates are issued semiannually. HSC250 CMOS Standard Cell Library

Complex Function Megacells

To enhance the level of system integration and reduce the design cycle time, Harris has developed a series of complex function megacells. These functions consist of a family of highly integrated microprocessor peripherals, communication elements, high performance multipliers, and bit slice elements. A list of the available megacells follows:

• Microprocessor Peripherals
82C37A ...... DMA Controller
82C50A ...... Asynchronous Communication Element
82C50B ...... Asynchronous Communication Element
82C52 ...... UART/BRG
82C54 ...... Programmable Interval Timer
82C55A ...... Programmable Peripheral Interface
82C59A ...... Priority Interrupt Controller
82C84A ...... Clock Generator
82C88 ...... Bus Controller

• Communication Elements
HD4702 ...... Programmable Bit Rate Generator
HD6402 ...... UART
HD6406 ...... UART/BRG/Modem Control
HD6408 ...... ASMA
HD6409 ...... Manchester Encoder/Decoder
HD15530 ...... Manchester Encoder/Decoder
HD15531 ...... Programmable Manchester Encoder/Decoder

• Other Functions
H2901 ...... 4-Bit Slice ALU
*HMU16, HMU17, HMU18 ...... 16 x 16 Multipliers
**HMU1010 ...... 16 x 16 Multiplier/Accumulator

Compilable Cells

Harris has further expanded user definability by providing high performance module compilation. This capability allows the customer to quickly generate design specific RAM and ROM cells.

RAM ...... Compilable to 16K
ROM ...... Compilable to 64K

*Contact Factory for availability
**Available Q1, CY'88

Absolute Maximum Ratings

Supply Voltage ...... -0.5V to 7.0V
Input/Output Voltage ...... VSS -0.5V to VCC +0.5V
Input Diode Current ...... 10mA (VI < 0 or VI > VCC)
Output Diode Current ...... 10mA (VO < 0 or VO > VCC)
Power Dissipation ...... 1000mW
Continuous Supply Pin Current, VCC or GND ...... 100mA
Storage Temperature
  Plastic ...... -40°C to +125°C
  Ceramic ...... -65°C to +150°C
Continuous Current per Output ...... 10mA

CAUTION: Stresses beyond those listed under "Absolute Maximum Ratings" may cause permanent damage to the device. These are stress ratings only, and functional operation of the device at these or any other conditions beyond those indicated under "Recommended Operating Conditions" is not implied. Exposure to absolute maximum rated conditions for extended periods may affect device reliability.

NOTE: All applied voltages are with reference to GND (VSS).

Recommended Operating Conditions

D.C. Electrical Specifications
VCC = 5V ± 10%, TA = Operating Temperature Range

SYMBOL  PARAMETER                   MIN       MAX       UNIT  CONDITIONS

VCC     Operating Supply Voltage    4.5       5.5       V

TA      Operating Temperature
          Commercial                0         70        °C
          Industrial                -40       85        °C
          Military                  -55       125       °C

VIH     Input High Voltage
          TTL                       2.2                 V     VCC = 5.5V
          CMOS                      70% VCC

VIL     Input Low Voltage
          TTL                                 0.8       V     VCC = 4.5V
          CMOS                                30% VCC

II      Input Current
          Standard                  -1.0      +1.0      µA    VIN = VSS = 0.0V
          Pull Up                   -500      +10             VIN = VCC = 5.5V
          Pull Down                 -10       +500
          Pull Up*                  -50                 µA    VI = 2.2V; VCC = 5.5V
          Pull Down*                          +50       µA    VI = 0.8V

IOH     Output High Current         6.0                 mA    VOH = 2.4V; VCC = 4.5V
IOL     Output Low Current          -6.0                mA    VOL = 0.4V; VCC = 4.5V
IOZ     Output Leakage              -10.0     +10.0     µA    VSS = VOL = 0.0V; VCC = VOH = 5.5V

ICCSB   Stand-By Supply             ***                 µA    II = 0; IO = 0

CI**    Input Capacitance           10.0 Typical        pF    VI = VCC or VSS; f = 1MHz
CO**    Output Capacitance          10.0 Typical        pF    VO = VCC or VSS; f = 1MHz
CIO**   Input/Output Capacitance    15.0 Typical        pF    VO = VCC or VSS; f = 1MHz

* Maximum input current for which specified VI will be maintained.
** Characterized at initial design and after any major design or process changes. Maximum values may vary by package type.
*** Customer design dependent.


Sales Offices

U.S. HEADQUARTERS
Harris Semiconductor
2401 Palm Bay Road
Palm Bay, Florida 32905
TEL: (305) 724-7418

EUROPEAN HEADQUARTERS
Harris/System Limited
Semiconductor Sector
Eskdale Road
Winnersh Triangle
Wokingham RG11 5TA
Berkshire
United Kingdom
TEL: 0734-698787

FAR EAST HEADQUARTERS
Harris K.K.
Shinjuku NS Bldg. Box 6153
2-4-1 Nishi-Shinjuku
Shinjuku-Ku, Tokyo 163 Japan
TEL: 81-3-345-8911

DISTRIBUTORS IN U.S.A.
Anthem Electronics
Falcon Electronics
Hall-Mark Electronics
Hamilton/Avnet Corporation
Schweber Electronics

DISTRIBUTORS IN CANADA
Hamilton/Avnet Corporation
Semad Electronics

HARRIS

SEMICONDUCTOR PRODUCTS DIVISION Reorder Number: 708-0097