Design Languages for Embedded Systems
Stephen A. Edwards Columbia University, New York [email protected] May, 2003
Abstract ward hardware simulation. VHDL’s primitive are assign- ments such as a = b + c or procedural code. Verilog adds Embedded systems are application-specific computers that transistor and logic gate primitives, and allows new ones to interact with the physical world. Each has a diverse set be defined with truth tables. of tasks to perform, and although a very flexible language might be able to handle all of them, instead a variety of Both languages allow concurrent processes to be de- problem-domain-specific languages have evolved that are scribed procedurally. Such processes sleep until awak- easier to write, analyze, and compile. ened by an event that causes them to run, read and write variables, and suspend. Processes may wait for a pe- This paper surveys some of the more important lan- riod of time (e.g., #10 in Verilog, wait for 10ns in guages, introducing their central ideas quickly without go- VHDL), a value change (@(a or b), wait on a,b), ing into detail. A small example of each is included. or an event (@(posedge clk), wait on clk un- 1 Introduction til clk='1'). An embedded system is a computer masquerading as a non- VHDL communication is more disciplined and flexible. computer that must perform a small set of tasks cheaply and Verilog communicates through wires or regs: shared mem- efficiently. A typical system might have communication, ory locations that can cause race conditions. VHDL’s sig- signal processing, and user interface tasks to perform. nals behave like wires but the resolution function may be Because the tasks must solve diverse problems, a lan- user-defined. VHDL’s variables are local to a single pro- guage general-purpose enough to solve them all would be cess unless declared shared. difficult to write, analyze, and compile. Instead, a variety Verilog’s type system models hardware with four-valued of languages have evolved, each best suited to a particu- bit vectors and arrays for modeling memory. VHDL does lar problem domain. For example, a language for signal- not include four-valued vectors, but its type system allows processing is often more convenient for a particular prob- them to be added. Furthermore, composite types such as C lem than, say, assembly, but might be poor for control- structs can be defined. dominated behavior. Overall, Verilog is the leaner language more directly This paper describes popular hardware, software, geared toward simulating digital integrated circuits. VHDL dataflow, and hybrid languages, each of which excels a cer- is a much larger, more verbose language capable of handing tain problems. Dataflow languages are good for signal pro- a wider class of simulation and modeling tasks. cessing, and hybrid languages combine ideas from the other 3 Software Languages three classes. Due to space, this paper only describes the main features Software languages describe sequences of instructions for a of each language. The author’s book on the subject [9] pro- processor to execute. As such, most consist of sequences of vides many more details on all of these languages. imperative instructions that communicate through memory: an array of numbers that hold their values until changed. 2 Hardware Languages Each machine instruction typically does little more than, Verilog [14, 25] and VHDL [13, 23, 8, 2] are the most pop- say, add two numbers, so high-level languages aim to ular languages for hardware description and modeling (Fig- specify many instructions concisely and intuitively. Arith- ure 1, 2). Both model systems with discrete-event seman- metic expressions are typical: coding an expression such tics that ignore idle portions of the design for efficient sim- as ax2 + bx + c in machine code is straightforward, tedious, ulation. Both describe systems with structural hierarchy: a and best done by a compiler. The C language provides such system consists of blocks that contain instances of primi- expressions, control-flow constructs such as loops and con- tives, other blocks, or concurrent processes. Connections ditionals, and recursive functions. The C++ language adds are listed explicitly. classes as a way to build new data types, templates for poly- Verilog provides more primitives geared specifically to- morphic code, exceptions for error handling, and a standard
1 module fulladd(ai, bi, ci, o, co); jmp L2 # go to L2 input ai, bi, ci; L1: # label output o, co; movl %ebx, %eax # n -> m wire s1; movl %ecx, %ebx # r -> n L2: xor x1(s1, ai, bi), x2(o, s1, ci); xorl %edx, %edx # clear %edx divl %ebx # m / n assign co = (ai + bi + ci) >= 2; movl %edx, %ecx # rem -> r endmodule testl %ecx, %ecx # if r = 0, jne L1 # go to L1 module testadd; reg [2:0] y; wire o, co; reg clk; Figure 3: Euclid’s algorithm in i386 assembly language. Symbols like %ebx represent registers. movl means “move fulladd a1(y[0], y[1], y[2], o, co); long value.” divl %ebx divides %eax by %ebx and puts initial begin the remainder in %edx. y = 0; clk = 0; $monitor($time,,"%d%d%d %d%d", #include
2 class Cplx { import java.io.*; double re, im; class Counter { public: int value = 0; Cplx(double v) : re(v), im(0) {} boolean present = false; Cplx(double r, double i) public synchronized void count() { : re(r), im(i) {} try { while (present) wait(); } double abs() const { catch (InterruptedException e) {} return sqrt(re*re + im*im); value++; present = true; notifyAll(); } } void operator+= (const Cplx& a) { public synchronized int read() { re += a.re; im += a.im; try { while (!present) wait(); } } catch (InterruptedException e) {} }; present = false; notifyAll(); return value; int main() { } Cplx a(5), b(3,4); } b += a; class Count extends Thread { cout << b.abs() << ’\n’; Counter cnt; return 0; public Count(Counter c) { cnt = c; start(); } } public void run() { for (;;) cnt.count(); } } class Mod5 { Figure 5: A C++ fragment illustrating a partial complex public static void main(String args[]) { number type (the C++ library has a complete version). Counter c = new Counter(); Count count = new Count(c); int v; for (;;) if ( (v = c.read()) % 5 == 0 ) flow constructs such as loops of conditionals can affect the System.out.println(v); } order in which instructions execute. When control reaches a } function call in an expression, control is passed to the called function, which runs until it produces a result, and control Figure 6: A contrived Java program that spawns a counting returns to continue evaluating the expression that called the thread to print all numbers divisible by 5. function. C derives its types from those a processor manipulates different use of the template. For example, the same min directly: signed and unsigned integers ranging from bytes template could be used for both integers and floating-point to words, floating point numbers, and pointers. These can numbers. be further aggregated into arrays and structures—groups of named fields. 3.4 Java C programs use three types of memory. Space for Sun’s Java language [1, 11, 20] resembles C++ but is in- global data is allocated when the program is compiled, compatible. Like C++, Java is object-oriented, providing the stack stores automatic variables allocated and released classes and inheritance. It is a higher-level language than when their function is called and returns, and the heap sup- C++ since it uses object references, arrays, and strings in- plies arbitrarily-sized regions of memory that can be deal- stead of pointers. Java’s automatic garbage collection frees located in any order. the programmer from memory management. The C language is an ISO standard, but most people con- Java provides concurrent threads (Figure 6). Creating sult the book by Kernighan and Ritchie [17]. Ritchie de- a thread involves extending the Thread class, creating in- signed the language. stances of these objects, and calling their start methods to 3.3 C++ start a new thread of control that executes the objects’ run methods. C++ (Figure 5) [24] extends C with structuring mechanisms Synchronizing a method or block uses a per-object lock for big programs: user-defined data types, a way to reuse to resolve contention when two or more threads attempt to code with different types, namespaces to group objects and access the same object simultaneously. A thread that at- avoid accidental name collisions when program pieces are tempts to gain a lock owned by another thread will block assembled, and exceptions to handle errors. The C++ stan- until the lock is released, which can be used to grant a thread dard library includes a collection of efficient polymorphic exclusive access to a particular object. data types such as arrays, trees, strings for which the com- piler generates custom implementations. 3.5 RTOS A class defines a new data type by specifying its repre- Many embedded systems use a real-time operating system sentation and the operations that may access and modify it. (RTOS) to simulate concurrency on a single processor. An Classes may be defined by inheritance, which extends and RTOS manages multiple running processes, each written modifies existing classes. For example, a rectangle class in sequential language such as C. The processes perform might add length and width fields and an area method to a the system’s computation and the RTOS schedules them— shape class. attempts to meet deadlines by deciding which process runs A template is a function or class that can work with mul- when. Labrosse [18] describes the implementation of a par- tiple types. The compiler generates custom code for each ticular RTOS.
3 process f(in int u, v; out int w) P1 A { int i; bool b = true; P2 B for (;;) { i = b ? wait(u) : wait(w); P3 C printf("%i\n", i); send(i, w); B A B C b = !b; } } B completes; C starts process g(in int u; out int v, w) A completes; B resumed { C initiated, but higher-priority A continues for (;;) { B preempted by higher-priority A send(wait(u), v); send(wait(u), w); } } Figure 7: The behavior of an RTOS with fixed-priority pre- emptive scheduling. Rate-monotonic analysis gives pro- process h(in int u, out int v, int init) { cess P1 the highest priority since it has the shortest period; send(v, init); P has the lowest. A would have missed its deadline (the for(;;) 3 send(wait(u), v); tick mark) had it not preempted B. }
channel int X, Y, Z, T1, T2; f(Y, Z, X); Most RTOSes uses fixed-priority preemptive scheduling g(X, T1, T2); h(T1, Y, 0); in which each process is given a particular priority (a small h(T2, Z, 1); integer) when the system is designed. At any time, the RTOS runs the highest-priority running process, which is expected to run for a short period of time before suspend- Figure 8: A Kahn Process Network [16]. The f process ing itself to wait for more data. Priorities are usually as- alternately copies from its u and v ports to its w port; the g signed using rate-monotonic analysis [6] (due to Liu and process does the opposite, copying its u port to alternately Layland [21]), which assigns higher priorities to processes v and w; and h simply copies its input to its output. that must meet more frequent deadlines. 1 1 1 8 2 4 2 2 2 2 2 2 1 1 A B C D O F G H 2 2 1D 1 2 4 Dataflow Languages 2D 2 1 K 1 1 1 E P N M L 1 1 I 2 2 2 1 1 2 1 1 2 1 1 Dataflow languages describe systems of procedural pro- 1 J 1 cesses that run concurrently and communicate through queues. Although clumsy for general applications, dataflow languages are a perfect fit for signal-processing algorithms, Figure 9: A modem in SDF (from Bhattacharyya et al. [5]) which use vast quantities of arithmetic derived from linear system theory to decode, compress, or filter data streams that represent periodic samples of continuously-changing blocks until space is available, but if the system deadlocks values such as sound or video. Dataflow semantics are because all buffers are full, the scheduler increases the ca- natural for expressing the block diagrams typically used to pacity of the smallest buffer. describe signal-processing algorithms, and their regularity 4.2 Synchronous Dataflow makes dataflow implementations very efficient because oth- erwise costly run-time scheduling decisions can be made at Lee and Messerschmitt’s Synchronous Dataflow [19] fix the compile time, even in systems containing multiple sampling communication patterns of the blocks in a Kahn network. rates. Each time a block runs, it consumes and produces a fixed 4.1 Kahn Process Networks number of data tokens on each of its ports. This predictabil- ity allows SDF to be scheduled completely at compile-time, Kahn Process Networks [16] form a formal basis for producing very efficient code. dataflow computation. Kahn’s systems consist of processes Scheduling operates in two steps. First, the rate at which that communicate exclusively through unbounded point-to- each block fires is established by considering the produc- point first-in, first-out queues. Reading from a port makes tion and consumption rates of each block at the source and a process wait until data is available, so the behavior of sink of each queue. For example, one of the arcs in Fig- Kahn’s networks does not depend on execution speeds. ure 9 implies 2C = 4D. Once the rates are established, any Balancing processes’ relative execution rates to avoid algorithm that simulates the execution of the network with- an unbounded accumulation of tokens is the challenge in out buffer underflow will produce a correct schedule if one scheduling a Kahn network. One general approach, pro- exists. However, more sophisticated techniques reduce gen- posed in Parks’ thesis [22] places artificial limits on the erated code and buffer sizes by better ordering the execution size of each buffer. Any process that writes to a full buffer of the blocks (see Bhattacharyya et al. [4]).
4 5 Hybrid Languages Esterel SDL SystemC CCSS Hybrid languages combine ideas from others to solve dif- Concurrent ● ● ● ● ferent types of problems. Esterel excels at discrete control Hierarchy ● ● ● ● by blending software-like control flow with the synchrony Preemption ● ● ● and concurrency of hardware. Communication protocols Deterministic ● ❍ ● are SDL’s forte; it uses extended finite-state machines with Synchronous communication ● ● ● single input queues. SystemC provides a flexible discrete- Buffered communication ● ● ● event simulation environment built on C++. CoCentricTM FIFO communication ● ❍ ● System Studio combines dataflow with Esterel-like finite- Procedural ● ❍ ● ❍ ● ● ❍ ● state machine semantics to simulate and synthesize dataflow Finite-state machines ● ● ● applications that also require control. Dataflow Multi-rate dataflow ❍ ● 5.1 Esterel Software implementation ● ● ● ● Hardware implementation ● ● ● Intended for specifying control-dominated reactive systems, ● full support ❍ partial support. Esterel [3] combines the control constructs of an impera- tive software language with concurrency, preemption, and Table 2: Hybrid language features compared. a synchronous model of time like that used in synchronous digital circuits. In each clock cycle, the program awakens, reads its inputs, produces outputs, and suspends. module Example:
An Esterel program communicates through signals that input S, I; are either present or absent each cycle. In each cycle, each output O; signal is absent unless an emit statement for the signal runs signal R, A in and makes the signal present for that cycle only. Esterel every S do await I; guarantees determinism by requiring each emitter of a sig- weak abort nal to run before any statement that tests the signal. sustain R when immediate A; 5.2 SDL emit O || loop SDL is a graphical specification language developed for pause; pause; describing telecommunication protocols defined by the present R then emit A end; end ITU [15] (Ellsberger [10] is more readable). A system con- end sists of concurrently-running FSMs, each with a single in- end put queue, connected by channels that define which mes- end module sages they carry. Each FSM consumes the most recent mes- sage in its queue, reacts to it by changing internal state or sending messages to other FSMs, changes to its next state, Figure 10: An Esterel program modeling a shared resource. and repeats the process. Each FSM is deterministic, but be- cause messages from other FSMs may arrive in any order because of varying execution speed and communication de- #include "systemc.h" lays, an SDL system may behave nondeterministically. struct complex_mult : sc_module { sc_in
5 5.4 CoCentric System Studio [10] Jan Ellsberger, Dieter Hogrefe, and Amardeo Sarma. SDL: Formal Object-Oriented Language for Communicating TM CoCentric System Studio [7] uses a hierarchical for- Systems. Prentice Hall, Upper Saddle River, New Jersey, malism that combines Kahn-like dataflow and hierarchi- second edition, 1997. cal, concurrent FSMs. The FSMs resemble Harel’s State- [11] James Gosling, Bill Joy, Guy Steele, , and Gilad Bracha. charts [12], but use Esterel’s synchronous semantics to en- The Java Language Specification. Addison-Wesley, sure determinism. Reading, Massachusetts, second edition, 2000. A CCSS model is built hierarchically from Dataflow, [12] David Harel. Statecharts: A visual formalism for complex AND, OR, and Gated models. Dataflow models are Kahn systems. Science of Computer Programming, 8(3):231–274, Process networks. The blocks may be dataflow primitives June 1987. written in a C++ subset or other hierarchical models. AND [13] IEEE Computer Society, 345 East 47th Street, New York, models run concurrently and communicate with Esterel-like New York. IEEE Standard VHDL Language Reference synchronous semantics. OR models are finite-state ma- Manual (1076-1993), 1994. chines that may manipulate data and whose states may con- [14] IEEE Computer Society, 345 East 47th Street, New York, tain other models. Gated models contain sub-models whose New York. IEEE Standard Hardware Description execution can be temporarily suspended under external con- Language Based on the Verilog Hardware Description trol. Language (1364-1995), 1996. [15] International Telecommunication Union, Place des Nations, References CH-1221, Geneva 20, Switzerland. ITU-T Recommendation [1] Ken Arnold, James Gosling, and David Holmes. The Java Z.100: Specification and Description Language, 1999. Programming Language. Addison-Wesley, Reading, [16] Gilles Kahn. The semantics of a simple language for Massachusetts, third edition, 2000. parallel programming. In Information Processing 74: [2] Peter J. Ashenden. The Designer’s Guide to VHDL. Proceedings of IFIP Congress 74, pages 471–475, Morgan Kaufmann, San Francisco, California, 1996. Stockholm, Sweden, August 1974. North-Holland. [17] Brian W. Kernighan and Dennis M. Ritchie. The C [3] Gerard´ Berry and Georges Gonthier. The Esterel Programming Langage. Prentice Hall, Upper Saddle River, synchronous programming language: Design, semantics, New Jersey, second edition, 1988. implementation. Science of Computer Programming, 19(2):87–152, November 1992. [18] Jean Labrosse. MicroC/OS-II. CMP Books, Lawrence, Kansas, 1998. [4] Shuvra S. Bhattacharyya, Ranier Leupers, and Peter [19] Edward A. Lee and David G. Messerschmitt. Synchronous Marwedel. Software synthesis and code generation for data flow. Proceedings of the IEEE, 75(9):1235–1245, signal processing systems. IEEE Transactions on Circuits September 1987. and Systems—II: Analog and Digital Signal Processing, 47(9):849–875, September 2000. [20] Tim Lindholm and Frank Yellin. The Java Virtual Machine Specification. Addison-Wesley, Reading, Massachusetts, [5] Shuvra S. Bhattacharyya, Praveen K. Murthy, and 1999. Edward A. Lee. Synthesis of embedded software from [21] C. L. Liu and James W. Layland. Scheduling algorithms for synchronous dataflow specifications. Journal of VLSI Signal multiprogramming in a hard real-time environment. Journal Processing Systems, 21(2):151–166, June 1999. of the Association for Computing Machinery, 20(1):46–61, [6] Loic P. Briand and Daniel M. Roy. Meeting Deadlines in January 1973. Hard Real-Time Systems: The Rate Monotonic Approach. [22] Thomas M. Parks. Bounded Scheduling of Process IEEE Computer Society Press, New York, New York, 1999. Networks. PhD thesis, University of California, Berkeley, [7] Joseph Buck and Radha Vaidyanathan. Heterogeneous 1995. Available as UCB/ERL M95/105. modeling and simulation of embedded systems in El Greco. [23] Douglas L. Perry. VHDL. McGraw-Hill, New York, third In Proceedings of the Eighth International Workshop on edition, 1998. Hardware/Software Codesign (CODES), San Diego, [24] Bjarne Stroustrup. The C++ Programming Language. California, May 2000. Addison-Wesley, Reading, Massachusetts, third edition, [8] Ben Cohen. VHDL Coding Styles and Methodologies. 1997. Kluwer, Boston, Massachusetts, second edition, 1999. [25] Donald E. Thomas and Philip R. Moorby. The Verilog Hardware Description Language. Kluwer, Boston, [9] Stephen A. Edwards. Languages for Digital Embedded Massachusetts, fourth edition, 1998. Systems. Kluwer, Boston, Massachusetts, September 2000.
6