Advances in Dataflow Programming Languages

WESLEY M. JOHNSTON, J. R. PAUL HANNA, AND RICHARD J. MILLAR
University of Ulster

Abstract. Many developments have taken place within dataflow programming languages in the past decade. In particular, there has been a great deal of activity and advancement in the field of dataflow visual programming languages. The motivation for this article is to review the content of these recent developments and how they came about. It is supported by an initial review of dataflow programming in the 1970s and 1980s that led to current topics of research. It then discusses how dataflow programming evolved toward a hybrid von Neumann dataflow formulation, and adopted a more coarse-grained approach. Recent trends toward dataflow visual programming languages are then discussed with reference to key graphical dataflow languages and their development environments. Finally, the article details four key open topics in dataflow programming languages.

Categories and Subject Descriptors: A.1 [Introductory and Survey]; C.1 [Processor Architectures]; D.2 [Software Engineering]; D.3 [Programming Languages]

General Terms: Languages, Theory

Additional Key Words and Phrases: Dataflow, software engineering, graphical programming, component software, multithreading, co-ordination languages, data flow visual programming

1. INTRODUCTION

The original motivation for research into dataflow was the exploitation of massive parallelism. Therefore, much work was done to develop ways to program parallel processors. However, one school of thought held that conventional "von Neumann" processors were inherently unsuitable for the exploitation of parallelism [Dennis and Misunas 1975; Weng 1975]. The two major criticisms that were leveled at von Neumann hardware were directed at its global program counter and global updatable memory [Silc et al. 1998], both of which had become bottlenecks [Ackerman 1982; Backus 1978]. The alternative proposal was the dataflow architecture [Davis 1978; Dennis and Misunas 1975; Weng 1975], which avoids both of these bottlenecks by using only local memory and by executing instructions as soon as their operands become available. The name dataflow comes from the conceptual notion that a program in a dataflow computer is a directed graph and that data flows between instructions, along its arcs [Arvind and Culler 1986; Davis and Keller 1982; Dennis 1974; Dennis and Misunas 1975]. Dataflow hardware architectures looked promising [Arvind and Culler 1986; Dennis 1980; Treleaven and Lima 1984; Veen 1986], and a number of physical implementations were constructed and studied (for examples, see Davis [1978], Keller [1985], Papadopoulos [1988], Sakai et al. [1989], and Treleaven et al. [1982]).

Authors' addresses: Faculty of Engineering, University of Ulster, Newtownabbey, Northern Ireland, BT37 0QB; email: W.M. Johnston, [email protected]; J. R. P. Hanna and R. J. Millar, {p.hanna,rj.millar}@ulster.ac.uk. Permission to make digital/hard copy of part or all of this work for personal or classroom use is granted without fee provided that the copies are not made or distributed for profit or commercial advantage, the copyright notice, the title of the publication, and its date appear, and notice is given that copying is by permission of ACM, Inc. To copy otherwise, to republish, to post on servers, or to redistribute to lists requires prior specific permission and/or a fee. © 2004 ACM 0360-0300/04/0300-0001 $5.00

Faced with hardware advances, researchers found problems in compiling conventional imperative languages to run on dataflow hardware, particularly those associated with side effects and locality [Ackerman 1982; Arvind et al. 1977; Arvind and Culler 1986; Kosinski 1973; Wail and Abramson 1995; Weng 1975; Whiting and Pascoe 1994]. They found that by restricting certain aspects of these languages, such as assignments, they could create languages [Ackerman 1982; Ashcroft and Wadge 1977; Dennis 1974; Hankin and Glaser 1981; Kosinski 1978] that more naturally fitted the dataflow architecture and could thus run much more efficiently on it. These are the so-called dataflow programming languages [Ackerman 1982; Whiting and Pascoe 1994] that developed distinct properties and programming styles as a consequence of the fact that they were compiled into dataflow graphs—the "machine language" of dataflow computers.

The often-expressed view in the 1970s and early 1980s that this form of dataflow architecture would take over from von Neumann concepts [Arvind et al. 1977; Treleaven et al. 1982; Treleaven and Lima 1984] never materialized [Veen 1986]. It was realized that the parallelism used in dataflow architectures operated at too fine a grain and that better performance could be obtained through hybrid von Neumann dataflow architectures. Many of these architectures [Bic 1990] took advantage of more coarse-grained parallelism where a number of dataflow instructions were grouped and executed in sequence. These sets of instructions are, nevertheless, executed under the rules of the dataflow execution model and thus retain all the benefits of that approach. Most dataflow architecture efforts being pursued today are a form of hybrid [Iannucci 1988; Nikhil and Arvind 1989], although not all, for example, Verdoscia and Vaccaro [1998].

The 1990s saw a growth in the field of dataflow visual programming languages (DFVPLs) [Auguston and Delgado 1997; Baroth and Hartsough 1995; Bernini and Mosconi 1994; Ghittori et al. 1998; Green and Petre 1996; Harvey and Morris 1993, 1996; Hils 1992; Iwata and Terada 1995; Morrison 1994; Mosconi and Porta 2000; Serot et al. 1995; Shizuki et al. 2000; Schürr 1997; Whiting and Pascoe 1994; Whitley 1997]. Some of these, such as LabView and Prograph, were primarily driven by industry, and the former has become a successful commercial product that is still used today. Other languages, such as NL [Harvey and Morris 1996], were created for research. All have software engineering as their primary motivation, whereas dataflow programming was traditionally concerned with the exploitation of parallelism. The latter remains an important consideration, but many DFVPLs are no longer primarily concerned with it. Experience has shown that many key advantages of DFVPLs lie with the software development lifecycle [Baroth and Hartsough 1995].

This article traces the development of dataflow programming through to the present. It begins with a discussion of the dataflow execution model, including a brief overview of dataflow hardware. Insofar as this research led to the development of dataflow programming languages, a brief historical analysis of these is presented. The features that define traditional, textual dataflow languages are discussed, along with examples of languages in this category. The more recent trend toward large-grained dataflow is presented next. Developments in the field of dataflow programming languages in the 1990s are then discussed, with an emphasis on DFVPLs. As the environment is key to the success of a DFVPL, a discussion of the issues involved in development environments is also presented, after which four examples of open issues in dataflow programming are presented.


Fig. 1. A simple program (a) and its dataflow equivalent (b).

2. THE DATAFLOW EXECUTION MODEL

2.1. The Pure Dataflow Model

In the dataflow execution model, a program is represented by a directed graph [Arvind and Culler 1986; Davis and Keller 1982; Dennis 1974; Dennis and Misunas 1975; Karp and Miller 1966]. The nodes of the graph are primitive instructions such as arithmetic or comparison operations. Directed arcs between the nodes represent the data dependencies between the instructions [Kosinski 1973]. Conceptually, data flows as tokens along the arcs [Dennis 1974], which behave like unbounded first-in, first-out (FIFO) queues [Kahn 1974]. Arcs that flow toward a node are said to be input arcs to that node, while those that flow away are said to be output arcs from that node.

When the program begins, special activation nodes place data onto certain key input arcs, triggering the rest of the program. Whenever a specific set of input arcs of a node (called a firing set) has data on it, the node is said to be fireable [Arvind and Culler 1986; Comte et al. 1978; Davis and Keller 1982]. A fireable node is executed at some undefined time after it becomes fireable. The result is that it removes a data token from each arc in the firing set, performs its operation, and places a new data token on some or all of its output arcs. It then ceases execution and waits to become fireable again. By this method, instructions are scheduled for execution as soon as their operands become available. This stands in contrast to the von Neumann execution model, in which an instruction is only executed when the program counter reaches it, regardless of whether or not it can be executed earlier than this.

The key advantage is that, in dataflow, more than one instruction can be executed at once. Thus, if several instructions become fireable at the same time, they can be executed in parallel. This simple principle provides the potential for massive parallel execution at the instruction level.

An example of dataflow versus a traditional sequential program is shown in Figure 1. Figure 1(a) shows a fragment of program code and Figure 1(b) shows how this is represented as a dataflow graph. The arrows represent arcs, and the circles represent instruction nodes. The square represents a constant value, hard-coded into the program. The letters represent where data flows in or out of the rest of the program, which is not shown. Where more than one arrow emanates from a given input, it means that the single value is duplicated and transmitted down each path.
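To make the firing rule concrete, the following is a minimal sketch of a token-driven interpreter for the graph of Figure 1(b). It is not taken from the article: Python stands in for dataflow hardware, arcs are modelled as unbounded FIFO queues as in the pure model, and the node and arc names are our own.

```python
from collections import deque

# Arcs are unbounded FIFO queues of tokens, as in the pure dataflow model.
arcs = {name: deque() for name in ["X", "Y1", "Y2", "A", "B", "C"]}

# Each node: (operation, input arcs, output arcs).
nodes = [
    (lambda x, y: x + y, ["X", "Y1"], ["A"]),   # A := X + Y
    (lambda y: y / 10,   ["Y2"],      ["B"]),   # B := Y / 10
    (lambda a, b: a * b, ["A", "B"],  ["C"]),   # C := A * B
]

def fireable(node):
    _, inputs, _ = node
    return all(arcs[i] for i in inputs)          # a token on every input arc

def fire(node):
    op, inputs, outputs = node
    args = [arcs[i].popleft() for i in inputs]   # consume one token per input
    result = op(*args)
    for o in outputs:                            # place a new token on each output
        arcs[o].append(result)

# The environment places the initial tokens; the forked arc from Y duplicates its value.
x, y = 35, 25
arcs["X"].append(x); arcs["Y1"].append(y); arcs["Y2"].append(y)

# Schedule by data availability alone: repeatedly fire any fireable node.
fired = True
while fired:
    fired = False
    for node in nodes:
        if fireable(node):
            fire(node)
            fired = True

print(arcs["C"].popleft())   # (35 + 25) * (25 / 10) = 150.0
```

On a dataflow machine, the addition and division, which are both fireable in the first pass, would execute in parallel; the sequential scan here only imitates the scheduling rule.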


Fig. 2. Gates in a dataflow graph.

Under the von Neumann execution model, the program in Figure 1(a) would execute sequentially in three time units. In time unit 1, X and Y are added and assigned to A. In time unit 2, Y is divided by 10 and assigned to B. In time unit 3, A and B are multiplied together and assigned to C.

Under the dataflow execution model, where the graph in Figure 1(b) is the machine code, the addition and division are both immediately fireable, as all of their data is initially present. In time unit 1, X and Y are added in parallel with Y being divided by 10. The results are placed on the output arcs, representing variables A and B. In time unit 2, the multiplication node becomes fireable and is executed, placing the result on the arc representing the variable C. (In dataflow, every arc can be said to represent a variable.) In this scenario, execution takes only two time units under a parallel execution model.

It is clear that dataflow provides the potential for a substantial speed improvement by utilizing data dependencies to locate parallelism. In addition, if the computation is to be performed on more than one set of data, the calculations on the second wave of values of X and Y can be commenced before those on the first set have been completed. This is known as pipelined dataflow [Gao and Paraskevas 1989; Wadge and Ashcroft 1985] and can utilize a substantial degree of parallelism, particularly in loops, although techniques exist to utilize greater parallelism in loops [Arvind and Nikhil 1990]. A dataflow graph that produces a single set of output tokens for each single set of input tokens is said to be well-behaved [Dennis 1974; Weng 1975].

Another key point is that the operation of each node is functional. This is because data is never modified (new data tokens are created whenever a node fires), no node has any side effects, and the absence of a global data store means that there is locality of effect. As a result of being functional, and the fact that the data travels in ordered queues, a program expressed in the pure dataflow model is determinate [Arvind and Culler 1986; Davis and Keller 1982; Kahn 1974]. This means that, for a given set of inputs, a program will always produce the same set of outputs. This can be an important property in certain applications. Some research has been done on the implications of nondeterminate behavior [Arvind et al. 1977; Kosinski 1978] and this is discussed further in Section 6.4.

2.1.1. Controlling Data Tokens. In Figure 1(b), the two arcs emanating from input Y signify that that value is to be duplicated. Forking arcs in this manner is essential if a data token is needed by two different nodes. A data token that reaches a forked arc gets duplicated and a copy sent down each branch. This preserves the data independence and the functionality of the system.

To preserve the determinacy of the token-flow model, it is not permitted to arbitrarily merge two arcs of flowing data tokens. If this were allowed, data could arrive at a node out of order and jeopardize the computation. It is obvious, however, that it would be difficult indeed to express a program in the dataflow model if arcs could only be split and never merged. Thus, the dataflow model provides special control nodes called gates [Davis and Keller 1982; Dennis 1974] that allow this to happen within well-controlled limits. Figure 2(a) shows a Merge gate. This gate takes two data input arcs, labeled the true and false inputs, as well as a "control" input arc that carries a Boolean value. When the node fires, the control token is absorbed first. If the value is true, the token on the true input is absorbed and placed on the output. If it is false, then the token on the false input is absorbed and placed on the output. Figure 2(b) shows a Switch gate. This gate operates in much the same way, except that there is a single input and the control token determines on which of two outputs it is placed.
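The behavior of the two gates can be sketched as follows. This is again our own illustration rather than the article's, reusing the deque-based arcs of the earlier sketch.

```python
from collections import deque

def merge_gate(control, true_input, false_input, output):
    """Merge: absorb the control token first, then move a token from the
    selected input to the output; the other input is left untouched."""
    take_true = control.popleft()
    source = true_input if take_true else false_input
    output.append(source.popleft())

def switch_gate(control, data_input, true_output, false_output):
    """Switch: route the single data token to one of two outputs,
    as chosen by the Boolean control token."""
    take_true = control.popleft()
    token = data_input.popleft()
    (true_output if take_true else false_output).append(token)

# Example firing of a Merge gate.
control, t_in, f_in, out = deque([True]), deque(["a"]), deque(["b"]), deque()
merge_gate(control, t_in, f_in, out)
print(list(out), list(f_in))   # ['a'] ['b']  (the false input keeps its token)
```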

A full treatment of controlling tokens to provide conditional and iterative execution is given in Section 6.2. At this stage, it is sufficient to say that by grouping together three Switch gates, it is possible to implement well-behaved conditional execution, while the combined use of both types of gate implements well-behaved iterative execution.

2.1.2. An Alternative to Token-Based Dataflow. The pure dataflow execution model outlined above is based on flowing data tokens, like most dataflow models. However, it should be pointed out that an alternative, known as the structure model, has been proposed in the literature. Expounded by Davis and Keller [1982] and Keller and Yen [1981], it contains the same arc-and-node format as the token model. In the structure model, however, each node creates only one data object on each arc that remains there: the node builds one or more data structures on its output arcs. It is possible for these structures to hold infinite arrays of values, permitting open-ended execution, and creating the same effect as the token model, but with the advantage that the structure model permits random access of the data structures and history sensitivity.

The key difference between the structure model and the token model is the way they view data. In the token model, nodes are designed to be stream processors, operating on sequences of related data tokens. In the structure model, however, the nodes operate on structures and have no concept of streams of data structures. As a consequence, a structure model will need a more complex supporting language [Davis and Keller 1982].

Initially, the structure model seems attractive. Token streams can be represented by infinite objects with the advantage that the streams can be accessed randomly and that the entire history of a stream can be accessed without needing to explicitly preserve earlier data from the stream. Additionally, the point is made that token models force the programmer to model all programs as token streams, while the structure model allows them to make the choice [Davis and Keller 1982].

However, the structure model has the key disadvantage that it cannot store data efficiently. It requires all data generated to be stored, to preserve the ability to examine the history of the program. Token models are inherently more efficient at storing data. Some of this problem can be alleviated by techniques that improve storage efficiency, but it is a complex task. Despite research in the area in the early 1980s [Davis and Keller 1982; Keller and Yen 1981], the structure model was not widely adopted into dataflow, which remains almost exclusively token-based.

2.1.3. Theoretical Implementation: Data- and Demand-Driven Architectures. The earliest dataflow proposals imagined data tokens to be passive elements that remained on arcs until they were read, rather than actually controlling the execution [Kosinski 1973]. However, it quickly became normal in dataflow projects for data to control the execution. There were two ways of doing this in a theoretical implementation of the pure dataflow model.

The first approach is known as the data-driven approach [Davis and Lowder 1981; Davis and Keller 1982; Dennis 1974; Treleaven et al. 1982], although this term is slightly misleading as both approaches can be said to be driven by data, insofar as they follow the principles of dataflow. This approach should really be termed the data-availability-driven approach because execution is dependent on the availability of data.

Essentially, a node is inactive while data is arriving at its inputs. It is the responsibility of an overall management device to notify and fire nodes when their data has arrived. The data-driven approach is a two-phase process where (a) a node is activated when its inputs are available, and (b) it absorbs its inputs and places tokens on its output arcs.

The second approach is the demand-driven approach [Davis and Keller 1982; Kahn 1974]. In this approach, a node is activated only when it receives a request for data from its output arcs. At this point, it demands data from all relevant input arcs. Once it has received its data, it executes and places data tokens on its output arcs. The demand-driven approach is thus a four-phase process [Davis and Keller 1982] where (a) a node's environment requests data, (b) the node is activated and requests data from its environment, (c) the environment responds with data, and (d) the node places tokens on its output arcs. Execution of the program begins when the graph's environment demands some output from the graph.

Each of these approaches has certain advantages. The data-driven approach has the advantage that it does not have the extra overhead of propagating data requests up the dataflow graph. On the other hand, the demand-driven approach has the advantage that certain types of node can be eliminated, as pointed out by Davis and Keller [1982]. This is because only needed data is ever demanded. For example, the Switch node, shown in Figure 2(b), is not required under a demand-driven approach because only one of the True or False outputs will demand the input, but not both. Therefore, they can both be attached directly to the input. This is one example of how programming with dataflow can be affected by the choice of physical implementation, or at least by the choice of execution model.

It can also be argued that the demand-driven approach prevents the creation of certain types of programs. For example, modern software is often event-driven, such as for business software or real-time systems. It is not enough for the output environment to simply demand input. These examples seem to require a data-driven approach.

2.2. Early Dataflow Hardware Architectures

While dataflow seems good in theory, the practical implementation of the pure dataflow model has been found to be an arduous task [Silc et al. 1998]. There are a number of reasons for this, primarily the fact that the pure model makes assumptions that cannot be replicated in the real world. First, it assumes that the arcs are FIFO queues of unbounded capacity, but creating an unbounded memory is impossible in a practical sense. Thus any dataflow implementation is heavily tied to token-storage techniques. Second, it assumes that any number of instructions can be executed in parallel, while in reality the number of processing elements will be finite. These restrictions mean that no hardware implementation of the dataflow model will exactly mirror the pure model. Indeed, this fact can make subtle but important changes to the pure dataflow model that mean that the implementation may deadlock in cases where the pure model predicts no deadlock [Arvind and Culler 1986]. It is useful to summarize the early development of dataflow hardware in order to reinforce this point.

2.2.1. The Static Architecture. When the construction of dataflow computers began in the 1970s, two different approaches to solving the previously mentioned problems were researched. The static architecture was proposed by Dennis and Misunas [1975]. Under this architecture [Dennis 1974, 1980; Silc et al. 1998], the FIFO design of arcs is replaced by a simpler design where each arc can hold, at most, one data token. The firing rule for a node is, therefore, that a token be present on each input arc, and that there be no tokens present on any of the output arcs. In order to implement this, acknowledge arcs are implicitly added to the dataflow graph that go in the opposite direction to each existing arc and carry an acknowledgment token.
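A small sketch of the static firing rule just described, assuming each arc is a single compile-time-allocated slot and treating the implicit acknowledgment simply as the emptying of that slot (a deliberate simplification of the hardware):

```python
class StaticArc:
    """At most one token per arc; the acknowledgment arc is modelled here as
    the producer having to wait until the consumer empties the slot."""
    def __init__(self):
        self.token = None            # single slot, allocated at compile time

class StaticNode:
    def __init__(self, op, inputs, outputs):
        self.op, self.inputs, self.outputs = op, inputs, outputs

    def fireable(self):
        # Fire only if every input holds a token and every output slot is free.
        return (all(a.token is not None for a in self.inputs) and
                all(a.token is None for a in self.outputs))

    def fire(self):
        args = [a.token for a in self.inputs]
        for a in self.inputs:
            a.token = None           # emptying the slot stands in for the ack token
        result = self.op(*args)
        for a in self.outputs:
            a.token = result
```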


Fig. 3. The static dataflow architecture (based on Arvind and Culler [1986]).

The static architecture's main strength is that it is very simple and quick to detect whether or not a node is fireable. Additionally, it means that memory can be allocated for each arc at compile-time, as each arc will only ever hold 0 or 1 data token. This implies that there is no need to create complex hardware for managing queues of data tokens: each arc can be assigned to a particular piece of memory store.

The graph itself is stored in the computer as a series of templates, each representing a node of the graph. The template holds an opcode for the node; a memory space to hold the value of the data token on each input arc, with a presence flag for each one; and a list of destination addresses for the output tokens. Each template that is fireable (the presence flag for each input is set, and that of each output is not set) has its address placed in an instruction queue. A fetch unit then repeatedly removes each template from this queue and sends an operation packet to the appropriate operation unit. Meanwhile, the template is cleared to prepare it for the next set of data tokens. The result is sent from the operation unit to an update unit that places the results onto the correct receiving arcs by reading the target addresses in the template. It then checks each template to see if it is fireable and, if so, places it in the instruction queue to complete the cycle. This process is shown in Figure 3.

Unfortunately, the static model has some serious problems. The additional acknowledgment arcs increase data traffic in the system, without benefiting the computation. According to Arvind and Culler [1986], traffic can increase by a factor of 1.5 to 2.0. Because a node must wait for acknowledgment tokens to arrive before it can execute again, the time between successive firings of a node increases. This can affect performance, particularly in situations of linear computation that do not have much parallelism. Perhaps most importantly, the static architecture also severely limits the execution of loops. In certain cases, the single-token-per-arc limitation means that a second loop iteration cannot begin executing until the previous one has almost completed, thereby limiting parallelism to simple pipelining and preventing truly parallel execution of loop iterations. Despite these limitations, a number of static dataflow computers have been built and studied [Davis 1978; Dennis and Misunas 1975; Dennis 1980].

2.2.2. The Dynamic, or Tagged-Token, Architecture. An alternative approach was proposed by Watson and Gurd [1979], Arvind and Culler [1983], and Arvind and Nikhil [1990]. Known as the dynamic model, it exposes additional parallelism by allowing multiple invocations of a subgraph that is often an iterative loop. While this is the conceptual view of the tagged-token model, in reality only one copy of the graph is kept in memory and tags are used to distinguish between tokens that belong to each invocation.

A tag contains a unique subgraph invocation ID, as well as an iteration ID if the subgraph is a loop. These pieces of information, taken together, are commonly known as the color of the token.

Instead of the single-token-per-arc rule of the static model, the dynamic model represents each arc as a large bag that can contain any number of tokens, each with a different tag [Silc et al. 1998]. In this scenario, a given node is said to be fireable whenever the same tag is found in a data token on each input arc. It is important to note that, because the data tokens are not ordered in the tagged-token model, processing of tokens does not necessarily proceed in the same order as they entered the system. However, the tags ensure that the tokens do not conflict, so this does not cause a problem.

The tags themselves are generated by the system [Arvind and Culler 1986]. Tokens being processed in a given invocation of a subgraph are given the unique invocation ID of that subgraph. Their iteration ID is set to zero. When the token reaches the end of the loop and is being fed back into the top of the loop, a special control operator increments the iteration ID. Whenever a token finally leaves the loop, another control operator sets its iteration ID back to zero.

A hardware architecture based on the dynamic model is necessarily more complex than the static architecture outlined in Section 2.2.1. Additional units are required to form tokens and match tags. More memory is also required to store the extra tokens that will build up on the arcs. Arvind and Culler [1986] provided a good summary of the architecture.

The key advantage of the tagged-token model is that it can take full advantage of pipelining effects and can even execute separate loop iterations simultaneously. It can also execute out-of-order, bypassing any tokens that require complex execution and that delay the rest of the computation. It has been shown that this model offers the maximum possible parallelism in any dataflow graph [Arvind and Gostelow 1977].

Another noteworthy benefit of the tagged-token model is that less care needs to be taken to ensure that tokens remain in order. For example, the pure dataflow model requires Merge operators (see Section 2.1.1) to ensure that data tokens are merged in a determinate way. In the dynamic model, however, this is not required as the tags ensure the determinacy, and so token streams can be merged arbitrarily.

The main disadvantage of the tagged-token model is the extra overhead required to match tags on tokens, instead of simply their presence or absence. More memory is also required and, due to the quantity of data being stored, an associative memory is not practical. Thus, memory access is not as fast as it could be [Silc et al. 1998]. Nevertheless, the tagged-token model does seem to offer advantages over the static model. A number of computers using this model have been built and studied [Arvind and Culler 1983; Barahona and Gurd 1985].

As stated above, the choice of target architecture can have implications for the programming of software. Depending on the model chosen, certain types of nodes, such as merge or switch nodes, are not required. Additionally, the performance of the program will be affected and some properties of the system (such as its tendency to deadlock, which can be verified under the pure dataflow model) may change subtly under certain implementations [Arvind and Culler 1986; Naggar et al. 1999]. For example, some networks that deadlock under the static model may not deadlock under the dynamic model. This is due to the pure model's theoretically valid but impractical assumptions that there are an infinite number of processing elements and infinite space on each arc [Kahn 1974].
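The following sketch illustrates tag matching in the dynamic model. The (invocation ID, iteration ID) pair plays the role of the token's color; representing each arc's bag as a dictionary keyed by tag is our own assumption, since real machines use specialized matching stores.

```python
from collections import defaultdict

class TaggedArc:
    """An unordered bag of tokens, keyed by tag (invocation ID, iteration ID)."""
    def __init__(self):
        self.bag = defaultdict(list)

    def put(self, tag, value):
        self.bag[tag].append(value)

def matched_tags(input_arcs):
    """A node is fireable for any tag that is present on every one of its input arcs."""
    common = set(input_arcs[0].bag)
    for arc in input_arcs[1:]:
        common &= set(arc.bag)
    return common

def fire(op, input_arcs, output_arcs):
    for tag in list(matched_tags(input_arcs)):
        args = [arc.bag[tag].pop() for arc in input_arcs]   # consume one token per arc
        for arc in input_arcs:
            if not arc.bag[tag]:
                del arc.bag[tag]
        result = op(*args)
        for out in output_arcs:
            out.put(tag, result)    # the result keeps the same color; a loop's feedback
                                    # path would increment the iteration ID via a
                                    # special control operator, as described above
```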

2.3. Synchronous Dataflow

A later development in dataflow, but one that became quite widely used, was synchronous dataflow (SDF) [Lee and Messerschmitt 1987]. This is a subset of the pure dataflow model in which the number of tokens consumed and produced on each arc of a node is known at compile-time [Bhattacharyya 1996]. Under SDF, the number of tokens initially on each arc is also specified at compile-time. In this scenario, there are certain limitations that mean that some kinds of program cannot be represented. For example, loops can only be specified when the number of iterations is known at compile-time.

The advantage of the SDF approach, however, is that it can be statically scheduled [Buck and Lee 1995]. This means that it can be converted into a sequential program and does not require dynamic scheduling. It has found particular applications in digital signal processing where time is an important element of the computation [Lee and Messerschmitt 1987; Plaice 1991]. Even dataflow graphs which are not SDF in themselves may have subgraphs that are, and this may allow partial static scheduling [Buck and Lee 1995], with the rest scheduled according to the usual dataflow scheduling techniques. This has applications to the coarse-grained dataflow discussed in Section 4.3.
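As an illustration of why fixed token rates permit static scheduling, the sketch below derives a fixed firing order for a two-actor graph in which actor A produces two tokens per firing on an arc from which actor B consumes three. The simulation-based method and the actor names are our own assumptions; Lee and Messerschmitt [1987] derive the repetition counts from balance equations instead.

```python
def sdf_schedule(produce, consume, firings_a, firings_b):
    """Return a fixed firing order for a two-actor graph A -> B, given the
    per-firing token rates and the repetition counts of one iteration."""
    schedule, buffered = [], 0
    a_left, b_left = firings_a, firings_b
    while a_left or b_left:
        if buffered >= consume and b_left:      # B can fire: enough tokens buffered
            schedule.append("B")
            buffered -= consume
            b_left -= 1
        elif a_left:                            # otherwise fire A to produce tokens
            schedule.append("A")
            buffered += produce
            a_left -= 1
        else:
            raise RuntimeError("deadlock: rates and repetitions are inconsistent")
    return schedule

# A produces 2 tokens, B consumes 3; balancing 2 * 3 = 3 * 2 gives 3 firings of A
# and 2 of B per iteration, so the firing order can be fixed at compile time.
print(sdf_schedule(produce=2, consume=3, firings_a=3, firings_b=2))
# ['A', 'A', 'B', 'A', 'B']
```

Because this order never changes at run time, it can be emitted directly as a sequential program, which is the property the text describes.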

3. EARLY DATAFLOW PROGRAMMING LANGUAGES

3.1. The Development of Dataflow Languages

With the development of dataflow hardware came the equally challenging problem of how to program these machines. Because they were scheduled by data dependencies, it was clear that the programming language must expose these dependencies. However, the data dependencies in each class of language can be exploited to different degrees, and the amount of parallelism that can be implicitly or explicitly specified also differs. Therefore, the search began for a suitable paradigm to program dataflow computers and a suitable compiler to generate the graphs [Arvind et al. 1988]. Various paradigms were tried, including imperative, logical, and functional methods. Eventually, the majority consensus settled on a specific type of functional language that became known as dataflow languages.

An important clarification must be made at this stage. In early publications, dataflow graphs are often used to illustrate programs. In many cases, these graphs are simply representations of the compiled code [Dennis and Misunas 1975] that would be executed on the machine, where the graph was generated either by hand or by a compiler from a third-generation language. Until the advent of Dataflow Visual Programming Languages in the 1980s and 1990s, it was rarely the intention of researchers that developers should generate these graphs directly. Therefore these early graphs are not to be thought of as "dataflow programming languages."

3.1.1. What Constitutes a Dataflow Programming Language? While dataflow programs can be expressed graphically, most of the languages designed to operate on dataflow machines were not graphical. There are two reasons for this. First, at the low level of detail that early dataflow machines required, it became tedious to graphically specify constructs such as loops and data structures which could be expressed more simply in textual languages [Whiting and Pascoe 1994]. Second, and perhaps more importantly, the hardware for displaying graphics was not available until relatively recently, stifling any attempts to develop graphical dataflow systems. Therefore, traditional dataflow languages are primarily text-based.

One of the problems in defining exactly what constitutes a dataflow language is that there is an overlap with other classes of language. For example, the use of dataflow programming languages is not limited to dataflow machines. In the same way, some languages, not designed specifically for dataflow, have subsequently been found to be quite effective for this use (e.g., Ashcroft and Wadge [1977]; Wadge and Ashcroft [1985]). Therefore, the boundary for what constitutes a dataflow language is somewhat blurred. Nevertheless, there are some core features that would appear to be essential to any dataflow language. The best list of features that constitute a dataflow language was put forward by Ackerman [1982] and reiterated by Whiting and Pascoe [1994] and Wail and Abramson [1995]. This list includes the following:

(1) freedom from side effects,
(2) locality of effect,
(3) data dependencies equivalent to scheduling,
(4) single assignment of variables,
(5) an unusual notation for iterations due to features 1 and 4,
(6) lack of history sensitivity in procedures.

Because scheduling is determined from data dependencies, it is important that the values of variables do not change between their definition and their use. The only way to guarantee this is to disallow the reassignment of variables once their value has been assigned. Therefore, variables in dataflow languages almost universally obey the single-assignment rule. This means that they can be regarded as values, rather than variables, which gives them a strong flavor of functional programming. The implication of the single-assignment rule is that the compiler can represent each value as one or more arcs in the resultant dataflow graph, going from the instruction that assigns the value to each instruction that uses that value.

An important consequence of the single-assignment rule is that the order of statements in a dataflow language is not important. Provided there are no circular references, the definitions of each value, or variable, can be placed in any order in the program. The order of statements becomes important only when a loop is being defined. In dataflow languages, loops are usually provided with an imperative syntax, but the single-assignment rule is preserved by using a keyword such as next to define the value of the variable on the next iteration [Ashcroft and Wadge 1977]. A few dataflow languages offer recursion instead of loops [Weng 1975].

Freedom from side effects is also essential if data dependencies are to determine scheduling. Most languages that avoid side effects do so by disallowing global variables and introducing scope rules. However, in order to ensure the validity of data dependencies, a dataflow program does not even permit a function to modify its own parameters. All of this can be avoided by the single-assignment rule. However, problems arise with this strategy when data structures are being dealt with. For example, how can an array be manipulated if only one assignment can ever be made to it? Theoretically, this problem is dealt with by conceptually viewing each modification of an array as the creation of a new copy of the array, with the given element modified. This issue is dealt with in more detail in Section 6.3.

It is clear from the above discussion that dataflow languages are almost invariably functional. They have applicative semantics, are free from side effects, are determinate in most cases, and lack history sensitivity. This does not mean that dataflow and functional languages are equivalent. It is possible to write certain convoluted programs in the functional language Lucid [Ashcroft and Wadge 1977], which cannot be implemented as a dataflow graph [Ashcroft and Wadge 1980]. At the same time, much of the syntax of dataflow languages, such as loops, has been borrowed from imperative languages. Thus it seems that dataflow languages are essentially functional languages with an imperative syntax [Wail and Abramson 1995].

3.1.2. Dataflow Languages. A number of textual dataflow languages, or functional languages that can be used with dataflow, have been implemented. A representative sample is discussed below. (Whiting and Pascoe [1994] presented a fuller review of these languages.) Dataflow Visual Programming Languages are discussed in detail in Section 5.
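Before turning to the individual languages, the sketch below illustrates the single-assignment property described in Section 3.1.1: because each name is defined exactly once, a compiler can recover the dataflow graph purely from the data dependencies, and the textual order of the definitions is irrelevant. The Python encoding of the program is our own stand-in for a dataflow front end.

```python
# Hypothetical single-assignment program: each name is defined exactly once,
# and the textual order of the definitions does not matter.
program = {
    "C": (lambda a, b: a * b, ["A", "B"]),   # C = A * B
    "A": (lambda x, y: x + y, ["X", "Y"]),   # A = X + Y
    "B": (lambda y: y / 10,   ["Y"]),        # B = Y / 10
}

def evaluate(program, inputs):
    """Resolve values purely from data dependencies, ignoring statement order."""
    values = dict(inputs)
    pending = dict(program)
    while pending:
        ready = [n for n, (_, deps) in pending.items()
                 if all(d in values for d in deps)]
        if not ready:
            raise ValueError("circular reference: no definition is fireable")
        for name in ready:               # all ready definitions could fire in parallel
            op, deps = pending.pop(name)
            values[name] = op(*[values[d] for d in deps])
    return values

print(evaluate(program, {"X": 35, "Y": 25})["C"])   # 150.0
```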


—TDFL. The Textual Data-Flow Language was developed by Weng [1975] as one of the first purpose-built dataflow languages. It was designed to be compiled into a dataflow graph with data streams in a relatively straightforward way and supported compile-time deadlock detection. A program expressed in TDFL consisted of a series of modules, analogous to procedures in other languages. Each module was made up of a series of statements that were either assignments (obeying the single-assignment rule), conditional statements, or a call to another module. Iteration was not provided directly, as Weng could find no way to make it compatible with the single-assignment rule, but modules could call themselves recursively.

—LAU. Developed in 1976 for the LAU static dataflow architecture, the LAU language was developed by the Computer Structures Group of ONERA-CERT in France [Comte et al. 1978; Gelly 1976]. It was a single-assignment language and included conditional branching and loops that were compatible with this rule through the use of the old keyword. It was one of the few dataflow languages that provided explicit parallelism, through the expand keyword that specified parallel assignment. LAU had some features that were similar to object-oriented languages, such as the ability to encapsulate data and operations [Comte et al. 1978].

—Lucid. Originally developed independently of the dataflow field by Ashcroft and Wadge [1977], Lucid was a functional language designed to enable formal proofs. Recursion was regarded as too restrictive for loop constructs, but it was realized that iteration introduced two nonmathematical features into programming: transfer and assignment. Thus, Lucid was designed to permit iteration in a way that was mathematically respectable, through single assignment and the use of the keyword next to define the value of the variable in the next iteration. It quickly became apparent, however, that Lucid's functional and single-assignment semantics were similar to those required for dataflow machines, and Ashcroft and Wadge [1980] brooded on the topic in the literature before publishing a book in 1985 [Wadge and Ashcroft 1985] that firmly established Lucid's claim to be a dataflow language.

—Id. Originally developed by Arvind et al. [1978] for writing operating systems, Id was intended to be a language without either sequential control or memory cells, two aspects of the von Neumann model that Arvind et al. felt must be rejected. The resultant language had single-assignment semantics and was block-structured and expression-based. Id underwent much evolution, and later versions tackled the problem that data structures were not comfortably compatible with the single-assignment rule through the inclusion of I-structures [Arvind et al. 1989] (which are themselves functional data structures and are explained in Section 6.3).

—LAPSE. Developed by Glauert [1978], LAPSE was derived from Pascal and was designed for use on the Manchester dataflow machine. The language had single-assignment semantics and provided functions, conditional evaluation, and user-defined data types. It provided iteration without using any qualifying keywords to differentiate between the current and next value of the loop variable. Rather, the compiler assumed that the old value was intended if it appeared in an expression, and the next value was assumed if it appeared on the left of an assignment. Like LAU, LAPSE provided a single explicit parallel construct, for all, for parallel array assignment.

—VAL. VAL was developed by Dennis starting in 1979 [Ackerman and Dennis 1979; Dennis 1977], and obeyed the single-assignment rule. A program in VAL consisted of a series of functions, each of which could return multiple values. Loops were provided by the Lucid technique [Ashcroft and Wadge 1977], and a parallel assignment construct, for all, was also provided. However, recursion was not provided, as it was not thought necessary for the target domain. Other disadvantages [Whiting and Pascoe 1994] included the lack of general I/O and the fact that nondeterministic programs could not be expressed.


—Cajole. First developed under this name in 1981 [Hankin and Glaser 1981], Cajole was a functional language designed to be compiled into acyclic dataflow graphs. It did not provide loops, but did permit recursion. Cajole was later used in a project that explored structured programming with dataflow [de Jong and Hankin 1982].

—DL1. Developed by Richardson [1981] to support research into hybrid dataflow architectures, DL1 was a functional language designed to be compiled into low-level dataflow graphs. This target was made more explicit than in other languages, as evidenced by keywords such as subgraph. The language provided for recursion and conditional execution.

—SISAL. Like Lucid, SISAL was not originally written specifically for dataflow machines, but found that application later. Originally developed in 1983 [Gurd and Bohm 1987; McGraw et al. 1983], SISAL is a structured functional language, providing conditional evaluation and iteration consistent with the single-assignment rule. Although it provides data structures, these are treated as values and thus cannot be rewritten like I-structures [Arvind et al. 1989]. The only parallel construct provided is a parallel loop.

—Valid. Designed by Amamiya et al. [1984], Valid was an entirely functional language designed to demonstrate the "superiority" of dataflow machines. It provided recursion as a key language element, but also provided functional loops using the Lucid method [Ashcroft and Wadge 1977]. A simple parallel loop construct was also provided.

The above list represents much of the population of dataflow programming languages in existence. Many are similar, and the majority have (1) functional semantics, (2) single assignment of variables, and (3) limited constructs to support concurrency.

3.1.3. Using Imperative Languages with Dataflow. While the majority consensus settled on the previously mentioned dataflow languages, this does not mean that other directions were not pursued. Dataflow compilers have been built for several imperative languages [Wail and Abramson 1995]. These include Fortran, Pascal, and several dialects of C [Whiting and Pascoe 1994]. All of these approaches had to deal with the major problem of how to generate code based on data dependencies from languages that allow a lot of flexibility in this regard. Wail and Abramson [1995] confirmed that when programming dataflow machines with imperative languages, the generation of good parallel code can be extremely difficult. They also confirmed that the implementation of nonfunctional facilities, such as global variables, will reduce possible concurrency. Their main motivation for pursuing this line of research was that much software is already written in imperative languages and most programmers are already familiar with the paradigm.

In their 1982 paper, Gajski et al. [1982] offered the opinion that using dataflow languages offered few advantages over imperative languages, on the grounds that the compiler technology was as complex for one as for the other. They argued that the use of sophisticated compiler techniques and explicit concurrency constructs in an imperative language could provide the same level of parallel performance on dataflow machines as a dataflow language. While not denying the advantages of the syntactical purity of dataflow languages, they argued that these advantages do not justify the effort required for the introduction of a totally new class of programming languages.

The possibility of placing explicit concurrency constructs into dataflow languages has been largely resisted by researchers (parallel assignment/loops have been provided in some cases—see Dennis [1977], Gelly [1976], and Glauert [1978]—but almost no other concurrency features have been included).


It is probable that the explanation for this resistance is that implicit parallelism is an extremely appealing idea, and to introduce explicit concurrency constructs into their dataflow languages would destroy one of the most appealing parts of the dataflow concept.

The argument of Gajski et al. [1982] against creating specifically "dataflow" programming languages would be valid if the only justification for the pursuit of dataflow were the pursuit of improved performance through exploiting parallelism. However, as they themselves commented, dataflow languages have features that are very advantageous to the programmer. The future development of dataflow visual programming languages provides much evidence for this, where the emphasis has moved toward benefits in software engineering. Therefore, contrary to the assertions of Gajski et al. [1982], it is proposed that further research into dataflow programming languages is justified.

3.2. The Dataflow Experience in the 1980s

In the 1980s, proponents confidently predicted that both dataflow hardware and dataflow languages would supersede von Neumann-based processors and languages [Arvind et al. 1977; Treleaven et al. 1982; Treleaven and Lima 1984]. However, looking back over the 1990s, it is clear that this did not happen [Silc et al. 1998; Veen 1986; Whiting and Pascoe 1994]. In fact, research into dataflow languages slowed after the mid-1980s. In their review paper of 1994, Whiting and Pascoe [1994] reported that several dataflow researchers had concluded that dataflow had been mostly a failure—cost-effective dataflow hardware had failed to materialize. Given that the dataflow concepts looked promising, the languages were appealing, and much research effort was put into the subject, there must be good reasons for the decline of these early dataflow concepts.

It is widely believed that a main reason was that the early dataflow hardware architectures operated at a level that was too fine-grained. While von Neumann architectures operate at process-level granularity (i.e., instructions are grouped into threads or processes and then executed sequentially), dataflow operates at instruction-level granularity. This point had been recognized by 1986, when Veen [1986, p. 393] remarked that "there are signs that a deviation is also necessary from the fine-grain approach" because it led to "excessive consumption of resources." This is because it required a high level of overhead to prepare each instruction for execution, execute it, propagate the resultant tokens, and test for further enabled firings. Indeed, algorithms which exhibit a low degree of natural parallelism can execute unacceptably slowly on dataflow machines because of this degree of overhead. Veen [1986] defended these claims, arguing that the overhead can be reduced to an acceptable level by compiling techniques, but later experiences seem to demonstrate that the criticism was valid. For example, Bic [1990, p. 42] commented that "inefficiencies [are] inherent to purely dataflow systems" while Silc et al. [1998, p. 9] commented that "pure dataflow computers ... usually perform quite poorly with sequential code."

The reason for the decline in dataflow research in the late 1980s and early 1990s was almost entirely due to problems with the hardware aspects of the field. There was little criticism of dataflow languages—other than those leveled at functional languages in general [Gajski et al. 1982]—which are still unrivaled in the degree of implicit parallelism that they achieve [Whiting and Pascoe 1994]. The dataflow execution model can be used with or without dataflow hardware, and, therefore, any decline in the hardware aspect does not necessarily affect dataflow languages, provided they have advantages on their own merit. This article contends that they do.

4. EVOLUTION OF DATAFLOW

It is not surprising, on the basis of Section 3.2, that when research into dataflow again intensified in the early 1990s [Lee and Hurson 1993], the issue of granularity was one of the key points to be addressed.

One of the primary realizations that made this shift possible was the recognition that, contrary to what was popularly believed in the early 1980s, dataflow and von Neumann techniques were not mutually exclusive and irreconcilable concepts, but simply the two extremes of a continuum of possible computer architectures [Papadopoulus and Traub 1991; Silc et al. 1998]. Fine-grained dataflow could now be seen as a multithreaded architecture in which each machine-level instruction was executed in a thread on its own. At the same time, von Neumann architectures could now be regarded as a multithreaded architecture in which there was only one thread—the program itself. For example, in their survey paper, Lee and Hurson [1993, p. 286] observed that "the foremost change is a shift from the exploitation of fine- to medium- and large-grain parallelism." The primary issue in dataflow thus immediately became the question of granularity.

The result of this shift in viewpoint was the exploration of what has become known as hybrid dataflow.

4.1. The Development of Hybrid Dataflow

Although hybrid dataflow concepts had been explored for many years [Silc et al. 1998], it was only in the 1990s that they became the dominant area of research in the dataflow community. In their 1995 paper, Sterling et al. [1995] explored the performance of different levels of granularity in dataflow machines. Although the range of possible test scenarios is huge, they did produce a generalized graph that summarized their findings. A simplified version of this is shown in Figure 4.

Fig. 4. Dataflow granularity optimization curve (based on Sterling et al. [1995]).

Figure 4 indicates that neither fine-grained (as in traditional dataflow) nor coarse-grained (as in sequential execution) dataflow offers the best parallel performance, but rather a medium-grained approach should be used. This suggests that some form of hybrid—dataflow with von Neumann extensions, or vice versa—would offer the best performance. The question then was, what level of medium granularity was best?

In terms of hardware architectures, there is no universal consensus on how best to achieve this hybrid. Some approaches are essentially von Neumann architectures with a few dataflow additions. Others are essentially dataflow architectures with some von Neumann additions. (For examples, see Iannucci [1988]; Nikhil and Arvind [1989]; Papadopoulos and Traub [1991].) It is not our intention to explore the hardware aspects of hybrid dataflow in depth here, as the concentration of this article is on dataflow programming, but a good summary was published by Silc et al. [1998].

While this new research was aimed at improving hardware architectures, the rejection of fine-grained dataflow and the move toward more coarse-grained execution has also freed dataflow programming of its restriction to fine-grained execution. This allows a much wider range of research to be conducted into dataflow programming, taking advantage of these new degrees of granularity.

Essentially, the now-accepted requirement for more coarse-grained execution has caused a divergence in the dataflow programming community. The first group advocates generating fine-grained dataflow graphs as before, but they then propose analyzing these graphs and identifying subgraphs that exhibit low levels of parallelism that should always execute in sequence.

These nodes are grouped together into segments. Thus, when the first node in the segment is fired, the remaining nodes can be fired immediately. They are still executed in a fine-grained manner, but the costly token-matching process is avoided for the subsequent nodes in the sequence, saving time and resources. This approach is termed threaded dataflow [Silc et al. 1998].

The second group advocates dispensing with fine-grained dataflow execution, and instead compiling the subgraphs into sequential processes. These then become coarse-grained nodes, or macroactors. The graphs are executed using the traditional dataflow rules, with the only difference being that each node contains, for example, an entire function expressed in a sequential language as opposed to a single machine-level instruction. This second approach is usually termed large-grain dataflow [Silc et al. 1998].

It was noted as early as 1974 that the mathematical properties of dataflow networks are valid, regardless of the degree of granularity of the nodes [Arvind et al. 1988; Jagannathan 1995; Kahn 1974; Lee 1997; Sterling et al. 1995], and, therefore, the hybrid approaches to dataflow programming do not in any way compromise the execution model. Both the threaded and large-grain approaches are exciting developments, but it is the latter that offers the most potential for improvements to dataflow programming.

4.2. Threaded Dataflow

The threaded dataflow approach takes advantage of the fact that dataflow program graphs display some level of sequential execution. For example, in the case where the output of one node goes into the next node, the two nodes could never execute in parallel when they are operating on a single wave of data. Therefore, there is little point in scheduling them to two different processors. Under fine-grained dataflow, the output token from the first node must be mapped back through the system, added to the input arc for the second node, which must then wait to be fired. It is much more efficient to place the two instructions in a single execution quantum, so that the output of the first node can be immediately used by the second [Bic 1990].

This principle was used by Bic [1990], who proposed analyzing a dataflow graph and producing sequential code segments (SCS), which are nodes that are in a chain and cannot be executed in parallel. Under the modified execution model, the granularity is at the SCS level. However, other than the fact that the execution of the first instruction in an SCS causes the rest of the chain to be executed, the model obeys the standard dataflow rules. Bic [1990] also proposed a method to automatically identify the SCSs without programmer intervention.

The advantage of the threaded dataflow approach is that those parts of the dataflow graph that do not exhibit good potential parallelism can be executed without the associated overhead, while those that do show potential parallelism can take advantage of it. In their study of this approach, Papadopoulus and Traub [1991] confirmed these conclusions, although they warned that it is not wise to carry the line of sequentiality too far. An analysis of these proposals undertaken by Bohm et al. [1993] demonstrated that medium-grained dataflow did indeed improve performance, as predicted by Sterling et al. [1995].

A key open question is how best to partition programs into threads and what degree of granularity is best [Bohm et al. 1993; Lee and Hurson 1994]. The Pebbles group in the U.S. is examining the relationship between granularity of parallelism and efficiency in hybrid dataflow [Bohm et al. 1993; Najjar et al. 1994], although the group is primarily concerned with large-grain dataflow. Another consequent benefit of this area of research has been that it is easier to encode certain functions (e.g., resource management) if some sequential execution is permitted [Lee and Hurson 1994].
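A sketch of how sequential code segments might be identified, loosely following Bic's [1990] description: any chain of nodes in which each node's sole output feeds a node with a sole input can be grouped into one segment. The graph encoding is our own assumption, and the sketch ignores cycles.

```python
def find_segments(successors):
    """Group the nodes of an acyclic dataflow graph into chains (sequential
    code segments). `successors` maps every node to the nodes it feeds."""
    predecessors = {n: [] for n in successors}
    for n, succs in successors.items():
        for s in succs:
            predecessors[s].append(n)

    # A chain continues while the current node has one successor
    # and that successor has the current node as its only predecessor.
    def chain_continues(n):
        return len(successors[n]) == 1 and len(predecessors[successors[n][0]]) == 1

    segments, seen = [], set()
    for n in successors:
        if n in seen or (len(predecessors[n]) == 1 and
                         len(successors[predecessors[n][0]]) == 1):
            continue                      # skip nodes that sit inside some chain
        segment = [n]
        seen.add(n)
        while chain_continues(segment[-1]):
            nxt = successors[segment[-1]][0]
            segment.append(nxt)
            seen.add(nxt)
        segments.append(segment)
    return segments

# The graph of Figure 1: add and divide both feed the multiply node, which has
# two predecessors, so no two of these nodes form a purely sequential chain.
print(find_segments({"add": ["mul"], "div": ["mul"], "mul": []}))
# [['add'], ['div'], ['mul']]
```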

ACM Computing Surveys, Vol. 36, No. 1, March 2004. 16 Johnston et al. has been that it is easier to encode certain that could be incorporated into dataflow, functions (e.g., resource management) if including greatly simplified schemes for some sequential execution is permitted providing iteration and the use of sub- [Lee and Hurson 1994]. streams within streams. The book empha- sizes the benefits of these principles to business software, and to software engi- 4.3. Large-Grain Dataflow neering in general. Large-grain dataflow can begin with The key benefit of Morrison’s [1994] ap- a fine-grained dataflow graph. This proach is a much reduced development dataflow graph is analyzed and divided time. Empirical evidence of this is of- into subgraphs, much like the threaded fered in his book, where real-life experi- approach. However, instead of remaining ence of a large piece of software devel- as groupings of associated nodes, the oped with flow-based techniques is cited. subgraphs are compiled into sequential The approach led to a considerable sav- von Neumann processes. These are then ings of time and effort on the part of run in a multithreaded environment, the programmers, particularly when it scheduled according to the usual dataflow came to modifying the program after ini- principles. The processes are termed tial completion. Morrison expounded at macroactors [Lee and Hurson 1994]. length on the benefits of the stream-based, Much recent work has been done in the large-grained modular approach to busi- area of large-grain dataflow systems ness software engineering in particular. and it offers a great opportunity for If these concepts were applied to large- improvements to the field of dataflow grain dataflow, the advantages of both programming. dataflow execution and Morrison’s graph- One important point is that, since the ical component-based software could be macroactors in large-grain dataflow are merged. sections of sequential code, there is no rea- Similar techniques have already son why these have to be derived from fine- been used in one key area—digital sig- grained dataflow graphs. The macroactors nal processing [Bhattacharyya 1996; could just as easily be programmed in an Naggar et al. 1999]. Many signal- imperative language, such as C or Java. processing environments, such as Ptolemy Each macroactor could represent an en- [Bhattacharyya 1996], operate by letting tire function, or part of a function, and the user connect together components, could be designed to be used as off-the- each of which performs a medium-grained shelf components. It is the fact that the programming task. The whole network macroactors are still executed according is essentially a dataflow network. Some to dataflow rules that lets this approach work has been done in formalizing such retain the clear advantages of dataflow, networks [Lee and Parks 1995]. It has but solve the high-overhead problem been noted that these principles were of fine-grained dataflow. Research has used by the signal-processing community been conducted into the best degree of before being formalized in research, and granularity by Sterling et al. [1995], have, therefore, already been demon- who found a medium-grained approach strated to be beneficial to software to be optimal. engineering [Lee and Parks 1995]. A related field is the “flow-based pro- gramming” methodology advocated by Morrison [1994]. Morrison advocated the 5. 
A related field is the "flow-based programming" methodology advocated by Morrison [1994]. Morrison advocated the use of large-grain components, expressed in an imperative language, but linked together in a way reminiscent of dataflow. Although it is not dataflow—it does not strictly obey the dataflow firing rules—Morrison's proposals do suggest features that could be incorporated into dataflow, including greatly simplified schemes for providing iteration and the use of substreams within streams. The book emphasizes the benefits of these principles to business software, and to software engineering in general.

The key benefit of Morrison's [1994] approach is a much reduced development time. Empirical evidence of this is offered in his book, where real-life experience of a large piece of software developed with flow-based techniques is cited. The approach led to a considerable savings of time and effort on the part of the programmers, particularly when it came to modifying the program after initial completion. Morrison expounded at length on the benefits of the stream-based, large-grained modular approach to business software engineering in particular. If these concepts were applied to large-grain dataflow, the advantages of both dataflow execution and Morrison's graphical component-based software could be merged.

Similar techniques have already been used in one key area—digital signal processing [Bhattacharyya 1996; Naggar et al. 1999]. Many signal-processing environments, such as Ptolemy [Bhattacharyya 1996], operate by letting the user connect together components, each of which performs a medium-grained programming task. The whole network is essentially a dataflow network. Some work has been done in formalizing such networks [Lee and Parks 1995]. It has been noted that these principles were used by the signal-processing community before being formalized in research, and have, therefore, already been demonstrated to be beneficial to software engineering [Lee and Parks 1995].

5. RECENT DEVELOPMENTS IN DATAFLOW PROGRAMMING LANGUAGES

5.1. Introduction

The last major reviews of dataflow programming languages were Hils [1992] and Whiting and Pascoe [1994].

In the decade since then, the field of dataflow has expanded and diverged to include many disparate areas of research. However, the focus of this section is strictly on programming languages that are based upon the dataflow execution model. Indeed, there are some languages that have the appearance of dataflow, but upon examination, it is clear that they are sufficiently different from the pure dataflow execution model to make such a label questionable, for example, JavaBeans.

Since the majority of developments in dataflow programming languages in the past decade have been in the field of visual programming languages, this section also concentrates on visual programming languages. It should be stressed that the textual dataflow languages detailed in previous sections still exist and are being developed, although most of the current research in that area is in the field of hardware and compilation technology. Since the emphasis in this article is on software engineering rather than hardware, these issues are beyond the scope of this section. Hardware issues are mentioned only insofar as they have affected the development of dataflow languages.

As has already been outlined in the previous section, the major development in the past 10 years in dataflow has been the move away from fine-grained parallelism toward a more coarse-grained approach. These approaches ranged in concept from adding limited von Neumann hardware to dataflow architectures, to running dataflow programs in a multithreaded manner on machines that were largely von Neumann in nature.

To some extent, dataflow languages evolved to meet these new challenges. A greater emphasis was placed upon compiling dataflow programs into a set of sequential threads that were themselves executed using the dataflow firing rules. However, these changes did not have a major effect upon the languages themselves, whose underlying semantics did not have to change.

From a software engineering perspective, however, the major development in dataflow in the past 15 years has been the growth of dataflow visual programming languages (DFVPLs). Although the theory behind DFVPLs has been in existence for many years, it is only the availability of cheap graphical hardware in the 1990s that has made it a practical and fruitful area of research.

Investigations of DFVPLs have indicated many solutions to existing problems in software engineering, a point which will be expanded upon below. It has also led to the introduction of new problems and challenges, particularly those associated with visual programming languages in general [Whitley 1997], as well as continuing problems, such as the representation of data structures and control-flow structures [Auguston and Delgado 1997; Ghittori et al. 1998; Mosconi and Porta 2000]. Research has been fairly intense in the past decade, and it is the subject of this section to identify some of the main trends in dataflow programming over this period.
5.2. The Development of Dataflow Visual Programming Languages

In Section 3, textual dataflow languages were discussed, and much of the research into dataflow hardware utilizes these textual languages. The "machine" language of programs designed to be run on dataflow hardware architectures is the dataflow graph. Most textual dataflow languages were translated into these graphs in order to be scheduled on the dataflow machine. However, early on it was realized that these graphs could have advantages in themselves for the programmer [Davis 1974; Davis and Keller 1982]. Graphs allow easy communication of ideas to novices, allowing much more productive meetings between the developer and the customer [Baroth and Hartsough 1995; Morrison 1994; Shürr 1997]. In addition, a range of research into VPLs has indicated the existence of significant advantages in a visual syntax [Green and Petre 1996], for example, dynamic syntax and visualization [Hils 1992; Shizuki et al. 2000]. The fact that several dataflow environments have been the basis of successful commercial products adds weight to this case [Baroth and Hartsough 1995].

Finally, research has shown that most developers naturally think in terms of dataflow in the design phase, and DFVPLs remove the paradigm shift that is forced on a programmer when entering the coding phase. Indeed, DFVPLs arguably remove this distinction altogether [Baroth and Hartsough 1995; Iwata and Terada 1995].

Researchers published papers on DFVPLs intermittently in the 1980s. Their ideas were intriguing and showed great promise, but were restricted by the expense and low diffusion of graphical hardware and pointing devices. Davis and Keller [1982] recognized the now universally accepted trend toward more graphically based computer systems, and made the argument that textual languages could be completely replaced by graphical ones in the future. Although this prediction has not fully come to pass, they judiciously proposed that human engineering rather than concurrent execution would become the principal motivation for developing dataflow visual programming languages, a motivation that has indeed been at the fore of more recent DFVPL research.

5.2.1. Early Dataflow Visual Programming Languages. In the 1970s, Davis [1974, 1979] devised Data-Driven Nets (DDNs), a graphical programming concept that was arguably the first dataflow visual language (as opposed to a graph used purely for representation). In DDN, programs are represented as a cyclic dataflow graph with typed data items flowing along the arcs, which are FIFO queues. The program is stored in a file as a parenthesized character string, but displayed as a graph. The language operates at a very low level and, in fact, Davis [1978] commented that it was not the intention that anyone should program directly in DDNs. Nevertheless, they illustrated key concepts such as the feasibility of providing iteration, procedure calls, and conditional execution without the use of a textual language.
By the early 1980s, Davis had developed a more practical, higher-level DFVPL known as GPL (Graphical Programming Language). Davis and Lowder [1981] contended that text-based programming languages lacked intuitive clarity and proposed going further than using graphs as a design aid by creating an environment in which the program is a graph. GPL was also an attempt to create a higher-level version of DDNs [Davis 1979; Whiting and Pascoe 1994]. In the GPL environment, every node in the graph was either an atomic node or could be expanded to reveal a sub-graph, thereby providing structured programming with top-down development. These subgraphs could be defined recursively. Arcs in the graph were typed, and the whole environment had facilities for debugging, visualization and text-based programming if desired. The lack of suitable graphical hardware for their system was the main reason for a lack of rapid development of these concepts.

In the early 1980s, researchers Keller and Yen [1981] developed FGL, independently from Davis. FGL stands for Function Graph Language, and was born from the same concept of developing dataflow graphs directly. Unlike the token-based dataflow model of GPL, FGL was based around the structure model, of which Keller was a proponent [Davis and Keller 1982] (see Section 2.1.2). Under this model, data is grouped into a single structure on each arc rather than flowing around the system. In other regards, FGL was similar to GPL in its support for top-down stepwise refinement. The relative advantages and disadvantages of GPL and FGL mirror those of the token-flow model and structure model, respectively.

Shortly afterwards, the Grunch system was developed by de Jong et al. [1982], the same researchers who created the Cajole textual dataflow language [Hankin and Glaser 1981]. While not a programming language in the proper sense, it was a graphical overlay for Cajole that allowed the developer to graphically express a dataflow program using stepwise refinement, and then use the tool to convert the graph into Cajole.

The actual conversion was performed by an underlying tool called Crunch. The development of Grunch supported the claims of Davis and Keller [1982] that software engineering could be as much a motivation for pursuing graphical dataflow as the pursuit of efficient parallelism.

5.2.2. More Recent Dataflow Visual Programming Languages. Interestingly, from the mid-1980s on, further development of DFVPLs often came from different sources than direct research into dataflow. Indeed, industry played a part in this phase of development. The most common source was signal- and image-processing, which lends itself particularly well to a dataflow approach [Buck and Lee 1995]. Therefore, many DFVPLs were produced to solve specific problems and utilized dataflow because it provided the best solution to the problem. As Hils [1992] commented, DFVPLs in this period were most successful in narrow application domains and in domains where data manipulation is the foremost task.

Hils [1992] provided details of 15 languages developed in the 1980s and very early 1990s that could be classed as DFVPLs. In order to avoid repetition, only two examples of these are discussed here. NL, a significant language that appeared after Hils wrote his paper, is also described.

LabView is a well-known DFVPL developed in the mid-1980s to allow the construction of "virtual" instruments for data analysis in laboratories. As such, it was intended for use by people who were not themselves professional programmers. A program in LabView is constructed by connecting together predefined functions, displayed as boxes with icons, using arcs for data paths.
Each program also has a visual interface to allow the design of the virtual instrument. Components that have a visual representation appear both in the interface and the program, whereas functions only appear in the program window. The whole program is executed according to the dataflow firing rules. LabView makes the programming experience less cumbersome by providing iterative constructs and a form of stepwise refinement whereby programmers can produce their own function nodes.

Empirical evidence reported by the Jet Propulsion Laboratory [Baroth and Hartsough 1995] has shown a very favorable experience with LabView when used for a large project, compared to developing the same system in C. In particular, they found that the DFVPL led to a significantly faster development time than C, mainly due to the increased communication facilitated by the visual syntax. An example of a program written in LabView is shown in Figure 5. As well as its demonstrated and continuing industrial successes, LabView has proved particularly popular with researchers [Ghittori et al. 1998; Green and Petre 1996].

ProGraph was a more general-purpose DFVPL than LabView, and involved combining the principles of dataflow with object-oriented programming. The methods of each object are defined using dataflow diagrams. Like LabView, ProGraph includes iterative constructs and permits procedural abstraction by condensing a graph into a single node. ProGraph has also been used as a subject in research [Cox and Smedley 1996; Green and Petre 1996; Mosconi and Porta 2000]. Example screenshots of ProGraph programs can be found in Mosconi and Porta [2000].

In the mid 1990s, the language NL was developed by Harvey and Morris [1993, 1996], along with a supporting programming environment. It is fully based on the dataflow model of execution. NL has an extended typing system, whereby arrays can behave as arbitrarily long lists, to the point of being infinite. It provides an ingenious method of control flow, through combined use of "block" and "guard" nodes. For example, a guard node may contain a condition that, if evaluated to true, causes its associated block node to be executed. Sequences of guard nodes can be created, and once one guard has been executed, all others are ignored. This has the advantage of reducing screen clutter and making the flow of control more explicit. Loops are supported by a related method.


Fig. 5. Example program in LabView designed to find the real roots of a quadratic equation.

The NL environment reported in Harvey and Morris [1996] features a visual debugger. Screenshots of typical NL programs can also be found in this article. Programmers see their program in the same way that they developed it and can choose to step one firing at a time, or use breakpoints. A node that is firing is highlighted, and placing the cursor over a port allows them to examine and change values. When it comes to loops, a slider allows the programmers to select any of the iterations that are taking place and examine them in any order.

5.2.3. Dataflow as a Coordination Language. Gelernter and Carriero [1992] emphasized the concept of the coordination language, that is, the concept that programming consists of two tasks, not one: computation, which specifies what is to be done, and coordination, which specifies how the computations are to proceed. They argued that the need for such a distinction was becoming more necessary with the advent of distributed and heterogeneous computer systems. The proposal was for a separation of the computation language and the coordination language.

Dataflow researchers have taken this idea on board. Indeed, since dataflow graphs explicitly express the relationships between computations, it is clear that dataflow is a natural coordination language. While few researchers have gone so far as to create an entirely independent general-purpose co-ordination language based on dataflow ideas, many have produced DFVPLs that strongly display the distinction. For example, the Vipers DFVPL [Bernini and Mosconi 1994] is a coordination language where the nodes in the graph are expressed using the language Tcl.

Morrison's [1994] flow-based programming concept, while it does not strictly obey the rules of dataflow, describes a system where nodes are built in arbitrary programming languages which the programmer arranges using a single network editing environment. Morrison [1994] reported empirical evidence that appears to support his assertion that this method is practical in real-world situations.

An excellent example of such a system is Granular Lucid (GLU), which was developed by Jagannathan [1995].

It is based upon Lucid, with the key addition that functions are defined in a foreign, probably sequential, language such as C. Data types are also of a foreign format. Since Lucid itself is a textual dataflow language, GLU allows a much more coarse-grained approach to dataflow by the programmer. Instead of primitive operations being executed in a fine-grained manner, this allows the rules of dataflow to be applied to a much coarser granularity. Jagannathan [1995] went on to show how this degree of granularity achieves performance similar to conventional parallel languages and concluded that using dataflow programming languages to develop applications for conventional parallel processors is feasible.

All of the above-mentioned languages are examples of how dataflow programming may be moved to a higher level of abstraction. For example, a language in which entire functions are enclosed within a node could be envisaged, while the nodes themselves are executed using the dataflow semantics. This would be a development of the ideas put forward by Bernini and Mosconi [1994] and by Rasure and Williams [1991].
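One way to picture such a separation is sketched below. This is a hypothetical Python example of ours: the GRAPH wiring plays the role of the coordination language, while the registered functions stand in for node bodies that could equally be foreign C or Tcl routines, in the spirit of GLU or Vipers; none of the names here come from those systems.

    # Coordination half: a declarative wiring of named nodes.
    GRAPH = [                       # (node name, input wires, output wire)
        ("read",   [],    "a"),
        ("square", ["a"], "b"),
        ("show",   ["b"], None),
    ]

    # Computation half: opaque bodies, registered by name.
    COMPUTATION = {
        "read":   lambda: 7,
        "square": lambda x: x * x,
        "show":   lambda x: print("result:", x),
    }

    def evaluate(graph, registry):
        wires, pending = {}, list(graph)
        while pending:
            progressed = False
            for step in list(pending):
                name, ins, out = step
                if all(w in wires for w in ins):          # dataflow firing rule
                    value = registry[name](*[wires[w] for w in ins])
                    if out is not None:
                        wires[out] = value
                    pending.remove(step)
                    progressed = True
            if not progressed:                            # guard against an unconnected graph
                break

    evaluate(GRAPH, COMPUTATION)    # prints "result: 49"

The point of the sketch is only that the wiring and the node bodies are specified independently; swapping the registry for foreign-language routines would not change the coordination half at all.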
With the recent trend toward heterogeneous distributed systems and component-based programming, it is believed that thinking of dataflow as a coordination language has much merit and one that deserves further investigation.

5.3. Assessment of Visual Dataflow Programming Environments

The power of any visual programming language depends more heavily upon its environment than its text-based counterparts. The ease with which tasks can be performed has a large bearing on how it compares to other languages. Those who have used DFVPLs in industry have commented that the visual nature is an essential component of the language, not simply an interface, and that without the visualization tools offered by the environment, DFVPLs would have limited use [Baroth and Hartsough 1995].

In keeping with this fact, the trend in the late 1990s has been toward developing programming environments in tandem with the DFVPLs that they use. Indeed, so tightly bound have the two become that it has become difficult to distinguish where the language ends and the environment begins. Therefore, this section necessarily overlaps with language issues. Many of the advantages of DFVPLs are advantages of their environments as much as of the language.

Burnett et al. [1995] discussed what is needed in order to scale up a visual programming language to the point of being a practical proposition for a sizeable real-world project. They came up with a list of four things that VPLs are trying to achieve. These are the reduction in key concepts, such as pointers; a more concrete programming experience, such as exploring data visually; explicit definitions of relationships between tasks; and immediate visual feedback.

DFVPLs have the potential to achieve all of these to some degree. The dataflow graph itself is an ideal example of an explicitly defined relationship, and it is true that they have a smaller set of key concepts than their textual counterparts. A number of the languages reviewed by Hils [1992] feature a high degree of liveness, that is, immediate visual feedback, with one, VIVA [Tanimoto 1990], allowing programmers to dynamically edit the program while it is running visually in front of them. Finally, the visual debugger in Harvey's NL environment [Harvey and Morris 1996] is a good example both of the immediate feedback of information and the visual exploration of data.

Of course, it is more difficult to measure how well a DFVPL meets the criteria that it sets out to achieve. There are few metrics available in the literature at this stage, although Kiper et al. [1997] offered one set of subjective metrics for measuring VPLs in general. Their criteria included its scalability, its ease of comprehension, the degree of visual nature of the language, its functionality, and its support for the paradigm. The first of these points, that of scalability, has been answered to some degree by Burnett's work, mentioned above [Burnett et al. 1995].


An attempt to measure the comprehension of dataflow languages was made by Green and Petre [1996]. They studied ProGraph and LabView at length, and concluded that they had clear advantages. They drew the following interesting conclusions regarding the current state of DFVPLs:

—that DFVPLs allow the developer to proceed with design and implementation in their own order, thus making the design process freer and easier;
—that secondary notation could be utilized much more than it currently was;
—that more work needed to be conducted on incorporating control-flow constructs;
—that the effectiveness of program editors remained to be investigated in the literature;
—that the problem of real estate was not as major as many assume it to be.

Further feedback on what is needed in DFVPLs was provided by Baroth and Hartsough [1995]. Having used a DFVPL for a real-world project, they concluded that the advantages offered lie more toward the design end of the software lifecycle, and less in the later stages of coding. They found increased communication between developer and customer, commenting, "We usually program together with the customer at the terminal, and they follow the data flow diagrams enough to make suggestions or corrections in the flow of the code. It is difficult to imagine a similar situation using text-based code" [Baroth and Hartsough 1995, p. 26]. The development time improvement in this case was a factor of 4.

By contrast, Baroth and Hartsough [1995] commented that the provision of software libraries, while speeding up coding, is merely a case of "who can type faster," and is not an advantage in itself. And so, the issue of the provision of a library of nodes is not a major one for DFVPLs. Already, a DFVPL can be used with a very primitive set of nodes, and the provision of nodes that can be built up from these primitives is really an issue for the vendor of the programming environment, not for academia.
A point that Baroth and Hartsough [1995] were keen to stress was that visualization, and animation in particular, is absolutely essential to making the tool useful. Indeed, they went so far as to comment that "the graphics description of the system without the animation would not be much more than a CASE tool with a code generator" [Baroth and Hartsough 1995, p. 28].

A final point made by Baroth and Hartsough [1995] was that the boundaries between the requirements, design, and coding phases of the software lifecycle collapse and blend into one another. This appears to be both an advantage and a disadvantage. It is a problem in that the existing methodologies in place were unable to support the tool and this led to an inability to assess the progress in the project. On the other hand, the single phase allows the customer to be involved at all stages, reducing the prospect for expensive mistakes, and also reducing development time.

It is the previously mentioned emphasis on the design phase that prompted the development of Visual Design Patterns (VDPs) by Shizuki et al. [2000; Toyoda et al. 1997]. Under the VDP approach, the user is equipped with generic design patterns of common task layouts. Developers choose a VDP to suit their needs and then insert specific components into the holes in the pattern in order to produce an actual implementation. The concept has been introduced into the KLIEG environment and demonstrated in the literature. In their more recent paper, Shizuki et al. [2000] extended the idea to include the possibility that the use of VDPs could help to focus a smart environment on the specific aspects of dataflow execution that a developer is likely to be interested in.

Animation is an important concept that was highlighted above. There is a significant difference between viewing a program graphically, and viewing it dynamically. The animation of executing dataflow programs is an exciting topic, but one in which research has only recently been undertaken in detail. A good example is Shizuki et al. [2000], which explored how a large program can be animated for a programmer. This addressed the problems of how to view multiple layers at once, how to view different areas of the program at once, how to change focus rapidly so as to avoid loss of concentration, and how to create sufficiently smooth animation that will not appear disjointed to the developer. The solution proposed is a smart, multifocal, fisheye algorithm. Much research deserves to be done in this area.

On the basis of this discussion, the following conclusions concerning dataflow visual programming environments can be drawn:

—In a DFVPL there is a blur in the distinction between language and environment.
—In addition, DFVPLs tend to significantly blur the distinctions between the requirements, design, coding, and testing phases of the software lifecycle.
—This blurring offers the opportunity for rapid prototyping.
—The design phase benefits the most from the use of DFVPLs over textual languages.
—The animation offered by a DFVPL environment is vitally important to its usefulness.
—The dataflow semantics of DFVPLs are intuitive for nonprogrammers to understand and thus improve communication between the customer and the developer.
—The library of functions included with a DFVPL is not a major factor in productivity.
—Key areas requiring work include the use of secondary notation, and control-flow constructs.

6. OPEN ISSUES IN DATAFLOW PROGRAMMING

Dataflow programming is an active area of research, and many problems remain open. Four of these issues are discussed in more detail in this section:

—the provision of iteration in textual dataflow languages,
—iteration structures in DFVPLs,
—the use of data structures,
—nondeterminism.

6.1. Iteration in Textual Dataflow Languages

Most dataflow programming languages provide loops of some form, but the way in which loops are expressed as a dataflow graph is quite different from most other representations of iteration. The problem arises because iteration does not fit neatly into the functional paradigm, as it involves repeated assignment to a loop variable and sequential processing. Nevertheless, most dataflow researchers recognized that programmers' demands made it necessary to provide iteration [Ackerman 1982] and worked on ways to make it mathematically respectable [Ashcroft and Wadge 1977]. Ways of making it efficient were also studied [Ning and Gao 1991]. It should be noted that many dataflow languages provide iteration through tail-recursion. However, as this is usual practice in functional languages, this section deals specifically with the more explicit iterative constructs.

The exact syntax of the various solutions offered differed, but they were all fundamentally the same. The idea was to think of the body of an iteration as being executed in an environment where the loop variable had a certain value that remained the same throughout the iteration. Thus, a single pass of the loop can be regarded as a set of definitions like any other. The loop variable is updated by using an identifier such as "NEW" to refer to the value that the loop variable will have on the next iteration. For example,

    NEW X = X + 1;


Fig. 6. Dataflow graph representing the factorial program.

As the value of X has not actually been changed by this statement, this is a mathematically acceptable way of representing iteration. When the loop has completed the iteration, the value of NEW X is assigned to X, but again, this is acceptable since all that is required is that the value of X remain unchanged during the single iteration. Some languages use an "old" keyword to achieve the same effect.

A piece of code to calculate factorial(N) by iteration, when translated into the functional loop favored by dataflow programming languages, looks like this:

    LOOP WITH i = N, fact = 1
      NEW fact = fact * i;
      NEW i = i - 1;
    WHILE i > 1;

In this code, the values of "fact" and "i" are defined functionally, using the loop. They are modified using the keyword "NEW." Note that the definitions of "NEW fact" and "NEW i" can be placed in any order. If the definition of "NEW i" were placed first, the definition of "NEW fact" would still be valid because the original value of i is unchanged until the end of the iteration. Note also that the value "fact" must be declared as a loop variable in the dataflow version of the loop because that is the only way that a variable can be assigned multiple times in the manner required by iteration.

While this code definitely looks different from the imperative example, it does, nevertheless, retain a strong imperative feel and could be used more intuitively by programmers when compared to tail-recursion.
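The correspondence between this construct and ordinary tail recursion can be sketched as follows. This is a Python rendering for illustration only; the LOOP/NEW syntax above is the dataflow-language form, and the function names here are ours.

    # Reading of the LOOP/NEW construct: each pass binds fresh values without
    # mutating the bindings used inside that pass (single assignment per pass).
    def factorial_loop(n):
        i, fact = n, 1                 # LOOP WITH i = N, fact = 1
        while i > 1:                   # WHILE i > 1
            new_fact = fact * i        # NEW fact = fact * i
            new_i = i - 1              # NEW i = i - 1 (order of these two is irrelevant)
            i, fact = new_i, new_fact  # the NEW values only take effect between passes
        return fact

    # The same computation written as the tail recursion that many dataflow
    # languages use instead of an explicit loop construct.
    def factorial_rec(i, fact=1):
        return fact if i <= 1 else factorial_rec(i - 1, fact * i)

    assert factorial_loop(5) == factorial_rec(5) == 120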

Of course, this code has to be translated into a dataflow graph before it can be executed. While a loop in a dataflow graph can look complicated, most loops can be coded in the same way. Figure 6 shows a dataflow graph that could result from the above dataflow code example.

It cannot be denied that this representation is much less succinct than the text-based loop. However, the point is not that a loop can be drawn directly as a graph, but that the text-based loop can be converted into a well-behaved dataflow graph. Few dataflow researchers would expect any programmer to manually generate the graph shown in Figure 6. This also illustrates one of the failings of early graphical dataflow languages. However, as we shall note in the next section, recent developments in dataflow research permit programs to be specified graphically without this level of detail.

In Figure 6, the rectangles marked "Switch" and "Merge" operate as explained in Section 2.1.1. The "Delay" gate simply waits until data appears at both inputs before outputting the data on the left arc and discarding the data on the right arc. This acts as a trigger, preventing the loop from repeatedly executing ad infinitum. With the Delay gate, a single token passed down the trigger arc will cause one execution of the loop. The squares with "1" in them are constants that repeatedly generate tokens with the value "1." The circles are nodes that perform operations. These operations produce either numerical or Boolean results. The small grey circles labeled "F" signify tokens that are defined to be present on the given arcs when the program first activates. Three horizontal parallel lines denote a sink, which destroys any tokens that fall into it. A small open square at a crossing of two arcs indicates that the arcs are joined. In all other cases, arcs pass over each other without being joined.
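For readers who want the routing behaviour of the two control gates in executable form, the following is an illustrative Python sketch of the semantics assumed here for token-flow graphs; it is not code from the paper, and the function names are ours.

    # A Switch routes a data token to one of two output arcs according to a
    # Boolean control token; a Merge forwards the token taken from the input
    # selected by its control token.
    def switch(data_token, control_token):
        # returns (token on the "true" arc, token on the "false" arc)
        return (data_token, None) if control_token else (None, data_token)

    def merge(true_input, false_input, control_token):
        # consumes a token only from the selected input arc
        return true_input if control_token else false_input

    t, f = switch(42, True)                   # t == 42, f is None
    print(merge("from-T", "from-F", False))   # picks the token on the "false" input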
If executed under the pure token-based dataflow model, with N = 3, the 32 separate firings necessary to complete the execution are performed in 14 time units, with parallelism in each time unit of either two or three instructions. The left-hand side of the graph produces a sequence of tokens representing the counter "i," starting at 1. The right-hand side produces a series of tokens representing the accumulating factorial by multiplying the previous factorial token by a token from the first sequence each time. The node in the center of the graph halts the feedback once the value of i ≤ N becomes false, and the Switch gates are used to output the completed values. The value of "i" is discarded, while the value of the factorial is sent out of this portion of the graph.

When executed under the pure token-based dataflow model, as above, the graph exhibits some pipelining within one instance of the loop, as the next iteration can begin before the previous one has completed. However, the necessary conditional check delays the execution of the Switch nodes that could otherwise begin to execute sooner. When the situation of several loops being executed at once is considered, that is, by several triggers arriving simultaneously, the pure model permits very little overlap of separate loop instances: in this case, the maximum overlap is under 10%. This is because the token that will begin the next loop is delayed by the Merge gate until a false token arrives, and this only occurs when the previous loop has completed.

When executed under the dynamic model, the loop does not provide any more pipelining of iterations within a loop, but it does provide excellent overlap of separate iterations. In fact, they can occur simultaneously and independently. Under the dynamic model, the "false" tokens that are initially placed at the two Merge gates will be present initially for each separate instance of the loop, rather than having to be generated by the previous loop as it completes. It should also be noted that other loops, such as those which populate arrays, do not have the pipelining problems mentioned above.

As illustrated above, some loops in dataflow graphs have the potential to limit concurrency. However, the use of alternative models of execution can limit this restriction. Although it is not totally natural in functional languages, iteration has been accepted as necessary by most researchers. Indeed, Whiting and Pascoe [1994, p. 53] commented that "the introduction of this form of loop construct ... was responsible for much of the acceptance of data-flow languages." The efficient execution of dataflow loops has been a subject of active and ongoing research [Bic et al. 1995; Ning and Gao 1991; Yu and D'Hollander 2001].

6.2. Iteration in Dataflow Visual Programming Languages

Although iteration remains an open question in DFVPLs as well as textual dataflow languages, it is a different kind of problem. Here the problem is how to express a repetitive structure in a graphical model that does not naturally allow such structures.

Figure 6 shows how a loop looks under the pure dataflow model. Few programmers would wish to construct such a graph, and, if they did, it would be unclear and error-prone. It has long been recognized that a practical DFVPL must provide a better way to support iteration. The question has been what constructs are most appropriate for expressing iteration. It should also be noted that iteration is merely an example of the wider issue of how to express control-flow constraints in DFVPLs. However, since iteration is arguably the most important and heavily researched problem, this section concentrates on it.

A key recent article on this topic was Mosconi and Porta [2000], and we do not intend to reproduce their review. Instead, each of five examples of iteration constructs will be described briefly.

—Show-And-Tell. Show-and-Tell [Kimura and McLain 1986] was an early dataflow visual language designed for children. In its approach to iteration, a special node is used to enclose an area of code that is to be executed iteratively. Each loop box has what is known as a consistency check. Data can only flow through a node if it is consistent. If the consistency check evaluates to false, the node becomes inconsistent, and execution of the loop stops. The loop has the same number of inputs as outputs, and data is fed back from the outputs into the inputs, as long as the box is consistent. When it becomes inconsistent, the data is ejected to the rest of the graph. For example, a loop might contain one input that identifies the number of iterations required. This value is decremented and sent to the output during each iteration. The consistency check is that this value is greater than zero. Thus, when the iteration count reaches zero, the loop stops executing. Screenshots of Show-and-Tell loops can be found in Mosconi and Porta [2000].

—LabView. LabView, a commercial product, has two kinds of loop, a FOR loop and a WHILE loop [LabView 2000].
Like Show-and-Tell, a FOR loop is a special node that encloses all of the nodes to be executed iteratively. Unlike Show-and-Tell, it has an additional input port that specifies how many times the loop is to run. All other values that are output conceptually reenter on identical input ports. Another port visible only inside the loop specifies the current value of the loop variable.

The WHILE loop operates in a similar way, except that it does not have the loop variable. Instead, it has a port only visible inside the loop that terminates after the current iteration once it receives a value of "false." A construct unique to LabView allows the timings of the loop to be specified, for example, loop every 250 ms. This is due to its application of reading scientific instruments. A LabView program is shown in Figure 5. Further screenshots can be found in Mosconi and Porta [2000].

—Prograph. In Prograph [Cox et al. 1989], any user-defined node that has the same number and type of inputs as outputs can be deemed to be a loop. Its icon changes to illustrate this fact. Prograph provides a special "terminate" node for use within a loop. When the condition specified within the terminate node is satisfied, the iteration is terminated after the current iteration is complete.

—Cantata. Cantata [Rasure and Williams 1991] is a coarse-grained language in which nodes contain entire functions, rather than just a primitive operation. Its approach is to conceal the entire loop within one node. Each input is designated a name by the programmer, who also specifies either a loop variable and bounds, or a WHILE-condition, using the names. The programmer then sets up a series of assignments that are to take place within each loop. The node then executes the loop internally. Note that this is far removed from pure dataflow philosophy. For example, a loop assignment may contain the expression j = j + 1, a statement which traditionally makes no sense in a dataflow language. Examples of Cantata programs can be found in Mosconi and Porta [2000].


—VEE. In contrast with Cantata, loops in VEE [Helsel 1994] are expressed most closely to the pure dataflow model, that is, through cycles in the graph. However, the model has been augmented with a number of additional nodes in order to simplify the appearance of the cycles. In a FOR loop, a special FOR node generates a series of indexes between a range that are queued. The programmer does not need to worry about incrementing and feeding back the loop variable, being free instead to concentrate on the values being calculated. A WHILE loop can be set up by using three related nodes. The UNTIL BREAK node repeatedly activates the graph it is connected to, until the graph activates a related BREAK node which halts the repetition. Data arriving instead at the NEXT node triggers the next iteration.

Mosconi and Porta [2000] concluded their paper by proposing a syntax that is consistent with the pure dataflow model. They were keen to stress that they were not proposing their syntax for actual use, but to prove that practical iteration is possible without sacrificing the pure semantics of the model. Their loop system, implemented as part of the Vipers environment [Ghittori et al. 1998], includes cycles, but is simplified by the use of enabling signals. They also demonstrated a way to collapse an iteration into a single node without sacrificing the pure model.

All of these approaches have advantages and disadvantages. Some, such as Cantata, introduce imperative structures that are inconsistent with dataflow, although Cantata also offers the simplest loops in terms of visual syntax. Others, such as VEE, involve relatively complex graphs. All of them suffer from the inability to dynamically express the concept of a repetitive loop with a static icon. The whole area of control-flow constructs, and iteration in particular, remains an open topic in DFVPLs.
6.3. Data Structures

One of the key issues in the drive for an efficient implementation of dataflow is that of data structures. Whiting and Pascoe [1994] commented that "data structures sit uneasily within the data-flow model" (see also Treleaven et al. [1982] and Veen [1986]). However, they went on to note that much research has been undertaken in this area and that a number of quite successful solutions have been proposed, most notably I-structures [Arvind et al. 1989].

The "pure" token model of dataflow states that all data is represented by values that, once created, cannot be modified. These values flow around the dataflow graph on tokens and are absorbed by nodes. If a node wishes to modify this value, it creates a new token containing new data which is identical to the original data, except for the element that had to be altered. Some of the earliest dataflow languages that had support for data structures worked in this way [Davis 1979]. If this way of treating data as values rather than variables were not part of the token model of dataflow, then the single-assignment rule would have been violated and thus the data-dependent scheduling of the entire graph would be compromised.

While this conceptual view of data structures is perfectly fine for the theoretical study of dataflow, and perhaps even for dataflow graphs that deal only with primitive data types, this approach is clearly unsatisfactory for graphs that require the use of data structures. With the era of structured programming, followed by the era of object-oriented programming, the idea of software development without the use of data structures is virtually incomprehensible. Thus, any practical implementation of dataflow must include an efficient way of providing data structures, although it should be stated that some languages designed for research purposes solved the problem by not providing data structures at all [Hankin and Glaser 1981].


Fig. 7. Showing the effect on Dennis's data heap of modifying a value.

6.3.1. Dennis's Method. Dennis [1974] was the first to provide realistic data structures in a dataflow context by proposing that the tokens in the dataflow program hold not the data itself, but rather a pointer to the data (see also Davis and Keller [1982]). He devised a memory heap in the form of a finite, acyclic, directed graph where each node represents either an elementary value or a structured value which behaves much like an indexed array. Each element of a structured value is, in turn, a node that represents either an elementary value or a structured value. The pointers in the data tokens refer to one of these nodes and a reference count is maintained. A node which is no longer referred to, either directly or indirectly, in the graph is removed by an implicit garbage collector.

Dennis [1974] went on to show that implementation is possible without need of copying arbitrarily complex values. As Dennis's dataflow programs were always functional, it was necessary that modifying a value should result in a new value, without changing the original. Whenever an elementary value is modified, a new node is simply added to the heap. Whenever a structured value is modified, and there is more than one reference to the value, a new root node is added to the heap, pointing to the same values as the original root node, with the exception of the one value that was to have been modified, for which a new node is created. This is illustrated by Figure 7, which shows the effect on Dennis's data heap when a value A, which represents the array [a, b, c], is modified. The second element is modified to create a new value B, which represents the array [a, e, c]. In the memory heap, value B retains references to all the data that is not modified, thus saving time by not copying the entire data structure. Meanwhile, value A remains unmodified, preserving the functional semantics of the model.
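The copy-on-write behaviour described here can be illustrated with a small sketch. This is a simplification in Python, not Dennis's heap representation, and the names are ours: "modifying" one element of A builds a new root B that shares every untouched element with A, while A itself is never changed.

    def modify(structure, index, value):
        new_root = list(structure)     # new root pointing at the same element nodes
        new_root[index] = value        # only the changed element gets a fresh node
        return tuple(new_root)

    A = ("a", "b", "c")
    B = modify(A, 1, "e")
    print(A, B)            # ('a', 'b', 'c') ('a', 'e', 'c') -- A is preserved
    print(A[0] is B[0])    # True: the unchanged element is shared, not copied

Note that, as the surrounding text explains, even this reference-level sharing still creates a new root per modification, which is exactly the overhead that sequential update loops expose.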
This method prevents the unbounded copying of arbitrarily complex values. It also permits the sharing of identical data elements, which saves memory. However, it is not an ideal solution for all situations. For example, if the values in a 100-element array are being modified sequentially by a loop, this solution would require making 100 new data structures in the process, notwithstanding the fact that they are not copying the entire array each time. A good compiler could detect such a loop and prevent needless copying such as this. Excessive overhead in the Dennis approach was also examined and reported by Gajski et al. [1982].

A second problem, and one which became more evident as research progressed [Ackerman 1982], was that the use of data structures can reduce parallelism in a dataflow program due to the long delay between creating the structure, and all parts of it being completed. To use Ackerman's [1982] example, consider a dataflow program that has two main sections. The first creates a 100-element array and populates it, one element at a time. The second takes the array and reads the elements, one at a time. In this case, the second part cannot begin to execute until the first part is complete, even though it could be reading element 1 while the first part of the program is writing element 2, and so on. In this case, a program that could conceivably execute in 101 time units takes 200 time units to complete. This delay proved to be frequently unnecessary, and led to the development of I-structures.

6.3.2. I-Structures. To overcome this problem, Arvind and Thomas [1980] proposed a system that they called I-structures. They observed that the problem with Dennis's approach was that it imposed too strict a control structure on the computations that filled in the components of the data structure [Arvind et al. 1989].

Ideally, what was needed was some way of allowing more flexible access to the data structures but which—crucially—would not destroy the functional semantics of dataflow.

I-structures are related to lazy evaluation. A data structure is created in memory, but its constituent fields are left blank. Each field can either store a value or be "undefined." A value can only be stored in a field that is currently undefined. Thus I-structures obey the single-assignment rule. Any attempt to read from an undefined field is deferred until that value becomes available, that is, until a value is assigned to it.

By following this set of rules, a data structure can be "transmitted" to the rest of the program as soon as it is created, while the sender continues to populate the fields of that structure. Meanwhile, the receiver can begin to read from the structure. Referring again to Ackerman's [1982] example, this would dramatically improve the performance of some programs. Although they are somewhat opposed to the purity of dataflow, I-structures have been widely adopted [Arvind et al. 1988; Culler et al. 1995; Keller 1985].
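As a rough illustration of these rules (an assumed Python rendering of ours, not taken from Arvind and Thomas), each element of an I-structure may be written exactly once, and a read of an undefined element simply blocks until the producer fills it in, so producer and consumer can overlap.

    import threading

    class IStructure:
        def __init__(self, size):
            self._values = [None] * size
            self._filled = [threading.Event() for _ in range(size)]

        def store(self, index, value):
            if self._filled[index].is_set():
                raise RuntimeError("I-structure element written twice")  # single assignment
            self._values[index] = value
            self._filled[index].set()          # wake any deferred readers

        def load(self, index):
            self._filled[index].wait()         # defer until the element is defined
            return self._values[index]

    # A consumer starts reading while the producer is still populating the array.
    xs = IStructure(100)
    consumer = threading.Thread(target=lambda: print(sum(xs.load(i) for i in range(100))))
    consumer.start()
    for i in range(100):
        xs.store(i, i)                         # overlaps with the consumer's reads
    consumer.join()                            # prints 4950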
However, while I-structures do solve problems of unnecessary delays in functional dataflow programs, they do not address the initial problem of copying data structures in order to modify them. Gajski et al. [1982] pointed out other issues related to the overhead of storing and fulfilling deferred reads in this approach, although Arvind and Culler [1986] argued that this overhead is small and easily outweighed by the benefits. This latter view appears to be supported by the experiments of Arvind et al. [1988].

6.3.3. Hybrid Structures. Hurson et al. [1989] examined both Dennis's copying strategy and the later I-structures. As discussed above, they found that the former had high-overhead issues and wasted potential parallelism, while the latter wasted space as it was unable to share common substructures. They claimed that their proposed hybrid structures carefully combined the advantages brought forth by both copying and sharing. To demonstrate their proposed structure, they used arrays, although the method can be applied to any form of data structure.

Their method represents each array as an array template (for full details, see Hurson et al. [1989]). The array template has a reference count and can either refer to an original array or a modified array. In the case of an original array, the template points to a sequential area of memory that contains the elements of the array. Whenever an element of the array is "modified," a new "modified" array template is created to represent the new array. It contains an area of memory that represents the new array, but with only the modified value filled in. The other values are marked as absent and a link is provided back to the original array. An attempt to read from the new array will either return a modified value or, if the value is not there, the link to the original array will be followed and the value retrieved from there.

If another value is modified in the new array, and its reference count remains 1, it can be modified in situ without having to create a new array template. In order to prevent copying the entire array each time a new array is required, hybrid structures allow large arrays to be broken up into equally sized blocks, each storing a certain number of elements. Because the blocks are of identical size, looking for an element within them can be achieved in constant time. Blocks themselves have reference counts, allowing for sharing of subportions of arrays.

Experiments reported in the same paper [Hurson et al. 1989] suggest that hybrid structures lead to improvements in both performance and storage over the copying method of Dennis [1974] and I-structures, although the approach does contain a certain element of overhead.

While the three approaches outlined above, and their derivatives, have resolved many of the problems related to efficiently implementing data structures in the dataflow model, those problems nevertheless remain open issues. A brief overview of data structures and dataflow was given in Lee and Hurson [1993]. Efforts to reduce unnecessary copying and to reduce wasted memory from needless duplication, and the desire not to reduce parallelism when implementing data structures, remain topics for further research. Data structures will always sit uneasily within pure dataflow models, but as their provision is virtually essential, it is an issue that must necessarily be examined.

6.4. Nondeterminism

The deterministic nature of dataflow graphs has been promoted many times as an advantageous feature [Kahn 1974; Karp and Miller 1966; Kosinski 1973; Naggar et al. 1999; Verdoscia and Vaccaro 1998]. This is because the dataflow concept lends itself well to mathematical analysis and proofs [Kahn 1974] and nondeterminism would destroy or limit many essential properties. Weng [1975] observed that in von Neumann languages, concurrency constructs almost always introduce unwanted concurrency into programs, and that developing distributed systems in this model is made extremely difficult by this fact. Most dataflow programming languages are determinate, and the nondeterminacy in some of those that are not is not always intentional—often being the result of imperfect implementation decisions. Valid, Cajole, and DL1 have very limited nondeterminate features.

However, it is also widely accepted that there are many applications that actually require non-determinacy. These are systems that are essentially operating in non-determinant environments, such as booking systems and database access systems. This was recognized early in the development of dataflow languages. Dennis [1974] conceded the point, and Kahn [1974], after his detailed mathematical analysis of determinate dataflow graphs, conceded that his model was severely limited because it could produce only determinate programs.
While not dismissing the possibility of extending the theory to nondeterminate programs, the task appears daunting: Kahn [1974] remarked only that he did not think it was impossible, but did not find it obvious how to do it satisfactorily. This view was supported by Kosinski [1978], who reported that attempts to formalize nondeterminate dataflow graphs had been rather unsatisfactory due to their complexity.

The dichotomy in regard to nondeterminism appears to be the result of a division between those who wish to use dataflow as a means to ease the formal proof and analysis of programs and those who wish to use dataflow for all varieties of programming problems. The former regard determinacy as essential, whereas the latter regard the provision of nondeterminacy as essential.

This problem can be resolved by providing well-structured nondeterminacy. In admitting the need for nondeterminacy, Dennis [1974] nevertheless insisted that he wanted to be able to guarantee users of his language that his program was determinate if they desired such a guarantee. Arvind et al. [1977] proposed that nondeterminacy be permitted only by very explicit means, to provide it for those who want it, but guarantee its absence, if not. They demonstrated two constructs: the dataflow monitor and the nondeterministic merge as vehicles for this.

The nondeterminate merge appears to be able to solve many of the problems associated with the lack of nondeterminacy. Semantically, it is a node that takes two input arcs and one output arc and merges the two streams in a completely arbitrary way. In most cases, such as a booking system, this is all the nondeterminacy that is required. The advantage of this is that the nondeterminism can be readily identified. It is even possible to have nondeterminate subgraphs within a graph that is otherwise determinate. Therefore, it may be possible to apply mathematical principles to the graph even if it does have nondeterminate sections.
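A minimal sketch of the idea follows, with the arbitrary interleaving modelled by two producer threads feeding a single queue; the queue-based rendering and the names are ours, not a construct from the cited papers. Everything downstream of the merged stream remains determinate.

    import queue, random, threading, time

    def producer(name, out, n):
        for i in range(n):
            time.sleep(random.random() / 100)   # arrival order is deliberately arbitrary
            out.put((name, i))                  # the merge: both streams share one output arc

    merged = queue.Queue()
    threads = [threading.Thread(target=producer, args=(name, merged, 3))
               for name in ("bookings", "cancellations")]
    for t in threads: t.start()
    for t in threads: t.join()

    while not merged.empty():
        print(merged.get())   # e.g. ('cancellations', 0), ('bookings', 0), ... order varies per run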


If dataflow is to become an acceptable basis for general-usage programming languages, nondeterminacy is essential. As well as having disadvantages for formal proofs, nondeterminacy also damages the software engineering process by making debugging more difficult. Therefore, the question is how to successfully control the propagation of nondeterminacy in dataflow systems, but still permit the software engineer to write usable programs.

7. CONCLUSION

In this article, the history of dataflow programming has been charted to the present day. Beginning with the theoretical foundations of dataflow, the design and implementation of fine-grained dataflow hardware architectures have been explored. The growing requirement for dataflow programming languages was addressed by the creation of a functional paradigm of languages, and the most relevant of these have been discussed.

The discovery that fine-grained dataflow had inherent inefficiencies led to a period of decline in dataflow research in the 1980s and early 1990s. However, research in the field resumed in the 1990s with the acceptance that the best dataflow hardware techniques would come from merging dataflow and von Neumann techniques. This led to the development of hybrid architectures, whose primary trait was a move away from fine-grained parallelism toward more coarse-grained execution.

The most important development in dataflow programming languages in the 1990s was the advent of dataflow visual programming languages, DFVPLs, which have been explored. Integral to a DFVPL is its development environment, and these have been discussed. The change in motivation for pursuing DFVPLs toward software engineering has been noted. Finally, many issues remain open in dataflow programming, and four of these have been discussed.

Five key conclusions can be drawn regarding the current state of dataflow programming.

—The major change in dataflow research as a whole has been the move away from fine-grained parallelism towards medium- and coarse-grained parallelism.
—The major change in the past decade in dataflow programming has been the advent of dataflow visual programming languages.
—As it is visualization that is key to a visual programming language, the distinction between a dataflow visual programming language and its environment has become blurred and the two must now be treated as one unit.
—Dataflow languages increasingly deserve to be treated as coordination languages, an important area of research with the advent of heterogeneous distributed systems.
—The three key open issues in dataflow programming remain the representation of control-flow structures, the representation of data structures, and the visualization of execution.
In Proceedings of the IFIP WG2.2 Conference on the Formal Descrip- discussed. tion of Programming Lanugages (St. Andrews, Five key conclusions can be drawn re- Canada). garding the current state of dataflow ARVIND,GOSTELOW,K.P.,AND PLOUFFE,W. 1977. programming. Indeterminancy, monitors, and dataflow. In
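To make the debugging difficulty concrete, the sketch below is a hypothetical illustration only, not taken from the article or from any of the languages surveyed: it uses Python threads and queues to stand in for dataflow nodes and arcs, and the names producer and run_once, the token values, and the random delays are assumptions introduced purely for the example. Two producer nodes write tokens onto a shared arc, which acts as a nondeterministic merge point; the order in which tokens appear on that arc is fixed only at run time.

import queue
import random
import threading
import time

def producer(out_arc, name, values):
    # A node "fires" by writing tokens onto its output arc; the sleep models
    # execution and arrival times that vary from run to run.
    for v in values:
        time.sleep(random.uniform(0.0, 0.01))
        out_arc.put((name, v))

def run_once():
    merged_arc = queue.Queue()  # shared arc: the nondeterministic merge point
    nodes = [
        threading.Thread(target=producer, args=(merged_arc, "A", [1, 2, 3])),
        threading.Thread(target=producer, args=(merged_arc, "B", [4, 5, 6])),
    ]
    for t in nodes:
        t.start()
    for t in nodes:
        t.join()
    # Drain the merged stream; its ordering is not determined by the program text.
    tokens = []
    while not merged_arc.empty():
        tokens.append(merged_arc.get())
    return tokens

if __name__ == "__main__":
    # Two executions of the same "graph" on the same inputs may interleave
    # the tokens differently.
    print(run_once())
    print(run_once())

Running the program twice will typically print two different interleavings even though the graph and its inputs are identical, which is why faults in nondeterministic dataflow programs are hard to reproduce. On this reading, controlling the propagation of nondeterminacy amounts to confining such merge points to clearly marked parts of the graph, or to recording arrival orders so that a failing execution can be replayed.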


Received October 2001; revised September 2002, September 2003; accepted April 2004
