Generalized data flow graphs : theory and applications

Citation for published version (APA): Jong, de, G. G. (1993). Generalized data flow graphs : theory and applications. Technische Universiteit Eindhoven. https://doi.org/10.6100/IR402436



Gjalt Gerrit de Jong

Generalized data flow graphs
theory and applications

THESIS

presented to obtain the degree of doctor at the Technische Universiteit Eindhoven, by authority of the Rector Magnificus, prof. dr. J.H. van Lint, to be defended in public before a committee appointed by the College of Deans on Friday 8 October 1993 at 16.00

by

Gjalt Gerrit de Jong

born in Bierum

This thesis has been approved by the promotors prof. Dr.-Ing. J.A.G. Jess and prof. dr. ir. Th. Krol, and by the copromotor dr. ir. J.T.J. van Eijndhoven.

© Copyright 1993 Gjalt Gerrit de Jong

CIP-DATA KONINKLIJKE BIBLIOTHEEK, DEN HAAG

Jong, Gjalt Gerrit de

Generalized data flow graphs : theory and applications / Gjalt Gerrit de Jong. - [S.l. : s.n.]. - Fig., tab.
Thesis Eindhoven. - With index, ref. - With summary in Dutch
ISBN 90-9006384-6
NUGI 852/832
Subject headings: programming theory / programming verification / formal languages

Abstract

The area of application for the concepts defined and the theorems and properties proved in this thesis is the automatic generation of integrated circuits. Starting from a high level description of the specification of the function of a system, high level synthesis results in a network graph, i.e. a network or a description at the structural (register-transfer) level, and a control graph, defining which modules are active at what (time) instance. The network graph is the main result of the allocation phase of high level synthesis, whereas the control graph is the main result of the scheduling phase. The high level description typically is a program or algorithm describing the function of the system to be designed. The concept of data flow graphs is a well-suited formalism to start synthesis from, since it contains all the necessary details to do synthesis, without the bother of a specific syntax or the syntactic sugar of any language. It is therefore a general formalism. A data flow graph also imposes no restrictions on the execution of the system other than those prescribed by the data dependencies in the high level description.

In this thesis, the concept of data flow graphs is extended with so-called multi-destination and multi-origin edges, i.e. hyper edges. Data flow graphs conventionally have only edges with one origin and one destination. The introduction of multi-destination edges does not lead to any additional expressiveness, but the multi-origin feature does. The multi-origin edges lead to the notion of 'choice'. In this way, networks are a special type of these generalized data flow graphs, namely those in which all nodes are defined by various kinds of hardware modules instead of by some abstract type of behavior. The multi-destination and multi-origin edges in the data flow graph are just nets in a network. It is also shown that the control graph is an abstraction of the data flow graph under consideration. Therefore, high level synthesis is just a graph transformation on the data flow graph level. Scheduling turns out to be a partitioning of the flow graph, which is implied by the introduction of sequence edges. Sequence edges have in principle the same function as the normal data edges, namely to denote a precedence relation. Another important task of high level synthesis is to minimize the number of multiplexors and demultiplexors. These (de)multiplexors are introduced by scheduling (in which they normally are not mentioned explicitly at all), and by the control statements, like if and while, in the specification. Scheduling also is just a graph transformation that can be proved to be equivalence preserving. With the concepts presented here, some requirements are stated under which synthesis results in a properly and equivalently behaving network, which is only restricted in the number of execution sequences. These requirements turn out to be less restrictive than those of existing synthesis tools. This gives the opportunity to find better solutions.

Normal data flow graphs may be viewed as marked Petri nets, while choice is included with the generalized data flow graphs presented here, leading to possible non-determinism. In principle the notion of conflict is also added. Most results assume a conflict-free flow graph, since conflict is not used for the type of application of high level synthesis. However, conflict is considered in one of the main results, namely the reduction technique to compute as little as possible of the reachable state space of a data flow graph. This technique is called anticipation. It is e.g. proved that a choice-free graph is deterministic, and that only one state sequence of the reachable state space is necessary to extract the behavior of the flow graph. When the data flow graph is not choice-free, only a few execution sequences are needed. Since conflict in normal, i.e. uninterpreted, Petri nets and choice in data flow graphs are each other's dual, the same result is valid for Petri nets.

Several semantics are defined for the generalized data flow graphs: an operational semantics and two denotational semantics. One denotational semantics defines only the (input-output) behavior of a flow graph, while the other also defines the different execution sequences, i.e. the internal behavior. Another presented denotational semantics is based on a process algebra. All these semantics are proved to be equivalent, and they reduce to the conventional semantics as defined for normal, i.e. choice-free, data flow graphs. With these semantics, some property preserving transformations are defined, mainly of use for hierarchical expansion of nodes as well as for abstraction.

Furthermore, some examples are given to show the absence of any restriction in the data flow graph concept other than the data dependencies. These examples also show the power of the anticipation strategy, even for Petri nets, and illustrate some of the other stated theorems. It is also shown how data flow graphs can be used for specifications which at first sight violate the data flow principle, as is the case for some constructs commonly used in parallel programs and in control dominated specifications.

Also a technique is described which can be used to verify requirements that a system should satisfy, and which are not part of the functional description. Examples of such additional requirements are the freedom of deadlock and the absence of starvation for a set of cooperating flow graphs. The technique is model checking of temporal logic formulae. The use of temporal logic allows for a large variety of properties to be checked, and is not restricted to the classical ones such as liveness and safeness in Petri nets. An extension to this method of model checking is presented, in which constraints are suggested under which the system under consideration will satisfy certain additional requirements, if possible. These constraints are of the form of additional sequence edges, so that only the correctly behaving subset of the state space is reachable.

Samenvatting

The area of application of the concepts defined in this thesis, and of the theorems and properties proved in it, is the automatic generation of integrated circuits. Starting from a high level description of the specification of the function of a system, high level synthesis leads to a network graph, that is, a description at the structural, or register-transfer, level, and a control graph, which defines which modules are active at which moment. The network graph is the most important result of the allocation phase, while the control graph is the result of the scheduling phase. The high level description is usually a program or an algorithm that describes the function of the system to be designed. The data flow graph concept is a very suitable formalism for synthesis, because it contains all the necessary details needed to perform synthesis, without having to take the specific syntactic constructs of a language into account. It is therefore a general formalism. A data flow graph also imposes no restrictions on the execution of the system other than those prescribed by the data dependencies in the high level description.

In this thesis the data flow graph concept is extended with edges that have more than one endpoint and (more than one) starting point, the so-called hyper edges. Data flow graphs normally have only edges with one starting point and one endpoint. Allowing edges with several endpoints does not lead to greater expressiveness, but allowing edges with several starting points does. These latter edges lead to the notion of 'choice'. In this way a network is a special case of these generalized data flow graphs, namely one in which all nodes are defined as various kinds of hardware modules instead of by a certain kind of abstract behavior. The edges with several endpoints and starting points in the data flow graph are the same as nets in a network. It is also shown that the control graph is an abstraction of the data flow graph under consideration. Therefore high level synthesis is a graph transformation at the data flow graph level. Scheduling then turns out to be a partitioning of the flow graph, which is implied by the introduction of sequence edges. Sequence edges have in principle the same function as the normal data edges, namely they represent a precedence relation. Another important task of high level synthesis is the minimization of the number of multiplexers and demultiplexers. The (de)multiplexers are introduced by scheduling (in which they are usually not even mentioned explicitly) and by the control statements in the specification, such as the if and the while. Scheduling, too, is a graph transformation that can be proved to be behavior preserving. With the concepts presented here, a number of requirements are formulated under which synthesis results in a properly working and equivalently behaving network, in which the system is only restricted in the number of execution orders. These requirements turn out to be less restrictive than those of existing synthesis systems. This gives the opportunity to find better solutions.

Ordinary data flow graphs can be regarded as marked Petri nets, while choice is included in the generalized data flow graph concept presented here, which leads to possible non-determinism. In principle the notion of conflict is added as well. Most results assume a conflict-free flow graph, because conflict does not occur in the application area of high level synthesis. Conflict is, however, considered in one of the most important results, namely the reduction technique to compute as little as possible of the reachable state space of a data flow graph. This technique is called anticipation. It is proved, for example, that a choice-free graph is deterministic, and that only one state sequence of the reachable state space is needed to determine the behavior of the flow graph. Only a few execution orders are needed when the data flow graph is not choice-free. Since conflict in ordinary, that is, uninterpreted, Petri nets and choice in data flow graphs are each other's dual, the same result holds for Petri nets.

Several semantics are defined for the generalized data flow graphs: an operational semantics and two denotational semantics. One denotational semantics models only the (input-output) behavior of a flow graph, while the other also represents the different execution orders, that is, the internal behavior. Another denotational semantics that is presented is based on a process algebra. All these semantics are proved to be equivalent to each other, and to be equal to the semantics as defined for ordinary, that is, choice-free, data flow graphs. With these semantics some property preserving transformations are defined, which are mainly used in hierarchical expansion of nodes and for abstraction.

Furthermore, some examples are given to demonstrate the absence, in the data flow graph concept, of any restriction other than the data dependencies themselves. These examples also show the power of the anticipation strategy, even for Petri nets, and illustrate some of the other theorems. It is also shown how data flow graphs can be used for specifications which at first sight clash with the data flow principle, as is the case for certain constructs that frequently occur in parallel programs and in specifications in which control plays the most important role.

Also a technique is described that can be used to verify requirements which a system must satisfy and which are not part of the functional specification. Examples of such additional requirements are the absence of deadlock and the absence of 'starvation' for a set of cooperating flow graphs. The technique used is model checking of temporal logic formulae. Temporal logic allows a large variety of properties to be checked, and is not restricted to the classical properties of liveness and safeness in Petri nets. An extension to this method of model checking is presented, in which constraints are suggested under which the system under consideration could satisfy the given additional requirements, insofar as possible. These suggestions take the form of additional sequence edges, so that only the correctly behaving part of the state space is reachable.

Preface

At this place, I have to express my gratitude to Prof. Jochen Jess, who gave me the opportunity to do this research in the Design Automation Section of the Eindhoven University of Technology, finally resulting in this thesis. Not only because of the freedom given to the group during his absence while he performed his duties as Dean of the faculty of Electrical Engineering, but also because of my own non-determination, which already existed at the start of my study when I chose Electrical Engineering instead of Computing Science, my research did not always flow along everyone's expectations. During the entire time of this research, I felt I had to bridge the gap between Electrical Engineering, which is an applied science, and Computing Science, which, being as always rather impractical, satisfied my attraction to theoretical issues. Balancing between the EE and CS buildings, the other members of the group always pulled me towards the ground.

Leon Stok stood at the basis of this thesis, since he used the data flow graphs, of which I later began to see all the beauty, in his own research on high level synthesis. High level synthesis also gave me the feeling that many with an EE background may understand intuitively the structures they are dealing with, but lack the formal reasoning. The other direction, of computing scientists who work on designing, can also be seen. In my opinion, the best results will come from a symbiosis of both disciplines.

I should mention here Jos van Eijndhoven as manager of the Esprit BRA 3281 project, better known as the ASCIS project. He also stimulated me in discovering the possibilities of data flow graphs.

But most of my thanks must go to Geert-Leon Janssen as the example of an ideal position in a research group: interested in everyone's work, able to work on all the subjects of the group, and changing his own principal subject regularly. He introduced Temporal Logic to me, which we then explored together, but he stimulated me most by his interest in formal issues.

But apart from assisting in this laissez-faire group, although not drawn into its full consequences, the 'in education' part of the job was the most satisfying to me.

The daily traveling between home and work not only allowed me to do my real research then, but also gave me the opportunity to read! For this, the financial compensation allowed me to buy books almost at will. Together with the group of people I came to work with and who felt the same, this led to a self-development. So I would like to thank those to whom I am indebted (apart from the ones I met in the train, who are too many to name all): Pim Buurman, Hans Fleurkens, Leon Ham, Tekin Yilmaz and Zbyszek Struzik (from the last two especially the discussions and talks). But most of all Ronald Tangelder.

I also have to thank the Free Software Foundation, since this thesis would never have been made without its tools.

At last, I thank my parents for their support and understanding of what studying and reading means to me.

Contents

Abstract
Samenvatting
Preface

1 Introduction
  1.1 Data flow graphs
  1.2 Behavior
  1.3 Overview of some results

2 Theory
  2.1 Structural graph definitions
    2.1.1 General graph definitions
    2.1.2 Generalized data flow graphs
  2.2 Data flow graphs as an algebra
  2.3 Operational semantics of flow graphs
    2.3.1 Transitional semantics for choice-free graphs
    2.3.2 Transitional semantics for general graphs
    2.3.3 Operational semantics
  2.4 Behavioral denotational semantics of flow graphs
    2.4.1 Choice-free graphs
    2.4.2 General graphs
  2.5 Proof of equivalence
  2.6 Anticipation
    2.6.1 Liveness
    2.6.2 Safeness
  2.7 Well-behavedness
    2.7.1 Embeddings
    2.7.2 Well-structuredness
    2.7.3 Property preserving embeddings

3 Applications
  3.1 Dining philosophers
  3.2 Loop example
  3.3 High level synthesis
    3.3.1 Scheduling
    3.3.2 Allocation
  3.4 Parallel programs
    3.4.1 Shared variables
    3.4.2 Semaphores
    3.4.3 Queues
    3.4.4 Further examples
  3.5 Control dominated specifications
    3.5.1 State-transition based formalisms
    3.5.2 Delay insensitive circuits
  3.6 Requirement verification
    3.6.1 Temporal logic
    3.6.2 CTL model checking
    3.6.3 Model checking for data flow graphs
    3.6.4 Constraints and constraint generation
    3.6.5 Discussion
  3.7 Graph transformations
  3.8 Final remarks
    3.8.1 Specification-implementation paradigm
    3.8.2 Non-standard semantics
    3.8.3 Timing
    3.8.4 Arrays
    3.8.5 Implementation
    3.8.6 Using other formalisms instead

References

A Mathematical preliminaries
  A.1 Some basic notations
  A.2 Partially ordered sets and domains
    A.2.1 Tuples
    A.2.2 Streams
    A.2.3 Sets
  A.3 Lambda calculus
  References

B Additional semantics
  B.1 Execution graph denotational semantics of flow graphs
  B.2 Behavior expression semantics of flow graphs
  References

Glossary
  Symbols
  Functions

Curriculum Vitae

Chapter 1

Introduction

Starting from a functional or a behavioral specification, (automatic) synthesis of an integrated circuit can be divided into three steps:

1. High level synthesis, which transforms a behavioral specification, mostly written in an algorithmic way, to a structural description on the register-transfer level; i.e. an architecture is generated. The register-transfer description consists of a data path and a controller, which is a kind of finite state machine.

2. Logic synthesis. From the data path and the controller that result from high level synthesis, an implementation at the gate level must be made. The description is in principle a set of boolean functions. Logic synthesis includes state encoding and possible optimizations of the boolean specification. The modules of the network may be generated with specific module generators, for instance if they are parameterized on the word lengths. The boolean specification can be optimized and synthesized to the gate level by logic synthesis and optimization tools like [18].

3. Layout generation, which involves the placement and the routing of the gate level description and the interconnection between the modules and the off-chip IO.

The approach to solve the problem of high level synthesis depends on:

1. the type of network architecture, because clearly a mapping from an algorithm to a microprocessor-like architecture is different from a mapping to a 'full custom' implementation;

2. the constraints that are applied to the design, e.g. timing, area and power-dissipation constraints;

3. the behavioral description itself. Note that not the syntax of the description is meant, but the type of algorithm, e.g. whether the specification is data or control dominated. This may for instance depend on the area of application, and in that case is also related to the type of network architecture.

In this thesis, a formalism and a corresponding theory are described that can be used as a uniform system description for synthesis tools, and that can model the behavioral specification as well as the implementation on the architectural level.

A specification fed to a high level synthesis system is normally not a structural specification, since the logic synthesis phase starts from this level of abstraction. The high level specification is normally written down in an algorithmic way, because the specification itself does not have to deal with any implementation constraint but is just a way to express the functionality of the circuit and nothing about how to implement it. However, there also exist synthesis systems, e.g. [17], that are supposed to generate a system in which there is a one-to-one relation between the implementation and the specification. By consequence, these systems are in most cases syntax-directed. In those systems, almost no optimizations may be performed, because those are all considered to be design decisions that should be taken by the designer. In the former case, a behavioral specification does not state anything about a possible implementation, since for instance it is independent of the type of network architecture to which it is to be mapped. It might be possible that a designer tries different architectures to find the one that suits best. For certain types of network architectures, different synthesis systems may of course exist, each suitable for such a type of architecture, instead of a general one. The same applies to the 'format' of the behavioral description itself. Therefore, many specific description languages have been designed for specific types of applications, in which case the network architecture is fixed, too.

In this thesis, high level synthesis is considered to be a mapping from a behavioral specification to a 'full custom' hardware implementation. However, it is not restricted to this class, since any restricted type of network architecture, e.g. an architecture with a fixed number of (complex) 'computational' units and busses, can be modeled in the formal model presented here, too. Only the synthesis tools themselves must guarantee that such an architecture is generated.

An algorithmic behavioral specification may also be simulated, or even executed if it is written in a programming language, from which properties and other characteristics may be extracted that can guide further synthesis.

Of course, the synthesis system must be informed of the constraints that are imposed on the circuit by the designer or the environment. A timing constraint may for example state the maximal delay or the minimal throughput required. Another type of timing constraint is an ordering relation for some operations, which is not explicitly mentioned in the behavioral specification itself. Area and power-dissipation constraints are also independent of the behavioral specification itself, and may be extracted from the synthesized system. Instead of just generating an implementation and verifying afterwards whether the constraints are satisfied, a better approach might be to let the synthesis tools be directed by such constraints and have some means to evaluate the final result by 'figures' attached as attributes to the different elements of the specification. Constraints must then be related to the specification or to the architecture; e.g. the area constraint becomes a constraint on the number of modules that may be used, while the total timing constraint of the system must be related to the individual delays of the modules. It is advantageous to choose a system representation for the behavioral specification that also allows for such constraints to be represented. It is indeed possible to incorporate such 'technology independent' constraints in the formalism presented in this thesis.

Instead of a language on which the synthesis operates, an underlying graph model is presented in this thesis, because

- from an algorithmic point of view, i.e. for developing tools, many simple algorithms exist to operate on and manipulate graphs;

- it is in general easier to use than a language, because difficulties due to syntactic constructs do not arise;

- it is general, since many specification languages can be mapped onto this graph model.

The formalism is a generalization of ordinary data flow graphs as found for instance in [5, 37, 39, 132].

1.1 Data flow graphs

Signal flow and data flow graphs play an important role in high level synthesis, because they model just the essence of the behavioral specification. A behavioral specification is mainly a data processing task, i.e. a 'manipulation' of input data to produce output data. Data flow graphs are well suited for this purpose, since a data flow graph is a graph model representing just the data dependencies of the specification. It is therefore also the most parallel representation. Usually, a high level specification is a program written in a (subset of a) programming language, e.g. Pascal, or an adapted version, e.g. Hardware C [101], or a specifically designed 'hardware description' language, e.g. VHDL [2]. Although the term HDL applies better to a description on the register-transfer level, it is shown in this thesis that the network description can be modeled by the same formalism as the behavioral description. The data flow graph model is a general model, since programs written in languages as mentioned above can be mapped onto it. High level synthesis is the mapping of such a behavioral specification onto an architecture at the register-transfer level, i.e. a network of modules and a controller. Normally the network is represented by a network graph and the controller by a control graph. In other words, the high level synthesis task is a mapping from the data flow graph to a network and a control graph which are (inter)related.

Nodes in a data flow graph represent the operators of the program, while data flows along the edges as output of one node to the input of another node, i.e. edges represent values. The behavior of a node is that it generates a new value on its outputs depending on the input values. Instead of a single value on an edge, also a stream of values can be present. To link nodes with statements or operators in a program, nodes have two additional attributes: a type and a function. An example flow graph is given in figure 1.1, representing the expression (a + b) * (c - d). All nodes are of the operator type and are inscribed with their function.

Figure 1.1 Example of a simple flow graph

A node has input ports as well as output ports. In this way, the different inputs, respectively outputs, can be distinguished, which is needed for the correct computation of non-commutative operations, e.g. minus. Four types of nodes can be distinguished in data flow graphs: operator nodes, branch nodes, merge nodes and IO nodes. Operator nodes are, as also illustrated in figure 1.1, nodes inscribed with the function they perform on the values on their inputs, the result of which they put on their output. The type of the function is not restricted. Simple arithmetical functions like + and -, and relational functions like =, ≠ and ≥ may exist, but also more complex functions are allowed, as well as constant functions. An operator may also be a hierarchical node, i.e. the abstraction of another data flow graph, since the behavior of a data flow graph resembles that of an operator node, by computing from a set of input values a set of output values. The graphical representation of an operator node with input ports p_1 and p_2 and output port p_0 is shown in figure 1.2a.


Figure 1.2 Graphical representation of a) an operator node, b) a branch node, c) a merge node

A branch node is a conditional node whose behavior informally is to pass data from the data input port p_d to one output port p_i. Which output port is selected is determined by the value on the control (input) port p_c. In case of boolean values, the output ports are also labeled p_true and p_false respectively, thus modeling a kind of if statement. More generally, a branch node models an n-switch. The graphical representation of a 'boolean' branch node is given in figure 1.2b. Merge nodes are dual to branch nodes. The informal behavior of a merge node is passing data from one data input port p_i to the output port p_d. Which input port is selected is again determined by the value on the control (input) port p_c. The graphical representation of a 'boolean' merge node is given in figure 1.2c.
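To make these informal firing rules concrete, a minimal Python sketch follows; the function names and the single-token queues are assumptions of this illustration, not definitions from the thesis, but they capture that an operator node is strict while branch and merge nodes route tokens under control of p_c.

    def fire_operator(fn, inputs):
        # Strict: fires only when every input port holds a token; consumes
        # one token per port and produces the function result.
        assert all(inputs.values()), "operator nodes are strict"
        return fn(**{p: q.pop(0) for p, q in inputs.items()})

    def fire_branch(p_c, p_d):
        # Demultiplexer: the control token selects the output port that
        # receives the data token (an n-switch in general).
        c, d = p_c.pop(0), p_d.pop(0)
        return {'p_true': [d], 'p_false': []} if c else {'p_true': [], 'p_false': [d]}

    def fire_merge(p_c, p_true, p_false):
        # Multiplexer: the control token selects which input port is passed
        # on; the unselected port need not hold a token (non-strict).
        return (p_true if p_c.pop(0) else p_false).pop(0)

    # The expression (a + b) * (c - d) of figure 1.1, with a = 3, b = 4,
    # c = 9, d = 2; each edge holds a (stream of) value(s):
    e = fire_operator(lambda p_1, p_2: p_1 + p_2, {'p_1': [3], 'p_2': [4]})
    f = fire_operator(lambda p_1, p_2: p_1 - p_2, {'p_1': [9], 'p_2': [2]})
    g = fire_operator(lambda p_1, p_2: p_1 * p_2, {'p_1': [e], 'p_2': [f]})
    assert g == 49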

IO nodes appear in two types: get nodes and put nodes. A get node is an input action, i.e. reading a value from the external environment, while a put node models an output action, i.e. writing a value to the environment.

The concept of data flow graphs is a suitable model to describe the data dependencies in programs [5, 37, 39, 132], and therefore also system specifications at a high level [38, 123]. A data flow graph, or equivalently a process net, is just an alternative way of writing down a specification written in a functional or applicative programming language, since it defines in a sense a program with only single assignments, and therefore it is an equational or algebraic system [5, 38, 48, 132]. Each output edge of a node models a (unique) variable, defined as the function of the node applied to the values on the incoming edges. For a graph, this then yields a set of equations. But also imperative languages can be mapped onto such flow graphs [38, 122, 131]. This is also used in conventional software compilers [8]. Only a full, and preferably global, data flow analysis must be performed in that case [1]. An example of such a flow analysis can be seen from the following program text, of which figure 1.1 is the data flow graph.

    e = a + b;
    f = c - d;
    g = e * f;

Since no interrelation exists between the first two statements, the order in which they are executed is unimportant. As mentioned before, edges represent values instead of variables. The variables of the specification are only used to determine the data dependencies in the construction of the data flow graph from the behavioral specification, to obtain the maximally parallel solution. If variables were modeled explicitly in data flow graphs, this would already imply some assignment of values to registers, i.e. a form of storage allocation (Section 3.3.2). If this is desired, it is a kind of design constraint that the synthesis should satisfy.

However, the concept of data flow graphs is still not the only type of system representation used as the main internal data structure in high level synthesis systems. The system representations can be divided in the following types, as also mentioned in e.g. [44, 123]:

1. Tree representations. This tree is often the parse tree derived from a specification written in some textual interface [58, 95]. Examples of such synthesis systems are [40, 90, 95]. Since such tree representations do not resemble the behavior of a system very well, implementations of many optimization algorithms in high level synthesis become complex. Because variables instead of values are maintained in the description, each tool should do a flow analysis.

2. Semi data flow graph representation. This representation is similar to data flow graphs, except that edges represent variables instead of values. Thus no full data flow analysis is applied. Sticking to such a representation means that the search space to find optimal solutions is much smaller; in fact, the major part of the storage allocation has already been done.

A synthesis system that uses a semi data flow graph representation is described in [76]. A somewhat more relaxed approach can be found in [25], in which edges represent indeed only a precedence relation between the operator nodes. But apart from this, mappings are defined between the variables and (the ports of) the nodes. Thus here also, a (partial) storage allocation is already performed.

3. Separate data and control graph representations. This representation is the most common one. In principle this is even true for conventional software compilers. Straight line code is modeled by a pure data flow graph, on which subsequently many optimizations can be applied [9, 75]. The control structure of the specification language (i.e. the conditional and repetitive statements and the procedure calls) is modeled in a separate control graph. Thus a partition into so-called 'basic blocks' is made (see also section 3.2), yielding a set of flow graphs instead of a single one. Global data flow analysis then becomes difficult [1].

In [86] and [108] such systems are described. Similar systems are described in [109, 133], but simple conditional constructs are included into the data flow graph in these systems.

4. Combined data and control graph representations. This is the type of model that is dealt with in this thesis and also used in [44, 123]. Similar to the separate data and control graph representation, straight line code is modeled in the flow graph (thus securing the same optimizations), but also the control structure is embedded in this single graph by branch and merge nodes, instead of in a separate control graph. This allows for global optimizations. The advantage is then that it is not yet fixed how the control schemes are to be implemented, for example as real data-dependent control in the data path or by means of states allotted in the controller.

Another example of this strategy is SIL [77, 78], although there exist some fundamental, conceptual differences with the flow graph model described here.¹ The model used in [97] may also be considered as an example of this type. Here a partition of the flow graph into basic blocks, as described above, is already imposed on the graph, yet it is not modeled by a separate control graph.

Signal flow graphs are also commonly used as the internal design representation, especially when digital signal processing is the field of application. In principle a signal flow graph is identical to a data flow graph, but it possesses a delay node that plays a special and significant role. In data flow graphs, such a delay operator does not really exist, since time is not mentioned explicitly; only (partial) orderings are dealt with. Time, and therefore the introduction of clock cycles, is introduced by scheduling, which partitions the flow graph into such clock cycles. When a partition is made, registers or other types of storage nodes (see also section 3.3) are inserted at the boundaries of clock cycles. In such a graph, delay nodes may of course be used, too: control has then already been made (partially) explicit by using delay operators. Together with other results (see e.g. section 3.2), this makes clear that the data flow graph concept is the most unrestricted one, and thus is able to serve a larger class of optimizations.

As mentioned before, a data flow graph defines an equational system, or a process net, from a functional point of view [5, 38, 48, 132]. Each node represents a function, or operation. Instead of being as general as functional languages, the nodes in the flow graphs described in this thesis represent only strict functions. To allow for conditionals, and therefore non-strict functions, the branch and merge nodes are used. This is motivated by the type of application: hardware synthesis. In hardware, all modules are, in a sense, strict. The only non-strict modules are the (de)multiplexors. Thus in this way, a close correspondence exists between a high level system specification and an implementation at the structural level, i.e. the so-called network graph. This may even be accomplished by a one-to-one mapping from nodes in the flow graph to modules in the implementation: by mapping each operator node to a module performing the same function, and each branch and merge node to a (de)multiplexor, a hardware equivalent of the flow graph results.

1. SIL graphs are restricted to a single-token flow concept, while the flow graph described here follows the multiple-token concept, since more than one value, or token, may be present on a data flow graph edge. Also, in SIL there is a different notion of time, as delay operators are a basic node type; i.e. SIL is considered to be a signal flow graph. In SIL there is an (external) time frame defined by the unit of delay of the delay nodes. But, in addition, another (internal) time frame is used in the definition of the 'latest arrival' of tokens at joins.

Commonly, edges in data flow graphs have a single origin and a single destination [5, 39]. By having multi-destination edges, there is no need for special copy [48] (or fork [38] or link [39, 74]) nodes, which occur abundantly in such flow graphs. Without these copy nodes, more than one edge could be connected to an output port of a node, leading, according to the interpretation used in this thesis, to the notion of conflict (Definition 2.21). The hardware equivalent of an edge is a wire, and wires may have multiple destinations. To keep up the one-to-one correspondence between flow graphs and hardware, the edges are allowed to have multiple destinations, and the copy nodes are then superfluous. A value that flows along a multi-destination edge is carried to all the destinations of this edge. An example of a flow graph that can be seen as a direct hardware implementation is given in figure 1.3a.


Figure 1.3 Data flow graph of abs(x) with multi-destination and multi-origin edges

For the same reason of a one-to-one correspondence between flow graphs and hardware, edges having multiple origins are allowed, representing a kind of 'wired-or', and leading to the notion of choice (Definition 2.20). Each value output by a node to one origin of such a choice edge is carried to (all the) destinations of the edge. When values are put on such an edge by more than one origin, the order of these values at the destination is unknown. Thus with respect to the behavior (this is defined by the different types of semantics in sections 2.3 and 2.4, and also in appendix B), these choice edges may cause non-deterministic behavior. The join at ports in SIL [77, 78] has similar semantics to the choice edge. However, a join considers only the latest arriving value(s), and results in one 'token' with a set of values. The choice edges as described here, i.e. the edges with more than one origin, will result in more than one 'token', each with a value. In figure 1.3b such general hyper edges are illustrated, i.e. edges with possibly more than one origin and destination; this example is an optimized (hardware) version of figure 1.3a.

Another reason that hyper edges are advantageous over simple edges, i.e. edges with a single origin and destination, is the easier definition of the correlation between (streams of) values at the different terminals of a hyper edge, instead of between two or more (structurally independent) edges. Also, high level synthesis becomes just a graph transformation for graphs with these hyper edges (Section 3.3).

Fundamentally, the type of the IO nodes is the same as the type of the operator nodes. They are distinguished here for the following reasons:

1. they play an important role in hardware, although many high level synthesis systems do not deal with them.

2. they have a channel attribute through which the communication with the external world of the flow graph takes place. The IO nodes must be chained; for a deterministic communication with the external environment, a total ordering must be defined per channel. The IO nodes may be primitive (asynchronous, i.e. event driven) nodes, but it is also possible that these IO nodes incorporate some protocol, e.g. a handshake protocol, to communicate with the environment. In this way, they may also be viewed as hierarchical nodes.

Note that communication channels play a role identical to that of edges when a set of communicating or cooperating graphs is considered. Communication via a channel means a broadcast, since a message output by a put node is sent to all the get nodes connected to the same channel. Thus for a channel, all the corresponding get nodes are the destinations of that channel, and the corresponding put nodes are the origins. This is illustrated in figure 1.4.

Figure 1.4 An IO channel as an edge

Fundamentally there is one type of edge. However, from a practical point of view, two types of edges exist:

1. data edges. Data edges are the edges that connect nodes and are used to transport data from one node, defining a value, to another node, using that value.

2. sequence edges. Sequence edges have the same function as data edges, but do not carry the value of a data item, at least no values that are really used by the nodes to compute new values. They are solely used to model a timing constraint in which an (additional) ordering of the nodes is defined.

Thus, since both types of edges have equal semantics, no difference is made between them.
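As an illustration of such generalized edges, and of IO channels behaving as edges, the following sketch (class and method names invented here) models an edge as a pair of origin and destination sets; the shuffle only dramatizes that the arrival order of values from different origins is undetermined.

    import random

    class HyperEdge:
        # An edge with possibly several origins and destinations. A value
        # written at any origin is broadcast to all destinations; values
        # written at different origins arrive in an unknown order (choice).
        def __init__(self, origins, destinations):
            self.origins = set(origins)            # (node, output-port) pairs
            self.destinations = set(destinations)  # (node, input-port) pairs
            self.pending = []                      # written but not yet delivered

        def put(self, origin, value):
            assert origin in self.origins
            self.pending.append(value)

        def deliver_one(self):
            random.shuffle(self.pending)           # order is non-deterministic
            value = self.pending.pop(0)
            return {dest: value for dest in self.destinations}

    # An IO channel behaves identically: put nodes act as origins, get
    # nodes as destinations, and every message is a broadcast (figure 1.4).
    wired_or = HyperEdge({('v1', 'p_out'), ('v2', 'p_out')},
                         {('v3', 'p_in'), ('v4', 'p_in')})
    wired_or.put(('v1', 'p_out'), 'x')
    wired_or.put(('v2', 'p_out'), 'y')
    print(wired_or.deliver_one())  # 'x' or 'y'; both destinations get the same value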

1.2 Behavior

As mentioned in the previous section, data flow graphs are generalized in such a way that edges are allowed to be hyper edges, i.e. in general they can have more than one origin and destination. In this thesis the theory of pure data flow graphs, i.e. graphs without hyper edges, is extended to such generalized data flow graphs. A formal semantics is defined for flow graphs, giving an exact interpretation for them. In this way, one does not have to reason only informally about flow graphs and their behavior. Also the tasks performed by high level synthesis can then be formalized, just as other optimizations that may be performed by synthesis and analysis tools.

Besides the structural definition of data flow graphs, also the dynamics, i.e. the behavior, of the flow graph model is formalized by defining an interpretation or meaning for the structure. This formal semantics models the input-output, or external, behavior in a natural way, namely as a data processing unit: a set of input values results in a set of output values. Clearly, with the concept of a process network, this may be done repetitively, and the behavior of a flow graph may therefore be viewed as 'evaluating' from a set of streams of values on the input edges of the graph to a set of streams of values on the output edges of the graph. The exact 'computation' of this evaluation is up to the synthesis tools, although the behavioral specification can already be viewed as a suggestion, which may therefore be used as starting point. Not only small changes may be applied, but also complex transformations may be performed.

The internal behavior of a flow graph is just the extension of the external behavior by including the internal edges, instead of only considering the streams of values on the input and output edges of the graph. But the internal behavior also includes the different orders in which the nodes are executed, since the data flows through the graph along the edges by the execution of nodes. The internal behavior therefore models also in which ways the input-output behavior is obtained. The internal as well as the external behavior is related to the concept of a state, which is a mapping from node-(input)port pairs to (streams of) values. With this notion of a state, the external behavior is a relation from an (input) state to an (output) state. The internal behavior states also how, i.e. by execution of which nodes and in which order, a state is transformed into another.

Of course, sequence edges influence the behavior of a graph. For a deterministic system, the input-output behavior is unique, i.e. having only internal non-determinism. Adding sequence edges to a graph reduces the number of possible (internal) executions, i.e. reducing the internal non-determinism. For a deterministic system, adding sequence edges does not change the input-output behavior, except for the case of introducing deadlock. For an externally non-deterministic system, the addition of sequence edges may also lead to reducing the external non-determinism.²

2. An implementation of a system is mostly (see Section 3.8.1) a system that should have an input-output behavior equivalent to the specification. Only the internal non-determinism may be restricted, e.g. by adding sequence edges, as is principally the method in high level synthesis (Section 3.3). Thus in such a case, reducing the external non-determinism is not intended.

To define the behavior of a graph, or equivalently to assign a meaning or an interpretation to a graph, the following types of semantics may be used [124]:

- operational semantics: an operational semantics states something about how a program can be computed, i.e. it defines an interpreter.

- denotational semantics: a denotational semantics states what a program means, i.e. what value it has.

- axiomatic semantics: an axiomatic semantics states how one can reason about a program, for instance to derive properties of a program.

In this thesis, an operational semantics and denotational semantics are given for flow graphs. From the discussion above, it might be concluded that an operational semantics is defined to model the internal behavior, while the denotational semantics represents the external behavior. This is true, but a denotational semantics is also presented to describe the internal behavior.

The equivalence between these different semantics is proved. Without such an equivalence notion, the different semantics cannot be shown to represent the same behavior. More than one semantics is defined, since each semantics has its advantages with respect to applicability. The operational semantics is the easiest to understand, since it follows closely the informal lines of reasoning about the behavior of a flow graph by consecutive execution of nodes. Therefore, the operational semantics can be considered as a way to build a (non-deterministic) simulator for flow graphs.

An (equivalent) axiomatic semantics is not defined. This can be a subject of further research, although a beginning is made in section 2.2, where an algebra is presented that can be used as a model for such a semantics. The propositions stated there might be used as soundness proofs of a deduction scheme with those algebraic laws.

The denotational semantics for the external behavior defines for a graph immediately the input-output relation. From this point of view, the behavior of a graph is identical to the behavior of a node, leading to notions of hierarchy and interchanging of (sub)graphs with other (sub)graphs having the same abstract behavior. In this thesis, also a less restrictive characterization is given, by which a subgraph may be replaced by another graph without changing the behavior of the entire graph.

Also, a denotational semantics is defined that views the behavior of the data flow graph as a kind of process algebra. With this, a theory like that of CCS [104] and CSP [60] might be developed. Such an approach has already shown many interesting applications, as for instance the vast amount of papers on the use of formal description techniques like LOTOS [3] and Estelle [4] shows.
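The contrast between the first two styles can be previewed on the flow graph of figure 1.1. The sketch below is not the formal semantics defined in chapter 2; it is a simplified, assumed rendering in which the operational view is a step-wise interpreter over a state (edges mapped to token queues) and the denotational view is the input-output function itself.

    # node -> (input edges, output edge, function); an illustrative encoding
    NODES = {
        'add': (('a', 'b'), 'e', lambda x, y: x + y),
        'sub': (('c', 'd'), 'f', lambda x, y: x - y),
        'mul': (('e', 'f'), 'g', lambda x, y: x * y),
    }

    def step(state):
        # Operational: execute one enabled node (all input edges non-empty).
        for ins, out, fn in NODES.values():
            if all(state[i] for i in ins):
                state[out].append(fn(*[state[i].pop(0) for i in ins]))
                return True
        return False

    def run(state):
        while step(state):
            pass
        return state

    def denote(a, b, c, d):
        # Denotational: the graph means this input-output function.
        return (a + b) * (c - d)

    state = {'a': [3], 'b': [4], 'c': [9], 'd': [2], 'e': [], 'f': [], 'g': []}
    assert run(state)['g'] == [denote(3, 4, 9, 2)]  # both views yield 49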

1.3 Overview of some results

Besides the formalization of the data flow graph model, also with regard to the dynamics, a theoretical framework is presented. This allows one to prove properties of graphs, possibly under certain restrictions, and eases the synthesis and analysis of graphs. For instance, it is shown that pure data flow graphs, i.e. flow graphs without multi-origin edges, do not have external non-determinism.

The formalization also shows that better solutions exist than those searched for by current high level synthesis tools, i.e. the design space is even larger than the currently used one. Normally the synthesis tools impose some, mostly intuitive, restrictions onto the data flow graph of the specification. An example of such a restriction is the partitioning of the flow graph in separate control sections, like conditional and repetitive statements; i.e. the control scheme for such statements is already fixed, and thus the control graph to a large extent, too. When designing a system, this restriction is not at all obligatory; it just eases the algorithms used in synthesis tools and also restricts the design space. Here it is shown that these restrictions are not necessary for a correct behavior. It is also proved that these restrictions are too severe even for generating a safe network, i.e. a network in which each edge, or wire, contains at most one data item at a time. A theorem, which is less restrictive, is given to obtain a safe network. Therefore, with the theory presented in this thesis, developers of high level design tools may get a formal and exact notion of what the restrictions and assumptions of their tools are with respect to the power of generalized flow graphs. Because it is shown that the data flow graph concept is the least restrictive formalism, in that only the data precedences are represented and no other presumptions, it is more powerful than commonly thought.

Besides the results on expansion of nodes as mentioned in the previous section, also some reduction techniques are presented that reduce the analysis cost to a very large extent. These reductions still allow for deciding which properties, e.g. safeness and liveness, are valid.

In the last chapter, the applicability of the theory is illustrated. It is indeed shown that the data flow concept does not impose any other restriction than the data dependency and allows for other solutions than normally found. For instance, the reduction technique is shown to be really effective, as for example the exponential state space of the dining philosophers problem is reduced to a linear one while still all the important properties are preserved. It is also proved that high level synthesis is just a graph transformation of which it can be proved that it is behavior preserving. With the formalization, also the correctness of other graph transformations and optimizations can be proved.

Some verification issues are addressed, too. Verification means that a system must be shown to satisfy some requirements. These requirements are not part of the behavioral specification, but state additional properties like freedom of deadlock and fairness. An approach is presented in which the verification does not only give a yes-or-no answer whether the system satisfies a requirement, but also gives some suggestions how the specification must be changed in order to satisfy the requirements.

The last section can be read as some concluding remarks of this thesis. The appendix contains not only notations that are used throughout this thesis, but also defines several semantic domains and functions on these domains that are used when presenting the different semantics.

Chapter 2

Theory

This chapter is the core of this thesis, as it introduces data flow graphs as a special type of graphs. Apart from the structural definition of generalized flow graphs, mainly the dynamics, i.e. the behavioral interpretation, of such graphs is considered. Not only are the different semantics that define the behavior of a graph proved to be equivalent, but also theoretical results are stated that may be used to reduce the analysis costs to a large extent.

2.1 Structural graph definitions

This section contains definitions and notations to describe the structural properties of data flow graphs.

2.1.1 General graph definitions

As discussed in the introduction, a particular type of graphs is the main point of interest in this thesis. These graphs are directed graphs, but the edges are considered as hyper edges¹ [82], i.e. edges with possibly more than one origin and destination. Each node has a set of input ports and output ports, while edges are directed from output ports p_out of nodes v to input ports p_in of nodes w.

1. Such edges may also be called multi-edges. But this nomenclature may clash with the terminology used for weighted edges as in [127].

Let V, E, P_in, P_out be sets.

Definition 2.1 A graph G is a 7-tuple (V, P_in, P_out, E, I, O, φ) where

- V is the set of nodes of graph G

- P_in is the set of node input ports of graph G

- P_out is the set of node output ports of graph G

- E is the set of edges of graph G

- I: V → P(P_in) is a mapping denoting the set of input ports of a node

- O: V → P(P_out) is a mapping denoting the set of output ports of a node

- φ: E → P(V × P_out) × P(V × P_in) is the connection function of graph G

with P_in ∩ P_out = ∅ and P(A) denoting the powerset of a set A (Notation A.5).

Example 2.1

Figure 2.1 shows an example graph G = (V, P_in, P_out, E, I, O, φ) with V = {v_1, ..., v_4}, P_in = {p_in}, P_out = {p_out}, E = {e_1, ..., e_5}, ∀v ∈ V: I(v) = {p_in} ∧ O(v) = {p_out}, and φ(e_1) = (∅, {(v_1, p_in)}), φ(e_2) = (∅, {(v_2, p_in)}), φ(e_3) = ({(v_3, p_out)}, ∅), φ(e_4) = ({(v_4, p_out)}, ∅) and φ(e_5) = ({(v_1, p_out), (v_2, p_out)}, {(v_3, p_in), (v_4, p_in)}).

Figure 2.1 A graph
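As an executable illustration (not a data structure prescribed by the thesis), the 7-tuple of example 2.1 can be transliterated into Python directly, writing phi for φ and plain sets for the sets P(·):

    V     = {'v1', 'v2', 'v3', 'v4'}
    P_in  = {'p_in'}
    P_out = {'p_out'}
    E     = {'e1', 'e2', 'e3', 'e4', 'e5'}
    I     = {v: {'p_in'} for v in V}     # I(v) = {p_in} for every node
    O     = {v: {'p_out'} for v in V}    # O(v) = {p_out} for every node
    phi   = {  # edge -> (origins in V x P_out, destinations in V x P_in)
        'e1': (frozenset(), frozenset({('v1', 'p_in')})),
        'e2': (frozenset(), frozenset({('v2', 'p_in')})),
        'e3': (frozenset({('v3', 'p_out')}), frozenset()),
        'e4': (frozenset({('v4', 'p_out')}), frozenset()),
        'e5': (frozenset({('v1', 'p_out'), ('v2', 'p_out')}),   # a hyper edge:
               frozenset({('v3', 'p_in'), ('v4', 'p_in')})),    # 2 origins, 2 destinations
    }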

The disjointness of the namespaces of P_in and P_out is not a real drawback, since bidirectional ports can also be modeled by a proper renaming. However, allowing bidirectional ports complicates the discussion of graphs unnecessarily. It is allowed that fst(φ(e)) = ∅ or snd(φ(e)) = ∅, where fst and snd are the disassembling functions of pairs, or in general tuples (Definition A.56). Such edges denote (partially) unconnected edges, which are the input and output edges of the graph (Definition 2.11). Of course, instead of using separate sets of nodes and ports, only one set of ports is sufficient to describe graphs. A node is then just a cluster of input ports p_in ∈ P_in and output ports p_out ∈ P_out, since the behavior of a node is in principle defined as a relation between its input ports and its output ports. In that case φ becomes a relation from P(P_out) to P(P_in). Here the more common way of viewing a graph as a set of nodes and a set of edges is chosen, allowing different nodes to have the same port names.

Notation 2.2 When graphs G or G_i are not specified any further, G and G_i are used as abbreviations for the graphs (V, P_in, P_out, E, I, O, φ) or (V_G, P_in_G, P_out_G, E_G, I_G, O_G, φ_G) and (V_i, P_in_i, P_out_i, E_i, I_i, O_i, φ_i) respectively.

Notation 2.3 In the sequel
- V, V_G, V_i represent sets of nodes
- P_in, P_in_G, P_in_i represent sets of input ports
- P_out, P_out_G, P_out_i represent sets of output ports
- E, E_G, E_i represent sets of edges
- I, I_G, I_i represent mappings of input ports of nodes: I: V → P(P_in), I_G: V_G → P(P_in_G) and I_i: V_i → P(P_in_i) respectively
- O, O_G, O_i represent mappings of output ports of nodes: O: V → P(P_out), O_G: V_G → P(P_out_G) and O_i: V_i → P(P_out_i) respectively
- φ, φ_G, φ_i represent connection functions: φ: E → P(V × P_out) × P(V × P_in), φ_G: E_G → P(V_G × P_out_G) × P(V_G × P_in_G) and φ_i: E_i → P(V_i × P_out_i) × P(V_i × P_in_i) respectively.²

2. In many cases, the subscript G or i is omitted for the sets P_in and P_out since they are considered to be global.

Notation 2.4 In the sequel
- v, v_i, w represent nodes
- p, p_i represent (input and output) ports
- e, e_i represent edges

Notation 2.5 𝔾 denotes the class of graphs.

Notation 2.6 In this thesis, many functions are defined more than once. The type of the argument(s) indicates which definition is valid for each given instance. This overloading of functions only occurs when these functions have similar informal meanings.

Definition 2.7 orig: E → P(V × P_out), dest: E → P(V × P_in) are defined as:
orig(e) = fst(φ(e))
dest(e) = snd(φ(e))

Definition 2.8 from: E → P(V), to: E → P(V) are defined as:
from(e) = {v∈V | ∃p∈P_out: (v, p)∈orig(e)}
to(e) = {v∈V | ∃p∈P_in: (v, p)∈dest(e)}
The functions from and to yield the nodes of which an edge is an output respectively an input edge.

Notation 2.9 •e = from(e), e• = to(e)
This notation (as well as notation 2.17) is mainly introduced as a shorthand, also to show the similarity with Petri net theory [21].

Definition 2.10 The sets of connected ports of a node are given by the functions In: V → P(P_in), Out: V → P(P_out), which are defined as:
In(v) = {p∈P_in | ∃e∈E: (v, p)∈snd(φ(e))}
Out(v) = {p∈P_out | ∃e∈E: (v, p)∈fst(φ(e))}

Proposition 2.1 In(v) ⊆ I(v) ∧ Out(v) ⊆ O(v)

Definition 2.11 The sets of input and output edges of a graph G are given by the functions I: 𝔾 → P(E), O: 𝔾 → P(E), which are defined as:
I(G) = {e∈E | fst(φ(e)) = ∅}
O(G) = {e∈E | snd(φ(e)) = ∅}
In this way, an edge e∈E cannot be both an input edge and an output edge of a graph G, except when the edge e is fully unconnected. The latter case is only pathological.

Definition 2.12 The sets of input and output nodes of a graph G are given by the functions In: 𝔾 → P(V), Out: 𝔾 → P(V), which are defined as:
In(G) = {v∈V | ∃e∈I(G): v∈e•}
Out(G) = {v∈V | ∃e∈O(G): v∈•e}

Now follows a set of definitions of adjectives to (elements of) graphs, where 𝔹 denotes the boolean values (Notation A.1). Example 2.2 will illustrate these adjectives.

Definition 2.13 duplicate: E × E → 𝔹, parallel: E × E → 𝔹 are defined as:
duplicate(e1, e2) ≡ e1 ≠ e2 ∧ φ(e1) = φ(e2)
parallel(e1, e2) ≡ e1 ≠ e2 ∧ orig(e1)∩orig(e2) ≠ ∅ ∧ dest(e1)∩dest(e2) ≠ ∅
Two edges are each other's duplicate when they connect exactly the same node-port pairs, i.e. they are identical. Two edges are in parallel when they are partially identical.

Proposition 2.2 duplicate(e1, e2) ⇒ parallel(e1, e2)

Definition 2.14 adjacent: E × E → 𝔹, adjacent: V × V → 𝔹 are defined as:
adjacent(e1, e2) ≡ e1 ≠ e2 ∧ (orig(e1)∩orig(e2) ≠ ∅ ∨ dest(e1)∩dest(e2) ≠ ∅)
adjacent(v, w) ≡ v ≠ w ∧ ∃e∈E: {v, w} ⊆ •e ∪ e•
Adjacency defines a neighboring relation in that adjacent edges have one or more ports in common; adjacent nodes have at least one edge in common.

Proposition 2.3 parallel(e1, e2) ⇒ adjacent(e1, e2)

Definition 2.15 independent: V × V → 𝔹 is defined as:
independent(v, w) ≡ v ≠ w ∧ ¬∃e∈E: {v, w} ⊆ •e ∪ e•
Thus two nodes are adjacent when they are not independent, and vice versa:

Proposition 2.4 independent(v, w) ⇒ ¬adjacent(v, w)

Definition 2.16 The sets of input, or incoming, and output, or outgoing, edges of a node v are given by the functions in: V → P(E), out: V → P(E), which are defined as:
in(v) = {e∈E | v∈e•}
out(v) = {e∈E | v∈•e}

Notation 2.17 •v = in(v), v• = out(v)
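Continuing the Haskell sketch above, the functions of Definitions 2.7 through 2.16 can be rendered directly; the names are adapted where they would clash with Haskell's Prelude.

    -- Definitions 2.7-2.16 over the Graph sketch introduced earlier.
    orig, dest :: Graph -> Edge -> Set (Node, Port)
    orig g = fst . conn g
    dest g = snd . conn g

    fromN, toN :: Graph -> Edge -> Set Node    -- from(e) = 'bullet-e', to(e) = 'e-bullet'
    fromN g e = Set.map fst (orig g e)
    toN   g e = Set.map fst (dest g e)

    inE, outE :: Graph -> Node -> Set Edge     -- in(v) = 'bullet-v', out(v) = 'v-bullet'
    inE  g v = Set.filter (\e -> v `Set.member` toN g e)   (edges g)
    outE g v = Set.filter (\e -> v `Set.member` fromN g e) (edges g)

    overlap :: Ord a => Set a -> Set a -> Bool
    overlap a b = not (Set.null (Set.intersection a b))

    duplicate, parallel, adjacent :: Graph -> Edge -> Edge -> Bool
    duplicate g e1 e2 = e1 /= e2 && conn g e1 == conn g e2
    parallel  g e1 e2 = e1 /= e2
      && overlap (orig g e1) (orig g e2) && overlap (dest g e1) (dest g e2)
    adjacent  g e1 e2 = e1 /= e2
      && (overlap (orig g e1) (orig g e2) || overlap (dest g e1) (dest g e2))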

Definition 2.18 succ: V → P(V), pred: V → P(V) are defined as:
succ(v) = {w∈V | ∃e∈E: e∈v• ∧ w∈e•}
pred(v) = {w∈V | ∃e∈E: e∈•v ∧ w∈•e}
succ(v) are the successor nodes of a node v; pred(v) are the predecessor nodes.

Definition 2.19 self-loop: V → 𝔹, self-loop: E → 𝔹, self-loop: 𝔾 → 𝔹 are defined as:
self-loop(v) ≡ •v ∩ v• ≠ ∅
self-loop(e) ≡ •e ∩ e• ≠ ∅
self-loop(G) ≡ ∃v∈V: self-loop(v)
A self-loop means that a node itself is among its successors.

Proposition 2.5 ∃e∈E: self-loop(e) ⇒ ∃v∈V: self-loop(v)

Informally, the behavior of a graph is a transformation of data on the input edges of the graph to the output edges. The behavior results from the separate behaviors of the nodes in the graph. Data items flow along the edges, while the nodes operate on this data. Possibly, more than one data item may exist on an edge. A node can execute if (some of) the incoming edges have data. Execution of a node then means to compute new values that are output to (some of) the outgoing edges of the node. From this, it is clear that two structurally independent nodes are also dynamically independent, i.e. the behavior of one node does not influence the behavior of the other.

The concepts of choice and conflict, which are dual to some extent, are formalizations of two special cases in the behavior of a graph. Both lead to a (forward) branching, i.e. a non-determinism, in the graph behavior and have the following meaning:
- Choice denotes that an edge has more than one origin, i.e. dynamically such an edge is 'driven' by more than one node. When more than one origin puts a value on such an edge, the order of the values at the destination is unknown.
- Conflict denotes that more than one edge is connected to an output port of a node, i.e. dynamically only one of these edges is 'driven' by that node. When the common origin of such edges executes, this node puts, in a non-deterministic way, a new value on one of the conflicting edges.

Definition 2.20 choice: E → 𝔹 is defined as (where #A denotes the cardinality of a set A (Notation A.4)):
choice(e) ≡ #orig(e) > 1

Definition 2.21 conflict: V → 𝔹, choice: V → 𝔹, output-choice: V → 𝔹 are defined as:³
conflict(v) ≡ ∃e1, e2∈E, p∈O(v): e1 ≠ e2: (v, p)∈orig(e1)∩orig(e2)
choice(v) ≡ ∃e∈•v: choice(e)
output-choice(v) ≡ ∃e∈v•: choice(e)

Proposition 2.6 ∃e∈E: choice(e) ⇔ ∃v∈V: choice(v)

Definition 2.22 conflict: 𝔾 → 𝔹 is defined as:
conflict(G) ≡ ∃v∈V: conflict(v)

Definition 2.23 self-choice: V → 𝔹 is defined as:
self-choice(v) ≡ ∃e∈E: #(orig(e) ∩ ({v} × P_out)) > 1
Note that an edge cannot be connected twice as an origin to an output port of a node (or as a destination to an input port of a node respectively), since fst(φ(e)) and snd(φ(e)) are sets.

Definition 2.24 dangling: E → 𝔹, dangling: V → 𝔹 are defined as:
dangling(e) ≡ orig(e) = ∅ ∨ dest(e) = ∅
dangling(v) ≡ In(v) ≠ I(v) ∨ Out(v) ≠ O(v)

3. The dual function of conflict is the function defined as ∃e1, e2∈E, p∈I(v): e1 ≠ e2: (v, p)∈dest(e1)∩dest(e2). This may also be viewed as a choice situation.

Dangling means that a node or an edge is not fully connected. The dangling edges are assumed to be, by definition 2.11, the input and output edges of the graph.

Definition 2.25 isolated: V → 𝔹 is defined as:
isolated(v) ≡ ∀e∈•v∪v•: •e∪e• = {v}
An isolated node is not connected to any other node. However, it is not required to be dangling: all its connected edges may be dangling, or self-loops.

Definition 2.26 well-formed: 𝔾 → 𝔹 is defined as:
well-formed(G) ≡ ∀e1, e2∈E: ¬adjacent(e1, e2)
This means that in a well-formed graph, each port is connected to at most one edge. Thus:

Proposition 2.7 well-formed(G) ⇒ ¬conflict(G)

Nevertheless, choice edges are allowed in well-formed graphs.

Definition 2.27 simple: 𝔾 → 𝔹 is defined as:
simple(G) ≡ ∀v∈V: #I(v) = 1 ∧ #O(v) = 1
∧ ∀e∈E: #orig(e) ≤ 1 ∧ #dest(e) ≤ 1
∧ ∀e1, e2∈E: e1 ≠ e2: ¬parallel(e1, e2)
The first condition states that the port concept is not used; the second condition states that no hyper edges exist (only simple edges), whereas the third condition induces that a simple graph can be considered as a pair (V, E) with E ⊆ V × V. Parallel edges are then modeled as weighted edges. Note that a simple graph is not required to be well-formed. A well-formed simple graph would only consist of a chain of nodes.

Example 2.2
The graph of figure 2.2 illustrates many of the above defined concepts. In this graph the edges e1 and e2 are not duplicates, although they are parallel, and therefore also adjacent. If edge e2 had not been connected to node v2, the edges e1 and e2 would have been duplicates.

Figure 2.2 A general graph

The nodes v1, v2 and v5 are adjacent to each other, just as the nodes v2, v3, v6 and v7 are. The edges e1 and e2 cause conflict for node v1. Every node is independent of the nodes v4 and v8, just as v1 and v5 are also independent of the nodes v3, v6 and v7. Edges e2, e3 and e4 are choice edges. Node v4 is both a self-choice node and a self-loop, while the isolated node v8 is dangling. The graph is not well-formed, but the subgraph (see definition 2.41) consisting of the nodes v2, v3, v4, v6, v7 and v8 is.
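The adjectives of this section can also be checked mechanically. A minimal sketch, continuing the Haskell fragments above; the quadratic scan over edge pairs is only meant to mirror the definitions, not to be efficient.

    -- Predicates from Definitions 2.19-2.26, continuing the sketch.
    choiceE :: Graph -> Edge -> Bool
    choiceE g e = Set.size (orig g e) > 1          -- more than one origin

    selfLoopN :: Graph -> Node -> Bool
    selfLoopN g v = overlap (inE g v) (outE g v)   -- bullet-v meets v-bullet

    dangling :: Graph -> Edge -> Bool
    dangling g e = Set.null (orig g e) || Set.null (dest g e)

    wellFormed :: Graph -> Bool                    -- no two edges are adjacent
    wellFormed g = and [ not (adjacent g e1 e2) | e1 <- es, e2 <- es ]
      where es = Set.toList (edges g)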

Notation 2.28 A path in a graph G is a sequence v1, e1, v2, e2, ···, v_{n-1}, e_{n-1}, v_n (n > 1), where v_i∈V, e_i∈E and ∀i: 1 ≤ i ≤ n-1: e_i∈v_i• ∧ e_i∈•v_{i+1}.

Notation 2.29 Path denotes the class of paths.

Definition 2.30 length: Path → ℕ is defined as:
length(v1, e1, v2, e2, ···, v_{n-1}, e_{n-1}, v_n) = n - 1

Notation 2.31 Given a path π∈Path, |π| = length(π).

Definition 2.32 cycle: Path → 𝔹 is defined as:
cycle(v1, e1, v2, e2, ···, v_{n-1}, e_{n-1}, v_n) ≡ n > 1 ∧ v1 = v_n

The clause n > 1 is superfluous, since by notation 2.28 the sequence v1 is not a path.

Definition 2.33 cyclic: 𝔾 → 𝔹 is defined as (where R⁺ denotes the transitive closure of a relation R (Definition A.13)):
cyclic(G) ≡ ∃v∈V: v∈succ⁺(v)

Notation 2.34⁴ acyclic(G) ≡ ¬cyclic(G)

4. Since this case is not fully covered by notation A.3.

Notation 2.35 reach: V → P(V) is defined as: reach(v) = succ⁺(v)
Thus for acyclic graphs ∀v∈V: v∉reach(v).

Definition 2.36 reachable: V × V → 𝔹 is defined as:
reachable(v, w) ≡ w∈reach(v)

Definition 2.37 reach: P(V) → P(V) is defined as:
reach(W) = ⋃_{w∈W} reach(w)

Definition 2.38 reach: 𝔾 → P(V) is defined as:
reach(G) = ⋃_{e∈I(G)} (reach(e•) ∪ e•)
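Since reach is a transitive closure, it can be computed by iterating the successor step to a fixpoint, as in the following continuation of the Haskell sketch.

    -- succ, pred, reach and cyclic (Definitions 2.18, 2.33-2.36), continuing
    -- the sketch; reach(v) = succ+(v) via a Kleene-style iteration.
    succN, predN :: Graph -> Node -> Set Node
    succN g v = Set.unions [ toN g e   | e <- Set.toList (outE g v) ]
    predN g v = Set.unions [ fromN g e | e <- Set.toList (inE g v) ]

    reach :: Graph -> Node -> Set Node
    reach g v = go (succN g v)
      where go r | r' == r   = r                 -- fixpoint reached
                 | otherwise = go r'
              where r' = Set.unions (r : [ succN g w | w <- Set.toList r ])

    cyclic :: Graph -> Bool
    cyclic g = any (\v -> v `Set.member` reach g v) (Set.toList (nodes g))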

Definition 2.39 connected: 𝔾 → 𝔹, strongly-connected: 𝔾 → 𝔹 are defined as:
connected(G) ≡ ¬∃W ⊂ V: W ≠ ∅: ⋃_{w∈W} (succ(w) ∪ pred(w)) ⊆ W
strongly-connected(G) ≡ ∀v, w∈V: v ≠ w: reachable(v, w) ∧ reachable(w, v)

Proposition 2.8 strongly-connected(G) ⇒ cyclic(G)

Definition 2.40 subgraph: 𝔾 × 𝔾 → 𝔹 is defined as:
subgraph(G1, G2) ≡ V1 ⊆ V2 ∧ P_in1 ⊆ P_in2 ∧ P_out1 ⊆ P_out2 ∧ E1 ⊆ E2
∧ ∀v∈V1: I1(v) ⊆ I2(v) ∧ O1(v) ⊆ O2(v)
∧ ∀e∈E1: φ1(e) ⊑ φ2(e)
⊑ is defined in general for any partial order ≤ as the subset relation respecting the componentwise partial order ≤ (see section A.2):
(a1, b1) ⊑ (a2, b2) ≡ a1 ≤ a2 ∧ b1 ≤ b2

Definition 2.41 subgraph: 𝔾 × P(V) → 𝔾, i.e. the subgraph H of graph G induced by a set of nodes V_H ⊆ V, is defined as:
H = subgraph(G, V_H) ≡ (V_H, P_in, P_out, E_H, I|V_H, O|V_H, φ_H)
where E_H = {e∈E | •e∩V_H ≠ ∅ ∨ e•∩V_H ≠ ∅} and
φ_H(e) = (fst(φ(e)) ∩ (V_H × P_out), snd(φ(e)) ∩ (V_H × P_in)).

Definition 2.42 subgraph: 𝔾 × P(E) → 𝔾, i.e. the subgraph H of graph G induced by a set of edges E_H ⊆ E, is defined as:
H = subgraph(G, E_H) ≡ (V_H, P_in, P_out, E_H, I|V_H, O|V_H, φ_H)
where V_H = ⋃_{e∈E_H} (•e ∪ e•) and φ_H: E_H → P(V_H × P_out) × P(V_H × P_in) is given by: φ_H(e) = φ(e).

Definition 2.43 strong-component: 𝔾 × P(V) → 𝔹 is defined as:
strong-component(G, W) ≡ strongly-connected(subgraph(G, W)) ∧ ¬∃W': W ⊂ W': strongly-connected(subgraph(G, W'))

Definition 2.44 A function f: 𝔾 → 𝔾 is a graph homomorphism of a graph G1 to a graph G2 iff f = (f_V, f_Pin, f_Pout, f_E), where
f_V: V1 → V2
f_Pin: P_in1 → P_in2
f_Pout: P_out1 → P_out2
f_E: E1 → E2
(v2, p2)∈fst(φ2(e2)) if ∃e1∈E1, v1∈V1, p1∈P_out1: e2 = f_E(e1) ∧ v2 = f_V(v1) ∧ p2 = f_Pout(p1) ∧ (v1, p1)∈fst(φ1(e1))
(v2, p2)∈snd(φ2(e2)) if ∃e1∈E1, v1∈V1, p1∈P_in1: e2 = f_E(e1) ∧ v2 = f_V(v1) ∧ p2 = f_Pin(p1) ∧ (v1, p1)∈snd(φ1(e1))

2.1.2 Generalized data flow graphs

In this section, a special type of graphs called data flow graphs is defined, which is a generalization of the type of flow graphs found in the literature, e.g. [5, 37, 38, 39, 123, 132], since graphs with hyper edges instead of simple edges are considered. Data flow graphs may be used to model the data dependencies of a program written in a (programming) language. Nodes in a data flow graph represent the operators of the program, while data flows along the edges as output of one node to the input of another node, i.e. edges represent values. The behavior of a node is that it generates a new value on its outputs depending on the input values. Instead of a single value on an edge, also a stream of values can be present. To link nodes with statements or operators in a program, nodes have two additional attributes: a type and a function.

Notation 2.45 For a node v of a data flow graph, type(v) denotes its type, e.g. operator, branch or merge; function(v) denotes its function. function(v) is a function Tup_n(D) → Tup_m(D), where D is the value domain (Section A.2) of the data flow graph, Tup(D) is the domain of tuples (Section A.2.1) and v is a node with n input ports I(v) = {p_in1, ···, p_inn} and m output ports O(v) = {p_out1, ···, p_outm}. function(v) is a (mathematical) function and defines the input-output behavior of v. Therefore a bijection exists between the input ports p_ini of v and the formal parameters of function(v). Similarly, a bijection exists between the output ports p_outi of v and the outputs of function(v). A function may have multiple outputs; an example is a div-mod function that computes both the div and the mod of two values. In terms of hardware modules, an example is an ALU that outputs data, but that also sets some status flags.

Notation 2.46 A node v with I(v) = {p_in1, ···, p_inn} and O(v) = {p_out1, ···, p_outm} is "strict in its inputs", or equivalently "has strict input ports", iff strict(function(v)) (see definition A.46). The strict inputs, or strict input ports, of a node v are those input ports p_ini for which strict(function(v), i).

Notation 2.47 A node v with I(v) = {p_in1, ···, p_inn} and O(v) = {p_out1, ···, p_outm} is "strict in its outputs", or equivalently "has strict output ports", iff
(∀i: strict(function(v), i): a_i ≠ ⊥) ⇒ (∀j: 1 ≤ j ≤ m: o_j ≠ ⊥)
where function(v)(a1, ···, a_n) = (o1, ···, o_m) and ⊥ denotes the undefined value (Notation A.27 and definition A.44). The strict outputs, or strict output ports, of a node v are those output ports p_outj for which
(∀i: strict(function(v), i): a_i ≠ ⊥) ⇒ o_j ≠ ⊥
where function(v)(a1, ···, a_n) = (o1, ···, o_m). Thus the strict outputs are the outputs that have a (defined) value if a (defined) value is applied to the strict inputs.

Definition 2.48 selective: V → 𝔹, disjunctive: V → 𝔹 are defined as:⁵ [73, 74]
selective(v) ≡ (∀i: strict(function(v), i): a_i ≠ ⊥) ⇒ (∃!j: 1 ≤ j ≤ m: o_j ≠ ⊥)
disjunctive(v) ≡ (∀i: strict(function(v), i): a_i ≠ ⊥) ∧ (∃k: ¬strict(function(v), k): a_k ≠ ⊥) ⇒ (∀j: 1 ≤ j ≤ m: o_j ≠ ⊥)
where I(v) = {p_in1, ···, p_inn}, O(v) = {p_out1, ···, p_outm} and function(v)(a1, ···, a_n) = (o1, ···, o_m).
Thus a node is selective if it defines only one output; a node is disjunctive if it needs only one input (of the non-strict ones) to have a 'defined' output, i.e. a value ≠ ⊥. Normally, the values on the strict inputs of a disjunctive node determine which non-strict input port must have a value ≠ ⊥; similarly for a selective node.

5. ∃! denotes "there exists exactly one".

Definition 2.49 A generalized data flow graph is a well-formed graph G with ∀v∈V: type(v)∈{operator, branch, merge, IO} where
- operator nodes are strict in all their inputs as well as all their outputs.
- a branch node is a selective node with two strict input ports {p_d, p_c} and n (n > 1) non-strict output ports {p_1, ···, p_n}.
- a merge node is a disjunctive node with one strict output port p_d, one strict input port p_c and n (n > 1) non-strict input ports {p_1, ···, p_n}.
- IO nodes appear in two types: get nodes and put nodes, which are discussed at the end of section 1.1. In principle, IO nodes are operator nodes.

From this definition, operator nodes are nodes v whose input-output behavior is defined by function(v). The type of the function is not restricted, as long as it is strict. Simple arithmetical functions like +, -, and relational functions like =, ≠, ≤ may exist, but also more complex functions are allowed, as well as constant functions.⁶ The behavior of a graph is identical to the behavior of a single node, namely a mapping from values on the input edges to values on the output edges. Therefore, the function of an operator node may also be defined by means of another data flow graph, i.e. allowing hierarchy. For branch nodes, data from the incoming edges is transferred to one outgoing edge dependent on the value on the control input; whereas merge nodes transfer data from one incoming data-edge to the outgoing edge. As branch and merge nodes are non-strict input and output nodes, they may be used to model data-dependent control statements. The (conditional) branching and merging of data may yield data-dependent graph behavior. Note that IO nodes are not the same as the input and output nodes of a graph (Definition 2.12). The latter nodes are the structural inputs and outputs of the graph, which are used to start up the execution of the graph, while the IO nodes get and put model the external communication with the environment.

Notation 2.50 A pure data flow graph is a choice-free data flow graph. I.e. all edges in a pure data flow graph have a single (or no) origin, and thus each value is defined uniquely. This notation is used when ordinary flow graphs (those that are found in literature, e.g. [5, 37, 38, 39, 123, 132]) are addressed.

6. Since, by definition A.46, constant functions are strict by definition.
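For intuition, the input-output behavior of branch and merge nodes in the two-way (if-statement) case may be sketched as follows, with Maybe standing in for the undefined value ⊥ on the non-strict ports; the function names are hypothetical.

    -- Sketch of the node functions of Definition 2.49 for n = 2.
    -- branch: strict in control and data; selective, so exactly one of
    -- the non-strict outputs (p_true, p_false) carries a defined value.
    branchFn :: Bool -> d -> (Maybe d, Maybe d)
    branchFn c d = if c then (Just d, Nothing) else (Nothing, Just d)

    -- merge: strict only in the control input; disjunctive, so a defined
    -- value on the selected non-strict input suffices for output p_d.
    mergeFn :: Bool -> Maybe d -> Maybe d -> Maybe d
    mergeFn c t f = if c then t else f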

2.2 Data flow graphs as an algebra

In this section, a 'syntax' of data flow graphs is defined that is different from the conventional viewpoint of a graph as a tuple (V, P_in, P_out, E, I, O, φ) as used in section 2.1. It is a more compositional syntax; it is used in section 2.4 and in appendix B and is also referred to in section 3.8.6.

Definition 2.51 elementary: 𝔾 → 𝔹 is defined as:
elementary(G) ≡ #V = 1 ∧ ∀v∈V: ¬dangling(v)
Note that an elementary graph is not necessarily well-formed. Also self-loops are allowed in such an elementary graph.

The following operators are defined on graphs:
1. parallel composition
2. deletion of edges
3. renaming and/or combining edges
These operations are similar to those that are found in process algebras like ACP [14], CSP [60] and CCS [104], where these types of operators are defined on so-called (process) agents. But here it is on a structural level as in [103].

Definition 2.52 composition: 𝔾 × 𝔾 → 𝔾 is defined as:
composition(G1, G2) = (V1∪V2, P_in1∪P_in2, P_out1∪P_out2, E1∪E2, I1∪I2, O1∪O2, φ') with V1∩V2 = ∅, where
φ'(e) = φ1(e) if e∈E1\E2
φ'(e) = φ2(e) if e∈E2\E1
φ'(e) = (fst(φ1(e))∪fst(φ2(e)), snd(φ1(e))∪snd(φ2(e))) if e∈E1∩E2

Notation 2.53 G1 || G2 = composition(G1, G2)

Proposition 2.9 G1 || G2 = G2 || G1 and G1 || (G2 || G3) = (G1 || G2) || G3

Definition 2.54 restriction: 𝔾 × E → 𝔾 is defined as (where | denotes function restriction (Notation A.10)):
restriction(G, e) = (V, P_in, P_out, E\{e}, I, O, φ|(E\{e}))

Notation 2.55 G\e = restriction(G, e)

Proposition 2.10 e∉E ⇒ G\e = G, G\e1\e2 = G\e2\e1 and (G1 || G2)\e = (G1\e) || (G2\e)
The following notation is therefore well-defined.

Notation 2.56 G\E = restriction(··· (restriction(G, e1) ···), e_n) where E = {e1, ···, e_n}.
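A sketch of composition and restriction over the Haskell Graph type introduced earlier; composition assumes disjoint node sets (as Definition 2.52 requires) and unites the connections of an edge occurring in both operands.

    -- Parallel composition (Definition 2.52) and restriction (2.54).
    compose :: Graph -> Graph -> Graph     -- G1 || G2
    compose g1 g2 = Graph
      { nodes = nodes g1 `Set.union` nodes g2
      , pIn   = pIn g1 `Set.union` pIn g2
      , pOut  = pOut g1 `Set.union` pOut g2
      , edges = edges g1 `Set.union` edges g2
      , iOf   = pick iOf
      , oOf   = pick oOf
      , conn  = \e -> let (o1, d1) = connIn g1 e
                          (o2, d2) = connIn g2 e
                      in (o1 `Set.union` o2, d1 `Set.union` d2)
      }
      where
        -- node sets are disjoint, so each node belongs to exactly one operand
        pick f v = if v `Set.member` nodes g1 then f g1 v else f g2 v
        -- an edge absent from an operand contributes no connections
        connIn g e = if e `Set.member` edges g then conn g e
                     else (Set.empty, Set.empty)

    restrict :: Graph -> Edge -> Graph     -- G \ e: delete the edge
    restrict g e = g { edges = Set.delete e (edges g) }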

Definition 2.57 relabeling: 𝔾 × E × E → 𝔾 is defined as:
relabeling(G, e1, e2) = (V, P_in, P_out, E', I, O, φ') where
E' = E if e1∉E, and E' = (E\{e1}) ∪ {e2} if e1∈E
φ'(e) = φ(e) if e ≠ e1 ∧ e ≠ e2
φ'(e2) = φ(e1) if e1∈E ∧ e2∉E
φ'(e2) = (fst(φ(e2))∪fst(φ(e1)), snd(φ(e2))∪snd(φ(e1))) if e1, e2∈E {aliasing}
I.e. e1 is replaced by e2, and possibly combined with e2.

Proposition 2.11 For E' as defined in definition 2.57,
|E'| = |E| if e1∉E ∨ (e1∈E ∧ e2∉E)
|E'| = |E| - 1 if e1∈E ∧ e2∈E

Notation 2.58 G[e1 → e2] = relabeling(G, e1, e2)

Proposition 2.12 f1 ≠ e2 ∧ f2 ≠ e1 ⇒ G[e1 → f1][e2 → f2] = G[e2 → f2][e1 → f1]
The following notations are therefore well-defined.

Notation 2.59 With ¬∃i, j: f_i = e_j, and R a relation {(e1, f1), ···, (e_n, f_n)},
G[e1 → f1, ···, e_n → f_n] = relabeling(··· (relabeling(G, e1, f1) ···), e_n, f_n)
G[R] = G[e1 → f1, ···, e_n → f_n]

Proposition 2.13
G[R][S] = G[S∘R]
G[R]\e = G\{e_i | R(e_i) = e}[R]
(G1 || G2)[e1 → e2] = (G1[e1 → e2]) || (G2[e1 → e2])

Theorem 2.14 Well-formedness of graphs is closed under the graph composition operators ||, \E and [R].

Proof: The proof is by case analysis on the operator:
- Composition
Let G' = G1 || G2. Given, by the well-formedness assumption,
∀e1, e2∈E1: e1 ≠ e2: fst(φ1(e1))∩fst(φ1(e2)) = ∅ ∧ snd(φ1(e1))∩snd(φ1(e2)) = ∅   (2.1)
and similarly ∀e1, e2∈E2: e1 ≠ e2: fst(φ2(e1))∩fst(φ2(e2)) = ∅ ∧ snd(φ2(e1))∩snd(φ2(e2)) = ∅. Also V1∩V2 = ∅. By symmetry, only the following cases need to be considered:
- Case e1, e2∈E1\E2:
fst(φ'(e1))∩fst(φ'(e2)) = fst(φ1(e1))∩fst(φ1(e2)) = ∅
snd(φ'(e1))∩snd(φ'(e2)) = snd(φ1(e1))∩snd(φ1(e2)) = ∅
- Case e1∈E1\E2, e2∈E1∩E2:
fst(φ'(e1))∩fst(φ'(e2)) = fst(φ1(e1))∩(fst(φ1(e2))∪fst(φ2(e2))) = (fst(φ1(e1))∩fst(φ1(e2)))∪(fst(φ1(e1))∩fst(φ2(e2))) = ∅
(the mixed intersection is empty since V1∩V2 = ∅). Similarly, snd(φ'(e1))∩snd(φ'(e2)) = ∅.
- Case e1, e2∈E1∩E2:
fst(φ'(e1))∩fst(φ'(e2)) = (fst(φ1(e1))∪fst(φ2(e1)))∩(fst(φ1(e2))∪fst(φ2(e2))) = (fst(φ1(e1))∩fst(φ1(e2)))∪(fst(φ1(e1))∩fst(φ2(e2)))∪(fst(φ2(e1))∩fst(φ1(e2)))∪(fst(φ2(e1))∩fst(φ2(e2))) = ∅
Similarly, snd(φ'(e1))∩snd(φ'(e2)) = ∅.
- Case e1∈E1\E2, e2∈E2\E1:
fst(φ'(e1))∩fst(φ'(e2)) = fst(φ1(e1))∩fst(φ2(e2)) = ∅
Similarly, snd(φ1(e1))∩snd(φ2(e2)) = ∅.
- Restriction
Trivial.
- Relabeling
Let G' = G[e1 → e2].
- if e1∉E, then G[e1 → e2] = G is well-formed.
- if e1∈E ∧ e2∉E, then G[e1 → e2] is well-formed, since only edge e1 is renamed to e2.
- if e1∈E ∧ e2∈E, then {aliasing}
∀e∈E'\{e2}: φ'(e) = φ(e)
⇒ {by equation (2.1)}
∀e1', e2'∈E'\{e2}: e1' ≠ e2': fst(φ'(e1'))∩fst(φ'(e2')) = fst(φ(e1'))∩fst(φ(e2')) = ∅
Similarly, ∀e1', e2'∈E'\{e2}: e1' ≠ e2': snd(φ'(e1'))∩snd(φ'(e2')) = ∅.
Also ∀e∈E': e ≠ e2:
fst(φ'(e))∩fst(φ'(e2)) = fst(φ(e))∩(fst(φ(e2))∪fst(φ(e1))) = (fst(φ(e))∩fst(φ(e2)))∪(fst(φ(e))∩fst(φ(e1))) = ∅
Similarly, ∀e∈E': e ≠ e2: snd(φ'(e))∩snd(φ'(e2)) = ∅.
Since well-formedness is a symmetric relation, this is sufficient to prove that well-formedness is closed under the relabeling operation. □

Definition 2.60 A flow graph expression is an expression generated by the well-formed elementary graphs and the graph composition operators ||, [R] and \E.

Theorem 2.15 Each (finite) well-formed graph can be represented by a flow graph expression.

Proof: When for a graph G a construction of a flow graph expression g can be given, the theorem is proved. Informally such a construction is:
1. Each node v∈V is an elementary graph with edges, having unique edge names, connected to all ports.
2. Relabel all the edges that are connected to each other in the graph G with an identical name.
Formally this construction can be described as:
1. ∀v_i∈V construct the elementary graphs G_i = ({v_i}, I(v_i), O(v_i), E_i, φ_i), so that a bijection η: E_i → I(v_i)∪O(v_i), or isomorphic relation, exists, and
φ_i(e) = ({(v_i, η(e))}, ∅) if η(e)∈O(v_i)
φ_i(e) = (∅, {(v_i, η(e))}) if η(e)∈I(v_i)
It is clear that all these elementary graphs G_i are well-formed.
2. Define ∀v_i∈V the relabeling R_i: E_i → E with R_i(e_i) = e, when (if e exists, i.e. v_i is connected)
fst(φ(e))∩fst(φ_i(e_i)) ≠ ∅ ∨ snd(φ(e))∩snd(φ_i(e_i)) ≠ ∅
Note that such an e is unique, since G is well-formed (hypothesis).
Define G' = (G1[R1] || ··· || G_n[R_n]). Now it is clear that G = G'. □

Corollary A (finite) graph can be represented by a flow graph expression iff it is well-formed.

Proof:
- Case ⇒: Theorem 2.14
- Case ⇐: Theorem 2.15 □

2.3 Operational semantics of flow graphs

In the previous sections, structural properties of data flow graphs are described. In this section and the following one, dynamic interpretations of flow graphs are given. As mentioned in the introduction, behavior is related to the concept of a state, which is a mapping from node-(input)port pairs to (streams of) values. The domain of values are the booleans, integers, etc.

Definition 2.61 State = V × P_in → Str(D) with D the flat domain (Definition A.48) D = 𝔹 + ℕ + ℝ + ··· and Str(D) the domain of streams (Section A.2.2).

With this notion of a state, the behavior of a graph is a relation from an (input) state to an (output) state. In this section, an operational semantics for flow graphs is defined. Such a semantics states something about how a program can be computed, i.e. it defines an interpreter. A (non-deterministic) simulator for flow graphs must also be based on this semantics. The operational semantics for flow graphs is defined by means of a transitional semantics, which also induces an execution tree. It is based on an update rule of how the current state s(w, p) changes to s'(w, p) by executing a node v. When the flow graph is viewed uninterpreted, i.e. the values of the data items on the edges are of no importance, but only the existence of values is important, the operational semantics defined here is like the one for ordinary Petri nets [110], and is then similar to the approach in [73, 74]. The transitional semantics is defined with the help of inference rules.

Notation 2.62 s →v s' denotes the transition of state s to state s' by the execution of node v.

Notation 2.63 Inference rules for a transitional semantics have the form

    transition_1 ··· transition_n
    ------------------------------ condition
             transition

meaning that when all the transitions transition_i can be deduced and also the side condition condition is valid, the transition transition may be inferred.

The state s' for a transition s →v s' is an update of state s. The idea of this update is that the heads of the streams on the (strict) input ports (and possibly some of the non-strict input ports) of node v are used for the computation of the new values, which are put at the end of the streams at the (strict) output ports (and possibly some non-strict output ports) of node v. All the other streams remain identical.

6. Note that the domain D is not defined as the recursive domain D = 𝔹 + ℕ + ··· + D → D. I.e. D → D is not included, because this is not of interest for hardware synthesis as the type of application (Section 3.2).

2.3.1 Transitional semantics for choice-free graphs

First only (well-formed) flow graphs without choice edges are considered. When executing a node v, the following cases can be distinguished for the updating of the state s(w, p):
1. The node-port pair (w, p) is not the destination of an outgoing edge of v, and also v ≠ w. This includes the case that w is not a successor node of v, but also the case that w is a successor node of v, but port p is not connected to v, as for example the pair (w, p2) in figure 2.3.

Figure 2.3 Example for operational semantics

2. The node-port pair (w, p) is the destination of an outgoing edge of v (and v ≠ w). We can distinguish two subcases:
a. (w, p) is connected to an output port of v onto which the current execution of v outputs a value.
b. (w, p) is connected to an output port of v onto which the current execution of v outputs nothing.
3. v = w and the value on the node-input port pair (w, p) is used for the execution of v. Then the following subcases exist:
a. (w, p) is connected to an output port of v, i.e. this is the case of a self-loop. Here again, two subcases exist:
1. (w, p) is connected to an output port of v onto which the current execution of v outputs a value.
2. (w, p) is connected to an output port of v onto which the current execution of v outputs nothing.
b. (w, p) is not connected to an output of v.
4. v = w and the value on the node-input port pair (w, p) is not used for the execution of v. Then the following subcases can again be distinguished:
a. (w, p) is connected to an output port of v, i.e. in case of a self-loop.
1. (w, p) is connected to an output port of v onto which the current execution of v outputs a value.
2. (w, p) is connected to an output port of v onto which the current execution of v outputs nothing.
b. (w, p) is not connected to an output of v.

Operator nodes have only strict input and strict output ports, so for operator nodes only cases 1, 2a, 3a1 and 3b apply. With the help of figure 2.4 the different cases can be illustrated. The destination of edge e3 is an example of case 2a; the destination of edge e1 is an example of case 3b; while the destination of edge e2 is an example of case 3a1.

Figure 2.4 Operator node

Branch nodes have only strict input ports, so for branch nodes only cases 1, 2a, 2b, 3a1, 3a2 and 3b apply. Similarly for merge nodes, which have only strict output ports (and one strict input port), only cases 1, 2a, 3a1, 3b, 4a1 and 4b apply. The different cases for the branch node are illustrated in figure 2.5. The destinations of edges e3 respectively e4 in figure 2.5a are examples of case 2a respectively case 2b if the value on the control port of node v_i is true; the other way around if this value is false. The destination of edge e1 (figure 2.5b), if the value on the control port of node v_i is false, is an example of case 3a2; if the value on the control port of node v_i is true, then it is an example of case 3a1. The destination of edge e3 (figure 2.5c), if the value on the control port of node v_i is true, is an example of case 3a2; if the value on the control port of node v_i is false, then it is an example of case 3a1. The destination of edge e1 (figure 2.5a and figure 2.5c) is an example of case 3b.


Figure 2.5 Branch node

The different cases for the merge node are illustrated in figure 2.6. The destination of edge e4 in figure 2.6a is an example of case 2a. The destination of edge e2 (figure 2.6b), if the value on the control port of node v_i is false, and the destination of edge e3 (figure 2.6b) are examples of case 3b. The destination of edge e1 (figure 2.6b), if the value on the control port of node v_i is true, and the destination of edge e3 (figure 2.6c) are examples of case 3a1. The destination of edge e2 (figure 2.6b), if the value on the control port of node v_i is true, and the destinations of edges e1 respectively e2 (figure 2.6c), if the value on the control port of node v_i is false respectively true, are examples of case 4b. Finally, the destination of edge e1 (figure 2.6b), if the value on the control port of node v_i is false, is an example of case 4a1.


Figure 2.6 Merge node

A state s(w, p) is then updated to s'(w, p) by the execution of a node v as follows for the different cases, using the functions on streams as defined in definition A.60:
1. s'(w, p) = s(w, p)
2. a. s'(w, p) = rcons(new value, s(w, p))
   b. s'(w, p) = s(w, p)
3. a. 1. s'(w, p) = rcons(new value, tail(s(w, p)))
      2. s'(w, p) = tail(s(w, p))
   b. s'(w, p) = tail(s(w, p))
4. a. 1. s'(w, p) = rcons(new value, s(w, p))
      2. s'(w, p) = s(w, p)
   b. s'(w, p) = s(w, p)
where new value is a function of the values on the heads of the streams at the strict input ports (and possibly some of the non-strict input ports) of v and is determined by the type and function of the node.
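For an operator node, the applicable cases (1, 2a, 3a1 and 3b) can be condensed into a single update function. The following Haskell sketch assumes finite streams modeled as lists; the parameters consumed and produced are hypothetical stand-ins for the graph structure, listing v's input node-port pairs and the destination pairs fed by v with their newly computed values.

    type NPPair  = (String, String)   -- a node-(input)port pair (w, p)
    type State d = NPPair -> [d]      -- streams as lists

    updateOp :: [NPPair] -> [(NPPair, d)] -> State d -> State d
    updateOp consumed produced s (w, p) =
      let base = if (w, p) `elem` consumed
                   then tail (s (w, p))   -- case 3: the head is used up
                   else s (w, p)          -- case 1: stream untouched
      in case lookup (w, p) produced of
           Just d  -> base ++ [d]         -- cases 2a/3a1: rcons the new value
           Nothing -> base                -- cases 1/3b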

This then all leads to the following rules for the transitional semantics for each of the node types:
- Case type(v) = operator. Let I(v) = {i1, ···, i_n}, O(v) = {o1, ···, o_m} and
enabled-op(v, s) ≡ ∀p∈I(v): head(s(v, p)) ≠ ⊥
The transitional semantics for an operator node is

    ---------- enabled-op(v, s)
     s →v s'

where s' = update-op(s, v) is defined in figure 2.7 (where ↓ is a tuple disassembly function (Definition A.56)) with new = function(v)(···, head(s(v, i_j)), ···).
- Case type(v) = branch. Let I(v) = {p_c, p_d} and O(v) = {p_true, p_false} if v is an if statement, O(v) = {p_1, ···, p_n} if v is an n-switch, and
enabled-branch(v, s) ≡ enabled-op(v, s)
The transitional semantics for a branch node is

    ---------- enabled-branch(v, s)
     s →v s'

s'(w, p) =
- s(w, p) if v ≠ w ∧ ¬∃e∈E, k∈{1, ···, m}: (v, o_k)∈fst(φ(e)) ∧ (w, p)∈snd(φ(e))
- tail(s(w, p)) if v = w ∧ ¬∃e∈E, k∈{1, ···, m}: (v, o_k)∈fst(φ(e)) ∧ (w, p)∈snd(φ(e))
- rcons(new↓k, s(w, p)) if v ≠ w ∧ ∃e∈E, k∈{1, ···, m} such that (v, o_k)∈fst(φ(e)) ∧ (w, p)∈snd(φ(e))
- rcons(new↓k, tail(s(w, p))) if v = w ∧ ∃e∈E, k∈{1, ···, m} such that (v, o_k)∈fst(φ(e)) ∧ (w, p)∈snd(φ(e))

Figure 2.7 s' = update-op(s, v)

where s' = update-branch(s, v) is defined in figure 2.8 with new = head(s(v, p_d)).

Note the use of the shorthand notation p_val for p_head(s(v, p_c)), which reduces to p_true or p_false for an if statement, and to p_i for an n-switch. This notation is used to represent the different cases for output ports together.
- Case type(v) = merge. Let I(v) = {p_c, p_true, p_false} if v is an if statement, I(v) = {p_c, p_1, ···, p_n} if v is an n-switch, and O(v) = {p_d}, with
enabled-merge(v, s) ≡ head(s(v, p_c)) ≠ ⊥ ∧ head(s(v, p_head(s(v, p_c)))) ≠ ⊥
The transitional semantics for a merge node is

    ---------- enabled-merge(v, s)
     s →v s'

where s' = update-merge(s, v) is defined in figure 2.9 with new = head(s(v, p_head(s(v, p_c)))).
- Case type(v) = IO: same as operator nodes.

s'(w, p) =
- s(w, p) if (v ≠ w ∧ ¬∃e∈E: (v, p_val)∈fst(φ(e)) ∧ (w, p)∈snd(φ(e))) ∨ (v ≠ w ∧ ∃e∈E: (v, p_val)∈fst(φ(e)) ∧ (w, p)∈snd(φ(e)) ∧ val ≠ head(s(v, p_c)))
- tail(s(w, p)) if (v = w ∧ ∃e∈E: (v, p_val)∈fst(φ(e)) ∧ (w, p)∈snd(φ(e)) ∧ val ≠ head(s(v, p_c))) ∨ (v = w ∧ ¬∃e∈E: (v, p_val)∈fst(φ(e)) ∧ (w, p)∈snd(φ(e)))
- rcons(new, s(w, p)) if v ≠ w ∧ ∃e∈E: (v, p_val)∈fst(φ(e)) ∧ (w, p)∈snd(φ(e)) ∧ val = head(s(v, p_c))
- rcons(new, tail(s(w, p))) if v = w ∧ ∃e∈E: (v, p_val)∈fst(φ(e)) ∧ (w, p)∈snd(φ(e)) ∧ val = head(s(v, p_c))

Figure 2.8 s' = update-branch(s, v)

s'(w, p) =
- s(w, p) if (v ≠ w ∧ ¬∃e∈E: (v, p_d)∈fst(φ(e)) ∧ (w, p)∈snd(φ(e))) ∨ (v = w ∧ p ≠ p_head(s(v, p_c)) ∧ p ≠ p_c ∧ ¬∃e∈E: (v, p_d)∈fst(φ(e)) ∧ (w, p)∈snd(φ(e)))
- tail(s(w, p)) if v = w ∧ (p = p_head(s(v, p_c)) ∨ p = p_c) ∧ ¬∃e∈E: (v, p_d)∈fst(φ(e)) ∧ (w, p)∈snd(φ(e))
- rcons(new, s(w, p)) if (v ≠ w ∧ ∃e∈E: (v, p_d)∈fst(φ(e)) ∧ (w, p)∈snd(φ(e))) ∨ (v = w ∧ p ≠ p_head(s(v, p_c)) ∧ p ≠ p_c ∧ ∃e∈E: (v, p_d)∈fst(φ(e)) ∧ (w, p)∈snd(φ(e)))
- rcons(new, tail(s(w, p))) if v = w ∧ (p = p_head(s(v, p_c)) ∨ p = p_c) ∧ ∃e∈E: (v, p_d)∈fst(φ(e)) ∧ (w, p)∈snd(φ(e))

Figure 2.9 s' = update-merge(s, v)

To take care of sequence edges, add in all the above given definitions the input sequence edges to the set of strict inputs, and the output sequence edge(s) to the set of strict output edges. function(v) is then extended to output, for example, a special value (⊕) for the ports onto which the (output) sequence edges are connected. Because only well-formed graphs are considered, the real values on the input ports onto which sequence edges are connected have no influence on the resulting outputs, except that a value must exist, and are therefore ignored.

Note that the above defined semantics presumes a well-formed graph G, i.e. a graph without conflict. Nevertheless, it can be updated easily for non-well-formed graphs G. For a conflictive node, more than one edge is connected to an output port. To model this as a conflict, a new value is put on only one of the output edges. In the case of more than one edge connected to an input port, only the head of a stream on one of these edges is taken to execute the node. This can be modeled in a similar way as described in the following section.

2.3.2 Transitional semantics for general graphs

Now the case of generalized well-formed data flow graphs with choice edges is considered. For a choice edge, the values on an edge, or better the values entering a node-port pair, may originate from more than one source. If it is wanted to maintain the stream of values that is generated by a node, but allowing for any order between the streams generated by the different origins of a choice edge, the notion of state must be extended.

Definition 2.64 State' = V × P_in → Str(D') with D' = V × P_out × D.

Thus a value is labeled with the node-(output)port pair that defines this value. With this notion of a state, it is possible to distinguish between the values in a stream that come from the different origins of a choice edge. Before giving the transitional semantics, the function shufflercons must be defined, as a replacement for the rcons function. Instead of putting a new value just behind the current stream when executing a node, the new value will be put somewhere in the end of the stream.

Definition 2.65 With Set(D) the domain of sets (Section A.2.3),
val: D' → D
shufflercons: (V × P_out × D) × Str(D') → Set(Str(D'))
are defined as:⁷
val = λd'. d'↓3
shufflercons = λ((v_i, p_k, d), s). ps
where (with ≤ the partial order on streams (Definition A.58))⁸
ps = {s'∈Str(D') | s ≤ s' ∧ ∀(v, p)∈V × P_out: s'⇂(v, p) = rcons((v, p, d), s⇂(v, p)) if (v, p) = (v_i, p_k), and s'⇂(v, p) = s⇂(v, p) otherwise}

Thus shufflercons(v, p, d, s) extends the stream s by insertion of (v, p, d) at any place after the last element corresponding to (v, p).

7. Notations from λ-calculus [13] are used. Other references to the λ-calculus that may be faster to grasp can be found in any text book on functional languages like [48], and/or denotational semantics [115, 124]. An overview of the syntax and an informal interpretation are given in the appendix, too (Section A.3).
8. ⇂ is the projection operator on streams and sequences.
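On finite streams modeled as lists, shufflercons can be sketched as follows; applied to the stream of Example 2.3 below, it yields exactly the two streams s1 and s2.

    -- shufflercons (Definition 2.65): insert the labelled value at every
    -- position after the last element carrying the same (node, port) label,
    -- yielding the set (list) of possible extended streams.
    shufflercons :: (Eq n, Eq p) => (n, p, d) -> [(n, p, d)] -> [[(n, p, d)]]
    shufflercons x@(v, q, _) s =
      [ take i s ++ [x] ++ drop i s | i <- [lastIdx .. length s] ]
      where
        -- index just past the last element labelled (v, q); 0 if none
        lastIdx = length s - length (takeWhile other (reverse s))
        other (v', q', _) = (v', q') /= (v, q)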

Example 2.3
Let s = ([v, p, a], [w, q, b], [v, p, c], [w, q, d]). Then shufflercons(v, p, e, s) = {s1, s2}, with s1 = ([v, p, a], [w, q, b], [v, p, c], [v, p, e], [w, q, d]) and s2 = ([v, p, a], [w, q, b], [v, p, c], [w, q, d], [v, p, e]).

The transitional semantics for an operator node becomes:

    ---------- enabled-op(v, s)
     s →v s'

where s' = update-op'(s, v) is defined in figure 2.10

s'(w, p) =
- s(w, p) if v ≠ w ∧ ¬∃e∈E, k∈{1, ···, m}: (v, o_k)∈fst(φ(e)) ∧ (w, p)∈snd(φ(e))
- tail(s(w, p)) if v = w ∧ ¬∃e∈E, k∈{1, ···, m}: (v, o_k)∈fst(φ(e)) ∧ (w, p)∈snd(φ(e))
- shufflercons'(new↓k, s(w, p)) if v ≠ w ∧ ∃e∈E, k∈{1, ···, m} such that (v, o_k)∈fst(φ(e)) ∧ (w, p)∈snd(φ(e))
- shufflercons'(new↓k, tail(s(w, p))) if v = w ∧ ∃e∈E, k∈{1, ···, m} such that (v, o_k)∈fst(φ(e)) ∧ (w, p)∈snd(φ(e))

Figure 2.10 s' = update-op'(s, v)

with new↓k = [v, o_k, function(v)(···, val(head(s(v, i_j))), ···)↓k] and (s∈Str(D'), v∈V, p∈P_out and d∈D)
shufflercons'([v, p, d], s)∈shufflercons([v, p, d], s)
Execution of a node may lead to more than one next state, i.e. non-deterministic forward branching may occur. update should in principle be considered as a function State × V → P(State) instead of the above defined non-deterministic function State × V → State. The other update functions are changed similarly. Note that the enabled functions stay the same. The only difficulty for update exists when v, the node to be executed, is a self-choice node. In that case, shuffling must be done with all the

new values ∈ V × P_out × D on such an edge. Note that in this case, it does not make any difference in what order the shuffles are performed.

Definition 2.66 For a (well-formed) data flow graph G, the transitional semantics is defined as:

    ------------------- enabled(v, s)
    s →v update(s, v)

where enabled: V × State' → 𝔹 and update: State' × V → State' are defined as:

enabled = λ(v, s). enabled-merge(v, s) if type(v) = merge, and enabled-op(v, s) otherwise
update = λ(s, v). (provided enabled(v, s)) update-branch'(s, v) if type(v) = branch, update-merge'(s, v) if type(v) = merge, and update-op'(s, v) otherwise

Notation 2.67 update: State × V⁺ → State is defined as:
update(s, v1v2···v_n) = update(···(update(s, v1), v2)···, v_n)
This is well-defined, because of the following proposition.

Proposition 2.16 Given a sequence of nodes σ∈V⁺ and σ1, σ2 with σ = σ1σ2:
update(s, σ) = update(update(s, σ1), σ2)

Notation 2.68 enabled: V⁺ × State → 𝔹 is defined as:
enabled(v1v2···v_n, s) ≡ enabled(v1, s) ∧ enabled(v2, update(s, v1)) ∧ ··· ∧ enabled(v_n, update(s, v1v2···v_{n-1}))

Definition 2.69 k-enabled: V × State → 𝔹 is defined as:
k-enabled(v, s) ≡ ∀j: 0 ≤ j ≤ k: enabled(v, update_j(s, v))
where update_0(s, v) = s and (1 ≤ k) update_k(s, v) = update(update_{k-1}(s, v), v)

A node is multiply enabled if it is still enabled after being executed.

Notation 2.70 Given a sequence σ∈V⁺ and a state s,
s →σ ≡ enabled(σ, s)
s →σ s' ≡ enabled(σ, s) ∧ s' = update(s, σ)

Proposition 2.17 The transitional semantics as defined in definition 2.66 reduces, for a (well-formed) data flow graph G for which ¬choice(G), to the transitional semantics as defined for data flow graphs without choice, by taking for all values d' = [v, p, d]∈D' the value val(d') = d∈D.

The operational semantics of a well-formed graph G and initial state s0 induces a tree.

Notation 2.71 A reachability tree for a (well-formed) flow graph G with an initial state s0 is a connected directed rooted (simple) tree, in which the nodes are labeled with states s and edges are labeled with nodes v∈V. The root of the tree is a node labeled with the initial state s0. For each transition s →v s' defined by the transitional semantics, an edge labeled with v exists in the reachability tree from the node labeled with state s to a node labeled with state s'.

The reachability tree induced by the transitional semantics can also be considered to be a graph. The edges (s, v, s') of the reachability tree are labeled by the node that is executed to get s' from s, i.e. s' = update(s, v). But since the possible paths starting from a state s are only dependent on the current state s and independent of any prefix path from the root of the reachability tree to this state s, nodes that are labeled with identical states may be collapsed. This is even the case when s can be reached from s itself.

Notation 2.72 A reachability graph for a (well-formed) flow graph G with an initial state s0 is a connected directed rooted simple graph, in which the nodes are labeled with states s and edges are labeled with nodes v∈V. The root of the graph is a node labeled with the initial state s0. Each edge, labeled with a v∈V, in the reachability graph from the node labeled with a state s to a node labeled with a state s', models a transition s →v s'.

Notation 2.73 For a (well-formed) data flow graph G and initial state s0, the reachability graph induced by the transitional semantics as defined in definition 2.66 is denoted by RG(G, s0). When the initial state s0 is clear from the context, RG(G) = RG(G, s0).

Definition 2.74 Given a reachability graph RG, the final, or terminal, states of RG are defined as:
final(RG) = {s∈RG | ¬∃v: enabled(v, s)}
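Enumerating a (finite) reachability graph and its final states can be sketched as a closure over a step function; step is a hypothetical stand-in for the transitional semantics, returning update(s, v) for every node v enabled in s.

    -- Breadth-first exploration of the reachability graph (Notation 2.72);
    -- nodes labeled with identical states are collapsed by the 'seen' test.
    reachableStates :: Eq s => (s -> [s]) -> s -> [s]
    reachableStates step s0 = go [s0] []
      where
        go [] seen = seen
        go (s : rest) seen
          | s `elem` seen = go rest seen
          | otherwise     = go (rest ++ step s) (seen ++ [s])

    -- final(RG) (Definition 2.74): states in which no node is enabled.
    finalStates :: Eq s => (s -> [s]) -> s -> [s]
    finalStates step s0 = [ s | s <- reachableStates step s0, null (step s) ]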

Example 2.4 The reachability graph of the data flow graph of figure 2.11a with initial state s0, consisting of a value on each of the input edges, is given in figure 2.11b. Due to the output choice of nodes v3 and v4, two different final, or terminal, states s26 and s27 exist.

2.3.3 Operational semantics

The reachability tree models the internal behavior of the graph. The operational semantics is defined as the abstraction of the input-output behavior, i.e. the external behavior, of the graph induced by the transitional semantics, i.e. the reachability graph.

Definition 2.75 Path = ε + (State' · V)* · State' + (State' · V)^ω
Path is the domain of paths of a reachability graph, which are alternating sequences of states and nodes.⁹ A path can be viewed as a sequence of edges, since an edge is characterized by two states and a node with which the edge is labeled.

Notation 2.76 In the sequel, a path π∈Path is s0 v1 s1 v2 s2 ···

Definition 2.77 tail: Path → Path is defined as:
tail = λπ.
  s1 v2 s2 ··· if π = s0 v1 s1 v2 s2 ···
  s1 if π = s0 v1 s1
  ε if π = s ∨ π = ε

9. The symbol · is the language concatenation operator, which is also denoted by juxtaposition, extended to sets, e.g. L · a = {l · a | l∈L}, and L1 · L2 = {l1 · l2 | l1∈L1 ∧ l2∈L2}, where L, L1 and L2 are languages and a a symbol; ε is the empty string.


Figure 2.11 Example for transitional semantics: a) data flow graph b) reachability graph

Definition 2.78 paths: ℛ𝒢 → P(Path) is defined as:
paths(ρ) = {π = s0 v1 s1 v2 s2 ··· | s0 = root(ρ) ∧ ∀i: (s_{i-1}, v_i, s_i)∈edges(ρ)}

Notation 2.79 When the node-(input)port pair (v, p) and the path π are clear from the context, then
- v_u is a shorthand for the node(s) that 'use' s(v, p) for a given (v, p) along a path π (∀s∈π), namely for the node that changes the head of s(v, p) by its execution (by removing it, and possibly adding something to s(v, p)). In that case, π can be written as (π1 · v_u · π2)*, with s the last element of π1, the state of which the head will change by the execution of v_u. This is always the case for the strict input ports of v, and possibly some non-strict input ports of v. When v has only strict input ports, this is the only possible case and then all occurrences of v are v_u, and only those. However, it may also be valid for some non-strict input ports. For the data flow graphs presented in this thesis, all occurrences of v in π, with type(v) ≠ merge, are examples of v_u, and only those. If v is a merge node, then still v_u is the same as v, but not all occurrences of v in π are examples of v_u.
- v_d is a shorthand for the node(s) that 'define' s(v, p) for a given (v, p) along a path π (∀s∈π), namely the node that changes the tail of s(v, p) by its execution (by inserting a new value [v_d, p_out, val] in s(v, p) and possibly, e.g. in the case of a self-loop and v_d = v, removing the head of s(v, p)). Also in this case, π can be written as (π1 · v_d · π2)*. In case of a choice-free graph, and type(v_d) ≠ branch, v_d is uniquely determined as the predecessor node of v. But if a branch node is a predecessor of v, then not necessarily all occurrences of this branch node are a v_d for (v, p). Also for choice graphs, v_d is not uniquely determined by the structure of the graph.

Definition 2.80 s*: V × P_in × Path → Str(D'), s**: V × P_in → Set(Str(D')) are defined as:
s*(v, p, π) = if π = s0 then s0(v, p)
  else if v1 = v_u then cons(head(s0(v, p)), s*(v, p, tail(π)))
  else s*(v, p, tail(π))
s**(v, p) = ⋃_{π∈paths(RG(G, s0))} s*(v, p, π)

Definition 2.81 Behav: 𝔾 × Path → State', Behav: 𝔾 × State' → Set(State') are defined as:
Behav(G, π) = λ(v, p). s*(v, p, π)
Behav(G, s0) = λ(v, p). s**(v, p)

Definition 2.82 For a (well-formed) data flow graph G and initial state s0, the operational semantics is defined as Behav(G, s0).

Notation 2.83 When the initial state s0 is clear from the context, Behav(G) = Behav(G, s0).

An issue which has not been addressed is the state of an output edge. According to its definition, a state is a mapping of node-(input)port pairs to streams of values. Since the output edges of a graph O(G) have no destinations (Definition 2.11), the values output by the output nodes of the graph Out(G) (Definition 2.12) are not 'recorded'. This may be solved by adding dummy nodes that are connected to the edges O(G) (and which are not allowed to execute), or by changing the definition of State to State = V × P_in + O(G) → Str(D), and similarly for State'. The operational semantics, as well as the denotational semantics defined in the next section, change accordingly. Comparable to this is the following notation.

Notation 2.84 Given a well-formed graph G. In examples, a state s is represented as a set. Given a node-port pair (v, p), the elements of the set are the non-empty streams s(v, p) labeled by (v, p). If the edge e for which (v, p)∈snd(φ(e)) is not a multi-destination edge, i.e. |snd(φ(e))| = 1, then it might also be labeled with this edge e.

Example 2.5 Given the data flow graph of figure 2.12a with initial state s0 = {e1: (a), e2: (b)}, the reachability graph is given in figure 2.12b with
s1 = {e2: (b, v(a))},
s2 = {e2: (v(a)), e3: (w(b))},
s3 = {e3: (w(b), w(v(a)))} and
s4 = {e1: (a), e3: (w(b))}.
Two paths in the reachability graph, π1 = s0 v s1 w s2 w s3 and π2 = s0 w s4 v s2 w s3, exist, and
s*(e1, π1) = cons(head(s0(e1)), s3(e1)) = cons(a, nil) = (a),
s*(e1, π2) = cons(head(s4(e1)), s3(e1)) = cons(a, nil) = (a),
s*(e2, π1) = cons(head(s1(e2)), cons(head(s2(e2)), s3(e2))) = cons(b, cons(v(a), nil)) = (b, v(a)),
s*(e2, π2) = cons(head(s0(e2)), cons(head(s2(e2)), s3(e2))) = (b, v(a)) and
s*(e3, π1) = s*(e3, π2) = s3(e3) = (w(b), w(v(a))).


Figure 2.12 Example flow graph and reachability graph for operational semantics

Thus

Behav(G)(e1) = s**(e1) = {(a)},
Behav(G)(e2) = s**(e2) = {(b, v(a))} and
Behav(G)(e3) = s**(e3) = {(w(b), w(v(a)))}.

Notation 2.85 Given a well-formed graph G. In examples, values [v, p, d]∈D' are denoted by d for choice-free edges, and by [v, d] for choice edges if v has only one output port.

Example 2.6

Given the flow graph of figure 2.13a and initial state s0 = {e1: (a), e2: (b)}. The reachability graph is given in figure 2.13b, with
s0 = {e1: (a), e2: (b)},
s1 = {e2: (b), e3: ([v1, v1(a)])},
s2 = {e1: (a), e3: ([v2, v2(b)])},
s31 = {e3: ([v1, v1(a)], [v2, v2(b)])} = s42,
s41 = {e3: ([v2, v2(b)], [v1, v1(a)])} = s32,
s5 = {e2: (b), e4: (v3(v1(a)))},
s6 = {e1: (a), e4: (v3(v2(b)))},
s7 = {e3: ([v2, v2(b)]), e4: (v3(v1(a)))} = s9,
s8 = {e3: ([v1, v1(a)]), e4: (v3(v2(b)))} = s10,
s11 = {e4: (v3(v1(a)), v3(v2(b)))} and
s12 = {e4: (v3(v2(b)), v3(v1(a)))}.


Figure 2.13 Example with choice for operational semantics: a) data flow graph b) reachability graph

Since the reachability graph has two terminal states, it is certainly non-deterministic. There exist 6 paths in the reachability graph:
π1 = s0 v1 s1 v3 s5 v2 s7 v3 s11,
π2 = s0 v1 s1 v2 s31 v3 s7 v3 s11,
π3 = s0 v1 s1 v2 s32 v3 s8 v3 s12,
π4 = s0 v2 s2 v1 s31 v3 s7 v3 s11,
π5 = s0 v2 s2 v1 s32 v3 s8 v3 s12 and
π6 = s0 v2 s2 v3 s6 v1 s8 v3 s12,
with

s*(e1, π1) = s*(e1, π2) = cons(head(s0(e1)), s11(e1)) = (a),
s*(e1, π3) = cons(head(s0(e1)), s12(e1)) = (a),
s*(e1, π4) = cons(head(s2(e1)), s11(e1)) = (a) and
s*(e1, π5) = s*(e1, π6) = cons(head(s6(e1)), s12(e1)) = (a).
Similarly, ∀i: 1 ≤ i ≤ 6: s*(e2, πi) = (b),
s*(e3, π1) = cons(head(s1(e3)), cons(head(s7(e3)), s11(e3))) = ([v1, v1(a)], [v2, v2(b)]),
s*(e3, π2) = s*(e3, π4) = cons(head(s31(e3)), cons(head(s7(e3)), s11(e3))) = ([v1, v1(a)], [v2, v2(b)]),
s*(e3, π3) = s*(e3, π5) = cons(head(s32(e3)), cons(head(s8(e3)), s12(e3))) = ([v2, v2(b)], [v1, v1(a)]),
s*(e3, π6) = cons(head(s2(e3)), cons(head(s8(e3)), s12(e3))) = ([v2, v2(b)], [v1, v1(a)]),
∀i: i∈{1, 2, 4}: s*(e4, πi) = s11(e4) = (v3(v1(a)), v3(v2(b))) and
∀i: i∈{3, 5, 6}: s*(e4, πi) = s12(e4) = (v3(v2(b)), v3(v1(a))).
Thus Behav(G) = {{e1: (a), e2: (b), e3: ([v1, v1(a)], [v2, v2(b)]), e4: (v3(v1(a)), v3(v2(b)))}, {e1: (a), e2: (b), e3: ([v2, v2(b)], [v1, v1(a)]), e4: (v3(v2(b)), v3(v1(a)))}}.

2.4 Behavioral denotational semantics of flow graphs

In this section a denotational semantics for the input-output behavior, i.e. the external behavior, of a data flow graph is defined. This is in contrast to the operational semantics defined in the previous section, which models the internal behavior. A denotational semantics states what a program means, i.e. what value it has. A denotational semantics is compositional, i.e. a valuation function is defined for each syntactic construct. Usually, the argument to such a valuation function is bracketed in ⟦ ⟧ instead of ( ) to stress the fact that it is a syntactic structure. Besides the basic domains such as the set of nodes V and the sets of input and output ports P_in and P_out, which are subsets of the set of identifiers and thus flat domains, some derived domains like D = 𝔹 + ℕ + ℝ + ···, D' = V × P_out × D,¹⁰ State = V × P_in → Str(D) and State' = V × P_in → Str(D') are used.

Definition 2.86 Given a flat domain D, the domain State of states is defined as:
State = (V × P_in → Str(D), ≤_State)
where s ≤_State s' ≡ ∀(v, p)∈V × P_in: s(v, p) ≤_Str(D) s'(v, p).

≤_State is the normal partial order for functions (Definition A.53). The bottom element is therefore s_nil, defined as ∀(v, p): s_nil(v, p) = nil.

Proposition 2.18 State is a domain.

The following assembling and disassembling functions on the domain State are

defined, where D_⊥ is the domain D lifted with a bottom element (see notation A.45 and definition A.49).

Definition 2.87
ftail: State → State = State → (V × P_in → Str(D))
fhead: State → (V × P_in → D_⊥)
fcons: (V × P_in → D_⊥) × State → State
fappend: State × State → State
are defined as:
ftail = λs. λ(v, p). tail(s(v, p))
fhead = λs. λ(v, p). head(s(v, p))
and, for each vr∈V × P_in → D_⊥,
fcons = λ(vr, s). λ(v, p). if vr(v, p) ≠ ⊥ then cons(vr(v, p), s(v, p)) else s(v, p)
fappend = λ(s1, s2). λ(v, p). append(s1(v, p), s2(v, p))

10. By making use of currying, V × P_out → D could of course also have been chosen.

Proposition 2.19 The functions ftail and fhead are strict, monotonic and continuous. The function fcons is strict in its first argument and is monotonic and continuous. The function fappend is monotonic and continuous in its second argument.

The terms strict, monotonic and continuous are defined in section A.2; the above proposition is necessary to justify the use of the least fixpoint operator fix (Theorem A.2) in the semantic functions. As said before, the type State' must be used in case of generalized data flow graphs with choice edges. Definitions 2.86 and 2.87 are extended in the natural way for State' = V × P_in → Str(D') with D' = V × P_out × D.

2.4.1 Choice-free graphs

This section contains a denotational semantics to describe the input-output behavior of a (well-formed) data flow graph without choice edges (cf. [38]). In section 2.2 it is shown that a data flow graph can be written down as the parallel composition of elementary graphs: G = V1 || ··· || V_n, where each elementary graph consists of only one node. This composition is used in the sequel when denotational semantics are defined for data flow graphs. The only difference with the elementary graphs that are constructed in theorem 2.15 (page 36) is that the destination node-port pairs of the outgoing edges of a node v should also be incorporated in the description of the elementary graphs, instead of being ∅.

Notation 2.88 𝔾_elem denotes the class of such elementary graphs.

I.e., given a graph G with V = {v1, ···, v_n}, its set of elementary graphs is {V1, ···, V_n}, with V_i = ({v_i}, I(v_i), O(v_i), •v_i ∪ v_i•, φ') with φ'(e) = (fst(φ(e)) ∩ ({v_i} × O(v_i)), snd(φ(e))).

Execution of a node transforms (part of) the current state. A function join is needed which transforms the global state when a set of nodes is executed.

Definition 2.89 Provided that for a set of states {s1, ···, s_n}, ∀(v, p)∈V × P_in: #{i | s_i(v, p) ≠ ⊥} ≤ 1,
join: Tup_n(State) → State is defined as:
join = λ(s1, ···, s_n). λ(v, p).
  if s1(v, p) ≠ ⊥ then s1(v, p)
  else if s2(v, p) ≠ ⊥ then s2(v, p)

  ···
  else if s_n(v, p) ≠ ⊥ then s_n(v, p)
  else ⊥
join combines several local states, which define the state for each node-port pair uniquely, into one global state. For the behavioral denotational semantics, valuation functions are defined for each syntactic construct of a graph G = (V, P_in, P_out, E, I, O, φ) = V1 || ··· || V_n, where V = {v1, ···, v_n}. The valuation function for a node defines its input-output behavior, in that, given a stream of values for each input port, it yields a stream of values on each (destination of the) outgoing edge(s). The graph valuation function joins all these local states into the global state.
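A sketch of join in Haskell, with Nothing standing in for the undefined local stream ⊥; the precondition of Definition 2.89 guarantees that at most one local state is defined per node-port pair, so taking the first defined one is unambiguous.

    -- join (Definition 2.89) over a list of local states.
    joinStates :: [npp -> Maybe strm] -> (npp -> Maybe strm)
    joinStates locals np =
      case [ s | Just s <- map ($ np) locals ] of
        (s : _) -> Just s       -- the (unique) defined local stream
        []      -> Nothing      -- bottom: no local state defines (v, p)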

Definition 2.90 For a (well-formed) data flow graph G = V1 || ··· || V_n for which ¬choice(G) and initial state s0, the behavioral denotational semantics Behavior(G, s0) is defined as 𝒢⟦G⟧s0, where
𝒢: 𝔾 → State → State
𝒱: 𝔾_elem → State → State
ℱ: V → (V × P_in → D_⊥) → (V × P_in → D_⊥)
are defined as (with Φ: State → State):
𝒢⟦G⟧ = λs. fappend(s, fix(λs'. join(𝒱⟦V1⟧fappend(s, s'), ···, 𝒱⟦V_n⟧fappend(s, s'))))
- type(v) = operator
𝒱⟦V⟧ = fix(λΦ. λs. fcons(ℱ⟦V⟧fhead(s), Φ(ftail(s))))

- type(v) branch

Snu if head(s(v, Pc))= ..l) 'Y[VD =fix( A-. A-s. . { fcons(s head(s(v,p,))> (ftail(s ))) otherwise where Snil> Shead(s(v,p.)): V X Pin ~ D .L are defined as: snu = A,(v, p). nil head(s(v, Pd)) if 3eeE: (v, Pheaa(s(v,p.)))efst(<}(e})A Shead(s(v,p.)) = A-(w, p). ..l (w, p)esnd(<}(e)) { otherwise

- type(v) =merge

Sni/ if head(s(v, Pc))= ..l 'Y[VD = fix(A,. Its. fcons(shead(s(v,p.))> otherwise ) { (mftail(s, head(s(v, Pc))))) where Smi. Shead(s(v,p,)): V X P;n ~ D.L• mftail: State X D ~State SniJ = A.(w,p).nil head(s(v, Pheaa(s(v,p.>») if 3eeE: (v, Pd) efst((J(e))A shead(s(v,p,)) = A.(w, p). (w, p)esnd(<}(e)) { ..l otherwise

. = 1 ( l) 1 ( ) {tail(s(w, p)) if (w, p) = (v, Pc)v(w, p) = (v, Pva1) m a1 1 A. s, va . A. w, p . ( ) h . ift s w,p ot erw1se

- type(v) =JO same as operator nodes

.11IvD =function(v) Note again the shorthand notation Pvat and Phead(s(v,p,))· Since -.choice(G), the pre­ condition of join is also satisfied. The node valuation functions 'll defines the output streams for a node v in a recur­ sive way by applying function(v) on each element of the input streams. The [ap­ pend in the graph valuation function (j is needed because the node valuation func­ tion 'llfor a node v defines only the resulting output streams. The initial state must therefore always be put on the front to get the complete streams (see for instance BEHAVIORAL DENOTATIONAL SEMANTICS OF FLOW GRAPHS 59 example 2.10).11 Recursion in fj is required to take care of the flow of (partial) streams through the graph. Example 2.7

Given a node v with function+ and two data input ports p 1 and p2 and one

output port p 0 to which an edge is connected with destination (w, p). Then 'V![Vi]]s with s(v, p 1) = (2, 4) and s(v, p2) = (1, 3, 5) is just s' with s'(w, p) = (3, 7) and nil for all other ( v, p ). This can be checked easily by a Kleene itera­ tion to solve the fixpoint of 'V!IY1].

Example 2.8

Given a branch node vbr with two choice-free edges e1rue and efalse connected to Pirue respectively p false· errue is connected to (vlt Pi) while e false is con­ nected to (v2,P2). Assumes is defined as s(vbnP4 )= (a,b), s(vbr,Pc)= (true, false, false) and s(v, p) =nil for all other node-port pairs (v, p). Then by a Kleene iteration it can be checked that 411:Vbr Ds = s' where s' is defined as s'(vi. p 1) =(a) and s'(v2, p2) = (b) and otherwise nil.

Example 2.9

Given a merge node Vme with a choice-free output edge e connected to (vi. Pi). Assumes is defined as s(vme,Ptrue)=(a,b,c), s(vme•Pfalse)=(d,e) and s(vme• Pc)= (true, false, false.false) and s(v, p) =nil for all other node-port pairs (v, p). Then by a Kleene iteration it can be checked that 'V[[VmeDs = s' where s' is defined as s'(vl> Pi)= (a, d, e) and otherwise nil.

Example 2.10

Given the .flow graph of figure 2.14 with initial state s0 = { ei: (a), e2: (b, c), e3 : (d), e4 : (e) }. Behavior(G) can be computed by a Kleene iteration. The first iteration yields: join('VIIV1Dso), 'V!IV2Jlso. 'll[V3]so} = join({ e3: (v1 (a))}, { e4: (v2(b), v2(c)) }, { e5: (v3(d, e))}) = {e3: (vi(a)), e4: (v2(b), vz(c)), es: (v3(d, e))} =St

The second and further iterations yield, with si = fappend(s0, St) = { e1: (a), ez: (b, c), e3: (d, v1(a}), e4: (e, v2(b), v2(c)), e5: (v3(d, e))}:

11. This is in contrast to the approach in (38], but there the initial streams. besides the start-up tokens, are defined by some 'initializing' nodes. 60 THEORY

Figure 2.14 Example flow graph without choice for behavioral semantics join('vl.IY1Ds!), ifl:v2Ds}, •J1[V3DsD = join({ e3: (v1(a))}, {e4: (v2(b), v2(c)) }, {es: (v3(d, e), V3(V1 (a), V2(b)))}) = {e3: (v1 (a)), e4: (v2(b), V2(c)), es: (v3(d, e), V3(v1 (a), v2(b)))} = s2

Thus Behavior(G, s0) = fappend(s 0, s2) = {ei: (a), e2: (b, c), e3: (d, V1 (a)), e4: (e, v2(b ), V2(c)), es: ( V3(d, e), V3(V1 (a), v2(b)))}

2.4.2 General graphs In this section, the denotational semantics of the previous section is extended to cover flow graphs with choice edges. This means, as before in the case of the oper­ ational semantics, that the domain State' is used instead of State to be able to per­ form the right shuffles. It is clear that the node valuation functions only need to be altered to use State'. To compute the right shuffles, and as a 'replacement' for the function join, the following two definitions are needed. Definition 2.91 shuffle: Tupm(Str(D')) ~ Set(Str(D')) is defined as: shuffle(si. ··.,Sm)= {s I Vi: 1 '!ii ~ m: s~(vi, Pin)= s;} provided the arguments to shuffle are streams s;e ({v;}xfpm}XD)*u ( { vi} x {Pin} x D),,,, i.e. all elements of the stream are labeled with identical ( Vj, Pin). shuffle defines all the interleavings of a set of streams. Definition 2.92 shufflejoin: TUPn+i(State') ~ Set(State') is defined as: shufflejoin = A.(si. ···,Sm s prev). ps where BEHA V!ORAL DENOTATIONAL SEMANTICS OF FLOW GRAPHS 61 ps = {seState' I V'(v,p)eV xPin: 3s' eshuffles: s prev(v, p) '5.sir(D') s(v, p) Ssir(D') s'} with12 shuffles= shuffle({s;(v,p) I s;(v, p) :;eJ.}) shufflejoin is used in the graph valuation function for general graphs in the same way as join for choice-free graphs. shufflejoin yields a set of states due to the non­ determinism caused by choice edges. Informally, shufflejoin yields the set of states s that extend s prm but are prefixes of the shuffles of the s;( v, p) that define

(v, p). If -.choice(G), then each (v, Pin) is uniquely defined by one (v, Pou1), and shufflejoin then reduces to the natural defining function (Definition A.14) of join. Example 2.11 Consider the flow graph of figure 2.15.

Figure 2.15 Example flow graph for shufflejoin

Assume so= {e 1: (a,b);e2:(c)}, Sprev = {e3: ([vi.a])} and function(v1) = function(v 2) =identity (Definition A.11). Then shufflejoin( 'l·1IV i]lso, 'VIIV 2Dso, s prev) = shufflejoin({ e3: ([vl> a], [vt> b]) }, {e3: ([v2, c]) }, s prev) = { {e3: ([v1.a])}, {e 3 : ([vi. a], [vi. b])}, { e3: ([vh a], [v., b], [v2, c])}, { €3: ([ V1, a], [ V2, C])}, {e 3 : ([vh a], [v2, c], [vh b])} }. Because of the non-determinism due to the choice edges, the behavior of a flow graph may be a set of states, i.e. (j should be a function Ii -7 State' -7 P(State') instead of a function Ii-+ State'-+ State'. For this, a real powerdomain P(State') must be used, of which the desired partial order Spcsiaul) would be:

12. Here of course a silent 'type cast' is assumed between Set(Str(D')) and Tupm(Str(D')). 62 THEORY

PS1 Spcs1a1e') PS2 = (2.2) (V'seps1:3s'eps2:s Ssiate' s') A (V'seps2:3s'eps1:s Ssiate' s') But this is not a partial order, since, with s 1 $s1ate' Sz, {s2} Spcs1me') {si.s2} and {slts2} Sp an ordering based on the Scott topology of State' should be used, i.e. resulting in a Smyth or Egli-Milner powerdomain. However, when only maximal elements of S::pcsiate') are final results of the non-deterministic flow graph, equation (2.2) reduces to the subset ordering [119], which is exactly what is wanted. The graph valuation function defined here is therefore devised in such a way that it yields only these maximal elements.13 This is justified by the fact that the other elements of the set are those that have not yet completed passing all data items of an origin to the destination(s) of the (choice) edge. These elements are waiting to be shuffled after elements from the other origins, but will not show up. Thus a deadlock situation occurs. In the final behavior this is not possible; while it is needed to represent the intermediate behaviors, since then data may come along the other origins. This leads to, using the functions on sets as defined in definition A.65:

Definition 2.93 For a (well-formed) data flow graph G = V111 · · · llVn and initial state so. the behavioral denotational semantics Behavior( G, s0) is defined as g[GDso where g:GJ ~State'~ P(State') is defined as (with pseSet(State'), : Set(State') ~ Set(State')): g[GD = J.s. map(/append(s),

max(fix(l. A.ps. union(shufflejoin( 'lt[V1Dfappend(s, first(ps)),

'vlIV nDfappend(s, first(ps)), first(ps)), (rest(ps)))))) withl4 map =A.fun. J..ps. fix(J.. J.ps. union( { fun(first(ps))}, (rest(ps)))) max = A.{slt .... ,sm}· {s; h3j: 1 $ j Sm Ai* j:s; S Sj} and 'J/as in definition 2.90. It is clear that shufflejoin is monotonic and continuous, resulting in a proper

13. Cf. this with the use of the right closure in [102]. 14. The first argument of map, fappend(s), is the higher-order function obtained by currying /append (Section A.3). BEHA VIORAL DENOTATIONAL SEMANTICS OF FLOW GRAPHS 63 application (Theorem A.2) of the above fix operator. As long as infinite streams do not occur, the finitely generating tree is indeed finitely branching [I IS, 119]. The graph valuation function (j is basically the same as for choice-free graphs as defined in definition 2.90. Only sets of states are considered instead of a single state, and in addition the function max is applied for the purposes mentioned above. Another view of the flow graph yields the same behavior as above: transform each choice e into an n-merge node, with n =#•e. The control port of all these merge nodes is connected by an edge originating from an additional output port of the merge itself, i.e. a self-loop. The values on this edge are then random values in the range of I .. n. This models identically the same shuffling of the inputs onto the output as the shufflejoin function. Also the same 'deadlocks' occur. The 'complexity' of shufflejoin is solely caused by the possibility of a cycle in the flow graph, and when at least one of the edges in this cycle is a choice edge. If such cycles do not exist in the flow graph, then shufflejoin may be reduced to just the set of complete shuffles, i.e. no care has to be taken of prefixes. This is justified by the fact that all different origins of this choice edge are independent of each other in that case. These situations are illustrated in examples 2.12 and 2.13. However, a partial order for P(State') will then not be a natural one.

Notation 2.94 When the initial state s0 of the (well-formed) data flow graph G is clear from the context, Behavior(G) == Behavior(G, s0).

Proposition 2.20 The behavioral denotational semantics as defined in definition 2.93 reduces for a graph G for which ..,choice(G) to the behavioral denotational semantics as defined in definition 2.90.

Corollary Given a well-fonned graph G. -ichoice(G)~Behavior(G) = 1

Notation 2.95 A graph G is deterministic iff #Behavior(G) = L

Example 2.12

For the flow graph of figure2.16, and initial state s0 {e 1:(a),e2:(b)}, Behavior(G, s0) can again be computed with a Kleene iteration. The first iter- ation, i.e. shufflejoin('Jt[V1]s0 , '11IV2Jls0, •JllJ:v3]so. so). yields ps1 {Si. s2. S3, S4, Ss} with S1 = Snil• 64 THEORY

Figure 2.16 Example flow graph with choice for behavioral semantics

s2 {e3:([vi. v1(a)])}, S3 {e3:([v2, v2(b)])}, s4 = {e3 : ([vi. v1(a)], [v2, v2(b)])} and Ss {e3:([v2, v2(b)],[v1' v1(a)])}. All other iterations yield PS2 PS1 U{ S6, S7, Sg, S9, S10, Su} with s6 {e3:([vi. v1(a)]),e4:(v3(v1(a)))}, s1 = {e 3 : ([vh v1(a)], [ v2, v2(b}]), e4: (v3(v1(a)))}, ss {e3: ([ Vz, v2(b)]), e4: (v3(v2(b)))}, S9 = {e3: ([vz, v2(b)], [vi. v1(a)]), e4: (v3(v2(b)))}, S10 = { e3: ([Vi. V1(a)], ( V2, V1(b)]), e4: (v3(V1 (a)), V3(V2(b)))} and Su = {e3: ([v2. v2(b)], [v1, v1(a)]), e4: (v3(v2(b)), v3(v1 (a)))}.

Thus Behavior(G, s0) = map(fappend(s0 ), max(ps1)) = { { ei: (a), e1: (b), e3: ([ vi. v1 (a)], [ v2, v2(b)]), e4: ( V3( V1 (a)), v3( v2(b ))) }, {e1: (a), e1: (b), e3: ([v2, v1(b)], [vi. v1(a)]), e4: {v3(v2(b)). v3{v,(a)))} }.

Example 2.13 This example illustrates the behavioral denotational semantics for a flow graph with a choice edge in a cycle. Consider the flow graph of figure 2.17, with so= {e2:(a,b)}. Let function(v2) =identity and function(v1)(x) = x' for all xe{a,b,a',b',a", ... }.

The first iteration of the graph valuation function yields ps 1 = {s l> s2, s3 } with s, = Snil• s2 = {ei:([ v2. a])} and S3 = {e1: ([v2, a], [ V2, b])}.

The second iteration yieldsps2 = {st>s 2,s3,s4,s5,s6,s7} where Si. s2 and s3 result from s1; s2, s4, s5 and s6 from s2; s3, s6, and s7 from s3, with S4 = {e 1: ([v2,a], [vi.a'])}, ss = {e1:([v2,a],[vi.a'],[v2,b])}, BEHAVIORAL DENOTATIONAL SEMANTICS OF FLOW GRAPHS 65

Figure 2.17 Example flow graph with a choice edge in a cycle s6 = {e1: ([v2, a], [v2, b], [vi, a'])} and s7 = {e 1: ([ V2, a], [v2, b ], [vi. a1, [vl> b'])}.

The third iteration yields ps3 =ps 2u{s8, s9, s10, s11 , s 12, s 13 )} with sa = {e1:([v2,a],[v1>a'],[vi.a"])}, s9 = {e1:([v2,a],[vi.a'],[v2,b],[vi,a"])}, s1o = {e1: ([v2,a],[v1> a'],[vb a"], [v2,b])}, sn = { e1: ([vz, a], [Vi. a'], [ v2, b], [v., a"], [v1> b'])}, s12 = {e1: ([vz, a], [v2, b], [vi, a'], [vi. b1, [vi. a"])} and s13 = {e 1: ([v2, a], [ v2, b J, [vi. a'], [vlt b'], [vh a"], [ vh b"])}. Such an iteration can be performed ad infinitum, since the behavior of this graph results in infinite streams for e1• This shows that indeed all proper shuf­ fles are taken, since from the operational semantics it is clear that first v2 must execute once before v1 can ever execute, and v1 then consumes that first a. Similarly, the sequence b'b" · · · can only start after b. And only from the moment on that v1 has consumed both a and b (and possibly some intermedi­ ary ai), the future (maximal) sequence is fixed.

2.5 Proof of equivalence In this section, it is proved that the operational semantics and the behavioral deno­ tational semantics defined in the previous sections are equivalent to each other. The transitional semantics defines the internal behavior, by enumerating all possi­ ble execution sequences. But the operational semantics as defined in definition 2.82 extracts the external behavior from the reachability graph, so that the operational and the behavioral denotational semantics can be compared. First, two examples are given, that illustrate the equivalence of the operational semantics of a graph G, defined by Behav( G, s0), and the behavioral denotational semantics, as it defined by Behavior(G, s0). Next, the formal proof is given. 66 THEORY

Example 2.14 Given the data flow graph and the reachability graph of example 2.5. For this graph and and initial state s0 ={e1:(a),e2:(b)}, Behavior(G) = {e 1:(a),e2 :(b, v(a)),e3:(w(b), w(v(a)))}. This is identical to Behav(G) as example 2.5 shows.

Example 2.15 This example uses the same flow graph and reachability graph of examples 2.6 (page 53) and 2.12 (page 63). Since the reachability graph has two terminal states, it is certainly non-deterministic, as is also illustrated by the behavioral semantics as shown in example 2.12. Clearly from the results of these two examples, Behav(G, s0) = Behavior(G, s0 ). For choice-free (well-formed) data flow graphs, the following theorem can be proved.

Theorem 2.21 Given a (well-formed) datajiow graph G and initial state s0. -.choice(G)~Behav(G) = 1 Section 2.6 is completely devoted to the proof of this theorem. If Behav(G) = Behavior(G) is proved, then it is also an alternative of the corollary on page 63:

-.choice( G)~B ehavior(G, s0) = I For the proof of Behav(G) = Behavior(G), it is necessary to show that Behav(G) is a fixpoint of Behavior and also that it is the least one. The proof becomes eas­ ier, when the following alternative definition of Behav is given. In this definition and in the proof of lemma 2.22,. the shorthand notations vd and vu (notation 2. 79) are used again. Definition 2.96 s'*: V X Pin X Patfi ~ Str( D') s'**: V x Pin ~ Set(Str(D')) are defined as: s'*(v,p,K) =if ;r = s then nil

else if v1 = v d then cons(last(s1(v, p)), s'* (v, p, tail(K))) else s'* (v, p, tail(K)) s'** (v, p) = v append(s0(v, p),s'*(v, p, ;r)) ze paths(RG(G)) PROOF OF EQUIVALENCE 67

Lemma 2.22 Given a (well-formed) data flow graph G. -ichoice(G)=>s'** = s**

Proof: It is sufficient to prove that

Vnepaths(RG), veV, peP;n: s°((v, p), n) = append(s0(v, p), s'* ((v, p), n)) Clearly, the 'heads' of S;(v, p) only change when a node v fires, i.e. if v =Vu, and that the 'tails' only change if v1 = vd. Because -ichoice(G) and, by definition of a well-formed graph, -iconflict(G), there is exactly one node vd that defines s(v, p), just as there is exactly one node v that uses s(v, p). This yields s'*(v,p,n)=s'*(v,p,n~vd) and s*(v,p,n)=s°(v,p,n~vu). Note that with n' = n~vd, s'*(v, p, n') =if n' = e then nil

else cons(last(s1(v, p )), s'°(v, p, tail(n'))) Similarly, with n' = n~v 11 , s*(v, p,n') =if n' = s0 then s0(v,p) else cons(head(s0(v, p)), s°(v, p, tail(n))) Say that n~vd = sovds 1vds 2 •• • and n~v 11 = SoVus 1'vus 2' .. -. First, assume s0(v, p) =nil and sn(v, p) =nil for a finite path n = s0 ···Sn. Then, since

-ichoice(G) and -iconflict(G), node vd is executed inn as many times as node v11 , and all along the path re, node v d has executed as least as many times as node vu. Then clearly last(s;(v, p)) = head(s/(v, p)). It is then straightforward s'*cv, p) = s°(v, p), and thus also s'** (v, p) = s**(v, p).

If s0(v, p)-:;:. nil or sn(v, p)-:;:. nil, then there is some slack of size m = length(s0(v, p)) and m' = length(sn(v, p)) with ln~v 11 I = n + m - m' and Sn = S~+m-m'- Thus '( )) -{last(S;+m'(v, p)) if m < i < n + m - m' h ea d( s; v,p - So(V, p)(i) ifO < i::;; m head(si+m(v, p)) if 0 < i < n-m' l ast ( s; ( v, p )) = , { Sn(v, p)(m - n + i) if /1 - m' < i < /1 Then it is again straightforward that s'** ( v, p) = s** (v, p ); similarly for infinite paths. o

Theorem 2.23 Given a (well-formed) data flow graph G. -ichoice(G)=>Behav(G) = Behavior(G) 68 THEORY

Proof: By theorem 2.21, it is sufficient to look at one path fr of the reachability graph of G. As in the proof of lemma 2.22, s'* (v, p, fr')= if fr'= e then nil

else cons(last(s1(v, p)), s'*(v, p, tail(fr'))) where fr' = fr~ v d. Clearly this is the least fix point solution of an equation of the form fix(A.. AS. cons( new, (tai/(s)))). Also, last(s 1(v,p)) = last(update(s0, vd)(v,p)) and by assuming vd is an operator node, = fun(vd)(· · ·, head(so(vd, Pi)),·· }h. When lifting this to the level of the State domain, s' becomes the solution of fix(A.. AS. fcons('J(,y d)jhead(s), (ftail(s)))) The proof is similar for branch and merge nodes. Therefore, by lemma 2.22, s** is the solution of Behavior(G). D

Theorem 2.24 Given a (well1ormed) data flow graph G. Behav(G) = Behavior(G)

Proof: Tedious, but most of the work has already be done in the proof of theorem 2.23. In principle the proof goes along the lines of this theorem, but now it must be proved that each path in RG(G) leading to a behavior in Behav(G) is eBehavior(G), and vice versa. The results of section 2.6 show that instead of the complete reachability graph, only a subset of this graph, namely the anticipated reachability graph RG ani(G), needs to be computed, since Behav(RG an1(G)) = Behav(RG(G)). The different paths in RGani(G) are just caused by the choice edges of G. Now the proof must consider each path of RG am separately, yielding a result like that of lemma 2.22, which must then be used along the lines of the proof of theorem 2.23. Only, s'*(v, p, fr)= if tc = e then nil else if v1 = vd then cons(last(s1(v, p)~vd), s'*(v, p, tail(fr))) else s'* (v, p, tail(fr)) should be used instead of definition 2.96. o

2.6 Anticipation This section shows that it is not necessary to build the complete reachability graph for a flow graph, and is also the proof of theorem 2.21. Instead of constructing all interleaving execution sequences, only a few are needed to extract the external behavior of the graph and these still cover the internal non-determinism. For a ANTICIPATION 69 deterministic graph, one sequence is sufficient. 15 Important properties of the behavior of the flow graph can also still be deduced from the reduced reachability graph, as is proved in sections 2.6.1 and 2.6.2. The important observation for the following lemmata, theorems and corollaries is that for a states, s(v, p) remains identical until a predecessor node of vis executed or v itself. In the case that there exist no choice edges in the graph, there is pre­ cisely one node which defines s(v, p). In the case that there exist choice edges in the graph, s(v, p) is in general defined by a set of nodes. Also, when there are no conflictive nodes, all values are consumed by uniquely determined nodes. This leads to the conclusion, that when a node v is enabled in a state s in a well­ formed graph, it remains enabled at least until v itself is executed. Afterwards it may again be enabled immediately, but it is also possible that v is multiply enabled in state s. If no choice edges exist, the heads of the streams at the input ports of v cannot be changed, meaning the computation of v leads to the same values no mat­ ter if vis executed at states or in successor states of s, of course as long as v itself is not executed. Note that the above is not only valid for strict operator nodes, but also for branch and merge without taking special care. This is justified by the observation that branch nodes as well as merge nodes have strict input ports. Therefore, no special care has to be taken for these types of nodes in the following lemmata. In the sequel, several groups of lemmata and corollaries are given. The first lemma of such a group is the most restrictive one, in which conflict, choice and output­ choice is excluded; output-choice is allowed in the other lemmata, and sometimes even choice. The lemmata without the restriction of output-choice closely resem­ ble the lemmata with this restriction. The only difference is that in the case of no choice, only one state can be reached instead of a set of states. First it will be shown that the execution of two (enabled) nodes v and w may yield the same state, independent of the order of execution.

Lemma 2.25 Given nodes v and w (v -::;:. w) and a state s0 •

15. The results in this section can also be found in a preliminary fonn in [67, 68]. Only lately, similar results are presented by other authors [51, 52, 129]. These are based on the same obseivations as described here, namely by using partial-order semantics for state-transition based systems. (see also section 3.8.6), and thus abstract from the interleaving semantics. As [52] describes, it is even partially applicable to model checking as described in section 3.6. In [130] similar results are therefore mentioned for the dining philosopher problem as given in section 3.1. 70 THEORY

enabled(v, s0)Mnabled(w, s0) /\

•conflict(v)A•conflict(w) A vw wv , , =>So~S /\ So~S /\ s = s •choice(v)A•output - choice(v) /\ •choice(w)A•output - choice(w)

VW WV • Proof: Clearly, the paths s 0 ~s and s 0 ~s' exist, smce

•conflict(v)=>enabled(v, update(s0, w)) and similarly for w. Let p be an input port of v, and since enabled(v, s0 )=>head(s0(v, p)) * ..L, also •choice(v)=>head(s0(v, p)) = head(update(s0 , w)(v, p)) and similarly for w. The following cases can be distinguished for s(u, q): 1. (u,q)e u dest(e)andue{v,w}: eev•uw• Clearly for these (u, q): s(u, q) = s0(u, q) = s'(u, q). 2. (u, q)e u dest(e)\ u dest(e) and u * w: eev• eew• For these (u, q), s(u, q) is only changed by executing v, i.e. s(u, q) = update(s0 , v)(u, q), and s'(u, q) = update(update(s0, w), v)(u, q) = update(s0, v)(u, q) since update(s0 , w)(u, q) = s0(u, q). 3. (u, q)e u dest(e)\ u dest(e) and u -:t; v: eew• eev• This is dual to the previous case. 4. (u, q)e u dest(e)n u dest(e): eev• eew• Because •output - choice( v) and •output - choice( w ), this set is 0. 5. (u, q)e u dest(e) and u = w: eev• Since •choice(w), execution of v leads to a new value put at the end of

s0(w, p). It is then straightforward from the definition of update that s(w, p) = s'(w, p). 6. (u, q)e u dest(e) and u = v: eew• This is dual to the previous case. D

Corollary Given a graph G with initial state s0 and nodes v and w (v -:t; w). ANTICIPATIO:\ 71

enabled(v, s0 )Aenab/ed(w, s0) /\ } -.conflict(v)A-.conflict(w) /\ ha (G ) B h (G ) . => 8 e v ,vw= eav ,wv -.choice(v)A-ioutput - choice(v) /\ -.choice( w )A -.output - choice( w)

Removing the restriction of output-choice leads to:

Lemma 2.26 Given nodes v and w (v -:t w) and a state s0 . enabled(v, s0 )Aenabled(w, s0) /\} VW WV -.conflict(v)A-.conflict(w) /\ :}So-+ /\so-+ /\ S = S' -.choice( v)A-.choice( w) VW WV where S = {sls0 --+s} and S' = {s'ls0 --+s'}.

Proof: Similar to lemma 2.25, but now execution of v, respectively w, may lead to more than one next state. But since -.choice(v) and -.choice(w), s(v, p) and s(w, p) are exactly as above. o For the case of possible choice nodes v and w:

Lemma 2.27 Given nodes v and w ( v -:t w) and a state s0 enabled(v, s0)Aenabled(w, s0) /\} -.conflict(v)A--.conflict(w) /\ =>S = S' W!i!:V•AV!eW• VW WV where S = {sls0 --+sd and S' ={s'ls 0 --+s'}.

Proof: The proof goes along the same lines as for lemma 2.26, but now also s( v, p) and s( w, p) may be changed by interleaving. But since v can never enable w directly, execution of v respectively w does not change s( v, p) or s( w, p) respec­ tively. o

The next lemma is similar, but the nodes v and w may even be dependent.

Lemma 2.28 Given nodes v and w (v -:t w) and a state s0 enabled(v, s0)Aenabled(w,so) /\} vw wv =>3s, s': s0 --+s /\ s0 --+s' /\ s = s' --.conflict( v )A-.conflict( w) 72 THEORY

Proof: Take always the update that resembles the update in the non-choice case, i.e. put the new value computed by the execution of a node at the end of the out­ put streams. o

Corollary Given a graph G with initial state s0 and nodes v and w (v 'f:. w). enabled(v,s0)Aenabled(w,s0) B ha (G ) B h (G ) A} => e v ,vwn eav ,wv '# 0 -.conflict( v)A -.conflict( w)

A similar group of lemmata is given in which there is no restriction on the node w.

Lemma 2.29 Given nodes v and w (v :;i!: w) and a state s0• enabled(v, s0)Aenabled(w, s0) A} =>S = S' -.conflict( v )A -ichoice( v) VW WV where S = {sls0 -7s} and S' = {s'ls0 -1s'}.

Proof: Along the same lines as lemma 2.27, because vis choice-free and therefore the heads of the input streams of v cannot be changed by the execution of w. o

Corollary Given a graph G with initial state s0 and nodes v and w (v,;; w). enabled(v, s0 )Aenabled(w, s0) /\} B h (G )- B h (G ) . . =>eav ,vw- eav ,wv -iconflzct( v)A -.choice( v)

Lemma 2.30 Given nodes v and w (v:;; w) and a state s0 • enabled(v, so) A } vw wv enabled(w, so) A =>3s, s': so-1s /\ so-1s' /\ s = s') --conflict(v) I.e. there exists a state s that is reached independently whether v or w is executed first. Proof: Along the same lines as lemma 2.29, and, as in lemma 2.28, by taking the s that can be reached by always putting the new value at the end of the output streams when a node is executed, i.e. no interleaving takes places just as if there had not been any choice. o

Corollary Given a graph G with initial state s0 and nodes v and w ( v ,;; w ). ANTICIPATION 73

enabled(v, s0) A } enabled(w, s0) A ~Behav(G, vw)nBehav(G, wv) :f:. 0 -.conflict( v)

Corollary Given nodes v and w (v :F w) and a state s0• enabled(v, s0) A } enabled(w, s0) A ~s = S' independent(v, w) VW WV where S {sls0 -ts} and S' = {s'ls0 -ts'}.

41 Notation 2.97 In the sequel a, a', cl', ah a 2 eV*uV • A group of lemmata and corollaries is given next in which it is shown that a per­ mutation of a set of nodes yields the same (set of) state(s).

Lemma 2.31 Given a set ofnodes {v., · · · , vn} and a state s0 'Vi: 1 s; i Sn: enabled(v;, s0)A-.conflict(v;)A-.choice(v;)A-.output - choice(va=} C1 3!s': ('Vcreperm({ v1 .. · Vn }): so-ts') where perm(V) is the set of all permutations of elements of V.

I.e. all permutations of { v 1 • • • vn } lead to the same state. C1 Proof: Clearly, since 'Vi: l ::;; i :5 n: -.conflict(v;), all these paths s0-ts exist Thus it is only left to prove that all permutations lead to the same state s'. This can be proved by induction. The base case for n = 2 is proved in lemma 2.25. perm({v1,. • ., Vn+d) = U perm({vi.···, vi-1> V;+i.· ·., Vn+I }). V; l:Si~n+l By induction for a given i (1SiSn+1), 'Vaeperm({ vl> · · ·, vi-h vi+l> · · ·, Vn+l }), (F s0-t leads to one and the same state si. Since -.choice(v;), there exists exactly one

er. V·1 si for which s0 -t si. Now it is only left to prove that all si are identical. For 'Vi: I s; i Sn, this is implied by 'Vi: IS i::;; n: Vn+l· perm({v1,. • ·, vi-1> V;+t" · ·, vn}). V; n perm( {Vi. ••• ' V;-1. Vj+l> ••• ' Vn+l }). V; * 0 By induction, 'lfaeperm({v 1 ,-·-,vn- 1 }):s0 ~ leads to exactly one and the same ,,VnVn+I ,,vn+1Vn ' state s". Then, by lemma 2.25, s -t Sn+l and s -t Sn+l · Thus all s; 74 THEORY

(IS i Sn+ 1) are equal. o

Lemma 2.32 Given a set of nodes {Vi.···, Vn} and a state s0 for which 'Vi: 1 Si:::;; n: enabled(v;, s )A-iconflict(vi)A...,choice(v;). . 0 a , Then for all sequences aeperm({v1 • • • vn}), s0 -ts for all states Vt ••• Vn s' e { s I s0 --7 s}.

I.e. all permutations of { v1 • • · vn} lead to the same set of states. Proof: As lemma 2.31, but using lemma 2.26 instead of lemma 2.25. D

Corollary Given a graph G with initial state so, and a set of nodes { V1 • • • Vn}. 'Vi: 1 Si$ n: enabled(v;, s0)A-.conflict(v;)A-ichoice(v;)=> "iaeperm({ v1 • • · vn}): Behav(G,a) = Behav(G, v1 • • • v,,) Similar lemmata are given in the following, but the sequences of nodes may con­ tain multiple occurrences of the same node, instead of only permutations. First it is shown that given a node v and a sequence a it makes no difference whether v is executed right before or after a.

Lemma 2.33 Given a node v. a sequence of nodes a and a state s0 .

enabled(v, s0) /\so~/\ vea /\ } "iw: weavw = v: -iconflict(w)A-ichoice(w)A-ioutput-choice(w) => va av ::l!s: So-7S /\ So-7S

Proof: Clearly, since -.conjlict(v), these paths s0'!:'?s and s0 ~s' exist It can be proved by induction on the length of a that s = s'. The base case for lal = 1, i.e. a= w, is proved as lemma 2.25. Assume la,,+11=n+1, an+t =an. vn+1 and an+1 So --7 S.

s(w, p) = update{s0 , van+1)(w, p)

= update(so, va11 vn+1)(w,p) = update{update(s0, va,,), Vn+i)(w, p) {by induction} = update(update(so,a,,v), Vn+i)(w,p) = update(update(update(so,an), v), v,,+1)(w,p) = update(update(so. a,,), Vn Vn+1)(w, p) ANTICIPATION 75

{by induction and lemma 2.25} = update(update(so, an). vn+l vn)(w, p) = update(so, O"n Vn+l Vn)(w, p) = update(so,O"n+l vn)(w, p) = s'(w, p) D

Lemma 2.34 Given a node v, a sequence of nodes u and a state s0 for which a enabled(v,s0) I\ s0-7 A vea A 'v'w: weovw = v: -iconflict(w)A -ichoice(w). VO" O'V VO" O'V Then s0 -7 I\ s0 -7 and S = S' where S = {s I s0 -7s} and S' = {s' I s0 -7s'}.

Proof: As lemma 2.33, but using lemma 2.26 instead of lemma 2.25. o

Lemma 2.35 Given a node v, a sequence of nodes a and a state s0• enabled(v, s )I\ I\ vea A} 0 s 0 ~ =>S=S' -iconflict( v )A -.choice( v) VO' O'V where S = {s I s0 -7s} and S' = {s' I s0 -7s'}

Proof: Straightforward, as for lemma 2.34, but tedious, from the previous lem­ mata. o

Corollary Given a graph G with initial state s0, a node v and a sequence of nodes O'. enabled(v, s0) A A veu /\} s0 ~ =>Behav(G, vu)= Behav(G, av) -iconflict( v )A -ichoice( v)

Now a similar group of lemmata is given, in which it shown that it does not matter when a node vis executed within a sequence of nodes u (vea).

Lemma 2.36 Given a node v, a sequence of nodes a and a state s0 • enabled(v, s0) A s 0 ~ I\ vea /\ } 'v'w: weavw = v: -.conflict(w)A-.choice(w)A-.output - choice(w) =>

0'1 VO"z 3!s: ('v' ui. a 2: u =a 1 u2: s0 -7 s) 76 THEORY

I.e. all sequences a 1 va2 lead to the same states. 0"1 V0"2 Proof: Clearly, because -.conflict(v), all these paths s0 ~ Si exist. si(w,p) = update(so,a1va2)(w,p) = update(update(so, 0"1v),0"2)(w, p) {by lemma 2.33} = update(update(so, va1), a2)(w, p) = update(so, va1a2)(w, p) = update(s0, va)(w, p) Thus all sequences lead to the same final state. D

Lemma 2.37 Given a node v and a sequence of nodes CT and a state so.

enabled(v, s0)A so~ Avea A} , =>V'CT1>0"2:0" = 0"10'2: S = S -iconjlict( v)A -ichoice( v)

O"V O't V0'2 where S = {s I s 0 ~s} and S' = {s' I s0 ~ s'}.

Proof: The proof is similar to lemmata 2.35 and 2.36. D

Corollary Given a graph G with initial state s0, a node v and a sequence of nodes a.

enabled(v,s0 ) As 0 ~ Avea A} -iconflict(v)A...,choice(v) =>

V'G1> o-2: a= a 1a 2: Behav(G, o-1 va2) = Behav(G, av)

Notation 2.98 node - equal(0'1> a 2) iff a2 is a permutation of a1•

Thus two node-equal sequences cr1 and a2 consist of the same nodes and have the same number of occurrences for each node. Now it is shown that executing a node v before or after a sequence of nodes er yields the same state, no matter whether vea or not.

Lemma 2.38 Given a node v, a sequence of nodes a and a state s0• enabled(v,s0) A so~s~ A } V'w: wecrvw = v: -.conflict(w)A...,choice(w)A-.output - choice(w) => ANTICIPATION 77

va' 3a': node - equal(a, u'): s0 -7 s

Proof: The following cases can be distinguished:

1. vea: This is lemma 2.33 with a' = u. 2. VEU:

VEO", thus (j = U1V0'2 with veu1. s'(w,p) = update(s0,av)(w,p) = update(so, u 1 V

Lemma 2.39 Given a node v, a sequence of nodes a and a state s0 . enabled(v,s0) A s0O'V-7 A } · ' :::=>3a': node - equal( a, a'): S = S' -.conflict( v )N-.choice( v)

av vu' where S = {s I s0 -7s} and S' = {s' I s0 -7 s'}.

Proof: As lemma 2.38, but using lemma 2.34. o

Corollary Given a graph G with initial state s0, a node v and a sequence of nodes u. enabled(v,s0) As0 ~ A} -..conjlict(v)A-ichoice(v) =>

3a': node- equal( a, a'): Behav(G, vu')= Behav(G, av) These lemmata state that it does not matter when node v is executed; it remains 78 TuEORY enabled independent of which other nodes in u are executed, and when it will be executed, it will lead to the same state(s) as when it is executed immediately.

Lemma 2.40 Given a node v, a sequence of nodes u and a state s0• enabled(v, s0 ) A s0 ~s A } \fw: weuvw = v: -.conjlict(w)A-.choice(w)A-.output - choice(w) ~

v vu vu' (3!s': s-+s'As0 ...+s') V (3u': node- equal(u, vu'): s0 -+ s)

Proof: The following cases can be distinguished: 1. veu: v vu Then clearly s...+s' ASo-+ s'.

2. vea: veu, thus a= a 1 vcr2 with vEu1. s(w, p) = update(s0 ,u1 vu2)(w, p) = update(update(s0, u1v), u2)(w, p) {by lemma 2.38, there exists a u1' with node - equal(u1', u1)} = update(update(s0, vo/),

0 ~::~~~~~~;~:~j~' a s~~~e of nmks a and a state s .

\../ (}" V I I I ( v s: so-+s: s...+s AS = S ) V (3a : node - equal( a, va')AS = S") uv VO' vu' where S = {s I s0 ...+s}, S' = {s' I s0 ...+s'} and S" = {s" I s0 ~ s"}.

Proof: As lemma 2.40, but using lemma 2.39. o ANTICIPATION 79

Lemma 2.42 Given sequences of nodes a and a'. and a state s0• s0 ~sa A s0 a'~s' A node- equal(a,a') } =>S = s' 'Vwea: ...,conflict(w)A...,choice(w)A...,output - choice(w)

Proof: This can be proved by induction on the length of a (and er'). 1. lal = 1: In this case q =er'= v, and since ..,output -choice(v) there exists only one v path s 0 ~s.

2. lo-I= 2: This is lemma 2.25. 3. lerl=n+l: Say a = VO"n and a'= wo-' m with lanl = la'nl = n. The following cases can be distinguished: 1. v = w: s(vk, p) = update(s0,a)(vh p) = update(so, van)(vb p) = update(update(s0, v), an)(vk> p) {by induction} = update(update(so, v),a'n)(vk, p) = update(so, va'n)( v b P) = update(s0, a')(vkt p) = s'(vtt p) 2. V*W: s(vb p) = update(s0, a)(vb p) = update(so, van)(vt. p) = update(update(s0, v), an)(vk> p) Clearly, since node - equal(a, a'), wean and vea' n· Also for this case, enabled(w, s), and therefore, since ...,conflict(w), enabled(w, update(s0, v)). Then, by lemma 2.40, there exists a a" with node - equal( am wa"). = update(update(s0, v), wa")(vk, p) = update(update(s0, vw), a")(vb p) {by lemma 2.25) = update(update(s0, wv),a")(vt. p) = update(update(s0, w), va")(vb p) 80 THEORY

Since node equal(va", a'n), by induction and if enabled( a' n• update(s0 , w)): = update(update(s0, w), a'n)(vk> p) The latter condition is valid, since enabled(s, a') = enabled(s, wa'n) =enabled(s, w) A enabled(a'm update(so. w)). Since •output - choice(w), there exists exactly one update(so, w). Thus update(so, wa'n)Cvt. p) update(s0 , a')(vk> p) = s'(vk> p) D

1 Lemma 2.43 Given sequences of nodes a and 0' , and a state so.

1 s0-7(j /\ s0-70' /\ node - equal( a,"')/\I } "' a' :=}3s:s0-7sAs0-?s 'ilvEa: •conjUct(v)

Proof: As lemma 2.42, but using lemmata 2.41 and 2.26 instead of lemmata 2.40 and 2.25. o

Corollary Given a graph G with initial state s0, a node v and a sequence of nodes (]'. so-7a /\ so-7a' /\node - equal(a, a')/\ l:=}Behav(G,a)nBehav(G,a') :t. 0 'ilvEa: ...,conjUct(a) ·

Theorem 2.44 Given a well-formed graph G and initial state s0• If ...,choice(G), then all paths of RG(G, s0) have the same behavior. This theorem says that the reachability graph of a choice-free graph has one final state. Also for an infinite reachability graph, all the different infinite paths that do not postpone the execution of a node indefinitely, lead to only one behavior. Proof: Since ·'ilveV:-.canjlict(v)/\...,choice(v), there exists exactly one path ;repaths(RG(G)) corresponding to a sequence a that can be 'extracted' out of tr. Assume two paths ;r1 epaths(RG(G)) and tr2epaths(RG(G)), and s0 = root(RG(G)). If it can be shown that node - equal(a1, u2), then the theorem fol­ lows immediately from lemma 2.42 and the last corollary. The proof of node - equal(O'., a 2) is proved by induction on the length of the sequence and by ANTICIPATION 81 contradiction. 0'1 0'2 Assume s0-7s, So-7s' and -.node-equal(a1,a2). Let 0"1;::: v1a'1 and 0"2 v2a'2, with ...,choice(v1)A -.conjlict(v1)A -.choice(v2)A -iconjlict(v2). The following two cases exist:

1. V1 '# V2: Since -.conjlict(v1)A-.conjlict(v2), v1ea'2A v2ea'1. Then, by lemma 2.39, V20"' 3a':s0 -7 s A node-equal(v1a'i. v2a'). V2 Since ...,output - choice(v2), there exists exactly one s 0-7si. and the problem is reduced to I (jl (j 2, , ') s1-7s/\S1 -7 s /\-.node - equal( a , er 2 • (2.3)

Now the same technique can be used. When n2 is finite, this will halt, and the conclusion is then a contradiction, since node - equal(cri. a 2). Then, by lemma 2.42, s ::::: s'. In the case of infinite paths and if it guaranteed that a node is not indefinitely postponed: "\/ aepref(u2): 30"' e pref(u1): node - equal(u, u') And again by lemma 2.42, this is in contradiction with the assumption.

2. Vt ::::: V2: Now the problem is directly reduced to equation (2.3). D From all this, it can be concluded that the following algorithm computes enough of the reachability graph to extract the behavior of a well-formed data flow graph G. In this algorithm, only one choice-free node among the enabled nodes is executed. The other enabled nodes are said to be 'anticipated'. Since in all previous lem­ mata conflict is always included, This algorithm can also be adapted to build the derive all the behaviors of a Petri net. For Petri nets, conflict is more important than choice. Choice is not such an important issue in Petri nets, since interleaving of data items does not result in different behaviors, i.e. states or markings, because all the tokens in a place cannot be distinguished from each other, contrary to the case of interleaving data items in flow graphs. The reachability graph constructed by this anticipation algorithm can even be reduced further without losing behavior. For instance, if there is more than one choice-free node enabled (or the corresponding case of non-conflictive transitions in Petri nets), then all these nodes may be executed at the same time. (Cf. lemmata 82 THEORY

Algorithm Anticipated reachability graph construction global: AG I* reachability graph */ reachability-graph (G , s0) in: a well-formed graph G, an initial state s0 begin RG = [{s0 },0,s0] reach (G , s0) end reach {G, s) in: a well-formed graph G, a states from which the reachability graph is extended begin let enabled-nodes= {v I enab!ed(v,s)} if 3veenabled- nodes: -ichoice(v) then anticipated ( s) =enabled-nodes\ {v} execute {G, s, v) else forall veenabled- nodes do execute (G, s, v) od fi end execute {G, s, v} in: a well-formed graph G, a states, a node v that is executed at states begin fora II s' e update(s, v) do v add s-ts' to RG if s'~RG reach (G, s') fi od end

2.31and2.32). Another reduction follows from the following. Assume that all nodes that define the (input) choice edge of a node v, have executed. In that case, and if vis fully Al\TICIPATION 83

enabled, v may be executed also and anticipate all other enabled nodes, since those remain enabled. The proof of this follows the same lines as of the above Iemmata, since they can all be extended using relations choice(v, w) instead of choice(v) and choice(w); similarly for the conflict relation. By the execution of such a node, behavior is not lost, since all interleavings have already taken place and it is not possible that the head of the stream on (input) port p of node v can be changed by executing any other node. Of course, this node v is then (partially) multiply enabled, so that at the next states this anticipation might not be possible any more. Clearly this must be the case for all the input choice edges of v. Similarly for Petri nets, other transitions may be anticipated when a complete set (Definition A.16) of (pairwise) conflictive transitions is enabled. One small issue should be addressed and that has not fully been dealt with. It is possible that a reachability graph is of infinite size. Note that this is different from an infinite reachability tree. An infinite reachability graph represents a non­ terminating system that is not finitely representable. Note that it is also non­ repetitive, since repetitive systems are finitely representable (unless the repetition is of infinite length). The converse is not necessarily true, as automata accepting m-regular languages show [32, 98]. Non-finitely representable systems are of no importance when only finite hardware is considered. However, the anticipation algorithm should take care of finitely representable infinite paths. Anticipation may not lead to indefinitely postponing, i.e. anticipating, a node to execute. Thus it needs to be checked, when leaving the function reach at each level of recursion, whether the anticipated nodes are really executed at a (transitive) successor state. It is obvious that it is sufficient to do such a check only for each strongly connected component in the reachability graph. If it is found that anticipated transitions are not yet executed for some successor, then the (anticipated) reachability graph con­ struction has to proceed by executing these nodes. Cf. this check with a similar check which has to be performed for satisfiability checking of temporal logic for­ mulae, namely to verify that formulae of the form <> f are not postponed indefi­ nitely [65]. 2.6.1 Liveness This section proves that the anticipated reachability graph, as constructed by the algorithm of page 81, preserves the liveness properties of the corresponding flow graph. Liveness [110] concerns the question whether a node can ever be fired. Clearly liveness is opposed to deadlock, which means that a system halts prema­ turely. Therefore deadlock should normally be avoided. The importance of 84 THEORY liveness is also shown in section 2.7.1. Definition 2.99 live: V x :R..,t;j-> lB strong - live: V x :R..,r;J-> lB are defined as: live(v,RG) = 3Jrepaths(RG): VEJr strong - live(v, RG) Vsestates(RG): 3Jrepaths(RG, s): veir

Notation 2.100 dead: V x '1 lB is defined as: dead(v, RG) = -.Uve(v)

Definition 2.101 live:lfJ x 'l({j-> lB strong - live: If} x '.R.Jj-> lB are defined as: live(G, RG) =VveV: live(v,RG) strong-live(G,RG) = \lveV: strong-live(v,RG)

Notation 2.102 When the reachability graph RG is clear from the context: live(v) =live(v, RG) strong - live(v) =strong - live(v, RG) live(G) =live(G,RG) strong - live( G) =strong - live( G, RG)

Notation 2.103 When the graph G is clear from the context: live(RG) = VveV: live(v,RG) strong- live(RG) =VveV: strong-live(v,RG)

Theorem 2.45 Given a data flow graph G and its reachability graph RG and its anticipated reachability graph RG ant. live( G, RG)~live( G, RGant) strong - live( G, RG)~strong - live( G, RGant)

Proof: Directly from the observation that for a state seRG and veenabled(s), whether or not conflict(v) or choice(v), 3a: s~. o ANTICIPATION 85

2.6.2 Safeness This section proves that the safeness properties of the corresponding flow graph can still be deduced from the anticipated reachability graph. Safeness means that at most one value is ever present at each node-port pair [I 10]. Section 3.3 shows that edges of a safe graph may be mapped onto wires instead of implementing queues in the hardware to be synthesized. A similar concept is boundedness, which means that the number of values on an edge is always bounded, i.e. never an infinite stream occurs. Such an unbounded graph cannot be implemented in (finite) hardware. Definition 2.104 safe: V x Pin x State -t lB safe: V x Pin x 'l({j-t 18 safe: V x Pin x Patli-t lB are defined as: safe(v, p,s) =ls(v,p)I s; I safe(v, p,RG) =Vsestates(RG): safe(v, p, s) safe(v, p, ;r) =Vse;r: safe(v, p, s)

Definition 2.105 safe:(j x State-? 18 safe:(j x './?.Jj-t lB are defined as: safe(G,s) =V(v, p)eV x Pin: safe(v, p,s) safe(G, RG) =V(v, p)eV x Pin: safe(v, p, RG)

Notation 2.106 When the reachability graph RG is clear from the context: safe(v, p) = safe(v, p, RG) safe(G) = safe(G,RG)

Notation 2.107 When the graph G is clear from the context: safe(s) = safe(G,s) safe(RG) = V(v, p)eV x P;n: safe(v, p, RG) In the sequel, the shorthand notations vd and vu (Notation 2.79) are used again to denote the node(s) that define, respectively use, s(v, p) along a path 1r. Proposition 2.46 Given a data flow graph G and reachability graph RG with root so. safe(v,p,RG) if! saje(v,p,s0) and vd and Vu alternate in all paths 1&epaths(RG). For such a path 1&, 1l'n vd, vu} starts with Vu if s(v, p) ;e nil, and 86 THEORY with v d otherwise.

Proof: Directly from the observation that for all states s, a veenabled(s) is never disabled, since well- formed(G). o

Theorem 2.47 Given a data flow graph G and its reachability graph RG and its anticipated reachability graph RG ant· safe(RG)~safe(RG ant)

Proof: Directly from the observation that RG anti;;;,RGRG and thus states(RG an1)i;;;, states(RG). o Although theorem 2.47 is not valid in the opposite direction, safeness can never­ theless be reconstructed out of the anticipated reachability graph RG ant as theorem 2.52 shows. To prove this, first the following lemmata are needed.

Notation 2.108 Given paths n-1 and :1r2• :Tr1t;;;,:Tr2 ijJ3:1r',:1r"::1r2 = :1r':1r1:n:"

Lemma 2.48 Given a path n in an anticipated reachability graph RG ant and a node-port pair (v, p). vdavdi;;;,tr /\ vuea~-isafe(v,p) VuO"Vut;;,K /\ vdea~-.safe(v, p)

Proof: Straightforward from proposition 2.46. o In the sequel, it is assumed that safe(v, p, n-).

Lemma 2.49 Given a path n- in an anticipated reachability graph RG ant and a node-port pair (v, p). vda'vut;;,11' /\ vdea' /\vu ea'/\} . . ~-.safe( v, p) 3sea: v d eant1cipated(s)

Proof: Let <>=s1v1 • .. sn. By anticipation 3<>':avua'vdi;;;,n- and also 3u1,u2: . ~~~~ node-equal(u1vda2Vu, O'Vuu'vd): s1 ~ • Then the lemma follows from lemma 2.48. o

Lemma 2.50 Given a path n- in an anticipated reachability graph RG ant and a node-port pair (v, p). If 'Vn': tr' = vdavu: tr't;;,tr with 'Vseu: vdeenabled(s) and 3seu: anticipated(s) = 0, then for all paths tr" eRG that are anticipated by tr, ANTICIPATION 87

11 safe(v, p, 1& ).

1 Proof: Let sea: anticipated(s) =0, and 1& = vda1sa2vu. Then for all (sub)paths 11 1 11 11 1& that are anticipated by 1& , 1C =a'sa with node-equal(a',vda1) and 11 1 node - equal(a", 0"2 vu). Thus no path 1& vda3 vda 4 Vu that is anticipated by 1& exists. o

Definition 2.109 Given a (sub)path 1& = VuO'Vd in an anticipated reachability

graph RGan1 and a node·port pair (v, p) for which safe(v, p,1&), 'efse1&: Vuli!l antici­ pated(n), 'ef se1C: vde=anticipated(n) and -.3sen: anticipated(s) = 0. r:Patli.-4 lB is defined as: case rr = s0 vus 1vds2: if vdeenabled(s0) then false else true case rr = SoVuStWO'Vdsn: r(s{VuO'Vd) where s 1' = update(s0, w).

Lemma 2.51 Under the assumptions of definition 2.109, if r(rr) returns true for a safe (sub }path tr of an anticipated reachability graph RG an1, then r induces a transformation of tr to a non-safe (sub )path tr' of the complete reachability graph RG that is anticipated by tr.

Proof: Under the assumptions of definition 2.109, r induces a transfonnation of tr for which the following cases can be distinguished: 1. n=soVuS1VdS2

If vdeenabled(s0), then tr' = s0 vds1'v11 s2 is also a (sub)path of RG and with -.safe(v, p, n').

2. 1t = soVuS1WO'VJSn and weenabled(so)

Then n' =sows!v 11 uvdvn is also a (sub)path of RG. If -.conjlict(w) and -.choice( w), then only one such path wv u exists. Otherwise possibly more

than one such path wv11 exist, but then by anticipation all states s2' for which WVu so -4 s2' are eRG ant· And here only s2 has to be looked at, since the others

will be dealt with separately. It is not possible that v11 is a successor of w, since then w = vd. But it is possible that w is a successor of Vu. Note that in that case w might not be safe. Note that the set of anticipated nodes for the

state at which v11 now fires, i.e. after swapping, may change, namely some successors of w might be added, while w (and possibly nodes with which w is in conflict) is removed. 88 TuEORY

3. ,. = sovus1wavdsn and wEenabled(so) Thus w is a successor of vu and cannot be executed before vu is executed. However, the execution of w may be postponed.

For the last case, w is skipped for the recursion, since it can only be fired after Vw Note also that the sets of enabled and anticipated nodes of s 1 do not need to be changed in this case. That w may just be skipped, can be seen by another case analysis: 1. a=e

Then by assumption vdeenabled(s1). Thus vd must be a successor of win which case vu wv d is the only possible sequence, i.e. no permutation of these nodes is possible. This result will also result from the recursion. 2. a= via' If v;eenabled(s1), then wand vi may be swapped, leading to,.'= vuv;wvd. If also v;eenabled(s0 ), then v; can be swapped at s0 also, just as the recur­ sion will yield. Otherwise, V; is dependent of Vu but not of w. Thus Vu can never be swapped past v;. Since the existence of both sequences Vu v;w and

vu wv; is of no interest, it is sufficient to look at only one. If v; eenabled(s 1), then v1 is dependent of w and cannot be swapped before w and neither

before v11 • 0

Theorem 2.52 Given a (finite) anticipated reachability graph RG ant for a graph G, and a node-port pair ( v, p ). safe( v, p) is decidable.

Proof: Clearly, from the previous lemmata, the proof of this theorem may be restricted to (sub)paths,. = s0 vda1vua2vdsn of the anticipated reachability graph RGan1· The following cases exist: 1. If -isafe(v, p, RG an1>. then -.safe(v, p) by proposition 2.46. 2. If 3JreRGan1 for which lemma 2.48, 2.49 or 2.50 applies, then -.safe(v, p).

1 3. Otherwise assume that for all paths 7tERGan1, and all ,.'r;;;;,,. with 7' = vda1vua2vd where 'tsea1:vde:enabled(s), vde:cr2, Vui!a1 and V'sea2: vueenabled(s). (If n' does not satisfy this, -.safe(n') by lemma 2.49 or,." lemma 2.50) Then 3,." = vdcr'vua"vda"' with node-equal (11:', ,.")and s1~ for which 'ts:sea': {v,41 vd}nenabled(s)=0 and V's:secr": {vu,vd}nenabled(s)=0. Then safe(v,p) can be deduced as in lemma2.51. ANTICll'ATION 89

Now it is only left to prove that the above conditions are sufficient, i.e. it is required to prove that all non-safe (sub)paths ;c of the complete reachability graph RG are only anticipated by (sub)paths ;c' for which one of the above cases apply.

It is obvious that only (sub)paths 11:' = a 1 v da2 Vua3 v J0'4 need to be considered for which V'sea1: vdeenabled(s), vdeo-2, vuea1 and V'sea2: Vufl:enabled(s). Such a path 7r' can always be transformed to a path 7r" for which the assumptions of definition 2.109 apply. But in that case, '/: always yields true. The problematic cases could be those implying a complete conflictive or choice set to be executed. But then more than one such a path exists, of which at least one yields the correct result. o

2.7 Well-behavedness In this section, some definitions and theorems are stated that are useful to analyze flow graphs in a hierarchical way. For this the notion of well-behavedness is defined, similar to the well-behaved definition in [39] for flow graphs and in [125] for Petri nets, or what is called well-formed in [128] and [87, 88]. Also the results of [125, 128] for blocks in Petri nets are extended. Here some similar theorems are stated, which however are not restricted to subgraphs with one initial and one final node. (Cf. also [87, 88] ). One of the results of this section is that the external behavior of a (well-behaved) graph is equal to the behavior of this graph by expanding all hierarchical nodes. Of course, the internal behavior is non­ deterministic to a larger extent.

Definition 2.110 A well-formed graph G is k - well - behaved with respect to an internal initial state s0, i.e. defining s0(v,p) for only the (v, p)edest(e) for all eeE\(/(G)uO(G)), ifffor all initial states s0 such that

1. V'ee/(G): (v, p)edest(e)=>ls0(v, p)I = k 2. V'eeO(G): lso(e)I = 016

3. V' eeE\/(G): (v, p) edest(e): s0(v, p) = s0(v, p), the following is valid for alls final efinal(RG(G, so)):

1. V'ee/(G): (v, p)edest(e)=>ls finat(V, p)I = 0

16. See the remark on page 52. In order to define embeddings (Section 2.7.1) very easily, dummy nodes are not introduced. 90 THEORY

2. 'lieeO(G): Is final(e)I = k 3. 'Ii eeE\/(G): (v, p)edest(e): s finaz(V, p) = s{i(v, p). Informally, a graph is well-behaved if it generates on all output edges the same number of data items, as initially there were on the inputs, while consuming all the inputs and reinitializing the rest of the graph. Proposition 2.53 k + 1-well-behaved(G, s0 }~k -well-behaved(G, s0)

Proposition 2.54 Under the assumptions for s0 of definition 2.110, the reachabil­ ity graph RG(G, s0 )for a k-well-behaved data flow graph G isfmite.

Definition 2.111 well - behaved:6 x State' ~ 1B is defined as: well - behaved(G, So)= 'likeJN: k -well-behaved(G, so)

Notation 2.112 In the sequel, the precise definition of the internal state lo is not needed, and well - behaved( G) denotes well - behaved( G, s0).

Proposition 2.55 Under the assumptions for s0 of definition 2.110, the reachabil­ ity graph RG(G,s0)for a well-behaved data flow graph G is finite. Well-behavedness of a graph is comparable to total correctness of programs. Note that the operational semantics, and therefore also the execution graph denota­ tional semantics, 'coincides' also with the behavioral denotational semantics for nullary operator, or constant, nodes. According to the operational semantics, a nullary operator node is always enabled to fire when there are no input sequence edges. Such a node then always leads to infinite execution sequences in the reacha­ bility graph and to infinite streams. The solution of the least fixpoint of valuation function of such a node in the behavioral semantics is also an infinite stream, because 'V[constD = fu(A.. cons(c, ))= (c, c, · · ·). Therefore a graph contain­ ing such a node is never well-behaved. For well-behaved graphs, nullary nodes must always be 'connected' by means of sequence edges. The same applies to get nodes. An informal motivation for well-behaved graphs might be that it is for such a graph 'observable' when it has terminated its execution. When more than one value can be produced, and it is not known how many, one can never be sure that WELL-BEHA VEDNESS 91 the graph execution has terminated. Note that this does not imply that systems that produce output data at a higher or a lower rate than the input rate cannot be mod­ eled by data flow graphs. Those rates are defined by the number of executions of get and put nodes in the flow graph. This is due to the fact that the initial state is not necessarily defined by the get nodes in a graph G, but by the input edges I (G); similarly for the put nodes and the output edges O(G) (see also the remark on 10 nodes at page 31 ). 2.7.1 Embeddings

Definition 2.113 Given a graph G' and a strict node v e VG and bijections 81: /(v) -7 /(G') and 82: O(v) -7 O(G') and Behavior(G') = .1(v). embedding:fl x V xfl -7 fJ is defined as: embedding(G, v, G') =(VG\{ v }uVG'• P;n0 uPintr• P °"'a uP outa" (E0 uE0 ·)\(l(G')u0(G')), laf(Va\{ v})u/6 ·, Oa/(va\{ v DuOa·, ;") with

,Pa(e) if eeE6 \(•vuv•) ;"(e) = ; 6 .(e) if eeE6 ·\(/(G')uO(G')) {(a,b) if ee•vuv• where a= fst(;G(e))\({ v} x O(v))u U fst(f/Ja·(Bz(p))) V p:(v,p)efst(;(e)) b = snd(rp0 (e))\({v} x /(v))u U snd(f/Ja.(B1(p))) V p:(v,p)esnd(;(e))

Proposition 2.56 well-formed(G) ∧ well-formed(G') ⇒ well-formed(embedding(G, v, G'))

Proof: Let G″ = embedding(G, v, G'). By definition 2.26, it has to be proved that
∀e₁, e₂ ∈ E_G″: e₁ ≠ e₂: fst(φ_G″(e₁)) ∩ fst(φ_G″(e₂)) = ∅ ∧ snd(φ_G″(e₁)) ∩ snd(φ_G″(e₂)) = ∅   (2.4)
The following cases can be distinguished:

1. e₁, e₂ ∈ (E_G ∪ E_G') \ (I(G') ∪ O(G') ∪ v• ∪ •v) and e₁ ≠ e₂: Since well-formed(G) and well-formed(G'), equation (2.4) is clearly valid.

2. e₁ ∈ (E_G ∪ E_G') \ (I(G') ∪ O(G') ∪ v• ∪ •v) and e₂ ∈ v• ∪ •v: In this case, with φ(e₁) = φ_G(e₁) or φ_G'(e₁) depending on whether e₁ ∈ E_G or e₁ ∈ E_G',
fst(φ″(e₁)) ∩ fst(φ″(e₂)) = fst(φ(e₁)) ∩ a   (2.5)
Now the following cases can be distinguished:

1. e₁ ∈ E_G \ (•v ∪ v•): In this case, equation (2.5) is equal to
= fst(φ_G(e₁)) ∩ (fst(φ_G(e₂)) \ ({v} × O(v)))
⊆ fst(φ_G(e₁)) ∩ fst(φ_G(e₂)) = ∅.
Similarly, snd(φ_G″(e₁)) ∩ snd(φ_G″(e₂)) = ∅.

2. e₁ ∈ E_G' \ (I(G') ∪ O(G')): In this case, equation (2.5) is equal to
= fst(φ_G'(e₁)) ∩ ⋃ {fst(φ_G'(θ₂(p))) | p: (v, p) ∈ fst(φ(e₂))}
⊆ fst(φ_G'(e₁)) ∩ ⋃ {fst(φ_G'(e)) | e ∈ O_G'} = ∅.
Similarly, snd(φ_G″(e₁)) ∩ snd(φ_G″(e₂)) = ∅.

3. e₁, e₂ ∈ •v ∪ v•:
fst(φ_G″(e₁)) ∩ fst(φ_G″(e₂))
= (fst(φ(e₁)) \ {v} × O(v) ∪ ⋃ {fst(φ_G'(θ₂(p))) | p: (v, p) ∈ fst(φ_G(e₁))})
  ∩ (fst(φ(e₂)) \ {v} × O(v) ∪ ⋃ {fst(φ_G'(θ₂(p))) | p: (v, p) ∈ fst(φ_G(e₂))})
= (fst(φ(e₁)) ∩ fst(φ(e₂))) \ {v} × O(v)
  ∪ (⋃ {fst(φ_G'(θ₂(p))) | p: (v, p) ∈ fst(φ_G(e₁))} ∩ ⋃ {fst(φ_G'(θ₂(p))) | p: (v, p) ∈ fst(φ_G(e₂))})
since the other cross-products are empty,
= ⋃ {fst(φ_G'(θ₂(p))) | p: (v, p) ∈ fst(φ_G(e₁)) ∧ (v, p) ∈ fst(φ_G(e₂))} = ∅
since ¬∃(v, p): (v, p) ∈ fst(φ_G(e₁)) ∧ (v, p) ∈ fst(φ_G(e₂)).
Similarly, snd(φ_G″(e₁)) ∩ snd(φ_G″(e₂)) = ∅.
Since equation (2.4) is symmetric, it is also valid for all the other cases. □

Lemma 2.57 Given well-formed graphs G and G', and a node v ∈ V. Let G″ = embedding(G, v, G').
well-behaved(G') ⇒ ∀σ ∈ RG(G): ∃σ″ ∈ RG(G″): σ↾(V \ {v}) = σ″↾(V \ {v})

Proof: The following cases can be distinguished:

1. v ∉ σ: Trivial, choose σ″ = σ.

2. v ∈ σ: Since well-behaved(G') and by proposition 2.55 RG(G') is finite, ∃σ': (∀w ∈ σ': w ∈ V_G') ∧ s₀(G') →σ' s_n(G'), where s₀ is the initial state of G' and s_n is a final state of G'. Thus any firing sequence σ ∈ RG(G) can be transformed into a firing sequence σ″ ∈ RG(G″) by replacing every occurrence of v in σ by σ'. □

Theorem 2.58 Given well-formed graphs G and G', and a node v ∈ V. Let G″ = embedding(G, v, G').
well-behaved(G) ∧ well-behaved(G') ⇒ well-behaved(G″)

Proof: By lemma 2.57, ∀σ ∈ RG(G): ∃σ″ ∈ RG(G″). However, it can also be the case that v is partially enabled in σ, and therefore G' can be partially executed in G″. This possibly results in some outputs, which may again lead to (partially) enabling G'. However, since G is well-behaved, G' must get fully enabled, yielding a full set of output values, because G' is well-behaved, too. If G' does not become fully enabled, then v is dead in such a sequence, and therefore G would not be well-behaved, contradicting the hypothesis. □
Thus expansion is well-behavedness preserving, but collapsing is in general not well-behavedness preserving. But nevertheless, the following is valid.
Theorem 2.59 Given well-formed graphs G and G', and a node v ∈ V. Let G″ = embedding(G, v, G').
well-behaved(G″) ∧ well-behaved(G') ∧ live(G) ⇒ well-behaved(G)

Proof: By lemma 2.57, ∀σ ∈ RG(G): ∃σ″ ∈ RG(G″). However, it is not necessary that σ″ is a complete path of RG(G″), i.e. σ may deadlock in G but σ″ not in G″. However, if live(G), then for all complete paths σ ∈ RG(G), there exists a complete path σ″ ∈ RG(G″). Since all paths σ″ ∈ RG(G″) are well-behaved, this is clearly also the case for all paths σ ∈ RG(G). I.e. in a sense, if G is live, then G″ cannot be 'more live' for those corresponding sequences, or at least, if more partial firings of G' exist, then also more full firings of G' exist, since (k+1)-well-behaved(G) ⇒ k-well-behaved(G). □
Note that theorem 2.59 at first sight contradicts one of the main results of [125], since now 'Condition A' of theorem 11 of [125] is skipped. This condition states, for a theorem similar to theorem 2.59, for a k-well-behaved graph G', and for some node v that is not (k+1)-enabled, that it must always be possible to k-enable v in G. Here such a condition is not needed, since in data flow graphs it cannot be stated that |s(v, p)| must be larger than 1 for a node to be enabled to execute, as is possible for general Petri nets. Thus for analysis and verification purposes, liveness preserving reductions have to be looked for.

2.7.2 Well-structuredness
In this section, a reduction scheme is defined to determine whether a flow graph without choice edges is well-behaved.
Definition 2.114 The well-structuredness rewriting system for graphs G for which ¬choice(G) consists of the following rewrite rules:
1. Transform the graph by collapsing all input and output ports of operator nodes. Formally:
G' = (V, {p_d, p_c, p_true, p_false}, {p_d, p_true, p_false}, E, I', O', φ') where

I'(v) = {p_d} if type(v) = operator; I(v) otherwise

O'(v) = {p_d} if type(v) = operator; O(v) otherwise

φ'(e) = ({(v, p_d) | ∃(v, p) ∈ V × P_out: (v, p) ∈ fst(φ(e)) ∧ type(v) = operator} ∪ {(v, p) | (v, p) ∈ fst(φ(e)) ∧ type(v) ≠ operator},
{(v, p_d) | ∃(v, p) ∈ V × P_in: (v, p) ∈ snd(φ(e)) ∧ type(v) = operator} ∪ {(v, p) | (v, p) ∈ snd(φ(e)) ∧ type(v) ≠ operator})

Note, only branch and merge nodes have input and/or output ports p_c, p_false, and p_true.
2. Map branch nodes with the same control edge onto each other. Formally, for each branch node v_br and its control edge e_c ((v_br, p_c) ∈ snd(φ(e_c))) and its induced set of corresponding branch nodes V_br = {v ∈ e_c• | type(v) = branch}:
G' = (V \ V_br ∪ {v_br}, P_in, P_out, E, I', O', φ') where

I'(v) = I(v) if v ∈ V \ V_br; I(v_br) if v = v_br

O'(v) = O(v) if v ∈ V \ V_br; O(v_br) if v = v_br

φ'(e) = ({(v, p) | (v, p) ∈ fst(φ(e)) ∧ v ∉ V_br} ∪ {(v_br, p) | ∃v ∈ V_br: (v, p) ∈ fst(φ(e))},
{(v, p) | (v, p) ∈ snd(φ(e)) ∧ v ∉ V_br} ∪ {(v_br, p) | ∃v ∈ V_br: (v, p) ∈ snd(φ(e))})
3. Map merge nodes with the same control edge onto each other. Formally, for each merge node v_me and its control edge e_c ((v_me, p_c) ∈ snd(φ(e_c))) and its induced set of corresponding merge nodes V_me = {v ∈ e_c• | type(v) = merge}:

G' = (V \ V_me ∪ {v_me}, P_in, P_out, E, I', O', φ') where

I'(v) = I(v) if v ∈ V \ V_me; I(v_me) if v = v_me

O'(v) = O(v) if v ∈ V \ V_me; O(v_me) if v = v_me

φ'(e) = ({(v, p) | (v, p) ∈ fst(φ(e)) ∧ v ∉ V_me} ∪ {(v_me, p) | ∃v ∈ V_me: (v, p) ∈ fst(φ(e))},
{(v, p) | (v, p) ∈ snd(φ(e)) ∧ v ∉ V_me} ∪ {(v_me, p) | ∃v ∈ V_me: (v, p) ∈ snd(φ(e))})
4. Split all multi-destination edges. Formally, for an edge e with #snd(φ(e)) = n, n > 1:¹⁷
G' = (V, P_in, P_out, E', I, O, φ') where (with ∀i: 1 ≤ i ≤ n: e_i ∉ E)
E' = E \ {e} ∪ {e₁, ···, e_n}

φ'(e') = φ(e') if e' ∈ E; (fst(φ(e)), {snd(φ(e))(i)}) if e' = e_i

17. For a set s, s(i) denotes the i-th element of s. Cf. notation A.61.

Note, since ¬choice(G), ∀e ∈ E: #fst(φ(e)) = 1.
5. Remove duplicate edges.
6. Remove 'empty' edges. Formally:

G' = (V, P_in, P_out, E', I, O, φ↾E') where

E' = {e ∈ E | fst(φ(e)) ≠ ∅ ∨ snd(φ(e)) ≠ ∅}
7. Remove an edge e connecting two operator nodes v and w (v ≠ w), i.e. φ(e) = ({(v, p_d)}, {(w, p_d)}). Note, the intent is not to remove all edges between operator nodes, but to propagate them in the direction of the input and the output edges of G, so that the precedence relation between input and output becomes apparent, and the intermediate subgraph becomes isolated. Formally, with n = #•v:

G' = (V, P_in, P_out, E', I, O, φ') where (with ∀i: 1 ≤ i ≤ n: e_i ∉ E)
E' = (E \ {e}) ∪ {e₁, ···, e_n}

φ'(e') = φ(e') if e' ∈ E; (fst(φ(•v(i))), {(w, p_d)}) if e' = e_i

8. Remove 'dead' nodes, since the previous step may lead to nodes v with v• = ∅. Formally:
G' = (V', P_in, P_out, E, I↾V', O↾V', φ) where V' = {v ∈ V | v• ≠ ∅}
9. Replace the structures of figures 2.18 and 2.19 by a single operator node.
Note that in a sense, this reduction is the converse of the construction of data flow graphs for imperative and applicative languages as discussed in section 3.2. Also cf. this definition of well-structuredness as used in [83] and the concept of reducible flow graphs [56], and hence basic blocks [8].
Proposition 2.60 The rewrite system consisting of the rewrite rules 5 to 9 given above is confluent, i.e. the order of application is irrelevant.
The proposition is even valid when extended to all rules, since rules 1-4 can be applied only once.
Definition 2.115 For a (well-formed) data flow graph G for which ¬choice(G), G is well-structured iff the rewrite system of definition 2.114 results in a set of isolated nodes V', with ⋃ {•v | v ∈ V'} = I(G) and ⋃ {v• | v ∈ V'} = O(G). The internal initial state s₀' (see definition 2.110) must consist only of false values on the control ports of the merge nodes of the constructs of figure 2.19.

Figure 2.18 Reduction of well-structured switch statement
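Rules 5, 6 and 8 are simple enough to state operationally. A sketch in Python on a toy encoding (edges as (origins, destinations) pairs of frozensets of (node, port) pairs; this encoding is an assumption, not the thesis's representation, and the graph's boundary edges are ignored):

    def reduce_graph(nodes, edges):
        while True:
            uniq = list(dict.fromkeys(edges))            # rule 5: duplicate edges
            uniq = [e for e in uniq if e[0] or e[1]]     # rule 6: 'empty' edges
            has_out = {v for (orig, _) in uniq for (v, _p) in orig}
            kept = nodes & has_out                       # rule 8: drop v with empty v•
            pruned = [(frozenset(vp for vp in o if vp[0] in kept),
                       frozenset(vp for vp in d if vp[0] in kept))
                      for (o, d) in uniq]
            if kept == nodes and pruned == edges:        # nothing changed: done
                return kept, pruned
            nodes, edges = kept, pruned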

Theorem 2.61 Given a well-formed graph G.
well-structured(G) ⇒ well-behaved(G, s₀')

Proof: All the reduction rules of the rewrite system of definition 2.114 are reductions of well-behaved subgraphs, and result in live reduced graphs iff the subgraphs are live. Since if well-structured(G) the reduction of G yields a live graph, the latter is obvious. □
An example of a non well-structured graph is given in figure 2.20. In this figure, the control edge from the test node to all the branch and merge nodes is not drawn. In this example, data items accumulate on one input edge of the operator node outside the loop for each loop iteration (+1). The other input edge gets an amount of data items equal to the number of complete loop executions, leading in general to a non well-behaved graph.
The rewrite system of definition 2.114 can also be extended to include choice edges and other well-behaved substructures than those of figures 2.18 and 2.19. Difficulties to extend the reducibility do not arise from choice edges themselves, i.e. static control structures, since e.g. rule 7 of definition 2.114 can be changed (assuming no self-loops and no self-choice) to equation (2.6) to include choice edges.

Figure 2.19 Reduction of well-structured repetitive statement

In the end, the number of data items on the output edges must be identical to the number of data items on the input edges in the initial state for a graph to be well-behaved. Equation (2.6) then, just as in the case for choice-free graphs, results in passing the same number of data items in the reduced graph as in the non-reduced graph.

Rule 7 of definition 2.114 becomes: change an edge e connecting two operator nodes v and w (v ≠ w) by duplicating each edge e_o ∈ v•, n = #•v times, and connecting it to the origins of •v. Or formally, where •v = {e₁, ···, e_n} and v• = {e_{o_1}, ···, e_{o_m}}:
G' = (V, P_in, P_out, E', I, O, φ')   (2.6)
where (with ∀i, j: 1 ≤ i ≤ m, 1 ≤ j ≤ n: e_{o_i}^j ∉ E)
E' = (E \ {e}) ∪ {···, e_{o_i}^1, ···, e_{o_i}^n, ···}

φ'(e') = φ(e') if e' ∈ E; ((fst(φ(e_{o_i})) \ {(v, p_d)}) ∪ fst(φ(e_j)), {(w, p_d)}) if e' = e_{o_i}^j
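In its choice-free base form, rule 7 itself is equally direct. A sketch in the same toy encoding used earlier (a single data port p_d is assumed):

    def apply_rule7(edges, e, v, w):
        # drop the operator-to-operator edge e and, for every incoming
        # edge of v, add a new edge from its origins to w, so that the
        # precedence between input and output stays visible
        kept = [ed for ed in edges if ed != e]
        added = [(orig, frozenset([(w, 'pd')]))
                 for (orig, dest) in kept if (v, 'pd') in dest]
        return kept + added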

Figure 2.20 A non well-structured graph
Difficulties to extend the reducibility mainly arise from the dynamical operation of branch and merge nodes in combination with structures as for instance shown in figure 2.21. Since generalized data flow graphs are very expressive and non-restrictive, it is not possible to define a reduction scheme such that well-behaved(G) iff reducible(G). This is opposed to reducible flow graphs as for instance discussed in [56] and [75], since in those references the type of control structures is fixed and restricted.

2.7.3 Property preserving embeddings
As mentioned in the previous paragraph, it is very difficult to extend the reducibility of definition 2.114 to include all well-behaved graphs. However, in section 2.7.1 the point is already made that well-behaved embeddings preserve well-behavedness. Now some other useful theorems are stated.

Definition 2.116 Given a path π in a graph G and a state s. token-count: Path × State → ℕ is defined as:


token-count(π, s) = Σ {#s(v, p) | (v, p) ∈ V × P: (v, p) ∈ π}

Figure 2.21 Examples for non well-structuredness

Lemma 2.62
cycle(π) ∧ (∀v ∈ π: type(v) ∉ {branch, merge}) ∧ (∀e ∈ π: ¬choice(e)) ⇒
∀s': (∃σ: s →σ s'): token-count(π, s') = token-count(π, s)

Proof: If an operator node v ∈ π is enabled, then execution removes one 'token' from the input port p_i for which (v, p_i) ∈ π and puts it at the output port p_o for which (v, p_o) ∈ π. And because π is choice-free, execution of a node v ∉ π does not influence token-count(π). □
Lemma 2.62 is similar to a result in [35] where it is shown that the token count of cycles in marked Petri nets is constant.
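In code, token-count and the invariant of lemma 2.62 look as follows (a sketch; states map (node, port) pairs to lists of items, and the two-node cycle is hypothetical):

    def token_count(path_ports, s):
        # definition 2.116: total number of items on the ports of the path
        return sum(len(s.get(vp, ())) for vp in path_ports)

    path = [('v', 'in'), ('v', 'out'), ('w', 'in'), ('w', 'out')]
    s = {('v', 'in'): [1], ('v', 'out'): [], ('w', 'in'): [], ('w', 'out'): []}
    before = token_count(path, s)
    s[('v', 'out')].append(s[('v', 'in')].pop())   # fire operator node v
    assert token_count(path, s) == before          # the count is unchanged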

Corollary Given a graph G, and initial state s₀.
acyclic(G) ∧ ¬choice(G) ∧ safe(s₀) ⇒ safe(G)

Proof: Similar to lemma 2.62. □

Lemma 2.63
cycle(π) ∧ (∀v ∈ π: type(v) ∉ {branch}) ⇒
∀s': (∃σ: s →σ s'): (∀σ' ∈ pref(σ): token-count(π, update(s, σ')) ≤ token-count(π, s'))

Proof: By lemma 2.62, execution of operator nodes v ∈ π leaves token-count(π) constant. Execution of a merge node v ∈ π may increase token-count(π) when it 'chooses' the input port p for which (v, p) ∉ π. Execution of a node v ∉ π can only change token-count(π) by increasing it through a choice edge e ∈ π. □

Theorem 2.64 Given a (well-formed) data flow graph G and a path π of G.
live(G) ∧ cycle(π) ∧ (¬∃v ∈ π: type(v) ∈ {branch, merge}) ∧ (¬∃e ∈ π: choice(e)) ⇒ ¬well-behaved(G)

Proof: Since live(G), each node v of G is executed at least once. Therefore, a data item enters π at least once. Because token-count(π) cannot decrease, by lemma 2.63, the initial state of G can never be reached again. For a merge node, it is also assumed to be fired in such a way that π is entered from the 'outside'. Otherwise, this merge node could have been replaced by an operator node. □
This property might also be viewed as a well-structuredness property.
Now some properties for embeddings are given that extend lemma 2.57 to
well-behaved(G') ⇒ project(RG(G), {v}) = project(RG(G″), V_G')
where G″ = embedding(G, v, G') and project is the projection operator defined on state-transition graphs.
Lemma 2.65 Given well-formed graphs G and G', and a node v ∈ V. Let G″ = embedding(G, v, G').
well-behaved(G') ⇒ project(RG(G), {v}) ≤_RG project(RG(G″), V_G')

Proof: This is lemma 2.57 rephrased. □

Lemma 2.66 Given well-behaved graphs G and G', and a node v ∈ V. Let G″ = embedding(G, v, G').
In(G') = {v_init} ∧ Out(G') = {v_final} ∧ safe(v, p, RG(G)) ⇒
∀σ″ ∈ RG(G″): v_init and v_final alternate in σ″

Proof: Assume σ″ ∈ RG(G″) for which v_init σ' v_init ⊆ σ″ and v_final ∉ σ'. After first executing v_init, only nodes in G or G' are enabled. As long as v_final is not executed in G', nodes in G cannot enable v_init, by the assumption of safe(v, p, RG(G)). □

Lemma 2.67 Given well-behaved graphs G and G', and a node v ∈ V. Let G″ = embedding(G, v, G').
In(G') = {v_init} ∧ Out(G') = {v_final} ∧ safe(v, p, RG(G)) ∧ live(G) ⇒
∀σ″ ∈ RG(G″): ∃σ ∈ RG(G): σ″↾(V \ {v}) = σ↾(V \ {v})

Proof: This is a direct consequence of lemma 2.66 and the fact that for all σ″ ∈ RG(G″) with v_init σ' v_final ⊆ σ″ all the nodes of G' that execute in σ' are independent of the nodes of G that execute in σ'. Therefore, as anticipation shows, executing nodes of G' does not influence the order of executing nodes of G. For example, all nodes of G may be 'swapped' out of σ' without changing their order, which directly implies the lemma. □

Theorem 2.68 Given well-behaved graphs G and G', and a node v ∈ V. Let G″ = embedding(G, v, G').
#In(G') = 1 ∧ #Out(G') = 1 ∧ safe(v, p, RG(G)) ∧ live(G) ⇒ project(RG(G), {v}) = project(RG(G″), V_G')

Proof: Directly from lemmata 2.65 and 2.67. □

Corollary Given well-behaved graphs G and G', and a node v ∈ V. Let G″ = embedding(G, v, G').
#In(G') = 1 ∧ #Out(G') = 1 ⇒ (safe(G″, RG(G″)) iff safe(G, RG(G)) ∧ safe(G', RG(G')))

Proof: Directly from lemma 2.67 and the fact that all nodes of G and G', except for In(G') and Out(G'), are independent of each other. Thus firing of a node of G cannot lead to ¬safe(v', p, RG(G″)) for a v' ∈ G' and vice versa. I.e. ∀s″ ∈ RG(G″): ∃s_G ∈ RG(G), s_G' ∈ RG(G'): s″ = join(s_G, s_G').
This is even valid for the input nodes In(G') and the destinations of the output nodes O(G'), since ∀e ∈ I(G'): fst(φ(e)) = ∅, and thus all the input node-port pairs (v_init, p) that are defined by an e ∈ I(G') are defined by executing nodes in G. Although the destinations of v_final may be defined by nodes of G' and of G, possible shuffles are exactly those that also occur in RG(G). □
Note that the liveness of G is not required for this corollary, so that this theorem is also directly applicable to collapsing, since for safeness of G″, deadlock in G is of no importance, as long as all other states are safe.
Theorem 2.69 Given well-behaved graphs G and G', and a node v ∈ V. Let G″ = embedding(G, v, G').
#Out(G') = 1 ∧ safe(v, p, RG(G)) ∧ live(G) ⇒ project(RG(G), {v}) = project(RG(G″), V_G')

Proof: The proof follows along the same lines as that of theorem 2.68, since lemma 2.65 is still valid, and a lemma similar to lemma 2.66 states that all the v_init ∈ In(G') alternate in RG(G″) with the v_final ∈ Out(G'). The observation is again that all nodes in G and G' are independent of each other, except for the input nodes v_init ∈ In(G'). But for these it holds, too, as can be shown by a similar argument as for the last corollary. □

Corollary Given well-behaved graphs G and G', and a node v ∈ V. Let G″ = embedding(G, v, G').
#Out(G') = 1 ⇒ (safe(G″, RG(G″)) iff safe(G, RG(G)) ∧ safe(G', RG(G')))

Proof: Similar to the corollary of lemma 2.67. □
It is not possible to extend theorems 2.68 and 2.69 and the corresponding corollaries to well-behaved graphs G' that consume all inputs to compute each output of G' (e.g. if all paths in G' from I(G') to O(G') have one or more nodes in common). Although G' can in that case not partially evaluate and lead to an accumulation of data items at the other inputs, so that choice edges cannot introduce problems either, accumulation may occur on other output edges, as is illustrated by figure 2.22. In this figure, the subgraph G' can be evaluated more than once when a data item is put on edge e₁, leading to more than one data item on edge e₂.


Figure 2.22 Possibly non-safe embedding of G'
However, as the next theorem shows, the above 'inside-out' is valid. I.e. when the node v is enabled again in G, then all outputs of v must have been used.
Theorem 2.70 Given well-behaved graphs G and G', and a node v ∈ V. Let G″ = embedding(G, v, G').
(∀s ∈ RG(G): s(v, p) ≠ nil: (∀w ∈ succ(v): s(w, p) = nil)) ∧ safe(v, p, RG(G)) ∧ live(G) ⇒

project(RG(G), {v}) = project(RG(G″), V_G')

Proof: Similar to theorem 2.68 since, from the above discussion, if ∀s ∈ RG(G): s(v, p) ≠ nil: (∀w ∈ succ(v): s(w, p) = nil), then all outputs of v are needed to enable v again in G. And thus In(G') and Out(G') alternate in G″. □

Corollary Given well-behaved graphs G and G', and a node v ∈ V. Let G″ = embedding(G, v, G').
∀s ∈ RG(G): s(v, p) ≠ nil: (∀w ∈ succ(v): s(w, p) = nil) ⇒ (safe(G″, RG(G″)) iff safe(G, RG(G)) ∧ safe(G', RG(G')))
Note that theorems 2.64-2.69 are also directly the liveness counterparts of the above safeness corollaries.

The following theorem is also a direct consequence of the above.

Theorem 2.71 Given a graph G = G₁ ∥ ··· ∥ G_n.
∀i: 1 ≤ i ≤ n: well-behaved(G_i) ∧ safe(G_i) ⇒ safe(G)
This in fact proves, together with the corollary of lemma 2.62 and the well-structuredness theorem 2.61, the correctness of the scheduling in high level synthesis, since, as is shown in sections 3.2 and 3.3, synthesis partitions a (well-structured) graph into acyclic 'blocks' and should result in a safe graph (where a block is a (sub)graph G' with #In(G') = #Out(G') = 1).

Chapter 3

Applications

In this chapter, some examples and applications are given of data flow graphs and their analysis methods. This should illustrate the applicability of data flow graphs and the concepts defined in the previous chapters for the analysis and verification of data flow graphs.
The dining philosophers are used in section 3.1 to illustrate the power of the anticipation strategy as presented in section 2.6. In section 3.2 another example is given to illustrate that the flow graphs presented here do not impose any restriction except the real data dependencies. Also the application of some theorems is shown. High level synthesis is proved to be just a gradual graph transformation in section 3.3, i.e. the separation into a control graph and a network graph is not necessary. Also the minimal requirements are given for synthesis tools to generate safe systems. Sections 3.4 and 3.5 illustrate how other types of system specifications, which at first sight violate the data flow principle, might be viewed as, or mapped into, flow graphs. Section 3.6 describes a method to verify the correctness of flow graphs with respect to requirements that are not specified in the flow graph itself. Section 3.7 mentions some optimizations, and some useful extensions to the theory of generalized data flow graphs are hinted at in section 3.8.

3.1 Dining philosophers
In this section the well-known problem of the dining philosophers is discussed. It is shown how this problem can be modeled in the generalized data flow graph concept, and also the power of the anticipation strategy is illustrated.
The problem consists of n philosophers sitting around a table with a bowl of food. There are n forks, and a philosopher needs its right and left fork to eat. A philosopher can be in one of 4 states, which are traversed consecutively:
1. thinking
2. wanting to eat, i.e. picking up the forks
3. eating
4. releasing the forks
In the original problem, a philosopher first picks up its right fork, if possible, and then the left fork, if possible. Releasing the forks is also done in a particular order. It is clear that with n philosophers and n forks, there may be too few forks, i.e. the philosophers may starve.
Here also an alternative to the original problem of the dining philosophers is presented, namely the one in which the restriction of first picking up the right fork and then the left fork is removed. This results in an even larger (reachable) state space. But the anticipated reachable state space is shown to be smaller! This is because the latter problem has more freedom, and therefore the anticipation strategy has more possibilities to reduce the state space.

Figure 3.1 shows an implementation of one philosopher-fork pair i (1 ≤ i ≤ n) in terms of data flow graphs. The subgraph consisting of the nodes p_i and q_i models a philosopher, and the subgraph consisting of the branch node f_i1 and the merge node f_i2 models a fork. The node p_i models the transition of the 'thinking' state to the 'wanting to eat' state; it puts the value left on the edge connected to the port labeled with l, and a value right on the edge connected to the port labeled with r. The node q_i models the 'eat to think' state transition, which also puts values left and right on its outgoing edges to indicate the release of one fork as a 'left' and the other as a 'right' fork. The branch node f_i1 represents the picking up of the fork. If the head of the stream on the control port has the value left, then the branch will branch to the output labeled with l, indicating that this fork is then picked up as the left fork of a philosopher; likewise for a 'right' fork. Similarly, the merge node f_i2 models the releasing of the fork. The control value indicates whether this fork should be released as a 'left' fork or as a 'right' fork. Note the choice of the control edges, because a single fork is desired by two philosophers. The edges connected to the ports labeled with r of the philosopher nodes p_i and q_i are connected to the control ports of the branch and merge node of fork i+1, i.e. they are like the edges labeled with r' which originate from philosopher i-1. The edge connected to

the port r of the branch node f_i1 is also connected to an input port of node q_{i-1} to indicate that this fork is picked up as the 'right' fork of philosopher i-1. The value domain of the outgoing edges of the nodes p_i and q_i is {left, right}; for all the other edges, a one-value domain is sufficient.



Figure 3.1 Data flow graph of a dining philosopher and a fork
This is already an optimized version of the problem, since only the required transitions are shown, and not the separate states. For instance, a philosopher is 'thinking' when the input of the node p_i is non-empty; a philosopher is 'eating' when both inputs of the node q_i are non-empty; a fork is picked up as a 'left' fork if the input l of the merge node f_i2 is non-empty.
However, the model can be optimized even further for purposes of analysis. It can be deduced that the control ports of the merge nodes f_i2 are always safe, since the value on the control port points to the data input port that carries the 'fork' and the data input ports are always exclusive. This may be proved in a stepwise refinement technique as in the well-structured rewriting system. Therefore the merge nodes may be eliminated, resulting in the reduced data flow graph of figure 3.2.


Figure 3.2 Reduced flow graph of dining philosopher problem
For this data flow graph, values on the edges have no importance, i.e. only the presence of a 'data' item matters. Also the origin of the choice control edge determines which branch will be taken, instead of a value computed in node p_i, which may differ each time this node is executed. This all means that the conflict modeled by the branch node is static, i.e. it is not really determined dynamically by unknown values. Therefore, the flow graph of figure 3.2 may be mapped into an equivalent Petri net [110], as shown in figure 3.3, where the two different places marked A, respectively B, are the same. This transformation is done mainly for two reasons:

2. With an initial state consisting of a token for the incoming edge of each p_i transition, the data flow graph is 2-bounded, whereas the corresponding Petri net is safe. Instead of a number of 2ⁿ states, i.e. all the possible states that may occur after firing the n p_i nodes (since the control ports of each of the n forks can be in 2 states), the corresponding Petri net has just 2n of such 'initial' states. This is due to the fact that each control edge of the n forks is split into two places, and each place contains at most 1 token.


Figure 3.3 Equivalent Petri net for the dining philosopher problem
The entire reachable state space is of course exponential in n, the number of philosophers or forks. Next the anticipation algorithm of page 81 is shown to result in a state space that is quadratic in n for the original problem and even linear for the problem where the order of picking up forks is unrestricted. This shows the power of the anticipation technique, which is crucial for analysis techniques based on state space enumeration.
The dashed lines in figure 3.3 model the sequencing of picking up the 'right' fork before the 'left' fork. The initial state consists of tokens in the fork places and in the input places of the 'think to eat' transitions (those marked with A and B in figure 3.3) of all the n philosophers. The firing of just the n 'think to eat' transitions p_i results in a graph of 2ⁿ states. In total, the reachability graph has no more than 4ⁿ states, since each philosopher can be in 4 different states, and it can be proved to be a safe Petri net. The total number of states is somewhat less than this maximum number, since deadlock can occur.
Applying the anticipation algorithm with multiple firing and the reduction for complete, or maximal, enabled conflictive sets (see page 83) to the Petri net of figure 3.3 can result in the reachability graph shown in figure 3.4. After firing the n initially enabled conflict-free 'think to eat' transitions p_i, all the conflictive fork transitions f_i are enabled, i.e. all philosophers want to pick up their 'right' fork. Since only the picking up of the 'right' forks is enabled, no maximal conflictive set



Figure 3.4 Anticipated reachability graph for the serialized dining philosopher Petri net
{f_j, g_j} (Definition A.16) is fully enabled. Thus no anticipation can take place, and the picking up of all the n 'right' forks f_i must be executed separately. After firing one such transition f_i, there is exactly one fully enabled maximal conflictive set {f_j, g_j}, namely {f_{i-1}, g_{i-1}}. Thus philosopher i has now picked up the i-th fork as its 'right' fork f_i, and the (i-1)-th as its 'left' fork g_{i-1}. After picking up its 'left' fork, the philosopher can complete its sequence of eating. But a similar situation as before results, but now for the left neighbor of philosopher i, when this philosopher i-1 picks up its 'right' fork f_{i-1}. And, of course, when all philosophers have picked up only their 'right' forks, the starvation of the philosophers starts. The constraint to solve this problem is therefore easy: simply do not allow all philosophers to pick up their 'right' forks.
When the restriction on the order of picking up the forks is lifted, all n maximal conflictive sets are fully enabled after the firing of the n initially enabled 'think to eat' transitions p_i. Only one pair needs then to be executed. The same situation remains after firing such a pair, leading to all states having two successors, until a 'right' and 'left' fork have been picked up by a philosopher. After this, the respective philosopher can complete its sequence. Straightforward application of the anticipation algorithm of page 81 may then lead to the reachability graph of figure 3.5.


Figure 3.5 First anticipated reachability graph for the non-serialized dining philosopher Petri net
This reachability graph can be reduced even further. Instead of just executing the 'next' pair of enabled conflictive fork transitions, just fire that fork pair that will result in a philosopher having picked up both its forks f_i and g_{i-1}. This leads to the reachability graph of figure 3.6.
From figure 3.4, it is straightforward that the total number of states in the anticipated reachability graph is equal to n(Σ_{i=1}^{n-1} 3) + 1 + 2 = 3n² - 3n + 3. The term '1' in this sum comes from the terminal state in the reachability graph; the term '2' is due to the 'initial' states s₀ and s₁. Similarly, the number of transitions is equal to n((Σ_{i=1}^{n-1} 4) + 1) + 1 = 4n² - 3n + 1. For the complete reachability graph, i.e. the reachability graph without applying any reduction at all, the following recurrent equation can be derived for the number of states s_n, for a given value of n, the number of philosophers (n ≥ 2, s₀ = 2, s₁ = 3):
s_n = 3s_{n-1} + 2s_{n-2}   (3.1)



Figure 3.6 Second anticipated reachability graph for the non-serialized dining philosopher Petri net
For the number of transitions t_n the following recurrent equation is valid (n ≥ 2):
t_n = n(s_n - d_n)   (3.2)
where d_n = 3d_{n-1} + 2d_{n-2} with d₁ = 1 and d₀ = -½. The number of states is approximately (3.5)ⁿ.
For the problem in which the order of picking up the forks is not restricted, it follows straightforwardly from figure 3.6 that the anticipated reachability graph has a number of 2((Σ_{i=1}^{n-1} 3) + 1) + 2 = 6n - 2 states. The term '1' is again from the terminal state. Note that for this problem two different terminal states exist; one is reached by picking up all 'left' forks, and the other one by picking up all 'right' forks. The term '2' is again the 'initial' set of states s₀ and s₁. The number of transitions is equal to 2((Σ_{i=1}^{n-1} 4) + 1) + 1 = 8n - 5. Similar to equations (3.1) and (3.2), the following

recurrent equations can be derived for the number of states and transitions in the non-reduced reachability graph for this case (n ≥ 2, s₀ = 2, s₁ = 4):
s_n = 4s_{n-1} + s_{n-2} and t_n = n(s_n - d_n), but now with (d₁ = 0)

d_n = 4d_{n-1} + 1
The number of states is approximately (4.25)ⁿ. A numeric overview of the number of states of the complete reachability graph of the Petri net, as well as of the anticipated reachability graphs, is given in tables 3.1 and 3.2 for different values of n. Also the time is included that it took for a Petri net analysis tool written in CommonLisp [121] to compute the anticipated reachability graph.¹
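The recurrences and closed forms above are easy to cross-check with a small script (a sketch; the printed values for n = 4 are those of the corresponding row of table 3.1):

    def serialized_sizes(n_max):
        # equations (3.1) and (3.2): s0 = 2, s1 = 3, d0 = -1/2, d1 = 1
        s, d = {0: 2, 1: 3}, {0: -0.5, 1: 1.0}
        for n in range(2, n_max + 1):
            s[n] = 3 * s[n - 1] + 2 * s[n - 2]
            d[n] = 3 * d[n - 1] + 2 * d[n - 2]
        return {n: (s[n], int(n * (s[n] - d[n]))) for n in range(2, n_max + 1)}

    for n, (states, trans) in serialized_sizes(5).items():
        # anticipated sizes from the closed forms 3n^2-3n+3 and 4n^2-3n+1
        print(n, states, trans, 3 * n * n - 3 * n + 3, 4 * n * n - 3 * n + 1)
    # n = 4 prints: 4 161 532 39 53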

Table 3.1 Number of states of the original dining philosopher problem

  n   #states       anticipated   #transitions   anticipated   time (s)
  2   13            9             22             11            0.3
  3   45            21            111            28            0.8
  4   161           39            532            53            1.9
  5   573           63            2365           86            3.8
  6   2041          93            10110          127           7.8
  7   7269          129           42007          176           13
  8   25889         171           170984         233
  9   92205         219           685089         298
 10   328393        273           2711090        371           41
 20   1.1 * 10^11   1143          1.8 * 10^12    1541          505

1. Runtimes are on an Apollo DN2500 with 16MB of memory.

Table 3.2 Number of states of the non-serialized dining philosopher problem

  n   #states    anticipated   #transitions   anticipated   time (s)
  2   18         10            34             11            0.3
  3   76         16            213            19            0.7
  4   322        22            1204           27            1.3
  5   1364       28            6375           35            2.1
  6   5778       34            32526          43            3.3
  7   24476      40            161329         51            4.8
  8   103682     46            783720         59            6.7
  9   439204     52                           67            9.0
 10   1860498    58                           75            13
 20              118                          155           90.2

In the analysis of problems with large state spaces, (memory) space can be traded off for time. For the (original) dining philosopher problem, current verification tools based on BDDs are limited to approximately n = 50. It is of course not fair to use a coding of 1 bit per state as advocated in [62, 63]. But assume a moderate computer with only 16MB memory. Since the Petri net is safe, and since each philosopher-fork pair consists of 8 places, exactly one byte per pair is needed to store a state. When the complete reachability graph must be kept, space is needed for the 4n² - 3n + 1 transitions as well. Assume that a state is a structure of the token distribution, n bytes, and a pointer to the outgoing transitions, which takes 4 bytes on conventional Unix machines; and a transition is a structure consisting of the label of the transition, n/2 bytes (because each philosopher-fork pair consists of 4 transitions), and of a pointer to the destination state. For this data structure, the problem can be analyzed for up to 140 philosophers.
A more efficient encoding needs only 3n bits to store a state, since the input places of a conflictive fork transition are each other's complement when the philosopher is not thinking. Also the contents of the fork place can be extracted from such an encoding. Note that even the two transitions p_i and q_i may be collapsed into one transition. Then only 2n bits are needed to encode a state. The number of states for this Petri net is then reduced to n(Σ_{i=1}^{n-1} 2) + 1 + 1 = 2n² - 2n + 2 states and n((Σ_{i=1}^{n-1} 3) + 1) = 3n² - 2n transitions. Then up to about 2000 philosophers fit in memory.
Of course, for parameterized problems like this one, inductive techniques should be used, e.g. [79], but these are difficult to use in an automatic way.
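To make such state counts concrete, the reachable markings of a safe Petri net can be enumerated by a plain breadth-first search. The sketch below uses a plausible, hypothetical encoding of the non-serialized philosopher net (all place and transition names are assumptions; the exact net of figure 3.3, with its 8 places per pair and the dashed serializing edges, is not reproduced, so the printed counts need not coincide with the tables):

    from collections import deque

    def reachable(initial, transitions):
        # markings of a safe net represented as frozensets of marked places
        start = frozenset(initial)
        seen, queue = {start}, deque([start])
        while queue:
            m = queue.popleft()
            for pre, post in transitions:
                if pre <= m:                    # transition enabled
                    m2 = (m - pre) | post       # fire it
                    if m2 not in seen:
                        seen.add(m2)
                        queue.append(m2)
        return seen

    def philosophers(n):
        ts = []
        for i in range(n):
            j = (i + 1) % n
            ts.append(({'tA%d' % i, 'tB%d' % i},       # p_i: think to eat
                       {'wr%d' % i, 'wl%d' % i}))
            ts.append(({'fork%d' % i, 'wr%d' % i},     # f_i: fork i as right
                       {'hr%d' % i}))
            ts.append(({'fork%d' % i, 'wl%d' % j},     # g_i: fork i as left
                       {'hl%d' % j}))
            ts.append(({'hr%d' % i, 'hl%d' % i},       # q_i: eat to think
                       {'tA%d' % i, 'tB%d' % i,
                        'fork%d' % i, 'fork%d' % ((i - 1) % n)}))
        init = {p % i for i in range(n) for p in ('tA%d', 'tB%d', 'fork%d')}
        return init, [(frozenset(a), frozenset(b)) for a, b in ts]

    for n in (2, 3, 4):
        print(n, len(reachable(*philosophers(n))))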

3.2 Loop example
In this section, a data flow graph for an algorithm with an iterative statement is given. With this example it is illustrated that the data flow graph concept presented here does not impose any restriction on the graph but the data precedences given

by the algorithm. Also it is shown how theorem 2.52 can be applied to verify whether the flow graph is safe when only the anticipated reachability graph is constructed. The constraint generation of section 3.6 is illustrated for this example, too.

Consider the algorithm of figure 3.7.

    read (a);                 equivalently:    read (a);
    n := 10;                                   n := 10;
    for i := 1 to n                            i := 1;
        a := a + a;                            while (i <= n)
    end;                                           a := a + a;
    write (a);                                     i := i + 1;
                                               end;
                                               write (a);

Figure 3.7 Algorithm of loop example
A data flow graph for this program is shown in figure 3.8, where the control edge from the ≤ node to all the branch and merge nodes is not drawn. This data flow graph is constructed from the algorithmic specification as follows. Branch and merge nodes are needed for algorithmic constructs like if ... then ... else, case, and loop constructs like while ... do, for ... do. Loops are constructed by a feedback edge between the branch and merge nodes. Note that a branch-merge pair is used in the data flow graph for every variable (even for the constants) used or defined in the repetitive statement, both in the body and in the test of the statement. Since there also exist edges from the merge nodes to the subgraph that forms the test of the statement (here just the ≤ node), the test can be enabled again, with updated values for the test variables, when the loop is traversed. For a particular repetitive statement, there is one edge that connects to the control ports of all the branch and merge nodes, and which originates from the part of the data flow graph that constitutes the test. The merge node is the entrance of the loop, whereas the branch node is the exit. As long as the loop must be traversed, data is fed back from the branch node to the merge node of the loop. When the loop is left, the merge node of the loop must be informed that, the next time, it should expect data from outside the loop, instead of from the feedback edge. Since the value on the control port of the merge nodes is set to false when the loop is left, which is also the case in the initial state, proper execution of the loop is assured. In the initial state, the destination port of the input edge of the graph, which in a sense is a sequence edge, also contains a value.
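The firing discipline of such a branch-merge loop can be simulated directly. A minimal sketch for the branch-merge pair of the variable a alone, with the test results supplied as a precomputed control stream (all names are hypothetical; false on the merge control means 'take the external input', as in the initial state just described):

    from collections import deque

    def run_loop(a_in, tests):
        ext, fb = deque([a_in]), deque()      # external input and feedback edge
        ctrl_merge = deque([False] + tests)   # initial false, then test results
        ctrl_branch = deque(tests)
        while True:
            take_fb = ctrl_merge.popleft()               # merge node fires
            a = fb.popleft() if take_fb else ext.popleft()
            if ctrl_branch.popleft():                    # branch node fires
                fb.append(a + a)                         # loop body: a := a + a
            else:
                return a                                 # exit to the output edge

    print(run_loop(3, [True] * 10 + [False]))   # 3 * 2**10 = 3072

Note that one false value remains on the merge control queue when the loop exits, which corresponds to the reinitialized control value described above.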

The precise construction of a data flow graph of this form from a program written in an imperative language or in an applicative language can, for instance, be found in [38, 123, 131].

Figure 3.8 Data flow graph of loop example

But note that these constructions lead to pure data flow graphs, i.e. graphs without choice edges. Since pure data flow graphs have no choice edges, there is exactly one behavior for such a graph, and it is not that difficult to prove that this behavior is semantically equivalent to the behavior of the program for which the data flow graph is constructed [38]. This result is also valid for the generalized data flow graphs presented here, since the class of pure data flow graphs is a proper subset. The flow graphs constructed in these ways are even well-structured! The following, informally stated, theorems are therefore straightforward.
Theorem 3.1 Generalized data flow graphs are as expressive as imperative programming languages without non-local jumps, i.e. without unrestricted go-to statements, and recursion.

Theorem 3.2 Generalized data flow graphs are as expressive as applicative programming languages without function application, i.e. without higher-order apply operators.

Theorem 3.3 The data flow graphs representing a program written in an imperative programming language without non-local jumps and recursion, or in an applicative programming language without function application, are well-behaved.

Theorem 3.4 From a well-structured data flow graph, a semantically equivalent program in an imperative or an applicative programming language can be deduced.
An implementation for the last theorem is discussed in [43]. Fundamentally, this 'decompilation' of well-structured flow graphs follows the lines of the rewrite system in section 2.7.2 (Definition 2.114).
Recursion and function application are not allowed for the flow graphs in the above theorems, because there does not exist a special node in the data flow graph concept presented here for function application. If an appropriate apply operator exists, higher-order functions together with their closures can also be dealt with correctly. In principle, such a node acts exactly like an operator node. An apply node would have several data input ports that carry the values of the arguments of a function. The function itself is another input of the apply node. This node then outputs the value of this function applied to the arguments. This means that D → D is included in the domain of values on the ports (Definition 2.61), and a node may then deliver a (higher-order) function instead of a single value.
However, implementation of higher-order functions into hardware has never yet been considered by designers of silicon compilers, since the type of application has no need for it either. Also no hardware description language allows for it. Of course hardware implementations like [54] for the evaluation of functional languages allow for it, but such implementations are, like microprocessors, instruction based and general purpose.
Also synthesis of recursive data flow graphs is not possible because of finite hardware, except for the class of languages in which the depth of the recursion is bounded by a given number. But even for this class of programs, recursion is not adequate. High level synthesis will always have to transform the recursion into an iteration. Whenever this is not possible, such a graph cannot be implemented by hardware.

In a sense, this is a criticism on SIL [77], where recursion is the only allowed type of repetition. For a given SIL graph it has to be checked that the recursion can be transformed into an iteration, otherwise no real (bounded) hardware can be designed for it. In [77, 78] only SIL graphs derived from hardware description languages or other specification languages like SILAGE [58] and ELLA [106] are considered, that do not allow recursion. Therefore the concept of standard nodes is defined in SIL to keep the hardware synthesis tools as simple as possible.
Imperative languages are mostly viewed as sequential languages because of the use of the semicolon as a statement separator, also in 'time'. The transformation of a program written in such a language consequently leads to many unnecessary sequence edges. A full, global data flow analysis can be performed to obtain the maximal parallel graph representation [131]. But this is identical to applying a transitive reduction [6, 55] on the flow graph. By extending the definition of transitive reduction in the natural way for hyper-graphs, the following theorem, for graphs for which the transitive reduction is uniquely defined, is again straightforward (where RG(G, s) represents the reachability graph of a graph G with initial state s (Definition 2.73)).
Theorem 3.5 For a well-formed graph G and an initial state s, RG(transitive-reduction(G), s) = RG(G, s).

Corollary For a well-formed graph G and an initial state s, Behavior(transitive-reduction(G), s) = Behavior(G, s).
But note that the above mentioned transformation from a program to a data flow graph is different from flow graph constructions as can be found in conventional (software and hardware) compilers. The latter are based on basic blocks [8]. A basic block is an acyclic (pure) data flow graph, many times even without conditionals. Loops in the flow graph then have one entry point and one exit point, and the body is then a basic block (or further partitioned in case of nested loops). This is identical to the concept of reducible flow graphs [56]. This can be modeled by a single branch and a single merge node for each loop, and in between are the loop test and the loop body. Usually, the test and the body are executed consecutively as separate bodies (or the loop is left). I.e. the possible execution behaviors are restricted and also the number of possible loop control schemes is fixed.
In the method described here, a branch-merge pair is established for each variable occurring in the loop. For the example of figure 3.7, no direct relation exists between the loop variable i and the variable a. Therefore there is no edge, or precedence relation, drawn in the data flow graph of figure 3.8 between the subgraphs
for these two variables. It is therefore allowed, according to the semantics of generalized data flow graphs, that first the test variable is updated many times (and for that reason, its values are accumulated on the control ports of the branch-merge construct of the variable a). Then this test variable (or its location in store) can be used again somewhere else in the program. Hereafter the loop of the variable a can be executed. Also many other 'interleavings' of updating the test variable and the loop body variable are allowed, all yielding the same result. The only data dependency of the variable a on the variable i is that at any moment a is not updated more often than i. This is the correct behavior of the loop. Compilers and synthesis tools should use this maximal choice of parallelism to make the optimal decisions for scheduling and allocation.
It is of course possible to lay a much smaller burden on the complexity of the synthesis phase by choosing to restrict the loop semantics to, for instance, lockstepping, i.e. full synchronization of, the entrance of all the branch-merge constructs of a loop. This can be modeled in the data flow graph by inserting some additional (sequence) edges between the corresponding branch and merge nodes.
The initial state of the data flow graph of figure 3.8 consists of false values on the control ports of the merge nodes, and of a value on the input edge of the graph. Because of the almost unrestricted semantics, the number of possible execution sequences is very large, all having the same behavior. The anticipated reachability graph, with multiple firing, consists of only one path and is given in figure 3.9. The merge nodes are labeled m_i; the branch nodes b_i; the two + nodes are labeled +_i, just as the two 1 nodes are labeled 1_i. All other nodes are labeled by their function. The subscripts i are numbered starting with 1 and continuing from left to right as they appear in figure 3.8. The loop is traversed as many times as is suggested by the algorithm of figure 3.7.
Although the reachability graph itself is safe, it can be transformed in such a way that the control port of the merge node m₄ and of the branch node b₄ is non-safe if the loop is traversed more than once. Note that the other edges of the cyclic subgraph are always safe. The non-safeness can be proved by propagating m₄ (in the transition of s₅ to s₆) down through s₇ and s₈. The first propagation is allowed because independent(m₄, m₂), and the second propagation is allowed because m₄ is only a successor of ≤ and not a predecessor; therefore m₄ is already enabled before ≤ is executed and will therefore be multiply enabled after firing of ≤. Similarly, b₄ can be propagated downwards through the ≤ node. This is an example of the application of theorem 2.52.


Figure 3.9 Anticipated reachability graph for loop example
Application of the constraint generation of section 3.6 shows that, for instance, by adding a sequence edge between the merge node m₄ and the operator node ≤, the flow graph is safe. This is the same result as predicted by theorem 2.71. This latter theorem is of importance when the data flow graph of figure 3.8 is scheduled in a silicon or a software compiler. In principle, scheduling is nothing else than a partitioning of the data flow graph, since a scheduler introduces state transitions in the controller. As is also discussed in section 3.3, those state transitions can also be modeled in our data flow graph, by introducing sequence edges from all nodes of a state to all nodes of the next state. However, this can also be modeled by introducing an extra (dummy) node on the state boundaries that has incoming edges from the (last) nodes in the state and outgoing edges to the (first) nodes of the next state. The essence of a scheduler with respect to loops is to break them at (at least) one point. Each partition is then an acyclic data flow graph. When the data flow graph is built from a behavioral description language, constructed as in, for example [38, 123, 131], as the graph of figure 3.8 is, all these partitions are even safe and well-behaved blocks. Therefore, by theorem 2.61 and theorem 2.71, the entire data flow graph is safe [69]. Thus it can be seen that the data flow graph concept presented here does not impose any restriction on the scheduler. The only constraint is to break loops at some place, but it is of no importance where the loop is broken, or whether all loops should be broken at the same place (for instance at the start or the end of the loop).
This leads also to another conclusion: the optimal schedule can be determined efficiently. The schedule with the minimum number of state transitions in the controller is the schedule in which each loop is broken at one place. The schedule with the minimum number of registers, i.e. with the minimum number of back edges, can be determined by computing the maximal spanning forest of the flow graph. This can be computed in O(|E|) [7].
The complete reachability graph of this example in which the firing of the merge nodes and the branch nodes is lockstepped, i.e. synchronized, is shown in figure 3.10. This reachability graph is safe. The complete reachability graph in which only the branch nodes must be fired in lockstep has already 46 states for each iteration. This shows again the power of the anticipation strategy.

Figure 3.10 Reachability graph of loop example in full lockstep
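The back edges just mentioned can be found with an ordinary depth-first search. A sketch (plain DFS spanning forest; the maximal spanning forest of [7], which actually minimizes the number of back edges, is not implemented here):

    def back_edges(succ):
        # succ: dict mapping each node to its list of successor nodes
        color, back = {}, []
        def dfs(u):
            color[u] = 'grey'
            for w in succ.get(u, ()):
                if w not in color:
                    dfs(w)
                elif color[w] == 'grey':       # w is on the DFS stack
                    back.append((u, w))        # a loop-breaking edge
            color[u] = 'black'
        for u in succ:
            if u not in color:
                dfs(u)
        return back

    # the loop of figure 3.8 collapsed to merge -> test -> branch -> merge:
    print(back_edges({'m': ['t'], 't': ['b'], 'b': ['m', 'out'], 'out': []}))
    # prints [('b', 'm')]: breaking the loop there implies one register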

3.3 High level synthesis
As this section is meant to show that high level synthesis tools may release some of the (intuitive) restrictions imposed on the data flow graph concept, and also to show the even larger expressiveness of generalized flow graphs, the following notation is used:
Notation 3.1 The phrase normally is intended to denote 'in current high level synthesis systems', e.g. [90, 123].
High level synthesis is the process of mapping a system specification written in a high level description language, i.e. a data flow graph, into a system with the same behavior, but specified at the structural, or register-transfer, level, i.e. written in a hardware description language. Normally, high level synthesis can be roughly divided in two subtasks:


1. Scheduling: a mapping of the specification into time
2. Allocation: a mapping of the specification into space, or hardware
Scheduling results in a control graph, while allocation yields a network graph. The network graph is identical to the data path and can be implemented straightforwardly in hardware. The control graph is needed because in general a complete graph cannot be mapped into synchronous hardware that completes the function of the graph in one clock cycle. Also such a system may be too fast, i.e. the timing requirements of the system under design are less demanding. In that case, a slower solution with less hardware is assumed to be a better solution. The main goal of the allocation phase is to minimize the necessary amount of hardware. Of course, these two tasks are related to each other, since minimizing the amount of hardware usually lengthens the time for the system to complete; and faster solutions tend to use more hardware.
State transitions in the control graph may also be caused by the conditional statements in the original specification. This leads to the conclusion that the control graph is just a partitioning of the initial flow graph. In many synthesis systems, such a partitioning is already done to a great extent in the initial specification! Namely, the specification is decomposed in basic blocks, and these basic blocks are further scheduled separately. Consequently many optimizations are not possible any more, for instance sophisticated schemes to implement the control of loops (see section 3.2 and [69]).
Note that the decomposition in basic blocks can be modeled in the flow graph concept presented here. The decomposition is identical to a mapping of the merge nodes and branch nodes (of a well-structured loop) onto each other (cf. the rewriting system of definition 2.114), which is equivalent to the introduction of one (de)multiplexor in the final network. Also all other restrictions induced by high level synthesis tools can be represented, mainly by introducing sequence edges.
In this section, it is shown that the division into a control graph and a network graph is not really necessary, since scheduling and allocation are proved to be graph transformations on the data flow graph. If a control graph is wanted explicitly, it can be constructed by abstracting from the subgraph containing all nodes that are scheduled in the same time slot, i.e. by collapsing the graph between the storage nodes into one big (operator) node (see also section 3.2). The control graph is a partitioning of the data flow graph into such blocks. Instead of devising a 'language' to describe how the modules in the network graph, i.e. nodes in the flow graph, are controlled by the control graph, the control is still explicit in the transformed flow graph.
Normally, because of the initial decomposition in basic blocks, there are no branch and merge nodes in the control and network graphs. Network generation is considered as a special task, because (de)multiplexors appear without further motivation. If control is made explicit in the flow graph, the (de)multiplexors are just the branch and merge nodes that are left, i.e. optimized, in the transformed flow graph. It is assumed to be good practice to build a safe network graph. Then each edge can be implemented by an ordinary wire, instead of a buffer or a queue.
Although it is not always fully realized by the designers of synthesis tools, reducible flow graphs [56, 126], i.e. graphs decomposed into basic blocks, are inherently safe (theorems 2.61 and 2.71). This is mainly of interest for the loops in the graph. But theorem 2.71 shows, as is also proved in [69], that safeness is even guaranteed for a larger class of graphs, namely for graphs that are partitioned into well-behaved, safe graphs. With this theorem, the number of possible control schemes is much larger and more optimizations may be found [44, 123].
The result of high level synthesis should be that the scheduled and allocated system has the same behavior as the initial specification, except that the number of possible execution sequences is reduced. This means that the resulting system is more 'serialized', i.e. sequence edges are introduced. In this way, it fits into the ordering suggested in section 3.8.1. Deadlock should, of course, not be introduced, i.e. scheduling must be a liveness preserving graph transformation. For normal flow graphs this means that scheduling should not introduce cyclic constraints by forming new loops in the flow graph. Note that allocation may introduce loops, but those should always be 'controlled', i.e. they must contain a break, implying registers. Concluding, formalizing the intuitive ideas of what high level synthesis in principle is, and what the, also intuitive, assumptions on the result are, may yield new optimizations and solutions, especially when using the least restrictive and most explicit formalism. In the following two sections, scheduling and allocation are formalized and proved to be graph transformations on generalized data flow graphs. From this, all the above mentioned claims can be concluded.

3.3.1 Scheduling
Two types of scheduling can be distinguished. One is the direct inclusion of sequence edges in the flow graph in order to restrict the parallelism, by reducing the number of possible execution sequences. This does not introduce (external) control, since only some precedence relations are added.
The other type of scheduling introduces a separate controller to steer the execution of the network, or equivalently the flow graph. This type of scheduling maps a node of the data flow graph onto a state in a control graph, which is a finite state machine. All nodes that are mapped onto the same state in the control graph are executed in the same clock cycle. If the timing requirement is such that the flow graph can entirely be scheduled in one clock cycle, the control graph may contain only one state. Otherwise, the flow graph has to be partitioned into subgraphs, and a clock cycle is allotted to each subgraph. Two reasons exist to introduce a state in the control graph:
1. because of a subgraph in the data flow graph having a delay larger than a clock tick
2. because of a branch of control in the data flow graph
The first case will yield a chain of states in the control graph, whereas in the second case, a state will have more than one next state and causes transitions as in figure 3.11. Note that in this case, the choice of a transition is determined dynamically by a value in the flow graph.

Figure 3.11 Controlled transition in the control graph

Normally, only operator nodes are scheduled onto states, while the branch and merge nodes are not considered. However, transitions of type 2, and their counterpart of a state having more than one incoming transition, will lead to (de)multiplexors in the network (graph). These (de)multiplexors have the same function as the branch and merge nodes in the data flow graph. The schedule normally does not mention the control of these (de)multiplexors (which would mean a labeling of the transitions in the control graph); it has to be (re)generated by the mapping task to a complete network. This mapping is to synthesize one hardware circuit from both the data path(s) and the controller. Often, this mapping is not trivial, since it implies an (implicit) optimization by the combined allocation and scheduling. An example of a schedule with such an implicit control graph is given in figure 3.12. In this example, the nodes v2 and v3 are both mapped to state s1 since they are mutually exclusive. In the final network, a demultiplexor should be allocated.

When the branch and merge nodes of the data flow graph are also incorporated in the schedule, i.e. the control graph also includes branch and merge nodes, the control is made explicit. In that case the mapping from the control graph to the controlling network is a one-to-one relation. The optimization of the number of (de)multiplexors in the mapping of the control graph to the network is then identical to an optimization of the number of the branch and merge nodes in the data flow graph. Such an optimization can for instance be performed by 'dragging' branch and merge nodes forward or backward.

Figure 3.12 Implicit control graph

When the control is made explicit in this way, transitions of type 2 reduce to transitions of type 1. Thus a schedule induces a simple partitioning of the data flow graph, since the control is represented by branch, respectively merge, nodes in the control graph. For the example of figure 3.12, making the control explicit yields the control graph of figure 3.13. It also shows a transformed, but equivalent, data flow graph for this schedule. Note that the nodes v2 and v3 are both executed in the final network, whereas they are mutually exclusive in the flow graph of figure 3.12.

Note that the schedule is even correct when the control edge originates from v1. The formalization of the scheduling task and its induced partitioning are given in the following paragraphs.

Definition 3.2 A normal schedule is a function t: V → ℕ.



Figure 3.13 Explicit control graph

Note that in this way all the occurrences of a node in a loop in the flow graph, i.e. for each loop iteration, are mapped onto the same state. When this is not desired because of possible further optimizations (for example when the even and odd iterations should be different), this can be accomplished by a graph transformation, e.g. a loop unrolling. In this way, the schedule satisfies definition 3.2. This is even the case for more elaborate loop implementation schemes. It is strange that, if this is desired, it normally is not modeled explicitly in the flow graph. Instead this information is (re)generated mysteriously, out of the blue, when synthesizing the complete final network in hardware. It is always possible to find a graph transformation that makes this type of control explicit.

The advantage of scheduling one node into more than one state is illustrated by an example (page 57) in [123]. This reference uses the data flow graph concept described here, but without choice edges. When a node is executed under different, i.e. mutually exclusive, conditions, such a one-to-many schedule is allowed. However, when this implicit control is made explicit (i.e. changing the statement 'while c do b' into the statement 'if c then repeat b until not c'), the simple schedule of definition 3.2 is again valid.

Definition 3.3 A normal schedule is proper iff ∀v, w∈V: w∈succ(v) ⇒ t(v) ≤ t(w).

A proper schedule means that a successor node cannot be scheduled before the node itself, which would result in a deadlock situation. From definition 3.3 it is also clear that in this way cyclic graphs can never be scheduled, unless all nodes constituting a cycle are mapped to the same state. Therefore, loop constructs are already broken in many high level synthesis systems, i.e. the control scheme for such a loop is fixed, leaving only acyclic graphs to be scheduled. An example of such a fixed control scheme is to break a loop in a flow graph at all the merge nodes. First, the test of the body is computed completely, and then the body of the loop. In this way, each iteration will be completed before the next one can start. Normally, flow graphs are decomposed into basic blocks and each basic block is scheduled separately. The following shows that this partitioning into basic blocks is not necessary, and the loop control scheme therefore need not be fixed.

Notation 3.4 Seq(ℕ) is the domain of ordered sequences of integers, i.e. for a sequence ⟨d1, d2, ···, dn⟩, ∀i: 1 ≤ i < n: di ≤ di+1. For a sequence t, t(i) denotes the i-th element.

Definition 3.5 A generalized schedule is a function t: V → Seq(ℕ).

In this way, a schedule is a mapping from nodes to time slots instead of states. This may, but not necessarily, be viewed as that a node is still mapped onto one state, but that a set of time slots is assigned to each state. Normally a linear relation exists between states and time slots. For example, t = i * s + c, where t are the time slots, s ∈ ℕ the state in which the loop is scheduled, i the loop counter, and c a constant, i.e. the offset. However, the generalized schedule of definition 3.5 is not restricted to this case, but is as general as possible.

Definition 3.6 A generalized schedule for choice-free data flow graphs is proper iff ∀v, w∈V: w∈succ(v) ⇒ (∀i: 1 ≤ i ≤ #t(v): t(v)(i) ≤ t(w)(i)).

I.e. a node is never executed before the associated execution occurrence of a predecessor.
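To make these properness conditions concrete, the following is a minimal executable sketch (not part of the thesis; the graph and schedule encodings are assumptions) of the checks of definitions 3.3 and 3.6 in Python.

    # Hedged sketch: succ maps each node to its set of successors; a normal
    # schedule maps nodes to integers (definition 3.2), a generalized
    # schedule to ordered tuples of integers (cf. Seq(N), notation 3.4).

    def proper_normal(succ, t):
        """Definition 3.3: w in succ(v) implies t(v) <= t(w)."""
        return all(t[v] <= t[w] for v in succ for w in succ[v])

    def proper_generalized(succ, t):
        """Definition 3.6 (choice-free case): for every edge v -> w and
        every execution occurrence i of v, t(v)(i) <= t(w)(i).  The
        sketch assumes #t(w) >= #t(v) for every edge."""
        return all(t[v][i] <= t[w][i]
                   for v in succ for w in succ[v]
                   for i in range(len(t[v])))

    # A two-node chain executed in two loop iterations:
    succ = {'v': {'w'}, 'w': set()}
    print(proper_generalized(succ, {'v': (0, 2), 'w': (1, 3)}))  # True
    print(proper_generalized(succ, {'v': (0, 2), 'w': (1, 1)}))  # False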

Since the generalized schedule of definition 3.5 is not restricted to choice-free graphs, the following can be established.

Definition 3.7 A generalized schedule for generalized data flow graphs is proper iff it is an interleaving of proper schedules defined by definition 3.6 with respect to the different origins of a (choice) edge.

Proposition 3.6 ∀w∈V: proper(t) ⇒ #t(w) ≤ Σ_{v∈pred(w)} #t(v)

Thus, for a proper schedule, a node v may only execute after the associated execution occurrence of a predecessor, and each execution of v corresponds to the execution of a single predecessor.

Given a schedule, some conclusions on properties of the flow graph can be made, for instance on safeness. Some examples are:

1. a choice-free graph with multiple firing nodes is safe if the successors always fire before the node is executed itself again.

2. a choice edge e, with •e = {v1, v2}, e• = {w}, is safe when t(v1) ≤ t(w)(1) < t(v2) ≤ t(w)(2).

Next follows the formal definition of the flow graph partitioning induced by a schedule.

Definition 3.8 Given a well-formed graph G and a generalized schedule t: V → Seq(ℕ). Assume ∀e∈E: #dest(e) ≤ 1.² For each e∈E, let e• = {w}, let v_break ∉ V be a fresh node with I(v_break) = {p_in} and O(v_break) = {p_out}, and let e1, e2 ∉ E be two fresh edges. t induces a graph transformation of G to G' defined as follows. For each e∈E:

E' = (E \ {e}) ∪ {e1, e2}
V' = V ∪ {v_break}

2. By proper multiplication of the multi-destination edges and the corresponding output ports, similar to the transformation used in definition 2.114 of the well-structuredness rewrite system, this can always be obtained without changing the behavior of the data flow graph.

Thus for all edges e∈E, •e is partitioned into •e1 = {v ∈ •e | t(v) ∩ t(w) = ∅}, i.e. the origins of e that do not have a time slot in common with the destination of e, and into •e2 = •e \ •e1; then e is split by putting v_break in between. The possible cases are shown in figure 3.14.


Figure 3.14 Possible breaks of an edge due to scheduling

As an optimization, a dangling node v_break need not be inserted in the flow graph, as is for instance the case in figure 3.14c.

Each break in an edge, i.e. the insertion of a v_break, means that the execution crosses a state, or time, boundary in the control graph. Registers or other types of storage nodes should then be inserted in the network. These storage nodes coincide with the nodes v_break. The break nodes may also be viewed as synchronization nodes. As an optimization, several breaks between the same time slots, i.e. the same state transition, may be represented by just one. Other optimizations based on lifetime analysis can be performed, too.
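As an illustration, the core of the transformation of definition 3.8 can be sketched as follows (an illustrative Python rendering, not the thesis' formalism; the edge and schedule encodings are assumptions).

    # A minimal sketch: an edge whose origins are not scheduled into a time
    # slot shared with its destination is split by a fresh v_break node,
    # which becomes a register in the network.

    def break_edges(edges, t):
        """edges: list of (origins, dest) pairs with single destinations;
        t: generalized schedule mapping nodes to sets of time slots."""
        new_edges, breaks = [], 0
        for origins, w in edges:
            crossing = {v for v in origins if not (set(t[v]) & set(t[w]))}
            staying = set(origins) - crossing
            if not crossing:                 # figure 3.14c: no break needed
                new_edges.append((set(origins), w))
                continue
            breaks += 1
            vb = 'v_break%d' % breaks        # fresh synchronization/storage node
            new_edges.append((crossing, vb))          # e1: into the break node
            new_edges.append((staying | {vb}, w))     # e2: on to the destination
        return new_edges

    edges = [({'v1', 'v2'}, 'w')]
    t = {'v1': {0}, 'v2': {1}, 'w': {1}}
    print(break_edges(edges, t))
    # v1 shares no time slot with w, so only v1's contribution is broken.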

Proposition 3.7 The graph transformation of definition 3.8 for a proper schedule t yields a behavior preserving partitioning of the graph G.

Proof: The nodes v between two (successive) breaks are all scheduled onto the same state. The control graph is therefore identical to the flow graph G in which all nodes that are mapped onto the same state are collapsed into one 'super' node. Because the schedule t is proper, the scheduled graph has a behavior that in general has less (internal) non-determinism than G (and possibly also less external non-determinism), but no deadlock occurs. This means that Behavior(G') ⊆ Behavior(G) for a proper schedule of a well-formed graph, since ∃b∈Behavior(G'): b∉Behavior(G) if deadlock can occur. □

Now some concepts are given that are mentioned, and sometimes dealt with as special cases, in high level synthesis tools.

Notation 3.9 When the nodes of a connected subgraph are scheduled in the same time slot, this set of nodes is said to be chained [123]. Such chaining is allowed if the longest (delay) path in the subgraph is smaller than a time slot.

Notation 3.10 A node is a multi-cycling node when its delay is longer than a clock tick [123]. A multi-cycling node is scheduled in more than one time slot, namely onto an interval, i.e. a set of consecutive states.

Notation 3.11 A flow graph is said to be pipelined when different instances of the input data are executed simultaneously.

For pipelining, the relation between the states of the control graph and the time slots is normally also a linear function in the number of iterations, just as for loops. However, the generalized schedule defined here by definition 3.5 is not restricted to this case, but is as general as possible. The case of multi-cycling is incorporated, too.

In this framework, there exist no special needs for pipelining with which scheduling has to deal. If a flow graph is pipelined, nodes which could be mapped to the same module without pipelining are allocated to different modules and also into different states, but these states are mutually exclusive. Pipelining is therefore not a scheduling issue, but an allocation issue.

3.3.2 Allocation

Normally, the following types of allocation are distinguished in high level synthesis:

- module allocation
- storage allocation
- interconnect allocation

A module allocation maps (operator) nodes of a data flow graph to (operator) modules in the data path. Storage allocation is used to assign variables that are live across state boundaries after scheduling to registers or other types of storage modules. Fundamentally, module allocation as well as storage allocation map nodes of the data flow graph to modules in the network.

Definition 3.12 Module allocation is a function α_M: V_op → M, and storage allocation is a function α_R: V_R → R, where M is a set of modules, V_op is the set of non-storage nodes of V, V_R is the set of storage nodes of V, and R is a set of storage modules, e.g. registers.

In this way, allocation is defined as generally as possible. These two allocations may be combined into one generalized module allocation function α: V → H, defined as α = α_M ∪ α_R, where H = M ∪ R represents the hardware modules. Note that for allocation purposes, branch and merge nodes are considered as operator nodes, since they are mapped onto (de)multiplexors.

Definition 3.13 A generalized module allocation is proper iff ∀v∈V: function(v) ∈ operations(α(v)), and ∀v, w∈V: v ≠ w: α(v) = α(w) ⇒ t(v) ∩ t(w) = ∅ ∨ mutex(v, w), where operations(m) are the operations a module m can perform, and mutex(v, w) holds if nodes v and w are enabled under mutually exclusive conditions, e.g. occurring in different branches of a switch statement [123].

In words, for a proper allocation, the module m onto which a node v is mapped must indeed be able to perform the function of v, and all the different 'invocations' of m may not overlap in time, or they are enabled under different conditions.
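A direct transcription of this properness check might look as follows (a sketch under assumed encodings: time slots as sets, mutex as a given predicate; none of the names are taken from the thesis).

    def proper_allocation(nodes, alloc, t, operations, function, mutex):
        """alloc: node -> module; t: node -> set of time slots;
        operations: module -> set of operations it can perform;
        function: node -> operation; mutex: predicate on node pairs."""
        for v in nodes:
            if function[v] not in operations[alloc[v]]:
                return False        # module cannot perform v's function
        for v in nodes:
            for w in nodes:
                if v != w and alloc[v] == alloc[w]:
                    # invocations of a shared module must not overlap in
                    # time, unless enabled under mutually exclusive conditions
                    if t[v] & t[w] and not mutex(v, w):
                        return False
        return True

    nodes = {'v1', 'v2'}
    ok = proper_allocation(
        nodes,
        alloc={'v1': 'alu0', 'v2': 'alu0'},
        t={'v1': {0}, 'v2': {1}},
        operations={'alu0': {'add', 'sub'}},
        function={'v1': 'add', 'v2': 'sub'},
        mutex=lambda v, w: False)
    print(ok)   # True: same ALU, but the time slots do not overlap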

Definition 3.14 Interconnect allocation is a function α_I: E → W, where W is a set of wires or busses.

Note that choice edges may be mapped onto wired-ors in the synthesized network. Also a merge node may be chosen to be mapped onto a choice edge, and therefore onto a wire. In fact, all allocations are mappings from nodes and edges to modules and wires. In principle, when we abstract from particular hardware issues, module allocation can be viewed as a renaming of nodes. Then, module allocation can be viewed as an intra-graph transformation of nodes to nodes. Similarly, interconnect allocation can be viewed as a mapping from edges to edges, instead of wires. Thus:

Definition 3.15 For a data flow graph, a generalized module allocation is a function α: V → V, and a generalized interconnect allocation is a function α_I: E → E.

Now follow the formalizations of the graph transformations that are induced by the allocations.

A module allocation maps several nodes onto each other. This implies that also the edges that are connected to the corresponding ports of these nodes, indicated by a bijection η, have to be merged.

Definition 3.16 Given a well-formed graph G, a generalized module allocation α: V → V, and a bijection η: P_in ∪ P_out → P_in ∪ P_out between the ports of the nodes that are mapped onto each other.³ α induces the graph H = (V_H, P_in, P_out, E_H, I/V_H, O/V_H, φ_H) defined as:⁴

V_H = Range(α)
[v] = {w∈V: α(w) = v}
∀v∈V_H, p∈P_in: E_(v,p) = {e∈E | (w, p_k)∈dest(e) ∧ w∈[v] ∧ p = η(p_k)}   (3.3)
∀v∈V_H, p∈P_out: E_(v,p) = {e∈E | (w, p_k)∈orig(e) ∧ w∈[v] ∧ p = η(p_k)}   (3.4)
close({E_(v,p) | v∈V_H ∧ p∈P_in as in (3.3), or v∈V_H ∧ p∈P_out as in (3.4)})   (3.5)

Let {E'_1, ···, E'_m} = (3.5), and take fresh labels e_i' so that E_H = {e_1', ···, e_m'} with E_H ∩ E = ∅.

φ_H(e_i') = ({(v, p) | ∃w∈V, p_k∈P_out, e∈E'_i: α(w) = v ∧ η(p_k) = p ∧ (w, p_k)∈fst(φ(e))},
             {(v, p) | ∃w∈V, p_k∈P_in, e∈E'_i: α(w) = v ∧ η(p_k) = p ∧ (w, p_k)∈snd(φ(e))})

The mapping of several nodes onto each other is represented by the (equivalence) classes [v]. Equation (3.3) represents all edges that are connected to a particular input port of the nodes that are merged, whereas equation (3.4) is the same for the output edges. Since all these sets may overlap, the closure of these sets must be taken. This is accomplished by equation (3.5). For every set in this closure, a new edge is made, which is the merge of all the edges in such a set.

Note that this transformation is only defined for well-formed graphs. It can be extended to general graphs, i.e. in which conflict is allowed, but then an edge has to be made in the above described way for each combination of adjacent edges, i.e. one edge is made for each of all the 'product terms'.

Example 3.1

An example of an allocation α: V → V for the graph of figure 3.15a, for which α(v1) = α(v3) = v1, α(v4) = α(v6) = v4, α(v2) = v2 and α(v5) = v5, is given in figure 3.15b. In this graph, ∀v∈V: I(v) = {p_in} ∧ O(v) = {p_out}.

3. Of course, η/P_in: P_in → P_in and η/P_out: P_out → P_out, too.
4. The function close is defined in definition A.15.

E_(v1,p_out) = {e1, e2}, the outgoing edges of v1 and v3; E_(v5,p_out) = {e3}; E_(v2,p_in) = {e1}; and E_(v4,p_in) = {e2, e3}, the incoming edges of v4 and v6.

These sets E_(v,p) represent the edges that have to be merged. Since edge e1 merges with edge e2 due to E_(v1,p_out), and also edge e3 with e2 due to E_(v4,p_in), all three edges e1, e2 and e3 have to be merged, resulting in a new edge e1'. This is

also given by close({E_(v1,p_out), E_(v5,p_out), E_(v2,p_in), E_(v4,p_in)}) = {{e1, e2, e3}}.
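The closure step is essentially a union-find over the sets E_(v,p); the following sketch (illustrative only — the thesis defines close in definition A.15, not by this algorithm) reproduces the result of example 3.1.

    def close(sets):
        """Transitively unite all sets that share an element."""
        parent = {}
        def find(x):
            while parent.setdefault(x, x) != x:
                parent[x] = parent[parent[x]]      # path halving
                x = parent[x]
            return x
        for s in sets:
            s = list(s)
            find(s[0])                             # register singletons too
            for e in s[1:]:
                parent[find(e)] = find(s[0])       # union with the first element
        classes = {}
        for e in parent:
            classes.setdefault(find(e), set()).add(e)
        return list(classes.values())

    # Example 3.1: e1 merges with e2, and e3 with e2, so all three unite.
    print(close([{'e1', 'e2'}, {'e3'}, {'e1'}, {'e2', 'e3'}]))
    # [{'e1', 'e2', 'e3'}]  (set ordering may vary)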


Figure 3.15 Example graph for module allocation

Proposition 3.8 The transformation of definition 3.16 is a graph homomorphism.

Proof: Take f_E: E → E_H defined as (definition 2.44) f_E(e) = e_i' if e∈E'_i. □

Proposition 3.9 ∀v∈Range(α): (⋃_{w∈[v]} •w) ∩ (⋃_{w∈[v]} w•) = ∅ ⇒ ¬self-loop(H)

Proposition 3.9 is important, since self-loops in the allocated graph are considered harmful. Registers should be inserted in every cycle of the allocated graph, because feedback between clock ticks in synchronous hardware is assumed to be bad practice.

Interconnect allocation induces the graph transformation that merges all edges that are mapped onto each other.

Definition 3.17 Given a well-formed graph G and a generalized interconnect allocation α_I: E → E. α_I induces the graph H = (V, P_in, P_out, E_H, I, O, φ_H) defined as:

E_H = Range(α_I)
φ_H(e) = ({(v, p) | ∃e'∈[e]: (v, p)∈fst(φ(e'))}, {(v, p) | ∃e'∈[e]: (v, p)∈snd(φ(e'))})

where [e] = {e'∈E: α_I(e') = e}.
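Since definition 3.17 simply unites the connection points of each class [e], it can be sketched very compactly (names and encodings are assumptions, not the thesis' notation).

    def apply_interconnect(edges, alpha_i):
        """edges: edge name -> (origins, destinations), each a set of
        (node, port) pairs; alpha_i: edge name -> representative edge."""
        merged = {}
        for e, (orig, dest) in edges.items():
            o, d = merged.setdefault(alpha_i[e], (set(), set()))
            o |= orig                    # unite the origins of the class
            d |= dest                    # unite the destinations
        return merged

    edges = {'e1': ({('v1', 'p_out')}, {('v2', 'p_in')}),
             'e2': ({('v3', 'p_out')}, {('v4', 'p_in')})}
    print(apply_interconnect(edges, {'e1': 'e1', 'e2': 'e1'}))
    # one edge 'e1' connecting both origins to both destinations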

Proposition 3.10 The transformation of definition 3.17 is a graph homomorphism.

Proof: f_E = α_I. □

The module allocation and the interconnect allocation may also be combined. This will yield a graph in which the merging of nodes is given by the module allocation, but the merging of the edges is partially induced by the interconnect allocation as defined in definition 3.17 and partially by the module allocation as defined in definition 3.16.

Definition 3.18 Given a well-formed graph G, a generalized allocation α: V∪E → V∪E, and a bijection η: P_in ∪ P_out → P_in ∪ P_out between the ports of the nodes that are mapped onto each other. α induces the graph H = (V_H, P_in, P_out, E_H, I/V_H, O/V_H, φ_H) defined as follows, where

V_H = Range(α/V)
E'_H = Range(α/E)
∀v∈V_H, p∈P_in: E_(v,p) = {e'∈E'_H | ∃e∈E: e∈[e'] ∧ (w, p_k)∈dest(e) ∧ w∈[v] ∧ p = η(p_k)}   (3.6)
∀v∈V_H, p∈P_out: E_(v,p) = {e'∈E'_H | ∃e∈E: e∈[e'] ∧ (w, p_k)∈orig(e) ∧ w∈[v] ∧ p = η(p_k)}   (3.7)
close({E_(v,p) | v∈V_H ∧ p∈P_in as in (3.6), or v∈V_H ∧ p∈P_out as in (3.7)})   (3.8)

Let {E'_1, ···, E'_m} = (3.8), and take fresh labels e_i' so that E_H = {e_1', ···, e_m'} with E_H ∩ E = ∅.

φ_H(e_i') = ({(v, p) | ∃w∈V, p_k∈P_out, e∈E'_i: α(w) = v ∧ η(p_k) = p ∧ (w, p_k)∈fst(φ(e))},
             {(v, p) | ∃w∈V, p_k∈P_in, e∈E'_i: α(w) = v ∧ η(p_k) = p ∧ (w, p_k)∈snd(φ(e))})

Equations (3.6) and (3.7) are similar to equations (3.3) and (3.4), only the edges are restricted to the (equivalence) classes induced by the interconnect allocation.

Example 3.2

The graph of figure 3.16b is an example of a combined allocation α: V∪E → V∪E for the graph of figure 3.16a, for which α(e1) = α(e2) = e1 and α(v4) = α(v6) = v4. Because α(e1) = α(e2), edges e1 and e2 will always appear together in the sets E_(v,p).

E_(v1,p_out) = {e1, e2} = E_(v3,p_out), E_(v5,p_out) = {e3}, E_(v2,p_in) = {e1, e2} and E_(v4,p_in) = {e1, e2, e3}.

e2∈E_(v4,p_in) and e3∈E_(v4,p_in) are induced by α(v4) = α(v6) = v4. e2∈E_(v1,p_out) and e1∈E_(v4,p_in) are induced by the aforementioned reason, because e1∈v1• and e2∈•v4. The closure of all the sets E_(v,p) is again {{e1, e2, e3}}.


Figure 3.16 Example graph for combined module and interconnect allocation

The same graph results when first applying the interconnect allocation, and then the module allocation.

Proposition 3.11 The transformation of definition 3.18 is a graph homomorphism.

Proof: Given α: V∪E → V∪E, the functions α_M: V → V and α_I: E → E can be defined with α = α_M ∘ α_I, since α/V: V → V and α/E: E → E, i.e. α_M = α/V and α_I = α/E. Both α_M and α_I induce graph homomorphisms (propositions 3.8 and 3.10), and the composition of two homomorphisms is again a homomorphism. Then it is only required to show that α_M ∘ α_I induces the graph given above. Let α_I induce G' as given in definition 3.17. As in definition 3.16, α_M induces on G':

∀v∈Range(α_M), p∈P_in: E_(v,p) = {e∈E' | (w, p_k)∈dest(e) ∧ w∈[v] ∧ p = η(p_k)}

∀v∈Range(α_M), p∈P_out: E_(v,p) = {e∈E' | (w, p_k)∈orig(e) ∧ w∈[v] ∧ p = η(p_k)}

By the definition of G' (definition 3.17):

(w, p_k)∈dest(e') (e'∈E') ⇔ ∃e∈E: e∈[e'] ∧ (w, p_k)∈dest(e)
(w, p_k)∈orig(e') (e'∈E') ⇔ ∃e∈E: e∈[e'] ∧ (w, p_k)∈orig(e)

Thus

∀v∈V, p∈P_in: E_(v,p) = {e'∈E' | ∃e∈E: e∈[e'] ∧ (w, p_k)∈dest(e) ∧ w∈[v] ∧ p = η(p_k)}   (3.9)

∀v∈V, p∈P_out: E_(v,p) = {e'∈E' | ∃e∈E: e∈[e'] ∧ (w, p_k)∈orig(e) ∧ w∈[v] ∧ p = η(p_k)}

It is straightforward that equation (3.9), with each e'∈E' expanded to [e'] ⊆ E, is equal to equation (3.6). This also proves that both closures are identical. Then it is straightforward that φ_H as defined in definition 3.18 is equal to φ_M ∘ φ_I, where φ_M is as defined in definition 3.16 and φ_I as in definition 3.17. □

An allocation may in general be a mapping from nodes to sets of modules or wires, for example when a node is executed under mutually exclusive conditions. But this never occurs in pure data flow graphs. It may only occur when such a node is driven by a choice edge, or when a partial mapping from nodes to 'operations' is already done. However, the above definition of allocation is not restricted to any special case, but is as general as possible.

Example 3.3

An example of an allocation α: V → P(V) for the graph of figure 3.17a, for which α(v3) = {v5, v7}, is given in figure 3.17b. In this graph, ∀v∈V: I(v) = {p_in} ∧ O(v) = {p_out}.


Figure 3.17 Example graph for allocation of a node to a set of nodes

For this allocation, E_(v5,p_in) = {e1, e2},

E_(v7,p_in) = {e1, e3} and close({E_(v5,p_in), E_(v7,p_in)}) = {{e1, e2, e3}}.

3.4 Parallel programs

In this section, some special constructs are illustrated that are necessary to model parallel programs by data flow graphs, although many of these constructs at first sight violate the data flow principle.

Figure 3.18 Intermediate result of allocation of example graph

A parallel program is modeled as a set of cooperating data flow graphs. The graphs communicate through the get and put nodes. Note that for this 'internal' communication on the chip, the get and put nodes should not be implemented as real IO pads that really interact with the off-chip environment. Many of these IO nodes are just used to ease the modeling, but they may be eliminated when it comes to a real implementation. This does, however, not change the behavior of the complete data flow graph, i.e. the result is the same as when these nodes are hidden in the reachability graph.

3.4.1 Shared variables

A data flow graph implementation for a shared variable is given in figure 3.19. It is an implementation of the specification (where || is the infix parallel operator; get(p, v) denotes the input of a value from the environment through port p and its assignment to variable v; put(p, val) denotes the output of a value val through port p):

if get(read, read-request) then put(read, var) || if get(write, val) then var := val;

The graph is non-deterministic, since it is not known beforehand what order will be taken when a read access and a write of the shared variable occur simultaneously. Edges labeled with seq are sequence edges; the edge labeled var represents the shared variable. When a read request comes in, the branch node executes in such a way that the value of the shared variable can be output by the put node. The node id, with function(id) = identity, is needed for correct handling of the construct var := val in the specification, since the update of the variable should not be an interleaving of the old value and the new one. Therefore, the id node is also driven by a sequence edge. It can be proved that the data input port of the branch node (the var edge) is always safe, provided a safe initialization.
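The behavior of this specification, including its non-determinism, can be mimicked operationally; the following Python sketch (an illustrative simulation, not the graph construct itself; all names are assumptions) models the two guarded commands as threads and the var edge as a single storage location.

    import queue, threading

    read_req, read_out, write_in = queue.Queue(), queue.Queue(), queue.Queue()
    var = ['init']           # the token on the var edge (safe initialization)
    lock = threading.Lock()  # plays the role of the sequence edges

    def reader():
        while read_req.get() is not None:    # get(read, read-request)
            with lock:
                read_out.put(var[0])         # put(read, var)

    def writer():
        while (val := write_in.get()) is not None:   # get(write, val)
            with lock:
                var[0] = val                 # var := val (the id node)

    threading.Thread(target=reader, daemon=True).start()
    threading.Thread(target=writer, daemon=True).start()
    write_in.put(42)
    read_req.put(True)
    print(read_out.get())    # 42 or 'init', depending on the interleaving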

As is to be expected, the data flow graph of figure 3.19 must be initialized for proper functioning. This is to be done by putting a value on the var edge.


Figure 3.19 A shared variable construct

Note that this initialization is even required for programs, or data flow graphs, in which all variables are defined, i.e. written, before being used. But as an alternative initialization, for example to verify this 'define-before-use' behavior of programs, a value (e.g. the special sequence value as defined in section 2.3) may be put on the incoming sequence edge of the id node.

In a set of data flow graphs where shared variables are used, the constructs as given in figures 3.20 and 3.21 may be used for the read access respectively the update of such a variable.


Figure 3.20 Read access of a shared variable


Figure 3.21 Write access to a shared variable

With shared variables, cf. volatile variables, a normal data flow analysis is in general not sufficient for parallel processes, where the current use is not exactly defined as it is for sequential programs (provided there are no non-local jumps). Therefore the constructs of figures 3.20 and 3.21 should be used instead of an edge representing the value of a variable, as is normally the case. Clearly there should be as many 'write' get nodes in the construct of figure 3.19 as there are update 'nodes' in the set of parallel data flow graphs; and as many get and put node pairs of the read part as there are read accesses.

Note that the constructs given here may only be used as 'macros': they must be expanded to get any 'working' behavior, because the implementation of the shared variable construct as given in figure 3.19 is not strict in all its inputs, as is required for hierarchical nodes. Note that it nevertheless satisfies the definition of well-behavedness!

The constructs given here might be viewed as standard nodes, to use the term as used in SIL [78]. However, here these 'standard' nodes must always be expanded before synthesis or verification of the data flow graph can start. Of course, by expansion, many get and put nodes may be eliminated without altering the behavior of the data flow graph.

In SIL [78] a set of standard nodes is defined for each input language for which a translation, or compiler, exists to SIL. These standard nodes are therefore part of the definition of the translation from an input language to SIL. Standard nodes are hierarchical nodes, so they should be expanded when (high level) synthesis or verification must be performed on the SIL description. Because there is no direct repetitive 'statement' in SIL (only repetition through recursion is possible), many standard nodes are hierarchical nodes to map the repetitive statements of the input language into recursive ones. However, standard nodes are devised to ease the synthesis, since the current synthesis cannot deal with recursion but can deal with iterative constructs as defined in the input languages. Thus such a synthesis might be viewed as being directly written from the specification in the input language instead of in SIL. Therefore, and for the reasons given in section 3.2, recursion is not dealt with in the data flow graph concept defined here.

Of course, the constructs of figures 3.19-3.21 may be synthesized into ordinary registers or other memory-like hardware structures with more elaborate read and write ports. This is only valid when the preconditions for these hardware modules are satisfied by the (expanded) data flow graph. Therefore the above standard nodes are in general not used in the same way as standard nodes in SIL [78].

3.4.2 Semaphores

An implementation of a semaphore is given in figure 3.22, which is just a synchronization.




Figure 3.22 Semaphore

This figure shows a semaphore if there exists only one P location, i.e. one 'request' for a resource, in the parallel program, and one V location, i.e. one release of a resource.

A generalized semaphore for n P locations consists of an n-switch instead of a 1-switch. Also n get and put nodes exist for the P action. The n get nodes are all connected by a single choice edge to the branch node. Figure 3.23 shows such a generalized semaphore with 2 P locations and 2 V locations.

Figure 3.23 A general semaphore

In the data flow graphs of the constituents of the parallel program, the P and V nodes will be similar to figures 3.20 and 3.21.

3.4.3 Queues

A high level description of a queue is given in this section, and also an implementation in terms of a data flow graph. In principle, a queue is just a parallel process of inserting an element to the queue, and of removing an element from it. Of course, for a bounded queue, an element may only be inserted when the queue is not yet full, and an element can only be removed if the queue is not empty. A high level specification is given in figure 3.24. In the sequel, all the arithmetic is modulo n + 1, where n is the size of the queue. Two pointers head and tail are used: head points to the first element of the queue, tail to the last. Seemingly, a queue of length n + 1 is used, but the (n + 1)-th element is used as a sentinel, to distinguish between an empty (head = tail) and a full queue (tail = head − 1). Note that by modulo n arithmetic, tail is also initialized to zero. In this specification only the control structure is specified, i.e. the real array with the data is skipped, since it is of no special interest.

queue(n)
begin
    head := 0 || tail := n;    /* initialization */
    insert() || remove();
end

insert()
begin
    while get(write) do
        while tail = head-2 do /* full */ od
        buf[++tail] := write;
        put(write-ack)
    od
end

remove()
begin
    while get(read) do
        while tail = head-1 do /* empty */ od
        put(read, buf[head++]);
    od
end

Figure 3.24 Specification of a queue of size n

Note that, because of the asynchronous behavior of data flow graphs, the requests for input get(read) and get(write) are blocking, i.e. they are only executed when indeed data is available. Also, when there is a request to read from an empty queue, this is blocked until the queue is non-empty; similarly for a write to a full buffer. A write is also acknowledged, but it is up to the designer whether this acknowledge is used.

Data flow graphs of the functions insert and remove are given in figures 3.25 and 3.26. Since the variables head and tail are used in both subgraphs, the constructs for shared variables of section 3.4.1 are used. The control edge from the = node to the branch and merge nodes is not drawn in both graphs, just as the sequence edge between the put and the get node to implement the while loop around this body. Since the head variable does not change in figure 3.26, it may be 'dragged out' of the loop. Because the put and get are connected in a chain, also the merge nodes may be replaced by simple choice edges. This results in figure 3.27 for the remove function. The insert function can be optimized similarly.
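For concreteness, the control structure of figure 3.24 can also be transcribed into a directly executable form; the following single-threaded Python sketch (illustrative only — the busy waits of the specification become assertions here) exhibits the sentinel convention.

    class BoundedQueue:
        def __init__(self, n):
            self.n = n
            self.buf = [None] * (n + 1)     # one extra cell as sentinel
            self.head, self.tail = 0, n     # tail = head-1 mod n+1 initially

        def empty(self):
            return self.tail == (self.head - 1) % (self.n + 1)

        def full(self):
            return self.tail == (self.head - 2) % (self.n + 1)

        def insert(self, value):
            assert not self.full()
            self.tail = (self.tail + 1) % (self.n + 1)   # buf[++tail] := value
            self.buf[self.tail] = value

        def remove(self):
            assert not self.empty()
            value = self.buf[self.head]                  # buf[head++]
            self.head = (self.head + 1) % (self.n + 1)
            return value

    q = BoundedQueue(3)
    for x in (1, 2, 3):
        q.insert(x)
    print(q.remove(), q.remove(), q.remove())   # 1 2 3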

Figure 3.25 Data flow graph of insert

Implementing busy waits is never good practice, but they can be eliminated by adding sequence edges in this optimized graph between the corresponding update and access nodes. The sequence edge from the node upd head in the graph of remove to the node acc head in the graph of insert should be initialized with a stream of n values. The sequence edge from the upd tail node in the insert graph to the acc tail node in the remove graph should not be initialized.

It is also possible to implement a queue with a possibility to flush the contents, i.e. the queue specification becomes:

Figure 3.26 Data flow graph of remove

queue(n)
begin
    flush(n);    /* initialization */
    flush(n) || insert() || remove();
end

For a correct functioning of this queue, a semaphore is needed for the access of the head and tail variables, i.e. the specification of the flush function should be:

Figure 3.27 Optimized data flow graph of remove

flush()
begin
    while get(flush) do
        P();
        head := 0 || tail := n;
        V();
    od
end

The insert and remove functions should also make use of these semaphores. As a disadvantage, the optimized graphs of figure 3.27 cannot be used any more without change, since the head and tail variables are now 'volatile'.

3.4.4 Further examples

Communication protocols may also be modeled by data flow graphs. This is done in [67] and [68] for 2 and 4 phase handshaking protocols, and the alternating bit protocol. The protocols were first described at a high level, and the descriptions were implemented in terms of data flow graphs. Then the requirements of such

3.5 Control dominated specifications This section shows in which way control dominated specifications might be viewed as flow graphs. First, direct state-transition based formalisms are dis­ cussed, which are normally considered to be synchronous. The second subsection deals with asynchronous specifications. 3.5.1 State-transition based formalisms Instead of system specifications in which data plays the most important part, e.g. algorithmic descriptions in a programming language, many system specification are control dominated. This is for instance the case for systems that are indeed modeled as finite state machines, or even as a set of cooperating finite state machines. Communication protocols are examples of this class of systems. This type of systems is mainly controlled by events and may be described, for instance, in the formalisms of [42, 72]. Of course, values are also used in the specification of complex (communication) protocols, resulting in specifically designed (formal) languages for these types of systems like [3, 4]. Microprocessor-like systems are another class of control dominated systems [109]. All control dominated specifications are mainly characterized as being a finite state machine. The transitions are labeled with the conditions under which a state becomes the next state. For each state a specification is given for the 'action' to be performed in that state. Also, such an action must lead to the validity of some tran­ sition. Such a specification can be very simple, like the direct generation of a next event [72], but also more complicated, e.g. another finite state machine. The latter 150 APPLICATIONS results in a hierarchical description [42]. Another possibility is by means of some program [109]. Thus the (hierarchical) finite state machine can be seen as a set of cooperating data flow graphs. In section 3.2 and section 3.3 it is shown that a control graph is just a partitioned data flow graph. The partitions are separated by branch and merge nodes that are controlled by values emerging from the subgraphs. For a higher view, the sub­ graphs may each be collapsed into one (operator) node. This leads to the conclu­ sion that a control dominated system specification is just the same as the control graph that is generated after scheduling of an (ordinary) flow graph. The main dif­ ference with a control graph is that in a control graph normally only one state is active, while for a (hierarchical) set of finite state machine descriptions this is not necessary.

In many of the specification formalisms for control dominated systems, a simple mechanism exists for general transitions, e.g. a reset to the initial state, trapping of errors or overflow of an arithmetic unit. The non-existence of the latter feature is often mentioned as the major drawback of data flow graphs. However, it is just a syntactic construct of the specification language, since in the end the general tran­ sitions must be implemented for each 'state'. Mostly modules have additional cir­ cuitry for this. Therefore, if the synthesis tools should really benefit from the exis­ tence of this extra circuitry, it should also be written down precisely in the specifi­ cation. The possibility of being in more than one state in the control graph is of course not a limitation of a data flow graph but of the way scheduling is performed. Control dominated specifications can be modeled by data flow graphs without any prob­ lem. The following types of transitions may occur, apart from the general 'broad­ casting' type of transitions as discussed before: 1. only one next state is possible 2. several states are triggered for execution in parallel (branch) 3. a state is triggered by completion of one predecessor state 4. a state is executed if all predecessor state have completed (merge or join) Cases 2 and 4 mostly appear in pairs, cf. branch and merge pairs for switch and repetitive statements in the data flow graphs of programming languages [38, 123] (see also section 3.2). Once control is distributed over several subsystems, further action is allowed only when all these subsystems have completed. It is immedi­ ately clear that such a specification can be modeled by data flow graphs: model CONTROL DOMINA TED SPECIFICATIONS 151

(recursively) each state by the data flow graph for its 'action'. Transitions of type 2 mean that the outputs of a state may be executed as soon as they are available and can therefore be implemented by one (multi-destination) edge in the data flow graph. Similarly, transitions of type 4 mean just a synchronization, i.e. the normal behavior of a node in a flow graph. Transitions of type 1 can be modeled by a branch node, while transitions of type 3 can be modeled by merge nodes, or just as a single choice edge (if allowed). Of course all these types may be mixed.

There remains the problem of global conditions or global variables, for example a reset signal. These can be modeled by using the constructs of section 3.4.1. Of course it can be advantageous first to check which variables are indeed used globally. This check concerns the use of variables defined and used in more than one state that may be active at the same time. This is just a kind of data flow analysis and may be performed on the data flow graph itself. The data transported along the transitions is the set of used and defined variables instead of the values of these variables [75]. (Cf. the non-standard semantics as described in section 3.8.2.) The action of the nodes must be changed accordingly.

Often a hidden issue in the specification of control dominated systems is the synchronous execution of the system. Such an execution must also be specified explicitly in the data flow graph. This again leads to the concept of the control graph as a result of scheduling, since scheduling is in principle just the mechanism to create a synchronous implementation from an asynchronous specification.

3.5.2 Delay insensitive circuits

Suitable formalisms for the specification and analysis of asynchronous systems are based on traces [93, 94, 120], or are formalisms like CSP [60] and CCS [104], which can also be viewed as sorts of path expressions [30]. The possible sequences of actions are given in some concise way, instead of the, much larger, underlying state-transition graph itself. Of course, a useful system also performs some data processing, which can be viewed as a kind of action, although not a communication. This leads to the value-passing extensions of the core languages, which are just path expressions [30].

Data flow graphs are in principle also asynchronous, and it is possible to map programs written in path expression languages into equivalent flow graphs. In principle, the same strategy as described in [12, 81] for Petri nets can be used for the core path expressions, i.e. the control structure. Because of the use of value passing and guarded commands, instead of static control structures, data flow graphs are more appropriate than ordinary Petri nets, although colored or high level Petri nets might also be eligible [50, 66]. Note that with this technique, the flow graph remains as concise as the specification itself.

The compilation of programs written in Tangram [17] into handshake circuits [15] resembles the mapping of path expressions into Petri nets. Especially the introduction of mixers in the handshake circuits, for variables which are used more than once, has a striking resemblance. This translation is on the highest abstract level of Tangram, but such a transformation can also be defined on the other levels of abstraction, e.g. CP-1 [16].
All the different handshake protocols that can be used in the implementation of the handshake circuits into VLSI circuits can be modeled directly in a data flow graph. This gives the opportunity to analyze and verify these circuits in a different way. In [15] correctness proofs are hinted at for a specific type of handshake protocol for all communications; similarly for the approach in [94]. But when flow graphs, or Petri nets, are used, it might be proved that not all communications need to complete a full handshake cycle before further action may take place. Alternative proofs can also be given, for example, for the initialization theorems of [15, 16].

3.6 Requirement verification

Almost all formal verification methods and tools are based on the specification-implementation paradigm: to check the equivalence of the specification and the implementation. This may be done when the specification is known in sufficient detail. Since equivalence is in fact too strong a relation, mostly an implication is checked (see also section 3.8.1). This leads to two types of verification:

- tautology checking
  This is found on the switch level and the gate level of a design. Boolean equations are extracted from a (transistor) layout and checked against the gate-level description, possibly including flipflops etc. Verification is carried out by a conventional tautology checker or by means of symbolic simulation [26, 27, 28, 89]. These techniques are also used on the register transfer level, for which boolean equations are extracted [24].
- theorem proving [23, 53]
  Theorem provers are mainly used to verify modules on two levels of abstraction: "bit" level and "integer" level. Theorem provers are too general for this restricted field of application and therefore other, more efficient techniques should be searched for. Theorem provers are also used to verify whether a transformation yields an equivalent system, or whether two types of

implementations are equivalent. This is an ad-hoc approach: each time such a transformation is applied, it is verified whether the result is correct. It should be used on another level: verify the correctness of the transformation itself, which must then be defined in a formal way [71, 111].

Time is highly important in system behavior. Tautology checking is purely static, and hence time dependence cannot be modeled nor verified. This is an enormous limitation. In theorem provers, time can be introduced with predicate logic, which is both general and expensive. The formalism of temporal logic has some operators that introduce time in logic formulae. This allows for the verification of behavior over time. The temporal logic approach has the advantage that the verification problem remains finite and decidable, even for infinite time sequences.

The specification-implementation paradigm is not applicable anymore on the higher levels of design, in which formal verification is not yet used very much. On these levels, it is not known anymore if the specification is correct, or what the designer really meant to specify. Thus another type of verification is required: to check whether requirements, which are given additionally, are met by the specification and the implementation [68].

Temporal logic is suitable to describe system properties that can be verified against the specification. It is a general formalism and it has proved to be a well suited formalism for program verification [112] as well as hardware specification and verification [22, 31, 65]. Temporal logic can also be used in a different way for verification: model checking. This can especially be used very well for requirement verification.

In this section, the branching time temporal logic Computation Tree Logic is presented to illustrate how temporal logic can be used for such requirement verification for (a set of cooperating) data flow graphs. First a recapitulation of CTL and model checking with CTL is given, as can for instance be found in [33]. Then an important extension is presented. The extension yields an algorithmic way to change the given data flow graph in such a way that the requirements in question are satisfied.

3.6.1 Temporal logic

Temporal logic is a modal logic and is an extension of normal propositional and predicate logic in the sense that a temporal aspect is explicitly introduced [91]. In temporal logic, some logic operators are introduced to reason about the temporal aspects of the logic formulae. Two types of temporal logic are presented in literature: linear time and branching time temporal logic. Linear time temporal logic considers only one future story, while more than one possible future story is considered in branching time temporal logic.

Linear time temporal logic is used for the representation of an (infinite) sequence of states s0, s1, ···, in which each state si represents a non-temporal logic formula valid in that state. Each state si is the (only possible) successor state of state si−1. Branching time temporal logic is used for the representation of an (infinite) tree of states T = t(s). Each path P = s0, s1, ··· in this tree is an (infinite) sequence of states in which each state si+1 is a successor of state si. Operators like for all paths and for some path are used in branching time temporal logic formulae.
The branching time temporal logic Computation Tree Logic (CTL) is used here since model checking can be performed in polynomial time and space for CTL, as opposed to the case of linear time temporal logic [33, 117].

Definition 3.19 The syntax of the branching time temporal logic Computation Tree Logic (CTL) is defined inductively as, where AP is the set of all atomic propositions:

1. Every atomic proposition p ∈ AP is a CTL formula
2. If f and g are CTL formulae, then the following are also CTL formulae:
   - ¬f
   - f ∧ g
   - AXf, EXf
   - A(f U g), E(f U g)

The symbol X is the next time operator and U is the until operator. The symbol A means for all paths; the symbol E means for some path. In a state si, AXf means that the formula f is valid for all successor states of state si. E(f U g) means that there is at least one path in the state tree for which there is an initial prefix, such that g is valid at the last state on that prefix, and f is valid at all other states along that prefix.

The semantics of a temporal logic CTL formula is defined with respect to a model M = (S, T, AP), which is a state transition graph with a set of states S, and where T ⊆ S × S is the relation defining the possible state transitions. AP(s) assigns to state s the atomic propositions that are valid in state s.

Linear time temporal logic also has the X and U operators, but it does not have the 'quantifiers' A and E, since its semantics is based on a single sequence of states.

Notation 3.20 M, s0 ⊨ f is used to express that the formula f is valid in state s0 of model M.

Notation 3.21 When the model M is clear from the context: s0 ⊨ f ≡ M, s0 ⊨ f.

Definition 3.22 For CTL, the relation ⊨ is defined as:

s0 ⊨ p        iff p ∈ AP(s0)
s0 ⊨ ¬f       iff not s0 ⊨ f
s0 ⊨ f ∧ g    iff s0 ⊨ f and s0 ⊨ g
s0 ⊨ AXf      iff for all successor states t of s0: t ⊨ f
s0 ⊨ EXf      iff for at least one successor state t of s0: t ⊨ f
s0 ⊨ A(f U g) iff for all paths (s0, s1, ···), ∃i: i ≥ 0: si ⊨ g ∧ ∀j: 0 ≤ j < i: sj ⊨ f
s0 ⊨ E(f U g) iff for at least one path (s0, s1, ···), ∃i: i ≥ 0: si ⊨ g ∧ ∀j: 0 ≤ j < i: sj ⊨ f

where p is an atomic proposition, and f and g are CTL formulae.

In linear time logic, two other temporal operators are used: always (□) and sometime or eventually (◇). These operators can be defined for branching time temporal logic, too. In branching time temporal logic the always operator can be understood as for every state on all paths. The sometime or eventually operator can be interpreted as for some state on some path, or alternatively: it must be valid at some moment in some future [80].

Definition 3.23
s0 ⊨ □f iff for all paths (s0, s1, ···), ∀i: i ≥ 0: si ⊨ f
s0 ⊨ ◇f iff for at least one path (s0, s1, ···), ∃i: i ≥ 0: si ⊨ f

As CTL formulae:
◇f = E(true U f)
□f = ¬E(true U ¬f)

3.6.2 CTL Model checking

Definition 3.24 Model checking is the process of verifying whether M, s0 ⊨ f, where M = (S, T, AP) is a model, or a state transition graph, in which each state is labeled with the atomic propositions that are valid for that state.
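Model checking over an explicit state transition graph can be sketched directly; the following illustrative Python fragment (formulae as nested tuples, T as a successor map — representations assumed here, not taken from the thesis) evaluates the CTL operators that need no fixpoint computation. The until operators are handled by the search described in the proof of theorem 3.12 below.

    def sat(f, S, T, AP):
        """Set of states of M = (S, T, AP) satisfying f, for the cases
        that need no fixpoint; E(gUh) is handled separately (see below)."""
        op = f[0]
        if op == 'ap':   return {s for s in S if f[1] in AP[s]}
        if op == 'not':  return S - sat(f[1], S, T, AP)
        if op == 'and':  return sat(f[1], S, T, AP) & sat(f[2], S, T, AP)
        if op == 'EX':
            g = sat(f[1], S, T, AP)
            return {s for s in S if any(t in g for t in T[s])}
        if op == 'AX':
            g = sat(f[1], S, T, AP)
            return {s for s in S if all(t in g for t in T[s])}
        raise NotImplementedError(op)

    S = {'s0', 's1'}
    T = {'s0': {'s1'}, 's1': {'s1'}}
    AP = {'s0': set(), 's1': {'p'}}
    print(sat(('AX', ('ap', 'p')), S, T, AP))   # {'s0', 's1'} (order may vary)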

Definition 3.25 The subformulae of a formula f are defined as:

sub(f) = {g} ∪ sub(g)               if f = ¬g or f = AXg or f = EXg
sub(f) = {g, h} ∪ sub(g) ∪ sub(h)   if f = g ∧ h or f = A(gUh) or f = E(gUh)

Definition 3.26 The length of a CTL formula f is defined as: length(f) = #sub(f).
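On the same nested-tuple representation as in the sketch above, sub and length translate directly (again illustrative only):

    def sub(f):
        op = f[0]
        if op == 'ap':                 return set()
        if op in ('not', 'AX', 'EX'):  return {f[1]} | sub(f[1])
        return {f[1], f[2]} | sub(f[1]) | sub(f[2])   # and, A(gUh), E(gUh)

    def length(f):
        return len(sub(f))

    print(length(('AX', ('and', ('ap', 'p'), ('ap', 'q')))))   # 3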

Theorem 3.12 Given a model M = (S, T, AP), an initial state s0 and a CTL formula f, M, s0 ⊨ f can be decided in polynomial time.

Proof: A formula consisting of only atomic propositions can be checked directly in the structure M, since each state is labeled with the atomic propositions that are valid for that state. Now consider the case that for all subformulae of a CTL formula f and for each state s, it is known whether these subformulae are valid. Since

- case f = ¬g: f is true iff g is not valid for that state
- case f = g ∧ h: f is true iff g and h are valid in that state
- case f = AXg: f is true iff g is valid for all successor states of s
- case f = EXg: f is true iff there is a successor state of s for which g is valid

the validity of these formulae can be determined in constant time.

- case f = E(gUh) = h ∨ (g ∧ EX(E(gUh)))
  Thus f is valid in a state if h is valid for that state, or if g is valid and there is a successor state for which f is valid. In all other states, f is not valid. Thus the validity of f in a state transition graph M can be verified by first determining all the states for which h is valid. In these states, f is also valid. Also, f is valid in all predecessor states for which g is valid. Thus the validity of f = E(gUh) can be decided in a single backward depth first search of the state transition graph.
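The backward search just described can be sketched as a worklist pass over reversed edges (illustrative code; g_states and h_states are the precomputed satisfaction sets of the subformulae):

    def check_EU(S, T, g_states, h_states):
        """States satisfying E(g U h): seed with the h-states and
        propagate backwards through g-states; each state is visited
        at most once, so the pass is linear in the size of the model."""
        pred = {s: set() for s in S}
        for s in S:
            for t in T[s]:
                pred[t].add(s)
        valid = set(h_states)
        stack = list(valid)
        while stack:
            t = stack.pop()
            for s in pred[t]:
                if s in g_states and s not in valid:
                    valid.add(s)          # g holds here, E(gUh) holds next
                    stack.append(s)
        return valid

    S = {'s0', 's1', 's2'}
    T = {'s0': {'s1'}, 's1': {'s2'}, 's2': set()}
    print(check_EU(S, T, g_states={'s0', 's1'}, h_states={'s2'}))
    # all three states: g holds until h is reached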

- case f = A(gUh) = h ∨ (g ∧ AX(A(gUh)))
  Thus f is valid in all states for which h is valid, or if g is valid and for all successor states f is valid. This can also be determined in a single depth first search. This can be deduced from the fact that when a backlink is encountered, i.e. a path from a state s back to itself, and g is valid for all states along this path but not h, then a counterexample of AXf is found and therefore f cannot be valid for this state.

Concluding, the complexity of model checking of a CTL formula against a model M = (S, T, AP) is linear in both the length of the formula f and the size of the model M. □

3.6.3 Model checking for data flow graphs

The reachability graph can be used as the state transition graph for the CTL model checker. Each state must be labeled with all the atomic propositions valid in that state. For a reachability graph, these atomic propositions may be any function of the nodes and edges of the data flow graph, for instance the node that is executed to reach this state immediately from its predecessor state. This is a realistic assumption, since the CTL formulae to be verified describe events that have to occur, or erroneous events that are not supposed to happen at all. These events can be directly mapped onto the nodes of the data flow graph. For example, the alternation of request and acknowledgment signals in a communication protocol is identical to the alternation of the execution of the put nodes corresponding to these requests and acknowledgements.

The following is not valid, where RG_ant is the anticipated reachability graph:

RG_ant, s0 ⊨ f ⇒ RG, s0 ⊨ f   (3.10)

Therefore the anticipated reachability graph cannot be used directly for model checking, which is a major drawback. However, if the nodes that are used in the CTL formula f are never anticipated in RG_ant, then equation (3.10) is valid. Thus the anticipation algorithm of page 81 should be adapted for this. Normally the set of 'variables' used in a requirement is much smaller than the number of nodes in a (detailed) specification. The anticipated reachability graph is then still much smaller than the complete reachability graph. Another possibility would be to restrict the requirements to a class for which equation (3.10) is valid, or to use reconstruction techniques as used in theorem 2.52 to detect non-safeness.

A system behaves correctly if it satisfies all its requirements. For a system of cooperating processes, requirements are often specified in terms of liveness, safety and fairness. The following is not an exhaustive enumeration of all interesting properties, but only shows how some general notions can be expressed in temporal logic.

Definition 3.27 Given a property f written as a temporal logical formula.
live(f) = ◇□◇f
strong-live(f) = □◇f

Liveness means that there should be progress. Liveness of a property f means that it must be able to occur infinitely many times. A stronger notion is that it always must occur infinitely many times. Note that these notions of liveness are equivalent to the definition of liveness of nodes in data flow graphs (section 2.6.1). But more general liveness properties can be expressed and verified in temporal logic, since there is no restriction on the type of f.

Definition 3.28 Given a property f written as a temporal logical formula.
dead(f) = ◇□¬f

This is the negation of strong-live(f).
Clearly, this also includes livelock, which is intuitively defined as: there is no observable progress, but the system may continue for ever by doing unobservable events.

Definition 3.29 Given a property f written as a temporal logical formula.
safe(f) = □f

A safety property specifies that "nothing bad will happen", or, alternatively, only "good things happen", i.e. it is an invariant.

Definition 3.30 Given a property f with pre(f) the precondition for this property and post(f) the postcondition, written as temporal logical formulae.
absence-of-starvation(f) = □(pre(f) ⇒ ◇post(f))

This property is dealt with in section 3.1 and needs therefore no further explanation.

Definition 3.31 Given two properties f and g written as temporal logical formulae.
fair(f, g) = □◇f ∧ □◇g

Fairness of events means that no event is favored, and is expressed as the conjunction of two (strong) liveness properties. Absence of starvation is a weak form of fair behavior, since it only states that an event should not be postponed for ever.

The above defined functions just form a framework to specify useful properties. An example of a more complicated formula is given next.

Definition 3.32 Given two properties f and g written as temporal logical

formulae. The alternation of f and g is defined as:

alternating(f, g) = safe(h) = □h, where
h = (f ⇒ A(f U (¬f ∧ A(¬f U g)))) ∧ (g ⇒ A(g U (¬g ∧ A(¬g U f))))

This states that when f is valid, it may not again become valid until g has been valid, and vice versa.

3.6.4 Constraints and constraint generation

In most formal verification approaches, verifying a design results in just a yes or no answer. And of course, when a system is under design, the answer is mostly no. Here a method is proposed to extend the CTL model checker to say more than just no, by suggesting possibilities on how to correct the design, so that the verification will yield a positive answer. Note that this is more than the error diagnosis done for instance in [41] and [47]. In those references, the error diagnosis consists only of producing a counterexample. The method presented here can also be used to find counterexamples, but it shows in addition how the system description can be changed to remove the validity of these counterexamples, thereby removing faulty behaviors.

To make the CTL model checker return more than only an affirmative or negative answer, the following types of constraints are considered:

- On the state transition level, given a temporal logic formula f, eliminate the state transition between states si and sj if state sj is the root of a (sub)state space that does not satisfy the requirement f.
- If the state transition graph is computed from another system description model, e.g. a data flow graph, a constraint on this level of system description can be introduced to eliminate the execution of the violating transition. The result is adding a timing constraint by means of an additional sequence edge.

An illustration of a constraint on the data flow graph level is given in figure 3.28.

The execution of node v₁ from state s₁ leads to state s₂, which belongs to a correctly behaving subset of the state space, whereas the execution of node v₂ leads to an incorrectly behaving state s₃. But adding the constraint that node v₁ must be executed before node v₂ yields a state space without this incorrect subset.

Different heuristics may be used to find the cause of the non-validity of a certain formula and to generate a constraint for it. For example, constraints can be generated as soon as a cause is found for the non-validity. It is also possible to postpone the generation of a constraint by finding constraints recursively. Assume the state transition graph of figure 3.29, and the CTL formula AX AF f, which is not valid


Figure 3.28 Example of a constraint

Figure 3.29 Example of a model

in state s₁, where AF f = A(true U f). Eliminating the transition between s₁ and s₃ makes AX AF f valid. However, it can also be made valid by the elimination of the transition between s₃ and s₅.

Definition 3.33 The constraint generating functions why-not and why for a CTL formula f and a state s may be defined as:
why-not(f, s) =
- Case f is an atomic formula:
No constraint can be generated for this case.

- Case f = ¬g: why(g, s)
- Case f = g ∧ h: Here the following cases can be distinguished:

1. g is valid and h is not valid: why-not(h, s)

2. The dual case of validity of h and non-validity of g: why-not(g, s)
3. Both g and h are not valid: merge the constraints of why-not(g, s) and why-not(h, s).
- Case f = AXg:
If f is not valid, there is at least one next state sᵢ of s at which ¬g is valid. Generate the constraint by skipping these states sᵢ and/or merging all the constraints of why-not(g, sᵢ).
- Case f = EXg:
Combine the constraints of why-not(g, sᵢ) of all next states sᵢ.
- Case f = A(gUh) = h ∨ (g ∧ AX A(gUh)):
Combine the constraints of why-not(h, s) and the constraints of skipping the next state(s) sᵢ that violate A(gUh) and/or why-not(A(gUh), sᵢ) for all those next states sᵢ. Possibly also combined with why-not(g, s).
- Case f = E(gUh) = h ∨ (g ∧ EX E(gUh)):
Combine the constraints of why-not(h, s) and/or why-not(E(gUh), sᵢ) for all next states sᵢ.

why(f, s) =
- Case f = ¬g: why-not(g, s)
- Case f = g ∧ h: Merge the constraints of why(g, s) and why(h, s)
- Case f = AXg = ¬EX¬g: why-not(EX¬g, s)
- Case f = EXg = ¬AX¬g: why-not(AX¬g, s)
- Case f = A(gUh) = h ∨ (g ∧ AX A(gUh)):
The following cases can be distinguished:

1. h is valid: why(h, s)
2. h is not valid, thus g ∧ AX A(gUh) is valid: merge the constraints of why(g, s) and why-not(EX¬A(gUh), s).
Of course these two cases may also be combined.
- Case f = E(gUh) = h ∨ (g ∧ EX E(gUh)):
Here also two cases exist, which may again be combined:
1. h is valid: why(h, s)
2. h is not valid, thus g ∧ EX E(gUh) is valid: merge the constraints of why(g, s) and why-not(AX¬E(gUh), s).

Note the difference between merging and combining constraints. The cases for A(gUh) and E(gUh) may go into recursion because of their 'now-and-next' characteristics. By using the same method as for model checking, i.e. to check subformulae first, the recursion may stop in the same manner as for model checking.

It is clear that no constraints can be found for propositional formulae. Therefore, the temporal cases should include a check whether their arguments are propositional. I.e. the recursion to find constraints should stop one level above the propositional formulae. Or, stated differently, if the recursive call cannot find a constraint, then the path to those next states should be eliminated. This approach results in a set of constraints under which the system will satisfy its requirements. These constraints can be given back to the designer or to another program to select the best constraints, i.e. choosing non-contradictory constraints which can be implemented most easily.

3.6.5 Discussion

This method for constraint generation is not limited to data flow graphs and model checking with CTL. It is also applicable to other behavioral models, for example finite state machines and Petri nets. In principle, it is applicable to all state based formalisms, since the semantics of CTL is based on Kripke structures (labeled state transition graphs) [64]. But it can also be applied to other, more powerful, verification methods than CTL model checking, for example model checking in the propositional µ-calculus as it is used for CCS and CSP [34, 45, 57]. Just as CTL, the µ-calculus is a branching modal logic.

Here only the technique of constraint generation is hinted at. It should be investigated further how the 'best' constraints out of the generated set of all possible

constraints can be selected. Also, heuristics should be found, since it is probably not necessary to enumerate all constraints.

It is possible that by adding a sequence edge in the flow graph, a much larger subspace than the constraint generation suggested is not reachable anymore. It is also possible that other requirements will no longer be valid. So care should be taken, since only suggestions are given. It is of course possible

to first compute why(f, s₀) to obtain some minimal set of transitions that may not be eliminated.
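To make the recursive structure of definition 3.33 concrete, the following is a minimal sketch in Python of a few cases of why-not and why. The representations are assumptions made purely for illustration: formulae are nested tuples, succ maps a state to its next states, valid(f, s) is a labeling assumed to be computed by a previous model checking pass, and a constraint is a set of transitions (s, s′) to be eliminated; none of these names come from this thesis, and the until cases are omitted.

    # Minimal sketch (assumed representations): a formula is a nested tuple,
    # e.g. ('AX', g) or ('and', g, h); succ[s] is the set of next states;
    # valid(f, s) is assumed to be precomputed by a model checking pass.

    def why_not(f, s, succ, valid):
        """Suggest transitions whose elimination could make f valid in state s."""
        op = f[0]
        if op == 'atomic':
            return set()                      # no constraint can be generated
        if op == 'not':
            return why(f[1], s, succ, valid)
        if op == 'and':
            g, h = f[1], f[2]                 # merge constraints of the invalid conjuncts
            cons = set()
            if not valid(g, s):
                cons |= why_not(g, s, succ, valid)
            if not valid(h, s):
                cons |= why_not(h, s, succ, valid)
            return cons
        if op == 'AX':
            g = f[1]                          # skip the next states violating g
            return {(s, t) for t in succ[s] if not valid(g, t)}
        if op == 'EX':
            cons = set()                      # combine constraints of all next states
            for t in succ[s]:
                cons |= why_not(f[1], t, succ, valid)
            return cons
        raise NotImplementedError(op)         # A(gUh), E(gUh): analogous recursion

    def why(f, s, succ, valid):
        """Dual: transitions that must be kept for f to remain valid in state s."""
        op = f[0]
        if op == 'not':
            return why_not(f[1], s, succ, valid)
        if op == 'and':
            return why(f[1], s, succ, valid) | why(f[2], s, succ, valid)
        if op == 'AX':                        # AXg = ¬EX¬g
            return why_not(('EX', ('not', f[1])), s, succ, valid)
        return set()                          # propositional: nothing to protect

Merging is here simply set union; a real implementation would keep alternative constraint sets apart, so that the designer can choose among them.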

Of course, CTL model checking is not the only possible way of formal verification. It is often used because CTL model checking is efficient, whereas model checking for linear time temporal logic is exponential [84], although writing down linear time temporal logic specifications is more natural. Normally, writing down CTL specifications, if not in disguise as in section 3.6.3, already makes one think in terms of state transition graphs. And in that case, using direct automaton based formalisms is more natural. Since the semantics of temporal logic is based on a state transition graph, these two formalisms can be linked together. It is shown in [135, 136] that propositional linear time temporal logic is contained in the class of ω-regular expressions [32, 98]. This is based on a tableau-like decision procedure from which a Büchi automaton is extracted. From then on, papers have appeared showing how certain properties, described as temporal logic formulae, can be stated as Büchi automata, for instance [11, 92]. However, transformations from temporal logic formulae onto Büchi automata all suffer from the drawback that deterministic Büchi automata are not closed under complementation [32], which puts a burden on the complexity [118].

Another approach to map temporal logical formulae onto an equivalent automaton is presented in [70]. In this reference, a syntax directed transformation of linear time temporal logic formulae into Muller automata is given. Model checking can then be transformed as well, as in [84], by first constructing the finite state automaton for the formula, and then checking containment. It is also possible to devise a constraint generation for this approach. It gives alternative ways to check satisfiability and tautology; even more, it allows temporal logic and state-transition based formalisms to be mixed freely for specification, verification [65] as well as manipulation. This applies to the specification-implementation paradigm, too, since high level specifications are synthesized into finite state machines. The register-transfer level is in principle at this level of abstraction.

3.7 Graph transformations

In many examples, a flow graph has been optimized by replacing a merge node by a choice edge without changing the behavior of the data flow graph. This is for instance allowed when there is a pairing branch node that is controlled by the same control edge and which is deterministic. I.e. it should be guaranteed that the data is always available at that (data) input port that matches the value on the control port (cf. the first few lemmata of section 2.7.3). This is the same principle as that of allocation in high level synthesis (Section 3.3). There, also, some outgoing edges of nodes are combined into one without introducing merge nodes, i.e. (de)multiplexors in the hardware, since for a proper allocation it is guaranteed that the data generation for this edge is mutually exclusive.

Other graph transformations that are used frequently in high level synthesis are the dragging forward and backward of branch and merge nodes. This is done in high level synthesis to reduce the number of multiplexors and demultiplexors. It is done there implicitly, since scheduling gives a control graph so that the allocation phase is performed on the operator nodes of the partitioned flow graph only. But formally, all branch and merge nodes (i.e. the complete control graph) need to be viewed. An example is given in figure 3.30; with the semantics presented here, it can be verified that this optimization is behavior preserving.

But apart from these optimizations many other graph transformations are possible that do not change the behavior of a data flow graph. An example is given by the expansion of a node by a well-behaved graph, or the reduction of a well-behaved subgraph, as defined in section 2.7.1. Possible types of other valid graph transformations are those that are found in many software compilers and other flow graph optimizers [75, 132]. Examples are:
- Dead code elimination
- Constant propagation
(a small sketch of these first two is given after this list)
- Strength reduction
- Basis rule
Dependent on the underlying (value) domain, for example, inputs of associative operator nodes can be rearranged. Another example is tree balancing for such expressions.

Figure 3.30 Example of a valid graph transformation by moving merge nodes

- Substitution rule
For example the duplication of (sub)graphs for each destination of a multi-destination edge.
- Calling rule
For example the expansion of (hierarchical) nodes, as discussed in section 2.7.1.
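As an illustration of the dead code elimination and constant propagation rules above, here is a minimal sketch in Python. The graph representation (a dict mapping node ids to (operation, inputs) pairs, with only an 'add' operator handled) is an assumption for illustration only and is not the formalism of this thesis; these are simply the classical compiler versions of the rules [75, 132].

    # Minimal sketch on an assumed representation: nodes is a dict mapping
    # a node id to ('const', value) or to (operation, [input node ids]).

    def constant_propagation(nodes):
        """Replace an operator node by a constant when all its inputs are constants."""
        changed = True
        while changed:
            changed = False
            for nid, node in list(nodes.items()):
                if node[0] == 'add' and all(nodes[i][0] == 'const' for i in node[1]):
                    nodes[nid] = ('const', sum(nodes[i][1] for i in node[1]))
                    changed = True
        return nodes

    def dead_code_elimination(nodes, outputs):
        """Keep only the nodes on which some graph output (transitively) depends."""
        live, stack = set(), list(outputs)
        while stack:
            nid = stack.pop()
            if nid in live:
                continue
            live.add(nid)
            if nodes[nid][0] != 'const':
                stack.extend(nodes[nid][1])      # visit the input nodes
        return {nid: n for nid, n in nodes.items() if nid in live}

    # Example: with nodes = {'a': ('const', 2), 'b': ('const', 3),
    # 'c': ('add', ['a', 'b']), 'd': ('add', ['a', 'a'])},
    # constant_propagation turns 'c' into ('const', 5), and
    # dead_code_elimination(nodes, ['c']) then drops the unused node 'd'.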

3.8 Final remarks

3.8.1 Specification-implementation paradigm

In section 3.3, it is proved that high level synthesis is just the application of some graph homomorphisms. Of course, each flow graph homomorphism φ on a data flow graph G induces a homomorphism φ′ on the reachability graph RG(G). But

RG(φ(G)) = φ′(RG(G)),    (3.11)

or even the weaker statement

φ′(RG(G)) ≤_RG RG(φ(G)),    (3.12)

is not valid in general, except when φ has certain properties. Such a property for φ may be that φ must be proper in the sense that φ(G) is as deterministic as G is, e.g. only mutually exclusive choice may be introduced. Section 2.7.3 is already a start for this quest of useful properties for which equations (3.11) and (3.12) are valid.

Related to this is the following. The specification-implementation paradigm for designing systems states that a specification S is refined into an implementation D. This may be done in more than one step, each step leading to an implementation Dᵢ. All these implementations Dᵢ should satisfy the specification, but each implementation shows more detail. For data flow graphs this might be interpreted as (possibly after some kind of projection):

(∀Dᵢ: Behavior(Dᵢ) = Behavior(S)) ∧ (∀i: RG(Dᵢ₊₁) ≤_RG RG(Dᵢ))

I.e. a partial order ≤ (maybe a preorder is already sufficient) may be defined. Then it is attractive to have a theory T, e.g. some kind of logic deduction scheme, to prove properties of flow graphs that are consistent with respect to this ordering [107] (see also section 3.6). This means that

S ⊨_T f ∧ D ≤ S ⇒ D ⊨_T f

should be valid.

3.8.2 Non-standard semantics

Of course, apart from the semantics defined in chapter 2, non-standard semantics may also be defined for the analysis and verification of data flow graphs. A frequently referenced example of non-standard semantics is type checking. The necessity of type checking is illustrated by results of the CHEOPS project (ESPRIT Basic Research Action 3215) for the Cathedral silicon compiler [90]. Modules were designed independently from each other from a specification, but the types did not match at the module boundaries in the final result.

More advanced semantics may be defined, e.g. (logic) predicate propagation. Instead of having the nodes of the flow graph operate on streams of values, predicates may be used. Nothing has to be changed for such a semantics; only a new valuation function for the nodes must be devised. A node is then just a predicate transformer.

The simplest of such semantics is the token based semantics. For this, the value of the data items on the data flow edges is of no importance; only the absence and/or presence of data is important. The result is then a real Petri net-like semantics. Such a semantics might be very useful for performance evaluations [61, 105].

Another example of a predicate transforming semantics is a kind of data flow analysis (as already suggested in section 3.5) that checks the use of identical variables in parallel states. The specific type of verification of the data flow graph may suggest more complicated predicates. An example of such a verification is a more global one that verifies the data flow graph for all integers in one pass. With such a non-standard semantics, the concept of a reachability graph is still useful. States are then labeled differently, but model checking for such a special type of logic might be possible.
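As a hint of how little the token based semantics needs, here is a minimal sketch in Python. The representation (edges carrying token counts, nodes with input and output edge lists) is an assumption for illustration only and is not the valuation function machinery of chapter 2.

    # Token-based (Petri net-like) valuation sketch: data values are ignored,
    # only the presence or absence of tokens on the input edges matters.

    def enabled(node, tokens, in_edges):
        """A node may fire when every input edge holds at least one token."""
        return all(tokens[e] > 0 for e in in_edges[node])

    def fire(node, tokens, in_edges, out_edges):
        """Firing consumes one token per input edge, produces one per output edge."""
        assert enabled(node, tokens, in_edges)
        for e in in_edges[node]:
            tokens[e] -= 1
        for e in out_edges[node]:
            tokens[e] += 1
        return tokens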

3.8.3 Timing

Sequence edges are introduced to impose extra precedence relations in the data flow graph. Explicit delay values can also be incorporated very easily. One possibility is to specify the propagation delay for every node. This has the disadvantage that the execution of a node is not instantaneous: input values of a node disappear, and the output values appear some time later. This is the method used in timed Petri nets [114]. Similarly, a delay can be associated with the edges [116]. A more general method is the one used for time Petri nets [19, 20, 67, 99, 100]. In this approach an interval [t₁, t₂] is associated with each node. After the node is enabled, it must fire in this interval. I.e. t₁ is the hold time, the time the values must at least be present before the node is fired after getting enabled; t₂ is the maximal delay after which a node has to execute. This method includes the previous time methods [20, 67]. It also includes the non-timed data flow graphs, by setting the interval to [0, ∞) for all nodes.

3.8.4 Arrays

Arrays are not dealt with here, since they do not contribute substantially to the theory of data flow graphs. Two possibilities exist to 'implement' arrays in data flow graphs:
1. An array can be viewed as a collection of individual elements, i.e. as a set of independent variables and thus values. This allows for many optimizations [123]. This view can be used successfully for small arrays that are used throughout some program, and it might then eliminate or reduce the use of on-chip memory and registers.
2. An array can be viewed as an entity, i.e. as a single value. This view is used mostly when the array is implemented as a global variable, e.g. as background memory, and this decision is mostly already taken before synthesis is started. In this case, synthesis should try to optimize the address calculations and the number of memory accesses [85]. This is part of the data flow graph itself, and therefore the theory presented here applies.

3.8.5 Implementation

Nothing is said about implementation in this thesis. Only one algorithm is given, which describes the construction of the anticipated reachability graph. This algorithm has been implemented in a very straightforward way, together with the verification based on CTL model checking. As far as state transition graphs and any kind of logic are concerned for verification and analysis, even for non-standard semantics, implementations may use Binary Decision Diagrams (BDDs) [10, 27], which are nowadays viewed as the only possible efficient way to solve almost all problems having a huge state space, including CTL model checking [29, 36]. The titles of these papers are slightly misleading, since the numbers are the sizes of the complete state spaces, and not of the reachable state space. And since many of these examples are parameterized, other techniques, for instance inductive ones, should be used. With BDDs, there is always the problem of ordering the variables and, more importantly (when they are also used for state transition graphs), of a good state encoding. Although some results are known for a good encoding to obtain a small final BDD, the intermediate BDDs might still "blow up", i.e. exceed available memory [46].

3.8.6 Using other formalisms instead

In this thesis, data flow graphs are extended with choice edges, which are included in well-formed flow graphs. Another interesting extension would be the inclusion of conflict.
In principle this would yield the same type of formalism as colored or high-level Petri nets [50, 66]. Most of the theory presented here can be extended to such Petri nets, as already mentioned in some paragraphs. The main reason to extend pure data flow graphs with choice edges was to incorporate the network graph that is constructed from a flow graph by high level synthesis. The general concept of the multi-origin and multi-destination edges stems from the nets, or wires, in networks, or circuits. Since conflict does not emanate as naturally from networks, it is not included in this thesis. Although the branch nodes represent a type of conflict, the inclusion of conflict would enlarge the expressiveness of the formalism in such a way that 'backward branching' is included in the behavior of graphs, whereas choice only yields 'forward' branching.

A major drawback of the theory presented here is that it is mainly based on the reachability graph. The reachability graph shows explicitly all the different possible execution orders, information that is encoded concisely in the flow graph itself. It would therefore be more advantageous to use a formalism and a theory that is based directly on this partial ordering. One approach could be the theory of pomsets (partially ordered multisets) [96, 113, 134].

Definition 3.34 A partially ordered multiset, or shortly a pomset, is the isomorphism class of a labeled partial order (P, Σ, ≤, λ), where ≤ is a partial order on P, and λ: P → Σ is a labeling function.⁵

5. The labeling function is used to eliminate multiple occurrences of the same symbol.
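A minimal sketch of a concrete labeled partial order per definition 3.34, in Python (the pomset itself would be its isomorphism class; the class name and fields are illustrative only):

    # A labeled partial order (P, Σ, ≤, λ); a pomset is its isomorphism class.
    class LabeledPartialOrder:
        def __init__(self, points, order, labeling):
            self.points = points          # the carrier set P
            self.order = order            # the set of pairs (p, q) with p ≤ q
            self.labeling = labeling      # λ: P → Σ; a symbol may label several points

        def before(self, p, q):
            """p ≤ q in the partial order."""
            return (p, q) in self.order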

Of course, the proofs using such a formalism should not become undecidable. And it is not very useful if the proofs are "exponentially" difficult. But probably the complexity remains within bounds, since the anticipated reachability graph has already been shown to be powerful. Loops in the flow graph are of course a difficult issue when modeling a flow graph by means of a pomset. Unwinding all loops is clearly not the solution. Related to the use of pomsets is the possibility to devise a theory based on the behavior expression semantics as defined in definition B.18, using the compositional graph syntax of section 2.2.

References

[1] Program Flow Analysis: Theory and Applications, ed. S.S. Muchnick and N.D. Jones, Prentice-Hall Inc., Englewood Cliffs, NJ, 1981.
[2] VHDL Language Reference Manual, Standard 1076, IEEE, 1987.
[3] LOTOS, a Formal Description Technique Based on Temporal Ordering of Observational Behavior, IS 8807, International Organization for Standardization, Information Processing Systems, Open Systems Interconnection, 1987.
[4] Estelle, a Formal Description Technique Based on an Extended State Transition Model, IS 9074, International Organization for Standardization, Information Processing Systems, Open Systems Interconnection, 1987.
[5] W.B. ACKERMAN, "Data Flow Languages", IEEE Computer, vol. 15, no. 2, February 1982, pp. 15-25.
[6] A.V. AHO, M.R. GAREY, AND J.D. ULLMAN, "The Transitive Reduction of a Directed Graph", SIAM J. Comput., vol. 1, no. 2, pp. 131-137.
[7] A.V. AHO, J.E. HOPCROFT, AND J.D. ULLMAN, Data Structures and Algorithms, Addison-Wesley, Reading, MA, 1983.

[8] A.V. AHO, R. SETHI, AND J.D. ULLMAN, Compilers: Principles, Techniques and Tools, Addison-Wesley, Reading, MA, 1986.

[9] A.V. AHO AND J.D. ULLMAN, "Optimization of Straight Line Programs", SIAM J. Comput., vol. 1, no. 1, pp. 1-19.
[10] S.B. AKERS, "Binary Decision Diagrams", IEEE Trans. on Computers, vol. C-27, June 1978, pp. 509-516.
[11] B. ALPERN AND F.B. SCHNEIDER, "Verifying Temporal Properties without Temporal Logic", ACM Trans. on Programming Languages and Systems, vol. 11, no. 1, January 1989, pp. 147-167.
[12] E.A. BAARS, Het omzetten van commands in Petri netten, Training report, Eindhoven University of Technology, Eindhoven, the Netherlands, 1990.
[13] H.P. BARENDREGT, The Lambda-Calculus: Its Syntax and Semantics, North-Holland, Amsterdam, 1985.
[14] J.A. BERGSTRA AND J.W. KLOP, "Algebra of Communicating Processes with Abstraction", Theoretical Computer Science, vol. 37, 1985, pp. 77-121.
[15] C.H. VAN BERKEL, Handshake Circuits: an Intermediary between Communicating Processes and VLSI, Ph.D. Thesis, Eindhoven University of Technology, May 1992.
[16] C.H. VAN BERKEL AND R.W.J.J. SAEIJS, "Compilation of Communicating Processes into Delay-Insensitive Circuits", Proc. of the IEEE Int. Conf. on Computer Design, Rye Brook, October 3-5, 1988, pp. 157-162.
[17] K. VAN BERKEL, J. KESSELS, M. RONCKEN, R. SAEIJS, AND F. SCHALIJ, "The VLSI Programming Language Tangram and its Translation into Handshake Circuits", Proc. of the European Design Automation Conference, Amsterdam, February 26-28, 1991, pp. 384-389.
[18] M.R.C.M. BERKELAAR AND J.F.M. THEEUWEN, "Real Area-Power-Delay Trade-off in the Euclid Logic Synthesis System", Proc. Custom Integrated Circuits Conference (CICC), Boston, May 13-16, 1990.
[19] B. BERTHOMIEU AND M. DIAZ, "Modeling and Verification of Time Dependent Systems using Time Petri Nets", IEEE Trans. on Software Engineering, vol. SE-17, no. 3, March 1991, pp. 259-273.
[20] B. BERTHOMIEU AND M. MENASCHE, "An Enumerative Approach for Analyzing Time Petri Nets", Proc. of the 9th IFIP Congress, ed. R. Mason, North-Holland, Paris, September 1983, pp. 41-46.

[21] E. BEST AND C. FERNANDEZ, "Notations and Terminology on Petri Net Theory", Petri Net Newsletter, no. 23, April 1986, pp. 21-46.
[22] G.V. BOCHMANN, "Hardware Specification with Temporal Logic: an Example", IEEE Trans. on Computers, vol. C-31, no. 3, March 1982, pp. 223-231.
[23] R.S. BOYER AND J.S. MOORE, A Computational Logic, ACM Monograph Series, Academic Press, 1979.
[24] A. BRATSCH, H. EVEKING, H.-J. FAERBER, M. KELELATCHEW, J. PINDER, AND U. SCHELLIN, "LOVERT - A Logic Verifier of Register Transfer Level Descriptions", Formal VLSI Correctness Verification, VLSI Design Methods-II, ed. L.J.M. Claesen, North-Holland, 1990, pp. 247-256.
[25] R.K. BRAYTON, R. CAMPOSANO, G. DE MICHELI, R.H.J.M. OTTEN, AND J.T.J. VAN EIJNDHOVEN, "The Yorktown Silicon Compiler System", Silicon Compilation, ed. D.D. Gajski, Addison-Wesley, 1988, pp. 204-310.
[26] R.E. BRYANT, "Symbolic Verification of MOS Circuits", Proc. of the 1985 Chapel Hill Conf. on VLSI, May 1985, pp. 419-438.
[27] R.E. BRYANT, "Graph-Based Algorithms for Boolean Function Manipulation", IEEE Trans. on Computers, vol. C-35, no. 8, August 1986, pp. 677-691.
[28] R.E. BRYANT, "Boolean Analysis of MOS Circuits", IEEE Trans. on Computer Aided Design, vol. CAD-6, no. 4, July 1987, pp. 634-649.

[29] J.R. BURCH, E.M. CLARKE, K.L. MCMILLAN, D.L. DILL, AND L.J. HWANG, "Symbolic Model Checking: 10²⁰ States and Beyond", Proc. of the 5th IEEE Symposium on Logic in Computer Science, Philadelphia, 1990, pp. 428-439.
[30] R.H. CAMPBELL AND A.N. HABERMANN, "The Specification of a Process Synchronization by Path Expressions", Operating Systems, ed. E. Gelenbe and C. Kaiser, Lecture Notes in Computer Science 16, Rocquencourt, April 23-25, 1974, pp. 89-102.

[31] P. CAMURATI AND P. PRINETTO, "Formal Verification of Hardware Correctness: Introduction and Survey of Current Research", IEEE Computer, vol. 21, no. 7, July 1988, pp. 8-19.

[32] Y. CHOUEKA, "Theories of Automata on ω-Tapes: a Simplified Approach", J. Comput. System Sci., vol. 8, 1974, pp. 117-141.

[33] E.M. CLARKE, E.A. EMERSON, AND A.P. SISTLA, "Automatic Verification of Finite-State Concurrent Systems using Temporal Logic Specifications", ACM Trans. on Programming Languages and Systems, vol. 8, no. 2, April 1986, pp. 244-263.
[34] R. CLEAVELAND AND B. STEFFEN, "A Linear-Time Model-Checking Algorithm for the Alternation-Free Modal Mu-Calculus", Proc. of the 3rd Workshop on Computer Aided Verification, Lecture Notes in Computer Science 575, Aalborg, Denmark, July 1-4, 1991, pp. 48-58.
[35] F. COMMONER, A.W. HOLT, S. EVEN, AND A. PNUELI, "Marked Directed Graphs", J. Comput. System Sci., vol. 5, 1971, pp. 511-523.
[36] O. COUDERT, J.C. MADRE, AND C. BERTHET, "Verifying Temporal Properties of Sequential Machines without Building their State Diagrams", Proc. of the Workshop on Computer Aided Verification, June 18-21, 1990, New Brunswick, NJ, pp. 23-32.
[37] A.L. DAVIS AND R.M. KELLER, "Data Flow Program Graphs", IEEE Computer, vol. 15, no. 2, February 1982, pp. 26-41.
[38] C. DELGADO KLOOS, Semantics of Digital Circuits, Lecture Notes in Computer Science 285, Springer Verlag, Berlin, 1987.
[39] J.B. DENNIS, "First Version of a Data Flow Procedure Language", Proc. Colloque sur la Programmation, Lecture Notes in Computer Science 19, Paris, April 9-11, 1974, pp. 362-376.
[40] P. DENYER, "SAGE - A User Directed Synthesis Tool", Proc. of the ASCIS Open Workshop on Synthesis Techniques for (Lowly) Multiplexed Datapaths, Leuven, Belgium, August 1990.
[41] D.L. DILL AND E.M. CLARKE, "Automatic Verification of Asynchronous Circuits using Temporal Logic", Proc. of the 1985 Chapel Hill Conf. on VLSI, pp. 127-143.
[42] D. DRUSINSKY AND D. HAREL, Statecharts as an Abstract Model for Digital Control Units, Internal Report CS86-12, Rehovot, Israel, May 1986.

[43] D.F. VAN EGMOND, Decomposition of Demand Graphs, Training report, Eindhoven University of Technology, 1990.

[44] J.T.J. VAN EIJNDHOVEN AND L. STOK, "A Data Flow Graph Exchange Standard", Proc. of the European Design Automation Conference, Brussels, March 16-19, 1992, pp. 193-199.

[45] E.A. EMERSON AND C.L. LEI, "Efficient Model Checking in Fragments of the Propositional Mu-Calculus", Proc. First Annual Symposium on Logic in Computer Science, Cambridge, MA, June 16-18, 1986, pp. 267-278.
[46] R. ENDERS, T. FILKORN, AND D. TAUBNER, "Generating BDDs for Symbolic Model Checking in CCS", Proc. of the 3rd Workshop on Computer Aided Verification, Lecture Notes in Computer Science 575, Aalborg, Denmark, July 1-4, 1991, pp. 203-213.
[47] J.-C. FERNANDEZ AND L. MOUNIER, "Verifying Bisimulations on the Fly", Proc. 3rd Int. Conf. on Formal Description Techniques (FORTE), North-Holland, Madrid, November 1990, pp. 91-105.
[48] A.J. FIELD AND P.G. HARRISON, Functional Programming, Addison-Wesley, Amsterdam, 1988.
[49] R.W. FLOYD, "Assigning Meaning to Programs", Proc. of the Symposium in Applied Mathematics of the AMS, Providence, RI, 1967, pp. 19-32.
[50] H.J. GENRICH AND K. LAUTENBACH, "System Modeling with High-Level Petri Nets", Theoretical Computer Science, vol. 13, 1981, pp. 109-136.
[51] P. GODEFROID, "Using Partial Orders to Improve Automatic Verification Methods", Proc. of the 2nd Workshop on Computer Aided Verification, Lecture Notes in Computer Science 531, New Brunswick, NJ, June 18-21, 1990, pp. 176-185.

[52] P. GODEFROID AND P. WOLPER, "Using Partial Orders for the Efficient Verification of Deadlock Freedom and Safety Properties", Proc. of the 3rd Workshop on Computer Aided Verification, Lecture Notes in Computer Science 575, Aalborg, Denmark, July 1-4, 1991, pp. 332-342.
[53] M.J.C. GORDON, "HOL: a Proof Generating System for Higher-Order Logic", VLSI Specification, Verification and Synthesis, ed. G. Birtwistle and P.A. Subrahmanyam, Kluwer Academic, Dordrecht, 1988, pp. 73-128.
[54] B. GRAHAM, T. SIMPSON, K. SLIND, AND M. WILLIAMS, "Verifying an SECD Chip in HOL", Formal VLSI Correctness Verification, VLSI Design Methods-II, ed. L.J.M. Claesen, North-Holland, 1990, pp. 369-378.
[55] D. GRIES, A.J. MARTIN, J.L.A. VAN DE SNEPSCHEUT, AND J.T. UDDING, "An Algorithm for Transitive Reduction of an Acyclic Graph", Science of Computer Programming, vol. 12, 1989, pp. 151-155.

[56] M.S. HECHT AND J.D. ULLMAN, "Flow Graph Reducibility", SIAM J. Computing, vol. 1, no. 2, June 1972, pp. 188-202.
[57] M. HENNESSY, "Observing processes", Linear Time, Branching Time and Partial Order in Logics and Models for Concurrency, ed. J.W. de Bakker, W.-P. de Roever and G. Rozenberg, Lecture Notes in Computer Science 354, Noordwijkerhout, the Netherlands, May-June 1988, pp. 173-200.
[58] P.N. HILFINGER, SILAGE: a Language for Signal Processing, University of California, Berkeley, 1984.
[59] C.A.R. HOARE, "Towards a Theory of Parallel Programming", Operating Systems Techniques, ed. C.A.R. Hoare and R.H. Perrott, Academic Press, New York, 1972, pp. 61-71.
[60] C.A.R. HOARE, Communicating Sequential Processes, ed. C.A.R. Hoare, Prentice-Hall International, Englewood Cliffs, NJ, 1985.
[61] M.A. HOLIDAY AND M.K. VERNON, "A Generalized Timed Petri Net Model for Performance Analysis", IEEE Trans. on Software Engineering, vol. SE-13, no. 12, December 1987, pp. 1297-1310.
[62] G.J. HOLZMANN, "Validating SDL Specifications: an Experiment", Proc. 9th Int. Symposium on Protocol Specification, Testing and Verification, North-Holland, Enschede, the Netherlands, June 6-9, 1991, pp. 317-326.
[63] G.J. HOLZMANN, "Algorithms for Automated Protocol Validation", Proc. of the Workshop on Automatic Verification Methods for Finite State Systems (CAV), Lecture Notes in Computer Science 407, Grenoble, June 12-14, 1991.
[64] G.E. HUGHES AND M.J. CRESSWELL, An Introduction to Modal Logic, Methuen and Co., 1977.
[65] G.L.J.M. JANSSEN, "Hardware Verification using Temporal Logic: a Practical View", Formal VLSI Correctness Verification, VLSI Design Methods-II, ed. L.J.M. Claesen, North-Holland, 1990, pp. 159-168.
[66] K. JENSEN, "Coloured Petri Nets", Petri Nets: Central Models and Their Properties, ed. W. Brauer, Lecture Notes in Computer Science 254, Bad Honnef, Germany, September 1986, pp. 248-299.
[67] G.G. DE JONG, High Level Verification of (A)synchronous Circuit Descriptions, Master Thesis, Eindhoven University of Technology, December 1987.

[68] G.G. DE JONG, "Verification of Data Flow Graphs using Temporal Logic", Formal VLSI Correctness Verification, VLSI Design Methods-II, ed. L.J.M. Claesen, North-Holland, 1990, pp. 169-178.
[69] G.G. DE JONG, "Data Flow Graphs: System Specification with the Most Unrestricted Semantics", Proc. of the European Design Automation Conference, Amsterdam, February 26-28, 1991, pp. 401-405.
[70] G.G. DE JONG, "An Automata Theoretic Approach to Temporal Logic", Proc. of the 3rd Workshop on Computer Aided Verification, Lecture Notes in Computer Science 575, Aalborg, Denmark, July 1-4, 1991, pp. 477-487.
[71] J. JOYCE, "Formal Verification and Implementation of a Microprocessor", VLSI Specification, Verification and Synthesis, ed. G. Birtwistle and P.A. Subrahmanyam, Kluwer Academic, Dordrecht, 1988, pp. 129-157.
[72] J. KATZENELSON AND R.P. KURSHAN, "S/R: a Language for Specifying Protocols and Other Coordinating Processes", Proc. of the Fifth Ann. Int. Phoenix Conf. on Computers and Communications, Scottsdale, AZ, March 26-28, 1986, pp. 286-292.
[73] K.M. KAVI, B.P. BUCKLES, AND U.N. BHAT, "A Formal Definition of Data Flow Graph Models", IEEE Trans. on Computers, vol. C-35, no. 11, November 1986, pp. 940-948.
[74] K.M. KAVI, B.P. BUCKLES, AND U.N. BHAT, "Isomorphisms between Petri Nets and Dataflow Graphs", IEEE Trans. on Software Engineering, vol. SE-13, no. 10, October 1987, pp. 1127-1134.
[75] K. KENNEDY, "A Survey of Data Flow Analysis Techniques", Program Flow Analysis: Theory and Applications, ed. S.S. Muchnick and N.D. Jones, Prentice-Hall Inc., Englewood Cliffs, NJ, 1981, pp. 5-54.
[76] D.W. KNAPP AND A.C. PARKER, "A Unified Representation for Design Information", Proc. of the Int. Conf. on Computer Hardware Languages and Descriptions and their Applications, Tokyo, August 1985, pp. 337-353.
[77] TH. KROL, J. VAN MEERBERGEN, C. NIESSEN, W. SMITS, AND J. HUISKEN, "The Sprite Input Language: an Intermediate Format for High Level Synthesis", Proc. of the European Design Automation Conference, Brussels, March 16-19, 1992, pp. 186-192.
[78] TH. KROL, J. VAN MEERBERGEN, AND W. SMITS, Report on the Sprite Input Language, SIL-1, ESPRIT 2260 deliverable D.1.1/PHILIPS/

Y1-m12/2, 1989.
[79] R.P. KURSHAN AND K.L. MCMILLAN, "A Structural Induction Theorem for Processes", Proc. of the 8th Annual ACM Symposium on Principles of Distributed Computing, 1989, pp. 239-247.
[80] L. LAMPORT, ""Sometime" is sometimes "not never": on the Temporal Logic of Programs", Proc. 7th ACM Annual Symp. on Principles of Prog. Lang., Las Vegas, January 1980, pp. 174-185.
[81] P.E. LAUER AND R.H. CAMPBELL, "Formal Semantics of a Class of High-Level Primitives for Coordinating Concurrent Processes", Acta Informatica, vol. 5, 1974, pp. 297-332.
[82] TH. LENGAUER, Combinatorial Algorithms for Integrated Circuit Layout, John Wiley & Sons, Chichester, 1990.
[83] U. LICHTBLAU, "Decompilation of Control Structures by Means of Graph Transformations", Proc. of the Int. Joint Conf. on Theory and Practice of Software Development (TAPSOFT), Lecture Notes in Computer Science 185, Berlin, March 25-29, 1985, pp. 284-297.
[84] O. LICHTENSTEIN AND A. PNUELI, "Checking that Finite State Concurrent Programs Satisfy their Linear Specification", Proc. 12th ACM Symp. on Principles of Programming Languages, New Orleans, January 14-16, 1985, pp. 97-107.
[85] P.E.R. LIPPENS, J.L. VAN MEERBERGEN, A. VAN DER WERF, AND W.F.J. VERHAEGH, "PHIDEO: a Silicon Compiler for High Speed Algorithms", Proc. of the European Design Automation Conference, Amsterdam, February 26-28, 1991, pp. 436-441.
[86] J.L. LIS AND D.D. GAJSKI, "Synthesis from VHDL", Proc. Int. Conf. on Computer Design, Rye Brook, NY, October 3-5, 1988, pp. 378-381.
[87] P.F. LISTER AND A.M. ALHELWANI, "Data-flow Based Design of Self-Timed Systems", IEE Colloquium on VLSI Design Methodologies, IEE Digest, no. 41, April 1985, pp. 4/1-4/4.
[88] P.F. LISTER, C. ENG, AND A.M. ALHELWANI, "Design Methodology for Self-Timed VLSI Systems", Proc. of IEE, Part E, vol. 132, January 1985, pp. 25-33.
[89] J.C. MADRE AND J.P. BILLON, "Proving Circuit Correctness using Formal Comparison Between Expected and Extracted Behaviour", Proc. of the 25th

Design Automation Conference, Anaheim, June 12-15, 1988, pp. 205-210.
[90] H. DE MAN, F. CATTHOOR, G. GOOSSENS, J. VANHOOF, J. VAN MEERBERGEN, S. NOTE, AND J. HUISKEN, "Architecture-driven Synthesis Techniques for VLSI Implementation of DSP Algorithms", Proc. of the IEEE, vol. 78, February 1990, pp. 319-335.
[91] Z. MANNA AND A. PNUELI, "Verification of Concurrent Programs: the Temporal Framework", The Correctness Problem in Computer Science, ed. R.S. Boyer and J. Strother Moore, Academic Press, New York, 1981, pp. 215-273.
[92] Z. MANNA AND A. PNUELI, "Specification and Verification of Concurrent Programs by ω-Automata", Proc. 14th ACM Symp. on Principles of Programming Languages, Munich, January 21-23, 1987, pp. 1-12.
[93] A.J. MARTIN, "The Design of a Self-Timed Circuit for Distributed Mutual Exclusion", Proc. of the 1985 Chapel Hill Conf. on VLSI, pp. 247-260.
[94] A.J. MARTIN, "Compiling Communicating Processes into Delay-Insensitive VLSI Circuits", Distributed Computing, vol. 1, 1986, pp. 226-234.
[95] P. MARWEDEL, "The MIMOLA Design System: a Design System which Spans Several Levels", Methodologies of Computer System Design, ed. B.D. Shriver, North-Holland, 1985, pp. 223-237.
[96] A. MAZURKIEWICZ, "Compositional Semantics of Pure Place/Transition Systems", Fundamenta Informaticae, vol. XI, 1988, pp. 331-356.
[97] M.C. MCFARLAND, The Value Trace: a Data Base for Automated Digital Design, Master Thesis, CMU, Pittsburgh, December 1978.
[98] R. MCNAUGHTON, "Testing and Generating Infinite Sequences by a Finite Automaton", Information and Control, vol. 9, 1966, pp. 521-530.
[99] M. MENASCHE, "PAREDE: an Automated Tool for the Analysis of Time(d) Petri Nets", Proc. of the 1985 Int. Workshop on Timed Petri Nets, Torino, Italy, July 1-3, 1985, pp. 162-169.
[100] P. MERLIN, A Study of the Recoverability of Computing Systems, Ph.D. Thesis, University of California, Irvine, 1974.
[101] G. DE MICHELI AND D.C. KU, "HERCULES: a System for High Level Synthesis", Proc. of the 25th Design Automation Conference, Anaheim, June 12-15, 1988, pp. 483-488.

[102] G.J. MILNE AND R. MILNER, "Concurrent Processes and their Syntax", J. of the ACM, vol. 26, no. 2, April 1979, pp. 302-321.
[103] R. MILNER, "Flow Graphs and Flow Algebras", J. of the ACM, vol. 26, no. 4, October 1979, pp. 794-818.
[104] R. MILNER, A Calculus of Communicating Systems, Lecture Notes in Computer Science 92, Springer Verlag, Berlin, 1980.
[105] M.K. MOLLOY, "Performance Analysis using Stochastic Petri Nets", IEEE Trans. on Computers, vol. C-31, no. 9, September 1982, pp. 913-917.
[106] J.D. MORISON, N.E. PEELING, AND T.L. THORP, "ELLA: Hardware Description or Specification?", Proc. of the Int. Conf. on Computer Aided Design, Santa Clara, November 12-15, 1984, pp. 54-56.
[107] E.-R. OLDEROG, Nets, Terms and Formulas: Three Views of Concurrent Processes, ed. C.J. van Rijsbergen, Cambridge Tracts in Theoretical Computer Science 23, Cambridge University Press, Cambridge, 1991.
[108] B.M. PANGRLE, A Behavioural Compiler for Intelligent Silicon Compilation, Ph.D. Thesis, University of Illinois, Urbana-Champaign, 1987.

[109] P.G. PAULIN AND A. JERRAYA, "SIF: an Interchange Format for the Design and Synthesis of High-Level Controllers", Digest of the 5th High-Level Synthesis Workshop, Buhlerhohe, Germany, March 1991.
[110] J.L. PETERSON, Petri Net Theory and the Modelling of Systems, Prentice-Hall Inc., Englewood Cliffs, NJ, 1981.
[111] L. PIERRE, "The Formal Proof of Sequential Circuits described in CASCADE using the Boyer-Moore Theorem Prover", Formal VLSI Correctness Verification, VLSI Design Methods-II, ed. L.J.M. Claesen, North-Holland, 1990, pp. 309-328.
[112] A. PNUELI, "Applications of Temporal Logic to the Specification and Verification of Reactive Systems: a Survey of Current Trends", Current Trends in Concurrency: Overviews and Tutorials, ed. J.W. de Bakker, W.-P. de Roever and G. Rozenberg, Lecture Notes in Computer Science 224, pp. 510-584.
[113] V. PRATT, "Modeling Concurrency with Partial Orders", Int. J. of Parallel Programming, vol. 15, no. 1, 1986, pp. 33-71.

[114] C. RAMCHANDANI, Analysis of Asynchronous Concurrent Systems by Petri Nets, Ph.D. Thesis, Massachusetts Institute of Technology, July 1973.

[115] D.A. SCHMIDT, Denotational Semantics: a Methodology for Language Development, Allyn and Bacon, London, 1986.
[116] J. SIFAKIS, "Performance Evaluation of Systems using Nets", Proc. of the Advanced Course on General Net Theory of Processes and Systems, ed. W. Brauer, Lecture Notes in Computer Science 84, Hamburg, October 8-19, 1979, pp. 307-319.
[117] A.P. SISTLA AND E.M. CLARKE, "Complexity of Propositional Linear Time Logics", Journal of the ACM, vol. 32, no. 3, July 1985, pp. 733-749.
[118] A.P. SISTLA, M.Y. VARDI, AND P. WOLPER, "The Complementation Problem for Büchi Automata with Applications to Temporal Logic", Proc. 12th Int. Colloquium on Automata, Languages and Programming, Lecture Notes in Computer Science 194, Nafplion, Greece, July 1985, pp. 465-474.
[119] M.B. SMYTH, "Power Domains", J. Comput. System Sci., vol. 16, 1978, pp. 23-36.
[120] J.L.A. VAN DE SNEPSCHEUT, Trace Theory and VLSI Design, Lecture Notes in Computer Science 200, Springer Verlag, Berlin, 1985.
[121] G.L. STEELE JR., Common LISP: the Language, Digital Equipment Corporation, Burlington, 1984.
[122] L. STOK, Higher Levels of a Silicon Compiler, EUT Report 86-E-163, Eindhoven University of Technology, November 1986.
[123] L. STOK, Architectural Synthesis and Optimization of Digital Systems, Ph.D. Thesis, Eindhoven University of Technology, 1991.
[124] J.E. STOY, Denotational Semantics: the Scott-Strachey Approach to Programming Language Theory, MIT Press, London, 1977.
[125] I. SUZUKI AND T. MURATA, "A Method for Stepwise Refinement and Abstraction of Petri Nets", J. Comput. System Sci., vol. 27, no. 1, August 1983, pp. 51-76.
[126] R.E. TARJAN, "Testing Flow Graph Reducibility", J. Comput. System Sci., vol. 9, no. 3, 1974, pp. 355-365.

[127] J.P. TREMBLAY AND R. MANOHAR, Discrete Mathematical Structures with Applications to Computer Science, McGraw Hill, New York, 1975.
[128] R. VALETTE, "Analysis of Petri Nets by Stepwise Refinement", J. Comput. System Sci., vol. 18, 1979, pp. 35-46.

[129] A. VALMARI, "Stubborn Sets for Reduced State Space Generation", Proc. 10th Int. Conf. on Applications and Theory of Petri Nets, Bonn, 1989, pp. 1-22.
[130] A. VALMARI, "A Stubborn Attack on State Explosion", Proc. of the 2nd Workshop on Computer Aided Verification, New Brunswick, NJ, June 18-21, 1990, pp. 156-165.
[131] A. VEEN, The Misconstrued Semicolon, Ph.D. Thesis, Eindhoven University of Technology, 1985.
[132] W.W. WADGE AND E.A. ASHCROFT, LUCID: the Dataflow Programming Language, Academic Press, London, 1985.
[133] R.A. WALKER AND D.E. THOMAS, "Design Representation and Transformation in the System Architect's Workbench", Digest of Technical Papers of the Int. Conf. on Computer Aided Design, Santa Clara, November 9-12, 1987, pp. 166-169.
[134] G. WINSKEL, "Event Structures", Petri Nets: Applications and Relationships to Other Models of Concurrency, ed. W. Brauer, W. Reisig and G. Rozenberg, Lecture Notes in Computer Science 255, Bad Honnef, Germany, September 8-19, 1986, pp. 325-392.
[135] P. WOLPER, "Temporal Logic Can Be More Expressive", Information and Control, vol. 56, 1983, pp. 72-99.
[136] P. WOLPER, M.Y. VARDI, AND A.P. SISTLA, "Reasoning about Infinite Computation Paths", Proc. 24th Ann. Symp. on Foundations of Computer Science, Tucson, November 7-9, 1983, pp. 185-193.

Appendix A

Mathematical preliminaries

Apart from some basic notations that are used throughout this thesis, this appendix contains a brief overview of the basics of (semantic) domain theory [3, 4]. This includes the definition of the semantic domains Tup, Str and Set, which are the domains of tuples, streams and sets, respectively. Many functions on these domains are defined, too.

A.1 Some basic notations

Notation A.1 IB is the set of boolean values {true, false}.

Notation A.2 If for a set of objects O a predicate p: O → IB is defined, the following phrase may be used: "O has (a) p", if ∃o∈O: p(o). Similarly, "O has no p", if ∀o∈O: ¬p(o).

Notation A.3 If for an object o the predicate p is defined, the following phrase may be used: "o is p", if p(o). Similarly, "o is p-free", if ¬p(o).

Notation A.4 The cardinality of a finite set O, denoted by |O|, or equivalently #O, is the number of elements in O.

Notation A.5 The powerset of a set A is denoted by P(A) = {a | a ⊆ A}.

Definition A.6 A relation R ⊆ A × A is reflexive iff ∀a∈A: (a, a)∈R.
A relation R ⊆ A × A is irreflexive iff ∀a∈A: (a, a)∉R.
A relation R ⊆ A × A is symmetric iff ∀a, b∈A: (a, b)∈R ⇒ (b, a)∈R.
A relation R ⊆ A × A is anti-symmetric iff ∀a, b∈A: a ≠ b: (a, b)∈R ⇒ (b, a)∉R.
A relation R ⊆ A × A is transitive iff ∀a, b, c∈A: (a, b)∈R ∧ (b, c)∈R ⇒ (a, c)∈R.

Notation A.7 A → B denotes the class of functions f ⊆ A × B. f: A → B denotes a function f ⊆ A × B. Thus for functions ':' has the meaning of '∈'.
Although the terms "function" and "mapping" denote the same mathematical object:
Notation A.8 The term "mapping" is used solely to indicate a property. The term "function" is used solely to indicate a 'calculation'.

Definition A.9 Let f: A → B be a function.
Range(f) = {b∈B | ∃a∈A: f(a) = b}

Notation A.10 Let f: A → B be a function, and X ⊆ A. The restriction of f to X is denoted by f/X: X → B, f/X = f ∩ (X × B).

Definition A.11 identity: (A → A) → IB is defined as:
identity(f) = ∀a∈A: f(a) = a

Definition A.12 An n-ary operation, or equivalently an operation of arity n, is a function f: Aⁿ → A. If n = 0, then f is a nullary operation, or a constant. If n = 1, then f is a unary operation. If n = 2, then f is a binary operation.

Definition A.13 R⁺: D → P(D), the transitive closure of a function R: D → P(D), and R*: D → P(D), the reflexive, transitive closure of a function R: D → P(D), are defined as:

R⁺(d) = ∪_{n≥1} Rⁿ(d)
R*(d) = ∪_{n≥0} Rⁿ(d)
where
R⁰(d) = {d}
Rⁿ⁺¹(d) = {dⱼ∈D | ∃dᵢ∈Rⁿ(d): dⱼ∈R(dᵢ)}
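Read computationally, for a finite D the closure can be collected level by level. A minimal Python sketch (assuming R is represented as a dict from elements to sets; illustrative only):

    # Sketch of R+ and R* of definition A.13 for a function R: D -> P(D),
    # with R represented as a dict mapping each element to a set.
    def transitive_closure(R, d):
        """R+(d): the union of R(d), R(R(d)), ..."""
        result, frontier = set(), set(R[d])
        while frontier:
            result |= frontier
            frontier = {e for x in frontier for e in R[x]} - result
        return result

    def reflexive_transitive_closure(R, d):
        """R*(d) = {d} ∪ R+(d)."""
        return {d} | transitive_closure(R, d)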

Definition A.14 Given a function f: D → D. The natural defining function f_nat: D → P(D) of f is defined as f_nat(d) = {f(d)}.

Definition A.15 Let A₁, A₂, ···, Aₘ be sets. The closure close({A₁, A₂, ···, Aₘ}) is defined as close_{n+1} such that close_{n+1} = close_n, where the close_i are defined with respect to the set {A₁, A₂, ···, Aₘ} as:
close₀ = {A₁, A₂, ···, Aₘ}
close_{n+1} = {Aᵢ ∪ Aⱼ | Aᵢ∈close_n ∧ Aⱼ∈close_n ∧ Aᵢ ≠ Aⱼ ∧ Aᵢ ∩ Aⱼ ≠ ∅} ∪ {Aᵢ | Aᵢ∈close_n ∧ ¬∃Aⱼ∈close_n: Aᵢ ≠ Aⱼ: Aᵢ ∩ Aⱼ ≠ ∅}

Clearly, close({A₁, A₂, ···, Aₘ}) exists, since close_{n+1} = close_n at least for all n ≥ m.
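Operationally, definition A.15 just merges overlapping sets until all remaining sets are pairwise disjoint. A minimal Python sketch (illustrative only):

    # Sketch of definition A.15: repeatedly merge overlapping sets until stable.
    def close(sets):
        sets = [set(s) for s in sets]
        changed = True
        while changed:
            changed = False
            for i in range(len(sets)):
                for j in range(i + 1, len(sets)):
                    if sets[i] & sets[j]:        # overlapping sets: merge them
                        sets[i] |= sets[j]
                        del sets[j]
                        changed = True
                        break
                if changed:
                    break
        return sets

    # Example: close([{1, 2}, {2, 3}, {4}]) yields [{1, 2, 3}, {4}].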

Definition A.16 Given a symmetric function f: A × A → IB and a set B ⊆ A.
complete(B, f) = ¬∃a∈A\B: ∃b∈B: f(a, b)
A set is complete, also called maximal, with respect to a binary function f, if it cannot be extended by elements that have the property f in common with an element of that set.
Morphisms are functions, or transformations, which map one domain onto another, while preserving certain properties.

Definition A.17 Let A, B be sets, +: A × A → A, *: B × B → B be binary operations.
homomorphism: (A → B) → IB
isomorphism: (A → B) → IB
are defined as:
homomorphism(f) = ∀a, b∈A: f(a + b) = f(a) * f(b)
isomorphism(f) = homomorphism(f) ∧ bijective(f)

Definition A.18 A lifted homomorphism φ for an n-tuple (A₁, A₂, ···, Aₙ) is defined as the n-tuple (φ₁, φ₂, ···, φₙ).

A.2 Partially ordered sets and domains

Many of the following definitions on partial orders are just included for completeness, and may therefore be read as such, to define what exactly a domain is. They can also be found in any textbook on partial orders, lattices and domains, e.g. [2, 3].

Definition A.19 A relation R ⊆ A × A is a partial ordering iff R is reflexive, anti-symmetric and transitive.

Notation A.20 A partially ordered set, or shortly a partial order or a poset, is a tuple (P, ≤), where P is a set and ≤ a partial ordering on P.

Notation A.21 IP denotes the class of posets.

Notation A.22 Let (P, ≤_P) be a poset. This may also be written as: P is a poset, when ≤_P is clear from the context.

Notation A.23 a ≤ b iff (a, b) ∈ ≤; a ≰ b iff (a, b) ∉ ≤.

Definition A.24 a and b are incomparable in a poset (P, ≤) iff a ≰ b ∧ b ≰ a.

Definition A.25 A poset (P, ≤) is a total ordering iff ∀a, b∈P: (a ≤ b) ∨ (b ≤ a).

Definition A.26 Let (P, ≤) be a poset. a∈P is a least element of P iff ∀b∈P: a ≤ b.
Note that a poset cannot have more than one least element.
Notation A.27 Let (P, ≤) be a poset. ⊥ is the least element of P (if P has a least element).

Definition A.28 Let (P, ≤) be a poset, and let a, b∈P. The least upper bound of a and b, denoted by a⊔b, is that element of P (if it exists) for which a ≤ a⊔b ∧ b ≤ a⊔b ∧ ∀c∈P: a ≤ c ∧ b ≤ c ⇒ a⊔b ≤ c.

Definition A.29 Let (P, ≤) be a poset. The least upper bound of a set A ⊆ P is that ⊔A ∈ P (if it exists) for which ∀a∈A: a ≤ ⊔A and ∀d∈P: (∀a∈A: a ≤ d) ⇒ ⊔A ≤ d.

Definition A.30 Let (P, ≤) be a poset. A set C ⊆ P is a chain iff ∀a, b∈C: a ≤ b ∨ b ≤ a.

Definition A.31 complete-partial-ordering: IP → IB is defined as:
complete-partial-ordering(P, ≤) = ∀C ⊆ P: chain(C) ⇒ ⊔C ∈ P

Notation A.32 cpo is a shorthand notation for complete-partial-ordering.

Definition A.33 pointed-complete-partial-ordering: IP → IB is defined as:
pointed-complete-partial-ordering(P, ≤) = cpo(P) ∧ P has a least element

Notation A.34 pointed-cpo is a shorthand notation for pointed-complete-partial-ordering.

Definition A.35 Let (A, ≤_A), (B, ≤_B) be posets. monotonic: (A → B) → IB is defined as:
monotonic(f) = ∀a, b∈A: a ≤_A b ⇒ f(a) ≤_B f(b)

Definition A.36 Let (A, ≤_A), (B, ≤_B) be cpo's. continuous: (A → B) → IB is defined as:
continuous(f) = ∀C ⊆ A: chain(C) ⇒ f(⊔_A C) = ⊔_B {f(c) | c∈C}
Informally, a function f is continuous if all values f(a) can be approximated as closely as desired.

Proposition A.1 Let (A, ≤_A), (B, ≤_B) be cpo's, and f: A → B a function.
continuous(f) ⇒ monotonic(f)

Proof: For all elements a, b∈A with a ≤ b, let C be the chain {a, b}. f(⊔_A {a, b}) = f(b). Since f is continuous, also f(⊔_A {a, b}) = ⊔_B {f(a), f(b)}. Thus f(b) = ⊔_B {f(a), f(b)}, and therefore f(a) ≤ f(b). □

Definition A.37 fixed-point: IF × D → IB

least-fixed-point: IF × D → IB
are defined as:
fixed-point(f, d) = continuous(f) ∧ f(d) = d
least-fixed-point(f, d) = fixed-point(f, d) ∧ ∀e∈D: f(e) = e ⇒ d ≤ e

Notation A.38 fixpoint is a shorthand notation for fixed-point.

Theorem A.2 If (D, ≤_D) is a pointed cpo, then the least fixed point, denoted by fix f, of a continuous function f: D → D exists and is defined by:
fix f = ⊔{fⁱ(⊥) | 0 ≤ i}
where
f⁰(d) = d
fⁱ⁺¹(d) = f(fⁱ(d))
Proof:
f(fix f) = f(⊔{fⁱ(⊥) | 0 ≤ i})
  {f is continuous}
= ⊔{f(fⁱ(⊥)) | 0 ≤ i}
= ⊔{fⁱ(⊥) | 1 ≤ i}
  {f⁰(⊥) = ⊥ and ⊥ ≤_D f(⊥)}
= ⊔{fⁱ(⊥) | 0 ≤ i}
= fix f
Thus fix f is indeed a fixed point of f.

Let e∈D be a fixpoint of f. Since ⊥ ≤_D e and f is monotonic:
∀i: 0 < i: fⁱ(⊥) ≤_D fⁱ(e) = e
Thus fix f = ⊔{fⁱ(⊥) | 0 ≤ i} ≤_D e, i.e. it is least. □
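Theorem A.2 directly suggests the usual Kleene iteration: apply f repeatedly to ⊥ until the chain stabilizes. A minimal Python sketch for domains where the chain becomes stationary after finitely many steps (the encoding of ⊥ and the example are illustrative assumptions):

    # Kleene iteration sketch: fix f = ⊔ { f^i(⊥) | 0 ≤ i }.
    def fix(f, bottom):
        """Iterate f from bottom until f(x) = x; x is then the least fixed point."""
        x = bottom
        while True:
            y = f(x)
            if y == x:
                return x
            x = y

    # Example on the pointed cpo (P(IN), ⊆) with ⊥ = ∅:
    reachable = fix(lambda s: {0} | {n + 1 for n in s if n < 3}, set())
    # reachable == {0, 1, 2, 3}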

Definition A.39 Let (P, ≤) be a pointed cpo. finite: P → IB is defined as:
finite(a) = ∀C ⊆ P: chain(C): a ≤ ⊔C ⇒ ∃c∈C: a ≤ c

Notation A.40 A finite element of a pointed cpo may also be called isolated or compact.

Definition A.41 Let (P, ≤) be a pointed cpo. countably-algebraic: IP → IB is defined as:
countably-algebraic(P, ≤) = {a∈P | finite(a)} is countable ∧ ∀a∈P: (∃C ⊆ P: ∀c∈C: finite(c)): a = ⊔C

Definition A.42 A domain is a countably-algebraic pointed cpo.

Definition A.43 A relation f ⊆ A × B is a partial function iff ∀a∈A: #{b∈B | (a, b)∈f} ≤ 1. I.e. f may not be defined for some elements a∈A.
Definition A.44 The completion of a partial function f: A → B is a function f′: A → B∪{⊥}, where ⊥ is a symbol (∉ B) meaning undefined, defined by:
f′(a) = f(a) if f is defined for a; ⊥ if f is not defined for a

Notation A.45 Let A be a set, ⊥ ∉ A. A⊥ = A∪{⊥}

Definition A.46
strict: (A₁⊥ × ··· × Aᵢ⊥ × ··· × Aₙ⊥ → Y) × IN → IB
strict: (A₁⊥ × ··· × Aᵢ⊥ × ··· × Aₙ⊥ → Y) → IB
are defined as:
strict(f, i) = ∀a₁, ···, aᵢ₋₁, aᵢ₊₁, ···, aₙ: f(a₁, ···, aᵢ₋₁, ⊥, aᵢ₊₁, ···, aₙ) = ⊥
strict(f) = ∀i: 1 ≤ i ≤ n: strict(f, i)
Informally: a function is strict in its i-th argument, when that argument is (always) needed for the evaluation of f. A function is strict when it is strict in all its arguments. Note that a nullary function, or a constant (Definition A.12), is strict by definition.
Notation A.47 In the sequel D is a domain; ID denotes the class of domains.

Definition A.48 flat: ID → IB is defined as:
flat(D) = ∀a, b∈D: a ≤_D b ⇒ (a = b ∨ a = ⊥)

Definition A.49 lift: IP → IP is defined as:
lift(A, ≤_A) = (A⊥, ≤_A⊥)
where

∀a, b∈A⊥: a ≤_A⊥ b ⇔ (a = ⊥ ∨ a ≤_A b)
In the latter disjunct, clearly a, b∈A.
Notation A.50 Let A be a poset. A⊥ = lift(A).

Proposition A.3 Let A be a poset. cpo(A) ⇒ pointed-cpo(A⊥)

Proposition A.4 Each set A induces a domain.

Proof: Take the domain to be A⊥ with ≤_A such that ∀a, b∈A: a ≤ b ⇔ a = b. I.e. all elements of A are incomparable with each other. □
This domain is called the natural defining domain of a set.

Definition A.51 Let (A, ≤_A), (B, ≤_B) be posets. The product poset (A × B, ≤_A×B) is defined as:
(a₁, b₁) ≤_A×B (a₂, b₂) = a₁ ≤_A a₂ ∧ b₁ ≤_B b₂

Proposition A.5 If A and B are domains, then A × B is also a domain.

Definition A.52 Let (A, ≤_A), (B, ≤_B) be posets. The separated sum poset (A + B, ≤_A+B) is defined as:
A + B = {(0, a) | a∈A} ∪ {(1, b) | b∈B}
d₁ ≤_A+B d₂ = d₁ = (0, a₁) ∧ d₂ = (0, a₂) ∧ a₁ ≤_A a₂
  ∨ d₁ = (1, b₁) ∧ d₂ = (1, b₂) ∧ b₁ ≤_B b₂

Proposition A.6 If A and B are domains, then A + B is also a domain.

Definition A.53 Let (A, ≤_A), (B, ≤_B) be posets. The function poset (A → B, ≤_A→B) is defined as:
A → B = {f: A → B | continuous(f) ∧ cpo(A) ∧ pointed-cpo(B)}
f ≤_A→B g = ∀a∈A: f(a) ≤_B g(a)

Proposition A.7 If A and B are domains, then A → B is also a domain.

A.2.1 Tuples

Tuples are ordered collections of objects.

Definition A.54 Given a domain D, the domain Tupₙ(D) of n-tuples on D,

denoted by [d₁, ···, dₙ], is defined as:
Tupₙ(D) = (Dⁿ, ≤_Tupₙ(D))
where
[d₁, ···, dₙ] ≤_Tupₙ(D) [d₁′, ···, dₙ′] iff ∀i: 1 ≤ i ≤ n: dᵢ ≤_D dᵢ′

Tup(D) is defined as the domain of tuples on domain D, i.e. Tup(D) = ∪_{i≥0} Tupᵢ(D).

Notation A.55 t = [d₁, ···, dₙ]∈Tup(D) may also be written as t = (d₁, ···, dₙ). The notation (d₁, ···, dₙ) for tuples is only used for applying arguments to a function.
Proposition A.8 Tup(D) is a domain.

Definition A.56
[ ]: Dⁿ → Tup(D)
fst: Tup(D) → D⊥
snd: Tup(D) → D⊥
↓: Tup(D) × IN → D⊥
are defined as (where t = [d₁, d₂, ···] ∈ Tup(D)):
[ ](d₁, d₂, ···, dₙ) = [d₁, d₂, ···, dₙ]
fst(t) = ⊥ if t∈Tup₀(D); d₁ otherwise
snd(t) = ⊥ if t∈Tup₀(D) ∨ t∈Tup₁(D); d₂ otherwise
↓(t, n) = ⊥ if t∈Tupₖ(D) ∧ k < n; dₙ otherwise

Notation A.57 Given a tuple t∈Tup(D). t↓k = ↓(t, k)

Proposition A.9 The functions [ ], fst, snd, ↓ are strict, monotonic and continuous.

A.2.2 Streams

Definition A.58 Given a flat domain D, the domain Str(D) of streams over D is defined as:
Str(D) = (D* ∪ D^ω, ≤_Str(D))

where D* is the set of finite sequences with elements from D, D^ω is the set of infinite sequences with elements out of D, and
s₁ ≤_Str(D) s₂ = s₁ ∈ pref(s₂)
where pref(s) = {s′ | ∃s″∈D* ∪ D^ω: s′.s″ = s},¹ i.e. s₁ is a prefix of s₂.

Notation A.59 The bottom element of Str(D) is the empty stream, denoted by nil, or equivalently ( ).

Proposition A.10 Str(D) is a domain.²

Definition A.60
( ): Dⁿ → Str(D)
head: Str(D) → D⊥
tail: Str(D) → Str(D)
cons: D⊥ × Str(D) → Str(D)
append: Str(D) × Str(D) → Str(D)
length: Str(D) → IN ∪ {∞}
last: Str(D) → D⊥
rcons: D × Str(D) → Str(D)
↓: Str(D) × IN → D⊥
are defined as:
( )(d₁, d₂, ···, dₙ) = (d₁, d₂, ···, dₙ)
head(s) = d₁ if s = (d₁, d₂, ···); ⊥ if s = nil
tail(s) = (d₂, d₃, ···) if s = (d₁, d₂, d₃, ···); nil if s = (d₁); nil if s = nil
cons(d, s) = (d, d₁, d₂, ···) if d ≠ ⊥ ∧ s = (d₁, d₂, ···); (d) if d ≠ ⊥ ∧ s = nil; nil if d = ⊥
append(s₁, s₂) = s₁ if s₁∈D^ω; (d₁, d₂, ···, dₙ, d₁′, d₂′, ···) if s₁ = (d₁, d₂, ···, dₙ) ∧ s₂ = (d₁′, d₂′, ···); s₂ if s₁ = nil
length(s) = 0 if s = nil; n if s = (d₁, d₂, ···, dₙ); ∞ otherwise
last(s) = dₙ if s = (d₁, d₂, ···, dₙ); ⊥ if s = nil ∨ s∈D^ω
rcons(d, s) = (d) if s = nil; (d₁, d₂, ···, dₙ, d) if s = (d₁, d₂, ···, dₙ); s if s∈D^ω
↓(s, n) = ⊥ if s∈Dᵏ ∧ k < n; dₙ otherwise

1. The symbol . is the language concatenation operator, which is also denoted by juxtaposition.
2. In principle Tup(D) as well as Str(D) may also be defined as the derived function domain IN → D. This is not done for suggestive reasons. Tuples are considered to be 'horizontal', while streams are 'vertical': tuples are used as arguments to functions, i.e. at the input ports of nodes, while streams are considered as flowing along edges.

Notation A.61 Given a stream s∈Str(D). s(k) = ↓(s, k)

Notation A.62 Given a stream s∈Str(D). |s| = length(s)

Proposition A.11 The functions ( ), head, tail, length, last and ↓ are strict functions. cons is strict in its first argument. The functions ( ), head, tail, cons, length, ↓ are monotonic and continuous.
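For the finite-stream part of definition A.60, the operations translate directly into executable form. A minimal Python sketch, with ⊥ modeled as None and finite streams as tuples (the names and the encoding are illustrative assumptions; infinite streams are not covered):

    BOTTOM = None      # stands for ⊥; nil is the empty stream ()

    def head(s):
        return s[0] if s else BOTTOM       # head(nil) = ⊥

    def tail(s):
        return s[1:] if s else ()          # tail(nil) = nil

    def cons(d, s):
        if d is BOTTOM:
            return ()                      # cons is strict in its first argument
        return (d,) + s

    def down(s, k):
        """s(k), the k-th element (1-based); ⊥ when the stream is too short."""
        return s[k - 1] if len(s) >= k else BOTTOM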

A.2.3 Sets

Definition A.63 Given a domain D, the powerset domain Set(D) of D is defined as:
Set(D) = (P(D), ≤_Set(D))

where s₁ ≤_Set(D) s₂ = s₁ ⊆_=D s₂


⊆_=D is the normal set inclusion operator; the subscript only makes clear that only equal elements are compared. I.e. the ordering ≤_D is not used for Set(D), and D is in a sense considered to be flat. Thus, with a ≤_D b, {a, b} ≤_Set(D) {b} is not valid.
Notation A.64 The bottom element of Set(D) is the empty set, denoted by ∅, or equivalently { }.

Proposition A.12 Set(D) is a domain.

Proof: Straightforward from the above remark that =_D is used instead of ≤_D. □

Definition A.65 The functions

{ }: D^n → Set(D)
first: Set(D) → D_⊥
rest: Set(D) → Set(D)
union: Set(D) × Set(D) → Set(D)

are defined as:

{ }(d_1, ..., d_n) = {d_1, ..., d_n}
first(s) = d_1 if s = {d_1, d_2, ...}; ⊥ if s = ∅

rest(s) = s \ {first(s)} if s ≠ ∅; ∅ if s = ∅
union(s_1, s_2) = s_1 ∪ s_2

It is clear that in principle first and rest are not functions, since they are not uniquely defined. When sets are viewed to be ordered, first and rest can be implemented as functions. But since first and rest are only used together to 'visit' recursively the elements of a set, their meaning is intuitively clear.

Proposition A.13 The functions { }, first and rest are strict functions. The functions { }, first, rest and union are monotonic and continuous.
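The remark about first and rest can be made concrete with Data.Set, whose internal order makes them deterministic functions; this is a sketch, and visit is an invented helper showing the recursive 'visiting' pattern.

```haskell
import qualified Data.Set as Set

first :: Set.Set d -> Maybe d        -- bottom (Nothing) on the empty set
first s = if Set.null s then Nothing else Just (Set.findMin s)

rest :: Set.Set d -> Set.Set d
rest s = if Set.null s then s else Set.deleteMin s

-- The only intended use of first/rest: recursively visiting all elements.
visit :: (d -> a -> a) -> a -> Set.Set d -> a
visit f z s = case first s of
                Nothing -> z
                Just d  -> visit f (f d z) (rest s)
```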

A.3 Lambda calculus

Here follows a brief overview of the syntax of the lambda calculus [1], as it is used in this thesis, and an informal interpretation is given.

Definition A.66 An expression Expr of the lambda calculus is constructed according to the following syntax:

Expr ::= variable | (Expr) | Expr Expr | λx. Expr

The third syntactic construct is application, while the fourth is a function definition. (λx. Expr_1)Expr_2 is a function application, in which the value of Expr_2 is bound to x in Expr_1. The notation λx. expr, where expr is an expression in x, is used to define an unnamed function, say f, of x defined by f(x) = expr. Thus the notation f = λx. expr binds this unnamed function to f, so that f(x) may be used. This lambda calculus notation is mainly used since it also allows for higher-order functions, i.e. functions returning functions instead of 'plain' values. From a theoretical point of view, a function is just a value in the function domain.

Functions with more than one argument can be defined in more than one way, namely as λ(x, y, z). expr, i.e. on a tuple, or as λx. λy. λz. expr. In the former case, all arguments must be applied to the function before it can be 'computed'; in the latter case, applying only the first argument to the function results in a function having one argument less (and can thus be viewed as a higher-order function). This is also called currying.
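A small Haskell illustration (not from the thesis) of the two styles; addT, addC and plus5 are invented names.

```haskell
-- λ(x, y, z). expr: all arguments must be supplied at once, in a tuple.
addT :: (Int, Int, Int) -> Int
addT (x, y, z) = x + y + z

-- λx. λy. λz. expr: the curried form.
addC :: Int -> Int -> Int -> Int
addC x y z = x + y + z

-- Partial application of the curried form yields a higher-order value:
plus5 :: Int -> Int -> Int
plus5 = addC 5            -- a function awaiting the two remaining arguments

-- curry/uncurry mediate between the two forms for two arguments:
five :: Int
five = uncurry (+) (2, 3)
```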

References

[1] H.P. BARENDREGT, The Lambda Calculus: Its Syntax and Semantics, North-Holland, Amsterdam, 1985.

[2] G. BIRKHOFF, Lattice Theory, 3rd Edition, American Mathematical Society, Providence, RI, 1967.

[3] D.A. SCHMIDT, Denotational Semantics: a Methodology for Language Development, Allyn and Bacon, London, 1986.

[4] J.E. STOY, Denotational Semantics: the Scott-Strachey Approach to Programming Language Theory, MIT Press, London, 1977.

Appendix B

Additional semantics

In this appendix, some additional denotational semantics are defined for flow graphs, which are proved to be equivalent to the operational semantics as defined in section 2.3 and the behavioral denotational semantics of section 2.4. In section B.1, it is shown that the operational semantics of a flow graph can also be described denotationally. Section B.2 defines the semantics of a data flow graph in terms of behavior expressions, similar to the expressions in process algebras like CSP [2] and CCS [4]. The latter allows for an alternative modeling of and reasoning about flow graphs, namely in a more algebraic way.

B.1 Execution graph denotational semantics of flow graphs

In the behavioral denotational semantics, a flow graph corresponds to a function describing its input-output behavior. The execution graph denotational semantics [3] defined in this section yields for a flow graph its reachability graph, similar to the reachability tree that is induced by the operational semantics (Notations 2.71 and 2.72). I.e. the internal behavior of the flow graph is again considered. First this notation 2.72 is repeated.

Notation B.1 A reachability graph for a (well-formed) flow graph G with an initial state s_0 is a connected directed rooted simple graph, in which the nodes are labeled with states s and the edges are labeled with nodes v ∈ V. The root of the graph is a node labeled with the initial state s_0. Each edge, labeled with a v ∈ V, in the reachability graph from the node labeled with a state s to a node labeled with a

state s', models a transition s →^v s'.

The following domains and functions are defined.

Definition B.2
RE = State × V × State
RG = Set(State) × Set(RE) × State

Notation B.3 In the sequel p, p_i ∈ RG and η ∈ RE.

Definition B.4 The functions

fromstate: RE → State
label: RE → V
tostate: RE → State
states: RG → Set(State)
edges: RG → Set(RE)
root: RG → State

are defined as:

fromstate = λη. η↓1
label = λη. η↓2
tostate = λη. η↓3
states = λp. p↓1
edges = λp. p↓2
root = λp. p↓3
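The triples of RE and RG and the projections of Definition B.4 translate directly into a Haskell record sketch. State and node labels are left as type parameters s and n, since their thesis definitions are not repeated here; this is an illustration, not the thesis formalism.

```haskell
import qualified Data.Set as Set

-- A reachability-graph edge: η = [fromstate, label, tostate].
data RE s n = RE { fromstate :: s      -- η↓1
                 , label     :: n      -- η↓2
                 , tostate   :: s }    -- η↓3
  deriving (Eq, Ord, Show)

-- A reachability graph: p = [states, edges, root].
data RG s n = RG { states :: Set.Set s            -- p↓1
                 , edges  :: Set.Set (RE s n)     -- p↓2
                 , root   :: s }                  -- p↓3
  deriving (Eq, Show)
```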

Definition B.5 The partial orders ≤_RE and ≤_RG are defined as:

η_1 ≤_RE η_2 ≡ fromstate(η_1) = fromstate(η_2) ∧ tostate(η_1) = tostate(η_2) ∧ label(η_1) = label(η_2)
p_1 ≤_RG p_2 ≡ root(p_1) = root(p_2) ∧ states(p_1) ⊆ states(p_2) ∧ edges(p_1) ⊆ edges(p_2)

Note that with this definition, ≤_RE is a real partial order (although degenerated, since it is an equivalence relation) and P(RE) is a powerdomain. I.e. it is not necessary to use, for example, the Smyth powerdomain construction or the Egli-Milner powerdomain, since by quotienting P(RE) by ≤_RE, P(RE)/≤_RE = P(RE). This is because ≤_RE is defined by requiring the equivalence of the state components instead of an ordering between them, as for instance in:

η_1 ≤'_RE η_2 ≡ fromstate(η_1) ≤_State fromstate(η_2) ∧ tostate(η_1) ≤_State tostate(η_2) ∧ label(η_1) = label(η_2)

In that case, P(RE) should have been quotiented by, for example, the Smyth ordering [5, 6]. But now the normal subset inclusion ordering can be chosen as the partial order. Cf. this with the partial orders defined on streams and sets. Therefore:

Proposition B.1 RG is a domain.

That the above defined ≤'_RE yields 'unwanted' results can be seen from the following example, where s_1 ≤_State s_2:

[figure not reproduced]

Definition B.6 The functions

graphcons: RE → RG → RG
graphunion: Set(RE) → RG → RG
graphjoin: RG × RG → RG

are defined as (with Φ: Set(RE) → RG → RG):

graphcons = λη. λp. [union(states(p), {fromstate(η), tostate(η)}), union(edges(p), {η}), root(p)] if fromstate(η) ∈ states(p); ⊥ otherwise

graphunion = fix(λΦ. λH. λp. if H = ∅ then p else graphcons(first(H), Φ(rest(H))p))

graphjoin = λ(p_1, p_2). [union(states(p_1), states(p_2)), union(edges(p_1), edges(p_2)), root(p_1)] if root(p_1) = root(p_2); ⊥ otherwise

graphcons adds a (reachability graph) edge η to the reachability graph. The condition fromstate(η) ∈ states(p) guarantees that p remains a rooted graph. graphunion can be viewed as the consecutive application of graphcons for all the members of a set of (reachability graph) edges H. graphjoin joins two reachability graphs having the same root.

Proposition B.2
graphjoin(p_1, p_2) = graphjoin(p_2, p_1)
graphjoin(p_1, graphjoin(p_2, p_3)) = graphjoin(graphjoin(p_1, p_2), p_3)
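Continuing the RE/RG sketch above, the three operations of Definition B.6 can be rendered as follows, with Maybe modelling the ⊥ results; this is an assumed encoding, not the thesis' own.

```haskell
graphcons :: (Ord s, Ord n) => RE s n -> RG s n -> Maybe (RG s n)
graphcons e p
  | fromstate e `Set.member` states p            -- keeps p a rooted graph
      = Just p { states = states p `Set.union`
                          Set.fromList [fromstate e, tostate e]
               , edges  = Set.insert e (edges p) }
  | otherwise = Nothing                          -- bottom

-- Consecutive application of graphcons to all members of an edge set H.
graphunion :: (Ord s, Ord n) => [RE s n] -> RG s n -> Maybe (RG s n)
graphunion es p = foldr (\e acc -> acc >>= graphcons e) (Just p) es

-- Joining two reachability graphs with the same root.
graphjoin :: (Ord s, Ord n) => RG s n -> RG s n -> Maybe (RG s n)
graphjoin p1 p2
  | root p1 == root p2
      = Just RG { states = states p1 `Set.union` states p2
                , edges  = edges p1 `Set.union` edges p2
                , root   = root p1 }
  | otherwise = Nothing                          -- bottom
```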

By proposition B.2, the following notation is well-defined.

Notation B.7 The join graphjoin(p_1, ..., p_n) of a set of graphs is defined by graphjoin(···(graphjoin(p_1, p_2), ···), p_n).

Notation B.8 Let p_1 and p_2 be reachability graphs, i.e. p_1, p_2 ∈ RG. p_1 ⊆ p_2 iff root(p_1) = root(p_2) ∧ states(p_1) ⊆ states(p_2) ∧ edges(p_1) ⊆ edges(p_2).

Notation B.9 Let p_1 and p_2 be reachability graphs, i.e. p_1, p_2 ∈ RG. p_1 = p_2 iff root(p_1) = root(p_2) ∧ states(p_1) = states(p_2) ∧ edges(p_1) = edges(p_2).

For the execution graph denotational semantics, valuation functions are again defined for each syntactic construct of a graph G = (V, P_in, P_out, E, I, O, Φ) = V_1 || ··· || V_n, where V = {v_1, ..., v_n}. First it is defined for the case of well-formed data flow graphs without choice edges, and many functions defined for the operational semantics are used.

Definition B.10 For a (well-formed) data flow graph G = V_1 || ··· || V_n with ¬choice(G) and initial state s_0, the execution graph denotational semantics is defined as 𝒢[[G]] [{s_0}, ∅, s_0], where [{s_0}, ∅, s_0] denotes the initial reachability graph consisting of only the root node s_0, and

𝒢: G → RG → RG
𝒱: G_elem → RG → RG

are defined as (with ps ∈ Set(State) and Φ: Set(State) → RG):

𝒢[[G]] = fix(λp. graphjoin(𝒱[[V_1]]p, ···, 𝒱[[V_n]]p))

𝒱[[V]] = λp. fix(λΦ. λps. graphjoin(
    if enabled(v, first(ps))
    then graphcons([first(ps), v, update(first(ps), v)], p)
    else p,
    Φ(rest(ps))))(states(p))

Note that the function update for choice-free graphs is used, i.e. the version which yields only one next state. As long as infinite streams are not considered, the finitely generating tree is indeed finitely branching [5, 6]. The first argument to graphjoin in the node valuation function is the reachability graph extended by the transition of executing v at a state if v is enabled in that state (otherwise the reachability graph itself). By the recursion with graphjoin, v is executed at all states of the reachability graph in which v is enabled, and also at the newly generated states. The graph valuation function joins all the locally extended reachability graphs into a global reachability graph.
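Under the extra assumption of a finite state space, the fixpoint of definition B.10 can be computed by a straightforward Kleene iteration. The following Haskell sketch continues the RE/RG types above; reachGraph is an invented name, and enabled and update are parameters standing in for the thesis functions of the same name.

```haskell
reachGraph :: (Ord s, Ord n)
           => [n]                 -- the nodes v1 .. vn
           -> (n -> s -> Bool)    -- enabled
           -> (n -> s -> s)       -- update (choice-free: exactly one next state)
           -> s                   -- initial state s0
           -> RG s n
reachGraph vs enabled update s0 =
    go (RG (Set.singleton s0) Set.empty s0)    -- the initial graph [{s0}, {}, s0]
  where
    go p | p' == p   = p                       -- fixpoint reached
         | otherwise = go p'
      where p' = step p
    -- One joint iteration: execute every enabled node at every known state.
    step p = p { states = states p `Set.union` Set.map tostate new
               , edges  = edges p `Set.union` new }
      where new = Set.fromList [ RE st v (update v st)
                               | st <- Set.toList (states p)
                               , v <- vs, enabled v st ]
```

Each step only adds states and edges, so the iterates form a chain in the ⊆ ordering of notation B.8; for a finite state space the iteration therefore terminates in the least fixpoint.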

For generalized graphs with choice edges, only the valuation function for the nodes, i.e. for the elementary graphs, needs to be updated. Just as in the operational semantics, the interleaving takes place inside the node execution. Also the domains RE and RG are extended to make use of State' instead of State.

Definition B.11 For a (well-formed) data flow graph G = V_1 || ··· || V_n with initial state s_0, the execution graph denotational semantics RG(G, s_0) is defined as 𝒢[[G]] [{s_0}, ∅, s_0], where

𝒢: G → RG → RG
𝒱: G_elem → RG → RG
map: State' × V × Set(State') → Set(RE)

are defined as (with ps ∈ Set(State'), Φ: Set(State') → RG and Ψ: State' × V × Set(State') → Set(RE)):

𝒢[[G]] = fix(λp. graphjoin(𝒱[[V_1]]p, ···, 𝒱[[V_n]]p))

𝒱[[V]] = λp. fix(λΦ. λps. graphjoin(
    if enabled(v, first(ps))
    then graphunion(map(first(ps), v, update(first(ps), v)), p)
    else p,
    Φ(rest(ps))))(states(p))

map = fix(λΨ. λ(s, v, ps). union({[s, v, first(ps)]}, Ψ(s, v, rest(ps))))

The node valuation function 𝒱 is basically the same as for choice-free graphs as defined in definition B.10. Only a set of edges instead of a single edge is added to the reachability graph by the execution of a node.

Proposition B.3 The execution graph denotational semantics as defined in definition B.11 reduces, for a (well-formed) data flow graph G for which ¬choice(G), to the execution graph denotational semantics as defined in definition B.10.

Notation B.12 When the initial state s_0 of the (well-formed) data flow graph G is clear from the context, RG(G) = RG(G, s_0).

In the following examples, the shorthand notations 2.84 (page 52) and 2.85 (page 53) are used to represent states.

Example B.1 This example illustrates the node valuation function for a node v as given in figure B.1 in a choice-free graph. The initial graph is given in figure B.2a with
s_0 = {e_1: (a, b), e_2: (d, e), e_3: (f)} and
s_1 = {e_1: (a, b, c), e_2: (d, e), e_3: (f)}.

Figure B.1 A node without choice for execution graph semantics

The fixpoint of 𝒱[[V]] can again be computed by a Kleene iteration. The first iteration results in the graph of figure B.2b, while the following iterations yield the fixpoint as shown in figure B.2c.

Figure B.2 Execution graph semantics of a node without choice

In these graphs,
s_2 = {e_1: (b, c), e_2: (e), e_3: (f, v(a, d))},
s_3 = {e_1: (b), e_2: (e), e_3: (f, v(a, d))},

s_4 = {e_1: (c), e_3: (f, v(a, d), v(b, e))} and
s_5 = {e_3: (f, v(a, d), v(b, e))}.

Example B.2 This example illustrates the node valuation function for a node v with an output choice edge as given in figure B.3. The initial reachability graph is given in figure B.4a with
s_0 = {e_1: (a, b), e_2: (d, e), e_3: ([w, f])} and
s_1 = {e_1: (a, b, c), e_2: (d, e), e_3: ([w, f])}.

Figure B.3 A node with choice for execution graph semantics

The successive iterations are given in figure B.4, for which
s_21 = {e_1: (b, c), e_2: (e), e_3: ([w, f], [v, v(a, d)])},
s_22 = {e_1: (b, c), e_2: (e), e_3: ([v, v(a, d)], [w, f])},
s_31 = {e_1: (b), e_2: (e), e_3: ([w, f], [v, v(a, d)])},
s_32 = {e_1: (b), e_2: (e), e_3: ([v, v(a, d)], [w, f])},
s_41 = {e_1: (c), e_3: ([w, f], [v, v(a, d)], [v, v(b, e)])},
s_42 = {e_1: (c), e_3: ([v, v(a, d)], [w, f], [v, v(b, e)])},
s_43 = {e_1: (c), e_3: ([v, v(a, d)], [v, v(b, e)], [w, f])},
s_51 = {e_3: ([w, f], [v, v(a, d)], [v, v(b, e)])},
s_52 = {e_3: ([v, v(a, d)], [w, f], [v, v(b, e)])} and
s_53 = {e_3: ([v, v(a, d)], [v, v(b, e)], [w, f])}.

Example B.3 This example illustrates the execution graph semantics for the flow graph of figure B.5 (see also examples 2.6 and 2.12). It requires 5 iterations to reach the fixpoint of 𝒢[[G]]p_0, where p_0 is the reachability graph consisting only of the root state s_0. The iterations are shown in figures B.6 to B.11 with
s_0 = {e_1: (a), e_2: (b)},
s_1 = {e_2: (b), e_3: ([v_1, v_1(a)])},

Figure B.4 Execution graph semantics of a node with choice

Figure B.5 Example flow graph with choice for execution graph semantics

s_2 = {e_1: (a), e_3: ([v_2, v_2(b)])},
s_31 = {e_3: ([v_1, v_1(a)], [v_2, v_2(b)])} = s_42,
s_41 = {e_3: ([v_2, v_2(b)], [v_1, v_1(a)])} = s_32,
s_5 = {e_2: (b), e_4: (v_3(v_1(a)))},
s_6 = {e_1: (a), e_4: (v_3(v_2(b)))},
s_7 = {e_3: ([v_2, v_2(b)]), e_4: (v_3(v_1(a)))},
s_8 = {e_3: ([v_1, v_1(a)]), e_4: (v_3(v_2(b)))},
s_9 = {e_4: (v_3(v_1(a)), v_3(v_2(b)))} and
s_10 = {e_4: (v_3(v_2(b)), v_3(v_1(a)))}.

The fifth iteration yields graphjoin(𝒱[[v_1]]p_4, 𝒱[[v_2]]p_4, 𝒱[[v_3]]p_4) = graphjoin(p_4, p_4, p_4) = p_4, and thus the fixpoint is reached.

[Figure B.6 not reproduced]

Figure B.7 graphjoin(𝒱[[v_1]]p_1, 𝒱[[v_2]]p_1, 𝒱[[v_3]]p_1) = graphjoin(p_11, p_12, p_13) = p_2

Figure B.8 p_2

Theorem B.4 For (well-formed) data flow graphs, the reachability graph induced by the transitional semantics as defined in definition 2.66 is equivalent to the reachability graph of the execution graph denotational semantics as defined in definition B.11.

B.2 Behavior expression semantics of flow graphs

In this section, a denotational semantics is defined based on behavior expressions, which resemble programs in process algebras like CSP [2] and CCS [4]. This semantics again models the internal behavior of a flow graph.

Figure B.9 graphjoin(𝒱[[v_1]]p_2, 𝒱[[v_2]]p_2, 𝒱[[v_3]]p_2) = graphjoin(p_21, p_22, p_23) = p_3

Figure B.10 graphjoin(𝒱[[v_1]]p_3, 𝒱[[v_2]]p_3, 𝒱[[v_3]]p_3) = graphjoin(p_31, p_32, p_33) = p_4

Notation B.13 Let A be a set of 'channels' and Ā the set of complementary 'channels'. In the sequel, a ∈ A, ā ∈ Ā and µ ∈ A ∪ Ā ∪ {τ}, where τ is the so-called silent action; complementation is an involution, i.e. the complement of µ̄ is µ again (and τ̄ = τ). Thus µ is either a or ā.

Definition B.14 A behavior expression B is an expression according to the following syntax:

B ::= nil | av. B | āExpr. B | B + B | B || B | if BoolExpr then B else B

where a and ā are members of a set of 'channels', v is a variable having a value in the value domain D, Expr is an expression on D, and BoolExpr is an expression that yields a boolean value. nil means that no action can take place. The + operator is a choice operator, i.e. the behavior of B_1 + B_2 is one of B_1 or B_2. || is the parallel operator, which means that the behavior of B_1 || B_2 is the interleaving of B_1 and B_2, but synchronization is possible. The conditional expression is clear. av. B is the action prefix and is the dual of āExpr. B. The meaning of av. B || āExpr. B' is comparable to (λv. B)(Expr) in the lambda calculus [1].

Notation B.15 BehExp denotes the class of behavior expressions.
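The syntax of Definition B.14 can be rendered as a Haskell data type, using a function for the binding construct av. B; Chan and all constructor names are invented for this sketch, and expressions āExpr are assumed to be evaluated to a value before being sent.

```haskell
type Chan = String

data BehExp d
  = Nil                              -- nil: no action can take place
  | In  Chan (d -> BehExp d)         -- av. B       (receives x, binds v)
  | Out Chan d (BehExp d)            -- āExpr. B    (Expr already evaluated)
  | Choice (BehExp d) (BehExp d)     -- B + B
  | Par    (BehExp d) (BehExp d)     -- B || B
  | If Bool (BehExp d) (BehExp d)    -- if BoolExpr then B else B
```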

Definition B.16 The transitional semantics of behavior expressions is defined by the following inference rules, with x ∈ D:

av. B →^ax [x/v]B   (for every x ∈ D)

āx. B →^āx B

B_1 →^µx B_1'  ⟹  B_1 + B_2 →^µx B_1'

B_2 →^µx B_2'  ⟹  B_1 + B_2 →^µx B_2'

B_1 →^µx B_1'  ⟹  (if true then B_1 else B_2) →^µx B_1'

B_2 →^µx B_2'  ⟹  (if false then B_1 else B_2) →^µx B_2'

B_1 →^µx B_1', µ ∉ 𝒜(B_2)  ⟹  B_1 || B_2 →^µx B_1' || B_2

B_2 →^µx B_2', µ ∉ 𝒜(B_1)  ⟹  B_1 || B_2 →^µx B_1 || B_2'

B_1 →^ax B_1', B_2 →^āx B_2'  ⟹  B_1 || B_2 →^τ B_1' || B_2'

where 𝒜(B) is the alphabet of B defined by:

𝒜(B) = ∅ if B = nil
𝒜(B) = {a, ā} ∪ 𝒜(B_1) if B = av. B_1 or B = āExpr. B_1
𝒜(B) = 𝒜(B_1) ∪ 𝒜(B_2) if B = B_1 + B_2 or B = B_1 || B_2
𝒜(B) = 𝒜(B_1) if B = if true then B_1 else B_2
𝒜(B) = 𝒜(B_2) if B = if false then B_1 else B_2
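Over the BehExp data type above, the alphabet 𝒜(B) becomes a straightforward recursion. Because input prefixes bind their continuation as a Haskell function, this sketch probes the body with a sample value, which suffices under the assumption that all bodies share one alphabet; for recursive processes the recursion would not terminate, so it only illustrates the shape of the definition.

```haskell
import qualified Data.Set as Set

-- alphabet sample b computes A(b); a channel name stands for the pair {a, ā}.
alphabet :: d -> BehExp d -> Set.Set Chan
alphabet _ Nil            = Set.empty
alphabet x (In a body)    = Set.insert a (alphabet x (body x))
alphabet x (Out a _ b)    = Set.insert a (alphabet x b)
alphabet x (Choice b1 b2) = alphabet x b1 `Set.union` alphabet x b2
alphabet x (Par b1 b2)    = alphabet x b1 `Set.union` alphabet x b2
alphabet x (If c b1 b2)   = alphabet x (if c then b1 else b2)
```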

Cf. the semantics of definition B.16 with the semantics of CSP with value passing [2]. There also communication is defined as a join, i.e. a required synchronization.

Proposition B.5
B_1 + B_2 = B_2 + B_1
(B_1 + B_2) + B_3 = B_1 + (B_2 + B_3)
B + B = B
B + nil = B
B_1 || B_2 = B_2 || B_1
(B_1 || B_2) || B_3 = B_1 || (B_2 || B_3)
B || nil = B

Definition B.17 For a behavior expression B, a closed derivation is a path that is only labeled with τ's in the tree induced by the transitional semantics for behavior expressions as defined in definition B.16.

Definition B.18 For a (well-formed) data flow graph G = V_1 || ··· || V_n with initial state s_0, the behavior expression denotational semantics BehExpr(G, s_0) is defined as (the closed derivations of) 𝒢[[G]]s_0, where

𝒢: G → State' → BehExp
𝒱: G_elem → BehExp

are defined as:

𝒢[[G]] = λs. 𝒱[[V_1]] || ··· || 𝒱[[V_n]] || sem(s)
𝒱[[V]] = V where V = v s. µ̄(update(s, v)). V
sem(s) = case enabled(v_1, s): v̄_1 s. µ s'. sem(s')
         ···
         case enabled(v_n, s): v̄_n s. µ s'. sem(s')
         otherwise nil

case is the non-deterministic conditional expression. Note that update is used here as a non-deterministic function State' × V → State'. Considering update as a function State' × V → Set(State'), V would be defined as:

V = v s. Σ_{s' ∈ update(s, v)} µ̄ s'. V
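The net effect of one handshake pair in sem, first v̄_i s and then µ s', is a single transition of the operational semantics. Flattening the case expression accordingly gives the following sketch, where successors is an invented name, enabled and update again stand in for the thesis functions, and update returns a set of states to model the non-determinism of choice.

```haskell
-- All transitions that sem(s) offers: one entry per enabled node and per
-- possible outcome of the non-deterministic update.
successors :: [n]                -- nodes v1 .. vn
           -> (n -> s -> Bool)   -- enabled
           -> (n -> s -> [s])    -- update, read as State' x V -> Set(State')
           -> s -> [(n, s)]
successors vs enabled update s =
  [ (v, s') | v <- vs, enabled v s, s' <- update v s ]
```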

Notation B.19 When the initial state s_0 of the (well-formed) data flow graph G is clear from the context, BehExpr(G) = BehExpr(G, s_0).

Example B.4 Consider the flow graph as used in examples 2.6, 2.12 and B.3 (figure B.5) with initial state s_0 = {e_1: (a), e_2: (b)}.

𝒢[[G]] = V_1 || V_2 || V_3 || sem(s_0)

Since only the closed derivations need to be considered,

= τ. (µ̄(update(s_0, v_1)). V_1 || V_2 || V_3 || µ s'. sem(s'))
+ τ. (V_1 || µ̄(update(s_0, v_2)). V_2 || V_3 || µ s'. sem(s'))
= τ. τ. (V_1 || V_2 || V_3 || sem(s_1))
+ τ. τ. (V_1 || V_2 || V_3 || sem(s_2))

where s_1 = update(s_0, v_1) = {e_2: (b), e_3: ([v_1, v_1(a)])} and s_2 = update(s_0, v_2) = {e_1: (a), e_3: ([v_2, v_2(b)])}. Further expansion yields:

= τ. τ. ((V_1 || µ̄(update(s_1, v_2)). V_2 || V_3 || µ s'. sem(s'))
       + (V_1 || V_2 || µ̄(update(s_1, v_3)). V_3 || µ s'. sem(s')))
+ τ. τ. ((µ̄(update(s_2, v_1)). V_1 || V_2 || V_3 || µ s'. sem(s'))
       + (V_1 || V_2 || µ̄(update(s_2, v_3)). V_3 || µ s'. sem(s')))
= τ. τ. (τ. τ. (V_1 || V_2 || V_3 || sem(s_31)) + τ. τ. (V_1 || V_2 || V_3 || sem(s_32)) + τ. τ. (V_1 || V_2 || V_3 || sem(s_5)))
+ τ. τ. (τ. τ. (V_1 || V_2 || V_3 || sem(s_42)) + τ. τ. (V_1 || V_2 || V_3 || sem(s_41)) + τ. τ. (V_1 || V_2 || V_3 || sem(s_6)))

with s_31 = {e_3: ([v_1, v_1(a)], [v_2, v_2(b)])} = s_42, s_41 = {e_3: ([v_2, v_2(b)], [v_1, v_1(a)])} = s_32, s_5 = {e_2: (b), e_4: (v_3(v_1(a)))} and s_6 = {e_1: (a), e_4: (v_3(v_2(b)))}. Further expansion will yield the same tree as the reachability graph of example B.3, in which each edge is a chain of two edges, both labeled with τ.

Theorem B.6 For (well-formed) data flow graphs, the reachability tree as induced by the transitional semantics as defined in definition 2.66 is equivalent to the (closed) derivation tree induced by the behavior expression semantics as defined in definition B.18.

Only some informal lines of the proof are given. Both semantics use the same type of state. Because the transitional semantics of the communication operator || resembles the communication operator of CSP [2], i.e. it is a real synchronization as opposed to the CCS communication operator [4], there is a bijection between the closed derivations of BehExpr(G) and the paths of RG(G). This is even true for all the non-determinism in the graph. When the transitions of the closed derivations of BehExpr(G) are labeled with v_i instead of τ, whenever such a τ comes from the synchronization of v s. B || v̄ s, the labels of the closed derivations and the paths in the reachability tree also match. Thus all the defined semantics, i.e. the operational semantics of definition 2.82, the behavioral denotational semantics of definition 2.93, as well as the execution graph denotational semantics and the behavior expression semantics defined in this appendix, denote the same behavior.

References

[1] C. HEWITT AND H. BAKER, "Actors and Continuous Functionals", Formal Description of Programming Concepts, ed. E.J. Neuhold, North-Holland, Amsterdam, 1978, pp. 367-390.

[2] C.A.R. HOARE, Communicating Sequential Processes, Prentice-Hall International, Englewood Cliffs, NJ, 1985.

[3] P. HUDAK, "Denotational Semantics of a Para-Functional Programming Language", Int. J. of Parallel Programming, vol. 15, no. 2, 1986, pp. 103-125.

[4] R. MILNER, A Calculus of Communicating Systems, Lecture Notes in Computer Science 92, Springer Verlag, Berlin, 1980.

[5] D.A. SCHMIDT, Denotational Semantics: a Methodology for Language Development, Allyn and Bacon, London, 1986.

[6] M.B. SMYTH, "Power Domains", J. Comput. System Sci., vol. 16, 1978, pp. 23-36.

Glossary

Symbols

symbol  def  description  page
#  Not. A.4  cardinality of a set  183
(λx. E_1)E_2  Def. A.66  function application  195
( )  Def. A.60  stream assembly function  193
( )  Not. A.59  bottom element of the domain Str(D)  192
·  Def. 2.75  language concatenation  49
·  Def. A.58  language concatenation  192
|  Not. A.10  function restriction  184
≤_State  Def. 2.86  partial order of the domain State  55
σ  Not. 2.97  sequence of nodes  73

≤_Path  Not. 2.108  partial order on paths  86
τ  Def. 3.2  schedule  129
  Def. 3.5  generalized schedule  130
v  Not. 2.4  node  20
|A|  Not. A.4  cardinality of a set  183
|p|  Not. 2.31  length of a path  26
|s|  Not. A.62  length of a stream  193
||  Not. 2.53  parallel graph composition  32

Functions

function  def  description  page
Behav  Def. 2.82  operational semantics, set of all path behaviors  52
Behavior  Def. 2.90  behavioral semantics of choice-free graphs  57
Behavior  Def. 2.93  behavioral semantics of generalized graphs  62
I  Def. 2.11  input edges of a graph  21
In  Def. 2.10  connected input ports of a node  21
In  Def. 2.12  input nodes of a graph  21
O  Def. 2.11  output edges of a graph  21
Out  Def. 2.10  connected output ports of a node  21
Out  Def. 2.12  output nodes of a graph  21
Range  Def. A.9  range of a function  184
acyclic  Not. 2.34  adjective for acyclic graphs  27
adjacent  Def. 2.14  adjective for adjacent edges and nodes  22
anti-symmetric  Def. A.6  anti-symmetric relation  184
append  Def. A.60  stream assembly function  193
choice  Def. 2.20  adjective for choice edges  24
choice  Def. 2.21  adjective for choice nodes  24
choice  Def. 2.22  adjective for graphs with choice  24
choice  Def. 2.23  adjective for self-choice nodes  24
close  Def. A.15  maximally closed set  185
complete  Def. A.16  adjective for maximal complete functions  185
composition  Def. 2.52  parallel graph composition  32
conflict  Def. 2.21  adjective for conflict nodes  24
conflict  Def. 2.22  adjective for graphs with conflict  24
connected  Def. 2.39  adjective for connected graphs  28
connected  Def. 2.39  adjective for strongly-connected graphs  28
cons  Def. A.60  stream assembly function  193
continuous  Def. A.36  adjective for continuous functions  187
cycle  Def. 2.32  adjective for cyclic paths  26
cyclic  Def. 2.33  adjective for graphs with cycles  27
dangling  Def. 2.24  adjective for dangling nodes and edges  25
dead  Not. 2.100  deadlock of a node  84
dest  Def. 2.7  destinations of an edge  20
disjunctive  Def. 2.48  adjective for disjunctive nodes  30
duplicate  Def. 2.13  adjective for duplicate edges  21
elementary  Def. 2.51  adjective for elementary graphs  32
embedding  Def. 2.113  substitution of a node by a graph  91
enabled  Def. 2.66  adjective for the enabling of nodes  47
enabled  Not. 2.68  adjective for the enabling of node sequences  47
fappend  Def. 2.87  assembly function on states  56
fcons  Def. 2.87  assembly function on states  56
fhead  Def. 2.87  disassembly function on states  56
final  Def. 2.74  terminal states of reachability graphs  49
first  Def. A.65  set disassembly function  194
flat  Def. A.48  adjective for flat domains  189
from  Def. 2.8  originating nodes of an edge  20
fst  Def. A.56  tuple disassembly function  191
ftail  Def. 2.87  disassembly function on states  56
function  Not. 2.45  function of a node  29
head  Def. A.60  stream disassembly function  193
identity  Def. A.11  identity function  184
in  Def. 2.16  incoming edges of a node  22
independent  Def. 2.15  adjective for independent nodes  22
isolated  Def. 2.25  adjective for isolated nodes  25
join  Def. 2.89  merge of a set of states  57
k-enabled  Def. 2.69  adjective for the multiple enabling of nodes  48
k-well-behaved  Def. 2.110  adjective for k-well-behaved graphs  90
last  Def. A.60  stream disassembly function  193
length  Def. 2.30  length of a path  26
length  Def. A.60  length of a stream  193
lift  Def. A.49  lifted domain  190
live  Def. 2.101  liveness of a graph  84
live  Def. 2.99  liveness of a node  84
monotonic  Def. A.35  adjective for monotonic functions  187
node-equal  Not. 2.98  adjective for node sequence permutations  76
orig  Def. 2.7  origins of an edge  20
out  Def. 2.16  outgoing edges of a node  22
output-choice  Def. 2.21  adjective for output-choice nodes  24
parallel  Def. 2.13  adjective for parallel edges  21
paths  Def. 2.78  the paths of a reachability graph  50
pred  Def. 2.18  immediate predecessors of a node  23
proper  Def. 3.13  adjective for proper allocations  134
proper  Def. 3.3  adjective for proper schedules  130
proper  Def. 3.6  adjective for proper generalized schedules  130
rcons  Def. A.60  stream assembly function  193
reach  Def. 2.37  successors of a set of nodes  27
reach  Def. 2.38  successor nodes of the input nodes of a graph  27
reach  Not. 2.35  successors of a node  27
reachable  Def. 2.36  adjective for mutually reachable nodes  27
reflexive  Def. A.6  reflexive relation  184
relabeling  Def. 2.57  graph relabeling  33
rest  Def. A.65  set disassembly function  194
restriction  Def. 2.54  graph restriction  33
safe  Def. 2.104  safeness of a node-port pair  85
safe  Def. 2.105  safeness of a graph  85
selective  Def. 2.48  adjective for selective nodes  30
self-loop  Def. 2.19  adjective for self-loop nodes, edges and graphs  23
shuffle  Def. 2.91  interleaving of streams  60
shufflejoin  Def. 2.92  interleaved merge of a set of states  61
shufflercons  Def. 2.65  interleaved extension of streams  45
simple  Def. 2.27  adjective for simple graphs  25
snd  Def. A.56  tuple disassembly function  191
strict  Def. A.46  adjective for strict functions  189
strong-component  Def. 2.43  adjective for strongly-connected components  29
strong-live  Def. 2.101  strong definition of liveness of a graph  84
strong-live  Def. 2.99  strong definition of liveness of a node  84
subgraph  Def. 2.40  adjective for subgraphs  28
subgraph  Def. 2.41  subgraph induced by a set of nodes  28
subgraph  Def. 2.42  subgraph induced by a set of edges  28
succ  Def. 2.18  immediate successors of a node  23
symmetric  Def. A.6  symmetric relation  184
tail  Def. 2.77  disassembly function on paths  49
tail  Def. A.60  stream disassembly function  193
to  Def. 2.8  destination nodes of an edge  20
token-count  Def. 2.116  number of data items on a path  100
transitive  Def. A.6  transitive relation  184
type  Not. 2.45  type of a node  29
union  Def. A.65  set assembly function  194
update  Def. 2.66  state update by node executions  47
update  Not. 2.67  state update by node sequence executions  47
val  Def. 2.65  disassembly operator in the D' domain  45
well-behaved  Def. 2.111  adjective for well-behaved graphs  90
well-formed  Def. 2.26  adjective for well-formed graphs  25

Curriculum Vitae

On the twelfth of March 1965, I was born in Holwierde, then in the municipality of Bierum, currently of Delfzijl. At the age of almost six, I moved to Tiel, where from then on I attended elementary school. Then came another six years of secondary education, i.e. ongedeeld Voorbereidend Wetenschappelijk Onderwijs, at the Koningin Wilhelmina College in Culemborg. From September 1983, I was registered as a student in the faculty of Electrical Engineering at the Eindhoven University of Technology, by then still a Hogeschool. I graduated with honors on December 17, 1987, after which I started as a Research Assistant, extending the research that resulted in my Master's thesis. The research I did in these four years in the Design Automation Section became part of the ESPRIT Basic Research Action project 3281, better known as ASCIS, and finally resulted in this thesis. Having fulfilled, in military service, the duties that were expected of me as a Dutch citizen, I am currently working on design technologies for heterogeneous systems at IMEC in Leuven, Belgium.

Propositions (Stellingen) accompanying the thesis of Gjalt Gerrit de Jong

1. A system with a larger state space does not necessarily have a larger reduced state space. [section 3.1 of this thesis]

2. Data flow graphs as the underlying data structure offer an architectural synthesis system more opportunities for optimization. [section 3.2 of this thesis]

3. A (hardware) description language must be supported by a semantics that corresponds to the intuitive interpretation.

4. The effectiveness of Binary Decision Diagrams is overestimated.

5. Temporal logic is not suitable for the analysis and verification of real-time problems.

6. When describing a system with temporal logic, the usual choice of a branching-time logic is in fact based solely on an efficient method of verification by means of 'model checking', whereas a linear-time temporal logic is more natural.

7. Arguments that are put forward against simulation and in favor of formal verification can often also be brought to bear against design verification.

8. (Common) Lisp is often used as an imperative language. And since the run time of a program written in a language such as Lisp is only a small factor larger than that of the same program written in a language such as C, Lisp is very well suited for prototyping.

9. The choice of commercial software instead of 'free' or 'public domain' software on grounds of robustness and better support often proves unproductive.

10. A home-to-work distance of more than an hour by public transport can benefit the work.