InformationProcessing Letters 16(1983) 139-143 North-Holland Publishing Company 15 April 1983

, A CONTROL MECHANISM OF A LISP-BASED DATA-DRIVEN MACHINE

Toshitsugu YUBA, Yoshinori YAMAGUCHI and Toshio SHIMADA ElectrotechnicalLaboratory, Niiharigun, Ibaraki 305, Japan

Communicatedby A. Ershov Received 15 October 1982

Keywords: Lisp, data-drivenarchitecture,lazy evaluation,multiprocessing

£- 1. Introduction (2) Parallel evaluation: The mechanism of com- putation leads to parallel evaluation of 'suspen- We are doingresearch on and development of a sion' in a lazyevaluation scheme. So the computa- Lisp-based high level language machine with tional capability is the same as lazy evaluation, but data-driven architecture, called the ETL data- the control of evaluation is carried out driven Machine-3 eagerly. (EM-3). The EM-3 is basically (3) Distribution of structure oriented to symbol memories: Stor- manipulation and aims at ages needed for list structures are distributed bringing out the intrinsic parallelism over in ordinary multiple processing elements (PE). programs. Each PE has a local store with a common addressing space. The EM-3 is a new generationLisp machine in the sense that the basic technology is on data- driven architecture. 2. Function evaluation scheme Recently, many types of data-driven machines have been proposed but few of these are proto- The EM-3 has very innovative typed [I]. control features For symbol manipulation use there are in order to achieve parallel lazy only a few evaluation. plans such as that of the University of First, the notion of a 'pseudo-result' Utah [2] "and the Electrical is intro- Communication duced in describing parallel lazy evaluation. Func- Laboratory of NTT [3]. The model of the Univer- tion execution causes the generation of a pseudo- sity of Colorado [4] is related to ours in some result in the control mechanism as shown in Fig. 1. points but is independently conceived. These Similar to other data-driven models, when and data-driven machines have such a 'good' property of parallelism that programming for parallel ex- ecution does not require a special language con- F Defined Function struct. A program in some data-driven language is executed in a parallel processing manner in accor- dance with the natural data dependencies between the operations in the actual program. pseudo The design principles of the EM-3 are sum- semi marized as follows: (1) Lisp-based machine: As a user interface a Lisp-like language called Emlisp is assumed. The Emlisp is a functional single-assignment language. Fig. 1. Generation of a pseudo-result. 0020-0190/83/$03.00© 1983 North-Holland 139

Sakuramura,

ofc

; J f Volume 16, 3 Number INFORMATION PROCESSING LETTERS 15 April 1983 only when all argument values for the defined M : CAR or CDR function F become available, the pseudo-result is Z wait generated as the virtually resulting value of the function application. At that time each argument value is not necessarily actual, as mentioned in detail later. The notion of a pseudo-result corresponds to that of suspension in lazy evaluation. In a lazy evaluation scheme, evaluation of a computation is delayed until the following computation requires "Fig. 3. Application of a car/cdr operation. actual argument values. Whereas in pseudo-result scheme, the evaluation of a function keeps going in parallelwith the evaluation of its successor. The result, the output is identifiers of pseudo-results immediately obtained but it is are realized by addre- not always an actual-result. sses on a result store and these are eventually filled In list operations, the by actual-results. Note type of their outputs that pseudo-results can fire differs from each other other operations defined depending on the individ- and/or functions as well ual case, as shown in Figs. 3 and as actual-results do. 4. For example, car and cdr operations may be Advanced control and pipelining immediately ex- are carried ecuted when the input is an actual-result out by utilizing the notion of pseudo-results or a and semi-result. When a structure result they may increase concurrency in arrives at an multiprocessing atom or a null operation, the output environments. The becomes an introduction of pseudo-results actual-result ('false'). If there is relaxes firing conditions and, no pseudo-result therefore, the evalua- at inputs, the operationeg can tion of a function is immediately gener- advanced and the operations ate an actual-result. But when 1 contained in the function the inputs to list can be executed concur- operationsinclude a rently pseudo-result, the execution with the evaluation of its successor. This is deferred until its allowes actual value arrives. some degree of overlapping of computa- Third, for non-list tion. operations like arithmetic, the execution of operationswill be deferred Secondly, the operation in Emlisp until con- the inputs turn into actual-results. Fig. 5 shows structs either a 'semi-result' or an the actual-result as general operationwith one and two inputs. illustrated in Fig. 2. A cons accepts When any combina- an operation is deferred, the control mechanism tion of actual-results, semi-results and pseudo-re- works as follows: An 'entrust' packet, whose sults as its inputs, and then generates for- a semi-result. mat is given later, is generated Like a list cell of a and sent to another conventional Lisp, a semi-result PE according to thepseudo-result has car and cdr parts identifier. There, as its substructure. Semi-re- the operation waits until pseudo-result sults are able to fire the has operations/functions as well. been turned into an actual-result. In case of applying car/cdr operationsfor a semi-

X: CONS j" jji -jj p CD CD ll f |

Fig. 2. Generationof a semi-resultin a consoperation. Fig. 4. Application ofan atom, null or eg operation. 140 16, Volume Number 3 INFORMATION PROCESSING LETTERS 15 April 1983

P, : General Operation 3. Functional organization of a processing element

The functional diagram of the PE of the EM-3 is illustrated in Fig. 6. Each PE is assumed to be connected together through communication lines such as buses, routers or crossbar switches. The Input Section receives three types of packets; these are 'result', 'invoke' and 'entrust' packets which are shown in the next section in concreteform. These areplaced into a queue. Each packet in the queue is sent to a corresponding section according to its type. Outputs of each section are sent to another section of the same PE Fig. 5. Application of a general operation. in a pipeline manner, or are placed in a queue of the Output Section. Packets on the output queue are sent to another PE.

I

1 I I

l

i

Fig. 6. Functional organization of the PE.

141

Q Volume 16, Number 3 INFORMATION PROCESSINGLETTERS 15 April 1983 In the Operand Section, Matching packets de- when inputs include a pseudo-result, and defers stinated to operands of an operation and/or a the execution of the operation. The generated en- defined function are matched with each other in trust packet is sent to a PE which is assigned by the Matching Store and then synchronized as a the pseudo-result identifier. consequence. Unmatched packets are in a waiting Note that the assignment of a pseudo-result state and are stored in the Matching Store, which identifier is carried out at each PE independently. associatively is searched whenever needed. Therefore, the global identifier can be represented The Operation Fetch Section fetches operations by combining a pseudo-result identifier with the from theProgram Store according to the operation number of the PE where it is created. addresses sent from the Matching Store and then combines them with their operands to generate internal packets. 4. Formats of packets The Initiation Section accepts invoke packets and extracts a function name and its arguments The following are the formats of each packet: placed in the packets, and then generates result Result Packet: packets corresponding to each argument. After {'result', PE, color, that, the body of the function is activated by these destination, value) result packets. Invoke Packet: The Invocation Section works when the opera- ('invoke', PE, color, function-name, tion fetched at the Operation Fetch Section is an invoke operation which invokes a defined func- arguments, dest-list). tion. A pseudo-result identifier is created for a Entrust Packet: newly invoked function and, at the same time, an invoke packet is generated and sent to the PE ('entrust', PE, color, opcode, operands, dest-list). scheduler. In a result packet, PE represents the number of i The Searching Section is activated by an entrust a PE to which the result packet is sent and the packet. An entrust packet is associated with a color means a tag discriminating the body of a pseudo-result identifier whose actual value is re- function which is invoked recursively. The destina- quired. Using this identifier, the Pseudo-result Ta- tion points to the successor operation and is repre- ble in the Result Store is searched. If the corre- sented by an operation address, a 1-bit sponding actual result is found and the number entrust indicating the port of the successor operation packet is immediately executable, i.e., its and operands the number of inputs to enable the destination are sufficientlycomplete to allow execution of the address. operation, then it is sent to theExecution Section. In an invoke packet, the function-name is the If not found, the entrust packet is stored at the i name of an invoked function and the arguments I Deferred Buffer in the Result Store and waits for i field is its parameters. The dest-list a completion of the predecessor operation is list of assigned destinations mentioned above. The by the pseudo-result identifier. opcode and operands in an entrust packet are the code of At the exit of each function, the Exit Section an i entrusted operationand its operand values, respec- saves values of actual-results or pointers to semi- tively. results which correspond to each pseudo-result of the function into the Pseudo-result Table. This time, the entrust packets waiting for completion of 5. Conclusion the function corresonding to theexit operation are activated and, if immediately executable, they are In this paper a model of the control mechanism then sent to the Execution Section or else to the for a Lisp-based data-driven machine is proposed Entrust Section. and the functional configuration of a processing The Entrust Section generatesan entrust packet element is shown on the base of the model. By the 142 13 Volume 16, Number 3 INFORMATION PROCESSING LETTERS 15 April 1983 i s introduction of the notions pseudo-result and EM-3 through defining each PE section as a 'class. semi-result, it is shown that 'parallel' lazy evalua- We believe that many ideas about data-driven >y tion can be naturally realized in data-driven archi- architecture should be prototyped and then the tecture. feasibility of implementation should be tested by ilt In conclusion we present an overview of current measuring and evaluating performance of the pro- y- related research; one of the studies currently in totypes. :d i progress is the measurement of intrinsic paralle- Le lism in Lisp programs. A summary of this empiri- cal study was reported in [6]. The experimental References results will be useful for designing a Lisp-like which should be well suited to data-driven [1] V.K. Avind and K. Pingali, A data flow architecture with language Science, the sizes of each tagged tokens, Rept. TM-174, Lab. for Computer architecture and for estimating MIT, 1980. store in a PE and thecommunication traffic among [2] R.M. Keller, G. Lindstrom and S. Patil, A loosely-coupled PEs. applicative multi-processing system, Proc. NCC (1979) Another study is the design of a user interface 613-622. languagefor the EM-3, Emlisp, and the implemen- [3] M. Amamiya, R. Hasegawa and H. Mikami, A list process- compiler. inter- ing oriented data flow machine and its simulator, Papers of tation of its We have assumed an on Comp. 40-8, IPS Japan(1981) 77-88 architecture, Tech. Group Arch. mediate language for data-driven (in Japanese). named Emil, which is useful for describing a data- [4] J.C. Peterson and W.D. Murrey, Parallel computer architec- flow graph. The Emlisp compiler translates an ture employing systems, Proc. In- Emlisp program into an Emil program. ternal Workshop Highlevel Language Computer Architec- ture (1980) 190-195. 1 A study is the development of a soft- further [5] Y. Yamaguchi, T. Yuba and T. Shimada, A function I ware simulator for EM-3's processing elements. evaluation mechanism of EM-3, Papers of Tech. Group on This work is now being carried out in order to Comput. lECE Japan(1981) 1-8(in Japanese). Yamaguchi, M. and T. Yuba, An empirical i confirm the EM-3 control mechanism and to [6] Y. Yamada evaluate its efficiency. This simulator accepts an study of intrinsic parallelism in Lisp programs, Proc. 24th (1982) 101-102 (in Japanese). Emil program as its source. Written in Simula, it Nat. Conf. of IPS Japan is able to simulate the pipeline behavior of the

I

143

t

ECBI-57,