On Remote Procedure Call *

Patricia Gomes Soares t and Communications (DCC) Lab . Computer Science Dept ., Columbia Universit y New York City, NY 1002 7

Abstract makes resources available to a wider commu- nity. Both data and processes) can be repli- The Remote Procedure Call (RPC) paradigm i s cated thereby achieving greater performance reviewed. The concept is described, along with through parallelism, increased reliability and the backbone structure of the mechanisms tha t availability, and higher fault tolerance. The re- support it . An overview of works in support- mote accessibility of specialized units provid e ing these mechanisms is discussed. Extensions better performance for applications with spe- to the paradigm that have been proposed t o cific requirements. enlarge its suitability, are studied . The main Initially, only operating-system-level prim- contributions of this paper are a standard vie w itives supported interprocess communication . and classification of RPC mechanisms accord- Unfortunately, the details of communicatio n ing to different perspectives, and a snapshot o f were not transparent to application program- the paradigm in use today and of goals for the mers. To transmit complex data structures, i t future of RPC . was necessary to explicitly convert them int o or from byte streams . The programmer neede d to be aware of heterogeneity at the machin e 1 Introductio n level, as well as at the language level, betwee n the sending and the receiving processes. No Over time, operating-system support as well compile-time support was offered by the lan- as language-level support have been designed guages, and debugging became an extremel y to ease the development of distributed applica- difficult task. A higher level of abstraction was tions . A distributed application is a set of re- needed to model interprocess communication . lated programs that execute concurrently or in A variety of language-level constructors an d parallel, on the same or distinct autonomou s operators were then proposed to address thi s processing nodes of a distributed system, inter- problem. They improved readability, portabil- changing information. ity, and supported static type checking of dis- The decentralized configuration provided b y tributed applications . A major achievemen t a distributed system brought many benefits . in this area was the Remote Procedure Call The interconnection of these processing units (RPC) . *The IBM contact for this paper is Jan Pachl, Centr e The procedure mechanism, which is provide d for Advanced Studies, IBM Canada Ltd ., Dept. B2/894, by most imperative nondistributed languages , 895 Don Mills Road, North York, Ontario, M3C 1W3 . was initially proposed to support code mod- This work was partly supported by Brazilian Re- search Council (CNPq) under Grant 204543/89 .4. IA running program unit is a process .

215 ularization, and to be an abstraction facility. tinct approaches . In Section 8, extensions and RPC extends this mechanism to a distributed software support added to the RPC paradigm environment, whereby a process can call a pro- are discussed. Finally, Section 9 describes mea- cedure that belongs to another process . Inter- surements that can be used to evaluate the per- process communication is then given the syn- formance of RPC systems . tax and semantics of a well-accepted strongly typed language abstraction . Significant result s have been achieved in the effort to better ac- 2 General Concepts commodate in the model problems related not only with interprocess communication issues , Before exploring the concept of RPC, the un- but also with distributed environment configu- derlying environment is described here. The rations. major environment requirements that RPC is Computing systems continue evolving at aimed to support are introduced . Other ap- an astonishing speed, and parameters change proachs that address these features are sur- quickly. A high-speed network is capable of veyed. continuously broadcasting a database, thu s A distributed computing system consists of providing for an almost zero access time t o a collection of autonomous processing nodes data. An today plays the role interconnected through a communication net - of a decentralized manager that searches for fa- work. An autonomous processing node com- cilities to provide applications with necessar y prises one or more processors, one or mor e resources . Users are close to a declarative en- levels of memory, and any number of exter- vironment where they specify what their needs nal devices [8, 47] . No homogeneity is as- are, and the distributed system is supposed t o sumed among the nodes, neither in the proces- select, based on the instantaneous configura- sor, memory, nor device levels . Autonomous tion of the system, how to provide it . nodes cooperate by sending messages over the network. The communication network may be Much demand is placed on mechanisms that local area (short haul), wide area (long haul) , isolate architectural dependent characteristic s or any combination local ones and wide ones from the user . The RPC is an abstraction connected by gateways . Neither the network which can make these dependencies transpar- nor any node need be reliable . In general, it is ent to the user . A system supporting RPCs assumed that the communication delay is long may provide for location transparency, where enough to make the access to nonlocal data sig - the user is unaware of where the call is exe- nificantly more expensive than the access to 10- cuted. cal primary memory. It is arguable, though, as The RPC constitutes, also, a sound basi s to whether or not the nonlocal access is not sig- for moving existing applications to the dis- nificantly more expensive than the local acces s tributed system environment . It supports soft- when secondary storage is involved [59] . ware reusability, which can be one of the most A distributed program consists of multiple important medicines in attacking the softwar e program units that may execute concurrently crisis [69] . on the nodes of a distributed system, and inter- In this paper, the whole RPC concept is ex- act by exchanging messages2 . A running pro- plored, and surveyed . In Section 2, the un- gram unit is a process . The distributed pro- derlying architecture is described, and in Sec- gramming consists of implementing distributed tion 2.1 the concept of RPC is introduced . Sec- applications . The distributed languages consist tion 3 presents the major challenges in support - of programming languages that support the de- ing the paradigm. The standard design of an velopment of distributed applications . There RPC system is described in Section 4, as well are three important features that distribute d as its major components, their function, and how they interact . Sections 5, 6, and 7 sur- 2 1n shared memory distributed systems, progra m vey each of those major components, analyzin g units can also interact through shared memory. Thes e systems will not be considered in this paper, and some the problems they are supposed to deal with , assumptions and assertions made may not be valid fo r and describing several systems and their dis- those environments .

216 languages must deal with that are not presen t of only a subset of the nodes in an applica- in sequential languages [8] : tion's execution, partial failure, leaves still the opportunity for the detection and recovery of • Multiple program units the failure, avoiding the application's failure. • Communication among the program unit s Considerable increase in reliability can also b e achieved by simply replicating the functions o r • Partial failure . data of the application on several nodes . A distributed language must provide sup - Distributed languages differ in modeling pro- port for the specification of a program compris- gram units, communication and synchroniza- ing multiple program units. The programme r tion, and recoverability . Some languages model should also be able to perform process man- program units as objects and are distributed agement . That is, to specify the location i n object-based. An extensive survey of dis- which each program unit is to be executed ; the tributed object-based programming systems i s mode of instantiation, heavy weighted or light described in [16] . Others model the program weighted; and when the instantiation should units as processes units and are named process- start. The operating system allocates a com- oriented (Hermes [75], and NIL [76]) . Yet oth- plete address space for the execution of heav y ers model program units into tasks, as well as weighted processes . The address space con- processes (Ada [20]), where a task is a less ex- tains the code portion of the process, the dat a pensive instantiation of program units than the portion with the program data, the heap for other two methods. dynamic allocation of memory, and the stack , Many languages model communication as which contains the return address linkage for the sending and reception of messages, re- each function call and also the data elements ferred to as message-based languages, simulat- required by a function . The creation of a light- ing datagram mailing (CSP [35], NIL [74]) . A weighted process causes the operating system process P willing to communicate with a pro- to accommodate the process into the addres s cess Q, sends a message to Q and can continue space of another one . Both processes have th e executing . Process Q, on the other hand, de- same code and data portions and share a com- cides when it is willing to receive messages and mon heap and stack spaces . Process manage- then checks its mail box. Process Q can, if nec- ment can be used to minimize the executio n essary, send messages back to P in response to time of a distributed program, where the par- its message. P and Q are peer processes, that allel units cooperate rather than compete. A is, processes that interact to provide a service full analysis of the limitations of synchronous or compute a result. The application user is re- communication in languages that do not sup - sponsible for the correct pairing and sequencin g port process management is found in [46] . Pro- of messages. cesses can be created either implicitly by thei r Several languages require the processes t o declaration, or explicitly by a create construct. synchronize in order to exchange informa- The total number of processes may have to b e tion. This synchronization, or meeting, is ren- fixed at compile time, or may increase dynam- dezvous, and is supported by languages such ically. as Ada [20], Concurrent C [27], and XMS [26] . Language support is also needed to spec- The rendezvous concept unifies the concept s ify the communication and synchronization o f of synchronization and communication betwee n these processes . Communication refers to the processes. A simple rendezvous allows only uni - exchange of information, and synchronizatio n directional information transfer, while an ex- refers to the point at which to communicate . tended rendezvous, also referred to as transac- For applications being executed in a single tion, allows bidirectional information transfer . node, the node failure causes the shutdown o f The Remote Procedure Call (RPC) mech- the whole application . Distributed systems ar e anism merges the rendezvous communicatio n potentially more reliable . The nodes being au- and synchronization model with the procedur e tonomous, a failure in one does not affect th e model. The communication details are handled correct functioning of the others. The failure by the mechanisms that support such abstrac -

217 tion, and are transparent to the programmer . Languages that support RPC are referred to as RPC-based languages' (Mesa [13], HRPC [55] , Concert C [90, 7]) . For the synchronization is- sue, RPC-based languages are classified as th e ones that block during a call until a reply is re- ceived, while message-based languages unbloc k after sending a message or when the messag e has been received. A distributed language may provide excep- tion handling mechanisms that allows the use r to treat failure situations . The underlying op- erating system may also provide some fault tol- erance. Projects such as Argus [44] concentrate on techniques for taking advantage of the po- tential increase in reliability and availability. A survey of programming languages paradigms for distributed computing is foun d PU: Program Unit in [8], and an analysis of several paradigms fo r Pi: Procedure i —~► Control transfer process interaction in distributed programs i s PCi: Procedure call i in [5] . Figure 1: Conventional Procedure Call Sce- nario 2 .1 RPC : The Concept This paper assumes familiarity with the opera- tional semantics of the procedure call binding . and callee are inside distinct threads . Section B Section A of the Appendix presents a quick re- of the Appendix presents an extension of opera- view of the concept . tional semantics of the procedure call presente d RPC extends the procedure call concept t o in Section A, to handle RPCs . express data and control scope transfer acros s Distributed applications using RPC are threads of execution in a distributed environ- then conceptually simplified, communicating ment. The idea is that a thread (caller thread) through a synchronized, type-safe, and well- can call a procedure in another thread (callee defined concept . A standard RPC works as thread), possibly being executed on another follows . Thread P that wants to communicat e machine. In the conventional procedure cal l with a thread Q, calls one of Q's procedures , mechanism, it is assumed that the caller an d passing the necessary parameters for the call . the callee belong to the same program unit , As in the conventional case, P blocks until the that is, thread of execution, even if the lan- called procedure returns with the results, if any. guage supports several threads of execution i n At some point of its execution, thread Q exe- the same operating-system address space as cutes the called procedure and returns the re- shown in Figure 1 . In RPC-based distributed sults. Thread P then unblocks and continues applications, this assumption is relaxed, an d its execution . interthread calls are allowed as shown in Fig- The communication mechanism has a clean, ure 2. Throughout this paper, local or conven- general, and comprehensible semantics . At the tional call refers to calls in which the caller and application level, a programmer designs a dis- callee are inside the same thread, and remote tributed program using the same abstraction calls or RPC refers to calls in which the caller as well-engineered software in a nondistributed application. A program is divided into modules 'There exists a common agreement in the researc h community on the viability of the use of RPC for dis- that perform a well-defined set of operation s tributed systems, especially in a heterogeneous environ- (procedures), expressed through the interfac e ment [56] . of the module, and modules interact through

218 Figure 3 : Possible configurations of RPC s

PU: Program Unit fic [9], and between domains on separate ma- Pi : Procedure i — Control transfer chines is cross-machine traffic (as for RPC3) . PCi : Procedure call i Figure 2 : Remote Procedure Call Scenario 3 Challenges of an RPC Mechanism procedure calls. Information hiding, where as much information as possible is hidden within The ideal RPC mechanism is the one that pro- vides the application user with the same syn- design components, and largely decoupled com - ponents are provided . For the application user , tax and semantics for all procedure calls in- it is transparent where these modules are exe- dependently of being a local call or a remot e cuted, and all the communication details are one in any of the possible configurations . The hidden. The implementation framework that underlying system supporting the concept pro- ceeds differently for each type of call, and i t supports the concept is an RPC system or RPC mechanism . will be in charge of hiding all the details from the application user . The system must be capa- The application programmer or the RPC sys- ble to deal with the following differences in the tem designer are responsible for mapping th e source of the calls, that require development application modules into the underlying archi- and execution-time support : tecture where the program is to be executed . The RPC system designer must be prepare d • Artificialities may occur in extending the to deal with three possible configurations of conventional procedure call model, inde- an RPC from a system-level point of view as pendent of the RPC configuration . Some shown in Figure 3, where threads are abbre- aspects to be considered are : viated by the letter T, operating-system pro- cesses by P and machines by M . An RPC can With the relaxation of the single- be performed between two threads executing thread environment for the caller and the callee, thread identifications and inside a process (RPC1 in the Figure), betwee n addressing must be incorporated into two processes running in the same machine (RPC2), or between two processes running i n the model . different machines (RPC3). For each situation, – In the conventional model, the failure the communication media used may vary, ac- of the callee thread means the fail- cording to RPC system designers, environmen t ure of the caller thread . In the RP C availability, and possibly the wants of the ap- model, this is not the situation and plication programmer . RPC1 could use shared new types of exceptions and recovery memory, RPC2 could use sockets [72], and are required. RPC3 could use a high-speed network. Com- – Caller and callee threads share the munication traffic on the same machine (suc h same address space, in the conven- as for RPC1 and RPC2) is cross-domain traf- tional case, and therefore have the

219 same access rights to the system re- – Uniform call semantics sources . The implementation of the – Powerful binding and configuration information exchange protocol is sim- ple. Global variables are freely ac- – Strong type checkin g cessed, and parameters, return val- – Excellent parameter functionality ues, and their pointers are manipu- – Standard concurrency and exception lated. In the RPC model, the RP C handling. mechanism has to move data across address spaces and pointer-based ac- • Pleasant properties of an RPC mech- cesses require special handling . Ad- anism: dresses need to be mapped across ad- dress spaces and memory need to be – Good performance of remote call s allocated to store a copy of the con - – Sound remote interface design tents of positions aliased by pointers . By using pointers, cyclic structures – Atomic transactions, respect for au- may be created, and the mechanism tonomy must overcome infinite loops durin g – Type translation the copying of data . – Remote debugging . • The presence of a communication networ k between caller and callee can cause differ- In Section 8 these issues are considered. ences. The RPC mechanism has to han- If a remote call preserves exactly the sam e dle transport-level details, such as, flow syntax as a local one, the RPC mechanism is control, connection establishment, loss of syntactically transparent . If a remote call pre- packets, connection termination, and so serves exactly the same semantics of a local on. call, the mechanism is semantically transpar- ent. The term remote/local transparency refers • The distributed environment configura- to both syntax or semantics transparency, o r tion can cause differences. Issues such either of them (when there is no ambiguity) . as availability, recovery, locality trans- To support the syntax and semantics trans- parency of processes, and so on, ca n parency, even in the presence of machine or arise. Some of the problems of distribute d communication failures, the RPC mechanism environments result from heterogeneit y must be very robust . Nevertheless, an RPC at the machine level, operating-system must be executed efficiently, otherwise (experi - level, implementation-language level, o r enced) programmers will avoid using it, losing communication-media level . Distinct RPC transparency. mechanisms support heterogeneity at dis- RPC mechanisms should provide for the easy tinct levels, the ultimate goal being to sup- integration of software systems, thereby facili- port heterogeneity at all levels in a way tating the porting of nondistributed applica- completely transparent to the applicatio n tions to the distributed environment . The pro- programmer . cedure model can limit the ratio of dependenc y Some mechanisms accommodate these re- between components of a program, by coupling , quirements by exposing the application pro- increasing the potential for software reusabil- grammer to different syntaxes, or semantics , ity [69] . In languages supporting RPCs, the in- depending on the configuration of the call be- teraction between modules is specified through ing issued. Nelson [53] describes and analyzes a collection of typed procedures, the interface , the major concerns of an RPC mechanism (an d which can be statically checked [29] . Inter- their complexity), which can be summarized as faces, being well-defined and documented, give follows: an object-oriented view of the application. An RPC mechanism must not allow cross- • Essential properties of an RP C domain or cross-machines access by unautho- mechanism: rized RPC calls . This provides for a large

220 grained protection model [9], because the pro- of the procedure, the epilogue puts the value s tection must be divided by thread domains . returned, if any, either in the stack of the pro- Criticism of the paradigm exists [82]. Issues cess in memory or in registers, and jumps bac k such as the following are still undergoing anal- to execute the instruction pointed to by th e ysis: return address. Depending on the RPC config- uration, stubs may simulate this behavior in a • Program structuring in RPC, mainly dif- multithread application. ficulties in expressing pipelines and multi- casting. Suppose two threads that want to commu- nicate as shown in Figure 4 : the caller thread , • Major sources of transparency loss, suc h issues the call to the remote procedure, and th e as, pointers and reference parameters, callee thread executes the call . When the caller global variables, and so on. thread calls a remote procedure, for example , procedure remote, it is making a local call as • Problems introduced by broken network s shown in (1) in Figure 4, into a stub proce- and process failures . dure or caller stub. The caller stub has th e same name and signature as the remote proce- • Performance issues around schemes to , dure, and is linked with the caller process code . transparently or gracefully set up dis- In contrast with the usual prologue, the stub tributed applications based on RPC in packs the arguments of the call into a call mes- such a way that the performance is not sage format (the process is marshaling), and degraded . passes (2) this call message to the RPC run- A survey of some features provided by a se t time support which sends (3) it to the callee of major RPC implementations (such as Ceda r thread. The RPC runtime support is a RPC, Xerox Courier RPC, SUN4 ONC/RPC , designed by the RPC system to handle the Apollo NCA/RPC, and so on) can be foun d communication details during runtime . After in [83]. The implementations are mainly com- sending the call message to the callee thread, pared considering issues such as design objec- the runtime support waits for a result messag e tives, call semantics, orphan treatment, bind- (10), that brings the results of the execution of ing, transport protocols supported, security, the call. On the callee side, the runtime sup - and authentication, data representation, and port is prepared to receive (4) call messages to application programmer interface . that particular procedure, and then to call (5 ) a callee stub . Similar to the caller stub, the callee stub unpacks the arguments of the cal l 4 Design of an RPC Sys- in a process referred to as demarshaling . After tem that, the callee stub makes a local call (6) t o the real procedure intended to be called in th e The standard architecture of an RPC system , first place (procedure remote, in our example). which is based on the concept of stubs is pre- When this local call ends, the control returns sented here. It was first proposed in [53] base d (7) to the callee stub, which then packs (that on communication among processes separate d is, marshals) the result into a result message , by a network, and it has been extended and and passes (8) this message to the runtime sup- used since then . port that sends (9) it back to the caller thread . Stubs play the role of the standard proce- On the caller side, the runtime support that dure prologue and epilogue normally produced is waiting for the result message receives (10 ) by a compiler for a local call. The prologu e it and passes (11) it to the caller stub . The puts the arguments, or a reference to them, as caller stub then unpacks (that is, demarshals) well as the return address, in the stack struc- the result and returns (12) it as the result t o ture of the process in memory, or in registers, the first local call made by the caller thread . In the majority of the RPC mechanisms, the stub s and jumps to the procedure body. By the end are generated almost entirely by an automati c 4 SUN is a trademark of , Inc . compiler program, the stub generator .

221

caller process callee process RPC facilities can differ in any of these com------ponents and in several dimensions . For in- stance, the binding discipline (a process mod- ule component) may differ from one RPC sys- LPC(1) I I (12) (7) (6) LPC tem to another on the interface that they offe r caller stub callee stub to the application user, on the time one pro- cess should perform the binding (compilation (2) (11) (8) (S) t 1o) (9); or run time), on the implementation details o f RPC runtime • communication RPCruntime how this binding is performed, on whether th e auppod i media I 5uPP sR _ :(3 ) ------binding deals with heterogeneity and how, an d LPC: local procedure call so forth. While these choices should be orthog- —+ procedure call onal, they are usually hard coded and inter- wined. As a result, RPC mechanisms usually Figure 4: Basic RPC components and interac- cannot interact and are extremely difficult t o tions modify to make such interaction possible . In the following sections, the functionality of 4 .1 RPC System Component s each of the components is defined precisely, th e dimensions of differences in RPC implementa- RPC systems must interact with a variety o f tions are analyzed, and real systems and thei r computer system components, as well as sup- approaches are explored. A brief historical sur- port a wide range of functionality, exemplifie d vey of the systems referred to is presented in as follows . The system or the source language Appendix C . must provide a minimum process management , that allows the design and execution of threads (heavy-weight or light-weight), that constitut e 5 Application-Develop- the distributed application . To issue an RPC , a process should acquire an identifier to the re - ment Time Components spective procedure (name binding) so that it can refer to it . RPC systems must provide The interface description language (IDL) and means for processes to exchange bindings, us- stub generator are two major application- ing a binding discipline . The runtime support development time components . In designing a module must interact with the transport layer program unit, the user specifies the procedure s to transport the data among the communicat- that the process can call remotely (importe d ing processes should a cross-machine communi- procedures), and the procedures that other pro- cation occur. cesses can call (exported procedures) . The set The major components of an RPC mecha- of imported and exported procedures of a pro- nism can be divided into three categories : cess is its interface and is expressed through the interface description language (defined in • Application-development time com- Section 5 .1) of an RPC facility. ponents: components used by the appli- For the imported and exported procedures of cation user during development time . a process, the stub generator creates the caller and callee stubs, respectively. These stubs are • Process modules : components that are responsible for the marshaling of arguments , mainly dependent on the specific RP C the interaction with the communication media , mechanism. Depending on the particu- and the demarshaling of result arguments, if lar mechanism, they can be application - any during run time . development components, execution-tim e components, or even components prede- fined and available in the operating sys- 5 .1 Interface Description Lan- tem. guage (IDL) • Execution-time components: components An interface description (ID), or module speci- used only during run time. fication, consists of an abstract specification o f

222

Y a software component (program unit) that de - attribute means that only the shape informa- scribes the interfaces (functions and supporte d tion of the data is passed from the caller to th e data types) defined by a component . It can b e callee. The out attribute means that the infor- interpreted as a contract among modules. An mation is passed from the callee to the caller , interface description language (IDL) is a nota- and the callee can change the value of the pa- tion for describing interfaces . It may consist rameter, as well as its shape information . The of an entirely new special-purpose language, or out (value) attribute is similar to out, but may be an extension of any general-purpose the shape information does not change. Only language, such as, C . the value of the parameter changes . The concept of an ID is orthogonal to Concert also defines the following value- the concept of RPC. Its use motivates th e retention attributes : discard, discard information-hiding concept, because module s (callee), discard (caller), keep, kee p interact based on their interfaces and the im- (callee), and keep (caller) . These at- plementation details are nonapparent. Never- tributes specify whether copies of storage gen- theless, IDs are of major significance for RPC erated by the runtime system during the pro- mechanisms, which use them for generating cedure's execution should remain or be deallo- caller and callee stubs . In general, RPC pro- cated (discarded) . These allocations are gener- tocols extend IDLs to support extra features ated as part of the interprocess operation . The inherited solely from the RPC mechanism . keep attribute and the discard attribute spec- In specifying a function, an ID contains it s ify that both caller and callee can keep or dis- signature, that is the number and type of its ar- card the storage allocated . guments, as well as the type of its return value , To deal with pointers across operating- if any. More elaborate IDLs also allow for th e system address spaces or across machine specification of the other important options : boundaries, Concert defines the following pointer attributes : optional, required , • Parameter passing semantic s aliased, and unique . A pointer to a require d type is pointer that cannot be NULL, whereas • Call semantic s a pointer to an optional type can be NULL , • Interface identification . simply indicating the absence of data . A pointer to a unique type is a pointer to data It is not always possible to find out the di- that cannot be pointed to by any other pointer rectionality of the arguments in a function cal l used in the same call. In other words, it does based only on its ID, because of the diver- not constitute an alias to any data being passe d sity of parameter passing semantics (PPS) sup- during the call. Pointers to aliased types, on ported by high-level languages . By directional- the other hand, are pointers to data that may ity is meant the direction of the informatio n also be pointed to by other pointers in the sam e flow, that is, whether the caller is passing val- call. This property is expressed in the trans- ues to the callee, or vice versa, or both . The mission format, and reproduced on the othe r PPS specification explicitly determines how side, so that complex data structures, such as the arguments of a function are affected by graphs, can be transmitted . a call. Concert [90, 7] defines a complete set The call semantics, on the other hand, is of attributes. As directional attributes, Con- an IDL feature inherited from the RPC model . cert contains in, in (shape), out, and out In conventional procedure call mechanisms, a (value) . It makes a distinction between th e thread failure causes the caller and callee pro- value and the shape of an argument . The valu e cedures to fail. A failure is a processor fail- refers to the contents ofthe storage, while shape ure, or abnormal process termination . If th e refers to the amount of storage allocated, as thread does not fail, the called procedure was well as the data format. The in attribute executed just once in what is called exactly- means that the information is passed from th e once semantics . If the thread fails, the runtim e caller to the callee, and because of that mus t system, in general, restarts the thread, the cal l be initialized before the call . The in (shape) is eventually repeated, and the process contin -

223 ues until it succeeds. Just as procedures may them, an at-least-once semantics are equivalen t cause side-effects on permanent storage, earlie r to an exactly-once semantics . Another way of (abandoned) calls may have caused side-effect s approaching the problem is to extend the ID L that survived the failure . The results of the last so that the user can express such properties as executed call are the results of the call . These idempotency of a procedure, and let the system semantics are called last-one semantics, last- do the work. of-many semantics, or at-least-once semantics . Some RPC mechanisms extend IDLs to sup - In the RPC model, caller and callee may b e port the declaration of an interface identifier, in different operating system address spaces, a s which may be just a name, or number, or both, well as in different machines . If one thread is or it may include such additional informatio n terminated, the other may continue execution . as the version number of the module that im- A failed caller thread may still have outstand- plements it. This identifier can be particularl y ing calls to other nonterminated callee threads . useful for RPC mechanisms that support the A callee, on the other hand, may fail before re- binding of sets of procedures, rather than just ceiving the call, after having participated in the one procedure at a time. The binding can be call, or even after finishing the call . For callers done through this identifier . and callees on different machines, the network The Polygen system [14], within the 5 can also cause problems . A call message can environment, extends the functionality of in- be lost or duplicated, causing the RPC to be terfaces a little further . In addition to a mod- issued none or several times. If the result mes- ule's interface, the interface describes proper - sage is lost, the RPC mechanism may trigger ties that the module should pursue, and rule s the duplication of the call. to compose other modules into an application . The RPC mechanism must be powerfu l A site administrator provides a characteriza- enough to detect all, or most of, the abnor- tion of the interconnection compatibilities of mal situations, and provide the application user target execution environments, and an abstrac t with a call semantics as close as possible to the description of the compatibility of several pro- conventional procedure call semantics . This is gramming languages with these environments . only a small part of semantics transparency of This characterization is in terms of rules. Ap- an RPC mechanism, which is discussed thor- plying these rules, Polygen generates, for a oughly in Section 8 . Nevertheless, it is very ex- particular interface, an interface software that pensive for an RPC system, to support exactly- can be used to compose components in a par- once semantics . The run time would recover ticular environment . Besides the usual stubs , from the failure without the user being aware . the stub generator creates commands to com- Weaker and stronger semantics for the RP C pose and create the executables, including com- model were, therefore, proposed, and some sys - mands to compile, link, invoke, and intercon- tems extended the IDL with notations that al- nect them. The source programs, the gener- low the user to choose the call semantics tha t ated stubs, and the generated commands ar e the user is willing to pay for a particular pro- brought together in a package . The activity cedure invocation. The possibly, or maybe se- of analyzing program interfaces and generat- mantics do not guarantee that the call is per- ing packages is packing . This scheme allow s formed. The at-most-once semantics ensures for components to be composed and reused i n that the call is not executed more than once . In different applications and environments with- addition, calls that do not terminate normally out being modified explicitly by the software should not produce any side-effects, which re- developer. quires undoing of any side-effect that may hav e In Cedar [13], Mesa [51] is the language use d been produced so far . as IDL and as the major host language for th e On the other hand, a procedure that does not programs. No extensions were added . The pro- produce any side-effect can be executed sev- grammer is not allowed to specify arguments o r eral times and yet preclude the same results results that are incompatible with the lack of without causing any harm to the system . Such 5 UNIX is a trademark of UNIX System Laborato- procedures are idempotent procedures, and, for ries, Inc.

224 shared memory . ity of declaring transmissible types, which the NIDL [22], the IDL of NCA/RPC, is a net- NIDL compiler cannot marshal . The user is re- work interface description language, because i t sponsible for the design of a set of procedure s is used not only to define constants, types, an d that convert the data to or from a transmissi- operations of an interface, but also to enforc e ble type to or from types that NIDL can mar- constraints and restrictions inherent in a dis- shal. This allows, for instance, the user to pas s tributed computation. It is a pure declara- nested pointers as arguments to remote calls . tive language with no executable constructs . SUN RPC [79] defines a new interface de- An interface in NIDL has a header, constan t scription language, RPC Language (RPCL) . In and type definitions, and function descrip- the specification file, each program has a num - tions. The header comprises a UUID (Univer- ber, a version, and the declaration of its proce- sal Unique Identifier), a name, and a version dures associated with it . Each procedure has number that together identify an interface im- a number, an argument, and a result . Multi- plementation. NIDL allows scalar types an d ple arguments or multiple result values mus t type constructors (structures, unions, top-level be packed by the application user into a struc- pointers, and arrays) to be used as arguments ture type. The compiler converts the names and results. All functions declared in NID L of the procedures into remote procedure names are strongly typed and have a required handl e by converting the letters in the names to lowe r as their first argument . The handle specifies case and appending an underscore and the ver- the object (process) and machine that receives sion number. Every remote program must de- the remote call. A single global variable can fine a procedure numbered 0, a null procedure , be assigned to be the handle argument for all which does not accept any arguments, and does functions in an interface, making the handle ar - not return any value. Its function is to allow a gument implicit and permitting the binding o f caller who, by calling it, can check whether th e all functions of an interface to a remote pro- particular program and version exist. It also al- cess. This scheme disturbs the transparency lows the client to calculate the round-trip time . and semantic issues of RPCs somewhat . The In the Modula-2 extension [1] to support required argument reminds the user that th e RPC, the old Modula-2 external module defini- procedure that is being called is a remote one , tions were used as the interface description wit h losing transparency. Yet, the flexibility of mak- only a few restrictions and extensions added . ing it implicit, increases the chance of confusin g The idea was to make as few modifications a s the user who is used to local call interface and possible, allowing any external modules to b e who may not expect to be calling a remote func- executed remotely . For restrictions, remote ex- tion. Consequently, the user may not be pre- ternal modules cannot export variables and for - pared to handle exception cases that can arise . mal parameters of pointer types do not have In the implicit situation, the application user i s the same semantics as when called locally . For responsible for implementing the import func- extensions to the syntax, a procedure can b e tion that is automatically called from the stubs defined as being idempotent, a variable can b e when the remote call is made . defined as of type result (out parameter) or seg- Procedures declared in an interface can b e ment. The attribute segment allows the user to optionally declared as idempotent . The NID L directly control the Modula/V system to mov e compiler generates the caller and callee stub s exactly one parameter as a large block, and t o written in C, and translates NIDL definitions move the other data needed in the call as a into declarations in implementation language s small message . (C and Pascal are the supported languages s o far). This scheme makes an interface in ID L 5.2 Stub Generator general enough to be used by any programming language if the NIDL compiler can do the con- During runtime, the stubs are the RPC mecha- version. These files must be included in th e nism generated components that give the user caller and callee programs that use the inter- the illusion of making an RPC like a local call . face. In this interface, the user has the flexibil - The main function of stubs is to hide the imple -

225 Lion (wire representation or standardized rep- resentation) . The demarshaling is the reverse process. 1: 1 The data representation protocol is respon- sible for defining the mapping at two lev- els [73] : the language level and the machine level. Caller and callee threads can be imple- mented in the same language or different ones , which is homogeneity or heterogeneity at the language level or homogeneity or heterogeneity at the abstract data type level . Different lan- guages, such as, FORTRAN, LISP, COBOL , and C, have different abstract data types, and data type conversion may be needed for the communication. Existing compilers for nondis- T tributed languages already provide transforma- tions in the same machine, in the same or dif- ferent languages, and in the configuration o f different data types . These conversions are im- plicit conversion, coercion, promotion, widen- ing, cast, or explicit conversion. An example of implicit conversion is a function that accept s only arguments of float type, and is called wit h integer arguments. Stubs are also responsible for giving se- Figure 5: Stub Generator Global Overvie w mantics to parameter passing by reference, or pointer-based data in general, across a commu- nication media . Several implementations have mentation details from the application designe r been proposed . One possible solution would and the user . They are usually generated at ap- be to pass the pointer (address) only . Any fu- plication development time by the RPC mecha- ture data access through the pointer sends a nism component, the stub generator . From the message back to the caller to read or write th e interface description of an interface, the stub data [82] . Besides inefficiency issues, this im- generator generates a callee header file, a caller plementation requires that the compiler on th e header file, a caller stub, and a callee stub for callee side treats local pointers differently fro m each of the functions declared in the interfac e remote ones. This violates one of RPC prin- as shown in Figure 5 . If the caller and callee ciples : the compiler should not be affected b y are written in the same language, the header the presence of RPCs . file can be the same for both the caller an d In some other systems [41], a dereference callee. Some stub generators [29] also generat e swaps a memory page via the communication import and export routines. Such routines per- media, and the data is accessed locally. This form the binding of the respective interface o r scheme, however, requires homogeneous ma- procedure. chines and operating systems, and must ensure Stubs are the RPC mechanism component s that only one copy of a page exists within the responsible for the transmission of abstrac t system. data types expressed in (high-level) program- The threads can be executed on homoge- ming languages over a communication media . neous machines presenting homogeneity at th e A data representation protocol defines the map- machine level, or heterogeneous ones present- ping between typed high-level values and byte ing heterogeneity at the machine level . Dif- streams. The marshaling is the process of con- ferent machines have different primitive data verting typed data into transmitted representa- types. In a message-passing system, some con -

226 version is required when sending data to a dis- destination type . If no specialized unit exist s similar machine. In a shared memory system , to transform type A into type C, but there ar e a transformation is needed when the share d specialized units to transform type A to typ e memory is accessed by a processor that rep- B, and from type B to C, a transformation from resents data in a different way than the share d A to C can be accomplished via A to B, and B memory. to C . The number of transformation units may At the machine level, the data representation be reduced by using a partially connected ma- protocol must select from three possible dimen- trix, and by rotating the data through several sions of heterogeneity between machines : (1) specialized transformation units . the number of bits used to represent a primitive At the software level the most common ap- data type, (2) the order of bits, and (3) the cod- proaches to performing these transformation s ing method. Order of bits refers to choices such are the local method and a degeneration of th e as big endian, little endian, and so on. Coding distributed one . In the local method, the caller method refers to choices such as 2 's comple- stub converts the data from the source repre- ment, sign magnitude, exponent bias, hidde n sentation into a standard one. The callee stub bit, and so forth . then does the conversion back to the original A hardware representation transformation format. This scheme requires that both calle r scheme for primitive data types in a heteroge- and callee agree upon a standard representa- neous distributed computer system is suggeste d tion, and that two transformations are always in [73] . It also performs an extensive analysis performed . The use of a standard data repre- of the complexity of the several possible archi- sentation to bridge heterogeneity between com- tecture models to implement it . ponents of a distributed system is not restricted As observed in [73], the primitive data typ e to the communication level . One example is th e transformation can be local, central, or dis- approach used in [84] for process migration i n tributed through the system. In the local heterogeneous systems by recompilation . The model, the machine or a transformation unit strategy used is to suspend a running proces s attached to it performs the transformation of on one machine at migration points to which the data into a common format. This data in the states of the abstract and physical machine s a common format is sent to the destination , correspond . Then the state of the binary ma- and the transformation unit there transforms chine program is represented by a source pro- the data into the destination format . This gram that describes the corresponding abstract approach always requires two transformations , machine state. This source program is trans- even when dealing with two homogeneous ma- ferred to the remote machine, recompiled there , chines. It also requires the specification of a and continues execution there . The compiler powerful common format that can represent all intermediate language is used to communicate data types that may be transferred . between a language-dependent front-end com- In the central approach, a central unit per- piler, and a machine-dependent code generator . forms all the transformations. The central uni t In the degenerated distributed approach, ei- receives the data in the source format, trans - ther the caller stub (sender makes it right forms it to the destination format, and send s method) or the callee stub (receiver makes it it to the destination . This central unit keep s right method) does the transformation from the information about all the interconnected ma- source directly into the destination representa- chines, and each piece of data to be trans - tion. Only one transformation is needed . Nev- formed needs to have attached to it its sourc e ertheless, the side performing the conversion machine, its data type in the source machine , needs to know all the representation details and the destination machine type . used by the other side to properly interpre t In the distributed approach, specialize d the data. The caller packs the data in its fa- transformation units scattered throughout the vorite data format if the callee side recognize s system perform the transformations. A spe- it. The callee needs only to convert the data if cialized transformation unit can only perform the format used to transmit the data is not the transformations from one source type into one one used locally. This avoids intermediary con-

227 versions when homogeneous machines are com- level. Over the years, distinct approaches hav e municating, and makes the process extensible , been developed to attack the conversion prob- because the data format can be extended to lem. Some of the approaches concentrate o n specify other features such as new encodings o r solving only the machine level conversion prob- packing disciplines. In the local approach, the lems, assuming homogeneity at the languag e connection of a new machine or language int o level. Others tried to solve only the language the system requires only that the stubs for thi s level conversions, assuming that communica- new component convert from its representatio n tion protocols at a lower level would be respon- to the standard one . In the distributed method , sible for performing the machine-level conver- all the old components must be updated with sions. More ambitious approaches tried to dea l the information about this new component, o r with both problems . In the following section , the new component must be updated about all distinct approaches are studied . the old ones . When using the local approach, the stan- 5.2 .1 Stub Generator Designs dardized representation can be either self- describing (tagged) or untagged . Self- The Cedar [13] system designed the first stu b addressing schemes have benefits such as en- generator, called Lupine [13] . The stub code hanced reliability in data transfer, potentia l was responsible for packing and unpacking ar- for programs to handle values whose types ar e guments and for dispatching to the correct pro- not completely predetermined, and extensibil- cedure for incoming calls in the server side . The ity. Untagged representations make more effi- Mesa interface language was used without any cient use of the bandwith and require less pro- modifications or extensions . It assumed homo- cessing time to marshal and demarshal. The geneity at the machine level and the language caller and callee must nevertheless agree on the level. standard representation . Matchmaker [39, 38] was one of the earlies t Heterogeneity at the representation level, o r efforts toward connecting programs written i n more specifically, at the mapping level betwee n different languages. The purpose was to deal abstract data types and machine bytes, has al- with heterogeneity at the language level . The ready been studied in a nondistributed envi- same idea was later explored by the MLP [33] ronment. A mechanism for communicating ab- system. On the other hand, HORUS [28], de- stract data types between regions of a syste m veloped almost concurrently with MLP, aimed that use distinct representations for the ab- at also dealing with heterogeneity at the ma - stract data values is presented in [34] . A local chine level. These three systems shared th e approach is used, in which the main concerns same approach of designing their interface de- is modularity. The implementation of each scription language, and generating the stub s data abstraction includes the transmission im- based on the interfaces . plementation details of instances of that partic- The HORUS [28] was one of the primary ef- ular abstraction. Each data type implementa- forts in designing a stub generator for RPC tion must then define how to translate between that would deal with heterogeneity at the dat a its internal representation and the canonical representation level and at the language level . representation, and the reverse, which is don e All interfaces are written (or rewritten) in a through the definition of an external represen- specific interface specification language. The tation type for that type. The system support s aim was to provide a clean way to restrict the transmission of a set of transmissible types , the RPC system data structures and inter- that are high-level data types, including shared faces that could be reasonably supported i n and cyclic data structures . The external rep- all target languages. This simplifies the stub resentation type of an abstract type T is any generator and provides a uniform documenta- convenient transmissible type XT, which can tion of all remote interfaces. The system use d be another abstract type if wanted . a single external data representation that de- Stubs are responsible for conversion of dat a fined a standard representation of a common - at the language level as well as at the machine language and common-machine data types over

228

the wire. Data types were then converted from User Supplied Interface Declaration the source language into common-language , and later from the common-language into the target language . This simplified the stub gen- erator and stub design, because only one repre- caller caller callee callee sentation was allowed. Nevertheless, the exter- machine language language machine specification specification specification specification nal data representation may not represent effi- ciently some data values from a source or target language . On the other hand, new source lan- guages and machine types can be added to the Horns Horns system without requiring an update in the en - stub generator stub generator tire network. The HORUS defined precisely the classes of knowledge needed for a marshaling l l system, and this classification can be consid- ( caller stub ) r callee stub ) ered one of its major achievements . The infor- mation was classified as : language dependent , Figure 6: Structure of HORUS Stub Generator machine dependent, and language and machin e independent . It also defined a format for this knowledge that was easy to manipulate, tha t that entry, generating the stubs and headers . was general enough to contain all the informa- HORUS provides IN, OUT, and INOUT param- tion required, and that was self-documenting . eter types to optimize passing data by value - All the language dependent knowledge require d result . HORUS provides for the marshaling of for the marshaling is stored as language specifi- pointer-based data types, as well as arbitrary cations. They contain entries for each common graph structures . A pointer argument is mar- language data type, and additional (data typ e shaled into an offset and the value it references . independent) specifications needed to correctly The offset indicates where in the byte array thi s generate the caller and callee stubs in the par- value is stored . Pointers are chased depth first ticular language. The machine dependent in- until the entire graph structure is marshaled . formation, including primitive data type rep- HORUS constructs a runtime table of object s resentation, order of bits, and coding metho d and byte array addresses to detect shared an d of all possible target machines is stored as repeated objects. machine specifications . Entries for both, ma- The Universal Type System [33] (UTS) lan- chine and language dependent information, ar e guage, another effort towards the support of expressed in an internal HORUS command- heterogeneity at the language level was devel- driven language. The language specificatio n oped concurrently with HORUS . Unlike HO- entries contain a set of code pieces for eac h RUS, it did not deal with heterogeneity at pair of basic data types (or type constructors ) the machine level. The language was designed in the common language and a possible target to specify interfaces in the Mixed Language language. The set completely specifies all infor- Programming System (MLP System), where a mation required by the stub generator to mar- Mixed Language Program was written in two shal that type for that target language . It con- or more programming languages . A program tains a piece of code for generating a declara- consisted of several program components, each tion of the type in the target language, a piec e written in a host language . The UTS, a set of code to marshal an argument of this type , of types, and a type constructor contain a lan- and a piece of code to demarshal . HORUS guage for constructing type expressions or sig- works as a compile time interpreter as show n natures that express sets of types . The lan- in Figure 6. It generates the stubs and header guage consists of the alternation operator or , files by parsing the interface written in commo n the symbol — that indicates one unspecifie d language and, for each declaration, it consult s bound in an array or string, the symbol * tha t the corresponding table and entry, and executes indicates an arbitrary number of arguments o r the appropriate piece of code associated with unspecified bounds, and the symbol ?, which i s

229 the universal specification symbol that denote s This rule guarantees that the conversion can the set of all types. The representation of a be done. When a signature does not unify , value of UTS type is tagged, usually indicatin g MLP delays the decoding of the UTS value an d the type and length of the data, and is stored transfers the problem to the user to handle . in a machine- and language-independent way . This is done through the use of a representa- For each host language in the MLP system , tive for an UTS value as the argument value . a language binding is defined, which is a set MLP provides the user with a library of rou- of conventions specifying the mapping betwee n tines that can be called passing a representa- the host language types and the language typ e tive value as an argument to inquire the UTS of the UTS language. The mapping is divide d type of the corresponding value, or to perfor m into three parts: mapping for types in which explicit conversion of types . Valid operation s the two languages agree, meaning that each for the host language types are not allowed t o type has a direct equivalent in the other type be applied on the representative itself, which is system; mapping for types that do not have a maintained by the agent. Representatives ar e direct counterpart, but that are representable similar to tickets that are presented to agents : in the other language ; and mapping for UT S they provide access to the data . types that do not fall in the above categories . Matchmaker [39, 38], an interface specifica- For each language binding, there is an agent tion language for distributed processing, was process that performs the mapping, and is writ - one of the earliest efforts toward connecting ten in a combination of that language and a sys - programs written in different languages. It pro- tem programming language. This use of mul- vides a language for specifying the interfaces tiple languages may be necessary, because th e between processes being executed within the agents need to perform low-level data trans - SPICE network, and a multitargeted compile r formations, and yet be callable from the host that converts these specifications into cod e language . for each of the major (target) languages used In the MLP system, there is a distinction within the SPICE environment . Matchmake r between the interface that a module export s was intended to be a language rich enoug h (exported interface), and the (possibly several) to express any data structure that could b e interfaces that other modules import (importe d efficiently represented over the network, an d interfaces) . These interfaces are specified i n reasonably represented in all target languages . the UTS language as signatures that describe The major target languages were C, PER Q the number and UTS type of the arguments Pascal, Common Lisp, and Ada . and returned values of a procedure . Signatures Each remote or local procedure call contain s that make use of the additional symbols in UT S an object port parameter, and a list of in and are underspecified, because the data types may out parameters . The object port specifies the not map into only one host language-level type , object in which the operation is performed . All but a set of legal types. The MLP linker, there - parameters, but the object port, are passe d fore, takes a list of program components tha t by value. Pointers, variable-sized arrays, an d constitute a mixed-language program and per - unions can only occur in top-level remote dec- forms static type checking of the arguments larations, and may not be used when construct- and parameters of the intercomponent calls . In ing other types. Time-out values can be spec- UTS, a match between an exported signature ified, and the reply wait can be made asyn- and an imported one has a more formal notio n chronous as well . than simply determining if the signatures ar e The Matchmaker language is used to mask equivalent. As defined in [33] : language differences, such as language syntax , type representations, record field layout, pro- An import signature matches, or uni- cedure call semantics, and exception handlin g fies, with the corresponding export mechanisms . Representation for argument s signature if and only if the set denoted over the network is chosen by the Matchmake r by the former is a subset of the set de- compiler. A Matchmaker specification includes noted by the latter . complete descriptions of the types of every ar -

230 gument that is passed, and the compiler gen- user then ends up being responsible for most erates the target language type declarations t o marshaling and demarshaling of data . The rpc- be imported into the generated code . Match- gen [79], the SUN stub generator, generates , maker assigns a unique identification (id) to from an RPC specification file, the caller an d each message, which is used at run time to callee stubs, as well as a header file . The rpc- identify messages for an interface . After the gen generates the remote procedure names b y procedure (message) is identified, the types of converting all letters in the names to lowercase , all fields within it are also known, because mes - and appending an underscore and the program sages are strongly typed at compile time . The version number to end of the names . The SUN compiler is internally structured to allow th e defined and used a canonical representation for addition of code generators for other language s data over the wire, the eXternal Data Repre- as they are added to the SPICE environment . sentation [78] (XDR) protocol . This allows a callee to be automatically acces- To avoid constrainting the user, another ap- sible to callers written in any of the supporte d proach was to allow the user to implement spe- languages, regardless of the language in which cial routines for the marshaling and demar- the callee is written . shaling of complex data structures . This ap- The Abstract Syntax Notation One [37] proach was used, for instance, by the Networ k (ASN.1), a more general purpose interface de- Interface Definition Language (NIDL) of the scription language, was later designed, an d NCA/RPC [22] system and by HRPC [10] . has been widely used in international standar d In NIDL, the routines convert the transmis- specifications. The ASN.1 interfaces are com- sible types defined by the user to or from types piled into its transfer syntax based on the Basic that the NIDL compiler can marshal . The het- Encoding Rules (BER), which is used as the ex- erogeneity at the data representation level was ternal data representation . Each data type in dealt by NCA with a Network Data Represen- ASN.1 has a rule to translate it into its corre- tation protocol (NDR) . A set of values across sponding transfer syntax. The ED [54] library the communication media is represented by a is an implementation of the ASN .1 BER. The format label that defines the scalar value rep- library includes encoding and decoding rou- resentation, and a byte stream with the values . tines that can be used as primitive functions to The NDR does not specify the representation compose encoders and decoders for arbitraril y of the information in packets, however . The complicated ASN .1 data types. It also contains NIDL compiler and the NCA/RPC system en- routines that translate data values directly be- code the format label in packet headers, frag- tween C and the BER transfer syntax withou t ment the byte stream into packet-sized pieces , converting it to an intermediate format . From and pack the fragments in packet bodies. Thi s the ED library, CASN1, an ASN.1 compiler , implementation is very flexible, allowing the re- was designed and implemented for translating ceiver makes it right approach to deal with het- ASN .1 protocol specifications into C, and fo r erogeneity in the data representation level . generating the BER encoders and decoders fo r In HRPC [10], each primitive type or typ e the protocol defined data types . constructor defined by the IDL is translate d Because of the difficulty introduced with pa - from the specific data type in the host machin e rameter passing by reference, or pointer-base d representation to or from a standard over-the- data in general, in an RPC, some systems sim- wire format . No direct support is provided for ply do not allow arguments of such type in the the marshaling of data types involving point- declaration of functions that can be called re- ers, although the user can provide marshalin g motely. An example is the SUN RPC [79] im- routines for complicated data types . plementation, which disallows it and restrict s The main concern of other systems durin g the number of arguments that can be passed i n the design of the stub generator is the main- an RPC to one, as well as the return value . If tenance of location transparency . The Eden the user wants to pass more than one argument , system [2] consists of objects called Ejects that or receive more than one value, the data mus t communicate via invocations . It is transpar- be grouped into a structure data type. The ent for the user whether the invocation is for a

231 local Eject or a remote one. The system trans- fect performance, as may be the situation in parently does the packing and stub generation rising alarms in a process control application . when required . Two phases are involved in acquiring a bind - ing that fully connects a caller and a callee : naming, and port determination . The naming 6 Process Modules is the process of translating the caller-specifie d The binding disciplines, language support, and callee name into the network address of the host on which the callee resides. The port determi- exception handling are the main component s of the process modules . The binding disci- nation identifies a callee thread in one host ma- plines define a way of identifying and address- chine. The network address produced durin g ing threads, and binding procedure names t o naming is not enough for addressing a callee , threads. The language support of the RPC because multiple callees may be running on a . So, each callee has its own com- mechanism allows the user to make these map- single host munication port and network address, which pings, and to issue the remote calls in situations uniquely identifies the callee thread . In the where syntax transparency is not supported . conventional situation, the procedur e's name is The exception handling mechanism supports enough, because everything else is known b y the handling of abnormal situations. default . Some systems allow the threads to expor t 6.1 Binding Disciplines additional attributes associated with a binding In conventional procedure call mechanisms, a that provides the callees with more informa- binding usually consists of finding the addres s tion so that they can decide the thread to use , of piece of code, by using the name of a proce- or with several ways of identifying a thread be - dure. In the RPC model, a binding discipline sides its name. For instance, a thread may ex- has also to identify the thread in which the call port the version number of its implementation , will be executed, and several approaches may so a callee may choose the newest version, or be taken. First of all, there may be several a particular version of interest. A thread may threads that can or are willing to execute a also publish the data objects upon which it is specific procedure. These threads are usuall y updated, so a callee might choose a thread tha t server threads (server processes), because they does, or does not, modify a piece of data . The give the illusion that they exist to serve call s binding handler is all the information needed to these procedures . Some implementations re- to identify a port . quire the caller to identify a specific thread t o The binding process works as follows. A execute its call . Some require only the name of thread willing to accept RPCs exports (pub- the machine in which the caller wants the call lishes) its port address and the procedures that to be executed, independently of the particula r it wants to execute . A thread willing to is- thread. Finally, some implementations require sue RPCs imports the port information of a that the caller identify only the procedure to b e thread or set of threads it thinks is adequat e called, and dynamically and transparently, th e to execute its calls. The exported or imported runtime support locates the most appropriat e address information can connect the caller di- thread to execute the call (Cedar [13]) . In this rectly to the callee, or can connect the caller t o way, an RPC binding can change during the the RPC runtime in the callee machine that will caller's execution, and a caller can be virtually act as bridge to the callee . The main goal is to bound to several similar callees simultaneously. make this process work as similar as possibl e The binding can be static, performed dur- to the conventional procedure scheme, hiding ing compilation or linkage time, or dynamic , from the application user the import and ex- performed during runtime. Static bindings are port processes. Concert [90, 7] is the system useful if the application configuration is static that supports the most conventional and flexi- or seldom changes. They can also be used in ble binding mechanism. situations where the extra cost of binding im- To import and export procedures, som e mediately before the call may, significantly, af- systems (Cedar RPC [13], NCA RPC [22] ,

232 HRPC [10, 55], and so on) manage a global ing an import function upon an interface or name space, that is, a set of names whose asso- procedure. Some other systems allow for the ciated data can be accessed from anywhere i n declaration of an interface-wise global variable the environment . Export corresponds to writ- that will keep the address of the handler. Al- ing or updating information into this global though the user interface is similar to a local name space, whereas import corresponds to one, the side-effects of the call can be unfortu- reading this information. Some other system s nate if the user is not careful enough to updat e use well-known ports, where each callee is as- the handler when needed . signed a specific port number . Callers usu- ally consult a widely published document t o get the well-known port numbers. The handler 6 .1 .1 Different Binding Discipline De- consists of this port number and the home o f signs the machine in which the callee is supposed t o In the Cedar RPC Facility [13], the unit ex - run. Because the global name service usually ported or imported is an interface, and it i s involves permanent memory, systems should identified by its type and instance. The type detect when a callee no longer exists, and un- specifies in some level of abstraction the inter- export its associated information . Other sys- face functionality and the instance, the imple- tems use a Binding Daemon (SUN RPC [79]) , mentor. For each instance, the Grapevine dis- a process that resides at a well-known port on tributed database is used to store the network each host and manages the port numbers as- address of the machine where it is running . At signed to callees running on that host . An- the application user point of view, for each in- other way would be to have a server-spawning terface imported or exported there is a related daemon process, a process that also reside s function to import or export the binding (net - at a well-known port on each host, but tha t work address) of that interface . The export serves import requests by creating new port s function gets the type and instance name of a (through forking other processes) and passing particular instance of an interface as argument s to the callee the new port number . and updates the database with this informatio n Some systems provide thread facilities to im- and the machine's network address . The set of port or export a single procedure, whereas oth- users that can export particular interfaces is re- ers provide only for the import or export of a stricted by the access controls that restrict up - set of procedures (an interface). In some situ- dates to the database . The import function re- ations the interfaces represent a functional re- trieves this information from the database an d lated grouping of functions, that typically op- issues a call to the RPC runtime package run- erate on shared data structures . The unit of ning on that machine, asking for the binding binding being the interface is sometimes justi- information associated with that type and in- fied as a means of ensuring that all the func- stance. This binding information is then ac- tions in that interface are executed by the cessible to the caller stubs of that interface . If same server. This is a step towards an object- there is any problem in the callee side, by th e oriented approach, where program objects ca n time the binding is requested, the caller side is only be accessed through an specific set of pro- notified accordingly and the application user i s cedures. supposed to handle the exception . There are Another aspect of the binding discipline i s several choices for the binding time, dependin g the interface exposed to the user . The most ad- on the information the application user speci- equate interface would be the one that would fies in the import calls . If the user gives only provide the user with the same call inter - the type of the interface, the instance is chosen face (syntax transparency), independently of dynamically at the time the call is made by th e whether the user was making an RPC or a lo- run time support. On the other hand, the user cal call . Most systems however require that can specify the network address of a particular the RPC handler be an argument of any RP C machine as the instance identification for bind- call (NCA/RPC [22], SUN RPC [79]) . This ing at compile time . RPC handler must also be the product of call - In NCA/RPC [22], as mentioned previously ,

233 RPCs require a first argument, a handle, tha t Initially, the HRPC binding subsystem specifies the remote object (process) that re- queries the name service to retrieve a bind- ceives and executes the call . The primary goal ing descriptor . A binding descriptor consists of this scheme is to support location trans- of, a machine-independent description of th e parency, that is, to allow dynamic determi- information needed to construct the machine - nation of the location of an object . The lo- specific and address-space-specific binding . It cal calls have the same syntax. Handles are indicates the control component, data repre- obtained by calling the runtime support an d sentation component, and transport compo- passing the object (process) identification, th e nent that the callee uses, as well as, a net - UUID (Universal Unique Identification), an d work address, a program number, a port num- network location as input arguments. One flex- ber, and a flag indicating whether the bind- ibility is the implicit semantics . The user de- ing protocol for this callee involves indirectio n clares a global variable for each interface tha t through a binding agent . The HCS Name Ser- stores the information necessary to acquire th e vice (HNS) applies the concept of direct-access- handle, and designs a procedure that converts naming, where each HCS system has a nam- this variable into a handle . The handle argu- ing mechanism and HNS accesses the informa- ment is not passed in the call and the con- tion in these name services directly, rather tha n version procedure called automatically by th e reregistering all the information from all the ex- the stubs during run time . To import and ex- isting name services . This design allows close d port interfaces in NCA, the NCA/LB (Loca- systems named insular systems to evolve with- tion Broker) was designed, which retrieves lo - out interfering with the HNS, and makes newl y cation information based on UUIDs . A callee added insular services immediately available, side process exports some functionality by reg- eliminating consistency and scalability prob- istering its location and the objects, type of lems. The HCS name service (HNS) can be objects, or interfaces that it wants to export a s viewed as a mapping between the global nam e a global (to the entire network) or a local (lim- of an object and the name of that object in its ited users) service . The LB is composed of the local system. The local system is the one re- Global Location Database, a replicated objec t sponsible for the final name to data mapping . that contains the registration information of all Each HNS name then consists of two pieces : globally registered processes; and the Local Lo- the context, which determines the specific nam e cation Broker that contains the information o f service used to store the data associated with the local exportations. The Global Databas e the name, and the individual name or local provides a weakly consistent replicated pack - name . Each name service connected to HNS age, meaning that replicas can be inconsistent has a set of name semantics managers (NSMs). at any time, but, without updates, all replicas These routines convert any HNS query call t o converge to a consistent state in a finite amoun t the interface, format, and data semantics o f of time. A caller process can inquire about th e a local call to the name service. These rou- information of a specific node, or can make an tines are remote routines and can be added to RPC by providing an incomplete binding han- the system dynamically without recompilatio n dle, and specifying, for instance, only an inter- of any existing code . For a new system to b e face, and not a specific process, and the run - added to the HNS, NSMs for that system ar e time system asks the Local Location Broker th e built and registered with the HNS ; the data reminder of the information . stored in them becomes available through th e In HRPC [10, 55], to support the heterogene- HNS immediately. ity at the data representation protocol, trans- Using the context part of the name and the port protocol, and runtime protocol, the bind- query type, the HNS returns a binding for th e ing is much richer, containing not only the loca - NSM in charge of handling this query type for tion, but also a separate set of procedure point- this context . The caller uses this binding t o ers for each of these components . Calls to thes e call the NSM, passing it the name and quer y routines of these components are made indi- type parameters as well as any query type spe- rectly through these pointers . cific parameters. The NSM then obtains th e

234 information, usually by interrogating the nam e a module, which can have only one exportable service in which it is stored. procedure. The binding is done at call time. SUN RPC [79] has no strongly expressed Concert [90, 7] expresses a binding in th e binding model . Some procedures take the I P most transparent way : as a function pointer. host name of the machine to which the caller Languages like C, which have already function wants to communicate . Others simply take file pointers, need no extensions . Concert simply descriptors to open sockets that are assumed t o treats functions as closures . The closures iden- already be connected to the correct machine . tify a code body and a process environment Each callee's host machine has a port mapper in which the body is executed . In Concert, daemon that can be contacted to locate a spe- when a process assigns the address of a function cific program. The caller has to make an ex- body to a function pointer, not only is the func- plicit call to the port mapper to acquire a n tion body address implicitly assigned, but also RPC handler. The caller specifies the callee's the encompassing process environment . Any host machine, program number, version num- function pointer passed to another process pre- ber, and TCP or UDP protocol to be used . The serves this information, exporting a function RPC handler is the second argument of the pointer or binding. RPC. The port mapper also provides a more flexible way of locating possible callees . The 6.2 Language Support caller sends a broadcast request for a specifie d remote program, version, and procedure to a In the traditional procedure call model, the particular port mapper's well-known port . If callee process, being the same as the caller pro- the specified procedure is registered within th e cess, by default, is always active . In the RPC port mapper, the port mapper passes the re- model, an issue exists as to when to activate quest to the procedure. When the procedure the process in which the callee is supposed to is finished, it returns the response to the por t run. These problems are called activation prob- mapper instead, which forwards it to the caller . lem or callee management problem or callee se- The response also carries the port number o f mantics problem . In some systems, the applica- the procedure so that the caller can send later tion user is responsible for this activation. In requests directly to the remote program . others, callees processes are activated by the In the Template-Based model [68] for dis- underlying process model. There is a callee tributed applications design, an application manager that can create callee processes on de- consists of modules that communicate wit h mand, spawn new processes, or simply act as a each other via RPC . Each module can have resource manager that chooses an idle process only one exportable procedure, but it can call from a pool of processes created earlier . Such any of its local (internal to the module) proce- callees can remain in existence indefinitely, an d dures, as well as any procedure exported b y can retain state between successive procedure other modules . The module's interaction i s calls; or they may last a session as a set of expressed through a set of attribute binding s procedure calls from a specific client, or they known as template attachments . A template is may last only one call, which is a instance-per- a set of characteristics that can be used to spec- call strategy. When the activation process is ify the scheduling and synchronization struc- controlled by the runtime system, several in- ture of a module in an application . It describes stances of a callee can be installed on the sam e the interaction of one module with others. Ev- or different machines transparently to the user . ery module can have up to three templates t o This may be only to provide either load bal- specify its behavior : input, which describes the ancing, or some measurement of resilience to correct scheduling and synchronization of in - failure. The binding mechanisms would allow coming calls, output, where similar to input for one of these instances to be selected at import calls originated by the module, and body, which time. describes attributes that can modify the mod- Some languages allow the user to start pro- ule's execution behavior in the distributed en- cesses that will either be caller or callee fro m vironment . In this model, the unit of binding is the operating-system level, and also offer Ian -

235 guage support to start new processes durin g the cost of process creation and initialization of run time. In the latter situation, an interfac e state. In peak situations, additional processe s should be powerful enough to allow passing ar- can be created to help and terminated after th e guments between the parent and the children , situation returns to normal . This scheme is as pointed out in [89] . Supporting parameter very efficient, because it reduces the number passing for newly created processes allows fo r of process switches in a call, but the user has compile-time protection, more readability, an d no control during run time of the callee side structuring . Concert [90] allows the passing of processes . arguments between parent and children, an d In explicit mode, one of the most common these arguments can be remote procedure ref- statements is the accept statement that ac- erences as well. cepts a procedure name, or a set of proce- In Athena RPC [70], callees are character- dures, as input and blocks until a call to one of ized by their lifetime, whether they are share d the specified procedures is made. The select among clients or are private, and the means statement allows the callee to express Boolean through which they are instantiated . In the in- expressions (guards) that must be attended to terface definition, the user specifies these char- before the callee accepts a call for the corre- acteristics through attributes that are associ- sponding procedure . Most of the languages ated with callee's stubs. The system is very that do have a select construct on the callee flexible, allowing callee processes to be instan - side such as Ada [20], do not allow RPC call s tiated either manually or automatically in re- to be part of the Boolean expressions, that is, sponse to a request from a caller, or being act as a guard. A discussion about situations long-lived, belonging to a pool, and potentially in which the absence of such a construct cause s answering multiple callers . This instantiate d loss of expressiveness, of robustness, and even mode is transparent to the caller . of the wanted semantics, is presented in [25] . RPC systems also differ in the way callees re- An expressive extension of Ada's select con- spond to the calls . For the syntax of the calle e struct that solve the problem in an elegant wa y reception, the receive can be explicit or im- is also described. Having RPC calls as guards plicit . In explicit mode, the language provides in select constructs is an advantage in situa- a function that can be called by an already ac- tions such as : conjunctive transmission (broad- tive process just like any other. In the implicit casting), disjunctive transmission, pipelining , mode, the language provides a mechanism for transmission stopped by a signal, and so on . handling calls or messages. The receive is not The broadcasting occurs when the RPC caller performed by any language-level active process . gets some information that it wants to broad- The callees are dedicated processes that per - cast to a specific set of processes . It has the form only remote calls . This mode may loo k calls it is to make, but the order in which the more like the behavior in a sequential proce- callees are ready to answer the call is unknown . dure environment, where the arrival of the cal l The caller does not want to block waiting fo r triggers the execution of an appropriate body a busy callee to answer while it can be call- of code, much as an interrupt triggers its han- ing other callees and afterwards try the bus y dler (see [65] for further discussion) . In essence , one again. The disjunctive transmission oc- a classification can be made [65] in which ex- curs when a caller wants to call any potential plicit receive is a message based-mode, whereas callees. The pipelining happens when processes implicit receipt is a procedure-based one . The P1 , . . . , P„ are to form a pipeline, where process issue of synchronization, though, is orthogonal Pi receives items from P;_1 and sends them t o to the issue of the syntax of how the callee pro- Pi+1 . Each Pi is simultaneously ready to either cess responds to it . accept an item from Pi_1 or to send an item to In Cedar [13], for instance, there is no user - Pi+1, and is willing to choose by the availabilit y level control to the service of calls on the callee of its neighbors . The transmission stopped b y side. There is a pool of callee side processes a signal occurs when mixed calls and accept s in the idle state waiting to handle incoming are necessary—for example, when you want a calls. In this way, a call can be handled without process to continuously make specific calls un -

236 til another process calls it when it is supposed 7 Execution-Time Compo- to accept and stop calling . nents Concert [90] presents a set of language sup - ports for the handling of RPCs in the explicit The runtime support and communication me- mode, which is discussed in Section C of the dia support constitute the execution time com- Appendix . ponents of RPC . The runtime support consists of a set of libraries provided by the RPC sys- tem. Its a software layer between the other RPC components and the underlying architec- ture. It receives and enqueues calls to the callee 6.3 Exception Handlin g stubs, sends the call messages generated by the caller stubs, guarantees that calls are executed in the correct order, and so on . The commu- In a single thread of execution, there are tw o nication media support is the transport media modes of operation : the whole (including calle r that carries the messages. At the operating- and callee) program works, or the whole pro- system level, there are several possible configu- gram fails. With RPC, different failure mode s rations for an RPC call, as shown previously i n are introduced : either the callee or the caller Figure 3 . Depending on the particular syste m malfunction, or the network fails. Even with and the underlying architecture, the communi- good exception raising schemes, the user is cation media chosen for each particular config- forced to check for errors or exceptions where uration varies . Nevertheless, when the caller logically one should not be possible, making the and callee threads reside on different machines , RPC code different from the local code, and vi- network transport support is needed . This sup - olating transparency. port has been the subject matter of several RPC systems, and it is discussed in the fol- Transparency is lost either because RPC has exceptions not present in the local case, or be- lowing section . cause of the different treatment in the RPC or local case of exceptions common to both . The 7.1 Network Transport Support programming convention in single machine pro - For the network transport support, the primar y grams is that, if an exception is to be notified question is which protocol is the most suitable to the caller, it should be defined in the inter - for the RPC paradigm . face module, while the others should be han- For better performance in the extended ex - dled only by a debugger [13] . Based on this change case, TCP could be chosen, since it pro- assertion, Cedar RPC [13] emulated the seman- vides better flow and error control, with negli- tics of local calls for all RPC exceptions defined gible processing overhead after the connectio n in the interfaces exported . The exception for is established. It also provides the predictabil- call failure is raised by the runtime support if ity required for RPC . On the other hand, it is any communication difficulty should occur . typically expensive in terms of space and time. To prevent a call from tying up a valuabl e In space, it requires the maintenance of connec- resource in the callee side, the callee should tion and state information, and, in time, during be notified of the caller's failure [86]. For a connection establishment, it requires usuall y caller's failure, or any other abnormal termi- three messages to set it up, and three more t o nation, the resource should be freed automati- tear it down . RPC systems usually have short cally. The general problem of garbage collectio n periods of information exchange between caller in a distributed system could be very expen- and callee (the duration of a call), and a calle e sive . The NCA/RPC [36] exposes to the client must be able to easily handle calls from hun- exceptions that occur during the execution of dreds of callers. Network implementations usu- RPCs . It also propagates interrupts occurring ally are strict about the number of concurrently to the caller process, all the way to the calle e open connections. Even without this restric- process executing on the caller's behalf. tion, it would be unacceptable for a callee t o

237 maintain information about such a large num- the connection is idle, the caller has only a ber of connections. counter, the sequence number of its next call , Datagram-oriented network services (such as and the callee has the call identifier of the last UDP) do not provide for all the security an call. When the arguments or results are too RPC mechanism needs, but are inexpensive large to fit in a single packet, multiple packets and are available in any network . are sent alternatively by the caller and callee. In general, RPC-based systems make use of The caller sends data packets, and the callee re- the datagram service provided by the network, sponds with acknowledgments both during ar- and they implement a transport protocol on guments transmission, and the reverse, durin g the top of the datagram, which meets RPC re- results transmission . This allows the imple- quirements. This protocol, usually, performs mentation to use only one packet buffer at each the flow and error control of TCP protocols , end for a call, and avoids buffering and flo w but the connection establishment mechanism i s control strategies in normal-bulk data transfer light weight . protocols . In this way, costs for establishin g The designers of the Cedar RPC Facility [13 ] and terminating connections are avoided, and found that the design and implementation of a the cost of maintenance is minimized . transport protocol specifically for RPC woul d The NCA/RPC [22] protocol is robus t be a great source of performance gains . First, enough to be built on top of an unreliable net- the elapsed real-time between initiating a cal l work service and yet to deal with lost, du- and getting results should be minimized. The plicated, out-of-order, long-delayed messages adequacy for bulk data transfer, where mos t that cause callee side processes to fail. It of the time is spent transferring data, was not also guarantees at-most-once semantics when a major concern in the project . Second, the necessary. A similar approach to the Cedar amount of state information per connection , project is used . The connection information and the handshaking protocol to establish one , kept between processes constitutes, basically, a should be minimized. Finally, at-most-once se- last-call sequence number. This number can mantics should be used. be queried from both ends for synchronization The scheme adopted bypasses the software purposes through ping packets. The runtime layers that correspond to the normal layer s support, though, uses the socket abstraction t o of a protocol . The communication cost was hide the details of several protocols, allowing minimized when all the arguments of a call protocol independent code in this way. A pool fit in a single packet . The protocol work s of sockets is maintained for each process : allo- as follows. To issue a call, the caller sends cated when a remote call is issued, and returne d a packet containing a call identifier, a proce- to the pool when the call is over . A similar dure identifier, and the arguments . The call end-to-end mechanism is implemented by th e identifier consists of the calling machine iden- Athena RPC [70] and by NCS/RPC [22] . tifier, a machine-relative identifier of the call- SUN RPC [79] allows the user to choose th e ing process, and a sequence number . The pair transport protocol based on the user 's needs . [machine identifier, process] is named ac- SUN RPCs require an RPC handler as the sec- tivity. Each activity has at most one outstand- ond argument. This handler is obtained by call- ing remote call at any time. The sequence num- ing the port mapper, and passing the transpor t ber must be monotonic for each activity. The protocol type specified by the user, TCP, o r call identifier allows the caller to identify the UDP. When using TPC, there is no limit t o result packet, and the callee to eliminate du- the size of the arguments or result values . An plicate call packets. Subsequent packets ar e integer at the beginning of every record deter - used for implicit acknowledgment of previous mines the number of bytes in the record . With packets. Connection state information is sim- UDP, the total size of arguments must not gen- ply the shared information between an activ- erate a UDP packet that exceeds 8,192 bytes in ity on a calling machine, and the RPC runtim e length. The same holds for the returned values . support on the callee machine, making possible ASTRA [3] developed RDTP/IP, a reliable a light-weight connection management . When datagram transport protocol, that is built on

238 UDP. It provides a low-cost reliable transport , table. Otherwise, the global bound T .upper is where calls and replies are guaranteed through used. Because no information is in the table for a connectionless protocol . For flow and error this connection, the last message on this con- control, it uses a stop and wait protocol. It nection, if there was any, had a time stamp < also provides an urgent datagram option that T.upper. Provided that p is sufficiently large , causes RDTP to push the packet out to th e and having T .upper < T.time —p, the chances network immediately. On the callee side, the of mistakenly assuming a message as duplicat e urgent datagram is stored in a special buffer , is very low. and a signal is raised to the user. A user can The above protocol is nevertheless for callees receive urgent datagrams by specifying the ur- that do not survive failures . For resilient gent datagram option in receiving statements . callees, it is necessary to determine by the time The synchronized clock message protocol [48] of a message arrival, whether the message is a (SCMP) is a new protocol that guarantees at- new one or a duplicate of a message that ar- most-once delivery of messages at low cost an d rived before the failure . An estimate of the is based on synchronized clocks over a dis- timestamp of the latest message that may have tributed system . It assumes that every nod e been received before the failure occurred . The has a clock and that the clocks of the node s system operates as if no message with a times- are loosely synchronized with some skew e, les s tamp greater than the estimate was received than a hundred milliseconds . Nodes carry out a before the failure, and therefore, messages older clock synchronization protocol that ensure suc h than the estimate are new. An upper bound precision. This synchronization protocol is be- on the timestamps of accepted messages is en- ing used for other purposes, such as for authen- forced and is named T.latest. After the failure , tication and for capabilities that expire . T.latest is used to initialize T .upper. Roughly The protocol can be summarized as follows : speaking, a server periodically writes T .time every thread of control T has a current time , to stable storage . After a failure, T .latest is T.time, that is read from the clock of the nod e the most recent value written to stable storage . in which T is running. Every message has a For example, SCMP can be used by higher- timestamp, m.ts, that is T.time of the sending level protocols to establish connections . It can thread at the time m is created . Each messag e be used as a security reinforcement mechanism contains a connection identifier, m.conn, cho- that allows only new messages to be passed t o sen by the client without consultation with th e higher-level mechanisms that use it. Perfor- server. Each server maintains a connection ta- mance measurements [48] show that at-most- ble, T. CT, that is a mapping from connection once semantics for callers calling callees oc- identification to connection information . One casionally are achieved with about the same piece of information is the timestamp of th e cost as UDP-based SUN RPC, and with signifi- last message accepted on that connection . Not cantly better performance than the TCP-based all connections have an entry in T.CT. Any T i s SUN RPC. If the caller makes several calls to free to remove an entry T .CT[C] for a connec- the same callee, none major overhead cost is tion C from its table, if T.CT[C] .ts < T.time added over the UDP, and it performs bette r —p, where p is the interval during which a con- than the TCP-based RPC. nection information must be retained . A callee When the RPC involves processes connecte d also keeps the upper bound, T.upper, on the through a wide-area network, the protocol timestamps that are removed from the table . must support location services and transparent Because only old timestamps are removed from communication among the distant connecte d the table, T .upper < T.time —p. The mech- processes. An adequate protocol for local-are a anism then distinguishes new from duplicated networks would be inappropriated for wide- (old) messages, based on the notion of bound area networks, because the latter has a very of a particular connection . On the callee side, high latency rate. Amoeba distributed operat- the bound of a connection is the timestamp of ing system [85] introduced a session layer gate- the most recent previously accepted message , way to support the interconnection of local - if the message 's connection has an entry in the area networks. Through a publish function,

239 target servers (callees) export their port an d is built, and the networking latency of the un- wide-area network address to other Amoeba derlying implementation. sites. When receiving this information, each An operating system should be capabl e site installs a server agent . The server agent of supporting several threads of control [82] . acts as a virtual server (callee) for all calls t o Single-threaded environments do not provid e that service, listening locally to requests to that for the performance and ability for concurrent port from that site. When issuing an RPC, the access of shared information required for the callee calls the server agent, which forwards the implementation of the mechanism . An attempt request across the wide-area network using the at implementing an RPC prototype as part o f published address . When the request arrives the MIT's Project Athena, around 1986, on th e at the remote Amoeba site, a client agent pro- top of 4.2 BSD, is described in [70] . There were cess is created there that acts as the virtua l no support for more than one thread of execu- client (caller) of the target server and starts tion within a single address space. The com- a local call to it . After execution, the tar- munication protocol was a user mode request get server sends the results to the client agent , and response layered on UDP (UDP-RR) . The which forwards the results through the wide- 4.2BSD does not provide the low latency re- area network to the server agent. The server quired. The process scheduling algorithm is agent then completes the call, returning the re- nonpreemptive, which tends to increase the cal l sults to the real client . Amoeba provides ful l latency. protocol transparency for the user . For wide- One of the most useful properties of a lan- area communication, it uses whatever protocol guage that is to be extended with RPC, or is available and without the client and server any synchronous mechanism, is the support processes knowing about it . For local commu- of process management . This topic is dis- nication, it uses protocols optimized for local cussed in detail in [46], where the limitation s networks. The error recovery is very powerful of synchronous communication with static pro- in this model, because the client agent notifie s cess structure are presented. Two situation s the server agent, and the reverse, when a shut- in which a process could block are identified : down of the client occurs. local delay and remote delay . When the cur- REX [57] is a remote execution protocol that rent activity needs a local resource that is un- extends the RPC concept to a wider range o f available, the activity is said to be blocking be- remote process-to-process interactions for a n cause of a local delay . A callee in this situa- object-oriented distributed processing architec- tion should suspend the execution of this call, ture . and turn its attention to requests from other callers. If the current activity makes a remot e call to other activity and is delayed, the othe r 8 Supporting and Extend- activity is blocked because of a remote delay . ing the RPC Mechanism A remote delay can actually be a communi- cation delay, which can be large in some net- This section discusses several implementation works. It can also happen because the calle d techniques to enhance the RPC mechanism , activity is busy with another request, or must and extensions added to the concept. The ex- perform considerable computing in response t o tensions try to model the RPC paradigm so the request. The caller process, while waiting , that it will be more efficient, or more suitable , should be able to work on something else . The for particular applications . user should be able to design several threads o f execution inside a process . This provides the operating system with alternative functions t o 8 .1 Process Management execute while that call, on the callee or caller Of foremost concern in designing and imple- side, is blocked. menting an RPC mechanism are three factors : Several alternative ways to avoid or overcom e the language that is extended to support RPC , the blocking in the absence of dynamic process the operating system in which the mechanism creation are described as follows and the de-

240 ficiencies inherited in the alternative ways ar e are used to pass initial data to an initializatio n analyzed [46] . Examples of some solutions are : routine within each process being created. The choice of processes to load into a componen t • Refusal: a callee that accepts a request is made at run-time. Hermes [75] and Con- and is unable to execute it, because of a cert [90] are languages that came afterwards , delay, returns an exception to the caller to also based upon the process model . indicate that the caller should try agai n later . 8 .2 Fault Tolerance • Nested accept: a callee that encounters a local delay accepts another request in a The fault tolerance and recoverability are pri- nested accept statement . This scheme re- mary issues in distributed environments . A dis- quires that the new call is completed be- tributed environment provides a decentralized fore the old call is completed. This may computing environment. The failure of a sin- result in an ordering that may not corre- gle node in such environments represents onl y spond to the needs of applications . a partial failure, that is, the damage (loss) is confined to the node that failed . The rest o f • Task families : instead of using a single the system should be able to continue process- process to implement a callee, a family ing. A system is fault tolerant [8] if it contin- of identical processes is used. These pro- ues functioning properly in the face of proces- cesses together constitute the callee, and sor (machine) failures, allowing distributed ap- synchronize with one another in their use plications to proceed with their execution an d of common data. allowing users to continue using the system . The recoverability refers to the ability to re- • Early reply : simulate asynchronous prim- cover from a failure, partial or not . itives with synchronous primitives . A There are three sources of RPC failures : callee that accepts a call from a caller sim- ply records the request and returns an im- • Network failure : The network is down, o r mediate acknowledgment . there is a network partition. The caller and callee cannot send, or receive any • Dynamic task creation in the higher-leve l data. callee: the higher-level server uses early reply to communicate with callers . When • Caller site failure: The caller process o r it receives a request, it creates a subsidiary the caller host fails . task (or allocates an existing task) and re- turns the name of the task to the caller . • Callee site failure : The callee process or This would handle remote delay. the callee host fails . This might cause th e caller to be indefinitely suspended while A synchronization mechanism may provid e waiting for a response . adequate expressive power to handle local de- lays only if it permits scheduling decisions t o The system may hide the distributed envi- be made on the basis of the the name of th e ronment, or other underlying details from th e called operation, the order in which requests user, thus providing transparency . The sys- are received, the arguments of the call, and th e tem can provide location transparency, mean- state of the resource [46] . ing that the user does not have to be aware NIL [76] was one of the primary efforts in of the machine boundaries and the physical 10- having processes as the main program structur- cations of a process to make an invocation on ing construct at the language level, which ar e it. On the other hand, the simplest approach both units of independent activity, and unit s to handling failures is to ignore them, or to let of data ownership . Processes can be dynam- the user take care of them . ically created and destroyed in groups called The fault model for node failures is as fol- components, and each component accepts a list lows: either a node works according to its spec- of creation-time parameters. These parameters ifications, or it stops working (fails). After

241 a failure, a node is repaired within a finit e • The caller did not receive the reply mes- amount of time and made active again . sage, but it was sent . Usually, the receipt of a reply message from the callee process constitutes a normal termi- • The callee failed during the execution o f nation of a call. A normal termination is pos- the respective call. The callee either re- sible if [58] : sumes the execution after failure recovery , or does not resume the execution even af- • No communication or node failures occur ter starting again. during the call . • The callee is still executing the call, but • The RPC mechanism can handle a fixed the time-out was issued too soon . finite number of communication failures . There are two situations in which the ab- • The RPC mechanism can handle a fixed normal termination can cause the caller to is - number of communication failures an d sue the same call twice. If the time-out is is - callee node failures. sued too soon for a particular call, the caller may erroneously assume that the call coul d • The RPC mechanism can handle a fixe d not be executed, and might decide to call an- number of communication failures and other callee process, while the former one is callee node failures and there is tolerance still processing the call . If the two executions to a fixed finite number of caller node fail- share some data, the computations can inter- ures. fere with each other . Another situation can The appropriate choice of fault tolerance ca- occur when the caller recovers from a failur e pabilities and the semantics under normal an d but does not know whether the call was alread y abnormal situations, are among the most im- issued or not, and makes the call again . Exe- portant decisions to be taken in an RPC de- cutions that are failed are orphans, and many sign [58] . A correctness criterion for an RPC schemes have been devised for detecting an d implementation is described in [58] . Let C; de- treating orphans [40, 53] . note a call, and W; represent the correspond- Network failures can be classified [58] as fol- ing computation invoked at the callee side . Let lows: C; and C1 be any two calls made such that : • A message transmitted from a node does (1) C; happens after C; (denoted by C; then not reach its intended destination (a com- C;), and (2) computations W; and W1 share munication failure) . some data such that W; and/or W1 modify the shared data. The correctness criterion (CR) • Messages are not received in the same or- then establishes that an RPC implementation der as they are sent . must meet in the presence of failures the fol- lowing CR: • A message is corrupted during its trans- mission. CR: C; then C; implies Wt then W1 • A message is replicated during its trans - A call is terminated abnormally if the termi- mission. nation occurs because no reply message is re- ceived from the callee . Network protocols typ- There are well-known mechanisms (based on ically employ time-outs to prevent a process checksums and sequence numbers) that a re- that is waiting for a message from being held ceiver can use to treat messages that arrive out up indefinitely. of order, are corrupted, or that are copies of If the call is terminated abnormally (the previously received messages . Therefore, net- time-out expires), there are four mutually ex- work failure is the treatment of communication clusive situations to consider : failures only. If the message handling facilit y is such that messages occasionally get lost, a • The callee did not receive the request, bu t caller would be justified in resending a messag e it was made. when it suspects a loss . This can sometimes

242

result in more than one execution at the callee A side. When a process survives failures on its node , B C the process is a resilent process . It might re- quire maintenance of information in a log on a nonvolatile storage medium. Because of the difficulties in achieving con- Figure 7: Concurrent Call of D by B and C ventional call semantics for remote calls, be- cause of problems with argument passing an d problems with failures, some works [31] argu e procedures D, which has static data (Figure 7) . that remote procedures should be treated dif- If B calls D first, C's call to D should be rejected ferently from the start, resulting in a com- until B returns to A . If C's call is accepted, an d pletely nontransparent RPC mechanism . if B calls D again later, B's and C's effects on D cannot be serialized, violating the atomicity 8.2 .1 Atomicity requirement . Procedure calls can be atomic, that is, to- A major design decision of any RPC mecha- tal and serializable, or nonatomic . Exactly- nism is the choice of the call semantics of an once semantics are guaranteed for atomic calls , RPC in the presence of failures. Some systems while only at-most-once semantics are assured adopt the notion that RPC should have only for nonatomic ones. last-of-many semantics when failures occurre d Calls invoked by the callees of an atomic cal l to model procedure calls in a centralized sys- are also executed as atomic calls. The total- tems [53] . In Argus [45], RPC has at-most-onc e ity property requires that when a caller mal- semantics and malfunctions are treated by ex- functions and recovers to an old state, all the ception handling. The Cedar [13] implementa- states of its callees must also be restored . If tion enforces at-most-once semantics by send- a callee fails and recovers, its caller is not af- ing probe messages, and by using Mesa's power- fected, but the call must be repeated . The al- ful exception handling mechanism . Other sys- gorithm recovers a failed procedure's data state tems [67, 80] follow the idea that RPC shoul d to its state after the last time it finished a n have exactly-once semantics to be useful . atomic call. A procedure that was called by An implementation of an atomic RPC mech- a failed atomic procedure is recovered to the anism, which maintains totality and serializ- state before the failed atomic procedure mad e ability, is presented in [42] . Serializability is the first direct or indirect call to the procedure . guaranteed through concurrency control by at- The backup states for a procedure are Associ- taching a call graph path identifier to each mes- ated States (AS). An AS is the procedure's dat a sage representing a procedure call . Each pro- state (static variable and process identification) cedure keeps the message path of the last ac- that is tagged with a version number, which is cepted call . This path is used to make compar- increased by one every time a new AS is cre- isons with incoming message paths, and only ated. An AS is created on procedure initiatio n calls that can be serialized are accepted . In and updated every time after the procedure re- this system, procedures are separated entities . turns from an atomic call . A procedure's data can be either static or au- A Transactional RPC design and implemen- tomatic. The automatic data is deleted after a tation is presented in [23]. The design consists procedure's activation, while static data is re- of a preprocessor, a small extra runtime library, tained. Associated states of static variables are and the usual RPC components . The prepro- saved in backup processors to survive failures . cessor accepts interfaces written in DCE ID L Because of concurrency control and to main- and generates, for each procedure, a shadow tain atomicity, some calls to procedures wit h client stub and a shadow manager stub (shadow static data can be rejected . For instance, sup- server stub) . It also generates an internal in- pose a procedure A that concurrently invokes terface, which contains a new version of the calls on procedures B and C, both of which call procedures declared in the original interface ,

243 where an extra in-out argument is inserted . was executed without problems, whether th e This extra argument contains the data infor- call was not done, whether the callee was ab- mation needed for the transaction system . The sent or could not execute the call . The system internal interface is the input to the stub gen- assigns sequence numbers (SN) to messages . erator. SNs are unique over the entire system . A callee When issuing the RPC, the user calls th e maintains only the largest SN received in a fail- shadow client stub, which communicates wit h ure proof storage . All retry messages are sent the transaction subsystem. The shadow client by a sender with the same SN as the origina l registers the call, obtains some registering data message. A callee accepts only messages whos e (piggyback data), and updates the log file . Af- SN is greater than the current value of the las t ter that, it calls the shadow manager stub gen- largest SN. A similar approach is provided at erated for the internal interface and passes the the caller's side. For the generation of network- piggyback data as the extra argument . On wide unique sequence numbers, the loosely syn- other side, the manager stub receives the call . chronized clock approach is used . Each node i s It recovers the piggyback data argument, an d equipped with a clock and is also assigned a interacts with the transaction system to com- unique node number . A sequence number a t municate the acceptance of the request . There- each node is the current clock value concate- after, it calls the real caller stub with the re- nated with the node number . minder of the arguments . A similar protocol is used when the call is finished w . The shado 8.2.2 Replicatio n server interacts with the transaction system communicating the completion of the call, an d Some systems use replication to support faul t obtaining some more piggyback data . It then tolerance against hardware failure. Schemes passes the piggyback data and the procedure that use replication can be classified in two cat- results back to the client shadow . The client egories: primary standby and modular redun- shadow, once again, interacts with the trans- dancy . In the primary standby approach on e action system by passing this piggyback data , copy is the active primary copy and the oth- and then returns the results of the procedure t o ers are passive secondary copies. The primary the original caller . The user interfaces are no t copy receives all the requests and is responsi- modified, and the user with transactional ap- ble for providing the responses. It also updates plications benefits in a completely transparent periodically the checkpoints of the passive sec- manner. ondary copies, so that they can assume the role A reliable RPC mechanism that adopts th e of primary copy, if this is the case . If the pri- exactly once semantics is described in [67] . At mary copy fails, a prespecified secondary cop y the RPC system language level, the user can assumes its role . The major drawback of such specify additional information that enables th e a scheme is the considerable system support re- underlying process to manage the receipt an d quired for checkpointing and message recovery . sending of messages . Orphans are not treated In modular redundancy scheme, all the repli- at this level. The level immediately above this cated copies receive the requests, perform th e requires that all programs be atomic action s operation, and send the results. There is no dis- with the all or nothing property. Executing tinction made between the copies . The caller callees in an atomic way guarantees that the usually waits for all the results to be returned , repeated execution at the callee side is per - and may employ voting to decide the correct formed in a logically serial order with orphan result. The major disadvantage of this scheme actions terminating without producing any re- is the high overhead of issuing so many request s sults. The syntax transparency is lost, and an and so many responses. If there are m requests RPC call requires, at least, the name of the and n copies of the requested service, the num- callee's host and a status and a time-out ar- ber of messages needed to complete the request gument. The time-out specifies how long the is O(m x n). caller is willing to wait for a response to his The Replicated Procedure Call [18] is an ex- request. The status indicates whether the call ample of modular redundancy-based mecha-

244

Caller troupe Callee troupe of grouping the related requests, that is, to de- cide whether the requests are unrelated or are part of the same replicated call, because the requests may not be synchronized or identical . Call P The system does not require complete deter- minism, that is, that each member of a troup e reaches the same stable state through the ex- Ca11 P ecution of the same procedures with the same arguments in the same order. The system is based on the notion of state equivalence rela- tion between modules, and parameter equiva- Call P lence and result equivalence among procedure calls. Procedure executions in a module have an equivalence relation induced by the rela- Figure 8: Replicated Procedure Call tions. Two executions are equivalent if: (1) the results are result equivalent, (2) the new states are state equivalent, (3) the traces are identica l nism for RPC-based environments . The set up to parameter equivalence of the remote pro- of replicas (instances) of a program module i s cedures called . A module is deterministic up t o a troupe . A replicated procedure call is made equivalence if: (1) all executions of a procedur e from a caller troupe to a callee troupe . Each with a specific set of arguments in a specifi c member of the caller troupe makes a one-to- state are equivalent, or (2) whenever a param- many call to the callee troupe, and each callee eter is equivalent to another parameter, and a member troupe handles many-to-one call from state is also equivalent to another state, all ex- the caller troupe as shown in Figure 8 . Each ecutions of a specific procedure in the forme r member of the callee troupe only performs th e state with the former arguments are equivalen t requested procedure once, while each member to the executions of the same procedure with of the caller troupe receives several results. To the second argument in the second state . the programmer this looks like normal proce- To assure the deterministic up to equivalence dure calls . Distributed applications continue relation, the notion of call stack is extended to function normally despite failure if at least to distribute call stack, which consists of the one member of each troupe survives . The de- sequence of contexts, possibly across machin e gree of replication can be adjusted to achiev e boundaries, that were entered and not returne d varying levels of reliability . Increasing the num- from yet. At the base of the stack is the ber of nodes spanned by a troupe increases its top-level context that originated the chain o f resilience to failures . stacks . The distribute call stack is, once again , To issue a replicated procedure call, the extended to replicated call stack, in which an caller's side of the protocol sends the same cal l entry for an active procedure consists of a set of message, with the same sequence number, t o contexts, one from each member of the troup e each member of the callee troupe . On the implementing that procedure . Given an active callee's side, a callee member receives call mes- replicated procedure call, the replicated cal l sages from each caller's troupe member, per - stack can be traced back, and a unique troup e forms the procedure once, and sends the same can be found at the base level. This is the root result to each member of the caller 's troupe . troupe of the replicated procedure call . For this call, each troupe has a unique troup e In this way, a root id is added to each call id, assigned by a binding agent, and each mes- message. A root id contains the troupe id of sage contains the caller's troupe id . When a the root troupe of the call, and the sequence request arrives, the callee checks the member s number of the root troupe's original replicated of the caller's troupe id, and determines fro m call . The root id is a transaction identifier, an d whom to expect call messages as part of the whenever a callee makes a replicated call o n many-to-one call. The callee also needs a way behalf of a caller, it propagates the root id of

245

the call that it is currently performing . The 8.2 .3 Orphan Detectio n root ids have the essential property that two Rajdoot [58] uses three built-in mechanisms to or more call messages arriving at a callee have cope with orphans in an efficient way . It adopt s the same root id if they are part of the same replicated call. This allows a callee to handle exactly once semantics . For callee failure, the many-to-one calls. call is guaranteed to be terminated abnormally. There are three orphan handling mechanisms : • (M1 :) If a client call is terminated abnor- y A combination of the modular redundanc mally, any computations that the call may approach and the primary standby is describe d have generated are also terminated. in [88], which provides for an RPC mechanis m tolerant to hardware failures . Whereas all the • (M2 :) Consider a node that fails an d replicated copies execute each call concurrently , makes a remote call to node C after recov- as in the modular scheme, the caller only make s ery. If C has any orphans because of th e one call, which is then transmitted to the oth- caller's failure, they are terminated befor e ers. The mechanism exposes to the user a clus- the execution of the call starts at C . ter abstraction, which constitutes a group of replicated incarnations of a procedure . The in- • (M3 :) If the node remains down, or if af- carnations are identical, but reside in different ter recovery never makes calls to C, an y nodes. An RPC call is issued by a caller clus- orphans on C are detected and remove d ter, and the callee cluster is the one that exe- within a finite amount of time . cutes the call. The caller process, the proces s To ensure M1, every call has a deadlin e that starts the call, is not replicated, and is (time-out) argument that indicates to the callee a special instance of a caller cluster with only the maximum time available for execution . one instance . In recursive and nested calls, a When the deadline expires, the callee termi- cluster can function both as a caller and as a nates the execution and the call is terminate d callee . abnormally. To ensure M2, every node main- tains a crashcount in permanent storage. The There are primary and secondary incarna- crash count is a counter that is incremented im- tions of a procedure . During cluster creation mediately after a node recovers from a failure . time, an incarnation is designated primary , Also in stable storage, a node maintains a table and is responsible for handling the communica- of crashcount values for callers that made call s tion between clusters . All others are considered to it. Every call request contains the caller' s secondary . If the primary incarnation fails, a crashcount value . If the value in a request secondary incarnation assumes its function an d is greater than the value in the callee's table becomes the primary incarnation . A secondar y for that caller, orphans may be at the callee . incarnation can only interact with other clus- These orphans are terminated before the callee ter's processes, if all its superior incarnations proceeds with the call . To ensure M3, ev- fail and it has become primary. The incarna- ery node runs a terminator process that occa- tion hierarchy in a cluster is determined whe n sionally checks the crashcount values of other the service is created . All incarnations play an nodes by sending messages to them and receiv- active role : all the incarnations in the callee ing replies, and then terminates any orphans cluster execute a call, and every incarnation in when it detects failures . a caller cluster receives a copy of the result . In The RPC call syntax is as follows, where pa- the absence of failure, though, the caller make s rameters and results are passed by values: a single call and only one copy of the result i s RPC(server : . . . ; call : . . . ; sent to the caller cluster . The system assumes time-out : . . . ; retry : . . . ; that a cluster is deterministic: when receiving var reply : . . . ;var rpc_status : . . .) ; the same call each of its incarnations produce s the same result and has the same side-effects . The first parameter specifies the callee's host . The callee procedures must be idempotent . The second parameter, which is the name of the

246 function and the arguments. The retry indi- concurrently. For a group of callees, the dis- cates the number of times that the call is t o tributed client program generator generates a be retried (the default value being zero). If af- driver program that can make use of any num- ter issuing a call, no reply is received within ber of remote procedures provided by thes e the duration specified by time—out, the call callees. is reissued . The process is repeated a maxi- The produced programs are monitored b y mum of retry times. The manager processes , the monitor when the debug option is on, which are run by each process and accessed via and during prototyping generation . The mon- a well-known address, are responsible for cre- itor has a controller and a group of managing ating callee processes that execute remote calls servers. The controller offers a user interface of a client . All remote calls of the client ar e and has a filter. Each host that has monitored directed to the callee process . program parts on it has a managing server con- SUN RPC does not specify the call seman- sisting of a server (MS) and an event database . tics to be supported and has no provision for or- The monitoring is done in three steps . Dur- phan treatment . The transport protocol chosen ing the monitoring step, all events that occur at the time that the RPC handler is acquire d on one host are monitored by the local M S determines the call semantics . Similarly, Xerox and recorded into the local event database . An Courier RPC [87] appears to support exactly event is an RPC and execution, a process cre- once semantics, but its description is not pre- ation (through a fork), and termination, or any cise about its fault tolerance capabilities and combined event defined by the user through th e no support for orphan treatment is provided . Event Definition File (EDF) . During the sec- ond step, the ordering step, the events saved in 8 .3 Debugging and Monitorin g the local event databases are ordered through messages interchanged among the MSs RPC Systems . The controllers can be invoked by any host . The Concurrency and communication among th e last step, the replaying step, consists of combin- concurrent parts of a distributed program ing the results on all related event databases . greatly increase the difficulty associated wit h The filter presents the execution trace of the program debugging . For this reason, a system distributed program to the user . After the pro- for prototyping and monitoring RPCs was de - gram is debugged, the user needs only to pro- signed [91] . From programmer-written callee gram the RPC body and the application inter- definition files, the system can prototype callee face. programs, caller driver programs, interface de- scription files written in NIDL for each callee , 8 .4 Asymmetry stubs, and prototype programs of RPC proce- dures for each callee . Another aspect of the RPC mechanism tha t The system consists of two parts: a prototyp- might not be appropriate for certain applica- ing generator and a monitor. The prototyping tions is the asymmetry inherited by the RPC generator gets the callees' definition files, an d model. In the RPC model, the caller can choose generates the above-mentioned files . It consists the particular callee, or at least the callee' s of a set of specialized generators that provid e host machine . Seldom can the callee mak e for the gradual development of the component s any restrictions about the caller, because the of a distributed environment . The server pro - callee cannot distinguish among callers to pro- gram generator generates a callee stub and a vide them with different levels of service or to callee program for each callee definition file . extend to them different levels of trust . Callees The sequential client program generator gen- must deal with callers that they do not under - erates, for each callee, a caller stub and a stand, and certainly cannot trust [64] . caller driver program . The parallel client pro - This issue is referred as naming (or address- gram generator generates a driver program for ing) of the parties involved in an interaction [8] . each callee which can make use of any num- A communication scheme based on direct nam- ber of remote procedures provided by the callee ing is symmetric if both the sender and re -

247 ceiver can name each other. In an asymmet- a higher level of abstraction in an elegant way . ric scheme, only the sender names the receiver . It is implemented as part of a server structuring When the interprocess communication mecha- system called CLAM. Users can layer abstrac- nism provides for explicit message receipt, th e tions in caller processes (statically bound) or receiver has more control over the acceptance . dynamically load layers into the callee process . The receiver can be in many different states Consider two abstractions P and Q in differen t and accept different types of messages in each layers with P in a higher layer than Q . The P state. The indirect naming involves an inter- passes to Q pointers to functions in which P i s mediate object, usually called a mailbox, to willing to accept UpCalls . To pass pointers, P which the sender directs its message and t o calls a Q registering function, with the functio n which the receiver listens . pointers as arguments, which is similar to ex- In LYNX [64], a link is provided as a built-i n porting in normal RPC . When an event occurs data type, and is used to represent a resource . that requires an UpCall to be made, Q decide s Although the semantics of a call over a link ar e the higher level of abstraction that should re- the same as an RPC, both ends of the call can ceive the call, which could be P. The P and Q be moved. Callees not only can specify the set can be in the same address space (local Up- of callers that might be connected to the othe r Calls) or in different ones (remote UpCalls) . side of the link, but also are free to rearrang e This is a different way of expressing, for in - their interconnection to meet the needs of a stance, exception handling . changing user community, and to control access to the resources they provide. 8 .5 Lack of Parallelism Also related to symmetry is the nondeter- minism issue. A process may want to wait for The inherited lack of parallelism between th e information from any of a group of other pro- caller and the callee, in an RPC model, is an- cesses, rather than from one specific process . other feature that deserved a lot of attention In some situations, it is not known in advance by the scientific community. With RPC, whil e the member (or members) of the group tha t the callee is active, the caller is always idle an d will have its information available first . Such waiting for the response [82] . Parallelism is behavior is nondeterministic . not possible. To gain performance, the caller On the other extreme, communication should continue computing while the callee is through shared data is anonymous . A process working . Related to the above issue, is the fac t accessing data does not know or care who sen t that there is no way of interleaving between the the data. caller and the callee [82] . Another criticism of the RPC model is the A caller can only have one outstanding call at passiveness of the callee [82] . Except for the a time. The RPC paradigm is inherently a two- results, a callee is not able to initiate action t o party interaction [82] . With the large number signal the caller . The point is that the proce- of available processors possible in a distribute d dure call model is asymmetric and the callee environment, the caller is actually forbidden by has no elegant way of communicating or awak- the mechanism to use such parallelism . Many ing the callers. One example [17] where th e distributed applications could benefit from con - callee should call the caller is when the net - current access to multiple callees [86] . work server (callee) needs to signal to an uppe r A caller may wait to make several calls si- layer in a protocol, or when a window manage r multaneously. This is bulk data situations for server needs to respond to user print . an RPC environment . The procedure mechanism provides a syn- A solution would be to build RPC-based ap- chronous interface to call downward through plications with several threads, so that, while successive layers of abstraction. RPCs exten d waiting for a call, only that call's thread block s the mechanism and allow the layers to reside and the others proceed being executed . This in different address spaces. The distributed Up- solution does not scale well [3] . Since the cumu- Calls mechanism [17] is a facility which allows a lative cost of thread creation, context switch- lower level of abstraction to pass information to ing, and thread destruction can be too great

248 in large distributed environments, where the deliver the replies from the sender in the or- number of RPC calls grows and shrink dynam- der of the corresponding calls . The purpose ically. Moreover, threads are not universally of the proposed mechanism is to achieve hig h supported. throughput where calls are buffered and flushe d The interactions between a caller and a calle e when convenient . Low latency can be achieved can be classified as follows [3] : by explicitly flushing the calls . The call stream • Intermittent exchange : the caller makes relies solely on a specific, reliable byte-strea m a few intermittent request-response typ e transport such as TCP, making it more suitable calls to the callee . for bulk data transfer. The use of TCP lead s to higher overhead for most transactional ap- • Extended exchange : the caller is either plications in which a request-response protocol involved in bulk data transfer, or makes is more appropriate . many request-response type calls to a The mechanism offers three types of calls: callee. • Ordinary RPCs: the receiver can make an- Several solutions have been proposed to ex - other call only after receiving the reply . tend the RPC concept, so that it is both more efficient for bulk data transfer, and so that i t • Stream calls: the sender can make mor e allows for asynchronous communication . All calls before receiving the reply. To the solutions try to maintain as much as possible user, the sequence of calls has the same the semantics of a procedure call . effect as if the user had waited to receive A call buffering approach [30] is a mechanism the nth reply before doing the (n + 1)-t h that provides a call buffering server, where call. The system delivers calls and replies the requests and replies are stored (buffered) . in the correct order. An RPC call results in a call to the buffer- ing server, which stores the request, the caller • Sends: the sender is not interested in the names and the callee name . After that, the reply. caller makes periodic requests to the call buffer- ing server to check whether the call is executed . The receiver provides a single interface, and In the meantime, callees poll the call buffe r does not distinguish among the three kinds server to check whether there are any calls of calls. The underlying system takes care of waiting. If so, the arguments are retrieved, th e buffering messages where appropriate, and de- call is executed, and the callee calls the call livering calls in the correct order. At the mo- buffer server back to store the results . The ment of the call, the senders can choose inde- next time the caller pools the buffer, the re- pendently the particular type of call desired . sults can be retrieved . To minimize the delay for the ordinary RP C In the Mercury System [45], a mechanis m mode, requests and replies are sent over th e is proposed that generalizes and unifies RP C network immediately, that is, as soon as the y and byte-stream communication mechanisms. are issued or available . On the other hand , The mechanism supports efficient communica- stream calls and sends are buffered and sent tion, is language independent and is called call when convenient. stream or stream . To add the mechanism to The paradigm also introduces the abstrac- a particular language, a few extensions may be tion of entities. An entity has a set of port added to the language . Such extensions are lan- groups and activities (processes) . The port guage veneer . Veneers for three languages have groups group ports for sequencing purposes . already been provided : Argus, C, and Lisp. On the sender side, entities have agents and A call stream connects two components of a a port group that define the sending end of a distributed program . The component sender stream. In this way, the user and the system es- makes calls to the component receiver. The tablish the sequencing in which calls from port s next call can be made before the reply to the of the same group, calls from different stream s previous call received . The mechanism can de- but from the same entity, or calls from different liver the calls in the order they are made, and streams and different entities must be executed .

249 In the CRONUS [3] system, Future is de- optimizations necessary . The caller does not signed only for low latency, and yet the orde r block during a call, and the replies can b e of execution in Future may vary with the orde r claimed when needed similar to promises . All called. Stream and Future do not optimize for calls are executed in the same order as called . If intramachine calls . the reply is not available when the caller claim s Another extension to the RPC paradigm was it, the caller can unblock the receive operation the new data type introduced in the work de- by specifying a no delay option. ASTRA also scribed in [43], promise. The goal was to pre- provides optimized intramachines calls, bypass- serve the semantics of an RPC call, yet pre- ing the data conversion and network communi- vent the caller from blocking until the momen t cation. For binding it uses a superserver dae- the caller really needs the values returned b y mon that must reside on every machine tha t the call. An RPC is declared as returning a runs a callee process, and also detects whether promise, a place holder for a value that will a particular server is still active . The major exist in the future. The caller issues the call , drawback of such a design is the loss of trans- and continues executing in parallel with the parency between local and remote procedure callee. When the callee returns the results, the y calls. are stored in the promise, where they can b e claimed by the caller when needed . The sys- tem guarantees that the calls are executed i n Another new data type, Pipes [29], proposes the order that they were called, even if the re- to combine the advantages of RPC with the sult of call i + 1 is claimed before the result o f efficient transfer of bulk data . It tries to com- call i, if this result is ever claimed . The pro- bine low-latency and high-throughput commu- grammer explicitly claims the result, and th e nication into a single framework, allowing fo r introduction of this data type shows the user bulk data incremental results to be efficiently the distinction between call time and claimin g passed in a type safe manner . RPCs are first- time. Functions are restricted to have only ar- class values, that is, have the same status as gument passing by value and return only on e any other values . They can be the value of an value that constitutes the promise value . expression, or they can be passed as an argu- The extension is language independent an d ment . Therefore, they can be freely exchange d the abstraction can be incorporated in any lan- among nodes . Unlike procedure calls, pipe calls guage. An RPC mechanism could implement do not return values and do not block a caller . the idea behind the mechanism in a transpar- Pipes are optimized for maximum throughput , ent way, confining the problem at the imple- where RPCs are optimized for minimum la- mentation level, but not at the language level. tency. RPC and pipes are channels used in the The system could do some data flow analysis i n same manner as local procedures. The node the program and allow the parallel execution of that makes the call is the source node, and the caller and callees, blocking the caller only when node that processes calls made on a channe l values that are expected to be affected by RPC is the channel 's sink node . For timing and se- calls are claimed. The complication here is t o quencing among calls, channels can be collecte d take care of pointer-based arguments that are into a channel group. A channel group is a set modified as a side-effect of the remote execu- of pipes and procedures that share the same tion. sink node and that observe a sequential order- ASTRA [3] tries to combine low-latency and ing constraint for a source process . This order- high-throughput communication into a singl e ing constraint guarantees that calls made by a asynchronous RPC model, and provides the process on the members of a channel group ar e user with the flexibility of choosing among dif- processed in the order that they were made . ferent transport mechanisms at bind time . It The user makes the grouping . uses the methodology that the programmer user knows better, letting the programmer de- cide whether a specific call requires low latency A survey of asynchronous remote procedure or high throughput . The system does all the calls can be found in [4] .

250 8.6 Security a 3-phase handshake between the caller and th e callee. By the end of the operation, the calle r A major problem introduced by an external and the callee share one handshake key . The communication network is the providing of caller is assured that the callee can derive this data integrity and security in such an open key from its identification, and the callee is as- communication network . Security in an RPC sured that the caller possesses the correct hand - mechanism [71] involves several issues : shake key . The RPC mechanism does not hav e • Authentication : To verify the identity of access to the format of a caller's identification, each caller . nor the way in which the key can be derived from this identification. All this functionality • Availability: To ensure that callee access is hidden by function pointers with which the cannot be maliciously interrupted. security mechanism supplies the RPC runtim e system. The code for the details of the authen- Secrecy: • To ensure that callee information tication handshake is small, self-contained, an d is disclosed only to authorized callers . can be treated as a black box, allowing for alter- • Integrity: To ensure that callee informa- native mutual authentication techniques to be tion is not destroyed . substituted with relative easy. A call to an un- bind operation terminates the connection, an d Systems treat these problems differently, destroys every state associated with that con- providing more insight or priority according t o nection. their most suitable applications . One way of guaranteeing security in RP C The Cedar RPC Facility [13] uses th e systems, is to enhance the transport layer pro- Grapevine [12, 62] Distributed Database as a n tocol at the implementation level and to extend authentication service or key distribution cen- the call interface at the user level, as in [11] . ter for data encryption . It treats the RP C The new abstraction data type conversations is transport protocol as the level of abstraction a t introduced, through which users interact wit h which to apply end-to-end authentication and the security facilities. To create a conversation, secure transmission measures [61] . The An- a caller passes its name, its private key, and the drew system [61] independently chose the same name of the other principal to the RPC runtime approach. system. That conversation is then used as an Andrew's RPC mechanism [61] supports se- argument of an RPC, and the runtime system curity. When a caller wants to communicat e ensures that the call is performed securely us- with a callee, it calls a bind operation . The ing a conversation key known only to the two binding sets up a logical connection at one of principals. the four levels offered by the system : The scheme is based on the use of private • OpenKimono : the information is neithe r keys and built upon an authentication servic e authenticated nor encrypted . (or key distribution center) . Each principa l has a private key known only to the princi- • AuthOnly : the information is authenti- pal and the authentication service. A caller cated, but not encrypted . and a callee that want to communicate nego- tiate with the authentication service to obtain • HeadersOnly : the information is authen- a shared conversation key. This key is used ticated, and the RPC packet headers, bu t to encrypt subsequent communication between not bodies, are encrypted . the two principals . The federal Data Encryp- tion Standard [52] (DES) is used for encryp- • Secure : the information is authenticated , and each RPC packet is fully encrypted. tion, because very fast and inexpensive hard - ware implementations are broadly available . When establishing the connection, the caller For example, if principal A wants to commu- also specifies the kind of encryption to be used. nicate securely with principal B, A 's program The callee rejects the kinds of encryption that includes a call on the RPC runtime system i n it cannot handle. The bind operation involves the form:

251

cony <- R .P .C .Creat e Because of the wide range of applications to Conv[from : nameOfA , which these processors may be applicable, ther e to : nameOfB , may be no ideal set of remote procedures tha t key : privateKeyOfA] a callee should provide [71] . In the worst sit- uation, the RPC model requires as many cus- Principal A can then make remote calls to pro- tomized interfaces from a set of callee processes cedure P.Q implemented by B, such as : as there are applications that may make use of it. Without an appropriate RPC interface, a x <- P .Q[thisConv :conv,arg :y] distributed application can suffer degraded per- Inside the implementation of P .Q, principal B formance because of the communications over- can find the identity of the caller by a call in head . It is claimed in [71] that, in the RP C the RPC runtime system in the form : model, a program has a unique partition int o fragments for local and remote execution . caller <- RPC .GetCaller[thisConv] Remote Evaluation (REV) is the ability to The concept of conversation is orthogonal t o evaluate a program expression at a remote com - the other abstractions in a call . Multiple pro- puter [71]. A computer that supports REV cesses can participate in a conversation, an d makes a set of procedures available to other multiple simultaneous calls may be involved in computers by exporting them. The set of a conversation. procedures exported by a computer is its in- terface . When a computer (client computer) sends a program expression to another com- 8.7 Broadcasting and Multicast- puter (server computer), this is a REV request. ing The server evaluates the program expressio n It is argued in the research community tha t and returns the results (if any) to the client. there are applications to which RPC seems t o Before sending the program expression, the be an inappropriate paradigm . Several exten- client prepares the code portion for transmis- sions to this model have therefore been pro- sion, in the encoding process . The end result posed. One extension is applications based on of this encoding is a self-contained sequence of the necessity for multicasting or broadcasting . bits that represents the code portion of the re- Programming language support for multicast quest in a form that can be used by any serve r communication in distributed systems that ca n supporting the corresponding service . It may be built on RPC is presented in [19] . The callee consist of compiled code, source code, or othe r interface has no modifications at all. The caller program representation . This choice affect s has additional binding disciplines, which pro - mainly the run-time performance and server se- vide for binding a call to a group of callees , curity. and additional linguistic support . Callers have The use of precompiled REV requests can to deal with several replies, instead of just one , improve run-time performance, because th e and may have to block only until the first n servers directly execute the REV requests in- replies are received . stead of using an interpreter . It may not be very useful in a heterogeneous computing en- vironment, and has the potential for compro- 8 .8 Remote Evaluation mising server security. Dynamic compilation One of the main motivations behind interpro- of REV requests at servers is certainly possi- cess communication and distributed environ- ble, but the net effect on performance depend s ments is that specialized processors can be in- on the REV requests. The performance of an terconnected . This kind of environment seems RPC mechanism is bounded by the overhead of like a single, multitarget system able to ex- internode communication . The REV may elim- ecute, possibly efficiently, several specialize d inate such overhead. Because trade-off is be- computations. In an RPC model, the special- tween generality and performance when RPCs ized processors are generally mapped to callee are used, service designers typically choose per - processes that offer (export) a certain interface . formance over generality. When REV is avail-

252 able, however, programmers can construct dis- the other hand, it can be rather inefficient hav- tributed applications that need both. While in ing these expressions interpreted on the server' s the RPC model, a callee process exports a fixe d machine. The integration with RPC-based sys- set of procedures, a REV environment allows tems is also not easy. the application programmer to compose serve r procedures to create new procedures that have equal standing with the exported ones . 9 Performance Analysi s h The RPC REV relationship is compared wit It is a common practice for experienced pro- the relationship between built-in data type an d - d grammers to use small procedures instead of in new data type definitions . The REV is claime line code, because it is more modular and doe s to be a generalization of, and an alternative to , not affect performance too much [82] . If such RPCs . procedures are run remotely however, instead One of the drawbacks of REV is that it does of locally, the performance may be severely de- not allow functions to be passed as arguments graded . of REV program expressions. The Network RPCs incur an overhead between two an d Command Language [24] (NCL) defines a new three orders of magnitude greater than local , programming model that is symmetric, that is calls [86] . The cost of making an RPC in th e not restricted to caller and callee network archi - absence of network errors can be classified as . Both callers and callees are allowe d tectures follows [86] : to use NCL to send expressions that specify an algorithm for an operation . call time = parameter packing To communicate with a remote host, th e + transmission queuin g caller has to establish a session with it, which + network transmissio n is equivalent to logging into a server by passing + callee queuing and appropriate credentials . If the caller fulfills the schedulin g authorization requirements, it enters an inter- + parameter unpacking active LISP read-eval-print loop with the hos t's + execution evaluator (server) . The server connects back + results packing with the caller in the same way. The calle r + transmission queuing and server exchange NCL expressions repre- + network transmission senting requests and responses over this ses- + caller schedulin g sion as if each were a user typing to an interac - + results unpackin g tive evaluator . A caller or server can vary the programming dynamically to suit changing re- It can be rewritten as: quirements within a heterogeneous distribute d call time = parameter transformation s system. In this way, a server can perform all of + network transmission - the operations necessary for a collection of het + executio n erogeneous callers without previous definition , + operating-system delay s linkages, or interfaces modules for every combi - nation of primitives . As a result, the ultimat e Programs that can issue RPCs can b e point of compatibility for distributed systems modeled, and a cost analysis can be devel- shifts to the NCL syntax, libraries, and canon- oped based on simple measurements, as shown ical data formats . in [50]. RPCs can be classified according t o The primary advantage of this scheme i s the size of input, the size of output, and th e that many application programs can be buil t amount of computation performed by the pro- that translate tasks from the language that the cedure. Application programmers are provided client is used to, to expressions to be evaluated with some guidelines, such as, how often the remotely. This advantage makes this software user can afford to call an expensive RPC con- highly extensibly . sidering the degradation of performance be- To transmit expressions, LISP-like represen - cause of remote execute instead of local exe- tations are used that are very economical . On cution, the cases in which the remote host can

253

perform the computation faster and the speed- Number Time for Time for ing up factor, or, more generally, the expecte d of bytes Local Calls RPCs execution time of a program with a certain mix 1 1.05 3 .51 of RPCs. 25 1.79 4.1 7 An extensive analysis of several RPC pro- 1016 2 .99 8.22 tocols and implementations developed befor e 3000 6.66 19.19 1984 was presented in [53], and the firs t measurements of a real implementation given Table 2: RPC Performance in Extende d in [13] . The measurements were made for Modula-2 (in ms). remote calls between two Dorados connected by an Ethernet with a raw data rate of 2 .94 callee's domain . megabits per second, lightly loaded . Table 1 shows the time spent on average, and the frac- • Simple data transfer: a shared memory ex- tion of that time that was spent in transmissio n ists between the caller and the callee, and and handling of local calls in the process . These the arguments are copied only once for all measurements are made without using any en- situations. cryption facility from the moment an RPC i s requested to the moment at which the result is • Simple stubs : are a consequence of the in the final destination . simple model of control and data transfe r In the Modula-2 extension [1] to suppor t between threads. RPC, it was also observed that RPC timings The system adopts a concurrency-based design are directly affected by the number and size o f that avoids shared data structure bottlenecks . parameters, as shown in Table 2. The perfor- It also caches domains context on idle proces- mance was much better also for out arguments , sors, avoiding context switch. A performance than for in-out, and for idempotent procedures of 157 ms is achieved for the null cross-domai n than for non-idempotent ones . call, and of 227 ms for the 200-byte in-out ar- In distributed (RPC or message passin g gument . based) operating systems, large percentage Another approach used to improve perfor- (94% to 99%) of the communication is onl y mance was to carefully analyze the steps taken cross-domain, not cross-machine [9] . Effort has by an RPC, and to precisely identify the bot- been made to optimize the cross-domain im- tlenecks [63]. An optimization for the RP C plementation of RPCs to bring considerable mechanism for the Firefly multiprocessor com- performance improvements. The Lightweight puter is described in [63] . The architecture RPC [9] (LRPC) consisted of a design and is based on multiprocessors in the same ma - implementation effort to optimize the cross - chine sharing only one I/O bus connected to domain use for the TAOS operating system o f the CPU 0. This feature could affect RPC the DEC SRC Firefly multiprocessor worksta- latency on other processors if a high load o n tion. The improvements are the result of thre e this particular CPU occurs. Great performance simple aspects: increases are achieved through the identifica- • Simple control transfer: the caller's threa d tion of the major bottlenecks in the mechanism executes the requested procedure in the and the optimization of the process in the par- ticular points . During marshaling, the stub s use customized code that directly copy argu- Procedure Average Trans . Local ments to and from the call or result packet , Time Time Call Time instead of calling library routines or an inter- no args/results 1097 131 9 preter. The RPC and transport mechanism s 10 args/results 1278 239 1 7 communicate through a buffer pool that resides 240 word array 2926 1219 98 in a shared memory accessible to all user ad- dress spaces, to the Nub, and which are als o Table 1: RPC Performance in the Cedar permanently mapped into the I/O space . In Project (in ps) . this way, all these components can read an d

254 write packet buffers in the same address spac e 10 Conclusion with no need for copying or extra address map - ping operations. Nevertheless, no protection Distributed systems provide for the intercon- has been added and this strategy can lead t o nection of numerous and diverse processing leaks into the system . An efficient buffer man- units. Software applications must relish such agement is done, making the 1server stub reuse broad functionality, the inherited parallelism the call packet for the results, and the receiver available, and the reliability and fault toler- interrupt handler to immediately replace the ance mechanisms that can be provided . Sev- buffer used by an arriving call or result packet , eral language-support paradigms have been checking for additional packets to process be - proposed to the modeling of applications i n fore termination . A speedup by a factor of al- distributed environments . An inappropri- most three was obtained rewriting a particular ate choice of language abstractions and dat a path in assembly language. The demultiplex- type constructors may only add to the ever - ing of RPC packets is done at the interrupt rou- increasing software development crisis. It may tine, instead of the usual approach of awaken- prevent software maintenance, portability and ing an operating-system thread to demultiple x reusability. Today's conception of a distributed the incoming packet. The Firefly RPC offer s system is far behind the desirable model o f the choice of different transport mechanisms computation . It is mandatory that new soft- at bind time . The integration of Firefly RPC ware paradigms be as independent as possibl e into a heterogeneous environment, where RP C of the intrinsic characteristics of a specific ar- can be especially beneficial, is not discussed . chitecture and, consequently, must be more ab- Heterogeneity is also a potential source of per- stract . formance optimization problems, because dif- In this work, we studied RPC, a new pro- ferences in machine architecture and informa- gramming language paradigm. A survey of the tion representation can make time-consumin g major RPC-based systems and a discussion of conversion operations unavoidable . Portions of the major techniques used to support the con- the software are implemented in assembly lan- cept were presented . The importance of RPC i s guage, which may cause portability and main- because of its conservative view toward alread y tenance problems. Possible additional hard - developed software systems . The goal of RP C ware and software speedup mechanisms are dis- development is to provide easy interconnection cussed, offering estimated speedups of 4% t o of new and old software systems, despite it s 50%. language, operating system, or machine depen- dency. To achieve this goal, the foundation of RPC is based upon a strong language construc- tor, the procedure call mechanism, which exist s in almost any modern language . An effort was made to take some of the op- This work may inspire important futur e erations in communications in distributed sys- work. Syntax and semantic transparency ar e tems to the hardware level. Hardware support ever-challenging research topics and the suit- for message passing at the level of the oper- ability of the RPC paradigm for new dis- ating system primitives was proposed in [60] . tributed systems applications, such as multi- A message coprocessor is introduced that in- media, is also an important topic to be studied . teracts with the host and network interface through shared memory, and provides concur- rent message processing while the host execute s Acknowledgements other functions . In control of the network in- terface the coprocessor takes care of the com- I am thankful to Shaula Yemini for suggesting munication process . This involves checking the the writing of this survey, and for giving signif- validity of the call, addressing and manipu- icant comments on its overall organization. I lating control blocks, kernel buffering, short - am also thankful to Prof. Yechiam Yemini, my term scheduling decisions, and sending net - advisor, for carefully reviewing earlier drafts of works packets, when necessary, and so on. this work. Finally, I would like to thank the

255 referees for their careful reading of this paper Levy. Lightweight remote procedure call. and for their valuable comments. ACM Transactions on Computer Systems, 8(1) :37-55, February 1990 . References [10] Brian N. Bershad, Dennis T Ching, Ed- ward D Lazowska, Jan Sanislo, and [1] Guy T. Almes. The impact of language Michael Schwartz. A remote call facil- and system on remote procedure call de- ity for interconnecting heterogeneous com- sign. In The 6th International Conference puter systems. IEEE Transactions on on Distributed Computing Systems, pages Software Engineering, 13(8):880-894, Au- 414-421, Cambridge, Massachusetts, May gust 1987. 1986. IEEE Computer Society Press . [11] Andrew D. Birrell. Secure communication [2] Guy T. Almes, Andrew P. Black, Ed- using remote procedure calls . ACM Trans- ward D. Lazowska, and Jerre D . Noe. actions on Computer Systems, 3(1) :1-14, The eden system : A technical review . February 1985. IEEE Transactions on Software Engineer- ing, 11(1):43-58, January 1985. [12] Andrew D. Birrell, R. Levin, R . M. Need- ham, and M. D. Schroeder. Grapevine : An [3] A. L . Ananda, B. H . Tay, and E . K. Koh. exercise in distributed computing . ACM ASTRA - an asynchronous remote pro- Communications, 4(25):260-274, April cedure call facility. In The 10th Inter- 1982. national Conference on Distributed Com- puting Systems, pages 172-179, Arlington , [13] Andrew D . Birrell and Bruce Jay Nel- Texas, May 1991 . IEEE Computer Society son. Implementing remote procedure calls . Press. ACM Transactions on Computer Systems, 2(1):39-59, February 1984 . [4] A. L . Ananda, B. H . Tay, and E . K. Koh. A survey of asynchronous remote proce- [14] John R. Callahan and James M . Purtilo. dure calls . ACM Operating Systems Re- A packing system for heterogeneous exe- view, 26(2):92-109, April 1992. cution environment . IEEE Transactions on Software Engineering, 17(6) :626-635 , [5] Gregory R. Andrews. Paradigms for pro- June 1991. cess interaction in distributed programs . ACM Computing Surveys, 23(1):49-90 , [15] B. E. Carpenter and R . Cailliau. Experi- March 1991 . ence with remote procedure calls in a real- time control system . Software Practice [6] Gregory R. Andrews and Ronald A . Ols- and Experience, 14(9):901-907, September son. The evolution of the sr language . Dis- 1984. tributed Computing, 1(3) :133-149, 1986 . [16] Roger S. Chin and Samuel T . Chan- [7] Joshua Auerbach. Concert/C specificatio n son. Distributed object-based program- - under development. Technical report, ming systems . ACM Computing Surveys, IBM T. J . Watson Research Center, Apri l 23(1):91-124, March 1991 . 1992. [17] David L. Cohrs, Barton P. Miller, and [8] Henry E. Bal, Jennifer G . Steiner, and An- Lisa A. Call. Distributed up calls: A mech- drew S . Tanembaum . Programming lan- anism for layering asynchronous abstrac- guages for distributed computing systems. tions. In The 8th International Conferenc e ACM Computing Surveys, 21(3):261-322, on Distributed Computing Systems, pages September 1989 . 55-62, San Jose, California, June 1988 . [9] Brian N . Bershad, Thomas E . Ander- [18] E. C. Cooper . Replicated procedure call. son, Edward D. Lazowska, and Henry M . In Srd ACM Symposium on Principles

256 of Distributed Computing, pages 220-232, [28] Phillip B . Gibbons. A stub generator for Vancouver, BC, 1984. multi-language RPC in heterogenous envi- ronments. IEEE Transactions on Software [19] Eric C. Cooper. Programming languag e Engineering, pages 77-87, 1987. support for multicast communication in distributed systems . In The 10th Interna- [29] David K. Gifford and Nathan Glasser . Re- tional Conference on Distributed Comput- mote pipes and procedures for efficient dis- ing Systems, pages 450-457, Paris, France , tributed communication. ACM Transac- June 1990. IEEE Computer Society Press . tions on Computer Systems, 6(3):258-283, [20] Department of Defense, Department of August 1988 . Defense, Ada Joint Program Office, Wash- ington, DC . Reference Manual for the Ad a [30] R. Gimson. Call buffering service . Tech- Programming Language, July 1982 edition. nical Report 19, Programming Research Group, Oxford University, Oxford Univer- [21] L. P. Deutsch and E . A. Taft. Require- sity, Oxford, England, 1985 . ments for an exceptional programming en- vironment. Technical Report CSL-80-10 , [31] K. G. Hamilton. A Remote Procedure Xerox Palo Alto Research Center, Palo Call System . PhD thesis, University of Alto, California, 1980. Cambridge, Computing Lab., Univ. Cam- bridge, Cambridge, England, 1984 . [22] Terence H. Dineen, Paul J . Leach, Nathaniel W . Mishkin, Joseph N . Pato, [32] . Distributed pro- and Geoffrey L. Wyant. The network com- cesses: A concurrent programming con- puting architecture and system : An en- cept . Communications ACM, 21(11) :934- vironment for developing distributed ap- 941, November 1978 . plications. In USENIX Conference, pages 385-398, Phoenix, Arizona, June 1987 . [33] Roger Hayes and Richard D . Schlichting . g [23] Jeffrey L . Eppinger, Naveen Saxena, and Facilitating mixed language programmin Alfred Z . Spector . Transactional RPC . In in distributed systems . IEEE Transaction s Proceedings of Silicon Valley Networking on Software Engineering, 13(12) :1254- Conference, April 1991 . 1264, December 1987 . [24] Joseph R. Falcone. A programmable inter- [34] M. Herlihy and B . Liskov. A value trans- face language for heterogenous distribute d mission method for abstract data types . systems. ACM Transactions on Compute r ACM Transactions on Programming Lan- Systems, 5(4) :330-351, November 1987 . guages and Systems, 4(4) :527-551, Octo- ber 1982 . [25] Nissim Francez and Shaula A. Yem- ini. Symmetric intertask communication . [35] C. A. R. Hoare. Communicating sequen- ACM Transactions on Programming Lan- tial processes. Communications of ACM, guages and Systems, 7(4) :622-636, Octo- 21(8) :666-677, August 1978 . ber 1985. [26] Neil Gammage and Liam Casey. XMS: A [36] Apollo Computer Inc . Apollo NCA and rendezvous-based distributed system soft- sun ONC: A comparison . In 1987 Sum- ware architecture. IEEE Software, 2(3) :9- mer USENIX Conference, Phoenix, Ari- 19, May 1985 . zona, June 1987 . [27] Narain H. Gehani and William D . Roome. [37] International Standard for Informatio n Rendezvous facilities : Concurrent C an d Processing System . Open System s the Ada language . IEEE Transactions o n Interconnection—Specification of Abstract Software Engineering, 14(11) :1546-1553 , Syntax Notation One (ASN.1), 1987. ISO November 1988 . Final 8824.

257

T [38] Michael B . Jones and Richard F. Rashid. ciples of Programming Languages, pages Mach and matchmaker : Kernel languag e 150-159, St . Petersburg, Florida, 1986. support for object-oriented distributed systems: In OOPSLA '86 Proceedings, [47] Barbara Liskov and R Scheifler . Guardians pages 67-77, Portland, Oregon, Novembe r and actions : linguistic support for robust, 1986. SIGPLAN Notices . distributed programs. ACM Transactions on Programming Language and Systems, [39] Michael B . Jones, Richard F . Rashid, an d 5 :381-404, 1983 . Mary R. Thompson. Matchmaker: An interface specification language for dis- [48] Barbara Liskov, Liuba Shrira, and John tributed processing. In 12th ACM Sympo- Wroclaswski. Efficient at-most-once sium on Principles of Programming Lan- messages based on synchronized clocks . guages, pages 225-235, New Orleans, LA , ACM Transactions on Computer Systems, January 1985 . 9(2) :125-142, May 1991 . [40] B. Lampson. Remote procedure calls. Lec- [49] Klaus-Peter Lohr, Joachim Muller, an d ture Notes in Computer Science, 105:365- Lutz Nentwig. Support for distributed ap- 370, 1981. plications programming in heterogeneous computer networks. In The 8th Interna- [41] P. J . Leach, P . H. Levine, B . P. Douros, tional Conference on Distributed Comput- J. A. Hamilton, D . L. Nelson, and B . L . ing Systems, pages 63-71, San Jose, Cali- Stumpf. The architecture of an integrate d fornia, June 1988 . local area network . IEEE Journal on Selected Areas in Communications, pages [50] Dan C. Marinescu . Modeling of program s 842-857, 1983 . with remote procedures . In R. Popescu- Zeletin, G. Le Lann, and K . H. Kim, edi- [42] Kwei-Jay Lin and John D . Gannon . tors, The 7th International Conference o n Atomic remote procedure call. IEEE Distributed Computing Systems, pages 98- Transactions On Software Engineering, 104, Berlin, West Germany, September 11(10):1126-1135, October 1985 . 1987. [43] Barbara Liskov and Li uba Shrira. [51] J . G. Mitchell, W. Maybury, and R . Sweet. Promises : Linguistic support for effi- Mesa language manual (version 5 .0) . Tech- cient asynchronous procedure calls in dis- nical Report CSL-79-3, Xerox Palo Alto tributed systems . In SIGPLAN'88 Con- Research Center, Palo Alto, California , ference on Programming Language Desig n 1979. and Implementation, pages 260-267, At- lanta, Georgia, June 1988 . [52] National Bureau of Standards (NBS) , NBS, Department of Commerce, Wash- [44] Barbara Liskov. Distributed program- ington, D .C. Data Encryption Standard , ming in Argus. Communications ACM, 1977. 31(3):300-312, March 1988 . [53] B. J . Nelson . Remote Procedure Call. PhD [45] Barbara Liskov, Toby Bloom, David Gif- thesis, Carnegie Mellon University, Pits- ford, Robert Scheifler, and William Weihl . burgh, Pensilvania, 1981 . Communication in the Mercury system . In 21st Annu . Hawaii Int . Conf Syst. Sc. , [54] Gerald W. Neufeld and Yueli Yang . The pages 178-187, January 1988 . design and implementation of an ASN.1- C compiler. IEEE Transactions on Soft- [46] Barbara Liskov, Maurice Herlihy, an d ware Engineering, 16(10) :1209-1220, Oc- Lucy Gilbert. Limitations of synchronous tober 1990. communication with static process struc- ture in languages for distributed comput- [55] David Notkin, Andrew P. Black, Ed- ing. In 19th ACM Symposium on Prin - ward D . Lazowska, Henry M. Levy, Jan

258 Sanislo, and John Zahorjan . Intercon- [63] Michael D. Schroeder and Michael Bur- necting heterogeneous computer systems . rows. Performance of firefly RPC . Communications of the ACM, 31(3):258- ACM Transactions on Computer Systems , 273, March 1988 . 8(1) :1-17, February 1990 .

[56] David Notkin, Norman Hutchinson, Jan [64] M. L. Scott. Language support for loosel y Sanislo, and Michael Schwartz. Hetero- coupled distributed programs . IEEE geneous computing environments : Report Transactions on Software Engineering , on the ACM SIGOPS workshop on ac- 13(1) :88-103, January 1987 . commodating heterogeneity. Communica- [65] Michael Lee Scott . Messages vs. remote tions of the ACM, 30(2) :132-140, Febru- procedures is a false dichotomy. SIGPLAN ary 1987 . Notices, 18(5) :57-62, May 1983 . [57] Dave Otway and Ed Oskiewicz . REX : [66] Robert Seliger . Extending C++ to sup- A remote execution protocol for object- port remote procedure call, concurrency, oriented distributed applications. In exception handling, and garbage collec- R. Popescu-Zeletin, G. Le Lann, and K. H . tion. In Proceedings of USENIX C++ Kim, editors, The 7th International Con- Conference, 1990 . ference on Distributed Computing Sys- tems, pages 113-118, Berlin, West Ger- [67] S . K . Shrivastava and F. Panzieri. The many, September 1987 . design of a reliable remote procedure cal l mechanism . IEEE Transactions on Com- [58] Fabio Panzieri and Santosh K . Shrivas- puters, 31:692-697, July 1982 . tava. Rajdoot: A remote procedure cal l [68] Ajit Singh, Jonathan Schaeffer, and Mark mechanism supporting orphan detectio n Green. A template-based approach t o and killing . - IEEE Transactions on Soft the generation of distributed applications ware Engineering, 14(1):30-37, Januar y using a network of workstations . IEEE 1988 . Transactions on Parallel and Distributed Systems, 2(1) :52-67, January 1991 . [59] C . Pu, D . Florissi, P. G . Soares, P. S. Yu , and K. Wu. Performance comparison of [69] Ian Sommerville . Software Engineer- sender-active and receiver-active mutua l ing . Addison-Wesley Publishing Company, data serving. In HPDC-1, Symposium on third edition, 1989 . High Performance Distributed Computing , Syracuse, New York, September 1992 . [70] Robert J. Souza and Steven P. Miller . UNIX and remote procedure calls : A [60] Umakishore Ramachandran, Marvin peaceful coexistence? In 6th Interna- Solomon, and Mary K. Vernon. Hardware tional Conference on Distributed Comput- support for interprocess communication . ing System, pages 268-277, Cambridge , IEEE Transactions on Parallel and Dis- Massachusetts, May 1986 . tributed Systems, 1(3) :318-329, July 1990 . [71] James W. Stamos and David K. Gifford . [61] M. Satyanarayanan. Integrating security Remote evaluation . ACM Transaction s in a large distributed system . ACM Trans- on Programming Languages and Systems , actions on Computer Systems, 7(3):247- 12(4) :537-565, October 1990 . 280, August 1989 . [72] W . Richard Stevens . UNIX Network Programming . Prentice Hall, Englewoo d [62] Michael D. Schroeder, Andrew D . Birrel , Cliffs, New Jersey, 1990 . and Roger M . Needham . Experience with grapevine : The growth of a distribute d [73] Michael W. Strevell and Harvey G . system . ACM Transactions on Distributed Cragon. High-speed transformation of Systems, 2(1):3-23, February 1984 . primitive data types in a heterogeneous

259 distributed computer system . In The 8t h [82] Andrew S . Tanenbaum and Robbert van International Conference on Distribute d Renesse. A critique of the remote pro- Computing Systems, pages 41-46, San cedure call paradigm. In R. Speth, edi- Jose, California, June 1988 . tor, Proceedings of the EUTECO 88 Con- ference, pages 775-783, Vienna, Austria , [74] Robert Strom and Nagui Halim . A new April 1988. Elsevier Science Publishers B . programming methodology for long-live d V. (North-Holland) . software systems . IBM Journal Researc h Development, 28(1):52-59, January 1984 . [83] B. H. Tay and A . L . Ananda. A survey of remote procedure calls . ACM Operatin g [75] Robert E. Strom, David F. Bacon, A r Systems Review, 24(3):68-79, July 1990 . thur P. Goldberg, Andy Lowry, Daniel M . Marvin M. Theimer and Barry Hayes . Yellin, and Shaula Alexander Yemini . Her- [84] Heterogeneous process migration by re- mes — A Language for Distributed Com- compilation . In 11th International Confer- puting. Prentice Hall, 1991 . ence, pages 18-25, Arlington, Texas, May [76] Robert E. Strom and Shaula Yemini . NIL : 1991. IEEE Computer Society Press . An integrated language and system for [85] Robbert van Renesse and Jane Hall. Con- distributed programming. In Proceedings necting RPC-based distributed system s of SIGPLAN Symposium on Programmin g using wide-area networks . In R. Popescu- Language Issues in Software System, page s Zeletin, G . Le Lann, and K. H. Kim, ed- 73-82. ACM, June 1983 . itors, The 7th International Conferenc e on Distributed Computing Systems, page s [77] Mark Sullivan and David Anderson . Mar- 28-34, Berlin, West Germany, September ionette: A system for parallel distribute d 1987 . programming using a master/slave model . In The 9th International Conference o n [86] Steve Wilbur and Ben Bacarisse . Build- Distributed Computing Systems, pages ing distributed systems with remote pro- 181-188, Southern California, 1989 . IEE E cedure call . Software Engineering Journal, Computer Society Press . pages 148-159, September 87 .

[78] Sun Microsystems, Network Information [87] Xerox Corporation, Xerox OPD . Courier: Center, SRI International . XDR: Exter- The Remote Procedure Call Protocol, De- nal Data Representation Standard (RF C cember 1981 . Xerox System Integration 1014), June 1987. Internet Network Work- Standard 038112 . ing Group Requests for Comments, No . [88] M. Yap, P. Jalote, and S. K. Tripathi . 10014 . Fault tolerant remote procedure call . In The 8th International Conference on Dis- [79] Sun Microsystems, Network Information tributed Computing Systems, pages 48-54 , Center, SRI International . RPC: Re - San Jose, California, June 1988 . IEEE mote Procedure Call, Protocol Specifica- Computer Society Press . tion, Version 2 (RFC 1057), June 1988 . Internet Network Working Group Re - [89] Shaula Yemini. On the suitability of Ada quests for Comments, No. 10057 . multitasking for expressing parallel algo- rithms. In Proceedings of the AdaTEC [80] L . Svobodova . Resilient distributed com- Conference on Ada, pages 91-97, Arling- puting . IEEE Transactions on Software ton, VA, October 1982 . ACM. Engineering, 10 :257-268, May 1984 . [90] Shaula A. Yemini, German S . Goldszmidt , [81] D . C . Swinehart, P. T. Zellweger, and Alexander D. Stoyenko, and Yi-Hsiu Wei . R. B. Hagmann. The structure of cedar . Concert: A high level language ap- SIGPLAN Notices, 20(7) :230-244, July proach to heterogeneous distributed sys- 1985 . tems. In 9th International Conferenc e

260 on Distributed Computing Systems, pages them) . Programming languages differ (1) o n 162-171, Newport Beach, CA, June 1989 . the kind of entities that they support, (2) on the kind of attributes they have associated with [91] Wanlei Zhou. PM: A system for proto- one entity type, and (3) on the binding time for typing and monitoring remote procedure each attribute . call programs . ACM Software Engineer- In conventional procedure call mechanisms, a ing Notes, 15(1) :59-63, January 1990 . binding usually consists of finding the addres s of the respective piece of code (ip) by using the name of a procedure. The context in which Appendix free variables are to be resolved is the context of the caller procedure at the moment of th e A Operational Semantics call accessed through the ep . This binding can be done during compilation time, if the bod y of Procedure Call Bind- of the function is defined in the same module ing that the call is ; during link-editing time if th e procedure's body is in a module different fro m e Programming languages, together with the op- the caller's module; or during run time if th erating systems, provide the programmer user procedure's name is stored in variables (point- d with an abstract view of the underlying physi- ers in C) whose values can only be determine dynamically. In the last case, the compiler pro- cal architecture. From the user point of view , a program manipulates entities such as vari- duces a mapping between a procedure's nam e ables, statements, abstract data types, proce- and its address that can be accessed indirectl y dures, and so on, that are identified through during run time . their names. To do the appropriate mappin g between these entities and the physical ma- s chine, the language designers usually maintain B Operational Semantic a list of attributes (properties) associated with of RPC Binding each entity. For instance, an entity of typ e procedure must have attributes that represent An RPC is as an extension of the conven- the name and type of formal parameters, it s tional procedure call mechanism to which a lit- result ' s name and type, if any, and parame- tle binding information must be added . In a ter passing conventions. It also maintains an conventional situation, it is assumed that th e instruction pointer (ip) and an environment caller and callee belong to the same program pointer (ep or the access environment) . The unit, that is, thread of execution, even thoug h ip allows for the localization of the code o f the language supports several threads of exe- the procedure inside the current address space . cution in the same operating system addres s The ep provides access to the context for resolv- space or in distributed programs as shown in ing free variables (global variables) in a proce- Figure 1 . In other words, there are no in- dure body. In this way, when the programmer terthread calls, only intrathread calls, wher e refers to a specific entity through its name, th e the binding limits its search space to the pro- system can access a list of attributes and per - gram unit in which the procedure is called . The form the correct mapping. The attributes must same happens with the execution environment be specified before an entity is processed durin g when the called procedure is being identified . run time. The binding corresponds to the spec- The callee also inherits the caller 's context . ification of the precise value of an attribute of In RPC-based distributed applications, the a particular entity. This binding information is same thread assumption is relaxed and in- usually in data structure descriptor of the en- terthread calls are allowed, as shown in Fig- tity. Some attributes can be bound at compila- ure 2 . The binding information must then tion time, others at run time (so that they can contain additional attributes that allow th e change dynamically), and some are defined b y identification of the particular thread in whic h the language designers (the user cannot change the callee procedure is to be executed (callee

261 thread). The callee thread now has its own con- the corresponding RPC . There was a flexibilit y text, perhaps completely apart from the calle r's in the binding time, which could be at com- context, and the threads exchange information pilation or runtime. Only a fixed, restricte d only through the parameters passed . Some ad- set of data types, not including pointers, coul d ditional semantics is needed, though, to defin e be passed as arguments or returned in RPCs . the behavior of parameter passing by reference The environment assumed was completely ho- and pointer-based memory access . mogeneous . The callee processes were starte d and terminated by the runtime support acting as permanent servers of their interface, and ex- C Brief Historical Survey ecuting no other computation and leaving no control to the user application to manage the The idea of extending the concept of proce- service of the calls . An independent transport dure calls to distributed environment was firs t protocol was entirely developed, and was con- introduced in Distributed Processes [32] . In cerned mainly with efficient establishing and Distributed Processes, a distributed applica- canceling of connections, and with their light - tion consisted of a fixed number of sequen- weight management. For system malfunctions , tial processes that could be executed simulta- the Cedar mechanism did not use time-outs t o neously. A process has variables that are no t prevent indefinite waiting by a client, but mad e directly accessible to other processes, has an use of special probe messages to detect whether common proce- initial statement, and has some the called server was still running . dures . A process can call local common proce- dures or the common procedures of other pro- cesses (external-request RPC) . An external re- An extension of P+, a language for applica- quest call has the syntax : tions in control system, to support RPC wa s proposed in [15] . The control system in which call . P+ was implemented already had a remote ex- (, ) ecution protocol . Any programmer could re- where identifies the process in quest the remote execution of any part of the which the common procedure i s program, which was written in a specialized to be executed. Before the procedure is ex- syntax. This part was transmitted in ASCII ecuted, the values of the call code to an interpreter (server) in the remot e are assigned to the input parameters . After machine . The server executed the transmit- execution, the output parameters are then as - ted source code interpretatively, and returned signed to the of the call. Dis- the results to the calling program, using a spe- tributed Processes also provided guarded com- cialized syntax. To extend P+, routine REM mands and guarded regions to allow nondeter- was introduced. The REM accepted a runtime ministic execution of statements . They allow a call descriptor, which includes the remote pro- process to make arbitrary choices among sev- cedure name, the remote computer name, and eral statements based on the current state of for each parameter, a tag that indicates the pa- its variables . rameter type and mode (read-only, read-write , One of the major pioneers of RPC-based Dis- write-only), the size of an array, and the ad - tributed Environment was the Cedar [21, 81 ] dress of the parameter . The REM encoded the project that was developed around 1984 . The descriptor in a datagram format written in th e RPC Facility was mostly based on the con- specialized syntax (containing the necessary in - cepts described in [53]. The Interface De- terpretative code in ASCII) and binary values scription Language used was Mesa [51] with of the parameters . Making use of the remot e no extensions, and the user could import an d protocol already present in the system, REM export only interfaces, not procedures . The sent the datagram to the server and waited for Grapevine [12, 62] distributed database was the results. The server executed the code re- used as a means to store the exported infor- ceived interpretatively, which was a single pro- mation. The user was responsible .for export- cedure call, that was unaffected by the call is- ing and importing the interfaces before making sued by REM .

262 ARGUS [47] is an integrated programmin g cationProcedure that is tailored to the invoca- language and system for distributed program tion procedure it defines. On the caller side, th e development . The fundamental concept in the usual stub is built . On the EPL level, the lan- language is atomicity . The logical units of dis- guage provides ReceiveOperation, ReceiveSpe- tribution are guardians and an atomic activity cific, and ReceiveAny that are used directly by is an action. An action can be completed either the programmer. The Receive Operation takes a by committing, when all involved objects take set of operations, and returns only when one o f on their new states, or by an abort procedure, them is called, whereas ReceiveSpecific is pro- where the effect is as if the action had never vided only when universal or singleton sets are started. A distributed application consists of a wanted. set of guardians that control access to a group Modula-2 was also extended to support RPC of resources available to its users only through calls [1], on top of the V System implemen- functions named handlers (local and possibly tation for SUN workstations . The goal was remote functions). A guardian contains pro- to allow entire external modules, as defined in cesses that execute background tasks and han- Modula-2, to be remote, that is, a callee woul d dlers. Each call to a handler triggers the spawn- be defined by the definition module and imple- ing of a separate process inside that guardian , mented (specified) by the implementation mod- which can execute many calls concurrently . A ule. A major effort was made to generate as fe w guardian can be created dynamically, and th e as possible dissimilarities between a remote ex - user need only specify the node in which it mus t ternal module and a local module . Only a few be created . Guardian and handler names are attributes and constraints were added to th e first-class elements. They can be exchanged definition module ; a stub generator was imple- through handler calls, which avoids the explicit mented for a particular definition module that importing and exporting of functions . Handler generated the callee and caller stubs as well a s calls are location independent, and continue t o headers; and a small runtime module was de- work even though the corresponding guardian veloped that supported the paradigm during changes its location . They have at-most-once execution time. The binding was done auto- semantics (defined in Section 5 .1) and they are matically the first time that a remote function based on the termination model of exception , was called, and this binding lasted for the en - terminating either normally or in one of several tire execution time. Null RPCs, with no argu- user-defined exception conditions . ments, executed in around 3 .32 milliseconds . In the Eden system [2], a distributed appli- SR [6] is a distributed programming languag e cation is a collection of Eden objects or Ejects . that was concerned mainly with expressivenes s Each Eject has a concrete Edentype, the piece (to make it possible to solve relevant problems of code executed by the particular Eject, an d in a straightforward way), efficiency in com- an abstract Edentype, which describes its be- piling and executing, and simplicity in under - havior. Several concrete Edentypes can imple- standing and using the language . The proble m ment the same abstract Edentype . Ejects com- domain was distributed programming. SR pro- municate via invocations procedures from one vides primitives, which, when combined in a va- Eject to another. To issue an invocation, an riety of ways, support semaphores, rendezvous, Eject must have a capability for the callee Eject . asynchronous message-passing, and local and Each Eject has one capability to which invoca- remote procedure calls . SR's main component tions can be addressed . Invocation procedures is resources, which have a specification part and have CallersRights as a parameter, allowing th e an implementation part . The specification part callee Eject to check the caller's right . The defines the operations provided by the resource Eden Programming Language (EPL) provides that is implemented by possibly several pro- facilities to control concurrency within Ejects. cesses running on the same processor . All re- If the code fragments appear in different Ejects , sources and processes are created dynamicall y it is the the job of the EPL to ensure parame- and explicitly by execution statements . To call ter packing and stub generation . On the callee operations, SR provides mainly the call and side, the translator builds a version of CallInvo- send statements . The call statement has the

263

1 semantics of a procedure call, either locally o r (NCS), a portable NCA implementation . Het- remotely, while send has the semantics of semi- erogeneity was then included in the RPC asynchronous message passing . A send invoca- designers concerns, but mainly at the leve l tion is returned after the call message is deliv- of the machine data representation. To in- ered to the remote machine . On the callee side , terconnect heterogeneous distributed systems , SR provides proc and in statements to deter- NCA designed an RPC facility support calle d mine the operations execution time . An oper- NCA/RPC, a Network Interface Definitio n ation declared as a proc operation triggers a Language called NIDL, and a Network Dat a separate process to execute each call received . Representation called NDR . The NCA/RPC is The in statement provides for the execution of a transport-independent RPC system built o n a selected operation using the value of its as- multiple threads of execution in a single ad- sociated Boolean expression . A receive state- dress space. A replicated object location bro- ment waits for a call of an specified operation, ker is used for location purposes. The con- and then saves the arguments of the call, but cept of object-oriented programming was ex- the call is not executed . To allow the early tended to abstract the location (where it is termination of an operation, the return state- going to be executed) from the user, making ment terminates the smallest enclosing in stat- objects the unit of distribution, reconfigura- ment or proc statement . Similarly, the reply tion, and reliability. The unit to be exporte d statment terminates the respective statement s or imported was an object, an object type, o r without terminating the process . It provides an interface . Location transparency was en- for conversations, in which caller and callee forced by the flexibility of importing only ob- continuously exchange information . jects or functionality independently of thei r LYNX [64] supports RPC calls on the top of location. The replication aspect of the sys- links, which are built-in data types . A link is tem as a whole brought failure transparency, a virtual circuit abstraction that represents a because several objects (processes) could ex- resource. Callee processes can specify the pro- port the same functions . One important exten- cesses that can be bound to the other end of th e sion was the flexibility of having user-designed link. In this way, LYNX provides for some sym- procedures to marshal and demarshal complex metry between caller and callee, because bot h data structures, thereby widening the usability can choose with whom to communicate . Pro- of the paradigm . It also provided the possi- cesses can exchange link bindings, and once ex- bility of attaching binding procedures to dat a changed, the previous owner loses access to th e types defined in the interface, allowing the de- link. Procedures are called through connec t velopment of even more customized code . An statements that use a specified link to send a extensive analysis of the major advantages of call through it. This operation blocks until the Apollo NCA over the SUN/ONC (Open Net- results are returned in a way similar to proce- work Computing) system is presented in [36] . dure calls. The process bound to the other end SUN RPC system [79] is a stub generator of the link accepts the call, executes it, and called rpcgen, an eXternal Data Representa- replies the results back, unblocking the caller tion [78] (XDR), and a runtime library. From process. A process can contain a single threa d an RPC specification file, rpcgen generates the of control that receives requests explicitly, or a caller and callee stubs, and a header file that process can receive them implicitly, where each is included by both files. Each callee's host appropriate request creates its own thread au- must have a daemon that works as a port map- tomatically. A link end can be bound to more per that can be contacted to locate a specific than one operation, which leads to a type se- program and version . The caller has to spec- curity check on a message-by-message basis . ify the host's name and the transport protocol Around 1987, Apollo presented the Net- to acquire an RPC handler. The RPC han- work Computing Architecture [22] (NCA), a n dler is used as the second argument of an RP C object-oriented environment framework for the call. SUN supports either TCP or UDP (ex- development of distributed applications, to- plained in Section 7 .1) . An RPC call returns a gether with the Network Computing System NULL value in case any error occurs . An effort is

264 made to provide at-most-once semantics . Each RPC mechanisms that were heterogeneous on RPC handler has a unique transaction identi- several levels, but mainly on the system level . fier (id) . Each distinct request using the RP C There was considerable and broad reusabilit y handler has a different value that is obtained of components already built . HRPC divided an by slightly modifying the id . On the caller side, RPC System into five major components : the the protocol uses tests to match this id before compile-time support (programming language , returning an RPC value. This matching guar- IDL and stub generator), the binding protocol , antees that the response is from the correct re - the data representation, the transport proto- quest. col, and the control protocol (basically a run- If the protocol chosen is UDP, the UDP calle e time support). The localized translation was has an option of recording all of the caller's re- another basic principle adopted by HRPC : if quests that it receives. It maintains a cache there exists an expert to do the job, let it do it , with the result values obtained that can be in- instead of concentrating a narrower knowledg e dexed by the transaction id, program number , in a more-general purpose component . For in- version number, procedure number, and caller's stance, one way of implementing a name serve r UDP address. Before executing a request, the to deal with heterogeneous databases is to build callee checks into this cache, and if the call is a centralized database capable of doing all the duplicated, the result value saved into the cache different searches and conversions between sys- is simply returned, on the assumption that the tems. HRPC's name service (NSM) is a simple previous response was lost or damaged . bridge between the several homogeneous name For security, SUN provides three forms of au - services that are connected to the RPC mecha- thentication. The Null authentication is the de- nism. Lazy decision making or procrastination, fault situation, where no authentication occurs . was also adopted, because in dealing with het- The Unix authentication transmits with every erogeneous systems, an a late-as-possible com- ROC request a timestamp, caller 's host name , mitment represents the most flexible one . caller 's user id, calle r's group id, and a list of all Marionette [77] is a software package base d other group ids to which the caller belongs to . on the master/slave model for distributed en- The callee decides, based on this information , vironments. A distributed application in this whether it wants to grant the caller's request . model is one master process and many slave The DES authentication uses the SUN RP C processes, all of them sequential . The interac- protocol. tion is done either through shared data struc- Mainly concerned with heterogeneity, and tures created and updated by the master pro- searching for accommodation of old and new cess, or through operation invocation, whic h distributed systems in a single environment , can be a worker operation or a context opera- HCS [55, 10] was designed . Major component tion . The context operations modify the slaves ' was the HCS Remote Procedure Call (HRPC) , process state and are executed by all slaves . designed on a plug replacement basis, where Worker operations are each executed in a single the basic components could be independentl y slave, are not specified by the master, and are built and plugged together . The goal of HRP C invoked through asynchronous RPC. Later, the was to identify (factorize) the major compo- master explicitly accepts the operation's result . nents of an RPC system, and to define pre- To deal with heterogeneity at the data repre- cisely their interfaces . In this way, component s sentation level, all data exchanged between ma - could be replaced easily, if the new compo- chines is translated into SUN External Dat a nents supported the interface and functional- Representation [78] (XDR). The conversion is ity of the former one and hide the implemen- done through user-defined routines . The im- tation details. The interconnection of closed port and export of functions is avoided by hav- systems (systems in which changes were nei- ing all programs specify linkage arrays . One ther possible nor wanted) were accomplished by linkage array contains descriptors for worke r simply building components that emulated the operation functions, another contains descrip- behavior of the systems. This interconnection tors for context operation functions, and an- made possible the interconnection of different other contains descriptors for types of share d

265

a data structures . An index into one of the ar- mand. A caller process can issue the parallel rays works as a function identifier . execution of several callees and resumes con- C++ was also extended to support RPC , trol only when it receives a reply from all calls. concurrency, exception handling and garbage Each machine has a spawner process that sup- collection to form a superset of C++ name d ports the creation of callee processes on de- Extended-C++ [66]. A translator was designe d mand . It is reached via RPC using a well- to compile code in Extended-C++ into C+ + known transport address . code, that along with a runtime library, sup- In Concert [90, 7], a distributed process ported the extensions . For the support of RPC, model and an interface description language attributes were added to the syntax that al- are designed . The ultimate goal of Concert lowed the declaration of classes as remotable : is to allow the interprocess communication of its member functions were called remotely. A processes that support its process model an d class was instantiated only for executing th e IDL. The interoperability is possible even i n callee side of RPCs. In this situation, only the the presence of heterogeneity at the languag e class declaration was required in the caller's level, operating-system level, or machine level. side code, avoiding linking the code with the The process model is integrated at the lan- definitions of the functions that can be made re- guage level, which provides greater portabil- mote. There is an explicit distinction between ity of the concept . Languages usually must a pointer to a local value and a pointer to a be minimally extended to support the model . remote value, and the user must be careful t o Extensions encompass process dynamics, ren- dereference correctly, or else a compilation er- dezvous, RPC, and asynchronous RPC. The ror occurs. The pointer is declared as being model defines additional operations and thre e remotable and the syntax used for dereferecing data abstractions : processes, ports, and bind- it is different from the syntax to dereference a ings. The processes consist of a thread of con- local pointer . Only a subset of the types can trol and some encapsulated data local to a pro- be used as the types of remotable function ar- cess. Processes can be dynamically created an d guments or returned values . A type in this se t terminated at the language level, and may no t is a transmissible type . The fundamental dat a correspond to operating-system processes . The types and a remotable pointer are transmissi- interprocess communication is achieved via di- bles. A local pointer is not transmissible : a rected typed channels, whose receiving ends ar e user-defined encoder and decoder are neede d represented by ports . The bindings represent for each class to make it a transmissible class . the capability to call a port or to send a typed For encoding and decoding complex structures , message to. The operations allow the manipu- such as, circular linked list the degree of shar- lation of these abstract types, such as, the dy- ing of remotable objects can be controlled . An namic creation of processes, ports, and bind- object in Extended-C++ can be encoded an d ings, the query as to whether the ports hav e decoded as a shared object or as a value object . messages queued, the ability to wait for mes- During a remotable call, a shared object can sages to arrive on specific ports, and the abil- be encoded (decoded) at most once, whereas ity to process these messages and to reply if a value object has no limit on the number o f needed. times it can be encoded (decoded) . This avoids The ports introduce a new storage class, looping during encoding (decoding) for circular which have queues associated with them t o lists, for instance, through the correct specifi- hold calls. Functions can be declared to b e cation of an object encoder (decoder) function . in the port storage class, and can be passed DAPHNE [49] consists of language-indepen- to other processes, and can then be used fo r dent tools and runtime support to allow pro- interprocess communication . On the callee grams to be divided into parts for distributed side, the function accept() receives a collec- execution on nodes of a heterogeneous com- tion of port storage-class functions and exe- puter network. A callee process is not share d cutes a rendezvous on one of the ports . If between concurrent callers belonging to differ- no calls are pending on any of the port func- ent users, and is private and created on de- tions, the accept function waits for a call to

266 become pending on at least one of the ports . Otherwise, it selects one of the port functions nondeterministically. The accept function de- marshals the arguments, invokes the respective function, and transmits the results to the caller after the function is finished . Concert is an ambitious design in which no t only RPC but also asynchronous communica- tion and process dynamics are provided at the language level . To support asynchronous com- munication, Concert adds two new data types and five new operators . The receiveport op- erator specifies ports for one-way messages; the cmsg (call message) operator specifies a two- way message that was received, but not replie d to yet; the send operator sends a message in a non-blocking manner, that is, the caller is un- blocked right after the message is sent, proba- bly before it is received and processed ; the poll operator allows a process to check whether mes- sages are present at any of a collection of ports , but without processing the messages ; the send operator is similar to the poll operator, but blocks if no call is pending and until some other process makes a call ; the receive oper- ator receives (dequeues) a message of any kin d from any kind of port ; and the reply oper- ator replies to a two-way message which wa s received and represented as a cmsg operator . Hermes [75] was the first language of th e Concert family to be implemented . Hermes i s a process-oriented language that fully supports all of the Concert process model in type-safe manner .

About the Author

Patricia Gomes Soares received the B .Sc . and M .Sc . degrees in Computer Science from Universidade Federal de Pernambuco (UFPE) , Recife, Brazil, in 1987 and 1989, respectively . Since 1989 she is a Ph .D . candidate in the Com- puter Science Department, Columbia Univer- sity, New York . Her research interests include languages, compilers, and software systems i n general . She can be reached at the address Com- puter Science Department, Columbia Univer- sity, 450 Computer Science Building, New York, NY, 10027 . Her Internet address i s [email protected] .

267