<<

Detection in Distributed Databases

EDGAR KNAPP

Department of Computer Sciences, University of Texas at Austin, Austin, Texas 78712

The problem of deadlock detection in distributed systems has undergone extensive study. An important application relates to distributed database systems. A uniform model in which published can be cast is given, and the fundamental principles on which distributed deadlock detection schemes are based are presented. These principles represent mechanisms for developing distributed algorithms in general and deadlock detection schemes in particular. In addition, a hierarchy of deadlock models is presented; each model is characterized by the restrictions that are imposed upon the form resource requests can assume. The hierarchy includes the well-known models of resource and communication deadlock. Algorithms are classified according to both the underlying principles and the generality of resource requests they permit. A number of algorithms are discussed in detail, and their complexity in terms of the number of messages employed is compared. The point is made that correctness proofs for such algorithms using operational arguments are cumbersome and error prone and, therefore, that only completely formal proofs are sufficient for demonstrating correctness.

Categories and Subject Descriptors: .2.4 [Computer-Communication Networks]: Distributed Systems-distributed applications; distributed databases; network operating systems; D.4.1 [Operating Systems]: Management-concurrency; ; synchronization; D.4.7 [Operating Systems]: Organization and Design-distributed systems; H.2.4 [Database Management]: Systems-distributed systems; transaction processing General Terms: Algorithms Additional Key Words and Phrases: Deadlock detection, deadlock models, distributed deadlocks

INTRODUCTION selection of a so-called victim whose roll- back or abortion breaks the deadlock, and Deadlock detection is an important prob- finally, deadlock resolution itself. This pa- lem in database systems (DBSs), and much per is concerned only with the aspect of attention has been devoted to it in the deadlock detection. Recent developments research community. Generally speaking, a in the area of distributed deadlock detec- deadlock situation is the possible result of tion algorithms are surveyed, with a special competition for resources, such as multiple emphasis on their relation to distributed database transactions requesting exclusive DBSs. The paper introduces a uniform access to data items. framework for the discussion of these al- The deadlock problem has several inter- gorithms. The abstraction achieved this esting components. Among these are dead- way allows us to talk about the algorithms prevention, deadlock avoidance, and- in terms of the underlying theoretical con- in connection with deadlock detection-the cepts, instead of just giving a phenomeno-

Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission. 0 1988 ACM 0360-0300/87/1200-0303 $1.50

ACM Computing Surveys, Vol. 19, No. 4, December 1987 304 l Edgar Knapp

CONTENTS developed. Section 2 gives a systematic classification of the models of deadlock as they appear in database applications. A hierarchy of models that gives rise to one INTRODUCTION way of classifying most of the distributed 1. THE DEADLOCK PROBLEM deadlock detection procedures found in the 1.1 A Brief Introduction to literature is introduced. Another classifi- 1.2 Deadlock in Centralized Systems cation, focusing on the theoretical princi- 1.3 Deadlock in Distributed Databases 1.4 The Database Model ples underlying the work in distributed 1.5 A Specification of the Deadlock Problem deadlock detection schemes, is given in Sec- 1.6 Centralized versus Distributed Deadlock tion 3. A survey of a number of algorithms Detection can be found in Section 4, with examples 2. MODELS OF DEADLOCK 2.1 One-Resource Model from each of the classes introduced in the 2.2 AND Model two previous sections. In the final section 2.3 OR Model the relative merits of the algorithms pre- 2.4 AND-OR Model sented are discussed. The references con- 2.5 (;) Model tain an exhaustive list on the work done in 2.6 Unrestricted Model 3. CLASSES OF DISTRIBUTED DEADLOCK the field of distributed deadlock detection DETECTION ALGORITHMS after 1980. Earlier papers included consti- 3.1 Path-Pushing Algorithms tute “classical articles” related to the 3.2 Edge-Chasing Algorithms subject. 3.3 Diffusing Computations 3.4 Global State Detection 4. A SURVEY OF SELECTED ALGORITHMS 4.1 Obermarck’s Path-Pushing 1. THE DEADLOCK PROBLEM 4.2 Mitchell and Merritt’s Algorithm for the Single-Resource Model 1.1 A Brief Introduction to Concurrency 4.3 Chandy and Misra’s Algorithm Control for the AND Model 4.4 Chandy, Misra, and Haas’s Algorithm The deadlock problem in DBSs is part of for the OR Model the area of concurrency control.’ Concur- 4.5 Hermann and Chandy’s Algorithm for the AND-OR Model rency control deals with the problem of 4.6 Brscha and Toueg’s Algorithm coordinating the actions of processes that for the (;) Model operate in parallel, access shared data, and 5. DISCUSSION therefore potentially interfere with each ACKNOWLEDGMENTS other. The object of study is an abstraction REFERENCES BIBLIOGRAPHY (model) of many different types of infor- mation systems. The main component of this model is the transaction. Informally, a transaction is an execution of a program that accesses a shared database. In our logical description of the workings of the model transactions are characterized by a algorithms (cf. Elmagarmid [1986]). sequence of operations, for example, R(x) The paper is organized as follows. Sec- denoting the operation of reading some tion 1 focuses on the relationship between data item x from the database and W(X) the deadlock problem and DBSs. For the standing for the operation of assigning a benefit of those readers not familiar with new value to data item x in the database. the necessary database terminology, a brief When two or more transactions execute outline of the relevant concepts is given. concurrently, their database operations ex- For a more thorough treatment of this ma- ecute in an interleaved fashion. That is, terial, the reader is referred to recent operations from one transaction may exe- texts on concurrency control, for example, cute in between two operations of another Bernstein et al. [1987] and Papadimitriou transaction. This interleaving can cause [1987]. Next, a database model is pre- sented, and a specification of the deadlock 1 Part of Section 1.1 follows the introductory chapter detection problem in terms of this model is of Bernstein et al. 11987 1.

ACM Computing Surveys, Vol. 19, No. 4, December 1987 Deadlock Detection in Distributed Databases 305

transactions to behave incorrectly or inter- Hence both transactions are blocked, wait- fere, thereby leading to an inconsistent ing for each other: a deadlock situation. database state. Informally, deadlock in a DBS can be The part of the DBS that controls the defined as “a situation in which each trans- relative order in which database operations action in a set of transactions is blocked requested by transactions execute is called waiting for another transaction in the set, the scheduler. The scheduler determines and therefore none will become unblocked the interleaving of database operations unless there is external intervention” (cf. such that the consistency of the database Bernstein et al. [1987]). is preserved. A particular such interleaving Even though some concurrency control of database operations is called a schedule. protocols are provably deadlock free (e.g., As an example, consider the schedule conservative 2PL,3 tree locking), most given below involving two transactions tl known protocols are vulnerable to dead- and t2 and two entities x and y (the notation lock. We next look at a number of other follows Bernstein et al. [1987] and Papa- ways in which deadlock can arise in a DBS. dimitriou [ 19871): Let us consider the case of multiversion schedulers, where each write operation on t1: R(x) W(Y) some data item produces a new version of tz : R(Y) W(x) this item, and each read operation of an This schedule formalizes the following se- item is mapped to some version of this item quence of database operations: transaction that was written previously. In the proto- tl first reads database item x, then t2 reads col for a multiversion-view-serializability y, followed by tl writing y, and finally t2 (MV-VSR) scheduler, the last step of a writing x. transaction is treated in a special way to ensure that at most one uncommitted 1.2 Deadlock in Centralized Systems version exists for any data item in the database. If such a scheduler is given the There are many different known strategies schedule of the previous example, the for schedulers for solving the problem of following happens: concurrent accesses without compromising database consistency. A detailed discussion tl reads x, of these strategies is beyond the scope of t:! reads y, this paper. The interested reader is referred tl waits for t2 to finish (commit), to Bernstein et al. [1987] and Papadimi- t2 waits for tl to finish (commit), triou [ 1987 1. The most popular of these strategies is and, again, the result is deadlock. so-called locking. Locking is the strategy of Lock conversion is a concurrency control reserving access rights (locks) that prevent technique that allows upgrading of a lock other transactions from obtaining certain to a stronger lock type, such as converting other (conflicting) locks. a read lock on a data item into a write lock As an example, consider a protocol called on the same item. Schedulers that allow for basic two-phase locking (2PL), which is lock conversion are prone to deadlock sit- widely in used in commercial systems. In uations for yet another reason. To see this, this protocol’ a transaction that has re- consider the following example: leased a lock may not subsequently obtain any more locks. If this strategy is applied tl: R(x) to the example schedule given above, the W(x) tz : following scenario is bound to happen: R(x) W(x) tl locks x, After both reads have been performed, with tz locks y, tl and t2 holding read locks on x, neither tl waits for t2 to release the lock on y, transaction can convert its read lock into a t2 waits for tl to release the lock on x. write lock; hence they are blocked forever.

* Also called dynamic 2PL in Papadimitriou [1987]. 3 Also called static 2PL in Papadimitriou [1987].

ACM Computing Surveys, Vol. 19, No. 4, December 1987 306 . Edgar Knapp

Multigranularity locking is a method subsumes-among others-the traditional whereby transactions can lock different resource and communication models. granularities of data items, such as a record, a disk page, or an entire file. In multigran- 1.3 Deadlock in Distributed Databases ularity locking protocols, deadlock can oc- cur for more than one reason. First, we In general, a distributed DBS consists of a have the problems due to the already men- number of sites, each of which constitutes tioned two-phase rule and lock conversion, a centralized system. Hence all problems of especially in conjunction with lock escala- the previous section plus additional ones tion, a technique in which a transaction due to the distributed nature of the data- that obtains too many locks on data items base (e.g., replication of data, single trans- of small granularity can increase the gran- actions executing in parallel at different ularity of its subsequent lock requests. An- sites) are present. Also, distributed dead- other problem arises when granules are lock is harder to detect, since each site has structured hierarchically in a rooted di- only a local view of the whole system, and rected acyclic graph. In this case a locking hence collaboration of the sites is required protocol may require a transaction that to detect deadlocks involving more than wants to lock some set of granules to lock one site. a majority of parents of these granules first. Both resource and communication dead- If two transactions happen to try locking locks can be distributed. In distributed the same set of granules, they may get to a DBSs, transactions that access nonlocal point at which they both hold locks on data migrate to other sites by invoking exactly half of the parents of the set so that subtransactions that may run concurrently neither of them will succeed. If more than with each other. So the originating trans- two transactions are competing for an in- action is blocked until all subtransactions tersecting set of granules, then deadlock is terminate, an indication of the resource even more likely to result. model. Notice that there is a fundamental dif- Communication deadlock can occur if ference between deadlocks due to majority in a replicated database a transaction re- locking and the other schemes mentioned quests the value of some nonlocal data above. The former has been termed com- item and is blocked until one of the sites munication deadlock, since it was first stud- that hold a copy of this item responds. ied in systems of communicating processes, Furthermore, one can conceive of subtrans- where a process waits to communicate with actions running in parallel on a repli- any one from a set of neighbors. The same cated database, resulting in situations principle underlies majority locking, in in which resource and communication which a transaction that is blocked can models are interwoven. proceed after some other transaction re- As an example of such an interplay be- leases its lock on a parent of the granule in tween both models, consider a distributed question. DBS with replicated data. Gifford [1979] The other examples demonstrate what has shown that in order to preserve data- has been called a resource deadlock, which base consistency, a transaction that wants assumes that a process becomes unblocked to read (write) a replicated data item, must only after it receives all the resources for read (write) r (w) copies out of the n copies which it is waiting. In the case of a database of the data item such that r + w > n and model in which a transaction is either ac- 2w > n. This is to ensure that at most one tive or waiting for exactly one resource, the writer has access to a replicated data item distinction between resource and commu- at a time. To read or write some copy of a nication deadlock is irrelevant since they data item, a transaction must request and reduce to the same concept. obtain a lock on this copy. Therefore, the As we shall see in Section 2, there is a reading and writing of a data item generate whole hierarchy of deadlock models that so-called (:) and (Z) resource requests, re-

ACM Computing Surveys, Vol. 19, No. 4, December 1987 Deadlock Detection in Distributed Databases 307 spectively. These requests require essen- further that if a single transaction runs by tially a combination of the resource and itself in the distributed DBS, it will termi- communication models. nate in finite time and release all resources. A uniform model of a distributed DBS When two or more transactions run in par- underlying most of the algorithms found in allel, deadlock may arise. the literature is made precise in the follow- A transaction agent is said to be idle if it ing section. is waiting to acquire a resource; it is said to be executing if it is not idle. Thus, if an agent never acquires a requested resource, 1.4 The Database Model it is permanently idle. For notational sim- We introduce a model due to Menasce and plicity, we may assign a single identifying Muntz [1979] for studying deadlock detec- subscript (rather than a double subscript) tion algorithms for distributed DBSs. A to an agent. Hence ti denotes the ith agent. distributed DBS consists of a collection of In subsequent sections we often refer to N sites, S1, Sz, . . . , SN, connected by a processes instead of transaction agents. communication network. The network is Processes are more powerful than transac- assumed to be reliable and fully connected. tion agents. They are assumed to know the Each site is a centralized DBS that stores identities of all the processes they are wait- some portion of the database. There are M ing for, for example, by having access to transactions, TI, T,, . . . , TM running on their controller’s tables. Besides sending the distributed database. A transaction pre- request and release messages like transac- sents resource requests to a transaction tion agents, they can also exchange other manager (TM), also called controller. messages. This means that, with respect to There is one controller Ci per site Si. A deadlock computations, transactions are resource request may be a request to lock passive objects, whereas processes are ac- some data item or may have a more abstract tive participants in deadlock detection. In meaning. A transaction is blocked from the our database model, processes can be time it presents a request to a TM until the thought of as belonging in part to the TM grants the request and the transaction controller and in part to the transaction. becomes active. A resource request can be Although there can be at most one trans- local or can refer to a resource at another action agent per transaction at each site, in which case the transaction is dis- site, there is no such restriction for pro- tributed. A distributed transaction Ti is cesses. As for controllers, implemented by transaction agents tij , each between processes is assumed to be first in, of which is the local agent for transaction first out. Ti at site Sj. In case a transaction agent tij At any time a process is in one of two requests a nonlocal resource that is man- states: blocked or executing. A process is aged by some controller C,, controller Cj blocked from the time it issues a resource transmits the request to agent ti, via con- request until it receives a grant message for troller C,. When ti, acquires the requested the requested resource. If a process never resource from C,, it sends a message to tij receives a grant message for which it is (via C, and Cj) stating that the resource waiting, it is permanently blocked. While a has been acquired. Hence intersite requests process is blocked it may not send any are always between two agents of the same request or grant messages. It may, however, transaction. send and receive other messages or perform When agents in a transaction Ti no other tasks (e.g., related to deadlock detec- longer need a resource managed by con- tion). Examples of this behavior are given troller C,, they communicate with agent in later sections. tim, which is responsible for releasing the Henceforth we refer to either transaction resource to C,. We assume that messages agents or processes, depending on which sent by any controller Ci to Cj arrive se- variant of the model is more appropriate quentially and in finite time. We assume for our discussion.

ACM Computing Surveys, Vol. 19, No. 4, December 1987 308 l Edgar Knapp

1.5 A Specification of the Deadlock Problem lock detection, Bernstein et al. [1987] show A transaction wait-for-graph (WFG) is a that as long as transactions follow a 2PL mathematical model of resource requests. protocol, a phantom deadlock can occur The vertices of the graph are associated only if some transaction spontaneously aborts. In accordance with most of the ar- with transaction agents (or processes, de- ticles on the subject, for the purpose of our pending on the context). Directed edges in discussion we assume that the DBS is free the graph represent relations be- tween transaction agents (processes). A of spontaneous abortions. vertex with outgoing edges corresponds to an idle transaction agent (blocked process). 1.6 Centralized versus Distributed Deadlock More precisely, there is an edge in the WFG Detection from transaction agent tij to tkj if controller Cj has a request from tij for resources held There are a number of reasons why distrib- by tkj; such an edge is called an intracon- uted deadlock detection seems more attrac- troller edge. There is an edge from tij to tin tive than a centralized scheme, that is, one

if tij is waiting for a grant message from ti, in which a single agent is responsible for (that it has acquired a resource managed deadlock detection. First, a centralized by C,,,); such an edge is called an intercon- deadlock detection algorithm is vulnerable troller edge. to failures of the central detector. Hence A cyclic structure in this graph indicates special provisions for this kind of faults a deadlock. The precise definition of the have to be made, resulting in long delays term cyclic structure depends on the dead- until a new central agent is determined and lock model we are considering. An example supplied with up-to-date wait-for informa- is given in Figure 1. There are five trans- tion. Distributed algorithms deal with these actions Z’,-Z’,, implemented by eight trans- kinds of problems in a much more natural action agents. The directed edge from node way. Furthermore, because of the heavy tll to node tzl indicates that transaction traffic to and from the central agent, this agent tll is blocked. This edge is an intra- agent can constitute a performance bottle- controller edge, whereas edge (tzl, tzz) is an neck, limiting the overall performance of intercontroller edge. Node tzz has no out- the DBS. going edges and is therefore active (non- More evidence for the superiority of dis- blocked). Node tll has two outgoing edges, tributed schemes is supplied by the obser- which means that it has two outstanding vation that for typical applications most resource requests. Note the presence of the WFG cycles are very short. Bernstein et al. cycle tll + tsl + t33 + t43 + & + tll in the [1987] give theoretical reasons for the pre- WFG. This cycle may or may not indicate dominance of short paths in WFGs. In par- a deadlock, depending on which deadlock ticular, for most applications over 90% of model we adopt. The relationship between WFG cycles can be expected to be of length deadlocks and WFGs is made more precise 2. The same figure also appears in an em- in Section 2, when we talk about specific pirical study [Gray et al. 19811. deadlock models. The observation that deadlock cycles are The correctness of a deadlock algorithm short makes centralized deadlock detection depends on two conditions. First, every an even less attractive choice. With a global deadlock must be detected eventually. This algorithm there may be a significant time constitutes the basic progress property any and message overhead in assembling all the solution must have. Second, if a deadlock local WFGs at the global detector. Thus, a is detected, it must indeed exist (safety distributed deadlock might go undetected property). Incorrectly detected deadlocks for quite a while. Since most deadlocks due to message delays and out-of-date involve only two sites, they can detect the WFGs have been termed phantom dead- deadlock more efficiently by communicat- locks. In the presence of spontaneous aborts ing directly. no deadlock scheme can guarantee to detect Mitchell and Merritt [1984] present a only genuine deadlocks. For global dead- fully distributed deadlock detection algo-

ACM Computing Surveys, Vol. 19, No. 4, December 1987 Deadlock Detection in Distributed Databases l 309

I Site 1

Figure 1. Example of a WFG.

I I rithm that has a very simple and appealing is given in the next section. The model is correctness proof and that, according to the widely used in theoretical studies of data- authors, had been implemented in a DBS base systems (cf. Bernstein et al. [1987] in less than an hour. This dissents from the and Papadimitriou [1987]). A very simple widely held opinion that distributed algo- and elegant algorithm for deadlock detec- rithms are necessarily more complex and tion in the one-resource model appears in harder to prove and implement than cen- Mitchell and Merritt [1984] and is de- tralized schemes. scribed in Section 4.2.

2. MODELS OF DEADLOCK 2.2 AND Model Depending on the application, database In the AND model, transactions are per- systems allow a number of different kinds mitted to request a set of resources. A of resource requests. For example, a trans- transaction is blocked until it is granted all action might need to acquire a combination the resources it has requested. Therefore, of resources like (resource a and resource requests of this type are called AND re- b) or resource c. This section introduces a quests. The AND model is identical to the hierarchy of request models used in the resource model mentioned in Section 1.2. literature, starting from very restricted We prefer the term AND model for system- forms and going to models with no restric- atic reasons. The AND model has been the tions whatsoever. This hierarchy can then traditional view of resource requests in dis- be used to classify deadlock detection al- tributed DBSs. The nodes of the WFG are gorithms according to the complexity of the called AND nodes and may have outdegree resource requests they permit. greater than 1. The problem of detecting deadlocks again reduces to finding cycles 2.1 One-Resource Model in the WFG. The simplest possible model is one in which As an example, again consider the WFG a transaction can have at most one out- given in Figure 1. Node tll has two out- standing resource request at a time. Hence standing resource requests, and in the case the maximum outdegree of the WFG is 1. of the AND model both must be satisfied Finding deadlocks in this model corre- before tll becomes active. The example de- sponds to finding a cycle in the WFG. A picts a deadlock situation, corresponding to formal justification for this correspondence the cycle tll + h1 + t33 ---, t43 + t41 --, hl.

ACM Computing Surveys, Vol. 19, No. 4, December 1987 310 l Edgar Knapp

More precisely, we define deadlock in the quested resource, such as satisfying a read AND model along the lines of Chandy and request for a replicated data item by read- Misra [1982] as follows: A transaction ing any copy of it. This model was referred agent ti is said to be dependent on agent tj to as communication model in Section 1.2. if there is a sequence seq = ti, ti,, . . . , ti,, tj In the OR model, discovery of a cycle is of transaction agents such that each agent insufficient for deadlock detection. To see in seq is idle and each agent except the first this, suppose all requests in Figure 1 are holds a resource for which the previous OR requests; the nodes are then called OR agent in seq is waiting. We define ti to be nodes. In this case, transaction T1 is not

locally dependent on tj if all the agents in deadlocked because tzz has no outgoing seq belong to the same controller. Observe edges, and after Tz releases the resources it that, if ti is dependent on tj , then ti remains holds, T, can continue. idle at least as long as tj does. Furthermore, In terms of the WFG, a knot will indicate ti is deadlocked if it is dependent on itself a deadlock [Holt 19721. By definition, a or an agent that is dependent on itself. In vertex IJ is in a knot if (VW :: w is reachable either case, deadlock exists only if there is from UJ u is reachable from w ). Intui- a cycle of idle agents, each dependent on tively, no paths originating from a knot the next one in the cycle. have “dead ends.” Deadlock detection algorithms for the Formally, we define deadlock in the OR AND model declare that deadlock exists if model in terms of processes as follows (cf. and only if such cycles exist. Note that this Chandy et al. [1983]): A process is blocked condition does not imply that, if an agent if it has an outstanding OR request. Asso- ti is deadlocked, the detection algorithm ciated with each blocked process is a set of will detect that ti is deadlocked. In fact, if processes, called its dependent set. A ti is deadlocked but not part of a cycle of blocked process starts executing upon re- deadlocked agents, ti might never be de- ceiving any grant message from a process clared deadlocked. As an example, consider in its dependent set. Otherwise it does not transaction agent ts3 in Figure 1, which is change state or its dependent set. Intui- deadlocked even though it is not part of a tively, a set S of processes is deadlocked if cycle. all processes in S are permanently blocked. Deadlock in the one-resource model is A process is permanently blocked if it never conveniently defined the same way, with receives a grant message from any process the additional restriction that a transaction in its dependent set. More precisely, a set agent can have at most one outstanding S of processes is deadlocked if request (i.e., one outgoing edge) at a time. From this it is immediate that the AND (1) all processes in S are blocked, model is strictly more general than the one- (2) the dependent set of every process in S resource model. is a subset of S, and In the literature a number of algorithms have been proposed for the AND model (3) there are no grant messages in transit [Chandy and Misra 1982; Chandy et al. between processes in S. 1983; Gligor and Shattuck 1980; Haas 1981; Haas and Mohan 1983; Menasce and A process is deadlocked if it belongs to Muntz 1979; Obermarck 1980, 19821. We some deadlocked set. A set S of processes take a closer look at two of them [Ober- satisfying the above three conditions re- marck 1982; Chandy and Misra 19821 in mains permanently blocked because (1) a Sections 4.1 and 4.3, respectively. blocked process pi in S can start executing only after receiving a grant message from some process pj in its dependent set, 2.3 OR Model (2) every process pj in pi’s dependent set is also in S and cannot send a grant message An alternative model of resource requests while remaining blocked, and (3) there are is the OR model. A request for numerous no grant messages in transit from pj to pi, resources is satisfied by granting any re- which implies that pi will never receive a

ACM Computing Surveys, Vol. 19, No. 4, December 1987 Deadlock Detection in Distributed Databases 311 message from any process in its dependent in general, not very efficient. A more effi- set. cient algorithm that was developed in Presence of a deadlocked sei of processes Hermann and Chandy [1983] is the topic is equivalent to the existence of a knot in of Section 5.5. This section also includes a the WFG. Hence, deadlock detection in the formalization of deadlock in the AND-OR OR model can be reduced to finding knots model. Since this definition does not cap- in a graph. Note, however, that a process ture the notion of deadlock exclusively, it can be deadlocked without being in a knot. has been omitted from the more general Rather, a necessary and sufficient criterion discussion here. for deadlock of some process p is the follow- ing: A blocked process p is deadlocked if p 2.5 (;) Model is in a knot or p can reach only deadlocked processes. The algorithm for the OR model The (Z) model allows the specification of we discuss in Section 5.4 detects deadlock requests to obtain any k available resources for any process belonging to some dead- out of a pool of size n. The (E) model is a locked set. generalization of the AND-OR model. AND-model deadlock detection can be Even though it turns out that both models simulated by repeated applications of OR- are equivalent in expressive power, the model deadlock computations, where each length of an AND-OR formula correspond- invocation operates on a subgraph of the ing to an (;) request is k(Z), which is of AND-model WFG. This method, however, exponential size for n = 212,since is hopelessly inefficient, so it is only of theoretical interest. In this sense, OR- 2n def= 2n(2n - 1) . . . (n + 1) model deadlock is a more general notion 0 n n(n - 1) ... 1 than AND-model deadlock. An algorithm for distributed knot detec- 2n 2n - 2 2 ->-- . . . - tion appears in Misra and Chandy [1982a]. n n-l 1 Termination detection of diffusing compu- tations in the OR model, which is also the Y n factors model of deadlock in CSP, is discussed in Misra and Chandy [1982b]. The algorithm = 2”. presented in Section 5.4 is taken from Chandy et al. [1983]. Other algorithms for So every request in the (E) model can be the OR model are given in Haas [1981], expressed in the AND-OR model. To see Natarajan [1986], and Rauchle and Toueg that the converse is also true, observe that [1983]. any AND or OR requests for n resources can be stated as an (“n) or (7) request, re- 2.4 AND-OR Model spectively. The only definition of deadlock in the (E) model we know was given by The AND-OR model is a generalization of Bracha and Toueg [1983] and suffers the two previous models. AND-OR re- from the same deficiencies as that of the quests may specify any combination of and AND-OR model. It is, therefore, not dis- and or in the resource request. For example, cussed here. An algorithm for deadlock a request for (a and (b or c)) or d is possible, detection in the (E) model was published in and a, b, c, and d may exist at different Bracha and Toueg [1983] and is presented sites. There does not appear to be a familiar in Section 5.6. construct of graph theory to describe a deadlock situation in the AND-OR model 2.6 Unrestricted Model in terms of the WFG. In principle, deadlock in the AND-OR model can be detected by In the most general model no underlying repeated application of the test for OR- structure of resource requests is assumed. model deadlock, exploiting the fact that Instead, the stability of deadlock is the only deadlock is a stable property; that is, it does assumption made. The advantage of look- not go away by itself. But this strategy is, ing at the deadlock problem in this way is

ACM Computing Surveys, Vol. 19, No. 4, December 1987 312 . Edgar Knapp that it helps in the : WFG to a number of neighboring sites Properties of the underlying database com- every time a deadlock computation is per- putations (e.g., degree of concurrency, i.e., formed. After the local data structure of single locus of control for each transaction each site is updated, this updated WFG is versus parallelism of individual transac- then passed along, and the procedure is tions, and message passing versus syn- repeated until some site has a sufficiently chronous communication) are rigorously complete picture of the global situation to abstracted and separated from concerns announce deadlock or to establish that no about properties of the problem (stability deadlocks are present. The main feature of of deadlock). Therefore, all the algorithms this scheme, namely, to send around paths dealing with this general model can be used of the global WFG, has led to the term to detect other stable properties as well. path-pushing algorithms. However, in the context of deadlock detec- One noteworthy point about path- tion in distributed databases, these algo- pushing algorithms is that many of them rithms seem to be of more theoretical value, were found to be incorrect, either by not since the very fact that no further assump- detecting true deadlocks, by discovering tions are made about the underlying struc- phantom deadlocks, or both. For example, ture of the database computation leads to Gligor and Shattuck [1980] show that the a great deal of overhead that can be avoided algorithm of Menasce and Muntz [1979] in algorithms for the simpler models. In is defective; a counterexample to the algo- Section 3.4 we present a general theory due rithm of Ho and Ramamoorthy [1982] to Chandy and Lamport [1985], which can was presented by Jagannathan and be applied to both the previous and the Vasudevan [1982]; in Section 4.1 we unrestricted models. For more details on give reasons why Obermarck’s algorithm the subject, the interested reader is referred [Obermarck 19821 is incorrect. This is even to Awerbuch and Micali [1986], Chandy more surprising as these algorithms had all and Lamport [ 19851, Chandy and Misra been “proved” correct. [1986], Chang [1982], Helary et al. [1987], Looking back, the failure of many of and Misra [ 19831. these algorithms is not so astonishing as might first appear, since at that time the 3. CLASSES OF DISTRIBUTED DEADLOCK notion of snapshots and consistent global DETECTION ALGORITHMS states in asynchronous systems was not well understood. Another consequence of The distributed deadlock detection algo- this lack of understanding was the fact that rithms that are found in the literature de- most of the algorithms had to depend on veloped basically from four different roots: “ freezing” the underlying (database) com- path-pushing, edge-chasing, diffusing com- putation for the time the deadlock detec- putations, and global state detection. This tion was going on. This guaranteed in most observation gives rise to another way of cases that the picture of the assembled classification, which will be developed in global WFG was consistent. the next four sections. For historical reasons, an example of a path-pushing algorithm [Obermarck 19821, 3.1 Path-Pushing Algorithms which has been implemented in System R, is presented in Section 4.1. The first distributed algorithms for the deadlock problem maintained the notion of 3.2 Edge-Chasing Algorithms an explicit global WFG, which had worked so well in the centralized case. One influ- The presence of a cycle in a distributed ential algorithm appeared in Menasce and graph structure can be verified by propa- Muntz [ 19791. The basic idea underlying gating special messages called probes along this class of algorithms is to build some the edges of the graph. Probes are assumed simplified form of global WFG at each site. to be distinct from resource request and For this purpose each site sends its local grant messages. When the initiator of such

ACM Computing Surveys, Vol. 19, No. 4, December 1987 Deadlock Detection in Distributed Databases l 313 a probe computation receives a matching gaging query for pi. The process that sent probe, it knows that it is on a cycle in the the engaging query is called the engager of graph. pi. The edge along which the engaging A nice feature of this approach in con- query was sent is called the engagement nection with deadlock detection is that ex- edge of pi. ecuting processes can simply discard any After receipt of the engaging query, an probes they receive. Blocked processes internal node is free to send queries to its propagate the probe along their outgoing successors. Besides its ability to receive edges. An interesting variation of this queries from its predecessors and send method can be found in Mitchell and Mer- queries to its successors, a node is also able ritt [ 19841, where probes are sent upon to receive replies from its successors and request and in the opposite direction of the send replies to its predecessors. Notice that edge. We look at this algorithm more queries always travel in the direction of the closely in Section 4.2. edges, whereas replies always travel the Another example for this approach is opposite way. Chandy and Misra’s algorithm [Chandy We require that the number of queries and Misra 19821, which is discussed in received along an edge always be at least Section 4.3. the number of replies sent in the opposite direction. The difference between the num- 3.3 Diffusing Computations ber of queries and replies sent over an edge is called the deficit of this edge. Hence, from The second category of algorithms was in- the above we have the following: The deficit spired by the work of Chang [1982] and of all edges is at least zero. Dijkstra and Scholten [1980]. Here the The neutral state of a node can now be basic idea is that a diffusing computation is defined to be the state in which the deficits activated, for example, by a transaction of all incoming and outgoing edges are zero. manager that suspects a deadlock. This The diffusing computation terminates if computation is superimposed on the under- the root returns to its neutral state. When lying database computation. If this com- should a node reply to a query? We stipu- putation terminates, the initiator declares late that an active node reply to all queries deadlock. The characteristic feature of the it receives immediately. The crucial ques- superposed computation in the case of dis- tion is: When should a node reply to its tributed deadlock detection is that the engaging query? This reply is called the global WFG is implicitly reflected in the engaging reply. We require that a node send structure of the computation. The actual back its engaging reply only after it has WFG, however, is never built explicitly. received replies for each query it sent. The diffusing computation grows by With these stipulations it is not hard to sending query messages and shrinks by re- show that (1) each engagement edge con- ceiving replies. In our case query and reply nects two active nodes, (2) engagement messages are concerned exclusively with edges do not form cycles, and (3) each ac- deadlock detection and are distinct from tive internal node has exactly one incoming resource request and grant messages. When engagement edge. We say that the diffusing a diffusing computation shrinks back to its computation has terminated if and only if root, it terminates. all internal nodes are in their neutral state. More precisely, nodes different from the From what has been said above, it now root are called internal nodes. Each node in follows that (cf. [Dijkstra and Scholten the diffusing computation has an initial 19801) state called the neutral state. The root (also called initiator) sends queries to its succes- (1) when the root returns to the neutral sors to start a diffusing computation. After state, the diffusing computation has receipt of its first query, a node leaves the terminated; neutral state and becomes active. The first (2) a bounded number of steps after the query received by node pi is called the en- diffusing computation has terminated,

ACM Computing Surveys, Vol. 19, No. 4, December 1987 314 l Edgar Knapp

the root will have returned to the neu- (3) (3e’: e’ E E:el 5 e’ A e’ 5 e2). tral state. Part (1) of the definition says that the Algorithms using the paradigm of diffus- events of a single process are totally or- ing computations are presented in Sec- dered. Part (2) expresses the fact that mes- tions 4.4 [Chandy and Misra 19821 and 4.5 sages are received after they are sent. Part [Hermann and Chandy 19831. In general, (3) essentially states that 5 is transitive. this approach results in shorter messages We can represent the history of a system and less deadlock detection overhead as and its happened-before relation by a dia- compared with path-pushing algorithms. gram like that in Figure 2. The dots repre- Besides the work mentioned above, there sent events, the horizontal lines are the are other variations on this theme [Chandy time axes of the processes, and the arrows and Misra 1986; Chang 1982; Dijkstra et al. link corresponding sends and receives. 1983; Haas 1981; Haas and Mohan 1983; The following formalization is due to Misra 1983; Misra and Chandy 1982a; Chandy and Lamport [ 19851. A cut c of E Misra and Chandy 1982b]. is a partition of E into two sets PC and F,, standing for past and future, respectively. 3.4 Global State Detection A cut is consistent if F, is closed under 5. The work that has been done in the area of A consistent cut defines a consistent state. global state detection is largely based on Hence we use consistent cut and consistent results by Chandy and Lamport [1985]. state interchangeably. Intuitively, consist- The key notion here is a consistent global ent cuts are those that do not contain a state that can be determined without tem- send event in F, with the corresponding porarily suspending (“ freezing”) the under- receive event in PC. lying (database) computation. Below we Looking again at the example in Figure give a condensed presentation of the results 2, we see that PC = (elf e3, e4, e7, es, es, elo] of Chandy and Lamport, that are relevant and F, = (e2, e5, e6). Furthermore, since F, to our context. The discussion follows is closed under 5, c is a consistent cut. Bracha and Toueg [ 19831. A special type of consistent state is S,, The underlying computation, henceforth the global state at time t that is the collec- referred to as the system, is a collection of tion of all the local states of the processes processes, that can be thought of as trans- at time t. Note that S, is a purely theoretical action managers and transaction agents. construct that cannot be observed, since Processes communicate by sending mes- this would require an outside observer to sages (e.g., resource requests or grants) ac- record the local states of the processes cording to some underlying protocol (2PL, instantaneously, an impossible task in e.g.). Events in the system are the sending practice. In contrast, consistent states and receipt of messages. We denote the set can be obtained from within the system. of events in a system by E. The local state We now extend the relation 5 to consistent of a process p consists of the history of all states. events that occurred on p. Along the lines of Lamport [ 19781 we define a partial order Definition 3.2 5 G E x E as follows: Let S1, S2 be consistent states. Then S1 5 if Definition 3.1 S2, Ps, C PSA. We define a relation l- between states, Let el, e2 E E. Then el 5 e2 (el happened called reachability relation. before e2) if either (1) el and e2 are both on the same process Definition 3.3 p, and el occurred earlier in p then e2; Let S be a consistent state and e E E, such (2) el is a send event and e2 is the corre- that Ps U (e) defines a consistent state S ‘. sponding receive; Then S I-” 5” (S’ is reachable from S).

ACM Computing Surveys, Vol. 19, No. 4, December 1987 Deadlock Detection in Distributed Databases 315

PC c FC 4. A SURVEY OF SELECTED ALGORITHMS 4.1 Obermarck’s Path-Pushing Algorithm In this section we discuss an algorithm that appeared in Obermarck [ 19821. The under- lying deadlock model is the AND model; hence the algorithm looks for cycles in the global WFG. First, the author makes some \- 1% j simplifying assumptions: es elo I 0) Transactions have a single locus of con- Figure 2. A cut of a distributed system. trol; that is, at most one transaction agent of each transaction can be active at any time. A sequence of events (r = (el, e2, . . . , e,) (2) Communication between transaction is a schedule, if S l--“I SI l--“z . . . l-en-1 Snel agents is logically synchronous. I-“n S I. We then write S l--” S ’ for short. One can show [Chandy and Lamport 19851 (3) The transactions are totally ordered, that which is useful in reducing deadlock detection overhead and ensuring that exactly one transaction in each cycle Lemma 3.1 detects deadlock. S 5 S’ implies (3schedule a::S P’S’). (4) The portion of the local WFG sent from In the context of deadlock detection, the one site to another does not change state of the system is a WFG, and schedules until the information has been received are sequences of WFG transformations. We and processed by some final site. The say that a transaction is deadlocked if it is final site is defined as deadlocked in WFGt, the WFG at time t. (a) the site at which a deadlock cycle Given a definition of deadlock in terms of is completed, or a WFGt, the following lemma of Bracha (b) the most distant site at which and Toueg [1983] allows us to apply a dis- global deadlock can be proved not tributed deadlock detection algorithm to to exist. consistent WFGs instead of WFG, s: Points (1) and (2) imply that in each trans- action only one agent may be active or in Lemma 3.2 resource-wait. This agent is expected to ( WFG 5 WFG ’ and v is deadlocked in send a message to other agents of the same WFG) implies v is deadlocked in WFG I. transaction, which are waiting to receive a message. This is the fundamental result on which Obermarck [1982] admits that point (4) deadlock detection algorithms can be is a fairly unrealistic assumption. It turns based. Chandy and Lamport [1985] show out that even with this assumption the how to obtain a consistent global state of a algorithm is still vulnerable to detecting distributed system by propagating markers phantom deadlocks. He suggests that, if the along the channels of the system. A con- occurrence of deadlock is rare, the assump- sistent global state obtained in this fashion tion be dropped and cycles found by the is also called a snapshot of the system. algorithm be validated. Such a snapshot can then be examined for Each controller cj at site Sj runs a copy deadlock off-line. Since this snapshot is by of the deadlock detection algorithm. The definition a static object, there are no prob- basic structure of the algorithm is an iter- lems in conjunction with message delays, ation of the following steps: and deadlock detection becomes much eas- ier. In Section 4.6 we see an example of the (1) Receive deadlock information from application of this result. some other sites that was produced

ACM Computing Surveys, Vol. 19, No. 4, December 1987 316 9 Edgar Knapp by the previous deadlock detection simplest case, a probe consists of a single iteration. natural number that is unique to the nodes (2) Build part of the global WFG using in the WFG. When the probe comes back local wait-for information and the to its initiator, the initiator declares dead- information received from other sites lock. in step 1. A special node in the WFG The algorithm will be stated in terms of called External is used to represent processes. It has a number of nice features: intersite wait-for relations. It is very simple, making the proofs (3) Find all elementary cycles in the WFG. (1) Break all cycles that do not contain EX elegant and fun to read (and write), and by aborting suitable transactions. the task of implementing a matter of hours. (4) Consider each elementary cycle con- taining EX. Such a cycle constitutes a (2) Exactly one process in the cycle will potential global deadlock. For each detect deadlock, which simplifies dead- such cycle EX + T1 + . . . + T, + lock resolution since this process could EX compare Tl with T,. If Tl > T,,, simply abort. By including priorities in send the cycle to each site, where an the algorithm, the lowest priority pro- agent of T,, is waiting to receive a mes- cess in a cycle detects deadlock and sage from the agent of T,, at this site. aborts. (3) Spontaneous aborts are allowed, even In Obermarck [1982, sect. 61, an attempt though under this assumption phantom is made to prove the algorithm correct. deadlocks cannot be excluded. It can be That the algorithm and proof are incorrect shown, however, that only genuine (in the sense that false deadlocks may be deadlocks will be detected in the ab- detected) can easily be seen from the fol- sence of spontaneous aborts. lowing observation [Elmagarmid 19861: The portions of the WFG that are shipped In this discussion, only the first version around may not represent a consistent view of the algorithm (without priorities) is of the global WFG, since each site takes its given. The extension to priority handling snapshot asynchronously. can be found in Mitchell and Merritt As far as the performance of the algo- [1984]. rithm is concerned, Obermarck shows that, Each node of the (virtual) WFG has two ifs sites are involved in a deadlock, at most local variables, called labels: a private label, s(s - 1)/2 messages are sent, where each which is unique to the node at all times, message may be of length O(s). Under cer- though not constant, and a public label, tain assumptions the expected case per- which can be read by other processes and formance is shown to be roughly linear in need not be unique. A process is repre- s, with a small constant factor. sented as 8 where u and u are the public For more details, the reader may consult and private labels, respectively. Initially, Obermarck [ 19821. In our opinion, however, private and public labels are equal for each this algorithm has been rendered obsolete process. by more recent developments. The state of the system is given by the global WFG. The WFG is maintained by the four state transitions shown in Fig- 4.2 Mitchell and Merritt’s Algorithm for the ure 3, where z = inc(u, v), and inc(u, u) Single-Resource Model yields a unique label greater than both The algorithm by Mitchell and Merritt u and u. Labels not mentioned explicitly [ 19841 presented in this section is as simple remain unchanged. as the deadlock model for which it was creates an edge in the WFG. Two defined. It is an edge-chasing algorithm in messages are needed, one resource request which probes are sent in the opposite direc- and one message back to the blocked pro- tions of the edges of the WFG. In the cess to inform it of the public label of the

ACM Computing Surveys, Vol. 19, No. 4, December 1987 Deadlock Detection in Distributed Databases 317

Block 8 8 ==+ 8-8 From the invariant the following lemma is immediate:

Activate e-e -8 8 Lemma 4.1 Transmit eiise a 8-8 For any process 8, if u > u, then u was set by a Transmit step. Now we are ready to prove the following Figure 3. The four possible state transitions. theorem:

Theorem 4.1 process it is waiting for. Activate means If a deadlock is detected, a cycle of blocked that a process acquired the resource from nodes exists. the process it was waiting for. Transmit propagates larger labels in the opposite di- Proof. Deadlock is detected if the fol- rection of the edges by sending a probe lowing edge p + p’ exists: message. A process that receives a probe @---+@ with a number smaller than its own public label can simply ignore the probe. Detect We will prove the following claims: means that the probe with the private label of some process has made a whole round in (1) u has been propagated from p to p’ via a circle, indicating a deadlock. a sequence of Transmits. Note that the requirement for unique- (2) p has been continuously blocked, since ness of the public label does not cause any it “transmitted” u (i.e., engaged in a problems at all in the Block step of the Transmit event with some process 4, algorithm. Assuming that we can assign a Q+-P)* unique name to each process, labels can be (3) For all intermediate nodes q in the represented as pairs of sequence numbers transmit path of(l), includingp’, q has and process names, and < can be chosen to been continuously blocked since it be lexicographical ordering. Then to Block, transmitted u. only sequence numbers have to be com- The result then follows immediately. pared; to Transmit, the whole pair is sent. The proof of the correctness of the algo- Ad 1. By the invariant and the unique- rithm is quite simple. Mitchell and Merritt ness of private labels, we have for the pri- show that every deadlock is detected. Since vate label u of p ’ : u < u. By Lemma 4.1, u they did not exclude spontaneous aborts, was set by a Transmit step. By the seman- they did not worry about phantom dead- tics of Transmit, there is some p” with locks. Below we prove that in the absence private label u, public label w. of spontaneous aborts only genuine dead- If w = u, then p” = p, and we are done. locks are detected. Otherwise, w < u, and we repeat the argu- Assume for now that there are no spon- ment. Since there are only finitely many taneous aborts. The following is an inuar- processes, one of them is p. iant : Ad 2. Assume that p was active since it transmitted u. It is blocked when it detects For all processes 8 : u I U. deadlock; hence upon Blocking it incre- mented its private label. But then private Proof. Initially u = u for all processes. and public labels cannot be equal. The only transactions that change u or u Ad3. Assume that there is a process are that has been active since it transmitted u. Its predecessor has been active since (1) Block: u and u are set such that u = u. its transmission, too, because Transmits (2) Transmit: u is increased. Cl migrate in the opposite direction of the

ACM Computing Surveys, Vol. 19, No. 4, December 1987 318 . Edgar Knapp edges. By repeating this argument, we find gorithm for handling processes with more that p has been active since it transmitted than one outgoing edge (AND requests). u. 0 4.3 Chandy and Misra’s Algorithm for the The algorithm given can be easily ex- AND Model tended to include priorities such that the lowest priority process in a deadlock cycle In the approach developed by Chandy and aborts itself. The extended algorithm has Misra [ 19821, each controller runs a copy two phases. The first phase is almost iden- of the deadlock detection algorithm. In tical to the simple algorithm. In the second order to determine whether an idle trans- phase the smallest priority is propagated action agent is deadlocked, its controller around the circle, as the largest public label initiates a probe computation. In a probe was propagated before. The propagation computation, controllers send probes to stops when one process recognizes the each other. Probe computations may be propagated priority as its own. The full initiated for several transactions, and the algorithm is given in Mitchell and Merritt same transaction agent may have several [1984]. probe computations initiated for it in se- The performance of this algorithm is not quence. A probe consists of a triple (i, j, k), studied in Mitchell and Merritt [ 19841. It denoting that it belongs to a probe compu- is, however, not hard to obtain the follow- tation for ti4 and that this probe was sent ing complexity results. Assuming that a along intercontroller edge (tj, tk). A con- deadlock persists long enough to be de- troller sends a probe (i, j, k) if the following tected, the worst-case complexity of the conditions hold: (1) tj is idle, (2) tj is waiting simple algorithm is s(s - 1)/2 Transmit for tk, and (3) ti is dependent on tj. We call steps, where s is the number of processes an intercontroller edge (tj , tk) that meets in the cycle. After this many steps, every these three conditions an outgoing edge process in the cycle will have compared its Ofti. public label with every other one. That this Probes received by a controller may be bound is tight can be seen from an example discarded or accepted; probes that are ac- in which the public labels of the processes cepted are called meaningful. Formally, a are ordered increasingly around the cycle, probe (i, j, lz) is meaningful if (1) tk is idle with the detecting process having the great- and (2) the controller of tk did not know est label and the process for which the that ti was dependent on tk and can now detecting process is waiting having the deduce that t; is dependent on tk. It is smallest label. A similar argument shows immediate that, if the controller of ti re- that for the priority algorithm the largest ceives a meaningful probe (i, j, i) for any j, number of Transmit steps for detecting then ti is deadlocked. The formulation of deadlock is twice as large: s(s - 1). We these observations in terms of an algorithm conjecture that the expected-case complex- can be found in Figure 4. ity is linear for both algorithms. Observe that in a subsequent refinement Interestingly, the algorithm does not re- step, the test of whether a probe is mean- main correct if public labels are transmitted ingful can be implemented by a Boolean in the same direction as the edges instead array dependentk, where dependentk(i) = of the other way round. The reason for this tk’s controller knows that ti is dependent is exactly the point we were making when on tk. Local dependence and outgoing edges we defined AND- and OR-model deadlock can be determined by a standard marking in Section 3: If a deadlocked process that algorithm, like the one used for reachability is not part of a cycle has the largest public problems in graphs. label among the deadlocked processes, this The algorithm is proved correct by col- label might enter the cycle and circulate oring the edges of the WFG in the following once without any process in the cycle de- tecting the deadlock. Also, there seems to 4 Note, that we make use of the fact that double be no straightforward extension of the al- subscripts can be replaced by a single subscript.

ACM Computing Surveys, Vol. 19, No. 4, December 1987 Deadlock Detection in Distributed Databases 319

For initiation of a probe computation by the controller of ti:

if ti is locally dependent on itself then declare deadlock for ti else send probes (i, j, Jz) on all outgoing edges (tj,th) of ti.

For a controller on receiving a probe (i, j, k):

if the probe is meaningful

thenifk=i

then declare deadlock for ti else send probes (i, p, q) on all outgoing edges (tp, te) of th.

Figure 4. Algorithm for the AND model.

manner. An edge (tip tj ) is Site 1 l gray if ti has sent a request to tj that tj Site 2 has not yet received; l black if tj has received a request from ti but has not yet sent a grant message t0 ti; l white if tj has sent a grant message to ti but ti has not yet received it. Site 3 Hence edge colors represent the state of a “channel” between processes. This ap- proach is also used in Bracha and Toueg [1984], Chandy et al. [1983], and Hermann Figure 5. A counterexample to the algorithm in Sec- and Chandy [1983]. tion 6.6 of Chandy and Misra [1982]. Gray and black edges are called dark edges. It is easy to see that in our model of message passing, a dark cycle, that is, a Incidentally, the algorithm given in cycle in which all edges are dark, will persist Chandy and Misra [1982, sect. 6.61 is not forever. Hence the existence of a dark cycle correct, as can be observed by applying it is equivalent to deadlock. In order to prove to the counterexample of Figure 5. If the the algorithm correct, one must show the controller at site 1 initiates a probe com- following: putation for tll, tll will not be marked in the process of finding a local deadlock at (1) [Safety] If the initiator of a probe com- site 1 (because tll is not part of a local putation for ti receives a meaningful deadlock). Since the set of outgoing edges probe (i, j, i), then ti is on a black cycle is determined starting from marked trans- when this probe is received. action agents only, the set of outgoing edges (2) [Progress] If ti is on a dark cycle at the will be found empty, and no probes will be time its controller initiates a probe sent. Therefore, deadlock will never be de- computation for it, then the controller tected by the controller of site 1, even of ti will eventually get a meaningful though tll is part of a deadlock cycle. A probe (i,j, i). subsequent version of the algorithm that

ACM Computing Surveys, Vol. 19, No. 4, December 1987 320 . Edgar Knapp

appeared in Chandy et al. [1983, sect. 3.11 message of the form reply (i, k, j ) to the introduced the notion of dependence and is query message query(i, j, k). A blocked (to our knowledge) correct. The formula- process initiates a deadlock computation tion of this algorithm given above both by sending queries to processes in its de- retains the notational simplicity of the pendent set (cf. Section 2.3). The basic idea first-incorrect-solution and is free of is that a blocked process, on receiving a errors. query, should propagate the query to its Below, the performance analysis of the dependent set if it has not done so already. algorithm is summarized. Each probe sent Thus, if there is a sequence of permanently is of fixed length. Deadlock detection over- blocked processes pi, . . . , pj such that each head is introduced primarily when trans- process in the sequence (except the first) is action agents are idle (i.e., have nothing to in the dependent set of the previous process do and nothing to send). Furthermore, if in the sequence, a query initiated by pi will transaction agents that are referred to in a be propagated to pj . probe are executing, the controller simply Next we discuss the action taken by a discards that probe. Every single deadlock process pk on receiving a query or reply detection computation involves no more with fixed initiator i and some sender j. If than e probes, where e is the number of pk is active, it ignores all queries and replies. communicating pairs of controllers in If it is blocked, there are several possibili- the network. Hence in the worst case e = ties: If pk receives an engaging query, it N(N - 1). Normally, however, e will be propagates the query to all processes in its much less, depending on the locality behav- dependent set and remembers the number ior of the transactions. of queries sent in a local variable nun(i). Some optimizations regarding questions Let the local variable wait(i) denote the of when and how often probe computations fact that pk has been continuously blocked should be initiated are given in Chandy and since it received its engaging query. If pk Misra [ 19821. receives subsequent queries, it replies to them immediately, if wait(i) holds. If it has been executing since then, that is, -I wait(i) 4.4 Chandy, Misra, and Haas’s Algorithm for holds, it discards the query. the OR Model If pk receives a reply and wait(i) holds, The algorithm for the OR model [Chandy it decrements num(i). When should pk re- et al. 1983, sect. 41 is an application of the ply to its engaging query? From our dis- technique of diffusing computations. A cussion in Section 3.3 it should be clear blocked process can determine whether it that pk replies to its engager only if it has is deadlocked by initiating a diffusing com- received a reply for each query it has putation. Several processes may initiate propagated, that is, if num(i) = 0. diffusing computations at the same time. When pk initiates a deadlock computa- However, for the time being we restrict tion, it does so by sending query (k, k, j ) to ourselves to the case in which each process each process j in its dependent set and initiates at most one diffusing computation. setting rum(k) to the number of queries The extension to the same process initiat- sent. If the initiator receives replies to all ing diffusing computations several times in the queries sent, then the initiator is dead- a row is then quite straightforward. locked. The messages in the deadlock compu- These observations lead to the algorithm tation have the form query (i, j, k) and in Figure 6. S denotes the dependent set of repZy(i, j, k), denoting that these messages pk, wait(i) = fake initially, for all i. belong to the diffusing computation initi- To guarantee that every deadlock will be ated by process pi and are being sent detected by some deadlocked process, we from pj to pk; pi, pi, pk are called the ini- now relax the restriction that only one tiator, sender, receiver, respectively. There deadlock computation can be initiated by will be at most one message of the form some particular process. We require that a query(i, j, K); there will be at most one reply process initiate a diffusing computation

ACM Computing Surveys, Vol. 19, No. 4, December 1987 Deadlock Detection in Distributed Databases l 321

For a blocked process PI: to initiate a diffusing computation: send query(k, k,j) to all pj in S. set num(i) := ISI; wail(i) := true.

For a blocked process ~1: upon receiving query(i,j, k): if the query is the engaging query for initiator i then send query(i, k, m) to all p,,, in S. set num(i) := ISI; wait(i) := true. else if wait(i) then send reply(i, k,i) to pj. Fit1~6. Simplified algorithm for the OR

Upon receiving reply(i, k, j):

if wait(i) then num(i) := num(i) - 1. if num(i) = 0 then if i = k then declare deadlock for pk. else send reply(i, k,j) to the engager of pk.

For a blocked process upon becoming executing: wait(i) := false, for all i. each time it becomes blocked. To distin- a new diffusing computation whenever it guish several diffusing computations initi- becomes blocked. ated by the same process pi, queries and The analysis of the algorithm’s perform- replies are endowed with an additional pa- ance is similar to that for the AND-model rameter denoting the sequence number of algorithm in the previous section. There is the diffusing computation initiated by pi. a maximum number of e queries and e The generalized algorithm can be found in replies per diffusing computation, where Chandy et al. [1983], together with a cor- e=N(N-1). rectness proof. In particular, the following theorems are proved: 4.5 Hermann and Chandy’s Algorithm for the Theorem 4.2 AND-OR Model If the initiator of a diffusing computation is The basis of the algorithm for the AND- deadlocked when it initiates the computa- OR model [Hermann and Chandy 19831 is tion, it will (eventually) declare itself dead- a so-called tree computation. A tree com- locked. putation consists of a hierarchy of diffusing computations along the lines of Section 3.3. Theorem 4.3 Below we will make this idea more precise. If the initiator of a diffusing computation Transaction agents are mapped to proc- declares itself deadlocked, then it belongs to esses in the following manner: A process a deadlocked set. may have an AND request or an OR re- quest; an AND-OR request issued by some transaction agent is mapped to a tree of Theorem 4.4 processes. The mapping is a representation At least one process in every deadlocked set of the AND-OR request in a regular form, will report deadlock if every process initiates such as disjunctive normal form (DNF).

ACM Computing Surveys, Vol. 19, No. 4, December 1987 322 . Edgar Knapp (b) For an OR process, all outgoing edges disappear instantaneously and the process is active. The central idea underlying the algo- rithm is that any time a diffusing compu- tation reaches a blocked OR process, the diffusing computation is propagated to the dependent set of this process; if the engaged process is a blocked AND process, it initi- ates a separate tree computation for each outgoing edge. So a tree computation con- u sists of either a diffusing computation or a set of tree computations. In order to start a deadlock computation, an initiating pro- cess sends a query to the process that is suspected of deadlock. From there queries are propagated according to the rules ex- plained later. A tree computation termi- nates when its initiator receives a reply from the suspected process. Deadlock in the process model is defined in the follow- ing obvious way:

Definition 4.1 A blocked process p is deadlocked, if either (1) p is an AND process and will never Figure 7. Mapping transaction agents receive a grant for at least one of the to processes. resources requested, or (2) p is an OR process, but will never re- ceive a grant message. Figure 7 gives an example of this mapping. Note, that in order to use this definition Transaction agent tl waits for (tz and t3) or to define the correctness of a deadlock t4 or t5; a line connecting edges denotes an detection algorithm, we have to exclude AND request. We call processes like p; permanent blocking of processes due to AND processes; consequently, p1 is called individual starvation or infinite loops. an OR process. For this reason, Hermann and Chandy The behavior of individual processes call this a local definition of deadlock. with respect to the underlying computation Next we discuss process behavior with is a refinement of the process behavior respect to the deadlock computation defined in Section 1.3. Upon receiving a proper. Let us assume for now that only grant message an edge in the WFG disap- one deadlock computation is performed at pears, and there are several possibilities for a time. We shall see later how to extend the receiving blocked process: the scheme to many concurrent deadlock computations. Queries have the form query(seq, k), (1) No outgoing edges remain, that is, the where seq = ( iI, . . . , in) is a sequence of process is active. processes and k is the sender of the query. (2) If outgoing edges remain, there are two The initiator i of a deadlock computation cases: sends query ( ( i ) , i ). A query is propagated (a) An AND process remains blocked. in the following manner: If an engaging

ACM Computing Surveys, Vol. 19, No. 4, December 1987 Deadlock Detection in Distributed Databases 323 query(seq, m) arrives at a blocked AND each time a new initiator be created and process pk, a new set of tree computations that the names of the initiators be unique. is initiated by pk. For this purpose, pk sends Verification of the algorithm proceeds by out a query of the form query (seq 0 1, k) to proving the following claims: pI for all outgoing edges (pk, pi ) of the WFG (1) [Safety] If an initiator i detects dead- (0 denotes concatenation). If a blocked OR lock for some process p, then p is truly process receives an engaging query(seq, m), deadlocked. it propagates query (seq, k) to all processes in its dependent set. These actions are re- (2) [Progress] If a process p is deadlocked ferred to as extension. when a deadlock computation is initi- If query(seq, m) is not engaging and the ated by some initiator i, then i will receiving process pk has been blocked con- detect deadlock for p in finite time. tinuously from the time it received its en- The proof employs invariants and the gaging query, a reply(seq, k), is sent to pm. so-called “tree computation termination This action is called reflection. lemma” given below: If a reply (seq, m) is received by an AND process, it sends back its engaging reply if Lemma 4.2 it has been continuously blocked from the time it received its engaging query. An OR A tree computation T terminates iff for every process sends back its engaging reply only i and j such that query(T, i) is sent to pj , if it has received all the replies from pro- reply(T, j) arrives at pi with no intervening cesses in its dependent set and has not been grants. executing from the time it was engaged. The details of the very well-written proof These actions are called collation. In all can be found in the paper by Hermann and other cases messages received are dis- Chandy [1983]. A performance analysis is carded. not provided by the authors, but it is not To distinguish among many different hard to see that in the worst case one single concurrent deadlock computations, proc- deadlock computation will take at most e = esses use the information in seq. Two mes- N2(N - 1) queries and e replies, where N sages M(s, k) and M’(s’, k’) received by is the number of processes. Messages are of some process in this order relate to the variable length with a maximum size of N. same deadlock computation if and only if However, N can be exponential in the num- s 5 s’, where I means “is prefix of.” ber of transactions if a normal form like This observation and the rule for sequence DNF is used for the transaction-to-process extension by AND processes enables us to mapping, as suggested in the paper. On the identify tree computations with sequences, other hand, we do not see the necessity for which plays an important role in the proof converting an arbitrary AND-OR request of the algorithm. To keep track of the quer- into normal form: AND-OR requests can ies sent and replies received by each pro- be mapped directly to a tree of processes cess, two lists of messages are used: an without an exponential blowup. Hints for incoming query list IQ-list and an outgoing efficiency improvements and implementa- query list OQ-list. Those lists are updated tion considerations are again given in Her- in a straightforward manner. Care has to mann and Chandy’s paper. be taken only in the case in which a grant is received. For details the reader is referred 4.6 Bracha and Toueg’s Algorithm for the (;) to Hermann and Chandy’s paper [ 19831. Model A deadlock computation is started by some controller, creating a process called The algorithm for the (E) model presented the initiator; the initiator then sends a in this section [Bracha and Toueg 19831 is query to the process that is checked for an application of the global state detection deadlock. Several deadlock computations technique described in Section 3.4. We can be initiated concurrently and for the shall discuss in some detail only the first of same process. The only constraints are that the three versions of the algorithm given

ACM Computing Surveys, Vol. 19, No. 4, December 1987 324 9 Edgar Knapp by Bracha and Toueg, which assumes syn- the computation due to individual starva- chronous communication between pro- tion or infinite loops, then the definition of cesses and a static WFG. The second deadlock is not quite correct. algorithm is supposed to relax the con- Bracha and Toueg’s first algorithm is a straint of synchrony, but this is the case nested invocation of diffusing computa- only to a very limited extent. The state of tions, with a slight twist. The twist is that an edge (channel) incident on a process a process leaving its neutral state remem- must still be known to that process in order bers that it did so. Only the first query it for the algorithm to be correct. Hence the ever receives will be engaging. All subse- second algorithm is basically synchronous quent queries are answered immediately as well; it is just one more scheme to sim- with a reply, even if the process has re- ulate synchrony by sending status messages turned to its neutral state. One conse- back and forth, which is a fairly standard quence of this behavior is that the number scheme and not new at all. The third algo- of messages exchanged during an invoca- rithm first determines a global snapshot of tion of the algorithm is reduced. Another the WFG by using the technique introduced consequence, however, is that many of the in Chandy and Lamport [ 19851. This snap- nice properties of diffusing computations shot can then be used to run one of the first are lost and the proofs of correctness be- two algorithms on to detect whether there come messy and almost incomprehensible. is a deadlock. A novel feature of the algorithm in com- The underlying resource model is the parison with the others we have seen so far (;) model. A transaction can have as a is that a diffusing computation always ter- request an arbitrary and-or combination of minates, and when it terminates, every (2) requests. This combination is mapped process knows whether or not it is dead- to a tree of processes by using a scheme locked. Every process p employs a local similar to the one discussed in the previous variable free,, whose value upon termina- section, with each process having a single tion satisfies free, = p is not deadlocked in (;) request. the static WFG. The value of free is estab- A process becomes blocked upon issuing lished by simulating the propagation of an (E) request. It does so by sending out n grant messages through the WFG. request messages. It becomes executing With these remarks, the algorithm can again when it receives k grant messages. In be described as a nesting of two instances this case it sends relinquish messages to the of an algorithm similar to a diffusing com- remaining n - iz processes, informing them putation, which Bracha and Toueg call that the edge created by sending the request CLOSURE. So the deadlock detection al- no longer exists. Relinquish messages are gorithm looks like the following: necessary because each process must know both its set of outgoing edges and its set of l [Outer invocation of CLOSURE] Find incoming edges. the set S of all reachable executing proc- Deadlock in this model is defined in esses by propagating queries, starting at terms of the WFG, using the terminology some initiator i. of Section 3.4. -Each p E S simulates granting all the resources it holds and the other pro- cesses are waiting for. A separate in- Definition 4.2 stance of CLOSURE is invoked by each p. A process p is deadlocked in a WFG G if -The grants are propagated and the there is a schedule (r such that (G l--” G’ number g of simulated grants received and p is executing in G ’ ). at each reachable process q is com- The same problem as with the definition pared with the number r of resources of Hermann and Chandy [ 19831 arises here, needed by q to become executing even though Bracha and Toueg seem to fail again. If g < r, then q will never get to recognize this. If there is no extension of enough grants; hence it is deadlocked.

ACM Computing Surveys, Vol. 19, No. 4, December 1987 Deadlock Detection in Distributed Databases 325

Observe that all the inner instances of be detected yet, lead to further delays. A CLOSURE will terminate before the outer number of optimizations are suggested and one does. That the algorithm always ter- can be found in Bracha and Toueg’s paper minates and that at termination each pro- [1983]. cess knows whether or not it is deadlocked Compared with many other deadlock de- are proved by Bracha and Toueg [1983]. tection schemes, this algorithm employs The proof uses purely operational argu- very little concurrency. First, since snap- ments, and no invariants are given. The shooting is used and the WFG is processed algorithm itself is stated in a Pascal-like off-line, there is no true concurrency be- language and uses global side effects, which tween deadlock detection and underlying make both the understanding of the algo- computation. Second, the algorithm pro- rithm and its proof unusually hard. ceeds in phases (up to three, when space- Moreover, the second version of the al- saving optimizations are used), which gorithm cannot be correct. Bracha and sequentialize the deadlock computation. Toueg claim that “if an initiator i starts the To conclude the discussion of Bracha and deadlock detection algorithm in a colored Toueg’s paper, we want to point out that a WFG G then the algorithm terminates.5 study of the algorithms revealed a number Moreover, when i terminates, the local vari- of errors. One of the more severe ones able freei = true if and only if i is not was the fact that the instantiation of the deadlocked in G.” Deadlock is defined as in CLOSURE algorithm for the first phase Definition 4.2, with G containing black of detecting all reachable executing nodes edges only. A counterexample to the above is incorrect. Recently, Gafni [ 19861 claim is a simple cycle of gray edges in G. suggested improvements to Bracha and These edges will turn black after a finite Toueg’s algorithm but without giving a cor- amount of time. Hence, there is no schedule rectness proof. u such that (G l-” G’ and any of the pro- cesses in the cycle are active in G’). There- 5. DISCUSSION fore, by Definition 4.2, all the processes in the cycle are deadlocked. But since Bracha The large number of errors in published and Toueg chose to consider gray edges as algorithms addressing the problem of dis- nonexistent in the WFG, if the initiator is tributed deadlock detection [Bracha and among the processes in the cycle, it will Toueg 1983; Chandy and Misra 1982; Ho erroneously claim that it is not deadlocked. and Ramamoorthy 1982; Menasce and It is true, however, that this deadlock will Muntz 1979; Obermarck 19821 shows that be detected eventually. But this does not only rigorous proofs, using as little opera- save the algorithm from not meeting its tional argumentation as possible, suffice to specification. show the correctness of these algorithms. The performance analysis is summarized By falling back on well-known and com- below. In the synchronous case at most pletely general principles like diffusing 4N(N - 1) message are needed. In the computations and global state detection, it asynchronous version more messages have seems possible to achieve both elegance and to be exchanged to determine the state of correctness, even for more advanced models the edges between processes. In the worst of resource requests, without introducing case an additional overhead of O(N2) mes- unnecessary complexity. sages suffices. Since the algorithm proceeds We believe that recent developments in by taking a snapshot first and then running the area of distributed deadlock detection deadlock detection, the time between the algorithms has rendered much of the older occurrence of deadlock and its detection work obsolete. This paper has focused on a may be significant. Also, situations that small number of concepts and does not will inevitably lead to deadlock, but cannot claim completeness in the sense that all known approaches have been covered. ’ The notion of a colored WFG used here is similar to Among recent publications that have not the one we introduced in Section 4.4. been considered here are Chandy and Misra

ACM Computing Surveys, Vol. 19, No. 4, December 1987 326 l Edgar Knapp [1986], Elmagarmid [1985], Helary et al. Database Systems. Addison Wesley, Reading, [1987], Natarajan [ 19861, and Sinha and Mass. Natarajan [ 19841. An annotated bibliog- BRACHA, G., AND TOUEG, S. 1983. A distributed algorithm for generalized deadlock detection. raphy on distributed deadlock detection al- Tech. Rep. TR 83-558, Cornell Univ., Ithaca, gorithms appeared in Zobel [ 19831. N.Y. - The differences in message complexity of BRACHA, G., AND TOUEG, S. 1984. A distributed the algorithms presented have turned out algorithm for generalized deadlock detection. In to be less significant than expected. The Proceedings of the ACM Symposium on Principles of (Vancouver, Canada, one-resource, AND-, and OR-model algo- Aug.). ACM, New York, pp. 285-301. rithms all had a worst-case complexity of CHANDY, K. M., AND LAMPORT, L. 1985. Distributed O(N’), where N was the number of nodes snapshots: Determining global states of distrib- in the WFG. The AND-OR algorithm uted systems. ACM Trans. Program. Lang. Syst. needed at most O(N3) messages but was 3, 1 (Feb.), 63-75. vulnerable to an exponential blowup if (;) CHANDY, K. M., AND MISRA, J. 1982. A distributed requests are cast into AND-OR form. The algorithm for detecting resource deadlocks in dis- tributed systems. In Proceedings of the ACM algorithm for the (z) model itself used Symposium on Principles of Distributed Comput- O(N2) messages in the worst case, but this ing (Ottawa, Canada, Aug.). ACM, New York, could only be achieved by performing the pp. 157-164. deadlock detection proper off-line from the CHANDY, K. M., AND MISRA, J. 1986. An example of database computations. There are no clear stepwise refinement of distributed programs: Qui- escence detection. ACM Trans. Program. Lang. winners among the algorithms: Mitchell Syst. 8, 3 (July), 326-343. and Merritt’s algorithm excels by its ele- CHANDY, K. M., MISRA, J., AND HAAS, L. M. 1983. gance and simplicity, Chandy and Misra’s Distributed deadlock detection. ACM Trans. AND-model algorithm by its streamlined Comput. Syst. 1,2 (May), 144-156. proof, and Hermann and Chandy’s tech- CHANG, E. 1982. Echo algorithms: Depth parallel nique by the idea of applying diffusing com- operations on general graphs. IEEE Trans. Softw. putations to several levels of hierarchy. Eng. SE-&4 (July), 391-401. DIJKSTRA, E. W., AND SCHOLTEN, C. S. 1980. Each of the algorithms presented has its Termination detection for diffusing computa- merits, but many of them achieve simplicity tions. Znf. Process. Lett. 11, 1 (Aug.). by severely restricting the forms of resource DIJKSTRA, E. W., FEIJEN, W., AND VAN GASTEREN, requests permitted. In selecting a deadlock A. J. M. 1983. Derivation of a termination de- detection scheme to be embedded into a tection algorithm for distributed computations. particular application, it is therefore Znf. Process. Lett. 16, 5 (June), 217-219. advisable-in order to avoid unnecessary ELMAGARMID, A. K. 1985. Deadlock detection and resolution in distributed processing systems. complexity-to choose the least general Ph.D. dissertation, Dept. of Electrical Engineer- technique that is still general enough to ing, Ohio State Univ., Columbus, Ohio. solve the problem at hand. ELMAGARMID, A. K. 1986. A survey of distributed deadlock detection algorithms. ACM SZGMOD Rec. 15, 3 (Sept.). ACKNOWLEDGMENTS GAFNI, E. 1986. Perspectives on distributed network We are grateful to Hank Korth for his help and protocols: A case for building blocks. In IEEE encouragement and to Salvatore March and the ref- Military Communications Conference (Monterey, Calif.). IEEE, New York, pp. 1.1.1-1.1.5. erees for their comments and suggestion. The work was partially supported by Office of Naval Research GIFFORD, D. G. 1979. Weighted voting for replicated data. In Proceedings of the 7th ACM Symposium contract N00014-86-K-0182. on Operating Systems Principles (Pacific Grove, Calif., Dec.). ACM, New York, pp. 150-163. REFERENCES GLIGOR, V., AND SHATTUCK, S. 1980. On deadlock detection in distributed databases. IEEE Trans. AWERBUCH, B., AND MICALI, S. 1986. Dynamic Softw. Eng. SE-6, 5 (Sept.). deadlock resolution protocols. In Proceedings of GRAY, J. N., HOMAN, P., KORTH, H. F., AND OBER- the Foundations of (Toronto, MARCK, R. L. 1981. A straw man analysis of the Canada). IEEE, New York, pp. 196-207. probability of waiting and deadlock in a database BERNSTEIN, P. A., HADZILACOS, V., AND GOODMAN, system. Tech. Rep. RJ 3066, IBM Research Lab- N. 1987. Concurrency Control and Recovery in oratory, San Jose, Calif.

ACM Computing Surveys, Vol. 19, No. 4, December 1987 Deadlock Detection in Distributed Databases l 327

HAAS, L. M. 1981. Two approaches to deadlock de- OBERMARCK, R. 1982. Distributed deadlock detec- tection in distributed systems. Ph.D. dissertation, tion algorithm. ACM Trans. Database Syst. 7, 2 Dept. of Computer Sciences, Univ. of Texas, Aus- (June), 187-208. tin, Tex. PAPADIMITRIOU, C. 1987. The Theory of Database HAAS, L. M., AND MOHAN, C. 1983. A distributed Concurrency Control. Computer Science Press, deadlock detection algorithm for a resource-based Rockville, Md. system. Res. Rep. RJ 3765, IBM Research Labo- RAUCHLE, T., AND TOUEG, S. 1983. Exposure to ratory, San Jose, Calif. deadlock for communicating processes is hard to H~LARY, J., JARD, C., PLOUZEAU, N., AND RAYNAL, detect. Tech. Rep. TR 83-555, Cornell Univ., M. 1987. Detection of stable properties in dis- Ithaca, N.Y. tributed applications. In Proceedings of the ACM SINHA, M. K., AND NATARAJAN, N. 1984. A distrib- Symposium on Principles of Distributed Comput- uted deadlock detection algorithm based on ing (Vancouver, Canada, Aug.). ACM, New York, timestamps. In Proceedings of the 4th Interna- pp. 125-136. tional Conference on Distributed Computing Sys- tems. IEEE, New York, pp. 546-556. HERMANN, T., AND CHANDY, K. M. 1983. A distrib- uted procedure to detect AND/OR deadlock. ZOBEL, D. 1983. The deadlock problem: A classifying Tech. Rep. TR LCS-8301, Dept. of Computer bibliography. Operat. Syst. Rev. 17, 2 (Oct.), Sciences, Univ. of Texas, Austin, Tex. 6-15. Ho, G. S., AND RAMAMOORTHY, C. V. 1982. Protocols for deadlock detection in distributed database systems. IEEE Trans. Softw. Eng. SE- BIBLIOGRAPHY 8, 6 (Nov.), 554-557. HOLT, R. C. 1972. Some deadlock properties on com- BADAL, D. Z., AND GEHL, M. T. 1983. On deadlock puter systems. ACM Comput. Surv. 4, 3 (Sept.), detection in distributed computing systems. In 179-196. IEEE INFOCOM. IEEE, New York. JAGANNATHAN, J. R., AND VASUDEVAN, R. 1982. A BRACHA, G. 1985. Randomized agreement protocols distributed deadlock detection and resolution and distributed deadlock detection algorithms. scheme; performance study. In Proceedings of the Ph.D. dissertation, Cornell Univ., Ithaca, N.Y. Third International Conference on Distributed COHEN, S., AND LEHMANN, D. 1982. Dynamic sys- Computing Systems (Miami, Fla.). IEEE, New tems and their distributed termination. In Pro- York, pp. 496-501. ceedings of the ACM Symposium on Principles of LAMPORT, L. 1978. Time, clocks, and the ordering Distributed Computing (Ottawa, Canada, Aug.). of events in distributed systems. Commun. ACM ACM, New York, pp. 29-33. 21, 7 (July), 558-565. DIJKSTRA, E. W. 1982. Distributed termination MENASCE, D., AND MUNTZ, R. 1979. Locking and detection revisited. EWD 828, Plataanstraat 5, deadlock detection in distributed databases. 5671 Al Nuenen, The Netherlands. IEEE Trans. Softw. Eng. SE-5,3 (May). FRANCEZ, N. 1980. Distributed termination. ACM MISRA, J. 1983. Detecting termination of distributed Trans. Program. Lang. Syst. 2, 1 (Jan.), 42-55. computations using markers. In Proceedings of FRANCEZ, N., AND RODEH, M. 1982. Achieving dis- the ACM Symposium on Principles of Distributed tributed termination without freezing. IEEE Computing (Montreal, Canada, Aug.). ACM, New Trans. Softw. Eng. SE-8, 3 (May), 287-292. York, pp. 290-294. FRANCEZ, N., RODEH, M., AND SINTZOFF, M. 1981. MISRA, J., AND CHANDY, K. M. 1982a. A distributed Distributed termination with interval assertions. graph algorithm: Knot detection. ACM Trans. In Proceedings of Formalization of Programming Program. Lang. Syst. 4,4 (Oct.), 678-686. Concepts (Peninsula, Spain). Springer Verlag, MISRA, J., AND CHANDY, K. M. 1982b. Termination New York. detection of diffusing computations in com- GOLDMAN, B. 1985. Deadlock detection in computer municating sequential processes. ACM Trans. networks. Tech. Rep. LCS TR-185, Massachu- Program. Lang. Syst. 4, 1 (Jan.), 37-43. setts Institute of Technology, Cambridge, Mass. MITCHELL, D. P., AND MERRITT, M. J. 1984. A GOUDA, M. 1981. Distributed state exploration distributed algorithm for deadlock detection and for protocol validation. Tech. Rep. TR-185, Dept. resolution. In-Proceedings of the ACM Symposium of Comnuter Sciences, Univ. of Texas, Austin, on Principles of Distributed Computing. ACM, Tex. - New York, pp. 282-284. ISLOOR, S. S., AND MARSLAND, T. A. 1980. The NATARAJAN, N. 1986. A distributed scheme for de- deadlock problem: An overview. Computer tecting communication deadlock. IEEE Trans. (Sept.), 58-70. Softw. Eng. SE-12, 4 (Apr.), 531-537. JAGANNATHAN, J. R., AND VASUDEVAN, R. 1982. OBERMARCK, R. 1980. Deadlock detection for all re- Detection and resolution of deadlocks in distrib- source classes. Res. Rep. RJ2955, IBM Research uted systems. Tech. Rep. 82-108-27, Univ. of Cal- Laboratory, San Jose, Calif. gary, Calgary, Alta., Canada.

ACM Computing Surveys, Vol. 19, No. 4, December 1987 328 . Edgar Knapp

KORTH, H. F., KRISHNAMURTHY, R., NIGAM, A., AND TSAI, W. 1982. Distributed deadlock detection in ROBINSON, J. T. 1983. A framework for under- distributed database systems. Ph.D. dissertation, standing distributed (deadlock detection) algo- Univ. of Illinois at Urbana-Champaign, Urbana, rithms. In Proceedings of the Second ACM Ill. Symposium on Principles of Database Systems TSAI, W., AND BELFORD, G. 1982. Detecting dead- (Atlanta, Ga., Mar.). ACM, New York, pp. lock in distributed svstems. In IEEE ZNFOCOM. 192-201. IEEE, New York, pp. 89-95. MARSLAND, T. A., AND ISLOOR, S. S. 1980. Wuu, G. T., AND BERNSTEIN, A. J. 1985. False Detection of deadlocks in distributed database deadlock detection in distributed systems. IEEE systems. ZNFOR 18, 1 (Feb.), l-20. Tram. Softw. Erg. SE-11,8 (Aug.), 820-821.

Received May 1987; final revision accepted January 1988.

ACM Computing Surveys, Vol. 19, No. 4, December 1987