SURVEY & TUTORIAL SERIES
Deadlock Detection in Distributed Systems
Mukesh Singhal, Ohio State University
A distributed system is a network of sites that exchange information [...]

[...] each requesting an access to resources held by other processes. Unless the deadlock is resolved, all the processes involved are blocked indefinitely. Therefore, a deadlock requires the attention of a process outside those involved in the deadlock for its detection and resolution.

A deadlock is resolved by aborting one or more processes involved in the deadlock and granting the released resources to other processes involved in the deadlock. A process is aborted by withdrawing all its resource requests, restoring its state to an appropriate previous state, relinquishing all the resources it acquired after that state, and restoring all the relinquished resources to their original states. In the simplest case, a process is aborted by starting it afresh and relinquishing all the resources it held.

Resource vs. communication deadlock. Two types of deadlock have been discussed in the literature: resource deadlock and communication deadlock. In resource deadlocks, processes make access to resources (for example, data objects in database systems, buffers in store-and-forward communication networks). A process acquires a resource before accessing [...]

[...] deadlock detection in distributed systems, this article describes a series of deadlock detection techniques based on centralized, hierarchical, and distributed control organizations. The article complements one by Knapp, which discusses deadlock detection in distributed database systems.[1] Knapp emphasizes the underlying theoretical principles of deadlock detection and gives an example of each principle. In contrast, this article examines deadlock detection in distributed systems more from the point of view of its practical implications. It presents an up-to-date and comprehensive survey of deadlock detection algorithms, discusses their merits and
drawbacks, and compares their performance (delays as well as message complexity). Moreover, this article examines related issues, such as correctness of the algorithms, performance of the algorithms, and deadlock resolution, which require further research.

COMPUTER, November 1989. 0018-9162/89/1100-0037$01.00 © 1989 IEEE

Graph-theoretic model of deadlocks. The state of a system is in general dynamic; that is, system processes continuously acquire and release resources. Characterization of deadlocks requires a representation of the state of process-resource interactions. The state of process-resource interactions is modeled by a bipartite directed graph called a resource allocation graph. Nodes of this graph are processes and resources of a system, and edges of the graph depict assignments or pending requests. A pending request is represented by a request edge directed from the node of a requesting process to the node of the requested resource. A resource assignment is represented by an assignment edge directed from the node of an assigned resource to the node of the process assigned. For example, Figure 1 shows the resource allocation graph for two processes P1 and P2 and two resources R1 and R2, where edges R1 → P1 and R2 → P2 are assignment edges and edges P2 → R1 and P1 → R2 are request edges.

Figure 1. Resource allocation graph.

A system is deadlocked if its resource allocation graph contains a directed cycle in which each request edge is followed by an assignment edge. Since the resource allocation graph of Figure 1 contains a directed cycle, processes P1 and P2 are deadlocked. A deadlock can be detected by constructing the resource allocation graph and searching it for cycles.

In a distributed database system (DDBS), the user accesses the data objects of the database by executing transactions. A transaction can be viewed as a process that performs a sequence of reads and writes on data objects. The data objects of a database can be viewed as resources that are acquired (by locking) and released (by unlocking) by transactions. In DDBS literature the resource allocation graph is referred to as a transaction-wait-for (TWF) graph.[3] In a TWF graph, nodes are transactions and there is a directed edge from node T1 to node T2 if T1 is blocked and must wait for T2 to release some data object. A system is deadlocked if and only if there is a directed cycle in its TWF graph. Since both graphs denote the state of process-resource interaction, we will collectively refer to them as state graphs.

Deadlock-handling strategies

The three strategies for handling deadlocks are deadlock prevention, deadlock avoidance, and deadlock detection. In deadlock prevention, resources are granted to requesting processes in such a way that a request for a resource never leads to a deadlock. The simplest way to prevent a deadlock is to acquire all the needed resources before a process starts executing. In another method of deadlock prevention, a blocked process releases the resources requested by an active process.

In the deadlock avoidance strategy, a resource is granted to a process only if the resulting state is safe. (A state is safe if there is at least one execution sequence that allows all processes to run to completion.) Finally, in the deadlock detection strategy, resources are granted to a process without any check. Periodically (or whenever a request for a resource has to wait) the status of resource allocation and pending requests is examined to determine if a set of processes is deadlocked. This examination is performed by a deadlock detection algorithm. If a deadlock is discovered, the system recovers from it by aborting one or more deadlocked processes.

The suitability of a deadlock-handling strategy greatly depends on the application. Both deadlock prevention and deadlock avoidance are conservative, overly cautious strategies. They are preferred if deadlocks are frequent or if the occurrence of a deadlock is highly undesirable. In contrast, deadlock detection is a lazy, optimistic strategy, which grants a resource to a request if the resource is available, hoping that this will not lead to a deadlock.

Deadlock handling is complicated in distributed systems because no site has accurate knowledge of the current state of the system and because every intersite communication involves a finite and unpredictable delay. Next, we examine the complexity and practicality of the three deadlock-handling approaches in distributed systems.

Deadlock prevention. Deadlock prevention is commonly achieved either by having a process acquire all the needed resources simultaneously before it begins executing or by preempting a process that holds the needed resource. In the former method, a process requests (or releases) a remote resource by sending a request message (or release message) to the site where the resource is located. This method has the following drawbacks:

(1) It is inefficient because it decreases system concurrency.

(2) A set of processes may get deadlocked in the resource-acquiring phase. For example, suppose process P1 at site S1 and process P2 at site S2 simultaneously request two resources R3 and R4 located at sites S3 and S4, respectively. It may happen that S3 grants R3 to P1 and S4 grants R4 to P2, resulting in a deadlock. This problem can be handled by forcing processes to acquire needed resources one by one, but that approach is highly inefficient and impractical.

(3) In many systems future resource requirements are unpredictable (not known a priori).

In the latter method, an active process forces a blocked process, which holds the needed resource, to abort. This method is inefficient because several processes may be aborted without any deadlock.

Deadlock avoidance. For deadlock avoidance in distributed systems, a resource is granted to a process if the resulting global system state is safe (the global state includes all the processes and resources of the distributed system). The following problems make deadlock avoidance impractical in distributed systems:

(1) Because every site has to keep track of the global state of the system, huge storage capacity and extensive communication ability are necessary.

(2) The process of checking for a safe global state must be mutually exclusive. Otherwise, if several sites concurrently perform checks for a safe global state (each site for a different resource request), they
may all find the state safe but the net global state may not be safe. This restriction severely limits the concurrency and throughput of the system.

(3) Due to the large numbers of processes and resources, checking for safe states is computationally expensive.

Deadlock detection. Deadlock detection requires examination of process-resource interactions for the presence of cyclic wait. In distributed systems deadlock detection has two advantages:

(1) Once a cycle is formed in the state graph, it persists until it is detected and broken.

(2) Cycle detection can proceed concurrently with the normal activities of a system; therefore, it does not have a serious effect on system throughput.

For these reasons the literature on deadlock handling in distributed systems is highly biased toward deadlock detection.

Issues in deadlock detection

Deadlock detection involves two basic tasks: maintenance of the state graph and search of the state graph for the presence of cycles. Because in distributed systems a cycle may involve several sites, the search for cycles greatly depends on how the system state graph is represented across the system.

Classified according to the way state graph information is maintained and the search for cycles is carried out, the three types of algorithms for deadlock detection in distributed systems are centralized, distributed, and hierarchical algorithms. In centralized algorithms the state graph is maintained at a single designated site, which has the sole responsibility of updating it and searching it for cycles. In distributed algorithms the state graph is distributed over many sites of the system, and a cycle may span state graphs located at several sites, making distributed processing necessary to detect it. In centralized algorithms the global state of the system is known and deadlock detection is simple. In distributed algorithms the problem of deadlock detection is more complex because no site may have accurate knowledge of the system state. In hierarchical algorithms sites are arranged in a hierarchy, and a site detects deadlocks involving only its descendant sites. Hierarchical algorithms exploit access patterns local to a cluster of sites to efficiently detect deadlocks.

Correctness of deadlock detection algorithms. To be correct, a deadlock detection algorithm must satisfy two criteria:

(1) No undetected deadlocks: the algorithm must detect all existing deadlocks in finite time.

(2) No false deadlocks: the algorithm should not report nonexistent deadlocks.

In distributed systems where there is no global memory and communication occurs solely by messages, it is difficult to design a correct deadlock detection algorithm because sites may receive out-of-date and inconsistent state graphs of the system. As a result, sites may detect a cycle that never existed but whose different segments existed in the system at different times. That is why many deadlock detection algorithms reported in the literature are incorrect.

Strengths and weaknesses of centralized algorithms. In centralized deadlock detection algorithms, a designated site, often called the control site, has the responsibility of constructing the global state graph and searching it for cycles. The control site may maintain the global state graph all the time, or it may build it whenever deadlock detection is to be carried out by soliciting the local state graph from every site. Centralized algorithms are conceptually simple and are easy to implement. Deadlock resolution is simple in these algorithms: the control site has the complete information about the deadlock cycle, and it can optimally resolve the deadlock.

However, because control is centralized at a single site, centralized deadlock detection algorithms have a single point of failure. Communication links near the control site are likely to be congested because the control site receives state graph information from all the other sites. Also, the message traffic generated by deadlock detection activity is independent of the rate of deadlock formation and the structure of deadlock cycles.

Strengths and weaknesses of distributed algorithms. In distributed deadlock detection algorithms, the responsibility of detecting a global deadlock is shared equally among the sites. The global state graph is spread over many sites, and several sites participate in the detection of a global cycle. Unlike centralized algorithms, distributed algorithms are not vulnerable to a single point of failure, and no site is swamped with deadlock detection activity. Deadlock detection is initiated only if a waiting process is suspected to be part of a deadlock cycle.

But deadlock resolution is often cumbersome in distributed deadlock detection algorithms because several sites may detect the same deadlock and may not be aware of other sites and/or processes involved in the deadlock. Distributed algorithms are difficult to design because sites may collectively report the existence of a global cycle after seeing its segments at different instants (though all the segments never existed simultaneously) due to the system's lack of globally shared memory. Also, proof of correctness is difficult for these algorithms.

Strengths and weaknesses of hierarchical algorithms. In hierarchical deadlock detection algorithms, sites are arranged hierarchically, and a site detects deadlocks involving only its descendant sites. To efficiently detect deadlocks, hierarchical algorithms exploit access patterns local to a cluster of sites. They tend to get the best of both worlds: they have no single point of failure (as centralized algorithms have), and a site is not bogged down by deadlock detection activities that it is not concerned with (as sometimes happens in distributed algorithms). For efficiency, most deadlocks should be localized to as few clusters as possible; the objective of hierarchical algorithms will be defeated if most deadlocks span several clusters.

Next, we describe a series of centralized, distributed, and hierarchical deadlock detection algorithms. We discuss the basic idea behind their operations, compare them with each other, and discuss their pros and cons. We also summarize the performance of these algorithms in terms of message traffic, message size, and delay in detecting a deadlock (see Table 1). It is not possible to enumerate these performance measures with high accuracy for many deadlock detection algorithms for the following reasons: the random nature of the TWF graph topology, the invocation of deadlock detection activities even though there is no deadlock, and the initiation of deadlock detection by several processes in a deadlock cycle. Therefore, for most algorithms we have given performance bounds rather than exact numbers (for example, the maximum number of messages transferred to detect a global cycle).
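The second of the two tasks named at the start of this section, searching a state graph for cycles, can be sketched for the simple case in which one site holds the entire TWF graph (as the control site does in a centralized algorithm). The following is a minimal illustration, not one of the surveyed algorithms; the function name `find_cycle` and the dictionary representation of the graph are assumptions made for the example.

```python
from typing import Dict, List, Optional, Set


def find_cycle(twf: Dict[str, Set[str]]) -> Optional[List[str]]:
    """Return one directed cycle of a wait-for graph, or None if it has none.

    twf maps each transaction to the set of transactions it waits for
    (an assumed representation of a TWF graph held at one site).
    """
    WHITE, GREY, BLACK = 0, 1, 2      # unvisited, on the current path, finished
    color = {node: WHITE for node in twf}
    for targets in twf.values():
        for t in targets:
            color.setdefault(t, WHITE)
    path: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        color[node] = GREY
        path.append(node)
        for nxt in sorted(twf.get(node, ())):
            if color[nxt] == GREY:    # back edge: nxt is already on the path
                return path[path.index(nxt):] + [nxt]
            if color[nxt] == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for start in sorted(color):
        if color[start] == WHITE:
            found = visit(start)
            if found:
                return found
    return None
```

For the graph `{"T1": {"T2"}, "T2": {"T3"}, "T3": {"T1"}}` the function returns `["T1", "T2", "T3", "T1"]`, and for a graph without a cycle it returns `None`. In a distributed system no single site necessarily holds `twf` in full, which is the difficulty the algorithms surveyed in this article have to work around.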
Table 1. Performance comparison of distributed deadlock detection algorithms. Columns: Algorithm | Number of Messages | Delay | Message Size. Rows: Goldman [...]

[...] algorithm, Ho and Ramamoorthy have presented two centralized deadlock detection algorithms called the two-phase and one-phase algorithms. These algorithms, [...]

[...]tency in state information by using only the information common to both tables. For example, if the resource table at S1 indicates that resource R1 is being waited for by a process P2 (i.e., R1 ← P2) and the process table at S2 indicates that process P2 is waiting for resource R1 (i.e., P2 → R1), then edge P2 → R1 in the constructed state graph correctly reflects the system state. If either of these entries is missing from the resource or process table, then a request message or release message from S2 to S1 is in transit and P2 → R1 cannot be ascertained. The one-phase algorithm is faster and requires fewer messages than the two-phase algorithm. But it requires more storage because every site maintains and exchanges two status tables.

Distributed deadlock detection algorithms

In distributed algorithms all sites cooperate to detect a cycle in the state graph, which is distributed over several sites of the system. Deadlock detection can be initiated whenever a process is forced to wait, and it can be initiated either by the local site of the process or by the site where the process waits. Information about the state graph can be maintained and circulated in various forms (for example, table, list,[6] and probe) during the deadlock detection phase.

Figure 2. Example of OBPL (ordered blocked process list).

Figure 3. Example of Goldman's algorithm.

Goldman's algorithm. Goldman's algorithm[6] exchanges deadlock-related information in the form of an ordered blocked process list (OBPL), in which each process (except the last) is blocked by its successor. The last process in an OBPL may either be waiting to access a resource or be running.
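An OBPL can be held as a plain ordered list. The sketch below is a minimal, single-program illustration under stated assumptions: the map `blocked_by` (each process mapped to the process holding the resource it waits for, or `None` if it is running) and both function names are hypothetical, and the list is expanded in one place rather than being passed between processes in messages as in Goldman's algorithm. `is_obpl` checks the defining property given above; `expand_obpl` appends the blocker of the last process until that process turns out to be running (the list is discarded) or to be blocked by a process already in the list (a deadlock).

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical wait-for map: process -> process holding the resource it
# waits for, or None if the process is running (not blocked).
BlockedBy = Dict[str, Optional[str]]


def is_obpl(obpl: List[str], blocked_by: BlockedBy) -> bool:
    """True if every process except the last is blocked by its successor."""
    return all(blocked_by.get(p) == q for p, q in zip(obpl, obpl[1:]))


def expand_obpl(obpl: List[str], blocked_by: BlockedBy) -> Tuple[str, List[str]]:
    """Grow the list from its last process; report 'deadlock' or 'discarded'."""
    obpl = list(obpl)                  # do not modify the caller's list
    while True:
        holder = blocked_by.get(obpl[-1])
        if holder is None:             # last process is running
            return "discarded", obpl
        if holder in obpl:             # last process is blocked by a listed process
            return "deadlock", obpl
        obpl.append(holder)
```

The sketch relies on each process having at most one outstanding request, so that `blocked_by` maps a process to a single blocker.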
For example, OBPL P1, P2, P3, P4 represents the state graph in Figure 2.

The algorithm detects a deadlock by repeatedly expanding the OBPL, appending the process that holds the resource needed by the last process in the list until either a deadlock is discovered (that is, the last process is blocked by a process in the list) or the OBPL is discarded (the last process is running). As an example, suppose in the system shown in Figure 3 process P1 initiates deadlock detection and sends OBPL P1, P2 to process P2. When process P2 receives the OBPL, it appends P3 to the OBPL and sends the new OBPL P1, P2, P3 to P3. Likewise, P3 sends OBPL P1, P2, P3, P4 to P4 and P4 sends OBPL P1, P2, P3, P4, P5 to P5. When P5 receives the OBPL, it discards the OBPL because it is not blocked. Had P5 been blocked by P1, P2, P3, or P4, a deadlock would have been detected by P5.

An advantage of Goldman's algorithm is that it does not require continuous maintenance of TWF graphs. It constructs an OBPL whenever deadlock detection is to be carried out. However, it requires that every process have at most one outstanding resource request.

Isloor-Marsland algorithm. The "on-line" deadlock detection algorithm of Isloor and Marsland detects deadlocks at the earliest possible instant, that is, at the time of making decisions about data allocation at the concerned site. It is based on the concept of reachable set. The reachable set of a node in the state graph is the set of all the nodes that can be reached from it. A process is deadlocked if the reachable set of the corresponding node contains the node itself.

The algorithm detects deadlocks by constructing reachable sets and checking whether any node belongs to its own reachable set. To do this, every site maintains the system state graph and reachable sets for each node in the state graph; the reachable sets are continually updated whenever edges are added to or deleted from the state graph.
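The reachable-set test can be sketched as follows. This is a minimal illustration, not Isloor and Marsland's implementation: the dictionary representation and the function names are assumptions, and the sets are recomputed from scratch for clarity, whereas the algorithm as described keeps them updated as edges are added and deleted.

```python
from typing import Dict, Set


def reachable_sets(edges: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Map every node to the set of nodes reachable by following one or more edges."""
    nodes = set(edges) | {t for targets in edges.values() for t in targets}
    reach: Dict[str, Set[str]] = {n: set() for n in nodes}
    for n in nodes:
        stack = list(edges.get(n, ()))
        while stack:
            m = stack.pop()
            if m not in reach[n]:
                reach[n].add(m)
                stack.extend(edges.get(m, ()))
    return reach


def deadlocked_nodes(edges: Dict[str, Set[str]]) -> Set[str]:
    """Nodes that belong to their own reachable set, i.e. lie on a cycle."""
    return {n for n, r in reachable_sets(edges).items() if n in r}
```

With `edges = {"P1": {"P2"}, "P2": {"P1"}, "P3": {"P1"}}`, `deadlocked_nodes(edges)` is `{"P1", "P2"}`. P3 waits for a deadlocked process, but its own reachable set `{"P1", "P2"}` does not contain P3, so the test as stated does not flag it.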
Whenever a resource is allocated, whenever a process is made to wait for a resource, or whenever a process releases a resource, the corresponding information is broadcast to all other sites. Therefore, if r changes per second occur in the state graph, then the algorithm requires r(N - 1) messages per second for deadlock detection (N being the number of sites, since each change is sent to every other site). However, the messages are short because they contain only an update to the state graph resulting from the execution of a request.

Menasce-Muntz algorithm. The deadlock detection algorithm of Menasce and Muntz[3] propagates only the two end points of a directed path (called a blocking pair), rather than the whole path, to detect deadlocks. The blockingset(T) of a transaction T is the set of all nonblocked transactions that can be reached from T by following all directed paths in the TWF graph. This is the set of transactions responsible for blocking the transaction T. When a transaction T gets blocked, then for each transaction T' in the blockingset(T), the algorithm sends the blocking pair (T, T') to the home sites of T and T'. (In other words, information about the condensed TWF graph is sent along the paths of the global TWF graph.)

Figure 4 illustrates the algorithm for a deadlock involving three transactions T1, T2, and T3. Initially T1 is blocked by T2 and T2 by T3, and the home sites of T1 and T2 have the knowledge of the TWF graph T1 → T2 and T2 → T3, respectively. Now, when T3 makes a request and is blocked by T1, a blocking pair (T3, T') is sent to the home site of the second transaction T' in the pair. This causes an edge from T3 to T' to be added in the TWF graph of that home site, resulting in a cycle and detection of a deadlock at that site.

Figure 4. Example of Menasce-Muntz algorithm (panels: before T3 waits for T1; after T3 waits for T1).

Gligor and Shattuck[4] have shown that this algorithm fails to detect some deadlocks for two reasons: First, in the case of a nonlocal request, the determination of whether a transaction is blocked or not is incorrect because that determination cannot be made until the response arrives from a remote site. Second, even if this response arrives to determine correctly whether a transaction is blocked or not, the algorithm does not make use of it. Gligor and Shattuck have fixed this algorithm by precisely defining the status of all the transactions, whether active, blocked, or waiting for the outcome of a nonlocal resource request, and by having the algorithm propagate appropriate blocking pairs when it becomes certain that a waiting transaction is blocked.

Obermarck's algorithm. In Obermarck's algorithm, the nonlocal portion of the global TWF graph at a site is abstracted by a distinct node, called "external" or Ex, which helps determine potential multisite deadlocks without requiring a huge global TWF graph to be stored at each site. Deadlock detection at a site follows this iterative process:

(1) A site waits for deadlock-related information (produced in the previous deadlock detection iteration) from other sites.

(2) The site combines the received information with its local TWF graph, detects all cycles, and breaks only cycles that do not contain the node Ex.

(3) For all cycles Ex → T1 → T2 → Ex that contain the node Ex (these cycles are potential candidates for global deadlocks), the site transmits them in string form Ex, T1, T2, Ex to all other sites.

The algorithm reduces message traffic by lexically ordering the nodes (transactions) and sending a string Ex, T1, T2, T3, Ex to other sites only if T1 is higher than T3 in the lexical ordering.

Chandy-Misra-Haas algorithm. Chandy, Misra, and Haas's algorithm uses a special message called a probe. A probe is a triplet (i, j, k) denoting that it belongs to a deadlock detection initiated for process Pi and is being sent by the home site of Pj to the home site of Pk. A probe message travels along the edges of the global TWF graph, and a deadlock is discovered when a probe message returns to its initiating process. As an example, consider the system shown in Figure 5. If process P1 initiates deadlock detection, it sends probe (1, 3, 4) to the controller C2 at site S2. Since P6 is waiting for P8 and P7 is waiting for P10, C2 sends probes (1, 6, 8) and (1, 7, 10) to C3, which in turn sends probe (1, 9, 1) to C1. On receipt of probe (1, 9, 1), C1 declares that P1 is deadlocked.

Figure 5. Example of Chandy-Misra-Haas algorithm.

In Haas and Mohan's algorithm, a variation of Chandy, Misra, and Haas's algorithm, a process comes to know (besides detecting that it is deadlocked) all the deadlock cycles in which it is involved. The algorithm achieves this by passing more information about the potential cycles in a probe message. In this algorithm a message consists of not only the information about the initiator of the deadlock, but also all the paths to it.

Mitchell-Merritt algorithm. In the deadlock detection algorithm of Mitchell and Merritt, each node of the TWF graph has two labels: private and public. The private label of each node is unique to that node, and initially both labels have the same value. The algorithm detects a deadlock by propagating the public labels of nodes in a backward direction in the TWF graph. When a transaction gets blocked, the public and private labels of its node in the TWF graph are changed to a value greater than their previous values and greater than the public label of the blocking transaction. (Blocked transactions update their labels in this manner periodically.) A deadlock is detected when a transaction receives its own public label. In essence, the largest public label propagates in a backward direction in a deadlock cycle. Deadlock resolution is simple in this
algorithm because only one process detects a deadlock (that process can resolve the deadlock by aborting itself).

Sugihara et al.'s algorithm. As in Obermarck's algorithm, in Sugihara et al.'s algorithm every site maintains only a local TWF graph. A wait for a remote resource is reflected by adding a global edge to the local TWF graphs of the site of the requesting process and the site of the process holding the requested resource. In a global edge the nodes corresponding to the requesting process and the process holding the requested resource are referred to as the O-node and the I-node, respectively (Figure 6).

Figure 6. I-nodes and O-nodes.

A site initiates deadlock detection by sending a message, similar to the probe message of Chandy, Misra, and Haas, whenever the addition of an edge (due to a resource wait) in its TWF graph creates a new path between any of its I-nodes and O-nodes. Note that a site can be involved in a global deadlock only when there is a path between some of its I-nodes and O-nodes. The algorithm has a unique resolver for every deadlock, and deadlock resolution does not cause detection of false deadlocks. An advantage of the algorithm is that a site maintains minimal information about the global TWF graph.

Sinha-Natarajan algorithm. Sinha and Natarajan's algorithm does not construct the TWF graph, but it follows the edges of the graph to search for cycles. Transactions are prioritized, and an antagonistic conflict is said to occur when a transaction waits for a data object that is locked by a lower-priority transaction. The algorithm initiates deadlock detection only when an antagonistic conflict occurs, rather than whenever a transaction begins to wait for another transaction. Therefore, it requires fewer messages to detect deadlocks and generates fewer messages during normal conditions.

The algorithm detects a deadlock by circulating a probe message through a cycle in the global TWF graph. A probe message is a 2-tuple (i, j) where i is the transaction that faced the antagonistic conflict and initiated deadlock detection and j is the transaction of the lowest priority among all the transactions (nodes of the TWF graph) the probe has traversed so far. When a waiting transaction receives a probe initiated by a lower-priority transaction, the probe is discarded. (Thus, the algorithm filters out redundant messages.) An interesting property of this algorithm is that a deadlock is detected when the probe issued by the highest-priority process in the cycle returns to that process. (There is only one detector of every deadlock.) Deadlock resolution is simple; the detector of a deadlock can resolve the deadlock by aborting the lowest-priority transaction of the cycle. Choudhary et al. have shown that this algorithm detects false deadlocks and fails to report all deadlocks because it overlooks the possibility of a transaction waiting transitively on a deadlock cycle and because probes of aborted transactions are not deleted properly.[13]

Badal's algorithm. Badal's algorithm[14] exploits the fact that deadlocks can be divided into several categories based on the complexity of their topology; the frequency of deadlock occurrence and the costs of deadlock detection differ among categories. There is no point in detecting simple deadlocks with an algorithm designed to detect complex deadlocks. Badal optimizes performance by using three levels of deadlock detection; activity at each level is more complex (and expensive) than at the preceding level. Deadlock detection starts at the first-level algorithm and is delegated to the next higher level if the current-level algorithm fails to report a deadlock. The third-level algorithm is designed to detect global deadlocks that escape the first two levels, and it closely resembles Obermarck's algorithm.

The most attractive feature of Badal's algorithm is that it detects the most frequent deadlocks with minimum overhead (first- and second-level algorithms) and switches to an expensive algorithm (third level) only when really needed. However, it has a fixed overhead due to information kept in lock tables, frequent checking of deadlocks of length two, and longer messages. Consequently, it is most suitable in environments where deadlocks occur frequently, justifying the fixed overhead.

Bracha-Toueg algorithm. In Bracha and Toueg's deadlock detection algorithm for generalized environments, called the r-out-of-s request model,[15] a process can request any r resources from a pool of s resources. After issuing an r-out-of-s request, a process remains blocked until it gets any r out of the s resources. The algorithm consists of two phases: notify and grant. In the first phase, notify messages are propagated downward in forestlike patterns of the TWF graph; in the second phase, grant messages are echoed back from all active processes, simulating the granting of resources to requests. At the end of the second phase, all the processes that are not made active are deadlocked.

Because the system state is dynamic, the TWF graph may change during the execution of the algorithm. The algorithm overcomes such changes by propagating special Freeze messages throughout the system. When a process receives a Freeze message, it saves a snapshot of its state. Deadlocks are detected by running the deadlock detection algorithm on the collection of snapshots thus obtained.

Hierarchical deadlock detection algorithms

In hierarchical algorithms sites are (logically) arranged in hierarchical fashion, and a site is responsible for detecting deadlocks involving only its children sites. To optimize performance, these algorithms take advantage of access patterns localized to a cluster of sites.

Menasce-Muntz algorithm. In the hierarchical deadlock detection algorithm of Menasce and Muntz,[3] all the resource controllers are arranged in tree fashion. The controllers at the bottommost level, called leaf controllers, manage resources; the others, called nonleaf controllers, are responsible for deadlock detection. A leaf controller maintains the part of the global TWF graph that is concerned with the allocation of the resources at that leaf controller. A nonleaf controller maintains the TWF graph spanning its children controllers and is responsible for detecting only deadlocks involving its own leaf controllers. Whenever a change occurs in a controller's TWF graph due to a resource allocation, wait, or release, the change is propagated to its parent controller. The parent controller makes the changes in its TWF graph, searches for cycles, and propagates the changes upward if necessary. A nonleaf controller can be kept up to date about the TWF graphs of its children continuously (that is, whenever a change occurs) or periodically.

Ho-Ramamoorthy algorithm. In the hierarchical algorithm of Ho and Ramamoorthy, sites are grouped into several disjoint clusters. Periodically, a site is chosen as the central control site, which dynamically chooses a control site for each cluster (Figure 7). The central site requests every control site for its intercluster transaction status information and wait-for relations. As a result, a control site collects status tables from all the sites in its cluster and applies the one-phase deadlock detection algorithm to detect all deadlocks involving only intracluster transactions. Then it sends intercluster transaction status information and wait-for relations (derived from the information thus collected) to the central site. The central site splices the intercluster information thus received, constructs a system state graph, and searches it for cycles. Thus, a control site detects all deadlocks located in its cluster, and the central site detects all intercluster deadlocks.

Figure 7. Example of Ho-Ramamoorthy algorithm.

Future research directions

Several issues related to deadlock detection in distributed systems have not been adequately studied and require further research.

Algorithm correctness. There is a dearth of sophisticated formal methods to prove the correctness of deadlock detection algorithms for distributed systems. Most researchers have used informal, intuitive arguments to show the correctness of their algorithms. But intuition has proved to be highly unreliable, and more than half the algorithms have been found incorrect. A formal proof of the correctness of deadlock detection algorithms is difficult for several reasons:

(1) The TWF graph and deadlock cycles can form in innumerable ways, making it difficult to imagine and exhaustively study all conceivable situations.

(2) Deadlock is very sensitive to the timing of requests.

[...] deceptive because deadlock detection algorithms also exchange messages during normal conditions (when there is no deadlock). The number of messages exchanged may not be a true indicator of communication overhead because some algorithms exchange long messages whereas others exchange short messages. Therefore, we require a different criterion for computing communication overhead, which should take into account the number as well as the size of messages exchanged, not only in deadlocked conditions but also in normal conditions.

[...] dissemination of information about the state graph, which implies high message traffic.

There is another dimension to the trade-off between message traffic and deadlock persistence time. For example, Chandy, Misra, and Haas's algorithm requires short messages, but when it detects a deadlock, it takes a while to resolve it. In contrast, Haas and Mohan's algorithm exchanges longer messages; however, when a process detects a deadlock, it knows all the processes involved in it, and therefore the deadlock can be resolved quickly.

Besides communication overhead and deadlock persistence time, any evaluation of deadlock detection algorithms should consider measures such as storage overhead for deadlock detection information, processing overhead to search for cycles, and additional processing overhead to optimally resolve deadlocks. Among the factors that influence these measures are the techniques used for deadlock detection, the data access behavior of processes, the request-release pattern of processes, and resource holding time. How these factors influence performance and how the performance characteristics of different detection algorithms compare with each other are not well understood. A complete performance study of deadlock detection algorithms calls for the development of performance models, the measurement of performance using analytic or simulation techniques, and a performance comparison of existing algorithms.

Deadlock resolution. Persistence of a deadlock has two major disadvantages: First, all the resources held by deadlocked processes are not available to any other process. Second, the deadlock persistence time gets added to the response time of each process involved in the deadlock. Therefore, the problem of promptly and
The persistence of deadlocks results in efficiently resolving a detected deadlock is (3) In distributed systems, message wasteful utilization of resources and in- as important as the problem of deadlock delays are unpredictable and there is no creased response time to user requests. detection itself. Unfortunately, most dead- global memory. Therefore, an important performance lock detection algorithms for distributed Time-dependent proof techniques are par- measure of deadlock detection algorithms systems do not address the problem of ticularly necessary. is the average deadlock persistence time. deadlock resolution. There is often a trade-off between message A deadlock is resolved by aborting at Algorithm performance. Although traffic and deadlock persistence time. For least one process involved in the deadlock many deadlock detection algorithms have example, although the on-line deadlock and granting the released resources to other been proposed for distributed systems, detection algorithm of Isloor and Marsland processes involved in the deadlock. Effi- their performance analysis has not re- detects a deadlock at the earliest instant, it cient resolution of a deadlock requires ceived sufficient attention. Most authors has high message traffic. On the other knowledge of all the processes involved in (for example, Obermarck and Sinha and hand, Obermarck's algorithm has less the deadlock and all resources held by Natarajan) have evaluated their algorithms message traffic, but its deadlock persis- these processes. When a deadlock is de- on the basis of the number of messages tence time is proportional to the size of the tected, the speed of its resolution depends exchanged to detect an existing cycle in the cycle. This trade-off is intuitive - quick on how much information about it is avail- TWF graph. 
This performance criterion is detection of deadlock requires fast dis- able, which in turn depends on how much November 1989 45 information concerning the victim at all sites. Execution of the second step is compli- cated in environments where a process can simultaneously wait for multiple resources because the allocation of a released re- source to another process can cause a dead- lock. The third step is even more critical Site S2 Site S4 because if the information about the victim is not deleted quickly and properly, it may be counted in several other (false) cycles, causing detection of false deadlocks. As Choudhary et al. point out, the failure to delete probe messages in the Sinha-Nata- rajan algorithm causes the detection of false deadlocks. To be safe, during the Site S3 execution of the second and third steps, the deadlock detection process (at least in potential deadlocks that include the vic- tim) must be halted to avoid detection of false deadlocks. In the Sugiharaet al. algo- 7igure 8. Detection of a false deadlock. rithm, a control token serializes global deadlock resolution to eliminate its side effects on the deadlock detection process. False deadlocks. In environments where information is passed around during the ume of information exchanged during the a process can simultaneously wait for deadlock detection phase. In existing dis- deadlock detection phase and the amount multiple resources, deadlock resolution is tributed deadlock detection algorithms, of time needed to resolve a deadlock once even more complex because an edge may deadlock resolution is complicated by at it is detected. For example, Haas and be shared by two or more cycles, and delet- least one of the following problems: Mohan’s algorithm exchanges long mes- ing that edge will break all those dead- sages, but when a deadlock is detected, its locks. However, since the search for each A process that detects a deadlock does resolution is quick. 
On the other hand, in cycle is carried out independently, dead- not know all the processes (and resources Menasce and Muntz’s algorithm the mes- lock detection initiated for some cycles held by them) involved in the deadlock - sages exchanged are short, but when a may not be aware of the deleted edge, for example, the algorithms of Chandy, deadlock is detected, its resolution is tedi- resulting in detection of false deadlocks. Misra, and Haas and Menasce and Muntz. ous and time consuming. Figure 8 illustrates such a scenario. Two Two or more processes may independ- Whether it is better to exchange long deadlocks share an edge (T4, TJ. Suppose ently detect the same deadlock - for ex- messages during the deadlock detection the top cycle has been detected by process ample, the Chandry-Misra-Haas and phase and resolve a detected deadlock T,, which is breaking it by deleting the Goldman algorithms. If every process that quickly or to exchange short messages edge (T4,T,). Concurrently process T, may detects a deadlock resolves it, then dead- during the deadlock detection phase and do initiate a deadlock detection message, and lock resolution will be inefficient because extra computation to resolve a detected it may happen that T, breaks the edge (T4, several processes will be aborted to re- deadlock depends on how frequently dead- T,) after the deadlock detection message solve a deadlock (different processes may locks occur in a system. If deadlocks are initiated by T, has crossed (or traversed) it. choose to abort different processes). frequent, then the former approach should In this case, T, will detect a (false) dead- Therefore, we need some postdetection perform better and vice versa. Even after lock involving processes T4,T,, T,, T,, and processing to select aprocess to be respon- all the information necessary to resolve a T,, which has already been broken by T,. sible for resolving the deadlock. 
deadlock is available, resolution involves In brief, deadlock detection involves the following nontrivial steps: detecting a static condition -once a dead- Many deadlock detection algorithms lock cycle is formed, it persists until it is require an additional round of message (1) Select a victim (the process to be detected and broken. On the other hand, exchanges to select a deadlock resolver aborted) for the optimal resolution of a deadlock resolution is a dynamic process and/or to gather the information needed to deadlock (this step may be computation- -it changes the state graph by deleting its efficiently resolve a deadlock. The Sinha- ally tedious). edges and nodes. Two forces are working Natarajan algorithm is one of the excep- (2) Abort the victim, release all the in opposite directions: the wait for re- tions where each deadlock is detected only resources held by it, restore all the released sources adds edges and nodes to the state by the highest-priority process that (upon resources to their previous states, and grant graph, while deadlock resolution removes deadlock detection) knows the lowest-pri- the released resources to deadlocked pro- them from the state graph. Therefore, if ority processes in the deadlock cycle. cesses. deadlock resolution is not carefully incor- There is often a trade-off between the vol- (3) Delete all the deadlock detection porated into deadlock detection, false 46 COMPUTER deadlocks are likely to be detected. Theform in which the algorithm main- tributed Database Systems,” ACM Comput- tains and passes around information ing Surveys, Dec. 1987, pp. 303-328. Deadlock probability. The frequency about the process-resource interaction. 3. D.E. Menasce and R.R. Muntz, “Locking of deadlocks is a crucial factor in the de- Goldman’s algorithm uses lists, Hass- and Deadlock Detection in Distributed sign of distributed systems. If deadlocks Mohan’s and Obermarck’s use strings, Databases,” IEEE Trans. 
Software Engi- are infrequent, then a time-out mechanism Isloor-Marsland’s uses sets, and Menasce- neering, May 1979, pp. 195-202. is the best approach to handling deadlocks Muntz’s uses the condensed TWF graph. 4. V.D. Gligor and S.H. Shattuck, “On Dead- because it has very low overhead. In a The way the algorithm conducts the lock Detection in Distributed Systems,’’ time-out mechanism a transaction or a search for cycles. Obermarck’s algorithm IEEE Trans. Software Engineering, Sept. process is aborted after it has waited for sends lists of the edges of a directed path in 1980, pp. 435-440. more than a specified period, called the the state graph; Chandy et al.’s and Sinha- time-out interval, after issuing a resource Natarajan’s circulate a probe message 5. G.S. Ho and C.V. Ramamoorthy, “Proto- cols for Deadlock Detection in Distributed request. The most critical issue in a time- along the edges of the state graph; Me- Database Systems,” IEEE Trans. Software out mechanism is to choose an appropriate nasce-Muntz’s passes the condensed TWF Engineering, Nov. 1982, pp. 554-557. time-out interval; if the time-out interval is graph; Mitchell-Merritt’s passes a label. short, then many transactions may be The amount of information available 6. B. Goldman, “Deadlock Detection in Computer Networks,” Tech. Report MIT/ aborted unnecessarily (that is, without about a deadlock when it is detected. In the LCS/TR-185, MIT, Cambridge, Mass., being deadlocked), and if the time-out in- algorithms of Chandy et al., Menasce- Sept. 1977. terval is long, then deadlocks will persist Muntz, Sinha-Natarajan, and Sugihara et for a long time. The time-out mechanism is al., aprocess that detects adeadlock knows 7. L.M. Haas and C. 
Mohan, “A Distributed also susceptible to cyclic restarts, in which that it is deadlocked but does not know all Detection Algorithm for a Resource-Based System,” Research Report, IBM Research transactions are repeatedly aborted and the other processes involved in the dead- Laboratory, San Jose, Calif., 1983. restarted. lock; in Goldman’s and Haas-Mohan’s The probability of deadlocks depends algorithms, a process that detects a dead- 8. R. Obermarck, “Distributed Deadlock De- on factors such as process mix, resource lock knows all the other processes in- tection Algorithm,” ACM Trans. Database request and release patterns, resource hold- volved in that deadlock. Systems, June 1982, pp. 187-210. ing time, and the average number of data 9. M.K. Sinha and N. Natarajan, “A Priority- objects held (locked) by processes. The Although several deadlock detection Based Distributed Deadlock Detection probability of deadlocks is difficult to algorithms have been proposed for distrib- Algorithm,” IEEE Trans. Sofrware Engi- analyze because deadlock occurrence is uted systems, a number of issues remain to neering, Jan. 1985, pp. 67-80. highly sensitive to the timing and order in be addressed. Future research should focus 10. S.S. Isloor and T.A. Marsland, “An Effec- which resource requests are made. Gray et on efficient resolution of deadlocks, cor- tive On-line Deadlock Detection Technique have done an approximate analysis of rectness of distributed deadlock detection, for Distributed Database Management deadlock probability and found that modeling and performance analysis of Systems,” Proc. Compsac 78, NOV. 1978, deadlock detection algorithms, and the pp. 283-288. (1) transaction waits and deadlocks are probability of deadlocks in distributed rare, but they both increase linearly 11. D.P. Mitchell and M.J. Merritt, “A Distrib- with the degree of multiprogram- systems. 0 uted Algorithm for Deadlock Detection and ming, Resolution,” Proc. ACM Conf. 
Principles of Distributed Computing, Aug. 1984, pp. (2) most deadlocks are of length (size) 282-284. two, (3) deadlocks rise as the fourth power Acknowledgments 12. K. Sugihara et al., “A Distributed Algo- of transaction size, and rithm for Deadlock Detection and Resolution,” Proc. Fourth Symp. Reliabil- (4) waits rise as the second power of The author is deeply indebted to Virgil Gligor ity in Distributed Software and Database transaction size. of the University of Maryland, T.V. Lakshman Systems, Oct. 1984, pp. 169-176. The probability of deadlock is an impor- of Bellcore, and seven anonymous referees whose comments on an earlier version were 13. A.L. Choudhary et al., “A Modified Prior- tant parameter, and any further work in instrumental in improving the quality of pres- ity-Based Probe Algorithm for Distributed this direction will be a worthwhile con- entation and enhancing the technical content of Deadlock Detection and Resolution,” IEEE tribution. this article. The author is also thankful to Bruce Trans. Software Engineering, Jan. 1989, Shriver, editor-in-chief of Computer, for his pp. 10-17. encouragement and valuable suggestions for revising the article. 14. D.J. Badal, “The Distributed Deadlock o sum up, of the three types of al- Detection Algorithm,” ACM Trans. Com- gorithms for detecting global puter Systems, Nov. 1986, pp. 320-337. deadlocks, distributed are the most T 15. G. Bracha and S. Toueg, “A Distributed prominent and most thoroughly investi- Algorithm for Generalized Deadlock gated. All distributed deadlock detection References Detection,” Proc. ACM Symp. Principles of algorithms have a common goal -to de- Distributed Compuring, Aug. 1984, pp. tect cycles that span several sites in a dis- K.M. Chandy, J. Misra, and L.M. Haas, 285-301. tributed manner - yet they differ in the “Distributed Deadlock Detection,” ACM Trans. Computer Systems, May 1983, pp. 16. J. Gray et al., “A Straw-Man Analysis of the ways they achieve this goal. 
The following 144-156. Probabilty of Waiting and Deadlocks in a are the most salient characteristics of these Database System,” IBM Research Report, algorithms: Edgar Knapp, “Deadlock Detection in Dis- 1981. November 1989 47 I. Cidon et al., “Local Distributed Deadlock lock Detection and Resolution Algorithm and Additional readings Detection by Cycle Detection and Clustering,” Its Correctness,” IEEE Trans. Software Engi- IEEE Trans. Software Engineering, Jan. 1987, neering, Oct. 1988, pp. 1443-1452. Theory pp. 3- 14. K. Shafer and M. Singhal, “A Correct Priority- E.G. Coffman et al., “System Deadlocks,” ACM B. Awerbuch and S. Micah, “Dynamic Dead- Based Probe Algorithm for Distributed Dead- Computing Surveys, June 1971, pp. 66-78. lock Resolution Protocols,” Proc. 27th Annual lock Detection and Resolution and Proof of Its Symp. Foundations of Computer Science, Oct. Correctness,” Tech. Report No. OSU-CISRC- R.C. Holt, “Some Deadlock Properties of 1987, pp. 196-207. 4/89-TR16, Ohio State Univ., Dept. of Com- Computer Systems,” ACM Computing Surveys, puter and Information Science, Apr. 1989. Dec. 1972, pp. 179-195. M. Roesler and W.A. Burkhard, “Resolution of Deadlocks in Object-Oriented Distributed Performance 0. Wolfson and M. Yannakakis, “Deadlock Systems,” IEEE Trans. Computers, Aug. 1989, Freedom (and Safety) of Transactions in a pp. 1212-1224. J.R. Jagannathan and R. Vasudevan, “A Dis- Distributed Database,” Proc. Fourth ACM tributed Deadlock Detection and Resolution SigactlSigmod Symp. Principles of Database B.A. Sanders and P.A. Heuberger, “Distributed Scheme: Performance Study,” Proc. Third Int’l Systems, 1985. Deadlock Detection and Resolution with Conf. Distributed Computing Systems, 1982, Probes,” to appear in Proc. Third Int’l Work- pp. 496-501. Algorithms shop on Distributed Algorithms, Sept. 26-28, 1989, Nice, France. Probability of Deadlocks A.N. Chandra et al., “Communication Protocol for Deadlock Detection in Computer Survey C.A. 
Ellis, “On the Probability of Deadlocks in Networks,” IBM Technical Disclosure Bulle- Computer Systems,” Proc. Fourth Symp. Oper- tin, Vol. 16, No. IO, Mar. 1974, pp. 347 1-3481. S.S. Isloor and T.A. Marsland, “The Deadlock ating Systems Principles, Oct. 1973, pp. 88-95. Problem: An Overview,” Computer, Sept. L.M. Haas, “Two Approaches to Deadlock 1980, pp. 58-77. A.W. Shum and P.G. Spirakis, “Performance Detection in Distributed Systems,” PhD Disser- Analysis of Concurrency Control Methods in tation, Dept. of Computer Science, Univ. of M. Singhal, “Deadlock Detection in Distrib- Database Systems,” Performance 81, 1981, pp. Texas at Austin. 1981. uted Systems: Status and Perspective,” Tech. 1-19. Report No. OSU-CISRC-TR-86-10, Dept. of W. Tasi and G. Belford, “Detecting Deadlocks Computer and Information Science, Ohio State W. Massey, “A Probabilistic Analysis of a in Distributed Systems,” Proc. IEEE Infocom, Univ., Columbus, June 1986. Database,” Performance Evaluation Review, 1982, pp. 89-95. Vol. 14, No. 1, May 1986, pp. 141-146. A.K. Elmagarmid, “A Survey of Distributed A.K. Elmagarmid, “Deadlock Detection and Deadlock Detection Algorithms,” ACM Sigmod Resolution in Distributed Processing Systems,” Records, Sept. 1986. PhD Dissertation, Dept. of Computer and Infor- mation Science, Ohio State Univ., Columbus, Correctness Ohio, 1985. J.R. Jagannathan and R. Vasudevan, “Com- A. N. Choudhary, “Two Distributed Deadlock ments on Protocols for Deadlock Detection in Detection Algorithms and Their Performance,” Distributed Database Systems: Corrigenda,” Master’s Thesis, Dept. of Electrical and Com- Trans. Software Engineering, May 1983, p. puter Engineering, Univ. of Massachusetts, 271. 1986. G. Wuu and A. Bernstein, “False Deadlock N. Natarajan, “A Distributed Scheme for De- Detection in Distributed Systems,” IEEE Trans. tecting Communication Deadlocks,” IEEE Sofhvare Engineering, Aug. 1985, pp. 820-821. Trans. Software Engineering, Apr. 1986, pp. 531-537. A.K. 
Elmagarmid et al., “A Distributed Dead- Mukesh Singhal is an assistant professor of Moving? computer and information science at Ohio State University, Columbus. His research and teach- Name (Please Print) ing interests are distributed systems, distrib- uted database systems, and performance mod- PLEASE NOTIFY eling of computer systems. From 1981 to 1985, US 4 WEEKS I New Address he served as a research assistant and instructor IN ADVANCE in the Department of Computer Science, Uni- versity of Maryland, College Park. City StatelCountry Zip Singhal received a bachelor of engineering degree with distinction in electronics and com- munication engineering from the University of Roorkee, Roorkee, India, in 1980, and a PhD degree in computer science from the University MA~LTO: This notice of address change will apply lo all of Maryland, College Park, in May 1986. He is ATTACH IEEE Service Center IEEE publications to which you subscribe. a member of the Computer Society and the LABEL 445 Hoes Lane List new address above. IEEE. Piscataway, NJ 08854 HERE If you have a question about your subscription, place label here and clip this form lo your letter. Singhal’s address is Ohio State University, Dept. of Computer and Information Science, I 2036 Neil Avenue Mall, Columbus, OH 43210. 48 COMPUTER
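Worked illustration. The three resolution steps and the false-deadlock hazard discussed under "Deadlock resolution" and "False deadlocks" can be made concrete with a small single-machine sketch. It is not any of the surveyed algorithms: the wait-for graph is assumed to be held in one Python dictionary, the process names A to D and their priorities are invented for illustration, and the function names (find_cycle, cycle_is_live, resolve, demo) are hypothetical. Two cycles share the edge B -> C; a second detector records its cycle before the first detector resolves its own, so the second detector's record becomes stale.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical encoding of a wait-for graph: waiter -> processes it waits for.
Graph = Dict[str, Set[str]]


def find_cycle(graph: Graph, start: str) -> Optional[List[str]]:
    """Return the nodes of one cycle through `start`, or None if there is none."""
    path: List[str] = []
    visited: Set[str] = set()

    def dfs(node: str) -> bool:
        path.append(node)
        visited.add(node)
        for nxt in sorted(graph.get(node, set())):
            if nxt == start:              # edge back to the initiator closes a cycle
                return True
            if nxt not in visited and dfs(nxt):
                return True
        path.pop()                        # dead end: backtrack
        return False

    return list(path) if dfs(start) else None


def cycle_is_live(graph: Graph, cycle: List[str]) -> bool:
    """Check that every edge of a previously recorded cycle still exists."""
    n = len(cycle)
    return all(cycle[(i + 1) % n] in graph.get(cycle[i], set()) for i in range(n))


def resolve(graph: Graph, cycle: List[str], priority: Dict[str, int]) -> str:
    """Break `cycle` by aborting its lowest-priority member; return the victim."""
    victim = min(cycle, key=lambda p: priority[p])  # step 1: select a victim
    graph.pop(victim, None)                         # step 2: victim withdraws its requests
    for targets in graph.values():                  # ...and nobody waits on it any more
        targets.discard(victim)
    return victim                                   # step 3 (purging records) is the caller's job


def demo() -> Tuple[List[str], str, Graph]:
    # Cycles A->B->C->A and B->C->D->B share the edge B->C.
    graph: Graph = {"A": {"B"}, "B": {"C"}, "C": {"A", "D"}, "D": {"B"}}
    priority = {"A": 3, "B": 1, "C": 2, "D": 4}     # assumed: larger means higher priority
    top = find_cycle(graph, "A")                    # first detector
    stale = find_cycle(graph, "D")                  # second detector records its cycle
    assert top is not None and stale is not None
    victim = resolve(graph, top, priority)          # first detector resolves
    return stale, victim, graph
```

After demo() runs, the second detector still holds the recorded cycle D, B, C, yet cycle_is_live reports that it no longer exists in the graph, because the victim B was removed. Acting on that record without purging or revalidating it (step 3) is exactly a false deadlock.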