
Distributed Shortest Paths (Extended Abstract)

Baruch Awerbuch* Dept. of Mathematics and Lab. for Computer Science, M.I.T., Cambridge, MA 02139. ARPANET: [email protected]

Abstract

This paper is concerned with distributed algorithms for finding shortest paths in an asynchronous communication network. For the problem of Breadth First Search, the best previously known algorithms required either O(V) time or O(E + V·D) communication. We present a new algorithm, which requires O(D^{1+ε}) time and O(E^{1+ε}) messages, for any ε > 0. (Here, V is the number of nodes, E is the number of edges, and D is the diameter.) This constitutes a major step towards achieving the lower bounds, which are Ω(E) communication and Ω(D) time.

For the general (weighted) shortest paths problem, previously known shortest-paths …

… performed relatively fast. Thus, we will be interested in minimizing the message exchange, as well as the time, of control algorithms. We deal exclusively with worst-case performance of the algorithm.

Given an execution of a protocol in an asynchronous network, the longest message delay in that execution is the maximum difference between arrival and transmission times of a message, in terms of some global clock (which is not accessible to the nodes). Following [Gal82], the normalized time between two events in an execution is the ratio between the physical time between those two events in terms of the global clock above, and the …

… of data from other nodes to the source node, in the cheapest possible way. Improving the efficiency of distributed Shortest Paths (BFS) yields improvements in more complex distributed algorithms, in which Shortest Paths (BFS) appears to be the bottleneck. Examples of such problems are given in Section 2.

Observe that a shortest paths problem on a network G = (V, E, w) can be reduced to a BFS problem on an "expanded" network Ḡ = (V̄, Ē), where an edge e is substituted by a path containing w_e edges and w_e − 1 "dummy nodes". The difficulty with this reduction is that the number of nodes in the expanded network is huge, namely |V̄| = O(W · V), where W is the maximal edge weight. Using Gabow's scaling technique, one can effectively guarantee W ≤ |V|. However, with Gabow's technique or without it, in either case |V̄| ≫ |V| and |Ē| ≫ |E|. Thus, the existence of a black box performing BFS with linear complexity, O(Ē) messages, only guarantees an O(E · V)-message shortest paths algorithm, which is far from satisfactory.

2 Contributions of this paper

We present new distributed algorithms for the above problems with sharply improved bounds on communication and time. The new BFS algorithm requires O(E^{1+ε}) messages and O(D^{1+ε}) time, for all ε > 0, thus making a step towards achieving the lower bounds. Figure 1 summarizes the results of this paper, comparing them to existing results.

Figure 1: Our BFS algorithms versus existing ones.

    Author        Communication     Time
    [Awe85]       E + V·k·D         D·log_k V
    [AG85]        E^{1+ε}           V^{1+ε}
    This paper    E^{1+ε}           D^{1+ε}

Our algorithm is based on a novel synchronization technique, which proceeds recursively. Even though the algorithm is just a composition of very simple modular blocks, its analysis is non-trivial. To overcome the difficulties, we introduce novel counting methods, based on notions of amortized complexities.

2.1 Improvements for BFS

The obvious lower bounds on the communication and time complexities of distributed BFS algorithms are Ω(E) messages and Ω(D) time, where E is the number of network edges and D is the diameter. In the synchronous network, the obvious algorithm meets those lower bounds. However, the situation is much more complex in the asynchronous network, where the best known algorithms exhibit trade-offs in terms of communication and time complexities.

The best known algorithm in terms of communication is due to [AG85]. It requires O(E^{1+ε}) messages and O(V^{1+ε}) time, for all ε > 0. [AG85] is "relatively close" to the lower bound in communication. However, as advocated by Peleg [Pel87], this algorithm is quite inefficient in time, in case that D ≪ V. Peleg [Pel87] points out that the difference between O(V) and O(D) time can be very significant in many existing networks, e.g. the ARPANET [MRR80], where D ≪ V. The high (Ω(V)) time overhead is inherent for that method.

The best known algorithm in terms of time is achieved by applying synchronizer γ of [Awe85] to the obvious synchronous algorithm. This algorithm (symbolically attributed to [Awe85]) requires O(E + V·D·k) messages and O(log_k V · D) time, for any k. It meets the lower bound in time but is quite inefficient in communication.

2.2 Improvements for Shortest Paths

There have been a number of works on shortest paths in the past [Gal82], [Jaf80]. The best previously known shortest-paths algorithm is obtained using the SYNCHRONIZER of [Awe85]. It requires O(k · V²) messages and O(V · log_k V) time. (This does not count the preprocessing phase of [Awe85].) For planar graphs only, the algorithm of Frederickson [Fre85] achieves O(V^{1.5}) messages and time.

Using the scaling technique of Gabow [Gab85] and the BFS algorithm in this paper, we obtain a new shortest paths algorithm, which requires O(E^{1+ε} · log W) messages and O(V^{1+ε} · log W) time. Figure 2 summarizes our improvements over existing algorithms.

Figure 2: Our Shortest Paths algorithms versus existing ones.

    Author        Communication       Time
    [Awe85]       k · V²              V · log_k V
    [Fre85]       V^{1.5}            V^{1.5}
    This paper    E^{1+ε} · log W     V^{1+ε} · log W

2.3 Applications for other problems

Leader Election, Spanning Tree, Global Functions: One of the most well-studied problems in the field of distributed network algorithms is the problem

of finding a leader in a network. It is equivalent to the problem of finding a spanning tree. Leader election is an important tool for breaking symmetry in a distributed system. Construction of a spanning tree or finding a leader appears as a building block in essentially every complex network protocol, and is closely related to many other problems.

There are many problems which are provably equivalent [Awe87] to the problem of electing a leader, in terms of the communication and time complexities. One example of such problems is the class of so-called "global sensitive" functions [Awe87], e.g. MAXIMUM, SUM, PARITY, MAJORITY, COUNTING, OR, AND. The best known leader election algorithms have been given in [Awe87], [Pel87]. Peleg [Pel87] advocates leader election algorithms whose running time depends only on the diameter of the network. While Peleg's algorithm achieves optimal O(D) time, its communication complexity is O(E · D).

Using the new BFS algorithm in this paper, we can significantly improve the time performance of existing leader election algorithms. We achieve almost linear (in diameter) time and almost linear communication. Our improvements are shown in Figure 3.

Figure 3: Our Leader Election algorithms versus existing ones.

Compact Routing Tables: For the purpose of constructing compact routing tables, the best current algorithm, due to Peleg and Upfal [PU88], uses network partition algorithms, which are modifications of the synchronizer-initialization algorithm of [Awe85]. BFS is the bottleneck in this algorithm.

2.4 Organization of this paper

In this extended abstract, we only deal with BFS algorithms. We leave the extensions to Shortest Paths, Leader Election, etc., for the full paper. Although our BFS algorithm is in fact very simple, it has a recursive structure, which makes it hard to conceive as a whole.

In the following Section 3 we describe the basic tools used. In Section 4 we outline the main ideas behind our improvement, and provide more details in Sections 5, 6, 7, 8, 9. In Section 10 we analyze the complexity of the algorithm.

3 Basics

3.1 DIJKSTRA algorithm

Let us first outline a simple BFS algorithm, which will be (symbolically) referred to as the DIJKSTRA Algorithm, because of its similarity to Dijkstra's shortest path algorithm and the Dijkstra–Scholten distributed termination detection procedure [DS80].

The algorithm maintains a tree rooted at the source node. Initially, the tree is empty. Upon termination of the algorithm, the tree is the desired BFS tree. Thruout the algorithm, the tree can only grow, and at any time it is a sub-tree of the final BFS tree. The algorithm operates in successive iterations, each processing another BFS layer. At the beginning of a given iteration l, the tree has been constructed for all nodes in layers m < l. Upon the termination of iteration l, the tree will be extended by one layer, covering also all nodes in layer l.

The purpose of the source node is to control these iterations by means of a synchronization process, performed over the tree. This enables the source node to detect the time at which the tree has been completed up to distance l, so that iteration l + 1 can be started. The source node triggers each iteration by broadcasting a message over the tree, which is forwarded out to the nodes at layer l − 1 in the tree. Upon receipt of this message, the latter nodes send "exploration" messages to all neighbors, carrying label l − 1, and trying to discover nodes at layer l.

When a node receives an exploration message from a neighbor, it acts as follows. If this is the first exploration message that the node has seen, it chooses the sender of the message as its parent, sets its distance label to be 1 plus the distance label of the sender, and sends back a "positive" acknowledgment (ack) to the sender, indicating that the sender was chosen as a parent. Upon receipt of subsequent exploration messages, the node sends back to the sender a "negative" ack, indicating that it already has a parent. Upon receiving a positive ack, a node adds the sender to its list of children.

When a node at layer l − 1 receives acks to all the exploration messages it has sent, it sends an ack to its parent in the BFS forest, indicating whether any new descendants have been discovered. When an internal node gets such acks from all its children, it sends an ack to its parent. Eventually, all the acks are collected by the source node. This implies that layer l has been processed completely. If any nodes have been discovered at that layer, the next iteration l + 1 is started. Otherwise, the algorithm terminates.

The complexities of this algorithm are O(V · D + E) messages and O(D²) time.
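The protocol above can be mimicked by a small centralized simulation that also tallies messages. The function name and the exact accounting (two messages per tree node per synchronization round, one exploration message plus one ack per edge probe) are our own illustrative assumptions, not the paper's code:

```python
def layered_bfs(adj, source):
    """Centralized mimic of the DIJKSTRA algorithm: each iteration l first
    synchronizes over the tree built so far, then the frontier (layer l-1)
    sends exploration messages to all neighbors; the first explorer a node
    hears from becomes its parent."""
    layer = {source: 0}
    parent = {source: None}
    frontier = [source]
    messages = 0
    l = 1
    while frontier:
        # synchronization: broadcast + ack over every edge of the current tree
        messages += 2 * (len(layer) - 1)
        next_frontier = []
        for u in frontier:
            for v in adj[u]:
                messages += 2              # exploration message + its ack
                if v not in layer:         # first exploration seen: adopt parent
                    layer[v] = l
                    parent[v] = u
                    next_frontier.append(v)
        frontier = next_frontier
        l += 1
    return layer, parent, messages
```

On a path of V nodes this tallies Θ(V²) messages, matching the worst case for sparse, long networks discussed in the complexity analysis.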


Figure 4: Strip Method.
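A back-of-the-envelope accounting for the strip method of Figure 4, consistent with the O(D^{1.5}) time and O(E·√D) communication figures quoted in Section 3.2. The per-strip costs assumed here (O(d²) time for d synchronized iterations over trees of depth O(d), and O(E) messages per strip) are our own reading, not the paper's analysis:

```latex
T \;=\; \frac{D}{d}\cdot O(d^{2}) \;=\; O(D\,d),
\qquad
C \;=\; \frac{D}{d}\cdot O(E) \;=\; O\!\left(\frac{E\,D}{d}\right),
\qquad
\text{so } d=\sqrt{D} \text{ yields } T=O(D^{1.5}),\; C=O(E\sqrt{D}).
```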

Figure 5: Data structures associated with a cluster.

Indeed, there are D iterations, and in each of them synchronization is performed over the BFS tree, which requires O(V) messages and O(D) time. In addition, one exploration message is sent over each edge once in each direction.

The overhead due to synchronization makes this algorithm quite inefficient in sparse and long networks, where E ≪ V · D. If one considers a network with all nodes on a single path of length V − 1, one sees that the communication and time complexities are each O(V²). Obviously, the performance of the algorithm degrades as the number D of layers to be processed increases.

3.2 Distance reduction paradigm

It is easy to reduce the problem where a big number of layers needs to be processed to a problem where a small number of layers needs to be processed. If the network has diameter D, we can conceptually "cut" the network into D/d "strips" of length d, and process those strips sequentially, as in Figure 4. Our strategy now is to cut the network into strips of size d ≪ D, and process those strips one after another, thus extending the BFS tree. We also know which edges lead to nodes in the previous strip, so that messages are not sent along those edges.

For the purpose of processing a strip, we need to create a BFS forest rooted at all the nodes on the border of the strip. Thus, we have multiple source nodes, rather than a single source node. The most naive reduction from the case of multiple sources to the case of a single source is to find shortest paths separately w.r.t. each one of the sources, and then "combine" those shortest paths in an obvious manner. Unfortunately, this method may introduce a blow-up factor into the communication complexity, which grows with the number of layers in the strip and the number of sources in the strip. Using this technique, with strips of size d = √D, we can achieve O(D^{1.5}) time and O(E · √D) communication. This strategy, with some additional improvements, has been used in [AG87] and in [Fre85]. We are looking for a more efficient reduction strategy.

3.3 Source contraction paradigm

In a sequential setting, one can easily reduce the problem with multiple sources to the problem with a single source, by connecting each one of the sources to some auxiliary ("root") source node via edges of weight 0, or by simply contracting all the sources into a single source. In a distributed setting, this method does not work. However, whenever we encounter a multiple-sources problem, we will always have available some "cover tree" spanning all those nodes. For example, there is a legitimate cover tree spanning all the source nodes of a strip, namely the (whole) BFS tree spanning all the previous strips. The cover tree can be used to synchronize between the source nodes, thus effectively "contracting" them into a single (super-)node, or cluster. With each cluster, we associate the following distributed data structures:

• the source nodes of the cluster;
• a cover tree, which spans all the source nodes;
• a BFS forest, which spans all the sources.

We can use those data structures in order to run the DIJKSTRA algorithm for the case of multiple sources as if it were run in the case of a single source. The basic idea is that the algorithm proceeds in iterations, controlled by the "root". The synchronization over the cover tree enables the root to detect that all the sources have completed their trees up to a certain BFS layer. The

root node notifies all the sources about the beginning of the iteration by broadcasting a message over the cover tree. Upon receiving this message, each source node runs one iteration of the DIJKSTRA algorithm, as described above. When a source node collects all the acks to its messages over the BFS tree rooted at itself, its task is complete. The source node will send an ack towards the root of the cover tree after it has completed its task, and after it receives acks from all its children. Nodes propagate acks in this manner, until the root node has received acks to all the messages that it has sent. Now, a new iteration can start.

Clearly, the communication overhead of synchronization will grow with the size of the cover tree, while the time overhead of synchronization will grow with the depth of the cover tree, i.e. the length of a longest path from a node to the root. We conclude that, in order for the resulting algorithm to be efficient, the cover tree should not be much bigger than the strip being processed, both in terms of size and depth. Observe that, initially, we have a cover tree for the strip, which is the whole BFS tree. It has depth Θ(D) and size Θ(V), i.e. it is way too big and too deep.

The natural strategy is to reduce hard problems to easy problems. In order to do this, we have to reduce the original problem, with a "big" cover tree and a "big" processing length, to "not too many" new problems with "small" cover trees and "small" processing lengths.

As noticed in [AG85], we can treat separately the different connected components in the strip, since trees grown in different components do not interfere with each other. Thus, the algorithm of [AG85] was trying to construct a spanning tree of each connected component of the strip. The difficulty here is that the set of nodes belonging to the strip is not known in advance; we know where the strip "starts", but we do not know where it "ends". If one knew in advance where the strip "ends", the problem solved in [AG85] would be trivialized.

An algorithm proposed in [AG85] enables one to construct recursively spanning trees of each strip. This strategy guarantees that the cover trees have a small number of nodes, but, unfortunately, they tend to have very big depth. The reason for this is that a connected component of a strip of depth d does not necessarily have a spanning tree of depth d. In fact, it might be the case that the whole strip is connected, and that any tree spanning all the source nodes of the strip has Ω(V) depth, causing Ω(V) time overhead. Since this method inherently requires Ω(V) time overhead, it is inadequate for us. It is worth pointing out that [AG85] focused only on communication.

The main contribution of this paper is the reduction of the problem with a big cover tree to a "moderate" number of problems with cover trees of "moderate" depth and size.

3.4 Strip Cover

Definition 3.1 Given a strip with d BFS layers, a strip cover is a forest of node-disjoint trees which span all the source nodes in the strip. A collection of clusters induced by the cover is a set of subsets of source nodes, each subset consisting of the set of all source nodes spanned by the same tree of the cover.

Definition 3.2 For an arbitrary strip cover (forest), define the following parameters:

load factor: the maximal number of clusters which are within distance d from some node in the strip.

depth factor: the maximal depth of a tree in the cover, divided by d.

size: the number of trees in the cover.

Our task would be significantly simplified if, prior to processing the strip, some "oracle" would give us a "good" strip cover, for which both the load factor and the depth factor are "small". We could then process the strip "efficiently" by contracting all the sources spanned by the same tree into a single cluster, and then performing BFS independently from each cluster. The intuition here is that the load factor is the reason for the communication blow-up, as it upper-bounds the number of different clusters competing for the same node. Also, the depth factor is the reason for the time blow-up, as it upper-bounds the depth of the trees on which synchronization is performed.

By "refining" the cover, i.e. increasing its size, we reduce the depth factor at the expense of increasing the load factor. For example, the "coarsest" cover (all sources in the same tree) may feature depth = V and load = 1. At the other extreme, the "finest" cover (all sources in different trees) may feature depth = 0 and load = V. It is easy to see that we can achieve the following compromise.

Fact 3.3 There always exists a cover with both load factor and depth factor at most √V.

It is not obvious how Fact 3.3 will benefit us, since constructing such a "good" cover appears to be as hard as performing the BFS itself. This difficulty is resolved by running, in parallel, approximations for the BFS and for the "good" cover, as shown in the following section.

4 Outline of our algorithm

Our main contribution is a novel algorithm, referred to as STRIP-BFS, which processes a strip with d layers. We "pretend" to perform BFS independently from each source node. Since d ≪ D, we (recursively) assume the existence of an efficient BFS algorithm that processes distance d w.r.t. a single node. However, to save communication, we require that after a given node v enters X BFS trees, where X is a parameter, it will not enter any additional trees, causing those trees to become blocked. This algorithm is called MULTI-BFS with parameter X. Observe that the naive strategy of performing BFS's independently from each source node corresponds to MULTI-BFS with the choice of X = ∞. The approach used in [AG85] corresponds to MULTI-BFS with X = 1.

Clearly, trees that are blocked will not extend to the required length. As a result, some nodes will appear in "wrong" trees, and we will not be able to reconstruct a real shortest-path forest from the collection of individual BFS trees (as in the case X = ∞). However, the information obtained will help us to construct short paths between source nodes, which can be used later for synchronization between those nodes. Namely, to correct the situation, we "contract" blocked source nodes into clusters of nodes, each cluster having a relatively short cover tree. This tree is obtained by "stitching" together the BFS trees of the involved source nodes.

Our algorithm proceeds to find the true BFS forest of a strip in a number of iterations. The major data structure maintained by the algorithm is the cover of the strip, together with the collection of clusters induced by that cover. Initially, each cluster is a (singleton) source, and all cover trees are degenerate trees, each containing a single source.

Each iteration consists of two phases, a "BFS" phase and a "Merge" phase. In the "BFS" phase, all clusters run the MULTI-BFS algorithm. Upon termination of the "BFS" phase, we examine the resulting collection of BFS trees and record in memory all unblocked trees. In the "Merge" phase, all clusters whose BFS trees got blocked get merged into bigger clusters. If none are left, the main loop of the algorithm terminates. This must happen eventually, since once clusters become big enough, there will not be enough clusters to cause blocking. At this time, we invoke the EXTRACT procedure, which extracts the BFS forest of the strip from the collection of all BFS forests that were recorded as unblocked during some iteration.

In order to maintain small communication complexity, we need to guarantee that the number of trees passing thru a node is small, which suggests that X should be as small as possible. Also, in order to maintain small time and communication complexity, the number of iterations should be small, suggesting that the "Merge" phase should try to merge as many clusters as possible. Unfortunately, this policy causes the cover trees of the resulting cover to become very big, much bigger than d, namely O(V). This immediately causes the time complexity to skyrocket up to Ω(V), as in [AG85]. (The number of iterations in [AG85] is only log₂ V.)

We present an algorithm, referred to as the MERGE algorithm, whose main idea is to combine together as many clusters as possible, subject to the restriction that only clusters which are close to each other, namely within distance O(d), can be combined. The crucial parameters for the performance of the algorithm are the depth factor and the size of the cover. The effect of one application of the MERGE algorithm on the cover is that:

• the size is reduced by at least a factor of X;
• the depth is increased by at most a factor of 5.

This guarantees that the number of iterations is at most z, and that all cover trees have small depth, namely depth O(d · 5^z), where z = log_X V.

The algorithm uses a variation of the deterministic symmetry breaking technique of [AGLP88]. This is in turn a variation of Luby's maximal independent set algorithm [Lub86] and Luby's technique for removing randomness from distributed computing [Lub88]. This enables us to break the symmetry in the network in O(d · log V) expected time, where d is the depth of a spanning tree.

Overall, we accomplish a reduction from the problem of processing strips of size D to the problem of processing strips of size d ≪ D. We can continue recursively, reducing the size of the strip to be processed, until we end up with a strip containing a single layer, at which point it does not matter which algorithm is used.

The schematic description of all the subroutines is given in Figure 6. The "main" BFS algorithm is referred to as MAIN-BFS. It calls, as a subroutine, STRIP-BFS, which processes strips of smaller size by calling MULTI-BFS, MERGE, and EXTRACT. Finally, MULTI-BFS calls MAIN-BFS. This illustrates the fact that our algorithm is recursive.

In the following Sections 5, 6, 7, 8, 9 we describe the MAIN-BFS, STRIP-BFS, MULTI-BFS, MERGE, and EXTRACT algorithms, respectively.

5 MAIN-BFS algorithm

Definition 5.1 (Input/Output) Input of MAIN-BFS consists of

Variables:

Main-Source: the sources of the strip. (Input variable.)
Main-BFS: the BFS tree w.r.t. the source node of the process. (Output variable.)
Main-depth: number of layers to be processed. (Input variable.)
Strip-depth: number of layers in a strip. (Input variable.)
#Strip: the number of the strip being processed.
Sources: sources of a strip.
Strip-BFS: the BFS forest w.r.t. the sources of a strip.

Procedures:

STRIP-BFS: constructs the BFS forest of a strip w.r.t. Sources.

Figure 6: General Structure.

Figure 7: Declarations of the MAIN-BFS algorithm.

Main-Source: the single source node.

Main-depth: the number of layers to be processed by the MAIN-BFS algorithm.

Strip-depth: the number of layers in a strip.

Output of MAIN-BFS consists of the tree Main-BFS, grown out of Main-Source.

    #Strip ← 1
    Sources ← Main-Source
    Main-BFS ← ∅
    repeat
        Strip-BFS ← STRIP-BFS
        add Strip-BFS to Main-BFS
        #Strip ← #Strip + 1
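The loop of MAIN-BFS can be exercised as a runnable, centralized skeleton, under our own simplifying assumption that the call to STRIP-BFS is replaced by plain BFS restricted to Strip-depth further layers (the real STRIP-BFS is the distributed procedure of Section 6):

```python
def main_bfs(adj, source, main_depth, strip_depth):
    """Toy stand-in for MAIN-BFS: partition the layers into strips of
    strip_depth and grow the BFS tree one strip at a time."""
    dist = {source: 0}
    frontier = [source]                  # Sources: frontier of the existing tree
    strips = 0                           # #Strip counter
    while frontier and strips * strip_depth < main_depth:
        # STRIP-BFS stand-in: extend by strip_depth more layers
        for _ in range(strip_depth):
            nxt = []
            for u in frontier:
                for v in adj[u]:
                    if v not in dist:    # edges into previous strips are ignored
                        dist[v] = dist[u] + 1
                        nxt.append(v)
            frontier = nxt
        strips += 1                      # #Strip <- #Strip + 1
    return dist
```

The loop terminates either when the frontier hits a "dead end" (Sources = ∅) or when all strips have been processed, matching the two termination conditions described below.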

Figure 8: Algorithm MAIN-BFS.

The algorithm itself is a straightforward implementation of the "Distance reduction paradigm" of Section 3.2. It partitions the network into strips of smaller size and processes the strips one by one, employing the procedure STRIP-BFS, which extends the BFS forest by another strip of length Strip-depth.

Thruout the algorithm, Sources denotes the sources of the current strip (which are the "frontier" of the existing BFS tree) and #Strip denotes the number of the current strip. Once the BFS tree cannot expand any more, i.e. it is stuck in a "dead end" (in which case Sources = ∅), or all the strips have been processed (in which case #Strip = Main-depth/Strip-depth), the algorithm terminates.

The declarations and the code of the algorithm are presented in Figures 7 and 8, respectively.

6 STRIP-BFS algorithm

Definition 6.1 (Input/Output) Input of STRIP-BFS consists of

Sources: the set of source nodes of the strip.

Main-BFS: the existing BFS tree.

Output of STRIP-BFS consists of BFS, which is the BFS forest of the strip, grown out of Sources.

The algorithm proceeds in phases, so that at a given time all the nodes execute the same phase. The tree Main-BFS is used in order to detect the termination of the previous phase and to trigger the next one.

Thruout the algorithm, V denotes the current set of clusters. After an application of MULTI-BFS, the set U contains all clusters s ∈ V whose BFS forest is not blocked, and the complementary set B = V \ U contains the blocked clusters. Next, the unblocked clusters s ∈ U join the set A of all clusters that were unblocked in the past, and thus (implicitly) the forests Mbfs_s join the collection MBFS_A of all unblocked BFS forests.

If no blocked clusters remain, i.e. B = ∅, then the following application of MERGE returns V = COVER_V = SOURCES_V = ∅ and the algorithm terminates. Otherwise, MERGE merges all remaining blocked clusters s ∈ B into bigger clusters. Now V, COVER_V, SOURCES_V denote, respectively, the set of new clusters, the new cover, and the new sources. At this point, MULTI-BFS is called again.

Figure 9: Declarations of the STRIP-BFS algorithm.

Procedures:

MULTI-BFS: executes procedure MAIN-BFS "in parallel" at each cluster, allowing a node to enter at most X forests.
MERGE: merges together all blocked clusters.
EXTRACT: extracts the "best" BFS forest from a collection of trees.

Variables:

Sources: the sources of the strip. (Input variable.)
BFS: the final BFS forest of the strip. (Output variable.)
V: all clusters in the current iteration.
U: clusters that became unblocked in the last iteration.
B: clusters that became blocked in the last iteration.
A: clusters that became unblocked in all previous iterations.
COVER_V = {Cover_s | s ∈ V}: a cover of SOURCES_V.
SOURCES_V = {Sources_s | s ∈ V}: clusters induced by the cover.
MBFS_V: collection of BFS forests of all current clusters.
MBFS_A: collection of all unblocked BFS forests.
X: collision threshold.
d: distance threshold.

Figure 10: Algorithm STRIP-BFS.

    V ← {s | s ∈ Sources}            /* initial clusters */
    A ← ∅                            /* no unblocked clusters */
    SOURCES_V ← {s | s ∈ Sources}    /* initial sources */
    COVER_V ← ∅                      /* initial cover is empty */
    MBFS_A ← ∅                       /* no unblocked BFS forests */
    repeat                           /* loop which processes a given strip */
        MBFS_V ← MULTI-BFS           /* grow BFS forests */
        U ← {s | s ∈ V, s unblocked in MBFS_V}
        B ← V \ U                    /* the rest are blocked clusters */
        A ← A ∪ U                    /* remember unblocked BFS forests */
        (V, COVER_V, SOURCES_V) ← MERGE
    until V = ∅                      /* no blocked clusters left */
    BFS ← EXTRACT(MBFS_A)            /* extract true BFS forest */
    return BFS
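The control flow of STRIP-BFS can be exercised with a toy, centralized skeleton. Everything below is an illustrative assumption: a cluster is simply a frozen set of sources, "blocking" is crudely modeled as "more than X clusters remain", and MERGE pairs blocked clusters up; none of this is the paper's distributed protocol.

```python
def strip_bfs(sources, X=2):
    """Toy skeleton of the STRIP-BFS loop (Figure 10): repeat a 'BFS' phase
    that records unblocked clusters and a 'Merge' phase that fuses blocked
    clusters, until no blocked clusters are left."""
    clusters = [frozenset([s]) for s in sources]      # initial singleton clusters
    unblocked_forests = []                            # stand-in for MBFS_A
    while clusters:
        # "BFS" phase: crude blocking proxy -- too many clusters means collisions
        unblocked = clusters if len(clusters) <= X else []
        blocked = [c for c in clusters if c not in unblocked]
        unblocked_forests.extend(unblocked)           # A <- A ∪ U
        # "Merge" phase: fuse blocked clusters pairwise (size shrinks by ~X)
        clusters = [blocked[i] | (blocked[i + 1] if i + 1 < len(blocked)
                                  else frozenset())
                    for i in range(0, len(blocked), 2)]
    # EXTRACT stand-in: return all forests recorded as unblocked
    return unblocked_forests
```

With 8 sources and X = 2 the loop runs 3 = log₂ 8 times before all clusters become unblocked, mirroring the bound z = log_X V on the number of iterations stated in Section 4.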

The declarations and the code of the algorithm are presented in Figures 9 and 10, respectively.

7 MULTI-BFS algorithm

7.1 Specifications of MULTI-BFS

Notation 7.1 We denote by v ∈ MBFS_V the fact that a node v is spanned by one of the forests of the collection MBFS_V = {Mbfs_s | s ∈ V}. For each node v ∈ V, we define:

Load_v(MBFS_V): the number of forests in MBFS_V which span v. (It is 0 if v ∉ MBFS_V.)

Distance_v(Mbfs_s): the distance between node v and the root of a tree in Mbfs_s. In case that v ∉ Mbfs_s, Distance_v(Mbfs_s) = ∞.

Definition 7.2 (Input/Output) Input to MULTI-BFS consists of

V: a set of indices.

Output of MULTI-BFS is a collection of forests MBFS_V = {Mbfs_s | s ∈ V} such that

• Mbfs_s is a BFS forest w.r.t. Sources_s in the sub-graph induced by the nodes of Mbfs_s.
• For any v ∈ V, Load_v(MBFS_V) ≤ X.
• For any v ∈ Mbfs_s, Distance_v(Mbfs_s) ≤ d.
• For any v ∈ Mbfs_s, and any edge (v, u) ∈ E, at least one of the following conditions must hold:
  – u ∈ Mbfs_s;
  – Distance_v(Mbfs_s) = d;
  – Load_u(MBFS_V) = X.

Intuitively, MBFS_V is a collection of BFS trees, grown "independently" from each one of the sources s ∈ V for d layers, under the constraint that a node may belong to at most X such trees, and the depth of each tree is at most d. That is, a node that already belongs to X trees will not enter additional trees, and nodes at depth d cannot have any children. Thus, for each edge (v → u) outgoing from a node v ∈ Mbfs_s, either node u is in the same forest, or u has not been included because of one of two reasons. The first reason is that Distance_v(Mbfs_s) = d, i.e. v is at the "last layer" and thus the forest cannot grow from v any more. The second reason is that Load_u(MBFS_V) = X, i.e. u is "overloaded" and refuses to enter the forest.
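The load constraint above (admit a process until X forests pass thru the node, then refuse) can be sketched as a tiny class. The names (Node, view_for, lst) are our own illustrative choices; the paper's actual mechanism, via the Edges/All-Edges variables, is described in Section 7.2 and Figure 12:

```python
class Node:
    """Per-node bookkeeping for MULTI-BFS blocking: at most X clusters may
    grow their forests thru this node; later clusters are shown a fake
    one-edge topology so their forests cannot extend here."""

    def __init__(self, all_edges, X):
        self.all_edges = set(all_edges)   # true local topology
        self.X = X                        # collision threshold
        self.lst = set()                  # clusters whose forests pass thru us

    def view_for(self, s, e):
        """Local topology shown to the BFS process of cluster s, whose
        message arrived on edge e."""
        if s not in self.lst:
            if len(self.lst) < self.X:
                self.lst.add(s)           # participate in s's forest
            else:
                return {e}                # block: only the arrival edge is visible
        return set(self.all_edges)        # unblocked: true topology
```

A blocked process still gets an answer, so the recursion never stalls; it simply sees a degenerate one-edge neighborhood and its forest stops growing at this node.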

7.2 Implementation of MULTI-BFS

Each cluster runs the MAIN-BFS algorithm for the required length. To simplify the programming of the algorithm, we initially allow nodes to enter more than X forests. However, a node may grow at most X forests, and blocks additional BFS processes by refusing to grow their forests. At the end of the algorithm we delete the node from all those forests, so that, ultimately, each node belongs to at most X forests.

Towards that goal, messages of the process MAIN-BFS_s, which grows a BFS forest from a cluster s, are tagged with the parameter s. Each node stores in List the identities of all clusters whose forests the node belongs to, excluding the forests blocked by the node. When a node receives a message of MAIN-BFS_s, it will "refuse" to grow this forest in case that |List| = X and s ∉ List, i.e. it has already grown forests of X clusters other than s.

We achieve the effect of blocking a process by only modifying the local input of that process. Namely, the node pretends that the edge on which the message of the process has arrived is the only edge adjacent to that node, thus effectively disabling growth of that forest thru the node.

Figure 11: Declarations of the MULTI-BFS algorithm.

Procedures:

MAIN-BFS_s: recursive call to MAIN-BFS from cluster s.

Variables:

All-Edges: the local topology, i.e. the set of incident links.
Edges: the local topology, i.e. the set of incident links, as seen by the BFS algorithm. Initially, Edges = All-Edges.
List: the list of all clusters whose BFS forests currently pass thru the node.

Figure 12: Algorithm MULTI-BFS.

    Message of MAIN-BFS_s over edge e:
        Edges ← All-Edges                /* default is true topology */
        if s ∉ List then                 /* new BFS process */
            if Counter < X then          /* no need to block */
                List ← List ∪ {s}        /* participate in s's forest */
                Counter ← Counter + 1    /* increment counter */
            else Edges ← {e}             /* block this BFS forest */
        invoke MAIN-BFS_s                /* view Edges as local topology */

This is implemented as follows.
A node maintains a variable Edges, containing the local topology as seen by the MAIN-BFS processes running at the node, and a variable All-Edges, which contains the true local topology. The node distinguishes between MAIN-BFS processes which are blocked by the node and those which are not, by setting the Edges variable appropriately prior to responding to a message of the process. For processes that are not blocked, the node sets Edges := All-Edges, i.e. uses the "true" local topology. For processes that are blocked, the node sets Edges := {e}, i.e. uses a "fake" local topology, consisting of the single edge e on which the message has arrived.

The declarations and the code of the algorithm are presented in Figures 11 and 12, respectively.

8 MERGE algorithm

8.1 Specifications of MERGE

Definition 8.1 (Input/Output) Input of MERGE consists of

- V: a set of indices.

- B: a subset of V, containing all indices whose BFS forests have been blocked.

- SOURCES_V = {Sources_s | s ∈ V}: a collection of node-disjoint clusters Sources_s ⊆ V.

- COVER_V = {Cover_s | s ∈ V}: a forest of node-disjoint trees such that Cover_s spans Sources_s.

- MBFS_V = {Mbfs_s | s ∈ V}: a collection of node-disjoint forests of depth at most d such that Mbfs_s is a forest rooted at the nodes of Sources_s.

Output of MERGE consists of

- P̂: a new set of indices.

- SOURCES_P̂ = {Sources_p | p ∈ P̂}: new clusters.

- COVER_P̂ = {Cover_p | p ∈ P̂}: a new cover.

Notation 8.2 For a cover C, we denote by Depth(C), Load(C), Size(C) the depth factor, the load factor, and the size of C, respectively.

Definition 8.3 The output of MERGE must satisfy

∪{Sources_p | p ∈ P̂} = ∪{Sources_s | s ∈ B}   (1)

Depth(COVER_P̂) ≤ 5 · Depth(COVER_V)   (2)

Size(COVER_P̂) ≤ Size(COVER_V) / λ   (3)

(1) means that the new sources consist of all sources belonging to clusters blocked in the previous iteration. (2) means that the depth of the new clusters increases by a factor of 5 at most. (3) means that the number of new clusters decreases by a factor of λ at least.
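For concreteness, the three measures of Notation 8.2 can be computed for a cover given explicitly as a collection of rooted trees. The helper below is our own illustration; the parent-map encoding of a tree is an assumption of ours, not the paper's.

```python
def depth(tree):
    """Depth of a rooted tree given as {child: parent}; the root maps to None."""
    d = 0
    for v in tree:
        steps = 0
        while tree[v] is not None:   # walk up to the root
            v = tree[v]
            steps += 1
        d = max(d, steps)
    return d

def measures(cover):
    """Depth(C), Load(C), Size(C) of a collection C of rooted trees."""
    load = {}
    for tree in cover:
        for v in tree:
            load[v] = load.get(v, 0) + 1
    return (max(depth(t) for t in cover),   # depth factor: deepest tree
            max(load.values()),             # load factor: most-shared node
            len(cover))                     # size: number of trees

# two node-disjoint trees: 1-2-3 rooted at 1, and 4-5 rooted at 4
cover = [{1: None, 2: 1, 3: 2}, {4: None, 5: 4}]
assert measures(cover) == (2, 1, 2)
```

For a cover of node-disjoint trees, as in Definition 8.1, the load factor is 1; the general definition also covers the intersecting forests of MBFS collections.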

8.2 Implications for STRIP-BFS

Observe that at each application of MERGE, except, perhaps, for the last one,

Size(COVER_V) ≥ Load(COVER_V) ≥ λ   (4)

since otherwise no blocking occurs and the algorithm terminates. Since in the first iteration Size(COVER_V) ≤ V, we deduce that, in STRIP-BFS:

Corollary 8.4 There are at most log_λ V iterations.

Since, at the first iteration, Depth(COVER_V) = 1, we deduce that, at any iteration i,

Depth(COVER_V) ≤ d · 5^i ≤ d · V^ε   (5)

Thus, throughout all the iterations of STRIP-BFS, the cover used is quite "shallow". This is the key fact for upper-bounding the time complexity.

8.3 Implementation of MERGE

Notation 8.5 Let the collision graph Ḡ = (V, Ē) be the undirected graph where

Ē = {(r, s) | r, s ∈ V, ∃v ∈ Mbfs_r, ∃u ∈ Mbfs_s, (u, v) ∈ E}

Namely, this is an undirected graph whose nodes represent clusters, and whose edges represent adjacency of Mbfs forests.

Notation 8.6 Let degree_Ḡ(v) and distance_Ḡ(u, v) denote the degree of v and the distance between u and v in Ḡ.

Definition 8.7 Given an undirected graph Ḡ = (V, Ē) and a subset B ⊆ V of blocked vertices, we say that a subset P̂ ⊆ V of vertices is a Small Ruling Sub-Set (SRS) of B in Ḡ if the following conditions hold:

∀u ∈ B, ∃v ∈ P̂ such that distance_Ḡ(u, v) ≤ 2   (6)

|P̂| ≤ (|B| · log|V|) / δ, where δ = min_{v ∈ B} {degree_Ḡ(v)}   (7)

For every forest that is blocked, there exists at least one edge leading from a node in the forest to a node blocking that forest, which participates in at least λ other forests. Thus, with our particular choice of Ḡ and B, we have δ ≥ λ.

The algorithm itself proceeds as follows.

- P̂, the new set of indices, is chosen as a Small Ruling Sub-Set of B in the graph Ḡ.

- For each cluster s ∈ P̂, select any node r_s ∈ Sources_s as a representative of s.

- COVER_P̂ = {Cover_p | p ∈ P̂}, the new cover, is chosen as a BFS forest w.r.t. the set of nodes {r_s | s ∈ P̂} in the sub-graph induced by the nodes spanned by MBFS_V.

- SOURCES_P̂ = {Sources_p | p ∈ P̂}, the new collection of clusters, is chosen so that each new cluster contains all sources of blocked clusters that are in the same cover tree, namely

  Sources_p = Cover_p ∩ (∪{Sources_s | s ∈ B})

An efficient deterministic algorithm for computing an SRS distributively has been given in [AGLP88].

9 EXTRACT algorithm

Definition 9.1 (Input/Output) Input of EXTRACT is a collection of forests MBFS_A = {Mbfs_s | s ∈ A} w.r.t. a set of source clusters A. Output of EXTRACT is a forest BFS such that, for each node v,

Distance_v(BFS) = min_{s ∈ A} Distance_v(Mbfs_s)   (8)

In other words, given a bunch of (intersecting) forests MBFS_A, we want to select a Parent pointer at each node, as one of the "best" among the Parent pointers in those forests. The "quality" is measured in terms of the length of the path to a root in the resulting forest. This is essentially the most "naive" reduction from multiple sources to a single source, outlined in subsection 3.2. It is (trivially) achieved by selecting, for each node, a forest Mbfs_q attaining the minimum in (8), and then choosing the node's parent in Mbfs_q as its parent in the forest BFS.

10 Complexity

10.1 Definitions of complexity

Definition 10.1 (Amortized Complexities) We define the amortized complexities of any protocol P that performs BFS for distance d_i (i.e. can be applied on the i-th level of the recursion) as follows.

- c_i: the amortized communication. This is the maximum number of messages sent by the protocol over a network edge.

- t_i: the amortized time. This is the 1/d_i fraction of the time complexity of the protocol.

- g_i: the amortized synchronization. This is the number of times that the protocol needs to synchronize.

If the DIJKSTRA algorithm were applied on the i-th level, then c_i = t_i = g_i = d_i, since the algorithm sends d_i messages per edge, runs for d_i time, and needs to synchronize d_i times. The total complexities of the BFS algorithm can be bounded as O(E · c_k) messages and O(D · t_k) time. If we were simply running DIJKSTRA without any recursion, we would have had c_k = t_k = D, and thus would have obtained an O(E · D) bound on the number of messages and an O(D²) bound on time.

In this extended abstract, we will only show the final recurrence for the complexities of the MAIN-BFS algorithm on the different levels of recursion, denoted, respectively, c_i^main, t_i^main, g_i^main. Now, denote γ_i = c_i^main + t_i^main + g_i^main, and define α such that α = O(z · λ · 5^z · log V). (Here z = log_λ V.)

Claim 10.2 The amortized complexities γ_i satisfy

γ_{i+1} ≤ log V if i = 0;  γ_{i+1} ≤ α · (γ_i + 1) if i > 0   (10)

Let us choose λ to satisfy 5^z = λ, in which case λ = V^ε. Solving the recurrence of Claim 10.2 yields

γ_k = O(V^ε)   (11)

Thus, the total communication and time complexities of the algorithm, denoted by C and T, are bounded by

C ≤ γ_k · E = O(E · V^ε)   (12)

T ≤ γ_k · D = O(D · V^ε)   (13)

Acknowledgments

The author is very grateful to David Peleg for his crucial contributions to this work. David encouraged the author to work on the problem, and contributed major ideas to the MERGE algorithm in this paper. Thanks are also due to David Shmoys for pointing the author's attention to Gabow's work on scaling.

References

[AG85] Baruch Awerbuch and Robert G. Gallager. Distributed BFS algorithms. In 26th Annual Symposium on Foundations of Computer Science, IEEE, October 1985.

[AG87] Baruch Awerbuch and Robert G. Gallager. A new distributed algorithm to find breadth first search trees. IEEE Trans. Info. Theory, IT-33(3):315-322, May 1987.

[AGLP88] Baruch Awerbuch, Andrew Goldberg, Michael Luby, and Serge Plotkin. Fast deterministic distributed maximal independent set algorithm. December 1988. Unpublished manuscript.

[Awe85] Baruch Awerbuch. Complexity of network synchronization. J. ACM, 32(4):804-823, October 1985.

[Awe87] Baruch Awerbuch. Optimal distributed algorithms for minimum weight spanning tree, counting, leader election and related problems. In Proceedings of the 19th Annual ACM Symposium on Theory of Computing, pages 230-240, ACM, May 1987.

[DS80] Edsger W. Dijkstra and C. S. Scholten. Termination detection for diffusing computations. Info. Proc. Lett., 11(1):1-4, August 1980.

[Fre85] Greg N. Frederickson. A single source shortest path algorithm for a planar distributed network. In Proceedings of the 2nd Symposium on Theoretical Aspects of Computer Science, January 1985.

[Gab85] Harold N. Gabow. Scaling algorithms for network problems. J. Comp. and Syst. Sci., 31(2):148-168, October 1985.

[Gal82] Robert G. Gallager. Distributed Minimum Hop Algorithms. Technical Report LIDS-P-1175, MIT Lab. for Information and Decision Systems, January 1982.

[Jaf80] Jeffrey Jaffe. Using signalling messages instead of clocks. 1980. Unpublished manuscript.

[Lub86] Michael Luby. A simple parallel algorithm for the maximal independent set problem. SIAM J. Comput., 15(4):1036-1053, November 1986.

[Lub88] Michael Luby. Removing randomness in parallel computation without a processor penalty. In 29th Annual Symposium on Foundations of Computer Science, IEEE, 1988.

[MRR80] John M. McQuillan, Ira Richer, and Eric C. Rosen. The new routing algorithm for the ARPANET. IEEE Trans. Comm., 28(5):711-719, May 1980.

[Pel87] David Peleg. Fast leader election algorithms. 1987. Unpublished manuscript.

[PU88] David Peleg and Eli Upfal. A tradeoff between size and efficiency for routing tables. In Proceedings of the 20th Annual ACM Symposium on Theory of Computing, pages 43-52, ACM, May 1988.
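As a concrete reading of the EXTRACT rule (8): each node adopts the distance and Parent pointer of a forest attaining the minimum. The sequential sketch below is our own framing (the `(dist, parent)` encoding of a forest is an assumption), not the distributed protocol.

```python
INF = float("inf")

def extract(forests):
    """forests: {s: (dist, parent)} with dist[v] = Distance_v(Mbfs_s) and
    parent[v] = v's parent in Mbfs_s.  Returns (dist, parent) of the forest
    BFS, with Distance_v(BFS) = min over s of Distance_v(Mbfs_s) -- rule (8)."""
    dist, parent = {}, {}
    nodes = {v for d, _ in forests.values() for v in d}
    for v in nodes:
        # pick a forest attaining the minimum distance for v
        best = min(forests, key=lambda s: forests[s][0].get(v, INF))
        dist[v] = forests[best][0].get(v, INF)
        parent[v] = forests[best][1].get(v)   # adopt that forest's Parent
    return dist, parent

# two intersecting forests over nodes 0..3
f1 = ({0: 0, 1: 1, 2: 2}, {0: None, 1: 0, 2: 1})   # rooted at 0
f2 = ({3: 0, 2: 1, 1: 2}, {3: None, 2: 3, 1: 2})   # rooted at 3
dist, parent = extract({"s1": f1, "s2": f2})
assert dist == {0: 0, 1: 1, 2: 1, 3: 0}
assert parent[2] == 3 and parent[1] == 0
```

Node 2 lies at depth 2 in the first forest but depth 1 in the second, so it keeps its Parent pointer from the second; the resulting pointers form a BFS forest w.r.t. the union of the sources.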
