
Distributed Network Algorithms (Lecture Notes for GIAN Course)

Assoc.-Prof. Dr. Stefan Schmid
Assoc.-Prof. Dr. Partha Sarathi Mandal

Thanks to Prof. Dr. Roger Wattenhofer and Prof. Dr. Christian Scheideler for the basis of this manuscript!

Summer 2016

Contents

1 Topology & Routing

2 Vertex Coloring
  2.1 Introduction
  2.2 Coloring Trees

3 Leader Election
  3.1 Distributed Algorithms and Complexity
  3.2 Anonymous Leader Election
  3.3 Asynchronous Ring
  3.4 Lower Bounds
  3.5 Synchronous Ring

4 Tree Algorithms
  4.1 Broadcast
  4.2 Convergecast
  4.3 BFS Tree Construction
  4.4 MST Construction

5 Distributed Sorting
  5.1 Array & Mesh
  5.2 Sorting Networks
  5.3 Counting Networks

6 Shared Memory
  6.1 Introduction
  6.2 Mutual Exclusion
  6.3 Store & Collect
    6.3.1 Problem Definition
    6.3.2 Splitters
    6.3.3 Binary Splitter Tree
    6.3.4 Splitter Matrix

7 Shared Objects
  7.1 Introduction
  7.2 Arrow and Friends
  7.3 Ivy and Friends

8 Maximal Independent Set
  8.1 MIS
  8.2 Fast MIS from 1986
  8.3 Fast MIS from 2009
  8.4 Applications

9 Locality Lower Bounds
  9.1 Locality
  9.2 The Neighborhood Graph

10 Social Networks
  10.1 Small-World Networks
  10.2 Propagation Studies

11 Synchronization
  11.1 Basics
  11.2 Synchronizer α
  11.3 Synchronizer β
  11.4 Synchronizer γ
  11.5 Network Partition
  11.6 Clock Synchronization

12 Hard Problems
  12.1 Diameter & APSP
  12.2 Lower Bound Graphs
  12.3 Communication Complexity
  12.4 Distributed Complexity Theory

13 Stabilization
  13.1 Self-Stabilization
  13.2 Advanced Stabilization

14 Wireless Protocols
  14.1 Basics
  14.2 Initialization
    14.2.1 Non-Uniform Initialization
    14.2.2 Uniform Initialization with CD
    14.2.3 Uniform Initialization without CD
  14.3 Leader Election
    14.3.1 With High Probability
    14.3.2 Uniform Leader Election
    14.3.3 Fast Leader Election with CD
    14.3.4 Even Faster Leader Election with CD
    14.3.5 Lower Bound
    14.3.6 Uniform Asynchronous Wakeup without CD
  14.4 Useful Formulas

15 All-to-All Communication

16 Consensus
  16.1 Impossibility of Consensus
  16.2 Randomized Consensus

17 Multi-Core Computing
  17.1 Introduction
    17.1.1 The Current State of Concurrent Programming
  17.2 Transactional Memory
  17.3 Contention Management

18 Dominating Set
  18.1 Sequential Greedy Algorithm
  18.2 Distributed Greedy Algorithm

19 Routing
  19.1 Array
  19.2 Mesh
  19.3 Routing in the Mesh with Small Queues
  19.4 Hot-Potato Routing
  19.5 More Models

Welcome!

Welcome to the GIAN course on Distributed Network Algorithms! Distributed network algorithms play a major role in many networked systems, ranging from computer networks (such as sensor networks, peer-to-peer networks, software-defined networks, datacenter networks, networks on chip) to social or even biological networks. The goal of this course is to introduce the fundamental formal models and methods needed to reason about the correctness and performance of distributed network algorithms. In particular, we will teach essential algorithmic and analytic techniques which, after the course, form a useful toolbox and allow the students to develop and study their own distributed network algorithms. The course also discusses applications, e.g., in the context of wireless networks, peer-to-peer networks, datacenter networks, and virtual networks. Due to time constraints, we will focus on some selected aspects, and we will not be able to cover all the material you find in these lecture notes. However, I will be very happy to talk to you about any other related topic during this week in person.

Enjoy the course!

Stefan Schmid (Aalborg University, Denmark & TU Berlin, Germany)
Partha Sarathi Mandal (IIT Guwahati, India)

Chapter 1

Topology & Routing

The most basic network topologies used in practice are trees, rings, grids or tori. Many other suggested networks are simply combinations or derivatives of these. The advantage of trees is that routing is very easy: for every source-destination pair there is only one possible simple path. However, since the root of a tree is usually a severe bottleneck, so-called fat trees have been used. These trees have the property that every edge connecting a node v to its parent u has a capacity that is equal to the number of leaves of the subtree rooted at v. See Figure 1.1 for an example.

Figure 1.1: The structure of a fat tree (the labels 4, 2, 1 are the edge capacities on the respective levels).

Remarks:

• Fat trees belong to a family of networks that require edges of non-uniform capacity to be efficient. Networks with edges of uniform capacity are easier to build; this is usually the case for grids and tori. Unless explicitly mentioned otherwise, we will assume that all edges have capacity 1. In the following, [x] denotes the set {0, . . . , x − 1}.

Definition 1.1 (Torus, Mesh). Let m, d ∈ N. The (m, d)-mesh M(m, d) is a graph with node set V = [m]^d and edge set

E = { {(a1, . . . , ad), (b1, . . . , bd)} | ai, bi ∈ [m], Σ_{i=1}^{d} |ai − bi| = 1 } .

The (m, d)-torus T(m, d) is a graph that consists of an (m, d)-mesh and additionally wrap-around edges from nodes (a1, . . . , ai−1, m − 1, ai+1, . . . , ad) to nodes (a1, . . . , ai−1, 0, ai+1, . . . , ad) for all i ∈ {1, . . . , d} and all aj ∈ [m] with j ≠ i. In other words, we take the expression ai − bi in the sum modulo m prior to computing the absolute value. M(m, 1) is also called a line, T(m, 1) a cycle, and M(2, d) = T(2, d) a d-dimensional hypercube. Figure 1.2 presents a linear array, a torus, and a hypercube.
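The edge set definition translates directly into code. The following is a minimal sketch (our own illustration, not from the original notes; the function names are ours) that generates M(m, d) and T(m, d) and checks that M(2, d) = T(2, d) is the d-dimensional hypercube:

```python
from itertools import product

def mesh_edges(m, d):
    """Edges of the (m, d)-mesh M(m, d): nodes are d-tuples over [m],
    adjacent iff their coordinates differ by exactly 1 in total."""
    edges = set()
    for a in product(range(m), repeat=d):
        for i in range(d):
            if a[i] + 1 < m:  # neighbor one step further in dimension i
                edges.add((a, a[:i] + (a[i] + 1,) + a[i + 1:]))
    return edges

def torus_edges(m, d):
    """Edges of the (m, d)-torus T(m, d): as the mesh, but the coordinate
    difference is taken modulo m, which adds the wrap-around edges."""
    edges = set()
    for a in product(range(m), repeat=d):
        for i in range(d):
            b = a[:i] + ((a[i] + 1) % m,) + a[i + 1:]
            edges.add(tuple(sorted((a, b))))
    return edges

# M(2, d) = T(2, d) is the d-dimensional hypercube with d * 2^(d-1) edges.
assert len(mesh_edges(2, 3)) == len(torus_edges(2, 3)) == 12
```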


Figure 1.2: The structure of M(m, 1), T (4, 2), and M(2, 3).

Remarks:

• Routing on mesh, torus, and hypercube is trivial. On a d-dimensional hypercube, to get from a source bitstring s to a target bitstring t one only needs to fix each “wrong” bit, one at a time; in other words, if the source and the target differ in k bits, there are k! shortest routes, each with k hops. (A small code sketch follows these remarks.)

• The hypercube can directly be used for a structured Peer-to-Peer (P2P) architecture. It is trivial to construct a distributed hash table (DHT): We have n nodes, n for simplicity being a power of 2, i.e., n = 2^d. As in the hypercube, each node gets a unique d-bit ID, and each node connects to d other nodes, i.e., the nodes that have IDs differing in exactly one bit. Now we use a globally known hash function f, mapping file names to long bit strings; SHA-1 is popular in practice, providing 160 bits. Let fd denote the first d bits (prefix) of the bitstring produced by f. If a node is searching for file name X, it routes a request message f(X) to node fd(X). Clearly, node fd(X) can only answer this request if all files with hash prefix fd(X) have been previously registered at node fd(X).

• The hypercube has many derivatives, the so-called hypercubic networks. Among these are the butterfly, cube-connected-cycles, shuffle-exchange, and de Bruijn graph. We start with the butterfly, which is basically a “rolled out” hypercube (hence directly providing replication!).
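The bit-fixing route and the DHT prefix mapping from the two remarks above can be sketched in a few lines (our own illustration, not part of the original notes):

```python
import hashlib

def bitfix_route(s: str, t: str) -> list:
    """Greedy bit-fixing route on the hypercube: flip the bits where s and
    t differ, one at a time (here left to right). Fixing the k differing
    bits in any order gives a shortest route, hence k! routes of k hops."""
    route, cur = [s], list(s)
    for i in range(len(s)):
        if cur[i] != t[i]:
            cur[i] = t[i]
            route.append("".join(cur))
    return route

def fd(name: str, d: int) -> str:
    """First d bits of SHA-1(name): the ID of the node responsible for
    the file in the DHT described above."""
    h = int.from_bytes(hashlib.sha1(name.encode()).digest(), "big")
    return format(h, "0160b")[:d]

print(bitfix_route("000", "101"))  # ['000', '100', '101']
print(fd("my_file.txt", 4))        # a 4-bit node ID
```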

Definition 1.2 (Butterfly). Let d ∈ N. The d-dimensional butterfly BF(d) is a graph with node set V = [d + 1] × [2]^d and an edge set E = E1 ∪ E2 with

E1 = { {(i, α), (i + 1, α)} | i ∈ [d], α ∈ [2]^d }

and

E2 = { {(i, α), (i + 1, β)} | i ∈ [d], α, β ∈ [2]^d, α and β differ only at the i-th position } .

A node set {(i, α) | α ∈ [2]^d} is said to form level i of the butterfly. The d-dimensional wrap-around butterfly W-BF(d) is defined by taking the BF(d) and identifying level d with level 0.

Remarks:

• Figure 1.3 shows the 3-dimensional butterfly BF(3). The BF(d) has (d + 1) · 2^d nodes, 2d · 2^d edges and degree 4. It is not difficult to check that combining the node sets {(i, α) | i ∈ [d]} into a single node results in the hypercube.

• Butterflies have the advantage of a constant node degree over hypercubes, whereas hypercubes feature more fault-tolerant routing.

• The structure of a butterfly might remind you of sorting networks from Chapter 5. Although butterflies are used in the P2P context (e.g., Viceroy), they have been used decades earlier for communication switches. The well-known Benes network is nothing but two back-to-back butterflies. And indeed, butterflies (and other hypercubic networks) are even older than that; students familiar with the fast Fourier transform (FFT) will recognize the structure without doubt. Every year there is a new application for which a hypercubic network is the perfect solution!

• Indeed, hypercubic networks are related. Since all structured P2P archi- tectures are based on hypercubic networks, they in turn are all related.

• Next we define the cube-connected-cycles network. It only has a degree of 3 and it results from the hypercube by replacing the corners by cycles.

Definition 1.3 (Cube-Connected-Cycles). Let d ∈ N. The cube-connected-cycles network CCC(d) is a graph with node set V = {(a, p) | a ∈ [2]^d, p ∈ [d]} and edge set

E = { {(a, p), (a, (p + 1) mod d)} | a ∈ [2]^d, p ∈ [d] }
  ∪ { {(a, p), (b, p)} | a, b ∈ [2]^d, p ∈ [d], a = b except for ap } .


Figure 1.3: The structure of BF(3).


Figure 1.4: The structure of CCC(3).

Remarks:

• Two possible representations of a CCC can be found in Figure 1.4.

• The shuffle-exchange is yet another way of transforming the hypercubic interconnection structure into a constant degree network.

Definition 1.4 (Shuffle-Exchange). Let d ∈ N. The d-dimensional shuffle-exchange SE(d) is defined as an undirected graph with node set V = [2]^d and an edge set E = E1 ∪ E2 with

E1 = { {(a1, . . . , ad), (a1, . . . , ād)} | (a1, . . . , ad) ∈ [2]^d, ād = 1 − ad }

and

E2 = { {(a1, . . . , ad), (ad, a1, . . . , ad−1)} | (a1, . . . , ad) ∈ [2]^d } .

Figure 1.5 shows the 3- and 4-dimensional shuffle-exchange graphs.

Figure 1.5: The structure of SE(3) and SE(4).

Definition 1.5 (DeBruijn). The b-ary DeBruijn graph of dimension d, DB(b, d), is an undirected graph G = (V, E) with node set V = [b]^d and edge set E that contains all edges {v, w} with the property that w ∈ {(x, v1, . . . , vd−1) : x ∈ [b]}, where v = (v1, . . . , vd).


Figure 1.6: The structure of DB(2, 2) and DB(2, 3).

Remarks:

• Two examples of a DeBruijn graph can be found in Figure 1.6. The DeBruijn graph is the basis of the Koorde P2P architecture.

• There are some data structures which also qualify as hypercubic networks. An obvious example is the Chord P2P architecture, which uses a slightly different hypercubic topology. A less obvious (and therefore good) example is the skip list, the balanced binary search tree for the lazy programmer:

Definition 1.6 (Skip List). The skip list is an ordinary ordered linked list of objects, augmented with additional forward links. The ordinary linked list is the level 0 of the skip list. In addition, every object is promoted to level 1 with probability 1/2. As for level 0, all level 1 objects are connected by a linked list. In general, every object on level i is promoted to the next level with probability 1/2. A special start-object points to the smallest/first object on each level.
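The following is a minimal sketch of Definition 1.6 (our own illustration; a plain list of keys per level stands in for the linked lists, so it demonstrates the search walk rather than the pointer mechanics, and is not itself O(log n)):

```python
import random

def build_skiplist(keys):
    """Level 0 holds all keys in order; every key on level i appears on
    level i + 1 with probability 1/2 (Definition 1.6)."""
    levels = [sorted(keys)]
    while len(levels[-1]) > 1:
        nxt = [k for k in levels[-1] if random.random() < 0.5]
        if not nxt:
            break
        levels.append(nxt)
    return levels

def search(levels, key):
    """Start at the top level, move right while not overshooting, then
    drop one level and continue from the predecessor found so far."""
    pred = None                    # largest key seen so far that is < key
    for level in reversed(levels):
        for k in level:
            if pred is not None and k <= pred:
                continue           # still left of our current position
            if k == key:
                return True
            if k > key:
                break              # overshot: drop down a level
            pred = k
    return False

levels = build_skiplist(range(0, 100, 2))
assert search(levels, 42) and not search(levels, 41)
```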

Remarks:

• Search, insert, and delete can be implemented in O(log n) expected time in a skip list, simply by jumping from higher levels to lower ones when overshooting the searched position. Also, the amortized memory cost of each object is constant, as on average an object only has two forward pointers.

• The randomization can easily be discarded, by deterministically promoting a constant fraction of objects of level i to level i + 1, for all i. When inserting or deleting, object o simply checks whether its left and right level i neighbors are being promoted to level i + 1. If none of them is, promote object o itself. Essentially we establish a MIS on each level, hence at least every third and at most every second object is promoted.

• There are obvious variants of the skip list, e.g., the skip graph. Instead of promoting only half of the nodes to the next level, we always promote all the nodes, similarly to a balanced binary tree: All nodes are part of the root level of the binary tree. Half the nodes are promoted left, and half the nodes are promoted right, on each level. Hence on level i we have 2^i lists (or, more symmetrically: rings) of about n/2^i objects. This is pretty much what we need for a nice hypercubic P2P architecture.

• One important goal in choosing a topology for a network is that it has a small diameter. The following theorem presents a lower bound for this.

Theorem 1.7. Every graph of maximum degree d > 2 and size n must have a diameter of at least ⌈(log n)/(log(d − 1))⌉ − 2.

Proof. Suppose we have a graph G = (V, E) of maximum degree d and size n. Start from any node v ∈ V. In a first step at most d other nodes can be reached. In two steps at most d · (d − 1) additional nodes can be reached. Thus, in general, in at most k steps at most

1 + d · Σ_{i=0}^{k−1} (d − 1)^i = 1 + d · ((d − 1)^k − 1)/((d − 1) − 1) ≤ (d · (d − 1)^k)/(d − 2)

nodes (including v) can be reached. This has to be at least n to ensure that v can reach all other nodes in V within k steps. Hence,

(d − 1)^k ≥ ((d − 2) · n)/d ⇔ k ≥ log_{d−1}(((d − 2) · n)/d) .

Since log_{d−1}((d − 2)/d) > −2 for all d > 2, this is true only if k ≥ ⌈(log n)/(log(d − 1))⌉ − 2.
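As a quick numeric illustration (our own, with logarithms to base 2), the bound of Theorem 1.7 can be evaluated directly:

```python
import math

def diam_lb(n, d):
    """Theorem 1.7: any graph with n nodes and maximum degree d > 2 has
    diameter at least ceil(log2(n) / log2(d - 1)) - 2."""
    return math.ceil(math.log2(n) / math.log2(d - 1)) - 2

# Around a million nodes: a degree-3 network needs diameter >= 18, a
# degree-4 network >= 11; constant-degree hypercubic networks match this
# up to constant factors.
print(diam_lb(2**20, 3), diam_lb(2**20, 4))  # 18 11
```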

Remarks: • In other words, constant-degree hypercubic networks feature an asymp- totically optimal diameter.

• There are a few other interesting graph classes, e.g., expander graphs (an expander graph is a sparse graph which has high connectivity properties, that is, from every not too large subset of nodes you are connected to a larger set of nodes), or small-world graphs (popular representations of social networks). At first sight hypercubic networks seem to be related to expanders and small-world graphs, but they are not.

Chapter 2

Vertex Coloring

2.1 Introduction

Vertex coloring is an infamous graph theory problem. Vertex coloring does have quite a few practical applications, for example in the area of wireless networks, where coloring is the foundation of so-called TDMA MAC protocols. Generally speaking, vertex coloring is used as a means to break symmetries, one of the main themes in distributed computing. In this chapter we will not really talk about vertex coloring applications but treat the problem abstractly. By the end of the chapter you will probably have learned the fastest (but not constant-time!) algorithm ever! Let us start with some simple definitions and observations.

Problem 2.1 (Vertex Coloring). Given an undirected graph G = (V, E), assign a color cu to each vertex u ∈ V such that the following holds: e = (v, w) ∈ E ⇒ cv ≠ cw.

Remarks:

• Throughout this course, we use the terms vertex and node interchangeably.

• The application often asks us to use few colors! In a TDMA MAC protocol, for example, fewer colors immediately imply higher throughput. However, in distributed computing we are often happy with a solution which is sub-optimal. There is a tradeoff between the optimality of a solution (efficacy), and the work/time needed to compute the solution (efficiency).


Figure 2.1: 3-colorable graph with a valid coloring.


Assumption 2.2 (Node Identifiers). Each node has a unique identifier, e.g., its IP address. We usually assume that each identifier consists of only log n bits if the system has n nodes.

Remarks:

• Sometimes we might even assume that the nodes exactly have identifiers 1, . . . , n.

• It is easy to see that node identifiers (as defined in Assumption 2.2) solve the coloring problem 2.1, but not very well (essentially requiring n colors). How many colors are needed at least is a well-studied problem.

Definition 2.3 (Chromatic Number). Given an undirected graph G = (V, E), the chromatic number χ(G) is the minimum number of colors needed to solve Problem 2.1.

To get a better understanding of the vertex coloring problem, let us first look at a simple non-distributed (“centralized”) vertex coloring algorithm:

Algorithm 1 Greedy Sequential
1: while ∃ uncolored vertex v do
2:   color v with the minimal color (number) that does not conflict with the already colored neighbors
3: end while
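A direct transcription of Algorithm 1 (a sketch of ours; the adjacency-dict representation is our assumption, not part of the notes):

```python
def greedy_coloring(graph):
    """Algorithm 1 (Greedy Sequential): color vertices one at a time with
    the smallest color not used by an already-colored neighbor.
    graph: dict mapping each vertex to its set of neighbors."""
    color = {}
    for v in graph:                       # any vertex order works
        taken = {color[u] for u in graph[v] if u in color}
        c = 1
        while c in taken:                 # First Free: smallest free color
            c += 1
        color[v] = c                      # uses at most Delta + 1 colors
    return color

# Example: a 4-cycle needs only 2 colors.
c4 = {1: {2, 4}, 2: {1, 3}, 3: {2, 4}, 4: {1, 3}}
print(greedy_coloring(c4))  # e.g. {1: 1, 2: 2, 3: 1, 4: 2}
```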

Definition 2.4 (Degree). The number of neighbors of a vertex v, denoted by δ(v), is called the degree of v. The maximum degree of any vertex in a graph G defines the graph degree ∆(G) = ∆.

Theorem 2.5 (Analysis of Algorithm 1). The algorithm is correct and termi- nates in n “steps”. The algorithm uses ∆ + 1 colors.

Proof: Correctness and termination are straightforward. Since each node has at most ∆ neighbors, there is always at least one color free in the range {1, . . . , ∆ + 1}.

Remarks:

• For many graphs coloring can be done with far fewer than ∆ + 1 colors.

• This algorithm is not distributed at all; only one processor is active at a time. Still, maybe we can use the simple idea of Algorithm 1 to define a distributed coloring subroutine that may come in handy later.

Now we are ready to study distributed algorithms for this problem. The following procedure can be executed by every vertex v in a distributed coloring algorithm. The goal of this subroutine is to improve a given initial coloring.

Procedure 2 First Free
Require: Node Coloring {e.g., node IDs as defined in Assumption 2.2}
Give v the smallest admissible color {i.e., the smallest node color not used by any neighbor}

Algorithm 3 Reduce
1: Assume that initially all nodes have IDs (Assumption 2.2)
2: Each node v executes the following code
3: node v sends its ID to all neighbors
4: node v receives IDs of neighbors
5: while node v has an uncolored neighbor with higher ID do
6:   node v sends “undecided” to all neighbors
7:   node v receives new decisions from neighbors
8: end while
9: node v chooses a free color using subroutine First Free (Procedure 2)
10: node v informs all its neighbors about its choice
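A round-based simulation of Algorithm 3 (our own sketch; real executions are message driven, here the rounds are made explicit):

```python
def reduce_coloring(graph, ids):
    """Algorithm 3 (Reduce): in each round, every still-uncolored node
    whose ID is larger than those of all its uncolored neighbors picks
    the smallest free color (Procedure 2, First Free).
    graph: vertex -> set of neighbors; ids: vertex -> unique ID."""
    color = {}
    while len(color) < len(graph):
        deciders = [v for v in graph if v not in color
                    and all(u in color or ids[u] < ids[v] for u in graph[v])]
        for v in deciders:               # deciders are never adjacent
            taken = {color[u] for u in graph[v] if u in color}
            c = 1
            while c in taken:
                c += 1
            color[v] = c
    return color

star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
print(reduce_coloring(star, {v: v for v in star}))  # center waits; 2 colors
```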

Remarks:

• With this subroutine we have to make sure that two adjacent vertices are not colored at the same time. Otherwise, the neighbors may at the same time conclude that some small color c is still available in their neighborhood, and then at the same time decide to choose this color c.

Theorem 2.6 (Analysis of Algorithm 3). Algorithm 3 is correct and has time complexity n. The algorithm uses ∆ + 1 colors.

Remarks:

• Quite trivial, but also quite slow.

• However, it seems difficult to come up with a fast algorithm.

• Maybe it’s better to first study a simple special case, a tree, and then go from there.

2.2 Coloring Trees

Lemma 2.7. χ(Tree) ≤ 2

Constructive Proof: If the distance of a node to the root is odd (even), color it 1 (0). An odd node has only even neighbors and vice versa. If we assume that each node knows its parent (root has no parent) and children in a tree, this constructive proof gives a very simple algorithm:

Remarks:

• With the proof of Lemma 2.7, Algorithm 4 is correct.

• How can we determine a root in a tree if it is not already given? We will figure that out later.


Figure 2.2: Vertex 100 receives the lowest possible color.

Algorithm 4 Slow Tree Coloring
1: Color the root 0, root sends 0 to its children
2: Each node v concurrently executes the following code:
3: if node v receives a message x (from parent) then
4:   node v chooses color cv = 1 − x
5:   node v sends cv to its children (all neighbors except parent)
6: end if

• The time complexity of the algorithm is the height of the tree.

• If the root was chosen unfortunately, and the tree has a degenerated topol- ogy, the time complexity may be up to n, the number of nodes.

• Also, this algorithm does not need to be synchronous . . . !

Theorem 2.8 (Analysis of Algorithm 4). Algorithm 4 is correct. If each node knows its parent and its children, the (asynchronous) time complexity is the tree height which is bounded by the diameter of the tree; the message complexity is n − 1 in a tree with n nodes.

Remarks:

• In this case the asynchronous time complexity is the same as the syn- chronous time complexity.

• Nice trees, e.g., balanced binary trees, have logarithmic height, that is, we have a logarithmic time complexity.

• This algorithm is not very exciting. Can we do better than logarithmic?!?

The following algorithm terminates in log∗ n time. Log-Star?! That’s the number of logarithms (to the base 2) you need to take to get down to at most 2, starting with n:

Definition 2.9 (Log-Star).
∀x ≤ 2 : log∗ x := 1
∀x > 2 : log∗ x := 1 + log∗(log x)
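Definition 2.9 translates directly into a recursive function (a sketch of ours, with log to base 2):

```python
import math

def log_star(x):
    """Definition 2.9: log* x = 1 for x <= 2, else 1 + log*(log2 x)."""
    return 1 if x <= 2 else 1 + log_star(math.log2(x))

print(log_star(2**16))     # 4: 65536 -> 16 -> 4 -> 2
print(log_star(10.0**80))  # 5: atoms in the observable universe
```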

Remarks:

• Log-star is an amazingly slowly growing function. Log-star of all the atoms in the observable universe (estimated to be 10^80) is 5! There are functions which grow even more slowly, such as the inverse Ackermann function; however, the inverse Ackermann function of all the atoms is 4. So log-star increases indeed very slowly!

Here is the idea of the algorithm: We start with color labels that have log n bits. In each synchronous round we compute a new label with exponentially smaller size than the previous label, still guaranteeing to have a valid vertex coloring! But how are we going to do that?

Algorithm 5 “6-Color”
1: Assume that initially the vertices are legally colored. Using Assumption 2.2 each label only has log n bits
2: The root assigns itself the label 0.
3: Each other node v executes the following code (synchronously in parallel)
4: send cv to all children
5: repeat
6:   receive cp from parent
7:   interpret cv and cp as little-endian bit-strings: c(k), . . . , c(1), c(0)
8:   let i be the smallest index where cv and cp differ
9:   the new label is i (as bitstring) followed by the bit cv(i) itself
10:  send cv to all children
11: until cw ∈ {0, . . . , 5} for all nodes w

Example:

Algorithm 5 executed on the following part of a tree:

Grand-parent  0010110000 → 10010 → ...
Parent        1010010000 → 01010 → 111
Child         0110010000 → 10001 → 001

Theorem 2.10 (Analysis of Algorithm 5). Algorithm 5 terminates in log∗ n time.

Proof: A detailed proof is, e.g., in [Peleg 7.3]. In class we do a sketch of the proof.

Remarks:

• Colors 11∗ (in binary notation, i.e., 6 or 7 in decimal notation) will not be chosen, because the node will then do another round. This gives a total of 6 colors (i.e., colors 0, . . . , 5).

• Can one reduce the number of colors in only constant steps? Note that Algorithm 3 does not work (since the degree of a node can be much higher than 6)! For fewer colors we need to have siblings monochromatic!

• Before we explore this problem we should probably have a second look at the end game of the algorithm, the UNTIL statement. Is this algorithm truly local?! Let’s discuss!
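One relabeling round of Algorithm 5 can be sketched as follows (our own code; it pads the index to a fixed width, which reproduces the first round of the worked example above — the notes are looser about padding in later rounds):

```python
def six_color_step(cv: str, cp: str) -> str:
    """One round of Algorithm 5. Labels are read little-endian (index 0 =
    rightmost bit). Returns the smallest index where own label cv and
    parent label cp differ, as a bitstring, followed by cv's bit there."""
    k = len(cv)
    # legal coloring guarantees cv != cp, so a differing index exists
    i = next(j for j in range(k) if cv[k - 1 - j] != cp[k - 1 - j])
    width = (k - 1).bit_length()          # bits needed to write an index
    return format(i, "b").zfill(width) + cv[k - 1 - i]

# First round of the worked example above:
print(six_color_step("1010010000", "0010110000"))  # parent: 01010
print(six_color_step("0110010000", "1010010000"))  # child:  10001
```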

Algorithm 6 Shift Down
1: Root chooses a new (different) color from {0, 1, 2}
2: Each other node v concurrently executes the following code:
3: Recolor v with the color of parent

Lemma 2.11 (Analysis of Algorithm 6). Algorithm 6 preserves coloring legality; also siblings are monochromatic.

Now Algorithm 3 (Reduce) can be used to reduce the number of used colors from six to three.

Algorithm 7 Six-2-Three
1: Each node v concurrently executes the following code:
2: Run Algorithm 5 for log∗ n rounds.
3: for x = 5, 4, 3 do
4:   Perform subroutine Shift Down (Algorithm 6)
5:   if cv = x then
6:     choose new color cv ∈ {0, 1, 2} using subroutine First Free (Procedure 2)
7:   end if
8: end for

Theorem 2.12 (Analysis of Algorithm 7). Algorithm 7 colors a tree with three colors in time O(log∗ n).

Figure 2.3: Possible execution of Algorithm 7.

Remarks:

• The term O() used in Theorem 2.12 is called “big O” and is often used in distributed computing. Roughly speaking, O(f) means “in the order of f, ignoring constant factors and smaller additive terms.” More formally, for two functions f and g, it holds that f ∈ O(g) if there are constants x0 and c so that |f(x)| ≤ c|g(x)| for all x ≥ x0. For an elaborate discussion on the big O notation we refer to other introductory math or computer science classes.

• As one can easily prove, a fast tree-coloring with only 2 colors is more than exponentially more expensive than coloring with 3 colors. In a tree degenerated to a list, nodes far away need to figure out whether they are an even or odd number of hops away from each other in order to get a 2-coloring. To do that one has to send a message to these nodes. This costs time linear in the number of nodes.

• Also other lower bounds have been proved, e.g., any algorithm for 2-coloring the d-regular tree of radius r which runs in time at most 2r/3 requires at least Ω(√d) colors.

• The idea of this algorithm can be generalized, e.g., to a ring topology. Also a general graph with constant degree ∆ can be colored with ∆ + 1 colors in O(log∗ n) time. The idea is as follows: In each step, a node compares its label to each of its neighbors, constructing a logarithmic difference-tag as in 6-Color (Algorithm 5). Then the new label is the concatenation of all the difference-tags. For constant degree ∆, this gives a 3∆-label in O(log∗ n) steps. Algorithm 3 then reduces the number of colors to ∆ + 1 in 2^{3∆} (this is still a constant for constant ∆!) steps.

• Recently, researchers have proposed other methods to break down long IDs for log-star algorithms. With these new techniques, one is able to solve other problems, e.g., a maximal independent set in bounded growth graphs in O(log∗ n) time. These techniques go beyond the scope of this course.

• Unfortunately, coloring a general graph is not yet possible with this technique. We will see another technique for that in Chapter 8. With this technique it is possible to color a general graph with ∆ + 1 colors in O(log n) time.

• A lower bound by Linial shows that many of these log-star algorithms are asymptotically (up to constant factors) optimal. This lower bound uses an interesting technique. However, because of the one-topic-per-class policy we cannot look at it today.

Chapter 3

Leader Election

3.1 Distributed Algorithms and Complexity

In the second part of this course we will often model the distributed system as a network or graph, and study protocols in which nodes (i.e., the processors) can only communicate with their neighbors to perform certain tasks. We are often interested in the following synchronous model or algorithm.

Definition 3.1 (Synchronous Distributed Algorithm). In a synchronous algorithm, nodes operate in synchronous rounds. In each round, each processor executes the following steps:
1. Do some local computation (of reasonable “local complexity”).
2. Send messages to neighbors in graph (of reasonable size).
3. Receive messages (that were sent by neighbors in step 2 of the same round).

Remarks:

• Any other step ordering is fine.

The other cornerstone model is the asynchronous algorithm.

Definition 3.2 (Asynchronous Distributed Algorithm). In the asynchronous model, algorithms are event driven (“upon receiving message . . . , do . . . ”). Processors cannot access a global clock. A message sent from one processor to another will arrive in finite but unbounded time.

Remarks:

• The asynchronous model and the synchronous model (Definition 3.1) are the cornerstone models in distributed computing. As they do not necessarily reflect reality, there are several models in between synchronous and asynchronous. However, from a theoretical point of view the synchronous and the asynchronous model are the most interesting ones (because every other model is in between these extremes).

• Note that in the asynchronous model, messages that take a longer path may arrive earlier.


In order to evaluate an algorithm, apart from the local complexity mentioned above, we consider the following metrics.

Definition 3.3 (Time Complexity). For synchronous algorithms (as defined in 3.1) the time complexity is the number of rounds until the algorithm terminates.

Definition 3.4 (Time Complexity). For asynchronous algorithms (as defined in Definition 3.2) the time complexity is the number of time units from the start of the execution to its completion in the worst case (every legal input, every execution scenario), assuming that each message has a delay of at most one time unit.

Remarks:

• You cannot use the maximum delay in the algorithm design. In other words, the algorithm has to be correct even if there is no such delay upper bound.

Definition 3.5 (Message Complexity). The message complexity of a synchronous or asynchronous algorithm is determined by the number of messages exchanged (again, every legal input, every execution scenario).

3.2 Anonymous Leader Election

Some algorithms (e.g., for medium access) ask for a special node, a so-called “leader”. Computing a leader is a most simple form of symmetry breaking. Algorithms based on leaders do generally not exhibit a high degree of parallelism, and therefore often suffer from poor (parallel) time complexity. However, sometimes it is still useful to have a leader to make critical decisions in an easy (though non-distributed!) way. The process of choosing a leader is known as leader election. Although leader election is a simple form of symmetry breaking, there are some remarkable issues that allow us to introduce notable computational models. In this chapter we concentrate on the ring topology. The ring is the “drosophila” of distributed computing as many interesting challenges already reveal the root of the problem in the special case of the ring. Paying special attention to the ring also makes sense from a practical point of view as some real world systems are based on a ring topology, e.g., the token ring standard for local area networks.

Problem 3.6 (Leader Election). Each node eventually decides whether it is a leader or not, subject to the constraint that there is exactly one leader.

Remarks:

• More formally, nodes are in one of three states: undecided, leader, not leader. Initially every node is in the undecided state. When leaving the undecided state, a node goes into a final state (leader or not leader).

Definition 3.7 (Anonymous). A system is anonymous if nodes do not have unique identifiers.

Definition 3.8 (Uniform). An algorithm is called uniform if the number of nodes n is not known to the algorithm (to the nodes, if you wish). If n is known, the algorithm is called non-uniform.

Whether a leader can be elected in an anonymous system depends on whether the network is symmetric (ring, complete graph, complete bipartite graph, etc.) or asymmetric (star, single node with highest degree, etc.). Simplifying slightly, in this context a symmetric graph is a graph in which the extended neighborhood of each node has the same structure. We will now show that non-uniform anonymous leader election for synchronous rings is impossible. The idea is that in a ring, symmetry can always be maintained.

Lemma 3.9. After round k of any deterministic algorithm on an anonymous ring, each node is in the same state sk.

Proof by induction: All nodes start in the same state. A round in a synchronous algorithm consists of the three steps sending, receiving, local computation (see Definition 3.1). All nodes send the same message(s), receive the same message(s), do the same local computation, and therefore end up in the same state.

Theorem 3.10 (Anonymous Leader Election). Deterministic leader election in an anonymous ring is impossible.

Proof (with Lemma 3.9): If one node ever decides to become a leader (or a non-leader), then every other node does so as well, contradicting the problem specification 3.6 for n > 1. This holds for non-uniform algorithms, and therefore also for uniform algorithms. Furthermore, it holds for synchronous algorithms, and therefore also for asynchronous algorithms.

Remarks:

• Sense of direction is the ability of nodes to distinguish neighbor nodes in an anonymous setting. In a ring, for example, a node can distinguish the clockwise and the counterclockwise neighbor. Sense of direction does not help in anonymous leader election.

• Theorem 3.10 also holds for other symmetric network topologies (e.g., complete graphs, complete bipartite graphs, . . . ).

• Note that Theorem 3.10 does not hold for randomized algorithms; if nodes are allowed to toss a coin, symmetries can be broken.

3.3 Asynchronous Ring

We first concentrate on the asynchronous model from Definition 3.2. Throughout this section we assume non-anonymity; each node has a unique identifier as proposed in Assumption 3.11:

Assumption 3.11 (Node Identifiers). Each node has a unique identifier, e.g., its IP address. We usually assume that each identifier consists of only log n bits if the system has n nodes.

Having IDs seems to lead to a trivial leader election algorithm, as we can simply elect the node with, e.g., the highest ID.

Algorithm 8 Clockwise
1: Each node v executes the following code:
2: v sends a message with its identifier (for simplicity also v) to its clockwise neighbor. {If node v already received a message w with w > v, then node v can skip this step; if node v receives its first message w with w < v, then node v will immediately send v.}
3: if v receives a message w with w > v then
4:   v forwards w to its clockwise neighbor
5:   v decides not to be the leader, if it has not done so already.
6: else if v receives its own identifier v then
7:   v decides to be the leader
8: end if
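A compact round-based simulation of Algorithm 8 (our own sketch; for simplicity every node forwards the largest identifier seen so far, which preserves correctness but not the exact message count):

```python
def clockwise_election(ids):
    """Nodes sit on a ring; node i's clockwise neighbor is (i + 1) % n.
    Each round, every node passes the largest identifier it has seen to
    its clockwise neighbor; the node receiving its own identifier wins."""
    n = len(ids)
    msgs = list(ids)                     # round 1: everyone sends its own ID
    while True:
        received = [msgs[(i - 1) % n] for i in range(n)]
        for i, w in enumerate(received):
            if w == ids[i]:              # own ID came back around the ring
                return i
        msgs = [max(w, v) for w, v in zip(received, ids)]

print(clockwise_election([3, 7, 2, 9, 4]))  # 3: the node with maximum ID 9
```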

Theorem 3.12 (Analysis of Algorithm 8). Algorithm 8 is correct. The time complexity is O(n). The message complexity is O(n^2).

Proof: Let node z be the node with the maximum identifier. Node z sends its identifier in clockwise direction, and since no other node can swallow it, eventually a message will arrive at z containing it. Then z declares itself to be the leader. Every other node will declare non-leader at the latest when forwarding message z. Since there are n identifiers in the system, each node will at most forward n messages, giving a message complexity of at most n^2. We start measuring the time when the first node that “wakes up” sends its identifier. For asynchronous time complexity (Definition 3.4) we assume that each message takes at most one time unit to arrive at its destination. After at most n − 1 time units the message therefore arrives at node z, waking z up. Routing the message z around the ring takes at most n time units. Therefore node z decides no later than at time 2n − 1. Every other node decides before node z.

Remarks:

• Note that in Algorithm 8 nodes need to distinguish between clockwise and counterclockwise neighbors. In fact they do not: It is okay to simply send your own identifier to any neighbor, and forward a message m to the neighbor you did not receive the message m from. So nodes only need to be able to distinguish their two neighbors.

• Can we improve this algorithm?

Algorithm 9 Radius Growth (For readability we provide pseudo-code only; for a formal version please consult [Attiya/Welch Alg. 3.1])
1: Each node v does the following:
2: Initially all nodes are active. {all nodes may still become leaders}
3: Whenever a node v sees a message w with w > v, then v decides to not be a leader and becomes passive.
4: Active nodes search in an exponentially growing neighborhood (clockwise and counterclockwise) for nodes with higher identifiers, by sending out probe messages. A probe message includes the ID of the original sender, a bit whether the sender can still become a leader, and a time-to-live number (TTL). The first probe message sent by node v includes a TTL of 1.
5: Nodes (active or passive) receiving a probe message decrement the TTL and forward the message to the next neighbor; if their ID is larger than the one in the message, they set the leader bit to zero, as the probing node does not have the maximum ID. If the TTL is zero, probe messages are returned to the sender using a reply message. The reply message contains the ID of the receiver (the original sender of the probe message) and the leader-bit. Reply messages are forwarded by all nodes until they reach the receiver.
6: Upon receiving the reply message: If there was no node with higher ID in the search area (indicated by the bit in the reply message), the TTL is doubled and two new probe messages are sent (again to the two neighbors). If there was a better candidate in the search area, then the node becomes passive.
7: If a node v receives its own probe message (not a reply) v decides to be the leader.

Theorem 3.13 (Analysis of Algorithm 9). Algorithm 9 is correct. The time complexity is O(n). The message complexity is O(n log n).

Proof: Correctness is as in Theorem 3.12. The time complexity is O(n) since the node with maximum identifier z sends messages with round-trip times 2, 4, 8, 16, . . . , 2 · 2^k with k ≤ log(n + 1). (Even if we include the additional wake-up overhead, the time complexity stays linear.) Proving the message complexity is slightly harder: if a node v manages to survive round r, no other node in distance 2^r (or less) survives round r. That is, node v is the only node in its 2^r-neighborhood that remains active in round r + 1. Since this is the same for every node, less than n/2^r nodes are active in round r + 1. Being active in round r costs 2 · 2 · 2^r messages. Therefore, round r costs at most 2 · 2 · 2^r · n/2^{r−1} = 8n messages. Since there are only logarithmically many possible rounds, the message complexity follows immediately.

Remarks:

• This algorithm is asynchronous and uniform as well.

• The question may arise whether one can design an algorithm with an even lower message complexity. We answer this question in the next section.

3.4 Lower Bounds

Lower bounds in distributed computing are often easier than in the standard centralized (random access machine, RAM) model because one can argue about messages that need to be exchanged. In this section we present a first lower bound. We show that Algorithm 9 is asymptotically optimal.

Definition 3.14 (Execution). An execution of a distributed algorithm is a list of events, sorted by time. An event is a record (time, node, type, message), where type is “send” or “receive”. 22 CHAPTER 3. LEADER ELECTION

Remarks:

• We assume throughout this course that no two events happen at exactly the same time (or one can break ties arbitrarily).

• An execution of an asynchronous algorithm is generally not only determined by the algorithm but also by a “god-like” scheduler. If more than one message is in transit, the scheduler can choose which one arrives first.

• If two messages are transmitted over the same directed edge, then it is sometimes required that the message first transmitted will also be received first (“FIFO”).

For our lower bound, we assume the following model:

• We are given an asynchronous ring, where nodes may wake up at arbitrary times (but at the latest when receiving the first message).

• We only accept uniform algorithms where the node with the maximum identifier can be the leader. Additionally, every node that is not the leader must know the identity of the leader. These two requirements can be dropped when using a more complicated proof; however, this is beyond the scope of this course.

• During the proof we will “play god” and specify which message in transmission arrives next in the execution. We respect the FIFO conditions for links.

Definition 3.15 (Open Schedule). A schedule is an execution chosen by the scheduler. A schedule for a ring is open if there is an open edge in the ring. An open (undirected) edge is an edge where no message traversing the edge has been received so far.

The proof of the lower bound is by induction. First we show the base case:

Lemma 3.16. Given a ring R with two nodes, we can construct an open schedule in which at least one message is received. The nodes cannot distinguish this schedule from one on a larger ring with all other nodes being where the open edge is.

Proof: Let the two nodes be u and v with u < v. Node u must learn the identity of node v, thus receive at least one message. We stop the execution of the algorithm as soon as the first message is received. (If the first message is received by v, bad luck for the algorithm!) Then the other edge in the ring (on which the received message was not transmitted) is open. Since the algorithm needs to be uniform, maybe the open edge is not really an edge at all, nobody can tell. We could use this to glue two rings together, by breaking up this imaginary open edge and connecting the two rings by two edges.

Lemma 3.17. By gluing together two rings of size n/2 for which we have open schedules, we can construct an open schedule on a ring of size n. If M(n/2) denotes the number of messages already received in each of these schedules, at least 2M(n/2) + n/4 messages have to be exchanged in order to solve leader election.

Proof by induction: We divide the ring into two sub-rings R1 and R2 of size n/2. These subrings cannot be distinguished from rings with n/2 nodes if no messages are received from “outsiders”. We can ensure this by not scheduling such messages until we want to. Note that executing both given open schedules on R1 and R2 “in parallel” is possible because we control not only the scheduling of the messages, but also when nodes wake up. By doing so, we make sure that 2M(n/2) messages are sent before the nodes in R1 and R2 learn anything of each other! Without loss of generality, R1 contains the maximum identifier. Hence, each node in R2 must learn the identity of the maximum identifier, thus at least n/2 additional messages must be received. The only problem is that we cannot connect the two sub-rings with both edges since the new ring needs to remain open. Thus, only messages over one of the edges can be received. We “play god” and look into the future: we check what happens when we close only one of these connecting edges. With the argument that n/2 new messages must be received, we know that there is at least one edge that will produce at least n/4 additional messages when being scheduled. (These messages may not be sent over the closed link, but they are caused by a message over this link. They cannot involve any message along the other (open) edge at distance n/2.) We schedule this edge and the resulting n/4 messages, and leave the other open.

Lemma 3.18. Any uniform leader election algorithm for asynchronous rings has at least message complexity M(n) ≥ (n/4)(log n + 1).

Proof by induction: For simplicity we assume n being a power of 2. The base case n = 2 works because of Lemma 3.16, which implies that M(2) ≥ 1 = (2/4)(log 2 + 1). For the induction step, using Lemma 3.17 and the induction hypothesis we have

M(n) ≥ 2 · M(n/2) + n/4
     ≥ 2 · (n/8)(log(n/2) + 1) + n/4
     = (n/4) log n + n/4 = (n/4)(log n + 1) .

Remarks:

• To hide the ugly constants we use the “big Omega” notation, the lower bound equivalent of O(). A function f is in Ω(g) if there are constants x0 and c > 0 such that |f(x)| ≥ c|g(x)| for all x ≥ x0. Again we refer to standard text books for a formal definition. Rewriting Lemma 3.18 we get:

Theorem 3.19 (Asynchronous Leader Election Lower Bound). Any uniform leader election algorithm for asynchronous rings has Ω(n log n) message complexity.

3.5 Synchronous Ring

The lower bound relied on delaying messages for a very long time. Since this is impossible in the synchronous model, we might get a better message complexity in this case. The basic idea is very simple: In the synchronous model, not receiving a message is information as well! First we make some additional assumptions:

• We assume that the algorithm is non-uniform (i.e., the ring size n is known).

• We assume that every node starts at the same time.

• The node with the minimum identifier becomes the leader; identifiers are integers.

Algorithm 10 Synchronous Leader Election
1: Each node v concurrently executes the following code:
2: The algorithm operates in synchronous phases. Each phase consists of n time steps. Node v counts phases, starting with 0.
3: if phase = v and v did not yet receive a message then
4:   v decides to be the leader
5:   v sends the message “v is leader” around the ring
6: end if
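A back-of-the-envelope sketch of Algorithm 10 (our own; it returns the leader together with the cost figures discussed in the remarks below):

```python
def synchronous_election(ids):
    """With n known and a synchronous start, the node with minimum
    identifier m is the first to reach "its" phase without having heard
    from anyone, declares itself leader in phase m, and floods
    "m is leader" once around the ring."""
    n, m = len(ids), min(ids)
    decision_time = m * n   # the leader decides at the start of phase m
    messages = n            # the announcement travels once around the ring
    return m, decision_time, messages

print(synchronous_election([4, 9, 6, 3]))  # (3, 12, 4)
```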

Remarks:

• Message complexity is indeed n.

• But the time complexity is huge! If m is the minimum identifier, it is m · n.

• The synchronous start and the non-uniformity assumptions can be dropped by using a wake-up technique (upon receiving a wake-up message, wake up your clockwise neighbors) and by letting messages travel slowly.

• There are several lower bounds for the synchronous model: comparison-based algorithms or algorithms where the time complexity cannot be a function of the identifiers have message complexity Ω(n log n) as well.

• In general graphs efficient leader election may be tricky. While time-optimal leader election can be done by parallel flooding-echo (see next chapter), bounding the message complexity is generally more difficult.

Chapter 4

Tree Algorithms

In this chapter we learn a few basic algorithms on trees, and how to construct trees in the first place so that we can run these (and other) algorithms. The good news is that these algorithms have many applications, the bad news is that this chapter is a bit on the simple side. But maybe that’s not really bad news?!

4.1 Broadcast

Definition 4.1 (Broadcast). A broadcast operation is initiated by a single processor, the source. The source wants to send a message to all other nodes in the system.

Definition 4.2 (Distance, Radius, Diameter). The distance between two nodes u and v in an undirected graph G is the number of hops of a minimum path between u and v. The radius of a node u is the maximum distance between u and any other node in the graph. The radius of a graph is the minimum radius of any node in the graph. The diameter of a graph is the maximum distance between two arbitrary nodes.

Remarks:

• Clearly there is a close relation between the radius R and the diameter D of a graph, such as R ≤ D ≤ 2R.

• The world is often fascinated by graphs with a small radius. For example, movie fanatics study the who-acted-with-whom-in-the-same-movie graph. For this graph it has long been believed that the actor Kevin Bacon has a particularly small radius. The number of hops from Bacon even got a name, the Bacon Number. In the meantime, however, it has been shown that there are “better” centers in the Hollywood universe, such as Sean Connery, Christopher Lee, Rod Steiger, Gene Hackman, or Michael Caine. The center of other social networks has also been explored; Paul Erdős for instance is well known in the math community.

Theorem 4.3 (Broadcast Lower Bound). The message complexity of broadcast is at least n − 1. The radius of the source is a lower bound for the time complexity.


Proof: Every node must receive the message.

Remarks:

• You can use a pre-computed spanning tree to do broadcast with tight message complexity. If the spanning tree is a breadth-first search spanning tree (for a given source), then the time complexity is tight as well.

Definition 4.4 (Clean). A graph (network) is clean if the nodes do not know the topology of the graph.

Theorem 4.5 (Clean Broadcast Lower Bound). For a clean network, the number of edges is a lower bound for the broadcast message complexity.

Proof: If you do not try every edge, you might miss a whole part of the graph behind it.

Remarks:

• This lower bound proof directly brings us to the well known flooding algorithm.

Algorithm 11 Flooding
1: The source (root) sends the message to all neighbors.
2: Each other node v upon receiving the message the first time forwards the message to all (other) neighbors.
3: Upon later receiving the message again (over other edges), a node can discard the message.

Remarks:

• If node v receives the message first from node u, then node v calls node u parent. This parent relation defines a spanning tree T . If the flooding algorithm is executed in a synchronous system, then T is a breadth-first search spanning tree (with respect to the root).

• More interestingly, also in asynchronous systems the flooding algorithm terminates after R time units, R being the radius of the source. However, the constructed spanning tree may not be a breadth-first search spanning tree.

4.2 Convergecast

Convergecast is the same as broadcast, just reversed: Instead of a root sending a message to all other nodes, all other nodes send information to a root. The simplest convergecast algorithm is the echo algorithm:

Algorithm 12 Echo
Require: This algorithm is initiated at the leaves.
1: A leaf sends a message to its parent.
2: If an inner node has received a message from each child, it sends a message to the parent.

Remarks:

• Usually the echo algorithm is paired with the flooding algorithm, which is used to let the leaves know that they should start the echo process; this is known as flooding/echo.

• One can use convergecast for termination detection, for example. If a root wants to know whether all nodes in the system have finished some task, it initiates a flooding/echo; the message in the echo algorithm then means “This subtree has finished the task.”

• Message complexity of the echo algorithm is n − 1, but together with flooding it is O(m), where m = |E| is the number of edges in the graph.

• The time complexity of the echo algorithm is determined by the depth of the spanning tree (i.e., the radius of the root within the tree) generated by the flooding algorithm.

• The flooding/echo algorithm can do much more than collecting acknowledgements from subtrees. One can for instance use it to compute the number of nodes in the system, or the maximum ID (for leader election), or the sum of all values stored in the system, or a route-disjoint matching.

• Moreover, by combining results one can compute even fancier aggregations, e.g., with the number of nodes and the sum one can compute the average. With the average one can compute the standard deviation. And so on . . .
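A compact sketch of flooding/echo used as an aggregation primitive (our own illustration; the flooding is simulated by a BFS, so the resulting tree is the one a synchronous execution would build):

```python
from collections import deque

def flood_echo_sum(graph, values, root):
    """Flooding (Algorithm 11) builds a spanning tree rooted at `root`;
    the echo phase (Algorithm 12) then adds up the node values from the
    leaves towards the root."""
    parent, order, q = {root: None}, [root], deque([root])
    while q:                          # flooding: first reception -> parent
        u = q.popleft()
        for v in graph[u]:
            if v not in parent:
                parent[v] = u
                order.append(v)
                q.append(v)
    subtree = dict(values)            # echo: children report before parents
    for v in reversed(order[1:]):
        subtree[parent[v]] += subtree[v]
    return subtree[root]

g = {0: {1, 2}, 1: {0, 3}, 2: {0}, 3: {1}}
print(flood_echo_sum(g, {0: 5, 1: 1, 2: 2, 3: 7}, root=0))  # 15
```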

4.3 BFS Tree Construction

In synchronous systems the flooding algorithm is a simple yet efficient method to construct a breadth-first search (BFS) spanning tree. However, in asynchronous systems the spanning tree constructed by the flooding algorithm may be far from BFS. In this section, we implement two classic BFS constructions, Dijkstra and Bellman-Ford, as asynchronous algorithms. We start with the Dijkstra algorithm. The basic idea is to always add the “closest” node to the existing part of the BFS tree. We need to parallelize this idea by developing the BFS tree layer by layer:

Algorithm 13 Dijkstra BFS
1: The algorithm proceeds in phases. In phase p the nodes with distance p to the root are detected. Let Tp be the tree in phase p. We start with T1 which is the root plus all direct neighbors of the root. We start with phase p = 1:
2: repeat
3:   The root starts phase p by broadcasting “start p” within Tp.
4:   When receiving “start p” a leaf node u of Tp (that is, a node that was newly discovered in the last phase) sends a “join p + 1” message to all quiet neighbors. (A neighbor v is quiet if u has not yet “talked” to v.)
5:   A node v receiving the first “join p + 1” message replies with “ACK” and becomes a leaf of the tree Tp+1.
6:   A node v receiving any further “join” message replies with “NACK”.
7:   The leaves of Tp collect all the answers of their neighbors; then the leaves start an echo algorithm back to the root.
8:   When the echo process terminates at the root, the root increments the phase
9: until there was no new node detected

Theorem 4.6 (Analysis of Algorithm 13). The time complexity of Algorithm 13 is O(D^2), the message complexity is O(m + nD), where D is the diameter of the graph, n the number of nodes, and m the number of edges.

Proof: A broadcast/echo algorithm in Tp needs at most time 2D. Finding new neighbors at the leaves costs 2 time units. Since the BFS tree height is bounded by the diameter, we have D phases, giving a total time complexity of O(D^2). Each node participating in broadcast/echo only receives (broadcasts) at most 1 message and sends (echoes) at most once. Since there are D phases, the cost is bounded by O(nD). On each edge there are at most 2 “join” messages. Replies to a “join” request are answered by 1 “ACK” or “NACK”, which means that we have at most 4 additional messages per edge. Therefore the message complexity is O(m + nD).

Remarks:

• The time complexity is not very exciting, so let’s try Bellman-Ford!

The basic idea of Bellman-Ford is even simpler, and heavily used in the Internet, as it is a basic version of the omnipresent border gateway protocol (BGP). The idea is to simply keep the distance to the root accurate. If a neighbor has found a better route to the root, a node might also need to update its distance.

Algorithm 14 Bellman-Ford BFS
1: Each node u stores an integer du which corresponds to the distance from u to the root. Initially droot = 0, and du = ∞ for every other node u.
2: The root starts the algorithm by sending “1” to all neighbors.
3: if a node u receives a message “y” with y < du from a neighbor v then
4:   node u sets du := y
5:   node u sends “y + 1” to all neighbors (except v)
6: end if
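An asynchronous simulation of Algorithm 14 (our own sketch; arbitrary message delays are modeled by delivering pending messages in random order, and for brevity updates are re-sent to all neighbors including the sender):

```python
import random

def bellman_ford_bfs(graph, root):
    """Each node u keeps a distance estimate d[u]; whenever a message "y"
    improves the estimate, u adopts it and notifies its neighbors with
    "y + 1" (Algorithm 14)."""
    d = {v: float("inf") for v in graph}
    d[root] = 0
    pending = [(v, 1) for v in graph[root]]  # root sends "1" to neighbors
    while pending:
        i = random.randrange(len(pending))   # arbitrary delivery order
        u, y = pending.pop(i)
        if y < d[u]:
            d[u] = y
            pending += [(v, y + 1) for v in graph[u]]
    return d

g = {0: {1, 2}, 1: {0, 2, 3}, 2: {0, 1}, 3: {1}}
print(bellman_ford_bfs(g, 0))  # {0: 0, 1: 1, 2: 1, 3: 2}
```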

Theorem 4.7 (Analysis of Algorithm 14). The time complexity of Algorithm 14 is O(D), the message complexity is O(nm), where D, n, m are defined as in Theorem 4.6.

Proof: We can prove the time complexity by induction. We claim that a node at distance d from the root has received a message “d” by time d. The root knows by time 0 that it is the root. A node v at distance d has a neighbor u at distance d − 1. Node u by induction sends a message “d” to v at time d − 1 or before, which is then received by v at time d or before. Message complexity is easier: A node can reduce its distance at most n − 1 times; each of these times it sends a message to all its neighbors. If all nodes do this we have O(nm) messages.

Remarks:

• Algorithm 13 has the better message complexity and Algorithm 14 has the better time complexity. The currently best algorithm (optimizing both) needs O(m + n log^3 n) messages and O(D log^3 n) time. This “trade-off” algorithm is beyond the scope of this course.

4.4 MST Construction

There are several types of spanning trees, each serving a different purpose. A particularly interesting spanning tree is the minimum spanning tree (MST). The MST only makes sense on weighted graphs, hence in this section we assume that each edge e is assigned a weight ωe.

Definition 4.8 (MST). Given a weighted graph G = (V, E, ω), the MST of G is a spanning tree T minimizing ω(T), where ω(G') = Σ_{e∈G'} ωe for any subgraph G' ⊆ G.

Remarks:

• In the following we assume that no two edges of the graph have the same weight. This simplifies the problem as it makes the MST unique; however, this simplification is not essential as one can always break ties by adding the IDs of adjacent vertices to the weight.

• Obviously we are interested in computing the MST in a distributed way. For this we use a well-known lemma:

Definition 4.9 (Blue Edges). Let T be a spanning tree of the weighted graph G and T' ⊆ T a subgraph of T (also called a fragment). Edge e = (u, v) is an outgoing edge of T' if u ∈ T' and v ∉ T' (or vice versa). The minimum weight outgoing edge b(T') is the so-called blue edge of T'.

Lemma 4.10. For a given weighted graph G (such that no two weights are the same), let T denote the MST, and T' be a fragment of T. Then the blue edge of T' is also part of T, i.e., T' ∪ b(T') ⊆ T.

Proof: For the sake of contradiction, suppose that in the MST T there is an edge e ≠ b(T′) connecting T′ with the remainder of T. Adding the blue edge b(T′) to the MST T we get a cycle including both e and b(T′). If we remove e from this cycle we still have a spanning tree, and since by the definition of the blue edge ω_e > ω_{b(T′)}, the weight of that new spanning tree is less than the weight of T. We have a contradiction.

Remarks:

• In other words, the blue edges seem to be the key to a distributed algorithm for the MST problem. Since every node itself is a fragment of the MST, every node directly has a blue edge! All we need to do is to grow these fragments! Essentially this is a distributed version of Kruskal’s sequential algorithm.

• At any given time the nodes of the graph are partitioned into fragments (rooted subtrees of the MST). Each fragment has a root, and the ID of the fragment is the ID of its root. Each node knows its parent and its children in the fragment. The algorithm operates in phases. At the beginning of a phase, nodes know the IDs of the fragments of their neighbor nodes.

Algorithm 15 GHS (Gallager–Humblet–Spira)

1: Initially each node is the root of its own fragment. We proceed in phases:
2: repeat
3:   All nodes learn the fragment IDs of their neighbors.
4:   The root of each fragment uses flooding/echo in its fragment to determine the blue edge b = (u, v) of the fragment.
5:   The root sends a message to node u; while forwarding the message on the path from the root to node u all parent-child relations are inverted {such that u is the new temporary root of the fragment}
6:   node u sends a merge request over the blue edge b = (u, v).
7:   if node v also sent a merge request over the same blue edge b = (v, u) then
8:     either u or v (whichever has the smaller ID) is the new fragment root
9:     the blue edge b is directed accordingly
10:  else
11:    node v is the new parent of node u
12:  end if
13:  the newly elected root node informs all nodes in its fragment (again using flooding/echo) about its identity
14: until all nodes are in the same fragment (i.e., there is no outgoing edge)
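The phase structure of Algorithm 15 is easiest to see in a sequential, Borůvka-style sketch (our own simplification: it picks blue edges with a global view instead of flooding/echo, but each iteration of the outer loop corresponds to one GHS phase):

import java.util.*;

public class BoruvkaMST {
    static int[] parent;
    static int find(int x) { return parent[x] == x ? x : (parent[x] = find(parent[x])); }

    // edges[i] = {u, v, weight}; weights assumed pairwise distinct (as in the text)
    static List<int[]> mst(int n, int[][] edges) {
        parent = new int[n];
        for (int i = 0; i < n; i++) parent[i] = i;   // every node is its own fragment
        List<int[]> tree = new ArrayList<>();
        boolean merged = true;
        while (merged) {                             // one iteration = one phase
            merged = false;
            int[] blue = new int[n];                 // blue edge index per fragment root
            Arrays.fill(blue, -1);
            for (int i = 0; i < edges.length; i++) { // each fragment finds its blue edge
                int ru = find(edges[i][0]), rv = find(edges[i][1]);
                if (ru == rv) continue;              // not an outgoing edge
                if (blue[ru] == -1 || edges[i][2] < edges[blue[ru]][2]) blue[ru] = i;
                if (blue[rv] == -1 || edges[i][2] < edges[blue[rv]][2]) blue[rv] = i;
            }
            for (int b : blue) {                     // merge along the blue edges
                if (b == -1) continue;
                int ru = find(edges[b][0]), rv = find(edges[b][1]);
                if (ru != rv) { parent[ru] = rv; tree.add(edges[b]); merged = true; }
            }
        }
        return tree;   // correct by Lemma 4.10; at most log n phases as in Theorem 4.11
    }

    public static void main(String[] args) {
        int[][] edges = {{0,1,1},{1,2,2},{0,2,3},{2,3,4}};
        mst(4, edges).forEach(e -> System.out.println(Arrays.toString(e)));
    }
}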

Remarks:

• Algorithm 15 was stated in pseudo-code, with a few details not really explained. For instance, it may be that some fragments are much larger than others, and because of that some nodes may need to wait for others, e.g., if node u needs to find out whether neighbor v also wants to merge over the blue edge b = (u, v). The good news is that all these details can be solved. We can for instance bound the asynchronicity by guaranteeing that nodes only start the new phase after the last phase is done, similarly to the phase-technique of Algorithm 13.

Theorem 4.11 (Analysis of Algorithm 15). The time complexity of Algorithm 15 is O(n log n), the message complexity is O(m log n).

Proof: Each phase mainly consists of two flooding/echo processes. In general, the cost of flooding/echo on a tree is O(D) time and O(n) messages. However, the diameter D of the fragments may turn out to be not related to the diameter of the graph because the MST may meander, hence it really is O(n) time. In addition, in the first step of each phase, nodes need to learn the fragment IDs of their neighbors; this can be done in 2 steps but costs O(m) messages. There are a few more steps, but they are cheap. Altogether a phase costs O(n) time and O(m) messages. So we only have to figure out the number of phases: Initially all fragments are single nodes and hence have size 1. In each later phase, each fragment merges with at least one other fragment, that is, the size of the smallest fragment at least doubles. In other words, we have at most log n phases. The theorem follows directly.

Remarks:

• Algorithm 15 is called “GHS” after Gallager, Humblet, and Spira, three pioneers in distributed computing. Despite being quite simple the algorithm won the prestigious Edsger W. Dijkstra Prize in Distributed Computing in 2004, among other reasons because it was one of the first (1983) non-trivial asynchronous distributed algorithms. As such it can be seen as one of the seeds of this research area.

• We presented a simplified version of GHS. The original paper by Gallager et al. featured an improved message complexity of O(m + n log n).

• In 1987, Awerbuch managed to further improve the GHS algorithm to get O(n) time and O(m + n log n) message complexity, both asymptotically optimal.

• The GHS algorithm can be applied in different ways. GHS for instance directly solves leader election in general graphs: The leader is simply the last surviving root!

Chapter 5

Distributed Sorting

“Indeed, I believe that virtually every important aspect of programming arises somewhere in the context of sorting [and searching]!” – Donald E. Knuth, The Art of Computer Programming

In this chapter we study a classic problem in computer science—sorting—from a distributed computing perspective. In contrast to an orthodox single-processor sorting algorithm, no node has access to all the data; instead, the to-be-sorted values are distributed over the nodes. Distributed sorting then boils down to:

Definition 5.1 (Sorting). We choose a graph with n nodes v1, . . . , vn. Initially each node stores a value. After applying a sorting algorithm, node vk stores the kth smallest value.

Remarks:

• What if we route all values to the same central node v, let v sort the values locally, and then route them to the correct destinations?! According to the message passing model studied in the first few chapters this is perfectly legal. With a star topology sorting finishes in O(1) time!

Definition 5.2 (Node Contention). In each step of a synchronous algorithm, each node can only send and receive O(1) messages containing O(1) values, no matter how many neighbors the node has.

Remarks:

• Using Definition 5.2 sorting on a star graph takes linear time.

5.1 Array & Mesh

To get a better intuitive understanding of distributed sorting, we start with two simple topologies, the array and the mesh. Let us begin with the array:


Algorithm 16 Odd/Even Sort

1: Given an array of n nodes (v1, . . . , vn), each storing a value (not sorted).
2: repeat
3:   Compare and exchange the values at nodes i and i + 1, i odd
4:   Compare and exchange the values at nodes i and i + 1, i even
5: until done

Remarks:

• The compare and exchange primitive in Algorithm 16 is defined as follows: Let the value stored at node i be vi. After the compare and exchange node i stores value min(vi, vi+1) and node i + 1 stores value max(vi, vi+1).

• How fast is the algorithm, and how can we prove correctness/efficiency?

• The most interesting proof uses the so-called 0-1 Sorting Lemma. It allows us to restrict our attention to an input of 0’s and 1’s only, and works for any “oblivious comparison-exchange” algorithm. (Oblivious means: Whether you exchange two values must only depend on the relative order of the two values, and not on anything else.)

Lemma 5.3 (0-1 Sorting Lemma). If an oblivious comparison-exchange algorithm sorts all inputs of 0’s and 1’s, then it sorts arbitrary inputs.

Proof. We prove the opposite direction (does not sort arbitrary inputs ⇒ does not sort 0’s and 1’s). Assume that there is an input x = x1, . . . , xn that is not sorted correctly. Then there is a smallest value k such that the value at node vk after running the sorting algorithm is strictly larger than the kth smallest value x(k). Define an input x∗ by x∗i = 0 ⇔ xi ≤ x(k), and x∗i = 1 else. Whenever the algorithm compares a pair of 1’s or 0’s, it is not important whether it exchanges the values or not, so we may simply assume that it does the same as on the input x. On the other hand, whenever the algorithm exchanges some values x∗i = 0 and x∗j = 1, this means that xi ≤ x(k) < xj. Therefore, in this case the respective compare-exchange operation will do the same on both inputs. We conclude that the algorithm will order x∗ the same way as x, i.e., the output with only 0’s and 1’s will also not be correct.

Theorem 5.4. Algorithm 16 sorts correctly in n steps.

Proof. Thanks to Lemma 5.3 we only need to consider an array with 0’s and 1’s. Let j1 be the node with the rightmost (highest index) 1. If j1 is odd (even) it will move in the first (second) step. In any case it will move right in every following step until it reaches the rightmost node vn. Let jk be the node with the kth rightmost 1. We show by induction that jk is not “blocked” anymore (constantly moves until it reaches its destination!) after step k. We have already anchored the induction at k = 1. Since jk−1 moves after step k − 1, jk gets a right 0-neighbor for each step after step k. (For matters of presentation we omitted a couple of simple details.)
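For reference, a compact single-process simulation of Algorithm 16 (a sketch; names are ours). One iteration of the outer loop is one synchronous step, and by Theorem 5.4 n steps suffice:

import java.util.Arrays;

public class OddEvenSort {
    // Simulates Algorithm 16: n steps of alternating odd/even compare-exchanges.
    static void oddEvenSort(int[] a) {
        int n = a.length;
        for (int step = 0; step < n; step++) {
            // the step parity selects the odd or the even pairs
            for (int i = step % 2; i + 1 < n; i += 2) {
                if (a[i] > a[i + 1]) {             // compare and exchange
                    int t = a[i]; a[i] = a[i + 1]; a[i + 1] = t;
                }
            }
        }
    }

    public static void main(String[] args) {
        int[] a = {5, 1, 4, 2, 3};
        oddEvenSort(a);                            // sorted after n = 5 steps
        System.out.println(Arrays.toString(a));    // [1, 2, 3, 4, 5]
    }
}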

Algorithm 17 Shearsort

1: We are given a mesh with m rows and m columns, m even, n = m².
2: The sorting algorithm operates in phases, and uses the odd/even sort algorithm on rows or columns.
3: repeat
4:   In the odd phases 1, 3, . . . we sort all the rows, in the even phases 2, 4, . . . we sort all the columns, such that:
5:   Columns are sorted such that the small values move up.
6:   Odd rows (1, 3, . . . , m − 1) are sorted such that small values move left.
7:   Even rows (2, 4, . . . , m) are sorted such that small values move right.
8: until done

Remarks:

• Linear time is not very exciting, maybe we can do better by using a different topology? Let’s try a mesh (a.k.a. grid) topology first.

Theorem 5.5. Algorithm 17 sorts n values in √n (log n + 1) time in snake-like order.

Proof. Since the algorithm is oblivious, we can use Lemma 5.3. We show that after a row and a column phase, half of the previously unsorted rows will be sorted. More formally, let us call a row with only 0’s (or only 1’s) clean, a row with 0’s and 1’s is dirty. At any stage, the rows of the mesh can be divided into three regions. In the north we have a region of all-0 rows, in the south all-1 rows, in the middle a region of dirty rows. Initially all rows can be dirty. Since neither row nor column sort will touch already clean rows, we can concentrate on the dirty rows. First we run an odd phase. Then, in the even phase, we run a peculiar column sorter: We group two consecutive dirty rows into pairs. Since odd and even rows are sorted in opposite directions, two consecutive dirty rows look as follows: 00000 ... 11111

11111 ... 00000

Such a pair can be in one of three states. Either we have more 0’s than 1’s, or more 1’s than 0’s, or an equal number of 0’s and 1’s. Column-sorting each pair will give us at least one clean row (and two clean rows if “|0| = |1|”). Then move the cleaned rows north/south and we will be left with half the dirty rows. At first glance it appears that we need such a peculiar column sorter. However, any column sorter sorts the columns in exactly the same way (we are very grateful to have Lemma 5.3!). All in all we need 2 log m = log n phases until only 1 dirty row remains in the middle, which will be sorted (not cleaned) with the last row-sort.
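The row/column phases translate directly into a simulation (a sketch with our own helper names; Arrays.sort stands in for the odd/even sorting of a row or column):

import java.util.Arrays;

public class Shearsort {
    // Simulates Algorithm 17 on an m x m mesh (snake-like order).
    static void shearsort(int[][] mesh) {
        int m = mesh.length;
        int rounds = 32 - Integer.numberOfLeadingZeros(Math.max(1, m)); // ~log m rounds
        for (int r = 0; r < rounds; r++) {
            for (int i = 0; i < m; i++) {          // row phase: snake-like directions
                Arrays.sort(mesh[i]);
                if (i % 2 == 1) reverse(mesh[i]);  // even rows (1-indexed): small values right
            }
            for (int j = 0; j < m; j++) {          // column phase: small values move up
                int[] col = new int[m];
                for (int i = 0; i < m; i++) col[i] = mesh[i][j];
                Arrays.sort(col);
                for (int i = 0; i < m; i++) mesh[i][j] = col[i];
            }
        }
        for (int i = 0; i < m; i++) {              // final row phase sorts the last dirty row
            Arrays.sort(mesh[i]);
            if (i % 2 == 1) reverse(mesh[i]);
        }
    }

    static void reverse(int[] a) {
        for (int l = 0, r = a.length - 1; l < r; l++, r--) {
            int t = a[l]; a[l] = a[r]; a[r] = t;
        }
    }

    public static void main(String[] args) {
        int[][] mesh = {{9, 2, 7}, {4, 8, 1}, {6, 3, 5}};
        shearsort(mesh);
        for (int[] row : mesh) System.out.println(Arrays.toString(row));
        // prints [1, 2, 3] / [6, 5, 4] / [7, 8, 9]: snake-like order
    }
}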

Remarks:

• There are algorithms that sort in 3m + o(m) time on an m by m mesh (by dividing the mesh into smaller blocks). This is asymptotically optimal, since a value might need to move 2m times.

• Such a √n-sorter is cute, but we are more ambitious. There are non-distributed sorting algorithms such as quicksort, heapsort, or mergesort that sort n values in (expected) O(n log n) time. Using our n-fold parallelism effectively we might therefore hope for a distributed sorting algorithm that sorts in time O(log n)!

5.2 Sorting Networks

In this section we construct a graph topology which is carefully manufactured for sorting. This is a deviation from previous chapters where we always had to work with the topology that was given to us. In many application areas (e.g., peer-to-peer networks, communication switches, systolic hardware) it is indeed possible (in fact, crucial!) that an engineer can build the topology best suited for her application.

Definition 5.6 (Sorting Networks). A comparator is a device with two inputs x, y and two outputs x′, y′ such that x′ = min(x, y) and y′ = max(x, y). We construct so-called comparison networks that consist of wires that connect comparators (the output port of a comparator is sent to an input port of another comparator). Some wires are not connected to comparator outputs, and some are not connected to comparator inputs. The first are called input wires of the comparison network, the second output wires. Given n values on the input wires, a sorting network ensures that the values are sorted on the output wires. We will also use the term width to indicate the number of wires in the sorting network.

Remarks:

• The odd/even sorter explained in Algorithm 16 can be described as a sorting network.

• Often we will draw all the wires on n horizontal lines (n being the “width” of the network). Comparators are then vertically connecting two of these lines.

• Note that a sorting network is an oblivious comparison-exchange network. Consequently we can apply Lemma 5.3 throughout this section. An example sorting network is depicted in Figure 5.1.

Definition 5.7 (Depth). The depth of an input wire is 0. The depth of a comparator is the maximum depth of its input wires plus one. The depth of an output wire of a comparator is the depth of the comparator. The depth of a comparison network is the maximum depth (of an output wire).

Definition 5.8 (Bitonic Sequence). A bitonic sequence is a sequence of numbers that first monotonically increases, and then monotonically decreases, or vice versa.

Figure 5.1: A sorting network.

Remarks:

• ⟨1, 4, 6, 8, 3, 2⟩ or ⟨5, 3, 2, 1, 4, 8⟩ are bitonic sequences.

• ⟨9, 6, 2, 3, 5, 4⟩ or ⟨7, 4, 2, 5, 9, 8⟩ are not bitonic.

• Since we restrict ourselves to 0’s and 1’s (Lemma 5.3), bitonic sequences have the form 0^i 1^j 0^k or 1^i 0^j 1^k for i, j, k ≥ 0.

Algorithm 18 Half Cleaner

1: A half cleaner is a comparison network of depth 1, where we compare wire i with wire i + n/2 for i = 1, . . . , n/2 (we assume n to be even).

Lemma 5.9. Feeding a bitonic sequence into a half cleaner (Algorithm 18), the half cleaner cleans (makes all-0 or all-1) either the upper or the lower half of the n wires. The other half is bitonic.

Proof. Assume that the input is of the form 0^i 1^j 0^k for i, j, k ≥ 0. If the midpoint falls into the 0’s, the input is already clean/bitonic and will stay so. If the midpoint falls into the 1’s the half cleaner acts as Shearsort with two adjacent rows, exactly as in the proof of Theorem 5.5. The case 1^i 0^j 1^k is symmetric.

Algorithm 19 Bitonic Sequence Sorter

1: A bitonic sequence sorter of width n (n being a power of 2) consists of a half cleaner of width n, and then two bitonic sequence sorters of width n/2 each.
2: A bitonic sequence sorter of width 1 is empty.

Lemma 5.10. A bitonic sequence sorter (Algorithm 19) of width n sorts bitonic sequences. It has depth log n.

Proof. The proof follows directly from Algorithm 19 and Lemma 5.9.

Remarks: • Clearly we want to sort arbitrary and not only bitonic sequences! To do this we need one more concept, merging networks.

Algorithm 20 Merging Network

1: A merging network of width n is a merger of width n followed by two bitonic sequence sorters of width n/2. A merger is a depth-one network where we compare wire i with wire n − i + 1, for i = 1, . . . , n/2.

Remarks:

• Note that a merging network is a bitonic sequence sorter where we replace the (first) half-cleaner by a merger.

Lemma 5.11. A merging network of width n (Algorithm 20) merges two sorted input sequences of length n/2 each into one sorted sequence of length n.

Proof. We have two sorted input sequences. Essentially, a merger does to two sorted sequences what a half cleaner does to a bitonic sequence, since the lower part of the input is reversed. In other words, we can use the same argument as in Theorem 5.5 and Lemma 5.9: Again, after the merger step either the upper or the lower half is clean, the other is bitonic. The bitonic sequence sorters complete sorting.

Remarks: • How do you sort n values when you are able to merge two sorted sequences of size n/2? Piece of cake, just apply the merger recursively.

Algorithm 21 Batcher’s “Bitonic” Sorting Network

1: A Batcher sorting network of width n consists of two Batcher sorting networks of width n/2 followed by a merging network of width n. (See Figure 5.2.)
2: A Batcher sorting network of width 1 is empty.

Theorem 5.12. A sorting network (Algorithm 21) sorts an arbitrary sequence of n values. It has depth O(log² n).

Proof. Correctness is immediate: at recursive stage k (k = 1, 2, 3, . . . , log n) we merge 2^k sorted sequences into 2^(k−1) sorted sequences. The depth d(n) of the sorting network of level n is the depth of a sorting network of level n/2 plus the depth m(n) of a merging network of width n. The depth of a sorter of level 1 is 0 since the sorter is empty. Since a merging network of width n has the same depth as a bitonic sequence sorter of width n, we know by Lemma 5.10 that m(n) = log n. This gives a recursive formula for d(n) which solves to d(n) = (1/2) log² n + (1/2) log n.


Figure 5.2: A Batcher sorting network.
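The three recursive constructions (Algorithms 19–21) can be unrolled into an explicit comparator list. The following sketch (method names are ours) builds Batcher's network for a power-of-two width; applying the comparators in the order they are generated respects the wiring of the network:

import java.util.*;

public class BatcherNetwork {
    static List<int[]> comps = new ArrayList<>();   // each comparator: {wire i, wire j}, i < j

    // Bitonic sequence sorter (Algorithm 19): half cleaner, then recurse on both halves.
    static void bitonicSorter(int lo, int w) {
        if (w <= 1) return;
        for (int i = 0; i < w / 2; i++) comps.add(new int[]{lo + i, lo + i + w / 2});
        bitonicSorter(lo, w / 2);
        bitonicSorter(lo + w / 2, w / 2);
    }

    // Merging network (Algorithm 20): merger stage, then two bitonic sequence sorters.
    static void merger(int lo, int w) {
        for (int i = 0; i < w / 2; i++) comps.add(new int[]{lo + i, lo + w - 1 - i});
        bitonicSorter(lo, w / 2);
        bitonicSorter(lo + w / 2, w / 2);
    }

    // Batcher sorting network (Algorithm 21): two half-width sorters, then a merger.
    static void sorter(int lo, int w) {
        if (w <= 1) return;
        sorter(lo, w / 2);
        sorter(lo + w / 2, w / 2);
        merger(lo, w);
    }

    public static void main(String[] args) {
        int n = 8;                                   // width must be a power of 2
        sorter(0, n);                                // emits n/2 * log n * (log n + 1)/2 = 24 comparators
        int[] a = {5, 7, 1, 8, 2, 6, 4, 3};
        for (int[] c : comps)                        // apply comparators: min on the upper wire
            if (a[c[0]] > a[c[1]]) { int t = a[c[0]]; a[c[0]] = a[c[1]]; a[c[1]] = t; }
        System.out.println(Arrays.toString(a));      // [1, 2, 3, 4, 5, 6, 7, 8]
    }
}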

Remarks:

• Simulating Batcher’s sorting network on an ordinary sequential computer takes time O(n log² n). As said, there are sequential sorting algorithms that sort in asymptotically optimal time O(n log n). So a natural question is whether there is a sorting network with depth O(log n). Such a network would have some remarkable advantages over sequential asymptotically optimal sorting algorithms such as heapsort. Apart from being highly parallel, it would be completely oblivious, and as such perfectly suited for a fast hardware solution. In 1983, Ajtai, Komlos, and Szemeredi presented a celebrated O(log n)-depth sorting network. (Unlike Batcher’s sorting network, the constant hidden in the big-O of the “AKS” sorting network is too large to be practical, however.)

• It can be shown that Batcher’s sorting network and similarly others can be simulated by a Butterfly network and other hypercubic networks, see next chapter.

• What if a sorting network is asynchronous?!? Clearly, using a synchronizer we can still sort, but it is also possible to use it for something else. Check out the next section!

5.3 Counting Networks

In this section we address distributed counting, a distributed service which can for instance be used for load balancing.

Definition 5.13 (Distributed Counting). A distributed counter is a variable that is common to all processors in a system and that supports an atomic test-and-increment operation. The operation delivers the system’s counter value to the requesting processor and increments it.

Remarks:

• A naive distributed counter stores the system’s counter value with a distinguished central node. When other nodes initiate the test-and-increment operation, they send a request message to the central node and in turn receive a reply message with the current counter value. However, with a large number of nodes operating on the distributed counter, the central processor will become a bottleneck. There will be a congestion of request messages at the central processor; in other words, the system will not scale.

• Is a scalable implementation (without any kind of bottleneck) of such a distributed counter possible, or is distributed counting a problem which is inherently centralized?!?

• Distributed counting could for instance be used to implement a load balancing infrastructure, i.e., by sending the job with counter value i (modulo n) to server i (out of n possible servers).

Definition 5.14 (Balancer). A balancer is an asynchronous flip-flop which forwards messages that arrive on the left side to the wires on the right, the first to the upper, the second to the lower, the third to the upper, and so on.

Algorithm 22 Bitonic Counting Network

1: Take Batcher’s bitonic sorting network of width w and replace all the comparators with balancers.
2: When a node wants to count, it sends a message to an arbitrary input wire.
3: The message is then routed through the network, following the rules of the asynchronous balancers.
4: Each output wire is completed with a “mini-counter.”
5: The mini-counter of wire k replies the value “k + i · w” to the initiator of the ith message it receives.
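A balancer is essentially an atomic toggle. The sketch below (class names are ours) implements Algorithm 22 for the smallest width w = 2, where the bitonic network degenerates to a single balancer feeding two mini-counters; we use a 0-indexed variant of the k + i · w rule, so r requests receive exactly the values 0, . . . , r − 1:

import java.util.concurrent.atomic.AtomicInteger;

public class CountingDemo {
    // A balancer: an atomic toggle forwarding requests alternately up (0) and down (1).
    // (A counter modulo 2 stands in for the flip-flop; fine for a demo.)
    static class Balancer {
        private final AtomicInteger toggle = new AtomicInteger(0);
        int traverse() { return toggle.getAndIncrement() % 2; }  // 0, 1, 0, 1, ...
    }

    // Width-2 bitonic counting network: one balancer, two mini-counters.
    static class Counter {
        private final Balancer b = new Balancer();
        private final AtomicInteger[] mini = {new AtomicInteger(0), new AtomicInteger(0)};
        private final int w = 2;
        int testAndIncrement() {
            int wire = b.traverse();                  // route through the network
            int i = mini[wire].getAndIncrement();     // i-th message on this output wire
            return wire + i * w;                      // mini-counter of wire k returns k + i*w
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Counter c = new Counter();
        Runnable r = () -> { for (int k = 0; k < 1000; k++) c.testAndIncrement(); };
        Thread t1 = new Thread(r), t2 = new Thread(r);
        t1.start(); t2.start(); t1.join(); t2.join();
        // In the quiescent state the output wires have the step property:
        // the 2000 requests received exactly the values 0..1999 (in some order).
        System.out.println(c.testAndIncrement());     // prints 2000
    }
}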

Definition 5.15 (Step Property). A sequence y0, y1, . . . , yw−1 is said to have the step property, if 0 ≤ yi − yj ≤ 1, for any i < j.

Remarks:

• If the output wires have the step property, then with r requests, exactly the values 1, . . . , r will be assigned by the mini-counters. All we need to show is that the counting network has the step property. For that we need some additional facts...

Facts 5.16. For a balancer, we denote the number of consumed messages on the ith input wire with xi, i = 0, 1. Similarly, we denote the number of sent messages on the ith output wire with yi, i = 0, 1. A balancer has these properties:

(1) A balancer does not generate output-messages; that is, x0 + x1 ≥ y0 + y1 in any state.

(2) Every incoming message is eventually forwarded. In other words, if we are in a quiescent state (no message in transit), then x0 + x1 = y0 + y1.

(3) The number of messages sent to the upper output wire is at most one higher than the number of messages sent to the lower output wire: in any state y0 = ⌈(y0 + y1)/2⌉ (thus y1 = ⌊(y0 + y1)/2⌋).

Facts 5.17. If a sequence y0, y1, . . . , yw−1 has the step property,

(1) then all its subsequences have the step property.

(2) then its even and odd subsequences satisfy

    ∑_{i=0}^{w/2−1} y_{2i} = ⌈ (1/2) ∑_{i=0}^{w−1} y_i ⌉   and   ∑_{i=0}^{w/2−1} y_{2i+1} = ⌊ (1/2) ∑_{i=0}^{w−1} y_i ⌋ .

Facts 5.18. If two sequences x0, x1, . . . , xw−1 and y0, y1, . . . , yw−1 have the step property,

(1) and ∑_{i=0}^{w−1} xi = ∑_{i=0}^{w−1} yi, then xi = yi for i = 0, . . . , w − 1.

(2) and ∑_{i=0}^{w−1} xi = ∑_{i=0}^{w−1} yi + 1, then there exists a unique j (j = 0, 1, . . . , w − 1) such that xj = yj + 1, and xi = yi for i = 0, . . . , w − 1, i ≠ j.

Remarks:

• An alternative representation of Batcher’s network has been introduced by Aspnes et al. It is isomorphic to Batcher’s network, and relies on a merger network M[w] which is defined inductively: M[w] consists of two M[w/2] networks (an upper and a lower one) whose output is fed to w/2 balancers. The upper M[w/2] merges the even subsequence x0, x2, . . . , xw−2, while the lower M[w/2] merges the odd subsequence x1, x3, . . . , xw−1. Call the outputs of these two M[w/2] networks z and z′, respectively. The final stage of the network combines z and z′ by sending each pair of wires zi and z′i into a balancer whose outputs yield y2i and y2i+1.

• It is enough to prove that a merger network M[w] preserves the step property.

Lemma 5.19. Let M[w] be a merger network of width w. In a quiescent state (no message in transit), if the inputs x0, x1, . . . , xw/2−1 resp. xw/2, xw/2+1, . . . , xw−1 have the step property, then the output y0, y1, . . . , yw−1 has the step property.

Proof. By induction on the width w.

For w = 2: M[2] is a balancer and a balancer’s output has the step property (Fact 5.16.3).

For w > 2: Let z resp. z′ be the output of the upper respectively lower M[w/2] subnetwork. Since x0, x1, . . . , xw/2−1 and xw/2, xw/2+1, . . . , xw−1 both have the step property by assumption, their even and odd subsequences also have the step property (Fact 5.17.1). By induction hypothesis, the outputs of both M[w/2] subnetworks have the step property. Let Z := ∑_{i=0}^{w/2−1} zi and Z′ := ∑_{i=0}^{w/2−1} z′i. From Fact 5.17.2 we conclude that Z = ⌈ (1/2) ∑_{i=0}^{w/2−1} xi ⌉ + ⌊ (1/2) ∑_{i=w/2}^{w−1} xi ⌋ and Z′ = ⌊ (1/2) ∑_{i=0}^{w/2−1} xi ⌋ + ⌈ (1/2) ∑_{i=w/2}^{w−1} xi ⌉. Since ⌈a⌉ + ⌊b⌋ and ⌊a⌋ + ⌈b⌉ differ by at most 1 we know that Z and Z′ differ by at most 1.

If Z = Z′, Fact 5.18.1 implies that zi = z′i for i = 0, . . . , w/2 − 1. Therefore, the output of M[w] is yi = z⌊i/2⌋ for i = 0, . . . , w − 1. Since z0, . . . , zw/2−1 has the step property, so does the output of M[w] and the lemma follows.

If Z and Z′ differ by 1, Fact 5.18.2 implies that zi = z′i for i = 0, . . . , w/2 − 1, except for a unique j (j = 0, . . . , w/2 − 1) such that zj and z′j differ by exactly 1. Let l := min(zj, z′j). Then, the output yi (with i < 2j) is l + 1. The output yi (with i > 2j + 1) is l. The outputs y2j and y2j+1 are balanced by the final balancer resulting in y2j = l + 1 and y2j+1 = l. Therefore M[w] preserves the step property.

A bitonic counting network is constructed to fulfill Lemma 5.19, i.e., the final output comes from a Merger whose upper and lower inputs are recursively merged. Therefore, the following theorem follows immediately.

Theorem 5.20 (Correctness). In a quiescent state, the w output wires of a bitonic counting network of width w have the step property.

Remarks:

• Is every sorting network also a counting network? No. But surprisingly, the other direction is true!

Theorem 5.21 (Counting vs. Sorting). If a network is a counting network then it is also a sorting network, but not vice versa.

Proof. There are sorting networks that are not counting networks (e.g., odd/even sort, or insertion sort). For the other direction, let C be a counting network and I(C) be the isomorphic network, where every balancer is replaced by a comparator. Let I(C) have an arbitrary input of 0’s and 1’s; that is, some of the input wires have a 0, all others have a 1. There is a message at C’s ith input wire if and only if I(C)’s ith input wire is 0. Since C is a counting network, all messages are routed to the upper output wires. I(C) is isomorphic to C, therefore a comparator in I(C) will receive a 0 on its upper (lower) wire if and only if the corresponding balancer receives a message on its upper (lower) wire. Using an inductive argument, the 0’s and 1’s will be routed through I(C) such that all 0’s exit the network on the upper wires whereas all 1’s exit the network on the lower wires. Applying Lemma 5.3 shows that I(C) is a sorting network.

Remarks:

• We claimed that the counting network is correct. However, it is only correct in a quiescent state.

Definition 5.22 (Linearizable). A system is linearizable if the order of the values assigned reflects the real-time order in which they were requested. More formally, if there is a pair of operations o1, o2, where operation o1 terminates before operation o2 starts, and the logical order is “o2 before o1”, then a distributed system is not linearizable.

Lemma 5.23 (Linearizability). The bitonic counting network is not linearizable.

Proof. Consider the bitonic counting network with width 4 in Figure 5.3: Assume that two inc operations were initiated and the corresponding messages entered the network on wire 0 and 2 (both in light gray color). After having passed the second resp. the first balancer, these traversing messages “fall asleep”; in other words, both messages take unusually long before they are received by the next balancer. Since we are in an asynchronous setting, this may be the case.


Figure 5.3: Linearizability Counter Example.

In the meantime, another inc operation (medium gray) is initiated and enters the network on the bottom wire. The message leaves the network on wire 2, and the inc operation is completed. Strictly afterwards, another inc operation (dark gray) is initiated and enters the network on wire 1. After having passed all balancers, the message will leave the network on wire 0. Finally (and not depicted in Figure 5.3), the two light gray messages reach the next balancer and will eventually leave the network on wires 1 resp. 3. Because the dark gray and the medium gray operations conflict with Definition 5.22, the bitonic counting network is not linearizable.

Remarks:

• Note that the example in Figure 5.3 behaves correctly in the quiescent state: Finally, exactly the values 0, 1, 2, 3 are allotted.

• It was shown that linearizability comes at a high price (the depth grows linearly with the width).

Chapter 6

Shared Memory

6.1 Introduction

In distributed computing, various models exist. So far, the focus of the course was on loosely-coupled distributed systems such as the Internet, where nodes asynchronously communicate by exchanging messages. The “opposite” model is a tightly-coupled parallel computer where nodes access a common memory totally synchronously—in distributed computing such a system is called a Parallel Random Access Machine (PRAM).

A third major model is somehow between these two extremes, the shared memory model. In a shared memory system, asynchronous processes (or processors) communicate via a common memory area of shared variables or registers:

Definition 6.1 (Shared Memory). A shared memory system is a system that consists of asynchronous processes that access a common (shared) memory. A process can atomically access a register in the shared memory through a set of predefined operations. An atomic modification appears to the rest of the system instantaneously. Apart from this shared memory, processes can also have some local (private) memory.

Remarks:

• Various shared memory systems exist. A main difference is how they allow processes to access the shared memory. All systems can atomically read or write a shared register R. Most systems do allow for advanced atomic read-modify-write (RMW) operations, for example (a sketch of how one of these primitives can emulate another appears at the end of these remarks):

  – test-and-set(R): t := R; R := 1; return t

  – fetch-and-add(R, x): t := R; R := R + x; return t

  – compare-and-swap(R, x, y): if R = x then R := y; return true; else return false; endif

  – load-link(R)/store-conditional(R, x): Load-link returns the current value of the specified register R. A subsequent store-conditional to the same register will store a new value x (and return true) only if no updates have occurred to that register since the load-link. If any updates have occurred, the store-conditional is guaranteed to fail (and return false), even if the value read by the load-link has since been restored.

• The power of RMW operations can be measured with the so-called consensus-number: The consensus-number k of a RMW operation defines whether one can solve consensus for k processes. Test-and-set for instance has consensus-number 2 (one can solve consensus with 2 processes, but not 3), whereas the consensus-number of compare-and-swap is infinite. It can be shown that the power of a shared memory system is determined by the consensus-number (“universality of consensus”.) This insight has a remarkable theoretical and practical impact. In practice for instance, after this was known, hardware designers stopped developing shared memory systems supporting weak RMW operations.

• Many of the results derived in the message passing model have an equivalent in the shared memory model. Consensus for instance is traditionally studied in the shared memory model.

• Whereas programming a message passing system is rather tricky (in particular if fault-tolerance has to be integrated), programming a shared memory system is generally considered easier, as programmers are given access to global variables directly and do not need to worry about exchanging messages correctly. Because of this, even distributed systems which physically communicate by exchanging messages can often be programmed through a shared memory middleware, making the programmer’s life easier.

• We will most likely find the general spirit of shared memory systems in upcoming multi-core architectures. As for programming style, the multi-core community seems to favor an accelerated version of shared memory, transactional memory.

• From a message passing perspective, the shared memory model is like a bipartite graph: On one side you have the processes (the nodes) which pretty much behave like nodes in the message passing model (asynchronous, maybe failures). On the other side you have the shared registers, which just work perfectly (no failures, no delay).
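As promised above, here is a sketch of how one RMW primitive can emulate another: fetch-and-add built from compare-and-swap with the standard retry loop (in Java, AtomicInteger.compareAndSet is exactly compare-and-swap):

import java.util.concurrent.atomic.AtomicInteger;

public class FetchAndAdd {
    // fetch-and-add(R, x) built from compare-and-swap: retry until no concurrent update.
    static int fetchAndAdd(AtomicInteger r, int x) {
        while (true) {
            int t = r.get();                    // t := R
            if (r.compareAndSet(t, t + x))      // if R = t then R := t + x
                return t;                       // return the old value
        }
    }

    public static void main(String[] args) {
        AtomicInteger r = new AtomicInteger(10);
        System.out.println(fetchAndAdd(r, 5));  // 10
        System.out.println(r.get());            // 15
    }
}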

6.2 Mutual Exclusion

A classic problem in shared memory systems is mutual exclusion. We are given a number of processes which occasionally need to access the same resource. The resource may be a shared variable, or a more general object such as a data structure or a shared printer. The catch is that only one process at a time is allowed to access the resource. More formally:

Definition 6.2 (Mutual Exclusion). We are given a number of processes, each repeatedly executing the following code sections: entry section, critical section, exit section, and remaining code. A mutual exclusion algorithm consists of code for the entry and exit sections, such that the following holds:

• Mutual Exclusion: At all times at most one process is in the critical section.

• No deadlock: If some process manages to get to the entry section, later some (possibly different) process will get to the critical section.

Sometimes we in addition ask for

• No lockout: If some process manages to get to the entry section, later the same process will get to the critical section.

• Unobstructed exit: No process can get stuck in the exit section.

Using RMW primitives one can build mutual exclusion algorithms quite easily. Algorithm 23 shows an example with the test-and-set primitive.

Algorithm 23 Mutual Exclusion: Test-and-Set

Input: Shared register R := 0
1: repeat
2:   r := test-and-set(R)
3: until r = 0
4: ...            // critical section
5: R := 0
6: ...            // remaining code

Theorem 6.3. Algorithm 23 solves the mutual exclusion problem as in Definition 6.2.

Proof. Mutual exclusion follows directly from the test-and-set definition: Initially R is 0. Let pi be the ith process to successfully execute the test-and-set, where successfully means that the result of the test-and-set is 0. This happens at time ti. At time t′i process pi resets the shared register R to 0. Between ti and t′i no other process can successfully test-and-set, hence no other process can enter the critical section concurrently.

Proving no deadlock works similarly: One of the processes loitering in the entry section will successfully test-and-set as soon as the process in the critical section has exited.

Since the exit section only consists of a single instruction (no potential infinite loops) we have unobstructed exit.
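Algorithm 23 maps directly onto Java's AtomicBoolean, whose getAndSet is a test-and-set; a minimal sketch:

import java.util.concurrent.atomic.AtomicBoolean;

public class TASLock {
    private final AtomicBoolean r = new AtomicBoolean(false); // shared register R := 0

    public void lock() {
        // entry section: spin until our test-and-set reads 0 (false)
        while (r.getAndSet(true)) { /* busy-wait */ }
    }

    public void unlock() {
        r.set(false);                  // exit section: R := 0
    }

    public static void main(String[] args) throws InterruptedException {
        TASLock lock = new TASLock();
        int[] counter = {0};
        Runnable task = () -> {
            for (int i = 0; i < 100_000; i++) {
                lock.lock();
                counter[0]++;          // critical section
                lock.unlock();
            }
        };
        Thread a = new Thread(task), b = new Thread(task);
        a.start(); b.start(); a.join(); b.join();
        System.out.println(counter[0]); // 200000: mutual exclusion held
    }
}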

Remarks:

• No lockout, on the other hand, is not given by this algorithm. Even with only two processes there are asynchronous executions where always the same process wins the test-and-set.

• Algorithm 23 can be adapted to guarantee fairness (no lockout), essentially by ordering the processes in the entry section in a queue.

• A natural question is whether one can achieve mutual exclusion with only reads and writes, that is without advanced RMW operations. The answer is yes!

Our read/write mutual exclusion algorithm is for two processes p0 and p1 only. In the remarks we discuss how it can be extended. The general idea is that process pi has to mark its desire to enter the critical section in a “want” register Wi by setting Wi := 1. Only if the other process is not interested (W1−i = 0) access is granted. This however is too simple since we may run into a deadlock. This deadlock (and at the same time also lockout) is resolved by adding a priority variable Π. See Algorithm 24.

Algorithm 24 Mutual Exclusion: Peterson’s Algorithm

Initialization: Shared registers W0, W1, Π, all initially 0.

Code for process pi, i ∈ {0, 1}:
1: Wi := 1
2: Π := 1 − i
3: repeat until Π = i or W1−i = 0
4: ...            // critical section
5: Wi := 0
6: ...            // remaining code

Remarks:

• Note that line 3 in Algorithm 24 represents a “spinlock” or “busy-wait”, similarly to lines 1–3 in Algorithm 23.

Theorem 6.4. Algorithm 24 solves the mutual exclusion problem as in Definition 6.2.

Proof. The shared variable Π elegantly grants priority to the process that passes line 2 first. If both processes are competing, only process pΠ can access the critical section because of Π. The other process p1−Π cannot access the critical section because WΠ = 1 (and Π ≠ 1 − Π). The only other reason to access the critical section is because the other process is in the remainder code (that is, not interested). This proves mutual exclusion!

No deadlock comes directly with Π: Process pΠ gets direct access to the critical section, no matter what the other process does.

Since the exit section only consists of a single instruction (no potential infinite loops) we have unobstructed exit.

Thanks to the shared variable Π also no lockout (fairness) is achieved: If a process pi loses against its competitor p1−i in line 2, it will have to wait until the competitor resets W1−i := 0 in the exit section. If process pi is unlucky it will not check W1−i = 0 early enough before process p1−i sets W1−i := 1 again in line 1. However, as soon as p1−i hits line 2, process pi gets the priority due to Π, and can enter the critical section.
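For concreteness, a two-process Java sketch of Algorithm 24 (our class; AtomicBoolean and a volatile int stand in for the atomic shared registers, since with plain Java fields the memory model would allow reorderings that break the algorithm):

import java.util.concurrent.atomic.AtomicBoolean;

public class PetersonLock {
    // Shared registers W_0, W_1 and the priority variable Pi of Algorithm 24.
    private final AtomicBoolean[] want =
        { new AtomicBoolean(false), new AtomicBoolean(false) };
    private volatile int turn;

    public void lock(int i) {                          // process p_i, i in {0, 1}
        want[i].set(true);                             // W_i := 1
        turn = 1 - i;                                  // Pi := 1 - i
        while (turn == 1 - i && want[1 - i].get()) { } // spin until Pi = i or W_{1-i} = 0
    }

    public void unlock(int i) {
        want[i].set(false);                            // W_i := 0
    }
}

Process 0 brackets its critical section with lock(0)/unlock(0), process 1 with lock(1)/unlock(1).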

Remarks:

• Extending Peterson’s Algorithm to more than 2 processes can be done by a tournament tree, like in tennis. With n processes every process needs to win log n matches before it can enter the critical section. More precisely, each process starts at the bottom level of a binary tree, and proceeds to the parent level if it wins. Once it wins at the root of the tree it can enter the critical section. Thanks to the priority variables Π at each node of the binary tree, we inherit all the properties of Definition 6.2.

6.3 Store & Collect

6.3.1 Problem Definition

In this section, we will look at a second shared memory problem that has an elegant solution. Informally, the problem can be stated as follows. There are n processes p1, . . . , pn. Every process pi has a read/write register Ri in the shared memory where it can store some information that is destined for the other processes. Further, there is an operation by which a process can collect (i.e., read) the values of all the processes that stored some value in their register.

We say that an operation op1 precedes an operation op2 iff op1 terminates before op2 starts. An operation op2 follows an operation op1 iff op1 precedes op2.

Definition 6.5 (Collect). There are two operations: A store(val) by process pi sets val to be the latest value of its register Ri. A collect operation returns a view, a partial function V from the set of processes to a set of values, where V(pi) is the latest value stored by pi, for each process pi. For a collect operation cop, the following validity properties must hold for every process pi:

• If V (pi) = ⊥, then no store operation by pi precedes cop.

• If V(pi) = v ≠ ⊥, then v is the value of a store operation sop of pi that does not follow cop, and there is no store operation by pi that follows sop and precedes cop.

Hence, a collect operation cop should not read from the future or miss a preceding store operation sop.

We assume that the read/write register Ri of every process pi is initialized to ⊥. We define the step complexity of an operation op to be the number of accesses to registers in the shared memory. There is a trivial solution to the collect problem as shown by Algorithm 25.

Algorithm 25 Collect: Simple (Non-Adaptive) Solution

Operation store(val) (by process pi):
1: Ri := val

Operation collect:
2: for i := 1 to n do
3:   V(pi) := Ri   // read register Ri
4: end for

Remarks:

• Algorithm 25 clearly works. The step complexity of every store operation is 1, the step complexity of a collect operation is n.

• At first sight, the step complexities of Algorithm 25 seem optimal. Because there are n processes, there clearly are cases in which a collect operation needs to read all n registers. However, there are also scenarios in which the step complexity of the collect operation seems very costly. Assume that there are only two processes pi and pj that have stored a value in their registers Ri and Rj. In this case, a collect in principle only needs to read the registers Ri and Rj and can ignore all the other registers.

• Assume that up to a certain time t, k ≤ n processes have finished or started at least one operation. We call an operation op at time t adaptive to contention if the step complexity of op only depends on k and is independent of n.

• In the following, we will see how to implement adaptive versions of store and collect.

6.3.2 Splitters

Algorithm 26 Splitter Code

Shared Registers: X : {⊥} ∪ {1, . . . , n}; Y : boolean
Initialization: X := ⊥; Y := false

Splitter access by process pi:
1: X := i
2: if Y then
3:   return right
4: else
5:   Y := true
6:   if X = i then
7:     return stop
8:   else
9:     return left
10:  end if
11: end if

To obtain adaptive collect algorithms, we need a synchronization primitive, called a splitter.

Definition 6.6 (Splitter). A splitter is a synchronization primitive with the following characteristic. A process entering a splitter exits with either stop, left, or right. If k processes enter a splitter, at most one process exits with stop and at most k − 1 processes exit with left and right, respectively.

Figure 6.1: A splitter (k processes enter; at most 1 obtains stop, at most k − 1 obtain left, and at most k − 1 obtain right).

Hence, it is guaranteed that if a single process enters the splitter, then it obtains stop, and if two or more processes enter the splitter, then there is at most one process obtaining stop and there are two processes that obtain different values (i.e., either there is exactly one stop or there is at least one left and at least one right). For an illustration, see Figure 6.1. The code implementing a splitter is given by Algorithm 26.

Lemma 6.7. Algorithm 26 correctly implements a splitter.

Proof. Assume that k processes enter the splitter. Because the first process that checks whether Y = true in line 2 will find that Y = false, not all processes return right. Next, assume that i is the last process that sets X := i. If i does not return right, it will find X = i in line 6 and therefore return stop. Hence, there is always a process that does not return left.

It remains to show that at most 1 process returns stop. For the sake of contradiction, assume pi and pj are two processes that return stop and assume that pi sets X := i before pj sets X := j. Both processes need to check whether Y is true before one of them sets Y := true. Hence, they both complete the assignment in line 1 before the first one of them checks the value of X in line 6. Hence, by the time pi arrives at line 6, X ≠ i (pj and maybe some other processes have overwritten X by then). Therefore, pi does not return stop and we get a contradiction to the assumption that both pi and pj return stop.
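Algorithm 26 translates almost line by line into Java; in this sketch (class name ours) volatile fields play the role of the atomic shared registers X and Y:

public class Splitter {
    public enum Result { STOP, LEFT, RIGHT }

    private volatile int x = -1;          // X; -1 plays the role of "bottom"
    private volatile boolean y = false;   // Y

    public Result enter(int i) {          // splitter access by process p_i
        x = i;                            // line 1: X := i
        if (y) return Result.RIGHT;       // lines 2-3: someone already passed
        y = true;                         // line 5
        if (x == i) return Result.STOP;   // lines 6-7: nobody overwrote X
        return Result.LEFT;               // line 9
    }
}

If k threads call enter concurrently, at most one obtains STOP and at most k − 1 obtain LEFT resp. RIGHT, exactly as in Definition 6.6.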

6.3.3 Binary Splitter Tree

Assume that we are given 2^n − 1 splitters and that for every splitter S, there is an additional shared variable ZS : {⊥} ∪ {1, . . . , n} that is initialized to ⊥ and an additional shared variable MS : boolean that is initialized to false. We call a splitter S marked if MS = true. The 2^n − 1 splitters are arranged in a complete binary tree of height n − 1. Let S(v) be the splitter associated with a node v of the binary tree. The store and collect operations are given by Algorithm 27.

Algorithm 27 Adaptive Collect: Binary Tree Algorithm

Operation store(val) (by process pi):
1: Ri := val
2: if first store operation by pi then
3:   v := root node of binary tree
4:   α := result of entering splitter S(v)
5:   MS(v) := true
6:   while α ≠ stop do
7:     if α = left then
8:       v := left child of v
9:     else
10:      v := right child of v
11:    end if
12:    α := result of entering splitter S(v)
13:    MS(v) := true
14:  end while
15:  ZS(v) := i
16: end if

Operation collect:
Traverse marked part of binary tree:
17: for all marked splitters S do
18:   if ZS ≠ ⊥ then
19:     i := ZS; V(pi) := Ri   // read value of process pi
20:   end if
21: end for   // V(pi) = ⊥ for all other processes

Theorem 6.8. Algorithm 27 correctly implements store and collect. Let k be the number of participating processes. The step complexity of the first store of a process pi is O(k), the step complexity of every additional store of pi is O(1), and the step complexity of collect is O(k).

Proof. Because at most one process can stop at a splitter, it is sufficient to show that every process stops at some splitter at depth at most k − 1 ≤ n − 1 when invoking the first store operation to prove correctness. We prove that at most k − i processes enter a subtree at depth i (i.e., a subtree whose root has distance i to the root of the whole tree). By definition of k, the number of processes entering the splitter at depth 0 (i.e., at the root of the binary tree) is k. For i > 1, the claim follows by induction because of the at most k − i processes entering the splitter at the root of a depth i subtree, at most k − i − 1 obtain left and right, respectively. Hence, at the latest when reaching depth k − 1, a process is the only process entering a splitter and thus obtains stop. It thus also follows that the step complexity of the first invocation of store is O(k).

To show that the step complexity of collect is O(k), we first observe that the marked nodes of the binary tree are connected, and therefore can be traversed by only reading the variables MS associated to them and their neighbors. Hence, showing that at most 2k − 1 nodes of the binary tree are marked is sufficient. Let xk be the maximum number of marked nodes in a tree, where k processes access the root. We claim that xk ≤ 2k − 1, which is true for k = 1 because a single process entering a splitter will always compute stop. Now assume the inequality holds for 1, . . . , k − 1. Not all k processes may exit the splitter with left (or right), i.e., kl ≤ k − 1 processes will turn left and kr ≤ min{k − kl, k − 1} turn right. The left and right children of the root are the roots of their subtrees, hence the induction hypothesis yields

xk = xkl + xkr + 1 ≤ (2kl − 1) + (2kr − 1) + 1 ≤ 2k − 1,

concluding induction and proof.


Figure 6.2: 5 × 5 Splitter Matrix

Remarks:

• The step complexities of Algorithm 27 are very good. Clearly, the step complexity of the collect operation is asymptotically optimal. In order for the algorithm to work, we however need to allocate the memory for the complete binary tree of depth n−1. The space complexity of Algorithm 27 therefore is exponential in n. We will next see how to obtain a polynomial space complexity at the cost of a worse collect step complexity.

6.3.4 Splitter Matrix

Instead of arranging splitters in a binary tree, we arrange n² splitters in an n × n matrix as shown in Figure 6.2. The algorithm is analogous to Algorithm 27. The matrix is entered at the top left. If a process receives left, it next visits the splitter in the next row of the same column. If a process receives right, it next visits the splitter in the next column of the same row. Clearly, the space complexity of this algorithm is O(n²). The following theorem gives bounds on the step complexities of store and collect.

Theorem 6.9. Let k be the number of participating processes. The step complexity of the first store of a process pi is O(k), the step complexity of every additional store of pi is O(1), and the step complexity of collect is O(k²).

Proof. Let the top row be row 0 and the left-most column be column 0. Let xi be the number of processes entering a splitter in row i. By induction on i, we show that xi ≤ k − i. Clearly, x0 ≤ k. Let us therefore consider the case i > 0. Let j be the largest column such that at least one process visits the splitter in row i − 1 and column j. By the properties of splitters, not all processes entering the splitter in row i − 1 and column j obtain left. Therefore, not all processes entering a splitter in row i − 1 move on to row i. Because at least one process

stays in every row, we get that xi ≤ k − i. Similarly, the number of processes entering column j is at most k − j. Hence, every process stops at the latest in row k − 1 and column k − 1 and the number of marked splitters is at most k². Thus, the step complexity of collect is at most O(k²). Because the longest path in the splitter matrix is 2k, the step complexity of store is O(k).

Remarks: • With a slightly more complicated argument, it is possible to show that the number of processes entering the splitter in row i and column j is at most k − i − j. Hence, it suffices to only allocate the upper left half (including the diagonal) of the n × n matrix of splitters.

• The binary tree algorithm can be made space efficient by using a randomized version of a splitter. Whenever returning left or right, a randomized splitter returns left or right with probability 1/2. With high probability, it then suffices to allocate a binary tree of depth O(log n).

• Recently, it has been shown that with a considerably more complicated deterministic algorithm, it is possible to achieve O(k) step complexity and O(n²) space complexity.

Chapter 7

Shared Objects

7.1 Introduction

Assume that there is a common resource (e.g., a common variable or data structure), which different nodes in a network need to access from time to time. If the nodes are allowed to change the common object when accessing it, we need to guarantee that no two nodes have access to the object at the same time. In order to achieve this mutual exclusion, we need protocols that allow the nodes of a network to store and manage access to such a shared object. A simple and obvious solution is to store the shared object at a central location (see Algorithm 28).

Algorithm 28 Shared Object: Centralized Solution

Initialization: Shared object stored at root node r of a spanning tree of the network graph (i.e., each node knows its parent in the spanning tree).

Accessing Object: (by node v)
1: v sends request up the tree
2: request processed by root r (atomically)
3: result sent down the tree to node v

Remarks:

• Instead of a spanning tree, one can use routing.

• Algorithm 28 works, but it is not very efficient. Assume that the object is accessed by a single node v repeatedly. Then we get a high message/time complexity. Instead v could store the object, or at least cache it. But then, in case another node w accesses the object, we might run into consistency problems.

• Alternative idea: The accessing node should become the new master of the object. The shared object then becomes mobile. There exist several variants of this idea. The simplest version is a home-based solution like in Mobile IP (see Algorithm 29).

55 56 CHAPTER 7. SHARED OBJECTS

Algorithm 29 Shared Object: Home-Based Solution Initialization: An object has a home base (a node) that is known to every node. All requests (accesses to the shared object) are routed through the home base. Accessing Object: (by node v) 1: v acquires a lock at the home base, receives object.

Remarks: • Home-based solutions suffer from the triangular routing problem. If two close-by nodes access the object on a rotating basis, all the traffic is routed through the potentially far away home-base.

7.2 Arrow and Friends

We will now look at a protocol (called the Arrow algorithm) that always moves the shared object to the node currently accessing it without creating the triangular routing problem of home-based solutions. The protocol runs on a precomputed spanning tree. Assume that the spanning tree is rooted at the current position of the shared object. When a node u wants to access the shared object, it sends out a find request towards the current position of the object. While searching for the object, the edges of the spanning tree are redirected such that in the end, the spanning tree is rooted at u (i.e., the new holder of the object). The details of the algorithm are given by Algorithm 30. For simplicity, we assume that a node u only starts a find request if u is not currently the holder of the shared object and if u has finished all previous find requests (i.e., it is not currently waiting to receive the object). Remarks:

• The parent pointers in Algorithm 30 are only needed for the find operation. Sending the variable to u in line 13 or to w.successor in line 23 is done using routing (on the spanning tree or on the underlying network).

• When we draw the parent pointers as arrows, in a quiescent moment (where no “find” is in motion), the arrows all point towards the node currently holding the variable (i.e., the tree is rooted at the node holding the variable).

• What is really great about the Arrow algorithm is that it works in a completely asynchronous and concurrent setting (i.e., there can be many find requests at the same time).

Theorem 7.1 (Arrow, Analysis). In an asynchronous and concurrent setting, a “find” operation terminates with message and time complexity D, where D is the diameter of the spanning tree.

Algorithm 30 Shared Object: Arrow Algorithm

Initialization: As for Algorithm 28, we are given a rooted spanning tree. Each node has a pointer to its parent; the root r is its own parent. The variable is initially stored at r. For all nodes v, v.successor := null, v.wait := false.

Start Find Request at Node u:
1: do atomically
2:   u sends “find by u” message to parent node
3:   u.parent := u
4:   u.wait := true
5: end do

Upon w Receiving “Find by u” Message from Node v:
6: do atomically
7:   if w.parent ≠ w then
8:     w sends “find by u” message to parent
9:     w.parent := v
10:  else
11:    w.parent := v
12:    if not w.wait then
13:      send variable to u   // w holds var. but does not need it any more
14:    else
15:      w.successor := u   // w will send variable to u a.s.a.p.
16:    end if
17:  end if
18: end do

Upon w Receiving Shared Object:
19: perform operation on shared object
20: do atomically
21:   w.wait := false
22:   if w.successor ≠ null then
23:     send variable to w.successor
24:     w.successor := null
25:   end if
26: end do

Before proving Theorem 7.1, we prove the following lemma.

Lemma 7.2. An edge {u, v} of the spanning tree is in one of four states:

1.) Pointer from u to v (no message on the edge, no pointer from v to u)
2.) Message on the move from u to v (no pointer along the edge)
3.) Pointer from v to u (no message on the edge, no pointer from u to v)
4.) Message on the move from v to u (no pointer along the edge)

Proof. W.l.o.g., assume that initially the edge {u, v} is in state 1. With a message arrival at u (or if u starts a “find by u” request), the edge goes to state 2. When the message is received at v, v directs its pointer to u and we are therefore in state 3. A new message at v (or a new request initiated by v) then brings the edge back to state 1.

Proof of Theorem 7.1. Since the “find” message will only travel on a static tree, it suffices to show that it will not traverse an edge twice. Suppose for the sake of contradiction that there is a first “find” message f that traverses an edge e = {u, v} for the second time and assume that e is the first edge that is traversed twice by f. Assume that e is first traversed by f from u to v. Since we are on a tree, the second time, e must be traversed from v to u. Because e is the first edge to be traversed twice, f must re-visit e before visiting any other edges. Right before f reaches v, the edge e is in state 2 (f is on the move) and in state 3 (it will immediately return with the pointer from v to u). This is a contradiction to Lemma 7.2.
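When find operations are executed one after another, the pointer inversion of Algorithm 30 reduces to reversing a path of parent pointers. A minimal sequential sketch (names and the demo path topology are ours; the real algorithm does this with messages, concurrently):

import java.util.Arrays;

public class Arrow {
    private final int[] parent;   // parent pointers; the holder points to itself
    private int holder;           // node currently holding the object

    Arrow(int n, int root) {
        parent = new int[n];
        // demo spanning tree: a simple path 0-1-2-...-(n-1), rooted at 'root'
        for (int v = 0; v < n; v++) parent[v] = (v < root) ? v + 1 : (v > root ? v - 1 : v);
        holder = root;
    }

    // A sequential "find by u": follow the arrows to the holder,
    // inverting every pointer on the way so the tree ends up rooted at u.
    void find(int u) {
        int cur = u, prev = u;
        while (parent[cur] != cur) {       // walk towards the current root
            int next = parent[cur];
            parent[cur] = prev;            // invert the arrow
            prev = cur;
            cur = next;
        }
        parent[cur] = prev;                // old holder now points towards u
        parent[u] = u;                     // u is the new root/holder
        holder = u;
    }

    public static void main(String[] args) {
        Arrow a = new Arrow(5, 0);         // path 0-1-2-3-4, object at node 0
        a.find(4);                         // message complexity = distance(4, 0)
        System.out.println(Arrays.toString(a.parent)); // all arrows point towards 4
    }
}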

Remarks:

• Finding a good tree is an interesting problem. We would like to have a tree with low stretch, low diameter, low degree, etc.

• It seems that the Arrow algorithm works especially well when lots of “find” operations are initiated concurrently. Most of them will find a “close-by” node, thus having low message/time complexity. For the sake of simplicity we analyze a synchronous system.

Theorem 7.3 (Arrow, Concurrent Analysis). Let the system be synchronous. Initially, the system is in a quiescent state. At time 0, a set S of nodes initiates a “find” operation. The message complexity of all “find” operations is O(log |S| · m∗) where m∗ is the message complexity of an optimal (with global knowledge) algorithm on the tree.

Proof Sketch. Let d be the minimum distance of any node in S to the root. There will be a node u1 at distance d from the root that reaches the root in d time steps, turning all the arrows on the path to the root towards u1. A node u2 that finds (is queued behind) u1 cannot distinguish the system from a system where there was no request u1, and instead the root was initially located at u1. The message cost of u2 is consequently the distance between u1 and u2 on the spanning tree. By induction the total message complexity is exactly as if a collector starts at the root and then “greedily” collects tokens located at the nodes in S (greedily in the sense that the collector always goes towards the closest token). Greedily collecting the tokens is not a good strategy in general because it will traverse the same edge more than twice in the worst case. An asymptotically optimal algorithm can also be translated into a depth-first-search collecting paradigm, traversing each edge at most twice. In another area of computer science, we would call the Arrow algorithm a nearest-neighbor TSP heuristic (without returning to the start/root though), and the optimal algorithm TSP-optimal. It was shown that nearest-neighbor has a logarithmic overhead, which concludes the proof.

Remarks:
• An average request set S on a not-too-bad tree usually gives a much better bound. However, there is an almost tight log |S|/ log log |S| worst-case example.
• It was recently shown that Arrow can do as well in a dynamic setting (where nodes are allowed to initiate requests at any time). In particular, the message complexity of the dynamic analysis can be shown to have a log D overhead only, where D is the diameter of the spanning tree (note that for logarithmic trees, the overhead becomes log log n).
• What if the spanning tree is a star? Then with Theorem 7.1, each find will terminate in 2 steps! Since also an optimal algorithm has message cost 1, the algorithm is 2-competitive. . . ? Yes, but because of its high degree the star center experiences contention. . . It can be shown that the contention overhead is at most proportional to the largest degree ∆ of the spanning tree.
• Thought experiment: Assume a balanced binary spanning tree. By Theorem 7.1, the message complexity per operation is log n. Because a binary tree has maximum degree 3, the time per operation therefore is at most 3 log n.
• There are better and worse choices for the spanning tree. The stretch of an edge {u, v} is defined as the distance between u and v in the spanning tree. The maximum stretch of a spanning tree is the maximum stretch over all its edges. A few years ago, it was shown how to construct spanning trees that are O(log n)-stretch-competitive.

What if most nodes just want to read the shared object? Then it does not make sense to acquire a lock every time. Instead we can use caching (see Algorithm 31).

Theorem 7.4. Algorithm 31 is correct. More surprisingly, the message complexity is 3-competitive (at most a factor 3 worse than the optimum).

Proof. Since the accesses do not overlap by definition, it suffices to show that between two writes, we are 3-competitive. The sequence of accessing nodes is w0, r1, r2, . . . , rk, w1. After w0, the object is stored at w0 and not cached anywhere else. All reads together cost at most twice the cost of the smallest subtree T spanning the write w0 and all the reads, since each read only goes to the first copy. The write w1 costs T plus the path P from w1 to T. Since any data management scheme must use every edge in T and P at least once, and our algorithm uses edges in T at most 3 times (and edges in P at most once), the theorem follows.

Algorithm 31 Shared Object: Read/Write Caching
• Nodes can either read or write the shared object. For simplicity we first assume that reads or writes do not overlap in time (access to the object is sequential).
• Nodes store three items: a parent pointer pointing to one of the neighbors (as with Arrow), a cache bit for each edge, plus (potentially) a copy of the object.
• Initially the object is stored at a single node u; all the parent pointers point towards u, all the cache bits are false.
• When initiating a read, a message follows the arrows (this time: without inverting them!) until it reaches a cached version of the object. Then a copy of the object is cached along the path back to the initiating node, and the cache bits on the visited edges are set to true.
• A write at u writes the new value locally (at node u), then searches (following the parent pointers and reversing them towards u) for the first node with a copy. Delete the copy and follow (in parallel, by flooding) all edges that have the cache flag set. Point the parent pointers towards u, and remove the cache flags.

Remarks:
• Concurrent reads are not a problem; multiple concurrent reads combined with a single write also work just fine.

• What about concurrent writes? To achieve consistency, writes need to invalidate the caches before writing their value. It is claimed that the strategy then becomes 4-competitive.
• Is the algorithm also time-competitive? Well, not really: The optimal algorithm that we compare to is usually offline. This means it knows the whole access sequence in advance. It can then cache the object before the request even appears!
• Algorithms on trees are often simpler, but have the disadvantage that they introduce an extra stretch factor. In a ring, for example, any tree has stretch n − 1; so there is always a bad request pattern.

Algorithm 32 Shared Object: Pointer Forwarding
Initialization: Object is stored at root r of a precomputed spanning tree T (as in the Arrow algorithm, each node has a parent pointer pointing towards the object).
Accessing Object: (by node u)
1: follow parent pointers to current root r of T
2: send object from r to u
3: r.parent := u; u.parent := u; // u is the new root

Algorithm 33 Shared Object: Ivy
Initialization: Object is stored at root r of a precomputed spanning tree T (as before, each node has a parent pointer pointing towards the object). For simplicity, we assume that accesses to the object are sequential.
Start Find Request at Node u:
1: u sends “find by u” message to parent node
2: u.parent := u
Upon v receiving “Find by u” Message:
3: if v.parent = v then
4: send object to u
5: else
6: send “find by u” message to v.parent
7: end if
8: v.parent := u // u will become the new root

7.3 Ivy and Friends

In the following we study algorithms that do not restrict communication to a tree. Of particular interest is the special case of a complete graph (clique). A simple solution for this case is given by Algorithm 32.

Remarks:

• If the graph is not complete, routing can be used to find the root.

• Assume that the nodes line up in a linked list. If we always choose the first node of the linked list to acquire the object, we have message/time complexity n. The new topology is again a linear linked list. Pointer forwarding is therefore bad in the worst case.

• If edges are not FIFO, it can even happen that the number of steps is unbounded for a node having bad luck. An algorithm with this property is called “not fair” or “not wait-free.” (Example: Initially we have the list 4 → 3 → 2 → 1; node 4 starts a find; when the message of 4 passes 3, node 3 itself starts a find. The message of 3 may arrive at 2 and then 1 earlier, thus the new end of the list is 2 → 1 → 3; once the message of 4 passes 2, the game restarts.)

There seems to be a natural improvement of the pointer forwarding idea. Instead of simply redirecting the parent pointer from the old root to the new root, we can redirect all the parent pointers of the nodes on the path visited

during a find message to the new root. The details are given by Algorithm 33. Figure 7.1 shows how the pointer redirecting affects a given tree (the right tree results from a find request started at node x0 on the left tree).

Figure 7.1: Reversal of the path x0, x1, x2, x3, x4, x5.

Remarks:
• Also with Algorithm 33, we might have a bad linked list situation. However, if the start of the list acquires the object, the linked list turns into a star. As the following theorem shows, the search paths are not long on average. Since paths can sometimes be bad, we will need an amortized analysis.
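Before the analysis, here is a minimal Python sketch of Ivy's path reversal under sequential accesses (the array representation and the function name are illustrative assumptions, not part of the lecture notes):

# Minimal sequential simulation of Ivy's path reversal (Algorithm 33).
# parent[v] == v marks the current root, i.e., the node holding the object.

def ivy_find(parent, u):
    """Node u acquires the object: walk to the root, then redirect every
    pointer on the visited path (including the old root) to u."""
    path, v = [], u
    while parent[v] != v:        # follow parent pointers to the root
        path.append(v)
        v = parent[v]
    path.append(v)               # v is the old root
    for w in path:
        parent[w] = u            # path reversal: all pointers now point to u
    return len(path) - 1         # number of redirected pointers (k_i)

# Example: the "bad" linked list 3 -> 2 -> 1 -> 0 with the object at node 0.
parent = [0, 0, 1, 2]
print(ivy_find(parent, 3))       # 3 steps
print(parent)                    # [3, 3, 3, 3]: the list collapsed to a star

Note how a single expensive find flattens the visited path into a star; this is exactly the intuition behind the amortized analysis of Theorem 7.5 below.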

Theorem 7.5. If the initial tree is a star, a find request of Algorithm 33 needs at most log n steps on average, where n is the number of processors.

Proof. All logarithms in the following proof are to base 2. We assume that accesses to the shared object are sequential. We use a potential function argument. Let s(u) be the size of the subtree rooted at node u (the number of nodes in the subtree including u itself). We define the potential Φ of the whole tree T as (V is the set of all nodes)
\[ \Phi(T) = \sum_{u \in V} \frac{\log s(u)}{2}. \]

Assume that the path traversed by the $i$-th operation has length $k_i$, i.e., the $i$-th operation redirects $k_i$ pointers to the new root. Clearly, the number of steps of the $i$-th operation is proportional to $k_i$. We are interested in the cost of $m$ consecutive operations, $\sum_{i=1}^{m} k_i$.

Let $T_0$ be the initial tree and let $T_i$ be the tree after the $i$-th operation. Further, let $a_i = k_i - \Phi(T_{i-1}) + \Phi(T_i)$ be the amortized cost of the $i$-th operation. We have
\[ \sum_{i=1}^{m} a_i = \sum_{i=1}^{m} \big( k_i - \Phi(T_{i-1}) + \Phi(T_i) \big) = \sum_{i=1}^{m} k_i - \Phi(T_0) + \Phi(T_m). \]

For any tree $T$, we have $\Phi(T) \geq \log(n)/2$. Because we assume that $T_0$ is a star, we also have $\Phi(T_0) = \log(n)/2$. We therefore get that
\[ \sum_{i=1}^{m} a_i \geq \sum_{i=1}^{m} k_i. \]

Hence, it suffices to upper bound the amortized cost of every operation. We thus analyze the amortized cost $a_i$ of the $i$-th operation. Let $x_0, x_1, x_2, \ldots, x_{k_i}$ be the path that is reversed by the operation. Further, for $0 \leq j \leq k_i$, let $s_j$ be the size of the subtree rooted at $x_j$ before the reversal. The size of the subtree rooted at $x_0$ after the reversal is $s_{k_i}$, and the size of the one rooted at $x_j$ after the reversal, for $1 \leq j \leq k_i$, is $s_j - s_{j-1}$ (see Figure 7.1). For all other nodes, the sizes of their subtrees are the same, therefore the corresponding terms cancel out in the amortized cost $a_i$. We can thus write $a_i$ as

\[ a_i = k_i - \left( \sum_{j=0}^{k_i} \frac{1}{2} \log s_j \right) + \frac{1}{2} \log s_{k_i} + \sum_{j=1}^{k_i} \frac{1}{2} \log(s_j - s_{j-1}) \]
\[ = k_i + \frac{1}{2} \cdot \sum_{j=0}^{k_i - 1} \big( \log(s_{j+1} - s_j) - \log s_j \big) \]
\[ = k_i + \frac{1}{2} \cdot \sum_{j=0}^{k_i - 1} \log\left( \frac{s_{j+1} - s_j}{s_j} \right). \]

For $0 \leq j \leq k_i - 1$, let $\alpha_j = s_{j+1}/s_j$. Note that $s_{j+1} > s_j$ and thus $\alpha_j > 1$. Further note that $(s_{j+1} - s_j)/s_j = \alpha_j - 1$. We therefore have that

\[ a_i = k_i + \frac{1}{2} \cdot \sum_{j=0}^{k_i - 1} \log(\alpha_j - 1) = \sum_{j=0}^{k_i - 1} \left( 1 + \frac{1}{2} \log(\alpha_j - 1) \right). \]
For $\alpha > 1$, it can be shown that $1 + \log(\alpha - 1)/2 \leq \log \alpha$ (see Lemma 7.6). From this inequality, we obtain

\[ a_i \leq \sum_{j=0}^{k_i - 1} \log \alpha_j = \sum_{j=0}^{k_i - 1} \log \frac{s_{j+1}}{s_j} = \sum_{j=0}^{k_i - 1} \big( \log s_{j+1} - \log s_j \big) \]

\[ = \log s_{k_i} - \log s_0 \leq \log n, \]
because $s_{k_i} = n$ and $s_0 \geq 1$. This concludes the proof.

Lemma 7.6. For $\alpha > 1$, $1 + \log(\alpha - 1)/2 \leq \log \alpha$.

Proof. The claim can be verified by the following chain of reasoning:
\[ 0 \leq (\alpha - 2)^2 \]
\[ 0 \leq \alpha^2 - 4\alpha + 4 \]
\[ 4(\alpha - 1) \leq \alpha^2 \]
\[ \log_2\big(4(\alpha - 1)\big) \leq \log_2\big(\alpha^2\big) \]

\[ 2 + \log_2(\alpha - 1) \leq 2 \log_2 \alpha \]
\[ 1 + \frac{1}{2} \log_2(\alpha - 1) \leq \log_2 \alpha. \]

Remarks:
• Systems guys (the algorithm is called Ivy because it was used in a system with the same name) have some fancy heuristics to improve performance even more: For example, the root every now and then broadcasts its name such that paths will be shortened.
• What about concurrent requests? It works with the same argument as in Arrow. Also for Ivy an argument including congestion is missing (and more pressing, since the dynamic topology of a tree cannot be chosen to have low degree and thus low congestion as in Arrow).
• Sometimes the type of accesses allows several accesses to be combined into one to reduce congestion higher up the tree. Let the tree in Algorithm 28 be a balanced binary tree. If the access to a shared variable for example is “add value x to the shared variable”, two or more accesses that accidentally meet at a node can be combined into one. Clearly, accidental meeting is rare in an asynchronous model. We might be able to use synchronizers (or maybe some other timing tricks) to help meeting a little bit.

Chapter 8

Maximal Independent Set

In this chapter we present a highlight of this course, a fast maximal independent set (MIS) algorithm. The algorithm is the first randomized algorithm that we study in this class. In distributed computing, randomization is a powerful and therefore omnipresent concept, as it allows for relatively simple yet efficient algorithms. As such, the studied algorithm is archetypal.

A MIS is a basic building block in distributed computing; some other problems pretty much follow directly from the MIS problem. At the end of this chapter, we will give two examples: matching and vertex coloring (see Chapter 2).

8.1 MIS

Definition 8.1 (Independent Set). Given an undirected graph G = (V,E), an independent set is a subset of nodes U ⊆ V such that no two nodes in U are adjacent. An independent set is maximal if no node can be added without violating independence. An independent set of maximum cardinality is called maximum.

Figure 8.1: Example graph with 1) a maximal independent set (MIS) and 2) a maximum independent set (MaxIS).


Remarks:

• Computing a maximum independent set (MaxIS) is a notoriously difficult problem. It is equivalent to maximum clique on the complementary graph. Both problems are NP-hard, in fact not approximable within $n^{\frac{1}{2}-\epsilon}$.

• In this course we concentrate on the maximal independent set (MIS) problem. Please note that MIS and MaxIS can be quite different, indeed, e.g., on a star graph the MIS is Θ(n) smaller than the MaxIS (cf. Figure 8.1).

• Computing a MIS sequentially is trivial: Scan the nodes in arbitrary order. If a node u does not violate independence, add u to the MIS. If u violates independence, discard u. So the only question is how to compute a MIS in a distributed way.
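As a concrete illustration, here is a minimal Python sketch of this sequential scan (the adjacency-dictionary representation and the function name are illustrative choices, not prescribed by the notes):

def greedy_mis(nodes, adj):
    """nodes: iterable of node ids; adj: dict mapping node -> set of neighbors."""
    mis = set()
    for u in nodes:                   # scan the nodes in the given order
        if adj[u].isdisjoint(mis):    # u does not violate independence
            mis.add(u)
    return mis

# Example: a star with center 0; the scan order decides MIS vs. MaxIS.
adj = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
print(greedy_mis([0, 1, 2, 3, 4], adj))   # {0}: a MIS of size 1
print(greedy_mis([1, 2, 3, 4, 0], adj))   # {1, 2, 3, 4}: the MaxIS

The star example also illustrates the earlier remark that a MIS can be much smaller than the MaxIS.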

Algorithm 34 Slow MIS
Require: Node IDs
Every node v executes the following code:
1: if all neighbors of v with larger identifiers have decided not to join the MIS then
2: v decides to join the MIS
3: end if

Remarks:

• Not surprisingly, the slow algorithm is not better than the sequential algorithm in the worst case, because there might be one single point of activity at any time. Formally:

Theorem 8.2 (Analysis of Algorithm 34). Algorithm 34 features a time complexity of O(n) and a message complexity of O(m).

Remarks:

• This is not very exciting.

• There is a relation between independent sets and node coloring (Chapter 2), since each color class is an independent set, however, not necessarily a MIS. Still, starting with a coloring, one can easily derive a MIS algorithm: We first choose all nodes of the first color. Then, for each additional color we add “in parallel” (without conflict) as many nodes as possible. Thus the following corollary holds:

Corollary 8.3. Given a coloring algorithm that needs C colors and runs in time T , we can construct a MIS in time C + T .

Remarks:

• Using Theorem 2.12 and Corollary 8.3 we get a distributed deterministic MIS algorithm for trees (and for bounded degree graphs) with time complexity O(log∗ n).

• With a lower bound argument one can show that this deterministic MIS algorithm for rings is asymptotically optimal.

• There have been attempts to extend Algorithm 5 to more general graphs, however, so far without much success. Below we present a radically different approach that uses randomization. Please note that the algorithm and the analysis below are not identical with the algorithm in Peleg’s book.

8.2 Fast MIS from 1986

Algorithm 35 Fast MIS
The algorithm operates in synchronous rounds, grouped into phases. A single phase is as follows:
1) Each node v marks itself with probability $\frac{1}{2d(v)}$, where d(v) is the current degree of v.
2) If no higher degree neighbor of v is also marked, node v joins the MIS. If a higher degree neighbor of v is marked, node v unmarks itself again. (If the neighbors have the same degree, ties are broken arbitrarily, e.g., by identifier.)
3) Delete all nodes that joined the MIS and their neighbors, as they cannot join the MIS anymore.
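A centralized Python simulation of the phases might look as follows. This is a sketch that simulates the synchronous steps on a global adjacency dictionary; it is not a distributed implementation, and letting isolated nodes join immediately is an assumption made here for convenience:

import random

def luby_phase(adj):
    """One phase of Algorithm 35. adj: dict node -> set of neighbors.
    Returns (nodes joining the MIS, the remaining graph)."""
    d = {v: len(adj[v]) for v in adj}
    # Step 1: mark with probability 1/(2 d(v)); isolated nodes simply join.
    marked = {v for v in adj if d[v] == 0 or random.random() < 1 / (2 * d[v])}
    # Step 2: v stays marked iff no marked neighbor beats it on (degree, id).
    joined = {v for v in marked
              if all(w not in marked or (d[w], w) < (d[v], v) for w in adj[v])}
    # Step 3: delete MIS nodes and their neighbors.
    removed = joined | {w for v in joined for w in adj[v]}
    return joined, {v: adj[v] - removed for v in adj if v not in removed}

def luby_mis(adj):
    mis = set()
    while adj:
        joined, adj = luby_phase(adj)
        mis |= joined
    return mis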

Remarks:

• Correctness in the sense that the algorithm produces an independent set is relatively simple: Steps 1 and 2 make sure that if a node v joins the MIS, then v’s neighbors do not join the MIS at the same time. Step 3 makes sure that v’s neighbors will never join the MIS.

• Likewise the algorithm eventually produces a MIS, because the node with the highest degree will mark itself at some point in Step 1.

• So the only remaining question is how fast the algorithm terminates. To understand this, we need to dig a bit deeper.

Lemma 8.4 (Joining MIS). A node v joins the MIS in Step 2 with probability $p \geq \frac{1}{4d(v)}$.

Proof: Let M be the set of marked nodes in Step 1. Let H(v) be the set of neighbors of v with higher degree, or same degree and higher identifier. Using independence of the random choices of v and nodes in H(v) in Step 1 we get

\[ P[v \notin \text{MIS} \mid v \in M] = P[\exists w \in H(v), w \in M \mid v \in M] = P[\exists w \in H(v), w \in M] \]
\[ \leq \sum_{w \in H(v)} P[w \in M] = \sum_{w \in H(v)} \frac{1}{2d(w)} \leq \sum_{w \in H(v)} \frac{1}{2d(v)} \leq \frac{d(v)}{2d(v)} = \frac{1}{2}. \]

Then
\[ P[v \in \text{MIS}] = P[v \in \text{MIS} \mid v \in M] \cdot P[v \in M] \geq \frac{1}{2} \cdot \frac{1}{2d(v)}. \]

Lemma 8.5 (Good Nodes). A node v is called good if

\[ \sum_{w \in N(v)} \frac{1}{2d(w)} \geq \frac{1}{6}. \]

Otherwise we call v a bad node. A good node will be removed in Step 3 with probability $p \geq \frac{1}{36}$.

Proof: Let node v be good. Intuitively, good nodes have lots of low-degree neighbors, thus chances are high that one of them goes into the independent set, in which case v will be removed in Step 3 of the algorithm.

If there is a neighbor $w \in N(v)$ with degree at most 2 we are done: With Lemma 8.4 the probability that node w joins the MIS is at least $\frac{1}{8}$, and our good node will be removed in Step 3.

So all we need to worry about is the case where all neighbors have at least degree 3: For any neighbor w of v we then have $\frac{1}{2d(w)} \leq \frac{1}{6}$. Since $\sum_{w \in N(v)} \frac{1}{2d(w)} \geq \frac{1}{6}$, there is a subset of neighbors $S \subseteq N(v)$ such that
\[ \frac{1}{6} \leq \sum_{w \in S} \frac{1}{2d(w)} \leq \frac{1}{3}. \]

We can now bound the probability that node v will be removed. Let therefore R be the event of v being removed. Again, if a neighbor of v joins the MIS in Step 2, node v will be removed in Step 3. We have

\[ P[R] \geq P[\exists u \in S, u \in \text{MIS}] \geq \sum_{u \in S} P[u \in \text{MIS}] - \sum_{u,w \in S;\, u \neq w} P[u \in \text{MIS and } w \in \text{MIS}]. \]

For the last inequality we used the inclusion-exclusion principle truncated after the second order terms. Let M again be the set of marked nodes after Step 1. Using P [u ∈ M] ≥ P [u ∈ MIS] we get

\[ P[R] \geq \sum_{u \in S} P[u \in \text{MIS}] - \sum_{u,w \in S;\, u \neq w} P[u \in M \text{ and } w \in M] \]
\[ \geq \sum_{u \in S} P[u \in \text{MIS}] - \sum_{u \in S} \sum_{w \in S} P[u \in M] \cdot P[w \in M] \]
\[ \geq \sum_{u \in S} \frac{1}{4d(u)} - \sum_{u \in S} \sum_{w \in S} \frac{1}{2d(u)} \cdot \frac{1}{2d(w)} \]
\[ \geq \sum_{u \in S} \frac{1}{2d(u)} \left( \frac{1}{2} - \sum_{w \in S} \frac{1}{2d(w)} \right) \geq \frac{1}{6} \left( \frac{1}{2} - \frac{1}{3} \right) = \frac{1}{36}. \]

Remarks:
• We would be almost finished if we could prove that many nodes are good in each phase. Unfortunately this is not the case: In a star-graph, for instance, only a single node is good! We need to find a work-around.

Lemma 8.6 (Good Edges). An edge e = (u, v) is called bad if both u and v are bad; else the edge is called good. The following holds: At any time at least half of the edges are good.

Proof: For the proof we construct a directed auxiliary graph: Direct each edge towards the higher degree node (if both nodes have the same degree, direct it towards the higher identifier). Now we need a little helper lemma before we can continue with the proof.

Lemma 8.7. A bad node has outdegree at least twice its indegree.

Proof: For the sake of contradiction, assume that a bad node v does not have outdegree at least twice its indegree. In other words, at least one third of the neighbor nodes (let’s call them S) have degree at most d(v). But then
\[ \sum_{w \in N(v)} \frac{1}{2d(w)} \geq \sum_{w \in S} \frac{1}{2d(w)} \geq \sum_{w \in S} \frac{1}{2d(v)} \geq \frac{d(v)}{3} \cdot \frac{1}{2d(v)} = \frac{1}{6}, \]
which means v is good, a contradiction.

Continuing the proof of Lemma 8.6: According to Lemma 8.7 the number of edges directed into bad nodes is at most half the number of edges directed out of bad nodes. Thus, the number of edges directed into bad nodes is at most half the number of edges. Thus, at least half of the edges are directed into good nodes. Since these edges are not bad, they must be good.

Theorem 8.8 (Analysis of Algorithm 35). Algorithm 35 terminates in expected time O(log n).

Proof: With Lemma 8.5 a good node (and therefore a good edge!) will be deleted with constant probability. Since at least half of the edges are good (Lemma 8.6), a constant fraction of edges will be deleted in each phase.

More formally: With Lemmas 8.5 and 8.6 we know that at least half of the edges will be removed with probability at least 1/36. Let R be the number of edges to be removed. Using linearity of expectation we know that $E[R] \geq m/72$, m being the total number of edges at the start of a phase. Now let $p := P[R \leq E[R]/2]$. Bounding the expectation yields
\[ E[R] = \sum_{r} P[R = r] \cdot r \leq p \cdot E[R]/2 + (1 - p) \cdot m. \]
Solving for p we get
\[ p \leq \frac{m - E[R]}{m - E[R]/2} < \frac{m - E[R]/2}{m} \leq 1 - 1/144. \]
In other words, with probability at least 1/144 at least m/144 edges are removed in a phase. After expected O(log m) phases all edges are deleted. Since $m \leq n^2$ and thus O(log m) = O(log n), the theorem follows.

Remarks:

• With a bit more math one can even show that Algorithm 35 terminates in time O(log n) “with high probability”.

• The presented algorithm is a simplified version of an algorithm by Michael Luby, published 1986 in the SIAM Journal on Computing. Around the same time there have been a number of other papers dealing with the same or related problems, for instance by Alon, Babai, and Itai, or by Israeli and Itai. The analysis presented here takes elements of all these papers, and from other papers on distributed weighted matching. The analysis in the book by David Peleg is different, and only achieves $O(\log^2 n)$ time.

• Though not as incredibly fast as the log∗-coloring algorithm for trees, this algorithm is very general. It works on any graph, needs no identifiers, and can easily be made asynchronous.

• Surprisingly, much later, there have been half a dozen more papers published, with much worse results!! In 2002, for instance, there was a paper with linear running time, improving on a 1994 paper with cubic running time, restricted to trees!

• In 2009, Métivier, Robson, Saheb-Djahromi and Zemmari found a slightly different (and simpler) way to compute a MIS in the same logarithmic time:

8.3 Fast MIS from 2009

Algorithm 36 Fast MIS 2
The algorithm operates in synchronous rounds, grouped into phases. A single phase is as follows:
1) Each node v chooses a random value r(v) ∈ [0, 1] and sends it to its neighbors.
2) If r(v) < r(w) for all neighbors w ∈ N(v), node v enters the MIS and informs its neighbors.
3) If v or a neighbor of v entered the MIS, v terminates (v and all edges adjacent to v are removed from the graph), otherwise v enters the next phase.
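The whole algorithm is short enough for a complete centralized simulation sketch (again an illustrative Python rendering of the phases, not a distributed implementation):

import random

def fast_mis_2(adj):
    """adj: dict node -> set of neighbors. Returns a maximal independent set."""
    adj = {v: set(ws) for v, ws in adj.items()}   # work on a local copy
    mis = set()
    while adj:
        r = {v: random.random() for v in adj}     # 1) draw random values
        joined = {v for v in adj                  # 2) local minima join
                  if all(r[v] < r[w] for w in adj[v])}
        removed = joined | {w for v in joined for w in adj[v]}
        adj = {v: adj[v] - removed                # 3) remove MIS nodes
               for v in adj if v not in removed}  #    and their neighbors
        mis |= joined
    return mis

# Example: a MIS of a 10-cycle, e.g. {0, 2, 5, 8}.
cycle = {i: {(i - 1) % 10, (i + 1) % 10} for i in range(10)}
print(fast_mis_2(cycle))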

Remarks:

• Correctness in the sense that the algorithm produces an independent set is simple: Steps 1 and 2 make sure that if a node v joins the MIS, then v’s neighbors do not join the MIS at the same time. Step 3 makes sure that v’s neighbors will never join the MIS.

• Likewise the algorithm eventually produces a MIS, because the node with the globally smallest value will always join the MIS, hence there is progress.

• So the only remaining question is how fast the algorithm terminates. To understand this, we need to dig a bit deeper.

• Our proof will rest on a simple, yet powerful observation about expected values of random variables that may not be independent:

Theorem 8.9 (Linearity of Expectation). Let $X_i$, $i = 1, \ldots, k$ denote random variables. Then
\[ E\left[\sum_i X_i\right] = \sum_i E[X_i]. \]

Proof. It is sufficient to prove $E[X + Y] = E[X] + E[Y]$ for two random variables X and Y, because then the statement follows by induction. Since

\[ P[(X,Y) = (x,y)] = P[X = x] \cdot P[Y = y \mid X = x] = P[Y = y] \cdot P[X = x \mid Y = y], \]
we get that
\[ E[X + Y] = \sum_{(x,y)} P[(X,Y) = (x,y)] \cdot (x + y) \]
\[ = \sum_{x} \sum_{y} P[X = x] \cdot P[Y = y \mid X = x] \cdot x + \sum_{y} \sum_{x} P[Y = y] \cdot P[X = x \mid Y = y] \cdot y \]
\[ = \sum_{x} P[X = x] \cdot x + \sum_{y} P[Y = y] \cdot y = E[X] + E[Y]. \]

Remarks:
• How can we prove that the algorithm only needs O(log n) phases in expectation? It would be great if this algorithm managed to remove a constant fraction of nodes in each phase. Unfortunately, it does not.
• Instead we will prove that the number of edges decreases quickly. Again, it would be great if any single edge was removed with constant probability in Step 3. But again, unfortunately, this is not the case.
• Maybe we can argue about the expected number of edges to be removed in one single phase? Let’s see: A node v enters the MIS with probability 1/(d(v) + 1), where d(v) is the degree of node v. By doing so, not only are v’s edges removed, but indeed all the edges of v’s neighbors as well – generally these are much more than d(v) edges. So there is hope, but we need to be careful: If we do this the most naive way, we will count the same edge many times.
• How can we fix this? The nice observation is that it is enough to count just some of the removed edges. Given a new MIS node v and a neighbor w ∈ N(v), we count the edges only if r(v) < r(x) for all x ∈ N(w). This looks promising. In a star graph, for instance, only the smallest random value can be accounted for removing all the edges of the star.

Lemma 8.10 (Edge Removal). In a single phase, we remove at least half of the edges in expectation.

Proof: To simplify the notation, at the start of our phase, the graph is simply G = (V,E). Suppose that a node v joins the MIS in this phase, i.e., r(v) < r(w) for all neighbors w ∈ N(v). If in addition we have r(v) < r(x) for all neighbors x of a neighbor w of v, we call this event (v → w). The probability of event (v → w) is at least 1/(d(v) + d(w)), since d(v) + d(w) is the maximum number of nodes adjacent to v or w (or both). As v joins the MIS, all edges (w, x) will be removed; there are d(w) of these edges.

In order to count the removed edges, we need to weigh events properly. Whether we remove the edges adjacent to w because of event (v → w) is a random variable $X_{(v \to w)}$. If event (v → w) occurs, $X_{(v \to w)}$ has the value d(w); if not, it has the value 0. For each edge {v, w} we have two such variables, $X_{(v \to w)}$ and $X_{(w \to v)}$. Due to Theorem 8.9, the expected value of the sum X of all these random variables is at least
\[ E[X] = \sum_{\{v,w\} \in E} \big( E[X_{(v \to w)}] + E[X_{(w \to v)}] \big) \]
\[ = \sum_{\{v,w\} \in E} \big( P[\text{Event } (v \to w)] \cdot d(w) + P[\text{Event } (w \to v)] \cdot d(v) \big) \]
\[ \geq \sum_{\{v,w\} \in E} \left( \frac{d(w)}{d(v) + d(w)} + \frac{d(v)}{d(w) + d(v)} \right) = \sum_{\{v,w\} \in E} 1 = |E|. \]

In other words, in expectation all edges are removed in a single phase?!? Probably not. This means that we still counted some edges more than once. Indeed, for an edge {v, w} ∈ E our random variable X includes the edge if the event (u → v) happens, but X also includes the edge if the event (x → w) happens. So we may have counted the edge {v, w} twice. Fortunately however, not more than twice, because at most one event (· → v) and at most one event (· → w) can happen. If (u → v) happens, we know that r(u) < r(w) for all w ∈ N(v); hence another (u′ → v) cannot happen, because (u′ → v) would require r(u′) < r(u) (note that u ∈ N(v)). Therefore the random variable X must be divided by 2. In other words, in expectation at least half of the edges are removed.

Remarks:
• This enables us to derive a bound on the expected running time of Algorithm 36 quite easily.

Theorem 8.11 (Expected running time of Algorithm 36). Algorithm 36 terminates after at most $3 \log_{4/3} m + 1 \in O(\log n)$ phases in expectation.

Proof: The probability that in a single phase at least a quarter of all edges are removed is at least 1/3. For the sake of contradiction, assume not. Then with probability less than 1/3 we may be lucky and many (potentially all) edges are removed. With probability more than 2/3 less than 1/4 of the edges are removed. Hence the expected fraction of removed edges is strictly less than 1/3 · 1 + 2/3 · 1/4 = 1/2. This contradicts Lemma 8.10.

Hence, at least every third phase is “good” and removes at least a quarter of the edges. To get rid of all but two edges we need $\log_{4/3} m$ good phases in expectation. The last two edges will certainly be removed in the next phase. Hence a total of $3 \log_{4/3} m + 1$ phases are enough in expectation.

Remarks:
• Sometimes one expects a bit more of an algorithm: Not only should the expected time to terminate be good, but the algorithm should always terminate quickly. As this is impossible in randomized algorithms (after all, the random choices may be “unlucky” all the time!), researchers often settle for a compromise, and just demand that the probability that the algorithm does not terminate in the specified time can be made absurdly small. For our algorithm, this can be deduced from Lemma 8.10 and another standard tool, namely Chernoff’s Bound.

Definition 8.12 (W.h.p.). We say that an algorithm terminates w.h.p. (with high probability) within O(t) time if it does so with probability at least $1 - 1/n^c$ for any choice of $c \geq 1$. Here c may affect the constants in the Big-O notation because it is considered a “tunable constant” and usually kept small.

Definition 8.13 (Chernoff’s Bound). Let $X = \sum_{i=1}^{k} X_i$ be the sum of k independent 0-1 random variables. Then Chernoff’s bound states that w.h.p.
\[ |X - E[X]| \in O\left(\log n + \sqrt{E[X] \log n}\right). \]

Corollary 8.14 (Running Time of Algorithm 36). Algorithm 36 terminates w.h.p. in O(log n) time.

Proof: In Theorem 8.11 we used that independently of everything that happened before, in each phase we have a constant probability p that a quarter of the edges are removed. Call such a phase good. For some constants $C_1$ and $C_2$, let us check after $C_1 \log n + C_2 \in O(\log n)$ phases, in how many phases at least a quarter of the edges have been removed. In expectation, these are at least $p(C_1 \log n + C_2)$ many. Now we look at the random variable $X = \sum_{i=1}^{C_1 \log n + C_2} X_i$, where the $X_i$ are independent 0-1 variables being one with exactly probability p. Certainly, if X is at least x with some probability, then the probability that we have x good phases can only be larger (if no edges are left, certainly “all” of the remaining edges are removed). To X we can apply Chernoff’s bound. If $C_1$ and $C_2$ are chosen large enough, they will overcome the constants in the Big-O from Chernoff’s bound, i.e., w.h.p. it holds that $|X - E[X]| \leq E[X]/2$, implying $X \geq E[X]/2$. Choosing $C_1$ large enough, we will have w.h.p. sufficiently many good phases, i.e., the algorithm terminates w.h.p. in O(log n) phases.

Remarks:
• The algorithm can be improved even a bit more. Drawing random real numbers in each phase, for instance, is not necessary. One can achieve the same by sending only a total of O(log n) random (and as many non-random) bits over each edge.
• One of the main open problems in distributed computing is whether one can beat this logarithmic time, or at least achieve it with a deterministic algorithm.

• Let’s turn our attention to applications of MIS next.

8.4 Applications

Definition 8.15 (Matching). Given a graph G = (V,E) a matching is a subset of edges M ⊆ E, such that no two edges in M are adjacent (i.e., where no node is adjacent to two edges in the matching). A matching is maximal if no edge can be added without violating the above constraint. A matching of maximum cardinality is called maximum. A matching is called perfect if each node is adjacent to an edge in the matching.

Remarks:
• In contrast to MaxIS, a maximum matching can be found in polynomial time (Blossom algorithm by Jack Edmonds), and is also easy to approximate (in fact, any maximal matching is already a 2-approximation).
• An independent set algorithm is also a matching algorithm: Let G = (V,E) be the graph for which we want to construct the matching. The auxiliary graph G′ is defined as follows: for every edge in G there is a node in G′; two nodes in G′ are connected by an edge if their respective edges in G are adjacent. A (maximal) independent set in G′ is a (maximal) matching in G, and vice versa. Using Algorithm 36 directly produces an O(log n) bound for maximal matching. A small sketch of this reduction follows below.
• More importantly, our MIS algorithm can also be used for vertex coloring (Problem 2.1), as given by Algorithm 37.
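The sketch below builds the auxiliary graph G′ in Python; the helper name line_graph and the frozenset encoding of edges are illustrative assumptions, and any MIS routine (e.g. the fast_mis_2 sketch from Section 8.3) can be plugged in:

def line_graph(edges):
    """edges: list of frozensets {u, v} of a graph G.
    Returns the adjacency dict of the auxiliary graph G': one node per
    edge of G, two nodes adjacent iff the edges share an endpoint."""
    adj = {e: set() for e in edges}
    for i, e in enumerate(edges):
        for f in edges[i + 1:]:
            if e & f:            # the two edges of G share an endpoint
                adj[e].add(f)
                adj[f].add(e)
    return adj

# Example: the path a-b-c-d; its maximal matchings are {ab, cd} and {bc}.
edges = [frozenset("ab"), frozenset("bc"), frozenset("cd")]
print(fast_mis_2(line_graph(edges)))   # a maximal matching of the path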

Algorithm 37 General Graph Coloring
1: Given a graph G = (V,E) we virtually build a graph G′ = (V′, E′) as follows:
2: Every node v ∈ V clones itself d(v)+1 times ($v_0, \ldots, v_{d(v)} \in V′$), d(v) being the degree of v in G.
3: The edge set E′ of G′ is as follows:
4: First, all clones are in a clique: $(v_i, v_j) \in E′$, for all v ∈ V and all 0 ≤ i < j ≤ d(v).
5: Second, all i-th clones of neighbors in the original graph G are connected: $(u_i, v_i) \in E′$, for all (u, v) ∈ E and all 0 ≤ i ≤ min(d(u), d(v)).
6: Now we simply run (simulate) the fast MIS Algorithm 36 on G′.
7: If node $v_i$ is in the MIS in G′, then node v gets color i.
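The clone-graph construction of lines 1-5 can be sketched in a few lines of Python (centralized and purely illustrative; the tuple (v, i) plays the role of the clone $v_i$):

def clone_graph(adj):
    """adj: dict node -> set of neighbors in G. Returns adjacency of G'."""
    d = {v: len(adj[v]) for v in adj}
    gp = {(v, i): set() for v in adj for i in range(d[v] + 1)}
    for v in adj:
        for i in range(d[v] + 1):            # clique among the clones of v
            for j in range(i + 1, d[v] + 1):
                gp[(v, i)].add((v, j))
                gp[(v, j)].add((v, i))
        for u in adj[v]:                     # connect i-th clones of neighbors
            for i in range(min(d[u], d[v]) + 1):
                gp[(v, i)].add((u, i))
                gp[(u, i)].add((v, i))
    return gp

# Example: coloring a triangle with at most Delta + 1 = 3 colors.
tri = {v: {w for w in "abc" if w != v} for v in "abc"}
coloring = {v: i for (v, i) in fast_mis_2(clone_graph(tri))}
print(coloring)                              # e.g. {'a': 0, 'b': 2, 'c': 1}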

Theorem 8.16 (Analysis of Algorithm 37). Algorithm 37 (∆ + 1)-colors an arbitrary graph in O(log n) time, with high probability, ∆ being the largest degree in the graph.

Proof: Thanks to the clique among the clones, at most one clone of each node is in the MIS. And because of the d(v)+1 clones of node v, every node will get a free color! The running time remains logarithmic since G′ has $O(n^2)$ nodes and the exponent becomes a constant factor when applying the logarithm.

Remarks: • This solves our open problem from Chapter 2.1!

• Together with Corollary 8.3 we get quite close ties between (∆+1)-coloring and the MIS problem.
• However, in general Algorithm 37 is not the best distributed algorithm for O(∆)-coloring. For fast distributed vertex coloring please check Kothapalli, Onus, Scheideler, Schindelhauer, IPDPS 2006. This algorithm is based on an O(log log n) time edge coloring algorithm by Grable and Panconesi, 1997.
• Computing a MIS also solves another graph problem on graphs of bounded independence.

Definition 8.17 (Bounded Independence). G = (V,E) is of bounded independence if each neighborhood contains at most a constant number of independent (i.e., mutually non-adjacent) nodes.

Definition 8.18 ((Minimum) Dominating Sets). A dominating set is a subset of the nodes such that each node is in the set or adjacent to a node in the set. A minimum dominating set is a dominating set containing the least possible number of nodes.

Remarks:

• In general, finding a dominating set that is less than a factor log n larger than a minimum dominating set is NP-hard.
• Any MIS is a dominating set: if a node was not covered, it could join the independent set.

• In general, a MIS and a minimum dominating set do not have much in common (think of a star). For graphs of bounded independence, this is different.

Corollary 8.19. On graphs of bounded independence, a constant-factor approximation to a minimum dominating set can be found in time O(log n) w.h.p.

Proof: Denote by M a minimum dominating set and by I a MIS. Since M is a dominating set, each node from I is in M or adjacent to a node in M. Since the graph is of bounded independence, no node in M is adjacent to more than constantly many nodes from I. Thus, |I| ∈ O(|M|). Therefore, we can compute a MIS with Algorithm 36 and output it as the dominating set, which takes O(log n) rounds w.h.p.

Chapter 9

Locality Lower Bounds

In Chapter 2, we looked at distributed algorithms for coloring. In particular, we saw that rings and rooted trees can be colored with 3 colors in log∗ n + O(1) rounds. In this chapter, we will reconsider the distributed coloring problem. We will look at a classic lower bound by Nathan Linial that shows that the result of Chapter 2 is tight: Coloring rings (and rooted trees) indeed requires Ω(log∗ n) rounds. In particular, we will prove a lower bound for coloring in the following setting:

• We consider deterministic, synchronous algorithms.

• Message size and local computations are unbounded.

• We assume that the network is a directed ring with n nodes.

• Nodes have unique labels (identifiers) from 1 to n.

Remarks:

• A generalization of the lower bound to randomized algorithms is possible. Unfortunately, we will not have time to discuss this here.

• Except for restricting to deterministic algorithms, all the conditions above make a lower bound stronger. Any lower bound for synchronous algorithms certainly also holds for asynchronous ones. A lower bound that is true if message size and local computations are not restricted is clearly also valid if we require a bound on the maximal message size or the amount of local computations. Similarly, assuming that the ring is directed and that node labels are from 1 to n (instead of choosing IDs from a more general domain) also strengthens the lower bound.

• Instead of directly proving that 3-coloring a ring needs Ω(log∗ n) rounds, we will prove a slightly more general statement. We will consider deterministic algorithms with time complexity r (for arbitrary r) and derive a lower bound on the number of colors that are needed if we want to properly color an n-node ring with an r-round algorithm. A 3-coloring lower bound can then be derived by taking the smallest r for which an r-round algorithm needs 3 or fewer colors.


Algorithm 38 Synchronous Algorithm: Canonical Form
1: In r rounds: send complete initial state to nodes at distance at most r
2: // do all the communication first
3: Compute output based on complete information about r-neighborhood
4: // do all the computation in the end

9.1 Locality

Let us for a moment look at distributed algorithms more generally (i.e., not only at coloring and not only at rings). Assume that initially, all nodes only know their own label (identifier) and potentially some additional input. As information needs at least r rounds to travel r hops, after r rounds, a node v can only learn about other nodes at distance at most r. If message size and local computations are not restricted, it is in fact not hard to see that in r rounds, a node v can exactly learn all the node labels and inputs up to distance r. As shown by the following lemma, this allows to transform every deterministic r-round synchronous algorithm into a simple canonical form.

Lemma 9.1. If message size and local computations are not bounded, every deterministic, synchronous r-round algorithm can be transformed into an algorithm of the form given by Algorithm 38 (i.e., it is possible to first communicate for r rounds and then do all the computations in the end).

Proof. Consider some r-round algorithm A. We want to show that A can be brought to the canonical form given by Algorithm 38. First, we let the nodes communicate for r rounds. Assume that in every round, every node sends its complete state to all of its neighbors (remember that there is no restriction on the maximal message size). By induction, after i rounds, every node knows the initial state of all other nodes at distance at most i. Hence, after r rounds, a node v has the combined initial knowledge of all the nodes in its r-neighborhood. We want to show that this suffices to locally (at node v) simulate enough of Algorithm A to compute all the messages that v receives in the r communication rounds of a regular execution of Algorithm A.

Concretely, we prove the following statement by induction on i. For all nodes at distance at most r − i + 1 from v, node v can compute all messages of the first i rounds of a regular execution of A. Note that this implies that v can compute all the messages it receives from its neighbors during all r rounds. Because v knows the initial state of all nodes in the r-neighborhood, v can clearly compute all messages of the first round (i.e., the statement is true for i = 1). Let us now consider the induction step from i to i + 1. By the induction hypothesis, v can compute the messages of the first i rounds of all nodes in its (r − i + 1)-neighborhood. It can therefore compute all messages that are received by nodes in the (r − i)-neighborhood in the first i rounds. This is of course exactly what is needed to compute the messages of round i + 1 of nodes in the (r − i)-neighborhood.

Remarks:

• It is straightforward to generalize the canonical form to randomized algorithms: Every node first computes all the random bits it needs throughout the algorithm. The random bits are then part of the initial state of a node.

Definition 9.2 (r-hop view). We call the collection of the initial states of all nodes in the r-neighborhood of a node v, the r-hop view of v.

Remarks:

• Assume that initially, every node knows its degree, its label (identifier) and potentially some additional input. The r-hop view of a node v then includes the complete topology of the r-neighborhood (excluding edges between nodes at distance r) and the labels and additional inputs of all nodes in the r-neighborhood.

Based on the definition of an r-hop view, we can state the following corollary of Lemma 9.1.

Corollary 9.3. A deterministic r-round algorithm A is a function that maps every possible r-hop view to the set of possible outputs.

Proof. By Lemma 9.1, we know that we can transform Algorithm A to the canonical form given by Algorithm 38. After r communication rounds, every node v knows exactly its r-hop view. This information suffices to compute the output of node v.

Remarks:

• Note that the above corollary implies that two nodes with equal r-hop views have to compute the same output in every r-round algorithm.

• For coloring algorithms, the only input of a node v is its label. The r-hop view of a node therefore is its labeled r-neighborhood.

• Since we only consider rings, r-hop neighborhoods are particularly simple. The labeled r-neighborhood of a node v (and hence its r-hop view) in a directed ring is simply a (2r + 1)-tuple $(\ell_{-r}, \ell_{-r+1}, \ldots, \ell_0, \ldots, \ell_r)$ of distinct node labels where $\ell_0$ is the label of v. Assume that for i > 0, $\ell_i$ is the label of the i-th clockwise neighbor of v and $\ell_{-i}$ is the label of the i-th counterclockwise neighbor of v. A deterministic coloring algorithm for directed rings therefore is a function that maps (2r + 1)-tuples of node labels to colors.

• Consider two r-hop views $V_r = (\ell_{-r}, \ldots, \ell_r)$ and $V'_r = (\ell'_{-r}, \ldots, \ell'_r)$. If $\ell'_i = \ell_{i+1}$ for $-r \leq i \leq r - 1$ and if $\ell'_r \neq \ell_i$ for $-r \leq i \leq r$, the r-hop view $V'_r$ can be the r-hop view of a clockwise neighbor of a node with r-hop view $V_r$. Therefore, every algorithm A that computes a valid coloring needs to assign different colors to $V_r$ and $V'_r$. Otherwise, there is a ring labeling for which A assigns the same color to two adjacent nodes.

9.2 The Neighborhood Graph

We will now make the above observations concerning colorings of rings a bit more formal. Instead of thinking of an r-round coloring algorithm as a function from all possible r-hop views to colors, we will use a slightly different perspective. Interestingly, the problem of understanding distributed coloring algorithms can itself be seen as a classical graph coloring problem.

Definition 9.4 (Neighborhood Graph). For a given family of network graphs G, the r-neighborhood graph $N_r(G)$ is defined as follows. The node set of $N_r(G)$ is the set of all possible labeled r-neighborhoods (i.e., all possible r-hop views). There is an edge between two labeled r-neighborhoods $V_r$ and $V'_r$ if $V_r$ and $V'_r$ can be the r-hop views of two adjacent nodes.

Lemma 9.5. For a given family of network graphs G, there is an r-round algorithm that colors graphs of G with c colors iff the chromatic number of the neighborhood graph is $\chi(N_r(G)) \leq c$.

Proof. We have seen that a coloring algorithm is a function that maps every possible r-hop view to a color. Hence, a coloring algorithm assigns a color to every node of the neighborhood graph $N_r(G)$. If two r-hop views $V_r$ and $V'_r$ can be the r-hop views of two adjacent nodes u and v (for some labeled graph in G), every correct coloring algorithm must assign different colors to $V_r$ and $V'_r$. Thus, specifying an r-round coloring algorithm for a family of network graphs G is equivalent to coloring the respective neighborhood graph $N_r(G)$.

Remarks:
• If an algorithm is non-uniform, i.e., the nodes know n, we can see this as having different neighborhood graphs for different values of n (as opposed to a disconnected neighborhood graph).
• This does not make much of a difference for coloring algorithms on the ring, as we are interested in neighborhoods that are much smaller than n.

Instead of directly defining the neighborhood graph for directed rings, we define directed graphs $B_{k,n}$ that are closely related to the neighborhood graph. Let k and n be two positive integers and assume that n ≥ k. The node set of $B_{k,n}$ contains all k-tuples of increasing node labels ($[n] = \{1, \ldots, n\}$):
\[ V[B_{k,n}] = \big\{ (\alpha_1, \ldots, \alpha_k) : \alpha_i \in [n],\ i < j \rightarrow \alpha_i < \alpha_j \big\} \qquad (9.1) \]

For $\alpha = (\alpha_1, \ldots, \alpha_k)$ and $\beta = (\beta_1, \ldots, \beta_k)$ there is a directed edge from $\alpha$ to $\beta$ iff
\[ \forall i \in \{1, \ldots, k - 1\} : \beta_i = \alpha_{i+1}. \qquad (9.2) \]
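For experimentation, $B_{k,n}$ is easy to materialize. The following Python sketch is an illustrative helper, not part of the notes; for the degenerate case k = 1 (where condition (9.2) is vacuous) we direct edges from smaller to larger labels, matching the complete-graph reading used in Lemma 9.10 below:

from itertools import combinations

def B(k, n):
    """Returns (nodes, edges) of the directed graph B_{k,n}."""
    nodes = list(combinations(range(1, n + 1), k))   # increasing k-tuples (9.1)
    def edge(a, b):
        if k == 1:
            return a[0] < b[0]                       # degenerate case k = 1
        return all(b[i] == a[i + 1] for i in range(k - 1))   # condition (9.2)
    edges = [(a, b) for a in nodes for b in nodes if edge(a, b)]
    return nodes, edges

nodes, edges = B(2, 4)
print(len(nodes), len(edges))   # 6 4, e.g. the edge ((1,2), (2,3))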

Lemma 9.6. Viewed as an undirected graph, the graph $B_{2r+1,n}$ is a subgraph of the r-neighborhood graph of directed n-node rings with node labels from [n].

Proof. The claim follows directly from the observations regarding r-hop views of nodes in a directed ring from Section 9.1. The set of k-tuples of increasing node labels is a subset of the set of k-tuples of distinct node labels. Two nodes of $B_{2r+1,n}$ are connected by a directed edge iff the two corresponding r-hop views are connected by a directed edge in the neighborhood graph. Note that if there

is an edge between $\alpha$ and $\beta$ in $B_{k,n}$, then $\alpha_1 \neq \beta_k$ because the node labels in $\alpha$ and $\beta$ are increasing.

To determine a lower bound on the number of colors an r-round algorithm needs for directed n-node rings, it therefore suffices to determine a lower bound on the chromatic number of $B_{2r+1,n}$. To obtain such a lower bound, we need the following definition.

Definition 9.7 (Diline Graph). The directed line graph (diline graph) DL(G) of a directed graph G = (V,E) is defined as follows. The node set of DL(G) is V[DL(G)] = E. There is a directed edge $\big((w,x),(y,z)\big)$ between $(w,x) \in E$ and $(y,z) \in E$ iff $x = y$, i.e., if the first edge ends where the second one starts.

Lemma 9.8. If n > k, the graph $B_{k+1,n}$ can be defined recursively as follows:

\[ B_{k+1,n} = DL(B_{k,n}). \]

Proof. The edges of $B_{k,n}$ are pairs of k-tuples $\alpha = (\alpha_1, \ldots, \alpha_k)$ and $\beta = (\beta_1, \ldots, \beta_k)$ that satisfy Conditions (9.1) and (9.2). Because the last k − 1 labels in $\alpha$ are equal to the first k − 1 labels in $\beta$, the pair $(\alpha, \beta)$ can be represented by a (k + 1)-tuple $\gamma = (\gamma_1, \ldots, \gamma_{k+1})$ with $\gamma_1 = \alpha_1$, $\gamma_i = \beta_{i-1} = \alpha_i$ for $2 \leq i \leq k$, and $\gamma_{k+1} = \beta_k$. Because the labels in $\alpha$ and the labels in $\beta$ are increasing, the labels in $\gamma$ are increasing as well. The two graphs $B_{k+1,n}$ and $DL(B_{k,n})$ therefore have the same node sets. There is an edge between two nodes $(\alpha_1, \beta_1)$ and $(\alpha_2, \beta_2)$ of $DL(B_{k,n})$ if $\beta_1 = \alpha_2$. This is equivalent to requiring that the two corresponding (k + 1)-tuples $\gamma_1$ and $\gamma_2$ are neighbors in $B_{k+1,n}$, i.e., that the last k labels of $\gamma_1$ are equal to the first k labels of $\gamma_2$.

The following lemma establishes a useful connection between the chromatic numbers of a directed graph G and its diline graph DL(G).

Lemma 9.9. For the chromatic numbers $\chi(G)$ and $\chi(DL(G))$ of a directed graph G and its diline graph, it holds that
\[ \chi\big(DL(G)\big) \geq \log_2 \chi(G). \]

Proof. Given a c-coloring of DL(G), we show how to construct a $2^c$-coloring of G. The claim of the lemma then follows because this implies that $\chi(G) \leq 2^{\chi(DL(G))}$.

Assume that we are given a c-coloring of DL(G). A c-coloring of the diline graph DL(G) can be seen as a coloring of the edges of G such that no two adjacent edges have the same color. For a node v of G, let $S_v$ be the set of colors of its outgoing edges. Let u and v be two nodes such that G contains a directed edge (u, v) from u to v, and let x be the color of (u, v). Clearly, $x \in S_u$ because (u, v) is an outgoing edge of u. Because adjacent edges have different colors, no outgoing edge (v, w) of v can have color x. Therefore $x \notin S_v$. This implies that $S_u \neq S_v$. We can therefore use these color sets to obtain a vertex coloring of G, i.e., the color of u is $S_u$ and the color of v is $S_v$. Because the number of possible subsets of [c] is $2^c$, this yields a $2^c$-coloring of G.
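A small sketch of the diline operator lets one sanity-check Lemma 9.8 on small instances, together with the B(k, n) helper above (it compares only node and edge counts, not actual isomorphism; all names are illustrative):

def DL(nodes, edges):
    """Diline graph: its nodes are the edges of G, and (e, f) is an edge
    iff edge e ends where edge f starts."""
    new_edges = [(e, f) for e in edges for f in edges if e[1] == f[0]]
    return edges, new_edges

for k in (1, 2, 3):
    dl_nodes, dl_edges = DL(*B(k, 5))
    bk1_nodes, bk1_edges = B(k + 1, 5)
    print(len(dl_nodes) == len(bk1_nodes), len(dl_edges) == len(bk1_edges))
    # prints: True True (three times)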

Let $\log^{(i)} x$ be the i-fold application of the base-2 logarithm to x:

\[ \log^{(1)} x = \log_2 x, \qquad \log^{(i+1)} x = \log_2\big(\log^{(i)} x\big). \]

Remember from Chapter 2 that

\[ \log^* x = 1 \text{ if } x \leq 2, \qquad \log^* x = 1 + \min\{i : \log^{(i)} x \leq 2\} \text{ otherwise}. \]
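This definition translates directly into a tiny Python helper (a sketch; math.log2 is available in Python 3.3+):

from math import log2

def log_star(x):
    if x <= 2:
        return 1
    i, y = 0, x
    while y > 2:                 # apply log_2 until the value drops to <= 2
        y = log2(y)
        i += 1
    return 1 + i                 # 1 + min{i : log^(i) x <= 2}

print(log_star(2), log_star(16), log_star(65536))   # 1 3 4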

For the chromatic number of $B_{k,n}$, we obtain

Lemma 9.10. For all $n \geq 1$, $\chi(B_{1,n}) = n$. Further, for $n \geq k \geq 2$, $\chi(B_{k,n}) \geq \log^{(k-1)} n$.

Proof. For k = 1, $B_{k,n}$ is the complete graph on n nodes with a directed edge from node i to node j iff i < j. Therefore, $\chi(B_{1,n}) = n$. For $k \geq 2$, the claim follows by induction and Lemmas 9.8 and 9.9.

This finally allows us to state a lower bound on the number of rounds needed to color a directed ring with 3 colors.

Theorem 9.11. Every deterministic, distributed algorithm to color a directed ring with 3 or fewer colors needs at least $(\log^* n)/2 - 1$ rounds.

Proof. Using the connection between $B_{k,n}$ and the neighborhood graph for directed rings, it suffices to show that $\chi(B_{2r+1,n}) > 3$ for all $r < (\log^* n)/2 - 1$. From Lemma 9.10, we know that $\chi(B_{2r+1,n}) \geq \log^{(2r)} n$. To obtain $\log^{(2r)} n \leq 2$, we need $r \geq (\log^* n)/2 - 1$. Because $\log_2 3 < 2$, we therefore have $\log^{(2r)} n > 3$ if $r < (\log^* n)/2 - 1$.

Corollary 9.12. Every deterministic, distributed algorithm to compute an MIS of a directed ring needs at least $\log^* n/2 - O(1)$ rounds.

Remarks:
• It is straightforward to see that also for a constant c > 3, the number of rounds needed to color a ring with c or fewer colors is $\log^* n/2 - O(1)$.
• There basically (up to additive constants) is a gap of a factor of 2 between the $\log^* n + O(1)$ upper bound of Chapter 2 and the $\log^* n/2 - O(1)$ lower bound of this chapter. It is possible to show that the lower bound is tight, even for undirected rings (for directed rings, this will be part of the exercises).
• The presented lower bound is due to Nathan Linial. The lower bound is also true for randomized algorithms. The generalization to randomized algorithms was done by Moni Naor.
• Alternatively, the lower bound can also be presented as an application of Ramsey’s theory. Ramsey’s theory is best introduced with an example: Assume you host a party, and you want to invite people such that there are no three people who mutually know each other, and no three people who are mutual strangers. How many people can you invite? This is an example of Ramsey’s theorem, which says that for any given integer c, and any given integers $n_1, \ldots, n_c$, there is a Ramsey number $R(n_1, \ldots, n_c)$, such that if the edges of a complete graph with $R(n_1, \ldots, n_c)$ nodes are colored with c different colors, then for some color i the graph contains some complete subgraph of color i of size $n_i$. The special case in the party example is looking for R(3, 3).

• Ramsey theory is more general, as it deals with hyperedges. A normal edge is essentially a subset of two nodes; a hyperedge is a subset of k nodes. The party example can be explained in this context: We have (hyper)edges of the form {i, j}, with 1 ≤ i, j ≤ n. Choosing n sufficiently large, coloring the edges with two colors must exhibit a set S of 3 edges {i, j} ⊂ {v1, v2, v3}, such that all edges in S have the same color. To prove our coloring lower bound using Ramsey theory, we form all hyperedges of size k = 2r + 1, and color them with 3 colors. Choosing n sufficiently large, there must be a set S = {v1, . . . , vk+1} of k + 1 identifiers, such that all k + 1 hyperedges consisting of k nodes from S have the same color. Note that both {v1, . . . , vk} and {v2, . . . , vk+1} are in the set S, hence there will be two neighboring views with the same color. Ramsey theory shows that in this case n will grow as a power tower (tetration) in k. Thus, if n is so large that k is smaller than some function growing like log∗ n, the coloring algorithm cannot be correct.
• The neighborhood graph concept can be used more generally to study distributed graph coloring. It can for instance be used to show that with a single round (every node sends its identifier to all neighbors) it is possible to color a graph with $(1 + o(1))\Delta^2 \ln n$ colors, and that every one-round algorithm needs at least $\Omega(\Delta^2 / \log^2 \Delta + \log \log n)$ colors.

• One may also extend the proof to other problems; for instance, one may show that a constant approximation of the minimum dominating set problem on unit disk graphs costs at least log-star time.
• Using r-hop views and the fact that nodes with equal r-hop views have to make the same decisions is the basic principle behind almost all locality lower bounds (in fact, we are not aware of a locality lower bound that does not use this principle). Using this basic technique (but a completely different proof otherwise), it is for instance possible to show that computing an MIS (and many other problems) in a general graph requires at least $\Omega(\sqrt{\log n / \log \log n})$ and $\Omega(\log \Delta / \log \log \Delta)$ rounds.

Chapter 10

Social Networks

Distributed computing is applicable in various contexts. This lecture exemplarily studies one of these contexts, social networks, an area of study whose origins date back a century. To give you a first impression, consider Figure 10.1.

[Figure: Zachary's karate club social graph (Zachary 1977). Interactions in a karate club were recorded for two years; during the observation, an administrator/instructor conflict developed and the club broke into two. The split ran along the administrator/instructor minimum cut (!).]

Figure 10.1: This graph shows the social relations between the members of a karate club, studied by anthropologist Wayne Zachary in the 1970s. Two people (nodes) stand out, the instructor and the administrator of the club; both happen to have many friends among club members. At some point, a dispute caused the club to split into two. Can you predict how the club partitioned? (If not, just search the Internet for Zachary and Karate.)


10.1 Small-World Networks

Back in 1929, Frigyes Karinthy published a volume of short stories that postulated that the world was “shrinking” because human beings were connected more and more. Some claim that he was inspired by radio network pioneer Guglielmo Marconi’s 1909 Nobel Prize speech. Despite physical distance, the growing density of human “networks” renders the actual social distance smaller and smaller. As a result, it is believed that any two individuals can be connected through at most five (or so) acquaintances, i.e., within six hops.

• The topic was hot in the 1960s. For instance, in 1964, Marshall McLuhan coined the metaphor “Global Village”. He wrote: “As electrically contracted, the globe is no more than a village”. He argues that due to the almost instantaneous reaction times of new (“electric”) technologies, each individual inevitably feels the consequences of his actions and thus automatically deeply participates in the global society. McLuhan understood what we now can directly observe – real and virtual world are moving together. He realized that the transmission medium, rather than the transmitted information, is at the core of change, as expressed by his famous phrase “the medium is the message”.

• This idea has been followed ardently in the 1960s by several sociologists, first by Michael Gurevich, later by Stanley Milgram. Milgram wanted to know the average path length between two “random” humans, by using various experiments, generally using randomly chosen individuals from the US Midwest as starting points, and a stockbroker living in a suburb of Boston as target. The starting points were given name, address, occupation, plus some personal information about the target. They were asked to send a letter to the target. However, they were not allowed to directly send the letter; rather, they had to pass it to somebody they knew on a first-name basis and who they thought had a higher probability of knowing the target person. This process was repeated until somebody knew the target person and could deliver the letter. Shortly after starting the experiment, letters were received. Most letters were lost during the process, but if they arrived, the average path length was about 5.5. The observation that the entire population is connected by short acquaintance chains was later popularized by the terms “six degrees of separation” and “small world”.

• Statisticians tried to explain Milgram’s experiments, by essentially giving network models that allowed for short diameters, i.e., each node is connected to each other node by only a few hops. Until today there is a thriving research community in statistical physics that tries to understand network properties that allow for “small world” effects.

• One of the keywords in this area is power-law graphs, networks where node degrees are distributed according to a power-law distribution, i.e., the number of nodes with degree δ is proportional to $\delta^{-\alpha}$, for some α > 1. Such power-law graphs have been witnessed in many application areas, apart from social networks also in the web, or in biology or physics.

• Obviously, two power-law graphs might look and behave completely differently, even if α and the number of edges are exactly the same.

One well-known model towards this end is the Watts-Strogatz model. Watts and Strogatz argued that social networks should be modeled by a combination of two networks: As the basis we take a network that has a large cluster coefficient . . .

Definition 10.1. The cluster coefficient of a network is defined by the probability that two friends of a node are friends as well, averaged over all the nodes.

. . . , then we augment such a graph with random links: every node for instance points to a constant number of other nodes, chosen uniformly at random. This augmentation represents acquaintances that connect nodes to parts of the network that would otherwise be far away.

Remarks:

• Without further information, knowing the cluster coefficient is of questionable value: Assume we arrange the nodes in a grid. Technically, if we connect each node to its four closest neighbors, the graph has cluster coefficient 0, since there are no triangles; if we instead connect each node with its eight closest neighbors, the cluster coefficient is 3/7. The cluster coefficient is quite different, even though both networks have similar characteristics (the sketch below checks both numbers).
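The following minimal Python sketch computes the cluster coefficient of Definition 10.1 for the two grid variants from the remark above. The helper names (clustering_coefficient, torus_grid) are ours, not from the notes, and we use a torus to avoid boundary effects:

```python
from itertools import combinations

def clustering_coefficient(adj):
    """Fraction of neighbor pairs that are themselves connected
    (Definition 10.1), averaged over all nodes."""
    total = 0.0
    for v, nbrs in adj.items():
        pairs = list(combinations(nbrs, 2))
        if not pairs:
            continue
        total += sum(1 for a, b in pairs if b in adj[a]) / len(pairs)
    return total / len(adj)

def torus_grid(m, diagonal=False):
    """m x m torus; each node gets its 4 closest neighbors,
    or its 8 closest neighbors if diagonal=True."""
    steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    if diagonal:
        steps += [(-1, -1), (-1, 1), (1, -1), (1, 1)]
    return {(i, j): {((i + di) % m, (j + dj) % m) for di, dj in steps}
            for i in range(m) for j in range(m)}

print(clustering_coefficient(torus_grid(10)))                 # 0.0, no triangles
print(clustering_coefficient(torus_grid(10, diagonal=True)))  # 3/7 = 0.428...
```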

This is interesting, but not enough to really understand what is going on. For Milgram’s experiments to work, it is not sufficient to connect the nodes in a certain way. In addition, the nodes themselves need to know how to forward a message to one of their neighbors, even though they cannot know whether that neighbor is really closer to the target. In other words, nodes are not just following physical laws, but they make decisions themselves.

In contrast to the mathematicians that worked on the problem earlier, Jon Kleinberg understood that Milgram’s experiment essentially shows that social networks are “navigable”, and that one can only explain it in terms of a greedy routing. In particular, Kleinberg set up an artificial network with nodes on a grid topology, plus some additional random links per node. In a quantitative study he showed that the random links need a specific distance distribution to allow for efficient greedy routing. This distribution marks the sweet spot for any navigable network.

Definition 10.2 (Augmented Grid). We take n = m² nodes (i, j) ∈ V = {1, . . . , m}² that are identified with the lattice points on an m × m grid. We define the distance between two nodes (i, j) and (k, ℓ) as d((i, j), (k, ℓ)) = |k − i| + |ℓ − j|, the distance between them on the m × m lattice. The network is modeled using a parameter α ≥ 0. Each node u has a directed edge to every lattice neighbor. These are the local contacts of a node. In addition, each node also has an additional random link (the long-range contact). For all u and v, the long-range contact of u points to node v with probability proportional to d(u, v)^{−α}, i.e., with probability d(u, v)^{−α} / Σ_{w∈V\{u}} d(u, w)^{−α}. Figure 10.2 illustrates the model.

Figure 10.2: Augmented grid with m = 6
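A small Python sketch of Definition 10.2 (the names augmented_grid and dist are ours, and the quadratic sampling loop is deliberately naive for clarity, not efficiency):

```python
import random

def augmented_grid(m, alpha):
    """Definition 10.2: every node points to its lattice neighbors
    (local contacts) and to one long-range contact v, chosen with
    probability proportional to d(u, v)^(-alpha)."""
    nodes = [(i, j) for i in range(1, m + 1) for j in range(1, m + 1)]

    def dist(u, v):                  # lattice (Manhattan) distance
        return abs(u[0] - v[0]) + abs(u[1] - v[1])

    out = {}
    for u in nodes:
        local = [v for v in nodes if dist(u, v) == 1]
        others = [v for v in nodes if v != u]
        weights = [dist(u, v) ** (-alpha) for v in others]
        out[u] = local + random.choices(others, weights=weights)
    return out
```

Note that for α = 0 all weights equal 1, so the long-range contacts are chosen uniformly at random, matching the first remark below.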

Remarks:

• The network model has the following geographic interpretation: nodes (individuals) live on a grid and know their neighbors on the grid. Further, each node has some additional acquaintances throughout the network.

• The parameter α controls how the additional neighbors are distributed across the grid. If α = 0, long-range contacts are chosen uniformly at random (as in the Watts-Strogatz model). As α increases, long-range contacts become shorter on average. In the extreme case, if α → ∞, all long-range contacts are to immediate neighbors on the grid.

• It can be shown that as long as α ≤ 2, the diameter of the resulting graph is polylogarithmic in n (polynomial in log n) with high probability. In particular, if the long-range contacts are chosen uniformly at random (α = 0), the diameter is O(log n).

Since the augmented grid contains random links, we do not know anything for sure about how the random links are distributed. In theory, all links could point to the same node! However, this is almost certainly not the case. Formally this is captured by the term with high probability.

Definition 10.3 (With High Probability). Some probabilistic event is said to occur with high probability (w.h.p.), if it happens with a probability p ≥ 1 − 1/n^c, where c is a constant. The constant c may be chosen arbitrarily, but it is considered constant with respect to Big-O notation.

Remarks:

• For instance, a running time bound of c log n or e^c! · log n + 5000c with probability at least 1 − 1/n^c would be O(log n) w.h.p., but a running time of n^c would not be O(n) w.h.p. since c might also be 50.

• This definition is very powerful, as any polynomial (in n) number of statements that hold w.h.p. also hold w.h.p. simultaneously, regardless of any dependencies between the random variables!

Theorem 10.4. The diameter of the augmented grid with α = 0 is O(log n) with high probability.

Proof Sketch. For simplicity, we will only show that we can reach a node w starting from some node v. However, it can be shown that (essentially) each of the intermediate claims holds with high probability, which then by means of the union bound yields that all of the claims hold simultaneously with high probability for all pairs of nodes.

Let N_g be the ⌈log n⌉-hop neighborhood of v on the grid, containing Ω(log² n) nodes. Each of the nodes in N_g has a random link, probably leading to distant parts of the graph. As long as we have reached only o(n) nodes, any new random link will with probability 1 − o(1) lead to a node for which none of its grid neighbors has been visited yet. Thus, in expectation we find almost |N_g| new nodes whose neighbors are “fresh”. Using their grid links, we will reach (4 − o(1))|N_g| more nodes within one more hop. If bad luck strikes, it could still happen that many of these links lead to a few nodes, to already visited nodes, or to nodes that are very close to each other. But that is very unlikely, as we have lots of random choices! Indeed, it can be shown that not only in expectation, but with high probability, (5 − o(1))|N_g| many nodes are reached this way.

Because all these shiny new nodes have (so far unused) random links, we can repeat this reasoning inductively, implying that the number of nodes grows by (at least) a constant factor for every two hops. Thus, after O(log n) hops, we will have reached n/log n nodes (which is still small compared to n). Finally, consider the expected number of links from these nodes that enter the (log n)-neighborhood of some target node w with respect to the grid. Since this neighborhood consists of Ω(log² n) nodes, in expectation Ω(log n) links come close enough to w. This is large enough to almost guarantee that this happens. Summing everything up, we still used merely O(log n) hops in total to get from v to w.

This shows that for α = 0 (and in fact for all α ≤ 2), the resulting network has a small diameter. Recall however that we also wanted the network to be navigable. For this, we consider a simple greedy routing strategy (Algorithm 39).

Algorithm 39 Greedy Routing
1: while not at destination do
2:   go to a neighbor which is closest to destination (considering grid distance only)
3: end while

Lemma 10.5. In the augmented grid, Algorithm 39 finds a routing path of length at most 2(m − 1) ∈ O(√n).

Proof. Because of the grid links, there is always a neighbor which is closer to the destination. Since with each hop we reduce the distance to the target at least by one in one of the two grid dimensions, we will reach the destination within 2(m − 1) steps.
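Building on the augmented_grid sketch above (again with helper names of our own choosing), Algorithm 39 can be written in a few lines of Python:

```python
def greedy_route(out, source, target):
    """Algorithm 39: always move to the out-neighbor closest to the
    target in grid distance; returns the whole path."""
    def dist(u, v):
        return abs(u[0] - v[0]) + abs(u[1] - v[1])

    path = [source]
    while path[-1] != target:
        path.append(min(out[path[-1]], key=lambda v: dist(v, target)))
    return path

g = augmented_grid(30, alpha=2)
print(len(greedy_route(g, (1, 1), (30, 30))) - 1)  # number of hops
```

Termination is exactly the argument of Lemma 10.5: some local contact always decreases the grid distance, so the minimum over all out-neighbors does too. Averaging the hop count over many source-target pairs and several values of α is a simple way to observe the sweet spot at α = 2 empirically.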

This is not really what Milgram’s experiment promises. We want to know how much the additional random links speed up the process. To this end, we first need to understand how likely it is that two nodes u and v are connected by a random link in terms of n and their distance d(u, v). Lemma 10.6. Node u’s random link leads to a node v with probability

• Θ(1/(d(u, v)^α m^{2−α})) if α < 2,

• Θ(1/(d(u, v)² log n)) if α = 2,

• Θ(1/d(u, v)^α) if α > 2.

Moreover, if α > 2, the probability to see a link of length at least d is in Θ(1/d^{α−2}).

Proof. For α ≠ 2, we have that

\sum_{w \in V \setminus \{u\}} \frac{1}{d(u,w)^{\alpha}} \in \sum_{r=1}^{m} \frac{\Theta(r)}{r^{\alpha}} = \Theta\left( \int_{1}^{m} \frac{1}{r^{\alpha-1}} \, dr \right) = \Theta\left( \left[ \frac{r^{2-\alpha}}{2-\alpha} \right]_{1}^{m} \right).

If α < 2, this gives Θ(m^{2−α}); if α > 2, it is in Θ(1). If α = 2, we get

\sum_{w \in V \setminus \{u\}} \frac{1}{d(u,w)^{\alpha}} \in \sum_{r=1}^{m} \frac{\Theta(r)}{r^{2}} = \Theta(1) \cdot \sum_{r=1}^{m} \frac{1}{r} = \Theta(\log m) = \Theta(\log n).

Multiplying with d(u, v)^α yields the first three bounds. For the last statement, compute

\sum_{\substack{w \in V \\ d(u,w) \geq d}} \Theta(1/d(u,w)^{\alpha}) = \Theta\left( \int_{d}^{m} \frac{r}{r^{\alpha}} \, dr \right) = \Theta\left( \left[ \frac{r^{2-\alpha}}{2-\alpha} \right]_{d}^{m} \right) = \Theta(1/d^{\alpha-2}).

Remarks:

• For α ≠ 2, this is bad news for the greedy routing algorithm, as it will take n^{Ω(1)} = m^{Ω(1)} expected steps to reach the destination. This is disappointing, we were hoping for something polylogarithmic.

• If α < 2, within distance m^{(2−α)/3} of the target there are m^{2(2−α)/3} many nodes. Thus it takes Θ(m^{(2−α)/3}) links in expectation to find a link that comes that close to the destination. Without finding such a link, we have to go at least this far using grid links only.

• If α > 2, it takes Θ(m^{(α−2)/(α−1)}) steps until we see a link of length at least m^{1/(α−1)} in expectation. Without such links, it takes at least m/m^{1/(α−1)} = m^{(α−2)/(α−1)} steps to travel a distance of m.

• Any algorithm that uses only the information on long-range contacts that it can collect at the so far visited nodes cannot be faster.

• However, the case α = 2 looks more promising.

Definition 10.7 (Phase). Consider routing from a node u to a node v and assume that we are at some intermediate node w. We say that we are in phase j at node w if the lattice distance d(w, v) to the target node v satisfies 2^j < d(w, v) ≤ 2^{j+1}.

Remarks:

• Enumerating the phases in decreasing order is useful, as notation becomes less cumbersome.

• There are ⌈log m⌉ ∈ O(log n) phases.

Lemma 10.8. Assume that we are in phase j at node w when routing from u to v. The probability for getting to phase j − 1 in one step is at least Ω(1/log n).

Proof. Let B_j be the set of nodes x with d(x, v) ≤ 2^j. We get from phase j to phase j − 1 if the long-range contact of node w points to some node in B_j. Note that we always make progress while following the greedy routing path. Therefore, we have not seen node w before and the long-range contact of w points to a random node that is independent of anything seen on the path from u to w.

For all nodes x ∈ B_j, we have d(w, x) ≤ d(w, v) + d(x, v) ≤ 2^{j+1} + 2^j < 2^{j+2}. Hence, for each node x ∈ B_j, the probability that the long-range contact of w points to x is Ω(1/(2^{2j+4} log n)). Further, the number of nodes in B_j is at least (2^j)²/2 = 2^{2j−1}. Hence, the probability that some node in B_j is the long-range contact of w is at least

\Omega\left( |B_j| \cdot \frac{1}{2^{2j+4} \log n} \right) = \Omega\left( \frac{2^{2j-1}}{2^{2j+4} \log n} \right) = \Omega\left( \frac{1}{\log n} \right).

Theorem 10.9. Consider the greedy routing path from a node u to a node v on an augmented grid with parameter α = 2. The expected length of the path is O(log² n).

Proof. We already observed that the total number of phases is O(log n) (the distance to the target is halved when we go from phase j to phase j − 1). At each point during the routing process, the probability of proceeding to the next phase is at least Ω(1/log n). Let X_j be the number of steps in phase j. Because the probability for ending the phase is Ω(1/log n) in each step, in expectation we need O(log n) steps to proceed to the next phase, i.e., E[X_j] ∈ O(log n). Let X = Σ_j X_j be the total number of steps of the routing process. By linearity of expectation, we have

E[X] = \sum_{j} E[X_j] \in O(\log^2 n).

10.2 Propagation Studies

In networks, nodes may influence each other’s behavior and decisions. There are many applications where nodes influence their neighbors, e.g., they may impact their opinions, they may bias what products they buy, or they may pass on a disease.

On a beach (modeled as a line segment), it is best to place an ice cream stand right in the middle of the segment, because you will be able to “control” the beach most easily. What about the second stand, where should it settle? The answer generally depends on the model, but assuming that people will buy ice cream from the stand that is closer, it should go right next to the first stand.

Rumors can spread astoundingly fast through social networks. Traditionally this happens by word of mouth, but with the emergence of the Internet and its possibilities new ways of rumor propagation are available. People write email, use instant messengers, or publish their thoughts in a blog. Many factors influence the dissemination of rumors. It is especially important where in a network a rumor is initiated and how convincing it is. Furthermore, the underlying network structure decides how fast the information can spread and how many people are reached. More generally, we can speak of diffusion of information in networks. The analysis of these diffusion processes can be useful for viral marketing, e.g., to target a few influential people to initiate marketing campaigns. A company may wish to distribute the rumor of a new product via the most influential individuals in popular social networks such as Facebook. A second company might want to introduce a competing product and hence has to select where to seed the information to be disseminated. Rumor spreading is quite similar to our ice cream stand problem.

More formally, we may study propagation problems in graphs. We are given a graph and two players. Let the first player choose a seed node u_1; afterwards let the second player choose a seed node u_2, with u_2 ≠ u_1. The goal of the game is to maximize the number of nodes that are closer to one’s own seed node. In many graphs it is an advantage to choose first. In a star graph for instance the first player can choose the center node of the star, controlling all but one node. In some other graphs, the second player can at least score even. But is there a graph where the second player has an advantage?

Theorem 10.10. In a two player rumor game where both players select one node to initiate their rumor in the graph, the first player does not always win.

Proof. See Figure 10.3 for an example where the second player will always win, regardless of the decision of the first player. If the first player chooses the node x_0 in the center, the second player can select x_1. Choice x_1 will be outwitted by x_2, and x_2 itself can be answered by z_1. All other strategies are either symmetric, or even less promising for the first player.

Figure 10.3: Counter example.

Chapter 11

Synchronization

So far, we have mainly studied synchronous algorithms. Generally, asynchronous algorithms are more difficult to obtain. Also it is substantially harder to reason about asynchronous algorithms than about synchronous ones. For instance, computing a BFS tree (Chapter 4) efficiently requires much more work in an asynchronous system. However, many real systems are not synchronous, and we therefore have to design asynchronous algorithms. In this chapter, we will look at general simulation techniques, called synchronizers, that allow running synchronous algorithms in asynchronous environments.

11.1 Basics

A synchronizer generates sequences of clock pulses at each node of the network satisfying the condition given by the following definition.

Definition 11.1 (valid clock pulse). We call a clock pulse generated at a node v valid if it is generated after v received all the messages of the synchronous algorithm sent to v by its neighbors in the previous pulses.

Given a mechanism that generates the clock pulses, a synchronous algorithm is turned into an asynchronous algorithm in an obvious way: As soon as the ith clock pulse is generated at node v, v performs all the actions (local computations and sending of messages) of round i of the synchronous algorithm.

Theorem 11.2. If all generated clock pulses are valid according to Definition 11.1, the above method provides an asynchronous algorithm that behaves exactly the same way as the given synchronous algorithm.

Proof. When the ith pulse is generated at a node v, v has sent and received exactly the same messages and performed the same local computations as in the first i − 1 rounds of the synchronous algorithm.

The main problem when generating the clock pulses at a node v is that v cannot know what messages its neighbors are sending to it in a given synchronous round. Because there are no bounds on link delays, v cannot simply wait “long enough” before generating the next pulse. In order to satisfy Definition 11.1, nodes have to send additional messages for the purpose of synchronization. The total

complexity of the resulting asynchronous algorithm depends on the overhead introduced by the synchronizer. For a synchronizer S, let T(S) and M(S) be the time and message complexities of S for each generated clock pulse. As we will see, some of the synchronizers need an initialization phase. We denote the time and message complexities of the initialization by T_init(S) and M_init(S), respectively. If T(A) and M(A) are the time and message complexities of the given synchronous algorithm A, the total time and message complexities T_tot and M_tot of the resulting asynchronous algorithm then become

T_tot = T_init(S) + T(A) · (1 + T(S)) and M_tot = M_init(S) + M(A) + T(A) · M(S),

respectively.

Remarks:

• Because the initialization only needs to be done once for each network, we will mostly be interested in the overheads T(S) and M(S) per round of the synchronous algorithm.

Definition 11.3 (Safe Node). A node v is safe with respect to a certain clock pulse if all messages of the synchronous algorithm sent by v in that pulse have already arrived at their destinations.

Lemma 11.4. If all neighbors of a node v are safe with respect to the current clock pulse of v, the next pulse can be generated for v.

Proof. If all neighbors of v are safe with respect to a certain pulse, v has received all messages of the given pulse. Node v therefore satisfies the condition of Definition 11.1 for generating a valid next pulse.

Remarks:

• In order to detect safety, we require that all algorithms send acknowledgements for all received messages. As soon as a node v has received an acknowledgement for each message that it has sent in a certain pulse, it knows that it is safe with respect to that pulse. Note that sending acknowledgements does not increase the asymptotic time and message complexities.

11.2 The Local Synchronizer α

Algorithm 40 Synchronizer α (at node v)
1: wait until v is safe
2: send SAFE to all neighbors
3: wait until v receives SAFE messages from all neighbors
4: start new pulse

Synchronizer α is very simple. It does not need an initialization. Using acknowledgements, each node eventually detects that it is safe. It then reports this fact directly to all its neighbors. Whenever a node learns that all its neighbors are safe, a new pulse is generated. Algorithm 40 formally describes the synchronizer α.

Theorem 11.5. The time and message complexities of synchronizer α per syn- chronous round are

T(α) = O(1) and M(α) = O(m).

Proof. Communication is only between neighbors. As soon as all neighbors of a node v become safe, v knows of this fact after one additional time unit. For every clock pulse, synchronizer α sends at most four additional messages over every edge: each of the two incident nodes may have to acknowledge a message and report its safety.

Remarks:

• Synchronizer α was presented in a framework, mostly set up to have a common standard to discuss different synchronizers. Without the framework, synchronizer α can be explained more easily:

1. Send message to all neighbors, include round information i and actual data of round i (if any). 2. Wait for message of round i from all neighbors, and go to next round.

• Although synchronizer α allows for simple and fast synchronization, it produces awfully many messages. Can we do better? Yes.

11.3 The Global Synchronizer β

Algorithm 41 Synchronizer β (at node v)
1: wait until v is safe
2: wait until v receives SAFE messages from all its children in T
3: if v ≠ ℓ then
4:   send SAFE message to parent in T
5:   wait until PULSE message received from parent in T
6: end if
7: send PULSE message to children in T
8: start new pulse

Synchronizer β needs an initialization that computes a leader node ℓ and a spanning tree T rooted at ℓ. As soon as all nodes are safe, this information is propagated to ℓ by a convergecast. The leader then broadcasts this information to all nodes. The details of synchronizer β are given in Algorithm 41.

Theorem 11.6. The time and message complexities of synchronizer β per synchronous round are

T(β) = O(diameter(T)) ≤ O(n) and M(β) = O(n).

The time and message complexities for the initialization are

T_init(β) = O(n) and M_init(β) = O(m + n log n).

Proof. Because the diameter of T is at most n − 1, the convergecast and the broadcast together take at most 2n − 2 time units. Per clock pulse, the synchronizer sends at most 2n − 2 synchronization messages (one in each direction over each edge of T). With an improvement (due to Awerbuch) of the GHS algorithm (Algorithm 15) that you saw in Chapter 4, it is possible to construct an MST in time O(n) with O(m + n log n) messages in an asynchronous environment. Once the tree is computed, it can be made rooted in time O(n) with O(n) messages.

Remarks:

• We now have a time-efficient synchronizer (α) and a message-efficient synchronizer (β), so it is only natural to ask whether we can have the best of both worlds. And, indeed, we can. What is that synchronizer called? Quite obviously: γ.

11.4 The Hybrid Synchronizer γ

Figure 11.1: A cluster partition of a network: The dashed cycles specify the clusters, cluster leaders are black, the solid edges are the edges of the intracluster trees, and the bold solid edges are the intercluster edges

Synchronizer γ can be seen as a combination of synchronizers α and β. In the initialization phase, the network is partitioned into clusters of small diameter. In each cluster, a leader node is chosen and a BFS tree rooted at this leader node is computed. These trees are called the intracluster trees. Two clusters C1 and C2 are called neighboring if there are nodes u ∈ C1 and v ∈ C2 for which (u, v) ∈ E. For every two neighboring clusters, an intercluster edge is chosen, which will serve for communication between these clusters. Figure 11.1 illustrates this partitioning into clusters. We will discuss the details of how to construct such a partition in the next section. We say that a cluster is safe if all its nodes are safe.

Synchronizer γ works in two phases. In a first phase, synchronizer β is applied separately in each cluster by using the intracluster trees. Whenever the leader of a cluster learns that its cluster is safe, it reports this fact to all the nodes in the cluster as well as to the leaders of the neighboring clusters. Now, the nodes of the cluster enter the second phase where they wait until all the neighboring clusters are known to be safe and then generate the next pulse. Hence, we essentially apply synchronizer α between clusters. A detailed description is given by Algorithm 42.

Algorithm 42 Synchronizer γ (at node v)
1: wait until v is safe
2: wait until v receives SAFE messages from all children in intracluster tree
3: if v is not cluster leader then
4:   send SAFE message to parent in intracluster tree
5:   wait until CLUSTERSAFE message received from parent
6: end if
7: send CLUSTERSAFE message to all children in intracluster tree
8: send NEIGHBORSAFE message over all intercluster edges of v
9: wait until v receives NEIGHBORSAFE messages from all adjacent intercluster edges and all children in intracluster tree
10: if v is not cluster leader then
11:   send NEIGHBORSAFE message to parent in intracluster tree
12:   wait until PULSE message received from parent
13: end if
14: send PULSE message to children in intracluster tree
15: start new pulse

Theorem 11.7. Let m_C be the number of intercluster edges and let k be the maximum cluster radius (i.e., the maximum distance of a leaf to its cluster leader). The time and message complexities of synchronizer γ are

T(γ) = O(k) and M(γ) = O(n + m_C).

Proof. We ignore acknowledgements, as they do not affect the asymptotic complexities. Let us first look at the number of messages. Over every intracluster tree edge, exactly one SAFE message, one CLUSTERSAFE message, one NEIGHBORSAFE message, and one PULSE message is sent. Further, one NEIGHBORSAFE message is sent over every intercluster edge. Because there are fewer than n intracluster tree edges, the total message complexity therefore is at most 4n + 2m_C = O(n + m_C). For the time complexity, note that the depth of each intracluster tree is at most k. On each intracluster tree, two convergecasts (the SAFE and NEIGHBORSAFE messages) and two broadcasts (the CLUSTERSAFE and PULSE messages) are performed. The time complexity for this is at most 4k. There is one more time unit needed to send the NEIGHBORSAFE messages over the intercluster edges. The total time complexity therefore is at most 4k + 1 = O(k).

11.5 Network Partition

We will now look at the initialization phase of synchronizer γ. Algorithm 43 describes how to construct a partition into clusters that can be used for synchronizer γ. In Algorithm 43, B(v, r) denotes the ball of radius r around v, i.e., B(v, r) = {u ∈ V : d(u, v) ≤ r}, where d(u, v) is the hop distance between u and v. The algorithm has a parameter ρ > 1. The clusters are constructed sequentially. Each cluster is started at an arbitrary node that has not been included in a cluster. Then the cluster radius is grown as long as the cluster grows by a factor more than ρ.

Algorithm 43 Cluster construction
1: while unprocessed nodes do
2:   select an arbitrary unprocessed node v;
3:   r := 0;
4:   while |B(v, r + 1)| > ρ|B(v, r)| do
5:     r := r + 1
6:   end while
7:   makeCluster(B(v, r)) // all nodes in B(v, r) are now processed
8: end while
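A centralized Python sketch of Algorithm 43 (function names are ours; we interpret the balls B(v, r) as taken in the subgraph induced by the still unprocessed nodes, so that the clusters form a partition):

```python
from collections import deque

def ball(adj, v, r, allowed):
    """B(v, r) restricted to 'allowed' nodes: BFS up to depth r."""
    seen, frontier = {v}, deque([(v, 0)])
    while frontier:
        u, d = frontier.popleft()
        if d == r:
            continue
        for w in adj[u]:
            if w in allowed and w not in seen:
                seen.add(w)
                frontier.append((w, d + 1))
    return seen

def cluster_partition(adj, rho):
    """Algorithm 43: grow a ball around an arbitrary unprocessed node
    while it still grows by a factor larger than rho."""
    unprocessed = set(adj)
    clusters = []
    while unprocessed:
        v = next(iter(unprocessed))
        r = 0
        while (len(ball(adj, v, r + 1, unprocessed))
               > rho * len(ball(adj, v, r, unprocessed))):
            r += 1
        cluster = ball(adj, v, r, unprocessed)
        clusters.append(cluster)
        unprocessed -= cluster          # all ball nodes are now processed
    return clusters
```

Since every growth step multiplies the ball size by more than ρ, the inner loop runs at most log_ρ n times, matching the radius bound of Theorem 11.8 below.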

Remarks:

• The algorithm allows a trade-off between the cluster diameter k (and thus the time complexity) and the number of intercluster edges m_C (and thus the message complexity). We will quantify the possibilities in the next section.

• Two very simple partitions would be to make a cluster out of every single node or to make one big cluster that contains the whole graph. We then get synchronizers α and β as special cases of synchronizer γ.

Theorem 11.8. Algorithm 43 computes a partition of the network graph into clusters of radius at most log_ρ n. The number of intercluster edges is at most (ρ − 1) · n.

Proof. The radius of a cluster is initially 0 and only grows as long as the cluster grows by a factor larger than ρ. Since there are only n nodes in the graph, this can happen at most log_ρ n times. To count the number of intercluster edges, observe that an edge can only become an intercluster edge if it connects a node at the boundary of a cluster with a node outside a cluster. Consider a cluster C of size |C|. We know that C = B(v, r) for some v ∈ V and r ≥ 0. Further, we know that |B(v, r + 1)| ≤ ρ · |B(v, r)|. The number of nodes adjacent to cluster C is therefore at most |B(v, r + 1) \ B(v, r)| ≤ ρ · |C| − |C|. Because there is only one intercluster edge connecting two clusters by definition, the number of intercluster edges adjacent to C is at most (ρ − 1) · |C|. Summing over all clusters, we get that the total number of intercluster edges is at most (ρ − 1) · n.

Corollary 11.9. Using ρ = 2, Algorithm 43 computes a clustering with cluster radius at most log₂ n and with at most n intercluster edges.

Corollary 11.10. Using ρ = n^{1/k}, Algorithm 43 computes a clustering with cluster radius at most k and at most O(n^{1+1/k}) intercluster edges.

Remarks:

• Algorithm 43 describes a centralized construction of the partitioning of the graph. For ρ ≥ 2, the clustering can be computed by an asynchronous distributed algorithm in time O(n) with O(m + n log n) (reasonably sized) messages (showing this will be part of the exercises).

• It can be shown that the trade-off between cluster radius and number of intercluster edges of Algorithm 43 is asymptotically optimal. There are graphs for which every clustering into clusters of radius at most k requires n^{1+c/k} intercluster edges for some constant c.

The above remarks lead to a complete characterization of the complexity of synchronizer γ.

Corollary 11.11. The time and message complexities of synchronizer γ per synchronous round are T(γ) = O(k) and M(γ) = O(n^{1+1/k}). The time and message complexities for the initialization are

T_init(γ) = O(n) and M_init(γ) = O(m + n log n).

Remarks:

• The synchronizer idea and the synchronizers discussed in this chapter are due to Baruch Awerbuch.

• In Chapter 4, you have seen that by using flooding, there is a very simple synchronous algorithm to compute a BFS tree in time O(D) with message complexity O(m). If we use synchronizer γ to make this algorithm asynchronous, we get an algorithm with time complexity O(n + D log n) and message complexity O(m + n log n + D · n) (including initialization).

• The synchronizers α, β, and γ achieve global synchronization, i.e., every node generates every clock pulse. The disadvantage of this is that nodes that do not participate in a computation also have to participate in the synchronization. In many computations (e.g., in a BFS construction), many nodes only participate for a few synchronous rounds. An improved synchronizer due to Awerbuch and Peleg can exploit such a scenario and achieves time and message complexity O(log³ n) per synchronous round (without initialization).

• It can be shown that if all nodes in the network need to generate all pulses, the trade-off of synchronizer γ is asymptotically optimal.

• Partitions of networks into clusters of small diameter and coverings of networks with clusters of small diameter come in many variations and have various applications in distributed computations. In particular, apart from synchronizers, algorithms for routing, the construction of sparse spanning subgraphs, distributed data structures, and even computations of local structures such as a MIS or a dominating set are based on some kind of network partitions or covers.

11.6 Clock Synchronization

“A man with one clock knows what time it is – a man with two is never sure.” Synchronizers can directly be used to give nodes in an asynchronous network a common notion of time. In wireless networks, for instance, many basic protocols need an accurate time. Sometimes a common time in the whole network is needed, often it is enough to synchronize neighbors. The purpose of the time division multiple access (TDMA) protocol is to use the common wireless channel as efficiently as possible, i.e., interfering nodes should never transmit at the same time (on the same frequency). If we use synchronizer β to give the nodes a common notion of time, every single clock cycle costs D time units!

Often, each (wireless) node is equipped with an internal clock. Using this clock, it should be possible to divide time into slots, and make each node send (or listen, or sleep, respectively) in the appropriate slots according to the media access control (MAC) layer protocol used. However, as it turns out, synchronizing clocks in a network is not trivial. As nodes’ internal clocks are not perfect, they will run at speeds that are time-dependent. For instance, variations in temperature or supply voltage will affect this clock drift. For standard clocks, the drift is in the order of parts per million, i.e., within a second, it will accumulate to a couple of microseconds. Wireless TDMA protocols account for this by introducing guard times. Whenever a node knows that it is about to receive a message from a neighbor, it powers up its radio a little bit earlier to make sure that it does not miss the message even when clocks are not perfectly synchronized. If nodes are badly synchronized, messages of different slots might collide.

In the clock synchronization problem, we are given a network (graph) with n nodes. The goal for each node is to have a logical clock such that the logical clock values are well synchronized, and close to real time. Each node is equipped with a hardware clock that ticks more or less in real time, i.e., the time between two pulses is arbitrary between [1 − ε, 1 + ε], for a constant ε ≪ 1. Similarly as in our asynchronous model, we assume that messages sent over the edges of the graph have a delivery time between [0, 1]. In other words, we have a bounded but variable drift on the hardware clocks and an arbitrary jitter in the delivery times. The goal is to design a message-passing algorithm that ensures that the logical clock skew of adjacent nodes is as small as possible at all times.

Theorem 11.12. The global clock skew (the logical clock difference between any two nodes in the graph) is Ω(D), where D is the diameter of the graph.

Proof. For a node u, let t_u be the logical time of u and let (u → v) denote a message sent from u to a node v. Let t(m) be the time delay of a message m and let u and v be neighboring nodes. First consider a case where the message delays between u and v are 1/2. Then all the messages sent by u and v at time i according to the clock of the sender arrive at time i + 1/2 according to the clock of the receiver. Then consider the following cases

• t_u = t_v + 1/2, t(u → v) = 1, t(v → u) = 0

• t_u = t_v − 1/2, t(u → v) = 0, t(v → u) = 1,

where the message delivery time is always fast for one node and slow for the other and the logical clocks are off by 1/2. In both scenarios, the messages sent at time i according to the clock of the sender arrive at time i + 1/2 according to the logical clock of the receiver. Therefore, for nodes u and v, both cases with clock drift seem the same as the case with perfectly synchronized clocks. Furthermore, in a linked list of D nodes, the left- and rightmost nodes l, r cannot distinguish t_l = t_r + D/2 from t_l = t_r − D/2.

Remarks:

• From Theorem 11.12, it directly follows that every clock synchronization algorithm has a global skew of Ω(D).

• Many natural algorithms manage to achieve a global clock skew of O(D).

As both the message jitter and hardware clock drift are bounded by constants, it feels like we should be able to get a constant skew between neighboring nodes. As synchronizer α pays most attention to local synchronization, we take a look at a protocol inspired by synchronizer α. A pseudo-code representation of the clock synchronization protocol α is given in Algorithm 44.

Algorithm 44 Clock synchronization α (at node v)
1: repeat
2:   send logical time t_v to all neighbors
3:   if Receive logical time t_u, where t_u > t_v, from any neighbor u then
4:     t_v := t_u
5:   end if
6: until done

Lemma 11.13. The clock synchronization protocol α has a local skew of Ω(n).

Proof. Let the graph be a linked list of D nodes. We denote the nodes by v_1, v_2, . . . , v_D from left to right and the logical clock of node v_i by t_i. Apart from the left-most node v_1, all hardware clocks run with speed 1 (real time). Node v_1 runs at maximum speed, i.e., the time between two pulses is not 1 but 1 − ε. Assume that initially all message delays are 1. After some time, node v_1 will start to speed up v_2, and after some more time v_2 will speed up v_3, and so on. At some point of time, we will have a clock skew of 1 between any two neighbors. In particular t_1 = t_D + D − 1.

Now we start playing around with the message delays. Let t_1 = T. First we set the delay between v_1 and v_2 to 0. Now node v_2 immediately adjusts its logical clock to T. After this event (which is instantaneous in our model) we set the delay between v_2 and v_3 to 0, which results in v_3 setting its logical clock to T as well. We perform this successively for all pairs of nodes until v_{D−2} and v_{D−1}. Now node v_{D−1} sets its logical clock to T, which means that the difference between the logical clocks of v_{D−1} and v_D is T − (T − (D − 1)) = D − 1.
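The adversarial schedule from this proof is easy to replay numerically. A minimal Python sketch (variable names are ours; the drift phase is folded into the initial clock values):

```python
# Replay of the proof of Lemma 11.13 on a list of D nodes: start from
# the state with skew 1 between neighbors (t_i = T - (i - 1)), then
# deliver zero-delay messages along the list so that each node in turn
# adopts the maximum clock value, as in Algorithm 44.
D, T = 10, 100
t = [T - i for i in range(D)]      # t[0] = T, ..., t[D-1] = T - (D - 1)

for i in range(D - 2):             # v_2, ..., v_{D-1} adopt T one by one
    t[i + 1] = max(t[i + 1], t[i])

print(t[D - 2] - t[D - 1])         # local skew between v_{D-1} and v_D: D - 1 = 9
```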

Remarks:

• The introduced examples may seem cooked-up, but examples like this exist in all networks, and for all algorithms. Indeed, it was shown that any natural clock synchronization algorithm must have a bad local skew. In particular, a protocol that averages between all neighbors is even worse than the introduced α algorithm. This algorithm has a clock skew of Ω(D²) in the linked list, at all times.

• Recently, there was a lot of progress in this area, and it was shown that the local clock skew is Θ(log D), i.e., there is a protocol that achieves this bound, and there is a proof that no algorithm can be better than this bound!

• Note that these are worst-case bounds. In practice, clock drift and message delays may not be the worst possible; typically the speed of hardware clocks changes at a comparatively slow pace and the message transmission times follow a benign probability distribution. If we assume this, better protocols do exist.

Chapter 12

Hard Problems

This chapter is on “hard” problems in distributed computing. In sequential computing, there are NP-hard problems which are conjectured to take exponential time. Is there something similar in distributed computing? Using flooding/echo (Algorithms 11, 12) from Chapter 4, everything so far was solvable basically in O(D) time, where D is the diameter of the network.

12.1 Diameter & APSP

But how do we compute the diameter itself!?! With flooding/echo, of course!

Algorithm 45 Naive Diameter Construction
1: all nodes compute their radius by synchronous flooding/echo
2: all nodes flood their radius on the constructed BFS tree
3: the maximum radius a node sees is the diameter

Remarks:

• Since all these phases only take O(D) time, nodes know the diameter in O(D) time, which is asymptotically optimal.

• However, there is a problem! Nodes are now involved in n parallel flooding/echo operations, thus a node may have to handle many and big messages in one single time step. Although this is not strictly illegal in the message passing model, it still feels like cheating! A natural question is whether we can do the same by just sending short messages in each round.

• In Definition 3.1 of Chapter 3 we postulated that nodes should send only messages of “reasonable” size. In this chapter we strengthen the definition a bit, and require that each message should have at most O(log n) bits. This is generally enough to communicate a constant number of IDs or values to neighbors, but not enough to communicate everything a node knows!

• A simple way to avoid large messages is to split them into small messages that are sent using several rounds. This can cause messages to get delayed in some nodes but not in others. The flooding might not use edges of a BFS tree anymore! These floodings might not compute correct distances anymore! On the other hand, we know that the maximal message size in Algorithm 45 is O(n log n). So we could just simulate each of these “big message” rounds by n “small message” rounds using small messages. This yields a runtime of O(nD), which is not desirable. A third possible approach is “starting each flooding/echo one after the other”, which results in O(nD) in the worst case as well.

• So let us fix the above algorithm! The key idea is to arrange the flooding/echo processes in a more organized way: Start the flooding processes in a certain order and prove that at any time, each node is only involved in one flooding. This is realized in Algorithm 46.

Definition 12.1 (BFS_v). Performing a breadth first search at node v produces a spanning tree BFS_v (see Chapter 4). This takes time O(D) using small messages.

Remarks:

• A spanning tree of a graph G can be traversed in time O(n) by sending a pebble over an edge in each time slot.

• This can be done using e.g. a depth first search (DFS): Start at the root of a tree, recursively visit all nodes in the following way. If the current node still has an unvisited child, then the pebble always visits that child first. Return to the parent only when all children have been visited.

• Algorithm 46 works as follows: Given a graph G, first a leader l computes its BFS tree BFS_l. Then we send a pebble P to traverse tree BFS_l. Each time pebble P enters a node v for the first time, P waits one time slot, and then starts a breadth first search (BFS) – using edges in G – from v with the aim of computing the distances from v to all other nodes. Since we start a BFS_v from every node v, each node u learns its distance to all these nodes v during the corresponding execution of BFS_v. There is no need for an echo process at the end of BFS_u.

Algorithm 46 Computes APSP on G.
1: Assume we have a leader node l (if not, compute one first)
2: compute BFS_l of leader l
3: send a pebble P to traverse BFS_l in a DFS way;
4: while P traverses BFS_l do
5:   if P visits a new node v then
6:     wait one time slot; // avoid congestion
7:     start BFS_v from node v; // compute all distances to v
8:     // the depth of node u in BFS_v is d(u, v)
9:   end if
10: end while
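The scheduling argument can also be checked numerically. The following Python sketch (helper names are ours) computes the time t_v at which the pebble starts BFS_v, charging one slot per tree edge and one waiting slot per newly visited node; node w is then involved in BFS_v exactly at time t_v + d(v, w), and the assertion verifies Lemma 12.2 below on a random instance:

```python
from collections import deque
import random

def bfs_dist(adj, s):
    """Hop distances from s, via BFS."""
    dist, queue = {s: 0}, deque([s])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def bfs_children(adj, root):
    """Children lists of a BFS tree rooted at root."""
    parent, children, queue = {root: None}, {v: [] for v in adj}, deque([root])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in parent:
                parent[w] = u
                children[u].append(w)
                queue.append(w)
    return children

def start_times(adj, root):
    """Slot in which the pebble starts BFS_v at node v: one slot per
    tree edge traversed, one waiting slot per newly visited node."""
    children, t, clock = bfs_children(adj, root), {}, 0
    def dfs(v):
        nonlocal clock
        clock += 1                # wait one time slot at the new node
        t[v] = clock              # BFS_v starts now
        for c in children[v]:
            clock += 1            # pebble moves v -> c
            dfs(c)
            clock += 1            # pebble returns c -> v
    dfs(root)
    return t

# Random connected graph: a random spanning tree plus extra edges.
n = 40
adj = {v: set() for v in range(n)}
for v in range(1, n):
    u = random.randrange(v)
    adj[u].add(v); adj[v].add(u)
for _ in range(60):
    u, v = random.sample(range(n), 2)
    adj[u].add(v); adj[v].add(u)

t = start_times(adj, 0)
dist = {v: bfs_dist(adj, v) for v in adj}
for w in adj:
    times = [t[v] + dist[v][w] for v in adj]
    assert len(times) == len(set(times))   # no two BFS collide at w
print("no node is ever active for two BFS executions at once")
```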

Remarks:

• Having all distances is nice, but how do we get the diameter? Well, as before, each node could just flood its radius (its maximum distance to any other node) into the network. However, messages are small now and we need to modify this slightly. In each round a node only sends the maximal distance that it is aware of to its neighbors. After D rounds each node will know the maximum distance among all nodes.

Lemma 12.2. In Algorithm 46, at no time is a node w simultaneously active for both BFS_u and BFS_v.

Proof. Assume a BFS_u is started at time t_u at node u. Then node w will be involved in BFS_u at time t_u + d(u, w). Now, consider a node v whose BFS_v is started at time t_v > t_u. According to the algorithm this implies that the pebble visits v after u and took some time to travel from u to v. In particular, the time to get from u to v is at least d(u, v); in addition, node v is visited for the first time (which involves waiting at least one time slot), and we have t_v ≥ t_u + d(u, v) + 1. Using this and the triangle inequality, we get that node w is involved in BFS_v strictly after being involved in BFS_u, since t_v + d(v, w) ≥ (t_u + d(u, v) + 1) + d(v, w) ≥ t_u + d(u, w) + 1 > t_u + d(u, w).

Theorem 12.3. Algorithm 46 computes APSP (all pairs shortest paths) in time O(n).

Proof. Since the previous lemma holds for any pair of vertices, no two BFS “interfere” with each other, i.e., all messages can be sent on time without congestion. Hence, all BFS stop at most D time slots after they were started. We conclude that the runtime of the algorithm is determined by the time O(D) we need to build tree BFS_l, plus the time O(n) that P needs to traverse BFS_l, plus the time O(D) needed by the last BFS that P initiated. Since D ≤ n, this is all in O(n).

Remarks:

• All of a sudden our algorithm needs O(n) time, and possibly n ≫ D. We should be able to do better, right?!

• Unfortunately not! One can show that computing the diameter of a network needs Ω(n/log n) time.

• On the other hand we can check fast whether a graph has diameter 1 or not: each node just checks whether its degree is n − 1 and tells the result to its neighbors.

12.2 Lower Bound Graphs

We define a family G of graphs that we use to prove a lower bound on the rounds needed to compute the diameter. To simplify our analysis, we assume that (n − 2) can be divided by 8. We start by defining four sets of nodes, each consisting of q = q(n) := (n − 2)/4 nodes. Throughout this chapter we write [q] as a short version of {1, . . . , q} and define:

L_0 := {l_i | i ∈ [q]}   // upper left in Figure 12.1
L_1 := {l′_i | i ∈ [q]}  // lower left
R_0 := {r_i | i ∈ [q]}   // upper right
R_1 := {r′_i | i ∈ [q]}  // lower right

Figure 12.1: The skeleton G′ contains n = 10 nodes, such that q = 2.

We add node c_L and connect it to all nodes in L_0 and L_1. Then we add node c_R, connected to all nodes in R_0 and R_1. Furthermore, nodes c_L and c_R are connected by an edge. For i ∈ [q] we connect l_i to r_i and l′_i to r′_i. Also we add edges such that nodes in L_0 are a clique, nodes in L_1 are a clique, nodes in R_0 are a clique, and nodes in R_1 are a clique. The resulting graph is called G′. Graph G′ is the skeleton of any graph in family G. More formally, skeleton G′ = (V′, E′) is:

V′ := L_0 ∪ L_1 ∪ R_0 ∪ R_1 ∪ {c_L, c_R}

E′ := ⋃_{v ∈ L_0 ∪ L_1} {(v, c_L)}                          // connections to c_L
    ∪ ⋃_{v ∈ R_0 ∪ R_1} {(v, c_R)}                          // connections to c_R
    ∪ ⋃_{i ∈ [q]} {(l_i, r_i), (l′_i, r′_i)} ∪ {(c_L, c_R)}  // connects left to right
    ∪ ⋃_{S ∈ {L_0, L_1, R_0, R_1}} ⋃_{u ≠ v ∈ S} {(u, v)}    // clique edges

To simplify our arguments, we partition G′ into two parts: Part L is the subgraph induced by nodes L_0 ∪ L_1 ∪ {c_L}. Part R is the subgraph induced by nodes R_0 ∪ R_1 ∪ {c_R}.

Family G contains any graph G that is derived from G′ by adding any combination of edges of the form (l_i, l′_j) resp. (r_i, r′_j) with l_i ∈ L_0, l′_j ∈ L_1, r_i ∈ R_0, and r′_j ∈ R_1.

Figure 12.2: The above graph G with n = 10 is a member of family G. What is the diameter of G?

Lemma 12.4. The diameter of a graph G = (V, E) ∈ G is 2 if and only if: for each tuple (i, j) with i, j ∈ [q], there is either edge (l_i, l′_j) or edge (r_i, r′_j) (or both edges) in E.

Proof. Note that the distance between most pairs of nodes is at most 2. In particular, the radius of c_L resp. c_R is 2. Thanks to c_L resp. c_R, the distance between any two nodes within Part L resp. within Part R is at most 2. Because of the cliques L_0, L_1, R_0, R_1, the distance between l_i and r_j resp. l′_i and r′_j is at most 2.

The only interesting case is between a node l_i ∈ L_0 and a node r′_j ∈ R_1 (or, symmetrically, between l′_j ∈ L_1 and a node r_i ∈ R_0). If either edge (l_i, l′_j) or edge (r_i, r′_j) is present, then this distance is 2, since the path (l_i, l′_j, r′_j) or the path (l_i, r_i, r′_j) exists. If neither of the two edges exists, then the neighborhood of l_i consists of {c_L, r_i}, all nodes in L_0, and some nodes in L_1 \ {l′_j}, and the neighborhood of r′_j consists of {c_R, l′_j}, all nodes in R_1, and some nodes in R_0 \ {r_i} (see for example Figure 12.3 with i = 1 and j = 2). Since the two neighborhoods do not share a common node, the distance between l_i and r′_j is (at least) 3.
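The characterization is easy to test on small instances. A Python sketch (function names ours) that builds a member of G from two q × q boolean matrices and checks Lemma 12.4 against a BFS-based diameter computation:

```python
from collections import deque
from itertools import combinations
import random

def lower_bound_graph(q, left_bits, right_bits):
    """A member of family G: the skeleton G' plus edge (l_i, l'_j) iff
    left_bits[i][j], and edge (r_i, r'_j) iff right_bits[i][j]."""
    L0 = [("l", i) for i in range(q)]
    L1 = [("l'", i) for i in range(q)]
    R0 = [("r", i) for i in range(q)]
    R1 = [("r'", i) for i in range(q)]
    cL, cR = "cL", "cR"
    adj = {v: set() for v in L0 + L1 + R0 + R1 + [cL, cR]}
    def add(u, v):
        adj[u].add(v); adj[v].add(u)
    for v in L0 + L1: add(v, cL)            # connections to c_L
    for v in R0 + R1: add(v, cR)            # connections to c_R
    add(cL, cR)
    for i in range(q):                      # connects left to right
        add(L0[i], R0[i]); add(L1[i], R1[i])
    for S in (L0, L1, R0, R1):              # clique edges
        for u, v in combinations(S, 2):
            add(u, v)
    for i in range(q):                      # the chosen extra edges
        for j in range(q):
            if left_bits[i][j]:  add(L0[i], L1[j])
            if right_bits[i][j]: add(R0[i], R1[j])
    return adj

def diameter(adj):
    def eccentricity(s):
        dist, queue = {s: 0}, deque([s])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        return max(dist.values())
    return max(eccentricity(v) for v in adj)

q = 4
lb = [[random.random() < 0.7 for _ in range(q)] for _ in range(q)]
rb = [[random.random() < 0.7 for _ in range(q)] for _ in range(q)]
covered = all(lb[i][j] or rb[i][j] for i in range(q) for j in range(q))
print(diameter(lower_bound_graph(q, lb, rb)) == 2, covered)  # always equal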

Remarks:

• Each part contains up to q² ∈ Θ(n²) edges.

• There are 2q + 1 ∈ Θ(n) edges connecting the left and the right part. Since in each round we can transmit O(log n) bits over each edge (in each direction), the bandwidth between Part L and Part R is O(n log n).

Figure 12.3: Nodes in the neighborhood of l_2 are cyan, the neighborhood of r′_2 is white. Since these neighborhoods do not intersect, the distance between these two nodes is d(l_2, r′_2) > 2. If e.g. edge (l_2, l′_2) was included, their distance would be 2.

• If we transmit the information of the Θ(n²) edges in a naive way with a bandwidth of O(n log n), we need Ω(n/log n) time. But maybe we can do better?! Can an algorithm be smarter and only send the information that is really necessary to tell whether the diameter is 2?

• It turns out that any algorithm needs Ω(n/log n) rounds, since the information that is really necessary to tell that the diameter is larger than 2 contains basically Θ(n²) bits.

12.3 Communication Complexity

To prove the last remark formally, we can use arguments from two-party communication complexity. This area essentially deals with a basic version of distributed computation: two parties are given some input each and want to solve a task on this input.

We consider two students (Alice and Bob) at two different universities connected by a communication channel (e.g., via email), and we assume this channel to be reliable. Now Alice and Bob want to check whether they received the same problem set for homework (we assume their professors are lazy and wrote it on the blackboard instead of putting a nicely prepared document online). Do Alice and Bob really need to type the whole problem set into their emails? In a more formal way: Alice receives a k-bit string x and Bob another k-bit string y, and the goal is for both of them to compute the equality function.

Definition 12.5 (Equality). We define the equality function EQ to be:

EQ(x, y) := \begin{cases} 1 & : x = y \\ 0 & : x \neq y. \end{cases}

Remarks:

• In a more general setting, Alice and Bob are interested in computing a certain function f : {0, 1}^k × {0, 1}^k → {0, 1} with the least amount of communication between them. Of course they can always succeed by having Alice send her whole k-bit string to Bob, who then computes the function, but the idea here is to find clever ways of calculating f with less than k bits of communication. We measure how clever they can be as follows:

Definition 12.6 (Communication complexity CC). The communication complexity of protocol A for function f is CC(A, f) := minimum number of bits exchanged between Alice and Bob in the worst case when using A. The communication complexity of f is CC(f) := min{CC(A, f) | A solves f}. That is the minimal number of bits that the best protocol needs to send in the worst case.

Definition 12.7. For a given function f, we define a 2^k × 2^k matrix M^f representing f. That is, M^f_{x,y} := f(x, y).

Example 12.8. For EQ, in case k = 3, matrix M^EQ looks like this:

 EQ   000 001 010 011 100 101 110 111   ← x
 000   1   0   0   0   0   0   0   0
 001   0   1   0   0   0   0   0   0
 010   0   0   1   0   0   0   0   0
 011   0   0   0   1   0   0   0   0
 100   0   0   0   0   1   0   0   0
 101   0   0   0   0   0   1   0   0
 110   0   0   0   0   0   0   1   0
 111   0   0   0   0   0   0   0   1
  ↑ y

As a next step we define a (combinatorial) monochromatic rectangle. These are “submatrices” of M^f which contain the same entry.

Definition 12.9 (monochromatic rectangle). A set R ⊆ {0, 1}^k × {0, 1}^k is called a monochromatic rectangle if

• whenever (x_1, y_1) ∈ R and (x_2, y_2) ∈ R then (x_1, y_2) ∈ R,

• there is a fixed z such that f(x, y) = z for all (x, y) ∈ R.

Example 12.10. The first three of the following rectangles are monochromatic, the last one is not:

R_1 = {011} × {011}                              Example 12.8: light gray
R_2 = {011, 100, 101, 110} × {000, 001}          Example 12.8: gray
R_3 = {000, 001, 101} × {011, 100, 110, 111}     Example 12.8: dark gray
R_4 = {000, 001} × {000, 001}                    Example 12.8: boxed

Each time Alice and Bob exchange a bit, they can eliminate columns/rows of the matrix M^f and a combinatorial rectangle is left. They can stop communicating when this remaining rectangle is monochromatic. Informally speaking, a fooling set can be used to fool a protocol that wants to be lazy: if the fooling set is large, there will be many maximal monochromatic rectangles (maximal in the sense that they cannot be extended while staying monochromatic). Since by communicating one bit the set of possible monochromatic rectangles does not shrink too much, we can expect that it takes a long time until a monochromatic rectangle is found in the worst case.

Definition 12.11 (fooling set). A set S ⊂ {0, 1}^k × {0, 1}^k fools f if there is a fixed z such that

• f(x, y) = z for each (x, y) ∈ S

• For any (x_1, y_1) ≠ (x_2, y_2) ∈ S, the rectangle {x_1, x_2} × {y_1, y_2} is not monochromatic: either f(x_1, y_2) ≠ z, f(x_2, y_1) ≠ z, or both.

Example 12.12. Consider S = {(000, 000), (001, 001)}. Take a look at the non-monochromatic rectangle R_4 in Example 12.10. Verify that S is indeed a fooling set for EQ!

Remarks:

• Can you find a larger fooling set for EQ?

• We assume that Alice and Bob take turns in sending a bit. This results in 2 possible actions (send 0/1) per round and in 2^t action patterns during a sequence of t rounds.

Lemma 12.13. If S is a fooling set for f, then CC(f) = Ω(log |S|).

Proof. For simplicity we assume that |S| is a power of 2. We prove the statement via contradiction: fix a protocol A and assume that it needs t := log₂(|S|/2) rounds in the worst case. Then there are 2^t = 2^{log₂(|S|/2)} = |S|/2 possible action patterns. On the other hand there are |S| = 2^{log₂ |S|} elements in S, and we conclude that at least two elements (let’s call them (x_1, y_1), (x_2, y_2)) in S cause the same action pattern P. Naturally, the action pattern on the alternative inputs (x_1, y_2), (x_2, y_1) will be P as well: in the first round Alice and Bob have no information on the other party’s string and send the same bit that was sent in P. Based on this, they determine the second bit to be exchanged, which will be the same as the second one in P for a similar reason. This continues for all t rounds. We conclude that after t rounds, Alice does not know whether Bob’s input is y_1 or y_2 and Bob does not know whether Alice’s input is x_1 or x_2. By the definition of fooling set, either

• f(x_1, y_2) ≠ f(x_1, y_1), in which case Alice (with input x_1) does not know the solution yet, or

• f(x_2, y_1) ≠ f(x_1, y_1), in which case Bob (with input y_1) does not know the solution yet.

This contradicts the assumption that A leads to a correct decision for all inputs after t rounds. Therefore at least t + 1 rounds are necessary, which is

t + 1 = \log_2(|S|/2) + 1 = \frac{\log |S|}{\log 2} \in \Omega(\log |S|).

Theorem 12.14. CC(EQ) = Ω(k).

Proof. The set S := {(x, x) | x ∈ {0, 1}^k} fools EQ and has size 2^k. Now apply Lemma 12.13.

Definition 12.15. Denote the negation of a string z by z̄ and by x ∘ y the concatenation of strings x and y.

Lemma 12.16. Let x, y be k-bit strings. Then x ≠ y if and only if there is an index i ∈ [2k] such that the ith bit of x ∘ x̄ and the ith bit of ȳ ∘ y are both 0.

Proof. If x ≠ y, there is a j ∈ [k] such that x and y differ in the jth bit. Therefore either the jth bit of x and the jth bit of ȳ are both 0, or the jth bit of x̄ and the jth bit of y are both 0. For this reason, there is an i ∈ [2k] such that x ∘ x̄ and ȳ ∘ y are both 0 at position i. If x = y, then for any i ∈ [2k] it is always the case that either the ith bit of x ∘ x̄ is 1 or the ith bit of ȳ ∘ y (which is the negation of x ∘ x̄ in this case) is 1.
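A two-line Python check of this trick (helper names are ours):

```python
def neg(z):
    """Bitwise negation of a bit string."""
    return "".join("1" if b == "0" else "0" for b in z)

def differ(x, y):
    """Lemma 12.16: x != y iff some position of x∘x̄ and ȳ∘y is 0 in both."""
    return any(a == "0" and b == "0" for a, b in zip(x + neg(x), neg(y) + y))

assert differ("0110", "0111")        # strings differ in the last bit
assert not differ("0110", "0110")    # equal strings: the halves complement each other
```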

Remarks:

• With these insights we get back to the problem of computing the diameter of a graph and relate this problem to EQ.

Definition 12.17. Using the parameter q defined before, we define a bijective map between all pairs x, y of q²-bit strings and the graphs in G: each pair of strings x, y is mapped to graph G_{x,y} ∈ G that is derived from skeleton G′ by adding

• edge (l_i, l′_j) to Part L if and only if the (j + q · i)th bit of x is 1,

• edge (r_i, r′_j) to Part R if and only if the (j + q · i)th bit of y is 1.

Remarks:

• Clearly, Part L of G_{x,y} depends on x only and Part R depends on y only.

Lemma 12.18. Let x and y be q²/2-bit strings given to Alice and Bob¹. Then graph G := G_{x∘x̄, ȳ∘y} ∈ G has diameter 2 if and only if x = y.

Proof. By Lemma 12.16 and the construction of G, there is a tuple (i, j) with neither edge (l_i, l′_j) nor edge (r_i, r′_j) in E if and only if x ≠ y. Applying Lemma 12.4 yields: G has diameter 2 if and only if x = y.

Theorem 12.19. Any distributed algorithm A that decides whether a graph G has diameter 2 might need Ω(n/log n + D) time.

¹That is why we need that n − 2 can be divided by 8.

Proof. Computing D for sure needs time Ω(D). It remains to prove Ω(n/log n). Assume there is a distributed algorithm A that decides whether the diameter of a graph is 2 in time o(n/log n). When Alice and Bob are given q²/2-bit inputs x and y, they can simulate A to decide whether x = y as follows: Alice constructs Part L of G_{x∘x̄, ȳ∘y} and Bob constructs Part R. As we remarked, both parts are independent of each other, such that Part L can be constructed by Alice without knowing y and Part R can be constructed by Bob without knowing x. Furthermore, G_{x∘x̄, ȳ∘y} has diameter 2 if and only if x = y (Lemma 12.18).

Now Alice and Bob simulate the distributed algorithm A round by round: In the first round, they determine which messages the nodes in their part of G would send. Then they use their communication channel to exchange all 2(2q + 1) ∈ Θ(n) messages that would be sent over edges between Part L and Part R in this round while executing A on G. Based on this, Alice and Bob determine which messages would be sent in round two, and so on. For each round simulated by Alice and Bob, they need to communicate Θ(n log n) bits: possibly Θ(log n) bits for each of Θ(n) messages. Since A makes a decision after o(n/log n) rounds, this yields a total communication of o(n²) bits. On the other hand, Theorem 12.14 states that to decide whether x equals y, Alice and Bob need to communicate at least Ω(q²/2) = Ω(n²) bits. A contradiction.

Remarks:

• Until now we only considered deterministic algorithms. Can one do better using randomness?

Algorithm 47 Randomized evaluation of EQ.
1: Alice and Bob use public randomness. That is, they both have access to the same random bit string z ∈ {0, 1}^k
2: Alice sends bit a := Σ_{i∈[k]} x_i · z_i mod 2 to Bob
3: Bob sends bit b := Σ_{i∈[k]} y_i · z_i mod 2 to Alice
4: if a ≠ b then
5:   we know x ≠ y
6: end if

Lemma 12.20. If x ≠ y, Algorithm 47 discovers x ≠ y with probability at least 1/2.

Proof. Note that if x = y we have a = b for sure. If x ≠ y, Algorithm 47 may not reveal inequality. For instance, for k = 2, if x = 01, y = 10 and z = 11 we get a = b = 1. In general, let I be the set of indices where x_i ≠ y_i, i.e. I := {i ∈ [k] | x_i ≠ y_i}. Since x ≠ y, we know that |I| > 0. We have

|a − b| ≡ Σ_{i∈I} z_i (mod 2),

and since all z_i with i ∈ I are random, we get that a ≠ b with probability at least 1/2.
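To get a feeling for Lemma 12.20, one can simulate Algorithm 47; the sketch below (function names are ours) estimates the detection probability for a fixed pair x ≠ y:

import random

def eq_round(x, y):
    # one execution of Algorithm 47 with shared public randomness z
    z = [random.randrange(2) for _ in range(len(x))]
    a = sum(xi * zi for xi, zi in zip(x, z)) % 2  # Alice's bit
    b = sum(yi * zi for yi, zi in zip(y, z)) % 2  # Bob's bit
    return a != b  # True means the round revealed x ≠ y

x, y = [0, 1, 1, 0], [1, 1, 0, 0]  # two different 4-bit strings
trials = 100_000
print(sum(eq_round(x, y) for _ in range(trials)) / trials)  # ≈ 1/2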

Remarks:

• By excluding the vector z = 0^k we can even get a discovery probability strictly larger than 1/2.

• Repeating Algorithm 47 with different random strings z, the error probability can be reduced arbitrarily.

• Does this imply that there is a fast randomized algorithm to determine the diameter? Unfortunately not!

• Sometimes public randomness is not available, but private randomness is. Here Alice has her own random string and Bob has his own random string. A modified version of Algorithm 47 also works with private randomness, at the cost of a slightly increased runtime.

• One can prove an Ω(n/log n) lower bound for any randomized distributed algorithm that computes the diameter. To do so one considers the disjointness function DISJ instead of equality. Here, Alice is given a subset X ⊆ [k] and Bob is given a subset Y ⊆ [k] and they need to determine whether Y ∩ X = ∅. (X and Y can be represented by k-bit strings x, y.) The reduction is similar to the one presented above but uses graph Gx,y instead of G_{x◦x̄, ȳ◦y}. However, the lower bound for the randomized communication complexity of DISJ is more involved than the lower bound for CC(EQ).

• Since one can compute the diameter given a solution for APSP, an Ω(n/log n) lower bound for APSP is implied. As such, our simple Algorithm 46 is almost optimal!

• Many prominent functions allow for a low communication complexity. For instance, CC(PARITY) = 2. What about the Hamming distance (number of different entries) of two strings? It is known that CC(HAM ≥ d) = Ω(d). Also, CC(decide whether “HAM ≥ k/2 + √k” or “HAM ≤ k/2 − √k”) = Ω(k), even when using randomness. This problem is known as the Gap-Hamming-Distance problem.

• Lower bounds in communication complexity have many applications. Apart from getting lower bounds in distributed computing, one can also get lower bounds regarding circuit depth or query times for static data structures.

• In the distributed setting with limited bandwidth we showed that computing the diameter has about the same complexity as computing all pairs shortest paths. In contrast, in sequential computing, it is a major open problem whether the diameter can be computed faster than all pairs shortest paths. No nontrivial lower bounds are known, only that Ω(n^2) steps are needed – partly due to the fact that there can be n^2 edges/distances in a graph. On the other hand the currently best algorithm uses fast matrix multiplication and terminates after O(n^2.3727) steps.

12.4 Distributed Complexity Theory

We conclude this chapter with a short overview on the main complexity classes of distributed message passing algorithms. Given a network with n nodes and diameter D, we managed to establish a rich selection of upper and lower bounds regarding how much time it takes to solve or approximate a problem. Currently we know five main distributed complexity classes:

• Strictly local problems can be solved in constant O(1) time, e.g. a constant approximation of a dominating set in a planar graph.

• Just a little bit slower are problems that can be solved in log-star O(log* n) time, e.g. many combinatorial optimization problems in special graph classes such as growth bounded graphs. 3-coloring a ring takes O(log* n).

• A large body of problems is polylogarithmic (or pseudo-local), in the sense that they seem to be strictly local but are not, as they need O(polylog n) time, e.g. the maximal independent set problem.

• There are problems which are global and need O(D) time, e.g. to count the number of nodes in the network.

• Finally there are problems which need polynomial O(poly n) time, even if the diameter D is a constant, e.g. computing the diameter of the network.

Chapter 13

Stabilization

A large branch of research in distributed computing deals with fault-tolerance. Being able to tolerate a considerable fraction of failing or even maliciously behaving (“Byzantine”) nodes while trying to reach consensus (on e.g. the output of a function) among the nodes that work properly is crucial for building reliable systems. However, consensus protocols require that a majority of the nodes remains non-faulty all the time. Can we design a distributed system that survives transient (short-lived) failures, even if all nodes are temporarily failing? In other words, can we build a distributed system that repairs itself?

13.1 Self-Stabilization

Definition 13.1 (Self-Stabilization). A distributed system is self-stabilizing if, starting from an arbitrary state, it is guaranteed to converge to a legitimate state. If the system is in a legitimate state, it is guaranteed to remain there, provided that no further faults happen. A state is legitimate if the state satisfies the specifications of the distributed system.

Remarks:

• What kind of transient failures can we tolerate? An adversary can crash nodes, or make nodes behave Byzantine. Indeed, an adversary can temporarily do harm in even worse ways, e.g. by corrupting the volatile memory of a node (without the node noticing – not unlike the movie Memento), or by corrupting messages on the fly (without anybody noticing). However, as all failures are transient, eventually all nodes must work correctly again, that is, crashed nodes get resurrected, Byzantine nodes stop being malicious, messages are being delivered reliably, and the memory of the nodes is secure.

• Clearly, the read-only memory (ROM) must be taboo at all times for the adversary. No system can repair itself if the program code itself or constants are corrupted. The adversary can only corrupt the variables in the volatile random access memory (RAM).


Definition 13.2 (Time Complexity). The time complexity of a self-stabilizing system is the time that passed after the last (transient) failure until the system has converged to a legitimate state again, staying legitimate.

Remarks:

• Self-stabilization enables a distributed system to recover from a transient fault regardless of its nature. A self-stabilizing system does not have to be initialized as it eventually (after convergence) will behave correctly.

• One of the first self-stabilizing algorithms was Dijkstra's token ring network. A token ring is an early form of a local area network where nodes are arranged in a ring, communicating by a token. The system is correct if there is exactly one token in the ring. Let's have a look at a simple solution. Given an oriented ring, we simply call the clockwise neighbor parent (p), and the counterclockwise neighbor child (c). Also, there is a leader node v0. Every node v is in a state S(v) ∈ {0, 1, . . . , n}, perpetually informing its child about its state. The token is implicitly passed on by nodes switching state. Upon noticing a change of the parent state S(p), node v executes the following code:

Algorithm 48 Self-stabilizing Token Ring

1: if v = v0 then
2: if S(v) = S(p) then
3: S(v) := S(v) + 1 (mod n)
4: end if
5: else
6: S(v) := S(p)
7: end if

Theorem 13.3. Algorithm 48 stabilizes correctly.
Proof: As long as some nodes or edges are faulty, anything can happen. In self-stabilization, we only consider the system after it is correct (at time t0, however starting in an arbitrary state). Every node apart from leader v0 will always attain the state of its parent. It may happen that one node after the other will learn the current state of the leader. In this case the system stabilizes after the leader increases its state at most n time units after time t0. It may however be that the leader increases its state even if the system is not stable, e.g. because its parent or parent's parent accidentally had the same state at time t0. The leader will increase its state possibly multiple times without reaching stability, however, at some point the leader will reach state s, a state that no other node had at time t0. (Since there are n nodes and n states, this will eventually happen.) At this point the system must stabilize because the leader cannot push for s + 1 (mod n) until every node (including its parent) has s. After stabilization, there will always be only one node changing its state, i.e., the system remains in a legitimate state.
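A quick way to convince oneself of Theorem 13.3 is to simulate Algorithm 48 from a corrupted state. The following sketch assumes a central daemon that fires one privileged node at a time (a common scheduling model for this algorithm); all names are ours:

import random

def privileged(S, v, n):
    p = (v + 1) % n  # clockwise neighbor = parent
    # the leader v0 = 0 holds the token if it equals its parent,
    # any other node holds it if it differs from its parent
    return (S[v] == S[p]) if v == 0 else (S[v] != S[p])

def stabilize(n=8, seed=1):
    random.seed(seed)
    S = [random.randrange(n) for _ in range(n)]  # arbitrary initial state
    moves = 0
    while sum(privileged(S, v, n) for v in range(n)) != 1:
        v = random.choice([u for u in range(n) if privileged(S, u, n)])
        S[v] = (S[v] + 1) % n if v == 0 else S[(v + 1) % n]
        moves += 1
    return moves

print(stabilize())  # reaches a legitimate state with a single token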

Remarks:

• Although one might think the time complexity of the algorithm is quite bad, it is asymptotically optimal.

• It can be a lot of fun designing self-stabilizing algorithms. Let us try to build a system, where the nodes organize themselves as a maximal independent set (MIS, Chapter 8):

Algorithm 49 Self-stabilizing MIS
Require: Node IDs
Every node v executes the following code:
1: do atomically
2: Leave MIS if a neighbor with a larger ID is in the MIS
3: Join MIS if no neighbor with larger ID joins MIS
4: Send (node ID, MIS or not MIS) to all neighbors
5: end do
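As a quick illustration, the following globally-simulated sketch (names are ours; the real algorithm runs the atomic step locally and forever) iterates the rule of Algorithm 49 from arbitrarily corrupted memory until a fixpoint is reached:

def stabilize_mis(adj, in_mis):
    # adj: node -> set of neighbors; in_mis: arbitrary (corrupted) initial guess
    while True:
        # v belongs to the MIS iff no neighbor with a larger ID is in the MIS
        new = {v: not any(in_mis[u] for u in adj[v] if u > v) for v in adj}
        if new == in_mis:
            return in_mis  # fixpoint: a maximal independent set
        in_mis = new

# a 5-cycle where initially every node wrongly believes it is in the MIS
adj = {v: {(v - 1) % 5, (v + 1) % 5} for v in range(5)}
print(stabilize_mis(adj, {v: True for v in range(5)}))  # nodes 2 and 4 remain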

Remarks:

• Note that the main idea of Algorithm 49 is from Algorithm 34, Chapter 8.

• As long as some nodes are faulty, anything can happen: Faulty nodes may for instance decide to join the MIS, but report to their neighbors that they did not join the MIS. Similarly messages may be corrupted during transport. As soon as the system (nodes, messages) is correct, however, the system will converge to a MIS. (The arguments are the same as in Chapter 8).

• Self-stabilizing algorithms always run in an infinite loop, because transient failures can hit the system at any time. Without the infinite loop, an adversary can always corrupt the solution “after” the algorithm terminated.

• The problem of Algorithm 49 is its time complexity, which may be linear in the number of nodes. This is not very exciting. We need something better! Since Algorithm 49 was just the self-stabilizing variant of the slow MIS Algorithm 34, maybe we can hope to “self-stabilize” some of our fast algorithms from Chapter 8?

• Yes, we can! Indeed there is a general transformation that takes any local algorithm (efficient but not fault-tolerant) and turns it into a self-stabilizing algorithm, keeping the same level of efficiency and efficacy. We present the general transformation below.

Theorem 13.4 (Transformation). We are given a deterministic local algorithm A that computes a solution of a given problem in k synchronous communication rounds. Using our transformation, we get a self-stabilizing system with time complexity k. In other words, if the adversary does not corrupt the system for k time units, the solution is stable. In addition, if the adversary does not corrupt any node or message closer than distance k from a node u, node u will be stable.

Proof: In the proof, we present the transformation. First, however, we need to be more formal about the deterministic local algorithm A. In A, each node of the network computes its decision in k phases. In phase i, node u computes its local variables according to its local variables and received messages of the earlier phases. Then node u sends its messages of phase i to its neighbors. Finally node u receives the messages of phase i from its neighbors. The set of local variables of node u in phase i is given by L_u^i. (In the very first phase, node u initializes its local variables with L_u^1.) The message sent from node u to node v in phase i is denoted by m_{u,v}^i. Since the algorithm A is deterministic, node u can compute its local variables L_u^i and messages m_{u,∗}^i of phase i from its state of earlier phases, by simply applying functions f_L and f_m. In particular,

L_u^i = f_L(u, L_u^{i−1}, m_{∗,u}^{i−1}), for i > 1, and (13.1)
m_{u,v}^i = f_m(u, v, L_u^i), for i ≥ 1. (13.2)

The self-stabilizing algorithm needs to simulate all the k phases of the local algorithm A in parallel. Each node u stores its local variables L_u^1, . . . , L_u^k as well as all messages received m_{∗,u}^1, . . . , m_{∗,u}^k in two tables in RAM. For simplicity, each node u also stores all the sent messages m_{u,∗}^1, . . . , m_{u,∗}^k in a third table. If a message or a local variable for a particular phase is unknown, the entry in the table will be marked with a special value ⊥ (“unknown”). Initially, all entries in the table are ⊥. Clearly, in the self-stabilizing model, an adversary can choose to change table values at all times, and even reset these values to ⊥. Our self-stabilizing algorithm needs to constantly work against this adversary. In particular, each node u runs these two procedures constantly:

• For all neighbors: Send each neighbor v a message containing the complete row of messages of algorithm A, that is, send the vector (m_{u,v}^1, . . . , m_{u,v}^k) to neighbor v. Similarly, if node u receives such a vector from neighbor v, then node u replaces neighbor v's row in the table of incoming messages by the received vector (m_{v,u}^1, . . . , m_{v,u}^k).

• Because of the adversary, node u must constantly recompute its local variables (including the initialization) and outgoing message vectors using Functions (13.1) and (13.2) respectively.

The proof is by induction. Let N^i(u) be the i-neighborhood of node u (that is, all nodes within distance i of node u). We assume that the adversary has not corrupted any node in N^k(u) since time t0. At time t0 all nodes in N^k(u) will check and correct their initialization. Following Equation (13.2), at time t0 all nodes in N^k(u) will send the correct message entry for the first round (m_{∗,∗}^1) to all neighbors. Asynchronous messages take at most 1 time unit to be received at a destination. Hence, using the induction with Equations (13.1) and (13.2) it follows that at time t0 + i, all nodes in N^{k−i}(u) have received the correct messages m_{∗,∗}^1, . . . , m_{∗,∗}^i. Consequently, at time t0 + k node u has received all messages of local algorithm A correctly, and will compute the same result value as in A. □
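In code, the constantly-running recomputation of a single node u might look as follows. This is only a sketch: f_L, f_m and init stand for the functions of Equations (13.1) and (13.2) and the initialization of A, which depend on the concrete local algorithm; all names are ours.

def recompute(u, k, recv, f_L, f_m, init):
    # recv[v][i]: stored (possibly corrupted) message m_{v,u}^i from neighbor v
    L = {1: init(u)}  # redo the initialization of phase 1
    out = {v: {1: f_m(u, v, L[1])} for v in recv}
    for i in range(2, k + 1):
        # Equation (13.1): recompute the local variables of phase i
        L[i] = f_L(u, L[i - 1], {v: recv[v][i - 1] for v in recv})
        for v in recv:
            out[v][i] = f_m(u, v, L[i])  # Equation (13.2): messages of phase i
    return L, out  # node u keeps sending out[v] to every neighbor v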

Remarks:

• Using our transformation (also known as “local checking”), designing self- stabilizing algorithms just turned from art to craft.

• As we have seen, many local algorithms are randomized. This brings two additional problems. Firstly, one may not exactly know how long the algorithm will take. This is not really a problem since we can simply send around all the messages needed, until the algorithm is finished. The transformation of Theorem 13.4 works also if nodes just send all messages that are not ⊥. Secondly, we must be careful about the adversary. In particular we need to restrict the adversary such that a node can produce a reproducible sufficiently long string of random bits. This can be achieved by storing the sufficiently long string along with the program code in the read only memory (ROM). Alternatively, the algorithm might not store the random bit string in its ROM, but only the seed for a random bit generator. We need this in order to keep the adversary from reshuffling random bits until the bits become “bad”, and the expected (or with high probability) efficacy or efficiency guarantees of the original local algorithm A cannot be guaranteed anymore.

• Since most local algorithms have only a few communication rounds, and only exchange small messages, the memory overhead of the transformation is usually bearable. In addition, information can often be compressed in a suitable way so that for many algorithms message size will remain polylogarithmic. For example, the information of the fast MIS algorithm (Algorithm 36) consists of a series of random values (one for each round), plus two boolean values per round. These boolean values represent whether the node joins the MIS, or whether a neighbor of the node joins the MIS. The order of the values tells in which round a decision is made. Indeed, the series of random bits can even be compressed just into the random seed value, and the neighbors can compute the random values of each round themselves.

• There is hope that our transformation also gives good algorithms for mobile networks, that is, for networks where the topology of the network may change. Indeed, for deterministic local approximation algorithms, this is true: If the adversary does not change the topology of a node's k-neighborhood in time k, the solution will locally be stable again.

• For randomized local approximation algorithms however, this is not that simple. Assume for example, that we have a randomized local algorithm for the dominating set problem. An adversary can constantly switch the topology of the network, until it finds a topology for which the random bits (which are not really random because these random bits are in ROM) give a solution with a bad approximation ratio. By defining a weaker adversarial model, we can fix this problem. Essentially, the adversary needs to be oblivious, in the sense that it cannot see the solution. Then it will not be possible for the adversary to restart the random computation if the solution is “too good”.

• Self-stabilization is the original approach, and self-organization may be the general theme, but new buzzwords pop up every now and then, e.g. self-configuration, self-management, self-repairing, self-healing, self-optimization, self-adaptivity, or self-protection. Generally all these are summarized as “self-*”. One computing giant coined the term “autonomic computing” to reflect the trend of self-managing distributed systems.

13.2 Advanced Stabilization

We finish the chapter with a non-trivial example beyond self-stabilization, showing the beauty and potential of the area: In a small town, every evening each citizen calls all his (or her) friends, asking them whether they will vote for the Democratic or the Republican party at the next election.1 In our town citizens listen to their friends, and everybody re-chooses his or her affiliation according to the majority of friends.2 Is this process going to “stabilize” (in one way or another)?

Remarks:

• Is eventually everybody voting for the same party? No.

• Will each citizen eventually stay with the same party? No.

• Will citizens that stayed with the same party for some time, stay with that party forever? No.

• And if their friends also constantly root for the same party? No.

• Will this beast stabilize at all?!? Yes!

Theorem 13.5 (Dems & Reps). Eventually every citizen is rooting for the same party every other day.

Proof: To prove that the opinions eventually become fixed or cycle every other day, think of each friendship between citizens as a pair of (directed) edges, one in each direction. Let us say an edge is currently “bad” if the party of the advising friend differs from the next-day’s party of the advised friend. In other words, the edge is bad if the advised friend did not follow the advisor’s opinion (which means that the advisor was in the minority). An edge that is not bad, is “good”. Consider the out-edges of citizen c on day t, during which (say) c roots for the Democrats. Assume that during day t, g out-edges of c are good, and b out-edges are bad. Note that g + b is the degree of c. Since g out-edges were good, g friends of c root for the Democrats on day t + 1. Likewise, b friends of c root for the Republicans on day t+1. In other words, on the evening of day t+1 citizen c will receive g recommendations for Democrats, and b for Republicans. We distinguish two cases:

1We are in the US, and as we know from The Simpsons, you “throw your vote away” if you vote for somebody else. As a consequence our example has two parties only.
2Assume for the sake of simplicity that everybody has an odd number of friends.

• g > b: In this case, citizen c will still (or again) root for the Democrats on day t + 2. Note that in this case, on day t + 1, exactly g in-edges of c are good, and exactly b in-edges are bad. In other words, the number of bad out-edges on day t is exactly the number of bad in-edges on day t + 1.

• g < b: In this case, citizen c will root for the Republicans on day t + 2. Note that in this case, on day t + 1, exactly b in-edges of c are good, and exactly g in-edges are bad. In other words, the number of bad out-edges on day t was exactly the number of good in-edges on day t + 1 (and vice versa). Since citizen c is rooting for the Republicans, the number of bad out-edges on day t was strictly larger than the number of bad in-edges on day t + 1.

We account for every edge as out-edge on day t, and as in-edge on day t + 1. Since in both of the above cases the number of bad edges does not increase, the total number of bad edges B cannot increase. In fact, if any node switches its party from day t to t + 2, we know that the total number of bad edges strictly decreases. But B cannot decrease forever. Once B hits its minimum, the system stabilizes in the sense that every citizen will either stick with his or her party forever or flip-flop every day – the system “stabilizes”. □

Remarks:

• The model can be generalized considerably by, for example, adding weights to vertices (meaning some citizens' opinions are more important than others), allowing loops (citizens who consider their own current opinions as well), allowing tie-breaking mechanisms, and even allowing different thresholds for party changes.

• How long does it take until the system stabilizes? (A small simulation sketch follows these remarks.)

• Some of you may be reminded of Conway's Game of Life: We are given an infinite two-dimensional grid of cells, each of which is in one of two possible states, dead or alive. Every cell interacts with its eight neighbors. In each round, the following transitions occur: Any live cell with fewer than two live neighbors dies, as if caused by loneliness. Any live cell with more than three live neighbors dies, as if by overcrowding. Any live cell with two or three live neighbors lives on to the next generation. Any dead cell with exactly three live neighbors is “born” and becomes a live cell. The initial pattern constitutes the “seed” of the system. The first generation is created by applying the above rules simultaneously to every cell in the seed, births and deaths happen simultaneously, and the discrete moment at which this happens is sometimes called a tick. (In other words, each generation is a pure function of the one before.) The rules continue to be applied repeatedly to create further generations. John Conway figured that these rules were enough to generate interesting situations, including “breeders” which create “guns” which in turn create “gliders”. As such Life in some sense answers an old question by John von Neumann, whether there can be a simple machine that can build copies of itself. In fact Life is Turing complete, that is, as powerful as any computer.
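Returning to the Dems & Reps process, the sketch below simulates it on a circulant friendship graph where every citizen has exactly 5 friends (so degrees are odd, as footnote 2 assumes). The update is deterministic, so once a state repeats with period 2 it repeats forever; all names are ours:

import random

def friends(v, n):
    # circulant graph: offsets ±1, ±2 and n/2 give every citizen 5 friends
    return [(v + d) % n for d in (1, -1, 2, -2, n // 2)]

def first_stable_day(n=20, days=1000, seed=3):
    random.seed(seed)
    s = [random.randrange(2) for _ in range(n)]  # 0 = Democrats, 1 = Republicans
    history = [s[:]]
    for _ in range(days):
        prev = history[-1]
        s = [int(sum(prev[u] for u in friends(v, n)) > 2) for v in range(n)]
        history.append(s[:])
        if len(history) >= 3 and history[-1] == history[-3]:
            return len(history) - 1  # from here on, a fixed 2-cycle
    return None

print(first_stable_day())  # everyone roots for the same party every other day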

Figure 13.1: A “glider gun”. . .

Figure 13.2: . . . in action.

Chapter 14

Wireless Protocols

Wireless communication was one of the major success stories of the last decades. Today, different wireless standards such as wireless local area networks (WLAN) are omnipresent. In some sense, from a distributed computing viewpoint wireless networks are quite simple, as they cannot form arbitrary network topologies. Simplistic models of wireless networks include geometric graph models such as the so-called unit disk graph. Modern models are more robust: The network graph is restricted, e.g., the total number of neighbors of a node which are not adjacent is likely to be small. This observation is hard to capture with purely geometric models, and motivates more advanced network connectivity models such as bounded growth or bounded independence. However, on the other hand, wireless communication is also more difficult than standard message passing, as for instance nodes are not able to transmit a different message to each neighbor at the same time. And if two neighbors are transmitting at the same time, they interfere, and a node may not be able to decipher anything. In this chapter we deal with the distributed computing principles of wireless communication: We make the simplifying assumption that all n nodes are in the communication range of each other, i.e., the network graph is a clique. Nodes share a synchronous time: in each time slot a node can decide to either transmit or receive (or sleep). However, two or more nodes transmitting in a time slot will cause interference. Transmitting nodes are never aware if there is interference because they cannot simultaneously transmit and receive.

14.1 Basics

The basic communication protocol in wireless networks is the medium access control (MAC) protocol. Unfortunately it is difficult to claim that one MAC protocol is better than another, because it all depends on the parameters, such as the network topology, the channel characteristics, or the traffic pattern. When it comes to the principles of wireless protocols, we usually want to achieve much simpler goals. One basic and important question is the following: How long does it take until one node can transmit successfully, without interference? This question is often called the wireless leader election problem (Chapter 3), with the node transmitting alone being the leader.


Clearly, we can use node IDs to solve leader election, e.g., a node with ID i transmits in time slot i. However, this may be incredibly slow. There are better deterministic solutions, but by and large the best and simplest algorithms are randomized. Throughout this chapter, we use a random variable X to denote the number of nodes transmitting in a given slot.

Algorithm 50 Slotted Aloha
1: Every node v executes the following code:
2: repeat
3: transmit with probability 1/n
4: until one node has transmitted alone

Theorem 14.1. Using Algorithm 50 allows one node to transmit alone (become a leader) after expected time e.

Proof. The probability for success, i.e., only one node transmitting is

Pr[X = 1] = n · (1/n) · (1 − 1/n)^{n−1} ≈ 1/e,

where the last approximation is a result from Theorem 14.23 for sufficiently large n. Hence, if we repeat this process e times, we can expect one success.
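A short simulation of Algorithm 50 confirms the expected waiting time of roughly e slots (names are ours):

import random

def slots_until_leader(n):
    # every node transmits with probability 1/n per slot (Algorithm 50)
    slots = 0
    while True:
        slots += 1
        X = sum(random.random() < 1.0 / n for _ in range(n))
        if X == 1:  # exactly one node transmitted alone
            return slots

trials = [slots_until_leader(100) for _ in range(2000)]
print(sum(trials) / len(trials))  # ≈ e ≈ 2.718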

Remarks:

• The origin of the name is the ALOHAnet which was developed at the University of Hawaii.

• How does the leader know that it is the leader? One simple solution is a “distributed acknowledgment”. The nodes just continue Algorithm 50, including the ID of the leader in their transmission. So the leader learns that it is the leader.

• One more problem?! Indeed, node v which managed to transmit the ac- knowledgment (alone) is the only remaining node which does not know that the leader knows that it is the leader. We can fix this by having the leader acknowledge v’s successful acknowledgment.

• One can also imagine an unslotted time model. In this model two messages which overlap partially will interfere and no message is received. As everything in this chapter, Algorithm 50 also works in an unslotted time model, with a factor 2 penalty, i.e., the probability for a successful transmission will drop from 1/e to 1/(2e). Essentially, each slot is divided into t small time slots with t → ∞ and the nodes start a new t-slot long transmission with probability 1/(2nt).

14.2 Initialization

Sometimes we want the n nodes to have the IDs {1, 2, . . . , n}. This process is called initialization. Initialization can for instance be used to allow the nodes to transmit one by one without any interference.

14.2.1 Non-Uniform Initialization

Theorem 14.2. If the nodes know n, we can initialize them in O(n) time slots.

Proof. We repeatedly elect a leader using e.g., Algorithm 50. The leader gets the next free number and afterwards leaves the process. We know that this works with probability 1/e. The expected time to finish is hence e · n.

Remarks:

• But this algorithm requires that the nodes know n in order to give them IDs from 1, . . . , n! For a more realistic scenario we need a uniform algorithm, i.e., the nodes do not know n.

14.2.2 Uniform Initialization with CD

Definition 14.3 (Collision Detection, CD). Two or more nodes transmitting concurrently is called interference. In a system with collision detection, a receiver can distinguish interference from nobody transmitting. In a system without collision detection, a receiver cannot distinguish the two cases.

Let us first present a high-level idea. The set of nodes is recursively partitioned into two non-empty sets, similarly to a binary tree. This is repeated recursively until a set contains only one node which gets the next free ID. Afterwards, the algorithm continues with the next set.

Algorithm 51 RandomizedSplit(b)
1: Every node v executes the following code:
2: repeat
3: if bv = b then
4: choose r uniformly at random from {0, 1}
5: in the next two time slots:
6: transmit in slot r, and listen in other slot
7: end if
8: until there was at least 1 transmission in both slots
9: if bv = b then
10: bv := bv + r {append bit r to bitstring bv}
11: end if
12: if some node u transmitted alone in slot r ∈ {0, 1} then
13: node u gets ID m {and becomes passive}
14: m := m + 1
15: else
16: RandomizedSplit(b + r)
17: end if

Remarks:

• In line 8 the transmitting nodes need to know if they were the only one transmitting. Since we have enough time, we can do a leader election first and use a similar trick as before to ensure this.

• In line 12 we check separately for r = 0 and r = 1.

Algorithm 52 Initialization with Collision Detection
1: Every node v executes the following code:
2: global variable m := 0 {number of already identified nodes}
3: local variable bv := ‘’ {current bitstring of node v, initially empty}
4: RandomizedSplit(‘’)

Theorem 14.4. Algorithm 52 correctly initializes the set of nodes in O(n).

Proof. A successful split is defined as a split in which both subsets are non-empty. We know that there are exactly n − 1 successful splits because we have a binary tree with n leaves and n − 1 inner nodes. Let us now calculate the probability for creating two non-empty sets from a set of size k ≥ 2 as

Pr[1 ≤ X ≤ k − 1] = 1 − Pr[X = 0] − Pr[X = k] = 1 − 1/2^k − 1/2^k ≥ 1/2.

Thus, in expectation we need O(n) splits.
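The success probability of a single split can also be checked empirically; the following sketch (names are ours) estimates it for k = 5, where the exact value is 1 − 2/2^k = 0.9375:

import random

def successful_split(k):
    # each of the k nodes picks slot 0 or 1 uniformly at random (Algorithm 51)
    left = sum(random.randrange(2) for _ in range(k))
    return 0 < left < k  # both subsets non-empty

k, trials = 5, 100_000
print(sum(successful_split(k) for _ in range(trials)) / trials)  # ≈ 0.9375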

Remarks:

• What if we do not have collision detection?

14.2.3 Uniform Initialization without CD

Let us assume that we have a special node ℓ (leader) and let S denote the set of nodes which want to transmit. We now split every time slot from before into two time slots and use the leader to help us distinguish between silence and noise. In the first slot every node from the set S transmits, in the second slot the nodes in S ∪ {ℓ} transmit. This gives the nodes sufficient information to distinguish the different cases (see Table 14.1).

                     nodes in S transmit    nodes in S ∪ {ℓ} transmit
|S| = 0                      no                       yes
|S| = 1, S = {ℓ}             yes                      yes
|S| = 1, S ≠ {ℓ}             yes                      no
|S| ≥ 2                      no                       no

Table 14.1: Using a leader to distinguish between noise and silence: no represents noise/silence, yes represents a successful transmission.

Remarks:

• As such, Algorithm 52 works also without CD, with only a factor 2 overhead.

• More generally, a leader immediately brings CD to any protocol.

• This protocol has an important real life application, for instance when checking out a shopping cart with items which have RFID tags.

• But how do we determine such a leader? And how long does it take until we are “sure” that we have one? Let us repeat the notion of with high probability.

14.3 Leader Election

14.3.1 With High Probability

Definition 14.5 (With High Probability). Some probabilistic event is said to occur with high probability (w.h.p.), if it happens with a probability p ≥ 1 − 1/n^c, where c is a constant. The constant c may be chosen arbitrarily, but it is considered constant with respect to Big-O notation.

Theorem 14.6. Algorithm 50 elects a leader w.h.p. in O(log n) time slots.

Proof. The probability for not electing a leader after c · log n time slots, i.e., c log n slots without a successful transmission is

(1 − 1/e)^{c·ln n} = (1 − 1/e)^{e·c′·ln n} ≤ e^{−c′·ln n} = 1/n^{c′},

where we write c = e · c′ and use that (1 − 1/e)^e ≤ 1/e.

Remarks:

• What about uniform algorithms, i.e. the number of nodes n is not known?

14.3.2 Uniform Leader Election

Algorithm 53 Uniform leader election
1: Every node v executes the following code:
2: for k = 1, 2, 3, . . . do
3: for i = 1 to ck do
4: transmit with probability p := 1/2^k
5: if node v was the only node which transmitted then
6: v becomes the leader
7: break
8: end if
9: end for
10: end for

Theorem 14.7. By using Algorithm 53 it is possible to elect a leader w.h.p. in O(log^2 n) time slots if n is not known.

Proof. Let us briefly describe the algorithm. The nodes transmit with probability p = 2^{−k} for ck time slots, for k = 1, 2, . . .. At first p will be too high and hence there will be a lot of interference. But after log n phases, we have k ≈ log n and thus the nodes transmit with probability ≈ 1/n. For simplicity's sake, let us assume that n is a power of 2. Using the approach outlined above, we know that after log n iterations, we have p = 1/n. Theorem 14.6 yields that we can elect a leader w.h.p. in O(log n) slots. Since we have to try log n estimates until k ≈ log n, the total runtime is O(log^2 n).

Remarks:

• Note that our proposed algorithm has not used collision detection. Can we solve leader election faster in a uniform setting with collision detection?

14.3.3 Fast Leader Election with CD

Algorithm 54 Uniform leader election with CD
1: Every node v executes the following code:
2: repeat
3: transmit with probability 1/2
4: if at least one node transmitted then
5: all nodes that did not transmit quit the protocol
6: end if
7: until one node transmits alone

Theorem 14.8. With collision detection we can elect a leader using Algorithm 54 w.h.p. in O(log n) time slots.

Proof. The number of active nodes k is monotonically decreasing and always greater than 1, which yields the correctness. A slot is called successful if at most half the active nodes transmit. We can assume that k ≥ 2 since otherwise we would have already elected a leader. We can calculate the probability that a time slot is successful as

Pr[1 ≤ X ≤ ⌈k/2⌉] ≥ 1/2 − Pr[X = 0] = 1/2 − 1/2^k ≥ 1/4.

Since the number of active nodes at least halves in every successful time slot, log n successful time slots are sufficient to elect a leader. Now let Y be a random variable which counts the number of successful time slots after 8 · c · log n time slots. The expected value is E[Y] ≥ 8 · c · log n · 1/4 = 2c · log n. Since all those time slots are independent of each other, we can apply a Chernoff bound (see Theorem 14.22) with δ = 1/2, which states

Pr[Y < (1 − δ)E[Y]] ≤ e^{−δ^2·E[Y]/2} = e^{−(1/8)·2c·log n} ≤ n^{−α}

for any constant α.
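The halving behavior of Algorithm 54 is easy to observe in simulation (a sketch with a global observer; names are ours):

import random

def rounds_with_cd(n):
    active, rounds = n, 0
    while True:
        rounds += 1
        X = sum(random.random() < 0.5 for _ in range(active))
        if X == 1:
            return rounds  # one node transmitted alone and becomes leader
        if X >= 1:
            active = X  # nodes that did not transmit quit the protocol

print(sum(rounds_with_cd(10**4) for _ in range(100)) / 100)  # grows like log n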

Remarks:

• Can we be even faster?

14.3.4 Even Faster Leader Election with CD

Let us first briefly describe an algorithm for this. In the first phase the nodes transmit with probability 1/2^{2^0}, 1/2^{2^1}, 1/2^{2^2}, . . . until no node transmits. This yields a first approximation on the number of nodes. Afterwards, a binary search is performed to determine an even better approximation of n. Finally, the third phase finds a constant approximation of n using a biased random walk. The algorithm stops in any case as soon as only one node is transmitting, which will become the leader.

Algorithm 55 Fast uniform leader election
1: i := 1
2: repeat
3: i := 2 · i
4: transmit with probability 1/2^i
5: until no node transmitted {End of Phase 1}
6: l := i/2
7: u := i
8: while l + 1 < u do
9: j := ⌈(l + u)/2⌉
10: transmit with probability 1/2^j
11: if no node transmitted then
12: u := j
13: else
14: l := j
15: end if
16: end while {End of Phase 2}
17: k := u
18: repeat
19: transmit with probability 1/2^k
20: if no node transmitted then
21: k := k − 1
22: else
23: k := k + 1
24: end if
25: until exactly one node transmitted

Lemma 14.9. If j > log n + log log n, then Pr[X > 1] ≤ 1/log n.

Proof. The nodes transmit with probability 1/2^j < 1/2^{log n+log log n} = 1/(n log n). The expected number of nodes transmitting is E[X] = n/(n log n) = 1/log n. Using Markov's inequality (see Theorem 14.21) yields Pr[X > 1] ≤ Pr[X > E[X] · log n] ≤ 1/log n.

Lemma 14.10. If j < log n − log log n, then Pr[X = 0] ≤ 1/n.

Proof. The nodes transmit with probability 1/2^j > 1/2^{log n−log log n} = log n/n. Hence, the probability for a silent time slot is (1 − log n/n)^n = e^{−log n} = 1/n.

Corollary 14.11. If i > 2 log n, then Pr[X > 1] ≤ 1/log n.

Proof. This follows from Lemma 14.9 since the deviation in this corollary is even larger.

Corollary 14.12. If i < 1/2 · log n, then Pr[X = 0] ≤ 1/n.

Proof. This follows from Lemma 14.10 since the deviation in this corollary is even larger.

Lemma 14.13. Let v be such that 2^{v−1} < n ≤ 2^v, i.e., v ≈ log n. If k > v + 2, then Pr[X > 1] ≤ 1/4.

Proof. Markov's inequality yields

Pr[X > 1] = Pr[X > (2^k/n) · E[X]] < Pr[X > (2^k/2^v) · E[X]] < Pr[X > 4 · E[X]] < 1/4.

Lemma 14.14. If k < v − 2, then Pr[X = 0] ≤ 1/4.

Proof. A similar analysis is possible to upper bound the probability that a transmission fails if our estimate is too small. We know that k ≤ v − 2 and thus

Pr[X = 0] = (1 − 1/2^k)^n < e^{−n/2^k} < e^{−2^{v−1}/2^k} ≤ e^{−2} < 1/4.

Lemma 14.15. If v − 2 ≤ k ≤ v + 2, then the probability that exactly one node transmits is constant.

Proof. The transmission probability is p = 1/2^{v±Θ(1)} = Θ(1/n), and the lemma follows with a slightly adapted version of Theorem 14.1.

Lemma 14.16. With probability 1 − 1/log n we find a leader in phase 3 in O(log log n) time.

Proof. For any k, because of Lemmas 14.13 and 14.14, the random walk of the third phase is biased towards the good area. One can show that in O(log log n) steps one gets Ω(log log n) good transmissions. Let Y denote the number of times exactly one node transmitted. With Lemma 14.15 we obtain E[Y] = Ω(log log n). Now a direct application of a Chernoff bound (see Theorem 14.22) yields that these transmissions elect a leader with probability 1 − 1/log n.

Theorem 14.17. Algorithm 55 elects a leader with probability of at least 1 − log log n/log n in time O(log log n).

Proof. From Corollary 14.11 we know that after O(log log n) time slots, the first phase terminates. Since we perform a binary search on an interval of size O(log n), the second phase also takes at most O(log log n) time slots. For the third phase we know that O(log log n) slots are sufficient to elect a leader with probability 1 − 1/log n by Lemma 14.16. Thus, the total runtime is O(log log n).

Now we can combine the results. We know that the error probability for every time slot in the first two phases is at most 1/log n. Using a union bound (see Theorem 14.20), we can upper bound the probability that an error occurred in the first two phases by log log n/log n. Thus, we know that after phase 2 our estimate is at most log log n away from log n with probability of at least 1 − log log n/log n. Hence, we can apply Lemma 14.16 and thus successfully elect a leader with probability of at least 1 − log log n/log n (again using a union bound) in time O(log log n).
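All three phases of Algorithm 55 can be simulated with a global observer. The sketch below (names are ours; we take l := i/2 and u := i as in Algorithm 55) counts the total number of slots:

import random

def transmitters(n, p):
    # number of nodes transmitting in one slot
    return sum(random.random() < p for _ in range(n))

def fast_leader_election(n):
    slots, i = 0, 1
    while True:  # Phase 1: square the estimate until a silent slot
        i, slots = 2 * i, slots + 1
        if transmitters(n, 2.0 ** -i) == 0:
            break
    l, u = i // 2, i
    while l + 1 < u:  # Phase 2: binary search for an exponent near log n
        j = (l + u + 1) // 2
        slots += 1
        if transmitters(n, 2.0 ** -j) == 0:
            u = j
        else:
            l = j
    k = u
    while True:  # Phase 3: biased random walk until a solo transmission
        slots += 1
        X = transmitters(n, 2.0 ** -k)
        if X == 1:
            return slots
        k = k - 1 if X == 0 else k + 1

print(fast_leader_election(10**5))  # typically only O(log log n) slots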

Remarks:

• Tightening this analysis a bit more, one can elect a leader with probability 1 − 1/log n in time log log n + o(log log n).

• Can we be even faster?

14.3.5 Lower Bound

Theorem 14.18. Any uniform protocol that elects a leader with probability of at least 1 − 1/log n must run for at least log log n time slots.

Proof. The probability that exactly one node transmits is

Pr[X = 1] = n · p · (1 − p)^{n−1}.

Consider now a system with only 2 nodes. The probability that exactly one transmits is at most

Pr[X = 1] = 2 · p · (1 − p) ≤ 1/2.

Thus, after log log n time slots the probability that a leader was elected is at most 1 − (1/2)^{log log n} = 1 − 1/log n.

14.3.6 Uniform Asynchronous Wakeup without CD

Until now we have assumed that all nodes start the algorithm in the same time slot. But what happens if this is not the case? How long does it take to elect a leader if we want a uniform and anonymous (nodes do not have an identifier and thus cannot base their decision on it) algorithm?

Theorem 14.19. If nodes wake up in an arbitrary (worst-case) way, any algorithm may take Ω(n/log n) time slots until a single node can successfully transmit.

Proof. Nodes must transmit at some point, or they will surely never successfully transmit. With a uniform protocol, every node executes the same code. We focus on the first slot where nodes may transmit. No matter what the protocol is, this happens with some probability p. Since the protocol is uniform, p must be a constant, independent of n. The adversary wakes up w = (c/p) · ln n nodes in each time slot with some constant c. All nodes woken up in the first time slot will transmit with probability p. We study the event E1 that exactly one of them transmits in that first time slot. Using the inequality (1 + t/n)^n ≤ e^t from Theorem 14.23 we get

Pr[E1] = w · p · (1 − p)^{w−1}
= c · ln n · (1 − p)^{(c ln n − p)/p}
≤ c · ln n · e^{−c ln n + p}
= c · ln n · n^{−c} · e^p
= n^{−c} · O(log n)
< 1/n^{c−1} = 1/n^{c′}.

In other words, w.h.p. that time slot will not be successful. Since the nodes cannot distinguish noise from silence, the same argument applies to every set of nodes which wakes up. Let Eα be the event that all n/w time slots will not be successful. Using the inequality 1 − p ≤ (1 − p/k)^k from Theorem 14.24 we get

Pr[Eα] = (1 − Pr[E1])^{n/w} > (1 − 1/n^{c′})^{Θ(n/log n)} > 1 − 1/n^{c″}.

In other words, w.h.p. it takes more than n/w time slots until some node can transmit alone.

14.4 Useful Formulas

In this chapter we have used several inequalities in our proofs. For simplicity's sake we list all of them in this section.

Theorem 14.20. Boole's inequality or union bound: For a countable set of events E1, E2, E3, . . ., we have

Pr[⋃_i E_i] ≤ Σ_i Pr[E_i].

Theorem 14.21. Markov's inequality: If X is any random variable and a > 0, then

Pr[|X| ≥ a] ≤ E[X]/a.

Theorem 14.22. Chernoff bound: Let Y1, . . . , Yn be independent Bernoulli random variables and let Y := Σ_i Y_i. For any 0 ≤ δ ≤ 1 it holds that

Pr[Y < (1 − δ)E[Y]] ≤ e^{−δ^2·E[Y]/2}

and for δ > 0

Pr[Y ≥ (1 + δ) · E[Y]] ≤ e^{−min{δ, δ^2}·E[Y]/3}.

Theorem 14.23. We have

e^t · (1 − t^2/n) ≤ (1 + t/n)^n ≤ e^t

for all n ∈ N, |t| ≤ n. Note that

lim_{n→∞} (1 + t/n)^n = e^t.

Theorem 14.24. For all p, k such that 0 < p < 1 and k ≥ 1 we have

1 − p ≤ (1 − p/k)^k.

Chapter 15

All-to-All Communication

In the previous chapters, we have mostly considered communication on a particular graph G = (V, E), where any two nodes u and v can only communicate directly if {u, v} ∈ E. This is however not always the best way to model a network. In the Internet, for example, every machine (node) is able to “directly” communicate with every other machine via a series of routers. If every node in a network can communicate directly with all other nodes, many problems can be solved easily. For example, assume we have n servers, each hosting an arbitrary number of (numeric) elements. If all servers are interested in obtaining the maximum of all elements, all servers can simultaneously, i.e., in one communication round, send their local maximum element to all other servers. Once these maxima are received, each server knows the global maximum. Note that we can again use graph theory to model this all-to-all communication scenario: The communication graph is simply the complete graph Kn := (V, (V choose 2)). If each node can send its entire local state in a single message, then all problems could be solved in 1 communication round in this model! Since allowing unbounded messages is not realistic in most practical scenarios, we restrict the message size: Assuming that all node identifiers and all other variables in the system (such as the numeric elements in the example above) can be described using O(log n) bits, each node can only send a message of size O(log n) bits to all other nodes. In other words, only a constant number of identifiers (and elements) can be packed into a single message. Thus, in this model, the bottleneck is the amount of information that can be transmitted in a fixed amount of time. This is fundamentally different from the model we studied before where nodes are restricted to local information about the network graph.

In this chapter, we study one particular problem in this model, the computation of a minimum spanning tree (MST), i.e., we will again look at the construction of a basic network structure. Let us first review the definition of a minimum spanning tree from Chapter 4. We assume that each edge e is assigned a weight ωe.

Definition 15.1 (MST). Given a weighted graph G = (V, E, ω). The MST of G is a spanning tree T minimizing ω(T), where ω(H) = Σ_{e∈H} ωe for any subgraph H ⊆ G.


Remarks:

• Since we have a complete communication graph, the graph has (n choose 2) edges in the beginning.

• As in Chapter 4, we assume that no two edges of the graph have the same weight. Recall that this assumption ensures that the MST is unique. Recall also that this simplification is not essential as one can always break ties by using the IDs of adjacent vertices.

For simplicity, we assume that we have a synchronous model (as we are only interested in the time complexity, our algorithm can be made asynchronous using synchronizer α at no additional cost, cf. Chapter 11). As usual, in every round, every node can send a (potentially different) message to each of its neighbors. In particular, note that the message delay is 1 for every edge e independent of the weight ωe. As mentioned before, every message can contain a constant number of node IDs and edge weights (and O(log n) additional bits). There is a considerable amount of work on distributed MST construction. Table 15.1 lists the most important results for various network diameters D. As we have a complete communication network in our model, we focus only on D = 1.

Upper Bounds

Graph Class         Time Complexity          Authors
General Graphs      O(D + √n · log* n)       Kutten, Peleg
Diameter 2          O(log n)                 Lotker, Patt-Shamir, Peleg
Diameter 1          O(log log n)             Lotker, Patt-Shamir, Pavlov, Peleg

Lower Bounds

Graph Class          Time Complexity          Authors
Diameter Ω(log n)    Ω(D + √(n/log^2 n))      Peleg, Rubinovich
Diameter 4           Ω(n^{1/3}/√(log n))      Lotker, Patt-Shamir, Peleg
Diameter 3           Ω(n^{1/4}/√(log n))      Lotker, Patt-Shamir, Peleg

Table 15.1: Time complexity of distributed MST construction

Remarks:

• Note that for graphs of arbitrary diameter D, if there are no bounds on the number of messages sent, on the message size, and on the amount of local computations, there is a straightforward generic algorithm to compute an MST in time D: In every round, every node sends its complete state to all its neighbors. After D rounds, every node knows the whole graph and can compute any graph structure locally without any further communication.

• In general, the diameter D is also an obvious lower bound for the time needed to compute an MST. In a weighted ring, e.g., it takes time D to find the heaviest edge. In fact, on the ring, time D is required to compute any spanning tree. In this chapter, we are not concerned with lower bounds, we want to give an algorithm that computes the MST as quickly as possible instead!

We again use the following lemma that is proven in Chapter 4.

Lemma 15.2. For a given graph G let T be an MST, and let T′ ⊆ T be a subgraph (also known as a fragment) of the MST. Edge e = (u, v) is an outgoing edge of T′ if u ∈ T′ and v ∉ T′ (or vice versa). Let the minimum weight outgoing edge of the fragment T′ be the so-called blue edge b(T′). Then T′ ∪ b(T′) ⊆ T.

Lemma 15.2 leads to a straightforward distributed MST algorithm. We start with an empty graph, i.e., every node is a fragment of the MST. The algorithm consists of phases. In every phase, we add the blue edge b(T′) of every existing fragment T′ to the MST. Algorithm 56 shows how the described simple MST construction can be carried out in a network of diameter 1.

Algorithm 56 Simple MST Construction (at node v)
1: // all nodes always know all current MST edges and thus all MST fragments
2: while v has neighbor u in different fragment do
3: find lowest-weight edge e between v and a node u in a different fragment
4: send e to all nodes
5: determine blue edges of all fragments
6: add blue edges of all fragments to MST, update fragments
7: end while

Theorem 15.3. On a complete graph, Algorithm 56 computes an MST in time O(log n).

Proof. The algorithm is correct because of Lemma 15.2. Every node only needs to send a single message to all its neighbors in every phase (line 4). All other computations can be done locally without sending other messages. In particular, the blue edge of a given fragment is the lightest edge sent by any node of that fragment. Because every node always knows the current MST (and all current fragments), lines 5 and 6 can be performed locally. In every phase, every fragment connects to at least one other fragment. The minimum fragment size therefore at least doubles in every phase. Thus, the number of phases is at most log_2 n.
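A compact simulation of Algorithm 56 with a global view illustrates the blue-edge phases (names are ours; with distinct weights the chosen blue edges never close a cycle):

import random

def simple_mst(n, seed=0):
    rnd = random.Random(seed)
    # complete graph with (effectively) distinct random edge weights
    w = {(i, j): rnd.random() for i in range(n) for j in range(i + 1, n)}
    frag = list(range(n))  # fragment label of every node
    mst, phases = set(), 0
    while len(set(frag)) > 1:
        phases += 1
        blue = {}  # lightest outgoing edge of every fragment
        for (i, j), weight in w.items():
            if frag[i] != frag[j]:
                for f in (frag[i], frag[j]):
                    if f not in blue or weight < w[blue[f]]:
                        blue[f] = (i, j)
        for i, j in set(blue.values()):  # add blue edges, merge fragments
            if frag[i] != frag[j]:
                mst.add((i, j))
                old, new = frag[i], frag[j]
                frag = [new if f == old else f for f in frag]
    return mst, phases

mst, phases = simple_mst(64)
print(len(mst), phases)  # 63 MST edges after roughly log2(64) = 6 phases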

Remarks:

• Algorithm 56 does essentially the same thing as the GHS algorithm (Algorithm 15) discussed in Chapter 4. Because we now have a complete graph and thus every node can communicate with every other node, things become simpler (and also much faster).

• Algorithm 56 does not make use of the fact that a node can send different messages to different nodes. Making use of this possibility will allow us to significantly reduce the running time of the algorithm.

Our goal is now to improve Algorithm 56. We assume that every node has a unique identifier. By sending its own identifier to all other nodes, every node knows the identifiers of all other nodes after one round. Let ℓ(F) be the node with the smallest identifier in fragment F. We call ℓ(F) the leader of fragment F. In order to improve the running time of Algorithm 56, we need to be able to connect every fragment to more than one other fragment in a single phase. Algorithm 57 shows how the nodes can learn about the k = |F| lightest outgoing edges of each fragment F (in constant time!).

Algorithm 57 Fast MST construction (at node v)
1: // all nodes always know all current MST edges and thus all MST fragments
2: repeat
3: F := fragment of v
4: ∀F′ ≠ F, compute min-weight edge eF′ connecting v to F′
5: ∀F′ ≠ F, send eF′ to ℓ(F)
6: if v = ℓ(F) then
7: ∀F′ ≠ F, determine min-weight edge eF,F′ between F and F′
8: k := |F|
9: E(F) := k lightest edges among eF,F′ for F′ ≠ F
10: send each edge in E(F) to a different node in F // for simplicity assume that v also sends an edge to itself
11: end if
12: send edge received from ℓ(F) to all nodes
13: // the following operations are performed locally by each node
14: E′ := edges received from other nodes
15: AddEdges(E′)
16: until all nodes are in the same fragment

Given this set E′ of edges, each node can locally decide which edges can safely be added to the constructed tree by calling the subroutine AddEdges (Algorithm 58). Note that the set of received edges E′ in line 14 is the same for all nodes. Since all nodes know all current fragments, all nodes add the same set of edges!

Algorithm 58 uses the lightest outgoing edge that connects two fragments (to a larger super-fragment) as long as it is safe to add this edge, i.e., as long as it is clear that this edge is a blue edge. A (super-)fragment that has outgoing edges in E′ that are surely blue edges is called safe. As we will see, a super-fragment is safe if all the original fragments that make it up are still incident to at least one edge in E′ that has not yet been considered. In order to determine whether all lightest outgoing edges in E′ that are incident to a certain fragment F have been processed, a counter c(F) is maintained (see line 2). If an edge incident to two (distinct) fragments Fi and Fj is processed, both c(Fi) and c(Fj) are decremented by 1 (see line 8). An edge connecting two distinct super-fragments F′ and F″ is added if at least one of the two super-fragments is safe. In this case, the two super-fragments are merged into one (new) super-fragment. The new super-fragment is safe if and only if both original super-fragments are safe and the processed edge e is not the last edge in E′ incident to any of the two fragments Fi and Fj that are incident to e, i.e., both counters c(Fi) and c(Fj) are still positive (see line 12).

The considered edge e may not be added for one of two reasons. It is possible that both F′ and F″ are not safe. Since a super-fragment cannot become safe again, nothing has to be done in this case. The second reason is that F′ = F″. In this case, this single super-fragment may become unsafe if e reduced either c(Fi) or c(Fj) to zero (see line 18).

Algorithm 58 AddEdges(E′): Given the set of edges E′, determine which edges are added to the MST

1: Let F1, . . . , Fr be the initial fragments
2: ∀Fi ∈ {F1, . . . , Fr}, c(Fi) := # incident edges in E′
3: Let F̂1 := F1, . . . , F̂r := Fr be the initial super-fragments
4: ∀F̂i ∈ {F̂1, . . . , F̂r}, safe(F̂i) := true
5: while E′ ≠ ∅ do
6: e := lightest edge in E′ between the original fragments Fi and Fj
7: E′ := E′ \ {e}
8: c(Fi) := c(Fi) − 1, c(Fj) := c(Fj) − 1
9: if e connects super-fragments F′ ≠ F″ and (safe(F′) or safe(F″)) then
10: add e to MST
11: merge F′ and F″ into one super-fragment Fnew
12: if safe(F′) and safe(F″) and c(Fi) > 0 and c(Fj) > 0 then
13: safe(Fnew) := true
14: else
15: safe(Fnew) := false
16: end if
17: else if F′ = F″ and (c(Fi) = 0 or c(Fj) = 0) then
18: safe(F′) := false
19: end if
20: end while

Lemma 15.4. The algorithm only adds MST edges.

Proof. We have to prove that at the time we add an edge e in line 9 of Algorithm 58, e is the blue edge of some (super-)fragment. By definition, e is the lightest edge that has not been considered and that connects two distinct super-fragments F′ and F″. Since e is added, we know that either safe(F′) or safe(F″) is true. Without loss of generality, assume that F′ is safe. According to the definition of safe, this means that from each fragment F in the super-fragment F′ we know at least the lightest outgoing edge, which implies that we also know the lightest outgoing edge, i.e., the blue edge, of F′. Since e is the lightest edge that connects any two super-fragments, it must hold that e is exactly the blue edge of F′. Thus, whenever an edge is added, it is an MST edge.

Theorem 15.5. Algorithm 57 computes an MST in time O(log log n).

Proof. Let βk denote the size of the smallest fragment after phase k of Algorithm 57. We first show that every fragment merges with at least βk other fragments in each phase. Since the size of each fragment after phase k is at least βk by definition, we get that the size of each fragment after phase k + 1 is at least βk(βk + 1). Assume that a fragment F, consisting of at least βk nodes,

does not merge with βk other fragments in phase k + 1 for any k ≥ 0. Note that F cannot be safe because being safe implies that there is at least one edge in E0 that has not been considered yet and that is the blue edge of F . Hence, the phase cannot be completed in this case. On the other hand, if F is not safe, then at least one of its sub-fragments has used up all its βk edges to other fragments. However, such an edge is either used to merge two fragments or it must have been dropped because the two fragments already belong to the same fragment because another edge connected them (in the same phase). In either case, we get that any fragment, and in particular F , must merge with at least βk other fragments. Given that the minimum fragment size grows (quickly) in each phase and that only edges belonging to the MST are added according to Lemma 15.4, we conclude that the algorithm correctly computes the MST. The fact that

βk+1 ≥ βk(βk + 1)

implies that βk ≥ 2^{2^{k−1}} for any k ≥ 1. Therefore after 1 + log_2 log_2 n phases, the minimum fragment size is n and thus all nodes are in the same fragment.
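The doubly-exponential growth of the recurrence can be checked numerically (a small sketch; names are ours):

import math

# iterate the recurrence beta_{k+1} = beta_k * (beta_k + 1) from beta_0 = 1
n, beta, k = 10**9, 1, 0
while beta < n:
    beta, k = beta * (beta + 1), k + 1
print(k, 1 + math.ceil(math.log2(math.log2(n))))  # k stays within the bound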

Remarks:

• It is not known whether the O(log log n) time complexity of Algorithm 57 is optimal. In fact, no lower bounds are known for the MST construction on graphs of diameter 1 and 2.

• Algorithm 57 makes use of the fact that it is possible to send different messages to different nodes. If we assume that every node always has to send the same message to all other nodes, Algorithm 56 is the best that is known. Also for this simpler case, no lower bound is known.

Chapter 16

Consensus

This chapter is the first to deal with fault tolerance, one of the most fundamental aspects of distributed computing. Indeed, in contrast to a system with a single processor, having a distributed system may permit getting away with failures and malfunctions of parts of the system. This line of research was motivated by the basic question whether, e.g., putting two (or three?) computers into the cockpit of a plane will make the plane more reliable. Clearly fault-tolerance often comes at a price, as having more than one decision-maker often complicates decision-making.

16.1 Impossibility of Consensus

Imagine two cautious generals who want to attack a common enemy.¹ Their only means of communication are messengers. Unfortunately, the route of these messengers leads through hostile enemy territory, so there is a chance that a messenger does not make it. Only if both generals attack at the very same time can the enemy be defeated. Can we devise a protocol such that the two generals can agree on an attack time? Clearly, general A can send a message to general B asking to, e.g., “attack at 6am”. However, general A cannot be sure that this message will make it, so she asks for a confirmation. The problem is that general B, upon getting the message, cannot be sure that her confirmation will reach general A. If the confirmation message is indeed destroyed, general A cannot distinguish this case from the case where general B did not even get the attack information. So, to be safe, general B herself will ask for a confirmation of her confirmation. Taking again the position of general A, we can similarly derive that she cannot be sure unless she also gets a confirmation of the confirmation of the confirmation. . . To make things worse, other approaches do not seem to work either. In fact it can be shown that this two generals problem cannot be solved; in other words, there is no finite protocol which lets the two generals find consensus! To show this, we need to be a bit more formal:

¹If you don’t fancy the martial tone of this classic example, feel free to think about something else, for instance two friends trying to make plans for dinner over instant messaging software, or two lecturers sharing the teaching load of a course trying to figure out who is in charge of the next lecture.


Definition 16.1 (Consensus). Consider a distributed system with n nodes. Each node i has an input xi. A solution of the consensus problem must guarantee the following:

• Termination: Every non-faulty node eventually decides.

• Agreement: All non-faulty nodes decide on the same value.

• Validity: The decided value must be the input of at least one node.

Remarks:

• The validity condition implies that if all nodes have the same input x, then the nodes need to decide on x. Please note that consensus is not democratic: it may well be that the nodes decide on an input value promoted by a small minority.

• Whether consensus is possible depends on many parameters of the distributed system, in particular whether the system is synchronous or asynchronous, or what “faulty” means. In the following we study some simple variants to get a feeling for the problem.

• Consensus is a powerful primitive. Once consensus is established, almost everything can be computed in a distributed system; e.g., a leader can be elected.

Consider a distributed asynchronous message passing system with n ≥ 2 nodes. All nodes can communicate directly with all other nodes, simply by sending a message; in other words, the communication graph is the complete graph. Can the consensus problem be solved? Yes!

Algorithm 59 Trivial Consensus
1: Each node has an input
2: We have a leader, e.g., the node with the highest ID
3: if node v is the leader then
4:    the leader simply decides on its own input
5: else
6:    send a message to the leader asking for its input
7:    wait for the answer message by the leader, and decide on that value
8: end if

Remarks:

• This algorithm is quite simple, and at first sight seems to work perfectly, as all three consensus conditions of Definition 16.1 are fulfilled.

• However, the algorithm is not fault-tolerant at all. If the leader crashes before being able to answer all requests, there are nodes which will never terminate, and hence violate the termination condition. Is there a deterministic protocol that can achieve consensus in an asynchronous system, even in the presence of failures? Let’s first try something slightly different.

Definition 16.2 (Reliable Broadcast). Consider an asynchronous distributed system with n nodes that may crash. Any two nodes can exchange messages, i.e., the communication graph is complete. We want node v to send a reliable broadcast to the n − 1 other nodes. Reliable means that either nobody receives the message, or everybody receives the message.

Remarks:

• This seems to be quite similar to consensus, right?

• The main problem is that the sender may crash while sending the message to the n − 1 other nodes, such that some of them get the message and the others do not. We need a technique that deals with this case:

Algorithm 60 Reliable Broadcast
1: if node v is the source of message m then
2:    send message m to each of the n − 1 other nodes
3:    upon receiving m from any other node: broadcast succeeded!
4: else
5:    upon receiving message m for the first time:
6:    send message m to each of the n − 1 other nodes
7: end if
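The flood-and-echo idea is easy to simulate. Below is a minimal, purely illustrative Python harness for Algorithm 60; the crash model (a node stops after a given number of sends) and all names are assumptions of this sketch, not part of the pseudocode.

from collections import deque

def reliable_broadcast(n, source, crashes_after=None):
    # crashes_after maps a node to the number of sends it performs before
    # crashing; nodes not in the map never crash.
    crashes_after = crashes_after or {}
    delivered = set()
    sent = {v: 0 for v in range(n)}
    queue = deque()

    def send_all(v):
        for u in range(n):
            if u == v:
                continue
            if v in crashes_after and sent[v] >= crashes_after[v]:
                return  # v crashes mid-broadcast: remaining sends are lost
            sent[v] += 1
            queue.append(u)

    send_all(source)
    while queue:
        u = queue.popleft()
        if u not in delivered:
            delivered.add(u)   # first reception triggers a re-broadcast
            send_all(u)
    return delivered

For instance, reliable_broadcast(5, 0, {0: 2}) crashes the source after two sends, yet the two receivers re-broadcast, so every non-crashed node still delivers the message — exactly the property the theorem below argues.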

Theorem 16.3. Algorithm 60 solves reliable broadcast as in Definition 16.2.

Proof. First we should note that we do not care about nodes that crash during the execution: whether or not they receive the message is irrelevant, since they crashed anyway. If a single non-faulty node u received the message (no matter how; it may have received it through a path of crashed nodes), all non-faulty nodes will receive the message through u. If no non-faulty node receives the message, we are fine as well!

Remarks:

• While it is clear that we could also solve reliable broadcast by means of a consensus protocol (first send the message, then agree on having received it), the opposite seems more tricky!

• No wonder: it cannot be done!! For the presentation of this impossibility result we use the read/write shared memory model introduced in Chapter 6. Not only was the proof originally conceived in the shared memory model, it is also cleaner.

Definition 16.4 (Univalent, Bivalent). A distributed system is called x-valent if the outcome of a computation will be x. An x-valent system is also called univalent. If, depending on the execution, more than one outcome is still feasible, the system is called multivalent. If exactly two outcomes are still possible, the system is called bivalent.

Theorem 16.5. In an asynchronous shared memory system with n > 1 nodes and node crash failures (but no memory failures!), consensus as in Definition 16.1 cannot be achieved by a deterministic algorithm.

Proof. Let us simplify the proof by setting n = 2. We have processes u and v, with input values xu and xv. Further, let the input values be binary, either 0 or 1. First we have to make sure that there are input values such that initially the system is bivalent. If xu = 0 and xv = 0, the system is 0-valent because of the validity condition (Definition 16.1). Even in the case where process v immediately crashes, the system remains 0-valent. Similarly, if both input values are 1 and process u immediately crashes, the system is 1-valent. If xu = 0 and xv = 1 and v immediately crashes, process u cannot distinguish this from both processes having input 0; equivalently, if u immediately crashes, process v cannot distinguish it from both having input 1. Hence the system is bivalent! In order to solve consensus, an algorithm needs to terminate. All non-faulty processes need to decide on the same value x (agreement condition of Definition 16.1); in other words, at some instant this value x must be known to the system as a whole, meaning that no matter what the execution is, the system will be x-valent. In other words, the system needs to change from bivalent to univalent. We may ask ourselves what can cause this change in a deterministic asynchronous shared memory algorithm? We need an element of non-determinism; if everything happened deterministically, the system would have been x-valent already after initialization, which we proved to be impossible.

The only nondeterministic elements in our model are the asynchrony of accessing the memory and crashing processes. Initially and after every memory access, each process decides what to do next: read or write a memory cell, or terminate with a decision. We take control of the scheduling, either choosing which request is served next or making a process crash. Now we hope for a critical bivalent state with more than one memory request, such that depending on which memory request is served next, the system switches from bivalent to univalent. More concretely, if process u is served next, the system is going x-valent; if process v (with v ≠ u) is served next, the system is going y-valent (with y ≠ x). We have several cases:

• If the operations of processes u and v target different memory cells, the processes cannot distinguish which memory request was executed first. Hence the local states of the processes are identical after serving both operations, and the state cannot be critical.

• The same argument holds if both processes want to read the same register. Nobody can distinguish which read was first, and the state cannot be critical.

• If process u reads memory cell c and process v writes memory cell c, the scheduler first executes u’s read. Now process v cannot distinguish whether that read of u did or did not happen before its write. If it did happen, v should decide on x; if it did not happen, v should decide on y. But since v does not know which one is true, it needs to be informed about it by u. We prevent this by making u crash. Thus the state can only become univalent if v never decides, violating the termination condition!

• Also if both processes write the same memory cell we have the same issue, since the second writer will immediately overwrite the first writer, and hence the second writer cannot know whether the first write happened at all. Again, the state cannot be critical.

Hence, if we are unlucky (and in the worst case, we are!), there is no critical state. In other words, the system will remain bivalent forever, and consensus is impossible.

Remarks:

• The proof presented is a variant of a proof by Michael Fischer, Nancy Lynch, and Michael Paterson, a classic result in distributed computing. The proof was motivated by the problem of committing transactions in distributed systems, but is sufficiently general that it directly implies the impossibility of a number of related problems, including consensus. The proof is also pretty robust with regard to different communication models.

• The FLP (Fischer, Lynch, Paterson) paper won the 2001 PODC Influential Paper Award, which was later renamed the Dijkstra Prize.

• One might argue that FLP destroys all the fun in distributed computing, as it makes so many things impossible! For instance, it seems impossible to have a distributed database where the nodes can reach consensus whether to commit a transaction or not.

• So are two-phase-commit (2PC), three-phase-commit (3PC) et al. wrong?! No, not really, but sometimes they just do not commit!

• What about turning some other knobs of the model? Can we have consensus in a message passing system? No. Can we have consensus in synchronous systems? Yes, even if all but one node fails!

• Can we have consensus in synchronous systems even if some nodes are mischievous, behave much worse than simply crashing, and send for example contradicting information to different nodes? This is known as Byzantine behavior. Yes, this is also possible, as long as the Byzantine nodes are strictly less than a third of all the nodes. This was shown by Marshall Pease, Robert Shostak, and Leslie Lamport in 1980. Their work won the 2005 Dijkstra Prize, and is one of the cornerstones not only in distributed computing but also in information security. Indeed, this work was motivated by the “fault-tolerance in planes” example. Pease, Shostak, and Lamport noticed that the computers they were given to implement a fault-tolerant fighter plane at times behaved strangely. Before crashing, these computers would start behaving quite randomly, sending out weird messages. At some point Pease et al. decided that a malicious behavior model would be the most appropriate to be on the safe side. Being able to tolerate only strictly less than a third Byzantine nodes is quite counterintuitive; even today many systems are built with three copies. In light of the result of Pease et al. this is a serious mistake! If you want to be tolerant against a single Byzantine machine, you need four copies, not three!

• Finally, FLP only prohibits deterministic algorithms! So can we solve consensus if we use randomization? The answer again is yes! We will study this in the remainder of this chapter.

16.2 Randomized Consensus

Can we solve consensus if we allow randomization? Yes. The following algorithm solves consensus even in the face of Byzantine errors, i.e., malicious behavior of some of the nodes. To simplify the arguments we assume that at most f nodes fail, with n > 9f, and that we only solve binary consensus, that is, the input values are 0 and 1. The general idea is that nodes try to push their input value; if other nodes do not follow, they will try to push one of the suggested values randomly. The full algorithm is given as Algorithm 61.

Algorithm 61 Randomized Consensus

1: node u starts with input bit xu ∈ {0, 1}, round := 1
2: broadcast BID(xu, round)
3: repeat
4:    wait for n − f BID messages of the current round
5:    if at least n − f messages have value x then
6:      xu := x; decide on x
7:    else if at least n − 2f messages have value x then
8:      xu := x
9:    else
10:     choose xu randomly, with Pr[xu = 0] = Pr[xu = 1] = 1/2
11:   end if
12:   round := round + 1
13:   broadcast BID(xu, round)
14: until decided
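To see the thresholds of Algorithm 61 in action, the following toy Python simulation runs the algorithm synchronously and failure-free; asynchrony is crudely modeled by letting each node sample n − f of the current bids. Byzantine senders are not modeled, so this only illustrates the decision rules, not the fault tolerance.

import random

def randomized_consensus(inputs, f):
    n = len(inputs)
    x = list(inputs)
    decided = [None] * n
    while any(d is None for d in decided):
        bids = list(x)  # bids of the current round
        for u in range(n):
            if decided[u] is not None:
                continue
            received = random.sample(bids, n - f)  # any n-f of the bids
            c0, c1 = received.count(0), received.count(1)
            if c0 >= n - f:                        # line 5/6: decide
                x[u], decided[u] = 0, 0
            elif c1 >= n - f:
                x[u], decided[u] = 1, 1
            elif c0 >= n - 2 * f:                  # line 7/8: adopt value
                x[u] = 0
            elif c1 >= n - 2 * f:
                x[u] = 1
            else:                                  # line 10: flip a coin
                x[u] = random.randint(0, 1)
    return decided

# Example: ten nodes, tolerating one failure (n > 9f).
print(randomized_consensus([0, 1, 1, 1, 1, 1, 1, 1, 1, 1], f=1))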

Theorem 16.6. Algorithm 61 solves consensus as in Definition 16.1 even if up to f < n/9 nodes exhibit Byzantine failures.

Proof. First note that it is not a problem to wait for n − f BID messages in line 4, since at most f nodes are corrupt. If all nodes have the same input value x, then all (except the f Byzantine nodes) will bid for the same value x. Thus, every node receives at least n − 2f BID messages containing x, deciding on x in the first round already. We have consensus!

If the nodes have different (binary) input values, the validity condition becomes trivial, as any result is fine. What about agreement? Let u be one of the first nodes to decide on value x (in line 6). It may happen that due to asynchronicity another node v received messages from a different subset of the nodes; however, at most f senders may be different. Taking into account that Byzantine nodes may lie, i.e., send different BIDs to different nodes, f additional BID messages received by v may differ from those received by u. Since node u had at least n − 2f BID messages with value x, node v has at least n − 4f BID messages with x. Hence every correct node will bid for x in the next round, and then decide on x.

So we only need to worry about termination! We have already seen that as soon as one correct node terminates (in line 6), everybody terminates in the next round. So what are the chances that some node u terminates in line 6? Well, if push comes to shove, we can still hope that all correct nodes randomly propose the same value (in line 10). Maybe there are some nodes not choosing at random (i.e., entering line 8), but they unanimously propose either 0 or 1: For the sake of contradiction, assume that both 0 and 1 are proposed in line 8. This means that both 0 and 1 had been proposed by at least n − 5f correct nodes. In other words, we have a total of 2(n − 5f) + f = n + (n − 9f) > n nodes. Contradiction! Thus, at worst all n − f correct nodes need to randomly choose the same bit, which happens with probability 2^{−(n−f)}. If so, all will send the same BID, and the algorithm terminates. So the expected running time is smaller than 2^n.

Remarks:

• The presentation of Algorithm 61 is a simplification of the typical presentation in textbooks.

• What about an algorithm that allows for crashes only, but can manage more failures? Good news! Slightly changing the presented algorithm will do that for f < n/4! See exercises.

• Unfortunately, Algorithm 61 is still impractical, as termination is awfully slow. In expectation, about the same number of nodes choose 1 as choose 0 in line 10. Termination would be much more efficient if all nodes chose the same random value in line 10! So why not simply replace line 10 with “choose xu := 1”?!? The problem is that a majority of nodes may see a majority of 0 bids, hence proposing 0 in the next round. Without randomization it is impossible to get out of this equilibrium. (Moreover, this approach would be deterministic, contradicting Theorem 16.5.)

• The idea is to replace line 10 with a subroutine in which all nodes compute a so-called shared (or common, or global) coin. A shared coin is a random variable that is 0 with constant probability and 1 with constant probability. Sounds like magic, but it isn’t! We assume that at most f < n/3 nodes may crash:

Algorithm 62 Shared Coin (code for node u)

1: set local coin xu := 0 with probability 1/n, else xu := 1
2: use reliable broadcast to tell everybody about your local coin xu
3: memorize all coins you get from others in the set cu
4: wait for exactly n − f coins
5: copy these coins into your local set su (but keep learning coins)
6: use reliable broadcast to tell everybody about your set su
7: wait for exactly n − f sets sv (which satisfy sv ⊆ cu)
8: if seen at least a single coin 0 then
9:    return 0
10: else
11:   return 1
12: end if
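The probability calculation in the upcoming proof can be illustrated with a crash-free toy version of the shared coin, in which every local coin is seen by everybody; the constants below therefore differ from the crash-prone setting, and the whole snippet is illustrative only.

import random

def shared_coin(n):
    # Crash-free view of Algorithm 62: all n local coins are visible, and a
    # node returns 0 iff it sees at least one 0.
    coins = [0 if random.random() < 1.0 / n else 1 for _ in range(n)]
    return 0 if 0 in coins else 1

# The coin is 1 with probability (1 - 1/n)^n, which approaches 1/e for
# large n, so both outcomes occur with constant probability.
n, trials = 50, 10000
print(sum(shared_coin(n) for _ in range(trials)) / trials)  # roughly 0.37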

Theorem 16.7. If f < n/3 nodes crash, Algorithm 62 implements a shared coin.

Proof. Since only f nodes may crash, each node sees at least n − f coins and sets in lines 4 and 7, respectively. Thanks to the reliable broadcast protocol each node eventually sees all the coins in the other sets. In other words, the algorithm terminates in O(1) time. The general idea is that a third of the coins are being seen by everybody. If there is a 0 among these coins, everybody will see that 0. If not, chances are high that there is no 0 at all! Here are the details: Let u be the first node to terminate (satisfy line 7). For u we draw a matrix of all the seen sets sv (columns) and all coins cu seen by node u (rows). Here is an example with n = 7, f = 2, n − f = 5:

      s1   s3   s5   s6   s7
c1    X    X    X    X    X
c2              X    X    X
c3    X    X    X    X    X
c5    X    X         X    X
c6    X    X    X         X
c7    X    X    X    X

Note that there are exactly (n − f)² X’s in this matrix, as node u has seen exactly n − f sets (line 7), each containing exactly n − f coins (lines 4 to 6). We need two little helper lemmas:

Lemma 16.8. There are at least f + 1 rows that have at least f + 1 X’s.

Proof. Assume (for the sake of contradiction) that this is not the case. Then at most f rows have more than f X’s (at most n − f X’s each), and all other rows (at most n − f of them) have at most f X’s. In other words, the total number of X’s is bounded by

|X| ≤ f · (n − f) + (n − f) · f = 2f(n − f).

Using n > 3f we get n − f > 2f, and hence |X| ≤ 2f(n − f) < (n − f)². This is a contradiction to having exactly (n − f)² X’s!

Lemma 16.9. Let W be the set of local coins for which the corresponding matrix row has more than f X’s. All local coins in the set W are seen by all nodes that terminate.

Proof. Let w ∈ W be such a local coin. By definition of W we know that w is in at least f + 1 seen sets. Since each node must see at least n − f seen sets before terminating, each node has seen at least one of these sets, and hence w is seen by everybody terminating.

Continuing the proof of Theorem 16.7: With probability (1 − 1/n)^n ≈ 1/e ≈ 0.37, all nodes choose their local coin equal to 1, and 1 is decided. With probability 1 − (1 − 1/n)^{|W|} there is at least one 0 in W. With Lemma 16.8 we know that |W| ≈ n/3, hence this probability is about 1 − (1 − 1/n)^{n/3} ≈ 1 − (1/e)^{1/3} ≈ 0.28. With Lemma 16.9 this 0 is seen by all, and hence everybody will decide 0. So indeed we have a shared coin.

Theorem 16.10. Plugging Algorithm 62 into Algorithm 61, we get a randomized consensus algorithm which finishes in a constant expected number of rounds.

Remarks:

• If some nodes go into line 8 of Algorithm 61, the others still have a constant probability to guess the same shared coin.

• For crash failures there exists an improved constant expected time algorithm which tolerates f failures with 2f < n.

• For Byzantine failures there exists a constant expected time algorithm which tolerates f failures with 3f < n.

• Similar algorithms have been proposed for the shared memory model.

Chapter 17

Multi-Core Computing

This chapter is based on the article “Distributed Computing and the Multicore Revolution” by Maurice Herlihy and Victor Luchangco. Thanks!

17.1 Introduction

In the near future, nearly all computers, ranging from supercomputers to cell phones, will be multiprocessors. It is harder and harder to increase processor clock speed (the chips overheat), but easier and easier to cram more processor cores onto a chip (thanks to Moore’s Law). As a result, uniprocessors are giving way to dual-cores, dual-cores to quad-cores, and so on. However, there is a problem: Except for “embarrassingly parallel” applications, no one really knows how to exploit lots of cores.

17.1.1 The Current State of Concurrent Programming

In today’s programming practice, programmers typically rely on combinations of locks and conditions, such as monitors, to prevent concurrent access by different threads to the same shared data. While this approach allows programmers to treat sections of code as “atomic”, and thus simplifies reasoning about interactions, it suffers from a number of severe shortcomings.

• Programmers must decide between coarse-grained locking, in which a large data structure is protected by a single lock (usually implemented using operations such as test-and-set or compare-and-swap (CAS)), and fine-grained locking, in which a lock is associated with each component of the data structure. Coarse-grained locking is simple, but permits little or no concurrency, thereby preventing the program from exploiting multiple processing cores. By contrast, fine-grained locking is substantially more complicated because of the need to ensure that threads acquire all necessary locks (and only those, for good performance), and because of the need to avoid deadlocks when acquiring multiple locks. The decision is further complicated by the fact that the best engineering solution may be


Algorithm Move(Element e, Table from, Table to)

1: if from.find(e) then
2:    to.insert(e)
3:    from.delete(e)
4: end if

platform-dependent, varying with different machine sizes, workloads, and so on, making it difficult to write code that is both scalable and portable.

• Conventional locking provides poor support for code composition and reuse. For example, consider a lock-based hash table that provides atomic insert and delete methods. Ideally, it should be easy to move an element atomically from one table to another, but this kind of composition simply does not work. If the table methods synchronize internally, then there is no way to acquire and hold both locks simultaneously. If the tables export their locks, then modularity and safety are compromised. For a concrete example, assume we have two hash tables T1 and T2 storing integers and using internal locks only. Every number is inserted into a table only if it is not already present, i.e., multiple occurrences are not permitted. We want to atomically move elements between the tables, using two threads running Algorithm Move. (With external locks we must pay attention to avoid deadlocks etc.; see the sketch after this list.) Initially, T1 contains 1 and T2 is empty.

Time   Thread 1: Move(1, T1, T2)   Thread 2: Move(1, T2, T1)
1      T1.find(1)                  (delayed)
2      T2.insert(1)
3      (delayed)                   T2.find(1)
4                                  T1.insert(1)
5      T1.delete(1)                T2.delete(1)

Afterwards, both T1 and T2 are empty: the element has been lost.

• Such basic issues as the mapping from locks to data, that is, which locks protect which data, and the order in which locks must be acquired and released, are all based on convention, and violations are notoriously difficult to detect and debug. For these and other reasons, today’s software practices make lock-based concurrent programs (too) difficult to develop, debug, understand, and maintain.
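To make the composition discussion concrete, here is a minimal Python sketch of an atomic Move built from external locks; acquiring the two table locks in a fixed global order avoids the deadlock issue mentioned above. The class and function names are illustrative, not from the article.

import threading

class LockedTable:
    # A table protected by a single, externally visible lock.
    def __init__(self):
        self.data = set()
        self.lock = threading.Lock()

def move(e, frm, to):
    # Acquire both locks in a fixed global order (here: by object id) so
    # that two concurrent moves in opposite directions cannot deadlock;
    # holding both locks makes find/insert/delete atomic as a unit.
    first, second = sorted((frm, to), key=id)
    with first.lock:
        with second.lock:
            if e in frm.data:
                to.data.add(e)
                frm.data.remove(e)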

The research community has addressed this issue for more than fifteen years by developing nonblocking algorithms for stacks, queues and other data structures. These algorithms are subtle and difficult. For example, the pseudo code of a delete operation for a (non-blocking) linked list, recently presented at a conference, contains more than 30 lines of code, whereas a delete procedure for a (non-concurrent, used only by one thread) linked list can be written with 5 lines of code.

17.2 Transactional Memory

Recently the transactional memory programming paradigm has gained momentum as an alternative to locks in concurrent programming. Rather than using locks to give the illusion of atomicity by preventing concurrent access to shared data, with transactional memory programmers designate regions of code as transactions, and the system guarantees that such code appears to execute atomically. A transaction that cannot complete is aborted—its effects are discarded—and may be retried. Transactions have been used to build large, complex and reliable database systems for over thirty years; with transactional memory, researchers hope to translate that success to multiprocessor systems. The underlying system may use locks or nonblocking algorithms to implement transactions, but the complexity is hidden from the application programmer. Proposals exist for implementing transactional memory in hardware, in software, and in schemes that mix hardware and software. This area is growing at a fast pace. More formally, a transaction is defined as follows:

Definition 17.1. A transaction in transactional memory is characterized by three properties (ACI):

• Atomicity: Either a transaction finishes all its operations or no operation has an effect on the system.

• Consistency: All objects are in a valid state before and after the transaction.

• Isolation: Other transactions cannot access or see data in an intermediate (possibly invalid) state of any parallel running transaction.
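The atomicity property can be illustrated with a deliberately simplified sketch: a transaction buffers its writes privately and publishes them only at commit, so an abort discards all effects. This toy is single-threaded and uses a single global version counter; real transactional memory systems track read/write sets and handle true concurrency, so treat everything below as an assumption-laden illustration.

class Aborted(Exception):
    pass

class ToyTM:
    # Deferred-update sketch: writes are buffered and applied atomically.
    def __init__(self):
        self.mem = {}
        self.version = 0   # bumped on every commit

    def atomically(self, body):
        while True:
            start, writes = self.version, {}
            read = lambda k: writes.get(k, self.mem.get(k))
            write = lambda k, v: writes.__setitem__(k, v)
            try:
                body(read, write)
                if self.version != start:   # somebody committed meanwhile
                    raise Aborted()         # discard effects and retry
                self.mem.update(writes)     # commit point: all-or-nothing
                self.version += 1
                return
            except Aborted:
                continue

tm = ToyTM()
tm.atomically(lambda read, write: write("x", (read("x") or 0) + 1))
print(tm.mem["x"])  # 1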

Remarks:

• For database transactions there exists a fourth property called durability: if a transaction has completed, its changes are permanent, i.e., even if the system crashes, the changes can be recovered. In principle, it would be feasible to demand the same thing for transactional memory; however, this would mean that we had to use slow hard discs instead of fast DRAM chips...

• Although transactional memory is a promising approach for concurrent programming, it is not a panacea, and in any case, transactional programs will need to interact with other (legacy) code, which may use locks or other means to control concurrency.

• One major challenge for the adoption of transactional memory is that it has no universally accepted specification. It is not yet clear how interaction with I/O and system calls should be dealt with. For instance, imagine you print a news article, and the printer job is part of a transaction. After printing half the page, the transaction gets aborted. Thus the work (printing) is lost. Clearly, this behavior is not acceptable.

• From a theory perspective we also face a number of open problems. For example:

– System model: An abstract model for a (shared-memory) multiprocessor is needed that properly accounts for performance. In the 80s, the PRAM model became a standard model for parallel computation, and the research community developed many elegant parallel algorithms for this model. Unfortunately, PRAM assumes that processors are synchronous and that memory can be accessed only by read and write operations. Modern computer architectures are asynchronous and provide additional operations such as test-and-set. Also, PRAM did not model the effects of contention nor the performance implications of multilevel caching, assuming instead a flat memory with uniform-cost access. More realistic models have been proposed to account for the costs of interprocess communication, but these models still assume synchronous processors with only read and write access to memory.

– How to resolve conflicts? Many transactional memory implementations “optimistically” execute transactions in parallel. Conflicts between two transactions intending to modify the same memory at the same time are resolved by a contention manager. A contention manager decides whether a transaction continues, waits or is aborted. The contention management policy of a transactional memory implementation can have a profound effect on its performance, and even on its progress guarantees.

17.3 Contention Management

After the previous introduction of transactional memory, we look at different aspects of contention management from a theoretical perspective. We start with a description of the model. We are given a set of transactions S := {T1, ..., Tn} sharing up to s resources (such as memory cells) that are executed on n threads. Each thread runs on a separate processor/core P1, ..., Pn. For simplicity, each transaction T consists of a sequence of tT operations. An operation requires one time unit and can be a write access of a resource R or some arbitrary computation.¹ To perform a write, the written resource must be acquired exclusively (i.e., locked) before the access. Additionally, a transaction must store the original value of a written resource. Only one transaction can lock a resource at a time. If a transaction A attempts to acquire a resource locked by B, then A and B face a conflict. If multiple transactions concurrently attempt to acquire an unlocked resource, an arbitrary transaction A will get the resource and the others face a conflict with A. A contention manager decides how to resolve a conflict. Contention managers operate in a distributed fashion, that is to say, a separate instance of a contention manager is available for every thread and these instances operate independently. Contention managers can make a transaction wait (arbitrarily long) or abort it. An aborted transaction undoes all its changes to resources and frees all locks before restarting. Freeing locks and undoing the changes can be done with one operation. A successful transaction finishes with a commit and simply frees all locks. A contention manager is unaware of (potential) future conflicts of a transaction. The required resources might also change at any time. The quality of a contention manager is characterized by different properties:

¹Reads are of course also possible, but are not critical because they do not attempt to modify data.

• Throughput: How long does it take until all transactions have committed? How good is our algorithm compared to an optimal one?

Definition 17.2. The makespan of the set S of transactions is the time interval from the start of the first transaction until all transactions have committed.

Definition 17.3. The competitive ratio is the ratio of the makespan of the algorithm under analysis to the makespan of an optimal algorithm.

• Progress guarantees: Is the system deadlock-free? Does every transaction commit in finite time?

Definition 17.4. We look at three levels of progress guarantees:

– wait freedom (strongest guarantee): all threads make progress in a finite number of steps – lock freedom: one thread makes progress in a finite number of steps – obstruction freedom (weakest): one thread makes progress in a finite number of steps in absence of contention (no other threads compete for the same resources)

Remarks:

• For the analysis we assume an oblivious adversary. It knows the algorithm to analyze and chooses/modifies the operations of transactions arbitrarily. However, the adversary does not know the random choices (of a randomized algorithm). The optimal algorithm knows all decisions of the adversary, i.e., first the adversary must fix what the transactions look like, and then the optimal algorithm, having full knowledge of all transactions, computes an (optimal) schedule.

• Wait freedom implies lock freedom. Lock freedom implies obstruction freedom.

• Here is an example to illustrate how the needed resources change over time: Consider a dynamic data structure such as a balanced tree. If a transaction attempts to insert an element, it must modify a (parent) node and maybe it also has to do some rotations to rebalance the tree. Depending on the elements of the tree, which change over time, it might modify different objects. For a concrete example, assume that the root node of a binary tree has value 4 and the root has a (left) child of value 2. If a transaction A inserts value 5, it must modify the pointer to the right child of the root node with value 4. Thus it locks the root node. If A gets aborted by a transaction B, which deletes the node with value 4 and commits, it will attempt to lock the new root node with value 2 after its restart.

• There are also systems where resources are not locked exclusively. All we need is a correct serialization (analogous to transactions in database systems). Thus a transaction might speculatively use the current value of a resource, modified by an uncommitted transaction. However, these systems must track dependencies to ensure the ACI properties of a transaction (see Definition 17.1). For instance, assume a transaction T1 increments variable x from 1 to 2. Then transaction T2 might access x and assume its correct value is 2. If T1 commits, everything is fine and the ACI properties are ensured, but if T1 aborts, T2 must abort too, since otherwise the atomicity property would be violated.

• In practice, the number of concurrent transactions might be much larger than the number of processors. However, performance may decrease with an increasing number of threads, since time is wasted switching between threads. Thus, in practice, load adaption schemes have been suggested to limit the number of concurrent transactions close to (or even below) the number of cores.

• In the analysis, we will assume that the number of operations is fixed for each transaction. However, the execution time of a transaction (in the absence of contention) might also change, e.g., if data structures shrink, fewer elements have to be considered. Nevertheless, often the changes are not substantial, i.e., they only involve a constant factor. Furthermore, if an adversary can modify the duration of a transaction arbitrarily during the execution of a transaction, then any algorithm must make the exact same choices as an optimal algorithm: Assume two transactions T0 and T1 face a conflict and an algorithm Alg decides to let T0 wait (or abort). The adversary could make the opposite decision and let T0 proceed such that it commits at time t0. Then it sets the execution time of T0 to infinity, i.e., t_{T0} := ∞, after time t0. Thus, the makespan of the schedule of algorithm Alg is unbounded, though there exists a schedule with bounded makespan. Thus the competitive ratio is unbounded.

Problem complexity

In graph theory, coloring a graph with as few colors as possible is known to be a hard problem. A (vertex) coloring assigns a color to each vertex of a graph such that no two adjacent vertices share the same color. It was shown that computing an optimal coloring, given complete knowledge of the graph, is NP-hard. Even worse, computing an approximation within a factor of χ(G)^{log χ(G)/25}, where χ(G) is the minimal number of colors needed to color the graph, is NP-hard as well. To keep things simple, we assume for the following theorem that resource acquisition takes no time, i.e., as long as there are no conflicts, transactions get all locks they wish for at once. In this case, there is an immediate connection to graph coloring, showing that even an offline version of contention management, where all potential conflicts are known and do not change over time, is extremely hard to solve.

Theorem 17.5. If the optimal schedule has makespan k and resource acquisition takes zero time, it is NP-hard to compute a schedule of makespan less than k^{log k/25}, even if all conflicts are known and transactions do not change their resource requirements.

Proof. We will prove the claim by showing that any algorithm finding a schedule of makespan k′ < k^{(log k)/25} could be utilized to approximate the chromatic number of any graph better than χ(G)^{(log χ(G))/25}. Given the graph G = (V, E), define V to be the set of transactions and E to be the set of resources. Each transaction (node) v ∈ V needs to acquire a lock on all its resources (edges) {v, w} ∈ E, and then computes something for exactly one round. Obviously, this “translation” of a graph into our scheduling problem does not require any computation at all. Now, if we knew a χ(G)-coloring of G, we could simply use the fact that the nodes sharing one color form an independent set, and execute all transactions of a single color in parallel and the colors sequentially. Since no two neighbors are in the same independent set and resources are edges, all conflicts are resolved. Consequently, the makespan k is at most χ(G). On the other hand, the makespan k must be at least χ(G): Since each transaction (i.e., node) locks all required resources (i.e., adjacent edges) for at least one round, no schedule can do better than serve a (maximum) independent set in parallel while all other transactions wait. However, by definition of the chromatic number χ(G), V cannot be split into fewer than χ(G) independent sets, meaning that k ≥ χ(G). Therefore k = χ(G). In other words, if we could compute a schedule using k′ < k^{(log k)/25} rounds in polynomial time, we would know that

χ(G) = k ≤ k′ < k^{(log k)/25} = χ(G)^{(log χ(G))/25}.

Remarks:

• The theorem holds for a central contention manager, knowing all transactions and all potential conflicts. Clearly, the online problem, where conflicts remain unknown until they occur, is even harder. Furthermore, the distributed nature of contention managers makes the problem even more difficult.

• If resource acquisition does not take zero time, the connection between the problems is not a direct equivalence. However, the same proof technique shows that it is NP-hard to compute a polynomial approximation, i.e., k′ ≤ k^c for some constant c ≥ 1.

Deterministic contention managers

Theorem 17.5 showed that even if all conflicts are known, one cannot produce schedules whose makespan gets close to the optimal one without a lot of computation. However, we aim to construct contention managers that make their decisions quickly, without knowing conflicts in advance. Let us look at a couple of contention managers and investigate their throughput and progress guarantees.

• A first naive contention manager: Be aggressive! Always abort the transaction holding the lock on the contested resource. Analysis: The throughput might be zero, since a livelock is possible; but the system is still obstruction free. Consider two transactions consisting of three operations each, where the first operation of both is a write to the same resource R. If they start concurrently, they will abort each other infinitely often. (A small decision-function sketch of this policy and the next one follows after the remarks below.)

• A smarter contention manager: Approximate the work done. Assume that before a start (and also before a restart after an abort) a transaction gets a unique timestamp. The older transaction, which is believed to have already performed more work, should win the conflict. Analysis: Clearly, the oldest transaction will always run until commit without interruption. Thus we have lock-freedom, since at least one transaction makes progress at any time. In other words, at least one processor is always busy executing a transaction until its commit; in the worst case, therefore, all transactions are executed sequentially. How about the competitive ratio? We have s resources and n transactions starting at the same time. For simplicity, assume every transaction Ti needs to lock at

least one resource for a constant fraction c of its execution time t_{Ti}. Thus, at most s transactions can run concurrently from start until commit without (possibly) facing a conflict (if s + 1 transactions run at the same time, at least two of them lock the same resource). Thus, the makespan of an optimal contention manager is at least ∑_{i=1}^{n} c · t_{Ti}/s. The makespan of our timestamping algorithm is at most the duration of a sequential execution, i.e., the sum of the lengths of all transactions, ∑_{i=1}^{n} t_{Ti}. The competitive ratio is

(∑_{i=1}^{n} t_{Ti}) / (∑_{i=1}^{n} c · t_{Ti}/s) = s/c = O(s).

Remarks:

– Unfortunately, in most relevant cases the number of resources is larger than the number of cores, i.e., s > n. Thus, our timestamping algorithm only guarantees sequential execution, whereas an optimal contention manager might execute all transactions in parallel.
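The two contention managers discussed above can be phrased as pure decision functions; the minimal Python sketch below (with an assumed, illustrative transaction record) returns the transaction that must abort in a conflict.

from dataclasses import dataclass

@dataclass
class Txn:
    id: int
    timestamp: int   # assigned at (re)start; smaller = older

def aggressive(holder, requester):
    # "Be aggressive": always abort the lock holder -- livelock possible.
    return holder

def timestamp_policy(holder, requester):
    # The older transaction wins; the younger one is aborted. The oldest
    # transaction therefore always runs to commit (lock freedom).
    return holder if holder.timestamp > requester.timestamp else requester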

Are there contention managers that guarantee more than sequential execution if a lot of parallelism is possible? If we have a powerful adversary that can change the required resources after an abort, the analysis is tight. Though we restrict ourselves to deterministic algorithms here, the theorem also holds for randomized contention managers.

Theorem 17.6. Suppose n transactions start at the same time and the adversary is allowed to alter the resource requirement of any transaction (only) after an abort. Then the competitive ratio of any deterministic contention manager is Ω(n).

Proof. Assume we have n resources. Suppose all transactions consist of two operations, such that conflicts arise which force the contention manager to

abort one of the two transactions T2i−1, T2i for every i ≤ n/2. More precisely, transaction T2i−1 writes to resource R2i−1 first and to R2i afterwards, while transaction T2i writes to resource R2i first and to R2i−1 afterwards. Clearly, any contention manager has to abort n/2 transactions. Now the adversary tells each transaction that did not finish to adjust its resource requirements and to write to resource R0 as its first operation. Thus, for any deterministic contention manager the n/2 aborted transactions must execute sequentially, and the makespan of the algorithm becomes Ω(n). The optimal strategy first schedules all transactions that were aborted and in turn aborts the others. Since the now aborted transactions do not change their resource requirements, they can be scheduled in parallel. Hence the optimal makespan is 4, yielding a competitive ratio of Ω(n).

Remarks:

• The proof can be generalized to show that the ratio is Ω(s) if s resources are present, matching the previous upper bound.

• But what if the adversary is not so powerful, i.e., a transaction has a fixed set of needed resources? The analysis of the timestamp algorithm is still tight. Consider the dining philosophers problem: Suppose all transactions have length n, and transaction i requires its first resource Ri at time 1 and its second resource Ri+1 at time n − i (except Tn, which only needs Rn). Thus, each transaction Ti potentially conflicts with transactions Ti−1 and Ti+1. Let transaction i have the i-th oldest timestamp. At time n − i, transaction i + 1 (for i ≥ 1) will get aborted by transaction i, and only transaction 1 will commit at time n. After every abort, transaction i restarts 1 time unit before transaction i − 1. Since transaction i − 1 acquires its second resource i − 1 time units before its termination, transaction i − 1 will abort transaction i at least i − 1 times. After i − 1 aborts, transaction i may commit. The total time until the algorithm is done is bounded by the time transaction n stays in the system, i.e., at least ∑_{i=1}^{n} (n − i) = Ω(n²). An optimal schedule requires only O(n) time: first schedule all transactions with even indices, then the ones with odd indices.

• Let us try to approximate the work done differently: the transaction which has performed more work should win the conflict. A transaction counts the number of accessed resources, starting from 0 after every restart. The transaction which has acquired more resources wins the conflict. In case both have accessed the same number of resources, the transaction holding the lock on the resource may proceed, and the other has to wait. Analysis: A deadlock is possible: transactions A and B start concurrently; transaction A writes to R1 as its first operation and to R2 as its second operation, while transaction B writes to the resources in the opposite order.

Randomized contention managers

Though the lower bound of the previous section (Theorem 17.6) is valid for both deterministic and randomized schemes, let us look at a randomized approach:

Each transaction chooses a random priority in [1, n]. In case of a conflict, the transaction with lower priority gets aborted. (If both conflicting transactions have the same priority, both abort.) Additionally, if a transaction A was aborted by transaction B, it waits until transaction B has committed or aborted; then transaction A restarts and draws a new priority. Analysis: Assume that the adversary cannot change the resource requirements; otherwise we cannot show more than a competitive ratio of n, see Theorem 17.6. Assume that if two transactions A and B (potentially) conflict (i.e., write to the same resource), then they require the resource for at least a fraction c of their running time. We assume a transaction T potentially conflicts with dT other transactions. Therefore, if a transaction has the highest priority among these dT transactions, it will abort all others and commit successfully. The chance that no transaction conflicting with T chooses the same random number is (1 − 1/n)^{dT} > (1 − 1/n)^n ≈ 1/e. The chance that a transaction chooses the largest random number and no other transaction chose this number is thus at least 1/dT · 1/e. Thus, for any constant c ≥ 1, after choosing e · dT · c · ln n random numbers the chance that transaction T has committed successfully is

1 − (1 − 1/(e · dT))^{e·dT·c·ln n} ≈ 1 − e^{−c·ln n} = 1 − 1/n^c.

Assuming that the longest transaction takes time tmax, within that time a transaction either commits, or aborts and chooses a new random number. The time to choose e · dT · c · ln n numbers is thus at most e · tmax · dT · c · ln n = O(tmax · dT · ln n). Therefore, with high probability each transaction makes progress within a finite amount of time, i.e., our algorithm ensures wait freedom. Furthermore, the competitive ratio of our randomized contention manager for the previously considered dining philosophers problem is w.h.p. only O(ln n), since any transaction only conflicts with two other transactions.

Chapter 18

Dominating Set

In this chapter we present another randomized algorithm that demonstrates the power of randomization to break symmetries. We study the problem of finding a small dominating set of the network graph. As is the case for the MIS, an efficient dominating set algorithm can be used as a basic building block to solve a number of problems in distributed computing. For example, whenever we need to partition the network into a small number of local clusters, the computation of a small dominating set usually occurs in some way. A particularly important application of dominating sets is the construction of an efficient backbone for routing.

Definition 18.1 (Dominating Set). Given an undirected graph G = (V,E), a dominating set is a subset S ⊆ V of its nodes such that for all nodes v ∈ V , either v ∈ S or a neighbor u of v is in S.

Remarks:

• It is well-known that computing a dominating set of minimal size is NP-hard. We therefore look for approximation algorithms, that is, algorithms that produce solutions which are optimal up to a certain factor.

• Note that every MIS (cf. Chapter 8) is a dominating set. In general, however, the size of an MIS can be larger than the size of a minimum dominating set by a factor of Ω(n). As an example, connect the centers of two stars by an edge. Every MIS contains all the leaves of at least one of the two stars, whereas there is a dominating set of size 2.

All the dominating set algorithms that we study throughout this chapter operate in the following way. We start with S = ∅ and add nodes to S until S is a dominating set. To simplify the presentation, we color nodes according to their state during the execution of an algorithm. We call nodes in S black, nodes which are covered (neighbors of nodes in S) gray, and all uncovered nodes white. By W(v), we denote the set of white nodes among the direct neighbors of v, including v itself. We call w(v) = |W(v)| the span of v.
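For the algorithms that follow, it is handy to fix a concrete graph representation. The small Python sketch below, which assumes graphs given as adjacency dictionaries of neighbor sets (an assumption of this sketch, not of the text), checks Definition 18.1 and computes the span w(v).

def is_dominating_set(graph, S):
    # Definition 18.1: every node is in S or has a neighbor in S.
    # graph: {node: set_of_neighbors}, S: a set of nodes.
    return all(v in S or graph[v] & S for v in graph)

def span(graph, v, white):
    # w(v) = |W(v)|: white nodes among N(v) and v itself.
    return len(({v} | graph[v]) & white)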


18.1 Sequential Greedy Algorithm

Intuitively, to end up with a small dominating set S, nodes in S need to cover as many neighbors as possible. It is therefore natural to add nodes v with a large span w(v) to S. This idea leads to a simple greedy algorithm:

Algorithm 63 Greedy Algorithm
1: S := ∅;
2: while ∃ white nodes do
3:    choose v ∈ {v | w(v) = max_u {w(u)}};
4:    S := S ∪ {v};
5: end while
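Here is a direct Python transcription of the greedy algorithm, using the same adjacency-dict representation as above (again an assumption of the sketch):

def greedy_dominating_set(graph):
    S, white = set(), set(graph)
    while white:
        # pick a node of maximal span (ties broken arbitrarily)
        v = max(graph, key=lambda u: len(({u} | graph[u]) & white))
        S.add(v)
        white -= {v} | graph[v]   # v and its neighbors are now covered
    return S

# Example: a star with center 0 -- greedy picks the center.
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
print(greedy_dominating_set(star))  # {0}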

Theorem 18.2. The Greedy Algorithm computes a (ln ∆ + 2)-approximation, that is, for the computed dominating set S and an optimal dominating set S∗, we have

|S|/|S∗| ≤ ln ∆ + 2.

Proof. Each time we choose a new node for the dominating set (each greedy step), we have cost 1. Instead of letting this node pay the whole cost, we distribute the cost equally among all newly covered nodes. Assume that node v, chosen in line 3 of the algorithm, is white itself and that its white neighbors are v1, v2, v3, and v4. In this case each of the 5 nodes v and v1, . . . , v4 gets charged 1/5. If v is chosen as a gray node, only the nodes v1, . . . , v4 get charged (they all get 1/4). Now, assume that we know an optimal dominating set S∗. By the definition of dominating sets, we can assign to each node which is not in S∗ a neighbor from S∗. By assigning each node to exactly one neighboring node of S∗, the graph is decomposed into stars, each having a dominator (node in S∗) as center and non-dominators as leaves. Clearly, the cost of an optimal dominating set is 1 for each such star. In the following, we show that the amortized cost (distributed cost) of the greedy algorithm is at most ln ∆ + 2 for each star. This suffices to prove the theorem. Consider a single star with center v∗ ∈ S∗ before choosing a new node u in the greedy algorithm. The number of nodes that become dominated when adding u to the dominating set is w(u). Thus, if some white node v in the star of v∗ becomes gray or black, it gets charged 1/w(u). By the greedy condition, u is a node with maximal span and therefore w(u) ≥ w(v∗). Thus, v is charged at most 1/w(v∗). After becoming gray, nodes do not get charged any more. Therefore, the first node that is covered in the star of v∗ gets charged at most 1/(d(v∗) + 1). Because w(v∗) ≥ d(v∗) when the second node is covered, the second node gets charged at most 1/d(v∗). In general, the i-th node that is covered in the star of v∗ gets charged at most 1/(d(v∗) + 2 − i). Thus, the total amortized cost in the star of v∗ is at most

1/(d(v∗) + 1) + 1/d(v∗) + · · · + 1/2 + 1/1 = H(d(v∗) + 1) ≤ H(∆ + 1) < ln ∆ + 2,

where ∆ is the maximal degree of G and H(n) = ∑_{i=1}^{n} 1/i is the n-th harmonic number.

Remarks:

• One can show that unless NP ⊆ DTIME(n^{O(log log n)}), no polynomial-time algorithm can approximate the minimum dominating set problem better than (1 − o(1)) · ln ∆. Thus, unless P ≈ NP, the approximation ratio of the simple greedy algorithm is optimal (up to lower order terms).

18.2 Distributed Greedy Algorithm

For a distributed algorithm, we use the following observation. The span of a node can only be reduced if a node at distance at most 2 from it is included in the dominating set. Therefore, if the span of node v is greater than the span of any other node at distance at most 2 from v, the greedy algorithm chooses v before any of the nodes at distance at most 2. This leads to a very simple distributed version of the greedy algorithm, in which every node v executes the following code.

Algorithm 64 Distributed Greedy Algorithm (at node v):
1: while v has white neighbors do
2:    compute span w(v);
3:    send w(v) to nodes at distance at most 2;
4:    if w(v) is largest within distance 2 (ties are broken by IDs) then
5:      join dominating set
6:    end if
7: end while
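The following round-based Python simulation of Algorithm 64 computes the distance-2 comparison centrally for brevity; node IDs are assumed to be comparable (ties go to the larger ID). It also returns the number of while-loop iterations, which makes the slow sequential behavior discussed below easy to observe on concrete graphs.

def distributed_greedy(graph):
    S, white = set(), set(graph)

    def span(v):
        return len(({v} | graph[v]) & white)

    def within2(v):
        return {v} | graph[v] | {w for u in graph[v] for w in graph[u]}

    rounds = 0
    while white:
        # all nodes whose (span, ID) is maximal within distance 2 join
        joiners = [v for v in graph if span(v) > 0 and
                   all((span(v), v) >= (span(u), u) for u in within2(v))]
        for v in joiners:
            S.add(v)
        for v in joiners:
            white -= {v} | graph[v]
        rounds += 1
    return S, rounds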

Theorem 18.3. Algorithm 64 computes a dominating set of size at most ln ∆+2 times the size of an optimal dominating set in O(n) rounds.

Proof. The approximation quality follows directly from the above observation and the analysis of the greedy algorithm. The time complexity is at most linear because in every iteration of the while loop, at least one node is added to the dominating set and because one iteration of the while loop can be implemented in a constant number of rounds.

The approximation ratio of the above distributed algorithm is best possible (unless P ≈ NP, or unless we allow local computations to be exponential). However, the time complexity is very bad. In fact, there really are graphs on which in each iteration of the while loop only one node is added to the dominating set. As an example, consider a graph as in Figure 18.1. An optimal dominating set consists of all nodes on the center axis. The distributed greedy algorithm computes an optimal dominating set; however, the nodes are chosen sequentially from left to right. Hence, the running time of the algorithm on the graph of Figure 18.1 is Ω(√n). Below, we will see that there are graphs on which Algorithm 64 even needs Ω(n) rounds. The problem of the graph of Figure 18.1 is that there is a long path of descending degrees (spans). Every node has to wait for the neighbor to the left. Therefore, we want to change the algorithm in such a way that there are no long paths of descending spans. Allowing for an additional factor 2 in

Figure 18.1: Distributed greedy algorithm: Bad example

the approximation ratio, we can round all spans to the next power of 2 and let the greedy algorithm take a node with a maximal rounded span. In this case, a path of strictly descending rounded spans has length at most log n. For the distributed version, this means that nodes whose rounded span is maximal within distance 2 are added to the dominating set. Ties are again broken by unique node IDs. If node IDs are chosen at random, the time complexity for the graph of Figure 18.1 is reduced from Ω(√n) to O(log n). Unfortunately, there still is a problem remaining. To see this, we consider Figure 18.2.

Figure 18.2: Distributed greedy algorithm with rounded spans: Bad example

The graph of Figure 18.2 consists of a clique with n/3 nodes and two leaves per node of the clique. An optimal dominating set consists of all the n/3 nodes of the clique. Because they all have distance 1 from each other, the described distributed algorithm only selects one clique node in each while-iteration (the one with the largest ID). Note that as soon as one of the clique nodes is in the dominating set, the span of all remaining clique nodes is 2. They have no common neighbors, so there is no reason not to choose all of them in parallel. However, the time complexity of the simple algorithm is Ω(n). To improve on this example, we need an algorithm that can choose many nodes simultaneously as long as these nodes do not interfere too much, even if they are neighbors. In Algorithm 65, N(v) denotes the set of neighbors of v (including v itself) and N₂(v) = ∪_{u∈N(v)} N(u) is the set of nodes at distance at most 2 from v. As before, W(v) = {u ∈ N(v) : u is white} and w(v) = |W(v)|. It is clear that if Algorithm 65 terminates, it computes a valid dominating set. We will now show that the computed dominating set is small and that the algorithm terminates quickly.

Theorem 18.4. Algorithm 65 computes a dominating set of size at most (6 · ln ∆ + 12) · |S∗|, where S∗ is an optimal dominating set.

Algorithm 65 Fast Distributed Dominating Set Algorithm (at node v):
1: W(v) := N(v); w(v) := |W(v)|;
2: while W(v) ≠ ∅ do
3:   w̃(v) := 2^⌊log₂ w(v)⌋; // round down to next power of 2
4:   ŵ(v) := max_{u∈N2(v)} w̃(u);
5:   if w̃(v) = ŵ(v) then v.active := true else v.active := false end if;
6:   compute support s(v) := |{u ∈ N(v) : u.active = true}|;
7:   ŝ(v) := max_{u∈W(v)} s(u);
8:   v.candidate := false;
9:   if v.active then
10:    v.candidate := true with probability 1/ŝ(v)
11:  end if;
12:  compute c(v) := |{u ∈ W(v) : u.candidate = true}|;
13:  if v.candidate and Σ_{u∈W(v)} c(u) ≤ 3w(v) then
14:    node v joins dominating set
15:  end if
16:  W(v) := {u ∈ N(v) : u is white}; w(v) := |W(v)|;
17: end while
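To make the interplay of spans, supports, and candidates concrete, the following is a minimal round-by-round simulation of Algorithm 65 on a single machine. It is a sketch, not a message-passing implementation; the function name fast_dominating_set and the adjacency-dict graph representation are our own illustrative choices.

```python
import random

def fast_dominating_set(adj, seed=0):
    """Round-by-round simulation of Algorithm 65.
    adj maps each node to the set of its neighbors (undirected graph)."""
    rng = random.Random(seed)
    nodes = list(adj)
    N = {v: adj[v] | {v} for v in nodes}        # N(v) includes v itself
    N2 = {v: set().union(*(N[u] for u in N[v])) for v in nodes}
    white, dom = set(nodes), set()

    while white:                                # O(log^2(Delta) log n) iterations expected
        W = {v: N[v] & white for v in nodes}    # W(v): white nodes in N(v)
        w = {v: len(W[v]) for v in nodes}
        # line 3: round the span down to the next power of 2
        wt = {v: 1 << (w[v].bit_length() - 1) if w[v] else 0 for v in nodes}
        # lines 4-5: active iff the rounded span is maximal within distance 2
        active = {v for v in nodes
                  if w[v] and wt[v] == max(wt[u] for u in N2[v])}
        # line 6: support s(v) = number of active nodes in N(v)
        s = {v: len(N[v] & active) for v in nodes}
        # lines 7-11: an active node becomes a candidate with prob. 1/s_hat(v)
        cand = {v for v in active
                if rng.random() < 1 / max(s[u] for u in W[v])}
        # lines 12-15: join only if candidates are sparse around W(v)
        c = {u: len(N[u] & cand) for u in nodes}
        for v in cand:
            if sum(c[u] for u in W[v]) <= 3 * w[v]:
                dom.add(v)
        white = {u for u in white if not (N[u] & dom)}
    return dom
```

On the clique-with-leaves graph of Figure 18.2, this version should add many clique nodes in the same iteration, which is exactly the parallelism the plain distributed greedy algorithm lacked.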

Proof. The proof is a bit more involved but analogous to the analysis of the approximation ratio of the greedy algorithm. Every time we add a new node v to the dominating set, we distribute the cost among v (if it is still white) and its white neighbors. Consider an optimal dominating set S∗. As in the analysis of the greedy algorithm, we partition the graph into stars by assigning every node u not in S∗ to a neighbor v∗ in S∗. We want to show that the total distributed cost in the star of every v∗ ∈ S∗ is at most 6H(∆ + 1).

Consider a node v that is added to the dominating set by Algorithm 65. Let W(v) be the set of white nodes in N(v) when v becomes a dominator. For a node u ∈ W(v), let c(u) be the number of candidate nodes in N(u). We define C(v) = Σ_{u∈W(v)} c(u). Observe that C(v) ≤ 3w(v) because otherwise v would not join the dominating set in line 14. When adding v to the dominating set, every newly covered node u ∈ W(v) is charged 3/(c(u)w(v)). This compensates the cost 1 for adding v to the dominating set because

\[
\sum_{u\in W(v)} \frac{3}{c(u)\,w(v)} \;\ge\; w(v)\cdot\frac{3}{w(v)\cdot\sum_{u\in W(v)} c(u)/w(v)} \;=\; \frac{3}{C(v)/w(v)} \;\ge\; 1.
\]

The first inequality follows from the arithmetic–harmonic mean inequality: for αᵢ > 0, Σ_{i=1}^{k} 1/αᵢ ≥ k/ᾱ, where ᾱ = (Σ_{i=1}^{k} αᵢ)/k.

Now consider a node v∗ ∈ S∗ and assume that a white node u ∈ W(v∗) turns gray or black in iteration t of the while loop. We have seen that u is charged 3/(c(u)w(v)) for every node v ∈ N(u) that joins the dominating set in iteration t. Since a node can only join the dominating set if its span is largest up to a factor of two within two hops, we have w(v) ≥ w(v∗)/2 for every node v ∈ N(u) that joins the dominating set in iteration t. Because there are at most c(u) such nodes, the charge of u is at most 6/w(v∗). Analogously to the sequential greedy algorithm, we now get that the total cost in the star of a node v∗ ∈ S∗ is at most

\[
\sum_{i=1}^{|N(v^*)|} \frac{6}{i} \;\le\; 6 \cdot H(|N(v^*)|) \;\le\; 6 \cdot H(\Delta + 1) \;\le\; 6 \ln \Delta + 12.
\]
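For completeness, the arithmetic–harmonic mean inequality used in the proof above follows in one line from the Cauchy–Schwarz inequality:

\[
\left(\sum_{i=1}^{k} \frac{1}{\alpha_i}\right)\left(\sum_{i=1}^{k} \alpha_i\right) \;\ge\; \left(\sum_{i=1}^{k} 1\right)^{2} = k^2
\quad\Longrightarrow\quad
\sum_{i=1}^{k} \frac{1}{\alpha_i} \;\ge\; \frac{k^2}{\sum_{i=1}^{k}\alpha_i} = \frac{k}{\bar\alpha}.
\]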

To bound the time complexity of the algorithm, we first need to prove the following lemma.

Lemma 18.5. Consider an iteration of the while loop. Assume that a node u is white and that 2s(u) ≥ max_{v∈C(u)} ŝ(v), where C(u) = {v ∈ N(u) : v.candidate = true}. Then, the probability that u becomes dominated (turns gray or black) in the considered while loop iteration is larger than 1/9.

Proof. Let D(u) be the event that u becomes dominated in the considered while loop iteration, i.e., D(u) is the event that u changes its color from white to gray or black. Thus, we need to prove that Pr[D(u)] > 1/9. We can write this probability as

\[
\Pr[D(u)] = \Pr[c(u) > 0]\cdot\Pr[D(u) \mid c(u) > 0] + \Pr[c(u) = 0]\cdot\underbrace{\Pr[D(u) \mid c(u) = 0]}_{=0}.
\]

It is therefore sufficient to lower bound the probabilities Pr[c(u) > 0] and Pr[D(u) | c(u) > 0]. We have 2s(u) ≥ max_{v∈C(u)} ŝ(v). Therefore, in line 10, each of the s(u) active nodes v ∈ N(u) becomes a candidate node with probability 1/ŝ(v) ≥ 1/(2s(u)). The probability that at least one of the s(u) active nodes in N(u) becomes a candidate therefore is

\[
\Pr[c(u) > 0] \;\ge\; 1 - \left(1 - \frac{1}{2s(u)}\right)^{s(u)} \;>\; 1 - \frac{1}{\sqrt{e}} \;>\; \frac{1}{3}.
\]

We used that for x ≥ 1, (1 − 1/x)^x < 1/e. We next want to bound the probability Pr[D(u) | c(u) > 0] that at least one of the c(u) candidates in N(u) joins the dominating set. We have

\[
\Pr[D(u) \mid c(u) > 0] \;\ge\; \min_{v\in N(u)} \Pr[v \text{ joins dominating set} \mid v.\mathit{candidate} = \mathit{true}].
\]

Consider some node v and let C(v) = Σ_{v′∈W(v)} c(v′). If v is a candidate, it joins the dominating set if C(v) ≤ 3w(v). We are thus interested in the probability Pr[C(v) ≤ 3w(v) | v.candidate = true]. Assume that v is a candidate. Let c′(v′) = c(v′) − 1 be the number of candidates in N(v′) \ {v}. For a node v′ ∈ W(v), c′(v′) is upper bounded by a binomial random variable Bin(s(v′) − 1, 1/s(v′)) with expectation (s(v′) − 1)/s(v′). We therefore have

\[
\mathbb{E}[c(v') \mid v.\mathit{candidate} = \mathit{true}] = 1 + \mathbb{E}[c'(v')] = 1 + \frac{s(v') - 1}{s(v')} < 2.
\]

By linearity of expectation, we hence obtain

\[
\mathbb{E}[C(v) \mid v.\mathit{candidate} = \mathit{true}] = \sum_{v'\in W(v)} \mathbb{E}[c(v') \mid v.\mathit{candidate} = \mathit{true}] \;<\; 2w(v).
\]

We can now use Markov's inequality to bound the probability that C(v) becomes too large:

\[
\Pr[C(v) > 3w(v) \mid v.\mathit{candidate} = \mathit{true}] \;<\; \frac{2}{3}.
\]

Combining everything, we get

\[
\Pr[v \text{ joins dom. set} \mid v.\mathit{candidate} = \mathit{true}] = \Pr[C(v) \le 3w(v) \mid v.\mathit{candidate} = \mathit{true}] \;>\; \frac{1}{3}
\]

and hence

\[
\Pr[D(u)] = \Pr[c(u) > 0]\cdot\Pr[D(u) \mid c(u) > 0] \;>\; \frac{1}{3}\cdot\frac{1}{3} = \frac{1}{9}.
\]

Theorem 18.6. In expectation, Algorithm 65 terminates in O(log²∆ · log n) rounds.

Proof. First observe that every iteration of the while loop can be executed in a constant number of rounds. Consider the state after t iterations of the while loop. Let w̃_max(t) = max_{v∈V} w̃(v) be the maximal span rounded down to the next power of 2 after t iterations. Further, let s_max(t) be the maximal support s(v) of any node v for which there is a node u ∈ N(v) with w(u) ≥ w̃_max(t) after t while loop iterations. Observe that all nodes v with w(v) ≥ w̃_max(t) are active in iteration t + 1 and that, as long as the maximal rounded span w̃_max(t) does not change, s_max(t) can only get smaller with increasing t. Consider the pair (w̃_max, s_max) and define a relation ≺ such that (w′, s′) ≺ (w, s) iff w′ < w, or w′ = w and s′ ≤ s/2. From the above observations, it follows that

\[
(\tilde{w}_{\max}(t), s_{\max}(t)) \prec (\tilde{w}_{\max}(t'), s_{\max}(t')) \;\Longrightarrow\; t > t'. \tag{18.1}
\]

For a given time t, let T(t) be the first time for which

\[
(\tilde{w}_{\max}(T(t)), s_{\max}(T(t))) \prec (\tilde{w}_{\max}(t), s_{\max}(t)).
\]

We first want to show that for all t,

\[
\mathbb{E}[T(t) - t] = O(\log n). \tag{18.2}
\]

Let us look at the state after t while loop iterations. By Lemma 18.5, every white node u with support s(u) ≥ s_max(t)/2 will be dominated after the following while loop iteration with probability larger than 1/9. Consider a node u that satisfies the following three conditions:

(1) u is white

(2) ∃v ∈ N(u) : w(v) ≥ w̃_max(t)

(3) s(u) ≥ s_max(t)/2.

As long as u satisfies all three conditions, the probability that u becomes dominated is larger than 1/9 in every while loop iteration. Hence, after t + τ iterations (from the beginning), the probability that u is still white and satisfies Conditions (2) and (3) is smaller than (8/9)^τ. Choosing τ = log_{9/8}(2n), this probability becomes 1/(2n). There are at most n nodes u satisfying Conditions (1)–(3). Therefore, applying a union bound, we obtain that with probability more than 1/2, there is no white node u satisfying Conditions (1)–(3) at time t + log_{9/8}(2n). Equivalently, with probability more than 1/2, T(t) ≤ t + log_{9/8}(2n). Analogously, we obtain that with probability more than 1 − 1/2^k, T(t) ≤ t + k · log_{9/8}(2n). We then have

\[
\mathbb{E}[T(t) - t] = \sum_{\tau=1}^{\infty} \Pr[T(t) - t = \tau]\cdot\tau \;\le\; \sum_{k=1}^{\infty} \left(\frac{1}{2^k} - \frac{1}{2^{k+1}}\right)\cdot k\,\log_{9/8}(2n) \;=\; \log_{9/8}(2n),
\]

and thus Equation (18.2) holds. Let t_0 = 0 and t_i = T(t_{i−1}) for i = 1, …, k, where t_k = min{t : w̃_max(t) = 0}. Because w̃_max(t) = 0 implies that w(v) = 0 for all v ∈ V and that we therefore have computed a dominating set, by Equations (18.1) and (18.2) (and linearity of expectation), the expected number of rounds until Algorithm 65 terminates is O(k · log n). Since w̃_max(t) can only take ⌊log ∆⌋ different values and because, for a fixed value of w̃_max(t), the number of times s_max(t) can be decreased by a factor of 2 is at most log ∆, we have k ≤ log²∆.

Remarks:

• It is not hard to show that Algorithm 65 even terminates in O(log²∆ · log n) rounds with probability 1 − 1/n^c for an arbitrary constant c.

• Using the median of the supports of the neighbors instead of the maximum in line 7 results in an algorithm with time complexity O(log ∆ · log n). With another algorithm, this can even be slightly improved to O(log²∆).

• One can show that Ω(log ∆/ log log ∆) rounds are necessary to obtain an O(log ∆)-approximation.

• It is not known whether there is a fast deterministic approximation algorithm. This is an interesting and important open problem. The best deterministic algorithm known to achieve an O(log ∆)-approximation has time complexity 2^{O(√log n)}.

Chapter 19

Routing

19.1 Array

(Routing is important for any distributed system. This chapter is only an introduction to routing; we will see other facets of routing later.)

Definition 19.1 (Routing). We are given a graph and a set of routing requests, each defined by a source and a destination node.

Definition 19.2 (One-to-one, Permutation). In a one-to-one routing problem, each node is the source of at most one packet and each node is the destination of at most one packet. In a permutation routing problem, each node is the source of exactly one packet and each node is the destination of exactly one packet.

Remarks:

• Permutation routing is a special case of one-to-one routing.

Definition 19.3 (Store and Forward Routing). The network is synchronous. In each step, at most two packets (one in each direction) can be sent over each link.

Remarks:

• If two packets want to follow the same link, then one is queued (stored) at the sending node. This is known as contention.

Algorithm 66 Greedy on Array An array is a linked list of n nodes; that is, node i is connected with nodes i − 1 and i + 1, for i = 2, . . . , n − 1. With the greedy algorithm, each node injects its packet at time 0. At each step, each packet that still needs to move rightward or leftward does so.

Theorem 19.4 (Analysis). The greedy algorithm terminates in n − 1 steps.


Proof. One can show by induction that two packets never contend for the same link: packets moving in opposite directions may traverse the same link simultaneously (one in each direction), and packets moving in the same direction keep their relative order, so every packet moves in every step. Hence each packet arrives at its destination in d steps, where d is the distance between its source and destination. Since d ≤ n − 1, the algorithm terminates in n − 1 steps.
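Since the proof shows that no contention ever arises in one-to-one routing on the array, the algorithm can be simulated by simply moving all packets in parallel. A minimal sketch (the function name greedy_array and the packet encoding are ours):

```python
def greedy_array(n, packets):
    """Greedy one-to-one routing on an array of nodes 1..n.
    packets is a list of (source, destination) pairs with at most one
    source and at most one destination per node.  Returns the number of
    steps until all packets have arrived (at most n - 1 by Theorem 19.4)."""
    pos = [s for s, d in packets]
    dst = [d for s, d in packets]
    steps = 0
    while pos != dst:
        # every packet still in transit moves one hop towards its destination
        pos = [p + 1 if p < d else p - 1 if p > d else p
               for p, d in zip(pos, dst)]
        steps += 1
    return steps

# The full reversal permutation needs exactly n - 1 steps:
# greedy_array(8, [(i, 9 - i) for i in range(1, 9)])  ->  7
```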

Remarks:

• Unfortunately, only the array (or the ring) allows such a simple contention-free analysis. Already in a tree (with nodes of degree 3 or more) there might be two packets arriving at the same node in the same step, both wanting to leave on the same link, so one of them needs to be queued. In a “Mercedes-Benz” graph, Ω(n) packets might need to be queued. We will study this problem in the next section.

• There are many strategies for scheduling packets contending for the same edge (e.g. “farthest goes first”); these queuing strategies have a substantial impact on the performance of the algorithm.

19.2 Mesh

Algorithm 67 Greedy on Mesh A mesh (a.k.a. grid, matrix) is a two-dimensional array with m columns and m rows (n = m²). Each packet is first routed to its correct column (along its row, in greedy array style), and then to its correct row (along the column). Among contending packets, the one with the farthest distance still to go is given priority.

Theorem 19.5 (Analysis). In one-to-one routing, the greedy algorithm termi- nates in 2m − 2 steps.

Proof. First note that packets in the first phase of the algorithm do not interfere with packets in the second phase of the algorithm. By Theorem 19.4, each packet arrives at its correct column in m − 1 steps. (Some packets may arrive at their turning node earlier, and already start the second phase; we will not need this in the analysis.) We need the following lemma for the second phase of the algorithm.

Lemma 19.6 (Many-to-One on Array, Lemma 1.5 in Leighton Section 1.7). We are given an array with n nodes. Each node is a destination for at most one packet (but may be the source of many). If edge contention is resolved by farthest-to-go (FTG), the algorithm terminates in n − 1 steps.

Proof (following Leighton, Section 1.7, Lemma 1.5). Leftward moving packets and rightward moving packets never interfere, so we can restrict ourselves to rightward moving packets. We name each packet after its destination node. Since the queuing strategy is FTG, packet i can only be stopped by packets j > i. Note that a packet i may be contending with the same packet j several times. However, packet i will either find its destination “among” the higher packets, or directly after the last of the higher packets. More formally, after k steps, packets j, j + 1, …, n do not need links 1, …, l anymore, with k = n − j + l. Proof by induction: packet n has the highest priority; after k steps it has escaped the first k links. Packet n − 1 can therefore use link l in step l + 1, and so on. Packet i not needing link i in step k = n means that packet i has arrived at its destination node i in step n − 1 or earlier.

Lemma 19.6 completes the proof.
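The FTG rule of Lemma 19.6 is also easy to simulate. The sketch below (the name ftg_array and the input encoding are ours) restricts itself to rightward moving packets, as in the proof, and lets every node forward the waiting packet with the farthest destination in each step.

```python
def ftg_array(n, sources):
    """Many-to-one routing on an array of nodes 1..n with farthest-to-go
    (FTG) contention resolution, rightward packets only (Lemma 19.6).
    sources maps a source node to the list of destinations of the packets
    injected there.  Returns the number of steps until all have arrived."""
    queue = {v: [] for v in range(1, n + 1)}    # packets waiting at each node
    for src, dests in sources.items():
        queue[src].extend(dests)
    steps = 0
    while any(d > v for v, ds in queue.items() for d in ds):
        # each node forwards the waiting packet with the farthest destination
        moving = {v: max(d for d in queue[v] if d > v)
                  for v in queue if any(d > v for d in queue[v])}
        for v, d in moving.items():
            queue[v].remove(d)
            queue[v + 1].append(d)
        steps += 1
    return steps

# e.g. all packets injected at node 1, one per destination 2..n:
# ftg_array(5, {1: [2, 3, 4, 5]})  ->  4  (= n - 1 steps)
```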

Remarks:

• A 2m − 2 time bound is the best we can hope for, since the distance between the two farthest nodes in the mesh is exactly 2m − 2.

• One thing still bugs us: The greedy algorithm might need queues in the order of m. And queues are expensive! In the next section, we try to bring the queue size down!

19.3 Routing in the Mesh with Small Queues

(First we look at a slightly simpler problem.)

Definition 19.7 (Random Destination Routing). In a random destination routing problem, each node is the source of at most one packet, with destination chosen uniformly at random.

Remarks:

• Random destination routing is not one-to-one routing. In the worst case, a node can be the destination of all n packets, but this case is very unlikely (it has probability 1/n^{n−1}).

• We study Algorithm 67 again, but this time in the random destination model. Studying the random destination model will give us a deeper understanding of routing... and of distributed computing in general!

Theorem 19.8 (Random destination analysis of Algorithm 67). If destinations are chosen at random, the maximum queue size is O(log n/ log log n) with high probability. (With high probability means with probability at least 1 − O(1/n).)

Proof. We can restrict ourselves to column edges because there will not be any contention at row edges. Let us consider the queue for a north-bound column edge. In each step, there might be three packets arriving (from south, east, and west). Since each arriving south packet will be forwarded north (or consumed when the node is its destination), the queue size can only grow because of east or west packets – packets that are “turning” at the node. Hence the queue size of a node is always bounded by the number of packets turning at the node. A packet only turns at a node u when it originates at a node in the same row as u (and there are only m nodes in that row). Packets have random destinations, so the probability to turn at u is only 1/m for each of these packets. Thus the probability P that r or more packets turn at some particular node u is at most

\[
P \;\le\; \binom{m}{r} \left(\frac{1}{m}\right)^{r}.
\]

(The factor (1 − 1/m)^{m−r} is not present because we bound the event that some fixed set of r packets all turn at u, which already includes the event that more than r packets turn.) Using

\[
\binom{x}{y} < \left(\frac{xe}{y}\right)^{y} \quad \text{for } 0 < y < x,
\]

we directly get

\[
P \;<\; \left(\frac{me}{r}\right)^{r} \left(\frac{1}{m}\right)^{r} = \left(\frac{e}{r}\right)^{r}.
\]

Hence most queues do not grow larger than O(1). Also, when we choose r := 2e log n/ log log n, we can show P = o(1/n²). The probability that any of the 4n queues ever exceeds r is then less than 1 − (1 − P)^{4n} = o(1/n).
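The key counting step of the proof, namely that a packet turns at node u exactly when it originates in u's row and its random destination lies in u's column, can be checked empirically. A small Monte-Carlo sketch (all names are our own illustrative choices):

```python
import random

def max_turning_packets(m, trials=100, seed=1):
    """Estimate the maximum, over all nodes of an m x m mesh, of the number
    of packets turning at a node when every node sends one packet to a
    uniformly random destination; this quantity bounds the queue size in
    the proof of Theorem 19.8."""
    rng = random.Random(seed)
    worst = 0
    for _ in range(trials):
        turns = [[0] * m for _ in range(m)]
        for row in range(m):
            for col in range(m):              # one packet per node
                dest_col = rng.randrange(m)   # only the column matters here
                turns[row][dest_col] += 1     # packet turns at (row, dest_col)
        worst = max(worst, max(max(r) for r in turns))
    return worst   # compare with Theta(log n / log log n), n = m * m
```

For moderate m (say m = 32, so n = 1024), the observed maximum tends to be a small single-digit number, in line with the remark below that queues rarely grow beyond a constant.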

Remarks:

• OK. We got a bound on the queue size. Now what about the time complexity?!? The same analysis as for one-to-one routing applies. The probability that a node sees “many” packets in phase 2 is small... it can be shown that the algorithm terminates in O(m) time with high probability.

• In fact, maximum queue sizes are likely to be a lot less than logarithmic. The reason is the following: though Θ(log n/ log log n) packets might turn at some node, these turning packets are likely to be spread over time. Early arriving packets can use gaps and do not conflict with late arriving packets. With a much more elaborate method (using the so-called “wide-channel” model) one can show that there will never be more than four(!) packets in any queue (with high probability only, of course).

• Unfortunately, the above analysis only works for random destination problems. Question: can we devise an algorithm that uses only small queues, for any one-to-one routing problem? Answer: Yes, we can! In the simplest form we can use a clever trick invented by Leslie Valiant: instead of routing the packets directly on their row-column path, we route each packet first to a randomly chosen intermediate node (on a row-column path), and from there to the destination (again on a row-column path); see the path-selection sketch after these remarks. Valiant’s trick routes all packets in O(m) time (with high probability) and only needs queues of size O(log n). Instead of choosing a uniformly random intermediate node, one can choose a random node that is more or less in the direction of the destination, solving any one-to-one routing problem in 2m + O(log n) time with only constant-size queues. You don’t wanna know the details...

• What about no queues at all?!?
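To illustrate Valiant's trick, here is a sketch of the path selection only (not the full queuing simulation). The function name valiant_path and the coordinate convention are our own, and we use a uniformly random intermediate node rather than the refined "in the direction of the destination" variant mentioned above.

```python
import random

def valiant_path(src, dst, m, rng=random):
    """Valiant's trick on an m x m mesh: route the packet to a random
    intermediate node on a row-column path, and from there to the
    destination, again on a row-column path.  Nodes are (row, col) pairs;
    returns the full sequence of nodes visited."""
    def row_column(a, b):
        (r1, c1), (r2, c2) = a, b
        # walk along the row first, then along the column; b itself excluded
        path = [(r1, c) for c in range(c1, c2, 1 if c2 > c1 else -1)]
        path += [(r, c2) for r in range(r1, r2, 1 if r2 > r1 else -1)]
        return path
    mid = (rng.randrange(m), rng.randrange(m))
    return row_column(src, mid) + row_column(mid, dst) + [dst]
```

The random detour destroys worst-case permutations: whatever one-to-one instance an adversary picks, the traffic on each leg looks like a random destination problem, for which the previous section's analysis applies.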

19.4 Hot-Potato Routing

Definition 19.9 (Hot-Potato Routing). Like the store-and-forward model, the hot-potato model is synchronous, and at most two packets (one in each direction) can be sent over a link in each step. However, contending packets cannot be stored; all but one of the contending packets must immediately be sent over a “wrong” link (known as deflection), since the hot-potato model does not allow queuing.

Remarks:

• Don’t burn your fingers with “hot-potato” packets. If you get one you better forward it directly!

• A node with degree δ receives up to δ packets at the beginning of each step – since the node has δ links, it can forward all of them, but unfortunately not all in the right direction.

• Hot-potato routing is easier to implement, especially on light-based net- works, where you don’t want to convert photons into electrons and then back again. There are a couple of parallel machines that use the hot-potato paradigm to simplify and speed up routing.

• How bad does hot-potato routing get (in the random or the one-to-one model)? How bad can greedy hot-potato routing get in the worst case (greedy: whenever there is no contention, you must send a packet in the right direction)?

Algorithm 68 Greedy Hot-Potato Routing on a Mesh Packets move greedily towards their destination (any good link is fine if there is more than one). If a packet gets deflected, it becomes excited with probability p (we set p = Θ(1/m)). An excited packet has higher priority. Once excited, a packet tries to reach its destination on the row-column path. If two excited packets contend, the one that wants to exit over the opposite link is given priority. If an excited packet fails to take its desired link, it becomes normal again.

Theorem 19.10 (Analysis). A packet will reach its destination in O(m) ex- pected time.

Sketch; full proof in Busch et al., SODA 2000. An excited packet can only be deflected at its start node (right after becoming excited), and when trying to turn. In both cases, the probability to fail is only constant, since other excited packets would need to be at the same node at exactly the right instant. Thus the probability that an excited packet makes it to its destination is constant, and therefore a packet needs to “try” (to become excited) only a constant number of times in expectation. Since a packet becomes excited only with probability p per deflection, i.e., roughly every (1/p)-th deflection, it gets deflected only O(1/p) = O(m) times in expectation. Since each time it does not get deflected it gets closer to its destination, it arrives at its destination in O(m) expected time.

Remarks:

• It seems that at least in expectation having no memory at all does not harm the time bounds much.

• It is conjectured that one-to-one routing can be shown to have time complexity O(m) for this greedy hot-potato routing algorithm. However, the best known bound needs an additional logarithmic factor.

19.5 More Models

Routing comes in many flavors. We mention some of them in this section for the sake of completeness.

Store-and-forward and hot-potato routing are variants of packet switching. In the circuit-switching model, the entire path from source to destination must be locked before a stream of packets can be transmitted. A packet-switching variant where more than one packet needs to be sent from source to destination in a stream is known as wormhole routing.

Static routing is when all the packets to be routed are injected at time 0. In dynamic routing, by contrast, nodes may inject new packets constantly (at a certain rate). Not much is known about dynamic routing.

Instead of having a single source and a single destination for each packet as in one-to-one routing, researchers have studied many-to-one routing, where a node may be the destination of many sources. The problem with many-to-one routing is that there might be congested areas in the network (areas with nodes that are destinations of many packets). Packets that can be routed around such a congested area should do so, or they increase the congestion even more. Such an algorithm was studied by Busch et al. at STOC 2000. Also one-to-many routing (multicasting) has been considered, where a source needs to send the same packet to many destinations. In one-to-many routing, packets can be duplicated whenever needed.

Nobody knows the topology of the Internet (and it is certainly not an array or a mesh!). The problem is to find short paths without storing huge routing tables at each node. There are several forms of routing (e.g. compact routing, interval routing) that study the trade-off between routing table size and quality of routing. Also, researchers have started studying the effects of mixing various queuing strategies in one network; this area of research is known as adversarial queuing theory.

And last but not least, there are several special networks. A mobile ad-hoc network, for example, consists of mobile nodes equipped with a wireless communication device. In such a network, nodes can only communicate when they are within transmission range of each other. Since the network is mobile (dynamic), and since the nodes are considered to be simple, a variety of new problems arise.