arXiv:1808.06705v4 [cs.DS] 3 May 2021

Graph connectivity in log steps using label propagation

Paul Burkhardt∗

May 4, 2021

Abstract

The fastest deterministic algorithms for connected components take logarithmic time and perform superlinear work on a Parallel Random Access Machine (PRAM). These algorithms maintain a spanning forest by merging and compressing trees, which requires pointer-chasing operations that increase memory access latency and are limited to shared-memory systems. Many of these PRAM algorithms are also very complicated to implement. Another popular method is “leader-contraction” where the challenge is to select a constant fraction of leaders that are adjacent to a constant fraction of non-leaders with high probability, but this can require adding more edges than were in the original graph. Instead we investigate label propagation because it is deterministic, easy to implement, and does not rely on pointer-chasing. Label propagation exchanges representative labels within a component using simple graph traversal, but it is inherently difficult to complete in a sublinear number of steps. We are able to overcome the problems with label propagation for graph connectivity. We introduce a surprisingly simple framework for deterministic, undirected graph connectivity using label propagation that is easily adaptable to many computational models. It achieves logarithmic convergence independently of the number of processors and without increasing the edge count. We employ a novel method of propagating directed edges in alternating direction while performing minimum reduction on vertex labels. We present new algorithms in PRAM, Stream, and MapReduce. Given a simple, undirected graph G = (V, E) with n = |V| vertices and m = |E| edges, our approach takes O(m) work each step, but we can only prove logarithmic convergence on a path graph. It was conjectured by Liu and Tarjan (2019) to take O(log n) steps or possibly O(log² n) steps.
Our experiments on a range of difficult graphs also suggest logarithmic convergence. We leave the proof of convergence as an open problem.

Keywords: graph connectivity, connected components, parallel algorithm, pram, stream, mapreduce

1 Introduction

Given a simple, undirected graph G = (V, E) with n = |V| vertices and m = |E| edges, the connected components of G partition V such that every pair of vertices in the same part is connected by a path, which is a sequence of adjacent edges in E. If two vertices are not connected then they are in different components. The distance d(v,u) is the shortest-path length between vertices v and u. The diameter D is the maximum distance in G. We wish to find the connected components of G in O(log n) steps using simple, deterministic methods that are adaptable to many computational models.

The fastest deterministic parallel (NC) algorithms for connected components take logarithmic time and perform superlinear work on a Parallel Random Access Machine (PRAM). These algorithms maintain a spanning forest by merging and compressing trees [22, 13, 3, 42], which requires pointer-chasing operations that increase memory access latency. Pointer jumping was a primary source of slowdown in a parallel minimum spanning tree algorithm [12]. The PRAM implementations are also limited to shared-memory systems [19, 34, 4]. Another popular method is “leader-contraction” where the challenge is to select a constant fraction of leaders that are adjacent to a constant fraction of non-leaders with high probability [2, 26, 36, 23]. Not only is this method randomized but it can require adding many more edges to the graph. Instead we investigate label propagation because it is deterministic, easy to implement, and does not rely on pointer-chasing.
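For orientation, classic minimum-label propagation can be sketched as below. This is a generic textbook baseline and not the method of this paper; the function name `naive_label_propagation` and the synchronous update rule are assumptions made for illustration. On a path graph the smallest ID advances by one hop per step, so the number of steps grows with the diameter rather than with log n.

```python
def naive_label_propagation(vertices, undirected_edges):
    """Baseline sketch: every vertex repeatedly adopts the smallest
    label seen across any incident edge (synchronous steps)."""
    label = {v: v for v in vertices}
    steps = 0
    changed = True
    while changed:
        changed = False
        prev = dict(label)  # labels as they were at the start of the step
        for a, b in undirected_edges:
            low = min(prev[a], prev[b])
            for x in (a, b):
                if low < label[x]:
                    label[x] = low
                    changed = True
        if changed:
            steps += 1  # count only the steps that changed some label
    return label, steps


# Path graph 1-2-...-8: label 1 needs one step per hop to reach vertex 8.
path_edges = [(i, i + 1) for i in range(1, 8)]
labels, steps = naive_label_propagation(range(1, 9), path_edges)
```

In this sketch the path on eight vertices needs seven label-changing steps, which is the linear behavior the remainder of the paper aims to avoid.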
Label propagation exchanges representative labels within a component using simple graph traversal, but it is inherently difficult to complete in a sublinear number of steps [37, 39, 41]. Adding and removing edges must be carefully managed to keep the edge count and therefore the work linear in each step. We are able to overcome the problems with label propagation for graph connectivity. We introduce a surprisingly simple framework for deterministic, undirected graph connectivity using label propagation that is easily adaptable to many computational models. It achieves logarithmic convergence independently of the number of processors and without increasing the edge count. We employ a novel method of propagating directed edges in alternating direction while performing minimum reduction on vertex labels. We believe our solution to obtaining sublinear convergence and near optimal work for connected components is one of the simplest to date. Moreover, our experiments demonstrate fast convergence.

Figure 1: A path graph converges in three steps. After each step the output graph becomes the input for the next step. Undirected edges denote a pair of counter-oriented edges.

∗Research Directorate, National Security Agency, Fort Meade, MD 20755. Email: [email protected]

In this paper we say that a (v,u) edge is directed from v to u and (·, ·) denotes an ordered pair that distinguishes (v,u) from (u,v). We call the counter-oriented (v,u), (u,v) edges the twins of a conjugate pair. An undirected {u,v} edge in G then comprises these twins. To propagate a label w from edge (v,u) we create just the (u,w) twin. In the next step we reverse the direction to return the opposite twin, (w,u), if w is the minimum label for u. We’ll call these two edge operations label propagation and symmetrization, respectively. Concomitant with label propagation is a min update on u’s minimum label, which may or may not change.
Contraction of the graph is due to the label propagation operation, and symmetrization ensures that a vertex and its minimum label are able to exchange new minimum labels. Thus every vertex in each step propagates and retains its current minimum label. Let l(v) be the current minimum label for v. Then for an edge (v,u) we get (u,l(v)) or (u,v) in a single step, due to either label propagation or symmetrization, respectively. Each edge is replaced by a new edge and hence the method maintains a stable edge count. The essential operations are summarized as follows.

• For every (v,u) edge, if u is not l(v) then min update and label propagation, else symmetrization, repeating until no label changes.

Figure 1 illustrates our method where the starting minimum label for each vertex is the lowest vertex ID among its neighbors and itself, and eventually the graph is transformed into a star whose root is the component label.

This is a practical and very simple technique for large-scale streaming and parallel graph connectivity. We present new algorithms in PRAM, Stream, and MapReduce. Our approach takes O(m) work each step, but we can only prove logarithmic convergence on a path graph. Despite the simplicity of our algorithm, the proof of logarithmic convergence is elusive and poses a rather interesting challenge. We conjecture that our algorithm takes O(log n) steps to converge. Our algorithm behaves well and empirically takes O(log n) steps on a range of difficult graphs. In 2019 Liu and Tarjan conjectured that an earlier version of our algorithm takes O(log n) steps or possibly O(log² n) steps [29]. We leave the proof of convergence as an open problem.

2 Our contribution

We introduce a simple, deterministic label propagation method for undirected graph connectivity. Our approach propagates directed edges in alternating direction to achieve fast convergence independently of the number of processors while also maintaining O(m) work each step.
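The label propagation and symmetrization operations summarized in the introduction can be sketched sequentially as follows. This is a minimal illustration under stated assumptions, not the paper's PRAM, Stream, or MapReduce algorithm: the name `connected_components`, the `max_steps` guard, and reading labels from the start of each step (`prev`) are choices of this sketch. The text stops when no label changes; the sketch instead stops once both endpoints of every original edge carry the same label, a conservative stand-in that keeps the returned labels valid whenever the function returns.

```python
def connected_components(vertices, undirected_edges, max_steps=10_000):
    """Sketch of the alternating-direction step: each directed edge (v, u)
    is replaced by (u, l(v)) or by (u, v), so the edge count stays fixed."""
    original = [(a, b) for a, b in undirected_edges if a != b]
    # An undirected edge {a, b} is stored as the twins (a, b) and (b, a).
    edges = [e for a, b in original for e in ((a, b), (b, a))]
    # Starting label: lowest ID among the vertex itself and its neighbors.
    label = {v: v for v in vertices}
    for v, u in edges:
        label[u] = min(label[u], v)
    steps = 0
    # Assumed stopping test: every original edge joins equal labels.
    while any(label[a] != label[b] for a, b in original):
        if steps == max_steps:
            raise RuntimeError("no convergence within max_steps")
        prev = dict(label)  # labels l(v) as of the start of this step
        nxt = []
        for v, u in edges:
            if u != prev[v]:
                label[u] = min(label[u], prev[v])  # min update
                nxt.append((u, prev[v]))           # label propagation
            else:
                nxt.append((u, v))                 # symmetrization
        edges = nxt
        steps += 1
    return label, steps


# The four-vertex path of Figure 1, plus a second component and a singleton.
labels, steps = connected_components(
    range(1, 8), [(1, 2), (2, 3), (3, 4), (5, 6)]
)
```

Every new edge joins two vertices of the same component, so labels never leak across components, and each pass over the edge list performs O(m) work as in the text.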
We present new algorithms in PRAM, Stream, and MapReduce. We will silently use the standard notation for asymptotic bounds to provide a familiar basis for comparison, but the reader should keep in mind that our bounds that depend on the convergence are conjecture only.

If our conjecture of O(log n) convergence is true, then our label propagation algorithm on a Concurrent Read Concurrent Write (CRCW) PRAM achieves O(log n) time and O(m log n) work with O(m) processors. On an Exclusive Read Exclusive Write (EREW) PRAM it takes O(log² n) time and O(m log² n) work. In contrast, the fastest deterministic CRCW graph connectivity algorithms take O(log n) time and O((m + n) · α(m,n)) work using O((m + n) · α(m,n) / log n) processors [22, 13], where α(m,n) is the inverse Ackermann function. The best-known NC EREW algorithm [11] takes O(log n) time and O((m + n) log n) work with O(m + n) processors. Although our results are slower than those for the fastest deterministic CRCW and EREW algorithms, our method is much simpler and easier to implement.

We also give an efficient Stream-Sort algorithm that takes O(log n) passes and O(log n) memory, and a MapReduce algorithm taking O(log n) rounds and O(m log n) communication overall. These would be the first deterministic O(log n)-step graph connectivity algorithms in Stream and MapReduce models. For the

Model        Steps       Cost                  Class          Author
CRCW         O(log n)    O((m + n) · α(m,n))   deterministic  Cole, Vishkin [13]
EREW         O(log n)    O((m + n) log n)      deterministic  Chong, Han, and Lam [11]
Stream-Sort  O(log n)    O(log n)              randomized     Aggarwal et al. [1]
MapReduce    O(log² n)   O(m log² n)           deterministic  Kiveris et al. [26]
CRCW         O(log n)    O(m log n)            deterministic  This paper
EREW         O(log² n)   O(m log² n)           deterministic  This paper
Stream-Sort  O(log n)    O(log n)              deterministic  This paper
MapReduce    O(log n)    O(m log n)            deterministic  This paper

Table 1: Comparison to state-of-the-art.
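The PRAM entries for this paper in Table 1 can be read as the step count multiplied by the per-step cost. The short derivation below assumes the conjectured O(log n) convergence and O(m) processors. It also assumes that the extra logarithmic factor on the EREW PRAM comes from the standard O(log n)-time simulation of one concurrent-access step, which is an assumption of this note rather than something stated in the text above.

```latex
\begin{align*}
W_{\mathrm{CRCW}} &= \underbrace{O(\log n)}_{\text{steps}} \cdot
                     \underbrace{O(m)}_{\text{work per step}} = O(m \log n), \\
T_{\mathrm{EREW}} &= \underbrace{O(\log n)}_{\text{steps}} \cdot
                     \underbrace{O(\log n)}_{\text{time per simulated step}} = O(\log^2 n), \\
W_{\mathrm{EREW}} &= \underbrace{O(m)}_{\text{processors}} \cdot T_{\mathrm{EREW}} = O(m \log^2 n).
\end{align*}
```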
