
Practical Parallel Hypergraph Algorithms

Julian Shun
[email protected]
MIT CSAIL

Abstract

While there has been significant work on parallel graph processing, there has been surprisingly little work on high-performance hypergraph processing. This paper presents a collection of efficient parallel algorithms for hypergraph processing, including algorithms for computing hypertrees, hyperpaths, betweenness centrality, maximal independent sets, k-core decomposition, connected components, PageRank, and single-source shortest paths. For these problems, we either provide new parallel algorithms or more efficient implementations than prior work. Furthermore, our algorithms are theoretically-efficient in terms of work and depth. To implement our algorithms, we extend the Ligra graph processing framework to support hypergraphs, and our implementations benefit from graph optimizations including switching between sparse and dense traversals based on the frontier size, edge-aware parallelization, using buckets to prioritize the processing of vertices, and compression. Our experiments on a 72-core machine show that our algorithms obtain excellent parallel speedups, and are significantly faster than algorithms in existing hypergraph processing frameworks.

Figure 1. An example hypergraph representing the groups {v0, v1, v2}, {v1, v2, v3}, and {v0, v3}, and its bipartite representation: (a) hypergraph; (b) bipartite representation. [figure omitted]

1 Introduction

A graph contains vertices and edges, where a vertex represents an entity of interest, and an edge between two vertices represents an interaction between the two corresponding entities. There has been significant work on developing algorithms and programming frameworks for efficient graph processing due to their applications in various domains, such as social network and Web analysis, cyber-security, and scientific computations. One limitation of modeling data using graphs is that only binary relationships can be expressed, which can lead to loss of information from the original data. Hypergraphs are a generalization of graphs where the relationships, represented as hyperedges, can contain an arbitrary number of vertices. Hyperedges correspond to group relationships among vertices (e.g., a community in a social network). An example of a hypergraph is shown in Figure 1a. Hypergraphs have been shown to enable richer analysis of structured data in various applications, such as protein network analysis [71], machine learning [94], and image segmentation [12, 23]. For example, Ducournau and Bretto [23] show that by representing pixels in an image as vertices and groups of nearby and similar pixels as hyperedges, the quality of image segmentation can be improved compared to using a graph representation. Unfortunately, there has been little research on parallel hypergraph processing.

The main contribution of this paper is a suite of efficient parallel hypergraph algorithms, including algorithms for hypertrees, hyperpaths, betweenness centrality, maximal independent sets, k-core decomposition, connectivity, PageRank, and single-source shortest paths. For these problems, we provide either new parallel hypergraph algorithms (e.g., betweenness centrality and k-core decomposition) or more efficient implementations than prior work. Additionally, we show that most of our algorithms are theoretically-efficient in terms of their work and depth complexities.

We observe that our parallel hypergraph algorithms can be implemented efficiently by taking advantage of graph processing machinery. To implement our parallel hypergraph algorithms, we made relatively simple extensions to the Ligra graph processing framework [75], and we call the extended framework Ligra-H. As with Ligra, Ligra-H is well-suited for frontier-based algorithms, where small subsets of elements (referred to as frontiers) are processed on each iteration. We use a bipartite graph representation to store hypergraphs, and use Ligra's data structures for representing subsets of vertices and hyperedges, as well as operators for mapping application-specific functions over these elements. The operators for processing subsets of vertices and hyperedges are theoretically-efficient, which enables us to implement parallel hypergraph algorithms with strong theoretical guarantees. Ligra-H inherits from Ligra various optimizations developed for graphs, including switching between different traversal strategies based on the size of the frontier, edge-aware parallelization, bucketing for prioritizing the processing of vertices, and compression.

Our experiments on a variety of real-world and synthetic hypergraphs show that our algorithms implemented in Ligra-H achieve good parallel speedup and scalability with respect to input size. On 72 cores with hyper-threading, we achieve a parallel speedup of between 8.5–76.5x. Compared to HyperX [41] and MESH [36], the only existing frameworks for hypergraph processing, our results are significantly faster. For example, one iteration of PageRank on the Orkut community hypergraph with 2.3 million vertices and 15.3 million hyperedges [54] takes 0.083s on 72 cores and 3.31s on one thread in Ligra-H, while taking 1 minute on eight 12-core machines using MESH [36] and 10s using eight 4-core machines in HyperX [41]. Certain hypergraph algorithms (hypertrees, connected components, and single-source shortest paths) can be implemented correctly by expanding each hyperedge into a clique among its member vertices and running the corresponding graph algorithm on the resulting graph. We also compare with this alternative approach by using the original Ligra framework to process the clique-expanded graphs, and show that the space usage and performance are significantly worse than those of Ligra-H (2.8x–30.6x slower while using 235x more space on the Friendster hypergraph).

Our work shows that high-performance hypergraph processing can be done using just a single multicore machine, on which we can process all existing publicly-available hypergraphs. Prior work has shown that graph processing can be done efficiently on just a single multicore machine (e.g., [21, 22, 62, 64, 75, 81, 90, 95]), and this work extends the observation to hypergraphs. To benefit the community, our source code will be made publicly available.

2 Related Work

Graph Processing. There has been significant work on developing graph libraries and frameworks to reduce programming effort by providing high-level operators that capture common algorithm design patterns (e.g., [16, 18, 19, 24, 28–30, 33–35, 37, 42, 46, 51, 55, 56, 58–60, 63, 64, 66–68, 70, 72, 75, 78–81, 85, 88, 90, 93, 95], among many others; see [61, 73, 87] for surveys). Many of these frameworks can process large graphs efficiently, but none of them directly support hypergraph processing.

Hypergraph Processing. HyperX [41] and MESH [36] are distributed systems for hypergraph processing built on top of Spark [89]. Algorithms are written using hyperedge programs and vertex programs that are iteratively applied on hyperedges and vertices, respectively. HyperX stores hypergraphs using fixed-length tuples containing vertex and hyperedge identifiers, and MESH stores the hypergraph as a bipartite graph. HyperX includes algorithms for random walks, label propagation, and spectral learning. MESH includes PageRank, PageRank-Entropy (a variant of PageRank that also computes the entropy of vertex ranks in each hyperedge), label propagation, and single-source shortest paths. Both HyperX and MESH do work proportional to the entire hypergraph on every iteration, even if few vertices/hyperedges are active, which makes them inefficient for frontier-based hypergraph algorithms. The Chapel HyperGraph Library [39] is a library for generating synthetic hypergraphs. However, it does not implement any hypergraph algorithms, and currently does not have a high-level programming abstraction for doing so.

Hypergraphs are useful in modeling communication costs in parallel machines, and partitioning can be used to minimize communication. There has been significant work on both sequential and parallel hypergraph partitioning algorithms (see, e.g., [14, 15, 20, 44, 47, 49, 83]). While we do not consider the problem of hypergraph partitioning in this paper, these techniques could potentially be used to improve the locality of our algorithms.

Algorithms have been designed for a variety of problems on hypergraphs, including random walks [23], shortest hyperpaths [1, 65], betweenness centrality [69], hypertrees [26], connectivity [26], maximal independent sets [3, 7, 43, 45], and k-core decomposition [40]. However, there have been no efficient parallel implementations of hypergraph algorithms, with the exception of [40], which provides a GPU implementation for a special case of k-core decomposition.
3 Preliminaries

Hypergraph Notation. We denote an unweighted hypergraph by H(V, E), where V is the set of vertices and E is the set of hyperedges. A weighted hypergraph is denoted by H = (V, E, w), where w is a function that maps a hyperedge to a real value (its weight). The number of vertices in a hypergraph is n_v = |V|, and the number of hyperedges is n_e = |E|. Vertices are assumed to be labeled from 0 to n_v − 1, and hyperedges from 0 to n_e − 1. For undirected hypergraphs, we use deg(e) to denote the number of vertices a hyperedge e ∈ E contains (i.e., its cardinality), and deg(v) to denote the number of hyperedges that a vertex v ∈ V belongs to. In directed hypergraphs, hyperedges contain incoming vertices and outgoing vertices. For a hyperedge e ∈ E, we use N⁻(e) and N⁺(e) to denote its incoming vertices and outgoing vertices, respectively, and deg⁻(e) = |N⁻(e)| and deg⁺(e) = |N⁺(e)|. We use N⁻(v) and N⁺(v) to denote the hyperedges that v ∈ V is an incoming vertex and outgoing vertex for, respectively, and deg⁻(v) = |N⁻(v)| and deg⁺(v) = |N⁺(v)|. We define the size |H| of a hypergraph to be n_v + Σ_{e∈E}(deg⁻(e) + deg⁺(e)), i.e., the number of vertices plus the sum of the hyperedge cardinalities.

Computational Model. We use the work-depth model [38] to analyze the theoretical efficiency of algorithms. The work of an algorithm is the number of operations used, and the depth is the length of the longest sequential dependence. We assume that concurrent reads and writes are supported. By Brent's scheduling theorem [11], an algorithm with work W and depth D has overall running time W/P + D, where P is the number of processors available. A parallel algorithm is work-efficient if its work asymptotically matches that of the best sequential algorithm for the problem, which is important since in practice the W/P term in the running time often dominates.

Primitives. Scan takes an array A of length n, an associative binary operator ⊕, and an identity element ⊥ such that ⊥ ⊕ x = x for any x, and returns the array (⊥, ⊥ ⊕ A[0], ⊥ ⊕ A[0] ⊕ A[1], ..., ⊥ ⊕ A[0] ⊕ ... ⊕ A[n−2]) as well as the overall sum ⊥ ⊕ A[0] ⊕ ... ⊕ A[n−1]. Filter takes an array A and a predicate f and returns a new array containing each a ∈ A for which f(a) is true, in the same order as in A. Scan and filter take O(n) work and O(log n) depth (assuming ⊕ and f take O(1) work) [38].

A compare-and-swap CAS(&x, o, n) takes a memory location x and atomically updates the value at location x to n if the value is currently o, returning true if it succeeds and false otherwise. A fetch-and-add FAA(&x, n) takes a memory location x, atomically returns the current value at x, and then increments the value at x by n. A WRITEMIN(&x, n) takes a memory location x and a value n, and atomically updates x to be the minimum of the value at x and n; it returns true if the update was successful and false otherwise. We assume that these operations take O(1) work and depth in our model, but note that they can be simulated work-efficiently in logarithmic depth on weaker models [32].
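To make the atomic primitives concrete, the following is a minimal C++ sketch of WRITEMIN built from a CAS retry loop; std::atomic stands in for the paper's primitives, and the function name write_min is our own.

    #include <atomic>

    // WRITEMIN(&x, n): atomically set x to min(x, n); return true iff x changed.
    template <typename T>
    bool write_min(std::atomic<T>* x, T n) {
      T old = x->load(std::memory_order_relaxed);
      while (n < old) {
        // compare_exchange_weak reloads `old` with the current value on
        // failure, so the loop re-checks whether n is still smaller.
        if (x->compare_exchange_weak(old, n)) return true;
      }
      return false;  // another thread already installed a value <= n
    }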
4 Ligra-H Framework

This section presents the interface and implementation of the Ligra-H framework, which extends Ligra [75] to hypergraphs.

4.1 Interface

Ligra-H contains the basic VertexSet and HyperedgeSet data structures, which are used to represent subsets of vertices and hyperedges, respectively. VertexMap takes as input a boolean function F and a VertexSet U, applies F to all vertices in U, and returns an output VertexSet containing all elements u ∈ U such that F(u) = true. HyperedgeMap is an analogous function for HyperedgeSets.

VertexProp takes as input a hypergraph H, a VertexSet U, and two boolean functions F and C. It applies F to all pairs (v, e) such that v ∈ U, v ∈ N⁻(e), and C(e) = true (call this subset of pairs P), and returns a HyperedgeSet U′ where e ∈ U′ if and only if (v, e) ∈ P and F(v, e) = true. HyperedgeProp takes as input a hypergraph H, a HyperedgeSet U, and two boolean functions F and C. It applies F to all pairs (e, v) such that e ∈ U, v ∈ N⁺(e), and C(v) = true (call this subset of pairs P), and returns a VertexSet U′ where v ∈ U′ if and only if (e, v) ∈ P and F(e, v) = true. For weighted hypergraphs, the F function takes the weight as the third argument.

We provide a function HyperedgeFilterNgh that takes as input a hypergraph H, a HyperedgeSet U, and a boolean function C, and filters out the incident vertices v for each hyperedge e ∈ U such that C(v) = false. This mutates the hypergraph, so that future computations will not inspect these vertices. HyperedgePropCount takes as input a hypergraph H, a HyperedgeSet U, and a function F, and applies F to each neighbor of U in parallel. The function F takes two arguments, the vertex v and the number of neighbors of v in U. The output is an array of pairs (v, cnt), where v is a neighbor of U and cnt is the number of neighbors of v in U. We have analogous functions for VertexSets.
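To make the operator types concrete, here is one hypothetical C++ rendering of the core operators above; these declarations are illustrative only and are not Ligra-H's actual API.

    struct Hypergraph;
    struct VertexSet;     // subset of vertex IDs in [0, nv)
    struct HyperedgeSet;  // subset of hyperedge IDs in [0, ne)

    // Applies F to each v in U; returns {v in U : F(v) == true}.
    template <class F>
    VertexSet VertexMap(VertexSet& U, F f);

    // For each v in U and each hyperedge e with v in N-(e) and C(e) == true,
    // applies F(v, e); returns the hyperedges that received a true application.
    template <class F, class C>
    HyperedgeSet VertexProp(Hypergraph& H, VertexSet& U, F f, C c);

    // Symmetric direction: from a hyperedge frontier to its outgoing vertices.
    template <class F, class C>
    VertexSet HyperedgeProp(Hypergraph& H, HyperedgeSet& U, F f, C c);

    // Destructively removes incident vertices v with C(v) == false from each
    // hyperedge in U.
    template <class C>
    void HyperedgeFilterNgh(Hypergraph& H, HyperedgeSet& U, C c);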
Ligra-H also supports the bucketing interface developed in the Julienne framework [21]. Vertices are stored in buckets associated with bucket IDs, and algorithms can process buckets in increasing or decreasing order. Vertices can be moved to different buckets during the computation. The MakeBuckets function takes a size, an array A, and an ordering, and creates a bucketing structure B that stores each vertex v in bucket A[v]. The NextBucket function takes a bucketing structure B and returns the next non-empty bucket in the specified ordering. The UpdateBuckets function takes a bucketing structure B and an array of pairs (v, bkt), and moves each vertex v from its original bucket to the bucket with ID bkt. Julienne actually groups multiple buckets together into an overflow bucket and thus has a GETBUCKET function that determines the physical bucket from a logical bucket ID, but for simplicity we will not use it in our discussion.

4.2 Implementation

A method for representing hypergraphs is to create a clique among all pairs of vertices in each hyperedge and store the result as a graph [36, 41]. However, this leads to a loss of information compared to the original hypergraph, as the groups of vertices in hyperedges are no longer distinguished. Furthermore, the space required to store the resulting graph can be significantly higher than that of the original hypergraph [36, 41]. Another approach is to use a bipartite graph representation with vertices in one partition and hyperedges in the other, where each hyperedge connects to all vertices belonging to it [36] (an example is shown in Figure 1b). This is the approach that we adopt in this paper.

In the bipartite representation, there is an edge (u, ngh) if u ∈ V and ngh ∈ N⁺(u), or u ∈ E and ngh ∈ N⁺(u). The edges for each element are stored in an adjacency array. We also store the incoming edges for each vertex and hyperedge to enable the direction optimization that we discuss in Section 4.3. For weighted hypergraphs, we store the weights interleaved with the edges in the bipartite representation for cache locality. The hypergraph can be transposed using the TRANSPOSE function, which swaps the roles of the incoming and outgoing edges for all elements.
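The following is one plausible CSR-style layout for this bipartite representation, given as a sketch; the type and field names are ours, not Ligra-H's.

    #include <vector>

    struct BipartiteHypergraph {
      long nv = 0, ne = 0;
      // Vertex side: v's hyperedges are v_edges[v_offsets[v] .. v_offsets[v+1]).
      std::vector<long> v_offsets, v_edges;
      // Hyperedge side: e's vertices are e_edges[e_offsets[e] .. e_offsets[e+1]).
      std::vector<long> e_offsets, e_edges;
      // In-edges would be stored symmetrically (omitted here) so that TRANSPOSE
      // can swap the two directions and dense traversals can pull over them.

      long out_degree_v(long v) const { return v_offsets[v + 1] - v_offsets[v]; }
      long out_degree_e(long e) const { return e_offsets[e + 1] - e_offsets[e]; }
    };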

One implementation choice that we considered was to directly pass bipartite graphs to the Ligra framework. However, this would either require hyperedges to have distinct identifiers from vertices, making it unnatural to index arrays in the application code, or require mapping the hyperedge identifiers to the range [0, ..., n_e − 1] on every array access, leading to additional overhead on hyperedge accesses and more complicated application code. Instead, we modified the Ligra code to distinguish between vertices and hyperedges and represent them using identifiers in the ranges [0, ..., n_v − 1] and [0, ..., n_e − 1], respectively. We borrow existing data structures and functions from Ligra, which we describe here for completeness.

VertexSets (HyperedgeSets) have two underlying implementations: a sparse integer array storing the IDs of the elements in the set, and a dense boolean array of length |V| (|E|) storing 1's in the locations corresponding to the IDs of the elements in the set, and 0's everywhere else.

Implementing VERTEXMAP and HYPEREDGEMAP simply requires mapping the function over the input VertexSet or HyperedgeSet, and applying a parallel filter on the result. Assuming that the function takes O(1) work (which is true in all of our applications), the overall work is O(|U|) and the depth is O(log |U|) for an input set U.

VERTEXPROP and HYPEREDGEPROP map the C function over the outgoing edges of the input set and, for the edges on which C returns true, apply the F function in parallel. A parallel scan is applied over the degrees of the elements in the input to determine offsets into an array storing the neighbors. A parallel filter is then applied over the neighbors on which F returned true to obtain the output set. For an input set U, and functions F and C that take O(1) work (which is true in all of our applications), this takes O(|U| + Σ_{u∈U} deg⁺(u)) work and O(log |H|) depth. We can remove duplicates from the output in the same bounds.

HYPEREDGEFILTERNGH can be implemented by inspecting all neighbors of each hyperedge in the input HyperedgeSet in parallel and using a parallel filter to remove the vertices not satisfying C. This takes the same work and depth as HYPEREDGEPROP. HYPEREDGEPROPCOUNT has the same work and depth bounds as HYPEREDGEPROP, as the counts can be implemented using fetch-and-adds or a semisort [31].
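As a sketch of the sparse VERTEXPROP traversal just described, the code below (building on the BipartiteHypergraph layout sketched above) is written sequentially for clarity; in Ligra-H both loops would be parallel, the offsets would come from a parallel scan, and the push_back stands in for the offset array plus parallel filter described in the text.

    #include <vector>

    template <class F, class C>
    std::vector<long> vertex_prop_sparse(const BipartiteHypergraph& H,
                                         const std::vector<long>& frontier,
                                         F f, C c) {
      std::vector<long> out;
      for (long v : frontier) {                   // parallel_for in practice
        long start = H.v_offsets[v], end = H.v_offsets[v + 1];
        for (long i = start; i < end; i++) {      // parallel_for in practice
          long e = H.v_edges[i];
          if (c(e) && f(v, e)) out.push_back(e);  // keep live target hyperedges
        }
      }
      return out;  // sparse HyperedgeSet; duplicates removable in same bounds
    }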
We refer the reader to [21] for implementation details of the bucketing structure. For the complexity of bucketing, we will use the following lemma from [21]:

Lemma 1 ([21]). For n identifiers, T total buckets, K calls to UPDATEBUCKETS, each of which updates a set S_i of identifiers, and L calls to NEXTBUCKET, bucketing takes O(n + T + Σ_{i=0}^{K} |S_i|) expected work and O((K + L) log n) depth with high probability.

4.3 Optimizations

VERTEXPROP and HYPEREDGEPROP use the direction optimization [4, 75] to switch between a sparse traversal (described in Section 4.2) and a dense traversal based on the size of the input VertexSet or HyperedgeSet and the sum of its out-degrees. For VERTEXPROP, the dense traversal loops over all hyperedges e in parallel, checking if they satisfy the C function, and if so applying F on their incoming edges serially, stopping once C(e) returns false. We use the dense traversal when the size of the input set plus the sum of its out-degrees exceeds a constant fraction (1/20 in our experiments) of the sum of the in-degrees of the hyperedges (which preserves work-efficiency), and the sparse traversal otherwise. The sum of out-degrees is computed using a parallel scan. We have an analogous implementation for HYPEREDGEPROP. The sparse traversals use the sparse set representation, and the dense traversals use the dense set representation. The input set is converted between the representations based on the traversal type.
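The selection test itself is a one-liner; the following minimal C++ sketch uses our own function name and hard-codes the 1/20 default described above.

    // Choose the dense traversal when the frontier plus its out-edges is at
    // least a 1/20 fraction of the total in-degree on the target side.
    inline bool should_use_dense(long frontier_size,
                                 long frontier_out_degrees,
                                 long target_in_degrees) {
      return frontier_size + frontier_out_degrees > target_in_degrees / 20;
    }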
For the dense traversals, instead of simply mapping over the vertices with a parallel-for loop, we added an edge-aware parallelization scheme that creates tasks containing a roughly equal number of edges, which are managed by the work-stealing scheduler [92]. We found this optimization to significantly improve load balancing for hypergraphs with highly-skewed degree distributions.

As in Ligra, we also provide a push-based dense traversal that represents the input set densely but loops over the outgoing edges of its elements, instead of over the incoming edges of all target elements.

For VERTEXPROP and HYPEREDGEPROP, we use optimized versions that do not remove duplicates, which can be used if the program guarantees that no duplicates will be generated in the output. When the output of VERTEXMAP, HYPEREDGEMAP, VERTEXPROP, or HYPEREDGEPROP is not needed, we use optimized implementations that do not call filter.

To reduce memory usage, Ligra-H supports compression of the underlying bipartite graph using the compression code from Ligra [77]. The neighbors of vertices and hyperedges are compressed using variable-length codes, and decoded on-the-fly when accessed in VERTEXPROP and HYPEREDGEPROP.

5 Parallel Hypergraph Algorithms

We have developed a collection of parallel hypergraph algorithms using Ligra-H: hypertrees, hyperpaths, betweenness centrality (BC), maximal independent set (MIS), k-core decomposition, connected components (CC), PageRank, and single-source shortest paths (SSSP). Our algorithms for betweenness centrality and k-core decomposition are new, while the connected components, PageRank, and single-source shortest paths algorithms are more efficient variants of previously described algorithms [36, 41]. The hypertrees and hyperpaths algorithms are similar to parallel breadth-first search. The maximal independent set algorithm is the first practical implementation for finding maximal independent sets in hypergraphs. Due to space constraints, we only present the betweenness centrality, maximal independent set, and k-core decomposition algorithms in this section, and present the remaining algorithms in the supplementary material. Our pseudocode uses partially evaluated functions, i.e., invoking a function with fewer than all of its arguments gives a function that takes the remaining arguments as input.

5.1 Betweenness Centrality

The betweenness centrality (BC) [25] of a vertex v measures the fraction of shortest paths between all pairs of vertices that pass through v. In this paper, we consider BC on unweighted hypergraphs, although the definition extends to weighted hypergraphs. More formally, let σ_{s,t} be the number of shortest paths between vertices s and t, σ_{s,t}(v) be the number of shortest paths between s and t that pass through v, and δ_{s,t}(v) = σ_{s,t}(v)/σ_{s,t}. The betweenness centrality of vertex v is defined to be Σ_{s≠v≠t∈V} δ_{s,t}(v). Brandes [10] presents a sequential algorithm for computing BC on graphs that takes O(|V||E|) work, where each vertex v does a forward traversal to compute the number of shortest paths from v to every other vertex, and a backward traversal to compute the betweenness centrality contributions for all vertices from shortest paths starting at v. Each traversal takes O(|E|) work. Brandes defines the dependency of a vertex s on v as δ_{s•}(v) = Σ_{t∈V} δ_{s,t}(v), and the traversals from s compute δ_{s•} values for all other vertices. The betweenness centrality of a vertex v will then be Σ_{s∈V} δ_{s•}(v). This algorithm has been parallelized in the literature (see, e.g., [64, 75, 82, 84]).

Puzis et al. [69] present a sequential algorithm for computing betweenness centrality in hypergraphs based on Brandes' algorithm. In the forward phase, a breadth-first-search-like procedure is run, generating a predecessor set for each vertex and hyperedge containing all elements in the previous level of the search. Let P_V(v) be the predecessor hyperedges of vertex v and P_E(e) be the predecessor vertices of hyperedge e for the search from source s. σ_{s,v} will be computed as Σ_{u ∈ P_E(e) : e ∈ P_V(v)} σ_{s,u}. This phase takes O(n_v + Σ_{e∈E}(deg⁺(e) · deg⁻(e))) work, as each hyperedge is expanded once per incoming vertex. Note that this work complexity can be significantly larger than the size of the hypergraph. The backward phase computes the dependency scores by iteratively propagating values from vertices to their predecessor hyperedges, and from hyperedges to their predecessor vertices, starting from the furthest elements from the source. The update equation for a hyperedge e is shown in Equation 1, and for a vertex v in Equation 2.

    \hat{\delta}_s(e) = \sum_{v \,:\, e \in P_V(v)} \frac{\delta_{s\bullet}(v)}{\sigma_{s,v}}   (1)

    \delta_{s\bullet}(v) = 1 + \sum_{e \,:\, v \in P_E(e)} \left( \sigma_{s,v} \cdot \hat{\delta}_s(e) \right)   (2)

By separating the vertex and hyperedge updates, each hyperedge and vertex only needs to be processed once, and the total work of the backward phase is O(|H|). Puzis et al. [69] also propose a heuristic for merging vertices that belong to only a single hyperedge, but the theoretical complexity remains the same.
In this section, we present a new parallel BC algorithm on hypergraphs that takes linear work per source vertex. We represent vertices and hyperedges at equal distance from the source as frontiers using VertexSets and HyperedgeSets, and process each frontier in parallel. We split the updates in the forward phase into separate update steps for vertices and hyperedges, so that each hyperedge only needs to be expanded once, giving linear work. The backward phase processes the frontiers in decreasing distance from the source, using fetch-and-adds to update the δ̂_s and δ_{s•} values. Computing exact BC scores would require running the algorithm from all sources, although in practice a subset of sources is used to compute approximate BC scores [2, 27]. As far as we know, this is the first parallel BC algorithm on hypergraphs.

Algorithm 1 Pseudocode for BC in Ligra-H
1: NumPathsV = {0, ..., 0}
2: NumPathsE = {0, ..., 0}
3: VisitedV = {0, ..., 0}
4: VisitedE = {0, ..., 0}
5: DependenciesV = {0, ..., 0}
6: DependenciesE = {0, ..., 0}
7: Levels = []
8: procedure PATHUPDATE(NumPathsSrc, NumPathsDst, s, d)
9:   return (FAA(&NumPathsDst[d], NumPathsSrc[s]) == 0)
10: procedure CHECK(Visited, i)
11:   return (Visited[i] == 0)
12: procedure VISIT(Visited, i)
13:   Visited[i] = 1
14: procedure VISITVERTEXBACK(v)
15:   VisitedV[v] = 1
16:   DependenciesV[v] += 1
17: procedure VTOE(v, e)
18:   FAA(&DependenciesE[e], DependenciesV[v]/NumPathsV[v])
19: procedure ETOV(e, v)
20:   FAA(&DependenciesV[v], DependenciesE[e] × NumPathsV[v])
21: procedure BC(H, src)  ▷ src is the source vertex
22:   NumPathsV[src] = 1, VisitedV[src] = 1
23:   VertexSet FrontierV = {src}
24:   HyperedgeSet FrontierE = {}
25:   currLevel = 0
26:   while (true) do
27:     FrontierE = VERTEXPROP(H, FrontierV, PATHUPDATE(NumPathsV, NumPathsE), CHECK(VisitedE))
28:     if |FrontierE| == 0 then break
29:     HYPEREDGEMAP(FrontierE, VISIT(VisitedE))
30:     Levels[currLevel++] = FrontierE
31:     FrontierV = HYPEREDGEPROP(H, FrontierE, PATHUPDATE(NumPathsE, NumPathsV), CHECK(VisitedV))
32:     if |FrontierV| == 0 then break
33:     VERTEXMAP(FrontierV, VISIT(VisitedV))
34:     Levels[currLevel++] = FrontierV
35:   VisitedV = {0, ..., 0}, VisitedE = {0, ..., 0}
36:   TRANSPOSE(H)
37:   currLevel = currLevel − 1
38:   while currLevel ≥ 0 do
39:     FrontierV = Levels[currLevel--]
40:     VERTEXMAP(FrontierV, VISITVERTEXBACK)
41:     VERTEXPROP(H, FrontierV, VTOE, CHECK(VisitedE))
42:     FrontierE = Levels[currLevel--]
43:     HYPEREDGEMAP(FrontierE, VISIT(VisitedE))
44:     HYPEREDGEPROP(H, FrontierE, ETOV, CHECK(VisitedV))
45:   return DependenciesV

The pseudocode for our BC algorithm from a single source is shown in Algorithm 1. We initialize auxiliary arrays, as well as the DependenciesV array storing the final dependency scores, on Lines 1–6. The forward phase of the algorithm is shown on Lines 22–37. We first set the number of paths for the source vertex to 1, mark it as visited, and place it on the initial frontier, represented as a VertexSet (Lines 22–23). While there are still reachable hyperedges and vertices, we repeatedly propagate the number of paths from vertices to hyperedges via VERTEXPROP on Line 27, and from hyperedges to vertices via HYPEREDGEPROP on Line 31. The function PATHUPDATE (Lines 8–9) passed to VERTEXPROP and HYPEREDGEPROP increments the number of paths of a successor element using a fetch-and-add. In contrast to [69], we first gather the number of paths at a hyperedge from all of its predecessor vertices before passing it to its successor vertices. In this way, a hyperedge only needs to visit each of its successor vertices once, passing the sum of all contributions from predecessor vertices. Duplicates in the output do not need to be removed, as PATHUPDATE returns true only for the first update on the target. The CHECK function (Lines 10–11) passed to VERTEXPROP and HYPEREDGEPROP guarantees that only unexplored vertices and hyperedges are visited. We mark visited hyperedges and vertices on Lines 29 and 33, respectively, to ensure that each hyperedge and vertex is visited at most once. Each frontier that is explored is placed in the Levels array, so that we can explore the frontiers in a backward fashion in the second phase of the algorithm.
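The fetch-and-add trick in PATHUPDATE is worth spelling out; the following minimal C++ sketch (using std::atomic as a stand-in for the FAA primitive of Section 3, with our own function name) shows how one atomic operation both accumulates the path count and deduplicates the output frontier.

    #include <atomic>

    // Exactly one updater observes the previous value 0, so only that updater
    // returns true and places the target on the next frontier.
    bool path_update(std::atomic<long>& num_paths_dst, long num_paths_src) {
      // fetch_add returns the value held *before* the addition.
      return num_paths_dst.fetch_add(num_paths_src) == 0;
    }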
The backward phase of the algorithm is shown on Lines 35–44. We reuse the arrays VisitedV and VisitedE (Line 35). We transpose the hypergraph (Line 36) and explore the frontiers from the first phase in a backward fashion. Line 40 uses a VERTEXMAP with the VISITVERTEXBACK function (Lines 14–16) to mark vertices on the frontier as visited and add 1 to their dependency score, as required in Equation 2. Line 41 uses a VERTEXPROP with the VTOE function (Lines 17–18) on predecessors (obtained by considering unexplored vertices via the CHECK function), which implements Equation 1. Line 43 marks hyperedges on the frontier as visited with a HYPEREDGEMAP. Finally, Line 44 implements the sum in Equation 2 with the ETOV function (Lines 19–20) on predecessors.

Analysis. We analyze the complexity for a single source vertex. In the forward phase of BC, each vertex and hyperedge will appear in at most one frontier, because once a vertex or hyperedge has been visited, its VisitedV or VisitedE entry will be marked, and it will fail the check by the CHECK function in subsequent iterations. Therefore, the sum of the sizes of all frontiers plus their out-degrees will be O(|H|). As VERTEXPROP and HYPEREDGEPROP do work proportional to the size of the input set plus the sum of its out-degrees, the overall work performed by the algorithm is O(|H|), which is work-efficient. Each call to VERTEXPROP and HYPEREDGEPROP takes O(log |H|) depth, and so the overall depth is O(D log |H|), where D is the diameter of the hypergraph. The backward phase processes each frontier exactly once, giving the same work and depth bounds. Thus, the overall work is O(|H|) and the depth is O(D log |H|).

5.2 Maximal Independent Set

Given an undirected, unweighted hypergraph, an independent set is a subset of vertices U ⊆ V such that no hyperedge has all of its incident vertices in U. In a graph, this definition is equivalent to the condition that no two vertices in the independent set are neighbors, although this equivalence does not hold for hypergraphs. A maximal independent set (MIS) is an independent set that is not contained in a larger independent set. Finding maximal independent sets in parallel has been widely studied for graphs, and there exist linear-work parallel algorithms for the problem [8, 57]. However, the problem is much harder to solve on hypergraphs, and the total work of known parallel algorithms is super-linear [3, 7, 43, 45]. As far as we know, there have been no implementations of parallel MIS algorithms on hypergraphs.

This paper implements a variant of the Beame-Luby MIS algorithm [3], which is a core component of a more recent algorithm by Bercea et al. [7]. The algorithm is iterative and performs the following steps in each iteration:
(1) Generate a sample of vertices I, each sampled with probability p = 1/(2^{d+1}∆), where d = max_{e∈E} deg(e) and ∆ is the normalized degree as defined in [3, 7].
(2) For any hyperedge e that has all of its vertices in I, remove all vertices in e from I.
(3) Add the remaining vertices in I to the MIS and delete them from V.
(4) Remove the vertices in I from all remaining hyperedges.
(5) Remove hyperedges whose vertex set is a subset of another hyperedge's vertex set.
(6) Remove hyperedges that contain only one vertex, and remove those vertices from V.
Our implementation picks vertices with a constant probability p = 1/3, as we found that this performs better in practice, and does not perform Step (5), which is not needed for correctness. The pseudocode is shown in Algorithm 2.


Algorithm 2 Pseudocode for MIS in Ligra-H
1: Flags = {0, ..., 0}
2: Counts = {0, ..., 0}
3: procedure SAMPLE(round, v)
4:   With probability p, set Flags[v] = round
5: procedure COUNT(e, v)
6:   FAA(&Counts[e], 1)
7: procedure RESETNGH(e, v)
8:   Flags[v] = 0
9: procedure FILTERV(v)
10:   return (Flags[v] == 0)
11: procedure FILTERE(e)
12:   if (deg(e) == 1 and Flags[ngh0(e)] == 0) then
13:     Flags[ngh0(e)] = 1
14:   return (deg(e) > 1)
15: procedure IND(e)
16:   return (Counts[e] == deg(e))
17: procedure RESET(e)
18:   Counts[e] = 0
19: procedure CHECKF(round, v)
20:   return (Flags[v] == round)
21: procedure MIS(H)
22:   VertexSet FrontierV = {0, ..., nv − 1}  ▷ all vertices
23:   HyperedgeSet FrontierE = {0, ..., ne − 1}  ▷ all hyperedges
24:   round = 1
25:   while (|FrontierV| > 0) do
26:     round++
27:     VERTEXMAP(FrontierV, SAMPLE(round))
28:     HYPEREDGEMAP(FrontierE, RESET)
29:     HYPEREDGEPROP(H, FrontierE, COUNT, CHECKF(round))
30:     HyperedgeSet FullEdges = HYPEREDGEMAP(FrontierE, IND)
31:     HYPEREDGEPROP(H, FullEdges, RESETNGH, CHECKF(round))
32:     HYPEREDGEFILTERNGH(H, FrontierE, FILTERV)
33:     FrontierE = HYPEREDGEMAP(FrontierE, FILTERE)
34:     FrontierV = VERTEXMAP(FrontierV, FILTERV)
35:   return Flags

Algorithm 3 Pseudocode for Coreness in Ligra-H
1: D = {deg(v0), ..., deg(v_{nv−1})}  ▷ initialized to vertex degrees
2: Flags = {0, ..., 0}  ▷ initialized to all 0
3: procedure REMOVEHYPEREDGE(v, e)
4:   return CAS(&Flags[e], 0, 1)
5: procedure CHECKREMOVED(e)
6:   return (Flags[e] == 0)
7: procedure UPDATED(k, v, numNghs)
8:   if D[v] > k then
9:     D[v] = max(D[v] − numNghs, k)
10:     return (v, D[v])
11:   else return (⊥, ⊥)
12: procedure CORENESS(H)
13:   B = MAKEBUCKETS(nv, D, INCREASING), finished = 0
14:   while (finished < nv) do
15:     (k, VertexSet FrontierV) = NEXTBUCKET(B)
16:     finished += |FrontierV|
17:     HyperedgeSet FrontierE = VERTEXPROP(H, FrontierV, REMOVEHYPEREDGE, CHECKREMOVED)
18:     Moved = HYPEREDGEPROPCOUNT(H, FrontierE, UPDATED(k))
19:     UPDATEBUCKETS(B, Moved)
20:   return D

Our implementation uses a Flags array to represent the status of vertices, with a value of Flags[v] = 0 meaning that v is undecided, Flags[v] = 1 indicating that v is not in the MIS, and any other value indicating that v is in the MIS. Flags is initialized to all 0's on Line 1. We also initialize an auxiliary array Counts, which will be used to count the number of incident vertices of hyperedges selected in the random sample (Line 2). We create initial frontiers containing all vertices and hyperedges (FrontierV and FrontierE on Lines 22–23). We also keep track of the round number (Lines 24 and 26). Line 27 uses a VERTEXMAP with the function SAMPLE (Lines 3–4) to sample vertices by marking their Flags value with the round number with probability p. We reset the Counts values for hyperedges on the frontier on Line 28. On Line 29, we count for each hyperedge the number of its vertices that were selected in the sample for this round, using HYPEREDGEPROP with the COUNT (Lines 5–6) and CHECKF (Lines 19–20) functions. We then check which hyperedges had all of their vertices selected in the sample on Line 30 with a HYPEREDGEMAP with the IND function, which checks whether the count is equal to the hyperedge's cardinality (Lines 15–16). The HyperedgeSet FullEdges contains the hyperedges where this is true, and we unmark the Flags values of their vertices on Line 31 using HYPEREDGEPROP with the RESETNGH function (Lines 7–8). On Line 32, we remove vertices that have been selected in the MIS from the hyperedges using HYPEREDGEFILTERNGH with the FILTERV function (Lines 9–10), so that we do not need to process them in future rounds. Line 33 updates the hyperedge frontier by filtering out hyperedges with cardinality 0 and 1 using the FILTERE function (Lines 11–14). For hyperedges with cardinality 1, we mark their only vertex (ngh0) as not being in the MIS. Line 34 updates the vertex frontier with the FILTERV function by filtering out vertices whose status has already been decided. The algorithm terminates when the status of all vertices has been decided, at which point FrontierV will be empty.
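As a sketch of the per-round sampling step (Lines 3–4 of Algorithm 2) with the constant probability p = 1/3 used by our implementation: the hash-based coin flip below makes the decision deterministic per (vertex, round), which is convenient for a parallel VERTEXMAP; the splitmix64-style mixer and function names are our own choices, not necessarily what the implementation uses.

    #include <cstdint>

    // Any good 64-bit mixing hash works here (this is splitmix64).
    inline uint64_t hash64(uint64_t x) {
      x += 0x9E3779B97F4A7C15ULL;
      x = (x ^ (x >> 30)) * 0xBF58476D1CE4E5B9ULL;
      x = (x ^ (x >> 27)) * 0x94D049BB133111EBULL;
      return x ^ (x >> 31);
    }

    inline bool sample(long* Flags, long round, long v) {
      // Tag v with the round number with probability ~1/3.
      if (hash64((uint64_t)v ^ ((uint64_t)round << 32)) % 3 == 0) {
        Flags[v] = round;  // v joins this round's sample I
      }
      return true;         // VERTEXMAP keeps v in the frontier either way
    }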
5.3 k-core Decomposition

For an undirected, unweighted hypergraph, a k-core is a maximal connected sub-hypergraph where every vertex has induced degree at least k. The coreness problem is to compute for each vertex the largest value of k for which it is part of the k-core. A simple parallel algorithm for coreness iteratively removes all vertices with degree at most k along with their incident hyperedges, starting with k = 0, assigning removed vertices a coreness value of k, and incrementing k when all remaining vertices have induced degree greater than k [40]. Since each iteration requires scanning over all remaining vertices, this algorithm requires a total of O(|H| + ρ|V|) work, where ρ is the number of iterations required by the algorithm, also known as the peeling complexity [21].

This section presents a new linear-work algorithm for computing coreness on hypergraphs based on the linear-work algorithm for graphs by Dhulipala et al. [21]. We have also implemented the O(|H| + ρ|V|)-work coreness algorithm in Ligra-H, and compare the performance of the work-inefficient algorithm and the work-efficient algorithm in Section 6. To obtain work-efficiency, our algorithm uses the bucketing data structure described in Section 4. The pseudocode of our algorithm is shown in Algorithm 3.

An array D is initialized with the degrees of the vertices (Line 1). This array will keep track of the induced degrees of the vertices, and also store the final coreness values of the vertices. An array Flags (Line 2) is used to keep track of whether a hyperedge has been deleted (0 means not deleted and 1 means deleted). Line 13 initializes the bucketing structure, specifying that the buckets should be processed in increasing order. Line 15 gets the next non-empty bucket in increasing order, and returns k, which corresponds to the current k-core being processed, as well as the vertices with degree at most k in a VertexSet FrontierV. Line 17 marks the hyperedges incident to FrontierV as deleted using VERTEXPROP with the functions REMOVEHYPEREDGE (Lines 3–4) and CHECKREMOVED (Lines 5–6). Hyperedges that are deleted will be returned in the HyperedgeSet FrontierE, and duplicates do not need to be removed, since the CAS on Line 4 will return true exactly once per hyperedge. On Line 18, we update the induced degrees of the vertices due to the removal of hyperedges using a HYPEREDGEPROPCOUNT. The UPDATED function (Lines 7–11) will decrement the induced degree of each vertex v by its number of neighbors in FrontierE (numNghs), and set it to k if it falls below k, since this means v will have a coreness value of k. UPDATED returns a pair indicating the target bucket of the vertex v, which is its new induced degree D[v] (Line 10). For vertices whose coreness value has already been determined, the pair (⊥, ⊥) is returned (Line 11). These pairs are stored in the Moved array output by HYPEREDGEPROPCOUNT. Line 19 moves the vertices to new buckets using UPDATEBUCKETS with the Moved array as input. UPDATEBUCKETS ignores the (⊥, ⊥) pairs. The algorithm terminates when all vertices have been extracted from the bucket structure and processed.

Analysis. Each hyperedge will place each of its incident vertices in the Moved array only when it is deleted. Therefore, the total size of the sets passed to UPDATEBUCKETS is O(Σ_{e∈E} deg(e)). The number of identifiers in the bucket structure is n_v, and the number of buckets is at most the maximum vertex degree, which is O(n_e). The total number of calls to UPDATEBUCKETS and NEXTBUCKET is the peeling complexity ρ. Using Lemma 1, we obtain an expected work of O(|H|) and depth of O(ρ log |H|) with high probability.
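A minimal C++ sketch of the UPDATED logic (Lines 7–11 of Algorithm 3) follows; kNullPair plays the role of (⊥, ⊥), which UPDATEBUCKETS ignores, and the names are ours.

    #include <algorithm>
    #include <utility>

    static const std::pair<long, long> kNullPair(-1, -1);

    std::pair<long, long> updated(long* D, long k, long v, long numNghs) {
      if (D[v] > k) {
        D[v] = std::max(D[v] - numNghs, k);  // induced degree never drops below k
        return {v, D[v]};                    // move v to bucket D[v]
      }
      return kNullPair;                      // v's coreness already determined
    }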
6 Experiments

Experimental Setup. We run all of our experiments on a 72-core Dell PowerEdge R930 (with two-way hyper-threading) with four 2.4GHz 18-core E7-8867 v4 Xeon processors, each with a 45MB cache, and a total of 1TB of RAM. Our programs use Cilk Plus [53] for parallelism and are compiled with the g++ compiler (version 5.5.0) with the -O3 flag. By using Cilk's work-stealing scheduler, we are able to obtain an expected running time of W/P + O(D) for an algorithm with W work and D depth on P processors [9]. Ligra-H also supports compilation with OpenMP.

For the parallel experiments, we use the command numactl -i all to balance the memory allocations across the sockets. All of the speedup numbers we report are the running times of our parallel implementation on 72 cores with hyper-threading over the running time of the implementation on a single thread.

Data Sets. Our input hypergraphs are shown in Table 1. com-Orkut and Friendster are constructed using the community data from [54], where each community is a hyperedge containing its members as vertices. These are the largest real-world datasets used by prior work on hypergraph processing [36, 41]. We also include three larger real-world datasets, Orkut-group, Web, and LiveJournal, which are constructed from bipartite graphs from the Koblenz Network Collection [50]. To test on larger inputs, we also constructed synthetic random hypergraphs. Rand1 and Rand2 have 10^8 and 10^9 vertices/hyperedges, respectively, where the cardinality of each hyperedge is 10 and its member vertices are chosen uniformly at random. Rand3 has 10^7 vertices/hyperedges, and the cardinality of each hyperedge is 100, with its member vertices chosen uniformly at random. For input-size scalability experiments, we also generated random hypergraphs with varying sizes and hyperedge cardinalities. For SSSP, we use weighted versions of the hypergraphs with random hyperedge weights from 1 to ⌊log2(max(n_v, n_e))⌋. The inputs are all undirected.

Results. Table 2 shows the sequential and parallel running times of our algorithms, as well as their parallel speedup. The BC times are for a single source, and the PageRank times are for 1 iteration. For k-core, we include times for both the work-efficient (WE) version and the work-inefficient (WI) version. We did not include hyperpaths in our experiments because the hyperpaths found in the inputs are too small to give meaningful running times. For the Orkut-group, Web, and LiveJournal inputs, we used the edge-aware parallelization scheme due to their highly skewed degree distributions.

Overall, the algorithms get good parallel speedup, ranging from 8.5–76.5x, and the parallel times on the real-world inputs are usually under 1 second. The random hypergraphs are larger than hypergraphs used in prior work, and we are able to achieve parallel running times on the order of seconds for Rand1 and Rand3 and tens of seconds for Rand2. The lower speedups for k-core on the real-world inputs are due to the large number of peeling rounds (see Table 1), many of which have few active vertices and hyperedges.

We see that our work-efficient k-core algorithm is usually much faster than the work-inefficient version, by a factor of up to 733x in parallel, as it does less work. The benefit is higher for the inputs with more peeling rounds (e.g., the Web hypergraph).

Figure 2 shows the running time vs. number of threads for all of the algorithms on Rand1. We see good parallel scalability for all of the algorithms, with speedups ranging from 31–53x on 72 cores with hyper-threading.

Figure 3 shows the running time vs. hyperedge count for all of the algorithms on random hypergraphs with 10^7 vertices and cardinality-10 hyperedges (we also tried fixing the vertex and hyperedge count and varying the cardinality, and found similar trends). We see a near-linear increase in running time for all of the algorithms except hypertree and k-core, which have a sub-linear increase. For hypertree, the number of edges traversed increases sub-linearly due to the direction optimization, which avoids many edge traversals. For k-core, the peeling complexity, and hence the running time, does not increase linearly with the number of hyperedges.


Hypergraph    |V|        |E|        Σ_{e∈E} deg(e)  max_v deg(v)  max_e deg(e)  Peeling rounds (ρ)  Clique-expanded edges
com-Orkut     2.32×10^6  1.53×10^7  1.07×10^8       2958          9120          1698                3.87×10^10
Friendster    7.94×10^6  1.62×10^6  2.35×10^7       1700          9299          351                 5.53×10^9
Orkut-group   2.78×10^6  8.73×10^6  3.27×10^8       40425         3.18×10^5     2923                2.45×10^12
Web           2.77×10^7  1.28×10^7  1.41×10^8       1.1×10^6      1.16×10^7     3.18×10^5           1.06×10^14
LiveJournal   3.20×10^6  7.49×10^6  1.12×10^8       300           1.05×10^6     820                 2.7×10^12
Rand1         10^8       10^8       10^9            34            10            30                  4.45×10^9
Rand2         10^9       10^9       10^10           35            10            33                  4.5×10^10
Rand3         10^7       10^7       10^9            153           100           109                 4.95×10^10

Table 1. Hypergraph inputs.

              com-Orkut            Friendster           Rand1                Rand2
Algorithm     T1     T72    SU     T1     T72    SU     T1     T72    SU     T1     T72    SU
Hypertree     1.04   0.031  33.5   0.803  0.022  36.5   24.3   0.676  35.9   321    8.97   35.8
BC            5.31   0.12   44.3   2.7    0.07   38.6   131.0  3.72   35.2   1890   39.4   48.0
CC            7.87   0.162  48.6   3.34   0.082  40.5   330.0  9.88   33.4   4190   121    34.6
PageRank      3.31   0.083  39.9   0.941  0.026  36.2   84.1   2.61   32.2   955    28.6   33.4
SSSP          8.81   0.157  56.1   3.54   0.107  76.5   290.0  6.76   43.3   5730   79.8   71.8
MIS           7.73   0.227  34.1   3.26   0.11   29.6   154.0  4.19   36.8   1680   44.5   37.8
WE k-core     7.09   0.738  9.61   2.09   0.081  25.8   116.0  2.17   53.5   2210   31.8   69.5
WI k-core     33.9   1.88   18.0   9.72   0.421  23.1   133.0  3.4    39.1   1150   33.6   34.2

              Rand3                Orkut-group          Web                  LiveJournal
Algorithm     T1     T72    SU     T1     T72    SU     T1     T72    SU     T1     T72    SU
Hypertree     2.18   0.047  46.4   0.551  0.021  26.2   2.71   0.068  39.9   0.754  0.022  34.3
BC            40.8   0.82   49.8   7.58   0.141  53.8   10.7   0.517  20.7   3.66   0.099  37.0
CC            70.5   1.07   65.9   11.6   0.18   64.4   11.0   0.478  23.0   4.01   0.081  49.5
PageRank      57.3   1.6    35.8   6.88   0.119  57.8   5.27   0.27   19.5   2.66   0.062  42.9
SSSP          54.0   1.0    54.0   14.7   0.261  56.3   7.04   0.245  28.7   5.56   0.118  47.1
MIS           68.0   1.09   62.4   9.86   0.411  24.0   17.8   2.09   8.52   4.92   0.434  11.3
WE k-core     41.0   0.866  47.3   13.7   0.831  16.5   12.5   0.965  13.0   6.13   0.325  18.9
WI k-core     87.3   1.18   74.0   96.6   3.34   28.9   23500  707.0  33.2   16.9   0.905  18.7

Table 2. Sequential (T1) and parallel (T72) times (seconds), as well as the parallel speedup (SU), on a 72-core machine with hyper-threading.
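As a sanity check on Table 2, the speedup column is simply the single-thread time divided by the 72-core time; for example, for PageRank on com-Orkut:

    \mathrm{SU} = \frac{T_1}{T_{72}} = \frac{3.31\,\mathrm{s}}{0.083\,\mathrm{s}} \approx 39.9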

Figure 2. Running time (seconds) vs. number of threads on Rand1; "72h" refers to 144 hyper-threads. [plot omitted]

Figure 4. Running times of dense, sparse, and hybrid traversals on com-Orkut using 72 cores with hyper-threading. [plot omitted]

Figure 3. Running time (seconds) vs. number of hyperedges on 72 cores with hyper-threading. [plot omitted]

Figures 4 and 5 show the impact of the direction optimization on com-Orkut and LiveJournal. We plot the running time using all sparse traversals, all dense traversals, and hybrid traversals with the default threshold of a 1/20 fraction of the sum of in-degrees of the hyperedges for VERTEXPROP and of the sum of in-degrees of the vertices for HYPEREDGEPROP. For all of the algorithms, we see that the hybrid traversal is the fastest or tied for the fastest among the three cases. We found the default threshold to work reasonably well across all our applications and inputs, and the performance was similar across a wide range of thresholds (experiments showing the running time with varying thresholds are provided in the supplementary material).

In Table 3, we report the fraction of cycles stalled due to memory accesses, the LLC miss rate, and the memory bandwidth for several algorithms on com-Orkut, Rand1, and LiveJournal. We see that the cache miss rate and memory bandwidth are highest for the random hypergraph, Rand1, as the edges have very little locality. The memory bandwidth is close to the peak bandwidth of the machine, and the algorithms on Rand1 are memory bandwidth-bound. com-Orkut and LiveJournal exhibit locality in their structure, and thus have lower cache miss rates and require fewer requests to DRAM, thereby lowering the memory bandwidth. However, a decent fraction of the cycles are still stalled waiting for memory accesses, making the algorithms memory latency-bound. All of the algorithms benefit from spatial locality when traversing the adjacency lists in the bipartite representation.

              com-Orkut                       Rand1                           LiveJournal
Algorithm     Stalled  LLC miss  BW (GB/s)    Stalled  LLC miss  BW (GB/s)    Stalled  LLC miss  BW (GB/s)
Hypertree     0.364    0.122     144.7        0.726    0.551     161.7        0.28     0.161     135.5
BC            0.462    0.111     131.1        0.804    0.808     161.8        0.35     0.097     120.1
CC            0.444    0.089     134.4        0.837    0.833     147.5        0.40     0.044     126.1
PageRank      0.79     0.24      123.1        0.927    0.933     146.6        0.69     0.163     104.0
SSSP          0.5      0.098     132.1        0.842    0.781     146.2        0.49     0.057     123.2
WE k-core     0.367    0.358     53.86        0.573    0.422     140.1        0.39     0.286     72.5

Table 3. Fraction of cycles stalled on memory requests, LLC miss rate, and memory bandwidth (GB/s). All experiments use 72 cores with hyper-threading.

Figure 5. Running times of dense, sparse, and hybrid traversals on LiveJournal using 72 cores with hyper-threading. [plot omitted]

Comparison with Alternatives. While it is difficult to directly compare with HyperX and MESH, as they are designed for distributed memory, we first perform a rough comparison in terms of the running times reported in their papers [36, 41]. MESH reports a running time of about 1 minute per iteration on com-Orkut using a cluster of eight 12-core machines, and HyperX reports a running time of about 10s using a cluster of eight 4-core machines (HyperX's algorithm is for random walks, which does less work than PageRank per iteration). In contrast, one iteration of Ligra-H's PageRank on com-Orkut takes 0.083s on 72 cores and 3.31s on one thread. Even adjusting for differences in processor specifications, we are significantly faster than their reported parallel numbers using just a single thread, and orders of magnitude faster in parallel. The large difference in performance of MESH and HyperX compared to Ligra-H is due to the higher communication costs of distributed memory and the overheads of Spark.

We also ran MESH on our 72-core machine and did a sweep of the parameter space (partition strategy and number of partitions); the best running time we obtained for one iteration of PageRank on com-Orkut was over 2 minutes, which is much slower than Ligra-H's time. MESH reports competitive performance with HyperX [36], and so we expect the performance of HyperX to be in the same ballpark. For frontier-based algorithms, our speedups would be even higher, as HyperX and MESH require work proportional to the hypergraph size on every iteration, whereas we only do work proportional to the frontier size plus the sum of its out-degrees. We ran the single-source shortest paths algorithm from MESH (which works on unit weights and so is similar to our hypertree algorithm) on our 72-core machine and observed that for com-Orkut just the first iteration takes over 1 minute. This is much slower than the Ligra-H time for running the algorithm to convergence.

As mentioned in Section 4.2, another method for representing a hypergraph is to create a clique among all vertices for each hyperedge, store the result as a graph (known as the clique-expanded graph), and apply graph algorithms on it. This approach would work for algorithms that do not treat hyperedges differently from vertices, such as hypertree, connected components, and single-source shortest paths (for algorithms that treat the hyperedges specially, this approach would generate incorrect results). We show the number of edges in the clique-expanded graph for each of our inputs in Table 1. We see that the sizes are several orders of magnitude greater than those of the corresponding hypergraphs using the bipartite graph representation. As a baseline, we ran Ligra's breadth-first search, connected components, and SSSP implementations on the clique-expanded graph for Friendster (which is 235x larger than the bipartite representation) on 72 cores. Breadth-first search took 0.061s, which is 2.8x slower than Ligra-H's hypertree implementation (see Table 2). Connected components took 2.35s, which is 28.7x slower than Ligra-H. SSSP took 3.27s, which is 30.6x slower than Ligra-H. The overhead is due to additional edge traversals in the clique-expanded graph. However, the running-time overhead is not as high as the space overhead, since the clique-expanded graph is much denser and has better locality. The overhead is only 2.8x for breadth-first search, since the dense traversal optimization allows many edges to be skipped.
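To see why the clique-expanded sizes in Table 1 are so large, note that a hyperedge of cardinality c expands to c(c−1)/2 clique edges; for Rand2's 10^9 cardinality-10 hyperedges, for example:

    \binom{10}{2} \times 10^{9} = 45 \times 10^{9} = 4.5 \times 10^{10} \text{ clique-expanded edges}

which matches the Rand2 entry in Table 1. For the real-world inputs, a few very large hyperedges make the blowup far more extreme (e.g., about 10^14 edges for Web).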
7 Conclusion

We have presented a suite of parallel hypergraph algorithms with strong theoretical guarantees. We implemented the algorithms by extending the Ligra graph processing framework to handle hypergraphs. Our experiments show that the algorithms achieve good parallel scalability and significantly better performance than prior work. Future work includes extending graph optimizations for locality and scalability (e.g., [5, 48, 52, 79, 86, 90, 91, 95]) to hypergraphs.

Acknowledgements. This research was supported by DOE Early Career Award #DE-SC0018947, NSF CAREER Award #CCF-1845763, DARPA SDH Award #HR0011-18-3-0007, and Applications Driving Architectures (ADA) Research Center, a JUMP Center co-sponsored by SRC and DARPA.

References
[1] Giorgio Ausiello, Giuseppe F. Italiano, and Umberto Nanni. 1998. Hypergraph traversal revisited: Cost measures and dynamic algorithms. In Mathematical Foundations of Computer Science. 1–16.
[2] David A. Bader, Shiva Kintali, Kamesh Madduri, and Milena Mihail. 2007. Approximating betweenness centrality. In Workshop on Algorithms and Models for the Web-Graph (WAW). 124–137.
[3] Paul Beame and Michael Luby. 1990. Parallel Search for Maximal Independence Given Minimal Dependence. In ACM-SIAM Symposium on Discrete Algorithms (SODA). 212–218.
[4] Scott Beamer, Krste Asanovic, and David Patterson. 2012. Direction-optimizing breadth-first search. In ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis (SC). Article 12, 12:1–12:10 pages.
[5] S. Beamer, K. Asanovic, and D. Patterson. 2017. Reducing Pagerank Communication via Propagation Blocking. In IEEE International Parallel and Distributed Processing Symposium (IPDPS). 820–831.
[6] Abdelghani Bellaachia and Mohammed Al-Dhelaan. 2013. Random Walks in Hypergraph. In International Conference on Applied Mathematics and Computational Methods.
[7] Ioana O. Bercea, Navin Goyal, David G. Harris, and Aravind Srinivasan. 2017. On Computing Maximal Independent Sets of Hypergraphs in Parallel. ACM Trans. Parallel Comput. 3, 1, Article 5 (Jan. 2017), 5:1–5:13 pages.
[8] Guy E. Blelloch, Jeremy T. Fineman, and Julian Shun. 2012. Greedy sequential maximal independent set and matching are parallel on average. In ACM Symposium on Parallelism in Algorithms and Architectures (SPAA). 308–317.
[9] Robert D. Blumofe and Charles E. Leiserson. 1999. Scheduling Multithreaded Computations by Work Stealing. J. ACM 46, 5 (Sept. 1999), 720–748.
[10] Ulrik Brandes. 2001. A Faster Algorithm for Betweenness Centrality. Journal of Mathematical Sociology 25 (2001), 163–177.
[11] Richard P. Brent. 1974. The Parallel Evaluation of General Arithmetic Expressions. J. ACM 21, 2 (April 1974), 201–206.
[12] Alain Bretto, Hocine Cherifi, and Driss Aboutajdine. 2002. Hypergraph imaging: an overview. Pattern Recognition 35, 3 (2002), 651–658.
[13] S. Brin and L. Page. 1998. The Anatomy of a Large-Scale Hypertextual Web Search Engine. In Computer Networks and ISDN Systems. 107–117.
[14] U. V. Catalyurek and C. Aykanat. 1999. Hypergraph-partitioning-based decomposition for parallel sparse-matrix vector multiplication. IEEE Transactions on Parallel and Distributed Systems 10, 7 (Jul 1999), 673–693.
[15] Umit V. Catalyurek, Erik G. Boman, Karen D. Devine, Doruk Bozdağ, Robert T. Heaphy, and Lee Ann Riesen. 2009. A repartitioning hypergraph model for dynamic load balancing. J. Parallel and Distrib. Comput. 69, 8 (2009), 711–724.
[16] L. Chen, X. Huo, B. Ren, S. Jain, and G. Agrawal. 2015. Efficient and Simplified Parallel Graph Processing over CPU and MIC. In IEEE International Parallel and Distributed Processing Symposium (IPDPS). 819–828.
[17] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. 2009. Introduction to Algorithms (3rd ed.). MIT Press.
[18] Roshan Dathathri, Gurbinder Gill, Loc Hoang, Hoang-Vu Dang, Alex Brooks, Nikoli Dryden, Marc Snir, and Keshav Pingali. 2018. Gluon: A Communication-optimizing Substrate for Distributed Heterogeneous Graph Analytics. In ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI). 752–768.
[19] Roshan Dathathri, Gurbinder Gill, Loc Hoang, and Keshav Pingali. 2019. Phoenix: A Substrate for Resilient Distributed Graph Analytics. In International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS). 615–630.
[20] Karen D. Devine, Erik G. Boman, Robert T. Heaphy, Rob H. Bisseling, and Umit V. Catalyurek. 2006. Parallel Hypergraph Partitioning for Scientific Computing. In International Conference on Parallel and Distributed Processing (IPDPS). 124–124.
[21] Laxman Dhulipala, Guy Blelloch, and Julian Shun. 2017. Julienne: A Framework for Parallel Graph Algorithms Using Work-efficient Bucketing. In ACM Symposium on Parallelism in Algorithms and Architectures (SPAA). 293–304.
[22] Laxman Dhulipala, Guy E. Blelloch, and Julian Shun. 2018. Theoretically Efficient Parallel Graph Algorithms Can Be Fast and Scalable. In ACM Symposium on Parallelism in Algorithms and Architectures (SPAA). 393–404.
[23] Aurelien Ducournau and Alain Bretto. 2014. Random walks in directed hypergraphs and application to semi-supervised image segmentation. Computer Vision and Image Understanding 120 (2014), 91–102.
[24] D. Ediger, R. McColl, J. Riedy, and D. A. Bader. 2012. STINGER: High performance data structure for streaming graphs. In IEEE Conference on High Performance Extreme Computing (HPEC). 1–5.
[25] Linton Freeman. 1977. A set of measures of centrality based upon betweenness. Sociometry 40 (1977), 35–41.
[26] Giorgio Gallo, Giustino Longo, Stefano Pallottino, and Sang Nguyen. 1993. Directed hypergraphs and applications. Discrete Applied Mathematics 42, 2 (1993), 177–201.
[27] Robert Geisberger, Peter Sanders, and Dominik Schultes. 2008. Better Approximation of Betweenness Centrality. In Algorithms Engineering and Experiments (ALENEX). 90–100.
[28] Gurbinder Gill, Roshan Dathathri, Loc Hoang, Andrew Lenharth, and Keshav Pingali. 2018. Abelian: A Compiler for Graph Analytics on Distributed, Heterogeneous Platforms. In Euro-Par. 249–264.
[29] Joseph Gonzalez, Yucheng Low, Haijie Gu, Danny Bickson, and Carlos Guestrin. 2012. PowerGraph: Distributed Graph-Parallel Computation on Natural Graphs. In USENIX Symposium on Operating System Design and Implementation (OSDI). 17–30.
[30] Samuel Grossman, Heiner Litz, and Christos Kozyrakis. 2018. Making Pull-based Graph Processing Performant. In ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP). 246–260.
[31] Yan Gu, Julian Shun, Yihan Sun, and Guy E. Blelloch. 2015. A Top-Down Parallel Semisort. In ACM Symposium on Parallelism in Algorithms and Architectures (SPAA). 24–34.
[32] Torben Hagerup. 1992. Fast and optimal simulations between CRCW PRAMs. In Annual Symposium on Theoretical Aspects of Computer Science (STACS). 45–56.
[33] Harshvardhan, Adam Fidel, Nancy M. Amato, and Lawrence Rauchwerger. 2012. The STAPL Parallel Graph Library. In Languages and Compilers for Parallel Computing (LCPC). 46–60.
[34] Harshvardhan, Adam Fidel, Nancy M. Amato, and Lawrence Rauchwerger. 2014. KLA: a new algorithmic paradigm for parallel graph computations. In International Conference on Parallel Architectures and Compilation (PACT). 27–38.
[35] Harshvardhan, Adam Fidel, Nancy M. Amato, and Lawrence Rauchwerger. 2015. An Algorithmic Approach to Communication Reduction in Parallel Graph Algorithms. In International Conference on Parallel Architecture and Compilation (PACT). 201–212.
[36] Benjamin Heintz, Rankyung Hong, Shivangi Singh, Gaurav Khandelwal, Corey Tesdahl, and Abhishek Chandra. 2019. MESH: A Flexible Distributed Hypergraph Processing System. In IEEE International Conference on Cloud Engineering (IC2E).


[37] C. Hong, A. Sukumaran-Rajam, J. Kim, and P. Sadayappan. 2017. MultiGraph: Efficient Graph Processing on GPUs. In International Conference on Parallel Architectures and Compilation Techniques (PACT). 27–40.
[38] J. Jaja. 1992. Introduction to Parallel Algorithms. Addison-Wesley Professional.
[39] Louis Jenkins, Tanveer Hossain Bhuiyan, Sarah Harun, Christopher Lightsey, David Mentgen, Sinan G. Aksoy, Timothy Stavenger, Marcin Zalewski, Hugh R. Medal, and Cliff Joslyn. 2018. Chapel HyperGraph Library (CHGL). In IEEE High Performance Extreme Computing Conference (HPEC). 1–6.
[40] Jiayang Jiang, Michael Mitzenmacher, and Justin Thaler. 2017. Parallel Peeling Algorithms. ACM Trans. Parallel Comput. 3, 1, Article 7 (Jan. 2017), 7:1–7:27 pages.
[41] W. Jiang, J. Qi, J. X. Yu, J. Huang, and R. Zhang. 2019. HyperX: A Scalable Hypergraph Framework. IEEE Transactions on Knowledge and Data Engineering 31, 5 (2019), 909–922.
[42] U. Kang, Charalampos E. Tsourakakis, and Christos Faloutsos. 2011. PEGASUS: mining peta-scale graphs. Knowl. Inf. Syst. 27, 2 (2011), 303–325.
[43] Richard M. Karp, Eli Upfal, and Avi Wigderson. 1988. The Complexity of Parallel Search. J. Comput. Syst. Sci. 36, 2 (April 1988), 225–253.
[44] G. Karypis, R. Aggarwal, V. Kumar, and S. Shekhar. 1999. Multilevel hypergraph partitioning: applications in VLSI domain. IEEE Transactions on Very Large Scale Integration (VLSI) Systems 7, 1 (March 1999), 69–79.
[45] Pierre Kelsen. 1992. On the Parallel Complexity of Computing a Maximal Independent Set in a Hypergraph. In ACM Symposium on Theory of Computing (STOC). 339–350.
[46] Jeremy Kepner, Peter Aaltonen, David A. Bader, Aydin Buluç, Franz Franchetti, John R. Gilbert, Dylan Hutchison, Manoj Kumar, Andrew Lumsdaine, Henning Meyerhenke, Scott McMillan, Carl Yang, John D. Owens, Marcin Zalewski, Timothy G. Mattson, and José E. Moreira. 2016. Mathematical foundations of the GraphBLAS. In IEEE High Performance Extreme Computing Conference (HPEC). 1–9.
[47] Gaurav Khanna, Nagavijayalakshmi Vydyanathan, T. Kurc, U. Catalyurek, P. Wyckoff, J. Saltz, and P. Sadayappan. 2005. A hypergraph partitioning based approach for scheduling of tasks with batch-shared I/O. In IEEE International Symposium on Cluster Computing and the Grid (CCGRID). 792–799.
[48] Vladimir Kiriansky, Yunming Zhang, and Saman P. Amarasinghe. 2016. Optimizing Indirect Memory References with milk. In International Conference on Parallel Architectures and Compilation (PACT). 299–312.
[49] Sriram Krishnamoorthy, Umit Catalyurek, Jarek Nieplocha, Atanas Rountev, and P. Sadayappan. 2006. Hypergraph Partitioning for Automatic Memory Hierarchy Management. In ACM/IEEE Conference on Supercomputing (SC).
[50] Jérôme Kunegis. 2013. KONECT: The Koblenz Network Collection. In International Conference on World Wide Web (WWW). 1343–1350.
[51] Aapo Kyrola, Guy Blelloch, and Carlos Guestrin. 2012. GraphChi: Large-Scale Graph Computation on Just a PC. In USENIX Symposium on Operating Systems Design and Implementation (OSDI). 31–46.
[52] Kartik Lakhotia, Rajgopal Kannan, and Viktor Prasanna. 2018. Accelerating PageRank using Partition-Centric Processing. In USENIX Annual Technical Conference (ATC). 427–440.
[53] Charles E. Leiserson. 2010. The Cilk++ concurrency platform. J. Supercomputing 51, 3 (2010).
[54] Jure Leskovec and Andrej Krevl. 2019. SNAP Datasets: Stanford Large Network Dataset Collection. http://snap.stanford.edu/data.
[55] Heng Lin, Xiaowei Zhu, Bowen Yu, Xiongchao Tang, Wei Xue, Wenguang Chen, Lufei Zhang, Torsten Hoefler, Xiaosong Ma, Xin Liu, Weimin Zheng, and Jingfang Xu. 2018. ShenTu: Processing Multi-trillion Edge Graphs on Millions of Cores in Seconds. In Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis (SC). 56:1–56:11.
[56] Yucheng Low, Joseph Gonzalez, Aapo Kyrola, Danny Bickson, Carlos Guestrin, and Joseph M. Hellerstein. 2010. GraphLab: A New Parallel Framework for Machine Learning. In Conference on Uncertainty in Artificial Intelligence (UAI). 340–349.
[57] Michael Luby. 1986. A simple parallel algorithm for the maximal independent set problem. SIAM J. Comput. 15, 4 (November 1986), 1036–1055.
[58] P. Macko, V. J. Marathe, D. W. Margo, and M. I. Seltzer. 2015. LLAMA: Efficient graph analytics using Large Multiversioned Arrays. In IEEE International Conference on Data Engineering (ICDE). 363–374.
[59] Saeed Maleki, G. Carl Evans, and David A. Padua. 2015. Tiled Linear Algebra a System for Parallel Graph Algorithms. In Languages and Compilers for Parallel Computing. 116–130.
[60] Grzegorz Malewicz, Matthew H. Austern, Aart J. C. Bik, James C. Dehnert, Ilan Horn, Naty Leiser, and Grzegorz Czajkowski. 2010. Pregel: a system for large-scale graph processing. In ACM Conference on Management of Data (SIGMOD). 135–146.
[61] Robert Ryan McCune, Tim Weninger, and Greg Madey. 2015. Thinking Like a Vertex: A Survey of Vertex-Centric Frameworks for Large-Scale Distributed Graph Processing. ACM Comput. Surv. 48, 2, Article 25 (Oct. 2015), 39 pages.
[62] Frank McSherry, Michael Isard, and Derek G. Murray. 2015. Scalability! But at What COST? In USENIX Conference on Hot Topics in Operating Systems (HotOS).
[63] Ke Meng, Jiajia Li, Guangming Tan, and Ninghui Sun. 2019. A pattern based algorithmic autotuner for graph processing on GPUs. In ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP). 201–213.
[64] Donald Nguyen, Andrew Lenharth, and Keshav Pingali. 2013. A Lightweight Infrastructure for Graph Analytics. In ACM Symposium on Operating Systems Principles (SOSP). 456–471.
[65] Lars Relund Nielsen, Kim Allan Andersen, and Daniele Pretolani. 2005. Finding the K Shortest Hyperpaths. Comput. Oper. Res. 32, 6 (June 2005), 1477–1497.
[66] Amir Hossein Nodehi Sabet, Junqiao Qiu, and Zhijia Zhao. 2018. Tigr: Transforming Irregular Graphs for GPU-Friendly Graph Processing. In International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS). 622–636.
[67] Sreepathi Pai and Keshav Pingali. 2016. A Compiler for Throughput Optimization of Graph Algorithms on GPUs. In ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA). 1–19.
[68] Zhen Peng, Alexander Powell, Bo Wu, Tekin Bicer, and Bin Ren. 2018. Graphphi: efficient parallel graph processing on emerging throughput-oriented architectures. In International Conference on Parallel Architectures and Compilation Techniques (PACT). 9:1–9:14.
[69] Rami Puzis, Manish Purohit, and V. S. Subrahmanian. 2013. Betweenness computation in the single graph representation of hypergraphs. Social Networks 35, 4 (2013), 561–572.
[70] S. Riazi and B. Norris. 2016. GraphFlow: Workflow-based big graph processing. In IEEE International Conference on Big Data (Big Data). 3336–3343.
[71] A. Ritz, B. Avent, and T. M. Murali. 2017. Pathway Analysis with Signaling Hypergraphs. IEEE/ACM Transactions on Computational Biology and Bioinformatics 14, 5 (Sept 2017), 1042–1055.
[72] D. Sengupta, S. L. Song, K. Agarwal, and K. Schwan. 2015. GraphReduce: processing large-scale graphs on accelerator-based systems. In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC). 1–12.
[73] Xuanhua Shi, Zhigao Zheng, Yongluan Zhou, Hai Jin, Ligang He, Bo Liu, and Qiang-Sheng Hua. 2018. Graph Processing on GPUs: A Survey. ACM Comput. Surv. 50, 6, Article 81 (Jan. 2018), 81:1–81:35 pages.


[74] Yossi Shiloach and Uzi Vishkin. 1982. An O(log n) Parallel Connectivity Algorithm. J. Algorithms 3, 1 (1982), 57–67.
[75] Julian Shun and Guy E. Blelloch. 2013. Ligra: A Lightweight Graph Processing Framework for Shared Memory. In ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP). 135–146.
[76] Julian Shun, Laxman Dhulipala, and Guy E. Blelloch. 2014. A Simple and Practical Linear-Work Parallel Algorithm for Connectivity. In ACM Symposium on Parallelism in Algorithms and Architectures (SPAA). 143–153.
[77] Julian Shun, Laxman Dhulipala, and Guy E. Blelloch. 2015. Smaller and Faster: Parallel Processing of Compressed Graphs with Ligra+. In IEEE Data Compression Conference (DCC). 403–412.
[78] Shuang Song, Xu Liu, Qinzhe Wu, Andreas Gerstlauer, Tao Li, and Lizy K. John. 2018. Start Late or Finish Early: A Distributed Graph Processing System with Redundancy Reduction. PVLDB 12, 2 (2018), 154–168.
[79] Jiawen Sun, Hans Vandierendonck, and Dimitrios S. Nikolopoulos. 2017. GraphGrind: Addressing Load Imbalance of Graph Partitioning. In International Conference on Supercomputing (ICS). Article 16, 16:1–16:10 pages.
[80] Jiawen Sun, Hans Vandierendonck, and Dimitrios S. Nikolopoulos. 2019. VEBO: a vertex- and edge-balanced ordering heuristic to load balance parallel graph processing. In ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP). 391–392.
[81] Narayanan Sundaram, Nadathur Satish, Md Mostofa Ali Patwary, Subramanya R. Dulloor, Michael J. Anderson, Satya Gautam Vadlamudi, Dipankar Das, and Pradeep Dubey. 2015. GraphMat: High Performance Graph Analytics Made Productive. Proc. VLDB Endow. 8, 11 (July 2015), 1214–1225.
[82] Guangming Tan, Dengbiao Tu, and Ninghui Sun. 2009. A Parallel Algorithm for Computing Betweenness Centrality. In International Conference on Parallel Processing (ICPP). 340–347.
[83] Aleksandar Trifunovic and William J. Knottenbelt. 2008. Parallel multilevel algorithms for hypergraph partitioning. J. Parallel and Distrib. Comput. 68, 5 (2008), 563–581.
[84] Dengbiao Tu and Guangming Tan. 2009. Characterizing Betweenness Centrality Algorithm on Multi-core Architectures. In IEEE International Symposium on Parallel and Distributed Processing with Applications (ISPA). 182–189.
[85] Lei Wang, Liangji Zhuang, Junhang Chen, Huimin Cui, Fang Lv, Ying Liu, and Xiaobing Feng. 2018. Lazygraph: lazy data coherency for replicas in distributed graph-parallel computation. In ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP). 276–289.
[86] Hao Wei, Jeffrey Xu Yu, Can Lu, and Xuemin Lin. 2016. Speedup Graph Processing by Graph Ordering. In ACM International Conference on Management of Data (SIGMOD). 1813–1828.
[87] Da Yan, Yingyi Bu, Yuanyuan Tian, and Amol Deshpande. 2017. Big Graph Analytics Platforms. Foundations and Trends in Databases 7, 1-2 (2017), 1–195.
[88] Jie Yan, Guangming Tan, Zeyao Mo, and Ninghui Sun. 2016. Graphine: Programming Graph-Parallel Computation of Large Natural Graphs for Multicore Clusters. IEEE Trans. Parallel Distrib. Syst. 27, 6 (2016), 1647–1659.
[89] Matei Zaharia, Reynold S. Xin, Patrick Wendell, Tathagata Das, Michael Armbrust, Ankur Dave, Xiangrui Meng, Josh Rosen, Shivaram Venkataraman, Michael J. Franklin, Ali Ghodsi, Joseph Gonzalez, Scott Shenker, and Ion Stoica. 2016. Apache Spark: A Unified Engine for Big Data Processing. Commun. ACM 59, 11 (Oct. 2016), 56–65.
[90] Kaiyuan Zhang, Rong Chen, and Haibo Chen. 2015. NUMA-aware Graph-structured Analytics. In ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP). 183–193.
[91] Yunming Zhang, Vladimir Kiriansky, Charith Mendis, Matei Zaharia, and Saman P. Amarasinghe. 2017. Making Caches Work for Graph Analytics. In IEEE International Conference on Big Data (BigData). 293–302.
[92] Yunming Zhang, Mengjiao Yang, Riyadh Baghdadi, Shoaib Kamil, Julian Shun, and Saman Amarasinghe. 2018. GraphIt: A High-performance Graph DSL. Proc. ACM Program. Lang. 2, OOPSLA, Article 121 (Oct. 2018), 121:1–121:30 pages.
[93] Peng Zhao, Chen Ding, Lei Liu, Jiping Yu, Wentao Han, and Xiao-Bing Feng. 2019. Cacheap: Portable and Collaborative I/O Optimization for Graph Processing. Journal of Computer Science and Technology 34, 3 (May 2019), 690–706.
[94] Dengyong Zhou, Jiayuan Huang, and Bernhard Schölkopf. 2006. Learning with Hypergraphs: Clustering, Classification, and Embedding. In International Conference on Neural Information Processing Systems (NIPS). 1601–1608.
[95] Xiaowei Zhu, Wenguang Chen, Weimin Zheng, and Xiaosong Ma. 2016. Gemini: A Computation-Centric Distributed Graph Processing System. In USENIX Symposium on Operating Systems Design and Implementation (OSDI). 301–316.


A Hypertrees

Given an unweighted hypergraph and a source vertex src, a hypertree contains all vertices and hyperedges reachable from src [26]. An algorithm that computes a hypertree outputs predecessor arrays for vertices and hyperedges, which specify, for each element, one of its predecessors on a shortest path from src. The predecessor of a hyperedge is a vertex, and vice versa. The sequential algorithm for generating hypertrees is similar to breadth-first search, and takes linear work in the size of the hypergraph [26]. Vertices are visited in order of their distance from the source, and each hyperedge is processed only the first time that a vertex visits it.

A parallel algorithm can be obtained by processing all vertices or hyperedges at the same distance from the source in parallel. This algorithm can be naturally implemented in Ligra-H using the VERTEXPROP and HYPEREDGEPROP functions. The frontiers of vertices or hyperedges at the same distance from src are maintained using VertexSets and HyperedgeSets. The algorithm is similar to Ligra's parallel breadth-first search implementation.

The pseudocode for implementing hypertrees in Ligra-H is shown in Algorithm 4. We initialize two arrays ParentsV and ParentsE, which store the ID of the predecessor for each vertex and hyperedge, respectively. A value of −1 indicates that the vertex or hyperedge has not yet been visited in the search. Line 7 sets the ParentsV entry of the source vertex src to ∞ (as it has no predecessor), and Line 8 places src on the initial frontier, which is a VertexSet that we call FrontierV. On Line 11 we apply a VERTEXPROP to the VertexSet FrontierV, which explores the hyperedges that the vertices in FrontierV are incoming vertices for. The function CHECK first checks whether the hyperedge has not been visited yet (its ParentsE value is −1), and if so, the UPDATE function is applied to the target hyperedge, which marks the ParentsE entry of the hyperedge with the ID of the vertex that first visits it. Any newly visited hyperedges are placed in the output HyperedgeSet, which we call FrontierE. Line 13 repeats the same logic using HYPEREDGEPROP over the HyperedgeSet FrontierE to visit unexplored vertices, generating an output frontier FrontierV. Since the CAS in the UPDATE function returns true exactly once for each newly explored element, no duplicates need to be removed. If either FrontierV or FrontierE is empty (Lines 12 and 14), then the algorithm has explored all vertices and hyperedges reachable from src, and terminates. Note that when using the dense traversal modes for VERTEXPROP and HYPEREDGEPROP, once an element finds an incoming neighbor on the frontier, it does not need to scan over its remaining edges, as subsequent applications of CHECK will return false.

Algorithm 4 Pseudocode for Hypertrees in Ligra-H
1: ParentsV = {−1, ..., −1}, ParentsE = {−1, ..., −1}
2: procedure UPDATE(Parents, s, d)
3:   return CAS(&Parents[d], −1, s)
4: procedure CHECK(Parents, i)
5:   return (Parents[i] == −1)
6: procedure HYPERTREE(H, src)  ▷ src is the source vertex
7:   ParentsV[src] = ∞
8:   VertexSet FrontierV = {src}
9:   HyperedgeSet FrontierE = {}
10:  while (true) do
11:    FrontierE = VERTEXPROP(H, FrontierV, UPDATE(ParentsE), CHECK(ParentsE))
12:    if |FrontierE| == 0 then break
13:    FrontierV = HYPEREDGEPROP(H, FrontierE, UPDATE(ParentsV), CHECK(ParentsV))
14:    if |FrontierV| == 0 then break
15:  return (ParentsV, ParentsE)

Analysis. Each vertex and hyperedge will appear in at most one frontier, because once a vertex or hyperedge has been visited, its entry in the ParentsV or ParentsE array will no longer be −1, and it will fail the check by the CHECK function in subsequent iterations. Therefore the sum of the sizes of all frontiers plus their out-degrees is O(|H|). As VERTEXPROP and HYPEREDGEPROP do work proportional to the size of the input set plus the sum of its out-degrees, the overall work performed by the algorithm is O(|H|), which is work-efficient. Each call to VERTEXPROP and HYPEREDGEPROP takes O(log |H|) depth, and so the overall depth is O(D log |H|), where D is the diameter of the hypergraph.
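For concreteness, the UPDATE/CHECK pair of Algorithm 4 can be expressed as a C++ function object in the style that Ligra uses for its traversal operators. The sketch below is purely illustrative: the struct and member names are ours (not Ligra-H's actual source), and it uses std::atomic in place of the framework's CAS primitive.

    #include <atomic>
    #include <cstdint>

    // Illustrative sketch, not Ligra-H's actual code. Parents points to the
    // shared ParentsV or ParentsE array, where -1 means unvisited.
    using ID = int64_t;
    constexpr ID UNVISITED = -1;

    struct HypertreeF {
      std::atomic<ID>* Parents;
      explicit HypertreeF(std::atomic<ID>* P) : Parents(P) {}

      // UPDATE: claim target d for source s. The CAS succeeds for exactly
      // one visiting thread, so d is placed on the output frontier once.
      bool update(ID s, ID d) {
        ID expected = UNVISITED;
        return Parents[d].compare_exchange_strong(expected, s);
      }

      // CHECK: true while d is unvisited; once it returns false, a dense
      // traversal can skip d's remaining incoming neighbors.
      bool cond(ID d) { return Parents[d].load() == UNVISITED; }
    };

The same functor works for both directions of the traversal, instantiated once with ParentsE and once with ParentsV.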
B Hyperpaths

Given an unweighted hypergraph and a source vertex src, a hyperpath is a maximal hypergraph containing all vertices reachable from src via cycle-free paths (i.e., paths in which no vertex appears in more than one hyperedge) [26]. The sequential algorithm for computing hyperpath trees [26] visits a hyperedge only when all incoming vertices of the hyperedge have visited it (instead of the first time an incoming vertex visits it). The algorithm takes linear work in the size of the hypergraph.

We implemented a parallel algorithm for computing a hyperpath tree in Ligra-H, which requires only minor changes to our hypertree algorithm. The pseudocode for our parallel algorithm is shown in Algorithm 5. Compared to Algorithm 4, we change the functions passed to VERTEXPROP to check whether all of the incoming vertices of a hyperedge have been visited. Instead of the ParentsE array, we initialize an array CountsE that stores the negation of the number of incoming vertices of each hyperedge. Whenever we visit a hyperedge via VERTEXPROP, we increment this value by 1 for each incoming vertex that visits it using a fetch-and-add, and place the hyperedge in the output frontier when the count becomes 0 (Lines 3–4). Checking whether a hyperedge has yet to be visited during VERTEXPROP simply requires checking whether its CountsE value is negative (Lines 5–6).

Algorithm 5 Pseudocode for Hyperpath Trees in Ligra-H
1: ParentsV = {−1, ..., −1}  ▷ initialized to all −1
2: CountsE = {−deg⁻(e_0), ..., −deg⁻(e_{ne−1})}  ▷ negated in-degrees
3: procedure INCREMENT(s, d)
4:   return (FAA(&CountsE[d], 1) == −1)
5: procedure CHECKVISITEDE(i)
6:   return (CountsE[i] < 0)
7: procedure VISIT(s, d)
8:   return (CAS(&ParentsV[d], −1, s))
9: procedure CHECKVISITEDV(i)
10:  return (ParentsV[i] == −1)
11: procedure HYPERPATH(H, src)  ▷ src is the source vertex
12:  ParentsV[src] = ∞
13:  VertexSet FrontierV = {src}
14:  HyperedgeSet FrontierE = {}
15:  while (true) do
16:    FrontierE = VERTEXPROP(H, FrontierV, INCREMENT, CHECKVISITEDE)
17:    if |FrontierE| == 0 then break
18:    FrontierV = HYPEREDGEPROP(H, FrontierE, VISIT, CHECKVISITEDV)
19:    if |FrontierV| == 0 then break
20:  return ParentsV

The analysis of the algorithm is similar to the analysis in Section A. The overall work is O(|H|) and the depth is O(L log |H|), where L is the length of the longest simple path in the resulting hyperpath tree.
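The counting trick on Lines 3–6 of Algorithm 5 translates similarly. Again, this is an illustrative sketch with our own names, using std::atomic's fetch_add for the FAA primitive:

    #include <atomic>
    #include <cstdint>

    // Illustrative sketch, not Ligra-H's actual code. CountsE[e] is
    // initialized to -deg-(e), the negated number of incoming vertices.
    using ID = int64_t;

    struct HyperpathF {
      std::atomic<ID>* CountsE;
      explicit HyperpathF(std::atomic<ID>* C) : CountsE(C) {}

      // INCREMENT: fetch_add returns the value before the increment, so the
      // result is -1 exactly for the incoming vertex whose visit brings the
      // count to 0; that single call places e on the output frontier.
      bool update(ID s, ID e) { return CountsE[e].fetch_add(1) == -1; }

      // CHECKVISITEDE: a hyperedge is unvisited while its count is < 0.
      bool cond(ID e) { return CountsE[e].load() < 0; }
    };

Because fetch_add returns the old value atomically, exactly one visitor observes −1, giving the same no-duplicates guarantee that the CAS provides in Algorithm 4.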
C Single-Source Shortest Paths

Given a weighted hypergraph and a source vertex src, the goal of single-source shortest paths (SSSP) is to compute the distance of the shortest path from src to every other reachable vertex in the hypergraph. We implement a parallel SSSP algorithm for hypergraphs based on the Bellman-Ford algorithm for SSSP on graphs [17]. The algorithm initializes the tentative shortest-path distances (SP values) of all vertices and hyperedges to ∞, except for the source vertex, which has a distance of 0. The algorithm repeatedly applies a RELAX procedure, which updates the SP value of each active vertex and hyperedge to the minimum of its original SP value and the SP value of an incoming neighbor plus the corresponding weight (an active element is one whose SP value changed in the previous iteration). If no SP values change in an iteration, then the shortest-path distances have been found, and the algorithm terminates. If the algorithm has not terminated after nv + ne iterations, then there is a negative cycle in the hypergraph, and the algorithm reports this and terminates.

The pseudocode for our SSSP implementation in Ligra-H is shown in Algorithm 6. We initialize the SP value of the source vertex to 0 (Line 12) and of all other vertices and hyperedges to ∞ (Lines 1–2). We also initialize arrays that indicate the visited statuses of vertices and hyperedges (Lines 3–4). The source vertex is placed onto the initial frontier (Line 13). While the iteration count is less than nv + ne, we repeatedly apply the RELAX procedure using VERTEXPROP and HYPEREDGEPROP (Lines 17 and 20) until no more elements have their SP values change. The RELAX procedure (Lines 5–8) uses a WRITEMIN to atomically update the target's SP value with the source's SP value plus the weight of the hyperedge involved (w(s, d)). If the WRITEMIN is successful, we use a CAS to set the visited status of the target; if the CAS is successful, the target is placed in the output frontier. The CAS guarantees that the target is placed on the output frontier exactly once, so removing duplicates can be avoided. The visited statuses of hyperedges and vertices are reset on Lines 18 and 21, respectively. If the number of iterations reaches nv + ne, then a negative cycle is reported (Line 23); otherwise the SP values are returned (Line 24).

Algorithm 6 Pseudocode for SSSP in Ligra-H
1: SPV = {∞, ..., ∞}
2: SPE = {∞, ..., ∞}
3: VisitedV = {0, ..., 0}
4: VisitedE = {0, ..., 0}
5: procedure RELAX(SrcSP, DstSP, Visited, s, d, w)
6:   if WRITEMIN(&DstSP[d], SrcSP[s] + w(s, d)) then
7:     return CAS(&Visited[d], 0, 1)
8:   else return 0
9: procedure RESET(Visited, i)
10:  Visited[i] = 0
11: procedure SSSP(H, src)
12:  SPV[src] = 0
13:  VertexSet FrontierV = {src}
14:  HyperedgeSet FrontierE = {}
15:  iter = 0
16:  while (iter++ < nv + ne) do
17:    FrontierE = VERTEXPROP(H, FrontierV, RELAX(SPV, SPE, VisitedE), TRUE)
18:    HYPEREDGEMAP(FrontierE, RESET(VisitedE))
19:    if |FrontierE| == 0 then break
20:    FrontierV = HYPEREDGEPROP(H, FrontierE, RELAX(SPE, SPV, VisitedV), TRUE)
21:    VERTEXMAP(FrontierV, RESET(VisitedV))
22:    if |FrontierV| == 0 then break
23:  if iter == nv + ne then return "Negative cycle"
24:  else return (SPV, SPE)

The work of this algorithm is O((nv + ne)|H|), as each iteration can process all vertices and hyperedges, and the depth is O((nv + ne) log |H|).

D Connected Components

Given an undirected, unweighted hypergraph, a connected component is a maximal set of vertices that can all reach one another via incident hyperedges. The label propagation technique can be used to compute the connected components of a hypergraph [36, 41]. The idea is to initialize the vertices with unique IDs and iteratively propagate the IDs of vertices to their neighbors, with each vertex storing the minimum ID among the IDs it receives and its own. At convergence, the IDs on the vertices partition them into connected components.

We implement the label propagation algorithm in Ligra-H, but we note that there are more efficient parallel connected components algorithms for graphs (e.g., [74, 76]) that could be applied to hypergraphs. Our implementation iteratively propagates vertex IDs to hyperedges and hyperedge IDs to vertices using VERTEXPROP and HYPEREDGEPROP, respectively, with the WRITEMIN function, until the frontier becomes empty. The output frontiers of VERTEXPROP and HYPEREDGEPROP contain only the elements whose IDs changed.
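The WRITEMIN primitive used here, and by RELAX in Algorithm 6, is a standard priority update rather than anything specific to Ligra-H. A minimal C++ sketch of one common implementation (ours, not necessarily the framework's exact code) is:

    #include <atomic>

    // Atomically lower *loc to val; returns true iff this call strictly
    // decreased the stored value. On CAS failure, cur is reloaded with the
    // current contents, and the loop exits once val >= cur.
    template <typename T>
    bool write_min(std::atomic<T>* loc, T val) {
      T cur = loc->load();
      while (val < cur) {
        if (loc->compare_exchange_weak(cur, val)) return true;
      }
      return false;
    }

Returning whether the write actually lowered the value is what lets the calling functions decide whether the target belongs on the output frontier.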

The pseudocode of our algorithm is shown in Algorithm 7. We initialize the array IDsV with unique IDs (Line 1). We use an array IDsE that stores the minimum ID passed to each hyperedge, and initialize it to all ∞ (Line 2). The auxiliary arrays PrevIDsV and PrevIDsE enable us to keep track of the first time an ID changes in an iteration (Lines 3–4). We initialize the first frontier to contain all of the vertices (Line 13). We iteratively call VERTEXPROP to propagate vertex IDs to hyperedges (Line 17), and HYPEREDGEPROP to propagate hyperedge IDs to vertices (Line 20), until the frontier becomes empty. Since all neighbors need to be explored, we pass the TRUE function, which always returns true, to VERTEXPROP and HYPEREDGEPROP. VERTEXPROP and HYPEREDGEPROP call the UPDATE function (Lines 5–9), which atomically stores the minimum of the source element's ID and the destination element's ID at the target using a WRITEMIN. If the WRITEMIN was successful, UPDATE checks whether the original ID before the WRITEMIN was equal to the target's previous ID, and if so returns true. This returns true at most once per target, and so VERTEXPROP and HYPEREDGEPROP do not need to remove duplicates. Prior to calling VERTEXPROP and HYPEREDGEPROP, the arrays storing the previous IDs are synchronized with the current IDs using HYPEREDGEMAP and VERTEXMAP (Lines 16 and 19) with the COPY function (Lines 10–11). Note that only the elements whose IDs have changed will be placed in the output frontier, and so the next iteration only needs to process those elements. This contrasts with the label propagation algorithms in [36, 41], which touch the entire hypergraph in every iteration.

Algorithm 7 Pseudocode for Label Propagation in Ligra-H
1: IDsV = {0, ..., nv − 1}  ▷ initialized to IDsV[i] = i
2: IDsE = {∞, ..., ∞}  ▷ initialized to all ∞
3: PrevIDsV = {0, ..., nv − 1}  ▷ initialized to PrevIDsV[i] = i
4: PrevIDsE = {∞, ..., ∞}  ▷ initialized to all ∞
5: procedure UPDATE(SrcIDs, DstIDs, DstPrevIDs, s, d)
6:   origID = DstIDs[d]
7:   if WRITEMIN(&DstIDs[d], SrcIDs[s]) then
8:     return (origID == DstPrevIDs[d])
9:   else return 0
10: procedure COPY(IDs, PrevIDs, i)
11:   PrevIDs[i] = IDs[i]
12: procedure CC(H)
13:   VertexSet FrontierV = {0, ..., nv − 1}  ▷ all vertices
14:   HyperedgeSet FrontierE = {}
15:   while (true) do
16:     HYPEREDGEMAP(FrontierE, COPY(IDsE, PrevIDsE))
17:     FrontierE = VERTEXPROP(H, FrontierV, UPDATE(IDsV, IDsE, PrevIDsE), TRUE)
18:     if |FrontierE| == 0 then break
19:     VERTEXMAP(FrontierV, COPY(IDsV, PrevIDsV))
20:     FrontierV = HYPEREDGEPROP(H, FrontierE, UPDATE(IDsE, IDsV, PrevIDsV), TRUE)
21:     if |FrontierV| == 0 then break
22:   return (IDsV, IDsE)

Analysis. Each iteration of the algorithm takes O(|H|) work and O(log |H|) depth, as the calls to VERTEXPROP and HYPEREDGEPROP could potentially process all vertices and hyperedges. For a hypergraph with diameter D, the overall work is O(D|H|) and the overall depth is O(D log |H|).

E PageRank

PageRank is an algorithm for computing the importance of vertices in a graph [13], and can be extended to hypergraphs [6, 36, 41]. We consider PageRank on unweighted, connected hypergraphs. The following update equations define the algorithm for a damping factor 0 ≤ α ≤ 1:

  PR[v] = (1 − α)/nv + α · Σ_{e ∈ N⁻(v)} PR[e]/deg⁺(e)    (3)

  PR[e] = Σ_{v ∈ N⁻(e)} PR[v]/deg⁺(v)    (4)

Vertices spread their ranks equally to the hyperedges that they are incoming vertices for in Equation 4, and hyperedges spread their ranks equally to their outgoing vertices in Equation 3. The update equations are applied iteratively until some convergence criterion is met (e.g., a maximum number of iterations is reached or the error falls below some threshold).

We implement PageRank in Ligra-H by iteratively calling VERTEXPROP to pass PageRank values from vertices to hyperedges and HYPEREDGEPROP to pass PageRank values from hyperedges to vertices. We also use VERTEXMAP to normalize the PageRank scores as required in Equation 3, and VERTEXMAP and HYPEREDGEMAP to reset arrays. The pseudocode for our implementation in Ligra-H is shown in Algorithm 8. The arrays VPR and EPR store the PageRank values for the vertices and hyperedges, respectively, and are initialized on Lines 1–2: the PageRank values for vertices are initialized to 1/nv, and the PageRank values for hyperedges are initialized to 0. Since all vertices and hyperedges need to be processed in each iteration, we create two frontiers containing all of the elements on Lines 10–11. We repeatedly call the VERTEXPROP and HYPEREDGEPROP functions on Lines 15 and 17 to compute the sums in Equation 4 and Equation 3, respectively. Each call to VERTEXPROP is preceded by a call to HYPEREDGEMAP, and vice versa, to reset the values in the appropriate PageRank array (Lines 14 and 16). Finally, on Line 18, a VERTEXMAP with the VERTEXF function (Lines 5–6) is used to multiply the PageRank contributions by α and add (1 − α)/nv to each vertex's PageRank, as needed for Equation 3.

Algorithm 8 Pseudocode for PageRank in Ligra-H
1: VPR = {1/nv, ..., 1/nv}
2: EPR = {0, ..., 0}
3: procedure UPDATE(SrcPR, DstPR, s, d)
4:   FAA(&DstPR[d], SrcPR[s]/deg⁺(s))
5: procedure VERTEXF(i)
6:   VPR[i] = (1 − α)/nv + α · VPR[i]
7: procedure RESET(PR, i)
8:   PR[i] = 0
9: procedure PAGERANK(H, maxIters)
10:  VertexSet FrontierV = {0, ..., nv − 1}  ▷ all vertices
11:  HyperedgeSet FrontierE = {0, ..., ne − 1}  ▷ all hyperedges
12:  iter = 0
13:  while (iter++ < maxIters) do
14:    HYPEREDGEMAP(FrontierE, RESET(EPR))
15:    VERTEXPROP(H, FrontierV, UPDATE(VPR, EPR), TRUE)
16:    VERTEXMAP(FrontierV, RESET(VPR))
17:    HYPEREDGEPROP(H, FrontierE, UPDATE(EPR, VPR), TRUE)
18:    VERTEXMAP(FrontierV, VERTEXF)
19:  return VPR
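To make Equations 3 and 4 concrete, the following sequential C++ sketch performs one synchronous iteration over a toy incidence-list representation. The representation and names are ours for illustration only; Ligra-H instead parallelizes both loops and accumulates with atomic fetch-and-adds, as in Algorithm 8.

    #include <cstddef>
    #include <vector>

    // One PageRank iteration under Equations (3) and (4) on a toy bipartite
    // representation (ours, for illustration): inV[e] lists the incoming
    // vertices of hyperedge e, outV[e] its outgoing vertices, and
    // degV[v] = deg+(v) > 0 (every vertex belongs to some hyperedge).
    void pagerank_iter(const std::vector<std::vector<int>>& inV,
                       const std::vector<std::vector<int>>& outV,
                       const std::vector<int>& degV,
                       std::vector<double>& VPR, std::vector<double>& EPR,
                       double alpha) {
      std::size_t nv = VPR.size(), ne = EPR.size();
      // Equation (4): each vertex splits its rank among its hyperedges.
      for (std::size_t e = 0; e < ne; e++) {
        EPR[e] = 0.0;
        for (int v : inV[e]) EPR[e] += VPR[v] / degV[v];
      }
      // Equation (3): each hyperedge splits its rank among its outgoing
      // vertices; then damp and add the teleportation term (1 - alpha)/nv.
      std::vector<double> next(nv, (1.0 - alpha) / nv);
      for (std::size_t e = 0; e < ne; e++)
        for (int v : outV[e]) next[v] += alpha * EPR[e] / outV[e].size();
      VPR.swap(next);
    }

For an undirected hypergraph, inV and outV are the same lists, and the two loops correspond exactly to the VERTEXPROP and HYPEREDGEPROP calls on Lines 15 and 17 of Algorithm 8.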

We can also implement the PageRank-Entropy algorithm from MESH [36], which computes the entropy of the ranks of the vertices in each hyperedge. This can be done with a VERTEXPROP call that passes the entropy contribution of each vertex's rank to each hyperedge that it is an incoming vertex for. The function passed to VERTEXPROP is provided in Algorithm 9.

Algorithm 9 Pseudocode for PageRank-Entropy in Ligra-H
1: Entropy = {0, ..., 0}  ▷ initialized to 0's
2: procedure UPDATE-ENTROPY(s, d)
3:   FAA(&Entropy[d], (VPR[s]/EPR[d]) × log2(EPR[d]/VPR[s]))
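To see what quantity UPDATE-ENTROPY accumulates, write p_s = VPR[s]/EPR[d] for the fraction of hyperedge d's rank contributed by an incoming vertex s. Each fetch-and-add then contributes −p_s log₂ p_s, so after all incoming vertices of d have been processed:

    \[
      \mathit{Entropy}[d]
        = \sum_{s \in N^-(d)} \frac{\mathit{VPR}[s]}{\mathit{EPR}[d]}
          \log_2\!\frac{\mathit{EPR}[d]}{\mathit{VPR}[s]}
        = -\sum_{s \in N^-(d)} p_s \log_2 p_s .
    \]

This is the Shannon entropy of the rank distribution within d provided that the p_s sum to 1, which holds exactly when EPR[d] equals the plain sum of the incoming VPR[s] values; this reading is our interpretation of Algorithm 9 rather than a statement about MESH's exact normalization [36].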

Each iteration of PageRank (and PageRank-Entropy) processes all vertices and hyperedges using VERTEXPROP and HYPEREDGEPROP. Therefore, the per-iteration work is O(|H|) and the depth is O(log |H|).

F Additional Experiments

We show the running time as a function of threshold for several applications on the com-Orkut and LiveJournal hypergraphs in Figures 6 and 7. We see that the performance is similar across a wide range of thresholds.

Figure 6. Running times as a function of threshold on com-Orkut using 72 cores with hyper-threading. [Plot: running time vs. threshold (fraction of sum of in-degrees, 10⁻⁸ to 1) for Hypertree, BC, CC, and SSSP.]

Figure 7. Running times as a function of threshold on LiveJournal using 72 cores with hyper-threading. [Plot: running time vs. threshold (fraction of sum of in-degrees, 10⁻⁸ to 1) for Hypertree, BC, CC, and SSSP.]