A Parallel Maximal Independent Set Algorithm
Mark Adams†

Abstract. The parallel construction of maximal independent sets is a useful building block for many algorithms in the computational sciences, including graph coloring and multigrid coarse grid creation on unstructured meshes. We present an efficient asynchronous maximal independent set algorithm for use on parallel computers, for well-partitioned graphs that arise from finite element models. For appropriately partitioned bounded-degree graphs, it is shown that the running time of our algorithm under the PRAM computational model is $O(1)$, which is an improvement over the previous best PRAM complexity for this class of graphs. We present numerical experiments on an IBM SP that confirm that our PRAM complexity model is indicative of the performance one can expect with practical partitions on graphs from finite element problems.

Key words. maximal independent sets, multigrid, parallel algorithms, graph coloring

AMS(MOS) subject classification. F, F, Y, Q, R, C

Introduction. An independent set is a set of vertices $I \subseteq V$ in a graph $G = (V, E)$ in which no two members of $I$ are adjacent (i.e., $\forall v, w \in I$, $(v, w) \notin E$); a maximal independent set (MIS) is an independent set for which no proper superset is also an independent set. The parallel construction of an MIS is useful in many computing applications, such as graph coloring and coarse grid creation for multigrid algorithms on unstructured finite element meshes. In addition to requiring an MIS (which is not unique), many of these applications want an MIS that maximizes a particular application-dependent quality metric. Finding the optimal solution in many of these applications is an NP-complete problem (i.e., no polynomial-time algorithm is known for them, although they can be solved by a nondeterministic (N) machine in polynomial (P) time); for this reason greedy algorithms, in combination with heuristics, are commonly used for both the serial and parallel construction of MISs.

Many of the graphs of interest arise from physical models, such as finite
element simulations. These graphs are sparse, and their vertices are connected only to their nearest physical neighbors. The vertices of such graphs therefore have a bound on their maximum degree. We will discuss our method of attaining $O(1)$ PRAM complexity bounds for computing an MIS on such graphs, namely finite element models in three-dimensional solid mechanics.

Our algorithm is notable in that it does not rely on global random vertex ordering, although it is closely related to these algorithms and can be viewed as a two-level random algorithm, as we use random (actually just distinct) processor identifiers. Our $O(1)$ complexity on finite element graphs is an improvement over the $O(\log(n) / \log\log(n))$ complexity of the random algorithms. Nor does our algorithm rely on deterministic coin tossing to achieve correctness in a distributed memory computing environment; instead it explicitly uses knowledge of the graph partitioning to provide for the correct construction of an MIS in an efficient manner. Deterministic coin tossing algorithms have $O(\log^* n)$ complexity on bounded-degree graphs (a more general class of graphs than finite element graphs), although their constant ($\log^* n$) is somewhat higher than ours, and the ability of these methods to incorporate heuristics is also not evident.

A preliminary version of this paper appeared in the proceedings of the Fifth Copper Mountain Conference on Iterative Methods, April.
† Department of Civil Engineering, University of California Berkeley, Berkeley CA (madams@cs.berkeley.edu). This work is supported by DOE grant No. WENG.

We will not include the complexity of the graph partitionings in our complexity model, though our method explicitly depends on these partitions. We feel justified in this, as it is reasonable to assume that the MIS program is embedded in a larger application that requires partitions that are usually much better than the partitions that we require. The design of an $O(1)$ partitioning algorithm for finite element graphs whose partitionings can be proven to satisfy our
requirements, which are described in a later section, is an open problem that is also discussed later. Our numerical experiments confirm our $O(1)$ complexity claim.

Additionally, the complexity model of our algorithm has the attractive attribute that it requires far fewer processors than vertices; in fact, we need to restrict the number of processors used in order to attain optimal complexity. Our PRAM model uses $P = O(n)$ processors ($n = |V|$) to compute an MIS, but we restrict $P$ to be at most a fixed fraction of $n$ to attain the optimal theoretical complexity. The upper bound on the number of processors is, however, far more than the number of processors that are generally used in practice on common distributed memory computers of today. Given the common use of relatively fat processor nodes in contemporary computers, our theoretical model therefore allows for the use of many more processors than one would typically use in practice. Thus, in addition to obtaining optimal PRAM complexity bounds, our complexity model reflects the way that modern machines are actually used.

This paper is organized as follows. In the next section we describe a new asynchronous distributed maximal independent set algorithm; we then show that our algorithm has optimal performance characteristics under the PRAM communication model for the class of graphs from discretized PDEs. Numerical results for the method are then presented, and we conclude with possible directions for future work.

An asynchronous distributed memory maximal independent set algorithm. Consider a graph $G = (V, E)$ with vertex set $V$ and edge set $E$, an edge being an unordered pair of distinct vertices. Our application of interest is a graph which arises from a finite element analysis, where elements can be replaced by the edges required to make a clique of all vertices in each element (see Figure 1). Finite element methods, and indeed most discretization methods for PDEs, produce graphs in which vertices only share an edge with their physically nearest neighbors; thus the degree of each vertex
$v \in V$ can be bounded by some modest constant. We will restrict ourselves to such graphs in our complexity analysis. Furthermore, to attain our complexity bounds we must also assume that vertices are partitioned well (which will be defined later) across the machine.

Figure 1. Finite element quadrilateral mesh and its corresponding graph (panels: FE mesh, Graph).

We will introduce our algorithm by first describing the basic random greedy MIS algorithms. We will utilize an object-oriented notation from common programming languages, as well as set notation, in describing our algorithms; this is done to simplify the notation, and we hope it does not distract the uninitiated reader. We endow each vertex $v$ with a mutable data member state, $state \in \{selected, deleted, undone\}$. All vertices begin in the undone state and end in either the selected or deleted state; the MIS is defined as the set of selected vertices. Each vertex $v$ will also be given a list of adjacencies, adjac.

Definition 1. The adjacency list for vertex $v$ is defined by $v.adjac = \{v_1 \mid (v, v_1) \in E\}$.

We will also assume that $v.state$ has been initialized to the undone state for all $v$, and that $v.adjac$ is as defined in Definition 1, in all of our algorithm descriptions. With this notation in place we show the basic MIS algorithm (BMA) in Figure 2.

    forall $v \in V$
        if $v.state = undone$ then
            $v.state \leftarrow selected$
            forall $v_1 \in v.adjac$
                $v_1.state \leftarrow deleted$
    $I \leftarrow \{v \in V \mid v.state = selected\}$

Figure 2. Basic MIS Algorithm (BMA) for the serial construction of an MIS.

For parallel processing we partition the vertices onto processors and define the vertex set $V_p$ owned by processor $p$ of $P$ processors. Thus $V = V_1 \cup V_2 \cup \cdots \cup V_P$ is a disjoint union, and for notational convenience we give each vertex an immutable data member proc, after the partitioning is calculated, to indicate which processor is responsible for it. Define the edge separator set $E^S$ to be the set of edges $(v, w)$ such that $v.proc \neq w.proc$. Define the vertex separator set $V^S = \{v \mid (v, w) \in E^S\}$ ($G$ is undirected, thus $(v, w)$ and $(w, v)$ are equivalent). Define the
processor vertex separator set for processor $p$ by $V_p^S = \{v \mid (v, w) \in E^S \text{ and } (v.proc = p \text{ or } w.proc = p)\}$. Further define a processor's boundary vertex set by $V_p^B = V_p \cap V^S$, and a processor's local vertex set by $V_p^L = V_p - V_p^B$.

Our algorithm provides for correctness and efficiency in a distributed memory computing environment by first assuming a given ordering or numbering of processors, so that we can use inequality operators with these processor numbers. As will be evident later, if one vertex is placed on each processor (an activity of theoretical interest only), then our method will degenerate to one of the well-known random types of algorithms.

We define a function $mpivs(vertex\_set)$, an acronym for "maximum processor in vertex set", which operates on a list of vertices.

Definition 2.
$$mpivs(vertex\_set) = \begin{cases} \max\{v.proc \mid v \in vertex\_set,\ v.state \neq deleted\} & \text{if } vertex\_set \neq \emptyset \\ -\infty & \text{if } vertex\_set = \emptyset \end{cases}$$

Given these definitions and operators, our algorithm works by implementing two rules within the BMA running on processor $p$, as shown below.

Rule 1. Processor $p$ can select a vertex $v$ only if $v.proc = p$.
Rule 2. Processor $p$ can select a vertex $v$ only if $p \geq mpivs(v.adjac)$.

Note that Rule 1 is a static rule, because $v.proc$ is immutable, and can be enforced simply by iterating over $V_p$ on each processor $p$ when looking for vertices to select. In contrast, Rule 2 is dynamic, because the result of $mpivs(v.adjac)$ will in general change (actually monotonically decrease) as the algorithm progresses and vertices in $v.adjac$ are deleted.
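The serial BMA and the two selection rules above can be illustrated with a short Python sketch. It is not the implementation used in the experiments; the names `Vertex`, `build_graph`, `bma` and `can_select` are hypothetical, vertices are referred to by integer index, and the sketch assumes that `mpivs` also returns $-\infty$ when every vertex in the set has already been deleted.

```python
import math
from dataclasses import dataclass, field

UNDONE, SELECTED, DELETED = "undone", "selected", "deleted"

@dataclass
class Vertex:
    proc: int                                   # owning processor, fixed after partitioning
    state: str = UNDONE                         # mutable state member
    adjac: list = field(default_factory=list)   # indices of adjacent vertices

def build_graph(procs, edges):
    """Hypothetical helper: one vertex per entry of procs, undirected edges."""
    verts = [Vertex(proc=p) for p in procs]
    for a, b in edges:
        verts[a].adjac.append(b)
        verts[b].adjac.append(a)
    return verts

def bma(verts):
    """Serial basic MIS algorithm: select each undone vertex, delete its neighbours."""
    for v in verts:
        if v.state == UNDONE:
            v.state = SELECTED
            for j in v.adjac:
                verts[j].state = DELETED
    return {i for i, v in enumerate(verts) if v.state == SELECTED}

def mpivs(verts, vertex_set):
    """Maximum processor number among the non-deleted vertices of vertex_set.

    Assumption: an empty or fully deleted set yields -infinity.
    """
    live = [verts[j].proc for j in vertex_set if verts[j].state != DELETED]
    return max(live, default=-math.inf)

def can_select(verts, i, p):
    """True if processor p may select vertex i under Rule 1 and Rule 2.

    The undone check comes from the BMA itself rather than from the rules.
    """
    v = verts[i]
    rule1 = v.proc == p                      # static: p owns the vertex
    rule2 = p >= mpivs(verts, v.adjac)       # dynamic: no live neighbour on a higher processor
    return v.state == UNDONE and rule1 and rule2
```

On a four-vertex path split across two processors, for example, `can_select` refuses a vertex on processor 0 while it still has a live neighbour on processor 1, and allows it once that neighbour has been deleted. This is the monotone decrease of `mpivs` noted above.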