Archibald, B., Maier, P., Stewart, R. and Trinder, P. (2020) YewPar: Skeletons for Exact Combinatorial Search. In: 25th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming Proceedings, San Diego, CA, USA, 22-26 Feb 2020, pp. 292-307. ISBN 9781450368186 (doi:10.1145/3332466.3374537).

There may be differences between this version and the published version. You are advised to consult the publisher’s version if you wish to cite from it.

© Association for Computing Machinery 2020. This is the author's version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in 25th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming Proceedings, San Diego, CA, USA, 22-26 Feb 2020, pp. 292-307. ISBN 9781450368186.

http://eprints.gla.ac.uk/206497/

Deposited on: 10 January 2019

Enlighten – Research publications by members of the University of Glasgow
http://eprints.gla.ac.uk

YewPar: Skeletons for Exact Combinatorial Search

Blair Archibald, University of Glasgow, [email protected]
Patrick Maier, University of Stirling, [email protected]
Robert Stewart, Heriot-Watt University, [email protected]
Phil Trinder, University of Glasgow, [email protected]

Abstract

Combinatorial search is central to many applications, yet the huge irregular search trees and the need to respect search heuristics make it hard to parallelise. We aim to improve the reuse of intricate parallel search implementations by providing the first general purpose scalable parallel framework for exact combinatorial search, YewPar.

We make the following contributions. (1) We present a novel formal model of parallel backtracking search, covering enumeration, decision, and optimisation search. (2) We introduce Lazy Node Generators as a uniform API for search tree generation. (3) We present the design and implementation of 12 widely applicable algorithmic skeletons for tree search on shared and distributed memory architectures. (4) Uniquely in the field we demonstrate how a wide range of parallel search applications can easily be constructed by composing Lazy Node Generators and the search skeletons. (5) We report a systematic performance analysis of all 12 YewPar skeletons on standard instances of 7 search applications, investigating skeleton overheads and up to 255 workers on 17 distributed locations.

CCS Concepts • Computing methodologies → Combinatorial algorithms; Parallel programming languages;

Keywords Combinatorial Search, Algorithmic Skeletons, Distributed Memory Parallelism, HPX

1 Introduction

Exact combinatorial search is essential to a wide range of applications including constraint programming, graph matching, and computational algebra. Combinatorial problems are solved by systematically exploring a search space, and doing so is computationally hard both in theory and in practice, encouraging the use of approximate algorithms that quickly provide answers yet with no guarantee of optimality. Alternatively, exact search explores the entire search space and delivers provably optimal answers. Conceptually exact combinatorial search proceeds by generating and traversing a (huge) tree representing alternative options. Combining parallelism, on-demand tree generation, search heuristics, and pruning can reduce the execution time of exact search.

There are three main types of search: enumeration, which searches for all solutions matching some property, e.g. all maximal cliques in a graph; decision, which looks for a specific solution, e.g. a clique of size k; and optimisation, which looks for a solution that minimises/maximises an objective function, e.g. finding a maximum clique.

Parallelising exact combinatorial search is extremely challenging due to huge and highly irregular search trees, the need to preserve search order heuristics that bias search towards likely solutions, and pruning heuristics that dynamically alter the workload to avoid unnecessary search. State-of-the-art parallel searches (1) are single purpose, i.e. for a specific search application, e.g. Search [24] supports constraint programming only; (2) use hand crafted parallelism, e.g. parallel MaxClique [28], with almost no reuse of sophisticated parallel coordination between search applications; and (3) target a single scale of parallelism, often multicores. Hence typically an application is parallelised just once, on a specific architecture, in a heroic effort.

In contrast this paper presents YewPar, the first scalable parallel framework for exact combinatorial search. YewPar is designed to allow non-expert users to benefit from parallelism; to reuse parallel search patterns encoded as algorithmic skeletons; and to run on multiple parallel architectures. The paper makes the following contributions.

A new formal model of parallel backtracking search uniformly characterising enumeration, decision and optimisation searches as a fold of the search tree into a monoid (Section 3). The formal model combines an encoding of search trees and a nondeterministic small-step semantics. We prove that the semantics correctly models parallel search, i.e. reductions terminate and deliver correct/optimal results, regardless of the interleaving of parallel reductions. Correctness is far from obvious because pruning reductions may alter the shape of the search tree nondeterministically. We use the model to derive a programming interface for the skeletons
(Section 4.1) and to succinctly describe the work generation behaviours of the skeletons (Section 4.2).

A widely applicable API for search tree generation (Section 4.1). We introduce Lazy Node Generators as a uniform abstraction for application developers to specify how the search trees for a specific application are created. The generators are based on a formalisation of backtracking tree traversal and implicitly encode application-specific search order heuristics. Search tree nodes are constructed lazily to avoid materialising the entire (huge) tree, such that pruning removes redundant search by eliminating subtrees before they are materialised.

12 widely applicable algorithmic skeletons for backtracking search (Section 4). The skeletons use sophisticated parallel search coordinations inspired by the literature, currently: Sequential, Depth-Bounded, Stack-Stealing and Budget (Section 4.2). The 12 skeletons are parametric combinations of the 3 search types, and the 4 search coordinations. To scale on both shared and distributed memory architectures, YewPar builds on the HPX C++ distributed execution framework [22], and features custom work-stealing schedulers that handle search irregularity and carefully manage search order heuristics and knowledge exchange.

A demonstration of the easy construction of parallel search applications using Lazy Node Generators and search skeletons (Sections 4.4, 5.1 and A.3). We show how to parallelise exact combinatorial search applications by composing a Lazy Node Generator (to create the search tree), with an appropriate skeleton to traverse the tree (Fig. 3). The generality of YewPar is shown with 7 search applications covering the 3 search types. Prior implementations like [1, 34, 40] support just one search type, and most provide only one search coordination, the exception being Muesli [33] which supports two for optimisation search. Most parallel implementations demonstrate performance for a single search application, a minority do two applications, and very few [16, 23] do more.

A systematic performance analysis of the skeletons and YewPar (Section 5). The evaluation is entirely novel and measures the performance of the 12 skeletons across 7 search applications. The applications are evaluated on a set of approximately 30 standard challenge search instances, e.g. from the DIMACS benchmark suite [21]. To evaluate the performance overheads of the general purpose skeletons we compare YewPar with a state-of-the-art hand-coded C++ implementation on 18 instances. We demonstrate YewPar execution on a multicore and a cluster, showing good scaling and efficiency up to 255 workers on 17 localities (physical machines). We show that YewPar allows users to easily explore alternate parallelisations, and compare the performance of each search application with every skeleton.

This paper focuses on the YewPar programming model and formalisation. For in-depth performance analyses of the implementation, including workload management, see [3, 5]. YewPar is available as an evaluated artifact (Appendix A).

2 Background

To illustrate combinatorial search, consider the problem of finding cliques within a graph such as the tiny graph shown in Fig. 1. A clique is a set of vertices C such that all vertices in C are pairwise adjacent. Example cliques in Fig. 1 are {a} and {a,b,c}, but {a,b,c,g} is not a clique as there is no edge between c and g.

The search space is huge, the powerset of the set of vertices, but we can minimise the space to be explored by generating only valid cliques. So given a clique, extend it by adding any vertex that is adjacent to all elements of the clique. This leads naturally to backtracking search: extend the clique until no more additions are possible; thereafter remove the most recently added element and try a different vertex.

We can view the search space as a tree where each node corresponds to a clique, e.g. level 1 of the tree for the graph in Fig. 1 represents the cliques {c}, {f}, {g} etc, and the direct descendants of clique {c} are {c,a}, {c,b} and {c,e}.

It is not always necessary to traverse the entire search tree. For example, to check whether the graph has a 3-clique, a decision search, we only need to generate nodes {c}, {c,a} and {c,a,b}. Likewise we can use bounding to prune the search space, i.e. we do not need to explore a branch if we can prove that a solution cannot exist in that branch.

The search order determines the sequence in which nodes in the search tree are explored, and is extremely important as finding a solution quickly allows early termination in decision searches and improved pruning for optimisation searches. Like most search frameworks YewPar primarily uses depth-first search as it requires less memory than other search strategies such as breadth-first.

Search spaces are often huge and grow exponentially, e.g. 378 × 10¹² nodes at depth 67 with a growth rate of 1.62 between successive depths in [17]. In practice YewPar materialises only small fragments of the tree during search, saving both memory and runtime. Search problems are hard in both theory, usually NP-hard [38], and in practice, with instances often taking hours, or longer, to solve. They may be solved using approximate methods like simulated annealing or ant colony optimisation [9] that quickly provide non-exact answers. In contrast exact methods explore the entire search space and provide proofs of optimality or infeasibility. YewPar aims to provide the benefits of exact search while using parallelism to reduce runtime.

2.1 Parallel Combinatorial Search

Parallelising search solves existing instances faster and allows larger instances to be practically solved, e.g. reducing runtimes from hours to minutes. Parallel tree searches can be classified [18] as follows: (1) Parallel Node Processing
that parallelises the branching/bounding, e.g. computing the bounds for a Flowshop problem on a GPU [20]. (2) Space-Splitting where parallel workers speculatively explore subtrees of the search tree. While the subtrees are explored independently, knowledge such as improved bounds is often shared between workers to improve performance. (3) Portfolio approaches run competing searches in parallel, often with different heuristics or bounding methods. Hybrid approaches combining these approaches are also used.

Figure 1. Maximum clique instance. Input graph with clique {a, d, f, g} to the left and corresponding search tree to the right. Each tree node displays the current clique and a list of candidate vertices (in heuristic order) to extend that clique.

Both parallel node processing and portfolio approaches require domain-specific knowledge, e.g. to vary bounding functions. YewPar uses space-splitting as it is independent of specific search domains, and hence suitable for general-purpose search skeletons.

While space-splitting maps naturally to task parallelism, it poses some very significant challenges. Search trees are highly irregular and the time required to search a particular subtree is both unpredictable and highly variable [31]. Careful choice of an initial work distribution, or methods to dynamically generate more work, are essential to keep workers busy. Knowledge transfer further complicates matters. (1) New knowledge dynamically changes the shape of the search tree: tasks that were predicted to be long running (often those near the root of the tree) may become trivial, and many tasks may be invalidated simultaneously, requiring work redistribution. (2) Gaining new knowledge early on greatly reduces search time by reducing the search space. Heuristics guide search to promising areas and essentially impose an ordering on tasks that should be preserved as far as possible. (3) Knowledge is shared globally which can be expensive on large (distributed memory) systems, although [5] shows that in many important searches there are few global knowledge updates.

Performance Anomalies. Search algorithms rely on ordering heuristics to find useful nodes as early as possible, e.g. the target node in a decision problem, or one with a strong bound for a branch and bound optimisation problem. To take advantage of these heuristics, search proceeds in a left-to-right order (at all depths of the tree). Hence a sequential search visiting a node has full information about the search tree, e.g. incumbents, from all nodes to-the-left of it, and the search is deterministic.

Parallel searches speculatively search subtrees without full information to-the-left, and may benefit from information flowing right-to-left. However the speculation means that the parallel search may perform significantly more work than an equivalent sequential search. For this reason, parallel search is notorious for performance anomalies [14]. Detrimental Anomalies occur when the runtime on w workers is more than for w − 1 workers. Here the additional work may outweigh the benefit of the additional compute resource, or the additional compute resource may disrupt the search heuristic. Acceleration Anomalies are superlinear speedups, typically due to knowledge flowing right-to-left, reducing overall work by enabling more pruning than in a sequential search.

The presence of anomalies makes it difficult to reason about the parallel performance of search applications. YewPar aims to avoid detrimental anomalies while allowing acceleration anomalies; [4] reports a specialised search skeleton that carefully controls anomalies to provide replicable performance guarantees.

2.2 Existing Parallel Space-Splitting Searches

While reviews of parallel search often focus on a particular search domain e.g. [19], a multi-domain survey can be found in [3]. Space-splitting searches are classified by synchronicity, workpool layout, and load-balancing [3, 18, 37].

Static approaches e.g. EPS [35] or MapReduce [39] partially explore the search tree, usually breadth first, to generate a set of search tasks. They are simple, use minimal communication (often to/from a single leader node), and scale well on distributed architectures e.g.
to 512 workers for EPS [35]. Many static searches rely on domain-specific heuristics to determine how many tasks to generate.

Dynamic approaches divide the search space during search, adapting to the search state to avoid starvation. Hence they may effectively parallelise searches that cannot be statically partitioned. Their implementations are more complex, and using techniques like work-stealing means that the searches are no longer deterministic. Dynamic approaches are classified by their load-balancing strategy: (1) centralised strategies as in the BOB framework [29] are common; (2) decentralised strategies as in ZRAM [11] and [1] often use work-stealing; (3) periodic strategies redistribute work at intervals during the search as in mts [8]. YewPar also provides dynamic work generation, with general purpose search coordinations that use random distributed work-stealing and on-demand tree splitting, and coordinations that periodically repartition work based on a backtracking budget similar to mts [8].

2.3 Why a New Parallel Search Framework?

As implementing parallel tree search is so intricate, it would be a huge saving to reuse effective implementations. Unfortunately while general purpose task parallel frameworks like [10] or X10 [12] provide features like work-stealing, they use techniques inappropriate for search, e.g. standard deque-based work-stealing breaks heuristic search orders. Moreover a search-specific framework makes it easier to fulfil parallel search requirements, e.g. dynamically splitting the search tree, and sharing knowledge between search tasks.

A careful analysis of 31 parallel tree search implementations (Table 2.1 in [3]) reveals that most are application specific like [34, 40], and there are just two generic frameworks: Muesli [13] and MaLLBa [2]. Both are designed for branch and bound optimisation and, unlike YewPar, do not currently support decision or enumeration searches.

3 Formalising Parallel Tree Search

At an abstract level tree search can be viewed as an unfold/map/fold computation. The unfold generates the search tree, and the map and fold operate on a monoid that accumulates search information. That is, the map applies an objective function to turn each search tree node into a monoid value, and the fold reduces the resulting tree to a single monoid value, e.g. an optimal search result.

However, such an abstract formulation cannot readily describe the nondeterminism and pruning that are crucial for efficient optimisation and decision searches. Hence we construct a more concrete semantics as follows. Section 3.1 specifies the unfold, i.e. search tree generation and traversal. Section 3.2 specifies the monoids used to accumulate information in enumeration, optimisation and decision searches. Section 3.3 defines a new parallel operational semantics for multi-threaded tree traversal. This semantics generalises the semantics in [3] by classifying search types by their monoids, and by providing correctness proofs (Section 3.7).

3.1 Trees

To represent search trees, we formalise trees and tree traversals. Let X be a non-empty alphabet. X∗ denotes the set of finite words over X, and ⪯ denotes the prefix order on X∗.

A tree T is a non-empty finite prefix-closed subset of X∗. We call words w ∈ T nodes of T. A node w is a parent of a node u, and u is a child of w, if there is a ∈ X such that u = wa. Nodes u and v are siblings if they share the same parent. The root of T is the empty word ϵ, and the depth of a node w in T is the length of word w, denoted by |w|.

We assume that a tree T is ordered, i.e. equipped with a partial order ⋖T that linearly orders siblings; we call ⋖T a sibling order. We define the traversal order ≪T for T as a linear extension of the prefix order ≺ and the sibling order ⋖T as follows.

u ≪T v iff u ≺ v, or ∃ w, u′, v′ ∈ X∗ and ∃ a, b ∈ X such that u = wau′, v = wbv′ and wa ⋖T wb.

We omit the subscript if the tree T is obvious from the context. Traversing T in ≪ order is a depth-first traversal that visits the children of each node in sibling order.

A subset S of X∗ is a subtree if S has a least element w.r.t. the prefix order, the root u, and is prefix-closed above the root, i.e. whenever u ⪯ v ⪯ w and w ∈ S then v ∈ S; we call S a subtree rooted at u. If a subtree S is a subset of a tree T then S inherits T's sibling order ⋖T and traversal order ≪T.

Let S be a subtree rooted at u, and let v ∈ S. We define children(S,v) as the set of nodes in S that are children of v. We define subtree(S,v) = {w | w ∈ S and v ⪯ w} as the subtree of S rooted at v. If v ≠ u then S \ subtree(S,v) is a subtree rooted at u. We define succ(S,v) = {w | w ∈ S and v ≪ w} to be the set of nodes in S that follow v in traversal order. We define next(S,v) = min≪ succ(S,v) as the node in S that immediately follows v in traversal order, writing next(S,v) = ⊥ if no such node exists.

We write lowest(S,v) for the subset of nodes in succ(S,v) at minimum depth, that is closest to the root of S. We define nextLowest(S,v) = min≪ lowest(S,v) as the "first" (in traversal order) of the minimum depth nodes in succ(S,v), writing nextLowest(S,v) = ⊥ if there is no such node.

Tree generators. While the semantics models a fully materialised search tree, any realistic implementation must generate the search tree on demand as in Section 4.1. We call a function g : X∗ → X∗ an ordered tree generator if all images of g are isograms, i.e. have no repeating letters. We define Tg, the tree generated by such a g, as the smallest subset of X∗ that contains ϵ and is closed under g, that is, if u ∈ Tg and g(u) = a1 ... an, ai ∈ X, then all uai ∈ Tg. We equip Tg with the sibling order ⋖Tg induced by g, defined as uai ⋖Tg uaj iff g(u) = a1 ... an and i < j. This defines a total order on siblings because images of g are isograms.

3.2 Search types

YewPar supports enumeration, optimisation and decision searches. All three search types can be characterised by a commutative monoid M for accumulating information and an objective function h for mapping search tree nodes into the monoid.

Enumeration search traverses the entire search tree and gathers information by summing the values of the objective function. Such a search is defined by a commutative monoid ⟨M, +, 0⟩ and an objective function h : X∗ → M, and searching an initial tree S0 amounts to computing the sum Σ{h(v) | v ∈ S0}.

Examples. To count the nodes in a search tree the monoid M is the natural numbers with addition and the objective function h is h(v) = 1. To count the nodes at a given depth d requires the same monoid M but a different objective function: h(v) = 1 if the depth of v is d, i.e. |v| = d, and h(v) = 0 otherwise.

Optimisation search computes the maximal value of the objective function across the search tree. This requires the commutative monoid ⟨M, +, 0⟩ to induce a total order ⟨M, ⊑⟩ with least element 0, and with + acting as the max operator. Optimisation search typically returns not the maximal value but a search tree node witnessing that value. (There may be many such witnesses; optimisation search may pick one nondeterministically.) This requires the tracking of incumbents, and also provides opportunities for pruning the search tree (Section 3.5).

Example. A simple optimisation search computes tree depth. The monoid M is the natural numbers with maximum, which induces the usual total order, and the objective function h maps each node to its depth, i.e. h(v) = |v|.

Decision search, like optimisation search, computes the maximum of the objective function while traversing the search tree. (Like optimisation search, decision search typically returns a nondeterministically chosen node witnessing that maximum.) However, decision search requires the total order ⟨M, ⊑⟩ to be bounded, and terminates as soon as the objective function reaches the greatest element.

Example. A simple decision search decides whether a tree is at least k levels deep. The bounded total order is the set {0,...,k} with the usual order, i.e. k is the greatest element. The objective function h maps each node to its depth, cut off at level k, i.e. h(v) = min{|v|, k}.

3.3 Configurations

The semantics captures the current state of a parallel search in a configuration of the form ⟨σ, Tasks, θ1,...,θn⟩, where n ≥ 1 is the fixed number of parallel threads. The components of a configuration are as follows.

θi is the state of the i-th thread. It is either ⊥ to denote an idle thread, or ⟨S,v⟩^k to denote an active thread that is executing task S, i.e. searching subtree S in traversal order. The node currently being explored is v, and the superscript k records how often the search of S has backtracked (k may be omitted if not relevant).

Tasks is a queue of pending tasks, i.e. subtrees yet to be searched. We use list notation, so [] is an empty queue, [S] is a singleton queue, S:Tasks is a queue with S at the head, and Tasks:S is a queue with S at the tail.

σ is the current global knowledge, and is either of the form ⟨x⟩ or {u}. For enumeration searches, the accumulator ⟨x⟩ is an element of a commutative monoid M that sums the current knowledge. For optimisation and decision searches, the incumbent {u} is a search tree node that currently maximises the objective function h.

Search begins with all threads idle, a singleton task queue and global knowledge being either the root node incumbent or the zero accumulator. That is, an initial configuration takes the form ⟨σ0, [S0], ⊥,..., ⊥⟩, where S0 is the entire search tree and σ0 is either {ϵ} or ⟨0⟩. Search ends when the task queue is empty and all threads are idle, i.e. a final configuration is ⟨σ, [], ⊥,..., ⊥⟩, where σ is a search result.

3.4 Reduction rules

Figure 2 lists the reduction rules of the multi-threaded semantics. The rules are divided into four categories and define reduction relations →^T_i, →^N_i, →^P_i and →^S_i for tree traversal, node processing, pruning and spawning respectively. The subscript i indicates the active thread performing a reduction step. The per-thread and overall reduction relations →_i and → are defined as follows.

→_i = (→^T_i ∘ →^N_i) ∪ →^P_i ∪ →^S_i
→ = →_1 ∪ ⋯ ∪ →_n

Every → reduction is a per-thread reduction for some thread i, which is either a spawn reduction, a prune reduction, or a traversal reduction followed immediately by a node processing reduction.

The traversal rules encode standard backtracking, searching a subtree S in traversal order, starting at the root of S (schedule), expanding the current branch (expand), backtracking to another branch (backtrack), and terminating once S is explored (terminate). The search type determines which node processing rules are applicable. Enumeration searches accumulate the value of the objective function using the monoid addition + (accumulate). Optimisation and decision searches update the incumbent after comparing its objective value to the current node using the total order ⊑ (strengthen/skip). The (noop) rule prevents node processing getting stuck after (terminate).

3.5 Pruning

Optimisation and decision searches admit pruning the search tree, i.e. removing subtrees that can never improve the current incumbent. Semantically, this is reflected by the (prune) rule. What to prune is decided by search-specific heuristics. The semantics abstracts the heuristics as a binary relation ▷ on search tree nodes, where u ▷ v states that u justifies pruning v. The ▷ relation must satisfy the following admissibility conditions w.r.t. the objective function h and the total order ⟨M, ⊑⟩.

v = root of S (schedule ) (accumulate ) i T 0 i N ⟨σ, S:Tasks,..., ⊥,...⟩ →i ⟨σ, Tasks,..., ⟨S, v ⟩ ,...⟩ ⟨⟨x ⟩, Tasks,..., ⟨S, v ⟩,...⟩ →i ⟨⟨x + h(v)⟩, Tasks,..., ⟨S, v ⟩,...⟩ v′ = next(S, v) v′ , ⊥ v ⪯ v′ h(v) ⊐ h(u) (expand ) (strengthen ) i k T ′ k i N ⟨σ, Tasks,..., ⟨S, v ⟩ ,...⟩ →i ⟨σ, Tasks,..., ⟨S, v ⟩ ,...⟩ ⟨{u }, Tasks,..., ⟨S, v ⟩,...⟩ →i ⟨{v }, Tasks,..., ⟨S, v ⟩,...⟩ v′ = next(S, v) v′ , ⊥ v ̸⪯ v′ h(v) ⊑ h(u) (backtrack ) (skip ) i k T ′ k+1 i N ⟨σ, Tasks,..., ⟨S, v ⟩ ,...⟩ →i ⟨σ, Tasks,..., ⟨S, v ⟩ ,...⟩ ⟨{u }, Tasks,..., ⟨S, v ⟩,...⟩ →i ⟨{u }, Tasks,..., ⟨S, v ⟩,...⟩ next(S, v) = ⊥ (terminate ) (noop ) i k T i N ⟨σ, Tasks,..., ⟨S, v ⟩ ,...⟩ →i ⟨σ, Tasks,..., ⊥,...⟩ ⟨σ, Tasks,..., ⊥,...⟩ →i ⟨σ, Tasks,..., ⊥,...⟩

u ▷ v S′ = subtree(S, v)\{v } S′ , ∅ h(u) is greatest element (prune ) (shortcircuit ) i P ′ i P ⟨{u }, Tasks,..., ⟨S, v ⟩,...⟩ →i ⟨{u }, Tasks,..., ⟨S \ S , v ⟩,...⟩ ⟨{u }, Tasks,..., ⟨S, v ⟩,...⟩ →i ⟨{u }, [], ⊥,..., ⊥⟩

u ∈ S v ≪ u S = subtree(S, u) (spawn ) u i S ⟨σ, Tasks,..., ⟨S, v ⟩,...⟩ →i ⟨σ, Tasks:Su,..., ⟨S \ Su, v ⟩,...⟩

|v | < dcutoff {u1,..., um } = children(S, v) , ∅ u1 ≪ · · · ≪ um S1 = subtree(S, u1) ... Sm = subtree(S, um ) (spawn-depth ) i S ⟨σ, Tasks,..., ⟨S, v ⟩,...⟩ →i ⟨σ, Tasks:S1: ... :Sm,..., ⟨S \ S1 \···\ Sm, v ⟩,...⟩

k ≥ kbudget {u1,..., um } = lowest(S, v) , ∅ u1 ≪ · · · ≪ um S1 = subtree(S, u1) ... Sm = subtree(S, um ) (spawn-budget ) i k S 0 ⟨σ, Tasks,..., ⟨S, v ⟩ ,...⟩ →i ⟨σ, Tasks:S1: ... :Sm,..., ⟨S \ S1 \···\ Sm, v ⟩ ,...⟩

u = nextLowest(S, v) , ⊥ Su = subtree(S, u) (spawn-stack ) i S ⟨σ, [],..., ⟨S, v ⟩,...⟩ →i ⟨σ, [Su ],..., ⟨S \ Su, v ⟩,...⟩ Figure 2. Reduction rules of the operational semantics.

1. For all u and v, if u ▷ v then h(u) ⊒ h(v).
2. For all u′, u and v, if h(u′) ⊒ h(u) and u ▷ v then u′ ▷ v.
3. For all u, v and v′, if u ▷ v and v ⪯ v′ then u ▷ v′.

Condition 1 states correctness of pruning w.r.t. maximising the objective function: if u justifies pruning v then h(u) dominates h(v). Condition 2 allows strengthening of incumbents: if u justifies pruning v then any stronger incumbent u′ also will. Condition 3 allows pruning entire subtrees: if u justifies pruning v then any descendant of v can also be pruned.

3.6 Spawning

The semantics includes a (spawn) rule to model space-splitting parallel search. A (spawn) reduction hives off some subtree S_u of the current thread ⟨S,v⟩ into a new task S_u that is added to the task queue. S_u must be unexplored, i.e. its root u is visited after the current node v in traversal order.

The search coordinations used in YewPar skeletons (Section 4.2) implement more complex space-splitting behaviours, selecting specific groups of subtrees to spawn as tasks, in a specific order. We model these coordination behaviours as derived spawn rules. Semantically, these rules are redundant; they are included to faithfully model the coordination behaviour of particular YewPar skeletons.

The (spawn-depth) rule models Depth-Bounded search coordination. The rule fires if the depth of the current node v is less than the d_cutoff parameter. The rule spawns all subtrees of S rooted at children of v and queues them in traversal order. The rule causes the eager spawning of the top d_cutoff levels of the search tree, queued in heuristic search order.

The (spawn-budget) rule models Budget search coordination. The rule fires if the backtrack counter k of the current thread exceeds the k_budget parameter. The rule spawns all unexplored subtrees of the current thread ⟨S,v⟩ at lowest depth, i.e. closest to the root of S. New tasks are queued in traversal order, and the backtrack counter of the current thread is reset to 0. The rule periodically generates new tasks from threads that contain significant amounts of work since search has not completed within the backtrack budget.

The (spawn-stack) rule models Stack-Stealing search coordination. It differs from the other rules in that it only fires when the task queue is empty, and only spawns one new task. That task is the first (in traversal order) of the unexplored lowest-depth subtrees of the current thread ⟨S,v⟩. The rule is designed to split the search space on demand, generating a task to be stolen by an idle thread.

The YewPar implementation of Stack-Stealing lets a thief steal directly from the victim. Semantically, this behaviour corresponds to a (spawn-stack_i) reduction followed by a (schedule_j) reduction, modelling idle thread j stealing from victim thread i, with the task queue acting as a transit buffer for the stolen task.

3.7 Correctness

The semantics is correct if every sequence of reductions on a search tree eventually computes the same sum or maximum of the objective function, independent of the particular reduction sequence. For enumeration searches, correctness amounts to termination and confluence of the reduction relation, but optimisation and decision searches may nondeterministically return any optimal witness, hence the reduction relation cannot be confluent in general. The following theorems formalise this statement of correctness.

Theorem 3.1. Let S0 be a search tree for an enumeration search. If ⟨⟨0⟩, [S0], ⊥, ..., ⊥⟩ →* ⟨⟨x⟩, [], ⊥, ..., ⊥⟩ then x = Σ{h(v) | v ∈ S0}.

Proof sketch. All rules except (terminate) keep the total set of nodes in a configuration invariant. The traversal order ensures that nodes removed by (terminate) have been visited and processed by (accumulate) exactly once. The claim follows as every node is eventually removed by (terminate). □

Theorem 3.2. Let S0 be a search tree for an optimisation or decision search. If ⟨{ϵ}, [S0], ⊥, ..., ⊥⟩ →* ⟨{û}, [], ⊥, ..., ⊥⟩ then h(û) = max{h(v) | v ∈ S0}.

Proof sketch. The claim is obvious if h(û) is the greatest element and the (shortcircuit) rule fires. Otherwise, arguments similar to the proof of Theorem 3.1 show that any nodes removed by rules (terminate) and (prune) cannot beat the current incumbent, let alone the final incumbent û, hence the claim follows. □

Theorem 3.3. The reduction relation → is terminating.

Proof sketch. We map each configuration to a measure, namely a multiset containing the number of nodes of each task and the number of unexplored nodes of each thread. The claim holds as each → step strictly decreases the measure w.r.t. Dershowitz and Manna's well-founded multiset order [15]. □

4 Composing Tree Searches

YewPar search applications are constructed by combining a search skeleton with a Lazy Node Generator, as shown in Fig. 3. Skeletons in turn comprise a search type, e.g. Decision, and a search coordination, e.g. Depth-Bounded (Section 4.2). Most of the search is parameterised generic code. Notable exceptions are the Lazy Node Generator, which is application-specific and is provided by the user to specify how to generate the search tree (on demand) and in which order to traverse the tree. Likewise the bounding functions or pruning predicates are application-specific, and must be specified.

[Figure 3. Structure of Search Skeletons and Applications. Search Skeleton = Search Coordination (Sequential, Depth-Bounded, Stack-Stealing, Budget, ...) + Search Type (Enumeration, Decision, Optimisation); Search Application (e.g. Knapsack, Travelling Salesperson, Clique Search) = Search Skeleton (e.g. SequentialDecision, BudgetDecision, DepthBoundedEnumeration, DepthBoundedOptimisation, ...) + Lazy Node Generator.]

The decomposition of a search application into a Lazy Node Generator, search coordination and search type reflects how the reduction rules in Figure 2 are factored into groups for traversal, node processing, pruning and work generation. The traversal rules are implemented by all search coordination methods and call the node generator. The node processing and pruning rules are determined by the search type. Each work generation rule is implemented by a specific search coordination. The search skeleton library is extensible, allowing the addition of new search coordination methods. For example, new coordination methods may provide best-first search or random task creation.

4.1 Lazy Node Generators

Let Tg be a search tree generated by ordered tree generator g (Section 3.1). Given a parent node u ∈ Tg, the list [ua1, ..., uan] enumerates the children of u in traversal order, where g(u) = a1 ... an.

A Lazy Node Generator for u is a data structure that provides iteration over the list [ua1, ..., uan]. Lazy Node Generators implement the following interface, parameterised by types for the search space and the search tree nodes.

    template <typename SearchSpace, typename Node>
    struct NodeGenerator {
      // Returns true if there are more children
      virtual bool hasNext();

      // Returns the next child, provided hasNext() == true
      virtual Node next();

      // Constructs a NodeGenerator for a given parent node
      NodeGenerator(const SearchSpace & space, const Node & node);
    };

Node generators only generate the children of a node, in the order in which they are to be traversed. They do not determine how and when the search tree is constructed; these are specified by the skeleton implementations (Section 4.2). In particular, depth-first backtracking tree traversal is implemented by maintaining a stack of node generators, as exemplified by the Sequential search coordination (Listing 2). In each iteration, the node generator at the top of the stack is advanced and a new node generator for the child is pushed onto the stack, corresponding to the (expand) rule. When the node generator at the top of the stack is empty, the search coordination pops the empty generator off the stack and advances the generator below, corresponding to the (backtrack) rule.

Besides backtracking, the generator stack is also helpful for identifying which subtrees to spawn as tasks, as exemplified by the Budget and Stack-Stealing search coordinations.

A Lazy Node Generator does not compute the entire list of children of a node upfront. Instead, the children are materialised lazily as the node generator is advanced using the next method, rather like a Python generator. This has two advantages: (1) A stack of Lazy Node Generators typically uses less memory than a stack of fully materialised lists of children. (2) Often the full list of children is not required, e.g. it is possible to prune all "future" children "to-the-right" once a bounds check establishes that the current node cannot beat the incumbent.
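As a minimal illustration of the interface and the generator stack (a self-contained sketch, not YewPar code; the complete-binary-tree search space is invented for the example), the following generator materialises children one at a time, and a stack of generators drives a depth-first traversal in the style of the (expand) and (backtrack) rules:

```cpp
#include <cassert>
#include <stack>

// Illustrative search space: a complete binary tree of fixed depth.
struct BinSpace { int maxDepth; };

// A node is identified by its depth in the tree.
struct BinNode { int depth; };

// A lazy node generator for BinNode: children are materialised one at a
// time via next(), in left-to-right (traversal) order.
struct BinGen {
  int childDepth;
  int remaining;  // children not yet generated

  BinGen(const BinSpace & space, const BinNode & node)
    : childDepth(node.depth + 1),
      remaining(node.depth < space.maxDepth ? 2 : 0) {}

  bool hasNext() const { return remaining > 0; }

  BinNode next() { --remaining; return BinNode{childDepth}; }
};

// Depth-first traversal with a stack of generators: advancing the top
// generator and pushing a generator for the child models (expand);
// popping an exhausted generator models (backtrack).
int countNodes(const BinSpace & space) {
  int visited = 1;  // the root
  std::stack<BinGen> genStack;
  genStack.push(BinGen(space, BinNode{0}));
  while (!genStack.empty()) {
    if (genStack.top().hasNext()) {
      BinNode child = genStack.top().next();
      ++visited;                            // process the child node
      genStack.push(BinGen(space, child));  // (expand)
    } else {
      genStack.pop();                       // (backtrack)
    }
  }
  return visited;
}
```

Note that at any point the stack holds one small generator per level rather than fully materialised child lists, which is the memory advantage described above.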

Lazy Node Generator Example. Listing 1 demonstrates a Lazy Node Generator implementing a state-of-the-art Maximum Clique algorithm [26]. Graphs (of up to N vertices) are represented as vectors mapping vertices to sets of adjacent vertices, and vertex sets are represented as fixed-size bitsets. The bitset representation enables vectorisation of set operations, which is known to speed up Maximum Clique implementations up to 20-fold [36]. The algorithm relies on a standard greedy colouring heuristic for pruning; the prototype is declared on line 8, see [26] for the specific algorithm.

Listing 1. Lazy Node Generator for maximum clique.

     1 typedef std::bitset<N> VertexSet;
     2 typedef std::vector<VertexSet> Graph;
     3
     4 // Greedily colours the subgraph induced by vertex set p.
     5 // On return, array p_vertex enumerates the set p, and
     6 // p_colour[i] is the number of colours used to colour
     7 // the set of vertices p_vertex[0], ..., p_vertex[i].
     8 void greedy_colour(const Graph & graph,
     9                    const VertexSet & p,
    10                    std::array<int, N> & p_vertex,
    11                    std::array<int, N> & p_colour);
    12
    13 // Search tree node
    14 struct Node {
    15   VertexSet clique;      // current clique
    16   int size;              // size of current clique
    17   VertexSet candidates;  // candidates to extend clique
    18   int bound;             // bound on clique extensions
    19
    20   // Objective function (to be maximized)
    21   int getObj() const { return size; }
    22
    23   // Not shown: Node constructor, serialisation
    24 };
    25
    26 // Upper bound function (determines when to prune)
    27 int upperBound(const Graph & graph, const Node & node) {
    28   return node.getObj() + node.bound;
    29 }
    30
    31 // Lazy node generator
    32 struct Gen : NodeGenerator<Graph, Node> {
    33   std::reference_wrapper<const Graph> graph;
    34   std::reference_wrapper<const Node> parent;
    35   std::array<int, N> p_vertex;  // candidates (ordered)
    36   std::array<int, N> p_colour;  // number of colours used
    37   VertexSet remaining;          // set of remaining candidates
    38   int k;                        // current index into array p_vertex
    39
    40   // Constructor
    41   Gen(const Graph & graph, const Node & parent) :
    42     graph(std::cref(graph)), parent(std::cref(parent))
    43   {
    44     remaining = parent.candidates;
    45     greedy_colour(graph, remaining, p_vertex, p_colour);
    46     k = remaining.count();
    47   }
    48
    49   bool hasNext() override { return k > 0; }
    50
    51   Node next() override {
    52     k--;
    53     int v = p_vertex[k];  // candidate to be added
    54     remaining.reset(v);
    55     VertexSet clique = parent.get().clique;
    56     clique.set(v);
    57     int size = parent.get().size + 1;
    58     VertexSet candidates = remaining;
    59     candidates &= graph.get()[v];
    60     int bound = p_colour[k];
    61     return Node(clique, size, candidates, bound);
    62   }
    63 };

The Node struct (line 14) represents a search tree node. It stores the current clique (including its size), a set of candidate vertices that may extend the clique, and an upper bound on the number of vertices that can yet be added to the current clique. Node also provides the getObj method (line 21), which returns the current objective value of the node and is maximised by optimisation and decision searches. Branch-and-bound pruning decisions are taken by comparing the best objective value found so far with the result of the upperBound function (line 27).

The Lazy Node Generator, struct Gen, starts at line 32. It stores a reference to the graph (i.e. the search space) and to its parent node. The constructor (line 41) copies the parent's set of candidates and calls a greedy heuristic to colour the subgraph induced by these candidates. On return, array p_vertex stores the candidates in reverse heuristic order, and p_colour[i] stores the number of colours used to colour the set of vertices {p_vertex[0], ..., p_vertex[i]}, which is an upper bound on the number of vertices that can be added to the parent's clique. Finally, the constructor initialises the iterator index k to the end of array p_vertex, i.e. k points to the heuristically best candidate.

The lazy node generator implements the NodeGenerator interface by iterating over array p_vertex in reverse order. The hasNext method (line 49) simply checks the iterator index. The next method (line 51) advances the iterator index, removes the current candidate v from the remaining candidates (line 54), and adds it to a copy of the parent's clique (line 56). Then, it intersects a copy of the set of remaining vertices with the vertices that are adjacent to the current candidate v (line 59), forming a new set of candidates for extending the extended clique. Finally, next returns a new Node (line 61) containing the extended clique, the new set of candidates, and an upper bound on the number of vertices that can yet be added to the extended clique.

4.2 Search Coordination Methods

During a search potentially any node in the search tree may be converted to a task, but to minimise search time it is critical to choose a heuristically good node. We follow prior work, e.g. [32], and use both application heuristics (as encoded in the Lazy Node Generator), and select large subtrees (to minimise communication and scheduling overheads) that we expect to find close to the root of the search tree.

Sequential search coordination corresponds to a semantics without spawn rules, i.e. it executes in a single thread that performs a depth-first search from the root node. The pseudocode is given in Listing 2, where the effect of the processNode function (lines 3 and 9) is determined by the particular type of search, e.g. optimisation.
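The colour bound used by Listing 1 can be illustrated with a simplified, self-contained greedy colouring (a sketch only; YewPar's actual heuristic follows [26] and differs in vertex ordering and data layout). Each colour class is an independent set, so a clique inside the candidate set can contain at most one vertex per class, making the number of classes an upper bound on the clique size:

```cpp
#include <bitset>
#include <cassert>
#include <vector>

const int N = 8;  // illustrative maximum graph size
typedef std::bitset<N> VertexSet;
typedef std::vector<VertexSet> Graph;  // adjacency sets, one per vertex

// Simplified greedy colouring: returns the number of colour classes used
// to colour the subgraph induced by candidate set p. Each vertex goes into
// the first class containing none of its neighbours; a new class is opened
// if no existing class fits.
int colourBound(const Graph & graph, const VertexSet & p) {
  std::vector<VertexSet> classes;  // each class is an independent set
  for (int v = 0; v < N; ++v) {
    if (!p.test(v)) continue;
    bool placed = false;
    for (auto & cls : classes) {
      if ((cls & graph[v]).none()) {  // no neighbour of v in this class
        cls.set(v);
        placed = true;
        break;
      }
    }
    if (!placed) {
      VertexSet cls;
      cls.set(v);
      classes.push_back(cls);
    }
  }
  return (int)classes.size();
}
```

For a triangle, three classes are needed, matching its maximum clique size; for a pair of non-adjacent vertices a single class suffices, correctly bounding any clique among them by 1.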

Listing 2. Sequential Search Coordination Pseudocode.

     1 function Sequential(SearchType stype, SearchSpace space,
     2                     Node root):
     3   processNode(stype, root)
     4   genStack.push(NodeGenerator(space, root))
     5   while not genStack.empty() do
     6     gen ← genStack.top()
     7     if gen.hasNext() then
     8       child ← gen.next()
     9       processNode(stype, child)
    10       genStack.push(NodeGenerator(space, child))
    11     else
    12       genStack.pop()  // Backtrack

Depth-Bounded search coordination converts all nodes below a cut-off depth dcutoff into tasks. The YewPar implementation is similar to Sequential, except that it tracks the current search depth and, up until the cutoff depth, spawns tasks for each child node. These spawns occur as tasks execute rather than generating all work upfront. Depth-Bounded is a direct implementation of the (spawn-depth) rule, except that it uses a distributed workpool to achieve scalability.

Stack-Stealing search coordination is a dynamic work generation approach that splits the search tree on receipt of a work-stealing request. YewPar implements the (spawn-stack) rule and the pseudocode is given in Listing 3. It checks for a steal request on every search expansion step (line 6). If a steal is requested the generator stack is searched from bottom to top (line 7), i.e. nodes (one node, or all at the lowest depth if the chunked flag is set) closest to the root first. Either work is returned to the thief (line 11), or the steal fails (line 13). Steal requests are sent via atomic channels between thieves and victims.

Listing 3. Stack-Stealing Search Coordination Pseudocode.

     1 function StackStealing(SearchType stype, SearchSpace space,
     2                        Node root):
     3   steal ← getWorkerStealChannel()
     4   processNode(stype, root)
     5   genStack.push(NodeGenerator(space, root))
     6   while not genStack.empty() do
     7     if response = steal.non_blocking_get() then
     8       for gen ← genStack.bottom() to genStack.top() do
     9         if gen.hasNext() then
    10           tRoot ← gen.next()
    11           t ← StackStealing(stype, space, tRoot)
    12           response.put(t)
    13           break
    14       response.put(Nothing)
    15     else  // Continue Seq. Search
    16       gen ← genStack.top()
    17       if gen.hasNext() then
    18         child ← gen.next()
    19         processNode(stype, child)
    20         genStack.push(NodeGenerator(space, child))
    21       else
    22         genStack.pop()  // Backtrack

The YewPar implementation combines work pushing to initially distribute search tasks to workers; once these are exhausted the workers switch to work (stack) stealing. Victim selection is random, although in a distributed environment remote workers are only selected if there are no active local workers. As workers communicate directly, no workpool is required, and the implementation is similar to [1].

Budget search coordination is a dynamic approach with asynchronous periodic load-balancing. Each worker has a period based on the number of backtracks performed. Workers search subtrees until the task is complete or the task has backtracked as many times as specified in a user-defined budget, btBudget in the Listing 4 pseudocode. If the budget is exhausted (line 7), all nodes at the lowest depth are spawned and the budget is reset (lines 8–15). YewPar implements the (spawn-budget) rule, except for using a distributed workpool to achieve scalability.

Listing 4. Budget Search Coordination Pseudocode.

     1 function Budget(SearchType stype, SearchSpace space,
     2                 Node root, int btBudget):
     3   backtracks ← 0
     4   processNode(stype, root)
     5   genStack.push(NodeGenerator(space, root))
     6   while not genStack.empty() do
     7     if backtracks ≥ btBudget then
     8       for gen ← genStack.bottom() to genStack.top() do
     9         if gen.hasNext() then
    10           while gen.hasNext() do
    11             tRoot ← gen.next()
    12             t ← Budget(stype, space, tRoot, btBudget)
    13             spawn(t)
    14           break
    15       backtracks ← 0
    16     else  // Continue Seq. Search
    17       gen ← genStack.top()
    18       if gen.hasNext() then
    19         child ← gen.next()
    20         processNode(stype, child)
    21         genStack.push(NodeGenerator(space, child))
    22       else
    23         genStack.pop()  // Backtrack
    24         backtracks ← backtracks + 1

4.3 YewPar Realisation

The YewPar parallel C++ search framework¹ not only implements the Lazy Node Generators and search skeletons, but also provides low-level components such as search specific work-stealing schedulers and workpools with which new skeletons can be created. To support distributed memory parallelism, YewPar builds on the HPX [22] task parallelism library and runtime system. HPX is routinely deployed on HPC and Cloud architectures, and YewPar can readily exploit this portability at scale. Complete descriptions of the YewPar design and implementation are available in [3, 5]; it has the following primary components.

Search Specific Schedulers. YewPar layers the search coordination methods as custom schedulers on top of the existing HPX scheduler. That is, the HPX scheduler manages several lightweight YewPar scheduler threads that perform the search. The (local) thread to core mapping is handled transparently by HPX. We divide operating system threads into two types: worker threads that continuously seek and execute search tasks, and manager threads (one per locality) that are managed entirely by HPX and handle aspects like messages and termination.

¹ https://github.com/BlairArchibald/YewPar
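The budget mechanism can be sketched in miniature (illustrative code only, not YewPar's implementation; the toy binary-tree space and in-memory task queue are invented for the example). A sequential loop counts backtracks and, once the budget is exhausted, drains the bottom-most non-empty generator into a task queue and resets the counter, mirroring lines 7–15 of Listing 4:

```cpp
#include <cassert>
#include <deque>
#include <vector>

// Toy search tree: a complete binary tree of fixed depth.
struct Node { int depth; };
struct Space { int maxDepth; };

struct Gen {
  int childDepth, remaining;
  Gen(const Space & s, const Node & n)
    : childDepth(n.depth + 1), remaining(n.depth < s.maxDepth ? 2 : 0) {}
  bool hasNext() const { return remaining > 0; }
  Node next() { --remaining; return Node{childDepth}; }
};

// Budget-style search: search depth-first and, once btBudget backtracks
// have occurred, drain the bottom-most non-empty generator (closest to
// the root) into the task queue, then reset the counter.
int budgetSearch(const Space & space, Node root, int btBudget,
                 std::deque<Node> & taskQueue) {
  int visited = 1;  // the task root
  int backtracks = 0;
  std::vector<Gen> genStack{Gen(space, root)};
  while (!genStack.empty()) {
    if (backtracks >= btBudget) {
      for (auto & gen : genStack) {  // bottom to top
        if (gen.hasNext()) {
          while (gen.hasNext()) taskQueue.push_back(gen.next());  // spawn
          break;
        }
      }
      backtracks = 0;
    } else if (genStack.back().hasNext()) {
      Node child = genStack.back().next();
      ++visited;                             // process the child node
      genStack.push_back(Gen(space, child)); // expand
    } else {
      genStack.pop_back();                   // backtrack
      ++backtracks;
    }
  }
  return visited;
}

// Idle workers would drain taskQueue; here spawned tasks are simply
// finished sequentially, so every node is visited exactly once.
int runAll(const Space & space, int btBudget) {
  std::deque<Node> tasks{Node{0}};
  int total = 0;
  while (!tasks.empty()) {
    Node t = tasks.front();
    tasks.pop_front();
    total += budgetSearch(space, t, btBudget, tasks);
  }
  return total;
}
```

Because each spawned node is removed from its generator before being queued, every node is counted exactly once, whether it is explored locally or handed off as a task.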

Where workers look for work is determined by the skeleton being executed. Work may come directly from other workers, as in the Stack-Stealing skeleton, or from a distributed workpool, where steals are sent to remote nodes if no work is available locally. The schedulers seek to preserve search order heuristics, e.g. by using a bespoke order-preserving workpool [3, 5].

Knowledge Management. The sharing of solutions and bounds relies on HPX's partitioned global address space (PGAS). To minimise distributed queries, bounds are broadcast to localities that keep track of the last received bound. The local bound does not need to be up-to-date to maintain correctness, hence YewPar can tolerate communication delays at the cost of missing pruning opportunities.

Skeletons API. C++ templates are used to statically specialise the generic skeletons into efficient application-specific implementations. Compile time type information allows stack based memory allocation and compile time elimination of unused branches, e.g. if branch and bound is disabled then all pruning code is removed.

The return type of the skeletons is derived from the template parameters; for example, a skeleton implementation called with an Optimisation parameter will return the optimal search tree node. The skeletons expose parameters like depth cutoff or backtracking budget that control the parallel search. These parameters determine the amount and location of work in the system, and poor parameter choices can starve or overload the system (Section 5.5).

4.4 Composing YewPar Search Applications

Listing 5 illustrates how a YewPar search application is composed following the model outlined in Figure 3; see Appendix A.3 for further, more detailed examples.

Listing 5. Composing a YewPar Search Application for a Maximising Search, using Stack-Stealing and Optimisation.

    Node maximal_solution =
      YewPar::Skeletons::StackStealing<
        Gen,
        YewPar::Optimisation,
        YewPar::BoundFunction<upperBound>  // bound for pruning
      >::search(space, root);

The search in Listing 5 returns the maximal node in some space. A YewPar user simply selects the search coordination, in this case StackStealing, provides the application-specific Lazy Node Generator Gen to generate the space, and chooses the type of search, here Optimisation. The listing also illustrates how YewPar implements the (prune) rule from Figure 2. The user provides an application-specific BoundFunction that is called on each search tree node and prunes if the bound cannot beat the current objective. For efficiency the function pointer is lifted to template level so that the function can be inlined.

5 Evaluation

5.1 Search Applications

We evaluate YewPar on a representative sample of 7 exact combinatorial search applications covering the 3 search types, as follows.

Enumeration: Unbalanced Tree Search (UTS) dynamically constructs synthetic irregular tree workloads based on a given branching factor, depth, and random seed [30]. Numerical Semigroups (NS) counts how many numerical semigroups there are of a particular genus [17]. A numerical semigroup S is a cofinite set of natural numbers containing 0 and closed under addition; the genus of S is the size of its complement.

Optimisation: Maximum Clique (MaxClique) finds the largest clique, i.e. largest set of pairwise adjacent vertices, in a given graph. 0/1 Knapsack determines the optimal set of items, each with profit and weight, to place in a container such that profit is maximised subject to a given weight limit. Travelling Salesperson (TSP) finds a shortest circular tour of N cities.

Decision: The Subgraph Isomorphism Problem (SIP) determines whether a copy of a given pattern graph is present in a target graph. The k-Clique variant of Maximum Clique finds a clique of k vertices if one exists in the graph.

The baseline implementations for MaxClique [26], SIP [27], NS [17], and UTS [30] use published state-of-the-art algorithms. The sequential C++ implementations were provided by the domain experts, and not written by the authors. A full description of the applications and instances is in [3]. Data underpinning the analysis in this section is openly available [7].

5.2 Experimental Setup

We report measurements on up to 17 machines, each with a dual 8-core Intel Xeon E5-2640v2 2GHz CPU (no hyper-threading), 64GB RAM, running Ubuntu 14.04.3 LTS. We reserve one core for HPX (Version 1.0, commit 51d3d0) for task management, i.e. on 16 cores we use 15 workers.

Caveat. Performance analysis of parallel searches is notoriously difficult, as nondeterminism caused by pruning, finding alternate valid solutions, and random work-stealing can lead to performance anomalies (Section 2.1), manifested as superlinear speedups/slowdowns. We control for this by investigating 7 search applications, running each experiment multiple times, and reporting cumulative statistics.

5.3 YewPar Overheads

Lazy Node Generators incur some overheads compared to search specific implementations as they decouple search tree generation and traversal. For example, they copy search tree nodes instead of updating in-place. We evaluate these overheads on Maximum Clique, as a competitive sequential implementation is available [25, 26].
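The compile-time elimination of unused branches described under "Skeletons API" can be sketched with `if constexpr` (illustrative only; YewPar's actual template machinery differs). A Boolean template parameter stands in for "a BoundFunction was supplied": when it is false, the pruning branch is discarded entirely at compile time rather than tested at run time.

```cpp
#include <cassert>

// Illustrative sketch of compile-time specialisation: Prune plays the role
// of "branch and bound is enabled" in the skeleton template parameters.
template <bool Prune>
int searchValue(int nodeBound, int incumbent) {
  if constexpr (Prune) {
    // Pruning branch: only compiled in when Prune == true.
    if (nodeBound <= incumbent) return incumbent;  // prune: keep incumbent
  }
  return nodeBound;  // otherwise "explore" and return the node's value
}
```

Instantiating `searchValue<false>` yields object code with no comparison at all, analogous to all pruning code being removed when branch and bound is disabled.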

Table 1. Comparing YewPar runtimes (s) with Hand-written Maximum Clique Implementations: Sequential and OpenMP (15 workers). Mean parallel slowdown computed for instances with runtimes over 1.5s.

    Instance        Seq C++    Seq YewPar  Slowdown (%)  OpenMP C++  Depth-Bounded YewPar  Slowdown (%)
    MANN_a45         169.90      160.49       −5.54         42.69          42.25               −1.02
    brock400_1       285.32      340.59       19.37         28.91          29.99                3.73
    brock400_2       213.89      249.49       16.64         11.34          12.57               10.86
    brock400_3       163.72      197.42       20.58          6.16           7.97               29.50
    brock400_4        79.82       95.33       19.43          1.82           3.44               89.50
    brock800_4     1,433.26    1,572.94        9.75        115.84         147.36               27.21
    p_hat1000-2      127.77      140.43        9.91          9.94          11.68               17.43
    p_hat1500-1        2.86        2.76       −3.29          0.44           1.15              161.02
    p_hat300-3         1.13        1.29       13.35          0.12           0.12                7.97
    p_hat500-3       115.12      123.71        7.46          9.57          10.97               14.68
    p_hat700-2         2.44        2.60        6.61          0.25           0.26                4.09
    p_hat700-3     1,278.03    1,312.76        2.72         98.96         118.93               20.18
    san1000            1.70        1.58       −7.08          0.21           1.09              413.38
    san400_0.7_2       2.04        2.27       11.48          0.14           0.15                6.98
    san400_0.7_3       1.26        1.35        7.50          0.06           0.12              113.84
    san400_0.9_1      24.81       25.08        1.08          0.17           0.53              221.39
    sanr200_0.9       26.17       29.30       11.93          2.66           3.01               13.14
    sanr400_0.7       74.95       91.44       22.00          7.87           8.01                1.70
    Geo. Mean                                  8.76                                            16.56

[Figure 4. Scaling for k-clique: 255 workers, 17 localities. Plots of runtime (s) and relative speedup (relative to 1 locality) against number of localities (1, 2, 4, 8, 16, 17) for Depth-Bounded (dcutoff = 2), Stack-Stealing (chunked), and Budget (budget = 10^7). Speedups for 18 additional applications are in Table 2.]

The first 4 columns of Table 1 show the mean sequential runtimes (over 5 runs) for the 18 DIMACS clique instances [21] that take between 1 second and 1 hour to run sequentially. The results show a limited cost of generality, i.e. a maximum overhead of 22.0%, a minimum overhead of −5.5%, and a geometric mean overhead of just 8.8%.

Parallel Execution adds additional overheads, e.g. the YewPar skeletons are parametric rather than specialised, and a distributed memory execution framework is relatively heavyweight. To evaluate the scale of these overheads we compare with a search-specific OpenMP version of the maximum clique implementation. It is imperative that the parallel search algorithm and coordination are almost identical, as otherwise performance anomalies will disrupt the comparison. Hence the lazy node generator is carefully crafted to mimic the Maximum Clique implementation [26], and the OpenMP implementation uses the task pragma to construct a set of tasks for each node at depth 1, closely analogous to the Depth-Bounded skeleton in the YewPar implementation.

Columns 5–7 of Table 1 compare the runtimes of the YewPar and OpenMP versions for the DIMACS search instances on a single locality, i.e. with 15 workers. Instances with very short runtimes can exhibit large relative performance differences, e.g. an increase of only 0.36s in runtime for san400_0.9_1 is a 221% slowdown. To avoid skewing the summative performance comparison, we compute the mean slowdown on instances with a runtime of over 1.5s. The geometric mean slowdown is 16.6%, with a maximum slowdown of 89.5% (brock400_4, absolute slowdown 1.6s). We conclude that for these instances the parallel overheads of YewPar remain moderate.

However, the YewPar parallelisation is more flexible than using OpenMP. For a start, OpenMP is restricted to shared memory multicores whereas YewPar scales on distributed memory clusters (Section 5.4). Moreover, experimenting with alternate parallelisations in YewPar entails changing a single line of code (Section 5.5). In contrast, it is far from trivial to engineer performant implementations of advanced search coordinations like Stack-Stealing or Budget in OpenMP.

5.4 YewPar Scalability

To demonstrate scalability Fig. 4 shows the runtime and relative speedups achieved by three skeletons for a large instance (sequential runtimes around 1 hour) of the k-clique benchmark – searching for a spread in H(4, 4) [6] – using 255 workers across 17 localities. Speedup is relative to a single locality with 15 workers.

This example also illustrates YewPar running on multiple architectures, i.e. a single 16-core shared memory machine, and on a cluster of such machines.

These good scalability results are typical; for example, Table 2 in the following section reports 18 further scaling results, and others are reported in Chapter 6 of [3]. YewPar commonly achieves a parallel efficiency of >50% even for these highly irregular computations, e.g. the best scaling results for 5 of the 6 benchmarks in Table 2 exceed 60x speedup with 120 workers.

Table 2. Comparing 18 Alternate Application Parallelisations: Mean Speedup on 120 workers.

    Application  Skeleton        Worst Speedup  Random Speedup  Best Speedup
    MaxClique    Depth-Bounded       0.89          14.04           91.74
                 Stack-Stealing     21.27          28.43           37.67
                 Budget              1.38           7.58           17.84
    TSP          Depth-Bounded       3.99          38.86           68.19
                 Stack-Stealing     18.01          20.79           26.27
                 Budget              1.25           4.89           39.52
    Knapsack     Depth-Bounded       0.81           0.87            0.92
                 Stack-Stealing      5.84           5.84           19.17
                 Budget              1.06           8.75           36.85
    SIP          Depth-Bounded       3.92          33.15           68.04
                 Stack-Stealing    101.92         105.54          109.61
                 Budget              1.42           7.70           45.05
    NS           Depth-Bounded       0.87           0.87            1.47
                 Stack-Stealing     26.37          30.95           30.95
                 Budget              3.12           3.12           59.48
    UTS          Depth-Bounded       1.48           8.72            9.56
                 Stack-Stealing     52.68          57.52           57.52
                 Budget             29.81          85.85           85.85
    All          Depth-Bounded       1.67           8.69           26.51
                 Stack-Stealing     27.99          34.21           43.51
                 Budget              1.87           8.99           35.11

5.5 Exploring Search Parallelisations

In the literature, different parallelisations are commonly applied to different search applications. If, for example, there are many tasks near the root (e.g., Maximum Clique), Depth-Bounded often works well. Applications that initially have a narrow tree (e.g., NS) require more dynamic parallelism, with associated overheads.

In contrast the parametric YewPar skeletons allow users to readily experiment with, and select, an appropriate skeleton parallelisation for a given application. Table 2 illustrates this by reporting cumulative speedups for the three coordinations on each search application. For each application we report the geometric mean speedup for all instances (around 20 in total) on 120 workers (8 physical localities) relative to the Sequential skeleton.

Most YewPar skeletons have parameters like dcutoff and kbudget that must be tuned for each search application. To account for this, we run a parameter sweep, e.g. dcutoff = {0 ... 8}, and kbudget = {10^4, 10^5, 10^6, 10^7} [3]. Table 2 reports the worst, best and random scaling, corresponding to the worst, best and some random choice of parameters.

No one skeleton performs best for all applications, with Depth-Bounded coming top for two applications, Stack-Stealing for one, and Budget for three. Parameter tuning is tricky, and the consequences of getting it wrong are severe, e.g., a slowdown (0.89x) versus a big speedup (91.74x) for MaxClique with Depth-Bounded. Stack-Stealing, with few parameters, minimises runtime variation and is a good choice if good parameters are not known.

6 Conclusion

We aim to improve the reuse of intricate parallel search implementations by providing the first general purpose scalable parallel framework for exact combinatorial search, YewPar. The semantics of YewPar backtracking search are defined by, and proved correct in, a novel parallel operational semantics (Section 3), which models parallel combinatorial search as a fold of the search tree into a monoid. The formalisation informs the YewPar design and implementation, and many elements recur: the three search types, search tree construction by ordered tree generator (i.e. lazy node generator), and spawn rules specifying the behaviour of the corresponding search coordinations.

We define Lazy Node Generators as a uniform API for search tree generation (Section 4.1). Uniquely in the field we demonstrate the ready composition of parallel search applications using Lazy Node Generators and a library of search skeletons, exhibiting 7 search applications (Section 5.1) covering the 3 search types.

We have designed and implemented 12 widely applicable algorithmic skeletons for tree search, and undertaken a systematic performance analysis of the skeletons on standard instances of 7 search applications. We show that the generic YewPar framework has low overheads (8.8% sequential slowdown on average), and that its parallel performance on big instances is similar to a specialised state-of-the-art implementation (Section 5.3). We report results for YewPar on two architectures: a multicore, and a cluster of multicores. Despite the high levels of irregularity, YewPar scales well, e.g. delivering maximal relative speedups of 195 on 255 workers (17 localities) for k-clique, and 110 on 120 workers (8 localities) for SIP (Section 5.4). Comparing the performance of the skeletons across search applications demonstrates the suitability of different skeletons for particular applications, illustrates the challenges of parameter selection, and shows that a skeleton approach makes it easy to explore alternative parallelisations (Section 5.5).

In ongoing work we are applying YewPar to new areas, and specifically to solving large bigraph matching problems.

Acknowledgments

This work is supported by the UK Engineering and Physical Sciences Research Council, under grants EP/N007565, EP/L000687, EP/M022641, and EP/N028201. We gratefully acknowledge the constructive feedback from Hans-Wolfgang Loidl, Michel Steuwer, and from our anonymous paper and artefact reviewers.

References

[1] Faisal N. Abu-Khzam, Khuzaima Daudjee, Amer E. Mouawad, and Naomi Nishimura. 2015. On scalable parallel recursive backtracking. J. Parallel and Distrib. Comput. 84 (2015), 65–75. https://doi.org/10.1016/j.jpdc.2015.07.006

[2] Enrique Alba, Francisco Almeida, Maria J. Blesa, J. Cabeza, Carlos Cotta, Manuel Díaz, Isabel Dorta, Joaquim Gabarró, Coromoto León, J. Luna, Luz Marina Moreno, C. Pablos, Jordi Petit, Angélica Rojas, and Fatos Xhafa. 2002. MALLBA: A Library of Skeletons for Combinatorial Optimisation (Research Note). In Euro-Par 2002, Parallel Processing, 8th International Euro-Par Conference, Paderborn, Germany, August 27-30, 2002, Proceedings. 927–932. https://doi.org/10.1007/3-540-45706-2_132
[3] Blair Archibald. 2018. Skeletons for Exact Combinatorial Search at Scale. Ph.D. Dissertation. University of Glasgow. http://theses.gla.ac.uk/id/eprint/31000
[4] Blair Archibald, Patrick Maier, Ciaran McCreesh, Robert J. Stewart, and Phil Trinder. 2018. Replicable parallel branch and bound search. J. Parallel Distrib. Comput. 113 (2018), 92–114. https://doi.org/10.1016/j.jpdc.2017.10.010
[5] Blair Archibald, Patrick Maier, Robert Stewart, and Phil Trinder. 2019. Implementing YewPar: A Framework for Parallel Tree Search [To Appear]. Euro-Par 2019. https://doi.org/10.1007/978-3-030-29400-7_14
[6] Blair Archibald, Patrick Maier, Robert Stewart, Phil Trinder, and Jan De Beule. 2017. Towards Generic Scalable Parallel Combinatorial Search. In Proceedings of PASCO 2017 (PASCO '17). ACM, New York, NY, USA. https://doi.org/10.1145/3115936.3115942
[7] Blair Archibald, Patrick Maier, Phil Trinder, and Robert Stewart. 2019. YewPar: Skeletons for Exact Combinatorial Search [Data Collection]. (2019). https://doi.org/10.5525/gla.researchdata.935
[8] David Avis and Charles Jordan. 2017. mts: a light framework for parallelizing tree search codes. CoRR abs/1709.07605 (2017). http://arxiv.org/abs/1709.07605
[9] Christian Blum and Andrea Roli. 2003. Metaheuristics in combinatorial optimization: Overview and conceptual comparison. ACM Comput. Surv. 35, 3 (2003), 268–308. https://doi.org/10.1145/937503.937505
[10] Robert D. Blumofe, Christopher F. Joerg, Bradley C. Kuszmaul, Charles E. Leiserson, Keith H. Randall, and Yuli Zhou. 1996. Cilk: An Efficient Multithreaded Runtime System. J. Parallel Distrib. Comput. 37, 1 (1996), 55–69. https://doi.org/10.1006/jpdc.1996.0107
[11] Adrian Brüngger, Ambros Marzetta, K. Fukuda, and Jürg Nievergelt. 1999. The parallel search bench ZRAM and its applications. Annals OR 90 (1999), 45–63. https://doi.org/10.1023/A:1018972901171
[12] Philippe Charles, Christian Grothoff, Vijay A. Saraswat, Christopher Donawa, Allan Kielstra, Kemal Ebcioglu, Christoph von Praun, and Vivek Sarkar. 2005. X10: an object-oriented approach to non-uniform cluster computing. In Proceedings of the 20th Annual ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications, OOPSLA 2005, October 16-20, 2005, San Diego, CA, USA. 519–538. https://doi.org/10.1145/1094811.1094852
[13] Philipp Ciechanowicz, Michael Poldner, and Herbert Kuchen. 2009. The Münster Skeleton Library Muesli: A comprehensive overview. Technical Report. Working Papers, ERCIS - European Research Center for Information Systems.
[14] A. de Bruin, G.A.P. Kindervater, and H.W.J.M. Trienekens. 1995. Asynchronous parallel branch and bound and anomalies. In Parallel Algorithms for Irregularly Structured Problems, Afonso Ferreira and José Rolim (Eds.). Lecture Notes in Computer Science, Vol. 980. Springer Berlin Heidelberg, 363–377. https://doi.org/10.1007/3-540-60321-2_29
[15] Nachum Dershowitz and Zohar Manna. 1979. Proving Termination with Multiset Orderings. Comm. ACM 22, 8 (1979), 465–476. https://doi.org/10.1145/359138.359142
[16] Raphael A. Finkel and Udi Manber. 1987. DIB - A Distributed Implementation of Backtracking. ACM Trans. Program. Lang. Syst. 9, 2 (1987), 235–256. https://doi.org/10.1145/22719.24067
[17] Jean Fromentin and Florent Hivert. 2016. Exploring the tree of numerical semigroups. Math. Comp. 85, 301 (2016), 2553–2568. https://doi.org/10.1090/mcom/3075
[18] Bernard Gendron and Teodor Gabriel Crainic. 1994. Parallel Branch-and-Bound Algorithms: Survey and Synthesis. Operations Research 42, 6 (1994), 1042–1066. https://doi.org/10.1287/opre.42.6.1042
[19] Ian P. Gent, Ciaran McCreesh, Ian Miguel, Neil C. A. Moore, Peter Nightingale, Patrick Prosser, and Chris Unsworth. 2018. A Review of Literature on Parallel Constraint Solving. CoRR abs/1803.10981 (2018).
[20] Jan Gmys, Rudi Leroy, Mohand Mezmaz, Nouredine Melab, and Daniel Tuyttens. 2016. Work stealing with private integer-vector-matrix data structure for multi-core branch-and-bound algorithms. Concurrency and Computation: Practice and Experience 28, 18 (2016), 4463–4484. https://doi.org/10.1002/cpe.3771
[21] David S. Johnson and Michael A. Trick (Eds.). 1996. Cliques, Coloring, and Satisfiability: Second DIMACS Implementation Challenge, Workshop, October 11-13, 1993. American Mathematical Society.
[22] Hartmut Kaiser, Thomas Heller, Bryce Adelstein-Lelbach, Adrian Serio, and Dietmar Fey. 2014. HPX: A Task Based Programming Model in a Global Address Space. In Proceedings of the 8th International Conference on Partitioned Global Address Space Programming Models, PGAS 2014, Eugene, OR, USA, October 6-10, 2014. 6:1–6:11. https://doi.org/10.1145/2676870.2676883
[23] Reinhard Lüling, Burkhard Monien, Alexander Reinefeld, and Stefan Tschöke. 1996. Mapping tree-structured combinatorial optimization problems onto parallel computers. In Solving Combinatorial Optimization Problems in Parallel - Methods and Techniques. 115–144. https://doi.org/10.1007/BFb0027120
[24] Arnaud Malapert, Jean-Charles Régin, and Mohamed Rezgui. 2016. Embarrassingly Parallel Search in Constraint Programming. J. Artif. Intell. Res. 57 (2016), 421–464.
[25] Ciaran McCreesh. 2018. Sequential MCsa1 Maximum Clique Implementation. (2018). https://github.com/ciaranm/sicsa-multicore-challenge-iii/blob/324abebfc3a9144af0bf628077bfb6e5af02444e/c++/voodoo-template-haxx/max_clique.cc Accessed: 04-07-2018.
[26] Ciaran McCreesh and Patrick Prosser. 2013. Multi-threading a state-of-the-art maximum clique algorithm. Algorithms 6, 4 (2013), 618–635.
[27] Ciaran McCreesh and Patrick Prosser. 2015. A parallel, backjumping subgraph isomorphism algorithm using supplemental graphs. In International Conference on Principles and Practice of Constraint Programming. Springer, 295–312.
[28] Ciaran McCreesh and Patrick Prosser. 2015. The Shape of the Search Tree for the Maximum Clique Problem and the Implications for Parallel Branch and Bound. TOPC 2, 1 (2015), 8:1–8:27. https://doi.org/10.1145/2742359
[29] Tarek Menouer. 2018. Solving combinatorial problems using a parallel framework. J. Parallel Distrib. Comput. 112 (2018), 140–153. https://doi.org/10.1016/j.jpdc.2017.05.019
[30] Stephen Olivier, Jun Huan, Jinze Liu, Jan Prins, James Dinan, P. Sadayappan, and Chau-Wen Tseng. 2006. UTS: An unbalanced tree search benchmark. In International Workshop on Languages and Compilers for Parallel Computing. Springer, 235–250.
[31] Lars Otten and Rina Dechter. 2017. AND/OR Branch-and-Bound on a Computational Grid. J. Artif. Intell. Res. 59 (2017), 351–435. https://doi.org/10.1613/jair.5456
[32] Andrea Pietracaprina, Geppino Pucci, Francesco Silvestri, and Fabio Vandin. 2015. Space-efficient parallel algorithms for combinatorial search problems. J. Parallel Distrib. Comput. 76 (2015), 58–65. https://doi.org/10.1016/j.jpdc.2014.09.007
[33] Michael Poldner and Herbert Kuchen. 2006. Algorithmic skeletons for branch & bound. In ICSOFT 2006, First International Conference on Software and Data Technologies, Setúbal, Portugal, September 11-14, 2006. 291–300. https://doi.org/10.1007/978-3-540-70621-2_17
[34] Jean-Charles Régin, Mohamed Rezgui, and Arnaud Malapert. 2013. Embarrassingly Parallel Search. In Principles and Practice of Constraint Programming - 19th International Conference, CP 2013, Uppsala, Sweden, September 16-20, 2013. Proceedings. 596–610. https://doi.org/10.1007/978-3-642-40627-0_45
[35] Jean-Charles Régin, Mohamed Rezgui, and Arnaud Malapert. 2014. Improvement of the Embarrassingly Parallel Search for Data Centers. In Principles and Practice of Constraint Programming - 20th International Conference, CP 2014, Lyon, France, September 8-12, 2014. Proceedings. 622–635. https://doi.org/10.1007/978-3-319-10428-7_45
[36] Pablo San Segundo, Diego Rodríguez-Losada, and Agustín Jiménez. 2011. An exact bit-parallel algorithm for the maximum clique problem. Computers & OR 38, 2 (2011), 571–581.
[37] Harry W.J.M. Trienekens and A. de Bruin. 1992. Towards a taxonomy of parallel branch and bound algorithms. Technical Report. Erasmus School of Economics (ESE).
[38] Gerhard J. Woeginger. 2001. Exact Algorithms for NP-Hard Problems: A Survey. In Combinatorial Optimization - Eureka, You Shrink!, Papers Dedicated to Jack Edmonds, 5th International Workshop, Aussois, France, March 5-9, 2001, Revised Papers. 185–208. https://doi.org/10.1007/3-540-36478-1_17
[39] Jingen Xiang, Cong Guo, and Ashraf Aboulnaga. 2013. Scalable maximum clique computation using MapReduce. In 29th IEEE International Conference on Data Engineering, ICDE 2013, Brisbane, Australia, April 8-12, 2013. 74–85. https://doi.org/10.1109/ICDE.2013.6544815
[40] Yan Xu, Ted K. Ralphs, Laszlo Ladányi, and Matthew J. Saltzman. 2005. Alps: A Framework for Implementing Parallel Tree Search Algorithms. Springer US, 319–334. https://doi.org/10.1007/0-387-23529-9_21

A Artifact Appendix

The main artifact, which has been evaluated and assigned the badges Artifact Available, Artifact Functional and Results Replicated, is the YewPar framework for parallelising exact combinatorial search applications. Alongside the framework we provide the example search applications used in the paper's evaluation (Section 5). The data and scripts used in the evaluation are available separately at [7].

To facilitate experimentation we have packaged YewPar, and the search applications, as a Docker image. Unfortunately, this makes it difficult to fully replicate the experiments in the paper, which require MPI, a 17-node compute cluster and many days of runtime. This document (1) demonstrates parallel execution by showing how to run YewPar on a single machine with 4 or more cores; and (2) demonstrates distributed memory execution by showing how to run YewPar under MPI within the Docker container.

For more systematic experimentation YewPar is open source (MIT) and available with usage instructions at: https://github.com/BlairArchibald/YewPar

A.1 Downloading and Running the Docker Image

We assume Docker is installed already. First time Docker users may need to run the following commands to ensure that Docker has the correct permissions:

sudo groupadd docker
sudo usermod -aG docker $USER

The YewPar Docker image (approx 325MB) can be downloaded from: https://zenodo.org/record/3597463

Once the image is downloaded, run the following commands, which will place you in an interactive bash shell with an environment set up to run the YewPar search applications:

docker load < yewpar-artifact-eval.tar.gz
docker run -it yewpar

A.2 Docker Image Structure

The YewPar Docker image is structured as follows:

File/Directory        Contents
/YewPar               Main working directory (all other files are in here)
src/                  YewPar root directory
src/lib               YewPar framework source
src/app               YewPar application sources
c++/                  MaxClique comparison code for Table 1
instances/            Instances for the applications, e.g. instances/maxclique/
kclique.sh            Run the kclique example (Fig 4)
kclique-small.sh      Run a shorter kclique example
example_commands.sh   Run a subset of Table 2

A.3 Ease of Use: Creating New Search Applications

Creating Search Trees is the first challenge when implementing a search application. Section 4.1 outlines the Lazy Node Generator abstraction that generates YewPar search trees, and these are relatively simple to use. An example is in /YewPar/src/app/bnb/knapsack/knapsack.hpp, lines 60 to 90. The key function next shows how child nodes are created by adding a candidate to the current solution, updating pos (a pointer into the candidate set) and creating a new candidate set (via copy_if). A hasNext function would simply return false if all candidates have been explored, that is, if pos > rem.size().

Parallelising Search and Exploring Alternate Parallelisations. A key message of the paper is the ease with which parallel search applications can be built in YewPar, and how easy it is to explore alternate parallelisations. This can be seen in /YewPar/src/app/bnb/maxclique/main.cpp. The code between lines 235 and 343 demonstrates that only the skeleton and some parameters need to be changed to alter the parallel coordination. Crucially, the Node Generator GenNode that generates the MaxClique search tree is never changed.

A.4 Recreating Some of the Evaluation Results

A.4.1 Table 1

Table 1 compares the runtimes of a hand-written MaxClique implementation (C++/OpenMP) with the YewPar skeleton-based implementation.

A script runExp.sh to repeat the experiment is available in the c++ sub-directory. Note this script expects a 16 core machine; this can be changed by editing the runSeq and runPar Makefiles.

If you do not wish to use the script you can manually run commands such as:

maxclique-14 --skeleton seq -f /YewPar/instances/maxclique/brock400_4.clq --hpx:threads 1
clique-omp /YewPar/instances/maxclique/brock400_4.clq

Here, maxclique-n can process graphs of up to 64 × n vertices. (To enable optimal use of memory, several maxclique binaries have been compiled, each supporting graphs of different maximal sizes.)

A.4.2 Figure 4

Figure 4 measures the scaling of three parallel YewPar skeletons for a k-clique search instance, namely the budget, stack stealing and depth-bounded skeletons. Parallel executions of the skeletons on four cores, together with a sequential execution, can be obtained as follows:

maxclique-14 -f /YewPar/instances/finitegeo/spreads_H44.clq --skeleton seq --decisionBound 33 --hpx:threads 1
maxclique-14 -f /YewPar/instances/finitegeo/spreads_H44.clq --skeleton budget -b 10000000 --decisionBound 33 --hpx:threads 4
maxclique-14 -f /YewPar/instances/finitegeo/spreads_H44.clq --skeleton stacksteal --chunked --decisionBound 33 --hpx:threads 4
maxclique-14 -f /YewPar/instances/finitegeo/spreads_H44.clq --skeleton depthbounded -d 2 --decisionBound 33 --hpx:threads 4

To emulate distributed memory execution of the skeletons within the Docker container you can run, for example:

mpiexec -n 2 -disable-hostname-propagation maxclique-14 -f /YewPar/instances/finitegeo/spreads_H44.clq --skeleton budget -b 10000000 --decisionBound 33 --hpx:threads 2

And similarly for the DepthBounded/StackStealing skeletons.
Note that -disable-hostname-propagation is required for MPI to work correctly with HPX. These commands are available in the kclique.sh script.

To execute on different numbers of cores select a different --hpx:threads n parameter. The number of workers (threads performing the search) is n-1 (or 1 if n==1). This implies that --hpx:threads 2 may not be faster than --hpx:threads 1, but --hpx:threads 4 should be.

As Figure 4 illustrates, the k-clique search has long runtimes on small numbers of cores, e.g. around 15 hours on a single fast core. To demonstrate the k-clique application on a smaller example, we provide the kclique-small.sh script that runs the k-clique application on the brock400_1 graph (proving the existence of a clique of size 27, and the non-existence of cliques of size 28). On a single core these runs should take only a few minutes each, and around 20 minutes in total.
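The Figure 4 single-machine runs can also be scripted. The sketch below uses a hypothetical helper, fig4_cmds (not part of the artifact), which only prints the four command lines from above so they can be inspected before being run inside the container; the binary name and instance path are those shipped in the Docker image.

```shell
# fig4_cmds prints (rather than executes) the Figure 4 command lines:
# one sequential run, then the three parallel skeletons on 4 HPX threads.
fig4_cmds() {
  inst=/YewPar/instances/finitegeo/spreads_H44.clq
  echo "maxclique-14 -f $inst --skeleton seq --decisionBound 33 --hpx:threads 1"
  for args in "budget -b 10000000" "stacksteal --chunked" "depthbounded -d 2"; do
    echo "maxclique-14 -f $inst --skeleton $args --decisionBound 33 --hpx:threads 4"
  done
}

fig4_cmds
```

Piping the output through sh (instead of inspecting it) would execute the runs in sequence.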

A.4.3 Table 2

Table 2 reports 9 speedups for each of 6 search benchmarks (54 data points) on 120 workers distributed over a Beowulf cluster. Recreating Table 2 requires a similarly sized cluster and many days of running time. To illustrate each of the search applications we provide a list of example commands in the example_commands.sh file in the root directory. These commands should give an idea of what the command line looks like for each application, so feel free to vary the parameters. As before, you can run MPI locally to experiment with executing YewPar over distributed memory.

If you wish to run the applications on a distributed memory cluster you will need to build YewPar from source; instructions are provided in https://github.com/BlairArchibald/YewPar/blob/master/README.md.