A Structure-Aware Satisfiability Solver

SymChaff: A Structure-Aware Satisfiability Solver Ashish Sabharwal Computer Science and Engineering University of Washington, Box 352350 Seattle, WA 98195-2350, U.S.A. [email protected] Abstract performing specific solvers on problems from many real- world domains including hardware verification (Biere et We present a novel low-overhead framework for encoding al. 1999; Velev & Bryant 2001), automatic test pattern and utilizing structural symmetry in propositional satisfiabil- generation (Konuk & Larrabee 1993; Stephan, Brayton, & ity algorithms (SAT solvers). We use the notion of complete multi-class symmetry and demonstrate the efficacy of Sangiovanni-Vincentelli 1996), planning (Kautz & Selman our technique through a solver SymChaff that achieves ex- 1992), and scheduling (Gomes et al. 1998). With a large ponential speedup by using simple tags in the specification of community of researchers working towards a better under- problems from both theory and practice. standing of SAT, it is not surprising that many competing Efficient implementations of DPLL-based SAT solvers are general purpose systematic SAT solvers have come into light routinely used in areas as diverse as planning, scheduling, in the past decade such as Grasp (Marques-Silva & Sakallah design automation, model checking, verification, testing, and 1996), Relsat (Bayardo Jr. & Schrag 1997), SATO (Zhang algebra. A natural feature of many application domains is the 1997), zChaff (Moskewicz et al. 2001), Berkmin (Gold- presence of symmetry, such as that amongst all trucks at a berg & Novikov 2002), and March-eq by Marijn Heule and certain location in logistics planning and all wires connecting Hans van Maaren. All of these solvers fall into the cate- two switch boxes in an FPGA circuit. Many of these prob- gory of systematic DPLL-based solvers, and build upon a lems turn out to have a concise description in many-sorted basic branch and backtrack technique (Davis, Logemann, & first order logic. This description can be easily specified by the problem designer and almost as easily inferred automat- Loveland 1962). With the addition of features such as smart ically. SymChaff, an extension of the popular SAT solver branch selection heuristics, conflict clause learning, random zChaff, uses information obtained from the “sorts” in the first restarts, conflict-directed backjumping, fast unit propagation order logic constraints to create symmetry sets that are used using watched literals, etc., these have been quite effective to partition variables into classes and to maintain and utilize in solving challenging problems from various domains. symmetry information dynamically. Despite the success, one aspect of many theoretical as Current approaches designed to handle symmetry include: well as real-world problems that we believe has not been (A) symmetry breaking predicates (SBPs), (B) pseudo- fully exploited is the presence of symmetry. Symmetry oc- Boolean solvers with implicit representation for counting, (C) curs naturally in many application areas. For example, in modifications of DPLL that handle symmetry dynamically, FPGA routing, all wires or channels connecting two switch and (D) techniques based on ZBDDs. SBPs are prohibitively boxes are equivalent; in circuit modeling, all inputs to a mul- many, often large, and expensive to compute for problems tiple fanin AND gate are equivalent; in planning, all boxes such as the ones we report experimental results for. Pseudo- Boolean solvers are provably exponentially slow in certain that need to be moved from city A to city B are equivalent; in symmetric situations and their implicit counting representa- multi-processor scheduling (or cache coherency protocols), tion is not always appropriate. Suggested modifications of all available processors (or caches, respectively) are typi- DPLL either work on limited global symmetry and are dif- cally equivalent. While there has been work on using this ficult to extend, or involve expensive algebraic group com- equivalence or symmetry in domain-specific algorithms and putations. Finally, techniques based on ZBDDs often do not techniques, current general purpose complete SAT solvers compare well even with ordinary DPLL-based solvers. Sym- are unable to fully capitalize on symmetry as suggested by Chaff addresses and overcomes most of these limitations. our experimental results. Introduction Previous Work In recent years, general purpose propositional satisfia- A technique that has worked quite well in handling sym- bility algorithms (SAT solvers) have been designed and metry is to add symmetry breaking predicates (SBPs) to the shown to be very successful in handling and even out- input specification to weed out all but the lexically-first solu- tions (Crawford et al. 1996). Tools such as Shatter (Aloul, Copyright c 2005, American Association for Artificial Intelli- Markov, & Sakallah 2003) use graph isomorphism detectors gence (www.aaai.org). All rights reserved. like Saucy to generate SBPs. This latter problem of com- puting graph isomorphism is not known to have any polyno- rather than its CNF representation to obtain symmetry in- mial time solution, and is conjectured to be strictly between formation. (We give concrete examples of this later in the the complexity classes P and NP (see e.g. Kobler,¨ Schoning,¨ paper.) This leads to several advantages. The high level de- & Toran´ 1993). Further, the number of SBPs one needs to scription of a problem is typically very concise and reveals add in order to break all symmetries may be prohibitively its structure much better than a relatively large set of clauses large. This is typically handled by discarding “large” sym- encoding the same problem. It is simple, in many cases al- metries. This may, however, result in a much slower SAT most trivial, for the problem designer to specify global sym- solution as indicated by some of our experiments. metries at this level using straightforward “tagging.” If one Solvers such as PBS (Aloul et al. 2002a), pbChaff prefers to compute these symmetries automatically, off-the- (Dixon, Ginsberg, & Parkes 2004), and Galena (Chai & shelf graph isomorphism tools can be used. Using these Kuehlmann 2003) utilize non-CNF formulations known as tools on the concise high level description will, of course, pseudo-Boolean (PB) inequalities. They are based on the be much faster than using the same tools on a substantially Cutting Planes proof system which is known to be strictly larger CNF encoding. stronger than the resolution proof system on which DPLL While it is natural to pick a variable and branch two ways type CNF solvers are based (Cook, Coullard, & Turan 1987). by setting it to TRUE and FALSE, this is not necessarily the Since this more powerful proof system is difficult to imple- best option when k variables, x1, x2, . , xk, are known to ment in its full generality, PB solvers often implement only be arbitrarily interchangeable. The same applies to more a subset of it, typically learning only CNF clauses or re- complex symmetries where multiple classes of variables si- stricted PB constraints upon a conflict. PB solvers may lead multaneously depend on an index set I = {1, 2, . , k} and to purely syntactic representational efficiency in cases where can be arbitrarily interchanged in parallel within their re- a single constraint such as y1+y2+...+yk ≤ 1 is equivalent spective classes. We formalize this as a k-complete multi- k (k + 1) to 2 binary clauses. More importantly, they are relevant to class symmetry and handle it using a -way branch symmetry because they sometimes allow implicit encoding. based on I that maintains completeness of the search and For instance, the single constraint x1 + x2 + ... + xn ≤ m shrinks the search space by as much as O(k!). The index over n variables captures the essence of the pigeonhole for- sets are implicitly determined from the many-sorted first or- mula over nm variables (described in detail later) which is der logic representation of the problem at hand. We ex- provably exponentially hard to solve using resolution-based tend the standard notions of conflict and clause learning to methods without symmetry considerations. This implicit the multiway branch setting, introducing symmetric learn- representation, however, is not suitable in certain applica- ing. Our solver SymChaff integrates seamlessly with most tions such as clique coloring and planning that we discuss. of the standard features of modern SAT solvers, extending One could conceivably keep the CNF input unchanged but them in the context of symmetry wherever necessary. These modify the solver to detect and handle symmetries during include fast unit propagation, good restart strategy, effective the search phase as they occur. Although this approach is constraint database management, etc. quite natural, we are unaware of its implementation in a general purpose SAT solver besides sEqSatz (Li, Jurkowiak, Preliminaries & Purdom 2002) whose technique appears to be somewhat specific and whose results are not too impressive compared A propositional formula in conjunctive normal form (CNF) to zChaff itself. Related work has been done in specific ar- is a conjunction (AND) of clauses, where each clause is a eas of automatic test pattern generation (Marques-Silva & disjunction (OR) of literals and a literal is a propositional Sakallah 1997) and SAT-based model checking (Shtrichman (Boolean) variable or its negation. A pseudo-Boolean (PB) 2004), where the solver utilizes global information obtained formula is a conjunction of PB constraints,

Load more