
SAT Solving and its Relationship to CSPs
Fahiem Bacchus
University of Toronto

Overview
- Tremendous gains have been achieved over the last 5 years in SAT solving technology.
- Systematic backtracking based systems have become the best method for solving structured SAT instances.
- New theoretical insights have been gained into the behaviour of backtracking SAT solvers via their close relationship with resolution proofs.
- These insights have direct relevance to finite domain CSP solvers.


Resolution (SAT)

- A complete proof procedure for propositional logic that works on formulas expressed in Conjunctive Normal Form (CNF). (Robinson 1965)
  - Literal: a propositional variable p or its negation ¬p.
  - Clause: a disjunction of literals (a set).
  - CNF theory: a conjunction of clauses.

Resolution Proofs
- All (complete) backtracking algorithms for CSPs are implicitly generating resolution proofs.
- On problems with a solution, they still have to backtrack out of failed subtrees. Baker (1995), Mitchell (2002)

Resolution

- From two clauses (A, x) and (B, ¬x) the resolution rule generates the new clause (A, B), where A and B are sets of literals.
  - (A, B) is the resolvent.
  - x is the variable resolved on.
  - Duplicate literals are removed from the resolvent.
- Clauses arising from resolution are called derived clauses; denote C = R(C1, C2).

Resolution
- A resolution refutation Π of a CNF theory F is a sequence of clauses C1, C2, …, Cm such that
  - each Ci is either a member of F, or
  - Ci is a resolvent of two previous clauses in the proof: Ci = R(Cj, Ck), j, k < i,
  - and Cm = (), the empty clause.
- Π is also called a resolution proof.
- The SIZE of Π is the number of resolvents in it.
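As a concrete illustration (mine, not the talk's), the rule fits in a few lines of Python; here a clause is a frozenset of signed integers, with x and -x playing the role of a literal and its negation, so duplicate literals disappear automatically.

# A minimal sketch of the resolution rule; the representation is an assumption
# of this note, not the talk's: clauses are frozensets of signed ints.
def resolve(c1, c2, x):
    """R(c1, c2): resolve c1 (containing x) against c2 (containing -x)."""
    assert x in c1 and -x in c2
    return (c1 - {x}) | (c2 - {-x})

# R((A, x), (B, ¬x)) = (A, B): here A = {1}, B = {2}, and x = 3.
print(resolve(frozenset({1, 3}), frozenset({-3, 2}), 3))   # frozenset({1, 2})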


Resolution DAG
- Any resolution proof can be represented as a DAG.
  - Nodes are clauses in the proof.
  - Every clause Ci that arises from a resolution step has two incoming edges, one from each of the clauses that were resolved together to obtain Ci.
  - The arcs are labeled by the variable that was resolved away to obtain Ci.
  - Clauses in F have no incoming edges.

[Figure: the DAG of an example refutation whose clause sequence is (¬Y), (Z,¬X), (¬Z,Y), (Y,¬X), (¬X), (Z), (Q,¬Z,X), (¬Q,X), (¬Z,X), (X), (¬X), (); input clauses are at the leaves and the empty clause () is at the root.]
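A refutation in the sequence form defined above can be checked mechanically. The sketch below is my own illustration on a tiny three-clause formula (not the DAG from the slide); each step names the derived clause, the indices of its two parents, and the pivot variable.

def is_refutation(F, proof):
    # seen[i] is the i-th clause of the sequence: members of F first, then derived clauses.
    seen = list(F)
    for clause, (i, j), x in proof:
        c1, c2 = seen[i], seen[j]
        assert x in c1 and -x in c2, "pivot must occur positively in c1, negatively in c2"
        assert clause == (c1 - {x}) | (c2 - {-x}), "not a valid resolvent"
        seen.append(clause)
    return bool(proof) and not proof[-1][0]      # last derived clause is ()

F = [frozenset(s) for s in ({1}, {-1, 2}, {-2})]   # (x), (¬x,y), (¬y)
proof = [(frozenset({2}), (0, 1), 1),              # R((x), (¬x,y)) = (y)
         (frozenset(), (3, 2), 2)]                 # R((y), (¬y)) = ()
print(is_refutation(F, proof))                     # True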

Restrictions of Resolution

- A number of restricted forms of resolution can be defined, where, e.g., we require the DAG to be a tree.
- The reason the restricted forms have been developed is that the restrictions can make it easier to find a proof.

Tree resolution
- The DAG is required to be a tree.
  - Clauses derived during the proof can therefore only be used once.
- Work must be duplicated to rederive clauses that need to be used more than once.


Tree Resolution

[Figure: a resolution DAG over clauses C1, …, C5 in which one clause (C2) is used twice, and the corresponding tree in which it must be rederived.]

Ordered Resolution
- The variables resolved on along any path in the DAG to the empty clause must respect some fixed ordering of the variables.

Ordered Resolution

[Figure: the resolution DAG from the earlier example, marked "Not ordered".]

Regular Resolution
- Along any path in the DAG to the empty clause the sequence of variables resolved away cannot contain any duplicates.


Regular Resolution

[Figure: two derivations of the clause (Q,Z,Y,P) from the same input clauses; the left one is not regular, the right one is regular.]

Negative Resolution
- One of the clauses in each resolution step must contain only negative literals (a negative clause).
- This is complete!
- Note F must contain at least one negative clause, else the "all true" truth assignment is a satisfying model.

Relative Power
- A general formalism for comparing the power of different proof systems was developed by Cook and Reckhow (1979).
- For our purposes we simply look at the minimal size refutation proof (the number of clauses in the proof).
- #R(F): the minimal size R-refutation of F among all possible R-refutations of F.
- For a family of formulas Fi we look at how #R(Fi) grows with i.
- Let S and T be two restrictions of resolution. S p-simulates T if there exists a polytime computable function f such that:
  - for any S-refutation Π of a formula F, f(Π) is a T-refutation of F.
  - Note that this means f(Π) can't be more than polynomially longer than Π, so #T(F) is no more than polynomially larger than #S(F) for any formula F.


Relative Power Known Results
- Buresh-Oppenheim and Pitassi (2003): many new results and a summary of previously proved results.

(Does the proof system in the row p-simulate the one in the column?)

             Regular   Negative   Ordered   Tree
  Regular       -         No        Yes     Yes
  Negative      No        -         No      Yes
  Ordered       No        No        -       No
  Tree          No        No        No      -

- Regular always yields shorter proofs than either Ordered or Tree.
- Negative and Regular are incomparable.
- Ordered and Tree are incomparable.

Relative Power Known Results

- It is also known that none of these restrictions can p-simulate general resolution.

Solving SAT


DP & DPLL (DLL)
- The two earliest algorithms for solving SAT actually predate resolution.
- DP: Davis-Putnam (1960), a variable elimination technique.
- DPLL: Davis-Logemann-Loveland (1962), a backtracking search algorithm.

DP
- Pick a variable ordering (one that has a low elimination width if possible): X1, X2, …, Xn.
- Starting with the original set of clauses F, at the i-th stage:
  - add to F all possible resolvents that can be generated by resolving on Xi;
  - remove from F all clauses containing Xi or ¬Xi;
  - if the empty clause is generated, stop.
- The input set of clauses (the formula F) is UNSAT iff this process generates the empty clause.
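A hedged sketch of this elimination loop, under the same clause representation as the earlier sketches (the function and variable names are mine):

def dp(clauses, order):
    clauses = set(clauses)
    for x in order:                                # i-th stage: eliminate x
        pos = [c for c in clauses if x in c]
        neg = [c for c in clauses if -x in c]
        for cp in pos:                             # add all resolvents on x
            for cn in neg:
                r = (cp - {x}) | (cn - {-x})
                if any(-l in r for l in r):
                    continue                       # skip tautologous resolvents
                if not r:
                    return "UNSAT"                 # empty clause generated
                clauses.add(r)
        clauses -= set(pos) | set(neg)             # drop clauses mentioning x or ¬x
    return "SAT"

# The formula from the example on the next slide, with a,b,c encoded as 1,2,3.
F = [frozenset(s) for s in ({1,2,3}, {-1,2,3}, {-2,3}, {1,-2,-3}, {-1,-2,-3}, {2,-3})]
print(dp(F, [1, 2, 3]))                            # UNSAT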

DP

Clauses present at each elimination stage (order a, b, c):

  [a]           [b]         [c]
  (a,b,c)       (b,c)       (c)      ()
  (¬a,b,c)      (¬b,c)      (¬c)
  (¬b,c)        (¬b,¬c)
  (a,¬b,¬c)     (b,¬c)
  (¬a,¬b,¬c)
  (b,¬c)

Potentially many redundant clauses are generated, but an ordered resolution is contained in these clauses.


DP
- Every DP proof contains an ordered resolution, and thus it can never be shorter than the smallest ordered resolution refutation.
  - Note lower bounds are wrt any possible ordered resolution (i.e., any ordering).
- In practice, DP's space requirements are prohibitive.
  - There have been some attempts to use ZBDDs to represent the clauses compactly.
  - Still not competitive with the current best techniques.

DP and Ordered Resolution
- Note also that every ordered resolution can be found inside of a DP refutation:
  - just follow the same order.
  - Since DP generates all possible ordered refutations along that order, it might terminate before completing the specified ordered refutation (by finding a shorter ordered refutation).
  - DP can also waste a lot of time generating clauses that are not needed for the refutation.

DPLL

- Developed shortly after DP, DPLL is based on backtracking search. The connection to resolution was realized later.
- One
  - picks a literal (a true or false variable),
  - simplifies the formula based on that literal,
  - recursively solves the simplified formula,
  - and, if the simplified formula is UNSAT, tries the negation of the literal chosen.

DPLL Simplification
- Given a clausal theory F, we can simplify it by a literal ρ as follows:
  - F|ρ:
    - remove from F all clauses containing ρ;
    - remove ¬ρ from all of the remaining clauses.
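A minimal rendering of F|ρ under the same representation (a sketch, not the talk's code):

def simplify(F, rho):
    """F|rho: drop every clause containing rho, delete ¬rho from the rest."""
    return [c - {-rho} for c in F if rho not in c]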


Unit Propagation
- In addition DPLL employs unit propagation:
  - if F|ρ contains any unit clauses, e.g. (¬x), then further simplify F|ρ by the literal in the unit clause, i.e., generate (F|ρ)|¬x;
  - unit propagation is the iterative application of this simplification until the resultant theory has no unit clauses (or contains the empty clause).
- More powerful forms of propagation are examined in Bacchus (2002).

DPLL
[Figure: a DPLL search tree on the clauses (a,b,c), (¬a,b,c), (¬b,c), (a,¬b,¬c), (¬a,¬b,¬c), (b,¬c), branching on c, then b, then a.]
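Putting the pieces together, a hedged sketch of DPLL with unit propagation (naive branching, names mine); simplify() is repeated so the fragment stands alone:

def simplify(F, rho):                              # as in the earlier sketch
    return [c - {-rho} for c in F if rho not in c]

def unit_propagate(F):
    while True:
        if any(len(c) == 0 for c in F):
            return F                               # contains the empty clause
        units = [next(iter(c)) for c in F if len(c) == 1]
        if not units:
            return F                               # no unit clauses left
        F = simplify(F, units[0])                  # (F|rho)|unit, iterated

def dpll(F):
    F = unit_propagate(F)
    if any(len(c) == 0 for c in F):
        return False                               # conflict: UNSAT below this point
    if not F:
        return True                                # every clause satisfied
    l = next(iter(F[0]))                           # pick some literal to branch on
    return dpll(simplify(F, l)) or dpll(simplify(F, -l))

F = [frozenset(s) for s in ({1,2,3}, {-1,2,3}, {-2,3}, {1,-2,-3}, {-1,-2,-3}, {2,-3})]
print(dpll(F))                                     # False: the example formula is UNSAT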

DPLL
[Figure: the DPLL search tree from the previous slide with each node labeled by the clause derived there; the root carries the empty clause ().]
- A tree refutation is embedded in every DPLL proof of UNSAT.
  - Every resolvent consists of literals negated by the prefix of assignments.

DPLL
- Every DPLL proof contains a tree resolution, and thus it can never be shorter than the smallest tree resolution refutation.
  - Note it need not be ordered, so the minimal size DPLL tree can be bigger or smaller than the minimal size DP proof.


DPLL and Tree Resolution
- Note also that every tree resolution can be found inside of a DPLL refutation:
  - make the DPLL search mimic a depth-first search of the tree refutation;
  - always instantiate the negation of the literal that was resolved on in the child node.

[Figure: a node (a,b,¬c,¬d) of the tree resolution with children (a,b,x,¬c) and (¬x,¬d), and the corresponding node of the DPLL search branching on ¬x / x.]

DPLL and Tree Resolution
- In general DPLL search will also do a lot of extra work not required for the tree resolution, since it does not employ intelligent backtracking.
- Hence, DPLL in its original form is a pretty poor algorithm, although it is "reasonable" for proving UNSAT on random problems.

Resolution and Intelligent Backtracking
- If we keep track of the refutation being generated, we can use the derived clauses to perform intelligent backtracking.
- Keeping track of the resolution refutation is precisely what CBJ (conflict directed backjumping) does.


DPLL
[Figure: a DPLL tree below the additional decisions ¬f and ¬x; the clauses derived in the subtree, e.g. (b,¬c,x,f), (¬c,x,f), and (x,f), mention x and f but none of the other intervening decisions.]

Resolution and Intelligent Backtracking
- If instrumented to keep track of the resolution refutation, and thus to perform "intelligent backtracking" (non-moronic backtracking), DPLL can find tree resolutions fairly effectively.
- Still some inefficiencies:
  - it can spend time in subtrees that don't contribute to the final refutation.
- Still limited to relatively weak tree resolution.

Resolution and Intelligent Backtracking
- Modern techniques move DPLL beyond the limited power of tree resolution.

Solving Finite Domain CSPs


Translation to Propositional Logic

- A CSP consists of a set of variables Vi and constraints Cj.
- Each variable has a domain of values Dom[Vi] = {d1, …, dm}.
- Consider the set of propositions ⟨Vi=dj⟩, one for each value of each variable.
  - ⟨Vi=dj⟩ means that Vi has been assigned the value dj.
- ¬⟨Vi=dj⟩ means that Vi has not been assigned the value dj:
  - perhaps it has not been assigned any value, or it has been assigned a different value;
  - true when dj has been pruned from Vi's domain.

Translation to Propositional Logic

- For simplicity:
  - write Vi=dj instead of ⟨Vi=dj⟩,
  - and Vi≠dj instead of ¬⟨Vi=dj⟩.
  - But be aware that these are actually propositional variables that can be assigned true or false.

- Each constraint C is over some set of variables X1,…,Xk: C(X1,…,Xk).
- Typically a constraint is defined to be the set of tuples of assignments to its variables that satisfy the constraint.
- Equivalently, we can look at the complement:
  - the set of tuples of assignments that falsify the constraint.
  - E.g., (X1=a, X2=b, …, Xk=h) falsifies C(X1,…,Xk).


Translation to Propositional Logic

- A falsifying tuple is typically called a nogood: a set of assignments that cannot be extended to a solution of the CSP.
  - If the tuple falsifies a constraint of the CSP, it can't be extended to a solution of the CSP.
- Nogoods are clauses.
  - A nogood (X1=a, X2=b, …, Xk=h) asserts ¬(X1=a ∧ X2=b ∧ … ∧ Xk=h),
  - i.e., (X1≠a ∨ X2≠b ∨ … ∨ Xk≠h), which is a clause.

- So each constraint is a set of clauses.
- All of the constraints of the CSP thus form a set of clauses.

Translation to Propositional Logic
- Finally, we must deal with the fact that the variables have non-binary domains. For each variable V with Dom[V] = {d1, …, dk} we obtain the following clauses:
  - (V=d1, V=d2, …, V=dk)   (must have a value)
  - (V≠d1,V≠d2), (V≠d1,V≠d3), …, (V≠d1,V≠dk), …, (V≠d2,V≠d3), …, (V≠dk-1,V≠dk)   (must have a unique value)

FC
- For simplicity look at Forward Checking, and we will see that:
  - embedded in a failed FC search tree is a tree resolution;
  - keeping track of the resolution refutation gives us CBJ;
  - the resolution also makes the improvement of backpruning (Bacchus 2000) obvious.
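A small sketch of generating these domain clauses for one variable, using "V=d" / "V!=d" strings as the propositional literals (the representation and names are illustrative, not the talk's):

def domain_clauses(var, dom):
    must_have_value = {f"{var}={d}" for d in dom}             # (V=d1, ..., V=dk)
    unique_value = [{f"{var}!={d1}", f"{var}!={d2}"}          # (V≠di, V≠dj) for i < j
                    for i, d1 in enumerate(dom)
                    for d2 in dom[i + 1:]]
    return [must_have_value] + unique_value

for clause in domain_clauses("V", ["a", "b", "c"]):
    print(sorted(clause))
# ['V=a', 'V=b', 'V=c'], then ['V!=a', 'V!=b'], ['V!=a', 'V!=c'], ['V!=b', 'V!=c']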


FC
- FC maintains node consistency.
  - When a constraint becomes unary (all but one of its variables have been instantiated), we enforce node consistency on that constraint to prune the domain of the sole remaining variable.
  - This definition works with both binary and n-ary constraints.

FC
[Search path X=a, Z=b, Q=c, R=a: domain wipe-out of V.]
- Each value of V was removed because it falsified some nogood from some constraint.

9/15/2005 Fahiem Bacchus 51 9/15/2005 Fahiem Bacchus 52 FC FC X=a X=a {Dom[Y] = {a,b,c ڗ {Dom[V] = {a, b, c ڗ Ӎ (Va,Xa) Ӎ (Ya,Xa,Zb) Z=b Ӎ (Vb,Ra,Xa) Z=b Ӎ (Yb,Rb) Ӎ (Vc,Ra,Xa) Ӎ (Yc, Xa,Ra) Resolving these against Q=c ڗ Q=c Resolving these against ڗ V=a,V=b,V=c), we obtain the new) clause (Xa,Ra): a clause (Y=a,Y=b,Y=c), we obtain the R=a containing the current value of R. R=b new clause (Xa,Zb,Rb) Again we backtrack and try a ڗ FC now backtracks and tries a Domain ڗ Domain wipe-out of different value for R. wipe-out of V Y different value for R.


FC
- Perhaps R=c has already been pruned by FC before we reached this node.
- Then there is a clause forcing R≠c, e.g. (Z≠b, R≠c).
- Now we have a clause forcing the removal of each of R's values:
  - either computed via resolution from the subtree below that assignment,
  - or from forward checking above.

FC
- Now we have a clause for each value of R:
  - (X≠a, R≠a)
  - (X≠a, Z≠b, R≠b)
  - (Z≠b, R≠c)
- Resolve these against (R=a, R=b, R=c) to obtain (X≠a, Z≠b).

FC-CBJ
- The clause (X≠a, Z≠b) tells us that we can soundly backtrack to undo Z=b.
- Ordinary FC would instead continue with the next value of Q.
- But embedded in each failed subtree of the FC search tree is a tree resolution.
- FC-CBJ simply keeps track of the resolution refutation, and uses the clause produced to backtrack to undo Z=b.

Extended FC
- The clauses we learned for the values of R,
  - (X≠a, R≠a)
  - (X≠a, Z≠b, R≠b)
  - (Z≠b, R≠c),
  tell us that we also need not try the value R=a again until we backtrack even further to undo X=a.
- Keeping track of this information allows us to "backprune" values. Bacchus (2000)


Negative Resolution
- Notice the resolution steps involved:
  - (R=a,R=b,R=c), (X≠a,R≠a) gives (X≠a,R=b,R=c)
  - (X≠a,R=b,R=c), (X≠a,Z≠b,R≠b) gives (X≠a,R=c,Z≠b)
  - (X≠a,R=c,Z≠b), (Z≠b,R≠c) gives (X≠a,Z≠b)
- These are negative resolution steps (one of the clauses in each step is always negative).
- I.e., FC-CBJ actually embeds a negative tree resolution, which is even more limited in power. Mitchell (2003)

Negative Resolution
- In fact, in the standard techniques all clauses (nogoods)
  - in the original constraints are negative, and
  - learned during search are negative.
- Return to this later.

Modern SAT Solvers

Clause Learning (CL)
- The main feature of modern SAT solvers is the development of new techniques to support effective clause learning.
  - Without clause learning, DPLL and CSP backtracking algorithms are both limited to tree resolution (negative tree resolution in the case of CSPs).
  - Modern solvers are N orders of magnitude faster than the best implementations of standard DPLL on many problems, where N is probably greater than 6.


DPLL+CL
- DPLL
  - picks a literal,
  - reduces the theory with that literal,
  - this perhaps induces some sequence of further literals all forced by unit propagation,
  - and stops and backtracks when some clause becomes falsified.

Failed Path
[Figure: the trail X; A, ¬B, C; ¬Y; D, ¬E, F; Z; H, I, ¬J, ¬K, ending at the clause (K,¬I,¬H,¬F,E,¬D,B).]
- X, Y, Z: decision variables.
- A, ¬B, C, D, ¬E, F, H, I, ¬J, ¬K: forced by unit propagation.
- (K,¬I,¬H,¬F,E,¬D,B): falsified clause. This clause is called a conflict clause: it is falsified by the current path.

Forced Literals
- Each forced literal was forced by some clause becoming unit. We keep track of the forcing clause as part of the unit propagation process.
- Each clause reason contains:
  - one true literal on the path (the literal it forced), and
  - literals falsified higher up on the path.

[Figure: the trail with its reasons:
  X (decision); A ← …; ¬B ← …; C ← …;
  ¬Y (decision); D ← (D,B,Y); ¬E ← …; F ← …;
  Z (decision); H ← (H,B,E,¬Z); I ← (I,¬H,¬D,¬X); ¬J ← (¬J,¬H,B); ¬K ← (¬K,¬I,¬H,E,B);
  conflict clause (K,¬I,¬H,¬F,E,¬D,B).]


Forced Literals
- Hence we can resolve away any forced literal in the conflict clause. This will yield a new conflict clause.
  1. (K,¬I,¬H,¬F,E,¬D,B), (D,B,Y) gives (K,¬I,¬H,¬F,E,B,Y)
  2. (K,¬I,¬H,¬F,E,¬D,B), (¬K,¬I,¬H,E,B) gives (¬I,¬H,¬F,E,¬D,B)
  3. (K,¬I,¬H,¬F,E,¬D,B), (H,B,E,¬Z) gives (K,¬I,¬F,E,¬D,B,¬Z)
  4. …

Conflict Clauses
- Any forced literal in any conflict clause can be resolved on to generate a new conflict clause.
- If we continued this process until all forced literals were resolved away, we would end up with a clause containing decision literals only (the all-decision clause).
- But empirically the all-decision clause tends not to be very effective.
  - Too specific to this particular part of the search to be useful later on.

Conflict Clauses

- Various choices exist as to how to generate a conflict clause on failure.
- The most popular form of clause learning is 1-UIP learning (Zchaff). (Now almost the standard.)

1-UIP Clauses
- Start with C equal to the original conflict clause.
  1. Let n be the number of literals in C at or below the last decision variable.
  2. If n > 1, let C be the result of resolving away the deepest forced literal; go to 1.
  3. Else store C for future use and use it for backtracking.
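A hedged sketch of that loop; the trail/reason/level bookkeeping is assumed (it is not spelled out on the slides), literals are signed ints, and reason[v] is the clause that forced variable v:

def one_uip(conflict, trail, reason, level, cur_level):
    # Every literal in C is falsified by the current path, so -lit is on the trail.
    C = set(conflict)
    while True:
        deep = [lit for lit in C if level[abs(lit)] == cur_level]
        if len(deep) <= 1:
            return frozenset(C)                    # one deepest-level literal left: the 1-UIP clause
        # resolve away the most recently assigned (deepest) forced literal in C
        lit = max(deep, key=lambda l: trail.index(-l))
        C = (C - {lit}) | (set(reason[abs(lit)]) - {-lit})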


1-UIP Clauses
- This process must terminate, since resolving away a literal can only introduce literals above it on the path.
- The last remaining literal from the deepest level in the 1-UIP clause may or may not be the decision literal.

1-UIP
[Figure: the same trail and reasons as before, with conflict clause (K,¬I,¬H,¬F,E,¬D,B).]
  1. (K,¬I,¬H,¬F,E,¬D,B), (¬K,¬I,¬H,E,B) gives (¬I,¬H,¬F,E,¬D,B)
  2. (¬I,¬H,¬F,E,¬D,B), (I,¬H,¬D,¬X) gives (¬H,¬F,E,¬D,B,¬X)

Backtracking
- The advantage of a 1-UIP clause (or any unique implication point clause) is that it forces the single literal from the deepest level.
- We can backtrack to the point where that literal is forced and augment the set of forced literals at that level by the new unit propagation.

1-UIP
[Figure: after backtracking, the trail X; A, ¬B, C; ¬Y; D, ¬E, F is kept; the learned clause (¬H,¬F,E,¬D,B,¬X) forces ¬H at this level, and more unit propagation follows.]


Backtracking
- Note that the decision literal has not been exhausted. We don't know if the current prefix with ¬Z instead of Z might have a solution.

Far Backtracking
- "Far backtracking", i.e., backtracking to the point where we have the new unit implicant instead of backtracking to undo the deepest decision, has two motivations:
  - ¬H is implied at this higher level, so undoing all of the work and starting again is the easiest way to take this constraint into account. (See Bacchus 2000 for an alternate approach of back-forcing / backpruning values.)
  - Perhaps heuristically it might be better to start the search under the newly discovered implication all over again.

Far Backtracking
- E.g., with far backtracking, whenever a unit conflict is discovered the search returns to the root: a complete restart.
  - It is unclear if there is any real empirical evidence about whether or not this is more efficient.

Managing Large Numbers of Clauses
- Once we start learning a clause at every backtrack point, we soon have the problem of having to deal with lots of new clauses.
- The learned clauses are often far more numerous than the input clauses.


Watch Literals

- Some other techniques have been developed in the SAT literature that have made clause learning feasible.
  - More efficient unit propagation by the technique of watch literals:
    - in order for a clause to become unit, all but one of its literals must become false;
    - assign two watch literals per clause: only when a watch literal becomes false do we check the clause;
    - then try to find another watch, or determine that the clause has become unit or empty.
- Like ideas in current GAC algorithms where only a single support is maintained, but with no reliance on lexicographic ordering. Thus watches have the important benefit of requiring no work on backtrack.
- Also clever, empirically tuned techniques for where to locate the watches and how to store clauses in memory, designed to maximize cache hits.
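A hedged sketch of the two-watched-literal scheme (my own rendering, not Zchaff's code): a clause keeps its two watched literals in its first two slots, and only the clauses watching a literal that has just become false are visited.

def value(lit, assign):                            # assign maps a variable to True/False
    v = assign.get(abs(lit))
    return None if v is None else (v if lit > 0 else not v)

def on_false(lit, assign, watches):
    """`lit` has just become false; returns (forced_literals, conflict_clause)."""
    units, conflict, keep = [], None, []
    for c in watches.pop(lit, []):
        if c[0] == lit:                            # keep the false watch in slot 1
            c[0], c[1] = c[1], c[0]
        for i in range(2, len(c)):                 # try to move the watch elsewhere
            if value(c[i], assign) is not False:
                c[1], c[i] = c[i], c[1]
                watches.setdefault(c[1], []).append(c)
                break
        else:                                      # no replacement watch exists
            keep.append(c)
            if value(c[0], assign) is False:
                conflict = c                       # clause is falsified
            elif value(c[0], assign) is None:
                units.append(c[0])                 # clause is unit: c[0] is forced
    watches[lit] = keep
    return units, conflict

On backtrack the watches are simply left where they are, which is the "no work on backtrack" benefit mentioned above.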

VSIDS Heuristics

- Additional success has been obtained from dynamic variable ordering heuristics that are very quick to compute: they don't require examining all unassigned variables.
- These heuristics favor literals that have appeared in recently learned clauses (a small sketch of such a scheme follows below).
  - The claimed intuition: learning new clauses that can be resolved against recent clauses.

The Power of Clause Learning
- Beame, Kautz, and Sabharwal (2003) showed that regular resolution cannot p-simulate clause learning.
  - I.e., there exist formulas with short CL proofs but long regular resolution proofs.
- Recently Pitassi & Hertel (not yet published) have shown that CL can p-simulate regular resolution.
  - I.e., with clause learning DPLL becomes strictly more powerful than regular resolution, and thus a major advance over standard tree-resolution-limited DPLL.
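Returning to the VSIDS-style heuristic described above, here is a hedged sketch of such an activity scheme (a simplification, not Chaff's exact implementation):

from collections import defaultdict

class ActivityHeuristic:
    def __init__(self, decay=0.95):
        self.score = defaultdict(float)
        self.decay = decay

    def clause_learned(self, clause):
        for lit in clause:                         # favour variables in recent clauses
            self.score[abs(lit)] += 1.0

    def on_conflict(self):
        for v in self.score:                       # periodically decay old activity
            self.score[v] *= self.decay

    def pick_branch_variable(self, unassigned):
        # cheap to compute: compare cached scores, no scan over all clauses
        return max(unassigned, key=lambda v: self.score[v])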


The Power of Clause Learning

- We are still working on the question of whether or not CL is as powerful as general resolution.

Improving CSP Solvers

- FC-CBJ is no more powerful than tree resolution; it is also bounded by negative resolution.
- How do we gain from advances in SAT?
- Ideas along this line have been pursued by my PhD student George Katsirelos, who has written a general CSP solver toolkit called EFC based on our ideas.
  - http://www.cs.toronto.edu/~gkatsi/efc/

A. Unrestricted clause (nogood) learning.
B. Learning non-negative clauses.
C. Improving the input clauses.
D. Better clauses (nogoods) from GAC.
E. Better clauses (nogoods) from propagators.


A. Unrestricted NoGood Learning
- Standard work on clause (nogood) learning has concentrated on various restricted forms, e.g., k-relevance bounded, length bounded, etc.
- Our first attempt was to utilize SAT techniques for the efficient handling of large numbers of clauses (watch literals, clause databases) to allow storing as many nogoods as can fit into main memory.
- When standard techniques are used to learn the nogoods, the results were occasionally very good; however, often the results were not great.

Negative Resolution: FC (reminder)
- With a clause for each value of R:
  - (X≠a, R≠a)
  - (X≠a, Z≠b, R≠b)
  - (Z≠b, R≠c)
- Resolve these against (R=a, R=b, R=c) to obtain (X≠a, Z≠b).

Negative Resolution

- Each resolution step is a negative resolution.
- All learned nogoods are negative clauses.
  - This restricts the power of the search to negative resolution.

No Unit Propagation
- Unit propagation over the learned clauses never propagates!
  - (X≠a, Z≠b): when X=a this becomes unit, forcing Z≠b (the pruning of b from Z's domain).
  - But now Z≠b can only make other learned clauses true; it cannot make any of them unit.
  - So one never gets very much further than the original clauses.
  - In contrast, unit propagation in SAT can often set the values of hundreds of literals after each decision.


B. Learning Non-Negative Clauses

- The idea is simple: when learning a new clause, simply don't resolve away all of the positive literals!
- With a clause for each value of R:
  - (X≠a, R≠a)
  - (X≠a, Z≠b, R≠b)
  - (Z≠b, R≠c)
- Resolve these against (R=a, R=b, R=c) to obtain
  - (X≠a, R=b, R=c)
  - or (Z≠b, R=a, R=c)
  - instead of doing all of the resolution steps.

B. Learning Non-Negative Clauses
- We developed a first-decision clause learning scheme, where we keep resolving away literals (assignments/non-assignments) in the clause until only the decision literal from the deepest level remains.
- This allows us to learn non-negative clauses and perform intelligent backtracking, and it seems to work better than the 1-UIP scheme in CSPs.

Unit Propagation
- (Z≠b, R=a, R=c): now if we prune a from R's domain and assign Z=b, this mixed clause forces us to assign R=c.
- That assignment can make other clauses unit, pruning other values or forcing other assignments.

B. Learning Non-Negative Clauses
- "Generalized nogood" learning often gives dramatic performance improvements over FC-CBJ plus learning standard nogoods (negative clauses), which in turn is better than FC-CBJ without any learning.
- It also provably adds power over standard negative clauses (Katsirelos & Bacchus 2005).
  - See also Hwang & Mitchell (2005) on how 2-way branching also has the potential to get around negative resolution.


C. Improving Input Clauses

- The input clauses are all negative clauses. This also limits the effectiveness of resolution.
- One can generalize these clauses. E.g., say the constraint C(X,Y,Z) with Dom = {a, b, c} contains the clauses
  - (X≠a, Y≠a, Z≠a) and
  - (X≠b, Y≠a, Z≠a):
  - these two clauses can be replaced by the single clause (X=c, Y≠a, Z≠a).

C. Improving Input Clauses
- Constraints can be optimized (and converted from negative clauses) using, e.g., information theory based decision tree algorithms.
- We haven't yet completed an empirical evaluation of this idea.
- Note the same idea can be used to make GAC-schema checking faster.

GAC

- Most CSP solvers use GAC, and GAC is empirically much more effective than FC.
- How do we use clause learning ideas to improve GAC?

Standard Technique
- Inductively assume that every pruned value is labeled by a (negative) clause that caused the pruning.
- How do we compute a clause to label a value newly pruned by GAC on a constraint C?


Standard Technique

- The standard technique is to use the union of the clauses that pruned any value of any of the variables of the constraint. [Chen 2000]

Standard Technique
- Prunings so far, with the assignments that logically imply them:
  - (H=a ∧ I=b ∧ J=a) implies X≠a
  - (E=a ∧ F=b ∧ G=a) implies Y≠b
  - (Z=b) implies Z≠c
  - (Z=b) implies Z≠a
- GAC on C(X,Y,Z) now prunes X=b.
- Therefore H=a ∧ I=b ∧ J=a ∧ E=a ∧ F=b ∧ G=a ∧ Z=b implies X≠b.
- In clause form: (H≠a, I≠b, J≠a, E≠a, F≠b, G≠a, Z≠b, X≠b).
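As a sketch (naming mine), the standard technique simply unions the recorded reasons for the already-pruned values and adds the newly pruned literal; on the example above it reproduces the clause just derived:

def standard_reason(prunings, new_pruned_lit):
    """prunings: for each already-pruned value of a variable of C, the set of
    literals (its labelling clause minus the pruned literal itself)."""
    reason = set()
    for r in prunings:
        reason |= r
    return reason | {new_pruned_lit}

prunings = [{"H!=a", "I!=b", "J!=a"},              # why X=a was pruned
            {"E!=a", "F!=b", "G!=a"},              # why Y=b was pruned
            {"Z!=b"},                              # why Z=c was pruned (Z=b assigned)
            {"Z!=b"}]                              # why Z=a was pruned
print(sorted(standard_reason(prunings, "X!=b")))
# ['E!=a', 'F!=b', 'G!=a', 'H!=a', 'I!=b', 'J!=a', 'X!=b', 'Z!=b']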

Standard Technique
[The same example: (H=a ∧ I=b ∧ J=a) implies X≠a, (E=a ∧ F=b ∧ G=a) implies Y≠b, (Z=b) implies Z≠c and Z≠a; GAC on C(X,Y,Z) prunes X=b.]
- We obtain only negative clauses.
- The resulting clause is very specific to this particular part of the search space, and can be quite long (not as powerful).

D. Better clauses from GAC

An immediate and computationally inexpensive clause we can obtain is simply the set of pruned values that caused the new pruning:

X≠a ∧ Y≠b ∧ Z≠c ∧ Z≠a implies X≠b, i.e., (X=a, Y=b, Z=c, Z=a, X≠b), which simplifies to (Y=b, Z=c, Z=a, X≠b).


D. Better clauses from GAC

- This "all of the values pruned" clause is actually already captured by GAC processing itself.
  - (Y=b, Z=c, Z=a, X≠b)
  - Whenever all but one of these literals have been falsified, GAC will infer the remaining literal.
    - E.g., if we prune b from Y's domain, and c and a from Z's domain, GAC will detect that b must be pruned from X's domain.
    - Similarly, if X=b, and a and c have been pruned from Z's domain, GAC will prune b from Y's domain.

D. Better clauses from GAC
- However, even though this clause is in some sense "redundant", it can still be resolved against other clauses to produce powerful new clauses.
  - This increases the power of the search.
- In some sense this method is "converting" GAC inferences to clauses on the fly, and these clauses can then be used as inputs to more powerful resolution refutations.

D. Better clauses from GAC
- We can also resolve away various literals from this clause, to yield a variety of different clauses.
  - (Y=b, Z=c, Z=a, X≠b) gives (Y=b, Z≠b, X≠b)
[The same GAC example as before: (H=a ∧ I=b ∧ J=a) implies X≠a, (E=a ∧ F=b ∧ G=a) implies Y≠b, (Z=b) implies Z≠c and Z≠a; GAC on C(X,Y,Z) prunes X=b.]

D. Better clauses from GAC
- The "all values pruned" clause is empirically often quite useful.
- Empirical analysis of the other possible clauses one could generate remains open.


Exploiting Constraint Structure
- The "all values pruned" clause fails to exploit information about the constraint.
- It could be that, from the structure of the constraint, only a subset of the currently pruned values contributed to the newly pruned value.
  - E.g., perhaps X≠b follows from just Y≠b and Z≠c.
- If we can detect this efficiently, we could learn even better clauses from GAC.

Exploiting Constraint Structure
- In general, any set of pruned values that suffices to remove all of the supports of X=b is a reason for the pruning.
- It is feasible to find such (minimal) sets when the constraint is relatively small (e.g., small enough to perform GAC-Schema).
- In this case such clauses can be more effective than the "all values pruned" clause.

E. Better clauses from propagators
- Another critical technique in CSPs is the recognition that some constraints have a specialized structure, and thus specialized algorithms can be used to achieve GAC.
- These specialized algorithms can work even when the constraint is too large to be represented as a set of clauses.
- In these cases it should be feasible to additionally exploit this structure to obtain better clause reasons for the values pruned.

E. Better clauses from propagators
- E.g., all-diff.
  - The propagator (Regin 1994) works by identifying sets of variables S that consume all of the values in their domains.
    - E.g., a set of 3 variables all of which have the same 3 values remaining in their domains.
    - In this case these values are consumed; they cannot be used by any other variable in the all-diff.


E. Better clauses from propagators
- E.g., all-diff.
  - Say that X=b, Y=b, Z=b are all pruned because b must be consumed by one of the variables in the set {V, W}.
  - Then a shorter, structure-specific clause explaining each of these pruned values is simply the set of values already pruned from the domains of V and W.
    - "b" is consumed by V and W because these other values are no longer available.
    - Other values pruned from the domains of other variables are irrelevant.

E. Better clauses from propagators
- In general, getting better clauses by exploiting structure specific to particular constraints remains an area where much additional work needs to be done.

Social Golfer
- From Katsirelos & Bacchus 2005.
  - Note: no sophisticated symmetry breaking techniques being used!

  w,g,s   GAC        GAC+S      GAC+G
  2-7-5   1586.0s    218.0s     4.4s
  2-8-5   >2000.0s   1211.9s    5.5s
  3-6-4   >2000.0s   869.7s     5.0s
  3-7-4   >2000.0s   549.6s     1.6s
  4-7-3   843.4s     91.5s      0.3s


Conclusions

- These techniques can yield a significant improvement in CSP solvers.
- Many other issues remain to be explored:
  - the impact of learning different kinds of clauses from GAC;
  - heuristics based on recently learned clauses (very successful in SAT, seemingly less so in CSPs);
  - theoretical power in the presence of propagators;
  - extending specialized constraints to be able to extract better clauses.
