
Satisfiability

Difficult Problems

Dealing with SAT

Implementation

Klaus Sutner Carnegie Mellon University 2020/02/04

Finding Hard Problems

So where would we look for hard problems, something that is eminently decidable but appears to be outside of P? And, we’d like the problem to be practical, not some monster from CRT.

The Circuit Value Problem is a good indicator for the right direction: evaluating Boolean expressions is polynomial time, but relatively difficult within P.

So can we push CVP a little bit to force it outside of P, just a little bit? Say, up into $\mathrm{EXP}_1$?

to the Rescue

The Entscheidungsproblem is solved when one knows a procedure by which one can decide in a finite number of operations whether a given logical expression is generally valid or is satisfiable. The solution of the Entscheidungsproblem is of fundamental importance for the theory of all fields, the theorems of which are at all capable of logical development from finitely many axioms.
D. Hilbert

In a sense, [the Entscheidungsproblem] is the most general problem of mathematics.
J. Herbrand

Exploiting Difficulty

Herbrand is right, of course. As a research project this sounds like a fiasco.

But: we can turn adversity into an asset, and use (some version of) the Entscheidungsproblem as the epitome of a hard problem.

The original Entscheidungsproblem would presumably have included arbitrary first-order questions about number theory. This would indeed be very difficult: truth in arithmetic requires a monster oracle $\emptyset^{(\omega)}$.

Close, but no cigar. We need something less ambitious and closer to real algorithms.

Scaling Back

Taking a clue from CVP, how about asking questions about Boolean formulae, rather than first-order?

Probably the most natural question that comes to mind here is

Is $\varphi(x_1, \ldots, x_n)$ a tautology?

where $\varphi$ is a Boolean formula, with variables $x_1, \ldots, x_n$.

Satisfiability

For technical reasons, it is better to ask the very similar question

Is $\varphi(x_1, \ldots, x_n)$ satisfiable?

Obviously, $\varphi$ is a tautology iff $\neg\varphi$ fails to be satisfiable, so nothing is lost.

But, as we will see, satisfiability is slightly better behaved than tautology if one is concerned about resource bounds. This has to do with convenient normal forms: $\varphi$ may be in normal form (such as CNF), but $\neg\varphi$ is not.

The

Problem: Satisfiability (SAT)
Instance: A Boolean formula $\varphi(x_1, \ldots, x_n)$.
Question: Is $\varphi$ satisfiable?

The difficulty here comes from the fact that there are $2^n$ possible truth assignments $\sigma : \mathrm{Var} \to 2$. Even though we can evaluate $\varphi[\sigma]$ in linear time, any algorithm using brute-force search will be exponential, something like $2^n \mathrm{poly}(n)$.

Of course, that does not mean that there is no better algorithm, brute-force is just the most obvious line of attack. It would also work for Tautology.
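The brute-force search just described can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the slides: the formula is given in CNF as a list of clauses, a literal is a nonzero integer (i for $x_i$, -i for its negation), and the names `evaluate` and `brute_force_sat` are made up for this sketch.

```python
from itertools import product

def evaluate(clauses, sigma):
    """Evaluate a CNF formula under a total assignment sigma (dict: variable -> bool)."""
    return all(any(sigma[abs(lit)] == (lit > 0) for lit in clause)
               for clause in clauses)

def brute_force_sat(clauses, n):
    """Try all 2^n assignments to variables 1..n; return a satisfying one or None."""
    for bits in product([False, True], repeat=n):
        sigma = {i + 1: bits[i] for i in range(n)}
        if evaluate(clauses, sigma):
            return sigma
    return None

# (x1 or x2) and (not x1)
print(brute_force_sat([[1, 2], [-1]], 2))
```

Each call to `evaluate` is linear in the size of the formula, and the loop runs up to $2^n$ times, which is where the $2^n \mathrm{poly}(n)$ bound comes from.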

Aside: Problem Specification

Since Garey and Johnson published their famous Computers and Intractability in 1979, it has become standard to specify a decision problem like so:

Problem: Problem Name (catchy acronym)
Instance: Description of an instance x.
Question: Does x have property such-and-such?

The description has to be in terms of a finite data structure, ultimately a string $x \in 2^\star$. But there is no need to spell out all the details, we have reasonable standard conventions (e.g., numbers are written in binary).

The question must have a clear Yes/No answer (no counting, no data structures, . . . ).

Mighty SAT

If you think of a Boolean formula as the kind of little thingy you encountered in 151, they might seem pretty feeble. E.g.,

$$(A \Rightarrow B) \Rightarrow (B \Rightarrow C) \Rightarrow (A \Rightarrow C)$$

is clearly a tautology (the infamous cut rule). It is quite useful as a logical axiom in a [...], though. In fact, all the logical axioms in any of the standard systems are obviously tautologies.

But: what if the formula has 10000 variables, and takes a megabyte of memory just to write it down? Your intuition will tell you zip about a monster like that. Unfortunately, big formulae are where the action is.

Example 1: VC to SAT

As a warm-up exercise showing off the expressiveness of SAT, we will show how to translate the Vertex Cover problem into a satisfiability problem.

Problem: Vertex Cover
Instance: A ugraph G, a bound k.
Question: Does G have a vertex cover of size k?

For any translation to SAT, it is critical to interpret the Boolean variables the right way.

Let’s assume G looks like $\langle [n], E \rangle$. It seems natural to introduce n Boolean variables

$$p_x, \quad 1 \le x \le n,$$

one for each vertex x.

Coding Covers

The idea is simply that

$$\sigma \models p_x \iff x \text{ is in the alleged cover}$$

So the truth assignment $\sigma$ is just a bitvector for the set

$$C_\sigma = \{\, x \mid \sigma \models p_x \,\} \subseteq V.$$

We need to construct a formula $\Phi_{G,k}$ that enforces this. Let’s ignore the cardinality part for the moment. Every edge needs to have at least one endpoint in the alleged cover $C_\sigma$:

$$\bigwedge_{(u,v) \in E} (p_u \vee p_v)$$

This conjunction has size $O(n^2)$, so we are good.

Counting

We also need to make sure that $|C_\sigma| = k$.

Write $\mathrm{CNT}_{r,s}(p_1, p_2, \ldots, p_r)$ for a formula that is true under $\sigma$ iff exactly s of the r variables are true.

We could simply add $\mathrm{CNT}_{n,k}(p_1, p_2, \ldots, p_n)$ to $\Phi_G$ and be done.

Easy, what could possibly go wrong?

Problems with Reductions

To establish a reduction from A to B we need to avoid three possible errors:

Logical correctness: we must have $x \in A \Leftrightarrow f(x) \in B$.
Computational simplicity: f must be easy to compute.
Size constraint: f(x) must not be too long.

In the heat of battle, it’s quite possible to screw up one of these issues.

Threshold Functions

The standard way to get a counting formula is to use threshold functions.

Definition
A threshold function $\mathrm{thr}^n_m$, $0 \le m \le n$, is an n-ary Boolean function defined by

$$\mathrm{thr}^n_m(x) = \begin{cases} 1 & \text{if } \#(i \mid x_i = 1) \ge m, \\ 0 & \text{otherwise.} \end{cases}$$

So $\mathrm{thr}^n_m(x)$ simply means that at least m of the n variables are true.

Expressiveness

Lots of Boolean functions can be defined in terms of threshold functions.

$\mathrm{thr}^n_0$ is the constant tt.
$\mathrm{thr}^n_1$ is n-ary disjunction.
$\mathrm{thr}^n_n$ is n-ary conjunction.
$\mathrm{thr}^n_k(p) \wedge \neg\, \mathrm{thr}^n_{k+1}(p)$ is the counting function $\mathrm{CNT}_{n,k}(p_1, p_2, \ldots, p_n)$, “exactly k out of n.”
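As a quick sanity check of these identities, threshold and counting functions can be written as plain Python predicates. This is a semantic sketch only: it evaluates the functions on given bits and does not build a formula. The names `thr` and `cnt` are chosen here for illustration.

```python
def thr(m, xs):
    """thr^n_m(xs): True iff at least m of the n = len(xs) inputs are true."""
    return sum(bool(x) for x in xs) >= m

def cnt(k, xs):
    """CNT_{n,k}(xs) = thr^n_k(xs) and not thr^n_{k+1}(xs): exactly k of n are true."""
    return thr(k, xs) and not thr(k + 1, xs)

bits = [True, False, True]
print(thr(1, bits) == any(bits))   # thr^n_1 is disjunction
print(thr(3, bits) == all(bits))   # thr^n_n is conjunction
print(cnt(2, bits))                # exactly two inputs are true
```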

Dire Warning

For example, $\mathrm{CNT}_{n,2}(p)$ looks like

$$\bigvee_{i < j} (p_i \wedge p_j) \;\wedge\; \neg \bigvee_{i < j < k} (p_i \wedge p_j \wedge p_k)$$

How Big?

A casual observer might say this formula has size around k, but that’s totally wrong: we need to expand out the disjunctions and conjunctions.

To keep the cardinality formula small, we introduce new variables

$$q_{i,j}, \quad 0 \le i \le n,\ 0 \le j \le k+1$$

with the intent that

$$q_{i,j} \Leftrightarrow \mathrm{thr}^i_j(p_1, \ldots, p_i)$$

We can determine the $q_{i,j}$ in a dynamic programming style, very much like an instance of CVP.

$$q_{i,0} = 1 \qquad i = 0, \ldots, n$$
$$q_{0,j} = 0 \qquad j = 1, \ldots, k+1$$
$$q_{i+1,j} = q_{i,j} \vee (q_{i,j-1} \wedge p_{i+1})$$
$$q_{n,k} \wedge \neg q_{n,k+1}$$

We get a formula $\Phi_{G,k}$ of size $O(n^2)$ (at least with uniform size function) that clearly can be constructed from G and k in polynomial time. A closer look shows that we can actually get away with just logarithmic space: all we need is a few loops over variables.

Moreover,

$$\sigma \models \Phi_{G,k} \iff C_\sigma \text{ is a vertex cover of size } k$$

and we have our translation to SAT.

Done.
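The recurrence for the $q_{i,j}$ can be evaluated directly as a table, which may help to see why it counts correctly. The sketch below only evaluates the constraints under a given assignment; it does not emit the formula $\Phi_{G,k}$ itself. The function names and the representation of the assignment as a Python list `p` (with `p[i]` standing for $p_{i+1}$) are assumptions made for this example.

```python
def exactly_k(p, k):
    """Evaluate q_{n,k} and not q_{n,k+1} for the bits p_1..p_n given as list p."""
    n = len(p)
    # q[i][j] is meant to hold thr^i_j(p_1, ..., p_i); q[0][j] = False for j >= 1
    q = [[False] * (k + 2) for _ in range(n + 1)]
    for i in range(n + 1):
        q[i][0] = True
    for i in range(n):
        for j in range(1, k + 2):
            # q_{i+1,j} = q_{i,j} or (q_{i,j-1} and p_{i+1})
            q[i + 1][j] = q[i][j] or (q[i][j - 1] and p[i])
    return q[n][k] and not q[n][k + 1]

def is_vertex_cover_of_size(edges, p, k):
    """Edge constraints plus cardinality constraint, for vertices numbered 1..n."""
    covered = all(p[u - 1] or p[v - 1] for u, v in edges)
    return covered and exactly_k(p, k)

triangle = [(1, 2), (2, 3), (1, 3)]
print(is_vertex_cover_of_size(triangle, [True, True, False], 2))
```

The table has $(n+1)(k+2)$ entries and each is computed in constant time, matching the $O(n^2)$ size bound.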

Example 2: HC to SAT

Here is another translation, from the Hamiltonian Cycle problem to Satisfiability.

Problem: Hamiltonian Cycle
Instance: A ugraph G.
Question: Does G have a Hamiltonian cycle?

Again, it is critical to interpret the Boolean variables the right way.

As always, assume G looks like $\langle [n], E \rangle$. We introduce n(n + 1) Boolean variables

$$p_{t,x}, \quad 0 \le t \le n,\ 1 \le x \le n.$$

Think of t as time, and of x as location.

A Big Formula

The Idea: the Hamiltonian path we are looking for touches node x at time t iff $\sigma(p_{t,x}) = 1$ for a satisfying truth assignment $\sigma$.

So we need to construct a (large) Boolean formula $\Phi_G$ that enforces the following:

$$\sigma \models \Phi_G \iff \sigma \text{ codes a Hamiltonian cycle in } G$$

Then $\Phi_G$ is satisfiable iff G has a Hamiltonian cycle and we are done.

Of course, there is the constraint that $\Phi_G$ needs to be constructible from G in polynomial time.

Otherwise we could use $\Phi_G = \bot$ or $\Phi_G = \top$ ;-)

Building $\Phi_G$

$\Phi_G$ is a conjunction with 4 parts as follows:

1. $\bigwedge_{t} \mathrm{CNT}_{n,1}(p_{t,1}, p_{t,2}, \ldots, p_{t,n})$
2. $\bigwedge_{x \ne 1} \mathrm{CNT}_{n,1}(p_{1,x}, p_{2,x}, \ldots, p_{n,x})$
3. $p_{0,1} \wedge p_{n,1}$
4. $\bigwedge_{t,x} \bigl( p_{t,x} \Rightarrow \bigvee_{y \in \Gamma_x} p_{t+1,y} \bigr)$

Here $\Gamma_x = \{\, y \in [n] \mid (x, y) \in E \,\}$ denotes the neighborhood of x in G.

How Bad Can It Be?

$\mathrm{CNT}_{r,1}(x_1, \ldots, x_r)$ is easily seen to be size $\Theta(r^2)$. Hence the size of $\Phi_G$ is $\Theta(n^3)$ and thus polynomial in the size of the graph.

Clearly, $\Phi_G$ can be constructed in a straightforward manner from G, there is a polynomial time algorithm that does the job.

Even better, with a little effort we see that the function is log-space computable: we only need to keep track of a few variables that require log n bits each (recall that we don’t charge for the output tape).

It Works

Suppose G has a Hamiltonian cycle. We may think of this cycle as a sequence $v_t$, $0 \le t \le n$, of vertices where $v_0 = v_n = 1$.

Set $\sigma(p_{t,x}) = 1$ iff $v_t = x$. It is easy to check that $\sigma$ satisfies $\Phi_G$.

In the opposite direction, suppose $\sigma$ satisfies $\Phi_G$. By part 1 there is a sequence of vertices $v_t$, $0 \le t \le n$: let $v_t$ be the unique x such that $\sigma \models p_{t,x}$.

By part 2 every vertex appears on this list. Also, by part 3, $v_0 = v_n = 1$ so that all other vertices must appear exactly once by counting.

Lastly, by part 4, $(v_t, v_{t+1})$ is an edge.

Hence G has a Hamiltonian cycle, which can be read off directly from the satisfying truth assignment.

Tip of an Iceberg

The same holds true for lots of other combinatorial problems that fit exactly the same pattern.

Exercise
Express Independent Set and Clique as a Satisfiability problem.

Exercise
Express Subset Sum as a Satisfiability problem:

Problem: Subset Sum
Instance: A list of natural numbers $a_1, \ldots, a_n, b$.
Question: Is there a subset $I \subseteq [n]$ such that $\sum_{i \in I} a_i = b$?
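For the Hamiltonian Cycle translation, the four parts of $\Phi_G$ can be checked against a candidate assignment with a short Python sketch. It evaluates the constraints semantically rather than producing the formula. Two index ranges are assumptions of this sketch, since the slides do not spell them out: part 1 is taken over $0 \le t \le n$ and part 4 over $0 \le t < n$. The function names are illustrative.

```python
def sigma_from_cycle(cycle, n):
    """cycle lists v_0..v_n with v_0 = v_n = 1; sigma[(t, x)] is True iff v_t = x."""
    return {(t, x): cycle[t] == x
            for t in range(n + 1) for x in range(1, n + 1)}

def exactly_one(bits):
    return sum(bits) == 1

def satisfies_phi(sigma, n, edges):
    """Check parts 1-4 of Phi_G for an undirected graph on vertices 1..n."""
    nbr = {x: set() for x in range(1, n + 1)}
    for u, v in edges:
        nbr[u].add(v)
        nbr[v].add(u)
    # part 1: at each time exactly one location (assumed range 0 <= t <= n)
    part1 = all(exactly_one([sigma[t, x] for x in range(1, n + 1)])
                for t in range(n + 1))
    # part 2: every vertex x != 1 is visited exactly once during times 1..n
    part2 = all(exactly_one([sigma[t, x] for t in range(1, n + 1)])
                for x in range(2, n + 1))
    # part 3: start and end at vertex 1
    part3 = sigma[0, 1] and sigma[n, 1]
    # part 4: consecutive positions are adjacent (assumed range 0 <= t < n)
    part4 = all(not sigma[t, x] or any(sigma[t + 1, y] for y in nbr[x])
                for t in range(n) for x in range(1, n + 1))
    return part1 and part2 and part3 and part4

triangle = [(1, 2), (2, 3), (1, 3)]
print(satisfies_phi(sigma_from_cycle([1, 2, 3, 1], 3), 3, triangle))
```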

Tackling SAT

The brute force approach to SAT is to try all possible truth assignments, leading to a $2^n \mathrm{poly}(n)$ algorithm where n is the number of variables.

This would be of little interest for our translations from VC, HamCyc, Subset Sum, and the like: we could perform an exponential search directly in the original domain of the problem, without the translation.

Often the search is much better in the original domain: for VC we only need to consider $\binom{n}{k}$ sets of vertices, not a formula with $\Theta(n^2)$ variables.

Are there any algorithms for SAT that are fast at least some/most of the time?

SAT Algorithms

There is an old, but surprisingly powerful satisfiability testing algorithm due to Davis and Putnam, originally published in 1960.

The Key Papers

P. C. Gilmore
A proof method for quantification theory: its justification and realization
IBM J. Research and Development, 4 (1960) 1: 28–35.

M. Davis, H. Putnam
A Computing Procedure for Quantification Theory
Journal ACM 7 (1960) 3: 201–215.

M. Davis, G. Logemann, D. Loveland
A Machine Program for Theorem-Proving
Communications ACM 5 (1962) 7: 394–397.

The Real Prey

Note the titles: the real target was an algorithm to establish validity in first-order logic.

This is really the afterglow of Hilbert’s old Entscheidungsproblem: try to find as general an algorithm as possible to solve “arbitrary” questions in math.

Thanks to Gödel we know that this cannot work in general, but there still may be interesting partial answers.

FOL to Propositional

Recall that valid (aka provable) formulae in FOL are only semidecidable, but not decidable. So the challenge is to find computationally well-behaved methods that can identify at least some valid formulae.

Gilmore and Davis/Putnam exploit a theorem by J. Herbrand:

To show that $\varphi$ is valid, show that $\neg\varphi$ is inconsistent.
Translate $\neg\varphi$ into a set of clauses $\Gamma$.
Enumerate potential counterexamples based on Herbrand models, stop if one is found.

The last step requires what is now called a SAT solver.

Tiny Example

$$\varphi \equiv P(a) \wedge \forall x\, (P(x) \Rightarrow Q(f(x))) \Rightarrow Q(f(a))$$

$$\neg\varphi \equiv P(a) \wedge \forall x\, (P(x) \Rightarrow Q(f(x))) \wedge \neg Q(f(a))$$

$$\equiv \forall x\, \bigl( P(a) \wedge (P(x) \Rightarrow Q(f(x))) \wedge \neg Q(f(a)) \bigr)$$

$$\Gamma = \{\, P(a),\ \neg P(x) \vee Q(f(x)),\ \neg Q(f(a)) \,\}$$

Try substitution x = a:

$$\Gamma = \{\, P(a),\ \neg P(a) \vee Q(f(a)),\ \neg Q(f(a)) \,\}$$

$$\Gamma' = \{\, p,\ \neg p \vee q,\ \neg q \,\}$$

Gilmore’s Abstract

A program is described which can provide a computer with quick logical facility for syllogisms and moderately more complicated sentences. The program realizes a method for proving that a sentence of quantification theory is logically true. The program, furthermore, provides a decision procedure over a subclass of the sentences of quantification theory. The subclass of sentences for which the program provides a decision procedure includes all syllogisms. Full justification of the method is given.

A program for the IBM 704 Data Processing Machine is outlined which realizes the method. Production runs of the program indicate that for a class of moderately complicated sentences the program can produce proofs in intervals ranging up to two minutes.

The “Multiplication” Method

Unfortunately, Gilmore’s method to check satisfiability of a propositional formula $\psi$ comes down to this:

Transform $\psi$ into disjunctive normal form.
Remove all conjuncts containing $x$ and $\neg x$.
If nothing is left, report success.

DOA.

The DPLL Idea

The basic idea of the DPLL solver is beautifully simple. Assume that the input $\Gamma$ is in CNF: $\Gamma$ is a conjunction of disjunctions of literals:

$$\Gamma = \{ C_1, C_2, \ldots, C_k \}$$

where each clause $C_i$ is a disjunction of literals.

Of course, in an actual algorithm this would be a list of lists (say, of integers where $i$ denotes $x_i$ and $-i$ denotes $\neg x_i$).

Repeatedly apply simple cleanup operations, until nothing changes.
Bite the bullet: pick a variable and explicitly set it True and False, respectively.
Backtrack.

SmackDown

As the authors point out, their method yielded a result in a 30 minute hand-computation, where Gilmore’s algorithm running on an IBM 704 failed after 21 minutes.

The variant presented below was first implemented by Davis, Logemann and Loveland in 1962 on an IBM 704.

Naive Algorithm

Here is the most basic recursive approach to SAT testing (in reality backtracking). We are trying to build a truth-assignment $\sigma$ for a set of clauses $\Gamma$; initially $\sigma$ is totally undefined.

If every clause is satisfied, then return True.
If some clause is false, then return False.
Pick any unassigned variable x.
Set $\sigma(x) = 0$. If $\Gamma$ is now satisfiable, return True.
Set $\sigma(x) = 1$. If $\Gamma$ is now satisfiable, return True.
Return False.
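The naive algorithm translates almost line by line into Python. The sketch below uses the integer convention for literals mentioned earlier (i for $x_i$, -i for its negation); the helper names `status` and `naive_sat`, and passing the variable list explicitly, are choices made for this example rather than anything prescribed by the slides.

```python
def status(clause, sigma):
    """True if some literal is true, False if all literals are false, None otherwise."""
    undecided = False
    for lit in clause:
        value = sigma.get(abs(lit))
        if value is None:
            undecided = True
        elif value == (lit > 0):
            return True
    return None if undecided else False

def naive_sat(clauses, variables, sigma=None):
    """Backtracking search; sigma is a partial assignment (dict: variable -> bool)."""
    sigma = {} if sigma is None else sigma
    states = [status(c, sigma) for c in clauses]
    if all(s is True for s in states):
        return True
    if any(s is False for s in states):
        return False
    x = next(v for v in variables if v not in sigma)  # pick any unassigned variable
    for value in (False, True):
        sigma[x] = value
        if naive_sat(clauses, variables, sigma):
            return True
    del sigma[x]  # undo the assignment before backtracking
    return False

# The clause set {p, not p or q, not q} from the tiny example, with p = 1, q = 2
print(naive_sat([[1], [-1, 2], [-2]], [1, 2]))
```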

Three-Valued Logic

During the execution of the algorithm variables are either unassigned, true or false; they change back and forth between these values.

Strictly speaking, this is best expressed in terms of a three-valued logic with values $\{0, 1, ?\}$.

One has to redefine the Boolean operations to deal with unassigned variables. For example

$$\begin{array}{c|ccc} \wedge & 0 & ? & 1 \\ \hline 0 & 0 & 0 & 0 \\ ? & 0 & ? & ? \\ 1 & 0 & ? & 1 \end{array}$$

In practice, no one bothers.

Streamlining Things

Obviously, it is a bad idea to pick x blindly for the recursive split.

Moreover, one should do regular cleanup operations to keep $\Gamma$ small.

There are two simple yet surprisingly effective methods:

Unit Clause Elimination
Pure Literal Elimination
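For reference, the three-valued conjunction from the table in the Three-Valued Logic slide is easy to write down in Python. In this sketch `None` plays the role of `?`, and the arguments are assumed to be exactly `True`, `False` or `None`; the name `and3` is made up.

```python
def and3(a, b):
    """Three-valued conjunction; None stands for the unassigned value '?'."""
    if a is False or b is False:
        return False   # a false conjunct decides the result
    if a is None or b is None:
        return None    # otherwise an unassigned conjunct leaves it open
    return True

print(and3(False, None), and3(None, True), and3(True, True))
```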

Unit Clauses

A clause is a unit clause iff it contains just one literal.

Clearly, if $\{x\} \in \Gamma$ any satisfying truth-assignment $\sigma$ must have $\sigma(x) = 1$. But then we can remove clause $\{x\}$ and do a bit of surgery on the rest, without affecting satisfiability.

Unit Clause Elimination (UCE)

Unit Subsumption: delete all clauses containing $x$, and
Unit Resolution: remove $\neg x$ from all remaining clauses.

This process is called unit clause elimination.

Let $\{x\}$ be a unit clause in $\Gamma$ and write $\Gamma'$ for the resulting set of clauses after UCE for clause $\{x\}$.

$\Gamma$ and $\Gamma'$ are equisatisfiable.

This is also called Boolean constraint propagation (BCP). SAT solvers spend a lot of time dealing with constraint propagation.

Pure Literal Elimination (PLE)

Here is another special case that is easily dispatched.

A pure literal in $\Gamma$ is a literal that occurs only directly, but not negated. So the formula may either contain a variable $x$ or its negation $\neg x$, but not both.

Clearly, we can accordingly set $\sigma(x) = 1$ (or $\sigma(x) = 0$) and remove all the clauses containing the literal.

This may sound pretty uninspired but turns out to be useful in the real world. Note that in order to do PLE efficiently we need to keep counters for the number of occurrences of both $x$ and $\neg x$.

More on PLE

Here is a closer look at PLE. Let $\Gamma$ be a set of clauses, x a variable. Define

$\Gamma_x^+$: the clauses of $\Gamma$ that contain x positively,
$\Gamma_x^-$: the clauses of $\Gamma$ that contain x negatively, and
$\Gamma_x^*$: the clauses of $\Gamma$ that are free of x.

So we have the partition

$$\Gamma = \Gamma_x^+ \cup \Gamma_x^- \cup \Gamma_x^*$$

Note that UCE produces $\Gamma' = \{\, C - \neg x \mid C \in \Gamma_x^- \,\} \cup \Gamma_x^*$.
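Both cleanup operations are one-liners once clauses are represented as sets of integer literals. The following sketch assumes that representation (frozensets of nonzero integers, with -i for the negation of i); the function names are illustrative and not taken from any particular solver.

```python
def unit_clause_elimination(clauses, lit):
    """UCE for the unit clause {lit}.

    Unit subsumption drops every clause containing lit;
    unit resolution strikes the complementary literal -lit from the rest.
    """
    return [c - {-lit} for c in clauses if lit not in c]

def pure_literals(clauses):
    """Literals that occur in some clause while their negation occurs in none."""
    lits = set().union(*clauses)
    return {l for l in lits if -l not in lits}

def pure_literal_elimination(clauses, lit):
    """PLE for a pure literal lit: remove every clause containing it."""
    return [c for c in clauses if lit not in c]

# a..e encoded as 1..5: {a,b,c} {a,!b} {a,!c} {c,b} {!a,d,e} {!b}
gamma = [frozenset(c) for c in
         ([1, 2, 3], [1, -2], [1, -3], [3, 2], [-1, 4, 5], [-2])]
print(unit_clause_elimination(gamma, -2))
print(pure_literals(gamma))
```

With x a positive literal, `unit_clause_elimination` computes exactly $\{\, C - \neg x \mid C \in \Gamma_x^- \,\} \cup \Gamma_x^*$.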

PLE lemma

Proposition
If $\Gamma_x^+$ or $\Gamma_x^-$ is empty, then $\Gamma$ and $\Gamma_x^*$ are equisatisfiable.

Since $\Gamma_x^*$ is smaller than $\Gamma$ (unless x does not appear at all), this transformation simplifies the decision problem.

But note that PLE flounders once all variables have positive and negative occurrences. If, in addition, there are no unit clauses, we are stuck.

The DPLL Algorithm

Unit Clause Elimination: do UCE until no unit clauses are left.

Pure Literal Elimination: do PLE until no pure literals are left.

If an empty clause has appeared, return False.

If all clauses have been eliminated, return True.

Splitting: otherwise, cleverly pick one of the remaining variables, x. Backtrack to test both

$$\Gamma, \{x\} \quad\text{and}\quad \Gamma, \{\neg x\}$$

for satisfiability.

Return True if at least one of the branches returns True; False otherwise.
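Putting the steps of the DPLL algorithm together gives a compact recursive procedure. This is a minimal sketch under the same assumptions as before (clauses as collections of nonzero integers). In particular, the split variable is picked arbitrarily from the first remaining clause rather than "cleverly", and clauses are copied instead of being updated in place, so it says nothing about how a tuned solver is organized.

```python
def dpll(clauses):
    """Return True iff the clause set (iterables of nonzero ints) is satisfiable."""
    clauses = [frozenset(c) for c in clauses]
    while True:
        # Unit Clause Elimination
        unit = next((c for c in clauses if len(c) == 1), None)
        if unit is not None:
            (lit,) = unit
            clauses = [c - {-lit} for c in clauses if lit not in c]
            continue
        # Pure Literal Elimination
        lits = set().union(*clauses)
        pure = next((l for l in lits if -l not in lits), None)
        if pure is not None:
            clauses = [c for c in clauses if pure not in c]
            continue
        break
    if any(len(c) == 0 for c in clauses):
        return False          # an empty clause has appeared
    if not clauses:
        return True           # all clauses have been eliminated
    x = abs(next(iter(clauses[0])))   # naive choice of split variable
    return (dpll(clauses + [frozenset({x})])
            or dpll(clauses + [frozenset({-x})]))

print(dpll([[1, 2, 3], [1, -2], [1, -3], [3, 2], [-1, 4, 5], [-2]]))
```

Adding the unit clause $\{x\}$ or $\{\neg x\}$ and letting UCE do the rest mirrors the formulation "test $\Gamma, \{x\}$ and $\Gamma, \{\neg x\}$", and it guarantees termination because each split removes the variable x from the clause set.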

For Glass-Half-Empty People

Note that UCE may well produce more unit clauses as well as pure literals, so the first two steps hopefully will shrink the formula a bit.

Still, thanks to Splitting, this looks dangerously close to brute-force search.

The algorithm still often succeeds beautifully in the RealWorld, since it systematically exploits all possibilities to prune irrelevant parts of the search tree.

Example

After three UCE steps (no PLE) and one split on d we get the answer “satisfiable”:

1 {a,b,c} {a,!b} {a,!c} {c,b} {!a,d,e} {!b}
2 {a,c} {a,!c} {c} {!a,d,e}
3 {a} {!a,d,e}
4 {d,e}

We could also have used PLE (on d, a, c):

1 {a,b,c} {a,!b} {a,!c} {c,b} {!a,d,e} {!b}
2 {a,b,c} {a,!b} {a,!c} {c,b} {!b}
3 {c,b} {!b}
4 {!b}

Finding an Assignment

This algorithm also solves the search problem: we only need to keep track of the assignments made to literals. In the example, the corresponding assignment is

$$\sigma(b) = 0, \quad \sigma(c) = \sigma(a) = \sigma(d) = 1$$

The choice for e does not matter.

Note that we also could have chosen $\sigma(e) = 1$ and ignored d.

Exercise
Implement a version of the algorithm that returns a satisfying truth assignment if it exists.
How about all satisfying truth assignments?

Correctness

Claim
The Davis/Putnam algorithm is correct: it returns true if, and only if, the input formula is satisfiable.

Proof.
We already know that UCE and PLE preserve satisfiability.

Let x be any literal in $\varphi$. Then by Boole-Shannon expansion

$$\varphi(x, y) \equiv (x \wedge \varphi(1, y)) \vee (\neg x \wedge \varphi(0, y))$$

But splitting checks exactly the two formulae on the right for satisfiability; hence $\varphi$ is satisfiable if, and only if, at least one of the two branches returns true.

Termination is obvious. $\Box$
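The assignment read off from the worked example can be checked mechanically. The sketch below keeps the slide notation for literals (a leading `!` for negation) and verifies that $\sigma(b) = 0$, $\sigma(a) = \sigma(c) = \sigma(d) = 1$ satisfies all six clauses for either value of e; the helper names are made up for this illustration.

```python
def holds(lit, sigma):
    """Truth value of a literal such as 'a' or '!a' under sigma (dict: name -> bool)."""
    return not sigma[lit[1:]] if lit.startswith("!") else sigma[lit]

def satisfies(clauses, sigma):
    """True iff every clause contains at least one true literal."""
    return all(any(holds(lit, sigma) for lit in clause) for clause in clauses)

gamma = [["a", "b", "c"], ["a", "!b"], ["a", "!c"],
         ["c", "b"], ["!a", "d", "e"], ["!b"]]

for e in (False, True):
    sigma = {"a": True, "b": False, "c": True, "d": True, "e": e}
    print(e, satisfies(gamma, sigma))
```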

Bad News

One can think of DPLL as a particular kind of resolution (see Wikipedia). Unfortunately, it inherits potentially exponential running time as shown by Tseitin in 1966.

Intuitively, this is not really surprising: too many splits will kill efficiency, and DPLL has no clever mechanism of controlling splits.

And there is the Levin-Cook theorem (which we will prove soon) that shows that SAT is NP-complete, so one should not expect algorithmic miracles.

Good News

In practice, though, Davis/Putnam is usually quite fast, even for huge formulae.

It is not entirely understood why formulae that appear in real-world problems tend to produce something like polynomial running time when tackled by DPLL.

Take the restriction to RealWorld problems here with a grain of salt. For example, in algebra, DPLL has been used to solve problems in the theory of so-called quasi groups (cancellative groupoids). In a typical case, there are $n^3$ Boolean variables and about $n^4$ to $n^6$ clauses; n might be 10 or 20.

Example: Exactly One

Neither UCE nor PLE applies here, so the first step is a split.

{{!a,!b},{!a,!c},{!a,!d},{!a,!e},{!b,!c},{!b,!d},{!b,!e},{!c,!d},{!c,!e},{!d,!e},{a,b,c,d,e}}

{{!a},{!a,!b},{!a,!c},{!a,!d},{!a,!e},{!b,!c},{!b,!d},{!b,!e},{!c,!d},{!c,!e},{!d,!e},{a,b,c,d,e}}

{{!b},{!b,!c},{!b,!d},{!b,!e},{!c,!d},{!c,!e},{!d,!e},{b,c,d,e}}

{{!c},{!c,!d},{!c,!e},{!d,!e},{c,d,e}}

{{d},{d,e},{!d,!e}}

True

Of course, this formula is trivially satisfiable, but note how the algorithm quickly homes in on one possible assignment.

The Real World

If you want to see some cutting edge problems that can be solved by SAT algorithms (or can’t quite be solved at present) take a look at

http://www.satcompetition.org
http://www.satlive.org

Try to implement DPLL yourself, you will see that it’s pretty hopeless to get up to the level of performance of the programs that win these competitions.

Bookkeeping

We pretended that literals are removed from clauses: in reality, they would simply be marked False. In this setting, a unit clause has all but one of its literals marked False.

Similarly, if every clause has a true literal, then the algorithm returns True. And, if some clause has only false literals, then it returns False.

So one should keep count of non-false literals in each clause. And one should know where a variable appears positively and negatively.

At present, it seems that lean-and-mean is the way to go with SAT solvers. Keeping track of too much information gets in the way.

Splitting

There are several strategies to choose the next variable in a split. Note that one also needs to determine which truth value to try first.

Hit the most (unsatisfied) clauses.
Use most frequently occurring literal.
Focus on small clauses.
Do everything at random.

A Hack: Watch Pointers

Here is a clever hack that minimizes the number of times a clause needs to be inspected (after the algorithm has performed one of its basic steps).

For every clause, keep pointers to two unassigned literals.
For each variable, keep track of watched clauses (positive and negative).

The key idea: examine a clause only when one of its watched literals is assigned False.
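As an illustration of the splitting strategies listed under Splitting, here is a sketch of the "most frequently occurring literal" rule. It assumes a non-empty list of non-empty clauses over integer literals; the tie-breaking rule (prefer the smaller variable, then the positive literal) is an arbitrary choice made here so that the result is deterministic.

```python
from collections import Counter

def most_frequent_literal(clauses):
    """Pick the literal with the most occurrences among the remaining clauses."""
    counts = Counter(lit for clause in clauses for lit in clause)
    # ties: smaller variable index first, then positive before negative
    return max(counts, key=lambda lit: (counts[lit], -abs(lit), lit))

print(most_frequent_literal([[1, -2], [-2, 3], [-2, -3], [1]]))
```

Choosing a literal rather than a variable also settles which branch to try first: the split would explore the branch that makes the chosen literal true before its complement.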

Watching

Suppose literal x is assigned True and let C be a clause on the watch list for x.

$k$ = # literals in C
$\ell$ = # false literals in C
$\ell'$ = # true literals in C

If $\ell = k$: return False.

If $0 < \ell'$: return True.

If $\ell < k$: check for UCE.

Otherwise: update pointers and watch lists.

As it turns out, the additional bookkeeping is more than compensated for by cutting down on the number of inspected clauses.
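The case analysis on k, $\ell$ and $\ell'$ can be sketched as a small clause inspection routine. This is not the watch-pointer machinery itself: the sketch recounts the literals of a clause from scratch, which is exactly the work that watch lists are meant to avoid. The function name and the string results are assumptions of this example, and "check for UCE" is read here as "the clause is unit when exactly one literal is not false".

```python
def inspect_clause(clause, sigma):
    """Classify a clause under a partial assignment sigma (dict: variable -> bool)."""
    k = len(clause)
    # a literal is false when its variable is assigned the opposite polarity
    false_count = sum(1 for l in clause if sigma.get(abs(l)) == (l < 0))
    true_count = sum(1 for l in clause if sigma.get(abs(l)) == (l > 0))
    if false_count == k:
        return "conflict"    # all literals false: the branch returns False
    if true_count > 0:
        return "satisfied"   # the clause needs no further attention
    if false_count == k - 1:
        return "unit"        # one literal left: candidate for UCE
    return "open"

print(inspect_clause([1, -2, 3], {1: False, 2: True}))
```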