The Satisfiability Problem

Boolean Formula Assignment. Let $F$ be a Boolean formula over variables $x_1, x_2, \ldots, x_n$ and the Boolean operators $\land$, $\lor$, and negation (written as an overline). A truth assignment for $F$ is a vector $a \in \{0,1\}^n$. The evaluation of $F$ under truth assignment $a$ is an element of $\{0,1\}$, and is denoted by $F(a)$. It is defined recursively as follows.

• Base Case. If $F = x_i$, then $F(a) = a_i$. If $F = \overline{x}_i$, then $F(a) = \overline{a}_i$.
• Recursive Case. Assume $F_1(a)$ and $F_2(a)$ have been defined. Then $(F_1 \lor F_2)(a) = F_1(a) \lor F_2(a)$, $(F_1 \land F_2)(a) = F_1(a) \land F_2(a)$, and $\overline{F_1}(a) = \overline{F_1(a)}$.

If $F(a) = 1$, then we say that $a$ satisfies $F$, and that $F$ is satisfiable. $F$ is called unsatisfiable if it is not satisfied by any truth assignment.

The SATisfiability Problem. Given a Boolean formula $F(x_1, x_2, \ldots, x_n)$, does there exist a truth assignment $a = (a_1, a_2, \ldots, a_n)$ such that $F(a_1, a_2, \ldots, a_n)$ evaluates to true?

Example 1. Given the Boolean formula
$$F(x_1, x_2, x_3, x_4) = (x_1 \lor (x_2 \land x_3)) \land (x_2 \lor x_3 \lor x_4) \land (x_2 \land (x_3 \lor x_4)) \land (x_2 \lor (x_3 \land x_4)) \land (x_2 \lor x_3 \lor x_4),$$
evaluate $F(a)$ for $a = (1, 1, 0, 1)$. Is $F$ satisfiable?

Literals. A literal is a Boolean variable, or the negation of a Boolean variable. For example, $x_{12}$ and $\overline{x}_2$ are literals, with the first being called a positive literal and the second a negative literal.

Conjunctive Normal Form. A Boolean formula $F$ is said to be in conjunctive normal form if $F$ is of the form $C_1 \land C_2 \land \cdots \land C_m$, where each clause $C_i$ has the form $C_i = l_{i1} \lor l_{i2} \lor \cdots \lor l_{im_i}$, the disjunction of a finite number of literals. $F$ is then called a CNF formula. Furthermore, in the case that each $C_i$ has exactly $k$ literals, for some constant $k$, $F$ is called a $k$-CNF formula. We can thus define the CNF-SAT problem as the problem of deciding if a given CNF formula is satisfiable. The $k$-CNF-SAT (or $k$SAT for short) problem is defined similarly. When $k = 2$ the problem is called 2SAT, and when $k = 3$ it is called 3SAT.

Example 2.
Provide a Boolean formula that is logically equivalent to the defining formula from Example 1, and is in conjunctive normal form. For what value of $k$ is this formula an instance of the $k$-CNF-SAT problem?

Currently there is no known polynomial-time algorithm for deciding if a 3-CNF formula is satisfiable, and hence no known polynomial-time algorithm for SAT, since every 3-CNF formula is itself a Boolean formula. Conversely, an arbitrary Boolean formula can be transformed in polynomial time into a 3-CNF formula that is satisfiable if and only if the original formula is, so a polynomial-time algorithm for 3SAT would also yield one for SAT. In fact, in a future lecture we shall demonstrate that 3SAT is one of the hardest problems to solve in a family of problems called NP, which consists of those problems that can be solved in nondeterministic polynomial time.

However, it turns out that the 2SAT problem can be solved in time that is at most quadratic in $m$, the number of clauses, and $n$, the number of variables. The algorithm relies on the fact that, for a directed graph $G = (V, E)$, one can check in linear time (in $|V| + |E|$) if some vertex $b \in V$ is reachable from another vertex $a \in V$. The reachability algorithm is now described.

• Name: Reach
• Input: $\langle G = (V, E), a, b \rangle$, where $a, b \in V$.
• Output: accept if and only if $b$ is reachable from $a$.
• Begin Algorithm
• Initialize a FIFO queue $Q = \emptyset$
• Mark $a$ as having been reached and enter $a$ into $Q$
• While $Q$ is nonempty
    - Remove $u$ from the front of $Q$
    - For each directed edge of the form $(u, v) \in E$
        ∗ If $v$ is unmarked, then mark $v$ and enter $v$ into $Q$
• If $b$ is marked then accept
• Else reject
• End Algorithm

Example 3. Show the contents of the queue $Q$ during the execution of the above algorithm on the graph $G = (V, E)$, where $V = \{a, b, c, d, e, f, g, h\}$ and the edges are given by
$$E = \{(a,b), (a,c), (b,c), (b,d), (b,e), (b,g), (c,g), (c,f), (d,f), (f,g), (f,h), (g,h)\}.$$
Decide if $h$ is reachable from $a$.

CNF Notation.
To simplify notation, a $k$-CNF formula of the form
$$(l_{11} \lor \cdots \lor l_{1k}) \land (l_{21} \lor \cdots \lor l_{2k}) \land \cdots \land (l_{m1} \lor \cdots \lor l_{mk})$$
will be written as
$$\{(l_{11}, \ldots, l_{1k}), (l_{21}, \ldots, l_{2k}), \ldots, (l_{m1}, \ldots, l_{mk})\},$$
which is a family of sets of literals.

Example 4. Re-write the CNF formula from Example 2 using CNF notation.

Implication Graph of a 2-CNF Formula. Let $F$ be a 2-CNF formula over the variables $x_1, x_2, \ldots, x_n$. The implication graph of $F$ is defined as the directed graph $G_F = (V, E)$, where $V = \{x_1, x_2, \ldots, x_n, \overline{x}_1, \overline{x}_2, \ldots, \overline{x}_n\}$, and for any two literals $l_i$ and $l_j$, $(l_i, l_j) \in E$ if and only if either $(\overline{l}_i, l_j)$ is a clause of $F$ or $(l_j, \overline{l}_i)$ is a clause of $F$.

Example 5. Draw the implication graph for the following set of CNF clauses.
$$(x_2, x_4), (x_2, x_3), (x_2, x_3), (x_2, x_3), (x_2, x_4), (x_1, x_4).$$

Theorem 1. A 2-CNF formula $F$ is unsatisfiable iff there exists a variable $x$ such that $\overline{x}$ is reachable from $x$ and $x$ is reachable from $\overline{x}$ in the implication graph $G_F$.

Proof ($\Leftarrow$). Let $F$ be a 2-CNF formula and assume that there is some variable $x$ such that $\overline{x}$ is reachable from $x$ and $x$ is reachable from $\overline{x}$. Then there are edge sequences in $G_F$ of the form $(x, l_1), (l_1, l_2), \ldots, (l_{r-1}, l_r), (l_r, \overline{x})$ and of the form $(\overline{x}, \hat{l}_1), (\hat{l}_1, \hat{l}_2), \ldots, (\hat{l}_{s-1}, \hat{l}_s), (\hat{l}_s, x)$. The first edge sequence implies that $F$ is not satisfiable when $x$ is assigned 1. For example, $(x, l_1) \in E$ implies that either $(\overline{x}, l_1)$ or $(l_1, \overline{x})$ is a clause of $F$. Then the assignment of 1 to $x$ forces an assignment of 1 to $l_1$. Similar reasoning shows that this in turn forces an assignment of 1 to $l_2$, and, applying this reasoning through the entire edge sequence, we see that $l_r$ is forced to have an assignment of 1. But the edge $(l_r, \overline{x})$ corresponds to the clause $(\overline{l}_r, \overline{x})$. If $l_r$ is forced to 1, then this clause cannot be satisfied, since $x$ was already assigned 1. Hence, no satisfying assignment of $F$ can assign $x$ to 1. A similar argument using the second edge sequence shows that no satisfying assignment of $F$ can assign $x$ to 0.
Therefore, $F$ is unsatisfiable.

Proof ($\Rightarrow$). Now assume that $F$ is unsatisfiable. We prove that there is some variable $x$ such that $\overline{x}$ is reachable from $x$ and $x$ is reachable from $\overline{x}$. The proof is by induction on the number of variables in formula $F$.

Basis Step. $F$ has one variable $x$. Since $F$ is unsatisfiable, $F$ must have the two clauses $(x, x)$ and $(\overline{x}, \overline{x})$. These yield implication graph edges $(\overline{x}, x)$ and $(x, \overline{x})$, and hence $x$ is the desired variable.

Induction Step. Assume that, for some $n \geq 2$, any unsatisfiable 2-CNF formula with fewer than $n$ variables has a variable $x$ such that $\overline{x}$ is reachable from $x$ and $x$ is reachable from $\overline{x}$ in its implication graph. Let $F$ be an unsatisfiable 2-CNF formula with $n$ variables. Choose a variable $w$ such that either $\overline{w}$ is not reachable from $w$ or vice versa. If no such variable exists, then the statement is proved. Without loss of generality, assume $\overline{w}$ is not reachable from $w$. Let $R$ be the set of literals that are reachable from $w$. From what we just stated, $\overline{w} \notin R$.

Claim 1: $R$ is a consistent set of literals, meaning that, if $l \in R$, then $\overline{l} \notin R$. By way of contradiction, assume otherwise; that is, for some literal $l$, both $l \in R$ and $\overline{l} \in R$. Then there is a path $P_1$ from $w$ to $l$ and a path $P_2$ from $w$ to $\overline{l}$. By contraposing the edges in $P_2$, we obtain a path $P_3$ from $l$ to $\overline{w}$. Hence, $P_1 \cdot P_3$ yields a path from $w$ to $\overline{w}$, a contradiction.

Now let $V(R)$ denote the set of variables $x$ for which either $x \in R$ or $\overline{x} \in R$. Without loss of generality, assume that $V(R) = \{x_1, x_2, \ldots, x_{|R|}\}$. Furthermore, let $a \in \{0,1\}^{|R|}$ be the assignment for which $a_i = 1$ if $x_i \in R$ and $a_i = 0$ if $\overline{x}_i \in R$. Clearly, $a$ satisfies all literals in $R$. Slightly more subtle is that $a$ satisfies all clauses $C$ that have at least one variable in $V(R)$. For example, assume $l \in R$. If $C = (l, \hat{l})$, then $C$ is satisfied by $a$ since $l$ is satisfied by $a$. On the other hand, if $C = (\overline{l}, \hat{l})$, then this implies that $\hat{l} \in R$, since $C$ yields the edge $(l, \hat{l})$, and $l$ is reachable from $w$. Thus $\hat{l}$ is also reachable from $w$ and is thus in $R$.
Thus $a$ satisfies $C$, since $\hat{l} \in R$ and $a$ satisfies $\hat{l}$ by definition.

Now let $\hat{F}$ denote the new 2-CNF formula that is $F$ with all clauses removed that contain a variable in $V(R)$. Notice that i) $G_{\hat{F}}$ is a subgraph of $G_F$, ii) $\hat{F}$ is unsatisfiable (otherwise $F$ would be satisfiable), and iii) $\hat{F}$ has $|R| > 0$ fewer variables than $F$.
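
The Reach algorithm and Theorem 1 together suggest a decision procedure for 2SAT: build the implication graph, then test each variable for reachability in both directions. The following Python code is a minimal sketch of that idea, not a definitive implementation. It assumes an encoding that is not part of these notes: the literal $x_i$ is the integer `i`, its negation is `-i`, and each clause is a pair of such integers. The function names `reach`, `implication_graph`, and `two_sat_satisfiable` are hypothetical, chosen for illustration.

```python
from collections import deque


def reach(graph, a, b):
    """Breadth-first search: return True iff b is reachable from a.

    `graph` maps each vertex to an iterable of its out-neighbors
    (assumed adjacency-list representation).
    """
    marked = {a}                 # vertices that have been reached
    queue = deque([a])           # FIFO queue Q
    while queue:
        u = queue.popleft()      # remove u from the front of Q
        for v in graph.get(u, ()):
            if v not in marked:
                marked.add(v)
                queue.append(v)
    return b in marked


def implication_graph(n, clauses):
    """Build G_F for a 2-CNF formula over x_1..x_n.

    Assumed encoding: literal x_i is the int i, its negation is -i.
    A clause (l1, l2) yields the edges (-l1, l2) and (-l2, l1).
    """
    graph = {lit: set() for i in range(1, n + 1) for lit in (i, -i)}
    for l1, l2 in clauses:
        graph[-l1].add(l2)
        graph[-l2].add(l1)
    return graph


def two_sat_satisfiable(n, clauses):
    """Apply Theorem 1: F is unsatisfiable iff some variable x has
    both x -> not-x and not-x -> x paths in the implication graph."""
    graph = implication_graph(n, clauses)
    for x in range(1, n + 1):
        if reach(graph, x, -x) and reach(graph, -x, x):
            return False
    return True


# All four sign patterns over x_1, x_2: no assignment satisfies every clause.
print(two_sat_satisfiable(2, [(1, 2), (-1, 2), (1, -2), (-1, -2)]))  # False
# Setting x_2 = 1 satisfies both clauses.
print(two_sat_satisfiable(2, [(1, 2), (-1, 2)]))                     # True
```

Under these assumptions the graph has $2n$ vertices and at most $2m$ edges, and the sketch makes at most $2n$ calls to `reach`, each linear in the size of the graph, which is consistent with the quadratic bound stated earlier.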
