<<

NP-Completeness and Boolean Satisfiability

Mridul Aanjaneya

Stanford University

August 14, 2012

Mridul Aanjaneya Automata Theory 1/ 49 Time-Bounded Turing Machines

• A that, given an input of lengthn, always halts within T(n) moves is said to be T(n)-time bounded. The TM can be multitape. Sometimes, it can be nondeterministic. • The deterministic, multitape case corresponds roughly to an O(T(n)) running-time .

Mridul Aanjaneya Automata Theory 2/ 49 The class P

• If a DTMM is T(n)-time bounded for some polynomial T(n), then we sayM is polynomial-time(polytime) bounded. • AndL(M) is said to be in the classP. • Important Point: When we talk ofP, it doesn’t matter whether we mean by a computer or by a TM.

Mridul Aanjaneya Automata Theory 3/ 49 Polynomial Equivalence of Computers and TM’s

• A multitape TM can simulate a computer that runs for time O(T(n)) in at most O(T2(n)) of its own steps. • If T(n) is a polynomial, so isT 2(n).

Mridul Aanjaneya Automata Theory 4/ 49 Examples of Problems in P

• Isw in (G), for a given CFGG? Inputw. Use CYK algorithm, which is O(n3). • Is there a path from nodex to nodey in graphG? Input =x,y, andG. Use Dijkstra’s algorithm, which is O(n log n) on a graph ofn nodes and arcs.

Mridul Aanjaneya Automata Theory 5/ 49 Running Times Between Polynomials

• You might worry that something like O(n log n) is not a polynomial. • However, to be inP, a problem only needs an algorithm that runs in time less than some polynomial. • Surely O(n log n) is less than the polynomial O(n2).

Mridul Aanjaneya Automata Theory 6/ 49 A Tricky Case: Knapsack

• The Knapsack problem is: given positive integersi 1, i2,..., in, can we divide them in two sets with equal sums? • Perhaps we can solve this problem in polytime by a dynamic-programming algorithm: Maintain a table of all the differences we can achieve by partitioning the firstj integers.

Mridul Aanjaneya Automata Theory 7/ 49 Knapsack

• Basis: j=0. Initially, the table has true for0 and false for all other differences.

• Induction: To consideri j , start with a new table initially all false. • Then setk to true if, in the old table, there is a valuem that was true, andk is either m+i j or m-ij .

Mridul Aanjaneya Automata Theory 8/ 49 Knapsack

• Suppose we measure running time in terms of the sum of the integers, saym. • Each table only needs space O(m) to represent all the positive and negative differences we could achieve. • Each table can be constructed in time O(n).

Mridul Aanjaneya Automata Theory 9/ 49 Knapsack

• Sincen ≤ m, we can build the final table in O(m2) time. • From that table, we can see if0 is achievable and solve the problem.

Mridul Aanjaneya Automata Theory 10/ 49 Subtlety: Measuring Input Size

• Input size has a specific meaning: the length of the representation of the problem as it is input to a TM. • For the Knapsack problem, you cannot always write the input in a number of characters that is polynomial in either the number of or sum of the integers.

Mridul Aanjaneya Automata Theory 11/ 49 Knapsack - Bad Case

• Suppose we haven integers, each of which is around2 n. • We can write integers in binary, so the input takes O(n2) space to write down. • But the tables require space O(n2n). • They therefore require at least that order of time to construct.

Mridul Aanjaneya Automata Theory 12/ 49 Bad Case

• Thus, the proposed polynomial algorithm actually takes time O(n22n) on an input of length O(n2). • Or, since we like to usen as the input size, it takes time O(n2sqrt(n)) on an input of lengthn. • In fact, it appears no algorithm solves Knapsack in polynomial time.

Mridul Aanjaneya Automata Theory 13/ 49 Redefining Knapsack

• We are free to describe another problem, call it Pseudo-Knapsack, where integers are represented unary. • Pseudo-Knapsack is inP.

Mridul Aanjaneya Automata Theory 14/ 49 The class NP

• The running time of a nondeterministic TM is the maximum number of steps taken along any branch. • If that time bound is polynomial, the NTM is said to be polynomial-time bounded. • And its language/problem is said to be in the classNP.

Mridul Aanjaneya Automata Theory 15/ 49 Example: NP

• The Knapsack problem is definitely inNP, even using the conventional binary representation of integers. • Use nondeterminism to guess one of the subsets. • Sum the two subsets and compare.

Mridul Aanjaneya Automata Theory 16/ 49 P versus NP

• Originally a curiosity of Computer Science, mathematicians now recognize as one of the most important open problems the questionP=NP? • There are thousands of problems that are inNP but appear not to be inP. • But no proof that they aren’t really inP.

Mridul Aanjaneya Automata Theory 17/ 49 Complete Problems

• One way to address theP=NP question is to identify complete problems forNP. • An NP-complete problem has the property that if it is inP, then every problem inNP is also inP. • Defined formally via polytime reductions.

Mridul Aanjaneya Automata Theory 18/ 49 Complete Problems: Intuition

• A complete problem for a class embodies every problem in the class, even if it does not appear so. • Strange but true: Knapsack embodies every polytime NTM computation.

Mridul Aanjaneya Automata Theory 19/ 49 Polytime Reductions

• Goal: Find a way to show problem X to be NP-complete by reducing every language/problem inNP to X in such a way that if we had a deterministic polynomial time algorithm for X , then we could construct a deterministic polynomial time algorithm for any problem inNP.

Mridul Aanjaneya Automata Theory 20/ 49 Polytime Reductions

• We need the notion of a polytime transducer - a TM that: 1 Takes an input of lengthn. 2 Operates deterministically for some polynomial time p(n). 3 Produces an output on a separate output tape. • Note: output length is at most p(n).

Mridul Aanjaneya Automata Theory 21/ 49 Polytime Reductions

State

input n

scratch tapes

output <= p(n)

Mridul Aanjaneya Automata Theory 22/ 49 Polytime Reductions

• LetL andM be languages. • SayL is polytime reducible toM if there is a polytime transducerT such that for every inputw toT, the outputx= T(w) ∈ M if and only ifw ∈ L.

Mridul Aanjaneya Automata Theory 23/ 49 Picture of Polytime Reductions

in L in M

not T in L not in M

Mridul Aanjaneya Automata Theory 24/ 49 NP-complete Problems

• A problem/languageM is said to be NP-complete if for every languageL ∈ NP, there is a polytime reduction fromL toM. • Fundamental Property: IfM has a polytime algorithm, then L also has a polytime algorithm. i.e., ifM ∈ P, then everyL ∈ NP is also inP, orP=NP.

Mridul Aanjaneya Automata Theory 25/ 49 Proof that Polytime Reductions Work

• SupposeM has an algorithm of polynomial time q(n). • LetL have a polytime transducerT toM, taking polynomial time p(n). • The output ofT, given an input of lengthn, is at most of length p(n). • The algorithm forM on the output ofT takes time at most q(p(n)).

Mridul Aanjaneya Automata Theory 26/ 49 Proof that Polytime Reductions Work

• We now have a polytime algorithm forL: 1 Givenw of lengthn, useT to producex of length ≤ p(n), taking time ≤ p(n). 2 Use the algorithm forM to tell ifx ∈ M in time ≤ q(p(n)). 3 Answer forw is whatever the answer forx is. • Total time ≤ p(n)+ q(p(n))= a polynomial.

Mridul Aanjaneya Automata Theory 27/ 49 Boolean Expressions

• Boolean, or propositional-logic expressions are built from variables and constants using the operators AND,OR, and NOT. Constants are true and false, represented by1 and0, respectively. We’ll use concatenation for AND, + forOR, - for NOT.

Mridul Aanjaneya Automata Theory 28/ 49 Example: Boolean Expressions

• (x+y)(-x+-y) is true only when variablesx andy have opposite truth values. • Note: parentheses can be used at will, and are needed to modify the precendence order NOT (highest), AND,OR.

Mridul Aanjaneya Automata Theory 29/ 49 The Satisfiability Problem (SAT)

• Study of boolean functions generally is concerned with the set of truth assignments (assignments of0 or1 to each of the variables) that make the function true. • NP-completeness needs only a simpler question (SAT): does there exist a truth assignment making the function true?

Mridul Aanjaneya Automata Theory 30/ 49 Example: SAT

• (x+y)(-x+-y) is satisfiable. • There are, in fact, two satisfying truth assignments: 1 x=0;y=1. 2 x=1;y=0. • x(-x) is not satisfiable.

Mridul Aanjaneya Automata Theory 31/ 49 SAT is in NP

• There is a multitape NTM that can decide if a Boolean formula of lengthn is satisfiable. • The NTM takes O(n2) time along any path. • Use nondeterminism to guess a truth assignment on a second tape. • Replace all variables by guessed truth values. • Evaluate the formula for this assignment. • Accept if true.

Cook’s Theorem SAT is NP-complete.

Mridul Aanjaneya Automata Theory 32/ 49 Picture so far

• We have one NP-complete problem: SAT. • In the future, we shall do polytime reductions of SAT to other problems, thereby showing them to be NP-complete. • Why? If we polytime reduce SAT to X , and X ∈ P, then so is SAT, and therefore so is all ofNP.

Mridul Aanjaneya Automata Theory 33/ 49 Conjunctive Normal Form

• A Boolean formula is in Conjunctive Normal Form(CNF) if it is the AND of clauses. • Each clause if the OR of literals. • Each literal if either a variable or the negation of a variable.

CSAT Is a boolean formula in CNF satisfiable?

Mridul Aanjaneya Automata Theory 34/ 49 NP-completeness of CSAT

• You can convert any formula to CNF. • If may exponentiate the size of the formula and therefore take time to write down that is exponential in the size of the original formula, but these numbers are all fixed for a given NTMM and independent ofn.

Mridul Aanjaneya Automata Theory 35/ 49 k-SAT

• If a boolean formula is in CNF and every clause consists of exactly k literals, we say that the boolean formula is an instance of k-SAT. Say the formula is in k-CNF. • Example: 3-SAT formula: (x+y+z)(x+-y+z)(x+y+-z)(x+-y+-z)

Mridul Aanjaneya Automata Theory 36/ 49 k-SAT Facts

• Every boolean formula has an equivalent CNF formula. But the size of the CNF formula may be exponential in the size of the original. • Not every boolean formula has a k-SAT equivalent. • 2-SAT is inP, 3-SAT is NP-complete.

Mridul Aanjaneya Automata Theory 37/ 49 Proof: 2-SAT is in P (sketch)

• Pick an assignment for some variable, sayx= true. • Any clause with -x forces the other literal to be true. Example: (-x+-y) forcesy to be false. • Keep seeing what other truth values are forced by variables with known truth values.

Mridul Aanjaneya Automata Theory 38/ 49 Proof: 2-SAT is in P (sketch)

• One of three things can happen: 1 You reach a contradiction (e.g.,z is forced to be both true and false). 2 You reach a point where no more variables have their truth value forced, but some clauses are not yet made true. 3 You reach a satisfying truth assignment.

Mridul Aanjaneya Automata Theory 39/ 49 Proof: 2-SAT is in P (sketch)

• Case 1: (Contradition) There can be only a satisfying assignment if you use the other truth value forx. Simplify the formula by replacingx by this truth value and repeat the process. • Case 3: You found a satisfying assignment, so answer yes.

Mridul Aanjaneya Automata Theory 40/ 49 Proof: 2-SAT is in P (sketch)

• Case 2: (You force values for some variables, but other variables and clauses are not affected.) Adopt these truth values, eliminate the clauses that they satisfy, and repeat. • In cases1 and2 you have spend O(n2) time and have reduced the length of the formula by ≥ 1, so O(n3) total.

Mridul Aanjaneya Automata Theory 41/ 49 3-SAT

• This problem is NP-complete. • Clearly it is inNP, since SAT is. • It is not true that every boolean formula can be converted to an equivalent 3-CNF formula, even if we exponentiate the size of the formula.

Mridul Aanjaneya Automata Theory 42/ 49 3-SAT

• But we don’t need equivalence. • We need to reduce every CNF formulaF to some 3-CNF formula that is satisfiable if and only ifF is. • Reduction involves introducing new variables into long clauses, so that we can split them apart.

Mridul Aanjaneya Automata Theory 43/ 49 Reduction of CSAT to 3-SAT

• Let (x1 +x 2 +...+x n) be a clause in some CSAT instance, withn ≥ 4. Note: thex’s are literals, not variables; any of them could be negated variables.

• Introduce new variablesy 1,...,yn−3 that appear in no other clause.

Mridul Aanjaneya Automata Theory 44/ 49 Reduction of CSAT to 3-SAT

• Replace (x1 +x 2 +...+x n) by (x1+x2+y1)(x3+y2+ -y1)... (xn−2+yn−3+ -yn−4)(xn−1+xn+ -yn−3). • If there is a satisfying assignment of thex’s for the CSAT instance, then one of the literalsx i must be made true.

• Assigny i = true ifj < i-1 andy j = false for largerj.

Mridul Aanjaneya Automata Theory 45/ 49 Reduction of CSAT to 3-SAT

• We are not done. • We also need to show that if the resulting 3SAT instance is satisfiable, then the original CSAT instance was satisfiable.

Mridul Aanjaneya Automata Theory 46/ 49 Reduction of CSAT to 3-SAT

• Suppose (x1+x2+y1)(x3+y2+ -y1)... (xn−2+yn−3+ -yn−4)(xn−1+xn+ -yn−3) is satisfiable, but none of thex’s is true.

• The first clause forcesy 1 = true.

• Then the second clause forcesy 2 = true. • And so on ... all they’s must be true. • But then the last clause is false.

Mridul Aanjaneya Automata Theory 47/ 49 Reduction of CSAT to 3-SAT

• There is a little more to the reduction, for handling clauses of 1 or2 literals.

• Replace (x) by (x+y1+y2)(x+y1+ -y2)(x+ -y1+y2)(x+ -y2+ -y2). • Replace (w+x) by (w+x+y)(w+x+ -y).

Mridul Aanjaneya Automata Theory 48/ 49 CSAT to 3-SAT Running Time

• This reduction is surely polynomial. • In fact, it is linear in the length of the CSAT instance. • Thus, we have polytime-reduced CSAT to 3-SAT. • Since CSAT is NP-complete, so is 3-SAT.

Mridul Aanjaneya Automata Theory 49/ 49