Notes on Complexity Theory (last updated: November 2011)
Lecture 24
Jonathan Katz

1 The Complexity of Counting

We explore three results related to the hardness of counting. Interestingly, at their core each of these results relies on a simple, yet powerful, technique due to Valiant and Vazirani.

1.1 Hardness of Unique-SAT

Does SAT become any easier if we are guaranteed that the formula we are given has at most one solution? Alternately, if we are guaranteed that a given boolean formula has a unique solution, does it become any easier to find it? We show here that this is not likely to be the case. Define the following promise problem:

$$\mathsf{USAT} \stackrel{\text{def}}{=} \{\phi : \phi \text{ has exactly one satisfying assignment}\}$$
$$\overline{\mathsf{USAT}} \stackrel{\text{def}}{=} \{\phi : \phi \text{ is unsatisfiable}\}.$$

Clearly, this problem is in promise-NP. We show that if it is in promise-P, then NP = RP. We begin with a lemma about pairwise-independent hashing.

Lemma 1 Let $S \subseteq \{0,1\}^n$ be an arbitrary set with $2^m \le |S| \le 2^{m+1}$, and let $H_{n,m+2}$ be a family of pairwise-independent hash functions mapping $\{0,1\}^n$ to $\{0,1\}^{m+2}$. Then

$$\Pr_{h \in H_{n,m+2}}\left[\text{there is a unique } x \in S \text{ with } h(x) = 0^{m+2}\right] \ge 1/8.$$

Proof Let $0 \stackrel{\text{def}}{=} 0^{m+2}$, and let $p \stackrel{\text{def}}{=} 2^{-(m+2)}$. Let $N$ be the random variable (over choice of random $h \in H_{n,m+2}$) denoting the number of $x \in S$ for which $h(x) = 0$. Using the inclusion/exclusion principle (summing below over ordered pairs $x \ne x'$), we have

$$\Pr[N \ge 1] \ \ge\ \sum_{x \in S} \Pr[h(x) = 0] - \frac{1}{2} \sum_{x \ne x' \in S} \Pr[h(x) = h(x') = 0] \ =\ |S| \cdot p - \binom{|S|}{2} p^2,$$

while $\Pr[N \ge 2] \le \frac{1}{2} \sum_{x \ne x' \in S} \Pr[h(x) = h(x') = 0] = \binom{|S|}{2} p^2$. So

$$\Pr[N = 1] = \Pr[N \ge 1] - \Pr[N \ge 2] \ \ge\ |S| \cdot p - 2 \binom{|S|}{2} p^2 \ \ge\ |S| p - |S|^2 p^2 \ \ge\ 1/8,$$

using the fact that $|S| \cdot p \in [\frac{1}{4}, \frac{1}{2}]$.

Theorem 2 (Valiant-Vazirani) If $(\mathsf{USAT}, \overline{\mathsf{USAT}})$ is in promise-RP, then NP = RP.

Proof If $(\mathsf{USAT}, \overline{\mathsf{USAT}})$ is in promise-RP, then there is a probabilistic polynomial-time algorithm $A$ such that

$$\phi \in \mathsf{USAT} \Rightarrow \Pr[A(\phi) = 1] \ge 1/2$$
$$\phi \in \overline{\mathsf{USAT}} \Rightarrow \Pr[A(\phi) = 1] = 0.$$

We design a probabilistic polynomial-time algorithm $B$ for SAT as follows: on input an $n$-variable boolean formula $\phi$, first choose uniform $m \in \{0, \ldots, n-1\}$. Then choose random $h \leftarrow H_{n,m+2}$. Using the Cook-Levin reduction, rewrite the expression $\psi(x) \stackrel{\text{def}}{=} \phi(x) \wedge \left(h(x) = 0^{m+2}\right)$ as a boolean formula $\phi'(x,z)$, using additional variables $z$ if necessary. (Since $h$ is efficiently computable, the size of $\phi'$ will be polynomial in the size of $\phi$. Furthermore, the number of satisfying assignments to $\phi'(x,z)$ will be the same as the number of satisfying assignments of $\psi$.) Output $A(\phi')$.

If $\phi$ is not satisfiable then $\phi'$ is not satisfiable, so $A$ (and hence $B$) always outputs 0. If $\phi$ is satisfiable, with $S$ denoting the set of satisfying assignments, then with probability $1/n$ the value of $m$ chosen by $B$ is such that $2^m \le |S| \le 2^{m+1}$. In that case, Lemma 1 shows that with probability at least $1/8$ the formula $\phi'$ will have a unique satisfying assignment, in which case $A$ outputs 1 with probability at least $1/2$. We conclude that when $\phi$ is satisfiable, $B$ outputs 1 with probability at least $1/(16n)$.
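To make the reduction concrete, here is a minimal Python sketch of one round of algorithm $B$, under simplifying assumptions: formulas are modeled as Python predicates over bit-tuples (so the Cook-Levin rewriting step is implicit), the pairwise-independent family is instantiated as a random affine map $h(x) = Ax + b$ over GF(2), and `brute_force_A` is a hypothetical stand-in for the promise-RP algorithm $A$ (an exhaustive unique-SAT check, exponential time, for illustration only).

```python
import random
from itertools import product

def random_affine_hash(n, m):
    """Pairwise-independent h: {0,1}^n -> {0,1}^m given by h(x) = Ax + b over GF(2)."""
    A = [[random.randrange(2) for _ in range(n)] for _ in range(m)]
    b = [random.randrange(2) for _ in range(m)]
    def h(x):
        return tuple((sum(A[i][j] & x[j] for j in range(n)) + b[i]) % 2
                     for i in range(m))
    return h

def vv_round(phi, n, A_oracle):
    """One round of algorithm B: pick uniform m in {0,...,n-1} and random h,
    form psi(x) = phi(x) AND (h(x) = 0^{m+2}), and run A on psi."""
    m = random.randrange(n)
    h = random_affine_hash(n, m + 2)
    psi = lambda x: phi(x) and h(x) == (0,) * (m + 2)
    return A_oracle(psi, n)

def brute_force_A(psi, n):
    """Hypothetical stand-in for A: accepts iff psi has exactly one
    satisfying assignment (exponential time; for illustration only)."""
    return 1 if sum(map(psi, product((0, 1), repeat=n))) == 1 else 0

# phi(x) = x1 OR x2 over n = 3 variables has 6 satisfying assignments;
# each round isolates a unique assignment with probability at least 1/(8n).
phi = lambda x: x[0] or x[1]
trials = 1000
hits = sum(vv_round(phi, 3, brute_force_A) for _ in range(trials))
print(f"unique assignment isolated in {hits}/{trials} rounds")
```

Since the stand-in here decides unique-SAT exactly, the empirical success rate reflects only the hashing step; with the actual promise-RP algorithm $A$ one loses the further factor of $1/2$, giving the $1/(16n)$ bound from the proof.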
1.2 Approximate Counting, and Relating #P to NP

#P is clearly not weaker than NP, since if we can count solutions then we can certainly tell if any exist. Although #P is (in some sense) "harder" than NP, we show that any problem in #P can be probabilistically approximated in polynomial time using an NP oracle. (This is reminiscent of the problem of reducing search to decision, except that here we are reducing counting the number of witnesses to the decision problem of whether or not a witness exists. Also, we are only obtaining an approximation, and we use randomization.)

We focus on the #P-complete problem #SAT. Let $\#SAT(\phi)$ denote the number of satisfying assignments of a boolean formula $\phi$. We show that for any polynomial $p$ there exists a ppt algorithm $A$ such that

$$\Pr\left[\#SAT(\phi) \cdot \left(1 - \frac{1}{p(|\phi|)}\right) \le A^{NP}(\phi) \le \#SAT(\phi) \cdot \left(1 + \frac{1}{p(|\phi|)}\right)\right] \ge 1 - 2^{-p(|\phi|)}; \quad (1)$$

that is, $A$ approximates $\#SAT(\phi)$ (the number of satisfying assignments to $\phi$) to within a factor $\left(1 \pm \frac{1}{p(|\phi|)}\right)$ with high probability.

The first observation is that it suffices to obtain a constant-factor approximation. Indeed, say we have an algorithm $B$ such that

$$\frac{1}{64} \cdot \#SAT(\phi) \le B^{NP}(\phi) \le 64 \cdot \#SAT(\phi). \quad (2)$$

(For simplicity we assume $B$ always outputs an approximation satisfying the above; any failure probability of $B$ propagates in the obvious way.) We can construct an algorithm $A$ satisfying (1) as follows: on input $\phi$, set $q = \log 64 \cdot p(|\phi|)$ and compute $t = B(\phi')$, where

$$\phi' \stackrel{\text{def}}{=} \bigwedge_{i=1}^{q} \phi(x_i)$$

and the $x_i$ denote independent sets of variables. $A$ then outputs $t^{1/q}$.

Letting $N$ (resp., $N'$) denote the number of satisfying assignments to $\phi$ (resp., $\phi'$), note that $N' = N^q$. Since $t$ satisfies $\frac{1}{64} \cdot N' \le t \le 64 \cdot N'$, the output of $A$ lies in the range

$$\left[2^{-1/p(|\phi|)} \cdot N,\ 2^{1/p(|\phi|)} \cdot N\right] \subseteq \left[\left(1 - \frac{1}{p(|\phi|)}\right) \cdot N,\ \left(1 + \frac{1}{p(|\phi|)}\right) \cdot N\right],$$

as desired. In the last step, we use the following inequalities, which hold for all $x \ge 1$:

$$\left(\frac{1}{2}\right)^{1/x} \ge 1 - \frac{1}{x} \qquad \text{and} \qquad 2^{1/x} \le 1 + \frac{1}{x}.$$

The next observation is that we can obtain a constant-factor approximation by solving the promise problem $(\Pi_Y, \Pi_N)$ given by:

$$\Pi_Y \stackrel{\text{def}}{=} \{(\phi, k) \mid \#SAT(\phi) > 8k\}$$
$$\Pi_N \stackrel{\text{def}}{=} \{(\phi, k) \mid \#SAT(\phi) < k/8\}.$$

Given an algorithm $C$ solving this promise problem, we can construct an algorithm $B$ satisfying (2) as follows. (Once again, we assume $C$ is deterministic; if $C$ errs with non-zero probability we can handle it in the straightforward way.) On input $\phi$ do:

- Set $i = 0$.
- While $C((\phi, 8^i)) = 1$, increment $i$.
- Return $8^{i - \frac{1}{2}}$.

Let $i^*$ be the value of $i$ at the end of the algorithm, and set $\alpha = \log_8 \#SAT(\phi)$. In the second step, we know that $C((\phi, 8^i))$ outputs 1 as long as $\#SAT(\phi) > 8^{i+1}$ or, equivalently, $\alpha > i + 1$. So we end up with an $i^*$ satisfying $i^* \ge \alpha - 1$. We also know that $C((\phi, 8^i))$ will output 0 whenever $i > \alpha + 1$, and so the algorithm above must stop at the first (integer) $i$ to satisfy this. Thus, $i^* \le \alpha + 2$. Putting this together, we see that our output value satisfies

$$\#SAT(\phi)/64 \ <\ 8^{i^* - \frac{1}{2}} \ <\ 64 \cdot \#SAT(\phi),$$

as desired. (Note that we assume nothing about the behavior of $C$ when $(\phi, 8^i) \notin \Pi_Y \cup \Pi_N$.)
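The following Python sketch mirrors this construction under the same modeling assumptions as before: formulas are predicates over bit-tuples, and the promise oracle $C$ is instantiated by an exact (exponential-time) brute-force count, which is one valid behavior consistent with the promise. `B` performs the geometric search just described, and `A` performs the $q$-fold amplification from earlier in this section; all names are illustrative, not part of any real library.

```python
from itertools import product

def exact_count(phi, n):
    """Brute-force #SAT(phi); a hypothetical stand-in used only to
    instantiate the promise oracle C (exponential time)."""
    return sum(map(phi, product((0, 1), repeat=n)))

def C(phi, n, k):
    """Stand-in for C: answering 1 iff #SAT(phi) > 8k is consistent with
    the promise (1 on Pi_Y, 0 on Pi_N, anything allowed in the gap)."""
    return 1 if exact_count(phi, n) > 8 * k else 0

def B(phi, n):
    """Algorithm B: probe k = 8^0, 8^1, ... until C answers 0, then output
    8^{i - 1/2}, a 64-factor approximation to #SAT(phi)."""
    i = 0
    while C(phi, n, 8 ** i) == 1:
        i += 1
    return 8 ** (i - 0.5)

def A(phi, n, p):
    """Amplification: conjoin q = (log 64) * p independent copies of phi
    (so counts multiply: N' = N^q), run B, and take the q-th root."""
    q = 6 * p  # log2(64) = 6; base-2 logs throughout
    phi_q = lambda xs: all(phi(xs[i * n:(i + 1) * n]) for i in range(q))
    return B(phi_q, q * n) ** (1.0 / q)

phi = lambda x: x[0] or x[1]   # over n = 2 variables: #SAT(phi) = 3
print(B(phi, 2))               # within a factor of 64 of 3
print(A(phi, 2, p=1))          # within a factor of (1 +/- 1/p) of 3
```

Because the stand-in oracle here is deterministic and exact, the $2^{-p(|\phi|)}$ failure probability of (1) does not arise in this toy run; with a probabilistic $C$ it would be handled by repetition, as noted above.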
Finally, we show that we can probabilistically solve $(\Pi_Y, \Pi_N)$ using an NP oracle. This is just another application of the Valiant-Vazirani technique. Here we rely on the following lemma:

Lemma 3 Let $H_{n,m}$ be a family of pairwise-independent hash functions mapping $\{0,1\}^n$ to $\{0,1\}^m$, and let $\varepsilon > 0$. Let $S \subseteq \{0,1\}^n$ be arbitrary with $|S| \ge \varepsilon^{-3} \cdot 2^m$. Then:

$$\Pr_{h \in H_{n,m}}\left[(1 - \varepsilon) \cdot \frac{|S|}{2^m} \le \left|\{x \in S \mid h(x) = 0^m\}\right| \le (1 + \varepsilon) \cdot \frac{|S|}{2^m}\right] > 1 - \varepsilon.$$

Proof Define for each $x \in S$ an indicator random variable $\delta_x$ such that $\delta_x = 1$ iff $h(x) = 0^m$ (and $\delta_x = 0$ otherwise). Note that the $\delta_x$ are pairwise-independent random variables with expectation $2^{-m}$ and variance $2^{-m} \cdot (1 - 2^{-m})$. Let $Y \stackrel{\text{def}}{=} \sum_{x \in S} \delta_x = |\{x \in S \mid h(x) = 0^m\}|$. The expectation of $Y$ is $|S|/2^m$, and its variance is $\frac{|S|}{2^m} \cdot (1 - 2^{-m})$ (using pairwise independence of the $\delta_x$). Using Chebyshev's inequality, we obtain:

$$\Pr\left[(1 - \varepsilon) \cdot \mathrm{Exp}[Y] \le Y \le (1 + \varepsilon) \cdot \mathrm{Exp}[Y]\right] = \Pr\left[\left|Y - \mathrm{Exp}[Y]\right| \le \varepsilon \cdot \mathrm{Exp}[Y]\right] \ge 1 - \frac{\mathrm{Var}[Y]}{(\varepsilon \cdot \mathrm{Exp}[Y])^2} = 1 - \frac{(1 - 2^{-m}) \cdot 2^m}{\varepsilon^2 \cdot |S|},$$

which is greater than $1 - \varepsilon$ for $|S|$ as stated in the lemma.

The algorithm solving $(\Pi_Y, \Pi_N)$ is as follows. On input $(\phi, k)$ with $k > 1$ (note that a solution is trivial for $k = 1$), set $m = \lfloor \log k \rfloor$, choose a random $h$ from $H_{n,m}$, and then query the NP oracle on the statement $\phi'(x) \stackrel{\text{def}}{=} \left(\phi(x) \wedge (h(x) = 0^m)\right)$ and output the result. An analysis follows.

Case 1: $(\phi, k) \in \Pi_Y$, so $\#SAT(\phi) > 8k$. Let $S_\phi = \{x \mid \phi(x) = 1\}$. Then $|S_\phi| > 8k \ge 8 \cdot 2^m$. So:

$$\Pr[\phi' \in SAT] = \Pr\left[\{x \in S_\phi : h(x) = 0^m\} \ne \emptyset\right] \ge \Pr\left[\left|\{x \in S_\phi : h(x) = 0^m\}\right| \ge 4\right] \ge \frac{1}{2},$$

which we obtain by applying Lemma 3 with $\varepsilon = \frac{1}{2}$ (the lemma applies since $|S_\phi| \ge 8 \cdot 2^m = \varepsilon^{-3} \cdot 2^m$, and guarantees at least $\frac{1}{2} \cdot |S_\phi|/2^m \ge 4$ elements hashing to $0^m$).

Case 2: $(\phi, k) \in \Pi_N$, so $\#SAT(\phi) < k/8$. Let $S_\phi$ be as before. Now $|S_\phi| < k/8 \le 2^m/4$. So:

$$\Pr[\phi' \in SAT] = \Pr\left[\{x \in S_\phi : h(x) = 0^m\} \ne \emptyset\right] \le \sum_{x \in S_\phi} \Pr[h(x) = 0^m] < \frac{2^m}{4} \cdot 2^{-m} = \frac{1}{4},$$

where we have applied a union bound in the second step.
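A minimal sketch of this final algorithm, under the same modeling assumptions as the earlier sketches: predicates stand in for formulas, the random affine map $h(x) = Ax + b$ over GF(2) instantiates the pairwise-independent family (the same construction as in the Section 1.1 sketch), and `np_oracle` is a hypothetical brute-force stand-in for the NP oracle.

```python
import random
from itertools import product

def random_affine_hash(n, m):
    """Pairwise-independent h: {0,1}^n -> {0,1}^m, h(x) = Ax + b over GF(2)."""
    A = [[random.randrange(2) for _ in range(n)] for _ in range(m)]
    b = [random.randrange(2) for _ in range(m)]
    return lambda x: tuple((sum(A[i][j] & x[j] for j in range(n)) + b[i]) % 2
                           for i in range(m))

def np_oracle(psi, n):
    """Hypothetical stand-in for the NP oracle: brute-force SAT check."""
    return any(map(psi, product((0, 1), repeat=n)))

def solve_promise(phi, n, k):
    """The algorithm above: accepts with probability >= 1/2 when
    (phi, k) is in Pi_Y and with probability <= 1/4 when it is in Pi_N."""
    if k == 1:
        # Trivial case: just ask whether phi is satisfiable at all.
        return 1 if np_oracle(phi, n) else 0
    m = k.bit_length() - 1              # m = floor(log k) for integer k > 1
    h = random_affine_hash(n, m)
    psi = lambda x: phi(x) and h(x) == (0,) * m
    return 1 if np_oracle(psi, n) else 0
```

The constant gap between the two acceptance probabilities ($\ge 1/2$ on $\Pi_Y$ versus $\le 1/4$ on $\Pi_N$) can then be amplified by independent repetition and majority vote, which is what makes the overall approximation algorithm for (1) probabilistic.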