Theory of Computer Science for MSc Students, Spring 2007 Lecture 2 Lecturer: Dorit Aharonov Scribe: Bar Shalem and Amitai Gilad Revised: Shahar Dobzinski, March 2007

1 BPP and NP

The theory of computer science attempts to capture the notion of computation. To understand this notion, we first need to understand the computing devices. The commonly used definition of a computing device stems from the Church-Turing thesis:

The Church-Turing Thesis: A Turing machine can simulate any reasonable physical computational model.

In the sixties it became clear that we are interested in efficient computational devices. The definition of efficient is absolutely not straightforward. Nowadays, we are interested in understanding the power of polynomial time (i.e., polynomial-time Turing machines). However, it seems that even polynomial-time Turing machines do not fully capture the power of efficient computation: we might also let the machine err with some small probability. That is, we should also consider randomized algorithms. We have already seen the definitions of the relevant complexity classes, namely BPP and RP. Our current knowledge of the complexity classes is described in the following diagram:

As the diagram shows, NP contains RP (why?), but the relationship between NP and BPP is unknown. For most problems, if a randomized algorithm is known, then a deterministic algorithm is also known; primality testing is one example. It is believed that randomness adds no computational power.

In the recitation, we have also considered the class P/poly (languages decided by families of polynomial-size circuits). What is the power of randomness compared to this class? Surprisingly, we know of a very interesting connection, known as Hardness vs. Randomness: if there are hard problems (with "large" circuit size), then randomness adds no computational power, and vice versa. Later in this course we will go over this informally-described connection in more detail.

2 Polynomial Identity Testing

An interesting (and exceptional) example of a problem for which a randomized algorithm is known but no deterministic algorithm is known is testing whether two polynomials are identical. Recall that the degree of a polynomial over $n$ variables is defined as the maximum over the (total) degrees of its monomials. The input to the problem is two polynomials $P$ and $Q$ over $n$ variables $x_1, \ldots, x_n$ and some field $F$, and the question is whether $P - Q \equiv 0$. (Of course, the problem is equivalent to asking whether a given polynomial over $n$ variables is identically zero, by taking that polynomial to be $P - Q$.)

For polynomials in a single variable of degree $d$, a deterministic solution follows from algebraic considerations: arbitrarily choose $d + 1$ points in $F$ and evaluate both polynomials on them. If the polynomials agree on all $d + 1$ points, then they are identical. This algorithm is always correct. However, this solution is not applicable for general multivariate polynomials (a nonzero multivariate polynomial of low degree may still have many roots; e.g., $x \cdot y$ vanishes whenever $x = 0$). Thus we turn to a probabilistic algorithm using the Schwartz-Zippel lemma.

Lemma 1 (Schwartz-Zippel Lemma) Let $P \not\equiv 0$ be a polynomial of degree $d$ over the field $F$, and let $S \subseteq F$. Select, uniformly at random and independently, $n$ points $a_1, \ldots, a_n \in S$. Then
$$\Pr[P(a_1, \ldots, a_n) = 0] \le \frac{d}{|S|}.$$

The proof will be given in the recitation.

Sanity check: a polynomial $P \not\equiv 0$ in a single variable has at most $d$ (the degree) roots, i.e., points where $P = 0$. Hence, a random assignment from $S$ hits a root with probability at most $\frac{d}{|S|}$.

For the sake of our problem ($P \stackrel{?}{\equiv} 0$), we choose $|S| = 2d$ and pick a random assignment (uniformly over $S$) to our polynomial $P$. Using the Schwartz-Zippel lemma, if $P \not\equiv 0$ we have probability $\ge \frac{1}{2}$ to answer $P \not\equiv 0$. The general $P \stackrel{?}{\equiv} Q$ problem is resolved using the Schwartz-Zippel lemma by trivially converting it to the equivalent $(P - Q) \stackrel{?}{\equiv} 0$ problem.

Remark: The problem we have just described is in RP, in the sense that solving it without the power of randomization would solve all RP problems as well.
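To make the test concrete, here is a minimal Python sketch (not part of the original notes). It treats the polynomial as a black-box evaluation procedure over the field $Z_p$ for a prime $p$; the helper name probably_zero and the parameter choices are ours, for illustration only.

    import random

    def probably_zero(poly, n_vars, degree, prime, trials=20):
        # One-sided test: if poly is identically zero we always answer True;
        # otherwise, by Schwartz-Zippel with |S| = 2*degree, each trial detects
        # a nonzero value with probability >= 1/2, so the error is <= 2**(-trials).
        # `poly` is any callable taking n_vars elements of Z_prime and returning
        # the polynomial's value (taken mod prime below).
        s = 2 * degree
        assert s <= prime, "the field must contain at least 2*degree elements"
        for _ in range(trials):
            point = [random.randrange(s) for _ in range(n_vars)]  # uniform over S
            if poly(*point) % prime != 0:
                return False  # a witness: poly is certainly not the zero polynomial
        return True           # poly is probably (but not certainly) identically zero

    # Example: P - Q, where P = (x + y)^2 and Q = x^2 + 2xy + y^2 are identical.
    p = 10**9 + 7
    print(probably_zero(lambda x, y: (x + y)**2 - (x*x + 2*x*y + y*y), 2, 2, p))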

Application: Assume we have two (very long) vectors $w, z$ of length $n$, stored on separate remote machines. We would like to determine whether $w = z$ without broadcasting $w$ or $z$. We resort to polynomial identity checking in the following manner.

1. Select a field $S$ such that $|S| \ge 2n$. $S$ is known to both sides a priori.

2. Define the polynomials $W(x) \stackrel{\mathrm{def}}{=} \sum_i w_i x^i$ (side A) and $Z(x) \stackrel{\mathrm{def}}{=} \sum_i z_i x^i$ (side B). Notice that if $W - Z \equiv 0$ then $w = z$.

3. Side A chooses $x \in S$ uniformly at random and sends it to side B, together with $W(x)$.

4. If $W(x) = Z(x)$ then side B declares "$w = z$"; otherwise it declares "$w \neq z$".

Note that all broadcast messages are of length $O(\log n)$. As for the correctness of the protocol, notice that if $w$ is equal to $z$ then we always output that the vectors are equal. Otherwise, by the discussion above, we output that the vectors are not equal with probability at least $\frac{1}{2}$. The error probability can be reduced to any $\epsilon > 0$ by repeating the described procedure $O(\log \frac{1}{\epsilon})$ times.
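A possible simulation of this protocol in Python (both sides run in one process here; the fixed prime, the encoding of the vectors as coefficients, and the function names are assumptions of this sketch, not part of the notes):

    import random

    PRIME = 2**61 - 1  # a prime, so Z_PRIME is a field; it plays the role of S, with |S| >= 2n

    def poly_eval(vec, x):
        # Evaluate W(x) = sum_i vec[i] * x**i over Z_PRIME using Horner's rule.
        acc = 0
        for coeff in reversed(vec):
            acc = (acc * x + coeff) % PRIME
        return acc

    def equal_fingerprint(w, z):
        # Side A draws x uniformly at random and sends (x, W(x)); side B compares with Z(x).
        x = random.randrange(PRIME)
        return poly_eval(w, x) == poly_eval(z, x)

    w = [3, 1, 4, 1, 5, 9]
    print(equal_fingerprint(w, [3, 1, 4, 1, 5, 9]))  # w == z: always True
    print(equal_fingerprint(w, [3, 1, 4, 1, 5, 8]))  # w != z: wrong with probability <= (n-1)/PRIME

Using one large fixed prime (rather than a field of size exactly $2n$) keeps the sketch simple and makes the per-run error far below $\frac{1}{2}$; only $x$ and $W(x)$ would cross the channel.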

3 Finding a Perfect Matching

We will now see another application of the polynomial identity problem: finding a perfect matching in a graph. First recall that a matching in a graph $G$ is a set of pairwise non-adjacent edges. A perfect matching is a matching which covers all vertices of the graph (i.e., every vertex of the graph is incident to exactly one edge of the matching). Our goal will be to reduce the perfect matching problem to the polynomial identity problem. To do so, we first define the following matrix:
$$M_{i,j} = \begin{cases} x_{i,j}, & \text{if } (i,j) \in E \\ 0, & \text{otherwise.} \end{cases}$$

This means that the $(i,j)$ entry contains the variable $x_{i,j}$ if the edge $(i,j)$ appears in the graph, and $0$ otherwise. The key step is understanding that all we have to do is to check whether the determinant of $M$ is identically $0$ as a polynomial:

$$\det(M) = \sum_{\sigma \in S_n} (-1)^{\mathrm{sign}(\sigma)} \prod_{i=1}^{n} M_{i,\sigma(i)},$$
where $S_n$ is the set of all permutations of $\{1, \ldots, n\}$ and $\mathrm{sign}(\sigma)$ is the parity of the permutation $\sigma$. Computing the determinant results in a polynomial of degree at most $n$ over $n^2$ variables. We claim that there is no perfect matching in the graph if and only if $\det(M) = 0$.

Let us start by sketching the proof of the "only if" direction. First note that in a perfect matching no two edges are adjacent to a common vertex. Thus, the edges of a perfect matching form a one-to-one correspondence between the vertices $1, \ldots, n$ and themselves, i.e., a permutation $\sigma$; such a matching is represented by the monomial $x_{1,\sigma(1)} \cdots x_{n,\sigma(n)}$. Thus, if no perfect matching exists, every monomial in the sum contains a zero entry, meaning that $\det(M) = 0$.

As for the "if" direction, observe that if a perfect matching exists, it defines a unique permutation, represented by a unique monomial (each edge is represented by a different variable), so this monomial cannot be canceled by any other term. Therefore the sum is not zero, and thus $\det(M) \not\equiv 0$.

In order to determine whether there is a perfect matching in the bipartite graph $G$, we check whether the polynomial $\det(M)$ is identically zero, using the Schwartz-Zippel lemma. Notice that it is not clear that we can compute the determinant in polynomial time when the matrix is symbolic (the resulting polynomial may have exponentially many monomials); instead, we substitute random field elements for the variables and compute the resulting numeric determinant, which can be done in polynomial time.

One could wonder why we would want such a probabilistic algorithm if deterministic algorithms for this problem exist. One answer is that the algorithm we have just described can be parallelized.
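The following Python sketch carries out this plan for a bipartite graph: substitute independent random field elements for the variables $x_{i,j}$ and compute the numeric determinant modulo a prime by Gaussian elimination. The function names, the prime, and the number of trials are our choices for illustration, not part of the notes.

    import random

    def det_mod(mat, p):
        # Determinant of a square matrix over Z_p (p prime) via Gaussian elimination.
        n = len(mat)
        a = [row[:] for row in mat]
        det = 1
        for col in range(n):
            pivot = next((r for r in range(col, n) if a[r][col] % p != 0), None)
            if pivot is None:
                return 0
            if pivot != col:
                a[col], a[pivot] = a[pivot], a[col]
                det = -det
            det = det * a[col][col] % p
            inv = pow(a[col][col], p - 2, p)  # modular inverse, valid since p is prime
            for r in range(col + 1, n):
                factor = a[r][col] * inv % p
                for c in range(col, n):
                    a[r][c] = (a[r][c] - factor * a[col][c]) % p
        return det % p

    def has_perfect_matching(n, edges, p=10**9 + 7, trials=10):
        # Bipartite sides are {0,...,n-1} and {0,...,n-1}; edges is a set of pairs (i, j).
        # "True" is always correct; "False" errs with probability roughly (n/p) per trial,
        # by the Schwartz-Zippel lemma, since deg det(M) <= n.
        for _ in range(trials):
            m = [[random.randrange(1, p) if (i, j) in edges else 0 for j in range(n)]
                 for i in range(n)]
            if det_mod(m, p) != 0:
                return True
        return False

    # A 3x3 bipartite graph containing the perfect matching {(0,0), (1,1), (2,2)}.
    print(has_perfect_matching(3, {(0, 0), (1, 1), (2, 2), (0, 1)}))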

4 Polynomial Reductions

The class NP contains most of the "interesting" problems. For example, it contains mathematical theorems with polynomial-length proofs, and SAT. We would like to compare problems, i.e., to have an order of "hardness". This leads to the following definition of a reduction:

Definition 2 We say that a language $L$ is Karp-reducible to $L'$ if there exists a polynomial-time computable function $f$ such that
$$\forall x : x \in L \Leftrightarrow f(x) \in L'.$$

Thus, $L$ is "no harder" than $L'$. Another definition of a reduction is the Cook reduction:

Definition 3 We say that a language $L$ is Cook-reducible to $L'$ if there exists a polynomial-time Turing machine that decides $L$ with the help of an oracle Turing machine that decides $L'$.

Cook reductions are weaker than Karp reductions, since any Karp reduction is also a Cook reduction. We note that $\overline{SAT}$ is Cook-reducible to SAT (simply negate the oracle's answer; see the sketch below). Also, it is known that $\overline{SAT} \in$ coNP (prove it!), but we do not know whether SAT $\in$ coNP. Karp reductions, in contrast, preserve membership in the class: if $L$ is Karp-reducible to $L' \in$ NP, then $L \in$ NP as well.
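As an illustration, here is a small Python sketch of the Cook reduction of $\overline{SAT}$ to SAT mentioned above: a single oracle query whose answer is negated. The brute-force sat_oracle is only a stand-in for the oracle and is an assumption of this sketch.

    from itertools import product

    def sat_oracle(clauses, n_vars):
        # Stand-in for a SAT oracle (brute force, for illustration only).
        # A clause is a list of nonzero ints: i stands for x_i, -i for its negation.
        for assignment in product([False, True], repeat=n_vars):
            if all(any(assignment[abs(lit) - 1] == (lit > 0) for lit in clause)
                   for clause in clauses):
                return True
        return False

    def co_sat(clauses, n_vars):
        # A Cook reduction of the complement of SAT to SAT: one oracle query,
        # whose answer is negated. No analogous Karp reduction is known.
        return not sat_oracle(clauses, n_vars)

    print(co_sat([[1], [-1]], 1))  # x1 AND (NOT x1) is unsatisfiable -> True
    print(co_sat([[1, 2]], 2))     # x1 OR x2 is satisfiable -> False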

The next diagram illustrates the possible relationships between P and the NP problems. There are 3 possibilities:

[Figure: three possibilities — P = NP; P ⊊ NP with problems strictly between P and the NP-complete problems; P ⊊ NP with no such intermediate problems.]

We know that the rightmost option is not possible. This means that unless P = NP, there are NP problems which are a bit "easier" than the NP-complete ones, but still not in P. For example, consider 3-SAT: if we adapt the random walk we presented for 2-SAT to 3-SAT, we get a biased random walk (with probabilities $\frac{1}{3}$ and $\frac{2}{3}$ of moving toward or away from a fixed satisfying assignment, respectively). The probability of getting to a solution after $n$ iterations is $\left(\frac{3}{4}\right)^n$. As before, we will reach a solution with high probability (if one exists) after $3n \cdot \left(\frac{4}{3}\right)^n$ steps. This is better than $2^n$.
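A minimal Python sketch of such a biased random walk for 3-SAT (the restart schedule, clause encoding, and constants are our choices for illustration, not a tuned implementation):

    import random

    def random_walk_3sat(clauses, n_vars, restarts=None):
        # Start from a uniformly random assignment; while some clause is unsatisfied,
        # pick an unsatisfied clause and flip a uniformly random variable in it.
        # Each restart of 3*n_vars steps succeeds with probability roughly (3/4)**n_vars,
        # so about (4/3)**n_vars restarts suffice with high probability.
        if restarts is None:
            restarts = 10 * int((4 / 3) ** n_vars) + 10
        for _ in range(restarts):
            assignment = [random.random() < 0.5 for _ in range(n_vars)]
            for _ in range(3 * n_vars):
                unsat = [c for c in clauses
                         if not any(assignment[abs(lit) - 1] == (lit > 0) for lit in c)]
                if not unsat:
                    return assignment                      # satisfying assignment found
                lit = random.choice(random.choice(unsat))  # random literal of a random unsatisfied clause
                assignment[abs(lit) - 1] = not assignment[abs(lit) - 1]
        return None  # probably unsatisfiable (one-sided error)

    # (x1 or x2 or x3) and (not x1 or x2 or not x3) and (x1 or not x2 or x3)
    print(random_walk_3sat([[1, 2, 3], [-1, 2, -3], [1, -2, 3]], 3))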
