
Princeton University COS 522: Computational Complexity

Lecture 6: Randomized Computation

Lecturer: Sanjeev Arora    Scribe: Manoj

A randomized TM has a transition diagram similar to that of a nondeterministic TM: multiple transitions are possible out of each state. Whenever the machine can validly take more than one outgoing transition, we assume it chooses randomly among them. For now we assume that there is either one outgoing transition (a deterministic step) or two, in which case the machine chooses each with probability 1/2. (We will see later that this simplifying assumption does not affect any of the theory we develop.) Thus the machine is assumed to have a fair random coin. The new quantity of interest is the probability that the machine accepts an input. The probability is taken over the coin flips of the machine.

1 RP and BPP

Definition 1 RP is the class of languages L such that there is a polynomial time randomized TM M such that

x ∈ L ⇒ Pr[M accepts x] ≥ 1/2   (1)
x ∉ L ⇒ Pr[M accepts x] = 0   (2)

We can also define a class where we allow two-sided errors.

Definition 2 BPP is the class of all languages L such that there is a polynomial time randomized TM M such that

x ∈ L ⇒ Pr[M accepts x] ≥ 2/3   (3)
x ∉ L ⇒ Pr[M accepts x] ≤ 1/3   (4)

As in the case of NP, we can give an alternative definition for these classes. Instead of the TM flipping coins by itself, we think of a string of coin flips provided to the TM as an additional input.

Definition 3 BPP is the class of all languages L such that there is a polynomial time deterministic TM M and c > 0 such that

x ∈ L ⇒ Pr_{r ∈ {0,1}^{|x|^c}}[M accepts (x, r)] ≥ 2/3   (5)
x ∉ L ⇒ Pr_{r ∈ {0,1}^{|x|^c}}[M accepts (x, r)] ≤ 1/3   (6)
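Definition 3 makes the randomness explicit, so for short random strings the acceptance probability can be computed by enumerating every r. The following Python sketch is our own toy illustration: the stand-in machine M and the helper name acceptance_probability are ours, not from the notes.

```python
from itertools import product

def acceptance_probability(M, x, rlen):
    """Exact Pr over r in {0,1}^rlen that M accepts (x, r), by enumeration."""
    accepts = sum(M(x, r) for r in product((0, 1), repeat=rlen))
    return accepts / 2 ** rlen

# Toy stand-in machine: accepts iff some random bit "hits" a 1-bit of x.
M = lambda x, r: any(a and b for a, b in zip(x, r))
p = acceptance_probability(M, (1, 0, 1), 3)   # 6 of the 8 strings r accept
```

Note that this toy M has one-sided error in the style of Definition 1: an input with no 1-bits is never accepted, whatever r is.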

October 5, 2005

We make a few observations about randomized classes. First, RP ⊆ NP, since a random string that causes the machine to accept is a certificate that the input is in the language. We do not know if BPP is in NP. Note that BPP is closed under complementation. We do not know of any complete problems for these classes. One difficulty seems to be that the problem of determining whether or not a given randomized polynomial time TM is an RP machine is undecidable.

1.1 Probability Amplification

In the definition of BPP above, the actual values in place of 2/3 and 1/3 are not crucial, as we observe below. These two numbers can be replaced by 1 − 1/2^n and 1/2^n without changing the class.

By repeatedly running a machine M accepting a BPP language L with probabilities as above, and taking the majority vote of the different decisions it makes, we can get a much better probability of a correct answer. Using a polynomial (in input size) number of repeated trials, we get an exponentially small probability of deciding the input wrongly. This follows from bounding the tail of a binomial distribution using Chernoff bounds.
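To see the amplification concretely, the exact majority-vote error can be tabulated from the binomial distribution. The Python sketch below is our own illustration (the function name majority_error is not from the notes); it uses the 2/3 correctness probability of Definition 2.

```python
from fractions import Fraction
from math import comb

def majority_error(t, p=Fraction(2, 3)):
    """Probability that the majority vote of t independent runs errs,
    when each run is correct with probability p (t odd)."""
    q = 1 - p
    # The majority is wrong iff at most t//2 of the runs are correct.
    return sum(comb(t, i) * p**i * q**(t - i) for i in range(t // 2 + 1))

# majority_error(1) = 1/3, majority_error(3) = 7/27, and the error
# keeps dropping exponentially in t, as the Chernoff bound predicts.
```

For p = 2/3 the Chernoff/Hoeffding bound gives error at most e^(−t/18), so a polynomial number of trials indeed pushes the error below 1/2^n.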

2 Theorems about BPP

Now we show that all BPP languages have polynomial sized circuits.

Theorem 1
BPP ⊆ P/poly

Proof: For any language L ∈ BPP consider a TM M (taking a random string as additional input) which decides any input x with an exponentially low probability of error; such an M exists by the probability amplification arguments mentioned earlier. In particular, let the probability of M(x, r) being wrong (taken over r) be at most 1/2^(n+1), where n = |x|. Say that r is bad for x if M(x, r) is an incorrect answer. Let t be the total number of choices for r. For each x, at most t/2^(n+1) values of r are bad. Adding over all x, we conclude that at most 2^n · t/2^(n+1) = t/2 values of r are bad for some x. In other words, at least t/2 choices of r are not bad for any x. (Equivalently, if we choose a random element r of these t choices, the probability that r is bad for some x is less than 1; this is a nonconstructive proof that a universally good r exists.) In particular there is some one value of r for which M(x, r) decides x correctly for every x. We can hard-wire this r into a polynomial size circuit. Given an input x, the circuit simulates M on (x, r). Hence L ∈ P/poly. ✷
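The counting at the heart of the proof is just a union bound, and it can be checked numerically. The toy Python sketch below is our own illustration (the bad sets are arbitrary stand-ins for the machine's error sets), with n = 4 and t = 2^8.

```python
import random

n, m = 4, 8
t = 1 << m                     # t = 256 choices of the random string r
rng = random.Random(0)

# For each n-bit input x, the amplified machine errs on at most
# t / 2^(n+1) = 8 random strings; model these as arbitrary bad sets.
bad = {x: set(rng.sample(range(t), t >> (n + 1))) for x in range(1 << n)}

# Union bound: 2^n inputs * t/2^(n+1) bad strings each = t/2 overall.
union_bad = set().union(*bad.values())
assert len(union_bad) <= t // 2

# So some r is bad for no input; hard-wiring it gives the circuit.
good_r = next(r for r in range(t) if r not in union_bad)
```

However the bad sets are chosen, at least half of the r's survive, which is exactly why a single hard-wired r suffices for every input.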

The next theorem relates BPP to the polynomial hierarchy.

Theorem 2
BPP ⊆ Σ2^p ∩ Π2^p

Proof: It is enough to prove that BPP ⊆ Σ2^p because BPP is closed under complementation. Suppose L ∈ BPP, and M is a randomized TM for L that uses m random bits such that x ∈ L ⇒ Pr_r[M(x, r) accepts] ≥ 1 − 2^−n and x ∉ L ⇒ Pr_r[M(x, r) accepts] ≤ 2^−n. Fix an input x, and let S_x denote the set of r for which M accepts (x, r). Then either |S_x| ≥ (1 − 2^−n)2^m or |S_x| ≤ 2^−n · 2^m, depending on whether or not x ∈ L. We will show how to guess, using two alternations, which of the two cases is true.

Consider r as an element of GF(2)^m. For a set S ⊂ GF(2)^m we define the shift of S by r_0 as {r + r_0 | r ∈ S} (where the addition is as in GF(2)^m, i.e., bit-wise XOR).

Suppose x ∈ L, so |S_x| ≥ (1 − 2^−n)2^m. We shall show that there are a small number of vectors such that the set of shifts of S by these vectors covers the whole space GF(2)^m.

Lemma 3
∃ r_1, r_2, ..., r_k, where k = ⌈m/n⌉ + 1 (so that nk > m), such that ∪_{i=1}^k (S + r_i) = GF(2)^m.

Proof: We shall show that Pr_{r_1,...,r_k}[∪_{i=1}^k (S + r_i) ≠ GF(2)^m] < 1/2.
Let z ∈ GF(2)^m. If r_1 is a random vector, then so is z + r_1. So Pr_{r_1}[z ∉ S + r_1] = Pr_{r_1}[z + r_1 ∉ S] ≤ 2^−n. The shifts are chosen independently, so for each z,

Pr_{r_1,...,r_k}[z ∉ ∪_{i=1}^k (S + r_i)] ≤ 2^−nk

So, Pr_{r_1,...,r_k}[some z ∉ ∪_{i=1}^k (S + r_i)] ≤ 2^{m−nk} < 1/2 < 1. ✷

Now suppose x ∉ L. Now |S_x| ≤ 2^−n · 2^m. This is a small set, and the union of any k shifts of S can be of size at most k · 2^{m−n} < 2^m, and hence cannot equal GF(2)^m. Thus we have established that

x ∈ L ⇔ ∃ r_1, r_2, ..., r_k ∈ GF(2)^m such that
∀ z ∈ GF(2)^m, M accepts x using at least one of z + r_1, z + r_2, ..., z + r_k   (7)

Since the condition inside the quantifiers can be checked in polynomial time, L is in Σ2^p. ✷
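On a toy scale, both cases of this argument can be verified by brute force. The Python sketch below is our own illustration (the helper shifts_cover is ours), with m = 8 and n = 3, so k = ⌈m/n⌉ + 1 = 4; shifts are bit-wise XOR, as in the lemma.

```python
import math, random

def shifts_cover(S, rs, m):
    """Do the shifts S + r_i (bitwise XOR) cover all of GF(2)^m?"""
    hit = {s ^ r for r in rs for s in S}
    return len(hit) == 1 << m

m, n = 8, 3
k = math.ceil(m / n) + 1          # k = 4, so nk > m and k < 2^n
rng = random.Random(0)

# Case x in L: |S| >= (1 - 2^-n) 2^m = 224 of the 256 points.
big_S = set(rng.sample(range(1 << m), 224))
found = any(shifts_cover(big_S, [rng.randrange(1 << m) for _ in range(k)], m)
            for _ in range(1000))  # each try succeeds w.p. >= 1 - 2^(m-nk)
assert found

# Case x not in L: |S| <= 2^(m-n) = 32, so k shifts hit at most
# k * 32 = 128 < 256 points and can never cover the space.
small_S = set(rng.sample(range(1 << m), 32))
rs = [rng.randrange(1 << m) for _ in range(k)]
assert not shifts_cover(small_S, rs, m)
```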

3 Model Independence

Now we argue that our specific assumptions about randomized TMs do not affect the classes RP and BPP. We made the assumption that the Turing machine uses a two-sided fair coin for randomization. Now we shall see that it can use this to simulate a machine that uses k-sided coins. To simulate a fair k-sided coin, proceed as follows: flip our two-sided coin ⌈log₂ k⌉ times. If the binary number generated is in the range [0, k − 1], output it as the result; else repeat the experiment. Each round succeeds with probability greater than 1/2, so this terminates after an expected O(1) number of rounds, and the probability that it has not halted after k rounds is at most 2^−k. Since the machine needs to simulate only poly(n) coin tosses, the probability that this simulation fails can be made 2^−n, and so it does not substantially affect the probability of acceptance (which is something like 1/2 or 2/3).

Another issue that arises in real life is that one may not have an unbiased coin: the probability that it comes up heads may be an unknown quantity p. But we can simulate an unbiased coin as follows: flip the (biased) coin twice (we are assuming independence of different flips). Interpret 01 as 0 and 10 as 1; on a 00 or 11, repeat the experiment. The probability that a given trial fails to produce a 0 or 1 is p² + (1 − p)²; equivalently, each trial succeeds with probability 2p(1 − p). Since p is a constant strictly between 0 and 1, this failure probability is a constant less than 1, and the expected number of repetitions before we produce a bit is O(1).
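Both simulations can be written down directly. The Python sketch below is our own illustration of the two constructions described above (the names k_sided and unbiased are ours).

```python
import random

def k_sided(k, flip):
    """Fair k-sided coin from a fair coin: draw ceil(log2 k) bits,
    retry if the resulting number falls outside [0, k-1]."""
    nbits = max(1, (k - 1).bit_length())
    while True:
        x = 0
        for _ in range(nbits):
            x = 2 * x + flip()
        if x < k:              # success probability > 1/2 per round
            return x

def unbiased(biased_flip):
    """von Neumann trick: fair bit from a coin of unknown bias p.
    Flip twice; 01 -> 0, 10 -> 1; retry on 00 or 11."""
    while True:
        a, b = biased_flip(), biased_flip()
        if a != b:
            return a           # fails w.p. p^2 + (1-p)^2 per trial
```

Both loops terminate after O(1) expected rounds, matching the analysis above.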

4 Recycling Entropy

Randomness may be considered a resource, because a random bit generator is likely to be much slower than the rest of the computer. So one would like to minimize the number of random bits used. One place where we can recycle random bits is while repeating a BPP or RP computation. Consider running an RP machine on some input x. If it accepts x, we know x is in the language L. But if it rejects x, we repeat the experiment, up to k times. To repeat the experiment, however, it is not necessary to acquire fresh random bits each time. Intuitively, we can “recycle” most of the randomness in the previous random string: when the experiment fails to detect that x is in L, all we learn about the random string used is that it lies in the set of bad random strings. Thus it still has a lot of “randomness” left in it.

Definition 4 If d > 2 and β > 0, a family of (d, β)-expanders is a sequence of graphs {G_n} where G_n is a d-regular graph on n nodes, and has the property that every subset S of at most n/2 nodes has edges to at least β|S| nodes outside S.
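For a small graph, the expansion constant β of Definition 4 can be computed by brute force over all subsets of at most n/2 nodes. The Python sketch below is our own illustration; it uses the 3-regular Petersen graph as a small stand-in (it is a single graph, not a member of an actual expander family), and the helper outside_neighbors is ours.

```python
from itertools import combinations

# Petersen graph: 3-regular on 10 nodes (outer 5-cycle, inner pentagram,
# spokes between them).
edges = ([(i, (i + 1) % 5) for i in range(5)]            # outer cycle
         + [(5 + i, 5 + (i + 2) % 5) for i in range(5)]  # inner pentagram
         + [(i, 5 + i) for i in range(5)])               # spokes
adj = {v: set() for v in range(10)}
for u, v in edges:
    adj[u].add(v)
    adj[v].add(u)

def outside_neighbors(S):
    """Nodes outside S that receive an edge from S."""
    return {w for v in S for w in adj[v]} - set(S)

# beta = min over nonempty S with |S| <= n/2 of |N(S) \ S| / |S|.
beta = min(len(outside_neighbors(S)) / len(S)
           for size in range(1, 6)
           for S in combinations(range(10), size))
```

Since the graph is connected, β > 0, and d-regularity forces β ≤ d; a genuine expander family keeps β bounded away from 0 as n grows.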

Deep mathematics (and more recently, simpler mathematics) has been used to construct expanders. These constructions are explicit: given n and the index of a node in G_n, one can produce the indices of that node's neighbors in poly(log n) time. Suppose the RP algorithm uses m random bits. The naïve approach to reduce its error probability to 2^−k uses O(mk) random bits. A better approach involving recycling is as follows. Pick the first random string as a node in an expander graph on 2^m nodes, take a random walk of length k from there, and use the indices of the nodes encountered in this walk (a total of mk bits) as the random strings for the runs of your RP algorithm. Since the graph has constant degree, each step of the random walk needs O(1) random bits. So a random walk of length k takes m + O(k) random bits: m to produce the first node and O(1) for each subsequent step. A surprising result shows that this sequence is random enough to guarantee the probability amplification, but we will not give the proof here. Of course, the same approach works for any randomized algorithm, such as a Monte Carlo simulation.
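The bit accounting of the recycled walk is easy to see in code. The Python sketch below is our own illustration: the degree-4 circulant neighbor rule is merely a placeholder for a genuine explicit expander (it is not one), but the count of truly random bits consumed, m for the start plus O(1) per step, is the point.

```python
import random

def recycled_strings(m, k, rng):
    """Produce k m-bit strings via a walk on a 4-regular graph on 2^m
    nodes, consuming only m + 2(k-1) truly random bits.  The circulant
    neighbor rule below stands in for a real explicit expander."""
    N = 1 << m
    step = [1, 5, N - 1, N - 5]          # 4 neighbors per node
    bits_used = m
    v = rng.randrange(N)                 # m bits for the first node
    out = [v]
    for _ in range(k - 1):
        j = rng.randrange(4)             # 2 bits per walk step
        bits_used += 2
        v = (v + step[j]) % N
        out.append(v)
    return out, bits_used

strings, used = recycled_strings(m=20, k=10, rng=random.Random(0))
# 10 strings of 20 bits each (200 bits of "randomness") from only
# 20 + 18 = 38 truly random bits.
```

With a real expander family in place of the circulant rule, the hitting-property result mentioned above guarantees these k strings amplify the success probability almost as well as independent ones.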