
P=BPP unless E has sub-exponential circuits: Derandomizing the XOR Lemma

Russell Impagliazzo∗
Department of Computer Science
University of California, San Diego
San Diego, CA 91097-0114
[email protected]

Avi Wigderson†
Institute of Computer Science
Hebrew University
Jerusalem, Israel
[email protected]

Abstract

Yao showed that the XOR of independent random instances of a somewhat hard Boolean problem becomes almost completely unpredictable. In this paper we show that, in non-uniform settings, total independence is not necessary for this result to hold. We give a pseudo-random generator which produces n instances of a problem for which the analog of the XOR lemma holds. Combining this generator with the results of [25, 6] gives substantially improved results for hardness vs randomness trade-offs. In particular, we show that if any problem in E = DTIME(2^O(n)) has circuit complexity 2^Ω(n), then P = BPP. Our generator is a combination of two known ones: the random walks on expander graphs of [1, 10, 19] and the nearly disjoint subsets generator of [23, 25]. The quality of the generator is proved via a new proof of the XOR lemma which may be useful for other direct product results.

∗ Research supported by NSF YI Award CCR-92-570979, Sloan Research Fellowship BR-3311, grant #93025 of the joint US-Czechoslovak Science and Technology Program, and USA-Israel BSF Grant 92-00043.
† Work partly done while visiting the Institute for Advanced Study, Princeton, N.J. 08540, and Princeton University. Research supported by the Sloan Foundation, American-Israeli BSF grant 92-00106, and the Wolfson Research Awards, administered by the Israel Academy of Sciences.

1 Introduction

This paper addresses the relationship between three central questions in complexity theory. First, to what extent can a problem be easier to solve for probabilistic algorithms than for deterministic ones? Secondly, what properties should a pseudo-random generator have so that its outputs are "random enough" for the purpose of simulating a randomized algorithm? Thirdly, if solving one instance of a problem is computationally difficult, is solving several instances of the problem proportionately harder?

Yao's seminal paper ([30]) was the first to show that these questions are related. Building on work by Blum and Micali ([8]), he showed how to construct, from any cryptographically secure one-way permutation, a pseudo-random generator whose outputs are indistinguishable from truly random strings to any reasonably fast computational method. He then showed that such a pseudo-random generator could be used to deterministically simulate any probabilistic algorithm with sub-exponential overhead. Blum, Micali, and Yao thus showed the central connection between "hardness" (the computational difficulty of a problem) and "randomness" (the utility of the problem as the basis for a pseudo-random generator to de-randomize probabilistic computation).

A different approach for using hardness as randomness was introduced by Nisan and Wigderson [25]. They achieve deterministic simulation of randomness making a weaker complexity assumption: instead of being the inverse of a function in P, the "hard" function could be any function in EXP, deterministic exponential time. Their result was used in [6] to show that if any function in EXP requires circuit size 2^(n^Ω(1)), then BPP ⊆ DTIME(2^((log n)^O(1))). In words, under the natural hardness assumption above, randomness is not exponentially useful in speeding up computation, in that it can always be simulated deterministically with quasi-polynomial slow-down.

The main consequence of our work is a significantly improved trade-off: if some function in E = DTIME(2^O(n)) has worst-case circuit complexity 2^Ω(n), then BPP = P. In other words, randomness never speeds computation by more than a polynomial amount, unless non-uniformity always helps computation more than polynomially (for infinitely many input sizes) for problems with exponential time complexity.

This adds to a number of results which show that P = BPP follows either from a "hardness" result or an "easiness" result. On the one hand, there is a sequence of "hardness" assumptions implying P = BPP ([25], [4], [5]; this last, in work done independently from this paper, draws the somewhat weaker conclusion that P = BPP follows from the existence of a function in E with circuit complexity Ω(2^n/n)). On the other hand, P = BPP follows from the "easiness" premise P = NP [29], or even the weaker statement E = EH [6] (where EH is the alternating hierarchy above E).

We feel that from these results, the evidence favors using P = BPP as a working hypothesis. However, it has not even been proved unconditionally that BPP ≠ NE! This is a sad state of affairs, and one we hope will be rectified soon. One way our result could help do so is if there were some way of "certifying" random Boolean functions as having circuit complexity 2^Ω(n). (See [28] for some discussion of this possibility.)

The source of our improvement is in the amplification of the hardness of a function. The idea of such an amplification was introduced in [30], and was used in [25]. One needs to convert a mildly hard function into one that is nearly unpredictable to circuits of a given size. The tool for such amplification was provided in Yao's paper, and became known as Yao's XOR Lemma. The XOR Lemma can be paraphrased as follows: Fix a non-uniform model of computation (with certain closure properties) and a Boolean function f : {0,1}^n → {0,1}. Assume that any algorithm in the model of a certain complexity has a significant probability of failure when predicting f on a randomly chosen instance x. Then any algorithm (of a slightly smaller complexity) that tries to guess the XOR f(x_1) ⊕ f(x_2) ⊕ ··· ⊕ f(x_k) of k random instances x_1, ..., x_k won't do significantly better than a random coin toss.

The main hurdle to improving the previous trade-offs is the k-fold increase of input size from the mildly hard function to the unpredictable one, resulting from the independence of the k instances. In this paper, we show that true independence of the instances is not necessary for a version of the XOR Lemma to hold. Our main technical contribution is a way to pick k (pseudo-random) instances x_1, ..., x_k of a somewhat hard problem, using many fewer than kn random bits, but for which the XOR of the bits f(x_i) is still almost as hard to predict as if the instances were independent. Indeed, for the parameters required by the aforementioned application, we use only O(n) random bits to decrease the prediction advantage to an exponentially small amount. Combining this with previous techniques yields the trade-off above.

This task falls under the category of "derandomizations", and thus we were able to draw on the large body of techniques that have been developed for this purpose. Derandomization results eliminate or reduce reliance on randomness in an algorithm or construction. The general recipe requires a careful study of the use of independence in the probabilistic solution, isolating the properties of random steps that are actually used in the analysis. This often allows substitution of the randomness by a pseudo-random distribution with the same properties. Often, derandomization requires a new analysis of the probabilistic argument that can be interesting in its own right. The problem of derandomizing the XOR lemma has led to two new proofs of this lemma, which are interesting in their own right. In [16], a proof of the XOR lemma via hard-core input distributions was used to show that pairwise independent samples achieve some nontrivial amplification (a result we use here). We give yet another proof of the XOR lemma, based on an analysis of a simple "card guessing" game. A careful dissection of this proof reveals two requirements of the "random inputs". Both requirements are standard in the derandomization literature, for which optimal solutions are known. One solution is the expander-walk generator of [1, 10, 19] (used for deterministic amplification) and the other is the nearly disjoint subsets generator of [23] (used for fooling constant depth circuits, as well as in [25]). Our generator is simply the XOR of these two generators. Our new proof of the XOR lemma has already found another application: showing a parallel repetition theorem for three-move identification protocols [7].

We conclude this section by putting our construction in a different context. Yao's XOR Lemma is a prototypical example of a "direct product" theorem, a concrete sense in which several independent instances of a problem are harder than one. Direct product results exist (and are useful) for a variety of models besides circuits: Boolean decision trees [24], one-way communication complexity [17], 2-prover games [27], and recently also communication complexity [26]. Our result is the first derandomization of a direct product theorem. Such derandomization for other models is of interest whenever the input size blow-up has negative consequences. Indeed, a derandomization of the direct product theorem for 2-prover games (the Parallel Repetition Theorem of [27]) would have yielded better results concerning the impossibility of approximation, but Feige and Kilian [?] proved that such derandomization is impossible. For which other models can direct product results be derandomized, and for which can they not?

2 Definitions and History

In this section, we give the background needed for the paper. For simplicity, we present the results for the non-uniform Boolean circuit model of computation. Some analogs of both the historical results quoted and the new results in this paper hold for other models as well.

2.1 Distributional hardness of functions

In this sub-section, we give the standard definitions of quantitative difficulty of a function for a certain distribution of inputs. The notion of hardness for a function with a small range, with the extreme case being a Boolean function, is slightly more complicated than that for general functions, since one can guess the correct value with a reasonable probability. However, the Goldreich-Levin Theorem ([11]) gives a way to convert hardness for general functions to hardness for Boolean functions.

Definition 1 Let m and ℓ be positive integers. Let f : {0,1}^m → {0,1}^ℓ. S(f), the worst-case circuit complexity of f, is the minimum number of gates for a circuit C with C(x) = f(x) for every x ∈ {0,1}^m.

Let π be an arbitrary distribution on {0,1}^m, and let s be an integer. Define the success SUC_s^π(f) to be the maximum, over all Boolean circuits C of size at most s, of Pr_π[C(x) = f(x)]. Note that if f can be exactly computed in size s (i.e., S(f) ≤ s), then SUC_s^π(f) = 1 for any distribution.

For the (important) case ℓ = 1 we define the advantage ADV_s^π(f) = 2 SUC_s^π(f) − 1. In both notations we replace the superscript π by A if π is the uniform distribution on a subset A ⊆ {0,1}^m. When π is uniform on {0,1}^m we eliminate the superscript altogether.

The hard-core predicate result of Goldreich and Levin [11] allows a hardness result for a function with a linear-size output to be converted to a hardness result for a Boolean function.

Definition 2 Let f : {0,1}^m → {0,1}^ℓ. Define f^GL : {0,1}^m × {0,1}^ℓ → {0,1} by f^GL(x, y) = <f(x), y>, where <·,·> denotes the inner product over GF(2). For a distribution π on {0,1}^m define π^GL on {0,1}^m × {0,1}^ℓ to be π on the first coordinate and uniform on the second, independently. The function f^GL is called the Goldreich-Levin predicate for f.

It is easy to see that ADV_{s+O(m)}^{π^GL}(f^GL) ≥ SUC_s^π(f). (Use the inner product of your guess for f and y as the guess for the inner product bit. If your guess for f is correct, so is this guess. If not, the guessed bit is correct with probability 1/2.)
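The easy direction of this inequality can be checked by simulation. The sketch below is ours, not from the paper: f is a random lookup table, `predictor` stands in for an assumed circuit with success probability around 0.3, and all parameters are arbitrary illustrative choices.

```python
import random

random.seed(0)
m, ell, trials = 8, 8, 100_000

# A toy function f: {0,1}^8 -> {0,1}^8, represented as a random table.
f = {x: random.getrandbits(ell) for x in range(2 ** m)}

def predictor(x):
    """A hypothetical circuit C that computes f(x) with probability ~0.3
    and otherwise outputs an unrelated random value."""
    if random.random() < 0.3:
        return f[x]
    return random.getrandbits(ell)

def inner(u, v):
    """Inner product over GF(2): parity of the bitwise AND."""
    return bin(u & v).count("1") % 2

hits = 0          # predictor correct on f
gl_correct = 0    # induced guess correct on the GL bit <f(x), y>
for _ in range(trials):
    x = random.randrange(2 ** m)
    y = random.getrandbits(ell)
    guess = predictor(x)
    hits += (guess == f[x])
    # Guess the GL bit as the inner product of the guess for f(x) with y.
    gl_correct += (inner(guess, y) == inner(f[x], y))

suc = hits / trials
adv = 2 * gl_correct / trials - 1
print(f"SUC of predictor ~ {suc:.3f}, ADV on GL predicate ~ {adv:.3f}")
```

When the guess for f(x) is wrong, its GL bit still agrees with the true one for half of the y's, so the measured advantage tracks the measured success probability, as the inequality predicts.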
The Goldreich-Levin hard-core predicate theorem is a strong converse of this inequality. The following paraphrases the main result of [11] in our notation:

Theorem 1 [11] There are functions s' = ε^O(1) s and δ = ε^O(1) so that: for every function f and distribution π, if SUC_s^π(f) ≤ δ, then ADV_{s'}^{π^GL}(f^GL) ≤ ε.

Note that if ε = 1 − γ, where γ = o(1), then δ = (1 − γ)^O(1) = 1 − O(γ). So between functions and predicates that are only weakly unpredictable, the above relationship gives quite tight bounds.

2.2 Hardness versus Randomness

For this section only, we use f, g, h to denote Boolean functions on strings of all lengths, and by f_n, g_n, h_n their segments on n-bit strings. Also, let E = DTIME(2^O(n)). The hardness versus randomness trade-off we prove is stated formally below.

Theorem 2 If there is a Boolean function f ∈ E with S(f_n) = 2^Ω(n), then BPP = P.

Fortunately, much of the way towards this theorem was made already, and this subsection details this progress, focusing on what remains to be done. Nisan and Wigderson [25] showed how to simulate randomized algorithms efficiently using any problem in exponential time that is sufficiently difficult for non-uniform circuits. Their result provides a wide trade-off, but we will need only the following version.

Theorem 3 [25] If there is a Boolean function f ∈ E with ADV_{s(n)}(f_n) = 2^{−Ω(n)} for some s(n) = 2^Ω(n), then P = BPP.

So we need a function with exponential unpredictability. In [6], random self-reducibility of EXP-complete functions was used to "get started", namely to generate a function with very mild unpredictability from a function hard in the worst case. The premise in Theorem 2 regards E, so what we state below is a (somewhat tighter) variant of their results for this class. It can be proved with the same (by now standard) techniques, which we defer to the final paper. (The improvement is that our transformation increases the input size by only a constant factor.)

Theorem 4 If there is a Boolean function f ∈ E with S(f) = 2^Ω(n), then there is a function h ∈ E with SUC_{s(n)}(h_{cn}) ≤ 1 − n^{−2} for some constant c and some s(n) = 2^Ω(n).

Yao's XOR lemma could then be used to amplify this mild hardness to the unpredictability level required in Theorem 2, by taking k = O(n^3) independent inputs to h. However, this increases the input size by a polynomial amount, which means that the simulation will still only be quasi-polynomial. Our main contribution will be achieving the same amplification with only a linear increase in size. Our starting point is made a bit better by the following partial amplification of [16] (proved via Theorem 8).

Theorem 5 [16] If h ∈ E has SUC_{2^Ω(n)}(h_n) ≤ 1 − n^{−2}, then there is a g ∈ E with SUC_{2^Ω(n)}(g_{2n}) ≤ 2/3.

To summarize, what remains is, given a function with constant hardness against exponentially large circuits, to "amplify" this hardness into another function with an exponentially small advantage against exponentially large circuits. This must be done without blowing up the input size by more than a constant factor. This is our main technical contribution, stated formally in Theorem 9.

2.3 Direct Product Lemmas

In order to achieve the task stated at the end of the previous section, we first state the classical Yao's XOR Lemma, which accomplishes the amplification of hardness, but at the expense of increasing the input size. We show its relation to other direct product lemmas, and give a definition of a "de-randomized" direct product lemma. We discuss known pseudo-random direct product lemmas, and state our amplification result, which is proved in the remaining sections.

Let g : {0,1}^n → {0,1} and assume that SUC_s(g) ≤ 1 − δ. In an information-theoretic analog, we could think of the value of g on a random input as being completely predictable with probability 1 − 2δ and as being a random coin flip otherwise. Then the advantage of guessing the exclusive-or of k independent copies of g would be (1 − 2δ)^k, since if it were ever a random coin flip, we would be unable to get any advantage in predicting the XOR. The XOR lemma is a computational manifestation of this intuition. Define g^{⊕(k)} : ({0,1}^n)^k → {0,1} by g^{⊕(k)}(x_1, ..., x_k) = g(x_1) ⊕ ··· ⊕ g(x_k).

Theorem 6 [30],[20] There are functions k and s' of s, ε, δ, with k = O((−log ε)/δ) and s' = s (εδ)^O(1), so that for any g : {0,1}^n → {0,1} with SUC_s(g) ≤ 1 − δ, we have: ADV_{s'}(g^{⊕(k)}) ≤ ε.

Theorem 1 reduces the XOR lemma to the following "direct product" theorem, which requires the circuit to output all values of g(x_i), rather than their exclusive-or. Define g^{(k)} : ({0,1}^n)^k → {0,1}^k by g^{(k)}(x_1, ..., x_k) = (g(x_1), ..., g(x_k)). Then:

Theorem 7 There are (functions of ε, δ, s) k = O((−log ε)/δ) and s' = (εδ)^O(1) s so that for any g : {0,1}^n → {0,1} with SUC_s(g) ≤ 1 − δ, we have: SUC_{s'}(g^{(k)}) ≤ ε.

Note that the Goldreich-Levin predicate for g^{(k)} is basically g^{⊕(k/2)}. (The number of inputs is not fixed at k/2 but follows a binomial distribution with expectation k/2; however, if we restrict r in the Goldreich-Levin predicate to have exactly k/2 1's, the bound on the advantage is almost the same.) Thus, combining Theorem 7 with Theorem 1 yields Theorem 6. We'll use this approach in the rest of the paper: first show that computing f on many inputs is hard, and then invoke Theorem 1 to make the hard function Boolean.

In the next section, we give a new proof of Theorem 7, which generalizes to distributions on ~x which are not uniform and hence can be sampled using fewer than nk random bits. This allows us to give some conditions which suffice for such a pseudo-random distribution to have direct product properties similar to the uniform distribution. We give a construction of such a pseudo-random generator which uses only a linear number of bits to sample n n-bit inputs. We formalize the notion of a direct product result holding with respect to a pseudo-random generator as follows:

Definition 3 Let G be a function from {0,1}^m to ({0,1}^n)^k. G is an (s, s', ε, δ) direct product generator if for every Boolean function g with SUC_s(g) ≤ 1 − δ, we have: SUC_{s'}(g^{(k)} ∘ G) ≤ ε. (Here, the distribution is over uniformly chosen inputs to G.) Let s, s', ε, m, δ and k be fixed functions of n, and let G_n be a polynomial-time computable function from {0,1}^m to ({0,1}^n)^k. The family of functions G is called an (s, s', ε, δ) direct product generator if each G_n is an (s(n), s'(n), ε(n), δ(n)) direct product generator.

In this notation, [16] proved the following, which implies Theorem 5.

Theorem 8 [16] There is a function G from {0,1}^{2n} to ({0,1}^n)^{n^O(1)} which, for any s = s(n) and c ≥ 1, is an (s, s n^{−O(1)}, O(n^{−(c−1)}), n^{−c}) direct product generator.

The main technical contribution of this paper is the following:

Theorem 9 For any 0 < γ < 1, there is a 0 < γ' < 1 and a c > 0 so that there is a polynomial-time G : {0,1}^{cn} → ({0,1}^n)^n which is a (2^{γn}, 2^{γ'n}, 2^{−γ'n}, 1/3) direct product generator.

The proof of this theorem is in Sections 3 and 4. As explained above, it establishes our hardness versus randomness trade-off, Theorem 2. The proof will actually give a more general construction, with the above being a special case.

Theorem 10 For any s, ε, δ there is a polynomial-time G : {0,1}^m → ({0,1}^n)^k which is an (s, s', ε, δ)-generator with k = O((−log ε)/δ), s' = √s (εδ/n)^O(1) and m = O(min{n log ε^{−1}, n^2 / log s}).

Despite the tightness of our generator for the parameters of interest, the above leaves a gap to close. We conjecture that the bounds above can be achieved for all s, ε, δ with m = O(n).

3 The Card Guessing Game: A new proof of the direct product lemma

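As a warm-up, the information-theoretic benchmark from Section 2.3 — an advantage of (1 − 2δ)^k for the XOR of k independent copies — can be checked by simulation. The model below is the idealized one from that discussion (each copy's prediction is independently flipped with probability δ), not the computational statement; the parameter choices are arbitrary.

```python
import random

random.seed(2)

def noisy_bit(b, delta):
    """Report bit b, flipped with probability delta (advantage 1 - 2*delta)."""
    return b ^ (random.random() < delta)

def xor_advantage(k, delta, trials=100_000):
    """Estimate the advantage of predicting the XOR of k independent bits
    when each individual prediction errs with probability delta."""
    correct = 0
    for _ in range(trials):
        true_xor = 0
        guess_xor = 0
        for _ in range(k):
            b = random.getrandbits(1)
            true_xor ^= b
            guess_xor ^= noisy_bit(b, delta)
        correct += (guess_xor == true_xor)
    return 2 * correct / trials - 1

delta = 0.2
for k in (1, 2, 4, 8):
    print(k, round(xor_advantage(k, delta), 3), round((1 - 2 * delta) ** k, 3))
```

The estimated advantage decays like (1 − 2δ)^k: the XOR guess is correct exactly when an even number of the k predictions were flipped, which is the information-theoretic ideal the XOR lemma matches computationally.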
3.1 The Card Game

The Casino XOR offers the following game. After the player antes, the dealer places a sequence of k cards, each labelled either 0 or 1, on the table facing down. The cards are chosen by the casino, with the only restriction being that (by law) they must all be 1 with probability p. One card is chosen at random, and the dealer turns over all cards except that one. The player then chooses whether or not to wager that the remaining card is a 1. If the player wagers and the card is a 1, the player receives $1; if the player wagers and the card is a 0, the player loses $1. What is the value of this game?

The correspondence with the direct product lemma is, intuitively, as follows. Let g be a Boolean function. We want to use an algorithm A, advertised as having a probability p of computing g^{(k)}(x_1, ..., x_k), to get an algorithm A' that predicts g on a single x. We know that A gets all k answers correct with probability p. But when A does not get all the answers correct, the answers it gives might be anything, so we assume they are picked adversarially (by the "casino"). Define the value of the i'th "card" to be 1 when A computes g(x_i) correctly and 0 otherwise. Thus, we know all cards are 1 with probability p. The player "picks" a card i by setting x_i = x, and "turns over" the other cards by having the values g(x_j) as "advice".[1] A' will then incorporate our strategy for the card game to get a strategy for either believing or disbelieving the prediction of g(x) = g(x_i) given by A, i.e., wagering or holding.

We will give an almost optimal strategy for the card guessing game, and then formalize the above intuitive correspondence. Observe that if the dealer, when not making all cards 1's, fixes the values as independent coin tosses, the player can expect to win at most p. (Actually, since the above distribution also accidentally sets all cards to 1 with probability 2^{−k}, this is really roughly p − 2^{−k}.) We will show a player strategy that almost achieves this, getting an expected pay-off of p − 2^{−k/3}.

Let ā = (a_1, a_2, ..., a_k) ∈ {0,1}^k denote the random sequence chosen by the casino, and let I ∈_R [k] be the picked card. Consider the following strategy S: Let t be the number of cards you see marked 0, i.e., the number of j ≠ I with a_j = 0. Then S accepts the wager with probability 2^{−t}.[2]

Theorem 11 For any distribution on ā which is equal to 1^k with probability at least p, S has an expected pay-off of at least p − 2^{−k/3}.

Proof. Let ā be a fixed sequence of cards. Clearly, if ā = 1^k, then S always accepts and always wins. We'll show that, for any other sequence ā, the conditional expected payoff for S is at least −2^{−k/3}.

Let T denote the number of coordinates i in which a_i = 0. Then with probability T/k, a_I = 0, and the observed number of 0's is t = T − 1. In this case, S accepts and loses with probability 2^{−t} = 2 · 2^{−T}. On the other hand, with probability 1 − T/k, a_I = 1 and the observed number of 0's is t = T. Thus, the overall expectation is 2^{−T}((1 − T/k) − 2T/k) = 2^{−T}(1 − 3T/k). If T ≤ k/3, this is non-negative. If T > k/3, it is at least −2^{−k/3}. Thus, S overall has an expectation of 1 with probability at least p and at least −2^{−k/3} the rest of the time, for an overall advantage of at least p − 2^{−k/3}.

We shall actually need a small variant of the above game. We showed that, if A has a certain probability p of guessing all k values of g, then A' gets almost the same advantage on a random input. However, this is not enough to contradict our assumption that (intuitively) g has a constant-size "hard" fraction of inputs. What we need to show is that A' gets the same small advantage on inputs from the supposed "hard" set. To do this, we'll just use the fact that many instances must come from this hard set.

This motivates the variant card guessing game. In the variant game, the casino picks, in addition to ā, a subset K of the cards (which can depend on ā). (Intuitively, K represents the "hard" instances of the problem.) The casino does not reveal K to the player, but the card selected is a uniformly chosen element I ∈ K. The player still gets to see the values of all cards (including those not in K) except a_I before deciding whether to wager on a_I. The player can still use the strategy S described above, and a similar calculation shows:

Theorem 12 For any distribution on ā and K so that ā = 1^k with probability at least p, and |K| < k' with probability at most q, S has expected payoff at least p − q − 2^{−k'/3} when I is a uniform element of K.

[1] This assumes we are in a non-uniform model of computation. The direct product lemma also holds in uniform cryptographic settings. Here, "turns over" would mean choosing x_j in a way so that g(x_j) can be computed simultaneously.
[2] The optimal strategy seems to be the one that accepts with probability 1/C(k−1, t) if t ≤ (k−1)/2 and rejects otherwise. By a similar argument, this strategy obtains an expectation of p − 1/C(k−1, (k−1)/2) = p − O(k^{1/2}/2^k). However, the calculation does not easily generalize to the situation in Theorem 12.

3.2 (Yet Another) Proof of the XOR Lemma

Three completely different proofs of the XOR lemma are known [20, 16, 12], with the last reference containing a survey of all three. We are now ready for our new proof, after which we discuss its benefits for the purpose of derandomization.

Proof (of Theorem 7). Let q(k, δ) be the probability that, if k independent coins are tossed, each with probability δ of heads, fewer than δk/2 heads are tossed. Then there is a constant 0 < c < 1/6 so that q(k, δ) ≤ 2^{−cδk}. Pick k = (−log ε + 3)/(cδ) = O((−log ε)/δ). Then q(k, δ) ≤ ε/8 and, since c < 1/6, 2^{−kδ/6} ≤ ε/8. We'll show the contrapositive of the statement in Theorem 7: if SUC_{s'}(g^{(k)}) ≥ ε, then SUC_s(g) ≥ 1 − δ for some s = s' n^O(1) ε^{−O(1)}.

The plan of proving this statement is as follows. Assume SUC_{s'}(g^{(k)}) ≥ ε, and let C be a circuit achieving this success probability. We shall use it to construct a distribution on circuits F of approximately the same size, for which Exp_F[ADV_F^H(g)] > ε/2 − 2^{−Ω(δk)} for any subset H of inputs with |H| ≥ δ2^n. The above will imply that, taking t = O(n/ε^2) samples F_1, ..., F_t from this distribution, Maj(F_1, ..., F_t) (the Boolean majority function of the samples) is likely to predict g correctly on all but at most a δ fraction of inputs. Thus for some fixed choice of the samples we get a circuit which violates the hardness assumption. The distribution on F will merely implement the strategy S of the previous section against the casino playing ā = 1^k ⊕ C(x̄) ⊕ g^{(k)}(x̄). We are now ready to formally present the construction.

Consider the following distribution on probabilistic circuits with one n-bit input x. I ∈ [k] is chosen uniformly at random, and for each J ≠ I, x_J is chosen uniformly. F sets x_I = x and simulates C(x̄). F then computes t, the number of J ≠ I where the J'th output of C(x̄) differs from g(x_J).[3] F outputs the I'th output of C(x̄) with probability 2^{−t}, and otherwise outputs 0 or 1 with equal probability.

Lemma 13 Let H ⊆ {0,1}^n, |H| ≥ δ2^n, and let q = q(δ, k). Then Exp_F[ADV_F^H(g)] > ε/2 − q − 2^{−δk/6} ≥ ε/16.

Proof. Without loss of generality, assume |H| = δ2^n. (If not, apply the result to a random H' ⊆ H of the above size.) Consider the distribution D' on x_1, ..., x_k chosen by F given that x ∈ H. Any sequence x_1, ..., x_k with s elements from H has s values of I for which it can be chosen by F. Thus, the probabilities of such vectors are proportional to s, and since the average value of s is δk, the exact probability of the sequence in D' is s/(δk) times the probability in the uniform distribution.

Now, there is at least an ε − q fraction of tuples for which both s ≥ δk/2 and C(x̄) = g^{(k)}(x̄). For each such sequence, the probability that it is chosen by F given x ∈ H is at least 1/2 the probability that it is chosen under the uniform distribution. Therefore, the probability that F simulates C on such a tuple is at least (ε − q)/2. Note that, given that F simulates C on x̄, I is uniformly distributed among the indices i with x_i ∈ H.

Consider the following casino strategy for the variant card-guessing game. Let k' = δk/2. I, x_1, ..., x_k are chosen as F does. The casino sets a_i = 1 if and only if the i'th bit of C(x̄) is g(x_i). The casino defines K to be the set of indices i with x_i ∈ H. From the above discussion, the casino sets all a_i = 1 with probability at least (ε − q)/2, and I is uniformly distributed in K. Furthermore, the probability that |K| < k' is at most q/2, since at most a q fraction of sequences have fewer than k' elements in H, and each such sequence has at most 1/2 the probability in D' that it would have in the uniform distribution.

Thus, the expected payoff when the player uses strategy S is, from Theorem 12, at least (ε − q)/2 − q/2 − 2^{−k'/3} = ε/2 − q − 2^{−δk/6}. This payoff is equal to the expected advantage of F, Pr[F(x) = g(x)] − Pr[F(x) ≠ g(x)], since F outputs the correct value when S accepts the wager and wins, the wrong value when S accepts and loses, and a random value when S refuses the wager.

Let H = {x | Prob_F(F(x) = g(x)) < 1/2 + ε/32}. Since the expected success probability of F on H is the average of the probabilities used to define H, Exp(ADV_F^H(g)) < ε/16. Thus, from Lemma 13, |H| ≤ δ2^n. By Chernoff's bound, for t as mentioned earlier, if F_1, ..., F_t are chosen independently, Maj(F_1, ..., F_t)(x) = g(x) for all x ∉ H with probability > 1 − 2^{−n}. Thus, there are F_1, ..., F_t so that Maj(F_1, ..., F_t) achieves success probability 1 − δ. This function will have circuit size O(ts') = s' n^O(1) ε^{−O(1)}, as claimed.

[3] Here, we are interested in non-uniform complexity, so g(x_J) is given as advice; we could also modify the argument for the uniform cryptographic setting, since there x_J = f(y_J) and g(x_J) = b(y_J) can be generated together from a common random input y_J.

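The last step of the proof above — taking the majority of many independently sampled predictors — can also be illustrated numerically. The advantage 0.1, the listed values of t, and the Monte Carlo setup are arbitrary choices for this sketch.

```python
import random

random.seed(1)

def weak_vote(eps):
    """One sampled predictor's vote on a fixed input: correct with
    probability 1/2 + eps/2, i.e. advantage eps."""
    return 1 if random.random() < 0.5 + eps / 2 else 0

def majority_correct(t, eps, runs=2000):
    """Estimate Pr[Maj(F_1, ..., F_t) is correct] on an input where each
    independent F_i has advantage eps."""
    good = 0
    for _ in range(runs):
        votes = sum(weak_vote(eps) for _ in range(t))
        good += (votes > t / 2)
    return good / runs

eps = 0.1
for t in (11, 101, 1001):
    print(t, majority_correct(t, eps))
# As t grows like O(n / eps^2), the majority errs with exponentially
# small probability, which is exactly the Chernoff-bound step above.
```

With t = 1001 the majority is almost always right even though each individual predictor is barely better than a coin flip, which is why a fixed choice of samples achieving success 1 − δ must exist.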
we have P rob[ i xi Hi < k0] q. 4 Direct product lemmas for pseudo-random dis- |{ | ∈ }| ≤ tributions One nice feature of the above definitions is that, if we can find two pseudo-random genera- In this section, we present some general condi- tors that satisfy them separately, we can merely tions on a pseudo-random distribution on x1, ..xk take the bitwise XOR of the two to get a single so that the proof of the direct product theorem restrictable, hitting distribution. This is for- from the last section goes through basically un- malized as follows: changed. Note that the argument from the last section Definition 6 Let x y denote the bit-wise xor ⊕ basically used the independence of the xi in two of the two strings. Let G(r),G0(s) be pseudo- ways; random generators with outputs in ( 0, 1 n)k, { } G(r) = x1, ..xk and G0(s) = y1, ..yk. Then For any H 0, 1 n, H δ2n, with • ⊆ { } | | ≥ define the pseudo-random generator G G0 by high probability, the number of xi H is ⊕ ∈ (G G0)(r, s) = x1 y1, ..., xk yk. at least some k0 = Ω(k). ⊕ ⊕ ⊕ Lemma 14 Let G(r),G (s) be polynomial-time It is possible (and computationally feasi- 0 • pseudo-random generators with outputs in ( 0, 1 n)k. ble) to fix the values of all the xJ ,J = I { } 6 while leaving xI free to be any value. If G is (k0, q, δ)-hitting then G G0 is (k0, q, δ)- • hitting. ⊕ Of course the last condition really does force true independence. However, if instead of xJ If G0 is M-restrictable then G G0 is M- • ⊕ being completely fixed, xJ could be restricted restrictable. to a relatively small set of possibilities, we could still store g(xJ ) for each possible xJ . This mo- Proof. tivates the following definition: n n Let H1, ..Hk 0, 1 , Hi δ2 be given, • ⊆ { } | | ≥ m n k and fix y1, ..yk. Let H0 = Hi yi; clearly, Definition 4 Let Gn : 0, 1 ( 0, 1 ) be i ⊕ { } → { } H0 = Hi . Then xi yi Hi xi a polynomial time computable pseudo-random | i| | | ⊕ ∈ ⇐⇒ ∈ generator. 
• Let h be the restricting function for G'. Writing G(r) = x_1, .., x_k, let h'(i, x, r ∘ α) = (r, h(i, x ⊕ x_i, α)). It is easy to check that h' is a restricting function for G ⊕ G'.

We now show that these conditions are sufficient for a direct product theorem to hold for a generator:

Theorem 15 Let s be a positive integer and let G(r) : {0,1}^m → ({0,1}^n)^k be a (ρk, q, δ)-hitting, M-restrictable pseudo-random generator, where q > 2^{−ρk/3} and s > 2Mnk. Then G is a (s, Ω(sq^2 n^{−O(1)}), O(q), δ)-direct product generator.

Proof. Let ε = (4(δ/ρ) + 1)q. We give a probabilistic construction which takes a circuit C that has success probability ε for g^(k) ∘ G and outputs a circuit F of size O(|C| + kMn) whose expected advantage on any H, |H| ≥ δ2^n, is at least q. We then take the majority of O(n/q^2) such circuits.

The construction is based on the strategy for the variant card guessing game, and is similar to the previous section. We pick I from 1, .., k uniformly at random, and select α_0 at random. Let x_1, .., x_k = G(h(I, x, α_0)). For each j ≠ I, we construct (non-uniformly) a table of g(x_j) for every x_j that is a possible output of h(I, x, α_0) (as we vary over x). There are at most M such values. Our circuit F, on input x, simulates C on G(h(I, x, α_0)). It then (via the tables) computes t, the number of indices j ≠ I where the j'th output of C differs from g(x_j). F outputs the I'th output of C with probability 2^{−t}, and otherwise outputs a randomly chosen bit.

Lemma 16 Let H ⊆ {0,1}^n, |H| ≥ δ2^n. Then Exp(Adv_F^H(g)) ≥ q.

Proof. Let D be the distribution on G(r) for a random r, and let D' be the conditional distribution given that x_I ∈ H for a random I. For a fixed such sequence, let u be the number of x_i ∈ H. As before, the probability of the sequence in D' is u/(δk) times its probability in D. The probability in D that u ≥ ρk and that C(r) = g^(k)(G(r)) is at least ε − q = 4(δ/ρ)q, from the hitting property of G and the success probability of C. Thus, the corresponding probability in D' that u ≥ ρk and C succeeds is at least (ε − q)ρ/δ = 4q, since for each such sequence, its probability in D' is at least ρ/δ times its probability in D. Also, r = h(I, x, α_0) is independent of I, so given r (and hence x_1, .., x_k), I is a uniformly distributed index with x_I ∈ H. As before, the expected advantage of F on x ∈ H is the same as the expected payoff of S on the variant game when the casino sets a_j = 1 iff the j'th output of C is g(x_j), and picks K to be the set of positions i with x_i ∈ H. From the above, the probability p that all the a_i are 1 is at least 4q. The probability that |K| ≤ k_0 = ρk is at most q. So by Theorem 12, the expected advantage of F is at least 4q − q − 2^{−ρk/3} ≥ 3q − 2^{−ρk/3} ≥ q.

Thus, there is at most a δ fraction of x where F has expected success probability < 1/2 + q/2, and taking the majority of O(n/q^2) circuits F gives a circuit that predicts g(x) on all x not in this fraction. Each F has size approximately O(|C| + kMn), so the combined circuit has size O((|C| + kMn)n/q^2). Thus, if |C| ≤ s', there is a circuit of size ≤ s that has success probability at least 1 − δ, a contradiction.
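The majority step used above (and the Chernoff argument earlier in the section) can be illustrated with a toy simulation. The "circuits" below are just noisy predictors with a hypothetical advantage, so this is a sketch of the amplification phenomenon only:

```python
import random
from statistics import mean

random.seed(0)
p = 0.5 + 0.1   # each toy "circuit" is correct with probability 1/2 + q/2
t = 301         # number of independent copies; majority amplifies the advantage

def noisy_predictor(truth):
    """Stand-in for one circuit F: correct with probability p."""
    return truth if random.random() < p else 1 - truth

def majority_predict(truth):
    """Maj(F_1, ..., F_t): take the majority vote of t independent copies."""
    votes = sum(noisy_predictor(truth) for _ in range(t))
    return 1 if votes * 2 > t else 0

# By Chernoff's bound, the majority errs with probability exp(-Omega(t q^2)),
# so once t is a large multiple of 1/q^2 the combined predictor is
# almost always correct.
trials = 2000
acc = mean(majority_predict(1) == 1 for _ in range(trials))
assert acc > 0.99
```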
5 Examples of hitting and restrictable generators

In the last section, we reduced the question of constructing a direct product generator to that of (separately) constructing hitting and restrictable generators. Here, we present a well-known example of each type, and combine them.

5.1 Expander Walks: a hitting generator

The generator in this section originates from [1], who used it to derandomize probabilistic space-bounded machines. Its use in deterministic amplification was established in [10, 19].

Definition 7 Fix an explicit expander graph G on vertex set {0,1}^n, of constant degree (say 16) and small second eigenvalue (say < 8) [22]. Let the generator EW : {0,1}^n × [16]^{k−1} → ({0,1}^n)^k be defined by EW(v; d̄) = (v_1, v_2, ···, v_k), where v_1 = v and v_{i+1} is the d_i-th neighbour of v_i in G.

Note that this generator uses a seed of length m = n + O(k). The amplification property of this generator, formalized below, asserts that with exponentially high probability, many of these k pseudo-random points will satisfy arbitrary large events.

Theorem 17 [10, 19] For every k ≤ n, the above generator is (k/10, 2^{−k/10}, 1/3)-hitting.

5.2 Nearly Disjoint Subsets: a restrictable generator

The generator of this section was invented in [23] (for derandomizing probabilistic constant-depth circuits), and was used to produce general hardness-randomness trade-offs in [25]. This generator will in general use more than O(n) bits, but for the parameters of interest in Theorem 9 it uses only O(n) bits. The number of bits given in Theorem 10 for general circuit size will follow from the seed length here, and any improvement would have to circumvent this construction. We mention that this is impossible to do simply by building more efficient restrictable generators: [18] use entropy arguments to prove that the bounds below are tight for any construction.

Let M be given, let γ = (log M)/n, i.e., M = 2^{γn}, and set m = 2n/γ. In this section, we construct an M-restrictable generator using O(m) bits. Unfortunately, Ω(m) bits are necessary for any such generator. Thus, the direct product generator we construct is only interesting when s > M > 2^{O(√n)}; otherwise m is larger than the number of bits in O(log s) independent instances. However, if we are dealing with exponentially hard functions, M = 2^{Ω(n)} and so m = O(n).

Definition 8 Let Σ = {S_1, ..., S_k} be a family of subsets of [m] of size n. We say Σ is γ-disjoint if |S_i ∩ S_j| ≤ γn for each i ≠ j. Let r ∈ {0,1}^m and S = {s_1 < s_2 < ... < s_n} ⊆ [m]. Then let the restriction of r to S, r|_S, be the n-bit string r|_S = r_{s_1} r_{s_2} .. r_{s_n}. Then, for a γ-disjoint Σ, ND^Σ : {0,1}^m → ({0,1}^n)^k is defined by: ND^Σ(r) = r|_{S_1}, r|_{S_2}, ···, r|_{S_k}.

Lemma 18 There is a polynomial-time probabilistic algorithm that, for any γ > 0, n, k = O(n), and m = 2n/γ, produces, with probability 1 − 2^{−Ω(n)}, a γ-disjoint family of k subsets of [m], each of size n, and uses O(m) bits.

Proof. Let p = (m choose n) ≤ 2^m and encode (efficiently and 1-1) the n-subsets of [m] by the integers modulo p. Pick k numbers in Z_p so that they are individually uniform and pairwise independent [10],[9], and let the S_i's be the sets encoded by these numbers. The probability that, for two independent sets S and S' of size n, their intersection is more than twice the expected size m(n/m)^2 = n^2/m = γn/2 is exponentially small in n. Thus, since the S_i's are pairwise independent, this exponentially small amount times the number of pairs bounds the probability that |S_i ∩ S_j| > γn for any i ≠ j.

Lemma 19 If Σ is γ-disjoint, then ND^Σ is M = 2^{γn}-restrictable.

Proof. For i ∈ [k], x ∈ {0,1}^n, α ∈ {0,1}^{m−n}, let h(i, x, α) be the string r ∈ {0,1}^m with r|_{S_i} = x and r restricted to the complement of S_i equal to α. Fix i, j ≠ i, and α. Since |S_j ∩ S_i| ≤ γn, and since the output of ND(h(i, x, α)) is determined in all bits not in S_i, there are only 2^{γn} possible values for x_j, found by varying all bits in S_j ∩ S_i.
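The restricting function of Lemma 19 can be sketched concretely. The tiny parameters and the family Σ below are hypothetical, chosen only so the bound 2^{|S_i ∩ S_j|} is visible:

```python
from itertools import product

m, n = 12, 4
# A tiny hypothetical family: two n-subsets of [m] with |S1 ∩ S2| = 1.
S1 = (0, 1, 2, 3)
S2 = (3, 4, 5, 6)

def nd(r, subsets):
    """ND^Sigma(r): restrict the m-bit string r to each subset (Definition 8)."""
    return [tuple(r[i] for i in S) for S in subsets]

def h(S, x, alpha):
    """Restricting function: r agrees with x on S and with alpha elsewhere."""
    r = [None] * m
    for bit, i in zip(x, S):
        r[i] = bit
    rest = [i for i in range(m) if r[i] is None]
    for bit, i in zip(alpha, rest):
        r[i] = bit
    return r

# Fix i = 1 (slot S1) and fix alpha; as x varies over all 2^n values, the
# S2-output takes at most 2^{|S1 ∩ S2|} distinct values (here 2), because
# only the bits in S1 ∩ S2 move with x.
alpha = (0,) * (m - n)
outputs = {nd(h(S1, x, alpha), [S1, S2])[1] for x in product((0, 1), repeat=n)}
assert len(outputs) <= 2 ** 1
```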
5.3 The Direct Product (and XOR) Generator XG

The generator we use to "fool" the XOR lemma is the simplest combination of the two generators above: we take the componentwise exclusive-or of their outputs. We describe here only the choice of parameters required by Theorem 9.

Definition 9 Let n be a positive integer, and let 1/30 > γ > 0. Let k = n in EW, let γ' = γ/4, and let m = 2n/γ'. With the parameters above, we define the XOR generator XG(r; r'; v; d̄) as follows: Use r to select Σ, a γ'-disjoint family of subsets of [m]. Then XG(r; r'; v; d̄) = ND^Σ(r') ⊕ EW(v; d̄).

We ignore the possibility that Σ is not γ'-disjoint, since this happens only with exponentially small probability.

We can now prove Theorem 9:

Proof. We need to construct a (2^{γn}, 2^{γ'n}, 2^{−γ'n}, 1/3)-direct product generator whose input length is O(n). We claim that XG is the desired generator. First note that |r| = O(n), |r'| = m = O(n), |v| = n, and |d̄| = O(k) = O(n). So the input to XG is O(n) bits. By Lemma 19, Theorem 17, and Lemma 14, XG is M = 2^{γ'n}-restrictable, and is (n/10, 2^{−n/10}, 1/3)-hitting. Then by Theorem 15, XG is a (s, O((s − Mnk)q^2/n^{O(1)}), O(q), 1/3)-direct product generator for any q > 2^{−k/30}.
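The assembly of XG from its two components can be sketched as follows. The "expander" here is just a small random regular graph standing in for the explicit construction of [22], and all sizes are toy values, so this illustrates only the shape of the combination, not the paper's instantiation:

```python
import random

n_bits, k = 6, 4
N = 2 ** n_bits
random.seed(1)

# Stand-in for an explicit 16-regular expander on {0,1}^n: here a random
# 4-out graph on N vertices (illustration only, no expansion guarantee).
nbrs = {v: [random.randrange(N) for _ in range(4)] for v in range(N)}

def ew(v, steps):
    """EW(v; d): the walk v_1 = v, v_{i+1} = d_i-th neighbour of v_i."""
    out = [v]
    for d in steps:
        v = nbrs[v][d]
        out.append(v)
    return out  # k vertices from a seed of n + O(k) bits

def nd(r_bits, subsets):
    """ND^Sigma: restrictions of the m-bit string r to each subset, as ints."""
    return [sum(r_bits[i] << j for j, i in enumerate(S)) for S in subsets]

def xg(r_bits, walk_seed, walk_steps, subsets):
    """XG = ND^Sigma(r') xor EW(v; d), componentwise (Definition 9)."""
    return [a ^ b for a, b in zip(nd(r_bits, subsets), ew(walk_seed, walk_steps))]

# A toy (not gamma'-disjoint) family of n-subsets of [m], m = 12.
subsets = [(0, 1, 2, 3, 4, 5), (3, 4, 5, 6, 7, 8),
           (6, 7, 8, 9, 10, 11), (0, 1, 4, 5, 8, 9)]
r_bits = [random.randrange(2) for _ in range(12)]
out = xg(r_bits, walk_seed=0, walk_steps=[1, 2, 3], subsets=subsets)
assert len(out) == k and all(0 <= x < N for x in out)
```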
In particular, for s = 2^{γn} and q = 2^{−γ'n}, XG is a (2^{γn}, 2^{(γ/2)n}/n^{O(1)}, O(2^{−γ'n}), 1/3)-direct product generator. This completes the proofs of Theorem 9 and Theorem 10.

References

[1] M. Ajtai, J. Komlos and E. Szemeredi, "Deterministic simulation in LOGSPACE", Proc. of 19th ACM STOC, pp. 132-140, 1987.

[2] N. Alon, "Eigenvalues and Expanders", Combinatorica, 6, pp. 83-96, 1986.

[3] N. Alon and J. Spencer, The Probabilistic Method, Wiley-Interscience, 1992.

[4] A. Andreev, A. Clementi and J. Rolim, "Hitting Sets Derandomize BPP", in XXIII International Colloquium on Algorithms, Logic and Programming (ICALP'96), 1996.

[5] A. Andreev, A. Clementi and J. Rolim, "Hitting Properties of Hard Boolean Operators and its Consequences on BPP", manuscript, 1996.

[6] L. Babai, L. Fortnow, N. Nisan and A. Wigderson, "BPP has Subexponential Time Simulations unless EXPTIME has Publishable Proofs", Computational Complexity, Vol. 3, pp. 307-318, 1993.

[7] M. Bellare, R. Impagliazzo and M. Naor, "Acceptance Probabilities for Parallel Repetitions of Cryptographic Protocols", in progress.

[8] M. Blum and S. Micali, "How to Generate Cryptographically Strong Sequences of Pseudo-Random Bits", SIAM J. Comput., Vol. 13, pp. 850-864, 1984.

[9] B. Chor and O. Goldreich, "On the Power of Two-Point Based Sampling", J. of Complexity, 5, pp. 96-106, 1989.

[10] A. Cohen and A. Wigderson, "Dispersers, Deterministic Amplification, and Weak Random Sources", 30th FOCS, pp. 14-19, 1989.

[11] O. Goldreich and L.A. Levin, "A Hard-Core Predicate for all One-Way Functions", in ACM Symp. on Theory of Computing, pp. 25-32, 1989.

[12] O. Goldreich, N. Nisan and A. Wigderson, "On Yao's XOR-Lemma", available via www at ECCC TR95-050, 1995.

[13] O. Goldreich, R. Impagliazzo, L. Levin, R. Venkatesan and D. Zuckerman, "Security Preserving Amplification of Hardness", 31st FOCS, pp. 318-326, 1990.

[14] S. Goldwasser and S. Micali, "Probabilistic Encryption", JCSS, Vol. 28, pp. 270-299, 1984.

[15] J. Hastad, R. Impagliazzo, L.A. Levin and M. Luby, "Construction of a Pseudo-random Generator from any One-Way Function", to appear in SICOMP. (See preliminary versions by Impagliazzo et al. in 21st STOC and Hastad in 22nd STOC.)

[16] R. Impagliazzo, "Hard-core Distributions for Somewhat Hard Problems", in 36th FOCS, pp. 538-545, 1995.

[17] R. Impagliazzo, R. Raz and A. Wigderson, "A Direct Product Theorem", Proc. of the 9th Structures in Complexity conference, pp. 88-96, 1994.

[18] R. Impagliazzo and A. Wigderson, "An Information Theoretic Analog of the Inclusion-Exclusion Lower Bound", manuscript, 1996.

[19] R. Impagliazzo and D. Zuckerman, "How to Recycle Random Bits", 30th FOCS, pp. 248-253, 1989.

[20] L.A. Levin, "One-Way Functions and Pseudorandom Generators", Combinatorica, Vol. 7, No. 4, pp. 357-363, 1987.

[21] R. Lipton, "New directions in testing", in J. Feigenbaum and M. Merritt, editors, Distributed Computing and Cryptography, DIMACS Series in Discrete Mathematics and Theoretical Computer Science, Volume 2, pp. 191-202, American Mathematical Society, 1991.

[22] A. Lubotzky, R. Phillips and P. Sarnak, "Ramanujan graphs", Combinatorica, 8(3), pp. 261-277, 1988.

[23] N. Nisan, "Pseudo-random bits for constant depth circuits", Combinatorica, 11(1), pp. 63-70, 1991.

[24] N. Nisan, S. Rudich and M. Saks, "Products and Help Bits in Decision Trees", in 35th FOCS, pp. 318-329, 1994.

[25] N. Nisan and A. Wigderson, "Hardness vs Randomness", J. Comput. System Sci., 49, pp. 149-167, 1994.

[26] I. Parnafes, R. Raz and A. Wigderson, "Direct Product Results, the GCD Problem, and New Models of Communication", Proc. of the 29th STOC, 1997.

[27] R. Raz, "A Parallel Repetition Theorem", to appear in SIAM Journal on Computing. Preliminary version in 27th STOC, pp. 447-456, 1995.

[28] S. Rudich, "NP-Natural Proofs", manuscript, 1996.

[29] M. Sipser, "A complexity-theoretic approach to randomness", Proc. of the 15th STOC, pp. 330-335, 1983.

[30] A.C. Yao, "Theory and Application of Trapdoor Functions", in 23rd FOCS, pp. 80-91, 1982.