Pseudorandom generators without the XOR Lemma

y z x

Madhu Sudan Luca Trevisan

June

Abstract

Impagliazzo and Wigderson IW have recently shown that if there exists a decision problem solvable

O (n) (n)

in time and having circuit complexity for all but nitely many n then P BPP This result

is a culmination of a series of works showing connections b etween the existence of hard predicates and

the existence of go o d pseudorandom generators

The construction of Impagliazzo and Wigderson go es through three phases of hardness amplication

a multivariate p olynomial enco ding a rst derandomized XOR Lemma and a second derandomized

XOR Lemma that are comp osed with the NisanWigderson NW generator In this pap er we present

two dierent approaches to proving the main result of Impagliazzo and Wigderson In developing each

approach we introduce new techniques and prove new results that could b e useful in future improvements

andor applications of hardnessrandomness tradeos

Our rst result is that when a mo died version of the NisanWigderson generator construction

is applied with a mildly hard predicate the result is a generator that pro duces a distribution indis

tinguishable from having large minentropy An extractor can then b e used to pro duce a distribution

computationally indistinguishable from uniform This is the rst construction of a pseudorandom gen

erator that works with a mildly hard predicate without doing hardness amplication

We then show that in the ImpagliazzoWigderson construction only the rst hardnessamplication

phase enco ding with multivariate p olynomial is necessary since it already gives the required average

case hardness We prove this result by i establishing a connection b etween the hardnessamplication

problem and a listdeco ding problem for errorcorrecting co des and ii presenting a listdeco ding algo

rithm for errorcorrecting co des based on multivariate p olynomials that improves and simplies a previous

one by Arora and Sudan AS

Keywords Pseudorandom generators extractors p olynomial reconstruction listdeco ding

Preliminary versions and abstracts of this pap er app eared in ECCC STV STOC STVa and Complexity

STVb

y

Lab oratory for Computer Science Technology Square MIT Cambridge MA Email

madhutheorylcsmited u

z

Department of Computer Science Columbia University W th St New York NY Email

lucacscolumbiaedu Work done at MIT

x

Lab oratory for Computer Science Technology Square MIT Cambridge MA Email

saliltheorylcsmited u URL httptheorylcsmited u sali l Supp orted by a DODNDSEG graduate fellowship

and partially by DARPA grant DABTC

Introduction

This pap er continues the exploration of hardness versus randomness tradeos that is results showing

that randomized algorithms can b e eciently simulated deterministically if certain complexitytheoretic

assumptions are true We present two new approaches to proving the recent result of Impagliazzo and

O n

Wigderson IW that if there is a decision problem computable in time and having circuit complexity

n

for all but nitely many n then P BPP Impagliazzo and Wigderson prove their result by presenting

a randomnessecient amplication of hardness based on a derandomized version of Yaos XOR Lemma

The hardnessamplication pro cedure is then comp osed with the NisanWigderson NW generator NW

and this gives the result The hardness amplication go es through three steps an enco ding using multivariate

p olynomials from BFNW a rst derandomized XOR Lemma from Imp and a second derandomized

XOR Lemma which is the technical contribution of IW

In our rst result we show how to construct a pseudo entropy generator starting from a predicate

with mild hardness Roughly sp eaking a pseudoentropy generator takes a short random seed as input and

outputs a distribution that is indistinguishable from having high minentropy Combining our pseudo entropy

generator with an extractor we obtain a pseudorandom generator Interestingly our pseudo entropy generator

is a mo dication of the NW generator itself Along the way we prove that when built out of a mildly

hard predicate the NW generator outputs a distribution that is indistinguishable from having high Shannon

entropy a result that has not b een observed b efore The notion of a pseudo entropy generator and the idea

that a pseudo entropy generator can b e converted into a pseudorandom generator using an extractor are due

to Hastad et al HILL Our construction is the rst construction of a pseudorandom generator that

works using a mildly hard predicate and without hardness amplication

We then revisit the hardness amplication problem as considered in BFNW Imp IW and we

show that the rst step alone enco ding with multivariate p olynomials is sucient to amplify hardness to

the desired level so that the derandomized XOR Lemmas are not necessary in this context Our pro of is

based on a listdeco ding algorithm for multivariate p olynomial co des and exploits a connection b etween the

listdeco ding and the hardnessamplication problems The listdeco ding algorithm describ ed in this pap er

is quantitatively b etter than a previous one by Arora and Sudan AS and has a simpler analysis

An overview of previous results The works of Blum and Micali BM and Yao Yao formalize

the notion of a pseudorandom generator and show how to construct pseudorandom generators based on

the existence of oneway p ermutations A pseudorandom generator meeting their denitions which we

call a BMYtype PRG is a p olynomialtime algorithm that on input a randomly selected string of length

n pro duces an output of length n that is computationally indistinguishable from uniform by any adver

sary of p oly n size where is an arbitrarily small constant Pseudorandom generators of this form and

pseudorandom functions GGM constructed from them have many applications b oth inside and outside

cryptography see eg GGM Val RR One of the rst applications observed by Yao Yao is

derandomization a given p olynomialtime randomized algorithm can b e simulated deterministically using

n

p oly n by trying all the seeds and taking the ma jority answer a BMYtype PRG in time

In a seminal work Nisan and Wigderson NW explore the use of a weaker type of pseudorandom

generator PRG in order to derandomize randomized algorithms They observe that for the purp ose of

t

derandomization one can consider generators computable in time p oly instead of p oly t where t is

the length of the seed since the derandomization pro cess cycles through all the seeds and this induces an

t

overhead factor anyway They also observe that one can restrict to generators that are go o d against

adversaries whose running time is b ounded by a xed p olynomial instead of every p olynomial They then

show how to construct a pseudorandom generator meeting this relaxed denition under weaker assumptions

than those used to build cryptographically strong pseudorandom generators Furthermore they show that

under a suciently strong assumption one can build a PRG that uses seeds of logarithmic length which

would b e imp ossible for a BMYtype PRG Such a generator can b e used to simulate randomized algorithms

in p olynomial time and its existence implies P BPP The condition under which Nisan and Wigderson

prove the existence of a PRG with seeds of logarithmic length is the existence of a decision problem ie a

n

O n

predicate P f g f g solvable in time such that for some p ositive constant no circuit of size

1

To b e accurate the term extractor comes fron NZ and p ostdates the pap er of Hastad et al HILL

n n

can solve the problem on more than a fraction of the inputs This is a very strong hardness

requirement and it is of interest to obtain similar conclusions under weaker assumptions

An example of a weaker assumption is the existence of a mild ly hard predicate We say that a predicate

n

is mildly hard if for some xed no circuit of size can decide the predicate on more than a fraction

p olyn of the inputs Nisan and Wigderson prove that mild hardness suces to derive a pseudorandom

generator with seed of O log n length which in turn implies a quasip olynomial deterministic simulation

of BPP This result is proved by using Yaos XOR Lemma Yao see eg GNW for a pro of to

convert a mildly hard predicate over n inputs into one which has input size n and is hard to compute

n

on a fraction of the inputs A series of subsequent pap ers attacks the problem of obtaining

stronger pseudorandom generators starting from weaker and weaker assumptions Babai et al BFNW

n

show that a predicate of worstcase circuit complexity can b e converted into a mildly hard one

Impagliazzo Imp proves a derandomized XOR Lemma which implies that a mildly hard predicate can b e

converted into one that cannot b e predicted on more than some constant fraction of the inputs by circuits of

n

size Impagliazzo and Wigderson IW prove that a predicate with the latter hardness condition can

b e transformed into one that meets the hardness requirement of NW The result of IW relies on a

dierent derandomized version of the XOR Lemma than Imp Thus the general structure of the original

construction of Nisan and Wigderson NW has b een preserved in most subsequent works progress b eing

achieved by improving the single comp onents In particular the use of an XOR Lemma in NW continues

alb eit in increasingly sophisticated forms in Imp IW Likewise the NW generator and its original

analysis have always b een used in conditional derandomization results since Future progress in the area

will probably require a departure from this observance of the NW metho dology or at least a certain amount

of revisitation of its main parts

In this pap er we give two new ways to build pseudorandom generators with seeds of logarithmic length

Both approaches bypass the need for the XOR Lemma and instead use to ols such as list deco ding extractors

and pseudo entropy generators that did not app ear in the sequence of works from NW to IW For a

diagram illustrating the steps leading up to the results of IW and how our techniques depart from that

framework see Figure Both of our approaches are describ ed in more detail b elow

Worst−case hard

polynomial [BFNW93] encoding

polynomial Mildly hard encoding Thm 10 XOR [Imp95] + list Lemma decoding (Thm 22) Constant hardness XOR Pseudoentropy Lemma [IW97] Generator Extremely hard [NW94] extractor (Lem 13)

Pseudorandom Generator

Figure A comparison of our approach with previous ones Double arrows indicate our results

2

This has to b e true for all but nitely many input lengths n

3

In fact the result of BFNW was somewhat weaker but it is easily extendable to yield this result

4

The techniques of Andreev et al ACR are a rare exception but they yield weaker result than the ones of IW

A pseudo entropy generator Nisan and Wigderson show that when their generator is constructed using

a very hardonaverage predicate then the output of the generator is indistinguishable from the uniform

distribution It is a natural question to ask what happ ens if there are stronger or weaker conditions on

the predicate In this pap er we consider the question of what happ ens if the predicate is only mildly hard

Sp ecically we are interested in whether exp onential averagecase hardness is really necessary for direct

pseudorandom generation In this pap er we rst show that when a mildly hard predicate is used in the

NW generator then there exists a distribution having high Shannon entropy that is indistinguishable from

the output of the generator Our main result is then that for a mildly hard predicate a mo died version

of the NW generator has an output indistinguishable from a distribution with high minentropy Such a

generator is essentially a pseudo entropy generator in the sense of Hastad et al HILL The intuition

b ehind our pro of is that if a predicate is hard to compute on more than a fraction of the inputs then

there should b e some subset of the inputs of density on which the predicate is very hard this intuition is

made precise by a result of Impagliazzo Imp Due to the high hardness the evaluation of the predicate

in a random p oint of this set will b e indistinguishable from a random bit The NW generator constructed

with a predicate P works by transforming an input seed s into a sequence of p oints x x from the

m

domain of P the output of the generator is then P x P x P x For a random seed each of the

m

p oints x is uniformly distributed and so we exp ect to typically generate m p oints from the hard set so

i

that the output of the generator lo oks like having m bits of randomness that is it is indistinguishable

from some other distribution having Shannon entropy m The generation of the p oints x x can b e

m

mo died so that the number of p oints landing in the hard set is sharply concentrated around its exp ected

value m The output of the mo died generator is then indistinguishable from having high minentropy

When our generator is comp osed with a suciently go o d extractor such as the one in Tre then the

result is a pseudorandom generator This is the rst construction of a pseudorandom generator based on

mild averagecase hardness that do es not rely on hardness amplication It is also the rst application of

the notion of a pseudo entropy generator to the construction of PRG in the NisanWigderson sense

Remark While in this pap er we analyze for the rst time the NisanWigderson generator under a weaker

assumption than the one originally considered in NW there has also b een some work exploring the eect

of stronger assumptions on the predicate Impagliazzo and Wigderson IW show that if the predicate has

certain additional prop erties such as downward selfreducibility then one needs only a uniform hardness

assumption on the predicate rather circuitcomplexity assumption Arvind and Kobler AK and Klivans

and van Melkebeek KvM show that if the predicate is hard on average for nondeterministic circuits then

the output of the generator is indistinguishable from uniform for nondeterministic adversaries Therefore it

is p ossible to derandomize classes involving randomness and nondeterminism such as AM Trevisan Tre

shows that if the predicate is chosen randomly from a distribution having certain prop erties then the output

is statistically close to uniform This yields the construction of extractors that we use in our generator

The connection with list deco ding of errorcorrecting co des Our second result deals with the

listdeco ding problem for errorcorrecting co des and its connection to amplication of hardness

We start by describing a new listdeco ding problem for errorcorrecting co des This problem diers

from the standard deco ding task in that the deco ding algorithm is allowed to output a list of nearby

co des rather than a unique nearest co deword and the deco ding algorithm is allowed oracle access to

the recieved word and exp ected to deco de in time much smaller than the length of the co deword It is also

allowed to output implicit representations of the list of co dewords by giving programs to compute the ith

co ordinate of each co deword This implicit version of the listdeco ding problem is closely related to and

inspired by work in program checking and probabilistic checking of pro ofs

We show a simple connection b etween amplication of hardness and the existence of uniformlyconstructible

families of co des with very ecient listdeco ders in our sense Theorem We then show that a recent

result of Arora and Sudan AS on p olynomial reconstruction leads to a family of errorcorrecting co des

with very ecient listdeco ders Lemmas and In particular this is sucient to imply the hardness

amplication results of IW Finally we simplify the reconstruction pro cedure of Arora and Sudan and

5

An extractor is an ecient algorithm that on input a distribution sampled from a distribution with high minentropy has

an output that is statistically close to uniform

give an analysis Theorem that works for a wider range of parameters and has a much simpler pro of

In contrast the analysis of Arora and Sudan relies on their dicult analysis of their lowdegree test for

the highly noisy case

Th p olynomial reconstruction problem has b een studied for its applications to program checking average

case hardness results for the p ermanent and random selfreducibility of complete problems in high complexity

classes BF Lip GLR FF GS FL CPS The applicability of p olynomial reconstruction

to hardness versus randomness results was demonstrated by Babai et al BFNW They show that the

existence of a p olynomial reconstruction pro cedure implies that one can convert a worstcase hard predicate

into one which is mildly averagecase hard by enco ding it as a p olynomial In eect our analysis shows that

already at this stage the p olynomial function is very hard hard enough to use with the NW pseudo

random generator This connection b etween p olynomial reconstruction and hardness amplication has also

b een observed indep endently by Wig and S Ravi Kumar and D Sivakumar KS

Preliminaries

n

We write U for the uniform distribution on f g The statistical dierence b etween two random variables

n

X and Y on a universe U is dened to b e max jPr X S Pr Y S j

S U

Our main ob jects of study are pseudorandom generators

d n

Denition A function G f g f g is an s pseudorandom generator if no circuit of size s can

distinguish G from U with advantage greater than That is for every circuit C of size s

n

jPr C U Pr C GU j

n d

We b egin by recalling the NisanWigderson construction of pseudorandom generators

The NisanWigderson generator

The combinatorial construction underlying the NW generator is a collection of sets with small intersections

called a design

Lemma design NW For every m N there exists a a family of sets S S f dg

m

such that

2

d O

log m

For al l i jS j and

i

For al l i j jS S j log m

i j

d

Moreover such a family can be found deterministically in time p oly m

For concreteness one can think of m for some small constant so that d O O log m

Given such a family of sets the NW generator takes a uniformly distributed string of length d and pro duces

m strings of length That is given parameters and m we take the family of sets given by Lemma and

d

m

dene NW f g f g by

m

x x NW x x

S S m S

m 2 1

denotes the pro jection of x onto the co ordinates sp ecied by S where x

i S

i

b ehave as if they are The key prop erty of this generator used in NW IW is that the strings x

S

i

indep endent when they are used as inputs to a hard function Let P f g f g b e any predicate

d m

P

Then the NW pseudorandom generator using P is a function NW PRG f g f g given by

m

P

NW PRG x P x P x P x where x x NW x

m m m

m

The main theorem of NW is that if P is taken to b e a suciently hard on average predicate

P

NW PRG is a go o d pseudorandom generator

m

Theorem NW Suppose P f g f g is a predicate such that no circuit of size s can compute

P correctly on more than a fraction of the inputs NW PRG is an s O m log m pseudorandom

m

m

generator

The pseudorandom generators pro duced by this theorem can b e sp ectacular as the seed length d

O log m can b e much smaller than even logarithmic in the number of output bits if P is suciently

hard The main drawback is that the hypothesis is also extremely strong in that P must b e very hard on

average and much work has b een done to construct predicates that are strong enough for Theorem based

on weaker assumptions BFNW Imp IW IW In the next section we analyze the quality of this

generator when only a mildly hard predicate is used

Pseudorandom generators via pseudo entropy

In this section we show how to build a pseudorandom generator out of a mildly hard predicate in a dierent

and arguably more direct way than IW Sp ecically we show how to directly build a pseudo en

tropy generator from a mildly hard predicate and argue that applying an extractor to its output gives a

pseudorandom generator

Using a mildly hard predicate

Intuitively the reason the NW pseudorandom generator works is that whenever x is a hard instance of

i

P P x is indistinguishable from a random bit If P is very hard as in the hypothesis of Theorem then

i

almost all inputs are hard instances Thus with high probability all the x s will b e hard instances and the

i

limited dep endence of the x s guarantees that the P x s will lo ok simultaneously random

i i

Now supp ose that P is instead only mildly hard in the sense that no small circuit can compute correctly

on more than a fraction of inputs for some small but noticeable Intuitively this means that some

fraction of the inputs are extremely hard for P Thus wed exp ect that a fraction of the output bits of

P

NW PRG are indistinguishable from random so that we should get some crude pseudorandomness out

m

of the generator In fact this intuition ab out hard instances can b e made precise using the following result

of Impagliazzo Imp

Theorem hardcore sets Imp Suppose no circuit of size s can compute P f g f g on

more than a fraction of the inputs in f g Then for every there exists an hardcore set

H f g such that jH j and no circuit of size s s can compute P correctly on more than

fraction of the inputs in H a

P

when a mildly hard Using this theorem we can prove something ab out the output of NW PRG

m

of predicate P is used Notice that if x is chosen uniformly at random then each comp onent x x

i S

i

the output of NW x is uniformly distributed in f g Hence the expected number of x s that land

m i

P

in H is m Thus the earlier intuition suggests that the output of NW PRG should have m bits of

m

pseudorandomness and this is in fact true

Theorem Suppose no circuit of size s can compute P f g f g on more than a fraction of

m

the inputs in f g Then for every there is a distribution D on f g of Shannon entropy

at least m such that no circuit of size s m s O m log m can distinguish the output of

d m

P

NW PRG f g f g from D with advantage greater than

m

Pro of Let H b e a mhardcore set for P as given by Theorem We will show that the following

distribution satises the requirements of the theorem

d

Distribution D Cho ose x uniformly from f g Let x x NW x If x H select

m i

b f g uniformly at random and if x H let b P x Output b b

i i i i m

P

6

Recall that the Shannon entropy of a distribution D is D Pr D log Pr D

H

7

A similar bitipping technique is used in HILL to prove their construction of a false entropy generator

First we argue that the entropy of D is at least m Dene N x x to b e the number of x s

m i

d

that are in H Then for any x f g the entropy of D j ie D conditioned on x is N NW x By the

x

denition of the NisanWigderson generator each x is individually uniformly distributed and therefore

i

lands in H with probability By linearity of exp ectations the exp ectation of N NW x over uniformly

selected x is m Thus since conditioning reduces entropy cf CT Th

D j D

H H E

x

x

N NW x

E

x

m

P

Now we show that D and NW PRG are computationally indistinguishable Supp ose that some circuit

m

C distinguishes the output of NW PRG from D with advantage greater than We will show that C

m

must b e of size at least m s O m log m By complementing C if necessary we have

P

Pr C NW PRG U Pr C D

d

m

For x f g and r f g dene

r if x H

Qx r

P x otherwise

P

U dened as follows Now consider hybrids D D of D and NW PRG

d m

m

d

Distribution D Cho ose x uniformly from f g and choose r r uniformly from f g For

j m

r Output p p q q and q Qx j m let p P x

j i i m j S j S

j j

P

Thus D NW PRG U and D D By the hybrid argument of GM cf Gol Sec

d m

there is an i such that

m Pr C D Pr C D

i i

H Pr C D j x

i S

i

H H Pr C D j x H Pr C D j x Pr C D j x

i S i S i S

i i i

H H Pr C D j x Pr C D j x

i S i S

i i

H Expanding and using where the last equality is b ecause D and D are identical conditioned on x

i i S

i

H we have r r when x the fact that q Qx

i i S i S

i i

H r j x r Qx r Qx P x C P x Pr

m S i S i S S S

i m i+1 i1 1

xr r

i m

H r j x r Qx Qx Pr P x C P x

m S i S S S S

i m i+1 i 1

xr r

i+1 m m

d

where x is chosen uniformly in f g and r r are selected uniformly in f g Renaming r as b and

i m i

using the standard transformation from distinguishers to predictors Yao cf Gol Sec we see

that

H j x r b P x r Qx bQx P x C P x Pr

S m S i S S S S

i i m i+1 i1 1

xbr r

m

i+1 m

Using an averaging argument we can x r r b and all the bits of x outside S while preserving the

i m i

for as z we now observe that z varies uniformly over H while P x prediction advantage Renaming x

S S

j i

r for j i are now functions P of z that dep end on only jS S j log m bits of z So j i and Qx

j j i j S

j

we have

Pr C P z P z bP z P z b P z

i i m

z

m

Each P can b e computed by a circuit of size O m log m since every function of log m bits can b e

j

computed by a circuit of that size Incorp orating these circuits and b into C we obtain a circuit C of size

sizeC O m log m such that Pr C z P z

z

m

Now since H is mhardcore for P as in Theorem C must have size greater than m

s m s and hence C must have size greater than m s O m log m

Thus using a mildly hard predicate with the NW generator we can obtain many bits of crude pseudo

randomness A natural next step would b e to try to extract this crude pseudorandomness and obtain an

output that is indistinguishable from the uniform distribution Unfortunately one cannot hop e to extract

uniformly distributed bits from a distribution that just has high Shannon entropy Extraction is only p ossible

from distributions that have high minentropy Recall that a distribution D on a nite set S is said to have

k

minentropy k if for all x D Pr X x

The reason that we were only able to argue ab out Shannon entropy in Theorem is that we could only

say that m x s land in H on average To obtain a result ab out minentropy we would need to guarantee

i

that many x s lie in H with high probability Clearly this would b e the case if the x s were generated

i i

pairwise indep endently instead of via the NW generator But we also need the sp ecial prop erties of the NW

generator to make the current argument ab out indistinguishability work Following IW we resolve this

dilemma by taking the XOR of the two generators to obtain a new generator with the randomness prop erties

of each That is we obtain x x from a seed x using the NW generator we obtain y y pairwise

m m

indep endent from a seed y and then use z x y z x y as the inputs to the predicate

m m m

P As we will prove shortly this gives a generator whose output is indistinguishable from some distribution

with high minentropy as desired

A pseudo entropy generator

The following denition following HILL formalizes the type of generator we obtain

d m

Denition A generator G f g f g is a k s pseudo entropy generator if there is a distribu

m

tion D on f g of minentropy k such that no circuit of size s can distinguish the output of G from D

with advantage greater than

Remark The ab ove denition diers from that of HILL in several ways Most imp ortantly we require

the output to b e indistinguishable from having high minentropy whereas they only require that it b e

indistinguishable from having high Shannon entropy They later convert to the Shannon entropy to min

entropy by taking many samples on indep endent seeds but we cannot aord the extra randomness needed

to do this Other dierences are that we ask for indistinguishability against circuits rather than uniform

adversaries that we do not require that G b e computable in p olynomial time and that we do not explicitly

ask that k b e larger than d though the notion is uninteresting otherwise

Recall that we need a way of generating many pairwise indep endent strings from a short seed

Lemma CG see also Gol For any N and m there is a generator PI f g

m

m

f g such that for y selected uniformly at random the random variables PI y PI y are

m m m

pairwise independent Moreover PI is computable in time p oly m

m

Let P f g f g b e any predicate let m b e any p ositive integer and let d b e the seed length of

d m

P

NW Then our pseudo entropy generator using P is a function PE f g f g given by

m

m

P

PE x y P x y P x y P x y

m m

m

where

x x NW x and y y PI y

m m m m

The following theorem conrms that this construction do es in fact yield a pseudo entropy generator

8

IW take the XOR of the NW generator with a generator coming from a random walk on an expander

Theorem Suppose no circuit of size s can compute P f g f g on more than a fraction

d m

P

of the inputs in f g Then for any m PE f g f g is a k s pseudoentropy

m

generator with

seed length d O log m

pseudoentropy k m

adversary size s m s O m log m

adversarys maximum advantage O m

2

P

log m

Moreover PE is computable in time p oly m with m oracle calls to P

m

Pro of Let m Let H b e a mhardcore set for P as given by Theorem Like in the pro of

of Theorem we consider the following distribution D

d

Distribution D Cho ose x uniformly from f g and y uniformly from f g Let x x

m

NW x and y y PI y If x y H select b f g uniformly at random and if

m m m i i i

x y H let b P x y Output b b

i i i i i m

m By an argument like the pro of of Theorem it can b e shown that no circuit of size s

P

s O m log m m s O m log m can distinguish D from PE with advantage greater than

m

The only change needed is that y should b e xed at the same time as r r b and all the bits of

i m

y rather than just x x outside S and z should b e x

i S i S

i i

Next we argue that D has statistical dierence at most m from some distribution D with min

entropy m This will complete the pro of with m O m as the advantage of any circuit

P P

in distinguishing D from PE is at most its advantage in distinguishing D from PE plus the statistical

m m

dierence b etween D and D

For any w w f g dene N w w to b e the number of w s that are in H As in

m m i

the pro of of Theorem each x y is individually uniformly distributed and therefore lands in H with

i i

probability By linearity of exp ectations the exp ectation of N NW x PI y is m Now since

m m

fy g are pairwise indep endent and indep endent from x it follows that fx y g are also pairwise indep endent

i i i

Thus by Chebyshevs inequality

m m

N NW x PI y Pr

m m

xy

m m

Therefore D has statistical dierence at most m from the following distribution D

d

Distribution D Cho ose x uniformly from f g and y uniformly from f g Let x x

m

NW x and y y PI y If N x y x y m output a uniformly selected

m m m m m

m

string from f g Otherwise select b b as in D and output b b That is if x y H select

m m i i

b f g uniformly at random and if x y H let b P x y

i i i i i i

m

Now we argue that D has minentropy m Let v b e any string in f g Then conditioned on any

m

x and y the probability that D outputs v is at most since in all cases at least m of the output

m

bits of D are selected uniformly and indep endently Thus Pr D v Pr D j v as

E

xy xy

desired

Extracting the randomness

The to ol we will use to transform our pseudo entropy generator into a pseudorandom generator is an extractor

m d m

Denition A function Ext f g f g f g is a k extractor if for every for every

m

distribution D on f g of minentropy k ExtD U has statistical dierence at most from U

d n

We will make use of the following recent construction of extractors

m

Theorem Tre For every m k and such that k m there is a k extractor Ext f g

p

d k

f g f g such that

log m

d O

log k

p

m d k

and Ext f g f g f g is computable in time p oly m d

The following lemma conrms the intuition that applying an extractor to a distribution that is com

putationally indistinguishable from a distribution with high minentropy should yield a distribution that is

indistinguishable from uniform

d m m d

1 2

Lemma Suppose G f g f g is a k s pseudoentropy generator and Ext f g f g

n d d n

1 2

f g is a k extractor computable by circuits of size t Then G f g f g dened by

G u v ExtGu v is a s t pseudorandom generator

Supp ose Pro of Let D b e the distribution of minentropy k that cannot b e distinguished from GU

d

1

n

from uniform with advantage greater U C f g f g is a circuit of size s t that distinguishes G U

d d

2 1

Pr C U U than By complementing C if necessary we have Pr C G U

m d d

2 1

m d

2

Let C f g f g f g b e the circuit of size s given by C x v C Extx v Then

Pr C ExtD U U Pr C GU Pr C D U U Pr C GU

d d d d d d

2 2 1 2 2 1

Pr C U U Pr C GU

m d d

2 1

and U have statistical dierence where the secondtolast inequality follows from the fact that ExtD U

m d

2

d

2

at most Now by an averaging argument the second argument of C can b e xed to some v f g

from D with advantage to obtain a circuit C x C x v of size at most s which distinguishes GU

d

1

greater than This is a contradiction

Summing up we have the following theorem

Theorem There is a universal constant such that the fol lowing holds Let P f g f g be

any predicate such that no circuit of size s can compute P correctly on more than a fraction of the

d m

P

1

inputs where s and s Dene n s and m n and let PE f g f g be the

m

m

m m s O m log m O m pseudoentropy generator of Theorem and let Ext f g

d n d d n

P

2 1 2

f g f g be the m mextractor of Theorem Let PEPRG f g f g be

P P

dened by PEPRG u v ExtPE u v

m

P

Then PEPRG is a s pseudorandom generator with

output length n s

seed length d d O

log s

p

adversary size s s

adversarys maximum advantage O n

2

P

O log s

Moreover PEPRG can be evaluated in time with O n oracle calls to P

In particular supp ose P is a predicate in E such that no circuit of size s can compute P correctly

on more than a p oly fraction of the inputs Then the output length is n the seed

length is O O log n no circuit of size s can distinguish the output from uniform and the

generator can b e evaluated in time p oly n so the resulting pseudorandom generator is suciently strong

to obtain P BPP

Pro of By Theorem

d O O

log m log s

By Theorem

m

log

m

A

O log s O d O

log m log s

and Ext is computable in time t p oly m d p oly m By Lemma no circuit of size s can distinguish

the output of PEPRG from uniform wth advantage greater than O m O n where

s m s O m log m t s p oly s

p

By choosing suciently small s will always b e at least s

Remark As mentioned earlier Hastad et al HILL introduced the notion of a pseudo entropy genera

tor and showed that the crude pseudorandomness of such a generator can b e extracted to yield a pseudoran

dom generator Their work is in the BlumMicaliYao setting in which the generators must b e computable

in time p olynomial in the seed length and hence one can only hop e for the output to b e p olynomially longer

than the seed rather than exp onentially as we obtain Hence throughout their construction they can

aord sup erlinear increases in seed length whereas preserving the seed length up to linear factors is cru

cial for obtaining pseudorandom generators go o d enough for P BPP For example they can aord to use

randomnessinecient extractors such as universal hash functions whereas we require extractors which use

only a logarithmic number of truly random bits which have only b een constructed recently Zuc Tre

P

constructed in Theorem is actually nicer Remark The output of the pseudo entropy generator PE

m

than stated Sp ecically it is indistinguishable from a oblivious bitxing source that is a distribution

on strings of length m in which m k bit p ositions are xed and the other k bit p ositions vary uniformly

and indep endently Such sources were the fo cus of the bit extraction problem studied in Vaz BBR

CGH Fri and the term oblivious bitxing source was introduced in CW To see that the output

P

of PE is indistinguishable from an oblivious bitxing source simply observe that the distribution D given

m

in the pro of of Theorem is such a source Extracting from oblivious bitxing sources in which all but

k bits are xed is an easier task than extracting from a general source of minentropy k and already in

CW there are implicitly extractors sucient for our purp oses

Remark It is natural to ask whether similar ideas can b e used to directly construct BMYtype pseudo

random generators from mild hardness Sp ecically we consider taking the construction of pseudorandom

generators from strong ie very hardonaverage oneway p ermutations of BM Yao and replacing

strong oneway p ermutation with a weak ie mildly hardonaverage one In analogy with Theorem

one might hop e that the resulting generator has output whch is indistinguishable from having high Shannon

entropy Unfortunately this is not the case in general at least not to the extent one might exp ect

n n

To see this let us recall the BMY construction Let f f g f g b e a oneway p ermutation and

n

let b f g f g b e a hardcore predicate for f so no p olynomialtime algorithm can predict bx from

9

Indeed the term extractor was not even present at the time of HILL and the rst constructions of randomnessecient

extractors used their Leftover Hash Lemma as a starting p oint

10

Actually D is a convex combination of oblivious bitxing sources Distribution X is said to b e a convex combination

of distributions X X if there is a distribution on I on f tg such that X can b e realized by choosing i f tg

t

1

according to I taking a sample x from X and outputting x It is easy to see that any extractor for oblivious bitxing sources

i

also works for convex combinations of them

n k

f x with inversepolynomial advantage over the choice of x Then the G f g f g is dened

f b

k O

by G x bxbf xbf x bf x It is shown in BM Yao that as long as k n the

f b

output of G cannot b e distinguished from uniform by any p olynomialtime algorithm

f b

Now we show to construct a weak oneway p ermutation F and a predicate B so that B x is mildly

unpredictable from F x for which the output of G is distinguishable from every distribution of high

F B

n n

Shannon entropy To construct F let f f g f g b e a strong oneway p ermutation with hardcore

n

bit b f g f g as ab ove Let t dlog ne F will b e a p ermutation on strings of length n t where

n

t t

the last t bits are viewed as an integer from to For x f g and i f g we dene

t

x i mo d if i f n g

F x i

t

f x i mo d otherwise

x if i f n g

i

B x i

bx otherwise

where x denotes the i st bit of x It is easy to verify no p olynomialtime algorithm can invert F on

i

more than say of the inputs and similarly B x i cannot b e predicted from F x i with probability

t

greater than say On the other hand from the rst n bits of G x i it is easy to predict the

F B

t

remaining bits with probability n successive applications of F always passes through a sequence of

p oints of the form y y y n during which the hardcore bits completely reveal y All further

applications of F and B are then p olynomialtime computable given y Therefore the output of G is

F B

t

distinguishable from any distribution with Shannon entropy greater than n O n whereas an analogy

with Theorem would exp ect indistinguishability from Shannon entropy k since B cannot b e predicted

with probability more than The mild hardness of F and B can b e varied in this counterexample by

increasing or decreasing t relative to log n

List deco ding and amplication of hardness

Recall the main theorem of Nisan and Wigderson Theorem which states that given a suciently hard

onaverage predicate P f g f g one can get a pseudorandom generator To obtain such a predicate

Impagliazzo and Wigderson IW start from a a predicate P that is hard in the worst case ie no small

circuit computes it correctly on all inputs and use a lowdegree extension of P to obtain a multivariate

p olynomial function p that is mildly hard on the average as in BFNW They then apply two dierent

XOR lemmas to obtain a functions that grow harder eventually obtaining as hard a function as required in

Theorem We use an alternate approach for this sequence by showing directly that the function p ab ove is

very hard as hard as required for Theorem

In the pro cess we discover a connection b etween amplication of the hardness of functions and ecient

deco ding of errorcorrecting co des In what follows we describ e the deco ding prop erties that we need why

they suce for amplication of hardness and how multivariate p olynomials yield co des with such deco ding

prop erties For the last part we use a result of Arora and Sudan AS which involves a technically hard

pro of We also provide a simpler pro of of their result with some improved parameters These improvements

are not needed for the hardness amplication

Notation and Denitions

We will b e working with errorcorrecting co des over arbitrary alphab ets A word or vector over a q ary

n

alphab et is simply an element of q It will often b e more convenient to think of such a vector as a function

mapping n to q We will switch b etween these two representations frequently

Denition For positive integers n k q with n k an n k co de C is simply an injective map from

q

k n

q to q Elements of the domain of C are referred to as messages and elements of the image are referred

to as co dewords

11

Strictly sp eaking Theorem requires hard Boolean functions This requirement is weakened b oth in the original result of

IW and in our result by using the GoldreichLevin GL construction of hardcore predicates from hard functions

For co des to b e of use to us we will need that the co dewords are suciently far from each other So

n

we dene the Hamming distance b etween two vector x y q to b e the number of co ordinates i such that

xi y i Notice we are already using the functional notation The relative Hamming distance denoted

x y is Pr xi y i

in

In the co des that we construct and use we will exp ect that every pair of co dewords are far from each

other But we wont imp ose such a restriction explicitly We will rather imp ose a restriction that the

co dewords allow for recovery even after many errors have o ccured

n

Denition An n k code C is l listdeco dable if for every word r q there exist at most l

q

codewords c C such that r c In other words at most l codewords agree with any word r

q

fraction of the coordinates r is referred to as the received word in a

q

Of course to make these co des useful we will have some computational requirements We will need an

innite family of co des one for every k capable of enco ding k letters of the alphab et into some n letters

These co des should b e uniformly constructible eciently enco dable and eciently listdeco dable We will

formalize all these notions in the next megadenition Two of these asp ects uniform constructibility and

ecient enco dability are dened along standard lines However the third asp ect listdeco dability will not

b e dened along standard lines We outline the nonstandard asp ects rst

First we will not exp ect the listdeco ding algorithm to return one co deword but rather a list of upto l

co dewords such that all nearby co dewords are included in the list This is natural given our denition

of l listdeco dable co des

Next we will exp ect the listdeco ding algorithm to work in time p olynomial in log k and This is

imp ossible in a conventional mo del of computation since it takes time at least n k to even read

the received word However and this is where the functional view of words b ecomes imp ortant we

will allow the input and output of the listdeco ding algorithm to b e sp ecied implicitly Thus we will

assume we have oracle access to the received word r the input We will also output the co dewords

implicitly by programs that compute the function represented by the co deword These programs will

b e allowed to make oracle calls to the received word r Thus b oth our deco ding algorithm and their

O

output programs are b est thought of as oraclemachines We will use the notation M x to denote

the computation of an oraclemachine M on input x with access to an oracle O When asking for

ecient listdeco ding we will exp ect that the deco ding algorithm as well as its output programs are

all ecient

Finally we will allow our deco ding algorithms as well as their output programs to b e randomized

Below we dene what is means for randomized algorithms to approximately compute functions and

to solve search problems

Denition A randomized procedure A is said to compute a function f X Y at a p oint x X if

Pr Ax f x where the probability is taken over the internal coin tosses of A We say that A has

agreement with f if A computes f on an fraction of the inputs in D We say that A computes

f if it has agreement with f A randomized procedure A is said to solve a search problem S if on input x

s

and security parameter s Pr Ax s S x

Notice that in the latter case we explicitly require that the algorithm A have a controllable error This is

due to the fact that for arbitrary search problems there are no generic techniques to reduce error However

in our case where S x is the set of lists which include all nearby co dewords we can amplify by running

the listdeco ding algorithm several times and outputting the union of the lists So once again we will ignore

the security parameter and just fo cus on getting the right answer with constant probability

We are now ready to dene co des that are nice for our purp ose

Denition A family of codes C fC g is nice if there exist functions n q l Z R Z and a pair of

k

algorithms Encode Decode satisfying the fol lowing conditions exist

k n

For every k C q q is an l decodable code where n nk p oly k q

k

q k p oly k and l l k p olylog k

Encodex k runs in time p oly n and returns C x where n nk

k

r

n

Decode k ie with oracle access to a word r q runs in time p oly log k and outputs a list

k

of oracle machines M M st for every message x q satisfying r C x

l k

q

r

computes x Decode as wel l as M s are al lowed to be randomized there exists j l such that M

j

j

The running time of M is bounded by p oly log k

j

A family of codes is binary if q k

Nice co des suce for amplication of hardness

We rst show that the existence of nice binary co des suce for obtaining functions that are as hard as

required for Theorem given any predicate that is hard in the worstcase

Theorem Let C be a nice family of binary codes Then there exists a constant c such that the fol lowing

is true Let P f g f g be a function such that no circuit of size s computes P Given

0

c

dene P f g f g by P C P Then no circuit of size s s computes the predicate

C P correctly on more than a fraction of the inputs

In particular taking s and assuming s for a suciently smal l constant eg c

P has the fol lowing parameters

seed length O

p

s adversary size s

adversarys maximum advantage s

O

Moreover P can be evaluated in time with access to the entire truth table of P

c

Pro of Let k Assume for contradiction that B is a circuit of size s s that computes

B

C P correctly on more than a fraction of the inputs Then the deco ding algorithm Decode k

B

outputs a list of programs M M such that for some j M computes P correctly Since the running

l

j

B

times of the algorithms M are b ounded by a p olynomial in log k and as a circuit we can express M

j

j

0

c

with some random inputs of size at most for some constant c This circuit will involve some oracle

calls to B Throwing in the circuit for B in place of all the oracle calls increases the size of the circuit to at

0

c

most s Using Adlemans metho d we can now get rid of the random inputs for a O t blowup in

0

c

size of the circuit Thus we get a circuit of size at most s to compute P Setting c c we

get the desired contradiction

We prove the existence of nice families of binary co des in two steps First we show that multivariate

p olynomials lead to a nice family of co des over a growing alphab et Then we use that to construct a nice

family of binary co des

Lemma A nice family of codes with q k p oly log k nk p oly k and l k p oly

exists

Remark The pro of will show that the alphab et size q k is at least This prop erty will b e used

later

12

Here we are again viewing messages and co dewords as functions Since the co des are binary the functions are Bo olean

13

For simplicity we have b ounded number of oracle calls by the running time which in turn we have made a p olynomial of

unsp ecied degree in log k and Clearly to obtain quantitively b etter results one should optimize and compute the number

of oracle calls to the received word in the deco ding pro cedure as this is the only part of the running time which aects the

circuit size multiplicatively

Pro of The enco ding scheme will interpret the message as the values of a multivariate p olynomial on a

sp ecied subset of p oints The enco ding will b e the evaluation of the p olynomial at all inputs Below we

m

sp ecify the choice of the parameters m the number of variables F the eld and H where H is the subset

of p oints where the p olynomial is sp ecied by the message

Given k we pick a eld F of cardinality log k and a subset H F of cardinality maxflog k g

m

and set m log k log jH j We let q jF j and asso ciate the set q with F Let b k H b e any

k m

injective map To enco de a string x F we nd a p olynomial p F F of degree at most jH j in

each of the m variables satisfying pbi P i for every i k Such a function do es exist and can b e

m

found easily The function may b e made unique by forcing pz for all z H n imageb Letting

m m

n jF j and asso ciating n with F the enco ding of x is simply the p olynomial function p n F

Note that with these settings

log k

log n m log jF j O log log k log O log k

log jH j

since log jH j maxflog log k log g Thus n p oly k as claimed

The uniform constructibility and ecient enco ding prop erties are standard The deco ding problem

m

reduces to a p olynomial reconstruction problem Given oracle access to a function f F F implicitly

nd a list of all total degree d p olynomials that agree with f on at least an fraction of the places Arora

jF j

and Sudan AS give an ecient solution to this problem In Theorem we give a simpler algorithm

and analysis with improved parameters In particular the theorem gives a solution to this problem provided

p

djF j for some constant c We need only verify that this condition is satised for the jF j c

choice of parameters ab ove With our choice of parameters

log k jH j

log k max d m jH j log k

log jH j log log k log

while jF j log k so the required condition is met The solution pro duces a list of size l p oly

k n

and thus the co de C q q is an l errorcorrectible co de

k

To convert the co des constructed ab ove into binary co des we concatenate them with Hadamard

co des The list deco ding algorithm is extended using the fact that the Hadamard co des have ecient list

deco ders due to a result of Goldreich and Levin GL For our purp oses even the bruteforce algorithm

one that enumerates all co dewords and outputs all that are close to the received word is sucient

Lemma There exists a nice family of binary codes with parameters n p oly k and l p oly

Pro of The enco ding metho d is to concatenate the co des from Lemma with Hadamard co des Let C

b e the co de as given by Lemma We obtain a nice family of binary co des C as follows

Given k and we rst set and let n q l b e the parameters of the co de C In particular

k

t k

q Let t dlog q e and let b q f g b e any injective map To enco de a string x f g

n t

we rst enco de it using C to get y C x q Then we enco de each co ordinate of y as a bit

k k

t

string using the Hadamard co de Had describ ed next For i n let z by i For w f g the

P

t

w th co ordinate of the enco ding Hadz is hw z i w j z j mo d where z j w j f g are the

j

t

co ordinates of w and z viewed as vectors in f g Thus the concatenated enco ding enco des a k bit vector

x as an vector

n

Hadby Hadby Hadby n where y C x q

k

t

Clearly the enco ding is of length at most n n n q p oly k bits It is also clear that the

enco ding for the concatenated co de can b e computed eciently We now describ e its deco ding

The deco ding pro ceeds using the usual paradigm for the deco ding of concatenated co des We rst deco de

each symbol of the inner co de ie the Hadamard co de and then deo cde the outer co de in each case

we use the resp ective deco ding algorithm The details that need to b e veried are We need to sp ecify

the deco ding algorithm for the Hadamard co de We have to implement the deco ding paradigm with

inputoutputs b eing implicit While deco ding the innerco de we dont get unique answers but rather a

list of co dewords We need a listdeco ding version of the deco ding pro cedure

t

Given k and an oracle for the received word r n f g we implement oracles r r r

2

t

n q as follows Given i n we consider the oracle r j f g given by r j j r i j We nd

i i

a list of all elements z q such that Hadbz has agreement at least with r j A brute force

i

implementation takes O q p oly log k time A more ecient solution to this problem would b e to

use an algorithm from GL that runs in time p oly t It is wellknown see for instance Gol

on query i outputs the j th element of this list after that this list has at most elements The oracle r

j

sorting them using some canonical order such the lexicographic order We then invoke the list deco ding

algorithm for C times once for each r and take the union of the lists obtained Thus the resulting

k

j

list is of length at most l p oly

To analyze the correctness of our deco ding algorithm consider a message x such that C x has

k

agreement with r Let y C x An application of Markovs inequality yields that for at least fraction

k

of the indices i n r j has at least agreement with Hadby i and therefore r j r i for some

i i

j

i r j j Since there are only choices for j it follows by averaging that there exists a j such that r

i

j

0

for at least a fraction of the indices i n Since q the listdeco ding

algorithm for C will pro duce a list of upto l p oly oracles which includes x

k

Comparison with IW Theorem and Lemma provide sucient hardness amplication to im

mediately apply the NisanWigderson construction Theorem and obtain the main result of Impagliazzo

and Wigderson IW Sp ecically if P is a predicate in E which cannot b e computed by circuits of size

0

s then P given in Theorem will also b e in E and circuits of size s will not b e able

0

to compute P with advantage more than Plugging such a predicate P into Theorem gives

a pseudorandom generator whose seed length is logarithmic in its output length and security and hence

implies P BPP

In addition our construction provides hardness amplication for other settings of parameters that im

proves over the hardness amplication of IW Sp ecically the input length of P is only a constant factor

more than that of P ie O regardless of the security s In contrast hardness amplication of

IW pro duces a predicate with input length log s which is O only if s Note however

that our construction do es not remove the log s overhead in seed length incurred when subsequently

applying Theorem to obtain a pseudorandom generator Obtaining a construction of pseudorandom gener

ators from hard predicates which increases seed length by only a constant factor for all values of the security

s is still an op en problem Our result demonstrates that it suces to solve this problem for very hardon

average predicates A solution would have signicant implications for the construction of extractors via the

connection b etween extractors and pseudorandom generators recently established by Trevisan Tre

The derandomized XOR lemma of Impagliazzo and Wigderson do es have an imp ortant advantage over

our hardness amplication technique when one starts with a mildly hard predicate rather than a worstcase

hard predicate Sp ecically if P f g f g cannot b e computed by small circuits on more than a

fraction of inputs they obtain a hardonaverage predicate P is computable in time p oly with oracle

access to P Our construction on the other hand do es not take advantage of this mild hardness Instead

we do a global enco ding of P just as if P were worstcase hard to obtain a hardonaverage predicate P

computable in time p oly with oracle access to P It would b e interesting to see if mild hardness could

b e amplied lo cally as in IW using techniques based on errorcorrecting co des

Listdeco ding of multivariate p olynomials

Recall that we wish to solve the following problem

m

Given An oracle f F F and parameters d N and R

Goal Reconstruct an implicit representation for every p olynomial that has agreement with the function

m

f Sp ecically construct randomized oracle machines M M such that for every p olynomial p F F

l

f

of degree d that has relative agreement with f there exists j l such that M computes p

j

We will b e interested in the running time of the reconstruction pro cedure ie the time taken to gen

erate the machines M M as well as the running times of the machines M M

k k

Theorem There exists a constant c such that the reconstruction problem above can be solved in time

p

p oly md log jF j provided c djF j Furthermore the running time of each of the oracle machines

listed in the output is at most p oly md log jF j

Remark This theorem is a strengthening of a theorem due to AS In particular the lower

b ound on here is smaller than that of AS who obtain an unsp ecied p olynomial in d and

jF j

Furthermore our pro of is simpler and in particular do es not require lowdegree testing

p

The b ound of djF j is within a constant factor of the b ound for the univariate case The constant

c ab ove is not optimized in this writeup But our metho ds can push it down to any constant greater

than For the univariate case this constant is No inherent reason is known for the gap

m m

Fix an oracle f F F and a degree d p olynomial p F F with agreement with f We observe that

f

it suces to reconstruct a randomized oracle machine M such that M has suciently high agreement with

p This is due to the existence of selfcorrectors of p olynomials BF Lip GLR GS Sp ecically

we use the following theorem

Theorem GLR There exists a randomized oracle machine Corr taking as parameters integers d

m

and m and a eld F such that on access to an randomized oracle M F F with agreement with some

M

degree d polynomial p Corr computes p in time p oly d m provided jF j d

As in the algorithms of BF Lip GLR we use the prop erties of lines in the mdimensional

m

space F dened b elow

def

m

Denition The line through x y F denoted l is the parametrized set of points fl t

xy xy

m

F F tx ty j t F g Given a function f F F f restricted to the line l is the function f j

xy l

xy

t f l t given by f j

xy l

xy

t is a univariate p olynomial of degree at Notice that if f is a p olynomial of total degree d then f j

l

xy

most d Our strategy to reconstruct the value of p at a p oint x is to lo ok at a random line going through x

On this line p turns into a univariate p olynomial Furthermore the random line through the randomly chosen

m

p oint x is a pairwise indep endent collection of p oints from F Thus p and f will have agreement close to

on this line as well Thus the goal of nding px reduces to the goal of reconstructing p restricted to this

line ie a univariate reconstruction problem a problem that has b een addressed in ALRS Sud GS

In particular we use the following theorem

n

Theorem Sud Given a sequence of n distinct pairs ft v g t v F and integer parameters

i i i i

i

d k a list of al l polynomials g g satisfying jfi f ngjg t v gj k can be reconstructed in

l j i i

p

n

dn Furthermore l time p oly n log jF j provided k

k

m

We describ e a family of reconstruction pro ceduresfM g that will b e used to construct the

z a z F aF

machines M M To gain some intuition into the pro cedure b elow it may b e helpful to consider only

k

the machines M The machines take as parameters a p ositive real number integers d and m and a

z pz

eld F

M x

z a

Explicitly nd a list of distinct univariate p olynomials g g such that this list includes

l

and do es not include any p olynomial all p olynomials that have agreement at least with f j

l

z x

with agreement less than

If there exists a unique index i f l g such that g a then output g else output

i i

anything

Remark Step ab ove can b e computed in time p olynomial in log jF j and d as follows If

F is small enough then we let t t b e all the elements of F and invoke Theorem on the set

n

p p

n

with k n Note that k ft f l t g dn as long as djF j which is true

i z x i

i

by hypothesis If F is to o large to do this then set n p oly d and pick t t distinct at

n

n

with k n Since random from F and then invoking Theorem on the set ft f l t g

i z x i

i

by the furthermore part there are at most p olynomials with agreement at least with f j

l

z x

of Theorem the choice of n guarantees that with high probability all of these p olynomials agree

p

dn on at least n of the t s As the choice of n also guarantees that k n with f j

i l

z x

Theorem yields a list containing all p olynomials with agreement at least Now we wish to

discard all p olynomials with agreement less than this can b e accomplished by comparing each

on a random sample of p oly p oints from F and discarding it if p olynomial g obtained with f j

l

z x

it has agreement smaller than on this sample

The number of p olynomials output in Step ab ove is at most by the furthermore part of

Theorem

is one of the g s returned in Step ab ove To shed some light on the steps ab ove We exp ect that pj

i l

z x

In Step we try to nd out which g to use by checking to see if there is a unique one which has g a

i i

g This pz and if so we use this p olynomial to output px pj recall that pj

i l l

z x z x

intuition is made precise in the Section We now nish the description of the reconstruction pro cedure

Reconstruction algorithm

Rep eat the following O log several times

m

Pick z F at random

m

Pick y F at random

Find a list of univariate p olynomials h h including all p olynomials with agreement at

l

least with f j

l

z y

M

z h (0)

j

For every p olynomial h include the oracle machine Corr in the output list

j

Analysis of the p olynomial reconstruction pro cedure

md

log jF j and outputs a Now we show that the reconstruction algorithm runs in time run in time p oly

list of oracles that includes one for every p olynomial p that has agreement with f Theorem follows

immediately

The claim ab out the running time is easily veried To analyze the correctness it suces to show that

in any iteration of Steps in Reconstruction Algorithm an oracle computing p is part of the output

with say constant probability for any xed p olynomial p of degree d that has agreement with f We show

this in two parts First we argue that for most choices of z M is an oracle that computes p on

z pz

M

z p(z )

computes p everywhere Then we show that for most pairs z y there of all inputs and thus Corr

exists j st the p olynomial h reconstructed in Step satises h pz

j j

p

djF j it is the case that Lemma There exists a constant c st for every d F satisfying c

Pr M x px

z pz

x

m

over the random choice of z F with probability at least

Pro of We rst argue that when b oth x and z are picked at random certain bad events are unlikely to

happ en The next two claims describ e these bad events and upp er b ound their probability

14

This is done as in Remark though here we do not care if the list contains extra p olynomials with low agreement

p

Claim If jF j then

Pr i l st g pj

i l

z x

xz

not to b e included in the output list it has to b e the case that p and f Pro of For the p olynomial pj

l

z x

do not have agreement on the line l But the line is a pairwise indep endent collection of jF j p oints in

z x

m

F The quantity of interest then is the probability that a random variable with exp ectation attains an

average of at most on jF j samples Using Chebychevs inequality this probability may b e b ounded by

2

jF j

p

Claim If djF j then

l d

Pr g and pj j l st g pj

j l j l

z x z x

xz

jF j

Pro of For convenience in this argument assume that M nds al l p olynomials of agreement at least

z pz

rather than just a subset as that is clearly the worst case for the claim Now instead of with f j

l

z x

picking x and z at random and then letting g g b e all degree d p olynomials with agreement with

l

m

g we rst pick z x indep endently and uniformly at random from F and let g f j b e all univariate

0

l

z x

l

degree d p olynomials with agreement with f j We now pick two distinct elements t t uniformly

l

0 0

z x

0 0 0 0

from F and let z l t and x l t Notice that we can express z l t t t and

z x z x z x

0 0

contain the same set of p oints and thus the x l t t t Thus the lines l and l

z x z x z x

def

Thus p olynomials g t g t t t t are exactly the set of p olynomials with agreement with f j

i l

z x

i

t and pj t g g is equivalent to the event pj g g and pj the event pj

l j l j l l

0 0 0 0

z x z x

j j

x z x z

d

for any xed j and thus the probability where t is b eing chosen at random This probability is at most

jF j

p

d

From l and t is at most l that there exists a j st pj t g djF j the claim

l

0 0

j

x z

jF j

follows

Discounting for the two p ossible bad events considered in Claims and we nd that with probability

furthermore there exists a p olynomial g returned in Step of M such that g pj at least

i i l

z pz

z x

px pz Thus the output is g pj this is the unique p olynomial such that g pj

i l i l

z x z x

Thus with probability at least we nd that for a random pair z x M computes px An

z pz

application of Markovs inequality now yields the desired result

one of the polynomials reconstructed in any one execution of Lemma With probability at least

M

z p(z )

and thus one of the oracles created in Step is Corr Step of Reconstruction Algorithm is pj

l

z y

provided jF j is large enough

is Pro of As in Claim we argue that p and f have at least agreement on the line l and then pj

z y l

z y

M

z a

one of the p olynomials output in this step Thus one of the oracles created is Corr pz for a pj

l

z y

Pro of of Theorem Fix any degree d p olynomial p with agreement with f Combining Lemmas

and we nd that with probability one of the oracles output by the reconstruction algorithm is

M

m

z p(z )

Corr and z is such that M computes px for at least fraction of xs in F and thus by

z pz

M

z p(z )

Theorem Corr computes p on every input

times ensures that every p olynomial p with agreement with f is included Rep eating the lo op O log

in the output with high probability using the wellknown b ound that there are only O such p olynomials

cf GRS Theorem

Acknowledgments

We thank Oded Goldreich Amnon TaShma and Avi Wigderson for clarifying discussions and p ointing us

to some related work

References

ACR Alexander Andreev Andrea Clementi and Jose Rolim Worstcase hardness suces for deran

domization a new metho d for hardnessrandomness tradeos In Proceedings of ICALP

pages LNC S SpringerVerlag

ALRS Sigal Ar Richard J Lipton Ronitt Rubinfeld and Madhu Sudan Reconstructing algebraic

functions from mixed data In rd Annual Symposium on Foundations of Computer Science

pages Pittsburgh Pennsylvania Octob er IEEE

AS and Madhu Sudan Improved low degree testing and its applications In Proceed

ings of the TwentyNinth Annual ACM Symposium on Theory of Computing pages El

Paso Texas May

AK V Arvind and J Kobler On resourceb ounded measure and pseudorandomness In Proceedings

of the th Conference on Foundations of Software Technology and Theoretical Computer Science

pages LNCS SpringerVerlag

BFNW Laszlo Babai Lance Fortnow and Avi Wigderson BPP has sub exp onential time

simulations unless EXPTIME has publishable pro ofs Computational Complexity

BF Donald Beaver and Joan Feigenbaum Hiding instances in multioracle queries In th Annual

Symposium on Theoretical Aspects of Computer Science volume of Lecture Notes in Com

puter Science pages Rouen France February Springer

BBR Charles H Bennett Gilles Brassard and JeanMarc Rob ert How to reduce your enemys

information extended abstract In Hugh C Williams editor Advances in Cryptology

CRYPTO volume of Lecture Notes in Computer Science pages SpringerVerlag

August

BM Manuel Blum and How to generate cryptographically strong sequences of pseudo

random bits SIAM Journal on Computing November

CPS JinYi Cai A Pavan and D Sivakumar On the hardness of the p ermanent In th International

Symposium on Theoretical Aspects of Computer Science Lecture Notes in Computer Science

Trier Germany March SpringerVerlag To app ear

CG Benny Chor and Oded Goldreich On the p ower of twopoint based sampling Journal of

Complexity March

CGH Benny Chor Oded Goldreich Johan Hastad Jo el Friedman and Roman Smolen

sky The bit extraction problem or tresilient functions preliminary version In th Annual

Symposium on Foundations of Computer Science pages Portland Oregon Oc

tob er IEEE

CW Aviad Cohen and Avi Wigderson Disp ersers deterministic amplication and weak random

sources extended abstract In th Annual Symposium on Foundations of Computer Science

pages Research Triangle Park North Carolina Octob er November IEEE

CT Thomas M Cover and Joy A Thomas Elements of Information Theory Wiley Series in

Telecommunications John Wiley Sons Inc nd edition

FL and On the hardness of computing the p ermanent of random matrices

Computational Complexity

FF Joan Feigenbaum and Lance Fortnow Randomselfreducibility of complete sets SIAM Journal

on Computing Octob er

Fri Jo el Friedman On the bit extraction problem In rd Annual Symposium on Foundations of

Computer Science pages Pittsburgh Pennsylvania Octob er IEEE

GLR Peter Gemmell Richard Lipton Ronitt Rubinfeld Madhu Sudan and Avi Wigderson Self

testingcorrecting for p olynomials and for approximate functions In Proceedings of the Twenty

Third Annual ACM Symposium on Theory of Computing pages New Orleans Louisiana

May

GS Peter Gemmell and Madhu Sudan Highly resilient correctors for p olynomials Information

Processing Letters September

Gol Oded Goldreich Foundations of Cryptography Fragments of a Book Weizmann In

stitute of Science Available along with revised version from http

wwwwisdomweizmannaciloded

Gol Oded Goldreich A computational p ersp ective on sampling survey Available from http

wwwwisdomweizmannaciloded May

Gol Oded Goldreich Modern Cryptography Probabilistic Proofs and Pseudorandomness June

To b e published by Springer

GGM Oded Goldreich Sha Goldwasser and Silvio Micali How to construct random functions Jour

nal of the ACM Octob er

GL Oded Goldreich and Leonid A Levin A hardcore predicate for all oneway functions In

Proceedings of the Twenty First Annual ACM Symposium on Theory of Computing pages

Seattle Washington May

GNW Oded Goldreich Noam Nisan and Avi Wigderson On Yaos XOR lemma Technical Re

p ort TR Electronic Collo quium on Computational Complexity March http

wwwecccunitrierdeeccc

GRS Oded Goldreich Ronitt Rubinfeld and Madhu Sudan Learning p olynomials with queries

the highly noisy case Technical Rep ort TR Electronic Collo quium on Computational

Complexity Preliminary version in FOCS

GM Sha Goldwasser and Silvio Micali Probabilistic encryption Journal of Computer and System

Sciences April

GS and Madhu Sudan Improved deco ding of ReedSolomon and algebraic

geometric co des Technical Rep ort Electronic Collo quium on Computational Complexity

Preliminary version in FOCS

HILL J Hastad R Impagliazzo L Levin and M Luby A pseudorandom generator from any oneway

function To app ear in SIAM J on Computing

Imp Russell Impagliazzo Hardcore distributions for somewhat hard problems In th Annual

Symposium on Foundations of Computer Science pages Milwaukee Wisconsin

Octob er IEEE

IW Russell Impagliazzo and Avi Wigderson P BPP if E requires exp onential circuits Deran

domizing the XOR lemma In Proceedings of the TwentyNinth Annual ACM Symposium on

Theory of Computing pages El Paso Texas May

IW Russell Impagliazzo and Avi Wigderson Randomness vs time Derandomization under a

uniform assumption In th Annual Symposium on Foundations of Computer Science Palo

Alto CA November IEEE

KvM Adam Klivans and Dieter van Melkebeek Graph nonisomorphism has sub exp onential size

pro ofs unless the p olynomialtime hierarchy collapses Technical Rep ort TR University

of Chicago Department of Computer Science December Extended abstract in these pro

ceedings

KS S Ravi Kumar and D Sivakumar Personal communication Octob er

Lip Richard Lipton New directions in testing In Proceedings of DIMACS Workshop on Distributed

Computing and Cryptography

NW Noam Nisan and Avi Wigderson Hardness vs randomness Journal of Computer and System

Sciences Octob er

NZ Noam Nisan and David Zuckerman Randomness is linear in space Journal of Computer and

System Sciences February

RR Alexander A Razb orov and Steven Rudich Natural pro ofs Journal of Computer and System

Sciences August

Sud Madhu Sudan Deco ding of Reed Solomon co des b eyond the errorcorrection b ound Journal of

Complexity March

STV Madhu Sudan Luca Trevisan and Salil Vadhan Pseudorandom generators without the XOR

lemma Technical Rep ort TR Electronic Collo quium on Computational Complexity De

cember httpwwwecccunitrierdeeccc

STVa Madhu Sudan Luca Trevisan and Salil Vadhan Pseudorandom generators without the XOR

lemma extended abstract In Proceedings of the ThirtyFirst Annual ACM Symposium on the

Theory of Computing pages Atlanta Georgia May

STVb Madhu Sudan Luca Trevisan and Salil Vadhan Pseudorandom generators without the XOR

lemma abstract In Proceedings of the Fourteenth Annual IEEE Conference on Computational

Complexity page Atlanta GA May

Tre Luca Trevisan Constructions of nearoptimal extractors using pseudorandom generators Tech

nical Rep ort TR Electronic Collo quium on Computational Complexity Extended

abstract in these pro ceedings

Tre Luca Trevisan Constructions of nearoptimal extractors using pseudorandom generators In

Proceedings of the ThirtyFirst Annual ACM Symposium on the Theory of Computing pages

Atlanta Georgia May

Val Leslie G Valiant A theory of the learnable Communications of the ACM

Vaz Umesh V Vazirani Towards a strong communication complexity theory or generating quasi

random sequences from two communicating slightlyrandom sources extended abstract In

Proceedings of the Seventeenth Annual ACM Symposium on Theory of Computing pages

Providence Rho de Island May

Wig Avi Wigderson Personal communication Octob er

Yao Andrew C Yao Theory and applications of trap do or functions extended abstract In rd

Annual Symposium on Foundations of Computer Science pages Chicago Illinois

November IEEE

Zuc David Zuckerman Simulating BPP using a general weak random source Algorithmica

Octob erNovember