Extracting All the Randomness from a Weakly Random Source

Salil Vadhan

Laboratory for Computer Science, MIT

Cambridge, MA

http://www-math.mit.edu/~salil

[email protected]

Abstract

In this paper, we give two explicit constructions of extractors, both of which work for a source of any min-entropy on strings of length n. The first extracts any constant fraction of the min-entropy using O(log^2 n) additional random bits. The second extracts all the min-entropy using O(log^3 n) additional random bits. Both constructions use fewer truly random bits than any previous construction which works for all min-entropies and extracts a constant fraction of the min-entropy.

The extractors are obtained by observing that a weaker notion of "combinatorial design" suffices for the Nisan–Wigderson pseudorandom generator [NW94], which underlies the recent extractor of Trevisan [Tre98]. We give near-optimal constructions of such "weak designs" which achieve much better parameters than possible with the notion of designs used by Nisan–Wigderson and Trevisan.

1 Introduction

Roughly speaking, an extractor is a function which extracts truly random bits from a weakly random source, using a small number of additional random bits as a catalyst. A large body of work has focused on giving explicit constructions of extractors, as such constructions have a number of applications. A recent breakthrough was made by Luca Trevisan [Tre98], who discovered that the Nisan–Wigderson pseudorandom generator [NW94], previously only used in a computational setting, could be used to construct extractors. Trevisan's extractor improves on most previous constructions and is optimal for certain settings of the parameters. However, when one wants to extract all or most of the randomness from the weakly random source, Trevisan's extractor performs poorly, in that a large number of truly random "catalyst" bits are needed. In this paper, we give an extractor which extracts all of the randomness from the weakly random source using fewer truly random bits than any previous construction. This is accomplished by improving the combinatorial construction underlying the Nisan–Wigderson generator used in Trevisan's construction. Applying a construction of Wigderson and Zuckerman [WZ95], we also obtain improved expanders.

Extractors. A distribution X on {0,1}^n is said to have min-entropy k if for all x ∈ {0,1}^n, Pr[X = x] ≤ 2^{-k}. Think of this as saying that X has "k bits of randomness." A function Ext: {0,1}^n × {0,1}^d → {0,1}^m is called a (k,ε)-extractor if for every distribution X on {0,1}^n of min-entropy k, the induced distribution Ext(X, U_d) on {0,1}^m has statistical difference at most ε from uniform (where U_d is the uniform distribution on {0,1}^d). In other words, Ext extracts m almost truly random bits from a source with k bits of hidden randomness using d additional random bits as a catalyst. The goal is to explicitly construct extractors which minimize d (ideally, d = O(log(n/ε))) while m is as close to k as possible.¹ Dispersers are the analogue of extractors for one-sided error; instead of inducing the uniform distribution, they simply hit all but an ε fraction of points in {0,1}^m.

¹ Actually, since the extractor is fed d truly random bits in addition to the k bits of hidden randomness, one can hope to have m be close to k + d. This will be discussed in more detail under the heading "Entropy loss."

Previous work. Dispersers were first defined by Sipser [Sip88] and extractors were first defined by Nisan and Zuckerman [NZ96]. Much of the motivation for research on extractors comes from work done on "somewhat random sources" [SV86, CG88, Vaz87b, VV85, Vaz84, Vaz87a]. There have been a number of papers giving explicit constructions of dispersers and extractors, with a steady improvement in the parameters [Zuc96, NZ96, WZ95, GW97, SZ98, SSZ98, NT98, TS98, Tre98]. Most of the work on extractors is based on techniques such as k-wise independence, the Leftover Hash Lemma [ILL89], and various forms of composition. A new approach to constructing extractors was recently initiated by Trevisan [Tre98], who discovered that the Nisan–Wigderson pseudorandom generator [NW94] could be used to construct extractors.

Explicit constructions of extractors and dispersers have a wide variety of applications, including simulating randomized algorithms with weak random sources [Zuc96]; constructing oblivious samplers [Zuc97]; constructive leader election [Zuc97]; randomness-efficient error reduction in randomized algorithms and interactive proofs [Zuc97]; explicit constructions of expander graphs, superconcentrators, and sorting networks [WZ95]; hardness of approximation [Zuc96]; pseudorandom generators for space-bounded computation [NZ96]; and other problems in complexity theory [Sip88, GZ97, ACR97].

For a detailed survey of previous work on extractors and their applications, see [NT98].

Our results. In this paper, we construct two extractors:

Theorem 1 For every n, k, m, and ε, such that m ≤ k ≤ n, there are explicit (k,ε)-extractors Ext: {0,1}^n × {0,1}^d → {0,1}^m with

1. d = O( log^2(n/ε) / log(k/m) )

2. d = O( log^2(n/ε) · log(1/γ) ), where 1 + γ = k/(m−1), and 1/m ≤ γ < 1/2.

In particular, using the first extractor with k/m constant, we can extract any constant fraction of the source min-entropy using O(log^2 n) additional random bits, and, using the second extractor with k = m, we can extract all of the source min-entropy using O((log^2 n)(log k)) additional random bits. A comparison of these extractors with the best previous constructions is given in Figure 1. Our second extractor directly improves that of Ta-Shma [NT98], in that ours uses O((log^2 n)(log k)) ≤ O(log^3 n) truly random bits in comparison to a polynomial of unspecified (and presumably large) degree in log n. Both of our extractors use more truly random bits than the extractors of [Zuc97, Tre98] and the disperser of [TS98], but our extractors have the advantage that they work for any min-entropy (unlike [Zuc97]) and extract all or a constant fraction of the min-entropy (unlike [TS98, Tre98]). The disadvantage of the extractors of [GW97] described in Figure 1 is that they only use a small number of truly random bits when the source min-entropy k is very close to the input length n (e.g., k = n − polylog(n)), whereas ours uses O(log^3 n) random bits for any min-entropy. There are also extractors given in [GW97, SZ98] which extract all of the min-entropy, but these use a small number of truly random bits only when the source min-entropy is very small (e.g., k = polylog(n)); these extractors are better discussed later in the context of entropy loss.

Plugging our second extractor into a construction of [WZ95] immediately yields the following expander graphs:

Corollary 2 For every N and K ≤ N, there is an explicitly constructible graph on N nodes with degree (N/K) · 2^{O((log log N)^2 (log log K))} such that every two disjoint sets of vertices of size at least K have an edge between them.

This degree compares with a degree bound of (N/K) · 2^{polylog(log N)} due to Ta-Shma [NT98]. Such expanders have applications to sorting and selecting in rounds, constructing depth-2 superconcentrators, and constructing non-blocking networks [Pip87, AKSS89, WZ95].

The Trevisan extractor. The main tool in the Trevisan extractor is the Nisan–Wigderson generator [NW94], which builds a pseudorandom generator out of any predicate P such that the security of the pseudorandom generator is closely related to how hard P is to compute on average. Let S = (S_1, ..., S_m) be a collection of subsets of [d], each of size ℓ, and let P: {0,1}^ℓ → {0,1} be any predicate. For a string

reference       min-entropy k     output length m     additional randomness d       type
[GW97]          any k             m = k               d = O(n − k + log(1/ε))       extractor
[Zuc97]         k = Ω(n)          m = (1 − α)k        d = O(log n)                  extractor
[NT98]          any k             m = k               d = polylog(n)                extractor
[TS98]          any k             m = k^{1−o(1)}      d = O(log n)                  disperser
[Tre98]         k = n^{Ω(1)}      m = k^{1−α}         d = O(log n)                  extractor
                any k             m = k^{1−α}         d = O(log^2(n) / log k)       extractor
ultimate goal   any k             m = k               d = O(log n)                  extractor
this paper      any k             m = (1 − α)k        d = O(log^2 n)                extractor
                any k             m = k               d = O((log^2 n)(log k))       extractor

Above, α is an arbitrarily small constant.

Figure 1: Comparison with best previous constructions

y ∈ {0,1}^d, define y|_{S_i} to be the string in {0,1}^ℓ obtained by projecting y onto the coordinates specified by S_i. Then the Nisan–Wigderson generator NW_{S,P}: {0,1}^d → {0,1}^m is defined as

  NW_{S,P}(y) = P(y|_{S_1}) · · · P(y|_{S_m}).

In the "indistinguishability proof" of [NW94], it is shown that for any function D: {0,1}^m → {0,1} which distinguishes the output of NW_{S,P}(y) (for uniformly selected y) from the uniform distribution on {0,1}^m, there is a "small" circuit C (or procedure of small "description size") such that C^D (i.e., C with oracle access to D) approximates P reasonably well. It is shown that the size of C is related to max_{i≠j} |S_i ∩ S_j|, so one should use a collection of sets in which this quantity is small, while trying to minimize the seed length d.

We now give a rough description of the Trevisan extractor Ext: {0,1}^n × {0,1}^d → {0,1}^m. For a string u ∈ {0,1}^n, let ū ∈ {0,1}^n̄ be an encoding of u in an error-correcting code (whose properties are unimportant in this informal description), and define ℓ = log n̄. We view ū as a boolean function ū: {0,1}^ℓ → {0,1}. Then the extractor is simply

  Ext(u, y) = NW_{S,ū}(y) = ū(y|_{S_1}) · · · ū(y|_{S_m}).

The analysis of this extractor in [Tre98] shows that the output of this extractor is close to uniform as long as the source min-entropy required is greater than the size of the circuit built in the security reduction of [NW94]. Hence, one needs to keep this circuit size small while minimizing the number d of truly random bits needed, which is equal to the seed length of the Nisan–Wigderson generator. Unfortunately, using max_{i≠j} |S_i ∩ S_j| as the measure of the circuit size as in [NW94, Tre98], one cannot make d much smaller than what is obtained in [Tre98].

The improvement. The improvements of this paper stem from the observation that

  max_i Σ_{j<i} 2^{|S_i ∩ S_j|}

is actually a much better measure than max_{i≠j} |S_i ∩ S_j| of the size of the circuit built in the Nisan–Wigderson security reduction. So we are left with the problem of constructing set systems in which this quantity is small; we call such set systems weak designs, in contrast to designs, in which max_{i≠j} |S_i ∩ S_j| is bounded. We show that with weak designs, one can have d much smaller than is possible with the corresponding designs. The weak designs used in our first extractor are constructed using an application of the Probabilistic Method, which we then derandomize using the Method of Conditional Expectations (see [ASE92] and [MR95, Ch. 5]). We then apply a simple iteration to these first weak designs to obtain the weak designs used in our second extractor. We also prove a lower bound showing that our weak designs are near-optimal.

Entropy loss. Since a (k,ε)-extractor Ext: {0,1}^n × {0,1}^d → {0,1}^m is given k bits of hidden randomness in its first input and d truly random bits in its second input, one can actually hope for the output length m to be almost k + d, rather than just k. The quantity Λ = k + d − m is therefore called the entropy loss of the extractor. Hence, in this language, the goal in constructing extractors is to simultaneously minimize both d and the entropy loss.

Nonconstructively, one can show that, for any n and k ≤ n, there exist extractors Ext_{n,k}: {0,1}^n × {0,1}^d → {0,1}^{k+d−Λ} with d = log(n − k) + O(1) and entropy loss Λ = 2 log(1/ε) + O(1), and these bounds on d and Λ are tight up to additive constants [RT97]. The explicit constructions, however, are still far from achieving these parameters. As for what is known, every entry in Figure 1 with m = k has an entropy loss of d. For example, the extractor of [GW97] has an entropy loss of O(n − k + log(1/ε)) (which is only interesting when k is very close to n) and the extractor of [NT98] has an entropy loss of polylog(n). In addition, the "tiny families of hash functions" of [GW97, SZ98] give extractors with d = O(k + log n) and entropy loss 2 log(1/ε) + O(1); these are interesting when k is very small (e.g., k = polylog(n)).

A slight modification to our second extractor enables us to achieve logarithmic entropy loss:

Theorem 3 For every n, k, and ε such that k ≤ n, there is a (k,ε)-extractor Ext: {0,1}^n × {0,1}^d → {0,1}^{k+d−Λ} with

  d = O( log^2(n/ε) · log k )

and entropy loss

  Λ = 3 log(k/ε) + O(1).

2 Preliminaries

In this section, we introduce some standard terminology and notation used throughout the paper. log indicates the logarithm base 2 and ln denotes the natural logarithm. If X is a probability distribution on a finite set, we write x ← X to indicate that x is selected according to X. Two distributions X and Y on a set S are said to have statistical difference (or variation distance) ε if

  max_D | Pr[D(X) = 1] − Pr[D(Y) = 1] | = ε,

where the maximum is taken over all functions ("distinguishers") D: S → {0,1}. A distribution X is said to have min-entropy k if for all x, Pr[X = x] ≤ 2^{-k}. It is useful to think of distributions of min-entropy k as being uniform over a subset of the domain of size 2^k.

We write U_j for the uniform distribution on strings of length j. A function Ext: {0,1}^n × {0,1}^d → {0,1}^m is a (k,ε)-extractor if for every distribution X of min-entropy k, Ext(X, U_d) has statistical difference at most ε from U_m. In other words, Ext extracts m almost truly random bits from a source with k bits of hidden randomness using d additional random bits as a catalyst. We say that a family of extractors {Ext_i: {0,1}^{n_i} × {0,1}^{d_i} → {0,1}^{m_i}}_{i∈I} is explicit if Ext_i can be evaluated in time poly(n_i, d_i).
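For intuition, these definitions can be checked mechanically on small distributions. The sketch below is our illustration (not part of the paper); it uses the fact that the maximizing distinguisher outputs 1 exactly where X puts more mass than Y, so the statistical difference equals half the L1 distance between the probability vectors.

```python
from math import log2

def min_entropy(dist):
    """Min-entropy of a distribution given as {outcome: probability}:
    the largest k such that every outcome has probability at most 2^-k."""
    return -log2(max(dist.values()))

def statistical_difference(X, Y):
    """max_D |Pr[D(X)=1] - Pr[D(Y)=1]| over all distinguishers D.
    The maximizing D outputs 1 exactly on the points where X has more
    mass than Y, so the maximum equals half the L1 distance."""
    support = set(X) | set(Y)
    return sum(abs(X.get(s, 0.0) - Y.get(s, 0.0)) for s in support) / 2

# A source uniform on 4 of the 8 strings in {0,1}^3 has min-entropy 2.
flat = {format(i, '03b'): 0.25 for i in range(4)}
```

This also matches the remark above: a distribution uniform on a set of size 2^k has min-entropy exactly k.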

3 Combinatorial designs

The combinatorial constructions underlying the Nisan–Wigderson generator are combinatorial designs.²

Definition 4 ([NW94]) A family of sets S_1, ..., S_m ⊆ [d] is an (ℓ,ρ)-design if

1. For all i, |S_i| = ℓ.

2. For all i ≠ j, |S_i ∩ S_j| ≤ log ρ.

² There is a somewhat related notion in the combinatorics literature known as a 2-design (see, e.g., [AK92]). In 2-designs, strong additional regularity requirements are imposed (such as all the pairwise intersections being exactly the same size and all points being contained in the same number of sets). These additional requirements are irrelevant in our applications.

Motivation. In Trevisan's extractor, the parameters of a design correspond to the parameters of the extractor as follows:

  source min-entropy    ≈ ρm
  output length         = m
  input length          = 2^ℓ
  additional randomness = d

Hence, our goal in constructing designs is to minimize d given parameters m, ℓ, and ρ such that ρ > 1. Notice that 1/ρ is essentially the fraction of the source min-entropy that is extracted, so ideally ρ would be as close to 1 as possible.

One explicit construction of designs is given by the following:

Lemma 5 ([NW94, Tre98]) For every m, ℓ, and ρ > 1, there exists an efficiently constructible (ℓ,ρ)-design S_1, ..., S_m ⊆ [d] with

  d = (ℓ^2 / log ρ) · m^{O(1/log ρ)}.

Notice that the dependence on ρ is very poor. In particular, if we want to extract a constant fraction of the min-entropy, we need more than m^c truly random bits for some c > 0. This is unavoidable with the current definition of designs: if ρ < 2, then all the sets must be disjoint, so d ≥ mℓ. In general, we have the following lower bound, obtained in joint work with Luca Trevisan and proved in Section 6:

Proposition 6 If S_1, ..., S_m ⊆ [d] is an (ℓ,ρ)-design, then

  d ≥ m^{1/⌈log 2ρ⌉} · (ℓ − log ρ).

The improvements of this paper stem from the observation that actually a weaker form of design suffices for the Nisan–Wigderson generator and the construction of extractors:

Definition 7 A family of sets S_1, ..., S_m ⊆ [d] is a weak (ℓ,ρ)-design if

1. For all i, |S_i| = ℓ.

2. For all i,

  Σ_{j<i} 2^{|S_i ∩ S_j|} ≤ ρ · (m − 1).
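For concreteness, both conditions of Definition 4 and Definition 7 can be checked mechanically; the following checker is our own illustration (names and examples are ours, not from the paper):

```python
from math import log2

def is_design(sets, ell, rho):
    """Definition 4: every |S_i| = ell and |S_i & S_j| <= log2(rho) for i != j."""
    if any(len(S) != ell for S in sets):
        return False
    return all(len(sets[i] & sets[j]) <= log2(rho)
               for i in range(len(sets)) for j in range(i))

def is_weak_design(sets, ell, rho):
    """Definition 7: every |S_i| = ell and, for every i,
    the sum over j < i of 2^{|S_i & S_j|} is at most rho * (m - 1)."""
    m = len(sets)
    if any(len(S) != ell for S in sets):
        return False
    return all(sum(2 ** len(sets[i] & sets[j]) for j in range(i)) <= rho * (m - 1)
               for i in range(m))
```

A family of pairwise-disjoint sets passes both checks, illustrating the observation below that every (ℓ,ρ)-design is also a weak (ℓ,ρ)-design.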

We will show that the parameters of a weak design correspond to the parameters of our extractors in the same way that designs corresponded to the parameters of Trevisan's extractor. Notice that every (ℓ,ρ)-design is a weak (ℓ,ρ)-design. But one can, for many settings of m, ℓ, and ρ, achieve weak (ℓ,ρ)-designs S_1, ..., S_m ⊆ [d] with much smaller values of d than possible with (ℓ,ρ)-designs. Indeed, we will prove the following in Section 5 using a probabilistic argument:

Lemma 8 For every ℓ, m ∈ N and ρ > 1, there exists a weak (ℓ,ρ)-design S_1, ..., S_m ⊆ [d] with

  d = ⌈ℓ / ln ρ⌉ · ℓ.

Moreover, such a family can be found in time poly(m, d).

This is already much better than what is given by Lemma 5; for constant ρ, d is O(ℓ^2) instead of ℓ^2 · m^c. However, as ρ gets very close to 1, d gets very large. Specifically, if ρ = 1 + γ for small γ, then the above gives d = O(ℓ^2/γ). To improve this, we notice that the proof of Lemma 8 does not take advantage of the fact that there are fewer terms in Σ_{j<i} 2^{|S_i ∩ S_j|} when i is small; indeed the proof actually shows how to obtain Σ_{j<i} 2^{|S_i ∩ S_j|} < ρ · (i − 1) with the same d.³ Since we only need a bound of ρ · (m − 1) for all i, this suggests that we should "pack" more sets in the beginning. This packing is accomplished by iterating the construction of Lemma 8 (directly inspired by the iteration of Wigderson and Zuckerman [WZ95] on extractors), and yields the following improvement.

Lemma 9 For every ℓ, m ∈ N and 3/m ≤ γ < 1/2, there exists a weak (ℓ, 1+γ)-design S_1, ..., S_m ⊆ [d] with

  d = O( ℓ^2 · log(1/γ) ).

Moreover, such a family can be found in time poly(m, d).

In particular, we can take γ = 1/m and extract essentially all of the entropy of the source using d = O(ℓ^2 · log m) truly random bits. Lemma 9 will be proven in Section 5.

For extractors which use only O(log n) truly random bits, where n is the input length, one would need d = O(ℓ). However, one cannot hope to do better than Ω(ℓ^2) using the current analysis with weak designs. Indeed, the following proposition shows that our weak designs are optimal up to the log(1/γ) factor in our second construction.

Proposition 10 For every weak (ℓ,ρ)-design S_1, ..., S_m ⊆ [d],

  d ≥ min( ℓ^2 / (2 log 2ρ), mℓ / 2 ).

Notice that d = mℓ can be trivially achieved (having all the sets disjoint) and that log 2ρ approaches 1 as ρ approaches 1, so the lower bound for ρ ≈ 1 is essentially ℓ^2.

4 The extractor

In this section, we describe the Trevisan extractor and analyze its performance when used with our weak designs. The description of the extractor follows [Tre98] very closely. The main tool in the Trevisan extractor is the Nisan–Wigderson generator [NW94]. Let S = (S_1, ..., S_m) be a collection of subsets of [d] of size ℓ, and let P: {0,1}^ℓ → {0,1} be any boolean function. For a string y ∈ {0,1}^d, define y|_{S_i} to be the string in {0,1}^ℓ obtained by projecting y onto the coordinates specified by S_i. Then the Nisan–Wigderson generator NW_{S,P} is defined as

  NW_{S,P}(y) = P(y|_{S_1}) · · · P(y|_{S_m}).
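Operationally, the generator is just m projections of the seed followed by m evaluations of P. A minimal sketch (ours; the set family and the parity predicate are toy choices, not the designs or predicates used in the paper):

```python
def nw_generator(sets, predicate, y):
    """NW_{S,P}(y): project the seed y (a bit string of length d) onto the
    coordinates in each S_i and evaluate the predicate on each projection."""
    bits = []
    for S in sets:
        projection = ''.join(y[i] for i in sorted(S))
        bits.append(predicate(projection))
    return ''.join(str(b) for b in bits)

def parity(bits):
    """A toy predicate P: {0,1}^l -> {0,1}."""
    return bits.count('1') % 2

# Seed length d = 4 and three sets of size l = 2 (a toy family, not a design):
sets = [{0, 1}, {1, 2}, {2, 3}]
```

Note how the seed length is d while the output length is m, which is why the combinatorics of the set family governs the seed/output trade-off.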

In addition to the Nisan–Wigderson generator, the Trevisan extractor makes use of error-correcting codes:

Lemma 11 (Error-correcting codes) For every n and δ there is a code EC_{n,δ}: {0,1}^n → {0,1}^n̄, where n̄ = poly(n, 1/δ), such that every Hamming ball of relative radius 1/2 − δ in {0,1}^n̄ contains at most 1/δ^2 codewords. Furthermore, EC_{n,δ} can be evaluated in time poly(n, 1/δ), and n̄ can be assumed to be a power of 2.

We will actually use a stronger property of such error-correcting codes:

Proposition 12 Let EC_{n,δ} be any family of codes such that every Hamming ball of relative radius 1/2 − δ contains fewer than B codewords and ln B ≤ δ^2 · n̄. Then, for sufficiently small δ, every ℓ_1-ball of relative radius 1/2 − 2δ in [0,1]^n̄ ⊆ R^n̄ contains at most B codewords. In other words, for every real vector v ∈ [0,1]^n̄, there are at most B codewords u with

  (1/n̄) · Σ_{i=1}^{n̄} |v_i − u_i| ≤ 1/2 − 2δ.

³ In fact, it is necessary that d = Ω(ℓ^2 / log ρ) if Σ_{j<i} 2^{|S_i ∩ S_j|} < ρ · (i − 1) for all i. See the remark after the proof of Proposition 10.

Since we may assume that n̄ ≥ δ^{-2} ln(δ^{-2}) in Lemma 11, Proposition 12 applies to those codes with B = 1/δ^2. The proof of Proposition 12 was obtained with the help of .

Proof: Suppose not, i.e., there are at least B codewords in an ℓ_1-ball of relative radius 1/2 − 2δ around some real vector v. Then, consider a vector w ∈ {0,1}^n̄ obtained by randomly rounding v. That is, let w_i = 1 with probability v_i for every i independently. Now consider any codeword u within ℓ_1-distance 1/2 − 2δ of v. The Hamming distance between u and w has expectation ‖u − v‖_1 and is the sum of n̄ i.i.d. [0,1]-valued random variables. By the Hoeffding inequality, the probability that the Hamming distance between u and w exceeds ‖u − v‖_1 + δ is at most exp(−2n̄δ^2) ≤ 1/B^2. Hence, the probability that there are fewer than B codewords within Hamming distance 1/2 − δ around w is at most 1/B. Thus, there exists a w with at least B codewords within Hamming distance 1/2 − δ, contradicting the property of the error-correcting code.
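The randomized-rounding step above can be simulated numerically. The sketch below (our illustration, with an arbitrary 0/1 string standing in for a codeword) rounds w_i = 1 with probability v_i and observes that the Hamming distance to a fixed string concentrates around the ℓ_1 distance, as the Hoeffding bound exp(−2n̄δ^2) predicts.

```python
import random

def round_vector(v, rng):
    """Randomized rounding: w_i = 1 with probability v_i, independently."""
    return [1 if rng.random() < p else 0 for p in v]

def l1_distance(u, v):
    return sum(abs(a - b) for a, b in zip(u, v))

def hamming(u, w):
    return sum(a != b for a, b in zip(u, w))

rng = random.Random(0)
n = 2000
v = [rng.random() for _ in range(n)]        # the real vector being rounded
u = [rng.randint(0, 1) for _ in range(n)]   # stand-in for a codeword
expected = l1_distance(u, v)                # E[Hamming(u, w)] = ||u - v||_1
trials = [hamming(u, round_vector(v, rng)) for _ in range(50)]
```

With n̄ = 2000 and δ = 0.075, Hoeffding gives failure probability exp(−22.5) per trial, so every trial should land within δn̄ of the expectation.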

We can now describe the Trevisan extractor, which takes as parameters n, m, k, and ε, where m ≤ k ≤ n. Let EC: {0,1}^n → {0,1}^n̄ be as in Lemma 11, with δ = ε/4m, and define ℓ = log n̄ = O(log(n/ε)). For u ∈ {0,1}^n, we view EC(u) as a boolean function ū: {0,1}^ℓ → {0,1}.

Let S = (S_1, ..., S_m) be a collection of subsets of [d] (for some d) such that |S_i| = ℓ for each i. How S is selected will crucially affect the performance of the extractor; we will later choose it to be one of our weak designs.

Then the extractor Ext_S: {0,1}^n × {0,1}^d → {0,1}^m is defined as

  Ext_S(u, y) = NW_{S,ū}(y) = ū(y|_{S_1}) · · · ū(y|_{S_m}).
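Assembled, the extractor is the Nisan–Wigderson generator with the encoded source string as the predicate. In the toy sketch below (ours), a Hadamard-style encoding stands in for the code EC of Lemma 11 — the actual construction needs the list-decoding property of Lemma 11, for which this encoding is only a convenient placeholder:

```python
def hadamard_bit(u, x):
    """u-bar(x) = <u, x> mod 2: the message u viewed as a boolean function
    on {0,1}^l (a toy stand-in for the code EC of Lemma 11)."""
    return sum(a * b for a, b in zip(u, x)) % 2

def trevisan_style_ext(u, y, sets):
    """Ext_S(u, y) = u-bar(y|_{S_1}) ... u-bar(y|_{S_m})."""
    out = []
    for S in sets:
        x = [y[i] for i in sorted(S)]
        out.append(hadamard_bit(u, x))
    return out

# Message length n = 4 (so l = 4), seed length d = 6, m = 2 output bits:
sets = [{0, 1, 2, 3}, {2, 3, 4, 5}]
```

Each output bit reads only ℓ seed coordinates, so the d truly random seed bits are reused m times — exactly the reuse the weak-design condition has to control.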

We will now analyze this extractor. The following lemma is implicit in [NW94] and is shown more explicitly in [Tre98]. It shows how, from any function D which distinguishes the output of NW_{S,P} from uniform, one can obtain a randomized "program" which, using D as an oracle, predicts P with noticeable advantage. This lemma shows that this "program" can be taken to be of a very simple form, which will allow us to bound its complexity later on. The only modification we need to the proofs of [NW94, Tre98] is that we do not use an averaging argument to fix the "random part" of the hybrids in the "hybrid argument"; rather, we keep these random.

Lemma 13 Let P: {0,1}^ℓ → {0,1} be any predicate and let D: {0,1}^m → {0,1} be any "distinguisher" such that

  Pr_r[D(r) = 1] − Pr_y[D(NW_{S,P}(y)) = 1] > ε,

where r is selected uniformly from {0,1}^m and y from {0,1}^d. Then there exist an i ≤ m and functions P_1, ..., P_{i−1} from {0,1}^ℓ to {0,1} such that

1. Pr_{x,b,r}[ D(P_1(x), ..., P_{i−1}(x), b, r) ⊕ b = P(x) ] > 1/2 + ε/m, where x is selected uniformly from {0,1}^ℓ, b from {0,1}, and r from {0,1}^{m−i}.

2. Each P_j depends on only |S_i ∩ S_j| bits of x (where these bit positions depend only on S and i, but not on P or D).

Proof sketch: We can expand the hypothesis of Lemma 13 as

  Pr_{r_1···r_m}[D(r_1 ··· r_m) = 1] − Pr_y[D(P(y|_{S_1}) ··· P(y|_{S_m})) = 1] > ε,

where r_1, ..., r_m are uniformly and independently selected bits and y is uniformly selected from {0,1}^d. By the "hybrid argument" of [GM84] (cf. [Gol95, Sec. 3.2.3]), there is an i such that

  Pr_{y, r_i···r_m}[ D(P(y|_{S_1}) ··· P(y|_{S_{i−1}}) r_i ··· r_m) = 1 ] − Pr_{y, r_{i+1}···r_m}[ D(P(y|_{S_1}) ··· P(y|_{S_i}) r_{i+1} ··· r_m) = 1 ] > ε/m.

Now, renaming r_i as b and using the standard transformation from distinguishers to predictors [Yao82] (cf. [Gol98, Sec. 3.3.3]), we see that

  Pr_{y, b, r_{i+1}···r_m}[ D(P(y|_{S_1}) ··· P(y|_{S_{i−1}}) b r_{i+1} ··· r_m) ⊕ b = P(y|_{S_i}) ] > 1/2 + ε/m.

Using an averaging argument we can fix all the bits of y outside S_i while preserving the prediction advantage. Renaming y|_{S_i} as x, we now observe that x varies uniformly over {0,1}^ℓ while P(y|_{S_j}) for j ≠ i is now a function P_j of x that depends on only |S_i ∩ S_j| bits of x. So, we have

  Pr_{x, b, r_{i+1}···r_m}[ D(P_1(x) ··· P_{i−1}(x) b r_{i+1} ··· r_m) ⊕ b = P(x) ] > 1/2 + ε/m,

as desired. □

Now we use a counting argument to bound the complexity (or "description size") of the "program" above and illustrate the connection with weak designs:

Lemma 14 There is a set F of functions from {0,1}^{ℓ+1+m} to {0,1}^m (depending only on S) such that

1. For every predicate P: {0,1}^ℓ → {0,1} and distinguisher D: {0,1}^m → {0,1} such that

  Pr_r[D(r) = 1] − Pr_y[D(NW_{S,P}(y)) = 1] > ε,

there exists a function f ∈ F such that

  Pr_{x,b,r}[ D(f(x, b, r)) ⊕ b = P(x) ] > 1/2 + ε/m,

where x is selected uniformly from {0,1}^ℓ, b from {0,1}, and r from {0,1}^m.

2. log |F| ≤ log m + max_i Σ_{j<i} 2^{|S_i ∩ S_j|}.

3. Each function in F can be computed by a circuit of size O( max_i Σ_{j<i} |S_i ∩ S_j| · 2^{|S_i ∩ S_j|} ).⁴

We will not use Item 3 (the bound on circuit size) in the analysis of our extractor; we only use this for our quantitative improvement to the pseudorandom generators of [NW94] given in Section 8.

Proof: By Lemma 13, to meet Condition 1 it suffices to let F be the set of functions f of the form (x, b, r) ↦ (P_1(x), P_2(x), ..., P_{i−1}(x), b, r), where each P_j(x) depends on only some set T_{ij} of bits of x, with |T_{ij}| = |S_i ∩ S_j|. The number of bits it takes to represent i is log m. Given i, the number of bits it takes to represent each P_j is 2^{|T_{ij}|} = 2^{|S_i ∩ S_j|}. So, the total number of bits it takes to represent a function in F is log m + max_i Σ_{j<i} 2^{|S_i ∩ S_j|}, giving the desired bound on log |F|.

For the bound on circuit size, notice that the circuit size of f is simply the sum of the circuit sizes of the P_j's, and every function on k bits can be computed by a circuit of size O(k · 2^k).

We now analyze the extractor Ext_S when we take S to be a weak design. The argument follows the analysis of Trevisan's extractor in [Tre98] except that we use the more refined bounds on |F| given by Lemma 14.

Proposition 15 If S = (S_1, ..., S_m) with S_i ⊆ [d] is a weak (ℓ,ρ)-design for ρ = (k − 3 log(m/ε) − 5)/m, then Ext_S: {0,1}^n × {0,1}^d → {0,1}^m is a (k,ε)-extractor.

Proof: Let X be any distribution of min-entropy k. We need to show that the statistical difference between U_m and Ext_S(X, U_d) is at most ε. By the definition of statistical difference, this is equivalent to showing that, for every distinguisher D: {0,1}^m → {0,1},

  Pr_r[D(r) = 1] − Pr_{u←X, y}[D(NW_{S,ū}(y)) = 1] ≤ ε,

⁴ We measure circuit size by the number of internal gates, so, for example, the identity function has circuit size 0.

where r and y are selected uniformly from {0,1}^m and {0,1}^d, respectively. We have dropped the absolute value in the definition of statistical difference; this is without loss of generality since we may replace D by its binary complement. So let D: {0,1}^m → {0,1} be any distinguisher and let F be as in Lemma 14, so |F| ≤ m · 2^{ρm}. For every f ∈ F, we obtain a function f̂: {0,1}^ℓ → [0,1] given by

  f̂(x) = Pr_{b,r}[ D(f(x, b, r)) ⊕ b = 1 ].

Think of D(f(x, b, r)) ⊕ b as a randomized algorithm built out of D with input x and random coins (b, r). We can view each f̂ as a vector f̂ ∈ [0,1]^n̄. Notice that for a predicate P: {0,1}^ℓ → {0,1} (which we can also view as a vector P ∈ {0,1}^n̄), the ℓ_1-distance between f̂ and P gives the probability that the randomized algorithm corresponding to f computes P incorrectly. That is,

  |f̂ − P|_1 = Pr_{x,b,r}[ D(f(x, b, r)) ⊕ b ≠ P(x) ].

Let B be the set of u for which there exists an f ∈ F such that |f̂ − ū|_1 < 1/2 − ε/2m. In other words, B is the set of "bad" u for which ū can be approximated easily by one of these randomized algorithms f̂. By the property of the error-correcting code given in Proposition 12, for each function f ∈ F, there are at most (4m/ε)^2 strings u ∈ {0,1}^n such that |f̂ − ū|_1 < 1/2 − ε/2m. By the union bound,

  |B| ≤ (4m/ε)^2 · |F| ≤ (4m/ε)^2 · m · 2^{ρm}.

Since X has min-entropy k, each u ∈ B has probability at most 2^{-k} of being selected from X, so

  Pr_{u←X}[u ∈ B] ≤ (4m/ε)^2 · m · 2^{ρm} · 2^{-k}
                 = (4m/ε)^2 · m · 2^{k − 3 log(m/ε) − 5} · 2^{-k}
                 ≤ ε/2.

Now, by Lemma 14, if u ∉ B, then

  Pr_r[D(r) = 1] − Pr_y[D(NW_{S,ū}(y)) = 1] < ε/2.

Thus,

  Pr_r[D(r) = 1] − Pr_{u←X, y}[D(NW_{S,ū}(y)) = 1]
    = E_{u←X}[ Pr_r[D(r) = 1] − Pr_y[D(NW_{S,ū}(y)) = 1] ]
    ≤ Pr_{u←X}[u ∈ B] + Pr_{u←X}[u ∉ B] · (ε/2)
    ≤ ε/2 + ε/2 = ε.

Combining Proposition 15 with the weak designs given by Lemmas 8 and 9 essentially proves Theorem 1. The only technicality is that Proposition 15 does not allow us to take ρ = k/m (or k/(m−1)), which is what we would need to deduce Theorem 1 directly. Instead, we lose Δ = 3 log(m/ε) + 5 bits of the source entropy in Proposition 15. However, since Δ is so small, we can give our extractor Δ more truly random bits in its seed (increasing d by only a constant factor) which we just concatenate to the output to compensate for the loss. The details of this are given below.

Proof of Theorem 1: Let Δ = 3 log(m/ε) + 5. Let k' = k − Δ, m' = m − Δ − 3, and ρ = k'/m' > k/(m−1). For (1) or (2), apply Proposition 15 with the weak (ℓ,ρ)-design S_1, ..., S_{m'} ⊆ [d'] of Lemma 8 or Lemma 9, respectively. This gives a (k,ε)-extractor Ext: {0,1}^n × {0,1}^{d'} → {0,1}^{m'}, with d' = O( log^2(n/ε) / log(k/m) ) or d' = O( log^2(n/ε) · log(1/γ) ), respectively. By using Δ + 3 additional bits in the seed and simply concatenating these to the output, we obtain a (k,ε)-extractor Ext: {0,1}^n × {0,1}^{d'+Δ+3} → {0,1}^m, as desired.

In applying Lemma 9, we need to make sure that ρ < 3/2, but if ρ ≥ 3/2, we can use the weak design of Lemma 8 instead.

Remark The stronger property of error-correcting codes given by Proposition 12 (which corresponds to hardness against randomized algorithms) could have been avoided by using an averaging argument to "fix" r and b in Lemma 13. If this is done in a straightforward manner, we would have to pay a price for these bits in the size of F, as they would be needed to fully describe a function. The cost of these bits can be avoided, however, if we do the hybrid argument while we are still looking at the advantage of the distinguisher averaged over the choice of u; then these bits can be fixed independently of u and absorbed into the distinguisher before we make the counting argument which says that the distinguisher fails with probability at least 1 − ε/2 over the choice of u ← X. Doing the analysis this way eliminates the log m term in the bound on log |F|. However, the approach we have taken, advocated by , corresponds better to the intuition that one should not pay a price for r and b since they can be taken to be random.

5 Construction of weak designs

Proof of Lemma 8: Let ℓ, m, and ρ be given, and let d = ⌈ℓ/ln ρ⌉ · ℓ. We view [d] as the disjoint union of ℓ blocks B_1, ..., B_ℓ, each of size ⌈ℓ/ln ρ⌉. We construct the sets S_1, ..., S_m in sequence so that

1. Each set contains exactly one element from each block, and

2. Σ_{j<i} 2^{|S_i ∩ S_j|} ≤ ρ · (i − 1).

Suppose we have S_1, ..., S_{i−1} ⊆ [d] satisfying the above conditions. We prove that there exists a set S_i satisfying the required conditions using the Probabilistic Method [ASE92] (see also [MR95, Ch. 5]). Let a_1, ..., a_ℓ be uniformly and independently selected elements of B_1, ..., B_ℓ, respectively, and let S_i = {a_1, ..., a_ℓ}. We will argue that with nonzero probability, Condition 2 holds. Let Y_{j,k} be the indicator random variable for whether a_k ∈ S_j, so Pr[Y_{j,k} = 1] = 1/|B_k| = 1/⌈ℓ/ln ρ⌉. Notice that for a fixed j, the random variables Y_{j,1}, ..., Y_{j,ℓ} are independent. Thus,

  E[ Σ_{j<i} 2^{|S_i ∩ S_j|} ] = E[ Σ_{j<i} 2^{Σ_k Y_{j,k}} ]
    = Σ_{j<i} E[ Π_k 2^{Y_{j,k}} ]
    = Σ_{j<i} Π_k E[ 2^{Y_{j,k}} ]
    = (i − 1) · (1 + 1/⌈ℓ/ln ρ⌉)^ℓ
    ≤ (i − 1) · ρ.

Hence, with nonzero probability, Condition 2 holds, so a set S_i satisfying the requirements exists. However, we want to find such a set deterministically. This can be accomplished by a straightforward application of the Method of Conditional Expectations (see [ASE92] and [MR95, Ch. 5]). Details can be found in Appendix A.

Remark A p erhaps more natural way to carry out the ab ove probabilistic construction is to chose S

i

uniformly from the set of all subsets of [d] of size `, rather than dividing [d]into ` blo cks. This gives essentially 10

the same b ounds, but complicates the analysis b ecause the elements of S are no longer indep endent. The

i

cleaner approach in the ab ove pro of was suggested byDavid Zuckerman.
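As a concrete illustration, the construction above can be sketched in a few lines of code. This is our own sketch, not part of the paper: the function name is ours, and the resampling loop is a simple Las Vegas stand-in for the derandomization of Appendix A.

```python
import math
import random

def weak_design(l, m, rho, seed=0):
    """Sketch of the Lemma 8 construction: pick one uniform element from each
    of l disjoint blocks of size ceil(l / ln rho), resampling until Condition 2
    (sum over previous sets of 2^{|S_i cap S_j|} <= rho*(i-1)) holds."""
    rng = random.Random(seed)
    b = math.ceil(l / math.log(rho))                  # block size ceil(l / ln rho)
    blocks = [range(t * b, (t + 1) * b) for t in range(l)]
    sets = []
    for i in range(m):                                # i sets already chosen
        while True:
            s = frozenset(rng.choice(blk) for blk in blocks)
            if sum(2 ** len(s & t) for t in sets) <= rho * i:
                sets.append(s)
                break
    return sets                                       # universe is [l * b]
```

Since the expected value of the sum is at most ρ times the number of previously chosen sets (by the computation above, with a strict gap coming from the ceiling in the block size), each attempt of the inner loop succeeds with positive probability.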

Proof of Lemma 9: For simplicity, assume that 1 + γ = 1/(1 − 2^{−q}) and m = 2^{qh}/(1 + γ). Let d_0 = ⌈ℓ/ln 2⌉·ℓ and let d = h·d_0 = O(ℓ²·h). We view [d] as the disjoint union of h blocks B_1,…,B_h, each of size d_0. For each t ∈ [h], let m_t = 2^{qt} and n_t = ∑_{s=1}^{t−1} m_s, so ∑_t m_t = m.

Now we define our weak design S_1,…,S_m. For each t ∈ [h], we let S_{n_t+1},…,S_{n_t+m_t} ⊆ B_t be a weak (ℓ,2)-design as given by Lemma 8. In other words, we take the ordered union of h weak (ℓ,2)-designs consisting of m_1, m_2, …, m_h sets, respectively, using disjoint subsets of the universe for each. The number of sets is m, the size of the universe is d, and each set is of size ℓ, so we only need to check that for all i ∈ [m], ∑_{j<i} 2^{|S_i ∩ S_j|} < (1 + γ)·(m − 1). For i ∈ {n_t + 1,…,n_t + m_t}, S_i is disjoint from S_j for any j ≤ n_t, and

    ∑_{j=n_t+1}^{i−1} 2^{|S_i ∩ S_j|} ≤ 2·(m_t − 1),

since S_{n_t+1},…,S_{n_t+m_t} is a weak (ℓ,2)-design. Thus, we have

    ∑_{j<i} 2^{|S_i ∩ S_j|} = ∑_{j=1}^{n_t} 2^{|S_i ∩ S_j|} + ∑_{j=n_t+1}^{i−1} 2^{|S_i ∩ S_j|}
                            ≤ n_t + 2·(m_t − 1)
                            < (1 + γ)·(m − 1),

as desired.
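The bookkeeping in this calculation is easy to check numerically. The sketch below is our own illustration (the halving block sizes are one concrete choice, not the exact parametrization above): it computes the bound n_t + 2(m_t − 1) for every block of the ordered union and reports the worst case.

```python
def composition_bounds(block_sizes):
    """For an ordered union of per-block weak (l,2)-designs, a set S_i in
    block t satisfies sum_{j<i} 2^{|S_i cap S_j|} <= n_t + 2*(m_t - 1):
    the n_t earlier sets live in disjoint blocks and contribute 2^0 each."""
    bounds, n = [], 0                  # n accumulates n_t = m_1 + ... + m_{t-1}
    for m_t in block_sizes:
        bounds.append(n + 2 * (m_t - 1))
        n += m_t
    return max(bounds), n              # (worst-case bound, total m)

worst, m = composition_bounds([2 ** k for k in range(9, -1, -1)])  # 512, 256, ..., 1
# with halving block sizes, every block's bound works out to exactly m - 1
```

The larger the early blocks relative to the late ones, the smaller the additive slack consumed by the cross-block terms.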

6 Lower bounds for designs

Proof of Proposition 6: Let I = max_{i≠j} |S_i ∩ S_j| ≤ log ρ. For each j = 1,…,m, let Γ_j be the collection of subsets of S_j of size I + 1, so |Γ_j| = (ℓ choose I+1). Let Γ = ∪_j Γ_j. Notice that the collections Γ_j are disjoint, because no two distinct sets S_i, S_j share more than I elements. Thus, |Γ| = m·(ℓ choose I+1). At the same time, Γ consists of subsets of [d] of size I + 1, so |Γ| ≤ (d choose I+1). So we have

    m·(ℓ choose I+1) ≤ (d choose I+1).

Expanding the binomial coefficients and rearranging terms, we have

    m ≤ (d/ℓ)·((d−1)/(ℓ−1))⋯((d−I)/(ℓ−I)) ≤ (d/(ℓ−I))^{I+1} ≤ (d/(ℓ − log ρ))^{log 2ρ}.

Proof of Proposition 10: We have

    ρ ≥ max_i [1/(m−1)]·∑_{j<i} 2^{|S_i ∩ S_j|}
      ≥ [1/(m(m−1))]·∑_{i=1}^m ∑_{j<i} 2^{|S_i ∩ S_j|}
      = [1/(2m(m−1))]·∑_{i≠j} 2^{|S_i ∩ S_j|}
      ≥ (1/2)·2^{[1/(m(m−1))]·∑_{i≠j} |S_i ∩ S_j|},

where the last inequality follows from Jensen's inequality. Thus,

    log 2ρ > (1/m²)·∑_{i≠j} |S_i ∩ S_j|.    (1)

Now, for a ∈ [d], let n_a = |{i : a ∈ S_i}|. Then ∑_a n_a = ∑_i |S_i| = m·ℓ.

    ∑_{i≠j} |S_i ∩ S_j| = ∑_{a∈[d]} n_a·(n_a − 1)
                        = ∑_a n_a² − ∑_a n_a
                        = ∑_a n_a² − m·ℓ
                        ≥ (1/d)·(∑_a n_a)² − m·ℓ    (by Cauchy–Schwarz)
                        = m²ℓ²/d − m·ℓ
                        ≥ m²ℓ²/(2d),

unless d ≥ mℓ/2. Putting this into Inequality (1), we have

    log 2ρ > (1/m²)·(m²ℓ²/(2d)) = ℓ²/(2d),

which proves the proposition.
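The double-counting step ∑_{i≠j} |S_i ∩ S_j| = ∑_a n_a(n_a − 1) is easy to sanity-check on a small example; the code below is our own, purely illustrative.

```python
from itertools import combinations

def overlap_both_ways(sets, d):
    """Count sum_{i != j} |S_i cap S_j| directly over ordered pairs, and
    again via the multiplicities n_a = |{i : a in S_i}| as sum_a n_a*(n_a-1)."""
    pairwise = 2 * sum(len(a & b) for a, b in combinations(sets, 2))
    n = [sum(x in s for s in sets) for x in range(d)]
    return pairwise, sum(c * (c - 1) for c in n)

lhs, rhs = overlap_both_ways([{0, 1, 2}, {1, 2, 3}, {0, 3}], d=4)
# the two counts agree: each element a contributes one unit of overlap
# for every ordered pair of distinct sets containing it
```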

Remark: The above proof gives a stronger bound on d if we have a family of sets S_1,…,S_m such that for all i, ∑_{j<i} 2^{|S_i ∩ S_j|} ≤ ρ·(i − 1) (e.g., the family of sets constructed in the proof of Lemma 8). If we have such a bound, then summing over i from 1 to m gives

    ρ·(m(m−1)/2) ≥ (1/2)·∑_{i≠j} 2^{|S_i ∩ S_j|},

and applying Jensen's inequality and taking logs as in the above proof gives

    log ρ > (1/m²)·∑_{i≠j} |S_i ∩ S_j|

instead of Inequality (1). Following the rest of the proof without change, this shows that

    d ≥ min( mℓ/2, ℓ²/(2·log ρ) ).

7 Achieving small entropy loss

Recall that the entropy loss of an extractor Ext: {0,1}^n × {0,1}^d → {0,1}^m is defined as Δ = k + d − m, and we can hope for this to be as small as 2·log(1/ε) + O(1), with d = log(n − k) + O(1) [RT97].

In constructing our extractor Ext_S(u, y) = NW_{S,û}(y), we "threw away" y after using it as a seed for the Nisan–Wigderson generator, and hence the d bits of entropy carried by y were lost. However, the analysis of the Nisan–Wigderson generator actually shows that the quality of the generator is not affected if the seed is revealed. Thus, we define Ext′_S(u, y) = (y, NW_{S,û}(y)). Now all the analysis of Ext_S done in Section 4 applies to Ext′_S as well (in Lemmas 13 and 14, give the distinguisher D the seed y in addition to NW_{S,û}(y)), and we obtain the following strengthening of Proposition 15:

Proposition 16: If S = (S_1,…,S_m) with S_i ⊆ [d] is a weak (ℓ,ρ)-design for ρ = (k − 3·log(m/ε) − 5)/m, then Ext′_S: {0,1}^n × {0,1}^d → {0,1}^{m+d} is a (k,ε)-extractor.

Combining Proposition 16 and Lemma 9 with m = k − 1 immediately gives Theorem 3. An additional additive factor of log m can be removed from the entropy loss by taking the alternative approach mentioned in the remark at the end of Section 4. Note that the trick of adding extra bits to the seed and concatenating these to the output, as we did in the proof of Theorem 1, does not help in reducing the entropy loss.

8 Better pseudorandom generators

Using alternative types of designs also gives some quantitative improvements in the construction of pseudorandom generators from hard predicates in [NW94]. From Lemma 14, we see that the relevant notion of design in the setting of circuit-complexity-based pseudorandom generation is the following:

Definition 17: A family of sets S_1,…,S_m ⊆ [d] is a type 2 weak (ℓ,ρ)-design if

1. For all i, |S_i| = ℓ.

2. For all i, ∑_{j<i} |S_i ∩ S_j|·2^{|S_i ∩ S_j|} ≤ ρ·(m − 1).

Notice that it is meaningful to consider even values of ρ less than 1, since |S_i ∩ S_j|·2^{|S_i ∩ S_j|} can be zero.
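For concreteness, Condition 2 can be checked mechanically. The checker below is our own sketch (the name and interface are not from the paper); pairwise disjoint sets illustrate why ρ < 1 is meaningful.

```python
def is_type2_weak_design(sets, d, l, rho):
    """Check Definition 17: each |S_i| = l with S_i inside [d], and for each i,
    sum_{j<i} |S_i cap S_j| * 2^{|S_i cap S_j|} <= rho * (m - 1)."""
    m = len(sets)
    if any(len(s) != l or not s <= set(range(d)) for s in sets):
        return False
    for i in range(m):
        total = sum(len(sets[i] & sets[j]) * 2 ** len(sets[i] & sets[j])
                    for j in range(i))
        if total > rho * (m - 1):
            return False
    return True

# disjoint sets contribute |S_i cap S_j| * 2^{|S_i cap S_j|} = 0 * 2^0 = 0,
# so a family of pairwise disjoint sets satisfies the condition even for rho = 0
disjoint_family = [{0, 1}, {2, 3}, {4, 5}]
```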

Using a construction like the one in Lemma 8, we obtain

Lemma 18: For every ℓ, m ∈ N and ρ > 0, there exists a type 2 weak (ℓ,ρ)-design S_1,…,S_m ⊆ [d] with

    d = O(ℓ²/ln(ρℓ))   if ρ ≥ 6/ℓ,
    d = O(ℓ/ρ)         if ρ < 6/ℓ.

Moreover, such a family can be found in time poly(m, d).

The quantitative relation between pseudorandom generators and type 2 weak designs follows readily from Lemma 14:

Lemma 19: Suppose P: {0,1}^ℓ → {0,1} is a predicate such that no circuit of size s can compute P correctly on more than a fraction 1/2 + ε of the inputs, and suppose that S = (S_1,…,S_m), where S_i ⊆ [d], is a type 2 weak (ℓ,ρ)-design. Then no circuit of size s − ρm can distinguish NW_{S,P} from uniform with advantage greater than m·ε.

Combining this and Lemma 18 with ρ = 1 and s = 2m, we obtain

Theorem 20: Suppose P: {0,1}^ℓ → {0,1} is a predicate such that no circuit of size 2m can compute P correctly on more than a fraction 1/2 + ε/m of the inputs. Then there is a generator G_{P,m}: {0,1}^{O(ℓ²/log ℓ)} → {0,1}^m, computable in time poly(m, ℓ) making m oracle calls to P, such that no circuit of size m can distinguish the output of G_{P,m} from uniform with advantage greater than ε.

In other words, to obtain m bits which are pseudorandom against circuits of size m, we need only assume that there is a predicate which is hard against circuits of size O(m). In contrast, the results of [NW94] always need to assume that the predicate is hard against circuits of size m^{1+δ} for some constant δ > 0, or else their generator will require a seed length that is polynomial in m instead of ℓ. In fact, if we instead take ρ = 1/ℓ, we need only assume that the predicate is hard against circuits of size (1 + 1/ℓ)·m, and the generator will have a seed length of O(ℓ²).

Acknowledgments

I am grateful to Luca Trevisan for sharing his novel insights into these problems with me. I thank Oded Goldreich, Madhu Sudan, and David Zuckerman for several simplifications of the proofs in this paper and for helpful discussions. Further thanks to Oded Goldreich for valuable comments on the presentation.

References

[ACR97] Alexander E. Andreev, Andrea E. F. Clementi, and José D. P. Rolim. Worst-case hardness suffices for derandomization: A new method for hardness-randomness trade-offs. In Pierpaolo Degano, Roberto Gorrieri, and Alberto Marchetti-Spaccamela, editors, Automata, Languages and Programming, 24th International Colloquium, volume 1256 of Lecture Notes in Computer Science, pages 177–187, Bologna, Italy, 7–11 July 1997. Springer-Verlag.

[AK92] E. F. Assmus and J. D. Key. Designs and their Codes. Number 103 in Cambridge Tracts in Mathematics. Cambridge University Press, 1992.

[AKSS89] Miklós Ajtai, János Komlós, William Steiger, and Endre Szemerédi. Almost sorting in one round. In Silvio Micali, editor, Randomness and Computation, volume 5 of Advances in Computing Research, pages 117–125. JAI Press Inc., 1989.

[ASE92] Noga Alon, Joel H. Spencer, and Paul Erdős. The Probabilistic Method. Wiley-Interscience Series in Discrete Mathematics and Optimization. John Wiley and Sons, Inc., 1992.

[CG88] Benny Chor and Oded Goldreich. Unbiased bits from sources of weak randomness and probabilistic communication complexity. SIAM Journal on Computing, 17(2):230–261, April 1988.

[GM84] Shafi Goldwasser and Silvio Micali. Probabilistic encryption. Journal of Computer and System Sciences, 28(2):270–299, April 1984.

[Gol95] Oded Goldreich. Foundations of Cryptography (Fragments of a Book). Weizmann Institute of Science, 1995. Available, along with revised version 1/98, from http://theory.lcs.mit.edu/~oded.

[Gol98] Oded Goldreich. Modern Cryptography, Probabilistic Proofs and Pseudorandomness, June 1998. Available from http://theory.lcs.mit.edu/~oded/.

[GW97] Oded Goldreich and Avi Wigderson. Tiny families of functions with random properties: A quality-size trade-off for hashing. Random Structures & Algorithms, 11(4):315–343, 1997.

[GZ97] Oded Goldreich and David Zuckerman. Another proof that BPP ⊆ PH (and more). Electronic Colloquium on Computational Complexity Technical Report TR97-045, September 1997. http://www.eccc.uni-trier.de/eccc.

[ILL89] Russell Impagliazzo, Leonid A. Levin, and Michael Luby. Pseudo-random generation from one-way functions (extended abstract). In Proceedings of the Twenty-First Annual ACM Symposium on Theory of Computing, pages 12–24, Seattle, Washington, 15–17 May 1989.

[MR95] Rajeev Motwani and Prabhakar Raghavan. Randomized Algorithms. Cambridge University Press, 1995.

[Nis96] Noam Nisan. Extracting randomness: How and why: A survey. In Proceedings, Eleventh Annual IEEE Conference on Computational Complexity, pages 44–58, Philadelphia, Pennsylvania, 24–27 May 1996. IEEE Computer Society Press.

[NT98] Noam Nisan and Amnon Ta-Shma. Extracting randomness: A survey and new constructions. Journal of Computer and System Sciences, 1998. To appear in STOC `96 special issue. Preliminary versions in [Nis96] and [TS96].

[NW94] Noam Nisan and Avi Wigderson. Hardness vs randomness. Journal of Computer and System Sciences, 49(2):149–167, October 1994.

[NZ96] Noam Nisan and David Zuckerman. Randomness is linear in space. Journal of Computer and System Sciences, 52(1):43–52, February 1996.

[Pip87] Nicholas Pippenger. Sorting and selecting in rounds. SIAM Journal on Computing, 16(6):1032–1038, December 1987.

[RT97] Jaikumar Radhakrishnan and Amnon Ta-Shma. Tight bounds for depth-two superconcentrators. In 38th Annual Symposium on Foundations of Computer Science, pages 585–594, Miami Beach, Florida, 20–22 October 1997. IEEE.

[Sip88] Michael Sipser. Expanders, randomness, or time versus space. Journal of Computer and System Sciences, 36(3):379–383, June 1988.

[SSZ98] Michael Saks, Aravind Srinivasan, and Shiyu Zhou. Explicit OR-dispersers with polylogarithmic degree. Journal of the ACM, 45(1):123–154, January 1998.

[SV86] Miklos Santha and Umesh V. Vazirani. Generating quasi-random sequences from semi-random sources. Journal of Computer and System Sciences, 33(1):75–87, August 1986.

[SZ98] Aravind Srinivasan and David Zuckerman. Computing with very weak random sources. To appear in SIAM Journal on Computing, 1998. Preliminary version in FOCS `94.

[Tre98] Luca Trevisan. Simple and improved construction of extractors. Unpublished manuscript, July 1998.

[TS96] Amnon Ta-Shma. On extracting randomness from weak random sources (extended abstract). In Proceedings of the Twenty-Eighth Annual ACM Symposium on the Theory of Computing, pages 276–285, Philadelphia, Pennsylvania, 22–24 May 1996.

[TS98] Amnon Ta-Shma. Almost optimal dispersers. In Proceedings of the 30th Annual ACM Symposium on Theory of Computing, pages 196–202, Dallas, TX, May 1998. ACM.

[Vaz84] Umesh V. Vazirani. Randomness, Adversaries, and Computation. PhD thesis, University of California, Berkeley, 1984.

[Vaz87a] Umesh V. Vazirani. Efficiency considerations in using semi-random sources (extended abstract). In Proceedings of the Nineteenth Annual ACM Symposium on Theory of Computing, pages 160–168, New York City, 25–27 May 1987.

[Vaz87b] Umesh V. Vazirani. Strong communication complexity or generating quasirandom sequences from two communicating semirandom sources. Combinatorica, 7(4):375–392, 1987.

[VV85] Umesh V. Vazirani and Vijay V. Vazirani. Random polynomial time is equal to slightly-random polynomial time. In 26th Annual Symposium on Foundations of Computer Science, pages 417–428, Portland, Oregon, 21–23 October 1985. IEEE.

[WZ95] Avi Wigderson and David Zuckerman. Expanders that beat the eigenvalue bound: Explicit construction and applications. Technical Report CS-TR-95-21, University of Texas Department of Computer Sciences, 1995. To appear in Combinatorica.

[Yao82] Andrew C. Yao. Theory and applications of trapdoor functions (extended abstract). In 23rd Annual Symposium on Foundations of Computer Science, pages 80–91, Chicago, Illinois, 3–5 November 1982. IEEE.

[Zuc96] David Zuckerman. Simulating BPP using a general weak random source. Algorithmica, 16(4/5):367–391, October/November 1996.

[Zuc97] David Zuckerman. Randomness-optimal oblivious sampling. Random Structures & Algorithms, 11(4):345–367, 1997.

A Derandomizing the proof of Lemma 8

In the analysis of the probabilistic choice of S_i, we showed that

    E[∑_{j<i} 2^{|S_i ∩ S_j|}] ≤ ρ·(i − 1).

By averaging, this implies that there exists an α_1 ∈ B_1 such that

    E[∑_{j<i} 2^{|S_i ∩ S_j|} | a_1 = α_1] ≤ ρ·(i − 1).    (2)

So, assuming we can efficiently calculate the conditional expectation E[∑_{j<i} 2^{|S_i ∩ S_j|} | a_1 = α_1] for every α_1 ∈ B_1, we can find the α_1 that makes Inequality (2) hold. Then, fixing such an α_1, another averaging argument implies that there exists an α_2 ∈ B_2 such that

    E[∑_{j<i} 2^{|S_i ∩ S_j|} | a_1 = α_1, a_2 = α_2] ≤ ρ·(i − 1).    (3)

Again, assuming that we can compute the appropriate conditional expectations, we can find an α_2 that makes Inequality (3) hold. Proceeding like this, we obtain α_1,…,α_ℓ such that

    E[∑_{j<i} 2^{|S_i ∩ S_j|} | a_1 = α_1, a_2 = α_2, …, a_ℓ = α_ℓ] ≤ ρ·(i − 1).    (4)

But now there is no more randomness left in the experiment, and Inequality (4) simply says that ∑_{j<i} 2^{|S_i ∩ S_j|} ≤ ρ·(i − 1) for S_i = {α_1,…,α_ℓ}. To implement this algorithm for finding S_i, we need to be able to calculate the conditional expectation

    E[∑_{j<i} 2^{|S_i ∩ S_j|} | a_1 = α_1, …, a_t = α_t]

for any t and α_1,…,α_t. If we let T = {α_1,…,α_t}, then a calculation like the one in the proof of Lemma 8 for the unconditional expectation shows

    E[∑_{j<i} 2^{|S_i ∩ S_j|} | a_1 = α_1, …, a_t = α_t] = ∑_{j<i} 2^{|T ∩ S_j|}·(1 + 1/⌈ℓ/ln ρ⌉)^{ℓ−t},

which can be easily computed.
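The conditional expectations translate directly into a greedy algorithm. The sketch below is our own (function name and parameter choices are illustrative): it fixes S_i one block at a time, always choosing the element that minimizes the conditional expectation, which guarantees the final sum is at most ρ·(i − 1).

```python
import math

def greedy_next_set(prev_sets, l, rho):
    """Derandomized choice of S_i given S_1, ..., S_{i-1}: in each block pick
    the alpha minimizing the conditional expectation
    sum_j 2^{|T cap S_j|} * (1 + 1/b)^(number of still-free blocks)."""
    b = math.ceil(l / math.log(rho))        # block size ceil(l / ln rho)
    chosen = []
    for t in range(l):                      # choose the element of block [t*b, (t+1)*b)
        free = l - t - 1                    # blocks left after this choice
        def cond_exp(alpha):
            T = set(chosen) | {alpha}
            return sum(2 ** len(T & s) * (1 + 1 / b) ** free for s in prev_sets)
        chosen.append(min(range(t * b, (t + 1) * b), key=cond_exp))
    return frozenset(chosen)

# iterating the greedy choice builds a weak design deterministically
sets = []
for _ in range(16):
    sets.append(greedy_next_set(sets, l=8, rho=2.0))
```

Since the greedy minimum at each block never exceeds the average over that block, the conditional expectation never increases, and the final (now deterministic) sum inherits the bound ρ·(i − 1).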