
Introduction to Complexity Theory

Notes for a One-Semester Course

Oded Goldreich

Department of Computer Science and Applied Mathematics
Weizmann Institute of Science, Israel
Email: oded@wisdom.weizmann.ac.il

Spring (revised October)

Copyright (c) by Oded Goldreich.

Permission to make copies of part or all of this work for personal or classroom use is granted without fee, provided that copies are not made or distributed for profit or commercial advantage, and that new copies bear this notice and the full citation on the first page. Abstracting with credit is permitted.

Preface

Complexity Theory is a central field of Theoretical Computer Science, with a remarkable list of celebrated achievements as well as very vibrant present research activity. The field is concerned with the study of the intrinsic complexity of computational tasks, and this study tends to aim at generality: it focuses on natural computational resources and on the effect of limiting them on the class of problems that can be solved. Put in other words, Complexity Theory aims at understanding the nature of efficient computation.

Topics: In my opinion, an introductory course in complexity theory should aim at exposing the students to the basic results and research directions in the field. The focus should be on concepts and ideas, and complex technical proofs should be avoided. Specific topics may include:

Revisiting NP and NPC, with emphasis on search versus decision.

Complexity classes defined by one resource bound: hierarchies, gaps, etc.

Non-determinism, with emphasis on NL.

Randomized computations (e.g., ZPP, RP and BPP).

Non-uniform complexity (e.g., P/poly, and lower bounds on restricted circuit classes).

The Polynomial-time Hierarchy.

The counting class #P, approximate-#P and unique-SAT.

Probabilistic proof systems (i.e., IP, PCP and ZK).

Pseudorandomness: generators and derandomization.

Time versus Space (in Turing machines).

Circuit-depth versus TM-space (e.g., AC, NC, SC).

Communication complexity.

Average-case complexity.

Of course, it would be hard (if not impossible) to cover all the above topics, even briefly, in a single-semester course of two hours a week. Thus, a choice of topics has to be made, and the rest may be merely mentioned in a relevant lecture or in the concluding lecture. The choice may depend on other courses given in the institute; in fact, my own choice was strongly affected by this aspect.

Prerequisites: It is assumed that students have taken a course in computability, and hence are familiar with Turing machines.

Model of computation: Most of the presented material is quite independent of the specific (reasonable) model of computation, but some material does depend heavily on the locality of computation of Turing machines.

The partition of material to lectures: The partition of the material into lectures reflects only the logical organization of the material, and does not reflect the amount of time to be spent on each topic. Indeed, some lectures are much longer than others.

State of these notes: These notes provide an outline of an introductory course on complexity theory, including discussions and sketches of the various notions, definitions and proofs. The latter are presented at varying levels of detail, where the level of detail does not reflect anything except the amount of time spent in writing. Furthermore, the notes are neither complete nor fully proofread.

Related text: The current single-semester introductory course on complexity theory is a proper subset of a two-semester course that I gave at the Weizmann Institute of Science. Lecture notes for that course are available from the webpage http://www.wisdom.weizmann.ac.il/~oded/cc.html

Contents

Preface

I  Things that should have been taught in previous courses

P versus NP
  The search version
  The decision version
  Conclusions
Reductions and Self-reducibility
  The general notion of a reduction
  Self-reducibility of search problems
NP-completeness
  Definitions
  The existence of NP-complete problems
  CSAT and SAT
  NP sets that are neither in P nor NP-complete
  NP, coNP and NP-completeness
  Optimal search for NP-relations
Historical Notes for the first series

II  The most traditional material

Complexity classes defined by a sharp threshold
  Definitions
  Hierarchies and Gaps
Space Complexity
  Deterministic space complexity
  Non-deterministic space complexity
    Two models of non-determinism
    Some basic facts about NSPACE
    Composition Lemmas
    NSPACE is closed under complementation
The Polynomial-Time Hierarchy
  Defining PH via quantifiers
  Defining PH via oracles
  Equivalence of the two definitions of PH
  Collapses
  Comment: a PSPACE-complete problem
Randomized Complexity Classes
  Two-sided error: BPP
  One-sided error: RP and coRP
  No error: ZPP
  Randomized space complexity
Non-Uniform Complexity
  Circuits and advice
  The power of non-uniformity
  Uniformity
  Evidence that P/poly does not contain NP
  Reductions to sparse sets
Counting Classes
  The definition of #P
  #P-complete problems
  A randomized reduction of Approximate-#P to NP
  A randomized reduction of SAT to Unique-SAT
  Promise problems
Space is more valuable than time
Circuit Depth and Space Complexity
Historical Notes for the second series

III  The less traditional material

Probabilistic Proof Systems
  Introduction
  Interactive Proof Systems
    The Definition
    An Example: interactive proof of Graph Non-Isomorphism
    Interactive proof of Non-Satisfiability
    The Power of Interactive Proofs
    Advanced Topics
  Zero-Knowledge Proofs
    Perfect Zero-Knowledge
    General (or Computational) Zero-Knowledge
    Concluding Remarks
  Probabilistically Checkable Proof (PCP) Systems
    The Definition
    The power of probabilistically checkable proofs
    PCP and Approximation
  The actual notes that were used
    Interactive Proofs (IP)
    Probabilistically Checkable Proofs (PCP)
Pseudorandomness
  Introduction
    The General Paradigm
    The Archetypical Case
  The actual definition
  How to Construct Pseudorandom Generators
  Pseudorandom Functions
  The Applicability of Pseudorandom Generators
  The Intellectual Contents of Pseudorandom Generators
  Derandomization of BPP
  On weaker notions of computational indistinguishability
  The actual notes that were used
Average-Case Complexity
  Introduction
  Definitions and Notations
    Distributional-NP
    Average Polynomial-Time
    Reducibility between Distributional Problems
  A Generic DistNP Complete Problem
  DistNP-completeness of BH
  Conclusions
  Appendix: Failure of a naive formulation
Circuit Lower Bounds
  Constant-depth circuits
  Monotone circuits
Communication Complexity
  Deterministic Communication Complexity
  Randomized Communication Complexity
Historical Notes for the third series

Bibliography

Lecture Series I

Things that should have been taught in previous courses

The first three lectures focus on material that should have been taught in the basic course on computability. Unfortunately, in many cases this material is covered, but from a wrong perspective or without any proper perspective. Thus, although in a technical sense most of the material (e.g., the class NP and the notion of NP-completeness) may be known to the students, its conceptual meaning may not have been appreciated, and our aim is to try to correct this damage.

In addition, we cover some topics that may be new to most students. These topics include the self-reducibility of search problems, the existence of NP-sets that are neither in P nor NP-complete, the effect of having coNP-sets that are NP-complete, and the existence of optimal search algorithms for NP-relations.

Lecture 1

P versus NP

We assume that all students have heard of P and NP, but we suspect that many have not obtained a good explanation of what the P vs NP question actually represents. This unfortunate situation is due to using the standard technical definition of NP (which refers to non-deterministic polynomial time) rather than more cumbersome definitions that clearly capture the fundamental nature of NP. Below, we take the alternative approach. In fact, we present two fundamental formulations of the P vs NP question: one in terms of search problems and the other in terms of decision problems.

Efficient computation: Discuss the association of efficiency with polynomial time. Polynomials are merely a 'closed' set of moderately growing functions, where closure means closure under addition, multiplication and composition.

The search version

We focus on polynomially-bounded relations. A relation R ⊆ {0,1}* × {0,1}* is polynomially bounded if there exists a polynomial p such that for every (x, y) ∈ R it holds that |y| ≤ p(|x|). For such a relation it makes sense to ask whether, given an instance x, one can efficiently find a solution y such that (x, y) ∈ R. The polynomially-bounded condition guarantees that intrinsic intractability may not be due to the length (or mere typing) of the required solution.

P as a natural class of search problems: With each polynomially-bounded relation R, we associate the following search problem: given x, find y such that (x, y) ∈ R, or state that no such y exists. The class P corresponds to the class of such search problems that are solvable in polynomial time; i.e., there exists a polynomial-time algorithm that, given x, finds y such that (x, y) ∈ R, or states that no such y exists.

NP as another natural class of search problems: A polynomially-bounded relation R is called an NP-relation if, given an alleged instance-solution pair, one can efficiently verify whether the pair is valid; that is, there exists a polynomial-time algorithm that, given x and y, determines whether or not (x, y) ∈ R. It is reasonable to focus on search problems for NP-relations, because the ability to recognize a valid solution seems a natural prerequisite for a discussion regarding finding such solutions. (Indeed, formally speaking, one can introduce non-NP-relations for which the search problem is solvable in polynomial time, but still the restriction to NP-relations is very natural.)
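
To make the notion of an NP-relation concrete, the following Python sketch shows a polynomial-time verifier for the relation of (formula, assignment) pairs underlying SAT; the CNF encoding used here (a list of clauses over signed integer literals) is our own illustrative choice, not part of the notes.

    def verify_sat_pair(formula, assignment):
        # Decide whether (formula, assignment) is a valid instance-solution pair,
        # in time polynomial in the length of the pair. A clause is a list of
        # non-zero integers: literal i > 0 is variable i, and i < 0 its negation.
        def literal_value(lit):
            value = assignment.get(abs(lit), False)
            return value if lit > 0 else not value
        return all(any(literal_value(lit) for lit in clause) for clause in formula)

    # Example: (x1 OR NOT x2) AND (x2 OR x3) with x1 = x3 = True, x2 = False.
    print(verify_sat_pair([[1, -2], [2, 3]], {1: True, 2: False, 3: True}))  # True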

The P versus NP question in terms of search problems: Is it the case that the search problem of every NP-relation can be solved in polynomial time? In other words, if it is easy to test whether a solution for an instance is correct, then is it also easy to find solutions to given instances? If P = NP, then this would mean that whenever solutions to given instances can be efficiently verified for correctness, they can also be efficiently found (when given only the instance). This would mean that all reasonable search problems (i.e., all NP-relations) are easy to solve. On the other hand, if P ≠ NP, then there exist reasonable search problems (i.e., some NP-relations) that are hard to solve. In such a case, the world is more interesting: some reasonable problems are easy to solve, whereas others are hard to solve.

The decision version

For an NP-relation R, we denote the set of instances having a solution by L_R; that is, L_R = {x : ∃y (x, y) ∈ R}. Such a set is called an NP-set. Intuitively, an NP-set is a set of valid statements (i.e., statements of membership of a given x in L_R) that can be efficiently verified when given adequate proofs (i.e., a corresponding NP-witness y such that (x, y) ∈ R).

NP-proof systems: Proof systems are defined in terms of their verification procedures. Here we focus on the natural class of efficient verification procedures, where efficiency is represented by polynomial-time computations. (We should either require that the time is polynomial in terms of the statement, or confine ourselves to 'short proofs', that is, proofs of length bounded by a polynomial in the length of the statement.) An NP-relation R yields a natural verification procedure, which amounts to checking whether the alleged statement-proof pair is in R. This proof system satisfies the natural completeness and soundness conditions: every true statement (i.e., x ∈ L_R) has a valid proof (i.e., an NP-witness y such that (x, y) ∈ R), whereas false statements (i.e., x ∉ L_R) have no valid proofs (i.e., (x, y) ∉ R for all y's).

The P versus NP question in terms of decision problems: Is it the case that NP-proofs are useless? That is, is it the case that for every efficiently verifiable proof system one can easily determine the validity of assertions without being given suitable proofs? If that were the case, then proofs would be meaningless, because they would have no fundamental advantage over directly determining the validity of the assertion. Recall that P is the class of sets that can be decided efficiently (i.e., by a polynomial-time algorithm). Then the conjecture P ≠ NP asserts that proofs are useful: there exist NP-sets that cannot be decided by a polynomial-time algorithm, and for these sets obtaining a proof of membership (for some instances) is useful, because we cannot determine membership by ourselves.

Conclusions

Verify that P = NP in terms of search problems if and only if P = NP in terms of decision problems. Thus, it suffices to focus on the latter, simpler, formulation.

Note that NP is typically defined as the class of sets that can be decided by a fictitious device called a non-deterministic polynomial-time machine. The reason that this class of fictitious devices is important is that it captures (indirectly) the definition of NP-proofs. Verify that indeed the standard definition of NP (in terms of non-deterministic polynomial-time machines) equals our definition of NP (in terms of the class of sets having NP-proofs).

Lecture 2

Reductions and Self-reducibility

We assume that all students have heard of reductions, but again we fear that most have obtained a conceptually poor view of their nature. We first present the general notion of (polynomial-time) reduction among computational problems, and view the notion of a Karp-reduction as an important special case that suffices (and is more convenient) in some cases.

The general notion of a reduction

Reductions are procedures that use 'functionally specified' subroutines. That is, the functionality of the subroutine is specified, but its operation remains unspecified and its running time is counted at unit cost. Analogously to algorithms, which are modeled by Turing machines, reductions can be modeled as oracle (Turing) machines. A reduction solves one computational problem (which may be either a search or a decision problem) by using oracle (or subroutine) calls to another computational problem (which again may be either a search or a decision problem). We focus on efficient (i.e., polynomial-time) reductions, which are often called Cook reductions.

The standard case is that of reducing decision problems to decision problems, but we will also consider reducing search problems to search problems and reducing search problems to decision problems. A Karp-reduction is a special case of a reduction (from a decision problem to a decision problem). Specifically, for decision problems L and L', we say that L is Karp-reducible to L' if there is a reduction of L to L' that operates as follows: On input x (an instance for L), the reduction computes x', makes query x' to the oracle L' (i.e., invokes the subroutine for L' on input x'), and answers whatever the latter returns.

Indeed, a Karp-reduction is a syntactically restricted notion of a reduction. This restricted case suffices for many purposes (e.g., most importantly, for the theory of NP-completeness), but not in case we want to reduce a search problem to a decision problem. Furthermore, whereas each decision problem is reducible to its complement, some decision problems are not Karp-reducible to their complement (e.g., the trivial decision problem). Likewise, each decision problem in P is trivially reducible to any computational problem (i.e., by a reduction that does not use the subroutine at all), whereas such a trivial reduction is disallowed by the syntax of Karp-reductions.

We comment that Karp-reductions may (and should) be augmented in order to handle reductions of search problems to search problems. Such an augmented Karp-reduction of the search problem of R to the search problem of R' operates as follows: On input x (an instance for R), the reduction computes x', makes query x' to the oracle R' (i.e., invokes the subroutine for the search problem of R' on input x'), obtaining y' such that (x', y') ∈ R', and uses y' to compute a solution y to x (i.e., (x, y) ∈ R). Indeed, unlike in the case of decision problems, the reduction cannot just return y' as an answer to x.

Self-reducibility of search problems

The search problem for R is called self-reducible if it can be reduced to the decision problem of L_R = {x : ∃y (x, y) ∈ R}. Note that the decision problem of L_R is always reducible to the search problem for R (e.g., invoke the search subroutine and answer YES if and only if it returns some string, rather than the 'no solution' symbol).

We will see that all NP-relations that correspond to NP-complete sets are self-reducible, mostly via natural reductions. We start with SAT, the set of satisfiable Boolean formulae. Let R_SAT be the set of pairs (φ, τ) such that τ is a satisfying assignment to the formula φ. Note that R_SAT is an NP-relation (i.e., it is polynomially-bounded and easy to decide, by evaluating a Boolean expression).

Proposition: R_SAT is self-reducible; that is, the search problem of R_SAT is reducible to SAT.

Proof: Given a formula φ, we use a subroutine for SAT in order to find a satisfying assignment to φ (in case such an assignment exists). First, we query SAT on φ itself, and return 'no solution' if the answer we get is false. Otherwise, we let τ, initialized to the empty string, denote a prefix of a satisfying assignment of φ. We proceed in iterations, where in each iteration we extend τ by one bit. This is done as follows: First we derive a formula, denoted φ', by setting the first |τ|+1 variables of φ according to the values τ0. Next we query SAT on φ' (which means that we ask whether or not τ0 is a prefix of a satisfying assignment of φ). If the answer is positive then we set τ ← τ0, else we set τ ← τ1 (because if τ is a prefix of a satisfying assignment of φ and τ0 is not a prefix of a satisfying assignment of φ, then τ1 must be a prefix of a satisfying assignment of φ).

A key point is that the formulae φ' can be simplified to contain no constants, so that they fit the canonical definition of SAT. That is, after replacing some variables by constants, we should simplify clauses according to the straightforward Boolean rules (e.g., a false literal can be omitted from a clause, and a true literal appearing in a clause yields omitting the entire clause).
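
For concreteness, here is a minimal Python sketch of the above iterative process; sat_oracle (the decision subroutine for SAT) and substitute (which sets the first bits of the formula according to a given prefix and simplifies away the resulting constants) are assumed helper interfaces, not part of the notes.

    def find_satisfying_assignment(formula, num_vars, sat_oracle, substitute):
        # Search-to-decision reduction: extend a prefix of a satisfying
        # assignment by one bit per iteration, using only the decision oracle.
        if not sat_oracle(formula):
            return None  # "no solution"
        prefix = []
        for _ in range(num_vars):
            if sat_oracle(substitute(formula, prefix + [False])):
                prefix.append(False)  # prefix extended by 0 still extends to a satisfying assignment
            else:
                prefix.append(True)   # otherwise the extension by 1 must work
        return prefix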

A similar reduction can be presented also for other NP-complete problems. Consider, for example, Colorability. Note that in this case the process of getting rid of constants (representing partial solutions) is more involved. Details are left as an exercise. In general, if you don't see a natural self-reducibility process for some NP-complete relation, you should still know that a self-reduction process (alas, maybe not a natural one) does exist:

Theorem: Every NP-relation of an NP-complete set is self-reducible.

Proof: Let R be an NP-relation of the NP-complete set L_R. Then we combine the following sequence of reductions:

1. The search problem of R is reducible to the search problem of R_SAT (by the NP-completeness of the latter).

2. The search problem of R_SAT is reducible to the decision problem SAT (by the Proposition above).

3. The decision problem SAT is reducible to the decision problem L_R (by the NP-completeness of the latter).

The theorem follows.

Lecture 3

NP-completeness

This is the third and last lecture devoted to material that the students have heard before. Again, most students did see an exposition of the technical material in some undergraduate class, but they might have missed important conceptual points. Specifically, we stress that the mere existence of NP-complete sets (regardless of whether this is SAT or some other set) is amazing.

Definitions

The standard definition is that a set is NP-complete if it is in NP and every set in NP is reducible to it via a Karp-reduction. Indeed, there is no reason to insist on Karp-reductions (rather than allow arbitrary reductions), except that the restricted notion suffices for all positive results and is easier to work with.

We say that a polynomially-bounded relation is NP-complete if it is an NP-relation and every NP-relation is reducible to it.

The mere fact that we have defined something (i.e., NP-completeness) does not mean that this thing exists. It is indeed remarkable that NP-complete problems do exist.

The existence of NP-complete problems

Theorem: There exist NP-complete relations and sets.

Proof: The proof (as well as all NP-completeness proofs) is based on the observation that some NP-relations are rich enough to encode all NP-relations. This is most obvious for the 'universal' NP-relation, denoted R_U (and defined below), which is used to derive the simplest proof of the current theorem.

The relation R_U consists of pairs (⟨M, x, 1^t⟩, y) such that M is a description of a (deterministic) Turing machine that accepts the pair (x, y) within t steps, where |y| ≤ t. (Instead of requiring that |y| ≤ t, we may require that M is canonical in the sense that it reads its entire input before halting.) It is easy to see that R_U is an NP-relation, and indeed L_U := {z : ∃y (z, y) ∈ R_U} is an NP-set.

We now turn to showing that any NP-relation is reducible to R_U. As a warm-up, let us first show that any NP-set is Karp-reducible to L_U. Let R be an NP-relation, and let L_R = {x : ∃y (x, y) ∈ R} be the corresponding NP-set. Let p_R be a polynomial bounding the length of solutions in R (i.e., |y| ≤ p_R(|x|) for every (x, y) ∈ R), let M_R be a polynomial-time machine deciding membership (of alleged (x, y) pairs) in R, and let t_R be a polynomial bounding its running time. Then the Karp-reduction maps an instance x (for L_R) to the instance ⟨M_R, x, 1^{t_R(|x|+p_R(|x|))}⟩.

Note that this mapping can be computed in polynomial time, and that x ∈ L_R if and only if ⟨M_R, x, 1^{t_R(|x|+p_R(|x|))}⟩ ∈ L_U.

To reduce the search problem of R to the search problem of R_U, we use essentially the same reduction: On input an instance x (for R), we make the query ⟨M_R, x, 1^{t_R(|x|+p_R(|x|))}⟩ to the search problem of R_U, and return whatever the latter returns. Note that if x ∉ L_R then the answer will be 'no solution', whereas for every x and y it holds that (x, y) ∈ R if and only if (⟨M_R, x, 1^{t_R(|x|+p_R(|x|))}⟩, y) ∈ R_U.
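
As an illustration of how simple the reduction is, the sketch below spells out the mapping x ↦ ⟨M_R, x, 1^{t_R(|x|+p_R(|x|))}⟩ in Python; the parameters M_R_description, p_R and t_R are placeholders for the verifier's description and the bounding polynomials, chosen here only for illustration.

    def reduce_to_universal(x, M_R_description, p_R, t_R):
        # Map an instance x of L_R to an instance <M_R, x, 1^t> of L_U,
        # where t = t_R(|x| + p_R(|x|)); the unary part makes t steps "affordable".
        t = t_R(len(x) + p_R(len(x)))
        return (M_R_description, x, "1" * t)

    # Usage sketch, with illustrative bounds p_R(n) = n and t_R(n) = n**2:
    # reduce_to_universal("10110", "<encoding of M_R>", lambda n: n, lambda n: n ** 2)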

CSAT and SAT

Define Boolean circuits (directed acyclic graphs with vertices labeled by Boolean operations). Prove the NP-completeness of the circuit satisfaction problem (CSAT). The proof boils down to encoding possible computations of a Turing machine by a corresponding 'layered' circuit, where each layer represents a configuration of the machine, and the conditions on consecutive configurations are captured by uniform local gadgets in the circuit.

Define Boolean formulae (i.e., circuits with a tree structure). Prove the NP-completeness of the formula satisfaction problem (SAT), even when the formula is given in a nice form (i.e., CNF). The proof is by reduction from CSAT, which in turn boils down to introducing auxiliary variables in order to 'cut' the computation of a deep circuit into a conjunction of related computations of shallow circuits, which may be presented as CNFs.

NP sets that are neither in P nor NP-complete

Many (to say the least) other NP-sets have been shown to be NP-complete. Things have reached a situation in which people seem to expect any NP-set to be either NP-complete or in P. This naive view is wrong:

Theorem: Assuming NP ≠ P, there exist NP-sets that are neither NP-complete nor in P.

The proof is by modifying a set in NP \ P so as to fail all possible reductions (to this set) and all possible polynomial-time decision procedures (for this set). Specifically, we start with some L' ∈ NP \ P and derive L, which is also in NP \ P, by making each reduction (say, of L' to L) fail by dropping finitely many elements from L' until the reduction fails, whereas all possible polynomial-time procedures fail to decide L, which differs from L' only on a finite number of inputs. We use the fact that any reduction of some set in NP \ P to a finite set (i.e., a finite subset of L') must fail while making only a finite number of possible queries, whereas any efficient decision procedure for L' (or for L' modified on finitely many inputs) must fail on some finite portion of all possible inputs of L'. The process of modifying L' into L proceeds in iterations, alternately failing a reduction (by dropping sufficiently many strings from the rest of L') and failing a decision procedure (by including sufficiently many strings from the rest of L'). This can be done efficiently because it is inessential to determine the first location where we have enough strings, as long as we determine some location where we have enough.

We mention that some natural problems (e.g., factoring) are conjectured to be neither solvable in polynomial time nor NP-hard. (See the discussion following the Theorem in the next section.)

NP, coNP and NP-completeness

By prepending the name of a class of decision problems with the prefix 'co', we mean the class of complement sets; that is,

co C := { {0,1}* \ L : L ∈ C }.

Specifically, coNP = { {0,1}* \ L : L ∈ NP } is the class of sets that are complements of NP-sets. That is, if R is an NP-relation and L_R = {x : ∃y (x, y) ∈ R} is the associated NP-set, then {0,1}* \ L_R = {x : ∀y (x, y) ∉ R} is the corresponding coNP-set.

It is widely believed that NP is not closed under complementation (i.e., NP ≠ coNP). Indeed, this conjecture implies P ≠ NP (because P is closed under complementation), and is implied by the conjecture that NP ∩ coNP is a proper superset of P. The conjecture NP ≠ coNP means that some coNP-sets (e.g., the complements of NP-complete sets) do not have NP-proof systems; that is, there is no NP-proof system for proving that a given formula is not satisfiable.

If indeed NP ≠ coNP, then some (non-trivial) NP-sets cannot be Karp-reducible to coNP-sets (exercise: why?; recall that the empty set cannot be Karp-reducible to {0,1}*). In contrast, all NP-sets are reducible to coNP-sets by a straightforward general reduction that just flips the answer. A less obvious fact is that NP ≠ coNP implies that some NP-sets cannot be reduced to sets in NP ∩ coNP, even under general reductions. Specifically:

Theorem: If NP ∩ coNP contains an NP-hard set then NP = coNP.

Recall that a set is NP-hard if every NP-set is reducible to it (possibly via a general reduction). Since NP ∩ coNP is conjectured to be a proper superset of P, it follows (using the conjecture NP ≠ coNP) that there are NP-sets that are neither in P nor NP-hard. Notable examples are sets related to the integer factorization problem (e.g., the set of pairs (N, s) such that s has a square root modulo N that is a quadratic residue modulo N and whose least significant bit equals 1).

Proof: Suppose that L' ∈ NP ∩ coNP is NP-hard. Given any L ∈ coNP, we will show that L ∈ NP. We will merely use the fact that L reduces to L', which is in NP ∩ coNP. (Such a reduction exists because L is reducible, via a general reduction that flips the answer, to its complement {0,1}* \ L, which is in NP and thus is reducible to L', which is NP-hard.)

To show that L ∈ NP, we will present an NP-relation R that characterizes L (i.e., L = {x : ∃y (x, y) ∈ R}). The relation R consists of pairs of the form (x, ((z_1, σ_1, w_1), ..., (z_t, σ_t, w_t))), where on input x the reduction of L to L' accepts after making the queries z_1, ..., z_t and obtaining the corresponding answers σ_1, ..., σ_t, and for every i it holds that if σ_i = 1 then w_i is an NP-witness for z_i ∈ L', whereas if σ_i = 0 then w_i is an NP-witness for z_i ∈ {0,1}* \ L'.

We stress that we use the fact that both L' and {0,1}* \ L' are NP-sets, and we refer to the corresponding NP-witnesses. Note that R is indeed an NP-relation: the length of solutions is bounded by the running time of the reduction (and the length of the corresponding NP-witnesses), and membership in R is decided by checking that the sequence of (z_i, σ_i)'s matches a possible query-answer sequence in an execution of the reduction (regardless of the correctness of the answers), and that all answers (i.e., the σ_i's) are correct. The latter condition is easily verified by use of the corresponding NP-witnesses. (That is, we need to verify that, on input x and after obtaining the answers σ_1, ..., σ_{i-1} to the first i-1 queries, the i-th query made by the reduction equals z_i.)

Optimal search algorithms for NP-relations

(Actually, this section does not relate to NP-completeness, but rather to NP-relations.)

The title sounds very promising, but our guess is that the students will be less excited once they see the proof. We claim the existence of an optimal search algorithm for any NP-relation. Furthermore, we will explicitly present such an algorithm, and prove that it is optimal in a very strong sense: for any algorithm correctly solving the same search problem, it holds that, up to some fixed additive polynomial term (which may be disregarded in case the NP-problem is not solvable in polynomial time), our algorithm is at most a constant factor slower than the other algorithm. That is:

Theorem: For every NP-relation R there exists an algorithm A that satisfies the following:

1. A correctly solves the search problem of R.

2. There exists a polynomial p such that for every algorithm A' that correctly solves the search problem of R, and for every x ∈ L_R, it holds that t_A(x) = O(t_{A'}(x) + p(|x|)), where t_A(x) (resp., t_{A'}(x)) denotes the number of steps taken by A (resp., A') on input x.

Interestingly, we establish the optimality of A without knowing what its (optimal) running time is. We stress that the hidden constant in the O-notation depends only on A', but in the following proof the dependence is exponential in the length of the description of algorithm A', and it is not known whether a better dependence can be achieved.

Proof sketch: Fixing R, we let M be a polynomial-time algorithm that decides membership in R, and let p be a polynomial bounding the running time of M. We present the following algorithm A that merely runs all possible search algorithms 'in parallel' and checks the result provided by each of them (using M), halting whenever it obtains a correct solution.

Since there are infinitely many possible algorithms, we should clarify what we mean by running them all in parallel. What we mean is to run them at different 'rates', such that the infinite sum of rates converges to 1 (or to any other constant). Viewed in different terms, for any unbounded function α : N → N, we proceed in iterations such that in the i-th iteration we let each of the first α(i) algorithms run for at most 2^i steps. In case one of these algorithms halts with some output y, algorithm A invokes M on input (x, y) and outputs y if and only if M(x, y) = 1. The verification of a solution provided by an algorithm is also emulated at the expense of its step count. Put in other words, we augment each algorithm with a canonical procedure (i.e., M) that checks the validity of the solution offered by the algorithm.

In case we want to guarantee that A also stops on x ∉ L_R, we may let it run an exhaustive search for a solution, in parallel to all searches, and halt with output ⊥ in case this exhaustive search fails.

Clearly, whenever A(x) outputs y (i.e., y ≠ ⊥), it must hold that (x, y) ∈ R. Now suppose A' is an algorithm that solves R. Fixing A', for every x, let us denote by t_{A'}(x) the number of steps taken by A' on input x, where t_{A'}(x) also accounts for the running time of M(x, y). Then the t_{A'}(x)-step execution of A' on input x is covered by the i(x)-th iteration of A, provided that i(x) ≥ max(|A'|, log_2 t_{A'}(x)), where |A'| denotes the length of the description of A'. Thus, the running time of A on input x, denoted t(x), is at most Σ_{j ≤ i(x)} α(j)·2^j, and for sufficiently large x it holds that t_{A'}(x) ≥ 2^{|A'|}. Using, say, α(j) = j, it follows that t(x) = O(t_{A'}(x)·log t_{A'}(x)), which 'almost' establishes the theorem (while we don't care about establishing it as stated).
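
A rough Python rendering of the algorithm A follows; it is only a sketch of the idea, not the notes' exact construction. The bounded emulator run_for_steps, the enumerator enumerate_programs, and the verifier verify_M are assumed interfaces, and the safeguard for inputs with no solution (the exhaustive search mentioned above) is omitted.

    def optimal_search(x, verify_M, enumerate_programs, run_for_steps, alpha=lambda i: i):
        # In iteration i, each of the first alpha(i) candidate algorithms is
        # emulated for at most 2**i steps; any output produced is checked with M.
        i = 0
        while True:
            i += 1
            for program in enumerate_programs(alpha(i)):
                y = run_for_steps(program, x, 2 ** i)  # None if it did not halt in time
                if y is not None and verify_M(x, y):
                    return y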

Historical Notes

Many sources provide historical accounts of the developments that led to the formulation of the P vs NP Problem and the development of the theory of NP-completeness. We thus refrain from attempting to provide such an account.

One technical point that we mention is that the three 'founding papers' of the theory of NP-completeness use the three different notions of reduction used above. Specifically, Cook uses the general notion of polynomial-time reduction, often referred to as Cook-reductions. The notion of Karp-reductions originates from Karp's paper, whereas its augmentation to search problems originates from Levin's paper. It is worth noting that, unlike Cook's and Karp's works, which treat decision problems, Levin's work is stated in terms of search problems.

The existence of NP-sets that are neither in P nor NP-complete (i.e., the Theorem above) was proven by Ladner, and the existence of optimal search algorithms for NP-relations (i.e., the Theorem above) was proven by Levin.

Lecture Series II

The most traditional material

The partition of the rest of the lectures into two lecture series is only due to historical reasons. We start with the more traditional material, most of which is due to the 1970's and the early 1980's.

Notation: We will try to always use n to denote the length of the main input.

Lecture 4

Complexity classes defined by a sharp threshold

There is something appealing in defining complexity classes according to a sharp threshold, like the class of problems that can be solved within time t for some function t (e.g., t(n) = n^2). Contrast this definition with the class of problems that can be solved within some time t that belongs to a class of functions T (e.g., polynomials). The problem with classes defined according to a single sharp threshold is that they are very sensitive to the specific model of computation and may not be closed under natural algorithmic operations. Typically, these problems do not occur when defining complexity classes that correspond to a resource bounded by a class of functions, provided this class has some desirable closure properties.

Definitions

Focusing on two natural complexity measures (i.e., time and space), we may define, for each function f : N → N, classes such as Dtime(f) and Dspace(f), corresponding to the class of decision problems that are solvable within time and space complexity f, respectively. That is, on input x, the deciding algorithm runs for at most f(|x|) steps or uses at most f(|x|) bits of storage.

We stress that when measuring the space complexity of the algorithm, we don't allow it to use its input and/or output device (i.e., tape, in case of Turing machines) as temporary storage. This is done by postulating that the input and output devices are read-only and write-only, respectively.

Note that classes as above are very sensitive to the specific model of computation. For example, the time complexity with respect to multiple-tape Turing machines may be quadratically smaller than the time complexity with respect to single-tape Turing machines, but not smaller by more than that (e.g., consider the set {xx : x ∈ {0,1}*}).

Hierarchies and Gaps

A natural property that we may expect from complexity measures is that more resources allow for more computations. That is, if g is sufficiently greater than f, then the class of problems solvable in time (or space) g should be strictly larger than the class of problems solvable in time (or space) f. This property (corresponding to a time or space hierarchy) does hold in the natural cases, where the key question is what is 'sufficiently greater'. The answer will be clarified by the way such hierarchy theorems are proved, which is by using diagonalization.

Specifically, suppose we want to prove that Dtime(g) is a strict superset of Dtime(f). This is done by diagonalizing against all f-time machines. That is, we construct a set L, along with a decision algorithm for it, such that no f-time machine can correctly decide L. In order to effectively define L, we should be able to emulate the execution of each f-time machine. Since we cannot effectively enumerate all f-time machines, what we do instead is emulate each possible machine while using a 'time-up' mechanism that stops the emulation at time f. In order to do this, we need, in particular, to be able to compute f relatively fast. Note that the running time of our decision procedure for L is determined by the time it takes to compute f and the time it takes to emulate a given number of steps. Thus, time-constructible functions play a central role in such proofs, where f is time-constructible if on input n the value f(n) can be computed within time f(n). As for the emulation overhead, it depends on the specific model of computation; typically, t steps can be emulated within time t·log t. Similar considerations apply to space hierarchies, but here one talks about space-constructible functions, and the emulation overhead is typically only linear. For simplicity, in the case of multiple-tape Turing machines, we get:

Theorem (sketch): For any time-constructible function t, the class Dtime(t) is strictly contained in Dtime(t·log t). For any space-constructible function s, the class Dspace(s) is strictly contained in Dspace(s·log s).

The existence of functions that are not time- (or space-) constructible is the reason for the so-called gap theorems. Typically, such theorems say that there exist functions f (which are certainly not time-constructible) such that Dtime(f) = Dtime(2^f) (or even Dtime(f) = Dtime(2^{2^f})). The reason for this phenomenon is that, for such a function f, there are no machines that run for more than time f but less than time 2^f (or 2^{2^f}, respectively).

Lecture 5

Space Complexity

Space complexity is aimed at measuring the amount of temporary storage required for a computational task. On the one hand, we don't want to count the input and output themselves within the space of computation, but on the other hand we have to make sure that the input and output devices cannot be abused to provide work space that is uncounted for. This leads to the convention of postulating that the input device (e.g., a designated input-tape of a multi-tape Turing machine) is read-only, whereas the output device (e.g., a designated output-tape of such a machine) is write-only. Space complexity accounts for the amount of space used on the other storage devices (e.g., the work-tapes of a multi-tape Turing machine) throughout the computation.

Deterministic space complexity

Only regular languages can be decided in constant space. This follows by combining two facts: First, constant-space Turing machines are equivalent to a generalization of finite automata that can scan (parts of) the input back and forth, in both directions and several times. Second, the latter ('sweeping') automata can be simulated by ordinary finite automata, which scan the input only once, from left to right.

At first glance, one may think that sub-logarithmic deterministic space is not more useful than constant space, because it seems impossible to allocate a sub-logarithmic amount of space (since measuring the input length requires logarithmic space). However, this intuition is wrong, because the input itself (in case it is of the proper form) can be used to determine its length, whereas in case the input is not of the proper form this fact may be detectable within sub-logarithmic space. In fact:

Theorem: Dspace(o(log n)) is a proper superset of Dspace(O(1)).

One proof consists of presenting a double-logarithmic space algorithm for recognizing the non-regular set L = {x_k : k ∈ N} ⊂ {0, 1, *}*, where x_k equals the concatenation of all k-bit long strings, in lexicographic order, separated by *'s (i.e., x_k = 0^k * 0^{k-1}1 * 0^{k-2}10 * ··· * 1^k).

Note that |x_k| > 2^k, and we claim that x_k can be recognized in space O(log k) = O(log log |x_k|). Furthermore, the membership of any x in L can be determined in space O(log log |x|), by iteratively checking, in space O(log i), whether x = x_i, for i = 1, 2, .... Details are left as an exercise; an illustrative rendering of the set's structure is sketched below.
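
The following Python sketch only illustrates the structure of the set L (it regenerates x_1, x_2, ... and compares); it makes no attempt to respect the O(log log n) space bound discussed above.

    def x_k(k):
        # The concatenation of all k-bit strings in lexicographic order, separated by '*'.
        return "*".join(format(i, "0{}b".format(k)) for i in range(2 ** k))

    def in_L(x):
        i = 1
        while len(x_k(i)) <= len(x):
            if x == x_k(i):
                return True
            i += 1
        return False

    print(x_k(2))        # "00*01*10*11"
    print(in_L(x_k(3)))  # True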

In contrast to the above Theorem, double-logarithmic space is indeed the smallest amount of space that is more useful than constant space; that is:

Theorem: Dspace(o(log log n)) = Dspace(O(1)).

The proof proceeds by considering, for each input location, the sequence of the storage configurations of the machine at all times that it crosses this input location. For starters, the length of this 'crossing sequence' is upper-bounded by the number of possible storage configurations (i.e., in case of Turing machines, we consider the contents of the work-tape and the head location), which is at most t := 2^{O(s(n))}, where s is the machine's space complexity. Thus, the number of such sequences is bounded above by t^t. But if the latter is smaller than n, then there exist three input locations that have the same sequence of configurations. Using a cut-and-paste argument, we get a shorter input on which the machine uses space s(n), which is not possible in case the original (n-bit long) input was the shortest one on which the machine uses space at least s(n). We conclude that t^t ≥ n must hold, and s(n) = Ω(log t) = Ω(log log n) follows.

Logarithmic Space: Although the first Theorem above asserts that there is 'life' below logarithmic space, logarithmic space will be the lowest space complexity that we will care about. The class of sets recognizable by deterministic machines that use logarithmic space is denoted L; that is, L := ∪_c Dspace(c·log n).

Theorem: L ⊆ P.

In general, if s is at least logarithmic and is computable within time 2^s, then Dspace(s) ⊆ Dtime(2^{O(s)}). This follows as a special case from the Theorem on non-deterministic space versus deterministic time below. The phenomenon that time relates exponentially to space occurs also in other settings.

Another class of important log-space computations is the class of logarithmic-space reductions, that is, reductions (or rather oracle machines) that use only logarithmic space (and, as usual, polynomial time). In accordance with our conventions regarding inputs and outputs, we stress that the queries (resp., answers) are written on (resp., read from) a special device/tape that is write-only (resp., read-only) for the calling algorithm and read-only (resp., write-only) for the invoked oracle. We observe that all known Karp-reductions establishing NP-completeness results are in fact log-space computable. Observe also that if L1 is log-space reducible to L2 and L2 is in L, then so is L1. (See the section on Composition Lemmas.)

Polynomial Space: As stated above, we will rarely treat computational problems that require less than logarithmic space. On the other hand, we will rarely treat computational problems that require more than polynomial space. The class of decision problems that are solvable in polynomial space is denoted PSPACE := ∪_c Dspace(n^c). A complete problem for PSPACE is presented in the lecture on the Polynomial-Time Hierarchy.

Non-deterministic space complexity

Two models of non-determinism

We discuss two models of non-deterministic machines. In the standard model, called the on-line model, the machine makes non-deterministic choices 'on the fly' (or, alternatively, reads a non-deterministic input from a read-only tape that can be read only in a uni-directional way). Thus, if the machine wants to refer to such a non-deterministic choice at a later stage, then it must store the choice on its storage device (and be charged for it). In the so-called off-line model, the non-deterministic choices (or the bits of the non-deterministic input) are read from a special read-only record (or tape) that can be scanned in both directions, like the main input. Although the off-line model fits better the motivation to NP (as presented in the first lecture), the on-line model seems more adequate for the study of non-determinism in the context of space complexity. The latter thesis is based on observing that an off-line non-deterministic tape can be used to 'code' computations, and in a sense allows one to cheat with respect to the 'real' space complexity of the computation. This is reflected in the fact that the off-line model can simulate the on-line model while using space that is logarithmic in the space used by the on-line model. This result is tight: the on-line model can simulate the off-line model using only exponentially more space.

Theorem (relating the two models, loosely stated): For s : N → N that is 'nice' and at least logarithmic, Nspace_on-line(s) = Nspace_off-line(Θ(log s)).

To simulate the on-line model on the off-line model, use the non-deterministic input tape of the latter to encode an accepting computation of the former (i.e., a sequence of consecutive configurations leading from the initial configuration to an accepting configuration). The simulating machine, which verifies the legitimacy of the sequence of configurations recorded on the non-deterministic input tape, only needs to store its location within the current pair of configurations that it examines, which requires space logarithmic in the length of a single configuration. On the other hand, the simulation of the off-line model by the on-line model uses a crossing-sequence argument. For starters, one shows that the length of such sequences is at most double-exponential in the space complexity of the off-line machine. Then the on-line non-deterministic input tape is used to encode the sequence of crossing-sequences, and the on-line machine checks that each consecutive pair is consistent. This requires holding one or two crossing-sequences in storage, which requires space logarithmic in the number of such sequences, which in turn is exponential in the space complexity of the off-line machine.

Some basic facts about NSPACE

We let Nspace(s) := Nspace_on-line(s), and focus on NL := Nspace(O(log n)). Suitable 'upwards-translation' lemmas can be used to translate simulation results concerning NL (resp., L) into general simulation results concerning non-deterministic (resp., deterministic) space. Typically, the input is padded until the concrete space allowance becomes logarithmic in the (padded) input; i.e., n-bit long inputs are padded to length N such that s(n) = log N. Next, the simulation result is applied, and finally the complexity of the obtained simulation is stated in terms of the original input length. A notable property of NL is that this class has a very natural complete problem:

Theorem: Directed Connectivity is NL-complete: Directed Connectivity is in NL, and every problem in NL is reducible to Directed Connectivity by a log-space reduction.

Proof sketch: A non-deterministic log-space machine may decide Directed Connectivity by guessing and verifying the directed path on-the-fly. To reduce L ∈ NL to Directed Connectivity, we consider the non-deterministic log-space machine that decides L. We observe that, on input x, this machine uses O(log |x|) space, and so it may be in one out of poly(|x|) possible configurations (accounting for the possible contents of its work-tape and its head locations on the input tape and work tape). Consider a directed graph with these configurations as vertices, and with directed edges connecting ordered pairs of possibly-consecutive configurations (i.e., pairs related by a possible non-deterministic move). (Indeed, unlike the vertices, the edges depend on the input x.) Observe that x ∈ L if and only if there exists a directed path in this graph leading from the initial configuration to an accepting configuration. Furthermore, this graph can be constructed in logarithmic space from the input x. Thus, L is log-space reducible to Directed Connectivity.

(A related phenomenon is that Nspace_off-line(s) is only known to be contained in Dtime(2^{2^{O(s)}}), whereas, as stated in the next Theorem, Nspace_on-line(s) ⊆ Dtime(2^{O(s)}). In fact, the power of the off-line model emerges from the fact that its running time is not bounded, even not without loss of generality, by an exponential in the space complexity.)

Theorem (non-deterministic space versus deterministic time): If s is at least logarithmic and is computable within time 2^s, then Nspace(s) ⊆ Dtime(2^{O(s)}).

Proof sketch: By a suitable upwards-translation lemma, it suffices to prove the result for logarithmic s; that is, we need to show that NL ⊆ P. Using the above Theorem, we just need to show that directed connectivity can be solved in polynomial time. This fact is well known (e.g., by the directed-DFS algorithm).
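
The fact used above, that directed s-t connectivity is in P, is witnessed for instance by a plain breadth-first search over the (configuration) graph; a minimal Python sketch, with the graph given as an adjacency-list dictionary, follows.

    from collections import deque

    def reachable(graph, s, t):
        # Polynomial-time test for a directed path from s to t.
        seen, queue = {s}, deque([s])
        while queue:
            u = queue.popleft()
            if u == t:
                return True
            for v in graph.get(u, []):
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        return False

    print(reachable({1: [2], 2: [3], 3: []}, 1, 3))  # True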

Theorem (non-deterministic versus deterministic space): Nspace(s) ⊆ Dspace(s^2), provided that s : N → N is space-constructible and at least logarithmic.

In particular, for any polynomial p it holds that Nspace(p) ⊆ PSPACE, where the inclusion is strict due to the space hierarchy theorem (e.g., Dspace(n^c) is strictly contained in Dspace(n^{c'}) for every c' > c). Contrast this Theorem with the trivial fact that Ntime(t) ⊆ Dtime(2^{O(t)}), provided that t : N → N is time-constructible and at least logarithmic.

Proof sketch: Again, it suffices to show that directed connectivity can be solved in deterministic O(log^2 n) space. The basic idea is that checking whether or not there is a path of length at most L from u to v in G reduces (in log-space) to checking whether there is an intermediate vertex w such that there is a path of length at most ⌈L/2⌉ from u to w and a path of length at most ⌊L/2⌋ from w to v. Let p_G(u, v, L) := 1 if there is a path of length at most L from u to v in G, and p_G(u, v, L) := 0 otherwise. Thus, p_G(u, v, L) can be decided recursively by scanning all vertices w in G, and checking for each w whether both p_G(u, w, ⌈L/2⌉) = 1 and p_G(w, v, ⌊L/2⌋) = 1 hold.

Thus, suppose we are given a directed graph G and a pair of vertices (s, t), and should decide whether or not there is a path from s to t in G. Let n denote the number of vertices in G; then we need to compute p_G(s, t, n). This is done by invoking a recursive procedure that computes p_G(u, v, L) by scanning all vertices in G, and computing for each vertex w the values of p_G(u, w, ⌈L/2⌉) and p_G(w, v, ⌊L/2⌋). The amount of space taken by each level of the recursion is log n (for storing the current value of w), and the number of levels is log n. The theorem follows.

We stress that when computing p_G(u, v, L) we make polynomially many recursive calls, but all these calls reuse the same work space. That is, when we compute p_G(u, w, ⌈L/2⌉) and p_G(w, v, ⌊L/2⌋), we reuse the space that was used for computing p_G(u, w', ⌈L/2⌉) and p_G(w', v, ⌊L/2⌋) for the previous w'. Furthermore, when we compute p_G(w, v, ⌊L/2⌋), we reuse the space that was used for computing p_G(u, w, ⌈L/2⌉).
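
The recursion is easy to spell out; the Python sketch below mirrors its structure (a real O(log^2 n)-space implementation would reuse one global work area per recursion level rather than rely on the language's call stack).

    def path_at_most(graph, u, v, l):
        # Is there a path of length at most l from u to v in graph (adjacency lists)?
        if u == v:
            return True
        if l <= 1:
            return v in graph.get(u, [])
        half_up, half_down = (l + 1) // 2, l // 2
        return any(path_at_most(graph, u, w, half_up) and
                   path_at_most(graph, w, v, half_down)
                   for w in graph)

    G = {1: [2], 2: [3], 3: [4], 4: []}
    print(path_at_most(G, 1, 4, len(G)))  # True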

Composition Lemmas

Indeed, as indicated by the proof of the above Theorem, space (unlike time) can be reused. In particular, if one machine makes many recursive calls to another machine, then the cost (in space) of these calls is the maximum space used by a single call, whereas the cost in terms of time of these calls is the sum of the times taken by all calls. Put in other words: Suppose that L is s1-space reducible to L', and that L' is in XSPACE(s2), where X ∈ {D, N}. Then L is in XSPACE(s1 + s2'), where s2'(n) := s2(2^{O(s1(n))}), because 2^{O(s1(n))} is an obvious bound on the length of the queries made to L'. Proving this claim is less trivial than it seems (even in the case of a single call to L'), because we cannot afford to store the query and the answer (which may have lengths 2^{O(s1)} and 2^{O(s2')}, respectively) in the working space of the resulting machine.

For simplicity, we focus on the single-query case. Let M be the reduction of L to L', and let M' be a machine solving L'. We emulate them both as follows. We allocate each of the machines a separate work-tape, and begin by emulating M' without specifying its input. When M' wishes to read the i-th (e.g., first) bit of its input, which is the i-th bit of the query of M, we run machine M until it produces the i-th bit of its query, which we hand to M'. We stress that we do not store all previous bits of this query, but rather discard them. Thus, we run a new emulation of M each time M' wishes to read a bit of its input (i.e., of the query directed to it by M). When M' outputs its decision, we store it and emulate M for the last time. In this run, we discard all the query bits produced by M, feed it with M''s answer, and output whatever M does.
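
The recompute-instead-of-store idea can be sketched in Python as follows (single query, decision oracle). The interfaces are purely illustrative assumptions: M_stream(x) yields the query bits of M on input x, M_prime decides L' while reading its input through a callback, and M_finish is the final run of M given the oracle's answer.

    def compose(x, M_stream, M_prime, M_finish):
        def query_bit(i):
            # Re-run M from scratch and discard all bits before the i-th one,
            # so only O(1) bits of the query are ever stored.
            for j, bit in enumerate(M_stream(x)):
                if j == i:
                    return bit
            raise IndexError("query is shorter than the requested index")
        answer = M_prime(query_bit)   # M' reads its input bit by bit
        return M_finish(x, answer)    # last run of M, fed with M''s answer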

The treatment of reductions to search problems is more complex, because (unless postulated differently) the calling algorithm may scan the answer provided by the oracle back and forth, rather than read it once from left to right. To treat this case, we may keep two emulations of M: one for producing bits of the query, and the other for using the bits of the answer. Note that the second emulation corresponds to the last emulation of M in the description above. Handling of many oracle calls is performed in a query-by-query manner, relying on the fact that the i-th answer is not available after the (i+1)-st query is made. For i = 1, 2, ..., we handle the (i+1)-st query/answer by keeping a record of the temporary configurations of M before it started making the i-th and (i+1)-st queries. We maintain four emulations of M: the first (resp., third) for producing bits of the i-th (resp., (i+1)-st) query, and the second (resp., fourth) for using the bits of the i-th (resp., (i+1)-st) answer. Each time we need to emulate the first or second (resp., third or fourth) copy, we start the emulation from the recorded configuration of M before making the i-th (resp., (i+1)-st) query. Once the fourth copy starts to produce the (i+2)-nd query, we refresh all configurations and move to the next iteration. Specifically, the configuration of the fourth copy will be used as the second temporary configuration (as it corresponds to the configuration before making the (i+2)-nd query), and the current second configuration (which corresponds to the configuration before making the (i+1)-st query) will be used as the first temporary configuration for iteration i+1. (We leave the extension to the general multiple-query case as an exercise.)

Teaching note: In the next subsection we will implicitly use a composition result, but for that specific composition we do not need the power of the above strong composition lemma. Specifically, the reduction will make queries that are very related to its input, and thus the invoked subroutine can form the query by itself (from the input). Furthermore, the answers will be of logarithmic length, and thus can be stored by the reduction (as in the case of invoking a decision subroutine).

NSPACE is closed under complementation

People tend to be discouraged by the impression that decades of research have failed to answer any of the famous open problems of complexity theory. In addition to the fact that substantial progress towards the understanding of some open problems has been achieved, people tend to forget that some famous open problems were indeed resolved. The following result relates to a famous question that was open for three decades.

(In particular, using the fact that the class of sets recognized by linear-space non-deterministic machines equals the set of context-sensitive languages, the following theorem resolves the question of whether the latter set is closed under complementation. This question had been puzzling researchers since the early days of research in the area of formal languages (i.e., the 1960's). We mention that the theorem was proven in the late 1980's.)

Theorem: NL = coNL, where coNL := { {0,1}* \ L : L ∈ NL }.

Again, using an adequate upwards-translation lemma, one can derive the closure under complementation of Nspace(s).

Proof sketch: It suffices to show that directed un-connectivity (the complement of directed connectivity) can be decided in NL. That is, we will present a non-deterministic log-space machine M such that:

1. If there is no directed path from s to t in G, then there exists a computation of M that accepts the input (G, s, t).

2. If there is a directed path from s to t in G, then all possible computations of M reject (G, s, t).

The above decision problem is log-space reducible to determining the number of nodes that are reachable from a given vertex in a given graph (exercise: provide such a reduction). Thus, we focus on providing a non-deterministic log-space machine that computes the said quantity, where we say that a non-deterministic machine M computes the function f : {0,1}* → {0,1}* if the following two conditions hold:

1. For every x, there exists a computation of M that halts with output f(x).

2. For every x, all possible computations of M either halt with output f(x) or halt with a special 'don't know' symbol, denoted ⊥.

Fixing an n-vertex graph G = (V, E) and a vertex v, we consider the set of vertices that are reachable from v by a path of length at most i. We denote this set by R_i, and observe that R_0 = {v} and that, for every i = 1, 2, ..., it holds that

R_i = R_{i-1} ∪ { u : ∃w ∈ R_{i-1} s.t. (w, u) ∈ E }.    (*)

Our aim is to compute |R_n|. This will be done in n iterations, such that in the i-th iteration we compute |R_i|. When computing |R_i| we rely on the fact that |R_{i-1}| is known to us, which means that we'll store |R_{i-1}| (but not previous |R_j|'s) in memory. Our non-deterministic guess for |R_i|, denoted g, will be verified as follows:

i

V we guess for g vertices jR jg is veried in the straightforward manner That is scanning

i

paths of length at most i from v to them and verify these onthey Indeed we also guess

for which g vertices to verify this fact

We use log n bits to store the currently scanned vertex another log n bits to store an

intermediate vertex on a path from v and another log i log n bits to store the distance

traveled so far

2. The verification of |R_i| ≤ g is the interesting part of the procedure. Here we rely on the fact that we know |R_{i-1}|. Scanning V again, we verify for n - g (guessed) vertices that they are not reachable from v by paths of length at most i. Verifying that u ∉ R_i is done as follows:

(a) We scan V, guessing |R_{i-1}| vertices that are in R_{i-1}, and verify each such guess in the straightforward manner. (Implicit here is a procedure that, given G, v, i and |R_{i-1}|, produces R_{i-1} itself.)

(b) For each w ∈ R_{i-1} (which was guessed and verified above), we verify that both u ≠ w and (w, u) ∉ E.

By Eq. (*), if u passes the above verification then indeed u ∉ R_i.

We use log_2 n bits to store u, another log_2 n bits to count the number of vertices verified to be in R_{i-1}, another log_2 n bits to store such a w, and another log_2 n bits for verifying that w ∈ R_{i-1}.

If any of the verifications fails, the machine halts outputting the 'don't know' symbol. (Exercise: assuming that the correct value of |R_{i-1}| is used, prove that the above non-deterministic log-space procedure computes the value of |R_i|.)

Observing that when computing |R_i| we only need to know |R_{i-1}| (and do not need |R_j| for any j < i - 1), the above yields a non-deterministic log-space machine for computing |R_n|. The theorem follows.
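
The following deterministic Python sketch only illustrates the recurrence (*) and the sequence of counts |R_0|, ..., |R_n|; it stores the sets R_i explicitly, whereas the non-deterministic log-space machine above stores only |R_{i-1}| and re-guesses the members of R_{i-1} on the fly.

    def reachability_counts(graph, v):
        # Return [|R_0|, |R_1|, ..., |R_n|] for a graph given as {vertex: [successors]}.
        n = len(graph)
        R = {v}                 # R_0
        counts = [len(R)]
        for _ in range(n):
            R = R | {u for w in R for u in graph.get(w, [])}   # the recurrence (*)
            counts.append(len(R))
        return counts

    G = {1: [2], 2: [3], 3: [], 4: [1]}
    print(reachability_counts(G, 1))  # [1, 2, 3, 3, 3]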

Lecture 6

The Polynomial-Time Hierarchy

The Polynomial-Time Hierarchy (PH) is a hierarchy of complexity classes that extends NP. We will present two equivalent ways of defining this hierarchy, and discuss some of its properties.

Defining PH via quantifiers

Recall that L ∈ NP if there exists a binary polynomially-bounded relation R such that R is polynomial-time recognizable and

x ∈ L if and only if ∃y s.t. (x, y) ∈ R.

Identifying NP with Σ_1, we define Σ_2 as containing sets L such that there exists a 3-ary polynomially-bounded relation R such that R is polynomial-time recognizable and

x ∈ L if and only if ∃y_1 ∀y_2 s.t. (x, y_1, y_2) ∈ R.

Above and below, it is important to stress that the universal quantifiers range only over strings of the adequate length.

In general is dened as the class consisting of sets L such that there exists a i ary

i

p olynomiallyb ounded relation R such that R is p olynomialtime recognizable and

x L if and only if y y Q y st x y y R

i i i

odd resp even That is we where Q is an existential resp universal quantier in case i is

i

have i alternating quantiers starting with an existential one where each quantier ranges over

strings of length p olynomial in the length of x Note that indeed NP

Similarly we can dene classes referring to alternating sequences of quantiers starting with

a universal quantier Sp ecically L if there exists a i ary p olynomiallyb ounded

i

relation R suchthat R is p olynomialtime recognizable and

x L if and only if y y Q y st x y y R

i i i

where Q is an existential resp universal quantier in case i is even resp o dd

i

The p olynomialtime hierarchy denoted PH is dened as That is L PH means that

i i

there exists an i suchthatL PH

Wesay that R is p olynomiallyb ounded if there exists a p olynomial p suchthatforevery x y y R it holds

that jy j jy jpjxj

Indeed wesay that R is p olynomiallyb ounded if there exists a p olynomial p such that for every x y y R

i

P

i

jy j pjxj it holds that

j

j

Exercises (the following facts can be verified by purely syntactic considerations):

1. For every i, it holds that Π_i = coΣ_i (and, in particular, Π_1 = coNP).

2. For every i, it holds that Σ_i ∪ Π_i ⊆ Σ_{i+1} ∩ Π_{i+1}. Thus, PH = ∪_i Π_i.

3. For every i, it holds that Σ_i ⊆ Π_{i+1}.

It is widely believed that Σ_i is a strict subset of Σ_{i+1}. (See further discussion in the section on collapses below.)

Another exercise: Prove that PH is contained in PSPACE. (See further discussion in the comment on a PSPACE-complete problem below.)

Complete sets in PH: The above definition of Σ_i (and Π_i) gives rise to semi-natural complete sets for Σ_i (and Π_i). For example, consider the set of Boolean circuits C such that C takes as input i equal-length strings, denoted x_1, ..., x_i, and it holds that ∃x_1 ∀x_2 ... Q_i x_i C(x_1, ..., x_i) = 1, where Q_i is an existential (resp., universal) quantifier in case i is odd (resp., even). Clearly, this set is in Σ_i, and every set in Σ_i is Karp-reducible to this set. (Hint: the x_j's correspond to the y_j's in the definition of Σ_i, whereas C corresponds to x.)

Natural examples of sets in PH: Recall that natural NP-optimization problems are captured by NP-sets that refer only to a one-sided bound on the value of the optimum. For example, whereas the optimization version of maxClique requires finding the largest clique in a given graph, the decision problem is to tell whether or not the largest clique has size greater than or equal to a given number. (Actually, the decision problem is typically phrased as determining whether there exists a clique of size greater than or equal to a given number. Exercise: show that these two formulations are indeed equivalent.) Clearly, the latter decision problem is in NP, whereas its complement (i.e., determining whether the largest clique has size smaller than a given number) is in coNP. But what about determining whether the largest clique has size equal to a given number? That is, the set we refer to is the set of pairs (G, K) such that the size of the largest clique in G equals K. Note that this problem is unlikely to be in either NP or coNP, because this would imply NP = coNP, but it is certainly in Σ_2 and in Π_2. (If we could give an NP-proof that the max-clique has size equal to a given number, then we could also prove that it is strictly smaller than a given number, which is a coNP-complete problem, and NP = coNP would follow.) Exercise: present adequate 3-ary relations for the above set. See further discussion in the next section.
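As a toy illustration of the ∃∀ structure of this set (written for these notes; brute force, exponential time, tiny graphs only), the following sketch checks the pair (G, K) by exhibiting a clique of size K (the existential witness) and checking that every vertex set of size K+1 fails to be a clique (the universal part):

    from itertools import combinations

    def is_clique(graph, vertices):
        """graph: dict vertex -> set of neighbours (undirected)."""
        return all(v in graph[u] for u, v in combinations(vertices, 2))

    def largest_clique_equals(graph, K):
        """Membership test for the pair (G, K): the exists-part guesses a
        K-clique, the forall-part rules out any clique of size K+1
        (brute-force stand-ins for the two quantified strings y1, y2)."""
        nodes = list(graph)
        exists_K = any(is_clique(graph, c) for c in combinations(nodes, K))
        no_bigger = all(not is_clique(graph, c) for c in combinations(nodes, K + 1))
        return exists_K and no_bigger

    # Triangle plus a pendant vertex: the largest clique has size 3.
    G = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
    print(largest_clique_equals(G, 3), largest_clique_equals(G, 2))  # True False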

Defining PH via oracles

Recall that the general notion of a (Cook) reduction is based on augmenting a deterministic polynomial-time machine with oracle access. A natural question is what languages can be recognized by such machines when the oracle is an arbitrary NP-set (or, equivalently, an NP-complete set like SAT). We denote this class by P^NP (standing for an arbitrary P-machine given oracle access to some NP-set). As indicated below, P^NP is likely to be a proper superset of NP, whereas the class of languages that are merely Karp-reducible to an NP-set equals NP. (The reason that we insist on Karp-reductions here will become clear below.)

Comment: The notation P^NP is consistent with the standard notation for oracle machines. That is, for an oracle machine M, an oracle set L, and a string x, we let M^L(x) denote the output of M on input x when given oracle access to L. Thus, when we said that L' is (Cook-)reducible to L, we meant that there exists a polynomial-time oracle machine M such that, for every x, it holds that M^L(x) = 1 if and only if x ∈ L'. Thus,

    P^NP = { L(M^SAT) : M is a P-machine },

where L(M^SAT) denotes the set of inputs that are accepted by M when given oracle access to SAT.

Exercises:

1. Show that both NP and coNP are subsets of P^NP.

2. In contrast, prove that the class of languages that are Karp-reducible to NP equals NP.

3. Following the above discussion, define P^coNP and show that it equals P^NP.

4. Referring to the set of pairs (G, K) such that the size of the largest clique in G equals K, show that this set is in P^NP.
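A sketch of the idea behind the last exercise (written for these notes, with a hypothetical clique_oracle(G, K) standing in for an NP oracle that answers whether G has a clique of size at least K): two oracle queries suffice.

    def largest_clique_equals_K(G, K, clique_oracle):
        """Decide whether the largest clique in G has size exactly K,
        using an NP oracle for 'G has a clique of size >= K'.
        clique_oracle is a hypothetical stand-in for that NP oracle."""
        at_least_K = clique_oracle(G, K)        # largest clique >= K ?
        at_least_K1 = clique_oracle(G, K + 1)   # largest clique >= K+1 ?
        return at_least_K and not at_least_K1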

The definition of P^NP suggests that we may also define classes such as NP^NP. Note that such a definition does not yield a natural notion of a reduction to NP-sets, because the "reduction" is non-deterministic. Still, a well-defined class does emerge. Specifically, NP^NP is the class of sets that are accepted by a non-deterministic polynomial-time (oracle) machine that is given access to some NP-set. Observe that indeed P^NP ⊆ NP^NP.

Defining the Σ_i's by oracles: As before, we let Σ_1 = NP. For i >= 1, we define

    Σ_{i+1} = NP^{Σ_i}.

Indeed, Σ_2 so defined equals NP^NP. As we will show in the next section, the Σ_i's as defined here coincide with the classes defined in the previous section.

A general perspective: what does C_1^{C_2} mean? By the above discussion, it should be clear that the class C_1^{C_2} can be defined for any two complexity classes C_1 and C_2, provided that C_1 is associated with a class of machines that extends naturally to access oracles. Actually, the class C_1^{C_2} is not defined based on the class C_1, but rather by analogy to it. Specifically, suppose that C_1 is the class of sets recognizable by machines of a certain type (e.g., deterministic or non-deterministic) with certain resource bounds (e.g., time and/or space bounds). Then we consider analogous oracle machines (i.e., of the same type and with the same resource bounds), and say that L ∈ C_1^{C_2} if there exists such an oracle machine M_1 and a set L_2 ∈ C_2 such that M_1, given oracle access to L_2, accepts the set L.

Exercise: For C_1 and C_2 as above, prove that C_1^{C_2} = C_1^{co C_2}. (Note that, in particular, Σ_{i+1} = NP^{Σ_i} = NP^{Π_i}.)

Equivalence of the two definitions of PH

To avoid confusion, let us denote by Σ'_i the class defined via quantifiers and by Σ''_i the class defined by oracle machines.

Theorem: For every i >= 1, it holds that Σ'_i = Σ''_i.

Proof Sketch: The claim holds trivially for i = 1. Assuming that equality holds for i, we show that it holds also for i+1. Each of the two inclusions uses only the induction hypothesis of the same direction.

Assuming that Σ'_i ⊆ Σ''_i, we prove that Σ'_{i+1} ⊆ Σ''_{i+1} by looking at an (i+2)-ary relation R for a set L ∈ Σ'_{i+1}. Recall that x ∈ L iff ∃y_1 ∀y_2 ... Q_{i+1} y_{i+1} such that (x, y_1, ..., y_{i+1}) ∈ R. Define L' as the set of pairs (x, y_1) such that ∀y_2 ... Q_{i+1} y_{i+1} it holds that (x, y_1, y_2, ..., y_{i+1}) ∈ R. Then L' ∈ coΣ'_i, and x ∈ L iff there exists a y_1 such that (x, y_1) ∈ L'. By using a straightforward non-deterministic oracle machine, we obtain that L ∈ NP^{coΣ'_i} = NP^{Σ'_i}. Using the induction hypothesis, it follows that L ∈ NP^{Σ''_i} = Σ''_{i+1}.

Assuming that Σ''_i ⊆ Σ'_i, we prove that Σ''_{i+1} ⊆ Σ'_{i+1} by looking at a non-deterministic oracle machine M that accepts a set L ∈ Σ''_{i+1} when using an oracle L' ∈ Σ''_i. By the definition of non-deterministic acceptance, it follows that x ∈ L iff there exists a computation of M on input x that accepts when the queries are answered according to L'. Let us denote by M^{L'}(x, y) the output of M on input x and non-deterministic choices y, when its queries are answered according to L'. Then x ∈ L iff there exists a y such that M^{L'}(x, y) = 1. We may assume, without loss of generality, that M starts its computation by non-deterministically guessing all oracle answers and acting according to these guesses, and that it accepts only if these guesses turn out to be correct. In other words, there exists a polynomial-time computable predicate P such that M^{L'}(x, y) = 1 iff P(x, y) = 1 and, for every j, the j-th answer provided by the oracle in the computation M^{L'}(x, y) equals the j-th bit of y, denoted y^(j). Furthermore, since M acts according to the guessed answers that are part of y, the j-th query of M is determined (in polynomial time) by (x, y), and is denoted q_j(x, y). We conclude that x ∈ L iff there exists a y such that P(x, y) = 1 and, for every j, y^(j) = 1 iff q_j(x, y) ∈ L'. Using the induction hypothesis, it holds that L' ∈ Σ'_i, and we let R' denote the corresponding (i+1)-ary relation. Thus, x ∈ L iff

    ∃y [ P(x, y) = 1  ∧  ⋀_j ( y^(j) = 1  ⟺  ∃y_1^(j) ∀y_2^(j) ... Q_i y_i^(j) s.t. (q_j(x, y), y_1^(j), ..., y_i^(j)) ∈ R' ) ].

The proof is completed by observing that the above expression can be rearranged to fit the quantifier-based definition of Σ'_{i+1}. (Hint: we may incorporate the computation of all the q_j(x, y)'s into the relation R', and "pull" all quantifiers outside. Note that, for predicates P_1 and P_2, the expression ∀y [P_1(y) ⟺ ∃z P_2(y, z)] is equivalent to ∀y [(P_1(y) ∧ ∃z P_2(y, z)) ∨ (¬P_1(y) ∧ ∀z ¬P_2(y, z))], which in turn is equivalent to ∀y ∃z' ∀z'' [(P_1(y) ∧ P_2(y, z')) ∨ (¬P_1(y) ∧ ¬P_2(y, z''))]. Note also that pulling the quantifiers outside in ⋀_{j=1}^t ∃y^(j) ∀z^(j) P(y^(j), z^(j)) yields an expression of the type ∃y^(1), ..., y^(t) ∀z^(1), ..., z^(t) ⋀_{j=1}^t P(y^(j), z^(j)).)

Collapses

As stated before, it is widely believed that PH is a strict hierarchy; that is, that Σ_i is strictly contained in Σ_{i+1} for every i. We note that if a collapse occurs at some level (i.e., Σ_{i+1} = Σ_i for some i), then the entire hierarchy collapses to that level (i.e., PH = Σ_i). This fact is best verified from the oracle-based definition, and the verification is left as an exercise. In fact, a stronger statement can be proven:

Theorem: If Σ_i = Π_i holds for some i, then PH = Σ_i.

In particular, NP = coNP implies a total collapse (i.e., PH = NP). In light of the above discussion, it suffices to show that Σ_i = Π_i implies Σ_{i+1} = Σ_i. This is easiest to prove using the quantifier-based definition, while relying on ideas used in the previous section. Specifically, for L ∈ Σ_{i+1}, we first derive a set L' ∈ Π_i such that x ∈ L if and only if there exists y_1 such that (x, y_1) ∈ L'. By the hypothesis, L' ∈ Σ_i, and so x ∈ L iff

    ∃y_1 ∃y_2 ∀y_3 ... Q_{i+1} y_{i+1}  s.t.  ((x, y_1), y_2, y_3, ..., y_{i+1}) ∈ R',

where R' is the (i+1)-ary relation guaranteed for L' (w.r.t. the quantifier-based definition of Σ_i). By joining the two leftmost existential quantifiers (and slightly modifying R' into R'' = {(x, (y_1, y_2), y_3, ..., y_{i+1}) : ((x, y_1), y_2, y_3, ..., y_{i+1}) ∈ R'}), we conclude that L ∈ Σ_i.

Comment: a PSPACE-complete problem

Recall that the complete problem of Σ_i referred to circuits that take i input strings, and to an alternating existential and universal quantification over these inputs. A natural question that arises is what happens if we drop the restriction on the number of such inputs. That is, consider the set of circuits that take a sequence of input strings (which is, of course, bounded in length by the size of the circuit). Such a circuit, denoted C, having t = t(C) input strings, denoted x_1, ..., x_t, is in the set QC (standing for Quantified Circuits) if and only if ∃x_1 ∀x_2 ... Q_t x_t C(x_1, ..., x_t) = 1.

It is easy to see that QC ∈ PSPACE. To show that any problem in PSPACE is reducible (in fact, Karp-reducible) to QC, we follow the underlying idea of the proof of Savitch's Theorem. That is, let L ∈ PSPACE, let M be the corresponding polynomial-space machine and p be the corresponding polynomial space bound. For any x ∈ {0,1}^n, it holds that x ∈ L iff M passes, in at most 2^{p(n)} steps, from the initial configuration with input x, denoted init(x), to an accepting configuration, denoted ACC. Define a Boolean predicate φ_M such that φ_M(α, β, t) = true iff M passes in at most t steps from the configuration α to the configuration β. Then we are interested in the value φ_M(init(x), ACC, 2^{p(n)}). On the other hand, for every α, β and i ∈ N, it holds that

    φ_M(α, β, 2^i)  =  ∃γ [ φ_M(α, γ, 2^{i-1}) ∧ φ_M(γ, β, 2^{i-1}) ].                 (Eq. A)

If we were to iterate Eq. (A), then the length of the formula would double in each iteration, and after log_2 t iterations we would just get a straightforward conjunction of t formulae capturing single steps of M. Our aim is to moderate the growth of the formula size during the iterations. Towards this end, we replace Eq. (A) by

    φ_M(α, β, 2^i)  =  ∃γ ∀σ, τ [ ((σ, τ) = (α, γ) ∨ (σ, τ) = (γ, β))  ⟹  φ_M(σ, τ, 2^{i-1}) ],      (Eq. B)

where σ and τ range over strings representing configurations. Observe that Eq. (B) is equivalent to Eq. (A), whereas in Eq. (B) the size of the formula grows by an additive term (rather than by a factor of 2). Thus, φ_M(init(x), ACC, 2^{p(n)}) can be written as a quantified Boolean formula with O(log t) = O(p(n)) alternating quantifiers. The formula being quantified over will be a conjunction of O(p(n)) simple logical conditions (of the type introduced in Eq. (B)), as well as a single occurrence of the formula φ_M(·, ·, 1). Hence, we have actually established the PSPACE-hardness of a special case of QC, corresponding to Quantified Boolean Formulae (denoted QBF).

(Footnote to the exercise in the Collapses section: assuming Σ_{i+1} = Σ_i, one proves, by induction on j >= i, that Σ_j = Σ_i; in the induction step we have Σ_{j+1} = NP^{Σ_j} = NP^{Σ_i} = Σ_{i+1} = Σ_i.)
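The recursion in Eq. (A) is just middle-point reachability on the configuration graph. The following small sketch (written for these notes, over an explicit set of configurations rather than a Turing machine's configuration space) evaluates it recursively, reusing work space across the two half-length subproblems exactly as in Savitch's simulation.

    def phi(step, configs, a, b, i):
        """True iff configuration b is reachable from a in at most 2**i steps.
        step(a) returns the set of configurations reachable from a in one step.
        Space usage is proportional to the recursion depth i (Savitch's idea)."""
        if i == 0:
            return a == b or b in step(a)
        # Eq. (A): guess a middle configuration g and recurse on both halves.
        return any(phi(step, configs, a, g, i - 1) and phi(step, configs, g, b, i - 1)
                   for g in configs)

    # Toy example: configurations 0..7, a single step increments by one.
    configs = range(8)
    step = lambda a: {a + 1} if a < 7 else set()
    print(phi(step, configs, 0, 5, 3))   # True: 0 reaches 5 within 2**3 steps
    print(phi(step, configs, 5, 0, 3))   # False: no step ever decreases the value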

Lecture

Randomized Complexity Classes

So far, our approach to computing devices was somewhat conservative: we thought of them as repeatedly executing a deterministic rule. A more liberal, and quite realistic, approach, pursued in this lecture, considers computing devices that use a probabilistic (or randomized) rule. Specifically, we allow probabilistic rules that choose uniformly among two predetermined possibilities, and observe that the effect of more general probabilistic rules can be efficiently approximated by a rule of the former type. We still focus on polynomial-time computations, but these are probabilistic polynomial-time computations. Indeed, we extend our notion of efficient computations from deterministic polynomial-time computations to probabilistic polynomial-time computations.

Rigorous models of probabilistic machines are defined by natural extensions of the basic model; for example, we will talk of probabilistic Turing machines. Again, the specific choice of model is immaterial (as long as it is reasonable). We consider the output distribution of such probabilistic machines on fixed inputs; that is, for a probabilistic machine M and a string x ∈ {0,1}*, we denote by M(x) the distribution of the output of M on input x, where the probability is taken over the machine's random moves. Focusing on decision problems, three natural types of machines arise:

1. The most liberal notion is of machines with two-sided error probability. In the case of search problems, it is required that the correct answer is output with probability that is significantly greater than 1/2 (e.g., with probability at least 2/3). When this approach is applied to decision problems solvable by probabilistic polynomial-time machines, we get the class BPP (standing for Bounded-error Probabilistic Polynomial-time).

2. Machines with one-sided error probability. In the case of search problems, a natural notion is of machines that output a correct solution (in case such a solution exists) with probability at least 1/2, and never output a wrong solution. In the case of decision problems, there are two natural cases, depending on whether the machine errs on YES-instances but not on NO-instances, or the other way around.

3. Machines that never err, but may output a special ("don't know") symbol, say with probability at most 1/2.

We focus on probabilistic polynomial-time machines, and on error probability that may be reduced to a negligible (e.g., exponentially vanishing in the input length) amount by polynomially many independent repetitions.

We comment that an alternative formulation of randomized computations is captured by deterministic machines that take two inputs, the first representing the actual input and the second representing the "coin tosses" (or the "random input"). For such machines, one considers the output distribution for any fixed first input, when the second input is uniformly distributed among the set of strings of adequate length.

Two-sided error: BPP

The standard definition of BPP is in terms of machines that err with probability at most 1/3. That is, L ∈ BPP if there exists a probabilistic polynomial-time machine M such that for every x ∈ L (resp., x ∉ L) it holds that Pr[M(x) = 1] >= 2/3 (resp., Pr[M(x) = 0] >= 2/3). In other words, letting χ_L denote the characteristic function of L, we require that Pr[M(x) = χ_L(x)] >= 2/3 for every x ∈ {0,1}*. The choice of the constant (error at most 1/3) is immaterial, and any other constant smaller than 1/2 will do (and yield the very same class). In fact, a more general statement, which is proved by so-called amplification (see next), holds.

Error reduction (or confidence amplification): For any function ε : N -> [0,1], consider the class BPP_ε of sets L such that there exists a probabilistic polynomial-time machine M for which Pr[M(x) = χ_L(x)] >= 1 - ε(|x|) holds. Clearly, BPP = BPP_{1/3}. However, a wide range of other classes also equal BPP. In particular:

1. For every positive polynomial p, the class BPP_ε, where ε(n) = (1/2) - (1/p(n)), equals BPP. That is, any error that is noticeably bounded away from 1/2 (i.e., error at most (1/2) - (1/poly(n))) can be reduced to an error of 1/3.

2. For every positive polynomial p, the class BPP_ε, where ε(n) = 2^{-p(n)}, equals BPP. That is, an error of 1/3 can be further reduced to an exponentially vanishing error.

Both facts are proven by applying an adequate Law of Large Numbers. That is, consider independent copies of a random variable that represents the output of the weaker machine (i.e., the machine having larger error probability). Use the adequate Law of Large Numbers to bound the probability that the average of these independent outcomes deviates from the expected value of the original random variable. Indeed, the resulting machine will invoke the original machine sufficiently many times and rule by majority. We stress that invoking a randomized machine several times means that the random choices made in the various invocations are independent of one another.
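A minimal sketch of such majority-rule amplification (written for these notes; weak_decider(x) is a hypothetical randomized procedure with error noticeably below 1/2, and the number of repetitions is chosen per the large-deviation bound mentioned above):

    import random

    def amplify(weak_decider, x, repetitions):
        """Independent-repetition majority vote.
        weak_decider(x) is assumed to return the correct bit with probability
        at least 1/2 + delta; roughly O(log(1/eps)/delta**2) repetitions push
        the error below eps."""
        votes = sum(weak_decider(x) for _ in range(repetitions))
        return 1 if 2 * votes > repetitions else 0

    # Toy weak decider for the (trivial) set of even-length strings,
    # correct with probability 0.6 on every input.
    def weak_decider(x):
        correct = 1 if len(x) % 2 == 0 else 0
        return correct if random.random() < 0.6 else 1 - correct

    print(amplify(weak_decider, "0101", repetitions=601))  # almost surely 1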

BPP is in the Polynomial-Time Hierarchy: Clearly, P ⊆ BPP, and it is commonly conjectured that equality holds (although, according to these conjectures, a polynomial slowdown may occur when transforming a probabilistic polynomial-time algorithm into a deterministic one). However, it is not known whether or not BPP is contained in NP. In view of this ignorance, the following result is of interest.

Theorem: BPP ⊆ Σ_2.

Proof: Suppose that L ∈ BPP, and consider (by suitable error-reduction) a probabilistic polynomial-time algorithm A such that Pr[A(x) ≠ χ_L(x)] < 2^{-|x|} for all x ∈ {0,1}*, where ℓ(|x|) denotes the number of coins tossed by A(x). Let us consider the residual deterministic two-input algorithm A' such that A'(x, r) equals the output of A on input x and random choices r ∈ {0,1}^{ℓ(|x|)}. We claim that x ∈ L if and only if

    ∃s_1, ..., s_{ℓ(|x|)} ∈ {0,1}^{ℓ(|x|)}  ∀r ∈ {0,1}^{ℓ(|x|)}   ⋁_{i=1}^{ℓ(|x|)} A'(x, s_i ⊕ r) = 1.        (Eq. C)

Once the claim is proved, the theorem follows by observing that Eq. (C) fits the quantifier-based definition of Σ_2.

In order to prove the claim, we first consider the case x ∈ L. We use the Probabilistic Method to show that an adequate sequence of s_i's exists. That is, we show that most sequences of s_i's are adequate, by upper-bounding the probability that a random sequence of s_i's is not adequate:

    Pr_{s_1,...,s_{ℓ(|x|)}} [ ∃r ∈ {0,1}^{ℓ(|x|)} s.t. ⋁_{i=1}^{ℓ(|x|)} A'(x, s_i ⊕ r) = 0 ]
       <=  Σ_{r ∈ {0,1}^{ℓ(|x|)}}  Pr_{s_1,...,s_{ℓ(|x|)}} [ ⋁_{i=1}^{ℓ(|x|)} A'(x, s_i ⊕ r) = 0 ]
        =  Σ_{r ∈ {0,1}^{ℓ(|x|)}}  ∏_{i=1}^{ℓ(|x|)}  Pr_{s_i} [ A'(x, s_i ⊕ r) = 0 ]
       <   2^{ℓ(|x|)} · (2^{-|x|})^{ℓ(|x|)}  <<  1,

where the last inequality is due to the fact that, for any fixed x ∈ L and r, it holds that Pr_{s_i}[A'(x, s_i ⊕ r) = 0] = Pr_s[A'(x, s) = 0] = Pr[A(x) ≠ χ_L(x)] < 2^{-|x|}.

On the other hand, for any x ∉ L and every sequence of s_i's, it holds that

    Pr_r [ ⋁_{i=1}^{ℓ(|x|)} A'(x, s_i ⊕ r) = 1 ]  <=  ℓ(|x|) · Pr[A(x) ≠ χ_L(x)]  <  ℓ(|x|) · 2^{-|x|}  <  1,

since x ∉ L. Thus, Eq. (C) cannot possibly hold for x ∉ L.

We comment that the same proof idea yields a variety of similar statements (e.g., see the proof that BPP is in P/poly in a later lecture).

One-sided error: RP and coRP

The class RP is defined as containing any set L such that there exists a probabilistic polynomial-time machine M satisfying the following two conditions:

    x ∈ L  ⟹  Pr[M(x) = 1] >= 1/2                  (R1)
    x ∉ L  ⟹  Pr[M(x) = 1] = 0                     (R2)

Observe that RP ⊆ NP (e.g., note that NP is obtained by replacing condition (R1) with the condition Pr[M(x) = 1] > 0 for every x ∈ L). Again, the specific probability threshold in (R1) is immaterial, as long as it is noticeably bounded away from 0 (see the exercise below). Thus, RP ⊆ BPP.

Exercise: Prove that L is in the class coRP = { {0,1}* \ L' : L' ∈ RP } if and only if there exists a probabilistic polynomial-time machine M satisfying the following two conditions:

    x ∈ L  ⟹  Pr[M(x) = 1] = 1
    x ∉ L  ⟹  Pr[M(x) = 1] <= 1/2

Exercise: Let RP_ρ denote the class obtained by replacing condition (R1) by the condition Pr[M(x) = 1] >= ρ(|x|) for every x ∈ L. Observe that RP = RP_{1/2}, and prove that RP_{1/p(n)} = RP and RP_{1-2^{-p(n)}} = RP for any positive polynomial p. (Note that amplification is easier in this case of one-sided error.)

The well-known randomized primality testing algorithms always accept prime numbers, and reject composite numbers with high probability. Thus, these algorithms establish that the set of prime numbers is in coRP.

No error: ZPP

Whereas in the case of BPP we have allowed two-sided errors, and in the case of RP and coRP we have allowed one-sided errors, we now allow no errors at all. Instead, we allow the algorithm to output a special ("don't know") symbol, denoted ⊥, with some bounded-away-from-1 probability. The resulting class is denoted ZPP (standing for Zero-error Probabilistic Polynomial-time). The standard definition of ZPP is in terms of machines that output ⊥ with probability at most 1/2. That is, L ∈ ZPP if there exists a probabilistic polynomial-time machine M such that Pr[M(x) ∈ {χ_L(x), ⊥}] = 1 and Pr[M(x) = χ_L(x)] >= 1/2, for every x ∈ {0,1}*. Again, the choice of the constant (i.e., 1/2) is immaterial, and amplification can be conducted as in the case of RP and yield the very same class. In fact, as in the case of RP, a more general statement holds.

Exercise: Prove that ZPP = RP ∩ coRP. (Indeed, ZPP ⊆ RP, as well as ZPP ⊆ coRP, follows by a trivial transformation of the ZPP-machine. On the other hand, RP ∩ coRP ⊆ ZPP can be proved by combining the two machines guaranteed for a set in RP ∩ coRP.)
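A sketch of the combination step in the exercise (written for these notes; rp_machine and corp_machine are hypothetical randomized procedures satisfying the RP and coRP conditions for the same set L):

    def zpp_machine(x, rp_machine, corp_machine):
        """One round of a zero-error decider for L, given an RP-machine
        (never accepts x outside L) and a coRP-machine (never rejects x in L).
        Returns 1, 0, or '?' (the don't-know symbol)."""
        if rp_machine(x) == 1:
            return 1      # the RP-machine accepts only members of L
        if corp_machine(x) == 0:
            return 0      # the coRP-machine rejects only non-members of L
        return '?'        # inconclusive round; happens with probability <= 1/2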

Randomized space complexity

The class RL (Random Log-Space) is defined analogously to the class NL, and is indeed contained in the latter. Specifically, the syntax of Random Log-Space machines is identical to that of Non-deterministic Log-Space machines, but the acceptance condition is probabilistic (as in the case of RP). In addition, we need to require explicitly that the machine runs in polynomial time (or else RL extends up to NL).

Recall that Directed Connectivity is complete for NL (under log-space reductions). Below we show that undirected connectivity is solvable in RL. Specifically, consider the set of triples (G, s, t) such that the vertices s and t are connected in the undirected graph G. On input (G, s, t), the randomized log-space algorithm starts a poly(|G|)-long random walk at vertex s, and accepts the triple if and only if the walk passes through vertex t. By a random walk we mean that, at each step, we select uniformly one of the neighbors of the current vertex and move to it. Observe that the algorithm can be implemented in logarithmic space (because we only need to store the current vertex as well as the number of steps taken so far), and that we never accept (G, s, t) in case s and t are not connected. We claim that if s and t are connected in G = (V, E) then a random walk of length O(|V|·|E|) starting at s passes through t with probability at least 1/2. It follows that undirected connectivity is indeed in RL.
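A direct sketch of the random-walk decider (written for these notes; it stores the whole graph for convenience, whereas the log-space machine keeps only the current vertex and a step counter):

    import random

    def random_walk_connected(adj, s, t, steps):
        """Accept iff a random walk of the given length from s hits t.
        adj: dict vertex -> list of neighbours (undirected graph).
        One-sided error: never accepts when s and t are disconnected;
        steps on the order of |E|*|V| suffices for acceptance prob. >= 1/2."""
        v = s
        for _ in range(steps):
            if v == t:
                return True
            if not adj[v]:            # isolated vertex: nowhere to go
                return False
            v = random.choice(adj[v])
        return v == t

    # 4-cycle: 0-1-2-3-0
    adj = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
    print(random_walk_connected(adj, 0, 2, steps=8 * 4 * 4))  # very likely True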

Recall that, w.l.o.g., a non-deterministic log-space machine need only run for polynomial time. Such a computation can be simulated by a randomized log-space machine that repeatedly guesses non-deterministic moves and simulates the original machine on them. Note that we expect at most 2^t tries before we guess an accepting t-time computation, where t is polynomial in the input length. But what if there are no accepting t-time computations? To halt with a probabilistic rejecting verdict, we should implement a counter that counts till 2^t, but we need to do so within space O(log t) rather than t (which is easy). In fact, it suffices to have a randomized counter that, with high probability, counts to approximately 2^t. This can be implemented by tossing t coins until all show us heads: the expected number of times we need to repeat the experiment is 2^t, and we can implement this by a counter that counts till t, using space log_2 t.

On proving the Random Walk Claim: Indeed, this has little to do with the current course. Consider the connected component of vertex s, denoted G' = (V', E'). For any pair (u, v), let T_{u,v} be a random variable representing the number of steps taken in a random walk starting at u until v is first encountered. First verify that E[T_{u,v}] <= 2|E'| for any u, v such that {u, v} ∈ E'. Next, letting cover(G') be the expected number of steps in a random walk starting at s and ending when the last of the vertices of V' is encountered, and C be any directed cyclic tour that visits all vertices in G', we have cover(G') <= Σ_{(u,v)∈C} E[T_{u,v}] <= |C| · 2|E'|. Letting C be a traversal of some spanning tree of G', we conclude that cover(G') < 4 · |E'| · |V'|. Thus, with probability at least 1/2, a random walk of length 8 · |E'| · |V'| starting at s visits all vertices of G'.

(To verify the first fact: for example, let C_{u,v}(n) be a random variable counting the number of minimal u-to-v sub-paths within a random walk of length n, where the walk starts at the stationary vertex distribution (assuming the graph is not bipartite, or is slightly modified otherwise). On one hand, E[T_{u,v}] = lim_{n→∞} n/E[C_{u,v}(n)], due to the memoryless property of the walk. On the other hand, E[C_{u,v}(n)] is lower-bounded by the expected number of times that the edge (v, u) was traversed from v to u in such an n-step walk, where the latter expected number equals n/(2|E'|), because each directed edge appears in each step of the walk with equal probability. It follows that E[T_{u,v}] <= lim_{n→∞} n/(n/(2|E'|)) = 2|E'|.)

Lecture

Non-Uniform Complexity

All complexity classes considered so far are "uniform", in the sense that each set in each of these classes was defined via one finite machine (or finite expression), which is applied to all input lengths. This is indeed in agreement with the basic algorithmic paradigm of designing algorithms that can handle all inputs.

In contrast, non-uniform complexity investigates what happens when we allow the use of a different algorithm for each input length. Indeed, in such a case we must bound the description size of the algorithm; otherwise, any problem could be solved by incorporating in the algorithm the answers to all (finitely many) inputs of the adequate length. By considering non-uniform complexity we are placing an upper bound on what can be done by the corresponding uniform-complexity class. The hope is that, by abstracting away the evasive uniformity condition, we will get a finite combinatorial structure that we may be able to understand.

Circuits and advice

Focusing on non-uniform polynomial-time, we mention two standard ways of defining non-uniform complexity classes. The first way is by considering families of Boolean circuits (as in an earlier lecture). Specifically, L is said to be in non-uniform polynomial-time, denoted P/poly, if there exists an infinite sequence of Boolean circuits C_1, C_2, ... such that, for some polynomial p, the following three conditions hold:

1. The circuit C_n has n inputs and one output.

2. The size (e.g., number of edges) of the circuit C_n is at most p(n).

3. For every x ∈ {0,1}^n, it holds that C_n(x) = 1 if and only if x ∈ L.

That is, C_n is a non-trivial "algorithm" (i.e., it cannot explicitly encode all answers) for deciding the membership in L of n-bit long strings. However, although C_n has size at most p(n), it is not clear whether one can construct C_n in poly(n)-time, or at any time (see below).

An alternative way of defining P/poly proceeds by considering machines that take "advice". That is, we consider deterministic polynomial-time machines that get two inputs, where the second input (i.e., the advice) has length that is at most polynomial in (the length of) the first input. The advice may only depend on the input length, and thus it cannot explicitly encode the answers to all inputs of that length. Specifically, L ∈ P/poly if there exists a deterministic polynomial-time machine M and an infinite sequence of advice strings a_1, a_2, ... such that, for some polynomial p, the following conditions hold:

1. The length of a_n is at most p(n).

2. For every x ∈ {0,1}^n, it holds that M(x, a_n) = 1 if and only if x ∈ L.

Exercise: Prove that the two formulations of P/poly are indeed equivalent. Furthermore, prove that, without loss of generality, the machine M as above may be a universal machine.

The power of non-uniformity

Waiving the uniformity condition allows non-uniform classes to contain non-recursive sets. This is true for P/poly, as well as for most reasonable non-uniform classes, and is due to the obvious reason that there exist non-recursive unary sets. Specifically, any unary set L ⊆ {1}* (possibly non-recursive) can be decided by a linear-time algorithm that uses 1-bit long advice (i.e., a_n = χ_L(1^n), and M(x, a_{|x|}) = 1 if and only if both x = 1^{|x|} and a_{|x|} = 1).

On the other hand, the existence of sets that are not in P/poly can be proven in a more concrete way than the corresponding statement for P. Fixing any super-polynomial and sub-exponential function f, we observe that the number of possible f(n)-bit long advice strings is much smaller than the number of possible subsets of {0,1}^n, whereas these advice strings account for all the sets that P/poly may recognize (using a universal machine).

We took it for granted that P ⊆ P/poly, which is indeed true (e.g., by using empty advice strings). The fact that P/poly also contains BPP is less obvious. Before proving this fact, let us mention that it is widely believed that P/poly does not contain NP, and indeed proving the latter conjecture was suggested as a good way for establishing that P ≠ NP. (Whether or not this way is a good one is controversial.)

Theorem: BPP ⊆ P/poly.

Proof: As in the proof of the theorem asserting that BPP is in Σ_2, we consider an adequate amplification of BPP. Here, for L ∈ BPP, we consider (by suitable error-reduction) a probabilistic polynomial-time algorithm A such that Pr[A(x) ≠ χ_L(x)] < 2^{-|x|}. Again, let us consider the residual deterministic two-input algorithm A' such that A'(x, r) equals the output of A on input x and random choices r ∈ {0,1}^{ℓ(|x|)}. Then, by a trivial counting argument, there exists a string r ∈ {0,1}^{ℓ(n)} such that A'(x, r) = χ_L(x) for all x's of length n. Using this string r as the advice for n-bit long inputs, we are done.
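The "trivial counting argument" is just a union bound over the 2^n inputs of length n, made explicit here for these notes:

    Pr_r [ ∃x ∈ {0,1}^n : A'(x, r) ≠ χ_L(x) ]  <=  Σ_{x ∈ {0,1}^n} Pr_r [ A'(x, r) ≠ χ_L(x) ]  <  2^n · 2^{-n}  =  1.

Hence some fixed r ∈ {0,1}^{ℓ(n)} errs on no input of length n, and that r can serve as the advice a_n.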

Uniformity

The non-uniform aspect of the definition of P/poly is the lack of requirements regarding the constructibility of the circuits (resp., advice). As a sanity check, we note that requiring that these objects be polynomial-time constructible results in a cumbersome definition of P. That is, suppose that we require that there is a polynomial-time algorithm A that, given 1^n, outputs the circuit C_n (resp., the advice a_n) for deciding L ∈ P/poly (as per the definitions above). Then, combining A with the standard circuit-evaluation algorithm (resp., with the advice-taking machine M), we obtain an ordinary polynomial-time algorithm for deciding L.

Evidence that P/poly does not contain NP

Recall that a major motivation towards studying P/poly is the desire to prove that P/poly does not contain NP (and thus also that BPP, which contains P, does not contain NP). In view of the fact that P/poly contains non-recursive sets, one may wonder how feasible the conjecture that P/poly does not contain NP is. It would have been best if we knew that NP ⊆ P/poly if and only if P = NP. But we only know that NP ⊆ P/poly implies a collapse of the Polynomial-time Hierarchy. That is:

Theorem: NP ⊆ P/poly implies that PH = Σ_2.

Proof sketch: We show that Π_2 ⊆ Σ_2, and the claim follows by the collapse theorem of the previous lecture. Suppose that L ∈ Π_2, and let us consider the corresponding quantified expression for x ∈ L: ∀y ∃z R(x, y, z), where y, z ∈ {0,1}^{poly(|x|)}. Let L' = {(x, y) : ∃z R(x, y, z)}, and observe that L' is in NP, and thus in P/poly. Thus, x ∈ L if and only if, for m = poly(|x|), there exists a poly(m)-size circuit C_m for deciding L' ∩ {0,1}^m such that, for all y's, it holds that C_m(x, y) = 1. The above expression is almost of the adequate (i.e., Σ_2) form, except that we need to check that C_m is indeed correct on all inputs of length m. Suppose that L' were "downwards self-reducible"; that is, that deciding whether w ∈ L' could be reduced to deciding membership in L' of strings shorter than w. Then we could have revised the above expression and asserted that x ∈ L if and only if there exists a sequence of polynomial-size circuits C_1, ..., C_m such that:

1. For all y's, it holds that C_m(x, y) = 1.

2. For i = 1, ..., m, the circuit C_i correctly determines membership in L', where the correctness of C_i is expressed by saying that, for all w ∈ {0,1}^i, the value of C_i(w) is consistent with the values obtained by the downwards self-reduction, as answered by the already verified circuits C_1, ..., C_{i-1}.

However, we have no reason to assume that L' is self-reducible. What we do instead is reduce L' to SAT, and apply the argument to SAT, using its polynomial-size circuits (which exist by the hypothesis) and its downwards self-reducibility (which is a very natural procedure). Specifically, let f be a Karp-reduction of L' to SAT. Thus, x ∈ L if and only if, for all y, f(x, y) ∈ SAT. Using the hypothesis, we have SAT ∈ P/poly, and thus there exists a sequence of polynomial-size circuits C_1, C_2, ... for SAT. Now, we assert that x ∈ L if and only if there exists a sequence of polynomial-size circuits C_1, ..., C_m, where m = |f(x, y)|, such that the following two conditions hold:

1. For all y's of adequate length, C_m(f(x, y)) = 1.

2. For i = 1, ..., m, the circuit C_i correctly decides membership of i-bit long strings in SAT. Note that the correctness condition for C_i can be expressed as follows: for every i-bit long formula φ, it holds that C_i(φ) = 1 if and only if either C_{i_0}(φ_0) = 1 or C_{i_1}(φ_1) = 1, where φ_0 (resp., φ_1) is the formula obtained from φ by replacing its first variable with 0 (resp., 1), and i_0 (resp., i_1) is the length of the resulting formula, after the straightforward simplifications that necessarily occur after instantiating a variable.

Observe that the expression obtained for membership in L is indeed of the Σ_2 form. The theorem follows.

Reductions to sparse sets

Another way of looking at P/poly is as the class of sets that are Cook-reducible to a sparse set, where a sparse set is a set that contains at most polynomially many strings of each length. The reason for stressing the fact that we refer to Cook-reductions will be explained below. Let us first establish the validity of the above claim.

Proposition: L ∈ P/poly if and only if L is Cook-reducible to some sparse set.

Proof sketch: Suppose that L ∈ P/poly, and suppose that n is sufficiently large. Then we can encode the n-th advice string (i.e., a_n) in the first |a_n| strings of the n-bit slice of a set S (i.e., by placing the i-th n-bit string in S ∩ {0,1}^n if and only if the i-th bit of a_n equals 1). Observe that S is indeed sparse (because |a_n| = poly(n)). On input x, the reduction first retrieves the advice string a_{|x|}, by making polynomially many |x|-bit long queries to S, and decides according to the advice-taking machine M(x, a_{|x|}).

In case L is Cook-reducible to a sparse set S, we let the n-th advice encode the list of all the strings in S that have length at most q(n), where q is the polynomial bounding the running time of the reduction. Given this advice, which is of length Σ_{i<=q(n)} |S ∩ {0,1}^i| · i = poly(n), the advice-taking machine can emulate the answers of the oracle machine of the reduction, and thus decide L.

As a direct corollary to the Proposition, we obtain:

Corollary: SAT is Cook-reducible to a sparse set if and only if NP ⊆ P/poly.

Combining the Corollary with the previous Theorem, it follows that SAT cannot be Cook-reducible to a sparse set, unless the Polynomial-time Hierarchy collapses.

Perspective: Karp-reductions to sparse sets

We have stressed the fact that we refer to Cook-reductions because, by the Corollary above, SAT is Cook-reducible to a sparse set if and only if NP ⊆ P/poly. In contrast, it is known that SAT is Karp-reducible to a sparse set if and only if NP = P. Thus, the difference between Cook and Karp reductions is reflected in the difference between NP ⊆ P/poly and NP = P.

Theorem: SAT is Karp-reducible to a sparse set if and only if NP = P.

Proof of a special case: Clearly, if NP = P then SAT is Karp-reducible to any non-trivial set (e.g., to the set {1}). We establish the opposite direction only for the special case in which SAT is Karp-reducible to some set S such that S is a subset of a sparse set G ∈ P. (Such a set S is called guarded, and S ⊆ {1}* is indeed a special case.) Specifically, using the Karp-reduction of SAT to S, we present a deterministic polynomial-time decision procedure for SAT. The procedure conducts a DFS on the tree of all possible partial truth assignments to the input formula, while truncating the search at nodes that are roots of subtrees that contain no satisfying assignment (at the leaves). The key observation is that each internal node, which yields a formula derived from the initial formula by instantiating the corresponding partial truth assignment, is mapped by the reduction either to a string not in G, in which case we conclude that the subtree contains no satisfying assignments, or to a string in G, in which case we do not (yet) know what to do. However, once we backtrack from this internal node, we know that the corresponding element of G is not in S, and we will never again extend a node mapped to this element. Specifically, let φ be the input formula, and φ_τ denote the formula resulting from φ by setting its first |τ| variables according to the partial truth assignment τ. Then the procedure proceeds as follows, using the Karp-reduction f of SAT to S.

(For an n-variable formula φ, the leaves of the tree correspond to all possible n-bit long strings, and an internal node corresponding to τ is the parent of the nodes corresponding to τ0 and τ1.)

Initialization: τ = λ (the empty assignment) and B = ∅, where τ is a partial truth assignment for which we wish to determine whether or not φ_τ ∈ SAT, and B ⊆ G \ S is a set of strings that were already proved not to be in S.

The following steps are recursive, and return a Boolean value representing whether or not φ_τ ∈ SAT.

Internal node: Determine whether or not φ_τ ∈ SAT according to the following three cases:

1. If f(φ_τ) ∉ G, then return the value false.
   (Since S ⊆ G, we have f(φ_τ) ∉ S, and, by the validity of the reduction, φ_τ ∉ SAT.)

2. If f(φ_τ) ∈ B, then return the value false.
   (Since B ⊆ G \ S, we have f(φ_τ) ∉ S, and, by the validity of the reduction, φ_τ ∉ SAT.)

3. Otherwise (i.e., f(φ_τ) ∈ G \ B), invoke two recursive calls, for φ_{τ0} and φ_{τ1} respectively.
   If both calls have returned false and f(φ_τ) ∈ G \ B, then add f(φ_τ) to B (since φ_τ ∉ SAT holds). (Actually, if the first call returns true, then the second call does not take place; otherwise the procedure might visit all satisfying assignments and consequently run for exponential time.) In any case, return the OR of the values returned by the recursive calls.

We stress that only the third case invokes recursive calls.

Bottom level: If the constant formula φ_τ is false and f(φ_τ) ∈ G \ B, then add f(φ_τ) to B. In any case, return the value of φ_τ.

It is easy to verify that the procedure returns the correct answer. The running-time analysis is based on the observation that if τ and τ' are not prefixes of one another and f(φ_τ) = f(φ_{τ'}), then it cannot be that Case 3 was applied to both of them. Thus, the number of internal nodes for which Case 3 was applied is at most the depth of the tree times |∪_{i<=m} (G_i \ S)| <= Σ_{i<=m} |G_i| = poly(m), where G_i = G ∩ {0,1}^i and m = |f(φ)| = poly(|φ|).

(We may re-evaluate the condition f(φ_τ) ∈ B after obtaining the answer of the first recursive call, but this is not really necessary.)
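A compact sketch of this pruned DFS (written for these notes; f is a hypothetical Karp-reduction and in_G a polynomial-time membership test for the guarding sparse set, both passed in as functions; the formula is represented abstractly by an evaluator on full assignments):

    def sat_via_guarded_reduction(num_vars, eval_formula, f, in_G):
        """Decide satisfiability of phi, given a Karp-reduction f of SAT to a
        set S guarded by a sparse set G in P (membership test in_G).
        eval_formula(tau): value of phi under the full assignment tau (bit tuple).
        f(tau): the reduction applied to phi restricted by the partial assignment tau."""
        B = set()                              # images proved to lie outside S

        def solve(tau):
            if len(tau) == num_vars:           # leaf: a full assignment
                value = eval_formula(tau)
                if not value and in_G(f(tau)):
                    B.add(f(tau))
                return value
            y = f(tau)                         # internal node
            if not in_G(y) or y in B:          # Cases 1 and 2: prune
                return False
            if solve(tau + (0,)) or solve(tau + (1,)):   # Case 3
                return True
            B.add(y)                           # subtree exhausted, so y is not in S
            return False

        return solve(())

    # Toy run: phi(x1, x2) = x1 AND NOT x2, with a dummy reduction and guard.
    phi = lambda t: t[0] == 1 and t[1] == 0
    print(sat_via_guarded_reduction(2, phi, f=lambda t: t, in_G=lambda y: True))  # True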

Lecture

Counting Classes

The definition of #P

A natural computational problem associated with an NP-relation R is to determine the number of solutions for a given instance; that is, given x, determine the cardinality of R(x) = {y : (x, y) ∈ R}. This problem is the counting problem associated with R. Certainly, the counting problem associated with R is not easier than the problem of deciding membership in L_R = {x : ∃y s.t. (x, y) ∈ R}, which can be cast as determining, for a given x, whether |R(x)| is positive or zero.

The class #P can be defined as a class of functions that count the number of solutions in NP-relations. That is, f ∈ #P if there exists an NP-relation R such that f(x) = |R(x)| for all x's. Alternatively, we can define #P as a class of sets, where for every NP-relation R the set #R = {(x, k) : |R(x)| >= k} is in #P. (Exercise: Formulate and show the equivalence between the two definitions.)

Relation to PP: The class #P is related to a probabilistic class, denoted PP, that was not defined in the lecture on randomized complexity classes. We say that L ∈ PP if there exists a probabilistic polynomial-time algorithm A such that, for every x, it holds that Pr[A(x) = 1] > 1/2 if and only if x ∈ L (or, alternatively, Pr[A(x) = χ_L(x)] > 1/2 for every x). (Exercise: show the equivalence of the two formulations.) Recall that, in contrast, L ∈ BPP requires that Pr[A(x) = χ_L(x)] >= 2/3 for every x. Notice that any L ∈ PP can be decided by a polynomial-time oracle machine that is given oracle access to #R, where R describes the actions of the PP-algorithm (i.e., (x, r) ∈ R iff A(x) accepts when using coins r). On the other hand, each such set #R is in PP, by virtue of a minor modification to the following algorithm (which refers to R ⊆ ∪_{n∈N} ({0,1}^n × {0,1}^{m(n)})): on input (x, k), with probability one half, select y uniformly in {0,1}^{m(|x|)} and accept iff (x, y) ∈ R; and otherwise (i.e., with probability 1/2) accept with probability exactly 1 - k·2^{-m(|x|)}. (Exercise: Provide the missing details for all the above claims.)
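A sketch of that probabilistic decision procedure for the set {(x, k) : |R(x)| >= k} (written for these notes; R_check(x, y) is a hypothetical polynomial-time test for the NP-relation, m is the witness length, and the tie-breaking constant reflects one way to implement the "minor modification" mentioned above):

    import random

    def pp_count_at_least(x, k, R_check, m):
        """One run of a PP-style decider for |R(x)| >= k.
        Accepts with probability 1/2 * |R(x)|/2^m + 1/2 * (1 - (k - 1/2)/2^m),
        which exceeds 1/2 exactly when |R(x)| >= k."""
        if random.random() < 0.5:
            y = "".join(random.choice("01") for _ in range(m))
            return 1 if R_check(x, y) else 0
        return 1 if random.random() < 1 - (k - 0.5) / 2**m else 0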

#P-complete problems

We say that a computational problem is #P-complete if it is in #P and every problem in #P is reducible to it. Thus, for an NP-relation R, the problem #R (which is always in #P) is #P-complete if, for any NP-relation R', it holds that #R' is reducible to #R. Using the standard Karp-reductions, it is easy to show that, for any of the standard NP-complete relations R, the set #R is #P-complete. This is the case because the standard reductions (or minor modifications of them) are parsimonious (i.e., they preserve the number of solutions). In particular:

Proposition: #SAT is #P-complete, where (φ, k) ∈ #SAT if and only if φ has at least k different satisfying assignments.

(Exercise: Verify that the standard reduction of any NP-relation to SAT is parsimonious; that is, for any NP-relation R, the standard reduction of L_R to SAT maps each x to a formula having exactly |R(x)| satisfying assignments.)

As stated above, the Proposition is merely a consequence of the nature of the reductions used in the standard context of NP-completeness results. Specifically, it is the case that the same reductions used to demonstrate the NP-completeness of search problems can be used to show the #P-completeness of the corresponding counting problems. Consequently, hard (i.e., NP-complete) search problems give rise to hard (i.e., #P-complete) counting problems. Interestingly, there are hard counting problems (i.e., #P-complete problems) for which the corresponding search problem is easy. For example, whereas the problem of finding a maximum matching in a given graph is easy (i.e., solvable in polynomial time), the corresponding counting problem is hard (i.e., #P-complete):

Theorem: The problem of counting the number of perfect matchings in a bipartite graph is #P-complete. Equivalently, the problem of computing the permanent of integer matrices with 0/1 entries is #P-complete.

Needless to say, the reduction used in proving this Theorem is not parsimonious (or else we could have used it to reduce NP to the problem of deciding whether a given graph has a perfect matching). For the same reason, the recent polynomial-time algorithm for approximating the permanent of non-negative matrices does not yield polynomial-time approximation algorithms for all of #P. (See Jerrum, Sinclair and Vigoda, "A Polynomial-Time Approximation Algorithm for the Permanent of a Matrix with Non-Negative Entries", in Proc. of the 33rd STOC.)

A randomized reduction of Approximate-#P to NP

By an approximation for a counting problem #R in #P we mean a procedure that, on input x, outputs a good approximation, denoted A(x), of |R(x)|. Specifically, we require that, with high probability, the ratio A(x)/|R(x)| be bounded. For many natural NP-relations (and in particular for R_SAT), the following notions are all equivalent:

1. With probability at least 2/3, it holds that A(x) is within a factor of 2 of |R(x)| (i.e., 1/2 <= A(x)/|R(x)| <= 2).

2. With probability at least 1 - exp(-|x|), it holds that 1/2 <= A(x)/|R(x)| <= 2.

3. With probability at least 1 - exp(-|x|), it holds that |x|^{-c} <= A(x)/|R(x)| <= |x|^{c}, where c is any fixed constant.

4. With probability at least 1 - exp(-|x|), it holds that 2^{-|x|^c} <= A(x)/|R(x)| <= 2^{|x|^c}, where c < 1 is any fixed constant.

(Exercise: Show that the first item is equivalent to the ability to get an A(x) satisfying 1/√2 <= A(x)/|R(x)| <= √2. Note that, for some constant c that depends on R, the ability to approximate |R(x)| to within a factor of 2^{|x|^c} merely requires the ability to distinguish the case |R(x)| = 0 from the case |R(x)| >= 1, since |R(x)| <= 2^{|x|^c} always holds. Exercise: Show that the ability to approximate every |R'(x')| to within a factor of 2^{|x'|^c} implies the ability to approximate |R(x)| to within a factor of 2.)

Item 1 implies Item 2 by using straightforward error-reduction, as in the case of BPP. To show that Item 3 implies Item 2 (resp., Item 4 implies Item 3), we use the fact that, for many natural NP-relations, many instances can be encoded in one; that is, R'(⟨x_1, ..., x_t⟩) = {⟨y_1, ..., y_t⟩ : ∀i, y_i ∈ R(x_i)}. (For example, the number of satisfying assignments of a formula consisting of t formulae over distinct sets of variables is the product of the numbers of satisfying assignments of these formulae.) Thus, suppose that, for every x', we know how to approximate |R'(x')| to within a factor of f(|x'|) (where f is the weaker factor), and we want to approximate |R(x)| to within a factor of 2, for every x. Then we form x' as a sequence of t copies of x, for a sufficiently large t = poly(|x|), and obtain a factor-f(|x'|) approximation of |R'(x')| = |R(x)|^t. Taking the t-th root of this approximation, we obtain |R(x)| up to a factor of f(|x'|)^{1/t}, which is at most 2 for an adequate choice of t.

In view of the above, we focus on providing any good approximation to the problem of counting the number of satisfying assignments of a Boolean formula. The same techniques apply to any NP-complete problem.

Theorem: The counting problem #SAT can be approximated up to a constant factor by a probabilistic polynomial-time oracle machine with oracle access to SAT.

Proof Sketch: Given a formula φ on n variables, we approximate |SAT(φ)| (the number of satisfying assignments of φ) by trying all possible powers of 2 as candidate approximations. That is, for i = 0, 1, ..., n, we check whether 2^i is a good approximation of |SAT(φ)|. This is done by uniformly selecting a good hashing function h : {0,1}^n -> {0,1}^i, and checking whether there exists a truth assignment τ for φ such that the following two conditions hold:

1. the truth assignment satisfies φ (i.e., φ(τ) = true), and

2. h hashes τ to the all-zero string (i.e., h(τ) = 0^i).

These two conditions can be encoded in a new formula (e.g., by reducing the above NP-condition to SAT). The new formula is satisfiable if and only if there exists an assignment τ to φ that satisfies the above conditions. Thus, the answer to the above question (i.e., whether such a τ exists) is obtained by making a corresponding query (i.e., the new formula) to the SAT oracle.

In the analysis, we assume that the hashing function is good in the sense that, for any S ⊆ {0,1}^n, with high probability a randomly selected hashing function h satisfies |{e ∈ S : h(e) = 0^i}| ≈ |S|/2^i. In particular, a randomly selected hashing function maps each string to 0^i with probability 2^{-i}, and the mapping of different strings is pairwise independent. (For further details, see the discussion of hashing in a later lecture.)

Note that if |SAT(φ)| << 2^i, then a randomly selected hashing function is unlikely to map any of the (fewer than 2^i) satisfying assignments of φ to 0^i. Specifically, the probability that any specific assignment is mapped to 0^i equals 2^{-i}, and so the bad event occurs with probability less than |SAT(φ)|/2^i << 1, which can be further reduced by repeating the random experiment.

On the other hand, if |SAT(φ)| >> 2^i, then a randomly selected hashing function is likely to map some of the (more than 2^i) satisfying assignments of φ to 0^i. This can be proven using the pairwise-independence property of the mapping induced by a random hashing function.

Thus, with high probability, the above procedure outputs a value v = 2^i such that i is within an additive constant of log_2 |SAT(φ)|. We stress that the entire argument can be adapted to any NP-complete problem. Furthermore, smaller approximation factors can be obtained directly, by using tricks as in the discussion of the equivalent notions above.

(Alternatively, for some popular hashing functions, the condition h(z) = 0^i is easily transformed into CNF. Thus, we obtain the new formula φ'(z_1, ..., z_n) = φ(z_1, ..., z_n) ∧ (h(z_1, ..., z_n) = 0^i).)
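A rough sketch of this hashing-based procedure (written for these notes; sat_oracle is a hypothetical SAT oracle that also accepts added parity constraints, and random XOR/parity constraints are used as one concrete choice of pairwise-independent hashing):

    import random

    def approx_count_sat(num_vars, clauses, sat_oracle, trials=5):
        """Hashing-based approximation of #SAT, up to a constant factor.
        sat_oracle(num_vars, clauses, parity_constraints) is a hypothetical
        oracle reporting satisfiability under the added parity constraints;
        i random parities h(z) = 0 play the role of the hash h(z) = 0^i."""
        if not sat_oracle(num_vars, clauses, []):
            return 0                           # unsatisfiable: zero assignments
        best = 0
        for i in range(num_vars + 1):
            successes = 0
            for _ in range(trials):
                # i random parity constraints: (subset of variables, target bit)
                parities = [([v for v in range(1, num_vars + 1) if random.random() < 0.5],
                             random.randint(0, 1))
                            for _ in range(i)]
                if sat_oracle(num_vars, clauses, parities):
                    successes += 1
            if 2 * successes > trials:
                best = i                       # constrained instance still satisfiable
        return 2 ** best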

A randomized reduction of SAT to UniqueSAT

The widely believed intractability of SAT cannot be due to instances that have very many satisfying assignments. For example, satisfying assignments for n-variable formulae having at least 2^n/poly(n) satisfying assignments can be found in probabilistic polynomial time, by selecting poly(n) assignments at random. Going to the other extreme, one may ask whether SAT instances having very few satisfying assignments (e.g., a unique satisfying assignment) can be hard. As shown below, the answer is positive: we show that the ability to solve such instances yields the ability to solve arbitrary instances.

In order to formulate the above discussion, we need to introduce the notion of a promise problem, which extends (or rather relaxes) the notion of a decision problem. A promise problem is a pair of disjoint subsets, denoted (Π_yes, Π_no). A deterministic machine M is said to solve such a problem if M(x) = 1 for every x ∈ Π_yes and M(x) = 0 for every x ∈ Π_no, whereas nothing is required in case x ∉ Π_yes ∪ Π_no (i.e., x violates the promise). The notion extends naturally to probabilistic machines, oracle machines, and so on. When we say that some problem reduces to the promise problem (Π_yes, Π_no), we mean that the reduction yields the correct output regardless of the way in which queries outside of Π_yes ∪ Π_no are answered. This is consistent with requiring nothing from a machine that solves (Π_yes, Π_no) in case the input is not in Π_yes ∪ Π_no.

The computational problem of distinguishing instances with a unique solution from instances with no solution yields a natural promise problem. For example, uniqueSAT (or uSAT) is the promise problem with yes-instances being formulae having a unique satisfying assignment, and no-instances being formulae having no satisfying assignment.

Theorem: SAT is randomly reducible to uSAT.

Proof Sketch: We present a probabilistic polynomial-time oracle machine that solves SAT using an oracle to uSAT. Actually, it is easier to first randomly reduce SAT to fewSAT, where fewSAT is the promise problem with yes-instances being formulae having at least 1 and at most c satisfying assignments, for some fixed (small) constant c, and no-instances being formulae having no satisfying assignment.

Observe that the procedure described in the proof of the previous Theorem can be easily adapted to do the work. Specifically, we accept the given SAT instance φ if and only if any of the oracle invocations returns the value true. Note that the latter event may occur only if φ is satisfiable (because, when φ is unsatisfiable, all queries are unsatisfiable). On the other hand, if φ has k >= 1 satisfying assignments, then in iteration i = ⌊log_2 k⌋ + O(1), with high probability, the query is satisfiable and has at most a constant number of satisfying assignments (i.e., it is a yes-instance of fewSAT).

To finish up the proof, we reduce fewSAT to uSAT. Given a formula φ, for i = 1, 2, ..., c, we construct a formula φ_i that has a unique satisfying assignment if and only if φ has exactly i satisfying assignments. For example, φ_i may consist of the conjunction of i copies of φ (over distinct sets of variables) and a condition imposing a lexicographic order between the corresponding assignments. (In order to take care of the case of small k, we also query the fewSAT oracle about φ itself.)

(The lexicographic order between two assignments x = x_1 ... x_n and y = y_1 ... y_n can be expressed as ⋁_{j=1}^{n} [ (⋀_{k<j} (x_k = y_k)) ∧ (x_j < y_j) ]. E.g., for n = 2: (x_1 < y_1) ∨ ((x_1 = y_1) ∧ (x_2 < y_2)).)

Lecture

Space is more valuable than time

This lecture was not given. The intention was to prove the following result, which asserts that any computation requires strictly less space than time.

Theorem: Dtime(t) ⊆ Dspace(t/log t).

That is, any given deterministic multi-tape Turing Machine (TM) of time complexity t can be simulated by a deterministic TM of space complexity t/log t. A main ingredient in the simulation is the analysis of a pebble game on directed bounded-degree graphs.

Lecture

Circuit Depth and Space Complexity

This lecture was not given. The intention was to explore some of the relations between Boolean circuits and Turing machines. Specifically:

1. Define the complexity classes NC^i and AC^i (i.e., bounded versus unbounded fan-in circuits of polynomial size and O(log^i) depth), and compare their computational power. Point out the connection between uniform-NC and efficient parallel computation.

2. Establish a connection between the space complexity of a problem and the depth of circuits (with bounded fan-in) for the problem.

Historical Notes

For a historical discussion of the material presented in the lecture on hierarchy and gap theorems, the reader is referred to the textbook of Hopcroft and Ullman. Needless to say, the latter provides accurate statements and proofs of the hierarchy and gap theorems.

Space Complexity: The emulation of non-deterministic space-bounded machines by deterministic space-bounded machines is due to Savitch. The theorem NL = coNL was proved independently by Immerman and Szelepcsenyi.

The Polynomial-Time Hierarchy: The Polynomial-Time Hierarchy was introduced by Stockmeyer. A third equivalent formulation, via alternating machines, can be found in the literature.

Randomized Time Complexity: Probabilistic Turing Machines and the corresponding complexity classes (including BPP, RP, ZPP and PP) were first defined by Gill. The random-walk log-space algorithm for deciding undirected connectivity is due to Aleliunas et al. Additional examples of randomized algorithms and procedures can be found in the standard references.

The robustness of the various classes under various error thresholds was established using straightforward amplifications (i.e., running the algorithm several times using independent random choices). Randomness-efficient amplification methods, which use related random choices in the various runs, have been studied extensively since the mid 1980s.

The fact that BPP is in the Polynomial-time Hierarchy was proven independently by Lautemann and Sipser. We have followed Lautemann's proof. The ideas underlying Sipser's proof found many applications in complexity theory; in particular, they are used in the approximation procedure for #P, as well as in the emulation of general interactive proofs by public-coin ones.

Non-Uniform Complexity: The class P/poly was defined by Karp and Lipton as part of a general formulation of "machines which take advice". They have noted the equivalence to the traditional formulation of polynomial-size circuits, the effect of uniformity, as well as the effect of NP ⊆ P/poly on the Polynomial-time Hierarchy (i.e., the collapse theorem above). The theorem on Karp-reductions to sparse sets is due to Fortune.

The theorem asserting BPP ⊆ P/poly is attributed to Adleman, who actually proved that RP ⊆ P/poly using a more involved argument.

Counting Classes: The counting class #P was introduced by Valiant, who proved that computing the permanent of 0/1-matrices is #P-complete. Valiant's proof first establishes the #P-hardness of computing the permanent of integer matrices (the entries are actually restricted to a small fixed set of integers), and next reduces the computation of the permanent of integer matrices to the permanent of 0/1-matrices. A deconstructed version of Valiant's proof can be found in the literature.

The approximation procedure for #P is due to Stockmeyer, following an idea of Sipser. Our exposition follows further developments in the area. The randomized reduction of SAT to uniqueSAT is due to Valiant and Vazirani. Again, our exposition is a bit different.

Lecture Series III

The less traditional material

These lectures are based on research done in the 1980s and the 1990s.

The lectures on Probabilistic Proof Systems and Pseudorandomness are related to lectures that may be given as part of other courses (i.e., Foundations of Cryptography and Randomness in Computation, respectively), but the choice of material for the current course, as well as the perspective, would be different here.

Lecture

Probabilistic Proof Systems

Various types of probabilistic proof systems have played a central role in the development of computer science in the last decade. In these notes, we concentrate on three such proof systems: interactive proofs, zero-knowledge proofs, and probabilistically checkable proofs.

The notes for this lecture were adapted from various texts that I wrote in the past. In view of the fact that zero-knowledge proofs are covered at Weizmann in the Foundations of Cryptography course, I have only discussed IP and PCP in the current course. The actual notes I have used in the current course appear in a later section.

Introduction

The glory given to the creativity required to find proofs makes us forget that it is the less glorified procedure of verification which gives proofs their value. Philosophically speaking, proofs are secondary to the verification procedure; whereas, technically speaking, proof systems are defined in terms of their verification procedures.

The notion of a verification procedure assumes the notion of computation, and furthermore the notion of efficient computation. This implicit assumption is made explicit in the definition of NP, in which efficient computation is associated with (deterministic) polynomial-time algorithms.

Traditionally, NP is defined as the class of NP-sets. Yet, each such NP-set can be viewed as a proof system. For example, consider the set of satisfiable Boolean formulae. Clearly, a satisfying assignment π for a formula φ constitutes an NP-proof for the assertion "φ is satisfiable" (the verification procedure consists of substituting the variables of φ by the values assigned by π and computing the value of the resulting Boolean expression).

The formulation of NP-proofs restricts the "effective" length of proofs to be polynomial in the length of the corresponding assertions. However, longer proofs may be considered by padding the assertion with sufficiently many blank symbols. So it seems that NP gives a satisfactory formulation of proof systems (with efficient verification procedures). This is indeed the case if one associates efficient procedures with deterministic polynomial-time algorithms. However, we can gain a lot if we are willing to take a somewhat non-traditional step and allow probabilistic verification procedures. In particular:

1. Randomized and interactive verification procedures, giving rise to interactive proof systems, seem much more powerful (i.e., expressive) than their deterministic counterparts.

2. Such randomized procedures allow the introduction of zero-knowledge proofs, which are of great theoretical and practical interest.

3. NP-proofs can be efficiently transformed into a (redundant) form that offers a trade-off between the number of locations examined in the NP-proof and the confidence in its validity (which is captured in the notion of probabilistically checkable proofs).

In all the above-mentioned types of probabilistic proof systems, explicit bounds are imposed on the computational complexity of the verification procedure, which in turn is personified by the notion of a verifier. Furthermore, in all these proof systems, the verifier is allowed to toss coins and rule by statistical evidence. Thus, all these proof systems carry a probability of error; yet, this probability is explicitly bounded and, furthermore, can be reduced by successive application of the proof system.

Interactive Proof Systems

In light of the growing acceptability of randomized and distributed computations, it is only natural to associate the notion of efficient computation with probabilistic and interactive polynomial-time computations. This leads naturally to the notion of interactive proof systems, in which the verification procedure is interactive and randomized, rather than being non-interactive and deterministic. Thus, a "proof" in this context is not a fixed and static object, but rather a randomized (dynamic) process in which the verifier interacts with the prover. Intuitively, one may think of this interaction as consisting of "tricky" questions asked by the verifier, to which the prover has to reply "convincingly". The above discussion, as well as the actual definition, makes explicit reference to a prover, whereas a prover is only implicit in the traditional definitions of proof systems (e.g., NP-proofs).

The Definition

The main new ingredients in the definition of interactive proof systems are:

Randomization in the verification process.

Interaction between the verifier and the prover, rather than unidirectional communication from the prover to the verifier (as in the case of NP-proof systems).

The combination of both new ingredients is the source of power of the new definition: if the verifier does not toss coins, then the interaction can be collapsed to a single message; on the other hand, combining randomization with unidirectional communication yields a randomized version of NP-proof systems, called MA. We stress several other aspects:

The prover is computationally unbounded: As in NP, we start by not considering the complexity of proving.

The verifier is probabilistic polynomial-time: We maintain the paradigm that verification ought to be easy, alas we allow random choices (in our notion of easiness).

Completeness and Soundness: We relax the traditional soundness condition by allowing a small probability of being fooled by false proofs. The probability is taken over the verifier's random choices. We still require "perfect completeness"; that is, correct statements are proven with probability 1. The error probability, being a parameter, can be further reduced by successive repetitions.

We denote by IP the class of sets having interactive proof systems.

Variations: Relaxing the perfect completeness requirement yields a two-sided error variant of IP (i.e., error probability allowed also in the completeness condition). Restricting the verifier to send only "random" (i.e., uniformly chosen) messages yields the restricted notion of Arthur-Merlin interactive proofs (a.k.a. public-coin interactive proofs), denoted AM. However, both variants are essentially as powerful as the original one: we can get rid of the completeness error by adapting the proof of Theorem [ref], whereas the proof that public-coin interactive proofs are as powerful as general ones is significantly more involved.

An Example: interactive proof of Graph Non-Isomorphism

The problem (not known to be in NP): Proving that two graphs are isomorphic can be done by presenting an isomorphism, but how do you prove that no such isomorphism exists?

The construction (the two-object protocol): If you claim that two objects are different, then you should be able to tell which is which when I present them to you in random order. In the context of the Graph Non-Isomorphism interactive proof, the two (supposedly different) objects are defined by taking random isomorphic copies of each of the input graphs. If these graphs are indeed non-isomorphic then the objects are different (the distributions have distinct support), else the objects are identical.
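
The following is a minimal Python sketch of one round of this protocol, under the stated two-object idea. The graph representation, the brute-force prover (standing in for the computationally unbounded prover), and all function names are illustrative choices of mine, not part of the notes.

```python
# One round of the Graph Non-Isomorphism protocol sketched above.
# Graphs are adjacency-set dicts; the brute-force prover is feasible only for tiny graphs.
import itertools, random

def permute(graph, perm):
    """Return an isomorphic copy of `graph` under the vertex relabeling `perm`."""
    return {perm[u]: {perm[v] for v in nbrs} for u, nbrs in graph.items()}

def isomorphic(g1, g2):
    """Brute-force isomorphism test (exponential; prover side only)."""
    verts = sorted(g1)
    if len(verts) != len(g2):
        return False
    return any(permute(g1, dict(zip(verts, p))) == g2
               for p in itertools.permutations(sorted(g2)))

def gni_round(G1, G2):
    """Verifier's random challenge, prover's answer, verifier's verdict."""
    i = random.randint(1, 2)                               # verifier's secret coin
    chosen = G1 if i == 1 else G2
    perm = dict(zip(sorted(chosen), random.sample(sorted(chosen), len(chosen))))
    H = permute(chosen, perm)                              # random isomorphic copy, sent to the prover
    answer = 1 if isomorphic(H, G1) else 2                 # prover: identify the source graph
    return answer == i                                     # verifier accepts iff the answer matches

# If G1, G2 are non-isomorphic, the prover always convinces the verifier;
# if they are isomorphic, any prover is correct with probability at most 1/2 per round.
triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
path     = {0: {1}, 1: {0, 2}, 2: {1}}
print(all(gni_round(triangle, path) for _ in range(20)))   # expected: True
```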

Interactive proof of Non-Satisfiability

We show that coNP ⊆ IP by presenting an interactive proof for Non-Satisfiability (i.e., for the complement of SAT).

Arithmetization of Boolean (CNF) formulae: Given a Boolean (CNF) formula, we replace the Boolean variables by integer variables, their negations by one minus the variable, or-clauses by sums, and the top-level conjunction by a product. Note that false is associated with zero, whereas true is associated with a positive integer. To prove that the given formula is not satisfiable, we consider the sum, over all 0-1 assignments, of the resulting integer expression (and claim that this sum equals zero). Observe that the resulting arithmetic expression is a low-degree polynomial (i.e., the degree is at most the number of clauses), and that its value is bounded (i.e., exponentially in the number of clauses).
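
As a concrete illustration, here is a minimal Python sketch of this arithmetization; the DIMACS-style clause encoding and the helper names are my own conventions, not the notes'.

```python
# Arithmetization of a CNF formula, as described above.
# A formula is a list of clauses; each clause is a list of literals (+i for x_i, -i for its negation).
from itertools import product

def arithmetize(cnf, x):
    """Evaluate the arithmetized formula at an integer point x[1..n]:
    each OR-clause becomes a sum (negation -> 1 - x_i), the top-level AND a product."""
    value = 1
    for clause in cnf:
        value *= sum(x[abs(l)] if l > 0 else 1 - x[abs(l)] for l in clause)
    return value

def sum_over_cube(cnf, n):
    """Sum the arithmetized expression over all 0-1 assignments;
    the formula is unsatisfiable iff this sum is 0."""
    total = 0
    for bits in product((0, 1), repeat=n):
        assignment = {i + 1: bits[i] for i in range(n)}
        total += arithmetize(cnf, assignment)
    return total

phi = [[1, 2], [-1, 2], [1, -2], [-1, -2]]     # (x1 v x2)(~x1 v x2)(x1 v ~x2)(~x1 v ~x2): unsatisfiable
print(sum_over_cube(phi, 2))                   # expected: 0
```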

Moving to a Finite Field: Whenever we check equality between two integers in [0, M], it suffices to check equality mod q, where q > M. The benefit is that the arithmetic is now in a finite field (mod q), and so certain things are "nicer" (e.g., uniformly selecting a value). Thus, proving that a CNF formula is not satisfiable reduces to proving an equality of the following form:
$$\sum_{x_1\in\{0,1\}} \cdots \sum_{x_n\in\{0,1\}} \phi(x_1,\ldots,x_n) \equiv 0 \pmod{q},$$
where φ is a low-degree multivariate polynomial (and q is exponential in n).

The actual construction ("stripping" summations in iterations): In each iteration, the prover is supposed to supply the polynomial describing the expression in one (currently stripped) variable. (By the above observation, this is a low-degree polynomial, and so it has a short description.) The verifier checks that the polynomial is of low degree, and that it corresponds to the current value being claimed (i.e., p(0) + p(1) ≡ v). Next, the verifier randomly instantiates the variable, yielding a new value to be claimed for the resulting expression (i.e., v ← p(r), for a uniformly chosen r ∈ GF(q)). The verifier sends the uniformly chosen instantiation to the prover. (At the end of the last iteration, the verifier has a fully specified expression and can easily check it against the claimed value.)

That is, for i = 1, ..., n, the i-th iteration is intended to provide evidence that
$$\sum_{x_i,\ldots,x_n\in\{0,1\}} \phi(r_1,\ldots,r_{i-1},x_i,\ldots,x_n) \;\equiv\; v_{i-1} \pmod{q},$$
where r_1, ..., r_{i-1} and v_{i-1} are as determined in the previous i-1 iterations (and v_0 := 0). The prescribed prover is supposed to set
$$p_i(z) \;\stackrel{\rm def}{=}\; \sum_{x_{i+1},\ldots,x_n\in\{0,1\}} \phi(r_1,\ldots,r_{i-1},z,x_{i+1},\ldots,x_n)$$
and send p_i to the verifier, which checks that p_i(0) + p_i(1) ≡ v_{i-1} (mod q) (rejecting immediately if the equivalence does not hold), selects r_i at random in GF(q), sends it to the prover, and sets v_i := p_i(r_i) mod q. In the next iteration, the verifier expects to get evidence that
$$\sum_{x_{i+1},\ldots,x_n\in\{0,1\}} \phi(r_1,\ldots,r_i,x_{i+1},\ldots,x_n) \;\equiv\; v_i \pmod{q}.$$
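
Below is a minimal, self-contained Python sketch of this summation-stripping protocol, with both roles run by one procedure and the honest prover computing each p_i by brute-force summation (so it is feasible only for small n). The interface (phi as a black-box function, a per-variable degree bound d, a prime modulus q) and all names are my own illustrative choices.

```python
# Summation-stripping ("sum-check") protocol, as described above.
import random

def lagrange_eval(vals, r, q):
    """Evaluate the degree-<=d polynomial with p(j) = vals[j] for j = 0..d at the point r (mod prime q)."""
    d = len(vals) - 1
    total = 0
    for j in range(d + 1):
        num, den = 1, 1
        for m in range(d + 1):
            if m != j:
                num = num * (r - m) % q
                den = den * (j - m) % q
        total = (total + vals[j] * num * pow(den, -1, q)) % q
    return total

def sumcheck(phi, n, d, q, claimed):
    """Verify the claim: sum of phi over {0,1}^n equals `claimed` (mod q).
    phi takes a list of n field elements; d bounds its degree in each variable."""
    prefix, v = [], claimed % q
    for i in range(n):
        # Prover: p_i(z) = sum over x_{i+1..n} of phi(r_1..r_{i-1}, z, x_{i+1..n}), sent as values at 0..d.
        vals = []
        for z in range(d + 1):
            s = 0
            for rest in range(2 ** (n - i - 1)):
                tail = [(rest >> b) & 1 for b in range(n - i - 1)]
                s = (s + phi(prefix + [z] + tail)) % q
            vals.append(s)
        # Verifier: check p_i(0) + p_i(1) against the current claim, then instantiate the variable.
        if (vals[0] + vals[1]) % q != v:
            return False
        r = random.randrange(q)
        v = lagrange_eval(vals, r, q)
        prefix.append(r)
    # Final check: the fully instantiated expression against the last claimed value.
    return phi(prefix) % q == v

phi = lambda x: (x[0] * x[1] + x[2]) % 97
print(sumcheck(phi, n=3, d=1, q=97, claimed=6))   # true claim: accepted
print(sumcheck(phi, n=3, d=1, q=97, claimed=7))   # false claim: rejected (w.h.p. in general)
```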

Completeness of the above: When the claim holds, the prover has no problem supplying the correct polynomials, and this will lead the verifier to always accept.

Soundness of the above: It suffices to bound the probability that, for a particular iteration, the initial claim is false whereas the ending claim is correct. Both claims refer to the current summation expression being equal to the current value, where "current" means either at the beginning of the iteration or at its end. Let T be the actual polynomial representing the expression when stripping the current variable, and let p be any potential answer by the prover. We may assume that p(0) + p(1) ≡ v and that p is of low degree (as otherwise the verifier rejects). Using our hypothesis (that the initial claim is false), we know that T(0) + T(1) ≢ v. Thus, p and T are different low-degree polynomials, and so they may agree on very few points. In case the verifier's instantiation does not happen to be one of these few points, the ending claim is false too.

Open Problem (alternative proof of coNP ⊆ IP): Polynomials play a fundamental role in the above construction, and this trend has even deepened in subsequent works on PCP. It does not seem possible to abstract that role, which seems to be very annoying. I consider it important to obtain an alternative proof of coNP ⊆ IP; a proof in which all the underlying ideas can be presented at an abstract level.

The Power of Interactive Proofs

Theorem (The IP Characterization Theorem): IP = PSPACE.

Interactive Proofs for PSPACE: Recall that PSPACE languages can be expressed by Quantified Boolean Formulae. The number of quantifiers is polynomial in the input, but there are both existential and universal quantifiers, and furthermore these quantifiers may alternate. Considering the arithmetization of these formulae, we face two problems. Firstly, the value of the formulae is only bounded by a double-exponential function (in the length of the input), and secondly, when stripping out summations, the expression may be a polynomial of high degree (due to the universal quantifiers, which are replaced by products). The first problem is easy to deal with by using the Chinese Remainder Theorem (i.e., if two integers in [0, M] are different then they must be different modulo most of the primes up to poly(log M)). The second problem is resolved by "refreshing" variables after each universal quantifier (e.g., ∀x∃y∀z φ(x,y,z) is transformed into ∀x∃y∃x' [(x'=x) ∧ ∀z φ(x',y,z)]); that is, in the resulting formula, all variables appearing in a residual formula are quantified within the residual formula.

IP is in PSPACE: We show that, for every interactive proof system, there exists an optimal prover strategy, and furthermore that this strategy can be computed in polynomial space. This follows by looking at the tree of all possible executions. Thus, the acceptance probability of the verifier (when interacting with an optimal prover) can be computed in polynomial space.

Advanced Topics

The IP Hierarchy

Let IP(r(·)) denote the class of languages having an interactive proof in which at most r(·) messages are exchanged. Then IP(0) = coRP ⊆ BPP. The class IP(1) is a randomized version of NP: witnesses are verified via a probabilistic polynomial-time procedure, rather than a deterministic one. (This class is also denoted MA; observe that the proof of Theorem [ref] can be adapted to give BPP ⊆ MA, and thus NP ∪ BPP ⊆ MA.) The class IP(2) seems to be fundamentally different: the verification procedure here is truly interactive. Still, this class seems relatively close to NP; specifically, it is contained in the polynomial-time hierarchy (which seems "low" when contrasted with PSPACE = IP(poly)). Interestingly, IP(2r(·)) = IP(r(·)), and so, in particular, IP(O(1)) = IP(2). (Note that the collapse IP(2r) = IP(r) can be applied successively a constant number of times, but not more.)

Open Problem (the structure of the IP hierarchy): Suppose that L ∈ IP(r(·)). What can be said about the complement of L? Currently, we only know to argue as follows: L ∈ IP(r) ⊆ IP(poly) = PSPACE, and so the complement of L is in PSPACE and hence in IP(poly). This seems ridiculous: we do not use the extra information (i.e., that L ∈ IP(r) and not merely L ∈ IP(poly)). On the other hand, we do not expect the complement of L to be in IP(g(r)) for any function g, since this would put coNP (= coIP(1)) in IP(g(1)) = IP(O(1)). Other parameters of interest are the total length of the messages exchanged in the interaction and the total number of bits sent by the prover. (For a study of the latter complexity measure, see "On Interactive Proofs with a Laconic Prover" by Goldreich, Vadhan and Wigderson, in Proc. of ICALP, Springer's LNCS.) In general, it would be interesting to get a better understanding of the IP hierarchy.

How Powerful Should the Prover Be?

Here we consider the complexity of proving valid statements; that is, the complexity of the prescribed prover referred to in the completeness condition.

The Cryptographic Angle: Interactive proofs occur inside cryptographic protocols, and so the prover is merely a probabilistic polynomial-time machine; yet, it may have access to an auxiliary input (given to it or generated by it in the past). Such provers are relatively weak (i.e., they can only prove languages in IP); yet, they may be of interest for other reasons (e.g., see zero-knowledge).

The Complexity-Theoretic Angle: It makes sense to try to relate the complexity of proving a statement (to another party) to the complexity of deciding whether the statement holds. This gives rise to two related approaches:

1. Restricting the prover to be a probabilistic polynomial-time oracle machine with oracle access to the language in which membership is proven. This approach can be thought of as extending the notion of self-reducibility of NP-languages (these languages have an NP-proof system in which the prover is a polynomial-time machine with oracle access to the language). Indeed, much like the NP-complete languages, the "IP-complete" languages also have such a relatively efficient prover: recall that an optimal prover strategy can be implemented in polynomial space, and thus by a polynomial-time machine having oracle access to a PSPACE-complete language.

2. Restricting the prover to run in time that is polynomial in the complexity of the language in which membership is proven.

Open Problem: Further investigate the power of the various notions, and in particular the one extending the notion of self-reducibility of NP-languages. A better understanding of the latter is also long due. A specific challenge: provide an NP-proof system for Quadratic Non-Residuosity (QNR), using a probabilistic polynomial-time prover with access to the QNR language. (We mention that QNR has a constant-round interactive proof in which the prover is a probabilistic polynomial-time machine with access to QNR; this proof system is similar to the one presented above for Graph Non-Isomorphism.)

Computationally-Sound Proofs

Computationally-sound proof systems are fundamentally different from the above discussion (which did not affect the soundness of the proof systems): here we consider relaxations of the soundness condition, in which false proofs may exist (even with high probability) but are hard to find. Variants may correspond to the above approaches; specifically, the following have been investigated:

Argument Systems: Here one only considers prover strategies implementable by (possibly non-uniform) polynomial-size circuits (equiv., probabilistic polynomial-time machines with auxiliary inputs). Under some reasonable assumptions, there exist argument systems for NP having poly-logarithmic communication complexity. Analogous interactive proofs cannot exist unless NP is contained in Quasi-Polynomial Time (i.e., NP ⊆ Dtime(exp(poly(log n)))).

CS Proofs: Here one only considers prover strategies implementable in time that is polynomial in the complexity of the language. In a non-interactive version, one asks for "certificates of the NP-type" that are only computationally sound. In a model allowing both prover and verifier access to a random oracle, one can convert interactive proofs (and, in particular, CS proofs) into non-interactive ones. As a heuristic, it was also suggested to replace the random oracle by use of "random public functions" (a fuzzy notion, not to be confused with pseudorandom functions).

Open Problem: Try to provide firm grounds for the heuristic of making proof systems non-interactive by use of "random public functions". I advise not to try to define the latter notion in a general form, but rather devise some ad-hoc method, using some specific but widely believed complexity assumptions (e.g., the hardness of deciding Quadratic Residuosity modulo a composite number), for this specific application. (The reasons for this recommendation are explained in "The Random Oracle Methodology, Revisited" by Canetti, Goldreich and Halevi, in Proc. of STOC.)

Zero-Knowledge Proofs

Zero-knowledge proofs are central to cryptography. Furthermore, zero-knowledge proofs are very intriguing from a conceptual point of view, since they exhibit an extreme contrast between being convinced of the validity of a statement and learning anything in addition while receiving such a convincing proof. Namely, zero-knowledge proofs have the remarkable property of being both convincing and yielding nothing to the verifier beyond the fact that the statement is valid.

The zero-knowledge paradigm: Whatever can be efficiently computed after interacting with the prover on some common input, can be efficiently computed from this input alone (without interacting with anyone). That is, the interaction with the prover can be efficiently simulated in solitude.

A Technical Note: I have deviated from other presentations, in which the simulator works in expected probabilistic polynomial time, and require that it works in strict probabilistic polynomial time. Yet, I allow the simulator to halt without output with bounded probability. Clearly, this implies an expected polynomial-time simulator, but the converse is not known. In particular, some known positive results regarding perfect zero-knowledge with expected polynomial-time simulators are not known to hold under the above more strict notion. (For further details, see "Strict Polynomial-Time in Simulation and Extraction" by Barak and Lindell, in Proc. of STOC.)

Perfect Zero-Knowledge

The Definition: A simulator can produce exactly the same distribution as occurring in an interaction with the prover. Furthermore, in the general definition, this is required with respect to any probabilistic polynomial-time verifier strategy (not necessarily the one specified for the verifier). Thus, the zero-knowledge property protects the prover from any attempt to obtain anything from it (beyond conviction in the validity of the assertion).

Zero-Knowledge NP-proofs: Extending the NP-framework to interactive proofs is essential for the non-triviality of zero-knowledge: it is easy to see that zero-knowledge NP-proofs exist only for languages in RP. (Actually, that's a good exercise. Hint: an NP-proof system for a language L yields an NP-relation for L, defined using the verifier; on input x ∈ L, a perfect zero-knowledge simulator either halts without output or outputs an accepting conversation, i.e., an NP-witness for x.)

A perfect zero-knowledge proof for Graph Isomorphism: The prover sends the verifier a random isomorphic copy of the first input graph. The verifier challenges the prover by asking it to present an isomorphism of the graph sent to either the first input graph or to the second input graph; the verifier's choice is made at random. The fact that this interactive proof system is zero-knowledge is more subtle than it seems; for example, many parallel repetitions of the proof system are unlikely to be zero-knowledge.
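
The following Python sketch shows one round of this protocol (sequential repetition and the zero-knowledge subtleties mentioned above are not modeled). The graph encoding, the prover's auxiliary isomorphism pi, and all names are my own illustrative choices.

```python
# One round of the perfect zero-knowledge proof for Graph Isomorphism described above.
# Graphs are sets of frozenset edges over vertices 0..n-1; pi is an isomorphism mapping G1 onto G2.
import random

def apply_perm(graph, perm):
    return {frozenset({perm[u], perm[v]}) for (u, v) in (tuple(e) for e in graph)}

def gi_zk_round(G1, G2, pi, n):
    # Prover: send a random isomorphic copy H of G1.
    psi = list(range(n)); random.shuffle(psi)
    H = apply_perm(G1, psi)
    # Verifier: challenge with a random index b in {1, 2}.
    b = random.randint(1, 2)
    # Prover: reveal an isomorphism from G_b to H.
    if b == 1:
        sigma = psi                                        # G1 -> H directly
    else:
        inv_pi = [0] * n
        for u in range(n):
            inv_pi[pi[u]] = u                              # pi^{-1}: G2 -> G1
        sigma = [psi[inv_pi[v]] for v in range(n)]         # G2 -> G1 -> H
    # Verifier: check that sigma maps G_b onto H.
    Gb = G1 if b == 1 else G2
    return apply_perm(Gb, sigma) == H

# Toy instance: G2 is G1 relabeled by pi.
G1 = {frozenset({0, 1}), frozenset({1, 2})}
pi = [1, 2, 0]
G2 = apply_perm(G1, pi)
print(all(gi_zk_round(G1, G2, pi, 3) for _ in range(20)))   # expected: True
```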

Statistical (or almost-perfect) Zero-Knowledge: Here the simulation is only required to be statistically close to the actual interaction. The resulting class, denoted SZK, lies between Perfect ZK and general (or Computational) ZK. For further details see [ref].

General (or Computational) Zero-Knowledge

This definition is obtained by substituting the requirement that the simulation is identical to the real interaction, by the requirement that the two are computationally indistinguishable.

Computational Indistinguishability is a fundamental concept of independent interest. Two ensembles are considered indistinguishable by an algorithm A if A's behavior is almost invariant of whether its input is taken from the first ensemble or from the second one. We interpret "behavior" as a binary verdict, and require that the probability that A outputs 1 in both cases be the same up to a negligible difference (i.e., smaller than 1/p(n), for any positive polynomial p and all sufficiently long input lengths, denoted by n). Two ensembles are computationally indistinguishable if they are indistinguishable by all probabilistic polynomial-time algorithms.
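
In symbols, the requirement just described can be written as follows (the ensemble names X and Y are my own notation):

```latex
% Computational indistinguishability, as described in the text above.
Two ensembles $\{X_n\}_{n\in\mathbb{N}}$ and $\{Y_n\}_{n\in\mathbb{N}}$ are
\emph{computationally indistinguishable} if for every probabilistic
polynomial-time algorithm $A$, every positive polynomial $p$, and all
sufficiently large $n$,
\[
  \bigl|\, \Pr[A(X_n)=1] \;-\; \Pr[A(Y_n)=1] \,\bigr| \;<\; \frac{1}{p(n)} .
\]
```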

A zero-knowledge proof for NP (an abstract "boxes" setting): It suffices to construct such a proof system for 3-Colorability (3COL); to obtain a proof system for other NP-languages, use the fact that the standard reduction of NP to 3COL is polynomial-time invertible.

The prover uses a fixed 3-coloring of the input graph, and proceeds as follows. First, it uniformly selects a relabeling of the colors (i.e., one of the 3! possible ones), and puts the resulting color of each vertex in a locked box marked with the vertex's name. All boxes are sent to the verifier, who responds with a uniformly chosen edge, asking to open the boxes corresponding to the endpoints of this edge. The prover sends over the corresponding keys, and the verifier opens the two boxes and accepts if and only if it sees two different legal colors.
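
A minimal Python sketch of one round in this abstract setting follows; the "boxes" are modeled simply as values the verifier does not look at until the prover opens them, and all names are mine.

```python
# One round of the 3-Colorability protocol above, in the abstract "locked boxes" setting.
import random

def g3c_zk_round(edges, coloring):
    """edges: list of (u, v); coloring: dict vertex -> color in {0,1,2} (a proper 3-coloring)."""
    # Prover: relabel the colors by a random permutation and lock one box per vertex.
    perm = [0, 1, 2]; random.shuffle(perm)
    boxes = {v: perm[c] for v, c in coloring.items()}      # contents hidden from the verifier
    # Verifier: challenge with a uniformly chosen edge.
    u, v = random.choice(edges)
    # Prover: open exactly the two requested boxes.
    cu, cv = boxes[u], boxes[v]
    # Verifier: accept iff the two revealed colors are distinct legal colors.
    return cu != cv and cu in (0, 1, 2) and cv in (0, 1, 2)

# A 5-cycle is 3-colorable; on a non-3-colorable graph, any prover is caught
# with probability at least 1/|edges| per round.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
coloring = {0: 0, 1: 1, 2: 0, 3: 1, 4: 2}
print(all(g3c_zk_round(edges, coloring) for _ in range(20)))   # expected: True
```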

A zero-knowledge proof for NP (the real setting): The locked boxes need to be "implemented" digitally. This is done by a commitment scheme, a cryptographic primitive designed to implement such locked boxes. Loosely speaking, a commitment scheme is a two-party protocol that proceeds in two phases, so that at the end of the first phase (called the commit phase) the first party (called the sender) is committed to a single value, which is the only value it can later reveal in the second phase, whereas at this point the other party gains no knowledge of the committed value. Commitment schemes exist if (and actually only if) one-way functions exist. Thus, the mildest of all cryptographic assumptions suffices for constructing zero-knowledge proofs for NP (and actually for all of IP). That is:

Theorem (The ZK Characterization Theorem): If one-way functions exist, then every set in IP has a zero-knowledge proof system.

Furthermore, zero-knowledge proofs for languages that are "hard on the average" imply the existence of one-way functions; thus, the above construction essentially utilizes the minimal possible assumption.

Concluding Remarks

The prover's strategy in the above zero-knowledge proof for NP can be implemented by a probabilistic polynomial-time machine which is given (as auxiliary input) an NP-witness for the input. This is clear for 3COL, and for other NP-languages one needs to use the fact that the relevant reductions are coupled with efficient witness transformations. The efficient implementation of the prover strategy is essential to the applications below.

Applications to Cryptography: Zero-knowledge proofs are a powerful tool for the design of cryptographic protocols, in which one typically wants to guarantee proper behavior of a party without asking it to reveal all its secrets. Note that proper behavior is typically a polynomial-time computation based on the party's secrets as well as on some known data. Thus, the claim that the party behaves consistently with its secrets and the known data can be cast as an NP-statement, and the above result can be utilized. More generally, using additional ideas, one can provide a secure protocol for any functional behavior. These general results have to be considered as plausibility arguments: you would not like to apply these general constructions to specific practical problems, yet you should know that these specific problems are solvable.

Open Problems do exist, but seem more specialized in nature. For example, it would be interesting to figure out and utilize the minimal possible assumption required for constructing zero-knowledge protocols for NP in various models (like constant-round interactive proofs, the non-interactive model, and perfect zero-knowledge arguments).

Further Reading: See the chapter on Zero-Knowledge in [ref].

Probabilistically Checkable Proof (PCP) Systems

When viewed in terms of an interactive proof system, the probabilistically checkable proof setting consists of a prover that is memoryless. Namely, one can think of the prover as being an oracle, and of the messages sent to it as being queries. A more appealing interpretation is to view the probabilistically checkable proof setting as an alternative way of generalizing NP: instead of receiving the entire proof and conducting a deterministic polynomial-time computation (as in the case of NP), the verifier may toss coins and probe the proof only at locations of its choice. Potentially, this allows the verifier to utilize very long proofs (i.e., of super-polynomial length) or, alternatively, examine very few bits of an NP-proof.

The Definition

The Basic Model: A probabilistically checkable proof system consists of a probabilistic polynomial-time verifier having access to an oracle which represents a proof in redundant form. Typically, the verifier accesses only few of the oracle bits, and these bit positions are determined by the outcome of the verifier's coin tosses. Completeness and soundness are defined similarly to the way they were defined for interactive proofs: for valid assertions there exist proofs making the verifier always accept, whereas no oracle can make the verifier accept false assertions with probability above 1/2. (We've specified the error probability since we intend to be very precise regarding some complexity measures.)

Additional complexity measures of fundamental importance are the randomness and query complexities. Specifically, PCP(r(·), q(·)) denotes the set of languages having a probabilistically checkable proof system in which the verifier, on any input of length n, makes at most r(n) coin tosses and at most q(n) oracle queries. As usual, unless stated otherwise, the oracle answers are always binary (i.e., either 0 or 1).

Observe that the effective oracle length is at most 2^r · q (i.e., the number of locations that may be accessed on some random choices). In particular, the effective length of oracles in a PCP(log, ·) system is polynomial. Exercise: Show that PCP(log, poly) ⊆ NP.

PCP augments the traditional notion of a proof: An oracle that always makes the pcp verifier accept constitutes a proof in the standard mathematical sense. However, a pcp system has the extra property of enabling a lazy verifier to toss coins, take its chances, and "assess" the validity of the proof without reading all of it (but rather by reading a tiny portion of it).

The power of probabilistically checkable proofs

Theorem (The PCP Characterization Theorem): PCP(log, O(1)) = NP.

Thus, probabilistically checkable proofs in which the verifier tosses only logarithmically many coins and makes only a constant number of queries exist for every NP-language. It follows that NP-proofs can be transformed into NP-proofs that offer a trade-off between the portion of the proof being read and the confidence it offers. Specifically, if the verifier is willing to tolerate an error probability of ε, then it suffices to let it examine O(log(1/ε)) bits of the transformed NP-proof. (These bit locations need to be selected at random.) Furthermore, an original NP-proof can be transformed into an NP-proof allowing such a trade-off in polynomial time. (The latter is an artifact of the proof of the PCP Theorem.)

The Proof of the PCP Characterization Theorem is one of the most complicated proofs in the Theory of Computation. Its main ingredients are:

1. A pcp(log, polylog) proof system for NP. Furthermore, this proof system has additional properties which enable proof composition as in Item 3 below.

2. A pcp(poly, O(1)) proof system for NP. This proof system also has additional properties enabling proof composition as in Item 3.

3. The proof composition paradigm: Suppose you have a pcp(r(·), O(1)) system for NP in which a constant number of queries are made (non-adaptively) to a 2^s-valued oracle, and the verifier's decision regarding the answers may be implemented by a poly(s)-size circuit. Further suppose that you have a pcp(r'(·), q'(·))-like system for P in which the input is given in encoded form via an additional oracle, so that the system accepts input-oracles that encode inputs in the language and rejects any input-oracle which is "far" from the encoding of any input in the language. (In this latter system, access to the input-oracle is accounted for in the query complexity.) Furthermore, suppose that the latter system may handle inputs which result from the concatenation of a constant number of sub-inputs, each encoded in a separate sub-input oracle. Then NP has a pcp(2·(r + r'∘s), 2·(q'∘s)) system, where s(n) := poly(n). (The extra factor of 2 is an artifact of the need to amplify each of the two pcp systems, so that the total error probability sums up to at most 1/2.)

In particular, the proof system of Item 1 is composed with itself (using r = r' = log, q' = polylog, and s(n) = poly(log n)), yielding a pcp(log, polyloglog) system for NP, which is then composed with the system of Item 2 (using r = log, q = poly(log log), r' = poly, q' = O(1), and s(n) = poly(log log n)), yielding the desired pcp(log, O(1)) system for NP.

The pcp(log, polylog) system for NP: We start with a different arithmetization of CNF formulae than the one used for constructing an interactive proof for coNP. Logarithmically many variables are used to represent (in binary) the names of variables and clauses in the input formula, and an oracle from variables to Boolean values is supposed to represent a satisfying assignment. An arithmetic expression involving a logarithmic number of summations is used to represent the value of the formula under the truth assignment represented by the oracle. This expression is a low-degree polynomial in the new variables and has a cubic dependency on the assignment-oracle. Small-biased probability spaces are used to generate a polynomial number of such expressions, so that if the formula is satisfiable then all these expressions evaluate to zero, and otherwise at most half of them evaluate to zero. Using a "summation test" (as in the interactive proof for coNP) and a "low-degree test", this yields a pcp(t(·), t(·)) system for NP, where t(n) := O(log n · log log n). (We use a finite field of poly(log n) elements, and so we need (log n) · O(log log n) random bits for the summation test.) To obtain the desired pcp system, one uses O(log n / log log n)-long sequences over {1, ..., log n} to represent variable/clause names, rather than logarithmically long binary sequences. (We can still use a finite field of poly(log n) elements, and so we need only O(log n / log log n) · O(log log n) random bits for the summation test.) All this is relatively easy compared to what is needed in order to transform the pcp system so that only a constant number of queries are made to a (multi-valued) oracle. This is obtained via (randomness-efficient) "parallelization" of pcp systems, which in turn depends heavily on efficient low-degree tests.

Open Problem: As a first step towards the simplification of the proof of the PCP Characterization Theorem, one may want to provide an alternative "parallelization" procedure that does not rely on polynomials (or any other algebraic creatures). (A first step towards this partial goal was taken in "A Combinatorial Consistency Lemma with Application to the PCP Theorem" by Goldreich and Safra, SICOMP.)

The pcp(poly, O(1)) system for NP: It suffices to prove the satisfiability of a system of quadratic equations over GF(2), because this problem is NP-complete. The oracle is supposed to hold the values of all quadratic expressions under a satisfying assignment to the variables. We distinguish two tables in the oracle: one corresponding to the 2^n linear expressions, and the other to the 2^{n^2} "pure" bilinear expressions. Each table is tested for self-consistency (via a linearity test), and the two tables are tested to be consistent (via a matrix-equality test which utilizes self-correction). Each of these tests utilizes a constant number of Boolean queries, and randomness which is logarithmic in the size of the corresponding table.

PCP and Approximation

PCP-characterizations of NP play a central role in recent developments concerning the difficulty of approximation problems. To demonstrate this relationship, we first note that the PCP Characterization Theorem can be rephrased without mentioning the class PCP altogether. Instead, a new type of polynomial-time reductions, which we call amplifying, emerges.

Amplifying reductions: There exists a constant ε > 0 and a polynomial-time (Karp) reduction f of 3SAT to itself, so that f maps non-satisfiable 3CNF formulae to 3CNF formulae for which every truth assignment satisfies at most a 1-ε fraction of the clauses. I call the reduction f amplifying. Its existence follows from the PCP Characterization Theorem; on the other hand, any amplifying reduction for 3SAT yields a proof of the PCP Characterization Theorem. The proofs of both directions are left as an exercise. (Hint: To prove the first direction, consider the guaranteed pcp system for 3SAT, associate the bits of the oracle with Boolean variables, and introduce a (constant-size) Boolean formula for each possible outcome of the sequence of O(log n) coin tosses, describing whether the verifier would have accepted given this outcome. For the other direction, consider a pcp system that is given oracle access to a truth assignment for the formula resulting from the amplifying reduction.)

Amplifying reductions and Non-Approximability: The above amplifying reduction of 3SAT implies that it is NP-hard to distinguish satisfiable 3CNF formulae from 3CNF formulae for which every truth assignment satisfies less than a 1-ε fraction of the clauses. Thus, Max3SAT is NP-hard to approximate to within a 1-ε factor.

Stronger Non-Approximability Results were obtained via alternative PCP-characterizations of NP. For example, the NP-hardness of approximating Max-Clique to within N^{1-ε} (for any constant ε > 0) was obtained via NP = FPCP(log, ε), where the second parameter in FPCP measures the amortized free-bit complexity of the pcp system.

The actual notes that were used

I have focused on interactive proofs and probabilistically checkable proofs.

Interactive Proofs (IP)

Unfortunately, this part of my notes was lost. I have defined and discussed the basic model, exemplified it with the Graph Non-Isomorphism protocol, and showed that coNP ⊆ IP.

Probabilistically Checkable Proofs (PCP)

Unfortunately, the first part of my notes, introducing the basic model and the complexity measures, was lost.

Adaptivity versus non-adaptivity in the context of PCP: Whenever one discusses oracle machines, there is a distinction between adaptive machines, which may select their queries based on answers to prior queries, and non-adaptive machines, which determine all their queries as a function of their initial input and coin tosses. Adaptive machines can always be converted to non-adaptive ones at the cost of an exponential increase in their query complexity (i.e., by considering a priori all possible answers). In our case, where the query complexity is an unspecified constant, this difference is immaterial. Thus, whenever it is convenient, we will assume that the verifier in the PCP scheme is non-adaptive.

The PCP Characterization Theorem: The theorem states that NP = PCP(O(log n), O(1)). The easy direction consists of showing that PCP(O(log n), O(1)) is contained in NP. This follows by observing that the effective length of the oracle (i.e., the number of bits read from the oracle under all possible settings of the random-tape) is polynomial. The other direction is much more complex. Here we will only sketch a proof of a much easier result, namely NP ⊆ PCP(poly(n), O(1)). We stress that this result is very interesting by itself, because it states that NP-assertions can be verified probabilistically by making only a constant number of probes into the (possibly exponentially long) proof.

Let QE be the set of satisfiable systems of quadratic equations modulo 2 (i.e., quadratic equations over GF(2)). That is, an instance ({c^{(k)}_{i,j}}_{i,j∈[n]}, b^{(k)})_{k∈[m]} is in QE if the quadratic system of equations
$$\Bigl\{ \sum_{i,j} c^{(k)}_{i,j}\, x_i x_j = b^{(k)} \Bigr\}_{k\in[m]} \pmod 2$$
has a solution in the x_i's. Exercise: Prove that QE is NP-complete, and that this holds also when m = n. Also note that linear terms can be replaced by quadratic terms (since x_i = x_i·x_i over GF(2)). We will show that QE is in PCP(O(n^2), O(1)). Below, all arithmetic operations are modulo 2.

The oracle in the PCP system that we will present is supposed to encode a satisfying assignment τ = (τ_1, ..., τ_n) ∈ {0,1}^n to the system of equations, where the encoding will be very redundant. (As we will see, redundant encodings may be very easy to check.) Specifically, a satisfying assignment τ ∈ {0,1}^n will be encoded by providing all the partial sums of the τ_i's (i.e., an encoding of τ via the Hadamard code), as well as all the partial sums of the τ_iτ_j's. That is, the first (resp., second) part of the encoding of τ is the 2^n-bit long string, denoted A_1, in which entry α = (α_1, ..., α_n) ∈ {0,1}^n equals Σ_i α_iτ_i mod 2 (resp., the 2^{n^2}-bit long string, denoted A_2, in which entry β = (β_{1,1}, ..., β_{n,n}) ∈ {0,1}^{n^2} equals Σ_{i,j} β_{i,j}τ_iτ_j mod 2).

On input ({c^{(k)}_{i,j}}, b^{(k)})_{k∈[n]} and oracle access to (A_1, A_2), where |A_1| = 2^n and |A_2| = 2^{n^2}, the verifier will perform the following four tests.

1. Test that A_1 is close to an encoding of some τ ∈ {0,1}^n under the Hadamard code. That is, we check whether there exists a τ ∈ {0,1}^n such that
$$\Pr_{\alpha}\Bigl[\, A_1(\alpha) = \sum_i \alpha_i \tau_i \,\Bigr]$$
is close to 1. This checking is performed by invoking the so-called linearity test for a constant number of times, where in each invocation we select uniformly and independently α, α' ∈ {0,1}^n and check whether A_1(α ⊕ α') = A_1(α) ⊕ A_1(α') holds, where ⊕ denotes bit-by-bit addition (mod 2). Although very appealing, the analysis of the linearity test is quite non-trivial, and thus omitted.

2. Test that A_2 is close to an encoding of some η ∈ {0,1}^{n^2} under the Hadamard code. That is, we check whether there exists an η = (η_{1,1}, ..., η_{n,n}) ∈ {0,1}^{n^2} such that
$$\Pr_{\beta}\Bigl[\, A_2(\beta) = \sum_{i,j} \beta_{i,j}\, \eta_{i,j} \,\Bigr]$$
is close to 1. Indeed, we just use the linearity test (on A_2).

3. Test that the string encoded in A_2 matches the string encoded in A_1. Recall that the Hadamard code has relative distance 1/2 (i.e., the encodings of two different strings agree on exactly half of the coordinates). Thus, the condition of Step 1 may hold for at most one τ, and similarly the condition of Step 2 may hold for at most one η. In the current step we want to test whether the string η that satisfies the condition of Step 2 is consistent with the string τ that satisfies the condition of Step 1; that is, whether η_{i,j} = τ_iτ_j holds for all i, j ∈ [n].

A detour: Suppose we want to test that two n-by-n matrices, A and B, are equal, by making few queries to a suitable encoding. This can be done by uniformly selecting a (row) vector r and a (column) vector s, and checking whether rAs = rBs (a single-bit equality). Let C := A - B. We are actually checking whether C is all zeros, by checking whether rCs = 0. Clearly, if C is all zeros then the equality always holds. On the other hand, if C is a non-zero matrix then it has some rank d ≥ 1, in which case the probability that, for a randomly chosen r, the vector rC is the all-zero vector is exactly 2^{-d}. (The proof is left as an exercise, but do the next exercise first.) Furthermore, for any non-zero vector v (e.g., v = rC), the probability that vs = 0 for a randomly chosen s is exactly 1/2. (Prove this too.) We conclude that for any non-zero matrix C it holds that Pr_{r,s}[rCs = 0] ≤ 3/4 (indeed, 2^{-d} + (1 - 2^{-d})·(1/2) ≤ 3/4).

Considering the matrices A = (τ_iτ_j)_{i,j} and B = (η_{i,j})_{i,j}, we want to check whether they are identical. By the above detour, this calls for uniformly selecting r, s ∈ {0,1}^n and checking whether rAs = rBs. Now observe that rAs = Σ_{i,j} r_i τ_iτ_j s_j equals the product of Σ_i r_iτ_i and Σ_j s_jτ_j, whereas rBs = Σ_{i,j} r_i s_j η_{i,j}. So it seems that all we need to check is whether A_1(r)·A_1(s) equals A_2(z), where z is the outer product of r and s (i.e., z_{i,j} = r_i s_j). This is not quite true: Steps 1 and 2 only guarantee that A_1(α) = Σ_i α_iτ_i and A_2(β) = Σ_{i,j} β_{i,j}η_{i,j} hold, with high probability, for uniformly distributed α and β. This is fine with respect to what we want to retrieve from A_1, but not for what we want to retrieve from A_2, because the outer product of r and s is not uniformly distributed (even if r and s are uniformly distributed). Thus, instead of querying A_2 on z, we uniformly select z' ∈ {0,1}^{n^2}, query A_2 on z' and on z ⊕ z' (which are each uniformly distributed), and use the value A_2(z') ⊕ A_2(z ⊕ z'). This process is called self-correction.
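
The following Python sketch illustrates the consistency test just described, including the self-correction trick; the table names, the number of repetitions, and the instance size are my own illustrative choices.

```python
# The consistency test of Step 3: compare A1(r)*A1(s) with a self-corrected reading of A2
# at the outer product of r and s.
import random
from itertools import product

def consistency_test(A1, A2, n, repetitions=30):
    for _ in range(repetitions):
        r = tuple(random.randint(0, 1) for _ in range(n))
        s = tuple(random.randint(0, 1) for _ in range(n))
        z = tuple(r[i] * s[j] for i in range(n) for j in range(n))   # outer product of r and s
        zp = tuple(random.randint(0, 1) for _ in range(n * n))       # uniform mask for self-correction
        z_corr = tuple(a ^ b for a, b in zip(z, zp))
        lhs = A1[r] * A1[s] % 2                                      # r A s, where A_{ij} = tau_i tau_j
        rhs = A2[zp] ^ A2[z_corr]                                    # self-corrected value of A2 at z
        if lhs != rhs:
            return False
    return True

# Honest tables for tau = (1, 0, 1): A1 encodes tau, A2 encodes the n^2 products tau_i tau_j.
tau = (1, 0, 1); n = len(tau)
outer = tuple(tau[i] * tau[j] for i in range(n) for j in range(n))
A1 = {a: sum(x * t for x, t in zip(a, tau)) % 2 for a in product((0, 1), repeat=n)}
A2 = {b: sum(x * t for x, t in zip(b, outer)) % 2 for b in product((0, 1), repeat=n * n)}
print(consistency_test(A1, A2, n))   # expected: True for the honest encoding
```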

4. Test that the string τ encoded in A_1 satisfies the given quadratic system. That is, for τ as in Step 1, we want to check whether
$$\sum_{i,j} c^{(k)}_{i,j}\, \tau_i\tau_j \;=\; b^{(k)}$$
holds for all k ∈ [n]. Rather than performing n such tests (which we cannot afford), we uniformly select r = (r_1, ..., r_n) ∈ {0,1}^n and check whether
$$\sum_{k\in[n]} r_k \sum_{i,j} c^{(k)}_{i,j}\, \tau_i\tau_j \;=\; \sum_{k\in[n]} r_k\, b^{(k)}.$$
The left-hand side can be written as Σ_{i,j} (Σ_k r_k c^{(k)}_{i,j}) τ_iτ_j, and so we merely need to retrieve that value, which (by Steps 1-3) can be obtained via self-correction from A_2. That is, assuming we did not reject in any of Steps 1-3, it holds, with high probability over a uniformly chosen z' ∈ {0,1}^{n^2}, that A_2(z') ⊕ A_2(γ ⊕ z') equals Σ_{i,j} γ_{i,j} τ_iτ_j, where we set γ = (γ_{i,j}) such that γ_{i,j} = Σ_k r_k c^{(k)}_{i,j}.

We conclude that if the original system of equations is not satisfiable, then every oracle (A_1, A_2) is rejected, with at least some constant probability, by one of the above four steps; whereas, if the original system is satisfiable, then there exists an oracle (A_1, A_2) that is accepted with probability 1 by all the above steps.
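
For completeness, here is a Python sketch of Step 4 on a toy instance; the instance format (nested lists c[k][i][j] and b[k]), the honest tables, and all names are my own illustrative choices.

```python
# Step 4: check a random GF(2) combination of the n equations, retrieving the needed
# quadratic form from A2 via self-correction.
import random
from itertools import product

def quadratic_test(A2, c, b, n, repetitions=30):
    for _ in range(repetitions):
        r = [random.randint(0, 1) for _ in range(n)]                 # random combination of the equations
        gamma = tuple(sum(r[k] * c[k][i][j] for k in range(n)) % 2
                      for i in range(n) for j in range(n))
        zp = tuple(random.randint(0, 1) for _ in range(n * n))       # self-correction mask
        gcorr = tuple(x ^ y for x, y in zip(gamma, zp))
        lhs = A2[zp] ^ A2[gcorr]                                     # = sum_{ij} gamma_{ij} tau_i tau_j
        rhs = sum(r[k] * b[k] for k in range(n)) % 2
        if lhs != rhs:
            return False
    return True

# Toy instance with n = 2 equations over x1, x2:  x1*x2 = 0  and  x1*x1 = 1, satisfied by tau = (1, 0).
n = 2
c = [[[0, 1], [0, 0]], [[1, 0], [0, 0]]]
b = [0, 1]
tau = (1, 0)
outer = tuple(tau[i] * tau[j] for i in range(n) for j in range(n))
A2 = {beta: sum(x * t for x, t in zip(beta, outer)) % 2 for beta in product((0, 1), repeat=n * n)}
print(quadratic_test(A2, c, b, n))   # expected: True for this satisfying assignment
```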

Amplifying reductions: For the sake of concreteness, we focus on a specific NP-complete problem (i.e., 3SAT), but similar statements can be made about some other (but not all) natural NP-complete problems. We say that a Karp-reduction f is an amplifying reduction of 3SAT to itself if there exists a constant ε > 0 such that the following holds:

1. If φ ∈ 3SAT then f(φ) ∈ 3SAT.

2. If φ ∉ 3SAT then not only is f(φ) ∉ 3SAT, but rather every truth assignment to φ' := f(φ) satisfies at most a 1-ε fraction of the clauses of φ'.

That is, the reduction "amplifies" the unsatisfiability of φ: it may be that there exists a truth assignment that satisfies all but one of the clauses of φ; still, all truth assignments fail to satisfy a constant fraction of the clauses of φ'. Interestingly, the notion of amplifying reductions captures the entire contents of the PCP Theorem, and so you should not expect to be able to see a "simple" amplifying reduction.

Theorem: The following two are equivalent:

1. 3SAT ∈ PCP(O(log n), O(1)).

2. There exists an amplifying reduction of 3SAT to itself.

Note that 3SAT ∈ PCP(O(log n), O(1)) if and only if NP ⊆ PCP(O(log n), O(1)).

Proof sketch: We first show that amplifying reductions imply the PCP Theorem. Suppose that f is an amplifying reduction of 3SAT to itself, and let ε be the corresponding constant. Consider a verifier that, on input a 3CNF formula φ, computes f(φ), selects at random a clause of f(φ), probes the oracle for the values of the corresponding three variables, and decides accordingly. This verifier uses a logarithmic amount of randomness, always accepts φ ∈ 3SAT (when provided an adequate oracle), and rejects each φ ∉ 3SAT with probability at least ε, no matter what oracle is presented. Clearly, the error can be reduced as required by invoking this verifier O(1/ε) times.

On the other hand, given a PCP system as in Item 1, we construct an amplifying reduction as follows. On input a 3CNF formula φ, we construct a 3CNF formula φ' as follows. The variables of φ' will correspond to the bits of the oracle used by the PCP verifier. (Recall that the number of effective oracle bits is polynomial in |φ|.) For each possible random-tape r ∈ {0,1}^{O(log |φ|)} of the verifier, consider the verifier's verdict as a function of the O(1) answers obtained from the oracle. Thus, the verifier's decision on input φ and random-tape r can be represented as a constant-size formula in the O(1) variables representing the corresponding oracle bits. Using auxiliary variables, such a formula can be represented in 3CNF of constant size. The conjunction of these 2^{O(log |φ|)} formulae, each constructible in polynomial time from φ, yields φ'. Observe that if φ is satisfiable then so is φ'. On the other hand, if φ is not satisfiable, then every truth assignment to the variables of φ' satisfies at most half of the constant-size 3CNFs (which correspond to individual values of the random-tape). Thus, for each of at least half of the constant-size 3CNFs, at least one of its clauses is not satisfied. It follows that the reduction constructed above is amplifying, with an ε that depends on the (constant) number of clauses in each of the small 3CNFs.

Amplifying reductions and the difficulty of approximation: Max3SAT is typically defined as a search problem in which, given a CNF formula, one seeks a truth assignment satisfying as many clauses as possible. In the relaxed, approximation version with factor c, given a formula φ, one is only required to find a truth assignment that satisfies at least c · opt(φ) clauses, where opt(φ) denotes the maximum number of clauses that can be satisfied by any truth assignment to φ. Observe that the existence of an amplifying reduction of 3SAT to itself with constant ε implies that the (1-ε)-approximation version of Max3SAT is NP-hard. (Proving this fact is left as an exercise.)

Lecture

Pseudorandomness

A fresh view at the question of randomness was taken in the theory of computing: it has been postulated that a distribution is pseudorandom if it cannot be told apart from the uniform distribution by any efficient procedure. The paradigm, originally associating efficient procedures with polynomial-time algorithms, has been applied also with respect to a variety of limited classes of such distinguishing procedures.

Loosely speaking, pseudorandom generators are efficient procedures that stretch short "random seeds" into significantly longer "pseudorandom sequences". Again, the original approach has required that the generation be done in polynomial time, but subsequent works have demonstrated the fruitfulness of alternative requirements.

The notes for this lecture were adapted from various texts that I wrote in the past (see, e.g., [ref, Chap.]). In view of the fact that the archetypical case of pseudorandom generators is covered at Weizmann in the Foundation of Cryptography course, I focused in the current course on the derandomization aspect. The actual notes I have used in the current course appear in Section [ref].

Introduction

The second half of this century has witnessed the development of three theories of randomness, a notion which has been puzzling thinkers for ages. The first theory (cf. [ref]), initiated by Shannon, is rooted in probability theory and focuses on distributions that are not perfectly random. Shannon's Information Theory characterizes perfect randomness as the extreme case in which the information content is maximized (and there is no redundancy at all). Thus, perfect randomness is associated with a unique distribution: the uniform one. In particular, by definition, one cannot generate such perfect random strings from shorter random seeds.

The second theory (cf. [ref]), due to Solomonov, Kolmogorov and Chaitin, is rooted in computability theory, and specifically in the notion of a universal language (equiv., universal machine or computing device). It measures the complexity of objects in terms of the shortest program (for a fixed universal machine) that generates the object. Like Shannon's theory, Kolmogorov Complexity is quantitative, and perfectly random objects appear as an extreme case. Interestingly, in this approach one may say that a single object, rather than a distribution over objects, is perfectly random. Still, Kolmogorov's approach is inherently intractable (i.e., Kolmogorov Complexity is uncomputable), and, by definition, one cannot generate strings of high Kolmogorov Complexity from short random seeds.

The third theory, initiated by Blum, Goldwasser, Micali and Yao, is rooted in complexity theory and is the focus of this lecture. This approach is explicitly aimed at providing a notion of perfect randomness that allows one to efficiently generate perfect random strings from shorter random seeds. The heart of this approach is the suggestion to view objects as equal if they cannot be told apart by any efficient procedure. Consequently, a distribution that cannot be efficiently distinguished from the uniform distribution will be considered as being random (or rather called pseudorandom). Thus, randomness is not an "inherent" property of objects (or distributions), but is rather relative to an observer (and its computational abilities). To demonstrate this approach, let us consider the following mental experiment.

Alice and Bob play "head or tail" in one of the following four ways. In all of them, Alice flips a coin high in the air, and Bob is asked to guess its outcome before the coin hits the floor. The alternative ways differ by the knowledge Bob has before making his guess. In the first alternative, Bob has to announce his guess before Alice flips the coin. Clearly, in this case Bob wins with probability 1/2. In the second alternative, Bob has to announce his guess while the coin is spinning in the air. Although the outcome is determined in principle by the motion of the coin, Bob does not have accurate information on the motion, and thus we believe that also in this case Bob wins with probability 1/2. The third alternative is similar to the second, except that Bob has at his disposal sophisticated equipment capable of providing accurate information on the coin's motion as well as on the environment affecting the outcome. However, Bob cannot process this information in time to improve his guess. In the fourth alternative, Bob's recording equipment is directly connected to a powerful computer, programmed to solve the motion equations and output a prediction. It is conceivable that in such a case Bob can substantially improve his guess of the outcome of the coin.

We conclude that the randomness of an event is relative to the information and computing resources at our disposal. Thus, a natural concept of pseudorandomness arises: a distribution is pseudorandom if no efficient procedure can distinguish it from the uniform distribution, where efficient procedures are associated with probabilistic polynomial-time algorithms.

The General Paradigm

A generic formulation of pseudorandom generators consists of specifying three fundamental aspects: the stretching measure of the generators; the class of distinguishers that the generators are supposed to fool (i.e., the algorithms with respect to which the computational indistinguishability requirement should hold); and the resources that the generators are allowed to use (i.e., their own computational complexity).

Stretching function: A necessary requirement from any notion of a pseudorandom generator is that it is a deterministic algorithm that stretches short strings, called seeds, into longer output sequences. Specifically, it stretches k-bit long seeds into ℓ(k)-bit long outputs, where ℓ(k) > k. The function ℓ is called the stretching measure (or stretching function). In some settings the specific stretching measure is immaterial (e.g., see Section [ref]).

Computational Indistinguishability: A necessary requirement from any notion of a pseudorandom generator is that it "fools" some non-trivial algorithms. That is, any algorithm taken from some class of interest cannot distinguish the output produced by the generator (when the generator is fed with a uniformly chosen seed) from a uniformly chosen sequence. Typically, we consider a class D of distinguishers and a class F of noticeable functions, and require that the generator G satisfies the following: for any D ∈ D, any f ∈ F, and for all sufficiently large k's,
$$\bigl|\, \Pr[D(G(U_k)) = 1] \;-\; \Pr[D(U_{\ell(k)}) = 1] \,\bigr| \;<\; f(k),$$
where U_n denotes the uniform distribution over {0,1}^n, and the probability is taken over U_k (resp., U_{ℓ(k)}) as well as over the coin tosses of algorithm D (in case it is probabilistic). The archetypical choice is that D is the set of probabilistic polynomial-time algorithms and F is the set of functions that are the reciprocal of some positive polynomial.

Complexity of Generation: The archetypical choice is that the generator has to work in polynomial time (i.e., time that is polynomial in the length of its input, the seed). Other choices will be discussed as well. We note that placing no computational requirements on the generator (or, alternatively, putting very mild requirements such as a double-exponential running-time upper bound) yields "generators" that can fool any sub-exponential-size circuit family.

The Archetypical Case

As stated above, the most natural notion of a pseudorandom generator refers to the case where both the generator and the potential distinguisher work in polynomial time. Actually, the distinguisher is more complex than the generator: the generator is a fixed algorithm working within some fixed polynomial time, whereas a potential distinguisher is any algorithm that runs in polynomial time. Thus, for example, the distinguisher may always run in time cubic in the running time of the generator. Furthermore, to facilitate the development of this theory, we allow the distinguisher to be probabilistic (whereas the generator remains deterministic, as above). In the role of the set of noticeable functions we consider all functions that are the reciprocal of some positive polynomial. This choice is naturally coupled with the association of efficient computation with polynomial-time algorithms: an event that occurs with noticeable probability occurs almost always when the experiment is repeated a "feasible" (i.e., polynomial) number of times.

The actual definition

The above discussion leads to the following instantiation of the generic framework presented in Section [ref].

Definition (pseudorandom generator, archetypical case): A deterministic polynomial-time algorithm G is called a pseudorandom generator if there exists a stretching function ℓ: N → N, so that for any probabilistic polynomial-time algorithm D, for any positive polynomial p, and for all sufficiently large k's,
$$\bigl|\, \Pr[D(G(U_k)) = 1] \;-\; \Pr[D(U_{\ell(k)}) = 1] \,\bigr| \;<\; \frac{1}{p(k)},$$
where U_n denotes the uniform distribution over {0,1}^n, and the probability is taken over U_k (resp., U_{ℓ(k)}) as well as over the coin tosses of D.

(Footnote: Thus, we require certain functions, namely the absolute difference between the above probabilities, to be smaller than any noticeable function on all but finitely many integers. We call such functions negligible. Note that a function may be neither noticeable nor negligible; e.g., it may be smaller than any noticeable function on infinitely many values and yet larger than some noticeable function on infinitely many other values.)

(Footnote: The definition asserts that the distinguishing gap of certain machines must be smaller than the reciprocal of any positive polynomial for all but finitely many n's. Such functions are called negligible. The notion of negligible probability is robust in the sense that an event that occurs with negligible probability occurs with negligible probability also when the experiment is repeated a "feasible" (i.e., polynomial) number of times.)

Thus, pseudorandom generators are efficient (i.e., polynomial-time) deterministic programs that expand short, randomly selected seeds into longer pseudorandom bit sequences, where the latter are computationally indistinguishable from truly random sequences by efficient (i.e., polynomial-time) algorithms. It follows that any efficient randomized algorithm maintains its performance when its internal coin tosses are substituted by a sequence generated by a pseudorandom generator.

Amplifying the stretch function: Pseudorandom generators as defined above are only required to stretch their input a bit; for example, stretching k-bit long inputs to (k+1)-bit long outputs will do. Clearly, generators of such a moderate stretch function are of little use in practice. In contrast, we want to have pseudorandom generators with an arbitrarily long stretch function. By the efficiency requirement, the stretch function can be at most polynomial. It turns out that pseudorandom generators with the smallest possible stretch function can be used to construct pseudorandom generators with any desirable polynomial stretch function. Thus, when talking about the existence of pseudorandom generators, we may ignore the stretch function.

Theorem: Let G be a pseudorandom generator with stretch function ℓ(k) = k+1, and let ℓ'(·) be any polynomially bounded stretch function that is polynomial-time computable. Let G_1(x) denote the |x|-bit long prefix of G(x), and G_2(x) denote the last bit of G(x) (i.e., G(x) = G_1(x) G_2(x)). Then
$$G'(s) \;\stackrel{\rm def}{=}\; \sigma_1 \sigma_2 \cdots \sigma_{\ell'(|s|)}, \qquad \mbox{where } x_0 := s,\ \ \sigma_i := G_2(x_{i-1}) \mbox{ and } x_i := G_1(x_{i-1}),\ \mbox{for } i = 1, \ldots, \ell'(|s|),$$
is a pseudorandom generator with stretch function ℓ'.
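
The wiring of this construction is easy to sketch in Python; note that the toy stand-in G below is NOT a pseudorandom generator, it only illustrates how G_1, G_2 and G' fit together, and all names are mine.

```python
# The stretch-amplification construction G' from the theorem above, with G treated as a black box.
def G(s):
    """Toy stand-in for a generator stretching k bits to k+1 bits (NOT pseudorandom)."""
    parity = sum(s) % 2
    return s[1:] + [s[0] ^ parity] + [parity]

def G1(s): return G(s)[:len(s)]          # the |s|-bit prefix of G(s)
def G2(s): return G(s)[len(s)]           # the last bit of G(s)

def G_prime(s, ell):
    """Output sigma_1 ... sigma_ell, where x_0 = s, sigma_i = G2(x_{i-1}), x_i = G1(x_{i-1})."""
    out, x = [], list(s)
    for _ in range(ell):
        out.append(G2(x))
        x = G1(x)
    return out

print(G_prime([1, 0, 1, 1], 12))   # 12 output bits from a 4-bit seed
```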

Proof Sketch: The theorem is proven using the hybrid technique (cf. [ref, Sec.]): one considers the hybrid distributions H^i_k, for i = 0, ..., ℓ'(k), defined by
$$H^i_k \;:=\; \bigl(U^{(1)}_i \,,\, P_{\ell'(k)-i}(U^{(2)}_k)\bigr),$$
where U^{(1)}_i and U^{(2)}_k are independent uniform distributions (over {0,1}^i and {0,1}^k, respectively), and P_j(x) denotes the j-bit long prefix of G'(x). The extreme hybrids correspond to G'(U_k) and U_{ℓ'(k)}, whereas distinguishability of neighboring hybrids can be worked into distinguishability of G(U_k) and U_{k+1}. Loosely speaking, suppose one could distinguish H^i_k from H^{i+1}_k. Then, using P_j(s) = (G_2(s), P_{j-1}(G_1(s))) for j ≥ 1, this means that one can distinguish
$$H^i_k \equiv \bigl(U^{(1)}_i,\ G_2(U^{(2)}_k),\ P_{\ell'(k)-i-1}(G_1(U^{(2)}_k))\bigr) \quad\mbox{from}\quad H^{i+1}_k \equiv \bigl(U^{(1)}_i,\ U^{(1')}_1,\ P_{\ell'(k)-i-1}(U^{(2')}_k)\bigr).$$
Incorporating the generation of U^{(1)}_i and the evaluation of P_{\ell'(k)-i-1} into the distinguisher, one could distinguish (G_1(U^{(2)}_k), G_2(U^{(2)}_k)) ≡ G(U_k) from (U^{(2')}_k, U^{(1')}_1) ≡ U_{k+1}, in contradiction to the pseudorandomness of G. For details see [ref, Sec.].

How to Construct Pseudorandom Generators

The known constructions transform computational difficulty, in the form of one-way functions (defined below), into pseudorandom generators. Loosely speaking, a polynomial-time computable function is called one-way if any efficient algorithm can invert it only with negligible success probability. For simplicity, we consider only length-preserving one-way functions.

Definition (one-way function): A one-way function, f, is a polynomial-time computable function such that for every probabilistic polynomial-time algorithm A, every positive polynomial p, and all sufficiently large n's,
$$\Pr_{x \sim U_n}\bigl[\, A(f(x)) \in f^{-1}(f(x)) \,\bigr] \;<\; \frac{1}{p(n)},$$
where U_n denotes the uniform distribution over {0,1}^n, and x ∼ X means that x is distributed according to X.

Popular candidates for one-way functions are based on the conjectured intractability of Integer Factorization, the Discrete Logarithm Problem, and decoding of random linear codes. The infeasibility of inverting f yields a weak notion of unpredictability: let b_i(x) denote the i-th bit of x; then, for every probabilistic polynomial-time algorithm A (and sufficiently large n), it must be the case that Pr_{i,x}[A(i, f(x)) = b_i(x)] < 1 - 1/(2n), where the probability is taken uniformly over i in {1,...,n} and x in {0,1}^n. A stronger (and in fact the strongest possible) notion of unpredictability is that of a hard-core predicate. Loosely speaking, a polynomial-time computable predicate b is called a hard-core of a function f if every efficient algorithm, given f(x), can guess b(x) only with success probability that is negligibly better than half.

Definition (hard-core predicate): A polynomial-time computable predicate b : {0,1}^* -> {0,1} is called a hard-core of a function f if for every probabilistic polynomial-time algorithm A, every positive polynomial p(.), and all sufficiently large n's,

    Pr_{x~U_n} [ A(f(x)) = b(x) ]  <  1/2 + 1/p(n).

Clearly, if b is a hard-core of a 1-1 polynomial-time computable function f, then f must be one-way. It turns out that any one-way function can be slightly modified so that it has a hard-core predicate.

Theorem (A generic hard-core): Let f be an arbitrary one-way function, and let g be defined by g(x, r) def= (f(x), r), where |x| = |r|. Let b(x, r) denote the inner-product mod 2 of the binary vectors x and r. Then the predicate b is a hard-core of the function g.

See proof in Apdx C [...] or in Sec. [...]. Finally, we get to the construction of pseudorandom generators.

Theorem (A simple construction of pseudorandom generators): Let b be a hard-core predicate of a polynomial-time computable 1-1 function f. Then, G(s) def= f(s) b(s) is a pseudorandom generator.

Proof Sketch: Clearly, the |s|-bit long prefix of G(s) is uniformly distributed (since f is 1-1 and onto {0,1}^{|s|}). Hence, the proof boils down to showing that distinguishing f(s)b(s) from f(s)sigma, where sigma is a random bit, yields a contradiction to the hypothesis that b is a hard-core of f (i.e., that b(s) is unpredictable from f(s)). Intuitively, such a distinguisher also distinguishes f(s)b(s) from f(s)(1-b(s)), and so yields an algorithm for predicting b(s) based on f(s).

In a sense, the key point in the proof of the above theorem is showing that the (obvious by construction) unpredictability of the output of G implies its pseudorandomness. The fact that (next-bit) unpredictability and pseudorandomness are equivalent in general is proven explicitly in Sec. [...].

(Footnote: Functions that are not 1-1 may have hard-core predicates of an information-theoretic nature, but these are of no use to us here. For example, for sigma in {0,1}, the function f(sigma, x) def= (0, f'(x)) has an information-theoretic hard-core predicate b(sigma, x) = sigma.)
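The following Python sketch (an illustration, not part of the notes) puts the last two theorems together: the inner-product predicate b(x, r) and the one-bit-stretch generator G(s) = g(s) b(s) for g(x, r) = (f(x), r). The function toy_f stands in for a length-preserving 1-1 one-way function; it is not one-way and serves only to show how the pieces fit.

    def inner_product_bit(x_bits, r_bits):
        """b(x, r): inner product mod 2 of the bit vectors x and r (the generic hard-core)."""
        return sum(xi & ri for xi, ri in zip(x_bits, r_bits)) % 2

    def toy_f(bits):
        """Stand-in for a 1-1 length-preserving function (a cyclic shift); NOT one-way."""
        return bits[1:] + bits[:1]

    def G(seed_bits):
        """G(s) = g(s) b(s), where s = (x, r), g(x, r) = (f(x), r), and b is the inner product."""
        half = len(seed_bits) // 2
        x, r = seed_bits[:half], seed_bits[half:]
        return toy_f(x) + r + [inner_product_bit(x, r)]

    print(G([1, 0, 1, 1, 0, 1]))   # a 6-bit seed yields a 7-bit output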

A general condition for the existence of pseudorandom generators. Recall that, given any 1-1 one-way function, we can easily construct a pseudorandom generator. Actually, the 1-1 requirement may be dropped, but the currently known construction (for the general case) is quite complex. Still, we do have:

Theorem (On the existence of pseudorandom generators): Pseudorandom generators exist if and only if one-way functions exist.

To show that the existence of pseudorandom generators implies the existence of one-way functions, consider a pseudorandom generator G with stretch function l(k) = 2k. For x, y in {0,1}^k, define f(x, y) def= G(x), so that f is polynomial-time computable (and length-preserving). It must be that f is one-way, or else one could distinguish G(U_k) from U_{2k} by trying to invert f and checking whether the result is correct: inverting f on its range distribution refers to experimenting with the distribution G(U_k), whereas the probability that U_{2k} has an inverse under f is negligible.

The interesting direction is the construction of pseudorandom generators based on any one-way function. In general (when f may not be 1-1), the ensemble f(U_k) may not be pseudorandom, and so Construction [...] (i.e., G(s) = f(s)b(s), where b is a hard-core of f) cannot be used directly. Thus, one idea is to hash f(U_k) to an almost uniform string of length related to its entropy, using Universal Hash Functions. (This is done after guaranteeing that the logarithm of the probability mass of a value of f(U_k) is typically close to the entropy of f(U_k).) But hashing f(U_k) down to length comparable to the entropy means shrinking the length of the output to, say, k' < k. This foils the entire point of stretching the k-bit seed. Thus, a second idea is to compensate for the k - k' loss by extracting these many bits from the seed U_k itself. This is done by hashing U_k, and the point is that the (k - k')-bit long hash value does not make the inverting task any easier. Implementing these ideas turns out to be more difficult than it seems, and indeed an alternative construction would be most appreciated.

Pseudorandom Functions

Pseudorandom generators allow one to efficiently generate long pseudorandom sequences from short random seeds. Pseudorandom functions (defined below) are even more powerful: they allow efficient direct access to a huge pseudorandom sequence, which is infeasible to scan bit-by-bit. Put in other words, pseudorandom functions can replace truly random functions in any efficient application (e.g., most notably in cryptography). That is, pseudorandom functions are indistinguishable from random functions by efficient machines that may obtain the function values at arguments of their choice. Such machines are called oracle machines, and if M is such a machine and f is a function, then M^f(x) denotes the computation of M on input x when M's queries are answered by the function f.

Definition (pseudorandom functions): A pseudorandom function (ensemble), with length parameters l_D, l_R : N -> N, is a collection of functions F def= { f_s : {0,1}^{l_D(|s|)} -> {0,1}^{l_R(|s|)} }_{s in {0,1}^*} satisfying

(efficient evaluation) There exists an efficient (deterministic) algorithm which, given a seed s and an l_D(|s|)-bit argument x, returns the l_R(|s|)-bit long value f_s(x).

(Footnote: Specifically, given an arbitrary one-way function f', one first constructs f by taking a "direct product" of sufficiently many copies of f'. For example, for x_1, ..., x_{k^2} in {0,1}^k, we let f(x_1, ..., x_{k^2}) def= f'(x_1), ..., f'(x_{k^2}).)

(pseudorandomness) For every probabilistic polynomial-time oracle machine M, every positive polynomial p, and all sufficiently large n's,

    | Pr_{f~F_n} [ M^f(1^n) = 1 ]  -  Pr_{rho~R_n} [ M^rho(1^n) = 1 ] |  <  1/p(n),

where F_n denotes the distribution on F obtained by selecting s uniformly in {0,1}^n, and R_n denotes the uniform distribution over all functions mapping {0,1}^{l_D(n)} to {0,1}^{l_R(n)}.

Suppose, for simplicity, that l_D(n) = n and l_R(n) = 1. Then a function uniformly selected among the 2^n functions (of a pseudorandom ensemble) presents an input-output behavior which is indistinguishable in poly(n)-time from that of a function selected at random among all the 2^{2^n} Boolean functions mapping {0,1}^n to {0,1}. Contrast this with the 2^n pseudorandom sequences, produced by a pseudorandom generator, which are computationally indistinguishable from a sequence selected uniformly among all the 2^{poly(n)} many sequences. Still, pseudorandom functions can be constructed from any pseudorandom generator.

Theorem (How to construct pseudorandom functions): Let G be a pseudorandom generator with stretching function l(n) = 2n. Let G_0(s) (resp., G_1(s)) denote the first (resp., last) |s| bits in G(s), and let

    G_{sigma_{|s|} ... sigma_2 sigma_1}(s)  def=  G_{sigma_{|s|}}( ... G_{sigma_2}(G_{sigma_1}(s)) ... ).

Then the function ensemble { f_s : {0,1}^{|s|} -> {0,1}^{|s|} }_{s in {0,1}^*}, where f_s(x) def= G_x(s), is pseudorandom with length parameters l_D(n) = l_R(n) = n.

The above construction can be easily adapted to any (polynomially bounded) length parameters l_D, l_R : N -> N.
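A minimal Python sketch of this construction follows (an illustration, not part of the notes). It assumes a length-doubling generator; double_G below only has the right lengths and is not pseudorandom. Per the theorem, the innermost (rightmost) bit of the argument x is applied to the seed first.

    def double_G(bits):
        """Stand-in for a length-doubling generator G: {0,1}^n -> {0,1}^{2n}; NOT pseudorandom."""
        return bits + [1 - b for b in bits]

    def G_sigma(sigma, s):
        """G_0(s) = first |s| bits of G(s); G_1(s) = last |s| bits of G(s)."""
        out = double_G(s)
        return out[:len(s)] if sigma == 0 else out[len(s):]

    def prf(seed, x_bits):
        """f_seed(x) = G_x(seed): apply G_sigma for each bit of x, rightmost bit first."""
        value = list(seed)
        for sigma in reversed(x_bits):
            value = G_sigma(sigma, value)
        return value

    print(prf([0, 1, 1, 0], [1, 0, 1, 1]))   # an |s|-bit value for an |s|-bit argument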

Proof Sketch: The proof uses the hybrid technique. The i-th hybrid, H^i_n, is a function ensemble consisting of 2^{2^i * n} functions {0,1}^n -> {0,1}^n, each defined by 2^i random n-bit strings, denoted <s_alpha>_{alpha in {0,1}^i}. The value of such a function at x = alpha beta, with |alpha| = i, is G_beta(s_alpha). The extreme hybrids correspond to our indistinguishability claim (i.e., H^0_n == f_{U_n} and H^n_n == R_n), and neighboring hybrids correspond to our indistinguishability hypothesis (specifically, to the indistinguishability of G(U_n) and U_{2n} under multiple samples).

The Applicability of Pseudorandom Generators

Randomness is playing an increasingly important role in computation: it is frequently used in the design of sequential, parallel and distributed algorithms, and it is of course central to cryptography. Whereas it is convenient to design such algorithms making free use of randomness, it is also desirable to minimize the usage of randomness in real implementations. Thus, pseudorandom generators (as defined above) are a key ingredient in an "algorithmic toolbox": they provide an automatic compiler of programs written with free usage of randomness into programs which make an economical use of randomness. In the context of complexity theory, this yields results of the following type:

Theorem (Derandomization of BPP): If there exist non-uniformly strong pseudorandom generators, then BPP is contained in the intersection over all eps > 0 of Dtime(t_eps), where t_eps(n) def= 2^{n^eps}.

Proof Sketch: Given any L in BPP and any eps > 0, we let A denote the decision procedure for L and G denote a pseudorandom generator stretching n^eps-bit long seeds into poly(n)-long sequences (to be used by A on inputs of length n). Combining A and G, we obtain an algorithm A' = A_G that, on input x, first produces a poly(|x|)-long sequence by applying G to a uniformly selected |x|^eps-bit long string, and next runs A using the resulting sequence as a random-tape. We note that A and A' may differ in their decision on at most finitely many inputs (or else we can incorporate such inputs, together with A, into a family of polynomial-size circuits which distinguishes G(U_{n^eps}) from U_{poly(n)}). Incorporating these finitely many inputs into A', and, more importantly, emulating A' on each of the 2^{n^eps} possible random choices (i.e., seeds to G), we obtain a deterministic algorithm A'' as required.

We comment that stronger results regarding derandomization of BPP are presented in Section [...].

Comment: Indeed, "pseudorandom number generators" have appeared with the first computers. However, typical implementations use generators which are not pseudorandom according to the above definition. Instead, at best, these generators are shown to pass some ad-hoc statistical tests (cf. [...]). However, the fact that a "pseudorandom number generator" passes some statistical tests does not mean that it will pass a new test, nor that it is good for a future (untested) application. Furthermore, the approach of subjecting the generator to some ad-hoc tests fails to provide general results of the type stated above (i.e., of the form "for all practical purposes, using the output of the generator is as good as using truly unbiased coin tosses"). In contrast, the approach encompassed in Definition [...] aims at such generality, and in fact is tailored to obtain it: the notion of computational indistinguishability, which underlies Definition [...], covers all possible efficient applications, postulating that for all of them pseudorandom sequences are as good as truly random ones.

The Intellectual Contents of Pseudorandom Generators

We shortly discuss some intellectual aspects of pseudorandom generators as defined above.

Behavioristic versus Ontological. Our definition of pseudorandom generators is based on the notion of computational indistinguishability. The behavioristic nature of the latter notion is best demonstrated by confronting it with the Kolmogorov-Chaitin approach to randomness. Loosely speaking, a string is Kolmogorov-random if its length equals the length of the shortest program producing it. This shortest program may be considered the "true explanation" of the phenomenon described by the string. A Kolmogorov-random string is thus a string which does not have a substantially simpler (i.e., shorter) explanation than itself. Considering the simplest explanation of a phenomenon may be viewed as an ontological approach. In contrast, considering the effect of phenomena on an observer, as underlying the definition of pseudorandomness, is a behavioristic approach. Furthermore, there exist probability distributions which are not uniform (and are not even statistically close to a uniform distribution) that nevertheless are indistinguishable from a uniform distribution by any efficient method. Thus, distributions which are ontologically very different are considered equivalent from the behavioristic point of view taken in the definitions above.

A relativistic view of randomness. Pseudorandomness is defined above in terms of its observer: it is a distribution which cannot be told apart from a uniform distribution by any efficient (i.e., polynomial-time) observer. However, pseudorandom sequences may be distinguished from random ones by infinitely powerful machines (not at our disposal). Specifically, an exponential-time machine can easily distinguish the output of a pseudorandom generator from a uniformly selected string of the same length (e.g., just by trying all possible seeds). Thus, pseudorandomness is subjective to the abilities of the observer.

Randomness and Computational Difficulty. Pseudorandomness and computational difficulty play dual roles: the definition of pseudorandomness relies on the fact that putting computational restrictions on the observer gives rise to distributions which are not uniform and still cannot be distinguished from uniform. Furthermore, the constructions of pseudorandom generators rely on conjectures regarding computational difficulty (i.e., the existence of one-way functions), and this is inevitable: given a pseudorandom generator, we can construct one-way functions. Thus, (non-trivial) pseudorandomness and computational hardness can be converted back and forth.

Derandomization of BPP

The above discussion has focused mainly on one aspect of the pseudorandomness question: the resources or type of the observer (or potential distinguisher). Another important question is whether such pseudorandom sequences can be generated from much shorter ones, and at what cost (or complexity). So far, we have required the generation process to be at least as efficient as the efficiency limitations of the distinguisher. Indeed, this seems fair and natural. Allowing the generator to be more complex (i.e., to use more time or space resources) than the distinguisher seems unfair, but still yields interesting consequences in the context of trying to "derandomize" randomized complexity classes. For example, as we shall see, one may benefit from considering generators that work in time exponential in the length of their seed.

In the context of derandomization, we typically lose nothing by being more liberal and allowing exponential-time generators. To see why, consider a typical derandomization argument, proceeding in two steps (cf. the proof of Theorem [...]): first, one replaces the true randomness of the algorithm by pseudorandom sequences generated from much shorter seeds, and next one deterministically scans all possible seeds and looks for the most frequent behavior of the modified algorithm. Thus, in such a case, the deterministic complexity is anyhow exponential in the seed length. The question is whether we gain anything by allowing exponential-time generators. The answer seems to be positive, because with more time at their disposal the generators can perform better (e.g., output longer sequences and/or be based on weaker intractability assumptions). For example:

Theorem: Let E def= the union over all c of Dtime(t_c), with t_c(n) = 2^{cn}. Suppose that there exists a language L in E and a constant eps > 0 such that, for all but finitely many n's, any circuit C_n which correctly decides L on {0,1}^n has size at least 2^{eps n}. Then BPP = P.

Indeed, Theorem [...] is related to Theorem [...], but the pseudorandom generators underlying their proofs are very different.

Proof Sketch: Underlying the proof is a construction of an adequate pseudorandom generator. This generator operates in exponential time and generates an exponentially long output that fools circuits of size that is a fixed polynomial in the length of the output (or, equivalently, a smaller exponential in the seed length). That is, for some constant b > 0 and all k's, the generator, running in time 2^{O(k)}, stretches k-bit seeds into sequences of length 2^{bk} that cannot be distinguished from truly random sequences by any circuit of size 2^{bk}. (Note that such fooling is restricted to circuits smaller than the generator's running time, because a 2^{O(k)}-time machine can easily distinguish the generated sequences from random ones by trying all possible k-bit seeds.) The derandomization of BPP proceeds by setting the seed length to be logarithmic in the input length, and utilizing the above generator.

(Footnote: In fact, we have required the generator to be more efficient than the distinguisher: the former was required to be a fixed polynomial-time algorithm, whereas the latter was allowed to be any algorithm with polynomial running time.)

Specifically, let A be a randomized p(.)-time algorithm that we wish to derandomize. On input x, we set k def= (1/b) * log_2 p(|x|) = O(log |x|), and scan all possible k-bit seeds. For each seed, we produce the corresponding 2^{bk} = p(|x|)-bit sequence, use it as a random-tape to A invoked on input x, and record the output of A. Each such invocation takes time 2^{O(k)} + p(|x|) = poly(|x|), and we have 2^k = poly(|x|) many invocations. We output the most frequent output obtained in all 2^k invocations of A(x).
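The procedure just described is purely mechanical; the Python sketch below spells it out. The decision procedure A(x, coins) and the seed-expanding generator expand(seed) are hypothetical stand-ins for the objects assumed in the proof.

    from itertools import product

    def derandomize(A, expand, x, seed_len):
        """Run A(x, expand(seed)) on every seed in {0,1}^seed_len and return the majority answer."""
        yes = 0
        for seed in product([0, 1], repeat=seed_len):
            coins = expand(list(seed))          # a p(|x|)-bit pseudorandom sequence
            yes += 1 if A(x, coins) else 0
        return 2 * yes > 2 ** seed_len          # strict majority over the 2^k invocations

    # Toy usage: A looks at one coin only, expand just repeats the seed.
    toy_A = lambda x, coins: (len(x) + coins[0]) % 2 == 0
    toy_expand = lambda seed: seed * 4
    print(derandomize(toy_A, toy_expand, "abcd", 3))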

We now turn to the construction of the generator. The construction utilizes a predicate computable in exponential time but unpredictable, even to within a particular exponential advantage, by any circuit family of a particular exponential size. (One main ingredient of the proof is supplying such a predicate, given the hypothesis, but we omit this part here.) Given such a predicate, the generator works by evaluating the predicate on exponentially many subsequences of the bits of the seed, chosen so that the intersection of any two of the corresponding subsets is relatively small. That is, for eps as in the hypothesis and a suitable constant b > 0, given a k-bit seed the generator constructs (in time 2^{O(k)}) 2^{bk} subsets of [k] def= {1, ..., k}, each of size k' = Omega(k), such that the intersection of every two sets has size at most k'' (a small fraction of k'), and it evaluates the predicate on the projection of the seed bits determined by each of these subsets.

The above generator fools circuits of the stated size, even when these circuits are presented with the seed as auxiliary input. (These circuits are smaller than the running time of the generator, and so they cannot just evaluate the generator on the given seed.) The proof that the generator fools such circuits refers to the characterization of pseudorandom sequences as unpredictable ones. Thus, one proves that the next bit in the generator's output cannot be predicted given all previous bits (as well as the seed). Assuming that a small circuit can predict the next bit of the generator, we construct a circuit for predicting the hard predicate. The new circuit incorporates the best possible completion of its input into a seed for the generator (i.e., the seed bits not in the specific subset are fixed in the best way). The key observation is that all other bits in the output of the generator depend only on a small fraction of the input bits (recall the small-intersection clause above), and so circuits for computing these other bits have relatively small size (and so can be incorporated into the new circuit). Using all these circuits, the new circuit forms the adequate input for the next-bit predicting circuit, and outputs whatever the latter circuit does.

Specifically, using a circuit C that predicts the (i+1)-st bit of the generator (invoked on k-bit seeds), we describe a circuit for approximating the value of the predicate on inputs of length k'. Recall that C is given the first i bits output by the generator, as well as the k-bit seed, and predicts the said bit with advantage, say, 2^{-bk}. We first fix the best setting for the seed bits that are not in the (i+1)-st subset. (Certainly, C's prediction for a random setting of the bits of the (i+1)-st subset, and a best setting of the rest, is at least as good as its prediction on a random seed.) Next, for each of the first i bits in the generator's output, we consider a circuit for computing the value of that bit as a function of the undetermined seed bits (those of the (i+1)-st subset) and the fixed bits of the rest of the seed. Since the number of undetermined bits on which each such output bit depends is at most k'', each such circuit has size at most 2^{k''} << 2^{bk}. Incorporating these i circuits into C, we obtain a circuit that predicts the (i+1)-st output bit when only given the bits of the (i+1)-st subset. In other words, the resulting circuit approximates the predicate on random inputs of length k' with correlation at least 2^{-bk} >> 2^{-eps k'} (for eps as in the footnote below). The size of the resulting circuit is at most size(C) + i * 2^{k''} << 2^{eps k'}. This contradicts the hypothesis regarding the predicate.

(Footnote: For future reference, say that for some constant eps > 0 no circuit of size 2^{eps k'} can guess the value of the predicate on a random k'-bit input with success probability higher than 1/2 + 2^{-eps k'}.)

(Footnote: Thus, this generator is only moderately more complex than the distinguisher: viewed in terms of its output, the generator works in time polynomial in the length of the output, whereas the output fools circuits of size which is a (smaller) polynomial in the length of the output.)

Recall that we have only showed how to use a predicate that is hard to approximate in order to obtain the desired pseudorandom generator. To complete the proof sketch, one has to show how the existence of predicates in E that are hard in the (adequate) worst-case sense implies the existence of predicates in E that are hard to approximate in the adequate sense. This part is too complex to be treated here, and the interested reader is referred to [...].

On weaker notions of computational indistinguishability

Whenever the aim is to replace random sequences utilized by an algorithm with pseudorandom sequences, one may try to capitalize on knowledge of the target algorithm. Above, we have merely used the fact that the target algorithm runs in polynomial time. However, if we know, for example, that the algorithm uses very little work-space, then we may be able to do better. The same holds if we know that the analysis of the algorithm only depends on some specific properties of the random sequence it uses (e.g., pairwise independence of its elements). In general, weaker notions of computational indistinguishability, such as fooling space-bounded algorithms, constant-depth circuits, and even specific tests (e.g., testing pairwise independence of the sequence), arise naturally. Generators producing sequences that fool such tests are useful in a variety of applications: if the application utilizes randomness in a restricted way, then feeding it with sequences of low randomness quality may do. Needless to say, we advocate a rigorous formulation of the characteristics of such applications and a rigorous construction of generators that fool the type of tests that emerge.

In the context of a course on complexity theory, it is most appropriate to mention the pseudorandom generators that fool space-bounded algorithms having on-line access to the inspected sequence (which is analogous to the on-line access of randomized bounded-space machines to their random-tape). Such generators can be constructed without relying on any intractability assumptions, and yield strong derandomization results. Two such famous results are captured by the following theorems.

Theorem: BPL is contained in SC, where BPL (which contains RL) is the class of sets recognized by two-sided error log-space (polynomial-time) machines, and SC is the class of sets recognized by deterministic polynomial-time algorithms that use only a polylogarithmic amount of space.

Theorem: Suppose that L can be decided by a probabilistic polynomial-time algorithm of space complexity s(.). Then L can be decided by a probabilistic polynomial-time algorithm of space complexity O(s) and randomness complexity O(s'), where s'(n) def= max(s(n), log n).

Analogous results hold for search problems. The pseudorandom generator underlying Theorem [...] uses a logarithmic number of hash functions, each having a logarithmic description length, together with a logarithmically long string, to define a polynomially long sequence. The seed of the generator consists of the descriptions of the hash functions and the additional string, but for a fixed log-space distinguisher one can determine a sequence of hash functions for which the distinguisher is fooled when one only varies the additional (logarithmically long) string. The pseudorandom generator underlying Theorem [...] uses a "randomness extractor", which is a more sophisticated construct that has been the focus of extensive research in the recent decade (see [...]).

The actual notes that were used

The general paradigm of pseudorandom generators refers to a deterministic program that stretches random seeds into longer sequences that look random to a specified set of resource-bounded observers. Thus, pseudorandomness is not generated deterministically, but rather from a short random seed; the above formalism is aimed at making explicit the (relatively small) amount of randomness used in the generation process. That is, we refer to a deterministic function (or a family of functions, one per each value of k) of the form G : {0,1}^k -> {0,1}^{l(k)}, satisfying three properties:

1. Stretching: At the very least, l(k) > k for every k.

2. Pseudorandomness: For every observer D taken from an adequate class (which depends on the setting), D cannot distinguish a random output of G from a truly random string of the same length. That is,

    | Pr_{s in {0,1}^k} [ D(G(s)) = 1 ]  -  Pr_{r in {0,1}^{l(k)}} [ D(r) = 1 ] |   is negligible.

That is, D (as a potential distinguisher) fails to do its job in a very strong sense. Throughout this lecture we will focus on potential distinguishers that are implementable by polynomial-size circuits (i.e., a non-uniform family of circuits of size polynomial in the length of their input, i.e., in l(k)).

3. The complexity of generation: This will vary from setting to setting. We mention two important cases:

(a) The archetypical case: The natural requirement is that G be a polynomial-time algorithm. Using such a generator allows one to shrink the amount of randomness used in any efficient application. Note that in this case the stretch is polynomially bounded, and so the shrinkage obtained is polynomial (i.e., if the original application used m random bits, then we can typically modify it to use only m^eps random bits, for a constant eps < 1 determined by the stretch).

We stress that in this case the distinguisher (which may use any probabilistic polynomial-time procedure) is more complex than the generator (which has running time equal to a specific fixed polynomial).

(b) The case of derandomization: As we'll see below, in the context of derandomization we anyhow scan all possible seeds. Thus, derandomization is always exponential in the seed length, and so we gain nothing by requiring that the generation process (i.e., G) be polynomial-time.

(Footnote: Indeed, such a distinguisher may incorporate the output of G on a specific seed (or on a few seeds), but the probability that this seed will be chosen for the left-hand side of Eq. [...] is negligible.)

Instead, we may allow G to work in exponential time (i.e., time that is exponential in the seed length). In this case the stretch is only exponentially bounded.

We stress that in this case the generator may be more complex than the distinguisher. Specifically, whereas the generator is allowed time exponential in the seed length, this cannot possibly be allowed for the distinguisher (or else the latter may try all seeds and apply the generator to each such seed).

We stress that the archetypical case yields a general-purpose generator that can be used in any application. In particular, it yields a "compiler" for saving randomness in any probabilistic polynomial-time algorithm, and is the type of thing needed in cryptography (where the adversary/distinguisher may be more complex than the legitimate strategy that uses the generator). In contrast, the type of generators used in the case of derandomization are sometimes good only with respect to the specific algorithm being derandomized (or a specific resource bound).

To clarify the above, let us spell out how one typically uses a pseudorandom generator. Let A be a probabilistic polynomial-time algorithm, say running in time n^2, where n denotes its input length. Let G be a pseudorandom generator of the first type (i.e., G is polynomial-time computable), say with stretch function l(k) = k^2. We derive a new algorithm A' by replacing the randomness of A with randomness generated out of a (random) seed of G. That is, let A(x, r) denote the output of A on input x and randomness r in {0,1}^{|x|^2} (recall that A(x) makes at most |x|^2 steps). Then, on input x and randomness s in {0,1}^{|x|}, algorithm A' computes G(s) and outputs A(x, G(s)). Note that A' runs in polynomial time (because so do A and G). We claim that A' performs as well as A, while using significantly fewer random bits. (The proof is left as an exercise; hint: use the fact that inputs on which A' differs significantly from A can be hard-wired into a distinguishing circuit.) Note that, using an adequate pseudorandom generator, we can shrink the amount of randomness used by any probabilistic polynomial-time algorithm to n^eps, for any constant eps > 0.
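The "compiler" described in this paragraph is a one-liner once A and G are in hand; the Python sketch below makes the bookkeeping explicit. Both toy_A and toy_G are hypothetical stand-ins (toy_G merely has stretch k^2 and is not pseudorandom).

    import random

    def toy_G(seed_bits):
        """Stand-in generator with stretch l(k) = k^2; NOT pseudorandom."""
        k = len(seed_bits)
        return [(seed_bits[i % k] + i // k) % 2 for i in range(k * k)]

    def toy_A(x, coins):
        """Stand-in randomized algorithm that uses |x|^2 coins."""
        return (len(x) + sum(coins)) % 2 == 0

    def A_prime(x):
        """A'(x): pick an |x|-bit seed s at random and run A on input x with coins G(s)."""
        seed = [random.randint(0, 1) for _ in range(len(x))]
        return toy_A(x, toy_G(seed))

    print(A_prime("abcd"))   # uses 4 random bits where A itself would have used 16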

So far, we have only shrunk the amount of randomness used by probabilistic polynomial-time algorithms. Full derandomization is obtained by scanning all possible random-tapes used by the resulting algorithm, or in other words by scanning all possible seeds for the generator. That is, given A' as above, we derive a deterministic algorithm A'' by scanning all possible s's and outputting, on input x, the majority value of A'(x, s), taken over all relevant s's. If we use a generator G of running time t_G and stretch l with l(k) >= time_A(n), then the running time of A'' on an n-bit input will be

    time_{A''}(n)  =  2^k * ( t_G(k) + time_A(n) ),   where k = l^{-1}(time_A(n)).

For l that is exponential (i.e., l(k) = 2^{Omega(k)}), whenever A is polynomial-time and G is exponential-time (i.e., t_G(k) = 2^{O(k)}), we obtain a polynomial-time algorithm A'' (because k = l^{-1}(poly(n)) = O(log n), and so 2^k = poly(n) and t_G(k) = 2^{O(log n)} = poly(n)). Let us take a closer look at what we need in order to obtain such a result. We need a generator G : {0,1}^k -> {0,1}^{l(k)} that runs in at most exponential time (i.e., t_G(k) = 2^{O(k)}) and stretches its seed by an exponential amount (i.e., l(k) = 2^{Omega(k)}), such that its outputs are indistinguishable from random l(k)-bit long sequences by circuits of size, say, l(k)^2 (or even l(k)). Note that the complexity of the distinguisher circuit is dominated by the complexity of A, but we have set l(k) >= time_A(n).

The question is whether such generators exist. The answer depends on the existence of sufficiently hard problems. Note that this is not surprising, because the definition of pseudorandomness actually refers to a problem (i.e., that of distinguishing) that should be hard (although it is solvable when waiving resource bounds, because the pseudorandom sequences are not truly random). Indeed, we have:

Theorem (Theorem [...] restated): Suppose that there exists a predicate f : {0,1}^* -> {0,1} that is computable in exponential time, and a constant c > 0, such that for all but finitely many m's any circuit C_m that correctly computes f on {0,1}^m has size at least 2^{cm}. Then there exists a constant c' > 0 and an exponential-time generator G : {0,1}^k -> {0,1}^{l(k)} such that l(k) = 2^{c'k} and for every circuit C of size 2^{c'k} it holds that

    | Pr_{s in {0,1}^k} [ C(G(s)) = 1 ]  -  Pr_{r in {0,1}^{l(k)}} [ C(r) = 1 ] |  <  2^{-c'k}.

Note that c <= 1 must hold, or else the hypothesis cannot possibly hold (because a circuit of size m * 2^m, which is smaller than 2^{cm} for c > 1 and large m, may just incorporate the values of f on all m-bit strings). Exercise: Show that Theorem [...] implies Theorem [...] (i.e., if for some c > 0 the class E does not have size-2^{cn} circuits, then BPP = P). The proof of Theorem [...] consists of two steps:

1. Hardness amplification: Given f as in the hypothesis, we construct an exponential-time computable predicate f' that cannot be approximated on random m-bit inputs by 2^{c''m}-sized circuits, where c'' > 0 is a constant depending on c. Specifically, for any such circuit C it holds that

    Pr_{x in {0,1}^m} [ C(x) = f'(x) ]  <  1/2 + 2^{-c''m}.

That is, whereas f is only hard to compute in the worst case, f' is even hard to guess with significant advantage over the obvious random guess.

2. The actual construction: Average-case hardness of the latter type is naturally linked to pseudorandomness. Specifically, given f' as above, G_1(s) def= (s, f'(s)) is a pseudorandom generator, alas with pitiful stretch. However, our goal is to obtain exponential stretch rather than one-bit stretch. Clearly, we cannot just repeat the above (i.e., G(s) def= (s, f'(s), f'(s), ..., f'(s)) is clearly not a pseudorandom generator, regardless of how complex f' is). One natural idea is to apply f' to different parts of the seed; that is, to parts of the seed with small pairwise overlap. This is indeed the construction in use (a code sketch appears below). Let T_1, ..., T_{l(k)} be a collection of sets such that T_i is a subset of {1, ..., k}, |T_i| = k' = Omega(k), and |T_i intersect T_j| <= k'' for every i != j, where k'' is much smaller than k'. On input a k-bit seed s, the generator will construct such a collection (in exponential time; details omitted) and will output the sequence

    f'(s[T_1]) f'(s[T_2]) ... f'(s[T_{l(k)}]),

where s[T_i] is the projection of s on the coordinates in T_i.
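Here is the promised sketch, in Python (an illustration, not part of the notes). The two abstract ingredients are supplied as stand-ins: hard_f plays the role of the hard-to-approximate predicate f' (it is just a parity, with no hardness), and the design, i.e., the sets T_1, ..., T_m with small pairwise intersections, is given explicitly rather than constructed.

    def hard_f(bits):
        """Stand-in for the hard predicate f'; a parity, with no hardness whatsoever."""
        return sum(bits) % 2

    def nw_generator(seed, design, predicate=hard_f):
        """Output f'(s[T_1]) ... f'(s[T_m]), where s[T_i] projects the seed onto the set T_i."""
        return [predicate([seed[i] for i in sorted(T)]) for T in design]

    # A toy 6-bit seed and four 3-element sets whose pairwise intersections have size 1.
    toy_design = [{0, 1, 2}, {0, 3, 4}, {1, 3, 5}, {2, 4, 5}]
    print(nw_generator([1, 0, 1, 1, 0, 0], toy_design))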

Observe that since f' is computable in exponential time, so is G, and that G has the desired stretch. The issue is to establish the pseudorandomness of G. An important theorem in that respect is the connection between pseudorandomness and unpredictability (i.e., the hardness of guessing the next bit in the output sequence when given all previous bits). Clearly, pseudorandomness implies unpredictability, because the ability to predict the next bit in the output of G yields the ability to distinguish G's output from a truly random sequence. However, we care about the opposite direction, i.e., that unpredictability implies pseudorandomness; or, put differently, that the ability to distinguish from random implies the ability to predict.

Exercise: Prove that G_1(s) def= (s, f'(s)) is indeed a pseudorandom generator.

Unpredictability implies pseudorandomness: Suppose that a circuit C can distinguish, with gap 1/l(k), between X_k (in our case, the output of G on a random k-bit seed) and the uniform distribution over {0,1}^{l(k)}. Consider, for i = 0, ..., l(k), the hybrid distributions H^i_k, where H^i_k consists of the first i bits of X_k augmented with an (l(k) - i)-bit long uniformly distributed string. Observe that H^{l(k)}_k == G(U_k) and H^0_k == U_{l(k)}, where U_m denotes the uniform distribution over {0,1}^m. Thus, although not designed for that purpose, there exists an i such that C distinguishes, with gap at least 1/l(k)^2, between H^i_k and H^{i+1}_k. On the other hand, H^i_k and H^{i+1}_k differ only in the distribution of the (i+1)-st bit, and so C can be easily converted into a predictor of the (i+1)-st bit of G(U_k). (Exercise: Fill in the details; a sketch of the calculation follows.)
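For concreteness, here is the standard averaging step behind the last two sentences, with a generic gap epsilon (a sketch of what the exercise asks for, not a full proof):

    % Telescoping over the hybrids H^0_k = U_{\ell(k)}, ..., H^{\ell(k)}_k = G(U_k):
    \varepsilon \;\le\; \Bigl|\Pr[C(H^{\ell(k)}_k)=1]-\Pr[C(H^{0}_k)=1]\Bigr|
      \;\le\; \sum_{i=0}^{\ell(k)-1}\Bigl|\Pr[C(H^{i+1}_k)=1]-\Pr[C(H^{i}_k)=1]\Bigr| ,
    % so some i exhibits a gap of at least \varepsilon/\ell(k).  Since H^{i}_k and H^{i+1}_k
    % differ only in their (i+1)-st bit, a predictor can feed C the first i bits of G(U_k),
    % a guess \sigma for the next bit, and fresh uniform bits for the rest, outputting \sigma
    % if C accepts and 1-\sigma otherwise; its advantage is \Omega(\varepsilon/\ell(k)).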

Predictability of G implies approximation of f': By the above, it suffices to prove that the output of G is unpredictable (with suitable parameters). Towards the contradiction, we consider a circuit C of size at most 2^{c'k} predicting the (i+1)-st bit of G(U_k). Using the definition of G, we have

    Pr_{s in {0,1}^k} [ C(f'(s[T_1]), ..., f'(s[T_i])) = f'(s[T_{i+1}]) ]  >=  1/2 + 1/l(k)^2.

For simplicity of notation, suppose that T_{i+1} = {1, ..., k'}, and write s = <x, s''>, where |x| = k'. Using an averaging argument (i.e., fixing the best s''), we infer that there exists a string s'' in {0,1}^{k-k'} such that

    Pr_{x in {0,1}^{k'}} [ C(f'(<x, s''>[T_1]), ..., f'(<x, s''>[T_i])) = f'(x) ]  >=  1/2 + 1/l(k)^2.

The key observation is that, for j <= i, the value of f'(<x, s''>[T_j]) depends only on at most k'' bits of x (i.e., the bits in positions T_j intersect T_{i+1}). Thus, there exists a circuit of size at most exp(k'') (which depends on the fixed s'') that, given x, computes f'(<x, s''>[T_j]) (e.g., by using a look-up table for the relevant bits of x). Combining all these circuits with C, we obtain a circuit C' which is only l(k) * exp(k'') bigger than C, such that Pr_{x in {0,1}^{k'}} [ C'(x) = f'(x) ] >= 1/2 + 1/l(k)^2. For a suitable setting of the constants (relating c, c', c'', k' and k''), we obtain a contradiction to the hypothesis regarding f': the circuit C' has size at most 2^{c'k} + l(k) * 2^{O(k'')} < 2^{c''k'} and approximates the value of f' on random k'-bit inputs with advantage 1/l(k)^2 > 2^{-c''k'}.

Lecture

Average-Case Complexity

Leonid Levin initiated a theory of average-case complexity. We provide an exposition of the basic definitions suggested by Levin, and discuss some of the considerations underlying these definitions. (The notes for this lecture were adapted from [...].)

Introduction

The average-case complexity of a problem is, in many cases, a more significant measure than its worst-case complexity. This has motivated the development of a rich area in algorithmic research: the probabilistic analysis of algorithms (cf. [...]). However, this line of research has so far been applicable only to specific algorithms, and with respect to specific (typically uniform) probability distributions.

The general question of average-case complexity was addressed for the first time by Levin [...]. Levin's work can be viewed as the basis for a theory of average NP-completeness, much the same way as Cook's and Levin's works are the basis for the theory of NP-completeness. Subsequent works have provided a few additional complete problems; other basic complexity issues, such as decision versus search, were studied in [...].

Levin's average-case complexity theory in a nutshell: An average-case complexity class consists of pairs, called distributional problems. Each such pair consists of a decision (resp., search) problem and a probability distribution on problem instances. We focus on the class DistNP def= <NP, P-computable>, defined by Levin, which is a distributional analogue of NP: it consists of NP decision problems coupled with distributions for which the accumulative measure is polynomial-time computable. That is, P-computable is the class of distributions for which there exists a polynomial-time algorithm that, on input x, computes the total probability of all strings y <= x. The easy distributional problems are those solvable in "average polynomial time" (a notion which, surprisingly, requires careful formulation). Reductions between distributional problems are defined in a way guaranteeing that if Pi_1 is reducible to Pi_2, and Pi_2 is solvable in average polynomial time, then so is Pi_1. Finally, it is shown that the class DistNP contains a complete problem.

Levin's average-case theory revisited: Levin's laconic presentation [...] hides the fact that several choices were made in the development of the average-case complexity theory. We discuss some of these choices here. Firstly, we stress that the motivation here is to provide a theory of efficient computation, rather than a theory of infeasible computation (e.g., as in cryptography); the two are not the same. Furthermore, we note that a theory of useful-for-cryptography infeasible computations does exist (cf., e.g., [...]). A key difference between the two theories is that in cryptography we need problems for which one may generate instance-solution pairs such that solving the problem, given only the instance, is hard. In the theory of average-case complexity considered below, we consider problems that are hard to solve, but we do not require an efficient procedure for generating hard (on the average) instances coupled with solutions.

Secondly, one has to admit that the class DistNP (specifically, the choice of distributions) is somewhat problematic. Indeed, P-computable distributions seem "simple", but it is not clear if they exhaust all natural "simple" distributions. A much wider class, which is easier to defend, is the class of all distributions having an efficient algorithm for generating instances (according to the distribution). One may argue that the instances of any problem we may need to solve are generated efficiently by some process, and so the latter class of P-samplable distributions suffices for our theory. Fortunately, it was shown [...] that any distributional problem that is complete for DistNP = <NP, P-computable> is also complete with respect to the class <NP, P-samplable>. Thus, in retrospect, Levin's choice only makes the theory stronger: it requires selecting complete distributional problems from the restricted class <NP, P-computable>, whereas hardness holds with respect to the wider class <NP, P-samplable>.

As hinted above, the definition of average polynomial time is less straightforward than one may expect. The obvious attempt at formulating this notion leads to fundamental problems which, in our opinion, deem it inadequate. (For a detailed discussion of this point, the reader is referred to the Appendix.) We believe that once the failure of the obvious attempt is understood, Levin's definition (presented below) does look natural.

Definitions and Notations

In this section we present the basic definitions underlying the theory of average-case complexity. (Most definitions originate from Levin's paper [...], but the reader is advised not to look there for further explanations and motivating discussions.)

For the sake of simplicity, we consider the standard lexicographic ordering of binary strings. (Any fixed efficient enumeration will do. An efficient enumeration is a 1-1 and onto mapping of strings to integers that can be computed and inverted in polynomial time.) By writing x < y we mean that the string x precedes y in lexicographic order, and y - 1 denotes the immediate predecessor of y. Also, we associate pairs, triples, etc. of binary strings with single binary strings in some standard manner (i.e., encoding).

Definition (Probability Distribution Function): A distribution function mu : {0,1}^* -> [0,1] is a non-decreasing function from strings to the unit interval that converges to one (i.e., mu(x) <= mu(y) for every x <= y, and lim_{x -> infinity} mu(x) = 1). The density function associated with the distribution function mu is denoted mu' and is defined by mu'(0) def= mu(0) and mu'(x) def= mu(x) - mu(x-1) for every x > 0.

Clearly, mu(x) = sum_{y <= x} mu'(y). For notational convenience, we often describe distribution functions converging to some constant c != 1. (In all the cases where we use this convention, it is easy to normalize the distribution so that it converges to one.) An important example is the uniform distribution function mu_0, defined by mu_0'(x) = (1/|x|^2) * 2^{-|x|}. (A minor modification that does converge to 1 is obtained by letting mu_0'(x) = (1/(|x|*(|x|+1))) * 2^{-|x|}.)

Definition (A Distributional Problem): A distributional decision problem (resp., distributional search problem) is a pair (D, mu) (resp., (S, mu)), where D : {0,1}^* -> {0,1} (resp., S is a subset of {0,1}^* x {0,1}^*) and mu : {0,1}^* -> [0,1] is a distribution function.

In the sequel we consider mainly decision problems. Similar formulations for search problems can be easily derived.

Distributional-NP

"Simple" distributions are identified with the P-computable ones. The importance of restricting attention to simple distributions (rather than allowing arbitrary ones) is demonstrated in Sec. [...]: essentially, making no such restriction would collapse the average-case theory to the standard worst-case theory.

Definition (P-computable): A distribution mu is in the class P-computable if there is a deterministic polynomial-time Turing machine that on input x outputs the binary expansion of mu(x) (i.e., the running time is polynomial in |x|).

It follows that the binary expansion of mu(x) has length polynomial in |x|. A necessary condition for distributions to be of interest is that they put noticeable probability weight on long strings (i.e., for some polynomial p and sufficiently big n, the probability weight assigned to n-bit strings should be at least 1/p(n)). Consider, to the contrary, the density function mu'(x) def= 2^{-3|x|}. An algorithm of running time t(x) = 2^{|x|} would be considered to have constant on-the-average running time w.r.t. this mu (as sum_x mu'(x) * t(x) = sum_n 2^n * 2^{-3n} * 2^n = sum_n 2^{-n} < infinity).

If the distribution function mu is in P-computable, then the density function mu' is computable in time polynomial in |x|. The converse, however, is false, unless P = NP. In spite of this remark, we usually present the density function, and leave it to the reader to verify that the corresponding distribution function is in P-computable.

We now present the class of distributional problems which corresponds to the traditional NP. (Most results in the literature refer to this class.)

Definition (The class DistNP): A distributional problem (D, mu) belongs to the class DistNP if D is an NP-predicate and mu is in P-computable. DistNP is also denoted <NP, P-computable>.

A wider class of distributions, denoted P-samplable, gives rise to a wider class of distributional NP problems, which was discussed in the introduction: a distribution mu is in the class P-samplable if there exists a polynomial p and a probabilistic algorithm A that outputs the string x with probability mu'(x) within p(|x|) steps. That is, elements in a P-samplable distribution are generated in time polynomial in their length. We comment that any P-computable distribution is P-samplable, whereas the converse is false, provided one-way functions exist. For a detailed discussion see [...].

Average Polynomial-Time

The following definitions regarding average polynomial time may seem obscure at first glance. It is important to point out that the naive formalizations of these definitions suffer from serious problems, such as not being closed under functional composition of algorithms, being model dependent, encoding dependent, etc. For a more detailed discussion see the Appendix.

Definition (Polynomial on the Average): A function f : {0,1}^* -> N is polynomial on the average with respect to a distribution mu if there exists a constant eps > 0 such that

    sum_{x in {0,1}^*}  mu'(x) * f(x)^eps / |x|   <  infinity.

The function l(x) = f(x)^eps is linear on the average w.r.t. mu.

Thus, a function is polynomial on the average if it is bounded by a polynomial in a function that is linear on the average. (In fact, the basic definition is that of a function that is linear on the average; see Def. [...].)
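As a quick sanity check of the definition (a worked example added here, not part of the original notes), take the normalized uniform distribution mu_0 defined earlier and any worst-case polynomial running time t(x) = |x|^c; choosing epsilon = 1/c gives

    \sum_{x} \mu_0'(x)\,\frac{t(x)^{1/c}}{|x|}
      \;=\; \sum_{x} \frac{2^{-|x|}}{|x|\,(|x|+1)}\cdot\frac{|x|}{|x|}
      \;=\; \sum_{n\ge 1} 2^{n}\cdot\frac{2^{-n}}{n\,(n+1)}
      \;=\; \sum_{n\ge 1}\frac{1}{n\,(n+1)} \;=\; 1 \;<\; \infty ,

so, as one would hope, every worst-case polynomial-time algorithm is also polynomial on the average with respect to mu_0.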

Definition (The class Average-P): A distributional problem (D, mu) is in the class Average-P if there exists an algorithm A solving D such that the running time of A is polynomial on the average with respect to the distribution mu.

We view the classes Average-P and DistNP as the average-case analogues of P and NP, respectively.

Reducibility between Distributional Problems

We now present definitions of average polynomial-time reductions of one distributional problem to another. Intuitively, such a reduction should be efficiently computable, yield a valid result, and "preserve" the probability distribution. The purpose of the last requirement is to ensure that the reduction does not map very likely instances of the first problem to rare instances of the second problem. Otherwise, having a polynomial-time-on-the-average algorithm for the second distributional problem does not necessarily yield such an algorithm for the first distributional problem. Following is a definition of randomized Turing reductions; definitions of deterministic and many-to-one reductions can be easily derived as special cases.

Definition (Randomized Turing Reductions): We say that the probabilistic oracle Turing machine M randomly reduces the distributional problem (D_1, mu_1) to the distributional problem (D_2, mu_2) if the following three conditions hold.

1. Efficiency: Machine M is polynomial-time on the average, taken over x with distribution mu_1 and the internal coin tosses of M with uniform probability distribution (i.e., letting t_M(x, r) denote the running time of M on input x and internal coin tosses r, there exists eps > 0 such that sum_{x,r} mu_1'(x) * rho'(r) * t_M(x, r)^eps / |x| < infinity, where rho' is the uniform distribution on the coin tosses r).

2. Validity: For every x in {0,1}^*,

    Prob[ M^{D_2}(x) = D_1(x) ]  >=  2/3,

where M^{D_2}(x) is the random variable (determined by M's internal coin tosses) which denotes the output of the oracle machine M on input x and access to an oracle for D_2.

3. Domination: There exists a constant c > 0 such that, for every y in {0,1}^*,

    mu_2'(y)  >=  (1/|y|^c) * sum_{x in {0,1}^*} Ask_M(x, y) * mu_1'(x),

where Ask_M(x, y) is the probability (taken over M's internal coin tosses) that machine M asks query y on input x.

In the definition of deterministic Turing reductions, M^{D_2}(x) is determined by x (rather than being a random variable) and Ask_M(x, y) is either 0 or 1 (rather than being an arbitrary rational in [0,1]). In the case of a many-to-one deterministic reduction, for every x we have Ask_M(x, y) = 1 for a unique y.

It can be proven that if (D_1, mu_1) is deterministically (resp., randomly) reducible to (D_2, mu_2), and if (D_2, mu_2) is solvable by a deterministic (resp., randomized) algorithm with running time polynomial on the average, then so is (D_1, mu_1).

Reductions are transitive in the special case in which they are honest; that is, on input x they ask queries of length at least |x|^eps, for some constant eps > 0. All known reductions have this property.

A Generic DistNP-Complete Problem

The following distributional version of Bounded Halting, denoted Pi_BH = (BH, mu_BH), is known to be DistNP-complete (see Section [...]).

Definition (distributional Bounded Halting):

Decision: BH(M, x, 1^k) = 1 iff there exists a computation of the non-deterministic machine M on input x which halts within k steps.

Distribution: The distribution mu_BH is defined in terms of its density function

    mu_BH'(M, x, 1^k)  def=  (1/(|M|^2 * 2^{|M|})) * (1/(|x|^2 * 2^{|x|})) * (1/k^2).

Note that mu_BH' is very different from the uniform distribution on binary strings (e.g., consider a relatively large k). Yet, as noted by Levin, one can easily modify Pi_BH so that it has a uniform distribution and remains DistNP-complete with respect to randomized reductions. (Hint: replace the unary time bound 1^k by a string of equal length, assigning each such string the same probability.)

DistNP-completeness of Pi_BH

The proof presented here is due to Gurevich [...]. (An alternative proof is implied by Levin's original paper [...].)

In the traditional theory of NP-completeness, the mere existence of complete problems is almost immediate; for example, it is very easy to show that Bounded Halting is NP-complete. In the case of distributional-NP, an analogous theorem is much harder to prove. The difficulty is that we have to reduce all DistNP problems (i.e., pairs consisting of decision problems and simple

(Footnote: Hint: Suppose that sum_x mu'(x) * t(x)^eps / |x| = O(1), and that for some constant c we wish to show that t(.)^c is polynomial on the average. Let S def= {x : t(x)^eps <= |x|^2}, and split the sum sum_x mu'(x) * (t(x)^c)^{eps/(2c)} / |x| according to whether x is in S or not. The sum over x in S is bounded by sum_x mu'(x) <= O(1), using t(x)^{eps/2} <= |x|, whereas the sum over x not in S is bounded by O(1), using |x| < t(x)^{eps/2} and sum_x mu'(x) * t(x)^eps / |x| = O(1).)

(Footnote: Recall that Bounded Halting (BH) is defined over triples (M, x, 1^k), where M is a non-deterministic machine, x is a binary string, and k is an integer given in unary. The problem is to determine whether there exists a computation of M on input x which halts within k steps. Clearly, Bounded Halting is in NP (here it is crucial that k is given in unary). Let D be an arbitrary NP problem, and let M_D be the non-deterministic machine solving it in time P_D(n) on inputs of length n, where P_D is a fixed polynomial. Then the reduction of D to BH consists of the transformation x -> (M_D, x, 1^{P_D(|x|)}).)

distributions) to one single distributional problem (i.e., Bounded Halting with a single simple distribution). Applying reductions as in the footnote above, we end up with many distributional versions of Bounded Halting, and furthermore the corresponding distribution functions will be very different and will not necessarily dominate one another. Instead, one should reduce each distributional problem (D, mu), with an arbitrary P-computable distribution, to the same distributional problem with a fixed (P-computable) distribution (e.g., Pi_BH). The difficulty in doing so is that the reduction should have the domination property. Consider, for example, an attempt to reduce each problem in DistNP to Pi_BH by using the standard transformation of D to BH (i.e., x -> (M_D, x, 1^{P_D(|x|)})). This transformation fails when applied to distributional problems in which the distribution of (infinitely many) strings is much higher than the distribution assigned to them by the uniform distribution. In such cases, the standard reduction maps an instance x, having probability mass mu'(x) much larger than 2^{-|x|}, to a triple (M_D, x, 1^{P_D(|x|)}) with much lighter probability mass (recall mu_BH'(M_D, x, 1^{P_D(|x|)}) < 2^{-|x|}). This violates the domination condition, and thus an alternative reduction is required.

The key to the alternative reduction of (D, mu) to Pi_BH is an efficiently computable encoding of strings, taken from an arbitrary polynomial-time computable distribution, by strings that have comparable probability mass under a fixed distribution. This encoding will map x into a codeword of length bounded above by the logarithm of 1/mu'(x). Accordingly, the reduction will map x to a triple (M_{D,mu}, x', 1^{P(|x|)}), where |x'| <= O(1) + min{|x|, log_2(1/mu'(x))} and M_{D,mu} is a non-deterministic Turing machine that first retrieves x from x' and then applies the standard non-deterministic machine (i.e., M_D) of the problem D. Such a reduction will be shown to satisfy all three conditions (i.e., efficiency, validity, and domination). Thus, instead of forcing the structure of the original distribution mu on the target distribution mu_BH, the reduction will incorporate the structure of mu into the reduced instance. The following technical lemma is the basis of the reduction.

Coding Lemma: Let mu be a polynomial-time computable distribution function. Then there exists a coding function C_mu satisfying the following three conditions.

1. Compression: For every x in {0,1}^*,  |C_mu(x)| <= 1 + min{ |x|, log_2(1/mu'(x)) }.

2. Efficient encoding: The function C_mu is computable in polynomial time.

3. Unique decoding: The function C_mu is one-to-one (i.e., C_mu(x) = C_mu(x') implies x = x').

Proof: The function C_mu is defined as follows. If mu'(x) <= 2^{-|x|} then C_mu(x) = 0x (i.e., in this case x, preceded by the bit 0, serves as its own encoding). If mu'(x) > 2^{-|x|} then C_mu(x) = 1z, where z is the longest common prefix of the binary expansions of mu(x-1) and mu(x) (e.g., if mu(x-1) = 0.10000... and mu(x) = 0.10101..., then z = 10). Consequently, 0.z1 is in the interval (mu(x-1), mu(x)]; that is, mu(x-1) < 0.z1 <= mu(x).

We now verify that C_mu so defined satisfies the conditions of the lemma. We start with the compression condition. Clearly, if mu'(x) <= 2^{-|x|} then |C_mu(x)| = 1 + |x| <= 1 + log_2(1/mu'(x)). On the other hand, suppose that mu'(x) > 2^{-|x|} and let z = z_1 z_2 ... z_l be as above (i.e., the longest common prefix of the binary expansions of mu(x-1) and mu(x)). Then,

    mu'(x)  =  mu(x) - mu(x-1)  <=  ( sum_{i=1}^{l} 2^{-i} z_i + sum_{i=l+1}^{poly(|x|)} 2^{-i} )  -  sum_{i=1}^{l} 2^{-i} z_i  <  2^{-l},

and |z| = l < log_2(1/mu'(x)) follows. Thus, |C_mu(x)| <= 1 + min{|x|, log_2(1/mu'(x))} in both cases. Clearly, C_mu can be computed in polynomial time by computing mu(x-1) and mu(x). Finally, note that C_mu is one-to-one, by considering the two cases, C_mu(x) = 0x and C_mu(x) = 1z; in the second case, use the fact that mu(x-1) < 0.z1 <= mu(x).
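A short Python sketch of the coding function may help (an illustration, not part of the notes). It assumes the values mu(x-1) and mu(x) are handed to it as numbers with a fixed binary precision, which is what the P-computability of mu provides.

    def bits_of(value, precision):
        """The first `precision` bits of the binary expansion of a number in [0, 1)."""
        out = []
        for _ in range(precision):
            value *= 2
            bit = int(value)
            out.append(bit)
            value -= bit
        return out

    def encode(x_bits, mu_x, mu_prev, precision=64):
        """C_mu(x): 0x if mu'(x) <= 2^{-|x|}; otherwise 1z, where z is the longest common
        prefix of the binary expansions of mu(x-1) and mu(x)."""
        if mu_x - mu_prev <= 2 ** (-len(x_bits)):       # mu'(x) = mu(x) - mu(x-1)
            return [0] + list(x_bits)
        a, b = bits_of(mu_prev, precision), bits_of(mu_x, precision)
        z = []
        for bit_a, bit_b in zip(a, b):
            if bit_a != bit_b:
                break
            z.append(bit_a)
        return [1] + z

    # Toy usage with made-up values mu(x-1) = 0.625 = 0.101... and mu(x) = 0.71875 = 0.10111...
    print(encode([1, 0, 1, 1], mu_x=0.71875, mu_prev=0.625))   # -> [1, 1, 0, 1]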

Using the coding function presented in the above proof, we introduce a non-deterministic machine M_{D,mu} such that the distributional problem (D, mu) is reducible to Pi_BH = (BH, mu_BH) in a way that maps all instances (of D) to triples with first element M_{D,mu}. On input y = C_mu(x), machine M_{D,mu} computes D(x) by first retrieving x from C_mu(x) (e.g., guess and verify) and next running the non-deterministic polynomial-time machine (i.e., M_D) that solves D.

The reduction maps an instance x (of D) to the triple (M_{D,mu}, C_mu(x), 1^{P(|x|)}), where P(n) def= P_D(n) + P_C(n) + n, P_D(.) is a polynomial bounding the running time of M_D on (acceptable) inputs of length n, and P_C(.) is a polynomial bounding the running time of an algorithm for encoding inputs (of length n).

Proposition: The above mapping constitutes a reduction of (D, mu) to (BH, mu_BH).

Proof: We verify the three requirements.

1. The transformation can be computed in polynomial time. (Recall that C_mu is polynomial-time computable.)

2. By construction of M_{D,mu}, it follows that D(x) = 1 if and only if there exists a computation of machine M_{D,mu} that, on input C_mu(x), halts outputting 1 within P(|x|) steps. (Recall that, on input C_mu(x), machine M_{D,mu} non-deterministically guesses x, verifies in P_C(|x|) steps that x is encoded by C_mu(x), and non-deterministically "computes" D(x).)

3. To see that the distribution induced by the reduction is dominated by the distribution mu_BH, we first recall that the transformation x -> C_mu(x) is one-to-one. It suffices to consider instances of BH that have a preimage under the reduction (since instances with no preimage satisfy the condition trivially). All these instances are triples with first element M_{D,mu}. By the definition of mu_BH,

    mu_BH'(M_{D,mu}, C_mu(x), 1^{P(|x|)})  =  c * (1/(|C_mu(x)|^2 * 2^{|C_mu(x)|})) * (1/P(|x|)^2),

where c = 1/(|M_{D,mu}|^2 * 2^{|M_{D,mu}|}) is a constant depending only on (D, mu). By virtue of the Coding Lemma,

    mu'(x)  <=  2 * 2^{-|C_mu(x)|}.

It thus follows that

    mu_BH'(M_{D,mu}, C_mu(x), 1^{P(|x|)})  >=  c * (1/(|C_mu(x)|^2 * P(|x|)^2)) * (mu'(x)/2)  >=  (c/2) * mu'(x) / |(M_{D,mu}, C_mu(x), 1^{P(|x|)})|^4,

and so the domination condition holds. The Proposition follows.

Conclusions

In general, a theory of average-case complexity should provide:

1. a specification of a broad class of interesting distributional problems;

2. a definition capturing the subclass of (distributional) problems that are easy on the average;

3. notions of reducibility that allow one to infer the easiness of one (distributional) problem from the easiness of another;

4. and, of course, results.

It seems that the theory of average-case complexity, initiated by Levin and further developed in [...], satisfies these expectations to some extent. Following is my evaluation of its performance with respect to each of the above.

1. The scope of the theory, originally restricted to P-computable distributions, has been significantly extended to cover all P-samplable distributions, as suggested in [...]. The key result here, by Impagliazzo and Levin, shows that every language that is <NP, P-computable>-complete is also <NP, P-samplable>-complete. This important result makes the theory of average-case complexity very robust: it allows one to reduce distributional problems from an utmost wide class to distributional problems with a very restricted/simple type of distribution.

2. The definition of average polynomial time does seem strange at first glance, but it seems that it (or a similar alternative) does capture the intuitive meaning of "easy on the average".

3. The notions of reducibility are both natural and adequate.

4. Results did follow, but here, indeed, much more is expected. Currently, DistNP-complete problems are known for the following areas: Computability (e.g., Bounded Halting), Combinatorics (e.g., Tiling and a generalization of [...]), Formal Languages, and Algebra (e.g., of matrix groups). However, the challenge of finding a really natural distributional problem that is complete in DistNP (e.g., subset sum with uniform distribution) has not been met so far. It seems that what is still lacking are techniques for the design of "distribution preserving" reductions.

In addition to their central role in the theory of average-case complexity, reductions that preserve uniform (or very simple) instance distributions are of general interest. Such reductions, unlike most known reductions used in the theory of NP-completeness, have a range that is a non-negligible part of the set of all possible instances of the target problem (i.e., a part that cannot be claimed to be only a pathological subcase).

We note that Levin views the results in his paper [...] as an indication that all "simple" (i.e., P-computable) distributions are in fact related (or similar).

Appendix: Failure of a naive formulation

When asked to motivate his definition of average polynomial time, Leonid Levin replies, non-deterministically, in one of the following three ways:

1. "This is the natural definition."

2. "This definition is not important for the results in my paper; only the definitions of reduction and completeness matter (and also they can be modified in many ways, preserving the results)."

3. "Any definition that makes sense is either equivalent or weaker."

For further elaboration on the first argument, the reader is referred to Leonid Levin. The second argument is, of course, technically correct, but unsatisfactory: we will need a definition of "easy on the average" when motivating the notion of a reduction and developing useful relaxations of it. The third argument is a thesis which should be interpreted along Wittgenstein's suggestion to the teacher: "say nothing and confine yourself to pointing out errors in the students' attempts to say something". We will follow this line here, by arguing that the definition that seems natural to an average computer scientist suffers from serious problems and should be rejected.

Definition X (naive formulation of the notion of easy on the average): A distributional problem $(D,\mu)$ is polynomial-time on the average if there exists an algorithm $A$ solving $D$ (i.e., on input $x$ outputs $D(x)$) such that the running time of algorithm $A$, denoted $t_A$, satisfies, for some constant $c>0$ and all $n$,

$$\sum_{x\in\{0,1\}^n} \mu'_n(x)\cdot t_A(x) \;\le\; n^c$$

where $\mu'_n(x)$ is the conditional probability that $x$ occurs given that an $n$-bit string occurs; that is,

$$\mu'_n(x) \;=\; \frac{\mu'(x)}{\sum_{y\in\{0,1\}^n}\mu'(y)}$$

The problem which we consider to be most upsetting is that Definition X is not robust under functional composition of algorithms. Namely, if the distributional problem $(A,\mu)$ can be solved in average polynomial-time given access to an oracle for $B$, and problem $B$ can be solved in polynomial-time, then it does not follow that the distributional problem $(A,\mu)$ can be solved in average polynomial-time. For example, consider the uniform probability distribution on inputs of each length, and an oracle Turing machine $M$ which, given access to oracle $B$, solves $A$. Suppose that $M$ runs $2^n$ steps on a $2^{-n}$ fraction of the inputs of length $n$, and $n^2$ steps on all other inputs of length $n$; furthermore, suppose that $M$, when making $t$ steps, asks a single query of length $\sqrt{t}$. Note that machine $M$, given access to an oracle for $B$, is polynomial-time on the average. Finally, suppose that the algorithm for $B$ has cubic running-time. The reader can now verify that, although $M$ given access to the oracle $B$ is polynomial-time on the average, combining $M$ with the cubic running-time algorithm for $B$ does not yield an algorithm which is polynomial-time on the average according to Definition X. It is easy to see that this problem does not arise when using the definition presented in the main text.
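To make the failure quantitative, the following Python sketch (using the illustrative parameters assumed above: $2^n$ steps on a $2^{-n}$ fraction of the $n$-bit inputs, $n^2$ steps elsewhere, a single query of length $\sqrt{t}$, and a cubic-time algorithm for $B$) tabulates the Definition-X average running time of $M$ with a free oracle versus the composed algorithm.

def avg_with_oracle(n: int) -> float:
    # M runs 2^n steps on a 2^{-n} fraction of the n-bit inputs and n^2 steps elsewhere;
    # oracle calls are answered for free.
    heavy_frac = 2.0 ** (-n)
    return heavy_frac * 2.0 ** n + (1 - heavy_frac) * n ** 2       # roughly n^2 + 1

def avg_composed(n: int) -> float:
    # The single query has length sqrt(t); answering it with the cubic-time
    # algorithm for B costs (sqrt(t))^3 = t^{1.5} additional steps.
    heavy_frac = 2.0 ** (-n)
    heavy_time = 2.0 ** n + (2.0 ** (n / 2.0)) ** 3                # 2^n steps + 2^{1.5n} oracle work
    light_time = n ** 2 + n ** 3                                    # n^2 steps + n^3 oracle work
    return heavy_frac * heavy_time + (1 - heavy_frac) * light_time

for n in [10, 20, 40, 80]:
    print(f"n={n:3d}  with oracle: {avg_with_oracle(n):12.1f}  composed: {avg_composed(n):.3e}")

The first column stays close to $n^2$, whereas the second grows roughly like $2^{n/2}$, so the composed algorithm violates Definition X.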

The source of the above problem with Definition X is the fact that the underlying definition of polynomial-on-the-average is not closed under application of polynomials. Namely, if $t:\{0,1\}^*\to\mathbb{N}$ is polynomial on the average with respect to some distribution, it does not follow that $t^2$ is also polynomial on the average with respect to the same distribution. This technical problem is also the source of the following problem, which Levin considers most upsetting: Definition X is not machine independent. This is the case since some of the simulations of one computational model on another square the running time (e.g., the simulation of two-tape Turing machines on a one-tape Turing machine, or the simulation of a RAM (Random Access Machine) on a Turing machine).
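For concreteness, here is a standard style of counterexample (the specific choice of $t$ is an illustrative assumption, not taken from the notes). Take the uniform distribution on $\{0,1\}^n$ and let $t(x)=2^n$ for a single $n$-bit input and $t(x)=n$ for all other inputs of length $n$. Then

$$\sum_{x\in\{0,1\}^n} 2^{-n}\cdot t(x) \;\le\; n+1, \qquad\text{whereas}\qquad \sum_{x\in\{0,1\}^n} 2^{-n}\cdot t(x)^2 \;\ge\; 2^{-n}\cdot 2^{2n} \;=\; 2^n,$$

so $t$ is polynomial on the average in the sense of Definition X while $t^2$ is not.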

Another two problems with Definition X have to do with the fact that it deals separately with inputs of different length. The first problem is that Definition X is very dependent on the particular encoding of the problem instance. Consider, for example, a problem on simple undirected graphs for which there exists an algorithm $A$ with running time $t_A(G)=f(n,m)$, where $n$ is the number of vertices in $G$ and $m$ is the number of edges in $G$. Suppose that if $m<n$ then $f(n,m)=2^n$, and $f(n,m)=n^2$ otherwise. Consider the distributional problem which consists of the above graph problem with the uniform probability distribution on all graphs with the same number of vertices. Now, if the graph is given by its incidence matrix representation, then Definition X implies that $A$ solves the problem in average polynomial-time (the average is taken over all graphs with $n$ nodes). On the other hand, if the graphs are represented by their adjacency lists, then the modified algorithm $A'$, which transforms the graphs to matrix representation and applies algorithm $A$, is judged by Definition X to be non-polynomial on the average (here the average is taken over all graphs of $m$ edges). This, of course, will not happen when working with the definition presented in the main text.

The second problem with dealing separately with different input lengths is that it does not allow one to disregard inputs of a particular length. Consider, for example, a problem for which we are only interested in the running-time on inputs of odd length.

After pointing out several weaknesses of Definition X, let us also doubt its "clear intuitive advantage" over the definition presented in the main text. Definition X is derived from the formulation of worst-case polynomial-time algorithms, which requires that, for some constant $c>0$ and all $n$,

$$\forall x\in\{0,1\}^n: \quad t_A(x) \;\le\; n^c$$

Definition X was derived by applying the expectation operator to the above inequality. But why not make a very simple algebraic manipulation of the inequality before applying the expectation operator? How about taking the $c$-th root of both sides and dividing by $n$; this yields, for some constant $c>0$ and all $n$,

$$\forall x\in\{0,1\}^n: \quad \frac{t_A(x)^{1/c}}{n} \;\le\; 1$$

Applying the expectation operator to the above inequality leads to the definition presented in the main text. We believe that this definition demonstrates a better understanding of the effect of the expectation operator with respect to complexity measures.
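For concreteness, applying the expectation operator (with respect to $\mu'_n$) to the last inequality yields a condition of the following shape, which is the form the average-case definition takes in Levin's theory (the exact formulation in the main text may differ in inessential ways):

$$\exists c>0\ \forall n: \quad \sum_{x\in\{0,1\}^n} \mu'_n(x)\cdot \frac{t_A(x)^{1/c}}{n} \;\le\; 1 .$$

In particular, this condition is closed under squaring the running time (replace $c$ by $2c$), which is exactly the robustness that Definition X lacks.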

Summary: Robustness under functional composition, as well as machine independence, seems to be essential for a coherent theory. So is robustness under efficiently effected transformations of the problem encoding. These are among the primary reasons for the acceptability of P as capturing problems that can be solved efficiently. In going from worst-case analysis to average-case analysis, we should not (and would not like to) lose these properties.

Lecture

Circuit Lower Bounds

See the old survey by Boppana and Sipser.

Constant-depth circuits

Monotone circuits

Lecture

Communication Complexity

See the textbook by Kushilevitz and Nisan.

Deterministic Communication Complexity

Randomized Communication Complexity

Historical Notes

Probabilistic Proof Systems

For a more detailed account of the history of the various types of probabilistic proof systems, we refer the reader to the corresponding section of the cited literature.

Interactive Proofs: Interactive proof systems were introduced by Goldwasser, Micali and Rackoff, with the explicit objective of capturing the most general notion of efficiently verifiable proof systems. The original motivation was the introduction of zero-knowledge proof systems, which in turn were supposed to provide (and indeed do provide) a powerful tool for the design of complex cryptographic schemes.

First evidence that interactive proofs may be more powerful than NP-proofs was given by Goldreich, Micali and Wigderson, in the form of the interactive proof for Graph Non-Isomorphism presented above. The full power of interactive proof systems was discovered by Lund, Fortnow, Karloff and Nisan, and by Shamir. The basic technique was presented in the former work, where it was shown that coNP is contained in IP, and the final result (PSPACE = IP) appeared in the latter. Our presentation follows these works.

Public-coin interactive proofs (also known as Arthur-Merlin proofs) were introduced by Babai. The fact that these restricted interactive proofs are as powerful as general ones was proved by Goldwasser and Sipser. The linear speedup in the number of rounds of public-coin interactive proofs was shown by Babai and Moran.

Zero-knowledge proofs: The concept of zero-knowledge has been introduced by Goldwasser, Micali and Rackoff, in the very same paper quoted above. Their paper also contained a perfect zero-knowledge proof for Quadratic Non-Residuosity. The perfect zero-knowledge proof system for Graph Isomorphism is due to Goldreich, Micali and Wigderson. More importantly, the latter paper presents a zero-knowledge proof system for all languages in NP, using any secure commitment scheme, which in turn can be constructed based on any one-way function. For a comprehensive discussion of zero-knowledge, see the corresponding chapter of the cited literature.

Probabilistically Checkable Proofs: The PCP Characterization Theorem is attributed to Arora, Lund, Motwani, Safra, Sudan and Szegedy. These papers, in turn, built on numerous previous works; for details see the papers themselves or the cited survey. In general, our presentation of PCP follows the latter, and the interested reader is referred to it for a survey of further developments and more refined considerations.

The first connection between PCP and hardness of approximation was made by Feige, Goldwasser, Lovasz, Safra and Szegedy. They showed the connection to max-Clique presented above. The connection to max-SAT and other Max-SNP approximation problems was made later.

We did not present the strongest known non-approximability results for max-SAT and max-Clique. These can be found in Hastad's papers.

Pseudorandomness

The notion of computational indistinguishability was introduced by Goldwasser and Micali (within the context of defining secure encryptions), and given general formulation by Yao. Our definition of pseudorandom generators follows the one of Yao, which is equivalent to a prior formulation of Blum and Micali. For more details regarding this equivalence, as well as many other issues, see the cited literature, which presents the notion of pseudorandomness discussed here as a special case (or archetypical case) of a general paradigm.

The discovery that computational hardness (in the form of one-wayness) can be turned into pseudorandomness was made by Blum and Micali. The theorem asserting that pseudorandom generators can be constructed based on any one-way function is due to Hastad, Impagliazzo, Levin and Luby, who build on earlier work.

The fact that pseudorandom generators yield significantly better derandomization than the straightforward one was first exploited by Yao. The fact that for the purpose of derandomization one may use pseudorandom generators that run in exponential time was first observed by Nisan and Wigderson, who presented a general framework for such constructions. All improved derandomization results build on the latter framework. In particular, the theorem stated above is due to Impagliazzo and Wigderson, who build on earlier work.

The theorems regarding derandomization of space-bounded randomized classes are due to Nisan, and to Nisan and Zuckerman, respectively.

Average-Case Complexity

The theory of average-case complexity was initiated by Levin. Levin's laconic presentation hides the fact that important choices have been made in the development of the average-case complexity theory. These choices were discussed in subsequent work, and our presentation follows the latter text.

Bibliography

L. Adleman. Two theorems on random polynomial-time. In FOCS.

R. Aleliunas, R.M. Karp, R.J. Lipton, L. Lovasz, and C. Rackoff. Random walks, universal traversal sequences, and the complexity of maze problems. In FOCS.

S. Arora, C. Lund, R. Motwani, M. Sudan, and M. Szegedy. Proof Verification and Intractability of Approximation Problems. JACM. Preliminary version in FOCS.

S. Arora and S. Safra. Probabilistic Checkable Proofs: A New Characterization of NP. JACM. Preliminary version in FOCS.

L. Babai. Trading Group Theory for Randomness. In STOC.

L. Babai, L. Fortnow, L. Levin, and M. Szegedy. Checking Computations in Polylogarithmic Time. In STOC.

L. Babai, L. Fortnow, N. Nisan, and A. Wigderson. BPP has Subexponential Time Simulations unless EXPTIME has Publishable Proofs. Complexity Theory.

L. Babai and S. Moran. Arthur-Merlin Games: A Randomized Proof System and a Hierarchy of Complexity Classes. JCSS.

P. Beame and T. Pitassi. Propositional Proof Complexity: Past, Present and Future. Bulletin of the European Association for Theoretical Computer Science.

S. Ben-David, B. Chor, O. Goldreich, and M. Luby. On the Theory of Average Case Complexity. JCSS.

A. Ben-Dor and S. Halevi. In Israel Symp. on Theory of Computing and Systems (ISTCS), IEEE Computer Society Press.

M. Blum and S. Micali. How to Generate Cryptographically Strong Sequences of Pseudo-Random Bits. SICOMP. Preliminary version in FOCS.

R. Boppana and M. Sipser. The complexity of finite functions. In Handbook of Theoretical Computer Science, Volume A: Algorithms and Complexity (J. van Leeuwen, ed.), MIT Press/Elsevier.

L. Carter and M. Wegman. Universal Hash Functions. JCSS.

G.J. Chaitin. On the Length of Programs for Computing Finite Binary Sequences. JACM.

A.K. Chandra, D.C. Kozen, and L.J. Stockmeyer. Alternation. JACM.

S.A. Cook. The Complexity of Theorem Proving Procedures. In STOC.

T.M. Cover and G.A. Thomas. Elements of Information Theory. John Wiley & Sons, Inc., New York.

U. Feige, S. Goldwasser, L. Lovasz, S. Safra, and M. Szegedy. Approximating Clique is almost NP-complete. JACM. Preliminary version in FOCS.

S. Fortune. A Note on Sparse Complete Sets. SIAM J. on Computing.

M. Furer, O. Goldreich, Y. Mansour, M. Sipser, and S. Zachos. On Completeness and Soundness in Interactive Proof Systems. Advances in Computing Research (a research annual): Randomness and Computation (S. Micali, ed.).

M.R. Garey and D.S. Johnson. Computers and Intractability: A Guide to the Theory of NP-Completeness. W.H. Freeman and Company, New York.

J. Gill. Computational complexity of probabilistic Turing machines. SIAM Journal on Computing.

O. Goldreich. Notes on Levin's Theory of Average-Case Complexity. ECCC Technical Report.

O. Goldreich. Secure Multi-Party Computation. Unpublished manuscript. Available from http://www.wisdom.weizmann.ac.il/~oded/gmw.html

O. Goldreich. Modern Cryptography, Probabilistic Proofs and Pseudorandomness. Algorithms and Combinatorics series, Springer.

O. Goldreich. Foundation of Cryptography: Basic Tools. Cambridge University Press.

O. Goldreich. Randomized Methods in Computation. Lecture Notes, Spring. Available from http://www.wisdom.weizmann.ac.il/~oded/rnd.html

O. Goldreich, S. Goldwasser, and S. Micali. How to Construct Random Functions. JACM.

O. Goldreich, H. Krawczyk, and M. Luby. On the Existence of Pseudorandom Generators. SICOMP.

O. Goldreich and L.A. Levin. Hard-core Predicates for any One-Way Function. In STOC.

O. Goldreich, S. Micali, and A. Wigderson. Proofs that Yield Nothing but their Validity or All Languages in NP Have Zero-Knowledge Proof Systems. JACM. Preliminary version in FOCS.

O. Goldreich, S. Micali, and A. Wigderson. How to Play any Mental Game: A Completeness Theorem for Protocols with Honest Majority. In STOC. For details, see the manuscript on Secure Multi-Party Computation listed above.

S. Goldwasser and S. Micali. Probabilistic Encryption. JCSS. Preliminary version in STOC.

S. Goldwasser, S. Micali, and C. Rackoff. The Knowledge Complexity of Interactive Proof Systems. SICOMP. Preliminary version in STOC. Earlier versions date back several years.

S. Goldwasser and M. Sipser. Private Coins versus Public Coins in Interactive Proof Systems. Advances in Computing Research (a research annual): Randomness and Computation (S. Micali, ed.). Extended abstract in STOC.

Y. Gurevich. Complete and Incomplete Randomized NP Problems. In FOCS.

J. Hastad. Clique is hard to approximate within $n^{1-\epsilon}$. Acta Mathematica. Combines preliminary versions in STOC and FOCS.

J. Hastad. Getting optimal inapproximability results. In STOC.

J. Hastad, R. Impagliazzo, L.A. Levin, and M. Luby. A Pseudorandom Generator from any One-way Function. SICOMP. Combines preliminary versions by Impagliazzo et al. (STOC) and Hastad (STOC).

J.E. Hopcroft and J.D. Ullman. Introduction to Automata Theory, Languages and Computation. Addison-Wesley.

N. Immerman. Nondeterministic Space is Closed Under Complementation. SIAM Jour. on Computing.

R. Impagliazzo. Hard-core Distributions for Somewhat Hard Problems. In FOCS.

R. Impagliazzo and L.A. Levin. No Better Ways to Generate Hard NP Instances than Picking Uniformly at Random. In FOCS.

R. Impagliazzo and A. Wigderson. P=BPP if E requires exponential circuits: Derandomizing the XOR Lemma. In STOC.

D.S. Johnson. The NP-Complete Column: an ongoing guide. Jour. of Algorithms.

R.M. Karp. Reducibility among Combinatorial Problems. In Complexity of Computer Computations (R.E. Miller and J.W. Thatcher, eds.), Plenum Press.

R.M. Karp. Probabilistic Analysis of Algorithms. Manuscript.

R.M. Karp and R.J. Lipton. Some connections between nonuniform and uniform complexity classes. In STOC.

R.M. Karp and V. Ramachandran. Parallel Algorithms for Shared Memory Machines. In Handbook of Theoretical Computer Science, Vol. A: Algorithms and Complexity.

M.J. Kearns and U.V. Vazirani. An Introduction to Computational Learning Theory. MIT Press.

D.E. Knuth. The Art of Computer Programming: Seminumerical Algorithms. Addison-Wesley Publishing Company, Inc. (first and second editions).

A. Kolmogorov. Three Approaches to the Concept of the Amount of Information. Probl. of Inform. Transm.

E. Kushilevitz and N. Nisan. Communication Complexity. Cambridge University Press.

R.E. Ladner. On the Structure of Polynomial Time Reducibility. Jour. of the ACM.

C. Lautemann. BPP and the Polynomial Hierarchy. IPL.

L.A. Levin. Universal Search Problems. Problemy Peredaci Informacii. Translated in Problems of Information Transmission.

L.A. Levin. Randomness Conservation Inequalities: Information and Independence in Mathematical Theories. Inform. and Control.

L.A. Levin. Average Case Complete Problems. SIAM Jour. on Computing.

M. Li and P. Vitanyi. An Introduction to Kolmogorov Complexity and its Applications. Springer-Verlag.

C. Lund, L. Fortnow, H. Karloff, and N. Nisan. Algebraic Methods for Interactive Proof Systems. JACM. Preliminary version in FOCS.

R. Motwani and P. Raghavan. Randomized Algorithms. Cambridge University Press.

M. Naor. Bit Commitment using Pseudorandom Generators. Jour. of Cryptology.

N. Nisan. Pseudorandom Generators for Space Bounded Computation. Combinatorica.

N. Nisan. RL ⊆ SC. Journal of Computational Complexity.

N. Nisan and A. Wigderson. Hardness vs. Randomness. JCSS.

N. Nisan and D. Zuckerman. Randomness is Linear in Space. JCSS.

W.J. Savitch. Relationships between nondeterministic and deterministic tape complexities. JCSS.

R. Shaltiel. Recent developments in explicit constructions of extractors. Bulletin of the European Association for Theoretical Computer Science.

A. Shamir. IP = PSPACE. JACM. Preliminary version in FOCS.

C.E. Shannon. A mathematical theory of communication. Bell Sys. Tech. Jour.

M. Sipser. A Complexity Theoretic Approach to Randomness. In STOC.

M. Sipser. Introduction to the Theory of Computation. PWS Publishing Company.

R.J. Solomonoff. A Formal Theory of Inductive Inference. Inform. and Control.

L.J. Stockmeyer. The Polynomial-Time Hierarchy. Theoretical Computer Science.

L. Stockmeyer. The Complexity of Approximate Counting. In STOC.

R. Szelepcsenyi. A Method of Forced Enumeration for Nondeterministic Automata. Acta Informatica.

S. Vadhan. A Study of Statistical Zero-Knowledge Proofs. PhD Thesis, Department of Mathematics, MIT.

L.G. Valiant. The Complexity of Computing the Permanent. Theoretical Computer Science.

L.G. Valiant and V.V. Vazirani. NP Is as Easy as Detecting Unique Solutions. Theoretical Computer Science.

A.C. Yao. Theory and Application of Trapdoor Functions. In FOCS.