<<

Probabilistic Pro of Systems { Lecture Notes



Oded Goldreich

Department of Computer Science and Applied Mathematics

Weizmann Institute of Science, Rehovot, Israel.

Septemb er 1996

Abstract

Various typ es of probabilistic pro of systems have played a central role in the development

of computer science in the last decade. In these notes, we concentrate on three such pro of

systems | interactive proofs, zero-know ledge proofs, and probabilistic checkable proofs.

Remark: These are lecture notes in the strict sense of the word. Surveys of mine on this sub ject

can b e obtained from URL http://theory.lcs.mit.edu/ ~oded/pps.html.

These notes were prepared for a series of lectures given in the Theory Student Seminar of the CS

Department of UC-Berkeley.



Visiting Miller Professor, Aug. { Sept. 1996, EECS Dept., UC{Berkeley. 0

1 Intro duction

The glory given to the creativity required to nd pro ofs, makes us forget that it is the less glori-

ed pro cedure of veri cation which gives pro ofs their value. Philosophically sp eaking, pro ofs are

secondary to the veri cation pro cedure; whereas technically sp eaking, pro of systems are de ned in

terms of their veri cation pro cedures.

The notion of a veri cation pro cedure assumes the notion of computation and furthermore the

notion of ecient computation. This implicit assumption is made explicit in the de nition of NP,

in which ecient computation is asso ciated with deterministic p olynomial-time .

Traditionally, NP is de ned as the class of NP-sets. Yet, each such NP-set can b e viewed as a

pro of system. For example, consider the set of satis able Bo olean formulae. Clearly, a satisfying

assignment  for a formula  constitutes an NP-pro of for the assertion \ is satis able" the

veri cation pro cedure consists of substituting the variables of  by the values assigned by  and

computing the value of the resulting Bo olean expression.

The formulation of NP-pro ofs restricts the \e ective" length of pro ofs to b e p olynomial in length

of the corresp onding assertions. However, longer pro ofs may b e allowed by padding the assertion

with suciently many blank symb ols. So it seems that NP gives a satisfactory formulation of pro of

systems with ecientveri cation pro cedures. This is indeed the case if one asso ciates ecient

pro cedures with deterministic p olynomial-time algorithms. However, we can gain a lot if we are

willing to take a somewhat non-traditional step and allow probabilistic veri cation pro cedures. In

particular,

 Randomized and interactiveveri cation pro cedures, giving rise to interactive proof systems,

seem much more p owerful i.e., \expressive" than their deterministic counterparts.

 Such randomized pro cedures allow the intro duction of zero-know ledge proofs which are of

great theoretical and practical interest.

 NP-pro ofs can b e eciently transformed into a redundant form which o ers a trade-o

between the numb er of lo cations examined in the NP-pro of and the con dence in its validity

see probabilistical ly checkable proofs.

In ab ovementioned typ es of probabilistic pro of systems, explicit b ounds are imp osed on the

computational complexity of the veri cation pro cedure, which in turn is p ersoni ed by the notion

ofaveri er. Furthermore, in all these pro of systems, the veri er is allowed to toss coins and

rule by statistical evidence. Thus, all these pro of systems carry a probability of error; yet, this

probability is explicitly b ounded and, furthermore, can b e reduced by successive application of the

pro of system.

2 Interactive Pro of Systems

In light of the growing acceptability of randomized and distributed computations, it is only natural

to asso ciate the notion of ecient computation with probabilistic and interactive p olynomial-time

computations. This leads naturally to the notion of interactive pro of systems in which the veri ca-

tion pro cedure is interactive and randomized, rather than b eing non-interactive and deterministic.

Thus, a \pro of " in this context is not a xed and static ob ject but rather a randomized dynamic

pro cess in which the veri er interacts with the prover. Intuitively, one may think of this interaction 1

as consisting of \tricky" questions asked by the veri er to which the prover has to reply \convinc-

ingly". The ab ove discussion, as well as the actual de nition, makes explicit reference to a prover,

whereas a prover is only implicit in the traditional de nitions of pro of systems e.g., NP-pro ofs.

2.1 The De nition

Interaction: Going b eyond the uni-directional \interaction" of the NP-pro of system. If the

veri er do es not toss coins then interaction can b e collapsed to a single message.

computationall y unb ounded Prover: As in NP,we start by not considering the complexity

of proving.

probabilistic p olynomial-time Veri er: We maintain the paradigm that veri cation ought

to b e easy, alas we allow random choices in our notion of easiness.

Completeness and Soundness: We relax the traditional soundness condition by allowing small

probability of b eing fo oled by false pro ofs. The probability is taken over the veri er's random

choices. We still require \p erfect completeness"; that is, that correct statements are proven with

probability 1. Error probability, b eing a parameter, can b e further reduced by successive rep eti-

tions.

Variations: Relaxing the \p erfect completeness" requirement yields a two-sided error variantof

IP i.e., error probability allowed also in the completeness condition. Restricting the veri er to

send only \random" i.e., uniformly chosen messages yields the restricted Arthur-Merlin interactive

pro ofs aka public-coins interactive pro ofs. Alas, b oth variants are essentially as p owerful as the

one ab ove.

2.2 An Example: interactive pro of of Graph Non-Isomorphism

The problem: not known to b e in NP. Proving that two graphs are isomorphic can b e done

by presenting an isomorphism, but howdoyou prove that no such isomorphism exists?

The construction: the \two di erent ob ject proto col" { if you claim that two ob jects are di erent

then you should b e able to tell which is which when I present them to you in random order. In

the context of the Graph Non-Isomorphism interactive pro of, two supp osedly di erent ob jects

are de ned by taking random isomorphic copies of each of the input graphs. If these graphs are

indeed non-isomorphic then the ob jects are di erent the distributions have distinct supp ort else

the ob jects are identical.

2.3 Interactive pro of of Non-Satis ability

Arithmetization of Bo olean CNF formulae: Observe that the arithmetic expression is a

degree p olynomial. Observe that, in any case, the value of the arithmetic expression is b ounded. 2

Moving to a Finite Field: Whenever wecheck equalitybetween twointegers in [0;M], it suces

to check equalitymodq, where q>M. The b ene t is that the arithmetic is now in a nite eld

mo d q  and so certain things are \nicer" e.g., uniformly selecting a value. Thus, proving that a

CNF formula is not satis able reduces to proving equality of the following form

X X

 x ; :::; x  0modq

1 n

x =0;1 x =0;1

n

1

where  isalow degree multi-variant p olynomial.

The construction: stripping summations in iterations. In each iteration the prover is supp osed

to supply the p olynomial describing the expression in one currently stripp ed variable. By the

ab ove observation, this is a low degree p olynomial and so has a short description. The veri er

checks that the p olynomial is of low degree, and that it corresp onds to the currentvalue b eing

claimed i.e., p0 + p1 v . Next, the veri er randomly instantiates the variable, yielding a new

value to b e claimed for the resulting expression i.e., v  p , for uniformly chosen r 2 GFq .

The veri er sends the uniformly chosen instantiation to the prover. At the end of the last iteration,

the veri er has a fully sp eci ed expression and can easily check it against the claimed value.

Completeness of the ab ove: When the claim holds, the prover has no problem supplying the

correct p olynomials, and this will lead the veri er to always accept.

Soundness of the ab ove: It suces to b ound the probability that for a particular iteration the

initial claim is false whereas the ending claim is correct. Both claims refer to the current summation

expression b eing equal to the currentvalue, where `current' means either at the b eginning of the

iteration or at its end. Let T  b e the actual p olynomial representing the expression when stripping

the currentvariable, and let pbeany p otential answer by the prover. Wemay assume that

p0 + p1 v and that p is of low-degree as otherwise the veri er will reject. Using our

hyp othesis that the initial claim is false, we know that T 0 + T 1 6 v . Thus, p and T are

di erentlow-degree p olynomials and so they may agree on very few p oints. In case the veri er

instantiation do es not happ en to b e one of these few p oints, the ending claim is false to o.

Op en Problem 1 alternative pro of of coNP IP: Polynomials play a fundamental role in the

above construction and this trend has even deepened in subsequent works on PCP. It does not seem

possible to abstract that role, which seems to be very annoying. I consider it important to obtain

an alternative proof of coNP IP;a proof in which al the underlying ideas can bepresentedat

an abstract level.

2.4 The Power of Interactive Pro ofs

IP = PSPACE

Interactive Pro ofs for PSPACE: Recall that PSPACE languages can b e expressed by Quan-

ti ed Bo olean Formulae. The numb er of quanti ers is p olynomial in the input, but there are

b oth existential and universal quanti ers, and furthermore these quanti ers may alternate. Con-

sidering the arithmetization of these formulae, we face two problems: Firstly, the value of the

formulae is only b ounded by a double exp onential function in the length of the input, and sec-

ondly when stripping out summations, the expression may b e a p olynomial of high degree due 3

to the universal quanti ers which are replaced by pro ducts. The rst problem is easy to deal

with by using the Chinese Reminder Theorem i.e., if twointegers in [0;M] are di erent then they

must b e di erent mo dulo most of the primes -to p oly log M . The second problem is resolved

by \refreshing" variables after each universal quanti er e.g, 9x8y 9zx; y ; z  is transformed into

0 0 0

9x8y 9x x = x  ^9yx ;y;z.

IP in PSPACE: We show that for every interactive pro of there exists an optimal prover strategy,

and furthermore that this strategy can b e computed in p olynomial-space. This follows by lo oking

at the tree of all p ossible executions.

The IP Hierarchy: Let IPr denote the class of languages having an interactive pro of in

which at most r  messages are exchanges. Then, IP 0 = coRP B P P . The class IP1 is a

randomized version of NP; witnesses are veri ed via a probabilistic p olynomial-time ppro cedure,

rather than a deterministic one. The class IP 2 seems fundamentally di erent; the veri cation

pro cedure here is truly interactive. Still, this class seems close to NP; sp eci cally, itiscontained in

the p olynomial-time hierarchy which seems `low' when contrasted with PSPACE = IPp oly . In-

terestingly, IP 2r  = IP r, and so in particular IP O1 = IP2. Note that \IP2r  =

IPr" can b e applied successively a constantnumb er of times, but not more.

Op en Problem 2 the structure of the IP hierarchy:

L ? Suppose that L 2IPr. What can be said about

Currently, we only know to argue as fol lows: IPr IPp oly  P S P AC E and so L 2 P S P AC E

and is in IP p oly . This seems ridiculous: we do not use the extra information on IPr. On the

other hand, we don't expect L to bein IP gr, for any function g , since this wil l put coNP

coIP1 in IP 2. So another parameter may berelevant here; how about the lengths of the

messages exchanged in the interaction. Indeed, if L has an interactive proof in which the total

3

message length is m then L has an interactive proof in which the total message length is O m .

This just fol lows by the known PSPACE IP construction. I consider it important to obtain a

better result! In general, it would be interesting to get a better understanding of the IP  Hierarchy.

2.5 HowPowerful Should the Prover b e?

The Cryptographic Angle: Interactive pro ofs o ccur inside \cryptographic" proto cols and so

the prover is merely a probabilistic p olynomial-time machine; yet it mayhave access to an auxiliary

input given to it or generated by it in the past. Such provers are relatively weak i.e., they can only

prove languages in IP 1; yet, they maybeofinterest for other reasons e.g., see zero-knowledge.

The Complexity Theoretic Angle: It make sense to try to relate the complexity of proving

a statement to another party to the complexity of deciding whether the statement holds. This

gives rise to two related approaches:

1. The prover is a probabilistic p olynomial-time oracle machine with access to the language. This

approach can b e thought of as extending the notion of self-reducibil i ty of NP-languages:

These languages have an NP-pro of system in which the prover is a p olynomial-time machine

with oracle access to the language. Indeed, alike NP- languages, the IP-complete

languages also have such a \relatively ecient" prover. Recall that an optimal prover strategy

can b e implemented in p olynomial-space, and thus by a p olynomial-time machine having

oracle access to a PSPACE-complete language. 4

2. The prover runs in time which is p olynomial in the complexity of the languages.

Op en Problem 3 Further investigate the power of the various notions, and in particular the one

extending self-reducibility on NP languages. Better understanding of the latter is also long due.

Aspeci c chal lenge: provide an NP-proof system for Quadratic Non-Residucity QNR, using a

probabilistic polynomial-time prover with access to QNR language.

2.6 Computationally-Sound Pro ofs

Such pro ofs systems are fundamentally di erent from the ab ove discussion which did not e ect

the soundness of the pro of systems: Here we consider relations of the soundness conditions { false

pro ofs may exist even with high probability but are hard to nd. Variants may corresp ond to the

ab ove approaches; sp eci cally, the following has b een investigated:

Argument Systems: One only considers prover strategies implementable by p ossibly non-

uniform p olynomial-size circuits eq., probabilistic p olynomial-time machines with auxiliary in-

puts. Under some reasonable assumptions there exist argument systems for NP having p oly-

logarithmic communication complexity. Analogous interactive pro ofs cannot exists unless NP is

contained in Quasi-Polynomial Time i.e., NP Dtimeexpp oly log n.

CS Pro ofs: One only considers prover strategies implementable in time p olynomial in the com-

plexity of the language. In an non-interactive version one asks for \certi cates a la NP-typ e" which

are only computationally sound. In a mo del allowing b oth prover and veri er access to a random

oracle, one can convert interactive pro ofs alike CS pro ofs into non-interactive ones. As a heuris-

tics, it is also suggested to replace the by use of \random public functions" a fuzzy

notion, not to b e confused with pseudorandom functions.

Op en Problem 4 Try to provide rm grounds for the heuristics of making proof systems non-

interactive by use of \random public functions": I advise not to try to de ne the latter notion in a

general form, but rather devise some ad-hoc method, using some speci c but widely believedcom-

plexity assumptions e.g., hardness of deciding Quadratic Residucity modulo a composite number,

for this speci c application.

3 Zero-Knowledge Pro ofs

Zero-knowledge pro ofs are central to .Furthermore, zero-knowledge pro ofs are very

intriguing from a conceptual p oint of view, since they exhibit an extreme contrast b etween b eing

convinced of the validity of a statement and learning anything in addition while receiving such

a convincing pro of. Namely, zero-knowledge pro ofs have the remarkable prop erty of b eing b oth

convincing while yielding nothing to the veri er, b eyond the fact that the statementisvalid.

The zero-knowledge paradigm: Whatever can b e eciently computed after interacting with

the prover on some common input, can b e eciently computed from this input alone without

interacting with anyone. That is, the interaction with the prover can b e eciently simulated in

solitude. 5

ATechnical Note: Ihave deviated from other presentation in which the simulator works in

average probabilistic p olynomial-time and require that it works in strict probabilistic p olynomial-

1

time. Yet, I allow the simulator to halt without output with probability at most . Clearly this

2

implies an average p olynomial-time simulator, but the converse is not known. In particular, some

known p ositive results regarding p erfect zero-knowledge with average p olynomial-time simulators

are not known to hold under the ab ove more strict notion.

3.1 Perfect Zero-Knowledge

The De nition: A simulator can pro duce exactly the same distribution as o ccurring in an inter-

action with the prover. Furthermore, in the general de nition this is required with resp ect to any

probabilistic p olynomial-time veri er strategy not necessarily the one sp eci ed for the veri er.

Thus, the zero-knowledge prop erty protects the prover from any attempt to obtain anything from

it b eyond conviction in the validity of the assertion.

Zero-Knowledge NP-pro ofs: Extending the NP-framework to interactive pro of is essential for

the non-triviality of zero-knowledge. It is easy to see that zero-knowledge NP-pro ofs exist only for

languages in RP . Actually, that's a go o d exercise.

A p erfect zero-knowledge pro of for Graph Isomorphism: The prover sends the veri er a

random isomorphic copy of the rst input graph. The veri er challenges the prover by asking the

prover to present an isomorphism of graph sent to either the rst input graph or to the second

input graph. The veri er's choice is made at random.

The fact that this interactive pro of system is zero-knowledge is more subtle than it seems; for

example, many parallel rep etitions of the pro of system are unlikely to b e zero-knowledge.

3.2 Computational Zero-Knowledge

This de nition is obtained by substituting the requirement that the simulation is identical to the

real interaction, by the requirement that the two are computational indistinguishable.

Computational Indistinguishabili ty is a fundamental concept of indep endentinterest. Two

ensembles are considered indistinguishabl e by an A if A's b ehavior is almost invariantof

whether its input is taken from the rst ensemble or from the second one. Weinterpret \b ehavior"

as a binary verdict and require that the probability that A outputs 1 in b oth cases is the same up-to

a negligible di erence i.e., smaller than 1=pn, for any p ositive p olynomial p and all suciently

long input lengths denoted by n. Two ensembles are computational indistinguishable if they are

indistinguishabl e by all probabilistic p olynomial-time algorithms.

A zero-knowledge pro of for NP { abstract b oxes setting: It suces to construct sucha

pro of system for 3-Colorability 3COL. To obtain a pro of system for other NP-languages use the

fact that the standard reduction of NP to 3COL is p olynomial-time invertible.

The prover uses a xed 3-coloring of the input graph and pro ceeds as follows. First, it uniformly

selects a relab eling of the colors i.e., one of the 6 p ossible ones and puts the resulting color of

eachvertex in a lo cked b ox marked with the vertex name. All b oxes are sent to the veri er who

resp onse with a uniformly chosen edge, asking to op en the b oxes corresp onding to the endp ointof

this edge. The prover sends over the corresp onding keys, and the veri er op ens the twoboxes and

accepts i he sees two di erent legal colors. 6

A zero-knowledge pro of for NP { real setting: The lo cked b oxes need to b e implemented

digitally. This is done byacommitment scheme, a cryptographic primitive designed to implement

suchlocked b oxes. Lo osely sp eaking, a commitmentscheme is a two-party proto col which pro ceeds

in two phases so that at the end of the rst phase called the commit phase the rst party

called sender is committed to a single value which is the only value he can later reveal in the

second phase, whereas at this p oint the other party gains no knowledge on the committed value.

Commitmentschemes exist if and actually i  one-way functions exist. Thus, the mildest of all

cryptographic assumptions suces for constructing zero-knowledge pro ofs for NP and actually for

all of IP. Furthermore, zero-knowledge pro ofs for languages which are \hard on the average" imply

the existence of one-way functions; thus, the ab ove construction essentially utilizes the minimal

p ossible assumption.

one-way functions imply

IP = ZKIP

3.3 Concluding Remarks

The prover's strategy in the ab ove zero-knowledge pro of for NP can b e implemented

by a probabilistic p olynomial-time machine whichisgiven as auxiliary input an NP-witness

for the input. This is clear for 3COL, and for other NP-languages one needs to use the fact

that the relevant reductions are coupled with ecient witness transformations. The ecient

implementation of the prover strategy is essential to the applications b elow.

Applications to Cryptography: Zero-knowledge pro ofs are a p owerful to ol for the design of

cryptographic proto cols, in which one typically wants to guarantee prop er b ehavior of a party

without asking him to reveal all his secrets. Note that prop er b ehavior is typically a p olynomial-

time computation based on the party's secrets as well as on some known data. Thus, the claim

that the party b ehaves consistently with its secrets and the known data can b e casted as an NP-

statement, and the ab ove result can b e utilized. More generally, using additional ideas, one can

provide a secure proto col for any functional b ehavior. These general results have to b e considered as

plausibili ty arguments; you would not like to apply these general constructions to sp eci c practical

problems, yet you should know that these sp eci c problems are solvable.

Op en Problems do exists, but seem more sp ecialized in nature. For example, it would b e

interesting to gure out and utilize the minimal p ossible assumption required for constructing

\zero-knowledge proto cols for NP" in various mo dels like constant-round interactive pro ofs, the

\non-interactive" mo del, and p erfect zero-knowledge arguments.

Further Reading: see chapter on Zero-Knowledge in my \fragments of a b o ok" on Foundations

of Cryptography available from URL http://theory.lcs.mit.edu/~oded/frag.html.

Solution to Exercise regarding Zero-Knowledge NP-pro ofs: An NP-pro of system for a

language L yields an NP-relation for L de ned using the veri er. On input x 2 L a p erfect

zero-knowledge simulator either halts without output or outputs an accepting conversation i.e.,

an NP-witness for x. 7

4 Probabilistically Checkable Pro of Systems

When viewed in terms of an interactive pro of system, the probabilistically checkable pro of setting

consists of a prover which is memoryless. Namely, one can think of the prover as b eing an oracle

and of the messages sent to it as b eing queries. A more app ealing interpretation is to view the prob-

abilistically checkable pro of setting as an alternativeway of generalizing NP. Instead of receiving

the entire pro of and conducting a deterministic p olynomial-time computation as in the case of

NP, the veri er may toss coins and prob e the pro of only at lo cation of its choice. Potentially,

this allows the veri er to utilize very long pro ofs i.e., of sup er-p olynomial length or alternatively

examine very few bits of an NP-pro of.

4.1 The De nition

The Basic Mo del: A probabilisticall y checkable pro of system consists of a probabilistic p olynomial-

time veri er having access to an oracle which represents a pro of in redundant form. Typically, the

veri er accesses only few of the oracle bits, and these bit p ositions are determined by the outcome

of the veri er's coin tosses. Completeness and soundness are de ned similarly to the way they

were de ned for interactive pro ofs: for valid assertions there exist pro ofs making the veri er always

1

. accepts, whereas no oracle can make the veri er accept false assertions with probabilityabove

2

We've sp eci ed the error probability since weintend to b e very precise regarding some complexity

measures.

Additional complexity measures of fundamental imp ortance are the randomness and query

complexities. Sp eci cally, PCPr;q denotes the set of languages having a probabilistic check-

able pro of system in which the veri er, on any input of length n, makes at most r n coin tosses

and at most q n oracle queries. As usual in complexity theory, unless stated otherwise, the oracle

answers are always binary i.e., either 0 or 1.

r

Observed that the \e ective" oracle length is at most 2  q i.e., lo cations whichmay b e accessed

on some random choices. In particular, the e ective length of oracles in a PCPlog;  system is

p olynomial.

the traditional notion of a pro of: An oracle which always makes the p cp- PCP augments

veri er accept constitutes a pro of in the standard mathematical sense. However a p cp system has

the extra prop erty of enabling a lazy veri er, to toss coins, take its chances and \assess" the validity

of the pro of without reading all of it but rather by reading a tiny p ortion of it.

4.2 The p ower of probabilistically checkable pro ofs

The PCP Characterization Theorem states that

PCPlog;O1 = NP

Thus, probabilistically checkable pro ofs in which the veri er tosses only logarithmically many coins

and makes only a constantnumb er of queries exist for every NP-language. It follows that NP-pro ofs

can b e transformed into NP-pro ofs which o er a trade-o b etween the p ortion of the pro of b eing

read and the con dence it o ers. Sp eci cally, if the veri er is willing to tolerate an error probability

of  then it suces to let it examine O log1= bits of the transformed NP-pro of. These bit

lo cations need to b e selected at random. Furthermore, an original NP-pro of can b e transformed 8

into an NP-pro of allowing such trade-o in p olynomial-time. The latter is an artifact of the pro of

of the PCP Theorem.

The Pro of of the PCP Characterization Theorem is one of the most complicated pro ofs

in the Theory of Computation. Its main ingredients are:

1. A pcp log; p oly log pro of system for NP.Furthermore, this pro of system has additional

prop erties which enable pro of comp osition as in item 3 b elow.

2. A pcpp oly ;O1 pro of system for NP. This pro of system also has additional prop erties

enabling pro of comp osition as in item 3.

3. The pro of comp osition paradigm: Supp ose you havea pcpr;O` system for NP in

`

which a constantnumb er of queries are made non-adaptively to an 2 -valued oracle and

the veri er's decision regarding the answers may b e implemented by a p oly `-size circuit.

0

Further supp ose that you havea pcpr ;q-like system for P in which the input is given in

enco ded form via an additional oracle so that the system accepts input-oracles which enco de

inputs in the language and reject any input-oracle which is \far" from the enco ding of any

input in the language. In this latter system access to the input-oracle is accounted in the

query complexity. Furthermore, supp ose that the latter system may handle inputs which

result from concatenation of a constantnumb er of sub-inputs each enco ded in a separate

def

0

sub-input oracle. Then, NP has a pcp2r + r s; 2q s, where sn = p oly `n.

[The extra factor of 2 is an artifact of the need to amplify each of the two p cp systems so

that the total error probability sums up to at most 1=2.]

0

In particular, the pro of system of item 1 is comp osed with itself [using r = r = log , ` = q =

p oly log , and sn = p oly logn] yielding a pcp log; p oly log log  system for NP, which is then

0

comp osed with the system of item 2 [using r = log , ` = p oly log log , r = p oly , q = O 1, and

sn = p oly log log n] yielding the desired pcplog ;O1 system for NP.

The pcp log; p oly log system for NP: We start with a di erent arithmetization of CNF for-

mulae than the one used for constructing an interactive pro of for coNP. Logarithmically many

variables are used to represent in binary the names of variables and clauses in the input formula,

and an oracle from variables to Bo olean values is supp osed to represent a satisfying assignment.

An arithmetic expression involving a logarithmic numb er of summations is used to represent the

value of the formula under the truth assignment represented by the oracle. This expression is a

low-degree p olynomial in the new variables and has a cubic dep endency on the assignment-oracle.

Small-biased probability spaces are used to generate a p olynomial numberofsuch expressions so

that if the formula is satis able then all these expressions evaluate to zero, and otherwise at most

half of them evaluate to zero. Using a summation test as in the interactive pro of for coNP and

def

alow-degree test, this yields a pcpt;t system for NP, where tn = O logn  log log n.

[We use a nite eld of p oly logn elements, and so we need log n  O log log n random bits

O log n

-long sequences over for the summation test.] To obtain the desired p cp system, one uses

log log n

f1; :::; log ng to representvariable/clause names rather than logarithmically-long binary sequences.

O log n

[We can still use a nite eld of p oly logn elements, and so we need only  O log log n

log log n

random bits for the summation test.] All this is relatively easy compared to what is needed in order

to transform the p cp system so that only a constantnumb er of queries are made to a multi-valued

oracle. This is obtained via randomness-ecient \parallelization" of p cp systems, which in turn

dep ends heavily on ecientlow-degree tests. 9

Op en Problem 5 As a rst step towards the simpli cation of the proof of the PCP Characteri-

zation, one may want to provide an alternative \paral lelization"procedure which does not rely on

polynomials or any other algebraic creatures. A rst step towards this partial goal was taken by

Safra and myself see TR96-047 of http://www.eccc.uni-trier.de/eccc: We have constructed

an ecient low-degree test which utilizes a simple/inecient low-degree test which is paral lelized

using a new \combinatorial consistency lemma".

The pcpp oly ;O1 system for NP: It suces to prove the satis ability of a systems of

quadratic equations over GF2 as this problem in NPC. The oracle is supp osed to hold the

values of all quadratic expressions under a satisfying assignment to the variables. We distinguish

2

n n

two tables in the oracle: One corresp onding to the 2  linear expressions and the other to the 2

pure bilinear expressions. Each table is tested for self-consistency via a linearity test and the two

tables are tested to b e consistent via a matrix-equality test which utilizes \self-correction". Each

of these tests utilizes a constantnumb er of Bo olean queries, and randomness which is logarithmic

in the size of the corresp onding table.

4.3 PCP and Approximation

PCP-Characterizations of NP plays a central role in recent developments concerning the diculty

of approximation problems. To demonstrate this relationship, we rst note that the PCP Char-

acterization Theorem can b e rephrased without mentioning the class PCP altogether. Instead, a

new typ e of p olynomial-time reductions, whichwe call amplifying, emerges.

Amplifying reductions: There exists a constant >0, and a p olynomial-time reduction f ,of

3SAT to itself so that f maps non-satis able 3CNF formulae to 3CNF formulae for whichevery

truth assignment satis es at most a 1  fraction of the clauses. I call the reduction f amplify-

ing. Its existence follows from the PCP Characterization Theorem by considering the guaranteed

p cp system for 3SAT, asso ciating the bits of the oracle with Bo olean variables and intro ducing a

constant size Bo olean formula for each p ossible outcome of the sequence of O log n coin tosses

describing whether the veri er would have accepted given this outcome.

Amplifying reductions and Non-Approximability: The ab ove amplifying reduction of 3SAT

implies that it is NP-Hard to distinguish satis able 3CNF formulae from 3CNF formulae for which

every truth assignment satis es less than a 1  fraction of its clauses. Thus, Max-3SAT is NP-Hard

to approximate to within a 1  factor.

Stronger Non-Approximability Results were obtained via alternative PCP Characterizations

1

of NP.For example, the NP-Hardness of approximating Max-Clique to within N , 8>0, was

FPCPlog;, where the second parameter in FPCP measures the \amortized obtained via NP =

free-bit" complexity of the p cp system.

Op en Problems regarding various parameters in PCP Characterizations of NP will probably

remain also after the turbulence currently created byworks in progress by Hastad and by Raz &

Safra. 10