
On the Existence of 3-Round Zero-Knowledge Proofs by Matthew Lepinski Submitted to the Department of Electrical Engineering and Computer Science in partial fulfillment of the requirements for the degree of Master of Science in Computer Science and Engineering at the MASSACHUSETTS INSTITUTE OF TECHNOLOGY June 2002 © Massachusetts Institute of Technology 2002. All rights reserved.

Author: Department of Electrical Engineering and Computer Science, November 2, 2001

Certified by: Silvio Micali, Professor, Thesis Supervisor

Accepted by: Arthur C. Smith, Chairman, Department Committee on Graduate Students

On the Existence of 3-Round Zero-Knowledge Proofs by Matthew Lepinski

Submitted to the Department of Electrical Engineering and Computer Science on November 2, 2001, in partial fulfillment of the requirements for the degree of Master of Science in Computer Science and Engineering

Abstract

Goldreich and Krawczyk proved that there do not exist 3-round black-box zero-knowledge proofs or arguments for languages outside BPP. In 1998, Hada and Tanaka used non-standard assumptions to provide a 3-round zero-knowledge argument for every language in NP which was not black-box zero-knowledge. We present a non-black-box simulatable 3-round zero-knowledge proof system for NP, which is secure even when the prover has unbounded computational resources. However, we require a non-standard assumption (similar to those used by Hada and Tanaka) in order to prove our protocol is zero-knowledge. Additionally, we provide a proof of knowledge framework in which to view this type of non-standard assumption.

Thesis Supervisor: Silvio Micali Title: Professor

Acknowledgments

I would like to thank my adviser, Silvio Micali, for all of his assistance in producing this work. This material is based upon work supported under a National Science Foundation Graduate Research Fellowship.

Contents

1 Introduction

2 Notation

3 Background
3.1 Zero-Knowledge Proofs
3.2 Round Minimization in Zero-Knowledge Protocols
3.3 The Protocol of Hada and Tanaka
3.4 Blum's Zero-Knowledge Proof for Hamiltonian Cycle
3.5 The Goldreich-Levin Theorem

4 Our Assumptions
4.1 Proofs of Knowledge
4.2 Our Proof of Knowledge Assumption

5 Our Protocol
5.1 The Actual Protocol
5.2 Proof of Soundness
5.3 Proof of Zero-Knowledgeness

6 Conclusion

Bibliography

Chapter 1

Introduction

A zero-knowledge proof system is a protocol which allows a prover to convince a verifier that a statement is true without providing the verifier with any additional information about the statement being proved. Zero-knowledge proofs have seen much application in the design of cryptographic protocols. This is due in large part to the result by Goldreich, Micali and Wigderson[16] that there exists a zero-knowledge proof system for any language in NP. Additionally, work has been done to determine the minimum number of messages which must be exchanged in order to complete a zero-knowledge proof ([10], [13], [12]). Of particular interest to us is a paper by Hada and Tanaka which provides a three round (three message) protocol for any language in NP which is proven to be zero-knowledge given a set of strong assumptions. This is significant because it is known that no two round zero-knowledge protocol exists for any language outside BPP[13].

We present a three round zero-knowledge proof system for any language in NP which is an improvement over the Hada-Tanaka protocol in two important ways. First, our protocol is secure even if the prover has unbounded resources.¹ Second, we assume the hardness of the discrete logarithm problem for a randomly chosen prime and generator instead of assuming the discrete logarithm problem is hard for every prime number of a certain form.² Like the Hada-Tanaka paper, we rely on a non-standard assumption which seems to be quite strong.

In Chapter 2 we provide a summary of the notation that we use in this paper. In Chapter 3 we provide definitions and background from related work. In Chapter 4 we provide the assumptions which we need to prove our protocol correct. In Chapter 5 we provide our protocol and a proof that it is correct.

¹ The Hada-Tanaka protocol is a zero-knowledge argument and not a zero-knowledge proof system since it is secure only if the prover is polynomially bounded.

² All of our hardness assumptions are with respect to a randomly chosen prime and generator. The Hada-Tanaka paper assumes that the discrete logarithm problem and the Diffie-Hellman problem are hard for every generator and for every prime which is one more than twice a prime.

Chapter 2

Notation

Point of Notation 1 Let $X$ and $Y$ be distributions and let $P(\cdot, \cdot)$ be a predicate. Then by

$$\Pr[x \leftarrow X;\ y \leftarrow Y : P(x, y)]$$

we mean the probability that $P(x, y)$ is true when $x$ is drawn randomly from distribution $X$ and $y$ is drawn randomly from distribution $Y$.

For any set $S$, let $x \leftarrow S$ denote that $x$ is drawn from the uniform distribution over $S$. Similarly, for any randomized algorithm $A$, let $x \leftarrow A(y)$ denote that $x$ is drawn from the distribution induced by a random execution of algorithm $A$ on input $y$.

Point of Notation 2 When $X$ and $Y$ are distributions then

$$[X, Y]$$

denotes the distribution which produces pairs $(x, y)$ where $x$ is drawn from $X$ and $y$ is drawn from $Y$.

Point of Notation 3 When $X$ and $Y$ are distributions and $A$ is a randomized algorithm then

$$[x \leftarrow X,\ y \leftarrow Y,\ A(x, y)]$$

denotes the distribution which produces triples $(x, y, a)$ where $x$ is drawn from $X$, $y$ is drawn from $Y$ and $a$ is drawn from the distribution induced by a random execution of $A$ on inputs $x$ and $y$.

Point of Notation 4 We denote the result of the interaction between $A$ and $B$ on common input $x$ by the random variable

$$[A(x) \leftrightarrow B(x)] \in \{\text{accept}, \text{reject}\}$$

Point of Notation 5 Let $x, y \in \{0, 1\}^n$. Then by $\langle x, y \rangle$ we mean the inner product of $x$ and $y$ in the vector space $\mathbb{Z}_2^n$. That is,

$$\langle x, y \rangle = \sum_{i=1}^{n} x_i y_i \bmod 2$$

where $x_i$ is the $i$th bit of $x$.
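As a small illustration of the inner product in Point of Notation 5, the following Python sketch computes it for two bit strings. The function name `inner_product_mod2` and the choice to represent bit strings as Python `str` values are assumptions made only for this example.

```python
def inner_product_mod2(x: str, y: str) -> int:
    """Inner product of two equal-length bit strings over Z_2."""
    if len(x) != len(y):
        raise ValueError("bit strings must have the same length")
    # Multiply bitwise, add the products, and reduce modulo 2.
    return sum(int(a) & int(b) for a, b in zip(x, y)) % 2


# Positions 1 and 4 both hold a 1 in each string, so the sum is 2 and the result is 0.
print(inner_product_mod2("1011", "1101"))
```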

Chapter 3

Background

3.1 Zero-Knowledge Proofs

In their paper, "The Knowledge Complexity of Interactive Proofs", Goldwasser, Micali and Rackoff[18] introduced the concept of a zero-knowledge proof system. Informally, a zero-knowledge proof system for a language, $L$, is an interactive protocol by which a prover, $P$, can convince a verifier, $V$, that a common input $x$ is in $L$ without providing $V$ with any additional information.

Definition 1 (Zero-Knowledge Proof System) A zero-knowledge proof system for a language, $L$, is a protocol for a prover, $P$, and a verifier, $V$, that satisfies the following three properties.¹

1. Completeness: For all $x \in L$,

$$\Pr[[P(x) \leftrightarrow V(x)] = \text{accept}] = 1$$

2. Soundness: For all circuit families $P'$, for all $c$ and for all sufficiently long strings $x \notin L$,

$$\Pr[[P'(x) \leftrightarrow V(x)] = \text{accept}] < \frac{1}{|x|^c}$$

3. Zero-Knowledgeness: For all polynomial size circuit families, $V'$, there exists a polynomial size circuit family, $S_{V'}$, such that the probability ensembles $\{S_{V'}(x)\}_{x \in L}$ and $\{\mathrm{VIEW}([P(x) \leftrightarrow V'(x)])\}_{x \in L}$ are computationally indistinguishable. Here $\mathrm{VIEW}([P(x) \leftrightarrow V'(x)])$ denotes the distribution over the random input of $V'$ and the messages sent by $P$ in a random interaction of $P$ and $V'$ with common input $x$.

¹ As done in [19], we present a non-uniform formulation of zero-knowledge in terms of circuit families. Hada and Tanaka claim that non-uniform zero-knowledge is strictly weaker than the original GMR definition of zero-knowledge since one can construct protocols for which non-uniform simulators exist but uniform simulators do not. However, it is not clear whether one could construct protocols in which a non-uniform cheating verifier could extract knowledge but a uniform cheating verifier could not. A protocol of this type would satisfy the GMR definition of zero-knowledge, but not the non-uniform definition of zero-knowledge.

Property 3 above formalizes the notion that $V'$ learns nothing from interacting with $P$ besides the fact that $x \in L$. So if $V'$ already knew that $x \in L$, $V'$ could run $S_{V'}$ and obtain a view which is indistinguishable from the view he would receive if he were to interact with the true prover. Thus, $V'$ has no reason to interact with $P$ other than to learn that $x \in L$.

One can also consider a modification to Property 3 in which, instead of allowing a separate simulator for each cheating verifier, $V'$, it is required that there exist a single simulator, $S$, which, when given black-box access to $V'$, produces views which are indistinguishable from the views $V'$ would receive when interacting with the true prover. Proof systems that satisfy this property are known as black-box zero-knowledge proof systems. It is clear that black-box zero-knowledge is a stronger property than regular zero-knowledge.

Property 2 of a zero-knowledge proof system requires that even a computationally unbounded prover is unable to convince the verifier to accept a false theorem with non-negligible probability. Brassard, Chaum and Crépeau[7] introduced the notion of a zero-knowledge argument in which the soundness property is only guaranteed to hold if the cheating prover runs in probabilistic polynomial time. They observed that protection against probabilistic polynomial time provers was sufficient for most applications.

Definition 2 (Computational Indistinguishability[17] [25]) Let $L \subset \{0, 1\}^*$ be an infinite set of strings and let $\Pi_1 = \{\Pi_1(x)\}_{x \in L}$ and $\Pi_2 = \{\Pi_2(x)\}_{x \in L}$ be two probability ensembles. For any distinguishing circuit, $D$, let $p_i^D(x)$ be the probability that $D$ outputs 1 on an input drawn randomly from $\Pi_i(x)$. Then $\Pi_1$ and $\Pi_2$ are computationally indistinguishable if for any polynomial size family of circuits, $D$, for all $c$ and for all sufficiently long $x \in L$,

$$|p_1^D(x) - p_2^D(x)| < \frac{1}{|x|^c}$$

3.2 Round Minimization in Zero-Knowledge Protocols

One standard measure of the complexity of a zero-knowledge proof is the number of rounds of interaction (i.e. the number of messages sent from one party to another during the protocol). This complexity measure is of particular interest in a situation where the time required to perform local computation is much less than the time required to exchange messages (an example of this setting would be two modern personal computers communicating over a large network, such as the Internet). However, in the original zero-knowledge proof system for NP provided by Goldreich, Micali and Wigderson[16], the number of rounds required was polynomial in the size of the input.

Feige and Shamir[10] proved the existence of a 4-round zero-knowledge argument for any language in NP. Additionally, Goldreich and Oren[13] proved that neither 2-round zero-knowledge proofs nor arguments exist for languages outside of BPP. This work left open the question of whether or not 3-round zero-knowledge protocols exist for all of NP. Goldreich and Krawczyk[12] proved that 3-round black-box zero-knowledge proofs or arguments for languages outside of BPP cannot exist. This result was significant because at the time of its publication, every known zero-knowledge protocol satisfied the condition of black-box zero-knowledge.

3.3 The Protocol of Hada and Tanaka

In response to the result of Goldreich and Krawczyk[12], Hada and Tanaka[19] provided a 3-round zero-knowledge argument for any language in NP, which did not achieve black-box zero-knowledge. The methods used by Hada and Tanaka differed significantly from previous zero-knowledge protocols. In particular, the proof that their protocol is correct required two non-standard assumptions which they refer to as the Strong Diffie-Hellman Assumptions. The First Strong Diffie-Hellman Assumption is needed to prove that the protocol is zero-knowledge. This assumption is applied to a cheating verifier to demonstrate the existence of a family of circuits which the simulator can use to simulate properly. The Second Strong Diffie-Hellman Assumption is needed to prove that the protocol is sound. This assumption is applied to a cheating prover to demonstrate the existence of a family of circuits which could be used to compute discrete logarithms if the cheating prover were able to violate the soundness condition.

Assumption 1 (First Strong Diffie-Hellman Assumption (SDHA-1)) Let $I$ be a family of polynomial-size circuits which takes as input $(p, q, g, g^a \bmod p)$, where $p$ and $q$ are prime, $p = 2q + 1$ and $g$ is a generator of $\mathbb{Z}_p^*$, and tries to output $(B, X)$ such that $X = B^a$. For every family of polynomial-size circuits $I = \{I_n\}$ there exists another family of polynomial-size circuits $I' = \{I'_n\}$ which on input $(p, q, g, g^a)$ outputs $(B', X', b)$ such that for primes $p$ of length $n$ and generators $g$,

1. The distribution of the first two outputs of $I'(p, q, g, g^a)$ is statistically close to the distribution $I(p, q, g, g^a)$ (where the probability distribution is taken over a uniformly random choice of $a \in \mathbb{Z}_p^*$).

2. For every $c$ and all sufficiently large $n$,

$$\Pr[a \leftarrow \mathbb{Z}_p^*;\ (B', X', b) \leftarrow I'(p, q, g, g^a) : X' = B'^a \wedge B' \neq g^b] < \frac{1}{n^c}$$

Basically, this assumption is about the "knowledge" of a circuit. It claims that if there is a circuit family that is able to produce $B$ and $B^a \bmod p$ given only $g^a$, then the circuit family must "know" the discrete log of $B$ (i.e. there is another circuit family of similar size which is also able to output the discrete log of $B$).

In the Hada-Tanaka protocol, the prover sends $p$, $q$, $g$ and $g^a$ to the verifier and the verifier responds with $B$ and $B^a$. The rest of the protocol proceeds in such a way that the prover could cheat if he knew the discrete log of $B$, but since the prover is assumed to be computationally bounded he is unable to compute the discrete log. Additionally, the simulator needs the discrete log of $B$ in order to simulate properly. For any verifier, $V'$, that outputs $B$ and $B^a$, SDHA-1 states that there exists a related circuit family which also outputs the discrete log of $B$. This means that the simulator, $S_{V'}$, can have a description of this related circuit family hard-wired in and use it to acquire the discrete log of $B$ (and hence simulate properly).
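To make the "knowledge" intuition behind SDHA-1 concrete, the following Python sketch shows the obvious way to output a pair $(B, B^a)$ from $g^a$ alone: pick an exponent $b$ and raise both $g$ and $g^a$ to it. SDHA-1 asserts, roughly, that any efficient circuit producing such a pair has an efficient companion circuit that also outputs such a $b$. The toy parameters $p = 23$, $q = 11$, $g = 5$ and the function name `honest_circuit` are assumptions chosen only for illustration and offer no security.

```python
# Toy parameters (illustrative only): p = 2q + 1 with q prime, and 5 generates Z_23^*.
p, q, g = 23, 11, 5


def honest_circuit(ga: int, b: int):
    """Output (B, X, b) with X = B^a, using only g^a and a known exponent b."""
    B = pow(g, b, p)    # B = g^b
    X = pow(ga, b, p)   # X = (g^a)^b = (g^b)^a = B^a
    return B, X, b


a = 7                   # secret exponent, never given to the circuit
ga = pow(g, a, p)
B, X, b = honest_circuit(ga, 13)

assert X == pow(B, a, p)   # the pair has the Diffie-Hellman form required by SDHA-1
assert B == pow(g, b, p)   # and the third output is the discrete log of B
```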

Assumption 2 (Second Strong Diffie-Hellman Assumption (SDHA-2)) Let $I$ be a family of polynomial-size circuits which takes as input $(p, q, g, g^a \bmod p, g^b \bmod p, g^{ab} \bmod p)$, where $p$ and $q$ are prime, $p = 2q + 1$ and $g$ is a generator of $\mathbb{Z}_p^*$, and tries to output $(Y, C)$ such that $Y = C^b$. For every family of polynomial-size circuits $I = \{I_n\}$ there exists another family of polynomial-size circuits $I' = \{I'_n\}$ which on input $(p, q, g, g^a \bmod p, g^b \bmod p, g^{ab} \bmod p)$ outputs $(C', Y', c)$ such that for primes $p$ of length $n$, all generators $g$ of $\mathbb{Z}_p^*$ and all $a \in \mathbb{Z}_p^*$,

1. The distribution of the first two outputs of $I'(p, q, g, g^a, g^b, g^{ab})$ is statistically close to the distribution $I(p, q, g, g^a, g^b, g^{ab})$ (where the probability distribution is taken over a uniformly random choice of $b \in \mathbb{Z}_p^*$).

2. For every $c$ and all sufficiently large $n$,

$$\Pr[Y' = C'^b \wedge Y' \neq (g^b)^c \wedge Y' \neq (g^{ab})^c] < \frac{1}{n^c}$$

This is another assumption about the "knowledge" of a circuit. It claims that if there is a circuit family that is able to produce $C$ and $C^b \bmod p$ given $g^a$, $g^b$ and $g^{ab}$ then the circuit family must "know" either a discrete log of $C$ base $g$ or else a discrete log of $C$ base $g^a$ (i.e. there is another circuit family of similar size which is also able to output this discrete log).

In the Hada-Tanaka protocol, the prover sends $p$, $q$, $g$ and $g^a$ and the verifier responds with $g^b$ and $g^{ab}$. At this point the prover sends $Y$ and $C$ (where $Y = C^b \bmod p$) along with responses specified by $Y$. The responses are such that the prover can cheat if and only if the prover can choose a particular value of $Y$ (independent of $b$). However, if the prover, $P'$, can choose a particular value of $Y$, then by SDHA-2 there exists a related circuit family which computes the discrete log (either base $g$ or base $g^a$) of this $Y$. This related circuit family is then used to construct an inverting algorithm which contradicts the assumption that for every $p$, $q$ and $g$ (where $p$ and $q$ are prime, $p = 2q + 1$ and $g$ is a generator of $\mathbb{Z}_p^*$) it is hard to compute the discrete log of a randomly chosen element of $\mathbb{Z}_p^*$. In this way, Hada and Tanaka use SDHA-2 to show that the hardness of discrete log implies the soundness of their protocol.

These assumptions seem especially strong because they are built on the assumption that it is hard to compute the discrete log for any prime $p = 2q + 1$ and any generator $g$ of $\mathbb{Z}_p^*$. It is much more standard to assume that discrete log is hard with respect to a randomly chosen prime $p = 2q + 1$ and a randomly chosen generator $g$. We find assumptions which assume hardness with respect to a randomly chosen prime and randomly chosen generator to be more plausible since assumptions of this form allow for the possibility that there are rare primes or generators which are of a special form that makes computing the discrete logarithm easy.

3.4 Blum's Zero-Knowledge Proof for Hamiltonian Cycle

Although Goldreich, Micali and Wigderson[16] provide a zero-knowledge proof system for any language in NP, their protocol requires the verifier to choose one of polynomially many challenges. We instead base our proof system on the protocol of Blum[5], which has the desirable property that the verifier selects one of two challenges and, if the theorem being proved is false, then only one of the two possible challenges can be answered by the prover.² What follows is Blum's protocol for proving that a graph $G$ has a Hamiltonian cycle.

* PROVER: Randomly choose a permutation $\pi$ of the vertices of $G$. Send the verifier a commitment to the edges in $G' = \pi(G)$.

* VERIFIER: Choose a random bit, $b$, and send it to the prover.

* PROVER: If $b = 0$, send to the verifier both $\pi$ and a decommitment to every edge in $G'$. Let $H$ be a Hamiltonian cycle in $G$. If $b = 1$, send to the verifier a decommitment to every edge in $\pi(H)$.

* VERIFIER: If $b = 0$, accept if $\pi$ is a permutation which maps $G$ to $G'$. If $b = 1$, accept if the revealed edges form a Hamiltonian cycle in $G'$.

We will denote by $N$ the first message sent in Blum's protocol and we will denote by $N^0$ and $N^1$ the prover's response to challenges 0 and 1 respectively. When the protocol is being run several times, we use subscripts such that $N_i^0$ is the prover's response to challenge 0 in the instance of the protocol where the first message was $N_i$.

Blum's protocol is black-box zero-knowledge. Therefore, there exists a simulator which interacts with a verifier and then produces triples of the form $(\mu, M, M^b)$ (with $b$ the verifier's challenge) where $(M, M^b)$ are indistinguishable from the messages the honest prover would send in an interaction with the verifier when the verifier uses $\mu$ as its random string. Of particular interest is the verifier which always outputs 0 and the verifier which always outputs 1. We will denote by $(M, M^0)$ the pair of messages output by the simulator when interacting with the 0 verifier and by $(M, M^1)$ the pair of messages output by the simulator when interacting with the 1 verifier. (Again we will use subscripts when the simulators are run multiple times.) Without loss of generality we assume that there is a polynomial $f$ such that $N^0$, $N^1$, $M^0$, $M^1$ all have length $f(k)$.
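One round of Blum's protocol as described in this section can be sketched in Python as follows. This is only an illustration under stated assumptions: the protocol needs some bit commitment scheme, and the salted SHA-256 commitment used here is a stand-in we chose, not something specified by Blum or by this thesis. The graph is represented as a symmetric adjacency matrix, and all function names are hypothetical.

```python
import hashlib
import random
import secrets


def commit(bit):
    """Commit to one bit with a salted hash (an illustrative stand-in scheme)."""
    nonce = secrets.token_bytes(16)
    return hashlib.sha256(nonce + bytes([bit])).hexdigest(), nonce


def opens_to(com, bit, nonce):
    return com == hashlib.sha256(nonce + bytes([bit])).hexdigest()


def prover_commit(G):
    """First message: commitments to every adjacency entry of G' = pi(G)."""
    n = len(G)
    pi = list(range(n))
    random.shuffle(pi)
    coms = [[None] * n for _ in range(n)]
    nonces = [[None] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            coms[pi[u]][pi[v]], nonces[pi[u]][pi[v]] = commit(G[u][v])
    return coms, (pi, nonces)


def prover_respond(state, cycle, b):
    pi, nonces = state
    if b == 0:
        return pi, nonces                      # reveal pi and open every entry
    n = len(pi)
    image = [pi[v] for v in cycle]             # the cycle pi(H) in G'
    return [(image[i], image[(i + 1) % n], nonces[image[i]][image[(i + 1) % n]])
            for i in range(n)]                 # open only the cycle edges


def verifier_check(G, coms, b, resp):
    n = len(G)
    if b == 0:
        pi, nonces = resp
        if sorted(pi) != list(range(n)):
            return False
        return all(opens_to(coms[pi[u]][pi[v]], G[u][v], nonces[pi[u]][pi[v]])
                   for u in range(n) for v in range(n))
    if len(resp) != n or not all(opens_to(coms[u][v], 1, r) for u, v, r in resp):
        return False
    successor = {u: v for u, v, _ in resp}
    seen, current = set(), resp[0][0]
    for _ in range(n):                         # walk the revealed edges
        if current in seen or current not in successor:
            return False
        seen.add(current)
        current = successor[current]
    return current == resp[0][0]               # n distinct vertices, back at start


# Example graph: the 4-cycle 0-1-2-3-0 with the extra chord 0-2.
G = [[0, 1, 1, 1], [1, 0, 1, 0], [1, 1, 0, 1], [1, 0, 1, 0]]
H = [0, 1, 2, 3]
for b in (0, 1):
    coms, state = prover_commit(G)
    assert verifier_check(G, coms, b, prover_respond(state, H, b))
```

The sketch covers a single execution with one challenge bit; it says nothing about how the commitments would have to be chosen for the proofs of soundness and zero-knowledgeness to go through.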

² We could instead use any simple subset-revealing protocol as defined in [21].

3.5 The Goldreich-Levin Theorem

Goldreich and Levin[15] provide a reduction that allows us to turn a process which for randomly chosen $r$ produces $\langle x, r \rangle$ into a process which produces a list likely to contain $x$.

Theorem 1 (Goldreich-Levin) Suppose we have a random process $Q_{x,k} : \{0, 1\}^k \to \{0, 1\}$ so that for some integer $c$,

$$\Pr[r \leftarrow \{0, 1\}^k;\ q \leftarrow Q_{x,k}(r) : q = \langle r, x \rangle] \geq \frac{1}{2} + \frac{1}{k^c}$$

then there exists a probabilistic polynomial time algorithm $A$ which when given oracle access to $Q_{x,k}$ will output a list of strings which with probability at least $\frac{1}{2}$ contains $x$.

Since $A$ runs in polynomial time, the list produced by $A$ can have at most polynomial length. Therefore, by randomly choosing an element in the list produced by $A$ we get the following corollary.

Corollary 1.1 Suppose we have a random process $Q_{x,k} : \{0, 1\}^k \to \{0, 1\}$ so that for some integer $c$,

$$\Pr[r \leftarrow \{0, 1\}^k;\ q \leftarrow Q_{x,k}(r) : q = \langle r, x \rangle] \geq \frac{1}{2} + \frac{1}{k^c}$$

then there exists a probabilistic polynomial time algorithm $A$ and an integer $d$ such that when $A$ is given oracle access to $Q_{x,k}$, it will output $x$ with probability at least $\frac{1}{k^d}$.
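The flavor of this reduction can be illustrated with a much easier special case. The Python sketch below is not the Goldreich-Levin algorithm: it assumes the oracle is correct with probability well above $\frac{3}{4}$ (here 0.9), in which case each bit $x_i$ can be recovered by a majority vote over $Q(r) \oplus Q(r \oplus e_i)$, since $\langle r, x \rangle \oplus \langle r \oplus e_i, x \rangle = x_i$. Handling agreement only slightly above $\frac{1}{2}$, as in the theorem, requires the full list-decoding argument of [15]. Bit strings are represented as Python integers, and all names are illustrative.

```python
import random


def inner(x: int, r: int) -> int:
    """Inner product mod 2 of two bit strings encoded as integers."""
    return bin(x & r).count("1") % 2


def make_noisy_oracle(x: int, error: float, rng: random.Random):
    """Oracle returning <x, r>, flipped independently with probability `error`."""
    def Q(r: int) -> int:
        bit = inner(x, r)
        return bit ^ 1 if rng.random() < error else bit
    return Q


def recover(Q, k: int, samples: int, rng: random.Random) -> int:
    """Recover x bit by bit via majority vote (easy regime only)."""
    x = 0
    for i in range(k):
        e_i = 1 << i
        votes = 0
        for _ in range(samples):
            r = rng.getrandbits(k)
            votes += Q(r) ^ Q(r ^ e_i)   # equals x_i when both answers are right
        if 2 * votes > samples:
            x |= e_i
    return x


rng = random.Random(0)
secret = 0b1011001110001101
Q = make_noisy_oracle(secret, 0.1, rng)
print(recover(Q, 16, 201, rng) == secret)
```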

Chapter 4

Our Assumptions

First we make a fairly standard assumption about the difficulty of the discrete logarithm problem.

Assumption 3 (Standard Discrete Log Assumption) Let $I = \{I_n\}$ be any family of polynomial-size circuits. Let $\mathrm{PRIMES}_n$ be the set of all $n$-bit primes. Let $\mathrm{GEN}_p$ be the set of all generators of $\mathbb{Z}_p^*$. Then for all $c$ and all sufficiently large $n$,

$$\Pr[p \leftarrow \mathrm{PRIMES}_n;\ g \leftarrow \mathrm{GEN}_p;\ x \leftarrow \mathbb{Z}_p^*;\ a \leftarrow I_n(p, g, g^x) : a = x] < \frac{1}{n^c}$$

Additionally, we assume that the Diffie-Hellman Problem is hard[9].¹

Assumption 4 (Standard Diffie-Hellman Assumption) Let $I = \{I_n\}$ be any family of polynomial-size circuits. Let $\mathrm{PRIMES}_n$ be the set of all $n$-bit primes. Let $\mathrm{GEN}_p$ be the set of all generators of $\mathbb{Z}_p^*$. Then for all $c$ and all sufficiently large $n$,

$$\Pr[p \leftarrow \mathrm{PRIMES}_n;\ g \leftarrow \mathrm{GEN}_p;\ x \leftarrow \mathbb{Z}_p^*;\ y \leftarrow \mathbb{Z}_p^*;\ a \leftarrow I_n(p, g, g^x, g^y) : a = g^{xy}] < \frac{1}{n^c}$$

¹ We assume that no polynomial-size family of circuits can solve the Diffie-Hellman Problem with success probability greater than $\frac{1}{n^c}$. Victor Shoup[23] proved that this is equivalent to assuming that no polynomial-size family of circuits can solve the Diffie-Hellman problem with success probability $1 - \frac{1}{n^c}$.

4.1 Proofs of Knowledge

As noted earlier, the Strong Diffie-Hellman Assumptions used by Hada and Tanaka[19] are assumptions about the knowledge of a circuit. We believe that this type of assumption is necessary in order to achieve zero-knowledge in three rounds. We find a proof of knowledge to be a useful framework in which to consider this type of assumption.

Proofs of knowledge are a common tool in the design of cryptographic protocols. They are used in cryptographic protocols when one party needs confidence that another party is able to compute a value with specified properties. Fischer, Micali and Rackoff[11] were the first to use a proof of knowledge when they observed that the adversary could conceivably cheat if he were ignorant of a value that he was supposed to know.

We avoid presenting a rigorous definition of a proof of knowledge.² Intuitively, a proof of knowledge is a protocol involving a prover and a verifier such that if the verifier accepts the proof, then with high probability there is a polynomial time knowledge extractor which could interact several times with the prover and output a value with the specified properties. Of particular interest in a cryptographic setting is a witness-indistinguishable proof of knowledge in which the verifier gains no information about which value the prover knows.

The following protocol is a proof of knowledge similar to proofs of knowledge in [18], [16] and [24]. In this protocol, given a prime $p$, a generator $g$ and a number $Q$, the prover provides a pair $(X, Y)$ such that $XY = Q \bmod p$ and proves that he knows the discrete log of either $X$ or $Y$. To do this, the prover constructs a sequence of pairs $(W_i, Z_i)$ such that $W_i Z_i = Q \bmod p$. For each such pair, the verifier makes a challenge requesting either a proof that the pair $(W_i, Z_i)$ is properly constructed (i.e. that the prover knows the discrete log of either $W_i$ or $Z_i$) or a proof that if $(W_i, Z_i)$ is properly constructed then the prover knows the discrete log of either $X$ or $Y$.

1. VERIFIER: Chooses a random $k$-bit prime number $p$ such that the factorization of $p - 1$ is known.³ Chooses a random generator, $g$, of $\mathbb{Z}_p^*$. Chooses a random element $Q \in \mathbb{Z}_p^*$. Sends $p$, $g$ and $Q$ to PROVER.

2. PROVER: Sends a pair $(X, Y)$ to VERIFIER such that $XY = Q \bmod p$. Sends $k$ additional pairs $(W_1, Z_1), \ldots, (W_k, Z_k)$ to the VERIFIER such that $W_i Z_i = Q \bmod p$ for each $i$.

3. VERIFIER: Randomly selects $k$ bits $b_i$ and sends them to the prover.

4. PROVER: Let $x$ be such that either $g^x = X \bmod p$ or $g^x = Y \bmod p$. Let $w_i$ be such that either $g^{w_i} = W_i \bmod p$ or $g^{w_i} = Z_i \bmod p$. For each $i$, if $b_i = 0$ sends $B_i = w_i$ to VERIFIER. If $b_i = 1$, randomly selects $c_i \in \{-1, 1\}$ and sends $B_i = c_i(x - w_i)$ to VERIFIER.

5. VERIFIER: For each $i$, if $b_i = 0$ checks that $g^{B_i} = W_i \bmod p$ or $g^{B_i} = Z_i \bmod p$. If $b_i = 1$ checks that $g^{B_i}$ is in the set $\{X W_i^{-1}, X^{-1} W_i, X Z_i^{-1}, X^{-1} Z_i\}$. Rejects the proof if any of these checks fail.

² The precise definition of a proof of knowledge is still under debate.[20]

The above protocol is secure because a knowledge extractor which is able to interact multiple times with the prover will be able to get the responses to both of the challenges for a particular $(W_i, Z_i)$ and thus compute the discrete log of either $X$ or $Y$. We now consider the following modification of the protocol which makes use of a random oracle, $\mathcal{O}$ (without loss of generality we assume that the oracle always returns $k$ bits).
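The extraction argument can be made concrete with a short Python sketch. Given the response $w_i$ to challenge 0 and the response $c_i(x - w_i)$ to challenge 1 for the same pair $(W_i, Z_i)$, one of $w_i + B_i$ or $w_i - B_i$ (taken modulo $p - 1$) is the discrete log of $X$ or $Y$. The toy parameters $p = 23$, $g = 5$, $Q = 9$ and the function names are assumptions made only for this illustration.

```python
p, g = 23, 5          # toy parameters; 5 generates Z_23^*
order = p - 1         # exponents are reduced modulo p - 1


def respond(x: int, w: int, b: int, c: int) -> int:
    """The prover's answer B_i to challenge bit b, with sign c in {-1, 1}."""
    return w % order if b == 0 else (c * (x - w)) % order


def extract(X: int, Y: int, B0: int, B1: int):
    """Combine both answers for one pair (W_i, Z_i) into a discrete log of X or Y."""
    for candidate in ((B0 + B1) % order, (B0 - B1) % order):
        if pow(g, candidate, p) in (X, Y):
            return candidate
    return None


Q = 9
x, w = 7, 15                      # the prover's secrets
X = pow(g, x, p)
Y = Q * pow(X, -1, p) % p         # so that XY = Q mod p

for c in (1, -1):
    B0 = respond(x, w, 0, c)      # answer to challenge 0
    B1 = respond(x, w, 1, c)      # answer to challenge 1
    assert extract(X, Y, B0, B1) == x
```

The modular inverse via `pow(X, -1, p)` requires Python 3.8 or later.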

1. VERIFIER: Chooses a random $k$-bit prime number $p$ such that the factorization of $p - 1$ is known. Chooses a random generator, $g$, of $\mathbb{Z}_p^*$. Chooses a random element $Q \in \mathbb{Z}_p^*$. Sends $p$, $g$ and $Q$ to PROVER.

2. PROVER: Selects a pair $(X, Y)$ such that $XY = Q \bmod p$ and additional pairs $(W_1, Z_1), \ldots, (W_k, Z_k)$ such that $W_i Z_i = Q \bmod p$ for each $i$. Let $b_1, \ldots, b_k = \mathcal{O}(X, W_1, \ldots, W_k)$. Let $x$ be such that either $g^x = X \bmod p$ or $g^x = Y \bmod p$. Let $w_i$ be such that either $g^{w_i} = W_i \bmod p$ or $g^{w_i} = Z_i \bmod p$. Sends $(X, Y), (W_1, Z_1), \ldots, (W_k, Z_k)$ to VERIFIER. For each $i$, if $b_i = 0$ sends $B_i = w_i$ to VERIFIER. If $b_i = 1$, randomly selects $c_i \in \{-1, 1\}$ and sends $B_i = c_i(x - w_i)$ to VERIFIER.

3. VERIFIER: Let $b_1, \ldots, b_k = \mathcal{O}(X, W_1, \ldots, W_k)$. For each $i$, if $b_i = 0$ checks that $g^{B_i} = W_i \bmod p$ or $g^{B_i} = Z_i \bmod p$. If $b_i = 1$ checks that $g^{B_i}$ is in the set $\{X W_i^{-1}, X^{-1} W_i, X Z_i^{-1}, X^{-1} Z_i\}$. Rejects the proof if any of these checks fail.

³ This can be done by using the algorithm in [1].

This protocol is a secure proof of knowledge in the random oracle model because a knowledge extractor which interacts several times with the prover (with different choices for the oracle) will be able to get the responses to both challenges for the same $(W_i, Z_i)$ and can then compute the discrete log of either $X$ or $Y$.

4.2 Our Proof of Knowledge Assumption

Our proof of knowledge assumption is that there exists either a cryptographic function (such as SHA-1 or MD5) or a family of cryptographic functions such that the previous protocol remains a valid proof of knowledge even if the random oracle is replaced by the function. This is a strong assumption. Canetti, Goldreich and Halevi[8] show that there exist settings (albeit somewhat artificial ones) in which replacing a random oracle with any cryptographic function does not yield a secure outcome. However, we believe our assumption to be plausible. (Indeed, in our setting, it seems unlikely that the hash function will interact with the protocol in just the right way so as to allow a malicious prover to cheat.)
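Replacing the oracle by a concrete function amounts to deriving the challenge bits $b_1, \ldots, b_k$ from a hash of $(X, W_1, \ldots, W_k)$. The Python sketch below shows one way this could look. The use of SHA-256 in counter mode, the comma-separated encoding of the inputs, and the name `challenge_bits` are all assumptions made for this example; the thesis only requires some function with range $\{0, 1\}^k$.

```python
import hashlib


def challenge_bits(X: int, Ws: list, k: int) -> list:
    """Derive k challenge bits from (X, W_1, ..., W_k) with a hash function."""
    data = ",".join(str(v) for v in [X, *Ws]).encode()
    bits = []
    counter = 0
    while len(bits) < k:
        # Hash a counter together with the input to stretch the output to k bits.
        digest = hashlib.sha256(counter.to_bytes(4, "big") + data).digest()
        for byte in digest:
            for shift in range(8):
                bits.append((byte >> shift) & 1)
        counter += 1
    return bits[:k]


# Prover and verifier compute the same bits from the same public values.
bits = challenge_bits(17, [3, 8, 21], 3)
assert bits == challenge_bits(17, [3, 8, 21], 3)
```

Because the bits are a deterministic function of public values, the verifier's challenge message disappears, which is what allows the proof of knowledge to fit inside a single message of the 3-round protocol.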

Assumption 5 (Proof of Knowledge Assumption (POKA)) Let $I$ be a family of polynomial-size circuits which takes as input $(p, g, Q, h)$, where $p$ is a $k$-bit prime, $g$ is a generator of $\mathbb{Z}_p^*$, $Q$ is an element of $\mathbb{Z}_p^*$ and $h$ is a hash function whose range is $\{0, 1\}^k$, and tries to output $((X, Y), (W_1, Z_1), \ldots, (W_k, Z_k), B_1, \ldots, B_k, S)$ such that for each $i$, $b_i = 0$ implies $g^{B_i} \in \{W_i, Z_i\}$ and $b_i = 1$ implies $g^{B_i} \in \{X W_i^{-1}, X^{-1} W_i, X Z_i^{-1}, X^{-1} Z_i\}$, where $b_i$ is the $i$th bit of $h(X, W_1, \ldots, W_k)$. For every family of polynomial-size circuits $I = \{I_k\}$ there exists another family of polynomial-size circuits $I' = \{I'_k\}$ which on input $(p, g, Q, h)$ outputs $((X, Y), (W_1, Z_1), \ldots, (W_k, Z_k), B_1, \ldots, B_k, S, x)$ such that

1. The distribution of the first $2k + 2$ outputs of $I'_k(p, g, Q, h)$ is statistically close to the distribution $I_k(p, g, Q, h)$ (where the probability distribution is taken over a random choice of $p \in \mathrm{PRIMES}_k$, $g \in \mathrm{GEN}_p$ and $Q \in \mathbb{Z}_p^*$).

2. For every $c$ and all sufficiently large $k$,

$$\Pr\Big[p \leftarrow \mathrm{PRIMES}_k;\ g \leftarrow \mathrm{GEN}_p;\ Q \leftarrow \mathbb{Z}_p^*;\ ((X, Y), (W_1, Z_1), \ldots, (W_k, Z_k), B_1, \ldots, B_k, S, x) \leftarrow I'_k(p, g, Q, h) :$$
$$\Big(\bigwedge_i \big((b_i = 0 \wedge g^{B_i} \in \{W_i, Z_i\}) \vee (b_i = 1 \wedge g^{B_i} \in \{X W_i^{-1}, X^{-1} W_i, X Z_i^{-1}, X^{-1} Z_i\})\big)\Big) \wedge g^x \notin \{X, Y\}\Big] < \frac{1}{k^c}$$

Note that the above assumption is parameterized by the method in which the hash function is chosen. From this point forward we will use the notation $\mathcal{H}_k$ to mean a distribution of hash functions for which the above assumption is believed to be true.

Chapter 5

Our Protocol

5.1 The Actual Protocol

Our protocol is based on the ideas of Kilian, Micali and Ostrovsky[21] and Bellare and Micali[3]. Kilian, Micali and Ostrovsky show how all of the interaction in a zero-knowledge proof can be moved to a short interactive preprocessing stage which could be performed before selecting the theorem to be proved. Bellare and Micali introduce the notion of an oblivious transfer channel and show how such a channel can be implemented using public keys. Additionally, they show how the protocol of Kilian, Micali and Ostrovsky could be modified to use an oblivious transfer channel.

In our protocol, we use our POKA to set up $k$ oblivious transfer channels without using public keys. Then for each channel, the prover proceeds to send one response in Blum's protocol using the first half of the channel and the other response in Blum's protocol using the second half of the channel (and thus the prover is assured that the verifier is only able to read one of the two responses). The following protocol is our 3-round zero-knowledge proof that a graph $G$ of size $k$ has a Hamiltonian cycle.

* PROVER: Chooses a random $k$-bit prime number $p$ such that the factorization of $p - 1$ is known.¹ Chooses a random generator, $g$, of $\mathbb{Z}_p^*$. Chooses a random element $Q \in \mathbb{Z}_p^*$. Sends $(p, g, Q, H)$ to VERIFIER where $H$ is a hash function chosen from $\mathcal{H}_k$.

¹ This can be done by using the algorithm in [1].

* VERIFIER: Selects $k$ random elements $x_1, \ldots, x_k$ of $\mathbb{Z}_p^*$. Flips $k$ coins to create random values $C_1, \ldots, C_k \in \{0, 1\}$.

* VERIFIER: For each $x_i$, if $C_i = 0$, let $X_i = g^{x_i} \bmod p$ and $Y_i = Q X_i^{-1} \bmod p$. If $C_i = 1$, let $Y_i = g^{x_i} \bmod p$ and $X_i = Q Y_i^{-1} \bmod p$.

* VERIFIER: For each $x_i$, selects $k$ random elements $w_{i,1}, \ldots, w_{i,k}$ of $\mathbb{Z}_p^*$ and flips $k$ coins to create random values $C_{i,1}, \ldots, C_{i,k} \in \{0, 1\}$.

* VERIFIER: For each $w_{i,j}$, if $C_{i,j} = 0$, let $W_{i,j} = g^{w_{i,j}} \bmod p$ and $Z_{i,j} = Q W_{i,j}^{-1} \bmod p$. If $C_{i,j} = 1$, let $Z_{i,j} = g^{w_{i,j}} \bmod p$ and $W_{i,j} = Q Z_{i,j}^{-1} \bmod p$.

* VERIFIER: For each $x_i$, let $b_{i,1}, \ldots, b_{i,k} = H(X_i, W_{i,1}, \ldots, W_{i,k})$.

* VERIFIER: For each $w_{i,j}$, if $b_{i,j} = 0$ then $B_{i,j} = w_{i,j}$. If $b_{i,j} = 1$, then randomly select $c_{i,j} \in \{-1, 1\}$ and let $B_{i,j} = c_{i,j}(x_i - w_{i,j})$.

* VERIFIER: Flip $k$ coins to create random values $S_1, \ldots, S_k \in \{0, 1\}$. Send $((X_1, Y_1), \ldots, (X_k, Y_k), (W_{1,1}, Z_{1,1}), \ldots, (W_{k,k}, Z_{k,k}), B_{1,1}, \ldots, B_{k,k}, S_1, \ldots, S_k)$ to PROVER.

* PROVER: For each $X_i$, compute $b_{i,1}, \ldots, b_{i,k} = H(X_i, W_{i,1}, \ldots, W_{i,k})$. For each $i, j$, if $b_{i,j} = 0$, check that either $g^{B_{i,j}} = W_{i,j} \bmod p$ or $g^{B_{i,j}} = Z_{i,j} \bmod p$. If $b_{i,j} = 1$, check that $g^{B_{i,j}}$ is in the set $\{X_i W_{i,j}^{-1}, X_i^{-1} W_{i,j}, X_i Z_{i,j}^{-1}, X_i^{-1} Z_{i,j}\}$. If any of these checks fails, reject and terminate the protocol.

* PROVER: Select $R$ to be a random element of $\{0, 1\}^k$ and select $a_{1,1}, \ldots, a_{k,k}$ to be random elements of $\mathbb{Z}_p^*$. Let $K_i^X$ be the $k$-bit string whose $j$th bit is $\langle R, X_i^{a_{i,j}} \rangle$. Similarly, let $K_i^Y$ be the $k$-bit string whose $j$th bit is $\langle R, Y_i^{a_{i,j}} \rangle$.

9 PROVER: Choose k messages Ni and corresponding responses N2 and Nj as the honest prover would in Blum's protocol. Let F be a pseudorandom gener- ator.2 For each Si, if Si = 0 then Construct Li by taking the XOR of Nf with

2 Pseudorandom generators exist if computing discrete logs is hard[6]

28 F(K ) and construct L' by taking the XOR of Nil with F(Kf). If Si 1 then Construct Li by taking the XOR of Ni with F(K ) and construct L, by taking the XOR of N.0 with F(KI[).

* PROVER: Let Aij = g' mod p. Send (R, A 1,1,... , Ai,,, N1, ... , Na, L1 , IL ... , L,) to VERIFIER.

" VERIFIER: Let Ki be the k-bit string whose Jth bit is < Ri , Ax >. For each i, if Ci = 0, let RESPONDi be the XOR of Li with F(Ki). If Ci = 1, let RESPONDi be the XOR of L' with F(Ki). Accept only if for each i, either Si = Ci and RESPONDi is a correct response to challenge 0 in Blum's protocol with first message Ni or Si # Ci and RESPONDi is a correct response to challenge 1 in Blum's protocol with first message Ni.

The protocol is complete since if the prover follows the specified program, the honest verifier will always accept.
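The algebra behind the completeness claim can be exercised on a toy group. The sketch below is only an illustration under loud assumptions: it uses the tiny prime p = 23 with generator g = 5 instead of a random k-bit prime, passes the hash bit b in directly rather than computing it with H, and handles a single index pair (i, j). The helper names (`split_pair`, `verifier_pairs`, `prover_check`, `channel_bits`, `verifier_bit`) are hypothetical and do not come from the thesis.

```python
import secrets

# Toy parameters (assumption): p = 23 is prime and g = 5 generates Z_23^*.
p, g = 23, 5

def inv(a):
    """Multiplicative inverse modulo the prime p (Fermat's little theorem)."""
    return pow(a, p - 2, p)

def rand_exp():
    """A uniformly random exponent in 1..p-1."""
    return secrets.randbelow(p - 1) + 1

def split_pair(Q, secret, coin):
    """Return (first, second) with first * second = Q mod p; the discrete log
    `secret` belongs to first if coin == 0 and to second if coin == 1."""
    known = pow(g, secret, p)
    other = Q * inv(known) % p
    return (known, other) if coin == 0 else (other, known)

def verifier_pairs(Q, b):
    """Verifier's values for one (i, j); b stands in for the hash bit b_{i,j}."""
    x, C = rand_exp(), secrets.randbelow(2)
    w, Cw = rand_exp(), secrets.randbelow(2)
    X, Y = split_pair(Q, x, C)
    W, Z = split_pair(Q, w, Cw)
    if b == 0:
        B = w
    else:
        c = secrets.choice([-1, 1])
        B = (c * (x - w)) % (p - 1)   # exponents live modulo p - 1
    return (x, C), (X, Y, W, Z, B)

def prover_check(X, Y, W, Z, B, b):
    """The prover's consistency check on B_{i,j}."""
    gB = pow(g, B, p)
    if b == 0:
        return gB in (W, Z)
    allowed = {X * inv(W) % p, inv(X) * W % p, X * inv(Z) % p, inv(X) * Z % p}
    return gB in allowed

def inner_product_bit(R, value):
    """<R, value>: parity of the bitwise AND of the two bit strings."""
    return bin(R & value).count("1") % 2

def channel_bits(R, X, Y):
    """Prover side of one key bit: returns A = g^a and the bits for X and Y."""
    a = rand_exp()
    A = pow(g, a, p)
    return A, inner_product_bit(R, pow(X, a, p)), inner_product_bit(R, pow(Y, a, p))

def verifier_bit(R, A, x):
    """Verifier side: the only key bit it can compute, <R, A^x>."""
    return inner_product_bit(R, pow(A, x, p))
```

In this sketch the verifier's bit always matches the prover's bit for whichever of $X_i$, $Y_i$ the verifier knows the discrete log of, which is why the honest verifier can unmask exactly one of $L_i$, $L_i'$.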

5.2 Proof of Soundness

Assume that $G$ does not have a Hamiltonian cycle. Then for any first message $N_i$ in Blum's protocol it is possible for the prover to correctly answer at most one of the two possible challenges. (That is, it is possible to produce either $N_i^0$ or $N_i^1$ but not both.) Given $N_i$, let $\sigma_i$ be such that the prover can produce $N_i^{\sigma_i}$ but not $N_i^{1-\sigma_i}$.

For the honest verifier, the message sent to the prover is completely independent of whether the verifier knows the discrete log of $X_i$ or $Y_i$. (That is, it is completely independent of the $C_i$'s.) This is because each pair $(X_i, Y_i)$ is a random pair of elements in $Z_p^*$ whose product is $Q$. Additionally, $B_{i,j}$ gives no information about $C_i$ because for fixed $(X_i, Y_i, W_{i,j}, Z_{i,j})$ there are four possible values of $B_{i,j}$ depending on the values of $C_i$, $C_{i,j}$ and $c_{i,j}$, and each of the four values of $B_{i,j}$ results from one setting of $(C_i, C_{i,j}, c_{i,j})$ in which $C_i = 0$ and from one setting in which $C_i = 1$.
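The counting claim about $B_{i,j}$ (for the case $b_{i,j} = 1$) can be checked by brute-force enumeration. The sketch below is an illustration, not part of the original argument: it writes every exponent as a coefficient vector over the hypothetical unknowns (q, xi, omega), where $Q = g^q$, $X_i = g^{\xi}$ and $W_{i,j} = g^{\omega}$, so that $Y_i = g^{q-\xi}$ and $Z_{i,j} = g^{q-\omega}$.

```python
from itertools import product
from collections import defaultdict

def known_log(coin, which):
    """Exponent the verifier knows, as coefficients of (q, xi, omega):
    the log of X (or W) if coin == 0, the log of Y (or Z) if coin == 1."""
    unit = (0, 1, 0) if which == "x" else (0, 0, 1)
    if coin == 0:
        return unit
    return tuple(a - b for a, b in zip((1, 0, 0), unit))

def b_value(C, Cw, c):
    """B = c * (x - w), expressed as coefficients of (q, xi, omega)."""
    x = known_log(C, "x")
    w = known_log(Cw, "w")
    return tuple(c * (a - b) for a, b in zip(x, w))

# Group the eight settings (C_i, C_ij, c_ij) by the value of B they produce.
settings = defaultdict(list)
for C, Cw, c in product((0, 1), (0, 1), (-1, 1)):
    settings[b_value(C, Cw, c)].append((C, Cw, c))

for value, triples in settings.items():
    print(value, triples)
```

Each of the four resulting values is listed with exactly two settings, one with $C_i = 0$ and one with $C_i = 1$, which is the symmetry the soundness argument relies on.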

For the honest verifier, with probability one-half, $S_i = C_i$. Since the prover has no information about $C_i$, with probability one-half either $S_i = C_i$ and $\sigma_i \neq 0$, or $S_i \neq C_i$ and $\sigma_i \neq 1$. Observe that the verifier will reject if for any $i$ either $S_i = C_i$ and $\sigma_i \neq 0$, or $S_i \neq C_i$ and $\sigma_i \neq 1$. Therefore, with probability at least one-half, the verifier will reject because of the prover's $i$th response. Since the $C_i$ are chosen independently, the verifier will accept with probability at most $2^{-k}$.
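Spelled out as a calculation, and writing $\sigma_i$ for the single challenge the prover can answer for first message $N_i$, the bound can be sketched as follows. This assumes, as argued above, that the prover's view is independent of the uniformly and independently chosen coins $C_1, \ldots, C_k$.

```latex
\[
\Pr[\text{verifier accepts}]
  \;\le\; \prod_{i=1}^{k} \Pr\bigl[S_i \oplus C_i = \sigma_i\bigr]
  \;=\; \prod_{i=1}^{k} \frac{1}{2}
  \;=\; 2^{-k}.
\]
```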

5.3 Proof of Zero-Knowledgeness

For any cheating verifier, $V'$, we construct a simulator $S_{V'}$ and show that for graphs $G$ which contain a Hamiltonian cycle, $S_{V'}(G)$ is computationally indistinguishable from $\mathrm{VIEW}([P(G) \leftrightarrow V'(G)])$. For any cheating verifier, $V'$, by assumption there exists another family of circuits $\hat{V}'$ as specified by POKA.³ Let $S_{V'}$ be the family of probabilistic circuits which on input $G$ do the following:

1. Let $k$ be the size of the input graph $G$.

2. Select a random string $\tau$ to be used as the random string for the verifier.

3. Select a random $k$-bit prime number $p$ such that the factorization of $p - 1$ is known [1].

4. Select a random generator, $g$, of $Z_p^*$.

5. Select a random element, $Q$, of $Z_p^*$.

6. Select a random hash function, $H$, from $\mathcal{H}_k$.

7. Run $\hat{V}'$ on input $(p, g, Q, H)$ and random string $\tau$ to obtain $((X_1, Y_1), \ldots, (X_k, Y_k), (W_{1,1}, Z_{1,1}), \ldots, (W_{k,k}, Z_{k,k}), B_{1,1}, \ldots, B_{k,k}, S_1, \ldots, S_k)$ and $(x_1, \ldots, x_k)$.

8. For each $X_i$, compute $b_{i,1} \ldots b_{i,k} = H(X_i, W_{i,1}, \ldots, W_{i,k})$. For each $i, j$, if $b_{i,j} = 0$, check that either $g^{B_{i,j}} = W_{i,j} \bmod p$ or $g^{B_{i,j}} = Z_{i,j} \bmod p$. If $b_{i,j} = 1$, check that $g^{B_{i,j}}$ is in the set $\{X_i W_{i,j}^{-1}, X_i^{-1} W_{i,j}, X_i Z_{i,j}^{-1}, X_i^{-1} Z_{i,j}\}$. If any of these checks fails, output $(\tau, (p, g, Q, H), \mathrm{REJECT})$ and terminate.

9. Select $R$ to be a random element of $\{0, 1\}^k$ and select $a_{1,1}, \ldots, a_{k,k}$ to be random elements of $Z_p^*$.

10. Let $K_i^X$ be the $k$-bit string whose $j$th bit is $\langle R, X_i^{a_{i,j}} \rangle$. Similarly, let $K_i^Y$ be the $k$-bit string whose $j$th bit is $\langle R, Y_i^{a_{i,j}} \rangle$.

11. For each $x_i$, let $C_i = 1$ if $Y_i = g^{x_i} \bmod p$ and $C_i = 0$ otherwise.

12. For each $x_i$, if $C_i = S_i$ run the simulator for Blum's protocol to get $M_i$ and $M_i^0$, and if $C_i \neq S_i$ run the simulator for Blum's protocol to get $M_i$ and $M_i^1$.

13. If $C_i = 0$, construct $L_i$ by taking the XOR of $M_i^{S_i \oplus C_i}$ and $F(K_i^X)$ and construct $L_i'$ to be $F(K_i^Y)$. If $C_i = 1$, construct $L_i$ to be $F(K_i^X)$ and construct $L_i'$ by taking the XOR of $M_i^{S_i \oplus C_i}$ and $F(K_i^Y)$.

14. Let $A_{i,j} = g^{a_{i,j}} \bmod p$ and output $(\tau, (p, g, Q, H), M_1, \ldots, M_k, R, A_{1,1}, \ldots, A_{k,k}, L_1, L_1', \ldots, L_k, L_k')$.

³ Technically, $V'$ has $k$ sets of outputs and so we apply POKA $k$ times (on the $i$th application of POKA we apply the assumption to a version of $V'$ which is modified to only output the $i$th set of outputs). We can then easily combine the $k$ resulting circuits into a single circuit which outputs all $k$ discrete logs. For simplicity, when we refer to $\hat{V}'$ we are referring to the composite circuit which results from $k$ applications of POKA.
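The masking in the simulator's last steps can be sketched in a few lines. This is only an illustration under stated assumptions: SHAKE-256 stands in for the pseudorandom generator $F$ (the thesis does not fix a concrete $F$), keys and responses are plain byte strings, and the function names are hypothetical. The point it shows is that the simulator needs only the one Blum response the verifier can actually open; the other slot is filled with a bare pad.

```python
import hashlib

def F(seed: bytes, n: int) -> bytes:
    """Stand-in for the pseudorandom generator F (assumption: SHAKE-256)."""
    return hashlib.shake_256(seed).digest(n)

def xor(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def simulate_pair(C: int, M_resp: bytes, KX: bytes, KY: bytes):
    """Mask the single simulated response with the key the verifier can
    derive (K^X if C == 0, K^Y if C == 1); the other slot is just a pad."""
    n = len(M_resp)
    if C == 0:
        return xor(M_resp, F(KX, n)), F(KY, n)
    return F(KX, n), xor(M_resp, F(KY, n))

def verifier_open(C: int, L: bytes, L_prime: bytes, K: bytes) -> bytes:
    """Verifier unmasks the slot matching its coin C with its key K."""
    slot = L if C == 0 else L_prime
    return xor(slot, F(K, len(slot)))
```

A verifier holding the key for its own slot recovers exactly the simulated response, while the other slot carries no response at all.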

First we consider the case where $\hat{V}'$, the circuit family associated with $V'$ by POKA, outputs $B_{1,1}, \ldots, B_{k,k}$ such that for some $i, j$, $B_{i,j}$ does not pass the prover's checks. By POKA, the output of $\hat{V}'$ is statistically close to the output of $V'$. Also note that in this case, the simulator, $S_{V'}$, behaves in a manner identical to that of the honest prover. So, in this case, $S_{V'}(G)$ is statistically close to $\mathrm{VIEW}([P(G) \leftrightarrow V'(G)])$. Therefore, for any verifier $V'$ that sometimes outputs $B_{1,1}, \ldots, B_{k,k}$ which do not pass the prover's checks, we consider a modified verifier $V''$ which behaves like $V'$ except that when $V'$ would produce an output that doesn't pass the prover's checks, $V''$ instead behaves like the honest verifier. We note that if $S_{V'}(G)$ and $\mathrm{VIEW}([P(G) \leftrightarrow V'(G)])$ are distinguishable then so are $S_{V''}(G)$ and $\mathrm{VIEW}([P(G) \leftrightarrow V''(G)])$, and therefore from this point forward we restrict our attention to verifiers whose output always passes the prover's checks.

Next we consider the case where $\hat{V}'$ sometimes outputs $x_1, \ldots, x_k$ such that $g^{x_i} = Y_i \bmod p$. In this case we can replace $\hat{V}'$ with a circuit $\hat{V}''$ which on input $(p, g, Q, H)$ and random string $\tau$, first runs $\hat{V}'$ on input $(p, g, Q, H)$ and random string $\tau$. Then $\hat{V}''$ checks to see if for any $i$, $g^{x_i} = Y_i \bmod p$. For each such $i$, $\hat{V}''$ swaps $X_i$ and $Y_i$ and produces output identical to $\hat{V}'$ except for the swaps. Additionally, consider the verifier $V''$ which on input $(p, g, Q, H)$ and random string $\tau$, first runs $\hat{V}'$ on input $(p, g, Q, H)$ and random string $\tau$. Then $V''$ checks to see if for any $i$, $g^{x_i} = Y_i \bmod p$. For each such $i$, $V''$ swaps $X_i$ and $Y_i$. $V''$ then produces output identical to $V'$ except for the swaps. Observe that the pair $\hat{V}''$ and $V''$ satisfy the conditions of POKA. Additionally observe that if $S_{V'}(G)$ and $\mathrm{VIEW}([P(G) \leftrightarrow V'(G)])$ are distinguishable, then so are $S_{V''}(G)$ and $\mathrm{VIEW}([P(G) \leftrightarrow V''(G)])$. Therefore, from this point forward we restrict our attention to verifiers $V'$ such that $\hat{V}'$ always outputs $x_1, \ldots, x_k$ such that $g^{x_i} \neq Y_i \bmod p$.

To derive a contradiction we assume that there exists an infinite family of input graphs, $\{G_k\}$ (without loss of generality assume that for sufficiently large $k$, $G_k$ has size $k$), such that $S_{V'}(G_k)$ is computationally distinguishable from $\mathrm{VIEW}([P(G_k) \leftrightarrow V'(G_k)])$. It will be useful to define several circuit families and probability distributions. For simplicity in the following argument we will write $[p, g, Q, H, R, \tau]$ to mean

$$[p \leftarrow \mathrm{PRIMES}_k,\; g \leftarrow \mathrm{GEN}_p,\; Q \leftarrow Z_p^*,\; H \leftarrow \mathcal{H}_k,\; R \leftarrow \{0,1\}^k,\; \tau \leftarrow \{0,1\}^k]$$

Let $\{\Sigma_k^i\}$ be the family of circuits which on input $(\tau, p, g, Q, H, R)$ does the following:

1. Run $\hat{V}'$ on input $(p, g, Q, H)$ and random string $\tau$ to obtain $(X_i, Y_i)$, $S_i$ and $x_i$.

2. Select $a_{i,1}, \ldots, a_{i,k}$ to be random elements of $Z_p^*$.

3. Let $K_i^X$ be the $k$-bit string whose $j$th bit is $\langle R, X_i^{a_{i,j}} \rangle$. Similarly, let $K_i^Y$ be the $k$-bit string whose $j$th bit is $\langle R, Y_i^{a_{i,j}} \rangle$.

4. Let $C_i = 0$.

5. If $C_i = S_i$ run the simulator for Blum's protocol to get $M_i$ and $M_i^0$, and if $C_i \neq S_i$ run the simulator for Blum's protocol to get $M_i$ and $M_i^1$.

6. If $C_i = 0$, construct $L_i$ by taking the XOR of $M_i^{S_i \oplus C_i}$ and $F(K_i^X)$ and construct $L_i'$ to be $F(K_i^Y)$. If $C_i = 1$, construct $L_i$ to be $F(K_i^X)$ and construct $L_i'$ by taking the XOR of $M_i^{S_i \oplus C_i}$ and $F(K_i^Y)$.

7. Let $A_{i,j} = g^{a_{i,j}} \bmod p$ and output $(M_i, A_{i,1}, \ldots, A_{i,k}, L_i, L_i')$.

Let $\{\Pi_k^i\}$ be the family of circuits which on input $(\tau, p, g, Q, H, R)$ does the following (note that the Hamiltonian cycle for $G_k$ can be hardwired into $\Pi_k^i$):

1. Run $\hat{V}'$ on input $(p, g, Q, H)$ and random string $\tau$ to obtain $(X_i, Y_i)$, $S_i$ and $x_i$.

2. Select $a_{i,1}, \ldots, a_{i,k}$ to be random elements of $Z_p^*$.

3. Let $K_i^X$ be the $k$-bit string whose $j$th bit is $\langle R, X_i^{a_{i,j}} \rangle$. Similarly, let $K_i^Y$ be the $k$-bit string whose $j$th bit is $\langle R, Y_i^{a_{i,j}} \rangle$.

4. Use the Hamiltonian cycle in $G_k$ to obtain $N_i$, $N_i^0$ and $N_i^1$.

5. If $S_i = 0$ then construct $L_i$ by taking the XOR of $N_i^0$ with $F(K_i^X)$ and construct $L_i'$ by taking the XOR of $N_i^1$ with $F(K_i^Y)$. If $S_i = 1$ then construct $L_i$ by taking the XOR of $N_i^1$ with $F(K_i^X)$ and construct $L_i'$ by taking the XOR of $N_i^0$ with $F(K_i^Y)$.

6. Let $A_{i,j} = g^{a_{i,j}} \bmod p$ and output $(N_i, A_{i,1}, \ldots, A_{i,k}, L_i, L_i')$.

Let $\{\Phi_k^i\}$ be the family of circuits which on input $(\tau, p, g, Q, H, R)$ does the following (note that the Hamiltonian cycle for $G_k$ can be hardwired into $\Phi_k^i$):

1. Run $\hat{V}'$ on input $(p, g, Q, H)$ and random string $\tau$ to obtain $(X_i, Y_i)$, $S_i$ and $x_i$.

2. Select $a_{i,1}, \ldots, a_{i,k}$ to be random elements of $Z_p^*$.

3. Let $K_i^X$ be the $k$-bit string whose $j$th bit is $\langle R, X_i^{a_{i,j}} \rangle$. Similarly, let $K_i^Y$ be the $k$-bit string whose $j$th bit is $\langle R, Y_i^{a_{i,j}} \rangle$.

4. Let $C_i = 0$.

5. If $C_i = S_i$ use the Hamiltonian cycle in $G_k$ to obtain $N_i$ and $N_i^0$, and if $C_i \neq S_i$ use the Hamiltonian cycle in $G_k$ to obtain $N_i$ and $N_i^1$.

6. If $C_i = 0$, construct $L_i$ by taking the XOR of $N_i^{S_i \oplus C_i}$ and $F(K_i^X)$ and construct $L_i'$ to be $F(K_i^Y)$. If $C_i = 1$, construct $L_i$ to be $F(K_i^X)$ and construct $L_i'$ by taking the XOR of $N_i^{S_i \oplus C_i}$ and $F(K_i^Y)$.

7. Let $A_{i,j} = g^{a_{i,j}} \bmod p$ and output $(N_i, A_{i,1}, \ldots, A_{i,k}, L_i, L_i')$.

Let $\{\Omega_k^i\}$ be the family of circuits which on input $(\tau, p, g, Q, H, R)$ does the following (note that the Hamiltonian cycle for $G_k$ can be hardwired into $\Omega_k^i$):

1. Run $\hat{V}'$ on input $(p, g, Q, H)$ and random string $\tau$ to obtain $(X_i, Y_i)$, $S_i$ and $x_i$.

2. Select $a_{i,1}, \ldots, a_{i,k}$ to be random elements of $Z_p^*$.

3. Let $K_i^X$ be the $k$-bit string whose $j$th bit is $\langle R, X_i^{a_{i,j}} \rangle$.

4. Let $C_i = 0$.

5. If $C_i = S_i$ use the Hamiltonian cycle in $G_k$ to obtain $N_i$ and $N_i^0$, and if $C_i \neq S_i$ use the Hamiltonian cycle in $G_k$ to obtain $N_i$ and $N_i^1$.

6. If $C_i = 0$, construct $L_i$ by taking the XOR of $N_i^{S_i \oplus C_i}$ and $F(K_i^X)$ and let $L_i'$ be uniformly chosen from $\{0, 1\}^{f(k)}$.

7. Let $A_{i,j} = g^{a_{i,j}} \bmod p$ and output $(N_i, A_{i,1}, \ldots, A_{i,k}, L_i, L_i')$.

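Let $\Upsilon_i$ be the distribution

$$[p, g, Q, H, R, \tau,\; \Sigma_k^1(\tau, p, g, Q, H, R), \ldots, \Sigma_k^i(\tau, p, g, Q, H, R),\; \Pi_k^{i+1}(\tau, p, g, Q, H, R), \ldots, \Pi_k^k(\tau, p, g, Q, H, R)]$$

Let $\sigma_i$ be the distribution $[p, g, Q, R, \tau, H, \Sigma_k^i(\tau, p, g, Q, H, R)]$.

Let $\pi_i$ be the distribution $[p, g, Q, R, \tau, H, \Pi_k^i(\tau, p, g, Q, H, R)]$.

Let $\phi_i$ be the distribution $[p, g, Q, R, \tau, H, \Phi_k^i(\tau, p, g, Q, H, R)]$.

Let $\omega_i$ be the distribution $[p, g, Q, R, \tau, H, \Omega_k^i(\tau, p, g, Q, H, R)]$.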

Lemma 1 If there exists an infinite sequence of graphs, $G_k$, such that $S_{V'}(G_k)$ is computationally distinguishable from $\mathrm{VIEW}([P(G_k) \leftrightarrow V'(G_k)])$ then there exists an $l$ such that either $\phi_l$ is computationally distinguishable from $\sigma_l$ or $\phi_l$ is computationally distinguishable from $\pi_l$.

Proof It is clear from the definition that $S_{V'}(G_k)$ is $\Upsilon_k$ and that $\mathrm{VIEW}([P(G_k) \leftrightarrow V'(G_k)])$ is $\Upsilon_0$. So $\Upsilon_k$ is computationally distinguishable from $\Upsilon_0$ and therefore there exists some $l$ such that $\Upsilon_{l-1}$ is computationally distinguishable from $\Upsilon_l$. For this $l$, it is easy to see that $\sigma_l$ is distinguishable from $\pi_l$. Since $\sigma_l$ is computationally distinguishable from $\pi_l$, either $\phi_l$ is computationally distinguishable from $\sigma_l$ or $\phi_l$ is computationally distinguishable from $\pi_l$. $\Box$
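The step from the two endpoint distributions to a single index $l$ is the usual telescoping (hybrid) bound. As a sketch, write $\Upsilon_l$ for the hybrid whose first $l$ coordinates come from the simulator circuits and whose remaining coordinates come from the prover circuits, and suppose a distinguisher $D$ has advantage $\varepsilon(k)$ (the symbols $D$ and $\varepsilon$ are notation introduced here for illustration):

```latex
\[
\varepsilon(k)
  \;\le\; \bigl|\Pr[D(\Upsilon_k) = 1] - \Pr[D(\Upsilon_0) = 1]\bigr|
  \;\le\; \sum_{l=1}^{k} \bigl|\Pr[D(\Upsilon_l) = 1] - \Pr[D(\Upsilon_{l-1}) = 1]\bigr| ,
\]
\[
\text{so for some } l:\quad
\bigl|\Pr[D(\Upsilon_l) = 1] - \Pr[D(\Upsilon_{l-1}) = 1]\bigr| \;\ge\; \frac{\varepsilon(k)}{k}.
\]
```

Since $\varepsilon(k)/k$ is still non-negligible whenever $\varepsilon(k)$ is, adjacent hybrids that differ only in coordinate $l$ remain distinguishable.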

Lemma 2 For all $l$, $\phi_l$ is computationally indistinguishable from $\sigma_l$.

Proof To derive a contradiction, we assume that there exists an $l$ and a polynomial-size family of distinguishers $\{D1_k\}$ which distinguish $\phi_l$ and $\sigma_l$. We now contradict the fact that Blum's protocol is zero-knowledge by constructing a family of malicious verifiers $\{E_k\}$ and a family of distinguishers $\{D1'_k\}$ such that $\{D1'_k\}$ distinguishes an interaction of $E_k$ with the Blum simulator from a view of an interaction of $E_k$ with the Blum honest prover for the infinite family of graphs $\{G_k\}$. Let $E_k$ be the verifier who on any input and random string $\rho$ does the following:

1. Parse the random string, $\rho$, as $(\tau, p, g, Q, H, R)$ where $\tau$ is a random element of $\{0, 1\}^k$, $p$ is a random $k$-bit prime, $g$ is a random generator of $Z_p^*$, $Q$ is a random element of $Z_p^*$, $H$ is randomly chosen from $\mathcal{H}_k$ and $R$ is a random element of $\{0, 1\}^k$.

2. Run $\hat{V}'$ on input $(p, g, Q, H)$ and random string $\tau$ to obtain $X_l$, $S_l$ and $x_l$.

3. If $g^{x_l} \bmod p = X_l$ and $S_l = 0$, or $g^{x_l} \bmod p \neq X_l$ and $S_l = 1$, then output 0; otherwise output 1.

Let $D1'_k$ be the circuit which on input $(\rho, v, v')$ does the following:

1. Parse $\rho$ into $(\tau, p, g, Q, H, R)$ in the same manner as $E_k$ would.

2. Run $\hat{V}'$ on input $(p, g, Q, H)$ and random string $\tau$ to obtain $(X_l, Y_l)$, $S_l$ and $x_l$.

3. Select $a_{l,1}, \ldots, a_{l,k}$ to be random elements of $Z_p^*$.

4. Let $K_l^X$ be the $k$-bit string whose $j$th bit is $\langle R, X_l^{a_{l,j}} \rangle$. Similarly, let $K_l^Y$ be the $k$-bit string whose $j$th bit is $\langle R, Y_l^{a_{l,j}} \rangle$.

5. Let $C_l = 0$.

6. If $C_l = 0$, construct $L_l$ by taking the XOR of $v'$ and $F(K_l^X)$ and construct $L_l'$ to be $F(K_l^Y)$. If $C_l = 1$, construct $L_l$ to be $F(K_l^X)$ and construct $L_l'$ by taking the XOR of $v'$ and $F(K_l^Y)$.

7. Let $A_{l,j} = g^{a_{l,j}} \bmod p$.

8. Run $D1_k$ on input $(p, g, Q, R, \tau, H, (v, A_{l,1}, \ldots, A_{l,k}, L_l, L_l'))$.

Observe that if $(\rho, v, v')$ is a random output of the Blum simulator for $E_k$ then $D1'_k$ outputs $D1_k$ applied to a random element of $\sigma_l$, and if $(\rho, v, v')$ is the view of a random interaction between $E_k$ and the Blum honest prover then $D1'_k$ outputs $D1_k$ applied to a random element of $\phi_l$. Therefore, $D1'_k$ distinguishes the output of the Blum simulator for $E_k$ from an interaction between $E_k$ and the honest Blum prover. This contradicts the fact that Blum's protocol is zero-knowledge. $\Box$

Lemma 3 Let $SEL = \{SEL_k\}$ be a polynomial-size family of circuits which take as input a prime, $p$, a generator, $g$, of $Z_p^*$ and an element $Q$ of $Z_p^*$, and output a pair $(x, aux)$. Let $SOL = \{SOL_k\}$ be a polynomial-size family of circuits which take a triple as input and output a number. Then for all $c$ and all sufficiently large $k$,

$$\Pr[p \leftarrow \mathrm{PRIMES}_k;\ g \leftarrow \mathrm{GEN}_p;\ q \leftarrow Z_p^*;\ Q \leftarrow g^q;\ (x, aux) \leftarrow SEL_k(p, g, Q);\ b \leftarrow Z_p^*;\ ans \leftarrow SOL_k(Qg^{-x}, g^b, aux) : ans = g^{(q-x)b}] < \frac{1}{k^c}$$

Proof To derive a contradiction we assume that there exist circuit families $SEL$ and $SOL$ and some integer $c$ such that for infinitely many $k$,

$$\Pr[p \leftarrow \mathrm{PRIMES}_k;\ g \leftarrow \mathrm{GEN}_p;\ q \leftarrow Z_p^*;\ Q \leftarrow g^q;\ (x, aux) \leftarrow SEL_k(p, g, Q);\ b \leftarrow Z_p^*;\ ans \leftarrow SOL_k(Qg^{-x}, g^b, aux) : ans = g^{(q-x)b}] > \frac{1}{k^c}$$

We now consider the family of circuits $I = \{I_k\}$ which on input a prime, $p$, a generator $g$ of $Z_p^*$ and elements $g^a$ and $g^b$ of $Z_p^*$ does the following:

1. Let $(x, aux)$ be the output of $SEL_k$ on input $(p, g, g^a)$.

2. Let $ans$ be the output of $SOL_k$ on input $(g^{a-x}, g^b, aux)$.

3. Output $ans \cdot (g^b)^x$.

Observe that

$$\Pr[p \leftarrow \mathrm{PRIMES}_k;\ g \leftarrow \mathrm{GEN}_p;\ x \leftarrow Z_p^*;\ y \leftarrow Z_p^*;\ a \leftarrow I_k(p, g, g^x, g^y) : a = g^{xy}]$$

$$= \Pr[p \leftarrow \mathrm{PRIMES}_k;\ g \leftarrow \mathrm{GEN}_p;\ q \leftarrow Z_p^*;\ Q \leftarrow g^q;\ (x, aux) \leftarrow SEL_k(p, g, Q);\ b \leftarrow Z_p^*;\ ans \leftarrow SOL_k(Qg^{-x}, g^b, aux) : ans = g^{(q-x)b}] > \frac{1}{k^c}$$

This contradicts the Diffie-Hellman assumption. $\Box$
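The reduction in this proof is short enough to run on a toy group. The sketch below is illustrative only: it assumes the tiny parameters p = 23 and g = 5, and it supplies hypothetical stand-ins for $SEL$ and $SOL$ that "succeed" by brute-force discrete logarithms, which is possible only because the group is tiny. What it demonstrates is the algebra of $I_k$: multiplying $ans = g^{(a-x)b}$ by $(g^b)^x$ yields $g^{ab}$.

```python
# Toy group (assumption): p = 23 is prime and g = 5 generates Z_23^*.
p, g = 23, 5

def dlog(h):
    """Brute-force discrete log base g; feasible only because p is tiny."""
    return next(e for e in range(p - 1) if pow(g, e, p) == h)

def SEL(p, g, Q):
    """Hypothetical selector circuit: outputs some x and auxiliary data."""
    return 7, None

def SOL(target, gb, aux):
    """Hypothetical solver: given target = Q g^{-x} and g^b, returns
    g^{(q - x) b} by cheating with a brute-force discrete log."""
    return pow(gb, dlog(target), p)

def I(p, g, ga, gb):
    """The circuit I_k: turns (SEL, SOL) into a Diffie-Hellman solver."""
    x, aux = SEL(p, g, ga)
    # g^(a - x), computed as g^a * g^(-x) with the exponent reduced mod p - 1
    target = ga * pow(g, (p - 1 - x) % (p - 1), p) % p
    ans = SOL(target, gb, aux)        # g^((a - x) b) when SOL succeeds
    return ans * pow(gb, x, p) % p    # g^((a - x) b) * g^(x b) = g^(a b)
```

Whenever the stand-in solver answers correctly, `I(p, g, pow(g, a, p), pow(g, b, p))` equals `pow(g, a * b, p)`, so any noticeable success probability for $(SEL, SOL)$ carries over to $I_k$.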

Lemma 4 For all $l$, $\phi_l$ is computationally indistinguishable from $\pi_l$.

Proof To derive a contradiction we assume that there exists an $l$ such that $\phi_l$ is computationally distinguishable from $\pi_l$. It follows that either $\omega_l$ is computationally distinguishable from $\pi_l$ or $\omega_l$ is computationally distinguishable from $\phi_l$. We derive a contradiction for the case where $\pi_l$ and $\omega_l$ are distinguishable. The proof that the other case yields a contradiction is identical except for a small difference which we will note when we reach it.

Let $\{\Gamma_k\}$ be the family of circuits which on input $(\tau, p, g, Q, H, R)$ does the following:

1. Run $\hat{V}'$ on input $(p, g, Q, H)$ and random string $\tau$ to obtain $(X_l, Y_l)$.

2. Select $a_{l,1}, \ldots, a_{l,k}$ to be random elements of $Z_p^*$.

3. Let $K_l^Y$ be the $k$-bit string whose $j$th bit is $\langle R, Y_l^{a_{l,j}} \rangle$.

4. Let $A_{l,j} = g^{a_{l,j}} \bmod p$ and output $(A_{l,1}, \ldots, A_{l,k}, F(K_l^Y))$.

Let $\{\Gamma'_k\}$ be the family of circuits which on input $(\tau, p, g, Q, H, R)$ does the following:

1. Run $\hat{V}'$ on input $(p, g, Q, H)$ and random string $\tau$ to obtain $(X_l, Y_l)$.

2. Select $a_{l,1}, \ldots, a_{l,k}$ to be random elements of $Z_p^*$.

3. Let $K_l^Y$ be the $k$-bit string whose $j$th bit is $\langle R, Y_l^{a_{l,j}} \rangle$.

4. Let $A_{l,j} = g^{a_{l,j}} \bmod p$ and output $(A_{l,1}, \ldots, A_{l,k}, K_l^Y)$.

Since $\omega_l$ and $\pi_l$ are computationally distinguishable, there exists a family of circuits $\{D2_k\}$ that distinguish $\omega_l$ and $\pi_l$. We now construct a family of distinguishers $\{D2'_k\}$ which distinguish $[p, g, Q, H, R, \tau, \Gamma_k(\tau, p, g, Q, H, R)]$ and $[p, g, Q, H, R, \tau, A_{l,1} \leftarrow Z_p^*, \ldots, A_{l,k} \leftarrow Z_p^*, \mu \leftarrow \{0, 1\}^{f(k)}]$.

Let $\{D2'_k\}$ be the circuit which on input $(p, g, Q, H, R, \tau, A_{l,1}, \ldots, A_{l,k}, \mu)$ does the following (note that the Hamiltonian cycle for $G_k$ can be hardwired into $D2'_k$):

1. Run $\hat{V}'$ on input $(p, g, Q, H)$ and random string $\tau$ to obtain $(X_l, Y_l)$, $S_l$ and $x_l$.

2. Let $K_l^X$ be the $k$-bit string whose $j$th bit is $\langle R, A_{l,j}^{x_l} \rangle$.

3. Use the Hamiltonian cycle in $G_k$ to obtain $N_l$, $N_l^0$ and $N_l^1$.

4. If $S_l = 0$ then construct $L_l$ by taking the XOR of $N_l^0$ with $F(K_l^X)$ and construct $L_l'$ by taking the XOR of $N_l^1$ with $\mu$.⁴ If $S_l = 1$ then construct $L_l$ by taking the XOR of $N_l^1$ with $F(K_l^X)$ and construct $L_l'$ by taking the XOR of $N_l^0$ with $\mu$.

5. Run $D2_k$ on $(p, g, Q, H, R, \tau, N_l, (A_{l,1}, \ldots, A_{l,k}, L_l, L_l'))$ and output the result.

Observe that if $(p, g, Q, H, R, \tau, A_{l,1}, \ldots, A_{l,k}, \mu)$ is taken from $[p, g, Q, H, R, \tau, \Gamma_k(\tau, p, g, Q, H, R)]$ then $D2'_k$ outputs $D2_k$ applied to a random element of $\pi_l$, and if $(p, g, Q, H, R, \tau, A_{l,1}, \ldots, A_{l,k}, \mu)$ is taken from $[p, g, Q, H, R, \tau, A_{l,1}, \ldots, A_{l,k}, \{0, 1\}^{f(k)}]$ then $D2'_k$ outputs $D2_k$ applied to a random element of $\omega_l$. Therefore, $[p, g, Q, H, R, \tau, \Gamma_k(\tau, p, g, Q, H, R)]$ and $[p, g, Q, H, R, \tau, A_{l,1}, \ldots, A_{l,k}, \{0, 1\}^{f(k)}]$ are computationally distinguishable. Since $F$ is a pseudorandom generator, this means that $[p, g, Q, H, R, \tau, \Gamma'_k(\tau, p, g, Q, H, R)]$ and $[p, g, Q, H, R, \tau, A_{l,1}, \ldots, A_{l,k}, \{0, 1\}^k]$ are computationally distinguishable.⁵

Let $\{\Delta_k\}$ be the family of circuits which on input $(\tau, p, g, Q, H, R)$ does the following:

1. Run $\hat{V}'$ on input $(p, g, Q, H)$ and random string $\tau$ to obtain $(X_l, Y_l)$.

2. Randomly select $a \in Z_p^*$ and let $A = g^a \bmod p$.

3. Output $A$, $\langle R, Y_l^a \rangle$.

Observe that the output distribution of $\Gamma'_k$ is equal to $k$ copies of the output distribution of $\Delta_k$ (when both circuits have inputs drawn from $[p, g, Q, H, R, \tau]$).

⁴ If we were in the case where $\phi_l$ and $\omega_l$ were distinguishable instead of $\pi_l$ and $\omega_l$, then here we would let $L_l'$ be $\mu$ (regardless of what $S_l$ is).

⁵ We know that $F(K_l^Y)$ is distinguishable from uniform with the help of $p, g, Q, H, R, \tau, A_{l,1}, \ldots, A_{l,k}$. We know that since $F$ is a pseudorandom generator, its output is indistinguishable from uniform if its input is indistinguishable from uniform. Therefore we conclude that we can distinguish $K_l^Y$ from uniform with the help of $p, g, Q, H, R, \tau, A_{l,1}, \ldots, A_{l,k}$.

A standard hybrid argument yields that if $k$ copies of $\Delta_k(\tau, p, g, Q, H, R)$ are distinguishable from $k$ copies of $[A \leftarrow Z_p^*, \{0, 1\}]$ with the help of $p, g, Q, H, R$ and $\tau$, then one copy of $\Delta_k(\tau, p, g, Q, H, R)$ is distinguishable from one copy of $[A \leftarrow Z_p^*, \{0, 1\}]$ with the help of $p, g, Q, H, R$ and $\tau$. That is, $[p, g, Q, H, R, \tau, \Delta_k(\tau, p, g, Q, H, R)]$ is distinguishable from $[p, g, Q, H, R, \tau, A \leftarrow Z_p^*, \{0, 1\}]$. This means that there exists a family of circuits which when given random $p, g, Q, H, R, \tau$ and $g^a$ guess $\langle R, Y_l^a \rangle$ with some polynomial advantage. Therefore, we can invoke Goldreich-Levin to show that there exists a family of circuits, $G = \{G_k\}$, which when given random $p, g, Q, H, \tau$ and $g^a$ output $Y_l^a$ with probability at least $1/k^d$ for some integer $d$. We now show that this violates Lemma 3 by constructing a pair of circuit families $SEL$ and $SOL$ and an integer $c$ such that for infinitely many $k$,

$$\Pr[p \leftarrow \mathrm{PRIMES}_k;\ g \leftarrow \mathrm{GEN}_p;\ q \leftarrow Z_p^*;\ Q \leftarrow g^q;\ (x, aux) \leftarrow SEL_k(p, g, Q);\ b \leftarrow Z_p^*;\ ans \leftarrow SOL_k(Qg^{-x}, g^b, aux) : ans = g^{(q-x)b}] > \frac{1}{k^c}$$

Let $\{SEL_k\}$ be the family of circuits which on input $(p, g, Q)$ does the following:

1. Randomly choose $H$ from $\mathcal{H}_k$ and randomly choose $\tau$ from $\{0, 1\}^k$.

2. Run $\hat{V}'$ on input $(p, g, Q, H)$ and random string $\tau$ to obtain $X_l$ and $x_l$.

3. Let $aux = (H, \tau)$ and output $(x_l, aux)$.

Let $\{SOL_k\}$ be the family of circuits which on input $(Qg^{-x}, g^b, (H, \tau))$ does the following:

1. Run $G_k$ on input $p, g, Q, H, \tau$ and $g^b$ and output the result.

We know that except with negligible probability, $Y_l = Qg^{-x_l}$. When $Y_l = Qg^{-x_l}$, $SOL$ will output $g^{(q-x)b}$ with probability at least $1/k^d$. Therefore, there exists a $c$ such that $SOL$ will output $g^{(q-x)b}$ with probability at least $1/k^c$. $\Box$

That the protocol is zero-knowledge is a straightforward consequence of Lemmas 1, 2 and 4

Chapter 6

Conclusion

We have provided a 3-round zero-knowledge proof system for any language in NP.

From Goldreich and Oren we know that 3 rounds is minimal unless BPP = NP.

However, it is unclear whether our proof of knowledge assumption is reasonable. It would be desirable to create a 3-round zero-knowledge proof system which relies on more standard assumptions but it is not clear whether this is possible. In fact, it is not even clear whether non-black-box zero-knowledge protocols (of any number of rounds) can be achieved using standard assumptions.

If non-standard assumptions are necessary to achieve zero-knowledge in three rounds, we feel that proofs of knowledge provide a useful framework for formulating those assumptions. Our protocol is based on a proof of knowledge for the discrete logarithm of one of a pair of numbers. However, it is likely that similar protocols can be created using assumptions based on different proofs of knowledge (like a proof of knowledge of one of the square roots of a quadratic residue).

Bibliography

[1] E. Bach. How to generate factored random numbers. SIAM Journal on Computing, 17, 1988.

[2] Boaz Barak. How to go beyond the black-box simulation barrier. In Proceedings of the 42nd FOCS, 2001.

[3] Mihir Bellare and Silvio Micali. Non-interactive oblivious transfer and applications. In Proceedings of Crypto'89, 1989.

[4] L. Blum, M. Blum, and M. Shub. A simple unpredictable pseudorandom number generator. SIAM Journal on Computing, 15(2), 1986.

[5] Manuel Blum. How to prove a theorem so no one else can claim it. In Proceedings of the International Congress of Mathematicians, 1986.

[6] Manuel Blum and Silvio Micali. How to generate cryptographically strong sequences of pseudo-random bits. SIAM Journal on Computing, 13(4), 1984.

[7] G. Brassard, D. Chaum, and C. Crepeau. Minimum disclosure proofs of knowledge. Journal of Computer and System Sciences, 37(2), 1988.

[8] Ran Canetti, Oded Goldreich, and Shai Halevi. The random oracle methodology, revisited. In Proceedings of the 30th STOC, 1998.

[9] Whitfield Diffie and Martin E. Hellman. New directions in cryptography. IEEE Transactions on Information Theory, IT-22(6), 1976.

[10] U. Feige and A. Shamir. Zero knowledge proofs of knowledge in two rounds. In Proceedings of Crypto'89, 1989.

[11] M. Fischer, S. Micali, and C. Rackoff. A secure protocol for the oblivious transfer. In Proceedings of Eurocrypt'84, 1984.

[12] O. Goldreich and H. Krawczyk. On the composition of zero-knowledge proof systems. SIAM Journal on Computing, 25(1), 1996.

[13] O. Goldreich and Y. Oren. Definitions and properties of zero-knowledge proof systems. Journal of Cryptology, 6(3-4), 1993.

[14] Oded Goldreich and Ariel Kahan. How to construct constant-round zero-knowledge proof systems for NP. Journal of Cryptology, 9(3), 1996.

[15] Oded Goldreich and Leonid Levin. A hard-core predicate to any one-way function. In Proceedings of the 21st STOC, 1989.

[16] Oded Goldreich, Silvio Micali, and Avi Wigderson. Proofs that yield nothing but their validity or all languages in NP have zero-knowledge proofs. Journal of the ACM, 38(3), 1991.

[17] Shafi Goldwasser and Silvio Micali. Probabilistic encryption. Journal of Computer and System Sciences, 28(2), 1984.

[18] Shafi Goldwasser, Silvio Micali, and Charles Rackoff. The knowledge complexity of interactive proof systems. SIAM Journal on Computing, 18(1), 1989.

[19] Satoshi Hada and Toshiaki Tanaka. On the existence of 3-round zero-knowledge protocols. In Proceedings of Crypto'98, 1998.

[20] Shai Halevi and Silvio Micali. Conservative proofs of knowledge. To appear, 2001.

[21] Joe Kilian, Silvio Micali, and Rafail Ostrovsky. Minimum resource zero-knowledge proofs. In Proceedings of Crypto'89, 1989.

[22] Matthew Lepinski and Silvio Micali. On the existence of 3-round zero-knowledge proofs. Technical Report MIT-LCS-TM-616, MIT, April 2001.

[23] Victor Shoup. Lower bounds for discrete logarithms and related problems. In Proceedings of Eurocrypt'97, 1997.

[24] M. Tompa and H. Woll. Random self-reducibility and zero-knowledge interactive proofs of possession of information. In Proceedings of the 28th FOCS, 1987.

[25] A.C. Yao. Theory and application of trapdoor functions. In Proceedings of the 23rd FOCS, 1982.