Information Theoretic Advances in Zero-Knowledge

by

Itay Berman

B.Sc., Tel Aviv University (2012)
M.Sc., Tel Aviv University (2014)

Submitted to the Department of Electrical Engineering and Computer Science in partial fulfillment of the requirements for the degree of

Doctor of Philosophy

at the

MASSACHUSETTS INSTITUTE OF TECHNOLOGY

June 2019

© Massachusetts Institute of Technology 2019. All rights reserved.

Author: Itay Berman, Department of Electrical Engineering and Computer Science, May 23, 2019

Certified by: Vinod Vaikuntanathan, Associate Professor of Electrical Engineering and Computer Science, Thesis Supervisor

Accepted by: Leslie A. Kolodziejski, Professor of Electrical Engineering and Computer Science, Chair, Department Committee on Graduate Students

Submitted to the Department of Electrical Engineering and Computer Science on May 23, 2019, in partial fulfillment of the requirements for the degree of Doctor of Philosophy

Abstract

Zero-knowledge proofs have an intimate relation to notions from information theory. In particular, the class of all problems possessing statistical zero-knowledge proofs (SZK) was shown to have complete problems characterized by the statistical distance (Sahai and Vadhan [JACM, 2003]) and entropy difference (Goldreich and Vadhan [CCC, 1999]) of a pair of efficiently samplable distributions. This characterization has been extremely beneficial in understanding the computational complexity of languages with zero-knowledge proofs and in deriving new applications from such languages. In this thesis, we further study the relation between zero-knowledge proofs and information theory. We show the following results:

1. Two additional complete problems for SZK characterized by other information theoretic notions: triangular discrimination and Jensen-Shannon divergence. These new complete problems further expand the regime of parameters for which the STATISTICAL DIFFERENCE PROBLEM is complete for SZK.

We further show that the parameterized STATISTICAL DIFFERENCE PROBLEM, for a regime of parameters in which this problem is not known to be in SZK, still shares many properties with SZK. Specifically, its hardness implies the existence of one-way functions, and it and its complement have constant-round public-coin interactive protocols (i.e., the problem lies in AM ∩ coAM).

2. The hardness of a problem related to the ENTROPY DIFFERENCE PROBLEM implies the existence of multi-collision resistant hash functions (MCRH). We also demonstrate the usefulness of such hash functions by showing that the existence of MCRH implies the existence of constant-round statistically hiding (and computationally binding) commitment schemes.

3. We initiate the study of zero-knowledge in the model of interactive proofs of proximity (IPP). We show efficient zero-knowledge IPPs for several problems. We also show problems with efficient IPPs for which every zero-knowledge IPP must be inefficient. Central to this study is showing that many of the statistical properties of SZK carry over to the IPP setting.

Thesis Supervisor: Vinod Vaikuntanathan Title: Associate Professor of Electrical Engineering and Computer Science

Acknowledgments

I would like to first thank my advisor, Vinod Vaikuntanathan, for his support, guidance and caring during my time at MIT. Vinod was always curious to hear about any topic, eager to solve any problem, and knew exactly where to look for answers. I have learned a lot from him about how to do research and also about the best ways to present this research. I am very grateful to Iftach Haitner, my advisor during my Master's studies at Tel Aviv University. Iftach was the driving force behind my decision to pursue research in theory of computer science, particularly in cryptography. He always pushed me to achieve more, and his insights made even the most difficult problems seem approachable. I thank Iftach for his useful advice and support throughout my graduate studies. I was extremely lucky that Ron Rothblum was doing his post-doc at MIT during my Ph.D. studies. Ron's patience to explain and to listen, his ability to extract the essence and to simplify, and his intuition about how to approach a problem make him an ideal collaborator. Collaborating with Ron was a true delight and this entire thesis is a result of this collaboration. Most of all, I thank Ron for his friendship, advice, and willingness to always share his thoughts. I will be forever in his debt. A special thanks goes to Akshay Degwekar. I started collaborating with Akshay from the moment I arrived at MIT, and since it was so much fun I never stopped. Akshay's ability to so quickly understand the point, and his abstract thinking and simplifications never cease to amaze me. I thank Akshay for sharing with me his knowledge about things I knew nothing about, and for navigating together our paths in Ph.D. studies and beyond. Another special thank you goes to Prashant Vasudevan. The willingness to share, deep insights, and relentless curiosity make it immensely enjoyable to collaborate with Prashant. His ability to present the most difficult technique in the most understandable way has always astonished me.
A large part of this thesis is a result of the collaboration with Akshay and Prashant. I thank all my other co-authors throughout my graduate studies: Ilan Komargodski, Aris Tentes, and Eliad Tsfadia. I am grateful to Nir Bitansky and Omer Paneth for sharing with me their views about research and life. I am also grateful to other researchers from whom I have learned a lot; in particular, Benny Applebaum, Yuval Ishai, Yael Kalai, and Alon Rosen. Yael Kalai and Ronitt Rubinfeld served as my thesis committee and for that I thank them. I also thank Ronitt and Michael Sipser for being great instructors; I enjoyed being their teaching assistant a lot. I am grateful for the guidance Piotr Indyk gave me as my academic advisor at MIT. My passion and excitement for information theory stem from a wonderful course taught by Yury Polyanskiy. Yury is a fantastic teacher and his ability to explain difficult proofs is unmatched. I would also like to thank my fellow students in the Theory of Computation group at MIT. Govind Ramnarayan for our countless talks during lunch about anything but research. Saleet Mossel and Tal Wagner for making me feel at home for at least a few hours every week. Nishanth Dikkala for reminding me how much I enjoy playing ping pong. Madalina Persu for long and meaningful conversations. I also thank Sitan Chen, Aloni Cohen, Daniel Grier, Pritish Kamath, Sam Park, Adam Sealfon, and all other students that I had the privilege to interact with while at MIT.

Last but not least, I would like to thank Tiana. She is the reason I went to MIT and this thesis would not have come to light without her love and support.

Contents

1 Introduction 9
  1.1 Our Results 11
  1.2 Outline of this Thesis 15

2 Preliminaries 17
  2.1 Notations 17
  2.2 Information Theory Preliminaries 18
  2.3 Statistical Zero-Knowledge Interactive Proofs 24

3 Statistical Difference Beyond the Polarizing Regime 31
  3.1 Overview 31
  3.2 Techniques 39
  3.3 Complete Problems for SZK 48
  3.4 One-Way Functions from SDP with Any Noticeable Gap 60
  3.5 Estimating Statistical Distance in AM ∩ coAM 68
  3.6 Triangular Discrimination Inequalities 79

4 Multi-Collision Resistant Hash Functions 81
  4.1 Overview 81
  4.2 Constructing MCRH Families 95
  4.3 Statistically Hiding Commitments 103
  4.4 Black-Box Separation 113

5 Zero-Knowledge Interactive Proofs of Proximity 117
  5.1 Overview 117
  5.2 ZKPP: Model and Definitions 128
  5.3 The Power of ZKPP: The Statistical Case 135
  5.4 Limitations of SZKPP 149
  5.5 Computational ZK Proofs and Statistical ZK Arguments of Proximity 157
  5.6 Deferred Proofs 161

Chapter 1

Introduction

Zero-knowledge proofs, introduced by Goldwasser, Micali, and Rackoff [GMR89], achieve an almost unbelievable task: giving each party what it wants. The parties in question are a (typically) computationally unbounded prover and a computationally limited verifier. The prover and the verifier exchange messages according to an agreed upon protocol, known as an interactive proof, in order for the verifier to be convinced of the validity of a shared statement. At the end of the interaction, the verifier is protected from being convinced that a false statement is true. At the same time, the prover knows that if the statement is true, the verifier learns nothing other than that. One of the cornerstones contributed to the foundations of cryptography by the seminal work of [GMR89], together with the definition of interactive proofs, which by itself has had vast implications in computational complexity theory, is how to formally define zero-knowledge. That is, what does it mean for the prover to know that the verifier learned nothing but the validity of a true statement? [GMR89] captured the notion of zero-knowledge via the simulation paradigm: if the verifier can simulate, without the help of the prover, what it has learned from the interaction, then it learned nothing it did not already know. More formally, this simulation is done by a randomized algorithm called the simulator, whose computational resources are similar to those of the verifier. The simulator gets access to a true statement and to the verifier, and without interacting with the prover it should output a transcript that is similar to the transcript generated by the interaction between the verifier and the prover, when both get the same statement given to the simulator. Several notions of zero-knowledge were considered by [GMR89].
These notions concern the degree to which the simulated transcripts are similar to the real transcripts (i.e., the transcripts of the interaction between the verifier and the prover). If the distributions of the simulated transcripts and the real transcripts cannot be distinguished by any computationally bounded algorithm, then the interactive proof is said to be computational zero-knowledge. A stronger degree of similarity is achieved if those distributions are statistically close, namely indistinguishable by any computationally unbounded algorithm. Such interactive proofs are said to be statistical zero-knowledge. All problems possessing statistical zero-knowledge interactive proofs belong to the class of Statistical Zero-Knowledge (SZK).

In a beautiful line of works, beginning with Fortnow [For89] and Aiello and Håstad [AH91], an intimate relation has been established between statistical zero-knowledge proofs and notions from information theory. [For89] and [AH91] showed that the distributions of the simulated transcripts of every language in SZK all share certain statistical properties. They further showed that those properties can be certified and refuted using constant-round public-coin protocols, hence proving that SZK ⊆ AM ∩ coAM (AM denotes the class of problems that have constant-round public-coin interactive proofs, and coAM is the complement of AM). Perhaps the climax of this line of works is the establishment of complete promise problems for SZK by Sahai and Vadhan [SV03] and Goldreich and Vadhan [GV99]. [SV03, GV99] showed that the statistical properties shared among all the simulated transcript distributions can be succinctly characterized via the central information theoretic notions of statistical distance¹ and entropy². Specifically, [SV03] (respectively, [GV99]) defined a computational problem whose promise is characterized by the statistical distance (respectively, entropy gap) of some efficiently samplable distributions, and showed that this problem is complete for the class SZK. The establishment of these complete problems is a major advancement in the study of SZK as it opens the door to using powerful tools from information theory in the analysis of problems related to SZK.³ Indeed, the information-theoretic characterization of SZK has been extremely impactful in several aspects:

Application of Zero-Knowledge. The aforementioned complete problems had been used extensively to derive many cryptographic applications of SZK. For example, in showing that every language in SZK has instance-dependent commitments (e.g., [OV08]), which in turn was used to show that every language in SZK ∩ NP has an interactive proof with an efficient prover [NV06]. Another example is simplifying the construction and analysis of one-way functions⁴ from average-case hardness of SZK ([Ost91], [Vad99, Section 4.8]). Furthermore, when arguing about a specific application, one characterization of SZK may be easier to use than others. Indeed, instance-dependent commitments are naturally analyzed using entropy, but showing the existence of one-way functions is easier via statistical distance. Hence, expanding the information theoretic characterization of SZK can be useful in deriving additional applications and simplifying the analysis of existing ones.

Computational Complexity. The complete problems had also been used to study the computational complexity of SZK and related classes. For example, showing

¹The statistical distance between two distributions P and Q over a set Y is defined as SD(P, Q) = (1/2) Σ_{y∈Y} |P(y) − Q(y)|.
²The entropy of a random variable X over a discrete set X is H(X) = Σ_{x∈X} Pr[X = x] log(1/Pr[X = x]).
³A nice survey on the topic can be found in [GV11].
⁴These are functions that are easy to compute, but any polynomial-time algorithm fails to invert them with overwhelming probability.

that SZK ⊆ AM ∩ coAM and that SZK is closed under complement (originally shown for honest verifiers by Okamoto [Oka00]) is easily proved using the complete problems ([Vad99, Section 4.2]). The information theoretic characterization is also useful for studying complexity classes related to SZK. In particular, it has led to the establishment of complete problems (similarly characterized by notions of entropy) for important subclasses of SZK: NISZK (the non-interactive variant of SZK; see [GSV99]) and SZK_L (where the verifier and simulator are implemented in logarithmic space; see [DGRV11]).
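The two quantities at the center of these characterizations are easy to compute directly for small, explicitly given distributions. The following self-contained sketch (illustrative only, not code from the thesis) evaluates the statistical distance and the entropy difference of a pair of distributions represented as dictionaries:

```python
# Illustrative sketch (not from the thesis): the two quantities that
# characterize the SZK-complete problems of [SV03] and [GV99], computed
# for small, explicitly given distributions.
from math import log2

def statistical_distance(p, q):
    """SD(P, Q) = 1/2 * sum_y |P(y) - Q(y)|."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(y, 0.0) - q.get(y, 0.0)) for y in support)

def entropy(p):
    """Shannon entropy H(P) = sum_y P(y) * log2(1 / P(y))."""
    return sum(py * log2(1.0 / py) for py in p.values() if py > 0)

P = {"a": 0.5, "b": 0.5}
Q = {"a": 0.9, "b": 0.1}
print(statistical_distance(P, Q))  # 0.4
print(entropy(P) - entropy(Q))     # the "entropy difference" of the pair
```

In the actual complete problems the distributions are given implicitly, as circuits fed with uniform randomness, which is exactly what makes estimating these quantities non-trivial.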

1.1 Our Results

In this thesis we continue this line of works and attempt to broaden the scope of the relation between zero-knowledge proofs and information theory. From a bird's eye view, we make progress in three aspects:

" Tools: We show additional SZK-complete problems characterized by different information theoretic distances. These problems, and the relevant techniques we used, provide further tools to understand the structure and applications of SZK.

" Applications: We show that the hardness of an entropy approximation pro- blem (that is related to NISZK) can be used to construct a useful cryptographic primitive called multi-collision resistant hash functions.

" Different model: We initiate the study of zero-knowledge in a model - motivated by the property testing literature-where the verifier (and the si- mulator) do not have full access to the input.

In the next sections we survey in greater detail the results presented in this thesis.

1.1.1 Statistical Difference Beyond the Polarizing Regime

In the first part of this thesis we focus on the STATISTICAL DIFFERENCE PROBLEM (SDP), the aforementioned problem shown to be SZK-complete by Sahai and Vadhan [SV03]. The input to the STATISTICAL DIFFERENCE PROBLEM is a pair of circuits C0 and C1, specifying probability distributions (i.e., that are induced by feeding the circuits with a uniformly random string). YES instances are those in which the statistical distance (i.e., SD(C0, C1)) between the two distributions is at least 2/3 and NO instances are those in which the distance is at most 1/3. The choice of the constants 1/3 and 2/3 in the above definition is not entirely arbitrary (this is in contrast to many computational complexity classes where the constants are arbitrary, such as IP and BPP). The restriction on these constants stems from the elegant polarization lemma, a central component in [SV03]'s proof that the STATISTICAL DIFFERENCE PROBLEM possesses a statistical zero-knowledge

proof. The lemma describes an efficient transformation taking as input a pair of circuits (C0, C1) and an integer k and outputting a new pair of circuits (D0, D1) such that if SD(C0, C1) ≥ α then SD(D0, D1) ≥ 1 − 2^−k, and if SD(C0, C1) ≤ β then SD(D0, D1) ≤ 2^−k. The polarization lemma is known to hold for any constant values β < α², but extending the lemma to the regime in which α² ≤ β < α has remained elusive. The constants 1/3 and 2/3 were chosen by [SV03] exactly because 1/3 < (2/3)². We focus on studying the regime of parameters in which α² ≤ β < α. Our main results are:

* Two additional SZK-complete problems characterized by different information theoretic distances: the triangular discrimination (TD) and the Jensen-Shannon divergence (JS). The triangular discrimination is commonly used, among many other applications, in statistical learning theory for parameter estimation with quadratic loss (in a similar manner to how statistical distance characterizes the 0-1 loss function in hypothesis testing). The Jensen-Shannon divergence is often referred to as the "symmetric Kullback-Leibler divergence" and characterizes the mutual information of some random variables naturally defined over the relevant distributions. Both TD and JS (as well as SD) are non-negative, bounded by 1, and are types of f-divergence, a general information theoretic framework to measure similarity between distributions (see more in Chapter 3).

We consider the TRIANGULAR DISCRIMINATION PROBLEM (TDP) and the JENSEN-SHANNON DIVERGENCE PROBLEM (JSP). YES cases of TDP (resp., JSP) are pairs of circuits with TD(C0, C1) ≥ α (resp., JS(C0, C1) ≥ α). NO cases are such pairs with TD(C0, C1) ≤ β (resp., JS(C0, C1) ≤ β). We show that these problems are SZK-complete even if the gap between α and β is only inverse-polynomially small. This is in contrast to the STATISTICAL DIFFERENCE PROBLEM, which was previously known to be SZK-complete only if α² > β and (α² − β) is at least inverse-logarithmic. As a corollary of these new SZK-complete problems, we derive a polarization lemma for some problems where the statistical distance satisfies α² ≤ β < α. We also derive a polarization lemma for statistical distance with any inverse-polynomially small gap between α² and β, improving upon the inverse-logarithmic gap previously known. The latter lemma implies that the STATISTICAL DIFFERENCE PROBLEM is SZK-complete in this regime of parameters as well.

* The average-case hardness of the parameterized variant of the STATISTICAL DIFFERENCE PROBLEM (i.e., the difficulty of determining whether the statistical distance between two given circuits is at least α or at most β), for any values of β < α, implies the existence of one-way functions. Such a result was previously only known for β < α², which follows generically from Ostrovsky's [Ost91] result, showing that the average-case hardness of any problem in SZK implies the existence of one-way functions. Our results imply that the average-case

hardness of some problems currently beyond SZK also implies the existence of one-way functions.

* A (direct) constant-round interactive proof for estimating the statistical distance between any two distributions (up to any inverse-polynomial error) given circuits that generate them. In particular, the above protocol implies that the parameterized variant of the STATISTICAL DIFFERENCE PROBLEM belongs to AM ∩ coAM. Actually, the latter statement was explicitly proven by Bhatnagar et al. [BBM11], and it can also be derived from other existing results in the literature (specifically, the combination of [SV03] and [GVW02]; see Chapter 3 for more details). Thus, we view our main contribution to be the proof, which is via a single protocol that we find to be cleaner and more direct than alternate approaches.

As we mentioned above, any problem in SZK belongs to AM ∩ coAM, and the average-case hardness of any such problem implies the existence of one-way functions. Thus, our results show that the STATISTICAL DIFFERENCE PROBLEM in the regime in which α² ≤ β < α shares central properties with the class SZK (though we fall short of showing that the problem belongs to SZK).
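To make the objects of this section concrete, the following self-contained sketch (illustrative only; the exact conventions for TD and JS are fixed in Chapter 3, and the definitions below are the standard normalized ones) computes SD, TD, and JS for explicit distributions, checks the comparison SD² ≤ TD ≤ SD, and implements the "XOR" operation on a pair of distributions, whose effect SD ↦ SD^k is the distance-decreasing half of polarization (the direct product is the distance-increasing half):

```python
# Illustrative sketch, not code from the thesis.  TD and JS follow standard
# normalized definitions, so both lie in [0, 1].
from itertools import product
from math import log2

def sd(p, q):
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(y, 0.0) - q.get(y, 0.0)) for y in keys)

def td(p, q):
    """Triangular discrimination: 1/2 * sum_y (P_y - Q_y)^2 / (P_y + Q_y)."""
    keys = set(p) | set(q)
    return 0.5 * sum((p.get(y, 0.0) - q.get(y, 0.0)) ** 2 / (p.get(y, 0.0) + q.get(y, 0.0))
                     for y in keys if p.get(y, 0.0) + q.get(y, 0.0) > 0)

def js(p, q):
    """Jensen-Shannon: 1/2 * KL(P || M) + 1/2 * KL(Q || M) with M = (P + Q)/2."""
    keys = set(p) | set(q)
    m = {y: 0.5 * (p.get(y, 0.0) + q.get(y, 0.0)) for y in keys}
    def kl(a, b):
        return sum(ay * log2(ay / b[y]) for y, ay in a.items() if ay > 0)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def xor_pair(p, q, k):
    """D_b: draw bits b_1..b_k uniform conditioned on b_1 ^ ... ^ b_k = b,
    then output one sample from C_{b_i} for each i.  A short calculation
    shows SD(D_0, D_1) = SD(P, Q)^k, i.e., the distance is driven *down*."""
    d0, d1 = {}, {}
    for bits in product([0, 1], repeat=k):
        target = d1 if sum(bits) % 2 else d0
        for ys in product(*[(q if b else p) for b in bits]):
            pr = 2.0 ** -(k - 1)          # each parity class has 2^(k-1) bit strings
            for b, y in zip(bits, ys):
                pr *= (q if b else p)[y]
            target[ys] = target.get(ys, 0.0) + pr
    return d0, d1

P = {"a": 0.5, "b": 0.5}
Q = {"a": 0.9, "b": 0.1}
s = sd(P, Q)
assert s ** 2 <= td(P, Q) <= s            # TD is sandwiched between SD^2 and SD
assert 0.0 <= js(P, Q) <= 1.0
D0, D1 = xor_pair(P, Q, 3)
assert abs(sd(D0, D1) - s ** 3) < 1e-9    # SD(D_0, D_1) = SD(P, Q)^3
```

The sandwich SD² ≤ TD ≤ SD is what makes TDP useful beyond the polarizing regime: a gap between α and β for TD survives where the corresponding SD gap falls into the range α² ≤ β < α.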

1.1.2 Multi-Collision Resistant Hash Functions

In the second part of this thesis we focus on cryptographic applications of zero-knowledge proofs. In particular, we study multi-collision resistant hash functions (MCRH), a natural relaxation of collision resistant hash functions (CRH). Collision resistant hash functions are functions that shrink their input, but for which it is computationally infeasible to find a collision, namely two strings that hash to the same value (although collisions are abundant). A natural relaxation of collision resistance is to consider hash functions for which it is infeasible to find a t-way collision; i.e., t strings that all have the same hash value. Here t is a parameter, where the standard notion of collision resistance corresponds to the special case of t = 2. We refer to such functions as multi-collision resistant hash functions (MCRH) and emphasize that, for t > 2, it is a weaker requirement than standard collision resistance. The property of multi-collision resistance was first considered by Merkle [Mer89] in analyzing a hash function construction based on DES. The notion has also been considered in the context of identification schemes [GS94], micro-payments [RS96], signature schemes [BPVY00], and cryptanalysis [Jou04]. We show that the existence of MCRH follows from the average-case hardness of a variant of the ENTROPY APPROXIMATION PROBLEM. The input to the ENTROPY APPROXIMATION PROBLEM (EAP) is a circuit C specifying a probability distribution, and a number k. YES instances are those in which the entropy of the distribution induced by the circuit is at least k (i.e., H(C) ≥ k) and NO instances are those in which the entropy is at most k − 1 (i.e., H(C) ≤ k − 1). Goldreich, Sahai and Vadhan [GSV99] showed that EAP is complete for the class of (promise) problems

that have non-interactive statistical zero-knowledge proofs (NISZK), an important sub-class of SZK. We consider a variant of EAP, first studied by Dvir et al. [DGRV11], that uses different notions of entropy. Specifically, consider the promise problem EAP_{min,max}, where YES cases are those in which the min-entropy⁵ of the distribution induced by the circuit is at least k (i.e., H_min(C) ≥ k) and NO instances are those in which the max-entropy is at most k − 1 (i.e., H_max(C) ≤ k − 1). We show that the average-case hardness of EAP_{min,max} (i.e., the difficulty of determining whether the min-entropy of the circuit is at least k or the max-entropy of the circuit is at most k − 1) implies the existence of MCRH. Many notable and well-established cryptographic assumptions imply the average-case hardness of EAP_{min,max}. Dvir et al. [DGRV11] showed that the hardness of the QUADRATIC RESIDUOSITY (QR) problem or the DECISIONAL DIFFIE-HELLMAN (DDH) assumption implies the hardness of EAP_{min,max}. The latter hardness can also be shown to follow from the average-case hardness of the SHORTEST VECTOR PROBLEM or the CLOSEST VECTOR PROBLEM with approximation factor roughly √n. To the best of our knowledge the existence of CRH is not known based on such small approximation factors (even assuming average-case hardness). In addition to constructing MCRH, we also show that their existence implies the existence of constant-round statistically hiding (and computationally binding) commitment schemes. Such commitment schemes are an important cryptographic primitive, used for example by Goldreich and Kahan [GK96] to construct constant-round zero-knowledge proofs for NP.

1.1.3 Zero-Knowledge Proofs of Proximity

In the third and last part of this thesis we study the feasibility of zero-knowledge proofs in the model of interactive proofs of proximity. In the standard model of interactive proofs the verifier must be allowed to read the input (i.e., the statement) in its entirety. For some applications with huge amounts of data, such verification is too slow. Interactive Proofs of Proximity (IPPs) [EKR04, RVW13] are interactive proofs in which the verifier runs in time sub-linear in the input length. Since the verifier cannot even read the entire input, following the property testing literature, we only require that the verifier reject inputs that are far from the language (and, as usual, accept inputs that are in the language). We initiate the study of zero-knowledge proofs of proximity (ZKPP). A ZKPP convinces a sub-linear time verifier that the input is close to the language (similarly to an IPP) while simultaneously guaranteeing a natural zero-knowledge property. Specifically, the verifier learns nothing beyond (1) the fact that the input is in the language, and (2) what it could additionally infer by reading a few bits of the input. Our main focus is the setting of statistical zero-knowledge, where we show the following (where N denotes the input length):

⁵For a random variable X, the min-entropy is defined as H_min(X) = min_{x∈Supp(X)} log(1/Pr[X = x]), whereas the max-entropy is H_max(X) = log(|Supp(X)|). It is always true that H_min(X) ≤ H_max(X).

14 " Statistical ZKPPs can be sub-exponentially more efficient than property testers (or even non-interactive IPPs): We show a natural property which has a sta- tistical ZKPP with a polylog(N) time verifier, but requires Q( /N) queries (and hence also runtime) for every property tester. A central technique towards showing this is using average min-entropy, another entropy notion related to the standard (conditional) entropy.

" Statistical ZKPPs can be sub-exponentially less efficient than IPPs: We show a property which has an IPP with a polylog(N) time verifier, but cannot have a statistical ZKPP with even an NOM time verifier. Towards proving this we show that every problem with statistical ZKPP can be (somewhat) reduced to the ENTROPY DIFFERENCE PROBLEM, the problem shown by Goldreich and Vadhan [GV99] to be SZK-complete.

" Statistical ZKPPs for some graph-based properties such as promise versions of expansion and bipartiteness, in the bounded degree graph model, with polylog(N) time verifiers exist. To show that expansion property has Statistical ZKPP, we (somewhat) reduce it to the STATISTICAL DIFFERENCE PROBLEM. Lastly, we also consider the computational setting where we show that: " Assuming the existence of one-way functions, every language computable by either a (uniform) Boolean circuit with poly-logarithmic depth (i.e., in NC) or a Turing machine with poly-logarithmic space (i.e., the class SC), has a computational ZKPP with a (roughly) VN time verifier.

" Assuming the existence of collision-resistant hash functions (CRH), every lan- guage in NP has a statistical zero-knowledge argument of proximity with a polylog(N) time verifier.

1.2 Outline of this Thesis

In Chapter 2 we give general notations and definitions used throughout this thesis. In Chapter 3 we present our new complete problems for SZK and the additional results on the STATISTICAL DIFFERENCE PROBLEM. In Chapter 4 we present our results on the construction and application of multi-collision resistant hash functions. Finally, in Chapter 5 we present our results on zero-knowledge proofs of proximity. Every chapter in this thesis is self-contained, with few cross-references between chapters (apart from references to the preliminaries chapter). Hence, this thesis can be read in any order. This thesis is based on the following publications:

* Chapter 3 is based on: Statistical difference beyond the polarizing regime. Itay Berman, Akshay Degwekar, Ron D. Rothblum, and Prashant Nalini Vasudevan. In ECCC 2019 [BDRV19].

* Chapter 4 is based on: Multi-collision resistant hash functions and their applications. Itay Berman, Akshay Degwekar, Ron D. Rothblum, and Prashant Nalini Vasudevan. In EUROCRYPT 2018 [BDRV18].

* Chapter 5 is based on: Zero-knowledge proofs of proximity. Itay Berman, Ron D. Rothblum, and Vinod Vaikuntanathan. In ITCS 2018 [BRV18].

Chapter 2

Preliminaries

In this chapter we give general notations and definitions used throughout this thesis.

2.1 Notations

We use calligraphic letters (e.g., U) to denote sets, uppercase for random variables, boldface for vectors (e.g., x), lowercase for values and functions, and uppercase sans-serif (e.g., A) for algorithms (i.e., Turing machines). All logarithms considered here are in base two. Given a random variable X, we write x ~ X to indicate that x is selected according to the distribution of X. Similarly, given a finite set S, we let s ~ S denote that s is selected according to the uniform distribution on S. We denote the probability mass that a distribution P (over a finite set Y) puts on an element y ∈ Y by either P_y or P(y), where we choose which notation to use based on readability and context. The support of a distribution P over a finite set Y, denoted Supp(P), is defined as {y ∈ Y : P(y) > 0}. The support of a discrete random variable X is the support of its probability mass function, that is, Supp(X) = Supp(P_X), where P_X is the distribution of the random variable X. We adopt the convention that when the same random variable occurs several times in an expression, all occurrences refer to a single sample. For example, Pr[f(X) = X] is defined to be the probability that when x ~ X, we have f(x) = x. We write U_n to denote the random variable distributed uniformly over {0,1}^n. The product distribution between two distributions P and Q is denoted by P ⊗ Q. The product distribution of k independent copies of P is denoted by P^{⊗k}. Given a boolean statement S (e.g., X > 5), let I{S} be the indicator function that outputs 1 if S is a true statement and 0 otherwise. The relative distance, over alphabet

Σ, between two strings x ∈ Σ^n and y ∈ Σ^n is defined by Δ(x, y) ≜ |{i : x_i ≠ y_i}|/n. If Δ(x, y) ≤ ε, we say that x is ε-close to y, and otherwise we say that x is ε-far from y. Similarly, we define the relative distance of x from a non-empty set S ⊆ Σ^n by Δ(x, S) ≜ min_{y∈S} Δ(x, y). If Δ(x, S) ≤ ε, we say that x is ε-close to S, and otherwise we say that x is ε-far from S. The bitwise exclusive-or between two binary strings x, y ∈ {0,1}^n is denoted by x ⊕ y. The image of a function f: X → Y is defined as Im(f) = {y ∈ Y : ∃x ∈ X, f(x) = y}. An additional notation that we will use is that

if S = (S_k)_{k∈N} and T = (T_k)_{k∈N} are ensembles of sets, we denote by S ⊆ T the fact that S_k ⊆ T_k for every k ∈ N. Given a probabilistic polynomial-time algorithm A, we denote by A(x; r) the output of A given input x and randomness r. We let poly denote the set of all polynomials over the integers. A function ν: N → [0,1] is negligible, denoted ν(n) = negl(n), if ν(n) < 1/p(n) for every p ∈ poly and large enough n. A function f: {0,1}* → {0,1}* is efficiently computable if there exists a probabilistic polynomial-time algorithm that on input x ∈ {0,1}* outputs f(x).
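The relative-distance notation can be made concrete with a few lines of code (an illustrative sketch, not from the thesis):

```python
# Relative (fractional Hamming) distance between equal-length strings,
# and distance from a set, following the definitions above.
def rel_dist(x, y):
    assert len(x) == len(y)
    return sum(a != b for a, b in zip(x, y)) / len(x)

def dist_to_set(x, S):
    return min(rel_dist(x, y) for y in S)

assert rel_dist("0101", "0111") == 0.25            # differ in 1 of 4 positions
assert dist_to_set("0001", {"0000", "1111"}) == 0.25
```

In the proofs-of-proximity setting of Chapter 5, "far from the language" means exactly that this quantity, measured against the set of strings in the language, exceeds the proximity parameter.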

2.2 Information Theory Preliminaries

2.2.1 Statistical Distance and Imbalance

The statistical distance between two distributions P and Q over a finite set Y is defined as SD(P, Q) = max_{S⊆Y} (P(S) − Q(S)) = (1/2) Σ_{y∈Y} |P_y − Q_y|. The definition of statistical distance immediately implies the following property.

Proposition 2.2.1. Let P be a distribution over a finite set Y and let U be the uniform distribution over Y. Then SD(P, U) ≥ 1 − |Supp(P)|/|Y|.

Proof. Let S = Supp(P). Then

SD(P, U) ≥ P(S) − U(S) = 1 − |Supp(P)|/|Y|.
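Both formulas for SD, and the bound of Proposition 2.2.1, can be checked by brute force on small examples (an illustrative sketch, not from the thesis; the subset maximization is exponential and only meant for tiny supports):

```python
# Illustrative check that the two definitions of statistical distance agree,
# and of Proposition 2.2.1.
from itertools import combinations

def sd_half_l1(p, q):
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(y, 0.0) - q.get(y, 0.0)) for y in keys)

def sd_max_subset(p, q):
    """max over all subsets S of P(S) - Q(S)."""
    ys = sorted(set(p) | set(q))
    return max(
        sum(p.get(y, 0.0) for y in S) - sum(q.get(y, 0.0) for y in S)
        for r in range(len(ys) + 1) for S in combinations(ys, r)
    )

P = {"a": 0.5, "b": 0.3, "c": 0.2}
Q = {"a": 0.1, "b": 0.2, "c": 0.7}
assert abs(sd_half_l1(P, Q) - sd_max_subset(P, Q)) < 1e-12  # definitions agree

# Proposition 2.2.1: a distribution supported on few points is far from uniform.
universe = ["a", "b", "c", "d"]
U = {y: 0.25 for y in universe}
R = {"a": 0.9, "b": 0.1}                 # |Supp(R)| = 2 out of |Y| = 4
assert sd_half_l1(R, U) >= 1 - len(R) / len(universe)
```

The maximizing subset in sd_max_subset is always {y : P_y > Q_y}, which is why the two formulations coincide.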

In this thesis (particularly in Chapter 3), we will use the following view of statistical distance.

Definition 2.2.2 (The imbalance between P and Q). Let P and Q be two distributions over a finite set Y. Let (B, Y) be the jointly distributed random variables defined as follows: B ~ {0, 1} and if B = 1, then Y ~ P (that is, Y is a random variable drawn according to P), and if B = 0, then Y ~ Q. For every y ∈ Supp(Y) we define the imbalance θ_y^{P,Q} ≜ Pr[B = 1 | Y = y] − Pr[B = 0 | Y = y].

We will typically omit the distributions in the superscript from the notation (i.e., write θ_y instead of θ_y^{P,Q}) when they are clear from the context.

Proposition 2.2.3. Let P and Q be two distributions as in Definition 2.2.2. Then, SD(P, Q) = E_{y~Y}[|θ_y|].

Proof. The proof follows from the definitions of θ_y and statistical distance; details follow.

It is easy to verify that for every y ∈ Supp(Y), it holds that Pr[Y = y] = (P_y + Q_y)/2 and θ_y = (P_y − Q_y)/(P_y + Q_y). Thus,

E_{y~Y}[|θ_y|] = Σ_{y∈Supp(Y)} Pr[Y = y] · |θ_y| = Σ_{y∈Supp(Y)} ((P_y + Q_y)/2) · (|P_y − Q_y|/(P_y + Q_y)) = (1/2)·Σ_{y∈Supp(Y)} |P_y − Q_y| = SD(P, Q). □
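Proposition 2.2.3 can likewise be verified numerically; the sketch below (again with our illustrative dictionary representation) compares SD(P, Q) against E_{y~Y}[|θ_y|] computed directly from the definitions:

```python
def statistical_distance(p, q):
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(y, 0.0) - q.get(y, 0.0)) for y in support)

def expected_abs_imbalance(p, q):
    """E_{y~Y}[|theta_y|], where Pr[Y = y] = (P_y + Q_y)/2 and
    theta_y = (P_y - Q_y)/(P_y + Q_y)."""
    total = 0.0
    for y in set(p) | set(q):
        py, qy = p.get(y, 0.0), q.get(y, 0.0)
        if py + qy > 0:
            total += ((py + qy) / 2) * abs(py - qy) / (py + qy)
    return total

# Arbitrary example distributions with partially overlapping supports.
P = {'a': 0.7, 'b': 0.2, 'c': 0.1}
Q = {'a': 0.1, 'b': 0.3, 'd': 0.6}
assert abs(statistical_distance(P, Q) - expected_abs_imbalance(P, Q)) < 1e-12
```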

The following claim is at the heart of [SV03]'s SZK protocol for SDP.

Proposition 2.2.4. Let P and Q be two distributions and let B and Y be random variables as in Definition 2.2.2. Let B' be a random variable that depends only on Y. That is, we have the following Markov chain: B → Y → B'. Then,

Pr[B = B'] ≤ 1/2 + SD(P, Q)/2,

where equality holds if B'(y) = 1 whenever θ_y > 0 and B'(y) = 0 whenever θ_y < 0 (that is, B' is the maximum likelihood estimator of B).

Proof. Let θ'_y ≜ Pr[B' = 1 | Y = y] − Pr[B' = 0 | Y = y]. Then,

Pr[B = B'] = E_{y~Y}[Pr[B = B' | Y = y]]
= E_{y~Y}[Pr[B = 1 | Y = y] · Pr[B' = 1 | Y = y] + Pr[B = 0 | Y = y] · Pr[B' = 0 | Y = y]]
= E_{y~Y}[((1 + θ_y)/2) · ((1 + θ'_y)/2) + ((1 − θ_y)/2) · ((1 − θ'_y)/2)]
= 1/2 + (1/2)·E_{y~Y}[θ_y · θ'_y]
≤ 1/2 + (1/2)·E_{y~Y}[|θ_y|]
= 1/2 + SD(P, Q)/2,

where the second equality follows since, conditioned on Y = y, the random variables B and B' are independent, and the inequality follows since θ'_y ∈ [−1, 1]. Moreover, if θ'_y = sign(θ_y) then the inequality is actually an equality. A B' satisfying the latter condition is exactly the maximum likelihood estimator for B. □
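The guessing game underlying Proposition 2.2.4 is also easy to simulate. The following sketch (our own illustration; the example distributions are arbitrary) estimates Pr[B = B'] for the maximum likelihood estimator and compares it with 1/2 + SD(P, Q)/2:

```python
import random

def sample(dist):
    """Draw one sample from a distribution given as {outcome: prob}."""
    r, acc = random.random(), 0.0
    for y, p in dist.items():
        acc += p
        if r < acc:
            return y
    return y  # guard against floating-point residue

P = {'a': 0.7, 'b': 0.3}
Q = {'a': 0.2, 'b': 0.8}
sd = 0.5 * (abs(0.7 - 0.2) + abs(0.3 - 0.8))  # SD(P, Q) = 0.5

def ml_guess(y):
    # Maximum likelihood estimator: guess B = 1 iff theta_y > 0, i.e. P_y > Q_y.
    return 1 if P.get(y, 0) > Q.get(y, 0) else 0

random.seed(0)
trials = 200_000
wins = 0
for _ in range(trials):
    b = random.randrange(2)
    y = sample(P if b == 1 else Q)
    wins += (ml_guess(y) == b)

# Proposition 2.2.4: Pr[B = B'] = 1/2 + SD(P, Q)/2 = 0.75 for the ML estimator.
assert abs(wins / trials - (0.5 + sd / 2)) < 0.01
```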

2.2.2 Entropy

Definition 2.2.5 (Entropy). The entropy of a discrete random variable X is defined as

H(X) ≜ E_{x~X}[log(1/Pr[X = x])].

The binary entropy function h: [0, 1] → [0, 1] is defined to be the entropy of X ~ Bernoulli(p). That is, h(p) ≜ −p·log(p) − (1 − p)·log(1 − p), where we use the convention that h(0) = h(1) = 0.

Definition 2.2.6 (Conditional entropy). Let X, Y be jointly distributed random variables. The conditional entropy of X given Y is defined as

H(X | Y) ≜ E_{y~Y}[H(X | Y = y)] = E_{(x,y)~(X,Y)}[log(1/Pr[X = x | Y = y])].

Fact 2.2.7 (Chain rule for entropy). Let X, Y be jointly distributed random variables. Then

H(X, Y) = H(XIY) + H(Y).
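As a quick numeric sanity check of Fact 2.2.7, the following sketch (with a small, arbitrarily chosen joint distribution) computes H(X, Y), H(X|Y) and H(Y) directly from the definitions:

```python
import math

def H(dist):
    """Shannon entropy (base 2) of a distribution given as {outcome: prob}."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

# An arbitrary joint distribution over pairs (x, y).
joint = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.4, (1, 1): 0.1}

# Marginal distribution of Y.
marg_y = {}
for (x, y), p in joint.items():
    marg_y[y] = marg_y.get(y, 0.0) + p

# H(X|Y) = E_{y~Y}[H(X | Y = y)].
h_x_given_y = 0.0
for y, py in marg_y.items():
    cond = {x: joint[(x, yy)] / py for (x, yy) in joint if yy == y}
    h_x_given_y += py * H(cond)

# Chain rule (Fact 2.2.7): H(X, Y) = H(X|Y) + H(Y).
assert abs(H(joint) - (h_x_given_y + H(marg_y))) < 1e-12
```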

Another notion of entropy that we shall use is that of (conditional) average min-entropy.

Definition 2.2.8 (Average min-entropy [DORS08]). Let X, Y be jointly distributed random variables. The average min-entropy of X given Y is defined by

H̃_min(X | Y) ≜ −log E_{y~Y}[max_x Pr[X = x | Y = y]].

The following fact follows immediately from the above definition. Since we could not find a reference, we include a proof.

Fact 2.2.9. Let X^n, Y^n be n-tuples of independent copies of the random variables X and Y, respectively. Then H̃_min(X^n | Y^n) = n · H̃_min(X | Y).

Proof. We prove the claim for n = 2; the general case follows by induction.

H̃_min(X² | Y²) = −log E_{(y₁,y₂)~Y²}[max_{x₁,x₂} Pr[X² = (x₁, x₂) | Y² = (y₁, y₂)]]
= −log E_{(y₁,y₂)~Y²}[max_{x₁,x₂} Pr[X = x₁ | Y = y₁] · Pr[X = x₂ | Y = y₂]]
= −log E_{(y₁,y₂)~Y²}[max_{x₁} Pr[X = x₁ | Y = y₁] · max_{x₂} Pr[X = x₂ | Y = y₂]],

where the second equality follows since the first sample from (X, Y) is independent from the second one, and the third equality follows since for non-negative functions f and g, it holds that max_{x₁,x₂} f(x₁) · g(x₂) = max_{x₁} f(x₁) · max_{x₂} g(x₂). Letting h(y) = max_x Pr[X = x | Y = y], we write

H̃_min(X² | Y²) = −log E_{(y₁,y₂)~Y²}[h(y₁) · h(y₂)]
= −log(E_{y₁~Y}[h(y₁)] · E_{y₂~Y}[h(y₂)])
= −log(E_{y₁~Y}[h(y₁)]) − log(E_{y₂~Y}[h(y₂)])
= H̃_min(X | Y) + H̃_min(X | Y),

where the second equality follows since the first sample of Y is independent of the second one. □
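Fact 2.2.9 can be checked directly for n = 2; the sketch below (our illustrative representation of joint distributions as dictionaries over pairs) forms the product of two independent copies and compares the two sides:

```python
import math

def avg_min_entropy(joint):
    """H̃_min(X|Y) = -log E_{y~Y}[max_x Pr[X = x | Y = y]]
       = -log sum over y of max_x Pr[X = x, Y = y]."""
    best = {}
    for (x, y), p in joint.items():
        best[y] = max(best.get(y, 0.0), p)
    return -math.log2(sum(best.values()))

# An arbitrary joint distribution of (X, Y).
joint = {(0, 0): 0.5, (1, 0): 0.25, (0, 1): 0.25}

# Two independent copies: Pr[(x1,x2),(y1,y2)] = Pr[x1,y1] * Pr[x2,y2].
joint2 = {((x1, x2), (y1, y2)): p1 * p2
          for (x1, y1), p1 in joint.items()
          for (x2, y2), p2 in joint.items()}

# Fact 2.2.9 (for n = 2): H̃_min(X²|Y²) = 2 * H̃_min(X|Y).
assert abs(avg_min_entropy(joint2) - 2 * avg_min_entropy(joint)) < 1e-12
```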

The following fact shows that a random pre-image of any shrinking function has some average min-entropy.

Fact 2.2.10. Let f: 𝒳 → 𝒴, let X be a random variable uniformly distributed over 𝒳, and let Y = f(X). Then, H̃_min(X | Y) = log(|𝒳|/|Im(f)|).

Proof. For y ∈ 𝒴, let f⁻¹(y) = {x ∈ 𝒳 : f(x) = y}. Fix y ∈ Im(f). For x ∈ f⁻¹(y), it holds that Pr[X = x | Y = y] = 1/|f⁻¹(y)|, while for x ∉ f⁻¹(y), it holds that Pr[X = x | Y = y] = 0. Thus, max_x Pr[X = x | Y = y] = 1/|f⁻¹(y)|. Moreover, it holds that Pr[Y = y] = |f⁻¹(y)|/|𝒳|. Finally, for every y ∉ Im(f), it holds that Pr[Y = y] = 0. Hence,

H̃_min(X | Y) = −log E_{y~Y}[max_x Pr[X = x | Y = y]]
= −log(Σ_{y∈Im(f)} (|f⁻¹(y)|/|𝒳|) · (1/|f⁻¹(y)|))
= −log(|Im(f)|/|𝒳|)
= log(|𝒳|/|Im(f)|). □
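The following sketch illustrates Fact 2.2.10 for a small, arbitrarily chosen shrinking function f: {0, 1}³ → {0, 1}²:

```python
import math
from itertools import product

# A shrinking function on 3 bits, chosen arbitrarily for illustration.
def f(x):
    return (x[0] ^ x[1], x[1] & x[2])

domain = list(product([0, 1], repeat=3))
image = {f(x) for x in domain}

# X uniform over the domain, Y = f(X); build the joint distribution.
joint = {(x, f(x)): 1 / len(domain) for x in domain}

# H̃_min(X|Y) = -log sum over y of max_x Pr[X = x, Y = y].
best = {}
for (x, y), p in joint.items():
    best[y] = max(best.get(y, 0.0), p)
avg_min = -math.log2(sum(best.values()))

# Fact 2.2.10: H̃_min(X|Y) = log(|X| / |Im(f)|).
assert abs(avg_min - math.log2(len(domain) / len(image))) < 1e-12
```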

2.2.3 Hashing

2.2.3.1 Many-wise Independent Hashing

Many-wise independent hash functions are used extensively in complexity theory and cryptography.

Definition 2.2.11 (ℓ-wise independent hash functions). For ℓ ∈ N, a family of functions F = {f: {0, 1}^n → {0, 1}^m} is ℓ-wise independent if for every distinct x₁, x₂, ..., x_ℓ ∈ {0, 1}^n and every y₁, y₂, ..., y_ℓ ∈ {0, 1}^m, it holds that

Pr_{f~F}[f(x₁) = y₁ ∧ f(x₂) = y₂ ∧ ... ∧ f(x_ℓ) = y_ℓ] = 1/(2^m)^ℓ.

Any k-wise independent hash family, for k ≥ 2, is also universal (see Definition 2.2.13 below). The existence of efficient many-wise hash function families is well known.

Fact 2.2.12 (c.f. [Vad12, Corollary 3.34]). For every n, m, ℓ ∈ N, there exists a family of ℓ-wise independent hash functions F^ℓ_{n,m} = {f: {0, 1}^n → {0, 1}^m} where a random function from F^ℓ_{n,m} can be selected using ℓ · max(m, n) bits, and given a description of f ∈ F^ℓ_{n,m} and x ∈ {0, 1}^n, the value f(x) can be evaluated in time poly(n, m, ℓ).

Whenever we only need a pairwise independent hash function family F^2_{n,m}, we remove the two from the superscript and simply write F_{n,m}. A weaker requirement for a hash family than being pairwise independent is being universal.

Definition 2.2.13 (Universal hash function). For n, m ∈ N, a family of functions F = {f: {0, 1}^n → {0, 1}^m} is universal if for every x₁ ≠ x₂ ∈ {0, 1}^n, it holds that

Pr_{f~F}[f(x₁) = f(x₂)] = 1/2^m.
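A standard example of a pairwise independent family (not the specific construction behind Fact 2.2.12) is the set of affine maps x ↦ Ax ⊕ b over GF(2). The sketch below verifies Definition 2.2.11 with ℓ = 2 for this family by exhaustive enumeration, in the toy setting n = m = 2:

```python
from itertools import product

n, m = 2, 2

def apply(A, b, x):
    # f_{A,b}(x) = A x + b over GF(2), with A an m-by-n bit matrix.
    return tuple((sum(A[i][j] * x[j] for j in range(n)) + b[i]) % 2
                 for i in range(m))

# All choices of (A, b): the family has 2^(mn + m) = 64 members here.
family = [(A, b)
          for A in product(product([0, 1], repeat=n), repeat=m)
          for b in product([0, 1], repeat=m)]

xs = list(product([0, 1], repeat=n))
ys = list(product([0, 1], repeat=m))

for x1 in xs:
    for x2 in xs:
        if x1 == x2:
            continue
        for y1 in ys:
            for y2 in ys:
                count = sum(1 for (A, b) in family
                            if apply(A, b, x1) == y1 and apply(A, b, x2) == y2)
                # Pairwise independence: probability exactly 1/(2^m)^2.
                assert count * (2 ** m) ** 2 == len(family)
```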

Dodis et al. [DORS08] showed the following generalization of the leftover hash lemma for sources having high conditional average min-entropy.

Lemma 2.2.14 (Generalized leftover hash lemma [DORS08, Lemma 2.4]). Let F = {f: {0, 1}^n → {0, 1}^m} be a family of universal hash functions. For any random variables X and Y with Supp(X) ⊆ {0, 1}^n and the random variable F ~ F, it holds that

SD((F(X), F, Y), (U_m, F, Y)) ≤ (1/2)·√(2^{−H̃_min(X|Y)} · 2^m),

where U_m is distributed uniformly over {0, 1}^m.

2.2.3.2 Load Balancing

The theory of load balancing deals with allocating elements into bins such that no bin has too many elements. If the allocation is done at random, it can be shown that with high probability the max load (i.e., the number of elements in the largest bin) is not large. In fact, allocating via a many-wise independent hash function also suffices.

Fact 2.2.15 (Folklore (see, e.g., [CRSW13])). Let n, m, ℓ ∈ N with ℓ ≥ 2e (where e is the base of the natural logarithm) and let F^ℓ_{n,m} be an ℓ-wise independent hash function family. Then, for every set S ⊆ {0, 1}^n with |S| ≤ 2^m it holds that:

Pr_{f~F^ℓ_{n,m}}[∃y ∈ {0, 1}^m such that |f⁻¹(y) ∩ S| ≥ ℓ] ≤ 2^{m−ℓ},

where f⁻¹(y) = {x ∈ {0, 1}^n : f(x) = y}.

Proof. Fix y ∈ {0, 1}^m. It holds that

Pr_{f~F^ℓ_{n,m}}[|f⁻¹(y) ∩ S| ≥ ℓ] ≤ Pr[∃ distinct x₁, ..., x_ℓ ∈ S : f(x₁) = y ∧ ... ∧ f(x_ℓ) = y]
≤ Σ_{distinct x₁,...,x_ℓ∈S} Pr[f(x₁) = y ∧ ... ∧ f(x_ℓ) = y]
≤ (2^m choose ℓ) · (1/2^m)^ℓ
≤ (e/ℓ)^ℓ
≤ 2^{−ℓ},

where the second inequality is by a union bound, the third inequality follows from the ℓ-wise independence of F^ℓ_{n,m} (and since |S| ≤ 2^m), the fourth inequality is by the standard bound (N choose ℓ) ≤ (eN/ℓ)^ℓ on binomial coefficients, and the last inequality follows by our assumption that ℓ ≥ 2e. Fact 2.2.15 follows from a union bound over all values of y ∈ {0, 1}^m. □
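A quick simulation illustrates the flavor of Fact 2.2.15: a truly random function is in particular ℓ-wise independent for every ℓ, so hashing |S| = 2^m elements into 2^m bins should essentially never produce a bin with ℓ = 16 or more elements (the parameters here are our own toy choices):

```python
import random

random.seed(1)
n_bits, m_bits = 16, 8
S = random.sample(range(2 ** n_bits), 2 ** m_bits)  # |S| = 2^m
ell = 16  # threshold; Fact 2.2.15 needs ell >= 2e

failures = 0
trials = 200
for _ in range(trials):
    # A fresh truly random function per trial, applied to each element of S.
    load = [0] * (2 ** m_bits)
    for _x in S:
        load[random.randrange(2 ** m_bits)] += 1
    if max(load) >= ell:
        failures += 1

# Fact 2.2.15 bounds the failure probability by 2^(m - ell) = 2^-8.
assert failures / trials <= 2 ** (m_bits - ell) + 0.05
```

With 256 balls in 256 bins the typical max load is single-digit, so failures are essentially never observed.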

Remark 2.2.16 (More efficient hash functions). We remark that more efficient constructions of hash functions guaranteeing the same load balancing performance as in Fact 2.2.15 are known in the literature. Specifically, focusing on the setting of ℓ = O(m), Fact 2.2.15 gives a load balancing guarantee for functions whose description size (i.e., key length) is Ω(m²) bits. In contrast, a recent result of Celis et al. [CRSW13] constructs such functions that require only O(m) key size. Furthermore, a follow-up work of Meka et al. [MRRR14] improves the evaluation time of the [CRSW13] hash function to be only poly-logarithmic in m (in the word RAM model). However, since our focus is not on concrete efficiency, we ignore these optimizations throughout this thesis.

2.2.4 Concentration Bounds

We use the following well-known concentration bound.

Fact 2.2.17 (Chernoff-Hoeffding bound). Let X₁, X₂, ..., X_t be independent random variables taking values in [a, b] and let X̄ = (1/t)·Σ_{i=1}^{t} X_i. Then, for every ε > 0 it holds that

Pr[X̄ − E[X̄] ≥ ε] ≤ e^{−2tε²/(b−a)²}  and  Pr[X̄ − E[X̄] ≤ −ε] ≤ e^{−2tε²/(b−a)²}.
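The bound is easy to probe empirically; the sketch below (i.i.d. uniform [0, 1] samples, parameters chosen arbitrarily) counts how often the empirical mean exceeds its expectation by ε and compares against the Hoeffding bound:

```python
import math
import random

random.seed(2)
t, eps = 2000, 0.05
a, b = 0.0, 1.0

# Hoeffding: Pr[ mean - E[mean] >= eps ] <= exp(-2 t eps^2 / (b - a)^2).
bound = math.exp(-2 * t * eps ** 2 / (b - a) ** 2)

trials = 500
exceed = 0
for _ in range(trials):
    mean = sum(random.random() for _ in range(t)) / t  # E[mean] = 1/2
    if mean - 0.5 >= eps:
        exceed += 1

assert exceed / trials <= bound + 0.02
```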

We also use the following fact, showing that computing the empirical distribution from a large enough number of samples approximates the original distribution well.

Fact 2.2.18 (Folklore (see, e.g., [Gol17, Exercise 11.4])). Let P be a distribution over n elements and let P̂ be the empirical distribution obtained from taking N samples p₁, ..., p_N from P, namely, P̂(i) = |{j : p_j = i}|/N. Then, if N ≥ O((n + log(1/δ))/ε²), it holds that

Pr[SD(P̂, P) ≥ ε] ≤ δ.
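The sketch below illustrates Fact 2.2.18 on a small example: the empirical distribution obtained from N samples is statistically close to the true one (the distribution and N are our own toy choices):

```python
import random

def statistical_distance(p, q):
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(y, 0.0) - q.get(y, 0.0)) for y in support)

random.seed(3)
P = {0: 0.5, 1: 0.3, 2: 0.15, 3: 0.05}   # a distribution over n = 4 elements
N = 50_000

counts = {}
for _ in range(N):
    r, acc = random.random(), 0.0
    for y, p in P.items():
        acc += p
        if r < acc:
            break
    counts[y] = counts.get(y, 0) + 1

P_hat = {y: c / N for y, c in counts.items()}
# With N much larger than n/eps^2, SD(P_hat, P) < eps with high probability.
assert statistical_distance(P_hat, P) < 0.02
```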

2.3 Statistical Zero-Knowledge Interactive Proofs

In this section we give the standard definitions for statistical zero-knowledge proofs and recall classical results regarding such proofs. We follow [Vad99].

Definition 2.3.1 (Interactive proofs). Let Π = (YES, NO) be a promise problem. An interactive protocol (P, V) is an interactive proof for Π with completeness error c: N → [0, 1] and soundness error s: N → [0, 1] if the following holds for every security parameter k ∈ N:

* Completeness: If x ∈ YES, then, when V(x, k) interacts with P(x, k), with probability 1 − c(k) it accepts.

* Soundness: If x ∈ NO, then for every prover strategy P*, when V(x, k) interacts with P*, with probability 1 − s(k) it rejects.

* Efficiency: The running time of the verifier V on input (x, k) is poly(|x|, k).

If c(·) and s(·) are negligible functions, we say that (P, V) is an interactive proof system.

Definition 2.3.2 (View of interactive protocol). Let (P, V) be an r-message interactive protocol. The view of V on a common input x is defined by view_{P,V}(x) = (m₁, m₂, ..., m_r; ρ), where m₁, m₂, ..., m_r are the messages sent by the parties in a random execution of the protocol, and ρ consists of all the random coins V used during this execution.

To formalize the notion of statistical zero-knowledge proofs we will allow probabilistic algorithms to fail by outputting ⊥. An algorithm A is useful if Pr[A(x) = ⊥] ≤ 1/2 for every x, and we let Ã(x) denote the output distribution of A(x), conditioned on A(x) ≠ ⊥. Two major variants of statistical zero-knowledge proofs exist in the literature. The first and easier to achieve variant guarantees only that the honest verifier (the original verifier of the interactive proof system) learns nothing from interacting with the prover.

Definition 2.3.3 (Honest-verifier zero-knowledge proofs). Let Π = (YES, NO) be a promise problem. An interactive proof system (P, V) for Π is said to be honest-verifier statistical zero-knowledge if there exists a useful probabilistic polynomial-time algorithm S and a negligible function μ: N → [0, 1] such that for every k ∈ N and x ∈ YES,

SD(S̃(x, k), view_{P,V}(x, k)) ≤ μ(k).

HVSZK denotes the class of promise problems possessing honest-verifier statistical zero-knowledge interactive proof systems.

A stronger variant of zero-knowledge proofs guarantees that any efficient verifier, including those that cheat and deviate from the original protocol, learns nothing from interacting with the prover. We allow cheating verifiers to be non-uniform by giving them an auxiliary input. For an algorithm A and a string z ∈ {0, 1}*, let A[z] be A when z is given as auxiliary input. Following [Vad99], we adopt the convention that the running time of A is independent of z, so if z is too long, A will not be able to access it in its entirety.

Definition 2.3.4 (Cheating-verifier zero-knowledge proofs). Let Π = (YES, NO) be a promise problem. An interactive proof system (P, V) for Π is said to be statistical zero-knowledge if for every probabilistic polynomial-time V* there exists a useful probabilistic polynomial-time algorithm S and a negligible function μ: N → [0, 1] such that for every k ∈ N, z ∈ {0, 1}* and x ∈ YES,

SD(S̃[z](x, k), view_{P,V*[z]}(x, k)) ≤ μ(k).

SZK denotes the class of promise problems possessing statistical zero-knowledge interactive proof systems.

2.3.1 Complete Problems

Central in the study of statistical zero-knowledge are problems dealing with properties of distributions encoded by circuits.

Definition 2.3.5 (Distributions encoded by circuits). Let C be a Boolean circuit with m input gates and n output gates. The distribution encoded by C is the distribution induced on {0, 1}^n by evaluating C on a uniformly selected string from {0, 1}^m. By abuse of notation, we also write C for the distribution defined by the circuit C.

Two particularly interesting problems are the STATISTICAL DIFFERENCE PROBLEM and the ENTROPY DIFFERENCE PROBLEM.

Definition 2.3.6 (STATISTICAL DIFFERENCE PROBLEM). Let α, β: N → [0, 1] with α(n) > β(n) for every n. The STATISTICAL DIFFERENCE PROBLEM with promise (α, β), denoted SDP^{α,β}, is given by the sets

SDP^{α,β}_YES = {(C₀, C₁) | SD(C₀, C₁) ≥ α(n)} and

SDP^{α,β}_NO = {(C₀, C₁) | SD(C₀, C₁) ≤ β(n)},

where n is the output length of the circuits C₀ and C₁.

Definition 2.3.7 (ENTROPY DIFFERENCE PROBLEM). Let g: N → R⁺. The ENTROPY DIFFERENCE PROBLEM with promise g, denoted EDP^g, is given by the sets

EDP^g_YES = {(C₀, C₁) | H(C₀) ≥ H(C₁) + g(n)} and

EDP^g_NO = {(C₀, C₁) | H(C₁) ≥ H(C₀) + g(n)},

where n is the output length of the circuits C₀ and C₁.

Remark 2.3.8. In prior works α, β and g were typically thought of as constants, and so their dependence on the input was not specified. In contrast, since we will sometimes want to think of them as parameters, we choose to let them depend on the output length of the circuits, since this size seems most relevant to the distributions induced by the circuits. Other natural choices could have been the input length or the description size of the circuits (and indeed in Section 4.2 we consider a problem whose promise is a function of the circuits' input length). We remark that these different choices do not affect our results in a fundamental way.

Both SDP and EDP are known to be complete for SZK, though for different settings of parameters.

Theorem 2.3.9 ([SV03, GSV98]). Let α, β be such that (α², β) are (1/poly)-separated functions and there exists a constant ε ∈ (0, 1/2) with 2^{−n^{1/2−ε}} ≤ β(n) and α(n) ≤ 1 − 2^{−n^{1/2−ε}} for every n ∈ N. Then, the promise problem SDP^{α,β} is SZK-complete. Furthermore, the problem remains complete if it is restricted to length-preserving pairs of circuits.

Remark 2.3.10. The restrictions placed on α and β in Theorem 2.3.9 (that they are not closer than 2^{−n^{1/2−ε}} to 1 or 0) are a result of the extent to which the polarization lemma of [SV03] can polarize. While, for some small constant c, it might be possible to push this to allow them to be 2^{−cn}-close to 1 or 0 using a more efficient polarization technique (see, for instance, the proof of Theorem 12 in [AAIRV17]), there are reasons to believe that this restriction cannot be weakened much more. For instance, if α = 1, then SDP^{α,β} is in PZK, and is thus unlikely to be complete for SZK (indicated by a known oracle separation between these classes [BCN+17]). Similarly, if α/β > 2^{n/2}, then SDP^{α,β} is contained in the class PP, which SZK is again oracle-separated from [BCH+17]. Finally, with regard to the separation between α and β, in the extreme case that |α(n) − β(n)| ≤ 2^{−n}, we note that SDP^{α,β} can be shown to be NP-hard by a reduction from Circuit-SAT, and thus is not contained in SZK unless the polynomial hierarchy collapses [BHZ87].

Since the furthermore part of Theorem 2.3.9 was not stated in prior works (but will be used by us for technical reasons), we include a proof sketch.

Proof Sketch of Theorem 2.3.9. While this theorem is not stated in this generality in [SV03], it extends to it by setting some parameters appropriately. We show how to do this using the proof of the equivalent theorem in [Vad99]. The approach is to first establish the SZK-hardness of SDP^{2/3,1/3}, and then use the polarization lemma [Vad99, Lemma 3.1.12] to reduce SDP^{α,β} to SDP^{1−2^{−k},2^{−k}}, showing that the former is in SZK. Here we sketch how, for every α > β such that (α², β) are (1/poly)-separated and any ε ∈ (0, 1/2), the problem SDP^{α,β} is reduced, via the polarization lemma, to SDP^{1−2^{−k},2^{−k}} for k = n^{1/2−ε}, with n being the input and output length of the circuits (since the furthermore clause requires us to produce length-preserving circuits).

The polarization lemma, for some α and β and an integer k, gives a way to take a pair of circuits (C₀, C₁) and efficiently produce another pair (D₀, D₁) such that if SD(C₀, C₁) ≥ α, then SD(D₀, D₁) ≥ 1 − 2^{−k}, and if SD(C₀, C₁) ≤ β, then SD(D₀, D₁) ≤ 2^{−k}. While it can do this for any k, the output lengths of D₀ and D₁ grow as k grows, and we are interested in k as a function of these output lengths. In particular, we wish to show that, for any constant ε ∈ (0, 1/2), we can end up with an output length n for D₀ and D₁ such that k ≥ n^{1/2−ε}. Actually, since we want the resulting circuits to be length-preserving, we analyze how the polarization lemma also affects the input length.

The proof of [Vad99, Lemma 3.1.12] makes use of the following two transformations on pairs of circuits:

* ℓ-fold Repetition: the input circuits (C₀, C₁) are mapped to (D₀, D₁), where D_b is the concatenation of ℓ independent copies of C_b. If m and n are the input and output lengths of C₀ and C₁, then m · ℓ and n · ℓ are the input and output lengths of D₀ and D₁.

* r-fold XOR: the input circuits (C₀, C₁) are mapped to (D₀, D₁), where D_b first samples (b₁, ..., b_r) ~ {0, 1}^r conditioned on b₁ ⊕ ... ⊕ b_r = b and outputs the concatenation of C_{b₁}, ..., C_{b_r}. Notice that in order to sample the desired b₁, ..., b_r, it suffices to have r uniformly random bits: if b₁ ⊕ ... ⊕ b_r = b, then use these bits; otherwise, flip the first bit b₁. The resulting sampled bits satisfy the desired distribution. Hence, if m and n are the input and output lengths of C₀ and C₁, then (m · r + r) and n · r are the respective input and output lengths of D₀ and D₁.

Initially, the polarization lemma sets λ = min(α²/β, 2) and ℓ = ⌈log_λ 4k⌉, and proceeds as follows:

1. Repeat (C₀, C₁) for ℓ times to get circuits C₀^{(1)} and C₁^{(1)}.

2. XOR (C₀^{(1)}, C₁^{(1)}) for r = λ^ℓ/(2α^{2ℓ}) times to get circuits C₀^{(2)} and C₁^{(2)}.

3. Repeat (C₀^{(2)}, C₁^{(2)}) for k times to get circuits D₀ and D₁.

The fact that D₀ and D₁ are polarized is proved in [SV03]. Here we only focus on showing the dependence of k on the input and output lengths.

Based on the above description, the input lengths of D₀ and D₁ are equal to m' = (m · ℓ · r + r) · k and their output lengths are equal to n' = ℓ · r · k · n, where m (resp., n) is the input (resp., output) length of the input circuits C₀ and C₁. Note that if the input circuits are length-preserving, then the input of the resulting circuits is longer than their output.

We follow the proof of [CCKV08, Lemma 38] and note that the above setting guarantees the following. First, note that ℓ = O(log(k)/log(λ)). Moreover, since λ ∈ (1, 2], we have that ln(λ) = ln(1 + (λ − 1)) ≥ (λ − 1)/2 ≥ Ω(α² − β), where we used that ln(1 + x) ≥ x/2 for all x ∈ [0, 1]. So, ℓ = O(log(k)/(α² − β)). Also note that r ≤ (1/2)·(2/α²)^ℓ ≤ exp(O(log(k) · ln(2/α²)/ln(λ))).

Our starting point is an instance of SDP^{α,β}. We would first like to use the polarization lemma to reduce it to an instance of SDP^{3/4,1/4}. Set k = 2, and so ℓ ≤ O(log(n)), where we used that α² − β ≥ Ω(1/log(n)), for n being the output length of the given circuits. Further, it holds that ln(2/α²)/ln(2/β) ≤ 1 for all β ∈ (0, 1) with β ≤ α², and hence r = poly(n). The resulting polarization procedure is polynomial in the description of the input circuits, and thus reduces SDP^{α,β} to SDP^{3/4,1/4}. This shows that SDP^{α,β} ∈ SZK.

To show that SDP^{α,β} is SZK-hard, we further reduce SDP^{3/4,1/4} to SDP^{1−2^{−k},2^{−k}} for k(n) = n^{1/2−ε}. We are given a pair of circuits that are an instance of SDP^{3/4,1/4}. First, pad the input and output to get length-preserving circuits whose input and output lengths are some integer m. We apply the polarization lemma twice, each time appropriately setting the parameter k. In the second application we use a k to be determined by the analysis below. This k also sets the parameters for the first application of the polarization lemma, as follows.

Let ℓ = ⌈log₂ 4k⌉; this is the ℓ set by the second application of the polarization lemma (we are in a regime where α² ≥ 2β, so λ = 2). We first polarize the circuits so that α ≥ 1 − 1/ℓ. To do this, set k' = log ℓ and ℓ' = ⌈log₂ 4k'⌉. The input length of the resulting polarized circuits is m' = O(m · polylog(log(k))). Now, apply the polarization lemma one more time with k and ℓ = ⌈log₂ 4k⌉ ≥ 2 (that we already set) to get our (almost) final circuits D₀ and D₁. The input length of D₀ and D₁ is given by n = (m' · ℓ · r + r) · k with r = 2^ℓ/(2·(1 − 1/ℓ)^{2ℓ}), which is O(m · k · polylog(k)), where we used that (1 − 1/ℓ)^ℓ = Ω(1). The statistical distance between D₀ and D₁ is now either at least 1 − 2^{−k} or at most 2^{−k}. Also note that the output length of the circuits is shorter than their input length n.

We would like (D₀, D₁) to be an instance of SDP^{1−2^{−n^{1/2−ε}}, 2^{−n^{1/2−ε}}}. It hence suffices that 2^{−k} ≤ 2^{−n^{1/2−ε}}, namely k ≥ n^{1/2−ε}. Let c = 2ε/(1 − 2ε) and assume that k ≥ m^{1/c} (we will shortly set k to satisfy this assumption). This setting guarantees that (mk²)^{1/2−ε} ≤ k^{1−ε}. Thus, it holds that n^{1/2−ε} ≤ O(k^{1−ε} · polylog(k)) ≤ k, where the last inequality holds for all k ≥ k(ε), for k(ε) being a constant that depends on ε and on the hidden constants in the O notation of n. We are now finally able to set k = max(m^{1/c}, k(ε)). Lastly, we also want to produce length-preserving circuits. Since the input length of the circuits is longer than their output, and we set k according to the input length,

we can simply pad the output so it equals the input length n. The resulting circuits are thus an instance of SDP^{1−2^{−k}, 2^{−k}} with k ≥ n^{1/2−ε}, as required. □

Theorem 2.3.11 ([GV99, GSV98]). For every efficiently computable p = p(n) ∈ poly(n), the problem EDP^{1/p} is SZK-complete.

2.3.1.1 Oracle Access Zero-Knowledge Proofs

In Chapter 5 we will also care about how the verifier and the simulator access their input. In particular, a fact that we will rely on heavily is that the zero-knowledge proof systems for SDP and EDP from Theorems 2.3.9 and 2.3.11 only require oracle access to the distributions induced by the input circuits. That is, neither the verifier nor the simulator in these proof systems needs to actually look at the circuits themselves. Rather, all that they need is the ability to generate samples from the circuits.

Definition 2.3.12 (Oracle-access honest-verifier zero-knowledge proof). Let Π be a promise problem whose inputs are pairs (C₀, C₁) of circuits on n bits. An honest-verifier zero-knowledge proof system for Π is oracle-access if both the verifier and the simulator only require oracle access to the two circuits. Namely, both algorithms only get access to an oracle that on input x ∈ {0, 1}^n returns C₀(x) and C₁(x).

We can now state the results regarding the zero-knowledge proof systems of SDP and EDP that are used in the proofs of Theorems 2.3.9 and 2.3.11 to show that both problems are in SZK. In fact, we will not care about SDP but rather about the STATISTICAL CLOSENESS PROBLEM, the complement of SDP in which the YES and NO instances are switched, namely SDP̄^{α,β} ≜ (SDP^{α,β}_NO, SDP^{α,β}_YES) (SDP̄ is also complete for SZK, see [Vad99, Corollary 6.5.1]).

Lemma 2.3.13. Let 0 < β < α < 1 be constants such that¹ h((1 + α)/2) < 1 − β. Then, there exists a 2-message oracle-access honest-verifier statistical zero-knowledge proof for SDP̄^{α,β}. Moreover, the running times of the verifier and the simulator in the above protocol, given oracle access to (C₀, C₁) and security parameter k, are poly(m, n, k), where m is the number of random coins needed to sample from C₀ or C₁ (i.e., their input length) and n is the output length of C₀ and C₁.

The protocol establishing Lemma 2.3.13 reduces, in a black-box way, an instance of SDP̄^{α,β} to EDP (see [Vad99, Section 4.4]) and then uses the next lemma.

Lemma 2.3.14. There exists a 2-message oracle-access honest-verifier statistical zero-knowledge proof for EDP. Moreover, the running times of the verifier and the simulator in the above protocol, given oracle access to (C₀, C₁) and security parameter k, are poly(m, n, k), where m is the number of random coins needed to sample from C₀ or C₁ (i.e., their input length) and n is the output length of C₀ and C₁.

¹Recall that we use h to denote the binary entropy function h(p) = −p·log(p) − (1 − p)·log(1 − p).

Chapter 3

Statistical Difference Beyond the Polarizing Regime

In this chapter we study the computational complexity of the class SZK in general, and of the STATISTICAL DIFFERENCE PROBLEM in particular. We show two additional complete problems for SZK, defined via different distance notions from information theory. Based on known bounds between those distance measures and statistical distance, we also manage to show that the STATISTICAL DIFFERENCE PROBLEM is SZK-complete for a regime of parameters not previously known. We also show that the average-case hardness of the STATISTICAL DIFFERENCE PROBLEM, for a regime of parameters not known to be in SZK, implies the existence of one-way functions. Finally, we give a (direct) constant-round interactive proof for estimating the statistical distance between any two distributions (up to any inverse polynomial error) given circuits that generate them.

This chapter is based on [BDRV19].

3.1 Overview

The STATISTICAL DIFFERENCE PROBLEM, introduced by Sahai and Vadhan [SV03], is a central computational (promise) problem in complexity theory and cryptography, which is also intimately related to the study of statistical zero-knowledge (SZK). The input to this problem is a pair of circuits C₀ and C₁, specifying probability distributions (i.e., the distributions induced by feeding the circuits a uniformly random string). YES instances are those in which the statistical distance¹ between the two distributions is at least 2/3, and NO instances are those in which the distance is at most 1/3. Input circuits that do not fall into one of these two cases are considered to be outside the promise (and so their value is left unspecified). The choice of the constants 1/3 and 2/3 in the above definition is somewhat

¹Recall that the statistical distance between two distributions P and Q over a set Y is defined as SD(P, Q) = (1/2)·Σ_{y∈Y} |P_y − Q_y|, where P_y (resp., Q_y) is the probability mass that P (resp., Q) puts on y ∈ Y.

arbitrary (although not entirely arbitrary, as will soon be discussed in detail). A more general family of problems can be obtained by considering a suitable parameterization. More specifically, let 0 ≤ β < α ≤ 1. The (α, β)-parameterized version of the STATISTICAL DIFFERENCE PROBLEM, denoted SDP^{α,β}, has as its YES inputs pairs of circuits that induce distributions that have distance at least α, whereas the NO inputs correspond to circuits that induce distributions that have distance at most β. Recall the formal definition of the problem (restating Definition 2.3.6):

Definition 3.1.1 (STATISTICAL DIFFERENCE PROBLEM). Let α, β: N → [0, 1] with α(n) > β(n) for every n. The STATISTICAL DIFFERENCE PROBLEM with promise (α, β), denoted SDP^{α,β}, is given by the sets

SDP^{α,β}_YES = {(C₀, C₁) | SD(C₀, C₁) ≥ α(n)} and

SDP^{α,β}_NO = {(C₀, C₁) | SD(C₀, C₁) ≤ β(n)},

where n is the output length of the circuits C₀ and C₁.

(Recall that we abuse notation and use C₀ and C₁ to denote both the circuits and the respective distributions that they generate.) The elegant polarization lemma of [SV03] shows how to polarize the statistical distance between two distributions. In more detail, for any constants α and β such that β < α², the lemma gives a transformation that makes distributions that are at least α-far be extremely far, and distributions that are β-close be extremely close. Beyond being of intrinsic interest, the polarization lemma is used to establish the SZK-completeness of SDP^{α,β} when α² > β, and has other important applications in cryptography, such as the amplification of weak public-key encryption schemes to full-fledged ones [DNR04, HR05]. Sahai and Vadhan left the question of polarization for parameters α and β that do not meet the requirements of their polarization lemma as an open question. We refer to this setting of α and β as the non-polarizing regime. We emphasize that by non-polarizing we merely mean that in this regime polarization is not currently known, and not that it is impossible to achieve (although some barriers are known and will be discussed further below). The focus of this work is studying the STATISTICAL DIFFERENCE PROBLEM in the non-polarizing regime.

3.1.1 Our Results

We proceed to describe our results.

3.1.1.1 Polarization and SZK Completeness for Other Notions of Distance

The statistical distance metric is one of the central information theoretic tools used in cryptography, as it is very useful for capturing similarity between distributions. However, information theory offers other central notions that measure similarity, such as mutual information and KL-divergence.

Loosely speaking, our first main result shows that polarization is possible even in some cases in which β > α². However, this result actually stems from a more general study showing that polarization is possible for other notions of distance between distributions from information theory, which we find to be of independent interest. When distributions are extremely similar or extremely dissimilar, these different notions of distance are often (but not always) closely related and hence interchangeable. This equivalence is particularly beneficial when considering applications of SZK: for some applications one distance measure may be easier to use than others. For example, showing that the average-case hardness of SZK implies one-way functions can be analyzed using statistical distance (e.g., [Vad99, Section 4.8]), but showing that every language in SZK has instance-dependent commitments is naturally analyzed using entropy (e.g., [OV08]). However, as the gaps in the relevant distances get smaller (i.e., the distributions are only somewhat similar or dissimilar), the relation between different statistical properties becomes less clear (for example, the reduction from SDP^{α,β} to the ENTROPY DIFFERENCE PROBLEM of [GV99] only works when roughly α² > β). This motivates studying the computational complexity of problems defined using different notions of distance in this small-gap regime. Studying this question can be (and, as we shall soon see, indeed is) beneficial in two aspects. First, it provides a wider bag of statistical properties related to SZK, which can make certain applications easier to analyze. Second, the computational complexity of these distance notions might shed light on the computational complexity of problems involving existing distance notions (e.g., SDP^{α,β} when α² < β). We focus here on two specific distance notions: the triangular discrimination and the Jensen-Shannon divergence, defined next.

Definition 3.1.2 (Triangular Discrimination). The Triangular Discrimination (a.k.a. Le Cam divergence) between two distributions P and Q is defined as

TD(P, Q) ≜ (1/2)·Σ_{y∈Y} (P_y − Q_y)²/(P_y + Q_y),

where Y is the union of the supports of P and Q. The TRIANGULAR DISCRIMINATION PROBLEM with promise (α, β), denoted TDP^{α,β}, is defined analogously to SDP^{α,β}, but with respect to TD rather than SD.
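For concreteness, the sketch below computes TD and checks the standard f-divergence relations SD² ≤ TD ≤ SD (the lower bound follows from Cauchy-Schwarz, the upper bound from |P_y − Q_y| ≤ P_y + Q_y; the example distributions are our own):

```python
def triangular_discrimination(p, q):
    """TD(P, Q) = (1/2) * sum over y of (P_y - Q_y)^2 / (P_y + Q_y)."""
    total = 0.0
    for y in set(p) | set(q):
        py, qy = p.get(y, 0.0), q.get(y, 0.0)
        if py + qy > 0:
            total += (py - qy) ** 2 / (py + qy)
    return total / 2

def statistical_distance(p, q):
    return 0.5 * sum(abs(p.get(y, 0.0) - q.get(y, 0.0)) for y in set(p) | set(q))

P = {'a': 0.8, 'b': 0.2}
Q = {'a': 0.3, 'b': 0.7}
td, sd = triangular_discrimination(P, Q), statistical_distance(P, Q)
assert sd ** 2 <= td <= sd <= 1
```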

The triangular discrimination is commonly used, among many other applications, in statistical learning theory for parameter estimation with quadratic loss, see [Cam86, p. 48] (in a similar manner to how statistical distance characterizes the 0-1 loss function in hypothesis testing). Jumping ahead, while the definition of triangular discrimination seems somewhat arbitrary at first glance, in Section 3.2 we will show that this distance notion characterizes some basic phenomena in the study of statistical zero-knowledge. Triangular discrimination has recently found usage in theoretical computer science, and even specifically in problems related to SZK. Yehudayoff [Yeh16] showed that using TD yields a tighter analysis of the pointer chasing problem in communication complexity. The work of Komargodski and Yogev [KY18]

uses triangular discrimination to show that the average-case hardness of SZK implies the existence of distributional collision resistant hash functions. Next, we define the Jensen-Shannon divergence. First, recall that the KL-divergence between two distributions P and Q is defined² as KL(P‖Q) = Σ_y P_y log(P_y / Q_y).

Also, given distributions P₀ and P₁, we define the distribution ½P₀ + ½P₁ as the distribution obtained by sampling a random coin b ∈ {0, 1} and outputting a sample y from P_b (indeed, this notation corresponds to arithmetic operations on the probability mass functions). The Jensen-Shannon divergence measures the mutual information between b and y.

Definition 3.1.3 (Jensen-Shannon Divergence). The Jensen-Shannon divergence between two distributions P and Q is defined as

JS(P, Q) = \frac{1}{2} \, KL\!\left(P \,\Big\|\, \frac{P + Q}{2}\right) + \frac{1}{2} \, KL\!\left(Q \,\Big\|\, \frac{P + Q}{2}\right).

The JENSEN-SHANNON DIVERGENCE PROBLEM with promise (α, β), denoted JSP^{α,β}, is defined analogously to SDP^{α,β}, but with respect to JS rather than SD.
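The definition above translates directly into code. The sketch below (again using our own probability-dictionary convention) computes JS exactly as the average of two KL terms against the mixture, measured in bits so that JS is bounded by one.

```python
import math

def kl(p, q):
    """KL(P||Q) in bits; treats 0*log 0 as 0 and assumes supp(P) is contained in supp(Q)."""
    return sum(py * math.log2(py / q[y]) for y, py in p.items() if py > 0)

def js(p, q):
    """JS(P, Q) = (1/2) KL(P || M) + (1/2) KL(Q || M), where M = (P + Q)/2."""
    m = {y: (p.get(y, 0.0) + q.get(y, 0.0)) / 2 for y in set(p) | set(q)}
    return (kl(p, m) + kl(q, m)) / 2

print(js({0: 1.0}, {1: 1.0}))                  # 1.0 (disjoint supports)
print(js({0: 0.5, 1: 0.5}, {0: 0.5, 1: 0.5}))  # 0.0 (identical)
```

Averaging against the mixture M is what makes JS symmetric and bounded, the two properties the text notes the KL-divergence lacks.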

The Jensen-Shannon divergence enjoys a couple of important properties (in our context) that the KL-divergence lacks: it is symmetric and bounded. Both triangular discrimination and Jensen-Shannon divergence (as well as statistical distance and KL-divergence) are types of f-divergences, a central concept in information theory (see [PW17, Section 6] and references therein). They are both non-negative and bounded by one.³ Finally, the Jensen-Shannon divergence is a metric, while the triangular discrimination is a square of a metric. With these notions of distance and corresponding computational problems in hand, we are almost ready to state our first set of results. Before doing so, we introduce an additional useful technical definition.

Definition 3.1.4 (Separated functions). Let g: ℕ → [0, 1]. A pair of poly(n)-time computable functions (α, β), where α = α(n) ∈ [0, 1] and β = β(n) ∈ [0, 1], is g-separated if α(n) ≥ β(n) + g(n) for every n ∈ ℕ. We denote by (1/poly)-separated the set of all (1/p)-separated pairs of functions, for every polynomial p. Similarly, we denote by (1/log)-separated the set of all (1/(c log))-separated pairs of functions, for every constant c > 0.

We can now state our first set of results: that both TDP and JSP, with a noticeable gap, are SZK-complete.

Theorem 3.1.5. Let (α, β) be (1/poly)-separated functions such that there exists a constant ε ∈ (0, 1/2) for which 2^{-n^{1/2-ε}} ≤ β(n) and α(n) ≤ 1 - 2^{-n^{1/2-ε}}, for every n ∈ ℕ. Then, TDP^{α,β} is SZK-complete.

²To be more precise, in this definition we view 0·log 0 as 0 and define the KL-divergence to be ∞ if the support of P is not contained in that of Q.
³In the literature these distances are sometimes defined to be twice as large as in our definitions. In our context, it is natural to have the distances bounded by one.

Theorem 3.1.6. For (α, β) as in Theorem 3.1.5, the problem JSP^{α,β} is SZK-complete.

The restriction that 2^{-n^{1/2-ε}} ≤ β(n) and α(n) ≤ 1 - 2^{-n^{1/2-ε}} should be interpreted as a non-degeneracy requirement (which we did not attempt to optimize), where we note that some restriction seems inherent (see Remark 2.3.10 below). Moreover, we can actually decouple the assumptions in Theorems 3.1.5 and 3.1.6 as follows. To show that TDP^{α,β} and JSP^{α,β} are SZK-hard, only the non-degeneracy assumption

(i.e., 2^{-n^{1/2-ε}} ≤ β(n) and α(n) ≤ 1 - 2^{-n^{1/2-ε}}) is needed. On the other hand, to show that these problems are in SZK we only require that (α, β) are (1/poly)-separated. Note that in particular, Theorems 3.1.5 and 3.1.6 imply polarization lemmas for both TD and JS. For example, for triangular discrimination, since TDP^{α,β} ∈ SZK and TDP^{1-2^{-k}, 2^{-k}} is SZK-hard, one can reduce the former to the latter. Beyond showing polarization for triangular discrimination, Theorem 3.1.5 has implications regarding the question of polarizing statistical distance, which was our original motivation. It is known that the triangular discrimination is sandwiched between the statistical distance and its square; namely, for every two distributions P and Q it holds that (see [Top00, Eq. (11)]):

SD(P, Q)² ≤ TD(P, Q) ≤ SD(P, Q).    (3.1)
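Eq. (3.1) is easy to stress-test numerically. The sketch below (the helper functions `sd`, `td`, and `random_dist` are our own, not from the thesis) checks the sandwich SD² ≤ TD ≤ SD on many random pairs of distributions.

```python
import random

def sd(p, q):
    """Statistical (total variation) distance."""
    return sum(abs(p.get(y, 0.0) - q.get(y, 0.0)) for y in set(p) | set(q)) / 2

def td(p, q):
    """Triangular discrimination."""
    return sum((p.get(y, 0.0) - q.get(y, 0.0)) ** 2 / (p.get(y, 0.0) + q.get(y, 0.0))
               for y in set(p) | set(q) if p.get(y, 0.0) + q.get(y, 0.0) > 0) / 2

def random_dist(n, rng):
    """A random distribution on {0, ..., n-1} via normalized random weights."""
    w = [rng.random() for _ in range(n)]
    s = sum(w)
    return {i: x / s for i, x in enumerate(w)}

rng = random.Random(0)
for _ in range(1000):
    p, q = random_dist(5, rng), random_dist(5, rng)
    assert sd(p, q) ** 2 <= td(p, q) <= sd(p, q) + 1e-12
```

Both inequalities are tight at the extremes: identical distributions give SD = TD = 0, and disjoint ones give SD = TD = 1.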

(for self-containment, we include a proof of this fact in Section 3.6.) Thus, the problem SDP^{α,β} is immediately reducible to TDP^{α²,β}, which Theorem 3.1.5 shows to be SZK-complete, as long as the gap between α² and β is noticeable. Specifically, we have the following corollary.

Corollary 3.1.7. Let (α, β) be as in Theorem 3.1.5, with the exception that (α², β) are (1/poly)-separated (note that here α is squared). Then, the promise problem SDP^{α,β} is SZK-complete.

We highlight two implications of Theorem 3.1.5 and Corollary 3.1.7 (which were also briefly mentioned above).

Polarization with Inverse Polynomial Gap. Observe that Corollary 3.1.7 implies polarization of statistical distance in a regime in which α and β are functions of n, the output length of the two circuits, and α² and β are only separated by an inverse polynomial. This is in contrast to most prior works, which focus on α and β that are constants. In particular, Sahai and Vadhan's [SV03] proof of the polarization lemma focuses on constant α and β and can be extended to handle an inverse logarithmic gap, but does not seem to extend to an inverse polynomial gap.⁴ Corollary 3.1.7 does yield such a result, by relying on a somewhat different approach.

⁴Actually, it was claimed in [GV11] that the [SV03] proof does extend to the setting of an inverse polynomial gap between α and β, but this claim was later retracted.

Polarization Beyond α² > β. Theorem 3.1.5 can sometimes go beyond the requirement that α² > β for polarizing statistical distance. Specifically, it shows that any problem with a noticeable gap in the triangular discrimination can be polarized. Indeed, there are distributions (P, Q) and (P', Q') with SD(P, Q) > SD(P', Q') > SD(P, Q)² but still TD(P, Q) > TD(P', Q').⁵ Circuits generating such distributions were until now not known to be in the polarizing regime, but can now be polarized by combining Theorem 3.1.5 and Eq. (3.1).

3.1.1.2 From Statistical Difference to One-way Functions

We continue our study of the STATISTICAL DIFFERENCE PROBLEM, focusing on the regime where β < α (and in particular even when β > α²). We show that in this regime the SDP^{α,β} problem shares many important properties of SZK (although we fall short of actually showing that it lies in SZK, which is equivalent to polarization for any β < α). First, we show that, similarly to SZK, the average-case hardness of SDP^{α,β} implies the existence of one-way functions. The fact that average-case hardness of SZK (or equivalently of SDP^{α,β} for β < α²) implies the existence of one-way functions was shown by Ostrovsky [Ost91]. Indeed, our contribution is in showing that the weaker condition β < α (rather than β < α²) suffices for this result.

Theorem 3.1.8. Let (α, β) be (1/poly)-separated functions and suppose that SDP^{α,β} is average-case hard. Then, there exists a one-way function.

The question of constructing one-way functions from the (average-case) hardness of SDP is closely related to a result of Goldreich [Gol90] showing that the existence of efficiently sampleable distributions that are statistically far but computationally indistinguishable implies the existence of one-way functions. Our proof of Theorem 3.1.8 allows us to re-derive the following strengthening of [Gol90], due to Naor and Rothblum [NR06, Theorem 4.1]: for any (1/poly)-separated (α, β), the existence of efficiently sampleable distributions whose statistical distance is α but no efficient algorithm can distinguish between them with advantage more than β implies the existence of one-way functions. See further discussion in Remark 3.2.1.

3.1.1.3 Interactive Proof for Statistical Distance Approximation

As our last main result, we construct a new interactive protocol that lets a verifier estimate the statistical distance between two given circuits up to any noticeable precision.

Theorem 3.1.9. There exists a constant-round public-coin interactive protocol between a prover and a verifier that, given as input a pair of circuits (C₀, C₁), a claim

⁵For example, for a parameter γ ∈ [0, 1], consider the distributions R₀^γ and R₁^γ over {0, 1, 2}: R_b^γ puts γ mass on b and 1 - γ mass on 2. It holds that SD(R₀^γ, R₁^γ) = TD(R₀^γ, R₁^γ) = γ. If, say, (P, Q) = (R₀^{1/2}, R₁^{1/2}) and (P', Q') = (R₀^{1/3}, R₁^{1/3}), then SD(P, Q) > SD(P', Q') > SD(P, Q)² but TD(P, Q) > TD(P', Q').

Δ ∈ [0, 1] for their statistical distance, and a tolerance parameter δ ∈ [0, 1], satisfies the following properties:

* Completeness: If SD(C₀, C₁) = Δ, then the verifier accepts with probability at least 2/3 when interacting with the honest prover.

* Soundness: If |SD(C₀, C₁) - Δ| ≥ δ, then when interacting with any (possibly cheating) prover, the verifier accepts with probability at most 1/3.

* Efficiency: The verifier runs in time poly(|C₀|, |C₁|, 1/δ).

(As usual, the completeness and soundness errors can be reduced by applying parallel repetition. We can also achieve perfect completeness using a result from [FGM+89].) Theorem 3.1.9 is actually equivalent to the following statement.

Theorem 3.1.10 ([BL13, Theorem 6], [BBF16, Theorem 2]). For any (1/poly)-separated⁶ (α, β), it holds that SDP^{α,β} ∈ AM ∩ coAM.

It is believed that AM ∩ coAM lies just above SZK, and if we could show that SDP^{α,β} is in SZK, that would imply SD polarization for such α and β. Since Theorem 3.1.9 can be derived from existing results in the literature, we view our main contribution to be the proof, which is via a single protocol that we find to be cleaner and more direct than alternate approaches. Going into a bit more detail, the proofs of [BL13, BBF16] are in fact a combination of two separate constant-round protocols. The first protocol is meant to show that SDP^{α,β} ∈ AM and follows directly by taking the interactive proof for SDP presented by Sahai and Vadhan (which has completeness error (1 - α)/2 and soundness error (1 + β)/2), and applying parallel repetition (and the private-coin to public-coin transformation of [GS89]). The second protocol is meant to show that SDP^{α,β} ∈ coAM, and is based on a protocol by Bhatnagar, Bogdanov, and Mossel [BBM11]. Another approach for proving that SDP^{α,β} ∈ coAM is by combining results of [GVW02] and [SV03]. Goldreich, Vadhan and Wigderson [GVW02] showed that problems with laconic interactive proofs, that is, proofs where the communication from the prover to the verifier is small, have coAM proofs. Sahai and Vadhan [SV03], as described earlier, showed that SDP^{α,β}, and SZK in general, has an interactive proof where the prover communicates a single bit. Combining these results immediately gives a coAM protocol for SDP^{α,β} when (α, β) are Ω(1)-separated. As for (α, β) that are only (1/poly)-separated, while the [GVW02] result as stated does not suffice, it seems that their protocol can be adapted to handle this case as well.⁷

⁶Recall that AM is the class of problems that have constant-round public-coin interactive proofs. coAM is simply the complement of AM.
⁷In more detail, the [GVW02] result is stated for protocols in which the gap between completeness and soundness is constant (specifically 1/3). In case α and β are only 1/poly-separated, the [SV03] protocol only has a 1/poly gap (and we cannot afford repetition since it will increase the communication). Nevertheless, by inspecting the [GVW02] proof, it seems as though it can be adapted to cover any noticeable gap.

As mentioned above, we give a different, and direct, proof of Theorem 3.1.9 that we find to be simpler and more natural than the above approach. In particular, our proof utilizes the techniques developed for our other results, which enable us to give a single and more general protocol: one that approximates the statistical difference (as in Theorem 3.1.9), rather than just deciding whether that distance is large or small. At a very high level, our protocol may be viewed as an application of the set-lower-bound-based techniques of Akavia et al. [AGGM06] or Bogdanov and Brzuska [BB15] to our construction of a one-way function from the average-case hardness of SDP (i.e., Theorem 3.1.8), though there are technical differences in our setting. Both these papers show how to construct a coAM protocol for any language that can be reduced to inverting a size-verifiable one-way function.⁸ While we do not know how to reduce solving SDP in the worst case to inverting any specific function, we make use of the fact that associated with each instance of SDP there is an instance-dependent function [OW93] that is size-verifiable on the average.

3.1.2 Additional Related Works

Barriers to Improved Polarization. Holenstein and Renner [HR05] show that in a limited model dubbed "oblivious polarization", the condition α² > β on the statistical distance is necessary for polarizing statistical distance.⁹ All the past polarization reductions fit in this framework, and so do ours. Specifically, Holenstein and Renner exhibit distributions with α² < β that cannot be polarized in this model. We show a condition that suffices for polarization, even for distributions where α² < β. This does not contradict the [HR05] result because their distributions do not satisfy this condition. In a more general model, [LZ17, CGVZ18] showed lower bounds for SZK-related distribution manipulation tasks. The model they consider allows the reduction arbitrary oracle access to the circuits that sample the distributions, as opposed to the more restricted model of oblivious polarization. In this model, Lovett and Zhang [LZ17] show that efficient entropy reversal is impossible,¹⁰ and Chen, Göös, Vadhan and Zhang [CGVZ18] showed that entropy flattening requires Ω(n²) invocations of the underlying circuit. Showing lower bounds for polarization in this more general model remains an interesting open question.

Polarization for Other Distances. Toward characterizing zero-knowledge in the help model, Ben-Or and Gutfreund [BG03] and Chailloux et al. [CCKV08] gave a polarization procedure that considers two different distances for every (1/log)-separated α > β: if the statistical distance is at most β, then it decreases to 2^{-k}; and if the mutual disjointness¹¹ is at least α, then it increases to 1 - 2^{-k}. Fehr and Vaudenay [FV17] raise the question of polarization for the fidelity measure¹² but leave resolving it as an open problem (see Section 3.2.3.3 for details).

⁸Informally, a function f is size-verifiable if, given an output y = f(x), there exists an AM protocol to estimate |f⁻¹(y)|.
⁹Roughly speaking, an oblivious polarization is a randomized procedure to polarize without invoking the circuits; it takes as input a bit σ and an integer k, and outputs a sequence of bits (b'₁, ..., b'ℓ) and a string r'. Given a pair of circuits (C₀, C₁), such a procedure defines a pair of circuits (D₀, D₁) as follows: D_σ samples (b'₁, ..., b'ℓ) and r' and outputs (C_{b'₁}, ..., C_{b'ℓ}, r'). We are guaranteed that if SD(C₀, C₁) ≥ α, then SD(D₀, D₁) ≥ 1 - 2^{-k}, and if SD(C₀, C₁) ≤ β, then SD(D₀, D₁) ≤ 2^{-k}.
¹⁰Entropy reversal refers to the task of, given a circuit C and a parameter t, outputting (C', t') such that if H(C) ≥ t, then H(C') ≤ t' - 1, and if H(C) ≤ t - 1, then H(C') ≥ t'.

SDP and Cryptography. We show that average-case hardness of SDP^{α,β} implies one-way functions. In the reverse direction, Bitansky et al. [BDV17] show that one-way functions do not imply even worst-case hardness of SDP^{α,β} in a black-box manner, for any (1/poly)-separated α, β.¹³

3.1.3 Organization of this Chapter

In Section 3.2 we give an overview of the techniques that we use to prove our main results in this chapter. In Section 3.3 we prove that TDP and JSP are SZK-complete. In Section 3.4 we construct a one-way function from the average-case hardness of SDP. Lastly, in Section 3.5 we construct an interactive proof for estimating statistical distance.

3.2 Techniques

We begin in Section 3.2.1 by describing how to construct a one-way function from the average-case hardness of SD with any noticeable gap (Theorem 3.1.8). The techniques used there are also central in our interactive protocol for SD estimation (Theorem 3.1.9), which is described in Section 3.2.2, as well as in our proof that triangular discrimination and Jensen-Shannon divergence are SZK-complete (Theorems 3.1.5 and 3.1.6), which is outlined in Section 3.2.3 below.

3.2.1 One-Way Function From Statistical Difference with Any Noticeable Gap

We first show the existence of distributionally one-way functions. Namely, an efficiently computable function f for which it is hard to sample a uniformly random pre-image of a random output y (rather than an arbitrary pre-image, as for a standard one-way function). This suffices since Impagliazzo and Luby [IL89] showed how to convert a distributionally one-way function into a standard one.

Assume that we are given a distribution over a pair of circuits (C₀, C₁) such that it is hard to distinguish between the cases SD(C₀, C₁) ≥ α or SD(C₀, C₁) ≤ β, for

¹¹For an ordered pair of distributions P and Q, their disjointness is Disj(P, Q) = Pr_{y←P}[y ∉ Supp(Q)], and their mutual disjointness is MutDisj(P, Q) = min(Disj(P, Q), Disj(Q, P)).
¹²For two distributions P and Q, fidelity is defined as Fidelity(P, Q) = Σ_y √(P_y · Q_y).
¹³While [BDV17] state the result for constant α, β, the construction and analysis extend to our setting.

some α ≥ β + 1/poly. A natural candidate for a one-way function is the (efficiently computable) function

f_{C₀,C₁}(b, x) = C_b(x).    (3.2)

Namely, f is parameterized by the circuits (C₀, C₁) (which are to be sampled according to the hard distribution), and the bit b chooses which of the two circuits is evaluated on the string x. This function appears throughout the SZK literature (e.g., it corresponds to the verifier's message in the SDP protocol of [SV03]). Assume that f is not distributionally one-way, and let A be an algorithm that, given (C₀, C₁) and a random input y (sampled by first drawing a uniformly random bit b and a string x and then computing y = C_b(x)), outputs a uniformly random element (b', x') from the set f⁻¹_{C₀,C₁}(y) = {(b, x) : C_b(x) = y}. For simplicity, we assume that A is a perfect distributional inverter; that is, for every fixed (C₀, C₁, y) it outputs uniformly random elements of f⁻¹_{C₀,C₁}(y). Arguably, the most natural approach for distinguishing between the cases of high or low statistical distance, given the two circuits and the inverter, is to choose x and b at random, invoke the inverter to obtain (b', x'), and check whether b = b'. Indeed, if SD(C₀, C₁) = 1, then Pr[b = b'] = 1, and if SD(C₀, C₁) = 0, then Pr[b = b'] = 1/2. Thus, we can distinguish between the cases with constant advantage. But what happens when the gap in the statistical distance is smaller? To analyze this case we want to better understand the quantity Pr[b = b']. It turns out that this quantity is characterized by the triangular discrimination between the circuits. Let P_b denote the output distribution of C_b. Using elementary manipulations (and the fact that ½(P₀ + P₁) is a distribution), it holds that¹⁴

Pr[b = b'] = \frac{1}{2} \Pr_{y \sim P_0}[b' = 0] + \frac{1}{2} \Pr_{y \sim P_1}[b' = 1]    (3.3)

           = \sum_y \frac{1}{2} \cdot \frac{P_0(y)^2 + P_1(y)^2}{P_0(y) + P_1(y)}

           = \sum_y \frac{1}{2} \left( \frac{P_0(y) + P_1(y)}{2} + \frac{(P_0(y) - P_1(y))^2}{2(P_0(y) + P_1(y))} \right)

           = \frac{1 + TD(C_0, C_1)}{2}.
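The chain of equalities above can be verified exactly on small output distributions. In the sketch below (our own illustration, with distributions as probability dictionaries), `agreement_probability` evaluates the closed-form expression for Pr[b = b'] under a perfect distributional inverter and compares it with (1 + TD)/2.

```python
def td(p0, p1):
    """Triangular discrimination."""
    return sum((p0.get(y, 0.0) - p1.get(y, 0.0)) ** 2 / (p0.get(y, 0.0) + p1.get(y, 0.0))
               for y in set(p0) | set(p1) if p0.get(y, 0.0) + p1.get(y, 0.0) > 0) / 2

def agreement_probability(p0, p1):
    """Exact Pr[b = b'] when y = C_b(x) for uniform b and (b', x') is a uniform
    preimage of y: Pr[b = b'] = (1/2) * sum_y (P0(y)^2 + P1(y)^2) / (P0(y) + P1(y))."""
    total = 0.0
    for y in set(p0) | set(p1):
        a, b = p0.get(y, 0.0), p1.get(y, 0.0)
        if a + b > 0:
            total += (a * a + b * b) / (a + b)
    return total / 2

p0 = {0: 0.7, 1: 0.2, 2: 0.1}
p1 = {0: 0.1, 1: 0.3, 2: 0.6}
assert abs(agreement_probability(p0, p1) - (1 + td(p0, p1)) / 2) < 1e-12
```

The two boundary cases match the text: identical circuits give agreement 1/2, disjoint ones give agreement 1.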

Based on the general bounds between triangular discrimination and statistical distance (Eq. (3.1)), which are known to be tight, all we are guaranteed is

SD(C₀, C₁) ≥ α  ⟹  Pr[b = b'] ≥ (1 + α²)/2,

SD(C₀, C₁) ≤ β  ⟹  Pr[b = b'] ≤ (1 + β)/2.

¹⁴In Section 3.1 we used P_y to denote the probability mass a distribution P puts on an element y, while here we use P(y). In the rest of this chapter we choose which notation to use based on readability and context.

So, this approach is limited to settings in which α² > β. To overcome this limitation we want to find a quantity that is more tightly characterized by the statistical distance of the circuits. This quantity, which we call the imbalance, will be central in all of the proofs in this work. The imbalance measures how likely it is that an output string y was generated from C₁ versus C₀. Formally,

θ_y := Pr[b = 1 | y] - Pr[b = 0 | y] = \frac{P_1(y) - P_0(y)}{P_1(y) + P_0(y)}.    (3.4)

Elementary manipulations yield that

SD(C_0, C_1) = \frac{1}{2} \sum_y |P_1(y) - P_0(y)|    (3.5)

             = \sum_y \frac{P_0(y) + P_1(y)}{2} \cdot \frac{|P_1(y) - P_0(y)|}{P_1(y) + P_0(y)}

             = \mathop{\mathbb{E}}_{y \sim \frac{1}{2}P_0 + \frac{1}{2}P_1} [\, |\theta_y| \,].
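Eq. (3.5) is a finite identity, so it can be checked exactly. The sketch below (our own helper functions) computes the expectation of |θ_y| under the mixture ½P₀ + ½P₁ and confirms it equals the statistical distance.

```python
def sd(p0, p1):
    """Statistical (total variation) distance."""
    return sum(abs(p0.get(y, 0.0) - p1.get(y, 0.0)) for y in set(p0) | set(p1)) / 2

def expected_abs_imbalance(p0, p1):
    """E_{y ~ (1/2)P0 + (1/2)P1} |theta_y|, with theta_y = (P1(y)-P0(y))/(P1(y)+P0(y))."""
    total = 0.0
    for y in set(p0) | set(p1):
        a, b = p0.get(y, 0.0), p1.get(y, 0.0)
        if a + b > 0:
            total += ((a + b) / 2) * abs((b - a) / (a + b))
    return total

p0 = {0: 0.7, 1: 0.2, 2: 0.1}
p1 = {0: 0.1, 1: 0.3, 2: 0.6}
assert abs(sd(p0, p1) - expected_abs_imbalance(p0, p1)) < 1e-12
```

Unlike the agreement probability, which averages θ_y², this quantity averages |θ_y|, and that is what makes it match SD exactly rather than up to a square.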

(Recall that y is sampled by first drawing a uniformly random bit b and a string x, and setting y = C_b(x). Hence, using the notation that P_b denotes the output distribution of the circuit C_b, the marginal distribution of y is ½P₀ + ½P₁.) Eq. (3.5) naturally gives rise to the following algorithm for approximating SD(C₀, C₁):

Algorithm to estimate SD(C₀, C₁) using the inverter A:
1. Sample polynomially many y₁, ..., y_t.
2. For every yᵢ:
   (a) Call A(yᵢ) polynomially many times to get b¹, ..., bᵏ.
   (b) Let m be the number of ones in b¹, ..., bᵏ.
   (c) Set p₁ = m/k, p₀ = (k - m)/k and θ̂ᵢ = p₁ - p₀.
3. Return (1/t) Σᵢ |θ̂ᵢ|.
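The estimator above can be simulated end to end when the two output distributions are known explicitly: a perfect distributional inverter then simply returns b' = 1 with probability P₁(y)/(P₀(y) + P₁(y)). The sketch below (entirely our own simulation, not code from the thesis) runs the algorithm on a toy pair of distributions whose true statistical distance is 0.6.

```python
import random

def sample(dist, rng):
    """Draw one outcome from a probability dictionary."""
    r, acc = rng.random(), 0.0
    for y, p in dist.items():
        acc += p
        if r <= acc:
            return y
    return y  # guard against floating-point slack

def estimate_sd(p0, p1, t=1000, k=1000, rng=None):
    """Estimate SD via Eq. (3.5): average |theta_hat| over t samples of y, where
    each theta_hat comes from k calls to a simulated perfect distributional
    inverter that answers b' = 1 with probability P1(y) / (P0(y) + P1(y))."""
    rng = rng or random.Random(0)
    total = 0.0
    for _ in range(t):
        b = rng.randrange(2)
        y = sample(p1 if b else p0, rng)
        pr1 = p1.get(y, 0.0) / (p0.get(y, 0.0) + p1.get(y, 0.0))
        ones = sum(rng.random() < pr1 for _ in range(k))
        total += abs(2 * ones / k - 1)  # |p1_hat - p0_hat|
    return total / t

p0 = {0: 0.7, 1: 0.2, 2: 0.1}
p1 = {0: 0.1, 1: 0.3, 2: 0.6}
est = estimate_sd(p0, p1)  # true SD(p0, p1) is 0.6; the estimate lands nearby
```

The parameters t and k play exactly the roles described next in the text: k controls how well each θ̂ᵢ approximates θ_{yᵢ}, and t controls how well the average of |θ̂ᵢ| approximates E|θ_y|.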

The quantities p₁ and p₀ are in fact the empirical distribution of b conditioned on yᵢ, computed using k samples. By choosing large enough k, we get that (p₁, p₀) ≈ (Pr[b = 1|y], Pr[b = 0|y]) and so θ̂ᵢ ≈ θ_{yᵢ}. By then choosing large enough t, we get that (1/t) Σᵢ |θ̂ᵢ| ≈ SD(C₀, C₁). Hence, we can distinguish between the cases SD(C₀, C₁) ≥ α or SD(C₀, C₁) ≤ β, for any α ≥ β + 1/poly. Essentially the same proof continues to work if A is not a perfect distributional

inverter, but is close enough to being so; that is, on input y its output distribution is close to uniform over f⁻¹(y) for most (but not all) tuples (C₀, C₁, y). We leave handling these details to the formal proof in Section 3.4. The above proof strategy also yields a new proof for Naor and Rothblum's [NR06] strengthening of [Gol90].¹⁵ See Remark 3.2.1 below for a discussion of the differences between our techniques and those of [NR06].

Distributional Collision Resistant Hash Function. As a matter of fact, the above proof also shows that the average-case hardness of SDP^{α,β} implies that the function f_{C₀,C₁}(b, x) = C_b(x) is a distributional k-multi-collision resistant hash function,¹⁶ for an appropriately chosen k. That is, for a random output y of f, it is difficult to find k random preimages of y. This is because access to such a set of k random pre-images of random yᵢ's is all we use the inverter A for in the above reduction, and it could handily be replaced with a k-distributional multi-collision finder.

Remark 3.2.1 (Comparison to [NR06]). Naor and Rothblum's proof implicitly attempts to approximate the maximal likelihood bit of y; that is, the bit b_ml such that Pr[b = b_ml|y] ≥ Pr[b = 1 - b_ml|y] (breaking ties arbitrarily). Indeed, the maximal likelihood bit, as shown by [SV03], is closely related to the statistical distance:

Pr[b = b_ml] = \frac{1 + SD(C_0, C_1)}{2}.    (3.6)

To approximate b_ml, [NR06] make, like us, many calls to A(y), and take the majority of the answered bits. The idea is that when the statistical distance is large, the majority is likely to be b_ml, and when the statistical distance is small, the majority is equally likely to be b_ml or 1 - b_ml. To formally prove this intuition, it must hold that if SD(C₀, C₁) is large, then Pr[b = b_ml|y] - Pr[b = 1 - b_ml|y] is sufficiently large; putting this in our terminology and using Eq. (3.5), if E_y[|θ_y|] is sufficiently large, then |θ_y| should be large for a random y (and the opposite should hold if SD(C₀, C₁) is small). While these statements are true, proving them requires some work on [NR06]'s part, which results in a more complicated analysis. We manage to avoid such complications by using the imbalance θ_y and its characterization of statistical distance (Eq. (3.5)). Furthermore, [NR06]'s approach only attempts to distinguish between the cases when SD(C₀, C₁) is high or low, while our approach generalizes to approximating SD(C₀, C₁). Lastly, Naor and Rothblum do not construct one-way functions based on the average-case hardness of SDP^{α,β} with any noticeable gap, as we do. Using their technique to do so seems to require additional work, work that our analysis significantly simplifies.
¹⁵Namely, that for any (1/poly)-separated (α, β), the existence of efficiently sampleable distributions whose statistical distance is α but no efficient algorithm can distinguish between them with advantage more than β implies the existence of one-way functions.
¹⁶Multi-collision hash functions, recently considered in several works [KNY17, KNY18, BKP18, BDRV18], are hash functions for which it is hard to find multiple inputs that all hash to the same output.

3.2.2 Interactive Proof for Statistical Distance Approximation

We proceed to describe a constant-round public-coin protocol in which a computationally unbounded prover convinces a computationally bounded verifier that the statistical difference of a given pair of circuits is what the prover claims it to be, up to any inverse polynomial (additive) error. Such a protocol simultaneously establishes the inclusion of SDP^{α,β} in both AM and coAM for any α ≥ β + 1/poly. Our starting point is the algorithm we described above that used a one-way function inverter to estimate the statistical distance. Specifically, that algorithm used the inverter to estimate θ_y for random y's, and then applied Eq. (3.5). We would like to use the prover, instead of the inverter, to achieve the same task. In our protocol, the verifier draws polynomially many y's and sends them to the prover. The prover responds with values θ̂ᵢ, which it claims are the genuine θ_{yᵢ}'s. But how can the verifier trust that the prover sent the correct values? In the reduction in Section 3.2.1, we used k samples of b conditioned on y to estimate b's true distribution. A standard concentration bound shows that as k grows, the number of ones among b¹, ..., bᵏ, all sampled from (b|y), is very close to Pr[b = 1|y] · k. Similarly, the number of zeros is very close to Pr[b = 0|y] · k. Consider the following typical set, for any fixed y and arbitrary value θ:

T_y^{k,θ} = { (b₁, x₁, ..., b_k, x_k) : C_{bᵢ}(xᵢ) = y for all i, and (1/k)·(#{i : bᵢ = 1} - #{i : bᵢ = 0}) ≈ θ }.

Namely, T_y^{k,θ} contains every k-tuple of pairs (bᵢ, xᵢ) such that all map to y, and each tuple can be used to estimate θ well: the difference between the number of ones and the number of zeros, normalized by k, is close to θ. Also consider the pre-image set of y: Ξ_y = {(b, x) | C_b(x) = y}. Since the estimate of θ_y improves as k grows, we expect that T_y^{k,θ_y}, the typical set of y with the value θ_y, contains almost all tuples. Indeed, standard concentration bounds show that

|T_y^{k,θ_y}| ≥ (1 - e^{-Ω(k)}) · |Ξ_y|^k.    (3.7)

On the other hand, the sets T_y^{k,θ'}, corresponding to values θ' that are far from θ_y, should be almost empty. Indeed, if |θ' - θ_y| ≥ Ω(1), then

|T_y^{k,θ'}| ≤ e^{-Ω(k)} · |Ξ_y|^k.    (3.8)

So, for the verifier to be convinced that the value θ̂ sent by the prover is close to θ_y, the prover can prove that the typical set T_y^{k,θ̂} is large. To do so, the parties will use the public-coin constant-round protocol for set lower-bound of [GS89], which enables the prover to assert statements of the form "the size of the set S is at least s".

However, there is still one hurdle to overcome. The typical set T_y^{k,θ_y} is only large relative to |Ξ_y|^k. Since we do not know how to compute |Ξ_y|, it is unclear what size s we should run the set lower-bound protocol with. Our approach for bypassing this issue is as follows. First observe that the expected value, over a random y, of the logarithm of the size of Ξ_y is the entropy¹⁷ of (b, x) given y. Namely,

E_y[log |Ξ_y|] = H(B, X|Y),    (3.9)

where the jointly distributed random variables (B, X, Y) take the values of randomly drawn (b, x, y). Thus, if we draw t independent elements y₁, ..., y_t, the sum Σᵢ log |Ξ_{yᵢ}| gets closer to t · H(B, X|Y) as t grows. Specifically,

Pr[ Πᵢ |Ξ_{yᵢ}| ≈ 2^{t·H(B,X|Y)} ] ≥ 1 - e^{-Ω(t/n²)},    (3.10)

where n denotes the output length of the given circuits. For large enough t, we can thus assume that the size of this product set is approximately 2^{t·H(B,X|Y)}, and run the set lower-bound protocol for all the yᵢ's together. That is, we ask the prover to send t estimates (θ̂₁, ..., θ̂_t) for the values (θ_{y₁}, ..., θ_{y_t}), and prove that the size of the product set T_{y₁}^{k,θ̂₁} × ⋯ × T_{y_t}^{k,θ̂_t} is almost 2^{tk·H(B,X|Y)}. So far we have reduced knowing the size of Ξ_y to knowing H(B, X|Y), but again it seems difficult for the verifier to compute this quantity on its own. Actually, standard entropy manipulations show that

H(B, X|Y) = (m + 1) - H(Y),

where m denotes the input length of the given circuits (indeed, (B, X) is uniform over m + 1 bits and determines Y, so H(B, X|Y) = H(B, X, Y) - H(Y) = (m + 1) - H(Y)). It thus suffices to approximate H(Y). Recall that y is the output of the circuit that maps (b, x) to C_b(x), so Y is drawn according to the output distribution of a known circuit. Luckily, Goldreich, Sahai and Vadhan [GSV99] showed that approximating the output entropy of a given circuit is in NISZK, and thus has a constant-round public-coin protocol (since NISZK ⊆ AM ∩ coAM). To conclude, we describe the entirety of our protocol, which proves Theorem 3.1.9.
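The identity H(B, X|Y) = (m + 1) - H(Y), and Eq. (3.9) along the way, can be checked exactly on a toy example. In the sketch below, `toy_circuit` is a made-up two-input-bit "circuit" of our own (any function would do); the conditional entropy is computed directly from the preimage sets, exactly as in Eq. (3.9).

```python
import math
from collections import Counter
from itertools import product

def toy_circuit(b, x):
    """A made-up 'circuit' on m = 2 input bits (purely illustrative)."""
    return (x[0] ^ b, x[0] & x[1])

m = 2
triples = [(b, x, toy_circuit(b, x)) for b in (0, 1) for x in product((0, 1), repeat=m)]
n_total = len(triples)  # 2^(m+1): (b, x) is uniform over m + 1 bits

# H(Y), from the marginal distribution of y.
y_counts = Counter(y for _, _, y in triples)
h_y = -sum((c / n_total) * math.log2(c / n_total) for c in y_counts.values())

# H(B, X | Y), via Eq. (3.9): given y, (b, x) is uniform over the preimage set
# of size c, contributing log2(c) bits, weighted by Pr[Y = y] = c / n_total.
h_bx_given_y = sum((c / n_total) * math.log2(c) for c in y_counts.values())

assert abs(h_bx_given_y - ((m + 1) - h_y)) < 1e-12
```

The assertion holds for any choice of `toy_circuit`, since it only uses that (B, X) is uniform and determines Y.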

Protocol to approximate SD(C₀, C₁), given the circuits (C₀, C₁) as input:
1. First, the prover sends the verifier a claim H̃ of the value of H(Y).
2. The parties execute [GSV99]'s protocol to convince the verifier that this claim, that H̃ ≈ H(Y), is correct.
3. The verifier uses H̃ to compute H(B, X|Y) as (m + 1) - H̃.
4. The verifier samples y₁, ..., y_t from ½C₀ + ½C₁ and sends them to the prover.
5. The prover responds with θ̂₁, ..., θ̂_t as claims for the values θ_{y₁}, ..., θ_{y_t}.

¹⁷Recall that the entropy of a random variable X over 𝒳 is defined as H(X) = Σ_{x∈𝒳} Pr[X = x] log(1/Pr[X = x]). The conditional entropy of X given Y is H(X|Y) = E_{y∼Y}[H(X|Y = y)].

6. The parties run a set lower-bound protocol to prove that the set T_{y₁}^{k,θ̂₁} × ⋯ × T_{y_t}^{k,θ̂_t} is almost as large as (|Ξ_{y₁}| ⋯ |Ξ_{y_t}|)^k. Here, they use 2^{tk·H(B,X|Y)} as a proxy for (|Ξ_{y₁}| ⋯ |Ξ_{y_t}|)^k.
7. If the verifier has not rejected so far, it outputs (1/t) Σᵢ |θ̂ᵢ|.

3.2.3 TDP and JSP are SZK-Complete

We show that both TDP^{α,β} and JSP^{α,β} with α ≥ β + 1/poly are SZK-complete. Since the proof of the former uses that of the latter, we start by outlining why JSP^{α,β} is SZK-complete.

3.2.3.1 JENSEN-SHANNON DIVERGENCE PROBLEM is SZK-complete

We need to show that JSP^{α,β} with α ≥ β + 1/poly is both in SZK and SZK-hard. In both parts we use the following characterization of the Jensen-Shannon divergence, which follows from its definition. Given a pair of circuits C₀ and C₁, consider the jointly distributed random variables (B, X, Y), where B is a uniformly random bit, X is a uniformly random string, and Y = C_B(X). Then, it follows from some elementary manipulations (see Proposition 3.3.1 below) that:

JS(C₀, C₁) = 1 - H(B|Y).    (3.11)
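Eq. (3.11) can be verified exactly on explicit output distributions. The sketch below (our own illustration) computes JS via the standard equivalent entropy form, JS = H(M) - (H(P) + H(Q))/2 with M the even mixture, and compares it against 1 - H(B|Y).

```python
import math

def entropy(dist):
    """Shannon entropy in bits of a probability dictionary."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def js(p, q):
    """JS via the equivalent entropy form: JS = H(M) - (H(P) + H(Q))/2, M = (P+Q)/2."""
    m = {y: (p.get(y, 0.0) + q.get(y, 0.0)) / 2 for y in set(p) | set(q)}
    return entropy(m) - (entropy(p) + entropy(q)) / 2

def h_b_given_y(p0, p1):
    """H(B|Y) for a uniform bit B and Y sampled from P_B."""
    total = 0.0
    for y in set(p0) | set(p1):
        a, b = p0.get(y, 0.0), p1.get(y, 0.0)
        if a + b > 0:
            w = (a + b) / 2                        # Pr[Y = y]
            for pb in (a / (a + b), b / (a + b)):  # Pr[B = 0 | y], Pr[B = 1 | y]
                if pb > 0:
                    total -= w * pb * math.log2(pb)
    return total

p0 = {0: 0.7, 1: 0.2, 2: 0.1}
p1 = {0: 0.1, 1: 0.3, 2: 0.6}
assert abs(js(p0, p1) - (1 - h_b_given_y(p0, p1))) < 1e-12
```

The extreme cases match the discussion below: disjoint distributions give H(B|Y) = 0 and JS = 1, identical ones give H(B|Y) = 1 and JS = 0.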

We use this characterization to tie the JENSEN-SHANNON DIVERGENCE PROBLEM to another SZK-complete problem: the ENTROPY DIFFERENCE PROBLEM (EDP) with a gap function g. The input to EDP^g is also a pair of circuits C₀ and C₁. YES instances are those in which the entropy gap H(C₀) - H(C₁) is at least g(n) (where n is the output length of the circuits) and NO instances are those in which the gap is at most -g(n). Goldreich and Vadhan [GV99] showed that EDP^g is SZK-complete for any noticeable function g. Our proof that JSP^{α,β} is SZK-complete closely follows the reduction from the reverse problem of SDP (i.e., in which YES instances are distributions that are statistically close) to EDP [Vad99, Section 4.4].

JSP^{α,β} is in SZK: We reduce JSP^{α,β} to EDP^{(α-β)/2}. Given C₀ and C₁, the reduction

outputs a pair of circuits D0 and D1 such that D1 outputs a sample from (B, Y) and D0 outputs a sample from (B', Y), where B' is an independent random bit with H(B') = 1 - (α + β)/2. The chain rule for entropy^{18} implies that

H(D0) - H(D1) = 1 - (α + β)/2 - H(B|Y) = JS(C0, C1) - (α + β)/2,

where the second equality follows from Eq. (3.11). Thus, if JS(C0, C1) ≥ α, then H(D0) - H(D1) ≥ (α - β)/2; and if JS(C0, C1) ≤ β, then H(D0) - H(D1) ≤ -(α - β)/2. Since EDP^{(α-β)/2} ∈ SZK, we get that JSP^{α,β} ∈ SZK.

^{18}For jointly distributed random variables X and Y, it holds that H(X, Y) = H(X) + H(Y|X).

JSP is SZK-hard: We reduce SDP^{1-2^{-k}, 2^{-k}} to JSP^{α,β}, for some large enough k. This suffices since SDP^{1-2^{-k}, 2^{-k}} is known to be SZK-hard [SV03].^{19} In the presentation of related results in his thesis, Vadhan relates the statistical distance of the circuits to the entropy of B given Y [Vad99, Claim 4.4.2]. For example, if SD(C0, C1) = 0 (i.e., the distributions are identical), then B|Y is a uniformly random bit, and so H(B|Y) = 1; and if SD(C0, C1) = 1 (i.e., the distributions are disjoint), then B is completely determined by Y, and so H(B|Y) = 0. More generally,^{20} Vadhan showed that if SD(C0, C1) = δ, then

1 - δ ≤ H(B|Y) ≤ h((1 - δ)/2). (3.12)

By taking k to be large enough (as a function of α and β), and applying Eqs. (3.11) and (3.12), we have that if SD(C0, C1) ≥ 1 - 2^{-k}, then JS(C0, C1) ≥ α; and if SD(C0, C1) ≤ 2^{-k}, then JS(C0, C1) ≤ β. Thus, the desired reduction is simply the identity function that outputs the input circuits.
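The relation between statistical distance and H(B|Y) used here (Eq. (3.12)) can be checked empirically. The sketch below draws random toy distributions and verifies both bounds:

```python
import math, random

def h(p):
    # binary entropy (base 2); endpoints map to 0
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def sd(P, Q):
    return 0.5 * sum(abs(p - q) for p, q in zip(P, Q))

def cond_entropy(P, Q):
    # H(B|Y) for the joint (B, Y): B uniform, Y ~ P if B = 1 and Y ~ Q if B = 0
    return sum(((p + q) / 2) * h(p / (p + q)) for p, q in zip(P, Q) if p + q > 0)

def rand_dist(n, rng):
    w = [rng.random() for _ in range(n)]
    s = sum(w)
    return [x / s for x in w]

rng = random.Random(0)
for _ in range(1000):
    P, Q = rand_dist(6, rng), rand_dist(6, rng)
    delta = sd(P, Q)
    hby = cond_entropy(P, Q)
    # 1 - delta <= H(B|Y) <= h((1 - delta)/2)
    assert 1 - delta - 1e-9 <= hby <= h((1 - delta) / 2) + 1e-9
```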

3.2.3.2 TRIANGULAR DISCRIMINATION PROBLEM is SZK-complete

We need to show that TDP^{α,β} with α ≥ β + 1/poly is both in SZK and SZK-hard. Showing the latter is very similar to showing that JSP^{α,β} is SZK-hard, but using Eq. (3.1) to relate the triangular discrimination to statistical distance (instead of Eq. (3.12), which relates the Jensen-Shannon divergence to statistical distance). We leave the formal details to the body of this paper and focus here on showing that TDP^{α,β} is in SZK.

A natural approach to show that TDP^{α,β} is in SZK is to follow Sahai and Vadhan's proof that SDP^{2/3,1/3} is in SZK. Specifically, a main ingredient in that proof is to polarize the statistical distance of the circuits (to reduce the simulation error). Indeed, if we could reduce TDP^{α,β} to, say, TDP^{0.9,0.1} by polarizing the triangular discrimination, then Eq. (3.1) would imply that we also reduce TDP^{α,β} to SDP^{2/3,1/3}, which we know is in SZK. We are indeed able to show such a polarization lemma for triangular discrimination (using techniques similar to [SV03]'s polarization lemma). However, this lemma only works when the gap between α and β is roughly 1/log. Actually, the polarization lemma of [SV03] suffers from the same limitation with respect to the gap between α² and β. Still, we would like to also handle the case that the gap between α and β is only 1/poly. To do so we take a slightly different approach. Specifically, we reduce TDP^{α,β} to JSP^{α',β'}, where α' and β' are also noticeably separated.

An important step toward showing this reduction is to characterize the triangular discrimination and the Jensen-Shannon divergence via the imbalance θ_y (see

^{19}For simplicity of presentation, we are ignoring subtle details about the relation of k to the output length of the circuits. See Section 3.3.1 for the formal proof.
^{20}The function h is the binary entropy function. That is, h(p) = -p·log(p) - (1 - p)·log(1 - p) is the entropy of a Bernoulli random variable with parameter p.

Eq. (3.4)), as we already did for statistical distance. Recall that given Y = y, the random variable B takes the value 1 with probability (1 + θ_y)/2, and 0 otherwise. Hence, Eq. (3.11) can also be written as

JS(C0, C1) = 1 - E_{y~Y}[h((1 + θ_y)/2)]. (3.13)

As for the triangular discrimination, it follows from the definition that

TD(C0, C1) = E_{y~Y}[θ_y²]. (3.14)

Furthermore, by Taylor approximation, for small values of θ it holds that

h((1 + θ)/2) ≈ 1 - θ²/(2 ln 2). (3.15)

As we can see, the above equations imply that if all the θ_y's were small, a gap in the triangular discrimination would also imply a gap in the Jensen-Shannon divergence. Thus, we would like an operation that reduces all the θ_y's. The main technical tool we use to reduce θ_y is to consider the convex combination of the two input circuits. Given a pair of circuits C0 and C1, consider the pair of circuits D0 and D1 such that D_b = λ·C_b + (1 - λ)·(C0 + C1)/2.^{21} Let Q_b denote the output distribution of D_b, and recall that P_b denotes the output distribution of C_b. We also let θ'_y be defined similarly to θ_y, but with respect to D0 and D1 (rather than C0 and C1). Using this notation, we have that θ_y = (P_1(y) - P_0(y))/(P_1(y) + P_0(y)), and it may be seen that

θ'_y = (Q_1(y) - Q_0(y))/(Q_1(y) + Q_0(y)) = λ·(P_1(y) - P_0(y))/(P_1(y) + P_0(y)) = λ·θ_y. (3.16)

So, our reduction chooses a sufficiently small λ and outputs the circuits D0 and D1. Some care is needed when choosing λ. Eqs. (3.14) and (3.16) yield that TD(D0, D1) = λ²·TD(C0, C1). Hence, the convex combination also shrinks the gap in triangular discrimination. We show that by choosing λ ≈ √(α - β), the approximation error in Eq. (3.15) is smaller than the aforementioned shrinkage, and the reduction goes through. The resulting gap in the Jensen-Shannon divergence is roughly (α - β)², which is noticeable by the assumption that α ≥ β + 1/poly. This shows that TDP^{α,β} is in SZK if α ≥ β + 1/poly. By the relationship between TD and SD (Eq. (3.1)), this implies that SDP^{α,β} is in SZK if α² ≥ β + 1/poly. This, in turn, by the SZK-hardness of SDP^{2/3,1/3} and the known polarization lemma that applies to it, implies polarization of statistical distance for any (α, β) such that α² ≥ β + 1/poly.

^{21}This definition of the convex combination is more convenient to analyze than perhaps the more natural definition D_b = λ·C_b + (1 - λ)·C_{1-b}.
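The effect of this convex combination is easy to see in code. The sketch below, with toy distributions standing in for the circuits' output distributions, verifies that each imbalance scales by λ and hence the triangular discrimination scales by λ²:

```python
import random

def td(P, Q):
    """Triangular discrimination, normalized so that TD = E_{y~Y}[theta_y^2]."""
    return sum((p - q) ** 2 / (2 * (p + q)) for p, q in zip(P, Q) if p + q > 0)

def convex_combine(P, Q, lam):
    """D_b = lam * C_b + (1 - lam) * (C0 + C1)/2, on output distributions."""
    mid = [(p + q) / 2 for p, q in zip(P, Q)]
    P2 = [lam * p + (1 - lam) * m for p, m in zip(P, mid)]
    Q2 = [lam * q + (1 - lam) * m for q, m in zip(Q, mid)]
    return P2, Q2

rng = random.Random(1)
norm = lambda v: [x / sum(v) for x in v]
P = norm([rng.random() for _ in range(5)])
Q = norm([rng.random() for _ in range(5)])
lam = 0.25

P2, Q2 = convex_combine(P, Q, lam)
# theta'_y = lam * theta_y for every y ...
for p, q, p2, q2 in zip(P, Q, P2, Q2):
    assert abs((p2 - q2) / (p2 + q2) - lam * (p - q) / (p + q)) < 1e-12
# ... and therefore TD(D0, D1) = lam^2 * TD(C0, C1)
assert abs(td(P2, Q2) - lam ** 2 * td(P, Q)) < 1e-12
```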

3.2.3.3 Reflections and an Open Problem

Many f-divergences of interest can be expressed as an expectation, over y ~ Y, of a simple function of θ_y; that is, an expression of the form E_{y~Y}[g(θ_y)], for some function g : [-1, 1] → [0, 1]. For example:

" SD(Co, C1 ) = Ey~y[6yj] (i.e., g(z) = jzj, see Eq. (3.5));

* TD(Co, C1 ) = E [ ] (i.e., g(z) - 2, see Eq. (3.14)); and

* JS(Co, C1 ) = Eyy [1 - h (1+'Y (i.e., g(z) = 1 - h(l-z), see Eq. (3.13)).

To reduce TDP to JSP, we took a convex combination of the two circuits and used the fact that 1 - h((1 + θ_y)/2) ≈ θ_y²/(2 ln 2) for small values of θ_y. While this worked for TD (which corresponds to g(z) = z²), it seems unlikely to yield a polarization lemma for SD for an arbitrarily small (but noticeable) gap. The reason is that the function g(z) = |z|, the g-function corresponding to SD, is not differentiable at 0 and in particular does not act like z² for small values of z. As we find this similarity between the different notions of distance striking, and indeed our proofs leverage the relations between them, we provide in Fig. 3.2.1 a plot comparing the different choices of the function g.

Another popular f-divergence that we have not discussed thus far^{22} is the squared Hellinger distance, defined as H²(P, Q) = (1/2)·Σ_y (√(P_y) - √(Q_y))². It can be shown that H²(C0, C1) = E_{y~Y}[1 - √(1 - θ_y²)], and so this distance also falls within the above framework (i.e., by considering g(z) = 1 - √(1 - z²)). Interestingly, the squared Hellinger distance also acts like JS (and TD) around 0; namely, 1 - √(1 - θ_y²) = Θ(θ_y²) for small values of θ_y. However, unlike for TDP^{α,β}, we do not know how to show that the HELLINGER DIFFERENCE PROBLEM, denoted HDP^{α,β} and defined analogously to TDP^{α,β} (while replacing the distance TD with H²), is in SZK for all (1/poly)-separated (α, β). We do mention that H²(P, Q) ≤ TD(P, Q) ≤ 2·H²(P, Q), and thus HDP^{α,β} is in SZK if α/2 and β are (1/poly)-separated. However, the proof described above does not go through if we try to apply it to the Hellinger distance: we cannot guarantee that the gap in the Hellinger distance after taking the convex combination is larger than the error in the Taylor approximation. Indeed, the question of whether HDP^{α,β} is in SZK for any (1/poly)-separated (α, β), first raised by Fehr and Vaudenay [FV17], remains open.

3.3 Complete Problems for SZK

In this section we prove Theorems 3.1.5 and 3.1.6. That is, we show that the TRIANGULAR DISCRIMINATION PROBLEM (TDP) and JENSEN-SHANNON DIVERGENCE

^{22}Actually, we will use the squared Hellinger distance in Section 3.3.2.2 to analyze the triangular discrimination of direct-product distributions. Also, the squared Hellinger distance is closely related to the fidelity: Fidelity(P, Q) = 1 - H²(P, Q).

[Figure 3.2.1: a plot of the four functions g_1, ..., g_4 over the domain [0, 1].]

Figure 3.2.1: Comparison between the different choices of the function g that were discussed. Since all functions are symmetric around 0, we restrict to the domain [0, 1]. Recall that g_1(θ) = |θ| corresponds to SD, g_2(θ) = θ² to TD, g_3(θ) = 1 - h((1 + θ)/2) to JS, and g_4(θ) = 1 - √(1 - θ²) to H².

PROBLEM (JSP) are SZK-complete for any noticeable gap between the YES and NO promises. The outline of this section is as follows. We begin, in Section 3.3.1, by proving that JSP is SZK-complete (Theorem 3.1.6). This proof is closely related to known results in SZK, and in particular closely follows reductions related to the ENTROPY DIFFERENCE PROBLEM (EDP). In Section 3.3.2 we prove that TDP is SZK-complete (Theorem 3.1.5). The latter proof is actually via a reduction to the SZK-completeness of JSP.

3.3.1 JSP is Complete for SZK

In this section we show that the promise problem JSP^{α,β} (see Definition 3.1.3) is complete for SZK. To do so, we need to show that JSP^{α,β} is both in SZK and hard for SZK. The proofs of both parts rely on the following characterization of the Jensen-Shannon divergence.

Proposition 3.3.1. Let P and Q be two distributions over a universe 𝒴. Let (B, Y) be the jointly distributed random variables defined as follows: B ~ {0, 1} is a uniformly random bit, and if B = 1, then Y ~ P (that is, Y is a random variable drawn according to P), and if B = 0, then Y ~ Q. Then, JS(P, Q) = 1 - H(B|Y).

Proof. Assume without loss of generality that 𝒴 is the union of the supports of P and Q (otherwise, exclude those elements from 𝒴 and the following calculations remain intact). For y ∈ 𝒴, recall that we defined the imbalance (with respect to P and Q) as θ_y = Pr[B = 1|Y = y] - Pr[B = 0|Y = y] = (P_y - Q_y)/(P_y + Q_y) (see Definition 2.2.2). Observe that (B|Y = y) is a Bernoulli random variable with parameter (1 + θ_y)/2 = P_y/(P_y + Q_y). We proceed to a straightforward but somewhat tedious calculation that establishes the proposition (in the following, y is always summed over 𝒴).

JS(P, Q) = (1/2)·KL(P || (P + Q)/2) + (1/2)·KL(Q || (P + Q)/2)
= (1/2)·Σ_y [P_y·log(2P_y/(P_y + Q_y)) + Q_y·log(2Q_y/(P_y + Q_y))]
= 1 + (1/2)·Σ_y [P_y·log(P_y/(P_y + Q_y)) + Q_y·log(Q_y/(P_y + Q_y))]
= 1 - Σ_y ((P_y + Q_y)/2)·h(P_y/(P_y + Q_y))
= 1 - E_{y~Y}[h((1 + θ_y)/2)]
= 1 - H(B|Y),

where we recall that h is the binary entropy function (see Definition 2.2.5). □

The above characterization naturally relates the JENSEN-SHANNON DIVERGENCE PROBLEM to the ENTROPY DIFFERENCE PROBLEM (EDP), a promise problem already known to be complete for SZK (see Theorem 2.3.11). In particular, the proofs of the next two lemmas closely follow techniques from the reduction of STATISTICAL CLOSENESS (i.e., the reversal problem of SDP) to EDP [Vad99, Section 4.4].

Lemma 3.3.2 (JSP is in SZK). For every (1/poly)-separated pair of functions (α, β) (according to Definition 3.1.4), the promise problem JSP^{α,β} is in SZK.

Proof. The proof reduces JSP^{α,β} to EDP^g, where g(n) = (α(n - 1) - β(n - 1))/2 for every n ≥ 2, and g(1) = g(2).^{23} Since (α, β) are polynomially separated, Theorem 2.3.11 completes the proof.

Given a pair of circuits (C0, C1) whose output length is n, let (B, Y) be the jointly distributed random variables from Proposition 3.3.1 with respect to the distributions C0 and C1. The reduction outputs the pair of circuits (D0, D1) such that D1 outputs a sample from (B, Y) and D0 outputs a sample from (B', Y), where B' is an independent

^{23}Setting g(1) = g(2) is done for technical reasons, so that g is defined for all n ∈ ℕ. As we will soon see, the reduction always outputs circuits whose output length is at least 2.

random bit with H(B') = 1 - (α(n) + β(n))/2.^{24} Note that the output length of D0 and of D1 is n + 1 ≥ 2. It holds that

H(D0) - H(D1) = H(B', Y) - H(B, Y) = H(B') - H(B|Y) = 1 - (α(n) + β(n))/2 - H(B|Y).

If JS(C0, C1) ≥ α(n), then by Proposition 3.3.1, H(B|Y) ≤ 1 - α(n). It holds that

H(D0) - H(D1) ≥ 1 - (α(n) + β(n))/2 - (1 - α(n)) = (α(n) - β(n))/2 = g(n + 1).

If JS(C0, C1) ≤ β(n), then by Proposition 3.3.1, H(B|Y) ≥ 1 - β(n). It holds that

H(D0) - H(D1) ≤ 1 - (α(n) + β(n))/2 - (1 - β(n)) = -(α(n) - β(n))/2 = -g(n + 1).

Finally, since the output length of D0 and of D1 is n + 1, the mapping (C0, C1) ↦ (D0, D1) is a polynomial-time reduction from JSP^{α,β} to EDP^g. □
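As a sanity check on this reduction, the following sketch builds the joint distributions of D0 and D1 explicitly for a toy pair of output distributions (all concrete values below are illustrative) and verifies that the entropy gap equals JS(C0, C1) - (α + β)/2:

```python
import math

def h(p):
    """Binary entropy (base 2)."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def entropy(dist):
    return sum(-pr * math.log2(pr) for pr in dist if pr > 0)

def js(P, Q):
    M = [(p + q) / 2 for p, q in zip(P, Q)]
    kl = lambda A, B: sum(a * math.log2(a / b) for a, b in zip(A, B) if a > 0)
    return 0.5 * kl(P, M) + 0.5 * kl(Q, M)

def binary_inverse_entropy(target, tol=1e-14):
    """Return p in [0, 1/2] with h(p) = target, by bisection."""
    lo, hi = 0.0, 0.5
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if h(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Toy output distributions of C1 (= P) and C0 (= Q), and a stand-in for (alpha + beta)/2.
P = [0.6, 0.3, 0.1]
Q = [0.2, 0.2, 0.6]
avg = 0.15

# D1 samples (B, Y); D0 samples (B', Y) with B' independent and H(B') = 1 - avg.
D1 = [0.5 * q for q in Q] + [0.5 * p for p in P]   # entries (B=0, y) then (B=1, y)
pY = [(p + q) / 2 for p, q in zip(P, Q)]           # marginal of Y
pB1 = binary_inverse_entropy(1 - avg)              # Pr[B' = 1]
D0 = [(1 - pB1) * y for y in pY] + [pB1 * y for y in pY]

# the entropy gap equals JS(C0, C1) - (alpha + beta)/2
assert abs((entropy(D0) - entropy(D1)) - (js(P, Q) - avg)) < 1e-9
```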

Lemma 3.3.3 (JSP is Hard for SZK). Let α, β : ℕ → [0, 1] be efficiently-computable functions such that there exists a constant ε ∈ (0, 1/2) for which 2^{-n^{1/2-ε}} ≤ β(n) and α(n) ≤ 1 - 2^{-n^{1/2-ε}} for every n ∈ ℕ. Then, the promise problem JSP^{α,β} is hard for SZK.

Proof. We reduce SDP^{1-2^{-n^{1/2-ε/2}}, 2^{-n^{1/2-ε/2}}} with length-preserving circuits to JSP^{α,β}. This suffices since the former problem is SZK-hard (Theorem 2.3.9). Let (C0, C1) be a pair of circuits whose output length is n, and let (B, Y) be the jointly distributed random variables defined in Proposition 3.3.1 with respect to the distributions C0 and C1. Assume for now that n ≥ n(ε), where n(ε) is some constant depending on ε, to be determined by the analysis later. Vadhan [Vad99, Claim 4.4.2] showed the following relation between the statistical distance of C0 and C1 and H(B|Y): if SD(C0, C1) = δ, then

1 - δ ≤ H(B|Y) ≤ h((1 - δ)/2). (3.17)

^{24}To sample such a B' we require that p = h^{-1}(1 - (α + β)/2) can be described using polynomially many bits. While this might not always be true, we can efficiently compute p' ≈ p such that the difference between their binary entropies is negligible. For simplicity, we ignore this issue in this proof.

If SD(C0, C1) ≥ 1 - 2^{-n^{1/2-ε/2}}, then Proposition 3.3.1 yields that

JS(C0, C1) = 1 - H(B|Y) ≥ 1 - h((1 - (1 - 2^{-n^{1/2-ε/2}}))/2)
= 1 - h(2^{-n^{1/2-ε/2}-1})
≥ 1 - 2·2^{-(n^{1/2-ε/2}+1)/2} ≥ α(n),

where we used that h((1 - δ)/2) is decreasing in 0 ≤ δ ≤ 1, that h(p) ≤ 2√p for all p ∈ [0, 1], and we set n(ε) so that the last inequality holds. Specifically, recall that α(n) ≤ 1 - 2^{-n^{1/2-ε}}, so there exists a constant n(ε) such that 1 - 2·2^{-(n^{1/2-ε/2}+1)/2} ≥ 1 - 2^{-n^{1/2-ε}} for all n ≥ n(ε). On the other hand, if SD(C0, C1) ≤ 2^{-n^{1/2-ε/2}}, it holds that

JS(C0, C1) = 1 - H(B|Y) ≤ 1 - (1 - 2^{-n^{1/2-ε/2}}) = 2^{-n^{1/2-ε/2}} ≤ β(n),

where the last inequality holds for n ≥ n(ε) (recall that β(n) ≥ 2^{-n^{1/2-ε}}).

Hence, the identity mapping (C0, C1) ↦ (C0, C1) is a reduction from the problem SDP^{1-2^{-n^{1/2-ε/2}}, 2^{-n^{1/2-ε/2}}} to the problem JSP^{α,β}, as long as the output length of the given circuits is larger than n(ε). For circuits of shorter output, we use that the circuits are length-preserving, so their input is also shorter than n(ε). For such input circuits the reduction can go over all inputs (at most 2^{n(ε)} strings, which is constant) and compute the statistical distance exactly. If that statistical distance is larger than 1 - 2^{-n^{1/2-ε/2}}, the reduction outputs arbitrary disjoint circuits; and if that statistical distance is smaller than 2^{-n^{1/2-ε/2}}, the reduction outputs arbitrary identical circuits. □

Lemmas 3.3.2 and 3.3.3 imply that JSP^{α,β} is SZK-complete for the desired set of (α, β), thereby proving Theorem 3.1.6.

3.3.2 TDP is Complete for SZK

In this section we show that the promise problem TDP^{α,β} (see Definition 3.1.2) is SZK-complete. To do so, we need to show that TDP^{α,β} is both in SZK and hard for SZK. We start by proving the latter.

Lemma 3.3.4 (TDP is Hard for SZK). Let α, β : ℕ → [0, 1] be efficiently-computable functions such that there exists a constant ε ∈ (0, 1/2) for which 2^{-n^{1/2-ε}} ≤ β(n) and α(n) ≤ 1 - 2^{-n^{1/2-ε}} for every n ∈ ℕ. Then, the promise problem TDP^{α,β} is hard for SZK.

We prove Lemma 3.3.4 using the fact that the triangular discrimination is polynomially related to the statistical distance.

Proof. We reduce SDP^{1-2^{-n^{1/2-ε/2}}, 2^{-n^{1/2-ε/2}}} with length-preserving circuits to TDP^{α,β}. This suffices since the former problem is SZK-hard (Theorem 2.3.9).

We will use the fact that the triangular discrimination is sandwiched between the statistical distance squared and the statistical distance. Specifically, recall Eq. (3.1): for all distributions P and Q, it holds that

SD(P, Q)² ≤ TD(P, Q) ≤ SD(P, Q).
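This sandwich (Eq. (3.1)) is easy to confirm numerically on random toy distributions; the td normalization below matches TD = E_{y~Y}[θ_y²]:

```python
import random

def sd(P, Q):
    return 0.5 * sum(abs(p - q) for p, q in zip(P, Q))

def td(P, Q):
    return sum((p - q) ** 2 / (2 * (p + q)) for p, q in zip(P, Q) if p + q > 0)

rng = random.Random(2)
norm = lambda v: [x / sum(v) for x in v]
for _ in range(1000):
    P = norm([rng.random() for _ in range(8)])
    Q = norm([rng.random() for _ in range(8)])
    s, t = sd(P, Q), td(P, Q)
    # SD^2 <= TD <= SD
    assert s ** 2 - 1e-12 <= t <= s + 1e-12
```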

Let (C0, C1) be a pair of circuits whose output length is n. Assume for now that n ≥ n(ε), where n(ε) is some constant depending on ε, to be determined by the analysis later. If SD(C0, C1) ≥ 1 - 2^{-n^{1/2-ε/2}}, then

TD(C0, C1) ≥ (1 - 2^{-n^{1/2-ε/2}})² ≥ 1 - 2^{-n^{1/2-ε/2}+1} ≥ α(n),

where we set n(ε) so that the last inequality holds. Specifically, recall that α(n) ≤ 1 - 2^{-n^{1/2-ε}}, so there exists a constant n(ε) such that 1 - 2^{-n^{1/2-ε/2}+1} ≥ 1 - 2^{-n^{1/2-ε}} for all n ≥ n(ε).

On the other hand, if SD(C0, C1) ≤ 2^{-n^{1/2-ε/2}}, it holds that

TD(C0, C1) ≤ 2^{-n^{1/2-ε/2}} ≤ β(n),

where the last inequality holds for every n by the assumption on β.

Hence, the identity mapping (C0, C1) ↦ (C0, C1) is a reduction from the problem SDP^{1-2^{-n^{1/2-ε/2}}, 2^{-n^{1/2-ε/2}}} to the problem TDP^{α,β}, as long as the output length of the given circuits is larger than n(ε). For circuits of shorter output, we use the same procedure described at the end of the proof of Lemma 3.3.3. □

It is left to show that TDP^{α,β} is in SZK. Given that the triangular discrimination is polynomially related to the statistical distance, a natural approach to achieve this goal is to polarize the triangular discrimination of the given distributions. Namely, design an efficient procedure that takes as input a pair of circuits (C0, C1) and outputs a pair of circuits (D0, D1) such that if TD(C0, C1) ≥ α then TD(D0, D1) ≥ 1 - 2^{-k}, and if TD(C0, C1) ≤ β then TD(D0, D1) ≤ 2^{-k}. Using Eq. (3.1), we would then be able to reduce TDP^{α,β} to SDP^{2/3,1/3}. Indeed, Sahai and Vadhan [SV03] used such a polarization lemma for statistical distance to show that SDP^{2/3,1/3} is in SZK.

We can adapt the polarization lemma of [SV03] to polarize triangular discrimination as well, because triangular discrimination behaves sufficiently like statistical distance under the repetition and XOR operations. Analogously to statistical-distance polarization, where α² and β can be (1/log)-separated, this approach allows us to show that TDP^{α,β} ∈ SZK for all (1/log)-separated α and β. To show the stronger claim that TDP^{α,β} is in SZK for (1/poly)-separated α and β, we take a different approach: we reduce TDP^{α,β} to JSP^{α',β'} for some (1/poly)-separated α' and β'. Since we already showed that JSP^{α',β'} is in SZK (see Lemma 3.3.2), this shows that TDP^{α,β} is also in SZK. The reduction from TDP^{α,β} to JSP^{α',β'} is given in Section 3.3.2.1. Since we find the (direct) polarization lemma for triangular discrimination and its analogy to [SV03]'s polarization lemma for statistical distance interesting, we prove it in Section 3.3.2.2.

53 3.3.2.1 From TDP to JSP

In this section we prove that TDP with any noticeable gap is in SZK. This proof is via a Karp reduction to JSP with a noticeable gap, which we have already shown to be in SZK (see Lemma 3.3.2). Since SZK is closed under Karp reductions, this implies that TDP with any noticeable gap is in SZK.

Lemma 3.3.5. Let (α, β) be (1/poly)-separated (according to Definition 3.1.4). Then there exist (1/poly)-separated (α', β') such that TDP^{α,β} is polynomially (Karp) reducible to JSP^{α',β'}.

Corollary 3.3.6. For every (1/poly)-separated (α, β), the promise problem TDP^{α,β} is in SZK.

Lemma 3.3.4 and Corollary 3.3.6 together imply that TDP^{α,β} is SZK-complete for the desired set of (α, β), thereby proving Theorem 3.1.5.

The main technical tool we use to prove Lemma 3.3.5 is the convex combination of a pair of circuits. Given a pair of circuits (C0, C1), consider the circuits (D0, D1), where D_b = λ·C_b + (1 - λ)·(C0 + C1)/2. Unsurprisingly, such an operation reduces the difference between the circuits. To analyze its exact effect on the triangular discrimination, it will be convenient to characterize the triangular discrimination in terms of the random variables (B, Y) and the imbalance θ_y (see Definition 2.2.2).

Proposition 3.3.7. Let P and Q be two distributions over a universe 𝒴. Let (B, Y) be the jointly distributed random variables defined as follows: B ~ {0, 1} is a uniformly random bit, and if B = 1, then Y ~ P (that is, Y is a random variable drawn according to P), and if B = 0, then Y ~ Q. Finally, for y ∈ Supp(Y), recall that θ_y = Pr[B = 1|Y = y] - Pr[B = 0|Y = y] = (P_y - Q_y)/(P_y + Q_y). Then, TD(P, Q) = E_{y~Y}[θ_y²].

Proof. In the following, y is summed over Supp(Y).

TD(P, Q) = Σ_y (P_y - Q_y)²/(2(P_y + Q_y)) = Σ_y ((P_y + Q_y)/2)·((P_y - Q_y)/(P_y + Q_y))² = E_{y~Y}[θ_y²]. □

It is easy to see the effect of the convex combination operation on θ_y.

Proposition 3.3.8. Let P and Q be two distributions over a universe 𝒴, and let 0 ≤ λ ≤ 1. Define the distributions P' = λ·P + (1 - λ)·(P + Q)/2 and Q' = λ·Q + (1 - λ)·(P + Q)/2. Then, for every y ∈ Supp(P) ∪ Supp(Q) (equivalently, Supp(P') ∪ Supp(Q')), it holds that θ^{P',Q'}_y = λ·θ^{P,Q}_y.

Proof.

θ^{P',Q'}_y = (P'_y - Q'_y)/(P'_y + Q'_y) = λ·(P_y - Q_y)/(P_y + Q_y) = λ·θ^{P,Q}_y,

where we used that P'_y - Q'_y = λ·(P_y - Q_y) and P'_y + Q'_y = P_y + Q_y. □

Propositions 3.3.7 and 3.3.8 immediately yield that TD(D0, D1) = λ²·TD(C0, C1). So, as long as λ is not too small, a noticeable gap in the triangular discrimination is preserved.

We can now finally prove Lemma 3.3.5. The main insight in the proof is that for small θ_y's the Jensen-Shannon divergence behaves like θ_y². The first step in the proof is to reduce the magnitude of the θ_y's by taking a convex combination with some small parameter λ. Since, by Proposition 3.3.7, the triangular discrimination is exactly characterized by θ_y², we can then relate the two measures. One difficulty arises when performing the convex combination: the gap in triangular discrimination between the YES and the NO cases shrinks as well. We show that with a careful choice of λ, the Jensen-Shannon divergence is "closer" to θ_y² than the degree to which the gap decreases, thus ensuring that we preserve a noticeable gap.

Proof of Lemma 3.3.5. The proof relies on the Taylor series of the function g(θ) = 1 - h((1 + θ)/2) around 0:

g(θ) = (1/(2 ln 2))·Σ_{n≥1} θ^{2n}/(n(2n - 1)) = θ²/(2 ln 2) + (1/(2 ln 2))·Σ_{n≥2} θ^{2n}/(n(2n - 1)).

(This series is obtained from the Taylor series of the binary entropy function h around 1/2.) The above series yields that for all 0 ≤ λ ≤ 1 and -1 ≤ θ ≤ 1,

λ²θ²/(2 ln 2) ≤ g(λθ) ≤ λ²θ²/(2 ln 2) + λ⁴/(2 ln 2). (3.18)

The left-hand side inequality holds since every term of the series is nonnegative. To see that the right-hand side inequality holds, note that

g(λθ) - (λθ)²/(2 ln 2) = (1/(2 ln 2))·Σ_{n≥2} (λθ)^{2n}/(n(2n - 1))
≤ (λ⁴/(2 ln 2))·Σ_{n≥2} 1/(n(2n - 1))
≤ λ⁴/(2 ln 2),

where the first inequality uses λ^{2n} ≤ λ⁴ and θ^{2n} ≤ 1 for every n ≥ 2, and the second uses Σ_{n≥2} 1/(n(2n - 1)) = 2 ln 2 - 1 ≤ 1.
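The two inequalities of Eq. (3.18) can be verified numerically over a grid of λ and θ (below, c stands for 2 ln 2):

```python
import math

def g(theta):
    """g(theta) = 1 - h((1 + theta)/2), with h the binary entropy in base 2."""
    p = (1 + theta) / 2
    if p in (0.0, 1.0):
        return 1.0
    return 1.0 + p * math.log2(p) + (1 - p) * math.log2(1 - p)

c = 2 * math.log(2)  # 2 ln 2 (natural log)
for lam in (1.0, 0.5, 0.1, 0.01):
    for i in range(-20, 21):
        theta = i / 20
        lower = (lam * theta) ** 2 / c
        upper = lower + lam ** 4 / c
        assert lower - 1e-12 <= g(lam * theta) <= upper + 1e-12
```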

Let λ be the largest number such that 1/λ is a power of 2 and λ ≤ √((α - β)/2). This implies that 1/λ ≤ 2√(2/(α - β)), and thus λ² ≥ (α - β)/8. Let (C0, C1) be a pair of circuits, and consider the pair of circuits (D0, D1), where D_b = λ·C_b + (1 - λ)·(C0 + C1)/2.

We wish to analyze JS(D0, D1). Let (B, Y) be the random variables defined in Proposition 3.3.7 with respect to the distributions C0 and C1, and let (B', Y') be defined similarly for D0 and D1. Note that Y' is distributed the same as Y. Indeed, Y' is sampled according to the distribution

(D0 + D1)/2 = (λ·(C0 + C1) + (1 - λ)·(C0 + C1))/2 = (C0 + C1)/2,

and the latter is the distribution by which Y is sampled.

Assume that TD(C0, C1) ≥ α. Then, Propositions 3.3.1, 3.3.7 and 3.3.8 and Eq. (3.18) yield that

JS(D0, D1) = E_{y~Y'}[g(λ·θ_y)] ≥ (λ²/(2 ln 2))·E_{y~Y}[θ_y²] = (λ²/(2 ln 2))·TD(C0, C1) ≥ (λ²/(2 ln 2))·α.

On the other hand, if TD(C0, C1) ≤ β, then

JS(D0, D1) = E_{y~Y'}[g(λ·θ_y)] ≤ (λ²·E_{y~Y}[θ_y²] + λ⁴)/(2 ln 2) = (λ²·TD(C0, C1) + λ⁴)/(2 ln 2) ≤ λ²·(β + (α - β)/2)/(2 ln 2),

where the last inequality uses λ² ≤ (α - β)/2.

Set α' = λ²·α/(2 ln 2) and β' = λ²·(β + (α - β)/2)/(2 ln 2). The mapping (C0, C1) ↦ (D0, D1) establishes the reduction from TDP^{α,β} to JSP^{α',β'}. Note that the output lengths of the circuits are preserved, so the above calculation indeed guarantees the desired gap in the Jensen-Shannon divergence, as a function of the output length of D0 and D1. Since 1/λ is polynomial in 1/(α - β), the reduction runs in polynomial time. Finally, it holds that

α' - β' = (λ²/(2 ln 2))·((α - β)/2) ≥ (α - β)²/32,

and since (α, β) are (1/poly)-separated, so are (α', β'). □

3.3.2.2 A Polarization Lemma for Triangular Discrimination

In this section we give a procedure that polarizes the triangular discrimination of two input circuits. The procedure is practically identical to that of Sahai and Vadhan's polarization lemma [SV03, Lemma 3.3]. While [SV03, Lemma 3.3] requires that α² > β, the polarization lemma for triangular discrimination only needs that α > β. As already stated, the polarization that we show here is inferior to the (indirect) polarization obtained in Section 3.3.2.1, as it only supports an inverse-logarithmic gap. Nevertheless, we include it since we find the approach appealing.

Lemma 3.3.9 (Polarization Lemma for triangular discrimination). There is an algorithm that takes as input the tuple (X0, X1, α, β, k), where X0 and X1 are circuits and α > β, and outputs a pair of circuits (Y0, Y1) such that:

TD(X0, X1) ≥ α ⟹ TD(Y0, Y1) ≥ 1 - 2^{-k};
TD(X0, X1) ≤ β ⟹ TD(Y0, Y1) ≤ 2^{-k}.

The running time of the algorithm is polynomial in the description of X0 and X1 as well as in k and exp(1/(α - β)).

The proof of [SV03, Lemma 3.3] proceeds by considering two operations on a pair of circuits: the repetition (i.e., direct product) operation and the XOR operation. We analyze the effect of these operations on the triangular discrimination of the distributions.

Lemma 3.3.10 (Direct Product Lemma for triangular discrimination). Let P, Q be distributions such that TD(P, Q) = δ. Then for all k ∈ ℕ,

1 - exp(-δk/2) ≤ TD(P^{⊗k}, Q^{⊗k}) ≤ 2kδ.

This lemma is where the main difference between polarizing triangular discrimination and statistical distance lies. Specifically, the lower bound in the analogous lemma for statistical distance ([SV03, Lemma 3.4]) depends on δ², rather than δ as here. This dependence is exactly why α² must be larger than β for statistical distance, but not for triangular discrimination.

Proof. The proof proceeds by considering the squared Hellinger distance:

H²(P, Q) = 1 - E_{x~Q}[√(P(x)/Q(x))] = 1 - Σ_x √(P(x)·Q(x)).

The squared Hellinger distance is useful in our context because of two properties: the triangular discrimination is sandwiched between constant factors of the squared Hellinger distance, and the squared Hellinger distance tensorizes under product distributions. For the first property, Le Cam [Cam86, p. 48] showed that

H²(P, Q) ≤ TD(P, Q) ≤ 2·H²(P, Q). (3.19)

For the second property, the fact that P^{⊗k} and Q^{⊗k} are product distributions yields that

H²(P^{⊗k}, Q^{⊗k}) = 1 - E_{x^k~Q^{⊗k}}[√(P^{⊗k}(x^k)/Q^{⊗k}(x^k))] (3.20)
= 1 - E_{x^k~Q^{⊗k}}[∏_{i=1}^k √(P(x_i)/Q(x_i))]
= 1 - ∏_{i=1}^k E_{x_i~Q}[√(P(x_i)/Q(x_i))]
= 1 - (1 - H²(P, Q))^k.
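Both properties, Le Cam's sandwich (Eq. (3.19)) and the tensorization identity (Eq. (3.20)), can be confirmed by brute force on small toy distributions, building P^{⊗k} explicitly:

```python
import itertools, math, random

def hellinger_sq(P, Q):
    """Squared Hellinger distance: 1 - sum_x sqrt(P(x) Q(x))."""
    return 1.0 - sum(math.sqrt(p * q) for p, q in zip(P, Q))

def td(P, Q):
    return sum((p - q) ** 2 / (2 * (p + q)) for p, q in zip(P, Q) if p + q > 0)

def product_dist(P, k):
    """P^{tensor k} as an explicit distribution over k-tuples."""
    return [math.prod(t) for t in itertools.product(P, repeat=k)]

rng = random.Random(3)
norm = lambda v: [x / sum(v) for x in v]
P = norm([rng.random() for _ in range(4)])
Q = norm([rng.random() for _ in range(4)])

for k in (1, 2, 3):
    Pk, Qk = product_dist(P, k), product_dist(Q, k)
    # tensorization: H^2(P^k, Q^k) = 1 - (1 - H^2(P, Q))^k
    assert abs(hellinger_sq(Pk, Qk) - (1 - (1 - hellinger_sq(P, Q)) ** k)) < 1e-12
    # Le Cam's sandwich: H^2 <= TD <= 2 H^2
    assert hellinger_sq(Pk, Qk) - 1e-12 <= td(Pk, Qk) <= 2 * hellinger_sq(Pk, Qk) + 1e-12
```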

Equipped with the above properties, we are now ready to prove the lemma. For the upper bound, it holds that

TD(P^{⊗k}, Q^{⊗k}) ≤ 2·H²(P^{⊗k}, Q^{⊗k}) = 2(1 - (1 - H²(P, Q))^k) ≤ 2(1 - (1 - k·H²(P, Q))) = 2k·H²(P, Q) ≤ 2k·TD(P, Q),

where the second inequality follows since (1 - x)^k ≥ 1 - kx for all x ∈ [0, 1] and every integer k. For the lower bound, it holds that

TD(P^{⊗k}, Q^{⊗k}) ≥ H²(P^{⊗k}, Q^{⊗k}) = 1 - (1 - H²(P, Q))^k ≥ 1 - (1 - δ/2)^k ≥ 1 - exp(-δk/2),

where the second inequality uses H²(P, Q) ≥ TD(P, Q)/2 = δ/2, and the last inequality follows since 1 - x ≤ e^{-x} for all x. □

Next we analyze the XOR operation for triangular discrimination, and see that its effect is identical to the effect of this operation on statistical distance ([SV03, Lemma 3.5]).

Lemma 3.3.11 (XOR Lemma for triangular discrimination). There is a polynomial-time algorithm that takes as input (X0, X1, 1^k), where X0 and X1 are circuits, and outputs a pair of circuits (Y0, Y1) such that TD(Y0, Y1) = TD(X0, X1)^k. Specifically, Y0 and Y1 are defined as follows:

Y0: Sample (b_1, ..., b_k) ~ {0, 1}^k uniformly at random, conditioned on b_1 ⊕ ··· ⊕ b_k = 0, and output a sample from X_{b_1} × X_{b_2} × ··· × X_{b_k}.

Y1: Sample (b_1, ..., b_k) ~ {0, 1}^k uniformly at random, conditioned on b_1 ⊕ ··· ⊕ b_k = 1, and output a sample from X_{b_1} × X_{b_2} × ··· × X_{b_k}.

The proof of Lemma 3.3.11 follows from the next proposition and a straightforward induction.

Proposition 3.3.12. Let P, P', Q, Q' be any distributions over 𝒴, and let R = (1/2)(P×P' + Q×Q') and R' = (1/2)(P×Q' + Q×P') be distributions over 𝒴 × 𝒴. Then TD(R, R') = TD(P, Q)·TD(P', Q').

Proof. It is easy to verify that for every i, j ∈ 𝒴, it holds that

R_{ij} = 0 ∧ R'_{ij} = 0 ⟺ (P_i = 0 ∧ Q_i = 0) ∨ (P'_j = 0 ∧ Q'_j = 0). (3.21)

Hence, the set A = {(i, j) : R_{ij} > 0 ∨ R'_{ij} > 0} is exactly the product of B = {i : P_i > 0 ∨ Q_i > 0} and C = {j : P'_j > 0 ∨ Q'_j > 0}. Compute

TD(R, R') = Σ_{(i,j)∈A} (R_{ij} - R'_{ij})²/(2(R_{ij} + R'_{ij}))
= Σ_{i∈B} Σ_{j∈C} ((1/2)·(P_i - Q_i)·(P'_j - Q'_j))² / ((P_i + Q_i)·(P'_j + Q'_j))
= (Σ_{i∈B} (P_i - Q_i)²/(2(P_i + Q_i))) · (Σ_{j∈C} (P'_j - Q'_j)²/(2(P'_j + Q'_j)))
= TD(P, Q)·TD(P', Q'),

where the second equality uses R_{ij} - R'_{ij} = (1/2)·(P_i - Q_i)·(P'_j - Q'_j) and R_{ij} + R'_{ij} = (1/2)·(P_i + Q_i)·(P'_j + Q'_j). □
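Proposition 3.3.12 can also be confirmed by direct computation on small toy distributions, building R and R' explicitly over 𝒴 × 𝒴:

```python
import random

def td(P, Q):
    return sum((p - q) ** 2 / (2 * (p + q)) for p, q in zip(P, Q) if p + q > 0)

rng = random.Random(4)
norm = lambda v: [x / sum(v) for x in v]
mk = lambda n: norm([rng.random() for _ in range(n)])
P, Q = mk(3), mk(3)
P2, Q2 = mk(4), mk(4)

# R = (P x P' + Q x Q')/2 and R' = (P x Q' + Q x P')/2, flattened over Y x Y
R  = [(p * a + q * b) / 2 for p, q in zip(P, Q) for a, b in zip(P2, Q2)]
Rp = [(p * b + q * a) / 2 for p, q in zip(P, Q) for a, b in zip(P2, Q2)]

assert abs(td(R, Rp) - td(P, Q) * td(P2, Q2)) < 1e-12
```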

Using the above analysis we can now prove Lemma 3.3.9. This proof follows similar lines to that of [SV03, Lemma 3.3].

Proof of Lemma 3.3.9. Let λ = min(α/β, 2) > 1,^{25} and let ℓ = ⌈log_λ(8k)⌉. Apply the XOR Lemma (Lemma 3.3.11) to the input (X0, X1, 1^ℓ) to produce (X'_0, X'_1) such that

TD(X0, X1) ≥ α ⟹ TD(X'_0, X'_1) ≥ α^ℓ;
TD(X0, X1) ≤ β ⟹ TD(X'_0, X'_1) ≤ β^ℓ.

Let m = λ^ℓ/(4α^ℓ) ≤ 1/(4β^ℓ), and let X''_0 = (X'_0)^{⊗m} and X''_1 = (X'_1)^{⊗m}. The Direct Product Lemma (Lemma 3.3.10) now yields that

TD(X0, X1) ≥ α ⟹ TD(X''_0, X''_1) ≥ 1 - exp(-α^ℓ·m/2) ≥ 1 - e^{-k};
TD(X0, X1) ≤ β ⟹ TD(X''_0, X''_1) ≤ 2m·β^ℓ ≤ 1/2.

^{25}This is the only place this proof diverges from that of [SV03, Lemma 3.3]. The latter sets λ = α²/β.

Finally, apply the XOR Lemma (Lemma 3.3.11) again, on the input (X''_0, X''_1, 1^k), to produce (Y0, Y1) such that

TD(X0, X1) ≥ α ⟹ TD(Y0, Y1) ≥ (1 - e^{-k})^k ≥ 1 - k·e^{-k} ≥ 1 - 2^{-k};
TD(X0, X1) ≤ β ⟹ TD(Y0, Y1) ≤ 1/2^k = 2^{-k}.

The last derivation holds for sufficiently large k, which we can obtain by increasing k at the start. As for the running time, the analysis is similar to the one done in the proof sketch of Theorem 2.3.9 (which in turn follows [CCKV08, Lemma 38]), and we refer the reader there. □
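The parameter choices in this proof can be sanity-checked numerically. The sketch below reproduces λ = min(α/β, 2), ℓ = ⌈log_λ(8k)⌉, and m = λ^ℓ/(4α^ℓ) (treating m as a real number for simplicity; the actual construction takes an integer number of copies), and verifies the three inequalities used above for one illustrative setting of (α, β, k):

```python
import math

def polarization_params(alpha, beta, k):
    """Parameter choices from the proof of Lemma 3.3.9 (m left as a real number)."""
    lam = min(alpha / beta, 2.0)
    ell = math.ceil(math.log(8 * k) / math.log(lam))
    m = lam ** ell / (4 * alpha ** ell)
    return lam, ell, m

alpha, beta, k = 0.7, 0.3, 10
lam, ell, m = polarization_params(alpha, beta, k)

# after the first XOR step the gap ratio satisfies lam**ell >= 8k ...
assert lam ** ell >= 8 * k
# ... so the direct product pushes a YES instance close to 1 ...
assert 1 - math.exp(-alpha ** ell * m / 2) >= 1 - math.exp(-k)
# ... while a NO instance stays bounded away from 1
assert 2 * m * beta ** ell <= 0.5
```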

3.4 One-Way Functions from SDP with Any Noticeable Gap

In this section we construct a one-way function assuming the average-case hardness of the STATISTICAL DIFFERENCE PROBLEM (SDP) with any inverse-polynomial gap. Actually, we will only construct a distributionally one-way function, but by a result of Impagliazzo and Luby [IL89], this yields a full-fledged one-way function. We first recall the (standard) definitions of a one-way function and a distributionally one-way function.

Definition 3.4.1 (One-Way Function). A function f : {0, 1}* → {0, 1}* is one-way if it is polynomial-time computable and for every probabilistic polynomial-time algorithm A,

Pr[A(1^n, Y) ∈ f^{-1}(Y)] = negl(n),

where Y = f(X) for X ~ {0, 1}^n (and the probability is also over the coin tosses of A).

Definition 3.4.2 (Distributionally One-Way Function). A polynomial-time computable function f : {0, 1}* → {0, 1}* is distributionally one-way if there exists a polynomial p such that for every probabilistic polynomial-time algorithm A and all large enough n,

SD((X, Y), (A(1^n, Y), Y)) ≥ 1/p(n),

where X ~ {0, 1}^n and Y = f(X).

Any one-way function is also a distributionally one-way function. While the other direction is not always true, [IL89] showed that the existence of the two primitives is equivalent.

Lemma 3.4.3 ([IL89, Lemma 1]). If there exists a distributionally one-way function, then there exists a one-way function.

Hence, to show the existence of one-way functions, it suffices to show that distributionally one-way functions exist. As noted above, we will do so based on the average-case hardness of SDP.

Definition 3.4.4 (Average-Case Hardness). We say that a promise problem Π = (YES, NO) is average-case hard if there exists a probabilistic polynomial-time algorithm S such that S(1^n) outputs samples from YES ∪ NO, and for every probabilistic polynomial-time distinguisher D,

Pr_{x~S(1^n)}[D(1^n, x) = Π(x)] ≤ 1/2 + negl(n),

where Π(x) = 1 if x ∈ YES and Π(x) = 0 if x ∈ NO. The above probability is taken also over the randomness of D. We call S a hard-instance sampler for Π.

The main result of this section is that the average-case hardness of SDP with any noticeable gap implies the existence of distributionally one-way functions.

Theorem 3.4.5. Let (α, β) be (1/poly)-separated functions (according to Definition 3.1.4). Assume that SDP^{α,β} is average-case hard with hard-instance sampler S. Then, the function f defined as

f (17t r, b, X) = (ri" Co, C1, y),

26 where ( Co, C1 ) =S(1; r) and y = Cb(x), is distributionally one-way.

It may be useful for the reader to think of f as a collection of functions, indexed by the pair of circuits (C_0, C_1), that map (b, x) to C_b(x). We refrain from describing the function as a collection, as the definition of collections of distributionally one-way functions is slightly more cumbersome. Theorem 3.4.5 immediately proves Theorem 3.1.8. Furthermore, our proof implicitly implies two additional results. First, it proves that the function f is a distributional k-multi-collision-resistant hash function, for an appropriate k determined by the gap between α and β; that is, for a random output (c_0, c_1, y) of f, it is difficult to find k random preimages of (c_0, c_1, y). Second, the proof of Theorem 3.4.5 also implicitly shows the following strengthening of Goldreich's [Gol90] result: the existence of efficiently samplable distributions whose statistical distance is α, but such that no efficient algorithm can distinguish between them with advantage more than β, for any (1/poly)-separated (α, β), implies the existence of one-way functions. The rest of this section is dedicated to proving Theorem 3.4.5.
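To make the shape of f concrete, here is a toy Python sketch in which circuits are modeled as truth tables over m-bit inputs and the hard-instance sampler is a stand-in driven by its random coins. The sampler, the parameter m, and the representation are illustrative assumptions, not the thesis's formal construction:

```python
import random

# Toy sketch of the function f from Theorem 3.4.5. Circuits are modeled as
# truth tables over m-bit inputs, and `toy_sampler` is an illustrative
# stand-in for the hard-instance sampler S(1^n; r).

m = 3  # input length of the sampled circuits (illustrative)

def toy_sampler(r):
    """Stand-in for S(1^n; r): derive a pair of circuits (C0, C1) from coins r."""
    rng = random.Random(r)
    c0 = {x: rng.randrange(2) for x in range(2 ** m)}
    c1 = {x: rng.randrange(2) for x in range(2 ** m)}
    return c0, c1

def f(r, b, x):
    """f(1^n, r, b, x) = (1^n, C0, C1, C_b(x)), with the 1^n input left implicit."""
    c0, c1 = toy_sampler(r)
    y = (c0 if b == 0 else c1)[x]
    return c0, c1, y

# The output reveals the circuit pair but hides which circuit produced y.
c0, c1, y = f(r=42, b=1, x=5)
assert y == c1[5]
```

Distributionally inverting this f amounts to sampling a near-uniform pre-image (r, b, x) of the output, and in particular a near-uniform bit b consistent with y; this is exactly the ability the reduction below exploits.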

²⁶Recall that S(1^n; r) stands for the output of S(1^n) when its random coins are set to r.
²⁷Definition 3.4.2 only considers functions whose domain is {0,1}*, i.e., functions defined for every input length. Although this function is not defined for every input length (and has 1^n as an input), using the fact that it is defined on {0,1}^{q(n)} for some q(n) ∈ poly(n), together with standard padding techniques, such a restricted distributionally one-way function implies the existence of a standard distributionally one-way function, per Definition 3.4.2. In the rest of this section we ignore this issue and assume that distributionally one-way functions can be defined for inputs in {0,1}^{q(n)}, for some q(n) ∈ poly(n).

3.4.1 Proving Theorem 3.4.5

The proof of Theorem 3.4.5 is via a reduction: we show that given an adversary that distributionally inverts f, we can construct a distinguisher that breaks the average-case hardness of SDP^{α,β}. We begin by defining the following jointly distributed random variables with respect to the security parameter n. Let R ~ {0,1}^{ρ(n)}, for ρ(n) a bound on the number of random bits that S(1^n) uses; let (C_0, C_1) = S(1^n; R); let B ~ {0,1}; let X ~ {0,1}^{m(n)}, for m(n) a bound on the input length of C_0 and C_1; and let Y = C_B(X). Finally, let W = (1^n, R, B, X) and Z = (1^n, C_0, C_1, Y). Note that C_0 and C_1 are random variables whose values are circuits (i.e., descriptions of the circuits themselves). This is in contrast to other parts of this thesis, in which we (abuse notation and) use C to denote also the output distribution of the circuit C. To avoid confusion, in this section we denote by P_0 and P_1 the output distributions of the circuits C_0 and C_1, respectively. Assume toward a contradiction that f is not distributionally one-way, and let p(n) ∈ poly(n) be some polynomial to be determined by the analysis. Then, there exists a probabilistic polynomial-time inverter A such that

SD((W, Z), (A(1^n, Z), Z)) < 1/p(n),    (3.22)

for every n in an infinite set I ⊆ N. Using the inverter A, we construct a distinguisher D such that for large enough n ∈ I, it holds that²⁸

Pr[D(1^n, (C_0, C_1)) = 1{SD(P_0, P_1) ≥ α(n)}] ≥ 1/2 + 1/q(n),    (3.23)

for some q(n) ∈ poly(n) to be determined by the analysis.²⁹ The existence of such a D contradicts the average-case hardness of SDP^{α,β}, and so it remains to establish Eq. (3.23). Fix some large enough n ∈ I. When it is clear from the context, we will sometimes omit n from the notation. Our first step is to show that for a large fraction of the circuit pairs sampled by S, the inverter A inverts the function well. Let R', B', X' be the random variables induced by the output of A(1^n, Z). By Eq. (3.22) (and using the data-processing inequality for statistical distance) it holds that

SD((B, Y, C_0, C_1), (B', Y, C_0, C_1)) < 1/p.

²⁸Recall that for a boolean statement S (e.g., X > 5), 1{S} stands for the indicator function that outputs 1 if S is a true statement and 0 otherwise.
²⁹The probability is over the choice of (C_0, C_1) and the randomness of D. The statistical distance in the probability is between the output distributions of C_0 and C_1 after those circuits were drawn and fixed.

Let

U = {(c_0, c_1) : SD(((B, Y) | (C_0, C_1) = (c_0, c_1)), ((B', Y) | (C_0, C_1) = (c_0, c_1))) ≤ 1/√p}.    (3.24)

Namely, U is the set containing the pairs of circuits for which the inverter's output B', when given a random Y, is statistically close to B. It follows that

1/p > SD((B, Y, C_0, C_1), (B', Y, C_0, C_1)) ≥ Pr[(C_0, C_1) ∉ U] · (1/√p),    (3.25)

and thus Pr[(C_0, C_1) ∈ U] ≥ 1 - 1/√p. We design a distinguisher D that works particularly well when it is given a circuit pair belonging to U. Specifically, we describe a D for which, for all (c_0, c_1) ∈ U, it holds that

Pr[D(c_0, c_1) = 1{SD(P_0, P_1) ≥ α(n)}] ≥ 1 - 4/n,    (3.26)

where the probability is over the randomness of D. Such a distinguisher yields Eq. (3.23) as follows:

Pr[D(C_0, C_1) = 1{SD(P_0, P_1) ≥ α(n)}] ≥ Pr[(C_0, C_1) ∈ U] · (1 - 4/n)
≥ (1 - 1/√p) · (1 - 4/n)
≥ 1/2 + 1/q(n),

for large enough q(n) ∈ poly(n). In the rest of the proof we establish Eq. (3.26). For (c_0, c_1, y) ∈ Supp(C_0, C_1, Y), let

θ_{c_0,c_1}(y) = Pr[B = 1 | C_0 = c_0, C_1 = c_1, Y = y] - Pr[B = 0 | C_0 = c_0, C_1 = c_1, Y = y].

Namely, θ_{c_0,c_1}(y) measures the difference between the likelihoods that each circuit outputs y. A perfect inverter for such an f, when given an output y, would return 1 with probability (1 + θ_{c_0,c_1}(y))/2 and 0 with probability (1 - θ_{c_0,c_1}(y))/2. The quantity θ_{c_0,c_1}(y) plays a crucial role in the proof. First, we show that it can be used to characterize the statistical distance between the output distributions of c_0 and c_1. Second, we show that it can be well-approximated using the inverter A, specifically using random samples of B'. Thus, the distinguisher D uses A to approximate SD(P_0, P_1) and answers accordingly. We proceed to the actual proof. For (c_0, c_1) ∈ Supp(C_0, C_1), let Y_{c_0,c_1} denote the random variable sampled according to (Y | (C_0, C_1) = (c_0, c_1)) (i.e., Y_{c_0,c_1} is drawn from the distribution (P_0 + P_1)/2).

For y ∈ Supp(Y_{c_0,c_1}), let B_{c_0,c_1,y} denote the random variable sampled according to (B | (C_0 = c_0, C_1 = c_1, Y = y)), and similarly for B'_{c_0,c_1,y}. Using the above notation, Proposition 2.2.3 states that for every pair of circuits (c_0, c_1) ∈ Supp(C_0, C_1), it holds that

SD(P_0, P_1) = E_{y~Y_{c_0,c_1}}[|θ_{c_0,c_1}(y)|].    (3.27)

Namely, the statistical distance between the output distributions of c_0 and c_1 is the expected value of |θ_{c_0,c_1}(y)| over y ~ Y_{c_0,c_1}. So, the distinguisher's task is to approximate the expected value of |θ_{c_0,c_1}(y)| for (c_0, c_1) ∈ U. We design such a distinguisher in two steps. First, we show how to estimate θ_{c_0,c_1}(y) for a random y. Then, we use this estimator and Eq. (3.27) to compute an approximation of the statistical distance (up to some inverse-polynomial additive error). In the following we let ε = (α - β)/4, let k = ⌈ln(2n)/(2ε²)⌉, and let ℓ be the number of samples that Fact 2.2.18 requires for the estimate in Claim 3.4.1 below. Note that since α and β are noticeably separated, it holds that k, ℓ ∈ poly(n).

3.4.1.1 Estimating θ_{c_0,c_1}(y)

Consider the following algorithm Est, whose goal is to estimate θ_{c_0,c_1}(y). The algorithm gets access to an oracle O that takes as input (c_0, c_1, y) ∈ Supp(C_0, C_1, Y) and outputs a bit b'.

Estimator Est^O(c_0, c_1, y):
1. For every i ∈ [ℓ], run O(c_0, c_1, y) to get (b'_1, …, b'_ℓ).
2. Let m = |{i : b'_i = 1}| be the number of ones in (b'_1, …, b'_ℓ).
3. Set P̂_B(1) = m/ℓ and P̂_B(0) = (ℓ - m)/ℓ.
4. Return P̂_B(1) - P̂_B(0).

The next claim shows that if the oracle O(c_0, c_1, y) perfectly samples from B_{c_0,c_1,y}, then the output of Est^O(c_0, c_1, y) is indeed a good estimator of θ_{c_0,c_1}(y).

Claim 3.4.1. Let (c_0, c_1, y) ∈ Supp(C_0, C_1, Y) and assume that O(c_0, c_1, y) ~ B_{c_0,c_1,y}. Then, it holds that

Pr[|θ_{c_0,c_1}(y) - Est^O(c_0, c_1, y)| > ε] ≤ 1/(kn),

where the above probability is over the randomness of O.

Proof. We use Fact 2.2.18 to show that Est^O(c_0, c_1, y) indeed approximates θ_{c_0,c_1}(y). Let P_B denote the distribution of B_{c_0,c_1,y}. By the assumption on O(c_0, c_1, y) and the definition of Est, it follows that P̂_B (defined in line 3 of Est) is the empirical distribution of P_B, computed using ℓ samples. Since the domain size of B is 2, the setting of ℓ and Fact 2.2.18 yield that

Pr[|θ_{c_0,c_1}(y) - Est^O(c_0, c_1, y)| > ε] = Pr[|(P_B(1) - P_B(0)) - (P̂_B(1) - P̂_B(0))| > ε]    (3.28)
≤ Pr[SD(P_B, P̂_B) > ε/2]
≤ 1/(kn). □

Unfortunately, even with the inverter A we cannot implement an oracle O that perfectly samples B_{c_0,c_1,y}. However, we can show that for (c_0, c_1) ∈ U, the inverter approximates such an oracle for a random y. In the following we let Â be the projection variant of A that outputs only A's second output (i.e., the bit b').

Claim 3.4.2. Let (c_0, c_1) ∈ U (see Eq. (3.24)). Then

Pr_{y~Y_{c_0,c_1}}[|θ_{c_0,c_1}(y) - Est^Â(c_0, c_1, y)| > ε] ≤ 1/(kn) + ℓ/p^{1/4} + 1/p^{1/4},

where the above probability is also over the randomness of Â.

Proof. Consider the set

V = {y : SD(B_{c_0,c_1,y}, B'_{c_0,c_1,y}) ≤ 1/p^{1/4}}.

Similar calculations to those made in Eq. (3.25), also using that (c_0, c_1) ∈ U (and thus SD((B_{c_0,c_1,Y}, Y_{c_0,c_1}), (B'_{c_0,c_1,Y}, Y_{c_0,c_1})) ≤ 1/√p), show that Pr_{y~Y_{c_0,c_1}}[y ∉ V] ≤ 1/p^{1/4}.

Let y ∈ V. It follows that SD(Â(c_0, c_1, y), O(c_0, c_1, y)) ≤ 1/p^{1/4}, where the oracle O is from Claim 3.4.1. Since Est makes ℓ oracle calls, a standard argument using the data-processing inequality for statistical distance and Claim 3.4.1 yields that for every y ∈ V,

Pr[|θ_{c_0,c_1}(y) - Est^Â(c_0, c_1, y)| > ε] ≤ Pr[|θ_{c_0,c_1}(y) - Est^O(c_0, c_1, y)| > ε] + ℓ · SD(Â(c_0, c_1, y), O(c_0, c_1, y))
≤ 1/(kn) + ℓ/p^{1/4}.

All in all,

Pr_{y~Y_{c_0,c_1}}[|θ_{c_0,c_1}(y) - Est^Â(c_0, c_1, y)| > ε]
≤ Pr[|θ_{c_0,c_1}(y) - Est^Â(c_0, c_1, y)| > ε | y ∈ V] + Pr[y ∉ V]
≤ 1/(kn) + ℓ/p^{1/4} + 1/p^{1/4},

as required. □

3.4.1.2 Approximating the Statistical Distance

Using the θ_{c_0,c_1}(y) estimator Est, we now describe the distinguisher D (recall that Â is the projection variant of A that outputs only A's second output, i.e., the bit b').

Distinguisher D^Â(c_0, c_1):
1. For every i ∈ [k], draw b_i ~ B and x_i ~ X, and set y_i = c_{b_i}(x_i).
2. Compute Δ = (1/k) Σ_{i=1}^k |Est^Â(c_0, c_1, y_i)|.
3. Output 1 if Δ ≥ (α + β)/2, and 0 otherwise.

We show that for (c_0, c_1) ∈ U, the distinguisher approximates the relevant statistical distance well.

Claim 3.4.3. Let (c_0, c_1) ∈ U, and let Δ(c_0, c_1) be the value set to Δ in line 2 in a random execution of D^Â on input (c_0, c_1). It holds that

Pr[|Δ(c_0, c_1) - SD(P_0, P_1)| ≥ (α - β)/2] ≤ 4/n,

where the probability is over the randomness of D and Â.

The proof of Claim 3.4.3 goes as follows. First, using a union bound, we argue that with high probability, for every y_i sampled by D it holds that Est^Â(c_0, c_1, y_i) is close to θ_{c_0,c_1}(y_i). Then, we use Eq. (3.27) and the Chernoff-Hoeffding bound to argue that the average of the |Est^Â(c_0, c_1, y_i)|'s is a good approximation of the statistical distance. Before formally proving Claim 3.4.3, let us use it to derive Theorem 3.4.5.

Proof of Theorem 3.4.5. Recall that in order to prove Theorem 3.4.5 it suffices to establish Eq. (3.26). Let (c_0, c_1) ∈ U. Note that if |Δ(c_0, c_1) - SD(P_0, P_1)| < (α - β)/2, then the distinguisher D always outputs the correct answer. Indeed, if SD(P_0, P_1) ≥ α, then Δ(c_0, c_1) > (α + β)/2 and D outputs 1; and if SD(P_0, P_1) ≤ β, then Δ(c_0, c_1) < (α + β)/2 and D outputs 0. Hence, Claim 3.4.3 immediately establishes Eq. (3.26).

Lastly, since k, ℓ ∈ poly(n), the functions α and β are efficiently computable, and A runs in polynomial time, so does D^Â. This completes the proof of Theorem 3.4.5. □

It is left to prove Claim 3.4.3.

Proof of Claim 3.4.3. For i ∈ [k], let Y_i be the value set to y_i in a random execution of D^Â(c_0, c_1). Let V_i = |θ_{c_0,c_1}(Y_i)| and Ṽ_i = |Est^Â(c_0, c_1, Y_i)|. The definition of Δ yields that

Pr[|Δ(c_0, c_1) - SD(P_0, P_1)| ≥ (α - β)/2] = Pr[|(1/k) Σ_{i=1}^k Ṽ_i - SD(P_0, P_1)| ≥ 2ε]    (3.29)
≤ Pr[|(1/k) Σ_{i=1}^k V_i - SD(P_0, P_1)| > ε] + Pr[|(1/k) Σ_{i=1}^k Ṽ_i - (1/k) Σ_{i=1}^k V_i| > ε].

We bound each summand in the right-hand side of Eq. (3.29) separately.

To bound the first summand, note that by definition each Y_i is drawn from Y_{c_0,c_1}. Hence, Proposition 2.2.3 yields that E[V_i] = SD(P_0, P_1) for every i ∈ [k]. Fact 2.2.17 now yields that

Pr[|(1/k) Σ_{i=1}^k V_i - SD(P_0, P_1)| > ε] ≤ 2e^{-2kε²} ≤ 1/n,    (3.30)

where the last inequality follows from the definition of k.

To bound the second summand on the right-hand side of Eq. (3.29), we apply Claim 3.4.2.

Pr[|(1/k) Σ_{i=1}^k Ṽ_i - (1/k) Σ_{i=1}^k V_i| > ε] ≤ Pr[(1/k) Σ_{i=1}^k |Ṽ_i - V_i| > ε]    (3.31)
≤ Pr[∃i ∈ [k] : |Ṽ_i - V_i| > ε]
≤ Σ_{i=1}^k Pr[|Ṽ_i - V_i| > ε]
≤ Σ_{i=1}^k Pr[|θ_{c_0,c_1}(Y_i) - Est^Â(c_0, c_1, Y_i)| > ε]
≤ k · (1/(kn) + (ℓ + 1)/p^{1/4})
≤ 1/n + 1/n = 2/n,

where the second inequality holds because if |Ṽ_i - V_i| ≤ ε for all i then the average is at most ε, the third is a union bound, the fourth follows since ||x| - |y|| ≤ |x - y| for all x and y, the penultimate inequality follows from Claim 3.4.2, and the last follows by setting p = (k(ℓ + 1)n)⁴ ∈ poly(n). Plugging Eqs. (3.30) and (3.31) into Eq. (3.29) completes the proof of the claim. □

3.5 Estimating Statistical Distance in AM ∩ coAM

In this section, we present a constant-round public-coin interactive protocol in which, given circuits C_0 and C_1 and a parameter Δ ∈ [0, 1], a computationally unbounded prover can prove to a computationally bounded verifier that SD(C_0, C_1) ≈ Δ, up to any arbitrary inverse-polynomial precision (thereby proving Theorem 3.1.9). This immediately gives an AM as well as a coAM protocol for the STATISTICAL DIFFERENCE PROBLEM SDP^{α,β}, for any noticeably separated α and β (which, as discussed in Section 3.1.1.3, were already known). Fix a pair of circuits C_0, C_1 : {0,1}^m → {0,1}^n, and define the circuit³⁰ C : {0,1}^{m+1} → {0,1}^n that on input (b, x) outputs C_b(x). For any y, define the "pre-image set" I_y = {(b, x) | C_b(x) = y}. We shall once more use the imbalance θ_y, defined in Definition 2.2.2. In terms that will be more convenient for our application, for any y in the support of C, we note that θ_y may be written as follows:

θ_y = Pr_{(b,x)~I_y}[b = 1] - Pr_{(b,x)~I_y}[b = 0] = E_{(b,x)~I_y}[b] - (1 - E_{(b,x)~I_y}[b]) = 2 · E_{(b,x)~I_y}[b] - 1.

Our protocol for statistical distance is based on the relation between statistical distance and statistics of the above quantity θ_y, which we proved in Proposition 2.2.3. Specifically, for any pair of circuits C_0 and C_1,

SD(C_0, C_1) = E_{y~C}[|θ_y|].    (3.32)

³⁰The output distribution of C is exactly (C_0 + C_1)/2, the distribution from which the random variable Y from previous sections was drawn. In this section it will be convenient to explicitly describe this distribution as the output distribution of a circuit.
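Eq. (3.32) can be checked numerically on toy truth-table "circuits"; the tables below are arbitrary illustrative choices, and exact rational arithmetic makes the two sides agree exactly:

```python
from fractions import Fraction

# Numerical check of Eq. (3.32): SD(C0, C1) computed from its definition
# equals E_{y~C}[|theta_y|]. The truth tables are illustrative.

m = 3
C0 = {x: x % 2 for x in range(2 ** m)}
C1 = {x: 1 if x < 3 else 0 for x in range(2 ** m)}

w = Fraction(1, 2 ** m)  # weight of each input under the uniform distribution
p0 = {y: sum(w for x in C0 if C0[x] == y) for y in (0, 1)}
p1 = {y: sum(w for x in C1 if C1[x] == y) for y in (0, 1)}

# Direct definition: SD = (1/2) * sum_y |p0(y) - p1(y)|.
sd_direct = sum(abs(p0[y] - p1[y]) for y in (0, 1)) / 2

# Via Eq. (3.32): y ~ C has Pr[y] = (p0(y)+p1(y))/2, and
# theta_y = Pr[b=1|y] - Pr[b=0|y] = (p1(y)-p0(y)) / (p1(y)+p0(y)).
sd_via_theta = sum((p0[y] + p1[y]) / 2 * abs((p1[y] - p0[y]) / (p1[y] + p0[y]))
                   for y in (0, 1) if p0[y] + p1[y] > 0)

assert sd_direct == sd_via_theta
```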

In our protocol, the verifier will estimate this expectation by picking several y's at random from the distribution of C and asking the prover for the values of the corresponding θ_y's. The prover is then asked to prove that the values of the θ_y's that it provided are (approximately) correct. In order to do this, we make use of the following formulation of "typical sets", which will provide us with a measure that distinguishes values close to θ_y from those far from it. For any y in the co-domain of C, any θ ∈ [-1, 1] and δ ∈ [0, 1], and any k ∈ N, define the set T_y^{θ,δ,k} as:

T_y^{θ,δ,k} = {(b_1, x_1, …, b_k, x_k) : C_{b_i}(x_i) = y for all i, and 2 · (1/k) Σ_{i=1}^k b_i - 1 ∈ [θ - δ, θ + δ]}.

We refer to the set T_y^{θ_y,δ,k} (where θ is set to be θ_y) as the typical set corresponding to y, and to sets T_y^{θ,δ,k} for values of θ close to θ_y as "nearly typical sets". We claim that, after sufficient repetition (that is, for large enough k), any nearly typical set contains a large fraction of the pre-images of y.

Proposition 3.5.1. Consider any y ∈ Supp(C) and θ ∈ [-1, 1], and let ε = |θ - θ_y|. For any δ > ε and k ∈ N, the following holds:

|T_y^{θ,δ,k}| / |I_y|^k ≥ 1 - 2e^{-k(δ-ε)²/2}.

This proposition follows from concentration of measure. We relegate the formal proofs of this and later propositions to the end of this section. Next, we claim that if θ is far from θ_y, then the set T_y^{θ,δ,k} contains very few pre-images of y under C.

Proposition 3.5.2. Consider any y ∈ Supp(C) and θ ∈ [-1, 1], and let ε = |θ - θ_y|. For any δ < ε and k ∈ N, the following holds:

|T_y^{θ,δ,k}| / |I_y|^k ≤ e^{-k(ε-δ)²/2}.
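Both propositions can be checked numerically: the fraction |T_y^{θ,δ,k}| / |I_y|^k equals the probability that the mean of k i.i.d. bits with Pr[b = 1] = (1 + θ_y)/2 lands in the stated interval, which is an exact binomial computation. The parameters below are arbitrary illustrative choices:

```python
from math import comb, exp

# Exact computation of |T_y^{theta,delta,k}| / |I_y|^k via the binomial
# distribution, checked against the bounds of Propositions 3.5.1 and 3.5.2.

def typical_fraction(theta_y, theta, delta, k):
    p = (1 + theta_y) / 2  # Pr[b = 1] under a uniform pre-image of y
    return sum(comb(k, j) * p ** j * (1 - p) ** (k - j)
               for j in range(k + 1)
               if theta - delta <= 2 * j / k - 1 <= theta + delta)

theta_y, k = 0.2, 500

# Proposition 3.5.1: theta close to theta_y (delta > eps).
theta, delta = 0.25, 0.15
eps = abs(theta - theta_y)
assert typical_fraction(theta_y, theta, delta, k) >= 1 - 2 * exp(-k * (delta - eps) ** 2 / 2)

# Proposition 3.5.2: theta far from theta_y (delta < eps).
theta, delta = 0.7, 0.1
eps = abs(theta - theta_y)
assert typical_fraction(theta_y, theta, delta, k) <= exp(-k * (eps - delta) ** 2 / 2)
```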

Thus, in order for the prover to prove that a value θ is approximately equal to θ_y for a given y, all it has to do is show that the set T_y^{θ,δ,k} is large (relative to |I_y|^k). In order to do this, we will be using the set lower-bound protocol of Goldwasser and Sipser [GS89] (with the tighter analysis of Aiello and Håstad [AH91]).

Lemma 3.5.3 ([AH91], Lemma 4.1). There is a constant-round public-coin interactive protocol that, for any set S ⊆ {0,1}^n such that membership in S can be computed in time t and for any b ∈ N, proves the statement "|S| ≥ 2^b". More precisely, the protocol satisfies the following properties for every such set S:

• Completeness: When interacting with the honest prover, the verifier accepts with probability at least 1 - 2^b/|S|.

• Soundness: When interacting with any (possibly cheating) prover, the verifier accepts with probability at most |S|/2^b.

• Efficiency: The verifier runs in time poly(n, t).

Finally, note that the size of the set T_y^{θ,δ,k} has to be shown to be large relative to |I_y|^k. Hence, in order to be able to use the above set lower-bound protocol for this purpose, the verifier needs to know the value of |I_y|^k, which it may not be able to compute efficiently.³¹ However, as a random variable (when y is drawn from C), the quantity |I_y| behaves predictably. Let B, X and Y denote the random variables induced by drawing a bit b and a string x ∈ {0,1}^m uniformly at random and computing y as C(b, x). By the definition of conditional entropy, we can write the expectation of log|I_Y| as follows:

E_{y~Y}[log|I_y|] = H(B, X | Y).

So if we pick several y's independently, the mean of the log|I_y|'s will be concentrated around H(B, X|Y). We state this in a more convenient form as follows.

Proposition 3.5.4. For any t ∈ N, suppose y_1, …, y_t are independently sampled as C(b_i, x_i) for b_i and x_i chosen uniformly at random. Let I = I_{y_1} × ⋯ × I_{y_t}. Then, for any η ≥ 0,

Pr[|I| ∈ [2^{t(H(B,X|Y)-η)}, 2^{t(H(B,X|Y)+η)}]] ≥ 1 - 2e^{-2tη²/(m+1)²}.

Thus, if the verifier can estimate H(B, X|Y) reliably, we can implement the earlier lower-bound protocol collectively for all the sampled y_i's without knowing any of the individual |I_{y_i}|'s, confident that the product of all of them is approximately 2^{t·H(B,X|Y)}. We will use the prover to enable the verifier to perform this estimation, using the fact that the problem of approximating the entropy of the output of a given circuit is in NISZK [GSV99], and hence has a constant-round interactive protocol. While (B, X|Y) is not the output distribution of any circuit, computing this conditional entropy reduces to computing the entropy of just Y, by the following calculation:

H(B, X|Y) = H(B, X, Y) - H(Y) = H(B, X) - H(Y) = (m + 1) - H(Y),

where the first equality follows from the chain rule for Shannon entropy (Fact 2.2.7), and the second from the fact that Y is a deterministic function of (B, X). Noting that

³¹An alternative way to deal with this would be to ask the prover to prove to the verifier that |I_y| is of a certain size using set lower-bound and upper-bound protocols (an example of the latter may be found in [For89]). However, it is unclear how to perform the upper-bound protocol with sufficiently small soundness error, given the inability of the verifier to sample at will several random elements from I_y.

Y is the random variable corresponding to the output of the circuit C, the following lemma is now implied by the results of Goldreich, Sahai and Vadhan [GSV99].
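The conditional-entropy identity above is easy to verify numerically on toy truth tables. In the illustrative check below, H(B, X|Y) is computed from the definition as Σ_y Pr[y] · log₂|I_y| (valid because (B, X) is uniform on I_y given Y = y):

```python
import random
from math import log2

# Numerical check of H(B, X | Y) = (m+1) - H(Y) on random toy truth tables.

rng = random.Random(7)
m = 4
C = [{x: rng.randrange(4) for x in range(2 ** m)} for _ in range(2)]  # C_0, C_1

# (B, X) is uniform over 2^(m+1) points; Y = C_B(X).
pr_y = {}
for b in (0, 1):
    for x in range(2 ** m):
        y = C[b][x]
        pr_y[y] = pr_y.get(y, 0) + 1 / 2 ** (m + 1)

h_y = -sum(p * log2(p) for p in pr_y.values())

# H(B, X | Y) = sum_y Pr[y] * log2 |I_y|.
size_iy = {y: 0 for y in pr_y}
for b in (0, 1):
    for x in range(2 ** m):
        size_iy[C[b][x]] += 1
h_bx_given_y = sum(pr_y[y] * log2(size_iy[y]) for y in pr_y)

assert abs(h_bx_given_y - ((m + 1) - h_y)) < 1e-9
```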

Lemma 3.5.5. There is a constant-round public-coin interactive protocol that takes as input a tuple of the form (C, h, γ), where C is a circuit, h ∈ R is an entropy estimate, and γ > 0 is a gap parameter, and has the following properties:

• Completeness: If H(C) ∈ [h - γ, h + γ], the verifier accepts with probability at least 0.9 when interacting with the honest prover.

• Soundness: If H(C) ∉ [h - 3γ, h + 3γ], then for every (possibly cheating) prover, the verifier accepts with probability at most 0.1.

• Efficiency: The verifier runs in time poly(|C|, 1/γ).

The entirety of our protocol is described formally as Protocol 3.5.1; Lemma 3.5.6 states its properties and immediately implies Theorem 3.1.9.

Lemma 3.5.6. There is a constant-round public-coin interactive protocol that, given as input a pair of circuits (C_0, C_1), a claim Δ ∈ [0, 1] for their statistical distance, and a tolerance δ ∈ [0, 1], satisfies the following properties:

• Completeness: If |SD(C_0, C_1) - Δ| ≤ δ, the verifier accepts with probability at least 2/3 when interacting with the honest prover.

• Soundness: If |SD(C_0, C_1) - Δ| > 3δ, when interacting with any (possibly cheating) prover, the verifier accepts with probability at most 1/3.

• Efficiency: The verifier runs in time poly(|C_0|, |C_1|, 1/δ).

That is, the protocol proves that SD(C_0, C_1) is within an additive error δ of Δ, failing with probability at most 1/3. Setting the tolerance factor δ in the above protocol to be a third of the one from Theorem 3.1.9 proves the latter. We now prove Lemma 3.5.6 by the approach outlined so far.

Proof of Lemma 3.5.6. We show that Protocol 3.5.1 satisfies the properties required by the lemma.

Completeness. Fix an input (C_0, C_1, Δ, δ), and suppose that |SD(C_0, C_1) - Δ| ≤ δ. We now show that the prover makes the verifier accept with high probability. First, if the prover computes H̃ correctly, then |H̃ - H(C)| ≤ η by the precision of our chosen representation of numbers. So by Lemma 3.5.5, the execution of the subprotocol Π_ent accepts except with probability 0.1. Next, by Lemma 3.5.3 the subprotocol Π_lb accepts with probability at least 1 - 2^{tk(Ĥ-2η)}/|T|. So if |T| is more than 2^{tk(Ĥ-3η/2)}, then Π_lb accepts with probability more than 1 - 2^{-tkη/2} ≥ 1 - 1/20, where we use that t ≥ 1 and kη ≥ 1/η ≥ 200. We claim that, if the θ̂_i's are reported correctly, this happens with high probability. Recall that Ĥ = (m + 1) - H̃ is supposed to be roughly H(B, X|Y).

Input: Pair of circuits C_0, C_1 : {0,1}^m → {0,1}^n, claimed distance Δ, and tolerance δ.

Set η = δ²/200, t = ⌈8 ln(40)(m + 1)²/η²⌉, and k = ⌈4 ln(2t)/η²⌉. All numerical quantities in the protocol will be specified to within an additive error of η; writing down any q ∈ R like this uses log⌈|q|⌉ + O(log(1/η)) bits. We denote the result of rounding the quantity q in this manner by ⌊q⌉. We will be using the following sub-protocols:

• The entropy approximation protocol from Lemma 3.5.5, denoted by Π_ent, which takes input of the form (C, h, γ).

• The set lower-bound protocol from Lemma 3.5.3, denoted by Π_lb, which takes input of the form (S, b).

1. The prover computes H̃ ← ⌊H(C)⌉ and sends it to the verifier.

2. The prover and verifier run Π_ent(C, H̃, η).

3. For every i ∈ [t], the verifier draws b_i ~ {0,1} and x_i ~ {0,1}^m, sets y_i ← C(b_i, x_i), and sends y_1, …, y_t to the prover.

4. For every i ∈ [t], the prover computes θ̂_i ← ⌊θ_{y_i}⌉ and sends θ̂_1, …, θ̂_t to the verifier.

5. The prover and verifier set T = T_{y_1}^{θ̂_1,2η,k} × ⋯ × T_{y_t}^{θ̂_t,2η,k} and Ĥ = (m + 1) - H̃, and run Π_lb(T, tk(Ĥ - 2η)).

6. The verifier accepts if and only if both Π_ent and Π_lb accept and |Δ - (1/t) Σ_{i=1}^t |θ̂_i|| ≤ 2δ.

Protocol 3.5.1: Estimating Statistical Distance

Claim 3.5.1. Suppose that for all i ∈ [t] we have θ̂_i = ⌊θ_{y_i}⌉, and that H̃ = ⌊H(C)⌉. Then,

Pr[|T| < 2^{tk(Ĥ-3η/2)}] ≤ 1/20.

Proof of Claim 3.5.1. This follows from Proposition 3.5.1 and Proposition 3.5.4. Recall that T is defined as T_{y_1}^{θ̂_1,2η,k} × ⋯ × T_{y_t}^{θ̂_t,2η,k}, that for each i we assume that |θ̂_i - θ_{y_i}| ≤ η, that t ≥ 1, and that k ≥ 4 ln(2t)/η². By Proposition 3.5.1, we have:

|T| / ∏_{i=1}^t |I_{y_i}|^k ≥ ∏_{i=1}^t (1 - 2e^{-k(2η-η)²/2}) ≥ 1 - 2te^{-kη²/2} ≥ 1 - 2te^{-2 ln(2t)} ≥ 3/4 ≥ 2^{-tkη/4}.    (3.33)

Let I = I_{y_1} × ⋯ × I_{y_t}. Proposition 3.5.4 now lets us approximate |I|^k as follows.

Pr[|I|^k ≥ 2^{tk(Ĥ-5η/4)}] ≥ Pr[|I|^k ≥ 2^{tk(H(B,X|Y)-η/4)}]
= Pr[|I| ≥ 2^{t(H(B,X|Y)-η/4)}]
≥ 1 - 2e^{-tη²/8(m+1)²}
≥ 1 - 1/20,    (3.34)

where the first inequality follows from the precision of H̃ (since H̃ = ⌊H(C)⌉, we have that |Ĥ - H(B, X|Y)| is at most η), the second from Proposition 3.5.4, and the last from the chosen setting of t. Eqs. (3.33) and (3.34) together imply the claim. □

Claim 3.5.1, along with the argument just preceding it, now implies that the probability that the execution of Π_lb rejects is at most 1/20 + 1/20 = 1/10. Third, we need to argue that the final check of the verifier passes. As the prover is honest, the mean Σ_i|θ̂_i|/t is within η = δ²/200 of Σ_i|θ_{y_i}|/t. By Eq. (3.32), E_{y_i~C}[|θ_{y_i}|] = SD(C_0, C_1) for each i. And by our hypothesis, |Δ - SD(C_0, C_1)| ≤ δ. Using these facts and the Chernoff-Hoeffding bound (Fact 2.2.17), noting that the y_i's sent by the verifier are indeed sampled from C, we have:

Pr[|Δ - (1/t) Σ_{i=1}^t |θ̂_i|| > 2δ] ≤ Pr[|SD(C_0, C_1) - (1/t) Σ_{i=1}^t |θ_{y_i}|| > δ/2] ≤ 0.1,    (3.35)

where the last inequality follows from the choice of t.

where the last inequality follows from the choice of t. Thus, we have shown that if this prover strategy is used, then the execution of Hent rejects with probability at most 0.1, that of Hlb with probability at most 0.1, and the final check of the verifier fails with probability at most 0.1. By the union bound, the whole protocol rejects with probability less than 1/3, proving completeness.

Soundness. Now suppose that |SD(C_0, C_1) - Δ| > 3δ. First, consider the case where the H̃ sent by the prover is such that |H̃ - H(C)| > 3η. In this case, by Lemma 3.5.5, the probability that Π_ent(C, H̃, η) accepts is at most 0.1. Next, consider the case where |H̃ - H(C)| ≤ 3η, but the estimate Σ_i|θ̂_i|/t differs

substantially from Σ_i|θ_{y_i}|/t (because the prover reported wrong values for the θ̂_i's). In this case, we would like to show that the execution of Π_lb rejects with high probability. Specifically, suppose we have:

|(1/t) Σ_i |θ̂_i| - (1/t) Σ_i |θ_{y_i}|| > δ/2.

This implies the following:

Σ_i |θ̂_i - θ_{y_i}| ≥ Σ_i ||θ̂_i| - |θ_{y_i}|| > δt/2.    (3.36)

By Lemma 3.5.3, Π_lb accepts with probability at most |T|/2^{tk(Ĥ-2η)}. Recall that T = T_{y_1}^{θ̂_1,2η,k} × ⋯ × T_{y_t}^{θ̂_t,2η,k}, and let I = I_{y_1} × ⋯ × I_{y_t}. We show that condition (3.36) implies that T is much smaller than |I|^k; this is because (3.36) implies that a number of the θ̂_i's are far from the corresponding θ_{y_i}'s, and thus the sets formed using them are not typical.

Claim 3.5.2. If Σ_i |θ̂_i - θ_{y_i}| > δt/2, then:

|T| / |I|^k ≤ e^{-2tkδ²/25}.

Proof of Claim 3.5.2. For each i ∈ [t] such that |θ̂_i - θ_{y_i}| > 2η, Proposition 3.5.2 implies that:

|T_{y_i}^{θ̂_i,2η,k}| / |I_{y_i}|^k ≤ e^{-k(|θ̂_i - θ_{y_i}| - 2η)²/2}.

Noting that |T_{y_i}^{θ̂_i,2η,k}| is at most as large as |I_{y_i}|^k, we multiply this bound by a correcting factor of e^{k(2η)²/2} to handle the case of |θ̂_i - θ_{y_i}| ≤ 2η, so that the following holds for all i ∈ [t]:

|T_{y_i}^{θ̂_i,2η,k}| / |I_{y_i}|^k ≤ e^{-k(|θ̂_i - θ_{y_i}| - 2η)²/2} · e^{k(2η)²/2}.

Taking the product of this over all i gives us:

|T| / |I|^k ≤ exp(-k Σ_i (θ̂_i - θ_{y_i})²/2 + 2kη Σ_i |θ̂_i - θ_{y_i}|).

By the Cauchy-Schwarz inequality and (3.36), we have Σ_i (θ̂_i - θ_{y_i})² > δ²t/4. Further, we know that Σ_i |θ̂_i - θ_{y_i}| ≤ 2t (since θ̂_i, θ_{y_i} ∈ [-1, 1]). Together with the fact that η = δ²/200, the above expression gives us:

|T| / |I|^k ≤ e^{-kδ²t/8 + kδ²t/50} ≤ e^{-2tkδ²/25}. □

Under the other part of our current hypothesis, namely that |H̃ - H(C)| ≤ 3η (which implies that |Ĥ - H(B, X|Y)| ≤ 3η), we show that |I| is, most of the time, not very large.

Claim 3.5.3. Suppose that |Ĥ - H(B, X|Y)| ≤ 3η. Then,

Pr[|I| ≥ 2^{t(Ĥ+7η/2)}] ≤ 1/20.

Proof of Claim 3.5.3. This follows from Proposition 3.5.4. We have:

Pr[|I| ≥ 2^{t(Ĥ+7η/2)}] ≤ Pr[|I| ≥ 2^{t(H(B,X|Y)+η/2)}] ≤ 2e^{-tη²/2(m+1)²} ≤ 1/20,

where the first inequality is from the accuracy of Ĥ, the second from Proposition 3.5.4, and the last from the value of t. □

Together, Claims 3.5.2 and 3.5.3 imply the following. If |H̃ - H(C)| ≤ 3η then, except with probability 1/20 over the choice of the y_i's, unless the estimate Σ_i|θ̂_i|/t is within δ/2 of Σ_i|θ_{y_i}|/t, the subprotocol Π_lb accepts with probability at most:

|T| / 2^{tk(Ĥ-2η)} ≤ e^{-2tkδ²/25} · |I|^k / 2^{tk(Ĥ-2η)} ≤ e^{-2tkδ²/25} · 2^{tk(Ĥ+7η/2)} / 2^{tk(Ĥ-2η)} ≤ 1/20,

where the first inequality follows from Claim 3.5.2, the second from our conditioning on the event from Claim 3.5.3 not happening, and the last from the values of the quantities involved. The case we are left with is where Σ_i|θ̂_i|/t is within δ/2 of Σ_i|θ_{y_i}|/t. The final check by the verifier is whether Σ_i|θ̂_i|/t is within 2δ of Δ. For this to happen, as |Δ - SD(C_0, C_1)| > 3δ, the mean Σ_i|θ_{y_i}|/t has to be more than δ/2 away from SD(C_0, C_1). For random y_i's drawn from C, as calculated in Eq. (3.35) in the proof of completeness, this happens with probability less than 0.1. We summarize the argument for soundness as follows:

1. If the prover sends an H̃ that is 3η-far from H(C), the verifier rejects except with probability at most 0.1.

2. Otherwise, except with probability at most 1/20 over the choice of the y_i's, unless Σ_i|θ̂_i|/t is within δ/2 of Σ_i|θ_{y_i}|/t, the verifier rejects except with probability at most 1/20.

3. Also, except with probability at most 0.1 over the choice of the y_i's, if Σ_i|θ̂_i|/t is within δ/2 of Σ_i|θ_{y_i}|/t, the verifier rejects.

Thus, the total probability of the verifier accepting is at most 0.1 + 1/20 + 1/20 + 0.1 < 1/3, as required.

Efficiency. The running time of the verifier is the sum of those of the verifiers in the calls to Π_ent and Π_lb, and poly(t, 1/δ) (for sampling and the final check). Membership in the set T can be verified using its definition in time poly(|C|, k, t). The entire running time may now be verified to be poly(|C_0|, |C_1|, 1/δ), as required. While the protocol as written is private-coin, note that, since Π_ent and Π_lb are both public-coin, the only instance where the verifier's coins are not sent over is when it sends the y_i's to the prover. This is remedied by having the verifier send the (b_i, x_i)'s to the prover instead, noting that this does not affect the soundness of the protocol. □
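As a sanity check on completeness, the sampling phase and the verifier's final check can be simulated with an honest prover on toy truth-table circuits. The sub-protocols Π_ent and Π_lb are elided, and everything below is an illustrative sketch with arbitrary parameters:

```python
import random

# Toy simulation of steps 3-4 and the final check of Protocol 3.5.1 with an
# honest prover: the verifier samples y_i ~ C, the prover reports the exact
# theta_{y_i}, and the verifier checks |claim - mean |theta_i|| <= 2*tolerance.

rng = random.Random(3)
m = 4
C0 = {x: x % 4 for x in range(2 ** m)}
C1 = {x: x % 4 if x % 4 < 3 else 0 for x in range(2 ** m)}

def theta(y):
    """Exact theta_y = Pr[b=1 | I_y] - Pr[b=0 | I_y]."""
    n1 = sum(1 for x in C1 if C1[x] == y)
    n0 = sum(1 for x in C0 if C0[x] == y)
    return (n1 - n0) / (n1 + n0)

sd = sum(abs(sum(1 for x in C0 if C0[x] == y) - sum(1 for x in C1 if C1[x] == y))
         for y in range(4)) / (2 * 2 ** m)  # true statistical distance (= 1/4 here)

delta_claim, tol, t = sd, 0.1, 2000  # honest claim, tolerance, sample count
ys = []
for _ in range(t):
    b, x = rng.randrange(2), rng.randrange(2 ** m)
    ys.append((C0 if b == 0 else C1)[x])          # y_i ~ C
estimate = sum(abs(theta(y)) for y in ys) / t     # honest prover's reports

assert abs(delta_claim - estimate) <= 2 * tol     # the final check passes
```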

3.5.1 Proofs of Intermediates

We complete this section by proving the intermediate propositions and lemmas used in the proof of Lemma 3.5.6.

Proof of Proposition 3.5.1. The proposition is proven using the additive Chernoff bound. Under the uniform distribution over I_y, we have Pr_{(b,x)~I_y}[b = 1] = (1 + θ_y)/2. So if we take many independent samples (b, x) from this distribution and look at the empirical mean of the b's, it will be concentrated around (1 + θ_y)/2. Then, if θ is close to θ_y, this empirical mean is, with good probability, close to θ as well. We abuse notation slightly and write b^k ~ I_y^k to indicate the vector (b_1, …, b_k) obtained by sampling (b_i, x_i)_{i∈[k]} uniformly and independently from I_y and dropping the x's. Observe that:

|T_y^{θ,δ,k}| / |I_y|^k = Pr_{b^k~I_y^k}[2 · (1/k) Σ_i b_i - 1 ∈ [θ - δ, θ + δ]]
≥ Pr_{b^k~I_y^k}[(1/k) Σ_i b_i ∈ [(1 + θ_y)/2 - (δ - ε)/2, (1 + θ_y)/2 + (δ - ε)/2]]
≥ 1 - 2e^{-k(δ-ε)²/2},

where the first inequality follows from the observation that an interval of radius δ around θ contains an interval of radius (δ - ε) around θ_y, and the second inequality follows from the Chernoff-Hoeffding bound (Fact 2.2.17). □

Proof of Proposition 3.5.2. The proof of this proposition also uses the Chernoff-Hoeffding bound. The central idea here is that the claimed typical set is almost disjoint from the actual typical set, and most probability mass lies inside the actual typical set. Consider the case where θ_y > θ; the argument for the other case is identical. We abuse notation slightly and write b^k ~ I_y^k to indicate the vector (b_1, …, b_k) obtained by sampling (b_i, x_i)_{i∈[k]} uniformly and independently from I_y and dropping the x's. Let b̄ = (1/k) Σ_i b_i. Observe that:

|T_y^{θ,δ,k}| / |I_y|^k = Pr_{b^k~I_y^k}[2b̄ - 1 ∈ [θ - δ, θ + δ]]
≤ Pr_{b^k~I_y^k}[b̄ ≤ (1 + θ + δ)/2]
= Pr_{b^k~I_y^k}[b̄ ≤ (1 + θ_y)/2 - (ε - δ)/2]
≤ e^{-k(ε-δ)²/2},

where the first equality follows from the definition of the distribution, the second equality follows from the definition of ε and the assumption that θ_y > θ, and the final inequality follows from the Chernoff-Hoeffding bound (Fact 2.2.17). □

Proof of Proposition 3.5.4. We prove this again using the Chernoff-Hoeffding bound, on independent instances of the random variable log|I_Y|, which is contained in [0, m + 1]. This is done as follows:

Pr[|I| ∈ [2^{t(H(B,X|Y)-η)}, 2^{t(H(B,X|Y)+η)}]] = Pr[(log|I|)/t ∈ [H(B, X|Y) - η, H(B, X|Y) + η]].

We now note that (log|I|)/t is simply the mean of several i.i.d. variables:

(log|I|)/t = (1/t) Σ_{i=1}^t log|I_{y_i}|,

where for each i we know that log|I_{y_i}| is contained in [0, m + 1] (as I_{y_i} is a set of pairs (b, x) where b is a bit and x is of length m), and its expectation is H(B, X|Y). Applying the Chernoff-Hoeffding bound (Fact 2.2.17), after scaling the values within the probability expression down by (m + 1), now gives us what we want:

Pr[(log|I|)/t ∈ [H(B, X|Y) - η, H(B, X|Y) + η]] ≥ 1 - 2e^{-2tη²/(m+1)²}. □

Proof of Lemma 3.5.5. The protocol is a straightforward combination of the NISZK and coNISZK protocols for the Entropy Approximation problem and its complement from Goldreich et al. [GSV99]. We make use of the following lemma from their work (with the inequalities made non-strict for ease of use).

Lemma 3.5.7 ([GSV99, Lemma 3.2]). There is a polynomial-time computable function that takes input (1^s, C, h), where C is a circuit, h ∈ ℝ⁺, and s ∈ ℕ, and produces a circuit C′ that outputs ℓ bits such that:

1. If H(C) ≥ h + 1, then SD(C′, U_ℓ) ≤ 2^{−s}, where U_ℓ is the uniform distribution on {0,1}^ℓ; and

2. If H(C) ≤ h − 1, then |Supp(C′)| ≤ 2^{ℓ−s}.

We start with a constant-round private-coin protocol to approximate entropy that works as follows given input (C, k, γ):

1. Let s = 3, let t = ⌈1/γ⌉, and let C^t denote the circuit obtained by concatenating t copies of C.

2. The verifier invokes the function from Lemma 3.5.7 with the input (1^s, C^t, tk + 2) to get a circuit C_upper that outputs ℓ_upper bits.

3. The verifier picks a random bit b. If b = 0, it samples a random output of C_upper and sends it to the prover. Else it samples a uniform string from {0,1}^{ℓ_upper} and sends it to the prover.

4. The prover responds with a bit b′. If b′ ≠ b, the verifier rejects.

5. The verifier then invokes the function from Lemma 3.5.7 with the input (1^s, C^t, tk − 2) to get a circuit C_lower that takes n_lower bits as input and outputs ℓ_lower bits.

6. The verifier picks a uniformly random string y from {0,1}^{ℓ_lower} and sends it to the prover.

7. The prover responds with a string x ∈ {0,1}^{n_lower}.

8. If C_lower(x) = y, the verifier accepts. Else it rejects.

That the running time of the verifier above is poly(|C|, 1/γ) may be verified by inspection, noting the efficient computability of the function from Lemma 3.5.7. To show completeness, suppose H(C) ∈ [k − γ, k + γ]. For the sake of simplicity, suppose that 1/γ ∈ ℕ (so that t = 1/γ); the arguments for the more general case are identical. Then, H(C^t) ∈ [t(k − γ), t(k + γ)] = [tk − 1, tk + 1]. By Lemma 3.5.7, this implies two things for the circuits constructed in our protocol:

|Supp(C_upper)| / 2^{ℓ_upper} ≤ 1/8 and SD(C_lower, U_{ℓ_lower}) ≤ 1/8.

Together with Proposition 2.2.1, these properties imply, respectively, that:

SD(C_upper, U_{ℓ_upper}) ≥ 7/8,    (3.37)

|Supp(C_lower)| / 2^{ℓ_lower} ≥ 7/8.    (3.38)

Eq. (3.37) and Proposition 2.2.4 imply that there is a prover strategy (specifically, to send the maximum-likelihood bit) such that the probability that b′ ≠ b in step 4 of our protocol is at most 1/2 − (7/8)/2 = 1/16. And Eq. (3.38) implies that the probability that the prover is not able to produce a valid x in step 7 is at most 1/8. Thus, the total probability that the verifier rejects is less than 3/16.

To show soundness, first consider the case where H(C) ≥ k + 3γ. This implies that H(C^t) is at least tk + 3. By the guarantees of Lemma 3.5.7, this means that SD(C_upper, U_{ℓ_upper}) ≤ 1/8, which implies, together with Proposition 2.2.4, that the probability that the verifier does not reject in step 4 is at most 1/2 + (1/8)/2 = 9/16. On the other hand, if H(C) ≤ k − 3γ, then H(C^t) is at most tk − 3. In this case, Lemma 3.5.7 implies that |Supp(C_lower)| / 2^{ℓ_lower} ≤ 1/8. Thus, the probability that the prover is able to produce a valid pre-image x in step 7 is at most 1/8.

Overall, we have a protocol with completeness at least 1 − 3/16 = 13/16 and soundness error at most 9/16. By repetition in parallel, with an O(1) blowup in the complexity of the verifier, both completeness and soundness error can be made smaller than 0.1, giving the desired parameters. And finally, it can be made public-coin, while retaining a constant number of rounds and an efficient verifier, using the results of Goldwasser and Sipser [GS89].
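The parallel-repetition step can be checked numerically. A minimal sketch, assuming an illustrative threshold rule (the repetition count r = 200 and the 11/16 acceptance threshold are choices made here, not taken from the text) that separates completeness 13/16 from soundness 9/16:

```python
from math import comb

def accept_prob(p, r, threshold):
    """Probability that at least `threshold` of r independent executions
    accept, when each accepts independently with probability p."""
    return sum(comb(r, i) * p ** i * (1 - p) ** (r - i)
               for i in range(threshold, r + 1))

r = 200                                # number of parallel repetitions
threshold = int(11 / 16 * r) + 1       # accept iff > 11/16 of runs accept
completeness = accept_prob(13 / 16, r, threshold)
soundness = accept_prob(9 / 16, r, threshold)
print(round(completeness, 4), round(soundness, 4))
```

With these (hypothetical) parameters, the binomial tails already push both error probabilities well below 0.1, matching the qualitative claim above.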

3.6 Triangular Discrimination Inequalities

Proposition 3.6.1 ([Top00]). For distributions P, Q it holds that:

SD(P, Q)² ≤ TD(P, Q) ≤ SD(P, Q).

Proof. Let Y be the union of the supports of P and Q. Observe that,

TD(P, Q) = (1/2) Σ_{y∈Y} (P_y − Q_y)² / (P_y + Q_y) = E_{y ∼ (½P + ½Q)} [((P_y − Q_y)/(P_y + Q_y))²]

and

SD(P, Q) = (1/2) Σ_{y∈Y} |P_y − Q_y| = E_{y ∼ (½P + ½Q)} [|P_y − Q_y| / (P_y + Q_y)].

It follows that,

TD(P, Q) − SD(P, Q)² = E_{y ∼ (½P + ½Q)} [((P_y − Q_y)/(P_y + Q_y))²] − (E_{y ∼ (½P + ½Q)} [|P_y − Q_y|/(P_y + Q_y)])²

= Var_{y ∼ (½P + ½Q)} (|P_y − Q_y|/(P_y + Q_y)) ≥ 0.

Hence, SD(P, Q)² ≤ TD(P, Q) follows from the non-negativity of variance. That TD(P, Q) ≤ SD(P, Q) follows from the fact that |P_y − Q_y|/(P_y + Q_y) ≤ 1, and hence the square of this ratio is at most the ratio itself. □
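The two inequalities are easy to sanity-check numerically. A minimal sketch (helper names are hypothetical; TD uses the same 1/2-normalization as in the proof above):

```python
import random

def sd(p, q):
    """Statistical distance: (1/2) * sum over y of |P_y - Q_y|."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(y, 0.0) - q.get(y, 0.0)) for y in support)

def td(p, q):
    """Triangular discrimination, normalized as in the proof:
    (1/2) * sum over y of (P_y - Q_y)^2 / (P_y + Q_y)."""
    total = 0.0
    for y in set(p) | set(q):
        py, qy = p.get(y, 0.0), q.get(y, 0.0)
        if py + qy > 0:
            total += (py - qy) ** 2 / (py + qy)
    return 0.5 * total

def random_dist(size, rng):
    w = [rng.random() for _ in range(size)]
    s = sum(w)
    return {i: wi / s for i, wi in enumerate(w)}

rng = random.Random(0)
for _ in range(1000):
    p, q = random_dist(8, rng), random_dist(8, rng)
    assert sd(p, q) ** 2 <= td(p, q) + 1e-12 <= sd(p, q) + 1e-12
print("SD^2 <= TD <= SD held on all sampled pairs")
```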

Chapter 4

Multi-Collision Resistant Hash Functions

In this chapter, we study multi-collision resistant hash functions. We show how to construct such functions based on the hardness of a problem related to the ENTROPY APPROXIMATION PROBLEM, an NISZK-complete problem. We also show how to use such functions to construct an important cryptographic primitive called Constant-Round Statistically Hiding Commitment.

This chapter is based on [BDRV17].1

4.1 Overview

Hash functions are efficiently computable functions that shrink their input and mimic 'random functions' in various aspects. They are prevalent in cryptography, both in theory and in practice. A central goal in the study of the foundations of cryptography has been to distill the precise, and minimal, security requirements necessary from hash functions for different applications. One widely studied notion of hashing is that of collision resistant hash functions (CRH). Namely, hash functions for which it is computationally infeasible to find two strings that hash to the same value, even when such collisions are abundant. CRH have been extremely fruitful and have notable applications in cryptography such as digital signatures² [GMR88], efficient argument systems for NP [Kil92, Mic00] and (constant-round) statistically hiding commitment schemes [NY89, DPP93, HM96]. In this chapter we study a natural relaxation of collision resistance. Specifically, we consider hash functions for which it is infeasible to find a t-way collision: i.e., t strings that all have the same hash value. Here t is a parameter, where the standard notion of collision resistance corresponds to the special case of t = 2. We refer to such

¹An extended abstract version of [BDRV17] was published at EUROCRYPT 2018 [BDRV18] (© IACR 2018, 10.1007/978-3-319-78375-85). [BDRV17] is the full version of that work.
²We remark that the weaker notion of universal one-way hash functions (UOWHF) (which is known to be implied by standard one-way functions) suffices for this application [NY89, Rom90].

functions as multi-collision resistant hash functions (MCRH) and emphasize that, for t > 2, it is a weaker requirement than standard collision resistance. The property of multi-collision resistance was first considered by Merkle [Mer89] in analyzing a hash function construction based on DES. The notion has also been considered in the context of identification schemes [GS94], micro-payments [RS96], and signature schemes [BPVY00]. Joux [Jou04] showed that for iterated hash functions, finding a large number of collisions is no harder than finding pairs of highly structured colliding inputs (namely, collisions that share the same prefix). We emphasize that Joux's multi-collision finding attack only applies to certain types of hash functions (e.g., iterated hash functions, or tree hashing) and requires a strong break of collision resistance. In general, it seems that MCRH is a weaker property than CRH. As in the case of CRH, to obtain a meaningful definition, we must consider keyed functions (since for non-keyed functions there are trivial non-uniform attacks). Thus, we define MCRH as follows (here and throughout this chapter, we use n to denote the security parameter).

Definition 4.1.1 ((s, t)-MCRH). Let s = s(n) ∈ ℕ and t = t(n) ∈ ℕ be functions computable in time poly(n). An (s, t)-Multi-Collision Resistant Hash Function Family ((s, t)-MCRH) consists of a probabilistic polynomial-time algorithm Gen that on input 1^n outputs a circuit h such that:

" s-Shrinkage: The circuit h : {0, 1}' - {0, 1}"- maps inputs of length n to outputs of length n - s.

" t-Collision Resistance: For every polynomial size family of circuits A (An)nEN,

Pr_{h ← Gen(1^n), (x_1, ..., x_t) ← A_n(h)} [x_i ≠ x_j and h(x_i) = h(x_j) for all i ≠ j] ≤ negl(n).

In the above definition, the hash generating algorithm Gen(1^n) is required to output hash functions with the same input length: all must have n bits of input. A slightly more general definition would allow Gen(1^n) to output hash functions with different input lengths (while given the same security parameter). Those input lengths must be (with overwhelming probability) super-logarithmic in the security parameter n, since otherwise collisions would be easy to find in time poly(n) (in this case the adversary would be given 1^n as input, so it can run in time that is polynomial in the security parameter). For simplicity of presentation, we restrict ourselves to the above definition; that is, we require that Gen(1^n) always outputs hash functions whose input is of length n. All the results presented in this chapter can be naturally adapted to the more general definition. Note that the standard notion of CRH simply corresponds to (1, 2)-MCRH (which is easily shown to be equivalent to (s, 2)-MCRH for any s = n − ω(log n)). We also remark that Definition 4.1.1 gives a non-uniform security guarantee, which is natural, especially in the context of collision resistance. Note though that all of our results are obtained via uniform reductions.

Remark 4.1.2 (Shrinkage vs. Collision Resistance). Observe that (s, t)-MCRH are meaningful only when s > log t, as otherwise t-way collisions might not even exist (e.g., consider a function mapping inputs of length n to outputs of length n − log(t − 1) in which each range element has exactly t − 1 preimages). Moreover, we note that in contrast to standard CRH, it is unclear whether the shrinkage factor s can be trivially improved (e.g., by composition) while preserving the value of t. Specifically, constructions such as Tree Hashing (a.k.a. Merkle Tree) inherently rely on the fact that it is computationally infeasible to find any collision. It is possible to get some trade-offs between the number of collisions and shrinkage. For example, given an (s = 2, t = 4)-MCRH, we can compose it with itself to get an (s = 4, t = 10)-MCRH. But it is not a priori clear whether there exist transformations that increase the shrinkage s while not increasing t. We remark that a partial affirmative answer to this question was recently given in an independent and concurrent work by Bitansky et al. [BKP18], as long as the hash function is substantially shrinking (see additional details in Section 4.1.2). Thus, we include both the parameters s and t in the definition of MCRH, whereas in standard CRH the parameter t is fixed to 2, and the parameter s can be given implicitly (since the shrinkage can be trivially improved by composition).
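The pigeonhole reasoning behind the (s = 2, t = 4) → (s = 4, t = 10) composition example can be made concrete. A sketch with toy stand-in functions h1, h2 (all names hypothetical): any (t−1)² + 1 = 10 inputs colliding under h2∘h1 contain either a t-way collision for h1, or t distinct h1-values colliding under h2:

```python
from collections import defaultdict

def inner_or_outer_collision(h1, h2, xs, t):
    """Given (t-1)^2 + 1 distinct inputs xs that all collide under the
    composition h2(h1(.)), extract either a t-way collision for h1
    ('inner') or t distinct h1-values that collide under h2 ('outer')."""
    assert len(xs) >= (t - 1) ** 2 + 1
    buckets = defaultdict(list)            # group the inputs by h1-value
    for x in xs:
        buckets[h1(x)].append(x)
    for group in buckets.values():         # a large bucket is an h1 collision
        if len(group) >= t:
            return 'inner', group[:t]
    # Each bucket holds at most t-1 inputs, so there are at least t buckets;
    # their (distinct) h1-values all collide under h2 by assumption.
    return 'outer', list(buckets)[:t]

kind, witness = inner_or_outer_collision(lambda x: x % 5, lambda y: 0,
                                         list(range(10)), t=4)
print(kind, witness)
```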

Remark 4.1.3 (Scaling of Shrinkage vs. Collisions). The shrinkage s is measured in bits, whereas the number of collisions t is just a number. A different definitional choice could have been to put s and t on the same "scale" (e.g., measure the logarithm of the number of collisions) so as to make them more easily comparable. However, we refrain from doing so since we find the current (different) scaling of s and t to be more natural.

Remark 4.1.4 (Public-coin MCRH). One can also consider the stronger public-coin variant of MCRH, in which it should be hard to find collisions given not only the description of the hash function, but also the coins that generated the description. Hsiao and Reyzin [HR04] observed that for some applications of standard collision resistance, it is vital to use the public-coin variant (i.e., security can be broken in case the hash function is not public-coin). The distinction is similarly important for MCRH, and one should be mindful of which notion is used depending on the application. Below, when we say MCRH, we refer to the private-coin variant (as per Definition 4.1.1).

4.1.1 Our Results

The focus of this chapter is providing a systematic study of MCRH. We consider both the question of constructing MCRH and the question of what applications can be derived from them.

4.1.1.1 Constructions of MCRH

Since any CRH is in particular also an MCRH, candidate constructions are abundant (based on a variety of concrete computational assumptions). The actual question that we ask, which has a more foundational flavor, is whether we can construct MCRH from assumptions that are not known to imply CRH.

Our first main result is that the existence of MCRH follows from the average-case hardness of a variant of the ENTROPY APPROXIMATION PROBLEM studied by Goldreich, Sahai and Vadhan [GSV99]. The ENTROPY APPROXIMATION PROBLEM, denoted EAP, is a promise problem where YES inputs are circuits whose output distribution (i.e., the distribution obtained by feeding random inputs to the circuit) has entropy at least k, whereas NO inputs are circuits whose output distribution has entropy at most k − 1 (where k is a parameter that is unimportant for the current discussion). Here by entropy we specifically refer to Shannon entropy.³ Goldreich et al. showed that EAP is complete for the class of (promise) problems that have non-interactive statistical zero-knowledge proofs (NISZK). In this chapter we consider a variant of EAP, first studied by Dvir et al. [DGRV11], that uses different notions of entropy. Specifically, consider the promise problem EAPmin,max, where the goal now is to distinguish between circuits whose output distribution has min-entropy⁴ at least k from those with max-entropy at most k − 1. It is easy to verify that EAPmin,max is an easier problem than EAP.

Theorem 4.1.5 (Informal, see Theorem 4.2.6). If EAPmin,max is average-case hard, then there exist (s, t)-MCRH, where s = √n and t = 6n².

(Note that in the MCRH that we construct there exist 2^{√n}-way collisions, but it is computationally hard to find even a 6n²-way collision.) In contrast to the original ENTROPY APPROXIMATION PROBLEM, we do not know whether EAPmin,max is complete for NISZK. Thus, establishing the existence of MCRH based solely on the average-case hardness of NISZK (or SZK) remains open. Indeed, such a result could potentially be an interesting extension of Ostrovsky's [Ost91] proof that average-case hardness of SZK implies the existence of one-way functions.

Instantiations. Dvir et al. [DGRV11] showed that the average-case hardness of EAPmin,max is implied by either the quadratic residuosity (QR) or decisional Diffie-Hellman (DDH) assumptions.⁵ It is not too hard to see that the above extends to any encryption scheme (or even commitment scheme) in which ciphertexts can be perfectly re-randomized.⁶ The hardness of EAPmin,max can also be shown to follow from the average-case hardness of the SHORTEST VECTOR PROBLEM or the CLOSEST VECTOR PROBLEM

³Recall that the Shannon entropy of a random variable X is defined as H(X) = E_{x←X}[log(1/Pr[X = x])].
⁴For a random variable X, the min-entropy is defined as H_min(X) = min_{x∈Supp(X)} log(1/Pr[X = x]), whereas the max-entropy is H_max(X) = log(|Supp(X)|).
⁵In fact, [DGRV11] show that the same conclusion holds even if we restrict the problem to constant-depth (i.e., NC⁰) circuits.
⁶Given such a scheme, consider a circuit that has, hard-coded inside, a pair of ciphertexts (c_0, c_1) which are either encryptions of the same bit or of different bits. The circuit gets as input a bit b and random string r and outputs a re-randomization of c_b (using randomness r). If the scheme is perfectly re-randomizing (and perfectly correct) then the min-entropy of the output distribution in case the plaintexts disagree is larger than the max-entropy in case the plaintexts agree.

with approximation factor roughly √n.⁷ To the best of our knowledge, the existence of CRH is not known based on such small approximation factors (even assuming average-case hardness). Lastly, we remark that a similar argument establishes the hardness of EAPmin,max based on the plausible assumption that graph isomorphism is average-case hard.⁸

4.1.1.2 Applications of MCRH

The main application that we derive from MCRH is a constant-round statistically-hiding commitment scheme.

Theorem 4.1.6 (Informally stated, see Theorem 4.3.4). Assume that there exists a (log(t), t)-MCRH. Then, there exists a 3-round statistically-hiding and computationally-binding commitment scheme.

We note that Theorem 4.1.6 is optimal in the sense of holding for MCRH that are minimally shrinking. Indeed, as noted in Remark 4.1.2, (s, t)-MCRH with s ≤ log(t − 1) exist trivially and unconditionally. It is also worthwhile to point out that by a result of Haitner et al. [HNO+09], statistically-hiding commitment schemes can be based on the existence of any one-way function. However, the commitment scheme of [HNO+09] uses a polynomial number of rounds of interaction, and the main point in Theorem 4.1.6 is that we obtain such a commitment scheme with only a constant number of rounds. Moreover, by a result of [HHRS15], any fully black-box construction of a statistically hiding commitment scheme from one-way functions (or even one-way permutations) must use a polynomial number of rounds. Loosely speaking, a construction is "fully black-box" [RTV04] if (1) the construction only requires input-output access to the underlying primitive and (2) the security proof also relies on the adversary in a black-box way. Most constructions in cryptography are fully black-box. Since our proof of Theorem 4.1.6 is via a fully black-box construction, we obtain the following immediate corollary:

Corollary 4.1.7 (Informally stated, see Theorem 4.4.3). There is no fully black-box construction of MCRH from one-way permutations.

⁷The hard distribution for SVP_{√n} and CVP_{√n} is the first message from the 2-message honest-verifier SZK proof system of Goldreich and Goldwasser [GG98]. In the case of CVP_{√n}, the input is (B, t, d) where B is the basis of the lattice, t is a target vector and d specifies the bound on the distance of t from the lattice. The distribution is obtained by sampling a random error vector η from the ball of radius d√n/2 centered at the origin and outputting b·t + η mod P(B), where b ← {0,1} and P(B) is the fundamental parallelepiped of B. When t is far from the lattice, this distribution is injective and hence has high min-entropy, while when t is close to the lattice, the distribution is not injective and hence has lower max-entropy. Similarly for SVP_{√n}, on input (B, d), the output is η mod P(B), where η is again sampled from a ball of radius d√n/2.
⁸Note that graph isomorphism is known to be solvable in polynomial time for many natural distributions, and the recent breakthrough result of Babai [Bab16] gives a quasi-polynomial worst-case algorithm. Nevertheless, it is still plausible that Graph Isomorphism is average-case quasi-polynomially hard (for some efficiently samplable distribution).

Corollary 4.1.7 can be viewed as an extension of Simon's [Sim98] black-box separation of CRH from one-way permutations.

4.1.2 Related Works

Generic Constructions of CRH. Peikert and Waters [PW11] construct CRH from lossy trapdoor functions. Their construction can be viewed as a construction of CRH from EAPmin,max with a huge gap. (Specifically, the lossy trapdoor function h is either injective (i.e., H_min(h) ≥ n) or very shrinking (i.e., H_max(h) ≤ 0.5n).)⁹ One possible approach to constructing CRH from lossy functions with small 'lossiness' (H_max(h)/H_min(h)) is to first amplify the lossiness and then apply the [PW11] construction. Pietrzak et al. [PRS12] rule out this approach by showing that it is impossible to improve the 'lossiness' in a black-box way.¹⁰ We show that even with distributions where the gap is tiny, we can achieve weaker yet very meaningful notions of collision resistance. Applebaum and Raykov [AR16b] construct CRH from any average-case hard language with a Perfect Randomized Encoding in which the encoding algorithm is one-to-one as a function of the randomness. Perfect Randomized Encodings are a way to encode the computation of a function f on input x such that, information-theoretically, the only information revealed about x is the value f(x). The class of languages with such randomized encodings, PRE, is contained in PZK. Their assumption of an average-case hard language with a perfect randomized encoding implies the hardness of EAPmin,max as well.

Constant-Round Statistically Hiding Commitments from SZK Hardness. That average-case hardness of SZK yields constant-round statistically hiding commitments had been considered folklore in a few works in the past.¹¹ Bitansky et al. [BHKY19] recently gave a formal proof of this implication.

Distributional CRH. A different weakening of collision resistance was considered by Dubrov and Ishai [DI06]. Their notion, called "distributional collision resistance," allows it to be feasible to find some specific collision, but requires that it be hard to sample a random collision. That is, given the hash function h, no efficient algorithm can sample a pair (z_1, z_2) such that z_1 is uniform and z_2 is uniform in the set {z : h(z) = h(z_1)}. Recently, Komargodski and Yogev [KY18] showed that the existence of MCRH implies the existence of distributional CRH. They further showed that the average-case hardness of SZK implies distributional collision resistant hash functions. The aforementioned proof of Bitansky et al. [BHKY19] (that SZK hardness implies constant-round statistically hiding commitments) constructs the latter

⁹The trapdoor to the lossy function is not used in the construction of CRH.
¹⁰In contrast, it is easy to see that repetition amplifies the additive gap between the min-entropy and the max-entropy. In fact, we use this in our construction.
¹¹Dvir et al. [DGRV11] attribute this construction to a combination of [OV08] and an unpublished manuscript of Guy Rothblum and Vadhan [RV09].

primitive from distributional CRH.¹²

Min-Max Entropy Approximation. The main result of the work of Dvir et al. [DGRV11] (that was mentioned above) was showing that the problem EAP for degree-3 polynomial mappings (i.e., where the entropies are measured by Shannon entropy) is complete for SZK_L, a sub-class of SZK in which the verifier and the simulator run in logarithmic space. They also construct algorithms to approximate different notions of entropy in certain restricted settings (but their algorithms do not violate the assumption that EAPmin,max is average-case hard).

4.1.2.1 Independent Works

MCRH have been recently considered in an independent work by Komargodski et al. [KNY17]. Komargodski et al. study the problem, arising from Ramsey theory, of finding either a clique or an independent set (of roughly logarithmic size) in a graph, when such objects are guaranteed to exist. As one of their results, [KNY17] relate a variant of the foregoing Ramsey problem (for bipartite graphs) to the existence of MCRH. We emphasize that the focus of [KNY17] is in studying computational problems arising from Ramsey theory, rather than MCRH directly. Beyond the work of [KNY17], there are two other concurrent works that specifically study MCRH [KNY18, BKP18]. The main result of [KNY18] is that the existence of MCRH (with suitable parameters) implies the existence of efficient argument-systems for NP, à la Kilian's protocol [Kil92]. Komargodski et al. [KNY18] also prove that MCRH imply constant-round statistically hiding commitments (similarly to Theorem 4.1.6), although their result only holds for MCRH that shrink their input by a constant multiplicative factor. Lastly, [KNY18] also show a black-box separation between MCRH in which it is hard to find t collisions and those in which it is hard to find t + 1 collisions. Bitansky et al. [BKP18] also study MCRH, with the motivation of constructing efficient argument-systems. They consider both a keyed version of MCRH (as in this chapter) and an unkeyed version (in which, loosely speaking, the requirement is that the adversary cannot produce more collisions than those it can store as non-uniform advice). [BKP18] show a so-called "domain extension" result for MCRH that are sufficiently shrinking. Using this result they construct various succinct and/or zero-knowledge argument-systems, with optimal or close-to-optimal round complexity. In particular, they show the existence of 4-round zero-knowledge arguments for NP based on MCRH, and, assuming unkeyed MCRH, they obtain a similar result but with only 3 rounds of interaction.

4.1.3 Our Techniques

We provide a detailed overview of our two main results: constructing MCRH from EAPmin,max, and constructing a constant-round statistically-hiding commitment scheme

¹²[BHKY19] also give a direct proof that SZK hardness implies constant-round statistically hiding commitments, that does not go through distributional CRH.

from MCRH.

4.1.3.1 Constructing MCRH from EAPmin,max

Assume that we are given a distribution on circuits {C : {0,1}^n → {0,1}^{2n}} such that it is hard to distinguish between the cases H_min(C) ≥ k and H_max(C) ≤ k − 1, where we overload notation and let C also denote the output distribution of the circuit when given uniformly random inputs. Note that we have set the output length of the circuit C to 2n, but this is mainly for concreteness (and to emphasize that the circuit need not be shrinking). Our goal is to construct an MCRH using C. We will present our construction in steps, starting off by assuming a very large entropy gap. Specifically, for the first (over-simplified) case, we assume that it is hard to distinguish between min-entropy ≥ n vs. max-entropy ≤ n/2.¹³ Note that having min-entropy n means that C is injective.

Warmup: The case of H_min(C) ≥ n vs. H_max(C) ≤ n/2. In this case, it is already difficult to find even a 2-way collision in C: if H_min(C) ≥ n, then C is injective and no collisions exist. Thus, if one can find a collision, it must be the case that H_max(C) ≤ n/2, and so any collision finder distinguishes the two cases. The problem though is that C by itself is not shrinking, and thus is not an MCRH. To resolve this issue, a natural idea that comes to mind is to hash the output of C using a pairwise independent hash function.¹⁴ Thus, the first idea is to choose f : {0,1}^{2n} → {0,1}^{n−s}, for some s ≥ 1, from a family of pairwise independent hash functions and consider the hash function h(x) = f(C(x)). If H_min(C) ≥ n (i.e., C is injective), then every collision in h is a collision on the hash function f. On the other hand, if H_max(C) ≤ n/2, then C itself has many collisions. To be able to distinguish between the two cases, we would like that in the latter case there will be no collisions that originate from f. The image size of C, if H_max(C) ≤ n/2, is smaller than 2^{n/2}. If we set s to be sufficiently small (say constant), then the range of f has size roughly 2^n. Thus, we are hashing a set into a range that is more than quadratic in its size. In such a case, we are "below the birthday paradox regime" and a random function on this set will be injective. A similar statement can be easily shown also for functions that are merely pairwise independent (rather than being entirely random). Thus, in case C is injective, all the collisions appear in the second part of the hash function (i.e., the application of f). On the other hand, if C has max-entropy smaller than n/2, then all the collisions happen in the first part of the hash function (i.e., in C). Thus, any adversary that finds a collision distinguishes between the two cases and we actually obtain a full-fledged CRH (rather than merely an MCRH) at the cost

"This setting (and construction) is similar to that of Peikert and Waters's construction of CRH from lossy functions [PWiiI. 1 4 Recall that a collection of functions F is k-wise independent if for every distinct X1, ... ,Xk, the distribution of (f(x1),..., f(Xk)) (over the choice of f - T) is uniform.

88 of making a much stronger assumption.
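The "below the birthday paradox regime" step can be checked empirically. A minimal simulation, using a truly random function for simplicity (the argument above notes that pairwise independence already suffices); the set and range sizes are illustrative, not the actual parameters:

```python
import random

def injective_fraction(set_size, range_bits, trials=200, seed=1):
    """Fraction of trials in which a fresh random function maps a set of
    `set_size` elements into {0,1}^range_bits without any collision."""
    rng = random.Random(seed)
    space = 2 ** range_bits
    good = 0
    for _ in range(trials):
        seen, ok = set(), True
        for _ in range(set_size):
            v = rng.randrange(space)   # fresh random hash value per element
            if v in seen:
                ok = False
                break
            seen.add(v)
        good += ok
    return good / trials

# Range more than quadratic in the set size: collisions are unlikely.
print(injective_fraction(set_size=2 ** 8, range_bits=20))
# For comparison, a range of size comparable to the set sees collisions.
print(injective_fraction(set_size=2 ** 8, range_bits=10))
```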

The next case that we consider is still restricted to circuits that are injective (i.e., have min-entropy n) in one case, but assumes that it is hard to distinguish injective circuits from circuits having max-entropy n − √n (rather than the n/2 that we already handled).

The case of H_min(C) ≥ n vs. H_max(C) ≤ n − √n. The problem that we encounter now is that in the low max-entropy case, the output of C has max-entropy n − √n. To apply the above birthday paradox argument, we would need the range of f to be of size roughly (2^{n−√n})² > 2^n, and so our hash function would not be shrinking. Note that if the range of f were smaller, then even if f were chosen entirely at random (let alone from a pairwise independent family) we would see collisions in this case (again, by the birthday paradox). The key observation that we make at this point is that although we will see collisions, there will not be too many of them. Specifically, suppose we set s = √n. Then, we are now hashing a set of size 2^{n−√n} into a range of size 2^{n−√n}. If we were to choose f entirely at random, this process would correspond to throwing N = 2^{n−√n} balls (i.e., the elements in the range of C) into N bins (i.e., elements in the range of f). It is well-known that in such a case, with high probability, the maximal load for any bin will be at most log(N)/log log(N) < n. Thus, we are guaranteed that there will be at most n collisions. Unfortunately, the work of Alon et al. [ADM+99] shows that the same argument does not apply to functions that are merely pairwise independent (rather than entirely random). Thankfully though, suitable derandomizations are known. Specifically, it is not too difficult to show that if we take f from a family of n-wise independent hash functions, then the maximal load will also be at most n (see Section 2.2.3.2 for details).¹⁵ Similarly to before, in case C is injective, there are no collisions in the first part. On the other hand, in case C has max-entropy at most n − √n, we have just argued that there will be fewer than n collisions in the second part.
Thus, an adversary that finds an n-way collision distinguishes between the two cases, and we have obtained an (s, t)-MCRH with s = √n and t = n (i.e., collisions of size 2^{√n} exist, but finding a collision of size even n is computationally infeasible).
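The load-balancing fact invoked above is easy to observe empirically. A toy simulation with truly random hashing (the construction itself uses n-wise independence, which the text notes is enough); the parameters are illustrative:

```python
import random
from collections import Counter
from math import log

def max_load(n_balls, n_bins, rng):
    """Throw n_balls uniformly and independently into n_bins; return the
    load of the fullest bin."""
    counts = Counter(rng.randrange(n_bins) for _ in range(n_balls))
    return max(counts.values())

rng = random.Random(2)
n = 2 ** 16                          # N balls into N bins
loads = [max_load(n, n, rng) for _ in range(5)]
bound = 3 * log(n) / log(log(n))     # Theta(log N / log log N)
print(loads, round(bound, 2))
```

The observed maximal loads stay far below N, illustrating why an n-way collision under f is evidence that the collisions came from C rather than from f.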

The case of H_min(C) ≥ k vs. H_max(C) ≤ k − √n. We want to remove the assumption that when the min-entropy of C is high, it is in fact injective. Specifically, we consider the case that either C's min-entropy is at least k (for some parameter k ≤ n) or its max-entropy is at most k − √n. Note that in the high min-entropy case, C, although not injective, maps at most 2^{n−k} inputs to every output (this is essentially the definition of min-entropy). Our approach is to apply hashing a second time (in a different way), to effectively make C injective, and then apply the construction from the previous case.

"We remark that more efficient constructions are known, see Remark 2.2.16.

Consider the mapping h′(x) = (C(x), f(x)), where f will be defined ahead. For h′ to be injective, f must be injective over all sets of size 2^{n−k}. Taking f to be pairwise-independent would force us to set its output length to be too large, in a way that would ruin the entropy gap between the cases. As in the previous case, we resolve this difficulty by using many-wise independent hashing. Let f : {0,1}^n → {0,1}^{n−k} be a 3n-wise independent hash function. If H_min(C) ≥ k, then the same load-balancing property of f that we used in the previous case, along with a union bound, implies that with high probability (over the choice of f) there will be no 3n-way collisions in h′. Our final construction applies the previous construction on h′. Namely,

h_{C,f,g}(x) = g(C(x), f(x)), for f : {0,1}^n → {0,1}^{n−k} and g : {0,1}^{3n−k} → {0,1}^{n−√n} being 3n-wise and 2n-wise independent hash functions, respectively. We can now show that:

* If H_min(C) ≥ k, then there do not exist 3n distinct inputs x_1, ..., x_{3n} such that they all have the same value of (C(x_i), f(x_i)); and

" If Hmax (C) < k - /i5, then there do not exist 2n distinct inputs xl,... , x 2 n such that they all have distinct values of (C(xi), f(xi)), but all have the same value g (C (xi), f (Xi)).-

We claim that hc,f,g is (s, t)-MCRH for s = V and t = 6n 2 . First, note that in any set of 6n 2 collisions for hc,f,g, there has to be either a set of 3n collisions for (C, f) or a set of 2n collisions for g, and so at least one of the conditions in the above two statements is violated. Now, assume that an adversary A finds a 6n2 -way collision in hc,f,g with high probability. Then, an algorithm D that distinguishes be- tween Hmin(C) > k to Hmax(C) < k - n chooses f and g uniformly at random and

runs A on the input h = h_{C,f,g} to get x_1, ..., x_{6n^2} with h(x_1) = ... = h(x_{6n^2}). The distinguisher D then checks which of the two conditions above is violated, and can thus distinguish whether it was given C with Hmin(C) ≥ k or with Hmax(C) ≤ k - √n.
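The many-wise independent hash functions used for f and g above can be instantiated by evaluating a uniformly random polynomial of degree d - 1 over a prime field, which gives a d-wise independent family. The following Python sketch is illustrative only (the thesis's formal families are those of Section 2.2.3.1); the choice of a Mersenne prime and the bit-truncation step are assumptions of this sketch, and the truncated family is only close to d-wise independent when the output range does not divide the field size evenly.

```python
import random

def sample_dwise_hash(d, in_bits, out_bits, p=(1 << 61) - 1):
    """Sample h: {0,1}^in_bits -> {0,1}^out_bits from a d-wise independent
    family: a uniformly random polynomial of degree d-1 over GF(p)."""
    assert (1 << in_bits) <= p and (1 << out_bits) <= p
    coeffs = [random.randrange(p) for _ in range(d)]  # d random coefficients

    def h(x):
        # Horner evaluation of the polynomial at x modulo p,
        # then truncation of the field element to out_bits bits.
        acc = 0
        for c in coeffs:
            acc = (acc * x + c) % p
        return acc % (1 << out_bits)

    return h

# Example: a (3n)-wise independent f: {0,1}^16 -> {0,1}^8 with n = 16.
f = sample_dwise_hash(d=3 * 16, in_bits=16, out_bits=8)
```

Pairwise independence is the special case d = 2; the construction pays only d field elements of description length, which is why many-wise independence is affordable here.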

We proceed to the case where the entropy gap is 1 (rather than √n). This case is rather simple to handle, via a reduction to the previous case.

The case of Hmin(C) ≥ k vs. Hmax(C) ≤ k - 1. This case is handled by a reduction to the previous case. The main observation is that if C has min-entropy at least k, and we take ℓ copies of C, then we get a new circuit C' with min-entropy at least ℓ·k. In contrast, if C had max-entropy at most k - 1, then C' has max-entropy at most ℓ·k - ℓ. Setting ℓ = k, we obtain that in the second case the max-entropy is n' - √n', where n' = ℓ·k is the new input length. Thus, we have obtained a reduction to the √n' gap case that we already handled.

4.1.3.2 Constructing Constant-Round Statistically Hiding Commitment from MCRH

The fact that MCRH imply constant-round statistically-hiding commitments can be shown in two ways. The first, more direct, way uses only elementary notions such as k-wise independent hashing and is similar to the interactive hashing protocol of Ding et al. [DHRS07]. An alternative method is to first show that MCRH imply the existence of a (O(1)-block) inaccessible entropy generator [HRVW09, HV17]. The latter was shown by [HRVW09, HV17] to imply the existence of constant-round statistically-hiding commitments. We discuss these two methods next and remark that in our actual proof we follow the direct route.

Direct Analysis

In a nutshell, our approach is to follow the construction of Damgård et al. [DPP93] of statistically-hiding commitments from CRH, while replacing the use of pairwise independent hashing with the interactive hashing protocol of Ding et al. [DHRS07]. We proceed to the technical overview, which does not assume familiarity with any of these results.

Warmup: Commitment from (Standard) CRH. Given a family of collision-resistant hash functions H = {h: {0,1}^n → {0,1}^(n-1)}, a natural first attempt is to have the receiver sample the hash function h ← H and send it to the sender. The sender, trying to commit to a bit b, chooses x ← {0,1}^n and r ← {0,1}^n, and sends (y = h(x), r, σ = ⟨r, x⟩ ⊕ b) to the receiver. The commitment is defined as c = (h, y, r, σ). To reveal, the sender sends (x, b) to the receiver, who verifies that h(x) = y and σ = ⟨r, x⟩ ⊕ b. Pictorially, the commit stage is as follows:

    S(b)                                        R
                         h                      h ← Gen(1^n)
              ←----------------------
    x, r ← {0,1}^n
                 (h(x), r, ⟨r, x⟩ ⊕ b)
              ----------------------→

The fact that the scheme is computationally binding follows immediately from the collision resistance of h: if the sender can find (x, 0) and (x', 1) that pass the receiver's verification, then x ≠ x' and h(x) = h(x'). Arguing that the scheme is statistically hiding is trickier. The reason is that h(x) might reveal a lot of information about x. What helps us is that h is shrinking, and thus some information about x is hidden from the receiver. In particular, this means that x has positive min-entropy given h(x). At this point we would like to apply the Leftover Hash Lemma (LHL) to show that for any b, the statistical distance between (h(x), r, ⟨r, x⟩ ⊕ b) and (h(x), r, u) is small. Unfortunately, the min-entropy level is insufficient to derive anything meaningful from the LHL, and indeed the distance between these two distributions is a constant (rather than negligible as required).

To reduce the statistical distance, we increase the min-entropy via repetition. We modify the protocol so that the sender selects k values x = (x_1, ..., x_k) ← {0,1}^(n·k) and r ← {0,1}^(n·k), and sends (h(x_1), ..., h(x_k), r, ⟨r, x⟩ ⊕ b) to the receiver. The min-entropy of x, even given h(x_1), ..., h(x_k), is now Ω(k), and the LHL now yields that the statistical distance between the two distributions (h, h(x_1), ..., h(x_k), r, ⟨r, x⟩ ⊕ 0) and (h, h(x_1), ..., h(x_k), r, ⟨r, x⟩ ⊕ 1) is roughly 2^(-k). Setting k to be sufficiently large (e.g., k = poly(n) or even k = polylog(n)), we obtain that the scheme is statistically hiding. Note that repetition also does not hurt binding: if the sender can find valid decommitments (x = (x_1, ..., x_k), 0) and (x' = (x'_1, ..., x'_k), 1) that pass the receiver's verification, then there must exist i ∈ [k] with x_i ≠ x'_i and h(x_i) = h(x'_i) (i.e., a collision).
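The warmup scheme above can be sketched in a few lines of Python. This is a toy illustration under loud assumptions: a truncated SHA-256 stands in for the shrinking collision-resistant family H (it is not a proof of either property), and the parameters are arbitrary small choices.

```python
import hashlib
import secrets

N_BITS, K = 64, 128  # block length n (bits) and repetition count k (toy choices)

def h(x: bytes) -> bytes:
    # Stand-in for a shrinking CRH {0,1}^64 -> {0,1}^48 (truncated SHA-256).
    return hashlib.sha256(x).digest()[:6]

def inner_product_bit(r: bytes, x: bytes) -> int:
    # <r, x> over GF(2): parity of the bitwise AND of the two bit strings.
    return bin(int.from_bytes(r, "big") & int.from_bytes(x, "big")).count("1") % 2

def commit(b: int):
    xs = [secrets.token_bytes(N_BITS // 8) for _ in range(K)]  # x = (x_1,...,x_k)
    x = b"".join(xs)
    r = secrets.token_bytes(len(x))
    sigma = inner_product_bit(r, x) ^ b                        # <r, x> xor b
    commitment = ([h(xi) for xi in xs], r, sigma)
    return commitment, xs  # sender keeps (xs, b) as the decommitment

def verify(commitment, xs, b: int) -> bool:
    ys, r, sigma = commitment
    x = b"".join(xs)
    return (all(h(xi) == yi for xi, yi in zip(xs, ys))
            and sigma == inner_product_bit(r, x) ^ b)
```

A successful equivocation here would require some x_i ≠ x'_i with h(x_i) = h(x'_i), exactly the collision event the binding argument rules out.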

Handling MCRHs. For simplicity, let us focus on the case t = 4 (since it basically incorporates all the difficulty encountered when dealing with larger values of t). That is, we assume that H = {h: {0,1}^n → {0,1}^(n-2)} is an (s, t)-MCRH with s = 2 and t = 4. Namely, it is hard to find 4 inputs that map to the same hash value for a random function from H, even though such 4-way collisions exist. Note, however, that it might very well be easy to find 3 such colliding inputs. And indeed, the binding argument that we had before breaks: finding x ≠ x' with h(x) = h(x') is no longer (necessarily) a difficult task. The problem comes up because even after the sender supposedly 'commits' to y_1 = h(x_1), ..., y_k = h(x_k), it is no longer forced to reveal x_1, ..., x_k. Intuitively, for every y_i, the sender might know 3 inputs that map to y_i, so the sender is free to reveal any value in the Cartesian product of these triples. Concretely, let S_{y_i} be the set of inputs that h maps to y_i that the sender can find efficiently, and let

S_y = S_{y_1} × ... × S_{y_k}. Since the sender can find at most 3 colliding inputs, it holds that |S_{y_i}| ≤ 3 for every i, and thus |S_y| ≤ 3^k. To fix the binding argument, we want to force every efficient sender to be able to reveal only a unique x = (x_1, ..., x_k) ∈ S_y. A first attempt toward achieving this goal is to use a pairwise-independent hash function f that is injective over S_y with high probability. At a high level, the receiver will also specify to the sender a random function f from a pairwise independent hash function family. The sender in turn sends f(x) as well as (h(x_1), ..., h(x_k)). The receiver adds a check to the verification step to ensure that f maps the decommitted input sequence (x'_1, ..., x'_k) to the pre-specified value. In order for the function f to be injective on the set S_y, the birthday paradox tells us that the range of f must have size at least |S_y|^2 (roughly), which means at least 3^(2k). Thus, to ensure that f is injective on S_y, we can use a pairwise-independent

function f: {0,1}^(nk) → {0,1}^(2k·log(3)). Unfortunately, this scheme is still not binding: f is guaranteed (with high probability) to be injective on fixed sets of size 3^k, but the sender can choose y based on the value of f; specifically, it can choose y so that f is not injective over S_y. To fix the latter issue, we split the messages that the receiver sends into two rounds. In the first round the receiver sends h and receives y = (h(x_1), ..., h(x_k)) from the sender. Only then does the receiver send f and receive z_1 = f(x). Now the scheme is binding: since f is chosen after y is set, the pairwise-independence property guarantees that f will be injective over S_y with high probability. Pictorially, the commit stage of the new scheme is as follows:

    S(b)                                        R
                         h                      h ← Gen(1^n)
              ←----------------------
    x ← {0,1}^(nk)
              y = (y_1, ..., y_k), y_i = h(x_i)
              ----------------------→
                         f                      f: {0,1}^(nk) → {0,1}^(2k·log(3))
              ←----------------------
    r ← {0,1}^(nk)
                 (f(x), r, ⟨r, x⟩ ⊕ b)
              ----------------------→

But is this scheme statistically hiding? Recall that previously, to argue hiding, we used the fact that the mapping (x_1, ..., x_k) ↦ (h(x_1), ..., h(x_k)) is shrinking. Analogously, here we need the mapping (x_1, ..., x_k) ↦ (h(x_1), ..., h(x_k), f(x)) to be shrinking. However, the latter mapping maps strings of length n·k bits to strings of length (n - 2)·k + 2·log(3)·k, which is obviously not shrinking. One work-around is to simply assume that the given MCRH shrinks much more than we assumed so far; for example, to assume that H is a (4, 4)-MCRH (or more generally an (s, t)-MCRH for s > log(t)).16 However, by adding one more round of interaction we can actually fix the protocol so that it gives statistically-hiding commitments even with tight shrinkage of log(t).

Overcoming the Birthday Paradox. To guarantee hiding, it seems that we cannot afford the range of f to be as large as (3^k)^2. Instead, we set its range size to 3^k (i.e., f: {0,1}^(nk) → {0,1}^(k·log(3))). Moreover, rather than choosing it from a pairwise independent hash function family, we shall once more use one that is many-wise independent. Again, the important property that we use is that such functions are load-balanced:17 with high probability, z_1 (the value that the sender sends in the second round) has at most log(3^k) = k·log(3) pre-images from S_y under f (i.e., |{x ∈ S_y : f(x) = z_1}| ≤ k·log(3)). We once more face the problem that the sender can reveal any of these inputs, but now their number is exponentially smaller: it is only k·log(3) (as opposed to 3^k before). We can now choose a pairwise-independent

g: {0,1}^(nk) → {0,1}^(2(log(k) + log log(3)))

that is injective over sets of size k·log(3) (with high probability). For the same reasons that f was sent after h, the receiver sends g only after receiving f(x). Thus, our final protocol has three rounds (where each round is composed of one message from each of the two parties) and is as follows. In the first round, the receiver

16 We remark that our construction of MCRH based on EAP_{min,max} (see Section 4.2) actually supports such large shrinkage.

17 In a nutshell, the property that we are using is that if N = 3^k balls are thrown into N bins, with high probability the maximal load in every bin will be at most log(N). It is well-known that hash functions that are log(N)-wise independent also have this property. See Section 2.2.3.2 for details.

selects h ← H and sends it to the sender. The sender, trying to commit to a bit b, chooses x = (x_1, ..., x_k) ← {0,1}^(nk) and sends y = (y_1 = h(x_1), ..., y_k = h(x_k)). In the second round, the receiver selects a many-wise-independent hash function f: {0,1}^(nk) → {0,1}^(k·log(3)) and sends it to the sender. The sender sends z_1 = f(x) to the receiver. In the third and final round, the receiver selects a pairwise-independent hash function g: {0,1}^(nk) → {0,1}^(2(log(k) + log log(3))) and sends it to the sender. The sender selects r ← {0,1}^(nk), and sends (z_2 = g(x), r, σ = ⟨r, x⟩ ⊕ b) to the receiver. The commitment is defined as c = (h, y, f, z_1, g, z_2, σ). To reveal, the sender sends (x, b) to the receiver, who verifies that h(x_i) = y_i for every i, that f(x) = z_1, that g(x) = z_2, and that σ = ⟨r, x⟩ ⊕ b. Pictorially, the commit stage is as follows:

    S(b)                                        R
                         h                      h ← Gen(1^n)
              ←----------------------
    x ← {0,1}^(nk)
              y = (y_1, ..., y_k), y_i = h(x_i)
              ----------------------→
                         f                      f: {0,1}^(nk) → {0,1}^(k·log(3))
              ←----------------------
                     z_1 = f(x)
              ----------------------→
                         g                      g: {0,1}^(nk) → {0,1}^(2(log(k)+log log(3)))
              ←----------------------
    r ← {0,1}^(nk)
             (z_2 = g(x), r, ⟨r, x⟩ ⊕ b)
              ----------------------→

Intuitively, the scheme is computationally binding since for any computationally bounded sender that committed to c, there is a unique x that passes the receiver's verification. As for statistical hiding, we need the mapping (x_1, ..., x_k) ↦ (h(x_1), ..., h(x_k), f(x), g(x)) to be shrinking. Observe that we are mapping n·k bits to (n - 2)·k + log(3)·k + 2(log(k) + log log(3)) bits (where all logarithms are to base 2). Choosing k to be sufficiently large (e.g., k = poly(n) certainly suffices) yields that the mapping is shrinking. This completes the high-level overview of the direct analysis of our construction of constant-round statistically hiding commitments. The formal proof, done via a reduction from the binding of the scheme to the MCRH property, requires more delicate care (and in particular handling certain probabilistic dependencies that arise in the reduction). See Section 4.3 for details.
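The shrinkage accounting in the last paragraph is easy to check numerically. The sketch below is plain arithmetic under the t = 4 parameters assumed above (so each h(x_i) is n - 2 bits): the slack, n·k minus the revealed bits, is negative for small k but grows linearly in k once k is large enough.

```python
from math import log2

def revealed_bits(n: int, k: int) -> float:
    """Total length of (h(x_1), ..., h(x_k), f(x), g(x)) in the three-round
    protocol, for an MCRH with s = 2, t = 4 (each h(x_i) is n - 2 bits)."""
    return (n - 2) * k + log2(3) * k + 2 * (log2(k) + log2(log2(3)))

def slack(n: int, k: int) -> float:
    # Input length n*k minus revealed bits; positive means the map shrinks.
    return n * k - revealed_bits(n, k)

for k in (4, 64, 1024):
    print(k, round(slack(256, k), 2))
```

The per-block gain is 2 - log2(3) ≈ 0.415 bits, against an additive loss of 2(log2(k) + log2(log2(3))) bits from g, which is why a moderately large k already suffices.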

Analysis via Inaccessible Entropy

Consider the jointly distributed random variables (h(x), x), where x is a uniform n-bit string and h is chosen at random from a family of t-way collision-resistant hash functions H = {h: {0,1}^n → {0,1}^(n-log(t))}. Since h(x) is only (n - log(t)) bits long, it can reveal only that amount of information about x. Thus, the entropy of x given h(x) (and h) is at least log(t). In fact, a stronger property holds: the expected number of pre-images of h(x), over the choice of x, is t. This implies that x given h(x) has log(t) bits of (a weaker variant of) min-entropy. While h(x) has t pre-images (in expectation), no efficient strategy can find more than t - 1 of them. Indeed, efficiently finding t such (distinct) pre-images directly violates the t-way collision resistance of h. In terms of inaccessible entropy, the above discussion establishes that (h(x), x) is a 2-block inaccessible entropy generator in which the second block (i.e., x) has real min-entropy log(t) and accessible max-entropy at most log(t - 1). This block generator is not quite sufficient to get statistically-hiding commitment, since the construction of [HRVW09, HV17] requires a larger gap between the entropies. This, however, is easily solved, since taking many copies of the same generator increases the entropy gap. That is, the final 2-block generator is ((h(x_1), ..., h(x_k)), (x_1, ..., x_k)), for a suitable choice of k. The existence of constant-round statistically-hiding commitment now follows immediately from [HV17, Lemma 19].18 The resulting protocol turns out to be essentially the same as that obtained by the direct analysis discussed above (and proved in Section 4.3).

4.1.4 Organization of this Chapter

In Section 4.2 we formally state the entropy approximation assumption and present our construction of MCRH based on this assumption. In Section 4.3 we describe the construction of constant-round statistically-hiding commitments from MCRH. Lastly, in Section 4.4 we prove the black-box separation of MCRH from one-way permutations.

4.2 Constructing MCRH Families

In this section, we present a construction of a Multi-Collision Resistant Hash family (MCRH) based on the hardness of estimating certain notions of entropy of a distribution, given an explicit description of the distribution (i.e., a circuit that generates it). We define and discuss this problem in Section 4.2.1, and present the construction of MCRH in Section 4.2.2.

4.2.1 Entropy Approximation

In order to discuss the problem central to our construction, we first recall some standard notions of entropy.

Definition 4.2.1. For a random variable X, we define the following notions of entropy:

* Min-entropy: Hmin(X) = min_{x ∈ Supp(X)} log(1 / Pr[X = x]).

* Max-entropy: Hmax(X) = log(|Supp(X)|).

* Shannon entropy: H(X) = E_{x←X}[log(1 / Pr[X = x])].

18 The general construction of statistically-hiding commitments from inaccessible entropy generators is meant to handle a much more general case than the one needed in our setting. In particular, a major difficulty handled by [HRVW09, HV17] is when the generator has many blocks and it is not known in which one there is a gap between the real and accessible entropies.

For any random variable, these entropies are related as described below. These relations ensure that the problems we describe later are well-defined.

Fact 4.2.2. For a random variable X supported over {0,1}^m,

0 ≤ Hmin(X) ≤ H(X) ≤ Hmax(X) ≤ m.
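The three notions, and the chain of inequalities in Fact 4.2.2, can be checked on a concrete distribution; the example distribution below is an arbitrary illustration.

```python
from math import log2

def entropies(probs):
    """Min-, Shannon-, and max-entropy of a distribution given as a list of
    positive probabilities summing to 1 (the support is the listed outcomes)."""
    hmin = min(-log2(p) for p in probs)          # -log of the most likely outcome
    shannon = sum(-p * log2(p) for p in probs)   # expected surprise
    hmax = log2(len(probs))                      # log of the support size
    return hmin, shannon, hmax

hmin, h, hmax = entropies([1/2, 1/4, 1/8, 1/8])
assert 0 <= hmin <= h <= hmax  # Fact 4.2.2 on this example
```

On this distribution the three values are 1, 1.75, and 2 bits respectively, so the inequalities are strict; they collapse to equality exactly for flat (uniform) distributions.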

Given a circuit C: {0,1}^n → {0,1}^m, we overload C to also denote the random variable induced by evaluating C on a uniformly random input from {0,1}^n. With this notation, the ENTROPY APPROXIMATION PROBLEM is defined below.

Definition 4.2.3 (MIN-MAX ENTROPY APPROXIMATION PROBLEM). Let g = g(n) ∈ ℝ be a function such that 0 < g(n) ≤ n. The MIN-MAX ENTROPY APPROXIMATION PROBLEM with gap g, denoted EAP^(g)_{min,max}, is a promise problem (YES, NO) for

YES = {(C, k) : Hmin(C) ≥ k}, and NO = {(C, k) : Hmax(C) ≤ k - g(n)},

where n is the input length of the circuit C. We also define EAP_{min,max} = EAP^(1)_{min,max}; that is, when we omit the gap g we simply mean that g = 1.

In contrast to the definitions of SDP and EDP (Definitions 2.3.6 and 2.3.7), where the gap parameters depend on the output length of the circuits, in Definition 4.2.3 the gap parameter (i.e., the function g) depends on the circuit's input length. In our current context, we are interested in constructing hash functions, which are necessarily shrinking, and whose degree of shrinkage is measured with respect to their input length. Thus, it is more suitable to have the gap parameter of EAP^(g)_{min,max} depend on the input length.

The Shannon ENTROPY APPROXIMATION PROBLEM (where Hmin and Hmax above are replaced with H), with constant gap, was shown by Goldreich et al. [GSV99] to be complete for the class NISZK (promise problems with non-interactive statistical zero-knowledge proof systems). For a discussion of generalizations of the ENTROPY APPROXIMATION PROBLEM to other notions of entropy, and other related problems, see [DGRV11].

4.2.1.1 The Assumption: Average-Case Hardness of Entropy Approximation

Our construction of MCRH is based on the average-case hardness of the ENTROPY APPROXIMATION PROBLEM EAP_{min,max} defined above (i.e., with gap 1). Recall the definition of average-case hardness of promise problems.19

Definition 4.2.4 (Average-case hardness). We say that a promise problem Π = (YES, NO) is average-case hard if there is a probabilistic polynomial-time algorithm S such that S(1^n) outputs samples from (YES ∪ NO), and for every family of polynomial-size circuits A = (A_n)_{n∈ℕ},

Pr_{x←S(1^n)}[A_n(1^n, x) = Π(x)] ≤ 1/2 + negl(n),

where Π(x) = 1 if x ∈ YES and Π(x) = 0 if x ∈ NO. We call S a hard-instance sampler for Π. The quantity Pr_{x←S(1^n)}[A_n(1^n, x) = Π(x)] - 1/2 is referred to as the advantage the algorithm A has in deciding Π with respect to the sampler S.

In our construction and proofs, it will be convenient for us to work with the problem EAP^(⌊√n⌋)_{min,max} rather than EAP_{min,max} = EAP^(1)_{min,max}. At first glance EAP^(⌊√n⌋)_{min,max} seems to be an easier problem, because its gap is ⌊√n⌋, which is much larger. The following simple proposition shows that these two problems are in fact equivalent (even in their average-case complexity). The key idea here is repetition: given a circuit C, we can construct a new circuit C' that outputs C evaluated on independent inputs, with a larger gap.

Proposition 4.2.5. EAP_{min,max} is average-case hard if and only if EAP^(⌊√n⌋)_{min,max} is average-case hard.

Proof Sketch. Any YES instance of EAP^(⌊√n⌋)_{min,max} is itself a YES instance of EAP_{min,max}, and the same holds for NO instances. So the average-case hardness of EAP^(⌊√n⌋)_{min,max} immediately implies that of EAP_{min,max}, with the same hard-instance sampler. To show the implication in the other direction, we show how to use a hard-instance sampler S for EAP_{min,max} to construct a hard-instance sampler S' for EAP^(⌊√n⌋)_{min,max}.

S' on input 1^n:

1. Sample (C, k) ← S(1^n). Let m and r be the input and output lengths of C (i.e., C: {0,1}^m → {0,1}^r).

2. Let C' be the m-fold repetition of C. Namely, C' takes an m^2-bit input x, breaks x into m disjoint blocks x_1, ..., x_m of size m each, runs a copy of C on each x_i, and outputs the concatenation of all the outputs.

3. S' outputs (C', k·m).

As C' is the m-fold repetition of C, its max- and min-entropies are m times the respective entropies of C. So if C had min-entropy at least k, then C' has min-entropy

19 Definition 4.2.4 is almost identical to Definition 3.4.4, except that the latter definition considers uniform adversaries, whereas in the context of collision-resistant hash functions, we need to consider non-uniform adversaries.

The Construction of MCRH

Let S be a hard-instance sampler for EAP^(⌊√n⌋)_{min,max}.

Gen(1^n):

1. Sample (C, k) ← S(1^n), where C maps {0,1}^n → {0,1}^(n').

2. Sample^a f ← F^(3n)_{n,(n-k)} and g ← F^(2n)_{n'+(n-k), n-⌊√n⌋}.

3. Output the circuit that computes the function h_{C,f,g}: {0,1}^n → {0,1}^(n-⌊√n⌋) that is defined as h_{C,f,g}(x) = g(C(x), f(x)).

^a Recall that F^(ℓ)_{a,b} = {f: {0,1}^a → {0,1}^b} is a family of ℓ-wise independent hash functions (see Section 2.2.3.1).

Figure 4.2.1: Construction of MCRH from the MIN-MAX ENTROPY APPROXIMATION PROBLEM.

at least k·m, and if C had max-entropy at most k - 1, then C' has max-entropy at most (k - 1)·m = k·m - m. Since the input length of C' is m^2, the gap of m is exactly ⌊√(m^2)⌋, and the proposition follows. □
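The repetition argument can be checked exhaustively on a toy circuit. In the sketch below (an illustration, not the thesis's formalism), a "circuit" is just a function on m-bit inputs, here a 2-to-1 map chosen arbitrarily; the m-fold repetition multiplies both the min- and max-entropy by m, exactly as the proof uses.

```python
from math import log2
from collections import Counter
from itertools import product

def output_probs(circuit, in_bits):
    """Output distribution of `circuit` on a uniformly random in_bits-bit input."""
    counts = Counter(circuit(x) for x in range(1 << in_bits))
    return [c / (1 << in_bits) for c in counts.values()]

def hmin(ps): return min(-log2(p) for p in ps)
def hmax(ps): return log2(len(ps))  # flat support here, so log of support size

m = 4
C = lambda x: x >> 1  # toy m-bit circuit: exactly 2-to-1 onto 2^(m-1) outputs

base = output_probs(C, m)
# m-fold repetition: run C on m independent m-bit blocks, concatenate outputs.
rep_counts = Counter(tuple(C(x) for x in xs)
                     for xs in product(range(1 << m), repeat=m))
rep = [c / (1 << (m * m)) for c in rep_counts.values()]

assert abs(hmin(rep) - m * hmin(base)) < 1e-9  # min-entropy scales by m
assert abs(hmax(rep) - m * hmax(base)) < 1e-9  # max-entropy scales by m
```

The threshold k·m and gap m then follow directly from this scaling, matching step 3 of S'.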

4.2.2 The Construction

Our construction of a Multi-Collision Resistant Hash (MCRH) family is presented in Figure 4.2.1. For simplicity of presentation, we assume that the hard-instance sampler S for EAP^(⌊√n⌋)_{min,max}, when given 1^n as input, always outputs circuits whose input length is n. The construction in Figure 4.2.1 and the proof below can be naturally adapted to the case that S outputs circuits with input lengths different from n. In particular, the input length of such circuits must be, with overwhelming probability, super-logarithmic in n; otherwise the problem would be easy to solve. Hence, any exponentially small quantity in the input length, such as those in Claims 4.2.1 and 4.2.2 below, would be negligible in the security parameter n. We now prove that the construction is secure under our average-case hardness assumption.

Theorem 4.2.6. If EAP^(⌊√n⌋)_{min,max} is average-case hard, then the construction in Figure 4.2.1 is an (s, t)-MCRH, where s = ⌊√n⌋ and t = 6n^2.

The above theorem, along with Proposition 4.2.5, now implies the following.

Corollary 4.2.7. If EAP_{min,max} is average-case hard, then there exists an (s, t)-MCRH, where s = ⌊√n⌋ and t = 6n^2.

Note that above, the shrinkage being ⌊√n⌋ guarantees that there exist 2^(⌊√n⌋)-way collisions. But the construction is such that it is not possible to find even a 6n^2-way collision (which is sub-exponentially smaller). This is significant because, unlike in the case of standard collision-resistant hash functions (i.e., in which it is hard to find a pair of collisions), shrinkage in MCRHs cannot be easily amplified by composition while maintaining the same amount of collision resistance (see Remark 4.1.2). The rest of this section is dedicated to proving Theorem 4.2.6.

Proof of Theorem 4.2.6. Let Gen denote the algorithm described in Figure 4.2.1, and let S be the hard-instance sampler used there. Fact 2.2.12, along with the fact that S runs in polynomial time, ensures that Gen runs in polynomial time as well. The shrinkage requirement of an MCRH is satisfied because here the shrinkage is s(n) = ⌊√n⌋. To demonstrate multi-collision resistance, we show how to use an adversary that finds 6n^2 collisions in hash functions sampled by Gen to break the average-case hardness of EAP^(⌊√n⌋)_{min,max}. For the rest of the proof, to avoid cluttering up notation, we will denote the problem EAP^(⌊√n⌋)_{min,max} by just EAP. We begin with an informal discussion of the proof. We first prove that large sets of collisions that exist in a hash function output by Gen have different properties depending on whether the instance that was sampled in step 1 of Gen was a YES or a NO instance of EAP. Specifically, notice that the hash functions that are output by Gen have the form h_{C,f,g}(x) = g(C(x), f(x)); we show that, except with negligible probability:

* In functions h_{C,f,g} generated from (C, k) ∈ YES, with high probability, there do not exist 3n distinct inputs x_1, ..., x_{3n} that all have the same value of (C(x_i), f(x_i)).

* In functions h_{C,f,g} generated from (C, k) ∈ NO, with high probability, there do not exist 2n distinct inputs x_1, ..., x_{2n} that all have distinct values of (C(x_i), f(x_i)), but all have the same value g(C(x_i), f(x_i)).

Note that in any set of 6n^2 collisions for h_{C,f,g}, there has to be either a set of 3n collisions for (C, f) or a set of 2n collisions for g, and so at least one of the conclusions in the above two statements is violated. A candidate average-case solver for EAP, when given an instance (1^n, C, k), runs steps 2 and 3 of the algorithm Gen from Figure 4.2.1 with this C and k. It then runs the collision-finding adversary on the hash function h_{C,f,g} that is thus produced. If the adversary does not return 6n^2 collisions, it outputs a uniformly random answer. But if these many collisions are returned, it checks which of the conclusions above is violated, and thus knows whether it started with a YES or a NO instance. So whenever the adversary succeeds in finding collisions, the distinguisher can decide EAP correctly with overwhelming probability. As long as the collision-finding adversary succeeds with non-negligible probability, the distinguisher also has non-negligible advantage, contradicting the average-case hardness of EAP. We now state and prove the above claims about the properties of sets of collisions, then formally write down the adversary outlined above and prove that it breaks the average-case hardness of EAP.

The first claim is that for hash functions h_{C,f,g} generated according to Gen using a YES instance, there is no set of 3n distinct x_i's that all have the same value for C(x_i) and f(x_i), except with negligible probability.

Claim 4.2.1. Let (C, k) be a YES instance of EAP, and let n be the input length of C. Then,

Pr_{f←F^(3n)_{n,(n-k)}}[∃y, y_1 ∈ {0,1}^* : |C^{-1}(y) ∩ f^{-1}(y_1)| ≥ 3n] ≤ 1/2^n.

Intuitively, the reason this should be true is that when C comes from a YES instance, it has high min-entropy. This means that for any y, the set C^{-1}(y) will be quite small. The function f can now be thought of as partitioning each set C^{-1}(y) into several parts, none of which will be too large, because of the load-balancing properties of many-wise independent hash functions.

Proof. The above probability can be bounded using the union bound as follows:

Pr_f[∃y, y_1 : |C^{-1}(y) ∩ f^{-1}(y_1)| ≥ 3n] ≤ Σ_{y∈Im(C)} Pr_f[∃y_1 : |C^{-1}(y) ∩ f^{-1}(y_1)| ≥ 3n].   (4.1)

The fact that (C, k) is a YES instance of EAP means that Hmin(C) ≥ k. The definition of min-entropy now implies that for any y ∈ Im(C):

log(1 / Pr_{x←{0,1}^n}[C(x) = y]) ≥ k,

which in turn means that |C^{-1}(y)| ≤ 2^(n-k). Fact 2.2.15 (about the load-balancing properties of F^(3n)_{n,(n-k)}) now implies that for any y ∈ Im(C):

Pr_f[∃y_1 : |C^{-1}(y) ∩ f^{-1}(y_1)| ≥ 3n] ≤ 1/2^(2n).   (4.2)

Combining Eqs. (4.1) and (4.2), and noting that the image of C has at most 2^n elements, we get the desired bound:

Pr_f[∃y, y_1 : |C^{-1}(y) ∩ f^{-1}(y_1)| ≥ 3n] ≤ 2^n · 1/2^(2n) = 1/2^n. □

The next claim is that for hash functions h_{C,f,g} generated according to Gen using a NO instance, there is no set of 2n values x_i that all have distinct values of (C(x_i), f(x_i)) but the same value of g(C(x_i), f(x_i)), except with negligible probability.

Claim 4.2.2. Let (C, k) be a NO instance of EAP, and let n and n' be the input and output lengths of C, respectively. Then,

Pr_{f←F^(3n)_{n,(n-k)}, g←F^(2n)}[∃x_1, ..., x_{2n} : for all i ≠ j, (C(x_i), f(x_i)) ≠ (C(x_j), f(x_j)) and g(C(x_i), f(x_i)) = g(C(x_j), f(x_j))] ≤ 1/2^n.

Proof. The fact that (C, k) is a NO instance of EAP means that Hmax(C) ≤ k - ⌊√n⌋; that is, C has a small image: |Im(C)| ≤ 2^(k-⌊√n⌋).

For any f ∈ F^(3n)_{n,(n-k)}, which is what is sampled by Gen when this instance is used, the range of f is a subset of {0,1}^(n-k). This implies that, even together, C and f have a range whose size is bounded as

|Im(C, f)| ≤ 2^(k-⌊√n⌋) · 2^(n-k) = 2^(n-⌊√n⌋),

where (C, f) denotes the function that is the concatenation of C and f. For there to exist a set of 2n inputs x_i that all have distinct values for (C(x_i), f(x_i)) but the same value for g(C(x_i), f(x_i)), there has to be a y that has at least 2n inverses under g that are all in the image of (C, f). As g comes from F^(2n), we can use Fact 2.2.15, along with the above bound on the size of the image of (C, f), to bound the probability that such a y exists as follows:

Pr_g[∃y : |g^{-1}(y) ∩ Im(C, f)| ≥ 2n] ≤ 2^(n-⌊√n⌋) / 2^(2n) ≤ 1/2^n. □

Let A = (A_n)_{n∈ℕ} be a polynomial-size family of circuits that, given a hash function output by Gen(1^n), finds a 6n^2-way collision in it with non-negligible probability. The candidate circuit family A' = (A'_n)_{n∈ℕ} for solving EAP on average is described below.

A'_n on input (1^n, C, k):

1. Run steps 2 and 3 of the algorithm Gen in Figure 4.2.1 with (C, k) in place of the instance sampled from S there. This results in the description of a hash function h_{C,f,g}.

2. Run A_n(h_{C,f,g}) to get a set of purported collisions S.

3. If S does not actually contain 6n^2 collisions under h_{C,f,g}, output a random bit.

4. If S contains 3n distinct x_i's that all have the same value of (C(x_i), f(x_i)), output 0.

5. If S contains 2n distinct x_i's that all have distinct values of (C(x_i), f(x_i)) but the same value of g(C(x_i), f(x_i)), output 1.
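Steps 4 and 5 amount to a pigeonhole argument over the collision set S: if |S| ≥ 6n^2 and all of S collides under h = g ∘ (C, f), then either some value of (C(x), f(x)) repeats 3n times, or at least 2n distinct (C, f)-values collide under g. A toy Python sketch of this classification (C and f are assumed to be given as plain functions; everything here is hypothetical scaffolding, not the thesis's formal reduction):

```python
from collections import Counter

def classify(S, C, f, n):
    """Mirror steps 4-5 of A': given a set S of >= 6n^2 collisions under
    h(x) = g(C(x), f(x)), decide which structural property is violated.
    Returns 0 (3n-way collision in (C, f), so guess YES-side) or
            1 (>= 2n distinct (C, f)-values colliding under g, so guess NO-side)."""
    inner = Counter((C(x), f(x)) for x in S)
    if max(inner.values()) >= 3 * n:
        return 0
    # All elements of S share one h-value, so every (C, f)-value seen here is
    # mapped by g to that single value: distinct keys are g-collisions.
    if len(inner) >= 2 * n:
        return 1
    return None  # impossible for |S| >= 6n^2, by pigeonhole

n = 2
S = list(range(6 * n * n))
assert classify(S, lambda x: 0, lambda x: 0, n) == 0  # everything collides in (C, f)
assert classify(S, lambda x: x, lambda x: 0, n) == 1  # all (C, f)-values distinct
```

If every inner value had fewer than 3n pre-images and there were fewer than 2n inner values, S could hold fewer than 6n^2 elements, so one of the two branches always fires.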

The following claim now states that any collision-finding adversary for the constructed MCRH can be used to break the average-case hardness of EAP, thus completing the proof.

Claim 4.2.3. If A finds 6n^2 collisions in hash functions output by Gen(1^n) with non-negligible probability, then A' has non-negligible advantage in deciding EAP with respect to the hard-instance sampler S used in Gen.

Proof. On input (1^n, C, k), the adversary A' computes h_{C,f,g} and runs A_n on it. If A_n does not find 6n^2 collisions for h_{C,f,g}, then A' guesses at random and is correct in its output with probability 1/2. If A_n does find 6n^2 collisions, then A' is correct whenever one of the following is true:

1. (C, k) is a YES instance and there is no set of 3n collisions for (C, f).

2. (C, k) is a NO instance and there is no set of 2n collisions for g in the image of (C, f).

Note that inputs to A' are drawn from S(1^n), and so the distribution over h_{C,f,g} produced by A' is the same as that produced by Gen(1^n) itself. With such samples, let E_1 denote the event that (C, f) has a set of 3n collisions from S (the set output by A_n), and let E_2 denote the event that g has a set of 2n collisions in the image of (C, f) from S. Also, let E_Y denote the event that the input to A' is a YES instance, E_N that it is a NO instance, and E_A the event that S contains at least 6n^2 collisions. Following the statements above, the probability that A' is wrong in deciding EAP with respect to (C, k) ← S(1^n) can be upper-bounded as:

Pr[A'(1^n, C, k) is wrong] = Pr[(¬E_A) ∧ (A' is wrong)] + Pr[E_A ∧ (A' is wrong)]
                           ≤ Pr[¬E_A] · (1/2) + Pr[(E_Y ∧ E_1) ∨ (E_N ∧ E_2)].

The first term comes from the fact that if A_n doesn't find enough collisions, A' guesses at random. The second term comes from the fact that if both (E_Y ∧ E_1) and (E_N ∧ E_2) are false and E_A is true, then, since at least one of E_Y and E_N is always true, one of (E_Y ∧ ¬E_1) and (E_N ∧ ¬E_2) will also be true, either of which would ensure that A' is correct, as noted earlier. We now bound the second term above, starting as follows:

Pr[(E_Y ∧ E_1) ∨ (E_N ∧ E_2)] ≤ Pr[E_Y ∧ E_1] + Pr[E_N ∧ E_2]
                              = Pr[E_Y] · Pr[E_1 | E_Y] + Pr[E_N] · Pr[E_2 | E_N]
                              ≤ Pr[E_Y] · negl(n) + Pr[E_N] · negl(n)
                              = negl(n),

where the first inequality follows from the union bound and the last inequality follows from Claims 4.2.1 and 4.2.2.

Putting this back in the earlier expression,

Pr[A'(1^n, C, k) is wrong] ≤ Pr[¬E_A] · (1/2) + negl(n) = 1/2 - Pr[E_A]/2 + negl(n).

In other words,

Pr[A'(1^n, C, k) is correct] ≥ 1/2 + Pr[E_A]/2 - negl(n).

So if A succeeds with non-negligible probability in finding 6n^2 collisions, then A' has non-negligible advantage in deciding EAP over S. □

This concludes the proof of Theorem 4.2.6. □

4.3 Statistically Hiding Commitments

In this section we show that multi-collision-resistant hash functions imply the existence of constant-round statistically-hiding commitments. Here we follow the "direct route" discussed in the introduction (rather than the "inaccessible entropy route"). For simplicity, we focus on bit commitment schemes (in which messages are single bits). As usual, full-fledged commitment schemes (for long messages) can be obtained by committing bit-by-bit.

Definition 4.3.1 (Bit Commitment Scheme). A bit commitment scheme is an interactive protocol between two polynomial-time parties, the sender S and the receiver R, that satisfies the following properties.

1. The protocol proceeds in two stages: the commit stage and the reveal stage.

2. At the start of the commit stage both parties get a security parameter 1^n as a common input, and the sender S also gets a private input b ∈ {0, 1}. At the end of the commit stage the parties have a shared output c, which is called the commitment, and the sender S has an additional private output d, which is called the decommitment.

3. In the reveal stage, the sender S sends (b, d) to the receiver R. The receiver R accepts or rejects based on c, d and b. If both parties follow the protocol, then the receiver R always accepts.

In this section we focus on commitment schemes that are statistically-hiding and computationally-binding.

Definition 4.3.2 (Statistically Hiding Bit Commitment). A bit commitment scheme (S, R) is statistically-hiding if for every cheating receiver R̃ it holds that

SD(⟨S(0), R̃⟩(1^n), ⟨S(1), R̃⟩(1^n)) = negl(n),

where ⟨S(b), R̃⟩(1^n) denotes the transcript of the interaction between R̃ and S(b) in the commit stage.
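To make the hiding condition concrete, here is a small helper (an illustrative sketch, not from the thesis) that computes the statistical distance of two finite distributions; for a perfectly hiding toy commitment the two transcript distributions coincide, so the distance is exactly 0:

```python
def statistical_distance(p, q):
    """SD of two distributions given as dicts mapping outcomes to probabilities."""
    support = set(p) | set(q)
    return sum(abs(p.get(o, 0) - q.get(o, 0)) for o in support) / 2

# One-time-pad commitment to a bit b: c = b XOR key, for a uniform key.
# The commitment distributions for b = 0 and b = 1 are both uniform, so SD = 0.
view0 = {0: 0.5, 1: 0.5}
view1 = {0: 0.5, 1: 0.5}
print(statistical_distance(view0, view1))  # 0.0
```

A statistically-hiding scheme only requires this distance to be negligible rather than zero.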

Definition 4.3.3 (Computationally Binding Bit Commitment). A bit commitment scheme (S, R) is computationally-binding if for every family of polynomial-size circuits (a cheating sender) S̃ = (S̃_n)_{n∈N}, it holds that S̃ wins the following game with only negl(n) probability:

1. The cheating sender S̃_n interacts with the honest receiver R(1^n) in the commit stage, obtaining a commitment c.

2. Then, S̃_n outputs two pairs (0, d_0) and (1, d_1). The cheating sender S̃ wins if the honest receiver R accepts both (c, 0, d_0) and (c, 1, d_1).
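The binding game of Definition 4.3.3 can be sketched as code. Everything below (`binding_game`, `toy_verify`, the XOR commitment) is an illustrative invention, not part of the thesis:

```python
import secrets

def binding_game(cheating_sender, verify, n):
    """One run of the binding game: the cheating sender returns a commitment c
    plus claimed openings d0 (to bit 0) and d1 (to bit 1); `verify` is the
    honest receiver's reveal-stage check. The sender wins iff both pass."""
    c, d0, d1 = cheating_sender(n)
    return verify(c, 0, d0) and verify(c, 1, d1)

# Toy (deliberately insecure) commitment: c = b XOR d for a random bit d.
def toy_verify(c, b, d):
    return c == b ^ d

def honest_commit(b):
    d = secrets.randbits(1)
    return b ^ d, d

# Correctness: an honest opening is always accepted ...
c, d = honest_commit(1)
assert toy_verify(c, 1, d)

# ... but the toy scheme is NOT binding: c = 0 opens to both bits.
def equivocator(n):
    return 0, 0, 1        # verify(0, 0, 0) and verify(0, 1, 1) both pass

print(binding_game(equivocator, toy_verify, 8))  # True: binding is broken
```

The point of the construction below is to rule out such equivocation for any polynomial-size sender.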

We are now ready to state the main result of this section. A round of a commitment scheme is a pair of messages, the first sent from the receiver to the sender, and the second the other way.

Theorem 4.3.4 (MCRH ⇒ Constant-Round Statistically-Hiding Commitments). Let t = t(n) ∈ N be a polynomial computable in poly(n) time. Assume that there exists an (s, t)-MCRH for s ≥ log(t). Then there exists a three-round statistically-hiding computationally-binding commitment scheme.

As we already mentioned in Section 4.1, constructions of statistically hiding computationally binding commitment schemes are known assuming only the minimal assumption that one-way functions exist. Those constructions, however, have a polynomial number of rounds (and this is inherent for black-box constructions [HHRS15]). Theorem 4.3.4, on the other hand, yields a commitment scheme with only a constant (i.e., three) number of rounds. The rest of this section is dedicated to proving Theorem 4.3.4.

4.3.1 Proving Theorem 4.3.4

The proof follows the outline detailed in Section 4.1.3.2. Let Gen be the generating algorithm that defines an (s, t)-MCRH for s ≥ log(t), assumed to exist in the theorem's statement. Since s must be an integer, we can assume without loss of generality that the function defined by Gen is a (⌈log(t)⌉, t)-MCRH (we can always pad the output of the function without making it easier to find collisions). The commitment scheme is defined in Fig. 4.3.1. The proof follows from the next two lemmas.

Lemma 4.3.5 (Computationally Binding). The commitment scheme (S, R) in Fig. 4.3.1 is computationally binding.

Lemma 4.3.6 (Statistically Hiding). The commitment scheme (S, R) in Fig. 4.3.1 is statistically hiding.

The proofs of Lemmas 4.3.5 and 4.3.6 are given in the next two sections.

The Commitment Scheme (S, R)

S's Input: security parameter 1^n and a bit b ∈ {0, 1}.
R's Input: security parameter 1^n.
Algorithm Gen: a polynomial-time algorithm that on input 1^n returns a circuit computing a (⌈log(t(n))⌉, t(n))-MCRH h: {0, 1}^n → {0, 1}^{n−⌈log(t(n))⌉}.

The commit stage:

1. Both parties set t = t(n) and k = n · t.

2. R samples h ← Gen(1^n) and sends h to S.

3. S samples x = (x_1, ..., x_k) ← {0, 1}^{n·k}, computes y = (y_1, ..., y_k), where y_i = h(x_i) for all i ∈ [k], and sends y to R.

4. R samples^a f ← F^{2⌈k·log(t−1)⌉}_{n·k, ⌈k·log(t−1)⌉} and sends f to S.

5. S sends z_1 = f(x) to R.

6. R samples g ← F^2_{n·k, ⌈2·log(k) + 2·log log(t−1) + log²(n)⌉} and sends g to S.

7. S sends z_2 = g(x) to R.

8. S samples r ← {0, 1}^{n·k}, computes σ = ⟨r, x⟩ ⊕ b and sends (r, σ) to R.

9. The commitment is defined as c = (h, y, f, z_1, g, z_2, r, σ) and the decommitment is defined as d = x.

The reveal stage:

1. S sends (b, x) to R.

2. R accepts if h(x_i) = y_i for every i ∈ [k], f(x) = z_1, g(x) = z_2 and ⟨r, x⟩ ⊕ b = σ.

^a Recall that F^ℓ_{n,m} is a family of ℓ-wise-independent hash functions from {0, 1}^n to {0, 1}^m (see Section 2.2.3.1).

Figure 4.3.1: Statistically Hiding Commitment
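The data flow of the commit and reveal stages of Fig. 4.3.1 can be sketched as follows. This is only an illustrative mock-up: `hashlib` truncation stands in for a real MCRH, and f, g are plain callables rather than genuinely many-wise independent functions:

```python
import hashlib, secrets

def ip2(r, x):
    """Inner product of two equal-length bit lists over GF(2)."""
    return sum(a & b for a, b in zip(r, x)) % 2

def commit(b, x, r, h, f, g):
    # Sender's side of the commit stage (h, f, g are public).
    y = tuple(h(xi) for xi in x)
    z1, z2 = f(x), g(x)
    flat = [bit for xi in x for bit in xi]
    sigma = ip2(r, flat) ^ b
    c = (y, z1, z2, tuple(r), sigma)
    return c, x                         # decommitment d = x

def reveal_ok(c, b, x, h, f, g):
    # Receiver's reveal-stage check.
    y, z1, z2, r, sigma = c
    flat = [bit for xi in x for bit in xi]
    return (all(h(xi) == yi for xi, yi in zip(x, y))
            and f(x) == z1 and g(x) == z2
            and ip2(r, flat) ^ b == sigma)

# Toy instantiation with n = 8, k = 3.
n, k = 8, 3
h = lambda xi: hashlib.sha256(bytes(xi)).digest()[:1]   # stand-in "shrinking hash"
f = lambda x: hashlib.sha256(b"f" + bytes(bt for xi in x for bt in xi)).digest()[:2]
g = lambda x: hashlib.sha256(b"g" + bytes(bt for xi in x for bt in xi)).digest()[:2]
x = [[secrets.randbits(1) for _ in range(n)] for _ in range(k)]
r = [secrets.randbits(1) for _ in range(n * k)]
c, d = commit(1, x, r, h, f, g)
assert reveal_ok(c, 1, d, h, f, g)        # honest opening is accepted
assert not reveal_ok(c, 0, d, h, f, g)    # the wrong bit is rejected
```

Note how the last component ⟨r, x⟩ ⊕ b makes the two openings of any equivocating sender necessarily use distinct decommitments x ≠ x', which is what the binding analysis below exploits.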

4.3.1.1 Analyzing Binding - Proving Lemma 4.3.5

Assume toward a contradiction that the scheme is not computationally binding. That is, there exists a polynomial-size family of circuits S̃ = (S̃_n)_{n∈N}, an infinite index set I ⊆ N and a polynomial q such that for every n ∈ I the following events occur with probability at least 1/q(n): (1) the cheating sender S̃_n interacts with the honest receiver R(1^n) in the commit stage of the protocol and the parties obtain a commitment

c = (h, y = (y_1, ..., y_k), f, z_1, g, z_2, r, σ); then, (2) S̃_n outputs valid decommitments to two distinct values (0, x = (x_1, ..., x_k)) and (1, x' = (x'_1, ..., x'_k)) such that

∀i ∈ [k]: h(x_i) = h(x'_i) = y_i,  f(x) = f(x') = z_1  and  g(x) = g(x') = z_2,  (4.3)
⟨r, x⟩ ⊕ 0 = σ  and  ⟨r, x'⟩ ⊕ 1 = σ.

Whenever the above conditions are met, we say that S̃ wins. We use S̃ to find a t-way collision in h, thereby deriving a contradiction.^20 Let τ be the sequence of random coins used by S̃ during its interaction with R.

Observe that y, the first message sent by S̃, is a deterministic function of τ and h, where h is the first message sent by R (i.e., a description of an MCRH). Similarly, z_1, the second message sent by S̃, is a deterministic function of τ, h and f, where f is the second message sent by R. Finally, r, σ, x and x', the values sent in the third message of S̃, are all deterministic functions of τ, h, f and g, where g is the third message sent by R. Hence, for any τ and h we can define a set

W_{τ,h} = {(f, g) : (h, y, f, g, r, σ, x, x') satisfy the conditions in Eq. (4.3)}.

That is, W_{τ,h} contains all many-wise independent hash functions f and g that lead the adversary to successfully break the binding property of the commitment scheme (with respect to the fixed τ and h). We can now describe the algorithm for finding a t-way collision in h. The (non-uniform) algorithm CollFinder = (CollFinder_n)_{n∈N} is defined in Fig. 4.3.2. It is easy to verify that CollFinder_n is of polynomial size.^21 In the rest of the proof we show that CollFinder finds a t-way collision in h with probability roughly 1/q(n), contradicting the multi-collision resistance of the hash family. Intuitively, the sets S_i store collisions of h. The choice of f and g guarantees that, with probability at least 1/poly(n), as long as |S_i| < t for every i, in every iteration of the main loop of CollFinder in which S̃ wins, at least one of the S_i grows. Iterating the loop sufficiently many times guarantees that with probability at least 1/poly(n), one of the S_i's contains at least t values (i.e., a t-way collision) at the end of the loop.

^20 Since S̃ is a non-uniform adversary, we could have assumed without loss of generality that it is deterministic. We refrain from doing so to highlight that our reduction is uniform.
^21 We assume, without loss of generality, that q(n) can be computed in time poly(n) (otherwise, take q' ≥ q that is efficiently computable).

CollFinder_n, on input (1^n, h):

1. Set t = t(n), k = n · t and q = q(n).
2. Sample random coins^a for S̃_n, denoted by τ.
3. Emulate^b S̃_n(h; τ) to obtain y = (y_1, ..., y_k).
4. For every i ∈ [k], set S_i = ∅.
5. Repeat for 4 · q² · n · (t − 1) · k² times:
   (a) Sample f ← F^{2⌈k·log(t−1)⌉}_{n·k, ⌈k·log(t−1)⌉}.
   (b) Emulate S̃_n(f; τ) to obtain z_1.
   (c) Sample g ← F^2_{n·k, ⌈2·log(k) + 2·log log(t−1) + log²(n)⌉}.
   (d) Emulate S̃_n(g; τ) to obtain r, σ, x = (x_1, ..., x_k) and x' = (x'_1, ..., x'_k).
   (e) If S̃ wins,^c then for every i update S_i = S_i ∪ {x_i, x'_i}.
6. Output S_out = S_j such that j = argmax_i{|S_i|}.

^a Again, considering a randomized S̃_n is not necessary and is only done to highlight that the reduction is uniform.
^b By A(·; τ) we mean that A is run with its coins set to τ.
^c Namely, if the conditions in Eq. (4.3) are satisfied, or equivalently if (f, g) ∈ W_{τ,h}.

Figure 4.3.2: Algorithm to find a t-way collision in h
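The rewinding loop of CollFinder can be sketched as follows; `emulate_sender` is a hypothetical callback standing in for rerunning S̃ on fixed coins τ, and the stub driver at the bottom is an invented test harness:

```python
def coll_finder(emulate_sender, sample_f, sample_g, k, iters):
    """Skeleton of CollFinder (Fig. 4.3.2): rewind the sender with fresh
    (f, g) each iteration and collect colliding preimages per index i."""
    S = [set() for _ in range(k)]
    emulate_sender("first")              # fixes y = (y_1, ..., y_k)
    for _ in range(iters):
        f, g = sample_f(), sample_g()    # fresh many-wise independent f, g
        win, x, xp = emulate_sender("rest", f, g)
        if win:                          # i.e., the conditions of Eq. (4.3) hold
            for i in range(k):
                S[i] |= {x[i], xp[i]}    # x_i and x'_i collide under h
    return max(S, key=len)               # S_out: the largest collision set

# Tiny driver with a stub sender that "wins" every iteration:
import itertools
_ctr = itertools.count()
def _stub(stage, *args):
    if stage == "first":
        return None
    c = next(_ctr)
    return True, (c, 0), (c + 1, 0)

out = coll_finder(_stub, lambda: None, lambda: None, k=2, iters=5)
assert len(out) == 6   # S_1 collected {0, 1, 2, 3, 4, 5}
```

In the real reduction the sampling of f and g, and the win condition, are exactly those of the figure; the analysis below shows that with noticeable probability some S_i reaches size t.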

Formally, fix some large enough n ∈ I and remove it from the notation. Observe that S_out, the set of alleged collisions returned by CollFinder, is updated only when S̃ wins. Thus, it holds that h(x) = h(x') for every x, x' ∈ S_out. Let L be the random variable equal to the size of S_out in a random execution of CollFinder(1^n, h), for h ← Gen(1^n). Hence, our goal is to show that

Pr[L ≥ t] ≥ Ω(1/q). (4.4)

Our first step is to analyze how the choice of (τ, h) affects the success probability of CollFinder. Let (T, H, F, G) be (jointly distributed) random variables induced by the values of (τ, h, f, g) in a random execution of (S̃, R), where T are the random coins used by S̃. Note that (T, H) are independent of (F, G), the random variable H is distributed as the output of Gen(1^n), and T is distributed as the value of τ in Step 2 in a random execution of CollFinder(1^n, H). Let

W = {(τ, h) : Pr[(F, G) ∈ W_{τ,h}] ≥ 1/(2q)}.

We make use of the next claim.

Claim 4.3.1. It holds that Pr[(T, H) ∈ W] ≥ 1/(2q).

Proof. Let P_{τ,h} = Pr[(F, G) ∈ W_{τ,h}]. We can lower bound the expectation of the random variable P_{T,H} as follows:

E[P_{T,H}] = Pr[(F, G) ∈ W_{T,H}] = Pr[S̃ wins] ≥ 1/q.

Using elementary probability, and the fact that P_{T,H} takes values in [0, 1], we have that

1/q ≤ E[P_{T,H}] ≤ Pr[P_{T,H} ≥ 1/(2q)] · 1 + (1 − Pr[P_{T,H} ≥ 1/(2q)]) · 1/(2q) ≤ Pr[P_{T,H} ≥ 1/(2q)] + 1/(2q),

and so Pr[P_{T,H} ≥ 1/(2q)] ≥ 1/(2q). □

For (τ, h), let L_{τ,h} denote the random variable distributed as L conditioned on (T, H) = (τ, h). Using Claim 4.3.1, we have that

Pr[L ≥ t] = E_{(τ,h)←(T,H)}[Pr[L_{τ,h} ≥ t]]  (4.5)
≥ Pr[(T, H) ∈ W] · E_{(τ,h)←(T,H)}[Pr[L_{τ,h} ≥ t] | (τ, h) ∈ W]
≥ (1/(2q)) · E_{(τ,h)←(T,H)}[Pr[L_{τ,h} ≥ t] | (τ, h) ∈ W].

In the rest of the proof we show that for every fixed (τ, h) ∈ W, it holds that

Pr[L_{τ,h} ≥ t] ≥ 1 − negl(n). (4.6)

At this point it might be helpful to explain our approach in the rest of the proof. Our goal is to show that with high probability CollFinder finds a t-way collision, or equivalently that at the end of the loop in Step 5 at least one of the sets S_1, ..., S_k contains at least t elements. By definition, these sets are updated only when S̃ wins in some iteration j of the loop. However, even if S̃ wins in the j'th iteration it does not necessarily mean that the size of one of the S_i's increased, namely that there exists i ∈ [k] such that x_i ∉ S_i or x'_i ∉ S_i. The latter condition is guaranteed to occur if some injective condition - denoted for now by inj^{(j)} - is met, where inj^{(j)} is defined over the choice of the functions f and g chosen in the j'th iteration and the current value of the sets S_1, ..., S_k (i.e., their value at the beginning of the j'th iteration). If inj^{(j)} is met for enough iterations (roughly t · k), one of the sets must contain at least t elements. Ideally, we would like to argue that inj^{(j)} occurs in many iterations since in every iteration f and g are chosen independently from many-wise independent hash families. However, arguing this turns out not to be straightforward. The reason is the dependency between inj^{(j)} and inj^{(j')}, for j' < j. Indeed, inj^{(j)} depends on the value of the sets S_1, ..., S_k at the beginning of the j'th iteration, and the value of these sets depends on whether previous injective conditions were met. To circumvent this issue we take a different approach - instead of analyzing the

dynamic updates of the sets S_1, ..., S_k, we fix in advance sets S'_1, ..., S'_k, all of which contain at most t − 1 elements. We show that only with very small probability are the sets S_1, ..., S_k equal to S'_1, ..., S'_k at the end of CollFinder's run. To do so, we redefine inj^{(j)} to depend on the functions f and g that are chosen in the j'th iteration and the fixed sets S'_1, ..., S'_k. Having done that, inj^{(j)} and inj^{(j')} are now indeed independent, and a straightforward analysis can be applied. To complete the argument we must show the above for any possible choice of S'_1, ..., S'_k. Since |S'_i| < t, there is (relatively) a small number of possible choices for S'_1, ..., S'_k, and we can apply a union bound over all of them.

Going back to the formal proof, fix (τ, h) ∈ W and let A_1, ..., A_k be random variables induced by the values of the sets S_1, ..., S_k at the end of a random execution of CollFinder, conditioned on (T, H) = (τ, h). It holds that

Pr[L_{τ,h} < t] = Pr[|A_i| ≤ t − 1 for every i ∈ [k]]  (4.7)
≤ Σ_{S'_1,...,S'_k ⊆ {0,1}^n : ∀i, |S'_i| ≤ t−1} Pr[A_i = S'_i for every i ∈ [k]].

For f and z, let S'_{f,z} = {w ∈ S'_1 × ... × S'_k : f(w) = z}.

Claim 4.3.2. If (h, y, f, z_1, g, z_2, r, σ, x, x') satisfy the conditions in Eq. (4.3) and g is injective over S'_{f,z_1}, then there exists i ∈ [k] such that x_i ∉ S'_i or x'_i ∉ S'_i.

Proof. First, note that since the conditions in Eq. (4.3) are satisfied, it holds that ⟨r, x⟩ ⊕ 0 = σ and ⟨r, x'⟩ ⊕ 1 = σ, and thus x ≠ x'. Assume toward a contradiction that x_i ∈ S'_i and x'_i ∈ S'_i for every i ∈ [k]. It follows that x, x' ∈ S'_1 × ... × S'_k. Since the conditions in Eq. (4.3) are satisfied it holds that f(x) = f(x') = z_1, and thus x, x' ∈ S'_{f,z_1}. Using once more that the conditions in Eq. (4.3) are satisfied, it holds that g(x) = g(x'), a contradiction to the assumption that g is injective over S'_{f,z_1}. □

Recall that z_1, the sender's message in the second round of the commitment protocol, is a deterministic function of τ, h and f, and since τ and h are fixed, z_1 is a deterministic function of f, i.e., z_1 = z_1(f). For (f, g), let inj(f, g) = 1 if g is injective over the set S'_{f,z_1(f)}.

Claim 4.3.3. It holds that Pr[inj(F, G) = 0] ≤ 1/(2q²).

Proof. For f ∈ Supp(F), let B_f = |S'_{f,z_1(f)}|. First, we show that with high probability B_F is small. Indeed, since |S'_i| ≤ t − 1 for every i ∈ [k], it follows that |S'_1 × ... × S'_k| ≤ (t − 1)^k. Moreover, since F is chosen from the family F^{2⌈k·log(t−1)⌉}_{n·k, ⌈k·log(t−1)⌉}, Fact 2.2.15 (about the load balancing of many-wise independent functions) yields that

Pr[B_F > 2 · ⌈k · log(t − 1)⌉] ≤ Pr[∃z : |S'_{F,z}| > 2 · ⌈k · log(t − 1)⌉]
≤ 2^{−2⌈k·log(t−1)⌉ + ⌈k·log(t−1)⌉} = 2^{−⌈n·t·log(t−1)⌉}.

Second, we show that if B_F is small, then inj(F, G) = 1 with high probability. Recall that in the protocol (S, R), the function G is chosen (independently of everything else) from the pairwise independent family F^2_{n·k, ⌈2·log(k) + 2·log log(t−1) + log²(n)⌉} after z_1(F) (and S'_{F,z_1(F)}) is determined. Thus, we can use the pairwise independence of G to complete the proof. Fix f ∈ Supp(F) with B_f ≤ 2 · ⌈k · log(t − 1)⌉. It holds that

Pr[inj(F, G) = 0 | F = f] = Pr[∃x ≠ x' ∈ S'_{f,z_1(f)} : G(x) = G(x')]
≤ Σ_{x≠x' ∈ S'_{f,z_1(f)}} Pr[G(x) = G(x')]
≤ (2 · (k · log(t − 1) + 1))² · 2^{−(2·log(k) + 2·log log(t−1) + log²(n))}
≤ 10 · 2^{−log²(n)}.

Finally, combining the above we have that

Pr[inj(F, G) = 0] ≤ Pr[B_F > 2 · ⌈k · log(t − 1)⌉] + Pr[inj(F, G) = 0 | B_F ≤ 2 · ⌈k · log(t − 1)⌉]
≤ 2^{−⌈n·t·log(t−1)⌉} + 10 · 2^{−log²(n)} ≤ 1/(2q²),

where the last inequality holds for large enough n and since q ∈ poly(n). □
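The pairwise-independence collision bound used above - any two fixed distinct points of a pairwise independent function collide with probability equal to one over the range size - can be verified exhaustively on a toy family. The classic family h_{a,b}(x) = (a·x + b) mod p over Z_p serves as a small stand-in for the family F² used in the proof:

```python
# Exhaustively verify the pairwise collision bound for the toy family
# h_{a,b}(x) = (a*x + b) mod p over Z_p, which is pairwise independent.
p = 11                      # small prime; range size is p
x1, x2 = 3, 7               # two fixed distinct inputs
collisions = sum(1 for a in range(p) for b in range(p)
                 if (a * x1 + b) % p == (a * x2 + b) % p)
print(collisions / p ** 2)  # exactly 1/p (collision iff a = 0)
```

Summing this per-pair bound over all pairs in S'_{f,z_1(f)} is precisely the union-bound step in the proof.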

We complete the proof using the above claims. For j ∈ [4 · q² · n · (t − 1) · k²] let F^{(j)} and G^{(j)} be random variables induced by the values of f and g in the j'th iteration of the loop in Step 5 in a random execution of CollFinder, conditioned on (T, H) = (τ, h). Claim 4.3.2 implies that if inj(F^{(j*)}, G^{(j*)}) = 1 and (F^{(j*)}, G^{(j*)}) ∈ W_{τ,h} for some j*, then the random variables A_1, ..., A_k cannot take the values of the sets S'_1, ..., S'_k.

It follows that

Pr[A_i = S'_i for every i ∈ [k]]  (4.8)
≤ Pr[∀j ∈ [4 · q² · n · (t − 1) · k²] : inj(F^{(j)}, G^{(j)}) = 0 ∨ (F^{(j)}, G^{(j)}) ∉ W_{τ,h}]
= Π_{j=1}^{4·q²·n·(t−1)·k²} Pr[inj(F^{(j)}, G^{(j)}) = 0 ∨ (F^{(j)}, G^{(j)}) ∉ W_{τ,h}]
≤ Π_{j=1}^{4·q²·n·(t−1)·k²} (Pr[inj(F^{(j)}, G^{(j)}) = 0] + Pr[(F^{(j)}, G^{(j)}) ∉ W_{τ,h}])
≤ Π_{j=1}^{4·q²·n·(t−1)·k²} (1 − 1/(2q) + 1/(2q²))
≤ e^{−2·n·(t−1)·k²},

where the equality follows since the (F^{(j)}, G^{(j)})'s are independent, the third inequality follows from Claim 4.3.3, since (τ, h) ∈ W and since each (F^{(j)}, G^{(j)}) is distributed identically to (F, G), and the last inequality follows since 1 − x ≤ e^{−x} for any x ∈ (0, 1). Finally, we bound the number of different sets S'_1, ..., S'_k with |S'_i| ≤ t − 1. For every S'_i, there are

Σ_{s=0}^{t−1} (2^n choose s) ≤ t · 2^{n·(t−1)} ≤ 2^{2n·(t−1)}

different possibilities, where the first inequality follows from the bound Σ_{s=0}^{ℓ} (m choose s) ≤ (ℓ + 1) · m^ℓ. Hence, there are at most (2^{2n·(t−1)})^k = 2^{2n·(t−1)·k} different possibilities for S'_1, ..., S'_k. Combining Eqs. (4.7) and (4.8) yields that

Pr[L_{τ,h} < t] ≤ 2^{2n·(t−1)·k} · e^{−2·n·(t−1)·k²} = negl(n).

Hence, Eq. (4.6) holds, and the proof of Lemma 4.3.5 is complete.

4.3.1.2 Analyzing Hiding - Proving Lemma 4.3.6

The crux of proving that the scheme is statistically hiding is the following observation: if a function d is shrinking and X is a random input to it, then the random variable (X | d(X)) has (some form of) conditional min-entropy. Therefore, the receiver - who sees only d(X) - cannot completely recover X. The actual notion of entropy we use is that of average min-entropy (see Definition 2.2.8).

Proof of Lemma 4.3.6. Let R̃ be any (possibly unbounded) algorithm. Fix a large enough n ∈ N and remove it from the notation when it is convenient and clear from the context. Let (H, X = (X_1, ..., X_k), F, G, R) be (jointly distributed) random variables induced by the values of (h, x = (x_1, ..., x_k), f, g, r) in a random execution of (S(b), R̃), for an arbitrary b ∈ {0, 1}.^22 The transcript of the interaction between S(b) and R̃ for any b ∈ {0, 1} is thus

⟨S(b), R̃⟩ = (H, H(X_1), ..., H(X_k), F, F(X), G, G(X), R, ⟨R, X⟩ ⊕ b).

We view (H, F, G) as a description of a function Q mapping n · k bits to n · k − m bits, for

m = k · ⌈log t⌉ − ⌈k · log(t − 1)⌉ − ⌈2 · log(k) + 2 · log log(t − 1) + log²(n)⌉

bits. Namely, Q(X) = (H(X_1), ..., H(X_k), F(X), G(X)). We can thus write

⟨S(b), R̃⟩ ≡ (Q, Q(X), R, ⟨R, X⟩ ⊕ b).

Fix b ∈ {0, 1} and let U ← {0, 1} be a uniform bit. It holds that

SD((Q, Q(X), R, ⟨R, X⟩ ⊕ b), (Q, Q(X), R, U))  (4.9)
= E_{q←Q}[SD((q(X), R, ⟨R, X⟩ ⊕ b), (q(X), R, U ⊕ b))]
≤ E_{q←Q}[√(2 · 2^{−H̃_min(X | q(X))})]
≤ 2^{−(m−1)/2},

where the equality follows since R and X are independent of Q, the first inequality follows from Lemma 2.2.14 (Generalized Leftover Hash Lemma) and since the inner product is a universal hash function, and the second inequality follows from Fact 2.2.10. Finally, by the setting of parameters and for large enough n it holds that

m = k · ⌈log t⌉ − ⌈k · log(t − 1)⌉ − ⌈2 · log(k) + 2 · log log(t − 1) + log²(n)⌉  (4.10)
≥ k · log(t) − (k · log(t − 1) + 1) − (2 · log(k) + 2 · log log(t − 1) + log²(n) + 1)
= k · log(1 + 1/(t − 1)) − 2 · log(k) − 2 · log log(t − 1) − log²(n) − 2
≥ k · (1/(t − 1)) − 2 · log(k) − 2 · log log(t − 1) − log²(n) − 2
≥ (n · t) · (1/(t − 1)) − 2 · log(n · t) − 2 · log log(t − 1) − log²(n) − 2
≥ log³(n).

^22 Note that these random variables are identically distributed whether b = 0 or b = 1.
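The parameter bound of Eq. (4.10) can be sanity-checked numerically; the choice t = 8 and the powers of two for n below are arbitrary example values:

```python
import math

def m_param(n, t):
    """The entropy margin m from Eq. (4.10), for k = n*t."""
    k = n * t
    return (k * math.ceil(math.log2(t))
            - math.ceil(k * math.log2(t - 1))
            - math.ceil(2 * math.log2(k)
                        + 2 * math.log2(math.log2(t - 1))
                        + math.log2(n) ** 2))

for n in (2 ** 10, 2 ** 14):
    assert m_param(n, 8) >= math.log2(n) ** 3   # m exceeds log^3(n)
```

The dominant term is k/(t−1) ≈ n, which comfortably outgrows the subtracted polylogarithmic terms.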

Putting it all together, it holds that

SD(⟨S(0), R̃⟩, ⟨S(1), R̃⟩)
= SD((Q, Q(X), R, ⟨R, X⟩ ⊕ 0), (Q, Q(X), R, ⟨R, X⟩ ⊕ 1))
≤ Σ_{b∈{0,1}} SD((Q, Q(X), R, ⟨R, X⟩ ⊕ b), (Q, Q(X), R, U))
≤ 2 · 2^{−(m−1)/2}
≤ 2^{−(log³(n)−3)/2} = negl(n),

where the first inequality follows from the triangle inequality for statistical distance, the second inequality follows from Eq. (4.9) and the last inequality follows from Eq. (4.10). □
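The leftover-hash-lemma step in Eq. (4.9) can also be checked exhaustively on a toy example: take X uniform over 4-bit strings, let a hypothetical shrinking function q reveal the top two bits (so two bits of average min-entropy remain, i.e. m = 2), and compare (q(X), R, ⟨R, X⟩) against (q(X), R, U):

```python
from itertools import product

def ip(r, x):
    return bin(r & x).count("1") % 2    # GF(2) inner product of 4-bit ints

n = 4
q = lambda x: x >> 2     # shrinking "commitment": 2 bits of X stay hidden
joint, ideal = {}, {}
for x, r in product(range(2 ** n), repeat=2):
    pr = 1 / 4 ** n
    key = (q(x), r, ip(r, x))
    joint[key] = joint.get(key, 0) + pr
    for u in (0, 1):
        ideal[(q(x), r, u)] = ideal.get((q(x), r, u), 0) + pr / 2
sd = sum(abs(joint.get(k, 0) - ideal[k]) for k in ideal) / 2
m = 2
print(sd, 2 ** (-(m - 1) / 2))   # empirical SD vs. the 2^{-(m-1)/2} bound
assert sd <= 2 ** (-(m - 1) / 2)
```

Here the empirical distance is 1/8 (coming from the choices of r whose low bits are all zero), safely below the bound of 2^{−1/2}; in the actual proof m grows like log³(n), making the distance negligible.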

4.4 Black-Box Separation

In this section, we formally state and prove a black-box separation between MCRH and one-way permutations. This separation is a straightforward corollary of our black-box construction of constant-round statistically-hiding commitments from MCRH (Theorem 4.3.4) and the known black-box separation between constant-round statistically-hiding commitment schemes and one-way permutations by Haitner, Hoch, Reingold and Segev [HHRS15]. As mentioned in the introduction, the result of this section can be viewed as an extension of Simon's [Sim98] black-box separation of CRH from one-way permutations. We start by defining one-way permutations, followed by the notion of a fully-black-box construction of MCRH from one-way permutations.

Definition 4.4.1 (One-way Permutations). A family F = {F_n}_{n∈N} of polynomial-time computable functions is called a one-way permutation if the following hold:

1. f_n: {0, 1}^n → {0, 1}^n is a permutation for all f_n ∈ F_n.

2. For every probabilistic polynomial-time algorithm A, and for sufficiently large n, it holds that

Pr[A(f_n, 1^n, y) = f_n^{−1}(y)] ≤ negl(n),

where negl(·) is some negligible function and the probability is over y ← {0, 1}^n and the randomness of A.

Definition 4.4.2. A fully black-box construction of an (s(n), t(n))-multi-collision-resistant function family, where s(n) ≥ log t(n), from a family of T(n)-hard one-way permutations is a pair of probabilistic polynomial-time oracle-aided algorithms (Gen, M) for which the following hold:

• Black-Box Construction: For any family f = {f_n : {0, 1}^n → {0, 1}^n}_{n∈N} of permutations, the algorithm Gen^f(1^n) outputs the description of an oracle-aided circuit C: {0, 1}^n → {0, 1}^{n−s(n)}.

• Black-Box Proof of Multi-Collision Resistance: For every family f = {f_n : {0, 1}^n → {0, 1}^n}_{n∈N} of permutations and every probabilistic polynomial-time algorithm A, if A with oracle access to f breaks the (s(n), t(n))-multi-collision resistance of Gen^f, then

Pr[M^{f,A}(y) = f^{−1}(y)] ≥ 1/T(n)

for infinitely many values of n, where M runs in time T(n) and the probability is over all possible choices of f and y ∈ {0, 1}^n and the randomness of M.

We say that a fully black-box construction (Gen, M) is ℓ(n)-security-parameter-expanding if for every adversary A as above, the reduction M on security parameter 1^n invokes A on security parameters which are at most 1^{ℓ(n)}.

We will rule out fully-black-box constructions of (s, t)-MCRHs from one-way permutations for all s ≥ log t. In contrast, note that (log(t − 1), t)-MCRHs exist trivially (and unconditionally) for all values of t.^23

Theorem 4.4.3 (Restatement of Corollary 4.1.7). Let (Gen, M) be an ℓ(n)-security-parameter-expanding fully-black-box construction of an (s, t)-MCRH from a family of T(n)-hard one-way permutations F, where t = t(n) ∈ N is a polynomial computable in poly(n) time and s(n) ≥ log(t(n)). Then

T(ℓ(n)) = 2^{Ω(n)}.

Theorem 4.4.3 shows that there does not exist a fully black-box construction of MCRH from polynomially-hard one-way permutations.^24 The proof of Theorem 4.4.3 relies on the black-box separation of constant-round statistically-hiding commitments from one-way permutations [HHRS15], stated next.

Definition 4.4.4. A fully black-box construction of a statistically-hiding commitment scheme from a family of T(n)-hard one-way permutations F is a triple of probabilistic polynomial-time oracle-aided algorithms (S, R, M) for which the following hold:

• Correctness and Hiding: The commitment scheme (S^f, R^f) satisfies correctness (i.e., satisfies Definition 4.3.1) and statistical hiding (see Definition 4.3.2) for every f ∈ F.

• Black-Box Proof of Binding: For every family f = {f_n : {0, 1}^n → {0, 1}^n}_{n∈N} of permutations and for every probabilistic polynomial-time algorithm S̃ such that S̃ breaks the binding of (S^f, R^f), it holds that

Pr[M^{f,S̃}(y) = f^{−1}(y)] ≥ 1/T(n)

^23 Consider a (t − 1)-regular function f : {0, 1}^n → {0, 1}^{n−log(t−1)}. For such a function, t-way collisions simply do not exist.
^24 This theorem can be strengthened by considering an additional parameter: the security reduction M's running time. This allows one to rule out constructions relying on sub-exponential assumptions (cf. [HHRS15, Footnote 21]). However, we do not consider this generalization here.

for infinitely many values of n, where M runs in time T(n) and the probability is over all possible choices of f and y ∈ {0, 1}^n and the randomness of M.

We say that a fully black-box construction (S, R, M) is ℓ(n)-security-parameter-expanding if for every adversary S̃ as above, the reduction M on security parameter 1^n invokes S̃ on security parameters which are at most 1^{ℓ(n)}.

Theorem 4.4.5 ([HHRS15, Theorem 6.3]). Any ℓ(n)-security-parameter-expanding fully-black-box construction of a d(n)-round statistically hiding commitment scheme from a T(n)-hard family of one-way permutations must satisfy the following relation:

d(ℓ(n)) = Ω(n / log T(ℓ(n))).

The proof of Theorem 4.4.3 is now a simple corollary of Theorem 4.4.5 together with the observation that our construction of a 3-round statistically-hiding commitment scheme from MCRH is fully black-box (Theorem 4.3.4).

Proof of Theorem 4.4.3. Let (Gen, M) be an ℓ(n)-security-parameter-expanding fully-black-box construction of an (s, t)-MCRH from a family of T(n)-hard one-way permutations F, where t = t(n) ∈ N is a polynomial computable in poly(n) time and s(n) ≥ log t(n). Observe that the composition of our construction of a 3-round statistically hiding commitment from MCRH (Theorem 4.3.4) and the fully black-box construction (Gen, M) yields an ℓ(n)-security-parameter-expanding fully-black-box construction of a 3-round statistically hiding commitment from a q(ℓ(n)) · T(ℓ(n))-hard family of one-way permutations, for some polynomial q determined by the proof of Theorem 4.3.4. Theorem 4.4.5 now yields that

3 ≥ c · n / log(q(ℓ(n)) · T(ℓ(n)))

for some universal constant c > 0. Rearranging, we get

T(ℓ(n)) ≥ 2^{c·n/3} / q(ℓ(n)) = 2^{Ω(n)},

where the equality follows since q and ℓ are polynomials. □
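The rearrangement above can be sanity-checked numerically; c, n and q below are arbitrary example values, with T set to its boundary value:

```python
import math

# If 3 >= c*n / log2(q*T), then T >= 2^{c*n/3} / q.
c, n, q = 0.5, 120, 1000.0
T = 2 ** (c * n / 3) / q                 # boundary value of T
# At the boundary, log2(q*T) = c*n/3, so the round bound holds with equality.
assert 3 >= c * n / math.log2(q * T) - 1e-9
assert T > 1
```

Any smaller T would violate the 3-round bound of Theorem 4.4.5, which is exactly the contradiction the proof exploits.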

Chapter 5

Zero-Knowledge Interactive Proofs of Proximity

In this chapter, we study zero-knowledge interactive proofs of proximity (ZKPP). We show several properties that possess very efficient statistical ZKPPs. On the other hand, we show that there exists a property for which any statistical ZKPP must be highly inefficient. Finally, we also study the computational setting of ZKPP, and show such interactive proofs of proximity for a wide range of complexity classes, assuming standard cryptographic assumptions.

This chapter is based on [BRV17].^1

5.1 Overview

The standard model of interactive proofs, introduced by Goldwasser, Micali and Rackoff [GMR89], allows a polynomial-time verifier to check the correctness of a computational statement, typically formulated as membership of an input x in a language L, using an interactive protocol. Given the vast amounts of data that are available nowadays, and the ubiquity of cloud computing, in some applications polynomial-time or even linear-time verification may be too slow. A recent line of work, initiated by Rothblum, Vadhan and Wigderson [RVW13], following the earlier work of Ergün, Kumar and Rubinfeld [EKR04], asks whether we can construct interactive proofs in which the verifier runs in sub-linear time. Since the verifier cannot even read the entire input, we cannot hope to obtain sub-linear time verification in general (even for some very simple computations^2). Thus, following the property testing literature [RS96, GGR98] (see also [Gol17]), the verifier is given oracle access to the input, and soundness is relaxed. Namely, the verifier is only required to reject inputs that are far (in Hamming distance) from being in the language. Since the verifier is only assured that the input x is close to the language L, these proof-systems are called interactive proofs of proximity, or IPPs for short.

^1 An extended abstract version of [BRV17] was published at ITCS 2018 [BRV18].
^2 Consider for example verifying whether a given string has parity 0.

Recent results ([RVW13, GR18, FGL14, KR15, GGR18, RRR16, GG18, GR17]) have demonstrated that many languages admit IPPs with sublinear-time verification.^3 One of the main features of classical interactive proofs (over their non-interactive counterparts) is that they allow for proving statements in zero-knowledge [GMR89, GMW91]: amazingly, it is possible to prove that x ∈ L without revealing anything other than that. Beyond being of intrinsic interest, zero-knowledge proofs have a multitude of applications, especially in cryptography. In this chapter, we initiate the study of zero-knowledge proofs of proximity, or ZKPP for short. Specifically, we ask:

Is it possible to prove the correctness of a computation to a verifier that reads only few bits of the input, without revealing any additional "non-local" information about the input?

By non-local information, we mean any information that cannot be inferred by making only a few queries to the input. In particular, and in contrast to the classical zero-knowledge setting, we want our notion of zero-knowledge to capture the fact that the verifier does not even learn the input string itself.

The Model of Zero-Knowledge Proofs of Proximity. As expected, we capture the desired zero-knowledge requirement using the simulation paradigm of [GMR89].

Definition 5.1.1 (ZKPP, informally stated (see Section 5.2)). An IPP with prover P and verifier V is a ZKPP if for any (possibly malicious) verifier Ṽ that, given oracle access to an input of length N, runs in time t(N) ≤ N, there exists a simulator S that runs in time roughly t(N) such that for every x ∈ L it holds that

⟨P(x), Ṽ^x⟩ ≈ S^x,

where ⟨P(x), Ṽ^x⟩ denotes Ṽ's view when interacting with P.

In particular, if the verifier cannot afford to read the entire input, then the simulator must successfully simulate the verifier's view even though it too cannot read the entire input. See Section 5.2 for the formal definition of the model and additional discussions.

Knowledge Tightness and Simulation Overhead. The above informal definition of zero-knowledge requires that for any possible cheating verifier that runs in (sublinear) time t, there exists a simulator, running in roughly the same time, that

^3 Throughout this chapter we use the verification time as our primary complexity measure for IPPs. We could have alternatively chosen to view the total number of bits observed by the verifier (i.e., those read from the input and those communicated from the prover) as the main resource (note that the verification time is an upper bound on the latter). Focusing on verification time makes our upper bounds stronger, whereas our lower bounds also hold with respect to the total number of bits observed by the verifier.

simulates the verifier's view. We call the running time of the simulator, viewed as a function of t, the simulation overhead of the protocol.^4 In the zero-knowledge literature, the simulation overhead s = s(t) is typically allowed to be any polynomially-bounded function. This is motivated by the fact that such polynomial-time simulation implies that every polynomial-time verifier strategy has a polynomial-time simulation. In contrast, since in our setting of ZKPP the verifier runs in sub-linear time, we will sometimes need to be more precise. Suppose for example that we had a ZKPP with a t = √N time verifier (where N is the input length) and with some unspecified polynomial simulation bound s = s(t). In such a case, if for example s(t) = Ω(t²), then the simulator would be able to read the entire input whereas the verifier clearly cannot. This leads to an undesirable gap between the power of the verifier and that of the simulator. Thus, to obtain more meaningful results we will sometimes need to precisely specify the simulation overhead that is incurred. Nevertheless, since most (but not all) of our results deal with verifiers that run in poly-logarithmic time, unless we explicitly state otherwise, our default is to allow for polynomial simulation overhead. Indeed, in the poly-logarithmic regime, polynomial simulation implies that every poly-logarithmic time verifier strategy has a poly-logarithmic time simulation. In the few cases where we need to be more precise, the simulation overhead will be stated explicitly. We remark that our quantification of the simulation overhead is closely related to Goldreich's [Gol01, Section 4.4.4.2] notion of knowledge tightness of standard zero-knowledge proofs.

A Cryptographic Motivation from the 90's. Interestingly, the notion of ZKPP has already implicitly appeared in the cryptographic literature 20 years ago. Bellare and Yung [BY96] noticed that the soundness of the [FLS99] construction of a non-interactive zero-knowledge proof-system (NIZK) from trapdoor permutations breaks if the cheating prover sends a description of a function that is not a permutation. [BY96] observed that to regain soundness in the [FLS99] protocol, it suffices to verify that the given function is close to being a permutation. Focusing on the case that the domain of the permutation^5 is {0, 1}^n, [BY96] suggested the following natural non-interactive zero-knowledge proof for certifying that a function is close to a permutation: sufficiently many random elements y_1, ..., y_k in {0, 1}^n are specified as part of a common random string^6 (CRS), and the prover is

required to provide inverses x1 , . .. , xk to all of these elements. Soundness follows from the fact that if the function is far from a permutation then, with high probability, one

4In our actual definition the simulation overhead may depend also on the input length (and proximity parameter). However, the more fundamental dependence is on the (possibly cheating) verifier's running time. Thus, we omit the dependence on these additional parameters from the current discussion. 5 We remark that the general case (i.e., when the domain is not {0, 1 }) introduces significant difficulties. See [GR13] and [CL17] for details. 'Recall that NIZKs inherently require the use of a CRS.

119 of these elements will simply not have an inverse. Zero-knowledge is demonstrated by having the simulator sample the x's at random and obtain the y's by evaluating the permutation. Since the verifier in the [BY96] protocol is only assured that the function is close to a permutation, in our terminology, the [BY96] protocol is a non-interactive ZKPP. Notice that the verifier runs in time poly(n), which is poly-logarithmic in the input (i.e., the truth table of f).
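The [BY96]-style check can be sketched in a few lines of Python. This is an illustration only, with all names ours: f is modeled as an ordinary Python function on integers in {0, ..., 2^n - 1}, and the honest prover inverts by brute force, standing in for the computationally unbounded prover.

```python
import secrets

def by96_check(f, n, k):
    """Sketch of the [BY96]-style test that f: {0,1}^n -> {0,1}^n is
    close to a permutation: k random targets must all be inverted."""
    ys = [secrets.randbelow(2 ** n) for _ in range(k)]  # CRS elements
    xs = []
    for y in ys:
        # Honest (unbounded) prover: brute-force an inverse for each y.
        x = next((x for x in range(2 ** n) if f(x) == y), None)
        if x is None:
            return False  # y has no preimage, so f is not onto
        xs.append(x)
    # Verifier: check every claimed inverse with one query each.
    return all(f(x) == y for x, y in zip(xs, ys))
```

If f is far from a permutation, a constant fraction of {0,1}^n lies outside its image, so already for moderate k some y_i has no inverse with high probability.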

5.1.1 Our Results

As is the case for standard zero-knowledge, the results that we can obtain depend heavily on the specific notion of zero-knowledge. These notions differ in what exactly it means for the output of the simulator to be indistinguishable from a real interaction. The main notion on which we focus in this chapter is that of statistical zero-knowledge proofs of proximity. Here, the requirement is that the distribution of the output of the simulator is statistically close7 to that of the real interaction.

5.1.1.1 Statistical ZKPP

The first natural question to ask is whether this notion is meaningful: do there exist statistical ZKPPs?8 More precisely, since every property tester is by itself a trivial ZKPP (in which the prover sends nothing), we ask whether statistical ZKPPs can outperform property testers. We answer this question affirmatively. Moreover, we show that the same natural problem considered by [BY96] (i.e., verifying that a function is a permutation) has a very efficient zero-knowledge proof of proximity. We emphasize that, in contrast to the protocol of [BY96] mentioned above, our protocol is zero-knowledge against arbitrary malicious verifiers (rather than only honest-verifier zero-knowledge as in the [BY96] protocol).

Theorem 5.1.2 (ZKPP for permutations, informally stated (see Section 5.3.1)). Let PERMUTATION_n be the set of all permutations on n-bit strings. Then:

* ZKPP Upper Bound: PERMUTATION_n has a 4-round statistical ZKPP in which the verifier runs in poly(n) time.

* Property Testing Lower Bound: Every tester for PERMUTATION_n must make at least Ω(2^{n/2}) queries to the input (and in particular must run in time Ω(2^{n/2})).

7 That is, the two distributions have negligible statistical distance. Negligible here refers to an auxiliary security parameter that is given to all parties; see further discussion in Remark 5.2.6.

8 Note that not every IPP is zero-knowledge. Suppose that we want to check whether a given input consists of two consecutive palindromes (of possibly different lengths) or is far from such. Alon et al. [AKNS00] showed that every tester for this property must make Ω(√N) queries. However, Fischer et al. [FGL14] observed that if the prover provides the index that separates the two palindromes, the property becomes easy to verify. The IPP of [FGL14] is not zero-knowledge, since any o(√N)-time simulator can be transformed into an o(√N)-time tester for the property, contradicting the [AKNS00] lower bound.

(Notice that poly(n) is poly-logarithmic in the input size, whereas 2^{n/2} is roughly the square root of the input size.)

Similarly to other results in the literature on constant-round statistical zero-knowledge (SZK), we can only bound the expected running time of our simulator (rather than giving a strict bound that holds with all but negligible probability). Using standard techniques, which introduce a super-constant number of rounds, we can obtain a strict bound on the simulator's running time. However, in the interest of simplicity, and since it is not our main focus, we avoid doing so.

We also remark that Gur and Rothblum [GR15] give a lower bound on the complexity of non-interactive IPPs (i.e., IPPs in which the entire interaction consists of a single message from the prover to the verifier, also known as MAPs) for PERMUTATION_n, and combining their result with ours yields a sub-exponential separation between the power of statistical ZKPPs vs. MAPs. (Specifically, [GR15] show an MAP lower bound of roughly Ω(2^{n/4}) for PERMUTATION_n.) Lastly, we mention that a variant of the permutation property was used by Aaronson [Aar12] to give an oracle separation of SZK from QMA. However, the SZK protocol that he constructs (which is essentially the [BY96] protocol) is only honest-verifier9 zero-knowledge.

Beyond the property of being a permutation, we also consider two additional graph problems, and show that they admit efficient honest-verifier ZKPP protocols. Both problems we consider are in the bounded degree graph model10, which has been widely studied in the property testing literature [GGR98, GR02].

Theorem 5.1.3 (Honest-verifier ZKPP for expansion and bipartiteness, informally stated (see Sections 5.3.2 and 5.3.3)). There exist honest-verifier statistical ZKPPs, in which the verifier's running time is polylog(N) for input graphs of size N, for the following two promise problems:

1. Promise Expansion: Distinguish graphs with (vertex) expansion α ∈ (0, 1] from graphs that are far from even having expansion roughly β = α²/log(N).

2. Promise Bipartiteness: Distinguish bipartite graphs from graphs that are both rapidly mixing and far from being bipartite.

A few remarks are in order. We first note that the property testing complexity of both promise problems is known to be Θ̃(√N) [GR02, GR11, CS10, NS10, KS11]. Second, the IPP for promise-bipartiteness is actually due to [RVW13], and we merely point out that it is an honest-verifier ZKPP. In contrast, the promise-expansion property above was not previously known to admit an (efficient) IPP (let alone a zero-knowledge one). We also remark that both of the problems in Theorem 5.1.3 refer to promise problems. In particular, we leave open the possibility of a ZKPP for bipartiteness that also handles graphs that are not rapidly mixing, and a ZKPP for expansion that accepts graphs that are α-expanding and rejects graphs that are far from α-expanding (rather than just rejecting those that are far from being α²/log(N)-expanding, as in Theorem 5.1.3). Lastly, we also leave open the possibility of extending these protocols to be statistical ZKPPs against arbitrary cheating verifiers (rather than just honest verifiers).11

9 In an honest-verifier ZKPP, the simulator needs only to output an interaction that is indistinguishable from the interaction of the honest (original) verifier and the prover.

10 In the bounded degree graph model we assume that the degree of all vertices is bounded by a parameter d and the input graph is represented by an adjacency list. In other words, one can request to see the i-th neighbor (for i ∈ [d]) of some vertex v using a single query.

Limitations of Statistical ZKPP. Given these feasibility results, one may wonder whether it is possible to obtain statistical ZKPPs with poly-logarithmic complexity for large complexity classes (e.g., for any language in P), rather than just for specific problems as in Theorems 5.1.2 and 5.1.3. The answer turns out to be negative, since Kalai and Rothblum [KR15] constructed a language, computable in NC¹, for which every IPP (let alone a zero-knowledge one) requires Ω(√N) verification time.12

Still, the latter observation raises the question of whether statistical ZKPPs are as powerful as IPPs. That is, can every IPP be converted to be statistically zero-knowledge with small overhead? We show that this is not the case:

Theorem 5.1.4 (IPP ⊄ SZKPP, informally stated (see Section 5.4.1)). There exists a property Π that has an IPP in which the verifier runs in polylog(N) time, where N is the input length, but Π does not have a statistical ZKPP in which the verifier runs even in time N^{o(1)}.

We emphasize that Theorem 5.1.4 is unconditional (i.e., it does not rely on any unproven assumptions, as is typically needed when establishing lower bounds in the classical setting). Interestingly, if we do allow for a (reasonable) assumption, we can obtain a stronger separation, namely of MAP from SZKPP:

Theorem 5.1.5 (MAP ⊄ SZKPP, informally stated (see Section 5.4.2)). Assuming suitable circuit lower bounds, there exists a property Π that has an MAP in which the verifier runs in polylog(N) time, where N is the input length, but Π does not have a statistical ZKPP in which the verifier runs even in time N^{o(1)}.

The circuit lower bound that we assume follows from the plausible assumption that the Arthur-Merlin communication complexity of the set disjointness problem is Ω(n^ε), where n is the input length and ε > 0 is some constant.

11 Since honest-verifier SZK protocols can be converted to be zero-knowledge against arbitrary malicious verifiers ([GSV98], see also [Vad99]), it is reasonable to wonder whether the same holds for statistical ZKPPs. We conjecture that this is the case but leave the question of verifying this conjecture to future work.

12 This still leaves open the possibility that statistical ZKPPs with Õ(√N) complexity exist for large complexity classes. Actually, in the computational setting we show such results; see further discussion in Section 5.1.1.2.

5.1.1.2 The Computational Setting

Unsurprisingly, we can obtain much stronger results if we relax some of our requirements to be only computational (rather than statistical). Specifically, we will consider the following two relaxations:

1. (Computational Zero-Knowledge:) The simulated view is only required to be computationally indistinguishable from the real interaction.

2. (Computational Soundness, aka Argument-Systems:) Here, we only require soundness against efficient cheating provers.

The following results show that under either one of these relaxations, and assuming reasonable cryptographic assumptions, we can transform many of the known results from the IPP literature to be zero-knowledge. Focusing on computational zero-knowledge, we can derive such protocols for any language computable in bounded depth or in bounded space, where the verifier runs in roughly √N time.

Theorem 5.1.6 (Computational ZKPP for bounded depth, informally stated (see Section 5.5)). Assume that there exist one-way functions. Then, every language in logspace-uniform NC has a computational ZKPP, where the verifier (and the simulator) run in time N^{1/2+o(1)} and the number of rounds is polylog(N). The simulation overhead is roughly linear.

Theorem 5.1.7 (Computational ZKPP for bounded space, informally stated (see Section 5.5)). Assume that there exist one-way functions. Then, every language computable in poly(N)-time and O(N^σ)-space, for some sufficiently small constant σ > 0, has a computational ZKPP, where the verifier (and the simulator) run in time N^{1/2+o(1)}. The simulation overhead is roughly linear.

Note that in both results the simulation overhead is (roughly) linear, which means that a verifier running in time t will be simulated in nearly the same time. See the additional discussion on the notion of simulation overhead above.

Interestingly, if we only relax to computational soundness, we can do even better, both in terms of expressive power and the running time of the verifier. The following result gives statistical zero-knowledge arguments of proximity for every language in NP, with a verifier that runs in only poly-logarithmic time.

Theorem 5.1.8 (Statistical zero-knowledge arguments for NP, informally stated (see Section 5.5)). Assume that there exist collision-resistant hash functions. Then, every language in NP has a constant-round statistical zero-knowledge argument of proximity, where the verifier runs in time polylog(N). (Here, since the verifier runs in poly-logarithmic time, we can and do allow for polynomial simulation overhead.)

5.1.2 Related Works

In this section we discuss some related notions (and results) that have previously appeared in the literature, and how they compare with our results.

Zero-Knowledge PCPs. Zero-knowledge PCPs, introduced by Kilian, Petrank and Tardos [KPT97] (and further studied in [GIMS10, IWY16]), are similar to standard PCPs with an additional zero-knowledge requirement. Namely, the oracle access that the (potentially malicious) verifier has to the PCP should not reveal anything beyond the fact that the input is in the language. Note that the verifier in a zero-knowledge PCP is given full access to the input and oracle access to the proof. In contrast, in zero-knowledge proofs of proximity (studied in this chapter) the situation is reversed: the verifier is given oracle access to the input but full access to the communication line with the prover.

A more closely related notion of zero-knowledge PCPs of proximity was considered by Ishai and Weiss [IW14]. These are PCP systems in which the verifier gets oracle access to both the input and to an alleged PCP-style proof. Similarly to our notion of ZKPP, the verifier runs in sublinear time and is assured (with high probability) that the input is close to the language. The difference between our model and that of [IW14] is that we consider interactive proofs, whereas [IW14] focus on PCP-style proofs; namely, soundness is guaranteed only if the PCP proof string is written in advance. ZKPPs and zero-knowledge PCPPs are therefore incomparable: soundness is harder to achieve in the interactive case (since the prover's answers may be adaptive), whereas zero-knowledge is harder to obtain in the PCP setting.

Zero-Knowledge Communication Complexity. A model of zero-knowledge in communication complexity was recently proposed by Göös, Pitassi and Watson [GPW16] and further studied by Applebaum and Raykov [AR16a]. Since there are known connections between property testing and communication complexity [BBM12] (which hold also in the interactive setting, see [GR18]), it is interesting to study whether such a connection can be fruitful also in the zero-knowledge setting. We leave the study of this possibility to future work.

Zero-Knowledge Interactive PCPs and Oracle Proofs. Recent works by Ben-Sasson et al. [BCF+16, BCGV16] study zero-knowledge interactive oracle proofs, a model in which the verifier receives oracle access to the communication tape, but full access to the input.13 Our model of ZKPP is reversed: the verifier has oracle access to the input but full access to the communication tape. Chiesa et al. [CFS17] consider zero-knowledge in the context of interactive PCPs, a model introduced by Kalai and Raz [KR08].

Measures of Knowledge. The notion of "simulation overhead", similarly to that of "knowledge tightness" [Gol01] mentioned above, can be viewed as a (quantitative) security measure for the zero-knowledge of a protocol. Both notions are worst-case and consider the verifier's and simulator's running times. Micali and Pass [MP07] considered a similar measure, but in an execution-to-execution setting. Finally, Goldreich and Petrank [GP99] considered other security measures, incomparable to the verifier's and simulator's running times.

13 Interactive proofs in which the verifier is not charged for reading the entire communication tape are called either probabilistically checkable interactive proofs [RRR16] or interactive oracle proofs [BCS16] in the literature.

5.1.3 Technical Overview

We provide an overview of our main conceptual results. Overviews of our other results are given in the appropriate places in the other sections of this chapter.

5.1.3.1 ZKPP for PERMUTATION (see Theorem 5.1.2)

Since it is easier to argue, we begin by showing that any tester for the property PERMUTATION must make at least Ω(√N) queries, where N = 2^n. To see this, consider the following two distributions: (1) a random permutation over {0,1}^n; and (2) a random function from {0,1}^n to {0,1}^n. The first distribution is supported exclusively on YES instances, whereas it can be shown that the second is, with high probability, far from a permutation. However, if a tester makes q ≪ √N queries, then in both cases, with high probability, its view will be the same: q distinct random elements. The property testing lower bound follows.

We now turn to show a statistical ZKPP in which the verifier runs in poly(n) time. Consider the following simple IPP for PERMUTATION (based on the [BY96] protocol). Given oracle access to a function f: {0,1}^n → {0,1}^n, the verifier chooses a random r ∈ {0,1}^n and sends r to the prover. The prover computes z = f^{-1}(r) and sends it to the verifier. The verifier checks that indeed f(z) = r and if so accepts.

Clearly, if f is a permutation then the verifier in this protocol accepts with probability 1, whereas if f is far from a permutation, then with some non-negligible probability the verifier chooses an r which does not have a pre-image under f. In such a case the prover cannot make the verifier accept, and so the protocol is sound. It is also not hard to see that this protocol is honest-verifier zero-knowledge.14 However, it is not cheating-verifier zero-knowledge: a cheating verifier could learn the inverse of some arbitrary r of its choice.

In order to make the protocol zero-knowledge, intuitively, we would like to have a way for the prover and verifier to jointly sample the element r such that both are assured that it is uniform. For simplicity, let us focus on the task of just sampling a single bit σ. The specific properties that we need are

1. If f is a permutation, then the prover is assured that σ is random.

2. If f is far from being a permutation, then the verifier is assured that σ is random.

In fact, the transformation of general honest-verifier statistical zero-knowledge proofs to cheating-verifier ones (see [Vad99, Chapter 6]) implements a sub-routine achieving a generalization of the above task, assuming full access to the input. We give a simple solution for our specific case, that is, using only oracle access to a function that is either a permutation or far from any permutation.

We proceed to describe a simple procedure for sampling such a random bit σ. First, the verifier chooses at random x ∈ {0,1}^n and a pairwise-independent hash function h: {0,1}^n → {0,1}, and sends y = f(x) and h to the prover. The prover now chooses a random bit r ∈ {0,1} and sends r to the verifier. The verifier then sends x to the prover, who checks that indeed f(x) = y. The random bit that they agree on is σ = r ⊕ h(x).

From the prover's perspective, if f is a permutation then y fully determines x, and so r (which is chosen uniformly at random after y is specified) is independent of h(x). Hence, σ = r ⊕ h(x) is a uniformly random bit. On the other hand, from the verifier's perspective, if f is far from being a permutation then, intuitively, even conditioned on the value y there still remains some entropy in x (indeed, x is essentially uniform among all the pre-images of y).15 Now, using a variant of the leftover hash lemma, we can argue that h(x) is close to random. Actually, since the leftover hash lemma implies that pairwise-independent hash functions are strong extractors, we have that h(x) is close to random even conditioned on h, and therefore also conditioned on r (which is a randomized function of h). Thus, we obtain that σ = r ⊕ h(x) is close to being uniformly random, and so our procedure satisfies the desired properties.

14 As a matter of fact, this protocol can be viewed as a non-interactive statistical zero-knowledge protocol for PERMUTATION (and is used as such in [BY96]).
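The bit-sampling sub-protocol can be sketched concretely in Python. The sketch is ours and makes one concrete choice of pairwise-independent hash, h_{a,b}(x) = <a,x> ⊕ b over GF(2), with n-bit strings encoded as integers; f is again an ordinary Python function standing in for the oracle.

```python
import secrets

def inner_product_mod2(a, x):
    """<a, x> over GF(2), with n-bit strings encoded as ints."""
    return bin(a & x).count("1") % 2

def sample_bit(f, n):
    """One run of the bit-sampling sub-protocol described above,
    using the pairwise-independent hash h_{a,b}(x) = <a,x> XOR b."""
    # Verifier: pick x and h = (a, b), send y = f(x) and h.
    x = secrets.randbelow(2 ** n)
    a, b = secrets.randbelow(2 ** n), secrets.randbelow(2)
    y = f(x)
    # Prover: reply with a random bit r (seeing only y and h so far).
    r = secrets.randbelow(2)
    # Verifier: reveal x; prover checks consistency with y.
    assert f(x) == y
    # Agreed bit: sigma = r XOR h(x).
    return r ^ inner_product_mod2(a, x) ^ b
```

When f is a permutation, y pins down x, so r is independent of h(x) and σ is uniform from the prover's side; when f is far from a permutation, the leftover-hash-lemma argument above makes h(x), and hence σ, close to uniform from the verifier's side.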

A Different Perspective: Instance-Dependent Commitments. Instance-dependent commitments [BMO90, IOS97] are commitment schemes that depend on a specific instance of some underlying language: if the instance is in the language, the commitment is guaranteed to be statistically binding; and if the instance is not in the language, the commitment is guaranteed to be statistically hiding. Instance-dependent commitments are a central tool in the study of SZK (e.g., [OV08, NV06, MV03]). We can use PERMUTATION to construct an instance-dependent commitment as follows. Given a function f: {0,1}^n → {0,1}^n, a commitment to a bit b is a tuple (f(x), h, h(x) ⊕ b), for a random x ∈ {0,1}^n and a pairwise-independent hash function h: {0,1}^n → {0,1}. Our arguments can be adapted to show that if f is a permutation, then this commitment is statistically binding, whereas if f is far from a permutation, then this commitment is (weakly) statistically hiding (to amplify, we can repeat by choosing many x's). One way to view our protocol for sampling the random string r that was described above is as an instantiation of Blum's coin-flipping protocol [Blu81] based on the foregoing instance-dependent commitment.16
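The commitment itself is short enough to sketch. As before, the code and its names are ours, the hash is the concrete pairwise-independent family h_{a,c}(x) = <a,x> ⊕ c, and no amplification (repeating with many x's) is attempted.

```python
import secrets

def ip2(a, x):
    return bin(a & x).count("1") % 2  # <a, x> over GF(2)

def commit(f, n, bit):
    """Instance-dependent commitment to `bit` relative to f (sketch).
    Commitment: (f(x), h, h(x) XOR bit) with h_{a,c}(x) = <a,x> XOR c."""
    x = secrets.randbelow(2 ** n)
    a, c = secrets.randbelow(2 ** n), secrets.randbelow(2)
    com = (f(x), (a, c), ip2(a, x) ^ c ^ bit)
    return com, (x, bit)  # (commitment, opening)

def open_commitment(f, com, opening):
    """Verify an opening: x must match y under f, and the masked bit."""
    y, (a, c), masked = com
    x, bit = opening
    return f(x) == y and (ip2(a, x) ^ c ^ bit) == masked
```

If f is a permutation, y = f(x) determines x, which in turn fixes h(x) and hence the committed bit: binding. If f is far from a permutation, the residual entropy of x given y hides h(x), and hence the bit, statistically (weakly, before amplification).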

15 Actually, the amount of entropy can be fairly small (and depends on how far f is from being a permutation). To obtain a sufficient amount of entropy, in our actual protocol we generate many such y's.

16 Recall that in Blum's coin-flipping protocol, one party sends a commitment to a random bit b and the other party replies with another random bit b'. Now, the first party decommits and the parties agree on the bit b ⊕ b'.

5.1.3.2 Separating IPP from SZKPP (see Theorem 5.1.4)

The proof of Theorem 5.1.4 is done in two steps. The first step is to construct a property Π which has an interactive proof of proximity with a large number of rounds and a polylog(N)-time verifier, but such that in every 2-message interactive proof of proximity for Π, the verifier's running time must be N^δ, for some constant δ > 0. Actually, such a result was recently established by Gur and Rothblum [GR17].

The second step in proving Theorem 5.1.4 is a general round-reduction transformation for any honest-verifier statistical zero-knowledge proof of proximity. Namely, we would like a procedure that takes any many-message honest-verifier zero-knowledge proof of proximity and turns it into a 2-message honest-verifier zero-knowledge proof of proximity, while only slightly deteriorating the verifier's and simulator's running times.

To establish such a procedure we apply Goldreich and Vadhan's [GV99] proof that the ENTROPY DIFFERENCE PROBLEM (EDP) is complete for the class SZK. That proof takes an instance x of any promise problem Π = (Π_YES, Π_NO) ∈ SZK and efficiently constructs two distributions (encoded by circuits) C_0 and C_1 such that if x ∈ Π_YES then H(C_0) > H(C_1) + 1, and if x ∈ Π_NO then H(C_1) > H(C_0) + 1. That proof goes on to show a zero-knowledge protocol to distinguish between the case that H(C_0) > H(C_1) + 1 and the case that H(C_1) > H(C_0) + 1. Two important points regarding that proof: (1) sampling from C_0 and C_1 can be done by running (many times) the simulator for the original problem Π; (2) the protocol for EDP consists of only two messages and requires only oracle access to C_0 and C_1.

In our setting, we can view a property Π as a promise problem where functions possessing the property are in Π_YES and functions that are ε-far from possessing the property are in Π_NO. Then, we can have the verifier "run" the reduction to EDP and apply the oracle-access protocol for EDP. The unbounded prover behaves as in the protocol for EDP. Recall that the original simulator (i.e., the one for the property's IPP) required only oracle access to the input function. Since sampling from the distributions only requires running the original simulator, the new verifier can implement this step with only oracle access to the input function and with only polynomial overhead over the running time of the original simulator.

5.1.3.3 The Computational Setting (see Theorems 5.1.6-5.1.8)

The proofs of Theorem 5.1.6, Theorem 5.1.7 and Theorem 5.1.8 rely on the same basic idea: compiling existing public-coin protocols from the literature (specifically those of [RVW13, RRR16, Kil92]) that are not zero-knowledge into ones that are. This step is based on the idea, which originates in the work of Ben-Or et al. [BGG+88], of having the prover commit to its messages rather than sending them in the clear. This ability to commit is where we use the assumption that one-way functions exist.

The compiler, which can only be applied to public-coin protocols, is as follows. At every round, rather than sending its next message in the clear, the prover merely commits to the message that it would have sent in the protocol. Since the protocol is public-coin, the verifier can continue the interaction even though it does not see the actual contents of the prover's messages. After all commitments have been sent, the verifier only needs to check that there exist suitable decommitments that would have made the underlying IPP verifier accept. Since the commitment hides the contents of the messages, it cannot do so by itself, and we would like to use the prover.

At this point, one could try to naively argue that the residual statement is an NP statement, and so we can invoke a general-purpose zero-knowledge protocol for NP (e.g., the classical [GMW91] protocol or the more efficient [IKOS09] protocol). Herein arises the main difficulty with this approach. While the statement that the verifier needs to check at the end of the interaction does consist of an existential quantifier applied to a polynomial-time computable predicate, the latter predicate requires oracle access to the input x, and so we do not know how to express it as an NP statement. To resolve this difficulty, we restrict our attention to verifiers that make prover-oblivious queries; that is, the queries that the verifier makes do not depend on messages sent by the prover. Luckily, in the IPPs that we rely on, the verifier's queries are indeed prover-oblivious. Thus, our verifier can actually make its queries after seeing only the commitments, and we can construct an NP statement that refers to the actual values that it reads from the input. At this point we can indeed invoke a general-purpose zero-knowledge protocol for NP and conclude the proof.

Lastly, we remark that the specific flavor of soundness and zero-knowledge that we obtain depends on the commitment scheme we use and on the soundness of the protocol to which we apply the transformation.
Loosely speaking, instantiating the above approach with a computationally hiding and statistically binding commitment scheme yields a computational zero-knowledge proof of proximity, whereas a statistically hiding and computationally binding one yields a statistical zero-knowledge argument of proximity.
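The skeleton of the compiler's interaction phase can be sketched as follows. All names are ours; a hash-based commitment stands in for the commitments built from one-way functions (its hiding/binding is only heuristic here), and the final zero-knowledge proof for NP is elided.

```python
import hashlib
import secrets

def commit(msg: bytes):
    """Stand-in commitment (hash-based; hiding/binding only heuristic)."""
    rand = secrets.token_bytes(16)
    return hashlib.sha256(rand + msg).digest(), (rand, msg)

def valid_opening(com: bytes, opening) -> bool:
    rand, msg = opening
    return hashlib.sha256(rand + msg).digest() == com

def compiled_interaction(prover_strategy, num_rounds, coin_bytes=8):
    """Skeleton of the compiler: in each round the public-coin verifier
    sends fresh coins, and the prover sends a commitment to the message
    it would have sent in the underlying (non-zero-knowledge) IPP."""
    transcript, openings = [], []
    for _ in range(num_rounds):
        coins = secrets.token_bytes(coin_bytes)   # verifier's public coins
        msg = prover_strategy(transcript, coins)  # underlying IPP message
        com, opening = commit(msg)
        transcript.append((coins, com))
        openings.append(opening)
    # In the real protocol the prover now proves, in zero-knowledge, the NP
    # statement "there exist openings making the IPP verifier accept";
    # here we only return the material that statement refers to.
    return transcript, openings
```

Note how the public-coin structure is what makes this work: the verifier's messages never depend on the (hidden) committed contents.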

5.1.4 Organization of this Chapter

The model of zero-knowledge proofs of proximity (ZKPP) is defined in Section 5.2. Our statistical ZKPP protocols for permutations, expansion and bipartiteness are presented and analyzed in Section 5.3, while our lower bounds for statistical ZKPP are in Section 5.4. In Section 5.5 we present our results on computational ZK proofs of proximity and statistical ZK arguments of proximity. Finally, in Section 5.6 we prove statements whose proofs were deferred throughout this chapter.

5.2 ZKPP - Model and Definitions

A ZKPP is an interactive proof for convincing a sub-linear time verifier that a given input is close to the language, in zero-knowledge. Loosely speaking, by zero-knowledge we mean that if the (N-bit) input is in the language, the view of any (potentially malicious) verifier that runs in time t < N can be simulated by reading not much more than t bits from the input.

The only non-trivial step in formalizing this intuition is in quantifying what we mean by "not much more". In the classical setting of zero-knowledge interactive proofs, we merely require that the simulator run in polynomial time, and so "not much more" is interpreted as polynomial overhead. A natural adaptation to the sub-linear setting would therefore be to require that the running time of the simulator be polynomially related to that of the verifier. However, in some settings this requirement is problematic; e.g., suppose that the verifier runs in time t = O(√N). Here, a simulator that runs in time t² (and in particular can read the entire input) would be far less meaningful than one running in, say, t^{3/2} time. Thus, as pointed out in the introduction, it will be important for us to quantify more precisely the overhead incurred by the simulator. We refer to this as the simulation overhead, which we think of as a function of the verifier's running time (see the precise statement below). Thus, rather than merely saying that a protocol is a ZKPP, we will say that it is a ZKPP with simulation overhead s.

We proceed to the formal definitions. A property is an ensemble Π = (Π_n, D_n, R_n)_{n∈N}, where Π_n is a set of functions from D_n to R_n, for every n ∈ N. In certain contexts, it will be more convenient for us to view Π as a set of strings of length |D_n| over the alphabet R_n (in the natural way). In such cases we will also sometimes refer to properties as languages. We denote by N the bit-length of the input, i.e., N = |D_n| · log₂(|R_n|). In the technical sections we will often measure efficiency in terms of the parameter N, but in our actual definition below we will allow a direct dependence on n, |D_n| and |R_n|. This makes the definitions slightly more cumbersome, but allows us to capture certain auxiliary parameters that arise in specific models, e.g., the dependence on the degree of the graph in the bounded degree graph model (for details see Remark 5.2.7). Lastly, similarly to [Vad99], we use a security parameter k to control the quality of our soundness and zero-knowledge guarantees, rather than letting these depend on the input length (although our reasons for doing so are slightly different from those in [Vad99]; see Remark 5.2.6 for additional details).

Section Organization. We begin by recalling the definition of IPPs in Section 5.2.1, then proceed to define statistical ZKPP in Section 5.2.2, and finally we discuss computational ZKPP in Section 5.2.3.

5.2.1 Interactive Proofs of Proximity (IPPs)

Our definition of IPP follows [RVW13] with minor adaptations.

Definition 5.2.1 (Interactive proofs of proximity (IPP)). An r-message interactive proof of proximity (IPP) with respect to proximity parameter ε > 0 (in short, ε-IPP) for the property Π = (Π_n, D_n, R_n)_{n∈N} is an interactive protocol (P, V) between a prover P, which gets free access to an input f: D_n → R_n as well as to ε, n, |D_n|, |R_n| and k, and a verifier V, which gets oracle access to f as well as free access to ε, n, |D_n|, |R_n| and k. The following conditions are satisfied at the end of the protocol for every k ∈ N and large enough n ∈ N:

* Completeness: If f ∈ Π_n, then, when V interacts with P, with probability 1 - negl(k) it accepts.

* Soundness: If f is ε-far from Π_n, then for every prover strategy P*, when V interacts with P*, with probability 1 - negl(k) it rejects.

For t = t(n, |D_n|, |R_n|, k, ε), we denote by IPP[t] the class of properties possessing an ε-IPP in which the verifier's running time is at most O(t). Finally, for a class of functions C, we denote by IPP[C(n, |D_n|, |R_n|, k, ε)] the class of properties Π for which there exists t ∈ C such that Π ∈ IPP[t].

The probabilities that the verifier rejects in the completeness condition, and accepts in the soundness condition, are called the completeness error and soundness error, respectively. If the completeness error is zero, then we say that the IPP has perfect completeness. A public-coin IPP is an IPP in which every message from the verifier to the prover consists only of fresh random coin tosses, and the verifier does not toss any coins beyond those sent in its messages. An IPP is said to have query complexity q = q(n, |D_n|, |R_n|, k, ε) ∈ N if for every n, k ∈ N, ε > 0, f: D_n → R_n, and any prover strategy P*, the verifier V makes at most q(n, |D_n|, |R_n|, k, ε) queries to f when interacting with P*. The IPP is said to have communication complexity c = c(n, |D_n|, |R_n|, k, ε) ∈ N if for every n, k ∈ N, ε > 0, and f: D_n → R_n, the communication between V and P consists of at most c(n, |D_n|, |R_n|, k, ε) bits.

Our main (but not exclusive) focus in this chapter is on properties that have IPPs in which the verifier's running time (and thus also the communication and query complexities) is poly-logarithmic in the input size and polynomial in the security parameter k and in the reciprocal of the proximity parameter ε; that is, the class IPP[poly(log(N), k, 1/ε)]. An IPP that consists of a single message sent from the prover (Merlin) to the verifier (Arthur) is called a Merlin-Arthur proof of proximity (MAP) [GR18]. We extend all the above notations to MAPs in the natural way.

5.2.2 Statistical ZKPPs

Before defining general ZKPPs, we first consider zero-knowledge with respect to honest verifiers. Following [Vad99], we require the simulator to run in strict polynomial time but allow it to indicate failure with probability at most 1/2 (which can then be reduced by repetition). The requirement is that, conditioned on not failing, the simulated view is statistically close to the actual execution (see Section 2.3 for the definitions of classical statistical zero-knowledge proofs). Recall that we say that an algorithm A is useful if Pr[A(x) = ⊥] ≤ 1/2 for every input x, and use Ã(x) to denote the output distribution of A(x), conditioned on A(x) ≠ ⊥. We define the view of the verifier V on a common input x (given as standard input or by oracle access to either of the parties) by view_{P,V}(x) = (m_1, m_2, ..., m_r; ρ), where m_1, m_2, ..., m_r are the messages sent by the parties in a random execution of the protocol, and ρ consists of all the random coins V used during this execution.

Definition 5.2.2 (Honest-verifier zero-knowledge proofs of proximity (HVSZKPP, HVPZKPP)). Let (P, V) be an IPP for a property Π = (Π_n, D_n, R_n)_{n∈N}. The protocol (P, V) is said to be honest-verifier statistical zero-knowledge with simulation overhead s, for some function s: N^5 × (0, 1] → N, if there exists a useful probabilistic algorithm

S, which (like V) gets oracle access to f: D_n → R_n as well as free access to ε, n, |D_n|, |R_n| and k, and whose running time is at most O(s(t_V, n, |D_n|, |R_n|, k, ε)), where t_V(n, |D_n|, |R_n|, k, ε) is V's running time, such that for every k ∈ N, every large enough n ∈ N and f: D_n → R_n, if f ∈ Π_n, it holds that:

SD(S̃^f(ε, n, |D_n|, |R_n|, k), view_{P,V}(ε, n, |D_n|, |R_n|, k, f)) ≤ negl(k).

If the negl(k) can be replaced with 0 in the above equation, (P, V) is said to be honest-verifier perfect zero-knowledge with simulation overhead s. For t = t(n, |D_n|, |R_n|, k, ε), HVSZKPP[t, s] (respectively, HVPZKPP[t, s]) denotes the class of properties possessing an honest-verifier statistical (respectively, perfect) zero-knowledge proof of proximity with simulation overhead s in which the verifier's running time is at most O(t).

We say that the query complexity of a simulator S is q' = q'(n, |D_n|, |R_n|, k, ε) ∈ N if for every n, k ∈ N, ε > 0 and f: D_n → R_n, S^f makes at most q'(n, |D_n|, |R_n|, k, ε) queries to f. A typical setting (that we will focus on) is when the verifier's running time is poly(log(N), k, 1/ε), namely poly-logarithmic in the input length N and polynomial in the security parameter k and in the reciprocal of the proximity parameter ε. In this setting we often allow for polynomial simulation overhead, that is, the simulator's running time is also poly(log(N), k, 1/ε). Specifically, we denote by HVSZKPP[poly(log(N), k, 1/ε)] the class of properties Π ∈ HVSZKPP[t, s] for t = poly(log(N), k, 1/ε) and s = poly(t, log(N), k, 1/ε). The class HVPZKPP[poly(log(N), k, 1/ε)] is similarly defined. Another setting of interest is when the verifier's running time is N^δ · poly(k, 1/ε), for some constant δ ∈ (0, 1). In this setting, unlike the previous one, allowing the simulation overhead to be polynomial would give the simulator much greater computational power than the verifier (e.g., if δ = 1/2 and s is quadratic in the verifier's running time, then the simulator can run in time O(N) and in particular may read the entire input). In this setting we aim for the simulation overhead to be linear in the verifier's running time (but it can be polynomial in k and 1/ε).^17 When the simulation overhead is clear from context, we allow ourselves to say that the protocol is a ZKPP (rather than a ZKPP with a specific simulation overhead as per Definition 5.2.2).

Cheating-Verifier ZKPP. We allow cheating verifiers to be non-uniform by giving them an auxiliary input. For an algorithm A and a string z ∈ {0,1}* (all auxiliary inputs will be binary strings, regardless of the properties' alphabet), let A[z] be A when z is given as auxiliary input. Since we care about algorithms whose

^17 This requirement is in the spirit of constant knowledge tightness; see [Gol01, Section 4.4.4.2].

running time is insufficient to read the entire input, we would not want to allow the running time to depend on the auxiliary input (otherwise, we could artificially inflate z so that A would be able to read the entire input). Thus, following [Vad99], we adopt the convention that the running time of A is independent of z, so if z is too long, A will not be able to access it in its entirety.

Definition 5.2.3 (Cheating-verifier zero-knowledge proofs of proximity (SZKPP, PZKPP)). Let (P, V) be an IPP for a property Π = (Π_n, D_n, R_n)_{n∈N}. The protocol (P, V) is said to be cheating-verifier statistical zero-knowledge with simulation overhead s, for some function s: N^5 × (0, 1] → N, if for every algorithm Ṽ whose running time is O(t_Ṽ(n, |D_n|, |R_n|, k, ε)), there exists a useful probabilistic algorithm S, which (like

Ṽ) gets oracle access to f: D_n → R_n as well as free access to ε, n, |D_n|, |R_n| and k, and whose running time is at most O(s(t_Ṽ, n, |D_n|, |R_n|, k, ε)), such that for every k ∈ N, large enough n ∈ N, z ∈ {0,1}* and f: D_n → R_n, if f ∈ Π_n, then

SD(S̃^f_[z](ε, n, |D_n|, |R_n|, k), view_{P,Ṽ[z]}(ε, n, |D_n|, |R_n|, k, f)) ≤ negl(k).

If the negl(k) can be replaced with 0 in the above equation, (P, V) is said to be cheating-verifier perfect zero-knowledge with simulation overhead s. For t = t(n, |D_n|, |R_n|, k, ε), SZKPP[t, s] (respectively, PZKPP[t, s]) denotes the class of properties possessing a cheating-verifier statistical (respectively, perfect) zero-knowledge proof of proximity with simulation overhead s in which the verifier's running time is at most O(t).

Expected Simulation Overhead. Definition 5.2.3 requires that the running time of the simulator always be bounded. As with many results in the ZK literature, in some cases we can only bound the simulator's expected running time. The following definition captures this (weaker) notion:

Definition 5.2.4 (Expected-simulation cheating-verifier ZKPP (ESZKPP, EPZKPP)). Let (P, V) be an IPP for a property Π = (Π_n, D_n, R_n)_{n∈N}. The protocol (P, V) is said to be cheating-verifier statistical zero-knowledge with expected simulation overhead s if it satisfies the requirements of Definition 5.2.3, except that s only bounds the expected running time of the simulator (where the expectation is over the coins of the simulator). The classes ESZKPP[t, s] and EPZKPP[t, s] are defined analogously to SZKPP[t, s] and PZKPP[t, s] from Definition 5.2.3.

Unless explicitly stated otherwise, all zero-knowledge protocols we discuss are cheating-verifier ones. As in the honest-verifier case, a typical setting is that in which the verifier's running time is poly-logarithmic in the input size N and polynomial in the security parameter k and in 1/ε, and the simulator's (possibly only expected and not strict) running time is polynomial in the running time of the cheating verifier that it simulates, poly-logarithmic in N and polynomial in k and 1/ε. Specifically, if

we allow the cheating verifier the same computational powers as the honest verifier, then both the honest verifier and every simulator run in time poly(log(N), k, 1/ε). We let ESZKPP[poly(log(N), k, 1/ε)] be the class of properties Π ∈ ESZKPP[t, s] for t = poly(log(N), k, 1/ε) and s = poly(t_Ṽ, log(N), k, 1/ε). The class EPZKPP[poly(log(N), k, 1/ε)] is defined similarly.

5.2.2.1 Additional Discussions

We conclude Section 5.2.2 with a few remarks on statistical ZKPP.

Remark 5.2.5 (Proximity promise problems). Some of the protocols that we construct do not refer to a property but rather to a "proximity promise problem". Recall that a promise problem consists of a pair of disjoint sets YES and NO, and the goal is to distinguish inputs that are in YES from those that are in NO (no requirement is made for inputs outside of YES ∪ NO). For some of our results we will consider proximity promise problems, which are characterized by a set YES and a family of sets (NO^(ε))_{ε∈(0,1)}, and we require that for every ε ∈ (0, 1), the set NO^(ε) is ε-far from YES (rather than merely being disjoint from it). We extend the definitions above to handle proximity promise problems in the natural way (specifically, completeness and zero-knowledge should only hold for inputs in YES, whereas the soundness requirement is that if the verifier is given proximity parameter ε > 0 and an input in NO^(ε), then it should reject with high probability).

Remark 5.2.6 (The security parameter). One of the original motivations for the introduction of a security parameter in the classical definitions of statistical zero-knowledge proofs was to control the error parameters (completeness, soundness and simulation deviation) independently of the input's length. Specifically, one may want to provide a high-quality proof (i.e., very small errors) for short inputs (see [Vad99, Section 2.4]). In our setting, the situation is somewhat reversed. We think of very large inputs that the verifier and simulator cannot even entirely read. Hence, it seems unreasonable to require errors that are negligible in the input length. Instead, we control the quality of the proof with the security parameter, independently of the input length.

Remark 5.2.7 (A definitional convention).
Traditionally [GGR98], a property tester gets oracle access to a Boolean function f: {0,1}^n → {0,1} and needs to determine whether the function has the property or is ε-far from having it (i.e., ε-far from any function that has the property). The tester gets n (or, alternatively, 2^n, the input length of the truth table of f) as a standard input, and its complexity (e.g., running time, number of oracle queries) is measured as a function of n. As models and properties evolved (e.g., the bounded-degree model [GR02]), Boolean functions no longer sufficed to (conveniently) describe properties. For example, in the bounded-degree model, graphs with n vertices and degree d are specified as functions G: [n] × [d] → [n] ∪ {⊥} such that G(u, i) = v if v is the i-th neighbor of a vertex u, and G(u, i) = ⊥ if u has fewer than i neighbors. Consequently, the parameter n alone no longer suffices to measure the complexity of the tester.

The situation becomes even more delicate when interaction is added. The model of interactive proofs of proximity (IPP), studied by [RVW13] following [EKR04], considers an interaction between a prover and a verifier in which the prover is trying to convince the verifier that a function has a property. In the definition of [RVW13], in addition to the function f, to which the verifier has only oracle access and which is referred to as the implicit input, the verifier also has full access to an additional (shorter) input w, called the explicit input. For example, in the bounded-degree model w might simply be d, and in "algebraic" properties w can contain a description of some underlying field. Roughly speaking, [RVW13] chose to measure the complexity of the proof-system with respect to the length of the implicit input alone. This creates a slight inconvenience when trying to describe complexity measures. For example, in the bounded-degree model, we would like the running time of the verifier to explicitly depend on the number of vertices and the degree d. However, as it is defined in [RVW13], the function that bounds the verifier's running time gets only the input length (which has bit length n · d · log(n)). To avoid this minor issue, in this paper we take a slightly different approach than that of [RVW13] when defining IPPs. Our goal is to define a general model in which it is easy to compare properties from different domains (e.g., properties of bounded-degree graphs and algebraic ones). To do so, we no longer split the input into an implicit and an explicit part. We consider functions f: D → R from an arbitrary domain D to an arbitrary range R. The verifier receives oracle access to f, and full access to |D| and |R|. The prover receives full access to the function f. Different complexity measures are now functions of the verifier's standard inputs, |D| and |R|.
Going back to the bounded-degree graph example, in this framework the function describing the verifier's running time gets |D| = n · d and |R| = n as inputs, and can easily be "converted" into a function that simply gets n and d as inputs. Moreover, we can now define N to be the input length of the property (i.e., N = |D| · log(|R|)) and define complexity classes with respect to this input length. For example, we can define IPP[poly(log(N))] to be the class containing all properties with an interactive proof of proximity in which the verifier's running time (as a function of |D| and |R|) is bounded by poly(log(N)). Note that the above class does not depend on the domain and range of the property, and properties of different "types" can still belong to IPP[poly(log(N))].

5.2.3 Computational ZKPP

Since our focus is on the statistical case, we do not provide explicit definitions of computational zero-knowledge proofs of proximity. Rather, these definitions can easily be extrapolated from the statistical ones in a standard way (see, for example, Vadhan's definition of computational zero-knowledge [Vad99, Section 2]). Specifically, in the computational definitions one simply requires that the simulator's output and the protocol's view be computationally indistinguishable (rather than statistically close), with respect to the security parameter.

5.3 The Power of ZKPP: The Statistical Case

5.3.1 ZKPP for Permutations

In this section, we look at functions f: {0,1}^n → {0,1}^n and consider the property of being a permutation. That is, we would like to distinguish functions that are permutations from those that are far from being permutations.

Definition 5.3.1 (The permutation property). For every n ∈ N, let

PERMUTATION_n = {f: {0,1}^n → {0,1}^n : f is a permutation}.

We define the permutation property as

PERMUTATION = (PERMUTATION_n, {0,1}^n, {0,1}^n)_{n∈N}.

As we argued in Section 5.1.3.1, any property tester for PERMUTATION must make at least Ω(√N) queries, where N = 2^n. As a matter of fact, the situation cannot be significantly improved even if we allow non-interactive (i.e., one-way) communication with a prover. Specifically, Gur and Rothblum [GR15] have shown that the verifier in every MAP (i.e., non-interactive proof of proximity; see [GR18]) for PERMUTATION must make either Ω(N^{1/4}) queries or use a proof of length Ω(N^{1/4}). In particular, this means that any MAP verifier must run in time Ω(N^{1/4}). In this section, we show that the PERMUTATION property has a 4-message (statistical) zero-knowledge proof of proximity with respect to cheating verifiers. We note that we only bound the expected number of queries and running time of the simulator of our protocol. As mentioned in the introduction, using standard techniques (that introduce O(n) rounds of interaction), we can construct a ZKPP for PERMUTATION with strict bounds on the query complexity and running time of the simulator (rather than just bounding their expectations). Before stating the theorem, a word on notation. In the definitions in Section 5.2 we gave the prover and the verifier, as explicit inputs, the domain and range sizes; both, in the case of the permutation property, are 2^n. In this section, for convenience, instead of giving 2^n as an explicit input, we will simply give n. Relevant complexity measures (e.g., running time, query and communication complexity) will similarly be functions of n.
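For intuition behind the Ω(√N) tester bound, the natural query-based strategy is birthday-paradox collision finding: by Fact 5.3.6 below, a function that is ε-far from a permutation has a small image and hence many collisions, which roughly √N uniform samples expose with constant probability. The sketch below is not from the thesis; it is a minimal illustration of that collision strategy, with illustrative names and parameters.

```python
import random

def collision_tester(f, n, num_samples, rng):
    """Toy collision-based tester (illustrative only, not the thesis's protocol):
    sample uniform inputs and look for x != x' with f(x) = f(x').  A function
    that is eps-far from a permutation has |Im(f)| <= (1 - eps) * 2^n, so on the
    order of sqrt(2^n) samples suffice to hit a collision with good probability.
    """
    seen = {}  # maps observed output y to one input that produced it
    for _ in range(num_samples):
        x = rng.randrange(2 ** n)
        y = f(x)
        if y in seen and seen[y] != x:
            return "reject"  # two distinct inputs collide: not a permutation
        seen[y] = x
    return "accept"
```

A permutation never produces a collision, so the tester always accepts it; for a far-from-permutation input the guarantee is only probabilistic, which is exactly the gap the zero-knowledge proofs of proximity below improve upon.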

Theorem 5.3.2 (EPZKPP for Permutation). It holds that

PERMUTATION ∈ EPZKPP[poly(log(N), k, 1/ε)].

Specifically, PERMUTATION has a cheating-verifier perfect zero-knowledge proof of proximity (P_perm, V_perm) with expected simulator S_perm with respect to proximity parameter ε > 0 such that the following properties hold for every n ∈ N, every input f: {0,1}^n → {0,1}^n and security parameter k ∈ N:

1. The parties exchange four messages and the total communication is O(n²k/ε²) bits.

2. V_perm's running time is poly(n, k, 1/ε) and V_perm's query complexity is O(nk/ε²).

3. If f ∈ PERMUTATION_n, then for every auxiliary input z, S_perm[z]'s expected running time and query complexity, given access to a (possibly cheating) verifier Ṽ, are O(t_Ṽ(ε, n, k, z)) + poly(n, k, 1/ε) and O(q_Ṽ(ε, n, k, z) + nk/ε²), respectively, where t_Ṽ(ε, n, k, z) and q_Ṽ(ε, n, k, z) are the running time and query complexity of Ṽ^f[z](ε, n, k).

(Note that the input of PERMUTATION_n has size n · 2^n, so a polynomial dependence on n translates into a poly-logarithmic dependence on the input size.) Combined with the aforementioned MAP lower bound for PERMUTATION, we obtain that the complexity of ZKPPs (with expected simulation bounds) can be sub-exponentially smaller than that of MAPs (and therefore also of property testers).

Remark 5.3.3. We mention that in Item 3 of the theorem statement, when the simulator simulates the view of an interaction with the honest verifier V_perm, its strict (rather than expected) query complexity is exactly equal to the query complexity of the verifier (i.e., in such a case no overhead is incurred).

The rest of this section is dedicated to proving Theorem 5.3.2.

5.3.1.1 Proof of Theorem 5.3.2

The protocol (P_perm, V_perm) is given in Fig. 5.3.1. It closely follows the intuitive discussion given in Section 5.1.3.1. It is easy to verify that (P_perm, V_perm) has the desired round complexity, query complexity and verifier's running time, where we use the fact that O(n²k/ε²) bits suffice for describing the pairwise-independent hash function in the protocol (see Fact 2.2.12). To see that completeness holds, observe that if f: {0,1}^n → {0,1}^n is a permutation and the two parties follow the protocol, then indeed f(x_i) = y_i for every i ∈ [t·s] and f(z_i) = f(f^{-1}(r_i ⊕ h(x)_i)) = r_i ⊕ h(x)_i for every i ∈ [s], and therefore the parties complete the interaction and the verifier accepts. It remains to show that the soundness and zero-knowledge conditions hold. Soundness follows from the following lemma, which is proved below.

Lemma 5.3.4. Let n, k ∈ N, let ε > 0 and suppose that f: {0,1}^n → {0,1}^n is ε-far from PERMUTATION_n. Then, for every prover strategy P*, when V^f_perm(ε, n, k) interacts with P* it rejects with probability 1 - negl(k).

Finally, to show that this protocol is perfect zero-knowledge, consider the simulator S_perm given in Fig. 5.3.2. The following lemma, which we prove below, shows that this simulator perfectly samples from the view of any (possibly cheating) verifier.

The Permutation Protocol (P_perm, V_perm)

P_perm's Input: A function f: {0,1}^n → {0,1}^n, proximity parameter ε > 0 and security parameter k. V_perm's Input: ε, n, k and oracle access to f.

1. Both parties set t = ⌈(n + 1)/ε⌉ and s = ⌈k/ε⌉.

2. V_perm samples x = (x_1, x_2, ..., x_{t·s}) ~ ({0,1}^n)^{t·s} and h ~ F_{n·t·s, n·s}.^a V_perm computes y_i = f(x_i) for every i ∈ [t·s] (by querying f), and sends y = (y_1, y_2, ..., y_{t·s}) and h to P_perm.

3. P_perm samples r = (r_1, r_2, ..., r_s) ~ ({0,1}^n)^s and sends them to V_perm.

4. V_perm sends x to P_perm.

5. P_perm checks that f(x_i) = y_i for every i ∈ [t·s]. If any check fails, then P_perm sends ⊥ and aborts.

6. P_perm sends z = (z_1, z_2, ..., z_s) to V_perm, where z_i = f^{-1}(r_i ⊕ h(x)_i) for every i ∈ [s].^b

7. V_perm accepts if f(z_i) = r_i ⊕ h(x)_i for every i ∈ [s], and otherwise it rejects.

^a Recall that F_{n,m} = {h: {0,1}^n → {0,1}^m} is a family of pairwise-independent hash functions; see Fact 2.2.12.
^b Recall that ⊕ stands for bitwise exclusive-or. Also, we view h(x) ∈ {0,1}^{n·s} as h(x) = (h(x)_1, h(x)_2, ..., h(x)_s) such that h(x)_i ∈ {0,1}^n.

Figure 5.3.1: The Permutation Protocol
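The message flow of Fig. 5.3.1 can be exercised end to end on a small permutation. The sketch below is illustrative only: n-bit strings are encoded as integers, and the pairwise-independent family F_{n·t·s, n·s} of Fact 2.2.12 is replaced by a simplistic random blockwise-linear map with a random offset (an assumption made purely so the example is self-contained). By completeness, an honest run over any true permutation always accepts.

```python
import math
import random

def run_permutation_protocol(f, f_inv, n, eps, k, rng):
    """One honest execution of the protocol in Fig. 5.3.1 (a sketch; the hash
    family below is a stand-in, not the actual family of Fact 2.2.12)."""
    # Step 1: both parties set the parameters.
    t, s = math.ceil((n + 1) / eps), math.ceil(k / eps)
    N = 2 ** n

    # Stand-in for h ~ F_{n*t*s, n*s}: for each output block i, XOR a random
    # subset of the input blocks and a random offset (illustrative assumption).
    coeffs = [[rng.randrange(2) for _ in range(t * s)] for _ in range(s)]
    offset = [rng.randrange(N) for _ in range(s)]

    def h(x):
        out = []
        for i in range(s):
            acc = offset[i]
            for j, xj in enumerate(x):
                if coeffs[i][j]:
                    acc ^= xj
            out.append(acc)
        return out

    # Step 2: verifier samples x, queries f, and sends (y, h).
    x = [rng.randrange(N) for _ in range(t * s)]
    y = [f(xi) for xi in x]
    # Step 3: prover sends uniformly random r.
    r = [rng.randrange(N) for _ in range(s)]
    # Steps 4-5: verifier reveals x; prover checks consistency with y.
    if any(f(xi) != yi for xi, yi in zip(x, y)):
        return "abort"  # prover sends the ⊥ symbol and aborts
    # Steps 6-7: prover inverts f; verifier checks f(z_i) = r_i XOR h(x)_i.
    hx = h(x)
    z = [f_inv(r[i] ^ hx[i]) for i in range(s)]
    return "accept" if all(f(z[i]) == r[i] ^ hx[i] for i in range(s)) else "reject"
```

Note that only the prover needs to invert f; the verifier's work consists of t·s forward queries plus s verification queries, matching the O(nk/ε²) query bound of Theorem 5.3.2.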

The Simulator S_perm for the Permutation Protocol (P_perm, V_perm)

Simulator's Input: ε, n, k, auxiliary input z, oracle access to f: {0,1}^n → {0,1}^n and access to a (possibly cheating) verifier Ṽ.

1. Run Ṽ^f[z](ε, n, k) using random coins ρ to get y = (y_1, y_2, ..., y_{t·s}) and h.

2. Sample r = (r_1, r_2, ..., r_s) ~ ({0,1}^n)^s and give them to Ṽ[z] as the answers from P_perm.

3. Continue to run Ṽ^f[z](ε, n, k) to get x = (x_1, x_2, ..., x_{t·s}), the values Ṽ^f[z](ε, n, k) sends to P_perm in the third message of the protocol.

4. If there exists i ∈ [t·s] such that f(x_i) ≠ y_i, output (y, h, r, x, ⊥, ρ).

5. Otherwise, repeat the following:

(a) Sample r' = (r'_1, r'_2, ..., r'_s) ~ ({0,1}^n)^s and for every i ∈ [s], set r''_i = f(r'_i) ⊕ h(x)_i.

(b) Rewind Ṽ^f[z](ε, n, k) to the point where it is waiting for the second message of the protocol, using ρ again as the random coins (i.e., Step 3 of the protocol). Give r'' = (r''_1, r''_2, ..., r''_s) as the answers from P_perm.

(c) Continue to run Ṽ^f[z](ε, n, k) to get x' = (x'_1, x'_2, ..., x'_{t·s}), the values Ṽ^f[z](ε, n, k) sends to P_perm in the third message of the protocol.

(d) If f(x'_i) = y_i for every i ∈ [t·s], output (y, h, r'', x', r', ρ) and halt. Otherwise, go back to Step 5a.

Figure 5.3.2: The Simulator for The Permutation Protocol

Lemma 5.3.5. Let n, k ∈ N, let z ∈ {0,1}*, let f ∈ PERMUTATION_n, let Ṽ be some verifier strategy and let S^f(ε, n, k, Ṽ) be the output of S_perm when running on input ε, n, k, auxiliary input z, with oracle access to f and access to Ṽ. Then:

SD(S^f(ε, n, k, Ṽ), view_{P_perm, Ṽ[z]}(ε, n, k, f)) = 0.

Moreover, the expected running time and query complexity of S^f(ε, n, k, Ṽ) are as in Item 3 of the theorem statement.

This concludes the proof of Theorem 5.3.2 (modulo Lemma 5.3.4 and Lemma 5.3.5, which are proved next).

5.3.1.2 Analyzing Soundness - Proof of Lemma 5.3.4

Before proving Lemma 5.3.4, we show a basic but useful property of functions that are ε-far from permutations: their image cannot be too large.

Fact 5.3.6. If f is ε-far from PERMUTATION_n, then |Im(f)| ≤ (1 - ε) · 2^n.

Proof. We prove the contrapositive. Let f: {0,1}^n → {0,1}^n and suppose that |Im(f)| > (1 - ε) · 2^n. We show that f is ε-close to a permutation f': {0,1}^n → {0,1}^n. Start by setting f'(x) = f(x) for every x. Repeat the following process until Im(f') = {0,1}^n: take y ∉ Im(f'); find x ≠ x' such that f'(x) = f'(x') (such x, x' must exist since at this point f' is not a permutation); set f'(x) = y. In every iteration |Im(f')| increases by one. The above process started with |Im(f')| > (1 - ε) · 2^n, and thus takes fewer than ε · 2^n iterations. It follows that f' and f disagree on at most ε · 2^n inputs; in other words, f' is ε-close to f. Moreover, f' is a permutation, since Im(f') = {0,1}^n. □
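The repair process in the proof above can be made concrete. The following sketch (our own illustrative helper, operating on a truth-table representation of f) reroutes colliding inputs to unused outputs; it changes exactly 2^n − |Im(f)| entries, so a function with a large image is repaired into a nearby permutation.

```python
def repair_to_permutation(f_table):
    """Illustrative implementation of the greedy repair from the proof of
    Fact 5.3.6: while some output y is missing from the image, reroute one of
    two colliding inputs to y.  Returns the repaired table f' and the number
    of entries changed, which equals len(f_table) - |Im(f)|."""
    fp = list(f_table)                       # f' starts out equal to f
    dom = len(fp)
    counts = {}                              # multiplicity of each output
    for v in fp:
        counts[v] = counts.get(v, 0) + 1
    missing = [y for y in range(dom) if y not in counts]
    changes = 0
    for x in range(dom):
        if counts[fp[x]] > 1:                # x collides with some other input
            counts[fp[x]] -= 1
            fp[x] = missing.pop()            # reroute x to an unused output
            changes += 1
    return fp, changes
```

Running it on a table whose image has size (1 − ε) · 2^n changes an ε fraction of entries, matching the bound in the proof.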

Using the above fact we can show that even after seeing a random element in the image of a function that is ε-far from being a permutation, its preimage still has some entropy. The specific notion of entropy we use is average min-entropy.^18

Claim 5.3.1. Suppose that f: {0,1}^n → {0,1}^n is ε-far from PERMUTATION_n. Let X be a random variable uniformly distributed over {0,1}^n, and let Y = f(X). Then, H̃_min(X | Y) ≥ log(1/(1 - ε)).

Proof. This follows immediately from Facts 2.2.10 and 5.3.6. □
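For intuition, average min-entropy has a clean closed form here: the best guess for X given Y = f(X) = y succeeds with probability 1/|f^{-1}(y)|, and averaging over y gives exactly |Im(f)|/2^n, so H̃_min(X | f(X)) = log(2^n/|Im(f)|), which is at least log(1/(1 − ε)) by Fact 5.3.6. The small sketch below (our own illustrative helper on a truth-table representation) computes this quantity.

```python
import math
from collections import Counter

def avg_min_entropy_given_image(f_table):
    """H-tilde_min(X | f(X)) for X uniform over the domain (illustrative).
    For each output y, the optimal guess of x succeeds with probability
    1/|f^{-1}(y)|; the average success probability over y ~ f(X) collapses
    to |Im(f)| / |domain|, so the entropy is -log2 of that ratio."""
    dom = len(f_table)
    counts = Counter(f_table)                          # preimage sizes
    guess = sum((c / dom) * (1 / c) for c in counts.values())  # = |Im(f)|/dom
    return -math.log2(guess)
```

For a 2-to-1 function (ε = 1/2) this gives exactly 1 bit, matching the log(1/(1 − ε)) bound; for a permutation it gives 0.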

We are now ready to prove Lemma 5.3.4.

Proof of Lemma 5.3.4. Let P* be a (cheating) prover strategy. We assume without loss of generality that P* is deterministic (by fixing the best choice of random coin tosses). Let X, H, Y, R and Z be the (jointly distributed) random variables induced by the values of x, h, y, r and z, respectively, in a random execution of (P*, V_perm), and let out(P*, V_perm) be the random variable induced by V_perm's output in the same random execution (i.e., out(P*, V_perm) ∈ {accept, reject}). By the definition of the permutation protocol, it holds that

Pr[out(P*, V_perm) = accept] ≤ Pr[∀i ∈ [s]: f(Z_i) = R_i ⊕ H(X)_i]   (5.1)
≤ Pr[∀i ∈ [s]: R_i ⊕ H(X)_i ∈ Im(f)].

Note that R = (R_1, ..., R_s) is a function of Y and H, determined by P*. It follows that

Pr[∀i ∈ [s]: R_i(Y, H) ⊕ H(X)_i ∈ Im(f)] ≤ Pr[∀i ∈ [s]: R_i(Y, H) ⊕ U_i ∈ Im(f)]   (5.2)
+ SD((H(X), H, Y), (U, H, Y)),

^18 Recall that the average min-entropy of X given Y is defined as H̃_min(X | Y) = -log(E_{y~Y}[max_x Pr[X = x | Y = y]]) (see Definition 2.2.8).

where U = (U_1, U_2, ..., U_s) is uniform over ({0,1}^n)^s. We bound both terms on the right-hand side of Eq. (5.2). For the first term, note that U_i is independent of U_j for i ≠ j, and thus

Pr[∀i ∈ [s]: R_i(Y, H) ⊕ U_i ∈ Im(f)] = ∏_{i=1}^{s} Pr[R_i(Y, H) ⊕ U_i ∈ Im(f)]   (5.3)
= ∏_{i=1}^{s} Pr[U_i ∈ Im(f)]
≤ (1 - ε)^s,
where the second equality follows from the fact that for every r ∈ {0,1}^n, r ⊕ U_i is uniform over {0,1}^n, and the last inequality follows from Fact 5.3.6. As for the second term of Eq. (5.2), by Fact 2.2.9 it holds that

H̃_min(X | Y) = t·s · H̃_min(X_1 | Y_1) ≥ t·s · log(1/(1 - ε)),   (5.4)
where the inequality follows from Claim 5.3.1. Applying the generalized leftover hash lemma (Lemma 2.2.14), we now obtain that:

SD((H(X), H, Y), (U, H, Y)) ≤ (1/2) · √(2^{-t·s·log(1/(1-ε))} · 2^{n·s}) = (1/2) · (2^n · (1 - ε)^t)^{s/2}.   (5.5)

Plugging Eqs. (5.2), (5.3) and (5.5) into Eq. (5.1), we have

Pr[out(P*, V_perm) = accept] ≤ (1 - ε)^s + (1/2) · (2^n · (1 - ε)^t)^{s/2}   (5.6)
≤ (1 - ε)^{k/ε} + (1/2) · (2^n · (1 - ε)^{(n+1)/ε})^{k/(2ε)}
≤ 2^{-k} + (1/2) · (2^n · 2^{-(n+1)})^{k/(2ε)}
= 2^{-k} + 2^{-k/(2ε)-1},
where the second inequality follows from our setting of t = ⌈(n + 1)/ε⌉ and s = ⌈k/ε⌉, and the third inequality follows from the fact that 1 - x ≤ 2^{-x} for any x ≥ 0. Thus, the verifier accepts with probability exponentially vanishing in k, and in particular negligible. □
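As a numeric sanity check on the chain of inequalities in Eq. (5.6), the following sketch (illustrative parameter choices only) evaluates the first expression and compares it against the final bound 2^{-k} + 2^{-k/(2ε)-1}:

```python
import math

def soundness_bound(n, k, eps):
    """Right-hand side of the first line of Eq. (5.6):
    (1 - eps)^s + 0.5 * (2^n * (1 - eps)^t)^(s/2),
    with t = ceil((n + 1) / eps) and s = ceil(k / eps)."""
    t, s = math.ceil((n + 1) / eps), math.ceil(k / eps)
    return (1 - eps) ** s + 0.5 * (2 ** n * (1 - eps) ** t) ** (s / 2)
```

For example, with n = 10, k = 20 and ε = 1/2 the value is far below 2^{-20} + 2^{-21}, consistent with the derivation above.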

5.3.1.3 Analyzing Zero-Knowledge - Proof of Lemma 5.3.5

Let Ṽ be a cheating verifier strategy and fix an input f ∈ PERMUTATION_n. For simplicity, and without loss of generality, we assume that Ṽ is deterministic.^19 Throughout this proof we fix an auxiliary input z to S_perm and remove it from the notation of both the simulator and the (possibly cheating) verifier (since all S_perm does

^19 Recall that if the cheating verifier is randomized, we can fix its random coins as part of the auxiliary input to both parties.

with its auxiliary input is to provide it to Ṽ, both algorithms get the same auxiliary inputs). Recall that we let S^f(ε, n, k, Ṽ) denote the algorithm defined by running S_perm on input ε, n, k, with oracle access to f and access to Ṽ. Note that S^f(ε, n, k, Ṽ) halts almost surely, namely, the probability that it never halts is zero. The following claim shows that the output distribution of S_perm is identical to the view of Ṽ. Later, in Claim 5.3.3, we will bound the (expected) running time and query complexity of S_perm.

Claim 5.3.2. The output distribution of S_perm is identical to the view of Ṽ.

Proof. Let X, H, Y, R and Z be the (jointly distributed) random variables induced by the values of x, h, y, r and z, respectively, in a random execution of (P_perm, Ṽ). First observe that since Ṽ is deterministic, its first message (y, h) is fixed, and so Y = y and H = h. Also, since the verifier is deterministic, there exists a function v such that X = v(R). Lastly, observe that there also exists a function u such that Z = u(R). The view of the verifier is therefore:

view_{P_perm, Ṽ}(ε, n, k, f) = (Y, H, R, X, Z) = (y, h, R, v(R), u(R)).   (5.7)

Similarly, let X_S, H_S, Y_S, R_S, Z_S be the (jointly distributed) random variables induced by the output of a random execution of S^f(ε, n, k, Ṽ). We need to show that:

(Y, H, R, X, Z) = (Y_S, H_S, R_S, X_S, Z_S).   (5.8)

First observe that by construction Y_S = y and H_S = h. Also observe that no matter what value R_S obtains, in all steps in which the simulator might generate an output, it holds that X_S = v(R_S), where the function v is as defined above. Similarly, it holds that Z_S = u(R_S), where u was defined above. Thus, Eq. (5.8) reduces to showing that R and R_S are identically distributed. Since R is uniform, all we need to show is that R_S is also uniformly distributed. Let

A = {r ∈ ({0,1}^n)^s : ∃i ∈ [t·s] s.t. y_i ≠ f(x_i), where x = v(r)},   (5.9)
where y = (y_1, ..., y_{t·s}) (and recall that y was fixed). Namely, A contains those elements of ({0,1}^n)^s that, had they been sent by P_perm as its second message, would cause the verifier Ṽ to send x that are not the preimages of y. Finally, let p = Pr[r ~ ({0,1}^n)^s : r ∉ A] and fix r* ∈ ({0,1}^n)^s. We show that Pr[R_S = r*] = 1/(2^n)^s. The proof now splits according to r*.

r* ∈ A: The only way S^f(ε, n, k, Ṽ) would output r* is by choosing it in Step 2. Since S^f(ε, n, k, Ṽ) chooses the values in this step uniformly at random from ({0,1}^n)^s, it follows that Pr[R_S = r*] = 1/(2^n)^s.

r* ∉ A: The only way S^f(ε, n, k, Ṽ) would output r* is by choosing r'' = r* in Step 5a. The probability that S^f(ε, n, k, Ṽ) reaches Step 5 at all is p. Having reached Step 5, and since f is a permutation, every time that S^f(ε, n, k, Ṽ) runs Step 5a, it samples r'' uniformly at random from ({0,1}^n)^s, independently of all previous messages it received from Ṽ. In Step 5, S^f(ε, n, k, Ṽ) performs rejection sampling until it gets r'' ∉ A, and then sets R_S = r''. All in all, it holds that

Pr[R_S = r*] = p · Pr_{r''~({0,1}^n)^s}[r'' = r* | r'' ∉ A]
= Pr_{r~({0,1}^n)^s}[r ∉ A] · (Pr_{r''~({0,1}^n)^s}[r'' = r*] / Pr_{r''~({0,1}^n)^s}[r'' ∉ A])
= Pr_{r~({0,1}^n)^s}[r = r*] = 1/(2^n)^s.

(Note that we can condition on the event r'' ∉ A since r* ∉ A and so Pr[r'' ∉ A] > 0.)

Hence, R_S is uniform over ({0,1}^n)^s. This completes the proof of Claim 5.3.2. □
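The case analysis above boils down to a simple fact about the simulator's sampling pattern: outputting a uniform draw directly when it lands in A, and otherwise rejection-sampling a fresh uniform draw conditioned on avoiding A, yields an exactly uniform output. The following Monte-Carlo sketch (a toy universe [m] in place of ({0,1}^n)^s; names are ours) demonstrates this.

```python
import random

def sample_R_S(A, m, rng):
    """Sampling pattern of the simulator in the proof of Claim 5.3.2
    (illustrative): draw r uniformly from [m]; if r lands in the 'bad' set A,
    output it directly (the Step 4 branch); otherwise rejection-sample until
    a fresh draw lands outside A (the Step 5 branch).  The output is exactly
    uniform on [m]."""
    r = rng.randrange(m)
    if r in A:
        return r
    while True:
        r2 = rng.randrange(m)
        if r2 not in A:
            return r2
```

The mass a uniform draw would have placed on A is delivered by the first branch, and the remaining mass is spread uniformly outside A by the rejection sampling, so the two branches recombine into the uniform distribution.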

Claim 5.3.3. If the cheating verifier Ṽ runs in time t_Ṽ and makes q_Ṽ queries, then the expected running time of the simulator S_perm is O(t_Ṽ) + poly(n, k, 1/ε) and its expected query complexity is O(q_Ṽ + nk/ε²).

Proof. We first show that the expected number of calls the simulator makes to Ṽ is constant. Let T be the number of times S^f(ε, n, k, Ṽ) executes Steps 3 and 5c when Ṽ uses the coins ρ. Note that T is equal to the number of times the simulator invokes Ṽ. Let A be as defined in Eq. (5.9) (recall that A was defined as the set of vectors r for which the verifier Ṽ responds with x that do not all correspond to the respective preimages of y). Let p = Pr_{r~({0,1}^n)^s}[r ∉ A]. Let E denote the event that S^f(ε, n, k, Ṽ) reaches Step 5. By construction, Pr[T = 1] = Pr[¬E] = 1 - p. Moreover, the random variable T|E is drawn from a geometric distribution with parameter p. Since the latter has expectation 1/p, we have that:

E[T] = Pr[¬E] · E[T | ¬E] + Pr[E] · E[T | E]   (5.10)
= (1 - p) · 1 + p · (1/p)
= 2 - p ≤ 2.

Thus, in expectation, the simulator invokes Ṽ at most twice. Every time S^f(ε, n, k, Ṽ) calls Ṽ, it samples a random string in ({0,1}^n)^s, evaluates some h ∈ F_{n·t·s, n·s}, and makes O(t·s) calls to f. Recall that t_Ṽ and q_Ṽ denote the running time and query complexity of Ṽ, respectively. The expected running time of S^f(ε, n, k, Ṽ) is thus at most O(t_Ṽ) + poly(t, s, n) = O(t_Ṽ) + poly(n, k, 1/ε) (note that

by Fact 2.2.12, evaluation of h can be done in time poly(t, s, n)). The expected query complexity of S^f(ε, n, k, Ṽ) is thus at most O(q_Ṽ + t·s) = O(q_Ṽ + nk/ε²). □

5.3.2 Promise Expansion is in HVSZKPP

In this section we consider the property of a graph being a "good" expander, in the bounded-degree graph model (see [GGR98, GR02]). Recall that in the bounded-degree graph model, the input graph is represented by an adjacency list, so, using a single query, the verifier can find the i-th neighbor of a vertex v. The property of being a good expander was first considered by Goldreich and Ron [GR11]. They showed that any tester for the (spectral) expansion of a graph on n vertices must make at least Ω(√n) queries. [GR11] also suggested a testing algorithm that matches this bound and conjectured its correctness. Czumaj and Sohler [CS10] focused on vertex expansion and proved that the [GR11] tester accepts graphs that are good expanders and rejects graphs that are far from having even much worse expansion. More specifically, [CS10] showed that the [GR11] tester accepts graphs with (vertex) expansion α and rejects graphs that are far from having even

(vertex) expansion Θ(α²/log n). Lastly, Nachmias and Shapira [NS10] and Kale and Seshadhri [KS11] improved [CS10]'s result and showed that the tester in fact rejects graphs that are far from having expansion Θ(α²). We show how to apply [CS10]'s approach to get an honest-verifier statistical zero-knowledge proof of proximity with only a poly-logarithmic dependence on n, as long as we have a similar type of gap between YES and NO instances as in [CS10]. Formally, a vertex expander^20 is defined as follows.

Definition 5.3.7 (Vertex expander). A graph G = (V, E) is an α-expander, for a parameter α > 0, if for every subset U ⊆ V of size |U| ≤ |V|/2 it holds that |N(U)| ≥ α·|U|, where we define N(U) = {v ∈ V \ U : ∃u ∈ U such that (v, u) ∈ E}.

Throughout this section we fix a bound d on the degree of all graphs (which we think of as constant). We identify graphs on n vertices with functions G: [n] × [d] → [n] ∪ {⊥} such that G(u, i) = v if v is the i'th neighbor of vertex u, and G(u, i) = ⊥ if u has fewer than i neighbors.
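Definition 5.3.7 and the adjacency-function representation above are easy to make concrete. The following is an illustrative brute-force check (exponential in n, so only for tiny graphs); `make_oracle`, `neighborhood` and `is_alpha_expander` are hypothetical helper names, not part of the text.

```python
from itertools import combinations

def make_oracle(adj):
    """Wrap an adjacency-list dict as the function G: [n] x [d] -> [n] u {bot}:
    G(u, i) is the i-th neighbor of u, or None if u has fewer than i neighbors."""
    def G(u, i):
        return adj[u][i] if i < len(adj[u]) else None
    return G

def neighborhood(G, d, U):
    """N(U): vertices outside U adjacent to some vertex of U (Definition 5.3.7)."""
    U = set(U)
    return {G(u, i) for u in U for i in range(d)} - U - {None}

def is_alpha_expander(G, n, d, alpha):
    """Brute-force check of Definition 5.3.7 over all U with |U| <= n/2."""
    return all(
        len(neighborhood(G, d, U)) >= alpha * len(U)
        for size in range(1, n // 2 + 1)
        for U in combinations(range(n), size)
    )

# The 4-cycle: every U with |U| <= 2 has |N(U)| = 2 >= |U|, so it is a 1-expander.
G = make_oracle({0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]})
print(is_alpha_expander(G, n=4, d=2, alpha=1.0))  # True
```

By contrast, the path on 4 vertices is not a 1-expander: the set U = {0, 1} has only the single outside neighbor 2.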

Definition 5.3.8 (Promise expansion). Let d ∈ N. For n ∈ N and α = α(n) > 0, let

EXPANDER_n^{d,α} = {G: [n] × [d] → [n] ∪ {⊥} : G is an α(n)-expander}.

Let β = β(n) ∈ (0, α(n)] and ε = ε(n) ∈ (0, 1). We define the expander promise problem (see Remark 5.2.5) as

EXPANDER^{d,α,β} = (EXPANDER_YES^{d,α,β}, EXPANDER_NO^{d,α,β}, ε(n), [n] × [d])_{n∈N},

20Some references in the literature call this boundary expander.

where EXPANDER_YES,n^{d,α,β} = EXPANDER_n^{d,α} and

EXPANDER_NO,n^{d,α,β} = {G: [n] × [d] → [n] ∪ {⊥} : Δ(G, EXPANDER_n^{d,β}) > ε}.

That is, YES instances of the promise problem are graphs that are α-expanders, and NO instances are graphs that are far from even being β-expanders, for β ≤ α. Note that the above promise problem is well defined with respect to Remark 5.2.5: every α-expander is also a β-expander, and thus graphs in EXPANDER_NO^{d,α,β} are ε-far from EXPANDER_YES^{d,α,β}.

Theorem 5.3.9 (SZKPP for expansion). Let d ∈ N and α ∈ (0, 1/3] be constants. Then, it holds that EXPANDER^{d,α,β} ∈ HVSZKPP[poly(log(n), k, 1/ε)] (and therefore also EXPANDER^{d,α,β} ∈ HVSZKPP[poly(log(N), k, 1/ε)]), for β = Θ(α²/log n), where n is the number of vertices in the graph, k is the security parameter and ε is the proximity parameter. (The containment HVSZKPP[poly(log(n), k, 1/ε)] ⊆ HVSZKPP[poly(log(N), k, 1/ε)] in the above theorem holds since the input length N is equal to n·d·log(n) and d is a constant.)

Proof of Theorem 5.3.9. We prove Theorem 5.3.9 by reducing graph expansion to the problem of testing whether two distributions are statistically close. The reduction is such that we can sample from each distribution using few queries to our original input graph. Given this reduction, we can use Lemma 2.3.13, which gives an honest-verifier statistical zero-knowledge proof for verifying whether the distributions induced by two circuits on a random input are statistically close. A crucial observation is that neither the verifier nor the simulator in the latter protocol needs to actually look at its input circuits. Rather, they only need to be able to draw relatively few random samples from the distribution induced by each circuit on a random input. Intuitively, by applying our reduction we therefore obtain an honest-verifier statistical zero-knowledge proof for EXPANDER^{d,α,β}.

We proceed to give an overview of our reduction from EXPANDER^{d,α,β} to statistical closeness. The reduction, which is randomized, chooses a vertex u uniformly at random and considers two distributions: the first, denoted by P_u^ℓ, is the last vertex in a random walk of length ℓ = Θ(log(n)/α²) starting at u; the second, denoted by U_[n], is uniform over all vertices. Observe that both distributions can be sampled using relatively few queries to the input graph. We observe that if the graph is an α-expander (i.e., a YES instance), then for any choice of u it holds that SD(P_u^ℓ, U_[n]) ≈ 0 (and so the two distributions are close). On the other hand, [CS10] showed that if the graph is far from being a Θ(α²/log n)-expander (i.e., a NO instance), then there exists a set of vertices U with |U| = Ω(n) such that for every u ∈ U, it holds that SD(P_u^ℓ, U_[n]) is bounded away from 0. Thus, in the NO case, with constant probability (over the choice of u), our reduction generates distributions that are statistically far.

We proceed to the actual proof. Our protocol uses the following lazy random walk on the graph G.
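The reduction's two sampleable distributions can be sketched directly. The graph below is a hypothetical well-connected 3-regular example (a cycle plus antipodal chords), and the empirical statistical distance stands in for SD(P_u^ℓ, U_[n]); all names here are illustrative, not from the text.

```python
import random
from collections import Counter

def lazy_walk_endpoint(adj, d, u, steps, rng):
    """Endpoint of a lazy walk: from u, move to each neighbor w.p. 1/(2d); otherwise stay."""
    for _ in range(steps):
        j = rng.randrange(2 * d)
        if j < len(adj[u]):
            u = adj[u][j]
    return u

def empirical_sd(samples_p, samples_q, support):
    """Empirical statistical (total variation) distance between two sample multisets."""
    cp, cq = Counter(samples_p), Counter(samples_q)
    return 0.5 * sum(abs(cp[v] / len(samples_p) - cq[v] / len(samples_q)) for v in support)

rng = random.Random(0)
n, d = 32, 3
# Cycle plus "antipodal" chords: undirected, 3-regular, mixes quickly.
adj = {v: [(v - 1) % n, (v + 1) % n, (v + n // 2) % n] for v in range(n)}

u = rng.randrange(n)
steps = 400  # stands in for l = Theta(log(n) / alpha^2)
walk_samples = [lazy_walk_endpoint(adj, d, u, steps, rng) for _ in range(2000)]
unif_samples = [rng.randrange(n) for _ in range(2000)]
sd = empirical_sd(walk_samples, unif_samples, range(n))
print(sd)  # close to 0 (up to sampling error), since this graph mixes well
```

On a graph that is far from an expander (e.g., two disjoint cycles), the same experiment yields empirical distance close to 1/2, since the walk never leaves the start vertex's component.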

Definition 5.3.10 (Random walk). Let G = (V, E) be a (simple) bounded degree d graph. For u ∈ V, define p_u(v) = 1/(2d) if (u, v) ∈ E (i.e., (u, v) is an edge), and p_u(u) = 1 − deg(u)/(2d). For ℓ ∈ N, a random walk of length ℓ starting at u is a random process that chooses ℓ vertices u_1, …, u_ℓ such that u_1 = u and every vertex u_{i+1} is sampled from the distribution p_{u_i}. The distribution P_u^ℓ is defined by letting P_u^ℓ(v) be the probability that u_ℓ = v.

Assume without loss of generality that ε ≤ 0.1, where ε is the proximity parameter.^21 Let h: [0, 1] → [0, 1] be the binary entropy function (recall that h(p) = −p·log(p) − (1 − p)·log(1 − p)). By a routine calculation, it holds that

h((1 + 0.2)/2) = h(0.6) ≤ 0.98 < 1 − 0.01. (5.11)

Thus, we can apply Lemma 2.3.13 with respect to the constants a = 0.2 and b = 0.01 to obtain an honest-verifier statistical zero-knowledge protocol (P^(·)(k), V^(·)(k)) for SDP^{0.2,0.01}. Thus, (P^(·)(k), V^(·)(k)) is a statistical zero-knowledge protocol for distinguishing between YES instances, which are pairs of circuits that have statistical distance at most 0.01, and NO instances, which are pairs of circuits whose statistical distance is at least 0.2.^22 Using the latter, we construct a protocol (P_expan, V_expan) for verifying expansion, which is given in Fig. 5.3.3.
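The numeric inequality in Eq. (5.11) is a one-line computation:

```python
from math import log2

def h(p):
    """Binary entropy h(p) = -p*log2(p) - (1-p)*log2(1-p)."""
    return -p * log2(p) - (1 - p) * log2(1 - p)

val = h((1 + 0.2) / 2)  # = h(0.6)
print(round(val, 4))  # 0.971
assert val <= 0.98 < 1 - 0.01
```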

The Expander Protocol (P_expan, V_expan)

Prover's Input: A graph G: [n] × [d] → [n] ∪ {⊥}, expansion parameter α > 0, proximity parameter ε > 0 and security parameter k.

Verifier's Input: α, ε, n, d, k and oracle access to G.

1. V_expan samples u ← [n] and sends u to P_expan.

2. The parties construct two oracle circuits: one encodes the distribution P_u^ℓ for ℓ = ⌈(8d²/α²)·ln(√n/0.01)⌉; the other encodes U_[n], the uniform distribution on the graph's vertices.

3. The parties run the protocol (P^{P_u^ℓ,U_[n]}(k), V^{P_u^ℓ,U_[n]}(k)), where (P^(·), V^(·)) is the protocol for SDP^{0.2,0.01} from Lemma 2.3.13.

Figure 5.3.3: The Expander Protocol

The next two lemmas show the completeness and soundness of the expander protocol.

Lemma 5.3.11. Let n, k ∈ N, let ε > 0 and let G: [n] × [d] → [n] ∪ {⊥}. Assume that G is in EXPANDER_n^{d,α} (i.e., G is an α-expander). Then V_expan^{G,d}(ε, n, k), when interacting with P_expan^d(ε, G, k) according to the expander protocol (Fig. 5.3.3), accepts with probability 1 − negl(k).

^21 Otherwise, we can just "reset" ε to 0.1.
^22 The constant 0.2 that we use here stems from the analysis of [CS10]. On the other hand, the constant 0.01 is arbitrary (but was chosen so that Eq. (5.11) holds).

Lemma 5.3.11 is proven below via a standard analysis of random walks on expanders.

Lemma 5.3.12. Let n, k ∈ N, let 0 < ε ≤ 0.1 and let G: [n] × [d] → [n] ∪ {⊥}. Assume that G is ε-far from EXPANDER_n^{d,β} for β = α²/(80·c·d²·ln(√n/0.01)), where c = c(d) > 0 is the constant from Lemma 5.3.14 below. Then for every prover strategy P*, when V_expan^{G,d}(ε, n, k) interacts with P* according to the expander protocol (Fig. 5.3.3), it rejects with probability at least ε/24 − negl(k).

Lemma 5.3.12 is proven below via a combinatorial property of graphs that are ε-far from β-expanders, shown by [CS10].

As for honest-verifier zero-knowledge, let S^(·) denote the simulator of the protocol for SDP^{0.2,0.01} from Lemma 2.3.13. Note that if G is an α-expander, the same mixing argument used to establish completeness implies that SD(P_u^ℓ, U_[n]) ≤ 0.01 for every vertex u (see below). Hence, for every vertex u it holds that S^{P_u^ℓ,U_[n]}(k) simulates (P^{P_u^ℓ,U_[n]}(k), V^{P_u^ℓ,U_[n]}(k)) with simulation deviation at most μ(k), for some negligible function μ. Our simulator for the expansion protocol, denoted by S_expan, will choose a vertex u uniformly at random and output (u, S^{P_u^ℓ,U_[n]}(k)). Now observe that S_expan's deviation from V_expan's view in (P_expan, V_expan) is precisely equal to the expected deviation, over the choice of a random vertex u, of S^{P_u^ℓ,U_[n]}(k) from the view of V^{P_u^ℓ,U_[n]} in the protocol (P^{P_u^ℓ,U_[n]}(k), V^{P_u^ℓ,U_[n]}(k)). Since the latter is bounded by μ(k) for every choice of u, the expected deviation is bounded by μ(k) as well.

So far we have shown that the expander protocol (Fig. 5.3.3) has negl(k) completeness error, 1 − (ε/24 − negl(k)) soundness error, and is honest-verifier statistical zero-knowledge. To reduce the soundness error, the parties will repeat the above protocol in parallel poly(k)/ε times. Since honest-verifier statistical zero-knowledge is preserved under parallel repetition, and parallel repetition reduces the soundness error of IPPs at an exponential rate (see, e.g., [GGR18, Appendix A]), the resulting protocol is an honest-verifier statistical zero-knowledge proof of proximity.

Finally, we argue about the efficiency of V_expan (the analysis of the simulator's efficiency is similar). The verifier V_expan needs to provide V^{P_u^ℓ,U_[n]}(k) with samples from P_u^ℓ and U_[n]. Generating a random sample from U_[n] is easy and requires O(log n) random coins. Generating a random sample from P_u^ℓ is standard as well, requiring poly(ℓ, d) = poly(log n) random coins and oracle calls to the input graph. By Lemma 2.3.13, it follows that V_expan's running time (accounting for the parallel repetition as well) is at most poly(log(n), 1/ε, k). □

5.3.2.1 Analyzing Completeness - Proving Lemma 5.3.11

Lemma 5.3.11 is an easy implication of the following standard result regarding random walks on expanders.

Lemma 5.3.13 (Expanders are rapidly mixing (c.f. [NS10, proof of Theorem 2.1])). Suppose that G is an α-expander graph on n vertices with bounded degree d. Then for every vertex u and ℓ ∈ N it holds that SD(P_u^ℓ, U_[n]) ≤ √n · e^{−α²ℓ/(8d²)}.
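Lemma 5.3.13 can be sanity-checked numerically on a toy expander. The complete graph K_6 is a 1-expander of degree d = 5 (every U with |U| ≤ n/2 has |N(U)| = n − |U| ≥ |U|), and the exact walk distribution is small enough to track directly; this is an illustrative check, not part of the proof.

```python
from math import exp, sqrt

# Exact lazy-walk distribution on K_6, checked against the bound of Lemma 5.3.13:
# SD(P_u^l, U_[n]) <= sqrt(n) * exp(-alpha^2 * l / (8 * d^2)).
n, d, alpha = 6, 5, 1.0
adj = {u: [v for v in range(n) if v != u] for u in range(n)}

def lazy_step(dist):
    """One lazy step: move to each neighbor w.p. 1/(2d), stay w.p. 1 - deg/(2d)."""
    new = [0.0] * n
    for u, p in enumerate(dist):
        new[u] += p * (1 - len(adj[u]) / (2 * d))
        for v in adj[u]:
            new[v] += p / (2 * d)
    return new

dist = [1.0] + [0.0] * (n - 1)  # walk starts at vertex u = 0
for l in range(1, 21):
    dist = lazy_step(dist)
    sd = 0.5 * sum(abs(p - 1 / n) for p in dist)
    bound = sqrt(n) * exp(-alpha ** 2 * l / (8 * d ** 2))
    assert sd <= bound  # the lemma's (loose) bound holds at every step
print(sd)  # after 20 steps the walk is essentially uniform
```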

Proof of Lemma 5.3.11. From the choice of ℓ and Lemma 5.3.13, it holds that SD(P_u^ℓ, U_[n]) ≤ 0.01 for every vertex u. Let W be the random variable induced by the value of u chosen in step 1 of a random execution of the protocol. It follows that

Pr[V_expan^{G,d}(ε, n, k) accepts] = Pr[V^{P_W^ℓ,U_[n]}(k) accepts] ≥ 1 − negl(k),

where the last inequality follows from the completeness property of (P^(·), V^(·)) (Lemma 2.3.13). □

5.3.2.2 Analyzing Soundness - Proving Lemma 5.3.12

The soundness of the expander protocol follows from the following combinatorial property of graphs that are far from expanders.

Lemma 5.3.14 ([CS10, Corollary 4.6 and Lemma 4.7]). Let G be a graph on n vertices with bounded degree d. There exists a constant c = c(d) > 0 such that the following holds. If G is ε-far from any β-expander with β ≤ 1/10, then there exists U ⊆ [n] with |U| ≥ ε·n/24 such that for every u ∈ U and every ℓ ≤ 1/(10cβ), it holds that SD(P_u^ℓ, U_[n]) ≥ 1 − 2ε.

Recall that V_expan's first step is to choose a vertex u uniformly at random. This u will belong to the set U from Lemma 5.3.14 with probability at least ε/24. Conditioned on the latter, we claim that the input to (P^{P_u^ℓ,U_[n]}(k), V^{P_u^ℓ,U_[n]}(k)) is a NO input, and so soundness follows immediately from the soundness of the protocol for SDP^{0.2,0.01} (Lemma 2.3.13). We argue soundness with respect to β = α²/(80·c·d²·ln(√n/0.01)), where c = c(d) > 0 is the constant guaranteed to exist by Lemma 5.3.14.

Proof of Lemma 5.3.12. Let P* be any prover strategy. Let W be the random variable induced by the value of u chosen in step 1 of a random execution of the protocol, and let U be the set guaranteed to exist by Lemma 5.3.14. It holds that

Pr[V_expan^{G,d}(ε, n, k) accepts] ≤ Pr[W ∉ U] + Pr[V_expan^{G,d}(ε, n, k) accepts | W ∈ U]. (5.12)

We bound both terms on the right-hand side of Eq. (5.12). Lemma 5.3.14 yields that Pr[W ∉ U] ≤ 1 − ε/24. As for the second term, Lemma 5.3.14 yields that SD(P_w^ℓ, U_[n]) ≥ 1 − 2ε ≥ 0.2 for every w ∈ U (note that β was chosen so that ℓ ≤ 1/(10cβ), and that we assumed above that ε ≤ 0.1). Hence, conditioned on W ∈ U, the input to the SDP protocol is a NO input, and the soundness of the protocol for SDP^{0.2,0.01} (Lemma 2.3.13) yields that

Pr[V_expan^{G,d}(ε, n, k) accepts | W ∈ U] ≤ negl(k).

Plugging both bounds into Eq. (5.12) completes the proof. □

5.3.3 Promise Bipartiteness is in HVSZKPP

In this section we consider the property of a graph being bipartite, in the bounded degree model (we introduced this model in Section 5.3.2). The property of being a bipartite graph was first considered by Goldreich and Ron [GR02, GR99]. They showed a tester for bipartiteness of graphs with n vertices which makes at most O(√n) queries. They also showed a matching lower bound, namely that any such tester must make at least Ω(√n) queries.

Rothblum et al. [RVW13] gave an interactive proof of proximity for a "promise variant" of bipartiteness in which the verifier's running time is polylog(n). In this variant, YES instances are bipartite graphs. However, NO instances, in addition to being far from bipartite, are also well-mixing, namely, a random walk of Θ(log n) steps ends at each vertex with probability at least 1/2n. We denote this promise problem by PROMISE-BIPARTITE.

In [RVW13]'s protocol, the verifier simply takes a random walk of length Õ(log n) starting at a randomly chosen vertex (in each step it performs a self-loop with probability at least 1/2; see Definition 5.3.10). The verifier then sends to the prover the start and end vertices of the walk and asks the prover for the parity of the number of non-self-loop steps it took during the walk. If the graph is bipartite, then the parity of the number of non-self-loop steps is zero in case the start and end vertices lie on the same side of the graph, and is one otherwise. Thus, if the graph is bipartite the prover can easily predict the value. [RVW13] showed that if the graph is well-mixing and far from being bipartite, then the chance of taking a path with an even number of non-self-loop steps is close to that of taking a path with an odd number of non-self-loop steps. Hence, any cheating prover will fail to convince the verifier.

Our observation is that the above protocol is also an honest-verifier perfect zero-knowledge proof of proximity. The simulator will simply act as the verifier: take a random walk and output the parity of the non-self-loop steps in this walk (which the simulator knows since it performs the walk). Since this result follows immediately from [RVW13]'s protocol, we only state an informal version of it here, and refer the reader to [RVW13, Section 5.1] for formal definitions and a description of the protocol.
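The simulator described above is a few lines of code. Below is a minimal sketch, with the even cycle as a hypothetical bipartite input; the function names are illustrative.

```python
import random

def walk_with_parity(adj, d, start, steps, rng):
    """Lazy random walk; returns (start, end, parity of non-self-loop steps)."""
    u, parity = start, 0
    for _ in range(steps):
        j = rng.randrange(2 * d)
        if j < len(adj[u]):   # a real step: move and flip the parity
            u = adj[u][j]
            parity ^= 1
        # otherwise: self-loop, parity unchanged
    return start, u, parity

def simulate_view(adj, n, d, steps, rng):
    """Honest-verifier simulator sketch: the simulator runs the verifier's walk
    itself, so it already knows the parity the honest prover would answer."""
    return walk_with_parity(adj, d, rng.randrange(n), steps, rng)

rng = random.Random(1)
n, d = 8, 2
cycle = {v: [(v - 1) % n, (v + 1) % n] for v in range(n)}  # an even cycle is bipartite
s, e, parity = simulate_view(cycle, n, d, steps=30, rng=rng)
# On a bipartite graph the parity is determined by the sides of the start/end vertices:
print(parity == (s % 2) ^ (e % 2))  # True
```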

Theorem 5.3.15. PROMISE-BIPARTITE ∈ HVPZKPP[poly(log(N), k, 1/ε)].

As noted in the introduction, we remark that it is an interesting open question to either remove the restriction to well-mixing graphs, or to show a cheating-verifier

zero-knowledge proof of proximity for PROMISE-BIPARTITE (even with the restriction to well-mixing graphs).

5.4 Limitations of SZKPP

In light of the positive results in Section 5.3, an important question arises:

Can every IPP be transformed to be statistical zero knowledge?

We give a negative answer to the above question.^23 In fact, we show two incomparable lower bounds:

1. There exists a property Π that has an IPP in which the verifier runs in poly-logarithmic time, but the verifier in any statistical zero-knowledge proof of proximity for Π cannot run in poly-logarithmic time. (Actually, we can even show that such a verifier cannot run in time N^{o(1)}; see Remark 5.4.6.) Thus, this lower bound can be viewed as a separation between the class IPP[poly(log(N), k, 1/ε)] and HVESZKPP[poly(log(N), k, 1/ε)].

2. An additional lower bound which separates HVESZKPP[poly(log(N), k, 1/ε)] even from a weaker class, namely the class of languages admitting non-interactive proofs of proximity, also known as Merlin-Arthur proofs of proximity or MAPs [GR18]. However, in contrast to the previous separation from IPPs, this result is conditional: we can only prove it assuming a (very plausible) circuit lower bound. Specifically, we assume that (randomized) DNF⊕, namely DNF formulas composed with one layer of parity gates (see [CS16, ABG+14] and references therein), cannot compute the disjointness function. This circuit lower bound is implied by the assumption that the Arthur-Merlin communication complexity of disjointness is n^ε, for inputs of length n and some constant ε > 0.

5.4.1 IPP ⊄ ESZKPP

We show that there exists a property Π such that Π ∈ IPP[poly(log(N), k, 1/ε)], but Π ∉ HVESZKPP[poly(log(N), k, 1/ε)]. Namely, Π has an efficient IPP, but unconditionally cannot have such a statistical zero-knowledge IPP.^24

Theorem 5.4.1. IPP[poly(log(N), k, 1/ε)] ⊄ HVESZKPP[poly(log(N), k, 1/ε)].

As we mentioned in Section 5.1.3.2, the proof of Theorem 5.4.1 uses the following result by Gur and Rothblum [GR17]:

^23 We emphasize that here we refer to statistical zero-knowledge. Indeed, in Section 5.5 below we show that for computational zero-knowledge such a transformation is possible for a large class of IPPs (see Theorem 5.5.2).
^24 Our actual result refers to statistical zero-knowledge with expected simulation bounds, but this only makes our result stronger.

Lemma 5.4.2 ([GR17, Theorem 1]). There exists Π ∈ IPP[poly(log(N), k, 1/ε)] such that the verifier in every 2-message IPP for Π, with respect to proximity parameter ε = 1/10 and completeness and soundness errors 1/3, must run in time Ω(N^δ), for some universal constant δ > 0.

Namely, there is a property Π which has an interactive proof of proximity with a large number of rounds and a polylog(N)-time verifier, but such that in every 2-message interactive proof of proximity for Π, the verifier's running time must be N^δ, for some constant δ > 0. To derive Theorem 5.4.1 we now show a general round-reduction transformation for honest-verifier statistical zero-knowledge proofs of proximity. Namely, we would like a procedure that takes any many-message honest-verifier zero-knowledge proof of proximity and turns it into a 2-message honest-verifier zero-knowledge proof of proximity, while only slightly deteriorating the verifier's and simulator's running times. Specifically, we show the following lemma.

Lemma 5.4.3 (Efficient round reduction for SZKPP). Suppose that the property Π has an honest-verifier statistical zero-knowledge ε-IPP such that for every input length N ∈ N and security parameter k ∈ N, the simulator's expected running time is bounded by t_S(ε, N, k) = t'_S(ε, N) · poly(k), and for every value of ε, the function t'_S(ε, ·) is monotone non-decreasing. Then, Π has a 2-message honest-verifier statistical zero-knowledge ε-IPP such that for every input length N and security parameter k, the running time of the verifier is poly(t_S(ε, N, k'), k), for k' = poly(t'_S(ε, N)).

For the setting of poly-logarithmic zero-knowledge proofs of proximity, Lemma 5.4.3 can be stated as follows.

Corollary 5.4.4. Every Π ∈ HVESZKPP[poly(log(N), k, 1/ε)] has a 2-message honest-verifier statistical zero-knowledge ε-IPP with expected simulation, such that the verifier's running time is poly(log(N), k, 1/ε).

Remark 5.4.5 (Comparison with Babai-Moran [BM88]). Lemma 5.4.3 and Corollary 5.4.4 should be contrasted with the classical round reduction for interactive proofs, due to Babai and Moran [BM88] (and shown in [RVW13] to hold also for IPPs). In contrast to Lemma 5.4.3, the Babai-Moran round reduction increases the complexity of the verifier exponentially in the round complexity of the original protocol, whereas the overhead in Lemma 5.4.3 is only polynomial, which is crucial for our lower bound.

The proof of Lemma 5.4.3 is a direct application of the proof that the ENTROPY DIFFERENCE PROBLEM (EDP, see Definition 2.3.7) is complete for the class SZK (see Section 2.3). We refer the reader to Section 5.1.3.2 for an intuitive explanation of the proof, while we defer the actual proof of Lemma 5.4.3 to Section 5.6.2. Using Lemmas 5.4.2 and 5.4.3 we can now prove Theorem 5.4.1.

Proof of Theorem 5.4.1. Let Π be the property guaranteed to exist by Lemma 5.4.2. Assume towards a contradiction that Π ∈ HVESZKPP[poly(log(N), k, 1/ε)]. Namely, Π has an honest-verifier statistical zero-knowledge interactive proof of proximity with the simulator's expected running time being (log(N))^α · k^β · (1/ε)^γ, for constants α, β, γ > 0. Applying Lemma 5.4.3 with respect to Π yields that Π has a 2-message ε-IPP (P, V), with V's running time being (log(N))^{δ₁} · k^{δ₂} · (1/ε)^{δ₃}, for constants δ₁ = δ₁(α, β), δ₂, δ₃ = δ₃(β, γ) > 0. Set ε = 1/10 and k such that the soundness error of (P, V) is at most 1/3. Note that in this setting, V's running time is O((log(N))^{δ₁}) = poly(log(N)). This is a contradiction to Lemma 5.4.2. □

Remark 5.4.6. We remark that the proof of Theorem 5.4.1 actually establishes the stronger result that Π cannot even have an HVESZKPP protocol in which the verifier runs in time N^{o(1)} · poly(k, 1/ε). Indeed, assuming a simulator with expected running time N^{o(1)} · k^β · (1/ε)^γ, Lemma 5.4.3 yields that Π has a 2-message ε-IPP in which the verifier runs in N^{o(1)} time, in contradiction to Lemma 5.4.2.

5.4.2 MAP ⊄ ESZKPP, assuming Circuit Lower Bounds

We show that there exists a property Π ∈ MAP[poly(log(N), k, 1/ε)] such that, assuming certain circuit lower bounds, it holds that Π ∉ HVESZKPP[poly(log(N), k, 1/ε)]. Let t-DNF⊕ refer to depth-3 circuits whose output gate is an unbounded fan-in OR gate, whose intermediate level is composed of fan-in-t AND gates, and whose third layer is composed of (unbounded fan-in) parity gates. The size of a t-DNF⊕ circuit is the fan-in of its top gate. A randomized t-DNF⊕ simply refers to a distribution over t-DNF⊕ circuits. We say that a randomized t-DNF⊕ circuit C: {0,1}^k → {0,1} computes a function f if for every x ∈ {0,1}^k it holds that Pr[C(x) = f(x)] ≥ 2/3.

For any k ∈ N and strings x, y ∈ {0,1}^k, we define DISJ_k(x, y) = 1 if for every i ∈ [k] it holds that either x_i = 0 or y_i = 0, and DISJ_k(x, y) = 0 otherwise. The following conjecture states that small randomized DNF⊕ circuits cannot compute DISJ.^25
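DISJ and the t-DNF⊕ circuit model are easy to make concrete. The encoding below (terms as lists of (parity-indices, target-bit) pairs) is our own hypothetical representation, chosen only for illustration:

```python
def disj(x, y):
    """DISJ_k(x, y) = 1 iff x and y share no index i with x_i = y_i = 1."""
    return int(all(xi == 0 or yi == 0 for xi, yi in zip(x, y)))

def eval_dnf_xor(circuit, bits):
    """Evaluate a DNF-of-parities circuit: an OR of AND-terms, where each literal
    is a parity of input bits compared against a target bit."""
    for term in circuit:                       # top OR gate
        if all(sum(bits[i] for i in idxs) % 2 == tgt for idxs, tgt in term):
            return 1
    return 0

# Toy instance on k = 3: inputs are (x, y) concatenated into 6 bits.
x, y = [1, 0, 1], [0, 1, 0]
print(disj(x, y))  # 1: no common index

# A hand-built, illustrative 1-term circuit checking "x_1 = 1 and y_1 = 1"
# (singleton parities are just plain literals):
circuit = [[((1,), 1), ((4,), 1)]]
print(eval_dnf_xor(circuit, x + y))  # 0
```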

Conjecture 5.4.7. There exists a constant δ > 0 such that for every randomized t-DNF⊕ of size S that computes DISJ_k it holds that min(t, log(S)) = Ω(k^δ).

We remark that a randomized t-DNF⊕ circuit of size S yields an Arthur-Merlin communication protocol with complexity log(S) + 2t.^26 To the best of our knowledge, the Arthur-Merlin communication complexity of disjointness is believed to be Ω(k) (which would imply Conjecture 5.4.7 with δ = 1). We mention that proving any non-trivial Arthur-Merlin communication complexity lower bound is a notorious open problem.

^25 In contrast, note that there is a very simple CNF formula for computing DISJ.
^26 First, Alice and Bob choose a DNF⊕ circuit from the distribution and specify it to Merlin. Merlin then sends an index of which term in the circuit is satisfied. A single term is a fan-in t AND gate composed with parity gates. Alice and Bob can compute this term's value using 2t communication, by having them send to each other their respective contributions to each of the t parities.

Theorem 5.4.8. If Conjecture 5.4.7 holds, then

MAP[poly(log(N), k, 1/ε)] ⊄ HVESZKPP[poly(log(N), k, 1/ε)].

We begin with an outline of the proof. Our main tool is a binary linear error-correcting code C: {0,1}^k → {0,1}^n, with constant relative distance and almost-linear^27 blocklength, which is also locally testable and locally decodable. A code C: {0,1}^k → {0,1}^n is locally testable if there exists a procedure that makes only few queries to a word w ∈ {0,1}^n and determines with high probability whether it is a codeword (i.e., whether w = C(x) for some message x ∈ {0,1}^k) or far from the code (see Definition 5.4.9 for the formal definition). A code is locally decodable if there exists a procedure that takes as input an index i ∈ [k] and a word w ∈ {0,1}^n close to some codeword C(x), makes only few queries to w, and outputs x_i with high probability (see Definition 5.4.10 for the formal definition).

The property that we consider is the CODE INTERSECTION property (CIP). This property consists of pairs of codewords (C(x), C(y)), encoded under the foregoing code, such that DISJ(x, y) = 0 (i.e., x and y intersect). This problem was previously considered by Gur and Rothblum [GR18], who showed that it has a very efficient MAP (we re-prove this fact since we use a slightly different code). Indeed, it is easy to see that CIP has a very efficient MAP. Merlin simply sends to Arthur an index i on which x and y intersect. Arthur, using the local testability, will verify that the input is close to a pair of codewords, and then locally decodes x_i and y_i. Arthur accepts iff x_i = y_i = 1.

This proof of proximity, however, reveals a lot to Arthur (and in particular is not zero-knowledge). Specifically, Arthur learns the index of the intersection. As a matter of fact, this is not a coincidence. We show that, assuming that Conjecture 5.4.7 holds, the property CIP does not have any honest-verifier zero-knowledge IPP with poly-logarithmic complexity.

To see how we prove the lower bound, consider the promise problem CODE DISJOINTNESS (CDP), in which the YES instances are pairs of codewords (C(x), C(y)) such that DISJ(x, y) = 1, and NO instances are pairs of codewords (C(x), C(y)) such that DISJ(x, y) = 0. Note that NO instances of CDP are in the property CIP. Moreover, YES instances of CDP are δ(C)/2-far from CIP, where δ(C) is the relative distance of the code C.

Assume, toward a contradiction, that CIP has an honest-verifier statistical zero-knowledge IPP with poly-logarithmic complexity. We argue that this implies that the complement promise problem of CDP has a constant-round IPP. The latter fact basically follows from the fact that the ENTROPY DIFFERENCE PROBLEM (EDP) is complete for the class of promise problems having a statistical zero-knowledge proof, and is itself closed under complement. Thus, we have constructed an IPP which accepts inputs from CDP and rejects inputs from CIP. Using a result of Rothblum et al. [RVW13], we can derive from this IPP a quasi-polynomial size randomized DNF for the same promise problem. We further observe that since the code C is a linear code, we have obtained a circuit

^27 Note that we are using the term "linear" in two different ways. First, the code is a linear function of the message. Second, the length of the codeword is almost linear in the length of the message.

that computes the disjointness function on input (x, y) by first applying a linear transformation and then the aforementioned randomized DNF; in other words, a quasi-polynomial size DNF⊕ circuit. This contradicts Conjecture 5.4.7.

We proceed to the formal proof of Theorem 5.4.8. We begin with definitions and

notations. An error-correcting code is an injective function C: {0,1}^k → {0,1}^n. The

code C is said to have relative distance δ(C) if for any x ≠ x' ∈ {0,1}^k it holds that Δ(C(x), C(x')) ≥ δ(C).^28 Throughout this work we deal with (uniform) algorithms, and so we will need (families of) error-correcting codes. Formally, for parameters k = k(ℓ) ≥ 1 and n(ℓ) ≥ k(ℓ), we define an ensemble of error-correcting codes as an ensemble C = (C_ℓ: {0,1}^{k(ℓ)} → {0,1}^{n(ℓ)})_{ℓ∈N} of error-correcting codes. An ensemble of error-correcting codes C = (C_ℓ)_{ℓ∈N} is said to have relative distance δ(C) if, for all sufficiently large ℓ, each code C_ℓ in the ensemble has relative distance δ(C). We next formally define locally testable and locally decodable codes.

Definition 5.4.9 ((Strong) locally testable codes (c.f. [GS06])). Let t: N → N. An ensemble of error-correcting codes C = (C_ℓ: {0,1}^{k(ℓ)} → {0,1}^{n(ℓ)})_{ℓ∈N} is t-locally-testable if there exists a probabilistic algorithm (tester) T that, given explicit input ℓ and oracle access to w ∈ {0,1}^{n(ℓ)}, runs in time t(ℓ) and satisfies the following.

• Completeness: For every codeword c ∈ Im(C_ℓ) it holds that Pr[T^c(ℓ) = 1] = 1.

• Soundness: For every w ∈ {0,1}^{n(ℓ)}, it holds that

Pr[T^w(ℓ) = 0] ≥ Ω(Δ(w, Im(C_ℓ))),

where Δ(w, Im(C_ℓ)) is the relative distance of w from the code.^29
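A classic illustration of Definition 5.4.9 is the Hadamard code (truth tables of F_2-linear functions) with the three-query BLR linearity test; this is a toy stand-in for intuition only, not the code of Lemma 5.4.11.

```python
import random

def blr_local_test(w, m, rng, reps=20):
    """Three-query BLR linearity test for the code whose codewords are truth tables
    of F_2-linear functions on m bits. Accepts codewords with probability 1; rejects
    a word with probability proportional to its distance from the code."""
    for _ in range(reps):
        a = rng.randrange(2 ** m)
        b = rng.randrange(2 ** m)
        if w[a] ^ w[b] != w[a ^ b]:  # a ^ b is coordinate-wise addition over F_2
            return 0
    return 1

rng = random.Random(2)
m = 4
# Codeword: truth table of the linear map x -> <x, s> for s = 0b1011.
s = 0b1011
codeword = [bin(x & s).count("1") % 2 for x in range(2 ** m)]
print(blr_local_test(codeword, m, rng))  # 1: codewords always pass

corrupted = codeword[:]
for i in range(0, 2 ** m, 2):  # flip half the positions: far from the code
    corrupted[i] ^= 1
print(blr_local_test(corrupted, m, rng))  # 0: this corruption fails every check
```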

Definition 5.4.10 (Locally decodable codes (c.f. [KT00])). Let t: N → N. An ensemble of error-correcting codes C = (C_ℓ: {0,1}^{k(ℓ)} → {0,1}^{n(ℓ)})_{ℓ∈N} is t-locally-decodable if there exist a constant δ_radius ∈ (0, δ(C)/2) and a probabilistic algorithm (decoder) D that, given oracle access to w ∈ {0,1}^{n(ℓ)} and explicit inputs i ∈ [k(ℓ)] and ℓ ∈ N, runs in time t(ℓ) and satisfies the following.

• Completeness: For every i ∈ [k(ℓ)] and x ∈ {0,1}^{k(ℓ)}, it holds that

Pr[D^{C(x)}(i) = x_i] = 1.

• Soundness: For every i ∈ [k(ℓ)] and every w ∈ {0,1}^{n(ℓ)} with Δ(w, C(x)) ≤ δ_radius, it holds that Pr[D^w(i) = x_i] ≥ 2/3.^30

^28 Recall that the relative distance between y ∈ {0,1}^n and y' ∈ {0,1}^n is defined as Δ(y, y') = |{i : y_i ≠ y'_i}|/n.

^29 Recall that the relative distance of x ∈ {0,1}^n from a non-empty set S ⊆ {0,1}^n is defined as Δ(x, S) = min_{y∈S} Δ(x, y).

^30 Since δ_radius < δ(C)/2, the message x is unique.
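Local decodability (Definition 5.4.10) can likewise be illustrated with the Hadamard code, whose two-query decoder reads w(a) and w(a + e_i) for a random a; again, a toy stand-in for intuition, not the code of Lemma 5.4.11.

```python
import random

def hadamard_encode(x):
    """Truth table of a -> <a, x> over F_2; message length k, block length 2^k."""
    k = len(x)
    xw = sum(b << j for j, b in enumerate(x))
    return [bin(a & xw).count("1") % 2 for a in range(2 ** k)]

def local_decode(w, k, i, rng):
    """Two-query local decoder: x_i = w(a) + w(a + e_i) for random a.
    Correct w.p. >= 1 - 2*delta when w is delta-close to a codeword, delta < 1/4."""
    a = rng.randrange(2 ** k)
    return w[a] ^ w[a ^ (1 << i)]

rng = random.Random(3)
x = [1, 0, 1, 1]
w = hadamard_encode(x)
for pos in rng.sample(range(len(w)), len(w) // 8):  # corrupt a 1/8 fraction
    w[pos] ^= 1
# Majority over a few repetitions drives the error probability down to 2^{-Omega(reps)}:
votes = [local_decode(w, len(x), 2, rng) for _ in range(101)]
print(max(set(votes), key=votes.count))  # 1 (= x_2) with overwhelming probability
```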

We use the following well-known fact.

Lemma 5.4.11 (Existence of "good" codes). There exists an ensemble of binary linear codes C = (C_ℓ: {0,1}^{k(ℓ)} → {0,1}^{n(ℓ)})_{ℓ∈N}, for k(ℓ) = Θ(ℓ) and n(ℓ) ≤ k(ℓ)^{1.01}, whose relative distance is some constant δ > 0 and which is polylog(ℓ)-locally-testable and polylog(ℓ)-locally-decodable.

See Section 5.6.4 for a sketch of the construction (which is basically the concatenation of the low-degree extension code, over a field of poly-logarithmic size, with a good binary code). Using Lemma 5.4.11, we can now define the CODE INTERSECTION property.

Definition 5.4.12 (CODE INTERSECTION). Let C = (C_ℓ)_{ℓ∈N} be the code guaranteed to exist by Lemma 5.4.11. For ℓ ∈ N, let

CIP_ℓ = {(C_ℓ(x), C_ℓ(y)) : x, y ∈ {0,1}^{k(ℓ)} such that DISJ_{k(ℓ)}(x, y) = 0}.

We define the CODE INTERSECTION property as CIP = (CIP_ℓ, [2n(ℓ)], {0,1})_{ℓ∈N}.

The proof of Theorem 5.4.8 follows immediately from the next two lemmas, proven in Sections 5.4.2.1 and 5.4.2.2.

Lemma 5.4.13. CIP ∈ MAP[poly(log(N), k, 1/ε)].

Lemma 5.4.14. If Conjecture 5.4.7 holds, then CIP ∉ HVESZKPP[poly(log(N), k, 1/ε)].

5.4.2.1 Proving Lemma 5.4.13

Consider the protocol (P_CIP, V_CIP) from Fig. 5.4.1. Perfect completeness follows from the perfect completeness of the local testing and decoding procedures. We proceed to argue that soundness holds.

Fix ε > 0, a sufficiently large ℓ ∈ N and w = (w_1, w_2) ∈ {0,1}^{2n(ℓ)} such that w is ε-far from CIP_ℓ. We assume without loss of generality that ε ≤ δ_radius (otherwise "reset" ε to δ_radius). We consider two cases:

Δ(w_1, Im(C_ℓ)) ≥ ε/2 or Δ(w_2, Im(C_ℓ)) ≥ ε/2: Let j ∈ {1, 2} be such that Δ(w_j, Im(C_ℓ)) ≥ ε/2. By the soundness condition of the tester T, it holds that

Pr[V_CIP^w(ε, n(ℓ)) rejects] ≥ Pr[T^{w_j}(ℓ) = 0] ≥ Ω(Δ(w_j, Im(C_ℓ))) ≥ Ω(ε/2).

Δ(w_1, Im(C_ℓ)) < ε/2 and Δ(w_2, Im(C_ℓ)) < ε/2: Fix a cheating prover P*. Assume without loss of generality that P* is deterministic, and let i* be the index it sends to V_CIP in step 1. Let x, y ∈ {0,1}^{k(ℓ)} be such that Δ(w_1, C(x)) < ε/2 and Δ(w_2, C(y)) < ε/2 (such x and y are unique since ε ≤ δ_radius). Moreover, as w is ε-far from CIP_ℓ, it must be that either x_{i*} = 0 or y_{i*} = 0. Observe that if x_{i*} = 0, then by the soundness of the decoding procedure, with probability 2/3 the decoder will output 0, in which case our verifier rejects. The case that y_{i*} = 0 is analyzed similarly.

The Code Intersection Protocol (P_CIP, V_CIP)

Prover's Input: A pair of strings (w_1, w_2) ∈ {0,1}^{2n(ℓ)} and proximity parameter ε > 0.

Verifier's Input: ℓ, n(ℓ), ε and oracle access to (w_1, w_2).

Let C be the code ensemble from Lemma 5.4.11. Let T be the tester from Definition 5.4.9 with respect to C. Let δ_radius(C) and D be the decoding radius and decoder, respectively, from Definition 5.4.10 with respect to C.

1. P_CIP finds i ∈ [k(ℓ)] such that w = (C_ℓ(x), C_ℓ(y)) for some x, y ∈ {0,1}^{k(ℓ)} with x_i = y_i = 1. Sends i to V_CIP.

2. V_CIP acts as follows:

(a) Set ε = min{ε, 2δ_radius(C)}.

(b) Run T^{w_1}(ℓ) and T^{w_2}(ℓ) and reject if either of them rejects.

(c) Accept if D^{w_1}(i, ℓ) = D^{w_2}(i, ℓ) = 1, and otherwise reject.

Figure 5.4.1: The Code Intersection Protocol
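A minimal sketch of the verifier's side of Fig. 5.4.1, with the tester and decoder supplied as oracles (hypothetical interfaces mirroring Definitions 5.4.9 and 5.4.10). The 3-fold repetition code below is only a demo stand-in; it is not locally testable in the strong sense of Definition 5.4.9.

```python
import random

def cip_verifier(w1, w2, i, tester, decoder, k_sec, rng):
    """Sketch of V_CIP: locally test both halves, then locally decode bit i of each."""
    # Step 2(b): locally test that both halves are (close to) codewords.
    for w in (w1, w2):
        if not tester(w, rng):
            return False
    # Step 2(c), repeated to push the decoder's 1/3 error probability down:
    for w in (w1, w2):
        votes = [decoder(w, i, rng) for _ in range(10 * k_sec)]
        if sum(votes) <= len(votes) // 2:   # majority must decode the i-th bit to 1
            return False
    return True

# Toy stand-ins built from the 3-fold repetition code.
def rep_encode(x, r=3):
    return [b for b in x for _ in range(r)]

def rep_tester(w, rng, r=3, reps=20):
    for _ in range(reps):
        j = rng.randrange(len(w) // r)
        if len(set(w[r * j: r * j + r])) != 1:  # the three copies must agree
            return False
    return True

def rep_decoder(w, i, rng, r=3):
    return w[r * i + rng.randrange(r)]  # read a random copy of bit i

rng = random.Random(4)
x, y = [1, 0, 1, 0], [0, 0, 1, 1]   # x and y intersect at index 2
w1, w2 = rep_encode(x), rep_encode(y)
ok = cip_verifier(w1, w2, i=2, tester=rep_tester, decoder=rep_decoder, k_sec=3, rng=rng)
print(ok)  # True
```

With a wrong index (say i = 0, where y_0 = 0), the decoding step fails and the verifier rejects.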

Combining both cases, we have Pr[V_CIP^w(ε, n(ℓ)) rejects] ≥ min{Ω(ε/2), 1/3} = Ω(ε). So far we have shown that the code intersection protocol (Fig. 5.4.1) has perfect completeness and soundness error 1 − Ω(ε). To reduce the soundness error, it suffices to have the verifier repeat its check poly(k)/ε times.^31 As shown in [GR18], this reduces the soundness error to 2^{−k}, and so the resulting protocol is an ε-MAP. Finally, it is easy to verify that the resulting verifier runs in time poly(log(ℓ), k, 1/ε) which, since the input length (i.e., 2·n(ℓ)) is poly(ℓ), is poly(log(N), k, 1/ε).

5.4.2.2 Proof of Lemma 5.4.14

We prove the contrapositive. Assume that CIP ∈ HVESZKPP[poly(log(N), k, 1/ε)], and consider the CODE DISJOINTNESS promise problem.

Definition 5.4.15 (CODE DISJOINTNESS). For ℓ ∈ N, let

CDP_{YES,ℓ} = {(C_ℓ(x), C_ℓ(y)) : x, y ∈ {0, 1}^{k(ℓ)} such that DISJ_{k(ℓ)}(x, y) = 1},

CDP_{NO,ℓ} = {(C_ℓ(x), C_ℓ(y)) : x, y ∈ {0, 1}^{k(ℓ)} such that DISJ_{k(ℓ)}(x, y) = 0}.

Let CDP = (CDP_{YES,ℓ}, CDP_{NO,ℓ})_{ℓ∈N}.

^{31}Note that here k refers to the security parameter and not the message length k(ℓ).

Note that the input length here is N = 2·n(ℓ) = ℓ^{1+o(1)}. Hence, the query complexity and communication complexity of the IPP are poly(log(ℓ), k, 1/ε). We will use the zero-knowledge proof of proximity for CIP to design a randomized DNF⊕ circuit that solves disjointness. Note that by definition CDP_{NO,ℓ} = CIP_ℓ. Observe that every string w ∈ CDP_{YES,ℓ} is δ(C)/2-far from CIP_ℓ. Thus, an (honest-verifier) statistical zero-knowledge IPP for CIP immediately yields an (honest-verifier) statistical zero-knowledge proof for the complement of CDP. Recall that in Section 5.4.1 we used the fact that ENTROPY DIFFERENCE (EDP) is complete for the class SZK. Here, we will use this fact again, together with an easy reduction from EDP to its complement, to show the following claim, proven in Section 5.6.3.

Claim 5.4.1. The promise problem CDP has an interactive proof system with the following properties.

1. The verifier gets ℓ as explicit input and oracle access to w ∈ CDP_{YES,ℓ} ∪ CDP_{NO,ℓ}.

2. The completeness and soundness errors are both 1/3.

3. The verifier's running time is poly(log(ℓ)).

4. The parties exchange a constant number of messages.

Using the Goldwasser-Sipser [GS89] transformation from private-coin to public-coin interactive proofs and the Babai-Moran [BM88] round reduction (see [RVW13, Section 4] for more details), we obtain a 2-message Arthur-Merlin interactive proof in which the verifier runs in time polylog(ℓ). Applying an additional transformation from such proof-systems to randomized DNFs due to [RVW13] (see also [GR17]), we obtain the following:

Claim 5.4.2 (Based on [RVW13, Section 4]). There exists a randomized polylog(ℓ)-DNF of size 2^{polylog(ℓ)} that computes (the promise problem) CDP for inputs of size 2·n(ℓ).

By observing that every w ∈ CDP_{YES,ℓ} ∪ CDP_{NO,ℓ} is composed of two codewords, and that the code is a binary linear error-correcting code, Claim 5.4.2 implies that there exists a randomized polylog(ℓ)-DNF⊕ circuit of size 2^{polylog(ℓ)} that computes DISJ_{k(ℓ)}. This contradicts Conjecture 5.4.7 and concludes the proof of Lemma 5.4.14.

Remark 5.4.16 (Relaxed local decoders and the [GGK15] code). We remark that for our result, as in [GR18], it suffices to use relaxed local decoders (as put forth in [BGH+06]). Loosely speaking, relaxed local decoding allows the decoder to refuse to decode if it notices that the word is corrupt. Given that, it is tempting to ask why we did not use the locally testable and (relaxed) decodable codes of Goldreich et al. [GGK15]. Indeed, their codes have constant query complexity, whereas the code that we used requires poly-logarithmic query complexity. The only reason that we do not use the [GGK15] code is that the computational complexity of this code was not analyzed in [GGK15].

5.5 Computational ZK Proofs and Statistical ZK Arguments of Proximity

In this section we show that, under reasonable cryptographic assumptions (specifically, the existence of one-way functions or collision-resistant hash functions), a large class of IPPs and arguments of proximity^{32} can be transformed to be zero-knowledge. As a consequence, using the results of [RVW13, RRR16, Kil92, BGH+06, DR06], we obtain computational ZK proofs of proximity for small-depth and for small-space computations, and statistical ZK arguments of proximity for all of NP. Our transformation should be contrasted with an analogous transformation of Ben-Or et al. [BGG+88] for classical public-coin interactive proofs (and arguments). Indeed, our transformation is based on the main idea of [BGG+88]. However, in contrast to their result, our transformation does not apply to arbitrary public-coin IPPs. Rather, it applies only to IPPs in which the queries that the verifier makes do not depend on messages sent by the prover. We say that such IPPs make prover-oblivious queries.

Definition 5.5.1. We say that an IPP makes prover-oblivious queries if the input locations that the verifier queries are fully determined by its random coin tosses and the answers to previous queries that it made. That is, the queries do not depend on messages sent by the prover.

Thus, an IPP with prover-oblivious queries can be thought of as a two-step process. In the first step the verifier may query its input but is not allowed to interact with the prover. In the second step, the parties may interact but the verifier is no longer allowed to query the input. Interestingly (and crucially for our purpose), the general-purpose IPPs and arguments of proximity in the literature are indeed public-coin and make only prover-oblivious queries. Using this fact, together with our transformation, we obtain general-purpose ZK proofs of proximity. Our main transformation is summarized in the following two theorems.
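The two-step structure can be sketched in Python. The interfaces below (input_oracle, prover, decide) are hypothetical placeholders; the point is only that the query phase never sees a prover message, while the interaction phase never touches the input.

```python
import random

# A minimal sketch of an IPP verifier with prover-oblivious queries,
# split into the two steps described above.

def decide(answers, transcript):
    # Placeholder decision rule (hypothetical).
    return all(a == 1 for _, a in answers)

def prover_oblivious_verifier(input_oracle, prover, n, num_queries, rounds):
    rho = random.Random(1234)            # the verifier's coin tosses

    # Step 1: queries determined by coins (and earlier answers) only.
    answers = []
    for _ in range(num_queries):
        loc = rho.randrange(n)           # independent of prover messages
        answers.append((loc, input_oracle(loc)))

    # Step 2: public-coin interaction; the input is no longer queried.
    transcript = []
    for _ in range(rounds):
        coins = rho.getrandbits(32)      # public-coin: send raw coins
        transcript.append((coins, prover(coins)))

    return decide(answers, transcript)
```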

Theorem 5.5.2 (IPPs → Computational ZK). Assume that one-way functions exist. Suppose that the language L has an ℓ-message public-coin IPP with prover-oblivious queries where the verifier runs in time t_V = t_V(N, k, ε) and the (honest) prover runs in time t_P = t_P(N, k, ε). Then, L has an (ℓ + poly(k))-message computational ZKPP in which the prover runs in time t'_P(N, k, ε) = (t_P(N, k, ε) + poly(t_V(N, k, ε)))·poly(k) and the verifier runs in time t'_V(N, k, ε) = t_V(N, k, ε)·poly(k). The simulation overhead (see the discussion in Section 5.2) is s(t_Ṽ, N, k, ε) = t_Ṽ·poly(k), for cheating verifiers that run in time t_Ṽ = t_Ṽ(N, k, ε).

"An argument of proximity is similar to an IPP except that the soundness condition is further relaxed and required to hold only for polynomial-time cheating provers. See [KR15] for details and a formal definition. "Our notion of prover-oblivious queries extends the notion of proof-oblivious queries studied by Gur and Rothblum [GRi8] in the context of MAPs (i.e., non-interactive proofs of proximity).

Theorem 5.5.3 (Arguments of Proximity → Statistical ZK Arguments). Assume that one-way functions exist. Suppose that the language L has an ℓ-message public-coin interactive argument of proximity with prover-oblivious queries where the verifier runs in time t_V = t_V(N, k, ε) and the (honest) prover runs in time t_P = t_P(N, k, ε). Then, L has an (ℓ + poly(k))-message statistical zero-knowledge argument of proximity in which the prover runs in time t'_P(N, k, ε) = (t_P(N, k, ε) + poly(t_V(N, k, ε)))·poly(k) and the verifier runs in time t'_V(N, k, ε) = t_V(N, k, ε)·poly(k). Furthermore, if there exist collision-resistant hash functions, then the round complexity of the foregoing argument-system can be reduced to (ℓ + O(1)). We refer the reader to Section 5.1.3.3, where an intuitive explanation is given for the proofs of Theorems 5.5.2 and 5.5.3. We will need the following result from [IKOS09], which gives a general-purpose zero-knowledge protocol for NP:^{34}

Lemma 5.5.4 ([IKOS09]). Let L ∈ NP with witness relation R(·,·) that is computable in time t. If there exist one-way functions, then L has a computational zero-knowledge proof in which the verifier runs in time Õ(t)·poly(k) and the prover runs in time poly(N, k). For every (malicious) verifier running in time T, the simulator runs in time (T + Õ(t))·poly(k). The number of rounds is poly(k).

Actually, since the running times are not specified in [IKOS09], we give an overview of the construction in Section 5.6.5. We proceed to give a proof sketch of Theorem 5.5.2 and note that Theorem 5.5.3 is proved similarly (using statistically hiding commitments).^{35}

Proof Sketch of Theorem 5.5.2. The existence of one-way functions implies the existence of the following cryptographic protocols that we will use:

* A computationally hiding and statistically binding commitment scheme [Nao91, HILL99]. Moreover, after one initial set-up message from the receiver to the sender (where this setup can be re-used for a polynomial number of commitments), the commitment scheme is non-interactive: the sender only needs to send a single message to the receiver. (This commitment scheme will be used to derive the first part of Theorem 5.5.2.)

* Computational zero-knowledge proofs for any language in NP in which the verifier runs in time that is almost linear in the complexity of the witness relation (see Lemma 5.5.4).

^{34}We use the [IKOS09] protocol and not the classical [GMW91] protocol for the following reason: the verifier in the [IKOS09] protocol runs in time that is linear in the complexity t of the NP verification process. In contrast, the [GMW91] verifier runs in time poly(t). The distinction is important for us since in Corollaries 5.5.8 and 5.5.9 below, we will apply Theorem 5.5.2 to a statement that can be verified in roughly √N time, and so we cannot afford a polynomial overhead. ^{35}Note that statistically hiding commitments can be based on any one-way function [HNO+09]. For the furthermore part, we need to use constant-round statistically hiding commitments and constant-round statistical zero-knowledge arguments for NP. Both are known to exist assuming collision-resistant hash functions [NY89, BCY91].

Let (P, V) be an ℓ-round public-coin IPP for L with prover-oblivious queries. We describe the construction of a computational zero-knowledge proof of proximity for L. As alluded to above, the construction of a statistical zero-knowledge argument of proximity is similar, except that we replace the computationally hiding and statistically binding commitment with one that is statistically hiding and computationally binding, and replace the computational zero-knowledge proof for NP with a statistical zero-knowledge argument.

We proceed to describe the computational ZKPP (P', V') for L, on input x of length N, security parameter k and proximity parameter ε. First, (P', V') run the setup for the commitment scheme. After this initial step, the interaction consists of two main parts. In the first part, P' and V' emulate the interaction between P and V, where P' only commits to the messages that P would have sent. Since the protocol (P, V) is public-coin, the verifier V' can continue the interaction without actually knowing the contents of the messages that it receives (since V' only needs to sample and send random coin tosses).
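A minimal sketch of this first part. The hash-based commitment below is a hypothetical stand-in for the statistically binding scheme of [Nao91, HILL99]; all names are illustrative.

```python
import hashlib, os

# Sketch of the commit-phase emulation in (P', V'): the prover commits to
# each message that P would have sent, while V' only sends fresh public
# coins. A salted hash stands in for the real commitment scheme.

def commit(msg: bytes):
    opening = os.urandom(16)
    return hashlib.sha256(opening + msg).hexdigest(), opening

def reveal(com, opening, msg):
    return com == hashlib.sha256(opening + msg).hexdigest()

def emulate_commit_phase(prover_next_message, rounds):
    coins, commitments, openings = [], [], []
    for _ in range(rounds):
        rho = os.urandom(4)                  # V' samples public coins
        alpha = prover_next_message(rho)     # message P would have sent
        c, d = commit(alpha)                 # P' sends only the commitment
        coins.append(rho)
        commitments.append(c)
        openings.append((d, alpha))          # kept by P' for the final proof
    return coins, commitments, openings

coins, coms, opens_ = emulate_commit_phase(lambda r: b"msg:" + r, 3)
assert all(reveal(c, d, a) for c, (d, a) in zip(coms, opens_))
```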

Then, in the second part, V' has already obtained commitments c_1, ..., c_ℓ to some messages α_1, ..., α_ℓ that P would have sent. At this point we would like P' to prove the statement:

∃ d_1, ..., d_ℓ, α_1, ..., α_ℓ : d_i is a decommitment of c_i w.r.t. message α_i, for every i ∈ [ℓ], and V^x((N, ε, k), (α_1, β_1, ..., α_ℓ, β_ℓ)) = 1    (5.14)

The statement in Eq. (5.14) is almost, but not quite, an NP statement. The reason that we would like to phrase it as an NP statement is that by Lemma 5.5.4 (and using our assumption that one-way functions exist), there exist very efficient (computational) zero-knowledge proofs for any language in NP. Thus, we would like P' to prove Eq. (5.14) to V' using such a general-purpose zero-knowledge proof-system.

The problem that we encounter is that Eq. (5.14) is not precisely an NP statement, since it refers to oracle access to a given string x.^{36} To overcome this problem, we use our assumption that V makes prover-oblivious queries. Hence, the queries that V makes depend only on its own random coin tosses (and answers to previous queries that it has made), but not on the messages sent by P. Denote by Q(x; ρ) the sequence of (possibly adaptive) queries that V makes on input x and random string ρ. Since Q(x; ρ) depends only on the randomness (and, possibly, on answers to previous queries to x), the verifier V' can sample ρ at random and generate this set. We can now restate Eq. (5.14) as:

"As a matter of fact Eq. (5.14) can be expressed as an "non-interactive proof of proximity" or MAP [GR18].

∃ d_1, ..., d_ℓ, α_1, ..., α_ℓ : d_i is a decommitment of c_i w.r.t. message α_i, for every i ∈ [ℓ], and V^{Q(x;ρ)}((N, ε, k), (α_1, β_1, ..., α_ℓ, β_ℓ)) = 1,    (5.15)

which is in fact an NP relation, for which P' has a witness. Therefore, using our assumption that one-way functions exist, there exists a computational zero-knowledge proof for Eq. (5.15). P' and V' engage in this proof-system and V' accepts or rejects accordingly. (To actually run this proof-system, V' first shares ρ with P', but it does so only after they have completed the emulation of (P, V).) Completeness of (P', V') follows from the completeness of (P, V) and the (perfect) completeness of the [IKOS09] zero-knowledge proof. The analysis of soundness and zero-knowledge is standard and we omit it from this preliminary version. We proceed to analyze the efficiency of the proof-system. We consider the three phases of interaction separately:

" Setup Phase: First, the two parties set up the commitment scheme this step is done using a single round of communication and with complexity poly(k) for both parties.

" Commitment Phase: Each bit that P sends to V in the original protocol is emulated by a (non-interactive) commitment (with a poly(k) overhead). Mes- sages sent from V to P are unchanged (recall that these refer to random coin tosses). Thus, the round complexity of this part is f and there is a poly(k) overhead to the running time of both parties.

* Final Phase: V' first sends the random string used by the underlying V. This introduces a t_V(N, k, ε) overhead for both parties. Then, both parties run the [IKOS09] protocol on an NP statement that can be verified in time t = t_V(N, k, ε)·poly(k). The [IKOS09] verifier runs in time Õ(t) = t_V(N, k, ε)·poly(k), whereas the [IKOS09] prover runs in time poly(t) = poly(t_V(N, k, ε), k). The number of rounds is poly(k).

To obtain our ZKPP results, we will combine Theorem 5.5.2 with known results from the literature. Specifically, we will use the following results (where throughout, N denotes the input length, k the security parameter, and ε the proximity parameter).

Theorem 5.5.5 ([RVW13]). Every language in logspace-uniform NC has a polylog(N)-round public-coin ε-IPP, for ε = N^{-1/2}, with perfect completeness and soundness error 1/2. The verifier runs in time N^{1/2+o(1)} and the (honest) prover runs in time poly(N). Furthermore, the verifier makes prover-oblivious queries.

Theorem 5.5.6 ([RRR16]). Let L be a language that is computable in poly(N)-time and O(N^σ)-space, for some sufficiently small constant σ > 0. Then L has a constant-round public-coin ε-IPP for ε = N^{-1/2}, with perfect completeness and soundness error 1/2. The verifier runs in time N^{1/2+O(σ)} and the (honest) prover runs in time poly(N). Furthermore, the verifier makes prover-oblivious queries.

Theorem 5.5.7 ([Kil92, BGH+06, DR06]). Assume that there exist collision-resistant hash functions. Then, every language in NP has a 4-message public-coin argument of ε-proximity with perfect completeness and soundness error 1/2 (for any value of ε > 0). The verifier runs in time poly(log(N), k, 1/ε) and the prover runs in time poly(N, k). Furthermore, the verifier makes prover-oblivious queries.

We remark that the fact that the verifier makes prover-oblivious queries is not stated explicitly in the above works but can be verified by inspection.^{37} Combining Theorems 5.5.5 and 5.5.6 with Theorem 5.5.2, and Theorem 5.5.7 with Theorem 5.5.3, we derive the following corollaries:

Corollary 5.5.8 (Computational ZKPP for Bounded Depth). Assume that there exist one-way functions. Let L be a language in logspace-uniform NC. Then, L has a (polylog(N) + poly(k))-round computational zero-knowledge proof of ε-proximity, for ε = N^{-1/2}. The verifier runs in time N^{1/2+o(1)}·poly(k) and the (honest) prover runs in time poly(N, k). The simulation overhead is s(t_Ṽ, N, k, ε) = t_Ṽ·poly(k), for (malicious) verifiers running in time t_Ṽ = t_Ṽ(N, k, ε).

Corollary 5.5.9 (Computational ZKPP for Bounded Space). Assume that there exist one-way functions. Let L be a language that is computable in poly(N)-time and O(N^σ)-space, for some sufficiently small constant σ > 0. Then, L has a poly(k)-message computational zero-knowledge proof of ε-proximity, for ε = N^{-1/2}. The verifier runs in time N^{1/2+O(σ)}·poly(k) and the (honest) prover runs in time poly(N, k). The simulation overhead is s(t_Ṽ, N, k, ε) = t_Ṽ·poly(k), for (malicious) verifiers running in time t_Ṽ = t_Ṽ(N, k, ε).

Corollary 5.5.10 (Statistical Zero-Knowledge Arguments). Assume that there exist collision-resistant hash functions. Then, every language in NP has a constant-round statistical zero-knowledge argument of ε-proximity, for every value of ε > 0. The verifier runs in time poly(log(N), k, 1/ε) and the (honest) prover runs in time poly(N, k).

5.6 Deferred Proofs

In this section we prove statements made in this chapter that have yet to be proven. A few of those statements rely on the following reduction of HVSZKPP to the ENTROPY DIFFERENCE PROBLEM.

^{37}Theorem 5.5.7 is obtained by applying Kilian's [Kil92] protocol to a PCP of proximity (cf. [BGH+06, DR06]). See further discussion in [RVW13, KR15]. We remark that for the resulting argument of proximity to have prover-oblivious queries, we need to use a PCP of proximity whose queries are non-adaptive in the proof. Such general-purpose PCPs of proximity were constructed in [BGH+06, DR06].

5.6.1 Reducing HVSZKPP to the ENTROPY DIFFERENCE PROBLEM

In this section we show how to reduce any property with an honest-verifier statistical zero-knowledge IPP to an instance of the ENTROPY DIFFERENCE PROBLEM.

Lemma 5.6.1. Suppose that a property Π has an honest-verifier statistical zero-knowledge ε-IPP such that for every input length N and security parameter k ∈ N the simulator's expected running time is bounded by t_S(ε, N, k) = t'_S(ε, N)·poly(k), and for every ε the function t'_S(ε, ·) is monotone non-decreasing. Then, there is a reduction from Π to EDP. Specifically, the reduction is given ε and an input length N, and outputs two oracle-aided circuits C_0, C_1: {0, 1}^m → {0, 1}^n such that the following holds.

1. (C_0^f, C_1^f) is an instance of EDP:

f ∈ Π ⟹ H(C_0^f) ≥ H(C_1^f) + 1,
f is ε-far from Π ⟹ H(C_1^f) ≥ H(C_0^f) + 1.

2. The reduction's running time is poly(t_S(ε, N, poly(t'_S(ε, N)))).

Note that the last item implies that for every x ∈ {0, 1}^m and b ∈ {0, 1}, computing C_b^f(x) requires only poly(t_S(ε, N, poly(t'_S(ε, N)))) many oracle calls. The proof of the above lemma follows from the proof that EDP is SZK-hard from [Vad99]. We only give sufficient details to demonstrate how to apply that proof to our setting.
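To illustrate what an EDP instance looks like, the following toy example (with hypothetical distributions unrelated to the actual construction below) empirically compares the entropies of two samplable distributions whose entropy gap is exactly 1 bit:

```python
import math, random
from collections import Counter

# A toy EDP instance: C0 samples 3 uniform bits (entropy 3), while C1
# samples 2 uniform bits padded with a zero (entropy 2), so
# H(C0) >= H(C1) + 1. Entropies are estimated from samples.

def entropy_bits(samples):
    # Empirical Shannon entropy (in bits) of a list of outcomes.
    counts = Counter(samples)
    n = len(samples)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

rng = random.Random(0)
C0 = lambda: tuple(rng.randrange(2) for _ in range(3))          # H = 3
C1 = lambda: tuple(rng.randrange(2) for _ in range(2)) + (0,)   # H = 2

n = 20000
h0 = entropy_bits([C0() for _ in range(n)])
h1 = entropy_bits([C1() for _ in range(n)])
assert h0 > h1 + 0.9   # empirical gap close to the promised 1 bit
```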

Proof sketch. Assume that (P, V) is the ε-IPP for Π and let S be the honest-verifier simulator for (P, V), whose simulation deviation is μ(k) = negl(k). We assume for simplicity that t_S(ε, N, k) is a strict bound (and not only an expected one) on the running time of S. The proof can be extended to handle expected bounds as well (in fact, the proof in [Vad99] handles even weaker simulators). Assume without loss of generality that P and V send their messages in turns: P sends the odd messages and V the even ones. Let v(ε, N, k) be a bound on the number of messages sent by V to P, for every f. In addition, let c(ε, N, k) and r(ε, N, k) be bounds on the total communication (measured in bits) between P and V and on the number of random bits accessed by the verifier, respectively. We now modify the proof system so that V sends its random coins to P in an additional message just before the end of the protocol. The total communication and the number of messages sent from V to P now increase to c'(ε, N, k) = c(ε, N, k) + r(ε, N, k) and v'(ε, N, k) = v(ε, N, k) + 1, respectively. S is modified to simulate the additional last message as well, without increasing its simulation deviation (this is possible since S was supposed to simulate V's random coins anyway). Fix ε and N and let k' ∈ N be such that μ(k') ≤ min{1/(v'(ε, N, k')·c'(ε, N, k')), (1/4)·2^{-40}} and the completeness and soundness errors of (P, V) are at most 2^{-40}. Note that it suffices to take k' = poly(t'_S(ε, N)) (i.e., a fixed polynomial for all ε and N): It holds that v'(ε, N, k')·c'(ε, N, k') ≤ t_S(ε, N, k') = t'_S(ε, N)·poly(k'). Thus, we can take k' such that μ'(k') ≤ 1/t'_S(ε, N), for some negligible function μ'(k') = μ(k')·poly(k').

Since t'_S(ε, N) is monotone non-decreasing in N, taking k' = poly(t'_S(ε, N)) guarantees the required condition for all large enough (depending on μ) N (for simplicity, we ignore shorter inputs, which can be solved by the verifier via brute force). Finally, let S_i^f be the random variable distributed according to the first i messages in the output of a random execution of S^f(ε, N, k'). In the following we drop ε, N and k' from the notation.

Constructing C_0 and C_1. Define X = S_2^f ⊗ S_4^f ⊗ ... ⊗ S_{2v'}^f.^{38} Similarly, define Y_1 to be Y_1 = S_1^f ⊗ S_3^f ⊗ ... ⊗ S_{2v'-1}^f, and define Y_2 to be the uniform distribution on r - 7 bits. Furthermore, define Y_3 as follows: run S^f 8·ln(c'v' + 2) times independently; if the verifier rejects in the majority of the transcripts obtained, output c'v' + 2 random bits; otherwise, output the empty string. Define Y = Y_1 ⊗ Y_2 ⊗ Y_3. Finally, the circuits C_0 and C_1 take as input random coins to sample and output x ~ X and y ~ Y, respectively. Since we require that the input (respectively, output) lengths of C_0 and C_1 be equal, we pad the shorter input (respectively, output) with redundant random coins (respectively, zeros).^{39}
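The length-equalizing padding (detailed in footnote 39) can be sketched as follows; the two samplers are hypothetical toys, and the point is only the mechanics of padding inputs with ignored coins and outputs with zeros.

```python
import random

# Toy illustration of equalizing the input and output lengths of two
# sampling circuits: the shorter input is padded with redundant random
# coins (which are simply ignored) and the shorter output with zeros.

def pad_sampler(sampler, m, n, m_max, n_max):
    # sampler maps m coin bits to an n-bit output.
    def padded(coins):                   # coins has length m_max
        out = sampler(coins[:m])         # extra coins are ignored
        return out + [0] * (n_max - n)   # pad output with zeros
    return padded

sample_x = lambda coins: coins[:3]             # stand-in for C_0: m=3, n=3
sample_y = lambda coins: coins[:2] + [1] * 4   # stand-in for C_1: m=2, n=6
m_max, n_max = 3, 6
C0 = pad_sampler(sample_x, 3, 3, m_max, n_max)
C1 = pad_sampler(sample_y, 2, 6, m_max, n_max)

coins = [random.randrange(2) for _ in range(m_max)]
assert len(C0(coins)) == len(C1(coins)) == n_max
```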

Analysis. That (C_0, C_1) is an instance of EDP (Item 1) follows from [Vad99, Claims 3.3.14 and 3.3.15]. The reduction's running time (Item 2) follows from the constructions of C_0 and C_1.

5.6.2 Proving Lemma 5.4.3

The proof of Lemma 5.4.3, a sketch of which is given below, immediately follows from Lemmas 2.3.14 and 5.6.1.

Proof sketch of Lemma 5.4.3. Both the verifier and the prover run the reduction from Lemma 5.6.1 to get two distributions encoded by oracle-aided circuits (C_0, C_1). They then run the protocol from Lemma 2.3.14 with respect to these distributions. Since the latter is an oracle-access protocol, the verifier can indeed run it using only oracle access to its input f. The running time of the reduction implies that the input and output sizes of C_0 and C_1 are poly(t_S(ε, N, poly(t'_S(ε, N)))). By Lemma 2.3.14, the running time of the verifier is thus poly(t_S(ε, N, poly(t'_S(ε, N))), k), as required. Zero-knowledge follows from arguments similar to the ones made above. ∎

5.6.3 Proving Claim 5.4.1

The proof of Claim 5.4.1 follows similar lines to that of Lemma 5.4.3.

^{38}Recall that P ⊗ Q stands for the product distribution of P and Q. ^{39}Let m_X and n_X denote the input and output lengths of X, respectively. Let m_Y and n_Y be similarly defined. For example, if m_X < m_Y and n_X < n_Y, we can modify X as follows: sample x ~ X using part of the given m_Y random bits and output x0^{n_Y - n_X}.

Proof sketch of Claim 5.4.1. Both the verifier and the prover run the reduction from Lemma 5.6.1 with respect to the property CIP and proximity parameter δ(C)/2 to get two distributions encoded by oracle-aided circuits (C_0, C_1). If w ∈ CDP_{YES,ℓ}, then w is δ(C)/2-far from CIP and thus H(C_1^w) ≥ H(C_0^w) + 1. However, if w ∈ CDP_{NO,ℓ}, then w ∈ CIP and thus H(C_0^w) ≥ H(C_1^w) + 1. The verifier and the prover then run the protocol from Lemma 2.3.14 with respect to the instance (C_1^w, C_0^w) (note that the order of the circuits has changed) and a security parameter k chosen such that the completeness and soundness errors are both 1/3 for large enough ℓ. Since the latter is an oracle-access protocol, the verifier can indeed run it using only oracle access to its input w. Recall that we assumed that CIP ∈ HVSZKPP[poly(log(N), k, 1/ε)]. Hence, the simulator for CIP runs in time poly(log(ℓ), k, 1/ε), which by the choice of parameters is simply poly(log ℓ) (recall that the proximity parameter δ(C)/2 and the security parameter k are constant). The running time of the reduction implies that the input (i.e., the number of bits needed to sample from the distribution) and output sizes of C_0 and C_1 are poly(log ℓ) as well, and by Lemma 2.3.14 the running time of the verifier is thus poly(log(ℓ)), as required. ∎

5.6.4 Proof Sketch of Lemma 5.4.11

We start with a low-degree extension code over a finite field F, which views messages x ∈ F^ℓ as functions x : H^m → F, where H ⊆ F is a subset and m is a dimension such that |H|^m = ℓ. The code maps x to its low-degree extension: namely, the unique individual-degree-(|H| - 1) polynomial that agrees with x on H^m. By the Schwartz-Zippel lemma this code has relative distance 1 - m·(|H| - 1)/|F|. Furthermore, this code is known to be locally testable [RS96] and decodable [GS92, Sud95] using O(|H|·m) queries. We set |H| = (log(ℓ))^c, m = log(ℓ)/(c·log log(ℓ)) and |F| = O(m·|H|) for a sufficiently large constant c > 1. Furthermore, we use a field F that is an extension field of the binary field F_2. We then concatenate the above low-degree extension code with a good binary linear code. The overall resulting code has message length k(ℓ) = |H|^m·log(|F|) = Õ(ℓ), blocklength n(ℓ) = O(|F|^m·log(|F|)) = Õ(ℓ^{1+1/c}), constant relative distance, and is locally testable and decodable with O(|H|·m) = polylog(ℓ) queries, which meets our desired parameters by setting c to be sufficiently large. Furthermore, since the low-degree extension is linear over the large field F, which is an extension field of F_2, it is also linear over F_2, and therefore the resulting code is also F_2-linear.
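The low-degree extension step can be illustrated concretely. The parameters below are hypothetical toys (|H| = 2, m = 2, and the prime field F_11 instead of an extension of F_2), chosen only so the code runs in a few lines:

```python
import itertools

# Toy low-degree extension: extend a message x : H^m -> F_P to the unique
# polynomial of individual degree |H| - 1 over F_P^m, via tensored
# Lagrange interpolation.
P = 11            # field modulus (prime, for easy inverses)
H = [0, 1]        # evaluation subset H
m = 2             # dimension; message length is |H|^m = 4

def lagrange_basis(h, z):
    # L_h(z) = prod_{h' in H, h' != h} (z - h') / (h - h') over F_P.
    num, den = 1, 1
    for hp in H:
        if hp != h:
            num = num * (z - hp) % P
            den = den * (h - hp) % P
    return num * pow(den, P - 2, P) % P   # Fermat inverse of den

def lde(msg, point):
    # Evaluate the extension of msg (indexed by H^m in product order)
    # at an arbitrary point in F_P^m.
    total = 0
    for idx, cell in enumerate(itertools.product(H, repeat=m)):
        weight = 1
        for coord, h in zip(point, cell):
            weight = weight * lagrange_basis(h, coord) % P
        total = (total + msg[idx] * weight) % P
    return total

msg = [3, 1, 4, 1]   # a message x : H^2 -> F_11
# The extension agrees with the message on H^m:
for idx, cell in enumerate(itertools.product(H, repeat=m)):
    assert lde(msg, cell) == msg[idx]
```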

5.6.5 Proof Sketch of Lemma 5.5.4

We use the [IKOS09] "MPC in the head" construction. More specifically, we will use their simplest variant, which is based on the [GMW87] 3-party protocol in the OT-hybrid model, with security against 2 (semi-honest) players.^{40} Below we refer

^{40}Indeed, note that the [IKOS09] approach transforms semi-honest secure MPC protocols into proof-systems that are zero-knowledge with respect to malicious verifiers.

to this as the IKOS protocol. We first recall that the [GMW87] protocol, with 2-out-of-3 semi-honest security, can be implemented so that the parties, and the (semi-honest) simulator, run in time Õ(t') (where we count OT calls at unit cost), where t' is the circuit complexity of the function.

The IKOS protocol works in k sequential phases (in order to obtain soundness error 2^{-k}), where each phase works as follows. The prover first runs the [GMW87] protocol with respect to the function f(x, w_1, w_2, w_3) = R(x, w_1 ⊕ w_2 ⊕ w_3), where w_1, w_2, w_3 are an additive secret sharing of the witness w. Observe that f is computable by a circuit of size t' = Õ(t), where t is the complexity of the NP relation R (an extra log factor comes from emulating Turing machines by circuits). Thus, the parties and the MPC simulator run in time Õ(t). After running the MPC protocol (in its "head"), the IKOS prover commits to the views of all the players.^{41} Then, the verifier chooses two (distinct) players i, j ∈ {1, 2, 3} at random, and sends i and j to the prover. The prover decommits to these players' views. The verifier rejects if the decommitments are invalid, the views are inconsistent, or if the result of the computation is not 1. Otherwise it accepts. For the analysis of soundness and zero-knowledge of the IKOS protocol see [IKOS09]. Here we focus on the running times of the verifier and the simulator. Observe that the verifier's running time in each phase is Õ(t)·poly(k), as required. We proceed to describe the IKOS simulator. Fix a malicious verifier Ṽ. The simulator also runs for k phases. In each phase it repeats the following procedure at most poly(k) times (and aborts if all attempts fail):

1. Select at random a pair of (distinct) players i, j ∈ {1, 2, 3}, and run the [GMW87] simulation on them (with respect to random strings w_i and w_j).

2. The simulator generates commitments to the simulated views of these players, as well as a fake view (e.g., all zeros) for the third player. The simulator "sends" these commitments to the verifier Ṽ.

3. Ṽ responds with a pair of distinct indices i', j' ∈ {1, 2, 3} (otherwise, since the IKOS prover would abort, the simulator can output the view generated so far followed by a ⊥ symbol).

4. If (i', j') is not the same as (i, j), then continue the loop.

5. Otherwise, the simulator sends the decommitments to Ṽ and they continue to the next phase.

Overall the simulation of a single phase takes (Õ(t) + T)·poly(k) time, where T is the running time of Ṽ. See [IKOS09] for additional details.
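A toy sketch of a single prover-side phase follows. The simplifications are hypothetical: XOR secret sharing plus a local recomputation stands in for the real [GMW87] 3-party computation, and a salted hash stands in for the statistically binding commitment scheme.

```python
import hashlib, os, random

# Toy sketch of one IKOS phase: share the witness, "run" the MPC,
# commit to all views, and let the verifier open two of them.

def commit(view: bytes):
    r = os.urandom(16)
    return hashlib.sha256(r + view).hexdigest(), r

def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def prover_phase(witness: bytes, relation):
    # Additive (XOR) secret sharing: w = w1 xor w2 xor w3.
    w1, w2 = os.urandom(len(witness)), os.urandom(len(witness))
    w3 = xor_bytes(xor_bytes(witness, w1), w2)
    views = [w1, w2, w3]                      # stand-in for the MPC views
    out = relation(xor_bytes(xor_bytes(w1, w2), w3))
    coms = [commit(v) for v in views]         # commit to every view
    return views, coms, out

def verifier_phase(coms, out, open_two):
    # Challenge two random distinct players and check their openings.
    i, j = random.sample([0, 1, 2], 2)
    for idx, (view, r) in zip((i, j), open_two(i, j)):
        if coms[idx][0] != hashlib.sha256(r + view).hexdigest():
            return False
    return out == 1

witness = b"\x01\x02"
relation = lambda w: int(w == witness)        # toy NP relation R(x, .)
views, coms, out = prover_phase(witness, relation)
opener = lambda i, j: [(views[i], coms[i][1]), (views[j], coms[j][1])]
assert verifier_phase(coms, out, opener)
```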

^{41}Here we use a statistically binding commitment scheme, which follows from the existence of one-way functions [HILL99, Nao91].

Bibliography

[Aar12] Scott Aaronson. Impossibility of succinct quantum proofs for collision-freeness. Quantum Information & Computation, 12(1-2):21-28, 2012.

[AARV17] Benny Applebaum, Barak Arkis, Pavel Raykov, and Prashant Nalini Vasudevan. Conditional disclosure of secrets: Amplification, closure, amortization, lower-bounds, and separations. In Jonathan Katz and Hovav Shacham, editors, Advances in Cryptology - CRYPTO 2017 - 37th Annual International Cryptology Conference, Santa Barbara, CA, USA, August 20-24, 2017, Proceedings, Part I, volume 10401 of Lecture Notes in Computer Science, pages 727-757. Springer, 2017.

[ABG+14] Adi Akavia, Andrej Bogdanov, Siyao Guo, Akshay Kamath, and Alon Rosen. Candidate weak pseudorandom functions in AC^0 ∘ MOD_2. In Innovations in Theoretical Computer Science, ITCS'14, Princeton, NJ, USA, January 12-14, 2014, pages 251-260, 2014.

[ADM+99] Noga Alon, Martin Dietzfelbinger, Peter Bro Miltersen, Erez Petrank, and Gábor Tardos. Linear hash functions. J. ACM, 46(5):667-683, 1999.

[AGGM06] Adi Akavia, Oded Goldreich, Shafi Goldwasser, and Dana Moshkovitz. On basing one-way functions on NP-hardness. In Jon M. Kleinberg, editor, Symposium on Theory of Computing, pages 701-710. ACM, 2006.

[AH91] William Aiello and Johan Håstad. Statistical zero-knowledge languages can be recognized in two rounds. J. Comput. Syst. Sci., 42(3):327-345, 1991.

[AKNS00] Noga Alon, Michael Krivelevich, Ilan Newman, and Mario Szegedy. Regular languages are testable with a constant number of queries. SIAM J. Comput., 30(6):1842-1862, 2000.

[AR16a] Benny Applebaum and Pavel Raykov. From private simultaneous messages to zero-information Arthur-Merlin protocols and back. In Eyal Kushilevitz and Tal Malkin, editors, Theory of Cryptography - 13th International Conference, TCC 2016-A, Tel Aviv, Israel, January 10-13, 2016, Proceedings, Part II, volume 9563 of Lecture Notes in Computer Science, pages 65-82. Springer, 2016.

[AR16b] Benny Applebaum and Pavel Raykov. On the relationship between statistical zero-knowledge and statistical randomized encodings. In Annual Cryptology Conference, pages 449-477. Springer, 2016.

[Bab16] László Babai. Graph isomorphism in quasipolynomial time [extended abstract]. In Proceedings of the 48th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2016, Cambridge, MA, USA, June 18-21, 2016, pages 684-697, 2016.

[BB15] Andrej Bogdanov and Christina Brzuska. On basing size-verifiable one-way functions on NP-hardness. In Yevgeniy Dodis and Jesper Buus Nielsen, editors, TCC, volume 9014 of Lecture Notes in Computer Science, pages 1-6. Springer, 2015.

[BBF16] Zvika Brakerski, Christina Brzuska, and Nils Fleischhacker. On statistically secure obfuscation with approximate correctness. In Advances in Cryptology - CRYPTO 2016 - 36th Annual International Cryptology Conference, Santa Barbara, CA, USA, August 14-18, 2016, Proceedings, Part II, pages 551-578, 2016.

[BBM11] Nayantara Bhatnagar, Andrej Bogdanov, and Elchanan Mossel. The computational complexity of estimating MCMC convergence time. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques - 14th International Workshop, APPROX 2011, and 15th International Workshop, RANDOM 2011, Princeton, NJ, USA, August 17-19, 2011. Proceedings, pages 424-435, 2011.

[BBM12] Eric Blais, Joshua Brody, and Kevin Matulef. Property testing lower bounds via communication complexity. Computational Complexity, 21(2):311-358, 2012.

[BCF+16] Eli Ben-Sasson, Alessandro Chiesa, Michael A. Forbes, Ariel Gabizon, Michael Riabzev, and Nicholas Spooner. On probabilistic checking in perfect zero knowledge. IACR Cryptology ePrint Archive, 2016:988, 2016.

[BCGV16] Eli Ben-Sasson, Alessandro Chiesa, Ariel Gabizon, and Madars Virza. Quasi-linear size zero knowledge from linear-algebraic PCPs. In Theory of Cryptography - 13th International Conference, TCC 2016-A, Tel Aviv, Israel, January 10-13, 2016, Proceedings, Part II, pages 33-64, 2016.

[BCH+17] Adam Bouland, Lijie Chen, Dhiraj Holden, Justin Thaler, and Prashant Nalini Vasudevan. On the power of statistical zero knowledge. In Chris Umans, editor, 58th IEEE Annual Symposium on Foundations of Computer Science, FOCS 2017, Berkeley, CA, USA, October 15-17, 2017, pages 708-719. IEEE Computer Society, 2017.

[BCS16] Eli Ben-Sasson, Alessandro Chiesa, and Nicholas Spooner. Interactive oracle proofs. In Theory of Cryptography - 14th International Conference, TCC 2016-B, Beijing, China, October 31 - November 3, 2016, Proceedings, Part II, pages 31-60, 2016.

[BCY91] Gilles Brassard, Claude Crépeau, and Moti Yung. Constant-round perfect zero-knowledge computationally convincing protocols. Theor. Comput. Sci., 84(1):23-52, 1991.

[BDRV17] Itay Berman, Akshay Degwekar, Ron D. Rothblum, and Prashant Nalini Vasudevan. Multi-collision resistant hash functions and their applications. IACR Cryptology ePrint Archive, 2017:489, 2017.

[BDRV18] Itay Berman, Akshay Degwekar, Ron D. Rothblum, and Prashant Nalini Vasudevan. Multi-collision resistant hash functions and their applications. In EUROCRYPT, 2018.

[BDRV19] Itay Berman, Akshay Degwekar, Ron D. Rothblum, and Prashant Nalini Vasudevan. Statistical difference beyond the polarizing regime. Electronic Colloquium on Computational Complexity (ECCC), 26:38, 2019.

[BDV17] Nir Bitansky, Akshay Degwekar, and Vinod Vaikuntanathan. Structure vs. hardness through the obfuscation lens. In CRYPTO, 2017.

[BG03] Michael Ben-Or and Danny Gutfreund. Trading help for interaction in statistical zero-knowledge proofs. J. Cryptology, 16(2):95-116, 2003.

[BGG+88] Michael Ben-Or, Oded Goldreich, Shafi Goldwasser, Johan Håstad, Joe Kilian, Silvio Micali, and Phillip Rogaway. Everything provable is provable in zero-knowledge. In Advances in Cryptology - CRYPTO '88, 8th Annual International Cryptology Conference, Santa Barbara, California, USA, August 21-25, 1988, Proceedings, pages 37-56, 1988.

[BGH+06] Eli Ben-Sasson, Oded Goldreich, Prahladh Harsha, Madhu Sudan, and Salil P. Vadhan. Robust PCPs of proximity, shorter PCPs, and applications to coding. SIAM J. Comput., 36(4):889-974, 2006.

[BHKY19] Nir Bitansky, Iftach Haitner, Ilan Komargodski, and Eylon Yogev. Distributional collision resistance beyond one-way functions. IACR Cryptology ePrint Archive, 2019:115, 2019.

[BHZ87] Ravi B. Boppana, Johan Håstad, and Stathis Zachos. Does co-NP have short interactive proofs? Inf. Process. Lett., 25(2):127-132, 1987.

[BKP18] Nir Bitansky, Yael Tauman Kalai, and Omer Paneth. Multi-collision resistance: a paradigm for keyless hash functions. In Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2018, Los Angeles, CA, USA, June 25-29, 2018, pages 671-684, 2018.

[BL13] Andrej Bogdanov and Chin Ho Lee. Limits of provable security for homomorphic encryption. In Advances in Cryptology - CRYPTO 2013 - 33rd Annual Cryptology Conference, Santa Barbara, CA, USA, August 18-22, 2013. Proceedings, Part I, pages 111-128, 2013.

[Blu81] Manuel Blum. Coin flipping by telephone. In Advances in Cryptology: A Report on CRYPTO 81, CRYPTO 81, IEEE Workshop on Communications Security, Santa Barbara, California, USA, August 24-26, 1981, pages 11-15, 1981.

[BM88] László Babai and Shlomo Moran. Arthur-Merlin games: A randomized proof system, and a hierarchy of complexity classes. J. Comput. Syst. Sci., 36(2):254-276, 1988.

[BMO90] Mihir Bellare, Silvio Micali, and Rafail Ostrovsky. Perfect zero-knowledge in constant rounds. In Proceedings of the 22nd Annual ACM Symposium on Theory of Computing, May 13-17, 1990, Baltimore, Maryland, USA, pages 482-493, 1990.

[BPVY00] Ernest F. Brickell, David Pointcheval, Serge Vaudenay, and Moti Yung. Design validations for discrete logarithm based signature schemes. In Public Key Cryptography, Third International Workshop on Practice and Theory in Public Key Cryptography, PKC 2000, Melbourne, Victoria, Australia, January 18-20, 2000, Proceedings, pages 276-292, 2000.

[BRV17] Itay Berman, Ron D. Rothblum, and Vinod Vaikuntanathan. Zero- knowledge proofs of proximity. IACR Cryptology ePrint Archive, 2017:114, 2017.

[BRV18] Itay Berman, Ron D. Rothblum, and Vinod Vaikuntanathan. Zero-knowledge proofs of proximity. In 9th Innovations in Theoretical Computer Science Conference, ITCS 2018, January 11-14, 2018, Cambridge, MA, USA, pages 19:1-19:20, 2018.

[BY96] Mihir Bellare and Moti Yung. Certifying permutations: Noninteractive zero-knowledge based on any trapdoor permutation. J. Cryptology, 9(3):149-166, 1996.

[Cam86] Lucien Le Cam. Asymptotic Methods in Statistical Decision Theory. Springer-Verlag, New York, NY, 1986.

[CCKV08] André Chailloux, Dragos Florin Ciocan, Iordanis Kerenidis, and Salil P. Vadhan. Interactive and noninteractive zero knowledge are equivalent in the help model. In Theory of Cryptography, Fifth Theory of Cryptography Conference, TCC 2008, New York, USA, March 19-21, 2008, pages 501-534, 2008.

[CFS17] Alessandro Chiesa, Michael A. Forbes, and Nicholas Spooner. A zero knowledge sumcheck and its applications. Electronic Colloquium on Computational Complexity (ECCC), 24:57, 2017.

[CGVZ18] Yi-Hsiu Chen, Mika Göös, Salil P. Vadhan, and Jiapeng Zhang. A tight lower bound for entropy flattening. In CCC, 2018.

[CL17] Ran Canetti and Amit Lichtenberg, 2017. Unpublished manuscript.

[CRSW13] L. Elisa Celis, Omer Reingold, Gil Segev, and Udi Wieder. Balls and bins: Smaller hash families and faster evaluation. SIAM J. Comput., 42(3):1030-1050, 2013.

[CS10] Artur Czumaj and Christian Sohler. Testing expansion in bounded-degree graphs. Combinatorics, Probability & Computing, 19(5-6):693-709, 2010.

[CS16] Gil Cohen and Igor Shinkar. The complexity of DNF of parities. In Proceedings of the 2016 ACM Conference on Innovations in Theoretical Computer Science, Cambridge, MA, USA, January 14-16, 2016, pages 47-58, 2016.

[DGRV11] Zeev Dvir, Dan Gutfreund, Guy N. Rothblum, and Salil P. Vadhan. On approximating the entropy of polynomial mappings. In Innovations in Computer Science - ICS 2010, Tsinghua University, Beijing, China, January 7-9, 2011. Proceedings, pages 460-475, 2011.

[DHRS07] Yan Zong Ding, Danny Harnik, Alon Rosen, and Ronen Shaltiel. Constant-round oblivious transfer in the bounded storage model. J. Cryptology, 20(2):165-202, 2007.

[DI06] Bella Dubrov and Yuval Ishai. On the randomness complexity of efficient sampling. In Proceedings of the 38th Annual ACM Symposium on Theory of Computing, Seattle, WA, USA, May 21-23, 2006, pages 711-720, 2006.

[DNR04] Cynthia Dwork, Moni Naor, and Omer Reingold. Immunizing encryption schemes from decryption errors. In Christian Cachin and Jan Camenisch, editors, EUROCRYPT, volume 3027 of Lecture Notes in Computer Science, pages 342-360. Springer, 2004.

[DORS08] Yevgeniy Dodis, Rafail Ostrovsky, Leonid Reyzin, and Adam D. Smith. Fuzzy extractors: How to generate strong keys from biometrics and other noisy data. SIAM J. Comput., 38(1):97-139, 2008.

[DPP93] Ivan Damgård, Torben P. Pedersen, and Birgit Pfitzmann. On the existence of statistically hiding bit commitment schemes and fail-stop signatures. In Advances in Cryptology - CRYPTO '93, 13th Annual International Cryptology Conference, Santa Barbara, California, USA, August 22-26, 1993, Proceedings, pages 250-265, 1993.

[DR06] Irit Dinur and Omer Reingold. Assignment testers: Towards a combinatorial proof of the PCP theorem. SIAM J. Comput., 36(4):975-1024, 2006.

[EKR04] Funda Ergün, Ravi Kumar, and Ronitt Rubinfeld. Fast approximate probabilistically checkable proofs. Inf. Comput., 189(2):135-159, 2004.

[FGL14] Eldar Fischer, Yonatan Goldhirsh, and Oded Lachish. Partial tests, universal tests and decomposability. In Innovations in Theoretical Computer Science, ITCS'14, Princeton, NJ, USA, January 12-14, 2014, pages 483-500, 2014.

[FGM+89] Martin Fürer, Oded Goldreich, Yishay Mansour, Michael Sipser, and Stathis Zachos. On completeness and soundness in interactive proof systems. Advances in Computing Research, 5:429-442, 1989.

[FLS99] Uriel Feige, Dror Lapidot, and Adi Shamir. Multiple noninteractive zero knowledge proofs under general assumptions. SIAM J. Comput., 29(1):1-28, 1999.

[For89] Lance Fortnow. The complexity of perfect zero-knowledge. Advances in Computing Research, 5:327-343, 1989.

[FV17] Serge Fehr and Serge Vaudenay. Personal communication, 2017.

[GG98] Oded Goldreich and Shafi Goldwasser. On the limits of non-approximability of lattice problems. In Proceedings of the thirtieth annual ACM symposium on Theory of computing, pages 1-9. ACM, 1998.

[GG18] Oded Goldreich and Tom Gur. Universal locally testable codes. Chicago J. Theor. Comput. Sci., 2018, 2018.

[GGK15] Oded Goldreich, Tom Gur, and Ilan Komargodski. Strong locally testable codes with relaxed local decoders. In 30th Conference on Computational Complexity, CCC 2015, June 17-19, 2015, Portland, Oregon, USA, pages 1-41, 2015.

[GGR98] Oded Goldreich, Shafi Goldwasser, and Dana Ron. Property testing and its connection to learning and approximation. J. ACM, 45(4):653-750, 1998.

[GGR18] Oded Goldreich, Tom Gur, and Ron D. Rothblum. Proofs of proximity for context-free languages and read-once branching programs. Inf. Comput., 261(Part):175-201, 2018.

[GIMS10] Vipul Goyal, Yuval Ishai, Mohammad Mahmoody, and Amit Sahai. Interactive locking, zero-knowledge PCPs, and unconditional cryptography. In Advances in Cryptology - CRYPTO 2010, 30th Annual Cryptology Conference, Santa Barbara, CA, USA, August 15-19, 2010. Proceedings, pages 173-190, 2010.

[GK96] Oded Goldreich and Ariel Kahan. How to construct constant-round zero-knowledge proof systems for NP. J. Cryptology, 9(3):167-190, 1996.

[GMR88] Shafi Goldwasser, Silvio Micali, and Ronald L. Rivest. A digital signature scheme secure against adaptive chosen-message attacks. SIAM J. Comput., 17(2):281-308, 1988.

[GMR89] Shafi Goldwasser, Silvio Micali, and Charles Rackoff. The knowledge complexity of interactive proof systems. SIAM J. Comput., 18(1):186-208, 1989.

[GMW87] Oded Goldreich, Silvio Micali, and Avi Wigderson. How to play any mental game or A completeness theorem for protocols with honest majority. In Proceedings of the 19th Annual ACM Symposium on Theory of Computing, 1987, New York, New York, USA, pages 218-229, 1987.

[GMW91] Oded Goldreich, Silvio Micali, and Avi Wigderson. Proofs that yield nothing but their validity for all languages in NP have zero-knowledge proof systems. J. ACM, 38(3):691-729, 1991.

[Gol90] Oded Goldreich. A note on computational indistinguishability. Inf. Process. Lett., 34(6):277-281, 1990.

[Gol01] Oded Goldreich. The Foundations of Cryptography - Volume 1, Basic Techniques. Cambridge University Press, 2001.

[Gol11] Oded Goldreich, editor. Studies in Complexity and Cryptography. Miscellanea on the Interplay between Randomness and Computation - In Collaboration with Lidor Avigad, Mihir Bellare, Zvika Brakerski, Shafi Goldwasser, Shai Halevi, Tali Kaufman, Leonid Levin, Noam Nisan, Dana Ron, Madhu Sudan, Luca Trevisan, Salil Vadhan, Avi Wigderson, David Zuckerman, volume 6650 of Lecture Notes in Computer Science. Springer, 2011.

[Gol17] Oded Goldreich. Introduction to Property Testing. Cambridge University Press, 2017.

[GP99] Oded Goldreich and Erez Petrank. Quantifying knowledge complexity. Computational Complexity, 8(1):50-98, 1999.

[GPW16] Mika Göös, Toniann Pitassi, and Thomas Watson. Zero-information protocols and unambiguity in Arthur-Merlin communication. Algorithmica, 76(3):684-719, 2016.

[GR99] Oded Goldreich and Dana Ron. A sublinear bipartiteness tester for bounded degree graphs. Combinatorica, 19(3):335-373, 1999.

[GR02] Oded Goldreich and Dana Ron. Property testing in bounded degree graphs. Algorithmica, 32(2):302-343, 2002.

[GR11] Oded Goldreich and Dana Ron. On testing expansion in bounded-degree graphs. In Goldreich [Gol11], pages 68-75.

[GR13] Oded Goldreich and Ron D. Rothblum. Enhancements of trapdoor permutations. J. Cryptology, 26(3):484-512, 2013.

[GR15] Tom Gur and Ron D. Rothblum, 2015. Unpublished observation.

[GR17] Tom Gur and Ron D. Rothblum. A hierarchy theorem for interactive proofs of proximity. In 8th Innovations in Theoretical Computer Science Conference, ITCS 2017, January 9-11, 2017, Berkeley, CA, USA, pages 39:1-39:43, 2017.

[GR18] Tom Gur and Ron D. Rothblum. Non-interactive proofs of proximity. Computational Complexity, 27(1):99-207, 2018.

[GS89] Shafi Goldwasser and Michael Sipser. Private coins versus public coins in interactive proof systems. Advances in Computing Research, 5:73-90, 1989.

[GS92] Peter Gemmell and Madhu Sudan. Highly resilient correctors for polynomials. Inf. Process. Lett., 43(4):169-174, 1992.

[GS94] Marc Girault and Jacques Stern. On the length of cryptographic hash-values used in identification schemes. In Advances in Cryptology - CRYPTO '94, 14th Annual International Cryptology Conference, Santa Barbara, California, USA, August 21-25, 1994, Proceedings, pages 202-215, 1994.

[GS06] Oded Goldreich and Madhu Sudan. Locally testable codes and PCPs of almost-linear length. J. ACM, 53(4):558-655, 2006.

[GSV98] Oded Goldreich, Amit Sahai, and Salil Vadhan. Honest-verifier statistical zero-knowledge equals general statistical zero-knowledge. In Proceedings of the thirtieth annual ACM symposium on Theory of computing, pages 399-408. ACM, 1998.

[GSV99] Oded Goldreich, Amit Sahai, and Salil P. Vadhan. Can statistical zero knowledge be made non-interactive? or on the relationship of SZK and NISZK. In Michael J. Wiener, editor, CRYPTO, volume 1666 of Lecture Notes in Computer Science, pages 467-484. Springer, 1999.

[GV99] Oded Goldreich and Salil P. Vadhan. Comparing entropies in statistical zero knowledge with applications to the structure of SZK. In Proceedings of the 14th Annual IEEE Conference on Computational Complexity, Atlanta, Georgia, USA, May 4-6, 1999, page 54, 1999.

[GV11] Oded Goldreich and Salil P. Vadhan. On the complexity of computational problems regarding distributions. In Goldreich [Gol11], pages 390-405.

[GVW02] Oded Goldreich, Salil Vadhan, and Avi Wigderson. On interactive proofs with a laconic prover. Computational Complexity, 11(1-2):1-53, 2002.

[HHRS15] Iftach Haitner, Jonathan J. Hoch, Omer Reingold, and Gil Segev. Finding collisions in interactive protocols - tight lower bounds on the round and communication complexities of statistically hiding commitments. SIAM Journal on Computing, 44(1):193-242, 2015.

[HILL99] Johan Håstad, Russell Impagliazzo, Leonid A. Levin, and Michael Luby. A pseudorandom generator from any one-way function. SIAM J. Comput., 28(4):1364-1396, 1999.

[HM96] Shai Halevi and Silvio Micali. Practical and provably-secure commitment schemes from collision-free hashing. In Advances in Cryptology - CRYPTO '96, 16th Annual International Cryptology Conference, Santa Barbara, California, USA, August 18-22, 1996, Proceedings, pages 201-215, 1996.

[HNO+09] Iftach Haitner, Minh-Huyen Nguyen, Shien Jin Ong, Omer Reingold, and Salil P. Vadhan. Statistically hiding commitments and statistical zero-knowledge arguments from any one-way function. SIAM J. Comput., 39(3):1153-1218, 2009.

[HR04] Chun-Yuan Hsiao and Leonid Reyzin. Finding collisions on a public road, or do secure hash functions need secret coins? In Advances in Cryptology - CRYPTO 2004, 24th Annual International Cryptology Conference, Santa Barbara, California, USA, August 15-19, 2004, Proceedings, pages 92-105, 2004.

[HR05] Thomas Holenstein and Renato Renner. One-way secret-key agreement and applications to circuit polarization and immunization of public-key encryption. In CRYPTO, pages 478-493, 2005.

[HRVW09] Iftach Haitner, Omer Reingold, Salil P. Vadhan, and Hoeteck Wee. Inaccessible entropy. In Proceedings of the 41st Annual ACM Symposium on Theory of Computing, STOC 2009, Bethesda, MD, USA, May 31 - June 2, 2009, pages 611-620, 2009.

[HV17] Iftach Haitner and Salil P. Vadhan. The many entropies in one-way functions. In Tutorials on the Foundations of Cryptography, pages 159-217. 2017.

[IKOS09] Yuval Ishai, Eyal Kushilevitz, Rafail Ostrovsky, and Amit Sahai. Zero-knowledge proofs from secure multiparty computation. SIAM J. Comput., 39(3):1121-1152, 2009.

[IL89] Russell Impagliazzo and Michael Luby. One-way functions are essential for complexity based cryptography (extended abstract). In 30th Annual Symposium on Foundations of Computer Science, Research Triangle Park, North Carolina, USA, 30 October - 1 November 1989, pages 230-235, 1989.

[IOS97] Toshiya Itoh, Yuji Ohta, and Hiroki Shizuya. A language-dependent cryptographic primitive. J. Cryptology, 10(1):37-50, 1997.

[IW14] Yuval Ishai and Mor Weiss. Probabilistically checkable proofs of proximity with zero-knowledge. In Theory of Cryptography - 11th Theory of Cryptography Conference, TCC 2014, San Diego, CA, USA, February 24-26, 2014. Proceedings, pages 121-145, 2014.

[IWY16] Yuval Ishai, Mor Weiss, and Guang Yang. Making the best of a leaky situation: Zero-knowledge PCPs from leakage-resilient circuits. In Theory of Cryptography - 13th International Conference, TCC 2016-A, Tel Aviv, Israel, January 10-13, 2016, Proceedings, Part II, pages 3-32, 2016.

[Jou04] Antoine Joux. Multicollisions in iterated hash functions. Application to cascaded constructions. In Advances in Cryptology - CRYPTO 2004, 24th Annual International Cryptology Conference, Santa Barbara, California, USA, August 15-19, 2004, Proceedings, pages 306-316, 2004.

[Kil92] Joe Kilian. A note on efficient zero-knowledge proofs and arguments (extended abstract). In Proceedings of the 24th Annual ACM Symposium on Theory of Computing, May 4-6, 1992, Victoria, British Columbia, Canada, pages 723-732, 1992.

[KNY17] Ilan Komargodski, Moni Naor, and Eylon Yogev. White-box vs. black-box complexity of search problems: Ramsey and graph property testing. In FOCS, 2017.

[KNY18] Ilan Komargodski, Moni Naor, and Eylon Yogev. Collision resistant hashing for paranoids: Dealing with multiple collisions. In EUROCRYPT, pages 162-194, 2018.

[KPT97] Joe Kilian, Erez Petrank, and Gábor Tardos. Probabilistically checkable proofs with zero knowledge. In Proceedings of the Twenty-Ninth Annual ACM Symposium on the Theory of Computing, El Paso, Texas, USA, May 4-6, 1997, pages 496-505, 1997.

[KR08] Yael Tauman Kalai and Ran Raz. Interactive PCP. In Automata, Languages and Programming, 35th International Colloquium, ICALP 2008, Reykjavik, Iceland, July 7-11, 2008, Proceedings, Part II - Track B: Logic, Semantics, and Theory of Programming & Track C: Security and Cryptography Foundations, pages 536-547, 2008.

[KR15] Yael Tauman Kalai and Ron D. Rothblum. Arguments of proximity - [extended abstract]. In Advances in Cryptology - CRYPTO 2015 - 35th Annual Cryptology Conference, Santa Barbara, CA, USA, August 16-20, 2015, Proceedings, Part II, pages 422-442, 2015.

[KS11] Satyen Kale and C. Seshadhri. An expansion tester for bounded degree graphs. SIAM J. Comput., 40(3):709-720, 2011.

[KT00] Jonathan Katz and Luca Trevisan. On the efficiency of local decoding procedures for error-correcting codes. In Proceedings of the Thirty-Second Annual ACM Symposium on Theory of Computing, May 21-23, 2000, Portland, OR, USA, pages 80-86, 2000.

[KY18] Ilan Komargodski and Eylon Yogev. On distributional collision resistant hashing. In Advances in Cryptology - CRYPTO 2018 - 38th Annual International Cryptology Conference, Santa Barbara, CA, USA, August 19-23, 2018, Proceedings, Part II, pages 303-327, 2018.

[LZ17] Shachar Lovett and Jiapeng Zhang. On the impossibility of entropy reversal, and its application to zero-knowledge proofs. In TCC, 2017.

[Mer89] Ralph C. Merkle. One way hash functions and DES. In Advances in Cryptology - CRYPTO '89, 9th Annual International Cryptology Conference, Santa Barbara, California, USA, August 20-24, 1989, Proceedings, pages 428-446, 1989.

[Mic00] Silvio Micali. Computationally sound proofs. SIAM J. Comput., 30(4):1253-1298, 2000.

[MP07] Silvio Micali and Rafael Pass. Precise zero knowledge. http://www.cs.cornell.edu/~rafael/papers/preciseZK.pdf, 2007.

[MRRR14] Raghu Meka, Omer Reingold, Guy N. Rothblum, and Ron D. Rothblum. Fast pseudorandomness for independence and load balancing - (extended abstract). In Automata, Languages, and Programming - 41st International Colloquium, ICALP 2014, Copenhagen, Denmark, July 8-11, 2014, Proceedings, Part I, pages 859-870, 2014.

[MV03] Daniele Micciancio and Salil P. Vadhan. Statistical zero-knowledge proofs with efficient provers: Lattice problems and more. In Dan Boneh, editor, Advances in Cryptology - CRYPTO 2003, 23rd Annual International Cryptology Conference, Santa Barbara, California, USA, August 17-21, 2003, Proceedings, volume 2729 of Lecture Notes in Computer Science, pages 282-298. Springer, 2003.

[Nao91] Moni Naor. Bit commitment using pseudorandomness. J. Cryptology, 4(2):151-158, 1991.

[NR06] Moni Naor and Guy N. Rothblum. Learning to impersonate. In ICML, pages 649-656, 2006.

[NS10] Asaf Nachmias and Asaf Shapira. Testing the expansion of a graph. Inf. Comput., 208(4):309-314, 2010.

[NV06] Minh-Huyen Nguyen and Salil P. Vadhan. Zero knowledge with efficient provers. In Proceedings of the 38th Annual ACM Symposium on Theory of Computing, Seattle, WA, USA, May 21-23, 2006, pages 287-295, 2006.

[NY89] Moni Naor and Moti Yung. Universal one-way hash functions and their cryptographic applications. In Proceedings of the 21st Annual ACM Symposium on Theory of Computing, May 14-17, 1989, Seattle, Washington, USA, pages 33-43, 1989.

[Oka00] Tatsuaki Okamoto. On relationships between statistical zero-knowledge proofs. J. Comput. Syst. Sci., 60(1):47-108, 2000.

[Ost91] Rafail Ostrovsky. One-way functions, hard on average problems, and statistical zero-knowledge proofs. In Structure in Complexity Theory Conference, pages 133-138, 1991.

[OV08] Shien Jin Ong and Salil P. Vadhan. An equivalence between zero knowledge and commitments. In TCC, pages 482-500, 2008.

[OW93] Rafail Ostrovsky and Avi Wigderson. One-way functions are essential for non-trivial zero-knowledge. In ISTCS, pages 3-17, 1993.

[PRS12] Krzysztof Pietrzak, Alon Rosen, and Gil Segev. Lossy functions do not amplify well. In Theory of Cryptography - 9th Theory of Cryptography Conference, TCC 2012, Taormina, Sicily, Italy, March 19-21, 2012. Proceedings, pages 458-475, 2012.

[PW11] Chris Peikert and Brent Waters. Lossy trapdoor functions and their applications. SIAM J. Comput., 40(6):1803-1844, 2011.

[PW17] Yury Polyanskiy and Yihong Wu. Lecture notes on information theory. Available at: http://people.lids.mit.edu/yp/homepage/data/itlecturesv5.pdf, 2017.

[Rom90] John Rompel. One-way functions are necessary and sufficient for secure signatures. In Proceedings of the 22nd Annual ACM Symposium on Theory of Computing, May 13-17, 1990, Baltimore, Maryland, USA, pages 387-394, 1990.

[RRR16] Omer Reingold, Guy N. Rothblum, and Ron D. Rothblum. Constant-round interactive proofs for delegating computation. In Proceedings of the 48th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2016, Cambridge, MA, USA, June 18-21, 2016, pages 49-62, 2016.

[RS96] Ronald L. Rivest and Adi Shamir. Payword and micromint: Two simple micropayment schemes. In Security Protocols, International Workshop, Cambridge, United Kingdom, April 10-12, 1996, Proceedings, pages 69-87, 1996.

[RTV04] Omer Reingold, Luca Trevisan, and Salil P. Vadhan. Notions of reducibility between cryptographic primitives. In Theory of Cryptography, First Theory of Cryptography Conference, TCC 2004, Cambridge, MA, USA, February 19-21, 2004, Proceedings, pages 1-20, 2004.

[RV09] Guy N. Rothblum and Salil P. Vadhan, 2009. Unpublished manuscript.

[RVW13] Guy N. Rothblum, Salil P. Vadhan, and Avi Wigderson. Interactive proofs of proximity: delegating computation in sublinear time. In Symposium on Theory of Computing Conference, STOC'13, Palo Alto, CA, USA, June 1-4, 2013, pages 793-802, 2013.

[Sim98] Daniel R. Simon. Finding collisions on a one-way street: Can secure hash functions be based on general assumptions? In Advances in Cryptology - EUROCRYPT '98, pages 334-345. Springer, 1998.

[Sud95] Madhu Sudan. Efficient Checking of Polynomials and Proofs and the Hardness of Approximation Problems, volume 1001 of Lecture Notes in Computer Science. Springer, 1995.

[SV03] Amit Sahai and Salil Vadhan. A complete problem for statistical zero knowledge. Journal of the ACM (JACM), 50(2):196-249, 2003.

[Top00] Flemming Topsøe. Some inequalities for information divergence and related measures of discrimination. IEEE Transactions on Information Theory, 46(4):1602-1609, July 2000.

[Vad99] Salil Pravin Vadhan. A study of statistical zero-knowledge proofs. PhD thesis, Massachusetts Institute of Technology, 1999.

[Vad12] Salil P. Vadhan. Pseudorandomness. Foundations and Trends in Theoretical Computer Science, 7(1-3):1-336, 2012.

[Yeh16] Amir Yehudayoff. Pointer chasing via triangular discrimination. Electronic Colloquium on Computational Complexity (ECCC), 23:151, 2016.
