Innovations in Computer Science 2010

Robustness of the Learning with Errors Assumption

Shafi Goldwasser1 Yael Kalai2 Chris Peikert3 Vinod Vaikuntanathan4 1MIT and the Weizmann Institute 2Microsoft Research 3Georgia Institute of Technology 4IBM Research shafi@csail.mit.edu [email protected] [email protected] [email protected]

Abstract: Starting with the work of Ishai-Sahai-Wagner and Micali-Reyzin, a new goal has been set within the theory of cryptography community: to design cryptographic primitives that are secure against large classes of side-channel attacks. Recently, many works have focused on designing various cryptographic primitives that are robust (retain security) even when the secret key is “leaky”, under various intractability assumptions. In this work we propose to take a step back and ask a more basic question: which of our cryptographic assumptions (rather than cryptographic schemes) are robust in the presence of leakage of their underlying secrets? Our main result is that the hardness of the learning with errors (LWE) problem implies its hardness with leaky secrets. More generally, we show that the standard LWE assumption implies that LWE is secure even if the secret is taken from an arbitrary distribution with sufficient entropy, and even in the presence of hard-to-invert auxiliary inputs. We exhibit various applications of this result.
1. Under the standard LWE assumption, we construct a symmetric-key encryption scheme that is robust to secret-key leakage, and more generally maintains security even if the secret key is taken from an arbitrary distribution with sufficient entropy (and even in the presence of hard-to-invert auxiliary inputs).
2. Under the standard LWE assumption, we construct a (weak) obfuscator for the class of point functions with multi-bit output.
We note that in most schemes that are known to be robust to leakage, the parameters of the scheme depend on the maximum leakage the system can tolerate, and hence the efficiency degrades with the maximum anticipated leakage, even if no leakage occurs at all!
In contrast, the fact that we rely on a robust assumption allows us to construct a single symmetric-key encryption scheme, with parameters that are independent of the anticipated leakage, that is robust to any leakage (as long as the secret key has sufficient entropy left over). Namely, for any k

1 Introduction

The development of the theory of cryptography, starting from the foundational work in the early 80’s, has led to rigorous definitions of security, mathematical modeling of cryptographic goals, and what it means for a cryptographic algorithm to achieve the stated security goals.

A typical security definition builds on the following framework: a cryptographic algorithm is modeled as a Turing machine (whose description is known to all) initialized with a secret key. Adversarial entities, modeled as arbitrary (probabilistic) polynomial-time machines, have input/output access to the algorithm. The requirement is that it is infeasible for any such adversary to “break” the system at hand. The (often implicit) assumption in such a definition is that the secret keys used by the algorithm are perfectly secret and chosen afresh for the algorithm. In practice, however, information about keys does get compromised for a variety of reasons, including: various side-channel attacks, the use of the same secret key across several applications, or the use of correlated and imperfect sources of randomness to generate the keys. In short, adversaries in the real world can typically obtain information about the secret keys other than through the prescribed input/output interface.

In recent years, starting with the work of Ishai, Sahai and Wagner [19] and Micali and Reyzin [21], a new goal has been set within the theory of cryptography community to build general theories of security in the presence of information leakage. A large body of work has accumulated by now (see
[1, 3, 6, 8, 12–14, 16, 19, 21, 22, 25, 26, 28] and the references therein) in which security against different classes of information leakage has been defined, and different cryptographic primitives have been designed to provably withstand these attacks. A set of works particularly relevant to our work are those that design cryptographic primitives which are “robust” in the following sense: they remain secure even in the presence of attacks which obtain arbitrary polynomial-time computable information about the secret keys, as long as “the secret key is not fully revealed” (either information-theoretically or computationally) [1, 11, 12, 20, 22].

In this paper, we turn our attention to the notion of robust cryptographic assumptions rather than robust cryptographic constructions.

Robustness of Cryptographic Assumptions

Suppose one could make a non-standard assumption of the form “cryptographic assumption A holds even in the presence of arbitrary leakage of its secrets”. An example would be that the factoring problem is hard even given partial information about the prime factors. Once we are at liberty to make such an assumption, the task of coming up with leakage-resilient cryptographic algorithms becomes substantially easier. However, such assumptions are rather unappealing, unsubstantiated, and sometimes (as in the case of factoring) even false! (See below.)

In addition, it is often possible to show that a quantitatively stronger hardness assumption translates to some form of leakage-resilience. For example, the assumption that the underlying problem is 2^k-hard (for some k > 0) directly implies its security in the presence of roughly k bits of leakage.¹ However, in practice, what is interesting is a cryptographic assumption that is secure against leakage of a constant fraction of its secret. The problem with the above approach is that it fails this goal, since none of the cryptographic assumptions in common use are 2^{Ω(N)}-hard to break (where N is the length of the secret key).

¹ It is worth noting that such an implication is not entirely obvious for decisional assumptions.

Thus, most of the recent work on leakage-resilient cryptography focuses on constructing cryptographic schemes whose leakage-resilience can be reduced to standard cryptographic assumptions, such as the polynomial hardness of problems based on factoring, discrete-log or various lattice problems, or in some cases, general assumptions such as the existence of one-way functions.

However, a question that still remains is: which of the (standard) cryptographic assumptions are themselves naturally “robust to leakage”? Specifically:
• Is it hard to factor an RSA composite N = pq (where p and q are random n-bit primes) given arbitrary, but bounded, information about p and q?
• Is it hard to compute x given g^x (mod p), where p is a large prime and x is uniformly random, given arbitrary, but bounded, information about x?

As for factoring with leakage, there are known negative results: a long line of results starting from the early work of Rivest and Shamir [29] shows how to factor N = pq given a small fraction of the bits of one of the factors [10, 17]. Similar negative results are known for the RSA problem and the square-root extraction problem.

As for the discrete-logarithm problem with leakage, it could very well be hard. In fact, an assumption of this nature has been put forward by Canetti and used to construct an obfuscator for point functions [7], and later to construct a symmetric encryption scheme that is robust to leakage [9]. However, we do not have any evidence for the validity of such an assumption, and in particular, we are far from showing that a standard cryptographic assumption implies the discrete-log assumption with leakage.

Recently, Dodis, Kalai and Lovett [12] considered this question in the context of the learning parity with noise (LPN) problem. They showed that the LPN assumption with leakage follows from a related, but non-standard, assumption they introduce, called the learning subspaces with noise (LSN) assumption.

In light of this situation, we ask:

Is there a standard cryptographic assumption that is robust with leakage?

Our first contribution is to show that the learning with errors (LWE) assumption [27], which has recently gained much popularity (and whose average-case complexity is related to the worst-case complexity of various lattice problems), is in fact robust in the presence of leakage (see Section 1.1 and Theorem 1 below for details).

Leakage versus Efficiency

Typically, the way leakage-resilient cryptographic primitives are designed is as follows: first, the designer determines the maximum amount of leakage he is willing to tolerate. Then, the scheme is designed, and the parameters of the scheme (and hence its efficiency) depend on the maximum amount of leakage the scheme can tolerate. The larger this quantity is, the less efficient the scheme becomes. In other words, the efficiency of the scheme depends on the maximum anticipated amount of leakage. In cases where there is no leakage at all in a typical operation of the system, this is grossly inefficient. To our knowledge, all known leakage-resilient cryptographic schemes that rely on standard assumptions follow this design paradigm.

In contrast, what we would like is a scheme whose security, and not efficiency, degrades with the actual amount of leakage. In other words, we would like to design a single scheme whose security with different amounts of leakage can be proven under a (standard) cryptographic assumption, with degraded parameters. The larger the leakage, the stronger the security assumption. In other words, the amount of leakage never shows up in the design phase, and comes up only in the proof of security. We call this a “graceful degradation of security”. This leads us to ask:

Are there cryptosystems that exhibit a graceful degradation of security?

Our second contribution is a symmetric-key encryption scheme with a graceful degradation of security. The construction is based on the LWE assumption, and uses the fact that the LWE assumption is robust to leakage. (See Section 1.1 and Theorem 2 below for details.)

Our final contribution is a (weak) obfuscator for the class of point functions with multi-bit output with a graceful degradation of security. (See Section 1.1 and Theorem 3 below for details.)

1.1 Our Results and Techniques

The learning with errors (LWE) problem, introduced by Regev [27], is a generalization of the well-known “learning parity with noise” problem. For a security parameter n and a prime number q, the LWE problem with secret s ∈ ℤ_q^n is defined as follows: given polynomially many random, noisy linear equations in s, find s. More precisely, given (a_i, ⟨a_i, s⟩ + x_i), where a_i ← ℤ_q^n is uniformly random and x_i is drawn from a “narrow error distribution”, the problem is to find s. The decisional version of LWE is to distinguish between polynomially many LWE samples {(a_i, ⟨a_i, s⟩ + x_i)} and uniformly random samples.

Typically, the LWE problem is considered with a Gaussian-like error distribution (see Section 2 for details). For notational convenience, we use a compact matrix notation for LWE: we will denote m samples from the LWE distribution compactly as (A, As + x), where A is an m-by-n matrix over ℤ_q.

Our confidence in the LWE assumption stems from a worst-case to average-case reduction of Regev [27], who showed that the (average-case, search) LWE problem is (quantumly) as hard as the approximation versions of various standard lattice problems in the worst case. Furthermore, the decisional and search LWE problems are equivalent (up to polynomial in q factors).

1.1.1 LWE with Weak Secrets

We prove that the LWE assumption is robust to leakage. More generally, we prove that it is robust to weak keys and to auxiliary input functions that are (computationally) hard to invert. Namely, we show that (for some setting of parameters) the standard LWE assumption implies that LWE is hard even if the secret s is chosen from an arbitrary distribution with sufficient min-entropy, and even given arbitrary hard-to-invert auxiliary input f(s). For the sake of clarity, in the introduction we focus on the case that the secret s is distributed according to an arbitrary distribution with sufficient min-entropy, and defer the issue of auxiliary input to the body of the paper (though all our theorems hold also w.r.t. auxiliary inputs).

We show that if the modulus q is any super-polynomial function of n, then the LWE assumption with weak binary secrets (i.e., when the secret s is distributed according to an arbitrary distribution over {0, 1}^n with sufficient min-entropy) follows from the standard LWE assumption. More specifically, we show that the LWE assumption where the secret s is drawn from an arbitrary weak source with min-entropy k over {0, 1}^n is true, assuming the LWE assumption is true with uniform (but smaller) secrets from ℤ_q^ℓ, where ℓ = (k − ω(log n))/log q. Namely, we reduce the LWE assumption with weak keys to the standard LWE assumption with “security parameter” ℓ. A caveat is that we need to assume that the (standard) LWE assumption is true with a super-polynomial modulus q and a super-polynomially small “error-rate”. Translated into the worst-case regime using Regev’s worst-case to average-case reduction, the assumption is that standard lattice problems are (quantumly) hard to approximate in quasi-polynomial time and within quasi-polynomial factors.²

² By taking q to be smooth, and following an argument from [23], our assumption translates to the assumption that standard lattice problems are (quantumly) hard to approximate in polynomial time and within super-polynomial factors.

Theorem 1 (Informal). For any super-polynomial modulus q = q(n), any k ≥ log q, and any distribution D = {D_n}_{n∈ℕ} over {0, 1}^n with min-entropy k, the (non-standard) LWE assumption, where the secret is drawn from the distribution D, follows from the (standard) LWE assumption with secret size ℓ ≜ (k − ω(log n))/log q

(where the “error-rate” is super-polynomially small and the adversaries run in time poly(n)).

We sketch the main ideas behind the proof of this theorem. Let us first perform a mental experiment where the matrix A in the definition of the LWE problem is a non-full-rank matrix, i.e., A = B · C, where B ← ℤ_q^{m×ℓ} and C ← ℤ_q^{ℓ×n} are uniformly random; this is a uniformly random matrix with rank at most ℓ (and exactly ℓ if both B and C are full rank). The LWE distribution then becomes

As + x = B · (Cs) + x

At this point, we use the leftover hash lemma,³ which states that matrix multiplication over ℤ_q (and in fact, any universal hash function) acts as a (strong) randomness extractor. In other words, Cs is statistically close to a uniformly random vector t (even given C). Now, Bt + x is exactly the LWE distribution with a uniformly random secret t which, by the LWE assumption with security parameter ℓ, is pseudorandom. It seems that we are done, but that is not quite the case.

³ In the regime of auxiliary inputs, we use the Goldreich-Levin theorem over ℤ_q from the recent work of [11].

The problem is that B · C does not look anything like a uniformly random m-by-n matrix; it is in fact easily distinguishable from the uniform distribution. At first, one may be tempted to simply change the LWE assumption, by choosing A = B · C as above, rather than a random m-by-n matrix. The problem with this approach is that if C is fixed then the min-entropy of the secret s may depend on C, in which case we cannot use the leftover hash lemma and claim that C · s is uniformly distributed. On the other hand, if C is chosen at random each time, then we have essentially reduced the problem of LWE with weak keys to the problem of LWE with related secrets C_1 s, ..., C_t s, where C_1, ..., C_t are known.

We take a different approach: we use the LWE assumption to “hide” the fact that B · C is not a full-rank matrix. More precisely, we claim that the matrix B · C + Z, where Z is chosen from the LWE error distribution, is computationally indistinguishable from a uniformly random matrix A ∈ ℤ_q^{m×n}. This is simply because each column B · c_i + z_i can be thought of as a bunch of LWE samples with “secret” c_i ∈ ℤ_q^ℓ. By the LWE assumption with security parameter ℓ, this looks random to a polynomial-time observer. This technique was used in the work of Alekhnovich [2] in the context of learning parity with noise, and was later used in a number of works in the context of the LWE assumption [24, 27].

An astute reader might have noticed that introducing this trick in fact undid the “extraction” argument. In particular, now one has to consider

(B · C + Z)s + x = (B · Cs) + Zs + x

Indeed, B · Cs + x = Bt + x by itself is pseudorandom (as before), but is not necessarily pseudorandom after the addition of the correlated Zs component. We mitigate this problem by using the following elementary property of a Gaussian distribution: shifting a Gaussian-distributed random variable by any number that is super-polynomially smaller than its standard deviation results in a distribution that is statistically close to the original Gaussian distribution. Thus, if we set each entry of Z to be super-polynomially smaller than the standard deviation of x and choose s ∈ {0, 1}^n, we see that Zs + x is distributed statistically close to a Gaussian. Namely, the noise x “eats up” the correlated and troublesome term Zs. This is where the restrictions of the LWE secret s being binary, as well as q being super-polynomial and the noise-rate being negligibly smaller than q, come into play.

1.1.2 Applications

Our main result has several applications. We construct an efficient symmetric encryption scheme secure against chosen plaintext attacks, even if the secret key is chosen from an arbitrary distribution with sufficient min-entropy (and even in the presence of arbitrary hard-to-invert auxiliary input). We also construct an obfuscator for the class of point functions with multi-bit output I = {I_{(K,M)}}, that is secure w.r.t. any distribution with sufficient min-entropy. A point function with multi-bit output is a function I_{(K,M)} that always outputs ⊥, except on input K, on which it outputs M. Both of these results rely on the standard LWE assumption.

Theorem 2 (Informal). For any super-polynomial q = q(n) there is a symmetric-key encryption scheme with the following property: For any k ≥ log q, the encryption scheme is CPA-secure when the secret key is drawn from an arbitrary distribution with min-entropy k. The assumption we rely on is the (standard) LWE assumption with a secret of size ℓ ≜ (k − ω(log n))/log q (where the “error-rate” is super-polynomially small and the adversaries run in time poly(n)).

The encryption scheme is simple, and similar versions of it were used in previous works [4, 12]. The secret key is a uniformly random vector s ← {0, 1}^n. To encrypt a message w ∈ {0, 1}^m, the encryption algorithm computes

E_s(w) = (A, As + x + ⌊q/2⌋ · w)

where A ∈_R ℤ_q^{m×n} and each component of x is drawn from the error distribution. To decrypt a ciphertext (A, y) using the secret key s, the decryption algorithm computes y − As and deciphers each coordinate i ∈ [m] separately, as follows: If the i’th coordinate of y − As is close to q/2 (i.e., between 3q/8 and 5q/8) then it deciphers the i’th coordinate of the message to be 1. If the i’th coordinate of y − As is close to 0 or q (i.e., is smaller than q/8 or larger than 7q/8) then it deciphers the i’th coordinate of the message to be 0. Otherwise, it outputs ⊥.

The ciphertext corresponding to a message w consists of polynomially many samples from the LWE distribution with secret s, added to the message vector. Robustness of LWE means that the ciphertext is pseudorandom even if the secret key is chosen from an arbitrary distribution with min-entropy k (assuming that the standard LWE assumption holds with secrets drawn from ℤ_q^ℓ where ℓ = (k − ω(log n))/log q). The same argument also shows that the scheme is secure in the presence of computationally uninvertible auxiliary input functions of s.

Finally, we use a very recent work of Canetti et al. [9], which shows a tight connection between secure encryption w.r.t. weak keys and obfuscation of point functions with multi-bit output. The security definition that they (and we) consider is a distributional one: rather than requiring the obfuscation to be secure w.r.t. every function in the class, as defined in the seminal work of [5], security is required only when the functions are distributed according to a distribution with sufficient min-entropy. Using this connection, together with our encryption scheme described above, we get the following theorem.

Theorem 3 (Informal). There exists an obfuscator for the class of point functions with multi-bit output under the (standard) LWE assumption.

1.2 Related Work

There is a long line of work that considers various models of leakage [1, 3, 8, 9, 11, 12, 14, 16, 19–22, 25, 26]. Still, most of the schemes that are known to be resilient to leakage suffer from the fact that the parameters of the scheme depend on the maximum anticipated leakage, and thus they are inefficient even in the absence of leakage! There are a few exceptions. For example, the works of [7, 9] exhibit a symmetric encryption scheme that is based on a leakage-resilient DDH assumption, which says that DDH holds even if one of its secrets is taken from an arbitrary distribution with sufficient min-entropy. We also mention several results in the continual-leakage model which rely on exponential hardness assumptions [14, 26]. We emphasize that all these examples rely on non-standard assumptions.

The work of Katz and Vaikuntanathan [20] constructs signature schemes that are resilient to leakage. They, similarly to us, “push” the leakage to the assumption level. To construct their signature schemes, they use the observation that any collision-resistant hash function (and even a universal one-way hash function) is a leakage-resilient one-way function. In other words, they use the fact that if h is collision-resistant, then h(x) is hard to invert even given some partial information about the pre-image x. This can be interpreted as saying that the collision-resistant hash function assumption is a robust assumption.

This differs from our work in several ways. The most significant difference is that we show that pseudorandomness (and not only one-wayness) of the LWE distribution is preserved in the presence of leakage. This enables us to construct various cryptographic objects which we do not know how to construct with the [20] assumption. We mention that, whereas the [20] observation follows by a straightforward counting argument, our proof requires non-trivial computational arguments: roughly speaking, this is because the secret s is uniquely determined (and thus has no information-theoretic entropy) given the LWE samples (A, As + x). Thus, we have to employ non-trivial computational claims to prove our statements.

Finally, we would like to contrast our results with those of Akavia et al. [1], who show that the public-key encryption scheme of Regev [27] based on LWE is in fact leakage-resilient. Unfortunately, achieving larger and larger leakage in their scheme entails increasing the modulus q correspondingly. Roughly speaking, to obtain robustness against leakage of a (1 − ε) fraction of the secret key, the modulus has to be at least n^{1/ε} (where n is the security parameter). In short, the scheme does not degrade gracefully with leakage (much like the later works of [11, 22]). In fact, constructing a public-key encryption scheme with a graceful degradation of security remains a very interesting open question.

2 Preliminaries

We will let n denote the main security parameter throughout the paper. The notation X ≈_c Y (resp. X ≈_s Y) means that the random variables X and Y are computationally indistinguishable (resp. statistically indistinguishable).
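As a toy numerical illustration of statistical closeness (the ≈_s notation above), and of the Gaussian “smudging” property used in Section 1.1.1 (shifting a Gaussian by an amount much smaller than its width barely moves it), the sketch below computes the exact statistical distance between a discretized Gaussian and a shifted copy of itself. The distribution and parameters here are illustrative stand-ins, not the paper's actual Ψ_β.

```python
import math

def discrete_gauss(sigma, bound):
    """Toy discretized Gaussian on [-bound, bound]: weight exp(-pi*(z/sigma)^2), normalized."""
    w = {z: math.exp(-math.pi * (z / sigma) ** 2) for z in range(-bound, bound + 1)}
    total = sum(w.values())
    return {z: p / total for z, p in w.items()}

def stat_dist(p, q_dist):
    """Exact statistical (total variation) distance between two finite distributions."""
    support = set(p) | set(q_dist)
    return sum(abs(p.get(z, 0.0) - q_dist.get(z, 0.0)) for z in support) / 2

sigma = 1000.0  # width parameter of the Gaussian (playing the role of beta*q)
y = 100         # shift, 10x smaller than the width
p = discrete_gauss(sigma, 20000)
shifted = {z + y: pr for z, pr in p.items()}
d = stat_dist(p, shifted)
print(0 < d < y / sigma)  # the distance is small, governed by the ratio |y|/sigma
```

Shrinking the ratio y/sigma further (super-polynomially, in the paper's regime) drives the distance to negligible, which is exactly how the noise "eats up" the Zs term in the proof of Theorem 4.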

2.1 Learning with Errors (LWE)

The LWE problem was introduced by Regev [27] as a generalization of the “learning noisy parities” problem. Let 𝕋 = ℝ/ℤ be the reals modulo 1, represented by the interval [0, 1). For positive integers n and q ≥ 2, a vector s ∈ ℤ_q^n, and a probability distribution φ on ℝ, let A_{s,φ} be the distribution over ℤ_q^n × 𝕋 obtained by choosing a ∈ ℤ_q^n uniformly at random and an error term x ← φ, and outputting (a, b = ⟨a, s⟩/q + x) ∈ ℤ_q^n × 𝕋.

We also consider a discretized version of the LWE distribution. For a distribution φ over ℝ and an (implicit) modulus q, let χ = φ̄ denote the distribution over ℤ obtained by drawing x ← φ and outputting ⌊q · x⌉. The distribution A_{s,χ} over ℤ_q^n × ℤ_q is obtained by choosing a ∈ ℤ_q^n uniformly at random, an error term x ← χ, and outputting (a, b = ⟨a, s⟩ + x), where all operations are performed over ℤ_q. (Equivalently, draw a sample (a, b) ← A_{s,φ} and output (a, ⌊b · q⌉) ∈ ℤ_q^n × ℤ_q.)

Definition 1. For an integer q = q(n) and an error distribution φ = φ(n) over 𝕋, the (worst-case, search) learning with errors problem LWE_{n,q,φ} in n dimensions is: given access to arbitrarily many independent samples from A_{s,φ}, output s with non-negligible probability. The problem for discretized χ = φ̄ is defined similarly.

The (average-case) decision variant of the LWE problem, denoted DLWE_{n,q,φ}, is to distinguish, with non-negligible advantage given arbitrarily many independent samples, the uniform distribution over ℤ_q^n × 𝕋 from A_{s,φ} for a uniformly random (and secret) s ∈ ℤ_q^n. The problem for discretized χ = φ̄ is defined similarly.

In this work, we are also concerned with the average-case decision LWE problem where the secret s is drawn from a distribution D over ℤ_q^n (which may not necessarily be the uniform distribution). We denote this problem by DLWE_{n,q,φ}(D).

Observe that simply by rounding, the DLWE_{n,q,χ}(D) problem is no easier than the DLWE_{n,q,φ}(D) problem, where χ = φ̄.

We are primarily interested in the LWE problems where the error distribution φ is a Gaussian. For any α > 0, the density function of a one-dimensional Gaussian probability distribution over ℝ is given by D_α(x) = exp(−π(x/α)²)/α. We write LWE_{n,q,α} as an abbreviation for LWE_{n,q,D_α}.

It is known [4, 23, 27] that for moduli q of a certain form, the (average-case decision) DLWE_{n,q,φ} problem is equivalent to the (worst-case search) LWE_{n,q,φ} problem, up to a poly(n) factor in the number of samples used. In particular, this equivalence holds for any q that is a product of sufficiently large poly(n)-bounded primes.

Evidence for the hardness of LWE_{n,q,α} follows from results of Regev [27] and Peikert [23], who (informally speaking) showed that solving LWE_{n,q,α} (for appropriate parameters) is no easier than solving approximation problems on n-dimensional lattices in the worst case. More precisely, under the hypothesis that q ≥ ω(√n)/α, these reductions obtain Õ(n/α)-factor approximations for various worst-case lattice problems. (We refer the reader to [23, 27] for the details.)

Note that if q ≪ 1/α, then the LWE_{n,q,α} problem is trivially solvable because the exact value of each ⟨a, s⟩ can be recovered (with high probability) from its noisy version simply by rounding.

2.2 Leftover Hash Lemma

Informally, the leftover hash lemma [18] states that any universal hash function acts as a randomness extractor. We will use a variant of this statement applied to a particular universal hash function, i.e., matrix multiplication over ℤ_q. In particular:

Lemma 1. Let D be a distribution over ℤ_q^n with min-entropy k. For any ε > 0 and ℓ ≤ (k − 2 log(1/ε) − O(1))/log q, the joint distribution of (C, C · s), where C ← ℤ_q^{ℓ×n} is uniformly random and s ∈ ℤ_q^n is drawn from the distribution D, is ε-close to the uniform distribution over ℤ_q^{ℓ×n} × ℤ_q^ℓ.

2.3 Goldreich-Levin Theorem over ℤ_q

We also need a “computational version” of the leftover hash lemma, which is essentially a Goldreich-Levin type theorem over ℤ_q. The original Goldreich-Levin theorem proves that for every uninvertible function h, ⟨c, s⟩ (mod 2) is pseudorandom given h(s) and c, for a uniformly random c ∈ ℤ_2^n. Dodis et al. [11] (following the work of Goldreich, Rubinfeld and Sudan [15]) show the following variant of the Goldreich-Levin theorem over a large field ℤ_q.

Lemma 2. Let q be prime, and let H be any polynomial-size subset of ℤ_q. Let h : H^n → {0, 1}* be any (possibly randomized) function. Let C ← ℤ_q^{ℓ×n} be uniformly random and s be drawn from the distribution D over H^n. If there is a PPT algorithm A that distinguishes, with advantage ε, between (C, Cs) and the uniform distribution over the range, given h(s), then there is a PPT algorithm B that inverts h(s) with probability roughly ε/(q² · poly(n, 1/ε)).

In other words, if h is (roughly) 1/q²-hard to invert (by polynomial-time algorithms), then (C, C · s) is pseudorandom given h(s).
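The reason matrix multiplication over ℤ_q qualifies for the leftover hash lemma is that h_C(s) = C · s is a universal hash family: for any fixed pair s ≠ s′, a uniformly random C maps them to the same output with probability exactly q^{−ℓ}. A brute-force check of this collision probability, with tiny illustrative parameters (q = 5, n = 4, ℓ = 1) chosen only to make exhaustive enumeration feasible:

```python
from itertools import product

# h_C(s) = <C, s> mod q, with C a single row (ell = 1). For distinct binary
# secrets s1 != s2, count the C's that collide; universality predicts exactly
# a 1/q fraction, i.e. q^(n-1) out of q^n rows.
q, n = 5, 4
s1 = (1, 0, 1, 1)
s2 = (0, 1, 1, 0)
collisions = 0
total = 0
for C in product(range(q), repeat=n):
    total += 1
    if sum(c * a for c, a in zip(C, s1)) % q == sum(c * b for c, b in zip(C, s2)) % q:
        collisions += 1
print(collisions, total, collisions * q == total)
```

The exact count holds because, for prime q and a nonzero difference vector v = s1 − s2, the map C ↦ ⟨C, v⟩ mod q is a surjective linear map, so each output value is hit equally often.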

3 LWE with Weak Secrets

In this section, we show that if the modulus q is some super-polynomial function of n, then the LWE assumption with weak binary secrets (i.e., when the secret s is distributed according to an arbitrary distribution over {0,1}^n with sufficient min-entropy) follows from the standard LWE assumption. More specifically, let s ∈ {0,1}^n be any random variable with min-entropy k, and let q = q(n) ∈ 2^{ω(log n)} be any super-polynomial function of n. Then, we show that for every m = poly(n),

(A, As + x) ≈_c (A, u),

where A ∈_R ℤ_q^{m×n}, x ← Ψ_β^m and u ← ℤ_q^m. We show this statement under the (standard) LWE assumption with the following parameters: the secret size is ℓ ≜ (k − ω(log n))/log q, the number of examples remains m, and the standard deviation of the noise is any γ such that γ/β = negl(n). An example of a parameter setting that achieves the latter is β = q/poly(n) and γ = poly(n).

Theorem 4. Let n, q ≥ 1 be integers, let D be any distribution over {0,1}^n having min-entropy at least k, and let α, β > 0 be such that α/β = negl(n). Then for any ℓ ≤ (k − ω(log n))/log q, there is a PPT reduction from DLWE_{ℓ,q,α} to DLWE_{n,q,β}(D).

Remark 1. When converting the non-standard version of the LWE assumption to the standard LWE assumption we lose in two dimensions. The first is that the secret size becomes ℓ ≜ (k − ω(log n))/log q rather than n. This loss seems inevitable: in the non-standard assumption, although the secret s is in {0,1}^n, it has only k bits of entropy, so when converting this assumption to the standard LWE assumption we cannot hope to get a secret key with more than k bits of entropy, which results in a key size of at most k/log q. We remark that the dimension ℓ can be increased to O(k − ω(log n)) (thus making the underlying LWE assumption weaker) by considering secrets s over ℤ_q^n (rather than binary secrets). This modification makes the proof of the theorem a bit more cumbersome, and hence we choose not to present it.

Another dimension in which we lose is the error distribution. The standard deviation of the error distribution becomes γ, which is negligibly small compared to the original standard deviation β. This loss, which does not seem to be inherent and appears to be a byproduct of our proof, induces the restriction that q be super-polynomial in n.

Remark 2. The analogous statement for the search version of LWE follows immediately from Theorem 4 and the search-to-decision reduction for LWE. However, doing so naively involves relying on the assumption that LWE_{ℓ,m,q,γ} is hard (for the parameters ℓ, m, q and γ as above) for algorithms that run in super-polynomial time. The super-polynomial factor in the running time can be removed by using the more involved search-to-decision reduction of Peikert [23], which uses a modulus q of a special form.

3.1 Proof of Theorem 4

In the proof of Theorem 4 we rely on the following lemma, which was proven in [11].

Lemma 3. Let β > 0 and q ∈ ℤ.
1. Let y ← Ψ_β. Then with overwhelming probability, |y| ≤ βq·√n.
2. Let y ∈ ℤ be arbitrary. The statistical distance between the distributions Ψ_β and Ψ_β + y is at most |y|/(βq).

Proof (of Theorem 4). Fix any q, m, k, ℓ, β, γ as in the statement of the theorem. We define the distribution D_{γ,ℓ} as follows: let B ← ℤ_q^{m×ℓ} and C ← ℤ_q^{ℓ×n} be uniformly random, and let Z ← Ψ_γ^{m×n}. Output the matrix A′ = BC + Z. The following claim about the distribution D_{γ,ℓ} is then immediate:

Claim 1. Under the LWE_{ℓ,m,q,γ} assumption, A′ is computationally indistinguishable from uniform in ℤ_q^{m×n}.

Claim 1 implies that it suffices to prove that

(A′, A′s + x) ≈_c (A′, u).

Note that

A′s + x = (BC + Z)s + x = BCs + Zs + x.

Thus, it suffices to prove that

(BC + Z, BCs + Zs + x) ≈_c (BC + Z, u).

We prove the stronger statement that

(B, C, Z, BCs + Zs + x) ≈_c (B, C, Z, u).   (1)

The first item of Lemma 3 implies that with overwhelming probability each entry of Z is of size at most γq·√n. This implies that, with overwhelming probability (over Z), for every s ∈ {0,1}^n each coordinate of Zs is of size at most γq·n^{1.5}. Let x′ be a random variable distributed according to (Ψ_β)^m. Then, the second item of Lemma 3 implies that for every i ∈ [m] the statistical distance between (Z, s, x′_i) and (Z, s, (Zs)_i + x′_i) is at most γq·n^{1.5}/(βq). Using the fact that γ/β = negl(n), we conclude that (Z, s, x′_i) and (Z, s, (Zs)_i + x′_i) are statistically close, and thus that (Z, s, x′) and (Z, s, Zs + x′) are statistically close. Therefore, it suffices to prove that

(B, C, Z, BCs + x′) ≈_c (B, C, Z, u),

where x′ is drawn from Ψ_β^m. The fact that Z is efficiently sampleable implies that it suffices to prove that

(B, C, BCs + x′) ≈_c (B, C, u).

A standard application of the leftover hash lemma [18], using the fact that the min-entropy of s is at least ℓ log q + ω(log n), implies that

(C, Cs) ≈_s (C, u).

Thus, the LWE_{ℓ,m,q,β} assumption (which follows from the LWE_{ℓ,m,q,γ} assumption) immediately implies that

(B, C, BCs + x′) ≈_c (B, C, u),

as desired.
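The matrix identity at the heart of the proof can be checked numerically. The following toy sketch (illustrative parameters only, not a secure or faithful instantiation; `random.gauss` is a stand-in for the distributions Ψ_γ and Ψ_β) builds A′ = BC + Z and verifies that A′s + x equals B(Cs) + (Zs + x) mod q, i.e., that samples under the weak binary secret s can be re-read as ℓ-dimensional LWE samples with secret Cs and slightly shifted noise:

```python
import random

# Toy sketch (not cryptographically secure) of the reduction's main idea:
# replace the public matrix A by A' = B*C + Z (mod q), so that
# A'*s + x = B*(C*s) + (Z*s + x), an LWE instance whose secret is the
# l-dimensional vector C*s.  Parameter sizes are illustrative only.
q = 2**31 - 1                # toy modulus (the theorem needs q superpolynomial)
n, l, m = 16, 4, 8           # secret size, reduced dimension, sample count
gamma, beta = 3, 3 * 2**15   # toy noise magnitudes with gamma/beta tiny

def rand_matrix(rows, cols):
    return [[random.randrange(q) for _ in range(cols)] for _ in range(rows)]

def mat_vec(M, v):
    return [sum(Mij * vj for Mij, vj in zip(row, v)) % q for row in M]

def noise(std):
    return round(random.gauss(0, std)) % q

# weak binary secret s (here uniform; the theorem allows any
# distribution with sufficient min-entropy)
s = [random.randrange(2) for _ in range(n)]

B = rand_matrix(m, l)                                     # uniform in Z_q^{m x l}
C = rand_matrix(l, n)                                     # uniform in Z_q^{l x n}
Z = [[noise(gamma) for _ in range(n)] for _ in range(m)]  # small noise matrix

A_prime = [[(sum(B[i][k] * C[k][j] for k in range(l)) + Z[i][j]) % q
            for j in range(n)] for i in range(m)]

x = [noise(beta) for _ in range(m)]
b = [(Ai_s + xi) % q for Ai_s, xi in zip(mat_vec(A_prime, s), x)]

# The same samples, rewritten as an l-dimensional LWE instance with
# secret t = C*s and noise Z*s + x (statistically close to x alone):
t = mat_vec(C, s)
b_check = [(Bi_t + Zs_i + xi) % q
           for Bi_t, Zs_i, xi in zip(mat_vec(B, t), mat_vec(Z, s), x)]
assert b == b_check  # the two views coincide by construction
```

The assertion holds identically (not just with high probability): it is exactly the algebraic identity A′s + x = BCs + Zs + x reduced mod q; the probabilistic part of the proof is only that Zs + x is statistically close to fresh noise x′.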
The same technique used to prove Theorem 4, with the exception of using the Goldreich–Levin theorem over ℤ_q (i.e., Lemma 2) instead of the leftover hash lemma, shows that the LWE assumption holds even given auxiliary input h(s), where h is any uninvertible function (see below). In essence, the proof of the auxiliary-input theorem below proceeds by using Lemma 2 to extract from the "computational entropy" in the secret (as opposed to using the leftover hash lemma to extract from the "information-theoretic entropy").

Theorem 5. Let k ≥ ℓ log q, and let H be the class of all functions h : {0,1}^n → {0,1}^* that are 2^{−k}-hard to invert, i.e., given h(s), no PPT algorithm can find s with probability better than 2^{−k}. For any super-polynomial q = q(n), any m = poly(n), and any β, γ ∈ (0, q) such that γ/β = negl(n),

(A, As + x, h(s)) ≈_c (A, u, h(s)),

where A ← ℤ_q^{m×n}, s ← {0,1}^n and u ← ℤ_q^m are uniformly random and x ← Ψ_β^m, assuming the (standard) DLWE_{ℓ,m,q,γ} assumption, where ℓ ≜ (k − ω(log n))/log q.

4 Symmetric-Key Encryption Scheme

In this section we present a symmetric-key encryption scheme that is secure even if the secret key is distributed according to an arbitrary distribution with sufficient min-entropy, assuming the standard LWE assumption holds (for some setting of parameters). More precisely, if the secret key s ∈ {0,1}^n is distributed according to some distribution D with min-entropy k, then the assumption we rely on is that there exists a super-polynomial function q = q(n) and a parameter β = q/poly(n) for which DLWE_{n,q,β}(D) holds. According to Theorem 4, this assumption follows from the (standard) assumption LWE_{ℓ,q,γ}, where ℓ ≜ (k − ω(log n))/log q and γ is any function that satisfies γ/q = negl(n).

4.1 The Scheme

We next describe our encryption scheme 𝔼 = (G, E, D), which is very similar to schemes that appeared in previous work [4, 12].

• Parameters. Let q = q(n) be any super-polynomial function, let β = q/poly(n), and let m = poly(n).
• Key generation algorithm G. On input 1^n, G(1^n) outputs a uniformly random secret key s ← {0,1}^n.
• Encryption algorithm E. On input a secret key s and a message w ∈ {0,1}^m,

E_s(w) = (A, As + x + ⌊q/2⌋ · w),

where A ∈_R ℤ_q^{m×n} and x ← (Ψ_β)^m.
• Decryption algorithm D. On input a secret key s and a ciphertext (A, y), the decryption algorithm computes y − As and deciphers each coordinate i ∈ [m] separately, as follows: if the i'th coordinate of y − As is close to q/2, i.e., is between 3q/8 and 5q/8, then it deciphers the i'th coordinate of the message to 1. If the i'th coordinate of y − As is far from q/2, i.e., is smaller than q/8 or larger than 7q/8, then it deciphers the i'th coordinate of the message to 0. Otherwise, it outputs ⊥.

Remark. We note that, as opposed to most known leakage-resilient schemes, the parameters of this scheme do not depend on any amount of leakage (or min-entropy) that the scheme can tolerate. Nevertheless, using Theorem 4, we are able to prove that this scheme is secure w.r.t. weak keys, and in particular is leakage resilient. We emphasize the change in the order of quantifiers: rather than proving that for every leakage parameter there exists
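A toy instantiation of 𝔼 = (G, E, D) can be written directly from the description above. The sketch below uses illustrative, insecure parameters (a fixed small modulus, `random.gauss` as a stand-in for Ψ_β) purely to exercise the ⌊q/2⌋ encoding and the q/8 threshold logic:

```python
import random

# Toy sketch (not secure) of the scheme E = (G, E, D).  Message bits are
# encoded in the high-order part of each coordinate via floor(q/2), and
# decryption thresholds each coordinate of y - A*s at multiples of q/8.
q = 2**31 - 1
n, m = 32, 16
beta_std = q // 2**20        # toy noise standard deviation ("q/poly(n)")

def keygen():
    # G(1^n): uniformly random binary secret key
    return [random.randrange(2) for _ in range(n)]

def mat_vec(M, v):
    return [sum(a * b for a, b in zip(row, v)) % q for row in M]

def encrypt(s, w):
    A = [[random.randrange(q) for _ in range(n)] for _ in range(m)]
    x = [round(random.gauss(0, beta_std)) % q for _ in range(m)]
    y = [(As_i + x_i + (q // 2) * w_i) % q
         for As_i, x_i, w_i in zip(mat_vec(A, s), x, w)]
    return A, y

def decrypt(s, ct):
    A, y = ct
    out = []
    for d in ((yi - As_i) % q for yi, As_i in zip(y, mat_vec(A, s))):
        if 3 * q // 8 < d < 5 * q // 8:      # close to q/2 -> bit 1
            out.append(1)
        elif d < q // 8 or d > 7 * q // 8:   # close to 0   -> bit 0
            out.append(0)
        else:
            return None                      # the "bot" symbol
    return out

s = keygen()
w = [random.randrange(2) for _ in range(m)]
assert decrypt(s, encrypt(s, w)) == w        # correctness (Claim 2)
```

Decrypting with an independently chosen wrong key makes each coordinate of y − As essentially uniform, so with high probability some coordinate falls in a ⊥-zone; this is the wrong-key detection property used in Section 5.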

a scheme that is secure w.r.t. such leakage, we show that there exists a single scheme that is secure w.r.t. any leakage parameter, under a standard assumption whose parameters depend on the leakage parameter. We note that the only other encryption scheme known to have this property is based on a non-standard version of the DDH assumption, which says that the DDH assumption holds even if one of its secrets is chosen from an arbitrary distribution with sufficient min-entropy [7, 9].

We first prove the correctness of our scheme.

Claim 2. There exists a negligible function µ such that for every n and every w ∈ {0,1}^m,

Pr[D_s(E_s(w)) = w] = 1 − µ(n).

Proof. Fix any n and any message w ∈ {0,1}^m. Then

Pr[D_s(E_s(w)) = w]
= Pr[∀i ∈ [m], (D_s(E_s(w)))_i = w_i]
= Pr[∀i ∈ [m], (D_s(A, As + x + ⌊q/2⌋ · w))_i = w_i]
= Pr[∀i ∈ [m], x_i ∈ (−q/8, q/8)]
≥ 1 − m · Pr[x_i ∈ (q/8, 7q/8)]
= 1 − negl(n).

We next argue the security of this scheme.

Definition 2. We say that a symmetric encryption scheme is CPA secure w.r.t. k(n)-weak keys if for any distribution {D_n}_{n∈ℕ} with min-entropy k(n), the scheme is CPA secure even if the secret key is chosen according to the distribution D_n.

Theorem 6. For any k = k(n), the encryption scheme 𝔼 = (G, E, D) is CPA secure with k(n)-weak keys, under the (standard) DLWE_{ℓ,q,γ} assumption, where ℓ = (k − ω(log n))/log q and where γ satisfies γ/q = negl(n).

In order to prove Theorem 6, we use the following notation: for any distribution D = {D_n}_{n∈ℕ} we denote by G_D the key generation algorithm that samples the secret key s according to the distribution D. Namely, G_D(1^n) outputs s ← D_n.
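To make the change in quantifiers concrete: the scheme itself is fixed, and only the key distribution D_n varies. A minimal sketch of G_D for one assumed distribution with min-entropy k (here, hypothetically, a key whose first n − k bits have leaked and are known to be 0; any other distribution with min-entropy k would do):

```python
import random

# Toy sketch of the key-generation variant G_D: the secret key is drawn
# from an assumed weak distribution D_n instead of uniformly.  This D_n
# fixes the first n - k bits to 0 and leaves k uniform bits, so its
# min-entropy is exactly k.  The encryption/decryption algorithms E and D
# are completely unchanged.
n, k = 32, 20

def keygen_D():
    return [0] * (n - k) + [random.randrange(2) for _ in range(k)]

s = keygen_D()
assert len(s) == n and all(b == 0 for b in s[: n - k])
```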
t×n t Pr[(Ds(Es(w))) = w]= where A ∈R ℤq , s ←Dn and x ← (Ψβ) .This D Pr[∀i ∈ [m], (Ds(Es(w)))i = wi]= will contradict the DLWEn,q,β( ) assumption. Al- gorithm B simply emulates A’s oracle. Every time ∀ ∈ q Pr[ i [m], (Ds(A, As + x + w))i = wi]= that A asks for an encryption of some message w,the    2 q q algorithm B takes m fresh rows of his input, denoted ∀ ∈ ∈ − ≥ q Pr i [m], xi , by (A, y) and feeds A the ciphertext (A, y + w).   8 8 2 q 7q We choose t so that t/m is larger than the number of 1 − m Pr xi ∈ , = A 8 8 oracle queries that makes. Note that if the input of B is an LWE instance then B perfectly simulates the 1 − negl(n). encryption oracle. On the other hand, if the input of B is random, then B perfectly simulates the random oracle. Thus, Equation (2) indeed holds. We next argue the security of this scheme. Using Theorem 5 and a proof along similar lines as Definition 2. We say that a symmetric encryption above, this scheme can also be shown to be secure scheme is CPA secure w.r.t. k(n)-weak keys,iffor against uninvertible auxiliary inputs. any distribution {Dn}n∈ℕ with min-entropy k(n),the scheme is CPA secure even if the secret key is chosen D 5 Point Function Obfuscation with according to the distribution n. Multibit Output Theorem 6. For any k = k(n), the encryption In this section we consider the task of obfuscating scheme 𝔼 =(G, E, D) is CPA secure with k(n)- the class of point functions with multibit output (or weak keys, under the (standard) DLWE,q,γ assump- k−ω(log n) multi-bit point functions (MBPF) for short), tion, where = log q and where γ satisfies γ/q = ∗ negl(n). I = {I(k,m) | k, m ∈{0, 1} }, In order to prove Theorem 6, we use the follow- { }∗ ∪{⊥}→{ }∗ ∪⊥ D {D } where each I(k,m) : 0, 1 0, 1 is ing notation: For any distribution = n n∈ℕ we defined by denote by GD the key generation algorithm that sam- ples the secret key s according to the distribution D. m if x = k n I k,m (x)= Namely, GD(1 ) outputs s ←Dn. 
( ) ⊥ otherwise Theorem 4 implies that in order to prove Theo- rem 6 it suffices to prove the following lemma, which Namely, I(k,m) outputs the message m given a correct states that if the secret key is samples according to key k,and⊥ otherwise. some distribution D, rather than sampled uniformly, We show that our encryption scheme implies a then the scheme is secure under the (non-standard) (weak) obfuscator for the class of multi-bit point DLWEn,q,β(D) assumption. functions. To this end, we use a recent result of 238 ROBUSTNESS OF THE LEARNING WITH ERRORS ASSUMPTION

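The oracle simulation performed by B in the proof of Lemma 4 is mechanical. The sketch below (toy parameters, hypothetical helper names) carves the t rows of the challenge into per-query ciphertexts by adding ⌊q/2⌋·w to m fresh rows per query:

```python
import random

# Toy sketch of algorithm B from the proof of Lemma 4: given a challenge
# (A, y) consisting of t = m * num_queries rows, answer each encryption
# query with m fresh rows, adding floor(q/2)*w to embed the message.
q, n, m = 2**31 - 1, 32, 8
num_queries = 3
t = m * num_queries

def simulate_oracle(challenge, queries):
    A, y = challenge
    ciphertexts = []
    for i, w in enumerate(queries):
        rows = slice(i * m, (i + 1) * m)          # m fresh rows per query
        A_i, y_i = A[rows], y[rows]
        c = [(yj + (q // 2) * wj) % q for yj, wj in zip(y_i, w)]
        ciphertexts.append((A_i, c))
    return ciphertexts

# If (A, y) is an LWE challenge (y = A*s + x), each answer is a genuine
# ciphertext E_s(w); if y is uniform, each answer is uniformly random.
A = [[random.randrange(q) for _ in range(n)] for _ in range(t)]
y = [random.randrange(q) for _ in range(t)]       # the "random" branch
queries = [[random.randrange(2) for _ in range(m)]
           for _ in range(num_queries)]
cts = simulate_oracle((A, y), queries)
assert len(cts) == num_queries and all(len(c) == m for _, c in cts)
```

Because each query consumes fresh rows, the simulation is perfect in both branches, which is what makes Equation (2) an equality of advantages rather than a loose bound.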
5 Point Function Obfuscation with Multibit Output

In this section we consider the task of obfuscating the class of point functions with multibit output (or multi-bit point functions (MBPF) for short),

I = {I_{(k,m)} | k, m ∈ {0,1}^*},

where each I_{(k,m)} : {0,1}^* ∪ {⊥} → {0,1}^* ∪ {⊥} is defined by

I_{(k,m)}(x) = m if x = k, and ⊥ otherwise.

Namely, I_{(k,m)} outputs the message m given the correct key k, and ⊥ otherwise.

We show that our encryption scheme implies a (weak) obfuscator for the class of multi-bit point functions. To this end, we use a recent result of Canetti et al. [9] that shows a tight connection between encryption schemes with weak keys and a weak form of obfuscation of multi-bit point functions. The obfuscation definition they consider is a distributional one: rather than requiring the obfuscation to be secure w.r.t. every function in the class, as defined in the seminal work of [5], security is required only when the functions are distributed according to a distribution with sufficient min-entropy.

Definition 3 (Obfuscation of Point Functions with Multi-bit Output [9]). A multi-bit point function (MBPF) obfuscator is a PPT algorithm O which takes as input values (k, m) describing a function I_{(k,m)} ∈ I and outputs a circuit C.

Correctness: For all I_{(k,m)} ∈ I with |k| = n, |m| = poly(n), and all x ∈ {0,1}^n,

Pr[C(x) ≠ I_{(k,m)}(x) | C ← O(I_{(k,m)})] ≤ negl(n),

where the probability is taken over the randomness of the obfuscator algorithm.

Polynomial Slowdown: For any k, m, the size of the circuit C = O(I_{(k,m)}) is polynomial in |k| + |m|.

Entropic Security: We say that the obfuscator has α(n)-entropic security if for any PPT adversary A with 1-bit output and any polynomial ℓ(·), there exists a PPT simulator S such that for every distribution {X_n}_{n∈ℕ}, where X_n takes values in {0,1}^n and H_∞(X_n) ≥ α(n), and for every message m ∈ {0,1}^{ℓ(n)},

|Pr[A(O(I_{(k,m)})) = 1] − Pr[S^{I_{(k,m)}(·)}(1^n) = 1]| ≤ negl(n),

where the probability is taken over the randomness of k ← X_n, the randomness of the obfuscator O, and the randomness of A and S.

Applying the work of [9] to our encryption scheme gives us the stronger notion of self-composable obfuscation, defined below.

Definition 4 (Composability [9]). A multi-bit point function obfuscator O with α(n)-entropic security is said to be self-composable if for any PPT adversary A with 1-bit output and any polynomial ℓ(·), there exists a PPT simulator S such that for every distribution {X_n}_{n∈ℕ} with X_n taking values in {0,1}^n and H_∞(X_n) ≥ α(n), and for every m_1, …, m_t ∈ {0,1}^{ℓ(n)},

|Pr[A(O(I_{(k,m_1)}), …, O(I_{(k,m_t)})) = 1] − Pr[S^{I_{(k,m_1)}(·),…,I_{(k,m_t)}(·)}(1^n) = 1]| ≤ negl(n),

where the probabilities are over k ← X_n and over the randomness of A, S and O.

Keeping Definitions 3 and 4 in mind, we can finally state our result.

Theorem 7. There exists an obfuscator for the class of point functions with multibit output which is self-composable α(n)-entropic secure under the DLWE_{ℓ,q,γ} assumption, where q = q(n) is super-polynomial, ℓ = (α(n) − ω(log n))/log q, and γ/q is negligible in n.

As was mentioned above, the proof of this theorem follows from the tight connection between encryption schemes that are secure w.r.t. weak keys and multibit point function obfuscation. To state this connection formally, we need the following definition.

Definition 5 (Wrong-Key Detection [9, 12]). We say that an encryption scheme 𝔼 = (G, E, D) satisfies the wrong-key detection property if for all k ≠ k′ ∈ {0,1}^n and every message m ∈ {0,1}^{poly(n)},

Pr[D_{k′}(E_k(m)) ≠ ⊥] ≤ negl(n).

Lemma 5 ([9]). Let 𝔼 = (G, E, D) be an encryption scheme with CPA security for α(n)-weak keys and the wrong-key detection property. Define the obfuscator O which, on input I_{(k,m)}, computes a ciphertext c = E_k(m) and outputs the circuit C_c(·) (with hard-coded ciphertext c) defined by C_c(x) = D_x(c). Then O is a self-composable multi-bit point function obfuscator with α(n)-entropic security.

The proof of Theorem 7 follows immediately from Lemma 5, Theorem 6, and the following claim.

Claim 3. The encryption scheme 𝔼 = (G, E, D) defined in Section 4 has the wrong-key detection property.

5.1 Proof of Claim 3

Let s, s′ ∈ {0,1}^n be any two distinct secret keys, and let w ∈ {0,1}^m be any message. Then

Pr[D_{s′}(E_s(w)) ≠ ⊥] = Pr[A(s − s′) + x ∈ ((−q/8, q/8) ∪ (3q/8, 5q/8))^m],

where A ∈_R ℤ_q^{m×n} and x ← (Ψ_β)^m (the shift by ⌊q/2⌋ · w is absorbed, since the set is invariant under shifting each coordinate by q/2). The fact that s ≠ s′ implies that the vector A(s − s′) is uniformly distributed in ℤ_q^m, and thus the vector A(s − s′) + x is uniformly distributed in ℤ_q^m. This implies that

Pr[A(s − s′) + x ∈ ((−q/8, q/8) ∪ (3q/8, 5q/8))^m] ≤ (1/2)^m = negl(n),

as desired.
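Combining the scheme of Section 4 with the construction in Lemma 5 gives a concrete (toy) MBPF obfuscator. The sketch below reuses the illustrative encryption routines from before (insecure parameters, `random.gauss` standing in for Ψ_β) and represents the output circuit C_c as a closure over the ciphertext:

```python
import random

# Toy sketch of the obfuscator from Lemma 5: O(I_{(k,m)}) is just a
# ciphertext c = E_k(m); evaluating the obfuscation on input x means
# decrypting c under x.  Insecure illustrative parameters, as before.
q, n, m_len = 2**31 - 1, 32, 64
beta_std = q // 2**20

def mat_vec(M, v):
    return [sum(a * b for a, b in zip(row, v)) % q for row in M]

def encrypt(key, w):
    A = [[random.randrange(q) for _ in range(n)] for _ in range(m_len)]
    x = [round(random.gauss(0, beta_std)) % q for _ in range(m_len)]
    y = [(ai + xi + (q // 2) * wi) % q
         for ai, xi, wi in zip(mat_vec(A, key), x, w)]
    return A, y

def decrypt(key, ct):
    A, y = ct
    out = []
    for d in ((yi - ai) % q for yi, ai in zip(y, mat_vec(A, key))):
        if 3 * q // 8 < d < 5 * q // 8:      # close to q/2 -> bit 1
            out.append(1)
        elif d < q // 8 or d > 7 * q // 8:   # close to 0   -> bit 0
            out.append(0)
        else:
            return None                      # bot: wrong-key detection
    return out

def obfuscate(k, msg):
    c = encrypt(k, msg)
    return lambda x: decrypt(x, c)           # the circuit C_c

k = [random.randrange(2) for _ in range(n)]
msg = [random.randrange(2) for _ in range(m_len)]
C = obfuscate(k, msg)
assert C(k) == msg                           # correct key recovers m
k_wrong = [random.randrange(2) for _ in range(n)]
assert C(k_wrong) is None or k_wrong == k    # wrong key yields bot (w.h.p.)
```

With a wrong key, each coordinate of y − A·x is essentially uniform in ℤ_q, so the chance that all m coordinates land in decodable zones is about 2^{−m}; with m = 64 a wrong key is rejected except with negligible probability, matching Claim 3.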

References

[1] A. Akavia, S. Goldwasser, and V. Vaikuntanathan. Simultaneous hardcore bits and cryptography against memory attacks. In TCC, pages 474–495, 2009.
[2] M. Alekhnovich. More on average case vs approximation complexity. In FOCS, pages 298–307, 2003.
[3] J. Alwen, Y. Dodis, and D. Wichs. Leakage-resilient public-key cryptography in the bounded-retrieval model. In CRYPTO, pages 36–54, 2009.
[4] B. Applebaum, D. Cash, C. Peikert, and A. Sahai. Fast cryptographic primitives and circular-secure encryption based on hard learning problems. In CRYPTO, pages 595–618, 2009.
[5] B. Barak, O. Goldreich, R. Impagliazzo, S. Rudich, A. Sahai, S. P. Vadhan, and K. Yang. On the (im)possibility of obfuscating programs. In CRYPTO, pages 1–18, 2001.
[6] V. Boyko. On the security properties of OAEP as an all-or-nothing transform. In CRYPTO, pages 503–518, 1999.
[7] R. Canetti. Towards realizing random oracles: Hash functions that hide all partial information. In CRYPTO, pages 455–469, 1997.
[8] R. Canetti, Y. Dodis, S. Halevi, E. Kushilevitz, and A. Sahai. Exposure-resilient functions and all-or-nothing transforms. In EUROCRYPT, pages 453–469, 2000.
[9] R. Canetti, Y. Kalai, M. Varia, and D. Wichs. On symmetric encryption and point function obfuscation, 2009. Manuscript in submission.
[10] D. Coppersmith. Finding a small root of a bivariate integer equation; factoring with high bits known. In EUROCRYPT, pages 178–189, 1996.
[11] Y. Dodis, S. Goldwasser, Y. Kalai, C. Peikert, and V. Vaikuntanathan. Public-key encryption schemes with auxiliary inputs, 2009. Manuscript in submission.
[12] Y. Dodis, Y. T. Kalai, and S. Lovett. On cryptography with auxiliary input. In STOC, pages 621–630, 2009.
[13] Y. Dodis, S. J. Ong, M. Prabhakaran, and A. Sahai. On the (im)possibility of cryptography with imperfect randomness. In FOCS, pages 196–205, 2004.
[14] S. Dziembowski and K. Pietrzak. Leakage-resilient cryptography. In FOCS, pages 293–302, 2008.
[15] O. Goldreich, R. Rubinfeld, and M. Sudan. Learning polynomials with queries: The highly noisy case. SIAM J. Discrete Math., 13(4):535–570, 2000.
[16] S. Goldwasser, Y. T. Kalai, and G. N. Rothblum. One-time programs. In CRYPTO, pages 39–56, 2008.
[17] N. Heninger and H. Shacham. Reconstructing private keys from random key bits. In CRYPTO, pages 1–17, 2009.
[18] R. Impagliazzo, L. A. Levin, and M. Luby. Pseudo-random generation from one-way functions (extended abstracts). In STOC, pages 12–24, 1989.
[19] Y. Ishai, A. Sahai, and D. Wagner. Private circuits: Securing hardware against probing attacks. In CRYPTO, pages 463–481, 2003.
[20] J. Katz and V. Vaikuntanathan. Signature schemes with bounded leakage. In ASIACRYPT, pages ??–??, 2009.
[21] S. Micali and L. Reyzin. Physically observable cryptography (extended abstract). In TCC, pages 278–296, 2004.
[22] M. Naor and G. Segev. Public-key cryptosystems resilient to key leakage. In CRYPTO, pages 18–35, 2009.
[23] C. Peikert. Public-key cryptosystems from the worst-case shortest vector problem: extended abstract. In STOC, pages 333–342, 2009.
[24] C. Peikert, V. Vaikuntanathan, and B. Waters. A framework for efficient and composable oblivious transfer. In CRYPTO, pages 554–571, 2008.
[25] C. Petit, F.-X. Standaert, O. Pereira, T. Malkin, and M. Yung. A block cipher based pseudo random number generator secure against side-channel key recovery. In ASIACCS, pages 56–65, 2008.
[26] K. Pietrzak. A leakage-resilient mode of operation. In EUROCRYPT, pages 462–482, 2009.
[27] O. Regev. On lattices, learning with errors, random linear codes, and cryptography. In STOC, pages 84–93, 2005.
[28] R. L. Rivest. All-or-nothing encryption and the package transform. In FSE, pages 210–218, 1997.
[29] R. L. Rivest and A. Shamir. Efficient factoring based on partial information. In EUROCRYPT, pages 31–34, 1985.