<<

Practical Applications of Homomorphic

Michael Brenner, Henning Perl and Matthew Smith Distributed Computing Security Group, Leibniz Universitt Hannover, Hannover, Germany

Keywords: Homomorphic Encryption, Private Information Retrieval, Encrypted Search.

Abstract: Homomorphic has been one of the most interesting topics of mathematics and computer security since Gentry presented the first construction of a fully homomorphic encryption (FHE) scheme in 2009. Since then, a number of different schemes have been found, that follow the approach of bootstrapping a fully homo- morphic scheme from a somewhat homomorphic foundation. All existing implementations of these systems clearly proved, that fully homomorphic encryption is not yet practical, due to significant performance limita- tions. However, there are many applications in the area of secure methods for cloud computing, distributed computing and delegation of computation in general, that can be implemented with homomorphic encryption schemes of limited depth. We discuss a simple algebraically homomorphic scheme over the integers that is based on the factorization of an approximate semiprime integer. We analyze the properties of the scheme and provide a couple of known protocols that can be implemented with it. We also provide a detailed discussion on searching with encrypted search terms and implementations and performance figures for the solutions discussed in this paper.

1 INTRODUCTION discuss an algebraically homomorphic scheme and show for a couple of problems of practical relevance, Fully homomorphic encryption fired many people’s how these can be solved by a surprisingly small num- imagination in the field of distributed computing se- ber of operations on encrypted values. We exem- curity. Architectures have been proposed and many plarily discuss solutions to the Millionaires’ Problem, application scenarios have been identified that can one-round Oblivious Transfer and oblivious memory benefit from FHE. Encrypted online storage, secure access based on the homomorphic scheme. We also delegation of confidential computation and even pri- discuss searching over encrypted data with encrypted vacy for searching the web: the Cloud was about to search terms. Since this is a very important opera- turn secure. Unfortunately, all implementations of tion in distributed environments, we present our so- fully homomorphic encryption schemes showed, that lution to this in more detail in a separate section. We this technique is still much too slow for practical ap- show a delegationscheme, where the remote party op- plications. erates with encrypted arguments on public data and The most important property of FHE is un- generates encrypted results.This is useful when pos- limited chaining of algebraic operations in the ci- ing search requests to a public database while main- pherspace, which means that an arbitrary number of taining confidentiality of request and response. additions and multiplications can be applied to en- The paper is structured as follows: Section 2 gives crypted operands. To achieve this, an FHE scheme a summary of the current status of homomorphic must provide a mechanism to reduce the noise of ci- cryptography. Section 3 outlines a somewhat homo- pher values, because these schemes are based on a morphic encryption scheme and analyses its proper- slightly inaccurate representation of the plaintext val- ties in detail. Section 4 introduces a selection of algo- ues. Every single operation on a causes rithmic primitives that can be secured using our ho- even lower accuracy and eventually, the ciphertext can momorphic scheme. Section 5 introduces a constant- no longer be properly decrypted. depth approach to encrypted searching. In Section 6 This paper focuses on somewhat homomorphic we present details and performance figures of our im- encryption, where no re-encryption is required but plementation. Section 7 concludes the paper. only a limited number of operations is possible. We

Brenner M., Perl H. and Smith M.. 5 Practical Applications of Homomorphic Encryption. DOI: 10.5220/0003969400050014 In Proceedings of the International Conference on Security and Cryptography (SECRYPT-2012), pages 5-14 ISBN: 978-989-8565-24-2 Copyright c 2012 SCITEPRESS (Science and Technology Publications, Lda.) SECRYPT2012-InternationalConferenceonSecurityandCryptography

2 RELATED WORK • the bit length ρ of the message space, defined as λ − η. Since the breakthrough work of (Gentry, 2009), a number of similar approaches to fully homomorphic 3.1 The Basic Construction encryption appeared, like (Smart and Vercauteren, 2010) or slightly different approaches like (Braker- Our scheme E is defined as a tuple {P,C,K,E,D,⊕,⊗} ski and Vaikuntanathan, 2011). Performance figures where the elements denote the following: of actual implementations (Brenner et al., 2011) and P is the plaintext space and contains elements applications of FHE show that these systems can be from N+ limited by the prime integer p of order 2λ used for small problems only. Due to the computa- such that for two plaintext operands a,b ∈ NP,a b < p η tional overhead of current FHE schemes, the ques- and NP := {x|x < 2 }. tion arises, if the underlyingSHE schemes can also be C is the ciphertext space and contains elements used for more practical homomorphicencryption. Re- from N+. cent proposals, like (Naehrig et al., 2011) follow this K is the generator. The secret key is a large approach. However, the fully homomorphic encryp- prime integer p, the auxiliary compression argument tion is still subject to progress in terms of new consid- is d with d ← 2s + rp with r ∈ N+ and s ∈ NC with ∀x ∈ erations of hardness assumptions (Stehl and Steinfeld, NC,∀y ∈ NP,2x < y (see compactness). 2010) or conceptual simplicity (Coron et al., 2011). E is the encryption function. We encrypt a bit There are different paradigms for secure delega- value b by picking an integer a with a ≡ b mod 2 and tion of computation like secure function evaluation adding a random even or odd multiple of the prime (SFE) mostly based on Yao's Garbled Circuits (Yao, modulus, such that a′ = a +(rp). If r is composite, it 1982) and extensions by (Malkhi et al., 2004) or must contain at least one large prime factor of order λ (Kolesnikov et al., 2009b). Garbled circuits have 2 . It is mandatory that a ∈ N+, i.e. the encryption also been combined with homomorphicencryption by must add noise (see below). (Gentry et al., 2010) and (Kolesnikov et al., 2009a) to D is the decryption function. The decrypted result overcome their inherent disadvantage of of being lim- is the remainder of a ciphertext modulo the prime key ited to static one-pass boolean circuits. p: a = a′ mod p. A theoretical approach to achieve privacy of mem- ⊕ is the addition in ciphertext space. Due to the ory access patterns and algorithm execution in a spe- cipher structure, the addition is performed as an ordi- cial type of Turing Machines is the Oblivious Random nary arithmetic addition. The scheme is mixed addi- Access Machine (ORAM) by (Goldreich, 1987) (Gol- tive. dreich and Ostrovsky, 1996). There are recent propos- ⊗ is the multiplication in ciphertext space. Like als to reduce the complexity of ORAMs by Pinkas et the addition, the multiplication is performed as an al. (Pinkas and Reinman, 2010) and further devel- ordinary arithmetic multiplication. The scheme is opments towards practical applications by (Damgrd mixed multiplicative. et al., 2011) and (Goodrich and Mitzenmacher, 2011). Section 4 outlines how to achieve oblivious memory In this scheme, the positive plaintext value is also access with our scheme. the noise for the ciphertext because it additively in- terferes with the product of the prime factors that en- crypt it. In order to obtain a probabilistic encryption 3 ASOMEWHAT scheme, we perform parity (mod 2) arithmetics, i.e. HOMOMORPHIC a plaintext bit is encoded in a random integer of the same parity. Notice that the encryption does not re- ENCRYPTION SCHEME veal the parity of the plaintext integer, because the encryption function E picks at random even or odd This section describes the encryption scheme and its multiples r of the key p. This effectively allows us to properties correctness, security and compactness. Our hide the parity of the plaintext and to encode binary somewhat homomorphic encryption scheme E de- information. pends on the following parameters: • the security parameter λ, 3.2 Correctness η • the bit length of a cipher’s initial noise, We show the correctness of the encryption and de- • the modulus p which is a large prime integer of cryption, as well as the homomorphic operations in λ order 2 , the following lemmas, assuming that p ∈ N+ be a

6 PracticalApplicationsofHomomorphicEncryption

large prime integer as a secret key and a and b be two that efficiently computes p from any cipher a′. + arbitrary positive integers with a,b ≪ p ∈ N . That means, that Aex is able to extract p from any inte- gercomputedbya term ofthe form a+ pq for arbitrary Lemma 1. The encryption scheme E is mixed additive a, p, and thus can be applied in a function and the addition is correct if a + b < p. B f ac(i) : p ← Aex(0 + i) a′ Proof. We perform the encrypted addition as ( ⊕ to factorize arbitrary composite integers (pq) that b′ a′ b′ a r p b r p ) which extends to ( ⊕ )=( + 1 )+( + 2 ) = can be trivially expressed as 0 + pq.  a + b +(r1 + r2)p and decrypted mod p yields (a + b). Now we show the security of the encryption ′ The mixed additive operation is defined as a ⊕ b = scheme against an attempt of A to compute the plain-  (a + rp) + b = a + rp + b mod p = a + b. text value of a cipher. We achieve this by showing in Lemma 2. The encryption scheme E is mixed multi- addition to Lemma 4 an IND-CPA equivalent prop- plicative and the multiplication is correct if a ∗ b < p. erty of the scheme, which means that any two cipher- texts are computationally indistinguishable by A. Proof. We perform an encrypted multiplication ′ ′ Lemma 5. The security of the encryption scheme as a ⊗ b =(a + r1 p)(b + r2 p) = ab + a(r2 p) + b(r1 p) + 2 is IND-CPA equivalent and the success probability (r1r2)p mod p =(ab). The mixed multiplicative oper- |Pr[Exp = 1] − 1 | is negligible in λ for any A ation is defined as a′ ⊗b =(a+rp)b = ab+brp mod p = ind−cpa 2 from PPT (probabilistic polynomial time) function. ab.  Proof. Since A has no public key for encryption, Lemma 3. The encryption scheme E is capable of we provide a probabilistic encryption oracle O that ρ η Enc computing a sequence of at least log2 − log2 arith- takes a plaintext as input and generates a ciphertext metic operations. under the secret key. Consider the following experi- Proof. The choice of the factors r and p also de- ment in the message space M : λ fines the message space, such that the bit length of Expind−cpa= { the message space ρ is λ − η. The noise of a cipher {m0,m1} ← M R grows by at most one bit per addition (the carry), so i ← {0,1} the minimum number of additions is ρ. The mini- c ← OEnc(mi) mum number of subsequent multiplications is defined iA ← Aind−cpa(c,{m0,m1}) ρ η 1 if i = i as log2 − log2 , assuming that ciphers with the max- return A 0 otherwise imum noise are multiplied, since this can produce at  most η2n bits of noise after n multiplications.  }

The encryption function equivalent oracle OEnc is 3.3 Security defined as follows: Oλ,p Enc(m)= { R η As shown above, our simple encryption scheme is al- a ← 2[2 ] R λ gebraically correct. This subsection discusses con- r ← [2 ] fidentiality of the encrypted data and the security of c ← (m mod 2) + a + rp the encryption scheme. The results of this subsection return c lead to the choice of appropriate parameters. The at- } tack model focuses an attacker who is in possession of a ciphertext. Since we apply symmetric encryption The parity of the outputof the oracle OEnc(m) has a only, the attacker has no public key. By reduction to discrete uniform distribution, such that Pr[OEnc(m) ≡2 O 1 N+ integer factorization we show the security of the en- 0] = Pr[ Enc(m) ≡2 1] = 2 for any integer m ∈ , i.e. cryption scheme against the computation of the secret the encryption function does not leak any parity infor- mation. It is sufficient to show that Pr[OEnc(m) ≡2 1] = key from a ciphertext. 1 2 . Lemma 4. Let the parameters (p,q) ∈ N+ be of or- Pr[OEnc(m) =2 1] = Pr[(a + r p ≡2 1] λ der 2 . Any Attack A running in time polynomial in λ Pr[a ≡2 1] Pr[r p ≡2 0]+Pr[a ≡2 0]Pr[r p ≡2 1] against the encryption scheme can be converted into =0.5 =0 =0.5 an algorithm B for solving the integer factorization 1 1 | {z(Pr[a}≡ |1]) ={z } | {z } of any composite integer number (pq) running in time 2 2 2 polynomial in λ. =1 This leads to the conclusion that are in- λ A | {z } Proof. Consider a poly( ) time function ex distinguishable by A, even if he is allowed to provide ′ p ← Aex(a ) known plaintext to the encryption function. 

7 SECRYPT2012-InternationalConferenceonSecurityandCryptography

3.4 Compactness 4.1 One-round k-Bit Oblivious Transfer

The length of a ciphertext grows exponential in the Bob has two arguments and wants to share one of number of multiplications applied to it. In order to them with Alice. Alice is allowed to pick one ar- obtain compact ciphertexts, i.e. the length of a cipher- gument but doesn’t want to reveal to Bob which one text is independentfrom the number of operations, we she chooses. On the other hand, Bob wants to keep suggest generating an encryption of 0 as an auxiliary the other argument secret. Using our homomorphic compression argument d that can be used in a reduc- scheme, we can solve this problem for arbitrary k-bit tion procedure to limit the size of a ciphertext. The Strings in a single request-reply interaction. This is security of this delimiter is subject to Lemma 4. the protocol: Lemma 6. The operation a′′ ← a′ mod (2s + rp) re- 1. Alice has a ∈ {0,1} and the secret key SK ′ duces the bit length of the ciphertext a . The reduction k 2. Bob has m0,1 ∈ {0,1} , two k-bit strings is correct for any 2s ∈ N+ < a, i.e. the parity of the η N+ original plaintext a encrypted in a′ is preserved in a′′. 3. Alice picks at random a -bit integer t ∈ such ′ The length of a reduced cipher is |2s + rp| = 2λ. that t ≡ a mod 2 and encrypts a ← E(t)SK ′ + ′′ ′ ′′ ′ a′ 4. Bob picks at random η-bit integers m ∈ N such Proof. a ← a mod d ⇔ a ← a −⌊ ⌋d, d = 2s+rp i j d ′ ′′ ′ a+r1 p that m ≡ m mod 2 for i ∈ {0,1} and 0 ≤ j ≤ (k−1). ⇒ a = a − ⌊ ⌋ (2s + r p); I : a + r p < 2s + r p i j i j 2s+r2 p 2 1 2 a r p He computes the (selected) output bits s ← ((m ⇒ a′′ = a′; II : a + r p ≥ 2s + r p ⇒ n = ⌊ + 1 ⌋,n ≥ 1; j 0 j 1 2 2s+r2 p ′ ′ (a + 1))+(m1 j a )) for 0 ≤ j ≤ (k − 1) δ = n (2s + r2 p) = 2ns + nr2 p ′′ 5. Alice decrypts the selected bit string by comput- ⇒ a ← a + r1 p − 2ns − nr2 p mod p ≡2 a ′ ing s j ← D(s j)SK mod 2 for 0 ≤ j ≤ (k − 1) even mod p=0 mod p=0 The multiplicative depth of this protocol is 1. Bob a|{z}′ |{z} δ |{z} determines the result value by applying a binary se- The reduced| result{z a′′} decrypted| {z mod} p has the same lector over the two bit strings, addressed by Alice’s s m parity as a.  binary selection argument. He computes ← ( 0 ∧ ¬a)⊕(m1 ∧a) by evaluating logic AND-operations af- 3.5 Choice of Parameters ter transferring them into an arithmetic representation.

As we showed above, the security of our scheme is 4.2 Oblivious Memory Access essentially based on the common assumption, that the factorization of a large integer is hard. So the product In this scenario, we have a larger number of mes- of the prime key p and the factor r must be sufficiently sages to choose from, so this problem can be solved large. Taking the results of the former RSA-challenge by applying a log2n-bit demultiplexer for n messages. into account, we suggest factors with a size ¿512 bits. The algorithm for this purpose is quite similar to a memory circuit: the switching function of the demux A reasonable configuration with a factor size of λ = that selects a memory item m ∈ {m ,m ,m ,m } ad- 1024 bits and an initial noise of η = 8 bits would allow 0 1 2 3 dressed by a ∈ {(00),(01),(10),(11)} is given as m = for a sequence of log2λ − log2η = 7 multiplications. Small factors. The encryption function must not (¬a0 ∧¬a1 ∧ m0) ∨ (a0 ∧¬a1 ∧ m1) ∨ (¬a0 ∧ a1 ∧ m2) ∨ a a m . Consider the following protocol: issue a ciphertext with a = 0 in order to avoid small ( 0 ∧ 1 ∧ 3) factors: if it generated two different representations 1. Bob has 4 items m0..m3 ∈ {0,1} ′ ′ of 0 with a1 = 0 + r1 p and a2 = 0 + r2 p, the adver- 2. Alice has an argument a ∈ {0,1}2 and the secret sary could easily generate a positive composite inte- key SK ger b′ = |(a′ − a′ )| = |(r − r )|p, with |(r − r )| likely 1 2 1 2 1 2 + 3. Alice picks at random η-bit integers ti ∈ N such containing a small factor, assumed that r1 and r2 are in a similar range. that ti ≡ ai mod 2 for 0 ≤ i ≤ 2 and encrypts each ′ ai ← E(ti)SK η ′ N+ 4. Bob picks at random -bit integers mi ∈ such ′ ′ that mi ≡ mi mod 2 for 0 ≤ i ≤ 3. He computes s ← 4 SIMPLE APPLICATIONS ′ ′ ′ ′ ′ ′ ′ ((a0 + 1) (a1 + 1) m0)+(a0 (a1 + 1) m1)+((a0 + ′ ′ ′ ′ ′ This section describes a selection of algorithms that 1) a1 m2)+(a0 a1 m3) ′ can be encrypted by applying the scheme E. 5. Alice decrypts the selected value s ← D(s )SK .

8 PracticalApplicationsofHomomorphicEncryption

This protocol has a multiplicative depth of 2 for a of homomorphic encryption is described in more de- range of 4 items. The number of selectable items tail than the simple use cases in Section 4, because (i.e. the memory size) can be extended by a wider remote encrypted search is one the basic operations address range which leads to a higher multiplicative in a distributed setting with a wide range of possible depth given as log2n. It can be extended to select k-bit applications. Therefore we split the search scenario items by arranging k instances in parallel. in the subtypes exact search and fuzzy search.

4.3 The Millionaires’ Problem 5.1 General Approach

Alice and Bob want to know who is richer but don’t The following componentsare part of the system: The want to reveal the amounts of their respective capi- Input This can be a list of words or a stream of data. tals. This paragraph depicts a solution to this problem The Encrypted circuit For each word in the input data, using homomorphically encrypted binary adders. The an encrypted homomorphic circuit is executed, which protocol is as follows: calculates one element in the output vector, based on the encrypted search term. An Encrypted search term k 1. Alice has a ∈ {0,1} , containing a (k-1)-bit little- The search term is encrypted and used by the homo- endian binary representation of her capital in morphic circuit. The Output indicator vector In the a0..ak−2, ak−1 ← 0 and the secret key SK. case of an exact search, the output vector is a bit- k 2. Bob has b ∈ {0,1} , like a, bk−1 ← 0 vector containing a 1 if the search term matches and + a 0 else. Yet, it is also possible to do an inexact fuzzy 3. Alice picks at random η-bit integers ti ∈ N such ′ search with this approach. In this case, the output vec- that ti ≡ ai mod 2 and encrypts ai ← E(ti)SK for 0 ≤ tor would contain the rank of the match. Formally, i ≤ (k − 1) one cell of the output vector can be written as the η ′ N+ 4. Bob picks at random -bit integers bi ∈ such value of a function ′ ∗ ∗ ∗ that bi ≡ bi mod 2 for 0 ≤ i ≤ (k − 1). He computes Σ × C → C ′ ′ ϕ : bi ← bi + 1 for 0 ≤ i ≤ (k − 1), picks at random an (word,searchterm) → C(word,searchterm) η ′ N+ ′ -bit integer t ∈ with t ≡ 1 mod 2, computes (1) ′ ′ ′ ′ ′ ′ ′ ′ u0 ← b0 + t and c0 ← b0 t and further ui ← bi + Since the definition of ϕ is independent of the other ′ , ′ ′ ci−1 ci ← bi ci−1 for 1 ≤ i ≤ (k − 1). Now he has words in the database, the output can efficiently be ′ ′ the two’s complement of b in u . He computes computed in parallel. Another advantage is that the ′ ′ ′ ′ ′ ′ s0 ← a0 + u0 and c0 ← a0 u0. Finally, he computes circuit C depends only on the encrypted search term. ′ ′ ′ ′ ′ ′ ′ ′ ′ si ← ai + ui + ci−1 and ci ← (ai ui)+(ci−1 (ai + ui)) Especially, the depth is constant with respect to the for 1 ≤ i ≤ (k − 1). size of a database entry. See 5.2.3 for a reduction of ′ 5. Alice decrypts s ← D(sk−1)SK and concludes for the output size. both parties that a < b if s ≡ 1 mod 2 and a ≥ b otherwise. 5.2 Exact-match Search Bob first computes the two’s complement of his argu- Building on the general approach for searching intro- ment and adds Alice’s a. The most significant bit of duced above, one can easily generate a circuit which the solution (the sign) indicates the relation between a executes an exact-match search. The circuit should and b: if the solution is negative, i.e. the sign bit is 1, have the following property: then obviously b is greater then a. As stated in Section 3.2, Bob injects his data unencryptedly. However, af- ϕ 1 word == searchterm ter an operation with an encrypted operand, the result (word,searchterm)= (0 else contains the implicitly encrypted plaintext argument. (2) The multiplicative depth of the protocol is 2. which translates to a character-level comparison

|searchterm|−1 ϕ 5 SEARCH USING (word,searchterm) = ^ (wordi =c searchtermi) CONSTANT-DEPTH CIRCUITS i=0 (3) The two things that need further specification are This section introduces an approach to encrypted a) the concrete implementation of the character-level searching using circuits of constant depth that oper- comparison =c and b) the (common) case when ate on encrypted queries and data. This application |searchterm| = |word|.

9 SECRYPT2012-InternationalConferenceonSecurityandCryptography

5.2.1 Character-level Comparison only modifications made add a constant to Σ and |searchterm| in Equation 6. Let Σ be the finite alphabet for the word database as well as the search term. Then there exists a strict total 5.2.3 Reducing the Output Size order < on Σ. In case of Σ = {a,...,z,A,...,Z} this may for example be the dictionary order. The bijec- The output vector of the exact search has a linear de- tive funciton bin(σ) then returns the binary value of pendency in the size if the database that is operated the positon of σ and is defined as on. Under the assumption of a total strict order of the database elements, we have that the search yields Σ Σ →{0,1}⌈log2 ⌉ at most one positive indicator (or none, if no entry in the databases matches the search term). Following bin : 0 i f σ = min(Σ) the technique, outlined in the description of the obliv- σ → σ′ ious memory access in 4.2, the reduced output vec- ( maxσ′<σ{bin( )} + 1 else. 2 ∑n (4) tor for the indicators I0..In can be computed as i=0 Ii.  This reduces the output space complexity to O(1), i.e. Finally, the relation =c is defined as to one single element which contains an encrypted Σ ⌈log2 ⌉−1 1, if the search term was found and an encrypted 0 σ σ σ σ =c : { 1, 2}| ^ bin( 1)i ⊕bin( 2)i ⊕1 . otherwise. To indicate the position of the match in i=0 the database, the return value can be computed as  (5) ∑n I ∗ i. The result can be returned as a log n bit el- The depth of the resulting circuit from Equation 5.2 i=0 i 2 ement with an encrypted binary representation of the combined with the circuit of = from Equation 5 is c match index i. The reduction cost of the binary index generation is a multiplicative depth which is increased depth(ϕ(word,search)) = by 1. Σ (6) 2 + ⌈log2 ⌉ + ⌈log2 |search|⌉ ∈ O(log2 n) Note that the computation of bin is just used to trans- 5.3 Fuzzy Search form a character to its binary representationand there- fore is not part of the circuit itself. Rather than com- While Section 5.2 provided some fundamental con- puting the implicit function given in Equation 4 one struction of a search using a homomorphiccircuit, this would use a code table like ASCII for an efficient section uses that principle to its full extent by allow- transformation. ing more than just a boolean comparison (=c) of the characters. This results in the ability to also compute 5.2.2 inexact matches. The main part that differs from the exact search scheme is the construction of φ. In Equation 5.2 the implicit requirement was |word| ≥ |searchterm| introduced. In a general search, however, 5.3.1 Counting Mismatches the lengths of the search term and the words may very well be different. The first step towards a circuit evaluating a fuzzy Definition 5.1. For an alphabet Σ anda padding char- search is done by counting the number of mismatches acter  with Σ ∩  = 0/ let the alphabet with padding between the search string and a given word in a be Σ = Σ ∪ {}. n is a shorthand for , an n- database or a substring in a character stream. To ac- AND n times complish this the multiplicative ( ) combination charactere padding. of all the character-level comparisons like in Equa- | {z } Additionally, some random padding may be re- tion 5.2 is replaced by a sum, yielding quired for the searchterm, because the length of s log s Σ∗ × C → C⌈ 2 ⌉ the searchterm is a public information, even if the searchterm itself is encrypted. Also, the length of ϕ : |s|−1 . (7) the searchterm is revealed in the depth of the cir- (w,s) → ∑ (wi =c si) cuit. In order to hide the actual length the client i=0 appends a random number of padding characters to Since the sum can be expressed as a hamming the searchterm prior to encrypting it. The server weight of a bit vector, the application of symmet- only evaluates the search function for database items ric polynomials leads to a very shallow circuit, used shorter than the searchterm. Despite this modifi- for example in (Smart and Vercauteren, 2010, page cations, the depth of ϕ is still in O(logn), as the 15). The k-th bit of the hamming weight of a vector

10 PracticalApplicationsofHomomorphicEncryption

(b1,...,bn) can be directly expressed by the polyno- mial G A C T G 0 1 3 2 A 1 0 1 3 C 3 1 0 2 e k−1 (b1,...,bn) := ∑ b j b j . 2 1 2k−1 T 2 3 2 0 1≤ j1< j2 j2k−1 ≤n The evaluation of the polynomial translates directly to the circuit bin

e k−1 (b1,...,bn) := b j ∧b j . 2 M 1 2k−1 1 1≤ j1< j2 j2k−1 ≤n 00 01 11 10 In the case that the maximum number of errors 00 00 01 11 10 01 01 00 01 11 errmax is given beforehand, the advantage of symmet- 11 11 01 00 10 ric polynomials is that one can directly limit the num- 10 10 11 10 00 ber of evaluated polynomials to the first log2(errmax) digits of the output vector.

5.3.2 “Softer” Mismatches 2 2 00 01 11 10 00 01 11 10 Some applications of fuzzy search, for example the 00 0 0 1 1 00 0 1 1 0 mapping of short reads in genomes (Trapnell and 01 0 0 0 1 01 1 0 1 1 Salzberg, 2009), require more differentiation when 11 1 0 0 1 11 1 1 0 0 calculating the match between two characters. In this 10 1 1 1 0 10 0 1 0 0 example, a sequence where only one character (= acid in the DNA) changed to one chemically related acid 3 3 still strongly indicates a match in the genome. Conse- o0 =(a2 ∨ b2) o1 =(a2 ∨ b2) quently, if the position of the mismatch would intro- ∧ (a0 ∨ b0 ∨ b1) ∧ (a1 ∨ b1) duce a change to a completely unrelated acid, a match ∧ (b0 ∨ a0 ∨ a1) of the sequence would be less likely. ∧ (a2 ∨ a1 ∨ b2 ∨ b1) ∧ (a0 ∨ a1 ∨ b0 ∨ b1) For the rest of this subsection, let the alphabet Figure 1: Construction of the pen function: transformation Σ = {A,C,G,T} consist of the four acids that build of a penalty table to a circuit using a Karnaugh map. the DNA. The top table in Figure 1 lists the quan- tified differences (penalities) between these DNA- bases which reflect a computational metric for the ge- and AND-gates are available in the homomorphiccir- netic distances between a searchterm and a processed cuit, each binary disjunction a ∨ b must be written as genome sequence. compound operation (a ⊕ b) ⊕ (a ∧ b), resulting in a factor two compared to the depth of a circuit using In the circuit, a function pen : Σ×Σ → C∗ is needed that maps two given characters to the penalty of con- OR gates. Each disjunctive term therefore has depth Σ verting one character into the other. Since this func- log2(2| |). The total depth of the resulting circuit for Σ Σ O Σ tion needs to be evaluated inside the circuit, it needs pen is log2(| |) + log2(2| |) ∈ (log| |), which is the to be written as a boolean circuit itself. The necessary same depth as for =c in Section 5.2.1. steps (also see Figure 1) are: By replacing the character comparison with a penalty function, it is also necessary to sum up the 1. For each character c in the row and column head- individual penalties from the character-level compar- ers, write bin(c). Additionally, write the penalty in isons for a total penalty of the word match as shown binary. in Section 5.3.1. This results in the circuit 2. For each digit of the binary value of the penalty, Σ∗ Cs C⌈log2 s⌉ write one table containing only that digit. × → ϕ |s|−1 3. For each table in the previous step, create a mini- : . (8) (w,s) → ∑ pen(wi,si) mal circuit evaluating this table using a Karnaugh i=0 map. Depth analysis of pen. Without loss of generality, 5.4 Results assume the resulting circuit be in conjunctive normal form (CNF). The conjunction at the top level results In this section we introduced an approach to mark the Σ in AND-gates of depth log2(| |). Since only XOR- position of a definitive or probable match with respect

11 SECRYPT2012-InternationalConferenceonSecurityandCryptography

to exact or fuzzy search. The circuit ϕ that needs Table 2: Protocol timings. to be evaluated in the homomorphic cipherspace was λ λ λ Σ Protocol = 512 = 1024 = 2048 shown to havea depth bound to O(log2 | |) in all cases. This means that the depth of the circuit is known be- OT100 ¡1ms ¡1ms ¡1ms forehand, and especially that it does notdepend on the OT10k 8ms 15ms 35ms OT100k 190ms 290ms 426ms size of the database. YMP4 ¡1ms ¡1ms ¡1ms YMP40 2ms 9ms 37ms YMP80 10ms 34ms 132ms 6 IMPLEMENTATION & OMA16 ¡1ms 1ms 4ms OMA32 1ms 4ms 14ms PERFORMANCE OMA64 3ms 12ms 43ms

This section gives a brief summary of our implemen- tation1. The platform for the prototype is Java 6 SE 6.3 Parameter Sizes on a 2.4 GHz Core 2 Duo with 3 MB L2 cache and 4 For the key parameters, factor sizes of λ ∈ GB RAM. We present performance figures for differ- η ent key and problem sizes. {512,1024,2048} and = 8 bits have been chosen. There are measurements of every described protocol 6.1 Basic Operation Timings for three different problem sizes. Table 3 shows the return sizes for the Oblivious Transfer (OT) proto- col implementation with 100, 10000 and 100000 bits Due to the structure of the homomorphic scheme E, long plaintext arguments. The timing figures show the the algorithms for addition, multiplication and de- runtime required for the determination of the result cryption are almost atomic arithmetic operations of by Bob. The output sizes of the single return argu- the underlying library for large integer handling. The most time consuming operations are K (Keygen) and Table 3: Oblivious transfer parameter sizes (Bits). E (Encrypt). The timing for these and the other basic operations is depicted in Table 1. The reason for the Protocol λ = 512 λ = 1024 λ = 2048 OT100 103k 205k 410k Table 1: Timings for basic operations. OT10k 10.3M 20.5M 41M OT100k 103M 205M 410M λ λ λ Op. = 512 = 1024 = 2048 YMP4 3k 6k 12k Keygen 35ms 270ms 4242ms YMP40 40k 80k 160k Encrypt 35ms 301ms 3218ms YMP80 80k 161k 323k Decrypt ¡1ms ¡1ms ¡1ms OMA16 4k 8k 16k Add ¡1ms ¡1ms ¡1ms OMA32 5k 10k 20k Mult ¡1ms ¡1ms ¡1ms OMA64 6k 12k 24k relatively long runtime of Keygen and Encrypt is the ments of Yao's Millionaires' Problem (YMP) shown determination of a prime number during these opera- in Table 3 are in the range from 4 to 80 plaintext bits tions. and therefore require between 3k and 323k bits of en- crypted space. The runtime figures in Table 2 only 6.2 Protocol Timings contain the computational part of Bob. The Oblivious Memory Access (OMA) is the selection of exactly one Our implementation comprises the protocols outlined of 16, 32 or 64 elements (memory bits). So the re- in Section 4. The algorithms have been transformed turn sizes are the number of bits required by a single into native Java code which invokes the API of our encrypted return bit. The calculation of the memory encryption scheme. The resulting performance is de- circuit by Bob is shown in the runtime figures. picted in Table 2, which contains the time consump- tion figures and Table 3, showing the sizes of the out- 6.4 Search Results put arguments. We have implemented the exact search as described in Section 5.2 with different key sizes and database sizes, as well as two different output set reductions. 1The software can be downloaded at The default reduction yields a one-bit indicator con- http://www.hcrypt.com taining an encrypted 1 if the search term matches a

12 PracticalApplicationsofHomomorphicEncryption

database entry and 0 otherwise. The second reduction correctness. We gave a security analysis for different type generates an encrypted binary representation of attack models and stated, under what circumstances the matching entry’s database index. The word size the scheme is secure. Proof-of-concept implementa- for all experiments is 5 bits. Table 4 summarizes the tions of the discussed protocols outlined the charac- result figures. The upper section of Table 4 depicts teristics of homomorphically encrypted real-life ap- plications. A detailed formal analysis of exact-match Table 4: Exact-match search results. searching with extensions to fuzzy searching on en- crypted data with encrypted search terms showed, DB size λ = 512 λ = 1024 λ = 2048 how the algorithmic primitives of the simple proto- search cols can be combined to solve a problem of higher 1024 31 ms 115 ms 442 ms complexity. 256 k 8.6 s 30.1 s 119 s 512 k 17.4 s 61.1 s 241 s 1 M 36.9 s 124 s 487 s generate index REFERENCES 1024 11 ms 22 ms 43 ms 256 k 5.1 s 10.2 s 20.4 s Brakerski, Z. and Vaikuntanathan, V. (2011). Effi- 512k 12 s 24.0 s 48.1 s cient fully homomorphic encryption from (standard) 1 M 29.8 s 59.5 s 119.5 s lwe. Cryptology ePrint Archive, Report 2011/344. indicator vector size(bits, ∗compact cipher) http://eprint.iacr.org/. 1 M∗ 2 M∗ 4 M∗ 1024 Brenner, M., Wiebelitz, J., von Voigt, G., and Smith, M. 5.2 M 10.5 M 20.9 M (2011). A smart-gentry based software system for se- 256 M∗ 512 M∗ 1G∗ 256 k cret program execution. In Proc. of the International 1.34 G 2.68 G 5.37 G Conference on Security and Cryptography SECRYPT. 512 M∗ 1G∗ 2G∗ SciTePress. 512k 2.68 G 5.37 G 10.7 G Coron, J.-S., Mandal, A., Naccache, D., and Tibouchi, M. 1G∗ 2G∗ 4G∗ 1 M (2011). Fully homomorphic encryption over the inte- 5.37 G 10.7 G 21.4 G gers with shorter public keys. In Advances in Cryptol- index value size (bits) ogy CRYPTO 2011, volume 6841 of LNCS. Springer 1024 5.1 k 10.2 k 20.5 k Berlin / Heidelberg. 256 k 5.1 k 10.2 k 20.5 k Damgrd, I., Meldgaard, S., and Nielsen, J. (2011). Perfectly 512k 5.1 k 10.2 k 20.5 k secure oblivious ram without random oracles. In The- 1 M 5.1 k 10.2 k 20.5 k ory of Cryptography, volume 6597 of LNCS. Springer Berlin / Heidelberg. the timing results for different key sizes and database Gentry, C. (2009). Fully homomorphic encryption using sizes. The timings scale almost linearly with the size ideal lattices. In Proc. of the 41st annual ACM sympo- of the database, whereas larger keys cause an expo- sium on Theory of computing, STOC ’09, New York, nential timing behavior. The second section of the NY, USA. ACM. table shows the timings of the index-reduction for the Gentry, C., Halevi, S., and Vaikuntanathan, V. (2010). i- different problem sizes. The sum of the index gen- hop homomorphic encryption and rerandomizable yao eration time and the basic search time is the total ex- circuits. In Advances in Cryptology - CRYPTO 2010, ecution time for an index-reduced search. Section 3 volume 6223 of LNCS. Springer Berlin / Heidelberg. of the table summarizes the sizes in number of bits of Goldreich, O. (1987). Towards a theory of software protec- tion and simulation by oblivious rams. In Proc. of the the returned match-indicator vectors. The lower sec- 19th annual ACM symposium on Theory of comput- tion of the table contains the return sized of an index- ing, STOC ’87, New York, NY, USA. ACM. reduced search. The returned argument contains an Goldreich, O. and Ostrovsky, R. (1996). Software protec- encrypted binary representation of the log2n plaintext tion and simulation on oblivious rams. J. ACM, 43. index-bits. Goodrich, M. and Mitzenmacher, M. (2011). Privacy- preserving access of outsourced data via oblivious ram simulation. In Automata, Languages and Program- ming, volume 6756 of LNCS. Springer Berlin / Hei- 7 SUMMARY delberg. Kolesnikov, V., Sadeghi, A.-R., and Schneider, T. (2009a). In this paper we discussed an algebraically homo- How to combine homomorphic encryption and gar- morphic scheme of limited multiplicative depth that bled circuits improved circuits and computing the can be used as an approach to build practical applica- minimum distance efficiently. tions that operate on encrypted data. We discussed the Kolesnikov, V., Sadeghi, A.-R., and Schneider, T. (2009b). properties of the SHE scheme and provided a proof of Improved garbled circuit building blocks and appli-

13 SECRYPT2012-InternationalConferenceonSecurityandCryptography

cations to auctions and computing minima. In Cryp- tology and Network Security, volume 5888 of LNCS. Springer Berlin / Heidelberg. Malkhi, D., Nisan, N., Pinkas, B., and Sella, Y. (2004). Fairplay - a secure two-party computation system. In Proc. of the 13th conference on USENIX Security Symposium - Volume 13, SSYM’04, Berkeley, CA, USA. USENIX Association. Naehrig, M., Lauter, K., and Vaikuntanathan, V. (2011). Can homomorphic encryption be practical? In Proc. of the 3rd ACM workshop on Cloud computing se- curity workshop, CCSW ’11, New York, NY, USA. ACM. Pinkas, B. and Reinman, T. (2010). Oblivious ram revisited. In Advances in Cryptology - CRYPTO 2010, volume 6223 of LNCS. Springer Berlin / Heidelberg. Smart, N. and Vercauteren, F. (2010). Fully homomorphic encryption with relatively small key and ciphertext sizes. In Public Key Cryptography, PKC 2010, vol- ume 6056 of LNCS. Springer Berlin / Heidelberg. Stehl, D. and Steinfeld, R. (2010). Faster fully homo- morphic encryption. In Advances in Cryptology - ASIACRYPT 2010, volume 6477 of LNCS. Springer Berlin / Heidelberg. Trapnell, C. and Salzberg, S. (2009). How to map billions of short reads onto genomes. Nature Biotechnology, 27(5). Yao, A. C. (1982). Protocols for secure computations. In SFCS '82: Proc. of the 23rd Annual Symposium on Foundations of Computer Science. IEEE Computer Society, Washington, DC, USA.

14