<<

Floating-Point Homomorphic Encryption

Jung Hee Cheon, Andrey Kim, Miran Kim, and Yongsoo Song

Department of Mathematical Sciences, Seoul National University, Republic of Korea {jhcheon, kimandrik, alfks500, lucius05}@snu.ac.kr

Abstract. Our paper suggests a general method to construct a Floating-Point Homomor- phic Encryption (FPHE) scheme that allows the floating-point arithmetics of ciphertexts, thus computing encryptions of most significant of m1 +m2 and m1m2, given encryptions of floating-point numbers m1 and m2. Our concrete construction of leveled FPHE based on BGV scheme is almost optimal in the sense of noise growth and precision loss. More precisely, given encryptions of d messages with η bits of precision, our scheme of depth dlog de securely computes their product with η −dlog de bits of precision similarly to the case of unencrypted floating-point computation. The required size of the largest modulus grows linearly in the depth. We also describe algorithms for evaluating some floating-point arithmetic circuits containing polynomial, multiplicative inverse, and even exponential function, and analyze their complexities and output precisions. With the security parameter λ = 80, our rudimen- tary implementation takes 315ms and 168ms to compute a product of 16 ciphertexts and a multiplicative inverse of a ciphertext, respectively, when given ciphertexts have 20 bits of precision. Keywords. Homomorphic Encryption, Floating-point Arithmetic, BGV Scheme, Somewhat Homomorphic Encryption

1 Introduction

The floating-point is the most common way to approximately represent real numbers in computers, and has been widely used in computer system and intensive scientific computations. A floating- point number is represented in terms of four integers as x = ±s · be−k, where b is the base, e is the exponent, k is the precision (the number of significant digits in significand), and s is the significand satisfying 0 ≤ s ≤ bk − 1. The floating-point arithmetic consists of the usual arithmetic and rounding step to the significand. It allows us to represent real numbers using fixed number of bits, and supports a trade-off between range (size of representation) and precision. Homomorphic encryption (HE) is a cryptographic scheme that allows homomorphic operations on encrypted data without decryption. Since Gentry discovered the first plausible construction of fully homomorphic encryption scheme [Gen09], many other HE schemes have been suggested follow- ing Gentry’s blueprint (e.g., [DGHV10,BV11a,BV11b,Bra12,BGV12,GSW13,CLT14,CS15,DM15]). Most of existing schemes encode the messages into a fixed modulus space, a product of several modulus spaces, or a polynomial ring over modulus space. However, HE schemes have not been used in many practical applications because of their inefficiency of real number computations, especially floating-point arithmetic. Floating-Point Homomorphic Encryption. In this paper, we propose a general idea to con- struct a Floating-Point Homomorphic Encryption (FPHE) scheme supporting floating-point ad- dition and multiplication of ciphertexts which output encryptions of some most significant bits (MSBs) of m1 + m2 and m1m2 respectively, upon input encryptions of two messages m1 and m2. Our concrete construction of a leveled FPHE scheme comes from the classical LWE-based scheme [BGV12] by modifying some operations such as modulus-switching procedure. 2

First of all, we place an encryption noise in the rightmost position together with a plaintext as in Figure 1, i.e., an encryption of m encrypted by s satisfies hc, si = m + e (mod q) for some small e. We cannot recover the exact value m from the ciphertext c, but the inserted noise e can be considered as an error occurred during floating-point computations. This approximate value m + e may substitute the original message in floating-point arithmetics with almost the same precision when |e| is sufficiently small. If the input data is too small or large, then one may encode and regulate the magnitude of message by multiplying a scaling factor, and store the scaling factor (or exponent) independently. The homomorphic operations of our scheme output the ciphertexts cadd and cmult satisfying hcadd, si mod q ≈ m1 + m2 and hcmult, si mod q ≈ m1m2. We use the key-switching technique [BV11a,BGV12] for multiplication procedure. The modulus-switching technique was suggested in [BV11a,BGV12] to reduce the magnitude of inserted error, but it has a completely different role in our scheme. It rounds up some least significant bits (LSBs) of message to maintain the size of significand after homomorphic operations. The composition of this modulus-switching procedure and homomorphic operation (as in Figure 1) mimics the ordinary (unencrypted) floating-point computations. Our scheme is almost optimal in the sense of precision: precision of resulting message is lost by about one bit similar to the case of unencrypted floating-point arithmetic.

m1 m2

e1 e2

q × q

m1m2

MSB emult LSB

q

q0 ModSwitch q m1m2

e0

q0

Fig. 1. Ciphertext after a homomorphic multiplication

FPHE Arithmetics. Based on our construction of FPHE, we homomorphically evaluate some typical circuits containing (real) constant addition/multiplication, product, polynomial, and multi- plicative inverse. We describe algorithms to minimize the circuit depth and complexity, and analyze their noise and precision loss during homomorphic evaluation. In our scheme, every integer is a trivial encryption of itself without error. Hence, the constant addition procedure almost does not change the absolute error of the ciphertext (fraction part can be added to noise). In addition, the constant multiplication process with suitable choice of scaling almost does not change the relative error of the input ciphertext. Given encryptions of messages m1, . . . , md with η bits of precision, our leveled FPHE scheme of depth dlog de computes their product with (η − dlog de) bits of precision in (d − 1) homomorphic multiplications of ciphertexts. The magnitude of significands is invariably maintained, so the required bit size of the largest modulus is linear in depth dlog de. For example, given 16 ciphertexts with 20 bits of precision, we can compute their product with 15 bits of precision in about 315ms. We get a similar result for the evaluation of polynomials, and extend the case to analytic functions such as exponential and logistic functions using Taylor decomposition. 3

Table 1. Summary of FPHE arithmetics

Bit precision Function Depth #(HM) of output Constant addition a + x ≥ η 0 0 Constant multiplication a · x η 0 0 Qd Product i=1 xi η − dlog de dlog de d − 1 Power function xd η − dlog de dlog de dlog de Pd i Polynomial i=0 aix η − dlog de dlog de d − 1 Multiplicative inverse x−1 η − 1 dlog ηe 2blog ηc Exponential ex η − dlog ηe dlog ηe η − 1 η: bit precision of inputs HM: homomorphic multiplications of ciphertexts

Letx ¯ = 1 − x for a given real number x with 1/2 ≤ x ≤ 1. From the equality x(1 +x ¯)(1 + k−1 k k−1 x¯2) ··· (1 +x ¯2 ) = (1 − x¯2 ), x−1 can be approximated by (1 +x ¯)(1 +x ¯2) ··· (1 +x ¯2 ) with a precision of 2k bits. We transform this equality to securely compute the multiplicative inverse. More precisely, given a ciphertext with η bits of precision, our FPHE scheme of depth log η can compute its multiplicative inverse with (η −1) bits of precision in 2blog ηc multiplications of ciphertexts. For example, upon inputs the 20 bits of precision, one can compute an encryption of its multiplicative inverse with 19 bits of precision with only 8 multiplications of FPHE scheme of depth 5, and our implementation with the security parameter λ = 80 takes about 168ms.

Related Works. There have been a substantial number of studies concerned with processing real numbers over encryption. The obvious method is to scale them to integers, but a plaintext modulus is exponential in the length of message. For example, Bos et al. [BLN14] described how to privately predict the likelihood of having a heart attack by evaluating the logistic regression function with scaled input data. On the other hand, Dowlin et al. [DGL+15] presented an efficient method to represent fixed-point numbers − it is encoded as a polynomial with coefficients in the range [−(B − 1)/2, (B − 1)/2] using its base-B representation for an odd integer B ≥ 3. However, when the number of significant bits is large, it results in large degree of the polynomial ring. On the other hand, division is an essential operation for real-world applications such as the med- ical, financial, advertising domains, and machine learning algorithms. Algesheimer et al. [ACS02] introduced a problem of solving a secure integer division such that given two encryptions of a and a b, determine an encryption of b b c without leaking any information about inputs. Their solution was based on Newton iteration using secure multiparty computation protocol. The protocols were later implemented by Jakobsen in the passively secure three-party setting [Jak06]. In [DNT12], they proposed secure computation protocols to solve the problem in a two-party setting. Their computation cost is tractable, but the protocols still require a lot of interactions. Recently, Cetin et al. [cDSM15] described new ideas for comparison and homomorphic division given word-based encrypted data. They found approaches by using approximation techniques, but there was no accurate explanation of the precision for the approximation.

Road-map. Section 2 briefly introduces notations and some preliminaries about floating-point arithmetics and the LWE problem. Then Section 3 presents a floating-point homomorphic encryp- tion scheme and analyze the noise growth during basic homomorphic operations. In Section 4, we suggest some algorithms to evaluate typical circuits homomorphically, and compute the precision of outputs. Finally, in Section 5 we provide an implementation and performance of our scheme. 4

2 Preliminaries

Notation. All logarithms are base 2 unless otherwise indicated. We denote vectors in bold, e.g., a, and matrices in upper-case bold, e.g., A. Every vector in this paper is a column vector. We denote by h·, ·i the usual dot product of two vectors. For a real number r, bre denotes the nearest integer to r, rounding upwards in case of a tie. For an integer q, we use Z ∩ (−q/2, q/2] as a representative of Zq. We use x ← D to denote the sampling x according to distribution D. It denotes the uniform sampling when D is a finite set. For a set S, U(S) denotes the uniform distribution on S. Throughout the paper, we let λ denote the security parameter: all known valid attacks against the cryptographic scheme under scope should take Ω(2λ) bit operations.

2.1 Floating-Point Arithmetic Floating-point is the formulaic representation that approximates a real number so as to support a trade-off between range and precision. A number is, in general, represented approximately to a fixed number of significant digits (significand), so real numbers can be approximated with the finite word length in a computer. The term floating refers to the fact that a number’s radix point can float: it can be placed anywhere relative to the significant digits of the number. More precisely, a floating-point number x can be represented in terms of four integers as x = ±s·be−k, where b is the base or radix, e is the exponent, k is the precision (the number of significant digits in significand), and s is the significand satisfying 0 ≤ s ≤ bk − 1. Let x∗ be a real number. We use the notation x = fl(x∗) to represent the floating-point value of x∗ (nearest number in the floating-point system). The most useful measures of the accuracy ∗ ∗ ∗ of x are its absolute error Eabs(x) = |x − x | and its relative error Erel(x) = |x − x |/|x |. To 1 1−k determine accuracy, we define the quantity u = 2 · b , called the unit round-off. It is the furthest distance relative to unity between a real number and the nearest floating-point number. It bounds ∗ the relative error in representing as x = x (1 + ) such that Erel(x) = || ≤ u [Hig02]. A simple method to add (resp. subtract) floating-point numbers is to first proceed with the usual addition (resp. subtraction). Then it is rounded to a precision of the same bits, which makes round-off error. To multiply, the significands are multiplied while the exponents are added, and then 2 the result is rounded. For example, if we are given two floating-point numbers x = 1101(2) · 2 and −3 −1 y = 1001(2) · 2 with 4 bits of precision, we first get the true product value x · y = 1110101(2) · 2 2 and then it is rounded to fl(x · y) = 1111(2) · 2 with 4 bits of precision. Small errors may accumulate as floating-point operations are performed in succession. Dealing with these errors incurred in the floating-point operations is one of the main topics in error analysis. In particular, we focus on the forward errors which are concerned with how close the computed result is to the exact value of an algorithm. For more explicit overview, we recommend to refer to [Gol91,Hig02,Ein05]. Here is the standard model of floating-point arithmetic, introduced by Wilkinson [Wil61]: if x and y are floating-point numbers, then fl(x ? y) = (x ? y)(1 + ) for some || ≤ u, where ? ≡ +, −, ·, or /.

2.2 Learning With Errors (LWE) The LWE problem was introduced by Regev as a generalization of learning parity with noise [Reg05]. n LWE For positive integers n, and q ≥ 2, t ∈ Zq and a distribution χ over Z, we define Aq,χ (t) as the n n distribution obtained by sampling a ← Zq and e ← χ, and returning c = (ha, ti + e, a) ∈ Zq × Zq . n Definition 1( LWE). Let D be a distribution on Zq . The (decision) learning with errors problem, denoted by LWEn,q,χ(D), is to distinguish arbitrarily many independent samples chosen according LWE n to Aq,χ (t) for a fixed t ← D, from the uniform distribution over Zq × Zq . 5

n The LWE problem is self-reducible, that is, LWEn,q,χ(D) can be reduced to LWEn,q,χ(U(Zq )) n for any distribution D. Moreover, one can efficiently reduce LWEn,q,χ(U(Zq )) to LWEn,q,χ(χ) when χ is a Gaussian distribution with an appropriate parameter [ACPS09]. It was shown that the hardness of LWE can be established by a quantum reduction to approxi- mate short vector problems in ideal lattices [Reg05]. Peikert also showed there is a classical reduc- tion between LWE and worst-case lattice problems [Pei09]. We refer the readers to [Reg05,Pei09] for these results in more details.

3 Floating-Point Homomorphic Encryption

In this section, we describe a new leveled homomorphic encryption scheme based on the hardness of LWE problem which allows to carry out floating-point computations of ciphertexts. Given en- cryptions of m1 and m2, the floating-point addition (resp. multiplication) outputs an encryption of some MSBs of m1 + m2 (resp. m1m2) upon input encryptions of two integers m1 and m2. It can be applied to compute an approximate value of product of integers, multiplicative inverse, and power series. Our concrete construction in this section is based on the BGV scheme [BGV12], but this idea can be adapted to other (Ring) LWE-based HE schemes such as [GHS12b].

3.1 Basic Encryption Scheme

We first describe a basic public-key homomorphic encryption scheme in this subsection. This scheme is IND-CPA secure under the assumption that (n, q, χ)-LWE problem is hard. The scheme consists of the following algorithms.

λ • E.Setup(1 ). Take a modulus q = q(λ). Choose the parameters n = n(λ, q) and a Berr-bounded λ error distribution χ = χ(λ, q) appropriately for LWEn,q,χ that achieves at least 2 security. Take τ = 2(n + 1)dlog qe. Output the parameters params = (n, q, χ, τ). • E.SecretKeyGen(params). Sample a vector t ← χn and set the secret key sk = s ← (1, −t) ∈ n Zq × Zq . LWE τ • E.PublicKeyGen(params, sk). Generate and output a (n + 1) × τ matrix pk = A ← Aq,χ (t) LWE (Each column of A is sampled independently from the distribution Aq,χ (t)). τ • E.Encpk(m). Given a message m ∈ Z, sample a vector r ← {0, 1} and let m ← (m, 0,..., 0). Output c ← m + A · r mod q. 0 • E.Decsk(c). Output m ← hc, si mod q.

Apart from the most of existing schemes (e.g., [Bra12,GHS12b,BLLN13]), our scheme does not have a fixed or separate plaintext space from an inserted error. This floating-point encryption scheme may seem strange, because the output m0 = m + e of its decryption circuit is slightly different from the original message m. However, they can be considered to be actually the same in floating-point computation sense if e is small enough. More precisely, if the input message m has a precision of η bits to its true value m∗ and |e| ≤ 2−η|m∗|, then m0 is also an approximation to m∗ with (η − 1) bits of precision. When the input data is too small or large, we should encode the message and regulate its magnitude before encryption procedure to maintain the precision. For example, in order to encrypt −3 the floating-point number x = 1001(2) · 2 with 4 bits of precision, one may move the radix point 5 (multiplied by the constant 2 ) and use the scaled value m = 100100(2) as an input message of encryption procedure. If the encryption error e bounded by |e| ≤ 22, then m0 = m + e is an approximation of x · 25 with 9 bits of precision. The scaling factors of input data should be stored and managed independently for correct understanding of computing results. 6

This notion of approximate encryption has been partially used in previous work, for example, additional information for multiplication in [BV11a,Bra12,BGV12,CS15] and intermediate values in squashed decryption circuit in [DGHV10,CLT14] are encrypted in the same way. Note that these ciphertexts were used in approximate computations allowing small errors. For homomorphic operations, one should take a larger parameter depending on the circuit complexity and also the size of input messages. We analyze the suitable parameters for various circuits in Section 4.

Security. The security of basic encryption scheme is directly obtained from the security proof of leveled floating-point HE scheme that will be described in next subsection.

3.2 Leveled Floating-Point Homomorphic Encryption Scheme We start by adapting some notations from [BGV12] to our context and recalling the key switching and modulus reduction techniques in [BV11a,BGV12] for homomorphic operations of LWE-based encryption schemes. n Let q be a positive integer. Given a vector x ∈ Zq , its bit decomposition and power of two n(1+blog qc) Pblog qc i are defined by BD(x) = (u0,..., ublog qc) ∈ {0, 1} with x = i=0 2 ui, and P2(x) = blog qc (x,..., 2 x). Then we can see that hBD(x), P2(y)i = hx, yi. We also recall the definition of n m tensor product u ⊗ v = (u1v1, u1v2, . . . , u1vm, . . . , unv1, . . . , unvm) on the vector space R × R , and its relation with the inner product hu ⊗ v, u0 ⊗ v0i = hu, u0i · hv, v0i. n For i = 1, 2, let ci be ciphertexts of mi encrypted by the secret s ∈ Zq×Zq . They satisfy hci, si = mi + ei mod q for some small ei’s. Consider the quadratic equation Qc1,c2 (x) := hc1, xi · hc2, xi = 0 0 0 hc1 ⊗c2, x⊗xi. If mi = mi +ei’s are small enough, we obtain Qc1,c2 (s) mod q = m1m2 = m1m2 +e for some small e, as desired, and c ← c1 ⊗ c2 is an encryption of m1m2 with the secret s ⊗ s. However, this comes at a cost of increasing the dimension of ciphertexts and the complexity of homomorphic multiplication. Brakerski and Vaikuntanathan [BV11a] use a key-switching technique to convert a LWE ciphertext under the long secret key s ⊗ s into another LWE ciphertext of the same plaintext under a different secret key s˜ of smaller dimension. Roughly speaking, this process 0 publishes auxiliary information s⊗s = (si)i encrypted by a new secret s˜, and converts a ciphertext 0 P 0 c = (ci)i into a new ciphertext c ← i ci · Encs˜(si). Hence, we need a chain of L secret keys in order to evaluate up to L multiplicative levels while keeping the same ciphertext size. In this paper, we assume the circular security for convenience, so it is safe to encrypt the leveled HE secret key under its own public key. This allows us to use the same secret key for all the levels. Brakerski, Gentry and Vaikuntanathan [BV11a,BGV12] developed the modulus-switching tech- nique to manage the size of errors in LWE-based HE schemes. It converts a ciphertext c modulus q with an error e into a ciphertext c0 of the same message in a new modulus q0 and an error 0 q0 e ≈ q e. The ratio e/q of the noise to the modulus size maintains almost the same, but it reduces the magnitude of the noise and the error growth in the next operation. As a result, the bit-size of largest modulus grows linearly with the multiplicative depth instead of exponentially. We apply the modulus-switching technique to floating-point HE but for a completely different purpose. Roughly speaking, we scale c by a factor q0/q, and then round appropriately to get back an integer ciphertext to discard the LSBs (erroneous part) of messages and control the magnitude of significands.

λ L ` • FP.Setup(1 , 1 ). Take a base modulus p = p(λ, L). Let q` = p for ` = 1,...,L. Choose the parameters n = n(λ, qL) and a Berr-bounded error distribution χ = χ(λ, qL) appropriately for λ (n, qL, χ)-LWE problem that achieves at least 2 security. Let τ = 2(n + 1)dlog qLe. Output the parameters params = (n, qL, χ, τ). • FP.SecretKeyGen(params). Run sk = s ← E.SecretKeyGen(params). • FP.PublicKeyGen(params, sk). Run pk ← E.PublicKeyGen(params, sk). 7

• FP.Encpk(m). Run c ← E.Encpk(m). 0 • FP.Decsk(c). Run m ← E.Decsk(c). 2 • FP.SwitchKeyGen(params, sk). Generate a matrix A0 ← ALWE (t)(n+1) dlog qLe and set the L qL,χ T 0 T switching key swk = AL ← P2(s ⊗ s) + AL (Add the vector P2(s ⊗ s) to the first row of 0 2 AL). For 0 < ` < L, let A` be the (n + 1) × (n + 1) dlog q`e matrix consisting of the first 2 (n + 1) dlog q`e columns of AL. • FP.Add(c1, c2). For two ciphertexts c1, c2 in the same level `, output cadd ← c1 + c2 mod q`. • FP.Multswk(c1, c2). For two ciphertexts c1, c2 in the same level `, output cmult ← A` · BD(c1 ⊗ c2) mod q`. j m 0 q`0 • FP.ModSwitch`→`0 (c). For a ciphertext c in level `, output c ← · c . q` 0 • FP.ModEmbed`→`0 (c). For a ciphertext c in level `, output c ← c mod q`0 .

Let LWEs,`(m, B) be the set of `-level encryptions of m under s whose error is bounded by B, i.e., a vector c ∈ × n is contained in LWE (m, B) if hc, si ≡ m + e (mod q ) for some Zq Zq` s,` ` |e| ≤ B. We omit s and shortly denote LWE`(m, B) when it is clear from the context. There is a natural embedding FP.ModEmbed : c (mod q`) 7→ c (mod q`0 ) from LWE`(m, B) to 0 0 0 LWE` (m, B) for any ` > ` . The converted ciphertext c ← FP.ModEmbedq`→q`0 (c) has exactly the same message and error as c, so this procedure does not change the scale of message. This embedding can be used to carry out operations between ciphertexts in different levels.

λ L Security. Let params = (n, qL, χ, τ) ← FP.Setup(1 , 1 ) so that the (n, qL, χ)-LWE problem λ achieves at least 2 security and τ = 2(n+1)dlog qLe. If we generate sk ← E.SecretKeyGen(params), pk ← E.PublicKeyGen(params, sk) and swk ← FP.SwitchKeyGen(params, sk), then (pk, swk) is 2 (n+1)×τ (n+1)×(n+1) dlog qLe computationally indistinguishable from the uniform distribution over ZqL ×ZqL (n+1)×τ from the hardness of (n, qL, χ) and the assumption of circular security. If pk is chosen in ZqL uniformly, then the statistical distance of (pk, FP.Encpk(0)) from the uniform distribution on (n+1)×τ × n+1 is negligible from the leftover hash lemma [DGHV10]. Thus, our floating-point ZqL ZqL HE scheme described above is IND-CPA secure.

We analyze the noise growth of our basic scheme at encryption and addition, and provide a sufficient condition for decryption correctness.

Lemma 1 (Encryption Noise). Let c ← FP.Enc(m) for a message m. Then the initial ciphertext c is in LWEL(m, Bclean) for Bclean = τ · Berr with overwhelming probability.

Proof. By definition, we have hc, si = hm + A · r, si = m + rT · (AT s). Since each column of T A is a LWE sample, the infinite norm of vector e := A · s is less than Berr with overwhelming probability. This lemma follows from |hr, ei| ≤ τ · Berr. ut

Lemma 2 (Addition/Multiplication Noise). Let ci ∈ LWE`(mi,Bi) be encryptions of mi satisfying |mi| ≤ Mi for i = 1, 2. Let cadd ← FP.Add(c1, c2) and cmult ← FP.Mult(c1, c2). Then, we have cadd ∈ LWE`(m1 + m2,B1 + B2) and cmult ∈ LWE`(m1m2,M1B2 + M2B1 + B1B2 + BKS.`) for 2 BKS,` = (n + 1) dlog q`e · Berr with overwhelming probability.

Proof. The bound of addition noise is obvious. The definition of FP.Mult circuit implies hcmult, si = 0 hP2(s ⊗ s), BD(c1 ⊗ c2)i + hA` · BD(c1 ⊗ c2), si. From a property of tensor product, we get hP2(s ⊗ s), BD(c1 ⊗ c2)i = hc1, si · hc2, si = (m1 + e1)(m2 + e2) for some |e1| ≤ B1 and |e2| ≤ B2. We 0T note that the infinite norm of vector e := A` · s is less than Berr with overwhelming probability. Therefore, hcmult, si = m1m2 + emult for emult = m1e2 + m2e1 + e1e2 + he, BD(c1 ⊗ c2)i, which 2 satisfies |emult| ≤ M1B2 + M2B1 + B1B2 + (n + 1) dlog q`eBerr. ut 8

We now consider the modulus-switching noise and explain its role in our scheme. As mentioned before, we roughly scale a ciphertext by the factor q`0 /q` to change the current modulus from q` to q`0 .

Lemma 3 (Modulus-Switching Noise). Let c ∈ LWE`(m, B) be an encryption of m and let 0 0 q`0 q`0 n+1 c ← FP.ModSwitch`→`0 (c). Then c ∈ LWE`0 ( m, B + BMS) for BMS = Berr. q` q` 2 Proof. Let e be the error of ciphertext c. Then hc0, si = q`0 hc, si + hu, si = q`0 (m + e) + hu, si q` q` 0 q`0 0 q`0 (mod q`0 ) for a vector u := c − c with ||u||∞ ≤ 1/2. The new error e = e + hu, si satisfies q` q` 0 q`0 1 q`0 |e | ≤ B + ||s||1 ≤ B + BMS. ut q` 2 q`

Effect of Modulus-Switching. As noted before, our modulus-switching technique is similar to one of Brakerski, Gentry and Vaikuntanathan [BV11a,BGV12]. However, in our paper, the procedure FP.ModSwitch changes the message from m to a different and smaller message q`0 m = q` 0 p` −`m. It does not mean that the message is ruined, but we just and change the representation (scale) of message. More precisely, if we express numbers in the floating-point system with base p, the modulus-switching procedure rounds up the least (` − `0)-digits of message and shifts the radix point to the left while the exponent part is increased by (` − `0). As a result, most significant digits of message are invariant over this modulus-switching procedure, but only the magnitude of significand is reduced.

Ciphertext Format with Auxiliary Information. For a practical usage of our scheme, it is recommended to keep additional information in a full ciphertext of the form (c, `, M, B, exp) in order to remember the position of radix point and compute the bounds of message and error during homomorphic evaluation. Here, `, M, B and exp denote the level, the bounds of message and error, and exponent part of message of the ciphertext c, respectively. Then our scheme can be described as follows:

FP.Enc :(m, M) 7→ (c, L, M, Bclean, 0), FP.Dec :(c, `, M, B, exp) 7→ (m, B, exp),

FP.Add : ((c1, `, M1,B1, exp), (c2, `, M2,B2, exp)) 7→

(cadd, `, M1 + M2,B1 + B2, exp),

FP.Mult : ((c1, `, M1,B1, exp1), (c2, `, M2,B2, exp2)) 7→

(cmult, `, M1M2,M1B2 + M2B1 + B1B2 + BKS.`, exp1 + exp2), 0 `0−` `0−` 0 FP.ModSwitch `→`0 :(c, `, M, B, exp) 7→ (c, ` , p M, p B + BMS, exp + (` − ` )), 0 FP.ModEmbed `→`0 :(c mod q`, `, M, B, exp) 7→ (c mod q`0 , ` ,M,B, exp).

Homomorphic Operations of Ciphertexts in Different Levels. To multiply a ciphertext c 0 in LWE`(m, B) with a ciphertext in lower level ` , we first should convert c into a ciphertext in 0 `0−` level ` . The modulus-switching procedure FP.ModSwitch`→`0 changes the message to p m with additional noise, while the modulus-embedding procedure FP.ModEmbed`→`0 does not. Therefore, 0 the modulus-switching procedure storages smaller message p` −`m which makes it more space efficient, and the modulus-embedding procedure does not create additional noises to initial message which makes calculations more careful and precise. See Figure 2 for an illustration. In the rest of our paper, if the two ciphertexts c1, c2 do not belong to the same level, FP.Add(c1, c2) and FP.Mult(c1, c2) denote the homomorphic operations after performing the modulus-embedding procedure to the lower level. 9

m m

e e

q` q`

ModSwitch 0 ModEmbed p` −` · m m

0 0 q` q`

Fig. 2. Modulus-switching and modulus-embedding operations

0 Let c be a ciphertext in LWE`(m, B) of a message |m| ≤ M and m ← FP.Dec(c). The absolute error e = m0 − m is bounded by B, but sometimes it is convenient to consider the relative error of message in floating-point arithmetics as we mentioned in Subsection 2.1. However, the definition of relative error in floating-point arithmetics cannot be used directly in homomorphic evaluation of circuits since one cannot see the intermediate values during secure evaluation of circuits. Instead, we suggest a slightly different definition of relative error α := B/M in this paper using the bounds of message and error. The following theorem gives an upper bound of relative error after the evaluation of circuit corresponding to the ordinary floating-point multiplication: rounding after multiplication of significands.

Theorem 1. Let ci ∈ LWE`(mi,Bi) be encryptions of mi such that |mi| ≤ Mi and Bi = αiMi −1 for i = 1, 2. Let  = dlog qLe and ∆ = 1 + . If (1) p ≥ 2(n + 1)dlog qLe, (2) αi ≤ , and (3) k p · Bclean ≤ (α1 + α2)M1M2, then

c ← FP.ModSwitch`→(`−k)(FP.Mult(c1, c2))

1 1 is contained in LWE`−k( pk m1m2, αM) for α = ∆(α1 + α2) and M = pk M1M2. 0 0 Proof. It follows from Lemmas 1 and 2 that the vector c ← FP.Mult(c1, c2) is in LWE`(m1m2,B ) 0 0 for B = M1B2 + M2B1 + B1B2 + BKS,`. From Lemma 3, the output c ← FP.ModSwitch`→(`−k)(c ) 1 1 0 of modulus switching procedure belongs to LWE`−k( pk m1m2,B) for B = pk B + BMS. 1 Dividing B by M = pk M1M2, we obtain

k B BKS,` + p · BMS = α1 + α2 + α1α2 + M M1M2  k  n + 1 p Bclean ≤ α1 + α2 + α1α2 + + . 2 4dlog qLe M1M2

k  n+1 p Bclean By making use of the inequalities α1α2 ≤ (α1 + α2) from (2), and ( + ) ≤ 2 2 4dlog qLe M1M2  2 (α1 + α2) from (1) and (3), we get the desired bound B/M ≤ ∆(α1 + α2). ut −1 For convenience, we use the notations  = dlog qLe and ∆ = 1 +  in the rest of our paper. −1 L 1 L 1/ log p Note that  ≈ (L log p) and ∆ ≈ (1+ L log p ) ≈ e is very close to 1. After a homomorphic multiplication with modulus switching procedure, the relative error becomes α = ∆(α1 + α2). In −1 −1 other words, the result ciphertext has a significand with log α ≈ −1 + log αi bits of preci- sion. Hence, this process decreases the precision of message about one bit similar to the case of unencrypted floating-point computation. The condition (1) only depends on the scheme parameters p, n and L from FP.Setup, and most of parameters used in leveled HE schemes satisfy this condition. Hence, we assume that 10 the condition (1) always hold in this paper. We have (α1 + α2)M1M2 = M1B2 + M2B1 in the condition (3). Under the assumption that Bi ≥ Bclean for i = 1, 2, this condition can be relaxed k to max{M1,M2} ≥ p . In addition, the error bound of ciphertext c in Theorem 1 satisfies B = 1 αM ≥ (α1 + α2)M = (α1 + α2) pk M1M2 ≥ Bclean again. Therefore, we only need to consider the k relaxed condition max{M1,M2} ≥ p when the error bound of ciphertexts are not smaller than Bclean.

4 Homomorphic Evaluation of Floating-Point Arithmetics

We now show that noise growth in our FPHE is almost same as in unencrypted floating-point computations. We start with evaluating some typical floating-point circuits homomorphically- ad- dition and multiplication by constants, monomial and power polynomial. We also extend these facts to general polynomial, inverse, exponent, and analytic function. We deduce bounds on the noise growth that occurs during evaluation and analyze the precision of results.

4.1 Addition and Multiplication by Constants We start with presenting simple operations to evaluate the functions f(x) = x + a and f(x) = ax for a real number a ∈ R. Briefly speaking, we add a ∈ R to the first component of a ciphertext for constant addition, and multiply a ciphertext by a ∈ R for constant multiplication.

Lemma 4 (Addition by Constant). Let c ∈ LWE`(m, B) be an encryption of m and a ∈ R 0 be a real number. Then, c ← c + (bae , 0,..., 0) (mod q`) is contained in LWE`(m + bae ,B) ⊆ LWE`(m + a, B + 1/2). 0 Proof. It is obvious from hc , si = bae + hc, si = (bae + m) + e (mod q`) for some |e| ≤ B. ut

In the case of an integer a ∈ Z, the constant addition procedure does not change the error bound B but change the message bound to M + a. As a result, the relative error also changes.

Lemma 5 (Multiplication by Constant). Let c ∈ LWE`(m, B) be an encryption of m such that |m| ≤ M, and a ∈ R be a real number. Then, the vector c0 ← bpu · ae c is contained in u 0 0 u 1 1 LWE`(p · am, B ) for B = (p · |a| + 2 )B + 2 M. 0 u Proof. From the fact that hc, si = m+e (mod q`) for some |e| ≤ B, we have hc , si = bp · ae (m+ u 0 0 u 1 1 e) = p · am + e such that |e | ≤ (p · |a| + 2 )B + 2 M. ut In the previous lemma, let α = B/M be the relative error of given ciphertext c. The vector 0 u 0 u 0 B0 c ← bp · ae c has the message bound M = p · |a|M, and the relative error α = M 0 = α + B+M 1 2pu·|a|M ≤ α + pu·|a| . We may choose an integer u large enough so that the relative error does not increase too much. In particular, if the scalar a ∈ R is of the form p−u · aˆ for some integers u and aˆ, then the relative error remains the same after constant multiplication.

4.2 Monomial and Power Polynomial d Qd We introduce some algorithms to evaluate the power function x and the monomial i=1 xi in k this subsection. We start from the simplest case f(x) = x2 , the power polynomial of a power of two degree. For simplicity, we assume that the bound M of message m is equal to the base p. Algorithm 1 performs the modulus switching procedure repeatedly after each squaring step to k maintain the size of significand. The output of Algorithm 1 is thus not an encryption of f(m) = m2 , k k but an encryption of the scaled value p · f(m/p) = m2 /p2 −1. The following lemma checks the correctness of Algorithm 1 and computes the relative error of output ciphertext. 11

Algorithm 1 Power polynomial f(x) = xd of power-of-two degree 1: procedure (c ∈ × n , d = 2k)) Power Zq` Zq` 2: c0 ← c 3: for i = 1 to k do 4: ci ← FP.ModSwitch(`−i+1)→(`−i)(FP.Mult(ci−1, ci−1)) 5: end for 6: return ck 7: end procedure

Lemma 6 (Power Polynomial of Power-of-Two Degree). Let c ∈ LWE`(m, B) be an en- −(k−1) cryption of m such that |m| ≤ p and B = α0p. If α0 ≤ (2∆) ·  and B ≥ Bclean, then k 2k 2k−1 the output ck ← Power(c, 2 ) of Algorithm 1 is contained in LWE`−k(m /p , αkp) where k αk = (2∆) · α0.

Proof. Let αi = 2∆ · αi−1 for i = 1, . . . , k. From Theorem 1, we can inductively show that ci ← 2i 2i−1 FP.ModSwitch(`−i+1)→(`−i)(FP.Mult(ci−1, ci−1)) belongs to LWE`−i(m /p , αip) for all i. Thus, 2k 2k−1 k after k iterations, ck is contained in LWE`−k(m /p , αkp) where αk = (2∆) · α0. ut

Algorithm 1 is used as a subroutine of the following extended algorithm which evaluates a power polynomial f(x) = xd for any degree d.

Algorithm 2 Power polynomial f(x) = xd 1: procedure (c ∈ × n , d) Power Zq Zq` k1 kr 2: Let d = 2 + ··· + 2 where 0 ≤ k1 < ··· < kr = blog dc. k1 3: c1 ← Power(c, 2 ) 4: v1 ← c1 5: for i = 2 to r do ki−ki−1 6: ci ← Power(ci−1, 2 )

7: vi ← FP.ModSwitch(`−ki)→(`−ki−1)(FP.Mult(ci, vi−1)) 8: end for 9: return vr 10: end procedure

Similar to the previous case, Algorithm 2 outputs an encryption of the scaled value p·f(m/p) = md/pd−1. We show the correctness and analyze the relative error in the following theorem.

Theorem 2 (Power Polynomial). Let c ∈ LWE`(m, B) be an encryption of m such that |m| ≤ p k1 kr and B = α0p. Let d be a positive integer with binary representation d = 2 +···+2 for an integer −(k−1) r and some integers 0 ≤ k1 < ··· < kr. Let k = dlog de. If α0 ≤ (2∆) · and B ≥ Bclean, then d d−1 k the output v ← Power(c, d) of Algorithm 2 is in LWE`−k(m /p , αkp) for αk = (2∆) · α0.

Proof. The case r = 1 is obvious by Lemma 6, so assume that r ≥ 2. Let αi = 2∆ · αi−1 for 2k1 2k1 −1 i = 1, ··· , k. From the proof of Lemma 6, the vector c1 is contained in LWE`−k1 (m /p , αk1 p). 2ki 2ki−1 Using the induction on i, we may show that ci ∈ LWE`−ki (m /p , αki p) and vi ∈ 2k1 +···+2ki 2k1 +···+2ki −1 LWE`−ki−1(m /p , αki+1p) for all i = 1, ··· , r. Therefore, the output of Algo- d d−1 rithm 2 is contained in LWE`−k(m /p , αkp) for k = kr + 1 = dlog de. ut 12

Q One may homomorphically evaluate a monomial f(x1, . . . , xd) = i xi using the binary tree structure, and its relative error can be computed similarly to Theorem 2. Let k = dlog de. For given −(k−1) level-` ciphertexts {ci ∈ LWE`(mi, αip) : 1 ≤ i ≤ d} such that |mi| ≤ p and αi ≤ (2∆) · , one may compute their product by (d − 1)-number of multiplications in depth k. The resulting Qd d−1 k Pd ciphertext is contained in LWE`−k( i=1 mi/p , αp) for α = ∆ · i=1 αi. Comparing this result to the unencrypted case in Section 2.1, our scheme achieves almost optimal precision of floating- point computation.

4.3 Polynomial and Approximating Power Series The goal of this subsection is to evaluate polynomial and analytic functions and analyze the relative error of output ciphertext during homomorphic evaluation. We start by presenting an algorithm for evaluating polynomial functions.

Pd i Algorithm 3 Polynomial function f(x) = i=0 aix 1: procedure (c ∈ × n , f(x) = Pd a xi ∈ [x], u ∈ ) Polynomial Zq` Zq` i=0 i R Z  u+1  2: v ← ( p · a0 , 0,..., 0) 3: for i = 1 to d do u 4: v ← FP.Add(v, bp · aie · Power(c, i)) 5: end for 6: return v 7: end procedure

Pd i Theorem 3 (Polynomial Evaluation). Let f(x) = i=0 aix ∈ R[x] be a real polynomial of degree d with k = dlog de, and c ∈ LWE`(m, B) be an encryption of m such that |m| ≤ p and −(k−1) B = α0p. If α0 ≤ (2∆) · , Then, the output v ← Polynomial(c, f(x), u) of Algorithm 3 is u+1 u+1 Pd  k −u in LWE`−k(p · f(m/p), αf p ) where αf = i=1 |ai| · (2∆) α0 + (d + 1)p .

Proof. From Theorem 2 and Lemma 5, for all i = 1, . . . , d, the ciphertexts Power(c, i) and u i i−1 u+1 i 0 bp · aie·Power(c, i) are contained in LWE`−dlog ie(m /p ,Bi) and LWE`−dlog ie(p ·aimi/p ,Bi) i 0 u for Bi = (2∆) α0p and Bi = p · |ai|Bi + p, respectively. Therefore, the output v of Algorithm 3 is an encryption of pu+1 · f(m/p) with the error bound Pd 0 Pd i u+1 u+1 1/2 + i=1 Bi = 1/2 + i=1 |ai| · (2∆) · α0p + dp, which is less than αf p for αf = Pd  k −u i=0 |ai| · (2∆) α0 + (d + 1)p . ut

The resulting ciphertext of above theorem is an encryption of pu+1 · f(m/p), with the message   u+1 Pd αf k (d+1) bound p · i=0 |ai| and the relative error Pd = (2∆) α0 + u Pd , which is almost i=0 |ai| p · i=0 |ai| k equal to (2∆) α0 for a sufficiently large integer u. By storing some intermediate computation results, we can compute the encryptions of m, m2/p, . . . , md/pd−1 simultaneously in d homomorphic multiplications and so reduce the complexity of Algorithm 3 to d. The evaluation point m/p is in the interval [−1, 1] in Theorem 3, but it can be extended to general cases by taking an alternative polynomial (for example, g(x) such that g(x) = f(px)). Now consider an analytic function f(x) and its Taylor decomposition f(x) = Td(x) + Rd(x) (i) Pd f (0) i for Td(x) = i=0 i! x and Rd(x) = f(x) − Td(x). If Rd(x) goes to zero as d → ∞, then Td(x) 13 converges to f(x). Assume that we are given an encryption of m such that |m| ≤ p with relative error α0. One may use previous theorem to evaluate the function f(x) at m/p by computing its u+1 Taylor polynomial Td(x) for a sufficiently large d and obtaining an encryption of p · Td(m/p), u+1 u+1 which is an approximate value of p · f(m/p) with a small error p · Rd(m/p). The size of total (i) u+1 Pd f (0)  dlog de −u error is bounded by (αT + |Rd(m/p)|) · p for αT = i=0 | i! | · (2∆) α0 + (d + 1)p by Theorem 3. Consider the exponential function f(x) = ex as a concrete example. It has the Taylor polynomial Pd xi e Td(x) = i=0 i! and the remainder Rd(x) is bounded by |Rd(x)| ≤ (d+1)! when |x| ≤ 1. For an encryption of m such that |m| ≤ p with relative error α0, take two integers u and d such that u −1 −1 −1 p ≥ 2α0 and d(d + 1)! ≥ 2α0 . Note that the choice of d = dlog α0 e holds the required u+1 condition. Then Algorithm 3 outputs an encryption of p · Td(m/p) with an error bounded u+1 dlog de −u by αT p for αT ≤ e · (2∆) α0 + (d + 1)p , and it can be also viewed as an encryption u+1 m/p u+1 dlog de of p · e with the message bound ep and the relative error 2(2∆) α0. Hence, the −1 precision of ciphertext is decreased by dlog de + 1 = dlog log α0 e + 1 bits compared to the input ciphertext. ex Now consider another example f(x) = 1+ex . The logistic function is widely used in medicine as a predictive equation in logistic regression. For example, logistic regression was used for prediction the likelihood to have a heart attack in an unspecified period for men in [BLN14]. It was also used as a predictive equation to screen for diabetes in [TH02]. One can evaluate the logistic function ex x −1 1+ex = 1 − (1 + e ) by combining the constant addition, the exponential function and the multiplicative inverse, that will be described in the next subsection.

4.4 Multiplicative Inverse We may refer the previous subsection to compute its Taylor decomposition and apply Theorem 3, to securely compute the multiplicative inverse. However, we will use another method, which is more easy to investigate, and almost preserves the precision of floating-point computation. We  1  review the approximate method using a convergence algorithm [cDSM15]. Given x ∈ 2 , 1 , let  1  x¯ = 1 − x ∈ 0, 2 . Then we see that

2 k−1 k x(1 +x ¯)(1 +x ¯2)(1 +x ¯2 ) ··· (1 +x ¯2 ) = 1 − x¯2 . (1)

h k i Note that this product is in the interval 1 − 2−2 , 1 and converges to one as k goes to infinity. Qk−1 2i −1 2k Hence, i=0 (1 +x ¯ ) = x (1 − x¯ ) can be considered as an approximate multiplicative inverse k of x with an relative error bounded by 2−2 . We now return to our subject. Assume that an integer m has a range of [p/2, p] and denote m¯ = p − m. The standard approach starts by normalizing those numbers to be in the unit interval  1  by setting x = m/p ∈ 2 , 1 . Since we cannot multiply fractions over encrypted data, the precision point has to move to the left for each term of (1). We multiply both sides of the equation (1) by k p2 , and then it yields

2 2 k−1 k−1 k k m(p +m ¯ )(p2 +m ¯ 2)(p2 +m ¯ 2 ) ··· (p2 +m ¯ 2 ) = p2 − m¯ 2 .

Qk−1 2i 2i 2k Therefore, i=0 (p +m ¯ )/p can be seen as approximate inverse of m with an relative error k k bounded by (m/p ¯ )2 ≤ 2−2 . We first suggest an algorithm for evaluating this circuit, and analyze the resulting relative error and optimal number of iterations in the following theorem.

Theorem 4 (Inverse Function). Let c ∈ LWE`(m, B0) be an encryption of m such that p/2 ≤ −(k−1) 2k−1−(k−1) m ≤ p and B0 = α0p/2. If α0 ≤ (2∆) ·  and B0 ≥ 2 · Bclean, then the output vk ← 14

Algorithm 4 Inverse function f(x) = x−1 1: procedure (c ∈ × n , k) Inverse Zq Zq` 2: p ← (p, 0,..., 0) 3: c0 ← p − c 4: v1 ← FP.ModEmbed`→(`−1)(p + c0) 5: for i = 1 to k − 1 do 6: ci ← FP.ModSwitch(`−i+1)→(`−i)(FP.Mult(ci−1, ci−1)) 7: vi+1 ← FP.ModSwitch(`−i)→(`−i−1)(FP.Mult(vi, p + ci)) 8: end for 9: return vk 10: end procedure

0 0 p2  m 2k  Inverse(c, k) of Algorithm 4 is contained in LWE`−k (m , α · 2p) where m = ( m ) 1 − (1 − p ) k −1 and α = ∆ ·α0. In particular, with the input k = dlog log α0 e, this algorithm outputs a ciphertext 2 in LWE`−k(p /m, 2α · 2p).

Proof. Letm ¯ = p−m. From Lemma 4, the ciphertexts c0 and v1 are contained in LWE`(m, ¯ B0) and −1 LWE`−1(p +m, ¯ B0), respectively. Note that they have the same error bound B0 = α0(2 p), but −1 −1 0 −1 different message bounds 2 p and (1+2 )p, and different relative errors α0 and α0 = (2+1) α0. i Let αi = (2∆) · α0 for i = 1, . . . , k − 1. From Theorem 1, it is easy to check that ci belongs to 2i 2i−1 −2i 2i LWE`−i(m /p , αi(2 p)) for all i. By Lemma 4, the vector p+ci is contained in LWE`−i((p + 2i 2i−1 −2i −2i m )/p , αi(2 )p), and it has the increased message bound (1+2 )p and the reduced relative i error α0 = 1 α = (2∆) α . i (22i +1) i 22i +1 0 00 0 00 0 00 Let α1 = ∆ · α0 and αi = ∆(αi−1 + αi−1) for i = 2, ··· , k. Using the induction on i, we can show that vi is contained in

i−1 i−1  Y 2j 2j 2i−2 00 Y −2j LWE`−i  (p +m ¯ )/p , αi · (1 + 2 )p , j=0 j=0

Qi−1 2j 2j 2i−2 2 2i that is, vi can be seen as an encryption of j=0(p +m ¯ )/p = (p /m)(1 − (m/p ¯ ) ) with the Qi−1 −2j −2i message bound j=0(1 + 2 )p = (1 − 2 )(2p) ≤ 2p and the relative error   i−1 i−1 j 00 X i−j 0 i X 2 i αi = ∆ · αj = ∆ ·  j  α0 ≤ ∆ · α0, 22 + 1 j=0 j=0

j from the fact that P∞ 2 = 1. Therefore, the output of Algorithm 4 is contained in LWE (m0, j=0 22j +1 `−k 0 2 2k k α · 2p) for m = (p /m)(1 − (m/p ¯ ) ) and α = ∆ · α0. −1 0 2 −2k 2 In the case of k = dlog log α0 e, the difference of m and (p /m) is bounded by 2 (p /m) ≤ α · 2p also, and this algorithm outputs an encryption of (p2/m) with the message bound 2p and k the relative error 2α = 2∆ · α0. ut

k This algorithm outputs a ciphertext whose relative error is 2α = 2∆ · α0 ≈ 2α0, so it loses about one bit of precision. The optimal number of iterations can be changed when we know more/less information about the magnitude of m. Assume that the message m is contained in the interval ((1 − β)p, p) for some value 0 < β < 1. Letm ¯ = p − m ∈ [0, βp]. Then we have 15

Qk−1 2j 2i 2k 2k 2k k −1 i=0 (p +m ¯ ) = (p /m) · (1 − (m/p ¯ ) ), which is an approximation of p /m with 2 · log β −1 −1 bits of precision. Therefore, the optimal number of iteration is k ≈ log log α0 − log log β in general case. 2k−1−(k−1) The condition B0 ≥ 2 · Bclean in Theorem 4 is stronger than our general assumption B0 ≥ Bclean. However, we may use a simple transformation ciphertexts to make them satisfy the required condition. For a given encryption c ∈ LWE`(m, B0) of m ∈ [p/2, p] such that B0 ≥ Bclean ∗ ∗ ∗ ∗ ∗ and B0 = α0p, let c = p · c (mod q`), m = p · m, B0 = p · B0 and p = p · p. Then c is an ∗ ∗ ∗ ∗ encryption of m ∈ [p /2, p ] with error bound B0 and relative error α0. After this transformation, ∗ we can use Theorem 4 and compute the multiplicative inverse of m homomorphically since B0 = 2(k−1)−(k−1) p · B0 ≥ 2 Bclean.

5 Comparison and Implementation

In this section, we provide a theoretical comparison of our FPHE scheme and previous HE scheme for evaluating arithmetic circuits described in Section 4. We also adapted our method to BGV scheme with Shoup-Halevi’s HE library [HS13] (called HElib). All experiments reported in our paper were performed on a machine with an Intel Xeon 2.6 GHz processor running a Linux 3.16.0 operating system.

5.1 Comparison with previous HE schemes

Gentry, Halevi and Smart [GHS12b] constructed an efficient BGV-type somewhat HE scheme. The security of this scheme is based on the (decisional) Ring Learning With Errors (RLWE) assumption, which was first introduced by Lyubashevsky, Peikert and Regev [LPR10]. This scheme has a chain of ciphertext moduli by a set of primes of roughly the same size, p0, ··· , pL, that is, the i-th Qi modulus qi is defined as qi = k=0 pk. For simplicity, assume that P is the approximate size of the pi’s. We choose an integer m that defines the m-th cyclotomic polynomial Φm(x). For a polynomial ring R = Z[x]/(Φm(x)), set the plaintext space to Rt := R/tR for some fixed t ≥ 2 and the ciphertext space to Rq := R/qR for an integer q. Assume that we are given encryptions of messages m1, . . . , md with η bits of precision. To securely compute a product of mi’s, we can consider three distinct message encoding possibilities with BGV-type schemes: The first two naive methods are called scalar encoding method, where each input (bit or large integer) is encoded as a constant polynomial. The last one is called balanced base-B encoding method for an odd integer B ≥ 3, which represents a fixed-point number with base B and encodes as a polynomial with coefficients in the range [−(B − 1)/2, (B − 1)/2].

• All the messages are encrypted bit-wise. We represent the entire functions as binary circuits, which correspond to arithmetic on bits in Z2. Note that η-bit multiplication is computable by boolean circuits of depth O(log η) with O(η2) complexity [Vol13]. Now suppose that we only obtain the most significant η bits. Then the total depth amount to O(log η · log d) and it requires O(d · η2) homomorphic multiplication in total. So the approximate bit size of the ciphertext modulus qL is (L + 1) · log P = O(log η · log d). We note that it is encrypted as a set of d · η ciphertexts. • The second method is to use a large integer ring as a message space instead of a binary field. It allows us to evaluate with only (d−1) homomorphic multiplications. We first choose sufficiently large t so that no reductions modulo t occurs in the plaintext space, so we take t as the smallest integer which satisfies log t > η · d. Next, given the Hamming weight h of the secret key, it  log(h·φ(m)·t4)  follows from Section 5.1 in [KL15] that L = log d· 2 log P + 1 = O(η · d log d). Thus the bit size of the ciphertext modulus qL is approximately (L+1)·log P = O(η·d log d) and we only 16

have d ciphertexts. The reader can consider another method which is based on decomposition modulus W .1 • Dowlin et al. [DGL+15] suggested a balanced base-B encoding method of fixed-point numbers + PI i for an odd integer B ≥ 3. If a fixed-point number b is represented as b = i=I− biB for + I− PI i some bi ∈ [−(B − 1)/2, (B − 1)/2], it is encoded as a polynomial b(X) = X · ( i=0 biX ) + − PI −i i=1 b−iX . It follows from [CSVW16] that when computing the encryption of product of B−1 mi’s homomorphically, the size of t is needed to ensure correctness as follows: t > ( 2 · η d blogB 2 c) . Thus the bit size of the largest modulus is approximately O(log d·log t) = O(log η· d log d), and we also only need d ciphertexts. In case of our scheme, we may take p = 2η and L = log d for the correctness of decryption procedure after product of ciphertexts of mi’s. Then total computation cost is (d − 1) and the bit size of the ciphertext modulus is log qL = log p · (L + 1) = η · (log d + 1). Even though our scheme only computes an encryption of arithmetic solution with (η − dlog de) bits of precision, the circuit depth is logarithm in d, and the numbers of homomorphic multiplications and ciphertexts are linear in d. We provide a better description of comparison in Table 2.

Table 2. Product of d ciphertexts

Circuit Size of modulus #(HM) #(ctx) Depth (log qL) Bit ptx O(log η log d) O(dη2) O(log η log d) dη Previous Large ptx log d d − 1 O(ηd log d) d Balanced log d d − 1 O(log η · d log d) d Ours log d d − 1 η(log d + 1) d

5.2 Our Implementation with Concrete Parameters In Table 3, we present the parameter setting and performance results for multiplying 16 integers of 20 bits and computing the multiplicative inverse of 20-bit integer with our scheme. All the parameters provide 80-bit security level. When we measured the average running times, we excluded computing times used in data encryption and decryption. Q16 To evaluate the arithmetic circuit i=1 xi homomorphically, BGV-type scheme requires huge computation cost with bit-wise encryption or at least 320-bit plaintext space, so it is impossible to be evaluated in practice. Meanwhile, our scheme can compute the approximate value with 15 bits of precision in about 315ms. Moreover, upon input with 20 bits of precision of x for 1/2 ≤ x ≤ 1, we may compute its multiplicative inverse x−1 with only 8 homomorphic multiplications and depth 5, and its implementation takes about 168ms. 1 For an integer W , we first decompose our messages modulo W and then evaluate the function over the Pr j integers. More precisely, an initial message can be represented as mi = j=0 mi,j · W for mi,j ∈ [0,W ) η and r = blogW 2 c. Assume that it is encrypted as a set of (r + 1) · d ciphertexts of mi,j , denote by P Enc(mi,j ). One can perform Enc(mi ,j ) · Enc(mi ,j ) for 0 ≤ l ≤ 2r, which corresponds to the j1+j2=l 1 1 2 2 multiplication of mi1 and mi2 . After log d steps, we obtain a set of ciphertexts of the form {cti}0≤i≤d·r Pd·r i and then compute i=0 Dec(cti) · W as desired. This method provides some parameter/complexity tradeoff, since obtaining a smaller value of t (i.e., log t = d · log W ) makes parameters small but requires more number of homomorphic multiplications. 17

Table 3. Implementation results for η = 20 with 80 bits security

Bit precision Circuit Size of modulus Timing Function #(HM) of output depth (log qL) (ms) Q16 xi 15 315 i=1 15 4 76 x16 4 84 x−1 18 5 8 76 168

References

[ACPS09] Benny Applebaum, David Cash, Chris Peikert, and Amit Sahai. Fast cryptographic primitives and circular-secure encryption based on hard learning problems. In Advances in Cryptology- CRYPTO 2009, pages 595–618. Springer, 2009. [ACS02] Joy Algesheimer, Jan Camenisch, and Victor Shoup. Efficient computation modulo a shared se- cret with application to the generation of shared safe-prime products. In Advances in Cryptology - CRYPTO 2002, pages 417–432. Springer, 2002. [BGV12] Zvika Brakerski, Craig Gentry, and Vinod Vaikuntanathan. (Leveled) fully homomorphic en- cryption without bootstrapping. In Proc. of ITCS, pages 309–325. ACM, 2012. [BLLN13] Joppe W Bos, Kristin Lauter, Jake Loftus, and Michael Naehrig. Improved security for a ring-based fully homomorphic encryption scheme. In and Coding, pages 45–64. Springer, 2013. [BLN14] Joppe W Bos, Kristin Lauter, and Michael Naehrig. Private predictive analysis on encrypted medical data. Journal of biomedical informatics, 50:234–243, 2014. [Bra12] Zvika Brakerski. Fully homomorphic encryption without modulus switching from classical GapSVP. In Reihaneh Safavi-Naini and Ran Canetti, editors, CRYPTO 2012, volume 7417 of Lecture Notes in Computer Science, pages 868–886. Springer, 2012. [BV11a] Zvika Brakerski and Vinod Vaikuntanathan. Efficient fully homomorphic encryption from (stan- dard) LWE. In Proceedings of the 2011 IEEE 52nd Annual Symposium on Foundations of Computer Science, FOCS’11, pages 97–106. IEEE Computer Society, 2011. [BV11b] Zvika Brakerski and Vinod Vaikuntanathan. Fully homomorphic encryption from Ring-LWE and security for key dependent messages. In Phillip Rogaway, editor, CRYPTO 2011, volume 6841 of Lecture Notes in Computer Science, pages 505–524. Springer, 2011. [cDSM15] Gizem S. C¸etin, Yarkın Dor¨oz,Berk Sunar, and William J. Martin. An investigation of com- plex operations with word-size homomorphic encryption. Cryptology ePrint Archive, Report 2015/1195, 2015. http://eprint.iacr.org/2015/1195. [CLT14] Jean-S´ebastienCoron, Tancr`edeLepoint, and Mehdi Tibouchi. Scale-invariant fully homo- morphic encryption over the integers. In Public-Key Cryptography–PKC 2014, pages 311–328. Springer, 2014. [CS15] Jung Hee Cheon and Damien Stehl´e.Fully homomophic encryption over the integers revisited. In Advances in Cryptology–EUROCRYPT 2015, pages 513–536. Springer, 2015. [CSVW16] Anamaria Costache, Nigel P. Smart, Srinivas Vivek, and Adrian Waller. Fixed point arithmetic in she schemes. Cryptology ePrint Archive, Report 2016/250, 2016. http://eprint.iacr.org/ 2016/250. [DGHV10] Marten van Dijk, Craig Gentry, Shai Halevi, and Vinod Vaikuntanathan. Fully homomorphic encryption over the integers. In Henri Gilbert, editor, EUROCRYPT 2010, volume 6110 of Lecture Notes in Computer Science, pages 24–43. Springer, 2010. [DGL+15] Nathan Dowlin, Ran Gilad, Kim Laine, Kristin Lauter, Michael Naehrig, and John Werns- ing. Manual for using homomorphic encryption for bioinformatics. 2015. http://research. microsoft.com/pubs/258435/ManualHE.pdf. [DM15] L´eoDucas and Daniele Micciancio. Fhew: Bootstrapping homomorphic encryption in less than a second. In Advances in Cryptology–EUROCRYPT 2015, pages 617–640. Springer, 2015. 18

[DNT12] Morten Dahl, Chao Ning, and Tomas Toft. On secure two-party integer division. In Financial Cryptography and Data Security, pages 164–178. Springer, 2012. [Ein05] Bo Einarsson. Accuracy and reliability in scientific computing, volume 18. SIAM, 2005. [Gen09] Craig Gentry. A fully homomorphic encryption scheme. PhD thesis, Stanford University, 2009. http://crypto.stanford.edu/craig. [GHS12a] Craig Gentry, Shai Halevi, and Nigel P. Smart. Better bootstrapping in fully homomorphic encryption. In Public Key Cryptography–PKC 2012, pages 1–16. Springer, 2012. [GHS12b] Craig Gentry, Shai Halevi, and Nigel P. Smart. Homomorphic evaluation of the AES circuit. In Reihaneh Safavi-Naini and Ran Canetti, editors, Advances in Cryptology - CRYPTO 2012, volume 7417 of Lecture Notes in Computer Science, pages 850–867. Springer, 2012. [Gol91] David Goldberg. What every computer scientist should know about floating-point arithmetic. ACM Computing Surveys (CSUR), 23(1):5–48, 1991. [GSW13] Craig Gentry, Amit Sahai, and Brent Waters. Homomorphic encryption from learning with er- rors: Conceptually-simpler, asymptotically-faster, attribute-based. In Advances in Cryptology– CRYPTO 2013, pages 75–92. Springer, 2013. [Hig02] Nicholas J Higham. Accuracy and stability of numerical algorithms. Siam, 2002. [HS13] Shai Halevi and Victor Shoup. Design and implementation of a homomorphic-encryption library. IBM Research (Manuscript), 2013. [HS15] Shai Halevi and Victor Shoup. Bootstrapping for helib. In Advances in Cryptology– EUROCRYPT 2015, pages 641–670. Springer, 2015. [Jak06] Thomas Jakobsen. Secure multi-party computation on integers. PhD thesis, Master Thesis University of Aarhus, Denmark, 2006. [KL15] Miran Kim and Kristin Lauter. Private genome analysis through homomorphic encryption. BMC medical informatics and decision making, 15(Suppl 5):S3, 2015. [LPR10] Vadim Lyubashevsky, Chris Peikert, and Oded Regev. On ideal lattices and learning with errors over rings. In Advances in Cryptology - EUROCRYPT 2010, pages 1–23, 2010. [Pei09] Chris Peikert. Public-key cryptosystems from the worst-case shortest vector problem: Extended abstract. In Proceedings of the Forty-first Annual ACM Symposium on Theory of Computing, STOC ’09, pages 333–342, New York, NY, USA, 2009. ACM. [Reg05] Oded Regev. On lattices, learning with errors, random linear codes, and cryptography. In Proceedings of the Thirty-seventh Annual ACM Symposium on Theory of Computing, STOC ’05, pages 84–93, New York, NY, USA, 2005. ACM. [TH02] Bahman P Tabaei and William H Herman. A multivariate logistic regression equation to screen for diabetes development and validation. Diabetes Care, 25(11):1999–2003, 2002. [Vol13] Heribert Vollmer. Introduction to circuit complexity: a uniform approach. Springer Science & Business Media, 2013. [Wil61] James H. Wilkinson. Error analysis of direct methods of matrix inversion. pages 281–330, 1961. http://dl.acm.org/citation.cfm?id=321076.