Calculating the average using Paillier’s

Christiana Zaraket Maroun Chamoun Tony Nicolas ESIB, USJ CIMTI ESIB, USJ CIMTI ESIB, USJ Beirut, Lebanon Beirut, Lebanon CIMTI [email protected] [email protected] Beirut, Lebanon [email protected]

Abstract—Homomorphic (HE) is one of the This paper is the result of a prolonged study on Paillier’s efficient ways that allow many companies to store their data in cryptosystem and our purpose of the latter is to facilitate and an encrypted form into the Cloud and then the latter can be to make clearer the mathematical ideas and theorems used in analyzed and worked with as if it were still in its initial form. In this cryptosystem. All this is done by giving details and proofs this paper we consider two types of on every step taken, as well by giving numerical examples. So (HE) schemes: the additive and the multiplicative ones. But we that a reader, from any field, can understand not only the will be focusing on the additive one in order to calculate on the Paillier’s cryptosystem but also can see direct applications on cipher-texts and get the average while decrypting the result. it. Therefore we will show the theoretical part of the additive homomorphic Paillier's cryptosystem which is used in order to The rest of this paper is decomposed as follows: in section encrypt and decrypt the messages needed, but what can be more II we define the notion of homomorphic encryption (HE) as interesting is showing in details the application part of it which well as its properties. In section III, we present in details is realized by a numerical example applied on Sagemath. Paillier’s cryptosystem which is an example of a homomorphic encryption (HE) scheme and we show, in Keywords—Homomorphic encryption (HE), additive section IV, how to apply operations on it. In section V, we homomorphic, Paillier’s cryptosystem give a numerical example concerning the previous section.

I. INTRODUCTION And finally we conclude this paper in section VI by giving a conclusion and an idea about future work. Cloud computing is a data storing technique that gives opportunities for out-sourcing of storage and computation. It II. HOMOMORPHIC ENCRYPTION offers flexibility and cost saving, but the main disadvantage In this section we give a definition of a homomorphic that faces many companies to use the Cloud computing in their encryption (HE) for an asymmetric scheme as well as its work is the security concept. The first solution proposed was properties. to encrypt the sensible data before storing it into the Cloud, but if ever we want to apply operations on the latter without A. Asymmetric Scheme [6] decrypting it and without having any information about the An asymmetric scheme is based on three functions: initial data, we need to use what's called homomorphic encryption (HE). The homomorphic encryption (HE) • KeyGen( ):generates two different keys, a public one procedure is simple: the user stores its encrypted data into the noted pk and a secret one noted sk. cloud and sends encrypted queries over it, the latter must be ( ) able to send back to user an answer which is in an encrypted • Encrypt pk, mi : is a mathematical function that is form and while decrypting the result, the user obtains what he able to transform a plain-text mi into a cipher-text ci wants. In addition a homomorphic encryption (HE) scheme using the public pk. could be either additive such as Paillier’s [1] and Goldwasser- • Decrypt(sk, c ): is a mathematical function that uses micali’s [2] or multiplicative such as RSA [3] i the secret key sk in order to obtain the plain-text m and EL Gamal [4] crypto-systems. This depends on what the i user wants as operation to be done on his plain-texts after from the cipher-text ci . applying the decryption function. After all these work, in 2009 B. Homomorphic Encryption (HE) for an Asymmetric Craig Gentry had the idea of creating a new cryptosystem Scheme known as Fully homomorphic encryption scheme (FHE) [5] which can handle addition and multiplication operations at the An asymmetric homomorphic encryption scheme, same time, but such scheme was proved unfeasible in practice schematized in the Fig. 1, is realized by following these steps: due to a massive overhead in computation and memory cost. However, the idea of this article comes from the murex’s As a first step the client must encrypt his messages mi using Encrypt(pk, m ) function, and then he sends the resu purpose, which is to calculate a list of KPI’s such as i lts calculating the average salary for the 10% higher salary and to be stored into the Cloud. The next step could be that the the average monthly earnings... After studying the elements of client sends some queries f( ) to the Cloud that must be capable this list, we found that the operations needed by the latter are to compute an encryption of f(mi ) noted by Encrypt(pk,f(mi )) the addition operation and the dividing by a constant operation and we mention here that f {+, ×}. Then the Cloud returns and then we found a new simple technique to calculate the this result to the client who will compute average of given plain-texts by only using their appropriate Decrypt(sk,Encrypt(pk,f(mi ))) and obtains f(m ). cipher-texts. i

Copyright © 2019 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

113

* where r is a random element from Zn.

This can be represented on Sagemath as following: def encrypt(m,n,g): r=random_prime(n,proof=True) return ((g**m)*(r**n))% n**2

The decryption function is given by: Fig. 1 A client-Cloud homomorphic encryption (HE) scenario L(cλ mod n2 ) Decrypt(sk, c)= mod n All this process, schematized in the Fig. 1, is done without L(gλ mod n2 ) having to decrypt the cipher-texts stored into the Cloud and Where L(u)= u–1 and the Carmichael’s function λ is equal to without having any information on the initial data. n lcm(p–1,q–1). C. Homomorphic Properties There are two types of homomorphic encryption: On Sagemath this function could be represented as following: • An encryption scheme is supposed to be additive homomorphic if it verifies the following: def decrypt(c,lmd,mu,n): n return ((((c**lmd)%(n**2)-1)/n)*mu)%n g(Encrypt(pk, mi ))=Encrypt(pk, ∑ i= 1 mi ).

Remark: The decryption function gives back the plain-text • An encryption scheme is supposed to be m because: multiplicative homomorphic if it verifies the λ 2 m n λ 2 c mod n =(g r ) mod n following: mλ nλ 2 n =g r mod n (1) mλ =g mod n2 (2) g(Encrypt(pk, mi ))=Encrypt(pk, ∏ mi ) . mλ 2 i=1 =(1+n) m od n (3) mλ mλ i 2 We mention here that g {+, ×}. = ∑i=0 Ci n mod n (4) mλ mλ mλ 2 =(C0 .1+C1 .n + C2 .n + 2 mλ mλ i-2 2 n ( ∑i=3 Ci n ))mod n (5) 2 III. PAILLIER’S CRYPTOSYSTEM [1] =(1+nmλ) mod n (6)

Paillier’s cryptosystem is an example of additive The second part of the decryption function which is homomorphic encryption scheme invented by Pascal Paillier = (L(g mod n2))– 1, is calculated in the key generation in 1999. We give in this section an explanation of the Paillier’s cryptosystem construction and its properties. algorithm and so that of λ in order to not being obliged to repeat the same calculation each time we are using the A. Scheme Construction Decryption algorithm and therefore we are optimizing the calculation time of the latter. is calculated as follows: Let n be the multiplication of two chosen prime numbers * λ 2 λ 2 p and q and let g be an element of Zn2 . But in the rest of this g mod n =(1+n) mod n article we will consider that g is equal to n+1 as Damgard, =(1+nλ) mod n2 Jurik and Nielsen has chosen it in 2010 in order to make the choice of g the simplest possible [7]. L(gλ mod n2 )= 1+nλ–1 mod n The public key is given by pk=(n,g) and the secret one is n given by sk=(p,q). = λ mod n –1 So = λ mod n The key generation algorithm using Sagemath is the And then, following: 1+nmλ–1 λ 2 def keygen(KeySize): L(c mod n ) n Decrypt(sk, c)= mod n = mod n=m mod n p=random_prime(2**( KeySize //2),lbound=2**( L(gλ mod n2 ) λ KeySize //2-1),proof=True) q=random_prime(2**( KeySize //2),lbound=2**( Before passing to the next section, we would like to KeySize //2-1),proof=True) explain a little bit about the equalities we have considered λ 2 n=p*q while calulating c mod n .

g=n+1 • To pass from (1) to (2) we have used the Carmichael‘s lmd=(p-1)*(q-1) theorem [8], which is the following: mu=(lmd**(-1))%n * return p,q,n,g,lmd,mu For every element w Zn2

λ To encrypt a plain-text m Zn one can use the following w =1 mod n { nλ 2 encryption function: w =1 mod n In our case we have r Z* which is equivalent m n 2 n c=Encrypt(pk, m)=g r mod n , * to r Zn2 and 0

114 and therefore while applying Carmichael‘s def encrypted_mean(M, n,g): 2 theorem one can get rnλ =1 mod n . k=(len(M)**(-1))% n return (encrypted_sum(M,n,g)**k)%(n**2) • To pass from (2) to (3) we have replaced g by (1+n).

• To pass from (3) to (4) we have used the binomial • Step 3: Apply Decrypt( ) function on [µ] to obtain

theorem [9] which is the following: ∑ k m n i=1 i mod n which is the average but in Z . k n n n n-k k (x+y) = ∑ C k x y

k=0 Proof:

Where the n choose k combinations formula is equal Decrypt(sk,[µ]) –1 to: k 2 k mod n 2 =Decrypt(sk,( ∏i= 1 ci mod n ) mod n )

n! =Decrypt(sk, Cn = k k-1mod n k!(n-k)! (Encrypt(pk, ∑k m mod n)) mod n2 ) i=1 i -1 ∑k • To pass from (4) to (5), we have to develop the =(k mod n . mi -1 k mλ mλ =(k ∑ mod n) mod n (using (8)) i=1 i 2 i=1 m i )mod n. formula ∑ Ci n mod n . i=0 This step is applied on Sagemath as following: • To pass from (5) to (6), we calculate each term of the def dec_mean(M,lmd,mu,n,g): equation (5) as follows: return decrypt(encrypted_mean(M,n,g),lmd,mu,n)

(mλ)! - Cmλ .1= .1=1 • Step 4: Applying LLL algorithm or the Extended 0 0!(mλ-0)! Euclidean Algorithm to obtain the average result in mλ (mλ)! - C1 .n= .n= mλn 1!(mλ–1)! Q.

mλ 2 2 mλ mλ i-2 2 Proof: - (C2 .n +n ( ∑i=3 Ci n )) mod n =0 because (integer×n2 ) mod n2 =0 1) LLL Algorithm idea

B. Paillier’s Cryptosystem Properties • The LLL algorithm purpose

Suppose that we have two plain-texts m1 ,m2 , with two Let a1 , a2 be two vectors that form a basis of cipher-texts respectively Encrypt(pk,m1 ), Encrypt(pk,m2 ). a lattice L. The goal of LLL algorithm is to take as input a , a and to give as output a Paillier’s cryptosystem verifies the following equations: 1 2 new basis for the lattice L where the lengths • Decrypt( Encrypt(pk,m1 )Encrypt(pk,m2 ) of the vectors of the latter are as short as 2 mod n )=(m1 +m2 ) mod n [1]. (7) possible. One of the most important properties of k 2 • Decrypt( Encrypt(pk,m1 ) mod n ) = k. m1 mod n the LLL algorithm is that the first vector [10]. (8) given as output represents the shortest one Equation (7) shows that Paillier’s cryptosystem is in the lattice L [11]. additive homomorphic.

IV. APPLYING OPERATIONS ON PAILLIER’S CRYPTOSYSTEM • The use of LLL algorithm in our case ∑ k m Calculating on the cipher-texts and getting the average Suppose that µ* is equal to i=1 i such that k while decrypting the results could be interesting. k gcd( ∑ i=1 mi ,k) and gcd(k,n) are equal to 1. And suppose that µ represents the latter in To realize that, we need to follow four steps: Zn so that it could be written in this way: k k ∑i=1 mi • Step 1: Calculate an encryption of ∑ i=1m i mod n by mod n . The condition of having k k 2 using (7) which means by calculating ∏i= 1 ci mod n gcd(k,n) =1, taken before assure the and we consider here that k is the number of elements existence of k–1 in µ. needed. This step is applied on Sagemath as In our case, this form is obtained after following: applying Decrypt formula on [µ]. So we k def encrypted_sum(M,n,g): know the value of µ and not that of ∑i=1 mi C=[encrypt(m,n,g) for m in M ] nor that of k. A way to find the values of

return mul(C) % (n**2) the latters is to apply the two-dimensional lattice theory.

–1 k We define a lattice L as following: • Step 2: Calculate an encryption of k ∑ m mod n i=1 i L = {(x,y) ∈Z2 ; x = yµ mod n} (9) by using (8) which means by calculating which is equivalent to: –1 k 2 k mod n 2 2 k [µ] = ( ∏i= 1 ci mod n ) mod n . This step is L = {(x,y) ∈Z ; kx =y ∑i=1 mi mod n}(10) applied on Sagemath as following: From (9), one can say that (n,0) and (µ,1)

115 form a basis of L which is correct for two that in each time the initial vector length is reasons: getting shorter. b) Swap two vectors. • First, because one can notice that n=0µ mod n therefore (n,0) ∈L and one • Extended Euclidean Algorithm using can also notice that µ = 1µ mod n Sagemath therefore (µ,1)∈L. def ExtendedEuclide(N,avg): • Second, we have n×1–0×µ≠0 therefore u1=0 (n,0) and (µ,1) are not collinear. u2=N v1=1 k From (10), replacing x by ∑ i=1 mi and y by v2=moy k k one can deduce that (∑ i= 1 mi ,k) is a vector of while u2>sqrt(N): k Q = u2 // v2 L. So obtaining the values of ∑ i=1 mi and k that verifies gcd( ∑k m ,k) =1 , means that the [t1,t2]=[u1-v1*Q,u2-v2*Q] i=1 i ∑ k m [u1,u2]=[v1,v2] fraction i=1 i is irrational and then the latters’ [v1,v2]=[t1,t2] k values are optimal. This is equivalent to finding return u2/u1 the shortest vector of the lattice L that can be obtained by the application of the LLL • Average Function using Extended algorithm on the basis vectors (n,0) and (µ,1) Euclidean algorithm in Sagemath: [12]. def avg_Euclide(M,lmd,mu,n,g): return ExtendedEuclide • LLL algorithm using Sagemath: (n,dec_mean(M,lmd,mu,n,g)) def lattice_LLL(N,avg):

M=Matrix(2,2,[avg,1,N,0]) #The vectors of the basis in a matrix form V. PRACTICAL EXAMPLE OF CALCULATING THE AVERAGE L=M.LLL() #The new basis USING THE CIPHER-TEXTS vectors Using Sagemath, we show in this part an example that return L[0,0]/ L[0,1] illustrates the previous sections. #Return the fraction formed by the [nb, min, max, KeySize] = [10, 100, 500, 24] shortest vector random.seed(int(time.time())) #elements which corresponds to M=[random.randint(min,max) for m in range(nb)] k ∑ mi p,q,n,g,lmd,mu=keygen(KeySize) i=1 . k print("Average using LLL= "+str(avg_LLL(M,lmd,mu,n,g))) print("Average using Extended Euclide= • Average Function using LLL "+str(avg_Euclide(M,lmd,mu,n,g))) algorithm in Sagemath: Suppose that we have p = 3623 and q = 3833. In this case def avg_LLL(M,lmd,mu,n,g): we have n=pq= 13886959, g=n+1= 13886960, λ= 13879504 return –1 and [L(gλ mod n2 )] = 594224. lattice_LLL(n,dec_mean (M,lmd,mu,n,g)) Using the public key pk=(n,g), the secret key sk=(p,q) and random numbers r, one can compute cipher-texts given by c=Encrypt(pk, m)=gm rn mod n2 as shown in the table below. 2) Extended Euclidean Algorithm idea [11] The Extended Euclidean Algorithm gives as result the shortest vector in a given two-dimensional lattice L. The only thing that differs the Extended Euclidean Algorithm from LLL Algorithm is that the latter can handle higher dimensions. But in our case we just need two dimensional lattices therefore the two algorithms can be applied and give the same In our case, the average of the plain-texts is equal to result. 1843/5. To obtain it using only the cipher-texts one must follow the steps given in section IV.

To improve a given basis the Extended Euclidean As mentioned before the first step is to calculate Algorithm follows these steps: k–1mod n k 2 2 a) Subtract from a vector a linear [µ] = ( ∏ ci mod n ) mod n which is equal in our case to 4010i=12 169206101. In the second step, while decrypting combination of the others which means the latter; one can obtain 8332544 as a result. Finally, to have

116 the average value in Q, one must apply the Extended Euclidean Algorithm or the LLL algorithm and obtain 1843/5. So this example make us sure that in this way we can always REFERENCES calculate safely on the cipher-texts and obtain the same result [1] Naveed ISLAM, William PUECH and Robert BROUZET, How to as if we are applying the average on the plain-texts. Secretly Share the Treasure Map of the Captain? Proc. SPIE 7542, Multimedia on Mobile Devices 2010, 75420L (27 January

VI. CONCLUSION AND FUTURE WORK 2010); https://doi.org/10.1117/12.839844 [2] Shruthi R, Sumana P, Anjan K Koundinya, Performance Analysis of In this paper, we reviewed the companies’ problem Goldwasser-Micali Cryptosystem, International Journal of Advanced concerning the security issue while using the Cloud Research in Computer and Communication Engineering, Vol. 2, Issue Computing in their work and how it can be solved by the 7, July 2013 homomorphic encryption idea. We present in particular the [3] Nikita Somani, Dharmendra Mangal, An Improved RSA Cryptographic System, International Journal of Computer Applications Murex’s KPIs list based on addition and ordering operations. (0975 – 8887), Volume 105 – No. 16, November 2014. For additional issue we used an important additive [4] Andreas V. Meier, The ElGamal Cryptosystem, June 8, 2005. homomorphic scheme known by Paillier’s cryptosystem and [5] Craig Gentry.2009. Fully homomorphic encryption using ideal lattices. we reviewed a numerical application on it. However, in fact, In Proceedings of the forty-first annual ACM symposium on Theory of we realized that this cryptosystem could not achieve the Computing (STOC’09). ACM, New York, NY, USA, ordering idea because of the existence of the modulus in its 169178:https://doi.org/10.1145/1536414.1536440 [6] Khalil Hariss, Maroun Chamoun and Abed Ellatif Samhat, On DGHV encryption function. and BGV Fully Homomorphic Encryption Schemes, 1st Cyber Security in Networking Conference (CSNet), Rio de Janeiro, Brazil, 18-20 Based on this problem, future work will focus on finding Oct. 2017. a new secured cryptosystem that can handle the additive and [7] Nina Pettersen, Applications of Paillier’s Cryptosystem, NTNU, the ordering issue at the same time. Our objective is to allow Norwegian University of Science and Technology, August 2016. Murex Company (and other companies that provide [8] Andreas Steffen, The Paillier Cryptosystem, Hochschule für Technik Rapperswil, 17.12.2010, Paillier.pptx 1. technology solutions to financial market) to calculate their [9] Brett Berry, The Binomial Theorem Explained wi8th a special splash KPIs list easily and efficiently. of Pascal’s Triangle, Oct 26, 2018. [10] Mohamed Nassar, Abdelkarim Erradi, Qutaibah M. Malluhi, Paillier’s ACKNOWLEDGMENT Encryption: Implementation and Cloud Applications. International Conference on Applied Research in Computer Science and We thank Dr. Eric Filiol (ESIEA University, France) for Engineering (ICAR) valuable discussions and feedback. [11] Phong Nguyễn, The LLL Algorithm, May 2010, Luminy,

We would also like to thank Murex Company for placing http://www.di.ens.fr/~pnguyen. their trust and confidence in our abilities to achieve our [12] Pierre-Alain Fouque, Jacques Stern, and Geert-Jan Wackers, CryptoComputing with rationals. In Financial , volume purpose. 2357 of Lecture Notes in Computer Science, pages 136-146. Springer,

Finally, the authors would like to acknowledge the Berlin, Heidelberg 2002. National Council for Scientific Research of Lebanon (CNRS – L) for granting a doctoral fellowship for Christiana Zaraket.

117