1 Minimax Approximation of by Composite Polynomial for Homomorphic Comparison

Eunsang Lee, Joon-Woo Lee, Jong-Seon No, Fellow, IEEE, and Young-Sik Kim, Member, IEEE

Abstract—The comparison operation for two numbers is one of the most commonly used operations in several applications, including deep learning. Several studies have been conducted to efficiently evaluate the comparison operation in homomorphic encryption schemes, termed homomorphic comparison operation. Recently, Cheon et al. (Asiacrypt 2020) proposed new comparison methods that approximate the sign function using composite polynomial in homomorphic encryption and proved that these methods have optimal asymptotic complexity. In this paper, we propose a practically optimal method that approximates the sign function by using compositions of minimax approximate polynomials. It is proved that this approximation method is optimal with respect to depth consumption and the number of non-scalar multiplications. In addition, a polynomial-time algorithm that determines the optimal compositions of minimax approximate polynomials for the proposed homomorphic comparison operation is proposed by using dynamic programming. The numerical analysis demonstrates that the proposed homomorphic comparison operation reduces running time by approximately 45% (resp. 41%) on average, compared with the previous algorithm if running time (resp. depth consumption) is to be minimized. In addition, when N is 217, and the precision parameter α is 20, the previous algorithm does not achieve 128-bit security, while the proposed algorithm achieves 128-bit security due to small depth consumption.

Index Terms—Cheon–Kim–Kim–Song scheme, fully homomorphic encryption, homomorphic comparison operation, minimax approximate polynomial, Remez algorithm, sign function. !

1 INTRODUCTION

OMOMORPHIC encryption (HE) is a cryptographic al- require non-arithmetic operations. One of the core non- H gorithm that allows algebraic operations on encrypted arithmetic operations is the comparison operation, which is data. Before Gentry’s seminal work [1] in 2009, HE schemes denoted as comp(u, v) and outputs 1 if u > v, 1/2 if u = v, were able to perform only a few specific operations on the and 0 if u < v. This comparison operation is widely used in encrypted data. Fully homomorphic encryption (FHE) is a various real-world applications including machine learning cryptographic algorithm that was first developed in [1] and algorithms such as support-vector machines, cluster analy- allows all algebraic operations on the encrypted data with- sis, and gradient boosting [7], [8]. out restriction. Accordingly, FHE has attracted significant The max function max(u, v) is another core non- attention in various applications, and its standardization arithmetic operation, and it is particularly important in process is in progress. deep learning. The max pooling operation in deep learning, FHE schemes can be classified as bitwise and word-wise. which is the same as the max function, is used to extract Word-wise FHE, such as Brakerski/Fan–Vercauteren [2] and the most distinctive feature from some layers in a neural Cheon–Kim–Kim–Song (CKKS) [3], allows the addition and network, and prevents the model from overfitting. Several multiplication of an encrypted array over C or Zp for a deep learning models, such as AlexNet [9], VGGNet [10], positive integer p > 2. All other operations in word-wise GoogleNet [11], Inception [12], and ResNet [13], use the FHE should be performed using these two basic operations. max pooling operation. In addition, sorting algorithms with By contrast, the basic operations of a bitwise FHE scheme, the max function are used as subroutines of various other such as fast fully homomorphic encryption over the torus algorithms because it is frequently required to use sorted [4], are logic gates such as NAND gates. Recently, word- rather than unsorted data. Thus, it is important to efficiently wise FHE has been widely used in several applications such evaluate the max function on encrypted data, termed ho- as deep learning [5], [6]. momorphic max function, because this directly improves Although word-wise FHE can support virtually all arith- the performance of the max pooling operation in privacy- metic operations on encrypted data, several applications preserving deep learning models and of sorting algorithms on encrypted data, termed homomorphic sorting algorithms • E. Lee, J.-W. Lee, and J.-S. No are with the Department of Electrical and [14]. Computer Engineering, INMC, Seoul National University, Seoul 08826, Several studies have been conducted to efficiently imple- South Korea. ment the comparison operation on encrypted data, termed E-mail: (eslee3209, joonwoo3511)@ccl.snu.ac.kr, [email protected]. • Y.-S. Kim is with the Department of Information and Communication homomorphic comparison operation, and homomorphic Engineering, Chosun University, Gwangju 61452, South Korea. max function. The comparison operation and max function E-mail: [email protected] can be easily implemented using the sign function, that 1 (Corresponding author: Joon-Woo Lee) is, comp(u, v) = 2 (sgn(u − v) + 1) and max(u, v) = 2

b = 1 b = 1 i 0 i ai

proposed proposed 0 ai ai f4 [15] g4 [15] ai

ai−1 bi−1 = 1 ai−1 bi−1 = 1

(a) Comparison with f4 (b) Comparison with g4

Fig. 1. Comparison between the proposed minimax approximate polynomial and the previous polynomials f4 and g4 in [15].

(u+v)+(u−v) sgn(u−v) 2 , where sgn(x) = x/|x| for x 6= 0, and which implies that the error condition in (1) is satisfied. 0, otherwise. Thus, implementing sgn(x) leads directly to The most challenging in approximating sgn(x) is that implementing comp(u, v) and max(u, v), and we focus on the interval [, 1], whose size is nearly one, should finally implementing sgn(x) in this paper. be mapped to a tiny interval [1 − 21−α, 1 + 21−α], whose Since sgn(x) is not a polynomial, to evaluate sgn(x) in size is 22−α, through k component polynomials. Our key FHE, which provides only addition and multiplication, it is observation is that finding good polynomials that efficiently necessary to determine and evaluate a polynomial p(x) that reduce the given intervals is the core of an efficient compar- approximates sgn(x). Since sgn(x) is discontinuous at x = ison operation implementation. Then, the natural question 0, the small neighborhood (−, ) of zero is not considered would be when measuring the difference between p(x) and sgn(x). “Given an interval [a , b ] = p ◦ · · · ◦ p ([, 1]) and an Specifically, it is required that i−1 i−1 i−1 1 upper bound of degree di, what is the best polynomial with |p(x) − sgn(x)| ≤ 21−α for x ∈ [−1, −] ∪ [, 1], (1) odd-degree terms pi of degree at most di that maps this interval into the smallest interval?” 1 which implies that 2 (p(u − v) + 1) is an approximate value of comp(u, v) within 2−α error for u, v ∈ [0, 1] satisfying |u − v| ≥ . Here,  and α determine the input and 1.1 Our Results output precisions of comparison operation, respectively. It 1.1.1 Minimax Composite Polynomial is desirable to determine an approximate polynomial that First, for [ai, bi] = pi([ai−1, bi−1]), we consider the case requires less running time and depth consumption. To this when bi should be one. Then, we note that the best poly- end, it is desirable to reduce the polynomial degree as much nomial of degree at most di that maps [ai−1, bi−1] into the as possible. smallest interval [ai, bi] is the minimax approximate poly- A method to approximate sgn(x) using a composite nomial of degree at most di on [−bi−1, −ai−1] ∪ [ai−1, bi−1] polynomial was recently proposed in [15], and it was (for c sgn(x) for some constant c < 1 such that bi = 1). Thus, proved that this method achieves the optimal asymptotic we come up with the idea of using a composite polyno- complexity. However, the functions fn and gn used in [15] mial of minimax approximate polynomials, called minimax do not ensure approximation optimality, and thus, finding composite polynomial, where each component polynomial pi better composite polynomials that approximate sgn(x) is an is the minimax approximate polynomial of degree at most important study. di defined on [−bi−1, −ai−1] ∪ [ai−1, bi−1] = pi−1 ◦ · · · ◦ Let p = pk ◦ pk−1 ◦ · · · ◦ p1 be a composite polynomial p1([−1, −] ∪ [, 1]). that approximates sgn(x). Since sgn(x) is an odd function, The two functions fn and gn used in [15] cause some it is natural to approximate sgn(x) by using a composition inefficiency compared to the proposed method. First, since of polynomials with odd-degree terms. Thus, we assume fn is not the minimax approximate polynomial, it does not that p1, p2, ··· , pk are polynomials with odd-degree terms, efficiently reduce the size of the interval compared to the and because of the symmetry, it suffices to consider only minimax approximate polynomial. In addition, although gn the case of x > 0 to verify that the composite polynomial is the minimax approximate polynomial (for c sgn(x) for p = pk ◦ pk−1 ◦ · · · ◦ p1 satisfies the error condition in (1). some c < 1), it is the minimax approximate polynomial Let p1([, 1]) = [a1, b1],p 2 ◦ p1([, 1]) = [a2, b2], ··· ,p k ◦ on a different domain from [−bi−1, −ai−1] ∪ [ai−1, bi−1], · · · ◦ p1([, 1]) = [ak, bk]. Each component polynomial pi can which results in some inefficiency. Fig. 1 shows that us- be seen as performing a task of mapping a given interval ing the proposed minimax approximate polynomial on [ai−1, bi−1] = pi−1 ◦ · · · ◦ p1([, 1]) into a smaller interval [−bi−1, −ai−1] ∪ [ai−1, bi−1] reduces the size of the interval 1−α 1−α 0 [ai, bi], and we should have [ak, bk] ∈ [1 − 2 , 1 + 2 ], more efficiently than using fn or gn (that is, [ai, 1] ⊂ [ai, 1]). 3

1.86 1.35 1 1 1 0.65 0.14 0.01 1 0.14 1.86 0.65 1.35

(a) p1(x) (b) p2(x) (c) p3(x)

Fig. 2. The component polynomials of an example of the proposed minimax composite polynomial p = p3 ◦ p2 ◦ p1 on D = [−1, −0.01] ∪ [0.01, 1] for {d1, d2, d3} = {9, 9, 9}.

In this paper, the range [ai, bi] = pi([ai−1, bi−1]) is actually polynomial for some set of degrees {di}1≤i≤k that satisfies in the form of [1 − τi, 1 + τi] (centered at one) for i ≥ 2, not the error condition in (1) and requires a smaller or equal in the form such that bi = 1 as shown in Fig. 1. However, depth consumption and number of non-scalar multiplica- it is possible to arbitrarily change the range of a component tions than the given composite polynomial. Now, our goal polynomial by scaling while maintaining the performance of is to find the optimal set of degrees for minimax composite the homomorphic comparison operation the same. Specifi- polynomial. Specifically, for a given depth consumption D, cally, if cpi(x) is used instead of pi(x) and pi+1(x/c) instead we find the optimal set of degrees such that the minimax of pi+1(x) for some i and c > 0, we can scale the range composite polynomial for the set of degrees requires the [ai, bi] = pi ◦ · · · ◦ p1([, 1]) by c times while maintaining minimum number of non-scalar multiplications while con- the same performance of the homomorphic comparison suming depth D. operation because the degrees of pi(x) and pi+1(x) do not First, we can think of brute-force searching for all num- change. Thus, although minimax approximate polynomials bers of compositions and the degrees of the component whose range is centered at one are used in this paper, the polynomials to find the optimal set of degrees. However, minimax approximate polynomial in Fig. 1 is scaled so that for the upper bound of the numbers of compositions k¯ and bi = 1 for comparison with fn and gn in [15]. that of the degrees of component polynomials d¯, this brute- ¯ Now, the formal definition of the proposed minimax force search requires O(d¯k) times, which is too much when composite polynomial is described. For a set of polynomials α is large. We propose a fast method to find the optimal set {pi}1≤i≤k, the composite polynomial pk ◦ pk−1 ◦ · · · ◦ p1 is of degrees using dynamic programming. called a minimax composite polynomial on D = [−b, −a]∪[a, b] Specifically, the values of h(m, n, τ) and G(m, n, τ) (de- if there exists {di}1≤i≤k that satisfies the following: fined in Section 3.3) are evaluated using a recursion equation • p1 is the minimax approximate polynomial of degree at (see Theorem 3), and the optimal sets of degrees are ob- most d1 on D for sgn(x). tained using these values (see Algorithm 5). It is proved that • For 2 ≤ i ≤ k, pi is the minimax approximate polyno- the minimax composite polynomial for the set of degrees mial of degree at most di on pi−1 ◦ pi−2 ◦ · · · ◦ p1(D) for determined from the proposed algorithm is optimal with sgn(x). respect to depth consumption and the number of non-scalar Let τi be the minimax approximation error of the min- multiplications. imax approximate polynomial pi for 1 ≤ i ≤ k. Here, τi For the upper bound of the number of non-scalar mul- 1−α becomes smaller as i increases, and if τk ≤ 2 , then tiplications m¯ , that of depth consumption n¯, and that of p = pk ◦ · · · ◦ p1 satisfies the error condition in (1). The degrees of component polynomials d¯, the time complexity component polynomials of the minimax composite poly- of the proposed algorithm is O(m ¯ n¯d¯). It is pretty fast when nomial efficiently reduce the size of the given interval implemented, and we obtain the optimal set of degrees for without inefficiencies caused by fn and gn in [15]. Fig.2 input precision α from 5 to 20 (see Table 2). shows the component polynomials of an example of the proposed minimax composite polynomial p = p3 ◦ p2 ◦ p1 1.1.3 Improved Performance of Homomorphic Comparison on D = [−1, −0.01] ∪ [0.01, 1] for {d1, d2, d3} = {9, 9, 9}. Operation It can be seen that when the HEAAN library [3] is used, the 1.1.2 Finding Optimal Degrees by Using Dynamic Pro- proposed homomorphic comparison operation algorithm gramming reduces running time by approximately 45% (resp. 41%) We prove that approximating sgn(x) using minimax com- on average, compared with the previous algorithm CompG posite polynomial is the optimal method with respect to (referred to as NewCompG in [15]) if running time (resp. the number of non-scalar multiplications and depth con- depth consumption) is to be minimized. In addition, the sumption. That is, we prove that for any given compos- proposed homomorphic comparison operation algorithm ite polynomial that approximates sgn(x) and satisfies the reduces the ciphertext modulus bit by approximately 41% error condition in (1), there exists a minimax composite (resp. 45%) on average if running time (resp. depth con- 4

sumption) is to be minimized. In particular, it is noteworthy composition of minimax approximate polynomials is pro- that when the precision parameter α is 20, the proposed ho- posed, and it is proved that the proposed approximation momorphic comparison operation achieves 128-bit security method is optimal. In addition, a polynomial-time algorithm due to small depth consumption while the homomorphic to obtain the optimal minimax composite polynomial for comparison operation in [15] does not achieve 128-bit secu- the homomorphic comparison operation is proposed using rity. dynamic programming, and the performance achieved by the optimal minimax composite polynomial is compared 1.1.4 Improved Performance of Homomorphic Max Func- with the previous algorithm when depth consumption and tion and Homomorphic Sorting the number of non-scalar multiplications are minimized, re- (u+v)+(u−v) sgn(u−v) spectively. In Section 4, the proposed method of approximat- Since we have max(u, v) = , max(u, v) 2 ing sign function is applied to homomorphic max function can be easily implemented using the proposed method of and homomorphic sorting. In Section 5, numerical results approximating sgn(x). As a result, the proposed homo- for the proposed homomorphic comparison operation, the morphic max function algorithm reduces running time by proposed homomorphic max function, and a homomorphic approximately 41% (resp. 35% ) on average and reduces the sorting algorithm that uses this function are provided in the ciphertext modulus bit by approximately 33% (resp. 45%) HEAAN library [3]. Finally, concluding remarks are given on average, compared with the previous algorithm MaxG in Section 6. (referred to as NewMaxG in [15]) if running time (resp. depth consumption) is to be minimized. Furthermore, we implement the homomorphic sorting of 2 PRELIMINARIES three and four numbers using the proposed homomorphic 2.1 Fully Homomorphic Encryption max function. As a result, the homomorphic sorting algo- rithm using the proposed homomorphic max function algo- FHE schemes are classified as bitwise and word-wise. The rithm is approximately two times as fast as using the previ- basic operations of the former are logic gates, whereas the ous homomorphic max function algorithm. In addition, the basic operations of the latter are algebraic, such as addition sorting algorithm that uses the previous homomorphic max and multiplication. In this paper, we focus only on word- function algorithm does not achieve 128-bit security, while wise FHE, and the term FHE refers to word-wise FHE. The the sorting algorithm that uses the proposed homomorphic definition of FHE is as follows: max function algorithm achieves 128-bit security due to Definition 1. An FHE scheme is a set of five polynomial-time small depth consumption. algorithms that satisfy the following: • KeyGen(λ) → (pk, sk, evk); KeyGen takes a security 1.2 Related Works parameter λ as input, and outputs a public key pk, a secret key sk, and an evaluation key evk. Some research has been conducted on determining poly- • Enc(µ, pk) → ct; Enc takes a public key pk and a message nomials that approximate sgn(x) or comp(u, v) in FHE. µ as input, and outputs a ciphertext ct of µ. An analytic method to approximate the sign function us- • Dec(ct, sk) → µ0 or ⊥; Dec takes a ciphertext ct and a ing Fourier series was proposed in [16]. In [17], the sign secret key sk as input, and outputs a message µ0. If the function was approximated using the approximate equation decryption procedure fails, Dec outputs a special symbol ⊥. ejx−e−jx tanh(jx) = ejx+e−jx ' sgn(x) for large j > 0. Recently, • Add(ct1, ct2, evk); Add takes ciphertexts ct1 and ct2 of µ1 an iterative algorithm was proposed that performs a ho- and µ2, respectively, and an evaluation key evk as input, uj momorphic comparison using the equation lim j j = and outputs a ciphertext ctadd of µ1 + µ2. j→∞ u +v comp(u, v) in [18], where the inverse operation can be • Mult(ct1, ct2, evk); Mult takes ciphertexts ct1 and ct2 of µ1 performed using the Goldschmidt division algorithm [19]. and µ2, respectively, and an evaluation key evk as input, However, the use of the inverse operation results in ineffi- and outputs a ciphertext ctmult of µ1 · µ2. cient computation. More recently, in [15], the homomorphic The CKKS scheme has two types of multiplication: scalar comparison operation was approximated using composite and non-scalar. The latter is the multiplication of two vari- polynomials with a smaller depth consumption and number ables, and the former is the multiplication of a variable and of non-scalar multiplications than in previous methods. It a constant. Non-scalar multiplications require significantly was also shown that this homomorphic comparison op- more running time than scalar multiplications. Thus, in this eration has optimal asymptotic computational complexity. paper, when the homomorphic comparison operation and However, its performance can be further improved because homomorphic max function are considered, we focus on the composite polynomials used in [15] do not optimally reducing depth consumption and the number of non-scalar approximate the sign function. rather than scalar multiplications.

1.3 Outline 2.2 Comparison Operation in Fully Homomorphic En- The remainder of this paper is organized as follows. Section cryption 2 presents preliminaries regarding the concept of FHE, FHE schemes support addition and multiplication opera- comparison operation in FHE, approximation theory, and tions on the encrypted data, but not non-arithmetic oper- the algorithms for minimax approximation. In Section 3, a ations, such as comparison operation. Thus, the approxi- new method for approximating the sign function using a mation of the comparison operation should be performed 5

by using additions and multiplications. The comparison concerned with polynomials that approximate sgn(x) in this operation and the sign function are denoted as paper. Moreover, we are only concerned with cases in which D is the union of two symmetric closed intervals [−b, −a] ∪   [a, b]. 1 if u > v 1 if x > 0   We refer to the definition of Haar’s condition [20] for a comp(u, v) = 1/2 if u = v , sgn(x) = 0 if x = 0 .   set of functions, which deals with the generalized version 0 if u < v −1 if x < 0 of polynomial bases such as power basis or Chebyshev polynomial basis. It is well known that the power basis and Our objective is to approximate comp(u, v) using addi- Chebyshev polynomial basis satisfy Haar’s condition. Thus, tions and multiplications only. We note that comp(u, v) and if an argument deals with a set of functions that satisfy sgn(x) are related as follows: Haar’s condition, it naturally includes the case of power sgn(u − v) + 1 basis or Chebyshev polynomial basis. comp(u, v) = . 2 Definition 4 (Haar’s condition and generalized polynomial Thus, the approximation of comp(u, v) is equivalent to [20]). A set of functions {φ1, φ2, ··· , φn} satisfies Haar’s con- that of sgn(x). Therefore, we only focus on the polynomial dition if each φi is continuous and the determinant approximation of sgn(x). Definition 2 quantifies how close a φ1(x1) ··· φn(x1) sgn(x) sgn(x) sgn(x) polynomial that approximates is to . is . . . x = 0 D[x1, ··· , xn] = . .. . discontinuous at , and thus it is impossible to exactly approximate sgn(x) near x = 0. Definition 2 implies that φ1(xn) ··· φn(xn) 2−α the approximation error is ensured to be below only for is not zero for any n distinct points x1, ··· , xn. A linear  ≤ |x| ≤ 1 . combination of {φ1, φ2, ··· , φn} is referred to as a generalized Definition 2 ( [15]). For α > 0 and 0 <  < 1, a polynomial polynomial. p is said to be (α, )-close to sgn(x) over [−1, 1] if p satisfies the The following theorem and lemmas are required for following: some proofs in Section 3. −α ||p(x) − sgn(x)||∞,[−1,−]∪[,1] ≤ 2 , Theorem 1 (Chebyshev alternation theorem [20]). Let D be a closed subset of [a, b], and let {φ , φ , ··· , φ } be a set of where || · || denotes the infinity norm over the domain D. 1 2 n ∞,D continuous functions on [a, b] that satisfy Haar’s condition. A P For precision parameters, α and , a polynomial p˜(u, v) polynomial p = i ciφi is the minimax approximate polynomial that approximates comparison operation should satisfy the on D for any given f on D if and only if following error condition: there are n + 1 elements x0 < ··· < xn in D such that −α |p˜(u, v) − comp(u, v)| ≤ 2 r(xi) = −r(xi−1) = ± sup |r(x)|, 1 ≤ i ≤ n (3) x∈D for any u, v ∈ [0, 1] satisfying |u − v| ≥ . (2) for the error function r = f − p restricted on D. comp(u, v) = sgn(u−v)+1 p(x) Since 2 , if a polynomial Remark 1. The condition in (3) is called the equioscillation approximating sgn(x) is (α − 1, )-close, then p˜(u, v) = p(u−v)+1 condition. Let D be [−b, −a] ∪ [a, b]. As r(xi) = ± sup |r(x)| x∈D 2 satisfies the comparison operation error condition in (2). Thus, we find (α − 1, )-close composite polynomials for 0 ≤ i ≤ n, it follows that r(x) should have extreme points at 0 that approximate sgn(x). xi for 0 ≤ i ≤ n. Thus, p (xi) = 0 and xi ∈ (−b, −a) ∪ (a, b), It is known that non-scalar multiplication requires a or xi ∈ {−b, −a, a, b}. considerable running time. In addition, as bootstrapping is Lemma 1 (Generalized de La Vallee Poussin theorem [21]). time-consuming, minimizing the depth consumption for the Let {φ1, φ2, ··· , φn} be a set of continuous functions on [a, b] homomorphic comparison operation is also important, as it that satisfy Haar’s condition. Let D be a closed subset of [a, b], and reduces bootstrapping. Thus, it is necessary to approximate let f(x) be a continuous function on D. Let xi, 0 ≤ i ≤ n be n+ sgn(x) by polynomials that minimize depth consumption as 1 consecutive points in D. Let p(x) be a generalized polynomial well as the number of non-scalar multiplications. such that p − f has alternately positive and negative values at ∗ xi, 0 ≤ i ≤ n. Let p (x) be a minimax approximate polynomial 2.3 Approximation Theory on D for f, and let e(f) be the corresponding approximation error p∗(x) Herein, certain concepts from approximation theory are of . Then, introduced. e(f) ≥ min |p(xi) − f(xi)|. i Definition 3. Let D be a closed subset of [a, b], and let f be Lemma 2 ([22]). If f(x) is an odd function, the minimax a continuous function on D. A polynomial p is said to be the approximate polynomial of degree at most n to f(x) is also an minimax approximate polynomial of degree at most n on D for odd function. f if p minimizes maxD||p(x) − f(x)||∞ among polynomials of degree at most n. 2.4 Algorithms for Minimax Approximation It is known that for any continuous function f on D, The Remez algorithm [23] in Algorithm 1 obtains the the minimax approximate polynomial of degree at most n minimax approximate polynomial for a continuous func- on D is unique [20]. We set f(x) = sgn(x) because we are tion on an interval. First, it initializes reference points 6

{x1, ··· , xn+1}, which will converge to the extreme points Algorithm 2: Improved Multi-Interval Remez algo- of the minimax approximate polynomial. Then, it deter- rithm [21] mines a polynomial p(x) with basis {φ1, ..., φn} that satisfies {φ , ··· , φ } i Input: A basis 1 n , an approximation p(xi) − f(xi) = (−1) E, 1 ≤ i ≤ n + 1 for some E > 0; parameter γ, an input domain c , ··· , c l that is, it determines coefficients 1 n that satisfy D = S [a , b ] ⊂ , and a continuous c φ (x ) + ··· + c φ (x ) − f(x ) = (−1)iE, 1 ≤ i ≤ n + 1 i=1 i i R 1 1 i n n i i . function f on D These equations form a system of linear equations consist- Output: The minimax approximate polynomial p for ing of n + 1 equations and n + 1 unknowns (coefficients f c1, ··· , cn and E). This system is guaranteed not to be 1 Choose x1, ··· , xn+1 ∈ D, where x1 < ··· < xn+1; singular by Haar’s condition, and thus the polynomial p(x) 2 Find the polynomial p(x) in terms of {φ1, ··· , φn} can be uniquely determined. such that p(x ) − f(x ) = (−1)iE, 1 ≤ i ≤ n + 1 for n z , ··· , z p(x) − f(x) i i Then, we find zeros 1 n of , where some E; x < z < x , and we also find n + 1 extreme points i i i+1 3 Collect all the extreme and boundary points of p − f y , ··· , y p(x) − f(x) y ∈ [z , z ] 1 n+1 of such that i i−1 i , where on D such that µ(x)(p(x) − f(x)) ≥ |E| and put z = a z = b 0 and n+1 . If the extreme points satisfy the them in a set B; equioscillation condition in (3), the Remez algorithm out- 4 Find n + 1 extreme points y < y < ··· < y in p(x) 1 2 n+1 puts as the minimax approximate polynomial. Oth- B that satisfy the alternating condition and erwise, it replaces the reference points with these extreme maximum absolute sum condition; points and proceeds with the above steps again. It was 5 max ← max |p(yi) − f(yi)|; proved in [23] that this polynomial p(x) always converges 1≤i≤n+1 6 min ← min |p(yi) − f(yi)|; to the minimax approximate polynomial. 1≤i≤n+1 7 if (max − min)/min <γ then Algorithm 1: Remez algorithm [21] 8 Return p(x); 9 else Input: A basis {φ1, ··· , φn}, a domain [a, b], an approximation parameter γ, and a 10 Replace xi with yi for all i. Go to line 2; continuous function f on [a, b] 11 end Output: The minimax approximate polynomial p for f 1 Choose x1, ··· , xn+1 ∈ [a, b], where to the Remez algorithm; however, unlike in the Remez x1 < ··· < xn+1; algorithm, there may be more than n + 1 extreme points in 2 Find the polynomial p(x) in terms of {φ1, ··· , φn} the improved multi-interval Remez algorithm. Thus, n + 1 i such that p(xi) − f(xi) = (−1) E, 1 ≤ i ≤ n + 1 for extreme points y1, ··· , yn+1 should be chosen among all some E; candidate extreme points. µ(x) is a function that is required 3 Divide the domain [a, b] into n + 1 sections to describe how to choose the n + 1 extreme points, and it is [zi−1, zi], i = 1, ··· , n + 1. defined as z1, ··· , zn are the zeros of p(x) − f(x), where  1,p (x) − f(x) is concave at x on D xi < zi < xi+1, and z0 = a, zn+1 = b;  4 Find the maximum or minimum point of p − f for µ(x) = −1,p (x) − f(x) is convex at x on D  each section when p(xi) − f(xi) has a positive or  0, otherwise. negative value, respectively. These points y1, ··· , yn+1 are called extreme points; Then, the n + 1 extreme points in Algorithm 2 can be 5 max ← max |p(yi) − f(yi)|; selected based on the following three criteria: 1≤i≤n+1 (i) Local extreme value condition. We have min µ(yi)(p(yi) − 6 min ← min |p(yi) − f(yi)|; i 1≤i≤n+1 f(yi)) ≥ E, where E is the value obtained when line 2 7 if ( −  )/ <γ then max min min in Algorithm 2 is performed. 8 Return p(x); (ii) Alternating condition. µ(yi) · µ(yi+1) = −1 for i = 9 else 1, ··· , n. 10 Replace x with y for all i. Go to line 2; n+1 i i P 11 end (iii) Maximum absolute sum condition. |p(yi) − f(yi)| is i=1 maximized for all candidate sets of extreme points Recently, Lee et al. [21] proposed the improved multi- satisfying the local extreme value and the alternating interval Remez algorithm that determines the minimax ap- condition. proximate polynomial on multiple intervals and proved that The improved multi-interval Remez algorithm operates this algorithm can always obtain the minimax approximate with n basis functions {φ1, ··· , φn}. The minimax approx- polynomial for any piecewise continuous function. In this imate polynomial p(x) is represented by the basis func- Pn paper, it is required to determine the minimax approximate tions as p(x) = i=1 ciφi(x), and the improved multi- polynomial for the sign function, and this can be obtained interval Remez algorithm determines the coefficients ci by the improved multi-interval Remez algorithm in Algo- of p(x). Although the simplest basis is a power basis, rithm 2. {1, x, x2, ··· , xn−1}, when the sign function is approxi- The improved multi-interval Remez algorithm is similar mated using this basis, the magnitudes of the coefficients 7

ci are often unstable (i.e., excessively small or large values), TABLE 1 resulting in more numerical errors. Thus, the Chebyshev Required Depth Consumption and the Number of Non-Scalar Multiplications for Evaluating Polynomials of Degree d with Odd-Degree polynomials are used as basis functions in this paper. The Terms Using the Odd Baby-Step Giant-Step Algorithm [24] and the Chebyshev polynomials Ti on [−1, 1] are defined by the Optimal Level Consumption Technique [25] following recursion: polynomial depth consumption multiplications T (x) = 1 degree d dep(d) mult(d) 0 3 2 2 T1(x) = x 5 3 3 T (x) = 2xT (x) − T (x) i ≥ 2. 7 3 5 i i−1 i−2 for 9 4 5 11 4 6 13 4 7 If the sign function should be approximated on a domain 15 4 8 larger than [−1, 1], then scaled Chebyshev polynomials 17 5 8 T˜i(x) = Ti(x/w) should be used for some w > 1 instead 19 5 8 21 5 9 of Ti for all i. 23 5 9 25 5 10 3 APPROXIMATION OF SIGN FUNCTIONBY US- 27 5 10 29 5 11 ING OPTIMAL COMPOSITIONOF MINIMAX APPROX- 31 5 12 IMATE POLYNOMIALS 3.1 New Approximation Method for Sign Function Us- implement the multiplication by coefficients using scalar ing Composition of Minimax Approximate Polynomials multiplication in this paper, and thus, we should consider In [15], sgn(x) was approximated using a composition of both the depth consumption due to scalar multiplications Pn 1 2i polynomials fn on [−1, 1], where fn(x) = i=0 4i i x(1− and that due to non-scalar multiplications. The approxi- 2 i x ) . If n and the number of compositions sn of fn increase, mate polynomials are evaluated with the minimum depth (sn) the composite polynomial fn better approximates sgn(x). consumption using the odd baby-step giant-step algorithm In addition, by defining and using another acceleration [24] and the optimal level consumption technique [25], polynomial gn together with fn for composition, the effi- which consider both the depth consumption due to scalar ciency of the composite polynomial was further improved multiplications and that due to non-scalar multiplications. with a smaller number of required compositions. How- For an odd integer d, the two functions, dep(d) and mult(d), ever, the polynomials fn and gn cause some inefficiency denote the required depth consumption and the number in approximation and thus do not ensure approximation of non-scalar multiplications, respectively, for evaluating a optimality. polynomial of degree d with odd-degree terms using the In this paper, we construct composite polynomials using odd baby-step giant-step algorithm and the optimal level new component polynomials pi, which are different from consumption technique. Table 1 shows the values of dep(d) the polynomials fn or gn used in [15], and the repeated com- and mult(d) for odd degrees d up to 31. position of each pi is not used. As sgn(x) is an odd function, In this paper, the goal is to determine an (α − 1, )-close it is natural to approximate sgn(x) by using a composition polynomial p(x), and we consider a minimax composite of polynomials with odd-degree terms. Let pk ◦pk−1 ◦· · ·◦p1 polynomial p = pk ◦pk−1 · · ·◦p1 where p1([−1, −]∪[, 1]) = be a composition of polynomials with odd-degree terms [−1 − τ1, −1 + τ1] ∪ [1 − τ1, 1 + τ1] and pi([−1 − τi−1, −1 + approximating sgn(x) on [−1, −] ∪ [, 1]. Because of the τi−1]∪[1−τi−1, 1+τi−1]) = [−1−τi, −1+τi]∪[1−τi, 1+τi], symmetry, it suffices to consider only the case of x > 0 to 2 ≤ i ≤ k for some τ1, ··· , τk ∈ (0, ∞). For a concise check that the composite polynomial pk ◦ pk−1 ◦ · · · ◦ p1 is description of Sections 3.2 and 3.3, however, it is also (α − 1, )-close. Let [a0, b0] = [, 1], p1([a0, b0]) = [a1, b1], necessary to consider a minimax composite polynomial p2([a1, b1]) = [a2, b2], ··· ,p k([ak−1, bk−1]) = [ak, bk]. We p = pk◦pk−1 · · ·◦p1 where p1([−1−δ, −1+δ]∪[1−δ, 1+δ]) = note that pk ◦ pk−1 ◦ · · · ◦ p1 is (α − 1, )-close if and only if [−1 − τ1, −1 + τ1] ∪ [1 − τ1, 1 + τ1] and pi([−1 − τi−1, −1 + 1−α 1−α pk ◦ pk−1 ◦ · · · ◦ p1([, 1]) = [ak, bk] ⊆ [1 − 2 , 1 + 2 ]. τi−1]∪[1−τi−1, 1+τi−1]) = [−1−τi, −1+τi]∪[1−τi, 1+τi], As [ak, bk] should be a very small interval, it is desirable 2 ≤ i ≤ k for some δ, τ1, ··· , τk ∈ (0, ∞). The following that each pi on the domain [ai−1, bi−1] reduces the range as definition quantifies how close this composite polynomial is much as possible. The key observation is that if the minimax to sgn(x). approximate polynomials are used in the composition, the Definition 5 ( [15]). For α > 0 and 0 < δ < 1, a polynomial range [a , b ] can be rapidly reduced as i increases. Thus, i i p(x) is said to be (α, δ)-two-sided-close to sgn(x) if it satisfies we use a composition of minimax approximate polynomials the following: that can be obtained by the improved multi-interval Remez −α algorithm. ||p(x) − sgn(x)||∞,[−1−δ,−1+δ]∪[1−δ,1+δ] ≤ 2 , In [15], the coefficients of approximate polynomials are j where || · ||∞,D denotes the infinity norm over the domain D. rounded to 2i for some integers i and j, and the multiplica- tion by coefficients is implemented by adding a ciphertext Later, we will show that determining (α, δ)-two-sided- j times and then removing i least significant bits. Thus, close composite polynomial and determining (α, )-close 1− the depth consumption due to scalar multiplications does composite polynomial are equivalent for δ = 1+ . In fact, not need to be considered in [15]. On the other hand, we δ is just a temporary precision parameter that is introduced 8

1− for a concise description, and when users perform the pro- DepNum({p˜i}1≤i≤k) = n when δ = 1+ . That is, it can be posed algorithms in Algorithms 4, 5, 6, and 7, they should seen that the following two algorithms are equivalent: use  rather than δ. (i) An algorithm that determines the (α − 1, )-close com- We define two functions MultNum and DepNum, which posite polynomial pk ◦ · · · ◦ p1 that minimizes depth output the total number of non-scalar multiplications and consumption and the number of non-scalar multiplica- depth consumption, respectively, required for evaluating a tions. composite polynomial. (ii) An algorithm that determines the (α − 1, δ)-two-sided- close composite polynomial p˜k ◦ p˜k−1 ◦ · · · ◦ p˜1 that Definition 6. Let {pi}1≤i≤k be a set of polynomials satisfy- minimizes depth consumption and the number of non- ing deg(pi) = di, 1 ≤ i ≤ k. MultNum({pi}1≤i≤k) and scalar multiplications, where δ = 1− . DepNum({pi}1≤i≤k) denote the sum of the number of non-scalar 1+ multiplications and the sum of depth consumptions, respectively, Thus, we henceforth focus on the latter. required to evaluate pi for 1 ≤ i ≤ k by using the Paterson– The minimax composite polynomial, which is the core of Stockmeyer algorithm. That is, the proposed homomorphic comparison method, is now defined as follows. The principle of the proposed approxi- k X mation method is to use the minimax composite polynomial MultNum({p } ) = mult(deg(p )) i 1≤i≤k i to approximate the sign function. i=1 k Definition 7. Let {pi}1≤i≤k be a set of polynomials. Let D X [−b, −a] ∪ [a, b] p ◦ p ◦ · · · ◦ p DepNum({pi}1≤i≤k) = dep(deg(pi)). be . k k−1 1 is called a minimax i=1 composite polynomial on D if there exists {di}1≤i≤k that satisfies the following: Our objective is to determine an (α − 1, )-close com- • p is the minimax approximate polynomial of degree at most posite polynomial p ◦ p ◦ · · · ◦ p and minimize 1 k k−1 1 d on D for sgn(x). MultNum({p } ) as well as DepNum({p } ). The 1 i 1≤i≤k i 1≤i≤k • For 2 ≤ i ≤ k, p is the minimax approximate polynomial of following lemma implies that this is equivalent to deter- i degree at most d on p ◦ p ◦ · · · ◦ p (D) for sgn(x). mining an (α − 1, δ)-two-sided-close composite polynomial i i−1 i−2 1 1− p˜k ◦ p˜k−1 ◦ · · · ◦ p˜1 when δ = 1+ . We denote [−1 − τ, −1 + τ] ∪ [1 − τ, 1 + τ] by Rτ for τ > 0. Let τ be the minimax approximation error of the Lemma 3. For a set of polynomials with odd-degree terms i minimax approximate polynomial pi for 1 ≤ i ≤ k. We note {pi}1≤i≤k, let {p˜i}1≤i≤k be a set of polynomials with odd- 1+ that pi ◦ pi−1 ◦ · · · ◦ p1(D) = Rτi for 1 ≤ i ≤ k by Theorem degree terms such that p˜1(x) = p1( 2 x) and p˜i(x) = pi(x), 1. In fact, τi decreases as i increases. It can be seen that if 2 ≤ i ≤ k. Then, pk ◦ pk−1 ◦ · · · ◦ p1 is (α − 1, )-close if and pk◦pk−1◦· · ·◦p1 is a minimax composite polynomial on D = only if p˜k ◦ p˜k−1 ◦ · · · ◦ p˜1 is (α − 1, δ)-two-sided-close when 1− [−b, −a] ∪ [a, b], then {pi}1≤i≤k is a set of polynomials with δ = 1+ . −(α−1) odd-degree terms by Lemma 2. If τk ≤ 2 , then the Proof. Let pk ◦pk−1◦· · ·◦p1 be an (α−1, )-close composition minimax composite polynomial on Rδ becomes (α − 1, δ)- of polynomials with odd-degree terms. As pk ◦ pk−1 ◦ · · · ◦ two-sided-close. The principle is to determine the optimal p1(x) is a polynomial with odd-degree terms, it suffices to set of degrees {di}1≤i≤k such that the minimax composite consider the case of x > 0. Then, pk ◦ pk−1 ◦ · · · ◦ p1(x) ∈ polynomial on Rδ for the set of degrees is optimal with −(α−1) −(α−1) 0 2 [1 − 2 , 1 + 2 ] for  ≤ x ≤ 1. Let x = 1+ x. respect to the depth consumption and the number of non- 0  ≤ x ≤ 1 corresponds to 1−δ ≤ x ≤ 1+δ. Then, p˜k ◦p˜k−1 ◦ scalar multiplications among all (α − 1, δ)-two-sided-close 0 −(α−1) −(α−1) · · ·◦p˜1(x ) = pk◦pk−1◦· · ·◦p1(x) ∈ [1−2 , 1+2 ] such polynomials. 0 for 1 − δ ≤ x ≤ 1 + δ. Thus, p˜k ◦ p˜k−1 ◦ · · · ◦ p˜1 is (α − 1, δ)- two-sided-close. 3.2 Optimality of Approximation of the Sign Function 0 −(α−1) Conversely, let p˜k ◦ p˜k−1 ◦ · · · ◦ p˜1(x ) ∈ [1 − 2 , 1 + by a Minimax Composite Polynomial −(α−1) 0 1+ 0 2 ] for 1 − δ ≤ x ≤ 1 + δ. Let x = 2 x . 1 − δ ≤ sgn(x) 0 In this subsection, we prove that approximating x ≤ 1 + δ corresponds to  ≤ x ≤ 1. Then, pk ◦ pk−1 ◦ · · · ◦ 0 −(α−1) −(α−1) using minimax composite polynomial is the optimal method p1(x) =˜pk ◦ p˜k−1 ◦ · · · ◦ p˜1(x ) ∈ [1 − 2 , 1 + 2 ] with respect to the number of non-scalar multiplications for  ≤ x ≤ 1, which implies that pk ◦ pk−1 ◦ · · · ◦ p1 is and depth consumption. That is, we prove that for any (α − 1, )-close. Thus, the lemma is proved. given (α − 1, δ)-two-sided-close composite polynomial that approximates sgn(x), there is an (α − 1, δ)-two-sided-close We note that as deg(pi) = deg(p˜i), 1 ≤ i ≤ k in Lemma minimax composite polynomial for some degrees {di}1≤i≤k 3, we have that requires a smaller or equal depth consumption and number of non-scalar multiplications than those for the MultNum({pi}1≤i≤k) = MultNum({p˜i}1≤i≤k), given composite polynomial. The following definition and DepNum({pi}1≤i≤k) = DepNum({p˜i}1≤i≤k), lemmas are required to prove the optimality of the proposed minimax composite polynomial method. Thus, for any m, n ∈ N, a composition of poly- nomials with odd-degree terms pk ◦ pk−1 ◦ · · · ◦ p1 is Definition 8. Let {pi}1≤i≤k be a set of polynomials. pk ◦pk−1 ◦ (α − 1, )-close and satisfies MultNum({pi}1≤i≤k) = m and · · · ◦ p1 is called a 1-centered range composite polynomial on Rδ DepNum({pi}1≤i≤k) = n if and only if the corresponding if {pi}1≤i≤k is a set of polynomials with odd-degree terms, and composite polynomial p˜k ◦ p˜k−1 ◦ · · · ◦ p˜1 is (α − 1, δ)- there exists {τi}1≤i≤k such that p1([1−δ, 1+δ]) = [1−τ1, 1+ two-sided-close and satisfies MultNum({p˜i}1≤i≤k) = m and τ1] and pi([1 − τi−1, 1 + τi−1]) = [1 − τi, 1 + τi] for 2 ≤ i ≤ k. 9

Lemma 4. Let p1 be the minimax approximate polynomial of minimax composite polynomial for the degrees is (α − 1, δ)- degree at most d on [−b1, −a1] ∪ [a1, b1] for sgn(x). Let p2 two-sided-close, and requires depth consumption D and a be the minimax approximate polynomial of degree at most d on smaller or equal number of non-scalar multiplications than [−b2, −a2] ∪ [a2, b2] for sgn(x). If [a2, b2] ⊆ [a1, b1], then the any (α − 1, δ)-two-sided-close composite polynomial that minimax approximation error e2 of p2 is less than or equal to the consumes depth D. In this section, we propose a method minimax approximation error e1 of p1. to find the optimal set of degrees for minimax composite polynomial. Proof. The maximum approximation error e1 when p1 ap- First, we can think of a naive method to determine the proximates sgn(x) on [−b1, −a1] ∪ [a1, b1] is larger than 0 optimal degrees among all sets of degrees by brute-force or equal to the maximum approximation error e when p1 1 searching for all composition numbers k and the degrees approximates sgn(x) on [−b2, −a2] ∪ [a2, b2]. By the defi- of the component polynomials d , ··· , d . However, for the nition of the minimax approximate polynomial, among all 1 k upper bound of the composition number k¯ and that of polynomials that approximate sgn(x) on [−b2, −a2]∪[a2, b2] ¯ degrees of component polynomials d¯, this requires O(d¯k) and have degree less than or equal to d, p2 has the small- times, which is too much for large α. Thus, dynamic pro- est maximum approximation error. As the degree of p1 is smaller than or equal to d, we have that e ≤ e0 ≤ e , and gramming is used to determine the minimax composite 2 1 1 R the lemma is proved. polynomial on δ in polynomial time and we propose a related algorithm. Before describing the proposed algorithm that Lemma 5. Let p˜k ◦ p˜k−1 ◦ · · · ◦ p˜1 be any (α − 1, δ)-two-sided- uses dynamic programming, we define MinErr(d, τ), close 1-centered-range composite polynomial on Rδ. Then, there InvMinErr(d, τ), h(m, n, τ), and G(m, n, τ) as follows: is an (α − 1, δ)-two-sided-close minimax composite polynomial Definition 9. For d ∈ and τ ∈ (0, 1), MinErr(d,τ ) is pˆk ◦ pˆk−1 ◦ · · · ◦ pˆ1 on Rδ such that deg(pˆi) ≤ deg(p˜i) for i, N 1 ≤ i ≤ k. the minimax approximation error of the minimax approximate polynomial of degree at most d on Rτ for sgn(x). Lemma 6. Let pk ◦ pk−1 ◦ · · · ◦ p1 be any (α − 1, δ)-two-sided- close composition of polynomials with odd-degree terms. Then, Lemma 7. For a fixed odd d ∈ N, MinErr(d,τ ) is a strictly there is an (α − 1, δ)-two-sided-close 1-centered-range composite increasing continuous function of τ on (0, 1). polynomial p˜k◦p˜k−1◦· · ·◦p˜1 on Rδ such that deg(p˜i) = deg(pi) The proof of Lemma 7 can be found in Appendix C. If for all i, 1 ≤ i ≤ k. the minimax approximate polynomial of degree at most d 0 The proofs of Lemmas 5 and 6 can be found in Appen- on Rτ narrows the domain Rτ to a range Rτ , MinErr(d,τ ) 0 dices A and B, respectively. The optimality of the proposed outputs τ . As MinErr(d,τ ) is a strictly increasing function method of using a minimax composite polynomial is now of τ on (0, 1), its inverse function exists and is defined as proved in Theorem 2. follows:

Theorem 2. Let pk ◦ pk−1 ◦ · · · ◦ p1 be any (α − 1, δ)-two- Definition 10. For d ∈ N and τ ∈ (0, 1), InvMinErr(d,τ ) is 0 0 sided-close composition of polynomials with odd-degree terms. equal to a value τ ∈ (0, 1) such that MinErr(d,τ ) = τ. (α − 1, δ) Then, there is an -two-sided-close minimax composite The approximate value of InvMinErr(d,τ ) can be ob- pˆ ◦pˆ ◦· · ·◦pˆ R deg(pˆ ) ≤ deg(p ) polynomial k k−1 1 on δ such that i i tained by a binary search using the improved multi-interval i 1 ≤ i ≤ k for all , . Remez algorithm. Proof. p ◦p ◦· · ·◦p (α−1, δ) Let k k−1 1 be any -two-sided-close Definition 11. h(m, n, τ) is the maximum value of τ 0 ∈ composition of polynomials with odd-degree terms. By (0, 1) such that there exists a minimax composite polynomial Lemma 6, there is an (α − 1, δ)-two-sided-close 1-centered- pk ◦ pk−1 ◦ · · · ◦ p1 on Rτ 0 satisfying pk ◦ pk−1 ◦ · · · ◦ p1([1 − range composite polynomial p˜k ◦ p˜k−1 ◦ · · · ◦ p˜1 on Rδ 0 0 τ , 1 + τ ]) ⊆ [1 − τ, 1 + τ], MultNum({pi}1≤i≤k) ≤ m, and such that deg(p˜i) = deg(pi), 1 ≤ i ≤ k. In addition, by DepNum({pi}1≤i≤k) ≤ n. Lemma 5, there is an (α − 1, δ)-two-sided-close minimax composite polynomial pˆk ◦ pˆk−1 ◦ · · · ◦ pˆ1 on Rδ such Definition 11 implies that h(m, n, τ) outputs the max- 0 that deg(pˆi) ≤ deg(p˜i), 1 ≤ i ≤ k. Thus, there is an imum τ > 0 when the range of a minimax composite 0 (α − 1, δ)-two-sided-close minimax composite polynomial polynomial on Rτ becomes smaller than Rτ , with m or pˆk ◦ pˆk−1 ◦ · · · ◦ pˆ1 on Rδ such that deg(pˆi) ≤ deg(pi) for all less non-scalar multiplications and a depth consumption i, 1 ≤ i ≤ k. of n or less. The degrees of the k component polynomi- als for the corresponding minimax composite polynomial Remark 2. In Theorem 2, as deg(pˆ ) ≤ deg(p ) for 1 ≤ i ≤ i i p ◦ p ◦ · · · ◦ p on R0 in Definition 11 are stored in k, we have MultNum({pˆ } ) ≤ MultNum({p } ) and k k−1 1 τ i 1≤i≤k i 1≤i≤k G(m, n, τ) as an ordered set. DepNum({pˆ } ) ≤ DepNum({p } ). i 1≤i≤k i 1≤i≤k If the value of h(m, n, τ) can be computed for any τ ∈ (0, 1) and m, n ∈ N, it is easy to obtain the minimum 3.3 Achieving Polynomial-Time Algorithm for New Ap- number of non-scalar multiplications and depth consump- proximation Method by Using Dynamic Programming tion. For example, for sufficiently large nmax, the smallest i 1−α From Section 3.2, it can be seen that the method of ap- satisfying h(i, nmax, 2 ) ≥ δ is the minimum number of proximating sgn(x) using minimax composite polynomial non-scalar multiplications. It is trivial that if 0 ≤ m ≤ 1 or is the optimal approximation method. For a given depth 0 ≤ n ≤ 1, then h(m, n, τ) = τ. However, it is not easy to consumption D, degrees d1, ··· , dk are called optimal if the obtain the value of h(m, n, τ) for m ≥ 2 and n ≥ 2 using 10

the definition directly. Thus, we introduce a useful recursion the minimum required depth consumption Mdep. The user for h(m, n, τ), which is the core of dynamic programming, then chooses the depth consumption D(≥ Mdep) to use and and the following theorem shows that the recursion holds. obtain the minimum number of non-scalar multiplications M and corresponding optimal set of degrees M from Theorem 3. For m ≥ 2 and n ≥ 2, the following recursion for mult degs ComputeMinMultDegs. Here, m and n should be h(m, n, τ) holds: max max set large enough for these two algorithms to obtain the h(m, n, τ)= accurate outputs, and we experimentally confirm that these m = 70 max InvMinErr(2j + 1, h(m − mult(2j + 1), algorithms obtain accurate outputs when max and 1≤j nmax = 40 for α ≤ 20. If the algorithms fail to obtain Mdep mult(2j+1)≤m dep(2j+1)≤n or Mmult because mmax and nmax are not large enough, they return an error symbol ⊥. The procedure to obtain n − dep(2j + 1),τ )). the optimal minimax composite polynomial using dynamic The proof of Theorem 3 can be found in Appendix D. programming is summarized as follows: Then, h(m, n, τ) and G(m, n, τ) are recursively computed (i) h(m, n, τ) and G(m, n, τ) are computed recursively by Algorithm 3 using the recursion equation in Theorem 3. using dynamic programming in Algorithm 3. In the 9th line of Algorithm 3, {2j + 1} ∪ G(m − mult(2j + (ii) From the values of h(m, n, τ) and G(m, n, τ), and 1), n − dep(2j + 1),τ ) is the insertion of 2j + 1 into the depth consumption D that user chooses, Mdegs is de- ordered set G(m − mult(2j + 1), n − dep(2j + 1),τ ) as the termined in Algorithms 5. first component. Here, mmax and nmax are set large enough. (iii) The component minimax approximate polynomials pi In this paper, only minimax approximate polynomials of for 1 ≤ i ≤ k are determined using the improved multi- degree at most 31 are used because using polynomials of interval Remez algorithm with Mdegs. higher degree may cause more numerical errors. As the analysis in Section 3.4 demonstrates that only minimax Algorithm 4: ComputeMinDep approximate polynomials of degree at most9 are used Input: Precision parameters α and  to minimize the number of non-scalar multiplications, it Output: Minimum depth consumption Mdep appears that degrees of at most 31 are sufficient. However, 1−α 1 Obtain 2-dimensional tables h(m, n, 2 ) and when we minimize the depth consumption, the required 1−α G(m, n, 2 ) for 0 ≤ m ≤ mmax and depth consumption may be further reduced if minimax 0 ≤ n ≤ nmax using Algorithm 3 approximate polynomials of degrees larger than 31 are also 2 for i ← 0 to nmax do used. 1−α 1− 3 if h(mmax, i, 2 ) ≥ δ = 1+ then 4 Mdep ← i Algorithm 3: Computation of h(m, n, τ) and 5 return Mdep G(m, n, τ) using dynamic programming 6 end Input: τ 7 if i = nmax then Output: 2-dimensional tables h(m, n, τ),G(m, n, τ) 8 return ⊥ for 0 ≤ m ≤ mmax and 0 ≤ n ≤ nmax 9 end 1 Generate a 2-dimensional table G(m, n,τ ) for 10 end 0 ≤ m ≤ mmax and 0 ≤ n ≤ nmax, where the components are all empty sets. 2 for m ← 0 to mmax do Algorithm 5: ComputeMinMultDegs 3 for n ← 0 to nmax do 4 if m ≤ 1 or n ≤ 1 then Input: Precision parameters α and , and depth 5 h(m, n, τ) ← τ consumption D 6 else Output: Minimum number of multiplications Mmult 7 j ← argmax InvMinErr(2k + 1, h(m − and optimal set of degrees Mdegs 1−α 1≤k 1 Obtain 2-dimensional tables h(m, n, 2 ) and mult(2k+1)≤m 1−α dep(2k+1)≤n G(m, n, 2 ) for 0 ≤ m ≤ mmax and mult(2k + 1), n − dep(2k + 1),τ )) 0 ≤ n ≤ nmax using Algorithm 3 8 h(m, n, τ) ← InvMinErr(2j + 1, h(m − 2 for j ← 0 to mmax do 1−α 1− mult(2j + 1), n − dep(2j + 1),τ )) 3 if h(j, D, 2 ) ≥ δ = 1+ then 9 G(m, n,τ ) ← {2j + 1} ∪ G(m − mult(2j + 4 Mmult ← j 1), n − dep(2j + 1),τ ) 5 Go to line 11 10 end 6 end 11 end 7 if j = mmax then 12 end 8 return ⊥ 9 end 10 end The algorithms ComputeMinDep in Algorithm 4 and 1−α 11 Mdegs ← G(Mmult,D, 2 ) // Mdegs: ordered set ComputeMinMultDegs in Algorithm 5 are now intro- 12 return Mmult and Mdegs duced. They use the values of h(m, n, τ) and G(m, n, τ) ob- tained by Algorithm 3. First, ComputeMinDep determines 11

Theorem 4 shows that Mdep obtained from Compute- sided-close minimax composite polynomial on Rδ such that MinDep is the minimum depth consumption. For Mmult MultNum({pˆi}1≤i≤k) ≤ m and DepNum({pˆi}1≤i≤k) ≤ D. and Mdegs obtained from ComputeMinMultDegs, Theo- rem 5 shows that Mmult is the minimum number of non- scalar multiplications when the depth consumption is D, The MinimaxComp algorithm, which outputs an ap- comp(u, v) and Mdegs is the corresponding optimal set of degrees. proximate value of , is now proposed in Algo- rithm 6. It uses the optimal set of degrees Mdegs obtained Theorem 4. Let Mdep be the output value of Algorithm 4 for from ComputeMinMultDegs for α, , and D. Then, the inputs α and . Then, Mdep ≤ DepNum({pi}1≤i≤k) for any error between the output of the proposed algorithm Min- (α−1, )-close composition of polynomials with odd-degree terms imaxComp and comp(u, v) is bounded by 2−α for any pk ◦ pk−1 ◦ · · · ◦ p1. u, v ∈ [0, 1] satisfying |u − v| ≥ . Here,  and α are pre- cision parameters that users of homomorphic comparison Proof. p ◦ p ◦ · · · ◦ p (α − 1, ) Let k k−1 1 be any -close operation can choose, and they determine input precision composition of polynomials with odd-degree terms. Let and output precision, respectively. MinimaxErr(a, b; d) and MultNum({p } ) = m DepNum({p } ) = n i 1≤i≤k and i 1≤i≤k . MinimaxFunc(a, b; d) are defined for the description of (α − 1, δ) From Lemma 3, there exists an -two-sided-close MinimaxComp as follows. composition of polynomials with odd-degree terms p˜k ◦· · ·◦ p˜1 such that MultNum({p˜i}1≤i≤k) = MultNum({pi}1≤i≤k) Definition 12. For a, b ∈ R and d ∈ N, let and DepNum({p˜i}1≤i≤k) = DepNum({pi}1≤i≤k). In addi- MinimaxFunc(a, b; d) be the minimax approximate polynomial tion, from Theorem 2, there exists an (α − 1, δ)-two-sided- of degree at most d on [−b, −a] ∪ [a, b] for sgn(x), and close minimax composite polynomial pˆk ◦ pˆk−1 ◦ · · · ◦ pˆ1 on MinimaxErr(a, b; d) be the minimax approximation error of Rδ such that MultNum({pˆi}1≤i≤k) ≤ MultNum({p˜i}1≤i≤k) MinimaxFunc(a, b; d). and DepNum({pˆi}1≤i≤k) ≤ DepNum({p˜i}1≤i≤k). We as- sume that n < Mdep. Then, DepNum({pˆi}1≤i≤k) ≤ Algorithm 6: MinimaxComp DepNum({p˜i}1≤i≤k) ≤ DepNum({pi}1≤i≤k) = n < Mdep. Input: Inputs u, v ∈ (0, 1), precision parameters α As n < Mdep, and Mdep is the smallest i that satisfies and , and depth consumption D 1−α 1−α h(mmax, i, 2 ) ≥ δ, we have h(mmax, n, 2 ) < δ. Thus, Output: Approximate value of comp(u, v) there is no minimax composite polynomial p¯k ◦p¯k−1 ◦· · ·◦p¯1 1 Obtain Mdegs = {d1, d2, ··· , dk} from on Rδ such that p¯k ◦ p¯k−1 ◦ · · · ◦ p¯1([1 − δ, 1 + δ]) ⊆ ComputeMinMultDegs for α, , and D 1−α 1−α [1 − 2 , 1 + 2 ], MultNum({p¯i}1≤i≤k) ≤ mmax, and 2 p1 ← MinimaxFunc(1 − , 1; d1) DepNum({p¯i}1≤i≤k) ≤ n. This leads to a contradiction 3 τ1 ← MinimaxErr(1 − , 1; d1) because pˆk ◦ pˆk−1 ◦ · · · ◦ pˆ1 is an (α − 1, δ)-two-sided- 4 for i ← 2 to k do close minimax composite polynomial on Rδ such that 5 pi ← MinimaxFunc(1 − τi−1, 1 + τi−1; di) MultNum({pˆi}1≤i≤k) ≤ mmax and DepNum({pˆi}1≤i≤k) ≤ 6 τi ← MinimaxErr(1 − τi−1, 1 + τi−1; di) n. 7 end pk◦pk−1◦···◦p1(u−v)+1 8 return 2 Theorem 5. Let Mmult and Mdegs be the output values of Algorithm 5 for depth consumption D, and precision parame- α  M ≤ MultNum({p } ) ters and . Then, mult i 1≤i≤k for any 3.4 Optimal Composite Polynomials for Homomorphic (α−1, ) -close composition of polynomials with odd-degree terms Comparison and Their Performance pk ◦ pk−1 ◦ · · · ◦ p1 satisfying DepNum({pi}1≤i≤k) = D. Herein, using dynamic programming and the algorithm Proof. Let pk ◦ pk−1 ◦ · · · ◦ p1 be any (α − 1, )-close ComputeMinMultDegs, the optimal set of degrees and composition of polynomials with odd-degree terms. Let the corresponding number of non-scalar multiplications are MultNum({pi}1≤i≤k) = m and DepNum({pi}1≤i≤k) = D. obtained for a given depth consumption D. Fig. 3 shows the From Lemma 3, there exists an (α − 1, δ)-two-sided-close minimum number of non-scalar multiplications of minimax composition of polynomials with odd-degree terms p˜k ◦· · ·◦ composite polynomials for homomorphic comparison oper- p˜1 such that MultNum({p˜i}1≤i≤k) = MultNum({pi}1≤i≤k) ation according to depth consumption D. It can be seen that and DepNum({p˜i}1≤i≤k) = DepNum({pi}1≤i≤k). In addi- there is a tradeoff between the required depth consumption tion, from Theorem 2, there exists an (α − 1, δ)-two-sided- and the number of non-scalar multiplications. The empty close minimax composite polynomial pˆk ◦ pˆk−1 ◦ · · · ◦ pˆ1 on mark implies that the option does not need to be used by Rδ such that MultNum({pˆi}1≤i≤k) ≤ MultNum({p˜i}1≤i≤k) users because another option uses the same number of non- and DepNum({pˆi}1≤i≤k) ≤ DepNum({p˜i}1≤i≤k). We as- scalar multiplications as the option and uses smaller depth sume that m < Mmult. Then, MultNum({pˆi}1≤i≤k) ≤ consumption than the option. MultNum({p˜i}1≤i≤k) ≤ MultNum({pi}1≤i≤k) = m < We analyze the performance for the option of mini- Mmult. As m < Mmult, and Mmult is the smallest j that mizing the number of non-scalar multiplications and that satisfies h(j, D, 21−α) ≥ δ, we have h(m, D, 21−α) < δ. of minimizing depth consumption among several opti- Thus, there is no minimax composite polynomial p¯k ◦ p¯k−1 ◦ mal options in Fig. 3. Table 2 shows the ordered sets · · · ◦ p¯1 on Rδ such that p¯k ◦ p¯k−1 ◦ · · · ◦ p¯1([1 − δ, 1 + Mdegs that store the degrees of the optimal component 1−α 1−α δ]) ⊆ [1 − 2 , 1 + 2 ], MultNum({p¯i}1≤i≤k) ≤ m, minimax approximate polynomials when depth consump- and DepNum({p¯i}1≤i≤k) ≤ D. This leads to a contra- tion and the number of non-scalar multiplications, respec- diction because pˆk ◦ pˆk−1 ◦ · · · ◦ pˆ1 is an (α − 1, δ)-two- tively, are minimized. The minimum depth consumption 12

Fig. 4 (a) shows a comparison of the number of required non-scalar multiplications and depth consumption between the previous homomorphic comparison operation algorithm α = 8 60 and the proposed algorithm MinimaxComp when the num- α = 12 ber of non-scalar multiplications is minimized. We note that 50 α = 16 the previous algorithm has the same number of non-scalar α = 20 multiplications and depth consumption. This is because the 40 degrees of all of the component polynomials used in [15] are nine, and the required number of non-scalar multiplications 30 and depth consumption of polynomials of degree nine using the polynomial evaluation method in [15] are both four. 20 It can be seen that the minimum number of the required non-scalar multiplications and the corresponding depth 10 consumption for the proposed algorithm are reduced by approximately 30% and 31% on average, respectively, com- pared with the corresponding values for the previous algo- 0 0 10 20 30 40 rithm. In this case, the proposed algorithm MinimaxComp,

minimum number of non-scalar multiplications which uses the output M of ComputeMinMultDegs, depth consumption D degs is aimed at minimizing the number of non-scalar multipli- cations; however, the corresponding depth consumption is Fig. 3. The minimum number of non-scalar multiplications of minimax also reduced. composite polynomials for homomorphic comparison operation accord- ing to depth consumption D. Fig. 4 (b) shows a comparison of the number of required non-scalar multiplications and depth consumption between TABLE 2 the previous algorithm and the proposed homomorphic Optimal Set of Degrees Mdegs Obtained from ComputeMinMultDegs comparison algorithm MinimaxComp when depth con- Algorithm That Stores the Degrees of the Component Minimax sumption is minimized. It can be seen that the minimum Approximate Polynomials When Depth Consumption and the Number depth consumption for the proposed algorithms is reduced of Non-Scalar Multiplications, Respectively, are Minimized by approximately 48% on average, and the corresponding ordered set of degrees of the optimal component number of non-scalar multiplications is increased by ap- α minimax approximate polynomials Mdegs proximately 4% on average. Although the number of non- minimize multiplications minimize depth scalar multiplications for the proposed algorithm is slightly 4 {3, 3, 5} {27} 5 {5, 5, 5} {7, 13} larger than that for the previous algorithm, when bootstrap- 6 {3, 5, 5, 5} {15, 15} ping is required, the proposed algorithm would require less 7 {3, 3, 5, 5, 5} {7, 7, 13} running time than the previous algorithm because boot- 8 {3, 3, 5, 5, 9} {7, 15, 15} strapping, owing to the large depth consumption, requires 9 {5, 5, 5, 5, 9} {7, 7, 7, 13} 10 {5, 5, 5, 5, 5, 5} {7, 7, 13, 15} much more running time than non-scalar multiplication 11 {3, 5, 5, 5, 5, 5, 5} {7, 15, 15, 15} operations. 12 {3, 5, 5, 5, 5, 5, 9} {15, 15, 15, 15} 13 {3, 5, 5, 5, 5, 5, 5, 5} {15, 15, 15, 31} 14 {3, 3, 5, 5, 5, 5, 5, 5, 5} {7, 7, 15, 15, 27} 3.5 Proposed Homomorphic Comparison Operation 15 {3, 3, 5, 5, 5, 5, 5, 5, 9} {7, 15, 15, 15, 27} 16 {3, 3, 5, 5, 5, 5, 5, 5, 5, 5} {15, 15, 15, 15, 27} Using Margin 17 {5, 5, 5, 5, 5, 5, 5, 5, 5, 5} {15, 15, 15, 29, 29} The proposed homomorphic comparison operation in Algo- 18 {3, 3, 5, 5, 5, 5, 5, 5, 5, 5, 5} {15, 15, 29, 29, 31} 19 {5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5} {15, 29, 31, 31, 31} rithm 6 satisfies the comparison operation error condition 20 {5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 9} {29, 31, 31, 31, 31} well if there are no other errors than those caused by polynomial approximation. However, if Algorithm 6 is used in approximate homomorphic encryption CKKS scheme, the difference between pi ◦ · · · ◦ p1(x) and sgn(x) for some and the minimum number of required non-scalar mul- i(≤ k) and x ∈ [, 1] can be greater than the minimax tiplications are computed using ComputeMinDep and approximation error τi of pi. It means that pi◦· · ·◦p1(x) does ComputeMinMultDegs. Other options that are not shown not fall into [1 − τi, 1 + τi], which is the domain of the next in Table 2 can be seen in Appendix E. minimax approximate polynomial pi+1, and this can lead to For an integer n = 4 and precision parameters α a failure of homomorphic comparison operation. Thus, we and , the lower bounds of depth consumption and the propose an improved algorithm, which is Algorithm 7, and number of non-scalar multiplications for the previous al- the algorithm adds the use of margins to Algorithm 6 to gorithm CompG (referred to as NewCompG in [15]) are make it resistant to errors caused by the CKKS scheme. This 1 1 both 4(d 2 · log(2/)e + d · log(α − 2)e), where log 0.98cn log(n+1) Algorithm 7 is used instead of Algorithm 6 in the numerical 2n+1 2n cn = 4n n , which can be obtained from Lemmas 1 and analysis in Section 5. 3 in [15]. This lower bound is very close to that obtained If the value of margin η becomes larger, the homo- experimentally. In this paper, the performances of the pre- morphic comparison operation will become more resistant vious and proposed homomorphic comparison algorithms to errors caused by CKKS scheme. Thus, the margin η are both analyzed when  = 2−α. is set as large as possible among valid values of margin 13

80 80 no. of mults/depth of the previous algorithm no. of mults/depth of the previous algorithm 70 no. of mults of the proposed algorithm 70 no. of mults of the proposed algorithm depth of the proposed algorithm depth of the proposed algorithm 60 60

50 50

40 40

30 30

20 20

10 10

0 number of multiplications and depth 0

5 10 15 20 when the depth consumption is minimized 5 10 15 20 number of multiplications and depth when the

number of non-scalar multiplications is minimized precision parameter α precision parameter α (a) When the number of non-scalar multiplications is minimized (b) When the depth consumption is minimized

Fig. 4. Comparison of the minimum number of non-scalar multiplications and depth consumption between the previous and the proposed algorithms when the number of non-scalar multiplications or the depth consumption is minimized.

Algorithm 7: MinimaxComp (using margin) should be noted that the inputs of ComputeMinMultDegs Input: Inputs u, v ∈ (0, 1), precision parameters α to perform the proposed homomorphic max function are and , depth consumption D, and margin η different from those to perform the proposed homomor- ζ · 2−α Output: Approximate value of comp(u, v) phic comparison operation. First, for some max function factor ζ > 1 is used instead of . This is be- 1 Obtain M = {d , d , ··· , d } from degs 1 2 k p(x) sgn(x) ComputeMinMultDegs for D, α, and  cause, for a polynomial that approximates , the error condition of p(x) is different when used to im- 2 p1 ← MinimaxFunc(1 − , 1; d1) plement the homomorphic max function and when used 3 τ1 ← MinimaxErr(1 − , 1; d1) + η to implement the homomorphic comparison operation. In 4 for i ← 2 to k do addition, depth consumption D − 1 is used instead of 5 pi ← MinimaxFunc(1 − τi−1, 1 + τi−1; di) depth consumption D because another depth is required 6 τi ← MinimaxErr(1 − τi−1, 1 + τi−1; di)+ η (u−v)pk◦pk−1◦···◦p1(u−v)+(u+v) 7 end to compute 2 in homomorphic pk◦pk−1◦···◦p1(u−v)+1 max function. The larger the value of ζ, the smaller the 8 return 2 depth consumption and the number of non-scalar multipli- cations that this homomorphic max function requires. We experimentally set the value of ζ as large as possible among 1−α such that τk ≤ 2 , which implies that the homomorphic valid values of ζ such that the homomorphic max function comparison operation using margin satisfies the comparison satisfies the error condition. operation error condition. Algorithm 8: MinimaxMax (using margin) Input: Inputs u, v ∈ (0, 1), precision parameter α, 4 APPLICATION TO MIN/MAXAND SORTING max function factor ζ, depth consumption D, The max function can be used in several applications, in- and margin η cluding the max pooling operation in deep learning and Output: Approximate value of max(u, v) sorting algorithms. It can be easily implemented using the 1 Obtain Mdegs = {d1, d2, ··· , dk} from sign function, that is, ComputeMinMultDegs for D − 1, α, and ζ · 2−α (u + v) + (u − v) sgn(u − v) 2 p1 ← MinimaxFunc(1 − , 1; d1) max(u, v) = . 3 τ ← MinimaxErr(1 − , 1; d ) + η 2 1 1 4 for i ← 2 to k do Thus, the proposed method of determining polynomials 5 pi ← MinimaxFunc(1 − τi−1, 1 + τi−1; di) that approximate sgn(x) can also be used to determine 6 τi ← MinimaxErr(1 − τi−1, 1 + τi−1; di)+ η polynomials that approximate the max function. The min 7 end function can be easily implemented by using the max func- (u−v)pk◦pk−1◦···◦p1(u−v)+(u+v) 8 return tion, that is, min(u, v) = u + v − max(u, v). 2 For some polynomial p(x) that approximates sgn(x), the The proposed homomorphic max function algorithm can (u+v)+(u−v)p(u−v) error between 2 and max(u, v) should be be used for sorting. In this paper, we analyze the perfor- bounded by 2−α for any u, v ∈ [0, 1]. Algorithm 8 is the pro- mance of sorting algorithms for three numbers and four posed homomorphic max function that uses the optimal set numbers using the proposed homomorphic max function. of degrees Mdegs obtained from ComputeMinMultDegs. It Fig. 5 shows the diagrams representing these sorting algo- 14

0 0 rithms. Each arrow, the start and end points of which are ui • Add(ct, ct ): Output ctadd ← [ct + ct ]q` . 0 0 0 0 and uj, respectively, indicates the sorting of the data ui and • Multevk(ct, ct ): For ct = (¯c0, c¯1) and ct = (¯c0, c¯1), ¯ ¯ ¯ 0 0 0 0 uj. After the sorting, max(ui, uj) and min(ui, uj) are at the compute (d0, d1, d2) = (¯c0c¯0, c¯0c¯1 +¯c0c¯1, c¯1c¯1) and 0 ¯ ¯ −1 ¯ end and start point of the arrow, respectively. ctmult ← [(d0, d1) + bqL · d2 · evke]q` . Then, output −1 0 ctmult ← [b∆ · ctmulte]q`/∆. u1 It should be noted that an attack method on the CKKS scheme has recently been proposed in [26]. The CKKS u 1 scheme satisfies indistinguishability under chosen-plaintext u2 attack (IND-CPA) security, and it is sufficient if the secret u2 key owner does not share decryption results with people u3 who do not own the secret key. However, in [26], it was

u3 shown that the CKKS scheme could be broken if the secret u4 key owner shares the decryption results with others, and a (a) Three numbers (b) Four numbers stronger notion of security called IND-CPA+ was proposed. Then, an attack prevention method was proposed in [27]. Fig. 5. Diagrams representing sorting algorithms for three and four This method adds a Gaussian error at the end of the decryp- numbers. tion procedure, which allows the CKKS scheme to satisfy IND-CPA+. Procedures other than the decryption procedure do not need to be modified, and thus, the proposed homo- 5 NUMERICAL RESULTS morphic comparison operation can be used without change, In this section, numerical results regarding the algorithms even with this attack prevention method. for the proposed homomorphic comparison operation, ho- momorphic max function, and homomorphic sorting using 5.2 Parameter Setting the proposed homomorphic max function are presented. The performances of the proposed algorithms are evaluated The precision parameters,  and α, determine the input and and compared with those of the previous algorithms in output precisions of the homomorphic comparison opera- −α [15]. The numerical analysis is conducted using the CKKS tion, respectively. We set  = 2 , implying that the input scheme library HEAAN [3] on a Linux PC with an Intel and output of the homomorphic comparison operation have Core i7-10700 CPU at 2.90GHz with 8 threads. the same precision bits. Unlike the homomorphic compari- son operation, there is only one precision parameter α which determines the output precision in the homomorphic max 5.1 CKKS Scheme function. We set N = 217, and the initial ciphertext modulus 1700 The CKKS scheme is a representative word-wise FHE qL can be set up to 2 to achieve 128-bit security which is scheme that supports the evaluation of encrypted real or estimated by Albrecht’s LWE estimator [28]. The script for complex data with approximation. The ciphertext modulus security estimation can be found in [15]. We simultaneously ` of bit-length ` is denoted by q` = 2 for 1 ≤ ` ≤ L, perform homomorphic comparison operation or homomor- where L is the bit length of the initial ciphertext modulus. phic max function for N/2 tuples of real numbers. Thus, N Let R = Z[X]/(X + 1) and Rq = R/qR, where N the amortized running time is computed by dividing the is a power-of-two integer. Let χkey, χerr, and χenc be the running time by N/2. key distribution, the error distribution, and the distribution for encryption over R, respectively, and these distributions 5.2.1 Scaled Values and Margins can be determined by the security analysis. We set the If the power basis is used, the coefficients often become χ = HWT (256) key distribution key N , which samples an excessively small or large, and thus, the accuracy of the R element in with ternary coefficients that have 256 nonzero homomorphic comparison operation is sometimes poor. U(R ) values uniformly at random. Let q denote the uni- Accordingly, we use the scaled Chebyshev polynomials form distribution over R . There is a field isomorphism q T˜ (t) = T (t/w) for some scaled value w ≥ 1 as basis τ¯ : [X]/(XN + 1) → N/2 i i R C . This isomorphism is used polynomials. The scaled Chebyshev polynomials can be τ¯−1 for decoding, and its inverse isomorphism is used for evaluated using the following recursion: encoding. The CKKS scheme is described as follows: 0 T˜ (x) = 1 • KeyGen: Sample s¯ ← χkey, e¯, e¯ ← χerr, a¯ ← U(RqL ), 0 0 and a¯ ← U(R 2 ), and set sk ← (1, s¯), pk ← (−a¯ · s¯ + ˜ qL T1(x) = x/w e¯, a¯) ∈R 2 evk ← (−a¯0 · s¯+¯e0 + q · s¯2, a¯0) ∈R 2 qL , and L q2 . ˜ ˜ ˜ ˜ L Ti+j(x) = 2Ti(x)Tj(x) − Ti−j(x) for i ≥ j ≥ 1. • Encpk(z; ∆): For a plaintext vector z = N/2 (z0, ··· , zN/2−1) ∈ C and a scaling factor ∆, We experimentally obtain the scaled values w and mar- encode z by m¯ ← b∆ · τ¯−1(z)e ∈R , where b·e gins η, which is used in Algorithms 7 and 8, and the denotes the rounding operation. Then, sample obtained values that we use are shown in Table 3. v¯ ← χenc and e¯0, e¯1 ← χerr, and output the ciphertext 5.2.2 Initial Modulus ct = dv¯ · pk + (¯m +¯e0, e¯1)eqL . 2 • Dec (ct; ∆) ct = (¯c , c¯ ) ∈R sk : For a ciphertext 0 1 q` , com- When we execute the homomorphic comparison operation, 0 pute the plaintext polynomial m¯ ← [c¯0 +¯c1 · s¯]q` , and we set the initial modulus qL so that the final modulus 0 −1 0 N/2 output the plaintext vector z ← ∆ · τ¯(m¯ ) ∈ C . bit is log ∆ + 10 after the execution of each homomorphic 15

TABLE 3 Set of Scaled Values and Margins for the Proposed Homomorphic Comparison Operation and Homomorphic Max Function

proposed homomorphic comparison operation MinimaxComp proposed homomorphic max function MinimaxMax α minimize running time minimize depth minimize running time minimize depth scaled values w margin η scaled values w margin η scaled values w margin η scaled values w margin η 8 {1,2,1.6} 2−12 {1,2,1.6} 2−12 {1,2,1.6} 2−9 {1,1.6} 2−10.5 12 {1,2,2,2,1.6} 2−15 {1,2,2,1.6} 2−14 {1,2,2,1.6} 2−13 {1,2,1.7} 2−13 16 {1,2,2,2,2,1.7} 2−18 {1,2,2,2,1.6} 2−17 {1,2,2,2,2,1.6} 2−16 {1,2,2,2,1.6} 2−17 20 {1,2,2,2,2,2,2,2,1.6} 2−21 {1,2,2,2,1.6} 2−22 {1,2,2,2,2,1.6} 2−20.5 {1,2,2,2,1.6} 2−20

comparison operation as in [15]. For a composite polyno- 5.2.3 Scaling Factor mial pk ◦ pk−1 ◦ · · · ◦ p1, where deg(pi) = di, the depth Increasing the scaling factor ∆ increases the accuracy of Pk i=1 dep(di) is consumed. In addition, additional depth the homomorphic comparison operation and homomorphic consumption is required because of the required multipli- max function. The homomorphic comparison operation or cation by 1/w in computing scaled Chebyshev polynomial homomorphic max function is said to fail for one input tuple ˜ T1(x) and of the required multiplication by 1/2 in com- of two real numbers u and v if the output does not satisfy the pk◦pk−1◦···◦p1(u−v)+1 puting 2 in Algorithm 6. However, for comparison operation or max function error condition. We a set of scaled values {w1, w2, ··· , wk}, if we multiply the perform the proposed homomorphic comparison operation 20 coefficients of pi by 1/wi+1 for 1 ≤ i ≤ k − 1 and those of or homomorphic max function for 65536 × 16 = 2 input pk by 1/2, then no additional depth consumption is required tuples for each α, and we obtain the number of failures. by using these modified coefficients instead. Then, the initial Then, the failure rate is the number of failures divided by the 20 modulus bit log qL is total number of input tuples, that is, 2 . We set the scaling factor so that the homomorphic comparison operation or log qL = log ∆ · (dep(d1) + ··· + dep(dk)) + log ∆ + 10. homomorphic max function does not fail in any slot, and the failure rate is said to be less than 10−6 in this case. Table 4 shows the failure rate of the proposed homomor- When we execute the homomorphic max function, the phic comparison algorithm MinimaxComp according to the initial modulus bit is scaling factor for various α when running time and the log qL = log ∆ · (dep(d1) + ··· + dep(dk)) + 2 log ∆ + 10, depth consumption are minimized, respectively. We note that if the scaling factor is larger than a certain threshold which is log ∆ bit larger than the initial modulus bit in value, the number of failures is zero. In this paper, log ∆ homomorphic comparison operation because homomor- is set to 40 when α is 8, 12, or 16, and set to 44 when α phic max function requires one more depth in computing is 20. It is confirmed that the failure rates of homomorphic (u−v)pk◦pk−1◦···◦p1(u−v)+(u+v) 2 in Algorithm 8. comparison operation and homomorphic max function are

TABLE 4 Failure Rate of the Proposed Algorithm MinimaxComp on HEAAN According to the Scaling Factor for Various α When Running Time and the Depth Consumption Are Minimized, Respectively

failure rate of the proposed algorithm MinimaxComp log ∆ minimize running time minimize depth consumption α 8 12 16 20 8 12 16 20 32 < 10−6 < 10−6 0.004653 0.522744 < 10−6 < 10−6 0.009860 0.974903 34 < 10−6 < 10−6 0.000075 0.202683 < 10−6 < 10−6 0.000801 0.390946 36 < 10−6 < 10−6 < 10−6 0.125566 < 10−6 < 10−6 0.000004 0.171154 38 < 10−6 < 10−6 < 10−6 0.000921 < 10−6 < 10−6 < 10−6 0.006707 40 < 10−6 < 10−6 < 10−6 0.000019 < 10−6 < 10−6 < 10−6 0.000655 42 < 10−6 < 10−6 < 10−6 < 10−6 < 10−6 < 10−6 < 10−6 0.000020 44 < 10−6 < 10−6 < 10−6 < 10−6 < 10−6 < 10−6 < 10−6 < 10−6

TABLE 5 Optimal Set of Degrees for the Proposed Homomorphic Comparison Operation and Homomorphic Max Function When Running Time and the Depth Consumption Are Minimized, Respectively

optimal set of degrees α proposed homomorphic comparison operation MinimiaxComp proposed homomorphic max function MinimaxMax minimize running time minimize depth ζ minimize running time minimize depth 8 {7,15,15} {7,15,15} 12 {3,5,9} {7,15} 12 {7,7,7,13,13} {15,15,15,15} 15 {5,7,7,15} {7,15,27} 16 {7,7,7,13,13,27} {15,15,15,15,27} 19 {5,5,5,7,7,15} {7,7,7,15,15} 20 {5,5,5,7,7,7,7,7,27} {29,31,31,31,31} 21 {7,7,7,13,13,27} {15,15,15,15,27} 16

TABLE 6 Running Time (Amortized Running Time) and Ciphertext Modulus Bit Consumption of CompG (Cheon et al. [15]) and the Proposed MinimaxComp on HEAAN for Various α; an Asterisk (*) Indicates That the Parameter Set Does Not Achieve 128-bit Security Because of the Large log qL ≥ 1700

proposed algorithm MinimaxComp CompG (Cheon et al. [15]) α log ∆ minimize running time minimize depth running log( qL ) running log( qL ) running log( qL ) time qf time qf time qf 8 40 15 s (0.22 ms) 844 8 s (0.12 ms) 440 8s (0.12 ms) 440 12 40 28 s (0.42 ms) 1184 15 s (0.22 ms) 680 16s (0.24 ms) 640 16 40 45 s (0.68 ms) 1524 24 s (0.36 ms) 880 25s (0.38 ms) 840 20 44 62 s* (0.94 ms) 1854* 38 s (0.57 ms) 1276 43s (0.65 ms) 1100

TABLE 7 Running Time (Amortized Running Time) and Ciphertext Modulus Bit Consumption of MaxG (Cheon et al. [15]) and the Proposed MinimaxMax on HEAAN for Various α

proposed algorithm MinimaxMax MaxG (Cheon et al. [15]) α log ∆ minimize running time minimize depth running log( qL ) running log( qL ) running log( qL ) time qf time qf time qf 8 40 8 s (0.12 ms) 548 4 s (0.06 ms) 400 5s (0.07 ms) 320 12 40 16 s (0.24 ms) 885 10 s (0.15 ms) 560 11s (0.16 ms) 520 16 40 29 s (0.44 ms) 1225 16 s (0.24 ms) 800 17s (0.25 ms) 720 20 44 41 s (0.62 ms) 1527 28 s (0.42 ms) 1012 29s (0.44 ms) 968

less than 10−6 for these scaling factors. reduces ciphertext modulus bit consumption by approxi- mately 41% (resp. 45%) on average, compared with the pre- 5.2.4 Optimal Degrees vious algorithm CompG if running time (resp. depth con- sumption) is to be minimized. It should be noted that when In this numerical analysis, the initial modulus is determined the precision parameter α is 20, the previous algorithm does according to the depth consumption, and thus, not only the not achieve 128-bit security, while the proposed algorithm number of non-scalar multiplications but also depth con- achieves 128-bit security due to small depth consumption. sumption affect running time. Thus, minimizing number of We note that the proposed algorithm MinimaxComp non-scalar multiplications does not necessarily correspond requires slightly more running time when the depth con- to minimizing running time, and we perform numerical sumption is minimized than when running time is mini- analysis when running time and the depth consumption are mized. However, less ciphertext modulus bit consumption minimized, respectively. Table 5 shows the optimal set of is required in the former case, resulting in less frequent boot- degrees for the proposed homomorphic comparison oper- strapping and hence considerably reduced running time for ation and homomorphic max function when running time FHE applications that use bootstrapping. and the depth consumption are minimized, respectively. 5.4 Performance of the Proposed Homomorphic Max 5.3 Performance of the Proposed Homomorphic Com- Function Algorithm MinimaxMax parison Algorithm MinimaxComp We compare the performance of the proposed homomorphic max function MinimiaxMax in Algorithm 8 with that of We compare the performance of the proposed homomorphic the previous algorithm MaxG using the scaled values w comparison algorithm MinimaxComp in Algorithm 7 with and margins η in Table 3. Table 7 shows the running time that of the previous algorithm CompG [15] using the scaled (amortized running time) and the ciphertext modulus bit values w and margins η in Table 3. Table 6 shows the consumption of MaxG (referred to as NewMaxG in [15]) running time (amortized running time) of CompG and the and the proposed algorithm MinimaxMax on HEAAN for proposed algorithm MinimaxComp on HEAAN for various various α. It can be seen that when we minimize running α. It can be seen that the proposed algorithm Minimax- time (resp. depth consumption), the proposed algorithm Comp reduces running time by approximately 45% (resp. MinimaxMax reduces running time by approximately 41% 41%) on average, compared with the previous algorithm (resp. 35%) on average, and reduces the ciphertext modulus CompG if running time (resp. depth consumption) is to be bit by approximately 33% (resp. 40%) on average, compared minimized. with the previous algorithm MaxG. In addition, Table 6 shows the ciphertext modulus bit consumption of CompG and the proposed algorithm Min- 5.5 Performance of the Homomorphic Sorting Algo- imaxComp on HEAAN for various α. Let qL and qf be the moduli of the initial ciphertext and the final ciphertext, rithm Using MinimaxMax respectively, after the homomorphic comparison operation. We compare the performance of homomorphic sorting algo- It can be seen that the proposed algorithm MinimaxComp rithms for three and four numbers using the proposed max 17

TABLE 8 max function algorithm reduced running time by approx- Running Time (Amortized Running Time) and the Ciphertext Modulus imately 41% (resp. 35%) on average, compared with the Bit Consumption of Sorting When We Use MaxG (Cheon et al. [15]) and the Proposed MinimaxMax on HEAAN; an Asterisk (*) Indicates previous algorithm if running time (resp. depth consump- That the Parameter Set Does Not Achieve 128-Bit Security Because of tion) was to be minimized. Finally, the homomorphic sorting the Large log qL ≥ 1700 algorithm using the proposed homomorphic max function MinimaxMax algorithm was approximately two times as fast as the homo- MaxG (minimize depth) morphic sorting algorithm using the previous homomorphic sorting running 140 s* 76 s max function algorithm. of three time (2.13 ms) (1.15 ms) numbers log(qL/qf ) 2655* 1560 sorting running 252 s* 136 s APPENDIX A of four time (3.84 ms) (2.07 ms) PROOFOF LEMMA 5 numbers log(qL/qf ) 2655* 1560 Proof. As p˜k ◦ p˜k−1 ◦ · · · ◦ p˜1 is a 1-centered-range composite polynomial on Rδ, there exists {τi}1≤i≤k such that p˜1([1 − δ, 1 + δ]) = [1 − τ1, 1 + τ1] and p˜i([1 − τi−1, 1 + τi−1) = MinimaxMax function algorithm with that achieved by [1 − τi, 1 + τi] for all i, 2 ≤ i ≤ k. Then, p˜k ◦ p˜k−1 ◦ · · · ◦ MaxG α = 12 using the previous algorithm , and we set . As p˜1([1 − δ, 1 + δ]) = [1 − τk, 1 + τk]. As p˜k ◦ p˜k−1 ◦ · · · ◦ −(α−1) in the numerical analysis of the homomorphic comparison p˜1 is (α − 1, δ)-two-sided-close, τk ≤ 2 should hold. operation and max function, we set the initial modulus Let deg(p˜i) = di for 1 ≤ i ≤ k. Let pˆ1 be the minimax q log ∆ + 10 L such that the final modulus bit is after the approximate polynomial of degree at most d1 on Rδ, and execution of the sorting algorithm. 0 0 let τ1 be the approximation error of pˆ1. Then, τ1 ≤ τ1. Let Table 8 shows the running time (amortized running 0 τi be the approximation error of pˆi, which is the minimax time) and the ciphertext modulus bit consumption of sorting approximate polynomial of degree at most di on Rτ 0 for MaxG i−1 when we use (Cheon et al. [15]) and the proposed sgn(x) for i, 2 ≤ i ≤ k. Then, pˆk ◦pˆk−1 ◦· · ·◦pˆ1 is a minimax MinimaxMax on HEAAN. It can be seen that the sorting composite polynomial on Rδ. We prove by induction that algorithm using the proposed max function algorithm Min- 0 0 00 τi ≤ τi, 2 ≤ i ≤ k. We assume that τi−1 ≤ τi−1. Let τi imaxMax is approximately two times as fast as the sorting be the error in the approximation of sgn(x) by the minimax algorithm using the previous max function MaxG. In ad- approximate polynomial of degree at most di on Rτi−1 . By dition, ciphertext modulus bit consumption is reduced by 0 00 Lemma 4, we have τi ≤ τi . As p˜i([1 − τi−1, 1 + τi−1]) = 41%. It should be noted that the sorting algorithm that use 00 0 00 [1 − τi, 1 + τi], we have τi ≤ τi. Thus, τi ≤ τi ≤ τi. We the previous homomorphic max function algorithm does 0 conclude by induction that τi ≤ τi for all i, 2 ≤ i ≤ k. As not achieve 128-bit security, while the sorting algorithm that 0 −(α−1) τk ≤ τk ≤ 2 , we have that pˆk ◦ pˆk−1 ◦ · · · ◦ pˆ1 is an use the proposed homomorphic max function algorithm (α − 1, δ)-two-sided-close minimax composite polynomial achieves 128-bit security due to small depth consumption. on Rδ such that deg(pˆi) ≤ deg(p˜i) for all i, 1 ≤ i ≤ k.

6 CONCLUSION APPENDIX B We proposed the optimal method of approximating sgn(x) PROOFOF LEMMA 6 for the homomorphic comparison operation, which uses the Proof. Let p1([1 − δ, 1 + δ]) = [a1, b1],p 2([a1, b1]) = proposed minimax composite polynomials. Its principle is to [a2, b2], ··· ,p k([ak−1, bk−1]) = [ak, bk]. As {pi}1≤i≤k is determine the optimal set of degrees such that the minimax a set of polynomials with odd-degree terms, we have composite polynomial for the set of degrees is optimal with p1([−1 − δ, −1 + δ]) = [−b1, −a1],p 2([−b1, −a1]) = respect to number of non-scalar multiplications and depth [−b2, −a2], ··· ,p k([−bk−1, −ak−1]) = [−bk, −ak]. As the consumption among all minimax composite polynomials composite polynomial is (α−1, δ)-two-sided-close, we have −(α−1) −(α−1) that satisfy the comparison operation error condition. It was [ak, bk] ⊆ [1 − 2 , 1 + 2 ] and 0 < ai < bi for proved that for any given composition of polynomials with i, 1 ≤ i ≤ k. 2 2 ai+bi Let p˜1(x) = p1(x), p˜i(x) = pi( x), 2 ≤ odd-degree terms that satisfies the comparison operation er- a1+b1 ai+bi 2 b1−a1 b1−a1 ror condition, there exists a minimax composite polynomial i ≤ k. Then, p˜1([1 − δ, 1 + δ]) = [1 − , 1 + ] and a1+b1 a1+b1 for some set of degrees that satisfies the error condition and bi−1−ai−1 bi−1−ai−1 bi−ai bi−ai p˜i([1 − a +b , 1 + a +b ]) = [1 − a +b , 1 + a +b ], requires a smaller or equal depth consumption and num- i−1 i−1 i−1 i−1 i i i i 2 ≤ i ≤ k, and p˜k ◦ p˜k−1 ◦ · · · ◦ p˜1 is a 1-centered- ber of non-scalar multiplications than those for the given range composite polynomial on Rδ. We now show that composite polynomial. As a brute-force search requires [1 − bk−ak , 1 + bk−ak ] = [ 2ak , 2bk ] ⊆ [1 − 2−(α−1), 1 + ak+bk ak+bk ak+bk ak+bk considerable running time for α, we proposed polynomial- −(α−1) 2 ], implying that p˜k ◦ p˜k−1 ◦ · · · ◦ p˜1 is (α − 1, δ)-two- time algorithms that obtain the optimal minimax composite −(α−1) 2ak sided-close. If ak + bk ≤ 2, then 1 − 2 ≤ ak ≤ polynomials by using dynamic programming. ak+bk 2bk 2ak −(α−1) and = 2 − ≤ 2 − ak ≤ 1 + 2 , whereas if The numerical analysis demonstrated that if running ak+bk ak+bk 2bk −(α−1) 2ak ak + bk > 2, then < bk ≤ 1 + 2 and = time (resp. depth consumption) was to be minimized, ak+bk ak+bk 2bk −(α−1) 2 − > 2 − bk ≥ 1 − 2 . Thus, p˜k ◦ p˜k−1 ◦ · · · ◦ p˜1 the proposed homomorphic comparison algorithm reduced ak+bk running time by approximately 45% (resp. 41%) on average, is an (α − 1, δ)-two-sided-close 1-centered range composite compared with the previous algorithm when the HEAAN polynomial on Rδ satisfying deg(p˜i) = deg(pi) for all i, library was used. In addition, the proposed homomorphic 1 ≤ i ≤ k. 18

0 APPENDIX C τ . That is, MinErr(d, τ1) < MinErr(d, τ2), which is a PROOFOF LEMMA 7 contradiction. Thus, MinErr(d, τ) is a strictly increasing function of τ. Proof. Let d be 2i+1. We consider the minimax approximate polynomial p(x) of degree at most 2i + 1 on Rτ for sgn(x). 0 (ii) Continuous: Let τ be the minimax approximation error of p(x) on Rτ . As sgn(x) is an odd function, it can be seen from Lemma MinErr(d, τ) τ = τ 2 that the minimax approximate polynomial of degree at We now show that is continuous at 0, δ0 > 0 0 > 0 most 2i + 1 to sgn(x) is equal to the minimax approximate that is, for any , there exists such that |τ − τ | ≤ 0 |MinErr(d, τ) − MinErr(d, τ )| ≤ δ0 polynomial of degree at most 2i + 2 to sgn(x). In addition, 0 implies 0 . p(x) p(x) is a polynomial with odd-degree terms by Lemma Let be the minimax approximate polynomial of degree at most 2i + 1 on R , and let τ 0 be the minimax 2. We now show that there exist i + 2 distinct points τ0 p(x) x , x , ··· , x ∈ [1 − τ, 1 + τ] that satisfy the following approximation error of . It suffices to consider only 0 1 i+1 δ0 < τ 0 i + 2 three properties: the case . There exist distinct points x0, x1, ··· , xi+1 ∈ (0, ∞) that satisfy the above three properties. There exists a unique x ∈ (0, x0) such that Prop 1. 1 − τ = x0 < x1 < ··· < xi+1 = 1 + τ. 0 0 0 j+1 0 p(x) = 1 − τ − δ . Let 1 be 1 − τ0 − x for this unique Prop 2. p(xj) = 1 + (−1) τ , 0 ≤ j ≤ i + 1. x. In addition, there exists a unique x ∈ (x , ∞) such Prop 3. p(x) is strictly increasing on (0, x ). For j, i+1 1 that p(x) = 1 + τ 0 + δ0 when i is even, and unique 1 ≤ j ≤ i, p(x) is strictly increasing on (xj, xj+1) when j is 0 0 x ∈ (xi+1, ∞) such that p(x) = 1 − τ − δ when i is even, and strictly decreasing on (xj, xj+1) when j is odd. In 0 odd. Let 2 be x − 1 − τ0 for the unique x. There exists a addition, p(x) is strictly increasing on (xi, ∞) if i is even, and 0 0 0 unique x ∈ (x0, x1) such that p(x) = 1 − τ + δ . Let 3 strictly decreasing on (xi, ∞) if i is odd. be x − 1 + τ0 for this unique x. In addition, there exists a unique x ∈ (x , x ) such that p(x) = 1 + τ 0 − δ0 |p(x) − sgn(x)| should have maximum values at 2i + 4 i i+1 when i is even, and unique x ∈ (x , x ) such that distinct points in R by Theorem 1. However, there are i i+1 τ p(x) = 1−τ 0 +δ0. Let 0 be −x+1+τ for this unique x. at most 2i distinct points x such that p0(x) = 0. If we 4 0 Let now 0 be min(0 , 0 , 0 , 0 ). Then, p([1 − τ − 0, 1 + consider the case x > 0, |p(x) − sgn(x)| should have 1 2 3 4 0 τ + 0]) ⊆ [1 − τ 0 − δ0, 1 + τ 0 + δ0]. Thus, the minimax maximum values at i + 2 distinct points on [1 − τ, 1 + τ], 0 0 approximation error of the minimax approximate poly- and there are at most i distinct points x such that p (x) = 0. 0 0 nomial on R 0 is smaller than or equal to τ +δ . That If |p(x) − sgn(x)| has a maximum value at x =x ¯, τ0+ is, MinErr(d, τ +0) ≤ τ 0+δ0. Furthermore, let x0 = x + then p0(¯x) = 0 or x =x ¯ is a boundary point, that is, 0 0 0 0, x0 = x , ··· , x0 = x , x0 = x − 0. We consider x¯ ∈ {1 − τ, 1 + τ}. Thus, |p(x) − sgn(x)| should have 1 1 i i i+1 i+1 2i + 4 points −x0 , −x0 , ··· , −x0 , x0 , ··· , x0 , x0 . By maximum values at two boundary points x = 1 − τ i+1 i 0 0 i i+1 Lemma 1, the minimax approximation error of the and x = 1 + τ. Let x0, ··· , xi+1 be the i + 2 distinct 0 minimax approximate polynomial on Rτ0− is larger points in (0, ∞) such that |p(x) − sgn(x)| has maximum 0 0 0 than or equal to τ − δ . That is, MinErr(d, τ0 −  ) ≥ values at those points. Then, x0 = 1 − τ, xi+1 = 1 + τ 0 0 0 0 0 τ − δ . As MinErr(d, τ) is an increasing function, if and p (x1) = p (x2) = ··· = p (xi) = 0. In addition, 0 0 |τ − τ0| ≤  , then |MinErr(d, τ) − MinErr(d, τ0)| ≤ δ . considering p(0) = 0 and p(x1) > 0, p(x) is strictly Thus, MinErr(d, τ) is continuous at τ = τ0. increasing on (0, x1). As p(x0) < p(x1), we have that 0 0 0 p(x0) = 1 − τ , p(x1) = 1 + τ , p(x2) = 1 − τ , ··· for some τ 0 > 0 by Theorem 1. Moreover, it can be seen that Prop 3 is satisfied by Theorem 1. Thus, there exist i + 2 points x0, x1, ··· , xi+1 ∈ (0, ∞) that satisfy the above three PPENDIX properties. We now show that MinErr(d, τ) is a strictly A D increasing continuous function of τ with domain (0, 1) as PROOFOF THEOREM 3 follows: Proof. Let τ 0 = h(m, n, τ). We assume that (i) Strictly increasing: τ 0 > max InvMinErr(2j + 1, h(m − mult(2j + 1), Let 0 < τ1 < τ2 < 1. Let p1(x) and p2(x) be 1≤j the minimax approximate polynomials of degree mult(2j+1)≤m dep(2j+1)≤n at most 2i + 1 on Rτ1 and Rτ2 , respectively. It is trivial that MinErr(d, τ1) ≤ MinErr(d, τ2). We n − dep(2j + 1), τ)). 0 assume that MinErr(d, τ1) = MinErr(d, τ2) = τ . Then, by the uniqueness property of the minimax Then, there exists a minimax composite polynomial approximate polynomial, p1(x) = p2(x). We note pk ◦ pk−1 ◦ · · · ◦ p1 satisfying pk ◦ pk−1 ◦ · · · ◦ p1([1 − 0 0 that p1(x) is the minimax approximate polynomial τ , 1 + τ ]) ⊆ [1 − τ, 1 + τ], MultNum({pi}1≤i≤k) ≤ m, and

of degree at most 2i + 1 on Rτ1 . Then, it can be DepNum({pi}1≤i≤k) ≤ n. Let d1 be the degree of p1, and let 0 0 0 seen that 0 < p1(1 − τ2) < p1(1 − τ1) = 1 − τ p1([1−τ , 1+τ ]) = [1−τ1, 1+τ1]. As the minimax composite

by Prop 3. Considering p1(x) = p2(x), the minimax polynomial pk ◦pk−1 ◦· · ·◦p2 on Rτ1 satisfies pk ◦pk−1 ◦· · ·◦

approximation error of p2(x) on Rτ2 is larger than p2([1 − τ1, 1 + τ1]) ⊆ [1 − τ, 1 + τ], MultNum({pi}2≤i≤k) ≤ 19

m − mult(d1), and DepNum({pi}2≤i≤k) ≤ n − dep(d1), we have that τ1 ≤ h(m − mult(d1), n − dep(d1), τ). Then, 0 τ = InvMinErr(d1, τ1)

≤ InvMinErr(d1, h(m − mult(d1), n − dep(d1), τ)) ≤ max InvMinErr(2j + 1, h(m − mult(2j + 1), 1≤j mult(2j+1)≤m dep(2j+1)≤n n − dep(2j + 1), τ)), which leads to a contradiction. We assume that τ 0 < max InvMinErr(2j + 1, h(m − mult(2j + 1), 1≤j mult(2j+1)≤m dep(2j+1)≤n n − dep(2j + 1), τ)).

InvMinErr(2i + 1,h(m − mult(2i + 1), n − dep(2i + 1), τ)) = max InvMinErr(2j + 1, h(m − mult(2j + 1), 1≤j mult(2j+1)≤m dep(2j+1)≤n n − dep(2j + 1), τ)) for some i. Let τ1 be InvMinErr(2i+1, h(m−mult(2i+1), n− dep(2i + 1), τ)). Let τ2 be h(m − mult(2i + 1), n − dep(2i + 1), τ) = MinErr(2i + 1, τ1). Then, there exists a minimax composite polynomial pk ◦ pk−1 ◦ · · · ◦ p2 satisfying

pk ◦ pk−1 ◦ · · · ◦ p2([1 − τ2, 1 + τ2]) ⊆ [1 − τ, 1 + τ]

MultNum({pi}2≤i≤k) ≤ m − mult(2i + 1)

DepNum({pi}2≤i≤k) ≤ n − dep(2i + 1).

Let p1 be the minimax approximate polynomial of degree at most 2i+1 on [1−InvMinErr(2i+1, τ2), 1+InvMinErr(2i+ 1, τ2)]. As p1([1 − InvMinErr(2i + 1, τ2), 1 + InvMinErr(2i + 1, τ2)]) = [1 − τ2, 1 + τ2], we have that pk ◦ pk−1 ◦ · · · ◦ p1([1−InvMinErr(2i+1, τ2), 1+InvMinErr(2i+1, τ2)]) ⊆ [1− τ, 1 + τ]. In addition, MultNum(pk ◦ pk−1 ◦ · · · ◦ p1) ≤ m and 0 DepNum(pk ◦ pk−1 ◦ · · · ◦ p1) ≤ n. Thus, τ = h(m, n, τ) < InvMinErr(2i + 1, τ2), which is a contradiction. Thus, τ 0 = max InvMinErr(2j + 1, h(m − mult(2j + 1), 1≤j mult(2j+1)≤m dep(2j+1)≤n n − dep(2j + 1), τ)) and the theorem is proved.

APPENDIX E OPTIMAL SETOF DEGREESFOR HOMOMORPHIC COMPARISON OPERATION 20

α depth #mults degrees depth #mults degrees 5 10 {27} 6 8 {5,7} 4 7 7 {3,3,5} 7 12 {7,13} 8 10 {3,5,7} 5 9 9 {5,5,5} 8 16 {15,15} 9 14 {3,7,13} 6 10 12 {5,5,11} 11 11 {3,5,5,5} 10 17 {7,7,13} 11 15 {3,5,7,7} 7 12 14 {5,5,5,7} 13 13 {3,3,5,5,5} 11 21 {7,15,15} 12 19 {3,7,7,13} 8 13 17 {5,5,7,11} 14 15 {3,3,5,5,9} 13 22 {7,7,7,13} 14 20 {3,5,7,7,7} 9 15 18 {3,5,5,5,13} 16 17 {5,5,5,5,9} 14 25 {7,7,13,15} 15 23 {5,7,7,7,7} 10 16 21 {5,5,5,7,13} 17 20 {3,3,5,5,5,13} 18 18 {5,5,5,5,5,5} 15 29 {7,15,15,15} 16 26 {5,7,7,7,15} 11 17 24 {3,3,5,7,7,13} 18 23 {3,5,5,5,7,13} 19 21 {5,5,5,5,5,11} 20 20 {3,5,5,5,5,5,5} 16 32 {15,15,15,15} 17 29 {7,7,7,13,13} 12 18 27 {3,5,7,7,7,13} 19 25 {5,5,5,5,7,15} 20 24 {3,5,5,5,5,7,7} 21 22 {3,5,5,5,5,5,9} 17 36 {15,15,15,31} 18 32 {7,7,7,13,27} 19 30 {5,7,7,7,7,13} 20 28 {5,5,5,7,13,13} 13 21 26 {3,5,5,5,5,7,13} 22 25 {5,5,5,5,5,5,13} 23 23 {3,5,5,5,5,5,5,5} 19 36 {7,7,15,15,27} 20 33 {7,7,7,7,11,13} 21 31 {3,5,5,7,7,7,15} 22 29 {5,5,5,5,7,7,13} 14 23 28 {3,3,5,5,5,5,7,13} 24 26 {3,5,5,5,5,5,5,11} 25 25 {3,3,5,5,5,5,5,5,5} 20 39 {7,15,15,15,27} 21 36 {7,7,7,7,11,27} 22 34 {5,5,7,7,7,7,15} 23 32 {3,3,5,5,7,7,7,13} 15 24 30 {3,5,5,5,5,5,7,15} 25 28 {5,5,5,5,5,5,5,13} 26 27 {3,3,5,5,5,5,5,5,9} 21 42 {15,15,15,15,27} 22 39 {7,7,7,13,13,27} 23 37 {5,7,7,7,7,13,13} 24 35 {3,5,5,7,7,7,7,13} 16 25 33 {5,5,5,5,5,7,7,15} 26 31 {3,3,5,5,5,5,5,7,13} 27 29 {3,5,5,5,5,5,5,5,11} 28 28 {3,3,5,5,5,5,5,5,5,5} 22 46 {15,15,15,29,29} 23 42 {7,7,13,13,15,27} 24 40 {3,7,7,7,7,7,7,15} 25 38 {5,5,7,7,7,7,7,13} 17 26 36 {5,5,5,5,7,7,13,13} 27 34 {3,5,5,5,5,5,7,7,13} 28 32 {5,5,5,5,5,5,5,5,15} 29 31 {3,3,5,5,5,5,5,5,5,11} 30 30 {5,5,5,5,5,5,5,5,5,5} 23 50 {15,15,29,29,31} 24 45 {7,13,13,15,15,27} 25 42 {7,7,7,7,7,7,7,13} 26 40 {5,5,7,7,7,7,13,13} 18 27 38 {3,5,5,5,7,7,7,7,13} 28 36 {5,5,5,5,5,5,7,7,15} 29 35 {5,5,5,5,5,5,5,13,13} 30 33 {3,5,5,5,5,5,5,5,5,13} 31 31 {3,3,5,5,5,5,5,5,5,5,5} 24 55 {15,29,31,31,31} 25 48 {13,13,15,15,15,27} 26 45 {7,7,7,7,7,7,7,27} 27 43 {3,5,7,7,7,7,7,7,15} 19 28 41 {5,5,5,7,7,7,7,7,13} 29 39 {5,5,5,5,5,7,7,7,23} 30 37 {3,5,5,5,5,5,5,7,7,13} 31 35 {5,5,5,5,5,5,5,5,5,15} 32 34 {3,3,5,5,5,5,5,5,5,5,11} 33 33 {5,5,5,5,5,5,5,5,5,5,5} 25 59 {29,31,31,31,31} 26 51 {13,15,15,15,27,27} 27 48 {7,7,7,7,7,7,15,27} 28 46 {5,7,7,7,7,7,7,7,15} 20 29 44 {5,5,5,7,7,7,7,7,27} 30 42 {3,5,5,5,5,7,7,7,7,15} 31 40 {5,5,5,5,5,5,7,7,7,13} 32 38 {5,5,5,5,5,5,5,5,7,23} 33 36 {3,5,5,5,5,5,5,5,5,5,13} 34 35 {5,5,5,5,5,5,5,5,5,5,9} 21

ACKNOWLEDGMENTS [19] R. E. Goldschmidt, Applications of Division by Convergence. PhD thesis, Massachusetts Institute of Technology, 1964. This work was supported by the Samsung Research Fund- [20] E. W. Cheney, Introduction to Approximation Theory. Cambridge, ing and Incubation Center of Samsung Electronics under UK:McGraw-Hill, 1966. Project SRFC-IT1801-08. We would like to thank anonymous [21] J.-W. Lee, E. Lee, Y. Lee, Y.-S. Kim, and J.-S. No, “High-precision bootstrapping of RNS-CKKS homomorphic encryption using op- reviewers for their valuable suggestions and comments that timal minimax polynomial approximation and inverse func- helped improve the quality of this paper. tion,” International Conference on the Theory and Applications of Cryp- tographic Techniques (Eurocrypt 2021), accepted for publication. [22] Y. Lee, J.-W. Lee, Y.-S. Kim, and J.-S. No, “Near-optimal polynomial REFERENCES for modulus reduction using l2-norm for approximate homomor- phic encryption,” IEEE Access, vol. 8, pp. 144321–144330, 2020. [1] C. Gentry, “Fully homomorphic encryption using ideal lattices,” [23] E. Y. Remez, “Sur la determination des polynomes in Proc. of the Forty-First Annual ACM Symposium on Theory of d’approximation de degre donnee,” Comm. Soc. Math. Kharkov, vol. Computing, 2019, pp. 169–178. 10, no. 196, pp. 41–63, 1934. [2] J. Fan and F. Vercautern, “Somewhat practical fully homomorphic [24] J.-W. Lee, E. Lee, Y. Lee, Y.-S. Kim, and J.-S. No, “Optimal minimax encryption,” Cryptol. ePrint Arch., Tech. Rep. 2012/144, 2012. [On- polynomial approximation of modular reduction for bootstrapping line]. Available: https://eprint.iacr.org/2012/144. of approximate homomorphic encryption,” Cryptol. ePrint Arch., [3] J. Cheon, A. Kim, M. Kim, and Y. Song, “Homomorphic encryp- Tech. Rep. 2020/552/20200803:084202, 2020. [Online]. Available: tion for arithmetic of approximate numbers,” in Proc. International https://eprint.iacr.org/2020/552/20200803:084202. Conference on the Theory and Application of Cryptology and Information [25] J. Bossuat, C. Mouchet, J. Troncoso-Pastoriza, and J. Hubaux, Security, LNCS, Berlin, Germany: Springer, 2017, pp. 409–437. “Efficient bootstrapping for approximate homomorphic encryption [4] I. Chillotti, N. Gama, M. Georgieva, and M. Izabachene, “TFHE: with non-sparse keys,” International Conference on the Theory and fast fully homomorphic encryption over the torus,” Journal of Cryp- Applications of Cryptographic Techniques (Eurocrypt 2021), accepted tology, vol. 33, no. 1, pp. 34–91, 2020. for publication. [5] R. Gilad-Bachrach, N. Dowlin, K. Laine, K. Lauter, M. Naehrig, and [26] B. Li and D. Micciancio, “On the security of homomorphic en- J. Wernsing, “Cryptonets: Applying neural networks to encrypted cryption on approximate numbers,” International Conference on the data with high throughput and accuracy,” in Proc. International Theory and Applications of Cryptographic Techniques (Eurocrypt 2021), Conference on Machine Learning , 2016, pp. 201–210. accepted for publication. [6] C. Juvekar, V. Vaikuntanathan, and A. Chandrakasan, “GAZELLE: [27] J. Cheon, S. Hong, and D. Kim, “Remark on the security of CKKS A low latency framework for secure neural network inference,” in scheme in practice,” Cryptol. ePrint Arch., Tech. Rep. 2020/1581, Proc. 27th USENIX Secur. Symp. (USENIX Security), 2018, pp. 1651– 2020. [Online]. Available: https://eprint.iacr.org/2020/1581. 1669. [28] M. R. Albrecht, R. Player, and S. Scott, “On the concrete hardness [7] C. Cortes and V. Vapnik, “Support-vector networks,” Machine Learn- of learning with errors,” Journal of Mathematical Cryptology, vol. 9, ing, vol. 20, no. 3, pp. 273–297, 1995. no. 3, pp. 169–203, 2015. [8] J. A. Hartigan and M. A. Wong, “A k-means clustering algorithm,” Journal of the Royal Statistical Society. Series C (Applied Statistics), vol. 28, no. 1, pp. 100–108, 1979. [9] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “Imagenet classifi- cation with deep convolutional neural networks,” in Advances in Neural Information Processing Systems (NIPS), 2012, pp. 1097–1105. [10] K. Simonyan and A. Zisserman, “Very deep convolutional net- works for large-scale image recognition.” [Online]. Available: https://arxiv.org/abs/1409.1556. [11] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, Eunsang Lee (Post-Doctoral Researcher, IEEE) D. Erhan, V. Vanhoucke, and Andrew Rabinovich, “Going deeper received the B.S. and Ph.D. degrees in electrical with convolutions,” in Proc. IEEE Conference on Computer Vision and and computer engineering from Seoul National Pattern Recognition, 2015, pp. 1–9. University, Seoul, South Korea, in 2014 and [12] C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, and Z. Wojna, “Re- 2020, respectively. He is now a Post-Doctoral thinking the inception architecture for computer vision,” in Proc. Researcher at the Institute of New Media and IEEE Conference on Computer Vision and Pattern Recognition, 2016, Communications, Seoul National University. His pp. 2818–2826. current research interests include homomorphic [13] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for encryption and lattice-based cryptography. image recognition,” in Proc. IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 770–778. [14] N. Emmadi, P. Gauravaram, H. Narumanchi, and H. Syed, “Up- dates on sorting of fully homomorphic encrypted data,” in Proc. International Conference on Cloud Computing Research and Innovation, 2015, pp. 19–24. [15] J. Cheon, D. Kim, and D. Kim, “Efficient homomorphic com- parison methods with optimal complexity,” in Proc. International Conference on the Theory and Application of Cryptology and Information Security (Asiacrypt 2020), LNCS, Berlin, Germany: Springer, 2020, pp. 221–256. Joon-Woo Lee (Graduate Student Member, [16] C. Boura, N. Gama, and M. Georgieva, “Chimera: a uni- IEEE) received the B.S. degree in electrical and fied framework for B/FV, TFHE and HEAAN fully homomor- computer engineering from Seoul National Uni- phic encryption and predictions for deep learning,” Cryptol. versity, Seoul, South Korea, in 2016, where he is ePrint Arch., Tech. Rep. 2018/758, 2018. [Online]. Available: currently pursuing the Ph.D. degree. His current https://eprint.iacr.org/2018/758. research interests include homomorphic encryp- [17] D. Chialva and A. Dooms, “Conditionals in homomor- tion and lattice-based cryptography. phic encryption and machine learning applications,” Cryptol. ePrint Arch., Tech. Rep. 2018/1032, 2018. [Online]. Available: https://eprint.iacr.org/2018/1032. [18] J. Cheon, D. Kim, D. Kim, H. Lee, and K. Lee, “Numerical method for comparison on homomorphically encrypted numbers,” in Proc. International Conference on the Theory and Application of Cryptology and Information Security (Asiacrypt 2019), LNCS, Berlin, Germany: Springer, 2019, pp. 415–445. 22

Jong-Seon No (Fellow, IEEE), received the B.S. and M.S.E.E. degrees in electronics engineer- ing from Seoul National University, Seoul, South Korea, in 1981 and 1984, respectively, and the Ph.D. degree in electrical engineering from the University of Southern California, Los Angeles, CA, USA, in 1988. He was a Senior MTS with Hughes Network Systems, from 1988 to 1990. He was an Associate Professor with the Depart- ment of Electronic Engineering, Konkuk Univer- sity, Seoul, from 1990 to 1999. He joined the faculty of the Department of Electrical and Computer Engineering, Seoul National University, in 1999, where he is currently a professor. His research interests include error-correcting codes, cryptography, se- quences, LDPC codes, interference alignment, and wireless commu- nication systems. He became an IEEE Fellow through the IEEE Infor- mation Theory Society, in 2012. He became a member of the National Academy of Engineering of Korea (NAEK), in 2015, where he served as the Division Chair of Electrical, Electronic, and Information Engineering from 2019 to 2020. He was a recipient of the IEEE Information Theory Society Chapter of the Year Award, in 2007. From 1996 to 2008, he served as the Founding Chair of the Seoul Chapter of the IEEE Infor- mation Theory Society. He was the General Chair of Sequence and Their Applications 2004 (SETA 2004), Seoul. He also served as the General Co-Chair of the International Symposium on Information Theory and Its Applications 2006 (ISITA 2006) and the International Symposium on Information Theory, 2009 (ISIT 2009), Seoul. He served as the Co- Editor-in-Chief for the IEEE JOURNAL OF COMMUNICATIONS AND NETWORKS, from 2012 to 2013.

Young-Sik Kim (Member, IEEE) received the B.S., M.S., and Ph.D. degrees in Electrical En- gineering and Computer Science from Seoul National University, in 2001, 2003, and 2007, respectively. He joined the Semiconductor Divi- sion, Samsung Electronics, where he worked in the research and development of security hard- ware IPs for various embedded systems, includ- ing modular exponentiation hardware accelera- tor (called Tornado 2MX2) for RSA and elliptic- curve cryptography in smart-card products and mobile application processors of Samsung Electronics, until 2010. He is currently a Professor with Chosun University, Gwangju, South Korea. He is also a Submitter for two candidate algorithms (McNie and pqsigRM) in the first round for the NIST Post Quantum Cryptography Standard- ization. His research interests include post-quantum cryptography, IoT security, physical-layer security, data hiding, channel coding, and signal design. He is selected as one of 2025’s 100 Best Technology Leaders (for Crypto-Systems) by the National Academy of Engineering of Korea.