Setup-Free Secure Search on Encrypted Data: Faster and Post-Processing Free
Proceedings on Privacy Enhancing Technologies; 2019 (3):87–107

Adi Akavia*, Craig Gentry, Shai Halevi, and Max Leibovich

Abstract: We present a novel secure search protocol on data and queries encrypted with Fully Homomorphic Encryption (FHE). Our protocol enables organizations (client) to (1) securely upload an unsorted data array x = (x[1], ..., x[n]) to an untrusted honest-but-curious server, where data may be uploaded over time and from multiple data-sources; and (2) securely issue repeated search queries q for retrieving the first element (i∗, x[i∗]) satisfying an agreed matching criterion i∗ = min{i ∈ [n] | IsMatch(x[i], q) = 1}, as well as fetching the next matching elements with further interaction. For security, the client encrypts the data and queries with FHE prior to uploading, and the server processes the ciphertexts to produce the result ciphertext for the client to decrypt. Our secure search protocol improves over the prior state-of-the-art for secure search on FHE encrypted data (Akavia, Feldman, Shaul (AFS), CCS'2018) in achieving:
– Post-processing free protocol where the server produces a ciphertext for the correct search outcome with overwhelming success probability. This is in contrast to returning a list of candidates for the client to post-process, or suffering from a noticeable error probability, in AFS. Our post-processing freeness enables the server to use secure search as a sub-component in a larger computation without interaction with the client.
– Faster protocol: (a) Client time and communication bandwidth are improved by a log² n / log log n factor. (b) Server evaluates a polynomial of degree linear in log n (compare to cubic in AFS), and the overall number of multiplications is improved by up to a log n factor. (c) Employing only GF(2) computations (compare to GF(p) for p ≫ 2 in AFS) to gain both further speedup and compatibility with all current FHE candidates.
– Order of magnitude speedup exhibited by extensive benchmarks we executed on identical hardware for implementations of our protocol versus that of AFS.
Additionally, like other FHE based solutions, our solution is setup-free: to outsource elements from the client to the server, no additional actions are performed on x except for encrypting it element by element (each element bit by bit) and uploading the resulting ciphertexts to the server.

Keywords: Secure search, Fully homomorphic encryption, Randomized algorithms, Universal hash functions

DOI 10.2478/popets-2019-0038
Received 2018-11-30; revised 2019-03-15; accepted 2019-03-16.

*Corresponding Author: Adi Akavia: University of Haifa
Craig Gentry: IBM Research
Shai Halevi: IBM Research
Max Leibovich: University of Haifa

1 Introduction

Following the rapid advancement and widespread availability of cloud computing, it is common practice to outsource data storage and computations to cloud providers. Placing cleartext (i.e. unencrypted) data on the cloud compromises data security. To regain data privacy one could encrypt the data prior to uploading to the cloud. However, if using standard encryption (e.g. AES), this solution nullifies the benefits of cloud computing: when given only ciphertexts the cloud provider cannot process the underlying cleartext data in any meaningful way.

Fully homomorphic encryption (FHE) [22, 23, 49] is an encryption scheme that allows processing the underlying cleartext data while it still remains in encrypted form, and without giving away the secret key (see Definition 2.1). With FHE it is possible for the client to securely outsource computations to the server as follows: The client first encrypts its data x with an FHE scheme to obtain the ciphertext [[x]] ← Enc_pk(x), and sends [[x]] to the server. The server can now compute any function f on the underlying cleartext data x by evaluating a homomorphic version of f on the ciphertext [[x]]. The outcome of this computation is a ciphertext [[y]] ← Eval_pk(f, [[x]]) that decrypts to the desired output y = f(x). The server can now send the ciphertext [[y]] to the client, who decrypts y ← Dec_sk([[y]]) to obtain the result.

The homomorphic computations achievable by the known FHE candidates (e.g. [7, 19, 24, 43]) are specified by a polynomial over a finite ring (i.e. by repeated application of homomorphic-addition and homomorphic-multiplication for that ring). For example, for data in binary representation, bitwise operations on plaintext bits (addition and multiplication modulo 2) can be replaced by their homomorphic counterparts on encrypted bits (homomorphic-addition and homomorphic-multiplication).

Key factors influencing the running time of such homomorphic computations are the degree and the overall number of multiplications of the polynomial. This leads to the two main constraints in designing algorithms that compute on FHE encrypted data: they must be realized by a polynomial of low degree and with a low overall number of multiplications.

Note that this FHE approach for securely outsourcing to the server the computation of y = f(x) has the benefits of requiring only a single round of communication, and with low communication bandwidth (communicating only the encrypted input [[x]] and output [[y]]). Furthermore, the server in this protocol learns no new information about x or y (assuming the FHE is semantically secure).

Secure search is a fundamental computational problem, useful in numerous data analysis and retrieval tasks. An abundance of solutions have been proposed to solve it using different cryptographic tools (see Section 1.1 and Tables 1-2). In particular, Gentry [22, 23] proposed using FHE to securely search on encrypted data.

In this work we address the natural and simple formulation for secure search on FHE encrypted data (secure search) as considered by [2]: Secure search is a two party protocol between a server and a client. The server holds an unsorted array [[x]] = ([[x[1]]], ..., [[x[n]]]) of encrypted elements (not necessarily distinct) that were previously encrypted and uploaded by the client, as well as a specification of a predicate IsMatch(a, b) ∈ {0, 1} specifying the matching condition. The client submits encrypted queries [[q]] to the server in order to retrieve the first matching element. The server returns to the client the encrypted index and element pair [[y]] = ([[i∗]], [[x[i∗]]]) for i∗ the index of the first element satisfying the matching condition, i∗ = min{i ∈ [n] | IsMatch(x[i], q) = 1}. See detailed definition and extensions in Sections 3 and 5.

Restrictions on protocols. Note that the above secure search formulation, as addressed in this work, focuses on protocols that involve (a) a single server, and where the client-server interaction is of (b) a single round, and (c) low communication complexity. Furthermore, (d) no initial setup is performed on x except for encrypting it element-by-element (each element encrypted bit-by-bit) and uploading the resulting ciphertexts to the server.

We point out that the latter condition, among other things, prevents speeding up the search by using standard data structures such as search-trees, hash-tables, or sorted arrays (on top of, or instead of, the encrypted unsorted array [[x]]). A linear scan lower bound is thus implied by the addressed formulation, even if we were to search on cleartext data. This restriction is nonetheless motivated by many use-cases, as discussed next.

Use-cases motivating the aforementioned no-setup restriction arise in settings where, for example:
– Matching criteria are unknown in advance, thus precluding appropriate indexing or sorting at setup;
– High dimensional range queries, where index size is exponential in the number of attributes and infeasible to compute or store;
– Streaming data with client discarding each element immediately after encrypting and uploading to the server, thus precluding client's setup or maintenance of the desired data-structures (and where for the server, seeing only ciphertexts, secure maintenance of advanced data structures seems even harder than secure search);
– Low capacity clients that are too weak to run setup over the entire cleartext array prior to encrypting and uploading it to the server;
– Fragmented data uploaded to the server from multiple distinct client endpoints (data-sources) with no single endpoint that can perform setup over the entire cleartext data.

The single-round and low-communication restrictions are motivated by use-cases in settings where communication is a major bottleneck, e.g. because it is intermittent or unreliable, or where communication is with data-sources that are mostly offline or have restricted battery capacity, as in sensor networks or some Internet-of-Things (IoT) devices.

The single server restriction is motivated not only by the simplicity of such an architecture, but also by its stronger security guarantee: it requires no non-collusion assumption on servers.

Threat model. We address computationally-bounded semi-honest adversaries that follow the protocol but may try to learn additional information. Our security requirement is that adversaries controlling the server cannot distinguish between two adversarially-chosen equal size queries or data arrays. See Section 3.3.

The leakage of our protocols includes only size information (specifically, upper-bounds on array size, ele-

ing setup necessitates, even on cleartext data, a linear scan of the data. Our work focuses on the no-setup case.
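To make the search functionality defined in the Introduction concrete, the following is a minimal cleartext sketch in Python of i∗ = min{i ∈ [n] | IsMatch(x[i], q) = 1}. It is not the protocol of this paper and performs no encryption. The names `is_match_equal`, `first_match`, and `first_match_gf2` are hypothetical, and taking IsMatch to be bitwise equality is an assumption made only for illustration. The second function uses only additions and multiplications modulo 2, the operations that have homomorphic counterparts on encrypted bits. It is a naive variant whose polynomial degree grows with n, shown to illustrate why the degree constraint matters.

```python
from typing import Callable, Optional, Sequence, Tuple

Bits = Sequence[int]  # one element or query, given as a sequence of 0/1 values


def is_match_equal(a: Bits, b: Bits) -> int:
    """Equality test written as a GF(2) polynomial: prod_j (1 + a_j + b_j) mod 2."""
    assert len(a) == len(b)
    result = 1
    for aj, bj in zip(a, b):
        # (1 + a_j + b_j) mod 2 equals 1 exactly when the two bits agree.
        result = (result * ((1 + aj + bj) % 2)) % 2
    return result


def first_match(
    x: Sequence[Bits],
    q: Bits,
    is_match: Callable[[Bits, Bits], int] = is_match_equal,
) -> Optional[Tuple[int, Bits]]:
    """Cleartext reference: linear scan returning (i*, x[i*]) with 1-based i*, or None."""
    for i, element in enumerate(x, start=1):
        if is_match(element, q) == 1:
            return i, element
    return None


def first_match_gf2(x: Sequence[Bits], q: Bits) -> list:
    """Branch-free variant using only + and * modulo 2 (naive: degree grows with n).

    Returns the bits of the first matching element, or all zeros if none matches.
    """
    no_match_so_far = 1            # 1 until the first match has been seen
    selected = [0] * len(q)
    for element in x:
        ind = is_match_equal(element, q)
        first = (no_match_so_far * ind) % 2            # 1 only at the first match
        selected = [(s + first * e) % 2 for s, e in zip(selected, element)]
        no_match_so_far = (no_match_so_far * (1 + ind)) % 2
    return selected
```

For example, calling `first_match` on an array of 3-bit tuples with a query equal to the second and third elements returns index 2, and `first_match_gf2` returns the bits of that same element. In an FHE setting each `+` and `*` above would be replaced by its homomorphic counterpart, and the data-dependent branch in `first_match` would not be available.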