EURASIP Journal on Information Security
Signal Processing in the Encrypted Domain
Guest Editors: Alessandro Piva and Stefan Katzenbeisser
Copyright © 2007 Hindawi Publishing Corporation. All rights reserved.
This is a special issue published in volume 2007 of “EURASIP Journal on Information Security.” All articles are open access articles distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Editor-in-Chief
Mauro Barni, University of Siena, Siena, Italy
Associate Editors
Jeffrey A. Bloom, USA; D. Kirovski, USA; Hans Georg Schaathun, UK; G. Doërr, UK; Deepa Kundur, USA; Martin Steinebach, Germany; Jean-Luc Dugelay, France; E. Magli, Italy; Q. Sun, Singapore; T. Furon, France; Kivanc Mihcak, Turkey; W. Trappe, USA; Miroslav Goljan, USA; Lawrence O’Gorman, USA; C. Vielhauer, Germany; S. Katzenbeisser, The Netherlands; Fernando Pérez-González, Spain; S. Voloshynovskiy, Switzerland; Hyoung Joong Kim, Korea; A. Piva, Italy; Andreas Westfeld, Germany
Contents
Signal Processing in the Encrypted Domain, Alessandro Piva and Stefan Katzenbeisser Volume 2007, Article ID 82790, 1 page
A Survey of Homomorphic Encryption for Nonspecialists, Caroline Fontaine and Fabien Galand Volume 2007, Article ID 13801, 10 pages
Secure Multiparty Computation between Distrusted Networks Terminals, S.-C. S. Cheung and Thinh Nguyen Volume 2007, Article ID 51368, 10 pages
Protection and Retrieval of Encrypted Multimedia Content: When Cryptography Meets Signal Processing, Zekeriya Erkin, Alessandro Piva, Stefan Katzenbeisser, R. L. Lagendijk, Jamshid Shokrollahi, Gregory Neven, and Mauro Barni Volume 2007, Article ID 78943, 20 pages
Oblivious Neural Network Computing via Homomorphic Encryption, C. Orlandi, A. Piva, and M. Barni Volume 2007, Article ID 37343, 11 pages
Efficient Zero-Knowledge Watermark Detection with Improved Robustness to Sensitivity Attacks, Juan Ramón Troncoso-Pastoriza and Fernando Pérez-González Volume 2007, Article ID 45731, 14 pages
Anonymous Fingerprinting with Robust QIM Watermarking Techniques, J. P. Prins, Z. Erkin, and R. L. Lagendijk Volume 2007, Article ID 31340, 13 pages
Transmission Error and Compression Robustness of 2D Chaotic Map Image Encryption Schemes, Michael Gschwandtner, Andreas Uhl, and Peter Wild Volume 2007, Article ID 48179, 16 pages
Hindawi Publishing Corporation EURASIP Journal on Information Security Volume 2007, Article ID 82790, 1 page doi:10.1155/2007/82790
Editorial Signal Processing in the Encrypted Domain
Alessandro Piva1 and Stefan Katzenbeisser2
1 Department of Electronics and Telecommunications, University of Florence, Via S. Marta 3, 50139 Firenze, Italy
2 Information & System Security Group, Philips Research Europe, High Tech Campus 34 MS 61, 5656 AE Eindhoven, The Netherlands
Correspondence should be addressed to Alessandro Piva, [email protected]fi.it
Received 31 December 2007; Accepted 31 December 2007
Copyright © 2007 A. Piva and S. Katzenbeisser. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Recent advances in digital signal processing enabled a number of new services in various application domains, ranging from enhanced multimedia content production and distribution, to advanced healthcare systems for continuous health monitoring. At the heart of these services lies the ability to securely manipulate “valuable” digital signals in order to satisfy security requirements such as intellectual property management, authenticity, privacy, and access control. Currently available technological solutions for “secure manipulation of signals” apply cryptographic primitives by building a secure layer on top of existing signal processing modules, able to protect them from leakage of critical information, assuming that the involved parties or devices trust each other. This implies that the cryptographic layer is used only to protect the data against access through unauthorized third parties or to provide authenticity. However, this is often not enough to ensure the security of the application, since the owner of the data may not trust the processing devices, or those actors that are required to manipulate them.

It is clear that the availability of signal processing algorithms that work directly on encrypted signals would be of great help for application scenarios where signals must be produced, processed, or exchanged securely.

Whereas the development of tools capable of processing encrypted signals may seem a formidable task, some recent, still scattered, studies, spanning from secure embedding and detection of digital watermarks and secure content distribution to compression of encrypted data and access to encrypted databases, have shown that performing signal processing operations in encrypted content is indeed possible.

We are delighted to present the first issue of a journal, entirely devoted to signal processing in the encrypted domain. The issue contains both survey papers allowing the reader to become acquainted with this exciting field, and research papers discussing the latest developments.

The first part of the special issue contains three survey papers: Fontaine and Galand give an overview of homomorphic encryption, which is one of the key tools for signal processing in the encrypted domain, in their paper “A survey of homomorphic encryption for nonspecialists.” An introduction to the field of secure multiparty computation is provided by the paper “Secure multiparty computation between distrusted networks terminals” by Cheung and Nguyen. Finally, research in the area of signal processing under encryption is surveyed in the paper “Protection and retrieval of encrypted multimedia content: when cryptography meets signal processing” by Erkin et al.

The second part of the special issue contains four research papers. Orlandi et al. introduce the notion of oblivious computing with neural networks in the paper “Oblivious neural network computing via homomorphic encryption.” Troncoso-Pastoriza and Pérez-González present new protocols for zero-knowledge watermark detection in their paper “Efficient zero-knowledge watermark detection with improved robustness to sensitivity attacks.” Prins et al. show in their paper “Anonymous fingerprinting with robust QIM watermarking techniques” how advanced quantization-index-modulation watermarking schemes can be used in conjunction with buyer-seller watermarking protocols. Finally, Gschwandtner et al. explore properties of specialized image encryption schemes in their paper “Transmission error and compression robustness of 2D chaotic map image encryption schemes.”

Finally, we would like to thank all the authors, as well as all reviewers, for their contribution to this issue. We hope that the readers will enjoy this special issue and that it encourages more colleagues to devote time to this novel and exciting field of research.

Alessandro Piva
Stefan Katzenbeisser

Hindawi Publishing Corporation EURASIP Journal on Information Security Volume 2007, Article ID 13801, 10 pages doi:10.1155/2007/13801
Review Article A Survey of Homomorphic Encryption for Nonspecialists
Caroline Fontaine and Fabien Galand
CNRS/IRISA-TEMICS, Campus de Beaulieu, 35042 Rennes Cedex, France
Correspondence should be addressed to Caroline Fontaine, [email protected]
Received 30 March 2007; Revised 10 July 2007; Accepted 24 October 2007
Recommended by Stefan Katzenbeisser
Processing encrypted signals requires special properties of the underlying encryption scheme. A possible choice is the use of homomorphic encryption. In this paper, we propose a selection of the most important available solutions, discussing their properties and limitations.
Copyright © 2007 C. Fontaine and F. Galand. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
1. INTRODUCTION

The goal of encryption is to ensure confidentiality of data in communication and storage processes. Recently, its use in constrained devices has led to the consideration of additional features, such as the ability to delegate computations to untrusted computers. For this purpose, we would like to give the untrusted computer only an encrypted version of the data to process. The computer will perform the computation on this encrypted data, hence without knowing anything about its real value. Finally, it will send back the result, and we will decrypt it. For coherence, the decrypted result has to be equal to the value that would have been computed on the original data. For this reason, the encryption scheme has to present a particular structure. Rivest et al. proposed in 1978 to solve this issue through homomorphic encryption [1]. Unfortunately, Brickell and Yacobi pointed out in [2] some security flaws in the first proposals of Rivest et al. Since this first attempt, a lot of articles have proposed solutions dedicated to numerous application contexts: secret sharing schemes, threshold schemes (see, e.g., [3]), zero-knowledge proofs (see, e.g., [4]), oblivious transfer (see, e.g., [5]), commitment schemes (see, e.g., [3]), anonymity, privacy, electronic voting, electronic auctions, lottery protocols (see, e.g., [6]), protection of mobile agents (see, e.g., [7]), multiparty computation (see, e.g., [3]), mix-nets (see, e.g., [8, 9]), watermarking or fingerprinting protocols (see, e.g., [10–14]), and so forth.

The goal of this article is to provide nonspecialists with a survey of homomorphic encryption techniques. Section 2 recalls some basic concepts of cryptography and presents homomorphic encryption; it is particularly aimed at noncryptographers, providing guidelines about the main characteristics of encryption primitives: algorithms, performance, security. Section 3 provides a survey of homomorphic encryption schemes published so far, and analyses their characteristics.

Most schemes we describe are based on mathematical notions the reader may not be familiar with. Where these notions can easily be introduced, we present them briefly. The reader may refer to [15] for more information concerning those we could not introduce properly, or algorithmic problems related to their computation.

Before going deeper in the subject, let us introduce some notation. The integer $\ell(x)$ denotes the number of bits constituting the binary expansion of $x$. As usual, $\mathbb{Z}_n$ will denote the set of integers modulo $n$, and $\mathbb{Z}_n^*$ the set of its invertible elements.

2. TOWARDS HOMOMORPHIC ENCRYPTION

2.1. Basics about encryption

In this section, we will recall some important concepts concerning encryption schemes. For more precise information, the reader may refer to [16] or the more recent [17].

Encryption schemes are, first and foremost, designed to preserve confidentiality. According to Kerckhoffs' principle (see [18, 19] for the original papers, or any book on cryptography), their security must not rely on the obfuscation of their code, but only on the secrecy of the decryption key. We can distinguish two kinds of encryption schemes: symmetric and asymmetric ones. We will present them briefly and discuss their performance and security issues.

Symmetric encryption schemes

Here “symmetric” means that encryption and decryption are performed with the same key. Hence, the sender and the receiver have to agree on the key they will use before performing any secure communication. Therefore, it is not possible for two people who never met to use such schemes directly. This also implies sharing a different key with everyone we want to communicate with. Nevertheless, symmetric schemes present the advantage of being really fast and are used as often as possible. In this category, we can distinguish block ciphers (AES [20, 21])$^1$ and stream ciphers (One-time pad presented in Figure 1 [22], Snow 2.0 [23]),$^2$ which are even faster.

Asymmetric encryption schemes

In contrast to the previous family, asymmetric schemes introduce a fundamental difference between the abilities to encrypt and to decrypt. The encryption key is public, while the decryption key remains private. When Bob wants to send an encrypted message to Alice, he uses her public key to encrypt the message. Alice will then use her private key to decrypt it. Such schemes are more functional than symmetric ones since there is no need for the sender and the receiver to agree on anything before the transaction. Moreover, they often provide more features. These schemes, however, have a big drawback: they are based on nontrivial mathematical computations, and are much slower than the symmetric ones. The two most prominent examples, RSA [24] and ElGamal [25], are presented in Figures 2 and 3.

Performance issues

A block cipher like AES is typically 100 times faster than RSA encryption and 2000 times faster than RSA decryption, with about 60 MB per second on a modest platform. Stream ciphers are even faster, some of them being able to encrypt/decrypt 100 MB per second or more.$^3$ Thus, while encryption or decryption of the whole content of a DVD will take about a minute with a fast stream cipher, it is simply not realistic to use an asymmetric cipher in practice for such a huge amount of data as it would require hours, or even days, to encrypt or decrypt.

Hence, in practice, it is usual to encrypt the data we want to transmit with an efficient symmetric cipher. To provide the receiver with the secret key needed to recover the data, the sender encrypts this key with an asymmetric cipher. Hence, the asymmetric cipher is used to encrypt only a short piece of data, while the symmetric one is used for the longer one. The sender and the receiver do not need to share anything before performing the encryption/decryption as the symmetric key is transmitted with the help of the public key of the receiver. Proceeding this way, we combine the advantages of both: efficiency of symmetric schemes and functionalities of the asymmetric schemes.

Security issues

Security of encryption schemes was formalized for the first time by Shannon [26]. In his seminal paper, Shannon introduced the notion of perfect secrecy/unconditional security, which characterizes encryption schemes for which the knowledge of a ciphertext does not give any information either about the corresponding plaintext or about the key. He proved that the one-time pad is perfectly secure under some conditions, as explained in Figure 1. In fact, no other scheme, neither symmetric nor asymmetric, has been proved unconditionally secure. Hence, if we omit the one-time pad, any encryption scheme's security is evaluated with regard to the computational power of the opponent. In the case of asymmetric schemes, we can rely on their mathematical structure to estimate their security level in a formal way. They are based on some well-identified mathematical problems which are hard to solve in general, but easy to solve for the one who knows the trapdoor, that is, the owner of the keys. Hence, it is easy for the owner of the keys to compute his/her private key, but no one else should be able to do so, as the knowledge of the public key should not endanger the private key. Through reductions, we can compare the security level of these schemes with the difficulty of solving these mathematical problems (factorizing large integers or computing a discrete logarithm in a large group) which are famous for their hardness. Proceeding this way, we obtain an estimate of the security level, which sometimes turns out to be optimistic. This estimation may not be sufficient for several reasons. First, there may be other ways to break the system than solving the reference mathematical problem [27, 28]. Second, most security proofs are performed in an idealized model called the random oracle model, in which the primitives involved, for example, hash functions, are considered truly random. This model has allowed the study of the security level for numerous asymmetric ciphers. Recent works show that we are now able to perform proofs in a more realistic model called the standard model. From [29] to [30], a lot of papers compared these two models, discussing the gap between them. In parallel with this formal estimation of the security level, an empirical one is performed in any case, and new symmetric and asymmetric schemes are evaluated according to published attacks.

The framework of a security evaluation was stated by Shannon in 1949 [26]: all the considered messages are encrypted with the same key (and hence for the same recipient), and the opponent's challenge is to take advantage of all his observations to disclose the secret/private key involved.

$^1$ AES has been standardized; see http://csrc.nist.gov/groups/ST/toolkit/block ciphers.html for more details.
$^2$ Snow 2.0 is included in the draft of Norm ISO/IEC 18033-4, http://www.iso.org/iso/en/CatalogueDetailPage.CatalogueDetail?CSNUMBER=3997.
$^3$ See, for example, http://www.ecrypt.eu.org/stream/perf/alpha/benchmarks/snow-2.0 for some benchmark of Snow 2.0, or openssl for AES and RSA.
Usually, to evaluate the attack capacity of the opponent, we distinguish among several contexts [31]: ciphertext-only attacks (where the opponent has access only to some ciphertexts), known-plaintext attacks (where the opponent has access to some pairs of corresponding plaintext-ciphertexts), chosen-plaintext attacks (same as previous, but the opponent can choose the plaintexts and get the corresponding ciphertexts), and chosen-ciphertext attacks (the opponent has access to a decryption oracle, behaving as a black-box, that takes a ciphertext and outputs the corresponding plaintext). The first context is the most frequent in real life, and results from eavesdropping the communication channel; it is the worst case for the opponent. The other cases may seem difficult to achieve, and may arise when the opponent has a more powerful position; he may, for example, have stolen some plaintexts, or an encryption engine. The “chosen” ones exist in adaptive versions, where the opponent can wait for a computation result before choosing the next input.

How do we choose the right scheme?

The right scheme is the one that fits your constraints in the best way. By constraints, we may understand constraints in time, memory, security, and so forth. The first two criteria are very important in highly constrained architectures, often encountered in very small devices (PDAs, smart cards, RFID tags, etc.). They are also important if we process a huge amount of data, or numerous data at the same time, for example, video streams. Some schemes such as AES or RSA are usually chosen because of their reputation, but it is important to note that new schemes are proposed each year. Indeed, it is necessary to keep a diversity in the proposals. First, it is necessary in order to be able to face new kinds of requirements. Second, for security reasons, having all the schemes rely on the same structure may lead to a disaster in case an attack breaks this structure. Hence, huge international projects have been funded to ask for new proposals, with a fair evaluation to check their advantages and drawbacks, for example, RIPE, NESSIE,$^4$ and NIST's call for the design of the AES,$^5$ CRYPTREC,$^6$ ECRYPT,$^7$ and so forth.

2.2. Probabilistic encryption

The most well-known cryptosystems are deterministic: for a fixed encryption key, a given plaintext will always be encrypted into the same ciphertext. This may lead to some drawbacks. RSA is a good example to illustrate this point:
(i) particular plaintexts may be encrypted in an overly structured way: with RSA, messages 0 and 1 are always encrypted as 0 and 1, respectively;
(ii) it may be easy to compute partial information about the plaintext: with RSA, the ciphertext $c$ leaks one bit of information about the plaintext $m$, namely, the so-called Jacobi symbol;
(iii) when using a deterministic encryption scheme, it is easy to detect when the same message is sent twice while processed with the same key.

So, in practice, we prefer encryption schemes to be probabilistic. In the case of symmetric schemes, we introduce a random vector in the encryption process (e.g., in the pseudo-random generator for stream ciphers, or in the operating mode for block ciphers), generally called IV. This vector may be public, and transmitted as it is, without being encrypted, but IV must be changed every time we encrypt a message. In the case of asymmetric ciphers, the security analysis is more mathematical, and we want the randomized schemes to remain analyzable in the same way as the deterministic schemes. Some adequate modes have been proposed to randomize already published deterministic schemes, such as the Optimal Asymmetric Encryption Padding OAEP for RSA (or any scheme based on a trap-door one-way permutation) [33].$^8$ Some new schemes, randomized by nature, have also been proposed [25, 34, 35] (see also Figures 3 and 4).

A simple consequence of this requirement to be probabilistic appears in the so-called expansion: since for a plaintext we require the existence of several possible ciphertexts, the number of ciphertexts is greater than the number of possible plaintexts. This means the ciphertexts cannot be as short as the plaintexts; they have to be strictly longer. The ratio between the length, in bits, of ciphertexts and plaintexts is called the expansion. Of course, this parameter is of practical importance. We will see in the sequel that efficient probabilistic encryption schemes have been proposed with an expansion less than 2 (e.g., Paillier's scheme).

2.3. Homomorphic encryption

We will present in this section the basic definitions related to homomorphic encryption. The state of the art will be given in Section 3.

The most common definition is the following. Let $\mathcal{M}$ (resp., $\mathcal{C}$) denote the set of the plaintexts (resp., ciphertexts). An encryption scheme is said to be homomorphic if for any given encryption key $k$ the encryption function $E$ satisfies

$$\forall m_1, m_2 \in \mathcal{M}, \quad E(m_1 \odot_{\mathcal{M}} m_2) \longleftarrow E(m_1) \odot_{\mathcal{C}} E(m_2) \tag{1}$$

for some operators $\odot_{\mathcal{M}}$ in $\mathcal{M}$ and $\odot_{\mathcal{C}}$ in $\mathcal{C}$, where $\longleftarrow$ means “can be directly computed from,” that is, without any intermediate decryption.

If $(\mathcal{M}, \odot_{\mathcal{M}})$ and $(\mathcal{C}, \odot_{\mathcal{C}})$ are groups, we have a group homomorphism. We say a scheme is additively homomorphic if we consider addition operators, and multiplicatively homomorphic if we consider multiplication operators.

$^4$ See http://www.cryptonessie.org.
$^5$ See http://csrc.nist.gov and http://csrc.nist.gov/CryptoToolkit/aes.
$^6$ See http://www.ipa.go.jp/security/enc/CRYPTREC/index-e.html.
$^7$ See http://www.ecrypt.eu.org.
$^8$ Note that there are a lot of more recent papers proposing variants or improvements of OAEP, but it is not our purpose here.

A lot of such homomorphic schemes have been published that have been widely used in many applications. Note that
Prerequisite: Alice and Bob share a secret random keystream, say a binary one.
Goal: Alice can send an encrypted message to Bob, and Bob can send an encrypted message to Alice.
Principle: To encrypt a message, Alice (resp., Bob) XORs the plaintext and the keystream. To decrypt the received message, Bob (resp., Alice) applies XOR on the ciphertext and the keystream.
Security: This scheme has been shown to be unconditionally secure by Shannon [26] if and only if the keystream is truly random, has the same length as the plaintext, and is used only once. Thus, this scheme is used only for very critical situations for which these constraints may be managed, such as the red phone used by the USA and the USSR [32, pp. 715-716]. What we may use more commonly is a similar scheme, where the keystream is generated by a pseudorandom generator, initialized by the secret key shared by Alice and Bob. A lot of such stream ciphers have been proposed, and their security remains only empirical. Snow 2.0 is one of these.

Figure 1: One-time pad, 1917 (used)/1926 (published [22]). Note that this scheme may be transposed to any group $(G, +)$ other than $(\{0, 1\}, \mathrm{XOR})$, encryption being related to addition of the keystream, while decryption consists in subtracting the keystream.
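The principle of Figure 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names (`xor_bytes`, `add_encrypt`, `sub_decrypt`) are hypothetical, the keystream is drawn from the operating system's random source rather than from a shared physical key, and the second pair of functions illustrates the caption's remark by working in the additive group of integers modulo 256.

```python
import secrets

def xor_bytes(data: bytes, keystream: bytes) -> bytes:
    """XOR data with a keystream of exactly the same length."""
    if len(keystream) != len(data):
        raise ValueError("keystream must be exactly as long as the message")
    return bytes(d ^ k for d, k in zip(data, keystream))

def add_encrypt(data: bytes, keystream: bytes) -> bytes:
    """Same idea in the group (Z_256, +): add the keystream byte by byte."""
    return bytes((d + k) % 256 for d, k in zip(data, keystream))

def sub_decrypt(data: bytes, keystream: bytes) -> bytes:
    """Inverse of add_encrypt: subtract the keystream byte by byte."""
    return bytes((d - k) % 256 for d, k in zip(data, keystream))

plaintext = b"meet at noon"
# Assumption: both parties already share this keystream; it must never be reused.
keystream = secrets.token_bytes(len(plaintext))

ciphertext = xor_bytes(plaintext, keystream)
recovered = xor_bytes(ciphertext, keystream)  # XOR is its own inverse
```

Because XOR is an involution, the same function serves for encryption and decryption; in a general group the two directions differ, as `add_encrypt` and `sub_decrypt` show.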
Prerequisite: Alice computed a (public, private) key: an integer $n = pq$, where $p$ and $q$ are well chosen large prime numbers, an integer $e$ such that $\gcd(e, \varphi(n)) = 1$, and an integer $d$ which is the inverse of $e$ modulo $\varphi(n)$, that is, $ed \equiv 1 \bmod \varphi(n)$; $\varphi(n)$ denotes the Euler function, $\varphi(n) = \varphi(pq) = (p - 1)(q - 1)$. Alice's public key is $(n, e)$, and her private key is $d$; $p$ and $q$ also have to be kept secret, but are no longer needed to process the data; they were only useful for Alice to compute $d$ from $e$.
Goal: Anyone can send an encrypted message to Alice.
Principle: To send an encrypted version of the message $m$ to Alice, Bob computes $c = m^e \bmod n$. To get back to the plaintext, Alice computes $c^d \bmod n$ which, according to Euler's theorem, is precisely equal to $m$.
Security: It is clear that if an opponent can factor $n$ and recover $p$ and $q$, he will be able to compute $\varphi(n)$, then $d$, and will be able to decrypt Alice's messages. So, the RSA problem (accessing $m$ while given $c$) is weaker than the factorization problem. It is not known whether the two problems are equivalent or not.
Figure 2: RSA, 1978 [24].

in some contexts it may be of great interest to have this property not only for one operator but for two at the same time. Hence, we are also interested in the design of ring/algebraic homomorphisms. Such schemes would satisfy a relation of the form

$$\forall m_1, m_2 \in \mathcal{M}, \quad \begin{aligned} E(m_1 +_{\mathcal{M}} m_2) &\longleftarrow E(m_1) +_{\mathcal{C}} E(m_2), \\ E(m_1 \times_{\mathcal{M}} m_2) &\longleftarrow E(m_1) \times_{\mathcal{C}} E(m_2). \end{aligned} \tag{2}$$

As will be further discussed, no convincing algebraic homomorphic encryption scheme has been found yet, and their design remains an open problem.

Less formally, these definitions mean that, for a fixed key $k$, it is equivalent to perform operations on the plaintexts before encryption, or on the corresponding ciphertexts after encryption. So we require a kind of commutativity between encryption and some data processing operations.

Of course, the schemes we will consider in the following have to be probabilistic ciphers, and we may consider $E$ to behave in a probabilistic way in the above definitions.

2.4. New security considerations

Probabilistic encryption was introduced with a clear purpose: security. This requires properly defining different security levels. Semantic security was introduced in [34], at the same time as probabilistic encryption, in order to define what could be a strong security level, unavailable without probabilistic encryption. Roughly, a probabilistic encryption is semantically secure if the knowledge of a ciphertext does not provide any useful information on the plaintext to some hypothetical adversary having only a reasonably restricted computational power. More formally, for any function $f$ and any plaintext $m$, and with only polynomial resources (that is, with algorithms whose time/space complexities vary as a polynomial function of the size of the inputs), the probability of guessing $f(m)$ (knowing $f$ but not $m$) does not increase if the adversary knows a ciphertext corresponding to $m$. This might be thought of as a kind of perfect secrecy in the case when we only have polynomial resources.

Together with this strong requirement, the notion of polynomial security was defined: the adversary chooses two plaintexts, and we choose secretly at random one plaintext and provide to the adversary a corresponding ciphertext. The adversary, still with polynomial resources, must guess which plaintext we chose. If the best he can do is to achieve a probability $1/2 + \varepsilon$ of success, the encryption is said to be polynomially secure. Polynomial security is now known as the indistinguishability of encryptions following the terminology and definitions of Goldreich [36].

Quite amazingly, Goldwasser and Micali proved the equivalence between polynomial security and semantic security [34]; Goldreich extended these notions [36] preserving the equivalence. With this equivalence, it is easy to state that a deterministic asymmetric encryption scheme cannot be semantically secure since it cannot be indistinguishable: the adversary knows the encryption function, and thus can compute the single ciphertext corresponding to each plaintext.
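The argument that a deterministic asymmetric scheme cannot be indistinguishable can be illustrated with a toy experiment. The sketch below plays the two-plaintext guessing game against textbook RSA as described in Figure 2; the key ($n = 3233 = 61 \times 53$, $e = 17$) is an assumption chosen only for readability and is far too small for any real use, and the function names are hypothetical.

```python
import secrets

# Toy public key (assumption for illustration): n = 61 * 53, e = 17.
N, E = 3233, 17

def rsa_encrypt(m: int) -> int:
    """Textbook (deterministic) RSA encryption: c = m^e mod n."""
    return pow(m, E, N)

def adversary_guess(m0: int, m1: int, challenge: int) -> int:
    """The adversary re-encrypts m0 with the public key and compares."""
    return 0 if rsa_encrypt(m0) == challenge else 1

def play_game(m0: int, m1: int) -> bool:
    """One round: the challenger secretly encrypts m0 or m1 at random."""
    b = secrets.randbelow(2)
    challenge = rsa_encrypt((m0, m1)[b])
    return adversary_guess(m0, m1, challenge) == b

rounds = 200
wins = sum(play_game(42, 1337) for _ in range(rounds))
```

Since encryption is a deterministic permutation of $\mathbb{Z}_n$, the adversary wins every round instead of roughly half of them. The same code also confirms the remark of Section 2.2 that the messages 0 and 1 are encrypted as themselves.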
Prerequisite: Alice generated a (public, private) key: she first chose a large prime integer $p$, a generating element $g$ of the cyclic group $\mathbb{Z}_p^*$, and considered $q = p - 1$, the order of the group; building her public key, she picked at random $a \in \mathbb{Z}_q$ and computed $y_A = g^a$ in $\mathbb{Z}_p^*$, her public key being then $(g, q, y_A)$; her private key is $a$.
Goal: Anyone can send an encrypted message to Alice.
Principle: To send an encrypted version of the message $m$ to Alice, Bob picks at random $k \in \mathbb{Z}_q$, computes $(c_1, c_2) = (g^k, m\,y_A^k)$ in $\mathbb{Z}_p^*$. To get back to the plaintext, Alice computes $c_2 (c_1^a)^{-1}$ in $\mathbb{Z}_p^*$, which is precisely equal to $m$.
Security: The security of this scheme is related to the Diffie-Hellman problem: if we can solve it, then we can break ElGamal encryption. It is not known whether the two problems are equivalent or not. This scheme is IND-CPA.
Figure 3: ElGamal, 1985 [25].
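The scheme of Figure 3 can be sketched as follows. This is a minimal illustration, not a usable implementation: the parameters $p = 23$, $g = 5$ are toy values assumed here for readability, the function names are hypothetical, and `pow(x, -1, p)` requires Python 3.8 or later. The last function shows that multiplying two ciphertexts componentwise yields a ciphertext of the product of the plaintexts.

```python
import secrets

# Toy parameters (assumption for illustration): 5 generates Z_23^*.
P = 23
G = 5
Q = P - 1  # order of the group Z_p^*

def keygen():
    """Return (private key a, public value y_A = g^a mod p)."""
    a = secrets.randbelow(Q)
    return a, pow(G, a, P)

def encrypt(m: int, y: int):
    """Encrypt m in Z_p^* as (c1, c2) = (g^k, m * y^k) with fresh random k."""
    k = secrets.randbelow(Q)
    return pow(G, k, P), (m * pow(y, k, P)) % P

def decrypt(c, a: int) -> int:
    """Recover m = c2 * (c1^a)^(-1) mod p."""
    c1, c2 = c
    return (c2 * pow(pow(c1, a, P), -1, P)) % P

def multiply(ca, cb):
    """Componentwise product of two ciphertexts."""
    return (ca[0] * cb[0]) % P, (ca[1] * cb[1]) % P

a, y = keygen()
product_ct = multiply(encrypt(3, y), encrypt(7, y))
product_pt = decrypt(product_ct, a)  # equals 3 * 7 mod 23
```

The product ciphertext is $(g^{k_1 + k_2}, m_1 m_2\, y_A^{k_1 + k_2})$, which has exactly the form of a fresh encryption of $m_1 m_2$, so decryption needs no modification.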
But with asymmetric encryption schemes, the adversary knows the whole encryption material $E$ involving both the encryption function and the encryption key. Thus, he can compute any pair $(m, E(m))$. Naor and Yung [37] and Rackoff and Simon [38] introduced different abilities, relying on the different contexts we discussed above. From the weakest to the strongest, we have the chosen-plaintext, the nonadaptive chosen-ciphertext, and the adaptive chosen-ciphertext abilities. This leads to the IND-CPA, IND-CCA1, and IND-CCA2 notions in the literature. IND stands for indistinguishability whereas CPA and CCA are acronyms for chosen-plaintext attack and chosen-ciphertext attack. Finally, CCA1 refers to nonadaptive attacks, and CCA2 to adaptive ones. Considering the previous remarks on the ability for anyone to encrypt while using asymmetric schemes, the adversary always has the chosen-plaintext ability.

Another security requirement termed nonmalleability has also been introduced to complete the analysis. Given a ciphertext $c = E(m)$, it should be hard for an opponent to produce a ciphertext $c'$ such that the corresponding plaintext $m'$, which is not necessarily known to the opponent, has some known relation with $m$. This notion was formalized differently by Dolev et al. [39, 40], and by Bellare et al. [41], both approaches being proved equivalent by Bellare and Sahai [42].

We will not detail the relations between all these different notions and the interested reader can refer to [41–43] for a comprehensive treatment. Basically, the adaptive chosen-ciphertext indistinguishability IND-CCA2 is the strongest requirement for an encryption; in particular, it implies nonmalleability.

It should be emphasized that a homomorphic encryption cannot have the nonmalleability property. With the notation of Section 2.3, knowing $c$, we can compute $c' = c \odot_{\mathcal{C}} c$ and deduce, by the homomorphic property, that $c'$ is a ciphertext of $m' = m \odot_{\mathcal{M}} m$. According to the previous remark on adaptive chosen-ciphertext indistinguishability, a homomorphic encryption has no access to the strongest security requirement. The highest security level it can reach is IND-CPA.

To conclude this section on security, and for the sake of completeness, we point out some security considerations about deterministic homomorphic encryption. First, it was proved that a deterministic homomorphic encryption for which the operation is a simple addition is insecure [44]. Second, Boneh and Lipton showed in 1996 that any deterministic algebraically homomorphic cryptosystem can be broken in subexponential time [45]. Note that this last point does not mean that deterministic algebraically homomorphic cryptosystems are insecure, but that one can find the plaintext from a ciphertext in a subexponential time (which is still too long to be practicable). For example, we know that the security of RSA encryption depends on factorization algorithms, and subexponential factorization algorithms are known. Nevertheless, RSA is still considered strong enough.

3. HOMOMORPHIC ENCRYPTION: STATE OF THE ART

First of all, let us recall that both RSA and ElGamal encryption schemes are multiplicatively homomorphic. The problem is that the original RSA, being deterministic, cannot achieve a security level of IND-CPA (which is the highest security level for homomorphic schemes, see Section 2.4). Furthermore its probabilistic variants, obtained through OAEP/OAEP+, are no longer homomorphic. In contrast to RSA, ElGamal offers the best security level for a homomorphic encryption scheme, as it has been shown to be IND-CPA. Moreover, it is interesting to notice that an additively homomorphic variant of ElGamal has also been proposed [48]. Comparing it with the original ElGamal, this variant also involves an element $G$ ($G$ may be equal to $g$) that generates $(\mathbb{Z}_q, +)$ with respect to the addition operation. To send an encrypted version of the message $m$ to Alice, Bob picks at random $k \in \mathbb{Z}_q$ and computes $(c_1, c_2) = (g^k, G^m y_A^k)$. To get back the plaintext, Alice computes $c_2 (c_1^a)^{-1}$, which is equal to $G^m$; then, she has to compute $m$ in a second step. Note that this last decryption step is hard to achieve and that there is no other choice for Alice than to use brute force search to get back $m$ from $G^m$. It is also well known that ElGamal's construction works for any family of groups for which the discrete logarithm problem is considered intractable. For example, it may be derived in the setup employing elliptic curves. Hence, ElGamal and its variants are known to be really interesting candidates for realistic homomorphic encryption schemes.

We will now describe another important family of homomorphic encryption schemes, ranging from the first probabilistic system$^9$ proposed by Goldwasser and Micali in 1982

$^9$ To be more precise, the first published probabilistic public-key encryption scheme is due to McEliece [49], and the first to add the homomorphic property is due to Goldwasser-Micali.
Prerequisite: Alice computed a (public, private) key: she first chose $n = pq$, $p$ and $q$ being large prime numbers, and $g$ a quadratic nonresidue modulo $n$ whose Jacobi symbol is 1; her public key is composed of $n$ and $g$, and her private key is the factorization of $n$.
Goal: Anyone can send an encrypted message to Alice.
Principle: To encrypt a bit $b$, Bob picks at random an integer $r \in \mathbb{Z}_n^*$, and computes $c = g^b r^2 \bmod n$ (remark that $c$ is a quadratic residue if and only if $b = 0$). To get back to the plaintext, Alice determines if $c$ is a quadratic residue or not. To do so, she uses the property that the Jacobi symbol $(c/p)$ is equal to $(-1)^b$. Please note that the scheme encrypts 1 bit of information, while its output is usually 1024 bits long!
Security: This scheme is the first one that was proved semantically secure against a passive adversary (under computational assumption).
Figure 4: Goldwasser-Micali—1982 [34, 46].
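As a rough illustration of the principle in Figure 4, the following Python sketch implements key generation, encryption, and decryption with toy parameters. Everything named here is an assumption made for the example, not part of the original paper: the function names are hypothetical, the primes 499 and 547 are far too small to be secure, and the residuosity test is carried out with Euler's criterion modulo $p$, which is one standard way of evaluating the symbol $(c/p)$.

```python
import math
import secrets


def legendre(a, p):
    """Symbol (a/p) for an odd prime p, computed with Euler's criterion."""
    s = pow(a, (p - 1) // 2, p)
    return -1 if s == p - 1 else s


def gm_keygen(p, q):
    """Toy key generation: returns ((n, g), (p, q)). p and q must be odd primes."""
    n = p * q
    # g must be a nonresidue modulo both p and q, so that it is a nonresidue
    # modulo n while its Jacobi symbol modulo n equals (-1) * (-1) = 1.
    g = next(x for x in range(2, n)
             if legendre(x, p) == -1 and legendre(x, q) == -1)
    return (n, g), (p, q)


def gm_encrypt(pub, b):
    """Encrypt one bit b as c = g^b * r^2 mod n for a random invertible r."""
    n, g = pub
    while True:
        r = secrets.randbelow(n - 2) + 2
        if math.gcd(r, n) == 1:
            break
    return (pow(g, b, n) * r * r) % n


def gm_decrypt(priv, c):
    """c is a quadratic residue (bit 0) exactly when (c/p) = 1."""
    p, _ = priv
    return 0 if legendre(c, p) == 1 else 1


# Illustrative usage with insecure toy primes (assumption of this sketch).
pub, priv = gm_keygen(499, 547)
c0, c1 = gm_encrypt(pub, 1), gm_encrypt(pub, 1)
# Multiplying ciphertexts adds the exponents of g, so the product decrypts
# to the XOR of the two plaintext bits.
xor_bit = gm_decrypt(priv, (c0 * c1) % pub[0])
```

The sketch also makes the expansion visible: each ciphertext is a full integer modulo $n$, although it carries a single bit.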
Prerequisite: Alice computed a (public, private) key: she first chose an integer $n = pq$, $p$ and $q$ being two large prime numbers and $n$ satisfying $\gcd(n, \varphi(n)) = 1$, and considered the group $G = \mathbb{Z}_{n^2}^*$ of order $k$. She also considered $g \in G$ of order $n$. Her public key is composed of $n$ and $g$, and her private key consists of the factors of $n$.
Goal: Anyone can send a message to Alice.
Principle: To encrypt a message $m \in \mathbb{Z}_n$, Bob picks at random an integer $r \in \mathbb{Z}_n^*$, and computes $c = g^m r^n \bmod n^2$. To get back to the plaintext, Alice computes the discrete logarithm of $c^{\lambda(n)} \bmod n^2$, obtaining $m\lambda(n) \in \mathbb{Z}_n$, where $\lambda(n)$ denotes the Carmichael function. Now, since $\gcd(\lambda(n), n) = 1$, Alice easily computes $\lambda(n)^{-1} \bmod n$ and gets $m$.
Security: This scheme is IND-CPA.
Figure 5: Paillier—1999 [47].
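The steps of Figure 5 can be sketched in Python as follows. This is only a minimal illustration under stated assumptions: the function names are hypothetical, the primes are toy values with no security, and the generator is fixed to $g = 1 + n$, a common choice of an element of order $n$ that is not imposed by the figure. With this choice, $(1+n)^x \equiv 1 + xn \pmod{n^2}$, so the discrete logarithm of $c^{\lambda(n)}$ reduces to an integer division. The modular inverse uses the three-argument `pow` with a negative exponent, which requires Python 3.8 or later.

```python
import math
import secrets


def paillier_keygen(p, q):
    """Toy key generation: returns ((n, g), lambda(n)). p, q are distinct primes."""
    n = p * q
    assert math.gcd(n, (p - 1) * (q - 1)) == 1
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # Carmichael function of n
    g = n + 1  # assumption of this sketch: an element of order n in Z*_{n^2}
    return (n, g), lam


def paillier_encrypt(pub, m):
    """c = g^m * r^n mod n^2 for a random r invertible modulo n."""
    n, g = pub
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2


def paillier_decrypt(pub, lam, c):
    """Recover m from c^lambda = 1 + (m * lambda) * n mod n^2 (valid for g = 1 + n)."""
    n, _ = pub
    u = pow(c, lam, n * n)
    m_lam = (u - 1) // n                      # equals m * lambda(n) mod n
    return (m_lam * pow(lam, -1, n)) % n      # multiply by lambda(n)^{-1} mod n


# Illustrative usage with insecure toy primes (assumption of this sketch).
pub, lam = paillier_keygen(499, 547)
n2 = pub[0] ** 2
c1 = paillier_encrypt(pub, 1234)
c2 = paillier_encrypt(pub, 4321)
# Additive homomorphism: the product of ciphertexts decrypts to the sum mod n.
total = paillier_decrypt(pub, lam, (c1 * c2) % n2)
```

A ciphertext lives modulo $n^2$ while the plaintext lives modulo $n$, which is the expansion of 2 discussed in the text.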
[34, 46] (described in Figure 4), to the famous Paillier encryption scheme [47] (described in Figure 5) and its improvements. Paillier's scheme and its variants are famous for their efficiency, but also because, like ElGamal, they achieve the highest security level for homomorphic encryption schemes. We will not discuss their mathematical considerations in detail, but will summarize their important parameters and properties.
(i) We begin with the rather simple scheme of Goldwasser-Micali [34, 46]. Besides some historical importance, this scheme had an important impact on later proposals. Several other schemes, which will be presented below, were obtained as generalizations of this one. For these reasons, we provide a detailed description in Figure 4. Here, as for RSA, we use computations modulo $n = pq$, a product of two large primes. Encryption is simple, with a product and a square, whereas decryption is heavier, with an exponentiation. Nevertheless, this step can be done in $O(\ell(p)^2)$. Unfortunately, this scheme presents a strong drawback since its input consists of a single bit. First, this implies that encrypting $k$ bits leads to a cost of $O(k \cdot \ell(p)^2)$. This is not very efficient even if it is considered as practical. The second consequence concerns the expansion: a single bit of plaintext is encrypted in an integer modulo $n$, that is, $\ell(n)$ bits. Thus, the expansion is really huge. This is the main drawback of this scheme.
Before continuing our review, let us present the Goldwasser-Micali (GM) scheme from another point of view. This is required to understand how it has been generalized. The basic principle of GM is to partition a well-chosen subset of integers modulo $n$ into two secret parts: $M_0$ and $M_1$. Then, encryption selects a random element of $M_b$ to encrypt $b$, and decryption reveals in which part the randomly selected element lies. The core point lies in the way to choose the subset, and to partition it into $M_0$ and $M_1$. GM uses group theory to achieve the following: the subset is the group $G$ of invertible integers modulo $n$ with a Jacobi symbol, with respect to $n$, equal to 1. The partition is generated by another group $H \subset G$, composed of the elements that are invertible modulo $n$ with a Jacobi symbol, with respect to a fixed factor of $n$, equal to 1; with these settings, it is possible to split $G$ into two parts: $H$ and $G \setminus H$.
The generalizations of Goldwasser-Micali play with these two groups; they try to find two groups $G$ and $H$ such that $G$ can be split into more than $k = 2$ parts.
(ii) Benaloh [50] is a generalization of GM that enables one to manage inputs of $\ell(k)$ bits, $k$ being a prime satisfying some particular constraints. Encryption is similar to that of the previous scheme (encrypting a message $m \in \{0, \ldots, k-1\}$ means picking an integer $r \in \mathbb{Z}_n^*$ and computing $c = g^m r^k \bmod n$) but decryption is more complex. The input and output sizes being, respectively, $\ell(k)$ and $\ell(n)$ bits, the expansion is equal to $\ell(n)/\ell(k)$. This is better than in the GM case. Moreover, the encryption cost is not too high. Nevertheless, the decryption cost is estimated to be $O(\sqrt{k}\,\ell(k))$ for precomputation, and the same for each dynamical decryption. This implies that $k$ has to be taken quite small, which limits the gain obtained on the expansion.
(iii) Naccache-Stern [51] is an improvement of Benaloh's scheme. Considering a parameter $k$ that can be greater than before, it leads to a smaller expansion. Note that the constraints on $k$ are slightly different. The encryption step is precisely the same as in Benaloh's scheme, but the decryption is different. To summarize, the expansion is still equal to $\ell(n)/\ell(k)$, but the decryption cost is lower: $O(\ell(n)^5 \log(\ell(n)))$, and the authors claim it is reasonable to choose the parameters so as to get an expansion equal to 4.
(iv) In order to improve previous schemes, Okamoto and Uchiyama decided to change the base group $G$ [52]. Considering $n = p^2 q$, $p$ and $q$ still being two large primes, and the group $G = \mathbb{Z}_{p^2}^*$, they achieve $k = p$. Thus, the expansion is equal to 3. As Paillier's scheme is an improvement of this one and will be fully described below, we will not discuss its description in detail. Its advantage lies in the proof that its security is equivalent to the factorization of $n$. Unfortunately, a chosen-ciphertext attack has been proposed leading to this factorization. This scheme was used to design the EPOC systems [53], currently submitted for the supplement P1363a to the IEEE Standard Specifications for Public-Key Cryptography (IEEE P1363). Note that earlier versions of EPOC were subject to security flaws as pointed out in [54], due to a bad use of the scheme.
(v) One of the most well-known homomorphic encryption schemes is due to Paillier [47], and is described in Figure 5. It is an improvement of the previous one that decreases the expansion from 3 to 2. Paillier came back to $n = pq$, with $\gcd(n, \varphi(n)) = 1$, but considered the group $G = \mathbb{Z}_{n^2}^*$, and a proper choice of $H$ led him to $k = \ell(n)$. The encryption cost is not too high. Decryption needs one exponentiation modulo $n^2$ to the power $\lambda(n)$, and a multiplication modulo $n$. Paillier showed in his paper how to manage decryption efficiently through the Chinese Remainder Theorem. With smaller expansion and lower cost compared with the previous ones, this scheme is really attractive. In 2002, Cramer and Shoup proposed a general approach to gain security against adaptive chosen-ciphertext attacks for certain cryptosystems with some particular algebraic properties [55]. Applying it to Paillier's original scheme, they proposed a stronger variant. Bresson et al. proposed in [56] a slightly different version that may be more accurate for some applications.
(vi) Damgård and Jurik proposed in [57] a generalization of Paillier's scheme to groups of the form $\mathbb{Z}_{n^{s+1}}^*$ with $s > 0$. The larger $s$ is, the smaller the expansion is. Moreover, this scheme leads to a lot of applications. For example, we can mention the adaptation of the size of the plaintexts, the use of threshold cryptography, electronic voting, and so forth. To encrypt a message $m \in \mathbb{Z}_n$, one picks $r \in \mathbb{Z}_n^*$ at random and computes $g^m r^{n^s} \in \mathbb{Z}_{n^{s+1}}$. The authors show that if one can break the scheme for a given value $s = \sigma$, then one can break it for $s = \sigma - 1$. They also show that the semantic security of this scheme is equivalent to that of Paillier. To summarize, the expansion is $1 + 1/s$, and hence can be close to 1 if $s$ is sufficiently large. The ratio of the encryption cost of this scheme over Paillier's can be estimated to be $\frac{1}{6}s(s+1)(s+2)$. The same ratio for the decryption step equals $\frac{1}{6}(s+1)(s+2)$. Note that even if this scheme is better than Paillier's according to its lower expansion, it remains more costly. Moreover, if we want to encrypt or decrypt $k$ blocks of $\ell(n)$ bits, running Paillier's scheme $k$ times is less costly than running Damgård-Jurik's scheme once.
(vii) Galbraith proposed in [58] an adaptation of the previous scheme in the context of elliptic curves. Its expansion is equal to 3. The ratio of the encryption (resp., decryption) cost of this scheme in the case $s = 1$ over Paillier's can be estimated to be about 7 (resp., 14). But, in contrast to the previous scheme, the larger $s$ is, the more the cost may decrease. Moreover, as in the case of Damgård-Jurik's scheme, the higher $s$ is, the stronger the scheme is.
(viii) Castagnos explored in [59, 60]¹⁰ another improvement direction considering quadratic fields quotients. We have the same kind of structure regarding $n^{s+1}$ as before, but in another context. To summarize, the expansion is 3 and the ratio of the encryption/decryption cost of this scheme in the case $s = 1$ over Paillier's can be estimated to be about 2 (plus 2 computations of Legendre symbols for the decryption step).
¹⁰ This scheme is mentioned in the conclusion of [59], and more deeply presented in [60], unfortunately in French.
(x) To close the survey of this family of schemes, let us mention the ElGamal-Paillier amalgam, which merges Paillier and the additively homomorphic variant of ElGamal. More precisely, it is based on Damgård-Jurik's (presented above) and Cramer-Shoup's [55] analyses and variants of Paillier's scheme, and was proposed by [9]. The goal was to gain the advantages of both schemes while minimizing their drawbacks. Preserving the notation of both ElGamal and Paillier schemes, we will describe the encryption in the particular case $s = 1$, which reduces Damgård-Jurik's variant to the original Paillier. To encrypt a message $m \in \mathbb{Z}_n$, Bob picks at random an integer $k$, and computes $(c_1, c_2) = (g^k \bmod n,\ (1+n)^m (y_A^k \bmod n)^n \bmod n^2)$.
Now that we have reviewed the two most famous families of homomorphic encryption schemes, we would like to mention a few research directions and challenges.
First, as we mentioned in Section 2.1, it is important to have different kinds of schemes, because of applications and security purposes. One direction to design homomorphic schemes that are not directly related to the same mathematical problems as ElGamal or Paillier (and variants) is to consider the recent papers dealing with the Weil pairing. As this new direction is more and more promising in the design of asymmetric schemes, the investigation in the particular case of homomorphic ciphers is of interest. ElGamal may not be directly used in the Weil pairing setup as the mathematical problem it is based on becomes easy to manage. One more promising direction is the use of the pairing-based scheme proposed by Boneh and Franklin [61] to obtain a secure homomorphic ID-based scheme (see directions in [62] for the ability of such schemes to provide interesting new features).
A second interesting research direction lies in the area of symmetric encryption. As all the homomorphic encryption schemes we mentioned so far are asymmetric, they are not as fast as symmetric ones could be. But homomorphy is easier to manage when mathematical operators are involved in the encryption process, which is not usually the case in symmetric schemes. Very few symmetric homomorphic schemes have been proposed, most of them being broken ([63] broken in [64, 65], [66] broken in [67]). Nevertheless, it may be of interest to consider a simple generalization of the one-time pad, where bits are replaced by integers modulo $n$, as introduced by [68]. In terms of security, it has exactly the same properties as the one-time pad, that is, perfect secrecy if and only if the keystream is truly random, of the same length as the plaintext, and is used only once. Here again, this is overwhelming and the keystream could be generated by a well-chosen pseudorandom generator (e.g., Snow 2.0), decreasing security from unconditional to computational. Note that this scheme's homomorphy is a little bit fuzzy, as we have for any pair of encryption keys $(k_1, k_2)$
$$\forall m_1, m_2 \in \mathcal{M}, \quad E_{k_1 + k_2}(m_1 + m_2) \longleftarrow E_{k_1}(m_1) + E_{k_2}(m_2). \tag{3}$$
This is the only example of a symmetric homomorphic encryption scheme that has not been cracked.
As for algebraic homomorphy, designing algebraically homomorphic encryption schemes is a real challenge today. Only a few have been proposed: by Fellows and Koblitz [69] (which cannot be considered as secure nor efficient [70]), by Domingo-Ferrer [63, 66] (which has been broken [64, 65, 67]), and construction studies of Rappe et al. [3]. No satisfactory solution has been proposed so far, and, as Boneh and Lipton conjectured that any algebraically homomorphic encryption would prove to be insecure [45], the question of their existence and design is still open.
4. CONCLUSION
We presented in this paper a state of the art on homomorphic encryption schemes, discussing their parameters, performances, and security issues. As we saw, these schemes are not well suited for every use, and their characteristics must be taken into account. Nowadays, such schemes are studied in wide application contexts, but the research is still challenging in the cryptographic community to design more powerful/secure schemes. Their use in the signal processing community is quite new, and we hope this paper will serve as a guide for understanding their specificities, advantages, and limits.
ACKNOWLEDGMENTS
The authors are indebted to the referees for their fruitful comments concerning this manuscript, and to Fabien Laguillaumie and Guilhem Castagnos for discussions about the recent improvements in the field. They also thank all the people who took the time to read this manuscript and share their thoughts about it. Dr. C. Fontaine is supported (in part) by the European Commission through the IST Programme under Contract IST-2002-507932 ECRYPT.
REFERENCES
[1] R. Rivest, L. Adleman, and M. Dertouzos, “On data banks and privacy homomorphisms,” in Foundations of Secure Computation, pp. 169–177, Academic Press, 1978.
[2] E. Brickell and Y. Yacobi, “On privacy homomorphisms,” in Advances in Cryptology (EUROCRYPT ’87), vol. 304 of Lecture Notes in Computer Science, pp. 117–126, Springer, New York, NY, USA, 1987.
[3] D. Rappe, Homomorphic cryptosystems and their applications, Ph.D. thesis, University of Dortmund, Dortmund, Germany, 2004, http://www.rappe.de/doerte/Diss.pdf.
[4] R. Cramer and I. Damgård, “Zero-knowledge for finite field arithmetic, or: can zero-knowledge be for free?” in Advances in Cryptology (CRYPTO ’98), vol. 1462 of Lecture Notes in Computer Science, pp. 424–441, Springer, New York, NY, USA, 1998.
[5] H. Lipmaa, “Verifiable homomorphic oblivious transfer and private equality test,” in Advances in Cryptology (ASIACRYPT ’03), vol. 2894 of Lecture Notes in Computer Science, pp. 416–433, Springer, New York, NY, USA, 2003.
[6] P.-A. Fouque, G. Poupard, and J. Stern, “Sharing decryption in the context of voting or lotteries,” in Proceedings of the 4th International Conference on Financial Cryptography, vol. 1962 of Lecture Notes in Computer Science, pp. 90–104, Anguilla, British West Indies, 2000.
[7] T. Sander and C. Tschudin, “Protecting mobile agents against malicious hosts,” in Mobile Agents and Security, vol. 1419 of Lecture Notes in Computer Science, pp. 44–60, Springer, New York, NY, USA, 1998.
[8] P. Golle, M. Jakobsson, A. Juels, and P. Syverson, “Universal re-encryption for mixnets,” in Proceedings of the RSA Conference Cryptographers (Track ’04), vol. 2964 of Lecture Notes in Computer Science, pp. 163–178, San Francisco, Calif, USA, 2004.
[9] I. Damgård and M. Jurik, “A length-flexible threshold cryptosystem with applications,” in Proceedings of the 8th Australian Conference on Information Security and Privacy (ACISP ’03), vol. 2727 of Lecture Notes in Computer Science, Wollongong, Australia, 2003.
[10] A. Adelsbach, S. Katzenbeisser, and A. Sadeghi, “Cryptology meets watermarking: detecting watermarks with minimal or zero-knowledge disclosures,” in Proceedings of the European Signal Processing Conference (EUSIPCO ’02), Toulouse, France, September 2002.
[11] B. Pfitzmann and W. Waidner, “Anonymous fingerprinting,” in Advances in Cryptology (EUROCRYPT ’97), vol. 1233 of Lecture Notes in Computer Science, pp. 88–102, Springer, New York, NY, USA, 1997.
[12] N. Memon and P. Wong, “A buyer-seller watermarking protocol,” IEEE Transactions on Image Processing, vol. 10, no. 4, pp. 643–649, 2001.
[13] C.-L. Lei, P.-L. Yu, P.-L. Tsai, and M.-H. Chan, “An efficient and anonymous buyer-seller watermarking protocol,” IEEE Transactions on Image Processing, vol. 13, no. 12, pp. 1618–1626, 2004.
[14] M. Kuribayashi and H. Tanaka, “Fingerprinting protocol for images based on additive homomorphic property,” IEEE Transactions on Image Processing, vol. 14, no. 12, pp. 2129–2139, 2005.
[15] V. Shoup, A Computational Introduction to Number Theory and Algebra, Cambridge University Press, 2005, http://www.shoup.net/ntb/.
[16] A. Menezes, P. van Oorschot, and S. Vanstone, Handbook of applied cryptography, CRC Press, 1997, http://www.cacr.math.uwaterloo.ca/hac/.
[17] H. Van Tilborg, Ed., Encyclopedia of Cryptography and Security, Springer, New York, NY, USA, 2005.
[18] A. Kerckhoffs, “La cryptographie militaire (part i),” Journal des Sciences Militaires, vol. 9, no. 1, pp. 5–38, 1883.
[19] A. Kerckhoffs, “La cryptographie militaire (part ii),” Journal des Sciences Militaires, vol. 9, no. 2, pp. 161–191, 1883.
[20] J. Daemen and V. Rijmen, “The block cipher RIJNDAEL,” in (CARDIS ’98), vol. 1820 of Lecture Notes in Computer Science, pp. 247–256, Springer, New York, NY, USA, 2000.
[21] J. Daemen and V. Rijmen, “The design of Rijndael,” in AES - the Advanced Encryption Standard, Information Security and Cryptography, Springer, New York, NY, USA, 2002.
[22] G. Vernam, “Cipher printing telegraph systems for secret wire and radio telegraphic communications,” Journal of the American Institute of Electrical Engineers, vol. 45, pp. 109–115, 1926.
[23] P. Ekdahl and T. Johansson, “A new version of the stream cipher SNOW,” in Selected Areas in Cryptography (SAC ’02), vol. 2595 of Lecture Notes in Computer Science, pp. 47–61, Springer, New York, NY, USA, 2002.
[24] R. Rivest, A. Shamir, and L. Adleman, “A method for obtaining digital signatures and public-key cryptosystems,” Communications of the ACM, vol. 21, no. 2, pp. 120–126, 1978.
[25] T. ElGamal, “A public key cryptosystem and a signature scheme based on discrete logarithms,” in Advances in Cryptology (CRYPTO ’84), vol. 196 of Lecture Notes in Computer Science, pp. 10–18, Springer, New York, NY, USA, 1985.
[26] C. Shannon, “Communication theory of secrecy systems,” Bell System Technical Journal, vol. 28, pp. 656–715, 1949.
[27] M. Ajtai and C. Dwork, “A public key cryptosystem with worst-case/average-case equivalence,” in Proceedings of the 29th ACM Symposium on Theory of Computing (STOC ’97), pp. 284–293, 1997.
[28] P. Nguyen and J. Stern, “Cryptanalysis of the Ajtai-Dwork cryptosystem,” in Advances in Cryptology (CRYPTO ’98), vol. 1462 of Lecture Notes in Computer Science, pp. 223–242, Springer, New York, NY, USA, 1999.
[29] R. Canetti, O. Goldreich, and S. Halevi, “The random oracle model, revisited,” in Proceedings of the 30th ACM Symposium on Theory of Computing (STOC ’98), pp. 209–218, Berkeley, Calif, USA, 1998.
[30] P. Paillier, “Impossibility proofs for RSA signatures in the standard model,” in Proceedings of the RSA Conference 2007, Cryptographers’ (Track), vol. 4377 of Lecture Notes in Computer Science, pp. 31–48, San Francisco, Calif, USA, 2007.
[31] W. Diffie and M. Hellman, “New directions in cryptography,” IEEE Transactions on Information Theory, vol. 22, no. 6, pp. 644–654, 1976.
[32] D. Kahn, The Codebreakers: The Story of Secret Writing, Macmillan, New York, NY, USA, 1967.
[33] M. Bellare and P. Rogaway, “Optimal asymmetric encryption - how to encrypt with RSA,” in Advances in Cryptology (EUROCRYPT ’94), vol. 950 of Lecture Notes in Computer Science, pp. 92–111, Springer, New York, NY, USA, 1995.
[34] S. Goldwasser and S. Micali, “Probabilistic encryption & how to play mental poker keeping secret all partial information,” in Proceedings of the 14th ACM Symposium on the Theory of Computing (STOC ’82), pp. 365–377, New York, NY, USA, 1982.
[35] M. Blum and S. Goldwasser, “An efficient probabilistic public-key encryption scheme which hides all partial information,” in Advances in Cryptology (EUROCRYPT ’84), vol. 196 of Lecture Notes in Computer Science, pp. 289–299, Springer, New York, NY, USA, 1985.
[36] O. Goldreich, “A uniform complexity treatment of encryption and zero-knowledge,” Journal of Cryptology, vol. 6, no. 1, pp. 21–53, 1993.
[37] M. Naor and M. Yung, “Public-key cryptosystems provably secure against chosen ciphertext attacks,” in Proceedings of the 22nd ACM Annual Symposium on the Theory of Computing (STOC ’90), pp. 427–437, Baltimore, Md, USA, 1990.
[38] C. Rackoff and D. Simon, “Non-interactive zero-knowledge proof of knowledge and chosen ciphertext attack,” in Advances in Cryptology (CRYPTO ’91), vol. 576 of Lecture Notes in Computer Science, pp. 433–444, Springer, New York, NY, USA, 1991.
[39] D. Dolev, C. Dwork, and M. Naor, “Non-malleable cryptography,” in Proceedings of the 23rd ACM Annual Symposium on the Theory of Computing (STOC ’91), pp. 542–552, 1991.
[40] D. Dolev, C. Dwork, and M. Naor, “Non-malleable cryptography,” SIAM Journal of Computing, vol. 30, no. 2, pp. 391–437, 2000.
[41] M. Bellare, A. Desai, D. Pointcheval, and P. Rogaway, “Relations among notions of security for public-key encryption schemes,” in Advances in Cryptology (CRYPTO ’98), vol. 1462 of Lecture Notes in Computer Science, pp. 26–45, Springer, New York, NY, USA, 1998.
[42] M. Bellare and A. Sahai, “Non-malleable encryption: equivalence between two notions, and an indistinguishability-based characterization,” in Advances in Cryptology (CRYPTO ’99), vol. 1666 of Lecture Notes in Computer Science, pp. 519–536, Springer, New York, NY, USA, 1999.
[43] Y. Watanabe, J. Shikata, and H. Imai, “Equivalence between semantic security and indistinguishability against chosen ciphertext attacks,” in Public Key Cryptography (PKC ’03), vol. 2567 of Lecture Notes in Computer Science, pp. 71–84, Springer, New York, NY, USA, 2003.
[44] N. Ahituv, Y. Lapid, and S. Neumann, “Processing encrypted data,” Communications of the ACM, vol. 30, no. 9, pp. 777–780, 1987.
[45] D. Boneh and R. Lipton, “Algorithms for black box fields and their application to cryptography,” in Advances in Cryptology (CRYPTO ’96), vol. 1109 of Lecture Notes in Computer Science, pp. 283–297, Springer, New York, NY, USA, 1996.
[46] S. Goldwasser and S. Micali, “Probabilistic encryption,” Journal of Computer and System Sciences, vol. 28, no. 2, pp. 270–299, 1984.
[47] P. Paillier, “Public-key cryptosystems based on composite degree residuosity classes,” in Advances in Cryptology (EUROCRYPT ’99), vol. 1592 of Lecture Notes in Computer Science, pp. 223–238, Springer, New York, NY, USA, 1999.
[48] R. Cramer, R. Gennaro, and B. Schoenmakers, “A secure and optimally efficient multiauthority election scheme,” in Advances in Cryptology (EUROCRYPT ’97), vol. 1233 of Lecture Notes in Computer Science, pp. 103–118, Springer, New York, NY, USA, 1997.
[49] R. McEliece, “A public-key cryptosystem based on algebraic coding theory,” DSN progress report, Jet Propulsion Laboratory, 1978.
[50] J. Benaloh, Verifiable secret-ballot elections, Ph.D. thesis, Yale University, Department of Computer Science, New Haven, Conn, USA, 1988.
[51] D. Naccache and J. Stern, “A new public-key cryptosystem based on higher residues,” in Proceedings of the 5th ACM Conference on Computer and Communications Security, pp. 59–66, San Francisco, Calif, USA, November 1998.
[52] T. Okamoto and S. Uchiyama, “A new public-key cryptosystem as secure as factoring,” in Advances in Cryptology (EUROCRYPT ’98), vol. 1403 of Lecture Notes in Computer Science, pp. 308–318, Springer, New York, NY, USA, 1998.
[53] T. Okamoto, S. Uchiyama, and E. Fujisaki, “EPOC: efficient probabilistic public-key encryption,” Tech. Rep., 2000, Proposal to IEEE P1363a, http://grouper.ieee.org/groups/1363/P1363a/draft.html.
[54] M. Joye, J.-J. Quisquater, and M. Yung, “On the power of misbehaving adversaries and security analysis of the original EPOC,” in Topics in Cryptology CT-RSA 2001, vol. 2020 of Lecture Notes in Computer Science, Springer, New York, NY, USA, 2001.
[55] R. Cramer and V. Shoup, “Universal hash proofs and a paradigm for adaptive chosen ciphertext secure public-key encryption,” in Advances in Cryptology (EUROCRYPT ’02), vol. 2332 of Lecture Notes in Computer Science, pp. 45–64, Springer, New York, NY, USA, 2002.
[56] E. Bresson, D. Catalano, and D. Pointcheval, “A simple public-key cryptosystem with a double trapdoor decryption mechanism and its applications,” in Advances in Cryptology (ASIACRYPT ’03), vol. 2894 of Lecture Notes in Computer Science, pp. 37–54, Springer, New York, NY, USA, 2003.
[57] I. Damgård and M. Jurik, “A generalisation, a simplification and some applications of Paillier’s probabilistic public-key system,” in 4th International Workshop on Practice and Theory in Public-Key Cryptography, vol. 1992 of Lecture Notes in Computer Science, pp. 119–136, Springer, New York, NY, USA, 2001.
[58] S. Galbraith, “Elliptic curve Paillier schemes,” Journal of Cryptology, vol. 15, no. 2, pp. 129–138, 2002.
[59] G. Castagnos, “An efficient probabilistic public-key cryptosystem over quadratic fields quotients,” 2007, Finite Fields and Their Applications, paper version in press, http://www.unilim.fr/pages perso/guilhem.castagnos/.
[60] G. Castagnos, Quelques schémas de cryptographie asymétrique probabiliste, Ph.D. thesis, Université de Limoges, 2006, http://www.unilim.fr/pages perso/guilhem.castagnos/.
[61] D. Boneh and M. Franklin, “Identity-based encryption from the Weil pairing,” in Advances in Cryptology (CRYPTO ’01), vol. 2139 of Lecture Notes in Computer Science, pp. 213–229, Springer, New York, NY, USA, 2001.
[62] D. Boneh, X. Boyen, and E.-J. Goh, “Hierarchical identity based encryption with constant size ciphertext,” in Advances in Cryptology (EUROCRYPT ’05), vol. 3494 of Lecture Notes in Computer Science, pp. 440–456, Springer, New York, NY, USA, 2005.
[63] J. Domingo-Ferrer, “A provably secure additive and multiplicative privacy homomorphism,” in Proceedings of the 5th International Conference on Information Security (ISC ’02), vol. 2433 of Lecture Notes in Computer Science, pp. 471–483, Sao Paulo, Brazil, 2002.
[64] D. Wagner, “Cryptanalysis of an algebraic privacy homomorphism,” in Proceedings of the 6th International Conference on Information Security (ISC ’03), vol. 2851 of Lecture Notes in Computer Science, Bristol, UK, 2003.
[65] F. Bao, “Cryptanalysis of a provable secure additive and multiplicative privacy homomorphism,” in International Workshop on Coding and Cryptography (WCC ’03), pp. 43–49, Versailles, France, 2003.
[66] J. Domingo-Ferrer, “A new privacy homomorphism and applications,” Information Processing Letters, vol. 60, no. 5, pp. 277–282, 1996.
[67] J. Cheon, W.-H. Kim, and H. Nam, “Known-plaintext cryptanalysis of the Domingo-Ferrer algebraic privacy homomorphism scheme,” Information Processing Letters, vol. 97, no. 3, pp. 118–123, 2006.
[68] C. Castelluccia, E. Mykletun, and G. Tsudik, “Efficient aggregation of encrypted data in wireless sensor networks,” in ACM/IEEE Mobile and Ubiquitous Systems: Networking and Services (Mobiquitous ’05), pp. 109–117, 2005.
[69] M. Fellows and N. Koblitz, “Combinatorial cryptosystems galore!,” in Contemporary Mathematics, vol. 168 of Finite Fields: Theory, Applications, and Algorithms, FQ2, pp. 51–61, 1993.
[70] L. Ly, A public-key cryptosystem based on Polly Cracker, Ph.D. thesis, Ruhr-Universität Bochum, Bochum, Germany, 2002.
Hindawi Publishing Corporation, EURASIP Journal on Information Security, Volume 2007, Article ID 51368, 10 pages, doi:10.1155/2007/51368
Research Article
Secure Multiparty Computation between Distrusted Networks Terminals
S.-C. S. Cheung¹ and Thinh Nguyen²
¹ Center for Visualization and Virtual Environments, Department of Electrical and Computer Engineering, University of Kentucky, Lexington, KY 40507, USA
² School of Electrical Engineering and Computer Science, Oregon State University, 1148 Kelley Engineering Center Corvallis, Oregon, OR 97331-5501, USA
Correspondence should be addressed to S.-C. S. Cheung, [email protected]
Received 7 May 2007; Accepted 12 October 2007
Recommended by Stefan Katzenbeisser
One of the most important problems facing any distributed application over a heterogeneous network is the protection of private sensitive information in local terminals. A subfield of cryptography called secure multiparty computation (SMC) is the study of such distributed computation protocols that allow distrusted parties to perform joint computation without disclosing private data. SMC is increasingly used in diverse fields from data mining to computer vision. This paper provides a tutorial on SMC for nonexperts in cryptography and surveys some of the latest advances in this exciting area including various schemes for reducing communication and computation complexity of SMC protocols, doubly homomorphic encryption and private information retrieval.
Copyright © 2007 S.-C. S. Cheung and T. Nguyen. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
1. INTRODUCTION

The proliferation of capturing and storage devices, together with the ubiquitous presence of computer networks, makes sharing of data easier than ever. Such pervasive exchange of data, however, has increasingly raised questions on how sensitive and private information can be protected. For example, it is now commonplace to send private photographs or videos to the hundreds of online photoprocessing stores for storage, development, and enhancement like sharpening and red-eye removal. Few companies provide any protection of the personal pictures they receive. Hackers or employees of the store may steal the data for personal use or distribute them for personal gain without consent from the owner.

There are also security applications in which multiple parties need to collaborate with each other but do not want any of their own private data disclosed. Consider the following example: a law-enforcement agency wants to search for possible suspects in a surveillance video owned by private company A, using proprietary software developed by another private company B. The three parties involved all have information they do not want to share with each other: the criminal biometric database from law enforcement, the surveillance tape from company A, and the proprietary software from company B.

Encryption alone cannot provide adequate protection when performing the aforementioned applications. The encrypted data needs to be decrypted at the receiver for processing and the raw data will then become vulnerable. Alternatively, the client can download the software and process her private data in a secure environment. This, however, runs the risk of having the proprietary technology of the software company pirated or reverse-engineered by hackers. The Trusted Computing (TC) Platform may solve this problem by executing the software in a secure memory space of the client machine equipped with a cryptographic coprocessor [1]. Besides the high cost of overhauling the existing PC platform, the TC concept remains highly controversial due to its unbalanced protection of the software companies over the consumers [2].

The technical challenge to this problem lies in developing a joint computation and communication protocol to be executed among multiple distrusted network terminals without disclosing any private information. Such a protocol is called a secure multiparty computation (SMC) protocol and has been an active research area in cryptography for more than twenty years [3]. Recently, researchers in other disciplines such as signal processing and data mining have begun to use SMC to solve various practical problems. The goal of this paper is to provide a tutorial on the basic theory of SMC and to survey recent advances in this area.

2. PROBLEM FORMULATION

The basic framework of SMC is as follows: there are $n$ parties $P_1, P_2, \ldots, P_n$ on a network who want to compute a joint function $f(x_1, x_2, \ldots, x_n)$ based on private data $x_i$ owned by party $P_i$ for $i = 1, 2, \ldots, n$. The goal of the SMC is that $P_i$ will not learn anything about $x_j$ for $j \neq i$ beyond what can be inferred from her private data $x_i$ and the result of the computation $f(x_1, x_2, \ldots, x_n)$. SMC can be trivially accomplished if there is a special server, trusted by every party with its private data, to carry out the computation. This is not a practical solution as it is too costly to protect such a server. The objective of any SMC protocol is to emulate this ideal model as much as possible by using clever transformations to conceal the private data.

Almost all SMC protocols are classified based on their models of security and adversarial behaviors. The most commonly used security models are perfect security and computational security, which will be covered in Sections 3 and 4, respectively. Adversarial behaviors are broadly classified into two types: semihonest and malicious. A dishonest party is called semihonest if she follows the SMC protocol faithfully but attempts to find out about others' private data through the communication. A malicious party, on the other hand, will modify the protocol to gain extra information. We will focus primarily on semihonest adversaries but briefly describe how the protocols can be fortified to handle malicious adversaries.

We also assume that private data are elements from a finite field $F$ and the target function $f(\cdot)$ can be implemented as a combination of the field's addition and multiplication. This is a reasonably general computational model for two reasons: first, at the lowest level, any digital computing device can be modeled by setting $F$ as the binary field with XOR as addition and AND as multiplication. Second, while most signal processing and scientific computation are described using real numbers, we can approximate the real numbers with a reasonably large finite field and estimate any analytical function using a truncated version of its power series expansion, which consists of only additions and multiplications.

3. SMC WITH PERFECT SECURITY

In this section, we discuss perfectly secure multiparty computation (PSMC) in which an adversary will learn nothing about the secret numbers of the honest parties no matter how computationally powerful the adversary is. The idea is that while the adversary may control a number of parties who receive messages from other honest senders, these messages provide no useful information about the secret numbers of the senders.

One of the basic tools used in PSMC is secret sharing. A $t$-out-of-$m$ secret-sharing scheme breaks a secret number $x$ into $m$ shares $r_1, r_2, \ldots, r_m$ such that $x$ cannot be reconstructed unless an adversary obtains more than $t - 1$ shares with $t \le m$. The importance of a secret-sharing scheme in PSMC is illustrated by the following example: in a 2-party secure computation of $f(x_1, x_2)$, party $P_i$ will use a 2-out-of-2 secret-sharing scheme to break $x_i$ into $r_{i1}$ and $r_{i2}$, and share $r_{ij}$ with party $P_j$. Each party then computes the function using the shares received, resulting in $y_1 = f(r_{11}, r_{21})$ at $P_1$ and $y_2 = f(r_{12}, r_{22})$ at $P_2$. If the secret-sharing scheme is homomorphic under the function $f(\cdot)$, that is, $y_1$ and $y_2$ are themselves secret shares of the desired function $f(x_1, x_2)$, then $f(x_1, x_2)$ can be easily computed by exchanging $y_1$ and $y_2$ between the two parties. Under our computational model, all SMC problems can be solved if the secret-sharing scheme is doubly homomorphic, that is, it preserves both addition and multiplication. One such scheme, invented by Adi Shamir [4], is explained next.

In Shamir's secret-sharing scheme, a party hides her secret number $x$ as the constant term of a secret polynomial $g(z)$ of degree $t - 1$,

$$g(z) = a_{t-1}z^{t-1} + a_{t-2}z^{t-2} + \cdots + a_1 z + x. \tag{1}$$

The coefficients $a_1$ to $a_{t-1}$ are random coefficients distributed uniformly over the entire field. Given the polynomial $g(z)$, the secret number $x$ can be recovered by evaluating it at $z = 0$. The secret shares are computed by evaluating $g(z)$ at $z = 1, 2, \ldots, m$ and are distributed to $m$ other parties. It is assumed that each party knows the degree of $g(z)$ and the value $z$ at which her share is evaluated. We follow the convention that the share received by party $P_i$ is evaluated at $z = i$.

If an adversary obtains any $t$ shares $g(z_1), g(z_2), \ldots, g(z_t)$ with $z_i \in \{1, 2, \ldots, m\}$, the adversary can then formulate the following polynomial $\hat{g}(z)$:

$$\hat{g}(z) = \sum_{i=1}^{t} g(z_i)\, \frac{\prod_{j=1,\, j \neq i}^{t} (z - z_j)}{\prod_{j=1,\, j \neq i}^{t} (z_i - z_j)}. \tag{2}$$

We claim that $\hat{g}(z)$ is identical to the secret polynomial $g(z)$: first, the degree of $\hat{g}(z)$ is $t - 1$, same as that of $g(z)$. Second, $\hat{g}(z) = g(z)$ for $z = z_1, z_2, \ldots, z_t$ because, when evaluating $\hat{g}(z)$ at a particular $z = z_i$, every term inside the summation in (2) will go to zero except for the one that contains $g(z_i)$, which simply becomes $g(z_i)$ as its multiplier becomes one. Consequently, the $(t - 1)$th-degree polynomial $g(z) - \hat{g}(z)$ will have $t$ roots. As the number of roots is higher than the degree, $g(z) - \hat{g}(z)$ must be identically zero or $\hat{g}(z) \equiv g(z)$. As a result, the adversary can reconstruct the secret number $x = \hat{g}(0)$.

On the other hand, the adversary will have no knowledge about $x$ even if it possesses as many as $t - 1$ shares. This is because, for any arbitrary secret number $x$, there exists a polynomial $h(z)$ such that $h(0) = x$ and $h(z_i) = g(z_i)$ for $i = 1, 2, \ldots, t - 1$. $h(z)$ is given as follows and its properties are similar to those of (2):

$$h(z) = x \prod_{j=1}^{t-1} \frac{z - z_j}{-z_j} + \sum_{i=1}^{t-1} g(z_i)\, \frac{z}{z_i}\, \frac{\prod_{j=1,\, j \neq i}^{t-1} (z - z_j)}{\prod_{j=1,\, j \neq i}^{t-1} (z_i - z_j)}. \tag{3}$$

Shamir's secret-sharing scheme is obviously homomorphic under addition: given two secret $(t - 1)$th-degree polynomials $g(z)$ and $h(z)$, the secret shares of $g(z) + h(z)$ are simply the sums of their respective secret shares $g(1) + h(1), g(2) + h(2), \ldots, g(m) + h(m)$. Secrecy is also maintained as the coefficients of $g(z) + h(z)$, except for the constant term which is the sum of all the secret numbers, are uniformly distributed and no party can gain additional knowledge about others' secret shares. On the other hand, the degree of the product polynomial $g(z)h(z)$ increases to $2(t - 1)$. The locally computed shares $g(1)h(1), g(2)h(2), \ldots, g(m)h(m)$ cannot completely specify $g(z)h(z)$ unless the number of shares $m$ is strictly larger than $2(t - 1)$ or equivalently, $t \le \lceil m/2 \rceil$. Even if this condition is satisfied, a series of products can easily result in a polynomial with degree higher than $m$. Furthermore, the coefficients of the product polynomial are not entirely random; for example, they are related in such a way that the polynomial can be factored by the original polynomials. These problems can be solved by first assuming that $t \le \lceil m/2 \rceil$ and then replacing the product polynomial by a new $(t - 1)$th-degree polynomial as follows.

$P_i$ first computes $g(i)h(i)$ and then generates a random $(t - 1)$th-degree polynomial $q_i(z)$ with $q_i(0) = g(i)h(i)$. Again, using the secret-sharing scheme, $P_i$ sends share $q_i(j)$ to party $P_j$ for $j = 1, 2, \ldots, m$. This step leaks no information about the local product $g(i)h(i)$. In the final step, $P_i$ computes $d_i$ based on all the received shares $q_j(i)$ for $j = 1, 2, \ldots, m$,

$$d_i = \sum_{j=1}^{m} \gamma_j q_j(i), \tag{4}$$

where $\gamma_j$ for $j = 1, 2, \ldots, m$ solve the following equation:

$$g(0)h(0) = \sum_{j=1}^{m} \gamma_j\, g(j)h(j). \tag{5}$$

Before explaining how $P_i$ can solve (5) without knowing $g(0)h(0)$ and $g(j)h(j)$ for $j \neq i$, we first note that $d_i$ for $i = 1, 2, \ldots, m$ are shares of a $(t - 1)$th-degree polynomial $q(z)$ defined below:

$$q(z) = \sum_{j=1}^{m} \gamma_j q_j(z). \tag{6}$$

The coefficients of $q(z)$ are uniformly random as they are linear combinations of uniformly distributed coefficients of the $q_j(z)$'s. Furthermore, its constant term is our target secret number $g(0)h(0)$:

$$q(0) = \sum_{j=1}^{m} \gamma_j q_j(0) = \sum_{j=1}^{m} \gamma_j\, g(j)h(j) = g(0)h(0). \tag{7}$$

The second last equality is because $g(j)h(j)$ is the secret number hidden by the polynomial $q_j(z)$. The last equality is based on (5). This implies that $d_i$ for $i = 1, 2, \ldots, m$ are secret shares of the scalar $g(0)h(0)$. An example of the above protocol in a three-party situation is shown in Figure 1.

Figure 1: This diagram shows how three parties can share the secret $g(0)h(0)$ based on the locally computed products $g(1)h(1)$, $g(2)h(2)$, and $g(3)h(3)$.

To address how each party can solve (5), we note that, based on our assumption $t \le \lceil m/2 \rceil$, the degree of the product polynomial $g(z)h(z)$ is strictly smaller than the number of shares $m$. Let $g(z)h(z) = a_{m-1}z^{m-1} + \cdots + a_0$. The coefficients $a_i$'s are completely determined by the values of $g(z)h(z)$ at $z = 1, 2, \ldots, m$. In other words, the following matrix equation has a unique solution:

$$V\mathbf{a} = \begin{pmatrix} 1^{m-1} & 1^{m-2} & \cdots & 1^{0} \\ 2^{m-1} & 2^{m-2} & \cdots & 2^{0} \\ \vdots & \vdots & & \vdots \\ m^{m-1} & m^{m-2} & \cdots & m^{0} \end{pmatrix} \begin{pmatrix} a_{m-1} \\ a_{m-2} \\ \vdots \\ a_0 \end{pmatrix} = \begin{pmatrix} g(1)h(1) \\ g(2)h(2) \\ \vdots \\ g(m)h(m) \end{pmatrix}. \tag{8}$$

The $m \times m$ invertible matrix $V$ is called the Vandermonde matrix and it is a constant matrix. Taking its inverse $W = V^{-1}$ and considering the last row entries $W_{mi}$ for $i = 1, 2, \ldots, m$, we have

$$\sum_{i=1}^{m} W_{mi}\, g(i)h(i) = a_0 = g(0)h(0). \tag{9}$$

Comparing (9) with (5), we have $W_{mi} = \gamma_i$ for $i = 1, 2, \ldots, m$, which are constants.

The condition $t \le \lceil m/2 \rceil$ on using Shamir's scheme in PSMC imposes a restriction on the number of dishonest parties tolerated: it implies that the number of honest parties must be a strict majority. In particular, we cannot use this scheme for a two-party SMC in which one party has to assume that the other party is dishonest. A surprising result in [5] shows that the condition $t \le \lceil m/2 \rceil$ is not a weakness of Shamir's
scheme; in fact, except for certain trivial functions,¹ it is impossible to compute any $f(x_1, x_2, \ldots, x_m)$ with perfect security if the number of dishonest parties equals or exceeds $m/2$.

¹ The exceptions are those functions that are separable or $f(x_1, x_2, \ldots, x_m) = f_1(x_1) f_2(x_2) \cdots f_m(x_m)$.

To conclude this section, we briefly describe how PSMC protocols can be modified to handle malicious parties. There are two types of disruption: first, a malicious party can output erroneous results and second, she may perform an inconsistent secret-sharing scheme such as evaluating the polynomial at random points. Provided the number of malicious parties is less than one third of the total number of parties, the first problem can be solved by replacing (2) with a robust extrapolation scheme based on Reed-Solomon codes [5]. This bound on the number of malicious parties can be raised to one half by combining interactive zero-knowledge proof with a broadcast channel [6]. The second problem can be solved by using a verifiable secret-sharing (VSS) scheme in which the sender needs to provide auxiliary information so that the receivers can verify the consistency of their shares without gaining knowledge of the secret number [5].

4. SMC WITH COMPUTATIONAL SECURITY

It is unsatisfactory that PSMC introduced in Section 3 cannot even provide secure two-party computation. Instead of relying on perfect security, modern cryptographic techniques primarily use the so-called computational security model. Under this model, secrets are protected by encoding them based on a mathematical function whose inverse is difficult to compute without the knowledge of a secret key. Such a function is called a one-way trapdoor function and the concept is used in many public-key ciphers: a sender who wants to send a message $m$ to party $P$ will first compute a ciphertext $c = E(m, k)$ based on the publicly known encryption algorithm $E(\cdot)$ and $P$'s advertised public key $k$. The encryption algorithm acts as a one-way trapdoor function because a computationally bounded eavesdropper will not be able to recover $m$ given only $c$ and $k$. On the other hand, $P$ can recover $m$ by applying a decoding algorithm $D(E(m, k), s) = m$ using her secret key $s$. Unlike perfectly secure protocols in which the adversary simply does not have any information about the secret, the adversary in the computationally secure model is unable to decrypt the secret due to the computational burden in solving the inverse problem. Even though it is still a conjecture that true one-way trapdoor functions exist and future computation platforms like quantum computers may drastically change the landscape of these functions, many one-way function candidates exist² and are routinely used in practical security systems.

² A list of one-way function candidates can be found in [7, Chapter 1].

The most fundamental result in SMC is that it is possible to design general computationally secure multiparty computation (CSMC) protocols to handle an arbitrary number of dishonest parties [3]. In this section, we will discuss the basic construction of these protocols. Similar to Section 3, we consider the protocols for addition and multiplication in finite fields. We will concentrate on the canonical two-party case but our construction can be easily extended to more than two parties. Our starting point of building general CSMC is a straightforward secret-sharing scheme: each secret number is simply broken down as a sum of two uniformly distributed random numbers: $x_1 = r_{11} + r_{12}$ and $x_2 = r_{21} + r_{22}$. $P_i$ then sends $r_{ij}$ to $P_j$ for $j \neq i$. This scheme is clearly homomorphic under addition

$$x_1 + x_2 = (r_{11} + r_{21}) + (r_{12} + r_{22}). \tag{10}$$

Multiplication, on the other hand, introduces the cross-term $r_{11}r_{22}$ which breaks the homomorphism

$$x_1 x_2 = r_{11}r_{21} + r_{12}x_2 + r_{11}r_{22}. \tag{11}$$

While the first two terms can be locally computed by $P_1$ and $P_2$, respectively, it is impossible to compute the third term $r_{11}r_{22}$ without having one party reveal the actual secret number to the other. In order to accomplish this under the computational security model, we will make use of a general cryptographic protocol called the oblivious transfer (OT).

A 1-out-of-$N$ OT protocol allows one party (the chooser) to read one entry from a table with $N$ entries hosted by another party (the sender). Provided that both parties are computationally bounded, the OT protocol prevents the chooser from reading more than one entry and the sender from knowing the chooser's choice. We first show how the OT protocol can be used to break $r_{11}r_{22}$ in (11) into random shares $u$ and $v$ such that $r_{11}r_{22} = u + v$. Assume our finite field has $N$ elements. The sender $P_1$ generates a random $u$ and then creates a table $T$ with $N$ entries shown in Table 1.³ Using the OT protocol, the chooser $P_2$ selects the entry $v = T(r_{22}) = r_{22}r_{11} - u$ without letting $P_1$ know her selection or inspecting any other entries in the table.

Table 1: OT table at $P_1$.

Key          Value
$0$          $-u$
$1$          $r_{11} - u$
$2$          $2r_{11} - u$
...          ...
$r_{22}$     $r_{22}r_{11} - u$
...          ...
$N-2$        $(N-2)r_{11} - u$
$N-1$        $(N-1)r_{11} - u$

³ The role of $P_1$ and $P_2$ can be interchanged with proper adjustment to Table 1 entries.

It remains to show how OT provides the security guarantee. A 1-out-of-$N$ OT protocol consists of the following five steps.

(1) $P_1$ sends $N$ randomly generated public keys $k_0, k_1, \ldots, k_{N-1}$ to $P_2$.
(2) $P_2$ selects $k_{r_{22}}$ based on her secret number $r_{22}$, encrypts her public key $k'$ using $k_{r_{22}}$, and sends $E(k', k_{r_{22}})$ back to $P_1$.
(3) As $P_1$ does not know $P_2$'s key selection, $P_1$ decodes the incoming message using all possible keys or $k'_i = D(E(k', k_{r_{22}}), s_i)$ with private keys $s_i$ for $i = 0, 1, \ldots, N - 1$. Only one of the $k'_i$'s ($k'_{r_{22}}$) matches the real key $k'$ but $P_1$ has no knowledge of it.
(4) $P_1$ encrypts each table entry $T(i)$ using $k'_i$ and sends $E(T(i), k'_i)$ for $i = 0, 1, \ldots, N - 1$ to $P_2$.
(5) $P_2$ decrypts the $r_{22}$th message using her private key $s'$: $D(E(T(r_{22}), k'_{r_{22}}), s') = T(r_{22})$ as $k'_{r_{22}} = k'$ is the public key corresponding to the secret key $s'$. $P_2$ then obtains her random share $v = T(r_{22}) = r_{22}r_{11} - u$. Note that $P_2$ will not be able to decrypt any other message $E(T(i), k'_i)$ for $i \neq r_{22}$ as it requires the knowledge of $P_1$'s secret key $s_i$.

It is clear from the above procedure that OT can accomplish a table lookup that is secure to both $P_1$ and $P_2$. As the definition of the table is arbitrary, OT can support secure two-party computation of any finite field function. Following similar procedures as in Section 3, the above construction can be extended using standard zero-knowledge proof and verifiable secret-sharing schemes to handle malicious parties that do not follow the prescribed protocols [8, Chapter 7].

5. RECENT ADVANCES

In Sections 3 and 4, we present the construction of general SMC protocols under the perfect security model and the computational security model. While most of these results were established in the 1980s, SMC continues to be a very active research area in cryptography and its applications begin to appear in many other disciplines. Recent advances focus on better understanding of the security strength of individual protocols and their composition, improving CSMC protocols in terms of their computation complexity [9, 10] and communication cost [11–14], relating SMC to error-correcting coding [15, 16], and introducing SMC to a variety of applications [17–22]. The rigorous study of protocol security is beyond the scope of this paper, and thus we will focus on the remaining three topics.

5.1. Reduction of computation complexity and communication cost

Both the computation complexity and communication cost of the 1-out-of-$N$ OT protocol depend linearly on the size $N$ of the sender's table that defines the function: it requires $O(N)$ invocations of a public-key cipher and $O(N)$ messages exchanged between the sender and the chooser. In many practical applications, the value of $N$ could be very large. For example, computing a general function on 32-bit computers requires a table of $N = 2^{32}$ or more than four billion entries! This renders our basic version of OT hopelessly impractical. Improving the computation efficiency and reducing the communication requirement of OT and other CSMC protocols thus become the focus of intensive research effort.

In [9], Naor and Pinkas showed that the 1-out-of-$N$ OT protocol can be reduced to applying a 1-out-of-2 OT protocol $\log_2 N$ times. The idea is that the two parties repeatedly use the 1-out-of-2 OT on individual bits of the binary representation of the chooser's secret number $x_2$: in the $i$th round, the sender will present two keys $K_{i0}$ and $K_{i1}$ to the chooser who will choose $K_{i x_2[i]}$ based on $x_2[i]$, the $i$th bit of $x_2$. The keys $K_{i0}$ and $K_{i1}$ for $i = 1, 2, \ldots, \log_2 N$ are used by the sender to encrypt the table entries $T(k)$ using the binary representation of $k$ as follows:

$$E(T(k)) = T(k) \oplus \bigoplus_{i=1}^{\log_2 N} f\big(K_{i k[i]}\big), \tag{12}$$

where $k$ is a $\log N$-bit number, $f(s)$ is a random number generated by seed $s$, and $\oplus$ denotes XOR. The entire encrypted table is sent to the chooser. Since the chooser already knows $K_{i x_2[i]}$ for $i = 1, 2, \ldots, \log_2 N$, she can use them to decrypt $E(T(x_2))$ as follows:

$$T(x_2) = E(T(x_2)) \oplus \bigoplus_{i=1}^{\log_2 N} f\big(K_{i x_2[i]}\big). \tag{13}$$

The same authors further improved the computation complexity of the 1-out-of-2 OT protocol in [10]. They showed that it is possible to use one exponentiation, the most complex operation in a public-key cipher, for any number of simultaneous invocations of the 1-out-of-2 OT at the cost of increasing the communication overhead. Their public-key cipher is based on the assumed difficulty of the Decisional Diffie-Hellman problem whose encryption process enables the sender to prepare all her encrypted messages with one exponentiation without any loss of secrecy.

An aspect that the above algorithms do not address is the communication requirement of general CSMC protocols. There are three different facets to the communication problem. First, our basic version of the 1-out-of-$N$ OT protocol requires the sender to send $N$ random keys and $N$ encrypted messages to the chooser. The random keys can be considered as setup cost, provided that the sender changes her random share $u$ and the chooser changes her key $k'$ in every invocation of the protocol. However, it seems necessary to send the $N$ encrypted messages every time as the messages depend on $u$. A closer examination reveals that all the chooser needs is one particular message that corresponds to her secret number. The entire set of $N$ messages is sent simply to obfuscate her choice from the sender. This subproblem of obfuscating a selection from a public data collection is called private information retrieval (PIR). PIR has attracted much research interest lately and is treated in Section 5.2. It suffices to know that there are techniques that can reduce the communication cost from $O(N)$ to $O(\log N)$ [23].

The second facet involves the communication cost of the original unsecured implementation of the target function. The CSMC protocols in Section 4 provide a systematic procedure to secure each addition and multiplication operation in the original implementation. However, not all operations
However, not all operations 6 EURASIP Journal on Information Security need to be secured—local operations can be performed with- These two groups are related by a special bilinear map e : out any modification. As such, it is important to minimize G×G→G such that e(uα, vβ) = e(u, v)αβ for arbitrary u, v ∈ G the number of cross-party operations that need to be forti- and integers α, β.5 Furthermore, e(g, g)isageneratorforG fied with the OT protocol. Consider the following example: if g is a generator for G. The public keys for the cipher de- P1 and P2,eachwithn/2secretnumbers,wanttofindthe fined on G are a generator g and a random h = gαq2 for median of the entire set of n numbers. The best known unse- some α. The public keys for the cipher on G are g = e(g, g) cured algorithm to find the median requires O(n)compari- = = αq2 son operations. Tomake this algorithm secure, we can use the and h e(g, h) g . Given a message m, the sender 1-out-of- OT protocol to implement each comparison,4 re- generates a random integer r and computes the ciphertext N = m r ∈ sulting in communication requirement of O(n log N). This, C g h G. To decrypt this ciphertext, the receiver first however, is not the optimal solution—a distributed median- removes the random factor by raising C to the power of the finding algorithm requires much less communication [13]. private key q1: The idea is to have P1 and P2 first compared with their re- q m m q1 = m r 1 = q1 αq2rq1 = q1 spective local medians. The party with the the larger me- C g h g g g , (14) dian can then discard the half of the local data larger than the local median—the global median cannot be in this por- where we use the basic fact gq1q2 = gn = 1 from group theory. tion of the local data as the global median must be smaller Provided that the message space is small enough, the receiver than the larger of the two local medians. 
Following the same can then retrieve m by computing the discrete logarithm of logic, the other party can discard the smaller half of her lo- Cq1 base gq1 . The security of the cipher is based on the as- cal data. The two parties again compare their local medi- sumed hardness of the so-called subgroup decision problem ans of the remaining data until exhaustion. Notice that all of which we refer the readers to the original paper [14]. We the local computation can be done without invocations of now focus on the homomorphic properties of this scheme. = m1 r1 = m2 r2 OT. As a result, this algorithm only requires O(log n) cross- Given two ciphertext messages C1 g h and C2 g h , = m1+m2 r1+r2 party secure comparison and this results in a communi- it is easy to see that C1C2 g h which is the cipher- cation cost of O(log n log N), a significant reduction from text of message m1 + m2. For multiplication, we apply the · · the naive implementation. In fact, it has been shown that if bilinear map e( , )onC1 and C2: a communication-efficient unsecured implementation exists for a general function, we can always convert it into a secure e C , C = e gm1 hr1 , gm2 hr2 1 2 one without much increase in communication [12]. = e gm1+αq2r1 , gm2+αq2r2 The final facet of communication requirements has to do = m1m2+αq2(m1r2+m2r1+αq2r1r2) with the interactivity of the CSMC protocols. All the pro- e(g, g) (15) tocols introduced thus far require multiple rounds of com- = e(g, g)m1m2 e(g, h)m1r2+m2r1+αq2r1r2 munications between the parties. Such frequent interaction = m1m2 r is undesirable in many applications such as batch processing g h . in which one party needs to reuse many times the same se- cret information from another party, and asymmetric com- The last expression is clearly a ciphertext for m1m2.Unfortu- putation in which a low-complexity client wants to leverage nately, e(C1, C2)belongstoG,notinG. 
This means that one a sophisticated server to privately perform a complex com- cannot further combine this with other ciphertexts in G and putation. Earlier work in this area showed that one round of as such this scheme falls short of being a completely homo- message exchange is indeed possible for secure computation morphic encryption scheme. of any function [11]. However, the length of the replied mes- sage depends on the complexity of the implementation of the 5.2. Private information retrieval function. As a result, this requires the end receiver to devote much time in decoding the message even though the output Private information retrieval (PIR) protocols allow a party (a can be as small as a binary decision. This problem can be re- user) to select a record from a database owned by another solved using a doubly homomorphic public-key encryption party (a server) without the server knowing the selection of scheme in which arbitrary computation can be done on the the user. PIR is a step in OT as explained in Section 5.1.Un- encrypted data without size expansion. It is an open problem like OT, PIR does not prevent the sender from obtaining in- in cryptography on whether a doubly homomorphic encryp- formation about the collection beyond her choice. Due to its tion scheme exists. The closest scheme, which we will explain asymmetric protection, the paradigm of PIR is useful for pri- next, can support arbitrary numbers of additions and one vacy protection of ordinary citizens in using search engine, multiplication on encrypted data [14]. shopping at online stores, participating in public survey and The construction is based on two public-key ciphers de- electronic voting. As we have seen in Section 5.1, the sim- fined on two different finite cyclic groups G and G of the plest form of PIR is to send the entire database to the user. same size n = q1q2,whereq1 and q2 are large private primes. This imposes a communication cost in the order of the size
of the database. Recent advances in PIR protocols, however, show that the goal can be accomplished with a much smaller communication overhead.

⁴ Secure comparison is also called the Secure Millionaire Problem, one of the earliest problems studied in SMC literature [3].
⁵ An example of such construction is based on the modified Weil pairing on the elliptic curve $y^2 = x^3 + 1$ defined over a finite field [14].

The problem of PIR was first proposed in the seminal paper by Chor et al. as follows [24]: the server has an $n$-bit binary string $x$, and a user wants to know $x[i]$, the $i$th bit of $x$, without the server knowing about $i$. The first important result shown in [24] is that, under the perfect security model, it is impossible to send less data than the trivial solution of sending the entire $x$ to the user. On the other hand, if identical databases are available at $k \ge 2$ noncolluding servers, then perfect security can be achieved with the communication cost of $O(n^{1/k})$. Their results are based on the following basic two-server scheme that allows a user to privately obtain $x[i]$ by receiving a single bit from each of the two servers. Let us denote

$$S \otimes a = \begin{cases} S \cup \{a\}, & \text{if } a \notin S, \\ S \setminus \{a\}, & \text{if } a \in S. \end{cases} \tag{16}$$

The user first randomly selects the indexes $j \in \{1, 2, \ldots, n\}$ with probability of $1/2$ for each value of $j$, to form a set $S$. Next, the user computes $S \otimes i$, where $i$ is the desired index. The user then sends $S$ to server one and $S \otimes i$ to server two. Upon receiving $S$, server one replies to the user with a single bit which is the result of XORing all the bits in the positions specified by $S$. Similarly, server two replies to the user with a single bit which is the result of XORing all the bits in the positions specified by $S \otimes i$. The user then computes $x[i]$ by XORing the two bits received from the two servers. This scheme works because every position $j \neq i$ appears either in both $S$ and $S \otimes i$ or in neither, therefore the contributions of all such $x[j]$'s cancel to 0 in the XOR. On the other hand, $i$ appears exactly once, in either $S$ or $S \otimes i$, therefore the XOR of the two replies is $x[i]$. Provided the two servers do not collude, every bit is equally likely to be selected by the user. In this scheme, each server sends one bit to the user but the user has to send an $n$-bit message⁶ to each server. Thus, the overall communication cost is still $O(n)$. With minor modification, this basic scheme can be extended to reduce the number of bits sent by the user to $O(n^{1/k})$ [24].

⁶ The message is simply an $n$-bit number with ones indicating the desired bits.

Recently, an interesting connection has been made between PIR and a special type of forward-error-correcting codes (FEC) called locally decodable codes (LDC) and it has created a flurry of interest in the information theory community [16]. FEC is used to combat transmission errors by adding redundancy to the transmitted data. Formally, the sender uses an encoding function $C(\cdot)$ to map an $n$-bit message $x$ to an $m$-bit message $C(x)$ with $m > n$, and then sends $C(x)$ over a noisy channel. Upon receiving a string $y$ possibly different from $C(x)$, a receiver attempts to recover $x$ using a decoding algorithm $D(\cdot)$. In the conventional FEC, it will take at least $O(n)$ complexity to recover an $n$-bit $x$ since $O(n)$ is required just to record $x$. LDC, on the other hand, allows the user to inspect only a small fraction of $C(x)$, say $k \ll n$ bits, in order to fully recover a specific bit $x[i]$ in $x$. Furthermore, each bit in $C(x)$ can be used in a $k$-bit subset to recover $x[i]$. As such, the knowledge of a particular bit in $C(x)$ being used provides no information about which $x[i]$ is being recovered. To see how LDC is used in PIR, we assume that each of the $k$ servers has the same $m$-bit $C(x)$ generated using an LDC encoding function on the $n$-bit database $x$. In order to retrieve $x[i]$, the user sends $q_1, q_2, \ldots, q_k \in \{1, 2, \ldots, m\}$, the locations of bits in $C(x)$ needed to recover $x[i]$, to each of the $k$ servers, respectively. Note that these locations depend only on $i$ and the particular LDC used. Upon receiving $q_j$, the $j$th server simply replies with $C(x)[q_j]$ for $j = 1, 2, \ldots, k$. After gathering all the $k$ replies, the user can then run the decoding algorithm to recover $x[i]$. Using this framework, the communication cost of the PIR system is $k(l + \log m)$ with $k \log m$ and $kl$ corresponding to the user's and the servers' communication costs, respectively.

In fact, the two-server basic scheme introduced earlier can be viewed as using the Hadamard code in the LDC framework. The Hadamard code $H(x)$ of an $n$-bit message $x$ has $2^n$ bits. The $k$th bit of $H(x)$ for $k \in \{0, 1, \ldots, 2^n - 1\}$ is defined as follows:

$$H(x)[k] = \bigoplus_{j=1}^{n} x[j]\,k[j]. \tag{17}$$

To retrieve $x[i]$ from the servers, the user first randomly picks an $n$-bit number $k$, and then sends $k$ to server one and $k \oplus e_i$ to server two, where $e_i$ is an $n$-bit number with a single one in the $i$th position. Upon receiving $k$ and $k \oplus e_i$, servers one and two reply with $H(x)[k]$ and $H(x)[k \oplus e_i]$, respectively. The user can then decode $x[i]$ by computing

$$\begin{aligned} H(x)[k] \oplus H(x)[k \oplus e_i] &= \Big(\bigoplus_{j=1,\, j \neq i}^{n} x[j]\,k[j] \oplus x[i]\,k[i]\Big) \oplus \Big(\bigoplus_{j=1,\, j \neq i}^{n} x[j]\,k[j] \oplus x[i]\,(\sim k[i])\Big) \\ &= x[i]\,\big(k[i] \oplus \sim k[i]\big) = x[i]. \end{aligned} \tag{18}$$

The symbol $\sim$ denotes negation. This scheme is almost equivalent to the scheme by Chor et al., except that the XORs of all possible selections of bits in $x$ are already contained in the Hadamard code $H(x)$. We mention again that the communication cost of this scheme is $O(n)$ due to the exponential code length of the Hadamard code. Nevertheless, the possibility of using better error-correcting codes in the place of the Hadamard code opens many opportunities for new PIR schemes. PIR schemes based on Reed-Solomon codes and Reed-Muller codes can be found in [16]. The best published result on PIR uses LDC to achieve a communication complexity of $O(n^{10^{-7}})$ with three noncolluding servers [25].

All of the above constructions provide PIR under the perfect security model. By making certain computational assumptions, PIR can also achieve sublinear communication complexity with only one database [23, 26]. We briefly review the scheme in [26] as follows: it is based on the assumed hardness of determining whether a number in a finite field
F is a quadratic residue, that is, without knowing the prime factorization of the field size N, it is difficult to compute the following predicate:

$$ \mathrm{QR}(u) = \begin{cases} 1, & \text{if } u = v^2 \text{ for some } v \in F, \\ 0, & \text{otherwise.} \end{cases} \tag{19} $$

It is easy to see that QR(·) is homomorphic under multiplication, that is, QR(xy) = QR(x)QR(y), whenever at least one of x and y is a quadratic residue, which is the only case needed below. The basic principle of using QR to retrieve x[i] is straightforward: the user sends the server n numbers $y_1, \ldots, y_n \in F$, all of them quadratic residues except $y_i$, that is, $\mathrm{QR}(y_j) = 1$ for $j \neq i$ and $\mathrm{QR}(y_i) = 0$. The server then replies with $m \in F$ computed as follows:

$$ m = \prod_{j=1}^{n} w_j, \quad \text{where } w_j = \begin{cases} y_j, & \text{if } x[j] = 0, \\ y_j^2, & \text{if } x[j] = 1. \end{cases} \tag{20} $$

Since all $y_j$'s are quadratic residues except for $y_i$, we have $\mathrm{QR}(w_j) = 1$ for $j \neq i$ and $\mathrm{QR}(w_i) = x[i]$. Combining this with the homomorphic property, we get the desired result $\mathrm{QR}(m) = \mathrm{QR}(w_i) = x[i]$. This scheme, however, is very wasteful as the user needs to send n log N bits. We can improve this by rearranging x as an s × t matrix M with $s = n^{(L-1)/L}$ and $t = n^{1/L}$ for some integer L. Assume that x[i] is the entry at the ath row and the bth column of M. The user then sends the server $y_j$, for j = 1, 2, ..., t, all quadratic residues except for $y_b$. The communication for this step is $O(n^{1/L})$. Using these t numbers, the server carries out a computation similar to (20) for each row of M, resulting in $m_k$ for k = 1, 2, ..., s. Of all the $m_k$'s, all the user needs is $m_a$ from the ath row, because it is sufficient to retrieve x[i] as $\mathrm{QR}(m_a) = x[i]$. Since each of the $m_k$ is a log N-bit number, this is equivalent to carrying out the PIR procedure log N times, but this time the database size shrinks from n to $s = n^{(L-1)/L}$. This observation allows the same procedure to be applied recursively with exponentially decreasing communication cost. As a result, the communication is dominated by the first step, which is $O(n^{1/L})$, and we can make L as big as we want. Subsequent work by Cachin et al. showed that the communication cost can be further reduced to polylogarithmic complexity [23].

5.3. Practical applications of SMC

While the theoretical studies of SMC have advanced significantly in recent years, developing practical applications using SMC has been slow. The data mining community was the first to introduce SMC into practical usage. The goal is to compute aggregate statistics over private data stored in distributed databases. Using the OT protocol as the core, different SMC protocols have been developed to construct linear algebra routines [27], median computation [13], decision trees [17], neural networks [19], and others. Even though these algorithms provide innovative implementations for many data mining schemes, their security relies on modular arithmetic operations on very large integers, which are computationally intensive. In a recent study on PIR, the authors of [28] showed that even with the most advanced CPUs, the modular arithmetic in the SMC protocol requires more time than simply sending the entire database through a typical broadband connection.

Figure 2: Original signal and least-square estimates in secure inner product (curves: original signal, P1's estimate, P2's estimate).

While an algorithm in a typical data mining application may need to handle millions of records on a daily basis, a real-time signal processing algorithm needs to handle millions of samples within milliseconds. Very efficient algorithms have recently been developed at the expense of privacy. The pioneering work by Avidan and Butman showed the feasibility of building a secure distributed face detector [20]. While keeping OT as the core, they provide an efficient implementation based on the assumption that certain visual features used in the detector are noninvertible and therefore do not leak important information about the images.

Another noteworthy scheme is a collection of statistical routines, developed in [18], that use linear subspace projection for privacy protection. We illustrate the idea with a simple inner product computation. Assume that two parties, P1 and P2, have n-dimensional vectors $x_1$ and $x_2$, respectively. They both know an invertible matrix M and its inverse $M^{-1}$. M is broken down into top and bottom halves $T \in \mathbb{R}^{n/2 \times n}$ and $B \in \mathbb{R}^{(n - n/2) \times n}$, while $M^{-1}$ is broken down into left and right halves $L \in \mathbb{R}^{n \times n/2}$ and $R \in \mathbb{R}^{n \times (n - n/2)}$. The inner product $x_1^T x_2$ can then be decomposed as follows:

$$ x_1^T x_2 = x_1^T M^{-1} M x_2 = x_1^T L T x_2 + x_1^T R B x_2. \tag{21} $$

P1 then sends $x_1^T R$ to P2, who computes $x_1^T R B x_2$, while P2 sends $T x_2$ to P1 so that she can compute $x_1^T L T x_2$. P2 can then send his scalar to P1 or vice versa to obtain the final answer. They cannot recover each other's data, as the transmitted data $x_1^T R$ and $T x_2$ are both n/2-dimensional vectors. Using a randomly generated M and $x_1 = x_2$, Figure 2 shows the least-square estimates by both parties based on the received data. Following a similar approach, we have also developed secure two-party routines for linear filtering [21] and thresholding
[22]. Even though all of the above algorithms are computationally very efficient, they all leak private information to a certain degree and thus may not be suitable for applications that demand the utmost privacy and security.

6. CONCLUSIONS

In this article, we have briefly reviewed the foundation of SMC protocols and some of the latest developments. As we do not assume any background in cryptography, we focus on the intuition rather than the rigorous treatment of the subject. Serious readers should consult the comprehensive text of [8] and the collection of papers at specialized bibliography sites [29, 30]. As the demand for secure and privacy-enhancing applications is rapidly growing, we believe that it is a great opportunity for researchers in diverse areas outside of cryptography to understand the concepts of SMC and to develop practical SMC protocols for their respective applications.

ACKNOWLEDGMENT

The authors would like to thank the anonymous reviewers for their constructive comments.

REFERENCES

[1] Trusted Computing Group, “TCG Specification Architecture Overview,” April 2004, https://www.trustedcomputinggroup.org.
[2] R. Anderson, “Trusted Computing Frequently Asked Questions,” August 2003, http://www.cl.cam.ac.uk/~rja14/tcpa-faq.html.
[3] A. C. Yao, “Protocols for secure computations,” in Proceedings of the 23rd Annual IEEE Symposium on Foundations of Computer Science, pp. 160–164, Chicago, Ill, USA, November 1982.
[4] A. Shamir, “How to share a secret,” Communications of the ACM, vol. 22, no. 11, pp. 612–613, 1979.
[5] M. Ben-Or, S. Goldwasser, and A. Wigderson, “Completeness theorems for non-cryptographic fault tolerant distributed computation,” in Proceedings of the 20th ACM Symposium on the Theory of Computing, pp. 1–10, Chicago, Ill, USA, May 1988.
[6] T. Rabin and M. Ben-Or, “Verifiable secret sharing and multiparty protocols with honest majority,” in Proceedings of the 21st Annual ACM Symposium on Theory of Computing, pp. 73–85, Seattle, Wash, USA, May 1989.
[7] S. Goldwasser and M. Bellare, Lecture Notes on Cryptography, Massachusetts Institute of Technology, Cambridge, Mass, USA, 2001.
[8] O. Goldreich, Foundations of Cryptography: Volume II Basic Applications, Cambridge University Press, Cambridge, Mass, USA, 2004.
[9] M. Naor and B. Pinkas, “Oblivious transfer and polynomial evaluation,” in Proceedings of the Annual ACM Symposium on Theory of Computing, pp. 245–254, Atlanta, Ga, USA, 1999.
[10] M. Naor and B. Pinkas, “Efficient oblivious transfer protocols,” in Proceedings of the SIAM Symposium on Discrete Algorithms (SODA ’01), pp. 448–457, Washington, DC, USA, 2001.
[11] C. Cachin, J. Camenisch, J. Kilian, and J. Muller, “One-round secure computation and secure autonomous mobile agents,” in Proceedings of the 27th International Colloquium on Automata, Languages and Programming, pp. 512–523, Geneva, Switzerland, July 2000.
[12] M. Naor and K. Nissim, “Communication complexity and secure function evaluation,” Electronic Colloquium on Computational Complexity, vol. 8, no. 62, 2001.
[13] G. Aggarwal, N. Mishra, and B. Pinkas, “Secure computation of the kth-ranked element,” in Proceedings of Advances in Cryptology International Conference on the Theory and Applications of Cryptographic Techniques (EUROCRYPT ’04), vol. 3027 of Lecture Notes in Computer Science, pp. 40–55, 2004.
[14] D. Boneh, E.-J. Goh, and K. Nissim, “Evaluating 2-DNF formulas on ciphertexts,” in Proceedings of Theory of Cryptography Conference 2005, vol. 3378 of Lecture Notes in Computer Science, pp. 325–341, Cambridge, Mass, USA, February 2005.
[15] W. Gasarch, “A survey on private information retrieval,” The Bulletin of the EATCS, vol. 82, pp. 72–107, 2004.
[16] L. Trevisan, “Some applications of coding theory in computational complexity,” Quaderni di Matematica, vol. 13, pp. 347–424, 2004.
[17] Y. Lindell and B. Pinkas, “Privacy preserving data mining,” Journal of Cryptology, vol. 15, no. 3, pp. 177–206, 2003.
[18] W. Du, Y. S. Han, and S. Chen, “Privacy-preserving multivariate statistical analysis: linear regression and classification,” in Proceedings of the 4th SIAM International Conference on Data Mining, pp. 222–233, Lake Buena Vista, Fla, USA, April 2004.
[19] Y.-C. Chang and C.-J. Lu, “Oblivious polynomial evaluation and oblivious neural learning,” Theoretical Computer Science, vol. 341, no. 1–3, pp. 39–54, 2005.
[20] S. Avidan and M. Butman, “Blind vision,” in Proceedings of the 9th European Conference on Computer Vision, vol. 3953 of Lecture Notes in Computer Science, pp. 1–13, Graz, Austria, May 2006.
[21] N. Hu and S.-C. Cheung, “Secure image filtering,” in Proceedings of IEEE International Conference on Image Processing (ICIP ’06), Atlanta, Ga, USA, October 2006.
[22] N. Hu and S.-C. Cheung, “A new security model for secure thresholding,” in Proceedings of IEEE International Conference on Acoustic, Speech and Signal Processing (ICASSP ’07), Honolulu, Hawaii, USA, April 2007.
[23] C. Cachin, S. Micali, and M. Stadler, “Computationally private information retrieval with polylogarithmic communication,” in Proceedings of Advances in Cryptology: International Conference on the Theory and Applications of Cryptographic Techniques (EUROCRYPT ’99), vol. 1592, pp. 402–414, 1999.
[24] B. Chor, O. Goldreich, E. Kushilevitz, and M. Sudan, “Private information retrieval,” in Proceedings of the Annual Symposium on Foundations of Computer Science, pp. 41–50, October 1995.
[25] S. Yekhanin, “New locally decodable codes and private information retrieval schemes,” Tech. Rep. 127, Electronic Colloquium on Computational Complexity, 2006.
[26] E. Kushilevitz and R. Ostrovsky, “Replication is not needed: single database, computationally-private information retrieval,” in Proceedings of the Annual Symposium on Foundations of Computer Science, pp. 364–373, Miami Beach, Fla, USA, 1997.
[27] R. Cramer and I. Damgaard, “Secure distributed linear algebra in constant number of rounds,” in Proceedings of the 21st Annual IACR (CRYPTO ’01), vol. 2139 of Lecture Notes in Computer Science, pp. 119–136, Santa Barbara, Calif, USA, August 2001.
[28] R. Sion and B. Carbunar, “On the computational practicality of private information retrieval,” in Proceedings of the 14th ISOC Network and Distributed Systems Security Symposium, San Diego, Calif, USA, February-March 2007.
[29] H. Lipmaa, “Oblivious Transfer or Private Information Retrieval,” University College London, http://www.adastral.ucl.ac.uk/~helger/crypto/link/protocols/oblivious.php.
[30] K. Liu, “Privacy Preserving Data Mining Bibliography,” University of Maryland, Baltimore County, http://www.csee.umbc.edu/~kunliu1/research/privacy review.html.

Hindawi Publishing Corporation
EURASIP Journal on Information Security
Volume 2007, Article ID 78943, 20 pages
doi:10.1155/2007/78943
Review Article Protection and Retrieval of Encrypted Multimedia Content: When Cryptography Meets Signal Processing
Zekeriya Erkin,1 Alessandro Piva,2 Stefan Katzenbeisser,3 R. L. Lagendijk,1 Jamshid Shokrollahi,4 Gregory Neven,5 and Mauro Barni6
1 Electrical Engineering, Mathematics, and Computer Science Faculty, Delft University of Technology, 2628 CD, Delft, The Netherlands
2 Department of Electronics and Telecommunication, University of Florence, 50139 Florence, Italy
3 Information and System Security Group, Philips Research Europe, 5656 AE, Eindhoven, The Netherlands
4 Department of Electrical Engineering and Information Sciences, Ruhr-University Bochum, 44780 Bochum, Germany
5 Department of Electrical Engineering, Katholieke Universiteit Leuven, 3001 Leuven, Belgium
6 Department of Information Engineering, University of Siena, 53100 Siena, Italy
Correspondence should be addressed to Zekeriya Erkin, [email protected]
Received 3 October 2007; Revised 19 December 2007; Accepted 30 December 2007
Recommended by Fernando Pérez-González
The processing and encryption of multimedia content are generally considered sequential and independent operations. In certain multimedia content processing scenarios, it is, however, desirable to carry out processing directly on encrypted signals. The field of secure signal processing poses significant challenges for both signal processing and cryptography research; only few ready-to-go fully integrated solutions are available. This study first concisely summarizes cryptographic primitives used in existing solutions to processing of encrypted signals, and discusses implications of the security requirements on these solutions. The study then continues to describe two domains in which secure signal processing has been taken up as a challenge, namely, analysis and retrieval of multimedia content, as well as multimedia content protection. In each domain, state-of-the-art algorithms are described. Finally, the study discusses the challenges and open issues in the field of secure signal processing.
Copyright © 2007 Zekeriya Erkin et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
1. INTRODUCTION

In the past few years, the processing of encrypted signals has emerged as a new and challenging research field. The combination of cryptographic techniques and signal processing is not new. So far, encryption was always considered as an add-on after signal manipulations had taken place (see Figure 1). For instance, when encrypting compressed multimedia signals such as audio, images, and video, first the multimedia signals were compressed using state-of-the-art compression techniques, and next encryption of the compressed bit stream using a symmetric cryptosystem took place. Consequently, the bit stream must be decrypted before the multimedia signal can be decompressed. An example of this approach is JPSEC, the extension of the JPEG2000 image compression standard. This standard adds selective encryption to JPEG2000 bit streams in order to provide secure scalable streaming and secure transcoding [1].

In several application scenarios, however, it is desirable to carry out signal processing operations directly on encrypted signals. Such an approach is called secure signal processing, encrypted signal processing, or signal processing in the encrypted domain. For instance, given an encrypted image, can we calculate the mean value of the encrypted image pixels? On the one hand, the relevance of carrying out such signal manipulations, that is, the algorithm, directly on encrypted signals is entirely dependent on the security requirements of the application scenario under consideration. On the other hand, the particular implementation of the signal processing algorithm will be determined strongly by the possibilities and impossibilities of the cryptosystem employed. Finally, it is very likely that new requirements for cryptosystems will emerge from secure signal processing operations and applications. Hence, secure signal processing poses a joint challenge for both the signal processing and the cryptographic community.
x(n) -> Process (compress) -> Encrypt -> Channel -> Decrypt -> Process (decompress) -> x(n)
Figure 1: Separate processing and encryption of signals.
The security requirements of signal processing in encrypted domains depend strongly on the considered application. In this survey paper, we take an application-oriented view on secure signal processing and give an overview of published applications in which the secure processing of signal amplitudes plays an important role. In each application, we show how signal processing algorithms and cryptosystems are brought together. It is not the purpose of the paper to describe either the signal processing algorithms or the cryptosystems in great detail, but rather to focus on possibilities, impossibilities, and open issues in combining the two. The paper includes many references to literature that contains more elaborate signal processing algorithms and cryptosystem solutions for the given application scenario. It is also crucial to state that the scenarios in this survey can be implemented more efficiently by using trusted third entities. However, it is not always easy to find trusted entities with high computational power, and even if one is found, it is not certain that it can be applicable in these scenarios. Therefore, the trusted entities either do not exist or have little role in the scenarios discussed in this paper.

In this paper, we will survey applications that directly manipulate encrypted signals. When scanning the literature on secure signal processing, it becomes immediately clear that there are currently two categories under which the secure signal processing applications and research can be roughly classified, namely, content retrieval and content protection. Although the security objectives of these application categories differ quite strongly, similar signal processing considerations and cryptographic approaches show up. The common cryptographic primitives are addressed in Section 2. This section also discusses the need for clearly identifying the security requirements of the signal processing operations in a given scenario. As we will see, many of the approaches for secure signal processing are based on homomorphic encryption, zero-knowledge proof protocols, commitment schemes, and multiparty computation. We will also show that there is ample room for alternative approaches to secure signal processing towards the end of Section 2. Section 3 surveys secure signal processing approaches that can be classified as “content retrieval,” among them secure clustering and recommendation problems. Section 4 discusses problems of content protection, such as secure watermark embedding and detection. Finally, Section 5 concludes this survey paper on secure protection and retrieval of encrypted multimedia content.

2. ENCRYPTION MEETS SIGNAL PROCESSING

2.1. Introduction

The capability to manipulate signals in their encrypted form is largely thanks to two assumptions on the encryption strategies used in all applications discussed. In the first place, encryption is carried out independently on individual signal samples. As a consequence, individual signal samples can be identified in the encrypted version of the signal, allowing for processing of encrypted signals on a sample-by-sample basis. If we represent a one-dimensional (e.g., audio) signal X that consists of M samples as

$$ X = \bigl[ x_1, x_2, x_3, \ldots, x_{M-1}, x_M \bigr]^T, \tag{1} $$

where $x_i$ is the amplitude of the ith signal sample, then the encrypted version of X using key k is given as

$$ E_k(X) = \bigl[ E_k(x_1), E_k(x_2), E_k(x_3), \ldots, E_k(x_{M-1}), E_k(x_M) \bigr]^T. \tag{2} $$

Here the superscript “T” refers to vector transposition. Note that no explicit measures are taken to hide the temporal or spatial structure of the signal; however, the use of sophisticated encryption schemes that are semantically secure (as the one in [2]) achieves this property automatically.

Secondly, only public key cryptosystems are used that have particular homomorphic properties. The homomorphic property that these public key cryptographic systems provide will be concisely discussed in Section 2.2.1. In simple terms, the homomorphic property allows for carrying out additions or multiplications on signal amplitudes in the encrypted domain. Public key systems are based on the intractability of some computationally complex problems, such as

(i) the discrete logarithm in a finite field with a large (prime) number of elements (e.g., ElGamal cryptosystem [3]);
(ii) factoring large composite numbers (e.g., RSA cryptosystem [4]);
(iii) deciding if a number is an nth power in $Z_N$ for large enough composite N (e.g., Paillier cryptosystem [2]).

It is important to realize that public key cryptographic systems operate on very large algebraic structures. This means that signal amplitudes $x_i$ that were originally represented in 8-to-16 bits will require at least 512 or 1024 bits per signal sample in their encrypted form $E_k(x_i)$. This data expansion is usually not emphasized in literature, but it may be an important hurdle for practical applicability of secure signal processing solutions. In some cases, however, several signal samples can be packed into one encrypted value in order to reduce the size of the whole encrypted signal by a linear factor [5].

A characteristic of signal amplitudes $x_i$ is that they are usually within a limited range of values, due to the 8-to-16 bits amplitude representation format of sampled signals. If a deterministic encryption scheme were used, each signal amplitude would always give rise to the same encrypted value, making it easy for an adversary to infer information
Table 1: Some (probabilistic) encryption systems and their homomorphisms.
Encryption system                            f1(·, ·)          f2(·, ·)
Multiplicatively Homomorphic El-Gamal [3]    Multiplication    Multiplication
Additively Homomorphic El-Gamal [13]         Addition          Multiplication
Goldwasser-Micali [14]                       XOR               Multiplication
Benaloh [15]                                 Addition          Multiplication
Naccache-Stern [16]                          Addition          Multiplication
Okamoto-Uchiyama [17]                        Addition          Multiplication
Paillier [2]                                 Addition          Multiplication
Damgård-Jurik [18]                           Addition          Multiplication
about the signal. Consequently, probabilistic encryption has to be used, where each encryption uses a randomization or blinding factor such that even if two signal samples $x_i$ and $x_j$ have the same amplitude, their encrypted values $E_{pk}[x_i]$ and $E_{pk}[x_j]$ will be different. Here, pk refers to the public key used upon encrypting the signal amplitudes. Public key cryptosystems are constructed such that the decryption uses only the private key sk, and that decryption does not need the value of the randomization factor used in the encryption phase. All encryption schemes that achieve the desired strong notion of semantic security are necessarily probabilistic.

Cryptosystems operate on (positive) integer values on finite algebraic structures. Although sampled signal amplitudes are normally represented in 8-to-16 bits (integer) values when they are stored, played, or displayed, intermediate signal processing operations often involve noninteger signal amplitudes. Work-arounds for noninteger signal amplitudes may involve scaling signal amplitudes with constant factors (say factors of 10 to 1000), but the unavoidable successive operations of rounding (quantization) and normalization by division pose significant challenges for being carried out on encrypted signal amplitudes.

In Section 2.2, we first discuss four important cryptographic primitives that are used in many secure signal processing applications, namely, homomorphic encryption, zero-knowledge proof protocols, commitment schemes, and secure multiparty computation. In Section 2.3, we then consider the importance of scrutinizing the security requirements of the signal processing application. It is meaningless to speak about secure signal processing in a particular application if the security requirements are not specified. The security requirements as such will also determine the possibility or impossibility of applying the cryptographic primitives. As we will illustrate by examples (and also in more detail in the following sections), some application scenarios simply cannot be made secure because of the inherent information leakage by the signal processing operation, because of the limitations of the cryptographic primitives to be used, or because of constraints on the number of interactions between parties involved. Finally, in Section 2.4, we briefly discuss the combination of signal encryption and compression using an approach quite different from the ones discussed in Sections 3 and 4, namely, by exploiting the concept of coding with side information. We discuss this approach here to emphasize that although many of the currently existing application scenarios are built on the four cryptographic primitives discussed in Section 2.2, there is ample room for entirely different approaches to secure signal processing.

2.2. Cryptographic primitives

2.2.1. Homomorphic cryptosystems

Many signal processing operations are linear in nature. Linearity implies that multiplying and adding signal amplitudes are important operations. At the heart of many signal processing operations, such as linear filters and correlation evaluations, is the calculation of the inner product between two signals X and Y. If both signals (or segments of the signals) contain M samples, then the inner product is defined as

$$ \langle X, Y \rangle = X^T Y = \bigl[ x_1, x_2, \ldots, x_M \bigr] \cdot \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_M \end{bmatrix} = \sum_{i=1}^{M} x_i y_i. \tag{3} $$

This operation can be carried out directly on an encrypted signal X and plain text signal Y if the encryption system used has the additive homomorphic property, as we will discuss next.

Formally, a “public key” encryption system $E_{pk}(\cdot)$ and its decryption $D_{sk}(\cdot)$ are homomorphic if those two functions are maps between the message group with an operation $f_1(\cdot)$ and the encrypted group with an operation $f_2(\cdot)$, such that if x and y are taken from the message space of the encryption scheme, we have

$$ f_1(x, y) = D_{sk}\bigl( f_2\bigl( E_{pk}(x), E_{pk}(y) \bigr) \bigr). \tag{4} $$

For secure signal processing, multiplicative and additive homomorphisms are important. Table 1 gives an overview of encryption systems with additive or multiplicative homomorphism. Note that those homomorphic operations are applied to a modular domain (i.e., either in a finite field or in a ring $Z_N$); thus, both addition and multiplication are taken modulo some fixed value. For signal processing applications, which usually require integer addition and multiplication, it is thus essential to choose the message space of the encryption scheme large enough so that overflows due to modular arithmetic are avoided when operations on encrypted data are performed.
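As an illustration only, the following Python sketch instantiates relation (4) with a toy version of the Paillier scheme [2] listed in Table 1, for which $f_1$ is addition and $f_2$ is multiplication, and then evaluates the inner product (3) on an encrypted X and a plain text Y. The function names (`keygen`, `encrypt`, `decrypt`, `encrypted_inner_product`), the primes 1009 and 1013, and the common simplification g = N + 1 are assumptions made for this sketch and are not taken from the surveyed papers; the key size is far too small to be secure, and Python 3.8 or later is assumed for the modular inverse via `pow`.

```python
import math
import random

def keygen(p: int, q: int):
    """Build a toy Paillier key pair from two small primes (insecure sizes)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    g = n + 1               # assumed simplification: generator g = N + 1
    mu = pow(lam, -1, n)    # with g = N + 1, the decryption constant is lam^-1 mod N
    return (n, g), (lam, mu)

def encrypt(pk, m: int) -> int:
    """Probabilistic encryption E(m) = g^m * r^N mod N^2 with a fresh random r."""
    n, g = pk
    n2 = n * n
    while True:             # the blinding factor r must be invertible modulo N
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return pow(g, m % n, n2) * pow(r, n, n2) % n2

def decrypt(pk, sk, c: int) -> int:
    """Decryption uses only the private key, not the blinding factor."""
    n, _ = pk
    lam, mu = sk
    u = pow(c, lam, n * n)
    return (u - 1) // n * mu % n

def encrypted_inner_product(pk, enc_x, y):
    """Multiply E(x_i)^(y_i) over all i; the result encrypts sum(x_i * y_i) mod N."""
    n, _ = pk
    n2 = n * n
    acc = 1                 # 1 is a (trivial) encryption of zero
    for c, w in zip(enc_x, y):
        acc = acc * pow(c, w, n2) % n2
    return acc

pk, sk = keygen(1009, 1013)
n = pk[0]

x = [3, 1, 4, 1, 5]         # samples that will be encrypted
y = [2, 7, 1, 8, 2]         # plain text weights
enc_x = [encrypt(pk, v) for v in x]
inner = decrypt(pk, sk, encrypted_inner_product(pk, enc_x, y))
print(inner)                # 3*2 + 1*7 + 4*1 + 1*8 + 5*2 = 35

# Modular wrap-around: the sum (N - 1) + 5 exceeds the message space.
wrapped = decrypt(pk, sk, encrypt(pk, n - 1) * encrypt(pk, 5) % (n * n))
print(wrapped)              # (N - 1 + 5) mod N = 4
```

The last two lines show why the message space has to be chosen large enough: once an intermediate result exceeds N, the decrypted value is the result reduced modulo N rather than the integer result itself.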
Another important consideration is the representation of the individual signal samples. As encryption schemes usually operate in finite modular domains (and all messages to be encrypted must be represented in this domain), a mapping is required which quantizes real-valued signal amplitudes and translates the signal samples of X into a vector of modular numbers. In addition to the requirement that the computations must not overflow, special care must be taken to represent negative samples in a way which is compatible with the homomorphic operation offered by the cryptosystem. For the latter problem, depending on the algebraic structure of the cipher, one may either encode the negative value −x by the modular inverse $x^{-1}$ in the underlying algebra of the message space or avoid negative numbers entirely by using a constant additive shift.

In the context of the above inner product example, we require an additively homomorphic scheme (see Table 1). Hence, $f_1$ is the addition, and $f_2$ is a multiplication:

$$ x + y = D_{sk}\bigl( E_{pk}(x) \cdot E_{pk}(y) \bigr), \tag{5} $$

or, equivalently,

$$ E_{pk}(x + y) = E_{pk}(x) \cdot E_{pk}(y). \tag{6} $$

Note that the latter equation also implies that

$$ E_{pk}(c \cdot x) = E_{pk}(x)^{c} \tag{7} $$

for every integer constant c. Thus, every additively homomorphic cryptosystem also allows one to multiply an encrypted value with a constant available or known as clear text.

The Paillier cryptosystem [2] provides the required homomorphism if both addition and multiplication are considered as modular. The encryption of a message m under a Paillier cryptosystem is defined as

$$ E_{pk}(m) = g^{m} r^{N} \bmod N^2, \tag{8} $$

where N = pq, p and q are large prime numbers, $g \in Z^{*}_{N^2}$ is a generator whose order is a multiple of N, and $r \in Z^{*}_{N}$ is a random number (blinding factor). We then easily see that

$$ \begin{aligned} E_{pk}(x) E_{pk}(y) &= \bigl( g^{x} r_x^{N} \bigr) \bigl( g^{y} r_y^{N} \bigr) \bmod N^2 \\ &= g^{x+y} \bigl( r_x r_y \bigr)^{N} \bmod N^2 \\ &= E_{pk}(x + y). \end{aligned} \tag{9} $$

Applying the additive homomorphic property of the Paillier encryption system, we can evaluate (3) under the assumption that X is an encrypted signal and Y is a plain text signal:

$$ E_{pk}\bigl( \langle X, Y \rangle \bigr) = E_{pk}\Bigl( \sum_{i=1}^{M} x_i y_i \Bigr) = \prod_{i=1}^{M} E_{pk}(x_i y_i) = \prod_{i=1}^{M} E_{pk}(x_i)^{y_i}. \tag{10} $$

Here, we implicitly assume that $x_i$, $y_i$ are represented as integers in the message space of the Paillier cryptosystem, that is, $x_i, y_i \in Z_N$. However, (10) essentially shows that it is possible to compute an inner product directly in case one of the two vectors is encrypted. One takes the encrypted samples $E_{pk}(x_i)$, raises them to the power of $y_i$, and multiplies all obtained values. Obviously, the resulting number itself is also in encrypted form. To carry out further useful signal processing operations on the encrypted result, for instance, to compare it to a threshold, another cryptographic primitive is needed, namely, zero-knowledge proof protocols, which are discussed in the next section.

In this paper, we focus mainly on public-key encryption schemes, as almost all homomorphic encryption schemes belong to this family. The notable exception is the one-time pad (and derived stream ciphers), where messages taken from a finite group are blinded by a sequence of uniformly random group elements. Despite its computationally efficient encryption and decryption processes, the application of a one-time pad usually raises serious problems with regard to key distribution and management. Nevertheless, it may be used to temporarily blind intermediate values in larger communication protocols. Finally, it should be noted that some recent work in cryptography (like searchable encryption [6] and order-preserving encryption [7]) may also yield alternative ways for the encryption of signal samples. However, these approaches have not yet been studied in the context of media encryption.

To conclude this section, we observe that directly computing the inner product of two encrypted signals is not possible since this would require a cryptographic system that has both multiplicative and additive (i.e., algebraic) homomorphism. Recent proposals in that direction like [8, 9] were later proven to be insecure [10, 11]. Therefore, no provably secure cryptographic system with these properties is known to date. The construction of an algebraic privacy homomorphism remains an open problem. Readers can refer to [12] for more details on homomorphic cryptosystems.

2.2.2. Zero-knowledge proof protocols

Zero-knowledge protocols are used to prove a certain statement or condition to a verifier, without revealing any “knowledge” to the verifier except the fact that the assertion is valid [19]. As a simple example, consider the case where the prover Peggy claims to have a way of factorizing large numbers. The verifier Victor will send her a large number and Peggy will send back the factors. Successful factorization of several large integers will decrease Victor’s doubt in the truth of Peggy’s claim. At the same time Victor will learn “no knowledge of the actual factorization method.”

Although simple, the example shows an important property of zero-knowledge protocol proofs, namely, that they are interactive in nature. The interaction should be such that with increasing number of “rounds,” the probability of an adversary to successfully prove an invalid claim decreases significantly. On the other hand, noninteractive protocols (based on the random oracle model) also do exist. A formal definition of interactive and noninteractive proof systems, such as zero-knowledge protocols, falls outside the scope of this paper, but can be found, for instance, in [19].

As an example for a commonly used zero-knowledge proof, consider the proof of knowing the discrete logarithm x of an element y to the base g in a finite field [20]. Having knowledge of the discrete logarithm x is of interest in some applications since if

$$ y = g^{x} \bmod p, \tag{11} $$

then, given p (a large prime number), g, and y, the calculation of the logarithm x is computationally infeasible. If Peggy (the prover) claims she knows the answer (i.e., the value of x), she can convince Victor (the verifier) of this knowledge without revealing the value of x by the following zero-knowledge protocol. Peggy picks a random number $r \in Z_{p-1}$ and computes $t = g^{r} \bmod p$. She then sends t to Victor. He picks a random challenge $c \in Z_{p-1}$ and sends this to Peggy. She computes $s = r - cx \bmod (p - 1)$ and sends this to Victor (exponents are reduced modulo p − 1, the order of the multiplicative group modulo p). He accepts Peggy’s knowledge of x if $g^{s} y^{c} = t \bmod p$, since if Peggy indeed used the correct logarithm x in calculating the value of s, we have

$$ g^{s} y^{c} \bmod p = g^{r - cx} \bigl( g^{x} \bigr)^{c} \bmod p = g^{r} \bmod p = t. \tag{12} $$

In literature, many different zero-knowledge proofs exist. We mention a number of them that are frequently used in secure signal processing:

(i) proof that an encrypted number is nonnegative [21];
(ii) proof that shows that an encrypted number lies in a certain interval [22];
(iii) proof that the prover knows the plaintext x corresponding to the encryption E(x) [23];
(iv) proofs that committed values (see Section 2.2.3) satisfy certain algebraic relations [24].

In zero-knowledge protocols, it is sometimes necessary for the prover to commit to a particular integer or bit value. Commitment schemes are discussed in the next section.

2.2.3. Commitment schemes

[...]

hiding due to the random blinding factor r; furthermore, it is binding unless Alice is able to compute discrete logarithms.

For use in signal processing applications, commitment schemes that are additively homomorphic are of specific importance. As with homomorphic public key encryption schemes, knowledge of two commitments allows one to compute, without opening, a commitment of the sum of the two committed values. For example, the above-mentioned Pedersen commitment satisfies this property: given two commitments $c_1 = g^{m_1} h^{r_1} \bmod p$ and $c_2 = g^{m_2} h^{r_2} \bmod p$ of the numbers $m_1$ and $m_2$, a commitment $c = g^{m_1 + m_2} h^{r_1 + r_2} \bmod p$ of $m_1 + m_2$ can be computed by multiplying the commitments: $c = c_1 c_2 \bmod p$. Note that the commitment c can be opened by providing the values $m_1 + m_2$ and $r_1 + r_2$. Again, the homomorphic property only supports additions. However, there are situations where it is not possible to prove the relation by mere additive homomorphism, as in proving that a committed value is the square of the value of another commitment. In such circumstances, zero-knowledge proofs can be used. In this case, the party which possesses the opening information of the commitments computes a commitment of the desired result, hands it to the other party, and proves in zero-knowledge that the commitment was actually computed in the correct manner. Among others, such zero-knowledge proofs exist for all polynomial relations between committed values [24].

2.2.4. Secure multiparty computation

The goal of secure multiparty computation is to evaluate a public function $f(x^{(1)}, x^{(2)}, \ldots, x^{(m)})$ based on the secret inputs $x^{(i)}$, i = 1, 2, ..., m, of m users, such that the users learn nothing except their own input and the final result. A simple example, called Yao’s Millionaire’s Problem, is the comparison of two (secret) numbers in order to determine if $x^{(1)} > x^{(2)}$. In this case, the parties involved will only learn if their number is the largest, but nothing more than that.
However, there are situations where it is not possi- Peggy indeed used the correct logarithm x in calculating the ble to prove the relation by mere additive homomorphism value of s,wehave as in proving that a committed value is the square of the gs yc mod p = gr−cx gx c mod p = gr = t mod p. (12) value of another commitment. In such circumstances, zero- knowledge proofs can be used. In this case, the party which In literature, many different zero-knowledge proofs exist. possesses the opening information of the commitments com- We mention a number of them that are frequently used in putes a commitment of the desired result, hands it to the secure signal processing: other party, and proves in zero-knowledge that the commit- ment was actually computed in the correct manner. Among (i) proof that an encrypted number is nonnegative [21]; others, such zero-knowledge proofs exist for all polynomial (ii) proof that shows that an encrypted number lies in a relations between committed values [24]. certain interval [22]; (iii) proof that the prover knows the plaintext x corre- sponds to the encryption E(x)[23]; 2.2.4. Secure multiparty computation (iv) proofs that committed values (see Section 2.2.3)satisfy certain algebraic relations [24]. The goal of secure multiparty computation is to evaluate a public function f (x(1), x(2), ..., x(m)) based on the secret in- In zero-knowledge protocols, it is sometimes necessary for puts x(i), i = 1, 2, ..., m of m users, such that the users learn the prover to commit to a particular integer or bit value. nothing except their own input and the final result. A sim- Commitment schemes are discussed in the next section. ple example, called Yao’s Millionaire’s Problem, is the com- parison of two (secret) numbers in order to determine if 2.2.3. Commitment schemes x(1) >x(2). In this case, the parties involved will only learn if their number is the largest, but nothing more than that. 
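The three-message proof of knowledge of a discrete logarithm, checked by (12), can be sketched in Python as follows. This is a toy illustration under stated assumptions: the modulus 2^31 - 1, the base 7, and the function names are chosen for the example, and the exponents r, c, and s are reduced modulo the group order p - 1 so that the check g^s y^c = t holds exactly.

```python
import secrets

P = 2**31 - 1  # toy prime modulus (assumption); far too small for real use
G = 7          # base g; a primitive root modulo P
ORDER = P - 1  # exponents are reduced modulo the order of the group


def keygen():
    """Peggy's secret x and the public value y = g^x mod p, as in (11)."""
    x = secrets.randbelow(ORDER)
    return x, pow(G, x, P)


def commit():
    """Peggy's first message: random r and t = g^r mod p."""
    r = secrets.randbelow(ORDER)
    return r, pow(G, r, P)


def challenge():
    """Victor's random challenge c."""
    return secrets.randbelow(ORDER)


def respond(r, c, x):
    """Peggy's response s = r - c*x, reduced modulo the group order."""
    return (r - c * x) % ORDER


def verify(y, t, c, s):
    """Victor accepts if g^s * y^c = t (mod p), as in (12)."""
    return (pow(G, s, P) * pow(y, c, P)) % P == t
```

A single run gives Victor only the values t, c, and s; since r is fresh and uniformly random in each run, s by itself does not reveal x.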
An integer or bit commitment scheme is a method that al- Thereisalargebodyofliteratureonsecuremultiparty lows Alice to commit to a value while keeping it hidden from computation; for example, it is known [26] that any (com- Bob, and while also preserving Alice’s ability to reveal the putable) function can be evaluated securely in the multi- committed value later to Bob. A useful way to visualize a party setting by using a general circuit-based construction. commitment scheme is to think of Alice as putting the value However, the general constructions usually require a large in a locked box, and giving the box to Bob. The value in the number of interactive rounds and a huge communication box is hidden from Bob, who cannot open the lock (without complexity. For practical applications in the field of dis- the help of Alice), but since Bob has the box, the value in- tributed voting, private bidding and auctions, and private in- side cannot be changed by Alice; hence, Alice is “committed” formation retrieval, dedicated lightweight multiparty proto- to this value. At a later stage, Alice can “open” the box and cols have been developed. An example relevant to signal pro- reveal its content to Bob. cessing application is the multiparty computation known as Commitment schemes can be built in a variety of ways. Bitrep which finds the encryption of each bit in the binary As an example, we review a well-known commitment scheme representation of a number whose encryption under an ad- due to Pedersen [25]. We fix two large primes p and q such ditive homomorphic cryptosystem is given [27]. We refer the that q | (p − 1) and a generator g of the subgroup of order q reader to [28] for an extensive summary of secure multiparty of Z∗. Furthermore, we set h = ga mod p for some random p computations and to [29] for a brief introduction. secret a.Thevaluesp, q, g,andh are the public parameters of the commitment scheme. 
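A minimal Python sketch of the Pedersen commitment and its additive homomorphism is given below. The parameters p = 2039, q = 1019, g = 4 and the function names are hypothetical toy choices for illustration; the sketch also assumes that the secret exponent a is generated by someone other than the committer and then discarded.

```python
import secrets

P, Q = 2039, 1019  # toy primes with Q | (P - 1) (assumption; not secure sizes)
G = 4              # 4 = 2^2 is a quadratic residue, so it has order Q modulo P


def setup():
    """Return h = g^a mod p for a random secret a unknown to the committer."""
    a = 1 + secrets.randbelow(Q - 1)
    return pow(G, a, P)


def commit(m, h):
    """Return (c, r) with c = g^m * h^r mod p and a random blinding factor r."""
    r = secrets.randbelow(Q)
    return (pow(G, m % Q, P) * pow(h, r, P)) % P, r


def verify(c, m, r, h):
    """Check an opening (m, r) against a previously received commitment c."""
    return c == (pow(G, m % Q, P) * pow(h, r % Q, P)) % P


def add_commitments(c1, c2):
    """Multiplying commitments yields a commitment to the sum of the values."""
    return (c1 * c2) % P
```

For example, commitments to 12 and 30 multiply to a commitment that opens to 42 with blinding factor r1 + r2, without either individual commitment being opened.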
To commit to a value m,Alice chooses a random value r ∈ Zq and computes the commit- 2.3. Importance of security requirements ment c = gmhr mod p. To open the commitment, Alice sends m and r to Bob, who verifies that the commitment c received Although the cryptographic primitives that we discussed in previously indeed satisfies c = gmhr mod p. The scheme is the previous section are useful for building secure signal 6 EURASIP Journal on Information Security