Privacy‑Preserving User Profile Matching in Social Networks
This document is downloaded from DR-NTU (https://dr.ntu.edu.sg), Nanyang Technological University, Singapore.

Yi, X., Bertino, E., Rao, F.-Y., Lam, K.-Y., Nepal, S., & Bouguettaya, A. (2019). Privacy-preserving user profile matching in social networks. IEEE Transactions on Knowledge and Data Engineering, 32(8), 1572-1585. doi:10.1109/TKDE.2019.2912748

https://hdl.handle.net/10356/143529

© 2018 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. The published version is available at: https://doi.org/10.1109/TKDE.2019.2912748.

Xun Yi, Elisa Bertino, Fang-Yu Rao, Kwok-Yan Lam, Surya Nepal and Athman Bouguettaya

Abstract—In this paper, we consider a scenario where a user queries a user profile database, maintained by a social networking service provider, to identify users whose profiles are similar to the profile specified by the querying user. A typical example of this application is online dating. Recently, an online dating site, Ashley Madison, was hacked, which resulted in the disclosure of a large number of dating user profiles. This data breach has prompted researchers to explore practical privacy protection for user profiles in a social network. In this paper, we propose a privacy-preserving solution for profile matching in social networks by using multiple servers.
Our solution is built on homomorphic encryption and allows a user to find matching users with the help of multiple servers without revealing to anyone the query and the queried user profiles in the clear. Our solution achieves user profile privacy and user query privacy as long as at least one of the multiple servers is honest. Our experiments demonstrate that our solution is practical.

Keywords—User profile matching, data privacy protection, ElGamal encryption, Paillier encryption, homomorphic encryption

X. Yi is with the School of Computer Science and Software Engineering, RMIT University, Melbourne, VIC 3001, Australia.
E. Bertino and F. Y. Rao are with the Department of Computer Science and Cyber Center, Purdue University, West Lafayette, IN 47907.
K. Y. Lam is with the School of Computer Science and Engineering, Nanyang Technological University, Singapore 639798.
S. Nepal is with the Commonwealth Scientific and Industrial Research Organisation (CSIRO), Armidale, NSW 2350, Australia.
A. Bouguettaya is with the School of Information Technologies, The University of Sydney, NSW 2006, Australia.
Manuscript received 2 August 2017.

1 INTRODUCTION

Matching two or more users with related interests is an important and general problem, applicable to a wide range of scenarios including job hunting, friend finding, and dating services. Existing online matching services require participants to trust a third-party server with their preferences. The matching server thus has full knowledge of the users’ preferences, which raises privacy issues, as the server may leak (either intentionally or accidentally) users’ profiles.

When signing up for an online matching service, a user creates a “profile” that others can browse. The user may be asked to reveal age, sex, education, profession, number of children, religion, geographic location, sexual proclivities, drinking behavior, hobbies, income, ethnicity, drug use, home and work addresses, and favorite places. Even after an account is canceled, most online matching sites may retain such information.

Users’ personal information may be re-disclosed not only to prospective matches, but also to advertisers and, ultimately, to data aggregators who use the data for purposes unrelated to online matching and without customer consent. In addition, there are risks such as scammers, sexual predators, and reputational damage that come along with using online matching services.

Many online matching sites take shortcuts with respect to safeguarding the privacy and security of their customers. Often, they use counterintuitive “privacy” settings, and their data management systems have serious security flaws.

In July 2015, “The Impact Team” group stole user data from Ashley Madison, a commercial website billed as enabling extramarital affairs. The group then threatened to release users’ names and personally identifying information if Ashley Madison was not immediately shut down. On 18 and 20 August 2015, the group leaked more than 25 gigabytes of company data, including user details. Because of the site’s policy of not deleting users’ personal information, including real names, home addresses, search history and credit card transaction records, many users feared being publicly shamed. On 24 August 2015, Toronto police announced that two unconfirmed suicides had been linked to that data breach.

Such a data breach has raised growing concerns amongst users about the dangers of giving out too much personal information. Users of these services also need to be aware of data theft. A main challenge is thus how to protect the privacy of user profiles in social networks. So far, the best solution is encryption, i.e., users encrypt their profiles before uploading them onto social networks. However, when user profiles are encrypted, it is challenging to match users with similar profiles.

In this paper, we consider a scenario where a user queries a user profile database, maintained by a social networking service provider, to find users whose profiles are similar to the profile specified by the querying user. A typical example of this application is online dating. We give a privacy-preserving solution for user profile matching in social networks by using multiple servers.

Our basic idea can be summarized as follows. Before uploading his/her profile to a social network, each user encrypts the profile by a homomorphic encryption scheme with the common encryption key. Therefore, even if the user profile database falls into the hands of a hacker, the hacker can only get the encrypted data. When a user wishes to find people in the social network, the user encrypts his/her preferred user profile and a dissimilarity threshold and submits the query to the social networking service provider. Based on the query, multiple servers, which secretly share the decryption key, compare the preferred user profile with each record in the database. If the dissimilarity is less than the threshold, the matching user’s contact information is returned to the querying user.

Our main contributions include:
1) We formally define the user profile matching model, the user profile privacy and the user query privacy.
2) We give a solution for privacy-preserving user profile matching for a single dissimilarity threshold and then extend it for multiple dissimilarity thresholds.
3) We perform security analysis on our protocols. If at least one of multiple servers is honest, our protocols achieve user profile privacy and user query privacy.
4) We conduct extensive experiments on a real dataset to evaluate the performance of our proposed protocols under different parameter settings. Experiments show that our solutions are practical and efficient.

This paper extends our previous work [32] as follows.
1) In our previous work, we used a variant of the ElGamal encryption scheme [33] as the underlying homomorphic encryption scheme, which assumes the two prime factors of the modulus are public parameters. Rao [22] has found a security flaw in the encryption scheme, that is, an attacker may decrypt the ciphertexts without the decryption key. In this paper, we fix the security flaw by keeping the factorization of the modulus secret.
2) In our previous work, the user profile data is shared by

$P_1$ inputs $X = \{x_1, x_2, \cdots, x_n\}$ and the other party $P_2$ inputs $Y = \{y_1, y_2, \cdots, y_n\}$, one party learns $X \cap Y$ and nothing else and the other party learns nothing. Their solution is based on commutative encryption with the property $E_{k_1}(E_{k_2}(x)) = E_{k_2}(E_{k_1}(x))$, where $k_1$, $k_2$ are known to $P_1$ and $P_2$, respectively. The idea is: $P_1$ ($P_2$) encrypts its inputs $X$ ($Y$) with its key $k_1$ ($k_2$), denoted as $E_{k_1}(X)$ ($E_{k_2}(Y)$), and exchanges them. Next, $P_1$ sends a pair $(E_{k_2}(Y), E_{k_1}(E_{k_2}(Y)))$ to $P_2$, which computes $E_{k_2}(E_{k_1}(X))$, compares it with $(E_{k_2}(Y), E_{k_1}(E_{k_2}(Y)))$ and decrypts $E_{k_2}(y)$ if $E_{k_2}(E_{k_1}(x)) = E_{k_1}(E_{k_2}(y))$. Then Vaidya et al. [28] extended this solution to the $n$-party setting. Arb et al. [2] applied this idea to detect friend-of-friend in MSN.

In 2004, Freedman et al. [9] gave a solution for the private two-party set-intersection problem based on polynomial evaluation. The idea is: $P_1$ defines a polynomial $P(y) = (x_1 - y)(x_2 - y) \cdots (x_n - y)$ and sends to $P_2$ homomorphic encryptions of the coefficients of this polynomial. $P_2$ uses the homomorphic properties of the encryption system to evaluate the polynomial at each of his inputs and then multiplies each result by a fresh random number $r$ to get an intermediate result, and adds to it an encryption of the value of his input, i.e., $P_2$ computes $E(rP(y) + y)$. Therefore, for each of the elements in the intersection of the two parties’ inputs, the result of this computation is the value of the corresponding element, whereas for all other values the result is random. In 2005, Kissner and Song [19] gave an improved solution that enables set-intersection, cardinality set-intersection, and over-threshold set-union operations on multisets. Later, Sang et al. [23], Ye et al. [31] and Dachman-Soled et al.