Secure Multi Keyword Fuzzy with Semantic Expansion Based Search Over Encrypted Cloud Data
ARFA BAIG
Dept of Computer Science & Engineering, B.N.M Institute of Technology, Bangalore, India
E-mail: [email protected]

Proceedings of 14th IRF International Conference, Bengaluru, India, 31st May 2015, ISBN: 978-93-85465-25-3

Abstract: The advent of cloud computing has eased access to Internet-based computing, and it is commonly used for web servers or development systems where there are security and compliance requirements. Nevertheless, confidential information has to be encrypted to prevent intrusion. To this end, a semantic expansion based multi-keyword fuzzy search provides a solution over encrypted cloud data by using the locality-sensitive hashing technique. This solution returns not only the exactly matched files, but also the files including terms semantically related to the query keyword. In the proposed scheme, fuzzy matching is achieved through algorithmic design rather than by expanding the index files. It also eliminates the need for a predefined dictionary and effectively supports multiple keyword fuzzy search without increasing the index or search complexity. The indexes are formed based on locality-sensitive hashing (LSH), and the result files are returned according to the total relevance score.

Index Terms: Multi keyword fuzzy search, Locality Sensitive Hashing, Secure Semantic Expansion.

I. INTRODUCTION

Cloud computing is a form of computing that depends on sharing computing resources rather than having local servers or personal devices handle applications. Public cloud deployments are commonly used for web servers or development systems where the security and compliance requirements of larger organizations and their customers are not an issue. In cloud computing, scalable and flexible storage and computation resources are provisioned as measured services through the Internet. Cloud computing lets cloud customers enjoy on-demand, high-quality applications and services from a centralized pool of configurable computing resources. This relieves them of the burden of storage management, allows universal data access independent of geographical location, and avoids capital spending on hardware, software, and personnel maintenance.

To reduce the risk of data leakage to cloud service providers, the straightforward way to implement data privacy is to encrypt sensitive data before it is outsourced. Unfortunately, data encryption, if not carried out appropriately, may reduce the efficiency of data utilization. Typically, a user retrieves files of interest via keyword search instead of retrieving all the files; keyword-based search has been used extensively in daily life, e.g. Google plaintext keyword search. Nonetheless, these techniques no longer work once the keywords are encrypted.

Fuzzy search using symmetric encryption has been a challenge, as it was carried out using a single exact keyword only, and the inverted indexes that were also used were not very efficient. To preserve the privacy of the query keyword, cosine similarity measurement has been used for multi-keyword search, but it did not support fuzzy search and also required a predefined dictionary, which lacked scalability and flexibility for modifying and updating the data. These drawbacks create the need for a new technique for multi-keyword fuzzy search.

Thus, from a new perspective, semantic expansion based multi-keyword fuzzy search improves system usability by returning the exactly matched files as well as the files including terms semantically related to the query keyword, which boosts search flexibility.

Fuzzy searching will find a word even if it is misspelled. For instance, a fuzzy search for apple will find appple. Fuzzy searching can be beneficial for searching text that may contain typographical errors. The multi-keyword search scheme eliminates the requirement of a predefined keyword dictionary and accomplishes this through several novel designs based on locality-sensitive hashing, making it secure, efficient and accurate.

II. RELATED WORK

A. Privacy preserving multi-keyword text search in the cloud supporting similarity based ranking

With the increasing popularity of cloud computing, a huge number of documents are outsourced to the cloud for reduced management cost and ease of access. Although encryption helps protect user data confidentiality, it leaves well-functioning yet practically efficient secure search over encrypted data a challenging problem. The paper presents a privacy-preserving multi-keyword text search (MTS) scheme with similarity-based ranking to address this problem. To support multi-keyword search and search result ranking, a search index based on term frequency and the vector space model is proposed, along with the cosine similarity measure to achieve higher search result accuracy. To improve search efficiency, a tree-based index structure and various adaptation methods for the multi-dimensional (MD) algorithm are proposed so that the practical search efficiency is much better than that of linear search. To further enhance search privacy, two secure index schemes are used to meet stringent privacy requirements under strong threat models, i.e., the known ciphertext model and the known background model. Finally, the effectiveness and efficiency of the proposed scheme are demonstrated through extensive experimental evaluation.

B. Public key encryption with keyword search

The problem of searching on data that is encrypted using a public key system is studied. Consider user Bob, who sends email to user Alice encrypted under Alice’s public key. An email gateway wants to test whether the email contains the keyword “urgent” so that it can route the email accordingly. Alice, on the other hand, does not wish to give the gateway the ability to decrypt all her messages. The paper defines and constructs a mechanism that enables Alice to provide a key to the gateway that enables the gateway to test whether the word “urgent” is a keyword in the email without learning anything else about the email. The paper refers to the mechanism as Public Key Encryption with Keyword Search. As another example, consider a mail server that stores various messages publicly encrypted for Alice by others. Using the mechanism, Alice can send the mail server a key that will enable the server to identify all messages containing some specific keyword, but learn nothing else. The paper defines the concept of public key encryption with keyword search and gives several constructions.

C. Approximate Keyword-based Search over Encrypted Cloud Data

To protect their privacy, users have to encrypt their sensitive data before outsourcing it to the cloud. However, traditional encryption schemes are inadequate since they make indexing and searching operations more challenging. Accordingly, searchable encryption systems have been developed to conduct search operations over a set of encrypted data. Unfortunately, these systems only allow their clients to perform an exact search but not an approximate search, an important need for current information retrieval systems. Recently, increased attention has been paid to approximate searchable encryption systems that find keywords matching the submitted queries approximately. This work focuses on constructing a flexible secure index that allows the cloud server to perform the search, which is divided into two steps. The first step finds the candidate list in terms of secure pruning codes. In particular, two methods are developed to construct these pruning codes. The second step uses a semi-honest third party to determine the best matching keyword depending on a secure similarity function. The work aims to reveal as little information as possible to that third party, and its authors hope that developing such a system will enhance the utilization of information retrieval systems and make them more user-friendly.

III. EXISTING SYSTEM

It is desirable to store data on data storage servers such as mail servers and file servers in encrypted form to reduce security and privacy risks. But this generally implies that one has to sacrifice functionality for security. For example, if a client wishes to retrieve only documents containing certain words, it was not previously known how to let the data storage server perform the search and answer the query without loss of data confidentiality. Here, cryptographic schemes for the problem of searching on encrypted data are described, along with proofs of security for the resulting cryptosystems. This technique has a number of essential advantages. It is provably secure: the scheme provides secrecy for encryption, such that the untrusted server cannot learn anything about the plaintext when given only the ciphertext. It provides query isolation for searches, which means that the untrusted server cannot learn anything more about the plaintext than the search result. It provides controlled searching, so that the untrusted server cannot search for an arbitrary word without the user’s authorization. It also supports hidden queries, such that the user may ask the untrusted server to search for a secret word without revealing the word to the server. This scheme did not contain an index; thus, the search operation went through the entire file.

Fig1: Working
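The index-free search of the existing system, in which the server must pass over every stored word, can be illustrated with a minimal Python sketch. This is not the scheme described above: it replaces each word with a deterministic keyed HMAC token, which reveals when two words are equal and therefore does not offer the provable secrecy of the real construction, and it does not encrypt the file contents for later recovery. The function names, the demo key and the file names are all hypothetical and chosen only for illustration.

```python
import hashlib
import hmac


def trapdoor(key: bytes, word: str) -> bytes:
    """Keyed token for a word; only a holder of the key can compute it."""
    return hmac.new(key, word.lower().encode("utf-8"), hashlib.sha256).digest()


def tokenize_document(key: bytes, text: str) -> list[bytes]:
    """Client side: replace every word of a document with its keyed token."""
    return [trapdoor(key, word) for word in text.split()]


def server_scan(stored: dict[str, list[bytes]], token: bytes) -> list[str]:
    """Server side: there is no index, so every token of every file is checked."""
    hits = []
    for name, tokens in stored.items():
        if any(hmac.compare_digest(t, token) for t in tokens):
            hits.append(name)
    return hits


# Hypothetical demo data: the server only ever sees tokens, never plain words.
key = b"hypothetical-demo-key"
stored = {
    "mail1.txt": tokenize_document(key, "urgent meeting at noon"),
    "mail2.txt": tokenize_document(key, "lunch menu for friday"),
}

# The user authorizes one search by handing over the trapdoor for "urgent".
matches = server_scan(stored, trapdoor(key, "urgent"))
```

The sketch mirrors two of the listed properties in a simplified form: the server cannot search for a word of its own choosing without a trapdoor computed under the user's key, and the query word itself is never sent in the clear. The cost of `server_scan` grows with the total number of stored words, which is the drawback that index-based schemes aim to remove.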