Provable Ownership of Encrypted Files in De-Duplication Cloud Storage

Chao Yang†‡, Jianfeng Ma† and Jian Ren‡
†School of CS, Xidian University, Xi'an, Shaanxi, 710071. Email: chaoyang, [email protected]
‡Department of ECE, Michigan State University, East Lansing, MI 48824. Email: chaoyang, [email protected]

Abstract—The rapid adoption of cloud storage services has created an issue that many duplicate copies of files are stored on the remote storage servers, which not only wastes communication bandwidth on duplicated file uploading, but also increases the cost of secure data management. To solve this problem, client-side deduplication was introduced to keep the client from uploading files that already exist on the remote servers. However, the existing scheme was recently found to be vulnerable to security attacks, in that by learning a small piece of information related to the file, such as the hash value of the file, the attacker may be able to gain full access to the entire file; and the confidentiality of the data may be vulnerable to "honest-but-curious" attacks. In this paper, to solve the problems mentioned above, we propose a cryptographically secure and efficient scheme to support cross-user client-side deduplication over encrypted files. Our scheme utilizes the technique of spot checking, in which the client only needs to access small portions of the original file, together with dynamic coefficients, randomly chosen indices of the original file, and a subtle approach to distributing the file encrypting key among clients, to satisfy the security requirements. Our extensive security analysis shows that the proposed scheme can generate provable ownership of the encrypted file (POEF) in the presence of the curious server, and maintain a high detection probability of client misbehavior. Both performance analysis and simulation results demonstrate that our proposed scheme is much more efficient than the existing schemes, especially in reducing the burden on the client.

Index Terms—Cloud storage, Deduplication, Encrypted File, Provable Ownership, Spot-checking

I. INTRODUCTION

With the rapid adoption of Cloud services, a large volume of data is stored at remote servers, so techniques to save disk space and network bandwidth are needed. A key concept in this context is deduplication, in which the server stores only a single copy of a file, regardless of how many clients want to store that file. All clients possessing that file only use the link to the single copy of the file stored at the server. Furthermore, if the server already has a copy of the file, then clients do not have to upload it again to the server, which saves bandwidth as well as storage capacity; this approach is called client-side deduplication [1] and is extensively employed by Dropbox [2] and Wuala [3]. It is reported that business applications can achieve deduplication ratios from 1:10 to as much as 1:500, resulting in disk and bandwidth savings of more than 90% [4].

However, client-side deduplication introduces some new security problems. Harnik et al. found that when a server tells a client that it does not have to upload the file, it means that some other clients have the same file, which could itself be sensitive information [5]. More seriously, Halevi et al. recently found new attacks on the client-side deduplication system [6]. In these attacks, by learning just a small piece of information about the file, namely its hash value, an attacker is able to get the entire file from the server. These attacks are not just theoretical: similar attacks implemented against Dropbox were also recently discovered by Mulazzani et al. [7]. The root cause of all these attacks is that there is a very short piece of information that represents the file, and an attacker that learns it can get access to the entire file.
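To make this root cause concrete, the following is a minimal Python sketch, not taken from the paper, of a naive client-side deduplication service in which the short hash doubles as the access credential; all names (StorageServer, upload, has_file, download) are hypothetical:

import hashlib

class StorageServer:
    """Hypothetical toy deduplicating store; not the paper's protocol."""

    def __init__(self):
        self.files = {}  # hex digest -> file contents

    def upload(self, data: bytes) -> str:
        h = hashlib.sha256(data).hexdigest()
        self.files[h] = data  # only the first uploader actually sends the file
        return h

    def has_file(self, h: str) -> bool:
        # Client-side deduplication check: "already stored, don't upload".
        # A "yes" answer already leaks that some other client has the file [5].
        return h in self.files

    def download(self, h: str) -> bytes:
        # The weakness behind the attacks of [6], [7]: the short digest is
        # the only thing required to obtain the whole file.
        return self.files[h]

server = StorageServer()
digest = server.upload(b"confidential quarterly report")

# An attacker who learns only the 32-byte digest, never the file itself,
# can retrieve the complete contents:
assert server.has_file(digest)
assert server.download(digest) == b"confidential quarterly report"

Any protocol that accepts such a static, precomputable value as proof of ownership lets knowledge of the hash substitute for possession of the file; the POEF approach developed in this paper is designed so that no short, reusable token can play this role.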
ii) If users’ data Provable Ownership, Spot-checking are encrypted on client side and the encrypting key is kept away from cloud storage server, then there will be no such I. INTRODUCTION failure of privacy protection of sensitive data, even if cloud With the rapid adoption of Cloud services, a large volume storage server made such mistakes or was hacked in. of data is stored at remote servers, so techniques to save disk However, the second kind of straightforward client side space and network bandwidth are needed. A key concept in encryption with randomly chosen encrypting key will stop this context is deduplication, in which the server stores only a deduplication [5]. The reason is twofold: 1) The cloud storage single copy of a file, regardless of how many clients want to server does not possess the original file in plaintext anymore, store that file. All clients possessing that file only use the link so it is hard for server to authenticate whether a new client to the single copy of the file stored at the server. Furthermore, has the proof of ownership of the original file. 2) Encryptions if the server already has a copy of the file, then clients do of the same file by different users with different encrypting not have to upload it again to the server, which will save the keys will result in different ciphertexts, which will prevent bandwidth as well as the storage capacity and is called client- deduplication across multiusers for happening. side deduplication [1] extensively employed by Dropbox [2] Recently there are only a few of solutions to these new secu- and Wuala [3]. It is reported that business applications can rity problems metioned above. Mulazzani et al. [7] discovered achieve deduplication ratios from 1:10 to as much as 1:500, and implemented a new attack against Dropbox’s deduplica- resulting in disk and bandwidth savings of more 90% [4]. tion system and proposed a preliminary and simple revisal to However, the client-side deduplication introduces some new the communication protocol of Dropbox; Halevi et al. [6] put security problems. Harnik et al. found that when a server forward the proof-of-ownership (POW) model, in which, based 2 on Merkle Hash Trees and error-control encoding, a client can Computation requirements. The server typically has to han- prove to a server that it indeed has a copy of a file without dle a large number of files concurrently. So the solution should actually uploading it. However, neither of two methods above not impose too much burden on the server, even though it has is able to tackle the problem of the confidentiality of users’ more powerful computation capability. On the other hand, the sensitive data against the cloud storage server. To overcome client has limited storage as well as computation resources, this deficiency, Jia et al. [10] recently proposed a solution and it is the leading actor in the deduplication scenario who to support cross-user client side deduplication proof over has to prove to the server that it possesses the exactly same encrypted data. Actually they proposed a method to distribute file already stored at the server. So, the design of the solution a randomly chosen per-file encrypting key to all owners of should pay more attention to reducing the burden on the client the same file, and combined it with the POW proof method in terms of the computation and storage resources and, at the [6] to form a new scheme. 
Recently, there have been only a few solutions to the new security problems mentioned above. Mulazzani et al. [7] discovered and implemented a new attack against Dropbox's deduplication system and proposed a preliminary and simple revision to the communication protocol of Dropbox; Halevi et al. [6] put forward the proof-of-ownership (POW) model, in which, based on Merkle Hash Trees and error-control encoding, a client can prove to a server that it indeed has a copy of a file without actually uploading it. However, neither of the two methods above is able to tackle the problem of the confidentiality of users' sensitive data against the cloud storage server. To overcome this deficiency, Jia et al. [10] recently proposed a solution to support cross-user client-side deduplication proof over encrypted data. Specifically, they proposed a method to distribute a randomly chosen per-file encrypting key to all owners of the same file, and combined it with the POW proof method [6] to form a new scheme. However, this combination makes their scheme inherit the drawbacks of the POW proof method: the scheme cannot guarantee the freshness of the proof in every challenge, and it has to build a Merkle Hash Tree on the original file, which is inherently inefficient. Moreover, their scheme fails to provide enough security protection against key exposure, because it encrypts the file encrypting key only with a static and constant hash value of the original file in all key distribution processes. As a result, the applicability of these schemes in the scenario of client-side deduplication over encrypted data is greatly limited.

In this paper, to solve the problem in the scenario of client-side deduplication over encrypted data mentioned above, we propose a cryptographically secure and efficient solution in which a client proves to the server that it indeed has the encrypted file, which we call Provable Ownership of Encrypted File (POEF). We achieve the efficiency goal by relying on spot checking [20], in which the client only has to access small portions of the original file to generate the proof of possessing the original file correctly, thus greatly reducing the burden of computation on the client and minimizing the I/O between the client and the server. At the same time, by utilizing dynamic coefficients and randomly chosen indices of the original file, our scheme mixes the sampled portions of the original file with the dynamic coefficients to generate a unique proof in every challenge. Furthermore, our scheme proposes

Computation requirements. The server typically has to handle a large number of files concurrently, so the solution should not impose too much burden on the server, even though it has more powerful computation capability. On the other hand, the client has limited storage as well as computation resources, and it is the leading actor in the deduplication scenario: it has to prove to the server that it possesses exactly the same file already stored at the server. So the design of the solution should pay more attention to reducing the burden on the client in terms of computation and storage resources and, at the same time, keep the burden on the server at a relatively low level.

Security requirements. Although the server has only the encrypted data without the file encrypting key, the solution must insist that the verification and proof be based on the availability of the original data in its original form, instead of any stored message authentication code (MAC) or previously used verification results. Moreover, the requested parts of the original file should be randomly chosen every time, and the generated proof should be totally different in each challenge, so that it is infeasible for anybody to forge or prepare the proof in advance in order to satisfy the verification challenge. Furthermore, the file encrypting key should be encrypted with fresh and different keys in every process of key distribution between clients, minimizing the risk of key exposure.

B. System Model

A typical network architecture for cloud data storage is illustrated in Figure 1. There are two entities, as follows:

Storage Server. It provides cloud storage service to all kinds of users. Its computation and storage capability (CPU, I/O, network, etc.) is stronger than that of each single user. The Storage Server will maintain the integrity of users' data, regardless of plaintext or ciphertext, and the availability of the cloud service.

Client Users.
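To illustrate how these two entities interact under the security requirements above, the following is a minimal Python sketch of a spot-checking challenge-response between the Storage Server (verifier) and a Client User (prover). It is an abstraction, not the paper's actual construction: the verifier is modeled as holding a reference copy of the stored file, a plain hash stands in for the scheme's cryptographic machinery, and names such as make_challenge and make_proof are hypothetical.

import hashlib
import secrets

BLOCK_SIZE = 4096

def blocks(data: bytes):
    # Split the file into fixed-size blocks.
    return [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]

def make_challenge(num_blocks: int, samples: int = 8):
    """Fresh challenge: random block indices plus per-challenge coefficients."""
    indices = [secrets.randbelow(num_blocks) for _ in range(samples)]
    coefficients = [secrets.token_bytes(16) for _ in range(samples)]
    return indices, coefficients

def make_proof(data: bytes, indices, coefficients) -> bytes:
    """Mix only the sampled portions of the file with the dynamic coefficients."""
    bs = blocks(data)
    h = hashlib.sha256()
    for i, c in zip(indices, coefficients):
        h.update(c)      # fresh coefficient binds the proof to this challenge
        h.update(bs[i])  # sampled portion of the file
    return h.digest()

file_data = secrets.token_bytes(20 * BLOCK_SIZE)  # stand-in for the stored file

# The verifier issues a fresh challenge; the prover touches only `samples`
# blocks of the file, keeping client computation and I/O small:
idx, coef = make_challenge(len(blocks(file_data)))
proof = make_proof(file_data, idx, coef)
assert proof == make_proof(file_data, idx, coef)  # verifier recomputes, accepts

# A proof recorded earlier cannot satisfy the next challenge, since both the
# indices and the coefficients change every time:
idx2, coef2 = make_challenge(len(blocks(file_data)))
assert proof != make_proof(file_data, idx2, coef2)

Because both the indices and the coefficients are drawn fresh for every challenge, a recorded proof is useless in the next round; this is precisely the freshness property that, as argued above, the POW-based schemes fail to guarantee.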