J.K. Periasamy et al., International Journal of Advanced Engineering Technology E-ISSN 0976-3945
Research Paper
CLOUD STORAGE AUGMENTATION WITH MULTI USER REPUDIATION AND DATA DE-DUPLICATION
1J.K. Periasamy, 2B. Latha
Address for Correspondence
1Research Scholar, Information and Communication Engineering, Anna University, Chennai & Computer Science and Engineering, Sri Sairam Engineering College, Chennai, Tamilnadu, India.
2Professor and Head, Computer Science and Engineering, Sri Sairam Engineering College, Chennai, Tamilnadu, India.
ABSTRACT
One of the major applications of cloud computing is "cloud storage", where data is stored in virtual cloud servers provided by numerous third parties. De-duplication is a technique established in cloud storage for eliminating duplicate copies of repeated data. Data De-duplication reduces the storage space and increases the effective bandwidth of the server; it is closely related to intelligent data compression and single-instance data storage. To take the complexity out of managing the Information Technology infrastructure, storage outsourcing has become a popular option. The latest techniques for secure and efficient public auditing of dynamic, shared data are still not secure against collusion, that is, an illegal agreement between the cloud storage server and repudiated users in a working cloud storage system. Hence, to prevent the collusion attack of the existing system and to provide effective global auditing and data integrity, group user repudiation is performed based on a commitment to an ordered sequence of values, and the group signature is generated with a secure hash algorithm. The group user data is encrypted using block ciphers and bilinear transformation. This work also introduces a new approach in which each user holds an independent master key for encryption using the convergent key technique and outsources the keys to the cloud. Storage optimization is achieved with the help of a messaging scheme between sender and receiver over the network, which reduces the overheads associated with duplication detection and query processing. The proposed system also uses a binary diff technique to identify the unique data chunks stored in the cloud. Breaches of privacy and leakage of data can be prevented to an acceptable level. The data chunk size is set by the user. Moreover, this work proposes a feasible technique to detect the storage of copyrighted and hazardous content in the cloud.
KEYWORDS— Cloud Computing, Global auditing, Data integrity, User repudiation, Data De-duplication, Hashing, Data Chunks, Bundling, Copyrighted contents
I. INTRODUCTION
Storing data in the cloud has become an integral part of business organizations and other enterprise solutions. Well-known cloud data storage services are provided by Google Drive, Dropbox, Mega, Tresorit, pCloud and OneDrive [12]. Authorization in cloud storage provides the determination of authenticated identity for a specified set of resources. Consolidation of data centers reduces cost significantly. The management of cloud storage introduces new risks such as data unavailability, breaches of privacy, integration across multiple organizations, leakage of data and compliance issues. The security aspect of cloud storage involves protection, privacy, recoverability, reliability and integrity. Effective security in the cloud can be achieved using data encryption services, authorization and management services, access control services, and monitoring and auditing services. Cloud storage is a virtualized infrastructure with instant scalability, elasticity, multi-tenancy and huge resources.
The attack surface increases with storage outsourcing. Data is distributed and stored in different locations, which increases the risk of unauthorized physical access. Data in the cloud is shared among multiple users; hence, large numbers of keys are required for secure storage. The number of networks over which the data travels also increases. During transmission, the risk of data being read can be mitigated by encryption, which protects the data transmitted to the cloud. Outsourced data in the cloud is more exposed to hackers and national security agencies. Sites which permit file sharing can enable piracy and copyright infringement. Reliability and availability depend on the network and the service provider. An essential obligation of the cloud service provider is the security audit.
A. Data De-Duplication Process
Data De-duplication is a specialized data compression technique for eliminating repeated data. It is used to improve storage and network utilization by reducing the number of bytes sent during data transfers. During the De-duplication process, the transferred file is divided into a number of chunks based on a byte size specified dynamically by the user, and the unmatched chunks of data are identified and stored during the analysis process. In the analysis, the remaining chunks are compared with the copies already stored, and whenever a match occurs the matched chunk is replaced with a reference point to the already stored chunk. In a given file the same sequence of bytes may occur numerous times, so both the amount of chunk storage and the time spent can be reduced using this technique. The match frequency is calculated based on the chunk size.
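The chunk-level matching described above can be made concrete with a short sketch. The following Python fragment is only an illustration of the general technique, not the implementation used in this work: it cuts a byte stream into fixed-size chunks (the chunk size is taken as a user-set parameter, as stated above), fingerprints each chunk with SHA-256, stores a chunk only the first time its fingerprint is seen, and keeps an ordered list of references from which the file can be rebuilt.

```python
import hashlib

def deduplicate(data: bytes, chunk_size: int = 4096):
    """Minimal fixed-size chunk de-duplication sketch.

    Returns a chunk store (fingerprint -> unique chunk bytes) and a
    recipe: the ordered list of fingerprints needed to rebuild the file.
    The chunk size is user-configurable, as described in the text.
    """
    store = {}    # fingerprint -> unique chunk (stands in for cloud storage)
    recipe = []   # reference points, one per chunk of the original file

    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        fp = hashlib.sha256(chunk).hexdigest()   # chunk fingerprint
        if fp not in store:                      # unmatched chunk: store it once
            store[fp] = chunk
        recipe.append(fp)                        # matched chunk: keep only a reference
    return store, recipe

def restore(store, recipe) -> bytes:
    """Rebuild the original file by following the reference points."""
    return b"".join(store[fp] for fp in recipe)

if __name__ == "__main__":
    payload = b"ABCD" * 1024 + b"WXYZ" * 1024 + b"ABCD" * 1024
    store, recipe = deduplicate(payload, chunk_size=4096)
    assert restore(store, recipe) == payload
    print(f"{len(recipe)} chunks referenced, {len(store)} stored uniquely")
```

In practice, content-defined (variable-size) chunking is often preferred over the fixed-size chunking shown here, because it keeps duplicate detection stable when bytes are inserted in the middle of a file.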
This system is based on the in-line data De-duplication method, which takes place during data transfer. The transferred data is split into chunks and each chunk is encrypted using convergent encryption, where the encryption key is generated from the plain text of the chunk itself; the checksum generated for each chunk is used to name the respective chunk, so that duplicates can be easily recognized.
Once the duplicates are recognized they are truncated, and only the remaining chunks are sent to storage. Implementing this introduces different overheads: (a) every duplication query adds one extra round-trip time of latency, and (b) TCP connections are terminated for the non-duplicate chunks in the original data.
To overcome these shortcomings, a technique is proposed that uses an in-memory filter and the inherent data locality to reduce the frequency of disk lookups. The scattered chunks are bundled together into a single TCP connection; this method is named bundling. Duplicate chunks are usually clustered, and the data locality preserved for already detected chunks is utilized to reduce the number of duplication queries. Moreover, the messaging back scheme helps the sender recognize the duplicate hash values reported by the receiver, so redundant chunk transfers are reduced using the Messaging Back Scheme.
B. Role of Dedup in Cloud
Backing up the data from a cell phone to the cloud is a routine process. In the back end, the service providers need to protect and store massive amounts of data within their data centers. Therefore, Cloud Service Providers (CSPs) have implemented the technology named De-duplication, called Dedup for short. Dedup is used to eliminate duplicate files that occupy storage unnecessarily. With it, CSPs can manage their storage and maintenance easily with utmost security and provide service to the customer at minimum cost.
C. In-line De-duplication
In-line De-duplication is pre-process De-duplication, which removes duplicates before the data reaches permanent storage. A chunk is removed from the file while the file is being transferred, whenever the identified chunk matches a chunk already present in the storage or at the receiver. With this method, the storage required to hold the file after transfer is much less than with post-process De-duplication, where De-duplication is performed periodically after the files, duplicates included, have been stored. This system uses the in-line De-duplication method together with additional techniques such as filters, caching, disk chunks, and the messaging back mechanism, which facilitate the process considerably.
D. Convergent Encryption
Convergent encryption is an encryption method in which the encryption key is generated from the plain text of the file itself. Because identical files produce the same checksum, duplicates are recognized without collision.
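A minimal sketch of convergent encryption is given below, under a few assumptions that go beyond the text: the convergent key is taken as the SHA-256 digest of the chunk's plain text, the cipher is AES-GCM from the `cryptography` package used with a fixed nonce so that identical plain texts always produce identical ciphertexts (which is what makes duplicates recognizable across users), and the chunk is named by the checksum of its ciphertext, in line with the in-line De-duplication method described above.

```python
import hashlib
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

FIXED_NONCE = b"\x00" * 12   # deterministic on purpose: same plain text -> same ciphertext

def convergent_encrypt(chunk: bytes):
    """Encrypt a chunk with a key derived from its own plain text.

    Returns (name, key, ciphertext): `name` is the checksum of the
    ciphertext used to identify the chunk, `key` is the convergent key
    the owner keeps in order to decrypt later.
    """
    key = hashlib.sha256(chunk).digest()            # convergent key from plain text
    ciphertext = AESGCM(key).encrypt(FIXED_NONCE, chunk, None)
    name = hashlib.sha256(ciphertext).hexdigest()   # content-derived chunk name
    return name, key, ciphertext

def convergent_decrypt(key: bytes, ciphertext: bytes) -> bytes:
    return AESGCM(key).decrypt(FIXED_NONCE, ciphertext, None)

if __name__ == "__main__":
    a = convergent_encrypt(b"same chunk of data")
    b = convergent_encrypt(b"same chunk of data")
    assert a == b   # identical plain texts yield identical names and ciphertexts
    assert convergent_decrypt(a[1], a[2]) == b"same chunk of data"
    print("duplicate recognized by name:", a[0][:16], "...")
```

Determinism is also the scheme's main limitation: an attacker who can guess a plain text can confirm the guess from the ciphertext, so convergent encryption only protects data that is hard to predict.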
Blocks previously signed by a revoked user can be re-signed by the cloud with [3] proxy re-signatures. A public or private verifier can then audit the integrity of the shared data without retrieving the entire data set from the cloud database, even if some of the blocks in the shared data have been re-signed by the cloud.
Group signatures are a central cryptographic primitive that lets users anonymously yet accountably sign messages on behalf of the group they belong to. Several efficient constructions with security proofs in the standard model have appeared in recent years. Like standard PKIs, group signatures need a practical revocation system; a new method for scalable revocation, related to the Naor-Naor-Lotspiech (NNL) [4] broadcast encryption framework and interacting easily with techniques for building group signatures in the standard model, has been proposed.
Data De-duplication removes duplicates from the storage in order to optimize it. In post-process De-duplication this is done at a prescribed time interval in the data center, so plenty of storage is initially required. This is unlike the in-line De-duplication process, where duplicates are removed before storing. De-duplication over the network, that is, in-line De-duplication, brings more advantages for storage optimization, but these are achieved at the cost of parameters such as latency, disk lookups and TCP terminations; the existing systems therefore suffer drawbacks such as latency, unnecessary TCP terminations and unwanted disk lookups.
A method was also developed for more protected De-duplication. It introduces a new concept known as 'de-key', in which the master key of the user is shared among the key management service providers, so the master key cannot easily be compromised by hackers. Unfortunately, it is implemented as post-process De-duplication, which needs a larger initial store and may hold duplicate copies until the De-duplication pass runs.
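The idea of sharing the master key among key management service providers can be sketched as follows. This is a simplified, hypothetical illustration rather than the cited 'de-key' construction: it uses an n-of-n XOR split, whereas a deployed scheme would more likely use threshold secret sharing (for example Shamir's scheme) so that the key can still be recovered when some servers are unavailable.

```python
import os
from functools import reduce

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def split_master_key(master_key: bytes, n_servers: int):
    """n-of-n XOR split: no single key management server learns the key."""
    shares = [os.urandom(len(master_key)) for _ in range(n_servers - 1)]
    final_share = reduce(xor_bytes, shares, master_key)   # masks the key with all random shares
    return shares + [final_share]

def recover_master_key(shares):
    """The master key reappears only when every share is combined."""
    return reduce(xor_bytes, shares)

if __name__ == "__main__":
    master_key = os.urandom(32)
    shares = split_master_key(master_key, n_servers=4)   # one share per key management server
    assert recover_master_key(shares) == master_key
    print("master key recovered from", len(shares), "shares")
```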