
2017 Asia Modelling Symposium

Non-Internet Synchronization for Distributed Storages

Anuradha Wickramarachchi, Dulaj Atapattu, Pamoda Wimalasiri, Ravidu Mallawa Arachchi, Gihan Dias
Department of Computer Science and Engineering
University of Moratuwa, Sri Lanka
{anuradha.13, dulaj.atapattu.13, pamoda.13, ravidu_lashan.13, gihan}@cse.mrt.ac.lk

2376-1172/17 $31.00 © 2017 IEEE
DOI 10.1109/AMS.2017.13

Abstract - Synchronization of files has become a major use case in the context of cloud storage. Yet Internet bandwidth usage and the cost of additional cloud usage have become the major bottlenecks of such services. Non-Internet synchronization is the process of performing file synchronization using only moving devices (nomadic devices) to carry the content to be synchronized, i.e. without using the Internet as the communication medium. This enables synchronization to be performed at very low cost and with little Internet bandwidth, yet with a reasonably similar user experience. Furthermore, synchronization of large media over the Internet is often limited by low upload speeds, which slows the entire process. The work presents the use of devices such as computers, mobile phones and other network attached IoT devices that move between synchronization storages to transfer synchronization data and synchronize the storages. The distributed storages run the Linux kernel and perform the synchronization algorithms discussed in the paper. The results demonstrate performance competitive with existing synchronization mechanisms, provided that the user does not need real-time synchronization and nomadic access prevails. Non-Internet Synchronization (NIS) synchronizes storages at minimum cost by utilizing redundant storage that moves frequently between storage locations.

Keywords - Non-Internet synchronization, nomadic computing, personal cloud, reliability

I. INTRODUCTION

Nowadays, synchronization between devices is essential for many applications such as P2P, client-server and cloud-based systems. There are many synchronization implementations, such as Resilio sync [1] and Rsync [2], and other synchronization protocols, such as Pydio [3], for enterprise level entities. Existing synchronization protocols concentrate on the synchronization of devices using the Internet or local networks as the communication medium. This consumes a large amount of network traffic and bandwidth and may take a long period of time. It also increases the cost of Internet usage for those who are using metered connections. Furthermore, the use of Internet bandwidth for synchronization of files keeps the connections congested for a longer period of time, reducing the quality of service for other users of the network as well. However, the requirement is often not the real-time availability of data at another locale, but the replication of content for eventual availability. Thus, using the network for such synchronization would often consume time while causing partial synchronization of content, depletion of network quota and excessive network bandwidth utilization.

This paper presents a novel methodology for synchronization that utilizes moving devices as the carriers of changes for synchronizing storages. This enables devices that often move between storage devices to act as the synchronization medium. For example, consider the synchronization of a home device and an office device (end storage devices) using a device that moves between the two locations (carrier device), such as a mobile phone or personal computer. This paper presents the implementation of this synchronization mechanism, called Non-Internet Synchronization (NIS), to address the aforementioned problems. Section II outlines the related work in the field. Section III describes the architecture of the implemented system. Section IV explains the implementation carried out, and Section V presents the experimental results obtained and their evaluation. Finally, Section VI concludes the paper with the inferences drawn from the results, outlines future work and emphasizes the importance of the research.

II. RELATED WORK

Many studies have been carried out to present varying methods of synchronization of content. Most of the methods focus on using cloud storage to store content in order to synchronize the clients (storage devices). MetaSync [4] presents a secure file synchronization mechanism which intends to provide integrity and confidentiality for data using storage that is treated as untrusted and spans multiple cloud services. This approach leverages resources from multiple cloud service providers. Although this approach increases reliability through redundant resources, the communication overhead remains high. The overall process presented in the work is expensive due to the communication with many cloud services. Also, varying network conditions add further overhead in terms of maintaining consistency.

Younghwan Go et al. [5] present Simba, a data sync service which abstracts data storage and synchronization. The work presents a data model and an API that utilize a unified storage for synchronization. It adopts a tabular storage mechanism, which uses row consistency to ensure consistency of data. The proposed solution embraces a client-server architecture to perform synchronization. The solution targets data centric mobile applications, and its scope is limited to client-server synchronization. Furthermore, the system resolves conflicts by creating a conflicted copy of the files having contradicting changes. However, the platform facilitates synchronization similar to that of systems such as Dropbox and Google Drive. Therefore, issues with the privacy of the unified storage exist. Usage of a unified storage hosted in a cloud environment also requires the Internet to transfer data for synchronization.

Work by Ajay Tanpure et al. [6] presents the use of Rsync over secure HTTP (HTTPS). The work intends to perform client-server synchronization over HTTPS, which enables remote synchronization. This is not available in conventional Rsync. The boundary shift problem exists in systems using Rsync, even though Rsync is considered a powerful protocol for chunk based synchronization [2]. Because of this problem, a simple insertion or deletion of content can change all the chunks that need to be transmitted. This has given rise to the concept of content defined chunking [7], which eliminates the boundary shift problem. Furthermore, even if Rsync performs better in locally connected networks, the communication overhead will be high when the Internet is used, due to the boundary shift problem, which is common.

Two of the most commonly used cloud storages are Google Drive [8] and Dropbox [9]. These services provide free cloud storage and synchronization services to their users. The services include file synchronization, storage backup and public file sharing using sharable links. They also provide conflict resolution and real-time collaboration. Yet the privacy and confidentiality of data are not guaranteed, since Google performs analytics on consumer data [10]. The content is scanned, and Machine Learning algorithms are executed to make shopping suggestions and for spam detection. Dropbox in fact uses personal information such as physical addresses to improve quality of service [...] confidentiality in systems such as Dropbox [13], which uses deduplication.

Work performed by Andri Lareida et al. [14] presents Box2Box, a peer to peer file sharing and synchronization application. The system keeps polling for connections with peers in order to connect and synchronize. The approach utilizes a conflict reporting mechanism where the user decides which version to keep. Furthermore, the application supports Super Peers so that the content of all the peers can be synchronized to a highly available peer. However, the approach consumes as much Internet bandwidth as any other synchronization mechanism would. Furthermore, the system is prone to inconsistencies, and more conflicts occur when the peers are offline and the content changes frequently.

Due to its increased security and scalability, P2P synchronization has become popular lately. Resilio sync [1] is such a synchronization platform, which uses a variant of the BitTorrent protocol for synchronization. Work performed by Zhiyuan Peng et al. [15] evaluates the performance of peer to peer synchronization, taking Resilio sync as a case study. The work concludes that, even though greater speeds are visible in torrent downloads with a larger number of seeders, no such gains can be anticipated in peer to peer synchronization. Thus, the performance is limited by the bandwidth of the peer with the slowest uplink speed. Therefore, the synchronization process is inherently slower than that of a client-server model and takes a longer period of time. A 50 MB file took close to 75 seconds to synchronize [15], giving an average speed of close to 700 KBytes/s.

As discussed, much research has been conducted on file synchronization using different techniques. The majority of the research focuses on efficient file updating, resource utilization for reliability and the security aspects of the process. Yet there has been limited research on how to save bandwidth and Internet usage for file synchronization. Furthermore, research regarding the use of nomadic computing devices for synchronization of content is even more limited.

III. SYSTEM ARCHITECTURE