Content Synchronization Applications for Aakash Tablets and Remote Educational Institutions

M.Tech. Stage II Report

Submitted in partial fulfillment of the requirements for the degree of

Master of Technology

by

Debashee Tarai Roll No : 113050078

under the guidance of

Prof. D. B. Phatak

Department of Computer Science and Engineering Indian Institute of Technology, Bombay

June 2013

Abstract

The MHRD-sponsored project “Aakash”, the world’s cheapest tablet, aims to make e-learning the primary mode of learning in India. Being much cheaper than a mobile phone while offering a bigger screen, and lighter than a laptop, the tablet is the most suitable medium for e-learning in education. Aakash Android tablets, which run a Linux-based operating system, are currently being tested at IIT Bombay, where new applications for them are also being developed. This project consists of two individual applications, the Local Synchronization Application and the Remote Synchronization Application.

The Remote Content Synchronization application aims at distributing content in a network-efficient manner to remote institutions at different geographical locations. It makes use of the RSYNC tool for content transfer and an RSA public-private key pair to make the application more secure and faster. This report describes the architecture and implementation details and analyzes the performance of the remote synchronization application. Several tests have been conducted, and analysis of the results shows that the performance of the Remote Synchronization application is nearly three times better than a conventional scp-based transfer and that the RSA key pair reduces the total transfer time to a great extent.

The Local Content Synchronization application has been developed for Aakash tablets in order to synchronize content between students and teachers, providing users with an interface to submit files, download files and view the details of all submissions and sharing in real time. It offers the flexibility of storing files in a user-specified path and of configuring the IP address as required. The work has been extended with the Moodle database so that the Student–Course–Teacher mapping can be done directly on the existing data.

Acknowledgments

Foremost, I would like to express my sincere gratitude to my advisor, Prof. D. B. Phatak, for selecting me as a part of his valuable project, for motivating me to do better and for spending so much time instructing and encouraging me with insightful comments on my project work, in spite of his busy schedule.

My sincere thanks also go to Prof. Sridhar Iyer, who was my internal examiner in the Stage 1 presentation, for his valuable suggestions and for describing the appropriate approach for my project, guiding me along the right path.

Furthermore, I would like to express my gratitude to Mr. Nagesh Karmali and Ms. Firuza Aibara for their guidance and help, from the beginning of the project to the writing of reports. I would also like to express my appreciation to all my friends working on several Android-related projects, who cooperated with and guided me in completing this project.

I thank my fellow labmates for their discussions, suggestions and encouragement.

Last but not least, I would like to thank my family: my mother, Nalini, my brother Bharadwaj and my friend Gopalakrishnan, for their unconditional support, encouragement and care throughout my life. Without their support, it would have been impossible for me to finish my master's education seamlessly.

Contents

1 Introduction
1.1 Synchronization of data and content
1.2 Application for Content Synchronization
1.3 RSA Key based protection
1.4 Motivation

2 Literature Review
2.1 Survey on Various synchronization tools
2.1.1 RSYNC as sync. tool
2.1.2 Study on DropBox as sync. software
2.1.3 SyncBreeze File Synchronization Tool
2.2 Study on E-learning through Mobile Devices

3 Android Application Architecture Framework
3.1 Introduction and overview of Android
3.2 A brief description on Dalvik VM
3.3 Android Application Framework
3.4 Android Application Functionalities

4 Proposed Architecture
4.1 Push-Pull Framework for Content Forwarding
4.1.1 Content Forwarding in Local Synchronization Application
4.1.2 Content Forwarding in Remote Synchronization Application
4.2 Detail Architectures of Local and Remote Synchronization Application
4.2.1 Local Synchronization Application
4.2.2 Remote Synchronization Application

5 Implementation
5.1 Implementation details of Remote Synchronization
5.2 Implementation of Local Synchronization Application

6 Experiments and Analysis of Results

7 Conclusion and Future Work

List of Figures

1.1 Content Synchronization Scenario

2.1 Rsync Algorithm Description with an example
2.2 Rsync’s network protocol

3.1 Android System Architecture
3.2 Life cycle of Activity
3.3 Android Application Framework
3.4 Application Architecture

4.1 Pull Mechanism
4.2 Push Mechanism
4.3 Local Sync. Content Frwd Mechanism
4.4 Remote Sync Content Frwd Mechanism
4.5 Overview of Local Synchronization
4.6 Update File: Details Steps
4.7 Submit File: Detail steps
4.8 Global view of Remote Sync
4.9 Flow Chart for Remote Synchronization Application

5.1 The main page
5.2 Page containing options for faculty
5.3 Faculty Registration
5.4 Faculty adding courses
5.5 Modifying existing Fields
5.6 Login to View and Download the shared files
5.7 Login to View and Download the shared files
5.8 CASE 1 : Error Message when invalid inputs
5.9 CASE 2 : Error Message when Empty fields entered
5.10 CASE 3 : Success Message when values Updated
5.11 CASE 4 : Success Message when user Registered
5.12 Login screen
5.13 Setting Screen
5.14 User Validation from Server’s Database
5.15 Main Screen
5.16 Screen after Clicking Submit button
5.17 Screen showing File Browser with UP button
5.18 Screen after selecting the file to send file
5.19 Page Showing several Options for downloading appropriate file
5.20 Page Displaying Last Transactions of the logged in User

6.1 Results of various size of file transfer to various user sets
6.2 Results of Retransmission time taken for the user sets of 75 and 100
6.3 Comparison of File Transfer time for various User sets
6.4 File transfer via SCP vs Sync Application
6.5 Retransmission time comparison
6.6 Performance Comparison between Application, Rsync and SCP

Chapter 1

Introduction

E-learning has the potential to support and enhance the traditional learning system, and it has already become an integral part of learning. It makes use of the Internet to improve the quality of learning by providing access to resources as well as educational content and services from remote organizations in a collaborative manner[1],[2]. In an educational environment, video lectures, PDFs, presentations, ebooks and study materials are typical examples of content that needs to be shared. An improved learning process needs techniques and tools to present, interact with and share knowledge from a variety of resources with others[3]. This is possible if there is an automatic mechanism to share and distribute content between teachers and students inside an institution as well as between institutions[4]. Content synchronization is a data distribution technology in which selected content is automatically delivered from senders to receivers in real time or at prescribed intervals[5].

1.1 Synchronization of data and content

Synchronization vs Backup[6],[7]

Synchronization is a bidirectional process. It ensures that the two parties involved in the process remain synchronized with each other while users may change files on both sides. This is achieved by copying changes that have been made on one side to the other side and vice versa. As a result, both parties involved in synchronization end up holding a mirror copy of the shared content. Backup, on the other hand, is one-way synchronization: changes that have been made on one side are propagated to the other side but not vice versa. In this case the receivers are less interactive and receive only the content forwarded by the sender.

An educational environment needs both synchronization and backup mechanisms, depending on the situation in which they are used. Figure 1.1 represents a generalized scenario for the local and remote synchronization processes.

Content synchronization means keeping data on two or more machines up to date so that each shared repository contains identical information, which is essential for interactive student-teacher communication inside an institution. The backup process, on the other hand, is needed to share content between institutions, where the process is somewhat less interactive and one-to-many communication takes place, transferring a huge volume of data.

Figure 1.1: Content Synchronization Scenario

Several important points in this context are as follows:

• File synchronization can be implemented in many forms and can address several shortcomings found in other content protection schemes.

• Synchronization implies an automatic process of updating data on all connected devices. In the presence of a distributed database, the ability to keep remote machines synchronized by routinely copying the whole content, or subsets of it, to them over the network is called content synchronization among remote machines.

• Synchronization of data and content offers the ability to preserve multiple copies of files across various educational and IT organizations, helping to keep updated files available to students, teachers and employees under any circumstance.

• Regardless of whether the files are spreadsheets, documents, e-mail or other informational elements, the classes and businesses that depend on them cannot operate without ready access to the data in those files, which creates a real-time need for information and data synchronization in day-to-day life.

The working principles of backup and synchronization are based on one-way and two-way content propagation respectively. A brief description of each is as follows:

One-way synchronization

This is used in the backup process. All changes made in the source directory are propagated to the destination directory, but any modification made to the destination directory has no effect on the source directory. In short, the sender keeps updating the receiver's content whenever it finds any new or modified content.

Two-way synchronization

Newly created and modified files are synchronized in both directions. Changes made at one location are reflected in all the other locations. As a result, after the file synchronization process is over, both locations contain identical data. It typically makes use of a central server that sends the latest copies of data to all required machines. The server keeps watching the clients. When any change is made at a client, the server "pulls" the data, finds the list of destinations, creates a backup and "pushes" the new content to all the destinations.
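The difference between the two modes can be illustrated with a minimal sketch. It assumes, purely for illustration, that each side is represented as a dictionary mapping a file name to a (modification time, content) pair and that the newer modification time wins; the names `one_way_sync` and `two_way_sync` are hypothetical and are not part of the applications described in this report.

```python
def one_way_sync(source, destination):
    """Backup: copy new or newer files from source to destination only."""
    for name, (mtime, data) in source.items():
        if name not in destination or destination[name][0] < mtime:
            destination[name] = (mtime, data)


def two_way_sync(side_a, side_b):
    """Synchronization: propagate the newer version of each file both ways."""
    one_way_sync(side_a, side_b)
    one_way_sync(side_b, side_a)


# Hypothetical repositories: file name -> (modification time, content)
teacher = {"notes.pdf": (5, b"v2"), "quiz.txt": (1, b"q1")}
student = {"notes.pdf": (3, b"v1"), "answer.txt": (4, b"a1")}
backup = {}

one_way_sync(teacher, backup)   # backup mirrors the teacher; teacher is unchanged
two_way_sync(teacher, student)  # both sides end up with identical content
```

After the one-way step only the destination has changed, whereas after the two-way step the teacher also holds the student's file. Conflict handling (both sides changing the same file) is deliberately left out of this sketch.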

1.2 Application for Content Synchronization

The need to share files in a peer-to-peer environment keeps growing, which increases the challenge of providing environments that enable users to share files efficiently, especially in environments with large numbers of users and sharing relationships. A content synchronization application is a file sharing environment that operates on the basis of a database containing the details of all the clients connected to it. When a client shares a version of a file, it is compared against the existing content to identify what is already present and to find the newly added parts, which are then forwarded to the list of receivers connected in the low-latency network[8]. The application provides an interface for users to provide, change and update their details and to view the list of all transactions made with the client in a client-specific manner. One of the most important factors is the transfer of large volumes of data over the network to remote computers[9]. The detailed design process is described in Chapter 4 and the detailed implementation process in Chapter 5.

Local Synchronization Application

This application has been developed for Aakash tablets[10] and provides an interactive environment to the users. Users can submit files, update files or get details of previous transactions made by them or with them. The clients run on Aakash Android tablets. The Local Synchronization Application has been extended to use the database of Moodle, which is used all over the world in educational institutions, so that the Student–Course–Teacher mapping can be done directly on the existing data. For any single transaction, the Android application on the tablet sends the selected file to the server; the server finds the destination list, i.e. the list of receivers registered for that particular course, keeps a backup and updates the clients' profiles with the new content. This application uses the inotify utility at the server side to process each file transfer instantaneously, in real time.

Remote Synchronization Application

In this application, video lectures, ebooks, study materials etc. of one institution are forwarded to the list of registered institutions using one-way synchronization, as described above. An institution which wants to share its content works as the central institution and provides a GUI through which other institutions register themselves for a certain number of courses. Faculty members/users of the institution can register themselves to share their content. An institution which wants the study material used in the central institution can thus register itself in the application's GUI and receive the study materials on a periodic basis. This application uses the Rsync[12] tool for content forwarding and an RSA key pair for authentication, both of which are described in detail in the Implementation chapter.

1.3 RSA Key based protection

Key-based authentication[11] is more secure than several other modes of authentication, such as plain passwords: key values are significantly more difficult to brute-force or guess than plain passwords, are very long, and need not be remembered. There are basically two types of key-based authentication, RSA and DSA. RSA, being more secure[7], has been used in the Remote Content Synchronization Application. Key-based authentication uses two keys: a public key, which anyone can see, and a private key, which only the owner is allowed to see.

1.4 Motivation

MHRD has launched the world's cheapest tablet, 'AAKASH-2', as a step towards the ambition of One Tablet Per Student, the e-learning target of the Govt. of India. IIT Bombay was given the responsibility of training 1,00,000 teachers at a time on Aakash usage, under the guidance of Prof. D. B. Phatak. To establish e-learning as the primary mode of education in India, the most effective way is to make Aakash tablets the means of education, because they are:

• Smaller and lighter than laptops, and so more portable

• Considerably bigger in screen size than mobiles

• Much cheaper than laptops and other Android-based mobiles and tablets

• Most importantly, more comfortable to read and type on than mobiles

The desire to be an active contributor towards the prestigious ambition of making e-learning through tablets the primary mode of education in India, and to share the study materials that IITians follow with the rest of the country so that high-quality education is not confined to the IITs, are the primary motivations that inspired me throughout in building these Content Synchronization Applications as my Master's project.

Chapter 2

Literature Review

This chapter covers a literature study in several directions, from analyzing the working of various synchronization tools and creating an e-learning environment on mobile devices to setting up WiFi connections among various devices. A brief description of these fields is as follows:

• A comparison of file synchronization software tools, leading to the choice of RSYNC as an efficient and reliable tool for local as well as remote synchronization, together with a study of the Dropbox architecture and of the IBM Sterling documentation for synchronization among various mobile phones.

• Implementation of an e-learning environment through mobile devices such as tablets and WiFi-enabled mobile devices.

• Establishment of simultaneous WiFi connections with multiple devices in the same and in varying BSSs, and establishment of peer-to-peer WiFi connections independent of the access point, reducing unnecessary load on the access point.

2.1 Survey on Various synchronization tools

Synchronization of data and content offers the ability to preserve multiple copies of files across various IT industries, helping to keep updated files available to users under almost any circumstance. With the increase in demand for data integrity and consistency, many data and content synchronization tools have appeared. Below we study some of the most important and popular file synchronization tools, and in the latter part we carry out a comparative study of them.

2.1.1 RSYNC as sync. tool

One of the major tools for content and file synchronization is rsync. The following is a summary of the papers surveyed on rsync for synchronization purposes. The rsync utility[12] can be used cross-platform on Linux, Mac OS X and Windows (with Cygwin) and, in combination with cron and SSH, can be scripted easily. This makes it one of the essential utilities in one's toolkit, even if one is not planning to use it for backup purposes. Another advantage is that it is bundled with many major Linux distributions today.

Description of rsync tool[13]

• The killer feature for differential backups: rsync, with its unique algorithm, transfers only the changes made in a file or directory structure, instead of re-transferring all data.

• This is very beneficial when synchronizing large files or directory trees with gigabytes of data. rsync transfers only the changed parts and applies the changes to the file/directory tree copy[17] on the other system, like the patch utility.

• It can even be used to synchronize files locally (on the same computer), by taking backups on the local machine itself (say, to a different drive, such as a USB drive). Overall, it is a simple, easy and efficient solution, eliminating the need to install any complicated backup software.

File synchronization scenario: Given two versions of a file on different machines, one outdated and one current, file synchronization is the process of updating the outdated version with minimum communication overhead by exploiting the significant similarity between the versions[14],[15]. The underlying issue is that of maintaining large replicated collections of files or documents in a distributed environment with limited bandwidth. It arises in a number of applications such as synchronization of data between accounts or devices, content distribution and web site mirroring, large-scale web search and mining, and storage networks. For this, rsync is the most commonly used solution. The rsync tool focuses on the efficient synchronization of very large collections, such as web pages, for the purposes of finding, evaluating and distributing content. The algorithm proceeds as follows:

• At client side: partition the old file into blocks of size b, named B0 to Bi; for each block, compute a weak (unreliable) hash and a strong (reliable) hash and send both to the server.

• At server side: for each pair of received hashes, insert an entry into a dictionary structure, using (weak hash, strong hash, i) as the key.

Figure 2.1: Rsync Algorithm Description with an example (diagram: a Server panel holding the old version of File.txt and a Client panel holding the updated File.txt; block hashes H1 to H4 are compared, a match is found for H3, and a mirror copy of the file is created)

• A pass is then performed through the new file at the server, starting at position j = 0 and involving the following steps:

– Compute the unreliable hash on the block starting at position j.
– Search the dictionary for a block with a matching unreliable hash.
– If one is found, and if the reliable hashes also match, transmit the index i of the matching block in the old file to the client, advance by b positions, and continue.
– If none is found, or if the reliable hashes do not match, transmit the symbol at position j of the new file to the client, advance by one position, and continue.

• At client side: use the incoming stream of symbols and block indices to reconstruct the new file from the old file.
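The steps above can be sketched in a few lines of Python. This is an illustrative sketch only, not rsync's actual implementation: the weak hash is assumed to be a simple byte sum, SHA-256 stands in for the strong hash, only full blocks of the old file are indexed, the dictionary is nested (weak hash, then strong hash) instead of using a single combined key, and the function names are hypothetical.

```python
import hashlib


def weak_hash(block):
    """Cheap, unreliable checksum (assumption: byte sum modulo 2**16)."""
    return sum(block) % 65536


def strong_hash(block):
    """Reliable hash used to confirm a weak match (SHA-256 as a stand-in)."""
    return hashlib.sha256(block).digest()


def make_signature(old, b):
    """Holder of the old file: map weak hash -> strong hash -> block index i."""
    table = {}
    for i in range(len(old) // b):
        block = old[i * b:(i + 1) * b]
        table.setdefault(weak_hash(block), {})[strong_hash(block)] = i
    return table


def make_delta(new, table, b):
    """Holder of the new file: emit block indices and literal symbols."""
    delta, j = [], 0
    while j < len(new):
        block = new[j:j + b]
        candidates = table.get(weak_hash(block)) if len(block) == b else None
        i = candidates.get(strong_hash(block)) if candidates else None
        if i is not None:
            delta.append(i)              # matching block: send its index
            j += b                       # advance by b positions
        else:
            delta.append(new[j:j + 1])   # no match: send one literal symbol
            j += 1                       # advance by one position
    return delta


def apply_delta(old, delta, b):
    """Holder of the old file: rebuild the new file from indices and symbols."""
    out = bytearray()
    for item in delta:
        out += old[item * b:(item + 1) * b] if isinstance(item, int) else item
    return bytes(out)


old = b"the quick brown fox jumps"
new = b"the quick red fox jumps!"
delta = make_delta(new, make_signature(old, 4), 4)
rebuilt = apply_delta(old, delta, 4)
```

In this example the unchanged blocks are transmitted as small integers and only the modified region travels as literal bytes. For clarity the weak hash is recomputed at every position here; a real implementation would update it incrementally as the window slides.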

An improved file synchronization algorithm partitions the problem into two phases:

• Map construction, where the two parties use a multi-round protocol to determine the common parts of the corresponding files. The ultimate result of map construction is a bitmap array telling the sender to send only some particular blocks.

• Delta compression, where the remaining parts are encoded in relation to the common parts and then transmitted to the other side. The authors introduce a technique for extending matches via continuation hashes and for the optimized verification of suspected matches in the two files, plus several other optimizations.

Rsync Network Protocol

Rsync first calculates the checksums (rolling checksum + hash value) for each block of data in the old version of the file. These checksums need to be transported to the computer holding the new version of the file, so that it can work out the file modifications.

Once the differences between the current and previous versions of the file are detected, the computer holding the previous version receives instructions on how to create a copy of the file that is equal to the new version. These instructions consist of references to blocks in the old file which have not been modified in the new file, and sequences of changed or new data.

Checksum Protocol

• In the original Rsync implementation the rolling checksum takes up 4 bytes and the hash value 16 bytes, i.e. 20 bytes per block. With an 8-byte (64-bit) rolling checksum, as assumed in the optimization discussion later in this section, 8 + 16 = 24 bytes are transmitted across the network in this direction for each block of the outdated version of the file.

Figure 2.2: Rsync’s network protocol (diagram: the SENDER performs fork() to start the remote rsync server, creates the file list based on pathnames, ownership, permissions, size, mode etc., finds and appends only new content, and compresses and encodes it; the RECEIVER compares the file list with its local directory tree, creates rolling checksums and hashes, creates a temp file, and renames the temp file to replace the basis file; the messages exchanged are the file list, the rolling checksums and hash values, and the new/modified data with block reference numbers)

• There is no need for any specific block reference identification number. The computer holding the new version of the file will assume that the first 24 bytes are for block 0, the next 24 bytes for block 1, and so on.
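As a rough illustration of what this per-block cost means in practice, the following sketch computes the amount of checksum data sent for one file. The 24 bytes per block are taken from the text above; the block size of 1024 bytes and the function name `checksum_traffic` are assumptions made only for this example.

```python
import math


def checksum_traffic(file_size, block_size, bytes_per_block=24):
    """Bytes of checksum data sent for one file of file_size bytes."""
    blocks = math.ceil(file_size / block_size)  # a partial last block still counts
    return blocks * bytes_per_block


one_mib = 1024 * 1024
traffic = checksum_traffic(one_mib, block_size=1024)  # 1024 blocks * 24 bytes
```

Under these assumptions a 1 MiB file costs 24 KiB of checksum traffic, about 2.3% of the file size, which is paid even when nothing has changed. Larger blocks reduce this overhead but make it less likely that a block survives an edit unchanged.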

Merge Instruction Protocol

• When sending instructions, a list of block references is sent along with sequences of new data to insert. A block reference is encoded as a 4-byte int. This makes it feasible to reference up to 2,147,483,648 blocks of data.

• The data sequences are inserted into the stream of block references and marked by a special block reference. For instance, a block reference of -1 means that a data sequence is beginning, and it continues until an end code is met.
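A possible encoding of such an instruction stream is sketched below. The 4-byte block references and the -1 marker come from the description above; since the end code is not specified there, this sketch swaps it for a 4-byte length prefix after the marker, which is an assumption made only for illustration. Big-endian byte order and the function names are likewise hypothetical.

```python
import struct

DATA_MARKER = -1  # from the text: a block reference of -1 starts a data sequence


def encode(instructions):
    """Encode a list of int block references and bytes data sequences."""
    out = bytearray()
    for item in instructions:
        if isinstance(item, int):
            out += struct.pack(">i", item)  # 4-byte signed block reference
        else:
            # Assumption: a 4-byte length stands in for the unspecified end code.
            out += struct.pack(">ii", DATA_MARKER, len(item)) + item
    return bytes(out)


def decode(stream):
    """Inverse of encode(): recover block references and data sequences."""
    items, pos = [], 0
    while pos < len(stream):
        (ref,) = struct.unpack_from(">i", stream, pos)
        pos += 4
        if ref == DATA_MARKER:
            (length,) = struct.unpack_from(">i", stream, pos)
            pos += 4
            items.append(stream[pos:pos + length])
            pos += length
        else:
            items.append(ref)
    return items


instructions = [0, 1, b"new data", 3]
stream = encode(instructions)
```

Here three unchanged blocks cost 4 bytes each, while the 8 bytes of new data cost 16 bytes including the marker and length. Reserving negative values as markers is what leaves the non-negative half of the 4-byte range for block references.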

Protocol Efficiency

• It is possible to optimize the network protocol at the cost of CPU computation time. At least five things can be optimized:

– Size of rolling checksum.
– Size of block references.
– Collapsing sequential block references.
– Compression of data sequences.
– Storing checksums locally.

Rolling Checksum Optimization

• In the original Rsync implementation the rolling checksum consists of two 16-bit values, instead of two 32-bit values. Cutting the rolling checksum back to two 16-bit values will increase the number of false checksum matches, and thus cost more 16-byte hash value calculations when searching for checksum matches on blocks.

• Using a 32-bit (4-byte) rolling checksum instead of a 64-bit (8-byte) one will cut down the data exchanged for unchanged blocks by 4 bytes per block.
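The "rolling" property that makes this checksum cheap to slide along a file can be sketched as follows. The two 16-bit halves follow the description above; the exact formula (a plain byte sum plus a position-weighted sum, both modulo 2^16) is the commonly described Adler-32-style form and should be read as an assumption here, not as a statement about any particular rsync version.

```python
M = 1 << 16  # each half of the checksum is a 16-bit value


def checksum(block):
    """Compute both halves from scratch for one block."""
    n = len(block)
    a = sum(block) % M
    b = sum((n - i) * x for i, x in enumerate(block)) % M
    return a, b


def roll(a, b, n, old_byte, new_byte):
    """Slide a window of length n by one byte in constant time."""
    a = (a - old_byte + new_byte) % M
    b = (b - n * old_byte + a) % M
    return a, b


def pack(a, b):
    """Combine the two 16-bit halves into one 32-bit (4-byte) value."""
    return a | (b << 16)


data = b"content synchronization"
n = 8
a, b = checksum(data[:n])
rolled = [pack(a, b)]
for j in range(1, len(data) - n + 1):
    a, b = roll(a, b, n, data[j - 1], data[j + n - 1])
    rolled.append(pack(a, b))
```

Each step of the loop costs a few additions regardless of the block size, which is why the checksum can be evaluated at every byte offset of the new file. The expensive 16-byte hash is then computed only at offsets where the rolling checksum matches.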

Improvements on rsync performance:

1. Continuation hashes allow optimized verification of suspected matches in the two files, alongside several other optimizations. Suppose that a block at a particular level results in a confirmed match in the client file. It is then unlikely that the sibling of that block would also find a match, because:

• Any match that is a continuation of the first match would already have been discovered at the parent level.

• Any other match is unlikely, since the match found by the first sibling will likely extend at least slightly into the other block.

• Thus, the processing for each block size is split into two phases: first a search for matches using continuation hashes on blocks adjacent to confirmed matches, and then a search using global or local hashes.

• Second, hashes are not sent for any blocks whose sibling found a confirmed match in the first phase (nor for any blocks for which continuation hashes were sent but no matches were found).

2. Hierarchy Structures[17]: Rsync can be improved further for maintaining a large hierarchy of folders replicated in a distributed environment. This problem affects a number of important applications, such as synchronization of folder hierarchies in peer-to-peer environments, synchronization of content between devices or accounts, content distribution and storage networks, large-scale web search and mining, and website mirroring. At the core of the problem lies the challenge of synchronizing files within a directory hierarchy.

• The authors propose a framework for remote file synchronization and describe several new techniques that result in significant time savings and fewer errors during file transfer.

• The focus is on applications that involve very large collections of hierarchical folders, including files and folders.

• The proposed algorithm adopts the Sender/Receiver Server structure based on a client modification.

The authors report that an implementation of their framework and techniques achieves significant improvements over rsync. The approach is capable of synchronizing large folder transfers between peer-to-peer distributions, and provides a method for synchronizing folders between the server that is updated by some client and the other servers.

2.1.2 Study on DropBox as sync. software

Dropbox[18] is a file hosting service operated by Dropbox, Inc. that offers file synchronization and client software. In brief, Dropbox allows users to create a special folder on each of their computers, which it then synchronizes so that it appears to be the same folder (with the same contents) regardless of the computer on which it is viewed. Files placed in this folder are also accessible through a web site and mobile phone applications.

• Dropbox has two parts, the Dropbox server and the desktop/mobile client. Dropbox depends on librsync, which is very similar to rsync.

• The Dropbox client enables users to drop any file into a designated folder, which is then synchronized with Dropbox's Internet service.

• Dropbox supports multi-user version control, allowing several users to edit and repost files without overwriting versions.

• When a file in a user's Dropbox folder is changed, Dropbox uploads only the pieces of the file that have changed when synchronizing. Uploaded files are limited to 300 MB per file.

• Dropbox provides a technology called LANSync, which allows computers on a LAN to securely download content locally from each other instead of always hitting the central servers.

2.1.3 SyncBreeze File Synchronization Tool

SyncBreeze[19] is a powerful, easy-to-use and fast file synchronization solution allowing one to synchronize files between network shares, directories and NAS storage devices. The product is offered in different versions (freeware, pro, ultimate and server) designed for home users, power users and IT enterprises.

SyncBreeze provides multiple one-way and two-way file synchronization modes, periodic content synchronization, background content synchronization, compressed content synchronization, real-time content synchronization, an option to synchronize specific types of content and user-chosen GUI layouts. It also permits one to define multiple customizable file synchronization commands, making it very simple to synchronize large numbers of directories, disks or NAS storage devices.

Some important features of SyncBreeze are:

• Synchronizing Specific File Types or Categories

– SyncBreeze provides power users with the ability to synchronize specific file types or content categories using one or more flexible file matching rules.

– For example, the user may choose to synchronize documents and digital images with a file size of more than 2 MB.

– During content synchronization, SyncBreeze scans the given source and destination folders and applies the specified file matching rules to all the existing files.

– Content not matching the specified rules is skipped from the file synchronization process, effectively restricting the operation to user-selected files only.

• Synchronizing Disks or Directories

– Content synchronization with preview is useful because it gives the user a clear picture of what content will be synchronized.

– On the other hand, file synchronization with preview may be ineffective or even error-prone when there is a requirement to synchronize large directories or whole disks containing many thousands of files, mainly because nobody will have the time to review lists of content synchronization actions containing thousands of items.

• Periodic Execution of File Synchronization Commands

– Another option provided by SyncBreeze is the ability to execute file synchronization commands periodically at user-specified time intervals. SyncBreeze allows daily or weekly sync operations to be performed at a specified time of day.

– The major reason to execute a content synchronization command routinely is to continuously keep a frequently changing directory synchronized with a backup directory located on an external disk or storage device.
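The idea of file matching rules described above for SyncBreeze, such as "documents and digital images larger than 2 MB", can be illustrated with a small sketch. It does not use or reproduce any SyncBreeze interface; the rule representation (a list of name patterns plus a minimum size), the function names and the file list are all hypothetical.

```python
import fnmatch


def matches(name, size, patterns, min_size):
    """True if the file is larger than min_size and its name fits a pattern."""
    return size > min_size and any(
        fnmatch.fnmatch(name.lower(), pattern) for pattern in patterns
    )


def select_files(files, patterns, min_size):
    """Return the names of files (name -> size in bytes) that pass the rule."""
    return sorted(
        name for name, size in files.items()
        if matches(name, size, patterns, min_size)
    )


# Hypothetical directory listing: file name -> size in bytes
files = {
    "report.pdf": 3_500_000,
    "photo.JPG": 2_400_000,
    "icon.png": 12_000,
    "notes.txt": 5_000_000,
}
selected = select_files(files, ["*.pdf", "*.doc", "*.jpg", "*.png"], 2 * 1024 * 1024)
```

Only `photo.JPG` and `report.pdf` pass the rule: `icon.png` is too small and `notes.txt` matches no pattern. Files that fail the rule are simply left out of the synchronization pass.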

2.2 Study on E-learning through Mobile Devices

This part of the literature survey is based on the implementation of an e-learning environment through mobile devices such as mobile phones and tablet PCs[20].

Implementing a Mobile Campus Using MLE Moodle

The main goal is to access learning materials and to support learning activities[21],[22]. The prototype of the deployed Virtual Campus using MLE Moodle enables mobile clients to perform online learning processes and is a step towards achieving the anytime, anywhere paradigm[23]. Mobile devices have many limitations:

1. limited computational capacity
2. small memory
3. limited graphical user interface

Integration with Web technology: centralized client/server architectures are used, where a server application provides services to clients. The difficulty here is the limited capacity of mobile devices and the lack of standards.

Integration with P2P technology: JXTA is an open-source P2P platform consisting of XML-based protocols that permit connected devices to exchange messages and collaborate in a decentralized P2P mode. It supports mobile phones and PDAs.

Proxy vs. Proxyless Architectures:

• Proxy access: A proxy is a computational device that acts on behalf of another computational entity, so a proxy permits computational entities (nodes, peers) to connect indirectly to the network (servers, brokers or rendezvous peers). Access to resources therefore goes through communication with the proxy. A transparent proxy combines a proxy server with NAT such that clients do not know of the existence of the proxy.

• Proxyless access: A proxyless architecture solves some of the problems of the proxy-based architecture, but it is still too difficult to integrate mobiles with server applications in a straightforward way.

The advantages of this standard are control, efficiency and filtering. Its disadvantages relate to anonymity, security and trust, inconsistency and network stability. The mobile campus was built in two steps:

1. A standard Virtual Campus was implemented in Moodle.
2. The Virtual Campus was extended with the MLE Moodle module.

Two servers are used, namely:

• Gateway Server, which is a proxy used by MLE (Mobile Learning Environment) to access the Campus

• Message Server, which is a server for the instant messenger for mobile clients

Evaluation is done by Virtual Campus, Mobile Campus and overall platform testing.

In the above discussion, proxy and proxyless architectures and the Gateway and Message servers are considered as ways to extend traditional virtual campuses with mobile clients. Indeed, the current wide spread of mobile devices and wireless technologies brings an enormous potential to e-learning in terms of ubiquity, pervasiveness, personalization and flexibility, satisfying the "anytime, anywhere" paradigm.

Chapter 3

Android Application Architecture Framework

As smart phones and tablets become more popular, the operating systems for those devices become more important[25]. Android is such an operating system for low-powered devices that run on battery and are full of hardware such as Global Positioning System (GPS) receivers, cameras, light and orientation sensors, and WiFi and UMTS (3G telephony) connectivity. Like all operating systems, Android enables applications to make use of the hardware features through abstraction and provides a defined environment for applications[24].

Structural overview

The software stack of Android, as shown in Figure 3.1, can be subdivided into five layers: the kernel and low-level tools, the native libraries, the Android Runtime, the framework layer and, on top of all, the applications.

• The kernel in use is a Linux 2.6 kernel, modified for special needs in power management, memory management and the runtime environment.

• Libraries such as libc or libm were developed especially for low memory consumption, as Android is supposed to run on devices with little main memory and low-powered CPUs.

• The Android Runtime consists of the Dalvik virtual machine (DVM) and the Java core libraries. The DVM is an interpreter for byte code that has been transformed from Java byte code to Dalvik byte code.

• The Android Application Framework is written in Java and provides abstractions of the underlying libraries and DVM capacities to applications.

• Android applications run in their own sandboxed Dalvik VM and can consist of multiple components: activities, services, broadcast receivers and data/content providers.

Figure 3.1: Android System Architecture

3.1 Introduction and overview of Android

Android has become the most popular OS based on the Linux kernel, with approximately 60 million new mobiles running Android every year[26]. Running applications is the main goal of an OS, and Android provides different layers to compose, execute and manage applications. For this purpose Android clearly differentiates the terms application, process, task and thread. This chapter explains each term by itself as well as the correlation between the terms.

An Activity is a single screen of an application, like a browser window or a settings page. It contains elements that present data or allow user interaction.

Application lifetime states[32]

When an application starts, its individual components are started, and in the case of an activity the following hooks are called sequentially: onCreate(), onStart(), onResume(). If an activity loses its focus, the onPause() method is called, and if an activity is no longer visible, onStop() is called.

• onCreate() This method is called for initialization and static set up purposes. It may get passed an older state for resuming. The next method is always onStart().

• onRestart() This method is called after an activity has been stopped and is about to be started again. It is followed by onStart().

• onStart() The application process type changes to visible and the activity is about to be visible to the user, but it is not yet in the foreground.

• onResume() The activity has the focus and can get user input. The application process type is set to foreground.

• onPause() If the application loses the focus or the device is going to sleep, this hook is called and the process type is set to visible. After running this hook, the system is allowed to kill the application at any time.

• onStop() The activity is no longer visible, the process type is set to background and the application may be killed at any time by the system to regain memory. The activity is either going to get destroyed or restarted.

• onDestroy() The last method that is called in an activity, right before the system kills the application or the application removes the activity.
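The order in which these hooks fire can be illustrated with a small simulation. The sketch below is plain Python, not Android code: the class `ActivitySim` and its methods `launch`, `lose_focus`, `hide`, `reopen` and `finish` are hypothetical names chosen for illustration, and only the hook names and process types are taken from the description above.

```python
class ActivitySim:
    """Toy model of the activity life cycle; records hooks in call order."""

    def __init__(self):
        self.calls = []      # hook names, in the order they were invoked
        self.state = "new"   # process type after the most recent hook

    def _hook(self, name, state):
        self.calls.append(name)
        self.state = state

    def launch(self):
        # A freshly started activity always runs these three hooks in order.
        self._hook("onCreate", "created")
        self._hook("onStart", "visible")
        self._hook("onResume", "foreground")

    def lose_focus(self):
        # Another window takes the focus; the activity may still be visible.
        if self.state == "foreground":
            self._hook("onPause", "visible")

    def hide(self):
        # The activity is no longer visible at all.
        self.lose_focus()
        if self.state == "visible":
            self._hook("onStop", "background")

    def reopen(self):
        # A stopped activity restarts; a merely paused one just resumes.
        if self.state == "background":
            self._hook("onRestart", "background")
            self._hook("onStart", "visible")
        if self.state == "visible":
            self._hook("onResume", "foreground")

    def finish(self):
        # An activity is paused and stopped before it is destroyed.
        self.hide()
        if self.state == "background":
            self._hook("onDestroy", "destroyed")
```

Calling `launch()`, `hide()`, `reopen()` and `finish()` in turn on one instance walks through every hook listed above, and the recorded order can be read back from `calls`.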

3.2 A brief description on Dalvik VM

Android applications and the underlying frameworks are almost entirely written in Java. Instead of using a standard Java virtual machine, Android uses its own VM. This virtual machine is not compatible with the standard JVM or Java ME, as it is specialized and optimized for small systems. These small systems usually provide only little RAM, a slow CPU and no swap space to compensate for the small memory.

The necessary byte code interpreter, the virtual machine, is called Dalvik. Instead of using standard byte code, Dalvik has its own byte code format, which is adjusted to the needs of Android target devices. The byte code is more compact than usual Java byte code and the generated .dex files are small.

3.3 Android Application Framework

This part deals with the Android application framework and how applications in the Android OS work[33]. Running applications is a major goal of operating systems, and Android provides several means on different layers to compose, execute and manage applications. For this reason Android differentiates the terms process, task, application and thread. This subsection explains each term by itself as well as the correlation between the terms.

Figure 3.2: Life cycle of Activity

Figure 3.3: Android Application Framework

Processes and threads

In Android, five types of processes are distinguished in order to control the behavior of the system and its running programs. The types have different importance levels, which are strictly ordered.

• Foreground: A process that is running an Activity, a Service providing the Activity, a starting or stopping Service, or a Broadcast-receiver that is currently receiving.

• Visible: If a process holds a paused but still visible Activity, or a Service bound to a visible Activity, and no foreground components, it is classified as a visible process.

• Service: A process that executes an already started Service.

• Background: An Activity that is no longer visible is held by a background process.

• Empty: These processes contain no active application components and exist only for caching purposes.
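The strict ordering of the five process types can be expressed as a simple ranking. The following Python sketch is only an illustration of that ordering: `ProcessType` and `kill_order` are hypothetical names, and the real Android system weighs further factors when it reclaims memory.

```python
from enum import IntEnum


class ProcessType(IntEnum):
    """The five process types, ranked from least to most important."""
    EMPTY = 0
    BACKGROUND = 1
    SERVICE = 2
    VISIBLE = 3
    FOREGROUND = 4


def kill_order(processes):
    """Return process names, least important first.

    `processes` maps a process name to its ProcessType. The least
    important processes are the first candidates when memory is needed.
    """
    ranked = sorted(processes.items(), key=lambda item: item[1])
    return [name for name, _ in ranked]
```

For example, a cached empty process is listed before a background browser, which in turn is listed before a music service and the foreground mail client.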

Applications and tasks

Android applications are run by processes and their included threads. The two terms task and application are tightly linked, given that a task can be seen as an application by the user. Tasks are series of activities of possibly multiple applications.

• Tasks are basically a logical history of user actions, e.g. the user opens a mail application, opens a specific mail in it, and follows a link in that mail which is opened in a browser.

• In this scenario the task would include two applications (mail and browser) and two Activity components, one from the mail application and one from the browser.

• An advantage of the task concept is that it allows the user to go back step by step, like a pop operation on a stack.
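The mail and browser scenario can be modelled as a stack of activities. This is a minimal Python sketch under the assumption that a task is nothing more than an ordered history; the class `Task`, its methods and the activity names `MessageView` and `PageView` are hypothetical.

```python
class Task:
    """A task as a stack of (application, activity) entries."""

    def __init__(self):
        self._stack = []

    def start(self, application, activity):
        # Starting an activity pushes it on top of the history.
        self._stack.append((application, activity))

    def back(self):
        # Going back pops the current activity and returns the one
        # that becomes visible again, or None if the task is now empty.
        if self._stack:
            self._stack.pop()
        return self._stack[-1] if self._stack else None

    def applications(self):
        # Applications taking part in this task, in order of first use.
        seen = []
        for application, _ in self._stack:
            if application not in seen:
                seen.append(application)
        return seen
```

Starting `("mail", "MessageView")` and then `("browser", "PageView")` yields one task spanning two applications, and a single `back()` returns the user to the mail.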

Figure 3.4: Application Architecture

Application internals

The structure of an Android application is based on four different components: Activity, Service, Broadcast-receiver and Content Provider. An application does not necessarily consist of all four components, but to present a GUI there has to be at least a single Activity.

• Applications can start other applications or specific components of other applications by sending an Intent. These intents contain, among other things, the name of the action to be executed.

• The IntentManager resolves incoming intents and starts the proper application or component. Reception of any Intent can be filtered by an application.

• Services and broadcast receivers allow applications to perform jobs in the background and provide more functionality to other components.

• Broadcast receivers can be triggered by events and run only for a short period of time, whereas a service may run for a long time.

AndroidManifest.xml

• All Android Dalvik applications need to have an XML document called AndroidManifest.xml in the application's root directory.

• In the manifest file, 23 predefined element types are allowed to specify, among other things, the application name, the components of the application, permissions, needed libraries and filters for intents and broadcasts.

• AndroidManifest.xml is used by various facilities in the system to obtain administrative and organizational information regarding the application.

3.4 Android Application Functionalities

The application framework of Android provides APIs in various areas such as multimedia, GUI, networking, power management and storage access. The libraries in the framework are written in Java. They run on top of the core libraries of the Android Runtime.

• The Application Framework provides managers for different purposes such as power management, resource handling, system-wide notification and window management.

• Applications are supposed to use the services of the managers and not use the underlying libraries directly.

• This way it is possible for the managers to enforce application permissions by means of the sandbox permission system, e.g. when an application wants to initiate a phone call or send data over the network.

With the size and the change rate of a platform like Android, it is almost impossible to present all aspects of such a system, hence this study can only provide a first glance at Android.

Android may look like just another mobile operating system, but the wide support from large companies has made Android one of the important contestants in the mobile sector. Openness and extensibility allow companies and manufacturers to modify the system to fit their needs and requirements, in hardware as well as in software. This leads to a significant number of devices from many different manufacturers, each covering a different range of customers.

Chapter 4

Proposed Architecture

This chapter contains a detailed description of the architectures followed in the applications, starting from a very primitive level and moving to a high level. It describes the Push/Pull mechanism for content forwarding, the RSA public-private key mechanism, the global scenario of both individual applications and, finally, the flow charts describing the detailed working process of the applications.

4.1 Push-Pull Framework for Content Forwarding

In the "normal" client/server model, clients request a service or information from a server, and the server responds by transmitting information to the clients. This is called pull technology: the client pulls information from the server. The content synchronization applications work differently from a typical client/server system. They make use of both push and pull mechanisms for content forwarding and synchronization[34].

Publisher: The agent containing updated files in its repository. The publisher can be either a client or the server itself.

Receiver: The agent that is supposed to receive the recent version of a file from the publisher.

[Figure: a DB Server connected to Client 1, Client 2 and Client n]

Figure 4.1: Pull Mechanism

The Pull approach in the application involves requesting new content from the publishers. When the pull program executes, it queries the local directories of all the clients, fetches data from each individual machine and stores it in the central repository, i.e. the central server, acting as the receiver, collects recent contents from the source machines.

[Figure: a DB Server connected to Client 1, Client 2 and Client n]

Figure 4.2: Push Mechanism

In contrast to pull, in the Push approach the central server forwards data to individual clients on a periodic basis, creating a mirror copy of the source directory in the destination directory, i.e. the central server acts as publisher and transmits all data to the receiver clients.
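The difference between the two directions can be sketched with in-memory dictionaries standing in for directories. This is a simplified Python illustration, not the application's code: the functions `pull` and `push` and the use of a dictionary mapping file names to contents are assumptions made for the example.

```python
def pull(server_repo, clients):
    """Server acts as receiver: collect the files from every client.

    `clients` maps a client name to its local directory, itself a dict
    of file name -> content. Files with the same name overwrite each
    other here; a real system would need a naming scheme to avoid that.
    """
    for local_dir in clients.values():
        for filename, content in local_dir.items():
            server_repo[filename] = content


def push(server_repo, clients):
    """Server acts as publisher: mirror its repository to every client."""
    for local_dir in clients.values():
        local_dir.clear()              # a mirror removes stale files
        local_dir.update(server_repo)  # and copies the current content
```

After one `pull` followed by one `push`, every client directory holds the same set of files as the central repository.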

4.1.1 Content Forwarding in Local Synchronization Application

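[Figure: sequence between Tablet as Sender, Server, Database and Tablet as Receiver. The sender selects the file to send from its local file system and PUSHes the new content to the server. The server updates the DB with the sender's details, queries the destination client list, receives the destination list, updates the new content in the server's directory tree and updates the DB with the receiver's details. The receiver PULLs the new file from the server.]

Figure 4.3: Local Sync. Content Forwarding Mechanism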

In the Local Synchronization scenario, the application works on the basis of Push for content forwarding to and from the server, and Pull for the purpose of content backup and processing inside the server based on the results from the Moodle database. Figure 4.3 describes the content forwarding mechanism for a single round of file transmission from the sender Tablet through the central server to the receiver Tablet. The detailed mechanism is described in the next section in the form of a flow chart.

4.1.2 Content Forwarding in Remote Synchronization Application

In the Remote Synchronization Application, the attempt to keep the application as simple as possible at the client side led to a design based on the Pull mechanism at the client side and the Push mechanism at the server side. Clients only keep the new content inside the appropriate directory structure created on their local machine by the server. The server pulls the new content added by the client, stores it in the central database and pushes it further to the individual clients.

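[Figure: sequence between Client as Sender, Server, Database and Client as Receiver. The sender stores the file in its local public directory. AUTOSYNC on the server PULLs the new content, queries the destination client list, receives the destination list and updates the new content in the server's directory tree. The server then PUSHes the new file to the destination clients.]

Figure 4.4: Remote Sync Content Forwarding Mechanism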

4.2 Detail Architectures of Local and Remote Synchronization Application

This section gives the global view and an in-depth description of the Local and Remote Synchronization applications respectively.

4.2.1 Local Synchronization Application

Moodle as the central controller: The local synchronization application operates based on the Moodle database. The central server creates its own database tables based on the mapping done in Moodle, e.g. the Student table, the Teachers table, the Details table etc. This automatic generation of the application database from the Moodle database is repeated at a certain interval so that new registrations and deregistrations of courses, students and teachers can be handled. The main reasons for this mapping, rather than direct access to the Moodle database, are:

• The Moodle database mapping is complex, and many mappings have to be resolved to get the result of a single query such as the list of students registered for a particular course under a particular professor. If this long chain of mappings were resolved for every single round of file transfer, the total processing would take a considerable amount of time.

• Not every institute may have Moodle. Such institutes can manually configure the database based on their current teacher, student and course details.

Global Scenario

Figure 4.5 represents a typical scenario in which the Local Content Synchronization Application works.

• The entire communication process is based on WiFi

• Each student has their own tablet (though this is not a mandatory condition of the application) and connects to the central server by providing their Moodle User Id and Password

• The application provides the flexibility to set the download path and the server's IP address

• Unless modified, these fields keep the default values set at the beginning, i.e. the default server IP address and /mnt/sdcard/downloads as the default download location

• Both teachers and students behave as clients, and the differences in functionality are handled only in the processing at the server end

• As described above, the Local Synchronization Application works based on Push from the tablet side, keeping in view that the battery life of the tablets is limited to 2-3 hours and that the battery drains even more rapidly when they are connected to WiFi

• So files are sent, updated files are fetched and past file sharing details are viewed only when the user requests it

[Figure: the Central Server, backed by the Moodle DB, connects teachers (T1, T2) and students (S1, S2, S3, Sn); files flow from student to teacher and from teacher to student.]

Figure 4.5: Overview of Local Synchronization

Detail Scenario

Unlike the remote synchronization scenario, Local Synchronization has only a synchronization phase, as the registration phase is considered to be covered by the Moodle database.

Client Side:

• Set the IP address and the default path to store downloaded files.

• Enter the UserId and Password for initial credential validation

• For downloading files shared with the user:

– Click on the Update button
– Select the type, e.g. newly added, files from the last 2 days, 1 week etc.
– Check the default path that was set to find the downloaded files

• To get the details of files sent and received:

– Click on the View button
– The details of the particular user will appear on the Tablet screen
– This includes the type of the file (sent or received by the user), course name, time and date, and filename

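[Figure: flow chart between TABLET and SERVER. The tablet user enters the login credentials and the server checks the Moodle DB. If the ID and password pair does not exist, the tablet gets an error message for wrong credentials; if it exists, the tablet gets the main screen of the application. The user selects the Update button from the main menu, selects the appropriate option to get shared files (e.g. New, 1 week) and submits. The server serves the request for files to the tablet, and the files are stored in the specified path.]

Figure 4.6: Update File: Detail Steps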

• For submitting a file for a particular course:

– Rename the file to the "UserId–Course–filename" format
– Click the Submit button to browse the file list
– Select and send the appropriate file

[Figure: flow chart between TABLET and SERVER. The tablet user enters the login credentials and the server checks the Moodle DB. If the ID and password pair does not exist, the tablet gets an error message for wrong credentials; if it exists, the tablet gets the main screen of the application. The user selects the Submit button from the main menu, gets the file browser to browse through local files, renames the file as UserID_CourseID_filename, selects the file to be sent and clicks the SEND button. On the server, the submitted file gets stored inside the watched directory, the new content is recorded in the DB with the details of the sender, the destination list is found from the DB, the local directories of the destinations at the server are updated, the database is updated with the receivers' details and a backup copy is maintained at the server.]

Figure 4.7: Submit File: Detail steps

Server Side

The server keeps listening to a particular directory in which the files sent by the tablets are initially stored. The server performs the same processing for all files. When a new file is added, the following steps happen inside the server:

• Get the file name and extract the parameters such as UserId and Course information

• Look for a valid entry in the database for that particular UserId and Course pair

• If the pair is invalid, discard all further processing

• If the pair exists, create an entry in the database

• Find the destination list of UserIds

• Copy the file to the receivers' directory trees

• Update the database with the details of the receivers

• Update the entire directory structure of each receiver, reconstructing the directories for files not yet downloaded and files shared in the last 1 day, 2 days, 1 week etc.

• Back up the original file into a separate directory
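The server-side steps above can be sketched as follows. This Python fragment is an illustration only, not the application's shell/Java backend. It assumes the underscore-separated name `UserID_CourseID_filename` shown in Figure 4.7, and it assumes that a teacher's file goes to the students of the course while a student's file goes to the teacher. The `courses` structure and all function names are hypothetical.

```python
def parse_submission(filename, sep="_"):
    """Split 'UserID_CourseID_filename' into its three parts, or None."""
    parts = filename.split(sep, 2)
    if len(parts) != 3 or not all(parts):
        return None
    return tuple(parts)


def destinations(user_id, course, courses):
    """Return the receivers for a valid (user, course) pair, else None."""
    info = courses.get(course)
    if info is None:
        return None
    if user_id == info["teacher"]:
        return sorted(info["students"])   # teacher to students
    if user_id in info["students"]:
        return [info["teacher"]]          # student to teacher
    return None


def process_submission(filename, courses, inboxes, backup, log):
    """Run the server-side steps for one newly arrived file."""
    parsed = parse_submission(filename)
    if parsed is None:
        return False
    user_id, course, name = parsed
    receivers = destinations(user_id, course, courses)
    if receivers is None:
        return False                      # invalid pair: stop processing
    log.append(("sent", user_id, course, name))
    for receiver in receivers:
        inboxes.setdefault(receiver, []).append(filename)
        log.append(("received", receiver, course, name))
    backup.append(filename)               # keep a copy of the original
    return True
```

With `courses = {"CS101": {"teacher": "t1", "students": {"s1", "s2"}}}`, the file `s1_CS101_hw1.pdf` lands in the inbox of `t1`, whereas a file from an unknown user is discarded before any database entry is made.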

4.2.2 Remote Synchronization Application

Global Scenario

The institution offering remote synchronization has to run the Remote Synchronization application on a server. Other institutions that want to get study materials from the offering institution have to register themselves through the application's GUI by visiting the URL hosted on the server. Teachers who want to share their data also have to register themselves. After registration, users will find that a directory structure has been created by the server on the user's local machine. It is named after the user, placed in the path provided at the time of registration, and contains the course names as subfolders. The application periodically checks whether any new content has been added by any teacher. If new content is found, it copies the content to the server's directory, finds the destination list of recipients and forwards the content to them.

So, in this application the offering institution is the server and all the institutions that register to get contents are the clients, operating in a 1 to M relationship. The central server acts as publisher and transmits all data to the receiver clients, which are remote machines at geographically distributed locations. Because the primary goal is to use this synchronization tool in an educational environment, the operational complexity at the client side has been kept to a minimum. The application therefore combines the pull mechanism with the push mechanism in order to keep the client end as simple as possible.

[Figure: the Central Server connected to Institute 1, Institute 2, Institute 3, Institute 4 and Institute n, and to user 1, user 2, user 3 and user n]

Figure 4.8: Global view of Remote Sync

Detail Scenario

Figure 4.9 describes the detailed scenario of the Remote Content Synchronization application. It contains two parts, Registration and Synchronization. The registration process is described first, followed by the synchronization process.

Registration

This is the initial phase both for the clients who want to share their contents and for those who want to receive them. The server initiates the computations related to registration whenever a new registration takes place.

Client Side

• Each client registers itself in the server's GUI, providing details such as Institute ID, system credentials, IP address, the path where the synchronization should take place etc.

• The client then submits the names of the courses it wants to get updates from or share updates with, chosen from the list of available options displayed on the course registration page

Server Side

• The server finds the list of newly registered users and creates a directory tree inside itself

• Computes the pass phrase based on the user's details

• Stores the corresponding RSA public key on the newly registered client

• Creates the directory structure inside the path provided by the client

Synchronization

The entire content transfer work is done in this phase. Registration and synchronization are two independent tasks and take place simultaneously. However, clients registering as receivers will receive only the updated contents that become available after their registration.

Client Side

The client stores the content inside the appropriate directory, which was created by the server in the path provided at the time of registration.

Server Side

• Periodically checks for new content added by any client

• When new content is found, it checks whether the content is already present in the server's directory tree, and continues with the next steps only if it is not

• It calculates the checksums of the file and compares them to find the minimal amount of content that has to be copied to the server. The copy itself is authenticated with the private and public key pair

• After copying to the server, it finds the list of destination clients who are supposed to receive the content

• Updates the sender and timestamp in the database

• Updates the file in the server's directory for each recipient by creating a soft link to the original file

• Checks whether the file already exists on the destination machines

• Transfers the file to the destinations using the private and public key pair
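The checksum comparison that keeps the copied content minimal follows the idea behind rsync's delta transfer. The Python sketch below is a simplified illustration and not rsync's actual implementation: it uses a tiny block size, recomputes the weak checksum at every offset instead of rolling it incrementally, and uses MD5 as the strong hash. The names `signatures`, `delta` and `patch` are hypothetical.

```python
import hashlib

BLOCK = 4  # unrealistically small block size, to keep the example readable


def weak(block):
    """Cheap Adler-style checksum used as a first filter."""
    a = sum(block) % 65536
    b = sum((len(block) - i) * byte for i, byte in enumerate(block)) % 65536
    return (b << 16) | a


def signatures(old):
    """Receiver side: weak checksum -> [(strong hash, block index)]."""
    sigs = {}
    for start in range(0, len(old), BLOCK):
        block = old[start:start + BLOCK]
        entry = (hashlib.md5(block).hexdigest(), start // BLOCK)
        sigs.setdefault(weak(block), []).append(entry)
    return sigs


def delta(new, sigs):
    """Sender side: describe `new` as block references plus literal bytes."""
    ops, literal, i = [], bytearray(), 0
    while i < len(new):
        window = new[i:i + BLOCK]
        index = None
        candidates = sigs.get(weak(window), [])
        if candidates:
            strong = hashlib.md5(window).hexdigest()
            index = next((n for s, n in candidates if s == strong), None)
        if index is None:
            literal.append(new[i])        # no match: send this byte as-is
            i += 1
        else:
            if literal:
                ops.append(("data", bytes(literal)))
                literal = bytearray()
            ops.append(("copy", index))   # match: send only a block number
            i += len(window)
    if literal:
        ops.append(("data", bytes(literal)))
    return ops


def patch(old, ops):
    """Receiver side: rebuild the new file from the old file and the delta."""
    out = bytearray()
    for kind, value in ops:
        if kind == "copy":
            out += old[value * BLOCK:(value + 1) * BLOCK]
        else:
            out += value
    return bytes(out)
```

Only the `data` operations carry file content over the network. Blocks that the receiver already holds are transmitted as a block number, which is why a small change to a large file results in a small transfer.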

[Figure: flow chart between CLIENT and SERVER in two phases.
REGISTRATION: the client provides its details for registration and the server initiates the registration process. If the UserID already exists, an error message is shown. Otherwise the server updates the database, creates its local directory tree, stores the public key in the client's computer, creates the directory tree in the client's computer and notifies the user of the successful registration.
SYNCHRONIZATION: the client stores a file inside the PUBLIC folder of the local computer and AUTOSYNC detects the change. The server initiates the transfer process and checks permission, size, mode and ownership. If the file is already present, the transfer process is terminated. For a new file or a modified original file, rolling checksums and hash pairs are calculated and sent, the checksums and hashes are received, the compressed and encoded block reference numbers and hash set are sent, and the original file is constructed. The server updates the sender's ID in the DB, finds the destination list, updates the local directory tree and sends the file list to the destination clients. The destination clients calculate and send rolling checksums and hash values, the server receives them, computes the optimal file and sends it in compressed, encoded format, and the clients construct the original file.]

Figure 4.9: Flow Chart for Remote Synchronization Application

Key Based Authentication

Key based authentication is used in remote synchronization. As the server is visible over the Internet, using public key authentication instead of passwords enforces better security for the central server. The user's details are fetched from the database and a user-specific pass phrase is constructed. That phrase is then used to create a public and private key pair, and the key is transferred to the receiver/client. The only way to decrypt data encrypted with the public key is with the matching private key. Although the two keys are related to each other, a private key cannot be derived from its matching public key. After this, all transmission happens through the public and private keys. If, for some reason, either the private key or the public key becomes unusable, the passwordless transfer will not happen. In that case the application uses the default credentials provided by the client for further data transmission until the end of the current refresh cycle. During the next refresh cycle the pass phrase for the client-server pair is reconstructed, overwriting the existing one, and data transmission with the key pair resumes.

Chapter 5

Implementation

This chapter describes the detailed process followed in the implementation. As described earlier, Local synchronization and Remote synchronization follow different methods for content forwarding and face different types of security threats, so the two synchronization applications are implemented with different logic. However, in both applications the backend is implemented in shell scripting and Java. Java is used to communicate between the database and the shell scripts. The central server is a LAMP (Linux, Apache, MySQL, PHP) server running the Ubuntu 12.04 Linux operating system.

5.1 Implementation details of Remote Synchronization

The first part describes the tools and utilities used in the remote synchronization application, followed by the step by step details.

Rsync:

• As already mentioned, rsync is a tool and network protocol that synchronizes files and directories from one location to another while keeping data transfer to a minimum by using delta encoding when appropriate

• It works in a unidirectional, 1 to 1 fashion. In the Remote synchronization application, rsync is used to fetch data at a prescribed interval from clients who are registered as contributors (teachers)

• In practice, a cron job runs every day to fetch the content newly added by the users and copies it to the central server with rsync. Rsync is also used for the transmission of data from the central server to the receivers
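Because rsync itself is 1 to 1, the server has to issue one command per contributor when pulling and one per receiver when pushing. The Python sketch below only builds the argument lists and does not execute anything. The options `-a` (archive mode), `-z` (compression) and `-e ssh` (remote shell) are standard rsync options, whereas the function names, the dictionary layout and the directory layout under `server_root` are assumptions made for illustration.

```python
def rsync_command(source, destination):
    """Argument list for one unidirectional, 1 to 1 rsync transfer."""
    # -a: archive mode, -z: compress during transfer, -e ssh: run over ssh
    return ["rsync", "-az", "-e", "ssh", source, destination]


def pull_commands(contributors, server_root):
    """One command per contributor: remote directory -> central server."""
    return [
        rsync_command(
            f"{c['user']}@{c['host']}:{c['path']}/",
            f"{server_root}/{c['user']}/",
        )
        for c in contributors
    ]


def push_commands(receivers, server_root):
    """One command per receiver: central server -> remote directory."""
    return [
        rsync_command(
            f"{server_root}/{r['user']}/",
            f"{r['user']}@{r['host']}:{r['path']}/",
        )
        for r in receivers
    ]
```

Each returned list could be handed to `subprocess.run` by the script that the daily cron job starts. Looping over the registered users in this way is what turns the 1 to 1 tool into a 1 to many transfer.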

RSA keygen 2048:

• ssh-keygen generates, manages and converts authentication keys for ssh. It creates the RSA keys to be used by the SSH protocol. The type of key to be generated is specified with the -t option, e.g. rsa or dsa

• It accepts a pass phrase when creating the key. Older versions of ssh-keygen default to a 1024-bit key length (recent OpenSSH releases default to 2048 bits for RSA). To guarantee the stronger setting, this application explicitly requests a 2048-bit key by specifying the key length as an argument

ssh-copy-id:

• ssh-copy-id is a script that uses ssh to log into a remote machine, presumably using a login password, so password authentication has to be enabled

• It also changes the permissions of the remote user's home directory to remove group writability, which would otherwise prevent logging in if the remote sshd has StrictModes set in its configuration

• This is used by the application to store the key generated by RSA keygen, as described above, on the remote machines
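Key generation and key distribution can be sketched as two command lines. As before, the Python code only builds the argument lists. `-t`, `-b`, `-N` and `-f` are standard ssh-keygen options and `-i` is a standard ssh-copy-id option, but the function names, file paths and host names are hypothetical, and the way the pass phrase is derived from the user's details is not shown here.

```python
def keygen_command(key_path, passphrase, bits=2048):
    """ssh-keygen call that creates an RSA key pair of the given length."""
    # -t: key type, -b: key length in bits,
    # -N: new pass phrase, -f: output file for the private key
    return ["ssh-keygen", "-t", "rsa", "-b", str(bits),
            "-N", passphrase, "-f", key_path]


def copy_id_command(public_key_path, user, host):
    """ssh-copy-id call that installs the public key on a remote machine."""
    # -i: identity (public key) file to install
    return ["ssh-copy-id", "-i", public_key_path, f"{user}@{host}"]
```

ssh-keygen writes the public key next to the private key with a `.pub` suffix, so the file passed to `copy_id_command` would be `key_path + ".pub"`.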

Expect spawn

Expect is a program that "talks" to other interactive programs according to a script. An interpreted language provides branching and high-level control structures to direct the dialogue. Rsync is a unidirectional, 1 to 1 tool, but the application needs 1 to many transfer and bidirectional operation. For this the application makes use of spawn, which starts processes such as ssh and ssh-keygen. This is mainly useful in the automatic passwordless login process. Using expect, interactive scripts have been written for the following purposes:

• Automatically transfer the RSA public key to newly created users, or to all existing users when the cron job for redistribution of the public key runs

• Automatically transfer content to, or collect content from, the remote clients by using rsync

• Create the directory structure at the user end after the registration process, in the path provided by the user during registration

Implementation Details

Registration of users:

The Remote Synchronization application provides a user interface through which clients can register. Clients go to the application's URL, provide their details and submit them. The registration page is written in PHP, JavaScript and HTML.

Figure 5.1: The main page

• The main page lets users navigate according to their type, i.e. whether the user is a teacher or another institution

• The users are instructed to install rsync in order to use this application

• The user's details are checked for repeated inputs, invalid inputs and empty inputs. If the provided data do not fall into any of those categories, the application checks whether the details are already present in the database table

• If they are, a warning is generated by JavaScript from the result of the SQL query and the application redirects to the main page

• If not, the application inserts the details into the "registration" table of the database and redirects to the add courses page
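The validation order described above can be sketched as a single function. The registration page itself is written in PHP and JavaScript, so the following Python code is only an illustration. The field names `user_id`, `ip_address` and `sync_path` are hypothetical, the User Id and IP address are treated as the unique pair, and the check for "repeated inputs" is left out because the text does not define it precisely.

```python
import ipaddress


def check_registration(form, registered_pairs):
    """Return 'empty', 'invalid', 'duplicate' or 'ok' for a submitted form.

    `form` maps field names to the submitted strings, and
    `registered_pairs` is the set of (user_id, ip_address) pairs that
    are already stored in the registration table.
    """
    for field in ("user_id", "ip_address", "sync_path"):
        if not form.get(field, "").strip():
            return "empty"
    ip_text = form["ip_address"].strip()
    try:
        ipaddress.ip_address(ip_text)   # raises ValueError if malformed
    except ValueError:
        return "invalid"
    if (form["user_id"].strip(), ip_text) in registered_pairs:
        return "duplicate"              # warn and return to the main page
    return "ok"                         # insert and go to the add courses page
```

Only a form that passes the empty and invalid checks reaches the database lookup, which mirrors the order of the steps in the list above.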

• The application provides separate navigation for the client institutions and for the faculty members of the institution who are sharing their content

• The information needed by users is written on the main page for each user type, i.e. the to-do details for clients are specified on the "Register Institution" page and the to-do details for faculty are specified on the "Join as Faculty" page

Figure 5.2: Page containing options for faculty

Add Courses by users:

• A user can add any number of courses from any number of departments, and can add courses at any point of time

• The page accepts the User Id and IP address as input, which are treated as a unique pair. After validating the entered details against the "registration" table and checking them for repeated, invalid and empty inputs, it displays the list of departments from the database

• Upon selecting a department, the application displays the list of courses as checkbox elements. The user can select multiple courses at a time

• The application then creates entries in the "details" table with the entered details

Figure 5.3: Faculty Registration

Figure 5.4: Faculty adding courses

Update User information:

• When the IP address, credentials or any other registration detail changes, the user can reset it by selecting the field to be changed and entering the existing User Id and IP address.

• The user is then redirected to a page where both the existing value and the new value must be entered.

• Here too, the entered details are checked for repeated, invalid and empty inputs.

• If the data passes these checks, the application replaces the existing values with the new ones in both "registration" and "details".

Figure 5.5: Modifying existing Fields

View and Download Files:

• This is a particularly useful feature of the application. Users can log in to their individual account maintained at the server and find the files shared with them.

• Sharing is based on the courses for which the user is registered. The list of files shown here is the same as the list of files transferred to the remote machine.

• Users do not have to search directories individually: they see the list of files shared with them directly, along with the course each file belongs to and its date and time.

• The user can download individual files from the server as and when needed.

• The course and filename pairs also tell users which files are stored on their local system and in which folder.

Figure 5.6: Login to View and Download the shared files

Figure 5.7: Login to View and Download the shared files

Case Studies:

• Figures 5.8, 5.9, 5.10 and 5.11 show several case studies of the Remote Synchronization application.

• Figure 5.8 shows the error message generated when invalid inputs are entered in the form while updating information that already exists in the database.

• Figure 5.9 shows the error message generated when empty fields are submitted during the registration phase.

• Figure 5.10 shows the success message at the end of the registration phase, when the entered values are neither duplicates (already in the database), empty, nor of an invalid type.

• Figure 5.11 shows the success message at the end of the update-information phase under the same conditions, with all other parameters left unchanged.

Figure 5.8: CASE 1: Error Message when invalid inputs are entered

Figure 5.9: CASE 2: Error Message when Empty fields are entered

Figure 5.10: CASE 3: Success Message when values are Updated

Figure 5.11: CASE 4: Success Message when user is Registered

Server side implementation:

The server side works step by step as follows:

• Through a cron job, the server periodically runs a script that finds newly registered users, both contributors and receivers.

• The script creates a public directory on the server for each such user and adds a subdirectory inside it for each registered course.

• It then uses expect to spawn "scp" and transfer the public key automatically through "ssh-copy-id" to every newly registered user.

• To handle lost or corrupted keys, another cron job periodically retransmits the public key to all already registered users, replacing the server's existing key.

• If the key is corrupted, deleted or lost at some point, data transmission continues with password authentication. Once the key is restored in the next cycle, passwordless transmission resumes.

• Another cron script periodically checks whether the contributors have added new content and fetches it into the server's corresponding local directory, using expect to spawn "ssh" and rsync for the transfer.

• When new content is added, the recipient list is computed from the "details" table and a soft link to the original file is created in each destination user's public directory on the server.

• The file is then transferred to the remote receivers, again using expect to spawn "ssh" and rsync for the transfer. Files transferred by rsync over the network are compressed and encoded at the binary level before transmission [16].
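The last two steps can be illustrated with a short Python sketch. It is not the shell/expect code of the application: the function names, directory layout, user names and host addresses are assumptions made for illustration, and the rsync command is only built, not executed. The flags `-a` (archive) and `-z` (compress) are standard rsync options; passwordless operation assumes the public key has already been installed on the receiver.

```python
import os

def rsync_command(source, remote_user, remote_host, remote_dir):
    """Build an rsync-over-ssh command line (not executed here)."""
    # -a preserves directory structure and attributes, -z compresses in transit.
    return ["rsync", "-az", "-e", "ssh", source,
            f"{remote_user}@{remote_host}:{remote_dir}"]

def link_to_recipients(original, public_root, recipients, course):
    """Create a soft link to `original` in each recipient's course directory."""
    links = []
    for user in recipients:
        course_dir = os.path.join(public_root, user, course)
        os.makedirs(course_dir, exist_ok=True)
        link = os.path.join(course_dir, os.path.basename(original))
        # Skip links that already exist so the script can be re-run by cron.
        if not os.path.lexists(link):
            os.symlink(os.path.abspath(original), link)
        links.append(link)
    return links
```

Linking rather than copying keeps a single stored copy of each file on the server, however many recipients are registered for the course.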

5.2 Implementation of Local Synchronization Application

Moodle v1.9 and v2.3: Moodle is an open-source CMS (course management system) used by thousands of educational institutions around the globe to provide a highly organized and secure interface for e-learning. It allows educators to create online courses, which users can access as a virtual classroom. The application synchronizes content according to its own database, which is generated from the mapping obtained from the Moodle database. It has been tested with both Moodle 1.9 and Moodle 2.3, so institutes running either version can use the application.

iNotify: Inotify (inode notify) is a Linux kernel subsystem that extends filesystems to notice changes and report them to applications. The watch is the core concept of inotify. In this application an iNotify WATCH monitors the directory to which tablets upload files, and iNotify WAIT keeps watching until a modification occurs in the watched directory.
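A minimal sketch of such a watch is given below, assuming that "iNotify WAIT" refers to the `inotifywait` command from the inotify-tools package; this is an assumption, as are the function names and the choice of the `close_write` event to detect a finished upload. The sketch is in Python for illustration, whereas the application itself calls shell scripts.

```python
import subprocess

def inotifywait_command(watch_dir):
    """Command line for inotifywait (inotify-tools) in monitor mode."""
    # -m keeps watching after each event; close_write fires once a written
    # file has been closed, i.e. when an upload has finished.
    return ["inotifywait", "-m", "-e", "close_write",
            "--format", "%w%f", watch_dir]

def completed_uploads(lines):
    """Yield one path per non-empty output line from inotifywait."""
    for line in lines:
        path = line.rstrip("\n")
        if path:
            yield path

def watch(watch_dir, handle):
    """Call handle(path) for each completed upload; blocks until killed."""
    with subprocess.Popen(inotifywait_command(watch_dir),
                          stdout=subprocess.PIPE, text=True) as proc:
        for path in completed_uploads(proc.stdout):
            handle(path)
```

Calling `watch("/srv/uploads", print)` (a hypothetical path) would print the full path of every file whose upload has completed in that directory.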

Implementation Details

Login and Setting parameters:

Figure 5.12: Login screen

Figure 5.13: Setting Screen

• The Aakash tablet runs Android, which is built on the Linux kernel. Users set the server IP address and the default path for storing downloaded files. If these values are not set, the application uses its default values.

• After getting the login credentials, the client application sends an HTTP request to the server with the UserId and Password as name-value pairs. The server checks them against the application database and responds "User found" if the pair exists, and "Invalid Entry" otherwise.

• The Android application receives the response and goes to the Main page if the pair exists; otherwise it shows an error message as a toast notification.

• The user clicks the Submit button to send a file. The file must first be renamed to the prescribed format. The application lists the files and folders in local storage as individual elements, without opening a separate file browser, and provides an UP button to move to the parent folder.

• When the user selects a file, the application displays the filename for confirmation, and on clicking the SEND button the file is delivered to the server over the Wi-Fi connection.
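The login exchange can be sketched as below. The parameter names `UserId` and `Password` and the two response strings come from the description above; everything else (function names, the dictionary standing in for the application database, plain-text password comparison) is a simplifying assumption, and the actual client is an Android application talking to a PHP server.

```python
from urllib.parse import urlencode, parse_qs

def build_login_query(user_id, password):
    """Client side: encode the credentials as name-value pairs."""
    return urlencode({"UserId": user_id, "Password": password})

def handle_login(query, credentials):
    """Server side: answer with the two responses named in the text.

    credentials: dict mapping user id to password (a stand-in for the
    application database).
    """
    params = parse_qs(query)
    user_id = params.get("UserId", [""])[0]
    password = params.get("Password", [""])[0]
    if user_id and credentials.get(user_id) == password:
        return "User found"
    return "Invalid Entry"
```

On "User found" the client would open the Main page; on "Invalid Entry" it would show the toast notification.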

Figure 5.14: User Validation from Server’s Database

Figure 5.15: Main Screen

Figure 5.16: Screen after clicking Submit button

Figure 5.17: Screen showing File Browser with UP button

Figure 5.18: Screen after selecting the file to send

View the Details of last Transactions and Download a file from server

• This is an additional feature. After clicking the View button, users can see the list of files shared with them or sent by them, which helps them decide whether to click the Update button.

• The application sends the UserId in an HTTP request asking for that user's details. The server queries the database and sends the results back to the tablet, where the application displays them in tabular form.

• From the View results, the user can tell which files she/he has not yet downloaded, when they were shared, in which course and by which teacher. This avoids repeated file transfers over Wi-Fi, where large transfers take a long time and drain the tablet's battery.

• The user can download selected files or the entire set from the server; the client application fetches the appropriate file or folder depending on the UserId and the option chosen. Downloads are stored in the default download folder of local storage set by the user.

Figure 5.19: Page Showing several Options for downloading appropriate file

Processing each event at the server:

• When any user, teacher or student, submits a file, it is uploaded to the watched directory. inotify triggers all further processing on each upload event, which makes the file available to its destinations in real time.

• As soon as the upload completes, the file is moved to the user's public directory, identified by the UserId extracted from the file name. All users' public directories are under a single inotify watch, so when the file arrives in the public directory, inotify invokes shell scripts for further processing.

• The shell script first creates a database entry with the UserId, CourseId, filename, type (Send/Received) and time. It then determines the receivers from the category of the user: if the user is a student, the destination list contains only the teachers offering that course; if the user is a teacher, it contains the other teachers of the course (when more than one teacher offers it) along with all its registered students.

• The file is then copied to each destination user's public directory inside the server's Downloads directory, a backup copy is kept at the server, and the database is updated with the same details for each user in the destination list. The application lets users download files shared in the last 1 day, 2 days, 1 week, 2 weeks, 1 month and so on, so the server rebuilds the file structure, adding the new file to the appropriate folders.
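The rule for building the destination list can be written down compactly. The following Python sketch is illustrative only; the function name and the two dictionaries, which stand in for the Student-Course-Teacher mapping derived from Moodle, are assumptions.

```python
def destination_list(sender, course, teachers, students):
    """Return the users who should receive a file submitted by `sender`.

    teachers / students: dicts mapping a course id to a list of user ids
    (stand-ins for the Moodle-derived mapping tables).
    """
    course_teachers = teachers.get(course, [])
    course_students = students.get(course, [])
    if sender in course_teachers:
        # A teacher's file goes to the other teachers of the course
        # and to all its registered students.
        return [t for t in course_teachers if t != sender] + list(course_students)
    if sender in course_students:
        # A student's file goes only to the teachers offering the course.
        return list(course_teachers)
    return []  # sender is not mapped to this course
```

For a course with teachers `t1`, `t2` and students `s1`, `s2`, a submission by `s1` is delivered to both teachers, while a file shared by `t1` is delivered to `t2`, `s1` and `s2`.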

Figure 5.20: Page Displaying Last Transactions of the logged in User

Chapter 6

Experiments and Analysis of Results

Extensive experiments have been conducted on the Remote Synchronization application while varying several parameters. The results have been compared against SCP and analyzed. This chapter describes the objectives, the scenarios, the results obtained and the analysis that demonstrates the efficiency of the application.

Test Scenario

The main objective of the experiments is to measure the performance of the remote synchronization application under various conditions. Tests have been conducted for conditions such as

• varying the number of clients receiving content

• varying the size of the file being transmitted

• the difference in transmission time with and without the public-private key pair

• the transmission time of the original transmission versus retransmission of the same file with a slightly modified file name or content

• the difference in transmission time between conventional scp-based transmission and rsync-based transmission

• disconnecting the transmission while the data transfer is in progress

The central server is a LAMP (Linux, Apache, MySQL, PHP) server, and the clients run either Windows or Linux (Ubuntu 12.04). All tests were first carried out without the key and then repeated using the generated key. The experiments use four sets of users: 20, 50, 75 and 100 users. Around 15 experiments were conducted for each set of clients, one per file size from 10 KB, 50 KB and 100 KB up to 2 GB.

            User SET (transfer time in seconds)
File Size   20 CLIENTS   50 CLIENTS   75 CLIENTS   100 CLIENTS
10 KB       30           69           108          145
50 KB       30           69           108          145
100 KB      31           71           109          146
500 KB      32           72           109          147
1 MB        34           79           112          148
5 MB        36           89           118          151
10 MB       40           105          125          159
50 MB       118          271          375          545
100 MB      153          412          625          925
200 MB      306          934          1801         2141
400 MB      544          2116         2916         4214
500 MB      590          2640         3687         5689
700 MB      866          3689         4883         7944
850 MB      1199         4440         6621         8932
1 GB        1280         5280         7186         10067
2 GB        2587         11653        ---          ---

Figure 6.1: Results of various size of file transfer to various user sets

            User SET
File Size   75 CLIENTS   75 CLIENTS        100 CLIENTS   100 CLIENTS
                         RETRANSMISSION                  RETRANSMISSION
10 KB       108          108               145           145
50 KB       108          108               145           145
100 KB      109          109               146           146
500 KB      109          109               147           147
1 MB        112          112               148           148
5 MB        118          116               151           151
10 MB       125          117               159           153
50 MB       375          119               545           155
100 MB      625          121               925           156
200 MB      1801         123               2141          159
400 MB      2916         125               4214          162
500 MB      3687         127               5689          163
700 MB      4883         130               7944          167
850 MB      6621         135               8932          172
1 GB        7186         141               10067         176

Figure 6.2: Results of Retransmission time taken for the user sets of 75 and 100

The findings of Figure 6.1 and Figure 6.2 are discussed as follows:

• Figure 6.1 shows the data obtained by transferring files of various sizes from the server to the various user sets. All results are transmission times in seconds using RSA keys.

• The table shows that for files up to 10 MB the total elapsed transfer time for a given user set is nearly constant. The largest transfer in the table is 100 GB in total: 1 GB to each of 100 users.

• This took around 3 hr 20 min, which is a short time for a fully automatic transfer of such a large volume of data.

• For comparison, a manual test transferring a total of 10 GB over USB 3 took around 12 minutes, even though USB 3 is considered a very high-speed transmission medium.

• Figure 6.2 shows the retransmission time for the user sets of 75 and 100. As already noted, for file sizes up to 10 MB the transmission time is the minimum time needed to transfer a file to that set of users.

• For small files such as PDFs or PPTs, the retransmission experiment slightly modifies the content of the original file; for large files (video files), the same file is retransmitted.

• The data clearly show that, however large the file, retransmission with rsync takes close to the minimum time to transfer it over the network. The retransmission times for 10 KB and 1 GB differ very little, and that difference is the time taken to compute, compare and communicate the hash and offset pairs.

• As a result, a repeat transfer of files by the remote synchronization application takes only the time needed to transfer the new content.
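The idea behind the hash and offset pairs can be illustrated with a strongly simplified sketch. It is not the rsync implementation: rsync uses a cheap rolling checksum together with a stronger hash [12], whereas this sketch, for brevity, recomputes a SHA-256 hash at every offset. The function names and the tiny block size in the example are assumptions chosen for illustration.

```python
import hashlib

def block_signatures(old, block_size):
    """Receiver side: map the hash of each full block of `old` to its index."""
    sigs = {}
    for i in range(0, len(old) - block_size + 1, block_size):
        digest = hashlib.sha256(old[i:i + block_size]).digest()
        sigs.setdefault(digest, i // block_size)
    return sigs

def make_delta(new, sigs, block_size):
    """Sender side: describe `new` as block references plus literal bytes."""
    delta, literal, pos = [], bytearray(), 0
    while pos < len(new):
        chunk = new[pos:pos + block_size]
        index = None
        if len(chunk) == block_size:
            index = sigs.get(hashlib.sha256(chunk).digest())
        if index is not None:
            # Known block: flush pending literal bytes, then reference it.
            if literal:
                delta.append(("data", bytes(literal)))
                literal = bytearray()
            delta.append(("copy", index))
            pos += block_size
        else:
            # Unknown data: send this byte literally and slide by one.
            literal.append(new[pos])
            pos += 1
    if literal:
        delta.append(("data", bytes(literal)))
    return delta

def apply_delta(old, delta, block_size):
    """Receiver side: rebuild the new file from its old copy and the delta."""
    out = bytearray()
    for kind, value in delta:
        if kind == "copy":
            out += old[value * block_size:(value + 1) * block_size]
        else:
            out += value
    return bytes(out)
```

With `old = b"abcdefghijklmnop"`, `new = b"abcdXefghijklmnop"` and a block size of 4, the delta consists of four block references and the single literal byte `X`, so only the inserted byte travels over the network as data. An unmodified file produces block references only, which is why the retransmission times above are almost independent of file size.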

Many other results were obtained from the experiments. Only the important parts are shown in the tables above; the analysis in the next section is based on all the results obtained, not only on the data displayed in the tables.
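As a rough check on the tabulated values, the aggregate throughput and the gain from retransmission can be computed directly from the 1 GB rows of the two tables. The sketch below assumes 1 GB = 1024 MB; the function names are illustrative.

```python
# Values copied from the tables in Figures 6.1 and 6.2 (seconds, 1 GB file).
first_transfer = {75: 7186, 100: 10067}
retransmission = {75: 141, 100: 176}

def aggregate_throughput_mb_s(clients, file_mb, seconds):
    """Total volume delivered to all clients divided by elapsed time."""
    return clients * file_mb / seconds

def retransmission_ratio(clients):
    """How many times faster the retransmission was than the first transfer."""
    return first_transfer[clients] / retransmission[clients]

for n in (75, 100):
    print(n,
          round(aggregate_throughput_mb_s(n, 1024, first_transfer[n]), 2),
          round(retransmission_ratio(n), 1))
```

For both user sets the first transfer of the 1 GB file corresponds to an aggregate rate of roughly 10 MB/s, and the retransmission is more than 50 times faster than the first transfer.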

Analysis of Results

This section analyzes the results, which are presented graphically. The initial communication that determines the minimum content of a file to be transferred to the receiver consumes a large amount of CPU time and therefore calls for a high-performance server. All experiments for this application were carried out on a laptop with 2 GB RAM and a 64-bit Intel i3 processor. Even under this limitation the content synchronization application produced good results, so a higher-performance server can be expected to make the application faster than the current results indicate.

[Plot: Transfer details for varying file sizes and number of recipients. x-axis: number of users (20, 50, 75, 100 users); y-axis: time elapsed in seconds (0 to 10500); series: 100 MB, 500 MB, 1 GB]

Figure 6.3: Comparison of File Transfer time for various User sets

File transfer in various user sets

• As shown in Figure 6.3, the file sizes selected are 100 MB, 500 MB and 1 GB, for each of the sets of 20, 50, 75 and 100 users.

• This gives a comparative view of the time the application took for file transfer across the overall test scenario.

File transfer via SCP vs Sync Application

• Figure 6.4 shows the transmission of the same file over SCP and over the developed content synchronization application. The results for the application come from the experiments described in the previous section. The results for SCP were obtained by transmitting the file to one user and multiplying that time by the total number of users, although in practice the same transfer would take longer.

• The figure shows that transferring a total of 100 GB of data may take up to 8 hr with SCP, whereas the application took nearly 3.5 hr.

Figure 6.4: File transfer via SCP vs Sync Application

[Plot: series SCP and Remote Sync Application; x-axis: file size (100 KB, 1 MB, 10 MB, 100 MB, 200 MB, 500 MB, 850 MB, 1 GB); y-axis: 0 to 26000]

Figure 6.5: Retransmission time comparison

[Plot: series SCP and RSYNC+KEY; x-axis: file size (100 KB, 1 MB, 10 MB, 100 MB, 200 MB, 500 MB, 850 MB, 1 GB); y-axis: 0 to 5500]

Retransmission time in SCP vs Sync Application

• The same files are very likely to be retransmitted. We therefore compared retransmission of already transferred files, with a small modification or with none, between the remote synchronization application (using rsync and key pairs) and the commonly used remote transfer tool SCP, as shown in Figure 6.5.

• When the file content is unmodified, SCP takes around 60-70% of the original transfer time, whereas the synchronization application takes considerably less. This makes the application very useful when many files are transferred to hundreds of users.

Figure 6.6: Performance Comparison between Application, Rsync and SCP

[Plot: series SCP, RSYNC and Application; x-axis: file size (100 MB, 500 MB, 1 GB); y-axis: 0 to 24000]

Comparison of file transfer via Application with RSYNC and SCP

Finally, consider the most important case, the transfer time when a file is transferred for the first time. Here rsync takes less time than scp-based file transfer, and when Autosync transfers a file it uses the RSA keys for passwordless transmission and takes less time still. The main reason is that the key authenticates the receivers, and once a receiver is authenticated, all file transfers via rsync proceed without any further credential checking.

Chapter 7

Conclusion and Future Work

The effect of e-learning on the global learning process has grown rapidly over the past decades. E-learning, in its different manifestations, has made its way into the majority of educational settings, changing the learning environment and bringing a paradigm shift in the general understanding of learning itself. It provides not only learning and organizational facilities but also support for communication and interaction.

The Local Synchronization application synchronizes files chosen by senders to the list of authorized receivers through Aakash tablets, based on Moodle. It allows users to download shared documents to local storage, view details of the last transactions, and configure the default storage location and server location, providing features such as assignment submission by students and distribution of study material by teachers to registered students based on Moodle entries. The Remote Synchronization application synchronizes the data kept in the server-created directory at the remote computers and updates the data on the registered computers.

The architecture of the content synchronization application establishes consistency among data from a source to one or more target data stores, in a two-way manner over time. As learning in general has shifted from a teacher-centered process to a learner-centered one, the Synchronization Application will contribute to making content such as PDFs, PPTs, study material and even video tutorials readily available to students, and assignment submissions and feedback readily available to teachers. In the area of e-learning, the Content Synchronization Application autonomously tailors the learning content and learning support to individual learners, to groups of learners and even between institutions.

I have submitted my work on the Remote Content Synchronization Application to IEEE Transactions on Learning Technologies. My future work is to submit the work I have done on the Local Content Synchronization Application for Aakash tablets, so that the idea can inspire and guide others who want to work further in the same field.

Bibliography

[1] Cesar Correa Arias, Collaborative Academic Work as a Power Strategy for an Inclusive E-learning Education, University of Guadalajara, Mexico, IEEE, ICEMT 2010.

[2] Jianfeng Zhu, Study On E-learning Education Model Based on Web Intelligence, Harbin Normal University, Harbin, China, International Conference on e-Education, e-Business, e-Management and e-Learning, IEEE 2010.

[3] Shu Na, Liu Jing, The Impact of Learner Factor on E-Learning Quality, Department of Educational Technology, Capital Normal University, China, International Conference on e-Education, e-Business, e-Management and e-Learning, IEEE 2009.

[4] Alireza Ghobadi, C. Eswaran, Chin-Kuan Ho, Automated tools for manipulating files in a distributed environment with RSYNC, Advanced Communication Technology (ICACT), Feb. 7-10, IEEE, 2010.

[5] Meilian Lu, Yubing Zeng, Research And Implementation Of Multi-device Content Synchronization In Converged IP Messaging System, Beijing University, China, International Conference on Information and Multimedia Technology, IEEE 2009.

[6] Microsoft Windows Site, http://windows.microsoft.com/en-in/windows-vista/what-is-the-difference-between-one-way-and-two-way-sync, June 2013.

[7] Copyright TGRMN Software. TGRMN Software products, ViceVersa: File Synchronization, File Replication, Windows Backup Software, http://www.tgrmn.com/web/kb/item78.htm, June 2013.

[8] Hao Yan, Utku Irmak, Torsten Suel, Algorithms for Low-Latency Remote File Synchronization, CIS Department, IEEE INFOCOM proceedings of Communications Society, 2008.

[9] Hao Zhang, Chuohao Yeo, and Kannan Ramchandran, Rate Efficient Remote Video File Synchronization, Department of EECS, University of California, USA, Conference on Speech and Signal Processing IEEE (ICASSP), 2009.

[10] Aakash (tablet), http://en.wikipedia.org/wiki/Aakash_%28tablet%29, 17 June 2013.

[11] Public and Private Keys, Key-Based SSH Logins, https://help.ubuntu.com/community/SSH/OpenSSH/Keys, June 2013.

[12] Andrew Tridgell, Paul Mackerras, The rsync algorithm, Department of Computer Science, Australian National University, Australia, http://rsync.samba.org/tech_report/, June 2013.

[13] Infrant Technologies, READYNAS INSTANT STORAGE, Using Rsync for NAS-to-NAS Backups, 2006.

[14] Torsten Suel, Patrick Noel, Dimitre Trendafilov, Improved File Synchronization Techniques for Maintaining Large Replicated Collections over Slow Networks, International Conference on Data Engineering (ICDE), IEEE 2004.

[15] Utku Irmak, Svilen Mihaylov, Torsten Suel, Improved Single-Round Protocols for Remote File Synchronization, CIS Department, IEEE, 2005.

[16] Ryozo Kiyohara, Satoshi Mii, Koichi Tanaka, Yoshiaki Terashima, Hidetoshi Kambe, Study on Binary Code Synchronization in Consumer Devices, IEEE Transactions on Consumer Electronics, Vol. 56, No. 1, February 2010.

[17] Alireza Ghobadi, Ehsan Haji Mahdizadeh, Pre-Processing Directory Structure For Improved RSYNC Transfer Performance, International Conference on Advanced Communication Technology (ICACT) 2011.

[18] SyncBreeze File Synchronization, Flexense Ltd, SyncBreeze File Synchronization User Manual, Version 5.3, June 2013.

[19] A Guide To Getting Started With Dropbox File Backup Sync, http://www.guidingtech.com/3656/dropbox-file-backup-sync/, 2013.

[20] Mitsushi Fujimoto, Stephen M. Watt, An Interface for Math e-Learning on Pen-Based Mobile Devices, 2009.

[21] Giuseppe Laria, Mobile and nomadic user in e-learning: the Akogrimo case, Grid for complex problem solving, SIXTH FRAMEWORK PROGRAMME, PRIORITY IST-2002.

[22] Jason Haag, From eLearning to mLearning: The Effectiveness of Mobile Course Delivery, Advanced Distributed Learning Initiative, Interservice/Industry Training, Simulation, and Education Conference (I/ITSEC) 2011.

[23] Fatos Xhafa, Santi Caballe, Isaac Rustarazo, Leonard Barolli, Implementing a Mobile Campus Using MLE Moodle, Technical University of Catalonia, Spain, International Conference on P2P, Parallel, Grid, Cloud and Internet Computing, IEEE, 2010.

[24] Ki-Cheol Son, Jong-Yeol Lee, The Method of Android Application Speed up by using NDK, Dept. of Electronic Engineering, 2010.

[25] Margaret Butler, Android: Changing the Mobile Landscape, Published by the IEEE CS, 1536-1268/11, IEEE, 2011.

[26] Foutse Khomh, Hao Yuan, Ying Zou, Adapting Linux for Mobile Platforms: An Empirical Study of Android, 28th IEEE International Conference on Software Maintenance (ICSM), 2012.

[27] Jingtong Hu, Chun Jason Xue, Edwin H.-M. Sha, Yi He, Reprogramming with Minimal Transferred Data on Wireless Sensor Network, Department of Computer Science, University of Texas at Dallas, IEEE, 2009.

[28] Alireza Ghobadi, C. Eswaran, Nithiapidary Muthuvelu, Ian K.T. Tan, An Adaptive Wrapper Algorithm for File Transfer Applications to Support Optimal Large File Transfers, Advanced Communication Technology (ICACT), IEEE 2009.

[29] Meilian Lu, Yubing Zeng, Research And Implementation Of Multi-device Content Synchronization In Converged IP Messaging System, IEEE Computer Society, 2009.

[30] Sattar J Aboud, Mohammad A AL-Fayoumi, An Efficient RSA Public Key Encryption Scheme, Fifth International Conference on Information Technology: New Generations, IEEE 2008.

[31] Milad Bahadori, Mohammad Reza Mali, Omid Sarbishei, Mojtaba Atarodi, Mohammad Sharifkhani, A Novel Approach for Secure and Fast Generation of RSA Public and Private Keys on SmartCard, IEEE 2010.

[32] Wei-Meng Lee, Beginning Android Application Development, Wiley Publishing, Inc., Indianapolis, Indiana, 2011.

[33] Reto Meier, Professional Android Application Development, Wiley Publishing, Inc., Indianapolis, Indiana, 2011.

[34] Xian-He Sun, Surendra Byna, Yong Chen, Improving Data Access Performance with Server Push Architecture, Illinois Institute of Technology, Department of Computer Science, IEEE 2007.

[35] Jack Y. B. Lee, Staggered Push: A Linearly Scalable Architecture for Push-Based Parallel Video Servers, IEEE, 2002.

[36] Tie Liao, Global Information Broadcast: An Architecture for Internet Push Channels, INRIA/CS-Telecom, France, July-August 2000.