DBSHIELD: SECURING AGAINST DISTRIBUTION

ANAND BHATIA (ANANDR) & TEJASWI SUDHA (T SUDHA)

CONTENTS

Abstract
Introduction
Dropbox - the growth story
Dropbox – insecure by design?
Motivations for an enhanced client
DBShield - Conceptual overview
Features
Evaluation of available anti-malware engines
DBShield - usage scenario
DBShield implementation
Client API review
Leveraging the Dropbox REST API
Implementation
Performance evaluation
Results
Lessons learnt
Planned optimizations & future work
Conclusions
Acknowledgements
References

ABSTRACT

Cloud-based file storage services have become increasingly popular of late as they offer convenience and seamless folder-based sharing. Among the host of options, Dropbox has proved to be the leader due to its ease of use, cross-platform client availability and low cost of entry. It has proved especially popular in academic circles, as it offers up to 18 GB of free storage space, easily meeting the needs of students. These services have also caught the attention of a more nefarious group of people, namely malware and spam distributors, who have exploited multiple security vulnerabilities in the new cloud-based offerings to their own ends. The sharing features of such services offer them an easy avenue for spreading malware. To prevent epidemics, it is unwise to rely on end users to deploy the protections necessary to contain malware to the infected host.

In this report, we present the design, implementation and evaluation of DBShield, a security-enhanced Dropbox command-line client which offers malware protection “out of the box”, thereby removing the burden of anti-malware protection from end users. It utilizes ClamAV, an open-source antimalware engine, to offer cross-platform protection for both upstream and downstream file syncing. As DBShield is written in Perl, we are able to offer security without compromising on the cross-platform compatibility of the standard Dropbox client. We also discuss the various attack vectors used against Dropbox in the past and ones which could potentially be used in the future. We have implemented a proof-of-concept prototype for both Linux and Windows platforms, tested it against real-world malware and carried out performance measurements to optimize client performance.

INTRODUCTION

Until recently, the most common way to share files on personal or small team-based projects was to send them via email. With the advent of cloud computing and broadband internet penetration, a new paradigm of file sharing has rapidly gained prominence. Centralized cloud-based file sharing and syncing has made it much simpler to view, edit and access common files from any terminal at any location. Of the host of cloud-based storage service providers, Dropbox reigns supreme, having the largest market share and a user base of over 100 million users [1].

DROPBOX - THE GROWTH STORY

Dropbox, which started out in September 2008, uses a freemium business model: it offers both free and paid accounts. Several factors contributed to Dropbox’s rapid growth. Some of them are:

• It supports a variety of devices across multiple operating systems, ranging from Mac OS X to Android devices.
• It offers free accounts starting at 2 GB of storage, expandable up to 18 GB.
• It is very easy to use, requiring little to no setup to get syncing working.
• It is making rapid inroads into the smartphone space, with HTC and Sony Ericsson making deals to offer bundled cloud storage to augment the on-board storage of the device.

Fig. 1: The rapid growth that Dropbox has witnessed over the past few years.

DROPBOX – INSECURE BY DESIGN?

However, all these advantages seem to be just the silver lining that hides a darker cloud. Many security experts have lambasted Dropbox for its insecure design [2][3][4]. The initial version of the Dropbox client suffered from the following security loopholes:

1. No encryption at the client end. This allows easy spoofing of data packets and the corresponding hashes.
2. Trust-based assumptions for client-sent hashes. This is among the better-known exploits: by using a client to spoof hashes, a user can gain access to arbitrary files that he or she does not own.
3. Weak data possession protocols [2]. Dropbox does not employ any provable possession algorithm to ensure that clients actually possess the data, leaving user data essentially public.
4. Direct download attacks. These use knowledge of the host ID, a unique identifier linking a device to a particular user’s account, to download chunks of the user’s data without owning the data itself.

These loopholes were, as expected, exploited, and Dropbox has suffered no fewer than three security breaches [4].

There were two major attack vectors targeting these loopholes:

1. Data and information leaks using loopholes 1 through 4.
2. Online “slack space” [3].

All of the above issues have more or less been fixed or are slated to be fixed in future Dropbox versions. The new security features introduced by Dropbox include TWO-FACTOR AUTHENTICATION and partially DROPPING DE-DUPLICATION to ensure data privacy for owners.

These “enhancements”, though still buggy, address the aforementioned attack vectors.

MOTIVATIONS FOR AN ENHANCED CLIENT

While Dropbox developers have begun to deal with existing attack vectors, there is a worrisome up-and-coming mode of attack: malware and spam distribution. Early instances of such attacks have been spotted in the wild since early 2012 [5][6]. The very genesis of these attacks is ingrained in the Dropbox Terms of Service [8]:

“You, not Dropbox, will be fully responsible and liable for what you copy, share, upload, download or otherwise use while using the Services. You must not upload or any other malicious …You, and not Dropbox, are responsible for maintaining and protecting all of your stuff. Dropbox will not be liable for any loss or corruption of your stuff…”

This asserts that Dropbox is just a file storage and syncing service. It does not, and for the near future will not, provide any malware filtering or protection of user data. Malware attacks manifest in three different ways:

1. Trojan distribution. Malware cartels have been using Dropbox’s public folders to host spyware and later download it onto infected machines. Dropbox thus serves as an easy, always-available means to further infect even partially compromised machines.
2. Spammer abuse. Dropbox’s public folders also serve as a convenient way to host public URLs which are, in essence, spammy links to advertisers. Spammers leverage the credibility of Dropbox URLs to trick people into clicking through.
3. Accidental spread across owner devices. This occurs when a user shares a file that is benign on one machine but harmful on another; Dropbox’s automatic syncing then spreads it, leading to a malware epidemic.

There have been several requests by users of the popular client to integrate basic anti-virus protection, even at an added cost [7], but that feature request is unlikely to be implemented by the Dropbox developers. To meet this and future malware threats, we implemented our secure Dropbox client, DBShield.

DBSHIELD - CONCEPTUAL OVERVIEW

DBShield is a secure, cross-platform and open-source Dropbox client. It is easy to use and extensible. The next section lists some of its basic features.

FEATURES

• Inbuilt malware protection. Leverages the open-source antimalware engine ClamAV.
• Cross-platform. Written in Perl.
• Full Dropbox functionality, including syncing of files both upstream and downstream.
• Open source.
• No major performance overheads.
• Extensible for further improvements.

“ClamAV is an open source (GPL) antivirus engine designed for detecting Trojans, viruses, malware and other malicious threats.” [9]

EVALUATION OF AVAILABLE ANTI-MALWARE ENGINES

The figure below reveals the pros and cons of the three anti-malware solutions we evaluated.

AVG Free antivirus
+ Very popular antivirus solution. Available on many platforms.
- Fails to scan archived files properly. Scant documentation. Poor results in finding viruses.

Avira AntiVir Solution
+ Very easy installation. Very little memory and few CPU cycles needed to complete a scan.
- Year-long licensing costs $25, a major drawback.

ClamAV
+ Originally developed as an email scanner, so it fits our problem well. Performs well even with zipped files. Very good documentation and very few third-party dependencies. The major plus point is that it is completely open source.
- Slightly lower accuracy compared to commercial solutions.

Fig. 2: Summary of the pros and cons of the popular cross-platform antimalware engines. Low cost and a good detection rate vis-à-vis AVG and AntiVir led us to choose ClamAV as the preferred solution for integration into DBShield.

DBSHIELD - USAGE SCENARIO

[Figure: Dropbox Client 1 (OS1) and Dropbox Client 2 (OS2), each running DBShield, sync through the remote Dropbox folder.]

Fig. 3: Schematic diagram depicting upstream and downstream filtering. The three cases shown are:

• File safe for both machines.
• File benign on Machine 1 (old definitions) but unsafe on Machine 2.
• Malware filtered from upload when Machine 1 does not have an existing anti-virus solution.

A typical usage scenario involves two different user machines using different operating systems. Note the different behaviors for different file signatures.

• TRANSPARENT OPERATIONS. A file benign to both machines will be uploaded as before.
• UPSTREAM PROTECTION. A file detected as malicious on the first machine is prevented from being uploaded, so it cannot poison the cloud and cause a malware “epidemic”.
• DOWNSTREAM PROTECTION. A file missed by DBShield on the upstream path, perhaps due to outdated virus definitions, can still be caught on the downstream path. This shows the symmetric protection offered by DBShield on both upstream and downstream paths.

We now look at how each of these protections is implemented.

DBSHIELD IMPLEMENTATION

CLIENT API REVIEW

Our DBShield client API provides the following commands to the end user:

• setup*
• ls
• put*
• get*
• sync download*
• sync upload*
• mkdir
• help

The commands marked with an asterisk are those which involve communication to and from the client. Hence, these commands were the prime targets to be made more secure.

LEVERAGING THE DROPBOX REST API

Dropbox provides a REST API [10] for third-party mobile apps and SDKs, so that nothing depends on the programming language used. Some of the API functions we used in our application are:

AUTHENTICATION

Dropbox uses OAuth for authenticating all the API requests. There are three steps in authenticating a client using the Dropbox API.

1. /oauth/request_token: In this first step, a request token is obtained, which is used in the other two steps.
URL: https://api.dropbox.com/1/oauth/request_token

2. /oauth/authorize: In this step, the user goes to the authorization page and logs into his or her Dropbox account. Without the user’s authorization in this step, it is not possible for the app to get an access token in the next step.
URL: https://www.dropbox.com/1/oauth/authorize

3. /oauth/access_token: After the first two steps, the app calls this endpoint to get the access token. The URL-encoded access token, access token secret and Dropbox user ID are returned, thus completing the authentication process. The token and secret pair can then be used to sign requests.
URL: https://api.dropbox.com/1/oauth/access_token
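As an illustration of how the token and secret pairs from the three steps can be used to sign requests, the following is a minimal sketch in Python (not the Perl used by DBShield). It assumes the OAuth 1.0 PLAINTEXT signature method defined in RFC 5849, sent over HTTPS; this report does not say which signature method DBShield uses, and the function name and the key and secret values are hypothetical.

```python
from urllib.parse import quote


def plaintext_oauth_header(consumer_key, consumer_secret, token=None, token_secret=""):
    """Build an OAuth 1.0 Authorization header using the PLAINTEXT method.

    With PLAINTEXT, the signature is the percent-encoded consumer secret
    and token secret joined by '&' (RFC 5849, section 3.4.4).
    """
    signature = f"{quote(consumer_secret, safe='')}&{quote(token_secret, safe='')}"
    params = {
        "oauth_version": "1.0",
        "oauth_signature_method": "PLAINTEXT",
        "oauth_consumer_key": consumer_key,
        "oauth_signature": signature,
    }
    if token is not None:
        params["oauth_token"] = token
    # Every value is percent-encoded once more when placed in the header.
    fields = ", ".join(f'{k}="{quote(v, safe="")}"' for k, v in params.items())
    return "OAuth " + fields


# Step 1 (request token): only the app key and secret are known.
step1_header = plaintext_oauth_header("app_key", "app_secret")

# Step 3 (access token): the request token and its secret are added.
step3_header = plaintext_oauth_header("app_key", "app_secret", "req_token", "req_secret")
```

The same helper would then sign ordinary API calls by passing the access token and access token secret returned in step 3.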

FILES AND METADATA

/files (GET): Downloads the specified file at the requested revision.
URL: https://api-content.dropbox.com/1/files//

/files_put: Uploads a file using PUT semantics.
URL: https://api-content.dropbox.com/1/files_put//?param=val

/metadata: Retrieves the meta information of a file or a folder.
URL: https://api.dropbox.com/1/metadata//

IMPLEMENTATION

SETUP PHASE

i) This is the basic authentication phase. When you run the application with the setup parameter, a new access token is obtained which can be used in further queries.

ii) The install phase involves installing the ClamAV antivirus solution on the user’s machine. An install shell script is provided to aid the user in installation. This script fetches the ClamAV source directly and makes all the configuration changes needed to obtain a clean install. Whenever we use the clamscan service in our application, we perform a sanity check to make sure that it was installed properly.

SECURING THE COMMUNICATION PIPELINE

We now examine each of the hardened commands by providing their usage and describing their implementation procedure.

1. GET: We download the file at remotepath to a temporary file, scan it using clamscan and quarantine it if clamscan finds a virus. Otherwise, we move the file to localpath.

2. PUT: We intercept the upload just as we intercept the download. We scan the file at localpath and quarantine it if any viruses are found. Otherwise, we let the application upload the file to remotepath using the REST API.

3. SYNC /sync-download: Sync is a feature which is not provided by the standard Dropbox API, so we had to implement it separately. In sync download, the user specifies the directory to be synced. Syncing is achieved by a depth-first traversal of the directory tree, comparing each file in the source to its counterpart in the destination. Comparisons are based on metadata elements such as size and timestamp. If there are any changes, the new files are downloaded to the local node.

We download the whole directory tree into a temporary folder and scan it all at once. We observed a heavy startup time for the clamscan service, which led us to scan the whole directory after download rather than each file during the sync.

4. SYNC /sync-upload: Sync upload is simpler than sync download. The whole directory tree that the user asked to sync is scanned by clamscan, whose -r parameter scans a directory recursively. The infected files are quarantined and the remaining files are synced as usual.
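The hardened commands above rest on two building blocks: a metadata comparison that decides whether a file needs transferring, and a scan gate that only releases a staged file once clamscan reports it clean. The following Python sketch (not DBShield's Perl code) illustrates both under stated assumptions: the function names, the quarantine directory and the "remote is newer" timestamp rule are hypothetical, while the clamscan options -r and --no-summary and the exit codes 0 (clean) and 1 (virus found) are taken from the clamscan manual.

```python
import os
import shutil
import subprocess


def needs_transfer(remote_size, remote_mtime, local_path):
    """Return True if the local copy is missing, differs in size, or is older."""
    if not os.path.exists(local_path):
        return True
    st = os.stat(local_path)
    return st.st_size != remote_size or st.st_mtime < remote_mtime


def clamscan_is_clean(path):
    """Scan a file or directory tree; exit code 0 is clean, 1 is infected."""
    result = subprocess.run(
        ["clamscan", "-r", "--no-summary", path],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    if result.returncode not in (0, 1):
        # Any other exit code means the scan itself failed.
        raise RuntimeError("clamscan failed: " + result.stderr.decode(errors="replace"))
    return result.returncode == 0


def gated_move(staged_path, final_path, quarantine_dir, is_clean=clamscan_is_clean):
    """Move a staged file into place only if the scanner reports it clean."""
    if is_clean(staged_path):
        shutil.move(staged_path, final_path)
        return final_path
    # Infected: park the file in the quarantine directory instead.
    os.makedirs(quarantine_dir, exist_ok=True)
    target = os.path.join(quarantine_dir, os.path.basename(staged_path))
    shutil.move(staged_path, target)
    return target
```

For GET, the staged path would be the temporary download and the final path would be localpath; for PUT, the same gate would run on localpath before the upload call. The scanner is passed in as a parameter so that the gate can be exercised without ClamAV installed.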

PERFORMANCE EVALUATION

Our test environments consisted of a Windows machine and a Linux machine. Configurations:

Parameter          Windows Box                         Linux Box
Operating system   Windows 7 Home Premium SP1 64bit    Ubuntu Linux kernel 3.0.0-17 x86_64
Processor          Intel Core i7-2670QM @ 2.2GHz       Intel Core i5-2410M @ 2.3GHz
Perl Version       ActivePerl [Perl v5.16.1]           Perl v5.12.4

To ensure that the overhead of the scan was acceptable, we performed uploads with input data sets of sizes varying from 97 MB to nearly 2 GB.

We observe that even for a small data set, the scan time starts out at nearly 10 seconds. This is attributed to the startup time of clamscan and is a constant that depends on the machine configuration.

[Figure: Scan time (s) plotted against data set size (MB). The x-axis runs from 0 to 2500 MB and the y-axis from 0 to 160 s.]

We derived the average slope and formulated the scan time for input data in MB as:

Scan time in seconds = 10 + 0.0735 * (Data set size in MB)

RESULTS

We measured the overhead created by ClamAV during upload/download.

• We observed that the clamscan service has a typical startup time of ~10 seconds. This is a constant that depends on the machine configuration.
• The overhead created was ~70 ms of scan time for each additional MB of data.

Thus, taking the average upload speed in NC, US to be close to 400 KB/s (the unit that the following figures imply), a typical 5 GB upload would take 218.45 minutes. In contrast, the scan time is about 6.27 minutes (roughly 6.4 minutes once the ~10 s startup is included), implying an overhead of just under 3%. This, though small, can be optimized further, as discussed in a later section.
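The arithmetic behind this estimate can be reproduced with a short Python sketch built on the fitted scan-time formula. Two readings are assumptions rather than statements from the measurements: the upload speed is interpreted as 400 kilobytes per second (the 218.45-minute figure only follows under that reading), and the 6.27-minute scan time corresponds to the per-MB term alone, without the 10-second startup. The function names are illustrative.

```python
STARTUP_S = 10.0    # clamscan startup time, in seconds
PER_MB_S = 0.0735   # fitted scan cost per MB, in seconds


def scan_time_s(size_mb, include_startup=True):
    """Scan time predicted by the linear fit."""
    startup = STARTUP_S if include_startup else 0.0
    return startup + PER_MB_S * size_mb


def upload_time_s(size_mb, speed_kb_per_s=400.0):
    """Upload time, assuming the speed is given in kilobytes per second."""
    return size_mb * 1024.0 / speed_kb_per_s


size_mb = 5 * 1024  # a 5 GB upload
upload_min = upload_time_s(size_mb) / 60
scan_min = scan_time_s(size_mb, include_startup=False) / 60
scan_with_startup_min = scan_time_s(size_mb) / 60
overhead = scan_min / upload_min
overhead_with_startup = scan_with_startup_min / upload_min

print(f"upload: {upload_min:.2f} min")
print(f"scan without startup: {scan_min:.2f} min ({overhead:.1%} overhead)")
print(f"scan with startup: {scan_with_startup_min:.2f} min ({overhead_with_startup:.1%} overhead)")
```

Either way the scan adds a little under 3% to the transfer time, and the fixed startup cost matters less as the data set grows.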

LESSONS LEARNT

• The official Perl add-on mentioned in the Dropbox wiki has many bugs in its implementation for cross-platform use cases. We have reported them to the authors/maintainers.

• The need for anti-malware integration into cloud-based storage solutions can no longer be neglected. Many users in different forums have requested anti-malware integration into Dropbox, but Dropbox has categorically declined, stating that it would then be responsible for users’ non-secure systems and that scanning would add runtime overhead to the native client. We feel that both options should be provided, and that the user should be the one to decide whether speed or security has higher priority.

• The developers who wrote Dropbox wanted to create a fast and easy-to-use synchronization system. This is why, for each file, only the portions that have changed are actually saved from one revision to the next. The process is called delta encoding. Thus, if we change only the timestamp of a file, nothing is transferred, as the hash of the file’s contents did not change.
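The last point can be illustrated with a small Python sketch showing that a hash computed over a file's contents is unaffected by a timestamp change but does react to an edit. SHA-256 is used purely as a stand-in; the hash function and block size that Dropbox actually uses are not stated in this report, and the helper name is hypothetical.

```python
import hashlib
import os
import tempfile


def content_hash(path):
    """Hash only the file's bytes; metadata such as mtime is not included."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(1 << 20), b""):
            h.update(block)
    return h.hexdigest()


with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "notes.txt")
    with open(path, "wb") as f:
        f.write(b"same contents")
    before = content_hash(path)

    os.utime(path, (0, 0))       # change only the timestamps
    after_touch = content_hash(path)

    with open(path, "ab") as f:
        f.write(b"!")            # change the contents
    after_edit = content_hash(path)

print(before == after_touch)     # True: nothing to transfer
print(before == after_edit)      # False: the change would be synced
```

A sync client that compares timestamps, as DBShield's sync does, can therefore flag a file as changed even though a content-hash comparison would not.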

PLANNED OPTIMIZATIONS & FUTURE WORK

• Adopting cloud-based anti-virus software. Recently there has been a surge in research into cloud-based antivirus solutions. Software like CloudAV uses a lightweight host agent running on the endpoints, which identifies new files and sends them to a network service for analysis.

• Implementing the whole Dropbox system on top of a user-level filesystem such as FUSE, where the creation or reading of a file can easily be captured and corresponding measures taken to secure the communication pipeline to the remote Dropbox service.

• Running the scan in parallel with the upload or download. The scan time for syncing can be hidden completely if it overlaps the transfer time.
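The overlap described in the last item could be structured as a simple producer and consumer pipeline. The Python sketch below is one possible arrangement, not a planned DBShield design: the function name is hypothetical, and the download and scan callables are placeholders supplied by the caller. Because clamscan has a noticeable startup time per invocation, the scan callable would in practice need to talk to a long-running scanner (ClamAV ships a daemon, clamd, for this purpose) rather than start clamscan for every file.

```python
import queue
import threading


def pipelined_sync(paths, download, scan):
    """Download files one by one while a worker thread scans finished ones."""
    pending = queue.Queue()
    verdicts = {}

    def scanner():
        while True:
            path = pending.get()
            if path is None:         # sentinel: no more files to scan
                break
            verdicts[path] = scan(path)

    worker = threading.Thread(target=scanner)
    worker.start()
    for path in paths:
        download(path)               # network-bound step
        pending.put(path)            # hand the finished file to the scanner
    pending.put(None)
    worker.join()                    # wait for the last scan to finish
    return verdicts
```

While file n is being scanned, file n+1 is already downloading, so only the scan of the final file adds to the total time when scanning is faster than the network.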

CONCLUSIONS

In this report, we described the design, implementation and performance analysis of DBShield, a secure Perl-based Dropbox client. DBShield is easy to use and at the same time secure and cross-platform, allowing the same client to be used on different devices with different operating systems. Moreover, DBShield is open source and can therefore be extended by the general community or tweaked to meet personal requirements.

ACKNOWLEDGEMENTS

We would like to thank our professor, Dr. Xuxian Jiang, for guiding us throughout the course of this project.

REFERENCES

[1] Victoria Barret. (n.d.). Dropbox Hits 100 Million Users Says Drew Houston - Forbes. Retrieved December 5, 2012, from http://www.forbes.com/sites/victoriabarret/2012/11/13/dropbox-hits-100-million-users-says-drew-houston/
[2] Derek Newton » Dropbox authentication: insecure by design. (n.d.). Retrieved November 26, 2012, from http://dereknewton.com/2011/04/dropbox-authentication-static-host-ids/
[3] Mulazzani, M., Schrittwieser, S., Leithner, M., Huber, M., & Weippl, E. Dark clouds on the horizon: Using cloud storage as attack vector and online slack space.
[4] Matt Marshall. (2012, August 1). Dropbox has become “problem child” of cloud security | VentureBeat. Retrieved December 5, 2012, from http://venturebeat.com/2012/08/01/dropbox-has-become-problem-child-of-cloud-security/
[5] Trojan Downloaders actively utilizing Dropbox for malware distribution « Webroot Threat Blog. (n.d.). Retrieved November 26, 2012, from http://blog.webroot.com/2012/03/21/trojan-downloaders-actively-utilizing-dropbox-for-malware-distribution/
[6] Dropbox Abused by Spammers | Symantec Connect Community. (n.d.). Retrieved November 26, 2012, from http://www.symantec.com/connect/blogs/dropbox-abused-spammers-0
[7] Premium service idea: Web based anti virus « Dropbox Forums. (n.d.). Retrieved November 26, 2012, from https://forums.dropbox.com/topic.php?id=8899
[8] Terms - Dropbox. (n.d.). Retrieved November 26, 2012, from https://www.dropbox.com/dmca#terms
[9] Clam AntiVirus. (n.d.). Retrieved December 5, 2012, from http://www.clamav.net/lang/en/
[10] Dropbox REST API reference. https://www.dropbox.com/developers/reference/api