The Cost of Confidentiality in Cloud Storage
Linköping University | Department of Computer and Information Science
Master thesis, 30 ECTS | Software Engineering
2018 | LIU-IDA/LITH-EX-A--18/016--SE

The Cost of Confidentiality in Cloud Storage

Eric Henziger

Supervisor: Niklas Carlsson
Examiner: Niklas Carlsson

Linköpings universitet, SE–581 83 Linköping, +46 13 28 10 00, www.liu.se

Copyright

The publishers will keep this document online on the Internet, or its possible replacement, for a period of 25 years starting from the date of publication barring exceptional circumstances. The online availability of the document implies permanent permission for anyone to read, to download, or to print out single copies for his/her own use and to use it unchanged for non-commercial research and educational purposes. Subsequent transfers of copyright cannot revoke this permission. All other uses of the document are conditional upon the consent of the copyright owner.
The publisher has taken technical and administrative measures to assure authenticity, security and accessibility. According to intellectual property law the author has the right to be mentioned when his/her work is accessed as described above and to be protected against infringement. For additional information about the Linköping University Electronic Press and its procedures for publication and for assurance of document integrity, please refer to its www home page: http://www.ep.liu.se/.

© Eric Henziger

Abstract

Cloud storage services allow users to store and access data in a secure and flexible manner. In recent years, cloud storage services have seen rapid growth in popularity as well as in technological progress, and hundreds of millions of users use these services to store thousands of petabytes of data. Additionally, the synchronization of data that is essential for these types of services accounts for a significant share of total internet traffic. In this thesis, seven cloud storage applications were tested in controlled experiments during the synchronization process to determine feature support and measure performance metrics. Special focus was put on comparing applications that perform client-side encryption of user data to applications that do not. The results show great variation in feature support and performance between the different applications, and that client-side encryption introduces some limitations to other features but does not necessarily impact performance negatively. The results provide insights and enhance the understanding of the advantages and disadvantages that come with certain design choices of cloud storage applications. These insights will help future technological development of cloud storage services.

Acknowledgments

Even though I am the sole author of this thesis, my journey has been far from lonely and I have many people to thank for reaching the completion of my thesis.
First and foremost, thanks to Associate Professor Niklas Carlsson for his work as examiner and supervisor. Niklas has been generous in sharing his vast knowledge and helped me get back on track when I was lost and things felt hopeless.

Thanks to my dear friend Erik Areström, whom I also had the pleasure to have as my opponent for this thesis. Erik's warmth and positive attitude have been a source of motivation and I'm happy to get to share this final challenge as a Linköping University student with you.

Thanks to my fellow thesis students with whom I've spent numerous lunches, fika breaks, and foosball games: Cristian Torrusio, Edward Nsolo, Jonatan Pålsson and Sara Bergman. You guys have turned even the dullest of work days into days of joy with interesting discussions and many laughs. Special thanks to my good friend Tomas Öhberg who, in addition to participating in the previously mentioned activities, has been the greatest of bollplanks when discussing our theses as well as life in general. Thanks to Natanael Log and Victor Tranell for their valuable feedback on early drafts of this thesis. I wish you all good fortune in your future endeavors and I hope that our paths may cross again sometime.

This thesis concludes my five years at Linköping University. It has been an adventurous time during which I have learned immensely and had the privilege to get to know many great people. Thanks to all my fellow course mates, especially Henrik Adolfsson, Simon Delvert and Raymond Leow, for being with me through tough and challenging exams, laboratory work and projects. Thanks to all examiners at the university departments IDA, MAI and ISY for pushing me to learn stuff that I would not have been disciplined enough to learn on my own. I would also like to thank my colleagues at Westermo R&D for being great role models in the software industry and for inspiring and motivating me for what's to come in my professional life.
Thanks to my awesome friends back in Hallstahammar. I don't have space to thank you all, but the three families Brandt, Joannisson and Tejnung form the very strong core. While my time with you has been limited during these years, it has always been of the highest quality.

Finally, my warmest thanks to my mom and dad, Aina and Bosse, and my sister, Annelie, for your endless support and for raising me to be who I am. Great work! ♡

This thesis was written using LaTeX together with PGFPlots for plot generation. The support from random strangers across the internet has been of great use in making this thesis into what it is.

Contents

Abstract
Acknowledgments
Contents
List of Figures
List of Tables
List of Code Listings
1 Introduction
  1.1 Aim
  1.2 Research Questions
  1.3 Contributions
  1.4 Delimitations
2 Theory
  2.1 Cloud Infrastructure and Cloud Storage
  2.2 File Encryption
  2.3 Cloud Storage User Behavior
  2.4 Cloud Storage Features
  2.5 Personal Cloud Storage Applications
  2.6 Related Work
3 Method
  3.1 Test Environment
  3.2 Testing Personal Cloud Storage Capabilities
  3.3 Advanced Delta Encoding Tests
  3.4 CPU Measurements
  3.5 Disk Utilization
  3.6 Memory Measurements
  3.7 Security in Transit
  3.8 Cloud Storage Traffic Identification
4 Results
  4.1 Compression
  4.2 Deduplication
  4.3 Delta Encoding
  4.4 CPU Utilization
  4.5 Disk Utilization
  4.6 Memory Utilization
  4.7 Security in Transit
5 Discussion
  5.1 Results
  5.2 Method
  5.3 The Work in a Wider Context
6 Conclusion
  6.1 Future Work
Bibliography
A Appendices
  A.1 Cloud Storage Application Changelogs
  A.2 Packet Size Distributions
  A.3 CPU Utilization
  A.4 Disk Utilization

List of Figures

2.1 Two files sharing the same cloud storage space for two chunks.
2.2 Attack scenario in a cross-user deduplicated cloud.
3.1 The testbed setup used for the cloud storage measurements.
3.2 Visualization of the update patterns used in the delta encoding tests.
3.3 The different phases and their transitions during the sync process.
3.4 Screenshot of MEGAsync preferences with HTTP disabled.
3.5 Screenshot of Wireshark during TLS analysis.
4.1 Compression test results for the different PCS applications.
4.2 Bytes uploaded with sprinkled updates