
Master Thesis Computing Science Cyber Security Specialization

Radboud University

Analyzing the Tahoe-LAFS filesystem for privacy friendly replication and file sharing

Author: Luuk de Bruin (S4811062)
Main supervisor/assessor: Dr. Jaap-Henk Hoepman
Second assessor: Dr. ir. Erik Poll

August 2019

Abstract. There are a lot of file sharing technologies in the world, each with its own characteristics. However, finding the right technology that supports file sharing, file synchronization, file replication, and file back-ups, and is, in addition, secure and privacy friendly, is much harder. We aim to develop a solution as a preparation for a World Disk, where it would be possible for two individuals to trade free space to, for example, remotely back up one's data securely and privacy friendly. We have analyzed several file sharing technologies and came to the conclusion that three of them would form the best basis for future research. The three that came out best are the Interplanetary File System (IPFS), Tahoe-LAFS and BitTorrent Sync. From these three, we found that Tahoe-LAFS best fits our end solution, and we analyzed this technology in greater detail: we looked at the architecture, access control, fault tolerance, and privacy. With regard to privacy, we found that Tahoe-LAFS, by default, does not hide the IP addresses of its users. However, a way was developed to run Tahoe-LAFS over Tor or over I2P (both anonymization networks) in order to support anonymous file downloads. We also looked at other limitations and found that an owner of a file does not retain any control (e.g. locking a file or blocking access for specific nodes) once he/she has shared access rights with others. To solve this, we came up with a solution containing a whitelist which only the owner can modify, and which all of the storage servers can read, in order to provide a flexible way to manage access control. We also looked at another solution, which is less flexible: re-encryption. This solution makes it possible for users with write rights to re-encrypt the contents of a file in order to "obfuscate" the contents of the file, making it impossible for others to read.

Keywords: Peer-to-Peer, File Sharing, File Synchronization, disk-space trading, file back-up, file replication, file sharing protocols, file systems, Tahoe-LAFS, Privacy in File Systems.


Contents

1 Introduction
1.1 Problem statement
1.2 Reading guide
1.3 Related work
2 Background
2.1 What are file synchronization & sharing services?
2.2 What is (centralized) cloud based file syncing & sharing?
2.3 What is P2P file syncing & sharing?
2.4 What is file back-up and how does it fit in file sharing & synchronization?
2.5 File systems
2.6 File systems versus file synchronization versus file sharing
3 Existing technologies regarding file sharing
3.1 Conclusion
4 Tahoe-LAFS
4.1 Architecture
4.2 Access control
4.3 Fault tolerance
4.4 Cryptography
4.5 Freshness of mutable files & directories
5 Improving the Tahoe-LAFS file system
5.1 Privacy analysis
5.2 Privacy improvements
5.3 Adding owner control to the Tahoe-LAFS file system
6 Conclusions & future work
7 Appendices
7.1 Appendix 1: Analysis of file syncing & sharing technologies
7.2 Appendix 2: Analysis Tahoe-LAFS, IPFS & BitTorrent Sync
References


1 Introduction

Looking for decent software that offers file synchronization, file back-up and file sharing can be quite hard due to the amount of solutions and technologies that can be found on the internet. It becomes harder to find such software that is also secure, and even harder if it also has to be privacy friendly. That is what this paper is about: analyzing and comparing existing technologies in order to come up with the best privacy friendly and secure solution that supports, among other things, file synchronization, file back-ups and file sharing.

1.1 Problem statement

Assume the following case: two individuals own some storage which they are not using entirely. Let's say both individuals own a 500GB storage device, while they only use a maximum of 250GB. Furthermore, suppose both users want to create an external back-up of their data. In order to create that back-up, they buy or rent storage elsewhere, even though those two individuals have matching demands: they both seek 250GB of storage to back up their data. This isn't very efficient, as both parties need another 250GB of storage to create a back-up of their data. A more efficient solution would be to offer a technology that makes it possible for these individuals to "trade" storage, meaning that the first individual can create a back-up on the storage device of the second individual and vice versa. This prevents a waste of storage and money, and due to the trading mechanism, there is no one who can abuse this system (space-for-space). Generalizing this idea leads to a "World Disk". This thesis tries to come up with a design for a technology that is secure and privacy friendly, and forms a solid basis for the World Disk. The technology has to support, among other things, the following functionalities:

─ File synchronization;
─ File back-up;
─ Anonymous downloads;
─ File sharing.

This leads us to the following research question:

“What is the best file sharing technology for our purpose, and how to improve it?”

1.2 Reading guide

Section 2 describes background information and the terminology used in the rest of the paper. This section provides the reader background information about the subject, i.e. it explains, for example, what (types of) file sharing exist and what file synchronization is. Section 3 analyzes multiple technologies for file sharing and file synchronization, and tests them individually against a framework that is created based on the problem statement, to come up with a selection at the end. Section 4 picks the best technology found in section 3, and gives an elaborate analysis of this technology. Section 5 analyzes the privacy aspects of this best technology, and comes up with solutions that could be implemented to improve it. It ends with section 6, which gives conclusions and describes future work. Section 7 contains appendices, in which the first appendix (7.1) is an analysis and comparison of multiple file sharing technologies. The second appendix (7.2) analyzes three of the best technologies found in section 3 in detail, and gives a more detailed comparison which is input for section 4. All sections consist of theoretical and literature study. There is no empirical part in this thesis; every piece of information is based on online sources and theoretical / academic papers.

1.3 Related work

Related work includes papers and sources that are used to describe certain technologies (such as papers and sources that are referenced in section 3). However, there is no research (yet) about how these technologies could fit in an end solution as described above. In addition, the technology that came out best for our purpose (Tahoe-LAFS) is not even completely described / summarized in one paper, as the main academic paper for this technology is missing a lot of information. Its docs are also a puzzle in which one has to dig to find specific information. In this respect, this is the only work (as far as we could find) that analyzes Tahoe-LAFS in detail, containing the necessary information.


2 Background

This section gives some background information before actually analyzing different existing technologies. The paper "Receiver anonymity within a distributed file sharing protocol" ("Solo Schekermans", 2019) analyzes existing software and dives deeper into security and privacy aspects of BitTorrent and BitTorrent Sync, while this paper analyzes existing technologies in general and tries to come up with the best solution/technology regarding file sharing. One can simply name a file sharing party, but what are the technologies that make those file sharing services possible? We will discuss some background information in this section, and investigate the technologies behind the services, and how they are used, in the next section (section 3). The paper of Solo Schekermans looks at existing solutions which might use one of the technologies described in this section. This section is purely to get an idea of what protocols exist, and how they operate / what they do. File sharing & syncing comes in two types: client-server (aka cloud based) and Peer-to-Peer (P2P).

2.1 What are file synchronization & sharing services?

File syncing & sharing services are services that make it possible to share files between multiple clients/peers. Authorized clients can store files on a remote server from which other clients can retrieve them, and vice versa. This allows multiple clients to access (and modify) the same (shared) file. This can be either centralized (using the cloud-based client-server model), or decentralized (using P2P). When using the centralized model, files are stored on a central server from which clients can retrieve the file. In the case of a decentralized model, no central (cloud) server is involved. Instead, clients store files on someone else's networked device (peer) that is part of the P2P network. Other (authorized) users that are part of the same P2P network can retrieve the file from that specific peer. More about these two types in the next paragraphs. File synchronization & sharing services are services where a dedicated sharing directory is used (in a virtual sense) to update files on each user's networked devices1. The synchronization service's purpose is to make sure that files in multiple locations are updated via certain rules. There are multiple types of file synchronization2:

• One-way file synchronization (also called mirroring)
─ This means that data is shared one-way. When updating a file, the file is copied from a source to one or more target locations. However, the other way around does not happen. Hence the term mirroring: the source is mirrored to the target.
• Two way file synchronization
─ Two way file synchronization ensures that one file is identical on two locations. If a client updates a file on location 1, it also gets updated on location 2 (and vice

1 From https://en.wikipedia.org/wiki/Peer-to-peer
2 From https://www.tgrmn.com/web/kb/item78.htm

versa, also for deleting). This makes sure that the file on location 1 and the file on location 2 are identical / in sync. This differs from one way synchronization, as in one-way synchronization the file is synchronized only from location 1 to location 2. In two way file synchronization, modifying a file on any location will update the file on the other locations too.

Files can be shared using one or two way file synchronization, to keep the files in sync on different locations. However, this is not strictly necessary. For example, in the case of backing up a file, one doesn't really use file synchronization protocols. Instead, the file(s) get copied to another location and stay there until (an) authorized user(s) remove them. Sharing itself is thus not necessarily linked to file syncing, but the combination exists, e.g. where a file is placed on multiple locations for file collaboration. In this case, the two way file synchronization protocol facilitates that both parties have the same identical file that they are working on together.
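As an illustration of the one-way (mirroring) variant described above, consider the following minimal Python sketch. The directory names are made-up examples, and real synchronization tools add change detection, conflict handling and deletion propagation on top of this idea:

```python
import shutil
from pathlib import Path

def mirror(source: Path, target: Path) -> None:
    """One-way synchronization (mirroring): every file under `source` is
    copied to `target`; changes made in `target` never flow back."""
    for src in source.rglob("*"):
        if src.is_file():
            dst = target / src.relative_to(source)
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)   # copy contents and timestamps

# Hypothetical directory names, purely for illustration.
mirror(Path("my-documents"), Path("my-backup"))
```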

2.2 What is (centralized) cloud based file syncing & sharing?

Cloud based syncing & sharing is centralized, internet based file syncing & sharing. In cloud based syncing & sharing (the client-server model), a central server (possibly more than one) is used, with which clients communicate. The central server manages the files stored on that server. Clients can send requests to this server (e.g. a request to read a file), to which the server responds (e.g. with a status code and, when access is granted, the content of the file).

2.3 What is P2P file syncing & sharing?

P2P is the opposite of the centralized cloud-based client-server architecture [22]. In the centralized client-server architecture, nodes can act as a server or as a client, but not both. In the client-server model, the server is a central server to store data on, so that others can download/upload data from/to it. In P2P technologies, a node can act both as a server and a client. There is no central server for file storage involved in P2P networking, as the network consists of connected devices (peers) where peers can be a server, a client or both. There can still be a central server involved (though not necessarily), e.g. to maintain a peer list so that peers can find each other, but this still differs from the client-server model, as the purpose of the server is not to store data of users/nodes, but to store a lookup table for peers to make it easier to find each other. In P2P file sharing, files are distributed and shared using such a P2P network3. So, instead of downloading content from a central server, one can download content from another peer (node), which itself is a connected device within the P2P network. The files a server-node has shared are indexed on servers. This does not mean, however, that a P2P network cannot be cloud based or that the P2P network cannot use a client-server model. The most important thing is that in P2P filesystems, the filesystem is distributed without centralized control or hierarchy, where no nodes are privileged and where nodes can act both as a server and as a client [26]. It is thus possible that a P2P filesystem runs over the public internet with decentralized nodes representing server-nodes and client-nodes (or both), meaning that this P2P network is also cloud based.

3 From https://en.wikipedia.org/wiki/Peer-to-peer_file_sharing

2.4 What is file back-up and how does it fit in file sharing & synchronization?

File back-up is closely related to one way file synchronization. However, it is not the same. In file back-up, one copies file(s) to another location, where the other location contains an exact copy of the original file(s). After editing/removing the original file, the file on the back-up location doesn't get modified, i.e. it stays there, and isn't modified at all. This means that the user can retrieve the file in case, for example, he/she accidentally removes the original. So, in file back-up, the file gets synchronized to another location, but only once instead of continuously. One doesn't want a file to be modified on both locations while the purpose was to back up that file, as this would mean that both the original and the copy are modified.

2.5 File systems

A file system is a system that controls the way data is stored and retrieved. There are different types of file systems. A file system can be used as simply as on a local storage device, but there are also (distributed) network file systems. The latter use network protocols in order to let connected devices communicate with each other within a network. A file system differs from a file synchronization protocol. Every storage device contains a file system. File synchronization happens between multiple file systems. In the case of a distributed (network) file system, central storage servers (within the network) are used, on which files are stored and from which they are retrieved. File synchronization can then, e.g., provide solutions to keep some sensitive documents synchronized between multiple (LAN) storage servers. In this paper, network file systems are not very relevant, as the focus lies on file synchronization, file back-up and file sharing over the internet, not only over the local area network. However, several distributed network file systems will be researched in order to see how they work, and to decide whether the technologies contain useful information / ideas for our end solution.

2.6 File systems versus file synchronization versus file sharing

File synchronization is a functionality that is built on top of a file system. In order to provide decent file synchronization, a couple of functionalities should be provided, such as file versioning and monitoring the contents / state of a file (or files, in the case of two way synchronization) to compare it against (the same) files on other locations. This is not standard functionality for a file system, as a file system is purely meant to control the way of accessing and storing/retrieving data. File synchronization could add these functionalities on top of a file system. File synchronization differs from file sharing as well, as file sharing doesn't necessarily mean that the filesystem (or a layer on top of it) is monitoring file contents / states in order to synchronize changes with other locations. File sharing could be as simple as sending an e-mail with a file (which is thus shared). If one of the two parties modifies this file, the other party will have a different version, which is not automatically synced from the party that changed the file.


3 Existing technologies regarding file sharing

This section is based on appendix 7.1, in which multiple file sharing technologies are analyzed against a framework based on the functionalities that we desire in our end solution4. This section gives a brief summary of each relevant technology, based on our research in appendix 7.1, and ends with a conclusion.

4 See: Problem statement

NFS

NFS is a P2P network file system protocol that allows connected devices within a local area network to communicate with each other. Note that it is a network file system, and not approachable from the internet. It distinguishes two types of nodes: a client and a server node. Clients can store and retrieve data from (centralized within the LAN) server nodes. Nodes can act both as a client and as a server. There are a few downsides to this protocol:

─ No encryption of stored data;
─ No two way file synchronization;
─ It is a network protocol; thus not optimized to run over the public internet;
─ No anonymity (identification by Client ID).

SMB

SMB is a P2P, application-layer, file sharing network protocol. SMB uses a client-server approach over the LAN where the server-peers are decentralized. Additional technology (called DFS) can be used to combine a set of SMB file shares into a distributed file system. There are multiple implementations created by different parties, for example by Oracle and Microsoft. The upside of this protocol, compared to NFS, is that it supports anonymity of the sender/requester/client by setting the impersonation level to anonymous within the SMB configuration. There are, however, a few downsides to this protocol:

─ No encryption of stored data;
─ Requires the use of DFS in order to support file synchronization;
─ It is a network protocol; thus not optimized to run over the public internet.

BitTorrent

One of today's most popular P2P file sharing protocols is BitTorrent. Compared to SMB and NFS, this protocol does not run over the LAN, but instead over the public internet using a P2P network. Downloading a file happens through the client downloading a torrent file. A torrent file contains information (such as tracker information) about how the client can eventually download his/her data. Trackers are centralized servers that manage many torrent sessions; for each session, the tracker keeps track of the peers participating in that particular torrent session. While the trackers are centralized, actually downloading the data happens through a P2P network. The client can request a peer list from the tracker, and the peers in that peer list can be used to download the desired data. The distribution of content itself is still P2P. The group of peers (the peer list) that are connected to a torrent file is called the swarm. The client that wishes to download a file downloads chunks of the entire file from different peers in the network. When a client connects to a torrent file, this client will join the swarm, and all of the other peers in the swarm will also be able to download chunks from the just joined client. BitTorrent distinguishes two types of peers: leeches (peers with no or some of the data, who will download chunks that they don't yet have, and upload chunks that they already have), and seeds (peers who have already downloaded all the data, but who remain in the swarm to help other peers; they do not perform download actions, as they already downloaded all of the chunks of the desired file, but instead help the other peers in the swarm by making the chunks of the file available for download). A small sketch of this chunk exchange follows the list of downsides below. The downsides of this protocol:

─ It is not a file synchronization protocol; instead it is purely based on file distribution;
─ It is also not a protocol that offers solutions for back-up and file replication;
─ There is no end-to-end encryption;
─ There is, by default, no encryption in transit, but most BitTorrent software implements solutions for this;
─ No encryption of stored data;
─ BitTorrent uses centralized trackers;
─ There is no authentication of users, which in principle makes anonymous downloads possible; however, by default, the client that wishes to download a file is not anonymous, as the client's IP is not hidden.
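To make the swarm mechanics above concrete, here is a small toy model in Python. It is our own simplification of the idea, not the BitTorrent wire protocol (there are no trackers, piece hashing or choking logic in it):

```python
import random

class Peer:
    def __init__(self, name, chunks, total):
        self.name = name
        self.chunks = set(chunks)   # chunk indices this peer already holds
        self.total = total

    def is_seed(self):
        # seeds hold every chunk and stay in the swarm only to upload
        return len(self.chunks) == self.total

def download_round(leech, swarm):
    """One round: the leech fetches each chunk it is missing from some
    peer in the swarm that already has it (peer choice simplified to random)."""
    for chunk in set(range(leech.total)) - leech.chunks:
        owners = [p for p in swarm if chunk in p.chunks]
        if owners:
            source = random.choice(owners)
            leech.chunks.add(chunk)
            print(f"{leech.name} downloaded chunk {chunk} from {source.name}")

seed = Peer("seed", range(4), 4)    # has all 4 chunks of the file
leech = Peer("leech", [0], 4)       # joins the swarm with chunk 0 only
download_round(leech, [seed])
assert leech.is_seed()              # after the round it can serve others too
```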

The paper “Receiver anonymity within a distributed file sharing protocol” (“Solo Schekermans”, 2019) analyzes BitTorrent in detail and comes up with some solutions regarding privacy improvements.

BitTorrent Sync

BitTorrent Sync is a modified version of the BitTorrent protocol. In fact, it serves almost the opposite purpose of BitTorrent: BitTorrent Sync is a file synchronization & sharing protocol which allows both one way and two way file synchronization. BitTorrent Sync also puts a focus on confidentiality and privacy, by performing authentication, encrypting the communication, supporting encrypted file storage and more. It supports remote storage of data, in which this data can be encrypted to support storing confidential data on remote (untrusted) locations. To support file synchronization, BitTorrent Sync relies on secrets (keys). These secrets are unique, 33-character identifiers that identify a shared folder within the network. For the purpose of readability, these secrets are encoded. The protocol distinguishes three types of secrets to manage user access to folders, namely a read & write secret, a read-only secret and an "encrypted read-only secret". The latter makes it possible to create encrypted remote P2P replications to remote servers, in which the (remote) receiving server doesn't have the capability to decrypt the data.
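BitTorrent Sync is closed source, so its actual key derivation is not public. Purely to illustrate the idea of tiered secrets, where a holder of a stronger secret can derive a weaker one but never the reverse, consider this hypothetical sketch (all names and the use of SHA-256 are our own assumptions):

```python
import hashlib

def derive_weaker(secret: str, label: str) -> str:
    # One-way derivation: holders of the stronger secret can compute the
    # weaker one, but pre-image resistance blocks the opposite direction.
    # This is NOT BitTorrent Sync's real scheme, only an illustration.
    return hashlib.sha256((label + ":" + secret).encode()).hexdigest()[:33]

read_write_secret = hashlib.sha256(b"folder seed").hexdigest()[:33]
read_only_secret = derive_weaker(read_write_secret, "read-only")
encrypted_read_only = derive_weaker(read_only_secret, "encrypted-read-only")
```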

BitTorrent Sync can be used both on the LAN and over the public internet. For the former, no tracker or relay server is needed. For the latter, one might choose to use a tracker in order to sync files over the public internet. In case of sharing secrets with others (granting them access to the shared folder), one can attach specific permissions to the secret one is going to share, in the form of sharing a link instead of the secret directly. For example, whether the other user will be able to read & write files, or be prevented from writing by granting them read-only access. This allows some flexible configuration of access control. This link is indirectly linked to the secret, and a BitTorrent Sync server makes sure that the link is not distributed more times than the amount one configured. This, however, requires some trust in a third party, as the BitTorrent Sync server can learn a lot about one's files using links, for example the temporary secret that is used to connect to the folder owner. Downsides of the protocol:

─ Not well (scientifically & academically) documented and not open source;
─ One needs the official client in order to use the protocol, which makes it harder to create one's own implementation of BitTorrent Sync;
─ In case of sharing links with others to grant them access to a shared folder, one has to trust the BitTorrent Sync server not to store sensitive data about the shared folder. However, this can be circumvented by pasting the link directly into the client instead of into a browser;
─ Trackers can be used, and their use is enabled by default, for finding other peers (with whom one is sharing files);
─ No anonymity is achieved, as IP addresses are always visible in this protocol. One cannot, for example, download files anonymously.

AFS

AFS is a distributed (client-server) network file system. A set of trusted servers is used to serve a location-transparent file name space to clients. AFS is designed as a campus-wide file system which students can use. AFS is very flexible, as it is optimized for a non-static number of users. Downsides:

─ No encryption of stored data;
─ No encryption in transit;
─ No end-to-end encryption;
─ No support for two way file synchronization;
─ It is a network protocol; thus not optimized to run over the public internet;
─ No anonymity; everyone is authenticated and traffic is sent unencrypted.

Tahoe-LAFS

Tahoe-LAFS is an open source, decentralized, P2P cloud storage system for file sharing using a client-server approach. The reason why it is P2P is that there is no centralized storage, there are no privileged nodes, and there is no centralized control or hierarchy [26]. Any node can become a server, client or both. The idea of Tahoe-LAFS is that clients can use storage server-nodes to store their data. This process is as follows: a client wants to store a file. This file gets encrypted by the client software and is then broken up into segments. These segments are then 'erasure coded' into blocks. On each storage server, one block of each segment is stored. This group of blocks that is stored on a server is called a share. The server has no knowledge to decrypt these blocks, i.e. everything up until storing the actual file on the server is a trusted process. Storage servers are considered untrusted, and don't have the ability to do anything with these files, other than making the server unavailable. But due to the erasure coding, only a few shares are needed to reproduce the entire file, i.e. typically only a few servers of the total amount of servers have to be approached to request shares. In Tahoe-LAFS, there are two types of files: mutable and immutable files. Each of these types has its own set of capabilities. A capability is a short (min. 96 bit) ASCII string which encodes the access rights to a particular file. Each file has its own capabilities, meaning that a specific capability is linked to just one file. Each immutable file has two capabilities: a read-cap and a verify-cap, where the read-cap encodes the right / the ability to decrypt the file and read the contents, and the verify-cap allows others to verify the integrity of the file, without being able to see the plaintext content. A mutable file also has a read-cap (read-only-cap) and a verify-cap, and in addition has a read-write-cap, which allows the wielders (users with access to that particular cap, holders) to write the file. Writing a file is done by encrypting the file with the encryption key (a hash of the read key and a salt), and then signing, among other things, the root hash of the Merkle hash tree. Downsides:

─ In order to support anonymous downloads, a Tahoe-LAFS client has to be configured to run over I2P or over Tor in order to hide one's identity. It will be slower, and quite some configuration is needed, but it is doable. Anonymity is not provided by default.

IPFS

IPFS is a P2P, distributed file system protocol which can be compared to a single BitTorrent swarm (recall: a BitTorrent swarm is the network of nodes that are all connected to the same torrent). IPFS is designed to exchange objects that are within one Git repository, hence the comparison with BitTorrent swarms. Objects represent files or other data structures. This protocol is designed with the aim of combining the most successful P2P systems into one P2P file system. The objects (IPFS objects) can be stored by so-called IPFS nodes in local storage. An object can be transferred to other nodes that are connected to that particular node. In IPFS, nodes advertise to each other what they want to download (written in their want_list). Note that "each other" means every node that is connected to a particular Git repository. Nodes store the want_list of other nodes and check whether they have something to offer by comparing it to their own have_list. If a node has something to offer to another node, they start exchanging data. Note that the other node must have something to offer in return, otherwise they are not exchanging data, but just 'donating' data. To prevent a dead protocol, if a node has nothing to offer to any of the other nodes that are connected to the same Git repository, this node starts downloading pieces that other nodes have on their want_list, in order to afterwards exchange these pieces of data for pieces that are on this particular node's want_list. Files on the IPFS network consist of blocks, where each block and file has a cryptographic hash. IPFS makes sure that there are no duplicates in the network by comparing hashes. All nodes in the network only store data in which they are interested (except in the case described earlier, when a node has nothing to offer). A node asks the network to find the unique hash of a block that the node wants to download (that is on this node's want_list). This protocol is still in an early stage. The academic information about IPFS is broad and does provide a really good background. However, sources that describe practical implementations often contradict the technical specs described in the paper [2]. A small sketch of the want/have exchange follows the list of downsides below. Downsides:

─ A node has a node ID that normally doesn't change. This means that if a node enters a network and then leaves, the nodes of that network will probably already know the node ID of the node that just left, giving them access to files of this node. This can be circumvented by changing the keypair for every new session, resulting in a fresh new node ID;
─ It is, like BitTorrent, a file distribution protocol, and not a file synchronization protocol;
─ The practical implementation is not as clear as the theoretical information. One example is encryption: the theoretical paper states that object-level cryptography is present in IPFS, while sources about the practical implementation state that it is still missing (see 7.1 for the sources);
─ There is no encryption of stored data;
─ There is no support for anonymity solutions yet, as IPFS is not an anonymity network. IP addresses of the internal and external network are visible to all nodes that know that particular node ID, even when this node is not in the session anymore (this can be circumvented by creating a fresh keypair per session).
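As a toy illustration of the want_list/have_list bartering described above (our own simplification, not IPFS's actual exchange protocol; block names are made up):

```python
class Node:
    def __init__(self, name, have, want):
        self.name = name
        self.have_list = set(have)   # blocks this node stores
        self.want_list = set(want)   # blocks this node advertises as wanted

def exchange(a, b):
    """Barter step: blocks change hands only when *both* sides have
    something the other wants; otherwise a transfer would be a donation."""
    a_offers = a.have_list & b.want_list
    b_offers = b.have_list & a.want_list
    if not (a_offers and b_offers):
        return False
    a.have_list |= b_offers; a.want_list -= b_offers
    b.have_list |= a_offers; b.want_list -= a_offers
    return True

n1 = Node("n1", have={"blockA"}, want={"blockB"})
n2 = Node("n2", have={"blockB"}, want={"blockA"})
assert exchange(n1, n2)   # mutual interest, so the trade happens
```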


3.1 Conclusion

Based on the analysis of the technologies above, it can be concluded that those protocols work significantly differently. Most of the technologies are P2P, where one uses authentication and encryption and another does not care about either, while others are cloud based. There are even combinations possible. If we have to decide which protocols would be best for our purpose, two/three of them would suffice5:

• BitTorrent Sync;
─ Is reliable, and offers the functionality we want. The only thing that is missing is anonymity and privacy. This technology is further researched in the paper "Receiver anonymity within a distributed file sharing protocol" ("Solo Schekermans", 2019), which also describes how to improve privacy and anonymity in the BitTorrent Sync protocol.
• Tahoe-LAFS;
─ Seems a good candidate. Tahoe-LAFS offers encryption of files, encryption of the connection, and did a good job of making the protocol reliable and well configurable. Tahoe-LAFS can be modified to run over an anonymized network. The only thing missing in Tahoe-LAFS is authentication of the user/client.
• IPFS.
─ While IPFS is still in an early stage, the theory behind it is promising. The idea behind combining factors of other P2P networks is very smart and well documented. However, it is not yet ready to fully operate, as it is still in development (according to their website)6. Another thing is that it doesn't support file synchronization and back-ups (yet). When these functionalities are added and well documented, this might be a good candidate. But, as it is still in an early stage and not (yet) well documented, it won't be useful within the timeframe of this paper. A last thing is that, as it works right now, it is mainly a file distribution protocol like BitTorrent, without support (yet) for syncing functionalities and privacy enhancing technologies to e.g. hide one's IP address.

5 Based on the problem statement, in which is specified what the end solution should look like
6 IPFS website: https://ipfs.io/

4 Tahoe-LAFS

In section 3, we concluded that there are three technologies that could be helpful for designing our end solution, i.e. BitTorrent Sync, Tahoe-LAFS and IPFS. However, Tahoe-LAFS seems to have all functionalities we want built into the protocol. Appendix 7.2 shows a detailed analysis of IPFS, Tahoe-LAFS and BitTorrent Sync, which concludes that Tahoe-LAFS is better for the purpose of this thesis compared to the other two. This section therefore analyses Tahoe-LAFS in detail, and describes properties and shortcomings of the Tahoe-LAFS file system. In order to describe Tahoe-LAFS, we did not only use the paper [29], but also a lot of information from the docs7. The paper only provides limited information and misses some important aspects that are mentioned in the docs. Tahoe-LAFS is a free, open source and decentralized cloud storage system for secure and distributed long-term storage. It can be used for, among other things, backup applications. LAFS in Tahoe-LAFS stands for Least Authority File System, and means: "each user or process that needs to accomplish a task should be able to perform that task without having or wielding more authority than is necessary" [29]. The goal of the Tahoe-LAFS protocol is to provide users a storage system that is secure, privacy friendly and distributed. Tahoe-LAFS is a combination of the P2P and the client-server model approach. In Tahoe-LAFS, data is stored on storage servers, which are nodes that are configured as a server. These storage nodes are distributed, and the list of storage nodes is dynamic. Client-nodes can store data on those storage nodes. In Tahoe-LAFS, there are two types of servers, namely the storage server and the "introducer". The storage server, obviously, is used to store data from users on. Storage servers advertise their location by telling it to the introducer8. The introducer broadcasts the location to all clients. This introducer itself has a location. In order to announce the presence of this introducer to storage servers and clients, this location has to be delivered manually. This can be done by e.g. sending an e-mail to all new storage servers in the storage grid, where the storage grid is a group of storage servers9. The introducer node is thus responsible for getting the other nodes talking to each other10. Besides announcing their presence to the introducer, storage servers also announce their available space (which can be used for server selection, see 4.3). Their architecture docs also clearly explain the usage of the introducer:

“Nodes learn about each other through an “introducer”. Each server connects to the introducer at startup and announces its presence. Each client connects to the introducer at startup, and receives a list of all servers from it. Each client then connects to every server, creating a “bi-clique” topology. In the current release, nodes behind NAT boxes will connect to all nodes that they can open connections to, but they cannot open connections to other nodes behind NAT boxes. Therefore, the more nodes behind NAT boxes, the less the topology resembles the intended bi-clique topology.”11

7 https://tahoe-lafs.readthedocs.io/en/tahoe-lafs-1.12.1/
8 https://tahoe-lafs.readthedocs.io/en/tahoe-lafs-1.12.1/servers.html
9 https://tahoe-lafs.readthedocs.io/en/tahoe-lafs-1.12.1/about.html
10 https://tahoe-lafs.readthedocs.io/en/tahoe-lafs-1.12.1/running.html

Storage nodes, general clients and introducers are part of a grid. A grid (not a storage grid!) is a set of storage nodes, client nodes and introducers. On creating a client node, the client has to specify the location of the introducer node (which can, for example, be received by e-mail from the introducer node upon joining). If all introducer nodes fail, new clients will not be able to join, as they will not be able to discover the storage servers. The grid, however, will continue to function normally for all already joined / existing users12. Note that this creates a single point of failure, in which a new client would not be able to connect to the introducer and thus would not be able to connect to any storage servers. The docs state two ways of reducing the danger9. The first one is that the introducer is defined by a hostname and a private key. This hostname and private key are easy to move to a new host in case the original introducer fails. The second danger reduction measure is that, even if the private key is lost, clients can be reconfigured to use a new introducer. Introducer nodes are not a necessary thing to have within a grid. One could also use the grid without any introducers. The list of storage server nodes is then distributed in another way. A node can be both a server node (introducer or storage) and a client node. In this case, a client can download shares from storage servers, but also offers disk space to other nodes. When a client wants to upload a file to the storage servers, the client first encrypts the file. This results in a ciphertext. This ciphertext then goes through an erasure coding process: a process in which a ciphertext is split up into multiple smaller pieces, but in such a way that a few of these pieces are sufficient to reproduce the entire, bigger ciphertext. These pieces are called segments. These segments are then "erasure coded" into a number of blocks. One block of each segment is then sent to the storage servers. The group of blocks on a storage server is called a "share". When a client wants to download the file (e.g. to modify it), the client downloads the shares from a couple of storage servers, as not all shares are needed to reproduce the ciphertext. The specific amount of shares that is needed is explained in section 4.3. The process of storing a file is also illustrated in figure 1 (with some complexity removed). The client encrypts a file using, for example, the Tahoe-LAFS web API. After encryption, the ciphertext gets erasure coded and stored on the storage servers. Tahoe-LAFS can be used, among some others, from a web browser, from (S)FTP tools and from the command line. Clients can download pre-packed versions for e.g. Windows, but can also build Tahoe-LAFS themselves using Python13. There is, therefore, no special software / client needed in order to run Tahoe-LAFS.
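Returning to the introducer mechanism described above, the following is a toy model of its publish/subscribe role. Node IDs, locations and sizes are made-up example values; the real announcement protocol is more involved:

```python
class Introducer:
    """Toy model of the introducer: storage servers announce themselves
    (location plus available space), clients subscribe and receive the
    full server list, then connect to every server ("bi-clique")."""

    def __init__(self):
        self.announcements = {}   # nodeID -> (location, available_space)

    def announce(self, node_id, location, available_space):
        self.announcements[node_id] = (location, available_space)

    def subscribe(self):
        # a newly joining client learns about every announced server
        return dict(self.announcements)

introducer = Introducer()
introducer.announce("srv-1", "tcp:192.0.2.10:3457", 250 * 10**9)
introducer.announce("srv-2", "tcp:198.51.100.7:3457", 80 * 10**9)
known_servers = introducer.subscribe()   # the client connects to all of these
```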

11 https://tahoe-lafs.readthedocs.io/en/tahoe-lafs-1.12.1/architecture.html
12 https://blog.torproject.org/tor-heart-tahoe-lafs
13 https://tahoe-lafs.readthedocs.io/en/tahoe-lafs-1.12.1/INSTALL.html

Fig. 1. Storing a file in Tahoe-LAFS

Tahoe-LAFS has several subparts in its protocol in order to achieve all of the above. These subparts of the Tahoe-LAFS protocol are:

1. Access control;
2. Fault tolerance;
3. Cryptography;
4. Freshness of mutable files & directories.

We will first discuss the architecture of Tahoe-LAFS in order to give a basic understanding of the different layers that Tahoe-LAFS has (4.1). We will then discuss the 4 subparts mentioned above separately, in detail (4.2-4.5). Section 5 looks at improvements for Tahoe-LAFS by analyzing and improving privacy & anonymity (5.1-5.2) and by trying to list and solve limitations of Tahoe-LAFS (5.3). Note that we are not analyzing file synchronization in detail here. See section 6 (conclusions & future work) for why we made this decision.


4.1 Architecture

Tahoe-LAFS consists of three layers14:

─ Key-value store layer;
─ Filesystem layer;
─ Application layer.

The information in this section is based on the Tahoe-LAFS docs, as the paper does not distinguish between these layers and misses some information that is present in the docs.

4.1.1 Key-Value store

The lowest layer, the first layer, is the key-value store layer, and is designed around the capabilities (see 4.2). The key of a key-value pair is a capability, which is a short ASCII string. The values of the key-value pairs are sequences of data bytes. This data is encrypted and distributed to a number of nodes. Performance can depend on the size of the value. This key-value store layer doesn't include human-meaningful names. Capabilities are self-authenticating, so adversaries that don't know a capability will not be able to perform actions on the file linked to that capability.
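A toy model of this layer, assuming nothing beyond what is described above (capabilities as short, unguessable strings mapping to already-encrypted bytes):

```python
import secrets

class KeyValueStore:
    """Toy model of the lowest Tahoe-LAFS layer: short, unguessable ASCII
    capabilities map to (already encrypted) sequences of bytes."""

    def __init__(self):
        self._values = {}

    def put(self, ciphertext: bytes) -> str:
        cap = secrets.token_urlsafe(16)   # 128 random bits, above the 96-bit minimum
        self._values[cap] = ciphertext
        return cap

    def get(self, cap: str) -> bytes:
        # Knowing the capability is necessary and sufficient: there is no
        # separate access check, which is what "self-authenticating" means.
        return self._values[cap]

store = KeyValueStore()
cap = store.put(b"...encrypted file bytes...")
assert store.get(cap) == b"...encrypted file bytes..."
```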

4.1.2 Filesystem

The second layer, the middle layer, is the decentralized file system. This file system is, according to the docs, a directed graph where the intermediate nodes are directories and the leaf nodes are the files. A leaf node just contains data. The only metadata a leaf node has is its length in bytes. This layer adds human-meaningful names on top of the key-value store layer. The key-value store layer contains information about, for example, the actual bytes that are inside files. The filesystem layer contains information about directory names, file names and metadata. (Human-readable) pathnames are mapped to pieces of data in this layer. As mentioned above, the filesystem is a directed graph. This graph is a graph of directories. Each directory contains a table. The information that is stored in this table is information about the children, which can be either directories or files. The reference to such a child is the capability (see 4.2). This means that every directory child in the table has two capabilities assigned, namely the read-write and the read-only capability.

4.1.3 Application

The third layer, the top layer, is the application layer. This layer consists of applications that are using the file system. The docs name one example, namely Allmydata.com, which is a backup service that periodically copies files from a local disk to the decentralized file system.

14 https://tahoe-lafs.readthedocs.io/en/tahoe-lafs-1.12.1/architecture.html

4.2 Access control

4.2.1 Intro.

File systems need some sort of access control in order to decide which users are allowed to take which actions on particular files or directories. Tahoe-LAFS achieves this through capabilities. A capability is a unique identifier (a short ASCII string15) that is linked to a file or directory, and encodes the access rights to that particular file or directory. For files, Tahoe-LAFS distinguishes two types: immutable & mutable files. Each of these types has its own set of capabilities. A user can perform actions on a file depending on which capability he/she knows. Knowing a capability of a file or directory is sufficient to prove that the wielder can perform certain actions on that particular file or directory. The types of files are described below, together with their sets of capabilities.

4.2.2 Details.

A capability in Tahoe-LAFS is defined as: "a short string of bits which uniquely identifies one file or directory" [29]. An owner of an object (file/directory) knows this identifier, which is necessary and sufficient to gain access to that object. In order to keep this secure, these strings need to be short enough to be convenient to store and transmit, but also long enough to be unguessable. The paper therefore states that the string length needs to be at least 96 bits, which should be long enough. So, Tahoe-LAFS uses capabilities for access control. Tahoe-LAFS consists of files and directories. For files, Tahoe-LAFS distinguishes two types: mutable and immutable files. The contents of mutable files can be changed. Immutable files can only be written once, upon creation, and are read-only afterwards. In order to have access control, each type of file has its own set of capabilities. Immutable files have two capabilities, namely the read capability and the verify capability. The read capability grants the ability to read the contents of a particular version of that file, most likely the latest version16. The verify capability grants the ability to check the integrity of that file, without the ability to read the contents. Mutable files have three capabilities. Like immutable files, mutable files have the read capability (called the read-only capability for mutable files) and the verify capability. In addition, mutable files also have a read-write capability, which specifies that a user can both read and write to that particular file. Information about deleting a file appears to be hard to find. The only source we could find is a blog post on the torproject website17. They state that, in order to delete a file, one just has to forget / delete the capability string. Tahoe-LAFS storage servers have a built-in garbage collector, which cleans up unreferenced shares. We couldn't come up with other sources that verify / counter this statement.

15 https://tahoe-lafs.readthedocs.io/en/tahoe-lafs-1.12.1/architecture.html
16 See capabilities: https://tahoe-lafs.readthedocs.io/en/tahoe-lafs-1.12.1/architecture.html
17 See sharing capabilities: https://blog.torproject.org/tor-heart-tahoe-lafs

So far, we have discussed the types of files and the corresponding capabilities for each type of file:

─ Mutable file;
• Read-write capability;
• Read-only capability;
• Verify capability;
─ Immutable file;
• Read capability;
• Verify capability.

When a file is created, all of the capabilities that belong to that file are also created. So, every immutable file has two capabilities, and every mutable file has three capabilities. Access control depends on which capabilities are known by users. If a user knows the read-write capability (a unique, at least 96-bit string) of a mutable file, that particular user is able to write to that file. In other words, the owner of a file knows the capabilities of that file. This owner can delegate access to other users by sharing a particular capability (unique string) of a particular file. For example, the owner can share the read-write capability of a mutable file with someone else, granting the other user the right to also modify that file. Users who know capabilities of certain files can share the capability with others. Sharing this capability is as simple as sending the unique (at least 96-bit long) string to another user. As mentioned at the top of this section, knowing the capability is enough to exercise the right linked to that capability. Capabilities can also be diminished by users knowing a capability. This sounds vague, but simply means that from a read-write capability of a mutable file, a read-only capability and a verify capability can be produced. Another example: from a read capability of an immutable file, a verify capability can be produced. However, the other way around is not possible, meaning that it is not possible to produce a read capability for an immutable file from a verify capability of that file. In section 4.4, we show how the keys/secrets are produced, and why and how lower permission keys can be produced from higher permission keys. There is one capability that both file types have in common, namely the verify capability (verify-cap). The purpose/requirement of this cap is that it has to allow the wielder to check the integrity of the corresponding file without giving any information about the plaintext. This is very useful in back-up systems, as it allows one, for example, to share the verify-cap with a back-up service in order to allow the back-up service to check whether a file is still correctly stored (by checking integrity), without being able to alter or read the contents. The requirement of the read capability (read-cap) for immutable files and the read-only capability (read-only-cap) for mutable files is that it has to give the ability to read the plaintext of a file and, in addition to reading contents, also the ability to verify the integrity. Another requirement is that the wielder/holder of a read(-only)-cap to a file can diminish the capability to produce a verify-cap to that file. Section 4.4 shows how this is implemented in Tahoe-LAFS.
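Tahoe-LAFS's actual key derivation (hashes, salts and, for mutable files, RSA keys) is covered in section 4.4. Purely as an illustration of "diminishing", a one-way hash chain captures the property that weaker capabilities can be derived from stronger ones but never the reverse (names and the use of SHA-256 are our own):

```python
import hashlib

def diminish(cap: str) -> str:
    # One-way step: the weaker capability is a hash of the stronger one,
    # so it can be handed out freely, while pre-image resistance prevents
    # anyone from going back up in permissions.
    return hashlib.sha256(cap.encode()).hexdigest()[:24]

read_write_cap = "rw:" + hashlib.sha256(b"per-file secret").hexdigest()[:24]
read_only_cap = "ro:" + diminish(read_write_cap)
verify_cap = "v:" + diminish(read_only_cap)
# rw -> ro -> verify is easy; verify -> ro or ro -> rw is computationally infeasible.
```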

Next to the read and verify cap, there is also a read-write capability (read-write-cap). This capability is only relevant for mutable files, as immutable files cannot be altered/edited. The requirement of this read-write-cap is that it allows the holder/wielder to alter the contents of a file and to produce a read(-only)-cap to that same file. This produced read(-only)-cap can then be used to reproduce a verify-cap. Important (and trivial) is that someone who knows a capability is not able to go up in permissions. E.g., someone who knows a read-only-cap is not able to reproduce the read-write-cap of the same file. This is also why mutable files have asymmetric crypto requirements on capabilities, while immutable files have symmetric crypto requirements on the capabilities. Capabilities for directories are not explained at all in the paper. Therefore we use information from the docs to describe how directories work. The directory structure in Tahoe-LAFS is a directed graph of nodes18. Each directory consists of a set of name-entry pairs. The name in this case is a child name, which is a directory entry. The entry in this pair is either a file or another directory. If one shares a read-only capability of a given directory with another person, this person will also have read-only rights to all of the children. This person, however, will not be able to perform write operations on the directory and its children (i.e. transitive read-only access). A directory can be interpreted as a mutable file. Mutable files have the same capabilities and properties as directories (more about mutable files in 4.4). Just like mutable files, directories have secret keys, have three capabilities, etc. Each directory has an RSA keypair, of which the owner(s) of the write capability know the private key. The owner(s) of the read capability know the public key. Below are two example capabilities of directories:

URI:DIR2:swdi8ge1s7qko45d3ckkyw1aac:ar8r5j99a4mezdojejmsfp4fj1zeky9gjigyrid4urxdimego68o

URI:DIR2-RO:buxjqykt637u61nnmjg7s8zkny:ar8r5j99a4mezdojejmsfp4fj1zeky9gjigyrid4urxdimego68o

The first one is a read-write capability. The second one is a read-only capability. Figure 2 also shows an example directory in Tahoe-LAFS. One directory contains the three capabilities and the children. As can be seen, directories are identified and located using URIs, which are formed based on certain capabilities. This also holds for files; each file and directory is described by a URI. The lowest layer, the key-value store layer, is responsible for mapping these URIs to data (see 4.1). See also the docs19, which specify what a URI should look like.
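For illustration, the colon-separated layout of such a capability can be pulled apart as follows. The field labels are our own; see the URI specification in the docs19 for the authoritative format:

```python
def parse_dircap(cap: str) -> dict:
    """Split a directory capability URI of the form shown above:
    URI:DIR2:<key>:<fingerprint> (read-write) or
    URI:DIR2-RO:<key>:<fingerprint> (read-only)."""
    scheme, kind, key, fingerprint = cap.split(":")
    assert scheme == "URI"
    access = {"DIR2": "read-write", "DIR2-RO": "read-only"}[kind]
    return {"access": access, "key": key, "fingerprint": fingerprint}

cap = ("URI:DIR2-RO:buxjqykt637u61nnmjg7s8zkny:"
       "ar8r5j99a4mezdojejmsfp4fj1zeky9gjigyrid4urxdimego68o")
print(parse_dircap(cap)["access"])   # prints: read-only
```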

18 https://tahoe-lafs.readthedocs.io/en/tahoe-lafs-1.12.1/specifications/dirnodes.html
19 https://tahoe-lafs.readthedocs.io/en/tahoe-lafs-1.12.1/specifications/uri.html

Fig. 2. Example directory in JSON format 20

4.3 Fault tolerance

4.3.1 Intro.

Tahoe-LAFS is a fault tolerant protocol. It offers storage of files, distributing their contents over several servers such that the original contents of a file can be reconstructed, even if a fraction of the servers fail. This redundancy is achieved using so-called erasure codes [29]. Erasure codes are an efficient coding technique that allows the ciphertext of a file of b bits to be split into N shares, such that the original contents of the ciphertext can be reconstructed given at least K different shares. Whenever Tahoe-LAFS wishes to store a file, it is split into N shares which are stored on multiple servers. It is usually the case that each share is stored on a separate server, but in some cases it might occur that a single server holds multiple shares of a single file. The paper does not further specify when this might be the case, but our guess is that this might occur when there are not enough available storage servers in the storage grid21. Whenever Tahoe-LAFS wishes to read a file, it tries to collect at least K different shares of the file in order to reconstruct its contents. The strength of the erasure coding process is based on the erasure coding parameters, which are further explained below.
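Erasure coding in Tahoe-LAFS is implemented by the zfec library, published by the Tahoe-LAFS authors. The sketch below shows the N/K idea; the exact zfec API is stated here to the best of our knowledge and should be checked against the zfec documentation:

```python
import zfec  # pip install zfec

K, N = 3, 10                      # any K=3 of the N=10 shares suffice
ciphertext = b"0123456789AB"      # 12 bytes; divisible by K for simplicity

# Split the ciphertext into K equal-sized primary blocks, then encode
# them into N blocks (the encoding is systematic: the first K blocks
# are the primaries themselves).
size = len(ciphertext) // K
primary = [ciphertext[i * size:(i + 1) * size] for i in range(K)]
blocks = zfec.Encoder(K, N).encode(primary)

# Pretend 7 of the 10 servers are unreachable: any 3 blocks still work.
kept_nums = [1, 5, 9]
kept = [blocks[i] for i in kept_nums]
recovered = b"".join(zfec.Decoder(K, N).decode(kept, kept_nums))
assert recovered == ciphertext
```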

4.3.2 Details.

In order to configure this erasure coding, which happens on every write, one has to configure erasure coding parameters. The writer of the file chooses these parameters. The writer has to configure the following parameters:

─ N, which is the total number of shares that will be written;
─ K, which is the number of shares to be used on read.

The information in this part is based on the docs; the paper does not describe this in detail, and does not even mention the division into segments. Each time a file is written and each time a file is uploaded, the file gets encrypted (see 4.4) and the resulting ciphertext is then broken up into segments22. It might be the case that the last segment is shorter, as it is the last part of the file. After the division into segments, each segment gets "erasure-coded" into a number of blocks.

20 From https://www.digitalocean.com/community/tutorials/tahoe-lafs
21 We also mean storage servers that don't have enough space available
22 https://tahoe-lafs.readthedocs.io/en/tahoe-lafs-1.12.1/specifications/file-encoding.html

One block of each segment is then sent to a server. A server thus collects multiple blocks; one of each segment, to be precise. A "share" is thus a collection of blocks on one server. Figure 3 (at the bottom of this section) illustrates this process. The total amount of shares is N, so N shares of the ciphertext will be created. The file will be recoverable if and only if a total of K shares is retrieved. Erasure coding is applied to the ciphertext, i.e. the encrypted file. A hash of the encryption key is used to form the storage index. This storage index is used for two purposes: server selection, and to index shares within the storage servers. Server selection is quite a complex process; a file is encrypted, encoded, and the shares are sent/uploaded to some servers. But to which ones? The paper does not describe anything about server selection. Therefore, the docs will be used to explain this process23. The storage index is used as a mechanism to consistently permute the set of server nodes by sorting them. Sorting is done by hashing the storage index plus nodeID:

H(storage_index + nodeID)

Each file gets a different permutation. Server selection is done in several steps:

1. The first step is to remove servers from the permuted list that do not have enough space to store a share of the file. This information is known because, when the storage servers announced their presence to the introducer nodes, they also announced their available space.
2. The second step is to ask the servers whether they are already holding a share of that file. This information is saved, as it is also needed at a later stage. Note that these servers are not removed from the permuted list!
3. The third step is to query the remaining servers (on the permuted list) whether they want to hold a share of the file. Tahoe-LAFS does this by sending an allocate_buckets() query to each server (in turn), containing the number of the share. The response of the server can be either yes or no. In the case of no, it might be that this server has no space available. Servers did announce their available space to the introducer on entering the network, but in the meantime, they could already be full. In case a server refuses (by returning no), the next server on the permuted list will be queried. In addition to returning an answer, the servers also state in their response whether they already hold shares of that file. This process continues until all shares are linked to a server.
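A minimal sketch of this permutation follows; the real Tahoe-LAFS hash is a tagged construction, so treat the hashing details as simplified:

```python
from hashlib import sha256

def permuted_server_list(storage_index: bytes, node_ids: list) -> list:
    """Consistently permute the set of servers for one file by sorting
    on H(storage_index + nodeID), as described above."""
    return sorted(node_ids, key=lambda nid: sha256(storage_index + nid).digest())

servers = [b"server-a", b"server-b", b"server-c", b"server-d"]
print(permuted_server_list(b"storage-index-of-file-1", servers))
print(permuted_server_list(b"storage-index-of-file-2", servers))
# Different storage indexes generally yield different orderings, so load
# spreads across the grid while every client computes the same list.
```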

These three steps result in a table. This table contains mappings from share number to server. After these steps, the encoding and push phase starts (which is the erasure coding process plus uploading the shares to the servers). In some scenarios this will result in ending up with multiple shares of the same file on a single server. This occurs when there are fewer writable servers than not-yet-stored shares. When a server crashes (or is unreachable for any other reason), it is removed from the permuted list. This means that from then on, this node is not queried anymore for a particular file. If a server where shares of a particular file are stored has suffered a failure, those shares become unavailable.

23 https://tahoe-lafs.readthedocs.io/en/tahoe-lafs-1.12.1/architecture.html

Checker” (according to their architecture docs), that periodically counts the number of shares that are still available for a given file. A second tool, the “File Verifier”, can download the ciphertext of a given file and check its integrity. A third tool is the “File Repairer”. If the checks of the File Checker and the File Verifier indicate possible data loss, for example because shares became unavailable, the File Repairer can be used to regenerate and re-upload the shares that appear to be missing. The File Repairer, however, is not able to read the contents of a file; it does not get the full capability of a file. It only downloads the ciphertext, regenerates the shares that appear to be missing and then re-uploads these shares to new storage servers.
In the Tahoe-LAFS docs24, a third parameter (besides N and K) is mentioned, namely the “Happy” parameter. The Happy parameter needs to have a value between K and N:

K <= Happy <= N

The Happy parameter is not explained in the paper, so we used the docs to explain it. Tahoe-LAFS has built a so-called health check, which determines whether a file upload can be considered successful or not. There are two different versions of this health check: one for mutable files & directories, and one for immutable files. For immutable files, the Happy parameter is used: an upload is considered successful when the Happy condition is satisfied. Satisfying the Happy condition ensures that the shares of a particular file are distributed among enough servers, such that if a few of them fail, availability is not affected. Mutable files and directories, on the other hand, are “health checked” in a different way: all erasure-coded shares are checked on whether they were successfully placed on the grid during the upload phase. Note that this is a weaker health check than the one for immutable files, which uses the Happy parameter. The check for mutable files and directories does not provide information about server selection; it does not reveal whether all (or a majority of the) shares are stored on one single server or not. It thus does not detect whether availability might be at risk if a server storing shares of that particular file fails. Tahoe-LAFS states that it hopes to extend the Happy parameter to also work for mutable files and directories.
To clarify how the parameters work, we give one example. When one configures N=10 and K=3, there is a total of 10 shares (of a particular file) that the protocol has to place on servers. At least 3 shares are needed to reproduce the file, i.e. to erasure-decode the shares back to their original content, the ciphertext. If Happy=7, for example, an immutable file upload is considered successful if and only if the shares are distributed over at least 7 different servers. A file becomes unavailable if the protocol fails to retrieve at least 3 shares (in our example) of the file we are trying to retrieve. Note that the value N determines the overhead: configuring N to be a very high number means that there are more Merkle hashes (see 4.4) and that more nodes are contacted to retrieve and store shares. In order for parameters N and K to work, constraints are defined for these variables:

24 https://tahoe-lafs.org/~warner/tahoe-reliability/docs/configuration.txt

1 <= N <= 256 (1)
1 <= K <= N (2)

The maximum value of N (256) stems from the fact that Tahoe-LAFS uses Reed-Solomon codes. A Reed-Solomon code is an error-correcting code operating on a block of data. Tahoe-LAFS configured its Reed-Solomon codes to run over GF(2^8). In other words, data is treated as a set of finite field elements which, in Reed-Solomon, are called “symbols”. One example: assume a block of 2048 bytes, which is 16384 bits. Using Reed-Solomon codes, this block is treated as a set of 2048 8-bit symbols, each of which is an element of the finite field GF(2^8). Tahoe-LAFS stated that they chose Reed-Solomon because it offers optimal storage overhead, citing their argument: “exactly b blocks of erasure code data (plus metadata – the index number of each block) are necessary to decode a b-block file”.
Tahoe-LAFS uses zfec as its Reed-Solomon implementation, as it is the fastest of the Reed-Solomon implementations compared in paper [17]. The Tahoe-LAFS paper and their docs do not describe how it works at all. From the Python package index, we found out that zfec is largely based on fec25, which stands for “forward error correction” [14]. The improvements of zfec compared to fec are that zfec added a Python API, refactored the API and did a few clean-ups and optimizations of the core. The last thing zfec added was a command-line tool, which is also called zfec. For more information about fec, we refer the reader to paper [14].
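The following sketch shows how zfec’s documented Encoder/Decoder classes can be used for the K=3, N=10 example above; exact signatures should be checked against the zfec README, and the block splitting here is purely illustrative:

import zfec

k, m = 3, 10                                  # K and N in Tahoe-LAFS terms
ciphertext = b"x" * 3000                      # toy ciphertext, divisible by k
primary = [ciphertext[i * 1000:(i + 1) * 1000] for i in range(k)]

# zfec is a systematic code: the k primary blocks are shares as-is,
# and the encoder produces the m - k secondary ("check") blocks.
check = zfec.Encoder(k, m).encode(primary, list(range(k, m)))
shares = primary + check

# Any k of the m shares (with their share numbers) decode the original.
recovered = zfec.Decoder(k, m).decode([shares[1], shares[5], shares[9]],
                                      [1, 5, 9])
assert b"".join(recovered) == ciphertext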

25 https://pypi.org/project/zfec/

Fig. 3. Erasure-coding process of a file26

26 From: https://tahoe-lafs.readthedocs.io/en/tahoe-lafs-1.12.1/specifications/file-encoding.html

4.4 Cryptography

4.4.1 Intro. Cryptography in Tahoe-LAFS is used to implement the capabilities discussed in section 4.2. Each capability has certain requirements; the details of these requirements can be found in the details section. For example, a verify capability’s requirement is that it has to allow the wielder to check the integrity of the file while not learning anything about the plaintext. As there are two types of files, namely immutable and mutable files, the crypto that is used is also divided into two parts: crypto for immutable files and crypto for mutable files. For immutable files symmetric crypto is used, while for mutable files asymmetric crypto is used.

4.4.2 Details. In order to describe the crypto in detail, we divide this section into four parts:

1. Cryptographic preliminaries;
2. Immutable files;
3. Mutable files;
4. Directories.

The first part states some cryptographic preliminaries; for example, a lot of hashing is done in this protocol, but it all uses the same hashing algorithm, and the preliminaries define what we mean by hashing in the rest of this section. The second and third part describe how crypto supports immutable and mutable files, respectively; in other words, how the capabilities are implemented with crypto. The fourth and last part describes how directories work, as so far we have only covered files.

1. Cryptographic preliminaries
There are three cryptographic preliminaries to describe in this section, which are used in mutable files, immutable files and directories: 1) hashing, 2) Merkle Trees and 3) encryption. In the next few subsections, “secure hash” (denoted by H) always means a hash using SHA256d. The “d” in SHA256d means that it is a SHA256 hash of a SHA256 hash. In other words:

SHA256d(M) = SHA256(SHA256(M)) (1)

Meaning, SHA256d is the SHA256 hash of a SHA256 hash of a message. In addition, a tag specific to the purpose of the hashing is prepended to the input. This prevents hashes of the same input, computed for two different purposes, from being identical. This is, for example, useful to ensure that, citing the paper, “the root of a Merkle Tree can match at most one file” [29]. Therefore, we define a secure hash function using SHA256d with a tag as:

Ht(M) = SHA256(t; M) (2)

Ĥt(M) = SHA256(SHA256(t; M)) (3)

Where t is the tag and M is the message. A short sketch of this tagged hash is given below.
The second item in the list is the Merkle Tree. When “Merkle Tree” occurs in the text, a binary Merkle Tree is meant that uses SHA256d as its secure hashing algorithm. A binary Merkle Tree is a hash tree in which every leaf node is labelled with the hash of a data block, while every non-leaf node is labelled with the cryptographic secure hash of the labels of its child nodes. A drawing of such a tree can be seen in figure 4: every leaf node is labelled with a hash of a data block (L1, L2, L3, L4), while every non-leaf node is labelled with a hash of the labels of its child nodes. The Merkle Tree in Tahoe-LAFS uses one tag for the secure hash function that generates an internal node from its children, and another tag for the secure hash function that generates a leaf from a segment of the file that the tree covers. How Merkle Trees are applied within the Tahoe-LAFS protocol will be explained in the next subsections. The purpose of the Merkle Tree is to ensure that data blocks received from other peers arrive unaltered, i.e. without integrity violations (it also prevents fake blocks27 sent by ‘lying’ nodes). It allows verification of the correctness of a subset of the data without requiring all of the data28. As Merkle Trees are vulnerable to second preimage attacks, allowing an attacker to create a file (other than the original) with the same Merkle root hash [1], the purpose-based tags mentioned earlier in this subsection are introduced.
Encryption, the third item in the list, always means AES-128 in CTR mode (AES encryption with a 128-bit key using the Counter mode).

27 See https://en.wikipedia.org/wiki/Merkle_tree
28 See file encoding: https://tahoe-lafs.readthedocs.io/en/tahoe-lafs-1.12.1/architecture.html
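A minimal sketch of the tagged SHA256d hash, assuming “t; M” simply means the tag bytes prepended to the message (the real protocol formats its tags more carefully):

import hashlib

def tagged_hash(tag, message):
    # H_t(M) with double SHA256: SHA256(SHA256(tag + message))
    inner = hashlib.sha256(tag + message).digest()
    return hashlib.sha256(inner).digest()

# The same data hashed for two different purposes gives unrelated digests,
# so a leaf hash can never be confused with an internal-node hash.
assert tagged_hash(b"leaf:", b"data") != tagged_hash(b"node:", b"data")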

Fig. 4. Merkle Hash Tree

2. Immutable files
Immutable files differ from mutable files in that immutable files can only be written once, upon creation. There is no read-write-cap; there is only a read-cap and a verify-cap that regulate access control for a particular (immutable) file. As mentioned before, and because immutable files do not have a read-write-cap, symmetric crypto is sufficient to satisfy the requirements of immutable files (see 4.2). The process of creating an immutable file is as follows:

1. A client chooses a symmetric encryption key that is used to encrypt the file, and encrypts the file using that key;
2. The client then has to choose and configure the erasure coding parameters K & N (according to the docs, a client can configure the Happy parameter as well, see 4.3);
3. The third step is to split the ciphertext (the plaintext encrypted with the symmetric key from step 1) into N pieces; this process is called “erasure coding the ciphertext”, as described in section 4.3;
4. The last step is to distribute these shares among different servers, in which the servers are chosen using the method described in section 4.3.

Each capability of immutable files (i.e. read-cap & verify-cap) is constructed in its own way. The verify-cap is derived from the ciphertext by using secure hashes. The verify-cap makes it possible to check the integrity of a file without being able to produce the plaintext. Tahoe-LAFS states two problems regarding file validity:

1. When writing a file, the file is split up into N shares (as described in section 4.3) and distributed among different servers. However, when an integrity check fails, the client does not know which of the (erasure code) shares are wrong. This is necessary to know, as the client could then reconstruct the file from the other shares (excluding the wrong ones).

To solve this, Tahoe-LAFS computes a Merkle Tree over the erasure code shares. This makes it possible for clients to verify the correctness of each share. Figure 5 shows what this Merkle Tree looks like: the leaves are the blocks and the result is the block root hash. But this alone is not enough:

2. The second problem is that a Merkle Tree over the erasure code shares alone cannot guarantee a one-to-one mapping between the verify capability and the contents of a file.

The solution to the second problem is to create a second Merkle Tree, this time over the ciphertext itself. In this way, a reader can verify the correctness of a part of the file without downloading the entire file. Figure 6 shows what this Merkle Tree looks like: the leaves are the block root hashes of the previous Merkle Tree. A sketch of such a root computation is given below.
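The following sketch computes a binary Merkle root over a list of blocks, reusing the tagged_hash() helper from the preliminaries; the handling of an odd number of leaves is simplified here and does not follow Tahoe-LAFS’s exact padding rules:

def merkle_root(blocks):
    # Leaves: tagged hash of each block; internal nodes: tagged hash
    # of the concatenation of the labels of their two children.
    level = [tagged_hash(b"leaf:", b) for b in blocks]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])       # duplicate the last node if odd
        level = [tagged_hash(b"node:", left + right)
                 for left, right in zip(level[0::2], level[1::2])]
    return level[0]

block_root_hash = merkle_root([b"block0", b"block1", b"block2"])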


Fig. 5. Merkle Hash Tree resulting in block root hash, from29

29 Source: https://tahoe-lafs.readthedocs.io/en/tahoe-lafs-1.12.1/specifications/file-encoding.html

Fig. 6. Merkle Hash Tree resulting in share root hash, from30

As there are two Merkle Trees, there are also two roots (one for each tree). These two roots are kept together in a small block of data, a copy of which is stored on each of the servers where the shares are stored. Tahoe-LAFS calls this the “Capability Extension Block” (CEB), because the contents of the CEB are logically part of the capability itself. Tahoe-LAFS wants to keep the capability string as short as possible, and therefore stores this block on the servers instead of embedding it in the capability. Figure 7 illustrates such an immutable file. The verify-cap is a secure hash of this Capability Extension Block. The read-cap, in turn, consists of the verify-cap plus the symmetric encryption key, granting the wielder the ability to read the contents of the file and to verify the file itself.

30 Source: https://tahoe-lafs.readthedocs.io/en/tahoe-lafs-1.12.1/specifications/file-encoding.html

Fig. 7. Immutable file31

3. Mutable files
Immutable files only have two capabilities, as mentioned above. Mutable files have three of them, as the file can be mutated/altered: a verify-cap, a read-only-cap and a read-write-cap. Another difference is that the setup process does not use symmetric crypto, but asymmetric crypto, due to the third capability. Every mutable file has an RSA key pair. Clients/users that have access to the private key (called the signing key SK in the rest of this text) are authorized to write; they can make signatures on the new versions of the file that they write. Tahoe-LAFS chose 2048-bit RSA keys. The capabilities (the unique strings), however, are 128-bit secrets that are cryptographically linked to the RSA keys. Note that section 4.2 mentioned that secrets should be longer than 96 bits to be unguessable; Tahoe-LAFS thus satisfies this requirement. The secrets are linked to the RSA key pair as follows (according to the paper): writing to a mutable file involves computing the write key WK, which can be computed by hashing the SK:

WK = Htrunc(SK)

Where Htrunc is hashing using SHA256d, but truncating the output to 16 bytes. Note the link to the RSA key pair, as SK is the signing/private key of the RSA key pair. The

31 From [29]

reading process consists of computing the reading key RK. The RK can be calculated as follows:

RK = Htrunc(WK)

This implies that someone who only knows the RK cannot compute the WK. The other way around is possible: knowing WK makes it possible to compute RK. This is desired behavior because, as mentioned before, keys should be diminishable (see 4.2). When a client performs a write to a file, he/she generates a 16-byte salt that is used to compute the encryption key EK:

EK = H(RK, salt)

Using this encryption key EK, the client encrypts the plaintext, erasure-codes the ciphertext to generate the shares and finally writes those shares to different servers. The Merkle Tree over the shares can then be computed. The process of getting the ciphertext is thus as follows:

Ciphertext = Encrypt(EK, plaintext)
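Putting the chain together, a minimal sketch of these key derivations (assuming SHA256d for H, truncation meaning “keep the first 16 bytes”, and a placeholder standing in for the serialized RSA signing key):

import hashlib, os

def sha256d(data):
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def h_trunc(data):
    return sha256d(data)[:16]              # truncate to a 128-bit secret

sk_bytes = b"<serialized RSA signing key>"  # placeholder, not a real key
WK = h_trunc(sk_bytes)                      # write key, derived from SK
RK = h_trunc(WK)                            # read key: derivable from WK,
                                            # but WK not derivable from RK
salt = os.urandom(16)                       # fresh 16-byte salt per write
EK = sha256d(RK + salt)[:16]                # per-write AES-128 encryption key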

As this client is writing, he/she knows the SK. This SK is used to sign multiple parts:

─ The root hash of the Merkle Tree;
─ The salt;
─ The erasure coding parameters;
─ The ;
─ And a sequence number (for file versioning, to indicate that the contents this client just wrote supersede the previous version/sequence number).

Figure 8 shows an illustration of the different parts of mutable files.
Now that we have covered the keys and how they are computed, let’s look at the capabilities themselves. We start with the verify-cap. Verification involves checking a signature, as each mutable file has a unique RSA key pair. To make this possible, a copy of the RSA public key (called VK, the verifying key) of every file is stored on the servers where the (erasure-coded) shares are stored. The verify-cap (VC) provides the ability to check that this VK, as stored on a storage server, is the correct one. Therefore, the VC is a secure hash of the VK:

VC = H(VK)

The second capability is the read-only-cap (RC). The requirement of this capability is that it can read the content of the associated file, but also check its integrity. Therefore, the RC consists of the following:

RC = VC, RK

RC thus provides the ability to check integrity (using VC) and to read the contents of the associated file (using RK).
The last capability is the read-write-cap (WC). The requirement of this capability is that it provides the ability to check integrity (using VC) and to read content from and write content to a file. Therefore, a write key (WK) is added to this capability:

WC = VC, WK

When writing, the writer has to produce a signature on the new version of the file. This is done by storing the SK (the private, signing key), encrypted with the WK, on the servers where the shares will be / are stored. The WK is a secure hash of the SK, and SK will therefore remain secret, as WK is only known by the users that are authorized to write. The author of the paper states that this is a somewhat unusual arrangement, but that they do not expect that there are weaknesses in this so-called “key-dependent encryption” that are not found in standards like RSA and AES.


Fig. 8. Mutable file32

4. Directories
A directory is a mutable file which does not contain one single element, but a set of tuples (child name, capability). Tahoe-LAFS aims for transitive read-only in directories. This means, citing the paper, that “users who have read-write access to the directory can get a read-write-cap to a child, but users who have read-only access to the directory can get only a read-only-cap to a child” [29]. In order to implement transitive read-only directories, Tahoe-LAFS includes two slots for the capabilities of each child: one for a read-write-cap and the other for a read-only-cap. Information and data are encrypted when writing the directory, just like when writing a mutable file. There is, however, one addition: read-write-caps to children are so-called

32 From [29]

“super-encrypted”, which means that they are encrypted separately by the writer before being stored.

4.5 Freshness of mutable files & directories

So far, we have described how digital signatures work in mutable files and directories. The paper, however, states that there is still one weakness in this part of the protocol: “the [signature scheme] does not prevent failing or malicious servers from returning an earlier, validly signed, version of the file contents” [29]. They do not actually solve this problem in their solution, but make it harder for attackers or failing servers to return wrong contents of a file. They accomplish this by using the “erasure coding scheme”. A mutable file has versions, in which each version has a sequence number. Returning the contents of a file concerns reading a file. To accomplish the above, the read protocol is split into two phases (a sketch of the first phase follows the list):

1. Instead of directly requesting the content of a file, the first phase consists of requesting metadata about a share from each of K + E servers. Here, K is the number we already covered: the number of shares that are necessary to reconstruct the file. E stands for the number of “extra” servers to read from. So, in phase 1, a reader queries K + E servers for metadata about their shares. One element of this requested metadata is the version (sequence) number. If the reader learns that there are at least K shares with the highest version number found in the metadata, the reader is satisfied and proceeds to the second phase. If higher version numbers are found in the metadata of the shares, but fewer than K shares have that version number, something is wrong: if one share has the highest version, but no other shares have this same highest version, something suspicious is going on. In this case, the reader continues querying more servers for their metadata, until at least K shares are found that share this highest version (it is also possible to run out of servers to query), and then continues to phase 2.
2. The second phase starts either when the first phase found at least K shares with the same newest version, or when it ran out of servers. The output of this phase is the downloaded shares. The paper describes that the shares will also be downloaded if the reader ran out of servers without finding at least K shares with the same highest version. It does not, however, describe what happens in that case: what measures does the protocol take when fewer than K shares with the same highest version are found? We think that Tahoe-LAFS looks at the second highest version number and restarts the process, continuing until a version is found that at least K shares share. However, this is not stated in the paper, and is just a guess.
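A sketch of the first phase in Python; the server objects and their get_share_metadata() query are hypothetical stand-ins for the real protocol messages:

from collections import Counter

def find_recoverable_version(servers, k, extra):
    # Phase 1: gather share metadata until at least k shares agree on
    # the highest sequence number seen, or we run out of servers.
    versions = Counter()
    it = iter(servers)                     # servers in permuted order (see 4.3)
    for _ in range(k + extra):             # initial round: query K + E servers
        try:
            versions[next(it).get_share_metadata().seqnum] += 1
        except StopIteration:
            break
    while versions and versions[max(versions)] < k:
        try:                               # highest version lacks k shares:
            versions[next(it).get_share_metadata().seqnum] += 1
        except StopIteration:              # ran out of servers to query
            return None
    return max(versions) if versions else None   # proceed to phase 2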

5 Improving the Tahoe-LAFS file system

As we saw in the previous section, Tahoe-LAFS has a few shortcomings: it is not privacy friendly by default, and it does not offer any owner control / flexibility for owners. In this section, we will try to mitigate these shortcomings in order to improve the Tahoe-LAFS file system.

5.1 Privacy analysis

Tahoe-LAFS by default is not very privacy friendly. In order to read a file, one has to download the necessary shares (at least K, see 4.3) from different servers, which are then used to reproduce the file that the user wants to read. This is under the assumption that the requester/reader knows the read-only-cap in case of a mutable file, or the read-cap in case of an immutable file. This download, however, is not anonymous, because the IP address from which the reader is downloading the file is visible to the storage servers and to eavesdroppers. This leads to linkability of the user to a specific download, which others can establish by observing the file size (see section 5.2.4). Note that a writer cannot be linked to the reader(s) of the same file, as they access different URIs: a reader accesses a URI containing the read-only-cap, while a writer accesses a URI with the read-write-cap embedded in it. However, if an observer has the read-only-cap to a certain file, it can observe traffic to identify others that are also reading the file, as they are accessing the same URI.

5.2 Privacy improvements

5.2.1 Tahoe-LAFS over an anonymization network
Tahoe-LAFS designed a solution for this. The solution is described in detail in their docs33, as no information about it can be found in the paper itself. Tahoe-LAFS can be configured to run over Tor or I2P, which are both anonymization networks. A bit of client-side configuration is needed, which differs between Tor and I2P. For Tor, one has to install Tor and configure Tahoe-LAFS to run over it. For I2P, one has to install I2P itself, do some configuration on the localhost and then configure Tahoe-LAFS to run over I2P. Below we discuss the two types of anonymization networks that can be used for Tahoe-LAFS: what they are (briefly), what the differences are and how they can be applied in Tahoe-LAFS.

I2P
I2P is a P2P anonymization network/overlay that is formed by a group of routers. These routers run the software that makes it possible to communicate through I2P. Participants (i.e. routers) in the network interact anonymously with each other. I2P is based on mix networks, in which data is routed through multiple routers, which makes tracing much harder [9]. I2P applications are not accessible by connecting

33 https://tahoe-lafs.readthedocs.io/en/latest/anonymity-configuration.html

to the right IP address. Instead, I2P applications can be accessed through a so-called location independent identifier [27]. Applications that run on top of this layer have an associated destination, which receives incoming connections from third parties [27].
In order to send messages, the I2P protocol creates (undirected) tunnels34 at each router (participant of the network). There are two types of tunnels: inbound and outbound tunnels. Receiving data happens through the inbound tunnel, while the outbound tunnel’s purpose is to send data. A tunnel is formed by three parameters: the gateway (the entry point), the set of routers (the set of participants / intermediate nodes) and an endpoint (the exit point). Each participant decides on the configuration of their inbound and outbound tunnels. Note that for bidirectional communication, four tunnels are needed, as both users in the communication need both an inbound and an outbound tunnel, because these tunnels are undirected. When a user wants to send a message, a tunnel has to be created and configured (which can be an HTTP tunnel, an IRC tunnel etc.). Within this configuration, users can specify the number of hops (additional steps, i.e. the number of routers chosen for a tunnel between entry and exit point), which enables the user to decide on a balance between security and latency. Sending a message involves sending the message through the outbound tunnel of the sending participant (containing the pre-configured extra hops, i.e. the set of routers) to the gateway of the inbound tunnel of the receiving participant. The message then also goes through the set of participants (extra hops) defined by the inbound tunnel of the receiving party and eventually reaches the receiving party itself through the endpoint.

Fig. 9. Sending message

In order to increase security and make traffic analysis harder, I2P creates new tunnels every 10 minutes and drops the old ones. The reason for the 10-minute timeframe is that it is small enough to make it hard for an attacker to gain sufficient knowledge of the network. This does mean, however, that a user has to select a new set of nodes/routers for its tunnels every 10 minutes, which is a subset of the total number of routers in the network.

Tor
Tor is a low-latency anonymization network, designed to provide users a way to browse anonymously [8]. Tor consists of routers, which are connected to each other by TLS connections, and adds an anonymity layer for TCP. Similarly to I2P, Tor is also based on mix networks, with the difference that Tor, unlike classical mix networks, does not delay packets within the network (to achieve low latency). A Tor network consists of Tor routers; information about the routing is distributed through multiple directory servers [15]. A paper by Damon McCoy et al. [15] explains that TCP connections are tunneled

34 See https://geti2p.net/nl/docs/how/tunnel-routing

through a circuit that rotates over time, and that there are typically, by default, three hops in a circuit. These hops consist of an entrance Tor router (the first hop), a middle Tor router (the second hop) and an exit Tor router (the third hop). So, in other words, there is a tunnel like in I2P that consists of three variables: the entrance router, the middle router and the exit router. The middle router does not necessarily have to be one router; it could also be a set of routers (relay nodes). Clients can decide on these variables. Nodes only know their predecessor and successor [8]. Only the entrance node knows the originator of a request through the Tor network. On the other hand, only the exit node has the ability to observe the decrypted payload and learn about the final destination server. A Tor router cannot be both an entrance node and an exit node at the same time.

Tor vs I2P
According to Mathias Ehlert [9], there are three main differences between Tor and I2P. First, Tor is an example of an anonymization network that is designed for low-latency anonymous internet browsing. I2P, on the other hand, is designed as an isolated anonymous network within the internet [27]. A second difference is that Tor supports bidirectional tunnels, while I2P only supports undirected tunnels, requiring four tunnels in the case of bidirectional communication. The last difference mentioned in the paper is that Tor uses onion encryption, while I2P uses garlic encryption. The difference between them is explained in [6], which states that in garlic encryption, multiple data messages may be contained in one single garlic message; in other words, one single garlic message could contain messages for more than one recipient (i.e. multiple messages for different receivers). Using this approach, intermediary nodes are not able to determine how many data messages are wrapped in a garlic message and what the destination of each message is.

Using the anonymization networks in Tahoe-LAFS
Tahoe-LAFS storage servers can be configured to only be accessible through an anonymization network. This means that users who want to store shares on those servers have to configure their client to run over such a network. Tahoe-LAFS storage servers can also be configured to allow both anonymized and non-anonymized connections.
There is a separate configuration variable for clients, called “reveal-IP-address”, which is a Boolean. If this variable is set to false, Tahoe-LAFS will refuse to start if there is any configuration on the client node that could possibly leak the IP address of the node. It can be seen as an “extra check” variable. The reveal-IP-address variable can also be configured on the storage server side. Note that this is a nice feature of Tahoe-LAFS, but these measures reduce performance and require additional software. A sketch of such a client configuration is given below.
Configuring storage servers to only be accessible via anonymization networks guarantees that a misconfiguration on the client side cannot lead to IP leakage of the client. A detailed analysis of the anonymity of Tahoe-LAFS can be found on their Readthedocs page35 and on their GitHub page36.
Running Tahoe-LAFS over an anonymization network is a good thing to do. A file is accessible through a URI, which is based on capabilities: someone with a read-only-cap to a specific file has to approach a different URI than someone with a read-write-cap to that same file. Running Tahoe-LAFS over an anonymization network can therefore further reduce linkability.
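For illustration, the relevant client-side options in tahoe.cfg would look roughly like this (a sketch based on the anonymity-configuration docs; exact section and option names should be verified against those docs):

[connections]
tcp = tor            # route all TCP connections through Tor

[node]
reveal-IP-address = false   # refuse to start if any setting could leak the IP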

5.2.2 Tahoe-LAFS over LAN
As mentioned in the previous section, there are still risks when running Tahoe-LAFS over an anonymization network. An alternative is running Tahoe-LAFS over a LAN. The downside is that one has to have enough storage servers within the LAN to store the data on. Another downside is that if the building housing the LAN is destroyed, the data is gone too. However, it prevents adversaries outside the network from eavesdropping on data one is sending to / receiving from public storage servers. So, it has some risky downsides, but it reduces the eavesdropping risk.

5.2.3 Tahoe-LAFS and trackers
A big benefit of Tahoe-LAFS is that it does not use trackers for routing / peer discovery. Trackers are, for example, used by BitTorrent. If a user downloads a torrent file, the user finds in it, among other data, the location of the tracker it has to query in order to receive a list of peers that are seeding/leeching the same torrent. So users who download a torrent file know which other peers are also downloading that file. Besides that, the tracker continuously maintains the peer list, and peers can continuously query the tracker for other nodes, where the result includes IP addresses of nodes to connect to.
Tahoe-LAFS uses introducers, which are a type of server node (see the intro of section 4). Instead of a tracker being responsible for routing (a way for peers to find each other), the introducers are responsible for routing. The difference with trackers is also mentioned in the intro of section 4, namely:

“In order to announce the presence of this introducer to storage servers and clients, the introducer has to manually deliver this location”

Upon entering the network, a new node has to know the introducer location and announce its presence to it. The introducer then broadcasts the location to all other nodes (clients & servers). The location of this introducer has to be manually delivered to any node that wants to join the network, as otherwise these nodes would not know the locations of the storage servers. In contrast to BitTorrent trackers, where the tracker keeps a peer list of all peers that are participating in the torrent session, the introducer does not participate in the session: the introducer does not know whether someone is uploading / downloading anything or not.

35 https://tahoe-lafs.readthedocs.io/en/latest/anonymity-configuration.html
36 https://github.com/tahoe-lafs/tahoe-lafs/blob/master/docs/anonymity-configuration.rst

5.2.4 File size
According to the architecture docs of Tahoe-LAFS37, the file size is not protected. In other words, if one accesses a file, others are able to determine the size of the file that this person is accessing. The risk becomes bigger if they already know the contents of that file, as they are then able to determine whether you are uploading or downloading that same file. The docs do not describe a solution for this.
In order to bypass this, one could use an anonymization network, as it hides the downloader’s/uploader’s identity. This does not stop others from determining the size of a file that one accesses, but it reduces linkability. As an anonymization network affects performance, this might not be the most efficient solution. What we try to prevent is that others can link a file download to a specific IP address. If others are able to determine the size of a file that one accesses, adversaries would be able to link that specific file to an IP address (which is, assuming that no anonymity network is used, the IP address of the downloader him/herself).
There are a few possible solutions that we considered. One of them is using a VPN that multiple clients use. In this way, only the IP of the VPN is revealed, which does not lead to linkability of the actual client. Data is accessed through URIs. For example, a client can access data using the web API by:

http://127.0.0.1:3456/uri/ + $CAP

Where $CAP is the capability (the URL is just an example). The client API then requests the shares that are needed to access the file or directory associated with that capability; a sketch follows below. The VPN would not know anything about the capabilities, as the client API just downloads the shares, which are eventually decrypted client-side. This approach is faster than an anonymization network, as traffic goes through one additional server instead of through a chain of nodes.
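A sketch of such a download through the local web API gateway, using only the Python standard library; the capability string is a placeholder:

import urllib.request

cap = "URI:CHK:..."                           # placeholder capability
url = "http://127.0.0.1:3456/uri/" + cap      # real code may need URL-escaping
with urllib.request.urlopen(url) as response:
    plaintext = response.read()   # the gateway fetches >= K shares,
                                  # erasure-decodes and decrypts them locally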

However, this brings several problems. A few examples:
─ An introducer would not know when a node enters the network, as the node is connected through the VPN. As a result, the new node does not know the locations of the storage servers, making it unable to use the network for downloading and uploading;
─ Trust in the VPN is required, i.e. an external trusted party. The VPN could gain knowledge about capabilities and used URIs, thus learning share locations, file sizes etc.;
─ Observing the traffic is still possible.

Another solution would be to split files into blocks of a pre-defined size. So, instead of splitting a file into shares of a particular size, where the last share might be smaller and the sum of the share sizes is the size of the entire file, the shares are padded to a pre-defined, equal size which ‘fakes’ the original size of the entire file. This still allows observers to calculate a file size, but as it is padded, it

37 https://tahoe-lafs.readthedocs.io/en/tahoe-lafs-1.12.1/architecture.html

does not match any earlier file. Thus, in order to hide the actual file size, a writer has to pad each share to a pre-defined, equal size (such that every share has the same size) and upload that padded share instead of the original share. Authorized readers decrypt the file and remove the padding in order to read the original content, while unauthorized downloaders and observers are not able to decrypt the file and thus do not know which part of a particular share is padding and which part is actual content. A sketch of such a padding scheme is given below. This does lower efficiency, as all shares of all files stored on storage servers require more space. This means that storage servers will run out of space faster than before. If there is only a limited number of storage servers, an entire storage grid will run out of space faster than before (depending, of course, on the amount of padding that is added). In the end, the better solution would still be to use an anonymization network, as it eliminates the external-party trust issue and the issues around routing (i.e. the introducer not knowing when a new node connects if that node is connected through a VPN provider that is already part of the network).
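A sketch of such a padding scheme, with an assumed fixed bucket size and a length prefix so that authorized readers can strip the padding after decryption:

import os, struct

BUCKET_SIZE = 4096                     # pre-defined, equal share size

def pad_share(share):
    # Prefix the real length, then fill up with random-looking bytes.
    body = struct.pack(">I", len(share)) + share
    if len(body) > BUCKET_SIZE:
        raise ValueError("share does not fit in one bucket")
    return body + os.urandom(BUCKET_SIZE - len(body))

def unpad_share(padded):
    (length,) = struct.unpack(">I", padded[:4])
    return padded[4:4 + length]

assert unpad_share(pad_share(b"share bytes")) == b"share bytes"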

5.3 Adding owner control to the Tahoe-LAFS file system

Tahoe-LAFS has several limitations. One of them is that it is missing owner control. Mutable files have three capabilities, namely a verify-cap, a read-only-cap and a read-write-cap. Knowing a capability (a string of at least 96 bits) is sufficient to perform the actions linked to that capability. For file sharing, sharing the read-write-cap with others gives them the power to also read and write that file. Others can then redistribute that secret, or diminish other capabilities from the read-write-cap. However, this raises a problem: the creator/owner of a file can share the read-write-cap with others, but once shared, he/she no longer has any control over what happens with that file.
The owner cannot forbid others, with whom he/she shared the capability, to redistribute it. The owner can also not impose restrictions on the capability: once someone has the read-write-cap, he/she can read and write the file forever. The owner can erase the contents of a file to prevent others from reading possibly sensitive data that was accidentally written to that file, but others still have access to the signing (private) key, which means that they have access to previous file versions. Erasing content is therefore not a solution. Another problem is that there is no sharing limitation: when someone has access to a capability, he/she can redistribute it. As our desired end solution concerns, among other things, file sharing, we desire functionality that provides the ability to assign an owner (initially the creator) of a file. Desired functionalities that an owner should have, and that are not present in Tahoe-LAFS, are:

─ (Temporarily) locking a file from reading & writing;
• Such that an owner can stop others from reading & writing because of, for example, misuse, fake information etc.;
─ Revoking access from specific nodes;
• If the owner detects misuse, fake information etc., the owner would want to revoke rights from specific nodes;

─ Creating capabilities with restricted access;
• Creating a read-write-cap that can only be used once or by a specific node / IP address;
• Same for the read-cap;
─ Prevention of redistribution of keys.

In short, what is missing is an owner role that can manage access control dynamically. This would add dynamic control abilities to file sharing and would prevent unintended or malicious access. With malicious access we mean, in this case, that if a user gets access to a capability and posts the key online, for whatever reason, everyone will be able to alter the file and the owner/creator of the file will lose control. The owner can ‘just’ create another file, but the same problem starts again once he/she shares a capability to it. The desired functionalities, as described above, would help organize ownership of files.

5.3.1 An initial solution
A solution to ownership might be to have a fourth capability for mutable files (and a third capability for immutable files), namely the owner capability (owner-cap, OC). This capability is only known by the owner and does not have to be linked to the other capabilities; if OC is completely random, that is fine. The OC consists of an owner key OK, and the only requirement for the OC is to give the owner dynamic access control over the corresponding file. The owner key is the public key of a public/private keypair that the creator has to generate (not the same public/private key pair as used for mutable files!), and the servers where the shares are stored know this OK.
When a client writes a file using his/her read-write-cap, the erasure-coding process is restarted and other users (with at least the read-only-cap) can pull the newer shares from the storage servers to reproduce the file and read it. A whitelist would be a solution to give the owner role more dynamic control. The idea behind this whitelist is as follows. Upon creating the file, the creator also generates the OC with the corresponding public/private keypair, in which the public key is the OK. Upon distributing the shares among different servers, the computed “capability extension block” (see 4.4) is also stored on the storage servers next to the shares. The server knows the OK and stores it in order to check the whitelist later. To support the owner role, an additional item is stored on the storage servers next to the capability extension block and the share, namely the whitelist. This whitelist consists of two parts:

1. A list of allowed capabilities;
2. A list of peers that are allowed to download the share.

The reason for 1) is that the owner gets the ability to completely block capabilities, regardless of whether clients hold the right capability or not. This allows the owner to (temporarily) lock the file. The reason for 2) is that it allows the owner of a file to prevent the capabilities from being distributed further than intended. It also allows revoking access for particular nodes. For 2), a 2-tuple might be saved in the whitelist containing <IP address, capability>. This allows the most fine-grained access control possible, as the owner can specify, when he/she distributes a key to a particular node, that this node is allowed to perform certain actions (capabilities). One example: the owner adds IP address 86.86.145.77 to the whitelist and gives this IP address the read-write-cap. An entry in the whitelist would then be <86.86.145.77, read-write-cap>. Upon writing, the server checks whether the IP address of the writer is in the whitelist and is allowed to write, and then decides whether the action is permitted. The second element of the tuple might not be needed, as the writer proves that he/she is authorized to write by knowing the signing key, but it allows further configuration possibilities in the future.
In such a design, the whitelist should by default contain the three capabilities for mutable files and the two capabilities for immutable files. The owner can remove one of the capabilities from this list, as described by 1), to directly prevent all other users who have, for example, the read-write-cap, from writing the file. Next to the (allowed) capability list, it should contain the tuples of authorized nodes that are permitted to perform an action. The whitelist could be as simple as a JSON file; we worked out one example, illustrated in figure 10 below, for mutable files. In order to restrict access to this whitelist such that only the owner can update it, the whitelist has to be stored signed by the private key corresponding to the OK (the public key, part of the owner capability), which the server knows. Only the initiator/creator of the file knows the private key corresponding to the OK, and signing the whitelist with this private key is thus sufficient to prevent unauthorized users from altering the list, as they cannot produce a valid signature over an altered list.
This whitelist functionality should not be mandatory. There should always be a whitelist uploaded for each file on every server where a share is stored, to prevent malicious users from uploading fake whitelists for a file; but to make the functionality optional in practice, the default whitelist should contain all possible capabilities, and the nodes should be configured as "nodes": * in the JSON file (figure 10) to allow all nodes by default. This prevents forcing the creators of files to specify a whitelist for every single file they create. A blacklist would be inefficient, as it would require actively maintaining that list and continuously adding entries if someone who has to be blacklisted keeps changing IP addresses.
The downside of this approach is that the creator must know the IP addresses of the users with whom he/she wants to share the file. However, if one wants to restrict access to a file, he/she likely already wants to share it with a specific group of people; asking known people for their IP address should not be a problem. Note that this whitelist solution is optional, so users who do not want to restrict access do not have to.
Another downside is that it prevents Tahoe-LAFS from running over an anonymization network, as permissions are tied to whitelisted IP addresses. This is the trade-off that users have to make in order to use whitelists. A last downside is that RSA key generation in Tahoe-LAFS was already slow, and in this solution it has to be done twice, which can lead to slower performance.
When a creator wants to edit a whitelist, he/she has to edit that whitelist on every server where a share is stored. That might be inefficient and requires further research if solutions like this are to be used in the future. Upon a share download, the server where the share is stored checks the whitelist and then decides whether the node may download the share or not.

Fig. 10. Whitelist example
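For concreteness, such a JSON whitelist could look as follows (all field names are ours and purely illustrative, not part of Tahoe-LAFS):

{
  "allowed_capabilities": ["verify-cap", "read-only-cap", "read-write-cap"],
  "nodes": [
    { "ip": "86.86.145.77", "capability": "read-write-cap" },
    { "ip": "*", "capability": "read-only-cap" }
  ],
  "signature": "<signature over this list, made with the OK's private key>"
}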

5.3.2 Final solution
One downside was not mentioned in the previous section. Tahoe-LAFS is designed not to trust storage servers, and the downside of the above approach is that the owner control depends on server trust: the server checks the signature, and faulty servers could simply allow non-whitelisted IP addresses to perform actions based on their capabilities. To solve this, trust should be transferred from server to client. One way to achieve this is by making the whitelist a mutable file.

Next to the shares on the storage servers, a mutable file can be stored which represents the whitelist. This file is encrypted with the encryption key (see 4.4.2), and all storage servers where the shares of a particular file are stored should get access to the corresponding read key (i.e. by giving them the read-only-cap). This means that each server where a share is stored can read the associated whitelist. They cannot, however, produce a valid signature over a new/updated version, which makes them unable to alter the file. The owner has access to the read-write-cap and can update the whitelist whenever he/she wants. One downside of this approach is that this file has to be updated on all storage servers.
The protocol would then be as follows: a user uploads a file (mutable or immutable) and this file is erasure-coded into shares which are distributed among multiple servers. Upon storing, the user also stores on each storage server the encrypted whitelist and the plaintext read-only-cap, so that the server can read the contents of the whitelist. The only vulnerability that we could come up with in this solution is that a malicious server could ignore the whitelist and just allow any request for downloading a share. However, as K shares are still needed to reproduce the file, this is only a problem if all approached servers are malicious/failing. If only one server (in general responsible for one share) is malicious, a client that is not on the whitelist would be able to download only one share. A malicious server could also redistribute the read-only-cap, giving others the possibility to read the contents of the whitelist. This is not such a big problem, however, as they cannot alter the file. The user has to make sure, of course, that the private key of the whitelist mutable file is kept private, as it would otherwise allow others to alter the whitelist, granting them access.

5.3.3 Other solutions
Another solution for owner control, which is not as flexible as the solutions described in 5.3.1 and 5.3.2, is so-called re-encryption. This solution allows the owner(s) of a file (i.e. users who know the read-write-cap) to revoke access for all other users that have access to a read-write-cap or a read-only-cap. The process is as follows: a user with a read-write-cap to a particular file re-encrypts the plaintext content with a key that nobody but this user knows. The resulting ciphertext then goes through the standard process: it is encrypted using the encryption key and is then erasure-coded into shares which are distributed to multiple servers. Someone with the read-write-cap or the read-only-cap can decrypt the file, but will only see ciphertext for which he/she does not know the correct decryption key. Older versions could still be accessible. A sketch of this idea is given below.
The downside is that all users that have access to the read-write-cap can perform this action. However, it is the user's responsibility not to distribute this read-write-cap carelessly, but only to the persons that are really allowed to alter the contents. The read-only-cap might be distributed freely, for example, and this solution will still work, as upon decryption these users would again see only ciphertext (produced by the re-encryption).
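A sketch of the re-encryption step in Python, using AES-128 in CTR mode (matching the cipher from 4.4) via the third-party cryptography package; all key handling here is illustrative:

import os
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

def aes_ctr(key, nonce, data):
    # AES-128-CTR is symmetric: the same call encrypts and decrypts.
    encryptor = Cipher(algorithms.AES(key), modes.CTR(nonce)).encryptor()
    return encryptor.update(data) + encryptor.finalize()

revocation_key = os.urandom(16)   # known to nobody but this writer
nonce = os.urandom(16)
inner = aes_ctr(revocation_key, nonce, b"file plaintext")

# `inner` is now written as the file's new contents through the normal
# pipeline (EK encryption, erasure coding). Read-cap holders can still
# decrypt the outer layer, but only recover this inner ciphertext.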

6 Conclusions & future work

We analyzed multiple file sharing systems and saw that three of them were the best for our purpose: Tahoe-LAFS, IPFS and BitTorrent Sync. After a detailed analysis of these three technologies, we have seen that Tahoe-LAFS offers the most functionality that is relevant for our preparations for the end solution: the world disk, which makes it possible to remotely store and secure data on someone else's storage device. We have analyzed Tahoe-LAFS in great detail, going into even more detail than the original paper (by combining documentation and other sources). We have seen that it is a strong, fairly secure protocol. We also saw that its privacy can be improved by running Tahoe-LAFS over an anonymization network, making anonymous downloads possible. Lastly, we concluded that there are a few limitations in the Tahoe-LAFS protocol, one of them being owner control. We proposed a solution for this by first looking at an initial design in which the servers had to be trusted. We concluded that it would not be wise to place trust in the servers, and came up with a modified solution in which we no longer have to trust the servers. By using whitelists that servers can read but only the owner can update, an owner can manage access to his/her files in a flexible way. We ended by looking at a third possible solution, re-encryption: by re-encrypting the content, which only users with write access can do, one can make the file unreadable for all others.
According to the conducted research, Tahoe-LAFS is thus the most suitable technology for future research towards the World Disk. Tahoe-LAFS supports all the functionalities we aimed for and described in the introduction. We have also shown how to further improve privacy and owner control in Tahoe-LAFS, making it a suitable technology as a basis for creating a World Disk.
Future work involves analyzing Tahoe-LAFS's new file synchronization feature38, as at the moment of writing their docs do not yet fully explain how it works in detail, and it is still unfinished. We expect that this will be further documented in future versions, as it is a rather new feature. This is also the reason why we did not cover file synchronization in section 5; the docs are very likely to change, so writing an analysis based on unfinished (and not yet well specified) docs would likely result in an outdated section within this paper. More future work is also needed on how this solution can be used to get to the end solution described in section 1. For example, if one configures a part of his/her free disk space as a Tahoe-LAFS storage server, how do we make sure that he/she is also able to ‘rent’ the same amount of space elsewhere for his/her own back-up? Research is also required into how storage peers can be forced to stay available: as far as we could find, storage servers can come and go, which might cause a client's file shares on storage nodes to disappear.

38 https://tahoe-lafs.readthedocs.io/en/tahoe-lafs-1.12.1/proposed/magic-folder/remote-to-local-sync.html

7 Appendices

7.1 Appendix 1: Analysis of file syncing & sharing technologies

As mentioned in the previous section, this section analyzes some popular and relevant file sharing and file synchronization technologies. The technologies we look at are either client-server based syncing and sharing services or P2P syncing and sharing services, as these are the only two types of file syncing and sharing at the moment of writing. For each of the technologies below, we describe what the technology does, to get a broad idea of what existing technologies overall try to achieve / provide to software using these protocols. We also look at the properties of the protocols. In order to consistently look at the shortcomings, we will look at the following properties39:

• Whether the protocol is P2P or client-server;
─ At the end, the aim is to achieve a privacy friendly way of file syncing, sharing and back-ups. Client-server will probably not fit in a perfect solution because, as mentioned in 2.1, data is stored on a central server. However, the technologies can have some useful implementations that can be used in the end-solution.
• Data encryption
─ End-to-end
─ In transit
o Won’t be very relevant for local area network protocols, but very relevant for protocols that run over the network.
─ Stored data
o Meaning whether the protocol allows built-in solutions to encrypt stored data on servers.
• Authentication
─ Whether the protocol offers a way to authenticate clients against a server or other peers.
• The functionalities the protocol offers;
─ File synchronization
o One-way
o Two-way
─ Back-ups
─ File sharing
─ Fault tolerance
• Performance;
─ Did the protocol take any measures to improve performance?
• Privacy;

39 These properties are based on the problem statement, i.e. what we try to achieve as end solution. In order to design a proper protocol that allows anonymous downloading, file sharing, file synchronization and file back-ups, all those properties are important to research, as it shows how other technologies dealt with these properties.

─ Does the protocol offer some sort of anonymity?

NFS. NFS stands for Network File System and is a distributed (P2P based) file system protocol [24]. NFS is therefore not a file synchronization protocol, but a file system protocol that allows connected devices within a local area network to communicate with each other, i.e. allowing clients to access files over the network. The protocol is not optimized to run over the internet, which makes it less relevant for our end solution. However, the way it implements the P2P architecture and its security make it interesting to investigate [11].
The protocol distinguishes two types of nodes: server nodes and client nodes. Client nodes can store data to and retrieve data from centralized server nodes in the network40. Since the release of NFS version 4, the protocol makes sure that servers have implemented a minimal set of security schemes in order to use NFS, which are not treated further in the paper [24]. Besides the minimal set of security schemes, NFS version 4 also allows clients and servers to negotiate security.
NFS is a typical example of P2P file sharing: nodes can act as both a server and a client, making it possible to download content from a server node. It is not a file sync protocol, as the aim is not to provide a service that keeps files identical at different locations; instead, NFS is a protocol that allows peers to download from a server peer. NFS just makes it possible to request files over the LAN (in the same way as accessing files from a local storage device)41. It is, however, also possible to use syncing mechanisms with NFS by using the fsync() system call [24]42. This only allows file synchronization between server and client, though, and not server-to-server, for example.

Properties
In order to describe the shortcomings, the latest version of NFS, version 4, is researched [12].

Table 1. NFS properties

• P2P or Server-Client? P2P [12]
• Encryption
─ End-to-end: Yes; Kerberos v5 is used as the security mechanism, which offers three different modes to select for an NFS implementation. One of them, krb5p, is the strongest and most privacy friendly mode: it encrypts all traffic between client & server43. Of course, this mode has to be selected to obtain end-to-end encryption, and not one of the weaker modes that NFS also supports (i.e. krb5 and krb5i [24]).
─ In transit: Not relevant, as our end-solution aims to run outside the local area network, while this protocol runs within the local area network.
─ Stored data: No
• Authentication mechanism: Kerberos V5 [12]
• Functionalities
─ File synchronization
o One-way: Yes, using location attributes. A client can ask a server for multiple locations of a particular filesystem [24]. The data can be replicated / one-way synced to more servers. This replication is read-only.
o Two-way: No
─ Back-ups: Unclear. Data replication is possible, but the paper doesn't state whether the replicated data is continuously one-way synced, or synced only at the user's discretion. The latter case would allow support for back-ups; the former doesn't, as a back-up should not be continuously synced.
─ File sharing: Yes
─ Fault tolerance: NFS servers have the ability to recover without data loss from multiple power failures and hardware failures (e.g. disk failure, nonvolatile RAM failure) [24].
• Performance: Designed to perform well in the case of high latency and low bandwidth, to scale easily to very large numbers of clients per server, and for easy firewall passthrough44.
• Privacy: No, as clients are identified by a client ID and communication is not anonymous.

40 From https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/storage_administration_guide/ch-nfs
41 From https://nl.wikipedia.org/wiki/Network_File_System
42 From https://www.vagrantup.com/docs/synced-folders/nfs.html
43 From https://wiki.debian.org/NFS/Kerberos
44 From https://nl.wikipedia.org/wiki/Network_File_System

SMB.
SMB stands for Server Message Block, and is an application-layer network protocol built for file sharing45. It not only allows file sharing, but also network browsing and printing over a network46. SMB is a client-server approach which uses the classic request & response pattern: the client can request data from the server, after which the server sends a response back. The server, however, is not a central server, but a (decentralized) peer configured to be a server47. One example piece of software that uses the SMB protocol is Samba, with which Windows clients can also access Unix servers and vice versa48.
SMB is a network file sharing protocol. It is thus not optimized to run over the public internet. SMB is not a file system as NFS is; SMB follows a client-server model. However, DFS (Distributed File System) is a technology developed by Microsoft that can combine a set of SMB file shares into a distributed file system49.
Several big parties, like Oracle (see footnote docs.oracle.com) and Microsoft50, created their own implementation of the SMB protocol. However, their implementations are not always as secure as they should be51. One example exploit was the SMB worm tool, which uses brute force authentication attacks.

Properties
As there is no scientific paper about SMB, these properties are based on Microsoft documentation and Wikipedia. SMB's latest version is currently 3.1.1.

Table 2. SMB properties

• P2P or Server-Client? Client-server & P2P (see footnote docs.oracle.com)
• Encryption
─ End-to-end: Yes; since the introduction of SMB 3.0, end-to-end encryption is supported [23]52.
─ In transit: Not relevant, as our end-solution aims to run outside the local area network, while this protocol runs within the local area network.
─ Stored data: No
• Authentication mechanism: SMB provides authentication mechanisms for both the user and the share (i.e. a file, directory or printer)53. Share-level authentication makes it possible to give specific access rights to that share, e.g. by protecting it with a password. User-level authentication means that clients have to provide a username & password combination to access a share. Among other things, NTLM is supported, which is the Microsoft challenge-response authentication protocol54.
• Functionalities
─ File synchronization
o One-way: Supported, but requires the use of DFS (Microsoft Distributed File System)55,56, which organizes distributed SMB file shares into a distributed file system. This then allows e.g. data replication and two-way data synchronization.
o Two-way: Possible, but also needs the DFS implementation that groups the SMB shares into a distributed file system.
─ Back-ups: Yes, using VSS (Volume Shadow Copy Service), supported for file shares since SMB 3.057. This makes it possible to create a 'shadow copy' of file shares.
─ File sharing: Yes
─ Fault tolerance: VSS is one form of fault tolerance: it makes sure that there are copies of the shares distributed over other nodes, so that documents are not lost upon server failure. SMB 3.0 also introduced SMB Transparent Failover58, which allows administrators to configure shares to be continuously available. This allows maintenance without interrupting the applications.
• Performance: Mainly depends on the configuration of the implementer. Opportunistic locking (a caching mechanism) can improve performance, while SMB encryption can slow it down59.
• Privacy: During authentication, credentials are exchanged. End-to-end encryption can optionally be enabled, but the source and target of a request are not anonymized. SMB allows configuring the impersonation level, which can be set to Anonymous, Identification, Impersonation or Delegate60. This field can be provided when issuing a create request from application to server. However, this impersonation level is only used by the server to check validity and not used afterwards. Therefore, as also mentioned in Microsoft's [MS-WPO] docs61, impersonation has no meaning.

45 From https://docs.microsoft.com/en-us/windows/desktop/FileIO/microsoft-smb-protocol-and-cifs-protocol-overview
46 From https://en.wikipedia.org/wiki/Server_Message_Block
47 From https://docs.oracle.com/cd/E19253-01/819-7355/gfhaq/index.html
48 From https://www.samba.org/
49 From https://docs.microsoft.com/en-us/windows/desktop/dfs/distributed-file-system-dfs-functions
50 From https://docs.microsoft.com/en-us/windows/desktop/FileIO/microsoft-smb-protocol-and-cifs-protocol-overview
51 See https://www.us-cert.gov/ncas/alerts/TA14-353A
52 From https://docs.microsoft.com/en-us/windows-server/storage/file-server/smb-security
53 From https://docs.microsoft.com/en-us/windows/desktop/fileio/microsoft-smb-protocol-authentication
54 From https://docs.microsoft.com/nl-nl/windows/desktop/SecAuthN/microsoft-ntlm
55 Source: https://en.wikipedia.org/wiki/Distributed_File_System_(Microsoft)
56 From https://docs.microsoft.com/en-us/windows/desktop/dfs/distributed-file-system-dfs-functions
57 From https://blogs.technet.microsoft.com/filecab/2016/03/25/vss-for-smb-file-shares
58 From https://blogs.technet.microsoft.com/filecab/2016/03/25/smb-transparent-failover-making-file-shares-continuously-available-2/
59 From https://docs.microsoft.com/en-us/windows-server/administration/performance-tuning/role/file-server/smb-file-server
60 From http://download.microsoft.com/download/C/6/C/C6C3C6F1-E84A-44EF-82A9-49BD3AAD8F58/Windows/%5bMS-SMB2-Diff%5d.pdf
61 From https://docs.microsoft.com/en-us/openspecs/windows_protocols/ms-wpo/c5f54a77-65be-40a0-bb82-9e4181d8ab67

Kosha (P2P enhancement for NFS).
Kosha is a system created by researchers at Purdue University (an American university) [5] that aims to improve NFS. A couple of the additions it makes to NFS are:

• Location transparency;
• Mobility transparency;
• Load balancing.

In Kosha, all peers that contribute disk space are automatically joined to the P2P network and identified by unique node identifiers. Access works just as in NFS: there is a server-node which client-nodes can access. From there, Kosha adds some enhancements to the protocol, as can be read in their paper [5].

Properties
Kosha is built upon the core of NFS with a couple of additions, so this protocol has the same properties, with a couple of changes / additions. The paper about Kosha by Ali Raza Butt et al. [5] states the following additions:

• Aggregation of unused disk space on many machines into a single, shared file system;
• Location transparency;
• Mobility transparency;
• Load balancing;
• High availability.

However, the paper originates from 2006, and in the meantime NFS has been updated to version 4 of the protocol, which causes inconsistencies with the Kosha paper; it would need to be updated to build on top of NFS version 4. One example of an inconsistency is that the Kosha paper states that NFS doesn't have file replication support, which it does have in version 4. The same holds, for example, for location transparency. Therefore, we will not go deeper into this protocol, as it contains outdated information.

Hoard.
Hoard, which is also a system designed to enhance NFS, uses the same approach as Kosha: nodes that contribute disk space are assumed to already run NFS servers so that they can be accessed via NFS. In addition to that, a Hoard instance has to be run [4]. Kosha and Hoard were both developed at Purdue University.

Properties
The same applies to Hoard as to Kosha: Hoard contains outdated information, as it isn't built upon NFS version 4 but on an older version (which the authors don't mention). Therefore, we will not cover this paper in detail either, as it also contains outdated information.

BitTorrent.
BitTorrent is one of the most popular protocols for P2P file sharing62. BitTorrent differs from the technologies described above, as it is not a file sharing protocol optimized to run over the local area network. Instead, BitTorrent is a file sharing/distribution protocol that runs over the internet using a P2P network [20, 30].
Users can download files by downloading a torrent file. The torrent file itself doesn't contain the data that the user wants to download, but metadata and sometimes also a list of trackers. Trackers are centralized servers that keep track of many torrent sessions and, for each session, of the peers participating in that particular torrent session. A peer can contact the tracker, which responds with a peer list the peer may connect to (thus the tracker helps peers find each other) [28]. Note that the distribution of content is still P2P (distributed), but the way the peers are identified and found is centralized. A torrent file also contains a hash value, which is used for integrity checking. The actual downloading of the content happens P2P.
One thing the BitTorrent protocol offers is that it reduces the impact on the server and network when distributing large files, as the file that a user wants to download isn't downloaded from a single server, but instead from other users that are part of the 'swarm' (see footnote BitTorrent). A swarm is the entire network of nodes that are connected to a single torrent. Using the BitTorrent protocol, which is in itself a file-downloading protocol, files are divided into chunks, and each of those chunks is downloaded from other peers instead of downloading the entire file from one single central server [18]. Trackers contribute to this process by providing a list of peers which can be connected to in order to download a chunk of the entire file. When using the BitTorrent protocol, one automatically joins the 'swarm' (the group of peers that are connected to the same torrent file), allowing others to download chunks from the newly joined peer. BitTorrent differs in this way from the protocols described above, as it is designed for large data distribution. The protocol divides peers into two types [23]:

• Leeches
─ Peers with no or only some of the data that needs to be downloaded by a particular peer in the swarm;
─ They perform both download & upload actions: they download chunks that they don't yet have, and upload chunks that they already have.
• Seeds
─ Peers that already downloaded all the data, but remain in the swarm in order to let others download from this peer;
─ They only perform upload actions.

The moment a user starts downloading chunks of a file using the BitTorrent protocol, he/she automatically becomes part of the swarm. He/she will appear in the peer list as a node/peer that can be downloaded from. The user can provide other nodes/peers that want to download the same file with the chunks he/she already downloaded, thus becoming a leech [20]. After completely downloading the file (i.e. having downloaded all of the chunks), the user becomes a seed until he/she closes the protocol (after which chunks of a particular file can no longer be downloaded from this peer).

62 From https://en.wikipedia.org/wiki/BitTorrent
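As a toy model of these two roles and of chunk selection (names and structure are invented for the example; this is not the BitTorrent wire protocol):

    from dataclasses import dataclass, field

    @dataclass
    class SwarmPeer:
        """Toy model of a peer in a swarm; illustration only."""
        name: str
        total_chunks: int
        have: set = field(default_factory=set)

        @property
        def role(self) -> str:
            # A peer holding every chunk only uploads (seed); otherwise it
            # downloads missing chunks and uploads the ones it has (leech).
            return "seed" if len(self.have) == self.total_chunks else "leech"

        def wanted_from(self, other: "SwarmPeer") -> set:
            # Chunks the other peer can serve that we do not have yet.
            return other.have - self.have

    alice = SwarmPeer("alice", total_chunks=4, have={0, 1, 2, 3})  # seed
    bob = SwarmPeer("bob", total_chunks=4, have={0})               # leech

    print(bob.role, "wants", bob.wanted_from(alice))  # leech wants {1, 2, 3}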

Properties

Table 3. BitTorrent properties

• P2P or Server-Client? P2P. As mentioned earlier in this subsection, trackers are centralized. These trackers, however, only provide users with a list of possible peers to download chunks of a file from; they do not provide the actual content that the user wants to download.
• Encryption
─ End-to-end: No
─ In transit: The BitTorrent protocol communicates over standard TCP, without any form of encryption [28]. However, there are some techniques (e.g. BitTorrent Protocol Encryption, PE) that some BitTorrent clients implemented in order to disguise BitTorrent traffic (as some ISPs used to actively throttle BitTorrent transfers63) by encrypting the communication, but the protocol itself doesn't offer this.
─ Stored data: No
• Authentication mechanism: The BitTorrent protocol facilitates the exchange of files between untrusted parties [28], thus not authenticating clients/peers [20].
• Functionalities
─ File synchronization: One-way and two-way file synchronization are not possible using the basic protocol. The BitTorrent protocol is designed with the intention to be a "one-to-many data dissemination utility" [20], meaning that, using the BitTorrent protocol, uploaders usually do not care about the identities of downloaders. However, the protocol can be modified into a (one- and two-way) file syncing protocol, as Resilio Sync (formerly BitTorrent Sync) did [20]. More about BitTorrent Sync in the next technology section.
─ Back-ups: No
─ File sharing: Yes
─ Fault tolerance: BitTorrent uses Distributed Hash Tables (DHT), a decentralized distribution system. Using DHT, scaling is very efficient, even with a large number of nodes64. DHT makes the protocol fault tolerant. However, reliability also depends on the availability of the trackers [18], e.g. because one of the search techniques of trackers is pre-compiled lists [30]. Another issue is dead torrents, which is out of the control of the protocol: if there are no seeds, the chunks of a file cannot be downloaded by a downloader/leech.
• Performance: Hard to say. The download is split up into chunks so that the network load is low and the bandwidth is used efficiently between peers [3], and, as stated above, the DHT system scales even with a large number of nodes. However, improvements to the performance of BitTorrent are possible [3]. Performance is also dependent on the number of peers in a swarm and on how efficiently chunks are chosen from the different peers to download from65.
• Privacy: There is no authentication of clients, but one's IP address is still visible, e.g. in the peer list that the BitTorrent protocol uses to let clients reach other peers [28].

63 From https://torrentfreak.com/encrypting-bittorrent-to-take-out-traffic-shapers/
64 From http://web.cs.ucla.edu/classes/cs217/05BitTorrent.pdf
65 From http://web.cs.ucla.edu/classes/cs217/05BitTorrent.pdf

BitTorrent Sync.
As mentioned in the BitTorrent section, BitTorrent Sync is a modified version of the BitTorrent protocol which allows both one- and two-way file synchronization. BitTorrent is a file sharing protocol: it distributes files one-to-many, and synchronization is not supported. BitTorrent Sync is a file synchronization & sharing protocol. Instead of not caring about the downloaders like BitTorrent, this protocol focuses on privacy and confidentiality by encrypting the communication, allowing encrypted file storage, "cloudless backup" aka one-way file synchronization, and more.
The BitTorrent protocol facilitates a one-to-many structure, where many downloaders can download a file from an uploader without encrypted communication (by default) and without authentication. The BitTorrent Sync protocol does authenticate users and does encrypt the connection. Besides authentication and traffic encryption, BitTorrent Sync also allows remote data storage, even with the possibility to encrypt those files [20], making it possible to store data on untrusted remote locations for e.g. back-up purposes (using an encryption secret generated by BitTorrent Sync).
After the launch of BitTorrent Sync, BitTorrent Inc (the mother company) saw their active user base grow rapidly. The popularity mainly comes from the fact that this file synchronization protocol is decentralized, which means that unauthorized people would require law enforcement to seek a warrant to get to one's personal private data, instead of asking a cloud provider to provide the data [21].
This protocol does use authentication mechanisms and does encrypt the communication (using AES-128 66), as confidentiality is most important in this case [20] (in contrast to BitTorrent), as mentioned above. To facilitate folder sharing & synchronization, BitTorrent Sync relies on secrets (keys) [21]. These secrets are unique IDs that identify a shared folder within the entire network. A secret itself is a 33-byte value, but it is encoded to make it more readable. The protocol makes use of two different secrets (available to the user) to manage user access to files and folders [21, 20]:

─ RW (Read & Write) key / secret;
─ RO (Read only) key / secret.

There is also a third key [21, 20], the "encrypted read-only key". This key allows (encrypted) remote P2P replication, in which sensitive data gets stored encrypted, while the remote machine does not have the ability to decrypt the information stored. This key makes it possible to replicate data from an encrypted share while the receiving/storage server is unable to decrypt it.
BitTorrent Sync gives its users the option to synchronize their content locally (over LAN) or over the internet to remote machines [20]. To discover peers, trackers can be used (optionally), as a choice for the clients. Besides the possibility to use trackers, Distributed Hash Tables are used to spread peer information. So, in case a client refuses to use a tracker, the DHT can also help discover other peers [20].
BitTorrent Sync doesn't necessarily have to be used as a file synchronization protocol over the public internet. Each BitTorrent Sync folder (a folder that a user wants to share/synchronize) has its own set of keys (RW & RO). A user doesn't necessarily have to send a key over the internet, keeping a file synchronization environment on the LAN. The user also has the possibility to share a link (on e.g. the public internet, or with another device in the LAN) and give specific permissions to this link. Permissions that can be set for such a link are:

─ Read only or Read & Write;
─ Whether new peers that are invited have to be approved by this same user;
─ Whether and when the link expires;
─ Whether to set a limit on the number of usages.

This link is indirectly linked to the key (the read-only option is linked to the read-only key, the read & write option to the read & write key) to keep access to folders secure and managed. Note that when using links, one has to trust the Resilio Sync server, as the server would be able to learn a lot about one's files67. One example is that the Resilio Sync server can learn encoded folder IDs, approximate sizes, temporary keys etc.
Another way, besides sharing the key or creating a link, is via QR codes, which also offer the option to create a read-only or a read & write permission based QR code. Similarly to links, these QR codes are indirectly linked to the keys.
These methods make it possible to simply share a file by creating a link that never expires and placing it on the public internet. Setting it to read-only creates the possibility to share a file without others being able to edit and synchronize back. In this way, something similar to BitTorrent can be achieved (file sharing), while still also providing a file synchronization technology.
BitTorrent Sync is not open source, and is implemented in the software program "Resilio Sync", owned by Resilio Inc. BitTorrent Sync was formerly part of BitTorrent Inc, but BitTorrent Inc decided to put BitTorrent Sync in a separate company called Resilio Inc, with a different market focus68. They then named the complete package (protocol & software) "Resilio Sync".

66 From https://wiki.archlinux.org/index.php/Resilio_Sync
67 Source: https://help.resilio.com/hc/en-us/articles/204754739-Link-structure-and-flow
68 Source: https://variety.com/2016/digital/news/bittorrent-sync-spinoff-resilio-1201787793/
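As a toy illustration of the shape of these secrets (the actual derivation and encoding used by BitTorrent Sync / Resilio Sync are not public, so the random generation and base32 encoding below are assumptions that merely mimic a 33-byte secret rendered readable):

    import base64
    import os

    # Hypothetical sketch: a 33-byte folder secret, encoded into a more
    # readable string. The real product's derivation and alphabet may differ.
    raw_secret = os.urandom(33)
    readable = base64.b32encode(raw_secret).decode().rstrip("=")
    print(len(raw_secret), readable)   # 33, followed by a 53-character string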

Properties

Table 4. BitTorrent Sync properties

• P2P or Server-Client? P2P [20]; however, like BitTorrent, clients can choose to use a (centralized) tracker to improve the rate of finding other peers.
• Encryption
─ End-to-end: Yes; the data is encrypted before actually being sent, and can be decrypted using the correct key (RO or RW).
─ In transit: Yes [20]69
─ Stored data: Yes; file encryption for storing data on remote locations is supported by using the above-mentioned encrypted read-only key.
• Authentication mechanism: There is mutual authentication and validation of file modification requests using X.509 certificates. Besides that, users themselves can choose the level of authentication when distributing keys/links/QR-codes. An always-valid link configured for unlimited use won't require authentication, as the purpose would be that everyone may have access to the shared resource, while one-time links/QR-codes for read & write access do need authentication. As the owner of the resource decides how to share the link/QR-code with another person, the way of authentication depends on this process. When the link is e.g. configured to be usable only once, after which it becomes invalid, the user can call/message the other person to check whether he/she successfully gained access to the shared resource or not.
• Functionalities
─ File synchronization [20]
o One-way: Yes
o Two-way: Yes
─ Back-ups: Yes, by read-only keys.
─ File sharing: Yes
─ Fault tolerance: Nodes have to be online in order to synchronize and share files. In the case of file sharing where a link is created and placed on the public internet (a never-expiring link), this link only successfully gives access to the file if the host node is available/online. However, a host/owner can replicate a resource to another peer, so that this peer can also distribute links for that resource. This assumes that the content is not encrypted, and that the target person has the ability to distribute it (by configuring the way of approving new peers, as mentioned in the link permissions above).
• Performance: Highly depends on the configuration. If peers are in the same LAN, the protocol performs better than when going over the public internet.
• Privacy: There is no way of downloading content anonymously: every time someone shares a link for a resource, the person that wants to download it has to install Resilio Sync (the software that implements this protocol) and become an active peer, meaning that the host user can see the newly joined peer in the peer list. One can remove oneself immediately after downloading the file, but the peer data has then already been leaked to the host. Using trackers would also not increase privacy, as the tracker then knows one's IP address, and thus maybe BitTorrent Inc. does too. Any adversary can also see the (encrypted) traffic between two nodes. One possible solution would be to use an anonymized network or a VPN, but the protocol is simply not designed for providing anonymity (yet).

69 From https://help.resilio.com/hc/en-us/articles/205451025-Can-others-see-my-files-How-secure-is-sharing-by-Resilio-Sync-

AFS.

AFS stands for Andrew File System and is a distributed (network) file system70. It uses a set of trusted servers to serve a location-transparent file name space to the clients. In AFS, a couple of design choices were made, as can be read in the paper [13]:

• Compatibility with UNIX at system call level;
• The use of whole files as the basic unit of data movement & storage, instead of smaller units (e.g. logical records). So, as stated in the paper, a client-peer must first copy the entire file to its own local disk before it can be used. However, the authors also claim to have found ways to break large files up into smaller parts.

These design choices were made with the intention to create a campus-wide file system which students can use. This is exactly where AFS differs from NFS71: AFS is optimized for a non-static, potentially high number of users / clients (e.g. to run on a university campus), while NFS users are mostly smaller groups of (known) users. AFS is, like NFS, a network file system; it is not optimized to run over the public internet.

Properties

Table 5. AFS properties

• P2P or Server-Client? Client-server [13]. There are still nodes within the network that communicate, but all of the communication goes through a pre-defined set of trusted servers. This makes it a client-server model where storage is centralized (over a static set of servers) instead of decentralized (over multiple, not-predefined server-nodes).
• Encryption
─ End-to-end: No
─ In transit: No
─ Stored data: No. However, each directory has an access list attached to it which decides on access rights & permissions. Rights that can be specified in such a list are: read, lookup, write, delete, insert, lock and admin access [13]. This still doesn't provide file encryption at storage level, but it is a feature worth mentioning, as it makes it very hard for unauthorized people to access files they are not intended to access.
• Authentication mechanism: Users are automatically authenticated upon login on their workstations. The authentication is based on the user's workstation password. After a successful authentication, tokens are generated, which are used by the Andrew manager to establish connections to file servers and can be passed to other workstations [13].
• Functionalities
─ File synchronization
o One-way: Read-only clones can be replicated on multiple servers [16, 13]. So, a volume can be cloned to a read-only version which can be distributed over multiple server nodes, such that, in case of server failure or high traffic on one server, another server can provide clients with (read-only) data.
o Two-way: No
─ Back-ups: Creating back-ups can be achieved by cloning or backing up a logical volume. There is, for example, a command "Move-vol" to move a volume to another server [16, 13] in a read-only state which isn't continuously synchronized.
─ File sharing: Yes
─ Fault tolerance: As specified under one-way file synchronization, volumes can be replicated to other places in a read-only state. This can be helpful in case of failure of the main central server, or when the main server gets overloaded with traffic. It does mean, however, that these file replicas cannot be edited and thus not updated.
• Performance: The AFS protocol works by the workstations downloading the entire file to the local disk before using it. When the workstation is done, the file is uploaded in its entirety to the server. These operations can be slow. When working with the files, performance depends on the performance of the workstation [13].
• Privacy: Workstations (clients) have to be authenticated with their identity, such that the file servers can use the access lists to verify which access a user has [13]. Also, traffic is unencrypted, which makes it possible to eavesdrop on the network. However, the purpose of AFS is to run on a campus; sending privacy sensitive data to these servers wouldn't be the purpose of using AFS.

70 From https://en.wikipedia.org/wiki/Andrew_File_System
71 From https://www.cs.cmu.edu/~410-s05/lectures/L30_NFSAFS.pdf

Gnutella.
Gnutella is an open and decentralized group membership & search protocol, mainly used for file sharing [19]. It is one of the first decentralized P2P networks. Gnutella nodes are called "servents", a contraction of servers and clients. In order to join the network, a servent connects to a node that is known to be almost always available. Communication between the nodes happens through "messages", which can be broadcasted or back-propagated. To let other nodes know that a node has joined, the joining node broadcasts a "PING" message to its neighbors, with other nodes back-propagating a "PONG" message containing, among other things, the IP address of the node. File downloads & uploads are done directly between two nodes, using "GET" and "PUSH" messages [19]. While the relevant sources are very old, Gnutella is still used as a P2P, decentralized file sharing protocol.

Properties
In this thesis, we will not cover Gnutella in detail. The relevant sources are very old and contain data that is irrelevant for the purpose of this thesis.

Tahoe-LAFS.
Tahoe-LAFS (Tahoe Least-Authority File Store) is a decentralized, free and open source cloud storage system [29]72. It is a file sharing protocol. Fault tolerance is built in, such that even if nodes fail, the protocol still works correctly. The focus of Tahoe-LAFS is providing users with a file storing mechanism that is both secure and privacy friendly73. The system is P2P, and clients talk to one or more storage-nodes (server-nodes). The protocol splits a file that a client wants to store into pieces and distributes those pieces to multiple server-nodes. When a client wants to retrieve that particular file, the pieces are retrieved from the storage/server-nodes and assembled & decrypted (see footnote Tahoe-LAFS browser about).

72 From https://github.com/tahoe-lafs/tahoe-lafs
73 From https://tahoe-lafs.org/trac/tahoe-lafs/browser/trunk/docs/about.rst
74 Source: https://tahoe-lafs.org/~warner/tahoe-reliability/docs/configuration.txt

Tahoe-LAFS supports fault tolerance, including offline nodes74. When a client uploads a file, the file is split up into N pieces (the total number of pieces). The file can be retrieved if K pieces (the number of pieces needed to fully retrieve the file) are retrieved. These N pieces are distributed over H storage servers. This means that when H-K+1 servers are unavailable, the data becomes unavailable.
The approach of Tahoe-LAFS is to encrypt files that a client wants to store on the client's device, before actually sending them over the internet to the cloud/server-node [25]. Like BitTorrent Sync, Tahoe-LAFS provides ways to give specific rights to a file before uploading it. It distinguishes two types of files: immutable and mutable files. For mutable files, only authorized people (i.e. people with an appropriate cryptographic capability for the write right, meaning clients with access to the private key of an RSA keypair) are able to edit the file. To create an immutable file, the client has to choose a symmetric key to encrypt that file. The steps that follow, for both mutable and immutable files, are that the files are split up into N shares to distribute to different servers. To authenticate mutable files, Merkle hash trees are used [29]. The client signs the root hash of the Merkle trees using the signing key (which is the private key of the RSA keypair). Each time a file is altered (by someone that has access to this RSA private key, thus authorized), the client signs this new version with that private key before storing it again. This makes sure that mutable files can only be mutated by authorized people (i.e. people with access to that private key).
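The write-authorization scheme just described can be sketched as follows: a minimal example using RSA-PSS from the Python cryptography package, where the Merkle root is a stand-in value (Tahoe-LAFS's actual key sizes, padding scheme and share layout differ in detail):

    from cryptography.hazmat.primitives import hashes
    from cryptography.hazmat.primitives.asymmetric import rsa, padding

    # The write capability: holding the RSA private (signing) key.
    signing_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)

    merkle_root = b"\x12" * 32   # stand-in for the share tree's root hash

    pss = padding.PSS(mgf=padding.MGF1(hashes.SHA256()),
                      salt_length=padding.PSS.MAX_LENGTH)

    # Writing a new version: sign the new Merkle root with the private key.
    signature = signing_key.sign(merkle_root, pss, hashes.SHA256())

    # Verifiers only need the public key to check the new version; verify()
    # raises InvalidSignature if the root or the signature is wrong.
    signing_key.public_key().verify(signature, merkle_root, pss, hashes.SHA256())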

Properties

Table 6. Tahoe-LAFS properties

• P2P or Server-Client? Both P2P and client-server [29]75, as it is a distributed cloud storage system with distributed storage servers that clients connect to. It is P2P because everything is decentralized: peers come and go, and peer discovery happens through decentralized introducers.
• Encryption
─ End-to-end: Yes (using AES-128-CTR [29])
─ In transit: Depends on the configuration, but this is possible76. Instead of using a plain TCP connection, one can use an SSL connection by specifying the private & certificate key.
─ Stored data: Yes; however, data is already encrypted by the client before being split up into N shares and sent to a server-node. A server-node doesn't have the capability to decrypt such a share.
• Authentication mechanism: There is no actual authentication of clients themselves. There is, however, file authentication: only people with access to the right cryptographic capabilities (i.e. having the correct private key) are allowed to modify and sign a mutable file. There is also some kind of authentication of the server, such that the client can check whether e.g. the server is unique77.
• Functionalities
─ File synchronization
o One-way: Possible. A client can distribute mutable files encrypted to servers and keep doing this on every change.
o Two-way: In a release in January 2017 (version 1.12.0), Tahoe-LAFS introduced Magic Folders, which allows two-way file synchronization78. It allows, for example, making folders and their contents look the same on two different computers using Tahoe-LAFS79.
─ Back-ups: Yes. Tahoe-LAFS [29]80 supports back-ups by giving clients the possibility to run a back-up command, after which Tahoe-LAFS creates a back-up of the most recent files and also preserves back-up copies of older versions. This is done by distributing mutable files to servers, and keeping the private key a secret, meaning that no one except the owner can change the files.
─ File sharing: Yes
─ Fault tolerance: As mentioned above, Tahoe-LAFS supports fault tolerance, including offline nodes. This makes the system more reliable, as it tolerates a certain number of nodes being turned off or becoming faulty. This makes the files highly available (assuming server configurations are made as recommended by the docs).
• Performance: Depends on the server configuration. Setting a large N slows down uploads and increases storage overhead. Setting a large K slows down downloads.
• Privacy: Anonymity can be achieved by running the Tahoe-LAFS protocol over an anonymizing network81. In addition, Tahoe-LAFS has a flag (reveal-IP-address, Boolean) that can be set to state that privacy is required for a node. Also, this protocol authenticates files, not the users themselves. People holding the right private key can alter files, but the protocol does not know the owners of the private keys.

75 From https://tahoe-lafs.org/trac/tahoe-lafs/browser/trunk/docs/about.rst
76 Source: https://tahoe-lafs.org/trac/tahoe-lafs/wiki/FAQ
77 Source: https://tahoe-lafs.org/trac/tahoe-lafs/wiki/FAQ
78 From https://leastauthority.com/blog/magic-folders-now-allows-for-file-synchronization-in-tahoe-lafs/
79 Source: https://leastauthority.com/blog/magic-folders-now-allows-for-file-synchronization-in-tahoe-lafs/
80 From https://leastauthority.com/blog/magic-folders-now-allows-for-file-synchronization-in-tahoe-lafs/
81 Source: https://tahoe-lafs.readthedocs.io/en/latest/anonymity-configuration.html

IPFS.
IPFS stands for InterPlanetary File System [2] and is a P2P distributed file system. The paper about IPFS, Draft 3 (Juan Benet, 2014), mentions that this protocol can be seen as a single BitTorrent swarm. A BitTorrent swarm is the network of nodes that are connected to the same torrent (see the BitTorrent section). IPFS is designed to exchange objects within one Git repository, hence the comparison with BitTorrent swarms. These objects represent files and other data structures.
IPFS aims to combine the most successful properties of other successful P2P systems. The paper mentions the following properties that other P2P systems use and that are also included in IPFS:

─ DHT (Distributed Hash Tables, see the BitTorrent section);
─ Block exchanges, the way BitTorrent uses them: distributing small pieces of files to other untrusted peers (within a swarm);
─ Version control systems, the way Git uses them: providing different versions efficiently and monitoring file changes;
─ Self-certified filesystems, as SFS uses them [7]. SFS can be used to certify a server, after which it is possible to verify public keys, negotiate secrets and secure traffic.

In the paper, the author describes IPFS as an evolving technology that could eventually lead / evolve to the web itself. The focus is not only on distributing and versioning data, but also on writing and deploying applications. IPFS objects can be stored by so-called IPFS nodes in local storage. These objects can be transferred to other nodes that are connected to this node. To achieve all of this, IPFS's design consists of 7 different subprotocols, each with their own purpose:

─ Identities;
• Nodes are identified by a NodeID, which is a hash of the public key of that node.
─ Network;
• Manages connections to other peers, both over LAN and over the public internet.
○ Can use any transport protocol;
○ Even has support for integrity and authenticity checks (optional).
─ Routing;
• A lookup service for nodes. It maintains information to make it easier for nodes to localize other peers & objects. As mentioned above, DHT is used to achieve this.
• More specifically, IPFS uses a DSHT (distributed sloppy hash table [10]) based on S/Kademlia and Coral for its routing. Kademlia is a secure (DHT) key-based routing protocol that provides [2]:
○ Efficient lookups;
○ Low coordination overhead;
○ Resistance to various attacks (e.g. DoS attacks) by preferring long-lived nodes.

• The Coral protocol adds a property to this S/Kademlia based DHT by not storing data directly in the DHT, preventing waste of storage and bandwidth. Coral calls this a DSHT.
• S/Kademlia extends Kademlia by protecting it against malicious attacks [2]:
○ It provides schemes to secure NodeID generation, which prevents Sybil attacks82;
○ It allows nodes to look up values over disjoint paths, to ensure that honest users can find and connect to each other even in the presence of a lot of adversaries in the network.
─ Exchange;
• A block exchange protocol using BitSwap. BitSwap is a protocol inspired by BitTorrent. Using this protocol, peers hold two variables, namely the want_list and the have_list. The want_list is the list of blocks that the node wants to acquire, while the have_list is the list of blocks that the node can offer in exchange. One of the differences with BitTorrent is that, using this protocol, nodes can acquire blocks from literally all nodes that offer such a block, irrespective of whether it results in the same file or not. In BitTorrent, block exchanges only happen within a swarm (all nodes in the network of a specific torrent). Another difference is that this protocol is based on swapping blocks: if a node cannot provide anything valuable to other peers, it has to find blocks that other peers want to acquire, in order to be able to swap those pieces for useful pieces on its own want_list.
─ Objects;
• Uses a Merkle DAG, which is, broadly taken, a graph in which the vertices are objects and the edges between these vertices are cryptographic hashes. Properties that this provides include uniquely identifiable content, tamper resistance and deduplication.
─ Files;
• A versioning system inspired by Git;
• On top of this Merkle DAG, IPFS uses a versioned filesystem similar to Git's, but with some small modifications to achieve deduplication, fast size lookups and embedding commits into trees. Commit & tree are both terms used in Git: a commit is a snapshot in the version history of a tree, and a tree is a collection of blocks, lists or other trees.
─ Naming;
• A self-certifying mutable name system;
• Objects are immutable, references are mutable.

Now that the components of this protocol are clear, a lot can be achieved using IPFS. A couple of examples are encrypted file or data sharing systems, personal sync folders (including versioning) and a new web. A last example of a use case for IPFS is

combining it with blockchains. Because the objects are immutable and permanent, they can be placed into a blockchain transaction83.
While the connection is open, nodes advertise what they want to download (their want_list) to all of the connected peers within that Git repository (/swarm). Other nodes save that list and check whether they have pieces of information from that want_list. If so, they start the BitSwap protocol to exchange these pieces of information.
When a user is on the IPFS network, he/she can add nodes that they trust or that they want to communicate with. When a user places a file on IPFS, it becomes public to everyone who knows the NodeID (it is accessible over HTTP, /ipfs/{NodeID}). So, if a user was in an IPFS network of 100 nodes, leaves, and joins a new network, the 99 nodes of the previous network probably know the NodeID of this node and can still access the files of this node (according to the video demo on their website84), even if this node is not connected to the same Git repository. If a user doesn't want this, he/she can change the keypair for every new session, which would also change the NodeID.
Files that are on the IPFS network consist of blocks, and each file and each block has a cryptographic hash. IPFS makes sure that there are no duplicates in the network of that user/peer. Each node within that network only stores information it is interested in, except when it isn't able to provide useful information to other nodes; in that case, as mentioned earlier in this paper, the node stores some more information which is useful for others (to trade/BitSwap). For a node to find content, it asks the network to find the unique hash of the blocks. This seems unreadable for humans, but thanks to the Naming subprotocol, as mentioned earlier, files can have human-readable names.
The protocol is still in an early stage. The paper provides a really good theoretical background of the protocol, but the implementation is not yet finished. On the internet, it seems that there are still some differences with the theoretical paper, e.g. encryption of data appears to not yet be supported by default in the implementation85,86,87,88.

82 An attack on P2P networks in which one node tries to operate multiple identities at the same time, trying to gain the majority of influence in the network (e.g. the 51% attack on blockchain networks).
83 Source: https://ipfs.io/#implementations
84 Source: https://ipfs.io/
85 Source: https://github.com/ipfs/faq/issues/181
86 Source: https://github.com/ipfs/faq/issues/116
87 Source: https://discuss.ipfs.io/t/does-ipfs-add-file-makes-the-file-public/3859
88 Source: https://github.com/ipfs/specs/tree/master/keystore
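A hypothetical sketch of the NodeID relation described above (real IPFS NodeIDs are multihashes encoded in base58; the plain SHA-256 hex digest here is a stand-in, and the key bytes are invented):

    import hashlib

    # A node is identified by the hash of its public key.
    public_key = b"...the node's serialized public key bytes..."
    node_id = hashlib.sha256(public_key).hexdigest()

    # Peer check: hash the public key a peer presents and compare it with
    # the NodeID it claims.
    def key_matches(claimed_node_id: str, presented_key: bytes) -> bool:
        return hashlib.sha256(presented_key).hexdigest() == claimed_node_id

    assert key_matches(node_id, public_key)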

Properties

Table 7. IPFS properties

• P2P or Server-Client? P2P [2]
• Encryption
─ End-to-end: Sort of; IPFS has object-level cryptography. Objects are automatically decrypted if a user has the right key in his/her keychain [2]. However, it is unclear from the paper whether this can all be achieved using the protocol. Several (non-academic) sources about practical cases mention that it is not built in and that the user has to encrypt files manually (before sending them to IPFS) using the public key of the target node, as mentioned earlier in this section.
─ In transit: Sort of; IPFS has built-in object-level cryptography [2]. Links of encrypted objects are protected, so traversal becomes impossible if the requester doesn't have the right decryption key. Also, IPFS uses self-certified filesystems, which make it possible to negotiate a secret to secure traffic with a server. However, this is information from the paper, and on the public internet there isn't a lot of information about securing traffic at the moment of writing.
─ Stored data: No; if a user wants to achieve this, he/she has to encrypt the file with the public key of the target node. This is not built into IPFS.
• Authentication mechanism: IPFS is a self-certifying file system, as mentioned earlier in this section. Data that is served to the client is authenticated by file names (signed by the server). So there is no traditional username/password user authentication, and everyone can join the IPFS network without creating a username or password. By connecting to the IPFS network, peers exchange public keys, and peers check the keys they receive [2]. This check can be done by hashing the public key and comparing it to the NodeID (which is a hash of the public key). So, in IPFS, nodes are identified by their NodeID, and a node can add other nodes to its network by adding their ID. As the purpose is, like BitTorrent, to distribute pieces of files (objects), no further authentication steps are taken apart from signing. One can still encrypt a file using the public key of a target node to let only authenticated nodes have access to the plaintext.
• Functionalities
─ File synchronization: Not possible, neither one-way nor two-way. The paper states in section 3.8 that an example use case for IPFS is to create a mounted personal sync folder that automatically versions, publishes and backs up writes [2], but it is not specified elsewhere how this can be achieved, or whether the protocol itself can achieve it. Other (non-academic) sources on the internet state that it isn't possible in the practical implementation89,90. Also, a GitHub post from the author of IPFS states that file replication, for example, is only possible by adding a layer on top of IPFS91.
─ Back-ups: No
─ File sharing: Yes
─ Fault tolerance: IPFS uses DSHT routing (distributed sloppy hash tables) [10]. DSHT routing is fault tolerant, as described earlier in this section under the subprotocol "routing".
• Performance: Performance is good. If, within an IPFS network, a peer cannot help any other peers, it will look for files that other peers need in order to offer them help. So, starvation is less likely to happen than, for example, in the BitTorrent protocol. The routing protocol also increases performance, as it allows, among other things, clustering of regions (to let nodes find each other faster).
• Privacy: The only thing a user needs in order to join an IPFS network is a public-private keypair. Other nodes will know the peer identity, which is a hash of the public key, and can use it to look up the corresponding public key. However, IPFS is not an anonymous network. Others can see the internal and external IP addresses used by a node. The internal IP addresses represent the mount points on one's system, while the external IP address is what users have to connect to in order to be able to download files. There are, however, plans to build support for running IPFS over Tor92. This would prevent distributing one's (real) public IP address when sharing files / making files available over IPFS.

89 Source: https://discuss.ipfs.io/t/how-to-sync-a-folder-between-two-nodes/1997
90 Source: https://discuss.ipfs.io/t/is-there-any-doc-on-ipfs-file-replication-how-it-avoid-single-point-failture/1396
91 Source: https://github.com/ipfs/faq/issues/47
92 Source: https://github.com/ipfs/notes/issues/37

7.2 Appendix 2: Analysis Tahoe-LAFS, IPFS & BitTorrent Sync

In section 3, we concluded that there are three technologies that could be helpful for designing our end solution:

─ BitTorrent Sync;
─ Tahoe-LAFS;
─ IPFS.

These three technologies are all P2P and combined they have everything we want in our end solution:

1. File sharing
2. File synchronization
3. File back-up & file replication
4. Privacy & anonymity
a. Anonymous file downloads
5. Security
a. Encryption
b. Authentication

Still, there is no technology (yet) that supports all these properties at once. In order to get a good understanding of what already exists, we dive deeper into IPFS and Tahoe-LAFS, comparing both of them against our functionality list and describing how each of the technologies achieves it. Note that for BitTorrent Sync, only the results are presented in this thesis. A detailed description and argumentation of BitTorrent Sync can be found in the paper "Receiver anonymity within a distributed file sharing protocol" (Solo Schekermans, 2019). The BitTorrent Sync part in the comparison section is based on both that paper and section 7.1.
In this section, we first describe IPFS and Tahoe-LAFS separately and in more detail. After that, we compare the protocols, including BitTorrent Sync using the results of the paper mentioned above, against each other in order to see how these protocols differ from each other. In the next two sections (IPFS & Tahoe-LAFS), we look at how the protocol works, what functionality it offers and what the shortcomings are. The last section, the comparison section, is constructed based on the findings of the sections below.

IPFS.
For this section, we use the information from the paper as the main source [2]. We leave practical implementations aside, as they are rapidly changing and focused on implementing the functionality described in the theory. Also, the demo on their website93 does not go deeply into BitSwap, or into the fact that the purpose is to distribute files, BitTorrent-style, to others who are advertising for these particular files / blocks. Instead, their demo focusses more on making local files available to the web, leaving the explanation of file distribution to other nodes aside.

93 Website containing the demo: https://ipfs.io/

File sharing
The IPFS protocol is a file sharing protocol. It allows users / nodes to maintain two lists:

1. Want_list
2. Have_list

The want_list consists of blocks that a node wants, and the have_list consists of the blocks a node has (and which other nodes can download). Requesting nodes only download blocks that are on their own want_list, with one exception: if a node has nothing to offer, i.e. no node in the network has this node's blocks on its want_list, this node might download other blocks that are relevant for other nodes in the network (hence the swap in BitSwap). Using these lists, files and blocks can be shared with others in the IPFS network.
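As a toy illustration of how the two lists drive an exchange (block names and helper functions are invented; this is not the actual BitSwap wire protocol):

    # Blocks this node wants, and blocks it can offer in exchange.
    want_list = {"QmBlockA", "QmBlockC"}
    have_list = {"QmBlockB", "QmBlockD"}

    def blocks_to_send(peer_want_list: set) -> set:
        # Blocks we hold that the peer is advertising for.
        return have_list & peer_want_list

    def blocks_to_request(peer_have_list: set) -> set:
        # Blocks the peer holds that we still want.
        return want_list & peer_have_list

    peer_want = {"QmBlockD"}
    peer_have = {"QmBlockA"}
    print(blocks_to_send(peer_want))     # {'QmBlockD'}
    print(blocks_to_request(peer_have))  # {'QmBlockA'}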

File synchronization
The paper doesn't explicitly state whether file synchronization is built in or not. As far as can be read in the paper, it is not. The paper does mention use cases for file synchronization, but syncing a folder between two nodes is not yet possible, and the paper does not specify how this would look. There is already a file versioning protocol present in IPFS, inspired by Git, which makes it possible to access the history of a file system (with file versions) and to compare versions of the filesystem by comparing objects of different commits. However, this still doesn't provide file synchronization94,95.

94 Source: https://github.com/ipfs/faq/issues/47
95 Source: https://forum.syncthing.net/t/is-it-possible-to-make-syncthing-implement-based-on-ipfs/12841/10

Achieving file synchronization
As described in the section above, one part that is missing in IPFS is file synchronization. This subsection looks at a possible solution to add file synchronization functionality on top of the IPFS protocol.
In fact, everything that is needed for file synchronization is already there. There is file versioning using a Git-inspired versioning system, which makes sure that different versions of files can be compared to each other, meaning that files can be monitored. Note that files cannot be altered in IPFS: as mentioned in section 7.1, objects are permanent and immutable, and new content results in a new hash. To make file versioning possible, Merkle DAG is used. This is a protocol that makes it possible to create links between objects. In other words, Merkle DAG allows, among lots of other things, links between objects in the form of cryptographic hashes, with the result that the protocol knows that, for example, file X+1 is a newer version of file X. The result of this is that when a file gets edited, it becomes a new file / commit with its own hash, linked to the previous version of the file (see the sketch at the end of this subsection).
In addition to file versioning, it is also possible to create a private swarm of local computers that the user wants to use for syncing. When one connects to the IPFS network, the user connects to several bootstrap nodes to set up the network, which help find other nodes. If the user removes these bootstrap nodes and adds his/her own nodes, one can create a private swarm (this sounds like a workaround, but works96,97).
These two functionalities already allow a versioned file system in a private environment; meaning that there is a private network, assumed to be well protected against unauthorized access, where an IPFS instance runs between several local devices / nodes. The only thing missing now is the actual synchronization. Adding synchronization to this allows file synchronization with a (Git-like) file versioning system that runs in a private environment.
There are already several open source projects that created a layer on top of IPFS to support file synchronization using the methods described above, e.g.:

─ IPFS-cluster98,99,100
o Allows replication and pinning of content throughout the IPFS network. It has to be installed on each node in order to allow making a cluster.
o IPFS-cluster then controls the IPFS nodes, meaning that changes on one node are mirrored to the other node automatically (one-way file synchronization), and optionally vice versa (note: two-way file synchronization support).
─ BRIG101,102
o Allows file synchronization, with additional support for encryption of data during both transport and storage.

Adding these technologies on top of IPFS achieves file synchronization.
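A minimal, hypothetical sketch of such hash-linked version objects (real IPFS objects use multihashes, protobuf encodings and IPLD links rather than the JSON stand-in below):

    import hashlib
    import json
    from typing import Optional

    # Hypothetical content-addressed store: every object is addressed by the
    # hash of its content, and a new version links to the previous version's
    # hash, forming a Merkle-DAG-style history chain.
    store = {}

    def put_version(content: str, prev_hash: Optional[str]) -> str:
        obj = {"content": content, "prev": prev_hash}
        digest = hashlib.sha256(json.dumps(obj, sort_keys=True).encode()).hexdigest()
        store[digest] = obj          # editing never mutates; it adds a new object
        return digest

    v1 = put_version("hello", prev_hash=None)
    v2 = put_version("hello world", prev_hash=v1)   # an edit yields a new hash

    # Walking the links recovers the file's history.
    assert store[v2]["prev"] == v1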

File back-up & file replication
As mentioned in the section above, one-way file synchronization for file back-up and replication is not yet supported and requires additional technologies to be added on top of the IPFS layer.

Privacy & anonymity
IPFS is not an anonymity protocol. When a node is connected to an IPFS network, all other nodes will be able to see the internal and external IP addresses of this node. In fact, knowing the NodeID is enough to find out the internal and external IP addresses, meaning that when a node connects to a network, distributes its NodeID and then leaves the network to join another network, nodes that know this NodeID will still be able to find out the internal and external IP addresses of this node (by doing a lookup with the NodeID as parameter).

96 Source: https://discuss.ipfs.io/t/how-to-sync-a-folder-between-two-nodes/1997/2
97 Source: https://medium.com/@s_van_laar/deploy-a-private-ipfs-network-on-ubuntu-in-5-steps-5aad95f7261b
98 Source: https://medium.com/@rossbulat/using-ipfs-cluster-service-for-global-ipfs-data-persistence-69a260a0711c
99 Source: https://github.com/ipfs/ipfs-cluster/issues/281
100 Source: https://github.com/ipfs/faq/issues/47
101 Source: https://github.com/sahib/brig
102 Source: https://brig.readthedocs.io/en/master/

Achieving privacy & anonymity
While IPFS isn't an anonymity protocol, there are solutions available or in progress that claim to add an anonymity layer on top of IPFS. The problem that needs to be solved is that the IP addresses, both internal and external, are visible to others.
One way to achieve this is by using OnionCat103. The way to achieve anonymity using OnionCat can be to anonymously lease a virtual private server and use it as a proxy. Linking an isolated IPFS node to the proxy using OnionCat would preserve anonymity, because a lookup on the NodeID would result in the IP address of the VPS104.
Besides using OnionCat, there are other solutions. The IPFS community appears to have a Tor based IPFS on their backlog105, so a non-workaround version can be expected in the future. Besides these solutions, there are also existing proofs of concept that implement Tor based IPFS, which would preserve anonymity106.
Using the solutions above, nodes would be able to download files from other nodes anonymously. In other words, anonymous file downloads would be possible by implementing solutions like those listed above.

Security
IPFS has object-level cryptography. Objects are automatically decrypted if a user has the right key in his/her keychain. Links of encrypted objects are protected as well: traversal becomes impossible if the requester does not have the right decryption key. Besides encrypted objects, IPFS automatically verifies signatures. IPFS also uses self-certified filesystems, which make it possible to negotiate a secret with a server in order to secure traffic. However, it appears that these functionalities are not yet present in the practical implementation.

Tahoe-LAFS.
For this section, we use information from two different sources: the paper [29] and sources from the Tahoe-LAFS website. The sources on the website provide useful practical information that helps answer points left unclear by the paper.

File sharing
Tahoe-LAFS is a file sharing protocol. It uses capability-based access control (more about this in the security section). This approach makes a dynamic and fine-grained system for sharing files and directories possible.

103 A Tor-based decentralized VPN
104 Source: IPFS OnionCat
105 Source: https://github.com/ipfs/faq/issues/12
106 Source: https://github.com/cretz/tor-dht-poc

The filesystem of Tahoe-LAFS consists of files and directories. Tahoe distinguishes two types of files, namely mutable and immutable files. Immutable files are read-only files that cannot change. An immutable file has two capabilities, namely the read capability (identifies the file and gives the ability to read its contents) and the verify capability (identifies the file and provides the ability to check its integrity).
For mutable files, three capabilities exist, which is comparable to BitTorrent Sync. Like BitTorrent Sync, there are read and read-write permissions for a mutable file. In BitTorrent Sync, this is distinguished in terms of different keys; in Tahoe-LAFS, the permissions take the form of capabilities which are checked upon accessing a file. Mutable files have a read capability, a read-write capability and a verify capability.
Users are able to share capabilities to files, i.e. share a file. Permissions can be specified when sharing such a file. For example, a user with a read-write capability to a file can choose to give another user the read-write capability, or just the read capability. Users with read capabilities can also share this capability with others. Note that users with only a read capability to a file cannot give other users read-write capabilities. The paper calls this "diminishing a capability": read-only users can diminish their read-only capability to a verify capability to share with others, meaning that others are given the capability to check the integrity of the file (without reading the plaintext!). This one-way diminishing is sketched below.
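The one-way nature of diminishing can be illustrated with a toy hash chain; this mirrors the idea only, not Tahoe-LAFS's exact (tagged and truncated) derivations:

```python
# Toy illustration of "diminishing" a capability: every weaker capability
# is a one-way hash of the stronger one, so holders can derive downwards
# (read-write -> read -> verify) but never upwards.
from hashlib import sha256

def diminish(capability: bytes) -> bytes:
    return sha256(capability).digest()

read_write_cap = b"example read-write capability"  # hypothetical value
read_cap = diminish(read_write_cap)    # computable by read-write holders
verify_cap = diminish(read_cap)        # computable by read holders

# A holder of verify_cap cannot recover read_cap: that would require
# finding a SHA-256 preimage. Integrity checking is thus granted
# without granting the ability to read.
```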

File synchronization
As mentioned in section 3.9, Tahoe-LAFS released "Magic Folders" in 2017, a two-way syncing utility. It can also be configured for one-way file synchronization (between one client and the servers on which the shares are stored), but one-way synchronization is also possible without the "Magic Folders" utility, by distributing (mutable & encrypted) file shares to different servers while keeping the private key secret.

File back-up & file replication
Tahoe-LAFS supports file back-up and file replication. In fact, this was the initial main purpose of Tahoe-LAFS. For file replication, erasure coding parameter K can be set to 1 when writing a file, so that every share on its own suffices to reconstruct the file; each receiving server then effectively holds a full replica (see the toy example below). For file back-up, an example would be to create a mutable file and distribute it across multiple servers while keeping the private key secret.
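As a self-contained illustration of the K-of-N idea (Tahoe itself uses a Reed-Solomon erasure code; this toy sketch is not that code), consider the two simplest cases:

```python
# Toy K-of-N erasure coding. K = 1 is plain replication: every share is a
# full copy, so any single share restores the file.
data = b"file contents"

# K = 1, N = 3: three identical shares.
shares = [data, data, data]
assert shares[2] == data

# K = 2, N = 3 via XOR parity: shares A, B and A xor B; any two suffice.
half = (len(data) + 1) // 2
a = data[:half]
b = data[half:].ljust(half, b"\0")          # pad to equal length
parity = bytes(x ^ y for x, y in zip(a, b))

# Suppose share A is lost: recover it from B and the parity share.
recovered_a = bytes(x ^ y for x, y in zip(b, parity))
assert recovered_a == a
```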

Privacy & anonymity
As mentioned in section 3.9, Tahoe-LAFS can be run over an anonymization network like Tor or I2P107. However, IP addresses could still be leaked. To prevent this, Tahoe-LAFS has a flag that can be set in the configuration, namely "reveal-IP-address". Setting this Boolean to false results in Tahoe-LAFS refusing to start if the node's configuration leaves any chance of leaking its IP address. A requirement for this would be, for example, to run over an anonymization network like Tor; a configuration sketch is given below.
Using this configuration, it is also possible to support anonymous file downloads. If a user publishes which servers hold the shares of a file that he/she uploaded, and provides other users with a read-only key, those users would be able to download the shares of the file anonymously.

107 Source: https://tahoe-lafs.readthedocs.io/en/latest/anonymity-configuration.html
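A sketch of the configuration described above; the reveal-IP-address option is named in the text, and routing connections over Tor follows the cited documentation107, but the option names should be checked against the Tahoe-LAFS version in use:

```python
# Sketch: a tahoe.cfg fragment that makes the node refuse to start if its
# IP address could leak, and routes outbound connections over Tor.
from pathlib import Path

TAHOE_CFG_FRAGMENT = """\
[node]
reveal-IP-address = false

[connections]
tcp = tor
"""

# Hypothetical node directory; Tahoe-LAFS reads <node-dir>/tahoe.cfg.
cfg = Path.home() / ".tahoe" / "tahoe.cfg"
print(f"merge the following into {cfg}:\n{TAHOE_CFG_FRAGMENT}")
```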

Security
As mentioned in the file sharing section, Tahoe uses capability-based access control. A capability is a short string of bits which uniquely identifies a file or directory. Knowledge of this identifier is necessary and also sufficient to gain access to the object that it identifies; thus the string has to be long enough to be unguessable, while still being efficient to store and transmit. The authors of the paper chose a minimum of 96 bits for the length of the string. This type of access control is also called "cryptographic capabilities".
As mentioned in section 3.9, Tahoe distributes (encrypted) pieces of a file, called shares, to different servers. This process is called "erasure coding". When writing a file, the writer has to choose N (the number of shares that will be written) and K (the number of shares needed to read the file). N has to be a number between 1 and 256; K has to be a number between 1 and N. Shares are distributed to a number of storage servers H. Consequently, as mentioned in section 3.9, if H-K+1 or more servers are unavailable, the file cannot be retrieved.
As mentioned in the file sharing section, Tahoe distinguishes immutable and mutable files. For an immutable file, a client has to choose a symmetric key to encrypt the file. After that, the client has to choose N (number of shares) and K (number of shares needed to read the file). After these parameters are chosen, the client erasure codes the ciphertext into N shares and writes the shares to different servers.
The verify capability is derived from the ciphertext of the file by using secure hashes. Tahoe computes a Merkle Tree over the shares to verify the correctness of each share. In other words, if an integrity check fails, the client knows which of the shares are wrong. In addition, Tahoe also computes a Merkle Tree over the ciphertext, meaning that users can check the correctness of parts of the file without downloading the whole file. This is necessary for the verify capability, as this capability is not allowed to see the plaintext of a file. The roots of the two Merkle Trees are kept in a small block of data, and a copy of this block is stored on each server next to the shares. The paper calls this structure the "Capability Extension Block" (CEB). The verify capability is then a secure hash of this CEB. The read capability is the verify capability plus the symmetric encryption key.
Mutable files differ considerably from immutable files. To start, mutable files are based on RSA key pairs: each file is associated with an RSA key pair. Users that are authorized to write have access to the private key, with which they create signatures on the new versions of the file that they write. In order to write a file, the write key has to be computed, which is a secure hash of the private key. The read key is a secure hash of this write key, meaning that anyone with a write key can compute the read key, but not vice versa: someone with only a read key cannot compute the write key from the read key. The verify capability in mutable files means that a client can check the signature of a file; therefore, the verify capability (VC) is a secure hash of the public key. The read capability (RC) contains the verify capability (VC) and the read key.

The read-write capability contains the verify capability (VC) and the write key. This write key can be used to compute the read key, as mentioned above.
As shares are encrypted before being sent over the internet, the servers on which the shares are stored cannot decrypt them. Adversaries intercepting traffic would likewise only see encrypted shares. All encryption in this protocol is AES-128 encryption in CTR (counter) mode, as sketched below.
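The mutable-file key chain and the share encryption can be sketched as follows; the hash-and-truncate steps are simplified stand-ins for Tahoe's tagged-hash derivations, and the snippet assumes the pyca/cryptography package:

```python
# Sketch of the mutable-file key chain: write key = hash(private key),
# read key = hash(write key); derivation only works in that direction.
import os
from hashlib import sha256
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

rsa_private_key = b"...DER-encoded RSA private key..."  # placeholder
write_key = sha256(rsa_private_key).digest()[:16]  # held by writers
read_key = sha256(write_key).digest()[:16]         # derivable from write_key
# Recovering write_key from read_key would mean inverting SHA-256.

# Content encryption: AES-128 in CTR mode, as used by the protocol.
nonce = os.urandom(16)
encryptor = Cipher(algorithms.AES(read_key), modes.CTR(nonce)).encryptor()
ciphertext = encryptor.update(b"new version of the file") + encryptor.finalize()
```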

IPFS vs. Tahoe-LAFS vs. BitTorrent Sync
In this section, IPFS, Tahoe-LAFS and BitTorrent Sync are compared. The information about BitTorrent Sync is based on section 3.6 and the paper mentioned in the intro of section 4. The information about IPFS and Tahoe-LAFS is based on sections 4.1 and 4.2. In order to make an accurate overview of these three technologies, they are compared against the criteria that they were analyzed against:

• File sharing
• File synchronization
  ─ One way
  ─ Two way
• File back-up & file replication
• Anonymous file downloads
• Encryption & authentication

Table 8. Protocol comparison

                           IPFS    Tahoe-LAFS    BitTorrent Sync
 File sharing              Yes     Yes           Yes
 One-way FS                No *    Yes           Yes
 Two-way FS                No *    Yes           Yes
 File back-up              No *    Yes           Yes
 File replication          No *    Yes           Yes
 Anonymous file downloads  No *    Yes           No
 Encryption                +       ++            ++
 Authentication            +       ++            ++

* Not supported, but can be achieved using workarounds / additional technologies

As can be read in section 4.1, IPFS has several functionalities that are not yet supported or that require additional technologies / workarounds. They are therefore listed as "No" in the comparison, because the protocol itself does not provide solutions for them. However, as there are workarounds for all of the functionalities listed as "No", they are marked with a * to indicate that these functionalities could eventually be achieved by extending the protocol.

Conclusion

IPFS seemed a good candidate at first, but it is at an early stage and the latest paper is still a draft. It can achieve all the functionalities we want in our end solution, but that would require several workarounds and additional technologies, which makes it too risky compared to the other two technologies. The technology with the best overall score is Tahoe-LAFS. It is also well documented and already has a big community. BitTorrent Sync would also be a good candidate if the privacy issues were solved. However, implementing the protocol without being dependent on the software program "Resilio Sync" seems hard, as the protocol is not open source and is only implemented in Resilio Sync. Therefore, Tahoe-LAFS seems the best protocol to use, as it provides all the functionalities we want and, in addition, has good sources of information.


References

1. Andreeva, E., Bouillaguet, C., Fouque, P. A., Hoch, J. J., Kelsey, J., Shamir, A., & Zimmer, S. (2008, April). Second preimage attacks on dithered hash functions. In Annual International Conference on the Theory and Applications of Cryptographic Techniques (pp. 270-288). Springer, Berlin, Heidelberg.
2. Benet, J. (2014). IPFS - content addressed, versioned, P2P file system. arXiv preprint arXiv:1407.3561.
3. Bharambe, A. R., Herley, C., & Padmanabhan, V. N. (2005). Analyzing and improving BitTorrent performance. Microsoft Research, Microsoft Corporation, One Microsoft Way, Redmond, WA, 98052, 2005-03.
4. Butt, A. R., Johnson, T. A., Zheng, Y., & Hu, Y. C. (2003). Hoard: A Peer-to-Peer Enhancement for the Network File System.
5. Butt, A. R., Johnson, T. A., Zheng, Y., & Hu, Y. C. (2006). Kosha: A peer-to-peer enhancement for the network file system. Journal of Grid Computing, 4(3), 323-341.
6. Conrad, B., & Shirazi, F. (2014). A Survey on Tor and I2P. In Ninth International Conference on Internet Monitoring and Protection (ICIMP 2014) (pp. 22-28).
7. Mazieres, D., & Kaashoek, M. F. (1998). Escaping the evils of centralized control with self-certifying pathnames. In Proceedings of the 8th ACM SIGOPS European workshop on Support for composing distributed applications (pp. 118-125). ACM.
8. Dingledine, R., Mathewson, N., & Syverson, P. (2004). Tor: The second-generation onion router. Naval Research Lab, Washington DC.
9. Ehlert, M. (2011). I2P usability vs. Tor usability: a bandwidth and latency comparison. In Seminar Report, Humboldt University of Berlin (pp. 129-134).
10. Freedman, M. J., & Mazieres, D. (2003). Sloppy hashing and self-organizing clusters. In International Workshop on Peer-to-Peer Systems (pp. 45-55). Springer, Berlin, Heidelberg.
11. Haynes, T. (2016). Network File System (NFS) Version 4 Minor Version 2 Protocol (No. RFC 7862).
12. Haynes, T., & Noveck, D. (2015). Network File System (NFS) Version 4 Protocol (No. RFC 7530).
13. Howard, J. H. (1988). An overview of the Andrew File System (p. 2). Carnegie Mellon University, Information Technology Center.
14. Luby, M., Vicisano, L., Gemmell, J., Rizzo, L., Handley, M., & Crowcroft, J. (2002). Forward error correction (FEC) building block. RFC 3452, December.
15. McCoy, D., Bauer, K., Grunwald, D., Kohno, T., & Sicker, D. (2008). Shining light in dark places: Understanding the Tor network. In International Symposium on Privacy Enhancing Technologies (pp. 63-76). Springer, Berlin, Heidelberg.
16. OpenAFS. (n.d.). Chapter 1. An Overview of OpenAFS Administration. Retrieved April 29, 2019, from http://docs.openafs.org/AdminGuide/HDRWQ5.html
17. Plank, J. S., Luo, J., Schuman, C. D., Xu, L., & Wilcox-O'Hearn, Z. (2009). A Performance Evaluation and Examination of Open-Source Erasure Coding Libraries for Storage. In FAST (Vol. 9, pp. 253-265).
18. Pouwelse, J., Garbacki, P., Epema, D., & Sips, H. (2005). The BitTorrent P2P file-sharing system: Measurements and analysis. In International Workshop on Peer-to-Peer Systems (pp. 205-216). Springer, Berlin, Heidelberg.
19. Ripeanu, M. (2001). Peer-to-peer architecture case study: Gnutella network. In Proceedings First International Conference on Peer-to-Peer Computing (pp. 99-100). IEEE.

20. Scanlon, M., Farina, J., & Kechadi, M. T. (2015). Network investigation methodology for BitTorrent Sync: A Peer-to-Peer based file synchronisation service. Computers & Security, 54, 27-43.
21. Scanlon, M., Farina, J., Khac, N. A. L., & Kechadi, T. (2014). Leveraging decentralization to extend the digital evidence acquisition window: Case study on BitTorrent Sync. arXiv preprint arXiv:1409.8486.
22. Schollmeier, R. (2001). A definition of peer-to-peer networking for the classification of peer-to-peer architectures and applications. In Proceedings First International Conference on Peer-to-Peer Computing (pp. 101-102). IEEE.
23. Shah, P., & Pâris, J. F. (2007). Incorporating trust in the BitTorrent protocol. In International Symposium on Performance Evaluation of Computer and Telecommunication Systems (pp. 586-593).
24. Shepler, S., Callaghan, B., Robinson, D., Thurlow, R., Beame, C., Eisler, M., & Noveck, D. (2003). Network File System (NFS) version 4 protocol (No. RFC 3530).
25. Slamanig, D., & Hanser, C. (2012). On cloud storage and the cloud of clouds approach. In 2012 International Conference for Internet Technology and Secured Transactions (pp. 649-655). IEEE.
26. Stoica, I., Morris, R., Karger, D., Kaashoek, M. F., & Balakrishnan, H. (2001). Chord: A scalable peer-to-peer lookup service for internet applications. ACM SIGCOMM Computer Communication Review, 31(4), 149-160.
27. Timpanaro, J. P., Chrisment, I., & Festor, O. (2012). A bird's eye view on the I2P anonymous file-sharing environment. In International Conference on Network and System Security (pp. 135-148). Springer, Berlin, Heidelberg.
28. Toole, R., & Vokkarane, V. (2006). BitTorrent architecture and protocol. University of Massachusetts Dartmouth, Dartmouth.
29. Wilcox-O'Hearn, Z., & Warner, B. (2008). Tahoe: the least-authority filesystem. In Proceedings of the 4th ACM international workshop on Storage security and survivability (pp. 21-26). ACM.
30. Wojciechowski, M., Capotă, M., Pouwelse, J., & Iosup, A. (2010). BTWorld: towards observing the global BitTorrent file-sharing network. In Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing (pp. 581-588). ACM.