<<

World Libraries: Towards Efficiently Sharing Large Data Volumes in Open Untrusted Environments while Preserving Privacy

Von der Fakultät für Mathematik, Informatik und Naturwissenschaften der RWTH Aachen University zur Erlangung des akademischen Grades eines Doktors der Naturwissenschaften genehmigte Dissertation

vorgelegt von

Diplom-Informatiker Stefan Richter aus Berlin

Berichter: Universitätsprofessor Dr. Peter Rossmanith
           Universitätsprofessor Dr. Dr. h.c. Otto Spaniol

Tag der mündlichen Prüfung: 12. November 2009

Diese Dissertation ist auf den Internetseiten der Hochschulbibliothek online verfügbar.

Zusammenfassung

Öffentliche Bibliotheken erfüllen seit langer Zeit eine Funktion von unschätzbarem Wert. Im Zeitalter digitaler Medien verlieren jedoch traditionelle Methoden der Informationsverbreitung zunehmend an Bedeutung. Diese Arbeit stellt die Frage, ob und wie Informationstechnologie genutzt werden kann, um funktionelle Alternativen zu finden, die den durch eben diese Informationstechnologie veränderten Bedürfnissen der Gesellschaft gerecht werden. Zu diesem Zweck führen wir das neue Konzept von Weltbibliotheken ein: Stark verteilte Systeme, die zur effizienten Verbreitung von großen Datenmengen in offenen, unsicheren Umgebungen geeignet sind und dabei die Privatsphäre ihrer Nutzer schützen. Wir grenzen unsere Definition zunächst von ähnlichen Konzepten ab und motivieren ihren Nutzen mit philosophischen, ökonomischen und rechtlichen Argumenten. Dann untersuchen wir existierende anonyme File-Sharing-Systeme dahingehend, inwiefern sie als Implementierungen von Weltbibliotheken geeignet sind. Dabei isolieren wir Gründe für den vergleichsweise geringen Erfolg dieser Gruppe von Programmen und identifizieren Prinzipien und Strukturen, die in Weltbibliotheken angewandt werden können. Eine solche Struktur von übergeordneter Wichtigkeit ist die sogenannte verteilte Hashtabelle (DHT), deren Sicherheitseigenschaften, vor allem in Hinblick auf den Schutz der Privatsphäre, wir daher näher beleuchten. Mit Hilfe von analytischen Modellen und Simulation zeigen wir Schwachstellen in bisher bekannten Systemen auf und stellen schließlich die erste praktische DHT mit skalierbaren Sicherheitseigenschaften gegen aktive Angreifer vor. Das vierte Kapitel der Arbeit widmet sich allgemeinen Schranken der Anonymität. Anonymitätstechniken stellen wichtige Mittel zur Sicherung von Privatsphäre dar. Wir untersuchen neuartige Angriffe auf solche Techniken, die unter sehr allgemeinen Voraussetzungen einsetzbar sind, und finden analytische untere Schranken für die Erfolgswahrscheinlichkeit solcher Angriffe in Abhängigkeit von der Anzahl der gegnerischen Beobachtungen. Dies beschränkt die Anwendbarkeit von Anonymitätstechniken in Weltbibliotheken und anderen privacy enhancing technologies. Abschließend fassen wir unsere Erkenntnisse und ihre Auswirkungen auf die Gestaltung von Weltbibliotheken zusammen.

Abstract

Public libraries have served an invaluable function for a long time. Yet in an age of digital media, traditional methods of publishing and distributing information become increasingly marginalized. This thesis poses the question of whether and how information technology can be employed to find functional alternatives, conforming to the needs of society that have been changed by information technology itself. To this end we introduce the concept of world libraries: massively distributed systems that allow for the efficient sharing of large data volumes in open, untrusted environments while preserving privacy. To begin with, we discuss our definition as it relates to similar but different notions, then defend its utility with philosophical, economic, and legal arguments. We go on to examine anonymous file sharing systems with regard to their applicability for world libraries. We identify reasons why these systems have failed to achieve popularity so far, as well as promising principles and structures for successful world library implementations. Of these, distributed hash tables (DHTs) turn out to be particularly important, which is why we study their security and privacy properties in detail. Using both analytical models and simulation, we expose weaknesses in earlier approaches, leading us to the discovery of the first practical DHT that scalably defends against active adversaries. The fourth chapter is dedicated to general bounds on anonymity. Anonymity techniques are important tools for the preservation of privacy. We present novel attacks that endanger the anonymity provided by these tools under very general circumstances, giving analytical lower bounds on the success probability of such attacks as a function of the number of adversarial observations. These results limit the applicability of anonymity techniques in world library implementations and other privacy enhancing technologies. Finally, we summarize our findings and their impact on world library design.

Acknowledgements

I am deeply grateful for the opportunity to conceive, prepare, and finish this thesis. This feeling is not limited to a specific person or group of people, and were I to personally thank everyone who has wished me well or somehow contributed to the success of this enterprise, this thesis would easily double its volume. Still, the following people have played an outstanding part in this process, which is why I want to say thank you to them in particular: First of all, my adviser Prof. Dr. Peter Rossmanith for giving me the chance to find and pursue my own interests, even when they strayed rather far from canonical theoretical computer science. My co-adviser Prof. Dr. Dr. h.c. Otto Spaniol for agreeing to report on such an unusual thesis. My colleagues Joachim Kneis and Alexander Langer, who, although they had a hard time understanding what this was all about, freed me from the hard labor of teaching when it was most needed. My coauthors Daniel Mölle, Dogan Kesdogan, Peter Rossmanith, Andriy Panchenko, and Arne Rache, without whom most research that went into this thesis would not have been possible. Lexi Pimenidis for introducing me to the field of security research. Hossein Shafagh, who supported me greatly in the practical aspects of understanding the OFF System better. Bob Way, Philipp Claves, Spectral Morning, and the rest of the Owner-Free community, for showing me a fresh perspective and for hands-on help with OFF. Johanna Houben for encouraging me to finish this thesis when I was in doubt. My parents for patiently trusting that what I did was right for me, and my entire family for their ongoing support in more ways than I could thank them for. Finally, I would like to especially thank the following people (again) for proofreading and helpful comments on draft versions of this document: Daniel Mölle, Philipp Claves, Bertolt Meier, Bob Way, Diego Biurrun, Christoph Richter, my parents, Peter Rossmanith, and Verena Simon.

Contents

1 World Libraries
  1.1 Introduction and Overview
  1.2 Definition and Related Concepts
    1.2.1 Digital Libraries
    1.2.2 Tor and Other Low-Latency Anonymity Networks
    1.2.3 Censorship Resistant Publishing, Anonymous Storage, and Freenet
    1.2.4 Private Information Retrieval and Oblivious Transfer
    1.2.5 Anonymous File Sharing
  1.3 Motivation, Social Effects, and the Law
    1.3.1 Privacy
    1.3.2 Possible Objections

2 Existing Systems/Case Studies
  2.1 Ant Routing Networks
  2.2 Other Anonymous Networks
  2.3 The Owner-Free File System
    2.3.1 Introduction to the OFF System
    2.3.2 Attacks on OFF
    2.3.3 Scaling OFF
    2.3.4 Conclusion

3 Privacy and Security Aspects of Distributed Hash Table Design
  3.1 Introduction
    3.1.1 DHT Model
    3.1.2 Attacker Model
  3.2 Related Work
  3.3 Active Attacks
    3.3.1 Redundancy through Independent Paths Does Not Scale
    3.3.2 Redesigning Redundancy: Aggregated Greedy Search
    3.3.3 Hiding the Search Value
    3.3.4 Bounds Checking in Finger Tables
  3.4 Passive Attacks
    3.4.1 Information Leakage
    3.4.2 Increasing Attacker Uncertainty
    3.4.3 Random Walks in Network Information Systems
  3.5 Further Observations
    3.5.1 Bootstrapping Process
    3.5.2 Arbitrary Positioning of Malicious Nodes
    3.5.3 Path Length in Aggregated Greedy Search
  3.6 Conclusion

4 Learning Unique Hitting Sets — Bounds on Anonymity
  4.1 Introduction
    4.1.1 Anonymity Techniques
    4.1.2 Related Work and Contributions
  4.2 The Uniqueness of Hitting Sets
  4.3 A Simple Learning Algorithm
  4.4 An Advanced Learning Algorithm
  4.5 Conclusion

5 Conclusion
  5.1 Summary and Contributions
  5.2 Requirements, Issues, Principles, and Ideas for World Libraries
  5.3 Outlook and Future Work

List of Figures

2.1 Number of uses found for blocks in the live OFF network
2.2 Interaction frequency per node histogram in the live OFF system
2.3 Interaction frequency per node histogram in the live OFF system when restricted to accepted block pushes
2.4 Block frequency histogram for the PlanetLab OFF system on March 26, 2009 at 17:42:59: 379 nodes after storing and dispersing 684 MB of data
2.5 Block frequency histogram for the PlanetLab OFF system on March 26, 2009 at 19:04:11: 376 nodes after downloading 682 MB of data
2.6 Interaction frequency per node histogram in the PlanetLab OFF system restricted to downloads

3.1 Malicious nodes respond with random collaborating nodes
3.2 Malicious nodes respond with collaborating nodes closest to the search value
3.3 Finger table analysis: malicious nodes change 10 entries
3.4 Finger table analysis: malicious nodes change 4 entries
3.5 Factor influence on success probability (1)
3.6 Factor influence on success probability (2)
3.7 Malicious nodes know the FT acceptance threshold
3.8 Fraction of malicious nodes depending on the iteration
3.9 Theoretical lower bound to attacker success on random walks for attacker ratio f = 0.2 and increasing attacker finger table corruption g
3.10 Tolerance factor influence with random walk
3.11 Combined: random walk and search
3.12 Arbitrary positioning of malicious nodes

Chapter 1

World Libraries

The Tao of heaven is like the bending of a bow.
The high is lowered, and the low is raised.
If the string is too long, it is shortened;
If there is not enough, it is made longer.
The Tao of heaven is to take from those who have too much
and give to those who do not have enough.
Man's way is different.
He takes from those who do not have enough
and gives to those who already have too much.
What man has more than enough and gives it to the world?
Only the man of Tao.
Therefore the sage works without recognition.
He achieves what has to be done without dwelling on it.
He does not try to show his knowledge.

[50, verse 77]

1.1 Introduction and Overview

It is difficult to deny that knowledge is a strategic asset in postmodern societies. We have all heard of the emergence of an information age, and one might be inclined to believe that the importance of this asset is growing when compared to other values such as physical strength or land. It can then be argued that distributing information equally with respect to social status is tantamount to ensuring equal opportunities for all members of a society, be it for self-expression or economic welfare. For generations, it has been a hallmark of modern civilizations to increasingly decouple media access from income. Public libraries have been a popular means to this end. Yet we will argue that institutions like this, which have been successfully employed to help make culture a public good in developed societies, are in danger of being rendered ineffective.

The advent of the digital computer, the Internet, the World Wide Web, and finally file sharing networks such as Napster have made tangible and obvious a very basic computer science tenet: the division between signal and message, the independence of content from physical media. Information has become infinitely divisible and cheaply distributable. Ironically, it is precisely this property of information that now endangers its dissemination through libraries. Libraries¹ are institutions that have been designed to hold and give access to media in physical form. Since content, and thus knowledge, is increasingly produced, transmitted, and stored independently from any fixed media, it is not clear how libraries are to deal with this situation. For example, the exclusive perusal model of libraries is natural for physical copies, yet hard to justify or even implement for digital data. Hence, an increasing volume of newly generated or newly transformed information will never reach a public library.

This thesis asks how the objectives that the public library served in an age dominated by books and physical media can be attained when concrete physical media representations become cheaply interchangeable in information storage, distribution, and consumption. To this end, in this chapter we propose, define, and motivate the concept of a world library as a massively distributed system that allows for the efficient sharing of large data volumes in open, untrusted environments while preserving privacy. While we cannot claim to have built an actual world library, the remaining chapters in this book illustrate the complexities of its realization.

We begin by surveying existing "anonymous file sharing" systems that can be seen as early prototypes of world libraries, because anonymity is one mechanism that can be used to preserve privacy. In particular, we will analyze in some depth the Owner-Free File System (OFF), a community effort that embraces an unorthodox approach in both philosophy and technology — thereby straying from the "anonymous file sharing" dogma and prefiguring the concept of world libraries that we propose in this thesis. Since we deal with legacy systems here, we employ methods from practical computer science — chiefly experiments and program analysis — along with a more history-of-technology oriented approach that tries to read artifacts as sources bearing on the question of why these systems did not succeed.

¹ We use the term library in this real-world, commonplace meaning without definition. It will merely serve us as a metaphor to illustrate our motivation. Simply imagine a public library of your choosing, a place where media can be stored and perused.

One of the key points that arises repeatedly in Chapter 2 is the inherent difficulty of constructing, analyzing, and understanding complex secure distributed systems like world libraries. We therefore suggest assembling such systems out of simple and well-understood building blocks that are powerful enough to confer their security properties on the world library. One such key element is the distributed hash table (DHT) or structured peer-to-peer (P2P) network; a minimal sketch of the underlying idea follows at the end of this section. It is of particular relevance because unstructured networks exhibit severe weaknesses both in performance and security, as we observe in Chapter 2. On that account, in Chapter 3 we examine security and privacy relevant properties of these often-used structures. We give both theoretical analysis and simulation results that motivate implementation choices.

From a yet more theoretical perspective, we look at general limits to anonymity in Chapter 4. Using tools from the analysis of algorithms and learning theory, we identify and quantify the effect of novel attacks on a very generic class of anonymity techniques. Among other uses, these findings should in our opinion be considered when designing world libraries.

In the concluding chapter we draw the threads together. After reviewing our contributions, we collect the lessons learnt in the earlier chapters, among other considerations, into an open list of requirements, principles, issues, rationales, and ideas for designing world libraries. This section draws its inspiration from engineering methods for requirements engineering, analysis, and design, such as Issue-based Modeling [45]. In a short outlook we comment on the status quo of world libraries, and suggest future work opportunities.
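To make the DHT building block concrete before Chapter 3 treats it in depth, the following Python sketch illustrates the core idea of consistent hashing that underlies structured P2P networks. It is our own toy illustration, not code from any system discussed in this thesis: node names and document keys are hashed onto one identifier ring, and each key is stored at the first node that follows its position on the ring, so that node churn only relocates keys in the affected ring segment. Real DHTs such as Chord add finger tables for logarithmic routing, the very structures whose security properties Chapter 3 analyzes.

    import hashlib
    from bisect import bisect_left

    def ring_id(data: bytes, bits: int = 16) -> int:
        # Hash arbitrary data to a position on the identifier ring.
        return int.from_bytes(hashlib.sha1(data).digest(), "big") % (2 ** bits)

    class ToyDHT:
        # Key-to-node mapping only; no routing, replication, or churn handling.
        def __init__(self, node_names):
            self.by_id = {ring_id(n.encode()): n for n in node_names}
            self.ring = sorted(self.by_id)

        def responsible_node(self, key: str) -> str:
            # A key lives at the first node clockwise from its ring position.
            k = ring_id(key.encode())
            i = bisect_left(self.ring, k) % len(self.ring)
            return self.by_id[self.ring[i]]

    dht = ToyDHT(["alice", "bob", "carol", "dave"])
    print(dht.responsible_node("some-document-block"))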

1.2 Definition and Related Concepts

Let us begin by defining what exactly we mean by the term world library. While library refers to the metaphor of a public library, a place where documents are collected and can be perused, the world qualifier suggests some expansion in scope. That is, we wish to convey a connotation that brings to mind a rethinking of the ancient concept of a public library for the digital age, expanded to global scale. This expansion entails physical and organisational distribution and poses questions of realization for both technical components — like storage, search, and delivery — and responsibilities — like ownership, content selection, and publishing. As the subtitle of this thesis suggests, we might begin with a definition like this:

Definition 1 A world library is a massively distributed system that allows for the efficient sharing of large data volumes in an open untrusted environment while preserving privacy.

Obviously, this definition is not mathematically formal, and it cannot be, since we are describing a class of potential real-world systems. Nonetheless, we may make this definition more precise by explaining its components and motivating their presence in the definition:

• A massively distributed system [129] is a peer-to-peer overlay computer network that scales to very large numbers of participating nodes, say, millions to billions. We aim to use the existing and evolving network structure and resources of the Internet. By providing incentive for private and institutional node owners to participate, we hope to leverage technical resources, content, and user intelligence in order to build a library that is valuable to its users. This economic principle of increasing usefulness by mass participation in a self-energizing cycle lies at the heart of many peer-to-peer applications that have emerged in the last decade (cf. [100]), but can be traced back far beyond the inception of the Internet, with the Web 2.0 or crowd-sourcing trend a more recent example. For such a system to be self-supporting, this principle presupposes usefulness for every participant, and this aspect will at the same time motivate and enable some of the remaining properties of world libraries, in particular the following one.

• In this context, we want efficient to be understood as a placeholder for nonfunctional requirements such as high data throughput, fast response times, reliability, security, or, more generally, usability in practice. This includes engineering and social aspects such as simplicity, user interface design, or socially accepted motivation, and rules out purely theoretical approaches as well as approaches that have failed in the past. We will look at some of these in Chapter 2.

• By sharing we denote the functional capability to publish, maintain or store, find, and retrieve documents in a distributed manner. At this point we leave open the question of whether this is better done by simply exchanging documents that are stored at the publisher's computer or by first storing them in some form elsewhere. Implementation details like this are relegated to the more technical chapters.

• Large data volumes imply both a great number of documents and documents that contain a large amount of data. While the first point is natural in a library, let alone one of global scope, the second stems from the increasing trend towards data-intensive multimedia documents such as video.

• An open untrusted environment is characterized by the spontaneous and ongoing joining and leaving of nodes (so-called node churn), by nodes that have to be treated as potential adversaries at least up to a certain fraction, and by the absence of a trusted third party. To our mind, this is the only model that is able to capture a truly scalable and accessible library under the realities of the Internet today, while avoiding the technical, political, and legal dangers of a centralized authority (single point of failure, single point of attack, censorship opportunity, etc.).

• Preserving privacy means making it hard for attackers to associate user identities with user behavior. This point is of central importance to our concept and this thesis, and is therefore separately motivated in Section 1.3.1.

The world library concept has clearly been prefigured by a number of similar but distinct notions. In order to justify the need for a new term and definition, we therefore differentiate these related notions from ours.

1.2.1 Digital Libraries

DELOS, "a Network of Excellence on Digital Libraries partially funded by the European Commission" [37], defines a digital library as "An organisation, which might be virtual, that comprehensively collects, manages and preserves for the long term rich digital content, and offers to its user communities specialised functionality on that content, of measurable quality and according to codified policies." [47]. At first glance, this bears striking similarity to our concept of a world library, and indeed, DELOS shares our vision "That all citizens, anywhere, anytime, should have access to Internet-connected digital devices to search all of human knowledge, regardless of barriers of time, place, culture or language (...)" [37]. On closer inspection, however, the above definition reveals a strong focus on organisation, in contrast to technology. In fact, the many digital libraries existing today are distinct organizational efforts with clear ownership, even though some of them, like "The Open Library" [99] and in particular "Google Book Search" [59], are very large in scale. These two libraries also exemplify some of the problems that the digital library approach faces: At the time of writing, The Open Library features 22,845,290 books, but only 1,064,822 of them are available in full text. This is because, due to licensing pressures, only works in the public domain can be read online. Google faces a similar problem, and has just recently been sued by a large number of publishers for making available mere parts of their books. While Google has the means to buy their way out of these accusations,

they have neither an incentive nor the means to make available the full text of most of the millions of books they scanned. Moreover, Google is a commercial enterprise that needs to refinance even the limited service it can offer through advertising and similar business models, a behavior that most people would not expect from their public library.

To summarize, by relying on organisations instead of a loose peer-to-peer network of unrelated participants, cost problems and legal issues arise that grow larger with the scope of the digital library and could only be solved if the organisations were to scale globally as well. Our alternative technological approach aims to avoid the problems of large organizations by distributing the load to spontaneously volunteering nodes. In the process, legal issues can also be distributed, because there is no single entity that can be held responsible. Instead, each user carries the responsibility for the works he contributes or downloads. In Section 1.3, together with one more example of a digital library, we provide additional rationale for our differing approach. Later on in that section, after introducing our main motivation of privacy protection, we discuss social and legal implications of the above-mentioned distribution of responsibility. As a side note, organized digital libraries introduce some nontrivial privacy issues as well.

1.2.2 Tor and Other Low-Latency Anonymity Networks

Tor [41], the onion router, is a low-latency anonymity network that can be used to anonymize various Internet protocols including HTTP, SMTP, and many more. One of its possible uses is hiding users' identities from web servers while browsing websites. There is even the possibility of offering so-called hidden services that might be employed to publish data anonymously. While Tor has been very successful, with an estimated several hundred thousand users at the time of writing, there are some drawbacks to the obvious idea of using this technology in the place of a world library: Firstly, Tor traffic is routed through several volunteering servers (nodes). Their number is below 2000 at the time of writing. This severely limits the amount of traffic that the network is able to route for its clients and therefore practically excludes its use for bandwidth-intensive tasks such as file sharing. New designs that strive for a stricter implementation of the peer-to-peer paradigm, where every client is also a server, have been proposed, yet have not reached Tor's popularity so far, for a number of technical and social reasons that are out of scope here. Secondly, there have been many successful attacks on the anonymity of Tor reported in the literature (e.g. [5, 57, 101, 90, 65]), and hidden services are especially in danger. Attacks similar to those described in Chapter 4 might pose a particular threat, unless a hidden service were to change its location often. Therefore, even if a comparable low-latency anonymity network could be made to scale to the high bandwidth needs of world library type applications, it is questionable whether this kind of design could be used to preserve privacy for publishers in the long run. Tor has been designed with a very different objective in mind. It is both too general — e.g., because it supports anonymization for a large group of protocols — and too specific — because it compromises anonymity in exchange for low latency — to be effectively used as a world library. As the most successful anonymity network deployed today, however, Tor can inspire a number of best practices, such as the mantra of "anonymity loves company" [40]: Any anonymity network needs to be highly usability oriented — or, in the language of our definition, efficient — because popularity increases anonymity set size (see Chapter 4), which in turn improves anonymity.
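The mantra can be made quantitative. If an attacker's best strategy is a uniform guess among the n plausible senders of a message (the anonymity set), a single guess succeeds with probability 1/n, and the entropy of the attacker's view grows as log2(n). The short Python sketch below is our own illustration of this entropy-based degree of anonymity, which Chapter 4 and the literature it cites treat formally; it also shows how a skewed posterior shrinks the effective anonymity set:

    import math

    def degree_of_anonymity(probabilities):
        # Shannon entropy, in bits, of the attacker's distribution over
        # candidate senders; equals log2(n) for a uniform set of size n.
        return -sum(p * math.log2(p) for p in probabilities if p > 0)

    # Growing the user base grows anonymity logarithmically.
    for n in (10, 1000, 100000):
        print(n, "users:", round(degree_of_anonymity([1.0 / n] * n), 2), "bits")

    # Four users exist, but observations concentrate suspicion on one,
    # leaving far less than the ideal log2(4) = 2 bits.
    print(round(degree_of_anonymity([0.7, 0.1, 0.1, 0.1]), 2), "bits")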

1.2.3 Censorship Resistant Publishing, Anonymous Storage, and Freenet

Around the turn of the millennium, a number of censorship resistant publishing networks, including Publius [80], Tangler [135], and Dagster [130], were proposed. As the category term suggests, their focus is on protection against censorship. This leads to their assumption of very powerful adversaries like nation states. Though some of these systems have actually been built, they have found little adoption. One of the reasons might be that their strong adversarial model affects practical usability. Anonymous storage services like The Eternity Service [2], Archival Intermemory [56], or Free Haven [39] sacrifice even more efficiency for their vision of eternally stored documents. Finally, Freenet [25] is a comparatively popular existing system that tries to build a web-like structure with strong encryption and anonymization measures and distributed document storage. We have sorted it into this group because its goals are comparable to those of the above systems, in that it tries to defend against any kind of censorship and eavesdropping by possibly very powerful attackers. In particular, Freenet's philosophy explicitly accepts the abuse of the system for socially unacceptable practices like child pornography in exchange for a radical protection of users' freedom of speech. Not only is this a controversial point of view, but in Freenet's case it also entails reduced usability, because the degree of indirection and encryption used leads to very low data rates and unpredictable data availability. These factors contribute to low user rates, which in turn violates the requirement that "anonymity loves company". Of course, the Freenet developers are aware of these problems, and have redesigned the system to incorporate elements like trusted friends and tunable security parameters from version 0.7 on. Still, in a recent trial, we did not succeed in downloading a single page from Freenet. Elements of the Freenet architecture reappear in GNUnet (cf. Section 2.2). While in Section 2.3 we will reencounter the idea of entanglement or multi-use data that is central to Dagster and Tangler, our focus is more on practical use in combination with privacy protection against local adversaries. We motivate this choice in Section 1.3.

1.2.4 Private Information Retrieval and Oblivious Transfer

Private information retrieval (PIR) [19] is a theoretical computer science concept that addresses the problem of accessing a distributed database without giving away to any server the exact query. Oblivious transfer (OT) [107] is a related concept where a sender transmits some information to a receiver, but remains oblivious as to what is received. While both PIR and OT could serve as valuable building blocks in a world library, they are very general theoretical constructs, which introduces obstacles to such a proposal: Although many solutions to these problems have been devised, researchers are still struggling to make any of them practically usable. This is firstly because many of these solutions employ complex algorithms that are efficient only asymptotically, and secondly because these general concepts fail to immediately address practical issues such as searching or denial-of-service attacks. Hence, we do not believe that a practical approach to these problems will trivialize the creation of world libraries anytime soon.
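To give a flavor of how PIR works, here is our own minimal Python sketch of the simplest two-server, information-theoretic scheme in the spirit of [19] (an illustration of the idea, not one of the efficient constructions from the literature). The bit database is replicated on two servers that are assumed not to collude; each server sees only a uniformly random index set, so neither learns which bit the client wants:

    import secrets

    def server_answer(db_bits, query_set):
        # Each server XORs together the database bits its query selects.
        acc = 0
        for j in query_set:
            acc ^= db_bits[j]
        return acc

    def pir_read(db_bits, i):
        # Two-server XOR-based PIR: the servers' index sets differ exactly
        # in i, so all other bits cancel when the answers are combined.
        n = len(db_bits)
        s1 = {j for j in range(n) if secrets.randbits(1)}  # uniform random subset
        s2 = s1 ^ {i}                                      # flip membership of i
        return server_answer(db_bits, s1) ^ server_answer(db_bits, s2)

    db = [1, 0, 1, 1, 0, 0, 1, 0]
    assert all(pir_read(db, i) == db[i] for i in range(len(db)))

Note that this naive variant already hints at the practical obstacles named above: it transfers O(n) bits per query, offers no way to search, and its privacy rests entirely on the servers not colluding, a trust assumption that sits uneasily with the open untrusted environments of Definition 1.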

1.2.5 Anonymous File Sharing

Although the file sharing phenomenon has farther-reaching roots, it came to prominence with the rise of Napster (cf. [100]) and inspired the emergence of the peer-to-peer paradigm. One might reasonably classify modern file sharing networks as massively distributed systems that allow for the efficient sharing of large data volumes, thus mirroring a large part of our world library definition. In fact, when we add the extra requirement of preserving privacy, we might think of anonymous file sharing as a solution to the problem. This is why we examine some anonymous file sharing networks as precursors of world libraries in Chapter 2. Still, we find the notion of anonymous file sharing problematic for a set of reasons that inspired us to instead define the new term world library:

To begin with, the term file sharing is not a scientifically defined one. It came into everyday language through its use in mass media for networks like Napster. While technically very different, these networks had similar use cases that warrant the overarching term. When privacy hazards like the copyright industry's mass lawsuits against network users became publicized, the term anonymous file sharing reflected the expectation that these networks could be protected against such privacy violations through the straightforward application of anonymization techniques such as the use of proxies. As we will show in Chapter 2, these expectations have not been met. Even though a number of anonymous file sharing systems have been devised and built, none of them has been adopted for mass usage. In spite of the dangers involved, the user base of traditional, non-anonymous file sharing networks still outnumbers that of these "third generation" networks by orders of magnitude. The question of why this is the case has motivated our research in no small degree, and in this thesis we hope to give not one, but a number of answers.

In short, we have come to the conclusion that the approach suggested by the term anonymous file sharing is neither necessary nor sufficient to achieve what we have termed world libraries. It is not necessary because it is too specific: There are several ways of preserving privacy, and anonymization is just one of them. It is not sufficient because it is too general: Darknets or friend-to-friend networks, for example, are a variant of anonymous file sharing where virtual private subnets are formed by connecting only to people a user trusts (e.g. [67, 106], see [115] for a survey). This need for a trust relationship violates our requirement of an open untrusted environment. While darknets may be useful for achieving privacy, their presupposition of social relationships between users typically leads to isolated network components and hinders mass use. Since we believe that not only anonymity loves company, but so does a world library user, because company means more content and a greater audience, we hold that a tolerance of open untrusted environments is an important condition for such an application to succeed. Very recently, the OneSwarm network [67] has pioneered the idea of piggybacking trust relationships on existing online social networks. While we cannot predict yet how much this will help the system scale, it serves to illustrate another point that, to our mind, makes darknets unlikely candidates for successful world libraries: Would the average person really like all their social acquaintances on, say, Facebook, to know what kind of content they store on their computers, publish, or peruse? As we will see in Section 1.3.1, a major motive for privacy is freedom from social pressure. Ironically, most people might fear social pressure to originate from their social contacts in the first place, while they might even be indifferent to what random strangers think of them.

On an even more nontechnical level, we observe that the term file sharing, though originally neutral, has become extremely loaded by both the mass media, which in most cases can be seen as belonging to what we call the copyright industry in our discussion in Section 1.3.2, and a certain subculture of mostly young Internet users who delight in a rebellious self-image of "you can't catch me", as exemplified by the founders of the Bittorrent search website "The Pirate Bay" [105]. Although file sharing networks like Bittorrent are often used for purposes that are legitimate by any standard, and in spite of the immense social utility of world libraries that we argue for in Section 1.3, the notion of anonymous file sharing carries with it a connotation of borderline legitimacy. Notice that this effect in itself might also contribute to the curious lack of popularity of anonymous file sharing today. Ultimately, the choice of term could be argued to have had a diminishing effect on both the technical efficiency and the social acceptance of these world library forerunners. For these reasons, we prefer to establish a well-defined term, encompassing a number of properties that we deem essential requirements for successful implementation and adoption, and carrying with it a neutral to positive connotation, which, as we argue in the following section, can be defended by appeal to philosophically grounded values and socioeconomic deliberation.

1.3 Motivation, Social Effects, and the Law

In this section we will motivate why we believe that a world library would be a desirable artifact for societies in general. Clearly, it is a controversial proposal if one looks at the world-wide discussion around subjects like privacy, copyright, or anonymity abuse. Hence we will take as a starting point a notion that, in our view, is rather uncontroversial: There is a clear need for, and inherent value in, distributing large data volumes in open untrusted environments for modern societies. This is evidenced, e.g., by the enormous growth of the World Wide Web in the past 15 years. The United Nations Educational, Scientific and Cultural Organization (UNESCO) constitution states that "the wide diffusion of culture, and the education of humanity for justice and liberty and peace are indispensable to the dignity of man and constitute a sacred duty which all the nations must fulfil in a spirit of mutual assistance and concern" [134], which inspired the founding of their Communication and Information (CI) sector: "UNESCO supports actions designed to empower people so that they can access and contribute to information and knowledge flows." [134]. In fact, as of this writing, UNESCO is about to launch their own "World Digital Library" that "will make available on the Internet, free of charge and in multilingual format, significant primary materials from cultures around the world, including manuscripts, maps, rare books, musical scores, recordings, films, prints, photographs, architectural drawings, and other significant cultural materials." [137].

While one might believe that this project obliterates the need for world libraries as we envision them, we rather see it as complementary. A technological solution as we envision it will feature both advantages and disadvantages in comparison to a centralized effort like the World Digital Library. For example, on the one hand, we might not have the possibility of coordinated librarian work that orders, structures, and polices the library. On the other hand, a truly distributed system might avoid legal barriers that could keep valuable content from inclusion, and the general public might well have a different opinion of what should be in the library than its librarians, such that the open approach of world libraries might forestall the marginalization of rare, controversial, subcultural, or any other material. Certainly, one might hold that too much information can be a burden as well as a grace, but as we have seen, there is general consensus that access to information is a valuable good.

Thereby, looking back to our definition in the previous section, most definitional elements of world libraries are hardly contested at all. It is only the preservation of privacy that one might have serious second thoughts about. Accordingly, our defense of the philosophical concept of world libraries will first look closely at the notion of privacy: what is it, why do we need it, and how much is it worth? This will then enable us to consider a few common objections, or rather competing values, that have to be weighed against the value of information sharing in privacy. These can be of philosophical, social/economic, or legal nature. We will argue that the philosophical and social value of information sharing in privacy can outweigh these counterarguments. With respect to law, our arguments can be outlined as follows: Firstly, we acknowledge that internationally, there is much confusion about the legal status of many questions arising in information sharing and privacy. Secondly, we postulate that, following our philosophical and social arguments, the law should reflect the greater good for society, and thus not keep people from using world libraries. Thirdly, we observe that efforts in this direction are underway, and in the meantime, our approach can be tailored to keep people from violating existing law.

1.3.1 Privacy

Privacy is a complex notion. Although we tend to recognize when something is private, the idea is hard to conceptualize. A major part of philosopher Judith Wagner DeCew's book "In Pursuit of Privacy: Law, Ethics, and the Rise of Technology" [36] consists of an effort to define the phenomenon, and after she spends pages on dissecting other authors' definitions, her version does little more than paraphrase the word private as "(...) whatever is not generally — that is, according to a reasonable person under normal circumstances, or according to certain social conventions — a legitimate concern of others." [36, p. 58] This definition seems unsatisfactory because, instead of describing what should not be "a legitimate concern of others", it takes recourse to vague terms like "reasonable person". One way out of this dilemma is the appeal to intuition to fill the void on a case-by-case basis. Still, DeCew goes on to identify three aspects of privacy (cf. [36, p. 75 ff]):

• Informational privacy: Control over information about oneself

• Accessibility privacy: Decisions over who has physical access to one's person

• Expressive privacy: Ability to express one's self-identity or personhood through speech or activity

Herein, we are obviously concerned with the first meaning, informational privacy, in particular. This coincides with narrower definitions like the one by Parent: "Privacy is the condition of not having undocumented personal knowledge about one possessed by others" [102]. Again, terms like "undocumented" or "personal" are rather vague; however, appealing to intuition again, we deem this definition clear enough to work with for our purposes.

The justifications of privacy in the literature are just as diverse as the definitions. They have in common that they intuitively recognize a human (and maybe even animal) need for privacy. They differ chiefly in why we need privacy. Again, we find DeCew's account to be particularly convincing and encompassing, precisely because it does not purport to be exhaustive [36, p. 74]:

(...) privacy acts as a shield to protect us in various ways, and its value lies in the freedom and independence it provides for us. Privacy shields us not only from interference and pressures that preclude self-expression and the development of relationships, (...) but also from intrusions and pressures arising from others' access to our persons and details about us. Threats of information leaks as well as threats of control over our bodies, our activities, and our power to make our own choices give rise to fears that we are being scrutinized, judged, ridiculed, pressured, coerced, or otherwise taken advantage of by others. Protection of privacy enhances and ensures the freedom from such scrutiny, pressure to conform, and exploitation that we require so that as self-conscious beings we can maintain our self-respect, develop our self-esteem, and increase our ability to form a coherent identity and set of values, as well as our ability to form varied and complex relationships with others. The realm of the private thus encompasses the types of information and activity that a reasonable person, in normal circumstances and under relevant social conditions, would view as illegitimate concerns of others owing to the threat of such constraining scrutiny, prejudice, judgement, and coercion.

With this in mind, it is easy to imagine harm done by privacy violations in our context: For example, borrowing books about non-traditional sexual behavior from a library, let alone publishing material of this kind, might be something one desires to keep private, because public knowledge of the fact might subject one to scrutiny, prejudice, judgement, pressure, and coercion. In another vision of privacy violation, exact knowledge of one's media consumption might enable companies to build a personality profile that eases their influencing one's buying decisions and thus helps them take advantage. Patterns like this have become much more dangerous since the advent of large-scale computing. James H. Moor uses the metaphor of greased data to illustrate this danger²:

When information is computerized, it is greased to slide easily and quickly to many ports of call. But legitimate concerns about privacy arise when this speed and convenience lead to the improper exposure of information. Greased information is information that moves like lightning and is hard to hold onto. [87]

This kind of specific informational privacy issue is what is usually addressed in the narrower, and typically European, perspective of privacy as "Datenschutz" or data protection. It finds its most important legal expression in the European Union's "Directive 95/46/EC on the protection of individuals with regard to the processing of personal data and on the free movement of such data" [49], which grants European citizens a wide array of rights for controlling such "greased" data.

² Interestingly, this quality of data being greased is the same quality that raises concerns about copyright issues, as we will discuss in the next subsection.

In European law, the more general right to privacy is recognized under Article 8 of the European Convention on Human Rights [48], to which all EU member states are signatories, and which has been interpreted very broadly by the European Court of Human Rights.

In the USA, privacy in the law has a complicated history (cf. [36]). Since Judge Thomas Cooley's legal "treatise on the law of torts" declared the "right to be let alone" [26] in 1878, privacy was for a long time an interest in tort law only. It was not until 1965 that the Supreme Court began recognizing an apparently distinct right to privacy, which had more to do with exercising freedom in making personal decisions, as in DeCew's definition of expressive privacy. Yet it has never been given formal basic rights status by being included expressly in the Constitution, instead "emanating" in a "penumbral" way from other rights. In spite of this, all three strands of privacy have been increasingly protected by American courts. However, the United States has not adopted a general data protection right comparable to the EU Directive 95/46/EC, relying instead on self-regulation and individual case legislation, and this has led to some controversy between political and commercial entities in the EU and the US.

The level of legal privacy protection varies widely internationally, as is already obvious from comparing EU and US law. Since the advent of the Internet, though, national law cannot always be brought to protect interests like privacy. For example, a web server might be in any country in the world, and this might enable its owner to respect only local law in dealing with the private data of users. This is why it has been argued that "It is insufficient to protect ourselves with laws; we need to protect ourselves with mathematics." ([118], cited in [143]), the idea being that technology like encryption can help protect civil rights. A world library can be seen as a privacy enhancing technology (PET), an attempt at pursuing this goal. Yet Tavani and Moor have made the point [131], and we will strongly affirm their view in this thesis, that PETs alone are insufficient to guard against privacy violations. They have to be complemented by national and international law and policy. Yet in our view, this does not say anything against the usefulness of PETs, especially with the understanding that legal protection in new fields like the Internet often evolves gradually, and technological artifacts might improve the enforceability of legal rights. In fact, while the more technical chapters of this thesis will make clear that perfect privacy through technology is illusory at the state of the art, and might perhaps never be achieved due to theoretical bounds, we will argue in the rest of this chapter that such fallibility of PETs like world libraries might even be seen as an asset.

The following section discusses some common objections to anonymous file sharing that might carry over to world libraries. In order to weigh these against the possibility of private media publishing and consumption, we endorse DeCew's strategy for practical moral and legal assessment [36, p. 79]: "(...) begin with a presumption of a reasonable expectation of privacy (...) while acknowledging that social, legal, and moral considerations may require that the presumption be overridden." DeCew considers this a task "for legislators, judges, and moral and political theorists" [ibid.], and while one might take a slightly more democratic stance, we do not aim to make this judgement, but rather enumerate some arguments that we consider worth taking into account, and show how we believe a world library could actually ease the implementation of such decisions.

1.3.2 Possible Objections

Copyright

The most common objection that has been brought against file sharing networks since the days of Napster has been an economic one: A free exchange of media possibly endangers business models that might be subsumed under the name copyright industry. These are businesses that derive revenue from the control of access to copies of media like books, music, movies, television, or computer games. To give one example, for several years the number of copies of songs downloaded from the Internet without permission from the copyright holder has dramatically exceeded the number of copies sold. While the effect of file sharing on record sales is debated (cf., e.g., [95, 79, 113, 144]), clearly some rights holders see this development as a loss of income and have an interest in fighting it. One of their strategies has been to lobby for increased legal protection that makes sharing copyrighted material partly illegal — as in the adapted versions of the German Urheberrecht [71, p. 296 ff] or the Digital Millennium Copyright Act (DMCA) in the USA [125] — then monitor file sharing networks for offenders, and sue them accordingly. One of the distinguishing features of world libraries is their focus on the preservation of privacy. Protecting the link between identities and actions of users in the network might make this strategy more expensive, or even illegal³. Looking at this case with DeCew's method of presuming a reasonable expectation of privacy, we have to weigh the benefits of informational privacy against the potential harm done. It is obvious that the copyright industry suffers from the development of technology that turns media, which had been expensive to reproduce in former times, into greased data.

³ Some legal constructs make circumvention of protection measures illegal, even if the behavior protected against would have been legal in the first place — copy protection mechanisms in the DMCA are one example.

The question is: does this outweigh our expectation of privacy?

Grodzinsky and Tavani have examined this conflict in their recent paper on "Online File Sharing: Resolving the Tensions between Privacy and Property Interests" [61]. Drawing on Helen Nissenbaum's view of "Privacy as Contextual Integrity" [93], they conclude that "(...) P2P networks are contextually private situations and, as such, protecting privacy (...) is essential." [61]. In Nissenbaum's language, this means that DeCew's "presumption of privacy", a concept that Grodzinsky and Tavani also explicitly endorse, is violated in a way not justified by the ends of protecting copyright holders' interests. Furthermore, the authors contend that "(...) with one or more of the distribution models (...) we can both protect privacy interests of the individuals and help ensure that property owners' interests are also reasonably preserved." [ibid.]. The distribution models mentioned in the paper are comparatively conservative, offering music downloads for payment. While these models have gained popularity in recent years, we will see that there are new business ideas that depart even more radically from the status quo and are therefore even more immune to the economic harms of free information exchange.

We may even question whether ensuring privacy in media exchange networks would influence the economic development at all. The industry has sued thousands of people worldwide, and while there has been a slowdown in the growth of file sharing, it could be attributed to different factors, like the adoption of novel business strategies that make official channels an attractive alternative. It is very questionable, considering the sheer ongoing extent of peer-to-peer data exchange, whether legal proceedings can put an end to a practice held justifiable by millions of users [10]. Thus, sacrificing the privacy of even more people in order to ease prosecution of copyright offenders seems questionable, too.

Lawrence Lessig, a professor of law at Stanford Law School, points out [75] that copyright had been devised to grant booksellers a limited exclusive right for printing original content in order to stimulate the publishing of creative works. He argues that today, on the contrary, copyright is abused in ways that endanger cultural diversity by keeping people from accessing and creating media — because most creative works are based on previous work in some way. He goes on to say that while societies have to find a way to pay creators, it is a historically normal effect of technological innovation (citing examples like radio and film) that old business models stop working. And indeed there are alternative economic models that might ensure that creative work remains attractive without requiring exclusive control over copying. These include compulsory collective management of the performance right for making a work available, which a French feasibility study found workable [8], other

compulsory licensing schemes, sponsoring and advertising, and several other variants like the street performer protocol [15], in addition to the more established pay-per-download and physical media based approaches. The music industry, as the first copyright industry significantly affected by the greased data phenomenon, has largely compensated its losses by developing more genre-specific income generation channels like concert tickets or merchandising, to complement the above generic strategies.

A more thorough discussion of the "Economic and cultural effects of file sharing on music, film and games" can be found in a comprehensive study commissioned by the Dutch Ministries of Education, Culture and Science, Economic Affairs, and Justice [46]. While the authors acknowledge that file sharing "entails a transfer of welfare from producers to consumers" [46, p. 3], "The research shows that the economic implications of file sharing for welfare in the Netherlands are strongly positive in the short and long terms." [ibid.].

To summarize, we are witnessing a change in cultural business models spurred by technological innovation. While there are winners and losers in any such change process, there is little hope that this process can be halted or reversed by suing users, and even if there were, it is not clear how this would justify giving up the important value of privacy for many users. Fortunately, there have been attempts at more holistic approaches: New business models have been proposed and are being carried out, even from within the industry, and copyright law reforms addressing the problems outlined have been proposed both in the USA [75] and in Europe [71]. While these developments are ongoing, it is important to note that world libraries per se are not unlawful in any jurisdiction we are aware of. The Owner-Free File System (see Section 2.3) has even been designed to keep its users from infringing existing copyright legislation. It may well be advisable to make the system adaptable to the ongoing legal battles surrounding copyright, for example by adhering to safe harbour policies as laid out in the DMCA.

Further-Reaching Transgressions

Other, more fundamental rights might be at stake when we allow for private publishing and consumption of media. This might include the violation of other people's privacy, or the often-cited examples of child pornography or hate speech. Naturally, there are cases that justify the abridgment of the publisher's or downloader's privacy. Observe, however, that societies have strong incentives to prosecute such behavior, that they are usually willing to invest significant resources (e.g. police work) in order to end it, and that it is comparatively rare. Like any other real-life communication system, say the postal service or telephones, world libraries can be abused, but also policed, given enough political will and resources. It can be argued that criminals have more secure (and expensive) ways of communicating anyway. In fact, it is a recurring topic of this thesis that there is no perfect privacy, and we can merely hope to make privacy violation so costly that it deters most attackers. We have to rely on non-technical measures, like the law, to make sure that very powerful adversaries, like nation states, do not abuse their power. We strive for a balance that makes it very efficient for legitimate users to privately publish and peruse media in world libraries, and very costly to violate privacy illegitimately, yet retains societies' right to invest significant means in order to keep individual offenders from committing crimes. Because such grave misuses are relatively uncommon, the total social cost stays low.

Chapter 2

Existing Systems/Case Studies

Returning is the motion of the Tao.
Yielding is the way of the Tao.
The ten thousand things are born of being.
Being is born of not being.

[50, verse 40]

In this section, we discuss legacy peer-to-peer systems that have been designed and actually implemented in order to permit file sharing while upholding some kind of privacy. Most of these systems actually try to provide anonymity, which is one technological approach to privacy protection. See Chapter 4 for a formal definition of anonymity. Note that we purposefully exclude purely theoretical proposals, censorship-resistant publishing systems, general purpose anonymity systems that do not cater to typical file sharing needs such as searching (e.g. pure MIX systems [17, 32]), low-latency anonymity networks, and darknets. This is because, as justified in detail in Section 1.2, we do not consider these viable candidates for world library implementations. Unfortunately, none of the systems described herein has succeeded as a world library either. We hope to give some insight into possible reasons. At the same time, this provides us with an opportunity to describe some popular approaches to anonymization by looking at the networks that implement them. One major technique we do not cover in this way is the dining cryptographers family of methods discovered by David Chaum [18]. Although it provides perfect anonymity in some sense, it does not scale to large networks because it requires communication overhead linear in the size of the network for every bit that is transmitted. There have been attempts at making the approach scale by splitting the network into small cliques that serve as anonymity sets [122, 54], but these have not led to the creation of a publicly used file sharing

network. Another key technique in anonymity is onion routing, which has been popularized by the Tor network [41]. In this general scheme, a node indirects its traffic through a circuit of freely chosen servers by using layered encryption. Even though this technique has not yet been used successfully in a file sharing approach, we do not rule out this possibility. In particular, we expect that a true peer-to-peer implementation of this principle would be necessary for it to be a component in a world library. However, this in turn necessitates a scalable and secure network information distribution service. In Chapter 3 we discuss approaches to DHT security that can be applied to this important subproblem.

There have been few attempts at surveying anonymous file sharing networks [21, 82, 119, 44]. One explanation might go as follows: As we will see, many implementations do not originate in academia. What is more, as we have argued in Chapter 1, media pressure and social dynamics have contributed to a certain air of borderline legitimacy about file sharing. Chothia and Chatzikokolakis [21] relate the case of Isamu Kaneko, the developer of the WinNY [126] application, who was actually arrested and tried for assisting copyright violation in Japan. It seems understandable that stories like these, in combination with media hostility, might not only keep some researchers away from such a politically loaded field, but also instill a certain kind of paranoia in anonymous file sharing developers. We have personally observed very cautious behavior when we tried to get in contact with some program authors in the course of our research. Social pressures like these can not only account for both our and other authors' difficulty in collecting information, but might also figure in the relative lack of success that characterizes all the networks we survey.

2.1 Ant Routing Networks

In accordance with a general trend in massively distributed systems, the early designs feature unstructured networks [91, 3, 127], while more recently, DHTs have been employed in some applications [126, 94]. Unstructured approaches form an ad hoc network by connecting new nodes to arbitrary peers. The first peers that a new node needs to get bootstrapped are usually learned in some out-of-band way, for example from a web cache. Anonymity, or rather pseudonymity, is to be achieved by separating public underlying network addresses (IPs) from private IDs. While a node needs to know its neighbors' IPs in order to uphold a connection over the Internet, these are not directly used for routing in the overlay network. Instead, these networks use variants of so-called ant routing [14]. The following paragraph contains a description

of the process, simplified enough to fit all ant routing systems we consider: MUTE, ANts, and StealthNet [91, 3, 127]. When querying for some content, a query packet is generated containing the secret ID of the searcher. This packet is then broadcast to all neighbors in the network. A receiving node notes the reply ID and the direction the packet came from, and checks its local data for availability of the queried content. If none is available, it rebroadcasts the packet to all of its neighbors, unless some kind of time-to-live (TTL) mechanism has expired. If it finds some content locally that fits the query, it returns a result message by sending it to the reply ID in the direction the query came from. An intermediate node whose ID does not fit the reply ID should have the reply ID in its routing table, because it must have rebroadcast a query packet from this ID before; so it sends the packet on in the direction the query came from. The routing tables are like pheromone scent traces left by ants who mark their way home, hence the name. Since queries are broadcast, there might be many directions that packets from a certain ID have come from. In this case, there is some mechanism to choose one or several of these, for example a random choice biased towards directions where more packets from this ID originated. Once initial "scent marks" have been laid between the searcher and the file provider, these can be used and strengthened in the actual file request and download process.
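To make the scent-trace metaphor concrete, the following minimal sketch (our illustration, not actual code from MUTE, ANts, or StealthNet; the class and method names are ours) keeps the per-node forwarding state: observed queries strengthen a trail for their reply ID, and replies are forwarded to a neighbor chosen at random, biased towards directions that carried more packets for that ID:

    import random
    from collections import defaultdict

    class AntRouter:
        def __init__(self):
            # reply ID -> {neighbor: number of query packets seen from it}
            self.scent = defaultdict(lambda: defaultdict(int))

        def note_query(self, reply_id, from_neighbor):
            # strengthen the "pheromone trail" towards the searcher
            self.scent[reply_id][from_neighbor] += 1

        def route_reply(self, reply_id):
            # pick a direction, biased towards stronger trails;
            # return None (drop) if we never saw this ID
            trails = self.scent.get(reply_id)
            if not trails:
                return None
            neighbors, weights = zip(*trails.items())
            return random.choices(neighbors, weights=weights)[0]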

Ant routing has several interesting properties. Notice first that, as in the Crowds [111] indirection scheme, a node can never really be sure whether some packet originated at its neighbor node x, unless it controls the entire neighborhood of x. This is because x might just be relaying the packet for some other neighbor. Although there are attacks, like the predecessor attack [142, 1], that can establish strong probabilistic confidence in x being the source of a packet, or, what is more, being the node that owns some ID y, at least some deniability or possible innocence [111] is retained regarding this suspicion. Unfortunately, there are many details that complicate a securely working implementation of this process. One of them is TTL design: A straightforward TTL counter implementation forfeits deniability because counting backwards tells an attacker how long the packet has been traveling. Arbitrary TTL values open up a playing field for DoS attacks since a node might decide to flood the network with its query. Actually, the flooding approach to search is one of the greatest weak points of unstructured networks because it allows for cheap DoS attacks, even if implemented correctly. Jason Rohrer discusses many more challenges to a working TTL solution on the MUTE website [91]. Unfortunately, his conclusions seem slightly misled: "(...) the branching factor causes the number of distinct paths to grow exponentially as a message gets farther from its source, essentially outweighing any mechanism that drops a message probabilistically at each hop." [91]. This claim, repeated in more detail later on, can easily be refuted by counterexample: Say, e.g., that every node x decides on forwarding the query to each neighbor independently with probability q/|N(x)| for some 0 ≤ q < 1, where |N(x)| is the number of neighbors x has. Let Z be the number of neighbors that x forwards the query to, and observe that E[Z] = q by the linearity of expectation. The number of nodes queried by a probabilistic flooding process where nodes decide independently of each other can be modeled as a branching process (cf. e.g. [60, p. 150ff]), at least when we are interested in an upper bound, since many nodes will be queried more than once in the real process. Then, if Zₙ is the number of nodes queried in the n-th generation,

P[Zₙ > 0] = P[Zₙ ≥ 1] ≤ E[Zₙ] = (E[Z])ⁿ = qⁿ,

where we use the Markov inequality and a lemma from Grimmett and Stirzaker's monograph [60, p. 152]. We see that the probability of survival falls exponentially with each round. While this does not imply the practical usefulness of this flooding algorithm, it shows that it is possible to design a purely randomized TTL mechanism that only queries a small number of nodes with high probability. Indeed, Roberto Rossi's ANts file sharing application [3] routes in a very similar way.
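This bound is easy to check empirically. The following simulation sketch (ours, with assumed parameters and graph representation) implements exactly such a purely randomized TTL: every node forwards the query to each neighbor independently with probability q/|N(x)|, and the flood dies out quickly with high probability:

    import random

    def probabilistic_flood(graph, source, q=0.9):
        """graph: dict mapping a node to its list of neighbors.
        Returns the set of nodes reached. With forwarding probability
        q/|N(x)| per neighbor, the expected offspring per node is q,
        so the flood survives n rounds with probability at most q**n."""
        reached = {source}
        frontier = [source]
        while frontier:
            nxt = []
            for node in frontier:
                p = q / len(graph[node])
                for peer in graph[node]:
                    if peer not in reached and random.random() < p:
                        reached.add(peer)
                        nxt.append(peer)
            frontier = nxt
        return reached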

MUTE's search algorithm is more involved and contains a deterministic utility counter mechanism that is "sandwiched" between randomized parts in order to counter obvious attacks. This strategy even defends against statistical attacks in that the randomness used is fixed at program start [91, 1]. A further basic difference between MUTE and ANts lies in the end-to-end encryption that ANts provides. Rohrer argues that end-to-end encryption in an anonymous network of this kind is useless since every node could deploy a man-in-the-middle attack in the absence of an out-of-band trust management system. Rossi seems to accept this argument, yet holds that this measure may still enhance privacy for the user. An online article by Michael Ingram articulates these philosophical differences in more detail [66]. Tom Chothia has analyzed MUTE using the π-calculus [20] and discovered a vulnerability that has since been removed by making nodes use a public key as their ID and sign their messages. This vulnerability carries over to all the ant routing networks we consider and should therefore be fixed accordingly in all of them.

Rshare [127, 43] is a similar network designed by Lars Regensburger, whose client software has since been renamed to StealthNet. Several security flaws in it have been found by the authors of GNUnet [43], and most of them have subsequently been removed. Many attacks on these unstructured networks, including timing attacks that are very costly to defend against [76], use the fact that a node is free to choose its neighbors, and can thus connect to many peers without being noticed. This is very similar to a Sybil attack [42], but even cheaper for the attacker in terms of physical resources. Although this attack vector is well known [43, 21], to our knowledge there are no published attempts at practically exploiting timing attacks in a deployed anonymous file sharing network.

MUTE is written in C++, ANts in Java, and StealthNet in C# using Microsoft's .NET framework. This makes both MUTE and ANts natively work on the leading platforms Windows, Mac OS, and Linux, while StealthNet is restricted to Windows. Alternative client programs for Unix-based systems exist, however, using the Mono framework instead of .NET. All three applications are Open Source licensed, with StealthNet and ANts using the GNU General Public License and MUTE being in the public domain, yet the source code is poorly documented in all three cases, impeding access and analysis. While StealthNet is still regularly updated, development seems to have ceased for years with both MUTE and ANts. This is underscored by our observations: even though we could technically connect to the MUTE and ANts networks in our trials, StealthNet has been the only system reviewed in this section that actually allowed for downloading files. In contrast to the other networks, StealthNet seems to carry a moderate amount of traffic and content today. Download rates vary a lot, but seem to be tolerable at least for moderately sized files. In our view, all three networks have struggled with debugging, usability, and adoption, and while MUTE is arguably the best known internationally, StealthNet seems to be the only one that has retained some relevance today, particularly for German users. Interestingly, all three networks have been developed by solitary authors without substantial organizational support or financing.

We observe that although the initial idea for these systems is rather simple, network details with regard to both security and efficiency turn out to be difficult to implement and test, owing to the distributed nature of the system. As described above, ant routing in unstructured networks has unresolved security issues. Also, because queries only reach parts of the network for performance and scalability reasons, it cannot guarantee finding existing content in the network, and both search and transfer can be extremely inefficient due to the high degree of indirection and unpredictable network structure. Because every data packet is routed through a possibly great number of nodes between the sender and receiver, nodes exhibit a high average bandwidth load that is particularly damaging because of the asymmetric nature of most end user Internet service provider (ISP) connections (cf. [100]). We conjecture that it is this issue in particular that keeps unstructured anonymous file sharing networks from scaling, although additional research will be necessary to confirm this.

Furthermore, usability concerns, such as user interface design, transfer speed, or ease of deployment, seem to be crucial in the adoption process. Finally, the reclusive behavior of the authors, the lack of a development and deployment community, and the social marginalization of the stated or perceived motivation can be conjectured to be partly responsible for the scarcity of end user support.

2.2 Other Anonymous File Sharing Networks

WinNY, , and [126] are anonymous file sharing systems that have become very popular in Japan, with download and usage numbers far exceeding those of any comparable system. Unfortunately, access to infor- mation about them is difficult due to language barriers, legal threats, and the authors’ reliance on “security by obscurity”. We have therefore not been able to find out details about the systems’ design and since neither original sources nor scientific material were available to us, had to rely on Wikipedia for the following information. WinNY has been developed by Isamu Kaneko, with its name alluding to the previously popular (in Japan) WinMX file sharing system that did not feature anonymization. Two WinNY users have been arrested by the Kyoto police for sharing copyrighted material, though al- legedly the identification process involved the application’s non-anonymous forum feature. Kaneko, too, has been arrested and convicted of assisting copyright violations [141]. This serves to explain why the authors of the next-generation products Share and Perfect Dark chose to remain anony- mous. Share users have also been arrested both for copyright infringement and, very recently, for uploading child pornography [140]. In light of these de- velopments, it seems that Share’s anonymity has been broken, and this might have been one of the reasons behind the emergence of Perfect Dark. Perfect Dark is a very recent system that is in active development and presupposes nontrivial bandwidth and hard drive space requirements. It is said to rely on several DHT-like structures and use strong cryptography throughout [139]. In our opinion, the most interesting aspect of these Japanese file sharing networks lies in their apparent popularity. Though we did not gain access to their technical internals, it would seem improbable that they employ break- through technology that could not be found in the European/American al- ternatives. Just as the ants-based systems reviewed above, these networks seem to be efforts of solitary authors without broad organizational backing. They are available for the Windows platform only, and, reportedly, “The Share GUI requires a manual or tutorial to understand” [140], so we cannot assume superior usability either. Furthermore, the legal pressure on authors

and users of anonymous file sharing systems is arguably even higher in Japan than in Europe or the United States. However, high speed Internet access for end users seems to be much more available in Japan; indeed, Perfect Dark requires a minimum upload bandwidth of 100 KB/s, which is far from common in Europe today. This provides further evidence for our conjecture that the asymmetric nature of many potential peers' Internet access is a main hindrance to the successful deployment of many anonymous file sharing schemes.

"GNUnet is a framework for secure peer-to-peer networking that does not use any centralized or otherwise trusted services. A first service implemented on top of the networking layer allows anonymous censorship-resistant file-sharing." [53]. As already apparent from the website's characterization, GNUnet shares with Freenet the focus on censorship-resistance, which implies a strong adversary and a priority of security over usability. We have discussed in Section 1.2.3 why we exclude this kind of system from closer examination. We mention GNUnet here anyway because, firstly, it has expressly been designed for file sharing, and secondly, it is one of the very few examples in anonymous file sharing that could be said to be rooted in academia. Indeed, some of the authors of GNUnet are recognized computer security researchers, and several papers related to its design have been published [7, 62, 51]. We refer to these publications for technical details. Dennis Kügler has identified several attacks on the anonymity and censorship-resistance of GNUnet [72], which mostly rely on performance-enhancing features that set GNUnet apart from Freenet. GNUnet is open source/free software under the GNU General Public License, comparatively well-documented in many languages, and available for most popular operating systems, although we did not succeed in compiling the source code ourselves. In contrast to most systems surveyed, GNUnet is a community effort. Thus it seems that many traps that might have kept other systems from succeeding have been avoided with GNUnet. Still, the developers admit that "The network is still fairly small and downloads can be rather slow. Only little content is available, thus it is not always a bug if you get few (or no) search results." [53]. GNUnet organizes peers in an unstructured network and uses these peers for indirection. It therefore suffers from the same bandwidth bottlenecks and attack vectors as the ant-routing based systems detailed above. In particular, although GNUnet has been subjected to in-depth scientific analysis, the ad hoc nature of the network does not allow for meaningful bounds on performance and security.

"Nodezilla is a secured, distributed and fault tolerant routing system (aka Grid Network). Its main purpose is to serve as a link for distributed services built on top of it (like chat, efficient video multicasting streaming, File Sharing, secured file store ...)" [94].

Even though Nodezilla has existed since at least 2004, it appears to have found little reception in both scientific and popular media. Nodezilla's internal workings are described in rather high-level fashion on their website, and it appears that the network uses concepts similar to those of the OFF system, relying on a proprietary DHT implementation. We have been able to connect to the network using their elegant Java front-end as well as find some content, yet did not succeed in retrieving any of it. Nodezilla still appears to be sporadically updated. The latest application built upon it is a decentralized storage system for torrent files for the Bittorrent network, which might help censorship-resistance, but not privacy.

Rodi [114] implements a very different approach to privacy. From our understanding of the documentation, Rodi might be described as a framework rather than an application. It provides search capabilities for IP ranges, encryption functionality, and anonymity methods. Rodi is interesting in that it consciously avoids large scale indirection, with a stated goal of efficiency over strong anonymity. The main anonymity mechanism employed is UDP data transport with return IP address spoofing. Using a trusted node as a so-called bouncer, the sender is still able to establish bidirectional communication with the receiver. Large data volumes are sent directly via UDP to the receiver, while less data intensive control signals can be sent back through the bouncer. This allows for practical sender anonymity, yet opens up a number of possible attacks, e.g. because the bouncer has to be trusted. Rodi's author explicitly trades theoretical security for practical efficiency. In particular, all security measures are optional and not enabled in the application by default. In our trial we have been able to connect to other nodes, yet could not find any content. While the idea of IP spoofing is appealing because it uses features of the underlying network and avoids unnecessary indirection, Rodi as it is seems to lend itself more to experimentation than large scale library use.

Marabunta [84] is an anonymous peer-to-peer platform originating at the University of Zaragoza. Unfortunately, virtually all documentation, as well as the program itself, is in Spanish. It seems to implement an ant-routing-like approach and uses UDP. We did not succeed in connecting to the network, but this could be attributed to the Spanish-language user interface.

2.3 The Owner-Free File System

From a discussion on the OFF IRC channel [136]:

So originally the OFF System was a mental exercise to create discussion about copyright issues. The idea being that there are

[some things] that have always been legal in society, that are transitioning to being though[t] of as less wholesome. The trigger of course was the "file sharing" lawsuits. When I grew up there were public service announcements (PSAs) on children's TV that used to say... "It's nice to share. :-)" (...) that was a campaign to create better children and future citizens by getting people to "play together nicely". It went along with other programs like "public television" and a rebirth of local libraries and reading rooms for [children]. But enough about the olden days. By the time the RIAA had sued the first 4 students, it was clear that it was no longer nice to share. and that public libraries were no longer [an] obvious public good. (...) Since it has been legal for 300 years to donate your old paper books to the library. Has anyone NOT done so at one time or another? Now it is commonly thought that paper books are going away, along with plastic discs. And all media will become "purely digital". so in that hypothesized future, can we no longer donate our purchased media to a public library? That seems like a "bad thing" to me. I'd rather be on the side arguing that public libraries are a good thing. since NOBODY hates a librarian.

As the above quotation by Bob Way, the initiator of the Owner-Free File System, suggests, OFF’s philosophy is probably closest to the spirit of a world library among all surveyed systems. Indeed, our collaboration with the creators of OFF has substantially informed this thesis. Accordingly, we discuss OFF in more detail than the other systems in this chapter, as a case study in the implementation of a world library. Beginning with a technical introduction to OFF’s design, we go on to describe some attacks on and weaknesses of the OFF system. Assessing the scalability of the network suggests some ideas for remedies to these issues, before we conclude with an overall assessment of the approach and its implementation.

2.3.1 Introduction to the OFF System

"The Owner-Free [File System] has often been described as the first brightnet; A distributed system where no one breaks the law, so no one need hide in the dark." [97]. OFF builds on the idea of multi-use encoding, which has been articulated since at least the turn of the century in a number of ways by various people [130, 135, 81]: Files are partitioned into blocks of equal size, and a bit-wise XOR (⊕) operation is performed on tuples of these blocks so that the resulting blocks can be said to have multiple uses or meanings.

In particular, the authors argue that this idea renders meaningless or inconsistent any theory of ownership over digitally represented media [97]: Assume that person A owns block X, and B owns block Y. Then who owns Z = X ⊕ Y, a block that is clearly derived from both X and Y? Since X = Z ⊕ Y and Y = Z ⊕ X, there is no consistent nontrivial way of declaring ownership over blocks, numbers, or digital data in general. The OFF system is implemented as a distributed storage for multi-use blocks, and the authors contend that because every block that is uploaded, downloaded, or stored in OFF has multiple, and possibly infinitely many, meanings, such an act per se cannot be prohibited by law. Before we discuss the implications of this argument, let us provide some more technical details.

An OFF client manages a cache of blocks. Blocks consist of exactly 128 KB of data and are identified by their 160-bit (20 byte) SHA-1 hash values, while node IDs are SHA-1 hashes of the nodes' RSA public keys. So, as in a DHT (cf. Chapter 3), block names and node IDs are elements of the same space, with the Hamming distance as distance function. (The developers have said that they were not aware of the DHT concept when they first conceived this system.) Caches are usually seeded with a few random blocks. Storing or inserting a file consists of the following process: The file is partitioned into blocks, and for each block X, the result of X ⊕ Y ⊕ Z is added to the cache, where Y and Z are random preexisting blocks from the cache. (Actually, the size of the tuple is not restricted, but 3 is by far the most commonly used.) Furthermore, one descriptor block (or several chained descriptor blocks for very long files), containing the hash values of the various block triples (Y, Z, X ⊕ Y ⊕ Z) in order, is generated and stored just like the file blocks. Finally, a URL containing file metadata and the hashes of the block triple that results in the descriptor block when XORed is returned to the user. Stored blocks may then be dispersed, i.e. pushed to three nodes that are close to their hash value. This involves the concept of a bucket: the part of the cache that a user altruistically donates to the network. It is supposed to contain blocks that are close by Hamming distance to the node's ID. A dynamically adjusted bucket radius is used to decide when to accept a block into the bucket. Blocks that do not fit into the bucket are pushed to other known nodes closer to the respective block ID. The actual DHT protocol is rather complex and ad hoc and tries to strike a dynamic balance between issues as diverse as block availability, user privacy, and efficiency. It even includes a trust system, though its use is disabled by default at the time of writing. Unfortunately, it is also poorly documented.

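For concreteness, the insertion and retrieval processes can be condensed into the following sketch (ours, with simplifying assumptions: a single in-memory cache, tuples of size 3, no chained descriptor blocks, and the padding of the last block left to the caller; the real system keeps file metadata in the URL):

    import hashlib
    import random

    BLOCK = 128 * 1024  # OFF blocks are exactly 128 KB

    def xor(a, b):
        return bytes(x ^ y for x, y in zip(a, b))

    def store(data, cache):
        """cache maps 20-byte SHA-1 digests to blocks and must be seeded
        with at least two (random) blocks. Returns the descriptor: the
        hash triples (h(Y), h(Z), h(X xor Y xor Z)) in file order."""
        descriptor = []
        for i in range(0, len(data), BLOCK):
            x = data[i:i + BLOCK].ljust(BLOCK, b"\0")  # pad last block
            y, z = random.sample(list(cache.values()), 2)
            r = xor(xor(x, y), z)  # the new multi-use block
            cache[hashlib.sha1(r).digest()] = r
            descriptor.append(tuple(hashlib.sha1(b).digest()
                                    for b in (y, z, r)))
        return descriptor

    def retrieve(descriptor, cache):
        """Reconstruct the file: X = Y xor Z xor (X xor Y xor Z)."""
        return b"".join(xor(xor(cache[hy], cache[hz]), cache[hr])
                        for hy, hz, hr in descriptor)

Seeding such a cache with a few uniformly random blocks and round-tripping a file through store and retrieve also illustrates the randomness argument made below: every block that store adds is the XOR of a file block with random blocks and hence itself uniformly random.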

All network communication is unencrypted and uses a variant of the standard HTTP protocol. After downloading the blocks that can be used to make up a file, that file has to be explicitly retrieved by the user, i.e. reconstructed using the URL, the descriptor block(s), and the actual file blocks. The authors claim that URLs are the only kind of information in the OFF system that directly carries meaning, or user intent. That is why, although in the current system URLs on users' machines can be searched much as in Gnutella-like file sharing systems, this aspect of the system is planned to be externalized in some form, such as URL web catalogues or Tor hidden services; the idea being that while this data directly carries semantics, it is both low in bandwidth usage and, as a form of link or pointer, non-censorable in many jurisdictions. Because efficient distribution systems for this kind of data already exist in the non-anonymous Internet or other anonymity services, the developers plan to restrict OFF to "owner-free data", that is blocks, in the future.

OFF is the product of a small and tightly knit group of enthusiasts, originating in a work of "immersive interactive fiction" [132] initiated by Bob Way that ended up having real-world consequences. The whole story is dramatic and colorful enough for a novel and therefore out of scope for this thesis; we refer the interested reader to the referenced website. While there are a number of supporters who communicate using an IRC channel [98], only one developer continuously works on the client application, which is written in C++. A second client is currently being written in Ruby by Philipp Claves. The client application can be deployed on all major operating systems, and is stable and usable enough at least for early adoption. Compared to the size of the network, a large amount of content has been uploaded.

2.3.2 Attacks on OFF

Questioning the Assumptions

There are a number of interesting questions brought up by OFF's concept of multi-use data. Indeed, it seems difficult to defend meaning, ownership, copyright, or liability for a single block in the OFF network, for the reasons outlined above (more arguments follow below). This has another, more subtle implication that has been intended by the authors. As the name suggests, US copyright law relies heavily on the notion of a copy. In particular, it protects copies of a work from being made, distributed, made available, etc. However, when downloading media from OFF, no actual copy has to be made, even when perusing that media.


For example, for viewing a video, it might be enough to stream that video to the screen, while only ever retrieving a portion of the work at a time. In the extreme case, the XOR operation might be performed bit-wise, such that only tiny fragments of a copy ever appear in the user's computer. While these arguments seem very legalistic, in a sense that is their point: to show the inappropriateness of yesterday's law for the digital age. However, the approach might be seen as insufficient in two respects: Firstly, it seems predictable that international courts and lawmakers would simply dismiss the technical specifics of a system like OFF, instead focusing on the intentions of users. Secondly, it says nothing about our stated goal of privacy. In a sense, these two points coincide: Privacy could be achieved if the intentions of users were hidden. This observation suggests two questions:

1. Are blocks really meaningless data?

2. What does the observable behavior of an OFF client imply about the intentions of its user?

Both questions have to be made more concrete before we can begin to answer them scientifically. In order to clarify them, we will employ some concepts from Kolmogorov complexity [78]. We will use an informal approach in order to avoid excessive definitional overhead and point the interested reader to Li and Vitányi's essential monograph [78] for mathematical details. All our arguments are easily formalized, yet to our mind gain readability by sticking to the informal notions.

Definition 2 The Kolmogorov complexity K(Y | X) of a string Y given a string X is the length of a shortest program written in a fixed language, say, Haskell, that outputs Y on input X. Additionally, let K(Y) = K(Y | ε) and let ⟨X, Y⟩ denote a standard string encoding of the ordered tuple (X, Y).

In the above definition, ε denotes the empty string. Furthermore, it is easy to prove that K(Y | X) is not computable and that the choice of programming language only affects Kolmogorov complexity by an additive constant (because we can write a constant size interpreter for any Turing complete language in any other). In order to minimize constants and thus make our observations more meaningful with regard to a fixed block size of 128 KB, it would be advisable to formalize our arguments using a language that allows for small descriptions of strings.

The Semantics of Static Block Sets

In the framework of Kolmogorov complexity, and concerning a single block, we may restate the first question above as follows: Given a random cache block X and any file block Y, how probable is it that K(Y | X) ≪ K(Y)? In other words, is X useful for constructing Y in the sense that, given X, we need much less information to express Y than without X?

A single block in an OFF cache is usually either random or has been created by XORing some file block with at least two other blocks. By random we mean drawn according to the uniform distribution among all blocks of length 128 KB, which is equivalent to setting each bit of a block to 0 or 1 independently with probability 1/2. It is easily verified that for any block X, X ⊕ Y is random when Y is random. By induction, every block in the cache is random because it is the result of an XOR operation with a random block. Moreover, by a simple counting argument, most blocks are incompressible (cf. [78, Theorem 3.3.1]). That is, with high probability (w.h.p.), K(X) ≥ |X| for a random block X, and thus for any block in the cache. Suppose that K(Y | X₁ ⊕ X₂ ⊕ Y) ≪ K(Y) for cache blocks X₁, X₂ and any file block Y. This means that there is a small program P (much smaller than K(Y)) that returns Y, given X₁ ⊕ X₂ ⊕ Y. Notice that X₁ ⊕ X₂ is random and thus incompressible w.h.p. However, X₁ ⊕ X₂ = (X₁ ⊕ X₂ ⊕ Y) ⊕ Y. Hence, we can write a program that includes (X₁ ⊕ X₂ ⊕ Y), calls P as a subroutine to return Y, and calculates X₁ ⊕ X₂ by a simple XOR of both blocks. This shows that K(⟨X₁ ⊕ X₂, Y⟩) ≤ K(X₁ ⊕ X₂ ⊕ Y) + |P| + c, where c is a small constant. But because X₁ ⊕ X₂ ⊕ Y has the same length as X₁ ⊕ X₂, which is incompressible, and |P| is much smaller than K(Y) (a requirement on formalizing this notion of "much smaller" is that it abstracts from small constants like c), we have that K(⟨X₁ ⊕ X₂, Y⟩) ≪ K(X₁ ⊕ X₂) + K(Y) w.h.p. However, up to a fixed additive constant, K(⟨X₁ ⊕ X₂, Y⟩) = K(X₁ ⊕ X₂) + K(Y | ⟨X₁ ⊕ X₂, K(X₁ ⊕ X₂)⟩) [78, Theorem 3.9.1]. This implies K(Y | ⟨X₁ ⊕ X₂, K(X₁ ⊕ X₂)⟩) ≪ K(Y). Yet in general (unless Y were chosen by an adversary knowing X₁ ⊕ X₂), X₁ ⊕ X₂ and Y are unrelated: X₁ ⊕ X₂ is random and independent of Y, so knowing X₁ ⊕ X₂ does not give us information about Y. This line of reasoning demonstrates that w.h.p., single OFF cache blocks do not carry information about inserted file blocks in a strong mathematical sense.

Another way of interpreting the first question posed above is to look at the multi-use property of blocks. How many uses does a block have? Searching the live network for URLs revealed a total of 3781 files. We tried to acquire descriptor blocks for these files and counted the number of files each block occurred in, taking into account a total of 278850 block occurrences. The resulting distribution, depicted in Figure 2.1, bounds the actual number of uses per block from below, since we could not acquire all descriptor blocks and may have overlooked an unknown number of URLs. Indeed, the average number of observed uses is about 1.73, whereas in theory, the average has to converge to 3: For every new block, there are three blocks (including the new one) that receive another use. The graph shows that while most blocks have only a single (known) use, there are some that are used quite often. The maximum observed number of uses is 16. So both in terms of Kolmogorov complexity and of multiple use, we can affirm that single blocks could indeed be considered meaningless or owner-free.

[Figure 2.1: Number of uses found for blocks in the live OFF network]

Of course, when looking at the meaning of blocks, we do not have to restrict ourselves to single blocks. Sets of blocks cannot be said to be meaningless in general in any of the above senses. Let us take the extreme example of the set S of all blocks that are necessary to retrieve some file F. In the conditional Kolmogorov formulation, again using an informal notation that is straightforward to formalize, K(F | S) ∈ O(1), because a fixed program can be used in conjunction with a fixed-size URL to retrieve the file.

So a specific set of blocks does help to retrieve the file; in fact, that is the point of a file sharing system like OFF. Also, it is highly unlikely that S can be used to retrieve another file G instead of F, so S cannot be said to have multiple uses: This is because for every block in F, S contains two anonymizing blocks that existed before this file block was stored, and one result block that was introduced into the cache only when F was stored. While anonymizing blocks overlap between files through block reuse, result blocks are introduced newly for every file and, as we have argued above, are random, though not necessarily independent. Unless G is a substring of F, F and G differ in at least one bit. Then the file block in G that this bit is a part of does not lead to the same result block for F and G if the same anonymizing blocks are chosen. However, if we fix this result block from G as X_G, the probability of X_G occurring in S simply by coincidence is at most |S| · 2^(−128·1024·8). Notice that this argument does not hold when G is stored first, and its blocks are then used as anonymizers for storing a larger file F. In this case, which can be forced to occur by the user, G can actually be hidden in S using what could be termed a "steganographic store".

So we cannot expect specifically chosen block sets like S to be meaningless in these strict interpretations of the term. Yet this does not imply a negative answer to our privacy concerns. In practice, an attacker is not likely to attain precisely targeted sets like S together with a URL that guides the retrieval. Before we discuss realistic attacks on the dynamic network below, we concern ourselves with a static scenario in which the attacker gains knowledge of a cache C of blocks without having access to URLs, and wants to attach meaning to this block set, being interested both in the files inserted and in the order of insertion. This scenario might arise in a search and seizure or hacking attack, for example. Looking for descriptor blocks is an obvious attack vector in this case. Recognizing a descriptor block D is easy because it consists of other blocks' names. The difficulty lies in finding the three or more blocks that, combined, form the descriptor block. While a naïve algorithm has to enumerate all triples out of C, thus taking Ω(|C|³) steps, there is a more efficient approach (we thank Peter Rossmanith for pointing us in this direction): D = X ⊕ Y ⊕ Z is equivalent to D ⊕ X = Y ⊕ Z. Of course, we do not know D. However, the first 20 bytes of D (as well as the second 20 bytes and so on) must be a block name if D is supposed to be a descriptor block. Thereby it is sufficient to generate a table containing Y ⊕ Z for every block pair in C, as well as a table containing N ⊕ X for every block name N and every block start X in C, and check for matches between the two. If C is a closed set that is guaranteed to contain every block that is mentioned in a descriptor block, we need only look at the first 20 bytes of every X. In this case, we can implement the algorithm in O(|C|²) steps. Otherwise, it might be that the first name in the descriptor block refers to a missing block, and we have to reiterate the process for every 20-byte part of X, adding a linear dependency on the block length. In any case, O(|C|²) memory units are required.

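A sketch of the quadratic variant (ours; it assumes the closed-set case and, for brevity, keeps the pair table fully in memory) looks as follows. Candidates must still be verified by reconstructing D = X ⊕ Y ⊕ Z and checking that all of D parses as known block names:

    from itertools import combinations

    def find_descriptor_candidates(cache):
        """cache maps a 20-byte block name (its SHA-1) to the block.
        Returns triples of names (X, Y, Z) such that the first 20 bytes
        of X xor Y xor Z equal some block name in the cache, i.e. the
        XOR could be a descriptor block (closed-set case only)."""
        def xor20(a, b):
            return bytes(x ^ y for x, y in zip(a[:20], b[:20]))

        # table 1: prefix(Y) xor prefix(Z) for every block pair: O(|C|^2)
        pairs = {}
        for (hy, by), (hz, bz) in combinations(cache.items(), 2):
            pairs.setdefault(xor20(by, bz), []).append((hy, hz))

        # table 2: N xor prefix(X) for every name N and block start X;
        # a match with table 1 means prefix(X xor Y xor Z) = N
        candidates = []
        for name in cache:
            for hx, bx in cache.items():
                for hy, hz in pairs.get(xor20(name, bx), []):
                    if hx not in (hy, hz):
                        candidates.append((hx, hy, hz))
        return candidates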

We have implemented the algorithm in the D programming language [29] and tested it on a static set of 1156 blocks that the OFF authors had created by hand and distributed as a very early proof of concept. After a few performance optimizations, the program found all 15 hidden descriptor blocks on a standard PC in a few minutes of running time. In order to test a more realistic case, we introduced a client with an empty cache into the live network and had it accept block pushes until it had collected about 4 GB of blocks. We then ran our program on a high performance cluster on multiple occasions, for up to 24 hours at a time. Because it would have taken too long to examine all 6553 20-byte ranges in a block, our experiments took into account only up to 256 of them. Still, we have not been able to identify even a single descriptor block in the data. Keep in mind that we did not download or store anything on the client, so we searched bucket data only.

Moreover, we tried to unwind the order in which files were stored in the example set (since we did not find files in the live sample anyway). This can be done by looking for a file that has at least one uniquely used block in each triple that belongs to it. Such a file is a candidate for having been stored last. Removal of this file leads to a clean state before this file had been introduced. New candidates arise, and the whole process can be unrolled recursively. For the example data set, we have been able to reveal constraints on the order of insertion for most files, but in the end, 9 files remain such that every file has at least one triple in which every block is shared with another file, though there are no more identical triples/files. It seems that the authors performed several manipulations on the block cache after the files had been inserted, such as introducing alternative descriptor block encodings, but we have not been able to clearly reconstruct these manipulations.
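The unwinding step can be stated as a simple recursive elimination; the following sketch is our rendering of the idea, not the tool we actually used:

    def unwind_order(files):
        """files maps a file name to its list of block-hash triples.
        Repeatedly removes a file that has, in each of its triples, at
        least one block used by no other remaining file (a candidate
        for having been stored last). Returns the candidates in reverse
        chronological order plus the files that could not be resolved."""
        remaining = {f: [set(t) for t in ts] for f, ts in files.items()}
        order = []
        while remaining:
            candidate = None
            for f, triples in remaining.items():
                others = set()  # blocks used by any other remaining file
                for g, ts in remaining.items():
                    if g != f:
                        for t in ts:
                            others |= t
                if all(t - others for t in triples):
                    candidate = f
                    break
            if candidate is None:
                break  # every remaining file shares blocks in some triple
            order.append(candidate)
            del remaining[candidate]
        return order, list(remaining)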

To summarize: In small block caches that contain complete encoded files, descriptor blocks, and thus files, can be found, yet the working space and time grow quadratically with the number of blocks considered. It seems to be very hard, both in terms of running time and probability, to find meaning in average client buckets. Policy-wise, this implies that hiding URLs is not enough to keep an attacker from identifying the files in a cache that holds complete file encodings; on the other hand, it seems hard to incriminate an OFF user who simply donates bucket space to the network. Our findings suggest that while there are ways to reconstruct block semantics and the order of introduction, these methods are not very efficient and reliable in the face of large and incomplete data sets that might have been manipulated. We did not find grave privacy threats in maintaining block caches, yet cannot recommend using local OFF storage of entire files on its own as a measure for protecting sensitive data. Simple hard disk encryption is more effective here.

The Intentions of Users

We tried to address the second question and infer the intentions of users on the live network by introducing a modified client that logged all interactions, and by employing our database of URLs gathered from searches as described above. We let the client fill an empty cache and recorded about 24000 data sets including block hash, user ID, time, and type of interaction. We then matched the block hashes with file names from our URL database and created a new database with about 3500 entries including file name, user ID, and block name. 3% of the interactions were downloads, 18% uploads (both pushes and requested blocks), and 79% accepted block pushes. Of the 149 nodes that had at least one interaction, the most active node communicated 528 blocks during our experiment. Figure 2.2 shows a histogram of communication frequency among the observed nodes. It stresses the outstanding rôle of a few leading nodes during our observation. This prompted us to first restrict our attention to accepted block pushes, which resulted in the distribution pictured in Figure 2.3. Interestingly, the order of the leading nodes changed significantly: While the most active node remained the same, the formerly second most active node is almost invisible now because it mostly received blocks instead of pushing them. It appears that this node, just like our own, had free space in its bucket, since the distribution of file names among the blocks it received is very even. We further examined the blocks pushed to us by the very active nodes with regard to file names. While all of these nodes had particular files standing out in the blocks they had pushed to our client, the effect was most dramatic in the third most active node. The 198 blocks pushed by this node, say X, could be linked to 24 files, yet 162 of the blocks belonged to one specific file, say F, with the next most prominent file only matching 3 blocks. This pattern makes it appear highly probable that X dispersed file F during our experiment. However, this conclusion might be unjustified: Another node, say Y, shows a similar pattern of pushing many (42) blocks that belong to F, with the second most prominent file only appearing 3 times. F actually features highly in most nodes' statistics. So although X has the most pronounced preference for F, there might be alternative explanations for this behavior. All in all, there are many significant patterns discernible in our data, which had to be expected since user behavior has to be expressed in some traffic irregularity. However, even with

so few nodes, we found it surprisingly hard to clearly identify user intentions, let alone prove them. These findings have been verified by several similar observations. While they could still be read as a threat to user privacy, the following section will show that this type of behavioral inference becomes much harder with network growth.

[Figure 2.2: Interaction frequency per node histogram in the live OFF system]

2.3.3 Scaling OFF

An interesting particularity of the DHT built into the OFF system lies in the fact that each client tries to know as many neighbors as possible, making the DHT fully connected in theory. To our knowledge, there has not been a scientific analysis of the scalability of this model. Typical live network sizes have so far been smaller than 100 nodes. The complexity of the DHT rules impedes theoretical analysis and simulation, leaving experiment as the only option. In order to gain some insight into this matter, we tried to deploy OFF on PlanetLab. PlanetLab is an overlay testbed consisting of about 1000 nodes at more than 400 research sites at the time of writing, and is widely used for the test and deployment of massively distributed systems [22].

[Figure 2.3: Interaction frequency per node histogram in the live OFF system when restricted to accepted block pushes]

Naturally, the worldwide distributed nature and size of the project make node management nontrivial. After customizing a headless version of the OFF client and dealing with technical challenges owing to PlanetLab restrictions, node churn, and on-site security measures, we succeeded in verifying up to 467 PlanetLab nodes running OFF clients at the same time. However, in our first trials, we had to manually connect clients by adding neighbors and instructing clients to retrieve neighbor lists from each other. Later on, we discovered that this process could be automated only by downloading files in the network, since network discovery is piggybacked on the block request process in the OFF code. Because we did not have time to simulate realistic downloading behavior on all our clients, we cannot say for sure how well this method scales in practice. For the network sizes we tried, however, downloading files resulted in nodes knowing many neighbors, at least for the nodes that requested many blocks, after the manual bootstrapping process described above. We therefore conjecture that at a network size of about 500, OFF's neighbor management algorithm could deal with realistic usage patterns.

Current user experience in OFF is hampered by downloads that finish slowly or never. Usually, many blocks belonging to a larger file are downloaded at acceptable speed. In the live network, this might be around 100 KB/s, while in our PlanetLab network, we even observed initial download rates of about 1 MB/s. However, in any larger file, there are blocks that cannot be found quickly in the network, and often, downloads never finish in the live network. A simple thought experiment shows why this has to be the case: If 100 nodes each contribute, say, 2 GB of space for buckets (these numbers represent estimated orders of magnitude for the live network), we have only 200 GB of space for blocks, not counting initial random seed blocks. (Observe that although each file block is reconstructed from three stored blocks, only one new block is stored for each new file block in the system.) What is more, traditional DHTs such as Kademlia [85] replicate content on many nodes. This can also be observed in OFF (see Figure 2.4 below).
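The capacity arithmetic of this thought experiment, together with the replication factor discussed below, can be restated as a two-line calculation (a back-of-envelope sketch; all numbers are the estimated orders of magnitude from the text):

    nodes, bucket_gb = 100, 2.0   # estimated live-network magnitudes
    replication = 4               # conservative availability factor
    raw_gb = nodes * bucket_gb    # 200 GB of raw block space
    content_gb = raw_gb / replication
    print(raw_gb, content_gb)     # 200.0 50.0: far below the ~445 GB of
                                  # content observed in the live network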

[Figure 2.4: Block frequency histogram for PlanetLab OFF system on March 26, 2009 at 17:42:59: 379 nodes after storing and dispersing 684 MB of data]

Figure 2.4 shows the distribution of block frequency in our PlanetLab experiment for 379 nodes after storing and dispersing 684 MB of data. Frequencies vary between 1 and 43; the median frequency is 22 IPs per block.

[Figure 2.5: Block frequency histogram for PlanetLab OFF system on March 26, 2009 at 19:04:11: 376 nodes after downloading 682 MB of data]

The large number of blocks present on only one node can be attributed to the initial seeding of three random blocks per node and the limited network connectivity at this point. Figure 2.5 depicts the situation about an hour later, after we downloaded a large file to a random node in the network. There have been significant shifts: The downloaded blocks have spread throughout the network, typical frequencies range from 1 to 75, with one block being replicated 105 times, and the median is now at 23. Because we initially set cache sizes to 1 GB per node, we can assume a total storage volume of 379 GB. Dividing this by the 684 MB of data results in a theoretically possible average replication of about 554, so there is no displacement pressure at all. Unfortunately, we could not observe the effects of pushing more content into our artificial network due to time pressure and technical problems. However, continuing our above line of thought, it is evident that with high node churn, high replication rates are necessary to ensure block availability. Even if we assume a very conservative replication factor of 4, for example, only 50 GB of content space remain in our thought experiment. Unfortunately, the amount

of content that is witnessed by a URL to have been dispersed into the network far exceeds these estimates: Adding up the file sizes of all 3781 URLs we have been able to find in the network yielded a total data amount of 445,507,663,784 bytes. That is, at least 445 GB of data have been dispersed into the live network since its inception, whereas the average online space is probably less than 200 GB. Clearly, there has to be some displacement of blocks. While it is still unclear how exactly the displacement process takes place in practice, we observe that a system like OFF has to limit content uploading, or to grow fast enough, in terms of nodes and thus space, that this massive displacement is unnecessary, or at least occurs in a way that reflects the usefulness of a block. These observations also suggest a DoS attack: A powerful attacker might simply spam the network with useless blocks, thereby displacing real content. On PlanetLab, all our downloads finished, but we still observed a dramatic slowdown towards the end of the download process. A CD image of about 682 MB took about 20 minutes to download on PlanetLab. It appears that OFF's central usability dilemma at the moment could be mitigated simply by adding more nodes (or less content). It is an interesting open question whether in a truly popular network the race between adding storage space and content could be won.

Finally, we aimed at replicating the logging attack described above in the larger PlanetLab network. In some sense, the conditions here were even more conducive to a successful attack because we had control over the entire network. In particular, we knew beforehand the behavior whose effects we wanted to observe: In the PlanetLab network, with 268 nodes known to be online to our logging client, we downloaded the 714,407,936 byte file F10-i686-Live.iso at the same time from two random hosts (planetlab2.eurecom.fr and peeramidion.irisa.fr). Furthermore, after increasing the local bucket size from 1024 to 2048 MB on our logging client (making space for block pushes, as above), we dispersed the 705,380,352 byte file elive-1.9.14-unstable.iso from peeramidion.irisa.fr. A total of 1473 block transport observations involving 258 nodes were made. Of these, 1421 were blocks sent by the logging client, and only 52 were accepted block pushes. So obviously, our chances of clearly identifying the uploader were rather slim. The small number of accepted block pushes could be attributed to a combination of factors: Obviously, the network now comprised more nodes. Unlike in the live network, there was still enough distributed bucket space available in the network, the connectivity might still have been impaired, and possibly the size of our bucket did not significantly increase the bucket radius (again, in comparison to the amount of free space available on the network). This suggests that bucket space scalability not only affects efficiency and reliability, as outlined above, but might be an important factor in maintaining privacy as well.


Although the dispersing node leads the accepted-blocks statistics with 6 blocks, the next nodes in the activity ranking contributed 4, 2, and 2 blocks, the other 38 blocks coming from distinct nodes. This can hardly be considered proof of uploading a file, especially because in a real network we have to expect lots of concurrent activity that might blur such distinctions. Moreover, there was little to no node churn in our artificial network, which would complicate matters even more. Looking at the blocks sent by our client might seem more promising in order to identify the downloaders. Yet a similar picture emerges: Although the downloading nodes lead the statistics with 32 and 28 requested blocks, their distance to the next most active nodes is not very significant: These follow closely with 21, 17, 17, and then several times 16 blocks. Figure 2.6 visualizes the entire data set. In both cases, we might well infer some hypothesis, but it appears difficult to claim reasonable suspicion, for example.

[Figure 2.6: Interaction frequency per node histogram in the PlanetLab OFF system restricted to downloads]

2.3.4 Conclusion

We have examined the Owner-Free File System as a promising candidate for a world library. Its philosophical underpinnings resemble our motivations to some degree. While a working network client exists, the actual network is very small and hardly usable at this point due to poor block availability. Our findings show that this is unavoidable under the current circumstances, because dispersing too much content forces block displacement. In an artificial network on PlanetLab with more distributed space, we observed a relatively high block replication rate, which should guarantee high block availability even in the face of node churn. We therefore expect that if more space were available in a larger network, this hitherto problematic aspect of OFF might simply disappear. However, it seems hard to predict the growth rates of block space versus content when many users join the network; this is an interesting topic for future research. In examining the privacy aspects of OFF, we first tried to reconstruct meaning from static block sets. Although this is possible in some specific cases, we did not succeed in proving that some file was part of a realistic block cache. Looking further at actual network communication showed a similar trend: Because nothing is hidden in OFF's brightnet approach, we can simply passively monitor blocks passing through a client and combine this knowledge with URLs collected by some other means. However, both in the small live network and in a larger but more predictable PlanetLab network, this approach yielded hypotheses, but no reliable evidence for alleged user behavior. In total, we find that OFF seems to strike an acceptable balance between usability and privacy. Surprisingly, we find that not only is increasing the number of nodes promising in order to improve user privacy, but so is increasing the amount of distributed storage, at least with respect to uploading. So it appears that a larger number of users might address several concerns with the current OFF system all at once. Moreover, switching to encrypted communication by using HTTPS instead of HTTP would provide simple but effective protection against local eavesdroppers.

With regard to scalability, our findings are encouraging, but still cannot rule out that the proprietary DHT employed might hit some scaling limit. Unfortunately, this is very hard to verify short of actually building a huge network, because the DHT's complex behavior makes it hard to gain insight by simulation or analysis. This suggests two possible pathways for the project: Firstly, one might simply attempt to quickly reach critical mass through publicity efforts. However, this approach is risky because unforeseeable behavior and errors might surface when the network grows very large. A more reliable approach, to our mind, lies in rebuilding OFF from simple, well-understood, and tested components such as standard DHTs.

simple, well-understood, and tested components such as standard DHTs. In the following chapter, we will address properties of such components that are important for privacy-sensitive applications like OFF.

Chapter 3

Privacy and Security Aspects of Distributed Hash Table Design

Thirty spokes meet at a nave;
Because of the hole we may use the wheel.
Clay is moulded into a vessel;
Because of the hollow we may use the cup.
Walls are built around a hearth;
Because of the doors we may use the house.
Thus tools come from what exists,
But use from what does not. [133, verse 11]

3.1 Introduction

The concept of Distributed Hash Tables (DHTs) — also known as Structured Peer-to-Peer Systems or Content Addressable Networks (CAN) — arose when at least four similar networks were proposed at about the same time in 2001 [109, 88, 117, 146]. The basic premise of these networks, in opposition to unstructured peer-to-peer systems like Napster or Gnutella [100], is that both peers and content are mapped to a common address space by some hash function, and the structure of the network is a function of the contents of this hash space, potentially supplemented by some extra randomness. Because we assume the hash mapping to be (pseudo)random, this typically distributes both content and network management evenly among peers. This makes it possible for every node to keep only a small amount of network state, typically at most $O(\log n)$ units, where $n$ is the size of the network.

At the same time, access to any content in the network can be guaranteed with low communication overhead ($O(\log n)$ steps), at least in theory, that is, when we abstract from node churn and targeted attacks. Because DHTs tend to be symmetric in the sense that with high probability no node plays a central role, they lend themselves to ensuring robustness by redundancy, and are therefore considered well protected against random failures. For a more in-depth account of DHTs see, e.g., Part III of Steinmetz and Wehrle's monograph "Peer-to-Peer Systems and Applications" [129].

Because of these promising features, DHTs have become an integral part of many massively distributed systems in recent years. The Kademlia [85] design has proven particularly successful in practice, e.g. for distributed tracking in the BitTorrent file sharing protocol, or for searching in the eMule/aMule file sharing applications. This has inspired a lot of very recent research into its less obvious properties (e.g. [138, 128, 28]). In the context of this thesis, we might think of DHTs as a set of simple and successful mechanisms for distributing information access. This information might range from basic network information (what nodes are there and how can we reach them) [138, 92, 108], over search queries and file descriptors as in the aMule case, up to entire documents. In the OFF System described in Chapter 2, the blocks that constitute the building material for documents are held in a DHT-like structure, though, as detailed in that chapter, OFF is atypical in that each client tries to keep track of the entire network. In any case, there is a multitude of potential uses that a DHT might be put to in a world library.

However, though DHTs' robustness against random failures has been well researched, and there have been some efforts to examine attacks that aim to disrupt the entire DHT, there has been much less investigation into the security properties that we need to address if we want to preserve privacy in the face of attacks, especially when combined with the enormous network growth typically found in massively distributed systems such as world libraries. This chapter aims to close this gap, or at least narrow it down to a set of discrete, well-defined issues. We identify two main strands in these prying attacks: the primary information leakage in the DHT itself, which might enable an attacker to gain knowledge of what was searched by whom, and the damage that might ensue when an attacker is able to bias lookup results. To see the security hazard in the second case, consider an example: A DHT might be used as a network information service for an anonymization network. In this case, the attacker might be able to make participants favor nodes controlled by him as anonymizers. Obviously, this would defeat the purpose of the entire anonymization service.

We begin by defining our DHT model and the range of threats considered in our attacker models. Then, after reviewing related work, we combine probabilistic analysis and simulations to explore the effect of targeted attacks on the security and anonymity of DHTs and potential countermeasures. To begin with, we prove that earlier approaches cannot scale securely with the size of the network. We go on to improve upon these methods with a stepwise introduction of protection measures whose novel combination is shown to prevent eclipse attacks effectively even in the face of massively scaling networks. We then discuss passive attacks and consider ways of countering information leakage. Before we conclude, we examine the effect of even more powerful attacker models and show that the lookup process we propose leads to asymptotically optimal path lengths in a general network model.

3.1.1 DHT Model

A DHT-based lookup service has the following properties:

1. Every data item $d$ is mapped to an ID from an identifier space $K$ using a standard hash function $H$ that is assumed to be one-way.

2. Each node $v$ is assigned an identifier $ID(v) \in K$. Unless stated otherwise, we assume that $ID(v)$ cannot be chosen freely but is distributed uniformly over $K$. This might be achieved by setting $ID(v) = H(EID(v))$, where $EID$ is a so-called external identifier that is supposed to be bound to actual real-life entities. EIDs might consist of IP addresses, public keys verifiable through a web of trust, or similar hard-to-forge data. The security of such EIDs is out of scope here, but pointers can be found in the literature on the Sybil attack (see below).

3. There is a distance metric on $K$ that defines the distance $d(i, j)$ between any two IDs $i$ and $j$.

4. At any time, every ID $x$ has a unique node $Resp(x)$ that is responsible for $x$. Usually, this is the node whose ID is closest to $x$ in the distance metric.

5. Every node $v$ in the DHT has a finger table of neighbors, that is, other nodes whose IDs and IP addresses it keeps track of. This neighbor table follows DHT-specific invariants.

6. A lookup for a data item $d$ is carried out by routing in the DHT in order to find the IP address of $Resp(H(d))$.

7. Routing is performed by querying neighbors. Usually, a node would query directly for $H(d)$, to which the neighbor should respond with the IP address and ID of one of its neighbors that is even closer to $H(d)$, unless it is responsible for $H(d)$ itself.

8. There are protocols for the joining and leaving of nodes. These ensure that after a join or leave, properties 4 and 5 still hold and every node can still be found by routing.
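To make these properties concrete, the following minimal Python sketch (our illustration, not code from an actual implementation; all names are ours) renders properties 1 through 4 and 6, resolving responsibility by closest directed distance as in Chord:

    import hashlib

    ID_BITS = 160                      # identifier space K = {0, ..., 2^160 - 1}
    ID_SPACE = 2 ** ID_BITS

    def H(data: bytes) -> int:
        """One-way hash mapping data items (and external IDs) into K."""
        return int.from_bytes(hashlib.sha1(data).digest(), "big")

    def chord_distance(i: int, j: int) -> int:
        """Directed (asymmetric) distance from i to j on the ring."""
        return (j - i) % ID_SPACE

    def resp(x: int, node_ids: list[int]) -> int:
        """Resp(x): the node whose ID is closest to x under the metric."""
        return min(node_ids, key=lambda v: chord_distance(x, v))

    # A node derives its ID from a hard-to-forge external identifier (property 2):
    node_ids = [H(ip.encode()) for ip in ("10.0.0.1", "10.0.0.2", "10.0.0.3")]
    owner = resp(H(b"some data item"), node_ids)   # property 6, ignoring routing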

When we describe our approach to DHT privacy and security protection, we begin with a simple Chord-like DHT and build protection measures on top of it in order to reach the required properties. Chord [88] has been chosen because of its simplicity and, most notably, because its finger table entries are deterministic in the sense that there is exactly one correct neighbor for each finger in a given DHT. Moreover, since the Chord distance metric is directed, there is asymmetry in the sense that a node does not usually belong in the finger tables of its neighbors. We will use the first property to restrict malicious behavior in Section 3.3.4, and the second one against a stronger adversarial model in Section 3.5.2. Still, both our protection measures and scalability bounds, especially outside of these two sections, carry over to a very generic class of DHTs. For example, the results in Section 3.3.2 have serious implications for the security of Kademlia [85].
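For illustration, the deterministic finger targets just mentioned can be written down directly (a sketch following the Chord rule $m + 2^{i-1}$; the function name is ours):

    def finger_targets(node_id: int, id_bits: int = 160) -> list[int]:
        """Finger i of a node with ID m must point to the node whose ID is
        closest to m + 2^(i-1); the targets are fully determined by m."""
        space = 2 ** id_bits
        return [(node_id + 2 ** (i - 1)) % space for i in range(1, id_bits + 1)]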

3.1.2 Attacker Model

Since assuming a global attacker that can observe all communication in the network seems unrealistic, both given the perceived attacker situation on the Internet today and given our choice of possible protection measures¹, we consider a local attacker. That is, the adversary is able to observe the traffic of a portion of the network (e.g., your local ISP) and at a fraction of nodes that it operates or controls. Typically, we will model this by assuming a fixed fraction $f$ of colluded nodes. On the other hand, we think it realistic to expect attackers to be active in the sense that they may delete, modify, or generate messages in the network. In the classification proposed by Raymond [110], we therefore assume an internal, active, and static attacker, unless otherwise stated.

Naturally, we cannot exhaustively state what an attacker could possibly do to disrupt a DHT. We focus on protecting the search process for the following reason: for both finding and inserting content, we typically look up the nodes responsible for this content in the DHT. Once they are found, we may communicate with them directly, practically limiting exposure to attacks carried out by the responsible node itself, since there are no more intermediaries and we can protect against external attackers using encryption. In the context of our application, this puts the focus on two main lines of attack.

¹ See Chapter 1 for additional rationale why we choose to exclude such very powerful attackers from our examination.

Firstly, active eclipse attacks, where the attacker tries to make a searcher find invalid content by making him look up a seemingly responsible, yet colluded, node. This is particularly insidious in the case of a DHT meant to distribute network information, say, in an anonymity network. If the attacker can make the victim look up only colluded nodes, he can arrange for the searcher to perform all his anonymization efforts using attacker nodes and so defeat protection altogether, without the victim ever noticing the attack. Secondly, if we want to preserve privacy, it should be a general goal to avoid or dampen the effects of information leakage. This translates into passive attacks, where colluded nodes combine their knowledge to link searched content to the searching node.

There are two further general attacks on networks that could also be used to heighten exposure to eclipse attacks or information leakage, apart from simply impeding network performance: Sybil attacks [42, 74, 77, 33] and denial-of-service (DoS) [11]. Both attacks are well researched for a very broad class of networks, and while neither has found a definitive solution, there is an abundance of proposals and design principles that try to defend against them. Therefore, we restrict the scope of this chapter, as far as these two attacks are concerned, to cases where they allow for easier achievement of eclipse or information leakage, and otherwise refer to the ongoing research elsewhere.

Though we use the standard assumption that an attacker should not be able to break cryptographic primitives, in Section 3.5.2 we look at the possibilities of a more powerful adversary that can freely choose its ID, and thus its position in the DHT.

3.2 Related Work

Because the DHT concept only emerged in 2001, the literature on DHT security is comparatively young. In 2002, Sit and Morris published "Security Considerations for Peer-to-Peer Distributed Hash Tables" [123]. This paper outlined quite a few potential attacks and potential safeguards, yet remained purely descriptive, refraining from any quantitative analysis.

Also in 2002, Castro et al. published a visionary paper on "Secure routing for structured peer-to-peer overlay networks" [16]. Therein, they laid the foundations for the status quo in DHT routing security that has remained fundamentally unchallenged until today. Among other results, they introduce the paradigm of independent paths to counter the kind of active attacks on routing first described by Sit and Morris and central to our investigation. Variations on this theme form the basis of many later methods for securing DHTs [124, 92, 6].

Yet, in Section 3.3.1 we show that this idea is destined to fail when networks grow arbitrarily large, before introducing alternative methods that do scale. Furthermore, their routing failure test prefigures our finger table bounds checking. A notable difference is their choice of recursive over iterative routing, which we reject for the following reasons: Firstly, recursive routing allows for DoS attacks with very small effort on the attacker's part. It has been shown by Borisov et al. [11] that DoS attacks applied to anonymous communications considerably reduce the anonymity provided. We consider the increased potential for DoS attacks in recursive approaches too big a threat to ignore. Secondly, we will use the added control we gain over the course of the search through iterative routing in our protection measures. Even without considering adversarial behavior, recent real-world DHT implementations emphasize iterative routing because it avoids losing lookup messages [128].

Hazel, Wiley, and Wiley introduced "Achord: A Variant of the Chord Lookup Service for Use in Censorship Resistant Peer-to-Peer Publishing Systems" [64], which might be thought of as an early shot at a world-library-like concept. While its simplicity is alluring, it shares with many of the approaches discussed here the reliance on recursive routing, and thus the obvious weaknesses against active attacks.

In 2004, Srivatsa and Liu gave both quantitative analysis and experimental results for a number of threats to structured peer-to-peer systems, as well as some pointers to their remedy [124]. Their analysis actually begins similarly to ours in Section 3.3.1, yet they do not acknowledge the implications for protection scalability. Our DHT model description has been informed by their formal model.

In 2005, Nikita Borisov published his PhD thesis on "Anonymous Routing in Structured Peer-to-Peer Overlays" [13]. Besides introducing a novel method of assessing anonymity, he proposes gaining sender anonymity in DHTs by combining random walks with recursive routing. Together with Jason Waddle, the same author also published a technical report on "Anonymity in Structured Peer-to-Peer Networks" [12] with a similar gist. The latter stands out because it makes some interesting observations on the implications of free adversarial placement for passive attacks. They complement our remarks in Section 3.5.2 on active attacks in a similar setting. Our focus is different from these and similar works [96, 23] in that we prioritize defending against active attacks. While we do strive for sender anonymity, we believe that it is unachievable in a holistic view (cf. [104]) when we cannot keep an adversary from eclipsing large parts of the network.

In 2006, Awerbuch and Scheideler [4] published a theoretical approach to building a scalable and secure DHT that might potentially solve many of the problems we are facing in this application. A multitude of active attacks are actually proven to be impossible on this structure, even though passive information leakage attacks are not considered. Unfortunately, the downside of the formal approach taken is that its high vantage point does not allow for immediate practical implementation. In fact, the authors "believe that designing such protocols is possible though their design and formal correctness proofs may require a significant effort" [4]. Even if this belief is correct, it is foreseeable that it would lead to a very complex system, hard to analyze and implement correctly, while in the networking/anonymity community we observe a trend towards security by simplicity.

Giuseppe Ciaccio's work on "Recipient Anonymity in a Structured Overlay" [24] aims to keep the responder to a query in a DHT anonymous, especially from the querying node. While at first this seems a concern orthogonal to our protection measures against third parties, it is readily understood that the concept cannot be upheld when we prefer iterative routing over recursive queries. The reasons for this important decision have been described above. There are other, more technical reasons why Ciaccio's ideas conflict with ours (finger table checking, for example), and so we have to concede that in our setting, it seems we have to sacrifice the possibility of keeping the identity of nodes that hold DHT content secret from the nodes that request or store it. We note that in our opinion, this is not too great a loss in exchange for protecting the security and privacy of the lookup against third parties, especially when we consider a DHT as a system for looking up nodes in the first place. It is very important, however, to consider this aspect in decisions as to how we intend to use DHTs in a world library, say, and to refrain from directly storing content in a DHT that might incriminate the storing node's owner. The OFF system's approach of storing semantically neutral data blocks might be one possible way of dealing with the problem. A different, though possibly prohibitively expensive, way out could be adding another layer of indirection, say through tunneling, in order to achieve responder anonymity differently.

Salsa [92] is a DHT which has been specifically developed for anonymization networks. Identities are based on hashes of the nodes' IP addresses and are organized in a tree structure. Redundancy and bounds checking are used during lookups in order to prevent malicious nodes from returning false information. According to simulations, the scheme prevents attackers from biasing the path selection as long as the fraction of malicious nodes in the system does not exceed 20%. This scheme, too, falls prey to our proof of non-scalability.

Additionally, the analysis of "Information Leaks in Structured Peer-to-peer Anonymous Communication Systems" by Mittal and Borisov [86] has shown that even in smaller networks Salsa is less secure than previously thought: Because defending against active attacks requires higher redundancy, this directly increases the threat of passive attacks, since a small fraction of malicious nodes can observe a significant part of the lookups. This compromises anonymity by leaking information about the initiator and the nodes involved in a tunnel. Unfortunately, this principal attack befalls every multiple-hop DHT, and thus cannot be avoided by our approach entirely. Still, we discuss some remedies and alternatives in what follows.

In 2007, Baumgart and Mies [6] proposed several improvements to Kademlia [85] in order to make it more resilient against attacks. The improvements are threefold: using cryptographic puzzles to restrict node ID generation; sibling broadcast for ensuring replicated data storage; and using multiple disjoint paths for node lookups. While the first two measures are out of our scope, our results indicate that the latter method will break down asymptotically.

In 2008, Chris Lesniewski-Laas introduced a one-hop DHT that defends against Byzantine (internal and active) attackers [74]. Unfortunately, its assumptions are rather strong: In a network with $n$ honest nodes and $m$ honest edges, it can tolerate $o(n/\log n)$ attack edges (from honest nodes to attacker nodes), while requiring its nodes to keep routing tables containing $O(\sqrt{m} \log m)$ entries per node. While claiming to provide "the first sublinear solution to this problem", the paper "leaves open the question of whether a structured DHT with logarithmic table size can be made highly Sybil-resistant (...)" [74]. We are going to answer this question affirmatively in what follows, while allowing even a constant fraction of colluded nodes, which translates to a superlinear number of attack edges.

In 2009, Westermann et al. [138] studied default Kademlia behavior with respect to colluding nodes that answer queries with the malicious nodes closest to the searched-for ID. Their finding is that, due to redundancy, the fraction of malicious nodes found in queries is not significantly larger than the overall fraction of malicious nodes in the system. The simulations, however, have only been conducted with up to 50,000 nodes.

Finally, Panchenko, Rache, and Richter [108] have applied research conducted in preparation of this chapter to network information services for anonymization networks. We use some of this material, including figures, to make the more general points on DHT privacy in this chapter.

3.3 Active Attacks

Herein we examine the effects of adversarial efforts to bias the lookup path and results, and devise a stepwise strategy to counter these efforts.

3.3.1 Redundancy through Independent Paths Does Not Scale

A naïve implementation of searching for an ID $x$ in an environment with a fraction $f$ of collaborating adversarial nodes has success probability $p_s = (1-f)^l$, where $l$ is the length of the search path, when we assume the adversary either to simply drop requests in a DoS fashion or to return false information such as claiming to be the owner of $x$. Because $l$ has to grow with the size of the network, typically on the order of $\log n$ [83], this success probability asymptotically approaches zero for growing networks when $f$ is, say, constant. Thus we can say that, under this simple attack, the naïve implementation does not scale.

Of course, earlier scholarship [92, 16, 6] has recognized this problem. As a solution, redundancy in the form of several search paths has invariably been introduced. The cited papers all try to ensure routing towards $x$ using multiple, preferably independent, paths. This is in itself a difficult proposition because the structure of DHTs usually leads to path convergence, and quite some auxiliary constructions have been devised to still ensure independence. Yet even when we assume independent paths, the required redundancy endangers scalability in the limit:

Theorem 1 The number of lookup paths required to reach constant success probability $p_s$ against a constant ratio $f$ of active attackers is at least $\frac{p_s}{(1-f)^l} \in n^{\Omega(1)}$ for path length $l \in \Omega(\log n)$.

Proof. Let $I_i$ be a random indicator variable that takes the value 1 if path $i$ is attacker-free, and 0 otherwise. A single path $i$ is free from attackers with probability $(1-f)^l$, thus $E I_i = (1-f)^l$. Then $\sum_{i=1}^{\alpha} I_i$ is a variable that counts the number of successes out of $\alpha$ paths, and

$$p_s = P\Big[\sum_{i=1}^{\alpha} I_i \geq 1\Big] \leq E \sum_{i=1}^{\alpha} I_i = \sum_{i=1}^{\alpha} E I_i = \alpha (1-f)^l$$

by the Markov inequality and linearity of expectation. Notice that we have not made any assumptions about path independence. Solving for $\alpha$, substituting $\Omega(\log n)$ for $l$, and assuming $f$ and $p_s$ constant yields

$$\alpha \geq \frac{p_s}{(1-f)^l} \in (1/(1-f))^{\Omega(\log n)} = n^{\Omega(1)}. \qquad \Box$$
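To get a feel for the bound, consider illustrative values (ours, not taken from the simulations): $n = 10^6$, $f = 0.2$, and $l = \log_2 n \approx 20$. Then

$$(1-f)^l = 0.8^{20} \approx 0.0115,$$

so reaching even $p_s = 0.5$ already requires $\alpha \geq 0.5 / 0.0115 \approx 43$ independent paths, far more than the $O(\log n)$ neighbors a node keeps in common DHTs.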

With simple greedy routing — always querying the best known node with respect to distance to the target — the lower bound of $\Omega(\log n)$ on path length holds for a large class of small-world networks, including all the DHTs we know of, as well as skip graphs [83]. Therefore, the theorem shows that the number of independent paths needed to route successfully with some fixed probability grows at least polynomially in $n$. This is exponentially greater than, e.g., the number of neighbors a node has in the most common DHTs. Thus, even in the first step of routing, we rapidly run out of independent possibilities to choose from as networks grow. The aforementioned research efforts [92, 16, 6, 138] experimentally evaluated their approaches in environments incorporating between 10,000 and 100,000 nodes. While their results suggest practicability in this range, the above considerations show that we cannot expect this to hold up when the networks grow larger.

One might argue that a constant fraction of collaborating nodes assumes a very strong adversary, and there have been results that only work with fewer adversarial nodes [74], yet there are multiple reasons why we consider this adversarial model here: Firstly, most practical suggestions for solving the problem have assumed fixed percentages of malicious nodes [92, 16, 6]. Secondly, real-world phenomena like botnets suggest that we might have to deal with strong attacks like this. Thirdly, as we will show in the next section, we can do better. Instead of striving to make paths independent, we want them to work together in order to reach $x$ more reliably.

3.3.2 Redesigning Redundancy: Aggregated Greedy Search

Most of the approaches studied before make use of redundant independent lookups, which leads to convergence of many paths. We therefore follow another approach: instead of performing independent lookups, we propose an aggregated search which combines the knowledge available on each of the parallel branches. We call this aggregated greedy search. It proceeds as follows. In each round, the searching node $v$ chooses the $\alpha$ nodes closest to the goal $x$ that he is aware of and queries them for $x$. The search terminates when, after one iteration, the list of the $\alpha$ closest peers has not changed. The owner of $x$ (the peer which is closest to the searched ID) is the result of the query.

Interestingly, this description fits the behavior described in the Kademlia specification [85] rather closely. And indeed, it has been demonstrated [138] that Kademlia, and thus aggregated greedy search, works well against an active adversary in networks of up to 50,000 nodes.
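A minimal sketch of this procedure (our own illustration; `query(node, x)`, which asks a node for the closest peers it knows to $x$, and `distance` are assumed primitives):

    def aggregated_greedy_search(x, start_peers, alpha, distance, query):
        """Iteratively query the alpha known peers closest to x, merging all
        answers, until the top-alpha list stabilizes; return the closest peer."""
        known = set(start_peers)
        queried = set()
        top = sorted(known, key=lambda v: distance(v, x))[:alpha]
        while True:
            for node in top:
                if node not in queried:
                    queried.add(node)
                    known.update(query(node, x))   # aggregate knowledge across branches
            new_top = sorted(known, key=lambda v: distance(v, x))[:alpha]
            if new_top == top:                     # list unchanged: search terminates
                return top[0]                      # owner of x: the closest peer found
            top = new_top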

Unfortunately, we believe that these results can be explained by the overwhelming redundancy employed for a relatively small network: the following theorem shows that even aggregated greedy search, on its own, does not scale.

Theorem 2 Let $\alpha$ be an upper bound on the number of nodes queried in every round, $\beta$ the maximum number of neighbors accepted from any one queried node, and $f$ the fraction of corrupted nodes, a constant. Then, as long as $(\alpha\beta)^{O(\log \alpha)} \subseteq o(n)$, there is an attack that makes the success probability of the search approach 0 in the limit. For example, this holds when $\alpha, \beta \in O(\log n)$.

Proof. The attacker proceeds by returning as many different corrupt nodes as possible, in the order of proximity to the search goal $x$, until only corrupted nodes are queried. If this attack succeeds, the attacker then has complete control over the course of the search.

Let $B_k = \{b_1, \ldots, b_{|B_k|}\}$ be the set of corrupt nodes that the searching node $v$ knows after $k$ rounds, and say that the $b_i$ are ordered by proximity to $x$, that is, for all $1 \le i \le j \le |B_k|$ we have $d(x, b_i) \le d(x, b_j)$. Notice that the attacker, by returning the collaborators closest to $x$ first, can make sure that this order is static during the whole search. Let us first find an upper bound on the number of rounds $k$ required such that $|B_k| > \alpha$ w.h.p. Of course, this presupposes that $fn$, the number of colluded nodes, exceeds $\alpha$ in the first place, a trivial side condition that is guaranteed in the long run by the assumption $(\alpha\beta)^{O(\log \alpha)} \subseteq o(n)$. We will then use this upper bound to show that it is unlikely that by this time, an honest node has displaced a colluded one.

Asymptotically, at least one corrupted node is queried w.h.p. within the first round if either $\alpha$ or $\beta$ is in $\omega(1)$. Let us assume this for the moment. Moreover, without loss of generality (w.l.o.g.), $\beta \ge 2$: for $\beta = 1$, aggregated greedy search degenerates to simple greedy search with redundancy $\alpha$, and Theorem 1 can be applied to yield the claim, because $(\alpha\beta)^{O(\log \alpha)} \subseteq o(n)$ implies $\alpha \notin n^{\Omega(1)}$.

Let us further assume that in this phase, every colluded node that $v$ learns of will be queried, since $v$ does not yet encounter any honest nodes closer to $x$ than $b_\alpha$. We will justify this shortly. It is then easy to see by induction that in each following round, $|B_k|$ at least doubles, because every corrupted node will return at least two more hitherto unknown corrupted nodes. Thus, $|B_k| > \alpha$ for some $k \in O(\log \alpha)$ w.h.p. Of course, this is a very weak bound, but it will suffice for our needs here. During these $k$ rounds, $v$ will learn of at most $(\alpha\beta)^{O(\log \alpha)}$ nodes altogether.

Assuming randomly distributed nodes (both honest and corrupted), the size of the set $Y$ of honest nodes closer to $x$ than $b_\alpha$ is Pascal distributed with parameters $\alpha$ and $f$, and has expected value $E|Y| = \alpha(1/f - 1)$. Each of these nodes has probability $(\alpha\beta)^{O(\log \alpha)}/n$ of being known to $v$ before the attacker can make $v$ query only corrupt nodes after $k$ rounds. As in the proof of Theorem 1, define indicator variables $I_y$ that indicate this event for every $y \in Y$. Then the expected total number of nodes in $Y$ that $v$ learns of in time $k$ is

$$E \sum_{y \in Y} I_y = E\, E \sum_{y \in Y} I_y = E \sum_{y \in Y} E I_y = E|Y| \cdot E I_y.$$

Employing the Markov inequality just as in Theorem 1 shows that this is an upper bound on the probability that $v$ gets to know any node that is closer than $b_\alpha$ before the attacker can make him query only corrupt nodes:

$$P\Big[\sum_{y \in Y} I_y \geq 1\Big] \leq E|Y| \cdot E I_y \in \frac{\alpha(1/f - 1)(\alpha\beta)^{O(\log \alpha)}}{n} \subseteq o(1)$$

when $(\alpha\beta)^{O(\log \alpha)} \subseteq o(n)$. This proves the first claim in the non-constant case.

In the case that both $\alpha$ and $\beta$ are constant, getting to know at least one corrupted node w.h.p. might take a little longer, say $\log(\log n)$ rounds. Asymptotically, this dominates the remaining $O(\log \alpha)$ rounds needed to make $|B_k| > \alpha$. With the same reasoning as above, the success probability is now at most $(\alpha\beta)^{O(\log \log n)}/n$, but for $\alpha, \beta \in O(1)$, this is also $o(1)$.

Finally, when $\alpha, \beta \in O(\log n)$, the success probability is

$$\frac{(\alpha\beta)^{O(\log \alpha)}}{n} = \frac{(\log n)^{O(\log(\log n))}}{n} = \frac{e^{O(\log^2(\log n))}}{n} \subseteq o(1). \qquad \Box$$

The theorem suggests that for all realistic choices of search parameters, a rather simple eclipse attack defeats aggregated greedy search (and thus, Kademlia) in the limit. Notice, however, that this attack is based on the malicious nodes knowing the target x even in the first round. In order to overcome the problem, we propose to hide the search value.

3.3.3 Hiding the Search Value

We modify our search as follows. As before, in each round $v$ queries the $\alpha$ known nodes closest to $x$. However, instead of asking for $x$, $v$ requests the entire finger table (FT) from each of these nodes.


Let $\alpha = \log_2(n)$ from now on. This value maximizes redundancy and might still be adjusted in real applications to avoid excessive network load. In the first step, $v$ queries all peers in his FT. Each of the retrieved FTs contains $\log_2(n)$ entries. These are all aggregated, and the best (closest to the searched-for ID) $\log_2(n)$ of them are selected for the next iteration. Only hitherto unqueried nodes are asked to provide their finger tables. The search continues until the top list of the $\log_2(n)$ closest peers is no longer modified at the end of an iteration. The closest peer is then returned as the result of the search. We choose to retrieve the full FT for the following reasons: First, we get extra redundancy while executing the lookup; second, the queried node does not learn which ID $v$ is interested in. This keeps the adversary from responding with targeted malicious nodes close to the searched-for ID.
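The change relative to the sketch in Section 3.3.2 is small; hypothetically (with an assumed primitive `get_finger_table(node)` that returns a node's complete FT and never sees $x$):

    import math

    def hidden_goal_search(x, own_finger_table, n, distance, get_finger_table):
        """Aggregated greedy search where each queried node reveals its entire
        FT and never learns x; alpha is fixed to log2(n)."""
        alpha = max(1, int(math.log2(n)))
        known = set(own_finger_table)          # first step: query all peers in own FT
        queried = set()
        top = sorted(known, key=lambda v: distance(v, x))[:alpha]
        while True:
            for node in list(top):
                if node not in queried:
                    queried.add(node)
                    known.update(get_finger_table(node))  # request full FT, not x
            new_top = sorted(known, key=lambda v: distance(v, x))[:alpha]
            if new_top == top:                 # top list stable: search is done
                return top[0]
            top = new_top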

Figure 3.1: Malicious nodes respond with random collaborating nodes

Figure 3.1 shows our simulation results for aggregated greedy search while trying to hide the searched value. All plots include 95% confidence intervals. Malicious nodes provide only other malicious nodes in their FTs. Since the searched ID is not known to them, the malicious nodes deliver random colluding nodes.

Perhaps surprisingly, the rate of found malicious nodes appears to approach $2f - f^2$. This phenomenon might intuitively be explained as follows: Since the first round is unbiased, we can expect a rate of $f$ colluding nodes queried. However, when we assume that all these nodes reply with malicious nodes only, while the honest nodes' finger tables still have attacker ratio $f$, the expected attacker ratio in the replies from round 1 is $f \cdot 1 + (1-f) \cdot f = 2f - f^2$; for $f = 0.2$, for example, this gives $0.36$. Notice that from the second round on, all queried nodes have an ID bias towards $x$. While the attacker nodes still answer with random colluded nodes, the honest nodes' finger tables, due to the exponentially increasing finger distances in Chord (see Section 3.3.4 for more details), contain more nodes in their own vicinity, and thus in the vicinity of $x$. Thus, the attacker quickly loses its advantage in the course of the search and may not be able to increase the attacker ratio from the second stage on. This explanation does not yet give a full formal model of the search process, but it might help account for the results we have seen.

While these results are a step in the right direction, they are far from optimal on their own. Moreover, there is an even more efficient attack against this search strategy. Because the course of the search in future iterations depends on previous rounds' query results, colluding nodes can combine their knowledge in order to approximate the searched-for ID. Because the Chord ring is directed, the searcher will not ask for IDs lying "behind" the search value $x$. Thus, the attacker can estimate an interval for the value of $x$ by looking at which colluding nodes the searcher knows of but does not query. It can then return malicious nodes which are as close as possible to the searched-for ID. Figure 3.2 shows the simulation results for the case where the malicious nodes estimate the queried ID. Note that this is only possible after the second search iteration, since the first round does not leak information about the direction of the search. We see that even having only 10% malicious nodes in the system leads to more than 50% attacker success. Thus, even aggregated greedy search without explicitly telling the search value does not offer satisfactory protection: an adversary can still learn $x$, eclipse the searcher, and guide him into a cluster consisting of malicious nodes only. Hence, we need one additional building block.

Figure 3.2: Malicious nodes respond with collaborating nodes closest to the search value

3.3.4 Bounds Checking in Finger Tables

The success of the active attack described in the previous section rests on the fact that colluding nodes can provide arbitrary nodes in their FTs. To mitigate this, we exploit properties of DHTs with a deterministic choice of peers in the FT. Chord is one such DHT. In Chord, the $i$-th finger of a node with ID $m$ is supposed to point to the node whose ID is closest to $m + 2^{i-1}$. Since $v$ knows the IDs of the nodes it queries, it can calculate these values and compare them to the actual finger table values in the responses it receives. Since we already retrieve and have the entire FT, the check can be performed without transmitting any additional information. In contrast to earlier approaches [92, 16], we do not only check the end result of the query, but all the intermediate steps. As our evaluation shows, this significantly improves the success probability.

We perform bounds checking in finger tables as follows: Each peer calculates the mean of the distances between the actual IDs in its FT and the optimal IDs (as if all IDs existed). We denote this as the mean distance. In order to decrease the variance of this mean distance estimate, it might be desirable to also take into account other finger tables as the peer learns of them. Furthermore, the mean distance is multiplied by an FT tolerance factor. The search is now modified as follows: in each iteration, FTs are only accepted and considered for finding the $\log_2(n)$ nodes closest to $x$ if they pass the FT test. The test yields a positive result if and only if the mean distance of the considered FT is smaller than the average sampled mean distance multiplied by the tolerance factor.

Figure 3.3: Finger table analysis: malicious nodes change 10 entries

Figures 3.3 and 3.4 show the differences in the mean distances of honest and colluding nodes' FTs. While Figure 3.3 shows the case where malicious nodes change 10 honest FT entries to malicious ones, in Figure 3.4 only 4 entries are changed. Malicious nodes change each targeted entry from its actual value to the malicious node closest to that value. This is an optimal strategy for making the FT look plausible when subjected to our test. Since there are 100,000 nodes in the considered scenario, there are on average 16 nodes in each finger table. We see that the mean distance is clearly distinguishable when many FT entries are changed, and becomes closer and less distinguishable when only a few entries are modified. While this finding seems obvious, the actual values give an intuition to what extent the FTs of malicious nodes can be changed while still passing the FT check.

Figure 3.4: Finger table analysis: malicious nodes change 4 entries

Figure 3.5 shows the influence of the FT tolerance factor on false positives and on the fraction of malicious nodes found when the adversaries make their FTs contain only colluded nodes, changing entries to the next colluded node where necessary. In contrast, Figure 3.6 shows the factor's influence on the fraction of malicious nodes found when the attackers detect the search direction and intelligently replace a predefined number of honest nodes with colluded ones, starting with the honest nodes closest to the searched area. Again, the malicious node closest to the selected honest node is chosen for this operation. The other entries remain unchanged, which increases the plausibility of the modified FTs. We can see the dependency of the attacker success rate on both the FT tolerance factor and the number of replaced entries. A striking finding here is that replacing all the entries in the FT with the closest malicious nodes does not help the adversary to significantly increase the rate of its nodes in the end results of the queries, even if the searcher were to believe all these FTs. This means that as long as the FTs look plausible, the attacker is not able to significantly bias search results, even if he could manage to provide the searcher with malicious nodes only. This provides a clue as to why the combination of aggregated greedy search with finger table checking provides such strong protection against eclipse attacks: by forcing the attacker to conform closely to the original structure of the DHT and aggregating results, we can always correct misleading information by taking into account the FTs of close-by honest nodes.

Figure 3.5: Factor influence on success probability (1)

We conjecture that this combination could probably deter eclipse attacks even without hiding the search value, but we keep this feature because it costs us nothing at this point and might be useful against information leakage attacks (see Section 3.4 for more details).

In Figure 3.7 we additionally assume that the adversary knows the users' FT acceptance threshold. As in the previous plot, the attacker learns the search value. He replaces FT entries in the search direction, but only up to the acceptance threshold, so that his FT will still be accepted by an honest user. This is the strongest adversarial behavior that we tested, because it seems hard to conceive of an easy way for the attackers to optimize their responses beyond this point. The optimization problems arising for the attacker are complex to describe precisely, let alone solve efficiently. Certainly, there is ample space for future research in this direction.

Figure 3.6: Factor influence on success probability (2)

Still, because the structure of DHTs like Chord seems so benign to our approach, we conjecture that the effect of further optimization, at least when restricted to efficiently solvable problems, will be limited.

As our simulations show, bounds checking on FTs is a very promising technique. An FT tolerance factor of 3 seems to be a good choice for the considered setup. As Figure 3.7 shows, virtually no FTs of honest users are rejected any more when we apply this factor. Furthermore, malicious nodes are not able to increase their rate in the system when changing more than 3 entries (see Figure 3.6). As Figure 3.7 suggests, knowing the searched ID and the acceptance threshold of the honest users does not help malicious nodes to increase their rate in the queries beyond their rate in the system. Castro et al. [16] give a very precise statistical method for optimizing the tolerance factor when checking end results. Their method could easily be adapted to our FT checking. Yet we find that simply selecting an ad hoc value such as 3 works astonishingly well in our simulations. Further inspection of Figure 3.6 helps us understand this: We are relatively free to minimize the false positive rate by choosing a rather high tolerance factor, because there is no need to strictly minimize the false negatives. This is because, as long as the false negatives remain below the actual attacker rate $f$, it simply makes no sense for an attacker to lie about its FT.
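To fix ideas, here is a compact sketch of the FT test as described above (our illustration, not the simulator used for the plots; `sampled_mean` is the average mean distance learned from previously seen FTs, and `tolerance` is the FT tolerance factor, e.g. 3):

    def mean_finger_distance(node_id, finger_table, id_bits=160):
        """Mean distance between the actual FT entries and the optimal targets
        m + 2^(i-1); honest FTs in a random network keep this value small."""
        space = 2 ** id_bits
        dists = []
        for i, finger_id in enumerate(finger_table, start=1):
            optimal = (node_id + 2 ** (i - 1)) % space
            dists.append((finger_id - optimal) % space)   # directed Chord distance
        return sum(dists) / len(dists)

    def ft_test(node_id, finger_table, sampled_mean, tolerance=3.0):
        """Accept the FT iff its mean distance stays below the average sampled
        mean distance multiplied by the tolerance factor."""
        return mean_finger_distance(node_id, finger_table) < tolerance * sampled_mean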

Figure 3.7: Malicious nodes know the FT acceptance threshold

Performing the FT check based on the mean distance only is certainly not optimal. Possibly, other strategies would yield even better results; in fact, the whole arsenal of statistical classification techniques could conceivably be brought to bear. However, the results show that even our simple strategy works well in the considered simulations, which is why we leave improvements in this direction for future work. See Section 3.5.2 for arguments why research in this direction might still be worthwhile. One possible improvement might be to consider only, or to weight additionally, those entries which lead in the direction of the search and are considered for the top list of the $\log_2(n)$ best nodes seen so far. If an adversary is able to learn the search direction, these are the first entries he would replace. As the results show, our protection works well against an active attacker.

On a side note, a further protection layer could be established by cross-checking FTs: Especially in the later stages of a search, it should often occur that the searcher can detect a modified FT entry because it already knows from another honest node that there is a node closer to the optimal FT position. When the searcher verifies that this closer node actually exists, this amounts to a proof of incorrect behavior on the part of the colluding node.

Using public key signatures and a trust system, one might even think of using this proof to generally exclude attacker nodes. We leave the details and evaluation for future work. Most of the simulations provided here have been conducted with 20% malicious nodes in the system. However, we have also conducted simulations with 1, 10, and 40% colluding nodes, respectively. Due to the similarity of the results, we forgo their presentation in this chapter. An actual implementation of our approach is being undertaken by Arne Rache at the time of writing.

3.4 Passive Attacks

3.4.1 Information Leakage

All known DHT-based information distribution services, including our approach, are endangered by passive information leakage attacks [86, 35]. These attacks generally exploit the fact that searching in a DHT entails talking to many colluded nodes in the process. The redundancy typically used to prevent active attacks only makes this exposure worse, so that we may even speak of a trade-off in protection against active and passive attacks [86]. Through these search queries, the attacker learns who is searching for what, either directly or by linking multiple queries. Measures such as recursive routing or hiding the search goal may make this information harder to obtain or less precise, yet in general cannot keep the adversary from gaining significant insight. This kind of information can then be abused in a number of subtle ways that are out of scope in this context. In fact, the types of attack possible have become a very active research topic lately [86, 35].

Still, in all of these attacks the worst case that can arise is the attacker gaining precise knowledge of the link between the searching and the searched node. Though this is clearly not desirable, there is little research on the consequences in real-world systems. Arguably, we are worse off if the attacker controls the node found, as is the goal of eclipse attacks. That is why, so far, we have placed emphasis on avoiding this kind of attack. Interestingly, the amount and precision of information an attacker can gain by finding out even the exact looked-for node very much depends on the kind of information stored in the DHT, and thus on the application. This is because in our scheme, the adversary is not going to learn the precise ID searched for, but in the worst case the node that is responsible for this ID. This information degrades privacy more the less information is stored in the DHT. In Section 3.4.2 we present a general observation that allows us to make this information even fuzzier for an observer, regardless of the application.

Still, in the extreme case, when every node stores exactly one data item, this attack potentially reduces user privacy very much. However, we know of only one practical application where this is typically the case: network information services. In this application, we typically look for random nodes in the network. Section 3.4.3 presents an alternative approach for this case that trades some degree of protection against eclipse attacks for minimal information leakage.

3.4.2 Increasing Attacker Uncertainty

We have analyzed the fraction of malicious nodes in the $\log_2(n)$ best nodes list depending on the iteration. When looking for random nodes in an anonymity network information system, we might be tempted to also accept the nodes found in the intermediate steps of a search as onion router candidates, in order to mitigate bridging and fingerprinting attacks. However, as the results in Figure 3.8 show, this is not a good idea. Still, the fraction of colluded nodes in the last search iteration corresponds to the overall fraction of malicious nodes, if we only look at the single result of the search (cf. Figure 3.7). It therefore makes sense to consider the entire list of the $\log_2(n)$ nodes closest to the search value as possible routers, instead of the single result only. This increases the attacker's uncertainty about the nodes which are known to honest users (as their number is significantly increased), and thereby helps counter passive information leakage attacks in this case.

In general, when we aim to store and look up any kind of information on nodes, this finding can still help us mislead an attacker: It effectively says that w.h.p. we do not only find the correct node responsible for $x$ in a search, but the correct $\log_2(n)$ nodes closest to $x$.

Using an estimate of node density in the DHT, we may then avoid looking for $x$ directly and instead select a $y$ slightly "beyond" $x$, such that $Resp(x)$ is still among the $\log_2(n)$ nodes closest to $y$ w.h.p. If it is not (which might be the case when the nodes around $y$ are unexpectedly dense), the searching node may still backtrack a little, since it has requested full FTs along the way. When implemented carefully, this technique should considerably mitigate information leakage effects.
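A sketch of this overshoot idea under stated assumptions (the density estimate, the offset choice, and `search_top_k` are our illustration; the offset's orientation, i.e. what "beyond" means, must match the DHT's distance metric):

    import math, random

    def fuzzed_lookup(x, n, id_bits, search_top_k):
        """Look up a displaced point y instead of x, so that Resp(x) is w.h.p.
        among the log2(n) closest nodes returned, while observers cannot
        single out x as the target."""
        space = 2 ** id_bits
        spacing = space // n              # density estimate: mean gap between node IDs
        k = max(1, int(math.log2(n)))
        # displace by up to roughly half the k-node window around x
        y = (x + random.randint(1, max(1, (k // 2) * spacing))) % space
        return search_top_k(y)            # candidate set containing Resp(x) w.h.p.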

Figure 3.8: Fraction of malicious nodes depending on the iteration

In the next section we look at a much more radical information leakage avoidance technique. Unfortunately, its range of applications is limited to looking for random nodes, as in a network information system for anonymity networks. Accordingly, we treat the technique with a focus on this application. In a more general setting, it might still be useful for tunneling DHT lookups through a random node.

3.4.3 Random Walks in Network Information Systems

The information leakage in our lookup algorithm lies in giving away the search goal $x$, or, more precisely, the link between the searching node $v$ and $x$. Through hiding the search value (cf. Section 3.3.3) and taking into consideration the entire top list at the end of the search (cf. Section 3.4.2), we have already introduced some uncertainty for the attacker. In general, a good way to raise the entropy of our choice of circuit nodes from the attacker's perspective is to simply conduct multiple searches, sequentially or in parallel, thereby learning a greater part of the network. Moreover, such behavior, especially when exhibited by a great many nodes, may serve to obscure the links between searchers and searched-for nodes. Although these measures may foil most attacks in practice, and though it is not clear how much an attacker could profit at all from knowledge gained this way, it would obviously be preferable from a theoretical point of view if we could provably avoid passive attacks altogether. Hence we looked for an entirely different approach that cannot give away a search goal, since there is no such thing. A random walk (RW) through the network is a relatively obvious solution.

In a RW, we randomly select one of our neighbors, ask this neighbor for its finger table, and again randomly select one of its neighbors, iterating this method for a path length $l$. On average, $l$ should be at least $\log_2(n)$, since this ensures that each peer in the DHT can be reached by the RW. Intuitively, there is as little information leakage in this process as we can hope for when routing in a DHT, since there really is no direction to the search that might be leaked.

It is clear that this approach can be combined with our finger table checking to keep the attacker from presenting arbitrary neighbor tables and thus hijacking a path with certainty. It is less clear what to do when we have to reject a finger table. When we only walk one path, we have no choice but to backtrack one or multiple nodes in our path and choose again. The extreme case of this method is simply starting anew, which is certainly the most secure variant. Simply backtracking one node opens the door for an elaborate DoS attack: an adversarial node $a$ might have a high rate of attacker neighbors in a plausible FT. This is certainly possible for a few attacker nodes by chance, or when assuming the more powerful attacker model from Section 3.5.2. By collusion, the attacker nodes next in the walk know that $a$ came before them and simply reply with an implausible FT. The searcher will then go back to $a$ and try again, repeating the useless travel back and forth w.h.p. Of course, one might think of elaborate schemes to avoid such traps, like progressive backtracking. Yet our design principle of security by simplicity cautions against introducing unnoticed attack possibilities through complexity. There might be another option when we try to achieve redundancy through multiple parallel walks: replacing the implausible FT by a plausible FT from another thread. At first glance, this option seems appealing because it removes the "offender" and all its predecessors from the search while keeping the degree of redundancy. Notice, however, that we lose the independence of paths in this approach. This is easy to see when we imagine an extreme case: replacement of all but one FT in the last step of the random walk. In this case, all results stem from the FT of one node. A detailed analysis of the properties of this random process seems involved. We leave it for future work and continue our analysis of a simple, non-redundant random walk.
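A minimal sketch of such a walk with FT checking (our illustration; `get_finger_table` and a plausibility test like the one from Section 3.3.4 are assumed primitives; on rejection it restarts from scratch, the most secure variant just discussed):

    import random

    def random_walk(start_node, length, get_finger_table, ft_plausible):
        """Walk `length` random hops; restart the whole walk whenever a
        finger table fails the plausibility test."""
        while True:
            node = start_node
            for _ in range(length):
                ft = get_finger_table(node)
                if not ft_plausible(node, ft):
                    break                    # implausible FT: restart from scratch
                node = random.choice(ft)     # uniformly random neighbor
            else:
                return node                  # completed all hops: quasi-random peer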

In spite of FT checking, we have to assume a ratio $g$ of colluded nodes in attacker finger tables that is potentially higher than $f$, especially when we allow for arbitrary attacker positioning as discussed in Section 3.5.2. Let $p_f(l) = 1 - p_s(l)$ be the probability of selecting a colluded node after $l$ steps. Obviously, $p_f(1) = f$. Moreover, for $l > 1$, $p_f(l) \ge g\,p_f(l-1) + f\,p_s(l-1)$, because colluded nodes answer with a fraction $g$ of colluded nodes, while correct nodes still return a colluded fraction $f$. Solving this recurrence relation using geometric series yields

$$p_f(l) \ge f\,\frac{1-(g-f)^{l+1}}{1-g+f},$$

which can easily be checked by induction. We observe that with growing path length, this probability rapidly approaches $\frac{f}{1-g+f}$. Figure 3.9 plots this predicted attacker success rate for increasing $g$ with $f = 0.2$.
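To spell out the geometric-series step (indexing the initial selection as step 0, which matches the exponent $l+1$ above): substituting $p_s(l-1) = 1 - p_f(l-1)$ gives

$$p_f(l) \ge g\,p_f(l-1) + f\,(1 - p_f(l-1)) = f + (g-f)\,p_f(l-1),$$

and unrolling from $p_f(0) = f$ yields

$$p_f(l) \ge f \sum_{i=0}^{l} (g-f)^i = f\,\frac{1-(g-f)^{l+1}}{1-(g-f)};$$

since $|g-f| < 1$, letting $l \to \infty$ gives the limit $\frac{f}{1-g+f}$.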

Figure 3.9: Theoretical lower bound to attacker success on random walks for attacker ratio $f = 0.2$ and increasing attacker finger table corruption $g$

Unfortunately, this strong dependency on $g$ turns out to be problematic in the real world because it compounds with another property of RWs. Figure 3.10 displays the results of our simulations. Specifically, we look at the impact of the tolerance factor when the attackers know this factor and try to modify as many fingers as possible without being detected. Unlike in aggregated greedy search, where it is important for the attacker to change FT entries close to $x$, in a RW every FT entry is equally important, and so it becomes a good adversarial strategy to adapt as many entries as possible, which translates into making small changes first. We can see that already with a factor of 3, the attackers can modify about 12 out of 16 FT entries on average, which, consistent with our prediction, translates to almost 50 percent attacker success for $f = 0.2$ (indeed, $g \approx 12/16 = 0.75$ yields $0.2/(1 - 0.75 + 0.2) \approx 0.44$).

The sweet spot seems to lie at a lower factor here, but even at factor 2 we get a failure rate of about 0.35, with already significant false positives.

Figure 3.10: Tolerance factor influence with random walk

This shows that using RWs for network information is only advisable when the active attacker ratio is low and information leakage is a serious issue. On the other hand, the results highlight the effective protection against an active adversary that is provided by our improved aggregated greedy search scheme with finger table checking.

We have been thinking about a combined approach that might bring together the advantages of both aggregated greedy search (security against active attacks) and RWs (security against passive attacks). Unfortunately, the straightforward idea of searching for a few rounds and then randomly switching the goal $x$ seems to have limited applicability, as Figure 3.11 suggests. It makes clear that during the search process, the ratio of colluding nodes in the total set of nodes surveyed, as well as in the closest-nodes top list, rises rather quickly in the first few steps, a finding echoed in Figure 3.8. It is mitigated only later on, when the search converges towards $x$. By this time, however, the attacker already has a pretty good idea of the search direction.

[Figure: "Combined search (100,000 Peers, 20% malicious, Factor 2)". Curves: malicious nodes found and fraction of malicious nodes in the last round (left axis, fraction [%]), and number of remaining candidate peers possibly searched for (right axis, number of peers [#]), plotted against the number of rounds with the same search ID.]

Figure 3.11: Combined: random walk and search

3.5 Further Observations

We briefly discuss some miscellaneous concerns and possible solutions in this section, without giving them a fully formal treatment.

3.5.1 Bootstrapping Process

So far in our analysis we have assumed correct bootstrapping of the network. Under this assumption we have shown that our approach is able to provide adequate protection to its users: the fraction of malicious nodes found in lookups is not significantly larger than the overall fraction of malicious nodes in the system. Additionally, we have introduced some measures for countering passive attacks. Here we discuss how users can overcome the problem of malicious nodes while joining the network, i.e., bootstrapping.

We assume that before joining the network a user knows a few DHT members and that at least one of them is not colluding. This assumption is

meaningful since it is unlikely that any approach would work if only malicious nodes are known to the user. The user generates its ID (which might be a hash value of its DHT public key, say) and asks the known DHT members to execute bootstrapping for this ID. Each of these nodes executes the lookups (in the way we proposed before) for the entries in the new node's FT and communicates them to the new node. The new node selects the entries closest to the optimum values. Notice that even a majority of evil nodes could not break this process, as long as there is one honest node whose searches succeed. After the stabilization protocol has been run [88], the new node is a regular member of the network. By basing bootstrapping (and maintenance, which can be conducted in a similar manner) on our secure routing primitive, we are confident not to introduce additional security hazards.
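A compact sketch of this joining procedure is given below. It is illustrative only: secure_lookup stands for our secure routing primitive executed by a known member, while optimal_targets and distance are hypothetical helpers for the FT target points and the ring metric.

    def bootstrap_finger_table(new_id, known_members, secure_lookup,
                               optimal_targets, distance):
        """For each FT target of the new node, ask every known member to run
        the secure lookup and keep the reply closest to the optimum; a single
        honest member with working lookups suffices."""
        finger_table = []
        for target in optimal_targets(new_id):
            replies = [secure_lookup(member, target) for member in known_members]
            finger_table.append(min(replies, key=lambda n: distance(target, n)))
        return finger_table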

3.5.2 Arbitrary Positioning of Malicious Nodes

So far we have considered the case where the malicious nodes are uniformly distributed along the ID space. It is beyond the scope of this thesis to discuss the realism of this assumption and possible measures for enforcing it. Still, we briefly look at a stronger adversarial scenario: what if the colluded nodes could arbitrarily position themselves within the whole ID space of the Chord ring? Clearly, if this were to work instantly, or if the DHT remained very stable for a long time, the adversaries could eclipse a single user if they knew his ID. We consider the case where they do not have a concrete victim but are rather interested in being on as many paths in the system as possible, thus trying to get as much information as possible about the whole system. At this point, the asymmetry of our DHT distance metric comes into play: it assures that, typically, a (colluded) node is not a neighbor of its neighbors. Thus, it is a nontrivial feat for the adversary to construct positions for its nodes such that their FTs may contain many of their own while still being plausible. Again, a presumably hard problem arises for the attacker, and we can only give a simple solution that we believe to be close to optimal, without being able to prove this conjecture.

From our point of view, the so-called bisection would be a very good attacker positioning strategy in this case: recursive division of the ID space into two equal parts (halves), placing the malicious nodes on the dividing points. This would lead to "perfect" FTs in the sense that the mean distance would rapidly approach zero with an increasing number of malicious nodes. Figure 3.12 shows simulation results for this scenario. Even though the results are worse than in the regular case, where we assume that arbitrary positioning is not possible, the rate of colluding nodes found in random searches remains fixed with increasing network size, while being non-linearly dependent on the total attacker rate.

Accordingly, when the attackers are able to mount this kind of attack, we can say that our approach as it is still scales, yet is more vulnerable to high attacker rates.
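The bisection placement is simple to generate. The following sketch (illustrative; a small ID space is used for brevity) lists the dividing points in the order in which an adversary following this strategy would occupy them:

    from collections import deque

    def bisection_positions(num_nodes, id_bits=32):
        """Recursively halve the ID space and emit the midpoints breadth-first:
        first 1/2 of the ring, then 1/4 and 3/4, and so on."""
        positions, intervals = [], deque([(0, 1 << id_bits)])
        while len(positions) < num_nodes:
            lo, hi = intervals.popleft()
            mid = (lo + hi) // 2
            positions.append(mid)
            intervals.append((lo, mid))
            intervals.append((mid, hi))
        return positions

    print(bisection_positions(7, id_bits=4))   # [8, 4, 12, 2, 6, 10, 14]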

[Figure: fraction of malicious nodes found [%] versus network size (up to 100,000 peers), for 10% and 20% malicious nodes.]

Figure 3.12: Arbitrary positioning of malicious nodes

Moreover, having perfect finger tables is also conspicuous and could be detected by the honest users. A more involved FT classification technique might prove valuable in this setting. Note also that malicious nodes cannot easily improve their positioning by being only close to the optimal values: this would work only for a few of them; the rest would have "regular" FTs.

It is not hard to come up with slight modifications to our DHT scheme that foil the simple bisection attacker plan. For example, we might think of requiring the $i$th FT entry of node $m$ to be strictly greater than $m + 2^{i-1}$, instead of just greater than or equal. This already breaks symmetry. However, it seems much harder to come up with a DHT structure that can actually be shown to allow for no or little advantage through any chosen positioning. The concept of expander graphs [89] might be instructive here: informally speaking, in an expander graph, all node sets within some size bounds, such as the set of colluding nodes, have many neighbors outside of the set itself. Unfortunately, there are at least two open questions in our application: Firstly,

how do we deal with neighbors that are not within the set, yet close to some element of the set? Secondly, how can we ensure that the underlying graph of a DHT conforms to these expander properties? Law and Siu [73] have begun research in a similar direction, though it is not immediately clear how their high-level paper on "Distributed Construction of Random Expander Networks" can be applied in our setting.

We have to leave this fascinating research topic for future work, yet remark that it is a standard problem in security research that unknown attacks can never be ruled out in general. In light of our findings, we therefore propose our protection measures as the most secure and scalable approach to truly distributed information distribution that we know of; and while we cannot guarantee the absence of unthought-of attacks, the above considerations inspire us with confidence that our approach will defend against them gracefully, or at the very least can be adapted to do so easily.
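Returning to the symmetry-breaking modification suggested above, a finger table bounds check along these lines might look as follows. This is a sketch only; the ID space width and the tolerance semantics are illustrative assumptions, not the exact parameters of our simulations.

    def finger_plausible(node_id, index, entry_id, tolerance, id_bits=32):
        """Check the index-th finger of node_id: its asymmetric ring distance
        must strictly exceed that of the optimum node_id + 2^(index-1) (the
        symmetry-breaking variant) while staying within a tolerance factor."""
        space = 1 << id_bits
        opt_dist = 1 << (index - 1)              # distance of the optimal finger
        dist = (entry_id - node_id) % space      # asymmetric Chord distance
        return opt_dist < dist <= tolerance * opt_dist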

3.5.3 Path Length in Aggregated Greedy Search

Manku, Naor, and Wieder have analyzed the path lengths for greedy and neighbor-of-neighbor (NoN) greedy routing on a number of peer-to-peer networks including Chord [83]. NoN greedy routing means that in every step, the searcher asks all neighbors of the current next node for their neighbors and then makes the NoN closest to the goal² the new current node. Their findings are surprising: While deterministic routing networks (like a deterministic Chord network that contains all possible nodes) have diameter $\Theta(\log n)$ and greedy routing is optimal, randomized routing networks (like randomized Chord) have diameter $\Theta(\log n / \log\log n)$, and while greedy routing takes $\Omega(\log n)$ hops w.h.p., NoN greedy routing routes in an asymptotically optimal $\Theta(\log n / \log\log n)$ hops. It is interesting to see where aggregated greedy routing falls.

The authors' lower bound proof partitions the lookup into two phases that they handle separately. In the first phase, a theorem by Coppersmith, Gamarnik, and Sviridenko [27] is invoked to decrease the $L_1$-distance from the goal $0$ to $e^{\sqrt{\log n}}$ or less in $O(\log n / \log\log n)$ steps. The second phase then takes care of the remaining distance, and simple greedy routing suffices to keep the same time bound here. Because aggregated greedy routing is obviously no worse than simple greedy routing, we only have to check the first phase. Coppersmith et al.'s proof is fairly technical and involved, yet a careful rereading shows that it holds for

²Differing from our terminology elsewhere, we adopt the authors' notation in this section: w.l.o.g., we route from $x$ to the goal at $0$.

aggregated greedy routing as well. NoN routing is only used implicitly in their Lemma 6.1. Therein, it is shown that any node $x$ with sufficient distance to the goal $0$ is connected to a node within a small ball $V(x)$ around the goal by a path of length two w.h.p. This is done by first showing that $x$ has many neighbors $y$ closer to $0$ than $x$, and then bounding the probability that any such $y$ has no neighbors in $V(x)$. This argument can be used for NoN routing because NoN routing finds optimal paths of length two. It can also be used for aggregated greedy routing because of the following reasoning: If $x$ is in the top $\log n$ list at any routing step, all its neighbors are considered. Each one of them, say $y$, has the chance to be in the new top list, unless there is an even closer node $y'$ to replace it with. Because $y'$ is closer to $0$ than $y$, it has a neighbor in $V(x)$ with at least the same probability as $y$. That is why we expect aggregated greedy routing to perform at least as well as NoN greedy routing, which has been shown to be asymptotically optimal, in the small-world percolation networks considered by both groups of authors, and thus in most randomized peer-to-peer networks.
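For reference, the step rules compared in this subsection can be sketched as follows (neighbors and distance are hypothetical callbacks; illustrative only):

    def greedy_step(current, goal, neighbors, distance):
        """Plain greedy routing: move to the neighbor closest to the goal."""
        return min(neighbors(current), key=lambda v: distance(v, goal))

    def non_greedy_step(current, goal, neighbors, distance):
        """NoN greedy routing: among all neighbors of neighbors, make the one
        closest to the goal the new current node (reached via its first hop)."""
        candidates = (w for v in neighbors(current) for w in neighbors(v))
        return min(candidates, key=lambda w: distance(w, goal))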

3.6 Conclusion

In this chapter we have analyzed security issues in DHTs that threaten privacy and proposed protection measures to defend against these attacks. Although we acknowledge and defend against passive information leakage attacks aimed at direct data collection, our focus has been on active eclipse attacks in particular. This is because we believe that, informally speaking, when you cannot trust whom you talk to directly, there is little point in guarding the conversation from being overheard. Or, more technically speaking, when an attacker can make the searcher ask a node for data that is controlled by the attacker, he learns the connection between the searcher and the data looked for, even when the system is perfectly secure against passive attacks.

After showing that earlier methods cannot scale reliably in the face of eclipse attacks, we introduced aggregated greedy routing in combination with hiding the search goal and FT bounds checking to overcome these attacks. Our scheme prevents malicious nodes from biasing node lookups, while requiring each node to know only a small subset of the network. Unlike earlier approaches, our approach is highly scalable and does not require trusting any third party.

Just like every other known approach that does not lead to a full network view, our approach is still susceptible to information leakage attacks. Even though their practical seriousness and impact are still under research, we

hardened our system against these attacks by hiding the searched goal and learning a significantly large part of the network. If information leakage is intolerable in a given scenario, we propose the alternative approach of random walks, while acknowledging that this method does not offer equally strong protection against active attacks and can only be used to find random nodes. In practice, a more typical way of dealing with the problem would be to continually perform a number of lookups. This way, a greater part of the network becomes known over time, making fingerprinting-type inference significantly harder for the attacker. In the limit, this poses the question whether gossiping-like alternatives that lead to discovery of the entire network can be made scalable, and whether adversarial exploitation of a full network view, as in intersection attacks, can be prevented. These alternatives have to be further researched in order to find the most appropriate solution for privacy-preserving distributed information storage.

In further observations, we discussed the often-overlooked aspects of bootstrapping and the vulnerability against more powerful attackers that are able to choose their position in the DHT, and finally noted that aggregated greedy search is asymptotically optimal with regard to average path length in a very general network model.

Chapter 4

Learning Unique Hitting Sets — Bounds on Anonymity

Who recognizes his limitations is healthy; Who ignores his limitations is sick. The sage recognizes this sickness as a limitation. And so becomes immune. [133, verse 71]

4.1 Introduction

Privacy is one of the main concerns for world libraries as defined in Chapter 1. In fact, if we could ignore this central issue, we could choose from a wide range of proven solutions (cf. [100]). Anonymity, "the state of being not identifiable within a set of subjects, the anonymity set" [63], is one key ingredient to achieving or preserving privacy. This is because, obviously, information becomes personal only when it can be linked to a person or a group. In a world library, as an open distributed system, we can anticipate that an adversary will often be able to observe information, such as communication patterns, that would hurt the privacy of users, were he able to link this information to these users.

In order to achieve the state of anonymity, we use anonymity techniques. We will define this term more formally in Section 4.1.1 to stand for a so-called MIX network that cloaks the relationships between a set of senders and a set of receivers. Note, however, that this model can be used to make statements about any form of anonymity with the above definition. For example, we may look at an entire world library as an anonymity technique that is supposed to hide the relationship between, e.g., the set of its users and the set of

media requested from or donated to the library. On the other hand, an anonymity technique might be a small building block in a world library, as, say, a MIX network that obfuscates some kind of communication structure, for example, who caches data for whom.

Unfortunately, there are limits to the anonymity that any such technique may provide. Even when we abstract from the internals of the mechanism and look at an anonymity technique as a black box, certain inferences are possible for an adversary who is in a position to monitor the inputs and outputs of this box. Since we just pointed out that very different situations can be modeled in this way, general limits to the anonymity of such black boxes carry over to limits to the anonymity of the people or entities in these situations. Continuing with the examples above, your Internet service provider might collaborate with someone offering sensitive information inside the library. The results below can then be used to show that after you have read a certain amount of information from this source, the library, however it might operate, can no longer keep the collaborators from finding out that it was you who accessed the sensitive information. This, in turn, might inspire us to loosen the relationship between meaning and participants in the system, an idea that is the most prominent feature of the Owner-Free File System discussed in Section 2.3.

In this chapter, we show a very general upper limit to anonymity by providing the first mathematical analysis of the number of observations required to learn a unique minimum hitting set. Analyzing learning algorithms helps to understand the complexity of hard problems. When solutions to a certain problem turn out to be learnable in little time, the resulting algorithm can be employed to solve the problem in everyday applications. In this sense, learnability yields a positive answer to the question of tractability.

Conversely, a system that relies on the intractability of a computational problem can be proven useless, or at least endangered, by adequate results on the learnability of that problem. In this chapter, we show how such results can be used to analyze an anonymity system that relies on the hardness of the unique minimum hitting set problem. In fact, this so-called hitting set attack [70], which is closely related to the intersection attack [9], sets a limit to the possible anonymity of persistent communication in any anonymity system if we follow the well-accepted definition of anonymity as "the state of being not identifiable within a set of subjects, the anonymity set" [63]. Thus, we begin with a short introduction to the general field of anonymity techniques in order to justify our model. George Danezis and Claudia Diaz have recently published a comprehensive "Survey on Anonymous Communication Channels" [31], to which we refer the interested reader for further details.

4.1.1 Anonymity Techniques

Using information technology, it is possible to manage personal and business data efficiently, i.e., to access data from everywhere and at any time, to store and replicate it without loss, and to distribute it to a number of people. However, nothing is for free. With the number of information technology application areas, the risk of misuse of the available data is also increasing. Therefore, security techniques have the goal of helping users enforce their security interests under potential threats [38, 55].

Confidentiality is one of the basic security requirements [38]. It requires that information be available exclusively to the intended participants. We know two main approaches to evaluate the security strength of techniques providing confidentiality: the information theoretic approach and the complexity theoretic approach. In the information theoretic approach, one has to show that there is no information leakage, implying that the adversary cannot learn anything new about the confidential information [55, 120, 121]. In the complexity theoretic approach, information leakage is tolerated, but learning is required to be infeasible [58]. Thus, learning algorithms play a vital role in the area of security. They allow for evaluating the strength of security techniques.

In the area of confidentiality, learning algorithms are mainly used to evaluate cryptography [112], that is, to evaluate the content of exchanged data. In this context, we usually refer to the classical Shannon-Weaver information model [120, 121]: We assume that information is exchanged between a sender and a recipient by applying the right encoding and decoding algorithms. According to this model, information is protected by using encryption with an unknown secret key (and, of course, decryption on the other side). Thus, the aim of the adversary is to learn the secret key.

However, the information concept of the Shannon-Weaver model does not reflect the current state of confidentiality requirements because the model does not cover all the sensitive information that has to be protected. For instance, a user, say Alice, might want to request data from an HIV/AIDS database. Even though the exchanged data may be protected by encryption, the relation of Alice to the HIV/AIDS database is open to the adversary. This information can simply be revealed on the network level by logging the network address information (traffic information). Having access to this information, the adversary can build a profile of Alice, e.g., how much and with whom Alice communicates frequently.

To protect traffic information, a number of anonymity techniques have been suggested [17, 18]. All of these obey the aforementioned definition: Anonymity is the state of not being identifiable within a set of subjects, the

anonymity set. An anonymity technique, a so-called MIX server, collects $n$ data packets from $n$ distinct users, changing the appearance and order of the packets so that no outsider can link an incoming packet to an outgoing packet. Thereby, this mechanism conceals the specific communication relationships of Alice amidst the additional traffic of the other users. Following this general description, we model the attacker and the anonymity system so that the attacker observes and records the sets of incoming and outgoing packets to and from a MIX whenever Alice has contributed a message. Hence, the following abstract model can be used:

Assume (cf. [70]) that the set of all peers is $A = \{1, 2, \ldots, N\}$ and that Alice frequently communicates with a subset $B$ of $A$, say $B = \{1, 2, \ldots, m\}$. Then the aim of the adversary is to learn $B$. Since Alice uses a strong anonymity technique, the elements of $B$ are only observable in terms of anonymity sets (more precisely, multisets), i.e., the adversary makes observations $\mathcal{O}_i$ consisting of one element from $B$ and $n - 1$ additional elements from $A$, with multiple occurrences possible. For the adversary, given only one observation and a uniform distribution, each element of $\mathcal{O}_i$ is equally likely to be the real peer partner. To learn $B$, the adversary collects a number of observations $\mathcal{O}_1, \mathcal{O}_2, \ldots, \mathcal{O}_t$ and tries to determine the set of real peer partners. The resulting question is: what is the minimum number $t$ of observations required to learn $B$?

It is commonly believed in the security community¹, as well as intuitive from an information theoretic point of view, that non-uniform distributions only ease the task of revealing that piece of information, as they eliminate maximum uncertainty. Thus, the uniform distribution model is not only mathematically smooth but can also be justified as a kind of worst-case scenario for the adversary, provided that the anonymizer cannot adapt to the model or algorithm used by the adversary. In earlier scholarship [70], a learning algorithm based on the computation of hitting sets has been suggested and evaluated using simulations, and it has been shown that $B$ is exactly learnable if the learning algorithm can identify a unique minimum hitting set.

Definition 3 Given a family of multisets $\mathcal{O}_1, \mathcal{O}_2, \ldots, \mathcal{O}_t$, a set $B$ is called a hitting set for this family iff, for every $1 \leq i \leq t$, we have that $|B \cap \mathcal{O}_i| > 0$. We call $B$ a unique minimum hitting set for $\mathcal{O}_1, \mathcal{O}_2, \ldots, \mathcal{O}_t$ iff $|B| < |C|$ holds for every other hitting set (with respect to the same family) $C \neq B$.

¹"All other things being equal, anonymity is the stronger, the larger the respective anonymity set is and the more evenly distributed the sending or receiving, respectively, of the subjects within that set is." [103]

Informally, $B$ can be identified as the unique minimum hitting set of $\mathcal{O}_1, \mathcal{O}_2, \ldots, \mathcal{O}_t$ because $B$ has to intersect (hit) all observations by construction. For the sake of simplicity, the number of peer partners $|B|$ is often assumed to be known. In this work, we give a theoretical analysis by assuming the same parameters $N := |A|$, $m := |B|$, and $n := |\mathcal{O}_i|$. Surprisingly, our results do not depend on any knowledge of $N$, $m$ or $n$.
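The model and Definition 3 translate directly into code. The following sketch (toy parameters only) draws observations as described above and decides uniqueness by brute force over all candidate sets up to size $m$; this exponential search is precisely the obstacle discussed in Section 4.1.2 and avoided by the algorithms of Sections 4.3 and 4.4.

    import random
    from itertools import combinations

    def observe(A, B, n):
        """One observation: Alice's true peer plus n-1 uniform draws from A
        (a multiset, so repetitions are possible)."""
        return [random.choice(B)] + [random.choice(A) for _ in range(n - 1)]

    def hits(H, observations):
        """Is H a hitting set, i.e. does it intersect every observation?"""
        return all(any(x in obs for x in H) for obs in observations)

    def unique_minimum_hitting_set(observations, universe, max_size):
        """Return the unique minimum hitting set (Definition 3) if it exists,
        or None; brute force, exponential in max_size."""
        for size in range(1, max_size + 1):
            hitters = [set(H) for H in combinations(universe, size)
                       if hits(H, observations)]
            if hitters:
                return hitters[0] if len(hitters) == 1 else None
        return None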

4.1.2 Related Work and Contributions

Although much has been written about attacks on anonymity systems (see, e.g., [86, 5, 1, 76, 142]), most of these attacks rely on system specifics such as network structure or communication timing. There are few results that hold in a general setting like the one described above. This general MIX model was introduced in 2002 by Kesdogan, Agrawal, and Penz [68], who also discovered the disclosure attack along with it. This attack unveils Alice's communication partners with certainty, yet relies on solving instances of an NP-hard problem. George Danezis' 2003 statistical disclosure attack [30, 34] relaxes this requirement, yet its hypotheses are only correct with a certain probability. The hitting set attack and the statistical hitting set attack were discovered by Kesdogan and Pimenidis in 2004 [70]. These attacks are more efficient in terms of the number of observations needed than their disclosure attack counterparts. However, they rely on being able to solve the NP-hard hitting set problem. Herein we show that learning algorithms can be employed to efficiently solve the problem instances arising in this attack, and we analytically bound the number of observations necessary in order to break anonymity. A preliminary version of these results has recently been accepted for publication [69].

The chapter at hand consists of three main results, the first of which is a bound on the number $t$ of observations required for $B$ to be a unique minimum hitting set, rather than just a hitting set of possibly minimum size, for $\mathcal{O}_1, \ldots, \mathcal{O}_t$. This is followed by two learning algorithms with distinct objectives. The first, simple one shows that the unique minimum hitting set can be learnt efficiently and within a small number of observations in many settings. The second, more involved one relies on more preconditions and is highly likely to take more time, but it is able to prove its hypothesis correct with high probability.

It is easy to see that $B$ can be identified in the limit even if $m$ is not known: A text that contains all observations contains a multiset $\{i, \ldots, i\}$ for every $i \in B$ as well. A simple learning strategy is to present the set of all $i$'s seen in such pure observations. If $m$ is known, this strategy is even self-monitoring. As such pure observations will only occur extremely rarely

in practice, however, such a learning algorithm seems to be rather useless. Knowledge about the underlying probability distribution leads to much faster algorithms [145]. When using more sophisticated learning algorithms, the correctness of a hypothesis $H$ can be shown by proving that $H$ is a unique minimum hitting set [70]. Whereas it is easy to verify that $H$ is a hitting set, determining uniqueness is known to be NP-complete. Until now, this has been an obstacle for the hitting set attack. Exploiting the special structure of the hitting set instances occurring in this application, however, the second algorithm can prove uniqueness in linear time.

The following table gives some examples for the numbers of observations required to reach an error probability of at most 1/100 in each of the three aforementioned contexts: uniqueness (U), the simple algorithm (A1), and the advanced algorithm (A2). The parameters have been chosen to exemplify practically plausible configurations. On the other hand, the results show a nontrivial dependency upon parameter variation for the derived worst-case bounds.

   m    n      N    U     A1       A2
  10   20  20000   19  14898   783048
  10   50  50000   22  14899   759100
  20   50  20000   53  66468   580094
  20   50  50000   46  66259  1449190

Note that all three bounds are upper limits on the security of an anonymity system under varying threat models. From a practical perspective, our results suggest the following checks in order to evaluate a proposed system:

1. Estimate the model parameters, that is, the number of peers $N$, the typical number of communication partners $m$, and the size $n$ of the anonymity sets.

2. Is the computational power of the adversary potentially unbounded, or are the unique minimum hitting set instances small or simply structured? Then it might not be acceptable if a unique minimum hitting set even exists. Choose bound U.

3. Is the adversary incapable of solving hard problems, but an easily obtained if unproven hypothesis about the communication partners poses a threat? Choose bound A1.

4. Does the computationally bounded adversary need to prove his hypothesis? Bound A2 applies.

81 5. Choose a confidence level, that is, an accepted discovery probability.

6. Evaluate the chosen bound using the parameters, numerically or symbolically, to find an upper bound on the number of rounds for which anonymity can be upheld.

4.2 The Uniqueness of Hitting Sets

In this section, we show bounds on the probability for $B$ to become a unique minimum hitting set after a certain number of observations. There are experimental results on the learning speed of the hitting set attack (see, e.g., [70]) and corresponding conjectures, but as of today, there has been no accompanying mathematical investigation. Let us first recall and formalize the objects in question.

Definition 4 Let $A = \{1, 2, \ldots, N\}$, $B = \{1, 2, \ldots, m\}$, and let $\mathcal{O}_1, \ldots, \mathcal{O}_t$ be a sequence of observations, where each $\mathcal{O}_i$ is a multiset of size $n$ taken from $A$ that contains at least one element from $B$. For any element $i \in A$, $Z_i$ denotes the number of observations containing $i$. Let $Z_B$ and $Z_{A\setminus B}$ be random variables that have the same distribution as $Z_i$ for $i \in B$ and $i \in A \setminus B$, respectively. Let finally

$$p_{A\setminus B} := 1 - \left(1 - \frac{1}{N}\right)^{n-1}.$$

Note that $p_{A\setminus B}$ describes the probability that a fixed element from $A \setminus B$ occurs in a random observation. We are now able to establish the first result.
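For intuition, $p_{A\setminus B}$ is small for the configurations of the table in Section 4.1.2; the following lines (illustrative) evaluate it and also confirm the condition $p_{A\setminus B} < 1/m^2$ that Section 4.4 will require:

    for m, n, N in [(10, 20, 20000), (10, 50, 50000),
                    (20, 50, 20000), (20, 50, 50000)]:
        p = 1 - (1 - 1/N)**(n - 1)                # p_{A\B}
        print(m, n, N, round(p, 6), p < 1/m**2)   # prints True in all four cases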

Theorem 3 The probability that $B$ is not a unique minimum hitting set for $\mathcal{O}_1, \ldots, \mathcal{O}_t$ is at most $\binom{N}{m} \exp\left(-\frac{1}{m}\left(1 - m/N\right)^{n-1} t\right)$.

Proof. By construction, $B$ is a hitting set for $\mathcal{O}_1, \ldots, \mathcal{O}_t$. Let
$$\mathcal{H} := \left\{ H \subseteq A \;\middle|\; |H| = m,\ H \neq B \right\} = \binom{A}{m} \setminus \{B\}.$$

In order for $B$ to be unique, the family $\mathcal{H}$ must not contain any hitting set for $\mathcal{O}_1, \ldots, \mathcal{O}_t$. That is, for every $H \in \mathcal{H}$ there has to exist an $i \in \{1, \ldots, t\}$ such that $H \cap \mathcal{O}_i = \emptyset$. Each $\mathcal{O}_i$ can be written as $\{b_i\} \cup A_i$, where $b_i \in B$ and $A_i$ is a multiset consisting of $n - 1$ elements from $A$. Since $H \neq B$, there exists an $x \in B \setminus H$. The probability of the event $[b_i = x]$ is $1/m$, and the probability of choosing an element from $A \setminus H$ when choosing from $A$ is $(N - m)/N$. Hence,
$$\Pr[H \cap \mathcal{O}_i = \emptyset] \;\geq\; \frac{1}{m}\left(\frac{N - m}{N}\right)^{n-1}.$$
This implies

$$\Pr[\forall i:\ H \cap \mathcal{O}_i \neq \emptyset] \;\leq\; \left(1 - \frac{1}{m}\left(\frac{N - m}{N}\right)^{n-1}\right)^{t} \;\leq\; e^{-\frac{1}{m}(1 - m/N)^{n-1}\,t}$$
and, finally,

$$\Pr[\exists H \in \mathcal{H}\ \forall i:\ H \cap \mathcal{O}_i \neq \emptyset] \;\leq\; \binom{N}{m}\, e^{-\frac{1}{m}(1 - m/N)^{n-1}\,t}. \qquad \Box$$

4.3 A Simple Learning Algorithm

The unostentatious algorithm investigated in this section tries to learn $B$ by combining the elements $i \in A$ with the largest $Z_i$ into a hitting set for the present sequence $\mathcal{O}_1, \ldots, \mathcal{O}_t$ of observations. We prove that with a certain probability, this method suffices to learn $B$ within a bounded number of observations. The following lemma establishes two expectations required for future calculations.

Lemma 1
$$EZ_B = \frac{t}{m} + t\left(1 - \frac{1}{m}\right)p_{A\setminus B}, \qquad EZ_{A\setminus B} = t\,p_{A\setminus B}.$$

Proof. Each $\mathcal{O}_i$ can be written as $\mathcal{O}_i = \{b_i\} \cup A_i$ for some $b_i \in B$ and a multiset $A_i$ consisting of $n - 1$ elements from $A$. For a single observation $\mathcal{O}_i$, the probability for some fixed $j \in B$ to be $b_i$ is $1/m$. The probability for $j$ to be chosen into $A_i$ is $p_{A\setminus B}$. Adding up disjoint cases and using independence, we get $\Pr[j \in \mathcal{O}_i] = \Pr[b_i = j] + \Pr[b_i \neq j]\,\Pr[j \in A_i] = 1/m + (1 - 1/m)\,p_{A\setminus B}$. By linearity of expectation, $EZ_B = EZ_j = t \Pr[j \in \mathcal{O}_i]$. A similar but even simpler argument shows the second claim. □

Let us define the order $\succ$ on $A$ in order to ease the descriptions of our algorithms. Notice that $\succ$ is constructed to be total.

Definition 5 For $i, j \in A$ and their respective occurrence counters $Z_i, Z_j$, let
$$i \succ j \;\equiv\; \big((Z_i > Z_j) \vee ((Z_i = Z_j) \wedge (i > j))\big).$$
Moreover, let $i_1, \ldots, i_N$ be the elements from $A$ sorted according to $\succ$, or, in a more formal notation, $i_1 \succ i_2 \succ \ldots \succ i_N$.

Surprisingly, our first algorithm does not even need to know $m$. Instead, it just uses a kind of greedy majority vote to compute its hypothesis. There is, however, no guarantee that it will learn $B$ entirely, unless no proper subset of $B$ forms a hitting set for the present sequence of observations.

1. Initially, set $t := 0$ and $Z_i := 0$ for all $i \in A$.

2. Increase $t$ and read the $t$-th observation $\mathcal{O}_t$. For every element $i \in A$ that occurs at least once in $\mathcal{O}_t$, increase $Z_i$.

3. Set $H_t := \emptyset$. Add $i_1, i_2, \ldots$ to $H_t$ until $H_t$ is a hitting set for $\mathcal{O}_1, \ldots, \mathcal{O}_t$. Go to step (2).
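A direct transcription of these three steps (an illustrative sketch; the online loop is folded into a single pass over the observations collected so far):

    from collections import Counter

    def simple_attack(observations):
        """Greedy majority vote: rank elements by occurrence count Z_i with
        ties broken by the larger identifier (Definition 5), then add them in
        order until every observation seen so far is hit."""
        Z = Counter()
        for obs in observations:
            Z.update(set(obs))           # count each element once per observation
        ranking = sorted(Z, key=lambda i: (Z[i], i), reverse=True)
        H, unhit = set(), [set(obs) for obs in observations]
        for i in ranking:
            if not unhit:
                break                    # H_t is already a hitting set
            H.add(i)
            unhit = [obs for obs in unhit if i not in obs]
        return H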

We want to show that the hypothesis $H_t$ is likely to be (partly) correct after a certain number of observations, at least in the case that the (first two of the) following three conditions hold for an appropriate number $c$:

$P_1 \equiv Z_i \geq EZ_B - c\sqrt{EZ_B}$ for all $i \in B$,

$P_2 \equiv Z_i \leq EZ_{A\setminus B} + c\sqrt{EZ_{A\setminus B}}$ for all $i \in A \setminus B$, and

$P_3 \equiv$ for all $i \in B$ there is a $1 \leq j \leq t$ such that $B \cap \mathcal{O}_j = \{i\}$.

Lemma 2

$$\Pr[\bar{P}_1] \leq e^{-c^2/2}, \qquad \Pr[\bar{P}_2] \leq e^{-c^2/3}, \qquad \Pr[\bar{P}_3] \leq m \cdot \exp\left(-\frac{1}{m}\left(1 - \frac{m-1}{N}\right)^{n-1} t\right).$$

Proof. The first two claims follow from Chernoff bounds [89, 4.6] because $Z_i$ is binomially distributed. To see the third claim, check that for an arbitrary $i \in B$,
$$\alpha := \Pr[\forall\, 1 \leq j \leq t:\ B \cap \mathcal{O}_j \neq \{i\}] = \left(1 - \frac{1}{m}\left(1 - \frac{m-1}{N}\right)^{n-1}\right)^{t}$$
and estimate according to $\Pr[\exists i \in B\ \forall\, 1 \leq j \leq t:\ B \cap \mathcal{O}_j \neq \{i\}] \leq m\alpha$ as well as $(1 - \beta)^t \leq \exp(-\beta t)$. □

Corollary 1 $\Pr[P_1 \wedge P_2] \geq 1 - \exp(-c^2/2) - \exp(-c^2/3) \geq 1 - 2\exp(-c^2/3)$.

We first establish a bound for the partial correctness, i.e., that the hypothesis contains only elements from $B$.

Lemma 3 $\Pr[H_t \not\subseteq B] \leq 2\exp\left(-t(1 - p_{A\setminus B})^2/12m^2\right)$.

Proof. Let $c = \sqrt{t}\,(1 - p_{A\setminus B})/2m$ and apply Corollary 1 to see that $P_1 \wedge P_2$ holds with probability at least $1 - 2\exp(-t(1 - p_{A\setminus B})^2/12m^2)$. It is obvious that $H_t \subseteq B$ as soon as $Z_i > Z_j$ for all $i \in B$ and $j \in A \setminus B$, since in that case the algorithm chooses only elements from $B$ for addition to $H_t$. Under the assumption that $P_1$ and $P_2$ hold, it hence suffices to have that
$$EZ_B - c\sqrt{EZ_B} > EZ_{A\setminus B} + c\sqrt{EZ_{A\setminus B}}.$$
Using the expressions for $EZ_B$ and $EZ_{A\setminus B}$ from Lemma 1 as well as some simple transformations, it is easy to see that this is the case whenever
$$t > \left(\frac{cm}{1 - p_{A\setminus B}}\right)^2 \left(\sqrt{\frac{1}{m} + \left(1 - \frac{1}{m}\right)p_{A\setminus B}} + \sqrt{p_{A\setminus B}}\right)^2. \tag{4.1}$$
Since $p_{A\setminus B} < 1$, the right hand side of (4.1) is strictly less than $\left(2cm/(1 - p_{A\setminus B})\right)^2 = t$. Hence, (4.1) holds for all $t$. □

In addition, we want to estimate the probability for the algorithm to find the complete set $B$.

Lemma 4
$$\Pr[H_t \neq B] \;\leq\; 2\exp\left(-t(1 - p_{A\setminus B})^2/12m^2\right) + m\exp\left(-\frac{1}{m}\left(1 - \frac{m-1}{N}\right)^{n-1} t\right).$$

Proof. For a sequence $\mathcal{O}_1, \ldots, \mathcal{O}_t$ of observations, let us call an $i \in B$ dispensable if $B \setminus \{i\}$ is a hitting set, that is, if there is no $1 \leq j \leq t$ such that $B \cap \mathcal{O}_j = \{i\}$. Moreover, we call $B$ oversaturated if it contains at least one dispensable element. For some $i \in B$ and $1 \leq j \leq t$, we have that
$$\Pr[B \cap \mathcal{O}_j = \{i\}] = \frac{1}{m}\left(1 - \frac{m-1}{N}\right)^{n-1}.$$
It is hence easy to see that, for an arbitrary $i \in B$,
$$\Pr[i \text{ is dispensable}] = \left(1 - \frac{1}{m}\left(1 - \frac{m-1}{N}\right)^{n-1}\right)^{t} \leq \exp\left(-\frac{1}{m}\left(1 - \frac{m-1}{N}\right)^{n-1} t\right).$$
There are only two ways for $H_t$ not to equal $B$: if $H_t$ is not even a subset of $B$, or if $B$ contains a dispensable element. Thus,
$$\Pr[H_t \neq B] \leq \Pr[H_t \not\subseteq B] + \Pr[B \text{ is oversaturated}].$$
Using the bounds calculated above and in Lemma 3, the claim follows. □

In what follows, $S$ denotes the sampling complexity, that is, the number of observations required to learn $B$. Formally, $S = \min\{t \mid H_\tau = B \text{ for all } \tau \geq t\}$. Our algorithm is likely to find $B$, or a subset thereof, rather efficiently in many settings. However, there is no guarantee that it will not abandon a correct hypothesis in the future. In particular, the algorithm is not conservative. To establish a bound on $S$, we have to use the bound from Lemma 4 on infinitely many values.

Theorem 4 Let $d_1 = (1 - p_{A\setminus B})^2/12m^2$ and $d_2 = \frac{1}{m}\left(1 - (m-1)/N\right)^{n-1}$. Then

$$\Pr[S > t] \;\leq\; 2e^{-d_1 t}/(1 - e^{-d_1}) + m\,e^{-d_2 t}/(1 - e^{-d_2}).$$

Proof. Lemma 4 implies that $\Pr[H_\tau \neq B] \leq 2e^{-d_1 \tau} + m e^{-d_2 \tau}$. Summing up for all $\tau \geq t$ using the geometric series $\sum_{\tau \geq t} e^{-d\tau} = e^{-dt}/(1 - e^{-d})$ yields the claim. □

4.4 An Advanced Learning Algorithm

The aforementioned algorithm may compute a correct hypothesis within only a small number of observations in many settings, but it is not capable of proving the hypothesis correct. In contrast, our advanced algorithm is able to check and prove that its hypothesis is correct, but it does not work for ill-conditioned combinations of $N$, $m$, and $n$. These, however, can be avoided by demanding that, e.g., $p_{A\setminus B} < 1/m^2$. Note that this condition will be satisfied whenever $N$ is much larger than $n$ and $m$. For instance, it holds for the values depicted in the table that concludes Section 4.1.

Let us define three conditions that will be crucial in the development. For the moment, let $c$ be an arbitrary number and

$Q_1 \equiv Z_i \leq EZ_B + c\sqrt{EZ_{A\setminus B}}$ for all $i \in B$,

$Q_2 \equiv Z_i \geq EZ_B - c\sqrt{EZ_{A\setminus B}}$ for all $i \in B$, and

$Q_3 \equiv Z_i \leq EZ_{A\setminus B} + c\sqrt{EZ_{A\setminus B}}$ for all $i \in A \setminus B$.

Notice that $Q_3$ syntactically equals $P_2$. We choose another identifier because a different choice of $c$ results in a different condition in this context. It will turn out that these three properties ensure that $B$ is a unique minimum hitting set for $\mathcal{O}_1, \ldots, \mathcal{O}_t$, given appropriate values for $p_{A\setminus B}$ and $t$.

For the rest of the chapter, let us fix c = 6ln5/mpA B. This allows us \ to derive the following result, which is necessaryp to ensure applicability of our second algorithm.

1 Lemma 6 If pA B 1/m then Pr[Q1 Q2 Q3] > . \ ≤ ∧ ∧ 2 Proof. Using the above lemma, we get that Pr[Q¯ ], Pr[Q¯ ] 1 . Observe 1 2 ≤ 5 that c > 3 whenever the trivial condition pA B 1/m holds, implying that ¯ 1 \ ≤ ¯ ¯ ¯ 1 1 1 Pr[Q3] < 10 . Altogether, we get the result that Pr[Q1 Q2 Q3] < 5 + 5 + 10 = 1  ∨ ∨ 2 . The second algorithm is described below. Whereas the first two steps are exactly the same as in the first method, the Zi are then used for entirely different computations. Since the Zi are in descending order, h represents the maximum number of observations that can be explained (hit) when we replace at least one element of Ht. Thus, if h < t, Ht is the only hitting set of its size that is able to explain our observations, the unique minimum hitting set.

1. Initially, set t := 0 and Z := 0 for all i A. i ∈

2. Increase t and read the t-th observation t. For every element i A that occurs at least once in , increase ZO. ∈ Ot i

3. Set H := i ,...,i and h := Z + . . . + Z m− + Z m . t { 1 m} i1 i 1 i +1

4. Output Ht. If h < t, then claim that Ht = B and stop. Otherwise go to step (2).

Observe that, instead of assuming $m$ to be known, the algorithm can simply choose the smallest $m$ such that $\{i_1, \ldots, i_m\}$ is a hitting set, just as in the first algorithm. The following theorem and the trailing corollary show some conditions under which $B$ becomes a unique minimum hitting set, where uniqueness follows from a simple counting argument that is easy to check.
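The self-verifying loop, transcribed directly (an illustrative sketch; it consumes an iterable of observations and stops with a proven hypothesis as soon as $h < t$):

    from collections import Counter

    def advanced_attack(observation_stream, m):
        """After each observation, rank by Z_i (Definition 5), take the top m
        elements as H_t, and compute h, the best total achievable when at
        least one element of H_t is replaced; h < t proves uniqueness."""
        Z, t = Counter(), 0
        for obs in observation_stream:
            t += 1
            Z.update(set(obs))
            ranking = sorted(Z, key=lambda i: (Z[i], i), reverse=True)
            if len(ranking) <= m:
                continue                    # Z_{i_{m+1}} not yet defined
            H_t = set(ranking[:m])
            h = sum(Z[i] for i in ranking[:m - 1]) + Z[ranking[m]]
            if h < t:
                return H_t, t               # claim H_t = B, provably unique
        return None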

Theorem 5 Let $p_{A\setminus B} < 1/m^2$ and assume
$$t > \frac{6 \ln(5)\, m}{\left(1/m - p_{A\setminus B}(m - 1 + 1/m)\right)^2}.$$
In the event that $Q_1$, $Q_2$ and $Q_3$ hold, we have that $B$ is a unique minimum hitting set for $\mathcal{O}_1, \ldots, \mathcal{O}_t$. Moreover, the elements in $B$ occur in more observations than any element from $A \setminus B$, and every other $m$-element set $H \subseteq A$ has $\sum_{i \in H} Z_i < t$.

Proof. By definition, $B$ is a hitting set for $\mathcal{O}_1, \ldots, \mathcal{O}_t$. Notice that in particular this implies $\sum_{i \in B} Z_i \geq t$. We will show that replacing even a single element from $B$ by an element from $A \setminus B$ results in missing the required lower bound of $t$ occurrences. This will imply that $B$ can be obtained by majority voting and that no proper subset suffices to cover all observations. Thus the claim follows.

To see the statement in question, let us look at an upper bound on the number of observations hit by $m - 1$ elements from $B$ and one element from $A \setminus B$. Such an upper bound is given by $Q_1$ and $Q_3$:

$$(m - 1)\left(EZ_B + c\sqrt{EZ_{A\setminus B}}\right) + EZ_{A\setminus B} + c\sqrt{EZ_{A\setminus B}}$$
$$= (m - 1)\,EZ_B + EZ_{A\setminus B} + mc\sqrt{EZ_{A\setminus B}}$$
$$= (m - 1)\frac{t}{m} + (m - 1)\,t\left(1 - \frac{1}{m}\right)p_{A\setminus B} + t\,p_{A\setminus B} + mc\sqrt{t\,p_{A\setminus B}}$$

We want to show that the upper bound lies below $t$. Equivalently, it suffices to prove that

$$mc\sqrt{\frac{p_{A\setminus B}}{t}} \;<\; 1 - \frac{m-1}{m} - \left(\frac{(m-1)^2}{m} + 1\right) p_{A\setminus B}.$$
Using $p_{A\setminus B} < 1/m^2$, a simple calculation shows that the right hand side of the inequality is positive. Thus we have to ensure that

$$\sqrt{t} \;>\; \frac{mc\sqrt{p_{A\setminus B}}}{1 - (m-1)/m - \left((m-1)^2/m + 1\right) p_{A\setminus B}}.$$
This is equivalent to
$$t \;>\; \frac{m^2 c^2\, p_{A\setminus B}}{\left(1/m - p_{A\setminus B}(m - 1 + 1/m)\right)^2} \;=\; \frac{6 \ln(5)\, m}{\left(1/m - p_{A\setminus B}(m - 1 + 1/m)\right)^2}. \qquad \Box$$

Corollary 2 Let $p_{A\setminus B} < 1/m^2$ and assume $t > c^2 m^2 (1 + 1/m)$. In the event that $Q_1$, $Q_2$ and $Q_3$ hold, we have that $B$ is a unique minimum hitting set for $\mathcal{O}_1, \ldots, \mathcal{O}_t$. Moreover, the elements in $B$ occur in more observations than any element from $A \setminus B$, and every other $m$-element set $H \subseteq A$ has $\sum_{i \in H} Z_i < t$.

Proof. Observe that $t > c^2 m^2 (1 + 1/m)$ and $p_{A\setminus B} < 1/m^2$ imply
$$t > \frac{6 \ln(5)\, m}{\left(1/m - p_{A\setminus B}(m - 1 + 1/m)\right)^2}.$$
The claim follows from Theorem 5. □

Let $S'$ denote the minimum number $t$ of observations our advanced algorithm takes before it claims that $H_t = B$. Using the above results, we can easily derive bounds on $S'$ as follows.

Theorem 6 If $p_{A\setminus B} < 1/m^2$, then $\Pr[S' \geq c^2 m^2 (1 + 1/m)] < \frac{1}{2}$.

Proof. Combine Lemma 6 and Corollary 2. □

Corollary 3 If $p_{A\setminus B} < 1/m^2$, then $\Pr[S' \geq k c^2 m^2 (1 + 1/m)] \leq 2^{-k}$.

Proof. The claim follows from [116, Theorem 6] if the algorithm is both conservative and rearrangement-independent. This can be established by modifying the algorithm so as to use $\emptyset$ as its hypothesis unless $h < t$, that is, unless the algorithm claims that $H_t = B$. □
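The A1 and A2 columns of the table in Section 4.1.2 can be recomputed numerically from Theorem 4 and Corollary 3; the sketch below does so for an error probability of 1/100 (hence $k = 7$ in Corollary 3) and reproduces both columns up to small rounding differences. The U column is not covered here.

    import math

    def a1_a2(m, n, N, eps=0.01):
        """Observation bounds from Theorem 4 (A1) and Corollary 3 (A2)."""
        p = 1 - (1 - 1/N)**(n - 1)                      # p_{A\B}
        d1 = (1 - p)**2 / (12 * m**2)
        d2 = (1/m) * (1 - (m - 1)/N)**(n - 1)
        def pr_s_exceeds(t):                            # bound on Pr[S > t]
            return (2*math.exp(-d1*t) / (1 - math.exp(-d1))
                    + m*math.exp(-d2*t) / (1 - math.exp(-d2)))
        lo, hi = 1, 10**8                               # binary search for A1
        while lo < hi:
            mid = (lo + hi) // 2
            if pr_s_exceeds(mid) <= eps:
                hi = mid
            else:
                lo = mid + 1
        c2 = 6 * math.log(5) / (m * p)                  # c^2 as fixed above
        k = math.ceil(-math.log2(eps))                  # smallest k with 2^-k <= eps
        return lo, round(k * c2 * m**2 * (1 + 1/m))

    for m, n, N in [(10, 20, 20000), (10, 50, 50000),
                    (20, 50, 20000), (20, 50, 50000)]:
        print(m, n, N, *a1_a2(m, n, N))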

4.5 Conclusion

We have seen that learning algorithms can be used to break anonymity protocols. Whereas even a small number of observations may give enough information to match communication partners, the required computation involves solving hard problems. Given the chance to make many observations, however, the instances tend to have a certain structure that can be exploited in order to find the secret in question efficiently.

Once again, this shows how hard problems can turn out to be tractable, at least with high probability, in everyday applications, meaning that most instances that occur in practice are not computationally hard at all. Policy-wise this implies that it does not suffice to base anonymity, or security in general, on the intractability of a computational problem. Instead, we need to look closer at the hardness of the problem instances arising in the protocol at hand. This is an approach taken by an active community of researchers in

the field of exact and parameterized algorithms for hard problems (see, e.g., [52] for a survey). Our results argue for a close collaboration with this community when designing security protocols, in order to forestall unexpected attacks.

It remains to test the algorithms on data collected in real networks. In addition, it would be interesting to see lower bounds on the number of observations our algorithms require.

Chapter 5

Conclusion

Those who wish to change the world According with their desire Cannot succeed. The world is shaped by the Way; It cannot be shaped by the self. Trying to change it, you damage it; Trying to possess it, you lose it. So some will lead, while others follow. Some will be warm, others cold. Some will be strong, others weak. Some will get where they are going While others fall by the side of the road. So the sage will be neither wasteful nor violent. [133, verse 29]

5.1 Summary and Contributions

In this thesis, we have introduced the new paradigm of world libraries. After motivating the need for a public library replacement in an age of greased digital information, and after a short thesis overview in Section 1.1, we begin by defining world libraries and relating our definition to similar but different concepts, thus giving justification for both the necessity and the form of our definition in Section 1.2. In Section 1.3, we defend the utility of world libraries as we envision them on philosophical, economical, and legal grounds. In particular, we suggest and provide a rationale for a new approach targeted at practical mass usability instead of privacy protection against very strong adversaries. We argue that the lack of conceptual clarity and explicit motivation

might have contributed to the fact that no similar system has achieved mass adoption, for both technical and social reasons. Therefore, we are convinced that the definition and motivation in this chapter provide a solid and important foundation for further research.

Chapter 2 surveys existing file sharing networks that provide some kind of privacy protection as world library forerunners or candidates. Unlike earlier surveys, we are guided by the definitions and motivations from the introductory chapter. Thus, we examine these systems with regard to their usefulness for efficiently sharing large data volumes in open untrusted environments while preserving privacy. Key aspects here include potential reasons for failing to attain mass usage, potential or demonstrated attacks, and best practices. Ant routing based systems have been singled out as the hitherto most important category of anonymous file sharing networks. In Section 2.1 we introduce their general principles, compare several instantiations, and identify common weaknesses in both efficiency and privacy protection. This leads us to the conclusion that unstructured networks display severe deficiencies in both categories. In accordance with a general trend in massively distributed systems, we therefore discourage their use for world library implementations. After briefly looking at miscellaneous networks for completeness in Section 2.2, we have conducted a case study on the Owner-Free File System, a very unorthodox approach employing a structured peer-to-peer system. We find that this system's design goals resemble our motives and prefigure our definitions in no small part, and that its general architecture might be well suited to achieve these goals. Unfortunately, OFF has not yet attained great popularity either. Section 2.3 asks for potential reasons. Through experiments both in the live network and in a larger artificial PlanetLab network, we examine attacks on the privacy of OFF users and look for efficiency pitfalls. Our findings suggest that both privacy and efficiency hazards could be mitigated in a larger network. However, it seems very difficult to predict how the system would perform when deployed on a great number of nodes. This is because its complexity and proprietary nature hinder simulation and analysis, and experiments become very expensive beyond the scale of PlanetLab. This leads us to conclude that in order to reliably construct world libraries it is essential to build these massively distributed systems from simple structures whose efficiency and security can be proven by analysis or simulation.

Our survey and case study in Chapter 2 suggest that DHTs are a good candidate for the most important among such structures. That is why we devote Chapter 3 to the study of their properties with regard to the requirements we have so far identified for usage in world libraries. Following an introduction of our DHT and attacker models (Section 3.1), earlier research

on DHT security is surveyed in Section 3.2 to place our work in context. We differ from the majority of authors in this area in that we focus strongly on the avoidance of active eclipse attacks, because most other attacks, including information leakage, are unnecessary for an adversary once he is able to successfully mislead DHT lookups. This insight informs our discussion of DHT privacy and security. It has far-reaching implications, such as the preference of iterative routing over recursive routing. We begin our analysis in Section 3.3 by showing that previous approaches that try to achieve redundancy through independent paths cannot guarantee security when the networks grow arbitrarily large. We identify an alternative approach found in the Kademlia DHT that we term aggregated greedy search, and go on to prove a similar yet more involved result: Aggregated greedy search without further protection measures also falls prey to eclipse attacks w.h.p., at least asymptotically. However, by hiding the search value and introducing finger table bounds checking for the entire search process, we achieve strong protection against active attacks, as documented by our simulation results. To the best of our knowledge, ours is the first known practical approach to DHTs that scalably defends against the eclipse attack. Building on this secure framework, Section 3.4 examines information leakage in DHT lookups. While acknowledging that the efficiency and security of redundant DHT lookups necessarily carry the burden of increased exposure to eavesdroppers, we propose several measures that harden our approach against passive attacks. While perfect privacy appears unachievable here, we show that the degree of practical privacy heavily depends upon the application. In Section 3.5 we discuss network bootstrapping and maintenance, study the effects of a stronger attacker model, and show that aggregated greedy search is asymptotically optimal with regard to path length in a general setting. Finally, we summarize our findings in Section 3.6.

Chapter 4 contains a theoretical analysis that has serious consequences for the design of privacy enhancing technologies: we prove limits on the number of rounds for which any anonymity technique adhering to a very general model can disguise communication relationships. In Section 4.1 we introduce and motivate our model, and put our contributions in the context of related work. In Section 4.2 we derive a lower bound on the probability that after a given number of observations a unique hitting set exists, which theoretically allows for deanonymization, as discovered in earlier work. The number of observations required for this attack to succeed w.h.p. is surprisingly low, yet the attack requires actually finding the unique hitting set, which is an NP-hard problem. Therefore, computational intractability might still protect anonymity techniques at this point. However, in Section 4.3 a simple learning algorithm is employed to solve this problem in linear time, albeit

this requires a specific problem structure that is likely to arise only after a much larger number of observations. In Section 4.4 we present a slightly more involved learning algorithm that is able to prove its hypothesis correct, but needs an even higher expected number of observations. Although there have been some earlier results in this line of research, to the best of our knowledge we provide the first analytical lower bound on the probability that an adversary can prove its hypothesis on the set of communication partners correct in a general model like this. In the context of this thesis, our findings suggest two general principles for the implementation of world libraries and other PETs: First, no anonymity technique is able to hide communication relationships forever. Second, basing privacy or security on the intractability of some computational problem requires close scrutiny, because the concrete instances arising in the application might actually turn out to be efficiently solvable, a point we elaborate on in our concluding remarks in Section 4.5.

5.2 Requirements, Issues, Principles, and Ideas for World Libraries

In this section we discuss the implications of our results and other observations on the design of future world libraries, in the spirit of software engineering practices like Issue-based Modeling [45]. To begin with, in Chapter 1 we saw that it is of utmost importance to clearly state the purpose and motivation of a world library system. This is because there has been much confusion about the legality, social desirability, and overall purpose of similar systems, which in turn has endangered their popularity and, thereby, their usefulness. This process of clarification has enabled us to introduce a well-defined concept of world libraries. Elaboration of our definition, in turn, yields a list of requirements. For a concrete system, the list of requirements may be directly derived from our explanation in Section 1.2. In addition to the requirements, we have identified a number of issues that should guide system design:

1. As shown in Chapter 2, unstructured networks exhibit problematic and unpredictable behavior in both performance and security. In accordance with a general trend in peer-to-peer computing, we therefore recommend their replacement with structured systems.

2. Chapter 2 also suggests that the problems inherent in proprietary, complex, and monolithic designs are aggravated in (massively) distributed systems. This is because the behavior of such systems is very hard

to predict under the realities of growing and heterogeneous networks. Therefore, world libraries should be constructed out of simple, well-understood, and standardized components.

3. In Chapter 3 we have discovered that in order to guard a system's performance and privacy against realistic adversaries, it is both necessary and possible to protect its DHTs from eclipse attacks. We refer to that chapter for specific measures.

4. While earlier scholarship [86] discovered that there is a trade-off between protection against active and passive attacks, we maintain that there can be no protection against eavesdropping without effective protection against eclipse attacks. In light of our analysis in Chapter 3, it follows that our means of defending against information leakage are limited in practical applications. This is consistent with our goal of privacy protection against adversaries of limited strength that we argued for in Section 1.3.2, and suggests that relying on the impossibility of linking users and requested data items endangers privacy.

5. Chapter 3 further implies that responder anonymity is hard to uphold under the conditions outlined in the previous two points. Together with the last point, this is another argument against storing incriminating data on nodes.

6. The results from Chapter 4 make clear that any enduring communication relationship can eventually be discovered. The model is general enough to also cover relationships between nodes and data items. This means that we need to avoid long-lasting links (for example for transferring large files) between peers, as well as repeated access to possibly incriminating data stored on nodes.

Looking at these issues, it might appear impossible to design a world library according to our definition. And indeed we do not believe that an efficient practical system that guarantees perfect privacy can be built using today's technology. However, our analysis of the OFF System (see Section 2.3) provides evidence that a world library that preserves practical privacy could be achieved. In fact, OFF's design incorporates elements that mitigate most issues we have found. In particular, its key idea of owner-free, meaningless, or multi-use data blocks is very helpful in that linking an action such as storing, sending, or receiving a block to a user leaks little or no information about that user's intentions to an eavesdropper. We have seen that although patterns are still observable, in a growing network it becomes very hard to

reliably infer user behavior. While OFF has not yet succeeded as a world library, we believe that this can mainly be attributed to its neglect of the second issue mentioned above. Our experiments suggest that OFF's performance and security problems could be reduced if only the network were larger. However, it is not at all clear how the ad hoc DHT built into OFF would behave if many nodes were to join. To summarize, we believe that the general idea of distributing meaning over many nodes is a key technique in the realization of world libraries. Secondly, we strongly recommend the use of simple structured networks with well-researched performance and security properties. Furthermore, communication between nodes should be encrypted using standard technologies in order to deter local adversaries. Finally, best practices from general software engineering, user interface design, and economics should be used in the design and implementation of world libraries, but these are beyond the scope of this thesis.

5.3 Outlook and Future Work

Building world libraries may at first glance seem like a simple engineering task. In this thesis, we have collected evidence that this is not the case. In fact, many previous attempts at constructing anonymous file sharing networks have failed. We have argued that the difficulties are manifold: They begin with failing to properly argue what should be built and why. We hope to have given some answers to these questions. Still, it appears that we have just begun a new line of research: Massively distributed systems are far from well understood, especially with regard to security and privacy. The few solutions that we know of often severely impact efficiency. Yet, with their focus on large data volumes, world libraries require high levels of efficiency at the same time as privacy if they are to succeed. In light of results like our bounds on anonymity, it might well be that perfect privacy in massively distributed systems that share large data volumes is impossible. We see a clear need for further theoretical research in this direction. Even if such upper bound theorems existed, they would not imply that a practical trade-off cannot be found. Our research on the OFF System, as well as our findings on DHT protection, inspire us with hope that a combination of new ideas and basic research can help achieve the goal of practical world libraries. We found that our combination of methods, as well as collaboration with researchers from both practical and theoretical computer science, has led to fruitful results that could not have been achieved by one or the other alone. A great number of future research opportunities in this spirit have been outlined in the more technical chapters. And while it has been said that "It is insufficient

And while it has been said that “It is insufficient to protect ourselves with laws; we need to protect ourselves with mathematics.” [143], we have to contend that the reverse is true as well. Building world libraries appears to require much more basic research, by computer scientists as well as scholars of law, philosophy, economics, sociology, and psychology, before we can leave it to the engineers.

Bibliography

[1] M. Adler, B. N. Levine, C. Shields, and M. Wright. Defending anonymous communication against passive logging attacks. In Proceedings of the 2003 IEEE Symposium on Security and Privacy, pages 28–43, May 2003.

[2] R. J. Anderson. The eternity service. In Proceedings of Pragocrypt, pages 242–252, 1996.

[3] ANts P2P website. online at http://antsp2p.sourceforge.net/, May 2009.

[4] B. Awerbuch and C. Scheideler. Towards a scalable and robust DHT. In SPAA ’06: Proceedings of the eighteenth annual ACM symposium on Parallelism in algorithms and architectures, pages 318–327, New York, NY, USA, 2006. ACM.

[5] K. Bauer, D. McCoy, D. Grunwald, T. Kohno, and D. Sicker. Low-resource routing attacks against Tor. In Proceedings of the Workshop on Privacy in the Electronic Society (WPES 2007), Washington, DC, USA, October 2007.

[6] I. Baumgart and S. Mies. S/Kademlia: A practicable approach towards secure key-based routing. In Proceedings of the 13th International Conference on Parallel and Distributed Systems (ICPADS ’07), Hsinchu, Taiwan, volume 2, December 2007.

[7] K. Bennett, C. Grothoff, T. Horozov, and J. T. Lindgren. An encoding for censorship-resistant sharing. Technical report, Colorado Research Institute for Security and Privacy, University of Denver, 2003.

[8] C. Bernault and A. Lebois. Peer-to-peer file sharing and literary and artistic property. Technical report, Institute for Research on Private Law, University of Nantes, 2006. Translation from the French available online at http://privatkopie.net/files/Feasibility-Study-p2p-acs_Nantes.pdf.

[9] O. Berthold, A. Pfitzmann, and R. Standtke. The disadvantages of free MIX routes and how to overcome them. In H. Federrath, editor, Proceedings of Designing Privacy Enhancing Technologies: Workshop on Design Issues in Anonymity and Unobservability, pages 30–45. Springer-Verlag, LNCS 2009, July 2000.

[10] S. Bhattacharjee, R. D. Gopal, K. Lertwachara, and J. R. Marsden. Impact of legal threats on online music sharing activity: An analysis of music industry legal actions. The Journal of Law and Economics, 49(1):91–114, 2006.

[11] N. Borisov, G. Danezis, P. Mittal, and P. Tabriz. Denial of service or denial of security? In CCS ’07: Proceedings of the 14th ACM conference on Computer and communications security, pages 92–102, New York, NY, USA, 2007. ACM.

[12] N. Borisov and J. Waddle. Anonymity in structured peer-to-peer networks. Technical Report UCB/CSD-05-1390, EECS Department, University of California, Berkeley, May 2005.

[13] N. Borisov. Anonymous routing in structured peer-to-peer overlays. PhD thesis, University of California at Berkeley, Berkeley, CA, USA, 2005. Chair: Eric A. Brewer.

[14] I. Bouazizi. ARA: the ant-colony based routing algorithm for MANETs. In ICPPW ’02: Proceedings of the 2002 International Conference on Parallel Processing Workshops, page 79, Washington, DC, USA, 2002. IEEE Computer Society.

[15] S. Breyer. The uneasy case for copyright: A study of copyright in books, photocopies and computer programs. Harvard Law Review, 84(2), 1970.

[16] M. Castro, P. Druschel, A. J. Ganesh, A. I. T. Rowstron, and D. S. Wallach. Secure routing for structured peer-to-peer overlay networks. In OSDI, 2002.

[17] D. Chaum. Untraceable electronic mail, return addresses, and digital pseudonyms. Communications of the ACM, 24(2):84–88, 1981.

[18] D. Chaum. The dining cryptographers problem: Unconditional sender and recipient untraceability. Journal of Cryptology, 1(1):65–75, 1988.

[19] B. Chor, O. Goldreich, E. Kushilevitz, and M. Sudan. Private information retrieval. In Proceedings of the 36th Annual IEEE Symposium on Foundations of Computer Science (FOCS), 1995.

[20] T. Chothia. Analysing the MUTE anonymous file-sharing system using the pi-calculus. In E. Najm, J.-F. Pradat-Peyre, and V. Donzeau-Gouge, editors, FORTE, volume 4229 of Lecture Notes in Computer Science, pages 115–130. Springer, 2006.

[21] T. Chothia and K. Chatzikokolakis. A survey of anonymous peer-to-peer file-sharing. In Proceedings of the IFIP International Symposium on Network-Centric Ubiquitous Systems (NCUS 2005), Lecture Notes in Computer Science, pages 744–755. Springer, 2005.

[22] B. Chun, D. Culler, T. Roscoe, A. Bavier, L. Peterson, M. Wawrzoniak, and M. Bowman. PlanetLab: an overlay testbed for broad-coverage services. SIGCOMM Comput. Commun. Rev., 33(3):3–12, 2003.

[23] G. Ciaccio. Improving sender anonymity in a structured overlay with imprecise routing. In George Danezis and Philippe Golle, editors, Proceedings of the Sixth Workshop on Privacy Enhancing Technologies (PET 2006), pages 190–207, Cambridge, UK, June 2006. Springer.

[24] G. Ciaccio. Recipient anonymity in a structured overlay. In AICT-ICIW ’06: Proceedings of the Advanced Int’l Conference on Telecommunications and Int’l Conference on Internet and Web Applications and Services, page 102, Washington, DC, USA, 2006. IEEE Computer Society.

[25] I. Clarke, O. Sandberg, B. Wiley, and T. W. Hong. Freenet: A distributed anonymous information storage and retrieval system. In Proceedings of Designing Privacy Enhancing Technologies: Workshop on Design Issues in Anonymity and Unobservability, pages 46–66, July 2000.

[26] T. M. Cooley. A Treatise on the law of torts, or, The wrongs which arise independently of contract. Callaghan & Co., 4th edition, 1932.

[27] D. Coppersmith, D. Gamarnik, and M. Sviridenko. The diameter of a long range percolation graph. In SODA ’02: Proceedings of the thirteenth annual ACM-SIAM symposium on Discrete algorithms, pages 329–337, Philadelphia, PA, USA, 2002. Society for Industrial and Applied Mathematics.

[28] S. A. Crosby and D. S. Wallach. An analysis of BitTorrent’s two Kademlia-based DHTs. Technical Report TR-07-04, Department of Computer Science, Rice University, June 2007.

[29] D programming language. Digital Mars website. online at http://www.digitalmars.com/d/, May 2009.

[30] G. Danezis. Statistical disclosure attacks: Traffic confirmation in open environments. In Gritzalis, Vimercati, Samarati, and Katsikas, editors, Proceedings of Security and Privacy in the Age of Uncertainty (SEC2003), pages 421–426, Athens, May 2003. IFIP TC11, Kluwer.

[31] G. Danezis and C. Diaz. A survey of anonymous communication channels. Technical Report MSR-TR-2008-35, Microsoft Research, January 2008.

[32] G. Danezis, R. Dingledine, and N. Mathewson. Mixminion: Design of a type III anonymous remailer protocol. In Proceedings of the 2003 IEEE Symposium on Security and Privacy, pages 2–15, May 2003.

[33] G. Danezis, C. Lesniewski-Laas, M. F. Kaashoek, and R. J. Anderson. Sybil-resistant DHT routing. In S. De Capitani di Vimercati, P. F. Syverson, and D. Gollmann, editors, ESORICS, volume 3679 of Lecture Notes in Computer Science, pages 305–318. Springer, 2005.

[34] G. Danezis and A. Serjantov. Statistical disclosure or intersection attacks on anonymity systems. In Proceedings of 6th Information Hiding Workshop (IH 2004), LNCS, Toronto, May 2004.

[35] G. Danezis and P. Syverson. Bridging and fingerprinting: Epistemic attacks on route selection. In N. Borisov and I. Goldberg, editors, Proceedings of the Eighth International Symposium on Privacy Enhancing Technologies (PETS 2008), pages 151–166, Leuven, Belgium, July 2008. Springer.

[36] J. W. DeCew. In Pursuit of Privacy: Law, Ethics, and the Rise of Technology. Cornell University Press, 1997.

[37] DELOS Network of Excellence on Digital Libraries website. online at http://www.delos.info/, May 2009.

[38] D. E. Denning and P. J. Denning. Data security. ACM Computing Surveys, 11(3):227–249, 1979.

[39] R. Dingledine, M. J. Freedman, and D. Molnar. The Free Haven project: Distributed anonymous storage service. In H. Federrath, editor, Proceedings of Designing Privacy Enhancing Technologies: Workshop on Design Issues in Anonymity and Unobservability. Springer-Verlag, LNCS 2009, July 2000.

[40] R. Dingledine and N. Mathewson. Anonymity loves company: Usability and the network effect. In Proceedings of the Fifth Workshop on the Economics of Information Security (WEIS 2006), 2006.

[41] R. Dingledine, N. Mathewson, and P. Syverson. Tor: The second-generation onion router. In Proceedings of the 13th USENIX Security Symposium, 2004.

[42] J. R. Douceur. The Sybil attack. In IPTPS ’01: Revised Papers from the First International Workshop on Peer-to-Peer Systems, pages 251–260, London, UK, 2002. Springer-Verlag.

[43] N. Durner, N. Evans, and C. Grothoff. Vielleicht anonym? Die Enttarnung von StealthNet-Nutzern. c’t magazin für computer technik, 13(21):218–221, 2007.

[44] N. Durner, N. S. Evans, and C. Grothoff. Anonymisierende Peer-to-Peer-Netze im Überblick. iX Magazin für professionelle Informationstechnik, pages 88–94, September 2008.

[45] A. H. Dutoit, B. Bruegge, and R. F. Coyne. Using an issue-based model in a team-based software engineering course. In Proceedings of the International Conference on Software Engineering: Education and Practice, page 130, 1996.

[46] A. Huygen et al. Ups and downs: economic and cultural effects of file sharing on music, film and games. Technical report, TNO Information and Communication Technology, 2009.

[47] L. Candela et al. The DELOS digital library reference model: foundations for digital libraries. online at http://www.delos.info/files/pdf/ReferenceModel/DELOS_DLReferenceModel_0.98.pdf, December 2007. Version 0.98.

[48] Convention for the protection of human rights and fundamental freedoms. online at http://conventions.coe.int/treaty/en/Treaties/Html/005.htm, November 1950.

[49] Directive 95/46/EC on the protection of individuals with regard to the processing of personal data and on the free movement of such data. online at http://ec.europa.eu/justice_home/fsj/privacy/law/index_en.htm, October 1995.

[50] G. Feng and J. English (Translators). The Complete Tao Te Ching. Vintage Books, 1989.

[51] R. A. Ferreira, C. Grothoff, and P. Ruth. A transport layer abstraction for peer-to-peer networks. In Proceedings of the 3rd International Symposium on Cluster Computing and the Grid (GRID 2003), pages 398–403. IEEE Computer Society, 2003.

[52] F. V. Fomin, F. Grandoni, and D. Kratsch. Some new techniques in design and analysis of exact (exponential) algorithms. EATCS Bulletin, 87:47–77, 2005.

[53] GNUnet: GNU’s framework for secure P2P networking. website. online at http://gnunet.org/, May 2009.

[54] S. Goel, M. Robson, M. Polte, and E. G. Sirer. Herbivore: A Scalable and Efficient Protocol for Anonymous Communication. Technical Report 2003-1890, Cornell University, Ithaca, NY, February 2003.

[55] J. A. Goguen and J. Meseguer. Security policies and security models. In IEEE Symposium on Security and Privacy, pages 11–20, 1982.

[56] A. V. Goldberg and P. N. Yianilos. Towards an archival intermemory. In ADL ’98: Proceedings of the Advances in Digital Libraries Conference, page 147, Washington, DC, USA, 1998. IEEE Computer Society.

[57] I. Goldberg. On the security of the Tor authentication protocol. In G. Danezis and P. Golle, editors, Proceedings of the Sixth Workshop on Privacy Enhancing Technologies (PET 2006), pages 316–331, Cambridge, UK, June 2006. Springer.

[58] S. Goldwasser and S. Micali. Probabilistic encryption. Journal of Computer and System Sciences, 28(1):270–299, 1984.

[59] Google Book Search. online at http://books.google.com/, May 2009.

[60] G. R. Grimmett and D. R. Stirzaker. Probability and Random Processes. Clarendon Press, 2nd edition, 1992.

[61] F. S. Grodzinsky and H. T. Tavani. Online file sharing: resolving the tensions between privacy and property interests. SIGCAS Comput. Soc., 38(4):28–39, 2008.

[62] C. Grothoff. An excess-based economic model for resource allocation in peer-to-peer networks. Wirtschaftsinformatik, 3-2003, June 2003.

[63] M. Hansen and A. Pfitzmann. Anonymity, unobservability, and pseudonymity: A consolidated proposal for terminology. Draft, July 2000.

[64] S. Hazel, B. Wiley, and O. Wiley. Achord: A variant of the Chord lookup service for use in censorship resistant peer-to-peer publishing systems. In Proceedings of the 1st International Peer-to-Peer Systems Workshop (IPTPS ’02), 2002.

[65] N. Hopper, E. Y. Vasserman, and E. Chan-Tin. How much anonymity does network latency leak? ACM Transactions on Information and System Security, forthcoming 2009.

[66] M. Ingram. ANts P2P: A new approach to file-sharing. online at http://www.slyck.com/news.php?story=567, 2004.

[67] T. Isdal, M. Piatek, A. Krishnamurthy, and T. Anderson. Friend-to-friend data sharing with OneSwarm. Technical report, UW-CSE, February 2009.

[68] D. Kesdogan, D. Agrawal, and S. Penz. Limits of anonymity in open environments. In F. Petitcolas, editor, Proceedings of Information Hiding Workshop (IH 2002). Springer-Verlag, LNCS 2578, October 2002.

[69] D. Kesdogan, D. Mölle, S. Richter, and P. Rossmanith. Breaking anonymity by learning a unique minimum hitting set. In Proceedings of the 4th International Computer Science Symposium in Russia (CSR), Lecture Notes in Computer Science. Springer, 2009. To appear.

[70] D. Kesdogan and L. Pimenidis. The hitting set attack on anonymity protocols. In Proceedings of the 6th Information Hiding Workshop (IH 2004), Lecture Notes in Computer Science, Toronto, May 2004.

[71] T. Kreutzer. Das Modell des deutschen Urheberrechts und Regelungsalternativen. Nomos, 2008.

[72] D. Kügler. An analysis of GNUnet and the implications for anonymous, censorship-resistant networks. In R. Dingledine, editor, Privacy Enhancing Technologies, volume 2760 of Lecture Notes in Computer Science, pages 161–176. Springer, 2003.

[73] C. Law and K. Siu. Distributed construction of random expander networks. In INFOCOM, 2003.

[74] C. Lesniewski-Laas. A Sybil-proof one-hop DHT. In Workshop on Social Network Systems, Glasgow, Scotland, April 2008.

[75] L. Lessig. Free Culture: The Nature and Future of Creativity. Penguin (Non-Classics), February 2005.

[76] B. N. Levine, M. K. Reiter, C. Wang, and M. K. Wright. Timing attacks in low-latency mix-based systems. In A. Juels, editor, Proceedings of Financial Cryptography (FC ’04), pages 251–265. Springer-Verlag, LNCS 3110, February 2004.

[77] B. N. Levine, C. Shields, and N. B. Margolin. A Survey of Solutions to the Sybil Attack. Technical Report 2006-052, University of Massachusetts Amherst, Amherst, MA, October 2006.

[78] M. Li and P. Vitányi. An introduction to Kolmogorov complexity and its applications. Texts and Monographs in Computer Science. Springer-Verlag, 1993.

[79] S. J. Liebowitz. File sharing: Creative destruction or just plain destruction? The Journal of Law and Economics, 49(1):1–28, 2006.

[80] M. Waldman, A. D. Rubin, and L. F. Cranor. Publius: A robust, tamper-evident, censorship-resistant, web publishing system. In Proceedings of the 9th USENIX Security Symposium, pages 59–72, August 2000.

[81] A method of free speech on the internet: random pads. online at http://www.madore.org/~david/misc/freespeech.html, September 2002.

[82] D. R. Manda. A study on networks and their vulnerabilities. Master’s thesis, Department of Applied Physics and Electronics, Umeå University, 2007.

[83] G. S. Manku, M. Naor, and U. Wieder. Know thy neighbor’s neighbor: the power of lookahead in randomized p2p networks. In Proceedings of the 36th ACM Symposium on Theory of Computing (STOC), pages 54–63, 2004.

[84] Marabunta website. online at http://marabunta.laotracara.com/, May 2009.

[85] P. Maymounkov and D. Mazières. Kademlia: A peer-to-peer information system based on the XOR metric. In Peer-to-Peer Systems: First International Workshop, IPTPS 2002, Cambridge, MA, USA, March 7–8, 2002, Revised Papers, Lecture Notes in Computer Science, pages 53–65, 2002.

[86] P. Mittal and N. Borisov. Information leaks in structured peer-to-peer anonymous communication systems. In P. Syverson, S. Jha, and X. Zhang, editors, Proceedings of the 15th ACM Conference on Computer and Communications Security (CCS 2008), pages 267–278, Alexandria, Virginia, USA, October 2008. ACM Press.

[87] J. H. Moor. Toward a theory of privacy for the information age. Computers and Society, 27(3):27–32, September 1997.

[88] I. Stoica, R. Morris, D. Karger, F. Kaashoek, and H. Balakrishnan. Chord: A Scalable Peer-to-Peer Lookup Service for Internet Applications. In ACM SIGCOMM 2001, San Diego, CA, September 2001.

[89] R. Motwani and P. Raghavan. Randomized Algorithms. Cambridge University Press, 1995.

[90] S. J. Murdoch and G. Danezis. Low-cost traffic analysis of Tor. In Proceedings of the 2005 IEEE Symposium on Security and Privacy. IEEE CS, May 2005.

[91] MUTE: Simple, anonymous file sharing. website. online at http://mute-net.sourceforge.net/, May 2009.

[92] A. Nambiar and M. Wright. Salsa: A structured approach to large-scale anonymity. In Proceedings of CCS 2006, October 2006.

[93] H. F. Nissenbaum. Privacy as contextual integrity. Washington Law Review, 79(1), 2004.

[94] Nodezilla: grid networking, anonymous file sharing. website. online at http://www.nodezilla.net/, May 2009.

[95] F. Oberholzer and K. Strumpf. The effect of file sharing on record sales: An empirical analysis. Journal of Political Economy, 115:1–42, 2007.

[96] C. W. O’Donnell and V. Vaikuntanathan. Information leak in the Chord lookup protocol. In Proceedings of the 4th IEEE International Conference on Peer-to-Peer Computing (P2P 2004), pages 28–35. IEEE Computer Society Press, 2004.

[97] The OFF System. website. online at http://offsystem.sourceforge.net/, May 2009.

[98] The owner-free filesystem IRC channel. online IRC channel: #thebighack at irc.p2pchat.net, May 2009.

[99] The Open Library. online at http://openlibrary.org/, May 2009.

[100] A. Oram, editor. Peer-to-Peer: Harnessing the Power of Disruptive Technologies. O’Reilly & Associates, Inc., Sebastopol, CA, USA, 2001.

[101] L. Øverlier and P. Syverson. Locating hidden servers. In Proceedings of the 2006 IEEE Symposium on Security and Privacy. IEEE CS, May 2006.

[102] W. A. Parent. Privacy, morality, and the law. Philosophy & Public Affairs, 12(4):269–288, Fall 1983.

[103] A. Pfitzmann and M. Köhntopp. Anonymity, unobservability, and pseudonymity: A consolidated proposal for terminology. Draft, version 0.23, July 2000. For the date of the latest version, see the document itself.

[104] A. Pimenidis. Holistic Confidentiality in Open Networks. PhD thesis, RWTH Aachen University, 2009.

[105] The Pirate Bay. online at http://thepiratebay.org, May 2009.

[106] B. C. Popescu, B. Crispo, and A. S. Tanenbaum. Safe and private data sharing with Turtle: Friends team-up and beat the system. In Proceedings of the 12th Cambridge International Workshop on Security Protocols, 2004.

[107] M. O. Rabin. How to exchange secrets with oblivious transfer. Technical Report TR-81, Harvard University, 1981.

[108] A. Rache, A. Panchenko, and S. Richter. NISAN: Network information service for anonymization networks. Submitted to CCS ’09, 2009.

[109] S. Ratnasamy, P. Francis, S. Shenker, R. Karp, and M. Handley. A scalable content-addressable network. In Proceedings of ACM SIGCOMM, pages 161–172, 2001.

[110] J.-F. Raymond. Traffic analysis: Protocols, attacks, design issues and open problems. In Proceedings of International Workshop on Design Issues in Anonymity and Unobservability, volume 2009 of LNCS, pages 10–29. Springer-Verlag, 2001.

[111] M. Reiter and A. Rubin. Crowds: Anonymity for web transactions. ACM Transactions on Information and System Security, 1(1), June 1998.

[112] R. L. Rivest. Cryptography and machine learning. In Proceedings of ASIACRYPT ’91, number 739 in Lecture Notes in Computer Science, pages 427–439. Springer, 1993.

[113] R. Rob and J. Waldfogel. Piracy on the high c’s: Music downloading, sales displacement, and social welfare in a sample of college students. The Journal of Law and Economics, 49(1):29–62, 2006.

[114] Rodi website. online at http://rodi.sourceforge.net/, May 2009.

[115] M. Rogers and S. Bhatti. How to disappear completely: A survey of private peer-to-peer networks. In Paper at SPACE 2007 (Sustaining Privacy in Autonomous Collaborative Environments). Available at http://www.cs.ucl.ac.uk/staff/mrogers/privatep2p.pdf, 2007.

[116] P. Rossmanith and T. Zeugmann. Stochastic finite learning of the pattern languages. Machine Learning, 44(1/2):67–91, 2001.

[117] A. Rowstron and P. Druschel. Pastry: Scalable, decentralized object location and routing for large-scale peer-to-peer systems. In IFIP/ACM International Conference on Distributed Systems Platforms (Middleware), pages 329–350, November 2001.

[118] B. Schneier. Secrets and Lies. John Wiley & Sons, 2000.

[119] M. Seeger. The current state of anonymous file-sharing, 2008. Bachelor Thesis at Stuttgart Media University.

[120] C. E. Shannon. A mathematical theory of communication. The Bell System Technical Journal, 27:379–423 and 623–656, 1948.

[121] C. E. Shannon. Communication theory of secrecy systems. The Bell System Technical Journal, 28:656–715, 1949.

[122] R. Sherwood, B. Bhattacharjee, and A. Srinivasan. P5: A protocol for scalable anonymous communication. In Proceedings of the 2002 IEEE Symposium on Security and Privacy, May 2002.

[123] E. Sit and R. Morris. Security considerations for peer-to-peer distributed hash tables. In Proceedings of the 1st International Workshop on Peer-to-Peer Systems (IPTPS), Cambridge, MA, March 2002.

[124] M. Srivatsa and L. Liu. Vulnerabilities and security threats in structured overlay networks: A quantitative analysis. In ACSAC ’04: Proceedings of the 20th Annual Computer Security Applications Conference, pages 252–261, Washington, DC, USA, 2004. IEEE Computer Society.

[125] R. M. Stallman. Misinterpreting copyright: a series of errors. In Free Software, Free Society: Selected Essays of Richard M. Stallman. GNU Press, 2nd edition, 2002.

[126] Winny, Share, and Perfect Dark website. online at http://www.startp2p.net/, May 2009.

[127] StealthNet website. online at http://www.stealthnet.de/, May 2009.

[128] M. Steiner, D. Carra, and E. W. Biersack. Faster content access in KAD. In P2P ’08: Proceedings of the 2008 Eighth International Conference on Peer-to-Peer Computing, pages 195–204, Washington, DC, USA, 2008. IEEE Computer Society.

[129] R. Steinmetz and K. Wehrle, editors. Peer-to-Peer Systems and Applications, volume 3485 of Lecture Notes in Computer Science. Springer, 2005.

[130] A. B. Stubblefield and D. S. Wallach. Dagster: Censorship-resistant publishing without replication. Technical Report TR01-380, Department of Computer Science, Rice University, July 2001.

[131] H. T. Tavani and J. H. Moor. Privacy protection, control of information, and privacy-enhancing technologies. SIGCAS Comput. Soc., 31(1):6–11, 2001.

[132] Hackback Media. Fiction taken seriously. website. online at http://hackback.org/thebighack, May 2009.

[133] P. A. Merel (Translator). The GNL Tao De Ching. online at http://www.chinapage.com/gnl.html, 1995. Version 2.07.

[134] UNESCO website. online at http://unesco.org, April 2009.

[135] M. Waldman and D. Mazières. Tangler: a censorship-resistant publishing system based on document entanglements. In CCS ’01: Proceedings of the 8th ACM conference on Computer and Communications Security, pages 126–135, New York, NY, USA, 2001. ACM.

[136] B. Way. History and purpose of the OFF System. Personal communication on IRC, March 2008.

[137] World Digital Library. online at http://www.worlddigitallibrary.org, April 2009.

[138] B. Westermann, A. Panchenko, and L. Pimenidis. A Kademlia-based node lookup system for anonymization networks. In Proceedings of the Third International Conference on Information Security and Assurance, Seoul, South Korea, June 2009. Springer LNCS/CCIS.

[139] Wikipedia: Perfect Dark (P2P). website. online at http://en.wikipedia.org/w/index.php?title=Perfect_Dark_(P2P)&oldid=288272496, May 2009.

[140] Wikipedia: Share (P2P). website. online at http://en.wikipedia.org/w/index.php?title=Share_(P2P)&oldid=289747157, May 2009.

[141] Wikipedia: Winny. website. online at http://en.wikipedia.org/w/index.php?title=Winny&oldid=288277048, May 2009.

[142] M. Wright, M. Adler, B. N. Levine, and C. Shields. An analysis of the degradation of anonymous protocols. In Proceedings of the Network and Distributed Security Symposium - NDSS ’02. IEEE, February 2002.

[143] T. Wright. Security, privacy, and anonymity. Crossroads, 11(2):5–5, 2004.

[144] A. Zentner. Measuring the effect of file sharing on music purchases. The Journal of Law and Economics, 49(1):63–90, 2006.

[145] T. Zeugmann. Stochastic finite learning. In Proceedings of the International Symposium SAGA 2001, number 2264 in Lecture Notes in Computer Science, pages 155–171, 2001.

[146] B. Y. Zhao, J. D. Kubiatowicz, and A. D. Joseph. Tapestry: An infrastructure for fault-tolerant wide-area location and routing. Technical Report UCB/CSD-01-1141, UC Berkeley, April 2001.

Curriculum Vitae

Personal Data
Name: Richter
First name: Stefan
Date of birth: 28 February 1978
Place of birth: Berlin
Nationality: German

Qualifications
1997: Abitur, Hertzhaimer-Gymnasium Trostberg
1998–2003: Studies in computer science (degree: Diplom-Informatiker), Technische Universität München
From 2003: Doctoral studies
2003–2009: Research assistant, Theoretical Computer Science group (Lehr- und Forschungsgebiet Theoretische Informatik), RWTH Aachen University

Publications

[1] J. Chen, J. Kneis, S. Lu, D. Mölle, S. Richter, P. Rossmanith, S. Sze, and F. Zhang. Randomized divide-and-conquer: Improved path, matching, and packing algorithms. SIAM Journal on Computing, 38(6):2526–2547, 2009.

[2] B. Fuchs, W. Kern, D. Mölle, S. Richter, P. Rossmanith, and X. Wang. Dynamic programming for minimum Steiner trees. Theory of Computing Systems, 41(3):493–500, 2007.

[3] D. Kesdogan, D. Mölle, S. Richter, and P. Rossmanith. Breaking anonymity by learning a unique minimum hitting set. In Proceedings of the 4th International Computer Science Symposium in Russia (CSR), number 5675 in Lecture Notes in Computer Science. Springer, 2009.

[4] J. Kneis, D. Mölle, S. Richter, and P. Rossmanith. Parameterized power domination complexity. Technical Report AIB-2004-09, Dept. of Computer Science, RWTH Aachen University, December 2004.

[5] J. Kneis, D. Mölle, S. Richter, and P. Rossmanith. Algorithms based on the treewidth of sparse graphs. In Proceedings of the 31st International Workshop on Graph-Theoretic Concepts in Computer Science (WG), number 3787 in Lecture Notes in Computer Science, pages 385–396. Springer, 2005.

[6] J. Kneis, D. Mölle, S. Richter, and P. Rossmanith. On the parameterized complexity of exact satisfiability problems. In J. Jędrzejowicz and A. Szepietowski, editors, Proceedings of the 30th Conference on Mathematical Foundations of Computer Science (MFCS), number 3618 in Lecture Notes in Computer Science, pages 568–579. Springer, 2005.

[7] J. Kneis, D. Mölle, S. Richter, and P. Rossmanith. Divide-and-color. In Proceedings of the 32nd International Workshop on Graph-Theoretic Concepts in Computer Science (WG), number 4271 in Lecture Notes in Computer Science, pages 58–67. Springer, 2006.

[8] J. Kneis, D. Mölle, S. Richter, and P. Rossmanith. Divide-and-color. Technical Report AIB-2006-06, RWTH Aachen University, May 2006.

[9] J. Kneis, D. Mölle, S. Richter, and P. Rossmanith. Intuitive algorithms and t-vertex cover. In Proceedings of the 17th International Symposium on Algorithms and Computation (ISAAC), number 4288 in Lecture Notes in Computer Science, pages 598–607. Springer, 2006.

[10] J. Kneis, D. Mölle, S. Richter, and P. Rossmanith. Parameterized power domination complexity. Information Processing Letters, 98(4):145–149, 2006.

[11] J. Kneis, D. Mölle, S. Richter, and P. Rossmanith. A bound on the pathwidth of sparse graphs with applications to exact algorithms. SIAM Journal on Discrete Mathematics, 23(1):407–427, 2009.

[12] D. Koschützki, K. A. Lehmann, L. Peeters, S. Richter, D. Tenfelde-Podehl, and O. Zlotowski. Centrality indices. In U. Brandes and T. Erlebach, editors, Network Analysis, volume 3418 of LNCS, pages 16–61. Springer, 2005.

[13] D. Mölle, S. Richter, and P. Rossmanith. A faster algorithm for the Steiner tree problem. Technical Report AIB-2005-04, Dept. of Computer Science, RWTH Aachen University, March 2005.

[14] D. Mölle, S. Richter, and P. Rossmanith. Enumerate and expand: Improved algorithms for connected vertex cover and tree cover. In Proceedings of the 1st International Computer Science Symposium in Russia (CSR), number 3967 in Lecture Notes in Computer Science, pages 270–280. Springer, 2006.

[15] D. Mölle, S. Richter, and P. Rossmanith. Enumerate and expand: New runtime bounds for vertex cover variants. In Proceedings of the 12th Annual International Computing and Combinatorics Conference (COCOON), number 4112 in Lecture Notes in Computer Science, pages 265–273. Springer, 2006.

[16] D. Mölle, S. Richter, and P. Rossmanith. A faster algorithm for the Steiner tree problem. In Proceedings of the 23rd Symposium on Theoretical Aspects of Computer Science (STACS), number 3884 in Lecture Notes in Computer Science, pages 561–570. Springer, 2006.

[17] D. Mölle, S. Richter, and P. Rossmanith. Enumerate and expand: Improved algorithms for connected vertex cover and tree cover. Theory of Computing Systems, 43(2):234–253, 2008.

[18] A. Rache, A. Panchenko, and S. Richter. NISAN: Network information service for anonymization networks. In Proceedings of CCS 2009, 2009. To appear.

[19] S. Richter. Formalizing integration theory with an application to probabilistic algorithms. In Proceedings of the 17th TPHOLs, number 3223 in Lecture Notes in Computer Science, pages 271–286. Springer, 2004.

[20] S. Richter. Integration theory and random variables. In G. Klein, T. Nipkow, and L. Paulson, editors, The Archive of Formal Proofs. http://afp.sf.net/entries/Integration.shtml, November 2004. Formal proof development.
