Censorship-Resistant Collaboration with a Hybrid DTN/P2P Network
Master's thesis by Philipp Hagemeister from Braunschweig
Submitted to the Chair of Computer Networks and Communication Systems (Lehrstuhl für Rechnernetze und Kommunikationssysteme)
Prof. Dr. Martin Mauve
Heinrich-Heine-Universität Düsseldorf
March 2012

Acknowledgments

My thanks go to Marc Fontaine for asking stupid questions that turned out to be quite clever, and for pointing out that correctness is essential both in the real and the physical world. I also thank Paul Baade for demanding impossible features which turned out to be the last piece in the puzzle.

Julius Römmler has notified me of orthographical, typographical, and (inadvertently) semantical errors. And told me to use fewer big words. Thanks!

I wish to thank Denis Lütke-Wiesmann for proofreading the thesis, and the footnotes. Sven Hager found lots of overly short, overly long, and overly wrong statements. Thanks!

Thanks to Prof. Martin Mauve for coming up with the idea, shielding us from bureaucracy, asking for explanation and rationale at every step, and finding all the errors nobody else found.

Contents

List of Figures
1 Motivation
  1.1 Distribution of Speech
  1.2 Threat Model
    1.2.1 Nontechnical Attacks
    1.2.2 Internet Access
    1.2.3 Control over the User’s Computer
    1.2.4 Total Shutoff
    1.2.5 Physical Attacks
    1.2.6 IP Blocking
    1.2.7 DNS censorship
    1.2.8 Deep Packet Inspection
    1.2.9 Active Attacks
    1.2.10 Conclusions
  1.3 Decentralization
  1.4 Collaboration
  1.5 Structure of this Thesis
2 Components
  2.1 Peer-To-Peer Networks
    2.1.1 Bootstrapping
    2.1.2 NAT Traversal
    2.1.3 Broadcasting
    2.1.4 Integration Notes
  2.2 Delay-Tolerant Networks
    2.2.1 Integration Notes
  2.3 Security
    2.3.1 Trust Models
    2.3.2 Integration Notes
  2.4 Anonymization Networks
    2.4.1 Mix networks
    2.4.2 Hidden services
    2.4.3 Common implementations
    2.4.4 Integration Notes
  2.5 Revision Control
    2.5.1 Centralized Revision Control
    2.5.2 Graph-based Distributed Revision Control
    2.5.3 Excursus: Content-Addressable Storage
    2.5.4 P2P Revision Control
    2.5.5 Patch-based Distributed Revision Control
    2.5.6 Document-oriented Database Systems
    2.5.7 Integration Notes
3 Architecture
  3.1 Implementation Architecture
    3.1.1 Finding Project Nodes
  3.2 Applications and Projects
    3.2.1 Project Structure
    3.2.2 The ProjectListApplication
    3.2.3 Application Services
    3.2.4 Revision Control Application Services
    3.2.5 Interaction with the Application Core
    3.2.6 Policy Drafting Application
    3.2.7 Write Authorization
    3.2.8 Read Authorization
  3.3 Transports
  3.4 Web Application
    3.4.1 Server Fallback
    3.4.2 Preventing Malicious Fallback Servers
    3.4.3 Offline Web Applications
    3.4.4 Client-Side Web Applications
4 Implementation
5 Conclusion
  5.1 Future Work
    5.1.1 General
    5.1.2 P2P
    5.1.3 DTN
    5.1.4 Security
    5.1.5 Version Control
    5.1.6 Anonymization Networks & Transports
    5.1.7 Web Application
    5.1.8 User Interface
Bibliography

List of Figures

1.1 Screenshot of a packet dump of two HTTP requests to pku.edu.cn, the latter of which is terminated by an injected TCP RST packet (in red)
2.1 Crude broadcasting in an unstructured vs optimal broadcasting in a structured P2P network
2.2 Model of an anonymization network. The attacker can intercept an encrypted version of the traffic between sender and first node, encrypted traffic that goes to one of the nodes in the network as well as plain traffic between exit node and receiver.
2.3 Terms in a Revision Graph. Note that the revision identifiers are for illustrative purposes only.
2.4 Merging different versions of a branch in a graph-based distributed revision control system.
2.5 Screenshot of the revision graph after simplistic automated merging with git.
2.6 git’s usage of content-addressable storage. The content of the blocks is shown inside the rectangle, and the hash of the content at the lower right of each rectangle. The arrows visualize the relations between the blocks; they are not explicitly stored by the CAS, but are a function of the content.
3.1 High-level overview of the architecture
3.2 Overview of the Implementation Architecture
3.3 Project header
3.4 Example block database state. The content of the blocks has been simplified; in practice, each block content contains the file name, revision id, and the content of the file at that revision.
3.5 Example optimized listRoot answer
3.6 Detail view of the transports in the implementation architecture
3.7 User interface for server fallback
3.8 Fallback verification model
4.1 Screenshot of the policy drafting application.
4.2 Screenshot of the configuration of a DTN endpoint.

Chapter 1 Motivation

Development of policies such as laws, political manifestos, examination regulations, articles, source code, and any other form of speech¹ can be greatly enhanced by computer-supported cooperative work systems. Unfortunately, speech, especially political speech, faces attempts to censor or suppress it all over the world. The 2011 “Freedom in the World” report of Freedom House [Pud11] rates 47 countries (with one third of the world population) as “Not Free”. In these countries, people are denied basic civil liberties such as political participation. Similarly, the Amnesty International Report 2011 [FIS11] mentions serious restrictions on freedom of speech and political participation in 48 countries (about 40% of the world’s population).

Unsurprisingly, efforts have been made to censor computer-supported speech alongside more traditional censorship methods. Freedom House’s Freedom on the Net 2011 report [KCU11] rates 11 countries (with one quarter of the world population) as “Not Free”, indicating that experts reported significant restrictions on access to, and on providers of, controversial information. The OpenNet Initiative, which automatically measures the availability of “provocative” and “objectionable” resources instead of relying on human expertise, confirms these assessments: its recent measurements [oni11] find significant censorship of these resources in 13 countries (with one quarter of the world population), and at least some censorship in 42 countries (60% of the population).

Technology should resist censorship and allow free speech whenever possible. In fact, one could argue that free speech is needed the most in the face of censorship. Enabling censorship-resistant free speech has its downsides: the same technology that can be used to debate democracy or draft an appeal for human rights can be used to foster racism or create a terrorist manifesto². While these usage scenarios cannot be prevented without centralized control, I assume that the benefits of unfettered speech outweigh their downsides.

The goal of this thesis is to develop a framework which allows collaboration in the face of governmental censorship, and to implement a prototype.

¹ In this thesis, speech is used in the legal sense, as an umbrella term for any distribution and development of potentially objectionable content.
² Although as scientists, we hope formal analyzers can find and point out fallacies in extremist thought.

1.1 Distribution of Speech

Current systems for delivering speech include traditional media (e.g. television, newspapers) as well as internet-based services.
Traditional media require significant infrastructure and easily controllable delivery channels (relatively large parts of the radio spectrum for television, and significant transportation vehicles and personnel for newspapers) and are therefore owned or tightly controlled in non-democratic countries. In democratic countries, the huge barriers to entry can facilitate a concentration of media ownership, which, while not governmental, may impede or.