© 2017 Frederick Ernest Douglas

CIRCUMVENTION OF CENSORSHIP OF INTERNET ACCESS AND PUBLICATION

BY

FREDERICK ERNEST DOUGLAS

DISSERTATION

Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Computer Science in the Graduate College of the University of Illinois at Urbana-Champaign, 2017

Urbana, Illinois

Doctoral Committee:

Associate Professor Matthew Caesar, Chair
Associate Professor Nikita Borisov
Professor Klara Nahrstedt
Assistant Professor Eric Wustrow, University of Colorado, Boulder

ABSTRACT

Internet censorship of one form or another affects on the order of half of all internet users. Previous work has studied this censorship, and proposed techniques for circumventing it, ranging from making proxy servers available to censored users, to tunneling internet connections through services such as voice or video chat, to embedding censorship circumvention in cloud platforms' front-end servers or even in ISPs' routers. This dissertation describes a set of techniques for circumventing censorship, building on and surpassing prior efforts. As is always the case, there are tradeoffs to be made: some of this work emphasizes deployability, and some aims for unstoppable circumvention with the assumption of significant resources. However, the latter techniques are not merely academic thought experiments: this dissertation also describes the experience of successfully deploying such a technique, which served tens of thousands of users.

While the solid majority of previous work, as well as much of the work presented here, is focused on governments blocking access to sites and services hosted outside of their country, the rise of social media has created a new form of internet censorship. A country may block a social media platform, but have its own domestic version, on which it tightly controls what can be said. This dissertation describes a system for enabling users of such a platform to monitor for post deletions, and distribute the contents to other users.

To my wife, my parents, and my cats Oolong and Lilly, for their love and support.

ACKNOWLEDGMENTS

I would like to acknowledge the people whom I worked with on these various projects: Nikita Borisov, Matthew Caesar, Daniel Ellard, Sergey Frolov, Alex Halderman, Christine Jones, Allison McDonald, Rorshach (a pseudonym), Steve Schultze, Will Scott, Weiyang Pan, Ben VanderSloot, and (last but not least!) Eric Wustrow. Additionally, I would like to thank the developers of the SoftEther VPN system, which the Salmon implementation is built upon, for releasing it under the GPL. I thank ntop for providing the deployable TapDance project with academic licenses for the zero-copy version of PF_RING.


TABLE OF CONTENTS

CHAPTER 1 INTRODUCTION
1.1 Access Censorship
1.2 Publication Censorship
1.3 Structure

CHAPTER 2 RELATED WORK
2.1 Censorship of Access
2.2 Censorship of Publication
2.3 Summary

CHAPTER 3 SALMON: ROBUST DISTRIBUTION OF PROXIES
3.1 Problem and Design Overview
3.2 Algorithm
3.3 Implementation
3.4 Evaluation
3.5 Conclusion

CHAPTER 4 GHOSTPOST: DISTRIBUTED SOCIAL MEDIA UNDELETION
4.1 Background
4.2 Threat Model
4.3 Design
4.4 Centralization
4.5 Implementation
4.6 Evaluation
4.7 Summary

CHAPTER 5 DEPLOYABLE PASSIVE DECOY ROUTING
5.1 Background: TapDance
5.2 Implementation
5.3 Deployment
5.4 Challenges

CHAPTER 6 FASTER UPLOAD IN PASSIVE DECOY ROUTING
6.1 Upload-only Streams
6.2 Implementation

CHAPTER 7 DITTOTAP: DECOY ROUTING WITH BETTER CAMOUFLAGE
7.1 Implementation
7.2 TCP Traffic Patterns
7.3 Page Load Patterns
7.4 Avoiding Startup Delay
7.5 Full Page Mimicking
7.6 Summary

CHAPTER 8 CONCLUSION
8.1 Future Work
8.2 Future Challenges

REFERENCES

CHAPTER 1

INTRODUCTION

Governments have always sought some degree of control over what their citizens say, regardless of the means of communication involved. Censorship has followed civilization onto the internet, with governments blocking traffic to undesirable web sites and services based in freer countries. Data from Freedom House indicates that on the order of half of the world's internet users live under some significant form of internet censorship (illustrated in Figure 1.1).

Internet censorship includes simple techniques such as disruption of DNS resolutions of banned domain names [1] and blocking of flows to blacklisted IP addresses, as well as more sophisticated techniques, such as killing flows when deep packet inspection detects a banned keyword [2], identifying connections with traffic pattern fingerprinting [3, 4], and (not yet widely deployed, but practical) general fingerprinting of websites' traffic patterns within encrypted tunnels [5, 6].

Anti-censorship techniques should defeat any such censorship technique (with the possible exception of traffic fingerprinting, which is not yet a real issue and can typically be dealt with in a separate component [7]). The simplest technique is to connect (with encryption) to a proxy server: a server that performs internet communication on behalf of a client, making the client's internet location appear to be the proxy's address. A user with access to a proxy server has escaped the censor: the encrypted data being transferred between the user and proxy is not visible to the censor, and no censor currently blocks encrypted communication entirely. However, the proxy approach relies on the censor not realizing the proxy is a proxy, at which point its IP address can simply be added to a block list. Any system employing proxy servers is attempting to reconcile the contradictory goals of widespread distribution of proxy servers, and keeping the identities of the proxy servers hidden from the censor.

Sites and services hosted within a censored country can be compelled to remove offending content. A censor seeking to block websites is really seeking to prevent the inward flow of unwanted ideas from other countries. However, blocking outside information is only half of the censor's task.

Figure 1.1: Freedom House's overview [8] of internet freedom ("Free" vs. "Partly Free" or "Not Free"), overlaid on a map of the geographical distribution of internet users. Internet user map: Mark Graham and Stefano De Sabbata, Internet Geographies at the Oxford Internet Institute (data source: World Bank 2011; license: CC BY 3.0). Internet freedom data: https://freedomhouse.org/report/freedom-net/freedom-net-2016.

Effective censorship must also prevent people from expressing their own ideas. Social media represents a unique challenge to a censor aiming for this goal. At a traditional newspaper, a handful of known employees write articles that can be carefully monitored, perhaps even requiring explicit approval from the censor to publish. With social media, a significant fraction of the country's entire population is speaking its mind, spontaneously and constantly. Repressive governments, in particular, attempt to control social media as tightly as they can, deleting undesirable posts.

This dissertation presents work intended to enable the circumvention of both types of censorship: censorship preventing people from accessing external information (access censorship), and censorship preventing people from expressing themselves (publication censorship).

1.1 Access Censorship

We can divide efforts to defeat access censorship into two categories: those in which circumvention resources are easily blocked, and must be kept hidden from the censor, and those in which the circumvention is inextricably tied to a service that the censor is unwilling to block. This dissertation will describe a new technique of the former class: Salmon [9], which intelligently distributes proxy servers. It will also describe a new technique of the latter class: DittoTap, an advanced form of decoy routing (where censorship circumvention happens within ISPs' routers). In addition to DittoTap, it will describe a successful effort to deploy a production-quality implementation of an existing decoy routing technique, TapDance¹ [10], in the real world with tens of thousands of users (and thousands of concurrent users).

Salmon has the advantage of being incrementally deployable, even by individuals on home internet connections. The same is generally true of other easily-blocked-helper techniques, as well: the easily blocked helpers are typically envisioned as individually volunteered proxy servers (or Tor hidden bridges). TapDance and DittoTap require the support of large ISPs, but draw close to the ultimate goal of censorship circumvention: a system that is freely available to the general public, provides high bandwidth and low latency, and cannot be blocked without severing internet connectivity to the outside world.

¹ This dissertation describes the development and large-scale deployment of a production-quality TapDance implementation. I was not involved in the original conceptual design or publication of TapDance.

1.2 Publication Censorship

Social media services are massively popular, and in particular have become a significant component of political discourse. If an educated population with exposure to the internet were denied access to social media platforms, the impetus to seek out effective circumvention techniques would increase dramatically. Therefore, an intelligent censoring government will provide its citizens with functional, well-designed social media platforms—that are entirely under its control. In this way, the government can prevent online communities of dissidents from forming, without angering its populace with the obviously and tangibly oppressive move of disallowing social media.


With the perfectly free internet access provided by Salmon, or DittoTap, or other systems, people unhappy with the censored social media platforms available to them could in theory simply switch to an uncensored social media platform. However, network effects are powerful. Especially because not all users care about censorship, expecting the bulk of (for example) Weibo users to use anti-censorship tools to switch to Twitter is unrealistic. The best way to support free discussion is to attack censorship on platforms that are already in common use. GhostPost [11] addresses this need by supplying its users with "resurrected" copies of deleted posts, focusing its resources on protecting exactly the posts that its users are interested in.

1.3 Structure

This dissertation is structured chronologically, starting from Salmon. The ordering of the work is important to keep in mind: a central motivation for the Salmon project was the apparent infeasibility of decoy routing, based on the fact that the idea had been around for five years without a deployment beyond lab-scale. In theory, ignoring issues of feasibility, decoy routing has always been the most promising anti-access-censorship approach. Now that decoy routing has been demonstrated to be feasible, efforts to fight access censorship should focus on this approach. The chapters are structured in this way:

• Chapter 2 surveys relevant previous work on internet censorship (both of access and publication).

• Chapter 3 describes Salmon, an algorithm for intelligent, careful distribution of proxy servers for circumvention of access censorship, based on separating users by the amount of trust they appear to be due, as well as a full implementation and trial deployment of that system.

• Chapter 4 describes GhostPost, a system for the distributed preservation and restoration of users' posts on social media platforms under censorship. In particular, this chapter describes an implementation and trial deployment aimed at Sina Weibo (weibo.com).

• Chapter 5 describes the recent deployment of a production-quality TapDance implementation, which served tens of thousands of users (up to around 4,000 concurrently).

• Chapter 6 describes an addition to the TapDance protocol that fixes vanilla TapDance’s extreme constraint on upload throughput.

• Chapter 7 describes the theory, prototype implementation, and evaluation results of DittoTap, a new decoy routing design that combines TapDance's passive operation with Slitheen's perfect imitation of decoy site traffic characteristics.

In general, the common thread running through these works is deployability. Before decoy routing was recently demonstrated to actually have a path to deployment, Salmon was an effort to develop the most robust and algorithmically rigorous system possible while still being fully deployable. In particular, Salmon aimed to provide the same service (building on the same software) as the already deployed VPN Gate [12], but with an evaluated algorithm handling server distribution.

GhostPost included a Chrome extension built to not interfere with users' weibo.com use in any way, seamlessly inserting deleted posts where they would have been had they not been deleted. Additionally, GhostPost is incrementally deployable; even in a deployment with a single GhostPost user, that user's GhostPost process will preserve some deleted posts they otherwise would have missed.

The TapDance deployment project obviously focused on deployment, and the work done to lift its upload cap is for the purpose of making the deployment more usable. Finally, while DittoTap is still primarily a conceptual design, its central goal is deployability: just as TapDance's motivation was to make a version of decoy routing palatable to ISPs, DittoTap's goal is to make a version of the Slitheen design that is similarly palatable. Additionally, its prototype is built on top of the deployed TapDance station, with nearly no changes to the underlying logic.

Both chapter 3 and chapter 4 (Salmon and GhostPost) are modified and expanded versions of previously published papers [9, 11]. Both were published in venues (PETS and USENIX FOCI, respectively) that are open-access, and leave copyright ownership with the author: no permission is necessary to use this material here.

CHAPTER 2

RELATED WORK

There is existing work on both the censorship of access to information, and the censorship of publication of information, as defined in the previous section. Most of the work is on the former.

2.1 Censorship of Access

Some anti-censorship systems force a censor to choose between entirely blocking a legitimate service their citizens find useful, and accepting that their censorship can be evaded. Ideally, such an approach should force the censor to do no less than blocking the entirety of the internet outside the country's borders. Other circumvention techniques rely on helpers that can be blocked by the censor with no collateral damage, and so are effective only as long as those helpers stay hidden from the censor.

2.1.1 Easily Blocked Helpers

VPN Gate [12] hinders a censor's automated processes for large-scale blocking of VPN servers. VPN Gate publishes (in chunks of 100) a complete list of the servers in its system. However, it also mixes in fake IP addresses, and addresses of real web sites the censor does not want to block. To stymie the obvious solution of testing all the addresses for the presence of a functioning VPN server with an automated scan, the proxy servers coordinate to detect and ignore patterns of connection attempts that appear to come from an automated scanning process.

The system as implemented is not actually immune to simple automated probes. A simple harvesting script, run for 9 days on a single free Amazon EC2 instance, was able to verify that 3,101 of 44,039 IP addresses are working VPN Gate servers.

The script evaded the collective defense mechanism while harvesting from just a single IP address by exploiting VPN Gate's raison d'être: after verifying a VPN server by successfully connecting to it, the scraper used that connection to make its next query for new servers. With that list of the 3,101 working servers, a censor could, with no collateral damage, essentially completely shut down VPN Gate. Only a handful of unreliable servers would remain available, hidden among many blocked and fake servers.

Tor: Tor's [13] goal of internet privacy is orthogonal to our goal of internet access. However, the Tor Project has also attempted to provide access to censored users by distributing hidden "bridge" relays, but without a rigorous method for pinpointing infiltration by a censor. Designs using Tor do so only to gain Tor's privacy; their ability to provide access would not be reduced by switching to one-hop proxies.

Careful proxy distribution: Some techniques have been developed that aim to prevent a censor from blending in with users, and discovering all of the proxies. Salmon falls under this category. These proposals include recent work such as Proximax [14] and rBridge [15], earlier work such as keyspace hopping [16], and an analysis of Tor blocking [17] containing a sketch of a proto-Salmon or -rBridge. These techniques differ from Salmon, and from each other, both in the algorithms they use to identify censor agents, and in how they attempt to limit the number of agent users the censor can register.

Keyspace hopping [16] is completely open to new users; a user can always request more servers, and users are never banned. To slow the censor, users' requests for servers are limited by computational puzzles. We do not believe this is a feasible approach, as it assumes that e.g. the Chinese government cannot computationally dwarf the home computers of all Chinese internet users seeking to evade censorship. Because the system is built around rate-limiting user identities, rather than banning them for bad behavior, a robust system cannot be obtained by simply replacing the repeatable computational puzzles with Salmon's one-time registration.

Proximax [14] is purely invitation-based, with an approach even more open than Salmon's rate-limited recommendations. In fact, "invitation" is not quite the right term. The vast majority of users are not official Proximax users; friends simply share any and all servers they discover. The system compensates for this openness by being much more restrictive with distribution of new servers.

Only the very small group of officially registered users, who are roughly equivalent to Salmon's special above-the-trust-system users, are given fresh servers. They are free to distribute them as they wish, but Proximax tracks which servers given to which registered users become blocked, and prefers to give servers to the registered users whose friends get the fewest servers blocked.

rBridge, discussed next, was demonstrated to resist a censor's attack somewhat more effectively than Proximax. Proximax's less accurate discernment makes intuitive sense: it operates at a coarser resolution. Proximax tracks behavior at the granularity of entire social graph clusters, whereas Salmon and rBridge examine individual users.

rBridge: Like Salmon, rBridge [15] seeks to distribute proxy servers (Tor bridges, specifically) to the public while limiting a censor's ability to block them. As its core feature, rBridge prevents the central directory server from learning which users received which servers, in order to protect users from an attack in the context of Tor. Users receive a stream of credits while their servers remain unblocked, and can redeem credits to receive more servers. Users can only join via recommendation, and existing users are only eligible to obtain recommendation tickets if they keep their credit balance above a threshold. Salmon offers major improvements over rBridge in a few regards: robustness, accessibility for new users, and efficient use of server resources.

First and most simply, Salmon is far more robust to an attacking censor than rBridge. Salmon's trust levels, combined with permanent banning, make it over three times as robust to the sort of attack that rBridge considers, as shown in Salmon's evaluation section, subsection 3.4.5.

Second, rBridge does not offer a particularly friendly, or even predictable, path to becoming a user. rBridge only accepts new users who have gotten an invitation ticket from a friend, which the directory server distributes to random users who have kept their credit balance above a threshold. We seek to provide access to all censored users, not just friends of trusted users. rBridge users have no way to know when (if ever) a ticket might be issued to them. They have no control over the process, beyond requesting few enough bridges to remain eligible to receive a ticket.

Third, by designing specifically for Tor, rBridge accepts significantly degraded performance for any given amount of server resources. Tor's multi-hop design results in a third of the aggregate bandwidth and twice the RTT (for Tor's default of 3 hops) of a one-hop system with equivalent server resources.

VPN Gate cites performance as a reason for providing VPN servers rather than Tor bridges. Because our first concern is providing high performance circumvention, we also prefer the one-hop design. In theory, Salmon, rBridge, and any other careful distribution approach can distribute either Tor bridges or VPN servers. The choice is not tied to the details of the distribution strategy, but rather to implementation goals. That said, rBridge's tradeoffs and distinguishing features would be wasted outside the context of Tor.

Flash proxies [18] are an especially creative attack on the censor's ability to block proxies. Rather than hiding the servers, this approach aims to constantly add new ones to the system, so quickly that the censor simply cannot keep up. Participating websites embed scripts that turn visiting web browsers into Tor bridges for the duration of their visit to the page. While not too harmful to the visitor and certainly for a good cause, this behavior could result in backlash against the hosting sites, and the overall flash proxy concept. More importantly, the inherently short-lived nature of flash proxies makes them unsuitable for many popular services, such as video chat, streaming video, and online gaming.

Paid services: There is a thriving marketplace for VPN servers, and money is not a bad way to rate limit the censor. However, we would like to give all internet users access to the entire internet, regardless of their ability to pay for it. Furthermore, the "ability to pay" goes beyond financial resources: how does an otherwise law-abiding citizen pay for an illegal online service provided by a foreign company? Even the pseudonymity of Bitcoin is not (safely) available to Chinese users, as the Chinese government has outlawed Bitcoin. Chinese and Iranian users are currently able to make these payments, but as with VPN Gate, this is the result of a censor that is (for the time being) content to put up just enough obstacles to circumvention that only the most persistent users will bother.

The opposite approach is also possible: rather than have clients pay for circumvention services, have popular censored destinations pay. This approach is taken by the popular app Psiphon, with entities such as the British Broadcasting Corporation paying for high-performance delivery of their sites to users under censorship.

2.1.2 Hard to Block Helpers

Decoy routing: Radically different from proxy server approaches, decoy routing seeks to embed censorship circumvention into the fabric of the internet. This concept was independently proposed as decoy routing [19], Cirripede [20], and Telex [21]. TapDance [10] is an evolution of Telex, which does not interfere with packet forwarding at all, in order to be more palatable to reluctant ISPs. Rebound [22] is another evolution, still interfering with packet forwarding, but blending in with normal traffic (at very low throughput) much better than the prior generation, by causing the decoys themselves to deliver covert data to the user inside specifically provoked HTTP error pages.

The general decoy routing approach is for the client to hide a special value in a field meant to be random (TCP ISNs for Cirripede, TLS nonces for Telex, TLS ciphertext for TapDance) to complete a key exchange and indicate their desire to access censored sites. Routers throughout the internet are modified to monitor for and handle such flows. The entire exchange is (at first glance) invisible to the censor: the client initiates the connection to a site that is allowed by the censor, and known to use TLS. Until the traffic reaches the decoy router, the packets' bytes are indistinguishable from traffic truly meant for the overt site. However, an extremely attentive censor may be able to differentiate based on any deviation from the overt site's behavior, in terms of packet size, timing, TCP options, etc. Slitheen [23] solves this problem by injecting data into the husks of actual packets from the overt site.

It is important to consider whether the client would even be able to reach a decoy router in a realistic deployment. It appears decoy routing can withstand [24] the best proposed attacks [25] of that kind.

Domain fronting: [26] In addition to decoy routing's strategy of redirecting traffic at collaborating nodes in the network layer of the internet, it is possible to do similar redirections further up the stack. CDNs can function by having browsers address HTTP(S) requests to the true logical destination (the domain name in the "Host" field of the HTTP header), but physically send them to an IP address controlled by the CDN. If that CDN server does not have the requested item, it can retrieve it from the true server, and then finish the client's request. If the HTTP Host field is hidden inside a TLS session, and the visible portion of the request (such as DNS resolution) pretends to be aimed at an innocuous site also hosted on that CDN, the censor cannot distinguish requests to blocked sites from requests to innocuous sites.

Domain fronting is an interesting counterpart to decoy routing, being (from a very high level view) the exact same technique applied at a different networking layer. Its use of CDNs and cloud hosting, rather than the single-purpose and relatively inaccessible world of the internet's connective fabric, would appear to make it more easily deployable than decoy routing. However, from anecdotal experience, some cloud providers are not willing to host a domain fronting deployment due to previous bad experiences. Outside of the issue of deployability, decoy routing definitely has an edge: domain fronting's collateral damage is limited to the other sites on the cloud host(s), whereas with decoy routing it is potentially as much as the entire internet outside of the censored country.

Piggybacking: Anything that sends encrypted and authenticated messages to any recipient in the world, and is allowed by the censor, can naturally be used to break censorship. There have been proposals to tunnel IP packets through services such as secure VoIP calls [27], video calls [28], and email [29]. Tunneling only through VoIP does not meet our bandwidth goals, as it is essentially dial-up (or worse, given the aggressive psycho-acoustic compression algorithms used in modern voice communication). Worse, because systems such as Skype have no reason to recover dropped packets, the censor can attack covert tunnelled flows with tolerable levels of disruption to Skype calls [30]. Furthermore, secure VoIP and video chat do not enjoy the status of being painful for a censor to block. In fact, these protocols are often targets themselves, for the same reasons that censors block social media platforms.

2.2 Censorship of Publication

Censorship of speech on social media platforms, an issue that has only come into existence in the past decade or so, has received less attention than censorship of internet access. Several measurement studies have recently been published; these are valuable for obtaining a sense of exactly what a system like GhostPost is up against. These studies gather data on various features of censorship, such as how long posts survive before being deleted, what keywords are associated with censorship, which accounts are most heavily censored, the geographic distribution of censorship, etc.

One study [31] examined the lengths of time that posts ultimately fated to be deleted survive for on Sina Weibo. The authors observed a few interesting features in the deleted posts they studied. First, there is a significant diurnal cycle to deletion times: depending on the time of day when a deletion-fated post is posted, median time to deletion ranges from around 2 to 9 hours. Second, posts survive for a wide range of times: although nearly 90% of deletions happen within 24 hours, some take over a month. Third, there appears to be a moderate negative correlation between the number of posts an author has had censored, and the average lifetime of their deleted posts; they hypothesize that successive deletions lead to closer scrutiny from the censor. Among the measurement studies, this one is the most important to GhostPost; the information it provides allows our simulations of a large scale GhostPost deployment to be based on observed reality.

It is also useful to have a sense of what topics are targeted for censorship, who Weibo users are, and how they act. Another study [32] sought to compile a list of keywords most associated with censored posts. The same research group published another study [33] examining the properties of Weibo's user base. In order to carry out the study of censored keywords ([32]), the authors built a tool they call Weiboscope. Weiboscope monitors Weibo accounts with at least 1,000 followers, watching for disappearing posts. Weiboscope's collection of deleted posts is accessible through a public website, which presents statistics on Weibo censorship, and allows its corpus of deleted posts to be searched by keyword.

In addition to Weiboscope, FreeWeibo (run by greatfire.org [34]) also preserves deleted posts for public viewing. These services both use a centralized scraping system to monitor Weibo accounts deemed most important, and make a database of these accounts' deleted posts available for viewing.

Preserving posts in case they are deleted is not the only way to fight the Weibo censors. For instance, Chinese social media users have for some time been employing homophones, puns, and other linguistic tricks that are much more transparent to humans than to automated processes. Interfering with the censor's automated tools makes their job more difficult, thereby keeping posts alive longer. A recently published approach [35] seeks to automate this process, by developing a Chinese homophone substitution algorithm to be applied to Weibo posts.

2.3 Summary

Access-censorship (blocking of websites) circumvention has seen numerous novel academic proposals, and a few major deployments (academic and otherwise). Decoy routing, the most interesting class of academic proposals, has not yet been seriously deployed, although work described in chapter 5 demonstrated a large-scale (tens of thousands of users) decoy routing deployment that lasted for about a month. Until decoy routing deployments become more commonplace, the deployed state-of-the-art is domain fronting, which is used (as one of many techniques) by the popular app Psiphon. VPN Gate and Tor's hidden bridges are both active, fairly large-scale instantiations of the concept of networks of volunteer proxy servers, although neither is as effective in safely and efficiently distributing servers as it could be.

Publication-censorship (censorship of content posted on a censor-controlled website by users) circumvention, on the other hand, has seen little work beyond measurement and monitoring. The most interesting academic proposal for directly fighting it is to automate a technique that humans are already using: linguistic tricks to confuse censors' automated tools without losing the human meaning of a post.

CHAPTER 3

SALMON: ROBUST DISTRIBUTION OF PROXIES

A proxy server—a server that forwards traffic for a client connected to it by an encrypted tunnel—is a straightforward way for an individual user to escape a censor. Distributing a large number of proxy servers (mostly provided by individual volunteers) is therefore a (seemingly) straightforward approach to fighting censorship. However, nothing stops the censor from posing as a legitimate user: even if the system can somehow enforce a real identity policy, the censor's employees are citizens of a censored country, just like the legitimate users. If there are no countermeasures in place, both in the form of limits on how many identities the censor can create and how much damage the censor can do with each of its identities, the censor will be able to discover and thus block many or all of the system's proxy servers. Proxy-server-based approaches therefore revolve around the careful distribution of the precious resource of proxy server IP addresses.

In addition to several academic proposals [14–16], there are two notable deployments of proxy-based censorship circumvention: VPN Gate [12] and Tor's [13] hidden bridges. Both of these systems manage proxy distribution with ad hoc methods, resulting in implementation-based cat-and-mouse games with the censor.

Salmon [9] is an attempt to build a system in the same deployable fashion as VPN Gate (in particular, using the VPN system SoftEther, which is the VPN Gate authors' primary project), but with carefully designed algorithmic logic backing the distribution strategy. Salmon is both an algorithm for distributing a scarce resource (VPN servers) in the face of an adversary trying to destroy that resource (the censor with its ability to block), as well as the name of a deployed proxy server network that employs the Salmon algorithm for distribution.

3.1 Problem and Design Overview

The Salmon algorithm has three core components. 1) It tracks the probability that each user is an agent of the censor (suspicion), and bans likely agents. 2) It tracks how much trust each user has earned, and distributes higher-quality servers accordingly. Trust also helps keep innocent users out of trouble, by isolating them from newer users, where the censor's agents are more likely to be found. 3) Highly trusted users are allowed to recommend their friends to Salmon. The algorithm maintains a social graph of user recommendations, and assigns members of the same connected component — a recommendation ancestry tree — to the same servers, whenever possible.

Other works [14, 15] have employed techniques with the same overall theme of gradually identifying and punishing suspicious users. Salmon (the algorithm) is more robust than other approaches, thanks mainly to its trust levels. Our simulations show that the addition of our trust level logic significantly reduces how many users the censor can cut off from access to proxy servers. rBridge [15], the most robust previous server distribution technique, allows over three times as many users to be cut off from the system as Salmon.

Salmon (the system) has another advantage: it strikes a good balance between being easily accessible to the general public, and keeping the censor's agents out. Previous proposals have gone a bit too far in one of those directions. In addition to recommendations, the Salmon system accepts new users who can prove ownership of a well-established Facebook account. If Facebook is blocked, existing robust but low-bandwidth techniques are sufficient for getting the user through Salmon's registration process. The system is therefore open even to people who do not know any longtime Salmon users.

Salmon's name comes from its trust levels: just as salmon swimming upstream must hop up small waterfalls, and might briefly fall backwards, users need to advance up the trust levels. As a bonus, 'salmon' can be translated to Persian as 'free (as in freedom) fish'!

We assume a censor that:

• has enough employees to perform labor-intensive tasks, such as examining a list of servers.

• cannot identify proxies via traffic fingerprinting. (Anti-fingerprinting is important and has been studied [36–38], but is orthogonal to server distribution.)


• may try to block as many servers as it can immediately, or may be willing to simply discover servers, and not take action for months.

Salmon is built to solve a single deceptively simple problem. We are in possession of some scarce resource that we wish to distribute to a large group of users, most of whom are strangers to us, some of whom represent an adversary seeking to destroy the resource. In terms of censorship circumvention:

• We want to give proxy server IP addresses to any interested user living under censorship.

• The censor’s employees can pose as ordinary users.

• A censor can choose to block any proxy it discovers.

• Such a block reveals to us that the censor somehow learned there was a proxy server at that address.

This problem's difficulty comes from the requirement to keep the system open to all potential users, combined with censors' ability to behave identically to good users for as long as they wish. In fact, it's clear from these two facts that a perfect solution — one that keeps all servers from being blocked — is impossible: until the censor blocks a server, its agents can remain indistinguishable from good users.

3.2 Algorithm

So long as we open our system to the general public, a perfect solution —one that prevents the censor from blocking even a single server— is impossible. Perfection is not required, however: a censor that can only block 10 out of 1,000 servers has certainly not accomplished much. Our goal is to limit the censor’s ability to block servers. Blocking servers is simply the tool the censor uses to work towards its actual goal: denying our users free internet access. Therefore, we perform all evaluations in terms of how many users remain able to access one of our servers in the wake of the censor’s attack. We now describe the algorithm.

Salmon's algorithm has three core components. 1) We track the probability that each user is an agent of the censor (suspicion) and ban likely agents. 2) We track how much trust each user has earned, and distribute higher-quality servers accordingly. 3) We maintain a social graph of user recommendations; when assigning servers, we group together members of the same connected component.

Suspicion: The censor's weapon is its ability to hide among real users while damaging the system. When three users are the suspects for a blocked server, we have no choice but to say they each are equally likely to be the culprit. That is, in a block event on a server with n users, we consider each user to be innocent with probability (n − 1)/n. Note that we are defining "users of the server" to include every user we have given the server's address to; no matter when they last connected, a user never "leaves" a server. Salmon's algorithm for weeding out agents builds this information into something less uncertain: the probability that a user is not an agent is the product of its probabilities of innocence from every block event it has been involved with. Throughout the paper we use the more concise term "suspicion" — the complement of the probability of innocence. We permanently ban any user whose suspicion exceeds the threshold of suspicion, T.

Trust levels: In a system like Salmon, some users can reasonably be considered more trustworthy than others: most obviously, friends and friends-of-friends of the people running the system. Additionally, a user who has known a proxy address for months without the address getting blocked seems less likely to be an agent. We base this heuristic on the censor's optimal strategy: as we will demonstrate, the censor does not benefit (in terms of maximizing users without access to Salmon) from waiting long periods of time before beginning to block the servers its agents have collected.

Salmon quantifies the concept of trust with discrete trust levels. All entities in the system — both users and servers — live in a single specific trust level at any given time. When the system assigns a server to a user, it will only choose a server of the user's trust level. Users can move up and down over time. Servers only move up.

Level 0 is the entry trust level. Users not recommended by a trusted user start here. There are negative levels to accommodate level 0 groups who experience a server block, extending as far down as a user could possibly fall before being banned: users are banned solely on high suspicion, not low trust.
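As a concrete illustration of this bookkeeping, the following is a minimal C++ sketch, not the deployed directory server code: the record layout and the threshold value are illustrative assumptions. It shows how suspicion and trust might be updated when a server with n assigned users is blocked.

```cpp
// Hypothetical sketch, not the deployed directory server code.
#include <string>
#include <unordered_map>
#include <vector>

struct UserRecord {
    double innocence = 1.0;  // product of (n-1)/n over every block event seen
    int trustLevel = 0;      // entry level 0; may go negative
    bool banned = false;
};

constexpr double kSuspicionThreshold = 0.95;  // "T" in the text; value is illustrative

// Called when a server is blocked; assignedUsers holds everyone who was ever
// given that server's address.
void handleBlockEvent(std::unordered_map<std::string, UserRecord>& users,
                      const std::vector<std::string>& assignedUsers) {
    const double n = static_cast<double>(assignedUsers.size());
    for (const auto& id : assignedUsers) {
        UserRecord& u = users[id];
        u.innocence *= (n - 1.0) / n;  // each suspect is equally likely to be the culprit
        u.trustLevel -= 1;             // every user of a blocked server loses a level
        if (1.0 - u.innocence > kSuspicionThreshold)  // suspicion = 1 - innocence
            u.banned = true;           // bans are permanent
    }
}
```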

Figure 3.1: Evolution over time of the proportions of users in different trust levels. Every day, for every 30 users, one new user enters the system at level 0.

Figure 3.2: Similar to Figure 3.1, but the system launches with 20 special users who are considered to be above the trust hierarchy, and can recommend one level 6 user per day. Level 6 users recommend one level 5 user every four weeks.

The levels reach up to a maximum of L, a parameter to be decided upon. A user loses a trust level every time a server whose address it knows is blocked. A user gains a trust level if it goes for a period of time without its assigned server being blocked. That period doubles every level: promotion from level n to n+1 takes 2^(n+1) days, e.g. promotion to level 6 takes over two months. (Levels ≤ 0 all take one day.) We chose the exponential durations to strike a balance between quickly lifting good users out of the less desirable bottom levels, and protecting the highest levels. As we describe in the next subsection, forcing the censor to endure hefty delays is a very effective defense, especially given that we allow users to join via recommendation. We chose maximum level L = 6 as a good compromise in the same tradeoff.

A server's trust level is the minimum trust level of its assigned users. For instance, if a new server is assigned to a level 2 user, the server has been chosen to start at trust level 2. If another level 2 user is assigned to the server, and then the first user attains level 3, the server remains at level 2. Then, if the second user attains level 3 before any other users are assigned to the server, the server rises to level 3. Of course, a server never loses trust levels, because its users would only lose a trust level if it were blocked!

A design with no uppermost trust level, perhaps with each level l able to recommend users to level l − 1, would be an interesting alternate design. However, that design raises the question of what level users recommended by the implicitly trusted users would join at.
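The promotion schedule and the server-level rule can be summarized in a small sketch. This is illustrative only; the function names are assumptions, and the handling of levels at or below zero follows the parenthetical in the text rather than the deployed code.

```cpp
// Illustrative sketch of the trust-level rules described above.
#include <algorithm>
#include <vector>

constexpr int kMaxLevel = 6;  // L = 6

// Days a user must go without a block to be promoted from level n to n+1:
// one day for levels <= 0 (per the text), and 2^(n+1) days above that,
// so promotion from level 5 to 6 takes 64 days, a bit over two months.
int daysToPromote(int n) {
    if (n <= 0) return 1;
    return 1 << (n + 1);
}

// A server's trust level is the minimum trust level of its assigned users;
// it only ever rises, since a user would lose a level only if the server
// itself were blocked.
int serverTrustLevel(const std::vector<int>& assignedUserLevels) {
    if (assignedUserLevels.empty()) return kMaxLevel;  // unassigned; placeholder choice
    return *std::min_element(assignedUserLevels.begin(), assignedUserLevels.end());
}
```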

The trust levels benefit higher-trust users in two ways. First, servers are not equal. Although we require a minimum bandwidth (100KB/s) from our volunteers, more generous volunteers might provide more. For instance, our own servers each offer about 1.5MB/s. Salmon assigns faster servers to more trusted users. Second, users at higher trust levels are better isolated from censor attacks. Only a censor willing to leave servers unblocked for months at a time, or able to obtain a recommendation from a trusted user, will get access to servers at high trust levels. Our innocent high trust users are therefore less likely to experience server block events, and to be banned.

Recommendation graph: A user at maximum trust level L may recommend friends to Salmon; these friends join at level L−1. Users whom Salmon axiomatically assumes trustworthy (i.e., friends of the directory server admins) are sent to a special level outside of the hierarchy, which is attainable only in this manner. They can recommend other users to level L, and are allowed to recommend after a much shorter delay than level L users: once per D∗ days vs. once per D_L days. We have chosen D∗ = 1, D_L = 30, as we describe in subsection 3.3.6.

We must be careful about allowing such recommendations in the system we have described thus far: recommendations give a patient censor the ability to exponentially grow a collection of agents at high trust levels. The censor could reach practically any agent/user ratio it desired. There are two ways to deal with the censor recommendation issue.

First, and most simply, we can delay the censor. Getting significant exponential growth from the recommendation system takes several months: the first wave of users must wait over four months to recommend the second wave; the second and all subsequent waves must wait over two months to begin recommending. We expect that our volunteer servers can change IP addresses somewhat easily — easily enough that many would be willing to do so once every several months, but certainly not as frequently as e.g. once a week.

In addition to recovering from a recommendation-based mass block, we can take prophylactic action. Users want to avoid being grouped with agents, so they should naturally want to be grouped with their recommend{er,ee}s: real-world friends whom they trust not to be agents.

We use the following logic: when assigning a new server to a user u, we first look for one that has already been given to a member of its connected component — always a tree, so call it T(u) — in the recommendation graph. It is natural to group users in this way, as friends could easily share proxy login credentials among themselves regardless of our logic. Indeed, a design discussed later (Proximax [14]) is built entirely upon this fact. Because this sharing of servers among friends is so natural, this logic takes precedence over the trust levels: a user may be placed on a higher-level server if it includes a recommendation friend. If there are no such servers with room left, we instead choose a server with enough free slots for all of T(u). Enough slots on the chosen server are reserved to ensure that the rest of the users in the tree can join later. Let M be the maximum users allowed in a group. If |T(u)| ≥ M, only a fresh, unassigned server will be used. Helpfully, this means that a group of M or more friends who build an agent-free recommendation tree receive servers guaranteed to be agent-free.

Just as the trust levels keep innocent, highly trusted users isolated from impatient censor agents, the recommendation graph tends to keep agents who were created by recommendation in groups with themselves. Both mechanisms share a fundamental purpose: keeping censor agents as tightly grouped together as possible. The more evenly the agents can spread throughout the system, the more servers they will discover.

It is possible that some users might wind up in the same recommendation tree as an agent, e.g. if the censor hands out free recommendation codes in an internet forum. Users should be careful about the source of their recommendations. This scenario does not spell guaranteed banning for the user u, however; it simply makes them more likely to be grouped with an agent, depending on what fraction of users in T(u) are agents.

Summary: Users are given one server at a time. The server is shared among a group of users. If a user's server gets blocked, we trust the user less, and suspect them more. Suspicion is used to decide on bans for users. Trust is used to reward reliable users with better servers, and to keep them safe from the collateral damage of an impatient censor. Users gain trust if the servers they have been given remain unblocked for long periods of time. Highly trusted users can recommend a limited number of friends to become highly trusted. In the undirected graph of recommendations, members of a connected component are given the same servers.

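To make the grouping rules concrete, here is a hypothetical sketch of the server-selection step. The data structures and the value of M are illustrative; trust-level matching and the reserved-slot bookkeeping are omitted for brevity.

```cpp
// Illustrative sketch of the grouping rule: prefer a server already holding a
// member of the user's recommendation tree T(u); if the tree has M or more
// members, use only a fresh server; otherwise pick a server with enough free
// slots for the whole tree.
#include <cstddef>
#include <optional>
#include <set>
#include <string>
#include <vector>

constexpr std::size_t M = 10;  // maximum users per group (illustrative value)

struct Server {
    int id;
    std::size_t capacity;
    std::set<std::string> assigned;  // users already given this server's address
};

std::optional<int> chooseServer(const std::vector<Server>& servers,
                                const std::set<std::string>& treeOfU,  // T(u)
                                const std::string& u) {
    // 1) A server already shared with a member of T(u).
    for (const auto& s : servers)
        for (const auto& member : treeOfU)
            if (member != u && s.assigned.count(member) && s.assigned.size() < s.capacity)
                return s.id;

    // 2) If |T(u)| >= M, only a completely fresh, unassigned server will do.
    if (treeOfU.size() >= M) {
        for (const auto& s : servers)
            if (s.assigned.empty())
                return s.id;
        return std::nullopt;
    }

    // 3) Otherwise, a server with enough free slots reserved for the whole tree.
    for (const auto& s : servers)
        if (s.assigned.size() + treeOfU.size() <= s.capacity)
            return s.id;
    return std::nullopt;
}
```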

3.3 Implementation

The previous section presented the theoretical design of Salmon: the algorithm for careful proxy distribution. This section deals with the real-world details of our implementation: the architecture of the system's software components, restricting account creation with Facebook, our choice of values for the tunable parameters present in our design, and a possible attack stemming from a practical issue not included in the threat model.


Figure 3.3: The architecture of the Salmon system. A central directory server maintains Salmon’s algorithmic logic regarding proxy server distribution, instructs proxy servers to grant access to the appropriate clients, and informs clients of their proxy servers’ identities. The directory server is assumed to be blocked by the censor, and so communicates with clients via any popular encrypted email service not controlled by the censor.

3.3.1 Components

Our implementation of Salmon comprises three components: a Windows client program for the users (∼4,300 C++ SLOC, as counted by the CLOC tool, excluding GUI code); a server daemon for the volunteers (Linux and Windows; ∼1,400 C SLOC for the Linux version); and a central directory server to keep track of servers, and distribute their IP addresses to users (∼2,700 D and 1,000 C++ SLOC, using MongoDB for storage).

All of the components have been released under the GPLv3, and are available at https://github.com/SalmonProject. Figure 3.3 illustrates how these components interact.

The entirety of the algorithm described in this paper is carried out by the directory server. Since the censor can easily block the directory server, the client program communicates with it via any commonly used encrypted email service that the censor does not block (or control). The email address used to communicate with the directory server serves as a Salmon account's identifier.

Underneath our own client and server programs, we run SoftEther VPN. SoftEther is mature, and comes tested for censorship circumvention by VPN Gate [12]. Throughout the paper, we discuss distributing "proxy server addresses." In reality, the directory server also provides the Salmon client with the server's X.509 certificate, and with login credentials, which it must also instruct the server in question to accept.

We want Salmon to be easily available to all censored internet users, not just technically adept ones. Our client implementation is for Windows, is compatible back to XP, and is available in English, Chinese, and Persian. The client can also export its list of VPN credentials to iOS (in convenient one-tap .mobileconfig files) and Android devices.

Our algorithm is vulnerable if the censor can create an arbitrary number of user identities. After such a censor begins attacking, any server that has not already received a full load of innocent users will be blocked. Therefore, we must be very careful about how we allow new users to join the system. We accept new user registrations in two ways, which we now describe.

3.3.2 Facebook Registration

A user can create a new Salmon account by demonstrating ownership of a valid Facebook account. In our initial deployment, a "valid" Facebook account is simply one that existed before 2015. Any enhancements to Facebook's screening of fake accounts would make Salmon more robust. Cat-and-mouse games could be played with account requirements, such as post frequency or patterns.

For a more sophisticated defense, the system could attempt to shut out accounts with fake profile pictures. For instance, it could require profile pictures to show a face, check via reverse image search that those pictures do not exist outside of Facebook, and reject any Facebook account whose face matches other accounts that already created a Salmon identity.

Of course, Facebook is itself a prime censorship target. Low bandwidth covert channels, such as through Skype [28] or email [29], are helpful here. While their performance is not quite what users would want for general internet usage, they are sufficient for Salmon's one-time registration process, with its single Facebook post.

To demonstrate ownership of an account, a user first sends the directory server a link to the Facebook profile they want to register with, and a post that they intend to post to their wall in the future. To hide the fact that the user is using Facebook to register for Salmon, no messages or friend requests are required to be exchanged with any special Salmon Facebook accounts. Rather, the user's account must be publicly visible during the registration process. This is an unfortunate requirement for users who prefer to keep their Facebook profile private, but the profile need only be public for a few minutes.

To prevent double registrations, the directory server must remember which Facebook profiles have created Salmon accounts. Since our logic must be able to test a Facebook profile for membership in the set of already used profiles, an intruder who gains control over the directory server will be able to do the same. To prevent the intruder from going beyond membership tests, and obtaining a full list of Salmon Facebook profiles, we store the links hashed. Note that the censor cannot test membership simply by attempting to register with the user's Facebook profile URL, unless it can guess the user's next Facebook status update.

Our use of Facebook is not guaranteed to limit the censor to one Salmon account per employee. Iran, in particular, is known to already be using realistic fake profiles to try to make connections to Iranian Facebook users, who are typically much more careful with privacy settings than the typical Facebook user. In the case of China, "one account per employee" might actually be overwhelming for a Salmon deployment with hundreds or thousands of servers, given the size of its censorship operation.

(That said, those employees would be vastly more likely to have active, well established accounts on Renren than on Facebook.) In such a case, we could disable Facebook registration, so that new users could only join via recommendation. The closest design would then be rBridge, relative to which Salmon would still provide an easier and more reliable path for new users to join, even without non-recommended account creation.

Therefore, the Facebook registration mechanism can be removed without dismantling Salmon; it is an additional feature meant to provide openness equal to VPN Gate, while being significantly harder to block. Since VPN Gate is currently successfully serving users (as of May 2016), this is a good goal. A recommendation-only version of Salmon would still serve users, and until that becomes necessary, the system is open to the public. Even if the censor suddenly adds many agents and discovers many servers before we realize we need to disable Facebook registration, we can have servers reset IP addresses once Facebook registration is disabled. In the time before we are forced to disable Facebook, we grow much more rapidly, and with a more widely seeded social graph, than we could with only recommendations.
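A minimal sketch of the double-registration check described earlier in this subsection follows. It assumes the profile links are stored as (for example) SHA-256 hashes; the actual hash choice and storage of the real implementation are not specified here.

```cpp
// Illustrative sketch: store only hashes of registered profile URLs, so an
// intruder can test membership but cannot read off a plaintext list.
#include <openssl/sha.h>
#include <set>
#include <string>

std::string sha256Hex(const std::string& in) {
    unsigned char d[SHA256_DIGEST_LENGTH];
    SHA256(reinterpret_cast<const unsigned char*>(in.data()), in.size(), d);
    static const char* hex = "0123456789abcdef";
    std::string out;
    for (unsigned char b : d) { out += hex[b >> 4]; out += hex[b & 0xf]; }
    return out;
}

// Returns false if this Facebook profile URL has already registered a Salmon
// account; otherwise records it (hashed) and allows the registration.
bool tryRegisterProfile(std::set<std::string>& usedProfileHashes,
                        const std::string& profileUrl) {
    const std::string h = sha256Hex(profileUrl);
    if (usedProfileHashes.count(h)) return false;
    usedProfileHashes.insert(h);
    return true;
}
```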

3.3.3 Recommended Registration

A user can create an account at the recommendation of an existing user who is sufficiently trusted by the system. This process is simple: the directory server generates a short alphanumeric code for the recommender to give to the recommendee, and when the recommendee registers, they provide the code rather than proving ownership of a Facebook profile. The recommendee must not begin its life immediately able to recommend, or else a censor with a single recommender account could instantly create arbitrarily many identities. The details of when recommendation is possible, and what the result is for the recommendee, are described in §3.2.

Also described in §3.2 is the fact that part of the algorithm involves tracking who recommended whom. That is, we are storing some subset of a social graph of our users, perhaps making the machine storing it a target. However, an attacker gaining access, especially a national censor, would in fact not find much new information.

The graph stored is a tiny shadow of a true social graph. The graph is a forest of trees, with each user having at most one incoming recommendation edge. With n − 1 edges for n users, a recommendation forest would have fewer than 1% of the edges of the corresponding set of Facebook users.

Additionally, national censors may have other options for collecting this sort of data. Of the two countries we are most interested in, China and Iran, only Iran might need to resort to breaking into servers to obtain social network data about its citizens. Renren, the Chinese Facebook clone, is as ubiquitous in China as Facebook is in free countries. The Chinese government certainly has access to all of Renren's data; there is not much more they could want.

Finally, we are not directly storing edges between Facebook IDs, but rather between email addresses, which we suggest to be a fresh account just for Salmon, in case a censor distributes malicious copies of Salmon to steal passwords. An email address is linked to a hash of a Facebook ID (if the user registered with Facebook). While these measures do not make meaningful social connection data irretrievable, they do mean that an intruder will not directly find a subset of Facebook's graph. In particular, the hashing means that if the intruder does not already know that a certain Facebook account exists, they will not know what they are looking at when they see it in the graph.

3.3.4 Detecting Blocked Servers

In addition to being blocked by the censor, servers can also simply go offline, as many volunteer servers will be individuals' personal computers. It is crucial to distinguish between these cases, or else churn (to say nothing of a censor providing intentionally short-lived servers) would cause many users to be needlessly banned. We want to view a server as "blocked" only if it is still functioning perfectly well, and connections to it fail solely because of a censor's interference. The censor only controls traffic within its own country, and so we can conclude a server is blocked if and only if a host outside the censoring country can reach the server at the same time that the client cannot. The check is done by the

directory server. Experience has led us to cover some corner cases, such as a server coming online just as the client reports it to be down, or a client with intermittent internet access. Unfortunately, putting this issue to rest on the theoretical side requires more than patching a few bugs. If there is an ongoing non-zero probability that a user's server will be erroneously classified as blocked, then eventually that user is going to be banned. We hope our implementation is robust enough that these erroneous bans would take years, but without formal verification of the entire system, we cannot be sure. Ideally, some more sophisticated approach, which would take these observations as input, and prevent or reduce false positives at the cost of increasing false negatives, should be used. The development of such an approach would be an interesting avenue of research.
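The core of the check can be sketched as follows (an illustration under the assumption of a single probing host outside the censoring country; the names are ours, not the directory server's):

    def classify_server_failure(probe_from_outside, server_addr):
        # The client, inside the censored country, has just reported that it
        # cannot reach server_addr; probe_from_outside(addr) attempts a
        # connection from a host outside the censoring country.
        if probe_from_outside(server_addr):
            # Reachable from outside but not by the client: treat as blocked,
            # which raises suspicion of the server's group of users.
            return "blocked"
        # Unreachable from everywhere: ordinary churn. No one is penalized,
        # and the client becomes entitled to a replacement server.
        return "offline"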

3.3.5 Server Churn

Salmon prevents churn from causing erratic availability by adding new servers to a group — without penalizing the group's users — when all of the group's servers are offline, but not blocked. Users stay on the same server until it goes offline (and is verified by the central directory as offline, not blocked), at which point they are entitled to receive a new server. They also retain access to the previous server, and can choose to switch back to it whenever it comes back online. When told to connect, the Salmon client attempts connections to its collected servers in order of a rough heuristic score of advertised bandwidth and observed RTT. Servers observed to be offline within the last week are tried after all others. While the assignment of multiple servers to a single group adds significant complexity to the implementation, it does not change the algorithmic concepts. A group of servers with at least one online at any given time provides essentially the same service as a single server that is always online. All clients in the group are entitled to learn about all of the group's servers, and no clients outside the group are entitled to learn any. Therefore, for purposes of analysis, we assume that each group of users receives a single server, which is always usable until blocked by the censor. We refer to users being "grouped together" or "assigned to a server"; these terms are synonymous.
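The client-side ordering might look roughly like the following sketch (the scoring formula is illustrative; only its inputs — advertised bandwidth, observed RTT, and recent offline history — come from the description above):

    import time

    ONE_WEEK = 7 * 24 * 3600

    def order_servers_for_connection(servers, now=None):
        # Try servers in order of a rough bandwidth/RTT score, with servers
        # seen offline in the last week tried only after all others.
        now = now if now is not None else time.time()

        def recently_offline(s):
            return (s.last_seen_offline is not None
                    and now - s.last_seen_offline < ONE_WEEK)

        def score(s):
            # Higher advertised bandwidth is better; higher RTT is worse.
            return s.advertised_bandwidth_kbps / max(s.observed_rtt_ms, 1.0)

        return sorted(servers, key=lambda s: (recently_offline(s), -score(s)))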

Additionally, some servers, such as those run on home internet connections, may periodically change IP addresses. Our implementation allows the client to stay with such a server. The server keeps the directory informed of its address, and when the client tells the directory that the server is offline, the directory simply informs the client of the new address. This IP address churn can in fact be extremely useful to the system, if it can be intentionally induced. A blocked server can "return to life" by simply changing its IP address, and reconnecting to the directory server (which will of course not assign it to the exact same group of users that got it blocked). A volunteer can provide an email address at which to be notified if their server becomes blocked. The email includes instructions for forcing a typical home internet connection to change IP addresses.

3.3.6 System Parameters

Our implementation assigns up to 10 users to a single group, and asks servers to provide a minimum of 100KB/s. We hope that many of our volunteers will provide more than the minimum; the servers we are ourselves hosting each provide around 1.5MB/s. Ultimately, the question of group size comes down to a fundamental tradeoff: serving more users with a given number of servers on one hand, and providing more bandwidth and limiting the censor's ability to block servers on the other. We wanted to push the group size as high as possible while both staying safe from the censor, and providing users with good throughput. Given our minimum bandwidth requirement of 100KB/s for the servers, if all clients in a group of 10 attempt to transfer large amounts of data simultaneously over TCP, they will each get around 10KB/s. We did not want to design into the system the potential for users to see single-digit KB/s bandwidth: dial-up, which any modern VPN-based anti-censorship technique ought to outpace, is 7KB/s. Therefore, on the implementation side, we decided on 10 as the upper bound. On the algorithmic side, our simulations using size-10 groups were sufficiently robust. Our simulations led us to set the suspicion threshold T = 1/3. For our maximum group size M = 10, users are banned after witnessing 4 server blocks in full groups.

We chose the highest level L = 6 as a compromise between allowing users to become highly trusted in a reasonable timeframe, and limiting the rate at which the censor can recommend agents into the system. Similarly, we chose D∗ = 1 day and

D_L = 30 days for the wait between recommendations for special users and level-L users, respectively.
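Collecting the parameters just described in one place (a sketch with names of our own choosing, not identifiers from the implementation):

    # Hypothetical constants mirroring the deployed parameter choices.
    MAX_GROUP_SIZE       = 10         # users per server group (M)
    MIN_SERVER_BW_KBPS   = 100        # minimum bandwidth asked of volunteers
    SUSPICION_THRESHOLD  = 1.0 / 3.0  # T: ban a user whose suspicion exceeds this
    MAX_TRUST_LEVEL      = 6          # L: highest ordinary trust level
    RECOMMEND_WAIT_SPECIAL_DAYS = 1   # D*: wait between recommendations, special users
    RECOMMEND_WAIT_LEVEL_L_DAYS = 30  # D_L: wait between recommendations, level-L users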

3.3.7 Volunteer Safety

A Salmon server's logs are an appealing target for the censor. The censor could gather Salmon logs with much less effort by running honeypot servers, but volunteers should still be aware of this point. More immediately, as a side effect of providing users with unfiltered internet access, Salmon causes users' actions to appear to come from the server they are using. This obviously has the potential to cause trouble for the server volunteers, which must be prevented. We do not know how various legal systems would ultimately parse this issue, but logically speaking, a Tor exit node and a volunteer VPN server ought to be equivalent in terms of liability. A Tor exit node operator breaking laws is indistinguishable from the node's users breaking laws. A VPN volunteer who breaks laws and fabricates logs showing their users doing it is similarly indistinguishable. We therefore expect VPN server volunteers to be at least as safe as Tor exit node providers. Tor has mostly managed to function without anything terrible happening to its exit node providers — outside of Austria, where an exit node provider was judged criminally responsible for the traffic processed by his node [39]. Given Tor's record, and the fact that, unlike Tor, our volunteers have logs to present if they are ever accused, we feel confident that our volunteers are safe. If a volunteer wishes not to keep logs, either because they feel safer that way, or because they want to provide more protection to their users, SoftEther's logging can be turned off. Indeed, in VPN Gate (whose volunteers' logging policies are listed), a tiny minority of volunteers do not keep logs. BitTorrent deserves a special mention, as the most likely source of abuse. People in censored countries might not disconnect from Salmon before torrenting. Worse, people in uncensored countries would certainly use Salmon not to evade any true censorship, but just to hide their IP address while

torrenting unauthorized material. Salmon volunteers would therefore be likely to receive a steady flow of copyright infringement complaints. Even if no volunteer was ever sued, frequently being accused and needing to dig information out of logs would be prohibitively intimidating and annoying, and their ISP would likely make them stop. We therefore decided to block Salmon users from using BitTorrent. BitTorrent is generally unaffected by Iranian and Chinese censorship, so this is not a harsh measure. We block BitTorrent in the manner suggested for Tor exit relays [40]: by allowing the client to send packets only to a few whitelisted ports, comprising the standard ports for the internet services most used by typical users: DNS, HTTP, HTTPS, FTP, ssh, and a few common instant messaging services. Skype is another critical service, but it is satisfied with access to the HTTP and HTTPS ports (80 and 443). BitTorrent clients typically choose a random high-numbered port to receive packets on, and peers unable to send to those ports have no way to get them to change. Being cut off from the vast majority of BitTorrent clients renders BitTorrent useless.
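The effect of this whitelist can be sketched as a per-packet decision (the specific instant-messaging port shown is illustrative; the rest of the list follows the services named above):

    # Hypothetical allow-list of destination ports reachable through the VPN.
    # BitTorrent peers listen on random high ports, so they become unreachable.
    ALLOWED_DST_PORTS = {
        53,    # DNS
        21,    # FTP
        22,    # ssh
        80,    # HTTP (also satisfies Skype)
        443,   # HTTPS (also satisfies Skype)
        5222,  # an instant-messaging port (XMPP), shown for illustration
    }

    def permit_outbound(dst_port):
        # Drop any packet whose destination port is not whitelisted.
        return dst_port in ALLOWED_DST_PORTS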

3.3.8 Privacy

Some circumvention approaches build on top of Tor, or otherwise attempt to provide users with privacy or anonymity. As a one-hop VPN-based approach, which furthermore relies on unvetted volunteers to provide servers, Salmon offers its users no meaningful privacy. We clearly convey this fact, including the example of potential government honeypots, which should get users' attention (and is easily the greatest threat). Privacy would be ideal (and the lack of privacy disqualifies Salmon for truly dangerous activities), but given the prevalence of paid VPN services as a circumvention method, Salmon is not any worse than the status quo.

3.3.9 Usability

We want Salmon to be easily available to all censored internet users, not just technically adept ones. Our initial client implementation is for Windows, and is compatible back to XP, which retains a large install base in China. A SoftEther client is available for Linux; if there is sufficient demand, the

Salmon client can easily be modified to run under Wine, as its Windows-specific code is almost entirely simple GUI components. The client runs on top of SoftEther, and takes care of all configuration tasks. Even when the client has amassed a list of servers, the user is presented with a single simple "Connect" button (figure 3.4), and the Salmon client finds the best available server to connect to. The email-based communication with the directory server can be handled automatically by the client, using the vmime library. There is also a manual email option as a fallback. The client is available in English, Chinese, and Persian. The client can export its list of server credentials to iOS (in convenient one-tap .mobileconfig files) and Android devices, as SoftEther supports L2TP/IPsec.

Figure 3.4: The interface of a logged-in Salmon client. The advanced options include recommendations, and sending server credentials to mobile devices.

3.3.10 The Zig-zag Attack

Proxy-based techniques without careful distribution, such as VPN Gate or the direct distribution of Tor's hidden bridges [13], face a problem known as the zig-zag attack [41]. In this attack, the censor somehow discovers a few servers, and watches its citizens' traffic for connections to those servers. Citizens communicating with the servers are assumed to be using the same circumvention system, and so any other IP addresses they communicate with are worth close inspection, since there are likely to be some new proxy servers among them. This exact implementation of the zig-zag attack does not affect careful distribution schemes such as Salmon, because clients are not free to try any server whenever they wish. However, a variant, which we will call the active zig-zag attack, does apply. The difference is simple: the censor now blocks

a server once it has finished observing its users, forcing the users onto new servers. In addition to discovering servers, the attack would damage the reputations of the innocent users who had been placed on those servers. Defending against the zig-zag attack entails making it difficult for the censor to confirm the presence of a proxy server. Such a defense requires both authentication and camouflage. If the server cannot verify that a given connection is coming from a user who is "supposed" to be talking to it, it has no choice but to act as a functioning proxy, fully exposing itself to a zig-zagging censor. The server's behavior after deciding to reject a connection is just as important: if it doesn't look like a legitimate, innocuous server, the censor might still block it. The situation then becomes a steganographic arms race. Each Salmon user has unique login credentials for each of their servers. Connecting to a server without valid credentials yields a page from what appears to be an ordinary HTTPS server (since credentials are requested via HTTP authentication). Of course, the issue of camouflage goes beyond authentication behavior, most notably to traffic fingerprinting. However, counter-fingerprinting techniques apply to just about any circumvention system, including those based on Tor. The censor could also take another, more limited approach to zig-zagging, in which the discovery of one server in a group can reveal the rest. Suppose the censor is secretly in control of a popular site that it censors, and has the site give cookies to visitors. A Salmon user (with one server known to the censor) who repeatedly visits this site via different VPN servers (due to churn), presenting the same cookie with each visit, will prove to the censor that the IP addresses seen presenting the cookie are VPN servers. Note that not all of these VPN servers, even the originally discovered one, need to be Salmon servers for this attack to be effective. To mitigate this threat, users should use incognito/private browsing while on Salmon, and should not leave a browser session open for days at a time.
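A toy illustration of the authenticate-or-camouflage behavior (TLS and the hand-off to the actual VPN are omitted; the credential and every detail here are placeholders, not SoftEther's implementation):

    from http.server import BaseHTTPRequestHandler, HTTPServer

    # Placeholder credential ("salmon:secret"); real Salmon issues unique
    # per-user, per-server credentials.
    VALID_AUTH_HEADERS = {"Basic c2FsbW9uOnNlY3JldA=="}

    class CamouflagedHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.headers.get("Authorization") in VALID_AUTH_HEADERS:
                # A valid credential would hand the connection to the VPN;
                # here we merely acknowledge it.
                self.send_response(200)
                self.end_headers()
                self.wfile.write(b"tunnel established (placeholder)\n")
            else:
                # Unauthenticated probes see an ordinary password-protected site.
                self.send_response(401)
                self.send_header("WWW-Authenticate", 'Basic realm="private"')
                self.end_headers()
                self.wfile.write(b"<html><body>401 Unauthorized</body></html>\n")

    if __name__ == "__main__":
        HTTPServer(("127.0.0.1", 8443), CamouflagedHandler).serve_forever()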

3.3.11 Deployment

We have a small initial deployment of Salmon running, with a few servers and about a dozen users in Iran and China. After working through some

bugs, such as unstable internet connections causing the client program to wrongly report a server block, the system is working for these users. Traffic fingerprinting therefore does not appear to be in use. Our most interesting experience involved the WiFi at one user's university. The WiFi is public but with very limited bandwidth, except for connections to the school's VPN. However, connecting to another VPN prevents the use of SoftEther in its default configuration, short of using a VM. There is no particular indication that the purpose of this setup was to interfere with circumvention; one of the authors attended a school in a free country with the same setup. This problem can be solved by having the client's traffic enter the tunnel at a higher layer, such as with SOCKS. The SoftEther client has a SOCKS mode, although SOCKS entails additional configuration, and should not be our default.

3.4 Evaluation

To demonstrate that Salmon can withstand an attack by a realistic censor, we simulated users and censor agents requesting proxy servers. We measured how many servers the censor can block, and how many censor agents and innocent users are banned. Ultimately, though, the censor cares most about how many users are left without access when it has done all the damage it can. This quantity (as a fraction of all users) is used for the vertical axis of the charts. To ensure that the simulations are faithful to the real system, the simulation environment was built around our implementation of the central directory server: the component of the system where all of the algorithm logic lives. Specifically, functions that would be called upon receipt of an email from a client, or a message from a server sent through TLS, are instead called directly from the simulation framework. The simulation iterates through one day at a time, having each user check whether they have access to a server, and requesting a new one if not. Agents behave similarly, and additionally have the opportunity to block any server they have been given, whenever their strategy dictates. Thanks to the simplicity of the simulation framework, and the reuse of code (with the algorithm logic in particular remaining unchanged), we are

confident that even if a coding bug made our real-world implementation weaker than the conceptual design, that weakness would be reflected in the simulations. All simulations have 10,000 users, including the censor's agents, which comprise {1%, 2%, 5%, 10%} of all users. Each of these percentages represents a serious attack by the censor: even at the lowest level, the censor has assembled 100 real Facebook accounts, and gone through the Salmon registration process for each one. The users join the system in the pattern depicted in Figure 3.2, representing exponential word-of-mouth growth, as well as a steady flow of recommendations from the small number of "special" trust level users (the Salmon administrators' personal friends). Each simulation starts with all servers already present in the system. We vary the number of servers available to the system, from 1,000 to 2,000. The charts' horizontal axes represent a related quantity: users per server, varying from 5 to the system's maximum of 10. We ignore values beyond 10 because even without a censor attacking, a lack of servers would rapidly inflate the fraction of users cut off from servers; Salmon simply stops accepting new users. In our implementation, a user who registers when all servers are full is informed of the lack, and encouraged to ask any friends they have in free countries to consider becoming Salmon volunteers. As a fallback, they are given 16 confirmed (by the process in §2.1.1) VPN Gate servers. Allowing newly registered users to go without Salmon servers when no empty servers are available has a major benefit for the older users. If a Salmon system can grow for a time without the censor inserting agents, an increasing number of users actually become invincible to the censor: their servers can never be discovered. Suppose the system gains a significant initial userbase by the time the censor takes notice and begins inserting agents. Salmon fills servers one by one, rather than evenly distributing users among all available servers, and server groups have a hard cap on membership. Therefore, any user whose server group is filled before agents begin joining can never be banned. To keep our simulations both simple and pessimistic about Salmon's robustness, we assume that the fraction of users owned by the censor does not increase over time; even users joining on the first day have {1%, 2%, 5%, 10%} of agents among them. While the system's logic — whether to give a user a server, which server to give, when to ban a user, recommendations — is deterministic, the

simulations are a random process. The order in which users (with agents mixed in) act is randomized for each run of the simulation, leading to different groupings of users, and therefore different effects from the blocking of servers. We ran all of our simulations multiple times; the points on our charts are means, and the bars are 95% confidence intervals. 10 replications were enough to shrink most of the 95% confidence interval bars to smaller than the charts' point icons.
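The daily loop can be sketched as follows (names are illustrative; in the real framework these calls go directly into the directory server implementation described above):

    import random

    def run_simulation(directory, users, agents, num_days):
        # One run: each day, every user checks for a working server and
        # requests one if needed; each agent may additionally block the
        # server it holds, whenever its strategy dictates.
        population = users + agents
        for day in range(num_days):
            random.shuffle(population)           # order of action is randomized
            for actor in population:
                if not actor.has_usable_server():
                    directory.request_server(actor)
                if actor.is_agent and actor.wants_to_block(day):
                    directory.record_block(actor.current_server())
        # The quantity plotted: fraction of (real) users left without access.
        return sum(1 for u in users if not u.has_usable_server()) / len(users)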

3.4.1 Adversary Strategy

The most important aspect of this sort of simulation is the behavior of the adversary. Our simulated adversary should follow the strategy that best achieves the real-world adversary’s goal. Against Salmon’s distribution strategy, a censor can essentially never block every single citizen from using Salmon; because server groups fill up, a single full group without agents is safe forever. The adversary therefore has the quantitative goal of maximizing the fraction of users whom it can cut off from the Salmon system. Due to the design of the system, one agent represents four opportunities to block a server (assuming the agent only blocks when its group is full). Therefore, a censor with A agents has the potential to block up to 4A servers. A server needs only one agent assigned to it for the censor to be able to block it. Blocking a server costs the censor one blocking opportunity for every agent who had been assigned to the server, and therefore every agent beyond the first in a given group is a waste. Then the optimal course of events for the censor is for all agents to always be alone in their groups. Once an agent is in the system and has requested a server, the only action it can perform is to block its current server. As long as the agent’s server is not blocked, nothing besides trust will change for the agent and its group. The agent will never see any other servers, and no other users will join (once the group has filled) or leave the group. The censor can only control when its agents will be in the market for a new server, and therefore has two levers to build a strategy with: when agents first join the system, and when they block servers. If all of the agents joined at once, all but the first and last few would be placed in all-agent groups. In this situation, a quarter of the censor’s blocking

opportunities have already been expended in the worst way possible: not only has the minimum possible number of servers been discovered, but there has been no collateral damage to legitimate users. On the other hand, the censor must be sure not to insert agents too slowly relative to the rate at which real users join, or multiple groups will fill entirely with real users, becoming invincible (within our distribution system). Of course, if there are fewer agents than groups, this is inevitable, but lagging behind the rate of joining users exacerbates the problem. In the real world, learning this rate should be difficult for the censor, as we do not publish user statistics. It is overwhelmingly unlikely that the censor would be able to land its agents exactly in every tenth spot. For evaluation purposes, the most natural compromise between this perfect censor, and one that has all agents join at once, is the only other simple configuration: a uniformly random permutation of the order in which users (including agents) join, while agents are joining. The other question of censor strategy is the timing of server blocks. It turns out that having all agents block simultaneously does not clump them in the same way that having them all join simultaneously would. At the time of the mass block, there is a certain distribution of users and agents, based on when the agents joined, across levels and groups. A mass block event simply reshuffles the group component of that distribution.

3.4.2 Benefit of Trust Levels

To show the effect of Salmon's trust level logic, we ran two sets of simulations: one with the standard trust level logic (Figure 3.6), and one without (Figure 3.5). In both cases, we assume the censor is not patient enough to wait over four months for its agents to be able to recommend more agents into the system. Users enter the system in the pattern described in Figure 3.2, and the censor begins attacking when the userbase (sans agents) reaches 10,000: day 94. Because users leave level 0 after just two days, its membership is consistently quite small. The censor's agents are more effective the more evenly distributed throughout the system they are. Therefore, having all agents join at once, or initiating blocking while most agents are at the same level, would

accomplish very little. We therefore assume the censor is at least somewhat patient: it is willing to wait long enough to distribute its agents into levels 0, 1, and 2. (Recall that we fight more patient adversaries with the servers' ability to occasionally change IP addresses.)

Figure 3.5: This simulation omits the trust level mechanism, benefitting the censor. The vertical axis is the fraction of users without access to servers after the censor has blocked all the servers it can; the horizontal axis is the systemwide user/server ratio, with separate curves for 1%, 2%, 5%, and 10% agents.

Figure 3.6: Same as Figure 3.5, but with trust level logic enabled. The users join the system in the same pattern as in Figure 3.2.

3.4.3 Recommendation Attack

In the algorithm design section (section 3.2), we discussed the possibility of a patient censor using the recommendation system to grow a huge network of agents. We have one solution in place: forcing this process to cost at least several months of inactivity on the part of the censor, and then having volunteers change their IP addresses after the mass block. However, we should also attempt to minimize the number of servers that would be affected by such a mass block in the first place. We therefore checked whether our strategy of grouping users by recommendation helps in this regard. With our chosen delays for trust level promotion and recommendations, it would take an extremely patient censor to achieve true exponential growth. The initial batch of agents must wait over 4 months to be able to recommend, having started from level 0, and the agents they recommended must wait over 2 months to make recommendations of their own. The censor must therefore wait for well over 6 months before the growth even begins to be exponential. If the censor must wait this long, we have already won. The censor has allowed an effective censorship circumvention system to be freely available

to its citizens for over 6 months. Furthermore, even when the censor does mass-block servers in this manner, this will be a very infrequent event, so it will not be much of a burden for our server volunteers to acquire new IP addresses, as described earlier. Therefore, we consider an only somewhat patient censor, who is only willing to wait long enough to recommend one or two additional agents per original agent (requiring a wait of 4 or 5 months respectively). Even if agents make just one recommendation, their ranks are doubled; a censor may judge the wait to be acceptable in exchange for that benefit. Figure 3.7 shows the benefit of our logic of grouping users from the same recommendation tree together: larger initial batches of agents can cut off from Salmon the majority of users, even up to over 95%, if this logic is not in place. The system's weakness when not enforcing the recommendation grouping is due to lack of assistance from the trust levels: against the somewhat patient adversary, the agents and users will all have reached the highest trust level together by the time the blocking starts.

3.4.4 Patient Censor

Of course, a censor may choose to have its agents lie dormant, appearing to be innocent users, and thereby save the ability to block many servers simultaneously for an emergency. Although we can rely on our IP address change defense for eventual recovery in such a situation, the bulk of the blocked servers would likely remain blocked for a day or two, while volunteers got around to changing their IP addresses. In a fast-moving situation, the censor might be satisfied with just one or two days of disruption. This is a hard problem, and unfortunately one that can potentially affect any circumvention system: if the censor develops some effective countermeasure, it can wait to deploy it until it will have high impact. For instance, VPN Gate is described as having evolved through a cat-and-mouse game with the Chinese censor. According to its statistics page, it is currently successful in serving users in China. Given how easily we were able to automatically enumerate the real VPN Gate servers (§2.1.1), it seems unlikely that the censor was permanently stumped. The censor may have decided to stop putting effort into the cat-and-mouse game, saving its next move for when it badly

needed to shut down VPN Gate for a specific reason. This is an important problem, and a solution would make anti-censorship systems considerably more robust. A good solution would have to mutate a circumvention system without outright replacing it: if the fallback is something that could run alongside the normal system, rather than in place of it, then choosing to hide it from the censor is simply a tradeoff between robustness and capacity. A system such as Salmon could make that tradeoff by keeping a reserve of unused servers. Salmon does not fully solve this problem, but it can at least mitigate it somewhat. Users whose server groups are filled before the censor starts infiltrating agents into the system are guaranteed to stay safe, with their servers remaining unblocked. In this way, at least some core of trusted users would retain their access during the censor's emergency blocking.

Figure 3.7: Agents and users all start at trust level 6. Agents each recommend {1, 2} new agents (left, right). Users {are not, are} grouped to servers by recommendation tree (top, bottom). Agent percentages refer to the final fraction after recommendations.

3.4.5 Comparison with rBridge

We compared Salmon to rBridge, the most robust similar proxy distribution proposal. We adapted the simulations of Salmon depicted by Figure 3.6 to the conditions of rBridge's original evaluations: censor agents join by being recommended by innocent users, rather than via Facebook. In Salmon's case, this means they join at level 5, rather than 0, and have tied themselves to an unfortunate innocent user with the recommendation logic. We consider the same 5% agent case as in the original rBridge evaluations. We changed rBridge's group size from the original choice of 40 to Salmon's choice of 10 (the group size considerations are the same for both Salmon and rBridge). We assume the censor will wait just long enough for its agents to earn enough rBridge credits to request 2 additional servers. We assume that the innocent rBridge users have accumulated infinite rBridge credits: they always have the right to request a new server. rBridge gives all users 3 servers for robustness to churn, which Salmon accomplishes by giving free replacement servers for down (but not blocked) servers. rBridge cannot separate this robustness to churn from its algorithm logic: due to the requirement to keep user-server mappings secret from the directory server, users cannot report an offline-or-blocked server to the directory for it to confirm which is the case. Querying on a specific server

requires anonymity, unlike receiving a random server, which can be accomplished with non-anonymous OT. If the user's only Tor bridge is unavailable, the user cannot have an anonymous conversation with the directory. Therefore, rBridge must treat server churn the same as the censor's blocking, incurring the damage to users of a seemingly "blocked" server when Salmon would incur none. We therefore evaluate rBridge with its suggested parameter of 3 servers per user.

Figure 3.8: Comparison of Salmon and rBridge. The censor attacks as in the rBridge paper: by being recommended by innocent users. 5% of 10,000 users are agents. The vertical axis is the fraction of users without access to servers; the horizontal axis is the systemwide user/server ratio.

Figure 3.8 shows the results of our comparison. Salmon is much more able to weather the censor's attack than rBridge, losing fewer than a third as many users as rBridge: 22% vs 69% at the highest system load. Salmon's trust levels and recommendation grouping keep the agents corralled among a small minority of the users. This limits the damage they can do, especially at user/server ratio 10, where no blocked server can be replaced, and being grouped with an agent guarantees that a user will be shut out of the system. rBridge's agents are spread evenly throughout the system, maximizing the innocent users' exposure to them.

3.4.6 Steady State

The previous simulations all consider a censor who over time amasses a large number of agents in the system, and then tries to shut it down in one fell

swoop. Also of interest is the case of a censor who is content to be a nuisance, using a constant trickle of agents to block a constant trickle of servers (and perhaps causing some collateral damage). Consider the end result of the previous analyses: the censor is out of agents. Therefore, any user who has not been banned, and is in a full group, is invulnerable. In fact, they become invulnerable as soon as they join their agent-free group, even before the last agent has been banned from the system. This indicates that these simulations might give useful information even without looking at the end results with an assumption that the censor will never return. However, the important piece of information is that this fraction of users remaining is the fraction of users who survived this attack wave, and therefore all subsequent waves. We will define an attack wave to be the agents and users who join the system during any contiguous period of time. For those who are only ever grouped with other members of their wave, the environment is identical to the previous "static" simulations, i.e., the probability of being banned is equivalent to the fraction of banned users that the simulations compute. Most groups can be packed full, unless recommendation component sizes are nearly all between six and nine. Therefore, the fraction of users banned in static simulations ought to approximate the rate at which users are banned, given the current probability that new users are agents, and (relative to the rate of new users) the current rate at which servers are joining. However, we leave a fully rigorous analysis for future work. Additionally, we have not formally proven that the censor strategy we evaluate against is optimal; a formal proof of the censor's strategy's optimality would also make for a good component of a future analysis of Salmon's algorithm.

3.5 Conclusion

The key point to keep in mind about Salmon is that it is meant to provide maximum deployability. Later chapters in this dissertation describe decoy routing techniques, which are undoubtedly more powerful than proxy-based approaches such as Salmon. However, Salmon can provide free internet access to a number of censored users directly proportional to the number of free internet users who are willing to help. On the other hand, until very recently

there was no evidence that decoy routing could ever get beyond a lab-scale deployment, due to ISPs' reluctance to get involved with something that could potentially greatly damage their business.

CHAPTER 4

GHOSTPOST: DISTRIBUTED SOCIAL MEDIA UN-DELETION

The previous chapter described Salmon, an effort to create a deployable but algorithmically robust anti-censorship system. More specifically, Salmon is meant to solve the access censorship problem, in which a censoring government blocks sites and services that are not under its direct control. With social media having become one of the most "important" functions of the internet, a new class of censorship problem has arisen: a censor controlling what people can say on a website that it controls. This new danger of publication censorship is what GhostPost, the work described in this chapter, is meant to combat.

4.1 Background

The foremost social media publication censor is China. Conventional wisdom has come to view social media as particularly well suited to facilitating social unrest. Accordingly, the Chinese government is very careful in its control of social media sites, such as Sina Weibo. A study [31] of how long controversial posts survive before deletion estimated a lower bound of 4,200 full-time human employees monitoring Sina Weibo (weibo.com). Whatever the exact number, the Chinese censor clearly devotes a significant amount of resources to policing social media. Although this army of censors keeps a lid on objectionable content in the long run, each undesirable post does manage to be visible for a short time. At a traditional newspaper, a handful of known employees write articles that can be carefully monitored, perhaps even requiring explicit approval from the censor to publish. With social media, a significant fraction of the country's entire population is speaking its mind, spontaneously and constantly. Punishing everyone who says anything in any way "wrong" is infeasible, and

requiring posts to be pre-approved by the censor would be practically equivalent to shutting the platform down. Censored social media platforms will therefore always allow dissent to be (temporarily) expressed. Over 90% of the posts on weibo.com that are fated to be deleted survive less than 24 hours, but the median deletion time is over two hours [31]. These short periods of availability turn out to be sufficient for thwarting censorship. GhostPost is a distributed system that relies on a community of social media users to fight censorship of social media posts. These users record (passively, via a browser extension) all posts they see, and report those they see deleted to the rest of the GhostPost system. Deleted posts are automatically inserted back into the site's HTML during users' ordinary browsing. The particular implementation of GhostPost is built for weibo.com. GhostPost differs from similar projects [32, 34] in its use of distributed monitoring. In addition to the ability to scale to more accounts, by monitoring the accounts that its users follow, GhostPost's attention is focused exactly where it should be. A centralized monitoring approach must accept limitations; e.g., Weiboscope [32] only monitors accounts with at least 1,000 followers. Fundamentally, GhostPost's differences from these projects are the result of differing goals. Its purpose is to restore deleted posts back into the feeds of users who were supposed to see them, which requires distributed monitoring (and selection of accounts to be monitored). The previous systems are intended for archival of censored posts and analysis of the censor's behavior, for which a centralized approach is sufficient.

4.2 Threat Model

We assume the censor is a branch of a large national government. Some level of day-to-day censorship might be carried out by the company that operates the platform, to avoid the government taking more direct control. We assume that for a sufficiently popular GhostPost deployment, the censor will attempt to block its operation, and/or dissuade citizens from using it. In particular, the censor will attempt to identify and punish GhostPost users. We assume the censor can see every post posted on Weibo, and, as will become important in evaluating GhostPost's safety, can track which Weibo users ever observed a given post. The censor can join GhostPost as a user,

and see which posts it resurrects.

4.3 Design

GhostPost's purpose is to resurrect social media posts deleted by a censor. GhostPost therefore needs to be able to carry out two fundamental tasks: 1) observing posts being deleted, and 2) disseminating stored resurrected posts. If consideration is not given to how deleted posts will be collected and disseminated, users could claim to have seen any account post any message. Therefore, 3) users must be able to verify the source of a resurrected post. Finally, these goals must be balanced against safety: a censor might seek to punish GhostPost users, in order to deter use of the system. If all resurrected posts are visible to all users, a censor could obtain information indicating that some Weibo users were more probably GhostPost users than others. Therefore, 4) users must be able to control who receives the posts they resurrect. 1) Observe posts being deleted: Sina Weibo is a huge social media platform, with hundreds of millions of users posting on the order of 70,000 posts per minute [31]. Any of these posts could be deleted at any time. We do not want to miss any deleted posts, but Weibo's scale makes it inevitable. A more reasonable goal is to preserve just the posts that GhostPost users are likely to be interested in, as indicated by the lists of Weibo accounts that our GhostPost users follow. Having GhostPost users themselves watch for deleted posts is then a natural choice. 2) Disseminate resurrected posts: After resurrecting a deleted post, the GhostPost system must be able to distribute it to users. All users learn of resurrected posts from the central server. Clients' interactions with the server take place via RESTful HTTPS. These queries do not require anything approaching high performance: with only metadata and handfuls of censored text transferred, they are not bandwidth intensive, and because the client queries do not block the weibo.com page load, they are not latency sensitive. Therefore, practically any means of communication, including low-performance covert channels, is adequate. 3) Verify resurrected posts: Resurrected posts are claims that someone posted something that is no longer visible. Users can hopefully apply common

sense to fight misinformation, but the best way to trust a resurrected post is to know that it comes from a trusted friend. The central server knows which users reported which resurrected posts, and can provide that information to a retrieving user (provided the sources are willing to let that user know they use GhostPost). Adding a web-of-trust approach, to expand post verification beyond immediate friends, might be a good direction for future work. To guard against a malicious central server, users sign posts they resurrect. The GhostPost extension generates a key pair. Users can retrieve their friends' public keys from the server for convenience, but are encouraged to verify them separately. 4) Limit recipients of resurrected posts: As we discuss in subsection 4.6.4, a censor tracking which accounts viewed which posts, and which posts GhostPost resurrected, can gain information about how likely various Weibo users are to be GhostPost users. A user A who is worried about this possibility can protect themselves by choosing to allow posts they resurrect to be viewed only by users they trust. If A makes this choice, a user B that A does not trust will not receive a post resurrected by A until some other user—one who either trusts B, or has not enacted this restriction—has also resurrected that post.
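A minimal sketch of the signing step, using an off-the-shelf Ed25519 implementation (the PyNaCl library here stands in for whatever primitive the extension actually uses; key distribution and message framing are simplified):

    from nacl.signing import SigningKey, VerifyKey
    from nacl.exceptions import BadSignatureError

    # Generated once by the GhostPost extension and kept locally; the public
    # half is what friends should verify out of band.
    signing_key = SigningKey.generate()
    public_key_bytes = signing_key.verify_key.encode()

    def sign_resurrected_post(author_id, post_text):
        # Sign the claim "author_id posted post_text" before reporting it.
        message = "{}\n{}".format(author_id, post_text).encode("utf-8")
        return signing_key.sign(message)        # signature + message, as bytes

    def verify_resurrected_post(friend_public_key_bytes, signed_message):
        # Check a resurrected post against the public key of the friend who
        # reported it.
        try:
            VerifyKey(friend_public_key_bytes).verify(signed_message)
            return True
        except BadSignatureError:
            return False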

4.4 Centralization

Although GhostPost's most important function (monitoring for deleted posts) is performed by its users, those posts are then gathered at and distributed from a central server. Ideally, a system like GhostPost would also have storage and retrieval be fully distributed, to avoid effectively just shifting the power to censor to a new central entity. However, network-effect considerations tip the scales in favor of a centralized design, for all but the largest deployments. GhostPost's security requirements pose unique obstacles to a peer-to-peer design. Most peer-to-peer systems, even those concerned with security and privacy, expect any user to be able to directly communicate with any other user. GhostPost users are concerned first and foremost with not being identified as GhostPost users. For this need, even something like the Octopus anonymous DHT [42] would be insufficient: it only prevents users' identities

from being associated with the queries they make, not their very participation in the system. A peer-to-peer version of GhostPost must restrict direct communication to be among real-world friends only. Such a social network topology, like the darknet mode of [43], would add overhead, but more importantly would result in a fragmented network in all but the largest of deployments. Good coverage relies on a large set of monitoring users, as we show in our evaluation. This fact introduces a network effect to GhostPost's utility, and artificially limiting how much of the network a user can utilize would not help.

4.5 Implementation

GhostPost is implemented on the client side as a Chrome extension, and on the server side as a straightforward REST service. The implementation must not reveal to an observer (such as weibo.com, or the user's ISP) that the user runs GhostPost.

4.5.1 Usability

Because the Chrome Web Store is blocked in China, we must distribute the extension ourselves. This means that rather than clicking an "install this extension" link, users must instead drag and drop the extension package from their file system onto Chrome's extension management page. Our download page includes detailed instructions about this process. Once the extension has been installed, GhostPost requires the user to register in the system by proving ownership of their Weibo account. They are prompted to do so by a pop-up message box the first time they visit weibo.com with GhostPost installed. The proof-of-ownership process is simple: the user must correctly predict the account's next post. After registration, GhostPost functions without any user interaction. Other than an icon in the extensions section of the Chrome browser, the user's browser is completely unchanged, including while visiting weibo.com—unless GhostPost has a resurrected post that belongs on the page that the user is currently viewing. Resurrected posts are printed in red, and include the time when they were detected deleted, as well as whether they have been verified

by someone the user trusts. Figure 4.1 shows an example of the GhostPost Chrome extension restoring a deleted post inline in a user's Weibo feed.

Figure 4.1: GhostPost restoring posts in-line into weibo.com. Usernames and post contents redacted.

With Weibo's 140 Unicode character limit, a user whose followed accounts posted 1,000 posts per day would require a (loose) upper bound of 273KiB per day, or 97.5MiB per year, to save them all. Ideally we should save posts forever, but users could forget posts older than a few weeks without doing much damage: recall that over 90% of posts are deleted within 24 hours. Saving every image encountered on Weibo may be too much of a storage burden for many users; to be safe, our initial implementation does not store images. However, while ugly, extreme JPEG compression can reduce typical photographs to 10-20KiB, while keeping important features basically discernible. 75% of deleted posts have pictures attached (or are re-blogs of posts with pictures) [31]; the censor clearly cares about the content of images. Images might therefore be a good future addition to the system.
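The daily figure follows from simple arithmetic, assuming a two-byte-per-character encoding (which appears to be the assumption behind the bound above):

\[
1{,}000\ \tfrac{\text{posts}}{\text{day}} \times 140\ \tfrac{\text{chars}}{\text{post}} \times 2\ \tfrac{\text{bytes}}{\text{char}} = 280{,}000\ \tfrac{\text{bytes}}{\text{day}} \approx 273\,\text{KiB/day}, \qquad 273.4\ \tfrac{\text{KiB}}{\text{day}} \times 365\ \text{days} \approx 97.5\,\text{MiB/year}.
\]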

4.5.2 Implementation Safety

The use of the GhostPost extension must be completely invisible to the censor, both in terms of its behavior when interacting with weibo.com, and in

terms of internet activity. Google Chrome extensions are built to be completely isolated from the web pages the browser loads, unless the extension explicitly designs in hooks for the page to attach to. Therefore, GhostPost's presence is invisible to weibo.com. On the other hand, communication with the central server would be easily detectable. However, beyond detectability, we assume the central server will be blocked, and so GhostPost users must employ some form of traditional censorship circumvention to begin with. Because the censor has control over the content of weibo.com, it could complete this registration process on behalf of any user. If we returned an error when someone tried to register an already-registered account, the censor could easily test whether a given Weibo user was a GhostPost user. Therefore, we allow multiple registrations; each new registration simply adds to the list of credentials that should be accepted for that account. The fact that the censor can "take control" of someone's account in this manner is unfortunate, but not particularly damaging: all the censor could do would be to issue resurrected posts under the user's name, but with a different key from the user's real key, which the user's friends would detect.

4.6 Evaluation

GhostPost has two quantifiable aspects, which are somewhat in conflict: coverage and safety. We want to minimize the fraction of deleted posts that we fail to resurrect. At the same time, we must prevent the censor from using post resurrection patterns to find GhostPost users. We now analyze GhostPost's performance in these two areas.

4.6.1 Coverage Metric

To evaluate GhostPost's ability to catch deleted posts, we must decide on the quantity to be measured. We only care about posts authored by users followed by GhostPost users. For coverage analysis, we lose no information by pretending that Weibo consists only of the GhostPost user base, plus the Weibo users they follow. Having identified the users and posts that we are interested in, the natural quantity to study would seem to be the quantity of posts deleted by the

censor before GhostPost could witness them. However, this measure is not always particularly meaningful. Imagine a set of 1,000 deleted posts, among which 990 were posted by a user with a single follower, and 10 were posted by a user with 10,000 followers. If GhostPost managed to resurrect all of the 990 and none of the 10, this measure would tell us that GhostPost had 99% coverage. We must find a more meaningful measure. Rather than considering all posts equally valuable, we will consider a post posted by a user with n followers to be worth n postviews. One postview represents a single user getting the opportunity to read a single deleted post that they are interested in. By this metric, we would say that in the previous example, 990 out of a possible 100,990 postviews were preserved, or about 0.98%.
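In symbols (our notation, not used elsewhere in this chapter): if P is the set of deleted posts of interest, followers(p) is the author's follower count for post p, and R ⊆ P is the subset that GhostPost resurrects, the coverage reported below is

\[
\text{coverage} \;=\; \frac{\sum_{p \in R} \mathrm{followers}(p)}{\sum_{p \in P} \mathrm{followers}(p)} ,
\]

which for the example above gives 990 / 100,990 ≈ 0.98%.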

4.6.2 Coverage Simulation

We evaluated GhostPost's coverage using an event-based simulation framework [44] with a scale-free topology of 1,000,000 users. The events are 1) a user posts a new post, 2) a GhostPost user saves all visible posts from a random followed account, and 3) the censor deletes a post. The simulation ends when all posts have been deleted by the censor.¹ We measure the fraction of postviews restored out of the total postviews in the system. Three factors affect GhostPost's coverage: the time it takes the censor to delete a post, the fraction of Weibo users who use GhostPost, and the interval at which GhostPost users save an account's current posts. Our implementation only has control over the last of those three. We have fixed this frequency at an average of 5 checks per hour, as a compromise between frequent coverage, and avoiding making the user's activity look suspicious. A study [31] of the time it takes for Weibo posts to be deleted found that post deletion times are not very evenly distributed. First, the distribution of deletion times is long tailed; while they observed the majority of posts being deleted within 24 hours, a non-negligible fraction lived for days or weeks. More interestingly, grouping post lifetimes by the time of day they were posted at (in 24 1-hour bins) yields two clusters. Namely, fated-for-deletion posts that are posted during the hours between 3 and 9 A.M. have median lifetimes of 8 or 9 hours, while posts posted from 10 A.M. until midnight have medians around 2 hours. The study's authors hypothesize that the censor's post-examining workforce is greatly reduced late at night, causing a backlog to build up, which is not cleared until late morning. Because identifiable classes of posts are being monitored by censors of different capacity, we evaluated GhostPost against different censors: posts have an expected lifetime of {0.5, 1, 2, 5, 10, 24} hours. The 2 and 10 hour censors roughly correspond to the median performance of the real world daytime and nighttime censors. The 24 hour censor is an upper bound for over 90% of real world posts [31]. Because the censor is a group of employees processing a queue of posts, we model deletion time with the exponential distribution. The exponential distribution lacks real post deletion times' heavy tail, but this makes it a conservative choice: our results indicate that even the tiniest GhostPost deployment is extremely likely to resurrect posts that live multiple days. When simulating GhostPost, we varied the fraction of GhostPost users in the system from 0.05% to 2%. The results of these simulations are plotted in figure 4.2. Encouragingly, GhostPost reaches nearly its full potential very quickly; steep gains are made until about 0.5% of Weibo users are using GhostPost.

¹ Posts that are never deleted do not need to be restored, and so are irrelevant to GhostPost's degree of coverage. For purposes of evaluating coverage, we can focus only on posts that are fated to be deleted.

Figure 4.2: GhostPost's coverage against censors who manage to delete posts with various average speeds ({0.5, 1, 2, 5, 10, 24} hours). The vertical axis is the fraction of resurrected postviews; the horizontal axis is the fraction of Weibo users who use GhostPost (0 to 0.02).
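A stripped-down sketch of the race modeled above, for a single post (illustrative only; the real evaluation uses the event-based framework and full follower topology described in the text):

    import random

    def post_is_saved(num_ghostpost_followers, censor_mean_hours,
                      checks_per_hour=5.0):
        # True if at least one GhostPost follower samples the post before the
        # censor deletes it. The deletion time is exponential with the given
        # mean; each follower's next check is modeled, as a simplification,
        # as exponential with rate checks_per_hour.
        deletion_time = random.expovariate(1.0 / censor_mean_hours)
        for _ in range(num_ghostpost_followers):
            if random.expovariate(checks_per_hour) < deletion_time:
                return True
        return False

    # Example: a post whose author has 3 GhostPost followers, facing the
    # "nighttime" censor (mean deletion time 10 hours).
    trials = 10000
    saves = sum(post_is_saved(3, 10.0) for _ in range(trials))
    print(saves / trials)   # empirical probability the post is resurrected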

Figure 4.3: Each point is a user; the horizontal axis is the user's number of Weibo followers (0 to 10,000) and the vertical axis is the user's fraction of posts resurrected. The censor deletes posts after an average of 10 hours, and 1% of Weibo users are GhostPost users.

Whether a post can be resurrected is ultimately a race between the censor's agents and the poster's GhostPost followers: whichever side examines the post first wins. For a given author, GhostPost can run this race faster by either having more of its users following that author, or by having its users check current Weibo posts more frequently. The censor can reach posts faster by hiring more employees to examine posts. However, even a highly aggressive and capable censor, one that deletes posts on average within 30 minutes after they are posted, fails to hide the majority of postviews from a GhostPost system covering around 1.5% of Weibo users. Sina Weibo's nighttime censor would allow a 1% GhostPost deployment to resurrect nearly 90% of postviews; the daytime censorship would allow over 70%.

4.6.3 Per-user Coverage

The previous analysis shows how likely a deleted post of interest to a GhostPost user is to be saved. In addition to that evaluation of GhostPost's ability to serve post consumers, we should also consider how well GhostPost serves post producers. A user with more followers is more likely to have more GhostPost followers, providing better coverage. We therefore examined the distribution of preserved posts for users with varying follower counts, shown in figure 4.3. With 1% GhostPost users and a 10-hour censor, nearly 90% of all postviews

are resurrected (see figure 4.2). Most of those resurrected postviews come from users with huge numbers of followers. Users with typical numbers of followers (low hundreds) have over half of their censored posts missed by GhostPost. However, this imbalance only applies to authors who do not care about it. Any Weibo user who wants to ensure that their posts will be preserved in GhostPost can simply become a GhostPost user, and preserve their own posts. This option is reflected in figure 4.3: points with resurrected fraction 1 are GhostPost users. Users whose posts are deleted unusually quickly are also less covered. Users who have already had many posts deleted have lower median post lifespans than users who have only had a few deleted. The median post lifetime was about twice as long for users with fewer than 10 deleted posts as for users with 100 or more [31]. In this case as well, users who want their posts to reach the GhostPost audience should become GhostPost users.

4.6.4 Safety

We must hide GhostPost users’ identities from the censor. GhostPost is not directly visible to weibo.com or the user’s ISP, but observing which posts are (not) resurrected might yield information about who uses GhostPost. For this analysis, we assume that all GhostPost users allow all other users to see the posts they resurrect. The following vulnerability is eliminated if GhostPost users restrict their resurrected posts to their friends.

Imagine a Weibo account A with only two followers, F_1 and F_2, both of whom are also GhostPost users. F_1 is a legitimate user, but F_2 is an agent of the censor. If the censor deletes one of A's posts, and then F_2 receives a resurrected copy of that post, it can be sure that F_1 is a GhostPost user. Realistically, the censor will not immediately pinpoint a GhostPost user with a single post in this manner; real, active Weibo users have more than one follower. The censor must monitor two classes of information: whether a deleted post was resurrected, and which users saw a post before its deletion. Let

D be the set of all posts that the censor has deleted, and D_R ⊂ D be the subset of resurrected posts. For each d ∈ D, we will define O(d) to be the

set of users who observed d, i.e., all those users who loaded a Weibo page including d before it was deleted. Although the censor is only interested in O(d) for d ∈ D, it must track O(p) for the entire lifetime of all posts: it does not know whether p ∈ D until it has decided to delete p.

To maintain D_R, for every post d ∈ D the censor deletes, it must determine whether GhostPost can resurrect d. This requires the censor to monitor GhostPost from within: the censor must make a Weibo account, follow all accounts whose posts will be deleted, and register with GhostPost. Although a single GhostPost user claiming to follow every account would look suspicious, the censor could easily spread this following across multiple accounts.

Based on its observations (D and D_R), the censor tracks the likelihood that each Weibo user is a GhostPost user: Pr[GP_u]. The censor does not know which users are GhostPost users: that is what it is trying to learn. Therefore, when a post r is resurrected, the censor must assume that each user u ∈ O(r) is equally likely to be responsible for resurrecting the post.

Then for every new post r to be added to D_R, for each user u ∈ O(r):

    Pr[GP_u | D_R ∪ {r}] = 1 − (1 − Pr[GP_u | D_R]) · (|O(r)| − 1) / |O(r)|

Additionally, for every deleted post d ∈ D \ D_R, i.e. not resurrected, every user u ∈ O(d) has

    Pr[GP_u | D_R ∪ {d}] = 0

As posts are deleted and not resurrected, the censor will establish with certainty that more and more users are not GhostPost users (at the times of the deletions, at least). A non-GhostPost user remains under suspicion as long as every deleted post they observe is also observed by a GhostPost user. In other words, if a user has nothing to offer GhostPost, it is impossible to conclude from resurrected posts that they do not use GhostPost. Because almost all of GhostPost's coverage comes from the first 1 or 2% of its users, more widespread GhostPost use would allow the censor to rule out fewer non-GhostPost users. Furthermore, if some GhostPost users (perhaps those with a history of attention from the Weibo censor) are restricting their resurrections to trusted friends only, then "ruled out" users will include false negatives.
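To make the update rules above concrete, the following is a minimal sketch of the bookkeeping such a censor could perform, under the same assumption that every observer of a resurrected post is equally likely to be the resurrector. The types, names, and prior value are illustrative, not drawn from any real system.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical censor-side bookkeeping for Pr[GP_u | D_R].
struct CensorSuspicion {
    /// Current estimate that each Weibo user (by numeric id) runs GhostPost.
    pr_gp: HashMap<u64, f64>,
    /// Starting estimate for users the censor has no evidence about.
    prior: f64,
}

impl CensorSuspicion {
    /// A deleted post was resurrected: each of its |O(r)| observers is equally
    /// likely to be responsible, so each observer's "not GhostPost" probability
    /// is multiplied by (|O(r)| - 1) / |O(r)|.
    fn observe_resurrection(&mut self, observers: &HashSet<u64>) {
        let n = observers.len() as f64;
        if n < 1.0 {
            return;
        }
        let prior = self.prior;
        for &u in observers {
            let p = self.pr_gp.entry(u).or_insert(prior);
            *p = 1.0 - (1.0 - *p) * (n - 1.0) / n;
        }
    }

    /// A deleted post was not resurrected: everyone who observed it is ruled
    /// out, at least as of the time of that deletion.
    fn observe_no_resurrection(&mut self, observers: &HashSet<u64>) {
        for &u in observers {
            self.pr_gp.insert(u, 0.0);
        }
    }
}
```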

4.6.5 Post Alteration

Despite the previous point, a censor could still rule out many users in a smaller GhostPost deployment. Another countermeasure is possible. The censor must track the users who see each Weibo post (O(p)), and then observe whether GhostPost resurrects p once it has been deleted (D_R). However, if the censor cannot reliably track D_R, then it cannot be sure that O(p) are not

GhostPost users for p ∉ D_R. The challenge, then, is to distribute to our users some version of the post that the censor cannot easily recognize. Weibo users evade keyword bans by using homophones, puns, characters whose components include the intended characters, and similar linguistic tricks. These tricks interfere with automated processes, while preserving the original meaning for an intelligent human reader. In this way, a user concerned about being identified as having observed a resurrected post can more safely share their resurrected posts with users outside of their friends, benefiting the system's coverage. We should also add a random offset to the post's time, and remove the author. We could prompt our users to make these modifications. There is also an automated technique [35] for generating usable Chinese homophone substitutions. Although the homophone substitution technique would interfere with automated processes, we should also consider a censor that is investigating whether a few specific Weibo users are GhostPost users. When targeting user A, who follows some other user B, such a censor could have weibo.com display (only to A) a fake post "written by" B. Then, shortly after the next time A loads B's wall, the censor could query the GhostPost system for any new resurrected posts authored by B. A is a GhostPost user if and only if the fake post is included in the response. If a human is manually carrying out this attack, then the post alteration technique would not help much. As with other attacks, sharing resurrected posts with trusted friends only would keep the user safe.
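The non-linguistic parts of that alteration are simple enough to sketch. The fragment below is illustrative only (the field names are hypothetical, and the homophone substitution itself, whether done by the user or by the technique of [35], is assumed to happen separately):

```rust
/// A resurrected post as it might be stored before being shared widely.
struct ResurrectedPost {
    author: Option<String>,
    unix_time: i64,
    text: String,
}

/// Strip the author and blur the timestamp before sharing outside of trusted
/// friends. `random_offset_secs` should be drawn freshly at random by the
/// caller for each post, so the true posting time cannot be recovered.
fn prepare_for_wide_sharing(mut post: ResurrectedPost, random_offset_secs: i64) -> ResurrectedPost {
    post.unix_time += random_offset_secs;
    post.author = None;
    post
}
```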

4.7 Summary

GhostPost is designed to address an aspect of censorship that receives huge amounts of human attention from censors: the control of speech within individual social media platforms. It operates by relying on users to preserve posts from accounts that are important to them, i.e. those that they follow. GhostPost has been implemented and distributed, targeting the heavily censored weibo.com. The GhostPost design takes into account the potential for people to falsely attribute fake posts to others, by giving users a way to verify that the copy of a deleted post they are seeing was provided by someone they trust not to lie. GhostPost functions even with only one user, and quickly ramps up to defeat the majority of the censor's efforts by the time even a fraction of a percent of users have joined the GhostPost system.

CHAPTER 5

DEPLOYABLE PASSIVE DECOY ROUTING

This project sought to construct and deploy a production-quality implementation of the TapDance [10] decoy routing design. There are three issues facing prospective decoy routing deployments: 1) interference with normal traffic would essentially guarantee that ISPs would not host a decoy router, 2) ISPs are likely to become nervous about hosting a system that seems flaky and unreliable, since those attributes are smoke to security vulnerabilities' fire, and 3) ISPs handle very large amounts of traffic; the decoy router must be able to keep up. TapDance's design itself solves the first problem, by working with cloned packets from a tap of a link, rather than interposing itself on the link. This project aims to solve the second problem by building a system that is demonstrably reliable, and furthermore makes extensive use of a programming language (Rust) with strong safety guarantees. Finally, we approached our implementation with the requirement that a 1U server running TapDance could keep up with a 10 Gbps link. In fact, we have found that a 1U server running our implementation can keep up with 40 Gbps. This was a large project, including people from U. Colorado, U. Michigan, BBN, the regional ISP Merit, and the extremely popular anti-censorship system Psiphon.

5.1 Background: TapDance

A decoy router must impersonate the decoy server without causing an observer to notice anything strange about the totality of packets claiming to be from the decoy server. TapDance's passive tap faces the additional challenge of the decoy server's own response packets reaching the client, as well as the possibility that the decoy server will react strangely to the client's reactions to data it never sent.


Figure 5.1: An overview of the process by which the TapDance client notifies the station of a session, and allows the station to spoof the decoy (by muting the decoy).

TapDance solves these problems by "muting" the server's HTTP logic with an unfinished HTTP header field. The decoy server will not respond until it has received a complete HTTP request, or the header has grown too long (which in practice limits TapDance to 16KB of upload per overt flow). The client can then write messages to the station into the HTTP header. However, data in the header is hidden inside a TLS tunnel that the TapDance station does not have a key for. Therefore, the client first leaks the key to the station. It embeds in the TLS ciphertext bits the (AES-encrypted) key material for the TLS session, along with an elliptic key exchange completion that will allow the station to decrypt that key material. The elliptic curve point is encoded with Elligator [45], a scheme for encoding elliptic curve points as bit strings indistinguishable from random. The previous two paragraphs are visually summarized in Figure 5.1. To detect these leaked TLS session keys (called "tags"), the TapDance station must inspect the first TLS "application data"-type record of every HTTPS flow for a valid elliptic key exchange completion encoded by Elligator. When the station starts receiving an HTTP response from the covert destination, it encrypts the data in the same TLS tunnel shared by the client and

decoy server, and returns it to the client, spoofed as if from the decoy server. Fortunately, decades ago, the architects of TCP made TapDance possible by deciding that a TCP receiver should simply ignore ACKs for data it has not yet sent. The decoy server therefore remains silent for the remainder of the TapDance session.

5.2 Implementation

To keep up with a 10 Gbps (and then 40 Gbps) link, we started out using the Intel Data Plane Development Kit. DPDK allows a programmer to (relatively) easily build what is effectively a custom driver for the network card: packets processed by DPDK never reach the kernel. However, we are doing our development and testing within a real ISP's network. This network allows jumbo frames, and furthermore uses VLANs, with the VLAN tags pushing many otherwise standard-sized frames over 1500 bytes. DPDK does not support receiving these larger frames. Another framework, PF_RING, does, so we switched to it. PF_RING is a somewhat higher-level alternative with a much simpler API. Its performance is about as good as that of DPDK [46] (we used PF_RING's zero-copy mode, for maximum CPU efficiency). Ultimately, the attempt to use DPDK was an unfortunate mistake. The other goal besides performance, reliability and security, is addressed by our choice to use the Rust programming language wherever we can manage. The Rust language has an extremely restrictive specification, designed around the concept of ownership of memory. The restrictions allow its compiler to guarantee that buffer overflows, dangling pointers, and data races will never happen. Furthermore, Rust's status as a mature but freshly designed language means that clean implementations of modern programming concepts (e.g. optional values and closures) are available. The program begins in C, to set up PF_RING, but quickly switches to Rust. Figure 5.2 illustrates the TapDance station's structure, by following the journeys of a few different important types of packets. After a chunk of initialization in C, an infinite loop in C receives packets from the PF_RING framework, and delivers them immediately into Rust. This infinite loop is also responsible for calling iterations of the event loop (written in Rust) that we use for forwarding user traffic, and for calling periodic logging and cleanup


Figure 5.2: The architecture of the 10 Gbps PF_RING TapDance station, showing which parts are written in Rust.

tasks (also written in Rust). All flow state tracking logic is done in Rust. These are the sorts of tasks where Rust's memory safety is most valuable: mundane data structure allocations, insertions, searches, and deletions; things that are individually simple, but when combined grow complex enough that bugs are essentially guaranteed. By keeping this complex logic in Rust, we largely remove the possibility of a bug becoming a memory-safety-related security vulnerability. A few parts of the program do dip back into C. These are our use of Elligator [45], an elliptic point encoding library; OpenSSL; and forge_socket, a Linux kernel module. Crucially, these calls are program-flow "dead ends": the Rust logic is simply outsourcing a certain computation to them; control returns directly to the Rust caller, rather than the C components continuing the program flow into other parts of the station logic. Elligator is purely math with large integers. The memory safety risk of stateless computational code is low, and Rust does not yet have a native finely-tuned big-integer library, making the C implementation an easy choice. Similarly, there is not a mature, well-tested TLS library written natively in Rust, although here memory safety is a more concerning issue. All we can do is limit the C-allocated state (to a single SSL object per session).
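As an illustration of that C-to-Rust handoff (a sketch only; the symbol names here are hypothetical, not the station's actual ones), the Rust side can expose a single extern "C" entry point that the C receive loop calls once per frame pulled from PF_RING:

```rust
use std::slice;

/// Called by the C receive loop for every frame PF_RING delivers. After this
/// boundary, classification and flow-state tracking stay in safe Rust.
#[no_mangle]
pub extern "C" fn td_handle_packet(data: *const u8, len: usize) {
    if data.is_null() {
        return;
    }
    // Safety: the C caller promises `data` points to `len` readable bytes
    // that remain valid for the duration of this call.
    let frame = unsafe { slice::from_raw_parts(data, len) };
    process_frame(frame);
}

/// Placeholder for the real (safe-Rust) classification and session logic.
fn process_frame(frame: &[u8]) {
    let _ = frame;
}
```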

Additionally, as TLS is a complex protocol, straying from a popular implementation could identify TapDance traffic to the censor via idiosyncrasies, making OpenSSL attractive regardless of reliability. We use OpenSSL to recreate the ongoing TLS session from the client's leaked secrets, and then to perform the encryptions and decryptions that constitute the rest of that TLS conversation with the client (spoofing the decoy). forge_socket's role is hard to depict; its position in Figure 5.2 is misleading. The forge_socket logic never touches any packets or payloads. Rather, forge_socket allows userspace code to overwrite IP addresses, ports, initial sequence numbers, etc. on what otherwise functions as an ordinary TCP socket — meaning that forge_socket's "unsafe" C code is mostly just overwriting numerical variables in Linux kernel structs. All TCP segmentation, retransmission, and congestion control logic is performed by the host kernel's TCP implementation. We have measured the TapDance station's performance in inspecting traffic. Checking for a TapDance tag, with Elligator's expensive elliptic point arithmetic, costs on average 274µs, with a standard deviation of 1.5µs (measured over 100,000 tag checks), on a Xeon E5-2643v4 CPU. (The stations' hardware was not uniform, but all were of roughly this ballpark of power.) Each flow is processed entirely independently of all others, making the station embarrassingly parallelizable, assuming we are sure that successive overt flows in a single TapDance session all reach the same core. In this implementation, we hashed flows onto cores based only on the IP source and destination, rather than the full TCP 4-tuple. This choice allows a TapDance session to span multiple TLS streams—necessary because TapDance is severely limited both in how much data the client can send in one TLS stream, and how long the stream can be held open, before the decoy's muting ends, and the decoy breaks the session—with each successive stream coming back to the one core that knows about the ongoing session. The need to keep a session on the same core across streams also requires the same decoy server to be used for the entire session. The client was implemented in Go, and was designed as a library presenting a reliable transport interface. That is, it shares an API with Go's TCP and TLS implementations, and is interchangeable with them. Psiphon built a version of their client that uses this library, and distributed this version to a small slice of their users.
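In the deployment this per-core placement is done by the NIC's receive-side scaling (RSS) hash; the sketch below (hypothetical names, not the station's code) only illustrates the property that hash must have: by hashing the address pair and ignoring ports, every successive stream between a given client and decoy lands on the same core.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::net::Ipv4Addr;

/// Pick a core from the (source, destination) IP pair only, ignoring ports,
/// so that each successive overt TLS stream of one TapDance session, which
/// reuses the same client and decoy addresses, returns to the same core.
fn core_for_flow(src: Ipv4Addr, dst: Ipv4Addr, num_cores: usize) -> usize {
    let mut h = DefaultHasher::new();
    (src, dst).hash(&mut h);
    (h.finish() as usize) % num_cores
}
```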

Figure 5.3: Concurrent active users at each of four stations. The curves are stacked, not overlaid: the peak was around 4,000 total users.

The integration with Psiphon somewhat altered TapDance's behavior: the Psiphon client creates a single tunnel that all traffic flows through, meaning Psiphon-over-TapDance uses a single long-running TapDance session (with many reconnects). A stand-alone TapDance client, on the other hand, initiates a new TapDance session for every new TCP connection the browser initiates. The use of a single, long-lasting, high-usage tunnel makes the work to be described in chapter 6 crucial for future, larger-scale Psiphon-over-TapDance deployments.

5.3 Deployment

We deployed the system in 2017. All of our users were existing Psiphon users given a TapDance version of Psiphon through an upgrade. We deployed four TapDance stations: one tapping U. Colorado's 10 Gbps link to the internet, and three in Merit: one on a 10 Gbps link in Detroit, and one each on two different 40 Gbps links in Chicago. Additionally, we had a second station on the same Detroit link reserved for development and testing efforts. Across our four production machines, we had station processes running on a total of 34 cores. The least powerful station (on one of the 10 Gbps links) had an Intel Xeon E5-2643v4 CPU and 32 GiB of memory. The most powerful station (on one of the 40 Gbps links) had an 8-core Intel Xeon E5-2667v3 at 3.2 GHz and 64 GiB of memory. The 40 Gbps links were tapped with quad-port Intel X710 10GbE SFP+ network adapters. At the peak of the deployment, we had around 4,000 concurrent users,

Figure 5.4: TapDance users' aggregate throughput.

getting an aggregate of up to 500 Mbps download throughput. We did not observe any attempts by the censor to block TapDance, which is not surprising given how new and short the deployment was. We did, however, see an unexplained surge of 1,000 sessions quickly ramp up, remain active for about 15 minutes, and then ramp back down, without transferring any significant data. We suspect this might have been some initial investigation by the censor into an unfamiliar protocol newly included in Psiphon; in any case, it was not followed by any blocking.

5.3.1 Decoy Selection

The moderate size of this deployment, with stations tapping just four links, required effort to provide clients with decoys that would take them on paths through stations. Furthermore, regardless of topological considerations, Tap- Dance is slightly picky about its decoys. TapDance’s muting of the decoy is somewhat fragile: a client uploading too much data or holding the HTTPS session open too long will cause the decoy to break the session. Addition- ally, only some symmetric ciphers are compatible with its chosen-ciphertext steganography (our implementation supports only the AES 128 GCM family of ciphersuites). To give clients usable decoys, we ran an automated process to enumerate potential decoys (i.e. HTTPS servers), and test their compatibility as a TapDance decoy. The process scanned all IP addresses within the ASes of

64 Figure 5.5: Decoy load: 50th, 90th, and 99th percentile. The red line is the system’s current load limit: when any decoy reaches that number of active users, the system instructs all stations to stop accepting new users on that decoy.

U. Colorado, Merit, and Merit's customers, as users in foreign countries are highly likely to pass through a station on their way to any of these addresses. It measured advertised TCP receive window and wait time before the HTTP server would reject an incomplete HTTP request (the two factors that limit TapDance's muting). Finally, it used our TapDance client to attempt an actual TapDance session with that decoy, repeated three times. From all decoys that consistently supported TapDance sessions, we removed those whose initial TCP receive windows were less than 15KB, and those whose HTTP timeouts were less than 30 seconds. We then randomly selected half (keeping the other half in reserve in case some contingency introduced a need to quickly distribute many more decoys) for the final list distributed to clients. Figure 5.6 shows how many decoys were active over time. By the time the trial was fully ramped up (around May 18th), essentially all usable decoys distributed to clients were in use. Keep in mind that the set of decoys distributed to users, and so represented in Figure 5.6, contained only half of all usable decoys discovered by the automated process. The TapDance-enabled Psiphon app was bundled with our initial version of this list. Once the trial began, we ran this process daily to catch churn in active potential decoy servers in the relevant address space. We distributed the updated lists to clients by sending them within a TapDance session, whenever a client with a stale list version established a session.
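The filtering step of that harvesting process can be summarized in a short sketch (the struct and field names are illustrative; the thresholds are the ones given above). The deployment then randomly kept half of the resulting list for distribution and held the rest in reserve.

```rust
use std::net::Ipv4Addr;

/// One candidate HTTPS server, as measured by the automated harvesting process.
struct DecoyCandidate {
    addr: Ipv4Addr,
    initial_recv_window: u32, // bytes advertised early in the TCP flow
    http_timeout_secs: u32,   // how long an incomplete request is tolerated
    successful_tests: u8,     // TapDance test sessions that worked, out of 3
}

/// Keep only candidates that consistently supported TapDance sessions and
/// leave enough room for muting: at least a 15 KB initial receive window and
/// at least a 30-second HTTP timeout.
fn usable_decoys(candidates: &[DecoyCandidate]) -> Vec<Ipv4Addr> {
    candidates
        .iter()
        .filter(|c| c.successful_tests == 3)
        .filter(|c| c.initial_recv_window >= 15 * 1024)
        .filter(|c| c.http_timeout_secs >= 30)
        .map(|c| c.addr)
        .collect()
}
```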

Figure 5.6: Number of decoys with at least one TapDance client currently connected.

5.3.2 Politeness to Decoys

One extremely important aspect of this or any other decoy routing deployment is politeness to decoys. Previous decoy routing work—strictly design proposals and small-scale prototypes—has taken the concept of a decoy server for granted. However, when a decoy routing deployment is not so extensive that practically every HTTPS server on the internet is a viable decoy, a deployment seeking to serve many users must take care to avoid effectively (if not intentionally) DDoSing the decoys. Although the muting process prevents TapDance from wasting much of the decoys' bandwidth, HTTP servers do limit concurrent active connections. With enough TapDance users on one decoy, legitimate visitors could be shut out. It turns out that the popular Apache server defaults to a quite small limit of 150 concurrent clients. Although a site seeing significant traffic would be configured to something other than the defaults, many of our decoys are individual people's small-time servers, and so we must respect this conservative limit. To enable the system to avoid overloading decoys, we built a central monitoring component. Because different users can pass through different stations to reach the same decoy, a global view of decoy usage is necessary. All stations keep this central tracker updated with their current decoy usage counts. The tracker sums these counts together, and if any rises to the "overload" threshold (we used 30, and later 45, in our deployment) it will instruct all stations to turn away new clients. Once the overload condition passes, the tracker informs the stations that the decoy is usable again. Not DDoSing random bystanders is the baseline for responsibility. It is

Figure 5.7: Background traffic processed and CPU load, at each of four stations. Essentially all of TapDance's CPU time is spent inspecting the entirety of the traffic seen on the link for TapDance sessions. Of that time, the majority is spent attempting to decode Elligator code points.

also desirable to allow administrators of potential decoys to opt their servers out of being decoys. To that end, our special incomplete muting-causing headers have their User-Agent field set to TapDance, along with a link: https://tapdance.team. That site contains a short description of TapDance and its relation to decoys, and instructions on how to use the server's robots.txt search engine preferences file to instruct our automated decoy harvesting process to exclude the server. Throughout the deployment, over 400 decoy servers were in essentially constant use by thousands of TapDance clients (see Figure 5.6). None of them used the robots.txt opt-out or contacted us, so TapDance appears not to be destructive to decoys in practice.
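A minimal sketch of the central tracker's aggregation logic described above (the names are illustrative; the real component also handles reporting and pushing its decisions back to the stations):

```rust
use std::collections::HashMap;
use std::net::Ipv4Addr;

/// Global view of decoy load, fed by periodic per-station reports.
struct DecoyLoadTracker {
    /// station id -> (decoy address -> active TapDance users at that station)
    reports: HashMap<u32, HashMap<Ipv4Addr, u32>>,
    /// Global cap per decoy (30, and later 45, in our deployment).
    overload_threshold: u32,
}

impl DecoyLoadTracker {
    fn update_report(&mut self, station: u32, counts: HashMap<Ipv4Addr, u32>) {
        self.reports.insert(station, counts);
    }

    /// Decoys whose summed load has reached the cap; stations are told to
    /// turn away new clients on these until the condition passes.
    fn overloaded_decoys(&self) -> Vec<Ipv4Addr> {
        let mut totals: HashMap<Ipv4Addr, u32> = HashMap::new();
        for station_counts in self.reports.values() {
            for (&decoy, &users) in station_counts {
                *totals.entry(decoy).or_insert(0) += users;
            }
        }
        totals
            .into_iter()
            .filter(|&(_, total)| total >= self.overload_threshold)
            .map(|(decoy, _)| decoy)
            .collect()
    }
}
```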

5.4 Challenges

We faced several challenges during implementation and deployment. First, the internet connection on one of our 40 Gbps stations—the one that happened to be the most popular with users—turned out to be 100 Mbps, rather than 1 Gbps. This severely limited the throughput available to users of our system. Because we did not have immediate physical access to these servers to change the connection, we instead had Merit reroute much of this station's overall traffic to the other 40 Gbps station. After the shift was made, and the other 40 Gbps station was handling the majority of TapDance user traffic, we began observing it having trouble keeping up with packet inspection: it was consistently having to ignore a small portion (< 1%) of packets to keep up. However, CPU usage was still at what appeared to be perfectly healthy levels. We concluded that the issue was likely that the time shares given to user traffic forwarding and packet inspection were too uneven, with so much time spent in traffic forwarding that the queues of packets to be inspected were overflowing. A further investigation of this time-share glitch will begin once the trial is complete. In addition to the packet drops, this large traffic shift affected the pattern of load on decoy servers. In particular, 99th-percentile load immediately hit our 30-concurrent-client cap, and consistently stayed there. Decoys in the overload state lead to additional load on the decoys: when a client attempts to connect to an overloaded decoy, it is turned away by the station, but only after initiating an HTTPS session with the decoy. If the next decoy it tries is not overloaded, it has hit two decoys in order to start one TapDance session. If the next decoy is overloaded, it will have to move on to a third. This behavior causes decoy load to rapidly increase in the case where most decoys have reached the overload state. To avoid runaway behavior, the station is able to instruct the client to observe a total timeout from using TapDance, which can be used to implement exponential backoff. To avoid reaching the runaway decoy load scenario, we sought to fix the overload condition immediately. Our options were to release additional decoys, or to accept a higher cap. As we felt our cap was very conservative, we fixed the problem by raising it to 45. We also found that some of the decoys that our automated decoy collection process was collecting were, themselves, blocked in some censored

countries! In particular, Google's ubiquitous video CDN servers are often blocked. We manually removed these entries from the list. In future deployments, we would ideally have a host within countries of interest that would check whether decoys are available. Finally, we observed some occasional strange behavior that we chalked up to multiple stations responding to the same user. Multiple responses both break the session and are an obvious camouflage failure. Because stations respond immediately, and do not have time to consult each other, there is no way in the moment for a station to know whether any other station is also responding. Therefore, we must avoid the problem entirely. When two stations are seeing the same request, they are on the same path to a decoy D. Call the station earlier on the path A. If a TapDance client connected to the router on the receiving end of the tapped link (call it C) were to use D, then the later station B would definitely see the request. If C cannot get a response, then there is no other station between A and D. If each station A declines to respond to TapDance sessions using any decoy D where C got a response from some B when using D, then for all paths from any real client to any decoy, only the last station on the path will respond. (We don't lose any station responses, though: any path that has one or more stations will still definitely have the last station respond.) With an offline background process constantly testing each newly discovered decoy in this way, we can avoid multiple responses.
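The resulting per-station rule is easy to state in code. The sketch below (hypothetical names) just captures the decision each station makes, with the covered-decoy set populated by the offline testing process described above.

```rust
use std::collections::HashSet;
use std::net::Ipv4Addr;

/// Per-station record of which decoys are "covered" by a station further
/// along the path, as discovered by the offline background tester.
struct ResponsePolicy {
    covered_by_downstream: HashSet<Ipv4Addr>,
}

impl ResponsePolicy {
    /// The background tester, running a client just behind this station's
    /// tapped link, got a TapDance response from some other station when
    /// using `decoy`; record that a later station covers it.
    fn record_downstream_response(&mut self, decoy: Ipv4Addr) {
        self.covered_by_downstream.insert(decoy);
    }

    /// Respond to a tagged session only if no later station on the path to
    /// this decoy will respond; then every real client gets exactly one
    /// spoofed reply, from the last station on its path.
    fn should_respond(&self, decoy: Ipv4Addr) -> bool {
        !self.covered_by_downstream.contains(&decoy)
    }
}
```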

CHAPTER 6

FASTER UPLOAD IN PASSIVE DECOY ROUTING

TapDance’s key feature—the muting of decoy servers to allow the station to be a passive tap—places a tight constraint on TapDance users’ upload throughput. The station spoofs the decoy when sending data to the client, which means the client ACKs data the decoy never sent. Data packets that the client sends therefore appear to the decoy to have erroneous ACK fields, and so the decoy ignores them. In particular, it does not advance its receive window as it would upon correct contiguous receipt. While the decoy will ignore a too-high ACK field, it will respond with a RST to data packets with a sequence field beyond its current receive window. Therefore, the client can only upload as much data in a TapDance TLS/TCP stream as the decoy’s receive window at the time that muting took effect. When the client hits the upload limit, it must continue the TapDance session in a new TLS stream, incurring three round trips’ worth of downtime during the TCP and TLS handshakes. Given how small TCP receive windows typically are early in a TCP flow (in our deployment, we assumed no more than 15KB was safe), this leads to extremely low upload throughput. Even with larger receive windows, TCP congestion control logic constantly being reset to the beginning of slow start would harm throughput. In a TapDance session with heavy upload, download throughput is also hurt: the congestion control logic resets and TCP+TLS handshake downtime hurts the download direction just as much. Furthermore, a continuous series of two or three HTTPS connections per second, especially all to the same server1, is not good for camouflage. Practically speaking, these issues limit the original TapDance design to simple consumption of web pages: any sort

1The client must stay on the same server for an entire session because it has no way (short of brute force search) to pick a new decoy that is guaranteed to lead it through the same station. We have discussed a potential new architecture, where stations merely identify TapDance traffic and then forward it to a centralized single server. Such an architecture would add overhead and complexity, but might be worth it.

of file upload would be very painful, and video chat would be impossible. All of these problems are magnified if a single TapDance session is used to multiplex all of a device's internet traffic—which is the case with the Psiphon app. Fortunately, all of these problems can be solved.

6.1 Upload-only Streams

A TapDance session with near-zero upload can achieve nearly the full download throughput of the client's internet connection. Therefore, if a TapDance session's upload is interfering with its download, splitting the two into parallel TapDance sessions will unburden the download direction. The upload direction's original problems remain, but the fact that it no longer handles any download opens up a new opportunity. The upload problem is a result of TapDance's muting-related constraints, and TapDance's need to mute is a result of needing to spoof the decoy to send data to the client. If there is no data to send, spoofing and therefore muting are unnecessary. Without the constraints of muting, the client is free to reach its full upload throughput, assuming the decoy will accept large amounts of data without breaking the connection. Fortunately, it appears that most servers will allow an HTTP POST to a non-existent URL to go through before returning an error. TapDance can employ upload-only streams by leaking the TLS key material with chosen-ciphertext steganography, as before, but not muting the decoy server: rather than an incomplete HTTP request, a well-formed POST request is sent. The station will not attempt to spoof the decoy, but rather simply passively eavesdrop on the TLS session. All ACKing of data is handled by the decoy, with the station guaranteed to have received anything the decoy has ACKed2. The client can send significant chunks of data—hundreds of kilobytes—in a single HTTPS flow, and from the outside, that HTTPS flow will look like a legitimate image upload (although if the client does not upload non-stop, the traffic pattern will look suspect). In testing the current TapDance implementation's upload speed, we found that a client with a 2 Mbps upload bandwidth takes 30.964s, with a standard deviation of 0.795s, to send 1MB through TapDance. Additionally, it must

2So long as the station hardware is fully capable of line-rate processing of the tap traffic.

reconnect around 90 times. With a single HTTPS POST flow, the same data can be uploaded in 4.515s, with a standard deviation of 0.138s.
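As a rough back-of-the-envelope check on these numbers (our own arithmetic, not an additional measurement): 1 MB spread over roughly 90 streams is on the order of 1024 KB / 90 ≈ 11 KB per stream, consistent with the conservative 15 KB receive-window assumption used in the deployment; and 1 MB at 2 Mbps is 8 Mb / 2 Mbps = 4 s of raw transfer time, close to the 4.515 s single-POST figure. Essentially all of the remaining ~26 s in the original scheme is therefore reconnection overhead: roughly 90 TCP and TLS handshakes plus repeated congestion-control restarts.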

6.2 Implementation

The idea of splitting a TapDance session in two, with one stream being a passive upload, runs into a few additional challenges when actually implementing it. First, the station handles upload-only TapDance streams very differently from vanilla bidirectional TapDance streams; any attempt to respond to an upload-only stream would break the session. The station could examine the HTTP method (GET vs POST), but we chose to use one bit of an existing 'flags' field in our implementation's steganographic payload. Second, maintaining a coherent session that straddles two concurrent flows is not trivial. To keep the communication orderly and reliable, a client choosing to use the split stream approach first establishes a traditional TapDance session, with a single bidirectional stream. Once the session is established, it sends a control message informing the station that it is yielding the ability to upload data in the bidirectional stream, and that the station should expect a passive upload-only stream to arrive. The upload-only stream confirms that it is acquiring uploader status from the specific yield message sent in the bidirectional stream. After this point, both the uploader and downloader can (independently) reconnect. The uploader may also yield, turning the downloader back into a vanilla bidirectional TapDance stream. A stream marked "download-only" is only prohibited from sending application data; it may send control messages to manage its own state (such as informing the station of a need to reconnect due to an impending HTTP timeout). Finally, the station is no longer maintaining a TCP endpoint for spoofing the decoy. If the decoy properly ACKs all data that the client sends, then the station is implicitly guaranteed to have seen all the data, but it is not likely to have arrived in order, due to TCP retransmissions. The passive eavesdropping logic must therefore include a TCP receive buffer. Fortunately, TCP flow control guarantees that this buffer will not need to grow any larger than the receive window advertised by the decoy.
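To illustrate the session protocol just described, a hypothetical and simplified set of stream roles and control messages (not the actual wire format) might look like the following:

```rust
/// What a given TLS stream within one TapDance session is allowed to do.
enum StreamRole {
    /// The original single-stream mode: uploads and downloads both allowed.
    Bidirectional,
    /// May receive covert data and send control messages, but no application data.
    DownloadOnly,
    /// A passive, unmuted POST stream; the station never responds on it.
    UploadOnly,
}

/// Session-level control messages exchanged between client and station.
enum ControlMessage {
    /// On the bidirectional stream: "I yield uploading; expect an upload-only
    /// stream that presents this token."
    YieldUpload { token: u64 },
    /// On the new upload-only stream: claim uploader status by echoing the
    /// token from the corresponding YieldUpload.
    AcquireUpload { token: u64 },
    /// The uploader gives the role back, returning the other stream to a
    /// vanilla bidirectional TapDance stream.
    ReturnUpload,
    /// A download-only stream announcing that it must reconnect soon
    /// (for example, because the decoy's HTTP timeout is approaching).
    WillReconnect,
}
```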

This feature was not fully implemented and tested in time for the 2017 trial, but should be part of any future deployment. It will certainly be useful in the context of Psiphon: a Psiphon+TapDance client used for standard web browsing reconnects two or three times a second during active usage, due entirely to the upload cap issue.

CHAPTER 7

DITTOTAP: DECOY ROUTING WITH BETTER CAMOUFLAGE

TapDance [10] allows all of the host ISP's traffic to proceed untouched. Slitheen [23] perfectly mimics the overt site. An ideal decoy routing technique would combine the two. Such a design is possible. We have developed a new decoy routing technique along these lines, called DittoTap1. In TapDance, the decoy router sends covert data to the client in packets spoofed as coming from the overt server, but using its own TCP logic, and placing no restrictions on flow length. DittoTap makes its own, duplicate request for the same overt resource, and sends the client shells of the packets it receives from the duplicate connection, with payloads replaced with covert data. Sending exactly the packets that the real overt server sent guarantees an identical traffic pattern, and identical TCP and TLS options and behavior. The traffic pattern must match not just for individual objects, but for the entire sequence of HTTP requests that retrieve the contents of a given web page. Slitheen runs a web browser parser to generate the appropriate follow-up requests from a root HTML document. DittoTap functions similarly, but with some additional problems to solve (described in section 7.5) due to its use of a passive tap.

7.1 Implementation

Like the deployable TapDance station, the DittoTap station is largely written in Rust. In fact, the DittoTap station effectively is the TapDance station, just with some extra logic inserted at the beginning of packet inspection, and with the TLS-to-client object from the TapDance station (BufferableSSL) replaced with DittoTap logic: BufferableDittotap. These plug-in

1Slitheen was named for a shapeshifting alien from Doctor Who. DittoTap is named both for the shapeshifting Pokémon Ditto, and for the fact that it duplicates the client's decoy flows.

additions are illustrated in Figure 7.1. However, while BufferableDittotap has the same simple interface as BufferableSSL, it contains quite a bit of logic: it manually initiates (packet by packet) the duplicate HTTPS session, it passively eavesdrops on the client's connection to the decoy (including TCP receive buffer logic to handle out of order packets), and handles the combining of overt headers with covert data. These tasks entail maintaining, for each DittoTap session, multiple mappings from one TCP sequence space to another.


Figure 7.1: The architecture of the DittoTap station implementation, which leaves essentially all of the TapDance station intact, and adds extra logic in initial packet processing, and in communication with the client.

Because the DittoTap station is built on top of the TapDance station, the stability, robustness, and performance characteristics remain the same. In fact, DittoTap eliminates one of the uglier parts of the TapDance station: the forge_socket kernel module. TapDance needs to spoof the decoy's TCP connection back to the client, but DittoTap has access to every actual TCP/IP header that the decoy sends (including retransmissions). Therefore, in DittoTap, the forge_socket-reliant TCP spoofing is replaced by the simple sending of raw IP packets, containing the duplicate overt connection's TCP headers.

7.2 TCP Traffic Patterns

Figure 7.2: Queuing behavior when overt headers are queued as long as needed, for a download of the Twitter index page.
Figure 7.3: Queuing behavior when overt headers are always sent immediately, for a download of the Twitter index page.

Slitheen and DittoTap both have the goal of perfectly mimicking the decoy, down to the characteristics of the specific web object downloaded. A given HTTP server's TCP implementation, sending an object of a particular size under stable network conditions, will send packets at certain times, modified chiefly by congestion events. If the station inserts large delays among the packets it sends to the client, the censor has a chance to realize that the flow does not look like other download flows for that decoy. However, overt headers may arrive on the duplicate connection when there is not covert data ready to be sent. When this happens, we must either break our decoy traffic pattern imitation by queuing the overt header until covert data is available, or else send the overt header off with junk padding instead of real covert data. If many overt headers arrive when there is no covert data available, much of the client's download throughput will be wasted. Figure 7.3 and Figure 7.5 show the covert and overt queues over time of ∼310KB downloads from Twitter and Imgur, respectively. In these experiments, this proper traffic-pattern-mimicking logic is followed; the majority of the covert queue is contributed by junk padding. (The overt queue is always 0, that being the logic's goal.) This padding is highly wasteful; for the Twitter downloads, around 90% of the data sent to the client was wasted. Slitheen, or any other technique seeking to send one flow's data in another's traffic pattern with both flows happening in real time, suffers the exact same

amount of inefficiency. Figure 7.2 and Figure 7.4 show the queue behavior of Twitter and Imgur downloads when padding is only added to fill out an overt header already partially filled with real covert data. This is the minimum padding possible; without it, the application logic communicating over DittoTap could stall. The overt queue stays non-zero for long stretches, meaning that a censor aware of the correct traffic pattern will notice delayed packets.

Figure 7.4: Queuing behavior when overt headers are queued as long as needed, for a download of an image (of the same size as the Twitter index page) from Imgur.
Figure 7.5: Queuing behavior when overt headers are always sent immediately, for a download of an image (of the same size as the Twitter index page) from Imgur.

There is a tradeoff available, between efficiency and the goal of perfect mimicking. If, rather than sending out overt headers immediately, even if there is no covert data to include in them, we instead allow them to wait for at least a little while, we give them an additional chance to receive covert data: see Figure 7.7 and Figure 7.8. The further apart the overt and covert traffic patterns are, the worse efficiency will be; equal-sized downloads from Twitter perform worse than from Imgur. Furthermore, the overt header queue delay tradeoff appears to be much more useful for Twitter; Imgur already performs well. Allowing a small amount of queuing helps efficiency, but does not greatly harm mimicry. Figure 7.6 shows the queuing behavior when downloading from Twitter with each overt header allowed to queue for up to 4ms. Although the overt queue becomes non-zero several times, it empties quickly—no more than 4ms. These small delays should blend in with natural variance in the traffic pattern.
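A sketch of that per-packet decision (illustrative only; the real implementation's data structures differ) makes the tradeoff explicit: with max_wait set to zero the shell is always sent immediately with padding, with an unbounded max_wait it queues as long as needed, and a few milliseconds sits in between.

```rust
use std::collections::VecDeque;
use std::time::{Duration, Instant};

/// An overt packet "shell" produced by the duplicate connection, waiting to
/// be filled with covert data before being forwarded to the client.
struct OvertShell {
    capacity: usize,  // covert payload bytes this packet can carry
    arrived: Instant, // when the duplicate connection produced it
}

/// Decide whether to send this shell now. Returns the payload to send, or
/// None if the shell should keep waiting for more covert data.
fn fill_or_wait(
    shell: &OvertShell,
    covert_queue: &mut VecDeque<u8>,
    max_wait: Duration,
    now: Instant,
) -> Option<Vec<u8>> {
    let full = covert_queue.len() >= shell.capacity;
    let timed_out = now.duration_since(shell.arrived) >= max_wait;
    if !full && !timed_out {
        return None; // keep queuing: a censor could notice this as a delay
    }
    // Send: take whatever covert data is available and pad the rest with junk.
    let take = shell.capacity.min(covert_queue.len());
    let mut payload: Vec<u8> = covert_queue.drain(..take).collect();
    payload.resize(shell.capacity, 0);
    Some(payload)
}
```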

77 Figure 7.6: Queuing behavior when overt headers may be queued up to 4ms, Twitter.

Another possible solution (for pure web downloads) is to inform the station of the requested download well ahead (100s of ms) of the actual decoy routing connection that is to deliver the covert data to the user. This could be as simple as using two DittoTap sessions for one covert page retrieval: the first to convey the request (and perhaps also receive the page's root HTML document, in order to know what other objects must also be requested from the covert server), and the second to receive the response. This approach would establish a new tradeoff: a user who wants perfect traffic pattern mimicking can choose to either waste huge amounts of data on padding (undesirable on low bandwidth connections, or when being charged by the byte such as on a mobile device), or to experience delays in page loads. However, to make the most of the efficiency gains, DittoTap would have to maintain knowledge of the sizes of decoy pages: if the DittoTap session carrying the initial request does not use a very small decoy page, then that session itself contributes a large amount of waste. Finally, there is room for the prototype implementation to optimize this issue of mismatched timing between the covert and overt downloads. In the

current TapDance/DittoTap implementation, the client sends no application data out-of-band. This means that the station cannot learn which covert destination to initiate a connection to until the TapDance/DittoTap session is fully underway. Therefore, the duplicate overt flow—which must be started ASAP to mimic the original overt flow—starts significantly before the covert flow, leading to many overt headers that have no chance to find covert data waiting for them. If the client could signal the station to begin the covert flow earlier, perhaps by including the destination and an initial chunk of data to send in the steganographic payload, the wait would be reduced.

Figure 7.7: Efficiency (in terms of bytes wasted by padding) for different max overt queue waits, for downloads of the Twitter home page.
Figure 7.8: Efficiency (in terms of bytes wasted by padding) for different max overt queue waits, for downloads of an image from Imgur of about the same size as the Twitter home page.

7.3 Page Load Patterns

The current DittoTap client is based on its TapDance counterpart even more cleanly than the station: specifically, there are no changes whatsoever! To make the overt traffic pattern fully realistic, the client would need to adopt Slitheen's realistic random overt browsing. Slitheen has clients randomly choose decoy pages to retrieve. This behavior is purely within the client, and unrelated to the specifics of any particular decoy routing design. However, a major component of the realistic behavior is that the client receives an actual HTML document from the decoy, and then requests

embedded resources accordingly, with decoy routing only being performed on "leaf" objects such as images. Slitheen's approach, where the station decides on the fly whether each HTTP request is a leaf to be decoy routed, possibly switching back and forth, is not possible in DittoTap, whose required muting cannot be turned back off once it has begun. Fortunately, the DittoTap system does have some control: the client can allow a Keep-Alive HTTPS session (whose key material it has leaked to the station) to proceed unmuted as long as it wants before initiating muting with an incomplete header. Therefore, a client could actually download all non-leaf objects in a decoy page from the decoy, initiate muting, and then begin DittoTap with the leaf objects. In particular, a DittoTap client could load a simple HTML page embedding only images in this way, and be guaranteed to be behaving identically to any real browser. For our initial proof-of-concept prototype, we consider implementing the full client behavior—which is conceptually relatively simple, and whose Slitheen counterpart has already been demonstrated to accomplish its camouflage goal—out of scope.

7.4 Avoiding Startup Delay

The DittoTap approach faces one challenge that Slitheen, with its inline, packet-blocking design, avoids. The duplicate overt connection should be identical to the original in every way. Packet timing is a crucial component, and if the system consistently inserts a pause anywhere, a careful censor could notice it. If the decoy router does not send its SYN right when it sees the client's SYN, it may end up inserting such a pause. There is a simple if imperfect solution: using the same TLS ClientHello nonce tagging that Telex [21] used. As the ClientHello is the first packet sent in the TLS handshake, a decoy router that begins its duplicate overt connection in response to such a tag has only slept through the TCP handshake. In fact, if the station's RTT to the overt site is sufficiently lower than the client's, the duplicate connection can catch up, and eliminate the gap. However, to guarantee there will be no gap without relying on station placement, the station needs to know whether a given flow will be duplicated when seeing the TCP SYN. Informing the decoy router ahead of time to

watch for a specific 4-tuple and TCP ISN would solve the problem, but we want simple, lightweight, user-friendly clients; that level of control over TCP requires root privileges. Instead, the client could notify the decoy router to watch for a future HTTPS connection, within a certain timeframe, from itself to a specific IP address. Imagine that some host sharing the client's public IP address (perhaps the client itself) has started an ordinary TCP connection that fits a profile the station is expecting. The station will react by initiating a needless duplicate connection with the overt server. This unnecessary TCP initiation breaks nothing, however. Conversely, if the promised SYN does not arrive within the expected window, then the station will miss the chance to perfectly duplicate the connection, but can simply fall back to finding a Telex-style tag in the ClientHello. The size of the window within which the station duplicates a specified flow can therefore be used to control a tradeoff between false negatives (missing the SYN and relying on the ClientHello) and false positives (issuing spurious SYNs to the overt server). We leave implementation of these techniques for future work, such as the development and large-scale deployment of a production-quality DittoTap system. The current prototype uses the same chosen-ciphertext steganography as TapDance, rather than Telex's ClientHello technique, or the advance-notice technique described in this section. The advance-notice technique would add significant complexity to the system, and should not be implemented until it is established that duplicate overt flows will in practice frequently be unacceptably delayed.
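A minimal sketch of the advance-notice idea just described (it is not part of the current prototype, and the names are hypothetical): the client promises an upcoming HTTPS connection to a given overt server, and the station duplicates the flow only if the SYN arrives within the promised window, otherwise falling back to looking for a Telex-style ClientHello tag.

```rust
use std::collections::HashMap;
use std::net::Ipv4Addr;
use std::time::{Duration, Instant};

/// Promised-flow bookkeeping for the advance-notice technique.
struct ExpectedFlows {
    /// (client IP, overt server IP) -> deadline for the promised SYN.
    window: HashMap<(Ipv4Addr, Ipv4Addr), Instant>,
}

impl ExpectedFlows {
    /// The client announced an upcoming connection within `timeframe`.
    fn promise(&mut self, client: Ipv4Addr, overt: Ipv4Addr, timeframe: Duration) {
        self.window.insert((client, overt), Instant::now() + timeframe);
    }

    /// Called when a TCP SYN from `client` to `overt` is seen on the tap.
    /// Returns true if the station should start its duplicate connection now.
    fn should_duplicate(&mut self, client: Ipv4Addr, overt: Ipv4Addr) -> bool {
        match self.window.remove(&(client, overt)) {
            Some(deadline) => Instant::now() <= deadline, // promised and on time
            None => false, // not promised: wait for a ClientHello tag instead
        }
    }
}
```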

7.5 Full Page Mimicking

Slitheen mimics not just the TCP/TLS behavior and packet timing of downloads from the decoy server, but also the pattern of requests of HTML objects from a real page, including the client's uploads that convey those requests. Mimicking both the packet size and timing of individual HTML object retrievals, along with the order in which those objects are retrieved, is the true strength of Slitheen. DittoTap should therefore achieve the same. There are two options for driving the recursive retrieval of full HTML pages. A browser simulator that parses the containing HTML and makes

follow-up requests must be employed; that simulator can be run either on the station (in which case the client simply specifies the root document), or in the client (in which case the client makes a series of requests for specific objects). Either works for the purpose of displaying the correct download pattern, but the former would make it difficult to emulate the site's upload pattern: because the client would be unaware of the sequence of requests, it would not know the shape that its outgoing traffic ought to take. Therefore, we prefer the design in which the client runs a browser's parsing and generates the individual HTTP requests itself. In this approach, the client sees the HTTP requests that its browser wants to make, which enables the client to send the correct pattern of packet sizes and timings. Just as in Slitheen, the decoy's pages will include leaf and non-leaf resources. To complete a full recursive page load, the client must see the actual content of the decoy's non-leaf resources (including the root document), in order to continue down the tree. Unlike in Slitheen, it is not possible to allow the decoy to directly respond to the non-leaf requests in the original decoy connection. Therefore, the station's covert channel back to the client must carry both the decoy's actual non-leaf data, as well as data from the covert destination. Due to the overt non-leaf data and covert data sharing a channel, some multiplexing mechanism is necessary. Even without considering implementation details, the multiplexing will naturally introduce non-zero overhead. This overhead is a problem in this context: the station must pass overt packets along to the client as soon as they arrive and without inflating their sizes, and some of these packets (the non-leaf overt) must convey their original data. In particular, the first data the client receives is the contents of the decoy page's root HTML document, which the client needs in order to carry out the session. If the non-leaf contents are sent completely unaltered, their packets will have no space for our multiplexing overhead. There are a few options for solving this problem:

• Keep the overhead entirely within the leaves. Accomplish the multiplexing with a state-transition approach, where the station sends tokens indicating "now sending covert data" and "now sending overt non-leaf data." All sessions begin in the "overt non-leaf" state, and transition tokens are considered covert data.

82 • Drop the HTTP response header. It occupies overt space, but the client does not need to see it.

• Compress the non-leaf page data, with the savings hopefully covering the multiplexing overhead.

• Lossy compression: make substitutions in non-leaf HTML guaranteed not to affect subsequent retrievals, such as removing comments and replacing verbose markup with shorter equivalent forms.

All four techniques are compatible with each other; the best approach may be to use them all. The first guarantees that non-leaf objects' data will be transmitted within their own packet husks, while the other three optimize download throughput by freeing up portions of non-leaf space for covert data. However, the state-transition approach introduces the possibility for the client and station to get out of sync on what state they believe the communication is in. Sending non-leaf data in standalone packages in between covert data, relying on the three space saving techniques, is a simpler approach. So long as the saved space is enough to compensate for the packaging's overhead, this simpler approach is acceptable. In our implementation's current packaging of the non-leaf data (using Google's Protocol Buffers inside of our own framing protocol), each non-leaf object incurs 7-9 bytes of overhead, depending on the size of the non-leaf object. Therefore, the required beginning of the HTTP response—"HTTP/1.1 "—is by itself enough to accommodate our overhead. The two compression techniques are then useful as throughput optimizations, but not necessary.

7.6 Summary

DittoTap combines the deployability of TapDance with the extreme camouflage of Slitheen. Although it has not seen the same sort of real-world trial in an ISP that proved TapDance's deployability, the fact that its prototype was built as an add-on to the TapDance station, without disrupting the packet processing architecture, implies that from an ISP's point of view, DittoTap is no different from TapDance. However, DittoTap (as well as Slitheen) puts noticeably more load on decoys than TapDance does, due to the decoy receiving full HTTP requests that it must serve objects in response to. Given

that decoy load was an area of concern in the TapDance deployment, this issue would need even more careful attention in a DittoTap deployment. The most important thing to understand about DittoTap is that it, like Slitheen, is first and foremost a solution to the not-yet-seen problem of censorship via website traffic fingerprinting. In contrast to TapDance, clients will not attain their full download bandwidth, as much throughput must be wasted on padding to stay in sync with the overt traffic pattern. We have investigated optimizations, but the waste is not entirely avoidable. Furthermore, the fact that it is not possible to maintain an idle but ready DittoTap session (without constantly wasting data on padding) makes integrating with whole-device tunnels, like the Psiphon app, impractical. Therefore, real-world deployments should use TapDance for the time being. Only when it is established that a TapDance deployment is being blocked by traffic fingerprinting should DittoTap be deployed.

CHAPTER 8

CONCLUSION

This dissertation describes a set of solutions to the problem of internet censorship. The various techniques can be described in two dimensions: deployability vs. effectiveness, and addressing the problem of access censorship (the blocking of websites) vs. publication censorship (limits on people's speech on internet platforms). Most of the work (and most previous work, academic and otherwise) addresses access censorship. Salmon, a proxy server distribution system, focuses on deployability; it is a technique that individual free citizens could immediately deploy, with incremental effect per additional volunteer. The decoy routing technique, in which censorship circumvention is built into the middle of ISPs' networks, is much harder for the censor to block or degrade than proxy networks (even with Salmon's intelligent distribution)—although proxy server networks are easier to establish. DittoTap and our high-performance TapDance implementation, especially with the upload throughput improvement, are (or would be) extremely powerful, assuming there are ISPs willing to deploy the systems at their routers. While this reliance on ISPs makes TapDance and DittoTap less deployable than Salmon, they are still much more deployable than the original, packet-blocking decoy routing schemes. Furthermore, the successful completion of a real-world trial running at routers in a real ISP demonstrates that TapDance truly is deployable. Against publication censorship, GhostPost is a simple way for users to undo some of the censor's efforts on social media. Similar existing systems are centralized, preserving the posts of a set of users chosen ahead of time by the administrators, whereas GhostPost is focused on the accounts its users follow. Furthermore, the other systems are more focused on measuring censorship, whereas GhostPost is designed for transparent everyday usage, automatically inserting deleted posts back exactly where they belong on the social media platform.

8.1 Future Work

Both censors and their opponents continue to develop new techniques. In particular, traffic fingerprinting, the issue addressed by Slitheen and DittoTap, has not yet seen widespread deployment1, but would be extremely powerful. Similarly, this work has directions it can evolve in. Salmon, the decoy routing projects (DittoTap and deployable TapDance), and GhostPost all have potential for improvements and additions, both in theoretical design and implementation. When decoy routing is only deployed on a few servers, it will likely only have around a thousand viable decoys, at which point the censor could treat the decoys themselves as proxy servers and block them (since many decoys will be odd corners of the internet representing next to no collateral damage when blocked). Using Salmon to distribute the decoys would greatly mitigate that danger in such a scenario. The TapDance deployment will hopefully eventually be repeated at a larger scale and permanently, and with the significant upgrade of upload throughput (and camouflage) described in chapter 6. DittoTap should be carefully implemented and deployed in the same scale of trial as the recent TapDance trial, to ascertain its actual real-world performance. It should provide noticeably worse throughput than TapDance, and so has no need to be deployed until censors are actually employing traffic fingerprint censorship, but it should be tested and ready to go. GhostPost could be deployed alongside the automated homophone substitution of [35], which is a completely orthogonal approach. In that case, users' own posts would be somewhat better shielded by the homophone substitution, while posts of the users they follow would be protected by GhostPost. GhostPost always preserves a user's own posts, so automated homophone substitution bundled with GhostPost would only be useful to the followers of GhostPost users. While the addition of automated homophone substitution would be a nice optimization, the most important thing for GhostPost is to gain a large userbase. While even the very first GhostPost user will gain some benefit, in that

1Other than an attack [3] against Tor, whose traffic pattern made for a much easier fingerprinting target than the more general problem of identifying particular web sites from their traffic patterns.

While even the very first GhostPost user will gain some benefit, in that their own instance of GhostPost will sometimes preserve posts they otherwise would never have seen, GhostPost's effectiveness at preserving posts increases dramatically until about 0.1% of Weibo users are GhostPost users. Ideally, some group with a large audience of Weibo users would promote GhostPost, getting many users to join simultaneously and thus making GhostPost much more effective, and therefore more attractive to other prospective users. There is already some interest among the Chinese internet freedom community [?].

Finally, to improve deployability, the implementation of GhostPost should either come bundled with an access-censorship circumvention tool, or should communicate with the central server over an uncensorable channel such as encrypted email; a toy sketch of the latter appears at the end of this section. Otherwise, once a GhostPost deployment grows large enough for the censor to take notice and block the central server, the userbase would be disrupted.

The most important of all possible future developments in censorship circumvention is the serious support of the government of a free country. Decoy routing is the most promising technique in terms of unblockability and performance, but it would require significant resources to deploy and maintain stations at the hundreds of routers that a true global-scale deployment would require. In the meantime, we can continue developing and testing the tools that such an effort would use.
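As a minimal sketch of the encrypted-email idea, the following Python snippet shows how a GhostPost client might submit a preserved post to the central server through an ordinary foreign mail provider instead of a direct, blockable connection. The SMTP host, drop-box mailbox address, and message format are assumptions made purely for illustration; a real deployment would also encrypt the message body to the server's key rather than sending plain JSON.

import json
import smtplib
import ssl
from email.message import EmailMessage

SMTP_HOST = "smtp.example.com"                 # assumed: any foreign mail provider
REPORT_ADDRESS = "reports@ghostpost.example"   # hypothetical drop-box mailbox

def submit_deleted_post(sender: str, password: str, post: dict) -> None:
    """Wrap a preserved post in an ordinary email so the report still gets
    through even if the GhostPost server's own address is blocked."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = REPORT_ADDRESS
    msg["Subject"] = "ghostpost-report"
    # In a real deployment the body would be encrypted to the server's key;
    # here it is plain JSON for illustration only.
    msg.set_content(json.dumps(post))

    context = ssl.create_default_context()
    with smtplib.SMTP_SSL(SMTP_HOST, 465, context=context) as server:
        server.login(sender, password)
        server.send_message(msg)

# Example with hypothetical values:
# submit_deleted_post("user@example.com", "app-password",
#                     {"weibo_id": "123", "text": "...", "deleted_at": "2017-06-01"})

Because the censor would have to block the mail provider itself (or email in general) to stop such reports, this channel degrades far more gracefully than a single central server address.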

8.2 Future Challenges

It is worth considering the political and practical realities forming the context of internet censorship. Consider the effect of truly perfect (unblockable, high-performance, widely available, very easy to use) anti-censorship, where the government has no way to stop anyone from accessing any internet site or service. The censoring government still has the option of entirely cutting off internet access to the outside world.

It may seem that in most countries this move is not actually feasible: modern life is deeply tied up in the internet, and such a move would be a disaster for the populace and, likely not long after, the government. However, the idea that a censoring government could not practically use the "shut down everything" option fundamentally assumes that much of the internet's utility for many of its users lies in its global reach. If a

"censored" internet user only regularly uses sites and services based within their country, they would not be particularly disrupted by a complete shutdown of internet access. If such internet users represented the majority of the country's users, a complete shutdown would be much more politically feasible.

China, the world's most significant internet censor, would seem to be in such a situation. For every internet site or service popular outside of China (e.g. Google, Amazon, Facebook, Twitter), there is a China-based equivalent (Baidu, Alibaba, Renren, Sina Weibo) that is much more popular within China. Not only are the non-Chinese services relatively less popular than the Chinese alternatives, they are also unpopular in absolute terms: for example, only around 0.04% of Chinese are active Facebook users [47].

Therefore, it is not inconceivable that the Chinese government might react to perfect anti-censorship by largely cutting ties to the outside internet. Average individuals would still have access to the sites and services that make up the bulk of their internet use. Collateral damage could be further reduced by allowing access to the most popular non-sensitive foreign sites on a whitelist basis, or by restricting foreign access to internet cafes with tightly controlled software. In either approach, the censor would have to require that browsers speaking HTTPS with foreign sites trust the censor's root certificate (and not use certificate pinning), so that it could man-in-the-middle any session it chose (in order to defeat decoy routing and domain fronting). Businesses could be issued similarly monitored but unfiltered connections as necessary, to avoid economic disruption.

Against such censorship, the anti-censorship task would shift from finding ways to use an existing but censored internet connection to provide uncensored access, to outright providing the internet connection itself. The connection would have to be provided from outside the censored country's physical territory to a user on the inside, which could probably only be done with satellites. Assuming it would be difficult for the censored population to make (illegal) payments to a foreign provider, access would have to be free, so such a project would have to be a serious undertaking by a major free country's government. Alternatively, the satellite internet service could make money by injecting ads. There is precedent for such a configuration: advertising targeting Iranians supports numerous Persian-language satellite channels, which serve the (illegal) home satellite dishes that are ubiquitous

in Iran.

There are serious challenges ahead for the cause of internet freedom. Censorship may get worse before it gets better; existing censors could deploy new tools such as traffic fingerprinting, and censorship could spread if people allow their governments to be seduced by the idea of controlling ideas. However, the censors are fundamentally on the wrong side of reality: it is much easier to spread information than to contain it.

REFERENCES

[1] Neo, Sparks, Tank, Dozer, and Smith, “The collateral damage of internet censorship by DNS injection,” in SIGCOMM Computer Communication Review (CCR July 2012). ACM, 2012, pp. 21–27.

[2] D. Anderson, “Splinternet behind the Great Firewall of China,” Queue, vol. 10, no. 11, p. 40, 2012.

[3] A. Panchenko, L. Niessen, A. Zinnen, and T. Engel, “Website fingerprinting in onion routing based anonymization networks,” in 10th annual WPES. ACM, 2011, pp. 103–114.

[4] P. Winter and S. Lindskog, “How the Great Firewall of China is blocking Tor,” Free and Open Communications on the Internet, 2012.

[5] M. Crotti, M. Dusi, F. Gringoli, and L. Salgarelli, “Traffic classification through simple statistical fingerprinting,” ACM SIGCOMM Computer Communication Review, vol. 37, no. 1, pp. 5–16, 2007.

[6] F. Gringoli, L. Salgarelli, M. Dusi, N. Cascarano, F. Risso et al., “Gt: picking up the truth from the ground for internet traffic,” ACM SIGCOMM Computer Communication Review, vol. 39, no. 5, pp. 12–18, 2009.

[7] “Tor: Pluggable transports,” https://www.torproject.org/docs/pluggable-transports.html.en, accessed: 2017-06-12.

[8] Freedom House, “Freedom on the net 2016,” https://freedomhouse.org/report/freedom-net/freedom-net-2016, 2016.

[9] F. Douglas, W. Pan, M. Caesar et al., “Salmon: Robust proxy distribution for censorship circumvention,” Proceedings on Privacy Enhancing Technologies, vol. 2016, no. 4, pp. 4–20, 2016.

[10] E. Wustrow, C. M. Swanson, and J. A. Halderman, “TapDance: End-to-middle anticensorship without flow blocking,” in Proceedings of the 23rd USENIX Security Symposium (USENIX Security 14). San Diego, CA: USENIX Association, 2014.

[11] F. Douglas and M. Caesar, “GhostPost: Seamless restoration of censored social media posts,” in 6th USENIX Workshop on Free and Open Communications on the Internet (FOCI 16), 2016.

[12] D. Nobori and Y. Shinjo, “VPN Gate: A volunteer-organized public VPN relay system with blocking resistance for bypassing government censorship firewalls,” in Proceedings of the 11th USENIX Symposium on Networked Systems Design and Implementation (NSDI 14). Seattle, WA: USENIX, 2014, pp. 229–241. [Online]. Available: https://www.usenix.org/conference/nsdi14/technical-sessions/presentation/nobori

[13] R. Dingledine, N. Mathewson, and P. Syverson, “Tor: The second- generation onion router,” DTIC, Tech. Rep., 2004.

[14] D. McCoy, J. A. Morales, and K. Levchenko, “Proximax: Fighting censorship with an adaptive system for distribution of open proxies,” in Proceedings of the International Conference on Financial Cryptography and Data Security, St Lucia, February 2011.

[15] Q. Wang, Z. Lin, N. Borisov, and N. Hopper, “rBridge: User reputation based tor bridge distribution with privacy preservation.” in NDSS, 2013.

[16] N. Feamster, M. Balazinska, W. Wang, H. Balakrishnan, and D. Karger, “Thwarting web censorship with untrusted messenger discovery,” in Privacy Enhancing Technologies 2003, Dresden, Germany, March 2003.

[17] R. Dingledine and N. Mathewson, “Design of a blocking-resistant anonymity system.”

[18] D. Fifield, N. Hardison, J. Ellithorpe, E. Stark, D. Boneh, R. Dingledine, and P. Porras, “Evading censorship with browser-based proxies,” in Privacy Enhancing Technologies. Springer, 2012, pp. 239–258.

[19] J. Karlin, D. Ellard, A. W. Jackson, C. E. Jones, G. Lauer, D. P. Mankins, and W. T. Strayer, “Decoy routing: Toward unblockable internet communication,” in USENIX Workshop on Free and Open Communications on the Internet, 2011.

[20] A. Houmansadr, G. T. K. Nguyen, M. Caesar, and N. Borisov, “Cirripede: Circumvention infrastructure using router redirection with plausible deniability,” in Proceedings of CCS, 2011.

[21] E. Wustrow, S. Wolchok, I. Goldberg, and J. A. Halderman, “Telex: Anticensorship in the network infrastructure,” in Proceedings of the 20th USENIX Security Symposium, August 2011.

[22] D. Ellard, C. Jones, V. Manfredi, W. T. Strayer, B. Thapa, M. Van Welie, and A. Jackson, “Rebound: Decoy routing on asymmetric routes via error messages,” in IEEE 40th Conference on Local Computer Networks (LCN), 2015, pp. 91–99.

[23] C. Bocovich and I. Goldberg, “Slitheen: Perfectly imitated decoy routing through traffic replacement,” in Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security. ACM, 2016, pp. 1702–1714.

[24] A. Houmansadr, E. L. Wong, and V. Shmatikov, “No direction home: The true cost of routing around decoys,” in Proceedings of the 2014 Network and Distributed System Security (NDSS) Symposium, 2014.

[25] M. Schuchard, J. Geddes, C. Thompson, and N. Hopper, “Routing around decoys,” in Proceedings of the 2012 ACM Conference on Computer and Communications Security. ACM, 2012, pp. 85–96.

[26] D. Fifield, C. Lan, R. Hynes, P. Wegmann, and V. Paxson, “Blocking-resistant communication through domain fronting,” Proceedings on Privacy Enhancing Technologies, vol. 2015, no. 2, pp. 1–19, 2015.

[27] A. Houmansadr, T. J. Riedl, N. Borisov, and A. C. Singer, “IP over Voice-over-IP for censorship circumvention,” CoRR, vol. abs/1207.2683, 2012. [Online]. Available: http://arxiv.org/abs/1207.2683

[28] S. Li, M. Schliep, and N. Hopper, “Facet: Streaming over videoconferencing for censorship circumvention,” in Proceedings of the 13th Workshop on Privacy in the Electronic Society. ACM, 2014, pp. 163–172.

[29] W. Zhou, A. Houmansadr, M. Caesar, and N. Borisov, “SWEET: Serving the web by exploiting email tunnels,” Privacy Enhancing Technologies Symposium, 2013.

[30] J. Geddes, M. Schuchard, and N. Hopper, “Cover your acks: Pitfalls of covert channel censorship circumvention,” in Proceedings of the 2013 ACM SIGSAC conference on Computer & communications security. ACM, 2013, pp. 361–372.

[31] T. Zhu, D. Phipps, A. Pridgen, J. R. Crandall, and D. S. Wallach, “The velocity of censorship: High-fidelity detection of microblog post deletions,” arXiv preprint arXiv:1303.0597, 2013.

[32] K.-w. Fu, C.-h. Chan, and M. Chau, “Assessing censorship on microblogs in China: Discriminatory keyword analysis and the real-name registration policy,” Internet Computing, IEEE, vol. 17, no. 3, pp. 42–50, 2013.

[33] K.-w. Fu, M. Chau, and P. Holme, “Reality check for the Chinese microblog space: A random sampling approach,” PLoS ONE, vol. 8, no. 3, p. e58356, 2013.

[34] GreatFire.org, https://en.greatfire.org/.

[35] C. Hiruncharoenvate, Z. Lin, and E. Gilbert, “Algorithmically bypassing censorship on Sina Weibo with nondeterministic homophone substitutions,” in 9th Intl. AAAI Conf. on Web and Social Media, 2015.

[36] Z. Weinberg, J. Wang, V. Yegneswaran, L. Briesemeister, S. Cheung, F. Wang, and D. Boneh, “StegoTorus: A camouflage proxy for the Tor anonymity system,” in Proceedings of the 2012 ACM Conference on Computer and Communications Security. ACM, 2012, pp. 109–120.

[37] H. Mohajeri Moghaddam, B. Li, M. Derakhshani, and I. Goldberg, “Skypemorph: Protocol obfuscation for tor bridges,” in Proceedings of the 2012 ACM conference on Computer and communications security. ACM, 2012, pp. 97–108.

[38] Q. Wang, X. Gong, G. T. Nguyen, A. Houmansadr, and N. Borisov, “CensorSpoofer: Asymmetric communication using IP spoofing for censorship-resistant web browsing,” in Proceedings of the 2012 ACM Conference on Computer and Communications Security. ACM, 2012, pp. 121–132.

[39] “Austrian Tor exit node operator found guilty as an accomplice,” www.techdirt.com/articles/20140701/18013327753/tor-nodes-declared-illegal-austria.shtml, July 2014.

[40] “Tips for Running an Exit Node with Minimal Harassment,” https://blog.torproject.org/blog/tips-running-exit-node-minimal-harassment.

[41] R. Dingledine, “Research problems: Ten ways to discover Tor bridges,” https://blog.torproject.org/blog/research-problems-ten-ways-discover-tor-bridges, October 31, 2011, accessed: 2017-06-12.

[42] Q. Wang and N. Borisov, “Octopus: A secure and anonymous dht lookup,” in Distributed Computing Systems (ICDCS), 2012 IEEE 32nd International Conference on. IEEE, 2012, pp. 325–334.

[43] I. Clarke, O. Sandberg, B. Wiley, and T. W. Hong, “Freenet: A distributed anonymous information storage and retrieval system,” in Designing Privacy Enhancing Technologies. Springer, 2001, pp. 46–66.

[44] P. L’Ecuyer, L. Meliani, and J. Vaucher, “SSJ: A framework for stochastic simulation in Java,” in Proceedings of the 2002 Winter Simulation Conference. IEEE Press, 2002, pp. 234–242. [Online]. Available: http://simul.iro.umontreal.ca/ssj/indexe.html

[45] D. J. Bernstein, M. Hamburg, A. Krasnova, and T. Lange, “Elligator: Elliptic-curve points indistinguishable from uniform random strings,” in Proceedings of the 2013 ACM SIGSAC conference on Computer & communications security. ACM, 2013, pp. 967–980.

[46] S. Gallenmüller, “Comparison of Memory Mapping Techniques for High-Speed Packet Processing,” M.S. thesis, Technische Universität München, 2014.

[47] GreatFire.org, “Facebook advertising platform’s report of active users in China,” https://twitter.com/GreatFireChina/status/250913333602029568, accessed: 2017-06-12.
