Practical Countermeasures Against Network Censorship
Total Page:16
File Type:pdf, Size:1020Kb
Practical Countermeasures against Network Censorship by Sergey Frolov B.S.I.T., Lobachevsky State University, 2015 M.S.C.S., University of Colorado, 2017 A thesis submitted to the Faculty of the Graduate School of the University of Colorado in partial fulfillment of the requirements for the degree of Doctor of Philosophy Department of Computer Science 2020 Committee Members: Eric Wustrow, Chair Prof. Sangtae Ha Prof. Nolen Scaife Prof. John Black Prof. Eric Keller Dr. David Fifield ii Frolov, Sergey (Ph.D., Computer Science) Practical Countermeasures against Network Censorship Thesis directed by Prof. Eric Wustrow Governments around the world threaten free communication on the Internet by building increasingly complex systems to carry out Network Censorship. Network Censorship undermines citizens’ ability to access websites and services of their preference, damages freedom of the press and self-expression, and threatens public safety, motivating the development of censorship circumvention tools. Inevitably, censors respond by detecting and blocking those tools, using a wide range of techniques including Enumeration Attacks, Deep Packet Inspection, Traffic Fingerprinting, and Active Probing. In this dissertation, I study some of the most common attacks, actually adopted by censors in practice, and propose novel attacks to assist in the development of defenses against them. I describe practical countermeasures against those attacks, which often rely on empiric measurements of real-world data to maximize their efficiency. This dissertation also reports how this work has been successfully deployed to several popular censorship circumvention tools to help censored Internet users break free of the repressive information control. iii Acknowledgements I am thankful to many engineers and researchers from various organizations I had a pleasure to work with, including Google, Tor Project, Psiphon, Lantern, and several universities. This work would be impossible without the help of Nikita Borisov, Ox Cart, Katharine Daly, Fred Douglas, Roya Ensafi, David Fifield, Adam Fisk, Vinicius Fortuna, J Alex Halderman, Rod Hynes, Michalis Kallitsis, Allison McDonald, Will Scott, Steve Schultze, Ben Schwartz, Sze Chuen Tan, Benjamin VanderSloot, Jack Wampler, and last but certainly not least, my advisor Eric Wustrow. I want to thank the members of my committee for their guidance. I am also deeply grateful to Matthew Monaco, Andy Sayler, and Dirk Grunwald, who helped me get on my feet during the first months of my PhD program. Finally, I’d like to thank my parents and sisters for their support. Contents Chapter 1 Introduction 1 1.1 Direct Blocking . .1 1.2 Censorship Attacks . .2 1.2.1 Enumeration . .4 1.2.2 TLS Fingerprinting . .5 1.2.3 Active Probing . .6 2 Enumeration Attacks 8 2.1 Background . .8 2.2 An ISP-scale deployment of TapDance . 17 2.2.1 Introduction . 17 2.2.2 Deployment Overview . 18 2.2.3 Scaling TapDance . 20 2.2.4 Trial Results . 26 2.2.5 Acknowledgements . 30 2.2.6 Conclusion . 33 2.3 Conjure: Summoning Proxies from Unused Address Space . 34 2.3.1 Introduction . 34 2.3.2 Background . 37 v 2.3.3 Threat Model . 40 2.3.4 Architecture . 41 2.3.5 Implementation . 49 2.3.6 Evaluation . 53 2.3.7 Attacks and Defenses . 62 2.3.8 Related Work . 66 2.3.9 Conclusion and Future Work . 69 2.3.10 Acknowledgements . 70 3 TLS Fingerprinting 71 3.1 Background . 71 3.2 The use of TLS in Censorship Circumvention . 72 3.2.1 Introduction . 72 3.2.2 Background . 75 3.2.3 Measurement Architecture . 77 3.2.4 High-level results . 80 3.2.5 Censorship Circumvention Tools . 88 3.2.6 Defenses & Lessons . 96 3.2.7 uTLS . 98 3.2.8 Other Results . 104 3.2.9 Related Work . 107 3.2.10 Discussion . 109 3.2.11 Conclusion . 111 3.2.12 Appendices . 111 4 Active Probing 114 4.1 Background . 114 4.2 Detecting Probe-resistant Proxies . 117 vi 4.2.1 Introduction . 117 4.2.2 Background . 118 4.2.3 Probe Design . 123 4.2.4 Identifying Proxies . 126 4.2.5 Evaluation . 131 4.2.6 Defense Evaluation . 144 4.2.7 Related Work . 147 4.2.8 Discussion . 149 4.2.9 Acknowledgements . 151 4.2.10 Conclusion . 151 4.2.11 Automated Proxy Classification . 152 4.3 HTTPT: A Probe-Resistant Proxy . 159 4.3.1 Introduction . 159 4.3.2 Background . 160 4.3.3 Design . 162 4.3.4 Evaluation . 168 4.3.5 Discussion . 172 4.3.6 Conclusion . 174 5 Conclusion 175 Bibliography 178 Tables Table 2.1 Conjure Applications — “Active probe resistant” protocols are designed to look innocuous even if scanned by a censor. “Tunneling” (T) protocols use another protocol (e.g. TLS) to blend in, while “Randomized” (R) ones attempt to have no discernable protocol fingerprint or headers. For existing protocols, we list any known attacks suggested in the literature that let censors passively detect them. We also list if we have implemented the application in our prototype. 45 2.2 Comparing Refraction Networking Schemes — “No inline blocking” corresponds to schemes that can operate as a passive tap on the side without needing an inline element in the ISP network. “Handles asymmetric routes” refers to schemes that work when only one direction (either client to decoy or decoy to server) is seen by the station. “Replay attacks” refers to censors who may replay/preplay previous messages or actively probe the protocol. “Traf- fic analysis” includes latency, inter-packet timing, and website fingerprinting. “Unlimited Sessions” shows schemes that do not need to repeatedly reconnect to download or upload arbitrarily large content. 68 3.1 Top 10 Implementations — The most frequently seen fingerprints in our dataset and the implementations that generate them, for a week in August and December 2018. Despite being only 4 months apart, the top 10 fingerprints changed substantially, as new browser releases quickly take the place of older versions. 85 viii 3.2 Tool Fingerprintability — Summary of all TLS fingerprints generated by censorship cir- cumvention tools and their rank and percentage of connections seen in our dataset as of early August 2018. Highlighted in red are fingerprints seen in relatively few (< 0.1%) connections, putting them at risk of blocking by a censor. 95 3.3 Top Extensions — While we include the presence and order of all extensions in our finger- print, Bold denotes extensions whose data we additionally parse and include in our fingerprint; * marks extensions new in TLS 1.3. 103 3.4 Non-Standard Parameters — Breakdown of the number of unique ClientHellos (finger- prints) and the share of connections they appear in that send non-standard cipher suites or extensions. While TLS 1.3 draft ciphers and extensions are the majority, we still find unknown ciphers and extensions in use. 106 3.5 Weak Ciphers — We analyzed the percentage of connections offering known weak cipher suites. We also include TLS FALLBACK SCSV, which indicates a client that is falling back to an earlier version than it supports due to the server not supporting the same version. 107 4.1 Probe-Resistant Protocols — In this table we list the first message sent by the client in the probe-resistant proxy protocols we study, and the server’s corresponding parsing behavior. Blue text denotes secrets distributed to the client out-of-band. Servers typically close the connection after they receive an invalid handshake message; however, precisely when a server closes a failed connection can reveal a unique fingerprint that could be used to distinguish them from other services. 120 4.2 Proxy Timeouts and Thresholds — We measured how long each probe-resistant proxy lets a connection stay open before it times out and closes it (Connection Timeout), and how many bytes we can send to a proxy before it immediately closes the connection with a FIN or RST. obfs4’s RST Threshold is the next value after the FIN threshold that is divisible by 1448. 130 ix 4.3 Percent of Endpoints Responding with Data — For both our ISP passive (Tap) and active (ZMap) datasets, we report the percent of endpoints that respond to each of our probes, as well as the percent that responded to at least one (Any). 131 4.4 Decision Tree Results.