Enhancing System Transparency, Trust, and Privacy with Internet Measurement

by Benjamin VanderSloot

A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy (Computer Science and Engineering) in the University of Michigan 2020

Doctoral Committee:
Professor J. Alex Halderman, Chair
Assistant Professor Roya Ensafi
Research Professor Peter Honeyman
Assistant Professor Baris Kasikci
Professor Kentaro Toyama

Benjamin VanderSloot
[email protected]
ORCID iD: 0000-0001-6528-3927

© Benjamin VanderSloot 2020

ACKNOWLEDGEMENTS

First and foremost, I thank my wife-to-be, Alexandra. Without her unwavering love and support, this dissertation would have remained unfinished. Without her help, the ideas here would not be as strong. Without her, I would not be here.

Thanks are also due to my advisor, J. Alex Halderman, for giving me the independence and support I needed to find myself. I also thank Peter Honeyman for sharing his experience freely, both with history lessons and mentorship. Thank you Roya Ensafi, for watching out for me since we met. Thank you Baris Kasikci and Kentaro Toyama, for helping me out the door, in spite of the ongoing pandemic.

I owe a great debt to my parents, Craig and Teresa VanderSloot, for raising me so well. No doubt the lessons about perseverance and hard work were put to good use. Your pride has been fuel to my fire for a very long time now. I also owe so much to my sister, Christine Kirbitz, for being the best example a little brother could hope for. I always had a great path to follow and an ear to share my problems with.

Matt Bernhard and Allison McDonald, I owe you my sanity. Let me know if you need to borrow it; I’ll leave the door open so you can follow me out quickly. Thank you Drew Springall and David Adrian for always taking the time out of your own journey to help me.
My time in grad school would not have been the same without the fun and academic challenge I had in lab, and for that I have so many to thank: Eric Wustrow, James Kasten, Zakir Durumeric, Will Scott, Chris Dzombak, Ariana Mirian, Ben Burgess, Alex Holland, Gabrielle Beck, Rose Howell, Sai Gouravajhala, Ofir Weisse, Marina Minkin, Andrew Kwong, and many others. Thank you to all of the professors that showed me kindness, helped me, or inspired me.

The work in this dissertation was supported in part by the National Science Foundation; the United States Department of State Bureau of Democracy, Human Rights, and Labor; the Rackham Merit Fellowship; and the Computer Science and Engineering Division at Michigan.

TABLE OF CONTENTS

ACKNOWLEDGEMENTS
LIST OF FIGURES
LIST OF TABLES
ABSTRACT

CHAPTER

I. Introduction
    1.1 Measuring Internet Censorship
    1.2 Measuring to Circumvent Censorship
    1.3 Measuring Web Tracking Evasion
    1.4 Measuring the Certificate Ecosystem

II. Measurement of Application-Layer Censorship
    2.1 Introduction
    2.2 Related Work
        2.2.1 Application-layer Blocking
        2.2.2 Direct Measurement Systems
        2.2.3 Remote Measurement Systems
        2.2.4 Investigations of DPI Policies
    2.3 Our Measurement System
        2.3.1 Design Goals
        2.3.2 System Design
    2.4 Experimentation
        2.4.1 Echo Server Characterization
        2.4.2 Datasets
    2.5 Evaluation
        2.5.1 Validation
        2.5.2 Detection of Disruption
        2.5.3 Disruption Mechanisms
        2.5.4 HTTP vs HTTPS
        2.5.5 Disruption Breadth
    2.6 Discussion
        2.6.1 Ethics
        2.6.2 Limitations
        2.6.3 Future Work
    2.7 Conclusion

III. Measurement in Aid of Censorship Circumvention
    3.1 Introduction
    3.2 Background
        3.2.1 Prior schemes
    3.3 Deployment Architecture
        3.3.1 ISP Deployment
        3.3.2 Long-lived Sessions
        3.3.3 Handling 40Gbps Links
        3.3.4 Multi-Station Constellation
        3.3.5 Client Integration
        3.3.6 Monitoring
        3.3.7 Reachable site selection
        3.3.8 Managing Load on Reachable Sites
    3.4 Trial Results
        3.4.1 At-Scale Operation
        3.4.2 User Traffic
        3.4.3 Impact on Reachable Sites
    3.5 Deployment Results
        3.5.1 Psiphon Impact
        3.5.2 Client Performance
        3.5.3 Decoy Impact
        3.5.4 Station Performance
        3.5.5 Blocking Event
    3.6 Discussion
        3.6.1 Role of Refraction Networking
        3.6.2 Cost
        3.6.3 Partner Concerns
        3.6.4 Future Directions
    3.7 Conclusion

IV. Measurement of Web Privacy Defenses
    4.1 Introduction
    4.2 Related Work
    4.3 Methods
        4.3.1 Web Crawl
        4.3.2 Blocker List Testing
        4.3.3 Blocker List Probing
        4.3.4 Inclusion Graph
        4.3.5 Privacy Metrics
    4.4 Comparing Blocker Lists
        4.4.1 Default Extension Behavior
        4.4.2 Explicit A&A Exceptions
        4.4.3 EasyList Variants
    4.5 Blocking Completeness
    4.6 Privacy Impacts
        4.6.1 Exchange Sensitivity
        4.6.2 Tool Evaluation
        4.6.3 Acceptable Advertisement
        4.6.4 Improved Model
    4.7 Discussion
        4.7.1 Findings and Lessons
        4.7.2 Limitations
    4.8 Conclusion

V. Measurement of HTTPS Certificates
    5.1 Introduction
    5.2 Certificate Perspectives
        5.2.1 Perspectives Enumerated
        5.2.2 Ethics of Measurement
    5.3 Impact of Perspective on HTTPS Research
    5.4 Two Complementary Perspectives Emerge
        5.4.1 Limitations of Censys
        5.4.2 Limitations of Certificate Transparency
    5.5 Passive Traffic Monitoring
        5.5.1 Limitations of Scope
        5.5.2 User-driven
    5.6 Merger of Certificate Transparency and Censys
        5.6.1 Benefits to System Administrators
        5.6.2 Load-testing Certificate Transparency
    5.7 Related Work
    5.8 Conclusion

VI. Conclusion and Future Work

BIBLIOGRAPHY

LIST OF FIGURES

Figure

2.1 Echo Protocol: The Echo Protocol, when properly performed, is a simple exchange between the client and server where the server’s response is identical to the client’s request. In the example above, the censoring middlebox ignores the client’s inbound request, but reacts to the echo server’s response, injecting RST packets and terminating the TCP connection.

2.2 Test Control Flow: A single test using an echo server is performed by following this diagram. The most common path is also the fastest, in which an echo server responds correctly to the first request and the test is marked as Not Blocked. If the server never responds correctly, the experiment is considered a failure and we do not use the test in our evaluation.

2.3 Persistent Interference Duration: We use echo servers in all countries where we observe censorship to empirically measure the length of time interference occurs after a censorship event has been triggered. Roughly half of the servers responded correctly to our request within 60 seconds. By 100 seconds, 99.9% responded correctly. We therefore choose two minutes as a safe delay in the Delay Phase.

2.4 Echo Server Churn: Only 18% of tested servers were reachable in every observation over 2 months of daily scans. However, 56% were present in both our first and final scans.

2.5 Coverage of Autonomous Systems per Country: Echo servers were present in 184 countries with 4458 unique ASes, while OONI probes were in 113 countries with 678 ASes. ..
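
The captions for Figures 2.1 to 2.3 describe an echo-based test: send a payload to an echo server, check that the response is identical, and wait out persistent interference (two minutes) before trying again. The following is a minimal Python sketch of that idea, not the dissertation's implementation. The names tcp_echo_exchange and run_echo_test, the three result labels, and the use of a harmless control payload after the delay to separate "blocked" from "failure" are all assumptions made for illustration; the full control flow in Figure 2.2 is not reproduced in the captions. The only outside fact relied on is that the Echo Protocol (RFC 862) listens on port 7.

```python
import socket
import time
from typing import Callable, Optional

# Hypothetical result labels; the captions name only "Not Blocked" and failure.
NOT_BLOCKED = "not_blocked"
BLOCKED = "blocked"
FAILURE = "failure"

# An exchange sends one payload and returns what came back (None on error).
Exchange = Callable[[bytes], Optional[bytes]]


def tcp_echo_exchange(host: str, port: int = 7, timeout: float = 5.0) -> Exchange:
    """Build an exchange function that talks to a TCP echo server."""

    def exchange(payload: bytes) -> Optional[bytes]:
        try:
            with socket.create_connection((host, port), timeout=timeout) as conn:
                conn.sendall(payload)
                received = b""
                # Read until the full payload has been echoed or the peer closes.
                while len(received) < len(payload):
                    chunk = conn.recv(len(payload) - len(received))
                    if not chunk:
                        break
                    received += chunk
                return received
        except OSError:
            # Covers resets (e.g. injected RST packets), timeouts, and refusals.
            return None

    return exchange


def run_echo_test(
    exchange: Exchange,
    test_payload: bytes,
    control_payload: bytes,
    delay_seconds: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Classify one test; the blocked/failure split is an assumed simplification."""
    # Fast path: a correct echo of the first request means Not Blocked.
    if exchange(test_payload) == test_payload:
        return NOT_BLOCKED
    # Wait out persistent interference before probing again.
    sleep(delay_seconds)
    # Assumption: if a harmless payload now echoes correctly, the server is
    # healthy and the earlier failure is attributed to interference.
    if exchange(control_payload) == control_payload:
        return BLOCKED
    # The server never responded correctly, so the test is discarded.
    return FAILURE
```

Passing the exchange and sleep functions as parameters keeps the classification logic testable without network access. A live run would look like run_echo_test(tcp_echo_exchange("192.0.2.1"), test_payload, control_payload), where 192.0.2.1 is a placeholder documentation address rather than a real echo server.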
Recommended publications
  • An ISP-Scale Deployment of TapDance
    An ISP-Scale Deployment of TapDance. Sergey Frolov, Fred Douglas, Will Scott, Eric Wustrow, Allison McDonald, Benjamin VanderSloot, Rod Hynes, Michalis Kallitsis, David G. Robinson, Adam Kruger, Steve Schultze, Nikita Borisov, J. Alex Halderman. Affiliations listed: Google; University of Colorado Boulder; University of Michigan; Merit Network; Upturn; Psiphon; Georgetown University Law Center; University of Illinois.
    ABSTRACT: In this talk, we will report initial results from the world’s first ISP-scale field trial of a refraction networking system. Refraction networking is a next-generation censorship circumvention approach that locates proxy functionality in the middle of the network, at participating ISPs or other network operators. We built a high-performance implementation of the TapDance refraction networking scheme and deployed it on four ISP uplinks with an aggregate bandwidth of 100 Gbps. Over one week of operation, our deployment served more than 50,000 real users. The experience demonstrates that TapDance can be practically realized at ISP scale with good ...
    ... network operators, who provide censorship circumvention functionality for any connection that passes through their networks. To accomplish this, clients make HTTPS connections to sites that they can reach, where such connections traverse a participating network. The participating network operator recognizes a steganographic signal from the client and appends the user’s requested data to the encrypted connection response. From the perspective of the censor, these connections are indistinguishable from normal TLS connections to sites the censor has not blocked. To block the refraction connections, the censor would need to block all connections that traverse a participating network.
  • Threat Modeling and Circumvention of Internet Censorship by David Fifield
    Threat modeling and circumvention of Internet censorship. By David Fifield. A dissertation submitted in partial satisfaction of the requirements for the degree of Doctor of Philosophy in Computer Science in the Graduate Division of the University of California, Berkeley. Committee in charge: Professor J.D. Tygar, Chair; Professor Deirdre Mulligan; Professor Vern Paxson. Fall 2017.
    Abstract: Research on Internet censorship is hampered by poor models of censor behavior. Censor models guide the development of circumvention systems, so it is important to get them right. A censor model should be understood not just as a set of capabilities (such as the ability to monitor network traffic) but as a set of priorities constrained by resource limitations. My research addresses the twin themes of modeling and circumvention. With a grounding in empirical research, I build up an abstract model of the circumvention problem and examine how to adapt it to concrete censorship challenges. I describe the results of experiments on censors that probe their strengths and weaknesses; specifically, on the subject of active probing to discover proxy servers, and on delays in their reaction to changes in circumvention. I present two circumvention designs: domain fronting, which derives its resistance to blocking from the censor's reluctance to block other useful services; and Snowflake, based on quickly changing peer-to-peer proxy servers. I hope to change the perception that the circumvention problem is a cat-and-mouse game that affords only incremental and temporary advancements.
  • PredictRoute: A Network Path Prediction Toolkit
    PredictRoute: A Network Path Prediction Toolkit. RACHEE SINGH, University of Massachusetts, Amherst, USA; DAVID TENCH, University of Massachusetts, Amherst, USA; PHILLIPA GILL, University of Massachusetts, Amherst, USA; ANDREW MCGREGOR, University of Massachusetts, Amherst, USA.
    Accurate prediction of network paths between arbitrary hosts on the Internet is of vital importance for network operators, cloud providers, and academic researchers. We present PredictRoute, a system that predicts network paths between hosts on the Internet using historical knowledge of the data and control plane. In addition to feeding on freely available traceroutes and BGP routing tables, PredictRoute optimally explores network paths towards chosen BGP prefixes. PredictRoute’s strategy for exploring network paths discovers 4X more autonomous system (AS) hops than other well-known strategies used in practice today. Using a corpus of traceroutes, PredictRoute trains probabilistic models of routing towards prefixes on the Internet to predict network paths and their likelihood. PredictRoute’s AS-path predictions differ from the measured path by at most 1 hop, 75% of the time. We expose PredictRoute’s path prediction capability via a REST API to facilitate its inclusion in other applications and studies. We additionally demonstrate the utility of PredictRoute in improving real-world applications for circumventing Internet censorship and preserving anonymity online.
    CCS Concepts: • Mathematics of computing → Paths and connectivity problems; Markov networks; • Networks → Routing protocols; Wide area networks; Public Internet.
    Additional Key Words and Phrases: network measurement; Internet path prediction
    ACM Reference Format: Rachee Singh, David Tench, Phillipa Gill, and Andrew McGregor. 2021. PredictRoute: A Network Path Prediction Toolkit. Proc. ACM Meas. Anal. Comput. Syst.
  • The Use of TLS in Censorship Circumvention
    The use of TLS in Censorship Circumvention. Sergey Frolov, University of Colorado Boulder, [email protected]; Eric Wustrow, University of Colorado Boulder, [email protected].
    Abstract: TLS, the Transport Layer Security protocol, has quickly become the most popular protocol on the Internet, already used to load over 70% of web pages in Mozilla Firefox. Due to its ubiquity, TLS is also a popular protocol for censorship circumvention tools, including Tor and Signal, among others. However, the wide range of features supported in TLS makes it possible to distinguish implementations from one another by what set of cipher suites, elliptic curves, signature algorithms, and other extensions they support. Already, censors have used deep packet inspection (DPI) to identify and block popular circumvention tools based on the fingerprint of their TLS implementation. In response, many circumvention tools have attempted to mimic popular TLS implementations such as browsers, but this technique has several challenges. First, it is burdensome to keep up with the rapidly-changing browser TLS implementations, and know what fingerprints would be good candidates to mimic.
    ... TLS [54], and adoption continues to grow as more websites, services, and applications switch to TLS. Given the prevalence of TLS, it is commonly used by circumvention tools to evade Internet censorship. Because censors can easily identify and block custom protocols [30], circumvention tools have turned to using existing protocols. TLS offers a convenient choice for these tools, providing plenty of legitimate cover traffic from web browsers and other TLS users, protection of content from eavesdroppers, and several libraries to choose from that support it. However, simply using TLS for a transport protocol is not enough to evade censors. Since TLS handshakes are not encrypted, censors can identify a client’s purported support for
  • Conjure: Summoning Proxies from Unused Address Space
    Conjure: Summoning Proxies from Unused Address Space. Sergey Frolov, University of Colorado Boulder; Jack Wampler, University of Colorado Boulder; Sze Chuen Tan, UIUC; J. Alex Halderman, University of Michigan; Nikita Borisov, UIUC; Eric Wustrow, University of Colorado Boulder.
    ABSTRACT: Refraction Networking (formerly known as “Decoy Routing”) has emerged as a promising next-generation approach for circumventing Internet censorship. Rather than trying to hide individual circumvention proxy servers from censors, proxy functionality is implemented in the core of the network, at cooperating ISPs in friendly countries. Any connection that traverses these ISPs could be a conduit for the free flow of information, so censors cannot easily block access without also blocking many legitimate sites. While one Refraction scheme, TapDance, has recently been deployed at ISP-scale, it suffers from several problems: a limited number of “decoy” sites in realistic deployments, high technical complexity, and undesirable tradeoffs between performance and observability by the censor. These challenges may impede broader deployment and ultimately allow censors to block such techniques. We present Conjure, an improved Refraction Networking approach that overcomes these limitations by leveraging unused address space at deploying ISPs. Instead of using real websites as the decoy destinations for proxy connections, our scheme connects to IP addresses where no web server exists leveraging proxy functionality from the core of the network.
    Figure 1: Conjure Overview. An ISP deploys a Conjure station, which sees a passive tap of passing traffic. Following a steganographic registration process, a client can connect to an unused IP address in the ISP’s AS, and the station will inject packets to communicate with the client as if there were a proxy server at that address.
    1 INTRODUCTION: Over half of Internet users globally now live in countries that block
  • Protocol Proxy: an FTE-Based Covert Channel
    Protocol Proxy: An FTE-based covert channel. Jonathan Oakley*, Lu Yu, Xingsi Zhong, Ganesh Kumar Venayagamoorthy, Richard Brooks. Department of Electrical and Computer Engineering, Clemson University, Clemson, SC, USA. Computers & Security 92 (2020) 101777.
    Article history: Received 25 September 2019; Revised 24 December 2019; Accepted 22 February 2020; Available online 24 February 2020.
    Keywords: Covert channel; Format Transforming Encryption (FTE); Steganography; Traffic analysis; Deep Packet Inspection (DPI).
    Abstract: In a hostile network environment, users must communicate without being detected. This involves blending in with the existing traffic. In some cases, a higher degree of secrecy is required. We present a proof-of-concept format transforming encryption (FTE)-based covert channel for tunneling TCP traffic through protected static protocols. Protected static protocols are UDP-based protocols with variable fields that cannot be blocked without collateral damage, such as power grid failures. We (1) convert TCP traffic to UDP traffic, (2) introduce observation-based FTE, and (3) model interpacket timing with a deterministic Hidden Markov Model (HMM). The resulting Protocol Proxy has a very low probability of detection and is an alternative to current covert channels. We tunnel a TCP session through a UDP protocol and guarantee delivery. Observation-based FTE ensures traffic cannot be detected by traditional rule-based analysis or DPI. A deterministic HMM ensures the Protocol Proxy accurately models interpacket timing to avoid detection by side-channel analysis.
  • Practical Countermeasures Against Network Censorship
    Practical Countermeasures against Network Censorship, by Sergey Frolov. B.S.I.T., Lobachevsky State University, 2015; M.S.C.S., University of Colorado, 2017. A thesis submitted to the Faculty of the Graduate School of the University of Colorado in partial fulfillment of the requirements for the degree of Doctor of Philosophy, Department of Computer Science, 2020. Committee Members: Eric Wustrow, Chair; Prof. Sangtae Ha; Prof. Nolen Scaife; Prof. John Black; Prof. Eric Keller; Dr. David Fifield.
    Frolov, Sergey (Ph.D., Computer Science). Practical Countermeasures against Network Censorship. Thesis directed by Prof. Eric Wustrow.
    Governments around the world threaten free communication on the Internet by building increasingly complex systems to carry out Network Censorship. Network Censorship undermines citizens’ ability to access websites and services of their preference, damages freedom of the press and self-expression, and threatens public safety, motivating the development of censorship circumvention tools. Inevitably, censors respond by detecting and blocking those tools, using a wide range of techniques including Enumeration Attacks, Deep Packet Inspection, Traffic Fingerprinting, and Active Probing. In this dissertation, I study some of the most common attacks, actually adopted by censors in practice, and propose novel attacks to assist in the development of defenses against them. I describe practical countermeasures against those attacks, which often rely on empiric measurements of real-world data to maximize their efficiency. This dissertation also reports how this work has been successfully deployed to several popular censorship circumvention tools to help censored Internet users break free of the repressive information control.
    Acknowledgements: I am thankful to many engineers and researchers from various organizations I had a pleasure to work with, including Google, Tor Project, Psiphon, Lantern, and several universities.
  • Obfuscating Network Traffic to Circumvent Censorship
    Obfuscating Network Traffic to Circumvent Censorship. Awn Umar, MEng Mathematics and Computer Science, University of Bristol. May 21, 2021.
    Abstract: This report explores some techniques used to censor information in transit on the Internet. The attack surface of the Internet is explained as well as how this affects the potential for abuse by powerful nation state adversaries. Previous research and existing traffic fingerprinting and obfuscation techniques are discussed. Rosen, a modular and flexible proxy tunnel, is introduced and evaluated against state-of-the-art DPI engines, as are some existing censorship circumvention tools. Rosen is also successfully tested from behind the Great Firewall in China.
    Contents:
      1 Contextual Background
        1.1 Information on the Internet
        1.2 Security of Internet Protocols
        1.3 Exploitation of the Internet by Nation States
          1.3.1 Mass Surveillance
          1.3.2 Censorship
        1.4 Ethic of Responsibility
      2 Project Definition
        2.1 Aim and Objectives
        2.2 Scope
        2.3 Vocabulary
      3 Adversarial Model
      4 Traffic Identification Methods
        4.1 Endpoint-Based Filtering
        4.2 Traffic Fingerprinting
      5 Network
  • To Date, Only Tapdance [59], One of Six Refraction Networking Future Censor Techniques with Greater Agility
    CCS ’19, November 11–15, 2019, London, United Kingdom. Sergey Frolov, Jack Wampler, Sze Chuen Tan, J. Alex Halderman, Nikita Borisov, and Eric Wustrow.
    To date, only TapDance [59], one of six Refraction Networking proposals, has been deployed at ISP scale [20]. TapDance was designed for ease of deployment. Instead of in-line network devices required by earlier schemes, it calls for only a passive tap. This “on-the-side” approach, though much friendlier from an ISP’s perspective, leads to major challenges when interposing in an ongoing client-to-decoy connection:
      • Implementation is complex and error-prone, requiring kernel patches or a custom TCP stack.
      • To avoid detection, the system must carefully mimic subtle features of each decoy’s TCP and TLS behavior.
      • The architecture cannot resist active probing attacks, where the censor sends specially crafted packets to determine whether a suspected connection is using TapDance.
      • Interactions with the decoy’s network stack limit the length and duration of each connection, forcing TapDance to multiplex long-lived proxy connections over many shorter decoy
    ... future censor techniques with greater agility. We believe that these advantages will make Conjure a strong choice for future Refraction Networking deployments. We have released the open source implementation of the Conjure client at https://github.com/refraction-networking/gotapdance/tree/dark-decoy.
    2 BACKGROUND: Refraction Networking operates by injecting covert communication inside a client’s HTTPS connection with a reachable site, also known as a decoy site. In a regular HTTPS session, a client establishes a TCP connection, performs a TLS handshake with a destination site, sends an encrypted web request, and receives an encrypted response. In Refraction Networking, at least one direction of this exchange is observed by a refraction station, deployed at some Internet service provider (ISP).
  • Decentralized Control: a Case Study of Russia
    Decentralized Control: A Case Study of Russia. Reethika Ramesh∗, Ram Sundara Raman∗, Matthew Bernhard∗, Victor Ongkowijaya∗, Leonid Evdokimov†, Anne Edmundson†, Steven Sprecher∗, Muhammad Ikram‡, Roya Ensafi∗. ∗University of Michigan, {reethika, ramaks, matber, victorwj, swsprec, ensafi}@umich.edu; ‡Macquarie University; †Independent, [email protected].
    Abstract: Until now, censorship research has largely focused on highly centralized networks that rely on government-run technical choke-points, such as the Great Firewall of China. Although it was previously thought to be prohibitively difficult, large-scale censorship in decentralized networks is on the rise. Our in-depth investigation of the mechanisms underlying decentralized information control in Russia shows that such large-scale censorship can be achieved in decentralized networks through inexpensive commodity equipment. This new form of information control presents a host of problems for censorship measurement, including difficulty identifying censored content, requiring measurements from diverse perspectives, and variegated censorship mechanisms that require significant effort to identify in a robust manner.
    ... decades, receiving significant global and academic attention as a result [4], [30], [44], [83]. As more citizens of the world begin to use the Internet and social media, and political tensions begin to run high, countries with less centralized networks have also started to find tools to exert control over the Internet. Recent years have seen many unsophisticated attempts to wrestle with decentralized networks, such as Internet shutdowns which, due to their relative ease of execution, have become the de facto censorship method of choice in some countries [14], [36], [82]. While some preliminary studies investigating information control in decentralized networks have examined India [88], Thailand [26], Portugal [60], [61], and other countries, there has
  • Traffic Analysis Resistant Infrastructure
    Traffic Analysis Resistant Infrastructure. Jonathan Oakley, Clemson University, [email protected]. Clemson University TigerPrints, All Dissertations, August 2020.
    Recommended Citation: Oakley, Jonathan, "Traffic Analysis Resistant Infrastructure" (2020). All Dissertations. 2673. https://tigerprints.clemson.edu/all_dissertations/2673
    A Dissertation Presented to the Graduate School of Clemson University In Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy, Computer Engineering, by Jonathan Oakley, August 2020. Accepted by: Dr. Richard Brooks, Committee Chair; Dr. James Martin; Dr. Harlan Russell; Dr. Kuang-Ching Wang.
    Abstract: Network traffic analysis is using metadata to infer information from traffic flows. Network traffic flows are the tuple of source IP, source port, destination IP, and destination port. Additional information is derived from packet length, flow size, interpacket delay, Ja3 signature, and IP header options. Even connections using TLS leak site name and cipher suite to observers. This metadata can profile groups of users or individual behaviors. Statistical properties yield even more information. The hidden Markov model can track the state of protocols where each state transition results in an observation. Format Transforming Encryption (FTE) encodes data as the payload of another protocol. The emulated protocol is called the host protocol. Observation-based FTE is a particular case of FTE that uses real observations from the host protocol for the transformation.
  • HTTPT: A Probe-Resistant Proxy
    HTTPT: A Probe-Resistant Proxy. Sergey Frolov, University of Colorado Boulder; Eric Wustrow, University of Colorado Boulder.
    Abstract: Recently, censors have been observed using increasingly sophisticated active probing attacks to reliably identify and block proxies. In this paper, we introduce HTTPT, a proxy designed to hide behind HTTPS servers to resist these active probing attacks. HTTPT leverages the ubiquity of the HTTPS protocol to effectively blend in with Internet traffic, making it more difficult for censors to block. We describe the challenges that HTTPT must overcome, and the benefits it has over previous probe-resistant designs.
    1 Introduction: Internet censors strive to detect and block circumvention proxies. Many censors have adopted sophisticated proxy discovery techniques, such as active probing [14,38,39], where the censor connects to suspected proxy servers and sends probes designed to distinguish them from non-proxies.
    ... deployed behind already-existing Web Servers, making it impossible for censors to access and identify the proxy without knowing the secret necessary to use it.
    1.1 Benefits: HTTPT has several benefits over existing designs:
    Replay attack protection. Existing probe-resistant proxies are vulnerable to replay attacks, where the censor re-sends observed client messages. Some proxies, such as shadowsocks-libev [43], implement a cache to prevent replays of previous connections. However, China’s active probing thwarts this defense by permuting replays [3], and the behavior of not responding at all to replays may be unusual. In contrast, HTTPT is immune to replay attacks due to its reliance on TLS, which includes bidirectional nonces in the handshake.
    Using existing web servers. HTTPT leverages existing
    [Show full text]