Tor: The Second-Generation Onion Router (2014 DRAFT v1)
Roger Dingledine, The Free Haven Project, [email protected]
Nick Mathewson, The Free Haven Project, [email protected]
Steven Murdoch, Computer Laboratory, University of Cambridge, [email protected]
Paul Syverson, Naval Research Lab, [email protected]

REVISION 73A25E1 (2014-01-06 19:20:47 +0000)

Abstract

We present Tor, a circuit-based low-latency anonymous communication service. This Onion Routing system addresses limitations in the earlier design by adding perfect forward secrecy, congestion control, directory servers, integrity checking, configurable exit policies, anticensorship features, guard nodes, application- and user-selectable stream isolation, and a practical design for location-hidden services via rendezvous points. Tor is deployed on the real-world Internet, requires no special privileges or kernel modifications, requires little synchronization or coordination between nodes, and provides a reasonable tradeoff between anonymity, usability, and efficiency. An earlier paper in 2004 described Tor’s original design; here we explain Tor’s current design as of late 2012, and describe our experiences with an international network of approximately 3000 nodes and 500000 users. We close with a list of open problems in anonymous communication.

1 Overview

Onion Routing is a distributed overlay network designed to anonymize TCP-based applications like web browsing, secure shell, and instant messaging. Clients choose a path through the network and build a circuit, in which each node (or “onion router” or “OR”) in the path knows its predecessor and successor, but no other nodes in the circuit. Traffic flows down the circuit in fixed-size cells, which are unwrapped by a symmetric key at each node (like the layers of an onion) and relayed downstream. The Onion Routing project published several design and analysis papers [25, 39, 46, 47]. While a wide area Onion Routing network was deployed briefly, the only long-running public implementation was a fragile proof-of-concept that ran on a single machine. Even this simple deployment processed connections from over sixty thousand distinct IP addresses from all over the world at a rate of about fifty thousand per day. Here we describe Tor, a protocol for asynchronous, loosely federated onion routers. Tor provides the following improvements over the old Onion Routing design:

Perfect forward secrecy: In the original Onion Routing design, a single hostile node could record traffic and later compromise successive nodes in the circuit and force them to decrypt it. Rather than using a single multiply encrypted data structure (an onion) to lay each circuit, Tor now uses an incremental or telescoping path-building design, where the initiator negotiates session keys with each successive hop in the circuit. Once these keys are deleted, subsequently compromised nodes cannot decrypt old traffic. As a side benefit, onion replay detection is no longer necessary, and the process of building circuits is more reliable, since the initiator knows when a hop fails and can then try extending to a new node.

Separation of “protocol cleaning” from anonymity: Onion Routing originally required a separate “application proxy” for each supported application protocol. Most of these proxies were never written, so many applications were never supported. Tor uses the standard and near-ubiquitous SOCKS [30] proxy interface, allowing us to support most TCP-based programs without modification. For the protocol cleaning of HTTP and HTTPS, Tor relies on Torbutton [36] (a Firefox add-on) and modifications made to the version of Firefox delivered to users as part of the Tor Browser Bundle.

No mixing, padding, or traffic shaping (yet): Onion Routing originally called for batching and reordering cells as they arrived, assumed padding between ORs, and in later designs added padding between onion proxies (users) and ORs [25, 39]. Tradeoffs between padding protection and cost were discussed, and traffic shaping algorithms were theorized [47] to provide good security without expensive padding, but no concrete padding scheme was suggested. As of our first writing, research [1] and deployment experience [4] suggest that this level of resource use is not practical or economical; and even full link padding is still vulnerable [31]. This has not changed since our first publication, though we remain hopeful.

Many TCP streams can share one circuit: Onion Routing originally built a separate circuit for each application-level request, but this required multiple public key operations for every request, and also presented a threat to anonymity from building so many circuits; see Section 9. Tor multiplexes multiple TCP streams along each circuit to improve efficiency and anonymity, but allows the user to control which streams may share a circuit with which other streams to prevent unintended linking of pseudonyms.

Leaky-pipe circuit topology: Through in-band signaling within the circuit, Tor initiators can direct traffic to nodes partway down the circuit. This novel approach allows traffic to exit the circuit from the middle, possibly frustrating traffic shape and volume attacks based on observing the end of the circuit. (It also allows for long-range padding if future research shows this to be worthwhile.)

Congestion control: Earlier anonymity designs do not address traffic bottlenecks. Unfortunately, typical approaches to load balancing and flow control in overlay networks involve inter-node control communication and global views of traffic. Tor’s decentralized congestion control uses end-to-end acks to maintain anonymity while allowing nodes at the edges of the network to detect congestion or flooding and send less data until the congestion subsides. This is an area of active experimentation and research.

Directory authorities: The earlier Onion Routing design planned to flood state information through the network, an approach that can be unreliable and complex. Tor takes a simplified view toward distributing this information. Certain more trusted nodes act as directory authorities: they collaborate to generate signed directory documents describing known routers and their current state. Users periodically download these documents directly from the authorities or a mirror, via HTTP tunnelled over a Tor circuit.

Variable exit policies: Tor provides a consistent mechanism for each node to advertise a policy describing the hosts and ports to which it will connect. These exit policies are critical in a volunteer-based distributed infrastructure, because each operator is comfortable with allowing different types of traffic to exit from his node.

End-to-end integrity checking: The original Onion Routing design did no integrity checking on data. Any node on the circuit could change the contents of data cells as they passed by, for example to alter a connection request so it would connect to a different webserver, or to ‘tag’ encrypted traffic and look for corresponding corrupted traffic at the network edges [13]. Tor hampers these attacks by verifying data integrity before it leaves the network.

Rendezvous points and hidden services: Tor provides an integrated mechanism for responder anonymity via location-protected servers. Previous Onion Routing designs included long-lived “reply onions” that could be used to build circuits to a hidden server, but these reply onions did not provide forward security, and became useless if any node in the path went down or rotated its keys.

[...] require not only anonymous communications but also censorship resistance. Tor circumvents attempts to block access to the network by allowing users to access the network via a “bridge”, a special-purpose Tor node that does not appear in the directory and whose IP address is known only to a small number of individuals, which reduces the chance of it being blocked by IP address. The Tor protocol has also been designed to be similar to HTTPS, so that blocking Tor without blocking HTTPS is made more difficult.

Modular architecture: The Tor program is only one part of an effective anonymous communication system, and Tor provides functionality to integrate with other components to fulfil a wide variety of user requirements. The graphical user interface is a separate program (Vidalia, in the Tor Browser Bundle, but alternatives exist) which communicates with Tor via a local socket, the “control port”. Special-purpose controllers have also been developed by researchers to analyse Tor and prototype modifications. Additional resistance against protocol fingerprinting, for the purposes of censorship resistance, may be provided by an external “pluggable transport” obfuscator.

Unlike Freedom [7], Tor does not require OS kernel patches or network stack support. This prevents us from anonymizing non-TCP protocols, but has greatly helped our portability and deployability.

Tor has been operational since 2003. Our source code is available under a free license, and Tor is not covered by the patent that affected distribution and use of earlier versions of Onion Routing. As of this writing, the Tor network stands at about 3000 nodes.

We review previous work in Section 2, describe our goals and assumptions in Section 3, and then address the above list of improvements in Sections 4, 5, and 6. We summarize in Section 7 how our design stands up to known attacks, and talk about our early deployment experiences in Section 8. We conclude with a list of open problems in Section 9 and future work for the Onion Routing project in Section 10.

2 Related work

Modern anonymity systems date to Chaum’s Mix-Net design [9]. Chaum proposed hiding the correspondence between sender and recipient by wrapping messages in layers of public-key cryptography, and relaying them through a path composed of “mixes.” Each mix in turn decrypts, delays, and re-orders messages before relaying them onward.

Subsequent relay-based anonymity designs have diverged in two main directions. Systems like Babel [26], Mixmaster [34], and Mixminion [13] have tried to maximize anonymity at the cost of introducing comparatively large