Measuring and Exploiting Latencies Between All Tor Nodes
Total Page:16
File Type:pdf, Size:1020Kb
Ting: Measuring and Exploiting Latencies Between All Tor Nodes Frank Cangialosi Dave Levin Neil Spring University of Maryland University of Maryland University of Maryland [email protected] [email protected] [email protected] ABSTRACT 1. INTRODUCTION Tor is a peer-to-peer overlay routing network that achieves Tor [7] is a popular peer-to-peer service for providing un- unlinkable communication between source and destination. linkable communication and anonymous services. It oper- Unlike traditional mix-nets, Tor seeks to balance anonymity ates by allowing a source node to create a circuit through and performance, particularly with respect to providing low- multiple Tor relays, each of whom learns only about the node latency communication. As a result, understanding the la- preceding and the node following it in the circuit. To keep tencies between peers in the Tor network could be an ex- the nodes on the path from linking the source and destina- tremely powerful tool in understanding and improving Tor's tion, Tor circuit creation mandates at least two relays. performance and anonymity properties. Unfortunately, there Arguably, the primary feature of Tor that has led to its are no practical techniques for inferring accurate latencies widespread success is Tor's balance between anonymity and between two arbitrary hosts on the Internet, and Tor clients good end-to-end performance, particularly with respect to are not instrumented to collect and report on these mea- network latency. One consequence of this trade-off is that surements. Tor defaults to circuits of length three: an entry node, a In this paper, we present Ting, a technique for measuring middle node, and an exit node. Another consequence of this latencies between arbitrary Tor nodes from a single vantage design choice is that Tor relays do not arbitrarily introduce point. Through a ground-truth validation, we show that delays or \mixing" like in other anonymity systems [22, 4, Ting is accurate, even with few samples, and does not re- 29], and instead forward as quickly as is possible (and fair). quire modifications to existing clients. We also apply Ting Tor's performance and anonymity are therefore highly de- to the live Tor network, and show that its measurements are pendent on the round-trip time latencies between the nodes stable over time. We demonstrate that the all-pairs latency within the Tor network. For example, there have been many datasets that Ting permits can be applied in disparate ways, proposals for improving how Tor selects its circuits [2, 28, 20, including faster methods of deanonymizing Tor circuits and 8] that benefit from understanding the inter-Tor-node laten- efficiently finding long circuits with low end-to-end latency. cies, and as we demonstrate in Section 5.1, latency knowl- edge can speed up existing deanonymization techniques [16, Categories and Subject Descriptors 9, 12, 10]. However, to date, there are no techniques available within C.4 [Performance of Systems]: Measurement Techniques; Tor or through other Internet measurement to accurately C.2.2 [Computer-Communication Networks]: Network measure the round-trip times between two arbitrary Tor Protocols; C.2.0 [Computer-Communication Networks]: nodes. Researchers and practitioners have thus had to rely Security and Protection on approximations such as geographic distances, which sim- ply cannot model network phenomena such as triangle in- Keywords equality violations (TIVs) in Internet routing [15]. In this paper, we present Ting, a technique for accurately Latency measurement; Tor; Deanonymization measuring the round-trip times between any arbitrary pair of Tor nodes. Ting operates strictly at Tor's \data plane": it carefully constructs circuits and directly measures laten- cies to and through Tor relays. Critically, Ting works with- out requiring any modifications to the Tor protocol, to Tor clients, or special permission from Tor users. Through a Permission to make digital or hard copies of all or part of this work for personal or ground-truth validation on PlanetLab [21], we show that classroom use is granted without fee provided that copies are not made or distributed Ting is extremely accurate, imposes little communication or for profit or commercial advantage and that copies bear this notice and the full citation computational overhead on the Tor network, and can be run on the first page. Copyrights for components of this work owned by others than the from a single host. author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or We also show three applications of such an unprecedent- republish, to post on servers or to redistribute to lists, requires prior specific permission edly accurate and thorough latency dataset. Among these, and/or a fee. Request permissions from [email protected]. IMC’15, October 28–30, 2015, Tokyo, Japan. we show for the first time the presence of TIVs in the Tor Copyright is held by the owner/author(s). Publication rights licensed to ACM. network, and the extent to which Ting's measurements can ACM 978-1-4503-3848-6/15/10 ...$15.00. be used to improve end-to-end paths. DOI: http://dx.doi.org/10.1145/2815675.2815701. 289 Ting fills a gap that has formed in network measurement shortest paths, whether Tor performance is intentionally de- tools: the ability to directly measure the latencies between graded, and whether there are opportunities for performance two hosts, neither of which are under the control of the ex- optimization. We can evaluate how often Tor's default ran- perimenter. The King technique [11], introduced in 2002, dom relay selection produces high latency paths and whether indirectly measured latency between two arbitrary hosts by different path selection approaches might be more difficult to cleverly constructing queries to publicly available recursive deanonymize. By taking advantage of Tor as a system that DNS servers. Unfortunately, since then, most publicly avail- is representative of volunteer-administered overlay, we can able DNS servers have disallowed recursive queries due to also learn about the networks that host relay nodes (x5.3). security concerns, rendering King narrowly applicable. Con- Prior work has observed that latency information would versely, we show that Ting can be used to infer with direct be beneficial, but have avoided attempting to incorporate it measurements the end-to-end latencies among any pair of explicitly into the Tor protocol [20, 28]. Our approach is to active Tor nodes. In other words, as Tor's user base in- use, rather than redesign, the protocol in order to measure creases, so too does Ting's applicability. We show that Tor's inter-relay latency information. This permits immediate, current user base span a diverse set of networks (including, incremental deployment of Ting-capable clients, since they in particular, residential hosts) from among ∼6000 unique can use existing Tor relays. /24 networks, making Ting a viable tool for wide-scale net- There are few workable alternatives for estimating the la- work measurement.1 tency between Tor nodes. To make broad, immediate de- This paper makes the following contributions: ployment possible, we cannot modify the Tor protocol, e.g., to ask relays to ping one another. There are large scale net- • We present the design, implementation, and validation work measurement services that use hardware at clients to of Ting, a technique for measuring round-trip times ping often: RIPE Atlas pings root DNS servers from spe- between any arbitrary pair of Tor relays. Ting does not cialized \probe" devices [24]. Such special purpose hard- require modifications to or special participation from ware [31, 25] could be applied to measure and construct a Tor clients. To the best of our knowledge, Ting is the database of inter-node latencies, but they require users to only practical tool today for measuring pairwise RTTs deploy hardware in their networks. Conversely, Ting oper- in the Tor network. ates completely within the Tor peer-to-peer network, and • We thoroughly validate Ting's accuracy (80% of the does not require any additional user deployment. There has time, its estimates are within 10% of ground-truth), also been considerable work towards estimating inter-node stability (over a week, Ting's estimates vary by less latencies through the use of relatively few landmarks de- than 5ms), and trade-offs between speed and accuracy. ployed throughout the network [6, 18, 33]. Such estimation systems offer considerably greater coverage than Ting|they • We present algorithms that use all-pairs latency mea- can be applied to virtually any pair of nodes|but suffer surements to drastically improve the time (a median from the fact that Internet latencies are inherently difficult 1:5× speedup) to deanonymize Tor circuits. to estimate accurately, e.g., due to triangle inequality vio- • Finally, we show that Ting's measurements can be used lations [26, 15] (x5.2.1). Ting, on the other hand, achieves to improve path selection: We find an abundance of so- greater accuracy by performing direct measurements. called triangle inequality violations in inter-Tor-node An approach that inspires us is that of Gummadi et al. [11], latencies, and show that circuits longer than three hops who aimed to estimate the latency between clients and servers can be used to achieve lower end-to-end latencies. by clever use of recursive DNS queries. Their \King" tool sent a recursive DNS request to a name server associated The rest of this paper is structured as follows. In Sec- with the first host that could only be answered by a name tion 2, we present related work on ascertaining and apply- server associated with the second. Although King required ing round-trip time estimations among arbitrary hosts. We only that one of the two name servers support recursive present the design of Ting in Section 3, and an extensive val- queries, in recent years, DNS servers have stopped respond- idation of Ting in Section 4.