Browser-Based Attacks on Tor
Total Page:16
File Type:pdf, Size:1020Kb
Browser-Based Attacks on Tor Tim Abbott, Katherine Lai, Michael Lieberman, Eric Price {tabbott, k lai, mathmike, ecprice}@mit.edu Abstract. This paper describes a new attack on the anonymity of web browsing with Tor. The attack tricks a user’s web browser into sending a distinctive signal over the Tor network that can be detected using traffic analysis. It is delivered by a malicious exit node using a man-in-the- middle attack on HTTP. Both the attack and the traffic analysis can be performed by an adversary with limited resources. While the attack can only succeed if the attacker controls one of the victim’s entry guards, the method reduces the time required for a traffic analysis attack on Tor from O(nk) to O(n + k), where n is the number of exit nodes and k is the number of entry guards. This paper presents techniques that exploit the Tor exit policy system to greatly simplify the traffic analysis. The fundamental vulnerability exposed by this paper is not specific to Tor but rather to the problem of anonymous web browsing itself. This paper also describes a related attack on users who toggle the use of Tor with the popular Firefox extension Torbutton. 1 Introduction The Internet was not designed with anonymity in mind; in fact, one of the original design goals was accountability [3]. Every packet sent by established protocols identifies both parties. However, most users expect that their Internet communications are and should remain anonymous. As was recently highlighted by the uproar over AOL’s release of a large body of “anonymized” search query data [10], this disparity violates the security principle that systems meet the secu- rity expectations of their users. Some countries have taken a policy of arresting people for expressing dissident opinions on the Internet. Anonymity prevents these opinions from being traced back to their originators, increasing freedom of speech. For applications that can tolerate high latencies, such as electronic mail, there are systems that achieve nearly perfect anonymity [1]. Such anonymity is difficult to achieve with low latency systems such as web browsing, however, because of the conflict between preventing traffic analysis on the flow of packets through the network and delivering packets in an efficient and timely fashion. Because of the obvious importance of the problem, there has been a great deal of recent research on low-latency anonymity systems. Tor, the second-generation onion router, is the largest anonymity network in existence today. In this paper we describe a new scheme for executing a practical timing attack on browsing the web anonymously with Tor. Using this attack, an adversary can identify a fraction of the Tor users who use a malicious exit node and then leave a browser window open for an hour. With current entry guard implementations, the attack requires the adversary to control only a single Tor server in order to identify as much as 0.4% of Tor users targeted by the malicious node (and this probability can be increased roughly linearly by adding more machines). The targeting can be done based on the potential victim’s HTTP traffic (so, for example, one could eventually identify 0.4% of Tor users who read Slashdot). 2 How Tor Works Tor [5] is an anonymizing protocol that uses onion routing to hide the source of TCP traffic. Onion routing is a scheme based on layered encryption, which was first developed for anonymizing electronic mail [1]. As of December 15, 2006, Tor was used by approximately 200,000 users and contained about 750 nodes (also sometimes referred to as “servers” or “routers”) [4]. Fig. 1. A Tor circuit. The client chooses an entry node, a middle node, and an exit node, allowing the exit node to fetch content from a web server. In Tor, a client routes his traffic through a chain of three nodes that he selects from the Tor network, as shown in Figure 1. A client constructs this path of nodes, or “circuit”, by performing Diffie-Hellman handshakes with each of the nodes to exchange symmetric keys for encryption and decryption. These Tor nodes are picked from a list of current servers that are published by a signed directory service.1 To send TCP data through the circuit, the client starts by breaking the stream into fixed sized cells that are successively encrypted with the keys that have been negotiated with each of the nodes in the path, starting with the exit node’s key and ending with the entry node’s key. Fixed size cells are important so that anyone reading the encrypted traffic cannot use cell size to help identify a client [7][9]. Using this protocol, the entry node is the only node that is told the client’s identity, the exit node is the only node that is told the destination’s identity and the actual TCP data sent, and the middle node simply exchanges encrypted cells between the entry node and the exit node along a particular circuit. The nodes are selected approximately randomly using an algorithm dependent on various Tor node statistics distributed by the directory server, some client history, and client preferences. 3 Related Work In May 2006, S. Murdoch and G. Danezis discussed how a website can include “traffic analysis bugs”—invisible signal generators which are used to shape traffic in the Tor network [11]. Our attack uses similar signal generators to attack a Tor client. We rely on the ideas of the papers discussed in the next two sections to deliver the attack and to identify the Tor client. 3.1 Browser Attacks To browse the Internet anonymously using Tor, a user must use an HTTP proxy such as Privoxy so that traffic will be diverted through Tor rather than sent directly over the Internet. This is especially important because browsers will not automatically send DNS queries through a SOCKS proxy. However, pieces of software that plug into the browser, such as Flash, Java, and ActiveX Controls, do not necessarily use the browser’s proxy for their network traffic. Thus, when any of these programs are downloaded and subsequently executed by the web browser, any Internet connections that the programs make will not go through Tor first. Instead, they will establish direct TCP connections, compromising the user’s anonymity, as shown in Figure 2. This attack allows a website to iden- tify its visitors but does not allow a third party to identify Tor users visiting a given website. These active content systems are well-known problems in anony- mous web-browsing, and most anonymizing systems warn users to disable active content systems in their browsers. In October 2006, FortConsult Security [2] described how to extend this attack so that parties could identify Tor users visiting a website they do not control. The attacker uses a malicious exit node to modify HTTP traffic and thus conduct a man-in-the-middle attack, as shown in Figure 3. In particular, it inserts an 1 While the directory service is signed, anyone can add an entry, and claim to have a long uptime and high bandwidth. This makes getting users to use a malicious node a little easier, because clients prefer to use servers with good statistics. Fig. 2. A browser attack using Flash included in a website. The client’s web browser executes a Flash program, which then opens a direct connection to a logger machine, compromising the client’s anonymity. invisible iframe with a reference to some malicious web server and a unique cookie. In rendering the page, the web browser will make a request to the web server and will retrieve a malicious Flash application. If Flash is enabled in the browser, then the Flash movie is played invisibly. The Flash application sends the cookie given to the user directly to the evil web server, circumventing Tor. The web server can then identify which webpages were sent to which users by matching the cookies with the Flash connections. In other words, all Tor users who use HTTP through that exit node while Flash is enabled will have their HTTP traffic associated with their respective IP addresses. However, if we assume that the number of malicious Tor servers is small compared to the total number of Tor servers, a normal user will get a malicious exit node only once in a while. As a result, this attack only works to associate traffic with the particular user for the length of time that the user keeps the same Tor circuit, or at most ten minutes by default. 3.2 Finding Hidden Servers Along with hiding the locations of clients, Tor also supports location-hidden servers, where the clients of a service (for example, visitors to a website) are not able to identify the machine hosting the service. To connect to a hidden server, a client sends a message through an introduction point that is advertised as Fig. 3. A browser attack executed by an exit node. The client’s web browser exe- cutes a Flash program inserted into a webpage by the exit node, which opens a direct connection to a logger machine. being associated with the hidden service by the Tor directory. A clever anony- mous interaction results in the hidden client and hidden server both opening Tor connections to a rendezvous point (chosen by the client). The rendezvous point patches the connections together to form an anonymous channel between the hidden client and hidden server. In May 2006, L. Øverlier and P. Syverson [12] described an attack to locate hidden servers in Tor. The attacker begins by inserting a malicious Tor node into the Tor network and using a Tor client to repeatedly connect to the targeted hidden server, sending a distinctive signal over each Tor connection.