Defending Anonymous Communications Against Passive Logging Attacks

Matthew Wright, Micah Adler, Brian N. Levine, Clay Shields†

[email protected], micah@cs.umass.edu, brian@cs.umass.edu, clay@cs.georgetown.edu

Dept. of Computer Science, University of Massachusetts, Amherst, MA 01003

† Dept. of Computer Science, Georgetown University, Washington, DC 20057

Abstract

We study the threat that passive logging attacks pose to anonymous communications. Previous work analyzed these attacks under limiting assumptions. We first describe a possible defense that comes from breaking the assumption of uniformly random path selection. Our analysis shows that the defense improves anonymity in the static model, where nodes stay in the system, but fails in a dynamic model, in which nodes leave and join. Additionally, we use the dynamic model to show that the intersection attack creates a vulnerability in certain peer-to-peer systems for anonymous communications. We present simulation results that show that attack times are significantly lower in practice than the upper bounds given by previous work. To determine whether users' web traffic has communication patterns required by the attacks, we collected and analyzed the web requests of users. We found that, for our study, frequent and repeated communication to the same web site is common.

(This paper was supported in part by National Science Foundation awards ANI-0087482, ANI-0296194, and EIA-0080199.)

1. Introduction

Designing systems for anonymous communications is a complex and challenging task. Such systems must be secure against attackers at a single point in time; less obviously, they must also protect users from attacks that seek to gain information about users over the lifetime of the system.

In our prior work [21], we analyzed such an attack: the predecessor attack. In this attack, a set of nodes in the anonymous system work together to passively log possible initiators of a stream of communications. With sufficient path reformations, which are unavoidable in practice, the attackers will see the initiator more often than the other nodes. In that prior work, we showed that this attack applied to a class of protocols that included all protocols for anonymous communications that were known at the time. We also gave an analysis that placed bounds on how long the attacks would take to run for a number of specific protocols.

In constructing the attack and analysis in that paper, we made several simplifying assumptions about how the protocols operated. Here we examine the effects of relaxing each assumption. Specifically, we assumed:

1. The subset of nodes that forward an initiator's messages are chosen uniformly at random;

2. Users make repeated connections to specific responders, which are outside points of communication;

3. Nodes do not join or leave the session.

These assumptions were necessary for the proof we provided that showed that the attack works in all cases, and they were also critical to our analysis of the bounds on the time required for a successful attack. We argued in that paper that the assumptions are reasonable based on existing protocols.

In this paper, we examine more closely the universal applicability of predecessor attacks against anonymous protocols. We examine our assumptions and the effect that relaxing those assumptions has on the effectiveness of the attack. Our specific contributions are:

- First, we show that defenses that use non-random selection of nodes for path creation offer significant protection for the initiator in a stable system.

- Second, we show that the design of some existing peer-to-peer systems for anonymous communications leads to a practical and efficient intersection attack.

- Third, we examine the practical effectiveness of the attacks through simulation. Our previous work proved analytical upper bounds on the number of rounds required; the simulations in this paper demonstrate that attackers can be successful in significantly fewer rounds than the maximums guaranteed by the bounds. E.g., attacks on Onion Routing and Crowds, with 1000 nodes and 100 attackers, succeed in one-fifth of the rounds guaranteed by the upper bounds.

- Fourth, we characterize measurements taken from two web proxies to show the actual frequency and duration of user activities on the Web. This study allowed us to make some observations about user behavior with regard to our assumptions.

The body of this paper is organized around those goals. In Section 2, we review related work. We then present and analyze the effectiveness of new techniques for avoiding predecessor attacks in a static model in Section 3. In Section 4 we use a dynamic model to study the intersection attack and the defenses introduced in Section 3. We describe the results of our simulations of the predecessor attack in Section 5. In Section 6, we show how often users go to the same website from data obtained by tracking real users. We offer concluding remarks in Section 7.

2. Background

In this section, we review the results from our prior work that serve as the foundation of this paper. We also describe related material.

2.1. Our Prior Work

Our previous work described the predecessor attack, first discovered by Reiter and Rubin as an attack against Crowds [15]. The primary purpose of our current work is to extend our previous results in the analysis of anonymous protocols. In this section, we review the definitions, methods, and results of that work, and we refer the reader to the full paper if greater detail is desired [21].

The first contribution of our previous work was to define a class of protocols, which included all known protocols for anonymous communication, and to prove that the class degrades against the predecessor attack. We defined an active set as the set of nodes used by the initiator of a communication to propagate its message through the network. This can be, for example, a path in Crowds or the set of nodes that share coin flips with the initiator in a DC-Net.

For any protocol inside our class, we required that the active set be chosen uniformly at random many times. In current protocols, that assumption holds because active sets change each time a node is allowed to join the network. If active sets did not change, then messages from recently joining nodes would be easily identified. However, it is not necessary that the new active sets are chosen uniformly at random; in Section 3, we explore ways to choose paths that exploit this fact, with the intention of defending against degradation of anonymity.

The second major result in our prior work was a set of analytic bounds describing how long it might take for attackers using the predecessor attack to effectively degrade the anonymity of a user in Crowds, Onion Routing, and Mix-Nets. We gave bounds on the number of rounds, i.e., periodic changes in the active set, that guarantee for the attackers a high probability of success in guessing the initiator. We use the following notation: n is the number of nodes in the network, c is the number of those nodes that are attackers, and l is the fixed path length of Onion Routing or Mix-Nets.

Against Crowds, the attackers require O((n/c) ln n) rounds to identify the initiator with high probability. For the same level of confidence, the attackers need O((n/c)^2 ln n) rounds against Onion Routing and O((n/c)^l ln n) rounds against Mix-Nets. In Section 5, we provide simulation results that are tighter than these bounds and show how the confidence of the attackers grows over time.

In the prior work, we also assumed that rounds occurred regularly, and that the initiator communicated with the responder in every round. If nodes are allowed to leave and join the protocol, then attackers can force rounds to occur as often as the system allows by simply having corrupt nodes join and leave. However, they cannot force the initiator to communicate with the responder during a round, which is necessary for the attackers to get data on the identity of the initiator. If the initiator rarely communicates with the responder, then the amount of time it takes for the attackers to get data from enough rounds can be very large. In Section 6, we use logs of Web usage to examine how many rounds attackers can expect to get data from over time.
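The intuition behind these round bounds can be checked with a small Monte Carlo sketch (our illustration, not the simulator of Section 5; the parameters n, c, and the forwarding probability are assumed values): in every round a Crowds-style path is rebuilt, the first attacker on the path logs its predecessor, and the initiator accumulates loggings at rate about c/n per round, far faster than any honest bystander.

```python
import random
from collections import Counter

def crowds_round(n, c, initiator, p_forward=0.75):
    """One Crowds-style round: nodes 0..c-1 are attackers. Returns the node
    that the first attacker on the path logs as its predecessor, or None if
    the path reaches the responder without touching an attacker."""
    prev, cur = initiator, random.randrange(n)   # initiator picks a first relay
    while True:
        if cur < c:                              # first attacker: log predecessor
            return prev
        if random.random() > p_forward:          # relay delivers to the responder
            return None
        prev, cur = cur, random.randrange(n)     # relay forwards to a random node

def predecessor_attack(n=100, c=10, rounds=2000, seed=1):
    random.seed(seed)
    initiator = c                                # an arbitrary honest node
    logged = Counter()
    for _ in range(rounds):
        pred = crowds_round(n, c, initiator)
        if pred is not None:
            logged[pred] += 1
    return logged, initiator

logged, initiator = predecessor_attack()
print(logged.most_common(1)[0][0] == initiator)  # the most-logged node is the initiator
```

Any other honest node is logged only when the path happens to pass through it immediately before the first attacker, so after enough rounds the initiator's count dominates.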
2.2. Related Work

A number of papers have addressed attacks against systems of anonymous communications. The creators of Crowds [15], Onion Routing [20], Hordes [12], Freedom [1], Tarzan [8], Stop-and-Go Mixes [10], and others have provided analysis of their protocols against some attacks.

Only a few of these analyses consider the degradation of anonymity over time, including Reiter and Rubin's seminal work [15]. Berthold, et al., discuss an intersection attack against the anonymity groups that arise when multiple mix routes are chosen [3]. In this attack, the different anonymity groups are intersected with each other to shrink the number of possible initiators.

Raymond also discusses an intersection attack based on observations of user activity [13]. Only the active users can be the initiator, and the set of active users changes over time. Intersecting the sets of active users reduces the set of possible initiators. We explore this idea further in Section 4.1.

More recently, Shmatikov used formal analysis and model checking to verify the efficacy of the predecessor attack against Crowds [19]. Due to the high processing requirements of model checking, he was only able to demonstrate probabilities of attacker success for small numbers of nodes, i.e., twenty or fewer. In this paper, we use simulation to extend these results to thousands of nodes with an acceptable loss of precision.

One part of our work is to study users' web surfing behavior to determine whether users visit the same site repeatedly over time. Given the large number of studies on the Web and on user behavior, one might believe that this work would have been done already. We are unaware of any studies that show longer-term user behavior at the level of detail that we require.

Work by Baryshnikov, et al., tries to predict traffic patterns based on content, but from the point of view of servers handling load, not considering single users [2]. Other papers on traffic prediction, including work by Duchamp [6] and work by Davison [5], model users based on recent content. In Duchamp, only the last 50 requests are used, while the newer work by Davison only uses the last five HTML documents. In this work, we seek patterns over much longer periods.

2.3. The Predecessor Attack on Recent Protocols

Recently, several significant protocols for anonymous communications have been published. In this section, we discuss some of these protocols and their resistance to the predecessor attack.

One of these is P5, by Sherwood, et al. [18]. This protocol is designed for anonymity between peers connecting to each other, rather than outside responders. It could, however, be adapted to outside communication by using destination peers as the final proxy to the rest of the Internet. P5 uses a tree-based broadcast protocol, where a user's anonymity is based on the sizes of the different broadcast groups in which she is a member.

The authors assume that "users do not leave once they join" to prevent a decline in users' anonymity [18]. Without this assumption, anonymity groups would shrink, leading to degradation of anonymity within the groups. We expect that this assumption does not hold well in today's networks, in which nodes may frequently shut down. When anonymity groups become too small, users must recreate a new communication tree, including a new key. This expensive step keeps the protocol from being vulnerable to the predecessor attack when the number of attackers is less than the size of the user's anonymity group.

The second protocol, Tarzan, by Freedman, et al. [8], is a peer-to-peer system built at the network layer. From the perspective of our analyses, which assume a peer-to-peer setting, Tarzan may be considered a variant of Onion Routing, as it uses Onion Routing-style layered encryption.

Tarzan achieves a higher level of practical security in some scenarios by having nodes select relays according to random domain selection. This means that attackers cannot overload the network with malicious nodes from within the same domain, as initiating nodes will not select proxies from that domain with any greater frequency. Attackers can gain an advantage, however, when the number of honest domains represented in the Tarzan group is small. An attacker may be able to operate nodes from domains with no honest Tarzan participants. With attackers in a few such domains, attackers could make it likely to appear on an initiator's path despite only operating a few corrupt nodes.

Another protocol, MorphMix, is also a peer-to-peer path-based protocol [16]. It allows honest participants to find attackers in the system, with the unacceptable cost of allowing attackers to create paths with only attackers, with high probability. Unlike in Tarzan, the peers do not need to know all other peers in the network to operate correctly. We will see how this property is desirable for anonymity in Section 4.1.

Additionally, two mutual anonymity protocols have been presented recently [22, 11]. These protocols, like P5, are designed to hide the identities of both communicating parties from each other and from third parties. Both of these protocols are, like Tarzan, variants of Onion Routing, but with anonymity for both the client and server. Since they are designed for file sharing, paths are not maintained. This makes them highly vulnerable to the predecessor attack, as new paths create more opportunities for attackers to be on the path. However, repeated connections to the same responder may not be as common for file sharing as in other applications, and are in fact easier to avoid.

3. Path Selection in Static Systems

In this section, we examine the assumption of our previous analysis of anonymous protocols that nodes are selected by the initiator uniformly at random for forwarding messages in the protocol. We show the effects of non-uniform selection of nodes during path selection. We consider a static set of nodes and attackers using passive logging, and review three different path creation strategies.

This model is based on the Onion Routing protocol [14]. We study Onion Routing because it provides an intermediate level of both security and performance among path-based protocols. It is significantly more secure than Crowds against the predecessor attack, as described in our previous work. Fully-connected DC-Net and Mix-Nets both provide higher security than Onion Routing, but they have significant additional costs, both for initiators and for intermediate nodes.

Briefly stated, the predecessor attack for Onion Routing is to use two attackers. When the attackers occupy the last node on the path and one earlier node, they log the node before the first attacker. The intuitive reason that the predecessor attack is successful is that the initiator is observed in more rounds than any other node as the node before the first attacker.

3.1. Model

We assume a set of nodes N, where the size of N is fixed at n. We also assume a set of attackers C, a subset of N, consisting of c = |C| attackers. The remaining n - c nodes in N are not trusted, but do not share information with the set of attackers. All nodes follow protocol forwarding rules correctly. In our static model, the membership of N and C does not change. This property is not true in the dynamic model that we consider in Section 4.

Each node in N communicates uniquely with one corresponding responder, a node that is not a member of N. The responder does not collaborate with other attackers, but it tracks the identity of nodes with which it communicates and should not be trusted. In each round, each node creates a path of proxies, consisting of other nodes in N, between itself and its corresponding responder. It sends at least one message in each round. For the rest of this section we only consider a single initiator I, whose communications with responder R are being tracked by the attackers.

In this model, attackers may only passively log information obtained while running the protocol in the normal fashion. They have no ability to eavesdrop on or corrupt any other nodes in N. Nor do they disrupt the protocol by modifying messages, failing to deliver messages, or performing other denial-of-service attacks.

Within this model, we consider fixed-node defenses, in which the initiator selects some nodes to always appear in certain positions on the path. For this model, no mix-like techniques are used. Specifically, we assume a cost-free attack exists that allows attackers to determine if they are on the same path, e.g., by analysis of the timings of packet arrivals.

Note that the attackers may be able to force rounds to occur at short, regular intervals by simply leaving and rejoining the system. The system must start a new round to allow a returning node to resume communications. If the system delays the start of the next round for very long, it will be a great inconvenience to the user of a returning node.

We considered a set of defenses for the static membership model that contrast with the assumption of uniformly random path selection in our previous work. We considered the following cases, in which the initiator fixes the placement of nodes: fixed first position; fixed last position; fixed first and last positions; and nodes fixed in other positions. A summary of our results is presented in Table 1.

3.1.1 Fixed First Node Defense

In the static model, we study the idea of fixing nodes in certain positions on the path as a defense against predecessor attacks. There are several variations of this technique that depend on the positions that the initiator chooses to fix: first position, last position, or first and last positions. Fixing the first position is equivalent to the setup attack of our previous work [21]; thus, this subsection describes a generalization of that technique. For all scenarios, we assume that the nodes for all other positions are selected uniformly at random from all of N. Additionally, the fixed nodes also are selected uniformly from all of N at the creation of the first path.

Defense Technique     | Prob. of Success   | If successful,        | If successful,
                      |                    | rounds req.,          | rounds req.,
                      |                    | expectation           | bounded w.h.p.
Static node membership model:
O.R. model [21]       | 1                  | O((n/c)^2)            | O((n/c)^2 ln n)
Fixed placement of nodes:
  First               | c/n                | (n-1)/c               | O((n/c) ln n)
  Last                | c/n                | (n-1)/c               | O((n/c) ln n)
  First and Last      | c(c-1)/(n(n-1))    | 1                     | 1

Table 1. The probability of success and the number of rounds for a successful predecessor attack against various defenses in the static membership model.
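Table 1's contrast between fixed and per-round random placement can be checked numerically. In this sketch (using n = 1000 and c = 100, the values assumed elsewhere in the paper), the fixed first-and-last defense is exposed with a small, constant probability, while re-choosing every position each round drives the cumulative exposure probability toward one.

```python
def both_ends_attackers(n, c):
    """Chance that the first and last path positions are both attackers
    (two draws without replacement from n nodes, c of them attackers)."""
    return (c / n) * ((c - 1) / (n - 1))

def exposure_by_round(n, c, rounds):
    """Cumulative exposure probability when the whole path is re-chosen
    uniformly at random every round (the model of [21])."""
    q = both_ends_attackers(n, c)
    return 1 - (1 - q) ** rounds

n, c = 1000, 100
print(round(both_ends_attackers(n, c), 4))      # fixed first and last: constant for all rounds
print(round(exposure_by_round(n, c, 1000), 4))  # uniform selection: approaches one
```

The fixed defense risks roughly c^2/n^2 once and never again, which is why Table 1 lists its success probability as constant while the uniform model's probability of success is 1 given enough rounds.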

The first approach to a fixed-node defense is to use one other node continually in the first position on the path. We call the node selected for this purpose the helper node, H. This defense protects the initiator from ever being tracked by the attackers, except if the node picked as H is unfortunately an attacker. If, however, H is an attacker, then all of the initiator's messages will be observed by the attackers.

We now consider what happens when H is an attacker. We note that the initiator is not immediately exposed. An attacker must appear at the end of the path to see the initiator's messages to the responder and determine jointly with H that the two attackers are on the same path. Only then will the initiator be exposed.

The last node is selected at random from N, excepting I, and there is a c/(n-1) chance that the node selected is an attacker. Note that I should never select H for the last position, as that allows H to immediately expose I as the initiator. Thus, in an expected (n-1)/c rounds, I will be exposed as the initiator.

Since the probability of H being an attacker is c/n, that serves as an upper limit on the probability that the initiator is ever exposed. Due to the changing last node, the probability of I's exposure grows towards that upper limit as it becomes more likely that the last node on the path has been an attacker. This compares favorably with choosing all nodes on the path uniformly at random every round, which results in a probability of exposure that grows, albeit more slowly, towards one. We demonstrate this by simulation in Section 5; see Figure 8.

3.1.2 Fixed Last Position

A slight variation on this approach is to put H statically as the last proxy in the path. This approach also keeps node I from being exposed, as long as H can be trusted, since all of the communications to R are hidden from the attackers. If H is not an attacker, then when the attackers run the predecessor attack on messages to R, they will see H as the initiator instead of I.

As with the fixed first position, an attacker selected as H will require another attacker to be selected as well before the initiator can be exposed. In this case, it is the first node, which will be an attacker with probability c/(n-1) in each round. In O((n/c) ln n) rounds, the initiator is identified by the attackers, given that H is an attacker, with high probability.

Again, there is a limit of c/n on the probability of the attackers being successful, based on the chance of selecting an attacker for H. So the probability of the attackers being successful again grows towards its maximum value of c/n and becomes very close to that bound in fewer than O((n/c) ln n) rounds. (See Figure 8.)

3.1.3 Fixed First and Last Position

The third variation is to keep both the first and last positions on the path fixed to the same nodes. The two positions should be set to different nodes in N, because if they were the same, then a single node could expose I as the initiator. In this scenario, if either the first or the last node is not an attacker, then I is safe from attack. If, however, both nodes are attackers, then I is exposed in the first round.

The probability of I having been exposed, then, is only c(c-1)/(n(n-1)) and stays constant for all rounds of the protocol. Since fixing only one node leads to an eventual c/n probability of exposure, this makes fixing nodes in both positions the best known choice for the initiator in the static model, given sufficient rounds. (See Figure 8.)

We study the effect of node instability on this approach in Section 4.

4. Attacks Against Dynamic Systems

In this section, we examine the effects on initiator anonymity of membership changes in the set of proxies. We find that the membership dynamics of the system greatly affect initiator anonymity. Using the same model assumptions as in the static case, we add the notion that nodes leave the system with independent and identical distributions. We call the length of the time they are active in the system their uptime.

This model attempts to capture the consequences of having users that stop making communications or stop the peer service on their local nodes. This occurs if users do not leave their systems turned on and connected to the network all the time. It is not clear whether users of anonymous communications systems would leave their systems on and available for long, uninterrupted periods.

Studies of peer-to-peer file-sharing systems, however, show that most users of these systems do not stay connected for long periods of time. According to Sariou, et al. [17], peers using either Napster or Gnutella were not often available for IP-level connections, much less for file sharing; only a small fraction of these peers were available for a majority of the measurement period. For actual file-sharing availability, the median session time was only 60 minutes. If users of anonymous communications systems behave similarly, then system designs must account for a dynamic user set.

4.1. Intersection Attack

If an attacker can obtain a collection of distinct sets of nodes that each contain the initiator, then simply intersecting those sets can greatly narrow the attacker's search. If the intersection leaves only one node, this technique can completely expose the initiator. We refer to this technique as the intersection attack [13]. This attack is well known, and there are few known means of defending against it. The designers of the Java Anon Proxy incorporated an "anonym-o-meter" to show users their current level of vulnerability to this attack [7, 3].

Some peer-to-peer systems are particularly vulnerable to this attack, including Tarzan and Crowds. In these two protocols, each participating peer knows the other available peers' IP addresses. This is critical for proper operation, because in these protocols peers communicate directly. In describing Tarzan, Freedman, et al. [8], argue that having an incomplete list leaves the initiator vulnerable to other attacks in the peer selection process. This list of peers, however, gives the attacker exactly what she needs to perform the intersection attack.

To conduct this attack, the attacker only needs to keep a single peer in the system to obtain these lists. For every round in which the initiator contacts the responder, the attacker can add the list of peers to her collection of sets of nodes. As nodes leave and join the system, the intersection of those lists becomes smaller. In each round in which the initiator does not communicate with the responder, the attacker does not take a list for intersection. This increases the total attack time by one round each time it occurs.

The attacker can, over time, use the intersection attack to uniquely identify the initiator. Alternatively, when the attacker has reduced the set of possible initiators to a few nodes, she can use other techniques to distinguish the initiator from the other nodes in the intersection. For example, the attacker could eavesdrop on the communications of the remaining nodes to identify the initiator through packet timing.

4.1.1 Analysis

We now show that the attack works efficiently in our model, given our assumptions about users leaving the network. We discuss other assumptions of our model in Section 4.1.2.

We model the uptime of users first with an exponential distribution and then with a Pareto distribution. Pareto distributions have heavy-tailed behavior that corresponds roughly with the uptime of nodes observed in peer-to-peer file-sharing systems [4]. A Pareto distribution corresponds to some nodes remaining active for very long periods, while many other nodes join the system for only a short time.

We do not consider nodes that leave and join the system in the same round (i.e., if they become available in the next round) as having left at all. Our distributions are constructed such that such short-term failures are not considered events. Note that nodes newly joining the system do not affect the intersection attack, nor do nodes that intermittently re-join.

In the exponential model, we say that the average uptime for a user is 1/λ. Therefore, λ can be thought of as the failure rate of the user's node. The probability that a given node has left after t rounds is given by the cumulative distribution function F(t) = 1 - e^(-λt).

Defense Technique                             | Rounds for high probability p
                                              | of attacker success
Dynamic node membership model:
Intersection attack:
  Exponential dist. session with parameter λ  | t = -(1/λ) ln(1 - p^(1/(n-1)))
  Pareto dist. session with parameter α       | t = (1 - p^(1/(n-1)))^(-1/α)
Fixed first and last:
  Constant session length T                   | approx. T (n/c)^2 in expectation

Table 2. Number of rounds for a successful predecessor attack for various defenses.
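As a numeric check of the intersection-attack rows of Table 2, the closed forms can be evaluated directly (p, n, and the mean uptime are the paper's running parameters; the specific value α = 168/167, which makes the Pareto mean α/(α-1) equal 168 rounds, is our assumption):

```python
import math

def rounds_exponential(p, n, mean_uptime):
    """Rounds until all n-1 other nodes have left with probability p,
    for exponentially distributed uptimes with the given mean."""
    lam = 1 / mean_uptime
    return -(1 / lam) * math.log(1 - p ** (1 / (n - 1)))

def rounds_pareto(p, n, alpha):
    """Same quantity for Pareto-distributed uptimes (minimum one round)."""
    return (1 - p ** (1 / (n - 1))) ** (-1 / alpha)

n, p = 1000, 0.95
t_exp = rounds_exponential(p, n, mean_uptime=168)
alpha = 168 / 167                 # assumed: mean alpha/(alpha-1) = 168 rounds
t_par = rounds_pareto(p, n, alpha)
print(round(t_exp), round(t_par))
```

For p = 0.95 this gives roughly 1,660 rounds in the exponential model, just under ten weeks at one round per hour, versus more than ten times as many rounds in the Pareto model, matching the contrast drawn around Figures 1 and 2.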

✞ ✒ (i.i.d.) processes. Thus, the probability that all ure 1, we see that the probability of total exposure, when

nodes other than the initiator have left after ✠ rounds is all other nodes have left, grows much more slowly than in

✞✗✖✂✳

☎✌☞✜✚

☛ ✠ ✍ ✏✲✱ ✒ ✞ ✔ ☞ . We can also find the number the exponential model. The chance of exposure becomes

of rounds, ✠ , required for the attacker's chance of ex- non-negligible in far fewer rounds, but the attacker is not

posing the initiator to reach a value ✴ . This is given by guaranteed a high probability of success for a very long

✂ ✢

✠ ✏ ✞ ✚ ✵✱ ✒ ✞✶✴ ✠

✸✷ ☎✌☞✜✚✥✤

✚ . Note that is linearly depen- time. This is due to the heavy-tailed propertyof the Pareto ✞ denton the average uptime, ✚ . distribution. Many nodes will leave the system early in the As we see in Figure 1, the probabilityof all nodes leav- course of the attack, but it is likely that a few nodes will ing rises quickly to one. We show curves for the exponen- remain in the system for a long time. Also, the average tial distribution when the average uptime is one and four time to leave makes little difference for the averages we weeks, or 168 and 672 rounds, respectively. The round used. We believe that this is because the chance of expo- length is one hour, though the clock time required for the sure mostly depends on whether other nodes remain for a attack does not change with the round length. Longer long time, regardless of how much longer that time may round lengths may mean longer average uptimes, though, be. as itis more likely thata node thatdisconnects will be able The possibility of long periods withoutcomplete expo- to reconnect before the next round starts. For this reason, sure does notmake the initiator safe from the intersection the round length must be significantly less than the aver- attack. This is because the number of possible initiators

age uptime for the analysis to hold. We show results for a will quickly narrow to a small fraction of the remaining

✆ ✞ ✒

system with 1,000 total nodes. nodes. The probabilitythat ◆ nodes will leave the

☛ ✠ ✍ ✏

After ten weeks, with an average uptime of one week, system after ✠ rounds is given by the binomial

✳✤❯ ❯ ✳ ❯

❘◗❚❙

☎ ☞✛✚

☎✌☞✛✚

✏❱✱ ✒ ✞ ✚ ✱ ☛ ✒ ✞ ✍

☎✌☞ ☞✜✚

✺✹ ✆✼✻★✽ there is a ✁ probability that all nodes have left except , where .

the initiator. When the average uptime is four weeks, the time to reach the same chance that all nodes have left is 40 weeks. This agrees with the linear relationship between the average uptime and t.

The Pareto distribution gives us a different picture of how quickly the intersection attack works. For this model, writing k for the scale parameter and α for the shape parameter, the average uptime is given by kα/(α−1), where α can be set to give the desired average time. (We chose α so that the average uptime is 168 hours, i.e., 1 week.) The probability that a node leaves before time t is P[T < t] = 1 − (k/t)^α. Again, the nodes can be considered i.i.d. processes, and the probability that all n−1 nodes leave by time t is given by (1 − (k/t)^α)^(n−1). We can rearrange this formula to find the time the attacker requires to expose the initiator with probability P:

    t = k (1 − P^(1/(n−1)))^(−1/α).

Figure 1. Dynamic membership model: the intersection attack. The probability, for n = 1,000 nodes, of all nodes other than the initiator being intersected out of the set of possible initiators. Curves are shown for exponential and Pareto uptime distributions with one-week and four-week averages.

Figure 2. Dynamic membership model: the probability of having fewer than ten (99% of nodes leaving) and fewer than five (99.5%) nodes other than the initiator remaining in the system, and the probability of having no other nodes remaining, over time, where the system starts with n = 1,000 nodes.

In Figure 2, we compare the probabilities that all but five and all but ten nodes leave the system, along with the probability that all nodes leave the system. We observe that the attacker reduces the list of possible initiators to a small fraction of the original nodes much faster than she can isolate the initiator completely. When the average time to node failure is one week, the attacker has a list of ten or fewer nodes with high probability in less than two weeks; in the same two weeks, the attacker can get a list of five or fewer nodes with somewhat lower probability.

4.1.2 Caveats

The intersection attack does not reveal the initiator in all situations, because the technique requires that the communication between the initiator and the responder be uniquely identifiable. In other words, the initiator must be the only node in the system communicating with the responder, or there must be some identifying information in their communications. Two initiators contacting the same responder can cause each other to be removed from the intersection, as one node communicates with the responder in a round while the other node is not in the system. While it is possible to extend the attack to pairs of nodes and to even larger groupings, doing so makes the attack more complicated and time consuming.

Another caveat is that the attacker's list must include all currently participating nodes for the intersection to be accurate. The lists do not need to be exact, but they must not omit any participating nodes. A reasonable implementation of most protocols does not require such a complete list. The attacker can address this issue in some protocols by placing more corrupt nodes in the system, which helps to make the list of currently participating nodes more complete. Also, attackers would not want to eliminate nodes unless they are known to be unavailable.
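The leave-time calculations above can be sanity-checked numerically. The sketch below is our own reconstruction, assuming i.i.d. uptimes with the standard exponential and Pareto(k, α) forms consistent with the formulas above; the function names and parameters are ours, not the paper's.

```python
import math

def time_all_leave_exponential(n, mean_uptime, p):
    """Time t by which all n-1 nodes other than the initiator have left
    with probability p, for i.i.d. exponential uptimes with the given
    mean: solve (1 - exp(-t/mean))^(n-1) = p for t."""
    per_node = p ** (1.0 / (n - 1))          # required P(one node has left)
    return -mean_uptime * math.log(1.0 - per_node)

def time_all_leave_pareto(n, k, alpha, p):
    """Same question for Pareto(k, alpha) uptimes:
    solve (1 - (k/t)^alpha)^(n-1) = p for t."""
    per_node = p ** (1.0 / (n - 1))
    return k * (1.0 - per_node) ** (-1.0 / alpha)

# Example with hypothetical parameters: n = 1000 nodes, 168-hour mean uptime.
t_exp = time_all_leave_exponential(1000, 168.0, 0.5)
```

Note that the exponential answer is exactly linear in the mean uptime, matching the linear relationship observed above (four times the average uptime, four times the wait).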

4.2 Fixed Nodes with Instability

In Section 3.1.3, we described a method in which an initiator selects two nodes to permanently occupy the first and last positions on its path. This allows the initiator to keep attackers from linking it to any responder, as long as at least one of the selected nodes was not an attacker. This defense has different properties in the dynamic model, as it depends on the stability of nodes: it requires the two chosen nodes to remain in the system as long as the initiator continues to contact a given responder. If the nodes leave, the initiator must select new nodes to take their place. Every replacement of the nodes increases the chance that two attackers hold both positions simultaneously. The probability of exposure, then, grows closer to one over time.

We assume that nodes leave the network after a fixed, constant amount of time. In order to make comparisons with prior work and with results from Section 5, we measure time in rounds. Let us say that a node remains in the system for exactly R rounds, and suppose that all nodes join the system at the same time. Then it is as if the system resets and forces new path selection every R rounds.

Every time new first and last nodes must be selected, the initiator selects them uniformly at random from the n nodes. With c attackers, this creates a c(c−1)/(n(n−1)) probability of selecting attackers for both positions. We want to know the number of resets, r, such that the probability of attacker success, P', is at least some value p. Since P' = 1 − (1 − c(c−1)/(n(n−1)))^r, substituting P' = p and solving for r gives

    r = ln(1 − p) / ln(1 − c(c−1)/(n(n−1))).

This is the number of resets sufficient to make the attackers successful with probability at least p. If resets occur every R rounds, then rR rounds are required for the attackers to be successful with probability p (see Table 2).
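As a check on the reset analysis, the following minimal sketch computes the required number of resets and total rounds. It is our reconstruction of the formulas above; the per-reset success probability q = c(c−1)/(n(n−1)) is the assumed chance that both fixed positions are drawn corrupt.

```python
import math

def resets_to_expose(n, c, p):
    """Smallest number of path resets r such that, with probability at
    least p, the attackers have occupied both the fixed first and last
    positions at least once.  Per-reset success probability is
    q = c(c-1)/(n(n-1)): both positions drawn without replacement from
    n nodes, c of which are corrupt."""
    q = c * (c - 1) / (n * (n - 1))
    return math.ceil(math.log(1.0 - p) / math.log(1.0 - q))

def rounds_to_expose(n, c, p, uptime_rounds):
    """If every node stays exactly `uptime_rounds` rounds, the path
    resets that often, so the total attack time is r * uptime_rounds."""
    return resets_to_expose(n, c, p) * uptime_rounds
```

Taking the ceiling makes r the minimal whole number of resets that achieves the target probability.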

Figure 3. Dynamic model: Probability of attackers' success when first and last nodes are fixed. The time is given in the number of resets needed. We show results for n = 1,000 nodes and c = 50, 100, and 200 attackers.

In Figure 3, we see that the probability of the attackers succeeding grows towards one as the number of resets increases. The similarity to the simulation results for the predecessor attack in Section 5, for uniformly random path selection, suggests that fixing the first and last nodes may not provide significantly stronger security over time when nodes leave the system frequently. This is in contrast with the results shown for the static model in Figure 8, in which fixing the first and last nodes appears to be the best strategy. In general, fixing nodes on the path is only a good strategy when nodes leave the system only after long stays, or not at all.

5. Simulating the Predecessor Attack

In our prior work, we provided upper bounds on the number of rounds required for attackers to perform the predecessor attack against nodes running Crowds, Onion Routing, and Mix-Nets. While these bounds helped to indicate the relative performance of each protocol against the attacks, we did not show them to be tight.

In this section, we present simulation results showing the number of rounds required for attackers to have confidence that they have found the initiator. Additionally, we have simulated a number of the methods for selecting Onion Routing paths described in Section 3 and compare those results against uniform path selection.

5.1. Simulation Methodology

We simulated long-running Crowds-based and Onion Routing-based sessions. We chose these protocols as models of systems with relatively low overhead. More costly protocols, such as a fully-connected DC-Net and P5, do not fall to the predecessor attack under reasonable attacker models, e.g., with fewer attacker nodes than the size of the anonymity set. With path lengths of five and greater, the expected time to attack Mix-Nets also falls far outside the times we see in these simulations [21].

In our simulations, one initiator sets up a path each round by selecting the nodes in the path. In each round, if attackers appear on the path, they log the predecessor of the first attacker. Attackers in the Onion Routing model know when they are on the same path (e.g., by timing analysis in a real implementation).

Each data point in our graphed results is based on the average of ten runs. Each run consisted of 10,000 separate simulations of path formation for a given number of nodes and collaborating attackers, and each run used a different seed for random number generation. For each of the 10,000 simulations in a run, the attackers waited a given number of rounds and then guessed the initiator's identity based on their current logging information. We then determined the percentage of the 10,000 simulations in which the attackers guessed correctly. This approach provides a confidence level for the attackers after any given number of rounds.

Performing this ten times enabled us to give an average and determine the variation in the results. In all, we ran over 66.8 million simulations. For all graphs we computed standard deviations; however, they are not shown in most figures because they are too small to be significant (less than one percent for all data, and less than one tenth of one percent for high probabilities of attacker success) and would clutter the graphs.

5.2. Results

Figure 4 compares simulation results of attacker success for Crowds and Onion Routing under the static membership model. The graph shows, for both protocols, the average number of simulations out of 10,000 that correctly determined the initiator for a given number of rounds. In all simulations, there are n = 1,000 total nodes and the average path length is 10. We show a dotted line at 0.50, as that is the dividing line between Probable Innocence, where the attackers' best guesses are more than 50% likely to be wrong, and Possible Innocence, where the attackers' best guesses are more than 50% likely to be correct [15].
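The round-by-round loop of Section 5.1 can be sketched compactly. The code below is a deliberately simplified stand-in, not the authors' simulator: paths are drawn uniformly with replacement, one logging rule (predecessor of the first attacker) is applied, and the node and attacker labels are our own.

```python
import random
from collections import Counter

def simulate_predecessor(n, c, path_len, rounds, trials, seed=0):
    """Monte Carlo sketch of the predecessor attack.  Node 0 is the
    initiator; nodes n-c .. n-1 are attackers.  Each round the
    initiator draws a path uniformly at random; if an attacker appears
    on the path, the attackers log the node immediately preceding the
    first attacker (the initiator itself when the first hop is
    corrupt).  After `rounds` rounds they guess the most-logged node.
    Returns the fraction of trials in which the guess was correct."""
    rng = random.Random(seed)
    attackers = set(range(n - c, n))
    correct = 0
    for _ in range(trials):
        logged = Counter()
        for _ in range(rounds):
            path = [rng.randrange(n) for _ in range(path_len)]
            for i, node in enumerate(path):
                if node in attackers:
                    # Predecessor of the first attacker; the initiator
                    # (node 0) precedes the first hop.
                    logged[path[i - 1] if i > 0 else 0] += 1
                    break
        if logged and logged.most_common(1)[0][0] == 0:
            correct += 1
    return correct / trials
```

Even this crude model reproduces the qualitative behavior reported in this section: the attackers' confidence grows with the number of rounds observed, since the initiator is logged more often than any other node.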

Although this line is arbitrary, it represents a point at which the initiator's anonymity is compromised beyond an acceptable level, although the attackers cannot yet be highly confident in their guess.

Figure 4. Static membership model: Simulation of the predecessor attack against Crowds and Onion Routing, for varying numbers of attackers (c = 50, 100, and 200 for each protocol).

Figure 5. Static membership model: The predecessor attack against Onion Routing, top one percent. Simulation and bound curves are shown for c = 50, 100, and 200, along with the (n−2)/n line.

We compare Onion Routing paths and Crowds paths of approximately equal length. It is not clear that the performance of each protocol with the same path length is equivalent, and Crowds path lengths will vary. If, however, network transfer times dominate the delay along paths for systems using either protocol, then equal path lengths provide a useful comparison. To make the path lengths similar, we selected several path lengths in Onion Routing and matched each of those with a Crowds probability of forwarding (p_f) that gave the same average path length.

From Figure 4, we see that Onion Routing lasts longer against the predecessor attack than a similar configuration of Crowds. The leftmost three lines at the 50% point are all for Crowds, while the three rightmost lines are for Onion Routing. For example, with c = 100, which gives us n/c = 10, the attackers require only 14 to 16 rounds against Crowds to get a 50% chance of exposing the initiator. For the same value of c in Onion Routing, the attackers require between 140 and 160 rounds. This ten-fold increase in the number of rounds appears to match the additional factor of n/c found in the analytical results from our previous paper. This trend can be seen throughout the graph.

We also see from Figure 4 that the larger the value of n/c, the longer the predecessor attack takes. For example, with Onion Routing, c = 50 attackers require between 576 and 640 rounds to find the initiator with 50% or better probability. This is about four times longer than for c = 100, for which n/c is half as large. This is also in line with our analytical results for Onion Routing, which show that the number of rounds required has a squared dependence on n/c. This dependence holds for most values, but the differences are greater for low numbers of rounds. The linear dependence on n/c for Crowds also holds for most values, again excepting the lowest attacker counts.

In Figure 5, we show the predecessor attack against Onion Routing as in Figure 4, but only for when the probability of the attackers guessing the initiator is greater than 0.99. A line is drawn where the attacker probability of success is (n−2)/n, which is the standard of high probability we set in the analysis from our previous paper and in Section 3.

From the figure, we see that the numbers of rounds guaranteed by the analytic bounds are much higher than the simulation results. The number of rounds required for the initiator to be logged with high probability (n−2)/n is between 5.7 and 6.2 times smaller in the simulations than guaranteed by the bounds from our prior work.

We also see from the figure that the relationships between the numbers of rounds required for different values of c hold as the attackers get closer to being certain that they have found the initiator. We use linear extrapolation between points to find how many rounds have passed when the lines cross the (n−2)/n line of attacker success. With c = 50, approximately four times as many rounds are required as with c = 100 to reach the line, and between c = 100 and c = 200 the ratio is again close to four. The ratio predicted by the bounds is exactly four for each of these relationships.

Figure 6. Static model: The predecessor attack against Onion Routing, for varying n. Simulation results are shown for n = 100, 1,000, and 10,000, along with the analytic bound and the 98% chance line.

Figure 7. Static model: The predecessor attack against Onion Routing and Crowds, for varying average path lengths (l = 5 and l = 20), with the 50% chance line shown.

We present simulation results for Onion Routing in Figure 6 with a fixed ratio c/n = 0.1 and a fixed path length of l = 10, but with three values for n: 100, 1,000, and 10,000. We also show the bound for c/n = 0.1, calculated from the formula of our previous paper.

As in Figure 5, we see the significant difference between the values predicted by the bounds and the simulation results. We also see a slight, but significant, difference in the upper half of the graph between the line for n = 100 and the lines for n = 1,000 and n = 10,000. This difference does not appear in the line given by the bounds, and is not reflected in the insignificant difference in the results between n = 1,000 and n = 10,000. When there are fewer nodes to select from, other nodes may be logged as often as the initiator for more rounds than when there are many nodes to select from. This effect can be seen in the bounds from our prior paper, although it does not affect the curves shown in our graphs. This is because the standard for high probability of attacker success, (n−2)/n, has an inverse dependence on n. The number of rounds required to reach a high probability that the attackers correctly identify the initiator is not dependent on n, because the value of the probability changes with n.

The other significant parameter in these simulations is the average path length. In Onion Routing, this path length is fixed, while in Crowds, the path length depends on the probability of forwarding. We compare the two protocols with the same average path length in Figure 7.

From the figure, we see that Onion Routing significantly outperforms Crowds for all path lengths. We also see that longer path lengths outperform shorter path lengths: after 96 rounds, for example, attackers against Onion Routing with the path length set to five have a noticeably higher chance of identifying the initiator than when the path length is set to 20. Also in Figure 7, we see that the average path length matters more before the probability of attacker success has reached 50% than afterwards, and that the difference between path lengths is larger for Onion Routing, while largely insignificant for Crowds.

The success of fixing nodes in the path in the static model can be seen in Figure 8. After only ten rounds, the fixed-first-and-last strategy is already a better defense than allowing these nodes to vary with each round. All three fixed strategies stay at a relatively low probability of attacker success, while attackers see significant gains in the probability of their success against ordinary Onion Routing.

6. A Study of User Behavior

For the attacks we have studied to be successful, they require that initiators communicate with responders over long periods of time. To determine whether real users exhibit such behavior, we obtained traces of communications from a web proxy in the Department of Computer Science at the University of Massachusetts, Amherst. We separately analyzed data collected previously at the University of California at Berkeley.
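The round-based bucketing used in this study can be sketched as follows. The trace format and helper names are hypothetical, but the rule matches the text: multiple contacts to the same responder within one round count only once.

```python
from collections import defaultdict

ROUND_LEN = 15 * 60  # round length in seconds (15-minute rounds)

def rounds_observed(requests, round_len=ROUND_LEN):
    """Given (timestamp_seconds, user, site) proxy-log records, count
    the number of distinct rounds in which each initiator-responder
    (user, site) pair is observed.  Repeat contacts inside the same
    round are collapsed, as in the round-based attack model."""
    seen = defaultdict(set)
    for ts, user, site in requests:
        seen[(user, site)].add(ts // round_len)
    return {pair: len(rnds) for pair, rnds in seen.items()}

# Hypothetical toy trace: "u1" hits a.com in three distinct 15-minute
# rounds (with one repeat inside the first round) and b.com in one.
trace = [(0, "u1", "a.com"), (100, "u1", "a.com"),
         (1000, "u1", "a.com"), (20000, "u1", "a.com"),
         (2000, "u1", "b.com")]
```

Varying `round_len` reproduces the trade-off discussed below: longer rounds collapse more repeat contacts, while shorter rounds expose frequently revisited sites in more rounds.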

Figure 8. Static model: Predecessor attack against Onion Routing when paths are selected uniformly at random, and when the first and/or last node is fixed. We show results for n = 1,000 nodes, c = 100 attackers, and path length l = 10.

Figure 9. Length of time each user employed the web proxy (span of days in the study, per user).

We sought two important results from the data. First, we wanted to know the frequency with which users contact the same web site (i.e., the responder) and the extent to which that contact continues to occur. It is critical to note that the predecessor attack works in rounds of a specific length. For example, when rounds last one day, it does not matter how many times a user contacts the same site during that day. Shorter round lengths allow new users to join more frequently, but increase the vulnerability of existing users.

Second, sites that are contacted most frequently (i.e., in the largest number of rounds) are also the most vulnerable. Therefore, we wanted to know what percentage of a user's web traffic was vulnerable to passive logging attacks.

One could argue that users of anonymous communications systems would be more careful not to behave in trackable ways than the volunteers of our study. We believe that this is unlikely. The users in our study knew that their web usage was being tracked in a linkable way. Users of anonymous communications systems might be inclined to behave cautiously, but would have reason to believe that their patterns of communication are mostly hidden. Some of these users might even seek out systems for anonymity with the intention of using them for repeated communications to a specific responder or set of responders, rather than for general privacy. In any case, users should know what patterns of communication can be tracked in this way, so as to be able to avoid them when communicating anonymously.

Figure 10 tracks whether users in the study revisited the same responder over time. If we set the round length to 15 minutes, the 214-day study divides into 20,549 rounds. Of 7,418 initiator-responder pairs, 103 were seen in 80 or more rounds; 45 pairs were seen in 160 or more rounds; and so on. Values for longer round lengths are also shown in Figure 10. In other words, only a small percentage of connections are long-lasting. However, Figure 12 shows what percentage of all traffic is represented by long-lasting sessions. The longest-lasting sessions make up approximately 10% of all traffic, representing a significant fraction of users' behavior.

Figure 10. Survivor plot of users contacting the same website over increasing numbers of rounds, from the University of Massachusetts data, for 15-minute, 1-hour, 6-hour, and 24-hour rounds.

In our simulation results from Section 5, we found that in an Onion Routing system with 1,000 nodes and path lengths of 10, a set of 100 attackers required approximately 960 rounds to identify the initiator with a high probability of being correct. This matches the results for the five connections in our study seen in more than 1,284 fifteen-minute rounds. Thus, these five connections could have been tracked by attackers with a high probability of determining the initiators. Those connections account for a portion of all traffic, so a significant part of the users' privacy would be compromised by such an attack.

If we relax the required probability of attacker correctness, we see that 15 different initiator-responder pairs can be tracked in the same time frame. These connections make up over one-third of all the users' traffic. Thus, over one-third of the traffic that went through the web proxy faced a substantial chance of linkage to the initiator.

Figure 12. Percentage of all traffic represented by initiator-responder pairs observed in varying numbers of rounds, from the University of Massachusetts data, for 15-minute, 1-hour, 6-hour, and 24-hour rounds.

In addition to the study at the University of Massachusetts, we also used data collected from the University of California at Berkeley's Home IP service [9]. This data has the advantage of having more users, but only tracks users over 18 days. For that time, however, the users also showed patterns of repeated communications in significant numbers of rounds, as shown in Figure 11.

Figure 11. Survivor plot of users contacting the same website over increasing numbers of rounds, from the University of California at Berkeley data.

7. Conclusion

In our prior work, we presented the first full analysis of the predecessor attack, an attack that seeks to identify the initiator of an anonymous connection over time. This work was important because it showed that all known anonymous protocols were subject to this attack; in some cases, no attack against the protocol had previously been known.

In this paper, we have demonstrated several things. We have shown that churn in the group membership of anonymous protocols provides additional leverage for attackers to degrade the anonymity of initiators who stay in the anonymous group. Specifically, both Tarzan and Crowds are particularly vulnerable to the intersection attack, since lists of current users are readily available. We note that other peer-to-peer protocols, such as MorphMix, may not be as vulnerable to this attack, since awareness of all other peers is not needed for the operation of the protocol.

We have shown by simulation that the upper bounds obtained in our prior work were fairly loose, and that the attack can potentially succeed much more quickly than previously described.

We have also shown, through a study of real network traffic, that users may follow the necessary communication patterns for the attacks to be successful over time.

These results are important for both the designers and users of anonymous protocols. It appears that designing a long-lived anonymous protocol is very difficult, and that users of current protocols need to be cautious in how often and how long they attempt to communicate anonymously.

8. Acknowledgments

We thank the Center for Intelligent Information Retrieval (CIIR) at the University of Massachusetts, Amherst, for allowing us to use data collected via their web proxy. We are particularly grateful to Professor James Allan, Fernando Diaz, and Haizheng Zhang for their assistance in helping us obtain and use the data. We also extend our gratitude to the 24 volunteers who participated in the study.

Thanks also to Steven D. Gribble and the others who collected the data at the University of California at Berkeley and made it available online along with useful processing tools.

References

[1] A. Back, I. Goldberg, and A. Shostack. Freedom 2.0 Security Issues and Analysis. Zero-Knowledge Systems, Inc. white paper, November 2000.
[2] Y. Baryshnikov, E. Coffman, D. Rubenstein, and B. Yimwadsana. Traffic Prediction on the Internet. Technical Report EE200514-1, Computer Networking Research Center, Columbia University, May 2002.
[3] O. Berthold, H. Federrath, and M. Kohntopp. Project Anonymity and Unobservability in the Internet. In Computers Freedom and Privacy Conference 2000 (CFP 2000), Workshop on Freedom and Privacy by Design, April 2000.
[4] J. Chu, K. Labonte, and B. N. Levine. Availability and Locality Measurements of Peer-to-Peer File Systems. In Proc. ITCom: Scalability and Traffic Control in IP Networks II, volume 4868, July 2002.
[5] B. D. Davison. Predicting Web Actions from HTML Content. In The Thirteenth ACM Conference on Hypertext and Hypermedia (HT'02), pages 159–168, June 2002.
[6] D. Duchamp. Prefetching Hyperlinks. In Second USENIX Symp. on Internet Technologies and Systems, pages 127–138, October 1999.
[7] H. Federrath. JAP: A Tool for Privacy in the Internet. http://anon.inf.tu-dresden.de/index en.html.
[8] M. J. Freedman and R. Morris. Tarzan: A Peer-to-Peer Anonymizing Network Layer. In Proc. ACM Conference on Computer and Communications Security (CCS 2002), November 2002.
[9] S. D. Gribble. UC Berkeley Home IP Web Traces. Available at http://www.acm.org/sigcomm/ITA, July 1997.
[10] D. Kesdogan, J. Egner, and R. Buschkes. Stop-and-Go-MIXes Providing Probabilistic Anonymity in an Open System. In Information Hiding, April 1998.
[11] H. T. Kung, S. Bradner, and K.-S. Tan. An IP-Layer Anonymizing Infrastructure. In Proc. MILCOM: Military Communications Conference, October 2002.
[12] B. N. Levine and C. Shields. Hordes: A Protocol for Anonymous Communication Over the Internet. ACM Journal of Computer Security, 10(3):213–240, 2002.
[13] J. Raymond. Traffic Analysis: Protocols, Attacks, Design Issues and Open Problems. In H. Federrath, editor, Designing Privacy Enhancing Technologies: Proceedings of International Workshop on Design Issues in Anonymity and Unobservability, volume 2009 of LNCS, pages 10–29. Springer-Verlag, 2001.
[14] M. Reed, P. Syverson, and D. Goldschlag. Anonymous Connections and Onion Routing. IEEE Journal on Selected Areas in Communications, Special Issue on Copyright and Privacy Protection, 1998.
[15] M. K. Reiter and A. D. Rubin. Crowds: Anonymity for Web Transactions. ACM Transactions on Information and System Security, 1(1):66–92, November 1998.
[16] M. Rennhard and B. Plattner. Introducing MorphMix: Peer-to-Peer based Anonymous Internet Usage with Collusion Detection. In Proc. 2002 ACM Workshop on Privacy in the Electronic Society (WPES), November 2002.
[17] S. Saroiu, P. K. Gummadi, and S. Gribble. A Measurement Study of Peer-to-Peer File Sharing Systems.
[18] R. Sherwood, B. Bhattacharjee, and A. Srinivasan. P5: A Protocol for Scalable Anonymous Communication. In Proc. 2002 IEEE Symposium on Security and Privacy, May 2002.
[19] V. Shmatikov. Probabilistic Analysis of Anonymity. In IEEE Computer Security Foundations Workshop (CSFW), pages 119–128, 2002.
[20] P. Syverson, G. Tsudik, M. Reed, and C. Landwehr. Towards an Analysis of Onion Routing Security. In Workshop on Design Issues in Anonymity and Unobservability, July 2000.
[21] M. Wright, M. Adler, B. Levine, and C. Shields. An Analysis of the Degradation of Anonymous Protocols. In ISOC Symposium on Network and Distributed System Security, February 2002.
[22] L. Xiao, Z. Xu, and X. Zhang. Low-Cost and Reliable Mutual Anonymity Protocols in Peer-to-Peer Networks. Technical Report HPL-2001-204, Hewlett Packard Laboratories, August 2001.