22nd International Conference on Advanced Information Networking and Applications

A Concept of an Anonymous Direct P2P Distribution Overlay System

Igor Margasiński, Michał Pióro Institute of Telecommunications, Warsaw University of Technology {I.Margasinski, M.Pioro}@tele.pw.edu.pl

Abstract an anonymous network composed of nodes called Mixes that forward anonymous messages. The strength The paper introduces a peer-to-peer system called of the solution consists in: (i) a specific operation of P2PRIV (peer-to-peer direct and anonymous nodes which “mixes” forwarded messages, and (ii) an distribution overlay). Basic novel features of P2PRIV asymmetric encryption of messages exchanged are: (i) a peer-to-peer parallel content exchange between them. The purpose of such mixing is to hide architecture, and (ii) separation of the anonymization the correlation between received and forwarded process from the transport function. These features messages. In general, received data units are padded to allow a considerable saving of service time while a constant size length, encrypted, delayed for a batch preserving high degree of . In the paper we aggregation and then sent (flushed) in a random order. evaluate anonymity measures of P2PRIV (using a Anonymous messages are sent usually via a chain of normalized entropy measurement model) as well as its Mixes to eliminate presence of a trusted party and also traffic measures (including service time and network to omit single point of failure imposed by a single Mix. dynamics), and compare anonymity and traffic In Mix-net, each message is encrypted recursively with performance of P2PRIV with a well known system public keys of Mixes from a forwarding path. Finally, a called CROWDS. message for each successive forward has different bit

representation and place in a flow of other Mix-net 1. Introduction messages. This makes Mix-net communications practically untraceable and secured against The broadband access has given a way for eavesdropping [6]. explosion of large scale distributed overlay networks. The development of mixing methods is primarily Peer-to-Peer (P2P) overlays have grown to one of the aimed at resolving a tradeoff between delay time leading Internet applications and constitute a significant part of the Internet traffic. From its imposed by batching and reordering of messages and inception, development of P2P communications goes the anonymity level. Regardless of potential hand in hand with demand for anonymity. Still, we discontinuities in incoming traffic, Mixes have to wait observe that highly secured P2P overlays of today for sufficient number of messages to achieve struggle to provide good traffic performance. A novel untraceable mixing. One solution to this problem is P2P system proposed in this paper introduces a parallel generation of fake messages (dummy traffic) ([4], [8], content exchange architecture separating the [13]). Dummies enhance anonymity and allow anonymization process from the transport function. particular Mixes to flush faster. However additional The anonymity and traffic performance measures of and “empty” traffic finally delays delivery of parallel the new system are presented and compared with user data. analogous measures obtained for a classical The concept of Mix-net has been used in a wide architecture. range of applications such as E-mail [9], Web browsing [3], ISDN [18]. Other solutions [5] seem to 1.1. Related work play a less important role or (as CROWDS [20]) utilize an idea of “blending into a crowd” by traffic The first concept of network anonymity was forwarding via a group of nodes before its delivery. introduced in the seminal paper of Chaum [6]. The Anonymity of CROWDS is based on a random walk Mix-net system proposed there has become a algorithm. A random set of CROWDS nodes forward foundation of modern anonymity solutions. Mix-net is

1550-445X/08 $25.00 © 2008 IEEE 590 DOI 10.1109/AINA.2008.117

anonymous messages without usage of mixing or with a file id to a randomly chosen peer. Then, the public key encryption techniques. P2P overlays as well selected peer flips an asymmetric coin to decide as other applications usually adapt Mix-net to assure whether to forward the token (with probability pf) to anonymity. Many variants of Mix-net such as Free the next random peer. This communication may be Haven [10], Tarzan [11] and MorphMix [19] were additionally secured and anonymized by Mix-net introduced for P2P. Other system, as Freenet [7] and mechanisms, as numerous but short control messages GNUNet [2], use heuristics with encryption to achieve of constant length, generated by cloning, can be anonymity. It should be noted that the basic common effectively exchanged by the Mix cascades [12]. mechanism used to achieve P2P anonymization is Step 2: Data connection – transport of the requested traffic forwarding by a set of middleman nodes. content. After a random interval of time and based on the content id received earlier, the copies of the content 1.2. Outline of the paper are directly downloaded by selected (cloned) peers from nodes which store data. One of these nodes is the The rest of the paper is organized as follows. Section initial requestor. Files can be looked up by the DHT 2 describes the system and its anonymization idea. algorithm. Like the cloning exchange, lookup Analytical evaluation of anonymity is presented in messages can be secured by Mix-net mechanisms. The Section 3 while traffic performance simulations are resulting data redundancy (the file is downloaded and described in Section 4. Conclusions and future work stored by each clone) improves content accessibility, are presented in Section 5. because the popularity of a content automatically increases the number of its copies. Notice, that in our 2. P2PRIV overview solution the anonymization process is separated from the content transport, in contrast to classical schemes. P2PRIV is an application layer solution for P2P overlay networks assuring a sender’s and a receiver’s 3. Anonymity anonymity. The basic novel idea of the solution consists in parallel content exchange instead of Below we will analyze the degree of anonymity of widespread cascade transmission between chaining P2PRIV using entropy measurement model ([14], nodes. Certainly, anonymity assured by P2PRIV [21]). The CROWDS system, which combines imposes traffic overheads, as in any other system of anonymity and performance with simplicity and this type, for example Mix-net. A motivation behind reputability, will be used as a reference. The the parallel architecture with direct content transport is anonymity analysis will cover receiver anonymity decrease of the service time while preserving high referred to as an unlinkability of the requested content degree of anonymity. and the requestor. The anonymous publication process, P2PRIV peers are symmetric and do not include any which can be performed in many different ways using privileged nodes (supernodes). The P2PRIV specific the state of the art methods, is not considered in the connection and content exchange architecture utilizes a proposed scheme and omitted in the analysis. We will classical concept of chaining with encryption not analyze sender anonymity as well. However, when anonymization, for example Mix-net. It also uses we assume that P2PRIV utilizes the DHT storage, structured lookup system, which can be based on DHT peers of P2PRIV are involved in the process of storing algorithms (distributed hash table, [1]). The cascade and sending data independently from decisions of they anonymization mechanism assures anonymity of an users. entire P2PRIV control messages exchange, including To achieve practical results it is important to assume the communications of distributed content location realistic capabilities of an adversary corresponding to process. We can distinguish two steps of P2PRIV the specific environment of the attack. P2PRIV is operation: primarily dedicated to public WANs, especially the Step 1: Cloning – exchange of management Internet. We assume that users do not establish private messages applied for anonymous random selection of a groups and that no additional trust or access control subset of peers referred to as a cloning cascade (CC). mechanisms are provided. Any person can become the Each such CC contains the requestor and its clones; system user and can utilize the system’s provided each peer can be potentially selected for such a clone. information at his own way. Initially, we assume that This step is similar to the network random walk the adversary obeys the protocol and conducts a mechanism of CROWDS. The requestor sends a token passive observation. Next we will consider active

591

1− S p attacks enabling attackers to change protocol operation = 1 1 p2 . (6) to disclose more information. Keeping in mind the S 2 large scale of public overlays, we will analyze local According to the information theory [22], entropy of attacks where the adversary can control only a part of P2PRIV for this scenario will be the system. We will study an impact of the number of = − N ()= − ()− ( ) collaborating nodes C on the degree of anonymity, and H PS ∑ pi log2 pi S1 p1log 2 p1 S 2 p2 log 2 p2 . (7) end with a look at a global attack. We assume that i=1 Using the entropy measurement model ([14], [21]) the collaborating nodes can collect all information the degree of the anonymity (normalized entropy) system is leaking and send this statistics via an provided by P2PRIV is independent channel to the adversary headquarter for a ≥ ∨ ≥ 0 p1 1 p2 1 summarizing analysis.  C ()−1  log 2 p1  N p ≤ 0 − 2 3.1. Passive-static attacks  log 2 (N C)   − C  ()−1 d = 1  log 2 p2 PS  N  (8) Static attackers cannot predict which nodes will be  p ≤ 0 − 1  log 2 (N C) randomly selected to form the CC for a particular  C ()−1 +  − C  ()−1 request anonymization. The adversary can distinguish  log 2 p1 1  log 2 p2 N  N   p ∈ ()0,1 ∧ p ∈ ()0,1 , two sets of peers {S ,S } among all N nodes and assign  − 1 2 1 2  log 2 (N C) their members probabilities of being the requestor where {p1,p2}. S1 consists of peers which communicate ( p −1)N ( p −1)N p = f , p = f . 1 − − 2 − 2 − − directly with collaborating nodes C during a transport ( p f 2)(N C) ( p f 1)N ( p f 2)C of the requested data, and S2 are remaining suspected The degree of the anonymity describes the uncertainty nodes. Then the average CC length is of the adversary in finding the requestor and takes ∞ − p − 2 P = ip i 2 (1− p ) = f . values from [0,1]. Figure 1 shows the degree of ∑ f f − (1) i=2 p f 1 anonymity d as a function of the parameter C. 1 The adversary concludes that the requestor is not P2PRIV, pf=0.8 P2PRIV, pf=0.75 among collaborating peers. A number of honest nodes P2PRIV, pf=0.66 CROWDS, pf=0.8 0.8 CROWDS, pf=0.75 from the cascade are CROWDS, pf=0.66 C ( p − 2)(N − C) n = P − P = f . − (2) 0.6 N ( p f 1)N The step 1 of P2PRIV operation is anonymized by d 0.4 Mix-net. However in the step 2, the cloned peers communicate directly with other peers. Let’s consider 0.2 the most pessimistic scenario, where all of clones download the content from different peers. Then the average number of honest nodes from the CC which 20 40 60 80 100 C communicate with collaborating nodes equals C( p − 2)(N − C) Figure 1. Degree of anonymity for P2PRIV and = C = f S1 n . (3) CROWDS, passive-static attacks, N = 100. N ( p −1)N 2 f Each of them can be the content requestor with 3.2. Passive-adaptive attacks probability −1 The previous passive-static attacks scenario =  − C  p1  P P . (4)  N  corresponds to a realistic assumption that the adversary cannot predict which nodes will anonymize the The attacker, who can observe S1 nodes involved in a distribution of the requested data, should also consider request. The adaptive scenario is more pessimistic and that none of them is the requestor. He should also take considers implications of the presence of colluding into account the rest of the nodes nodes in the system area where active anonymization process occurs. If we assume that the collaborating S = N − C − S , (5) 2 1 peer belongs to a chosen CC then the number of honest Each of the S members can be the requestor with 2 nodes communicating with collaborating nodes will be probability

592

= C − + The length of CC before first collaborating node S1PA (n 1) 1, (9) N interception PA can be calculated − ( − + − ) C (N C) ( p f 2)C ( p f 1)N p(P = 1) = S = . A N 1PA − 2 (10) ( p f 1)N = =  − C  C +  − C ()− p(PA 2) 1  p f 1  1 p f Analogously to (5) the number of remaining nodes is  N  N  N  = − − n−1 n−1 S2PA N C S1PA , (11) = =  − C  n −1 C +  − C  n−2 ()− p(PA n) 1  p f 1  p f 1 p f . with assigned probability of being the requestor  N  N  N  (15) 1− S p = C +  − C  C +  − C ()− + = 1PA 1 PA 1  p f 1  1 p f p2PA . (12) N N N N S     2PA i−1 i−1 ∞   +  − C  i−1 C +  − C  i−2 ()−  Entropy of P2PRIV in this attack scenario is ∑i 1  p f 1  p f 1 p f =   = − ()− ( ) i 3  N  N  N   H PA S1PA p1log2 p1 S2PA p2PA log2 p2PA (13) N 3 + p 2 (C − N )3 + p N(2C 2 − 3CN + N 2 ) and the degree of anonymity equals P = f f . A ()p (C − N) + N N 2 ≥ ∨ ≥ f 0 p1 1 p2 PA 1 −  ϑ log ()p 1 Among PA the number of nodes communicating  2 1 p ≤ 0 − 2PA log2 (N C) directly with collaborating nodes is = ς ()−1 dPA  log2 p2PA (14) 2 p ≤ 0 (N − C)(N 3 + p (C − N )3 + p N(2C 3 − 3NC + N 2 ))  log (N − C) 1 = f f , 2 nA 3 (16)  −1 −1 ()+ − ϑ log ()p +ς log (p ) N p f (C N ) N  2 1 2 2PA p ∈()0,1 ∧ p ∈ ()0,1 ,  − 1 2PA  log 2 (N C) hence members of C C where S = n , S = (n −1) +1 (17) ( p −1)N(N + ( p − 2)C) 1AS N AS 1AA N A p = f f , 2PA − − ()− + − + − ( p f 2)(C N ) C( p f 2) N(N p f N p f 1) will have assigned probabilities − + − − + −1 ( p f 2)C ( p f 1)N ( p f 2)C N C ϑ = ,ς = . =  −  . ( p − 2)N ( p − 2)N p1A  PA PA  (18) f f  N  Figure 2 shows results for adaptive observation (14). And the rest of nodes respectively to the static/adaptive 1 P2PRIV, pf=0.8 attack P2PRIV, pf=0.75 P2PRIV, pf=0.66 = − − = − − CROWDS, pf=0.8 S2 AS N C S1AS , S 2 AA N C S1AA (19) 0.8 CROWDS, pf=0.75 CROWDS, pf=0.66 will be considered as the requestor with probabilities − 1− S p = 1 S1AS p1A = 1AA 1A 0.6 p2 AS , p2 AA . (20) S2 AS S 2AA d Then degrees of anonymity of P2PRIV for active 0.4 attacks are ≥ ∨ ≥ 0 p1A 1 p2 AS 1 0.2  C ()−1  log 2 p1A  N p ≤ 0 − 2 AS  log 2 (N C) 20 40 60 80 100   − C  ()−1 (21) C d = 1  log 2 p2 AS AS  N   p ≤ 0 Figure 2. Degree of anonymity for P2PRIV and − 1A  log 2 (N C) CROWDS, passive-adaptive attacks, N = 100.  C ()−1 +  − C  ()−1  log 2 p1A 1  log 2 p2 AS N  N   p ∈()0,1 ∧ p ∈ ()0,1 ,  − 1A 2 AS 3.3. Active attacks  log 2 (N C) ≥ ∨ ≥ 0 p1A 1 p2 AA 1 − ψ log ()p 1 Now we analyze static and adaptive behaviors of  2 1A p ≤ 0 log (N − C) 2 AA active adversary. In traditional anonymous chaining  2  −1 d = ζ log ()p systems active attacks must be more subtle and AA  2 2 AA p ≤ 0  − 1A log 2 (N C) complex ([15], [16], [17]) than they could be in a  −1 −1 ψ log ()p + ζ log (p )  2 1A 2 2 AA p ∈()0,1 ∧ p ∈ ()0,1 , parallel architecture, since simple breaking of the chain  − 1A 2 AA  log 2 (N C) transporting user data would be detected quickly. where Breaking the cloning cascade in P2PRIV system does not affect user data delivery. Therefore, when we consider parallel architecture, it is important to take into account extreme scenario of cloning interception.

593

N 3 ( p (C − N) + N) = f include mixing and asymmetric encryptions p1A , (N − C)()p 2 (C − N) 3 +N 3+ p N(2C 2 − 3NC + N 2 ) f f techniques. Notice that P2PRIV proved high N 3 ()N + p (C − N ) = f robustness under the same conditions. p 2AS , p 2C(C − N) 3 + N 3 (C − N 2 ) + p N ()2C 3 − 3C 2 N − (N −1)CN 2 + N 4 f f −1   + ()− ()3 + ()2 − + 2 + 2 − 3  C 1 C N N p f N 2C 3NC N p f (C N) Table 1. Degree of anonymity for CROWDS and p = ζ N − C +   , 2AA  N  N 3 ()N + p (C − N)  P2PRIV, C = 5%.   f  N 3 (N + C) − p N()N 3 − 2C 3 + 3NC 2 − 2N 2C + p 2C(C − N)3 CROWDS P2PRIV ψ = f f , Attack ()3 + 2 − + 2 + 2 − 3 pf =0.66 pf =0.75 pf =0.8 pf =0.66 pf =0.75 pf =0.8 N p f N(2C 3CN N ) p f (C N) N p N()2N 3 − 2C 3 + 5C 2 N − 5CN 2 − CN 3 − p 2 (C − N) 4 Passive-Static 0.69 0.63 0.58 0.97 0.98 0.98 ζ = f f . 3 2 2 2 3 Passive-Adapt. 0.69 0.78 0.82 0.84 0.88 0.91 ()N + p N()2C −3CN + N + p ()C − N N f f Active-Adapt. - - - 0.79 0.84 0.87 Figure 3 shows the degree of anonymity for P2PRIV active attacks. Using the anonymity measurement model ([14], 1 Static, pf=0.8 [21]) with practical attacks approaches, we have found Static, pf=0.75 Static, pf=0.66 Adaptive, pf=0.8 the proper P2PRIV degree of anonymity. Static-passive 0.8 Adaptive, pf=0.75 Adaptive, pf=0.66 attacks have revealed resistance of P2PRIV higher than CROWDS in the entire scope of the collaboration. A 0.6 significant impact on the P2PRIV anonymity was

d disclosed only after an injection of large number of 0.4 colluding nodes (above 25%). As expected, adaptive attacks have the largest impact. This less realistic 0.2 scenario is a good reference, because of its pessimistic assumptions. Adaptive attacks show how important is proper selection of cascade length (p value 20 40 60 80 100 f C configuration). We have observed that p should not be f Figure 3. Degree of anonymity for P2PRIV, CC lower than 0.66. The comparison between CROWDS interception active attacks, N = 100. and P2PRIV passive-adaptive attacks showed that P2PRIV provides a higher degree of anonymity for Notice that we cannot directly compare preceding realistic amount of collaborating peers – below 60%. results with CROWDS because of different attack The last analyzed scenario, active attacks, does not possibilities for cascade and parallel systems. As in degrade P2PRIV protection meaningfully, despite of previous scenarios, the results show good resistance of definitive invasion (breaking anonymization cascade P2PRIV for a realistic percentage of colluding nodes. by the first colluding node).

3.4. Summary of the results 4. Traffic performance

The minimum degree of anonymity depends on Certainly, techniques of information hiding, such as system usage and particular users requirements. anonymity, require traffic overheads and can therefore However as in [14], we restrict acceptable normalized potentially degrade the network traffic performance. entropy to d ≈ 0.8. Taking into account adaptive Hence, usefulness of a particular anonymity solution attacks, the degree of anonymity of CROWDS depends not only on its security level, but also on (pf = 0.75 recommended by CROWDS authors) falls necessary traffic overheads. For P2P distribution below this level for C = 5% while P2PRIV still retains overlays a basic performance factor is a mean the proper anonymity level even for pessimistic active- download time (DT) quoted as a mean time required adaptive attacks (see Table 1). The degree of for a content delivery after a submission of the request anonymity offered by CROWDS is considerably low by the user. Open P2P environment also requires for static attacks. This scenario shows that a set of consideration of unpredictable users migration and nodes actively involved in anonymization process traffic bursts after a new publication of a popular should not be too numerous. Longer cascades impose content. Besides the mean download time and not only larger traffic overheads, but also can make it scalability, we will analyze dynamics of the system in easier for the adversary to become a member of this set reaction to a new content publication. The scenario will and effectively compromise security of a particular cover request arrival rate and content migration system. We should remind that CROWDS does not impacts. For the purpose of the complicated dynamic

594

conditions analysis we have created a peer-to-peer 130 simulation environment. The simulator traces tasks of 120 CROWDS random walk symmetric peers independently. We have used Poisson 110 P2PRIV 100 FTP distribution to model an arrival process. As a reference 90

D 80 we have simulated the CROWDS random walk n i

m 70 algorithm. CROWDS admits simplifications of the @ 60 T anonymous cascade schemes for better traffic D 50 performance. Nodes of CROWDS (called jondos) do 40 not mix or delay forwarding content and also do not 30 20 use asymmetric . CROWDS system was 10 originally dedicated for Web browsing thus we 0 33.3 50 60 90 240 included the content caching functionality for each λ−1 min forwarding node. Based on the results showed in Table Figure 4. Mean download time for P2PRIV and = 0.66 and 1 we will simulate P2PRIV with pf CROWDS random walk, N = @100.D CROWDS with pf = 0.75 configuration which corresponds to comparative degrees of anonymity for adaptive attacks for both systems. The average number 4.2. Dynamics of peers in a cascade is 4 for pf = 0.66 and reaches value 5 for pf = 0.75 (1). These values correspond also Below we consider the mean DT characteristics to comparative traffic volume for P2PRIV and under dynamically changing network traffic CROWDS. Notice that P2PRIV includes one more link conditions. We will analyze system behavior starting for the same cascade length because of its parallel from a new file publication. Let D be the part of all architecture. As an additional reference we will use requests which correspond to the new file. We will µ minimal download rate Min marked for simplification take into account the behavior of selfish users where as FTP. We assume rather typical for today P2P simultaneously D percent of copies leaves the overlay overlays values of link throughputs and file size. Let network for each request. average link throughput between peers be B = 512kbps 120 120 and average file size V = 32MB, then

B −1 90 90 µ = = [] =10% Min 0,002 s . (22) V D 60 60

30 30

4.1. Download time DT [min]

1 6 12 18 24 30 36 1 6 12 18 24 30 36 t h t h Figure 4 shows the mean values and 95% confidence 120 120 intervals of DT for CROWDS and P2PRIV systems as @ D @ D -1 90 90 the function of parameter λ . To analyze systems mean =20% D download time we have computed six simulation series 60 60

(with 30 realizations each) starting from the maximum 30 30 request arrival rate per each node DT [min] µ 1 6 12 18 24 30 36 1 6 12 18 24 30 36 λ = Min = 0,0005[]s−1 . (23) t h t h Max P 120 120 @ D @ D We have found that P2PRIV DT is close to FTP for 90 90 =30% =30% low amount of requests. Diagram shows the superiority D 60 60 of P2PRIV regardless of the request arrival rate. The

30 30 content delivery has been at least four times faster than DT [min] for CROWDS. The observed increase of DT for the 1 6 12 18 24 30 36 1 6 12 18 24 30 36 close-to-maximum request arrival rate is lower for t h t h -1 -1 -1 -1 (a) λ=λMax=33,3 [min ] (b) λ=240 [min ] P2PRIV. @ D @ D Figure 5. P2PRIV (lower graphs) and CROWDS random walk (upper graphs) reaction to the new content publication, N = 100.

595

Figure 5 shows 95% confidence intervals and 25% the diagram). to 75% quantiles (marked as boxes) surrounding the In both systems the dependency of DT on the mean values of DT for both architectures. The results network size is negligible. For networks with thousand indicate that the parallel architecture is more flexible of peers we have observed insignificantly reduced DT and reacts faster to dynamically changing conditions. in comparison with small networks (50-100 nodes). We have found unstable operation of CROWDS Results bounded by narrow confidence intervals random walk for rate λMax (Fig. 5a). This maximum indicate that large networks are very reliable, however request arrival rate does not cause instability of P2PRIV operation is more stable. P2PRIV. Instead, we have observed permanently Below we present results of systems dynamics with increased DT after a new file publication. Both systems networks sizes enlarged to N = 1000. have exhibited a stable operation for a low arrival rate 120 120 and a moderate migration (Fig. 5b, D = 10% and D = 20%). Performance characteristics of CROWDS 90 90 =30% =30% D under a low request arrival rate are similar to P2PRIV 60 60 under rate λMax. For higher dynamics (D = 30%), we 30 30 have found instability in CROWDS even for a low DT [min] request arrival rate. P2PRIV noticed only temporary 1 6 12 18 24 30 36 1 6 12 18 24 30 36 t h t h (about 2 hours) peak under the same conditions. λ λ -1 -1 λ -1 -1 (a) = Max=33,3 [min ] (b) =240 [min ] @ D @ D Figure 7. P2PRIV (lower graphs) and CROWDS 4.3. Scalability random walk (upper graphs) reaction to the new content publication, N = 1000, D = 30%. A foreground advantage of P2P is its decentralized architecture and a necessary feature of any practical Figure 7 illustrates considerably higher systems P2P design is scalability. This vital feature allows for a stability of operation in comparison to results obtained spontaneous growth of distributed overlays. We have for networks of hundred of nodes (Figure 5). Note that repeated earlier traffic analysis of P2PRIV and P2PRIV proves strong robustness (meaningless CROWDS with altered size of simulated networks. changes of DT) for high dynamics (D = 30%). Notice that throughout all traffic analysis we have We have found that both P2PRIV and CROWDS excluded the files lookup problem. Process of finding random walk scale well. Moreover, a large scale of the data in distributed systems significantly impacts network increases flexibility of both systems. scalability, however this approach allows us to revise CROWDS with N = 1000 achieves a better stability scalability of pure examined systems. than with N = 100, but still presents an unstable

140 operation for request arrival rate λMax. What is more,

130 CROWDS random walk, N=50 CROWDS random walk, N=100 P2PRIV for N ≈ 1000 already tolerates high (D = 30%) 120 CROWDS random walk, N=1000 P2PRIV, N=50 migration of peers and a content without introduction 110 P2PRIV, N=100 100 P2PRIV, N=1000 of any noticeable delays regardless of the request

D 90 n arrival rate. i

m 80

@ 70 T D 60 5. Conclusions 50 40 30 The paper describes an original anonymization 20 system dedicated for peer-to-peer distribution overlay

33.3 40 50 60 90 networks, based on a specific connection and content λ−1 min exchange architecture. The proposed parallel content

Figure 6. Mean download time for different network exchange concept enables direct and anonymous data sizes (N) of P2PRIV and CROWDS@ D random walk. transport between network nodes. We have analyzed the anonymity and the traffic performance provided by Figure 6 shows an impact of network size N on the our system. We have found that P2PRIV effectively download time for P2PRIV and CROWDS random protects user privacy by assuring high degree of walk (when looking at Figure 6 please note that the anonymity. For a realistic scope of collaboration, graphs for N = 50 and N = 100 are shifted to the right P2PRIV anonymity is close to maximum. Moreover, in order to avoid confidence intervals overlapping on we have found that the proposed system significantly

596

decreases the download time as compared with protocol,” in Proceedings of the IEEE Symposium on traditional cascade schemes, and achieves results close Security and Privacy, May 2003. to optimum for low to medium loaded networks. [10] R. Dingledine, M. J. Freedman, and D. Molnar, “The free haven project: Distributed anonymous storage Taking into account network dynamics, we have found service,” in H. Federrath, editor, Designing Privacy that the proposed system is more flexible and reacts Enhancing Technologies: Workshop on Design Issues in faster on dynamically changing conditions such as Anonymity and Unobservability. Springer-Verlag, peers/content migration and traffic bursts introduced LNCS 2009, July 2000. by new data publication. P2PRIV scales well and [11] M. J. Freedman and R. Morris, “Tarzan: A peer-to-peer proves high flexibility for large networks. anonymizing network layer,” in 9th ACM Conference on Computer and Communications Security, Washington, In our opinion the field of traffic performance DC, November 2002. modeling for anonymous systems is still in its infancy [12] C. Diaz, Anonymity and Privacy in Electronic Services, and most of the papers neglect this important issue. Ph.D. Thesis. Katholieke Univesiteit Leuven, 2005. Consequently, our future work will include further [13] C. Diaz, L. Sassaman, and E. Dewitt, “Comparison analysis of impact imposed by anonymous techniques between two practical mix designs,” in Proceedings of on network traffic performance. Moreover, we will 9th European Symposiumon Research in Computer Security (ESORICS’04), pages 141–159. Springer- study the distributed hash table interface adjusted to Verlag, LNCS 3193, 2004. the considered parallel distribution, as well as practical [14] C. Diaz, S. Seys, J. Claessens and B. Preneel, “Towards implementation issues. We believe that P2PRIV can measuring anonymity,” in Roger Dingledine and Paul satisfy the requirement of private and low latency Syverson, editors, Proceedings of the Privacy exchange of large content. Enhancing Technologies Workshop. Springer-Verlag, LNCS 2482, April 2002. [15] C. Gülcü and G. Tsudik, “Mixing E-mail with Babel,” 6. References in Proceedings of the Network and Distributed Security Symposium (NDSS’96), pages 2–16. IEEE, 1996. [1] H. Balakrishnan, M. F. Kaashoek, D. Karger, R. Morris, [16] B. Pfitzmann, “Breaking Efficient Anonymous and I. Stoica, “Looking Up Data in P2P Systems,” Channel,” in Advances in Cryptology, Proceedings of Communications of the ACM, Volume 46, Issue 2, EUROCRYPT 1994, pages 332–340. Springer-Verlag, Pages: 43 – 48, February 2003. LNCS 950, 1994. [2] K. Bennett and C. Grothoff, “GAP – practical [17] B. Pfitzmann and A. Pfitzmann, “How to break the anonymous networking,” in Roger Dingledine, editor, direct RSAimplementation of MIXes,” in Advances in Proceedings of The Privacy Enhancing Technologies Cryptology, Proceedings of EUROCRYPT 1989, pages Workshop. Springer-Verlag, LNCS 2760, March 2003. 373–381. Springer-Verlag, LNCS 434, 1990. [3] O. Berthold, H. Federrath, and S. Köpsell, “Web [18] A. Pfitzmann, B. Pfitzmann and M. Waidner, “ISDN- MIXes: A System for Anonymous and Unobservable mixes: Untraceable communication with very small Internet access,” in Proceedings of the Workshop on bandwidth overhead,” in Proceedings of the GI/ITG Design Issues in Anonymity and Unobservability, pages Conference on Communication in Distributed Systems, 101–115, Berkeley, CA, USA, July 25–26 2000. pages 451–463, 1991. [4] O. Berthold, H. Langos, “Dummy traffic against long [19] M. Rennhard and B. Plattner, “Introducing MorphMix: term intersection attacks,” in Designing Privacy Peer-to-Peer based Anonymous Internet Usage with Enhancing Technologies Proceedings of PET’02, pages Collusion Detection,” in Proceedings of the Workshop 110–128. Springer-Verlag, LNCS 2482, 2002. on Privacy in the Electronic Society (WPES 2002), [5] D. Chaum, “The Dining Cryptographers Problem: Washington, DC, USA, November 2002. Unconditional Sender and Recipient Untraceability,” [20] M. K. Reiter, A. D. Rubin, “Crowds: Anonymity for Journal of Cryptology 1/1 (1988). 65-75. web transactions,” ACM Transactions on Information [6] D. Chaum, “Untraceable electronic mail, return and System Security, 1(1), June 1998. addresses, and digital ,” Communications of [21] A. Serjantov, G. Danezis, “Towards an information the ACM, 4(2), February 1981. theoretic metric for anonymity,” in Roger Dingledine [7] I. Clarke, O. Sandberg, B. Wiley, and T. W. Hong, and Paul Syverson, editors, Proceedings of the Privacy “Freenet: A distributed anonymous information storage Enhancing Technologies Workshop, San Diego, CA, and retrieval system,” in Proceedings of Designing April 2002. Springer-Verlag, LNCS 2482. Privacy Enhancing Technologies: Workshop on Design [22] C. E. Shannon, A Mathematical Theory Of Issues in Anonymity and Unobservability, pages 46–66, Communication, the Bell System Technical Journal, July 2000. Vol. 27, pp. 379–423 and pp. 623–656, 1948. [8] W. Dai, “Pipenet 1.1,” Usenet post. Available: http://www.eskimo.com/~weidai/pipenet.txt, 1996. [9] G. Danezis, R. Dingledine, and N. Mathewson, “: Design of a Type III anonymous remailer

597