Oblivp2p: an Oblivious Peer-To-Peer Content Sharing System

OblivP2P: An Oblivious Peer-to-Peer Content Sharing System Yaoqi Jia, National University of Singapore; Tarik Moataz, Colorado State University and Telecom Bretagne; Shruti Tople and Prateek Saxena, National University of Singapore https://www.usenix.org/conference/usenixsecurity16/technical-sessions/presentation/jia This paper is included in the Proceedings of the 25th USENIX Security Symposium August 10–12, 2016 • Austin, TX ISBN 978-1-931971-32-4 Open access to the Proceedings of the 25th USENIX Security Symposium is sponsored by USENIX OBLIVP2P: An Oblivious Peer-to-Peer Content Sharing System 1 2 1 1 Yaoqi Jia ∗ Tarik Moataz ∗ Shruti Tople ∗ Prateek Saxena 1National University of Singapore jiayaoqi, shruti90, prateeks @comp.nus.edu.sg { } 2Colorado State University and Telecom Bretagne [email protected] Abstract sharing files on the Internet. More recently, peer-assisted Peer-to-peer (P2P) systems are predominantly used to CDNs such as Akamai Netsession [4] and Squirrel [5] distribute trust, increase availability and improve perfor- are gaining wide adoption to offload web CDN traffic to mance. A number of content-sharing P2P systems, for clients. The convenient access to various resources at- file-sharing applications (e.g., BitTorrent and Storj) and tract millions of users to join P2P networks, e.g., BitTor- more recent peer-assisted CDNs (e.g., Akamai Netses- rent has over 150 million active users per month [6] and sion), are finding wide deployment. A major security its file-sharing service contributes 3.35% of all world- concern with content-sharing P2P systems is the risk of wide bandwidth [7]. However, the majority of such P2P long-term traffic analysis — a widely accepted challenge applications are susceptible to long-term traffic analy- with few known solutions. sis through global monitoring; especially, analyzing the In this paper, we propose a new approach to protecting pattern of communication between a sender and a re- against persistent, global traffic analysis in P2P content- ceiver to infer information about the users. For exam- sharing systems. Our approach advocates for hiding ple, many copyright enforcement organizations such as data access patterns, making P2P systems oblivious. We IFPI, RIAA, MPAA, government agencies like NSA and propose OBLIVP2P— a construction for a scalable dis- ISP’s are reported to globally monitor BitTorrent traffic tributed ORAM protocol, usable in a real P2P setting. to identify illegal actors. Monitoring of BitTorrent traf- Our protocol achieves the following results. First, we fic has shown to reveal the data requested and sent by the show that our construction retains the (linear) scalability peers in the network [8–10]. Unfortunately, while detect- of the original P2P network w.r.t the number of peers. ing copyright infringements is useful, the same global Second, our experiments simulating about 16,384 peers monitoring is applicable to any user of the P2P network, on 15 Deterlab nodes can process up to 7 requests of and can therefore collect benign users’ data. Thus, users 512KB each per second, suggesting usability in mod- of such P2P systems are at a risk of leaking private infor- erately latency-sensitive applications as-is. The bottle- mation such as the resources they upload or download. necks remaining are purely computational (not band- To hide their online traces, users today employ anony- width). Third, our experiments confirm that in our con- mous networks as a solution to conceal their digital struction, no centralized infrastructure is a bottleneck — identities or data access habits. Currently, anonymous essentially, ensuring that the network and computational networks include Mix networks [11–13], and Onion overheads can be completely offloaded to the P2P net- routing/Tor-based systems [14–17], as well as other P2P work. Finally, our construction is highly parallelizable, anonymity systems [18–23]. Such systems allow the user which implies that remaining computational bottlenecks to be anonymous, so that the user is unidentifiable within can be drastically reduced if OBLIVP2P is deployed on a set of users [24]. a network with many real machines. Although above solutions provide an anonymity guar- antee, they are vulnerable to long-term traffic pattern 1 Introduction analysis attacks, which is an important threat for P2P systems like BitTorrent [25–30]. Researchers have demon- Content sharing peer-to-peer (P2P) systems, especially strated attacks targeting BitTorrent users on top of Tor P2P file-sharing applications such as BitTorrent [1], that reveal information related to the resources uploaded Storj [2] and Freenet [3] are popular among users for or downloaded [31, 32]. Such attacks raise the question ∗Lead authors are alphabetically ordered. - is anonymizing users the right defense against traffic 1 USENIX Association 25th USENIX Security Symposium 945 pattern analysis in P2P content sharing systems? tributing the communication and computation overhead. In this paper, we investigate a new approach to solve the problem of persistent analysis of data communication Solution Overview. We start with a toy construction patterns. We advocate that data / resource access pattern (OBLIVP2P-0) which directly adapts ORAM to a P2P hiding is an important and necessary step to thwart leak- setting, and then present our main contribution which is age of users data in P2P systems. To this end, we present a more efficient solution (OBLIVP2P-1). a first candidate solution, OBLIVP2P— an oblivious pro- Centralized Protocol (OBLIVP2P-0): Our centralized tocol for peer-to-peer content sharing systems. Hiding protocol or OBLIVP2P-0, is a direct adaptation of data access patterns or making them oblivious unlinks ORAM in a P2P system. The peers in the network user’s identity from her online traces, thereby defending behave both like distributed storage servers as well as against long-term traffic monitoring. clients. They request a centralized, trusted tracker to access a particular resource. The tracker performs all the ORAM operations to fetch the resource from the net- 1.1 Approach work and returns it to the requesting peer. However, this variant of OBLIVP2P protocol has limited scalability as For hiding data access patterns between a trusted CPU it assigns heavy computation to the tracker, making it a and an untrusted memory, Goldreich and Ostrovsky pro- bottleneck. posed the concept of an Oblivious RAM (ORAM) [33]. Distributed Protocol (OBLIVP2P-1): As our main con- We envision providing similar obliviousness guarantees tribution, we present OBLIVP2P-1 which provides both in P2P systems, and therefore select ORAM as a start- obliviousness and scalability properties in a tracker- ing point for our solution. To the best of our knowledge, based P2P system. To attain scalability, the key idea OBLIVP2P is the first work that adapts ORAM to ac- is to avoid any single entity (say the tracker) as a bot- cesses in a P2P setting. However, directly employing tleneck. This requires distributing all the ORAM oper- ORAM to hide access patterns in a P2P system is chal- ations for fetching and sharing of resources among the lenging. We outline two key challenges in designing an peers in the network, while still maintaining oblivious- oblivious and a scalable P2P protocol using ORAM. ness guarantees. To realize such a distributed protocol, Obliviousness. The first challenge arises due to the dif- our main building block, which we call Oblivious Selec- ference in the setting of a standard ORAM as compared tion (OblivSel), is a novel combination of private infor- to a P2P content sharing system. Classical ORAM solu- mation retrieval with recent advances in ORAM. Obliv- tions consists of a single client which securely accesses ious Selection gives us a scalable way to securely dis- an untrusted storage (server), wherein the client is even- tribute the load of the tracker. Our construction is proven tually the owner and the only user of the data in the mem- secure in the honest-but-curious adversary model. Con- ory. In contrast, P2P systems consist of a set of trusted structions and proofs for arbitrarily malicious fraction of trackers managing the network, and multiple data own- peers is slated for future work. ers (peers) in the network. Each peer acts both as a client as well as a server in the network i.e., a peer can either request for a data or respond to other peer’s request with 1.2 System and Results the data stored on its machine. Hence, adversarial peers We provide a prototype implementation of both present in the network can see the plaintext and learn the OBLIVP2P-0 and OBLIVP2P-1 protocols in Python. data requested by other peers, a threat that does not exists Our source code is available online [34]. We experimen- in the traditional ORAM model where only encrypted tally evaluate our implementation on DeterLab testbed data is seen by the servers. with 15 servers simulating up to 214 peers in the network. Scalability. The second challenge lies in seeking an Our experiments demonstrate that OBLIVP2P-0 is lim- oblivious P2P system that 1) the throughput scales lin- ited in scalability with the tracker as the main bottleneck. early with the number of peers in the network, 2) has no The throughput for OBLIVP2P-1, in contrast, scales lin- centralized bottleneck and 3) can be parallelized with an early with increase in the number of peers in the network. overall acceptable throughput. In standard ORAM solu- It attains an overall throughput of 3.19 MBps for a net- tions, the (possibly distributed) server is responsible for work of 214 peers that corresponds to 7 requests per sec- serving all the data access requests from a client one-by- ond for a block size of 512 KB. By design, OBLIVP2P-1 one. In contrast, P2P systems operate on a large-scale is embarrassingly parallelizable over the computational with multiple peers (clients) requesting resources from capacity available in a real P2P network.

Oblivp2p: an Oblivious Peer-To-Peer Content Sharing System

Will Sci-Hub Kill the Open Access Citation Advantage and (At Least for Now) Save Toll Access Journals?

Piracy of Scientific Papers in Latin America: an Analysis of Sci-Hub Usage Data

Practical Anonymous Networking?

Mixminion: Design of a Type III Anonymous Remailer Protocol

The Design, Implementation and Operation of an Email Pseudonym Server

Privacy-Enhancing Technologies for the Internet

A Concept of an Anonymous Direct P2P Distribution Overlay System

Social Network Based Anonymous Communication in Tor Peng Zhou, Xiapu Luo, Ang Chen, and Rocky K

Resource-Sharing Computer Communications Networks

Measuring Freenet in the Wild: Censorship-Resilience Under Observation

Sharing Music Files: Tactics of a Challenge to the Industry Brian Martin University of Wollongong, [email protected]

El Gamal Mix-Nets and Implementation of a Verifier