Is Content Publishing in BitTorrent Altruistic or Profit-Driven?

Ruben Cuevas, Univ. Carlos III de Madrid
Michal Kryczka, Institute IMDEA Networks and Univ. Carlos III de Madrid
Angel Cuevas, Univ. Carlos III de Madrid
Sebastian Kaune, TU Darmstadt
Carmen Guerrero, Univ. Carlos III de Madrid
Reza Rejaie, University of Oregon

arXiv:1007.2327v2 [cs.NI] 22 Jul 2010

ABSTRACT

BitTorrent is the most popular P2P content delivery application where individual users share various types of content with tens of thousands of other users. The growing popularity of BitTorrent is primarily due to the availability of valuable content without any cost for the consumers. However, apart from required resources, publishing (sharing) valuable (and often copyrighted) content has serious legal implications for users who publish the material (or publishers). This raises the question of whether (at least major) content publishers behave in an altruistic fashion or have other incentives, such as financial ones. In this study, we identify the content publishers of more than 55k torrents in 2 major BitTorrent portals and examine their behavior. We demonstrate that a small fraction of publishers are responsible for 66% of published content and 75% of the downloads. Our investigations reveal that these major publishers respond to two different profiles. On one hand, antipiracy agencies and malicious publishers publish a large amount of fake files to protect copyrighted content and spread malware respectively. On the other hand, content publishing in BitTorrent is largely driven by companies with financial incentive. Therefore, if these companies lose their interest or are unable to publish content, BitTorrent traffic/portals may disappear or at least their associated traffic will significantly reduce.

Keywords

BitTorrent, content publishing, business model

1. INTRODUCTION

Peer to Peer (P2P) file-sharing applications, and more specifically BitTorrent, are a clear example of killer applications in the last decade. For instance, BitTorrent is currently used by hundreds of millions of users and is responsible for a large portion of the Internet traffic share [4]. This has attracted the attention of the research community, which has mainly focused on understanding the technical aspects of BitTorrent functionality [9, 15, 12] and improving its performance [18, 14]. Moreover, a few papers have analyzed the demographics of BitTorrent [21, 11, 19] and also the security [20] and privacy issues [6, 7]. However, the socio-economic aspects associated with BitTorrent in particular, and with other P2P file sharing systems in general, have received little attention despite their importance to a complete understanding of this kind of application. This paper is a first step in this direction.

The availability of free popular (often copyrighted) content that is of interest to millions of people (e.g. recent TV show episodes, Hollywood movies, etc.) is the key pillar that makes BitTorrent an extremely successful system. In this paper we study content publishing in BitTorrent from a technical and, more importantly, socio-economic point of view. In short, we try to unravel who publishes content in BitTorrent, and why.

For this purpose we rely on real data from a large scale measurement study performed over two large BitTorrent portals (Mininova and The Pirate Bay). Our dataset (> 300 GB) consists of information on more than 35M IP addresses and more than 55K published contents, including the content publisher.

Using this dataset, we have first looked at the contribution of the different content publishers and conclude that just a few publishers (around 100) are responsible for uploading 2/3 of the contents that serve 3/4 of the downloads in our major dataset. Furthermore, an important part of these major publishers consume few or even no contents; rather, they dedicate their resources (almost) only to seeding the published content. This is unusual behavior since standard BitTorrent users typically employ their resources for both seeding and downloading contents. Therefore, our observation reveals that major BitTorrent publishers present an anomalous behavior. This argument is reinforced after checking that many of the files published by these major publishers are copyrighted. Thus, major publishers not only expend their resources without any apparent benefit, but they also face legal reactions due to the publication of copyrighted content [1, 2]. These findings raise the following questions: are these major publishers good citizens that allocate a great deal of resources and assume legal risk for the good of the community? Or, on the contrary, do they have any (still) unknown incentive to behave in this manner?

To answer these questions we perform a systematic study of the aforementioned major publishers. We first discover the identity of these major publishers by looking at their associated usernames and IPs. This allows us to classify them into two different groups: fake publishers publish a large amount of fake content and top publishers publish a large amount of proper (often copyrighted) content.

Afterwards we study the main characteristics of these groups, such as the popularity of the content they publish and their seeding behavior (i.e. their signature). On one hand, our results reveal that the falseness of the content published by fake publishers makes their swarms unpopular and obliges them to seed multiple torrents in parallel across long sessions. On the other hand, top publishers are responsible for very popular contents for which they guarantee a proper seeding.

Finally we exploit the available information related to these publishers (e.g. in the BitTorrent portals) and conclude that fake publishers are linked to antipiracy agencies and malicious users, whereas half of top publishers run their own web sites that bring them economic benefits that are very substantial in a few cases.

In summary the main contributions of this paper are:

• A simple measurement methodology to monitor the content publishing activity in major BitTorrent portals. This methodology has been used to implement a system that continuously monitors the content publishing activity in The Pirate Bay portal. The data gathered is made publicly available through a web interface.

• The major portion of content publishing activity in BitTorrent is concentrated in a relatively small set of publishers (around 100) that are responsible for 2/3 of the published content and 3/4 of the downloads. This set of publishers can be further divided into three subsets that we name fake publishers, altruistic top publishers and profit-driven top publishers.

• fake publishers are set up by either anti-piracy agencies or malicious users and are responsible for 30% of the content and 25% of downloads. This means that these publishers sustain a continuous poisoning-like index attack [16] against BitTorrent portals that, based on our results, affects millions of downloaders.

• profit-driven top publishers own fairly profitable web sites. They use major BitTorrent portals such as The Pirate Bay as a platform to advertise their web site to millions of users. For this purpose they publish popular torrents to which they attach the URL of their web site in various manners. The publishers linked to this business model are responsible for around 30% of content and 40% of downloads.

The rest of the paper is organized as follows. Section 2 describes our measurement methodology. Sections 3 and 4 are dedicated to the identification of major publishers and their main characteristics (i.e. signature) respectively. In Section 5 we study the incentives that major publishers have to perform this activity. Section 6 presents other players that also benefit from content publishing. In Section 7 we describe our publicly available application to monitor content publishing activity in The Pirate Bay portal. Finally Section 8 discusses related work and Section 9 concludes the paper.

2. MEASUREMENT METHODOLOGY

This section describes our methodology to identify the initial publisher of a file that is distributed through a BitTorrent swarm. Towards this end, we first briefly describe the required background on how a user joins a BitTorrent swarm.

Background: A BitTorrent client takes the following steps to join the swarm associated with file X. First, the client obtains the .torrent file associated with the desired swarm. The .torrent file contains contact information for the tracker that manages the swarm and the number of pieces of file X, etc. Second, the client connects to the tracker and obtains the following information: (i) the number of seeders and leechers that are currently connected to the swarm, and (ii) N (typically 50) random IP addresses of participating peers in the swarm. Furthermore, if the number of neighbors eventually falls below a given threshold (typically 20), the client contacts the tracker again to learn about other peers in the swarm.

To facilitate the bootstrapping process, the .torrent files are typically indexed at BitTorrent portals. Some of the major portals (e.g. The Pirate Bay or Mininova) index millions of .torrent files [21], classify them into different categories and provide a web page with detailed information (content category, publisher's username, file size, and file description). These portals also offer an RSS feed to announce a newly published file. The RSS feed also gives some extra information such as content category, content size and the username that published the .torrent file.

Identifying Initial Publisher: In BitTorrent a content publisher is identified by its username [21] and IP address. The objective of our measurement study is to

Portal Start End #Torrents #IP addresses

rate that is allowed by the tracker (i.e.
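As an illustration of the bootstrapping steps in the Background paragraph of Section 2, the following is a minimal sketch of the tracker interaction, not the measurement tool used in this paper. The class ToyTracker, the function refresh_neighbors and the constant names are hypothetical; the tracker is simulated in memory, and only the values 50 and 20 come from the text.

```python
import random

PEERS_PER_RESPONSE = 50  # N in the text ("typically 50")
MIN_NEIGHBORS = 20       # re-contact threshold in the text ("typically 20")


class ToyTracker:
    """Hypothetical in-memory stand-in for a tracker (not a real BitTorrent API)."""

    def __init__(self, seeders, leechers, rng=None):
        self.seeders = set(seeders)
        self.leechers = set(leechers)
        self.rng = rng or random.Random()

    def announce(self, n=PEERS_PER_RESPONSE):
        """Return (#seeders, #leechers, up to n random peer IPs from the swarm)."""
        swarm = sorted(self.seeders | self.leechers)
        peers = self.rng.sample(swarm, min(n, len(swarm)))
        return len(self.seeders), len(self.leechers), peers


def refresh_neighbors(neighbors, tracker):
    """Contact the tracker again only if the neighbor set fell below the threshold."""
    if len(neighbors) < MIN_NEIGHBORS:
        _, _, peers = tracker.announce()
        return set(neighbors) | set(peers)
    return set(neighbors)
```

A client holding only a handful of neighbors would call refresh_neighbors and receive up to 50 additional random addresses, whereas a client with 20 or more neighbors would leave the tracker alone.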