Characterizing Peer-Level Performance of Bittorrent

Total Page:16

File Type:pdf, Size:1020Kb

Characterizing Peer-Level Performance of Bittorrent Characterizing Peer-level Performance of BitTorrent DRP Report Amir H. Rasti ABSTRACT 1. INTRODUCTION BitTorrent is one of the most popular Peer-to-Peer During the past few years, peer-to-peer appli- (P2P) content distribution applications over the cations have become very popular on the In- Internet that significantly contributes in network ternet. BitTorrent in one of the most popu- traffic. lar peer-to-peer applications providing scalable peer-to-peer content distribution over the In- In BitTorrent, a file is divided into segments ternet. Some recent studies [8] have shown that and participating peers contribute their outgoing BitTorrent is accountable for approximately bandwidth by providing their available segments 35% of the Internet traffic. BitTorrent is a to other peers while obtaining their missing peers scalable peer-to-peer content distribution sys- from others. Characterization of BitTorrent is use- tem that enables one-to-many distribution of ful in determining its performance bottlenecks as large files without requiring a large access link well as its impact on the network. bandwidth at the source. Similar to other peer- In this study, we try to address the following two to-peer systems, it uses resources of participat- key questions through measurement: (i) What are ing peers to increase the capacity of the system. the main factors that affect observed performance The main shared resource in BitTorrent is the by individual peers in BitTorrent?, and (ii) What up-link bandwidth of individual peers. The file are the contributions of these factors on the per- being distributed is divided into a large number formance of individual peers? To address these of segments. The source provides different seg- questions, first we examine the group-level and ments of the content to different peers. Partic- peer-level characteristics of BitTorrent using three ipating peers connect to each other to form an tracker logs from different sources. Second, we use overlay and peers exchange the available seg- statistical analysis (namely rank correlation and ments until they have downloaded the entire linear regression) to determine and quantify the content. potential effects of both peer-level and group-level properties on the performance of individual peers. This approach enables participating peers to contribute their outgoing bandwidth which We conclude that: (i) There is no single prop- makes BitTorrent a scalable system. Peers erty that has dominant effect on the observed per- may stay in the system once their download- formance by individual peers, (ii) Outgoing band- ing is complete acting as a seed. This be- width of each peer, average available content in havior whether intentionally or not, makes the the group and churn rate appear to have the most available resources in the system richer and im- notable effect on peer performance and (iii) The proves its robustness to system dynamics. behavior of the system in practice is rather com- plex due to the inherent dynamics in peer partic- BitTorrent is widely used for the distribu- ipation and content delivery as well as bandwidth tion of free software such as Linux as well as heterogeneity and asymmetry. copies of the latest movies and music albums. 1 It is believed to successfully utilize participat- (i) What are the dominant key peer or group ing peers' resources quickly and to be able properties that are more likely to determine the to handle flash crowds with acceptable perfor- observed performance by individual peers? (ii) mance [9](i.e., download time) despite the dy- What is the contribution of each property and namics of peer participation. It is also believed how we can quantify that? that BitTorrent's \tit-for-tat" is a successful in- To answer these questions, we examine the centive mechanism singling out free riders. observed performance by participating peers in The goal of individual peer is to maximize three popular torrents using BitTorrent tracker their download rates by finding high speed up- logs. We identify different challenges in (i) esti- loaders and keeping them satisfied by provid- mating peer and group properties from BitTor- ing them with a good upload rate. However, rent tracker logs, (ii) dealing with large data the extent to which each peer is successful in set and (iii) identifying and coping with vari- achieving this goal or its observed performance ous errors and bugs in a data set. We also ex- is the subject of this project. We believe that plain why assessing the observed performance the performance a peer will observe, potentially by individual peers is not trivial. depends on two sets of parameters: group prop- Using the derived peer and group properties erties and peer properties. The first set of pa- for three data sets, we conduct the following rameters capture the group status from differ- analysis. First, we present evolution of group ent aspects and are defined as a function of properties over time which illustrates existing time: (i) group population, (ii) peer arrival and dynamics in a torrent. The evolution usually departure rates (churn), (iii) average content includes an initial flash crowd pattern followed availability among participating peers and (iv) by a long time in which the system experiences percentage of seeds are the main group prop- rather stable conditions. Second, we show the erties that might affect a peer's performance. distribution of peer properties and observed The second set are the properties of the peer group properties during life time of individual itself regardless of the group. The peer's incom- peers across participating peers in a torrent. ing and outgoing bandwidth, geographical loca- These results illustrate that the observed per- tion, local configuration, ratio of contribution formance could spread over a wide range often (upload over download) are examples of peer with no clear modes or high density regions. properties that might potentially affect peer's performance. To capture peer properties, we Finally, we employ several statistical analy- record a signature of each peer's properties dur- sis techniques to identify key properties that ing its life time using statistical parameters. determine observed performance by each peer: (i) We investigate pairwise statistical correla- While BitTorrent enables a single peer to dis- tion between observed performance and prop- tribute content to a large number of interested erties and find out that the minimum outgoing receivers, it is unclear what factor(s), deter- bandwidth and average group content availabil- mine the observed performance, namely down- ity have the most significant correlation with load rate, by individual peers. the performance. (ii) Using linear regression In this paper, our goal is to empirically an- we show that there is no linear model accu- swer the following basic questions about ob- rately describing the observed performance us- served performance by individual peers in Bit- ing the peer and group properties. However, in Torrent: the generated linear model the median outgo- 2 ing bandwidth and average group content avail- within this group. Download requests may be ability have the most significant effects on the sent to a large list but uploading is controlled model. (iii) Using data mining decision tree by the incentive mechanism and is limited to a technique, we observed that the decision tree small set (5 peers) at any time. generated in all cases is very deep (10 fold) and At any given time, participating peers are large. This implies that there is no common divided into two groups; namely leechers and order of importance among the peer and group seeds. A seed is a peer who has already down- properties determining the peer performance. loaded the whole file and is only helping oth- In a nutshell, our results indicate that: (i) ers to download. Leechers are those still in the there is no single property that determines ob- downloading process. They may have some seg- served peer performance, (ii) peer's outgoing ments of the file downloaded but they are still bandwidth, available content in the group and downloading other segments while contribut- peer arrival/departure rates appear to have the ing their up-link bandwidth to other peers who most effect on peer performance, and (iii) in need the segments that they have. Normally a real world (actually deployed) system, there a peer joins the swarm as a leecher with no are many unexpected effects making the system segments available and then it gradually down- unpredictable and chaotic. loads all segments of the file and eventually be- This report continues in the following order. comes a seed. A seed keeps helping other peers In Section 2, we present an overview of Bit- until it leaves. In some cases, people voluntar- Torrent system explaining common terms and ily seed the content they have already down- describing mechanisms BitTorrent uses. Sec- loaded by staying in the torrent for a longer tion 3, categorizes the related work on BitTor- time or rejoining the torrent at a later time. rent characterization using modeling, simula- Before joining the swarm, the user has to tion and empirical studies. Our methodology, download a meta file called the torrent file. challenges of this project, possible approaches This file contains all the information needed and the specifications of the data set we have to join the swarm and download the content. used, are described in Section 4. In Section 5, The torrent file is usually downloaded from one we present our characterization results. Finally of the popular BitTorrent hosting and index- in Section 6, we provide an analysis of the re- ing web sites but it can also be distributed by sults. other means such as e-mail. Indexing web sites, provide a list of torrents matching a particular 2. BITTORRENT: AN search query which allows users to download OVERVIEW the torrent file associated with the desired con- tent.
Recommended publications
  • OES 2 SP3: Novell CIFS for Linux Administration Guide 9.2.1 Mapping Drives from a Windows 2000 Or XP Client
    www.novell.com/documentation Novell CIFS Administration Guide Open Enterprise Server 2 SP3 May 03, 2013 Legal Notices Novell, Inc., makes no representations or warranties with respect to the contents or use of this documentation, and specifically disclaims any express or implied warranties of merchantability or fitness for any particular purpose. Further, Novell, Inc., reserves the right to revise this publication and to make changes to its content, at any time, without obligation to notify any person or entity of such revisions or changes. Further, Novell, Inc., makes no representations or warranties with respect to any software, and specifically disclaims any express or implied warranties of merchantability or fitness for any particular purpose. Further, Novell, Inc., reserves the right to make changes to any and all parts of Novell software, at any time, without any obligation to notify any person or entity of such changes. Any products or technical information provided under this Agreement may be subject to U.S. export controls and the trade laws of other countries. You agree to comply with all export control regulations and to obtain any required licenses or classification to export, re-export or import deliverables. You agree not to export or re-export to entities on the current U.S. export exclusion lists or to any embargoed or terrorist countries as specified in the U.S. export laws. You agree to not use deliverables for prohibited nuclear, missile, or chemical biological weaponry end uses. See the Novell International Trade Service Web page (http://www.novell.com/info/exports/) for more information on exporting Novell software.
    [Show full text]
  • Shared Resource Matrix Methodology: an Approach to Identifying Storage and Timing Channels
    Shared Resource Matrix Methodology: An Approach to Identifying Storage and Timing Channels RICHARD A. KEMMERER University of California, Santa Barbara Recognizing and dealing with storage and timing channels when performing the security analysis of a computer system is an elusive task. Methods for discovering and dealing with these channels have mostly been informal, and formal methods have been restricted to a particular specification language. A methodology for discovering storage and timing channels that can be used through all phases of the software life cycle to increase confidence that all channels have been identified is presented. The methodology is presented and applied to an example system having three different descriptions: English, formal specification, and high-order language implementation. Categories and Subject Descriptors: C.2.0 [Computer-Communication Networks]: General--se- curity and protection; D.4.6 ]Operating Systems]: Security and Protection--information flow controls General Terms: Security Additional Key Words and Phrases: Protection, confinement, flow analysis, covert channels, storage channels, timing channels, validation 1. INTRODUCTION When performing a security analysis of a system, both overt and covert channels of the system must be considered. Overt channels use the system's protected data objects to transfer information. That is, one subject writes into a data object and another subject reads from the object. Subjects in this context are not only active users, but are also processes and procedures acting on behalf of the user. The channels, such as buffers, files, and I/O devices, are overt because the entity used to hold the information is a data object; that is, it is an object that is normally viewed as a data container.
    [Show full text]
  • Shared Resource
    Shared resource In computing, a shared resource, or network share, is a computer resource made available from one host to other hosts on a computer network.[1][2] It is a device or piece of information on a computer that can be remotely accessed from another computer transparently as if it were a resource in the local machine. Network sharing is made possible by inter-process communication over the network.[2][3] Some examples of shareable resources are computer programs, data, storage devices, and printers. E.g. shared file access (also known as disk sharing and folder sharing), shared printer access, shared scanner access, etc. The shared resource is called a shared disk, shared folder or shared document The term file sharing traditionally means shared file access, especially in the context of operating systems and LAN and Intranet services, for example in Microsoft Windows documentation.[4] Though, as BitTorrent and similar applications became available in the early 2000s, the term file sharing increasingly has become associated with peer-to-peer file sharing over the Internet. Contents Common file systems and protocols Naming convention and mapping Security issues Workgroup topology or centralized server Comparison to file transfer Comparison to file synchronization See also References Common file systems and protocols Shared file and printer access require an operating system on the client that supports access to resources on a server, an operating system on the server that supports access to its resources from a client, and an application layer (in the four or five layer TCP/IP reference model) file sharing protocol and transport layer protocol to provide that shared access.
    [Show full text]
  • Resource Management in Multimedia Networked Systems
    University of Pennsylvania ScholarlyCommons Technical Reports (CIS) Department of Computer & Information Science May 1994 Resource Management in Multimedia Networked Systems Klara Nahrstedt University of Pennsylvania Ralf Steinmetz University of Pennsylvania Follow this and additional works at: https://repository.upenn.edu/cis_reports Recommended Citation Klara Nahrstedt and Ralf Steinmetz, "Resource Management in Multimedia Networked Systems ", . May 1994. University of Pennsylvania Department of Computer and Information Science Technical Report No. MS-CIS-94-29. This paper is posted at ScholarlyCommons. https://repository.upenn.edu/cis_reports/331 For more information, please contact [email protected]. Resource Management in Multimedia Networked Systems Abstract Error-free multimedia data processing and communication includes providing guaranteed services such as the colloquial telephone. A set of problems have to be solved and handled in the control-management level of the host and underlying network architectures. We discuss in this paper 'resource management' at the host and network level, and their cooperation to achieve global guaranteed transmission and presentation services, which means end-to-end guarantees. The emphasize is on 'network resources' (e.g., bandwidth, buffer space) and 'host resources' (e.g., CPU processing time) which need to be controlled in order to satisfy the Quality of Service (QoS) requirements set by the users of the multimedia networked system. The control of the specified esourr ces involves three actions: (1) properly allocate resources (end-to-end) during the multimedia call establishment, so that traffic can flow according to the QoS specification; (2) control resource allocation during the multimedia transmission; (3) adapt to changes when degradation of system components occurs.
    [Show full text]
  • Novell Cluster Services,. for Linux. and Netware
    Novell Cluster Services,. for Linux. and NetWare. ROB BASTIAANSEN SANDER VAN VUGT Novell PRESS. Novell. Published by Pearson Education, Inc. 800 East 96th Street, Indianapolis, Indiana 46240 USA Table of Contents Introduction 1 CHAPTER 1: Introduction to Clustering and High Availability 5 Novell Cluster Services Defined 5 Shared Disk Access 6 Secondary IP Addresses 7 Clustering Terminology 8 High-Availability Solutions Overview 12 Novell Cluster Services 12 Business Continuity Clustering 13 PolyServe Matrix Server 15 Heartbeat Subsystem for High-Availability Linux 16 When Not to Cluster Applications 16 Availability Defined 18 High Availability Defined 18 Calculating Average Downtime 21 Avoiding Downtime 22 Hardware 22 Environment 23 Software 23 Procedures 24 Novell Cluster Services Requirements 24 Hardware Requirements 24 Software Requirements 26 CHAPTER 2: Examining Novell Cluster Services Architecture 27 Novell Cluster Services Objects and Modules 27 Cluster eDirectory Objects 28 Cluster Modules 31 IH Novell Cluster Services for Linux and NetWare Heartbeats, Epoch Numbers, and the Split Brain Detector 35 Removing a Failing Slave Node 36 Removing a Failed Master Node 37 Summary 37 CHAPTER 3: Clustering Design 39 Cluster Design Guidelines 39 How Many Nodes to Choose 39 Using a Heartbeat LAN or Not 40 Use NIC Teaming 41 Choosing Storage Methods 42 Mirror the Split Brain Detector Partition 48 Selecting Applications to Run in a Cluster 48 eDirectory Cluster Guidelines 50 Creating a Failover Matrix 52 Application-Specific Design Guidelines
    [Show full text]
  • Novell® Platespin® Recon 3.7.4 User Guide 5.6.4 Printing and Exporting Reports
    www.novell.com/documentation User Guide Novell® PlateSpin® Recon 3.7.4 September 2012 Legal Notices Novell, Inc., makes no representations or warranties with respect to the contents or use of this documentation, and specifically disclaims any express or implied warranties of merchantability or fitness for any particular purpose. Further, Novell, Inc., reserves the right to revise this publication and to make changes to its content, at any time, without obligation to notify any person or entity of such revisions or changes. Further, Novell, Inc., makes no representations or warranties with respect to any software, and specifically disclaims any express or implied warranties of merchantability or fitness for any particular purpose. Further, Novell, Inc., reserves the right to make changes to any and all parts of Novell software, at any time, without any obligation to notify any person or entity of such changes. Any products or technical information provided under this Agreement may be subject to U.S. export controls and the trade laws of other countries. You agree to comply with all export control regulations and to obtain any required licenses or classification to export, re-export or import deliverables. You agree not to export or re-export to entities on the current U.S. export exclusion lists or to any embargoed or terrorist countries as specified in the U.S. export laws. You agree to not use deliverables for prohibited nuclear, missile, or chemical biological weaponry end uses. See the Novell International Trade Services Web page (http://www.novell.com/info/exports/) for more information on exporting Novell software.
    [Show full text]
  • The Pirate Bay Liability
    THE PIRATE BAY LIABILITY University of Oslo Faculty of Law Candidate name: Angela Sobolciakova Supervisor: Jon Bing Deadline for submission: 12/01/2011 Number of words: 16,613 10.12.2011 Abstract The thesis discuses about peer-to-peer technology and easy availability of an Internet access which are prerequisites to a rapid growth of sharing data online. File sharing activities are managing without the copyright holder‟s permission and so there is a great opportunity of infringing exclusive rights. The popular pee-to-peer website which is enabling immediate file sharing is for example www.thepiratebay.org – the object of this thesis. The copyright law is obviously breaching by end users who are committing these acts. However, on the following pages we are dealing with the third party liability – liability of online intermediaries for unlawful acts committed by their users. A file sharing through the pee-to-peer networks brings benefits for the Internet users. They need no special knowledge in order to learn how to use the technology. The service of www.thepiratebay.org website is offering simultaneously users an access to a broad spectrum of legal content and a copyright protected works. The service is mainly free of charge and the users can find a data they are interested in quickly and in a users‟ friendly format. The aim of the thesis is to compare the actual jurisprudential status of liability of intermediary information society service providers for the file sharing activities on www.thepiratebay.org. 2 Acknowledgement I would like to thank Professor Jon Bing, I am grateful to him for supervising the thesis.
    [Show full text]
  • A Study of Peer-To-Peer Systems
    A Study of Peer-to-Peer Systems JIA, Lu A Thesis Submitted in Partial Fulfilment of the Requirements for the Degree of Master of Philosophy in Information Engineering The Chinese University of Hong Kong August 2009 Abstract of thesis entitled: A Study of Peer-to-Peer Systems Submitted by JIA, Lu for the degree of Master of Philosophy at The Chinese University of Hong Kong in June 2009 Peer-to-peer (P2P) systems have evolved rapidly and become immensely popular in Internet. Users in P2P systems can share resources with each other and in this way the server loading is reduced. P2P systems' good performance and scalability attract a lot of interest in the research community as well as in industry. Yet, P2P systems are very complicated systems. Building a P2P system requires carefully and repeatedly thinking and ex- amining architectural design issues. Instead of setting foot in all aspects of designing a P2P system, this thesis focuses on two things: analyzing reliability and performance of different tracker designs and studying a large-scale P2P file sharing system, Xun- lei. The "tracker" of a P2P system is used to lookup which peers hold (or partially hold) a given object. There are various designs for the tracker function, from a single-server tracker, to DHT- based (distributed hash table) serverless systems. In the first part of this thesis, we classify the different tracker designs, dis- cuss the different considerations for these designs, and provide simple models to evaluate the reliability of these designs. Xunlei is a new proprietary P2P file sharing protocol that has become very popular in China.
    [Show full text]
  • Server I/O Networks Past, Present, and Future Renato John Recio Chief Architect, IBM Eserver I/O IBM Systems Group, Austin, Texas [email protected]
    Server I/O Networks Past, Present, and Future Renato John Recio Chief Architect, IBM eServer I/O IBM Systems Group, Austin, Texas [email protected] Abstract • Low latency - the total time required to transfer the first bit of a message from the local application to the remote appli- Enterprise and technical customers place a diverse set of cation, including the transit times spent in intermediate requirements on server I/O networks. In the past, no single switches and I/O adapters. network type has been able to satisfy all of these requirements. As • High throughput - the number of small block transactions a result several fabric types evolved and several interconnects performed per second. The size and rate of small block I/O emerged to satisfy a subset of the requirements. Recently several depends on the workload and fabric type. A first level technologies have emerged that enable a single interconnect to be approximation of the I/O block size and I/O throughput rates required for various I/O workloads can be found in used as more than one fabric type. This paper will describe the “I/O Workload Characteristics of Modern Servers” [4]. requirements customers place on server I/O networks; the various • High bandwidth - the number of bytes per second sup- fabric types and interconnects that have been used to satisfy those ported by the network. Typically, used to gauge perfor- requirements; the technologies that are enabling network mance for large block data transfers. Peek bandwidth convergence; and how these new technologies are being deployed describes the bandwidth provided by a given link; whereas on various network families.
    [Show full text]
  • NFS Gateway for Netware 6 Administration Guide October 22, 2003
    Novell Confidential Manual (99a) 11 September 2003 Novell * NFS Gateway for NetWare® 6 www.novell.com ADMINISTRATION GUIDE October 22, 2003 Novell Confidential Manual (99a) 11 September 2003 Legal Notices Novell, Inc. makes no representations or warranties with respect to the contents or use of this documentation, and specifically disclaims any express or implied warranties of merchantability or fitness for any particular purpose. Further, Novell, Inc. reserves the right to revise this publication and to make changes to its content, at any time, without obligation to notify any person or entity of such revisions or changes. Further, Novell, Inc. makes no representations or warranties with respect to any software, and specifically disclaims any express or implied warranties of merchantability or fitness for any particular purpose. Further, Novell, Inc. reserves the right to make changes to any and all parts of Novell software, at any time, without any obligation to notify any person or entity of such changes. You may not export or re-export this product in violation of any applicable laws or regulations including, without limitation, U.S. export regulations or the laws of the country in which you reside. Copyright © 2003 Novell, Inc. All rights reserved. No part of this publication may be reproduced, photocopied, stored on a retrieval system, or transmitted without the express written consent of the publisher. U.S. Patent No. 5,157,663; 5,349,642; 5,455,932; 5,553,139; 5,553,143; 5,572,528; 5,594,863; 5,608,903;5,633,931; 5,652,859; 5,671,414;
    [Show full text]
  • Torrent Crawler: a Tool for Collecting Information from Bittorrent Networks
    Torrent Crawler: a tool for collecting information from BitTorrent networks Yeounoh Chung 1. Abstract structure and the unpredictability of the clients can cause problems in the network, making errors BitTorrent is a free peer-to-peer (P2P) difficult to detect and diagnosis. The reliance of a content-sharing application with a complex and node on node local views to maintain the overlay dynamic overlay structure due to loose coupling, high structure further compounds the problem. Defects or churn rate, and varying responsiveness of nodes. The anomalies such as partitioning in the overlay or load complexity and the dynamic nature of the overlay imbalance due to biased peer selection are structure can mask the problems in the network, undetectable without partial global information and a making errors difficult to detect and diagnosis in a good understanding of the various characteristics and timely manner. Furthermore, the heavy reliance of dynamic behaviors of the network. However, such an clients on the node local views compounds the understanding was often times missing or incorrectly problems such as partitioning in the network or load obtained from non-representative measurement imbalance due to biased peer selection. studies, in which a few instrumented clients were In an effort to provide the network with used to capture the desired properties. partial global information to resolve the network Collecting representative global information problems, this project looks into introducing a tool such as peer’s download rate, known neighboring that efficiently collects global information from peers, and the network’s churn rate, from the BitTorrent network. The tool, called Torrent Crawler complex and dynamic BitTorrent network is not (TC) uses a number of techniques to efficiently find simple.
    [Show full text]
  • Casual Resource Sharing with Shared Virtual Folders
    MASTER THESIS IN COMPUTER SCIENCE Casual Resource Sharing with Shared Virtual Folders Siri Birgitte Uldal June 15th, 2007 FACULTY OF SCIENCE Department of Computer Science University of Tromsø, N-9037 Tromsø Casual Resource Sharing with Shared Virtual Folders Siri Birgitte Uldal June 15th, 2007 ___________________ University of Tromsø 2007 Abstract Proliferation of wireless networks has been a major trigger behind increased mobility of computing devices. Along with increased mobility come requests for ad-hoc exchange of resources between computing devices as an extension of humans interacting. We termed it casual resource sharing where resources in this thesis have been narrowed down to files only. We have named our casual resource sharing model for shared virtual folders (SVF). SVFs can be looked upon as a common repository much in the same way as the tuplespace model. The SVF members perceive the repository similarly to a common file directory on a server, while in reality all participating devices stores their own contribution of files. All types of files could be added to the repository and shared. To become a SVF member one needs to be invited by another member or initiate a SVF oneself. All members are free to withdraw their SVF membership whenever they wish. They are also free to log on to the SVF and log out as they please. The SVF cease to exist when the last member has drawn his membership. The SVF implements a simple versioning detection system to alert members when a file has been modified by another member. Feasibility of the model is demonstrated in a prototype implementation based on Java and the JXTA middleware, a peer-to-peer (P2P) infrastructure middleware supporting the Internet protocol.
    [Show full text]