UNDERSTANDING the IMPACT of BITTORRENT on CABLE NETWORKS Jim Martin Terry Shaw Clemson University Cablelabs
Total Page:16
File Type:pdf, Size:1020Kb
UNDERSTANDING THE IMPACT OF BITTORRENT ON CABLE NETWORKS Jim Martin Terry Shaw Clemson University CableLabs Abstract P2P networks1. However, BitTorrent can consume tremendous amounts of bandwidth BitTorrent is a peer-to-peer (P2P) in both the upstream and downstream protocol for sharing media (audio, video directions. The primary incentive used by and software) files that accounts for 18% of BitTorrent to prevent free-riding is that the traffic on cable high-speed data networks. In protocol mandates that a file-downloader order to better understand the network must make some content available for dynamics of traffic associated with this upload. One study from several cable protocol, we analyzed roughly 1 Gbytes of operators found that 18% of all traffic is network trace data from BitTorrent clients BitTorrent [WSJ05]. The government and running in three different environments. the cable industry are addressing the Based on this data, we performed a evolving economic and policy issues. In this simulation-based analysis of a DOCSIS®- study we explore the impact that BitTorrent based cable network involving a varying users can have on a cable network. We number of simulated BitTorrent users and provide insight to this issue by presenting an web browsing users. Our results suggest analysis of BitTorrent involving live that as few as 10 BitTorrent clients can network measurement and simulation2. double the access delay experienced by other non-BitTorrent users. In addition to file sharing, P2P has been used for grid computation [Entropia], INTRODUCTION storage [DKK+01], web caching [IRD02] and directory services [RFH+01, SMK+01]. BitTorrent is a peer-to-peer (P2P) BitTorrent, however, is designed protocol that provides scalable file sharing specifically for file distribution. The capabilities. As in other P2P systems, algorithms contained in BitTorrent were BitTorrent provides an overlay network that heavily influenced by previous P2P runs over the Internet. BitTorrent is unique protocols and applications. We briefly from other P2P protocols in several aspects. overview two such protocols, Gnutella and The most notable of these are: 1) the use of Kazaa, before presenting BitTorrent. swarming; and 2) the use of incentives to prevent free-riding. BitTorrent, frequently the dominant application in a network, poses a dilemma for cable network service providers. In general, P2P applications are broadband applications that contribute to the 1 In a recent study, it was estimated that about 66% of demand for high speed broadband access. American teenagers have downloaded audio and/or video files from the Internet. It was also found that This usage is driven by younger Internet 30% of teenagers admit to currently using P2P users. In a recent study, it was estimated that networks and 28% admit to using P2P networks in about 58% of American teenagers with the past [PIP05]. broadband access have downloaded audio 2 The observations that we report represent and/or video files from the Internet using preliminary work. We expect our findings and conclusions to evolve as the study progresses. Gnutella user has a URL of the file to be downloaded. BitTorrent is therefore a protocol for Gnutella is an overlay network distributing files. Files are broken up into superimposed on the Internet chunks and downloaded in pieces. This [MAR+04,IVK]. Unlike the centralized concept is borrowed from [RB02] whose Napster approach, Gnutella is decentralized authors experimented with several parallel because meta-data (i.e., the indexes of where access schemes to download a file from files are located) is distributed throughout multiple servers concurrently. BitTorrent the network. At initialization time, a files are represented by a torrent which is a Gnutella client accesses a host server to meta-information file that identifies the obtain a list of peers currently in the chunks associated with a file and a tracker network. After this step, peers learn about address that knows about the file. the network and available content using Downloaders, also referred to as clients or Ping and Pong messages. Once a client has peers, download the torrent that is available identified a file to download, it sends the on a web site. The download engages the request to the network using a query BitTorrent client software that was installed message. Query replies will contain on its machine which contacts the tracker information needed for the client to that was identified in the torrent to obtain a download the file. number of IP address/port pairs corresponding to peers that have one or Kazaa more chunks of the file. The client periodically reports its state to the tracker Few details of the Kazaa protocol are (about every 30 minutes). The client available. Several ‘black box’ measurement attempts to maintain connections with at studies do however provide some insight least 20 peers (referred to as the peer set). If [LKR04,LRW03]. There are two types of it can not, it asks the tracker for additional nodes: Super Nodes (SN) and Ordinary peers. When a client first connects to a Nodes (ON). Each ON must be told or must tracker, the tracker attempts to reverse learn the address of at least one SN. SNs connect to it. If it succeeds, the client is maintain lists of other SNs that can be added to the list of peers for the torrent thousands large. Each SN maintains records maintained by the tracker. This ensures that for files located at the ONs in its domain. the peers in the client’s peer set will accept ONs contact their SN on initialize, providing incoming connections. At least one of the metadata that describes the content located peers must contain all of the file. Once a at the ON. When an ON requests a file, the peer has downloaded the entire file, it SN looks locally and also propagates the becomes a seed and provides chunks of the request to other SNs in the overlay network. file to lechers for free. The system is self- To improve performance, files can be stored scalable in that as the number of in fragments on multiple nodes and they can downloaders increase, the number of nodes be transferred in parallel. that can function as seeds also increase. BitTorrent BitTorrent breaks a file into chunks, or pieces. To enhance performance, BitTorrent While other P2P systems provide further breaks pieces into small (typically 16 distributed object location in addition to file Kbytes) sub-pieces (sometimes called downloading, BitTorrent assumes that the blocks) and maintains a strategy of having at least 5 concurrent block requests pipelined an ‘optimistic unchoke.’ Every third rechoke to a given peer. The client contains a piece period, it unchokes a peer by uploading to it selection algorithm that decides the next regardless of its download rate. Therefore, a piece to obtain. The algorithm has three BitTorrent client is allowed to upload to a modes of operation: starting, downloading, maximum of five peers, four that implement or finishing. When the client starts it has tit-for-tat bartering to maximize downstream nothing to upload. The algorithm randomly throughput and a fifth that probes the selects the initial pieces. Once one or more network searching for better performing pieces are received, the algorithm switches peers. to a rarest first algorithm. Based on information learned from its peers, the client BitTorrent is unique from other P2P chooses to download the chunk that is least protocols in several aspects: 1) the use of available. This ensures that peers will have swarming; 2) the use of incentives to pieces that they will need. Once the peer has prevent free-riding. BitTorrent, frequently downloaded all but the last few pieces, it the dominant application in a network, poses might experience significant delay if the a dilemma for cable network service remaining pieces are located at nodes that providers. It is a broadband application are available only over low-bandwidth which contributes to the demand for high- paths. To avoid this, it sends requests for speed broadband access. However, these sub-pieces to all peers. BitTorrent can consume tremendous amounts of bandwidth. In this study, we A peer actively controls its downstream explore the impact that BitTorrent users can bandwidth consumption by granting upload have on a cable network. We provide insight requests to peers from which it is currently to this issue by presenting an analysis of downloading a piece. In essence, it provides BitTorrent involving live network a ‘tit-for-tat’ bartering approach to measurement and simulation. This paper is distributed resource management. By organized as follows. First, we overview choosing to upload to peers that provide the related performance studies of peer-to-peer best download rates, freeloaders will systems. We then describe our modeling perform poorly. A peer effectively ‘chokes’ efforts: a measurement study designed to another peer by denying download requests characterize bandwidth consumption of from a peer. Snubbing is when a peer is BitTorrent clients and a simulation analysis choked by all peers with which it was designed to assess the impact of BitTorrent formerly downloading. on an HFC cable network. We end the paper with conclusions and our future directions. A BitTorrent peer uploads only to a limited number of peers (usually four). Peers RELATED WORK monitor download rates and decide who to choke over time intervals of every ten There has been a great deal of prior seconds (note that some implementations research on P2P protocols and systems. The might use a different rechoke period). When majority of the studies have focused either the next piece is to be downloaded, the peer on P2P deployments in the Internet [SGG02, that has been performing the worst will NCR+03, RIP01] or on evaluating protocol likely be replaced by a higher performing issues [RFK+01, SMK+01,GKT03].