Rarest First and Choke Algorithms Are Enough
Total Page:16
File Type:pdf, Size:1020Kb
Rarest First and Choke Algorithms Are Enough Arnaud Legout G. Urvoy-Keller and P. Michiardi I.N.R.I.A. Institut Eurecom Sophia Antipolis France Sophia Antipolis France [email protected] fGuillaume.Urvoy,[email protected] ABSTRACT for this success. Whereas content localization has attracted The performance of peer-to-peer ¯le replication comes from considerable research interest in the last years [7,12,23,25], its piece and peer selection strategies. Two such strategies content replication has started to be the subject of active have been introduced by the BitTorrent protocol: the rarest research only recently. As an example, the most popular ¯rst and choke algorithms. Whereas it is commonly ad- peer-to-peer ¯le sharing networks [1] eDonkey2K, FastTrack, mitted that BitTorrent performs well, recent studies have Gnutella, Overnet focus on content localization. The only proposed the replacement of the rarest ¯rst and choke algo- widely used [16, 17, 20] peer-to-peer ¯le sharing application rithms in order to improve e±ciency and fairness. In this focusing on content replication is BitTorrent [8]. paper, we use results from real experiments to advocate that Yang et al. [26] studied the problem of e±cient content the replacement of the rarest ¯rst and choke algorithms can- replication in a peer-to-peer network. They showed that not be justi¯ed in the context of peer-to-peer ¯le replication the capacity of the network to serve content grows exponen- in the Internet. tially with time in the case of a flash crowd, and that a key We instrumented a BitTorrent client and ran experiments improvement on peer-to-peer ¯le replication is to split the on real torrents with di®erent characteristics. Our exper- content into several pieces. Qiu et al. [22] proposed a re- imental evaluation is peer oriented, instead of tracker ori- ¯ned model of BitTorrent and showed its high e±ciency. In ented, which allows us to get detailed information on all summary, these studies show that a peer-to-peer architec- exchanged messages and protocol events. We go beyond the ture for ¯le replication is a major improvement compared to mere observation of the good e±ciency of both algorithms. a client server architecture, whose capacity of service does We show that the rarest ¯rst algorithm guarantees close to not scale with the number of peers. ideal diversity of the pieces among peers. In particular, on However, both studies assume global knowledge, which is our experiments, replacing the rarest ¯rst algorithm with not realistic. Indeed, they assume that each peer knows all source or network coding solutions cannot be justi¯ed. We the other peers. As a consequence, the results obtained with also show that the choke algorithm in its latest version fos- this assumption can be considered as the optimal case. In ters reciprocation and is robust to free riders. In particu- real implementations, there is no global knowledge. The lar, the choke algorithm is fair and its replacement with a challenge is then to design a peer-to-peer protocol that bit level tit-for-tat solution is not appropriate. Finally, we achieves a level of e±ciency close to the one achieved in identify new areas of improvements for e±cient peer-to-peer the case of global knowledge. ¯le replication protocols. Piece and peer selection strategies are the two keys of e±cient peer-to-peer content replication. Indeed, in a peer- Categories and Subject Descriptors: C.2.2 [Computer- to-peer system, the content is split into several pieces, and Communication Networks]: Network Protocols; C.2.4 each peer acts as a client and a server. Therefore, each [Computer-Communication Networks]: Distributed Systems peer can receive and give any piece to any other peer. An General Terms: Measurement, Algorithms, Performance e±cient piece selection strategy should guarantee that each Keywords: BitTorrent, choke algorithm, rarest ¯rst algo- peer can always ¯nd an interesting piece from any other peer. rithm, peer-to-peer The rationale is to o®er the largest choice of peers to the peer selection strategy. An e±cient peer selection strategy should maximize the capacity of service of the system. In 1. INTRODUCTION particular, it should employ selection criteria based, e.g., on In a few years, peer-to-peer ¯le sharing has become the upload and download capacity, and should not be biased by most popular application in the Internet [16, 17]. E±cient the lack of available pieces in some peers. content localization and replication are the main reasons The rarest ¯rst algorithm is a piece selection strategy that consists of selecting the rarest pieces ¯rst. This simple strat- egy used by BitTorrent performs better than random piece selection strategies [5, 9]. However, Gkantsidis et al. [11] Permission to make digital or hard copies of all or part of this work for argued based on simulations that the rarest ¯rst algorithm personal or classroom use is granted without fee provided that copies are may lead to the scarcity of some pieces of content and pro- not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to posed a solution based on network coding. Whereas this republish, to post on servers or to redistribute to lists, requires prior specific solution is elegant and has raised a lot of interest, it leads permission and/or a fee. to several complex deployment issues such as security and IMC’06, October 25–27, 2006, Rio de Janeiro, Brazil. computational cost. Other solutions based on source coding Copyright 2006 ACM 1-59593-561-4/06/0010 ...$5.00. [18] have also been proposed to solve the claimed de¯ciencies of consider that each peer only knows few other peers, i.e., of the rarest ¯rst algorithm. each peer has a small peer set [5, 11]. In the case of real The choke algorithm is the peer selection strategy of Bit- torrents, the peer set size is much larger. The consequence Torrent. This strategy is based on the reciprocation of up- is that BitTorrent builds a random graph, connecting the load and download speeds. Several studies [5,10,13,15] dis- peers, that has a larger diameter in simulations than in real cussed the fairness issues of the choke algorithm. In par- torrents. However, the diameter has a fundamental impact ticular, they argued that the choke algorithm is unfair and on the e±ciency of the rarest ¯rst algorithm. favors free riders, i.e., peers that do not contribute. Solu- In this study, we show that in the speci¯c context con- tions based on a bit level tit-for-tat have been proposed to sidered, i.e., Internet peer-to-peer ¯le replication, the rarest address the choke algorithm's fairness problem. ¯rst and choke algorithms are good enough. Even if we can- In this paper, we perform an experimental evaluation of not extend our conclusions to other peer-to-peer contexts, the piece and peer selection strategies as implemented in we believe this paper sheds new light on a system that uses BitTorrent. Speci¯cally, we have instrumented a client and a large fraction of the Internet bandwidth. run extensive experiments on several torrents with di®er- The rest of the paper is organized as follows. We present ent characteristics in order to evaluate the properties of the the terminology used throughout this paper in section 2.1. rarest ¯rst and choke algorithms. While we have not ex- Then, we give a short overview of the BitTorrent proto- amined all possible cases, we argue that we have covered a col in section 2.2 and a description of the rarest ¯rst and representative set of today real torrents. choke algorithms in section 2.3. We present our experimen- Our main conclusions on real torrents are the following. tal methodology in section 3, and our detailed results in sec- tion 4. Related work is discussed in section 5. We conclude ² The rarest ¯rst algorithm guarantees a high diversity the paper with a discussion of the results in section 6. of the pieces. In particular, it prevents the reappear- ance of rare pieces and of the last pieces problem. 2. BACKGROUND ² We have found that torrents in a startup phase can We introduce in this section the terminology used have low piece diversity. The duration of this phase throughout this paper. Then, we give an overview of the Bit- depends only on the upload capacity of the source of Torrent protocol, and we present the rarest ¯rst and choke the content. In particular, the rarest ¯rst algorithm is algorithms. not responsible for the low piece diversity during this phase. 2.1 Terminology The terminology used in the peer-to-peer community and ² The fairness achieved with a bit level tit-for-tat strat- in particular in the BitTorrent community is not standard- egy is not appropriate in the context of peer-to-peer ized. For the sake of clarity, we de¯ne in this section the ¯le replication. We have proposed two new fairness terms used throughout this paper. criteria in this context. ² Pieces and Blocks Files transfered using BitTorrent ² The choke algorithm is fair, fosters reciprocation, and are split in pieces, and each piece is split in blocks. is robust to free riders in its latest version. Blocks are the transmission unit on the network, but the protocol only accounts for transfered pieces. In Our contribution is to go beyond the mere con¯rmation particular, partially received pieces cannot be served of the good performance of BitTorrent.