
Development and Implementation of the B-Tracker Approach on a BitTorrent Client

Master Thesis

Andri Lareida
Zürich, Switzerland
Student ID: 06-700-389

Supervisor: Fabio Hecht, Thomas Bocek
Date of Submission: April 30, 2012

University of Zurich
Department of Informatics (IFI)
Communication Systems Group (CSG), Prof. Dr. Burkhard Stiller
Binzmühlestrasse 14, CH-8050 Zürich, Switzerland
URL: http://www.csg.uzh.ch/

Zusammenfassung

The main problems addressed by the B-Tracker approach are efficiency and load balancing in distributed BitTorrent (BT) trackers. Historical data and simulations show that the uneven load distribution in the so-called distributed hash table (DHT) tracker has two causes: the unequal popularity of torrents and the internal mechanisms of a DHT. The second existing solution, PEX, is known for its inefficiency, since PEX messages are regularly flooded to a peer's neighbors. First lab simulations of B-Tracker showed that B-Tracker is superior to the existing solutions in efficiency and load balancing. The goal of this thesis is to implement B-Tracker for a real BT client and to evaluate it under realistic conditions.

B-Tracker is implemented as a plugin for the well-known BT client Vuze. The original B-Tracker concept is developed into a design that both fulfills the original concept and fits the fixed Vuze plugin interface; this design is then implemented. For evaluation purposes, some changes to the Vuze source code are also required, mainly concerning the measurement of messages. Developing B-Tracker as a Vuze plugin brings several advantages: Vuze already provides a DHT, which is also used by the DHT tracker, and the plugin can be used in all current Vuze installations, since running the plugin alone requires no changes to the Vuze source code. Three new message types are introduced to support the B-Tracker protocol. The plugin implementation also has some disadvantages; among others, the plugin interface does not support all the capabilities the DHT offers, so some compromises are necessary to implement the original concept.

The setup required for the evaluation comprises experiments for the three tracker variants and three different churn rates. Each experiment involves 100 peers that download a given file from initially two seeders. An experiment is considered finished as soon as all peers have downloaded the file. The results show that B-Tracker retains its advantages in a real setting and, under these conditions, performs at least as well as and mostly better than its competitors. With the churn rate set to 0, the DHT tracker has an efficiency advantage; at 15% both are similar; and at 30% B-Tracker is ahead. In load balancing, B-Tracker performs consistently better.

The implementation and evaluation of B-Tracker as a Vuze plugin can be regarded as a successful second step on the way to a more efficient and fairer distributed tracker.

Abstract

BitTorrent (BT) is a popular Peer-to-Peer (P2P) system which, by its use of centralized trackers, breaches the P2P paradigm. For trackers to profit from P2P properties, they have to be implemented in a distributed approach. Peer Exchange (PEX) and the Azureus Distributed Hash Table (AZDHT) are two widely deployed distributed tracking mechanisms. Both suffer from inefficiency and poor load balancing, where load is defined as the upload bandwidth used. The B-Tracker approach, which promises to solve these issues, is designed and implemented as a plugin for the Vuze BT client. For the evaluation, simulations of a BT swarm under realistic conditions including churn are run for all the trackers. The results show that B-Tracker improves efficiency and load balance under realistic conditions: B-Tracker shows better load balance than DHT and PEX in all simulated conditions, and it is more efficient than PEX, similar to AZDHT at 15% churn, and better at 30% churn. Most B-Tracker traffic is generated by the underlying AZDHT, which cannot be changed by a plugin. B-Tracker shows that it is possible to improve load balancing and efficiency compared to existing solutions.

Acknowledgments

First and foremost, I would like to thank my supervisors Fabio Hecht and Thomas Bocek for developing the B-Tracker idea which formed the basis of my thesis and for the constructive discussions we had. A special thanks goes to Fabio for proofreading the thesis and for the support he gave me whenever I was in need of help. This thesis would not have been possible without the two of you.

I am obliged to many members of the Communication Systems Group at the University of Zurich, led by Prof. Dr. Burkhard Stiller, for giving me the opportunity to write this thesis and for providing me with an excellent server infrastructure and support.

Last but not least I thank my parents for supporting me and for giving me motivation when needed.

Contents

Zusammenfassung i

Abstract iii

Acknowledgments v

1 Introduction 1

1.1 Motivation...... 1

1.2 Description of Work ...... 2

1.3 Thesis Outline ...... 4

2 Related Work 5

2.1 BitTorrent...... 5

2.1.1 Tracker Protocol ...... 7

2.1.2 BT Protocol ...... 8

2.1.3 Azureus Extensions ...... 11

2.2 Peer Exchange ...... 11

2.3 DHT-Tracker Extension ...... 12

2.4 Bloom Filters ...... 13

2.5 Related Work Summary ...... 13


3 Design 15

3.1 Hash Table Load Analysis ...... 15

3.2 B-Tracker Approach ...... 17

3.2.1 Primary Tracker Look Up ...... 17

3.2.2 Secondary Tracker Query ...... 17

3.2.3 Main Tracker ...... 18

3.2.4 DHT Manager ...... 20

3.2.5 Messaging ...... 21

3.2.6 Parameters ...... 22

4 Implementation 23

4.1 Plugin Interface ...... 23

4.2 DHT Interface ...... 24

4.2.1 Measurement ...... 25

4.3 Further Development ...... 25

5 Evaluation 27

5.1 Evaluation Environment ...... 27

5.2 Experiment Design ...... 28

5.2.1 Files, Bandwidth and Seeding ...... 28

5.2.2 Churn ...... 29

5.2.3 DHT ...... 30

5.2.4 Performance Issues ...... 30

5.2.5 Parameters ...... 30

5.3 Execution and Results ...... 31

5.3.1 Execution ...... 31

5.3.2 Efficiency ...... 32

5.3.3 Load Balancing ...... 34

5.3.4 Messages ...... 36

5.4 Run Times ...... 40

6 Summary and Conclusions 41

6.1 Summary ...... 41

6.2 Conclusion ...... 42

Abbreviations 47

Glossary 49

List of Figures 49

List of Tables 52

A Installation Guidelines 55

B Contents of the DVD 57

B.1 B-Tracker Plugin ...... 57

B.2 Data...... 57

B.3 Experiment ...... 58

B.4 Related Work ...... 58

B.5 Sources ...... 58

B.6 Thesis ...... 58

Chapter 1

Introduction

Peer-to-Peer (P2P) technology allows systems to operate without the need of a central entity (e.g., a server) controlling them. P2P systems use the resources provided by their members, the so-called peers. They allow peers to share resources such as CPU time, memory, or disk space. Typical properties of P2P systems are scalability, because every peer brings resources, and the absence of a single point of failure, since there is no central server.

According to a study by ipoque [18], the most popular P2P system is BitTorrent (BT), accounting for 80% of P2P traffic; P2P systems in general account for roughly 50% of Internet traffic. BT is mostly used to share large files or large collections of small files such as movies, software distributions, or music collections. A file server infrastructure providing similar performance would incur huge costs in hardware and maintenance. With BT, a user contributes to the infrastructure by providing some bandwidth and disk space and therefore covers a part of the costs. This means that a peer downloading a file uploads it at the same time.

Traditionally, a centralized server called a tracker maintains a database containing the peers that share a certain file and the amount of data they uploaded and downloaded [7]. Peers sharing the same file are called a swarm. The tracker as a central entity is a breach of the P2P paradigm. Past events involving one of the largest free trackers [3] have shown that scalability and a single point of failure are an issue in BT trackers.

1.1 Motivation

In order to overcome the drawbacks of centralized trackers, two approaches to distributed trackers have been added to the original BitTorrent protocol. One is based on distributed hash table (DHT) technology; the second one is called Peer Exchange (PEX). PEX is used by most current BT clients; it is a gossiping-based protocol where peers send their peer list to connected peers at certain time intervals. It is totally unstructured and can reveal only peers that are already connected to the swarm [28]. Furthermore, PEX is not very efficient in terms of bandwidth consumption [13], due to the chatty nature of the protocol.

DHTs are based on the assumption of roughly equal data distribution [19]. Other work [5, 11, 17] states that in real-world applications of DHTs the identifiers, and therefore the load, are not uniformly distributed among participating nodes. Research has so far focused on uneven load distribution in DHTs, but the fact that not all torrents are equally popular has been ignored. A torrent describes a file being shared and gives the address of at least one tracker. It has been shown that the popularity of torrents follows a Zipf-like distribution except for the end of the curve [9]. This characteristic means that very few torrents are very popular, while most torrents are shared by only a small minority of peers.

Figure 1.1 depicts a dataset acquired from The Pirate Bay tracker in 2008 [12], which shows the distribution of torrent popularity. To show the uneven distribution, the popularity value was calculated as the sum of all peers taking part in a torrent at the time of data capture, since a tracker has to handle a database containing exactly these peers. The torrents were then ranked according to their popularity score and plotted on logarithmic scales. The resulting graph resembles a Zipf distribution very closely; one can also see the exponential cutoff described in [9]. This distribution poses a problem to load balancing in BT's DHTs which had not been addressed in research before the B-Tracker approach [13]. Section 3.1 presents a deeper analysis of this issue.

The motivation for this thesis is to overcome the mentioned drawbacks of existing distributed trackers by implementing the B-Tracker approach. The thesis has the following goals: the first goal (G1) is to design and implement the B-Tracker approach on a real BT client; the second goal (G2) is to compare the solution developed in G1 to DHT and PEX; the last goals are to show that B-Tracker is superior in terms of efficiency (G3) and load balancing (G4) under realistic conditions.

1.2 Description of Work

This thesis is based on the work on B-Tracker [13], which proposes a new approach to distributed tracking in BT networks. So far B-Tracker has not been integrated into a real BT program. In order to conduct simulations, a plugin implementation of B-Tracker for the very popular BT program Vuze [24] is necessary. The main work in the context of this thesis is the design and implementation of the proposed B-Tracker approach. The design has to be adapted to fit into Vuze's plugin interface and to use existing Vuze facilities. The implementation has to be compatible with Vuze and also support measurement capabilities in order to compare the three approaches; for this purpose, the Vuze source code has to be extended. For the evaluation of B-Tracker, a whole framework is built in order to simulate swarms of 100 peers including churn. Finally, all the results are parsed and graphs are produced. These graphs are analyzed and discussed.


Figure 1.1: Plot of torrent popularity on logarithmic scales, based on The Pirate Bay dataset 2008 [12].

1.3 Thesis Outline

Chapter 2 looks at related work in detail. The BitTorrent system is explained, as well as the two previously mentioned distributed tracker approaches, DHT and PEX. In Chapter 3 the design of the B-Tracker approach is explained thoroughly. Chapter 4 covers the implementation of B-Tracker as a Vuze plugin, with a focus on Vuze's plugin interface. In Chapter 5 experiments are defined in order to compare the three approaches and conduct an in-depth evaluation; results are presented, analyzed, and discussed. In Chapter 6 the whole work is summarized and concluded.

Chapter 2

Related Work

This chapter introduces related work in the areas needed to understand the thesis. It starts with an overview of BT, which also covers extensions made by the Azureus client. It then continues with Peer Exchange (PEX) and the Azureus DHT (AZDHT). Finally, Bloom filters are explained.

2.1 BitTorrent

In order to understand the benefits of B-Tracker it is important to understand what BT is and how it works. BT is a file distribution system that uses P2P technology to benefit from properties like scalability, reliability and flexibility [20]. The idea behind P2P is that every client is also a server; all actors are therefore called peers because they are equal. BT implements this paradigm by enforcing a “tit-for-tat” principle [6], which encourages a peer to upload in order to download a file. This way files are distributed to large masses. The beauty is that the more peers want a file, the more bandwidth is available, since each peer has to provide a share. One problem in P2P systems is the so-called bootstrapping, which describes the initial connection process a new peer needs to go through to get first contacts. In BT it is solved by the use of a tracker, which acts as a broker between peers. Standard BT trackers are centralized servers. There are multiple implementations of BT and they all have their own extensions to the standard protocol; this section focuses on the standard protocol.

The BT system consists of several components which need to be explained. The following list explains the components needed for a BT file distribution, according to [7, 22].

A web server: used to serve the metainfo file (.torrent) to the end users.

A metainfo file: containing all the necessary information for an end user to connect to a swarm. The first part of the file (announce) contains the tracker URL; the second part (info) consists of several keys which are explained later.


A BT tracker: keeping a database of all the peers in a swarm. Peers can ask a tracker for other peers sharing a specific file. A file is identified by the 20 byte SHA1 hash of the info part of the metainfo file. Usually a tracker is responsible for several torrents at the same time.

A seeder: a peer in possession of the complete file. Other peers will initially have to download from this “original” peer. A seeder is only uploading. A peer can be a seeder and a downloader at the same time for different torrents.

A web browser: helping the user to find and download a .torrent file with the metainfo.

The BT client: an application following the BT protocol with a user interface. The .torrent file is loaded into the application and the download can start. When the download is finished, the application continues to upload until the user actively stops it; this is called seeding. There are numerous client applications which implement their own protocol extensions.

Peers can have different roles over their lifetime. They can even have several roles at the same time depending on their relationship to their neighbors. Here is a description of the different roles:

Seeder: a peer only uploading as explained before.

Downloader: a peer that is downloading a file.

Provider: a peer that can provide file pieces to another peer. It can be a downloader or a seeder; it just means that the peer can provide new pieces. Provider is a role a peer has from the perspective of another peer.

Neighbor: the peers or providers to which a peer is connected.

The metainfo file consists of the announce and the info part. The announce part contains at least the so-called announce URL, which is a link to the tracker. The info part consists of at least the following fields:

Name: a UTF-8 encoded String which represents a name for the torrent. This field is only advisory. It can be a file name in case of a single file or a directory if there is a number of files.

Piece length: for ease of transfer, files are split into chunks of the same size. That way a file can be uploaded by a peer before it has finished downloading it, and the replication factor in a swarm can be improved. These chunks all have the same size; most commonly this defaults to 2^18 bytes = 256 KiB.

Pieces: a string concatenation of the SHA1 hashes of all the file pieces. Thus, a peer can verify the correctness of a piece after downloading it.

Length: only used if the torrent consists of a single file. In that case it tells the file size in bytes.

Figure 2.1: The main contents of a .torrent file for a single-file download. Further optional fields exist.

Files: used if there is more than one file in the torrent. It is a dictionary containing the length and a sub-directory path for each file.

Figure 2.1 gives an overview of the standard fields in a .torrent file. All .torrent files are b-encoded. B-encoding uses UTF-8 characters as delimiters and values; using an 8-bit encoding makes it simple to encode bytes. There are four types of values: byte strings, integers, lists and dictionaries, each with its own delimiters. Since b-encoded files are not easy to read, the figure shows plain text. More information on b-encoding can be found in [2]. The info part is a dictionary, which means that all its elements are key-value pairs. A closer look at the info part reveals that the file is split into pieces of 262144 bytes (piece length) for downloading purposes. The pieces value is a concatenation of the 20 byte SHA1 hash of each piece; thus the number of pieces can be derived from the size of the pieces value. Note that the file size can be calculated from the piece length and the number of pieces. However, there is a difference between this calculated size and the value in the length field, which is the result of splitting the file into parts of equal size (the last part may not be filled). The whole info dictionary is used to create a key that identifies the torrent, by calculating its 20 byte SHA1 hash. This key is used when a tracker is queried or a neighbor relationship is initiated.
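To make this last step concrete, the following minimal Java sketch derives the torrent ID from the raw b-encoded bytes of the info dictionary. Extracting those bytes from the .torrent file is assumed to have happened already; only the standardized hashing step is shown.

import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Sketch: compute the 20 byte torrent ID (info hash) from the raw
// b-encoded bytes of the info dictionary.
public class InfoHash {
    public static byte[] infoHash(byte[] bencodedInfoDict) throws NoSuchAlgorithmException {
        // SHA1 yields the 20 byte key used towards trackers and peers.
        return MessageDigest.getInstance("SHA-1").digest(bencodedInfoDict);
    }
}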

2.1.1 Tracker Protocol

Since a distributed tracker will be designed and implemented later in this thesis, it is necessary to understand how a tracker works in more detail. The tracker is contacted with an HTTP GET request; the URL is given in the announce field of a .torrent file. The most important request parameter is the info hash, which identifies the torrent. Further required parameters are peer id, ip and port, because the tracker might need to return the response to another address due to NAT. Additionally, the total bytes downloaded, the bytes still left, and the event type are sent with a request. The event parameter gives the reason why the request has been sent. Events can be: started, stopped, and completed; the field can also be blank or absent, which means that the request is a regular announcement done in the interval the tracker specifies in its response. There are additional parameters which are not required but are documented in [22].

Figure 2.2: Example communication between a peer and a tracker. The peer sends an HTTP GET request with parameters and receives a list of peers and an interval.

Figure 2.2 gives a sample of what a request and an answer could look like. The answer is simply a list of peers, where each entry consists of id, ip and port, plus the interval, which tells how many seconds a peer should wait before sending a new request.
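As an illustration of the request side, the following Java sketch assembles such an announce URL. The parameter names follow the standard tracker protocol; the helper class and the byte-wise percent-encoding via ISO-8859-1 are choices of this sketch, not part of any particular client.

import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Sketch: build a tracker announce request URL from the required parameters.
public class AnnounceUrl {
    public static String build(String announceUrl, byte[] infoHash, byte[] peerId,
                               int port, long downloaded, long left, String event)
            throws UnsupportedEncodingException {
        // Binary values must be percent-encoded byte by byte; ISO-8859-1 maps
        // each byte to exactly one character, so the encoding round-trips.
        return announceUrl
                + "?info_hash=" + URLEncoder.encode(new String(infoHash, "ISO-8859-1"), "ISO-8859-1")
                + "&peer_id=" + URLEncoder.encode(new String(peerId, "ISO-8859-1"), "ISO-8859-1")
                + "&port=" + port
                + "&downloaded=" + downloaded
                + "&left=" + left
                + (event == null ? "" : "&event=" + event); // started, stopped, completed or absent
    }
}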

2.1.2 BT Protocol

After receiving a peer list from a tracker, a downloader will contact the peers in the list, which are potential providers. Peers follow a strict protocol in their communication, generally referred to as the BitTorrent protocol [7].

Figure 2.3 shows a handshake message which a peer sends to a provider it wants to contact. A peer can contact multiple providers simultaneously, but the process is always the same. The handshake message for the standard BT protocol begins with a one-byte integer length prefix. The prefix indicates the length of the following protocol description, which for the standard protocol is “BitTorrent protocol”. Then 8 bytes of zeros follow; this space is reserved for protocol extensions, which are already used by some implementations. The last two fields are the torrent ID and the peer ID, both 20 bytes long. The torrent ID is the same info hash as described before. The peer ID is just an identification string; the BT protocol does not give any rules on how to determine it. A sample communication between two peers is illustrated in Figure 2.4.
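The fixed layout of this handshake makes it easy to assemble; the following Java sketch builds the 68 byte message under the assumption that the 20 byte info hash and peer ID are already available.

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Sketch: assemble the standard BT handshake described above.
public class Handshake {
    private static final byte[] PROTOCOL = "BitTorrent protocol".getBytes(StandardCharsets.US_ASCII);

    public static byte[] build(byte[] infoHash, byte[] peerId) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(PROTOCOL.length);   // one-byte length prefix (19)
        out.write(PROTOCOL);          // protocol identifier string
        out.write(new byte[8]);       // reserved bytes for protocol extensions
        out.write(infoHash);          // 20 byte torrent ID
        out.write(peerId);            // 20 byte peer ID
        return out.toByteArray();
    }
}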

Figure 2.3: The diagram depicts the two basic message types used in the BitTorrent protocol. The standard message has a payload depending on the type of the message. The numbers on top of the fields give the fields' sizes in bytes.

Figure 2.4 shows a sample communication between two peers. After receiving a handshake, the provider immediately returns a handshake with its own torrent and peer ID. If either the peer or the provider notices that the torrent ID is wrong, the connection is dropped. Wrong in this case means that either the peer does not share that particular torrent or the format is wrong. The provider's handshake reply is followed by a Bit Field message indicating which pieces of the file can be downloaded from it.

Figure 2.4 also shows that a link between provider and peer can have different states. A provider can choke a connection, for instance if it has no capacity left. Choking means that no pieces will be served to the connected peer. If capacity is freed up, the provider sends an Unchoke message to the client. A newly created connection starts as choked on both sides. The second flag is called Interested. It tells whether a peer is interested in something the provider on the other side of the connection has already downloaded. There are also messages to set or remove the interested flag. Therefore, a peer must maintain four state bits: one for its own choke state, one for its own interested state, and two for the provider's states. The states of the sample connection between the peer and the provider are shown in the boxes.

Upon receiving a Bit Field message, the peer can determine whether it is interested in a piece of the provider. Its link state is updated to interested and an Interested message is sent to the provider, which uses this message to update its own link state. The provider now knows that as soon as it unchokes the peer, the peer will start sending requests. A request is answered by sending a portion of the requested piece, since pieces are chopped into smaller blocks for easier transport. After the peer has received a complete piece, it sends a Have message to all of its neighbors to let them know that it has the piece. Eventually, some neighbor may get interested upon receiving the Have message. Based on this new piece, the peer has to adapt its link states; maybe it is not interested in a peer anymore and therefore has to send a Not Interested message. This process continues until a peer has completed the download. Then it will start seeding, meaning the peer will serve pieces without downloading.

Figure 2.4: This sequence diagram shows a typical message exchange between a peer and a provider following the BT protocol.

2.1.3 Azureus Extensions

Many different implementations of the BitTorrent protocol exist; Vuze is among the five most popular programs [10]. Azureus started as an open source project and was later renamed to Vuze, which is developed by Vuze Inc. The official Vuze distribution is not open anymore, but the core, which is in fact still Azureus, remains open source. Vuze also provides an API for plugin development and is therefore ideally suited for experimenting with new extensions.

Most BT implementations bring their own protocol extensions, and so does Vuze. The Vuze protocol is also called the Azureus protocol. This protocol is most relevant to the implementation of B-Tracker, since B-Tracker is realized as a Vuze plugin. The Azureus extension is documented in the Azureus Wiki [26]. The main difference to the standard protocol is the additional AZ_HANDSHAKE, which has to be supported by an Azureus client, and the AZ_PEER_EXCHANGE, which is needed for PEX since PEX is not part of the standard BT protocol.

The AZ_HANDSHAKE message is exchanged before two peers start using the AZ protocol. Besides the standard handshake contents (IP address, TCP port, UDP port), the AZ_HANDSHAKE contains the exact version of Azureus used by the sender. Additionally, all the supported message types and numbers are included. Thus, the only message that has to be supported by an Azureus client is the AZ_HANDSHAKE message, because the rest is negotiated with the handshake. The AZ_PEER_EXCHANGE message is used to exchange neighbor information with peers; PEX is discussed in the next section. From the documentation of the AZ protocol [26] it cannot be told what the other messages are used for and what exactly their content looks like. They are listed for completeness.

2.2 Peer Exchange

The idea behind Peer Exchange (PEX) is simply to allow peers to exchange lists of providers. A peer sends a list of the peers it is connected to to other peers. So a peer can receive new peers not only from the tracker, but also from other peers. This allows a swarm to stay together even if the tracker fails, and it can also reduce the load on a tracker. However, PEX cannot substitute a tracker completely, since it has no bootstrapping capabilities: in order to benefit from PEX, a peer must already be in a swarm.

Experiments have shown that PEX can improve download speed [28], but also that PEX messages have a significant degree of redundancy. There are two main implementations of PEX: the Azureus PEX (AZPEX) and the uTorrent PEX (UTPEX). The main difference is that UTPEX sends separate lists for IPv4 and IPv6 peers. Developers of both implementations have agreed to send a maximum of 50 peers per message [21]. They also agreed that messages should only be sent every minute. Therefore, peer discovery is very quick for the first 30%-40% of the peers, but then slows down; even after 3000 seconds less than 60% of the total peers are discovered [28].

Kademlia     Mainline        Azureus
PING         PING            PING
STORE        ANNOUNCE_PEER   STORE
FIND_NODE    FIND_NODE       FIND_NODE
FIND_VALUE   GET_PEERS       FIND_VALUE
N/A          N/A             KEY_BLOCK

Table 2.1: Comparison of the two BitTorrent DHT implementations and the Kademlia standard queries.

2.3 DHT-Tracker Extension

The BitTorrent DHT-Tracker is a fully distributed tracker. In contrast to PEX, it also includes a solution to the bootstrapping problem and can therefore replace a classical tracker. There are two DHTs in use, the Azureus implementation (AZDHT) [27] and the official implementation [14], known as Mainline DHT (MDHT). Both versions are based on the Kademlia DHT [8]. The focus of this thesis lies on AZDHT, since it is relevant for the implementation. The DHT-Tracker uses a DHT to store providers.

A DHT works like a normal hash table: it can store key-value pairs and retrieve a value for a key. A distributed hash table is, as the name implies, distributed onto several nodes, where a node is an instance of the DHT implementation. In the AZDHT, each node gets a 160 bit identifier and is responsible for the keys closest to its identifier. Closeness in Kademlia is defined by an XOR metric [16]. Furthermore, keys are replicated to the 20 closest nodes in order to increase robustness against churn.
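The XOR metric itself is simple; the following Java sketch shows how the distance between two 160 bit identifiers can be computed, assuming both identifiers have the same length.

import java.math.BigInteger;

// Sketch: Kademlia's XOR closeness metric. The distance between two node or
// key identifiers is their bitwise XOR, interpreted as an unsigned integer;
// a smaller distance means the identifiers are closer.
public class XorDistance {
    public static BigInteger distance(byte[] idA, byte[] idB) {
        byte[] xor = new byte[idA.length];
        for (int i = 0; i < idA.length; i++) {
            xor[i] = (byte) (idA[i] ^ idB[i]);
        }
        return new BigInteger(1, xor); // treat the result as a non-negative magnitude
    }
}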

Table 2.1 compares the queries used by the two implementations used in BT and the Kademlia standard. Both AZDHT and MDHT stick close to the original query set. The queries may have different names in the different implementations, but the functionality is the same. Only the Azureus implementation adds a KEY_BLOCK query, which can be used to request blocking and unblocking of keys; how exactly this works was not documented at the time of writing. The two BT implementations added an error message as a possible return, which is not a query and is therefore not in the table.

Another difference can be found in the way bootstrapping is handled. The AZDHT uses a hard-coded URL, dht.vuze.com:6881, to contact the bootstrapping node. In order to reduce the load on the bootstrapping node, AZDHT contacts are saved upon shutdown of the client, and providers learned from a peer can also be used to bootstrap the AZDHT. The MDHT goes one step further, since it stores several known good nodes in the .torrent file. These nodes could be an original seeder or a node that is especially kept alive for bootstrapping purposes, like the bootstrapping node used in the AZDHT. At this point it should be mentioned that Vuze supports both AZDHT and MDHT.

2.4 Bloom Filters

Bloom filters were named after their inventor Burton Bloom, who first described how hash coding can be used for filtering [4]. Bloom filters use hash functions to map values into an identifier space. In contrast to traditional hash coding, Bloom filters intentionally allow a certain error rate; thus, the identifier space can be reduced significantly. The space of a Bloom filter is represented as a bit sequence. If an element is added, it is passed through one or more hash functions, each returning a position in the bit sequence, and the bits at the returned positions are set to one. Using the same hash functions, it can then be checked whether an element is in the filter. However, it is by design not possible to retrieve an element out of the filter.
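The following Java sketch illustrates this behavior with two simple hash functions derived from Java's hashCode; real deployments (and B-Tracker's fs parameter, introduced in Chapter 3) would size the bit array and the number of hash functions to the tolerated false positive rate.

import java.util.BitSet;

// Sketch of a Bloom filter: add() sets the bits chosen by the hash functions,
// mightContain() checks them. False positives are possible by design; false
// negatives are not.
public class BloomFilter {
    private final BitSet bits;
    private final int size;

    public BloomFilter(int sizeInBits) {
        this.size = sizeInBits;
        this.bits = new BitSet(sizeInBits);
    }

    private int h1(String element) { return Math.floorMod(element.hashCode(), size); }
    private int h2(String element) { return Math.floorMod(element.hashCode() * 31 + 17, size); }

    public void add(String element) {
        bits.set(h1(element));
        bits.set(h2(element));
    }

    public boolean mightContain(String element) {
        return bits.get(h1(element)) && bits.get(h2(element));
    }
}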

2.5 Related Work Summary

Two approaches exist to overcome the scalability problem of centralized BT trackers. Although the DHT-Tracker can fully replace a centralized tracker, the load created by tracking is not fairly distributed; fairness and equality are basic principles of P2P systems [20]. PEX, on the other hand, might be fair in terms of load balancing, but it does not use resources (upload bandwidth) efficiently. Furthermore, PEX cannot replace a centralized tracker; it only reduces its load. In order to overcome these shortcomings, B-Tracker is introduced, which improves load balancing and efficiency compared to the existing solutions.

Chapter 3

Design

This chapter covers the design of the B-Tracker plugin. It begins with a more detailed analysis of load distribution in hash tables; then the B-Tracker design is explained in detail.

3.1 Hash Table Load Analysis

Section 1.1 stated that load is unevenly distributed among nodes in DHTs; this fact is investigated further in this section. The problem consists of two parts: torrent popularity follows a Zipf-like distribution, and keys in a DHT's address space are not uniformly distributed. In fact, using a hash function to determine a key is like using random keys. This fact in combination with the Zipf law makes load balancing even worse, as is shown here.

Figures 3.1 and 3.2 show the results of a simulation based on random identifiers and a Zipf distribution. The simulation featured 10'000 nodes and 20'000 keys. First, the nodes were generated with a random key, which is the SHA1 hash of a random integer. For the SHA1 hash the methods from the Azureus project were used in order to have a realistic identifier space. Then keys were generated with the same random hash and assigned to the next higher node. Also, a popularity factor is assigned to each key; a Zipf function is used for this factor to recreate the distribution observed earlier. This factor can be thought of as the total number of peers in a swarm, which indicates the load on the tracker. At minimum, peers send one announce message per minute to the tracker, so the popularity factor is proportional to the number of requests; it will therefore be treated as the number of requests per interval. The values presented here are the average results of 1000 simulation runs.
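The following Java sketch outlines such a simulation under stated assumptions: identifiers are SHA1 hashes of random integers, each key is assigned to the next higher node ID (wrapping around), and popularity follows a Zipf function with an illustrative exponent of 1. It is a simplified recreation, not the exact simulation code used for the figures.

import java.math.BigInteger;
import java.security.MessageDigest;
import java.util.Random;
import java.util.TreeMap;

// Sketch: assign Zipf-popular keys to random 160 bit node IDs and sum up
// the per-node request load.
public class DhtLoadSimulation {
    public static void main(String[] args) throws Exception {
        MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
        Random rnd = new Random(42);
        TreeMap<BigInteger, Long> load = new TreeMap<>();

        for (int i = 0; i < 10_000; i++) {               // generate node IDs
            load.put(hash(sha1, rnd.nextInt()), 0L);
        }
        for (int rank = 1; rank <= 20_000; rank++) {     // keys with Zipf popularity
            BigInteger key = hash(sha1, rnd.nextInt());
            BigInteger node = load.ceilingKey(key) != null
                    ? load.ceilingKey(key) : load.firstKey(); // next higher node, wrapping
            long requests = (long) (100_000 / Math.pow(rank, 1.0)); // illustrative Zipf load
            load.merge(node, requests, Long::sum);
        }
        long max = load.values().stream().mapToLong(Long::longValue).max().orElse(0);
        System.out.println("Heaviest node handles " + max + " requests per interval");
    }

    private static BigInteger hash(MessageDigest sha1, int value) {
        return new BigInteger(1, sha1.digest(BigInteger.valueOf(value).toByteArray()));
    }
}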

Figure 3.1 shows the number of keys maintained by a node on the X axis; the Y axis shows the actual number of nodes maintaining a certain amount of keys. This is only the result of the random distribution of keys, but it already shows the problem: more than 50% of all nodes maintain only one or zero keys, while some nodes maintain up to 21 keys.



Figure 3.1: Number of nodes managing a given number of keys.


Figure 3.2: Total requests per node per query interval.

Figure 3.2 plots requests per node, where the nodes are sorted according to their rank (most popular nodes first). That means the requests of all the keys managed by one node were summed up to get that node's number of requests. Note that the curve is cut off to the right, since it would continue to 10'000 nodes. This graph reflects both problems (key and request distribution): the Zipf-like popularity is applied to the unevenly distributed keys. 25% of the nodes (2500) handle 97.8% of requests, and 10% of all nodes handle 73.5% of requests.

The simulation of a DHT has shown that the effects of popularity distribution and the use of random keys intensify the load balancing problem.

3.2 B-Tracker Approach

This section covers the B-Tracker approach and its benefits as proposed in [13]. The B-Tracker protocol has two phases. In the first, a DHT is used to learn about initial providers; this process can be viewed as bootstrapping. After initial contacts are learned, the second phase starts, which relies on requesting providers from peers. In other words, peers learn about new providers from already known peers. So far this is a combination of the DHT and PEX approaches; B-Tracker goes one step further by introducing Bloom filters [4] to increase the efficiency of the gossip.

3.2.1 Primary Tracker Look Up

In the first phase, so-called primary trackers are queried. A primary tracker in the B-Tracker protocol is a node in the DHT which stores initial contacts. Primary trackers are just DHT nodes; they are selected by the DHT algorithm, usually by numerical closeness of node ID and storage key. Thus, primary trackers do not necessarily take part in the B-Tracker protocol. A B-Tracker peer simply adds its IP address and port number to the DHT. These can be used by others to connect to an initial set of B-Tracker peers.

DHT values are replicated to a set of nodes in order to protect data against node failure. This factor is called the primary replication rp. The number of providers stored per resource is determined by the primary tracker storage capacity parameter cp.

3.2.2 Secondary Tracker Query

The second phase starts when a peer has received providers from a primary tracker. A peer starts requesting providers from secondary trackers after it has queried a defined number of primary trackers; this number is represented by the np parameter. A peer can add itself as a provider to primary and secondary trackers. Primary trackers have a capacity limit of cp. A peer will try to register itself with a certain number of secondary trackers, determined by the secondary replication rs.

Once a peer has queried np primary trackers, it starts sending requests to secondary trackers. This will typically be the case at the beginning of a peer's lifetime and when neighbors are going offline. In order to increase the efficiency of such requests, a request includes a Bloom filter containing all known providers of the requesting peer. Bloom filters decrease reply message size because only unknown peers are returned. Bloom filters can produce false positives, so some peers might not be discovered; this chance depends on the Bloom filter size. However, a peer does not need to know the whole swarm to successfully download a file.

The importance of Bloom filters for B-Tracker is the reduced filter size, which saves bandwidth when sending requests. Further, the use of filters increases the efficiency of responses, since only unknown peers are returned. By allowing errors, as Bloom filters do, the risk of having unnecessary peer information delivered is introduced into the system. However, the application of Bloom filters is only feasible if a small filter size is required and some errors can be tolerated.

B-Tracker also defines a time to live (TTL), which defines the time an entry on a tracker is valid. This is required since providers might leave the system without posting a leave message, for example due to a hardware crash or an interrupted connection. Therefore, a provider has to re-register itself before the TTL has expired.

3.2.3 Main Tracker

For each download that is added to Vuze, the B-Tracker plugin creates and starts a new Tracker object. It contains the main routine that is responsible for executing requests to other trackers or for telling the DHT manager to look up trackers. The Tracker class is also in charge of handling incoming messages after they have been routed to the appropriate Tracker object.

Figure 3.3 shows the main flow of B-Tracker, which is executed in the Tracker object. In short, tracking consists of a loop in which the B-Tracker status is logged, trackers can be requested from the DHT, keep-alives are sent, the thread sleeps for a defined amount of time, and B-Tracker requests can be sent. Tracking is started with the start of a download in Vuze. At the beginning of the loop it is checked whether there is at least one active tracker. If no active trackers are known, a DHT look-up has to be executed; this is always the case in the first execution of the loop. DHT queries are delegated to a DHTManager object, which handles all the DHT-related logic. Since no active trackers are present, no keep-alive messaging has to be performed, and the thread sleeps for a defined time. This time is variable but always between TSI and TMI. In the meantime, providers could be discovered through the DHT or incoming messages; these are handled outside the main loop since their arrival is asynchronous. After the first timeout, some active trackers are expected to be found. A tracker is considered active if the last contact was no longer ago than the TTL. The number of trackers returned by the DHT depends on the cp parameter. The behavior after initial contacts have been discovered depends on rs: if rs trackers are present, no more requests have to be sent. A peer tries to know at least rs active trackers. This practice is equivalent to registering itself with rs trackers, since connections are mutual.

Figure 3.3: Flowchart of the main tracker algorithm.

Figure 3.4: Flowchart of the message handling algorithm.

If the active tracker count is lower than rs, a request will be sent to one or more random trackers.

The DHTManager passes potential trackers found in the DHT to its Tracker. The Tracker then checks whether it has already heard of that peer. If the potential tracker is not known yet, it is added to a DHT-tracker list and a Handshake message is sent to it.

Figure 3.4 depicts the flow of incoming messages in the Tracker class. For any message received, the sender is added to the active tracker list, or its TTL is renewed if it is already in the list. Depending on which message type is received, the peer behaves differently. For a Handshake message, a handshake is sent back; the message manager prevents handshake loops since it allows only one handshake to a certain peer within two minutes. Naturally, for a request a response is sent back. The response contains a randomly selected subset of the peer's active tracker list. Furthermore, the Bloom filter contained in the Request message is used to select only peers which will be beneficial to the requesting peer. If a Response message is received, handshakes are sent to the potential trackers found in the message.

3.2.4 DHT Manager

The Azureus DHT is used for the first phase of the B-Tracker protocol. That means if a peer starts downloading a file, it first queries the DHT for the key of the download, where the key is a hash of the info hash plus a prefix. The prefix was introduced to separate the classical DHT tracker from B-Tracker; if identical keys had been used, the DHT would contain two types of values for one key. In case no values are found for the given key, the peer writes its IP address and port number (UDP or TCP) into the DHT. The DHT can store multiple values for one key, which means that several peers can write their details into the DHT. Upon a read request these values are returned by the responsible DHT nodes. Values are returned one by one and by all the responsible nodes. In other words, if there is a single key-value pair there will be up to 20 replies, since the replication factor of the Azureus DHT is 20. Furthermore, there is no strict order of the replies; a listener just receives replies until the timeout is reached and then stops. Therefore, the DHT manager has to check whether it has already received a value or not.

Values returned from the DHT are just IPv4 addresses and ports serialized to a binary string of 6 bytes, where 4 bytes are used for the IPv4 address and 2 for the port number. The port number can be either UDP or TCP. The DHT Manager needs to know whether cp active trackers are in the DHT or whether more are needed. Since there is no status information in the DHT, a new contact first has to be sent a handshake message. This part is done by the main tracker routine; the DHT Manager just passes on contacts from DHT values. The main tracker remembers addresses received from the DHT and sends handshakes. Upon a handshake reply, the tracker checks whether the address was learned from the DHT. If it was, the main tracker confirms the activeness of the peer to the DHT Manager, which in turn increases a counter. If this counter is less than cp, the DHT Manager adds the local contact to the DHT by issuing a write request.
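The 6 byte value format can be encoded and decoded as in the following Java sketch; the class name is illustrative, while the byte layout follows the description above.

import java.net.InetAddress;
import java.net.UnknownHostException;

// Sketch: serialize an IPv4 contact to the 6 byte DHT value format
// (4 bytes address, 2 bytes port in network byte order) and back.
public class ContactCodec {
    public static byte[] encode(InetAddress ipv4, int port) {
        byte[] value = new byte[6];
        System.arraycopy(ipv4.getAddress(), 0, value, 0, 4); // assumes an IPv4 address
        value[4] = (byte) (port >> 8);   // high byte of the port
        value[5] = (byte) (port & 0xFF); // low byte of the port
        return value;
    }

    public static int decodePort(byte[] value) {
        return ((value[4] & 0xFF) << 8) | (value[5] & 0xFF);
    }

    public static InetAddress decodeAddress(byte[] value) throws UnknownHostException {
        byte[] addr = { value[0], value[1], value[2], value[3] };
        return InetAddress.getByAddress(addr);
    }
}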

3.2.5 Messaging

The B-Tracker protocol introduces three message types: Handshake, Request and Response. They are sent using messaging facilities provided by Vuze. Responsibilities for messaging are encapsulated in BTrackerMessageManager, which handles the sending of the three message types. Messages are received by a special listener class, which passes them on to the B-Tracker main class, where they are routed to the appropriate Tracker object.

A handshake contains the info hash of a torrent, the peer ID, the sender's IP address and its port numbers (TCP and UDP). With the info hash a receiver can check whether it is sharing the torrent; if not, the message might be spam and is dropped. The peer ID is needed to identify the peer and update its TTL timeout. IP address and port numbers are needed in case the receiving peer does not yet know the sender; by including contact information, the sender can be added as a tracker. The Message Manager keeps track of handshakes sent in order to prevent handshake loops: it only allows a handshake to the same peer 120 seconds after the last one.

There is another message type for the request; again it contains the info hash and the peer ID, but also a Bloom filter which contains all the neighbors of the requesting peer. This filter is included to increase the efficiency of responses to these requests. In contrast to PEX, which just sends a defined number of peers to all its neighbors, B-Tracker sends unknown peers only, and only upon explicit request. The size of the Bloom filter can be configured by the filter size parameter fs. By using already existing messaging facilities, the risk of introducing potential security issues is minimized. Assuming that Vuze messaging is secure, the new messages do not expose any information that would not be used in BitTorrent anyway. In the worst case an attacker could find contact details of other peers. However, this information is no secret, since traditional trackers and/or distributed ones give this information to anyone. Private trackers are an exception to the general behavior, but B-Tracker is not used for private trackers. It can be said that no new threat exposures are introduced by B-Tracker.

3.2.6 Parameters

Here is an exhaustive list of all the parameters used in the B-Tracker implementation, together with their descriptions. An illustrative set of defaults is sketched after the list.

cp: The Primary Tracker Capacity determines the number of contacts stored in the DHT per torrent. If this value is very large (e.g. >20), it causes more DHT traffic; if it is too small, the swarm can get separated, especially under heavy churn.

rp: The Primary Tracker Replication tells how many times a value stored on a primary tracker is replicated to other primary trackers. For the Vuze-based implementation this parameter is fixed to 20 by the underlying DHT.

rs: The Secondary Tracker Replication defines how many peers a tracker has in its tracker list. Since connections are always mutual, this is the same as registering itself at this number of trackers.

fs: The Filter Size controls the length of the Bloom filter array which is sent with a request. The number has to be a multiple of 8, since the filter is represented as a byte array.

mr: The Max Response parameter defines the maximum number of trackers returned in a B-Tracker Response message.

TSI: The Tracker Start Interval is the time B-Tracker sleeps between look-ups at the beginning. If the peer knows enough trackers, the interval is doubled until it reaches the maximum interval. It is given in milliseconds.

TMI: The Tracker Maximum Interval is the maximum interval possible when doubling the interval. It is given in milliseconds.

TTL: The Time To Live defines after what amount of time a tracker entry is considered old and liveness has to be checked. It is given in milliseconds.
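The following Java sketch collects the parameters in one constants class. Except for rp, which is fixed to 20 by the underlying DHT, the concrete values are assumptions chosen for illustration; the values actually used in the evaluation are given in Chapter 5.

// Illustrative defaults for the B-Tracker parameters listed above.
public final class BTrackerParams {
    public static final int CP = 8;          // primary tracker capacity: contacts per torrent in the DHT (assumed)
    public static final int RP = 20;         // primary replication, fixed by the Azureus DHT
    public static final int RS = 10;         // secondary replication: target size of the tracker list (assumed)
    public static final int FS = 256;        // Bloom filter size in bits, a multiple of 8 (assumed)
    public static final int MR = 25;         // max trackers per Response message (assumed)
    public static final long TSI = 10_000L;  // tracker start interval in ms (assumed)
    public static final long TMI = 300_000L; // tracker maximum interval in ms (assumed)
    public static final long TTL = 600_000L; // time to live of a tracker entry in ms (assumed)

    private BTrackerParams() { }
}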

In contrast to the design, there is no cs (secondary capacity) parameter in the implementation. A peer has no capacity limit for known trackers: since there is no active connection, the resources used by a tracker entry are small. A tracker will also not send keep-alive handshakes unless its active peer count drops under rs. Therefore, it is not required to actively remove trackers from the list.

Chapter 4

Implementation

This chapter describes the implementation of B-Tracker as a Vuze plugin, where the focus lies on the interface between B-Tracker and Vuze. Since B-Tracker is a Vuze plugin, it has to implement a plugin interface and also adhere to certain conventions described in the Vuze plugin development guide [25]. This chapter helps developers to extend B-Tracker or to implement a new Vuze plugin using the AZDHT. The BTracker class is the root of the implementation; it creates new Tracker objects as new downloads are added. Furthermore, it initializes the B-Tracker properties, logging and the MessageManager class.

4.1 Plugin Interface

A Vuze plugin has to implement the Plugin interface defined in the Vuze source code. Upon initialization of the Vuze client, all plugins are loaded by calling the initialize method of the Plugin implementation. In turn, Vuze passes a reference to the PluginInterface object to the plugin, through which Vuze features like the DHT can be accessed.

Listing 4.1 shows the BTrackerPlugin class which implements the Plugin interface; in fact, BTrackerPlugin implements an extension of the interface called UnloadablePlugin. The UnloadablePlugin interface adds a method which is called by Vuze when a plugin is unloaded. This can happen manually or when the client is closed. The two methods implemented by the BTrackerPlugin class simply delegate their duties to the BTracker singleton class (lines 5 and 10). This way the connection of B-Tracker and the Vuze core is encapsulated and the static BTracker class does not need to implement any interfaces. The PluginInterface object that is passed to B-Tracker offers access to the features of Vuze; the most important for B-Tracker are the DHT and peer manager interfaces.

1  public class BTrackerPlugin implements UnloadablePlugin {
2
3      @Override
4      public void initialize(PluginInterface arg0) throws PluginException {
5          BTracker.instance().initialize(arg0);
6      }
7
8      @Override
9      public void unload() throws PluginException {
10         BTracker.instance().unload();
11     }
12 }

Listing 4.1: Implementation of the Vuze UnloadablePlugin interface.

4.2 DHT Interface

In the Vuze source code, the AZDHT is called the “distributed database”; it is accessed through the DistributedDatabase interface. Its most important methods are read, write and delete. An important constraint is the DHT's fixed replication factor of 20; in B-Tracker terms this means that rp is fixed to 20. The Azureus DHT supports multiple values per key, so new peers can easily add their contacts. The DHT functionality is implemented in DHTManager.java.

...
public void write(DistributedDatabaseListener listener, DistributedDatabaseKey key,
        DistributedDatabaseValue value) throws DistributedDatabaseException;

public void write(DistributedDatabaseListener listener, DistributedDatabaseKey key,
        DistributedDatabaseValue[] values) throws DistributedDatabaseException;
...

Listing 4.2: Write methods of the DistributedDatabase interface.

Listing 4.2 shows the write methods provided by the DistributedDatabase interface. The write methods require a listener which has to be implemented by the caller. Listeners are used as callbacks for DHT events resulting from the write request. For instance, if a write request was successful, the listener is called with the appropriate event type, and the event can then be handled by the listener. B-Tracker uses write events only for logging purposes. Both methods require a key, which can be generated from any byte sequence. It is possible to store one or several DistributedDatabaseValue values with one request. Such a DistributedDatabaseValue can be created from any byte sequence or from a Map object.
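The following Java sketch shows how such a write with a logging listener could look. The package name, the factory methods for keys and values, and the single-method listener callback are assumed from the Vuze plugin API and may differ in detail from the actual B-Tracker code.

import org.gudy.azureus2.plugins.ddb.DistributedDatabase;
import org.gudy.azureus2.plugins.ddb.DistributedDatabaseEvent;
import org.gudy.azureus2.plugins.ddb.DistributedDatabaseException;
import org.gudy.azureus2.plugins.ddb.DistributedDatabaseKey;
import org.gudy.azureus2.plugins.ddb.DistributedDatabaseListener;
import org.gudy.azureus2.plugins.ddb.DistributedDatabaseValue;

// Sketch: store a 6 byte contact under a torrent-derived key and log the
// resulting events, roughly as B-Tracker does.
public class DhtWriteExample {
    public static void announce(DistributedDatabase ddb, byte[] key, byte[] contact)
            throws DistributedDatabaseException {
        DistributedDatabaseKey ddbKey = ddb.createKey(key);       // assumed factory method
        DistributedDatabaseValue ddbValue = ddb.createValue(contact); // assumed factory method
        DistributedDatabaseListener logger = new DistributedDatabaseListener() {
            public void event(DistributedDatabaseEvent event) {
                // B-Tracker only logs write events; a real handler would
                // inspect the event type for success or timeout.
                System.out.println("DHT write event: " + event.getType());
            }
        };
        ddb.write(logger, ddbKey, ddbValue);
    }
}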

...
public void read(DistributedDatabaseListener listener, DistributedDatabaseKey key,
        long timeout) throws DistributedDatabaseException;

public void read(DistributedDatabaseListener listener, DistributedDatabaseKey key,
        long timeout, int options) throws DistributedDatabaseException;
...

Listing 4.3: Read methods of the DistributedDatabase interface.

Listing 4.3 shows the read methods provided by the DistributedDatabase interface. As for write requests, a DistributedDatabaseListener is required. Successful read events contain the value read and need to be handled in order to make use of the value. Since the DHT replication factor is set to 20, a value is returned up to 20 times, unless a responsible node failed before answering the request. Therefore, the DHTManager class filters already received values before passing the potential trackers to the main tracker algorithm. The timeout parameter defines how long the request is valid; after the timeout has expired, a timeout event is handed to the listener. As options, predefined constants are used; at the time of writing two of them existed, a high-priority and an exhaustive-read flag. They are not used in B-Tracker.
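The duplicate filtering can be as simple as remembering which values have been seen, as in the following Java sketch; class and method names are illustrative, not taken from the plugin source.

import java.util.HashSet;
import java.util.Set;

// Sketch: filter replicated read replies so that each contact value is
// passed on to the main tracker algorithm only once.
public class ReplyDeduplicator {
    private final Set<String> seen = new HashSet<>();

    // Returns true only the first time a given 6 byte contact value arrives.
    public boolean isNew(byte[] contactValue) {
        StringBuilder hex = new StringBuilder();
        for (byte b : contactValue) {
            hex.append(String.format("%02x", b));
        }
        return seen.add(hex.toString());
    }
}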

In order to provide the B-Tracker implementation with the same possibilities that the DHT-Tracker uses, the DistributedDatabase interface was extended with another read method. This read method adds a parameter with which the number of wanted return values can be defined. Setting this value to 30, like the DHT tracker does, reduces DHT traffic when churn is high.

4.2.1 Measurement

In order to compare the different approaches in terms of load balancing and efficiency, their usage of upload bandwidth needs to be measured. Upload bandwidth used for tracking is equivalent to the cumulative size of messages sent by each approach. There are DHT, PEX and B-Tracker messages that have to be recorded in log files. Logging B-Tracker messages is simple, since only the plugin implementation has to be changed. PEX and DHT messages, however, are handled by Vuze, and to log them the Vuze source code had to be changed. For this purpose, a static class called Measurement is introduced, and the code is extended at the points where PEX and DHT messages are being sent. The B-Tracker plugin uses the same class to log messages. The result is a measurement log file which contains all relevant messages from one run and from one peer.
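The exact shape of the Measurement class is not shown here; the following Java sketch is one plausible form, under the assumption that each sent message is appended to a per-peer log file with a timestamp, protocol, message type and size in bytes.

import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintWriter;

// Sketch: a static measurement logger appending one line per sent message.
public final class Measurement {
    private static PrintWriter log;

    public static synchronized void init(String logFile) throws IOException {
        log = new PrintWriter(new FileWriter(logFile, true), true); // append, auto-flush
    }

    // protocol is e.g. "DHT", "PEX" or "BTRACKER"; size is the message size in bytes.
    public static synchronized void record(String protocol, String messageType, int size) {
        if (log != null) {
            log.println(System.currentTimeMillis() + ";" + protocol + ";" + messageType + ";" + size);
        }
    }

    private Measurement() { }
}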

4.3 Further Development

The B-Tracker implementation developed in the course of this thesis still has prototype status. In order to be used broadly outside the lab, some improvements are required; these are explained in the following paragraphs.

At the moment B-Tracker supports only peers running B-Tracker. This should be enhanced so that peers can also discover new providers not running B-Tracker. This feature is required since, at the beginning of a broad deployment, only a few peers will support B-Tracker and the benefit would be low. For a working enhancement, a way has to be defined in which status information of non-B-Tracker peers can be tracked.

As it is, B-Tracker's only bootstrapping mechanism is the DHT. This is fine for totally decentralized tracking, as is the scope of this thesis. In a more typical BitTorrent scenario there is usually a tracker, which might fail at some point; however, peers are already known by then and a DHT query could be obsolete. B-Tracker should be enhanced with a possibility to learn peers from other sources, such as peers returned by a classical tracker.

The B-Tracker implementation described in this thesis lacks support for IPv6. This drawback does not influence the outcome of the thesis, but it prevents wide adoption of the plugin should it be released for public use.

To be widely adopted, B-Tracker would need implementations for the other main BitTorrent clients. This would mean that the Vuze messaging system used in the scope of this thesis could not be used anymore, since it works only between Vuze peers.

Chapter 5

Evaluation

This chapter concentrates on the evaluation of the B-Tracker approach. The evaluation's goal is to show that the B-Tracker approach first described in [13] improves load balancing compared to the pure DHT tracker and is more efficient than PEX. The evaluation is based on the three different approaches under three different scenarios, where the scenarios are merely different churn rates. This results in 9 experiments, each of which is executed 10 times, so each experiment has 10 runs.

Table 5.1 summarizes the different experiment configurations. There is one column for each approach and one row for each churn rate, resulting in 9 cells for the nine experiments. Experiment identifiers are a combination of approach shortcut and churn rate.

5.1 Evaluation Environment

Table 5.2 shows the hardware specification of the server used for the experiments. The experiments are conducted on a single server; in order to increase the number of peers in the experiments, more servers would need to be added, or alternatively faster hardware could be used. Distributing the experiments onto several servers adds complexity to the system but increases the maximum peer count. The server can connect to the Internet, but no connections are accepted from the outside. This way experiments are executed in a closed environment, which provides more accurate results. The Azureus DHT is especially affected by this constraint; thus, a private Azureus DHT is built up each time a run starts. This requires one peer to take over the role of the bootstrapping node.

Churn    B-Tracker    DHT      PEX

0%       BT00         DHT00    PEX00
15%      BT15         DHT15    PEX15
30%      BT30         DHT30    PEX30

Table 5.1: Listing of the 9 experiments used in the evaluation.


Server Hardware

CPU     2x AMD Opteron(tm) 6180 SE
RAM     64 Gbyte
HDD1    134 Gbyte
HDD2    400 Gbyte
HDD3    1.8 Tbyte (RAID 0)

Table 5.2: List of the relevant hardware specifications of the server in use.

Furthermore, all peers communicate over their own network interface, which is a sub-interface of the loopback interface. Using the same interface for 100 or more peers results in unpredictable port conflicts, since Vuze maintains at least 30 connections to other peers; 100 peers using at least 30 ports each block a considerable share of the available ports, and conflicts become very likely.
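
A sketch of how such loopback sub-interfaces could be created on a Linux server is shown below; the actual script shipped on the DVD (create_interfaces.sh, see Appendix B) may differ.

    #!/bin/sh
    # Hypothetical sketch: one loopback sub-interface per peer, so that each
    # Vuze instance can bind to its own address and port conflicts are avoided.
    i=1
    while [ $i -le 100 ]; do
        ifconfig lo:$i 127.0.0.$((i + 1)) netmask 255.0.0.0 up
        i=$((i + 1))
    done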

5.2 Experiment Design

Table 5.1 summarizes the different experiment configurations on which the evaluation is based. Under these configurations, the distribution of a single ISO file is executed. The file is considered to be new: it is very popular in the beginning, but its popularity decreases over time, so the time between two peers joining the torrent increases. This time is called the peer inter-arrival time. A run is finished as soon as all peers have finished the download. The setup described here is the same for all experiments.

5.2.1 Files, Bandwidth and Seeding

Since the most popular files shared over BitTorrent are large, it is essential to use a large file for the experiment [12]. Popular high-resolution movie torrents reach sizes of 8 GB or more. Two reasons speak against using such a file for the evaluation. First, such a large file takes several hours to download, which is not feasible for experiments consisting of several runs. Second, movie files are often shared outside the frame of legality. Therefore, the Ubuntu 11.10 64-bit PC (AMD64) desktop ISO [15] was chosen. It has a size of roughly 715 MB and is also officially distributed over the BitTorrent network.

The scarcest resource of the typical P2P node is upload bandwidth capacity. Through the wide adoption of DSL Internet connections, download bandwidth capacity is much higher than upload capacity. The evaluation reflects this fact, especially since upload overhead traffic is compared. As a basis for bandwidth capacity limits, the fastest offering of Switzerland's largest ISP (Swisscom) [1] was selected, which is comparable in speed to intermediate offerings of Switzerland's largest cable network provider (upc cablecom) [23]. The Swisscom DSL offering has 20 Mbit/s download and 2 Mbit/s upload bandwidth capacity. For the evaluation, these values are used as the mean of a normal distribution of upload bandwidth capacity, where download bandwidth is always ten times the upload bandwidth capacity. This makes the setup more realistic, since not all real-world peers have the same bandwidth capacity on their Internet connections.
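
The assignment of bandwidth capacities can be sketched as follows; the mean of 2 Mbit/s and the factor of ten are taken from the text, while the standard deviation of the normal distribution is an assumption for illustration only.

    // Minimal sketch of the bandwidth capacity assignment described above.
    import java.util.Random;

    public class BandwidthAssigner {
        private static final double MEAN_UPLOAD_KBIT = 2000.0; // 2 Mbit/s, from the text
        private static final double STD_UPLOAD_KBIT = 500.0;   // assumed, not from the thesis

        // Returns {upload, download} in kbit/s for one peer.
        public static double[] assign(Random rnd) {
            double upload = Math.max(100.0, MEAN_UPLOAD_KBIT + rnd.nextGaussian() * STD_UPLOAD_KBIT);
            double download = 10.0 * upload; // download is always ten times the upload capacity
            return new double[] { upload, download };
        }
    }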

A BitTorrent file distribution needs to start with at least one seed, and this evaluation is no exception. For all experiments two seeds are used, and they differ from the other peers. Seed download and upload speeds are fixed at the mean rates described in the last paragraph, so that experiments are comparable with each other: if a seed were assigned a very low speed by the random distribution, the outcome of a run could differ greatly from one where the seed got a high or medium speed. Keeping the seeds' upload speed limited ensures that peers need to find other sources in order to complete the download in a reasonable amount of time. For the same reason, the seeds are not affected by churn. Naturally, the two seeds are the first ones to be online, since it would not make sense to publish a .torrent before activating at least one seed.

After the seeds are online, the other peers start arriving. Popularity is very high in the beginning of a run; therefore, the peer inter-arrival time is very short at first, so that a large number of peers quickly join the swarm. The peer inter-arrival time follows an exponential function and ranges from 1 to 35 seconds for the 100 peers simulated. In a real-life situation, a peer also leaves the swarm after a certain time or after reaching a certain share ratio. For the experiment, a share ratio of 1 was chosen. This ensures that the experiments can finish in reasonable time and also forces peers to find new providers once all their neighbors have reached the limit. Peers which reach this ratio do not shut down; they just stop uploading. Since this rule only applies after a download has finished, it is still possible for a peer to have a share ratio larger than 1. The seeds are not affected by this rule, since it is not realistic that an original seeder simply stops uploading as long as the file is still wanted.
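
The exponentially growing inter-arrival time can be sketched as follows; the exact curve used in the experiments is not specified in the text, so the interpolation between the 1 s and 35 s endpoints is an assumption.

    // Sketch: the delay before peer i joins grows exponentially from
    // 1 s (first peer) to 35 s (last peer), i.e. 35^(i/(n-1)) seconds.
    public class ArrivalSchedule {
        public static double interArrivalSeconds(int peerIndex, int totalPeers) {
            return Math.pow(35.0, (double) peerIndex / (totalPeers - 1));
        }

        public static void main(String[] args) {
            for (int i = 0; i < 100; i++) {
                System.out.printf("peer %d joins %.1f s after the previous one%n",
                        i, interArrivalSeconds(i, 100));
            }
        }
    }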

5.2.2 Churn

A very important factor in P2P systems is churn. In the real world, peers leave the system for various reasons, among them computer shutdown (e.g., at night), connection loss, or crashes. A P2P system needs the ability to cope with this unpredictable behavior. Churn is important for the evaluation because with more churn, more tracker look-ups are performed and some of the returned addresses become useless.

Each approach has a churn rate parameter which defines the amount of churn in the system. The churn rate is expressed as the percentage of peers that quit at each churn interval. The interval is set to 240 seconds for all experiments, and the churn rate varies between 0, 15, and 30 percent. In other words, with a churn rate of 30, every four minutes 30% of all peers except the original seeds go offline. They come back online with a different port number, which makes them new peers, since the old address has become useless. The new peer keeps the old download state, however, because otherwise a run would most likely never finish. Varying the churn rate allows a comparison of the three approaches under different conditions.
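
A minimal sketch of this churn mechanism is given below; the Peer interface and its methods are hypothetical stand-ins for the experiment scripts controlling the Vuze instances.

    // Sketch of the churn procedure described above: every 240 s a random
    // share of the non-seed peers goes offline and rejoins with a new port,
    // keeping its download state.
    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.List;
    import java.util.Random;

    public class ChurnDriver {
        private final double churnRate; // e.g. 0.30 for the 30% experiments
        private final Random rnd = new Random();

        public ChurnDriver(double churnRate) { this.churnRate = churnRate; }

        // Invoked once per 240-second churn interval.
        public void churnOnce(List<Peer> peers) {
            List<Peer> candidates = new ArrayList<Peer>();
            for (Peer p : peers) {
                if (!p.isSeed()) candidates.add(p); // seeds are never churned
            }
            Collections.shuffle(candidates, rnd);
            int victims = (int) Math.round(candidates.size() * churnRate);
            for (Peer p : candidates.subList(0, victims)) {
                p.shutdown();
                p.restartWithNewPort(); // the old address becomes useless
            }
        }

        interface Peer {
            boolean isSeed();
            void shutdown();
            void restartWithNewPort();
        }
    }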

5.2.3 DHT

The Azureus DHT is used by around 1 million users, depending on the time of day. Using such a large DHT is not feasible for the evaluation, since it is impossible to measure all the nodes taking part in it. Furthermore, it would be almost impossible to separate the traffic belonging to different torrents. Measuring only a subset of the nodes would make the experiments incomparable, because values might be stored outside the measured nodes. Therefore, a local DHT inside the closed environment is used, run by the peers taking part in the run. Thus, all DHT messages are measured and no unrelated downloads interfere with the run.

Starting the DHT from scratch requires a bootstrapping node. By default, Vuze queries dht.vuze.com:6881 for a bootstrapping node. For the evaluation this means that one node always has to use port 6881 and that the hosts file of the server has to be edited to point dht.vuze.com at that node's IP address. Initialization of the DHT takes around 15 seconds per node; for that reason, all peers are started first and .torrents are added to the running instances only after a wait time of 15 seconds.
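
The hosts file change described above could look as follows; the IP address is a placeholder for the bootstrapping node's actual address.

    # /etc/hosts entry redirecting Vuze's DHT bootstrap lookup to the
    # local bootstrapping node (IP address is a placeholder)
    127.0.0.2    dht.vuze.com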

5.2.4 Performance Issues

During the first runs with large files (e.g., 700 MB), several performance issues were discovered. Hard disk utilization was constantly at 100%, and memory usage was so high that the server started using the swap file extensively. CPU usage was not a problem, though.

The high hard disk usage has several causes. One is the swapping caused by the excessive memory usage of Vuze; others are the downloaded files that have to be written to disk and the logging, which also uses the disk. In order to reduce disk usage, the Vuze disk write cache was increased from the standard 4 MB to 100 MB. Since this did not solve the issue completely, a RAID 0 disk array was added to the server to provide storage space for the downloaded files, while the logs are still written to the system disk. Disk access was the major performance issue discovered during the experiments.

Vuze by default uses up to 1 Gbyte of memory, so much that the server's 64 GB of RAM are not sufficient to run 100 Vuze instances. The Java Virtual Machine parameter -Xmx is used to set the maximum amount of memory a Vuze instance is allowed to use. With this restriction, swapping is prevented and disk usage of the system disk is greatly reduced. For 100 peers, the -Xmx value is set to 512 MB, which is sufficient for Vuze to run properly, at least for Vuze instances without a GUI as they are used in this evaluation.
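
A hedged example of starting a headless Vuze instance with this restriction is shown below; only the -Xmx flag is taken from the text, while the jar name and console UI option reflect common Azureus usage and may differ between versions.

    # Start one Vuze instance limited to a 512 MB heap (flags partly assumed)
    java -Xmx512m -jar Azureus2.jar --ui=console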

5.2.5 Parameters

Table 5.3 shows the B-Tracker parameters used in the evaluation. The Primary Tracker Capacity is set to 10; this value ensures that the system can handle the churn rates used while not overusing the DHT. Primary Tracker Replication is fixed at 20 by the Azureus DHT and is left unchanged to remain comparable to a realistic system. Secondary Tracker Replication is set to 35, since this is a common value in P2P systems and was also used in the early B-Tracker simulations [13]. The Bloom Filter Size of 512 showed good performance in simulations; since a peer only needs 35 neighbors, a few false positives will not harm the system. A Maximum Response Size of 20 is used to be consistent with the first simulations. The Timeout Start Interval and Timeout Maximum Interval are set to 15 seconds and 2 minutes, respectively: a small start value speeds up initial peer discovery, while increasing the timeout to 2 minutes saves resources when churn is low or nonexistent. The TTL is set to 5 minutes.

Experiment Parameters

CP  (Primary Tracker Capacity)         10
RP  (Primary Tracker Replication)      20
RS  (Secondary Tracker Replication)    35
FS  (Bloom Filter Size)                512
MR  (Maximum Response Size)            20
TSI (Timeout Start Interval, ms)       15000
TMI (Timeout Maximum Interval, ms)     120000
TTL (Time To Live, ms)                 300000

Table 5.3: B-Tracker parameters used in the experiment.
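
These parameters are read from the plugin's btracker.properties file (see Appendix A). A hypothetical file mirroring Table 5.3 could look as follows; the key names are assumptions, not necessarily the plugin's actual property names.

    # Hypothetical btracker.properties mirroring Table 5.3
    cp=10
    rp=20
    rs=35
    fs=512
    mr=20
    tsi=15000
    tmi=120000
    ttl=300000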

5.3 Execution and Results

Now that the evaluation setup has been explained, the results are presented. First, the execution is explained in more detail; then the results and an analysis are presented.

5.3.1 Execution

An experiment is executed for each tracker approach with churn rates of 0, 15, and 30 percent. Furthermore, each experiment consists of 10 runs in order to cancel out random effects influencing a single run. Such effects are produced by churn, since it selects peers by chance: there is always a chance that one peer gets restarted repeatedly, delaying the experiment.

Messages are measured just before they are sent over the network. The measured size includes only the raw data being sent; no TCP/IP headers are included. Messages are logged together with a time stamp, message type, and size. This allows for a detailed analysis of the communication between peers, which is also valuable for debugging. For example, it can be determined how many FIND_VALUE messages were sent by the DHT.
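
As an illustration of such an analysis, the following sketch counts REQUEST-FIND-VALUE messages in a measurement log; the semicolon-separated "timestamp;type;size" line format matches the Measurement sketch in Chapter 4 and is an assumption.

    // Sketch: count REQUEST-FIND-VALUE messages in one measurement log file.
    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.io.IOException;

    public class LogStats {
        public static void main(String[] args) throws IOException {
            long count = 0;
            BufferedReader in = new BufferedReader(new FileReader(args[0]));
            String line;
            while ((line = in.readLine()) != null) {
                String[] fields = line.split(";");
                if (fields.length == 3 && fields[1].equals("REQUEST-FIND-VALUE")) {
                    count++;
                }
            }
            in.close();
            System.out.println("REQUEST-FIND-VALUE messages: " + count);
        }
    }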

[Figure: bar chart; y-axis: Mean Load (Mbyte), x-axis: Churn Rate (%); series: B-Tracker (Total), DHT (Total), PEX (Total).]

Figure 5.1: Mean load per peer under different churn conditions.

5.3.2 Efficiency

First, the efficiency of the three approaches is compared. Efficiency means wasting as few resources as possible in order to accomplish a certain task. In the context of this evaluation, the resource is load, defined as the aggregated size of sent messages, also called traffic; the task is to download a file. To obtain the numbers for the comparison, each experiment's log files were parsed, the message sizes were added up and divided by the number of peers, and the result was averaged over the 10 runs. This yields the mean upload traffic used for tracking per peer across 10 runs.

Figure 5.1 presents the mean outgoing traffic for all experiments. Each bar represents the mean total outgoing traffic per peer; the error bars show the standard deviation (STD) between the 10 runs of each experiment. It can be seen that without churn, B-Tracker lies between the DHT-Tracker and PEX. As soon as churn is active, load starts to increase, which is expected since joining and leaving the DHT causes traffic and all three approaches use the DHT. What is rather surprising is that the PEX load increases more than threefold from 0% to 15% churn. One explanation is that outdated provider information is still passed from peer to peer, generating additional traffic. The B-Tracker and DHT loads also increase with churn; noteworthy is that the B-Tracker load increases less than the DHT load, making B-Tracker more efficient than the DHT approach at 30% churn. Without churn, however, the DHT approach is significantly more efficient. Therefore, further analysis of the load composition is required. The error bars grow with churn, which justifies the execution of multiple runs.

Figure 5.2 again shows the mean load per peer, but also how much of it is accounted for by DHT, B-Tracker, or PEX messages. Looking at the zero-churn bars reveals that B-Tracker's DHT portion of the load is the same as the DHT-Tracker's traffic.

[Figure: bar chart; y-axis: Mean Load (Mbyte), x-axis: Churn Rate (%); series: B-Tracker (BT), B-Tracker (DHT), B-Tracker (Total), DHT (Total), PEX (PEX), PEX (DHT), PEX (Total).]

Figure 5.2: Decomposed mean load per peer under different churn conditions.

Having B-Tracker load on top of the DHT load makes it impossible for B-Tracker to be more efficient in this scenario. A zero-churn environment is not realistic, but it reveals interesting properties of the system. With a churn rate of 15%, B-Tracker and DHT are very close to each other, and with 30% B-Tracker is more efficient. An explanation for the equal DHT traffic of B-Tracker and DHT is that without churn the DHT needs to be queried rarely or only once, and B-Tracker does exactly the same in its first phase. In contrast to the simulations of Bocek and Hecht [13], in which the replication factor of the DHT was reduced for B-Tracker, this is not feasible in a realistic evaluation using the Azureus DHT: the DHT properties cannot easily be changed from the Vuze-specific plugin implementation of B-Tracker, and the Azureus DHT has to support the DHT-Tracker at the same time. On the other hand, pure B-Tracker becomes more efficient with rising churn, where the main driver of load is the DHT.

Figure 5.2 also shows that using PEX together with the DHT produces slightly more DHT load than the pure DHT approach. PEX itself generates load by flooding messages to its neighbors as soon as churn is introduced. With PEX enabled, queries to other trackers (DHT or centralized tracker) are reduced to the minimum; therefore, the download time increases with the use of PEX, which also explains the higher DHT load. The error bars again show that the error grows with churn. The error bars for PEX (DHT) traffic at 30% churn are smaller than those for the DHT-Tracker, an effect of PEX's reduced DHT queries, since the DHT's housekeeping load is more constant than its query load.

[Figure: bar chart; y-axis: Standard Deviation of Peer Load (kbyte), x-axis: Churn Rate (%); series: B-Tracker (Total), DHT (Total), PEX (Total).]

Figure 5.3: Absolute load balancing.

5.3.3 Load Balancing

One goal in the development of B-Tracker was to provide better load balancing than the DHT-Tracker. Load was defined above; load balancing is measured as the standard deviation of the peers' loads. Figure 5.3 shows the absolute standard deviation of the peers' loads: the larger the standard deviation, the less balanced the load. The error bars give the standard deviation between runs. As expected, B-Tracker has the lowest standard deviation in all scenarios and, therefore, the best load balancing. Since the values are absolute, it is normal that the standard deviation rises with churn: when the overall load grows, the absolute difference between a highly loaded and a lightly loaded peer grows as well, even if the relative balance stays the same. Since the PEX load is much higher than the B-Tracker or DHT load, the absolute STD is not suited for comparing PEX to the other approaches. The high PEX values are nevertheless suspicious, and it has to be investigated whether they are only due to the higher load.

Figure 5.4 therefore plots the coefficient of variation (CV), which is better suited for comparison than the standard deviation because it expresses the STD as a fraction of the mean. This way the very high PEX values can be compared to the others, since the CV is a dimensionless indicator of how well values are distributed. The biggest difference to the standard deviation can be seen in PEX: it suddenly looks much more balanced than the others, except in the zero-churn experiment. The mean load shown in Figure 5.2 explains this effect. With zero churn, PEX messages account for only the smaller portion of the load; the other portion is produced by the DHT, which is not very well balanced. As soon as churn is introduced, PEX messages start to produce more load, and since PEX messages are sent regularly after a certain time interval, their load is distributed evenly. Furthermore, the PEX CV rises slightly with higher churn, while the B-Tracker and DHT CVs fall. The DHT CV is notably lower with high churn; this is due to the fact that node IDs are created anew every time Vuze restarts, which flattens out the random distribution effects of the DHT.

[Figure: bar chart; y-axis: Coefficient of Variation, x-axis: Churn Rate (%); series: B-Tracker (Total), DHT (Total), PEX (Total).]

Figure 5.4: Relative load balancing expressed as Coefficient of Variation.

Although PEX has the lowest CV at churn rates greater than zero, it would be wrong to promote it as the best solution. PEX wastes considerable amounts of bandwidth, which is not acceptable when the wasted resource is as scarce as upload bandwidth. Therefore, one has to consider not only the CV but also the STD and the efficiency. Considering both relative and absolute load balancing, B-Tracker clearly shows the best results.
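
For reference, the coefficient of variation used above is computed from the per-peer loads as follows; this is the standard definition, not code taken from the evaluation scripts.

    // CV = standard deviation / mean; a dimensionless measure of load balance.
    public class LoadBalance {
        public static double coefficientOfVariation(double[] peerLoads) {
            double mean = 0;
            for (double l : peerLoads) mean += l;
            mean /= peerLoads.length;

            double var = 0;
            for (double l : peerLoads) var += (l - mean) * (l - mean);
            var /= peerLoads.length;

            return Math.sqrt(var) / mean;
        }
    }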

Figures 5.5, 5.6, and 5.7 visualize load balancing in another way. After each run, the peers are ranked according to their total traffic. After the experiment, the mean of all peers with rank 1 is calculated, then of those with rank 2, and so on. The result, plotted in the figures, shows the distribution of load in one experiment.
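
The ranking procedure can be sketched as follows; loads[run][peer] stands for the total tracking traffic of one peer in one run, and the class and method names are hypothetical.

    // Sort the peer loads within each run, then average across runs per rank.
    import java.util.Arrays;

    public class RankedLoad {
        public static double[] rankedMeans(double[][] loads) {
            int peers = loads[0].length;
            double[] means = new double[peers];
            for (double[] run : loads) {
                double[] sorted = run.clone();
                Arrays.sort(sorted); // ascending: higher ranks carry more load
                for (int rank = 0; rank < peers; rank++) {
                    means[rank] += sorted[rank] / loads.length;
                }
            }
            return means;
        }
    }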

Figure 5.5 shows the results without churn. The PEX and DHT curves resemble each other in shape, but the PEX curve is much higher, because the overall load is higher while the pure PEX load is well distributed. The B-Tracker curve is much smoother than the others. The figure also reflects the fact that the B-Tracker load is higher than the DHT-Tracker load. The DHT-Tracker curve has a small jump between peers 20 and 25, and its slope is steeper from rank 0 to 25 than for the rest; the PEX curve shows the same change in slope. The jump and the change in slope come from the replication factor of 20, which is the number of peers queried during a DHT look-up. The jump occurs at ranks higher than 20 because a look-up has to query the 20 closest nodes; in order to be sure of having queried the 20 closest nodes, some additional nodes are queried, and once several nodes are found which are not closer, the algorithm stops searching.

Figure 5.6 shows the results with 15% churn. Compared to Figure 5.5 the curves are smoother, the PEX curve is far higher, and the B-Tracker and DHT-Tracker curves are much closer and cross each other. The smoother-looking curves are an effect of the compressed y-axis scale, since PEX now has roughly three times the load it has with zero churn. The DHT-Tracker curve is higher than the B-Tracker curve until the jump around rank 25, from where on it is slightly lower. In total the DHT-Tracker is still a little more efficient than B-Tracker, as can be seen in Figure 5.1. The change in slope is more gradual for PEX than with zero churn, and the jump cannot be observed at all.

[Figure: line plot; y-axis: Load (Mbyte), x-axis: Peer Rank (0-100); series: B-Tracker, DHT, PEX.]

Figure 5.5: Ranked mean load per peer for 0% churn.

Figure 5.7 shows the results with 30% churn. The PEX curve is even higher than with 15% churn, which compresses the y-axis scale further. The jump in the DHT-Tracker curve is still visible, but the DHT-Tracker curve now crosses the B-Tracker curve between ranks 50 and 60. In total, B-Tracker is more efficient.

The figures reflect all observations made earlier. Total load is higher for B-Tracker with zero churn but lower with 30% churn, where the B-Tracker curve lies below the DHT curve for most ranks. Load balancing is reflected in the steepness of the curves; it is clearly visible that the B-Tracker curves are flatter and smoother than the DHT curves.

5.3.4 Messages

Figure 5.8 shows the mean load per peer produced by each DHT message type at a churn rate of 15%; the error bars represent the STD between runs. The figure reveals that REPLY-FIND-NODE messages account for most of the load. REPLY-FIND-NODE messages are used in DHT routing to find the closest nodes for a read or write request; they are also used when a node needs more contacts because of losses due to churn. Furthermore, it can be seen that B-Tracker uses the DHT more economically than the DHT-Tracker, since its REQUEST-FIND-VALUE and REQUEST-STORE portions are smaller. This is due to the small size of the values stored in the DHT and the fewer read and write requests. The DHT has a higher STD for most message types, which reflects the overall higher STD of the DHT shown in Figure 5.1.

[Figure: line plot; y-axis: Load (Mbyte), x-axis: Peer Rank (0-100); series: B-Tracker, DHT, PEX.]

Figure 5.6: Ranked mean load per peer for 15% churn.

[Figure: line plot; y-axis: Load (Mbyte), x-axis: Peer Rank (0-100); series: B-Tracker, DHT, PEX.]

Figure 5.7: Ranked mean load per peer for 30% churn.

[Figure: bar chart; y-axis: Aggregated Message Size (kbyte); x-axis: DHT message types (DATA, REPLY-FIND-NODE, REPLY-FIND-VALUE, REPLY-PING, REPLY-STORE, REQUEST-FIND-NODE, REQUEST-FIND-VALUE, REQUEST-PING, REQUEST-STORE); series: B-Tracker, DHT, PEX.]

Figure 5.8: Mean DHT load of a peer by message type.

[Figure: bar chart; y-axis: Number of Messages; x-axis: DHT message types as in Figure 5.8; series: B-Tracker, DHT, PEX.]

Figure 5.9: Mean number of DHT messages sent per experiment.

[Figure: bar chart; y-axis: Aggregated Message Size (kbyte); x-axis: B-Tracker message types (HANDSHAKE, REQUEST, RESPONSE); series: Churn 0%, Churn 15%, Churn 30%.]

Figure 5.10: Mean B-Tracker load of a peer by message type.

Figure 5.9 shows the mean number of messages sent by a peer during an experiment; the error bars represent the STD between runs. Comparing this graph to Figure 5.8, it is apparent that slightly more REQUEST-FIND-NODE than REPLY-FIND-NODE messages are sent, so REPLY-FIND-NODE messages must be much bigger than the corresponding request messages. These numbers make sense, since a reply is only sent after a request was received and some requests may go unanswered. A comparison of the REPLY-FIND-VALUE message sizes and counts confirms that B-Tracker issues fewer DHT requests and stores smaller values in the DHT: B-Tracker sends only slightly fewer REPLY-FIND-VALUE messages, but they cause less than half the load of the DHT-Tracker's messages. The DHT STD is larger than the others for most message types; it is therefore the number of messages which varies between runs, not the message size.

Figure 5.10 shows the mean load produced by each B-Tracker message type for one peer; the error bars are the STD between runs. Most of the load is produced by HANDSHAKE messages. To further optimize B-Tracker, reducing the number of HANDSHAKE messages sent could lower the load greatly, although it cannot be told how much the system benefits from the additional peer state (online/offline) information they provide. If a peer were to further distribute providers that went offline due to churn, this could lead to the same issues discovered with PEX. Another peculiarity visible in the graph is the reduction in messages when moving from zero to 15% churn. The likely explanation is that fewer handshakes are sent because peers shut down; at 30% churn this effect is compensated by the additional overhead. The error of the RESPONSE and REQUEST messages is significantly larger at a churn rate of 30%, which must be an effect of churn's random peer selection.

[Figure: bar chart; y-axis: Run Time (minutes), x-axis: Churn Rate (%); series: B-Tracker, DHT, PEX.]

Figure 5.11: Mean time to finish one run.

5.4 Run Times

The measurements made during the experiments allow a high-level comparison of run times. Comparing the time needed to complete a run, i.e., for all peers to download the file, is only possible because a real BT client was used. If the three approaches were used in the real world, these differences would be directly perceptible to users.

Figure 5.11 shows the mean time taken to complete a run of an experiment; the error bars show the standard deviation between runs. The run times show that PEX takes more time than the others, which is surprising since it uses the DHT-Tracker as well as PEX messages and would therefore be expected to show run times similar to the DHT-Tracker. One reason for this effect is the way PEX interacts with other trackers in Vuze: if PEX is used, the regular tracker's announce interval is set to the maximum allowed value, which means fewer tracker queries. The results show that PEX cannot compensate for the longer announce interval. The DHT-Tracker behaves as expected: the run time is higher for higher churn rates, and the STD rises significantly as well. B-Tracker shows an interesting behavior, since its run times are lower for 15% churn than for zero churn, a hint that improvements can still be made. Further investigation is necessary to explain the STD increase of the DHT-Tracker and the low STD of B-Tracker at 15% churn.

Chapter 6

Summary and Conclusions

6.1 Summary

The main problems addressed by the B-Tracker approach are efficiency and load balancing in distributed BitTorrent trackers. Historical data and simulations showed that the uneven load distribution results from the uneven popularity distribution of torrents and from the mechanics of DHTs in general. The second distributed tracking approach, PEX, is known to be inefficient because it relies on flooding of messages. Abstract simulations of the B-Tracker approach and the other distributed trackers have shown a significant improvement of B-Tracker over PEX and DHT in efficiency as well as load balancing.

B-Tracker was implemented as a plugin for the popular BT program Vuze. The original B-Tracker concept had to be developed into a design fitting the Vuze plugin interface; implementing this design resulted in a working Vuze plugin. In order to conduct the evaluation, changes to the Vuze source code had to be made. Developing B-Tracker as a plugin had several advantages. One was the already existing DHT, which could be used for B-Tracker's primary tracker look-up; using the existing DHT also ensured that B-Tracker can be used in the real world outside the lab. Furthermore, running the plugin itself does not require changes to the Vuze source code. Three new message types were introduced in order to run the B-Tracker protocol. The plugin approach also had some drawbacks. One of them is the DHT plugin interface, which is simple to use but does not offer all the features of the DHT. Several things had to be changed when moving from the original design to the plugin implementation.

The evaluation required a large setup of several experiments testing all three tracker approaches under three different churn rates. The parameters of the experiments were fixed at certain values, and file size, churn rate, and seeding policy were defined. An experiment involved 100 peers downloading a file from two seeders with one tracker type and a fixed churn rate, and was considered done as soon as all participating peers had finished downloading. In order to measure the messages sent by the DHT and PEX, the Vuze source code had to be complemented with a few logging statements; logged were the time, type, and size of each message. The results were thoroughly analyzed and discussed.

6.2 Conclusion

G1, the first goal, was designing and implementing B-Tracker. It was reached by developing a Vuze plugin which can be used on any Vuze client with version 4.7 or higher. Furthermore, the evaluation showed clearly that the B-Tracker plugin is a feasible alternative to centralized trackers, PEX, and the DHT-Tracker.

G2 required the setup of a realistic evaluation in order to compare the B-Tracker plugin to the other approaches. Chapter 5 describes in detail the framework developed to compare B-Tracker, PEX, and DHT. By introducing and varying churn between experiments, and by selecting parameters such as bandwidth capacity and file size according to real-world values, the evaluation was made more realistic than the existing B-Tracker simulations.

G3 demanded that B-Tracker improve efficiency compared to PEX and DHT. The evaluation results show that B-Tracker is equally or more efficient than the DHT-Tracker as soon as churn is involved. In contrast to the expectations set by the B-Tracker concept simulations [13], the B-Tracker plugin could not outperform its competitors in efficiency in every experiment: the DHT-Tracker performed better under zero-churn conditions, both are similar at 15% churn, and B-Tracker wins at 30%. The difference to the expected outcome can be explained by the DHT implementation. In order to be comparable, the concept evaluation [13] used a DHT replication factor of 20 for DHT tracking but a factor of 2 for B-Tracker; this is not feasible in a realistic evaluation using the Azureus DHT, since the DHT properties cannot easily be changed from the Vuze-specific plugin implementation, and the Azureus DHT has to support the DHT-Tracker at the same time. Since B-Tracker already uses fewer read and write requests and smaller DHT values than the DHT-Tracker, this is the only remaining explanation for the relatively high load under zero churn. Thus, the B-Tracker values stored in the DHT were reduced in size and DHT look-ups were reduced to a minimum. However, a churn-free environment is not a realistic assumption. The main part of B-Tracker's load was generated by the DHT, and it is therefore difficult to be more efficient than the DHT-Tracker itself without further changing the DHT's behavior.

DHT tracking supported by PEX creates, under churn, more than three times the load of the pure DHT-Tracker. In terms of load balancing, B-Tracker showed the expected improvements over the DHT-Tracker. Comparing PEX to the other approaches was not simple: since PEX uses a lot of bandwidth, its standard deviation is high, but its coefficient of variation is lower than the others'. Furthermore, PEX wastes resources without an apparent benefit. Other work [28] has shown that PEX can improve download times, so it might be the combination of DHT and PEX which is not ideal, or the PEX implementation in Vuze might have unknown issues.

G4 was to improve load balancing in distributed trackers. Compared to the DHT-Tracker, B-Tracker improved the load balance among peers in absolute and relative terms. This shows that B-Tracker partly solves the problem of uneven load distribution in DHTs; it cannot solve it completely as long as it relies on a DHT for primary tracking. Still, B-Tracker improves the load balance although most of the load is generated by the DHT. Since B-Tracker executes fewer DHT read and write requests, this leads to the conclusion that the regular housekeeping activities of the DHT are better distributed than the reads and writes. Comparing PEX and B-Tracker in load balancing is like comparing apples and oranges, because PEX uses so much bandwidth that the effects of the DHT are canceled out entirely. However, the evaluation proves that B-Tracker is superior in load balancing to the pure DHT approach and therefore reached G4.

Goals G1 and G2 were successfully reached, as the evaluation clearly showed. B-Tracker shows better efficiency than the DHT-Tracker at a churn rate of 30%, which suggests that it will also perform better with even more churn. Compared to PEX, B-Tracker makes a very good impression. G3 was achieved almost completely, taking into account that a churn rate of 0% is not realistic and that under the other evaluated conditions B-Tracker was similar to or better than DHT and PEX. B-Tracker's main advantage is its improved load balancing compared to the other evaluated approaches; PEX cannot be considered better in load balancing than B-Tracker or DHT, since it uses so much bandwidth that load balancing becomes irrelevant. G4 was reached. It can be said that the implementation of the B-Tracker plugin is a successful second step towards better load balancing and efficiency in distributed BitTorrent trackers. In the course of this thesis it became clear that using a real-world application and a realistic evaluation scenario yields different results than an abstract simulation. Some compromises had to be made in order to make B-Tracker a piece of software useful in the real world, but it reached its goals almost completely.

Bibliography

[1] Swisscom AG. Internet zu Hause: Für jeden das richtige DSL. http://www.swisscom.ch/res/internet/dsl/index.htm?languageId=de, last visited: April 2012.

[2] The Wikipedia Authors. Bencode. http://en.wikipedia.org/wiki/Bencode, last visited: February 2012.

[3] The Pirate Bay. About the pirate bay. http://thepiratebay.se/about, last visited: April 2012.

[4] B.H. Bloom. Space/time trade-offs in hash coding with allowable errors. Commun. ACM, 13:422–426, July 1970.

[5] J. Byers, J. Considine, and M. Mitzenmacher. Simple load balancing for distributed hash tables. In M. Kaashoek and Ion Stoica, editors, Peer-to-Peer Systems II, volume 2735 of Lecture Notes in Computer Science, pages 80–87. Springer Berlin / Heidelberg, 2003. 10.1007/978-3-540-45172-3_7.

[6] B. Cohen. Incentives build robustness in BitTorrent, 2003.

[7] B. Cohen. The BitTorrent protocol specification. http://www.bittorrent.org/beps/bep_0003.html, last visited: November 2011.

[8] S.A. Crosby and D.S. Wallach. An analysis of BitTorrent's two Kademlia-based DHTs. Technical report, Department of Computer Science, Rice University, Houston, 2007.

[9] G. Dán and N. Carlsson. Power-law revisited: large scale measurement study of P2P content popularity. In Proceedings of the 9th International Conference on Peer-to-Peer Systems, IPTPS'10, pages 12–12, Berkeley, CA, USA, 2010. USENIX Association.

[10] P. Gil. The best torrent downloading software, 2012. http://netforbeginners.about.com/od/downloadingfiles/tp/best-torrent-downloading-software-2012.htm, last visited: April 2012.

[11] B. Godfrey, K. Lakshminarayanan, S. Surana, R. Karp, and I. Stoica. Load balancing in dynamic structured P2P systems. In INFOCOM 2004, Twenty-third Annual Joint Conference of the IEEE Computer and Communications Societies, volume 4, pages 2253–2262, March 2004.

[12] F.V. Hecht, T. Bocek, and D. Hausheer. The Pirate Bay 2008-12 dataset. http://www.csg.uzh.ch/publications/data/piratebay/, 2008.


[13] F.V. Hecht, T. Bocek, and B. Stiller. B-Tracker: Improving load balancing and efficiency in distributed P2P trackers. In Peer-to-Peer Computing (P2P), 2011 IEEE International Conference on, pages 310–313, August/September 2011.

[14] A. Loewenstern. DHT protocol. http://bittorrent.org/beps/bep_0005.html, last visited: November 2011.

[15] Canonical Ltd. ubuntu-11.10-desktop-amd64.iso. http://mirror.switch.ch/ftp/mirror/ubuntu-cdimage//oneiric/ubuntu-11.10-desktop-amd64.iso, last visited: April 2012.

[16] P. Maymounkov and D. Mazières. Kademlia: A peer-to-peer information system based on the XOR metric. In Peter Druschel, Frans Kaashoek, and Antony Rowstron, editors, Peer-to-Peer Systems, volume 2429 of Lecture Notes in Computer Science, pages 53–65. Springer Berlin / Heidelberg, 2002. 10.1007/3-540-45748-8_5.

[17] S. Rieche, L. Petrak, and K. Wehrle. A thermal-dissipation-based approach for balancing data load in distributed hash tables. In Local Computer Networks, 2004, 29th Annual IEEE International Conference on, pages 15–23, November 2004.

[18] H. Schulze and K. Mochalski. Internet study 2008/2009. http://www.ipoque.com/sites/default/files/mediafiles/documents/internet-study-2008-2009.pdf, 2009.

[19] R. Steinmetz and K. Wehrle. P2P Systems and Applications, pages 119–135. Springer-Verlag Berlin Heidelberg, 2005.

[20] R. Steinmetz and K. Wehrle. P2P Systems and Applications, pages 9–16. Springer-Verlag Berlin Heidelberg, 2005.

[21] Theory.org. BitTorrent peer exchange conventions. http://wiki.theory.org/BitTorrentPeerExchangeConventions, last visited: November 2011.

[22] Theory.org. BitTorrent protocol specification v1.0. http://wiki.theory.org/BitTorrentSpecification, last visited: November 2011.

[23] upc cablecom. Fiber power internet 25. http://www.upc-cablecom.ch/b2c/internet/fiberpower25.htm, last visited: April 2012.

[24] Vuze, Inc. Vuze. http://www.vuze.com/, 2003-2011.

[25] Vuze, Inc. Plugin development guide. http://wiki.vuze.com/w/Plugin_Development_Guide, last visited: February 2012.

[26] Vuze, Inc. Azureus messaging protocol. http://wiki.vuze.com/w/Azureus_messaging_protocol, last visited: November 2011.

[27] Vuze, Inc. Distributed hash table. http://wiki.vuze.com/w/Distributed_hash_table, last visited: November 2011.

[28] Di Wu, P. Dhungel, Xiaojun Hei, Chao Zhang, and K.W. Ross. Understanding peer exchange in BitTorrent systems. In Peer-to-Peer Computing (P2P), 2010 IEEE Tenth International Conference on, pages 1–8, August 2010.

Abbreviations

API     Application Programming Interface
AZ      AZureus
AZDHT   AZureus DHT
BT      BitTorrent
CSG     Communication Systems Group
CPU     Central Processing Unit
CV      Coefficient of Variation
DHT     Distributed Hash Table
DSL     Digital Subscriber Line
GUI     Graphical User Interface
JAR     Java ARchive
P2P     Peer to Peer
PEX     Peer EXchange
PDF     Portable Document Format
RAM     Random Access Memory
SHA     Secure Hash Algorithm
STD     STandard Deviation
TMI     Tracker Maximum Interval
TSI     Tracker Start Interval
TTL     Time To Live
URL     Uniform Resource Locator
UCS     Universal Character Set
UT      µTorrent
UTF-8   8 bit UCS Transformation Format

Glossary

.torrent A short form for torrent file. The torrent file contains the necessary information to share a file or a set of files over BitTorrent.

Bootstrapping The process of joining a P2P system. The problem is that initially no other peers are known.

Churn The term for instability in a P2P network. It can have several causes, such as hardware failure or peers intentionally leaving the system.

Peer A member of a system in which every member is server and client at the same time. The term is also used for members of a BitTorrent system.

Provider A peer in a BitTorrent system which can provide a file to a requesting peer.

Swarm The term for a group of peers sharing the same file in a BitTorrent network. Peers of a swarm are always connected to each other, directly or indirectly.

Tracker A broker system for peers in the BitTorrent system. Original BitTorrent trackers are centralized servers; newer approaches distribute the responsibility among peers. In the context of B-Tracker a peer is also a tracker, since it can be queried for providers by other peers.

List of Figures

1.1 Plot of torrent popularity on logarithmic scales based on The Pirate Bay Dataset 2008 [12].

2.1 The main contents of a .torrent for a single file download. More possible fields exist which are optional.

2.2 Example communication between a peer and a tracker. The peer sends a HTTP GET request with parameters and receives a list of peers and an interval.

2.3 The diagram depicts the two basic message types used in the BitTorrent protocol. The standard message has a payload depending on the type of the message. The numbers on top of the fields tell the fields' sizes in Byte.

2.4 This sequence diagram shows a typical message exchange between a peer and a provider following the BT protocol.

3.1 How many nodes manage a number of keys.

3.2 Total requests per node per query interval.

3.3 Flowchart of the main tracker algorithm.

3.4 Flowchart of the message handling algorithm.

5.1 Mean load per peer under different churn conditions.

5.2 Decomposed mean load per peer under different churn conditions.

5.3 Absolute load balancing.

5.4 Relative load balancing expressed as Coefficient of Variation.

5.5 Ranked mean load per peer for 0% churn.

5.6 Ranked mean load per peer for 15% churn.

5.7 Ranked mean load per peer for 30% churn.

5.8 Mean DHT load of a peer by message type.

5.9 Mean number of DHT messages sent per experiment.

5.10 Mean B-Tracker load of a peer by message type.

5.11 Mean time to finish one run.

List of Tables

2.1 Comparison of the two BitTorrent DHT implementations and the Kademlia standard queries.

5.1 Listing of the 9 experiments used in the evaluation.

5.2 List of the relevant hardware specifications of the server in use.

5.3 B-Tracker parameters used in the experiment.

Appendix A

Installation Guidelines

In order to install the plugin into Vuze, a Vuze installation of version 4.7.0.2 is recommended. The plugin might also work with older versions of Vuze, but these were not tested.

To install the plugin into Vuze, simply copy the folder /B-Tracker Plugin/btracker from the CD to the Vuze plugin directory. The plugin directory can be found in the Vuze properties menu under Plugins. Restart Vuze and B-Tracker will be activated for active downloads. To verify that B-Tracker is working, the B-Tracker plugin log can be opened via the Tools > Plugin menu.
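
On a Linux system the copy step could look as follows; the mount point of the CD and the location of the Vuze plugin directory (commonly ~/.azureus/plugins) are assumptions and should be checked in the Vuze preferences.

    # Copy the plugin folder from the CD into the Vuze plugin directory
    cp -r "/media/cdrom/B-Tracker Plugin/btracker" ~/.azureus/plugins/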

To change B-Tracker parameters, the btracker.properties file in the btracker folder must be edited. It does not make sense to use different properties on different peers.

Appendix B

Contents of the DVD

On the root level of the DVD there are the thesis as a PDF file, a German and an English version of the abstract as plain text files, and the final presentation slides in presentation.odp.

B.1 B-Tracker Plugin

The folder B-Tracker Plugin contains:

- A folder btracker which contains the B-Tracker JAR file and the btracker.properties file. This B-Tracker plugin build can be used with an unmodified Vuze client of version 4.7.0.2 or higher.

- A folder B-TrackerPlugin which contains the Eclipse project and the sources of the plugin build. In order to build the plugin, the Vuze source code or JAR is required.

B.2 Data

This folder contains the raw log file data in the archive data.tar.gz. To regenerate the plot data and plots, extract the archive and copy the *.gnuplot files into the same directory. Run parselogs.sh and then run gnuplot on the gnuplot files; shell commands for these steps are sketched after the following list.

- data.tar.gz  Archive containing all the evaluation data.

- *.gnuplot  Gnuplot scripts to generate the graphs.

- *.pdf  Graphics generated from the data in data.tar.gz.
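
The regeneration steps described above, expressed as shell commands; the location of the *.gnuplot files relative to the extracted data is an assumption.

    tar xzf data.tar.gz       # extract the raw log data
    cp ../*.gnuplot .         # copy the gnuplot scripts next to the data (path assumed)
    ./parselogs.sh            # regenerate the plot data from the logs
    for f in *.gnuplot; do gnuplot "$f"; done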

B.3 Experiment

This folder contains all files needed to run experiments. Copy this folder to a Linux system and adapt the file paths in the following properties files:

- btdhtpx100.seq  Sequence properties; properties defined here override the properties in the other files.

- btrackerparam.properties  Basic B-Tracker properties.

In order to execute a full series of experiments, run the create_interfaces.sh script before starting the experiment with nohupseries.sh.

Running the experiments requires the following prerequisites:

- Python must be installed.

- A Java version of 1.6 or higher must be installed and added to the path.

- For building B-Tracker, an Ant installation is required.

The build script executes an svn update; username and password have to be set.

B.4 Related Work

Contains all papers used in the process of writing this thesis.

B.5 Sources

This folder contains the Eclipse projects for the B-Tracker Plugin and the adapted source code of Vuze which was used for the evaluation.

B.6 Thesis

Contains the LaTeX sources of this thesis, including images.