Improving Performance of Modern Peer-To-Peer Services
UMEÅ UNIVERSITY
Department of Computing Science
Master Thesis

Improving performance of modern Peer-to-Peer services

Marcus Bergner
10th June 2003

Abstract

Peer-to-peer networking is not a new concept, but it has been resurrected by services such as Napster and by file sharing applications using Gnutella. The infrastructure of today's networks is based on the assumption that users are simple clients making small requests and receiving, possibly large, replies. This assumption does not hold for many peer-to-peer services, and hence the network is often used inefficiently.

This report investigates the reasons behind this waste of resources and looks at various ways to deal with these issues. Focusing mainly on file sharing, it presents various existing attempts together with techniques and suggestions on how improvements can be made.

The report also looks at the implementation of peer-to-peer services and at what kinds of problems need to be solved when developing a peer-to-peer application. The design and implementation of a peer-to-peer framework that solves common problems is the most important contribution in this area.

By looking at other peer-to-peer services, conclusions regarding peer-to-peer in general can be made. The thesis ends by summarizing the solutions to some of the flaws in peer-to-peer services and provides guidelines on how peer-to-peer services can be constructed to work more efficiently.

Contents

Preface
  About this report
  About the author
  Supervisors
  Typographical conventions
  Software used
  Acknowledgments

1 Requirements
  1.1 Summary of requirements
  1.2 Investigate existing services
  1.3 Analyzing file sharing protocols
  1.4 Constructing a peer-to-peer framework
  1.5 Other applications and frameworks

2 Introduction
  2.1 History
    2.1.1 Early services
  2.2 Peer-to-peer related services
    2.2.1 Usenet
    2.2.2 Domain Name System
  2.3 Common problems
    2.3.1 Firewalls
    2.3.2 Port abuse and pushing
    2.3.3 Dynamic host addresses
    2.3.4 Network Address Translation
  2.4 Examples on different services
    2.4.1 Client/server: browsing the web
    2.4.2 Centralized peer-to-peer: Napster
    2.4.3 Decentralized peer-to-peer: Gnutella

3 Gnutella
  3.1 Introduction
  3.2 Protocol messages
    3.2.1 Establishing connections
    3.2.2 Message header
    3.2.3 Ping message
    3.2.4 Pong message
    3.2.5 Query message
    3.2.6 QueryHit message
    3.2.7 Push message
    3.2.8 Bye message, version 0.6
  3.3 Ping and Pong optimizations
    3.3.1 Pong caching schemes
    3.3.2 Ping multiplexing
  3.4 Ultrapeers and routing
  3.5 Transferring files
  3.6 Metadata and rich queries

4 P-Grid
  4.1 Introduction
  4.2 Hierarchical protocols
    4.2.1 An example of a DNS lookup
    4.2.2 Caching in DNS
    4.2.3 Redundancy and fault tolerance in DNS
  4.3 P-Grid architecture
    4.3.1 Peers in the P-Grid network
    4.3.2 Key distribution
    4.3.3 Searching
  4.4 Related protocols

5 FastTrack
  5.1 Introduction
  5.2 FastTrack architecture overview
    5.2.1 Protocol details
  5.3 The giFT project
  5.4 giFT interface protocol
    5.4.1 Protocol format
    5.4.2 Command summary
  5.5 OpenFT network protocol
    5.5.1 Connection establishment
    5.5.2 Protocol summary
    5.5.3 Query management
    5.5.4 File transfer
  5.6 Evaluation of giFT usability

6 Analysis
  6.1 Introduction
  6.2 Gnutella network usage
    6.2.1 Test configuration and scenario
    6.2.2 Test results
    6.2.3 Logged statistics
    6.2.4 Summary and conclusions
  6.3 P-Grid network usage
    6.3.1 Problems with P-Grid
    6.3.2 Building the distributed search tree
    6.3.3 Performing queries
  6.4 OpenFT and giFT
    6.4.1 Test environment
    6.4.2 Performed tests
    6.4.3 Test results
  6.5 Comparison test of FastTrack and Gnutella
    6.5.1 Results using Gnutella
    6.5.2 Results using FastTrack
  6.6 Summary

7 Protocol design
  7.1 Introduction
  7.2 Metadata query protocols
    7.2.1 Internet Nomenclator Project
    7.2.2 Metadata directory service
  7.3 Metadata protocol specification
    7.3.1 Definitions
    7.3.2 Performing a query
    7.3.3 Response to a query
  7.4 An efficient peer-to-peer protocol
    7.4.1 Network nodes
    7.4.2 Protocol overview
    7.4.3 Protocol limitations
    7.4.4 Key management
    7.4.5 Multi-bit keys
    7.4.6 Peer classification
    7.4.7 Establishing connections
    7.4.8 Protocol header
    7.4.9 Message types
    7.4.10 Using XML encoded messages
  7.5 Payload details
    7.5.1 Exchanging information with peerinfo
    7.5.2 Exchanging host lists with list
    7.5.3 Becoming a mirror with join
    7.5.4 Disconnecting from nodes with leave
    7.5.5 Changing keys with changekey
    7.5.6 Registering shares with register and unregister
    7.5.7 Performing searches with query
  7.6 Practical examples
    7.6.1 Bootstrapping the network
    7.6.2 New host joins the network

8 Framework design
  8.1 Introduction
  8.2 Backend components
    8.2.1 Peer Connection Monitor
    8.2.2 Peer Monitor or Protocol Monitor
    8.2.3 Peer Reader, or Protocol Reader
    8.2.4 Peer Protocol
    8.2.5 Peer Back End
  8.3 Frontend components
  8.4 Support components
    8.4.1 Thread pool
    8.4.2 Priority queues
    8.4.3 Dictionaries for registrations
    8.4.4 Sets
    8.4.5 Input multiplexing

9 Implementation
  9.1 Introduction
  9.2 Overview
    9.2.1 Core classes
    9.2.2 The Java nio package
    9.2.3 System initialization
    9.2.4 System shutdown
  9.3 Protocol modules
    9.3.1 The PeerProtocol class