Designs and Analyses in Structured Peer-To-Peer Systems
Total Page:16
File Type:pdf, Size:1020Kb
Designs and Analyses in Structured Peer-to-Peer Systems Sameh El-Ansary A Dissertation submitted to the Royal Institute of Technology (KTH) in partial fulfillment of the requirements for the degree of Doctor of Philosophy June 2005 The Royal Institute of Technology (KTH) School of Information and Communication Technology Department of Microelectronics and Information Technology Stockholm, Sweden TRITA-IMIT-LECS AVH 05:02 ISSN 1651-4076 ISRN KTH/IMIT/LECS/AVH-05/02–SE and SICS Dissertation Series 38 ISSN 1101-1335 ISRN SICS-D–38–SE c Sameh El-Ansary, June 2005 Printed by Universitetsservice US-AB 2005 To My Beloved Wife Mena 3 4 Abstract Peer-to-Peer (P2P) computing is a recent hot topic in the areas of networking and distributed systems. Work on P2P computing was triggered by a number of ad-hoc systems that made the concept popular. Later, academic research efforts started to investigate P2P computing issues based on scientific principles. Some of that research produced a number of structured P2P systems that were collec- tively referred to by the term “Distributed Hash Tables” (DHTs). However, the research occurred in a diversified way leading to the appearance of similar con- cepts yet lacking a common perspective and not heavily analyzed. In this thesis we present a number of papers representing our research results in the area of structured P2P systems grouped as two sets labeled respectively “Designs” and “Analyses”. The contribution of the first set of papers is as follows. First, we present the principle of distributed k-ary search and argue that it serves as a framework for most of the recent P2P systems known as DHTs. That is, given this framework, understanding existing DHT systems is done simply by seeing how they are in- stances of that framework. We argue that by perceiving systems as instances of that framework, one can optimize some of them. We illustrate that by applying the framework to the Chord system, one of the most established DHT systems. Second, we show how the framework helps in the design of P2P algorithms by two examples: (a) The DKS(n; k; f) system which is a system designed from the beginning on the principles of distributed k-ary search. (b) Two broadcast algo- rithms that take advantage of the distributed k-ary search tree. The contribution of the second set of papers is as follows. We account for two approaches that we used to evaluate the performance of a particular class of DHTs, namely the one adopting periodic stabilization for topology mainte- nance. The first approach was of an intrinsic empirical nature. In this approach, we tried to perceive a DHT as a physical system and account for its properties in a size-independent manner. The second approach was of a more analytical nature. In this approach, we applied the technique of Master Equations, which is a widely used technique in the analysis of natural systems. The application of the technique lead to a highly accurate description of the behavior of structured overlays. Additionally, the thesis contains a primer on structured P2P systems that tries to capture the main ideas prevailing in the field and enumerates a subset of the current hot and open research issues. 5 6 Acknowledgments I had the privilege to work with many experienced senior persons during my research and to whom I would like to offer my thanks. First, I would like to thank my supervisor Prof. Seif Haridi for offering me the chance of being a member of his distinguished research team. Seif’s enthusiasm was a strong source of encouragement. He was always keen to share his long ex- perience with me and to teach me new things. He continuously and successfully exerts lots of effort to provide the best research environment and to open new horizons for myself and to all of his students. Second, I would like to express my gratitude to Dr. Luc Onana Alima. Luc in- troduced me to the field of distributed algorithms as a teacher. He, then, worked with me as a partner and co-supervised the first part of this thesis. Luc has been a serious working partner who offered maximum moral support and pushed me to the limit whenever needed. Third, during the second part of the thesis I was co-supervised by Prof. Erik Aurell and Dr. Supriya Krishnamurthy. Erik showed me how physicists do di- mensional analysis for systems. He was a constant source of support and inspira- tion and acted as an example in efficient and hard work. Finally, the supervision of Supriya at the end of the thesis opened to me a complete new scope of think- ing by introducing me to the technique of Master Equations. She showed me how systems of intricate complexity could be analyzed with high accuracy using very basic probabilistic primitives. I would like also to thank the team leader of the DSL lab at SICS Per Brand. Per taught me distributed programming in Mozart. Moreover, he has always been a very inspiring person. His strong intuition and shrewd remarks have always opened the gate for new ideas. I would like to thank my colleagues in the DSL lab at SICS, Konstantin Popow, Erik Klintskog, Dragan Havelka, Fredrik Holmgren, Frej Drejhammar, Joe Arm- strong for being a constant source of help. Ali Ghodsi was a colleague with whom I had lots of enlightening discussions. We experienced together the stress of weekend- and late-night-working in order to conduct simulations and write papers. On the personal level, I would like to thank my wife, my mother, my father and Prof. Ahmed Rafea. If I was able to finish my PhD, that is because of them. Dr. Mahmoud Rafea has been the friend who shared with me the joys and the hardships of the first two years of living in Sweden. 7 8 Contents 1 Introduction 21 1.1 ThesisMotivation......................... 21 1.2 ThesisOrganization. 22 2 AStructuredP2POverlayNetworksPrimer 29 2.1 WhatisP2P?............................ 29 2.2 EvolutionofP2PSystems . 30 2.2.1 FirstGeneration. 31 2.2.2 SecondGeneration . 32 2.2.3 ThirdGeneration ..................... 33 2.3 DefinitionsandAssumptions . 34 2.4 ComparisonCriteria . 36 2.5 DHTSystems ........................... 36 2.5.1 Chord ........................... 36 2.5.2 Pastry ........................... 40 2.5.3 Tapestry .......................... 43 2.5.4 Kademlia ......................... 43 2.5.5 HyperCup......................... 46 2.5.6 DKS ............................ 46 2.5.7 P-Grid ........................... 48 2.5.8 Koorde........................... 51 2.5.9 DistanceHalving . 53 2.5.10 Viceroy........................... 55 2.5.11 Ulysses........................... 57 2.5.12 CAN ............................ 58 9 2.6 Summary.............................. 60 2.6.1 TheOverlayGraph.................... 60 2.6.2 Mapping Items Onto Nodes . 62 2.6.3 TheLookupProcess . 62 2.6.4 Joins, Leaves and Maintenance . 62 2.6.5 ReplicationandFaultTolerance. 63 2.7 HotandOpenResearchIssues . 64 I Designs 75 3 A Framework for P2P Lookup Services Based on k-ary Search 77 3.1 Introduction ............................ 80 3.1.1 Motivationandcontribution . 81 3.2 TheChordlookupalgorithm . 82 3.2.1 The Chord identifier/search space . 82 3.2.2 Keyassignment...................... 82 3.2.3 Theroutingtable ..................... 83 3.2.4 Keylocation........................ 84 3.2.5 Complexity ........................ 84 3.3 Chordasbinary-search. 84 3.3.1 Complexity ........................ 88 3.4 Lookup services as k-arysearch................. 88 3.4.1 Complexity ........................ 90 3.5 k-arysearchforimprovingChord. 91 3.6 Conclusionandfuturework . 93 4 The DKS(N,k,f) InfrastructureforP2PApplications 95 4.1 Introduction ............................ 98 4.1.1 Motivations and contributions . 98 4.1.2 Anoverviewofourapproach . 99 4.1.3 Paperorganization . 100 4.2 The concepts in the design of the DKS(N,k,f) ........101 4.2.1 Underlyingassumptions. 101 4.2.2 The identifier space and notations . 101 4.2.3 Key/value pairs management . 102 10 4.2.4 Levelsandviews . .102 4.2.5 Responsibilities . 103 4.2.6 Routinginformation . 104 4.3 DKS(N,k,f) networksconstruction . 105 4.4 Correction-on-use. 107 4.5 Lookupina DKS(N,k,f) .....................108 4.6 Leave................................109 4.7 Failure ...............................109 4.8 Experimentalresults . 110 4.9 Concludingremarksandfuturework . 112 5 Efficient Broadcast in Structured P2P Networks 117 5.1 Introduction ............................120 5.2 RelatedWork ...........................121 5.3 OurApproach...........................121 5.3.1 DHTs as Distributed k-ary Search . 121 5.3.2 ProblemDefinition . 123 5.3.3 Solutions..........................123 5.4 TheBroadcastAlgorithm. 124 5.4.1 SystemModel&Notation . 124 5.4.2 Rules............................124 5.4.3 CorrectnessArgument . 127 5.5 CostVersusGuarantees . 127 5.6 SimulationResults . .128 5.7 ConclusionandFutureWork . 130 6 Self-Correcting Broadcast in Distributed Hash Tables 133 6.1 Introduction ............................136 6.1.1 Contribution .......................136 6.1.2 Relatedwork .......................137 6.1.3 Outline...........................138 6.2 DKS overview ...........................138 6.2.1 Structure of the DKS ...................138 6.2.2 Routingtables. .139 6.2.3 Lookups ..........................140 11 6.2.4 Correction-on-use. 141 6.3 Thebroadcastalgorithms . 142 6.3.1 Desiredproperties . 142 6.3.2 Informaldescription . 142 6.3.3 Formaldescription . 143 6.4 SimulationResults . .147 6.5 Conclusion.............................149 7 AComponent-basedP2PSimulationEnvironment 153 7.1 Motivation.............................153 7.2 Architecture