Kademlia: a Peer-To-Peer Information System Based on the XOR Metric

Total Page:16

File Type:pdf, Size:1020Kb

Kademlia: a Peer-To-Peer Information System Based on the XOR Metric Kademlia: A Peer-to-peer Information System Based on the XOR Metric Petar Maymounkov and David Mazieres` fpetar,[email protected] http://kademlia.scs.cs.nyu.edu Abstract we validate with measurements of existing peer-to- peer systems). We describe a peer-to-peer system which has prov- Kademlia takes the basic approach of many peer- able consistency and performance in a fault-prone to-peer systems. Keys are opaque, 160-bit quantities environment. Our system routes queries and locates (e.g., the SHA-1 hash of some larger data). Partici- nodes using a novel XOR-based metric topology that pating computers each have a node ID in the 160-bit simplifies the algorithm and facilitates our proof. key space. hkey,valuei pairs are stored on nodes with The topology has the property that every message IDs “close” to the key for some notion of closeness. exchanged conveys or reinforces useful contact in- Finally, a node-ID-based routing algorithm lets any- formation. The system exploits this information to one locate servers near a destination key. send parallel, asynchronous query messages that tol- Many of Kademlia’s benefits result from its use of erate node failures without imposing timeout delays a novel XOR metric for distance between points in on users. the key space. XOR is symmetric, allowing Kadem- lia participants to receive lookup queries from pre- 1 Introduction cisely the same distribution of nodes contained in their routing tables. Without this property, systems This paper describes Kademlia, a peer-to-peer such as Chord [5] do not learn useful routing infor- hkey,valuei storage and lookup system. Kadem- mation from queries they receive. Worse yet, be- lia has a number of desirable features not simulta- cause of the asymmetry of Chord’s metric, Chord neously offered by any previous peer-to-peer sys- routing tables are rigid. Each entry in a Chord node’s tem. It minimizes the number of configuration mes- finger table must store the precise node proceeding sages nodes must send to learn about each other. an interval in the ID space; any node actually in the Configuration information spreads automatically as interval will be greater than some keys in the inter- a side-effect of key lookups. Nodes have enough val, and thus very far from the key. Kademlia, in knowledge and flexibility to route queries through contrast, can send a query to any node within an in- low-latency paths. Kademlia uses parallel, asyn- terval, allowing it to select routes based on latency or chronous queries to avoid timeout delays from failed even send parallel asynchronous queries. nodes. The algorithm with which nodes record each To locate nodes near a particular ID, Kademlia other’s existence resists certain basic denial of ser- uses a single routing algorithm from start to finish. vice attacks. Finally, several important properties of In contrast, other systems use one algorithm to get Kademlia can be formally proven using only weak near the target ID and another for the last few hops. assumptions on uptime distributions (assumptions Of existing systems, Kademlia most resembles Pas- try’s [1] first phase, which (though not described this This research was partially supported by National Science Foun- way by the authors) successively finds nodes roughly dation grants CCR 0093361 and CCR 9800085. half as far from the target ID by Kademlia’s XOR metric. In a second phase, however, Pastry switches 1 1 distance metrics to the numeric difference between IDs. It also uses the second, numeric difference met- 0.8 ric in replication. Unfortunately, nodes close by the second metric can be quite far by the first, creating 0.6 discontinuities at particular node ID values, reduc- ing performance, and frustrating attempts at formal 0.4 analysis of worst-case behavior. 0.2 0 2 System description 0 500 1000 1500 2000 2500 Each Kademlia node has a 160-bit node ID. Node Figure 1: Probability of remaining online another IDs are constructed as in Chord, but to simplify this hour as a function of uptime. The x axis represents paper we assume machines just choose a random, minutes. The y axis shows the the fraction of nodes 160-bit identifier when joining the system. Every that stayed online at least x minutes that also stayed message a node transmits includes its node ID, per- online at least x + 60 minutes. mitting the recipient to record the sender’s existence if necessary. these lists k-buckets. Each k-bucket is kept sorted by Keys, too, are 160-bit identifiers. To publish and time last seen—least-recently seen node at the head, find hkey,valuei pairs, Kademlia relies on a notion of most-recently seen at the tail. For small values of i, distance between two identifiers. Given two 160-bit the k-buckets will generally be empty (as no appro- identifiers, x and y, Kademlia defines the distance priate nodes will exist). For large values of i, the lists between them as their bitwise exclusive or (XOR) can grow up to size k, where k is a system-wide repli- interpreted as an integer, d(x; y) = x ⊕ y. cation parameter. k is chosen such that any given k We first note that XOR is a valid, albeit non- nodes are very unlikely to fail within an hour of each Euclidean, metric. It is obvious that that d(x; x) = 0, other (for example k = 20). d(x; y) > 0 if x 6= y, and 8x; y : d(x; y) = d(y; x). When a Kademlia node receives any message (re- XOR also offers the triangle property: d(x; y) + quest or reply) from another node, it updates the d(y; z) ≥ d(x; z). The triangle property follows appropriate k-bucket for the sender’s node ID. If from the fact that d(x; z) = d(x; y) ⊕ d(y; z) and the sending node already exists in the recipient’s k- 8a ≥ 0; b ≥ 0 : a + b ≥ a ⊕ b. bucket, the recipient moves it to the tail of the list. Like Chord’s clockwise circle metric, XOR is uni- If the node is not already in the appropriate k-bucket directional. For any given point x and distance ∆ > and the bucket has fewer than k entries, then the re- 0, there is exactly one point y such that d(x; y) = cipient just inserts the new sender at the tail of the ∆. Unidirectionality ensures that all lookups for the list. If the appropriate k-bucket is full, however, then same key converge along the same path, regardless the recipient pings the k-bucket’s least-recently seen of the originating node. Thus, caching hkey,valuei node to decide what to do. If the least-recently seen pairs along the lookup path alleviates hot spots. Like node fails to respond, it is evicted from the k-bucket Pastry and unlike Chord, the XOR topology is also and the new sender inserted at the tail. Otherwise, symmetric (d(x; y) = d(y; x) for all x and y). if the least-recently seen node responds, it is moved to the tail of the list, and the new sender’s contact is 2.1 Node state discarded. Kademlia nodes store contact information about k-buckets effectively implement a least-recently each other to route query messages. For each seen eviction policy, except that live nodes are never 0 ≤ i < 160, every node keeps a list of removed from the list. This preference for old con- hIP address; UDP port; Node IDi triples for nodes of tacts is driven by our analysis of Gnutella trace data distance between 2i and 2i+1 from itself. We call collected by Saroiu et. al. [4]. Figure 1 shows the 2 percentage of Gnutella nodes that stay online another nodes it has chosen. α is a system-wide concurrency hour as a function of current uptime. The longer parameter, such as 3. a node has been up, the more likely it is to remain In the recursive step, the initiator resends the up another hour. By keeping the oldest live contacts FIND NODE to nodes it has learned about from pre- around, k-buckets maximize the probability that the vious RPCs. (This recursion can begin before all nodes they contain will remain online. α of the previous RPCs have returned). Of the k A second benefit of k-buckets is that they pro- nodes the initiator has heard of closest to the tar- vide resistance to certain DoS attacks. One cannot get, it picks α that it has not yet queried and re- flush nodes’ routing state by flooding the system with sends the FIND NODE RPC to them.1 Nodes that new nodes. Kademlia nodes will only insert the new fail to respond quickly are removed from consider- nodes in the k-buckets when old nodes leave the sys- ation until and unless they do respond. If a round tem. of FIND NODEs fails to return a node any closer than the closest already seen, the initiator resends 2.2 Kademlia protocol the FIND NODE to all of the k closest nodes it has not already queried. The lookup terminates when the The Kademlia protocol consists of four RPCs: PING, initiator has queried and gotten responses from the k STORE, FIND NODE, and FIND VALUE. The PING closest nodes it has seen. When α = 1 the lookup al- RPC probes a node to see if it is online. STORE in- gorithm resembles Chord’s in terms of message cost structs a node to store a hkey; valuei pair for later and the latency of detecting failed nodes.
Recommended publications
  • Looking up Data in P2p Systems
    LOOKING UP DATA IN P2P SYSTEMS Hari Balakrishnan, M. Frans Kaashoek, David Karger, Robert Morris, Ion Stoica∗ MIT Laboratory for Computer Science 1. Introduction a good example of how the challenges of designing P2P systems The recent success of some widely deployed peer-to-peer (P2P) can be addressed. file sharing applications has sparked new research in this area. We The recent algorithms developed by several research groups for are interested in the P2P systems that have no centralized control the lookup problem present a simple and general interface, a dis- or hierarchical organization, where the software running at each tributed hash table (DHT). Data items are inserted in a DHT and node is equivalent in functionality. Because these completely de- found by specifying a unique key for that data. To implement a centralized systems have the potential to significantly change the DHT, the underlying algorithm must be able to determine which way large-scale distributed systems are built in the future, it seems node is responsible for storing the data associated with any given timely to review some of this recent research. key. To solve this problem, each node maintains information (e.g., The main challenge in P2P computing is to design and imple- the IP address) of a small number of other nodes (“neighbors”) in ment a robust distributed system composed of inexpensive com- the system, forming an overlay network and routing messages in puters in unrelated administrative domains. The participants in a the overlay to store and retrieve keys. typical P2P system might be home computers with cable modem One might believe from recent news items that P2P systems are or DSL links to the Internet, as well as computers in enterprises.
    [Show full text]
  • Conducting and Optimizing Eclipse Attacks in the Kad Peer-To-Peer Network
    Conducting and Optimizing Eclipse Attacks in the Kad Peer-to-Peer Network Michael Kohnen, Mike Leske, and Erwin P. Rathgeb University of Duisburg-Essen, Institute for Experimental Mathematics, Ellernstr. 29, 45326 Essen [email protected], [email protected], [email protected] Abstract. The Kad network is a structured P2P network used for file sharing. Research has proved that Sybil and Eclipse attacks have been possible in it until recently. However, the past attacks are prohibited by newly implemented secu- rity measures in the client applications. We present a new attack concept which overcomes the countermeasures and prove its practicability. Furthermore, we analyze the efficiency of our concept and identify the minimally required re- sources. Keywords: P2P security, Sybil attack, Eclipse attack, Kad. 1 Introduction and Related Work P2P networks form an overlay on top of the internet infrastructure. Nodes in a P2P network interact directly with each other, i.e., no central entity is required (at least in case of structured P2P networks). P2P networks have become increasingly popular mainly because file sharing networks use P2P technology. Several studies have shown that P2P traffic is responsible for a large share of the total internet traffic [1, 2]. While file sharing probably accounts for the largest part of the P2P traffic share, also other P2P applications exist which are widely used, e.g., Skype [3] for VoIP or Joost [4] for IPTV. The P2P paradigm is becoming more and more accepted also for professional and commercial applications (e.g., Microsoft Groove [5]), and therefore, P2P technology is one of the key components of the next generation internet.
    [Show full text]
  • A Content-Addressable Network for Similarity Search in Metric Spaces Fabrizio Falchi
    University of Pisa Masaryk University Faculty of Informatics Department of Department of Information Engineering Information Technologies Doctoral Course in Doctoral Course in Information Engineering Informatics Ph.D. Thesis A Content-Addressable Network for Similarity Search in Metric Spaces Fabrizio Falchi Supervisor Supervisor Supervisor Lanfranco Lopriore Fausto Rabitti Pavel Zezula 2007 Abstract Because of the ongoing digital data explosion, more advanced search paradigms than the traditional exact match are needed for content- based retrieval in huge and ever growing collections of data pro- duced in application areas such as multimedia, molecular biology, marketing, computer-aided design and purchasing assistance. As the variety of data types is fast going towards creating a database utilized by people, the computer systems must be able to model hu- man fundamental reasoning paradigms, which are naturally based on similarity. The ability to perceive similarities is crucial for recog- nition, classification, and learning, and it plays an important role in scientific discovery and creativity. Recently, the mathematical notion of metric space has become a useful abstraction of similarity and many similarity search indexes have been developed. In this thesis, we accept the metric space similarity paradigm and concentrate on the scalability issues. By exploiting computer networks and applying the Peer-to-Peer communication paradigms, we build a structured network of computers able to process similar- ity queries in parallel. Since no centralized entities are used, such architectures are fully scalable. Specifically, we propose a Peer- to-Peer system for similarity search in metric spaces called Met- ric Content-Addressable Network (MCAN) which is an extension of the well known Content-Addressable Network (CAN) used for hash lookup.
    [Show full text]
  • Hartkad: a Hard Real-Time Kademlia Approach
    HaRTKad: A Hard Real-Time Kademlia Approach Jan Skodzik, Peter Danielis, Vlado Altmann, Dirk Timmermann University of Rostock Institute of Applied Microelectronics and Computer Engineering 18051 Rostock, Germany, Tel./Fax: +49 381 498-7284 / -1187251 Email: [email protected] Abstract—The Internet of Things is becoming more and time behavior. Additionally, many solutions leak flexibility and more relevant in industrial environments. As the industry itself need a dedicated instance for administrative tasks. These issues has different requirements like (hard) real-time behavior for will becomes more relevant in the future. As mentioned in many scenarios, different solutions are needed to fulfill future challenges. Common Industrial Ethernet solution often leak [3], the future for the industry will be more intelligent devices, scalability, flexibility, and robustness. Most realizations also which can act more dynamically. Facilities as one main area require special hardware to guarantee a hard real-time behavior. of application will consist of more devices still requiring real- Therefore, an approach is presented to realize a hard real- time or even hard real-time behavior. So we think, the existing time network for automation scenarios using Peer-to-Peer (P2P) solutions will not fulfill the future challenges in terms of technology. Kad as implementation variant of the structured decentralized P2P protocol Kademlia has been chosen as base scalability, flexibility, and robustness. for the realization. As Kad is not suitable for hard real-time Peer-to-Peer (P2P) networks instead offer an innovative applications per se, changes of the protocol are necessary. Thus, alternative to the typical Client-Server or Master-Slave concepts Kad is extended by a TDMA-based mechanism.
    [Show full text]
  • Tutorial on P2P Systems
    P2P Systems Keith W. Ross Dan Rubenstein Polytechnic University Columbia University http://cis.poly.edu/~ross http://www.cs.columbia.edu/~danr [email protected] [email protected] Thanks to: B. Bhattacharjee, A. Rowston, Don Towsley © Keith Ross and Dan Rubenstein 1 Defintion of P2P 1) Significant autonomy from central servers 2) Exploits resources at the edges of the Internet P storage and content P CPU cycles P human presence 3) Resources at edge have intermittent connectivity, being added & removed 2 It’s a broad definition: U P2P file sharing U DHTs & their apps P Napster, Gnutella, P Chord, CAN, Pastry, KaZaA, eDonkey, etc Tapestry U P2P communication U P Instant messaging P2P apps built over P Voice-over-IP: Skype emerging overlays P PlanetLab U P2P computation P seti@home Wireless ad-hoc networking not covered here 3 Tutorial Outline (1) U 1. Overview: overlay networks, P2P applications, copyright issues, worldwide computer vision U 2. Unstructured P2P file sharing: Napster, Gnutella, KaZaA, search theory, flashfloods U 3. Structured DHT systems: Chord, CAN, Pastry, Tapestry, etc. 4 Tutorial Outline (cont.) U 4. Applications of DHTs: persistent file storage, mobility management, etc. U 5. Security issues: vulnerabilities, solutions, anonymity U 6. Graphical structure: random graphs, fault tolerance U 7. Experimental observations: measurement studies U 8. Wrap up 5 1. Overview of P2P U overlay networks U P2P applications U worldwide computer vision 6 Overlay networks overlay edge 7 Overlay graph Virtual edge U TCP connection U or
    [Show full text]
  • Research Article a P2P Framework for Developing Bioinformatics Applications in Dynamic Cloud Environments
    Hindawi Publishing Corporation International Journal of Genomics Volume 2013, Article ID 361327, 9 pages http://dx.doi.org/10.1155/2013/361327 Research Article A P2P Framework for Developing Bioinformatics Applications in Dynamic Cloud Environments Chun-Hung Richard Lin,1 Chun-Hao Wen,1,2 Ying-Chih Lin,3 Kuang-Yuan Tung,1 Rung-Wei Lin,1 and Chun-Yuan Lin4 1 Department of Computer Science and Engineering, National Sun Yat-sen University, No. 70 Lien-hai Road, Kaohsiung City 80424, Taiwan 2 Department of Information Technology, Meiho University, No. 23 Pingkuang Road, Neipu, Pingtung 91202, Taiwan 3 Department of Applied Mathematics, Feng Chia University, No. 100 Wenhwa Road, Seatwen, Taichung City 40724, Taiwan 4 Department of Computer Science and Information Engineering, Chang Gung University, No. 259 Sanmin Road, Guishan Township, Taoyuan 33302, Taiwan Correspondence should be addressed to Ying-Chih Lin; [email protected] Received 22 January 2013; Revised 5 April 2013; Accepted 20 April 2013 Academic Editor: Che-Lun Hung Copyright © 2013 Chun-Hung Richard Lin et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Bioinformatics is advanced from in-house computing infrastructure to cloud computing for tackling the vast quantity of biological data. This advance enables large number of collaborative researches to share their works around the world. In view of that, retrieving biological data over the internet becomes more and more difficult because of the explosive growth and frequent changes. Various efforts have been made to address the problems of data discovery and delivery in the cloud framework, but most of them suffer the hindrance by a MapReduce master server to track all available data.
    [Show full text]
  • I2P, the Invisible Internet Projekt
    I2P, The Invisible Internet Projekt jem September 20, 2016 at Chaostreff Bern Content 1 Introduction About Me About I2P Technical Overview I2P Terminology Tunnels NetDB Addressbook Encryption Garlic Routing Network Stack Using I2P Services Using I2P with any Application Tips and Tricks (and Links) Conclusion jem | I2P, The Invisible Internet Projekt | September 20, 2016 at Chaostreff Bern Introduction About Me 2 I Just finished BSc Informatik at BFH I Bachelor Thesis: "Analysis of the I2P Network" I Focused on information gathering inside and evaluation of possible attacks against I2P I Presumes basic knowledge about I2P I Contact: [email protected] (XMPP) or [email protected] (GPG 0x28562678) jem | I2P, The Invisible Internet Projekt | September 20, 2016 at Chaostreff Bern Introduction About I2P: I2P = TOR? 3 Similar to TOR... I Goal: provide anonymous communication over the Internet I Traffic routed across multiple peers I Layered Encryption I Provides Proxies and APIs ...but also different I Designed as overlay network (strictly separated network on top of the Internet) I No central authority I Every peer participates in routing traffic I Provides integrated services: Webserver, E-Mail, IRC, BitTorrent I Much smaller and less researched jem | I2P, The Invisible Internet Projekt | September 20, 2016 at Chaostreff Bern Introduction About I2P: Basic Facts 4 I I2P build in Java (C++ implementation I2Pd available) I Available for all major OS (Linux, Windows, MacOS, Android) I Small project –> slow progress, chaotic documentation,
    [Show full text]
  • Final Resourcediscoverysecuritydistrsystems Thesis Linelarsen
    Resource discovery and Security in Distributed systems Resource discovery and Security in Distributed systems by Line Larsen Thesis is partial fulfilment of the degree of Master in Technology in Information and Communication Technology Agder University College Faculty of Engineering and Science Grimstad Norway May 2007 May 2007 – Line Larsen 1 Resource discovery and Security in Distributed systems Abstract To be able to access our files at any time and any where, we need a system or service which is free, has enough storage space and is secure. A centralized system can handle these challenges today, but does not have transparency, openness and scalability like a peer to peer network has. A hybrid system with characteristics from both distributed and centralized topologies is the ideal choice. In this paper I have gone through the basic theory of network topology, protocols and security and explained “search engine”, “Middleware”, “Distributed Hash Table” and the JXTA protocol. I then have briefly examined three existing peer to peer architectures which are “Efficient and Secure Information Sharing in Distributed, collaborative Environments” based on Sandbox and transitive delegation from 1999, pStore: A Secure Peer–to-Peer backup System” based on versioning and file blocks from 2001 and iDIBS from 2006, which is an improved versions of the SourceForge project Distributed Internet Backup System (DIBS) using Luby Transform codes instead of Reed-Solomon codes for error correction when reconstructing data. I have also looked into the security aspects related to using distributed systems for resource discovery and I have suggested a design of a resource discovery architecture which will use JXTA for backup of personal data using Super-peer nodes in a peer to peer architecture.
    [Show full text]
  • Distributed Hash Tables CS6450: Distributed Systems Lecture 12
    Distributed Hash Tables CS6450: Distributed Systems Lecture 12 Ryan Stutsman Material taken/derived from Princeton COS-418 materials created by Michael Freedman and Kyle Jamieson at Princeton University. Licensed for use under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License. 1 Some material taken/derived from MIT 6.824 by Robert Morris, Franz Kaashoek, and Nickolai Zeldovich. Consistency models Linearizability Causal Eventual Sequential 2 Recall use of logical clocks • Lamport clocks: C(a) < C(z) Conclusion: None • Vector clocks: V(a) < V(z) Conclusion: a → … → z • Distributed bulletin board application • Each post gets sent to all other users • Consistency goal: No user to see reply before the corresponding original message post • Conclusion: Deliver message only after all messages that causally precede it have been delivered 3 Causal Consistency 1. Writes that are potentially P1 P2 P3 causally related must be seen by a all machines in same order. f b c 2. Concurrent writes may be seen d in a different order on different e machines. g • Concurrent: Ops not causally related Physical time ↓ Causal Consistency Operations Concurrent? P1 P2 P3 a, b N a f b, f Y b c c, f Y d e, f Y e e, g N g a, c Y a, e N Physical time ↓ Causal Consistency: Quiz Causal Consistency: Quiz • Valid under causal consistency • Why? W(x)b and W(x)c are concurrent • So all processes don’t (need to) see them in same order • P3 and P4 read the values ‘a’ and ‘b’ in order as potentially causally related.
    [Show full text]
  • Unveiling the I2P Web Structure: a Connectivity Analysis
    Unveiling the I2P web structure: a connectivity analysis Roberto Magan-Carri´ on,´ Alberto Abellan-Galera,´ Gabriel Macia-Fern´ andez´ and Pedro Garc´ıa-Teodoro Network Engineering & Security Group Dpt. of Signal Theory, Telematics and Communications - CITIC University of Granada - Spain Email: [email protected], [email protected], [email protected], [email protected] Abstract—Web is a primary and essential service to share the literature have analyzed the content and services offered information among users and organizations at present all over through this kind of technologies [6], [7], [2], as well as the world. Despite the current significance of such a kind of other relevant aspects like site popularity [8], topology and traffic on the Internet, the so-called Surface Web traffic has been estimated in just about 5% of the total. The rest of the dimensions [9], or classifying network traffic and darknet volume of this type of traffic corresponds to the portion of applications [10], [11], [12], [13], [14]. Web known as Deep Web. These contents are not accessible Two of the most popular darknets at present are The Onion by search engines because they are authentication protected Router (TOR; https://www.torproject.org/) and The Invisible contents or pages that are only reachable through the well Internet Project (I2P;https://geti2p.net/en/). This paper is fo- known as darknets. To browse through darknets websites special authorization or specific software and configurations are needed. cused on exploring and investigating the contents and structure Despite TOR is the most used darknet nowadays, there are of the websites in I2P, the so-called eepsites.
    [Show full text]
  • Content Addressable Network (CAN) Is an Example of a DHT Based on a D-Dimensional Torus
    Overlay and P2P Networks Structured Networks and DHTs Prof. Sasu Tarkoma 6.2.2014 Contents • Today • Semantic free indexing • Consistent Hashing • Distributed Hash Tables (DHTs) • Thursday (Dr. Samu Varjonen) • DHTs continued • Discussion on geometries Rings Rings are a popular geometry for DHTs due to their simplicity. In a ring geometry, nodes are placed on a one- dimensional cyclic identifier space. The distance from an identifier A to B is defined as the clockwise numeric distance from A to B on the circle Rings are related with tori and hypercubes, and the 1- dimensional torus is a ring. Moreover, a k-ary 1-cube is a k-node ring The Chord DHT is a classic example of an overlay based on this geometry. Each node has a predecessor and a successor on the ring, and an additional routing table for pointers to increasingly far away nodes on the ring Chord • Chord is an overlay algorithm from MIT – Stoica et. al., SIGCOMM 2001 • Chord is a lookup structure (a directory) – Resembles binary search • Uses consistent hashing to map keys to nodes – Keys are hashed to m-bit identifiers – Nodes have m-bit identifiers • IP-address is hashed – SHA-1 is used as the baseline algorithm • Support for rapid joins and leaves – Churn – Maintains routing tables Chord routing I Identifiers are ordered on an identifier circle modulo 2m The Chord ring with m-bit identifiers A node has a well determined place within the ring A node has a predecessor and a successor A node stores the keys between its predecessor and itself The (key, value) is stored on the successor
    [Show full text]
  • Overlook: Scalable Name Service on an Overlay Network
    Overlook: Scalable Name Service on an Overlay Network Marvin Theimer and Michael B. Jones Microsoft Research, Microsoft Corporation One Microsoft Way Redmond, WA 98052, USA {theimer, mbj}@microsoft.com http://research.microsoft.com/~theimer/, http://research.microsoft.com/~mbj/ Keywords: Name Service, Scalability, Overlay Network, Adaptive System, Peer-to-Peer, Wide-Area Distributed System Abstract 1.1 Use of Overlay Networks This paper indicates that a scalable fault-tolerant name ser- Existing scalable name services, such as DNS, tend to rely vice can be provided utilizing an overlay network and that on fairly static sets of data replicas to handle queries that can- such a name service can scale along a number of dimensions: not be serviced out of caches on or near a client. Unfortu- it can be sized to support a large number of clients, it can al- nately, static designs don’t handle flash crowd workloads very low large numbers of concurrent lookups on the same name or well. We want a design that enables dynamic addition and sets of names, and it can provide name lookup latencies meas- deletion of data replicas in response to changing request loads. ured in seconds. Furthermore, it can enable updates to be Furthermore, in order to scale, we need a means by which the made pervasively visible in times typically measured in sec- changing set of data replicas can be discovered without requir- onds for update rates of up to hundreds per second. We ex- ing global changes in system routing knowledge. plain how many of these scaling properties for the name ser- Peer-to-peer overlay routing networks such as Tapestry vice are obtained by reusing some of the same mechanisms [24], Chord [22], Pastry [18], and CAN [15] provide an inter- that allowed the underlying overlay network to scale.
    [Show full text]