A Common Protocol for Implementing Various DHT Algorithms
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Looking up Data in P2p Systems
LOOKING UP DATA IN P2P SYSTEMS Hari Balakrishnan, M. Frans Kaashoek, David Karger, Robert Morris, Ion Stoica∗ MIT Laboratory for Computer Science 1. Introduction a good example of how the challenges of designing P2P systems The recent success of some widely deployed peer-to-peer (P2P) can be addressed. file sharing applications has sparked new research in this area. We The recent algorithms developed by several research groups for are interested in the P2P systems that have no centralized control the lookup problem present a simple and general interface, a dis- or hierarchical organization, where the software running at each tributed hash table (DHT). Data items are inserted in a DHT and node is equivalent in functionality. Because these completely de- found by specifying a unique key for that data. To implement a centralized systems have the potential to significantly change the DHT, the underlying algorithm must be able to determine which way large-scale distributed systems are built in the future, it seems node is responsible for storing the data associated with any given timely to review some of this recent research. key. To solve this problem, each node maintains information (e.g., The main challenge in P2P computing is to design and imple- the IP address) of a small number of other nodes (“neighbors”) in ment a robust distributed system composed of inexpensive com- the system, forming an overlay network and routing messages in puters in unrelated administrative domains. The participants in a the overlay to store and retrieve keys. typical P2P system might be home computers with cable modem One might believe from recent news items that P2P systems are or DSL links to the Internet, as well as computers in enterprises. -
A Content-Addressable Network for Similarity Search in Metric Spaces Fabrizio Falchi
University of Pisa Masaryk University Faculty of Informatics Department of Department of Information Engineering Information Technologies Doctoral Course in Doctoral Course in Information Engineering Informatics Ph.D. Thesis A Content-Addressable Network for Similarity Search in Metric Spaces Fabrizio Falchi Supervisor Supervisor Supervisor Lanfranco Lopriore Fausto Rabitti Pavel Zezula 2007 Abstract Because of the ongoing digital data explosion, more advanced search paradigms than the traditional exact match are needed for content- based retrieval in huge and ever growing collections of data pro- duced in application areas such as multimedia, molecular biology, marketing, computer-aided design and purchasing assistance. As the variety of data types is fast going towards creating a database utilized by people, the computer systems must be able to model hu- man fundamental reasoning paradigms, which are naturally based on similarity. The ability to perceive similarities is crucial for recog- nition, classification, and learning, and it plays an important role in scientific discovery and creativity. Recently, the mathematical notion of metric space has become a useful abstraction of similarity and many similarity search indexes have been developed. In this thesis, we accept the metric space similarity paradigm and concentrate on the scalability issues. By exploiting computer networks and applying the Peer-to-Peer communication paradigms, we build a structured network of computers able to process similar- ity queries in parallel. Since no centralized entities are used, such architectures are fully scalable. Specifically, we propose a Peer- to-Peer system for similarity search in metric spaces called Met- ric Content-Addressable Network (MCAN) which is an extension of the well known Content-Addressable Network (CAN) used for hash lookup. -
Tutorial on P2P Systems
P2P Systems Keith W. Ross Dan Rubenstein Polytechnic University Columbia University http://cis.poly.edu/~ross http://www.cs.columbia.edu/~danr [email protected] [email protected] Thanks to: B. Bhattacharjee, A. Rowston, Don Towsley © Keith Ross and Dan Rubenstein 1 Defintion of P2P 1) Significant autonomy from central servers 2) Exploits resources at the edges of the Internet P storage and content P CPU cycles P human presence 3) Resources at edge have intermittent connectivity, being added & removed 2 It’s a broad definition: U P2P file sharing U DHTs & their apps P Napster, Gnutella, P Chord, CAN, Pastry, KaZaA, eDonkey, etc Tapestry U P2P communication U P Instant messaging P2P apps built over P Voice-over-IP: Skype emerging overlays P PlanetLab U P2P computation P seti@home Wireless ad-hoc networking not covered here 3 Tutorial Outline (1) U 1. Overview: overlay networks, P2P applications, copyright issues, worldwide computer vision U 2. Unstructured P2P file sharing: Napster, Gnutella, KaZaA, search theory, flashfloods U 3. Structured DHT systems: Chord, CAN, Pastry, Tapestry, etc. 4 Tutorial Outline (cont.) U 4. Applications of DHTs: persistent file storage, mobility management, etc. U 5. Security issues: vulnerabilities, solutions, anonymity U 6. Graphical structure: random graphs, fault tolerance U 7. Experimental observations: measurement studies U 8. Wrap up 5 1. Overview of P2P U overlay networks U P2P applications U worldwide computer vision 6 Overlay networks overlay edge 7 Overlay graph Virtual edge U TCP connection U or -
Distributed Hash Tables CS6450: Distributed Systems Lecture 12
Distributed Hash Tables CS6450: Distributed Systems Lecture 12 Ryan Stutsman Material taken/derived from Princeton COS-418 materials created by Michael Freedman and Kyle Jamieson at Princeton University. Licensed for use under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License. 1 Some material taken/derived from MIT 6.824 by Robert Morris, Franz Kaashoek, and Nickolai Zeldovich. Consistency models Linearizability Causal Eventual Sequential 2 Recall use of logical clocks • Lamport clocks: C(a) < C(z) Conclusion: None • Vector clocks: V(a) < V(z) Conclusion: a → … → z • Distributed bulletin board application • Each post gets sent to all other users • Consistency goal: No user to see reply before the corresponding original message post • Conclusion: Deliver message only after all messages that causally precede it have been delivered 3 Causal Consistency 1. Writes that are potentially P1 P2 P3 causally related must be seen by a all machines in same order. f b c 2. Concurrent writes may be seen d in a different order on different e machines. g • Concurrent: Ops not causally related Physical time ↓ Causal Consistency Operations Concurrent? P1 P2 P3 a, b N a f b, f Y b c c, f Y d e, f Y e e, g N g a, c Y a, e N Physical time ↓ Causal Consistency: Quiz Causal Consistency: Quiz • Valid under causal consistency • Why? W(x)b and W(x)c are concurrent • So all processes don’t (need to) see them in same order • P3 and P4 read the values ‘a’ and ‘b’ in order as potentially causally related. -
Performance Evaluation of Dhts for Mobile Environment
MEE10:13 Performance Evaluation of DHTs for Mobile Environment Monirul Islam Bhuiya Rakib Mohammad Ahsan This thesis is presented as part of Degree of Master of Science in Electrical Engineering Blekinge Institute of Technology December 2009 Blekinge Institute of Technology School of Engineering Department of Telecommunication Supervisor: Alexandru popescu & Karel De Vogeleer Examiner: Professor Adrian Popescu -1- -2- ABSTRACT istributed Hash Table (DHT) systems are an important part of peer-to-peer routing D infrastructures. They enable scalable wide-area storage and retrieval of information, and will support the rapid development of a wide variety of Internet-scale applications ranging from naming systems and file systems to application-layer multicast. A lot of research about peer-to-peer systems, today, has been focusing on designing better structured peer-to-peer overlay networks or Distributed Hash Tables (DHTs). As far as we concern, not so many papers, however, have been published in an organized way to check the adaptability of the existing four DHTs namely Content Addressable Network (CAN), Chord, Pastry and Tapestry for mobile environments. This thesis presents an attempt to evaluate the performances of these DHTs including adaptability with mobile environments. For this we survey these DHTs including existing solutions and based on that we decide our own conclusion. -3- -4- ACKNOWLEDGEMENT Fast of all we are grateful to our God, the most gracious and merciful, who give us energy and such kind of knowledge which is really helped us to perform well in difficult situation within thesis work. Also we would like to express our gratitude from heart for our parents, who supported us during this total time period. -
Content-Based Multicast: Comparison of Implementation Options
Content-Based Multicast: Comparison of Implementation Options Ryan Huebsch Report No. UCB/CSD-03-1229 February 2003 Computer Science Division (EECS) University of California Berkeley, California 94720 Content-Based Multicast: Comparison of Implementation Options½ Ryan Huebsch UC Berkeley [email protected] flooding of the content space, an efficient multicast func- Abstract tion should be provided. A DHT is composed of multiple components which This paper is an attempt to quantify the perfor- are connected through standard interfaces. It is important mance differences for content-based multicast im- to consider where specific functionality should be imple- plemented inside the overlay routing algorithm or mented in a layered system. Adding functionality to core built on top of the simple API provided by the rout- components often makes them more complicated, prone to ing layer. We focus on overlay networks designed bugs, and may decrease efficiency for more common tasks. for peer-to-peer distributed hash table (DHT) ap- In the network community, the end-to-end argument [13] is plications where content-based multicast is most often used as design principle that argues for pushing most applicable. In particular we study the Content functionality higher in the protocol stack unless it can be Addressable Networks (CAN) and Chord routing correctly (and significantly more efficiently) implemented algorithms. It is our conjecture that similar re- at the lower level. sults would be obtained through other protocols This paper is an attempt to quantify the performance dif- such as Pastry and Tapestry. ferences for content-based multicast implemented in two We show that it is feasible and in some ways more different layers of the DHT. -
Content Addressable Network (CAN) Is an Example of a DHT Based on a D-Dimensional Torus
Overlay and P2P Networks Structured Networks and DHTs Prof. Sasu Tarkoma 6.2.2014 Contents • Today • Semantic free indexing • Consistent Hashing • Distributed Hash Tables (DHTs) • Thursday (Dr. Samu Varjonen) • DHTs continued • Discussion on geometries Rings Rings are a popular geometry for DHTs due to their simplicity. In a ring geometry, nodes are placed on a one- dimensional cyclic identifier space. The distance from an identifier A to B is defined as the clockwise numeric distance from A to B on the circle Rings are related with tori and hypercubes, and the 1- dimensional torus is a ring. Moreover, a k-ary 1-cube is a k-node ring The Chord DHT is a classic example of an overlay based on this geometry. Each node has a predecessor and a successor on the ring, and an additional routing table for pointers to increasingly far away nodes on the ring Chord • Chord is an overlay algorithm from MIT – Stoica et. al., SIGCOMM 2001 • Chord is a lookup structure (a directory) – Resembles binary search • Uses consistent hashing to map keys to nodes – Keys are hashed to m-bit identifiers – Nodes have m-bit identifiers • IP-address is hashed – SHA-1 is used as the baseline algorithm • Support for rapid joins and leaves – Churn – Maintains routing tables Chord routing I Identifiers are ordered on an identifier circle modulo 2m The Chord ring with m-bit identifiers A node has a well determined place within the ring A node has a predecessor and a successor A node stores the keys between its predecessor and itself The (key, value) is stored on the successor -
Overlook: Scalable Name Service on an Overlay Network
Overlook: Scalable Name Service on an Overlay Network Marvin Theimer and Michael B. Jones Microsoft Research, Microsoft Corporation One Microsoft Way Redmond, WA 98052, USA {theimer, mbj}@microsoft.com http://research.microsoft.com/~theimer/, http://research.microsoft.com/~mbj/ Keywords: Name Service, Scalability, Overlay Network, Adaptive System, Peer-to-Peer, Wide-Area Distributed System Abstract 1.1 Use of Overlay Networks This paper indicates that a scalable fault-tolerant name ser- Existing scalable name services, such as DNS, tend to rely vice can be provided utilizing an overlay network and that on fairly static sets of data replicas to handle queries that can- such a name service can scale along a number of dimensions: not be serviced out of caches on or near a client. Unfortu- it can be sized to support a large number of clients, it can al- nately, static designs don’t handle flash crowd workloads very low large numbers of concurrent lookups on the same name or well. We want a design that enables dynamic addition and sets of names, and it can provide name lookup latencies meas- deletion of data replicas in response to changing request loads. ured in seconds. Furthermore, it can enable updates to be Furthermore, in order to scale, we need a means by which the made pervasively visible in times typically measured in sec- changing set of data replicas can be discovered without requir- onds for update rates of up to hundreds per second. We ex- ing global changes in system routing knowledge. plain how many of these scaling properties for the name ser- Peer-to-peer overlay routing networks such as Tapestry vice are obtained by reusing some of the same mechanisms [24], Chord [22], Pastry [18], and CAN [15] provide an inter- that allowed the underlying overlay network to scale. -
D3.5 Design and Implementation of P2P Infrastructure
INNOVATION ACTION H2020 GRANT AGREEMENT NUMBER: 825171 WP3 – Vision Materialisation and Technical Infrastructure D3.5 – Design and implementation of P2P infrastructure Document Info Contractual Delivery Date: 31/12/2019 Actual Delivery Date: 31/12/2019 Responsible Beneficiary: INOV Contributing Beneficiaries: UoG Dissemination Level: Public Version: 1.0 Type: Final This project has received funding from the European Union’s H2020 research and innovation programme under the grant agreement No 825171 ` DOCUMENT INFORMATION Document ID: D3.5: Design and implementation of P2P infrastructure Version Date: 31/12/2019 Total Number of Pages: 36 Abstract: This deliverable describes the EUNOMIA P2P infrastructure, its design, the EUNOMIA P2P APIs and a first implementation of the P2P infrastructure and the corresponding APIs made available for the remaining EUNOMIA modules to start integration (for the 1st phase). Keywords: P2P design, P2P infrastructure, IPFS AUTHORS Full Name Beneficiary / Organisation Role INOV INESC INOVAÇÃO INOV Overall Editor University of Greenwich UoG Contributor REVIEWERS Full Name Beneficiary / Organisation Date University of Nicosia UNIC 23/12/2019 VERSION HISTORY Version Date Comments 0.1 13/12/2019 First internal draft 0.6 22/12/2019 Complete draft for review 0.8 29/12/2019 Final draft following review 1.0 31/12/2019 Final version to be released to the EC Type of deliverable PUBLIC Page | ii H2020 Grant Agreement Number: 825171 Document ID: WP3 / D3.5 EXECUTIVE SUMMARY This deliverable describes the P2P infrastructure that has been implemented during the first phase of the EUNOMIA to provide decentralized support for storage, communication and security functions. It starts with a review of the main existing P2P technologies, where each one is analysed and a selection of candidates are selected to be used in the project. -
Effective Manycast Messaging for Kademlia Network
Effective Manycast Messaging for Kademlia Network Lubos Matl Tomas Cerny Michael J. Donahoo Czech Technical University Czech Technical University Baylor University Charles square 13, 121 35 Charles square 13, 121 35 Waco, TX, US Prague 2, CZ Prague 2, CZ [email protected] [email protected] [email protected] ABSTRACT General Terms Peer-to-peer (P2P) communication plays an ever-expanding Measurement, Reliability, Performance, Experimentation role in critical applications with rapidly growing user bases. In addition to well-known P2P systems for data sharing Keywords (e.g., BitTorrent), P2P provides the core mechanisms in VoIP (e.g., Skype), distributed currency (e.g., BitCoin), etc. Distributed systems, performance, communication, P2P, There are many communication commonalities in P2P ap- DHT, key-based search plications; consequently, we can factor these communication primitives into overlay services. Such services greatly sim- 1. INTRODUCTION plify P2P application development and even allow P2P in- P2P networking is a widely-known architectural pattern frastructures to host multiple applications, instead of each for distributed systems. P2P has been applied to a variety of having its own network. Note well that such services must be applications, including file sharing (BitTorrent [10]), video both self-scaling and robust to meet the needs of large, ad- or audio streaming (SplitStream [5], Coolstream [22]), par- hoc user networks. One such service is Distributed Hash Ta- allel computation, online payment systems (Bitcoin [16]), ble (DHT) providing a dictionary-like location service, useful or voice-over-IP services (Skype). All these applications in many types of P2P applications. Building on the DHT demonstrate that the architecture is important and useful primitives for search and store, we can add even more power- for multiple domains. -
Truncating Blockchains with Tangly Statistics
Truncating Blockchains with Tangly Statistics Nathan Stouffer advised by Dr. Mike Wittie Abstract Blockchains have applications in cryptocurrencies, supply chain tracking, and providing data integrity. A blockchain provides an immutable, decentralized record. To provide immutability, some blockchains use Proof of Work to construct an ever growing chain of blocks in a peer-to- peer network. Participants, known as miners, expend computational resources to create new blocks. By design, users trust the longest blockchain and blocks are intentionally difficult to create. As a result, data far back in the blockchain is immutable. By design, miners have collective, but not individual, control over a blockchain. When miners are sufficiently decentralized, it is infeasible for a coalition of miners to gain explicit control of a blockchain. Explicit control would allow a coalition to decide which blocks are added to the chain, reducing the integrity of the system. Thus a blockchain performs better when control of the chain is decentralized. Decentralization can be difficult to achieve when blockchains grow to an unmanageable length. To begin mining, one must download and process the entire chain (a process called bootstrapping). This is not a problem when the chain is relatively small, but a larger chain (such as Bitcoin) can take days to download and process. Such a long chain prevents lightweight devices from becoming miners and incredibly long bootstrapping time deters participation from many users who do have sufficient space. Together, these issues make joining a long blockchain more difficult. Over the summer of 2020, I worked with collaborators to devise a faster bootstrapping method. At a high level, our solution has miners of recent blocks vote for a blockchain summary. -
Donut: a Robust Distributed Hash Table Based on Chord
Donut: A Robust Distributed Hash Table based on Chord Amit Levy, Jeff Prouty, Rylan Hawkins CSE 490H Scalable Systems: Design, Implementation and Use of Large Scale Clusters University of Washington Seattle, WA Donut is an implementation of an available, replicated distributed hash table built on top of Chord. The design was built to support storage of common file sizes in a closed cluster of systems. This paper discusses the two layers of the implementation and a discussion of future development. First, we discuss the basics of Chord as an overlay network and our implementation details, second, Donut as a hash table providing availability and replication and thirdly how further development towards a distributed file system may proceed. Chord Introduction Chord is an overlay network that maps logical addresses to physical nodes in a Peer- to-Peer system. Chord associates a ranges of keys with a particular node using a variant of consistent hashing. While traditional consistent hashing schemes require nodes to maintain state of the the system as a whole, Chord only requires that nodes know of a fixed number of “finger” nodes, allowing both massive scalability while maintaining efficient lookup time (O(log n)). Terms Chord Chord works by arranging keys in a ring, such that when we reach the highest value in the key-space, the following key is zero. Nodes take on a particular key in this ring. Successor Node n is called the successor of a key k if and only if n is the closest node after or equal to k on the ring. Predecessor Node n is called the predecessor of a key k if and only n is the closest node before k in the ring.