MEE10:13

Performance Evaluation of DHTs for Mobile Environment

Monirul Islam Bhuiya
Rakib Mohammad Ahsan

This thesis is presented as part of the Degree of Master of Science in Electrical Engineering.

Blekinge Institute of Technology December 2009

Blekinge Institute of Technology
School of Engineering
Department of Telecommunication
Supervisors: Alexandru Popescu & Karel De Vogeleer
Examiner: Professor Adrian Popescu


ABSTRACT

Distributed Hash Table (DHT) systems are an important part of peer-to-peer routing infrastructures. They enable scalable wide-area storage and retrieval of information, and support the rapid development of a wide variety of Internet-scale applications ranging from naming systems and file systems to application-layer multicast.

Much of today's research on peer-to-peer systems focuses on designing better structured peer-to-peer overlay networks, or Distributed Hash Tables (DHTs). To our knowledge, however, few papers have examined in an organized way the suitability of four existing DHTs, namely Content Addressable Network (CAN), Chord, Pastry and Tapestry, for mobile environments. This thesis presents an attempt to evaluate the performance of these DHTs, including their adaptability to mobile environments. To this end we survey the four DHTs and existing solutions, and draw our conclusions based on that survey.


ACKNOWLEDGEMENT

First of all, we are grateful to God, the most gracious and merciful, who gave us the energy and knowledge that helped us perform well in the difficult moments of this thesis work. We would also like to express our heartfelt gratitude to our parents, who supported us throughout this period.

We would like to thank our thesis supervisors, Alexandru Popescu and Karel De Vogeleer of the School of Engineering, Blekinge Institute of Technology (BTH), Sweden, for their help and guidance, their many inspiring ideas, and their steady encouragement and collaboration throughout our thesis work. We also thank BTH and its staff, who helped us and pointed us in the right direction in many difficult situations.

Finally, we are indebted to many of our friends for their tremendous help, continuous encouragement and trust in us. We dedicate this thesis to all of our friends and to our dear parents.


TABLE OF CONTENTS

ABSTRACT
ACKNOWLEDGEMENT
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES

CHAPTER ONE: INTRODUCTION
  1.1 Introduction
  1.2 Objective
  1.3 Motivation
  1.4 Contribution
  1.5 Outline of the Thesis

CHAPTER TWO: BACKGROUND
  2.1 Peer-to-Peer Network (Introduction, P2P Architecture)
  2.2 Overlay Network (Introduction, Working Model, Costs, Benefits, Problems)
  2.3 Distributed Hash Table (Introduction, Architecture of DHT, Structure of DHT)

CHAPTER THREE: OVERVIEW OF EXISTING DHTs
  3.1 Content Addressable Network (CAN): Introduction, Working Model, Routing Geometry, Load Balance
  3.2 Chord: Introduction, Working Model, Routing Geometry, Load Balance
  3.3 Pastry: Introduction, Working Model, Routing Table, Routing, Node Joining & Leaving, Locality
  3.4 Tapestry: Introduction, Working Model, Routing, Node Algorithms

CHAPTER FOUR: FUNCTIONALITY COMPARISON OF DHTs
  4.1 Introduction
  4.2 Comparison of Protocols

CHAPTER FIVE: EVALUATION OF COMPARED DHTs
  5.1 Introduction
  5.2 Routing Performance
  5.3 Static Resilience

CHAPTER SIX: ADAPTABILITY WITH MOBILE ENVIRONMENTS
  6.1 Issues of Concern
  6.2 Analysis of the DHTs under High Churn
  6.3 Summary of the Analysis

CHAPTER SEVEN: CONCLUSION
  7.1 Conclusion
  7.2 Future Work

REFERENCES
APPENDIX: Algorithm 1 for CAN Construction, Algorithm 2 for CAN Routing, Theorem, Pseudocode for the Node Join Operation


LIST OF FIGURES

Figure 2.1: Levels of P2P networks [1]
Figure 2.2: Overlay network architecture of P2P
Figure 2.3: Application interface for structured DHT-based P2P overlay systems [2]
Figure 2.4: Overlay network
Figure 2.5: Sample overlay network and query [3]
Figure 2.6: Architecture of DDS distributed hash table [4]
Figure 3.1: Two-dimensional space (x, y); a key is mapped to a point
Figure 3.2: Partition scheme for node joining
Figure 3.3: Example 2d space, before and after joining [5]
Figure 3.4: Distribution of zone volume with and without 1-hop volume check
Figure 3.5: Performance of 1-hop checking after increasing dimension
Figure 3.6: Basic lookup
Figure 3.7: Key-to-node map
Figure 3.8: Finger intervals for node 1 [6]
Figure 3.9: Finger tables and key locations for a net with nodes 0, 1, and 3, and keys 1, 2, and 6 [6]
Figure 3.10: (a) The number of keys stored per node in a node network [10]; (b) the probability density function (PDF) of the number of keys per node [6]
Figure 3.11: Example of a Pastry node with nodeId 3102 where b = 2, L = 4 with base 4
Figure 3.12: Routing table of nodeId 65a1x where b = 4; here base = 16 and x = arbitrary suffix [38]
Figure 3.13: The state of Pastry node 103220 in the case of a 12-bit identifier space and base 4 [42]
Figure 3.14: Example of a Tapestry node 0642 [48]
Figure 3.15: Tapestry routing mesh with the links that form the local routing table [38]
Figure 3.16: Message on Tapestry; path taken by a message initiated from node 5230 intended for node 42AD in a Tapestry mesh [37]
Figure 3.17: Tapestry pseudocode for NEXTHOP(.)
Figure 5.1: Performance analysis of various routing geometries [27]
Figure 6.1: Node join/leave with interval = 10 s (left N = 100, right N = 1000) [36]
Figure 6.2: Node join/leave with interval = 120 s (left N = 100, right N = 1000) [36]
Figure 6.3: Pastry under churn [29]
Figure 6.4: Cost versus performance under churn in Tapestry [30]


LIST OF TABLES

Table 3.1: Routing variables of CAN
Table 3.2: Definition of Chord variables [6]
Table 3.3: Routing variables of Chord
Table 4.1: Feature comparison among evaluated DHT algorithms



CHAPTER ONE INTRODUCTION

1.1. Introduction

Today there is a large variety of Peer-to-Peer (P2P) systems, although in general they are rarely 100% pure P2P, meaning that all peers act as equals. In a pure P2P system the roles of client and server are merged, removing the need for a central administration entity that manages the overlay. Pure P2P networks, however, raise practical problems, and hybrid P2P systems address them: they use a centralized service for bootstrapping or maintaining the network while relying on P2P mechanisms for data exchange.

Furthermore, there are two main classes of P2P networks with respect to the routing substrate or geometry: unstructured P2P systems and structured P2P systems. By far the most common type of structured P2P system is the Distributed Hash Table (DHT), which offers decentralization, scalability and fault tolerance, along with efficient content allocation and content redundancy. To achieve this, a structured overlay topology is imposed and maintained by a globally employed protocol, which ensures efficient content discovery through a specific routing protocol that searches the virtually imposed structure.

1.2. Objective

The aim of this thesis is to compare various existing DHTs in order to work out how DHTs will behave in mobile environments. To do this we use the defining articles and documentation of each DHT as the specification.


1.3. Motivation

Peer-to-peer file-sharing systems are now among the most popular Internet applications and have become a major source of Internet traffic; much of the recent work in this area concerns structured peer-to-peer overlay networks, or Distributed Hash Tables (DHTs). It is therefore extremely important that these systems can efficiently locate the node that stores the desired data, even in a large system. Nodes must be able to join and leave the system frequently without affecting its robustness or efficiency, and load must be balanced across the available nodes.

The scope of the thesis is to survey currently popular DHTs, at least the following: CAN, Chord, Pastry and Tapestry, in order to provide organized information for future work on DHTs. The DHTs are compared according to their functionality; their strengths and weaknesses are investigated and highlighted. Furthermore, the adaptability to mobile environments is evaluated on a per-DHT basis, exposing their suitability.

1.4. Contribution

The main contribution of this thesis is the identification of the most important parameters of DHTs and a comparison of these mechanisms under extreme conditions, with a view to better performance in mobile environments.

The information gathered in this survey will then be used for the proposal and implementation of a novel mechanism that can extend any DHT. This mechanism will be referred to as XDHT.


1.5. Outline of the Thesis

This thesis consists of seven chapters. The first chapter introduces P2P networks, along with some information about DHTs, and explains what motivated us to work on this topic. It also describes our contribution.

Chapter two gives an overview of P2P networks, overlay networks and DHTs. The complete working models of the existing DHTs under study are then discussed in the next chapter, i.e. chapter three.

Chapter four compares the functionality of the existing DHTs.

Chapter five continues with a summary performance evaluation of the compared DHTs.

Chapter six analyzes and summarizes the DHTs' adaptability to mobile environments.

Finally, chapter seven concludes the thesis and outlines future work related to DHTs.



CHAPTER TWO BACKGROUND

2.1. PEER TO PEER NETWORK

2.1.1 INTRODUCTION:

Peer-to-peer (P2P) is a network model in which participating nodes share information as equals, using appropriate communication mechanisms and without necessarily needing central coordination. It is the opposite of the client/server model.

Attractive features of P2P networks, such as improved scalability, decentralized coordination, lower cost of ownership and fault tolerance, have made them popular in the networking arena. Napster, for example, was one of the file-sharing systems that popularized peer-to-peer.

2.1.2 P2P ARCHITECTURE:

The P2P infrastructure consists of a three-level model [1], presented below:

Level 1: P2P infrastructures

Level 2: P2P applications

Level 3: P2P communities

Figure 2.1: Levels of P2P networks [1]


Level 1, P2P infrastructures: the foundation of all levels, performing communication, integration and translation between IT components. Finding nodes, sharing volumes and exchanging resources are mainly done at this level.

Level 2, P2P applications: use the services provided by level 1. In the absence of central control they enable communication and collaboration between entities.

Level 3, P2P communities: cooperative interaction between communities with similar interests, and the dynamics within them.

The P2P overlay network model can be described as a hierarchical framework distributed over several communication layers. At the top of the hierarchy is the Application layer, which deals with the tools, applications and services built on the underlying P2P overlay structure. The Service-Specific layer provides scheduling of parallel and computational tasks for the underlying network. Security management, fault tolerance and resource-sharing issues are the concern of the Feature Management layer. Routing and lookup are handled by the Overlay Node Management layer. Finally, the Network Communications layer captures the network characteristics of the nodes connected through the overlay network.


Figure 2.2: Overlay network architecture of P2P

P2P overlay networks come in two forms: structured and unstructured.

Structured P2P refers to networks in which content is placed at specific locations rather than at random peers. The network topology is tightly controlled so that queries become more efficient; examples are Distributed Hash Table (DHT) protocols such as CAN, Chord and Pastry. These structured P2P protocols provide a self-organizing substrate for large-scale peer-to-peer applications and a variety of decentralized services, including network storage, content distribution and application-level multicast. A system that uses structured P2P is scalable, fault tolerant and load balanced.

In a structured P2P overlay network, node IDs are assigned randomly to the peers from a large identifier space, and data objects are given unique identifiers from the same space, called keys. A peer retrieves a (key, value) pair by presenting the key associated with that pair; the pairs are stored on the overlay network, as illustrated in Figure 2.3.


Figure 2.3: Application Interface for Structured DHT based P2P Overlay Systems [2].

Each peer in the network keeps a routing table consisting of its neighboring peers' node IDs and IP addresses. Lookup queries are forwarded progressively to nodes whose NodeIDs are closer to the key in the identifier space. Different DHT protocols use different schemes for lookup queries, routing and load balancing.

Although structured P2P networks can efficiently find rare items, since key-based routing is used, they incur significantly higher overheads than unstructured P2P networks.

In contrast, an unstructured P2P system is a collection of peers that join the overlay network under loose rules and without any prior knowledge of the topology. A flooding mechanism is used to send queries across the overlay with limited scope. When a peer receives a query, it tries to match the content of the query and then sends a list of all matching content to the originating peer. Flooding is useful for finding highly replicated objects and is resilient to peers joining and leaving the system, but it is poorly suited to locating rare items. This approach is also not scalable, as the load on each peer grows linearly with the total number of queries and the system size.


2.2. OVERLAY NETWORK

2.2.1 INTRODUCTION:

An overlay network is a virtual network built from nodes connected by virtual links. It exists on top of one or more underlying networks, and each virtual link corresponds to a logical path through physical links of the underlying network. For example, a peer-to-peer network acts as an overlay network because the existing underlying network cannot by itself provide what various applications require.

Overlays were introduced to deal with the problem of node addresses not being known in advance: routing messages are sent to the logical address of a node. For example, a DHT sends routing messages to nodes by their logical addresses, since the IP addresses are not known in advance.

Figure 2.4 : Overlay Network


2.2.2 WORKING MODEL OF OVERLAY NETWORKS [3]:

• Guarantee of data retrieval

• Low lookup time, typically O(log N), where N is the number of nodes

• Automatic load balancing

• Self-organization

Since overlay networks identify neighbor nodes by the content they store, they can turn the standard graph-traversal problem into a localized iterative process. This iterative process decreases the overall network load and makes the query process deterministic, because each hop brings the query closer to its target within a bounded number of hops, calculated according to the overlay's mathematical function.

Specifically, an overlay network behaves like a distributed hash table, supporting insertion, deletion and querying of keys. A consistent hashing scheme based on a hash algorithm such as the Secure Hash Algorithm (SHA-1) is used.
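As an illustration, the following minimal Python sketch (the key string and node address are made-up examples, not from any particular system) shows how SHA-1 can map both keys and node addresses into one identifier space:

    import hashlib

    def sha1_id(value: str) -> int:
        # Map an arbitrary string to a 160-bit integer identifier using SHA-1.
        return int.from_bytes(hashlib.sha1(value.encode("utf-8")).digest(), "big")

    # Keys and node addresses are hashed into the same identifier space.
    key_id = sha1_id("example-file.txt")      # identifier of a data item
    node_id = sha1_id("192.0.2.17:4000")      # identifier of a node
    print(hex(key_id), hex(node_id))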

The node-joining scheme of overlay networks differs from TTL-based algorithms in that it is structured and characteristically symmetrical. The scheme is based on one or more algorithms that decide how the node will connect, and the lookup times depend on the network's architecture. Moreover, the lookup algorithm includes recovery functions for node failure, so that an appropriate network structure can be recreated or maintained.

To join the network, a node sends a request, such as a DNS query or a broadcast, to find another node that has already joined the network. Such a node is called a bootstrapping node. The bootstrapping node provides initial information such as neighbor node IDs or IP addresses of the network.

Figure 2.5 : Sample overlay network and query [3].


One important difference between such overlay networks and unstructured P2P networks is that these overlays do not support keyword-based searching; they look up data on the basis of identifiers derived from the content.

2.2.3 COSTS:

• Adds overhead and an extra layer to the networking stack.

• Extra packet headers and processing.

• Removes Ethernet addresses from the Ethernet header and assumes an IP header.

• Increases complexity through layering. Layering does not reduce complexity, it only manages it; more layers of functionality mean more opportunities for unintended interaction between layers.

2.2.4 SOME BENEFITS:

The demand for overlay networks is increasing day by day because they offer some attractive properties. Developing entirely new networking hardware and software is expensive.

With an overlay we do not need to set up new equipment, only to modify existing software and protocols. The new software is deployed on top of existing software; for example, deploying IP on top of Ethernet does not require modifying the Ethernet protocol or driver, just adding IP on top. Nor does the overlay need to be implemented at every node.

Another feature is bootstrapping, which provides the information a newly joined node needs.

Almost every network since the telephone network can be regarded as an overlay network.

Not every node needs the overlay network service all the time.


2.2.5 SOME PROBLEMS:

Sometimes a node carries too much load, for example in memory or bandwidth, and the overlay network may become inconsistent for such nodes.

An overlay network may have uncertain security properties; for example, it may be used for Denial of Service (DoS) attacks.

An overlay network may not scale well; sometimes it needs O(n²) state.

Group membership changes randomly as nodes join and leave dynamically.

The network environment changes dynamically, and so the topology changes unpredictably.

Due to network congestion and changes in routing, the delay between members may vary over time.

Information about network conditions is member-specific, since each member must determine the network conditions for itself.


2.3. DISTRIBUTED HASH TABLE

2.3.1 INTRODUCTION:

DHTs were first introduced to the research community in 2001. A Distributed Hash Table (DHT) basically provides the same functionality as a hash table. It is a decentralized indexing system that offers scalable, fault-tolerant data storage and lookup services, and it is used for peer-to-peer communication without the notion of a server and client model.

To understand how DHTs work, we first have to understand the lookup service.

- Allocate IDs to nodes
- Map hash values to the node with the closest ID
- The leaf set holds successors and predecessors; this is all that is required for correctness
- The routing table successively matches longer prefixes, which allows efficient lookups

Essentially, a DHT lets decentralized, scattered peers manage a mapping from keys to values without any fixed structure. Using a DHT we can store a (key, value) pair and later look up the value using the key; the DHT distributes the work of storing and looking up across several machines. Even under continuous changes of network membership, a DHT still gives acceptable performance.

The basic operations of a DHT are very simple: 1) send queries, i.e. pings, to update routing tables; 2) look up the nodes responsible for a key; 3) get values from the nodes found by the lookup; and 4) store values on those nodes.
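A minimal sketch of these four operations, viewed from a single node with an in-memory routing table and no real network layer (all names here are illustrative, not the API of any particular DHT):

    class DHTNode:
        def __init__(self, node_id):
            self.node_id = node_id
            self.routing_table = {}   # peer id -> address, refreshed by pings
            self.local_store = {}     # key -> value held by this node

        def ping(self, peer_id, address):
            # 1) Query a peer and refresh the routing table.
            self.routing_table[peer_id] = address

        def lookup(self, key):
            # 2) Return the known peers whose ids are closest to the key.
            return sorted(self.routing_table, key=lambda p: abs(p - key))[:3]

        def get(self, key):
            # 3) Fetch a value if this node holds the key.
            return self.local_store.get(key)

        def put(self, key, value):
            # 4) Store a value on this node.
            self.local_store[key] = value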


2.3.2 ARCHITECTURE OF DHT:

The architecture and implementation of a distributed hash table as a Distributed Data Structure (DDS) is described below [4]:

Figure 2.6: Architecture of DDS distributed hash table [4].

A. Client:

- A computer running specific software services that communicates over the Wide Area Network (WAN) with one of several service instances running on the cluster.
- How the client selects a service instance is outside the scope of this work.
- Selection is usually handled with round-robin DNS [123].


B. Service:

- A service is nothing but a set of software processes.
- A service communicates with clients over the wide area and executes application-level functions.
- Services may keep soft state, but they rely on the hash table to manage persistent state.

C. API:

- The boundary between a service and the DDS library; it provides put(), get(), remove(), create() and destroy() operations on hash tables.
- Each operation is atomic.
- All services see the same consistent image of all existing hash tables through this API.
- Hash table names are strings, hash table keys are 64-bit integers, and hash table values are opaque byte arrays.
- Operations affect hash table values in their entirety.

D. DDS Library:

- A Java class library that presents the hash table API to services.
- Accepts hash table operations and cooperates with the "bricks".
- Contains only soft state, including metadata.
- Acts as the two-phase commit coordinator for state-changing operations on the distributed hash tables.

E. Brick:

- Bricks are the system modules that manage durable data.
- Each brick manages a set of network-accessible single-node hash tables.
- A brick consists of a buffer cache, a lock manager, a persistent chained hash table implementation, and network stubs and skeletons for remote communication.
- In general, one brick runs per CPU in the cluster.
- Bricks can run on dedicated nodes, or the nodes can be shared with other components.


2.3.3 STRUCTURE OF DHT:

Several components make up the structure of a DHT; the main ones are:

1) Keyspace
2) Keyspace partitioning
3) Overlay network
4) Routing

Keyspace: the foundation of a DHT. In hashing, the keyspace is the set of all possible keys used in it.

Keyspace partitioning: ownership of the keyspace is distributed among the joined nodes; this is known as the keyspace partitioning scheme. For example, in a DHT every joining node stores some key-value pairs (k, v). If the keyspace is an interval [0, 2^m) and a key k belongs to that keyspace, then the pair will be stored at the node whose identifier is closest to k.
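A sketch of this partitioning rule, under the assumption that identifiers are plain integers and "near" means numerically closest:

    def responsible_node(key_id, node_ids):
        # The (k, v) pair is stored at the node whose identifier is closest to k.
        return min(node_ids, key=lambda n: abs(n - key_id))

    nodes = [4, 120, 700, 1500]               # example node identifiers
    print(responsible_node(610, nodes))        # -> 700, the closest identifier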

Overlay network: every node maintains a set of links to other nodes, and together these links form the overlay structure. The links a node maintains are links to its neighbors, i.e. to the nodes adjacent to it in the keyspace.

Routing: Key Based routing.


CHAPTER THREE OVERVIEW OF EXISTING DHTs

3.1 CONTENT ADDRESSABLE NETWORK (CAN)

3.1.1 INTRODUCTION:

Content Addressable Network (CAN) is one of the original DHTs introduced for peer-to-peer file-sharing systems. It provides an indexing system for P2P, aimed at large-scale storage applications on the Internet.

For example, CAN could be used in large-scale storage management systems such as OceanStore, FarSite and Publius; indeed, the OceanStore system has already adopted such a scheme in its core design [5]. What all these systems need is scalable management with efficient insertion and retrieval, which CAN provides. CAN could also offer a wide-area name resolution service similar to DNS [5].

Like other DHTs, CAN is a distributed system that stores (key, value) pairs and retrieves the value associated with a given key. Its design is based on a d-dimensional Cartesian coordinate space, which serves as a virtual logical address space for the independent, physically connected nodes. The virtual coordinate space stores pairs such as (key1, value1). The entire coordinate space is divided among all participating nodes such that every node owns its own distinct zone; this zone is sometimes called a "chunk".

The basic operations of CAN are insertion, lookup and deletion. Each node keeps information about its neighbor nodes in its routing table; in Figure 3.2, for example, the neighbors of node 4 are {3, 2, 1}. When a request arrives, the initial node forwards it towards the node that contains the key.


3.1.2 WORKING MODEL:

A. Node Joining:

Since the entire coordinate space is divided among all the nodes, when a new node joins, an existing zone is split into two portions and one of them is assigned to the newly joined node.

For example, in Figure 3.1 the two-dimensional coordinate space contains four nodes.

Figure 3. 1: Two dimensional space (x,y),a key is mapped to a point.

Figure 3. 2: Partitions scheme for node joining.

One important concept here is the Virtual Identifier (VID), a binary string that determines the path from the root of the partition tree to the leaf node corresponding to a zone [5]. The VID thus represents the position of a node.


To grow a CAN, each node must have a unique VID. This is achieved as follows:

- The new node has to find a bootstrapping node.

- The new node has to find its own zone.

- The new node has to find its neighbors' VIDs and IP addresses.

B. Bootstrapping Nodes:

A bootstrapping node is one that provides initial information, such as IP addresses, so that a newly joining node can successfully join the CAN. The new node may know the bootstrapping node in advance through an assigned static address, or the bootstrap node can be found via the Domain Name Service.

C. Finding own zone:

To find its own space, the new node first randomly selects a point and sends a join request towards that point. Each participating node routes the request onward until it reaches the point. When it arrives, the owner of that space splits its zone and shares it with the new node. The following points should be considered:

- The owner node does not simply split its own zone directly.

- Instead, the existing nodes compare the volumes of their zones with those of their immediate neighbors in the coordinate space.

- The zone that is split for the newly joined node is the one with the largest volume.


Figure 3. 3: example 2d space, before and after joining [5].

The joining of a new node thus affects only a small number of existing nodes, and only in a small region of the coordinate space. The number of neighbor nodes depends only on the dimensionality of the coordinate space and is independent of the total number of nodes.

Thus the joining of a node affects only O(d) existing nodes, where d is the number of dimensions.

D. Neighbors’ Node:

Two nodes are said to be neighbors if their coordinate spans overlap along d−1 dimensions and abut along one dimension [5]. For example, in Figure 3.3 node 2 is a neighbor of node 6 because their coordinate spans overlap in this way.

E. Node Leaving:

It is essential to ensure that when a node leaves the CAN, its zone is taken over by the remaining nodes. To do this:

- If the zone of one of the neighbors can be merged with the departing node's zone to form a single valid zone, that merge is performed.


- If not, the neighbor whose zone is smallest takes over the departing node's zone and handles both zones temporarily.

F. Node Failure:

Another important issue is node failure. When one or more nodes become unreachable, an immediate takeover algorithm is run to ensure that the failed node's zone is taken over by one of its neighbors. Normally all nodes send periodic update messages carrying their zone coordinates and neighbor lists; when updates from a node stop arriving, it is assumed to be dead.

Once a node is confirmed dead, the immediate takeover algorithm is run. Each neighbor of the dead node starts a timer proportional to the volume of its own zone. When its timer expires, a neighbor sends a takeover message, containing its own zone volume, to the other neighbors of the dead node.

On receiving a takeover message, a node cancels its own timer if the volume of its zone is larger than the advertised zone; otherwise it replies with its own takeover message. In this way the neighbors are assured about which nodes are alive and about the volumes of the zones.

In this procedure it is possible that a node detects a failure while less than half of the failed node's neighbors are still reachable, which could make the CAN state inconsistent. In this situation it is better to perform an expanding ring search, rather than the normal repair mechanism, because this rebuilds sufficient neighbor state to initiate the takeover safely.

Another issue to consider is that a node may end up holding more than one zone after the normal leaving and takeover procedures have run. A background zone-reassignment algorithm [see Appendix] is therefore used to avoid repeated fragmentation and to make sure that the CAN tends back towards one zone per node [1].

G. Routing:

In CAN, routing from source to destination follows the straight-line path through the Cartesian space. A source node routes packets using its neighbor coordinate set by simple greedy forwarding [5].

In a d-dimensional coordinate space each node has 2d neighbors and the average routing path length is (d/4)·n^(1/d) hops. We can therefore increase the number of nodes and zones without increasing the per-node state, while the path length grows only as O(n^(1/d)) [5].


If CAN sets d = (log2 n)/2, it is possible to get O(log n) hops. In a usual CAN configuration, however, this path-length versus node-degree tradeoff is inconvenient, since the number of nodes is not known in advance; enhancements of the basic CAN design would be beneficial [6].
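A quick numeric check of the analytical estimate (d/4)·n^(1/d) from [5], including the case d = (log2 n)/2; the network size used here is a made-up example:

    import math

    def can_avg_path(n, d):
        # Average CAN routing path length, (d/4) * n**(1/d) hops.
        return (d / 4.0) * n ** (1.0 / d)

    n = 2 ** 20                                    # about one million nodes
    for d in (2, 4, 8, int(math.log2(n) / 2)):
        print("d=%2d: %.1f hops" % (d, can_avg_path(n, d)))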

When a new node is placed in its zone, it records information such as the IP addresses of its neighbors. Similarly, the previous occupant nodes update their routing tables when the new node joins. Each node in the space sends an update message at regular intervals, and these updates keep all nodes informed about their neighbors' status.

H. Routing Geometry:

Table 3.1: Routing variables of CAN

Lookup: O(d·N^(1/d))
Neighbors: 2d
Routing state: O(d)
Optimal path: O(log n)
Neighbor selection: 1
Average latency: High
Node congestion: O(d·n^(1/d−1))
Network diameter: O(d·n^(1/d))

3.1.3 LOAD BALANCE:

For proper load balance the partitioning does not have to be perfect; what matters is avoiding highly non-uniform partitioning [1]. When a node joins, it picks a random point in the coordinate space and finds its current occupant. The occupant performs a 1-hop volume check: it selects the larger of its own zone and its immediate neighbors' zones, and that zone is the one split with the new node.

Following the load-balance example in [1] (Sylvia Ratnasamy, section 2.2.3), let the total volume of the coordinate space be V_t and the total number of nodes be n, and define V = V_t / n.


From Figure 3.4 below we can see that without the 1-hop volume check only about 40% of the nodes are assigned zones of volume V, whereas with the 1-hop volume check about 82% of the nodes are assigned zones of volume V.

Figure 3. 4: Distribution of zone volume with and without 1-hop volume check.

When the dimension is increased from 2 to 3 and higher, all the zone volumes lie between V/2 and 2V, as shown in Figure 3.5.

Figure 3. 5: Performance of 1-hop checking after increasing dimension.

We therefore agree that 1-hop checking is very helpful for achieving almost perfect partitioning.


3.2 CHORD

3.2.1 INTRODUCTION:

Chord is another peer-to-peer lookup algorithm for DHTs. In peer-to-peer systems it is a fundamental problem to efficiently locate the node that stores a specific data item, and Chord was first introduced in 2001 to address it. Chord adapts efficiently as nodes join and leave the system, and can answer queries even while the system is continuously changing.

Chord differs from many other peer-to-peer lookup protocols in its simplicity, provable correctness and provable performance [6]. In Chord, a key is routed through a sequence of O(log N) other nodes from source to destination, and a node needs information about O(log N) other nodes for efficient routing. Performance degrades gracefully when this information is out of date, which matters because nodes join and leave at will and consistency of even O(log N) state is hard to maintain. Chord has a simple algorithm for maintaining this information in such a dynamic environment.

Existing name and location services, such as DNS, provide a direct mapping between keys and values. Chord, in contrast, maps keys onto nodes by storing each key or value at the node to which the key maps.

DNS resolves hostnames into IP addresses [4]. Chord can also provide hostname-to-IP-address mapping, but unlike DNS it does not depend on a set of root servers for queries. DNS requires structured names, while Chord imposes no naming structure. DNS is used to find named hosts or services, while Chord can also be used to locate data objects spread over several distributed machines [6].

The basic operation of Chord is simple: we provide a key, and Chord maps the key to a node. That node is then responsible for storing the value associated with the key.

Chord uses a variant of consistent hashing [3] to assign keys to Chord nodes. Consistent hashing tends to balance load, since each node receives roughly the same number of keys, and it requires relatively little movement of keys when nodes join and leave the system [6]. Consistent hashing in Chord differs from the traditional form in that the routing information is distributed, so each node needs to keep information about only a few other nodes. In an N-node system each node keeps information about O(log N) other nodes, and maintaining the routing information requires no more than O(log² N) messages per join or leave [6].


3.2.2 WORKING MODEL:

A. Chord Simple Look up Algorithm:

Lookup(my-id, key-id)
  n = my successor
  if my-id < n < key-id
    call Lookup(key-id) on node n   // next hop
  else
    return my successor             // done

Figure 3. 6: Basic Lookup

B. Consistent hashing:

- Key identifier = SHA-1(key)

- Node identifier = SHA-1(IP address)

- SHA-1 distributes both uniformly

- Node keys are arranged in a circle.

- The circle cannot have more than 2^m nodes.

- The circle can hold ids/keys ranging from 0 to 2^m − 1.


C. Key to Nodes Mapping:

Figure 3. 7: Key to node map.

Figure 3.7 shows an identifier circle with nodes 0, 1, 2, 3, 4 and 5. The successor of identifier 1 is node 1, so key 1 is located at node 1. Similarly, key 2 is located at node 2, and key 6 at node 0 (wrapping around the circle). The purpose of consistent hashing is to let nodes join and leave the network with minimal disturbance. When a node n joins the network, some keys previously assigned to n's successor become assigned to n; when node n leaves the network, all of its keys are reassigned to n's successor. In the example above, if a node were to join with identifier 7, it would capture the key with identifier 6 from the node with identifier 0. This is proved by Theorem 1 [11, 13], given later in the appendix of this thesis.
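The successor rule of this example can be sketched as follows, assuming m = 3 so that identifiers wrap around at 2^3 = 8:

    def successor(key, node_ids, m=3):
        # First node met when moving clockwise from the key on a ring of size 2**m.
        ring = 2 ** m
        return min(node_ids, key=lambda n: (n - key) % ring)

    nodes = [0, 1, 2, 3, 4, 5]
    print(successor(1, nodes))   # -> 1: key 1 is stored at node 1
    print(successor(6, nodes))   # -> 0: key 6 wraps around to node 0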

D. Finger Table [6]:

Table 3.2: Definition of Chord variables [6]

finger[k].start: (n + 2^(k−1)) mod 2^m, 1 ≤ k ≤ m
finger[k].interval: [finger[k].start, finger[k+1].start)
finger[k].node: first node ≥ n.finger[k].start
successor: the next node on the identifier circle; finger[1].node
predecessor: the previous node on the identifier circle


- For faster lookups, Chord maintains additional routing information.

- If each node knows its correct successor, this additional information is not mandatory for correctness.

- Each node n maintains a routing table with up to m entries (m being the number of bits in the identifiers), called the finger table.

- The i-th entry in the table at node n contains the identity of the first node s that succeeds n by at least 2^(i−1) on the identifier circle.

- s = successor(n + 2^(i−1)).

- s is called the i-th finger of node n, denoted by n.finger(i) [5].

Figure 3. 8: Finger Intervals for node 1[6].

- Each finger table entry contains both the Chord identifier and the IP address of the relevant node.

- The first finger of n is the immediate successor of n on the circle.

Since each node has finger entries at roughly power-of-two intervals around the identifier circle, every node can forward a query at least halfway along the remaining distance between itself and the target identifier. From this observation follows Theorem 2, given later in the appendix of this thesis [5].
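As a sketch, the finger tables of the example network in Figure 3.9 (m = 3, nodes 0, 1 and 3) can be computed directly from the definition s = successor(n + 2^(i−1)):

    def successor(ident, node_ids, m=3):
        # First node whose identifier follows ident on the circle of size 2**m.
        return min(node_ids, key=lambda n: (n - ident) % (2 ** m))

    def finger_table(n, node_ids, m=3):
        # finger[i].node = successor((n + 2**(i-1)) mod 2**m), for i = 1..m.
        return [successor((n + 2 ** (i - 1)) % (2 ** m), node_ids, m)
                for i in range(1, m + 1)]

    nodes = [0, 1, 3]
    print(finger_table(1, nodes))   # -> [3, 3, 0], matching Figure 3.9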


Figure 3. 9: Finger tables and key locations for a net with nodes 0, 1, and 3, and keys 1, 2, and 6 [6].

E. Node Joining:

One of the challenges for Chord in a dynamic network is that nodes can join and leave at any time, and despite this it must remain possible to locate every key in the network. Every Chord node maintains a predecessor pointer, which contains the Chord identifier and IP address of the immediate predecessor of that node and is used to walk counter-clockwise around the identifier circle.

To be able to locate every key, Chord needs to ensure that:

- Each node's successor is correctly maintained.
- For every key k, the node successor(k) is responsible for k.
- For faster lookups, the finger tables should also be correct.

When a node n joins the network, the following tasks are performed:

Initialize fingers and predecessor:

When a node n joins, it learns its predecessor and fingers from n′, an existing node, using the function init_finger_table from the pseudocode [see Appendix]. Then n finds its successors using the pseudocode [see Appendix]. This optimization reduces the expected (and with high probability) number of finger entries that must be looked up to O(log N), which reduces the overall time to O(log² N) [6]. The newly joined node can also initialize itself by copying its immediate neighbor's finger table and predecessor and then correcting its own finger table.


Update fingers of the existing nodes:

When a node joins the network, with high probability O(log N) existing nodes need to have their finger tables updated. Finding and updating these nodes takes O(log² N) time [21]. The function update_finger_table in the pseudocode [see Appendix] updates the finger tables.

Node n will become the i-th finger of a node p if and only if:

(1) p precedes n by at least 2^(i−1), and (2) the i-th finger of node p succeeds n.

Thus, for a given n, the algorithm starts with the i-th finger of node n and then continues to walk counter-clockwise around the identifier circle until it encounters a node whose i-th finger precedes n.

Transfer the key:

Node n must take over responsibility for all the keys for which it is now the successor; normally this involves moving the data associated with each key to the new node. Node n becomes the successor only for keys that were previously the responsibility of the node immediately following n, so n needs to contact only that one node to transfer responsibility for all the relevant keys.

F. Node Failure:

If a node n fails:

- The nodes whose finger tables include n must find n's successor.

- The function find_predecessor from the pseudocode [Appendix 2] is run.

- A successor list is maintained by every node in the Chord ring.

- The function find_successor from the pseudocode [Appendix 2] is run.

- It finds the closest living successor of the query key [Theorem 7].

- The expected time to execute find_successor is O(log N).


G. Routing:

- Circular key space (ring).

- Each node maintains two sets of neighbors.

- Each node maintains a successor list of k nodes.

- Routing correctness is maintained by the successor list.

- Routing efficiency is achieved with a finger list of O(log N) nodes.

- Routing consists of forwarding to the closest preceding node.

- Path lengths are O(log N) hops.

3.2.3 Routing Geometry:

Table 3.3: Routing variables of Chord

Lookup: O(log N)
Neighbors: n·log n / 2
Routing state: log N
Optimal path: O(log n)
Neighbor selection: n·log n / 2
Average latency: Low
Node congestion: O((log n)/n)
Network diameter: O(log n)

3.2.4 LOAD BALANCE:

- Distributed hash function

- Spreading keys evenly over the nodes

- Provides a degree of natural load balance.

According to [Ion Stoica], the number of keys per node exhibits large variations that increase linearly with the number of keys. Figure 3.10(a) plots the mean and the 1st and 99th percentiles


of the number of keys per node. For example, in all cases some nodes store no keys at all. To make this clearer, Figure 3.10(b) plots the probability density function (PDF) of the number of keys per node when keys are stored in the network.

Figure 3. 10: (a) The number of keys stored per node in a node network [10]. (b) The probability density function (PDF) of the number of keys per node [6].

One reason for these disparities is that node identifiers do not uniformly cover the entire identifier space. If we divide the identifier space into N equal-sized bins, where N is the number of nodes, we might expect to see one node in each bin. In reality, however, the probability that a particular bin contains no node is (1 − 1/N)^N.

This problem is solved by associating keys with virtual nodes, and mapping multiple virtual nodes (with unrelated identifiers) to each real node.
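A quick check of the empty-bin probability (1 − 1/N)^N, which approaches 1/e ≈ 0.37 for large N and explains why some nodes end up with no keys at all:

    import math

    for n in (10, 100, 1000, 10000):
        p_empty = (1 - 1.0 / n) ** n
        print(n, round(p_empty, 3))        # tends towards 1/e

    print("1/e =", round(1 / math.e, 3))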


3.3 PASTRY

3.3.1 INTRODUCTION

Pastry is another peer-to-peer algorithm, proposed by Rowstron and Druschel in 2001. A major problem in peer-to-peer networks is scalability and routing efficiency, and Pastry was introduced to meet this demand with enhanced application-level routing and object-location properties, which differentiate Pastry from the other existing DHTs.

3.3.2 WORKING MODEL OF PASTRY

The working model of Pastry is briefly:

- A Pastry node has a 128-bit node identifier (nodeId), which indicates the node's position in a circular nodeId space.

- NodeIds range from 0 to 2^128 − 1.

- The nodeId is assigned randomly when a node joins the network.

- A cryptographic hash function may be used to generate the nodeId.

- Given a message and a key, Pastry routes the message to the live Pastry node whose nodeId is numerically closest to the key.

- The routing table holds only (2^b − 1) · ⌈log_(2^b) N⌉ + l entries.

- When a node joins or leaves, the routing tables can be updated by exchanging O(log_(2^b) N) messages.

For example, in a Pastry network of M nodes, a message is routed to the numerically closest node in fewer than ⌈log_(2^b) M⌉ steps on average (see the numeric sketch below).
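A numeric sketch of the hop bound ⌈log_(2^b) N⌉ and the routing-table size (2^b − 1)·⌈log_(2^b) N⌉ + l; the values b = 4 and l = 16 used here are assumptions, not mandated by Pastry:

    import math

    def pastry_costs(n, b=4, l=16):
        rows = math.ceil(math.log(n, 2 ** b))    # expected number of routing hops
        entries = (2 ** b - 1) * rows + l        # populated routing state per node
        return rows, entries

    for n in (10 ** 3, 10 ** 6, 10 ** 9):
        print(n, pastry_costs(n))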


3.3.3 ROUTING TABLE

- The routing table of a node consists of ⌈log_(2^b) N⌉ rows with 2^b − 1 entries each, organized around the present node's nodeId.

- The entries in row n refer to nodes that share the first n digits with the present node's nodeId but whose (n+1)-th digit takes one of the 2^b − 1 possible values other than the (n+1)-th digit of the present node's id.

- The uniform distribution of nodeIds ensures an even population of the nodeId space.

- Consequently, on average only ⌈log_(2^b) N⌉ levels of the routing table are populated.

- Each entry includes the IP address of a node whose nodeId has the appropriate prefix [40].

Figure 3. 11: Example of a Pastry node with nodeId 3102 where b = 2, L = 4 with base 4.


The choice of b controls the tradeoff between the size of the populated portion of the routing table (about (2^b − 1) · ⌈log_(2^b) N⌉ entries) and the maximum number of hops required to route between any pair of nodes (⌈log_(2^b) N⌉) [40].

The neighborhood set S holds the nodeIds and IP addresses of the |S| nodes that are closest to the local node. It is not normally used for routing messages, but for maintaining locality properties [40].

The leaf set L contains the |L|/2 nodes with numerically closest larger nodeIds and the |L|/2 nodes with numerically closest smaller nodeIds, relative to the present node's nodeId. The leaf set is normally used when routing a message. Typical values for |L| and |S| are 2^b or 2 · 2^b [40].

Figure 3. 12: Routing table of nodeId 65a1x where b=4. Here, base=16 and x=arbitrary suffix. [38]


3.3.4 ROUTING

Based on the Pastry routing algorithm, the routing procedure when a message arrives at a node can be explained as follows:

First the node checks whether the key falls within the range of its leaf set. If it does, the message is forwarded directly to the destination node, namely the node in the leaf set whose nodeId is closest to the key. If that node is the current node itself, the routing procedure is complete.

If the key is not covered by the leaf set:

- The routing table is used to forward the message.
- The message is forwarded to a node that shares a common prefix with the key by at least one more digit than the present node.
- If the appropriate entry in the routing table is empty, or the associated node is unreachable, then
- the message is forwarded to a node whose prefix match with the key is at least as long as the local node's and whose nodeId is numerically closer to the key than the present node's id.

Figure 3.13: The state of Pastry node 103220 in the case of a 12-bit identifier space and base 4 [42].

In Figure 3.13, for example, a node sends a request for key 103200, which is routed to node 103210. For the requested keyword 102022, although node 101203 is numerically closer, the request is sent to node 102303 because it shares the longer prefix 102. Again, for keyword 103000, even though nodes sharing a common prefix exist, the corresponding routing-table entry offers no node smaller (closer) than the present one, so the request is passed through node 103112, since this node shares the prefix 103; observe that its value is numerically closer to the key than the present node's [42].
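A sketch of the prefix comparison used in this example; the identifiers are the ones from the figure's example, treated simply as digit strings:

    def shared_prefix_len(a: str, b: str) -> int:
        # Number of leading digits two identifiers have in common.
        n = 0
        for x, y in zip(a, b):
            if x != y:
                break
            n += 1
        return n

    key = "103200"
    for candidate in ("102303", "103112", "103210"):
        print(candidate, shared_prefix_len(key, candidate))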


3.3.5 NODE JOINING & LEAVING ALGORITHM

Node joining and node leaving are among the most important processes in a Pastry network. In this section we explain how nodes join and leave the Pastry network, starting with the arrival of a new node.

A. Node Joining

- When a node joins the Pastry network, its nodeId is derived, for example by applying a hash function such as SHA-1.

- It also knows a nearby node according to the proximity metric.

- The newly joining node asks this nearby node to route a join message whose key is equal to the joining node's nodeId.

- The message is routed to the existing node whose id is numerically closest to the joining node's nodeId.

- In response to the join message, the nodes encountered along the route send their state tables to the new node.

- Using these, the new node builds its own state tables.

- Finally, the new node informs the relevant nodes of its arrival and completes the initialization of its own state.


B. Node Leaving

It is a common phenomenon that nodes fail or depart in the Pastry network without any notice. In brief:

- When neighboring nodes can no longer communicate with a node, that node is assumed to be dead or unavailable.

- To replace a departed node in the leaf set, its neighbor in the nodeId space contacts an available live node and requests its leaf table.

- The failure of a node that appears in another node's routing table is discovered when that node tries to contact the failed node and receives no response.

- In these circumstances routing messages are not delayed, since the message is simply sent to another live node; there should always be an alternative route, which keeps the routing table reliable.

- To repair the departed node's entry in the routing table, a node first contacts the node referred to by another entry in the same row and asks for that node's entry for the failed position.

- If no live node with the proper prefix can be found in that row, the node next contacts an entry Z^i_(l+1), i ≠ c, in the following row, and so on, casting a larger net. This process is highly likely to eventually find an appropriate node, if one exists.

- Although the neighborhood set is not used for routing messages, it is important to keep it current, since it plays a key role in exchanging information with nearby nodes. A node therefore contacts each member of its neighborhood set from time to time to check whether it is still alive.

- If a member does not respond, the node asks the other members for their neighborhood sets, measures the distance to each newly discovered node, and builds an updated neighborhood set.

Pastry uses an optimistic approach to controlling concurrent node joins and departures. Since the arrival or departure of a node affects only a small number of existing nodes in the system, contention is rare and an optimistic approach is appropriate [40-50].


3.3.6 LOCALITY

Another important property of Pastry is locality, which is defined with respect to a proximity metric. In this section we discuss the properties of Pastry's routes with respect to this metric. The proximity metric is assumed to be a scalar value, for example the round-trip time between two nodes, and the intention is to keep the distance between nodes low. We discuss two of Pastry's locality properties that are relevant to routing performance [40].

A. Short Routes

As already mentioned, each entry in a node's routing table is chosen to refer to the nearest node with the appropriate nodeId prefix, according to the proximity metric. The consequences are:

- At every stage a message is routed to the nearest node that matches a longer prefix.

- From the simulation results in [39], we can observe that the average distance traveled by a message is quite short, between 1.59 and 2.2 times the distance between source and destination in the underlying network.

B. Route convergence

- Route convergence refers to the distance traveled by two messages sent with the same key before their routes converge.

- The simulations in [39] show that this distance is approximately equal to the distance between the respective source nodes.


3.4 TAPESTRY

3.4.1 INTRODUCTION:

Like other DHTs, Tapestry provides high-performance, scalable and location-independent routing. Tapestry's extended feature is its decentralized object location and routing (DOLR) interface, which concentrates on routing messages to endpoints such as nodes or object replicas. With DOLR, resources are virtualized, and this virtualization lets messages be delivered to mobile or replicated endpoints despite instability in the underlying system. A DOLR network therefore gives us a simple scheme for building distributed applications.

Tapestry routing is efficient in reducing message latency and increasing message throughput. For routing messages to mobile endpoints Tapestry exploits locality, which differentiates it from the other existing structured P2P overlay networks. Moreover, adaptive algorithms are used to maintain fault tolerance under node joining and leaving.

3.4.2. WORKING MODEL:

The working model of Tapestry in briefly:

- Tapestry is built on the data-location scheme of Plaxton, Rajaraman and Richa (PRR).

- In the PRR scheme, each Tapestry node holds pointers to other nodes (neighbor links)

- It also holds mappings between object GUIDs and the node-IDs of storage servers (object pointers).
- Nodes and objects have unique identifiers, represented as sequences of digits.

- Digits are produced from an alphabet of radix b.

- Identifiers are equally distributed in the namespace.

- Nodes are denoted by node-IDs and objects are denoted by globally unique identifiers (GUIDs).

- Every inquiry has a GUID which ultimately resolves to a node-ID.


- For a sequence of digits α, let |α| denote the number of digits in the sequence.

- Queries are routed between nodes over neighbor links until an appropriate object pointer is found, at which point the query is forwarded along neighbor links to the destination node.

Figure 3. 14: Example of a Tapestry node 0642 [48]


3.4.3 ROUTING:

- The routing scheme is based on "neighbor maps": the destination ID is matched incrementally, digit by digit from left to right, towards the destination node.

- The n-th node along a route shares at least n successive digits with the destination ID.

- The level used for the neighbor map is then set to (n+1).

- The node searches for the nearest node whose ID matches one further digit of the destination ID.

- The message is routed onward until it reaches the destination node.

- If no such node can be reached, the current node becomes the root node for that ID.

- A message may include a predicate used to select the next node, instead of simply using the closest node.

- Surrogate routing, a distributed algorithm, incrementally computes a unique root node for each ID.

- More than one surrogate root is used to avoid a single point of failure.

- A fixed sequence of salts is appended when generating IDs,

- so that each ID acquires a potentially different surrogate root.

Figure 3.15 shows some outgoing links of a node.


Figure 3. 15: Tapestry routing mesh with links which forms the local routing table [38].

Figure 3.16 shows a message path through the system. The n-th node along the path shares a prefix of at least length n with the destination ID. To find the next node, the level-(n+1) map is searched for an entry matching the next digit of the destination ID. With consistent neighbor maps, the routing method guarantees that any existing unique node will be reached within at most log_b N logical hops.

The total neighbor-map state per node is of fixed, modest size [37]:

total state = (entries per map) × (number of maps) = b · log_b N

Each node therefore only needs to keep a small constant number (b) of entries at each routing level, one neighbor map per level.
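A quick computation of the per-node routing state b · ⌈log_b N⌉ for a few network sizes; the radix b = 16 is an assumed example value:

    import math

    def tapestry_state(n, b=16):
        levels = math.ceil(math.log(n, b))   # number of routing levels (maps)
        return b * levels                    # total entries kept per node

    for n in (10 ** 3, 10 ** 6, 10 ** 9):
        print(n, tapestry_state(n))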


Figure 3. 16: Message on Tapestry, path taken by a message initializes from node 5230 intended for node 42AD in a Tapestry mesh [37].

When an exact digit match cannot be found, Tapestry looks for a "close" digit in the routing table; this process is called surrogate routing [37], whereby each missing ID is mapped to some live node with a similar ID. Figure 3.17 shows the NEXTHOP function in pseudocode. This process maps every identifier g to a unique root node g_r.

This allows the network to continue routing reliably even when intermediate links are changing or faulty; resilience is provided by exploiting the diversity of network paths in the form of redundant routing paths.

Figure 3. 17: Tapestry Pseudo code for NEXTHOP(.).


3.4.4 NODE ALGORITHMS:

Tapestry uses several mechanisms to keep routing tables stable and to ensure object availability; for the complete algorithms and proofs we refer to [38].

Most of the control messages described here require acknowledgments and are retransmitted where necessary [51].

A. Node Insertion:

The node insertion algorithm strategy of Tapestry is incremental.

- A node that intends to join a Tapestry network sends a request to a pre-known or gateway node of that network; a bootstrapping mechanism is used for this.

- The new node builds its neighbor maps by routing its own node ID hop by hop from the gateway node.

- A message is sent to all relevant nodes informing them that a new node has joined, so that they can update their neighbor maps.

- By routing its own node ID, the newly joined node learns the nodes H_i with which it shares a suffix of length i.

- It compares the distances between itself, each neighbor entry and their secondary neighbors, and determines its primary neighbors.

- It searches its neighbors' neighbor maps for nodes at better distances; this is referred to as neighbor optimization.

- The optimization process repeats until no better routing distances are found.

- The process stops when a neighbor-map query returns an empty entry for the next hop.

- Finally, the node routes to the surrogate root for the new ID, and the data destined for the new ID is moved to the newly joined node.


Figure 3.18: Node Insertion [37].

Figure 3.19: Updating location pointers for exiting nodes [37].


B. Node Deletion

Voluntary Node Deletion:

If a node N voluntarily leaves Tapestry, it informs the set D of nodes in N's back pointers of its intention, along with a substitute node for each routing level taken from its own routing table. The notified nodes each send object republish traffic to both N and its substitute.

Involuntary Node Deletion:

- For routing to remain correct, at least one neighbor in the routing table must be alive.

- Periodic checking procedures confirm that each primary neighbor is alive; a heartbeat mechanism is used for this.

- If a primary neighbor is found to be dead, the secondary neighbors are used.

- Failed neighbors are given a probation period; they are considered alive again if they respond within this period (a small sketch follows this list).
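A minimal sketch of this heartbeat-and-fallback idea (ping() is a hypothetical liveness RPC and the probation handling is simplified; Tapestry's actual timers and message formats are not reproduced here):

def next_alive(primary, secondaries):
    # Return a live neighbor for one routing-table entry: the primary if it answers
    # the heartbeat, otherwise the first secondary that does. In Tapestry the failed
    # primary is not dropped immediately; it is kept on probation and reinstated
    # if it answers again within the probation period.
    if primary.ping():
        return primary
    for backup in secondaries:
        if backup.ping():
            return backup
    return None          # no live neighbor left for this entry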


CHAPTER FOUR

FUNCTIONALITY COMPARISON OF DHT'S

4.1 INTRODUCTION:

Each of the above-mentioned DHT protocols, CAN, Chord, Pastry and Tapestry, uses a different routing algorithm. Whenever more than one DHT algorithm is mentioned, there is a common trend of treating them in competitive terms in an effort to determine which is "best". We do not agree with this trend. If we look at Table 4.1, the algorithms have more commonalities than differences; each algorithm represents one approach to routing in overlay networks.

4.2 COMPARISON OF PROTOCOLS:

Table 4.1: Feature comparison among evaluated DHT algorithms

Parameters          | CAN                            | Chord          | Pastry               | Tapestry
Routing geometry    | d-dimensional coordinate space | Ring           | Hybrid (Tree + Ring) | Tree
Lookup method       | R, (S-R & I)                   | R, S-R (& I)   | R, S-R (& I)         | R, S-R (& I)
Routing state       | 2d                             | log N          | -                    | -
Lookup/path length  | O(d·n^(1/d))                   | O(log N)       | O(log N)             | O(log N)
Node state          | O(d)                           | O(log N)       | O(log N)             | O(log N)
Node degree         | O(d)                           | O(log n)       | O(log n)             | O(log n)
Network diameter    | O(d·n^(1/d))                   | O(log n)       | O(log n)             | O(log n)
Node congestion     | O(d·n^(1/d - 1))               | O((log n)/n)   | O((log n)/n)         | O((log n)/n)
Optimal path        | O(log n)                       | O(log n)       | 1                    | 1
Neighbor selection  | 1                              | (n log n)/2    | (n log n)/2          | (n log n)/2


CHAPTER FIVE

EVALUATION OF COMPARED DHTS

5.1 INTRODUCTION

To compare the DHTs, we evaluate their most significant features: routing performance and static resilience. Section 5.2 describes the routing performance comparison and Section 5.3 describes the static resilience.

5.2 ROUTING PERFORMANCE

In a peer-to-peer network the network size increases rapidly, so scalability depends greatly on the following factors:

- Capability of nodes to tolerate the load

- Growth of path lengths and node degrees

- Congestion at the nodes.

5.2.1 COMPARISON OF ROUTING PERFORMANCE:

We consider the parameter “routing path length” to measure the routing efficiency of existing DHTs.

The routing geometry of CAN [5, 9, 21] is a d-dimensional toroidal space. The key space is partitioned into hyper-cubic zones, and the nodes owning contiguous hyper cubes are referred to as neighbors. Routing forwards the message to whichever neighbor is closest to the key. CAN nodes have O(d) neighbors and path lengths of O(d·n^(1/d)) hops. When d = log N, so that both the number of neighbors and the path length become O(log N), CAN routing behaves similarly to the other algorithms mentioned above.


From Table 4.1 we can observe that CAN has the longest path, O(d·n^(1/d)), while the rest of the DHTs have O(log N) path lengths.
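To make the difference concrete, a small illustrative calculation (the network size and dimensions are chosen arbitrarily) compares the order of d·n^(1/d) with the order of log N:

import math

n = 1_000_000                                  # assumed network size
for d in (2, 4, 10):                           # CAN dimensions
    print(f"CAN, d={d:2d}: ~{d * n ** (1 / d):7.1f} hops (order of d*n^(1/d))")
print(f"Chord/Pastry/Tapestry: ~{math.log2(n):.1f} hops (order of log N)")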

Chord [6, 21, 55] applies a one-dimensional circular key space. The node liable for a key is the node whose identifier most closely follows the key numerically; that node is called the key's successor. Chord keeps two sets of neighbors. Each node has a successor list of k nodes that immediately follow it in the key space, and routing correctness is achieved with these lists. Routing efficiency is achieved with the finger list of O(log N) nodes spaced exponentially around the key space. Chord routes to the numerically closest node with path lengths of O(log N) hops, so compared to CAN, Chord gives better routing performance.

The key space of Pastry is also circular, so nodes own the keys that are numerically closest to them. Each node additionally has a leaf set, the set of |L| closest nodes, and routing correctness depends on this leaf set. For better routing efficiency Pastry keeps an additional set of neighbors spread out in the key space; Pastry has O(log N) neighbors and routes within O(log N) hops. Unlike Chord, Pastry uses a prefix-based (digit-by-digit) representation of identifiers to build its routing information. A Pastry node maintains more state than a Chord node, which implies a larger overhead, although Chord has a similar problem. However, Pastry routing uses Proximity Neighbor Selection (PNS) and therefore achieves better proximity awareness with little extra overhead; PNS is not adopted in the basic Chord algorithm, or at least no one has tried to implement PNS in Chord so far. So Pastry's routing performance is better than that of CAN and Chord.
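As a rough illustration (parameter values assumed): in a network of N = 10^6 nodes, Chord resolves a lookup in about (1/2)·log2(N) ≈ 10 hops on average, while Pastry with b = 4 (16 columns per routing-table row) needs about log_16(10^6) ≈ 5 hops, at the price of the larger per-node routing state noted above.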

Tapestry [51-63] uses a variant of the algorithm of Plaxton et al. Tapestry has O(log N) neighbors and routing path lengths of O(log N) hops. Tapestry's additional surrogate routing algorithm provides good analytic bounds on the number of logical hops taken. Moreover, Tapestry is distinguished from the other approaches by its focus on proximity: locating nodes that are close in the underlying network reduces the latency of the hops involved in performing a query. If there are several copies of an object, it is desirable that the closest copy be found, and the Tapestry algorithm indeed finds an approximately closest copy of any required key.

So, compared with CAN, Chord and Pastry, Tapestry gives the best routing efficiency.

5.2.2 SUMMARY OF THE COMPARISON:

In terms of routing efficiency we can conclude that Tapestry performs better than the other DHT algorithms, although all of the algorithms can be extended or modified to meet goals such as robustness and scalability.


5.3 STATIC RESILIENCE

Static resilience measures how well the network routes before recovery finishes [26].

5.3.1 OVERVIEW:

From [26] we can observe that both static resilience and routing performance are closely tied to flexibility, expressed by the relation:

Geometry => Flexibility => Performance

Routing geometry determines the flexibility available in selecting neighbors and routes, and this flexibility is important for routing performance. For static resilience we consider the following issues in route selection:

- Routing efficiency is better if there is flexibility in selecting routing paths, as it leads to shorter and more consistent paths.

- If the algorithm also has flexibility in neighbor selection, this leads to shorter paths and therefore better routing.

[Plot: percentage of failed paths (0-100%) versus percentage of failed nodes (0-90%) for the Tree, Hybrid, Hypercube and Ring geometries.]

Figure 5.1: Performance analysis of various routing geometries [26]


5.3.2 SUMMARY OF THE COMPARISON:

The routing geometries of CAN, Chord, Pastry and Tapestry are Hypercube, Ring, Hybrid and Tree respectively. From Figure 5.1 we can conclude that:

- The Tree geometry has the lowest percentage of failed paths compared to the others.

- The Hybrid geometry is the second best, as its fraction of failed paths is lower than that of Hypercube and Ring.

- Hypercube and Ring have a higher percentage of failed paths, which indicates a scalability problem.

So, also in the case of static resilience, the Tree geometry is the best, which corresponds to the Tapestry routing algorithm.


CHAPTER SIX

ADAPTABILITY WITH MOBILE ENVIRONMENTS

6.1 ISSUES TO BE CONSIDERED:

The term "mobility" in DHT networks refers to how often nodes join and depart without any precondition; because of this, the performance of the underlying peer-to-peer network often degrades. Continuous departures and joins due to mobility introduce what is called mobility churn.

Mobile peer-to-peer applications have recently received increasing interest. In the wireless environment, where communication links are strongly affected by high packet loss rates and bandwidth fluctuations, the situation is much more complicated. This complication shows up in the following scenarios:

1. It is difficult to establish a reliable P2P network on a Mobile Ad hoc Network (MANET), since it totally depends on a rapidly changing underlying physical network. The overlay layer should therefore not build its routing structure independently of the physical network.

2. Mobile devices run on limited battery charge and frequently move from one BTS to another, so they often go in and out of signal coverage. This is a form of mobility churn [29]. Mobility churn is similar to normal churn [30]; it causes more traffic on the DHT network due to the continuously repeated node join and leave procedures. For the same reason, nodes have to record information about newly joined nodes and delete old state information, so routing efficiency degrades and end-to-end delay increases.

3. It has been assumed that DHTs cannot work efficiently in MANETs due to high overhead cost, but this conception has been proven false: the main problem of deploying DHTs in a MANET is not the overhead, but rather the protocol's negative timeout and failover strategy [33]. In addition, the churn rate of mobile nodes is high compared to wired networks.

4. The performance of DHT protocols such as CAN and Chord degrades under high levels of churn [31][30]. It has also been shown that severe levels of churn are likely for peer-to-peer networks with mobile nodes [29][32].


Considering the above problems, we can say that structured P2P DHT protocols such as CAN, Chord, Pastry and Tapestry are suitable only if the number of hops is low [34].

So, our approach is to identify which structured P2P DHT algorithm gives better performance under high churn, considering its suitability for mobile environments.

6.2 ANALYSIS OF THE DHTS UNDER HIGH CHURN:

Note that the churn rate has a severe influence on the successful lookup ratio, but only a slight effect on the median lookup latency.

CAN has a longer lookup path (see Table 4.1) than the other three DHTs, which means it performs badly at very high churn.

We observe from [36] that Chord's performance is inferior to the other DHTs in terms of performance optimization, whereas in the case of Tapestry parameters such as lookup and latency can be tuned to reach the highest performance.

Figure 6.1: Node join/leave with interval = 10 s (left: N = 100, right: N = 1000) [36]


Figure 6.2: Node join/leave with interval = 120 s (left: N = 100, right: N = 1000) [36]

We can see from Figures 6.1 and 6.2 that when the churn rate decreases, the fraction of failed lookups also decreases proportionally and the routing tables become more stable [36]; at high churn rates many lookups fail. Considering this result, we can state that Chord does not work well when the churn rate is too high.

From [29] we observe that a high churn rate is also a problem for Pastry. The reason is that when nodes join or quit, a number of node states have to be exchanged (at least 2·log2(N)) before all node states are adjusted correctly.
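For example (illustrative size only): in a network of N = 1000 nodes, at least 2·log2(1000) ≈ 20 node-state exchanges are needed for every join or leave before all routing state is consistent again, which becomes expensive when membership changes every few seconds.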

Figure 6.3: Pastry under churn [29].


Figure 6.3 shows the percentage of successful lookups in a Pastry network of 1000 nodes under churn [29]. It is clear from the figure that successful lookups are mostly consistent, while under a high churn rate Pastry fails to complete the majority of lookup requests.

Tapestry works better than Chord in terms of successful lookup rate at very high churn rates, but Chord achieves the best performance among the DHTs when the churn rate is low. Tapestry's performance is more sensitive to the round-trip time (RTT) than Chord's.

Figure 6.4: Cost versus performance under churn in Tapestry [30].

From Figure 6.4 we can state that there is no single best combination of parameter values; on the contrary, there is a set of best achievable cost-performance combinations [30].

6.3 SUMMARY OF THE ANALYSIS:

Analysing the results above, Tapestry proves to be the best suited to the mobile environment, since the performance of all the other DHTs degrades under high churn.


CHAPTER SEVEN

CONCLUSION

7.1 CONCLUSION:

We can conclude our work as follows:

- We studied the architectures and implementations of peer-to-peer networks, overlay networks and DHTs with the help of the necessary figures and tables.

- After that we analysed and compared the features of the DHTs.

- We also reviewed various existing solutions and examined how DHTs can be adapted to mobile environments.

The most important challenge for P2P networks is that they are difficult to establish on a MANET, which depends on a constantly changing underlying physical network. The overlay layer should therefore not build its routing structure independently of the physical network.

So, considering these issues, we can summarize the evaluation of DHTs for mobile environments as follows:

- Tapestry's routing performance is better than that of CAN, Chord and Pastry.

- From the examination of the static resilience property, Tapestry is superior to the other DHTs.

- Finally, under extreme churn no other DHT performs as well as Tapestry.

There are several solutions for adapting DHTs to mobile environments, e.g. CHR and MDHT, but all of them are modifications of the main existing DHTs. So modifying Tapestry to address the ad hoc networking problem offers a great opportunity.

7.2 FUTURE WORK:

DHTs are still at the research stage, so a lot of work can be done in the future to optimize them for mobile environments, especially in wireless systems. For example, Tapestry should be implemented and deployed in a real-life scenario. Also, there is still no specific solution for bootstrapping nodes.


REFERENCES:

[1] Detlef Schoder , Kai Fischbach and Christian Schmitt , ‘‘Core Concepts in Peer-to-Peer Networking”, in Peer to Peer computing , 1st ed., Joyce Li, Ed. United States, Idea Group, 2005, pp. 2-4.

[2] Lua E K, Crowcroft J, Pias M, Sharma R, Lim S. A survey and comparison of peer-to-peer overlay network schemes. IEEE Communications Surveys & Tutorials, 2005, 7: 72-93.

[3] Diego Doval , Donal O'Mahony, ‘‘ Overlay Networks: A Scalable Alternative for P2P”, IEEE Internet Computing, v.7 n.4, p.79-82, July 2003.

[4] Gribble, S., Brewer, E., A., Hellerstein, J., and Culler, D., ‘‘Scalable, Distributed Data Structures for Internet Service Construction”, 4th Symposium on Operating Systems Design and Implementation (OSDI'00), 2000.

[5] S. Ratnasamy, P. Francis, M. Handley, R. Karp, and S. Shenker. A Scalable Content- Addressable Network. In ICSI Technical Report, Jan. 2001.

[6] I. Stoica, R. Morris, D.R. Karger, M.F. Kaashoek, and H. Balakrishnan, "Chord: A scalable peer-to-peer lookup service for internet applications", In: Proc. SIGCOMM, 2001, pp.149-160.

[7] David Karger, Eric Lehman, Tom Leighton, Mathhew Levine, Daniel Lewin, and Rina Panigrahy, “Consistent hashing and random trees: Distributed caching protocols for relieving hot spots on the World Wide Web”. In Proceedings of the 29th Annual ACM Symposium on Theory of Computing (El Paso, TX, May 1997), pp. 654–663.

[8] Mockapetris, P., And Dunlap, K. J. Development of the Domain Name System. In Proc. ACM SIGCOMM (Stanford, CA, 1988), pp. 123–133.

[9] S. Ratnasamy, P. Francis, M. Handley, R. Karp, and S. Schenker, “A scalable content addressable network,” in Proceedings of the 2001 conference on applications, technologies, architectures, and protocols for computer communications. ACM Press, 2001, pp. 161–172. [Online]. Available: citeseer.nj.nec.com/article/ratnasamy01scalable.html

[10] Simon Rieche, Leo Petrak, Klaus Wehrle, „„Comparison of Load Balancing Algorithms for Structured Peer-to-Peer Systems”, In: Proc. of Workshop on Algorithms and Protocols for Efficient Peer-to-Peer Applications (PEPPA), GI-Jahrestagung Informatik 2004, vol. 51, pp. 214–218, Ulm, Germany, GI. LNI.

[11] Yingwu Zhu and Yiming Hu, “ Efficient, Proximity-Aware Load Balancing for DHT-Based P2P Systems,” IEEE Transactions on Parallel and Distributed Systems, v.16 n.4, p.349-361, April 2005


[12] V. King and J. Saia, "Choosing a random peer", in Principles of Distributed Computing (PODC), Newfoundland, Canada, July 2004.

[13] B. Godfrey, K. Lakshminarayanan, S. Surana, R. M. Karp, and I. Stoica, “Load balancing in dynamic structured p2p systems,” in Proceedings of IEEE INFO-COM, 2004.

[14] J. Ledlie and M. Seltzer, “Distributed, secure load balancing with skew, heterogeneity, and churn,” in Proceedings of IEEE INFO-COM, 2004.

[15] A. Rao, K. Lakshminarayanan, S. Surana, R. M. Karp and I. Stoica, “Load balancing in structured p2p systems, „„in Proceedings of the 2nd International Workshop on Peer-to-Peer Systems (IPTPS), Berkeley, CA, pp.68–79, 2003.

[16] Aisling O'Driscoll, Susan Rea, Dirk Pesch, "Performance evaluation and modeling of the Chord DHT structured overlay for ad hoc networks", Cork Institute of Technology, Ireland, 2008.

[17] Stoica I., Morris R., Karger D., Kaashoek M. F. And Balakrishnan H. Chord: A scalable peer-to-peer lookup service for internet applications. Tech. Rep. TR-819, MIT LCS, March 2001. http://www.pdos.lcs.mit.edu/chord/papers/.

[18 ] Popescu Alex, Ilie D. and Kouvatsos D., “On the Implementation of a Content-Addressable Network”, 5th International Working Conference on Performance Modelling and Evaluation of Heterogeneous Networks (HET-NETs), Karlskrona, Sweden, February 2008.

[19] Lewin D. Consistent hashing and random trees: Algorithms for caching in distributed networks. Master‟s thesis, Department of EECS, MIT, 1998. Available at the MIT Library, http://thesis.mit.edu/

[20] LI J., Jannotti J., De Couto D., Karger D. And Morris R. A scalable location service for geographic ad hoc routing. In Proceedings of the 6th ACM International Conference on Mobile Computing and Networking (Boston, Massachusetts, August 2000), pp. 120–130.

[21] Stefan Götz and Klaus Wehrle, “Distributed Hash Table Algorithms”, L3S Research Center, University of Hannover, 2007

[22] Vinh Trương, "Testing implementations of Distributed Hash Tables", MSc thesis in SEM, Göteborg, Sweden, 2007.

[23] E. K. Lua, J. Crowcroft, M. Pias, R. Sharma, and S. Lim, “A survey and comparison of peer-to-peer overlay networks schemes,” IEEE Communications Surveys and Tutorials, vol. 7, no. 2, pp. 72–93, 2nd quarter 2005.

[24] Castro M., Druschel P., Hu Y. Ch., Rowstron A., Proximity Neighbor Selection In Tree Based Structured Peer-To-Peer Overlays. Technical Report MSR-TR-2003-52 (2003)


[25] A. Rowstron and P. Druschel, “Pastry: Scalable, decentralized object location, and routing for large-scale peer-to-peer systems,” Lecture Notes in Computer Science, vol. 2218, pp. 329– 350, 2001.

[26] K. Gummadi, R. Gummadi, S. Gribble, S. Ratnasamy, S. Shenker, and I. Stoica, “The impact of DHT routing geometry on resilience and proximity”, In: Proc. of SIGCOMM'03, August 2003

[27] Ben Y. Zhao, John Kubiatowicz, and Anthony D. Joseph Computer Science Division,” Tapestry: An Infrastructure for Fault-tolerant Wide-area Location and Routing”, April 2001

[28] H.-C. Hsiao and C.-T. King. “Mobility churn in DHTs”. In Proc. of the 1st International Workshop on Mobility in Peerto-Peer Systems (MPPS‟05) in conjunction with the 25th IEEE International Conference on Distributed Computing Systems (ICDCS‟05), pages 799–805, June 2005.

[29] S. Rhea, D. Geels, T. Roscoe, and J. Kubiatowicz. “Handling churn in a DHT”. In Proc. of the USENIX Annual Technical Conference, June 2004.

[30] J. Li, J. Stribling, R. Morris, M. F. Kaashoek, and T. M.Gil. “A performance vs. cost framework for evaluating DHT design tradeoffs under churn.” In Proc. of IEEE INFOCOM,Miami, FL, March 2005.

[31] H. Pucha, S. M. Das, and Y. C. Hu. How to implement DHTs in mobile ad hoc networks? In Proc. of the 10th ACM International Conference on Mobile Computing and Network (MobiCom 2004), September 2004.

[32] Hung Nguyen Chan, Khang Nguyen Van, Giang Ngo Hoang, "Characterizing Chord, Kelips and Tapestry algorithms in P2P streaming applications over wireless network", IEEE, 2008.

[33] Curt Cramer and Thomas Fuhrmann,”Performance Evaluation of Chord in Mobile Ad Hoc Networks”, MobiShare’06, September 25, 2006, Los Angeles, California, USA.Copyright 2006 ACM

[34] J. Eberspächer, R. Schollmeier, S. Zöls, and G. Kunzmann, “Structured P2P Networks in Mobile and Fixed Environments”, In: Proc. HET-NETs '04, Ilkley, West Yorkshire, UK, July 2004.

[35] J. Li, J. Stribling, T. Gil, R. Morris and F. Kaashoek, “Comparing the performance of distributed hash tables under churn,” In: Proc. of the 3rd International Workshop on Peer-to- Peer Systems (IPTPS04), San Diego, CA, February 2004.


[36] Filipe Ara´ujo and Lu´ıs Rodrigues, “Survey on Distributed Hash Tables”, Ph.D. dissertation, University of Lisbon,Lisbon, Portugal, 2006.

[37] P. Druschel and A. Rowstron. PAST: A large-scale, persistent peer-to-peer storage utility. In Proc. HotOS VIII, Schloss Elmau, Germany, May 2001.

[38] A. Rowstron and P. Druschel, Storage management and caching in PAST, a large-scale, persistent peer-to-peer storage utility. In Proc. ACM SOSP’01, Banff, Canada, Oct. 2001.

[39] A. Rowstron, A.-M. Kermarrec, P. Druschel and M. Castro. Scribe: The design of a large- scale event notification infrastructure. Submitted for publication. June 2001.http://www.research.microsoft.com/ antr/SCRIBE/.

[40] A. Rowstron and P. Druschel. Pastry: Scalable, distributed object location and routing for large-scale peer-to-peer systems. In Proc. IFIP/ACM Middleware 2001, Heidelberg, Germany, Nov. 2001.

[41] M. Castro, P. Druschel, Y. C. Hu, and A. Rowstron. Topology-aware routing in structured peer-to-peer overlay networks, 2002. Submitted for publication.

[42] Jihong Song and Shaopeng Wang, "The Pastry Algorithm Based on DHT", Com. & Info. Sci., vol. 2, no. 4, pp. A153-A157, Nov. 2009.

[43] Binzenhofer , A, Staehle, D., &Henjes, R. (2005). Telecommunications Conference, GLOBECOM'05. IEEE Volume2, 28.

[44] Huang, D. Y., & Li, Z. P. (2003). Active Distributed Peer-to-Peer Network Architecture, In: International Conference on Communication Technology Proceedings, 2003.

[45] Stoica I. Morris R. Liben-Nowell D. Karger D R, Kaashoek MF, Dabek F, & Balakrishnan H. (2003). Networking, IEEE/ACM Transactions on, Volume 11, Issue 1.

[46] M. Castro, P. Druschel, A. Ganesh, A. Rowstron, and D. S. Wallach. “Security for peer-to- peer routing overlays,” 2002. Submitted for publication.

[47] S. Iyer, A. Rowstron, and P. Druschel. Squirrel: A decentralized peer-to-peer web cache. In 12th ACM Symposium on Principles of Distributed Computing (PODC 2002), July 2002.

[48] B. Y. Zhao, J. D. Kubiatowicz, and A. D. Joseph, “Tapestry: An infrastructure for fault tolerant wide-area location and routing,” Univ. California, Berkeley, CA, Tech. Rep. CSD- 011141, Apr. 2001.


[49] K. Hildrum, J. D. Kubiatowicz, S. Rao, and B. Y. Zhao, “Distributed object location in a dynamic network,” in Proc. SPAA, Winnipeg, Canada, Aug. 2002, pp. 41–52.

[50] F. Dabek, B. Zhao, P. Druschel, J. Kubiatowicz, and I. Stoica, “Toward a common API for structured P2P overlays,” in IPTPS, Berkeley, CA, Feb. 2003, pp. 33–44.

[51] S. Rhea, P. Eaton, D. Geels, H. Weatherspoon, B. Zhao, and J. Kubiatowicz, “Pond: The OceanStore prototype,” in Proc. FAST, San Francisco, CA, Apr. 2003, pp. 1–14.

[52] S. Q. Zhuang, B. Y. Zhao, A. D. Joseph, R. H. Katz, and J. D. Kubiatowicz, “Bayeux: An architecture for scalable and fault-tolerant wide-area data dissemination,” in Proc. NOSSDAV, Port Jefferson, NY, June 2001, pp. 11–20.

[53] S. Ratnasamy, P. Francis, M. Handley, R. Karp, and S. Schenker, “A scalable content- addressable network,” in Proc. SIGCOMM, San Diego, CA, Aug. 2001, pp. 161–172.

[54] A. Rowstron and P. Druschel, “Pastry: Scalable, distributed object location and routing for large-scale peer-to-peer systems,” in Proc. Middleware, Heidelberg, Germany, Nov. 2001, pp. 329–350.

[55] I. Stoica, R. Morris, D. Karger, M. F. Kaashoek, and H. Balakrishnan, “Chord: A scalable peer-to-peer lookup service for internet applications,” in Proc. SIGCOMM, San Diego, CA, Aug. 2001, pp. 149–160.

[56] P. Maymounkov and D. Mazieres, "Kademlia: A peer-to-peer information system based on the XOR metric," in Proc. IPTPS, Cambridge, MA, Mar. 2002, pp. 53-65.

[57] D. Malkhi, M. Naor, and D. Ratajczak, “Viceroy: A scalable and dynamic emulation of the butterfly,” in Proc. PODC,Monterey, CA, 2002, pp. 183–192.

[58] N. J. A. Harvey, M. B. Jones, S. Saroiu, M. Theimer, and A. Wolman, “Skipnet: A scalable overlay network with practical locality properties,” in Proc. USITS, Seattle, WA, Mar. 2003, pp. 113–126.

[59] Plaxton, C. G., Rajaraman, R., And Richa, A. W. Accessing nearby copies of replicated objects in a distributed environment. In Proc. of the 9th Annual Symp. on Parallel Algorithms and Architectures (June 1997), ACM, pp. 311– 320.

[60] Ben Y. Zhao, Ling Huang, Jeremy Stribling, Sean C. Rhea, Anthony D. Joseph, Member, IEEE, and John D. Kubiatowicz, Member, IEEE, „„Tapestry: A Resilient Global-Scale Overlay for Service Deployment‟‟, Selected areas in Communications, vol. 22, no. 1,


APPENDIX:

Algorithm-1 for CAN construction

S ← BOOTSTRAP
c ← RAND(S)
P ← RAND(X, Y)
Z, N ← FINDZONE(c, P)
JOINROUTING(N)

procedure BOOTSTRAP
    Contact a DNS server d
    b ← d.LOOKUP(CAN domain)          ▷ bootstrap node
    c ← b.GETCANNODES                 ▷ set of CAN nodes
    return c
end procedure

procedure FINDZONE(c, P)              ▷ Route JOIN message towards point P via node c
    Z, N ← p.GETZONE                  ▷ P is in p's zone
    return Z, N
end procedure

procedure JOINROUTING(N)
    Send soft updates to all nodes N
end procedure

procedure LOOKUP(domain)
    Look up IP address ip associated with domain
    return ip
end procedure

procedure GETCANNODES
    return subset of known CAN nodes
end procedure

procedure GETZONE
    Give up half of own zone, Z, to the calling node
    Collect the set N of neighbours of half-zone Z
    return Z, N
end procedure


Algorithm-2 for CAN routing

procedure ROUTE(c, P)                 ▷ Route JOIN message through the CAN towards point P via node c; return owner p of point P
    if P ∈ c then                     ▷ Is P in c's neighbours n?
        p ← n                         ▷ Set n as current node p
    else                              ▷ P is not in origin node c's zone; the owner of the zone where point P lies needs to be found
        p ← c                         ▷ Current node is set to p
        while P ≠ p do                ▷ Until the owner of P is found: check all neighbours n of current node p for the shortest distance to point P
            d ← sqrt((Px - nx)^2 + (Py - ny)^2)   ▷ (Px, Py) are the coordinates of point P, (nx, ny) those of neighbour n; the neighbour n with the shortest distance d is the next hop on the path to the destination
            p ← n                     ▷ New current node p
        end while
    end if
    return p                          ▷ Point P is in current node p's zone; return p
end procedure
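For completeness, a small runnable sketch of the greedy routing step of Algorithm-2 (a 2-D coordinate space is assumed, the distance is measured to the centre of each neighbor's zone, and the data structures are plain dictionaries; this is an illustration, not the thesis's reference implementation):

import math

def route(current, point, zones, neighbors):
    # zones[node]     -> (x_min, x_max, y_min, y_max), the zone owned by node
    # neighbors[node] -> list of neighboring node ids
    def contains(zone, p):
        x0, x1, y0, y1 = zone
        return x0 <= p[0] < x1 and y0 <= p[1] < y1

    def centre(zone):
        x0, x1, y0, y1 = zone
        return ((x0 + x1) / 2, (y0 + y1) / 2)

    # Greedy step: until the current node's zone contains the point,
    # forward to the neighbor whose zone centre is closest to the point.
    while not contains(zones[current], point):
        current = min(neighbors[current],
                      key=lambda n: math.dist(centre(zones[n]), point))
    return current                                  # owner of the zone containing the point

# Tiny usage example: a unit square split into two zones.
zones = {"A": (0.0, 0.5, 0.0, 1.0), "B": (0.5, 1.0, 0.0, 1.0)}
neighbors = {"A": ["B"], "B": ["A"]}
print(route("A", (0.7, 0.3), zones, neighbors))     # -> B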


Theorem:


Pseudocode For The Node Join Operation:
