Peercq: a Decentralized and Self-Conﬁguring Peer-To-Peer Information Monitoring System

PeerCQ: A Decentralized and Self-Configuring Peer-to-Peer Information Monitoring System Bu¯gra Gedik and Ling Liu College of Computing Georgia Institute of Technology {bgedik,lingliu}@cc.gatech.edu Abstract anteed a definite answer in a bounded number of network hops. The second generation of routing and location PeerCQ is a totally decentralized system that per- schemes is also considered more scalable. forms information monitoring tasks over a network of Surprisingly, most of the P2P protocols [1, 13, 9, 10, 3] peers with heterogeneous capabilities. It uses Continual to date make the assumption that all nodes tend to par- Queries (CQs) as its primitives to express information- ticipate and contribute equally to the system. Thus, these monitoring requests. A primary objective of the PeerCQ protocols distribute tasks and place data to peers based system is to build a decentralized Internet scale distributed on this assumption. However P2P applications should re- information-monitoring system, which is highly scalable, spect the peer heterogeneity and user characteristics in self-configurable and supports efficient and robust way of order to be more robust [12]. Another common weak- processing CQs. This paper describes the basic architec- ness of the second generation P2P protocols is the lack ture of the PeerCQ system and focuses on the mecha- of flexibility in optimizing the key to peer matching al- nisms used for service partitioning at the P2P protocol gorithms to incorporate important performance metrics layer. A set of initial experiments is reported, demon- such as load balance, system utilization, reliability, and strating the sensitiveness of the PeerCQ approach to large trust. scale P2P information monitoring and the effectiveness of In this paper we describe PeerCQ, a peer-to-peer in- the PeerCQ service-partitioning algorithms with respect to formation monitoring system, which utilizes a large set of load balancing and system utilization. heterogeneous peers to form a peer-to-peer information monitoring network. Many application systems today 1 Introduction have the need for tracking changes in multiple information sources on the web and notifying users of changes if some With the emergence of successful applications like condition over the information sources is met. A typical Gnutella [3] and Napster [7], peer-to-peer technology has example in business world is to monitor availability and received rapid and widespread deployment, and a striking price information of specific products, such as “monitor visibility over the past few years. the price of 2 MP digital cameras in next two months and There are currently several P2P systems in operation, notify me when one with price less than 100$ becomes and many more are under development. Gnutella [3], available”, “monitor IBM stock price and notify me when Napster [7] and Freenet [1] has been among the most it increases by 10%”. prominent peer-to-peer file sharing systems. In these sys- PeerCQ uses continual queries (CQs) as its primi- tems, files are stored at the end user machines rather than tives to express information monitoring requests (sub- at a central server, and as opposed to the conventional scriptions). Continual Queries [5] are standing (long run- client/server model. Files are transferred directly between ning) queries that monitor information updates and re- peers. However, they differ from one another in terms of turn results whenever the updates reach certain specified their lookup services, and the current P2P protocols in thresholds. There are three main components of a CQ: these systems are not scalable. query, trigger, and stop condition. Whenever the trig- Chord [13], Pastry [10], Tapestry [14], CAN [9] are ex- ger condition becomes true, the query part is executed amples of a second generation of peer-to-peer systems. and the part of the query result that is different from the Their routing and location schemes are based on dis- result of the previous execution is returned. The stop tributed hash tables. In contrast to the first generation condition specifies the termination of a CQ. P2P systems such as Gnutella and Freenet, these systems In this paper, we focus on how PeerCQ addresses the provide reliable content location (persistence and avail- problems related to service partitioning, and describe our ability) through a tighter control of the data placement proposed technical solutions. The paper has two main and topology within a P2P network. A query is guar- contributions. First, we introduce a distinct approach to (CQs). Each information monitoring request is assigned CQ systems, which enables processing of large number of to an identifier. Based on an identifier matching crite- CQs by harnessing the power at the edge of the Internet, ria, CQs are executed at their assigned peers and cleanly without a need for centralized servers. Second, we intro- migrated to other peers in the presence of failure or peer duce an effective service-partitioning scheme. A unique entrance and departure. feature of the PeerCQ service-partitioning mechanism is its ability to integrate both the peer heterogeneity and the PeerCQ servant application on Peer A Information Monitoring Layer CQ user information monitoring characteristics of the users into Information PeerCQ Protocol Layer Sources the load balancing scheme, a challenge in large-scale, het- A direct notification erogeneous, and totally decentralized systems. Internet PeerCQ Network e-mail 2 System Overview notification B Information Peers in the PeerCQ system are user machines on Sources PeerCQ Protocol Layer the Internet that execute information monitoring appli- Information Monitoring Layer cations. Peers act both as clients and servers in terms PeerCQ servant application on Peer B of their roles in serving information monitoring requests. Figure 1: PeerCQ Architecture An information-monitoring job, expressed as a continual query (CQ), can be posted from any peer. There is no Figure 1 shows a sketch of the PeerCQ system archi- scheduling node in the system. No peers have global tecture from a user’s point of view. Each peer in the P2P knowledge about other peers in the system. network is equipped with the PeerCQ middleware, a two- There are three main mechanisms that make up the layer software system. The lower layer is the PeerCQ pro- PeerCQ system. The first mechanism is the overlay net- tocol layer responsible for peer-to-peer communication. work membership. Peer membership allows peers to com- The upper layer is the information monitoring subsys- municate directly with one another to distribute tasks or tem responsible for CQ subscription, trigger evaluation, exchange information. A new node can join the PeerCQ and change notification. Any domain-specific informa- system by contacting an existing peer (an entry node) tion monitoring requirements can be incorporated at this in the PeerCQ network. There are several bootstrapping layer. methods to determine an entry node. We may assume A user composes his or her information monitoring re- that a PeerCQ service has an associated DNS domain quest in terms of a CQ and posts it to the PeerCQ system name. It takes care of resolving the mapping of PeerCQ’s via an entry peer, say Peer A. Based on the PeerCQ’s domain name to the IP address of one or more PeerCQ service partition scheme (see Section 3.2), Peer A is not bootstrapping nodes. A bootstrapping node maintains a responsible for this CQ. Thus it triggers the PeerCQ’s short list of PeerCQ nodes that are currently alive in the service lookup function. The PeerCQ system determines system. To join PeerCQ, a new node looks up the PeerCQ which peer will be responsible for processing this CQ us- domain name in DNS to obtain a bootstrapping node’s IP ing the PeerCQ service portioning scheme. Assume that address. The bootstrapping node randomly chooses sev- Peer B was chosen to execute this CQ. After the CQ is eral entry nodes from the short list of nodes and supplies assigned to Peer B, it starts its execution there. During their IP addresses. Upon contacting to an entry node this execution, when an interested information update is of PeerCQ, the new node is integrated into the system detected, the query is fired, and the owner of this CQ is through the PeerCQ protocol’s initialization procedures. notified with the newly updated information. The notifi- The second mechanism is the PeerCQ protocol, includ- cation could be realized by e-mail or by directly sending it ing the service partitioning and the routing query based to Peer A if it is online at the time of notification. Even if service lookup algorithm. In PeerCQ every peer partici- a peer is not participating in the system at a given time, pates in the process of evaluating CQs, and any peer can its previously posted CQs are in execution at other peers. post a new CQ of its own interest. When a new CQ is posted by a peer, this peer first determines which peer 3 The PeerCQ Protocol will process this CQ with the objective of utilizing sys- The PeerCQ protocol specifies how to find peers that tem resources and balancing the load on peers. Upon a are best to serve the given information monitoring re- peer’s entrance into the system, a set of CQs that needs to quests in terms of load balance and overall system utiliza- be re-distributed to this new peer is determined by taking tion, how new nodes join the system, and how they recover into account the same objectives. Similarly, when a peer from the failures or departures of existing nodes. In this departs from the system, the set of CQs of which it was section we first give an overview of the protocol, including responsible is reassigned to the rest of peers, while main- the system model.

Peercq: a Decentralized and Self-Conﬁguring Peer-To-Peer Information Monitoring System

Details

Download

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

Support