Improving Performance in the Gnutella Protocol


Jonathan Hess and Benjamin Poon
Department of Computer Science, University of California, Berkeley
jonhess (at) uclink.berkeley.edu, bpoon (at) uclink.berkeley.edu

Abstract

The Gnutella protocol describes a completely decentralized P2P file-sharing system in which queries are flooded to all neighbors in the search for files. As originally specified, the protocol has no notion of privacy; because agencies have begun to censor and threaten users of such systems, participation has decreased. In turn, users who continue to use the network choose not to share data for fear of litigation. This reduces data redundancy and increases the workload of fully participating peers. As files become less available, Gnutella peers must broadcast queries deeper into the network. While data participation is relatively uncontrollable, increased redundancy and decreased workload can be achieved by replicating files to other peers. This, however, must be done in a way that preserves the ability of the proxy peers to deny knowledge of file content. In this paper, we present an extension to the Gnutella protocol that achieves replication through encrypted mirroring. We further improve performance by directing queries using a Bloom filter mechanism. Through simulation, we explore the performance gains of these protocol extensions in terms of query success rate, query bandwidth consumption, and aggregate bandwidth consumption. In the end, BloomNet satisfies queries more readily than Gnutella while using approximately one-fourth of the bandwidth for queries.

1 Introduction

Traditionally, computers have communicated in a fashion modeled by the client-server paradigm: a client computer makes requests from a server computer that fulfills those requests.
This model has served as a central idea of computer networking for many years. It can be found anywhere from common protocols like HTTP and FTP to online banking systems. The problems inherent in this paradigm are rooted in its centralization: there is a single point of failure, which makes denial-of-service attacks and loss of privacy real possibilities. Recently, however, the peer-to-peer (P2P) paradigm has become increasingly popular because of its ability to provide ad-hoc collaboration, information sharing, privacy, self-administration, and efficient aggregation of existing distributed resources over a large-scale environment. Peer-to-peer file sharing (P2PFS) is specific to the information-sharing and privacy aspects of the P2P paradigm, in which any two hosts make a connection through a decentralized network in order to share files. One of the necessities of all P2P systems is cooperation; without it, these systems lose the very fabric of their existence. In the P2PFS domain, without peers sharing files, there are no files to download, making the system useless. In [4], it was empirically shown that over 70% of users of the popular P2PFS system Gnutella chose to free-ride: to download from the huge library of files without making any of their own files available. As more and more peers choose to free-ride, P2PFS degenerates into the client-server model with all its disadvantages. [4] also shows that a small number of Gnutella peers contribute a disproportionately large number of files. This behavior is indeed reminiscent of the client-server paradigm, where the few contributors act as servers and the remaining population acts as clients. Clearly, for all P2PFS systems, as fewer peers contribute files for the common good, the system's performance degrades; further, as mentioned before, if all peers choose to free-ride, the system collapses.
To make matters worse, increased threats of litigation by some agencies have decreased the replication of files in P2PFS systems. The network still boasts the same library of files; it simply has fewer copies of each. Unfortunately, demand does not change. This decreased replication causes an increase in the workload of sharing peers, since fewer peers must now supply the unchanged demand. An increase in query depth is similarly required to find data in the now more sparsely populated network. Therefore, the goal of this work is to improve the performance of such systems in the face of decreased replication. In particular, we make an extension to the Gnutella protocol called BloomNet that includes two performance-improving techniques: file mirroring and directed search. After introducing Gnutella further in Section 2, we discuss the overall design of the protocol extension in Section 3. Section 4 follows with a description of the constructed simulation model as well as metrics for determining performance, with Section 5 evaluating the results from the simulations according to those metrics. Lastly, Section 6 examines related work, Section 7 concludes, and Section 8 discusses possibilities for further improving BloomNet. The key contributions of this work are the addition of several improvements to the Gnutella protocol that allow for less query traffic with improved query success rates, and the creation of a versatile Gnutella simulator with many adjustable parameters.

2 Gnutella

The Gnutella protocol is a P2PFS model that provides a mechanism for the distributed searching of shared files across many connected hosts, called peers. To share files, a peer starts a Gnutella client A on her local networked computer. This client will then connect to an already-existing Gnutella client B, finding its address through some out-of-band means. Now, B will announce to all of the clients it knows (its neighbors) that a new client has joined the network.
This occurs recursively out into the network, until the announcement message has traveled a certain distance: the time-to-live, or TTL. Similarly, when querying for a file, client A will send out a Query message telling its neighbors that it is looking for a certain file. As other clients see this message, they check their locally stored files to see if any of them match. If a match is found, a QueryHit message is returned to the sender along the path taken by the Query. After checking for local matches, the client rebroadcasts the Query message to all of its neighbors. The number of messages, and hence the bandwidth, required for a query is clearly exponential in the breadth and depth of the broadcast; moreover, even if a file exists in the network, it is not guaranteed to be found if the Query message does not reach a client that is sharing the file.

In contrast to Gnutella, P2PFS systems have also been built on top of distributed hash tables (DHTs), which ameliorate the problem of creating too much traffic and guarantee the location of an object if it exists anywhere in the network. However, several factors arise in comparing Gnutella with DHT-based models that prompt us to favor improving Gnutella. First, DHTs can only provide exact-match file querying in a scalable manner, as opposed to Gnutella's built-in support for keyword searches. Second, DHTs expend much bandwidth when nodes join or leave the network (which happens extremely frequently), whereas Gnutella's ad-hoc topology creation requires little to no maintenance. Third, as argued in [9], DHTs enable the efficient location of a single file in the network, similar to finding a needle in a haystack. While DHTs are very adept at this, most queries in P2PFS systems are for hay: files that are widely replicated. Gnutella finds such files very easily.
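The TTL-limited flooding described above can be sketched as follows. This is a toy model for illustration only; the class and method names are ours, not part of the Gnutella specification, and real clients exchange binary messages over TCP rather than calling each other directly.

```python
class Peer:
    """Toy model of a Gnutella peer; names are illustrative, not from the spec."""
    def __init__(self, name, files):
        self.name = name
        self.files = set(files)
        self.neighbors = []
        self.seen = set()  # message IDs already handled, so loops don't re-flood

    def query(self, msg_id, keyword, ttl, hits):
        if msg_id in self.seen or ttl <= 0:
            return
        self.seen.add(msg_id)
        # Check locally stored files; a match corresponds to a QueryHit
        # traveling back along the path the Query took.
        if any(keyword in f for f in self.files):
            hits.append(self.name)
        # Rebroadcast to all neighbors with a decremented TTL.
        for n in self.neighbors:
            n.query(msg_id, keyword, ttl - 1, hits)

def connect(a, b):
    a.neighbors.append(b)
    b.neighbors.append(a)

# A small chain A - B - C - D; B and D both share a matching file.
a, b = Peer("A", []), Peer("B", ["song.mp3"])
c, d = Peer("C", ["other.txt"]), Peer("D", ["song.mp3"])
connect(a, b); connect(b, c); connect(c, d)

hits = []
a.query(msg_id=1, keyword="song", ttl=3, hits=hits)
# With TTL 3 the Query reaches B and C but dies before D: the file exists
# at D yet goes unfound, illustrating the guarantee problem noted above.
print(hits)  # → ['B']
```

Raising the TTL would eventually reach D, but at exponentially growing message cost, which is exactly the tradeoff the text describes.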
Fourth, Gnutella is already widely deployed, and applying incremental changes to already-deployed systems is more likely to succeed than trying to deploy a new system. It is for these four reasons that we chose to focus our efforts on improving the Gnutella protocol in designing BloomNet, as opposed to creating a new DHT-based model.

3 BloomNet Design

BloomNet makes two major additions to the Gnutella protocol, both of which are aimed at improving performance given decreased file replication. Each addition introduces its functionality to the protocol through a new message type; both are described below in Table 1. The use of file mirroring is discussed first in Section 3.1, followed by Section 3.2's explanation of directed queries.

Table 1. A listing and description of the two messages used by BloomNet.

  Message                 | Description
  Mirroring Request (MRM) | The mechanism by which mirrors are chosen and created
  Bloom                   | Used in conjunction with Ping messages to discover the Bloom filter associated with a node on the network

3.1 File Mirroring

The goal of mirroring is to increase the replication factor of files while keeping sole legal blame on the original sharer, called the originator. This gives BloomNet a way to deal with flash-crowd situations, as well as a means to allow more peers to find mirrored files. We do so by means of a new protocol message called a Mirroring Request Message (MRM), coupled with file encryption. Throughout this section, we explore the problem space from the point of view of a single client.

The first decision the originator must make is the strategy with which to replicate its f files F1 … Ff. A naïve technique would be to replicate all f files as much as possible. However, this would consume so much bandwidth in file-transfer traffic that it would outweigh the benefits, as seen in Figure 1.

Figure 1. The originator sends MRMs for all of its files to all of its neighbors.
Figure 2. The originator sends one MRM for a single file each time its demand is above mirrorThresh.

The more conservative approach taken by BloomNet is to replicate only certain files, requiring the client to decide which file to mirror at what time (see Figure 2, above).
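The Bloom message in Table 1 lets a peer advertise a compact summary of its content so that neighbors can direct queries instead of flooding blindly. The following is a minimal sketch of that idea; the filter size, hash count, and keying scheme are our own illustrative assumptions, not BloomNet's actual parameters.

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter; m (bits) and k (hashes) are illustrative choices."""
    def __init__(self, m=1024, k=3):
        self.m, self.k = m, k
        self.bits = 0  # m-bit array packed into one integer

    def _indexes(self, key):
        # Derive k bit positions by hashing the key with a counter prefix.
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(h[:4], "big") % self.m

    def add(self, key):
        for idx in self._indexes(key):
            self.bits |= 1 << idx

    def might_contain(self, key):
        # False means definitely absent; True means "possibly present"
        # (Bloom filters admit false positives but never false negatives).
        return all(self.bits & (1 << idx) for idx in self._indexes(key))

# A peer builds a filter over its shared keywords and could advertise it in
# response to Pings; a neighbor then forwards a Query only toward peers
# whose filter reports a possible match, rather than toward everyone.
advertised = BloomFilter()
for word in ["gnutella", "bloom", "mirror"]:
    advertised.add(word)

print(advertised.might_contain("bloom"))  # → True
```

A query for a keyword whose bits are not all set can safely skip that neighbor; an occasional false positive costs only a wasted forward, never a missed file on that peer.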
Recommended publications
  • The eDonkey File-Sharing Network
    The eDonkey File-Sharing Network. Oliver Heckmann, Axel Bock, Andreas Mauthe, Ralf Steinmetz. Multimedia Kommunikation (KOM), Technische Universität Darmstadt, Merckstr. 25, 64293 Darmstadt. (heckmann, bock, mauthe, steinmetz)@kom.tu-darmstadt.de. Abstract: The eDonkey 2000 file-sharing network is one of the most successful peer-to-peer file-sharing applications, especially in Germany. The network itself is a hybrid peer-to-peer network with client applications running on the end-system that are connected to a distributed network of dedicated servers. In this paper we describe the eDonkey protocol and measurement results on network/transport layer and application layer that were made with the client software and with an open-source eDonkey server we extended for these measurements. 1 Motivation and Introduction: Most of the traffic in the network of access and backbone Internet service providers (ISPs) is generated by peer-to-peer (P2P) file-sharing applications [San03]. These applications are typically bandwidth-greedy and generate more long-lived TCP flows than the WWW traffic that dominated Internet traffic before the P2P applications. To understand the influence of these applications, the characteristics of the traffic they produce, and their impact on network design, capacity expansion, traffic engineering and shaping, it is important to empirically analyse the dominant file-sharing applications. The eDonkey file-sharing protocol is one of these file-sharing protocols. It is implemented by the original eDonkey2000 client [eDonkey] and additionally by some open-source clients like mldonkey [mlDonkey] and eMule [eMule]. According to [San03], with 52% of the generated file-sharing traffic it is the most successful P2P file-sharing network in Germany, even more successful than the FastTrack protocol used by the P2P client KaZaa [KaZaa], which comes to 44% of the traffic.
  • IPFS and Friends: a Qualitative Comparison of Next Generation Peer-To-Peer Data Networks Erik Daniel and Florian Tschorsch
    IPFS and Friends: A Qualitative Comparison of Next Generation Peer-to-Peer Data Networks. Erik Daniel and Florian Tschorsch. Abstract: Decentralized, distributed storage offers a way to reduce the impact of data silos as often fostered by centralized cloud storage. While the intentions of this trend are not new, the topic gained traction due to technological advancements, most notably blockchain networks. As a consequence, we observe that a new generation of peer-to-peer data networks emerges. In this survey paper, we therefore provide a technical overview of the next generation data networks. We use select data networks to introduce general concepts and to emphasize new developments. Specifically, we provide a deeper outline of the Interplanetary File System and a general overview of Swarm, the Hypercore Protocol, SAFE, Storj, and Arweave. We identify common building blocks and provide a qualitative comparison. From the overview, we derive future challenges and research goals concerning data networks.

    ...types of files [1]. Napster and Gnutella marked the beginning and were followed by many other P2P networks focusing on specialized application areas or novel network structures. For example, Freenet [2] realizes anonymous storage and retrieval. Chord [3], CAN [4], and Pastry [5] provide protocols to maintain a structured overlay network topology. In particular, BitTorrent [6] received a lot of attention from both users and the research community. BitTorrent introduced an incentive mechanism to achieve Pareto efficiency, trying to improve network utilization while achieving a higher level of robustness. We consider networks such as Napster, Gnutella, Freenet, BitTorrent, and many more as first generation P2P data networks...
  • Scalable Supernode Selection in Peer-To-Peer Overlay Networks∗
    Scalable Supernode Selection in Peer-to-Peer Overlay Networks∗. Virginia Lo, Dayi Zhou, Yuhong Liu, Chris GauthierDickey, and Jun Li. {lo|dayizhou|liuyh|chrisg|lijun}@cs.uoregon.edu. Network Research Group, University of Oregon. Abstract: We define a problem called the supernode selection problem which has emerged across a variety of peer-to-peer applications. Supernode selection involves selection of a subset of the peers to serve a special role. The supernodes must be well-dispersed throughout the peer-to-peer overlay network, and must fulfill additional requirements such as load balance, resource needs, adaptability to churn, and heterogeneity. While similar to dominating set and p-centers problems, the supernode selection problem must meet the additional challenge of operating within a huge, unknown and dynamically changing network. We describe three generic supernode selection protocols we have developed for peer-to-peer environments: a label-based scheme for structured...

    ...ically fulfill additional requirements such as load balance, resources, access, and fault tolerance. The supernode selection problem is highly challenging because in the peer-to-peer environment, a large number of supernodes must be selected from a huge and dynamically changing network in which neither the node characteristics nor the network topology are known a priori. Thus, simple strategies such as random selection don't work. Supernode selection is more complex than classic dominating set and p-centers from graph theory, known to be NP-hard problems, because it must respond to dynamic joins and leaves (churn), operate among potentially malicious nodes, and function in an environment that is highly heterogeneous. Supernode selection shows up in many peer-to-peer and networking applications.
  • CS 552 Peer 2 Peer Networking
    CS 552: Peer 2 Peer Networking. R. Martin. Credit: slides from B. Richardson, I. Stoica, M. Cuenca. Outline: overview; systems: Gnutella, Freenet, Chord, PlanetP. Why study P2P: a huge fraction of traffic on networks today (>= 50%); exciting new applications; the next level of resource sharing (vs. timesharing and client-server), e.g. access to tens to hundreds of TB at low cost. P2P usage: on the CMU network (external to the world) in 2003, 47% of all traffic was easily classifiable as P2P, 18% was HTTP, and of the remaining 35% roughly 28% is believed to be port-hopping P2P; other sites show a similar distribution. Big picture: Gnutella focuses on simple sharing using simple flooding; BitTorrent is designed for high bandwidth; PlanetP focuses on search and retrieval, creating a global index on each node via controlled, randomized flooding; Chord focuses on building a distributed hash table (DHT) using finger tables. Other P2P systems: Freenet (focuses on privacy and anonymity; builds internal routing tables), KaZaa, eDonkey, and Napster, whose success started the whole craze. Key issues for P2P systems: join/leave (how do nodes join and leave, and who is allowed?); search and retrieval (how is content found, and how are metadata indexes built, stored, and distributed?); content distribution (where is content stored, and how is it downloaded and retrieved?). Search and retrieval: the basic strategies are flooding the query, flooding the index, and routing the query, with different tradeoffs (robustness, scalability, legal issues) depending on the application. [Diagrams: flooding the query (Gnutella) is highly robust but generates huge network traffic; flooding the index (PlanetP) is robust.]
  • Free Riding on Gnutella
    Free Riding on Gnutella. Eytan Adar and Bernardo A. Huberman. Internet Ecologies Area, Xerox Palo Alto Research Center, Palo Alto, CA 94304. Abstract: An extensive analysis of user traffic on Gnutella shows a significant amount of free riding in the system. By sampling messages on the Gnutella network over a 24-hour period, we established that nearly 70% of Gnutella users share no files, and nearly 50% of all responses are returned by the top 1% of sharing hosts. Furthermore, we found that free riding is distributed evenly across domains, so that no one group contributes significantly more than others, and that peers that volunteer to share files are not necessarily those who have desirable ones. We argue that free riding leads to degradation of system performance and adds vulnerability to the system. If this trend continues, copyright issues might become moot compared to the possible collapse of such systems. 1. Introduction: The sudden appearance of new forms of network applications such as Gnutella [Gn00a] and FreeNet [Fr00] holds promise for the emergence of fully distributed information-sharing systems. These systems, inspired by Napster [Na00], will allow users worldwide to access and provide information while enjoying a level of privacy not possible in the present client-server architecture of the Web. While a lot of attention has been focused on the issue of free access to music and the violation of copyright laws through these systems, there remains the additional problem of securing enough cooperation in such large and anonymous systems that they become truly useful. Since users are not monitored as to who makes their files available to the rest of the network (produce) or downloads remote files (consume), nor are statistics maintained, the possibility exists that as the user community in such networks grows large, users will stop producing and only consume.
  • Title: P2P Networks for Content Sharing
    Title: P2P Networks for Content Sharing. Authors: Choon Hoong Ding, Sarana Nutanong, and Rajkumar Buyya. Grid Computing and Distributed Systems Laboratory, Department of Computer Science and Software Engineering, The University of Melbourne, Australia. (chd, sarana, raj)@cs.mu.oz.au. ABSTRACT: Peer-to-peer (P2P) technologies have been widely used for content sharing, in popularly called "file-swapping" networks. This chapter gives a broad overview of content-sharing P2P technologies. It starts with the fundamental concept of P2P computing, followed by an analysis of the network topologies used in peer-to-peer systems. Next, three milestone peer-to-peer technologies, Napster, Gnutella, and FastTrack, are explored in detail, and finally compared in a table in the last section. 1. INTRODUCTION: Peer-to-peer (P2P) content sharing has been an astonishingly successful P2P application on the Internet. P2P gained tremendous public attention through Napster, the system supporting music sharing on the Web. It is a new and emerging, interesting research technology and a promising product base. The Intel P2P working group gave the definition of P2P as "the sharing of computer resources and services by direct exchange between systems". This gives P2P systems two main key characteristics: • Scalability: there is no algorithmic or technical limitation on the size of the system; e.g. the complexity of the system should remain roughly constant regardless of the number of nodes in the system. • Reliability: a malfunction on any given node will not affect the whole system (or perhaps even any other node). File-sharing networks like Gnutella are a good example of scalability and reliability.
  • Gnutella Protocol
    A BRIEF INTRODUCTION AND ANALYSIS OF THE GNUTELLA PROTOCOL. By Gayatri Tribhuvan, University of Freiburg, Masters in Applied Computer Science. [email protected]. ABSTRACT: Peer-to-peer technology has certainly improved mechanisms for downloading files at a very high rate. Statistics and research show that as more peer-to-peer protocols were developed, the number of nodes in the entire network increased. In this paper, we observe the structure and operation of Gnutella, a peer-to-peer protocol. The paper also focuses on some issues of this protocol and what improvements were made to improve its scalability. We also look at some of the security issues of this protocol. Some statistics in the paper also reflect important changes that the protocol inflicted on the entire network. 1. INTRODUCTION: Gnutella is a P2P protocol. This paper tries to address some of the issues that have been faced by users of Gnutella, including scalability (increasing the scale of operation, i.e. the volume of operation with a progressively larger number of users) and security. The Gnutella protocol is an open, decentralized group membership and search protocol, mainly used for file searching and sharing. Group membership is open, and the search protocol addresses searching and sharing of files. The term Gnutella represents the entire group of computers which have Gnutella-speaking applications loaded on them, forming a virtual network. Each node can function as both client and server: it can issue queries to other nodes as well as accept and respond to queries from other nodes, after matching the queries against the contents loaded on its own hard disk.
  • Digital Piracy on P2P Networks How to Protect Your Copyrighted Content
    Digital Piracy on P2P Networks: How to Protect Your Copyrighted Content. Olesia Klevchuk and Sam Bahun, MarkMonitor. © 2014 MarkMonitor Inc. All rights reserved. Agenda: the P2P landscape (history and recent developments); detection and verification of unauthorized content on P2P sites; enforcement strategies; alternatives to enforcement. History of digital piracy: music piracy entered the mainstream with Napster; P2P brought software and video piracy; there has since been a shift to consumption of streaming content, with TV and sports most impacted. [Timeline chart: user counts for UseNet, Napster, P2P, and live streaming from 1995 to 2015, growing from under 5 MM to over 1B streaming users.] First generation of P2P, from Napster to eDonkey2000: Napster brought P2P to the masses, but its centralized server model made it possible to shut down the network. Second generation of P2P (Kazaa, Gnutella, and torrent sites): able to operate without a central server by connecting users remotely to each other; difficult to shut down; attracted millions of users worldwide; required some technical ability and was plagued with pop-up ads and malware. New P2P piracy: little to no technical ability required; attractive, user-friendly interfaces; BitTorrent-powered, making enforcement challenging. Popcorn Time, a BitTorrent-powered streaming app, allows users to watch thousands of movies instantaneously; in the U.S., the software was downloaded onto millions of devices, and its interface resembles that of popular legitimate streaming platforms. P2P adoption and usage: BitTorrent is among the most popular platforms online (Twitter: 307 million users; Facebook: 1.44 billion users; Netflix: 69 million subscribers; BitTorrent: 300 million users). P2P piracy shows a steady trend in the number of infringements.
  • Searching for Malware in Bittorrent∗
    Searching for Malware in BitTorrent∗. Andrew D. Berns and Eunjin (EJ) Jung. April 24, 2008. Abstract: One of the most widely publicized aspects of computer security has been the presence and propagation of malware. Malware has adapted to many different changing technologies, including recently popular P2P systems. While previous work has examined P2P networks and protocols like KaZaA and Gnutella for malware, little has been done so far to examine BitTorrent. This project explored BitTorrent for the presence of malware, and discovered a significant portion of malware in the downloaded file set. Statistics on torrents infected with malware were gathered and analyzed to find patterns that are helpful in creating basic filtering heuristics. While these heuristics may work in simple cases, several easy ways they can be defeated were found. 1 Introduction: Recently, peer-to-peer networks have emerged as a popular paradigm for Internet applications. In fact, a study in 2005 estimated that P2P traffic accounted for around 70% of all traffic on the Internet [2]. P2P technology is finding new applications as it grows, including voice-over-IP systems and streaming video delivery. While P2P has found several different uses, perhaps the most widely known use is for file sharing. One concern with P2P file sharing is that it can be used to distribute malware (malicious software, such as worms, viruses, and rootkits). On the one hand, users have access to huge amounts of data, while on the other hand, this data can easily be tainted with viruses, worms, and other forms of malware. An important consideration, then, is whether the concern about malware in P2P networks is warranted, and if it is, whether there are ways to protect a computer and minimize the risk of malware infection from P2P networks.
  • Sharing Files on Peer-To-Peer Networks Based on Social Network
    Sharing Files on Peer-to-Peer Networks based on Social Network. Fabrice Le Fessant, INRIA Saclay-Île de France. 23 February 2009. Fighting censorship: standard peer-to-peer networks are vulnerable, because spies can locate content providers, and states and big companies already have spies on P2P networks. Social networks are less vulnerable to attacks by spies, so a peer-to-peer network combined with a social network is less vulnerable to censorship... maybe. Peer-to-peer + social network: each user allows his computer to connect only to his friends' computers. What can we do with such a network? File sharing involves three main operations: resource discovery (find which resources exist), resource localization (find which resources are available, and where), and resource access (download the resources). My main requirement: content providers should remain unknown, while links (friends) are not hidden, since ISPs must share such information with spies.
  • Gnutella and Freenet
    Decentralized Peer-to-Peer Network Architecture: Gnutella and Freenet AUTHOR: Jem E. Berkes [email protected] University of Manitoba Winnipeg, Manitoba Canada April 9, 2003 Introduction Although traditional network file systems like NFS provide a reliable way for users on a LAN to pool and share data, Internet-wide file sharing is still in its infancy. Software developers and researchers are struggling to find new ways to reliably, efficiently and securely share data across wide area networks that are plagued by high latency, bottlenecks, and unreliable or malicious nodes. This paper will focus on decentralized file sharing networks that allow free Internet-wide participation with generic content. A decentralized network has no central authority, which means that it can operate with freely running nodes alone (peer-to-peer, or P2P). Much of the current research into file sharing focuses on such systems, after the repeated failure of commercially-oriented networks such as Napster™ and Morpheus™ demonstrated that centralized and purely multimedia-based systems were unsuitable for long-term use by the general Internet public. Building a useful decentralized file sharing network is no small feat, but an effective system will be a remarkable accomplishment for the modern Internet. A robust, massive content distribution network will have a multitude of uses. The huge amount of accessible data (already into hundreds of terabytes on existing networks) and enormous transfer capacity at little or no cost to individual participants demonstrates the value of such a system, which may soon become a core Internet technology akin to the World Wide Web. Such large, anonymous networks seem quite natural for the Internet, as they demonstrate the epitome of pooling resources for the mutual benefit of all users.
  • Exploiting the Security Weaknesses of the Gnutella Protocol *
    Exploiting the Security Weaknesses of the Gnutella Protocol*. Demetrios Zeinalipour-Yazti, Department of Computer Science, University of California, Riverside, CA 92507, USA. [email protected]. Abstract: Peer-to-peer (P2P) file-sharing systems such as Gnutella, Morpheus and Freenet have recently attracted a lot of interest from the Internet community because they realize a distributed infrastructure for sharing files. Such systems have shifted the Web's client-server paradigm to a client-client model. The tremendous success of such systems has proven that purely distributed search systems are feasible and that they may change the way we interact on the Internet. Besides the several advantages that P2P systems have uncovered, such as robustness, scalability and high fault tolerance, various other questions and issues arise in the context of security. Many P2P protocols are bundled with an adequate number of security mechanisms but are proprietary, which makes their analysis difficult. The Gnutella protocol, on the other hand, is an open protocol that does not emphasize security, for the sake of simplicity. Most security weaknesses of the Gnutella protocol could be avoided if the protocol took into account that peers may misbehave. In this paper we provide an overview of the Gnutella protocol specification, describe several of its weaknesses, and show how they can be turned into distributed denial-of-service attacks, violations of users' privacy, and IP harvesting. We present the weaknesses with experimental attacks that we have performed on the Gnutella network. We finally evaluate how these attacks could be avoided and suggest, in some cases, improvements to the protocol.