Crawling Bittorrent Dhts for Fun and Profit

Total Page:16

File Type:pdf, Size:1020Kb

Crawling Bittorrent Dhts for Fun and Profit To appear in Proc. 4th USENIX Workshop on Offensive Technologies (WOOT ’10), Washington, D.C., August 9, 2010. Crawling BitTorrent DHTs for Fun and Profit Scott Wolchok and J. Alex Halderman The University of Michigan {swolchok, jhalderm}@eecs.umich.edu Abstract major torrent discovery sites and the major trackers [1,29] is feasible. Though a boon to copyright enforcement, this This paper presents two kinds of attacks based on crawl- capability reduces the privacy of BitTorrent downloaders ing the DHTs used for distributed BitTorrent tracking. by making it much harder to “hide in the crowd.” First, we show how pirates can use crawling to rebuild The BitTorrent community has responded to these BitTorrent search engines just a few hours after they are trends by deploying decentralizing tracking systems based shut down (crawling for fun). Second, we show how on distributed hash tables (DHTs). So far, their impact content owners can use related techniques to monitor pi- on privacy and copyright enforcement has been unclear. rates’ behavior in preparation for legal attacks and negate Distributed tracking is harder to disrupt through legal any perceived anonymity of the decentralized BitTorrent threats, but users still rely on centralized torrent discov- architecture (crawling for profit). ery sites that remain relatively easy targets. Some might We validate these attacks and measure their perfor- expect that decentralizing the tracking architecture would mance with a crawler we developed for the Vuze DHT. enhance privacy by making monitoring more difficult, but We find that we can establish a search engine with over the question has received little research attention to date. one million torrents in under two hours using a single In this work, we demonstrate how deployed distributed desktop PC. We also track 7.9 million IP addresses down- tracking systems can be exploited by both sides in the Bit- loading 1.5 million torrents over 16 days. These results Torrent arms race. Each can attack the other by employing imply that shifting from centralized BitTorrent tracking recent techniques for rapidly crawling the DHTs. to DHT-based tracking will have mixed results for the file sharing arms race. While it will likely make illicit torrents On one hand, we find that DHT crawling can be used harder to quash, it will not help users hide their activities. to bootstrap torrent discovery sites quickly, allowing new sites to spring up almost immediately when existing ones become inaccessible. Using only a single machine, we 1 Introduction can create a BitTorrent search engine indexing over one million torrents in under two hours with this technique. The BitTorrent protocol is used to distribute a wide range On the other hand, we show that it is possible to effi- of content to millions of people. BitTorrent’s core de- ciently monitor BitTorrent user activity by crawling only sign uses centralized but diverse “trackers” to coordinate the DHTs and not the centralized tracker infrastructure peers sharing sets of files, referred to as “torrents.” Users or torrent discovery sites. In 16 days of monitoring, we discover content out of band, often through Web-based were able to observe 1.5 million torrents downloaded by torrent search engines and indexes that we will refer to as 7.9 million IP addresses. Effectively, this lets us measure “torrent discovery sites.” what content each user is sharing. Both trackers and torrent discovery sites have come We validate our monitoring and search results by com- under mounting legal pressure from major content own- paring them to results from previous work by other ers [6, 7, 14, 19, 23, 25], who have in the past success- groups [1, 29] that measured the centralized BitTorrent fully shuttered popular services such as suprnova.org [5], infrastructure. In addition, we categorize our sample of OiNK [16], and TorrentSpy [18]. BitTorrent users also torrents and confirm that the distribution is similar to face surveillance by content owners, who are increasingly previous categorizations of BitTorrent content. making legal threats against individuals [8, 9]. Recent The remainder of this paper is organized as follows. work has shown that large-scale monitoring of both the Section 2 introduces BitTorrent, distributed tracking, and our previous work on crawling DHTs. Section 3 explains receiving node; FIND-NODE, which requests the k closest our technique for bootstrapping BitTorrent search en- contacts to a given ID from the receiver’s routing table; gines. Section 4 presents our method for profiling BitTor- and FIND-VALUE, which causes the receiver to return any rent downloaders. Section 5 describes our experimental values it is storing for a requested key. Nodes periodically setup, Section 6 evaluates the performance of our rapid- replicate the values they store to the k peers closest to bootstrapped search engine, and Section 7 presents the each key. The parameter k is specific to each Kademlia results of our user and content profiling. Section 8 surveys implementation. related work, and Section 9 concludes. BitTorrent has two separate Kademlia implementations: the Mainline DHT [20] and the Vuze DHT [26]. Several BitTorrent clients support the Mainline DHT, which is the 2 Background larger of the two. The Vuze DHT was developed sepa- rately for the popular Vuze BitTorrent client, and typically This section provides overviews of BitTorrent, distributed contains about 25–50% as many nodes as Mainline. We tracking, and our DHT crawling methods, with a focus on focus on the Vuze DHT, as Section 3 explains, but our details that will be relevant in our attacks. results could be extended to the Mainline DHT. The BitTorrent Protocol BitTorrent [2] is a peer-to- Distributed Tracking Both BitTorrent DHTs use the peer file sharing protocol that makes up a substantial same general design for distributed tracking: rather than proportion of traffic on today’s Internet. One recent finding other peers using a tracker, a peer joins a torrent by study [15] found that it comprised a majority (50% to looking up the hash of the “info” section of the .torrent 70%) of traffic in several geographic locations. The key file (known as the “infohash”) as a key in the DHT and innovation of BitTorrent is its ability to use peers’ often using peers from the associated value, which is simply otherwise unused upload bandwidth to share pieces of a list of members of the torrent with that infohash. The files among peers before the entire “torrent,” which is the announcement to the tracker is likewise replaced with a term used to refer to both a set of files being shared and STORE(infohash, peer_address) operation. the set of peers downloading and sharing those files, has In order to support fully distributed tracking, the Bit- finished downloading. Torrent metadata, such as the list Torrent community is converging [13] on a method for of files contained in the torrent and checksums for each describing DHT-tracked torrents with a single hyper- chunk or “piece” of each file, are contained in .torrent link known as a “magnet link” instead of a traditional files, which users usually obtain from torrent discovery .torrent file. A magnet link contains a torrent’s info- sites or other websites. hash together with minimal optional metadata, but, to BitTorrent Tracking The original BitTorrent protocol keep the link short, it omits other metadata items that are relies on centralized servers known as “trackers” to in- necessary for downloading the files and checking their form peers about each other. To join a torrent, a peer P integrity. To solve this problem, BitTorrent developers announces itself to the tracker listed in the .torrent file. have implemented several metadata exchange protocols The tracker records P’s presence and replies with a partial that allow peers to query each other for a .torrent given list of the peers currently downloading the torrent. P then its infohash. Centralized services such as torrage.com and connects to these peers using the BitTorrent peer protocol torcache.com also allow users to download .torrent and begins exchanging pieces with them. files given their infohashes. In response to legal pressure from content owners, Bit- BitTorrent DHT Crawler In previous work [27], we Torrent developers have created extension protocols to developed a crawler that rapidly captures the contents of locate and track peers in a distributed fashion, rather than the Vuze DHT. (We employed it in a context unrelated relying on legally-vulnerable central servers. For our to BitTorrent: attacking a self-destructing data system purposes, the principal distributed tracking methods of called Vanish [11].) We briefly describe it here, referring interest are the BitTorrent distributed hash tables (DHTs). interested readers to our earlier paper for the full details. Distributed Hash Tables A distributed hash table is Our crawler, ClearView, is a C-based reimplementation a decentralized key-value store that is implemented as a of the Vuze DHT protocol. It harvests content by means peer-to-peer network. Nodes and content have numeric of a Sybil attack [3], inserting many clients into the DHT identifiers in the same key space, and each node is re- and waiting for content to be replicated to them by nearby sponsible for hosting a replica of the content it is close peers. It can efficiently host thousands of DHT clients to in the space. The principal DHT design used in file in a single process. After most nearby content has been sharing applications today is Kademlia [21]. Kademlia replicated, the clients “hop” to new locations in the DHT nodes communicate over UDP using simple RPCs, includ- address space, allowing ClearView to capture more data ing: STORE, which stores a set of key-value pairs at the with fewer resources. 2 2) Extract torrent descriptions and peers ClearView 3) read infohashes Log and peers Local disk processor Search queries 1) Crawl 5) write metadata .torrent Web server crawler 4) metadata request/response Search requests & results Vuze clients Figure 1: SuperSeed Architecture — Our system rapidly bootstraps a BitTorrent search engine using data from the Vuze DHT.
Recommended publications
  • What Is Peer-To-Peer File Transfer? Bandwidth It Can Use
    sharing, with no cap on the amount of commonly used to trade copyrighted music What is Peer-to-Peer file transfer? bandwidth it can use. Thus, a single NSF PC and software. connected to NSF’s LAN with a standard The Recording Industry Association of A peer-to-peer, or “P2P,” file transfer 100Mbps network card could, with KaZaA’s America tracks users of this software and has service allows the user to share computer files default settings, conceivably saturate NSF’s begun initiating lawsuits against individuals through the Internet. Examples of P2P T3 (45Mbps) internet connection. who use P2P systems to steal copyrighted services include KaZaA, Grokster, Gnutella, The KaZaA software assesses the quality of material or to provide copyrighted software to Morpheus, and BearShare. the PC’s internet connection and designates others to download freely. These services are set up to allow users to computers with high-speed connections as search for and download files to their “Supernodes,” meaning that they provide a How does use of these services computers, and to enable users to make files hub between various users, a source of available for others to download from their information about files available on other create security issues at NSF? computers. users’ PCs. This uses much more of the When configuring these services, it is computer’s resources, including bandwidth possible to designate as “shared” not only the and processing capability. How do these services function? one folder KaZaA sets up by default, but also The free version of KaZaA is supported by the entire contents of the user’s computer as Peer to peer file transfer services are highly advertising, which appears on the user well as any NSF network drives to which the decentralized, creating a network of linked interface of the program and also causes pop- user has access, to be searchable and users.
    [Show full text]
  • From Sony to SOPA: the Technology-Content Divide
    From Sony to SOPA: The Technology-Content Divide The Harvard community has made this article openly available. Please share how this access benefits you. Your story matters Citation John Palfrey, Jonathan Zittrain, Kendra Albert, and Lisa Brem, From Sony to SOPA: The Technology-Content Divide, Harvard Law School Case Studies (2013). Citable link http://nrs.harvard.edu/urn-3:HUL.InstRepos:11029496 Terms of Use This article was downloaded from Harvard University’s DASH repository, and is made available under the terms and conditions applicable to Open Access Policy Articles, as set forth at http:// nrs.harvard.edu/urn-3:HUL.InstRepos:dash.current.terms-of- use#OAP http://casestudies.law.harvard.edu By John Palfrey, Jonathan Zittrain, Kendra Albert, and Lisa Brem February 23, 2013 From Sony to SOPA: The Technology-Content Divide Background Note Copyright © 2013 Harvard University. No part of this publication may be reproduced, stored in a retrieval system, used in a spreadsheet, or transmitted in any form or by any means – electronic, mechanical, photocopying, recording, or otherwise – without permission. "There was a time when lawyers were on one side or the other of the technology content divide. Now, the issues are increasingly less black-and-white and more shades of gray. You have competing issues for which good lawyers provide insights on either side." — Laurence Pulgram, partner, Fenwick & Westi Since the invention of the printing press, there has been tension between copyright holders, who seek control over and monetary gain from their creations, and technology builders, who want to invent without worrying how others might use that invention to infringe copyrights.
    [Show full text]
  • Downloading Copyrighted Materials
    What you need to know before... Downloading Copyrighted Materials Including movies, TV shows, music, digital books, software and interactive games The Facts and Consequences Who monitors peer-to-peer file sharing? What are the consequences at UAF The Motion Picture Association of America for violators of this policy? (MPAA), Home Box Office, and other copyright Student Services at UAF takes the following holders monitor file-sharing on the Internet minimum actions when the policy is violated: for the illegal distribution of their copyrighted 1st Offense: contents. Once identified they issue DMCA Loss of Internet access until issue is resolved. (Digital Millennium Copyright Act) take-down 2nd Offense: notices to the ISP (Internet Service Provider), in Loss of Internet access pending which the University of Alaska is considered as resolution and a $100 fee assessment. one, requesting the infringement be stopped. If 3rd Offense: not stopped, lawsuit against the user is possible. Loss of Internet access pending resolution and a $250 fee assessment. What is UAF’s responsibility? 4th, 5th, 6th Offense: Under the Digital Millennium Copyright Act and Loss of Internet access pending resolution and Higher Education Opportunity Act, university a $500 fee assessment. administrators are obligated to track these infractions and preserve relevent logs in your What are the Federal consequences student record. This means that if your case goes for violators? to court, your record may be subpoenaed as The MPAA, HBO and similar organizations are evidence. Since illegal file sharing also drains becoming more and more aggressive in finding bandwidth, costing schools money and slowing and prosecuting alleged offenders in criminal Internet connections, for students trying to use court.
    [Show full text]
  • Seedboxes Cc Forum
    Seedboxes Cc Forum It's simple to get started with, and incredibly functional. Never had problems using. cc Mobile Apps for ios and android user. cc German- English Dictionary: Translation for Sumpf. Ellenkező esetben a webhely funkcionalitása korlátozott. Ask questions regarding our services or generic seedbox related tasks. Gradstein & Celis, M. 2021, 19:09 Replies: 3 [Wait for Plugin Update] Pornhub plugin - only HLS - duplicate problem. Seedbucket was developed in-house by Seedboxes. eu e as velocidades de UP e DOWN são muitos boas mas a limitação de tráfego complica. OB Config for seedboxes. Seedbox & Hosting. A lot slower to post updates though. 2011-05-18: Added a link to a post on HP’s support forum where the post helped a bit. A good starting place for additional information about Rapidleech is the Wiki and the official forum. cc, byte-sized-hosting. ---Description---Tellytorrent is an Indian private tracker for Indian movies & series with a collection of BD50, 4kUHD, DVD9, NetFlix DL & Amazon DL- Source. cc often offers special discounts – called “promo codes” on its website. We have not put down all the specifications but you can read more about them in this post. The Raspberry Pi 4 dropped and it's a major update for the flagship single-board computer. On September 2, 2009, isoHunt announced the launch of a spinoff site, hexagon. 53GHz HyperThreading),. cc - Quality and affordable seedbox with premium bandwidth Seedboxes. Sdedi propose des solutions Seedbox uniques et innovantes : un espace disque illimité, un réseau de 10 Gigas, une app seedbox mobile et plus encore, à partir de 2,99 euros.
    [Show full text]
  • A Dissertation Submitted to the Faculty of The
    A Framework for Application Specific Knowledge Engines Item Type text; Electronic Dissertation Authors Lai, Guanpi Publisher The University of Arizona. Rights Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction or presentation (such as public display or performance) of protected items is prohibited except with permission of the author. Download date 25/09/2021 03:58:57 Link to Item http://hdl.handle.net/10150/204290 A FRAMEWORK FOR APPLICATION SPECIFIC KNOWLEDGE ENGINES by Guanpi Lai _____________________ A Dissertation Submitted to the Faculty of the DEPARTMENT OF SYSTEMS AND INDUSTRIAL ENGINEERING In Partial Fulfillment of the Requirements For the Degree of DOCTOR OF PHILOSOPHY In the Graduate College THE UNIVERSITY OF ARIZONA 2010 2 THE UNIVERSITY OF ARIZONA GRADUATE COLLEGE As members of the Dissertation Committee, we certify that we have read the dissertation prepared by Guanpi Lai entitled A Framework for Application Specific Knowledge Engines and recommend that it be accepted as fulfilling the dissertation requirement for the Degree of Doctor of Philosophy _______________________________________________________________________ Date: 4/28/2010 Fei-Yue Wang _______________________________________________________________________ Date: 4/28/2010 Ferenc Szidarovszky _______________________________________________________________________ Date: 4/28/2010 Jian Liu Final approval and acceptance of this dissertation is contingent
    [Show full text]
  • Analyzing Peer Selection Policies for Bittorrent Multimedia On-Demand Streaming Systems in Internet
    International Journal of Computer Networks & Communications (IJCNC) Vol.6, No.1, January 2014 ANALYZING PEER SELECTION POLICIES FOR BIT TORRENT MULTIMEDIA ON -DEMAND STREAMING SYSTEMS IN INTERNET Carlo Kleber da Silva Rodrigues 1,2 1Electrical and Electronics Department, Armed Forces University – ESPE, Sangolquí, Ecuador 2University Center of Brasília – UniCEUB, Brasília, DF, Brazil ABSTRACT The adaptation of the BitTorrent protocol to multimedia on-demand streaming systems essentially lies on the modification of its two core algorithms, namely the piece and the peer selection policies, respectively. Much more attention has though been given to the piece selection policy. Within this context, this article proposes three novel peer selection policies for the design of BitTorrent-like protocols targeted at that type of systems: Select Balanced Neighbour Policy (SBNP), Select Regular Neighbour Policy (SRNP), and Select Optimistic Neighbour Policy (SONP). These proposals are validated through a competitive analysis based on simulations which encompass a variety of multimedia scenarios, defined in function of important characterization parameters such as content type, content size, and client´s interactivity profile. Service time, number of clients served and efficiency retrieving coefficient are the performance metrics assessed in the analysis. The final results mainly show that the novel proposals constitute scalable solutions that may be considered for real project designs. Lastly, future work is included in the conclusion of this paper. KEYWORDS BitTorrent, peer selection, multimedia, streaming, interactivity. 1. INTRODUCTION The BitTorrent protocol is undoubtedly an effective peer-to-peer (P2P) solution for file (content) replication over Internet [1, 2, 3, 4]. The overall idea of this protocol is that peers must cooperate with each other to replicate the wanted file.
    [Show full text]
  • Title: P2P Networks for Content Sharing
    Title: P2P Networks for Content Sharing Authors: Choon Hoong Ding, Sarana Nutanong, and Rajkumar Buyya Grid Computing and Distributed Systems Laboratory, Department of Computer Science and Software Engineering, The University of Melbourne, Australia (chd, sarana, raj)@cs.mu.oz.au ABSTRACT Peer-to-peer (P2P) technologies have been widely used for content sharing, popularly called “file-swapping” networks. This chapter gives a broad overview of content sharing P2P technologies. It starts with the fundamental concept of P2P computing followed by the analysis of network topologies used in peer-to-peer systems. Next, three milestone peer-to-peer technologies: Napster, Gnutella, and Fasttrack are explored in details, and they are finally concluded with the comparison table in the last section. 1. INTRODUCTION Peer-to-peer (P2P) content sharing has been an astonishingly successful P2P application on the Internet. P2P has gained tremendous public attention from Napster, the system supporting music sharing on the Web. It is a new emerging, interesting research technology and a promising product base. Intel P2P working group gave the definition of P2P as "The sharing of computer resources and services by direct exchange between systems". This thus gives P2P systems two main key characteristics: • Scalability: there is no algorithmic, or technical limitation of the size of the system, e.g. the complexity of the system should be somewhat constant regardless of number of nodes in the system. • Reliability: The malfunction on any given node will not effect the whole system (or maybe even any other nodes). File sharing network like Gnutella is a good example of scalability and reliability.
    [Show full text]
  • International Intellectual Property Alliance®
    I NTERNATIONAL I NTELLECTUAL P ROPERTY A LLIANCE® 1818 N STREET, NW, 8TH FLOOR · WASHINGTON, DC 20036 · TEL (202) 355-7924 · FAX (202) 355-7899 · WWW.IIPA.COM · EMAIL: [email protected] September 14, 2012 Filed via www.regulations.gov, Docket No. USTR–2012–0011 Stanford K. McCoy, Esq. Assistant U.S. Trade Representative for Intellectual Property and Innovation Office of the U.S. Trade Representative Washington, DC 20508 Re: IIPA Written Submission Re: 2012 Special 301 Out-of-Cycle Review of Notorious Markets: Request for Public Comments, 77 Fed. Reg. 48583 (August 14, 2012) Dear Mr. McCoy: In response to the August 14, 2012 Federal Register notice referenced above, the International Intellectual Property Alliance (IIPA)1 provides the Special 301 Subcommittee with the following written comments to provide examples of Internet and physical “notorious markets” – those “where counterfeit or pirated products are prevalent to such a degree that the market exemplifies the problem of marketplaces that deal in infringing goods and help sustain global piracy and counterfeiting.” We hope our filing will assist the Office of the United States Trade Representative (USTR) in “identifying potential Internet and physical notorious markets that exist outside the United States and that may be included in the 2012 Notorious Markets List.” We express appreciation to USTR for publishing a notorious markets list as an “Out of Cycle Review” separately from the annual Special 301 Report. This list has successfully identified key online and physical marketplaces that are involved in intellectual property rights infringements, and has led to some positive developments. These include closures of some Internet websites whose businesses were built on illegal conduct, greater cooperation from some previously identified “notorious” and other suspect sites, and the facilitation of licensing agreements for legitimate distribution of creative materials.
    [Show full text]
  • "Ephemeral Data" and the Duty to Preserve Discoverable Electronically Stored Information Kenneth J
    University of Baltimore Law Review Volume 37 Article 4 Issue 3 Spring 2008 2008 "Ephemeral Data" and the Duty to Preserve Discoverable Electronically Stored Information Kenneth J. Withers The Sedona Conference Follow this and additional works at: http://scholarworks.law.ubalt.edu/ublr Part of the Law Commons Recommended Citation Withers, Kenneth J. (2008) ""Ephemeral Data" and the Duty to Preserve Discoverable Electronically Stored Information," University of Baltimore Law Review: Vol. 37: Iss. 3, Article 4. Available at: http://scholarworks.law.ubalt.edu/ublr/vol37/iss3/4 This Article is brought to you for free and open access by ScholarWorks@University of Baltimore School of Law. It has been accepted for inclusion in University of Baltimore Law Review by an authorized administrator of ScholarWorks@University of Baltimore School of Law. For more information, please contact [email protected]. "EPHEMERAL DATA" AND THE DUTY TO PRESERVE DISCOVERABLE ELECTRONICALLY STORED INFORMATION Kenneth J. Witherst I. ELECTRONIC ALL Y STORED INFORMATION AND THE DUTY OF PRESERVATION UNDER THE RULES OF CIVIL PROCEDURE The duty to take reasonable steps to preserve evidence discoverable in pending or anticivated civil litigation is well established in U.S. civil jurisprudence. Courts differ about when the duty of preservation is triggered, the particular scope of that duty, and what steps of preservation are considered reasonable on a case-by-case basis,2 but the general rule is clear: parties have a duty to preserve t Kenneth J. Withers is Director of Judicial Education and Content for The Sedona Conference®, a non-profit law and policy think-tank based in Arizona.
    [Show full text]
  • List of Search Engines
    A blog network is a group of blogs that are connected to each other in a network. A blog network can either be a group of loosely connected blogs, or a group of blogs that are owned by the same company. The purpose of such a network is usually to promote the other blogs in the same network and therefore increase the advertising revenue generated from online advertising on the blogs.[1] List of search engines From Wikipedia, the free encyclopedia For knowing popular web search engines see, see Most popular Internet search engines. This is a list of search engines, including web search engines, selection-based search engines, metasearch engines, desktop search tools, and web portals and vertical market websites that have a search facility for online databases. Contents 1 By content/topic o 1.1 General o 1.2 P2P search engines o 1.3 Metasearch engines o 1.4 Geographically limited scope o 1.5 Semantic o 1.6 Accountancy o 1.7 Business o 1.8 Computers o 1.9 Enterprise o 1.10 Fashion o 1.11 Food/Recipes o 1.12 Genealogy o 1.13 Mobile/Handheld o 1.14 Job o 1.15 Legal o 1.16 Medical o 1.17 News o 1.18 People o 1.19 Real estate / property o 1.20 Television o 1.21 Video Games 2 By information type o 2.1 Forum o 2.2 Blog o 2.3 Multimedia o 2.4 Source code o 2.5 BitTorrent o 2.6 Email o 2.7 Maps o 2.8 Price o 2.9 Question and answer .
    [Show full text]
  • Gnutella Protocol
    A BRIEF INTRODUCTION AND ANALYSIS OF THE GNUTELLA PROTOCOL By Gayatri Tribhuvan University of Freiburg Masters in Applied Computer Science [email protected] ABSTRACT Peer to peer technology has certainly improved mechanisms of downloading files at a very high rate. Statistics and research show that as more number of peer to peer protocols developed, the number of nodes in the entire network increased. In this paper, we observe the structure and operation of Gnutella , a peer to peer protocol. The paper also focuses on some issues of this protocol and what improvements were further made to improve its scalability. We also have a look at some of the security issues of this protocol. Some statistics in the paper also reflect important changes that the protocol inflicted onto the entire network. PDF Created with deskPDF PDF Writer - Trial :: http://www.docudesk.com 1. INTRODUCTION Gnutella is a p2p protocol. This paper has tried to address some of the issues that have been faced by users of Gnutella, including that of scalability (increasing the scale of operation, i.e. the volume of operation with progressively larger number of users) and security. The Gnutella protocol is an open, decentralized group membership and search protocol, mainly used for file searching and sharing. Group membership is open and search protocol addresses searching and sharing of files. The term Gnutella represents the entire group of computers which have Gnutella speaking applications loaded in them forming a virtual network. Each node can function as both client as well as server. Thus they can issue queries to other nodes as well accept and respond to queries from other nodes, after matching the queries with the contents loaded in their own hard disks.
    [Show full text]
  • Searching for Malware in Bittorrent∗
    Searching for Malware in BitTorrent∗ Andrew D. Berns and Eunjin (EJ) Jung April 24, 2008 Abstract One of the most widely publicized aspects of computer security has been the presence and propagation of malware. Malware has adapted to many different changing technologies, in- cluding recently-popular P2P systems. While previous work has examined P2P networks and protocols like KaZaA and Gnutella for malware, little has been done so far that examines BitTor- rent. This project explored BitTorrent for the presence of malware, and discovered a significant portion of malware in the downloaded file set. Statistics on torrents infected with malware were gathered and analyzed to find patterns that are helpful in creating basic filtering heuristics. While these heuristics may work in simple cases, several easy ways they can be defeated were found. 1 Introduction Recently, peer-to-peer networks have emerged as a popular paradigm for Internet applications. In fact, a study in 2005 estimated that P2P traffic accounted for around 70% of all traffic on the Internet [2]. P2P technology is finding new applications as it grows, including voice-over-IP systems and streaming video delivery. While P2P has found several different uses, perhaps the most widely-known use is for file sharing. One concern with P2P file sharing is that it can be used to distribute malware (malicious software, such as worms, viruses, and rootkits). On the one hand, users have access to huge amounts of data, while on the other hand, this data can easily be tainted with viruses, worms, and other forms of malware. An important consideration, then, is if the concern about malware in P2P networks is warranted, and if it is, are there ways to protect a computer and minimize the risk of malware infection from P2P networks.
    [Show full text]