Proceedings of the FREENIX Track: 2003 USENIX Annual Technical Conference

Total Page:16

File Type:pdf, Size:1020Kb

Proceedings of the FREENIX Track: 2003 USENIX Annual Technical Conference USENIX Association Proceedings of the FREENIX Track: 2003 USENIX Annual Technical Conference San Antonio, Texas, USA June 9-14, 2003 THE ADVANCED COMPUTING SYSTEMS ASSOCIATION © 2003 by The USENIX Association All Rights Reserved For more information about the USENIX Association: Phone: 1 510 528 8649 FAX: 1 510 548 5738 Email: [email protected] WWW: http://www.usenix.org Rights to individual papers remain with the author or the author's employer. Permission is granted for noncommercial reproduction of the work for educational or research purposes. This copyright notice must be included in the reproduced paper. USENIX acknowledges all trademarks herein. U-P2P: A Peer-to-Peer Framework for Universal Resource Sharing and Discovery Neal Arthorne, Babak Esfandiari, Aloke Mukherjee Department of Systems and Computer Engineering Carleton University Ottawa, Ontario, Canada [email protected] [email protected] [email protected] Abstract of files. Another important problem is the current frac- tured state of peer-to-peer communities. Again, the We present U-P2P, an open source framework for de- lack of metadata about communities makes it difficult veloping, deploying and discovering file-sharing com- to know what there is to look for in the first place. munities. We address the problem of search in peer-to- We propose a peer-to-peer framework called Univer- peer file sharing by allowing the end user to add meta- sal Peer-to-Peer (U-P2P) that simplifies the sharing of data to shared documents. Each file-sharing community custom metadata formats as well as the easy creation, allows the sharing of a particular structured document configuration and discovery of peer-to-peer communi- type. Communities are themselves modeled as struc- ties. In U-P2P, each community is described in part tured documents, thus enabling their sharing and dis- by the metadata of the files it exchanges. The format covery just like any other document. The creator of of a community’s metadata is specified using the XML a particular community specifies, among other proper- Schema language, allowing new communities centered ties, the document type that it shares and the deployment on that file type to be created in any text editor. U-P2P model. U-P2P’s extensible architecture allows develop- allows these custom resources to be created and shared ers to create new properties or extend existing ones. For as in traditional file-sharing services. example, developers can provide new deployment mod- By logical extension, the description of the commu- els or custom privacy and authentication features. U-P2P nity itself (the metadata format of its files, the proto- makes use of other open source projects such as Jakarta col used for search, etc) is encapsulated in an XML file. Tomcat and eXist, an XML database system. In U-P2P, creating, sharing or discovering a community follows the same principles as creating, sharing or dis- 1Introduction covering a file within that community. As a result, file- The current success of peer-to-peer (P2P) file sharing ap- swapping communities such as Napster or Kazaa can be plications has highlighted the benefits of resource dis- seen as instances of U-P2P devoted to sharing a few spe- tribution and redundancy. However, to truly exploit cific file types and utilizing a given P2P protocol. such advantages, a few roadblocks remain to be cleared. In particular, search and discovery of resources is still 2 Related Work quite difficult. Most known approaches rely on sim- Our focus is on the discovery problem in peer-to-peer ple schemes such as a search for the resource name or systems. In Napster [1], the only files that could be type. File sharing communities are most efficient when shared on the network were MP3 audio files. Search the name of the file carries most if not all of the needed was based on filenames and relied on users encoding information. As a result, most file swapping commu- the artist and title of each song in the MP3’s title. Al- nities are restricted to swapping music and video files. though metadata such as encoding rate could be used Even when the exchange of non-music files is possible, to sort the results, there was no way to search on these the difficulty of finding such files has been a sufficient metadata or define other parameters. On a Gnutella [2] deterrent. Lack of metadata is the main problem. network, any type of file can be shared: however there is Another roadblock to more general applications of no explicit metadata handling. Search strings are passed P2P has been the difficulty of creating communities for around without processing between peers and their in- specific purposes. This arises partially from the diffi- terpretation is left to the peer. Each peer must imple- culty of defining custom metadata about different types ment its own search algorithm using the search string USENIX Association FREENIX Track: 2003 USENIX Annual Technical Conference 29 as input. Most Gnutella implementations, like Napster, 3 Background: XML Schema simply return filenames that contain the search string. U-P2P relies heavily on XML Schema and the following Gnutella does permit the design of overlay protocols to sections include some examples of XML Schema and a encode and decode metadata from search strings. Some brief description of the structure of XML Schema docu- Gnutella developers have proposed using this approach ments. for richer metadata searches. [3, 4]. Schemas are defined XML Schema is a Recommendation [8] by the World for common file types: for example an audio file might Wide Consortium for a schema language for XML. It be defined to have properties such as artist, title, bit rate, constitutes a set of markup tags themselves written in album, etc. The schema defines a structured format for XML, that are used to describe both the structure of an searching MP3 metadata that is sent as a search string to XML document and the datatypes of individual elements other Gnutella nodes. Responding clients use the query and attributes. XML Schema uses the XML Namespace to search local files annotated using the schema and re- Recommendation that allows authors to prefix their own turns the results using the same structured format. In tags with a specific namespace. It also allows importing practice though, all members of a given community must types from other schemas both in the current namespace be able to speak a common language in order to commu- and from other namespaces. nicate. XML Schema can be used to validate XML docu- Other P2P systems such as FastTrack [5], Opencola ments using a schema processor. XML documents them- [6] or Bitzi [7] propose variations on that idea, but they selves can link to their schema using a schemaLocation are still limited to a number of predefined schemas. The tag. However this is only a suggested method for speci- latter two have the capability to extract metadata infor- fying a schema for the document and they may use other mation from a file given the file format, but this is only types of schemas such as DTDs or not require validation possible for certain formats. at all. In XML Schema, there are four basic units that can be Clearly, there is a need for shared ontologies if we used to describe parts of an XML document: elements, want to allow search for any type of file. In the In- attributes, simpleTypes and complexTypes. ternet world, the XML Schema format[8] is the cur- rent choice to represent ontologies, replacing Document Elements - Elements are the tags used in XML to Type Definition (DTD) [9], the schema language de- label a piece of data. They have a datatype associated fined in the original XML specification. XML Schema with them that is either defined at the same time as the supports the creation of custom and complex data types element (anonymously) or is a complexType or simple- for XML tags, which is essential to describe aggregated Type. Elements can contain subelements, attributes and or composite resources. For richer semantic descrip- can also be part of an XML Namespace. The structure tions, for example to allow software agents to perform of an element can be controlled in XML Schema using search instead of humans, there can be a need to de- sequences, choices and the same restrictions used to de- scribe relationships between resources. RDF [10] and rive simple and complex types. For example an author more generally the Semantic Web [11] effort address can specify that an element called ’name’ should consist that need. The Edutella project [12] is a technology for of an exact sequence of the two elements ’firstname’ and distributed learning that uses a P2P network for sharing ’lastname’. RDF-formatted metadata. However in Edutella metadata Attributes - Attributes are properties that belong to a is the resource, not the means to describe one. parent element. They can only use simpleTypes and they cannot contain any elements or attributes. Attributes can A P2P system using a semantic layering approach is be optional or mandatory within an element. not without its drawbacks. First and foremost is get- Simple Types - These types are either built-in data ting users of the system to supply metadata for their re- types specified in the XML Schema Recommendation or sources. The addition of metadata requires user-friendly Simple Types derived from the built-in or existing types tools for authoring RDF and schemas and a simplified (XML Schema defines some derived types such as date approach for users who are not familiar with XML lan- and time). The Recommendation includes many useful guages. data types such as string and decimal, that can easily be What we propose in U-P2P starts, as in Edutella, with manipulated into more specialized Simple Types.
Recommended publications
  • Simulacijski Alati I Njihova Ograničenja Pri Analizi I Unapređenju Rada Mreža Istovrsnih Entiteta
    SVEUČILIŠTE U ZAGREBU FAKULTET ORGANIZACIJE I INFORMATIKE VARAŽDIN Tedo Vrbanec SIMULACIJSKI ALATI I NJIHOVA OGRANIČENJA PRI ANALIZI I UNAPREĐENJU RADA MREŽA ISTOVRSNIH ENTITETA MAGISTARSKI RAD Varaždin, 2010. PODACI O MAGISTARSKOM RADU I. AUTOR Ime i prezime Tedo Vrbanec Datum i mjesto rođenja 7. travanj 1969., Čakovec Naziv fakulteta i datum diplomiranja Fakultet organizacije i informatike, 10. listopad 2001. Sadašnje zaposlenje Učiteljski fakultet Zagreb – Odsjek u Čakovcu II. MAGISTARSKI RAD Simulacijski alati i njihova ograničenja pri analizi i Naslov unapređenju rada mreža istovrsnih entiteta Broj stranica, slika, tablica, priloga, XIV + 181 + XXXVIII stranica, 53 slike, 18 tablica, 3 bibliografskih podataka priloga, 288 bibliografskih podataka Znanstveno područje, smjer i disciplina iz koje Područje: Informacijske znanosti je postignut akademski stupanj Smjer: Informacijski sustavi Mentor Prof. dr. sc. Željko Hutinski Sumentor Prof. dr. sc. Vesna Dušak Fakultet na kojem je rad obranjen Fakultet organizacije i informatike Varaždin Oznaka i redni broj rada III. OCJENA I OBRANA Datum prihvaćanja teme od Znanstveno- 17. lipanj 2008. nastavnog vijeća Datum predaje rada 9. travanj 2010. Datum sjednice ZNV-a na kojoj je prihvaćena 18. svibanj 2010. pozitivna ocjena rada Prof. dr. sc. Neven Vrček, predsjednik Sastav Povjerenstva koje je rad ocijenilo Prof. dr. sc. Željko Hutinski, mentor Prof. dr. sc. Vesna Dušak, sumentor Datum obrane rada 1. lipanj 2010. Prof. dr. sc. Neven Vrček, predsjednik Sastav Povjerenstva pred kojim je rad obranjen Prof. dr. sc. Željko Hutinski, mentor Prof. dr. sc. Vesna Dušak, sumentor Datum promocije SVEUČILIŠTE U ZAGREBU FAKULTET ORGANIZACIJE I INFORMATIKE VARAŽDIN POSLIJEDIPLOMSKI ZNANSTVENI STUDIJ INFORMACIJSKIH ZNANOSTI SMJER STUDIJA: INFORMACIJSKI SUSTAVI Tedo Vrbanec Broj indeksa: P-802/2001 SIMULACIJSKI ALATI I NJIHOVA OGRANIČENJA PRI ANALIZI I UNAPREĐENJU RADA MREŽA ISTOVRSNIH ENTITETA MAGISTARSKI RAD Mentor: Prof.
    [Show full text]
  • U-P2P: a Peer-To-Peer System for Description and Discovery of Resource-Sharing Communities
    View metadata, citation and similar papers at core.ac.uk brought to you by CORE provided by Carleton University's Institutional Repository 1 U-P2P: A Peer-to-peer System for Description and Discovery of Resource-sharing Communities Aloke MUKHERJEE, Babak ESFANDIARI, and Neal ARTHORNE knowledge who do not possess the expertise to design their Abstract-- A simple method is proposed for peer-to-peer own applications. This allows applications to be tailored to description and discovery of resource-sharing communities as smaller real-world communities and enables the sharing of well as the resources themselves. An XML Schema document virtually any object - for example: describes a shared resource. By applying transformations, specified in XSL, the schema is used to generate an application with the ability to publish and search for the defined object. • XML descriptions of chemical molecules for chemists or Meta-data is indexed and searchable allowing complex objects chemistry students to be discovered. We propose to solve the problem of discovery • descriptions of species for scientists studying biodiversity for resource-sharing communities by treating a community as a • descriptions of genes for researchers studying the human shared resource. An XML Schema description of resource- sharing communities is used to generate a P2P system for genome community discovery. • design patterns for computer science students • software components for application developers Index Terms—peer-to-peer, resource description, resource discovery, XML Schema The focus of existing communities can be narrowed by specifying additional attributes - for example: I. INTRODUCTION HIS paper proposes a solution to the problem of resource • MP3 trading sub-communities focused on the work of a T description and discovery in peer-to-peer file-sharing single artist or genre communities.
    [Show full text]
  • Limewire Case Study
    MARYLAND STATE POLICE COMPUTER FORENSICS LABORATORY MDSP-CFL2006 DIGITAL MEDIA ANALYSIS OF GNUTELLA PEER-TO-PEER NETWORKS LIMEWIRE CASE STUDY David B. Heslep Detective Sergeant DO NOT DISSEMINATE WITHOUT THE EXPRESS PERMISSION OF THE AUTHOR Copyright © 2006, David B. Heslep, All Rights Reserved MDSP-CFL 2006 Gnutella Peer-to-Peer Networks: Limewire D/Sgt. David Heslep 2 of 32 6/13/2006 Contents TITLE - Digital Media Analysis – Gnutella Peer-to-Peer Networks: Limewire................................3 Introduction.........................................................................................................................................3 Intended Audience..............................................................................................................................3 Goals and Objectives .........................................................................................................................3 The Gnutella Protocol.........................................................................................................................4 Gnutella Descriptors ...............................................................................................................4 Handshaking .....................................................................................................................................5 Gnutella Handshake ...............................................................................................................5 Ultrapeer Handshaking......................................................................................................................6
    [Show full text]
  • Xxxxxxxxxxx Xxxxxxxxxxxxxxx Xxxxxxxxxxxxxxxxx Xxxxxxxxxxx Xxxxxxxxxxxxxxx Xxxxxxxxxxxxxxxxx
    xxxxxxxxxxx xxxxxxxxxxxxxxx xxxxxxxxxxxxxxxxx xxxxxxxxxxx xxxxxxxxxxxxxxx xxxxxxxxxxxxxxxxx xxx xxxx xxxxx xxx xxxx xxxxx xxx xxxx xxxxx xxx xxxx xxxxx xxx xxxx xxxxx xxx xxxx xxxxx xxx xxxx xxxxx xxx xxxx xxxxx xxx xxxxxxxxxxxxxxx xxxxxxxxxxxxxxxxx xxx xxxxxxxxxxxxxxx xxxxxxxxxxxxxxxxx oxygen edition @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ @ department of corrections @ @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ correction 1: in the nasa directory and tiz page we stated that natas was involved with the nasa directory. actually, he was not. sheepbyte had formatted the file to send it to osp which is why it contained natas in it. billy hoyle assumed natas was involved because it was in the text. a special apology goes out to natas. update 1: we have switched sites. we are now located at tiz.bravehost.com update 2: we have postponed the government robots.txt project until a later issue because our new projects are cooler and require more work. the government robots project will be completed. apology 1: we ripped the urls for the bellsouth mining project from the binrev topic. apparently they are not cool with that and we thought the project was almost done. update 3: there is no hacker news wire in this issue because the main writer is feeling lazy. if you wish to email us letters, articles, reviews, flames, hate mail, death threats, requests to join, or questions please email [email protected] current members: zraith ([email protected]) no pgp key a3kton ([email protected]) -----begin pgp public key block----- version: pgpfreeware 6.5.8 for non-commercial
    [Show full text]
  • A Middleware Approach to Building Content-Centric Applications
    A Middleware Approach to Building Content-Centric Applications Gareth Tyson B.Sc. (Hons) M.Sc. A Thesis Submitted for the Degree of Doctor of Philosophy Department of Computer Science Lancaster University UK May, 2010 Acknowledgements I would like to thank everybody who has helped me during the compilation of this thesis. First and foremost I would like to thank Andreas Mauthe for his invaluable contributions to my research. I could not have asked for a more insightful and dedicated supervisor, who always found time to help regardless of his own workload. I realise this is a rare characteristic and I consider myself very lucky to have worked with him. In addition, I would also like to thank a number of other people who have made a range of immeasurable contributions to my work over the years: Sebastian Kaune, Yehia El-khatib, Paul Grace, Thomas Plagemann, Gordon Blair, Magnus Skjegstad, Rub´enCuevas Rum´ın,as well as everybody from the CONTENT and INTERSECTION projects. It was a privilege to work with you all. Alongside these people, I would also like to acknowledge David Hutchison, Edmundo Mon- teiro and Michael Luck for making a number of helpful suggestions regarding the final version of this thesis. I would also like to thank everybody at Infolab21 and Lancaster University who made my time here so enjoyable, including Ali, Andi, Andreas, Andy, Colin, Eddie, Frodo, Jamie, Jason, Jayne, Kirsty, Matt, Mu, Rach, Rick, Sean, Sue and Yehia. In particular, I'd like to thank Zo¨efor keeping me sane (and smiling) whilst writing up! Last of all, I'd like to thank my parents for supporting me during every stage of my education and always being there to lend a helping hand.
    [Show full text]
  • Scalable Streaming Multimedia Delivery Using Peer-To-Peer Communication
    Downloaded from orbit.dtu.dk on: Oct 06, 2021 Scalable Streaming Multimedia Delivery using Peer-to-Peer Communication Poderys, Justas Publication date: 2019 Document Version Publisher's PDF, also known as Version of record Link back to DTU Orbit Citation (APA): Poderys, J. (2019). Scalable Streaming Multimedia Delivery using Peer-to-Peer Communication. Technical University of Denmarik. General rights Copyright and moral rights for the publications made accessible in the public portal are retained by the authors and/or other copyright owners and it is a condition of accessing publications that users recognise and abide by the legal requirements associated with these rights. Users may download and print one copy of any publication from the public portal for the purpose of private study or research. You may not further distribute the material or use it for any profit-making activity or commercial gain You may freely distribute the URL identifying the publication in the public portal If you believe that this document breaches copyright please contact us providing details, and we will remove access to the work immediately and investigate your claim. PhD Thesis DRAFT Scalable Streaming Multimedia Delivery using Peer- to-Peer Communication Justas Poderys Supervised by: Jose Soler, PhD, and Prof. Lars Dittmann, PhD Technical University of Denmark Kgs. Lyngby, Denmark 2018 Cover image: c hermione13 / 123RF Stock Photo. Print license acquired. Technical University of Denmark Department Photonics Engineering Ørsteds Plads 343 2800 Kgs. Lyngby DENMARK Tel: (+45) 45 25 63 52 Fax: (+45) 45 93 65 81 Web: fotonik.dtu.dk E-mail: [email protected] Contents Abstract vii Resumé ix Preface xi Acknowledgments xiii List of Publications xv List of Figures xvii List of Tables xxi Glossary xxiii 1 Introduction 1 1.1 Thesis Organization .........................
    [Show full text]
  • Collaborative Futures’
    COLLABORATIVEFUTURES Published : 2011-04-18 License : None Introduction 1. Anonymous 2. How this Book is Written 3. A Brief History of Collaboration 4. This Book Might Be Useless 1 1. ANONYMOUS You do not talk about Anonymous. You do NOT talk about Anonymous. (Wikis are fine though. FEAR US.) Anonymous works as one, because none of us are as cruel as all of us. Anonymous is everyone. Anonymous does it for the lulz. Anonymous cannot be out-numbered, Anonymous out-numbers you. Anonymous is a hydra, constantly moving, constantly changing. Remove one head, and nine replace it. Anonymous reinforces its ranks exponentially at need. Anonymous has neither leaders nor anyone with any higher stature. Anonymous has no identity. Anonymous is Legion. Anonymous does not forgive. Anonymous does not forget. 13 of the 41 entries in the Sekrit Code of Anonymous In this section we’re breaking the first two rules of the Sekrit Code of Anonymous <encyclopediadramatica.com/index.php?title=Anonymous&oldid=1998440936#The_Sekrit_Code>. When others have done this in the past it has brought down the wrath of this shadowy group of anonymous individuals, causing public humiliation, hacked servers, and other florid forms of chaos. Anonymous is a collection of individuals that post anonymously on /b/ <img.4chan.org/b/>, a section of the image board 4chan.org. When you post content on a typical message board, you are often required to enter your name. If you don’t, your entry is attributed to “anonymous”. On /b/ everyone posts as “anonymous”. The collective actions of users identified with the name anonymous aggregates into the collective identity Anonymous.
    [Show full text]
  • Spam Characterization and Detection in Peer-To-Peer
    Spam Characterization and Detection in Peer-to-Peer File-Sharing Systems Dongmei Jia Wai Gen Yee Ophir Frieder Illinois Institute of Technology Illinois Institute of Technology Illinois Institute of Technology 10 W 31st Street 10 W 31st Street 10 W 31st Street Chicago, IL 60616 Chicago, IL 60616 Chicago, IL 60616 1-312-567-5330 1-312-567-5205 1-312-567-5143 [email protected] [email protected] [email protected] ABSTRACT users by their filenames. A spammer can easily rename a file to Spam is highly pervasive in P2P file-sharing systems and is diffi- manipulate how it is retrieved and ranked. For example, the mu- cult to detect automatically before actually downloading a file due sic/movie industry has been injecting large amounts of spam into to the insufficient and biased description of a file returned to a the network by naming them after real songs/movies in the battle client as a query result. To alleviate this problem, we first charac- against the illegal distribution of copyrighted materials [2][3]. terize spam and spammers in the P2P file-sharing environment Spam is harmful to P2P file-sharing systems in several ways. and then describe feature-based techniques for automatically de- First, it degrades user experience. Second, spam may contain tecting spam in P2P query result sets. Experimental results show malware that, when executed, could destroy a computing system. that the proposed techniques successfully decrease the amount of Third, its transfer and discovery waste a significant amount of spam by 9% in the top-200 results and by 92% in the top-20 re- network and computing resources.
    [Show full text]