Peer to Peer and SPAM in the Internet
Total Page:16
File Type:pdf, Size:1020Kb
Networking Laboratory Department of Electrical and Communications Engineering Helsinki University of Technology Peer to Peer and SPAM in the Internet Report based a Licentiate Seminar on Networking Technology Fall 2003 Editor: Raimo Kantola Abstract: This report is a collection of papers prepared by PhD students on Peer-to-Peer applications and unsolicited e-mail or spam. The phenomena are covered from different angles including the main algorithms, protocols and application programs, the content, the operator view, legal issues, the economic aspects and the user point of view. Most papers are based on literature review including the latest sources on the web, some contain limited simulations and a few introduce new ideas on aspects of the phenomena under scrutiny. Before presenting the papers we will try to give some economic and social background to the phenomena. Overall, the selection of papers provides a good overview of the two phenomena providing light into what is happening in the Internet today. Finally, we provide a short discussion on where the development is taking us and how we should react to these new phenomena. Acknowledgements My interest in Peer-to-peer traffic and applications originates in the discussions that took place in the Special Interest Group on Broadband networks hosted within the TEKES NETS program during 2003. In particular, I would like to thank Pete Helenius of Rommon for his insights. The Broadband group organised an open seminar in Dipoli on November 20th that brought the topic into a wider discussion involving people from both the networking side as well as the content business. I would also like to acknowledge the inspiring discussions with Juha Oinonen and Klaus Lindberg of CSC and Funet on the subject of peer-to-peer traffic and applications. 2 Preface This report is based on the work done by my licentiate and Ph.D students on the Licentiate Course on Networking Technology (S38.030) during the Fall 2003. The idea was to learn about the disruptive and also annoying phenomena that have become very commonplace over the past couple of years in the Internet: namely, the Peer-to-Peer traffic and applications and the unsolicited and unwanted e-mail or Spam. Due to the illegal copies of audio and video files in Peer-to- Peer networks and unwanted nature of Spam one could claim that both of these phenomena are parasitic and threaten the purpose the Internet was designed for. Especially, the proponents of p2p say that Internet was originally all peer to peer and it was only later when the client server approach took hold and became widespread. The nature of information goods on the Internet is such that it is very difficult to earn money on them. This applies equally to content and services. This has brought the advertisement driven business model to the Internet. Internet charging does not much care about the direction of the traffic, nor usually for the volume. Under these conditions sending of bulk unwanted, unsolicited, commercial or non-commercial e-mail or spam has emerged. Spam is a clear violation of the original cooperative nature of the Internet. The students were given assignments. They prepared a paper and presented it during two seminar days late November 2003. The presentations were opponeered and discussed in the seminar and the students had the opportunity to improve their papers. The best of the student papers are included in this report. I have used some time to polish the papers to make them more readable and also given comments to a few of them as Editor’s notes that appear in the Footnotes. Nevertheless, I apologize for the remaining bugs in the text as well as defects in the layout. We preferred timeliness of publication and left final polishing of the text to future work on those papers that warrant it. The idea of the seminar was to learn about the new phenomena and try to understand where these phenomena are leading the Internet in the near future. I hope this report will pass on what we learned and give some material for thought of your own. January, 2004 Raimo Kantola Disclaimer: Some of the students mention their Affiliations. The reader should be aware that the students are responsible for the contents of their papers and nothing in them represents an agreed official position of the Affiliation concerned. Nor does anything said in this Report represent any official legal position of the Networking Laboratory. 3 Table of Contents Preface 3 Part I: Peer-to-Peer applications: Introduction 5 Gnutella project overview 8 The Freenet Project: Signing on to Freedom 20 Peer-to-Peer Communication Services in the Internet 29 Protocols for resource management in a Peer-to-Peer Network 38 Performance measurements and availability problems in P2P systems 48 Network and Content Request Modelling in a Peer-to-Peer System 58 Peer to Peer and ISPs´ Survival strategies 69 Peer to Peer and Content Distribution 78 Building up Trust Collaboration in P2P Systems based on Trusted Computing Platform 88 Open Problems in Peer-to-Peer Systems: Quality of Service and Scalability 96 Economics of Peer-to-Peer Music Industry Case Study 104 Legal Issues in P2P Systems 115 Part II: SPAM 125 Spam – from nuisance to Internet infestation 126 Mechanisms for Detection and Prevention of Email Spamming 135 Economical Impact of SPAM 147 Discussion 159 4 Part I: Peer-to-Peer applications: Introduction The introduction of broadband to the residential market in the Internet has brought always- on connectivity to the wide public. Broadband makes possible many new services. Transporting audio and video on the NET is now possible. The World Wide Web has moved towards a TV –like user experience with a rather clear separation of the roles of content providers and consumers. The users are content producers only to a rather limited extent. From the user’s point of view, he or she has invested in a powerful PC with lots of memory and cycles that are only lightly loaded most of the time. The user is also paying a flat rate monthly charge for the connectivity to the Internet irrespective of the volume of traffic the user is sending or receiving. This is a natural environment for the new popularity and the emergence of new Peer-to-Peer Applications. The idea is to use the PCs of the users to do something useful cooperatively without relying on any service provider’s help for which the users would have to pay. Most popular use of the p2p applications falls under the category of File Sharing. Most of the files seem to be audio and video and a lot of that is originally illegal copies. Other, either clearly useful, benign or useful but disruptive to the existing value chain uses of the p2p technology have also emerged. The share of p2p traffic in all the Internet traffic has grown dramatically over the part 2…3 years. It is now said to anything between 20 and 80% of all Internet traffic. All in all it seems that only recently many were asking why is broadband needed and what is the killer application in broadband? Now that, the users have surprised the incumbent players again and found the killer application, many people are unhappy. The content owner lobby used to protecting its business that is based on copyrights has launched a full-scale attack on peer-to-peer applications and their users. The new popularity of Peer-to-peer started with Napster that still had a central server for storing index information to files that the users were willing to share. Only the sharing of files itself took place in a peer-to-peer fashion directly from host to host. The central server in Napster turned out to be vulnerable to a legal attack of the copyright lobby and Napster was closed down. Recently, Napster has been reopened as a commercial service. The commercial Napster, however, has very little to do with the original Napster in terms of implementation. Very quickly after the demise of Napster the scene was taken over by fully decentralized implementations of File Sharing applications such as Gnutella, KaZaA, DirectConnect etc. A decentralized network of peers has no central authority and nodes can freely join the network at any time and disconnect from the network without loss of operability of the whole system. It is easy to fall into the trap of taking a moral stand and say that Peer-to-peer File Sharing is evil, it is stealing from the authors of content and it should be punished severely. I believe, it is important to keep a cool head and try to understand the phenomena from different angles. One important angle is economic. In a couple of next sections I will briefly analyse the economics of information in a networked society. Information value chain In the digital information economy the simplest value chain makes the distinction between content providers and the distribution chain. Under content providers we lump the audio recording and movie industry as well as newspapers and other publishers. Also the distribution channel is diverse. It contains the traditional methods used by each of the content provider types. Now, with the emergence of the broadband Internet, a more efficient method of digital content distribution has come to being. Besides the Internet many other components are needed; those however, are sold as consumer goods directly to the users of the network with consumer market economic rules. The content industry is used to earn money based on its copyright. The nature of copyright is that it gives a monopoly right to the copyright owner to earn money on the content for tens of years.