PUBLIC PEER-TO-PEER FILESHARING NETWORKS’ EVALUATION

J. Lloret Mauri, B. Molina Moreno, Palau Salvador, M. Esteve Domingo
Department of Communications, Polytechnic University of Valencia
Camino Vera s/n, Valencia, Spain

ABSTRACT

Since the recent appearance of P2P file-sharing networks, many Internet users have chosen this technology to search for programs, films, songs, etc. Their number of users is growing every day due to the attractive and interesting content that can be found and downloaded over these networks. In this article six public architectures are analyzed, Gnutella, FastTrack, OpenNap, eDonkey, MP2P and SoulSeek, tracking their evolution during a week in terms of connected users, number of files and size of shared files per hour. The results are compared and discussed with previous measurements taken a year ago. These data can be used to design new network models, to calculate their performance or to optimize new network parameters.

KEY WORDS
Peer-To-Peer, Evaluation, Filesharing Networks.

1. Introduction

Since the Internet became accessible to the world, the volume of connected users has been growing in spectacular fashion. Recently, the number of Internet users was estimated to be over 580 million, with a steady growth rate of 4% [1]. In recent years, the Internet has matured sufficiently to catch the market's attention at both home and enterprise level. Besides, the recent transition to broadband access networks, with currently 40 million users worldwide [2], allows multimedia services.

P2P filesharing is one of the P2P variants with a large number of supporters. Many users merely interact with this P2P network to download files without providing any, but there are many others who intend to share their files with the whole community without caring about who downloads them. Those who share files often have a broadband always-on connection. There are many public P2P architectures; some of them are already relatively old, such as Gnutella [3], FastTrack [4] and OpenNap [5], whilst others are relatively new, as is the case of BitTorrent [6]. The success of a P2P network inside a user community is determined by several factors:

• Simplicity: a P2P network with a graphical and easy-to-use P2P client is always welcome.
• Language: a P2P client with multilanguage support allows a broader deployment among users worldwide.
• Download speed: some P2P networks, due to their internal behaviour, are optimal for downloading files of reduced size. Others, however, use multisplitting mechanisms and permit downloading from multiple sources, making them suitable for obtaining larger files.

These parameters are responsible for an architecture becoming very popular or, on the other hand, disappearing. The above factors can make a P2P network more attractive to users of a specific nation due to the use of a specific language or even social trends [7]. It is also interesting to distinguish between a P2P network and a P2P client, whose development need not run in parallel. If a P2P client changes its P2P network, all its community users will remain using it. As an example, many users have remained 'loyal' to their P2P client throughout its evolution. Sometimes the reason for changing to a newly offered P2P client is its ability to contact several networks simultaneously, such as Shareaza [8] and MLDonkey [9].

Due to the significant social impact of P2P file-sharing networks, both industry and academia are spending time and money analyzing several aspects of these architectures. Probably the first considerations are the legality of the files being shared [10] and the potential risks run by both home users [11] and companies whose employees use their workstations as P2P clients [12]. Furthermore, most P2P networks have not been designed to address security issues (authentication, file permissions and file integrity); some P2P clients even include adware or spyware. Moreover, as files are often downloaded from unknown users, they may contain viruses or trojans.

Some ISPs have observed that their networks became rapidly congested and that P2P traffic sometimes reached nearly 60% of the total traffic [13]. Although not so striking, Internet2 administrators also computed impressive results on 16 February 2004, when 10.46% of the total traffic originated from P2P file-sharing [14]. CAIDA (Cooperative Association for Internet Data Analysis) also shows that Internet traffic is mainly dominated by P2P file-sharing protocols and HTTP [15]. Some articles show the average number of connected users in some architecture [16] and, sometimes, even the maximum number of users. For example, Napster reached 20 million connected users with 1,5 billion shared files in July 2000 [17]. Other studies focus their attention on other fields of P2P development. Some papers even study the economic cost of downloading a file by analyzing the required time [18]. The manner in which the number of connected users is calculated is sometimes deceptive if it is based solely on the number of users that download a certain client program for a P2P architecture [11, 12].

In Section 2 the architectures are characterized through some parameters, such as number of users, shared files and total size of shared information. These values are compared and discussed with those obtained a year ago.

2. Measurements and analysis of the architectures

In order to measure P2P parameters, we have taken one totally decentralized architecture (Gnutella [3]), four partially decentralized architectures (FastTrack [4], OpenNap [5], eDonkey [19, 20] and MP2P [21]) and one centralized architecture (SoulSeek [22]). For the purpose of this article we took measurements between February and March 2003 for the Gnutella, FastTrack, OpenNap and eDonkey architectures. Later on, we measured the same parameters between November 2003 and February 2004. This time, however, the MP2P and SoulSeek architectures were also analyzed due to the increase in the number of users that they were experiencing. All the figures correspond to the most significant ones amongst all the obtained data. All time values are in the GMT+01:00 timezone.

To take measurements of the corresponding architectures, the most adequate clients have been selected, bearing in mind those that would provide the most information on the architecture or the highest update frequency to measure the parameters. The Gnutella architecture has been analyzed with the Limewire client [23], which has been set up with a TTL value of seven. This is the reason why the number of users, files and size of shared data is relatively low. In the FastTrack architecture, the measurements have been taken with the KaZaA Lite client [24]. In order to analyze the OpenNap architecture, the Napigator client [25] has been used. The eMule client [26] has been used to analyze the eDonkey architecture. The MP2P architecture has been analyzed by means of the Blubster client [21]. Finally, the Nicotine client [27] has been used to analyze the SoulSeek architecture.

2.1. Number of Users

2.1.1. Gnutella Network. Figure 1 shows the number of connected users through a week beginning on a Friday. The average number of simultaneously connected users is 181, with a fluctuation from 123 to its maximum value, 256. This means a maximum variation of 41%. Figure 1 shows a certain continuity in the graph, but it is sensitive to random events such as the sudden disconnection of a neighbor, whose impact is considerable and measurable. The total number of on-line Gnutella users decreased continually, from 500.000 users in April 2002 to about 110.000 users in October 2003. Since that date it has been increasing again, to a maximum of 300.000 users in April 2004 [28].

2.1.2. FastTrack. This architecture is the one with the most users and shared data. As shown in Figure 2, which starts on a Friday, it has a waveform whose peaks are located between 20:00 and 23:00, which corresponds to the afternoon in the USA, the country that contributes the most users. Low values in the graph correspond to night-time in the USA. The average number of connected users is 3.467.918, with a maximum value of 4.178.674 and a minimum value of 2.728.360, which gives a maximum fluctuation of 21,33%. Between February and March 2003 the FastTrack architecture had an average number of connected users of 3.897.015, with a maximum of 4.585.549 and a minimum of 3.328.813. It follows that the number of users has decreased by about 11%. The possible reason is the creation of new P2P networks that attract users from older architectures. The FastTrack graph does not reveal any significant difference between normal working days and weekends.

2.1.3. OpenNap. Figure 3 shows the number of on-line users during a week, starting on a Thursday. The average number of simultaneously connected users is 256.003, with upper and lower limits of 376.378 and 182.188 respectively, and a highest fluctuation of 42,02%. The obtained graph is very irregular. Taking each day as independent, we observe that low values of connected users are obtained at night (between 0:00 and 4:00) whereas high values are located between 18:00 and 22:00. The main peaks are located on weekends. A year before (between February and March 2003) the average number of on-line users in the OpenNap architecture was 395.853, with a maximum of 463.210 and a minimum of 357.382. We can note a decrease of 35 %. The main reason is probably that some users have migrated to other P2P networks.

2.1.4. eDonkey. Figure 4 illustrates the number of connected users during a week, starting on a Monday. The average number of on-line users is 1.428.175 and the upper and lower variation margins are 1.718.201 and 869.314 respectively, with a highest fluctuation of 39,13 %. The graph is rather irregular and does not show any per-hour dependency, only that there are more users on weekends. Between February and March 2003 the eDonkey network had an average number of users of 1.058.580, with upper and lower variation margins of 1.203.109 and 897.780 respectively. Therefore the number of users has increased by 34,91 %.
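
The text does not define how the "maximum fluctuation" percentages are obtained. The following minimal sketch assumes one possible reading: the larger of the two deviations of the extremes from the average, relative to the average. The function name max_fluctuation is hypothetical, and the formula is an inference from the reported figures, not a definition given by the authors.

```python
def max_fluctuation(mean, minimum, maximum):
    """Largest deviation of either extreme from the mean, as a percentage of the mean.

    Assumption: this is an inferred reading of "maximum fluctuation";
    the paper itself does not state the formula.
    """
    return 100.0 * max(maximum - mean, mean - minimum) / mean


# Weekly user counts quoted in Sections 2.1.2 and 2.1.4 (average, minimum, maximum).
fasttrack = max_fluctuation(3_467_918, 2_728_360, 4_178_674)
edonkey = max_fluctuation(1_428_175, 869_314, 1_718_201)

print(f"FastTrack: {fasttrack:.2f} %")
print(f"eDonkey:   {edonkey:.2f} %")
```

Under this reading the sketch reproduces the 21,33 % and 39,13 % reported for FastTrack and eDonkey. Not every reported percentage matches it exactly, so it should be taken as an approximation of the authors' procedure.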

Figure 1. Number of users of Gnutella Network (axes: Hours, Users)
Figure 2. Number of users in the FastTrack Network (axes: Hours, Users)
Figure 3. Number of users in the OpenNap Network (axes: Hours, Users)
Figure 4. Number of users in the eDonkey Network (axes: Hours, Users)
Figure 5. Number of users in MP2P Network (axes: Hours, Users)
Figure 6. Number of users in the SoulSeek Network (axes: Hours, Users)
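
Section 2.1.2 relates the FastTrack peak hours (20:00 to 23:00 on the GMT+01:00 measurement clock) to the afternoon in the USA. The sketch below shows that conversion with fixed UTC offsets. The choice of US Eastern and Pacific standard time, the absence of daylight saving and the calendar date used are assumptions made for illustration only, and the helper name convert_hour is hypothetical.

```python
from datetime import datetime, timedelta, timezone

MEASUREMENT_TZ = timezone(timedelta(hours=1))   # GMT+01:00, as stated in Section 2
US_EASTERN_STD = timezone(timedelta(hours=-5))  # assumption: EST, no daylight saving
US_PACIFIC_STD = timezone(timedelta(hours=-8))  # assumption: PST, no daylight saving


def convert_hour(hour, target_tz, day=datetime(2004, 1, 16)):
    """Convert an hour on the measurement clock to the hour in target_tz.

    The default day is an arbitrary winter date; only the offsets matter here.
    """
    local = day.replace(hour=hour, tzinfo=MEASUREMENT_TZ)
    return local.astimezone(target_tz).hour


# FastTrack peak window reported in Section 2.1.2.
eastern = (convert_hour(20, US_EASTERN_STD), convert_hour(23, US_EASTERN_STD))
pacific = (convert_hour(20, US_PACIFIC_STD), convert_hour(23, US_PACIFIC_STD))

print(f"US Eastern: {eastern[0]}:00 to {eastern[1]}:00")
print(f"US Pacific: {pacific[0]}:00 to {pacific[1]}:00")
```

With these offsets the peak window falls between late morning and late afternoon across the USA, which is consistent with the interpretation given in Section 2.1.2.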

2.1.5. MP2P. Figure 5 shows the number of on-line users during a week, starting on a Wednesday. We can observe a waveform graph. The peaks are located from 3:00 to 5:00 whereas the valleys are between 15:00 and 18:00. The number of connected users varies between 230.986 and 257.779, with a mean of 244.418. The maximum fluctuation is therefore small (5,5 %).

2.1.6. SoulSeek. Figure 6 illustrates the number of connected privileged users during four days, starting on a Thursday (we only measured four days due to the instability of the architecture, because it appears that the server does not support so many users). This network is growing daily, and all the graphs taken in two months are similar to the one shown in Figure 6. It gained more than 200 new privileged users in four days. The maximum number of users recorded was 9.086 and the minimum 8.875. The network grew by 2,38 % in those four days. The greatest increase occurs between 20:00 and 0:00.

2.2. Number of shared files

2.2.1. Gnutella. The average number of shared files is 55.540, with a maximum variation of 260 % (200.136 in the best case, 20.365 in the worst case), as shown in Figure 7. The relationship between the number of connected users and the number of shared files is weak.

2.2.2. FastTrack. A strong correlation between connected users and shared files can be seen in Figure 8. The average number of shared files is 631.678.681, with a maximum value of 741.435.741 and a minimum value of 512.969.959. The maximum fluctuation is 18,63 %.

2.2.3. OpenNap. Figure 9 depicts the number of shared files inside the OpenNap network. The average value is 158.902.178 files, with a minimum and maximum of 119.587.562 and 212.556.704 respectively, and a maximum fluctuation of 53,65 %. As shown, there is no high correlation between the number of users and the number of shared files in the architecture. The number of shared files has decreased by more than 31 % compared to the results taken between February and March 2003, when the average value was 232.120.378, with variation limits of 258.133.574 and 201.697.846.
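
The year-over-year comparisons quoted so far are relative changes between the two measurement campaigns. As a minimal sketch, they can be recomputed from the averages given in the text; the function name percent_change is hypothetical and the input values are the averages reported in Sections 2.1.2 to 2.1.4 and 2.2.3.

```python
def percent_change(previous, current):
    """Relative change from the earlier average to the later one, in percent."""
    return 100.0 * (current - previous) / previous


# (average in February-March 2003, average in the later campaign)
users_fasttrack = percent_change(3_897_015, 3_467_918)
users_opennap = percent_change(395_853, 256_003)
users_edonkey = percent_change(1_058_580, 1_428_175)
files_opennap = percent_change(232_120_378, 158_902_178)

for label, value in [
    ("FastTrack users", users_fasttrack),
    ("OpenNap users", users_opennap),
    ("eDonkey users", users_edonkey),
    ("OpenNap files", files_opennap),
]:
    print(f"{label}: {value:+.2f} %")
```

The results agree with the rounded figures in the text: decreases of about 11 % and 35 % for FastTrack and OpenNap users, an increase of 34,91 % for eDonkey users and a decrease of just over 31 % for OpenNap files.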

Figure 7. Number of shared files in Gnutella Network (axes: Hours, Files)
Figure 8. Number of files shared in FastTrack Network (axes: Hours, Files)
Figure 9. Number of shared files in OpenNap Network (axes: Hours, Files)
Figure 10. Number of files in the eDonkey Network (axes: Hours, Files)

Figure 11. Number of shared files in the MP2P Network (axes: Hours, Files)

2.2.4. eDonkey. The eDonkey architecture has a mean value of 103.469.627 files, with a maximum of 122.958.989 and a minimum of 65.436.288 shared files. The greatest fluctuation is 37 %. The correlation between connected users and number of shared files (see Figure 10) is high. However, it is not always true that the more users are connected, the more files are shared. The lowest values are obtained between 7:00 and 12:00 whereas the highest values appear from 20:00 to 1:00. Compared with the results obtained a year before (between February and March 2003), the number of shared files has increased by 78.477.136.

2.2.5. MP2P. There is a strong relationship between the number of users and the number of shared files, as can be seen in Figure 11. The MP2P architecture has an average of 59.765.764 shared files, with a minimum of 56.490.875 and a maximum of 62.962.846 files. The maximum fluctuation is 5,4%.

2.2.6. SoulSeek. Due to the characteristics of the SoulSeek clients and the protocol used, the number of shared files in the network cannot be measured. However, most of the shared files are mp3 or ogg archives, and SoulSeek users tend to share not only individual files but whole folders and subfolders, as permitted by the SoulSeek protocol.

2.3. Size of total shared data

2.3.1. Gnutella. Figure 12 shows the evolution of shared data. The average size is 301.118 MB, with a maximum value of 1.352.508 MB and a minimum value of 138.203 MB.

2.3.2. FastTrack. Figure 13 shows the amount of shared data in the FastTrack community. The average amount of shared data is 4.947.261 GB and the maximum and minimum margin values are 5.578.880 GB and 4.169.392 GB respectively, which represents a top variation of 15,72%. A decrease of 42,21% has been observed compared to the measurements taken a year ago (between February and March 2003), when the average value was 833.482.249 files, and the max. and min. variations were 946.077.973 and 735.978.908 respectively.

2.3.3. OpenNap. The amount of shared information in the OpenNap architecture (see Figure 14) is highly dependent on the number of connected users and number of files. The average value is 5.409.326 GB, with a maximum value of 7.279.944 GB and a minimum of 4.071.506 GB. The greatest fluctuation is 34,58 %.

2.3.4. eDonkey. Unfortunately, the eMule client does not show the total amount of shared data inside the eDonkey architecture (neither do other related clients), but indirect measurements show that this network is (by far) the one that has the largest shared files. This is because this architecture is mainly used for downloading multimedia files, such as songs, videoclips and even movies.
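
A rough way to relate the size of shared data to the number of shared files is to divide one weekly average by the other. The sketch below does this for the three architectures whose averages are given in Sections 2.2 and 2.3. It assumes binary units (1 GB = 1024 MB), which the text does not state, and a ratio of two weekly averages is only a coarse estimate of the mean file size. The names MB_PER_GB and mean_file_size_mb are hypothetical.

```python
MB_PER_GB = 1024  # assumption: binary units; the text does not state the convention


def mean_file_size_mb(total_mb, files):
    """Coarse mean file size in MB: average shared data divided by average file count."""
    return total_mb / files


# Average shared data (Section 2.3) and average number of shared files (Section 2.2).
gnutella = mean_file_size_mb(301_118, 55_540)  # Gnutella data is reported in MB
fasttrack = mean_file_size_mb(4_947_261 * MB_PER_GB, 631_678_681)
opennap = mean_file_size_mb(5_409_326 * MB_PER_GB, 158_902_178)

for name, size in [("Gnutella", gnutella), ("FastTrack", fasttrack), ("OpenNap", opennap)]:
    print(f"{name}: about {size:.1f} MB per shared file")
```

The Gnutella figures are partial network values because of the TTL setting described in Section 2, so its result is not directly comparable with the other two.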

Figure 12. Shared data (in MB) in the Gnutella Network (axes: Hours, MB)
Figure 13. Shared data (in GB) in the FastTrack Network (axes: Hours, GB)
Figure 14. Shared data (in GB) in the OpenNap Network (axes: Hours, GB)
Figure 15. Shared data (in GB) in the MP2P Network (axes: Hours, GB)

2.3.5. MP2P. The total amount of shared data (see Figure 15) is also correlated with the first two parameters, i.e., the number of users and shared files. It varies from 223.034 GB to 250.255 GB, with an average of 236.564 GB. The maximum fluctuation is 5,79 %.

2.3.6. SoulSeek. Due to the characteristics of the SoulSeek clients and the protocol used, the amount of shared data in the network cannot be measured. However, due to the kind of files shared, usually mp3 or ogg archives, it is not too high.

3. Analyzed architectures summary

Table 1 summarizes the previous analyses in a comparative way, giving a global perspective of the measurements taken.

4. Conclusions

The evolution of the Internet has brought about the creation of rapidly deployable overlay networks. P2P networks rely on normal users without the necessity of a centralized server. Therefore, the potential aggregation of users allows for the development of data networks addressing important aspects such as high performance, large processing and high availability. The main conclusions of this article are the following:

• The graphs showing number of users, files or size of total files shared do not depend on the decentralization degree of the architecture. As shown, there are regular graphs both in decentralized and in partially centralized architectures.
• There can be more users in one architecture (for example eDonkey) than in another (for example OpenNap); however, there are more files shared in OpenNap than in eDonkey.
• The total size of shared data does not depend on the number of files shared in the architecture. FastTrack is the architecture with the most files, whereas the one with the largest total size of shared data is OpenNap.
• The OpenNap and MP2P architectures have practically the same number of on-line users; however, there are three times more shared files in the OpenNap network than in the MP2P network.
• If the mean value of the maximum and minimum variation limits is taken, the MP2P architecture is the one of the six whose average is closest to this value; the average in the eDonkey network is close to the maximum value, and the others are close to the minimum value.
• Observing the obtained graphs we can establish a certain timetable for each architecture (considering the maximum values for connected users and shared files) in which it is more probable to obtain the desired content.
• The hours in which all the measured architectures have more users are between 18:00 and 5:00 (GMT+1).
• Regarding the internal behavior of the network, if the Gnutella network incorporated the users and files of the FastTrack and eDonkey architectures, big variations in the measurements would have been obtained and the resulting P2P network would probably have become quite unstable.
• The number of users of some of the older architectures is decreasing, due to the appearance of new P2P networks that attract users from older ones. The total number of users connecting to the P2P file-sharing networks is growing. Therefore, the number of users adding to Internet traffic through these networks is also growing.
• SoulSeek can be seen as an emergent architecture. Its graph shows a linear increase in users, but it seems too unstable to support a large number of users.

Parameter                           | Gnutella  | FastTrack    | OpenNap      | eDonkey     | MP2P       | SoulSeek
Grade of decentralization           | Totally   | Partially    | Partially    | Partially   | Partially  | None
Graph                               | Irregular | Regular      | Irregular    | Irregular   | Regular    | Regular
Average number of users             | 181 *     | 3.467.918    | 256.003      | 1.428.175   | 244.418    | 8981
Average number of shared files      | 55.540 *  | 631.678.681  | 158.902.178  | 103.469.627 | 59.756.764 | n/t
Average size of total shared data   | 294 GB *  | 4.947.261 GB | 5.409.326 GB | n/t         | 236.564 GB | n/t
Max. variation of users (%)         | 41,49     | 21,33        | 42,02        | 39,13       | 5,50       | 1,17
Max. variation of shared files (%)  | 260,35    | 18,63        | 53,65        | 36,76       | 5,47       | n/t
Max. variation of shared data (%)   | 349,49    | 15,72        | 34,58        | n/t         | 5,79       | n/t
Correlation between parameters      | High      | Low          | Medium       | High        | High       | n/t
'Rush' hours for connected users    | Variable  | 20:00-23:00  | 18:00-22:00  | 20:00-1:00  | 3:00-5:00  | Variable
Relevance of the weekday            | No        | No           | Yes          | Yes         | No         | No
Behavior compared to a year ago     | Decrease  | Decrease     | Decrease     | Grow        | Grow       | Grow

Table 1. Comparison of the six measured architectures (* not total network values; n/t: measurement not taken)
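
One of the conclusions compares each average with the midpoint of its minimum and maximum. A minimal sketch of that comparison for the number of users follows, expressing the average as a position between the minimum (0.0) and the maximum (1.0). The function name mean_position is hypothetical, the input values are those reported in Section 2.1, and the SoulSeek average is the one listed in Table 1.

```python
def mean_position(mean, minimum, maximum):
    """Position of the mean inside [minimum, maximum]: 0.0 = minimum, 0.5 = midpoint, 1.0 = maximum."""
    return (mean - minimum) / (maximum - minimum)


# (average, minimum, maximum) number of users per architecture.
users = {
    "Gnutella": (181, 123, 256),
    "FastTrack": (3_467_918, 2_728_360, 4_178_674),
    "OpenNap": (256_003, 182_188, 376_378),
    "eDonkey": (1_428_175, 869_314, 1_718_201),
    "MP2P": (244_418, 230_986, 257_779),
    "SoulSeek": (8_981, 8_875, 9_086),
}

positions = {name: mean_position(*values) for name, values in users.items()}
closest_to_midpoint = min(positions, key=lambda name: abs(positions[name] - 0.5))
closest_to_maximum = max(positions, key=positions.get)

for name, position in positions.items():
    print(f"{name}: {position:.3f}")
print("Closest to the midpoint:", closest_to_midpoint)
print("Closest to the maximum:", closest_to_maximum)
```

A position near 0.5 means that the weekly average sits close to the midpoint of the extremes, whereas values well above or below 0.5 indicate an average pulled towards the maximum or the minimum.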

References

[1] Nielsen-Netratings, Global Internet Population grows an average of four percent year-over-year, February 20, 2003.
[2] Nielsen-Netratings, Nearly 40 million Internet users connect via Broadband, growing 49 percent, June 17, 2003.
[3] Eytan Adar and Bernardo Huberman, Free riding on gnutella, First Monday, 5(10), October 2000.
[4] Nathaniel Leibowitz, Matei Ripeanu and Adam Wierzbicki, Deconstructing the Kazaa Network, 3rd IEEE Workshop on Internet Applications (WIAPP'03), June 2003, S. J., CA.
[5] Open Source Napster Server (OpenNap), at http://opennap.sourceforge.net/
[6] Bram Cohen, Incentives Build Robustness in BitTorrent, Workshop on Economics of Peer-To-Peer Systems, Berkeley CA, June 2003.
[7] Sandvine White Paper, Regional Characteristics of P2P: File sharing as a multi-application, multi-national phenomenon, October 2003.
[8] Shareaza client, info at http://www.shareaza.com
[9] MLDonkey, info at http://mldonkey.berlios.de/
[10] Palisade News, March 20 2003, available at http://www.palisadesys.com/news&events/p2pstudy.pdf
[11] File-sharing programs and peer-to-peer networks privacy and security risks, United States House of Representatives Committee on Government Reform – Staff Report, May 2003.
[12] Corporate P2P Usage and Risk Analysis, AssetMetrix Research Labs, July 2003.
[13] Peer-to-Peer: The impact of filesharing on service provider networks, Sandvine Incorporated, 2002.
[14] Internet2 NetFlow, Weekly reports, info at http://netflow.internet2.edu/weekly/20040216/
[15] The CAIDA website at http://www.caida.org
[16] Stefan Saroiu, P. Krishna Gummadi and Steven D. Gribble, A Measurement Study of Peer-to-Peer File Sharing Systems, Department of Computer Science & Engineering, Univ. of Washington, Tech Report UW-CSE-01-06-02.
[17] The Pew Internet & American Life Project's Online Music, September 28, 2000, info at http://www.pewinternet.org/pdfs/PIP_Online_Music_Report2.pdf
[18] Artur Marques, available at http://arturmarques.com/docs/economics/arturmarques_dot_com_freeloading.pdf
[19] Oliver Heckmann and Axel Bock, The eDonkey 2000 Protocol, Technical Report KOM-TR-08-2002, Multimedia Communications Lab, Darmstadt University of Technology, December 2002.
[20] EDonkey2000 in Wikipedia, info at http://en.wikipedia.org/wiki/EDonkey
[21] MP2P, http://www.blubster.com/protocol1.html
[22] SoulSeek, info at http://www.slsknet.org/
[23] Limewire, http://www.limewire.com/developer/
[24] KaZaA Lite, info at http://www.k-lite.tk/
[25] Napigator, info at http://www.napigator.com/
[26] Emule, info at http://www.emule-project.net/
[27] Nicotine, at http://nicotine.thegraveyard.org/
[28] Statistics in Limewire.com, http://www.limewire.com/english/content/netsize.shtml