A Measurement Study of Attacks on BitTorrent Leechers

Prithula Dhungel, Di Wu, Brad Schonhorst, Keith W. Ross
Polytechnic University, Brooklyn, NY, USA 11201
Email: [email protected], [email protected], [email protected], [email protected]

Abstract

Anti-P2P companies have begun to launch Internet attacks against BitTorrent swarms. In this paper, we analyze how successful these attacks are at impeding the distribution of targeted files. We present the results of both passive and active measurements. For our active measurements, we developed a crawler that contacts all the peers in any given swarm, determines whether the swarm is under attack, and identifies the attack peers in the swarm. We used the crawler to analyze 8 top box-office movies. Using passive measurements, we performed a detailed analysis of a recent album that is under attack.

1. Introduction

Over the past several years, the music industry has aggressively attempted to impede the distribution of copyrighted content over P2P file distribution networks. These attempts included numerous lawsuits against P2P file sharing companies (Napster, Kazaa and many others), tracking and suing users of P2P file sharing systems [1], and most remarkably, launching large-scale Internet attacks against the P2P systems themselves. These large-scale Internet attacks were performed by specialized anti-P2P companies, working on behalf of the RIAA and specific record labels. Several studies showed that these attacks were successful at severely impeding the distribution of targeted content over several P2P file sharing systems, including FastTrack/Kazaa, Overnet/eDonkey, and Gnutella [2, 3, 4]. These attacks, along with the lawsuits, have contributed to the demise of the Kazaa and eDonkey file-sharing networks.

BitTorrent is one of the most popular P2P file distribution protocols today, particularly for the distribution of large files, such as high-definition movies, television series, record albums and open-source software distributions [5]. Unlike Napster and Kazaa, BitTorrent is nothing more than a protocol and about a dozen clients that implement the protocol. BitTorrent swarms and clients are not controlled by a small set of companies that can be targeted by a lawsuit. Also included in the BitTorrent ecosystem are torrent location and tracker services, which can potentially be legally attacked; in fact, in late 2004, Suprnova, the largest torrent locator at that time, was closed after legal threats. Today, however, there are many BitTorrent file location and tracking services, scattered around the globe, all of which are defying legal threats, including PirateBay, Mininova, Snarf-it, and BiteNova. Moreover, torrent tracking can be decentralized using DHTs, as is currently being done with clients like Azureus and uTorrent.

Given that it is currently difficult, if not impossible, to stop BitTorrent by suing companies, and that suing individual users is both painstaking and unpopular, the only remaining way to stop BitTorrent is via Internet attacks. Not surprisingly, the music and film industries have begun to hire anti-P2P companies to impede the distribution of specific “assets” in BitTorrent swarms [6, 7].

In this paper, using Internet measurement, we explore how successful these anti-P2P companies currently are at impeding the distribution of targeted files in BitTorrent. We present results for both passive and active measurement. For active measurements, we developed a crawler that contacts all the peers in any given swarm, determines whether the swarm is under attack, and identifies the attack peers. We used the crawler to analyze 8 current top box-office movies. Using passive measurements, we performed a detailed analysis of a recent album that is under attack. For the passive measurements, we developed a customized packet parser, which identifies the peers that are attacking and the type of attack they employ.

2. Two BitTorrent Attacks

BitTorrent swarms are susceptible to a number of different attack types. In our measurement work, we have observed two attacks that are frequently deployed today, which we refer to as the fake-block attack and the uncooperative-peer attack. In this section, we describe these two attacks.

2.1. Fake-Block Attack

Recall that in BitTorrent, each file is divided into pieces, where each piece is typically 256 KBytes. Each piece is further divided into blocks, with typically 16 blocks in a piece. When downloading a piece, a client requests different blocks for the piece from different peers.

In the fake-block attack, the goal of the attacker is to prolong the download of a file at peers by wasting their download bandwidth. In particular, an attacker joins the swarm sharing the file by registering itself with the corresponding tracker. It then advertises that
it has a large number of pieces of the file. Upon receiving this information, a victim peer sends a request to the attack peer for a block. The attacker, instead of sending the authentic block, sends a fake one. After downloading all the blocks in the piece (from the attack peer and from other benevolent peers), the victim peer performs a hash check across the entire piece. The hash check then fails due to the fake block from the attacker. This requires the victim peer to download the entire piece (16 blocks) again, delaying the download of the file. If the peer chooses to download any of the blocks again from this or another fake-block attacker, the download is further delayed. Note that an attacker can cause a victim peer to waste 256 KBytes of download bandwidth by sending it only a single 16 KByte block (using typical numbers).

2.2. Uncooperative-Peer Attack

In this class of attacks, the attacker joins the targeted swarm and establishes TCP connections with many victim peers. However, it never provides any blocks (authentic or fake) to its victim peers. A common version of this attack is the chatty-peer attack. Here, the attack peer speaks the BitTorrent protocol with each of the victim peers, starting with the handshake message, followed by the bitmap message advertising that it has a number of pieces available for the file. When a victim peer requests one or more blocks, the attack peer does not upload the blocks. Moreover, the attacker is chatty by nature. After the victim peer sends one or more block requests, the attacker resends the handshake and bitmap messages. By resending these BitTorrent control messages over and over again, the attacker persists as a neighbor, and the victim peer wastes considerable time dealing with the attack peer, when it could have instead downloaded blocks from other benevolent peers. The effectiveness of this attack is increased if a significant fraction of the victim's neighbors are uncooperative.

3. Effectiveness of BitTorrent Attacks

In this section, we use passive measurements to evaluate the effectiveness of fake-block and uncooperative-peer attacks on BitTorrent systems. In the next section, we complement this evaluation with active, crawler-based measurements.

3.1. Passive Measurement Methodology

While repeatedly downloading a file suspected to be under attack, we collected multiple packet traces from hosts connected to both Ethernet and DSL access networks. For this testing, we used Azureus and uTorrent, as they are the two most widely used BitTorrent clients. On each host, we ran Wireshark (or TCPdump) to capture all the incoming and outgoing packets. We also developed our own packet parser to identify different types of attackers in the trace and analyze their behaviors.

To measure the performance of BitTorrent without attacks, we used third-party software, PeerGuardian [8], to prevent connections to and from the IP ranges in a specified blacklist. Our IP blacklist is based on the ZipTorrent blacklist published on .com [6]. Note that, since the anti-P2P companies (e.g., MediaDefender [9]) change the IP range of their attack hosts, this blacklist is not always complete and may not always eliminate all the attacker hosts.

3.2. Passive Measurement Results

In this section we present measurement results for a torrent for the new album titled “Echoes, Silence, Patience & Grace” from “Foo Fighters”, which we suspected to be under attack. This popular album was released on September 25, 2007, a few weeks before our experiments. At the time of the experiment, it held the number 1 position on the UK album chart and iTunes ranking list. The size of the file is 108 MBytes. In our testing, we downloaded the file from this torrent 54 times.

3.3. Azureus Client

Because Azureus clients can import IP blacklists, we used this Azureus feature to perform IP filtering. Within one day, we performed downloads for this torrent multiple times using Azureus clients, and switched the IP filter on and off alternately. First we present the basic average download-time statistics in Table 1.

Azureus     w/ IP-filtering            w/o IP-filtering           Delay Ratio
Ethernet    15.52 mins (6 downloads)   20.99 mins (6 downloads)   35.2%
DSL         19.98 mins (6 downloads)   25.88 mins (6 downloads)   29.5%

Table 1: Average downloading time using Azureus clients

In Table 1, Delay Ratio is defined as follows to evaluate the effectiveness of attacks in lengthening BitTorrent downloading time,

\[ \text{Delay Ratio} = \frac{T_d^{\text{w/o IP-filtering}} - T_d^{\text{w/ IP-filtering}}}{T_d^{\text{w/ IP-filtering}}} \]

where $T_d$ is the average downloading time of BitTorrent clients. From the table, we clearly observe that the downloading time of the file is prolonged when attacked. For both DSL and Ethernet peers, the download time on average increased by about 30% (29.5% for DSL and 35.2% for Ethernet). The actual increase in download time may be larger, since we may not have blacklisted all the malicious peers. However, given the download rate of the DSL client, the size of the file, and that the minimum observed download time was 17 minutes, it is unlikely that the average download time without an attack would have been less than 17 minutes. Thus, we can safely say, at least for DSL, that the attackers did not prolong the downloading of this file by more than about 50%.

To get a deeper understanding of the attack on Azureus clients, we selected one typical packet trace and analyzed it with the packet parser we developed.
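
As a quick check on the Delay Ratio values reported in Table 1, the definition can be applied directly to the average download times. The following is a minimal Python sketch; the function name `delay_ratio` and the dictionary `TABLE_1_MINUTES` are illustrative names chosen here, not part of the measurement tooling described in this paper.

```python
def delay_ratio(t_without_filter: float, t_with_filter: float) -> float:
    """Relative increase in average download time when IP filtering is off.

    Implements (T_d w/o IP-filtering - T_d w/ IP-filtering) / T_d w/ IP-filtering.
    Both arguments are average download times in the same unit.
    """
    if t_with_filter <= 0:
        raise ValueError("filtered download time must be positive")
    return (t_without_filter - t_with_filter) / t_with_filter


# Average download times in minutes from Table 1 (Azureus),
# stored as (with IP filtering, without IP filtering).
TABLE_1_MINUTES = {
    "Ethernet": (15.52, 20.99),
    "DSL": (19.98, 25.88),
}

if __name__ == "__main__":
    for access, (with_filter, without_filter) in TABLE_1_MINUTES.items():
        ratio = delay_ratio(without_filter, with_filter)
        print(f"{access}: {ratio:.1%}")
    # Prints:
    # Ethernet: 35.2%
    # DSL: 29.5%
```

Applying the same function to the 17-minute lower bound for DSL, `delay_ratio(25.88, 17.0)`, gives roughly 0.52, which is in line with the approximate 50% ceiling stated above.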

Our parser can categorize all the IPs in the trace into different types as follows:

• No-TCP-connection Peers: peers with which our client cannot establish TCP connections.
• No-BT-handshake Peers: peers with which our client can successfully establish a TCP connection, but which do not return a BitTorrent handshake response when the client sends a BitTorrent handshake message.
• Chatty Peers: peers that just chat with our client. For Azureus clients, we consider any peer that sends more than one Azureus handshake message as a Chatty Peer.
• Fake-Block-Attack Peers: peers that upload fake blocks to our client. To identify fake-block-attack peers, we first check whether hash fails happened during downloading. When a hash check fails, we identify all the IPs that have uploaded blocks for the piece and check whether the uploaded blocks are fake or not.
• Benevolent Peers: peers that communicate normally with our client via the BitTorrent protocol and upload at least one genuine block to our client.
• Other Peers: peers that don't fall into any of the above categories.

[Figure 1 (pie chart): No-TCP-Connection Peers 53%; Chatty Peers 18%; No-BT-Handshake Peers 17%; Benevolent Peers 10%; Other Peers 2%; Fake-Block-Attack Peers 0%]
Figure 1: Peer distribution in Azureus trace

Figure 1 shows the distribution of different types of peers in the Azureus trace. Among all the IPs tried, the Azureus client could not establish TCP connections with over half of them. The high percentage of no-TCP-connection peers is not necessarily due to attackers. The no-TCP-connection peers include NATed peers, firewalled peers, stale IPs returned by trackers or gossiping messages, and peers that have reached their limit on TCP connections (typically around 50 in BitTorrent). Even in clean torrents (e.g., public-domain software) where no attacks exist, we observe a large percentage of no-TCP-connection peers.

No-BT-handshake peers account for 17% of the total IPs. If combined with no-TCP-connection peers, almost 70% of all the IPs are not useful for our Azureus client. Of the remaining 30% of the IPs, benevolent peers make up only 10% of all the IPs, while 18% of all the IPs belong to chatty peers, which chat with the Azureus client continuously but without uploading any pieces. Chatty peers thus account for a majority of the useful peers (i.e., 60%).

To estimate how chatty the attackers actually are, we checked how many handshake messages were sent out by each chatty peer. The results are shown in Figure 2. We can observe that most of the chatty peers are very chatty, and send out as many as 40-60 handshake messages to our Azureus client. Those chatty peers persist as neighbors of the Azureus client during the downloading process, and hinder the client from contacting benevolent peers. No hash fails occurred during the downloading. Thus, it appears that the attackers did not launch a fake-block attack against Azureus clients at this time.

[Figure 2 (histogram): x-axis: Number of Azureus Handshake messages sent out (0 to 80); y-axis: Number of Chatty Peers (0 to 20)]
Figure 2: Distribution of Azureus handshake messages across chatty peers.

3.4. uTorrent Client

We also used uTorrent clients to download the same file. We turned off the automatic filtering function of uTorrent and used PeerGuardian to perform IP-filtering.

Table 2 provides the average downloading time of uTorrent clients. For uTorrent clients with Ethernet connections, the attackers did not succeed at significantly increasing the average download time. However, the attackers appear to have some success with DSL clients, increasing the average download time by 58%.

uTorrent    w/ IP-filtering             w/o IP-filtering            Delay Ratio
Ethernet    9.17 mins (10 downloads)    9.42 mins (10 downloads)    2.7%
DSL         18.32 mins (5 downloads)    28.93 mins (5 downloads)    57.9%

Table 2: Average downloading time for uTorrent clients

Table 3 shows the average number of hash fails for uTorrent clients. Compared with Azureus clients (which had no hash failures), hash failures occur much more frequently. The hash failures are a direct consequence of the fake-block attack being launched against uTorrent. Hash failures may not significantly impact
an Ethernet peer, since if the Ethernet peer can find one other high-bandwidth benevolent trading partner, it may be able to rapidly download complete pieces (all 16 blocks) from it even if the other neighbors are producing hash failures. For DSL clients, because of the tit-for-tat algorithm, the client is typically trading only with other lower-bandwidth peers; even if one of these peers is producing a stream of clean pieces, the pieces would be coming in at a relatively low rate.

uTorrent    w/ IP-filtering    w/o IP-filtering
Ethernet    1.7 Hash Fails     44.2 Hash Fails
DSL         4.2 Hash Fails     68.4 Hash Fails

Table 3: Average number of Hash Fails for uTorrent clients

[Figure 3 (pie chart): No-TCP-Connection Peers 58%; Benevolent Peers 22%; No-BT-Handshake Peers 13%; Other Peers 5%; Fake-Block-Attack Peers 2%; Chatty Peers 0%]
Figure 3: Peer distribution in uTorrent trace

To gain deeper insights, we plot the peer distribution in one of the uTorrent traces in Figure 3. Similar to the Azureus trace, no-TCP-connection peers account for 58% of all the IP addresses in the peer list.

Compared with the Azureus trace, the main difference lies in the distribution of chatty peers and fake-block-attack peers. In Azureus, we saw significant chatty-peer activity but no fake-block attackers. In the case of Azureus, the attackers exploited an implementation vulnerability: the client is not able to detect the malicious behavior of attackers sending multiple handshake messages. It appears that uTorrent clients do not have this vulnerability.

However, we did observe the fake-block attack against uTorrent. The fake-block attack differs from the chatty-peer attack in that it doesn't require many IP addresses to launch the attack. Even if the percentage of fake-block-attack peers is fairly low among all the IPs, the attack can still be effective, particularly towards the end of the file download (the end game).

In summary, the anti-P2P companies are applying distinctly different strategies against different BitTorrent clients. From this experiment (involving 54 downloads from the same torrent), we observe that the attacks are not always successful at significantly prolonging download times. For Ethernet clients, the attackers appear to be largely ineffective. On the other hand, average download times for our uTorrent client behind DSL connections increased by about 60%. However, even if download times were to double (100% delay ratio), it is not clear that many users would abandon BitTorrent, since users often download in the background or overnight. In the next section, we examine the attack over a wider array of torrents.

4. Active Measurement Results for Top Box Office Movies

In this section, we provide active measurement results for the detection of chatty peers and fake-block-attack peers in torrents for 8 top box office movies.

4.1. Active Measurement Methodology

We developed a crawler that traverses the BitTorrent network gathering IP addresses of peers for a given torrent. We also developed customized BitTorrent clients, and devised heuristics for the detection of chatty peers and fake-block-attack peers. Doing this enabled us to quickly run experiments over a large number of torrents without having to download the entire files (as in the previous section).

4.1.1 Crawler Architecture

The output of the crawler is a “pool” containing the IP address and port pairs of peers in the torrent. The crawler repeatedly requests lists of peers participating in the torrent from the tracker. Every time a list is received from the tracker, the crawler checks each IP and port to see if it already exists in the pool. If not, the new pair is added to the end of the pool. After gathering some IP addresses and ports in the pool, an IP address and port pair is extracted from the beginning of the pool. A separate thread is forked, which initiates a TCP connection to the peer.

If a TCP connection can be successfully established, the crawler thread then sends a BitTorrent handshake message to the peer indicating that it is an Azureus peer. If the peer is also an Azureus peer (which is determined from the handshake reply received from the peer), the thread speaks to the peer using the Azureus messaging protocol. An interesting feature of Azureus is that its clients exchange gossip messages with each other to share peer lists. Hence, by acting as an Azureus peer, the crawler thread is able to gather additional IP addresses from the gossip messages that it gets from the Azureus peer. It is also possible to obtain peer lists by accessing a DHT created with Azureus clients, but we do not consider this feature in this study. The new pairs gathered via gossip are again added to the end of the pool.

A separate thread is forked for each IP and port pair in the pool and each thread runs until either there is an error in the TCP connection with the peer, or the timer for the peer expires. Similarly, the whole crawling process is continued until the timer for the crawler expires. We tested our crawler on a number of torrents
and observed that even for a swarm size as large as 12,000, the crawler was able to crawl more than 90% of peers within 8 minutes. In all of our experiments, we ran the crawler for 8 minutes.

4.1.2 Detection of Chatty Peers

For the detection of chatty peers, the instrumented client initiates TCP connections to IP addresses from the crawler pool. After having established a TCP connection, the instrumented client speaks the Azureus messaging protocol to the peer if the peer is an Azureus peer, and the “conventional” protocol in the case of other peers. For peers that are Azureus clients, our client marks them as being “chatty” if they send more than one Azureus handshake message. Our client also marks a peer as “non-useful” if either a TCP connection cannot be made to it, or if it does not reply with a BitTorrent handshake message when a TCP connection is established.

4.1.3 Detection of Fake-Block Attack

For the detection of fake-block-attack peers, the instrumented client establishes TCP connections to peers from the crawler pool and speaks the “conventional” BitTorrent protocol to all peers. In addition to marking peers as “non-useful,” this client marks a peer as being a “fake-block-attack peer” if the peer sends a fake block.

4.2. Active Measurement Results

We collected torrents for the 20 top box office movies during the time of the experiment. We ran an initial crawl on these torrents and checked the peer lists obtained against the blacklist. Out of the 20 movies, we chose the 3 movies (Movies 1 through 3) that appeared to be heavily attacked (based on the large number of blacklisted peers in the peer lists obtained from the crawler). We also selected 3 other movies (Movies 4 through 6) that appeared to be lightly attacked. For each of the 6 movies, we chose the torrent that showed the highest number of blacklisted peers for the movie. We also selected 2 other movies (Movies 7 and 8) that did not show any blacklisted IP addresses in their peer lists.

4.2.1 Test Results for Chatty-Peer Attack

Table 4 shows the test results for the chatty-peer attack.

Table 4: Measurement Results for Chatty Peers

          Total Peers Crawled   IPs from Blacklist   Non-Useful Peers   Useful   Chatty Peers
Movie ID  Tracker   Gossip      Tracker   Gossip     Tracker   Gossip   Peers    Tracker   Gossip   IPs from Blacklist
Movie1    116       864         1         73         90        836      54       0         27       26
Movie2    633       206         1         48         528       159      152      0         7        7
Movie3    144       158         0         30         111       98       93       0         0        0
Movie4    16        407         0         12         8         398      17       0         2        0
Movie5    29        1460        0         2          16        1460     13       11        0        0
Movie6    2356      3992        0         4          1992      3558     798      0         0        0
Movie7    111       0           0         0          81        0        30       0         0        0
Movie8    82        0           0         0          57        0        25       0         0        0

We observe that for the 6 attacked movies, 70% (or more) of the peers crawled are not useful, meaning that they are either not reachable by TCP connections, or do not reply with a BitTorrent handshake message after a TCP connection is made. This result is consistent with our passive measurements in Section 3.

We also observe that for Movie 1, half of the useful peers (those that reply with the BitTorrent handshake message) are chatty. For Movie 5, about 85% of the useful peers are chatty. Interestingly, for Movie 5, none of the chatty peers that were detected fall into the blacklist that we have. As a verification, we downloaded the same movie using a real BitTorrent client and found that these IP addresses were indeed chatty. This indicates that static blacklisting is not sufficient for preventing such attacks since the attackers can always change IP addresses. Furthermore, for each movie, we observe a large number of blacklisted IP addresses from gossip. However, not all attack IPs come from gossiping: for Movie 5, there are 11 attacker IP addresses from the tracker and none from gossip.

For Movies 7 and 8, which did not appear to be under attack from the initial crawling, no chatty peers were detected and the percentage of non-useful peers is still around 70%.

4.2.2 Test Results for Fake-Block Attack

Table 5 shows the test results for the fake-block attack for the same 8 torrents. It can be seen that the fraction of non-useful IP addresses is again 65% (or more) for the 8 torrents. For Movie 1, almost half of the useful peers were fake-block-attack peers. Since similar results were seen for the chatty-peer test, it can be concluded that Movie 1 was indeed “heavily attacked” at the time of our experiments. Interestingly, at that time,
it was already over 1.5 months after Movie 1 was released and so the movie was below 3 of the other 5 (attacked) movies in the box office rankings at that time.

Table 5: Measurement Results for Fake-Block-Attack Peers

          Total Peers Crawled   IPs from Blacklist   Non-Useful Peers   Useful   Fake-Block-Attack Peers
Movie ID  Tracker   Gossip      Tracker   Gossip     Tracker   Gossip   Peers    Tracker   Gossip   IPs from Blacklist
Movie1    104       2284        6         68         75        2260     53       4         17       21
Movie2    604       313         1         72         494       255      168      0         8        8
Movie3    59        524         0         29         41        439      103      0         0        0
Movie4    15        86          0         10         10        77       14       0         0        0
Movie5    22        640         0         1          11        640      11       0         0        0
Movie6    374       884         1         1          292       677      289      0         0        0
Movie7    89        0           0         0          67        0        22       0         0        0
Movie8    114       0           0         0          74        0        40       0         0        0

We compared the lists of chatty peers and fake-block-attack peers that were detected for Movies 1 and 2. We found that for each of these, some of the IP addresses detected as chatty were also detected as fake-block-attack peers. This reaffirms our claim that a specific attacker behaves differently for different BitTorrent clients.

In summary, our active measurement apparatus and methodology can quickly determine whether a torrent is under attack. We have found that several, but not all, top box-office movies are currently under attack. We have also found that published blacklists do not always cover all the attackers in a torrent. We also observed that the majority of the attack IPs enter the system through gossiping; however, some also enter through trackers.

5. Conclusion

We have seen that anti-P2P companies are currently launching Internet attacks to impede file distribution in BitTorrent swarms. We have identified two classes of attacks currently being employed: the fake-block attack and the uncooperative-peer attack. We have found that these two attacks can indeed prolong the average download time of files, particularly for residential broadband users. However, the extent of this prolongation, at least for the torrents studied here, is modest: typically not more than 50%. Since most BitTorrent users are fairly patient and download files overnight or in the background, we believe that download times need to be tripled (or at least doubled) to have a significant impact. Thus, the anti-P2P companies are not currently successful at stopping the distribution of targeted assets over BitTorrent. We have also found that blacklist-based IP filtering is insufficient to filter out all the attackers. To better filter out attackers, it is necessary to design smart online algorithms to identify different types of attackers.

Acknowledgment

We would like to thank Vishal Misra for discussions at the early stage of this research.

References

[1] A. Banerjee, M. Faloutsos, and L. Bhuyan, “Is someone tracking P2P users?” in Proc. of IFIP NETWORKING, Atlanta, GA, 2007.
[2] J. Liang, R. Kumar, Y. Xi, and K. W. Ross, “Pollution in P2P file sharing systems,” in IEEE Infocom, Miami, FL, USA, March 2005.
[3] N. Christin, A. S. Weigend, and J. Chuang, “Content availability, pollution and poisoning in file sharing peer-to-peer networks,” in EC ’05: Proceedings of the 6th ACM conference on Electronic commerce. New York, NY, USA: ACM, 2005, pp. 68–77.
[4] J. Liang, N. Naoumov, and K. Ross, “The index poisoning attack in P2P file-sharing systems,” in Proc. of IEEE Infocom, Barcelona, 2006.
[5] A. Pouwelse, P. Garbacki, D. Epema, and H. Sips, “The BitTorrent P2P file-sharing system: measurements and analysis,” in The 4th Int’l Workshop on Peer-to-Peer Systems (IPTPS’05), Ithaca, NY, February 2005.
[6] “Ziptorrent blacklist,” http://torrentfreak.com/ziptorrent-pollutes-and-slows-down-popular-torrents/.
[7] “Bittorrent servers under attack,” http://news.zdnet.com/.
[8] “Peerguardian,” http://phoenixlabs.org/pg2/.
[9] “Media defenders,” http://www.mediadefender.com/.
