BitTorrent Darknets
Chao Zhang, Prithula Dhungel, Di Wu, Zhengye Liu and Keith W. Ross

Woonhak Kang
2010. 11. 04
VLDB Lab.
[email protected]

Contents
• Introduction
• BitTorrent (Background)
  § Architecture and Terminology
  § Public and Private torrent sites
• Overview of BitTorrent Darknets Operation
• Analysis
  § Macroscopic
  § Medium-scopic
  § Microscopic
• Conclusion

SKKU VLDB Lab.

Introduction
• Darknet
  § Private torrent sites
  § Open to registered members only
  § Users join by invitation or during a site's temporary open-registration period
  § The site records each user's upload and download volume
    - Access is restricted according to the user's up/down ratio
    - Users with a high up/down ratio receive benefits
• Motivation
  § Darknets have received little attention from the research community.
  § Because of their distinctive policies, their characteristics differ from those of public torrent sites.
  § Understanding the torrent system as a whole requires considering both public and private sites.
• Analysis
  § Macroscopic
    - Analysis of more than 800 private torrent sites
    - Uses the Sharky list and Alexa rank
    - Analyzes overall torrent file, user, and peer information
  § Medium-scopic
    - Analysis of 4 popular private torrent sites
    - Analyzes trackers, peers, users, and the files actually shared
    - Correlation between public and private sites
  § Microscopic
    - Analysis of HDChina
    - Examines users' up/down records and activity time

BitTorrent (Background)
• BitTorrent is a system for efficient and scalable replication of large amounts of static data
  § Scalable - the throughput increases with the number of downloaders
  § Efficient - it utilises a large amount of available network bandwidth
• The file to be distributed is split up into pieces and a SHA-1 hash is calculated for each piece
• A metadata file (.torrent) is distributed to all peers
  § Usually via HTTP
• The metadata contains:
  § The SHA-1 hashes of all pieces
  § A mapping of the pieces to files
  § A reference to the tracker(s)
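The piece hashing described above can be sketched in a few lines of Python. This is a minimal illustration, not a full .torrent writer: the function name `piece_hashes`, the 32 KiB piece length, and the dummy payload are assumptions chosen for the example.

```python
import hashlib


def piece_hashes(data: bytes, piece_length: int) -> list[bytes]:
    """Split data into fixed-size pieces and return the SHA-1 digest of each.

    The last piece may be shorter than piece_length.
    """
    return [
        hashlib.sha1(data[i:i + piece_length]).digest()
        for i in range(0, len(data), piece_length)
    ]


# Hypothetical payload and piece size, for illustration only.
payload = b"x" * 100_000
hashes = piece_hashes(payload, 32 * 1024)
print(len(hashes), "pieces; first hash:", hashes[0].hex())
```

A downloader can verify each received piece independently by recomputing its SHA-1 and comparing it with the digest stored in the metadata.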
BitTorrent (Background)
• The tracker is a central server keeping a list of all peers participating in the swarm
• A swarm is the set of peers that are participating in distributing the same files
• A peer joins a swarm by asking the tracker for a peer list and connecting to those peers
Source: An introduction to the BitTorrent Peer-to-Peer File-Sharing System, J.A. Pouwelse et al.
• Private vs. public
  § Private flag set to 1
    - Determines whether DHT and PEX are enabled

Overview of BitTorrent Darknets Operation
• Darknet owner
  § Runs the web site and the tracker
• User
  § Registers on the web site and gets a "pass key"
  § Invitation system
  § Tracker Checker, BTRACS
• Incentive policies
  § Ratio incentive
  § Enforced minimum ratio

Analysis
• Macroscopic
  § Rough idea about
    - How many BitTorrent darknets exist
    - How many files are being shared
    - How many users participate
    - Where the trackers are located
    - Where the users of the darknets are located
• Methodology
  § Find darknets: Sharky list
  § Alexa rank
  § Crawler
• Sharky list
  § 900+ darknets in June 2009
  § 963 today
• Creating the list
  § Tracker checker websites
  § File sharing blogs and forums
  § Google search
  § IRC invite channels
• In this paper
  § Only operational sites are kept, checked manually
  § 863 private sites
• Category analysis
  § 55% General
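As a rough sketch of the ratio-based incentive policy listed in the operation overview above, the following Python shows how a tracker site might enforce a minimum up/down ratio. Everything here is an assumption made for illustration: the `Account` type, the function names, and the tier values are hypothetical and do not describe any particular site's rules.

```python
from dataclasses import dataclass


@dataclass
class Account:
    uploaded_gb: float
    downloaded_gb: float


# Hypothetical tiers: (minimum downloaded GB, required up/down ratio),
# ordered from the strictest tier down. Illustrative values only.
TIERS = [(100.0, 0.7), (10.0, 0.3)]


def share_ratio(account: Account) -> float:
    """Return uploaded/downloaded, treating 'nothing downloaded' as infinite."""
    if account.downloaded_gb == 0:
        return float("inf")
    return account.uploaded_gb / account.downloaded_gb


def required_ratio(account: Account) -> float:
    """Return the minimum ratio that applies at this download volume."""
    for min_downloaded, ratio in TIERS:
        if account.downloaded_gb >= min_downloaded:
            return ratio
    return 0.0  # below the first tier, no minimum is enforced


def may_download(account: Account) -> bool:
    """True if the account currently satisfies its minimum ratio."""
    return share_ratio(account) >= required_ratio(account)


print(may_download(Account(uploaded_gb=5, downloaded_gb=20)))
print(may_download(Account(uploaded_gb=8, downloaded_gb=20)))
```

Under these assumed tiers, new users face no requirement, and the requirement tightens as their download volume grows.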
Analysis
• Geographic distribution
  § Using MaxMind GeoIP
  § Europe leads (the Netherlands foremost)
• Which site is most popular?
  § Using Alexa's rank
  § Alexa rank - presents usage statistics
  § Pick the 15 most popular darknets
    - 6 of them are located in the Netherlands
    - 1 in China
• Top site - Torrents.ru
  § 612,000 torrents
  § 3.5 million user accounts
• Usage by country
  § Where is the Netherlands?
• Total estimation
  § Regression analysis between Alexa rank and darknet statistics
  § Obtained 33 private sites (out of 67 sites)
  § Manually gathered statistics from the sites (some of them partial)
  § # of torrents: 0.84
  § # of accounts: 0.81
  § # of peers: 0.89
• Total estimation
  § Obtain the correlation equations (X is the Alexa rank)
  § yt = torrents
  § ya = accounts
  § yp = peers
• Aggregate the total estimate using the equations
• Private vs. public
  § Public: top 5 public torrent sites (Mininova, Pirate Bay, Torrent Reactor, Btmonster, and Torrent Portal)
  § Collected
    - 8.8 million .torrent files (4.6 million unique infohashes)
    - 38,996 trackers
  § Observed
    - 5,085,217 unique peers
• Summary
  § Darknets
    - The private world is comparable in size to the public sites
    - 4.4 million torrents vs. 4.6 million infohashes
    - Active peers outnumber those of the public sites
• Medium-scopic
  § 4 sites
    - Torrents.ru, Zamunda, BitSoup, HDChina
    - Each uses only one tracker
    - Crawled from April 11, 2009 to June 13, 2009
    - Zamunda, BitSoup, HDChina
    - Torrents.ru has the private flag set to 0 and uses DHT
    - Active torrent: has at least one active peer
• Overlap with the public ecosystem
  § Infohash-based
    - Same infohash (SHA-1)
  § Piece-based
    - Because of the private flag, the infohash differs
    - Alternative - match the hash of each piece
    - Better than infohash matching
• Overlap with the public ecosystem
  § Infohash-based
  § Piece-based
  § Comparably low overlap ratio between
the individual darknets

Analysis
• Overlap between public sites
  § More than 50%
• Title match, extended match
  § Title match
    - The same file has the same title
    - e.g. Ghost Ship. HDDVD.1080p.DTS.x264-CtrlHD
    - Title, source media, resolution, codec, release team and so on
  § Extended match
    - Title match + the same file size (within 5%)
    - Same file but a different hash set
    - Encode rate, different language
  § Methodology
    - Top 100 and 100 random torrents
    - Run the TM and EM checks
• Leakage into the public ecosystem
  § IPs have leaked into a DHT
    - Anyone who knows these IPs does not need to register with the private site
  § Methodology
    - Develop a DHT crawler
    - Crawl the DHT for all the infohashes obtained from private sites
    - Low leakage rate except for Torrents.ru
• Characteristics of private torrents
  § Newly released torrents attract more peers
  § Decay on private sites is much smaller
    - Because of the purging policy
    - Unpopular torrents are removed
  § Average torrent age on private sites is smaller
  § Because of the purging policy
    - Old, unpopular torrents are removed by the administrator
  § Rank - has a longer tail
• Microscopic
  § HDChina
    - HD movies and TV series
    - 18,054 user accounts
    - 15,738 active torrents
    - 10GB, 0.3 up/down ratio
    - 100GB, 0.7 up/down
    - Parsed all user data in HDChina
  § Up/down rate
    - Incentive policy
    - Total up/down - 17,054 TB / 2,568 TB
    - Many users upload more than 1 TB
  § Share rate (up/down)
    - More than 90% of users have a ratio higher than 1
    - Less than 5% have a ratio higher than 100
  § Online time
    - 50% of users return within 10 hours
    - 95% of users return within 100 hours

Conclusion
• Investigated 800+ private torrent sites
  § In terms of geographic concentration and content distribution
  § Using Sharky's list and Alexa rank
    - Presents an informative view of the darknet landscape
    - Regression analysis gives us an estimate
    - Private torrent sites are individually relatively small, but the aggregate size of the darknets is large
• Popular torrent sites
  § Private sites vs. public sites
  § Overlap, leakage

QnA
• Any questions?
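The difference between infohash-based and piece-based overlap matching from the medium-scopic analysis can be illustrated with a small Python sketch. It assumes the standard definition of the infohash as the SHA-1 of the bencoded info dictionary. The minimal `bencode` helper, the `make_info` builder, and the dummy file are hypothetical, and treating "piece-based match" as equality of the piece-hash sets is one plausible reading of the slides rather than the authors' exact procedure.

```python
import hashlib


def bencode(value) -> bytes:
    """Minimal bencoder for ints, bytes, str, lists and dicts."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, str):
        return bencode(value.encode("utf-8"))
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings, emitted in sorted order.
        items = sorted(
            ((k.encode("utf-8") if isinstance(k, str) else k, v)
             for k, v in value.items()),
            key=lambda kv: kv[0],
        )
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def make_info(data: bytes, piece_length: int, private: bool) -> dict:
    """Build a single-file info dictionary (hypothetical helper)."""
    pieces = b"".join(
        hashlib.sha1(data[i:i + piece_length]).digest()
        for i in range(0, len(data), piece_length)
    )
    info = {"name": "example.bin", "length": len(data),
            "piece length": piece_length, "pieces": pieces}
    if private:
        info["private"] = 1  # the private flag lives inside the info dict
    return info


def infohash(info: dict) -> str:
    return hashlib.sha1(bencode(info)).hexdigest()


def piece_set(info: dict) -> set:
    """Return the set of 20-byte SHA-1 piece digests."""
    p = info["pieces"]
    return {p[i:i + 20] for i in range(0, len(p), 20)}


data = bytes(range(256)) * 1000  # dummy content shared by both torrents
public_info = make_info(data, 16 * 1024, private=False)
private_info = make_info(data, 16 * 1024, private=True)

print("infohash match:", infohash(public_info) == infohash(private_info))
print("piece-based match:", piece_set(public_info) == piece_set(private_info))
```

Because the private flag is part of the hashed dictionary, the same content yields different infohashes on public and private sites, while the per-piece hashes stay identical as long as the piece length is the same.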