CS555: Distributed Systems [Fall 2019]
Dept. of Computer Science, Colorado State University
Professor: Shrideep Pallickara
September 24, 2019
Slides created by: Shrideep Pallickara

CS 555: DISTRIBUTED SYSTEMS [BITTORRENT & DISTRIBUTED COMPUTING ECONOMICS]

Frequently asked questions from the previous class survey
- Difference in routing in the network space vs. ID space
- Can Gnutella be viewed as a semi-structured P2P system?

Topics covered in this lecture
- BitTorrent
- Distributed Computing Economics

BITTORRENT

BitTorrent: Traffic statistics
- In November 2004
  - Responsible for 25% of all Internet traffic
- February 2013
  - 3.35% of all worldwide bandwidth
  - > 50% of the 6% total bandwidth dedicated to file sharing

BitTorrent
- Designed for downloading large files
- Not intended for real-time routing of content
- Relies on capabilities of ordinary user machines

BitTorrent: Key concepts
- Instead of downloading a file from a single source server
  - Users join a swarm of hosts to upload to and download from simultaneously
- Several basic commodity computers can replace large, customized servers
  - Without compromising on efficiency
  - In fact, lower bandwidth usage with swarms prevents large internet traffic spikes

Segmented file transfer [1/2]
- File being transferred is divided into fixed-size segments called chunks (or pieces)
  - Chunks are of the same size throughout a single download (10MB file: 10 1MB chunks or 40 256KB chunks)
- Chunks are downloaded non-sequentially and rearranged into the correct order by BitTorrent

Segmented file transfer [2/2]
- Advantages:
  - File transfers can be stopped at any time and resumed
    - Without loss of previously downloaded content
  - Clients seek out readily available chunks, rather than waiting for an unavailable (next in sequence) chunk

BitTorrent: Protocol summary
- Splits files into fixed-sized chunks
- Chunks are then made available at various peers across the P2P network
- Clients can download a number of chunks in parallel from different sites
  - Reduces the burden on a particular peer to service the entire download

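The segmented transfer described above (fixed-size chunks, non-sequential arrival, reassembly in the correct order) can be illustrated with a minimal sketch. This is not BitTorrent's actual implementation; the names `split_into_chunks`, `reassemble`, and `CHUNK_SIZE` are hypothetical and chosen only for illustration, and the 256KB chunk size is taken from the slide's example.

```python
import os
import random

CHUNK_SIZE = 256 * 1024  # assumed chunk size (256 KB), as in the slide's example


def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> list[bytes]:
    """Split data into fixed-size chunks; only the last chunk may be shorter."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def reassemble(received: dict[int, bytes], num_chunks: int) -> bytes:
    """Put chunks back in index order, regardless of the order they arrived in."""
    missing = [i for i in range(num_chunks) if i not in received]
    if missing:
        raise ValueError(f"missing chunks: {missing}")
    return b"".join(received[i] for i in range(num_chunks))


# A 10 MB file of random bytes stands in for the file being shared.
original = os.urandom(10 * 1024 * 1024)
chunks = split_into_chunks(original)

# Simulate non-sequential arrival by shuffling the chunk indices.
arrival_order = list(range(len(chunks)))
random.shuffle(arrival_order)
received = {i: chunks[i] for i in arrival_order}

restored = reassemble(received, len(chunks))
```

Because each chunk is stored under its index, a transfer that stops midway keeps what it already has and only needs the indices still missing when it resumes.
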
The BitTorrent protocol
- When a file is made available in BitTorrent, a .torrent file is created
  - Holds metadata associated with that file
- Metadata
  - The name and length of the file
  - Location of a tracker (URL)
    - Centralized server that manages downloads for that file
  - Checksum
    - Associated with each chunk
    - Generated using the SHA-1 algorithm

Advantages of hashing chunks
- Each chunk has a cryptographic hash in the torrent descriptor
- Modifications of chunks can be reliably detected
  - Prevents accidental and malicious modifications
- If a node starts with an authentic/legitimate torrent descriptor?
  - It can verify the authenticity of the entire file that it receives

The swarm or torrent for a particular file includes
- Tracker
- Seeders
- Leechers

Trackers
- The use of trackers compromises a core P2P principle
  - But simplifies the system
- Trackers are responsible for tracking the download status for a particular file

The roles of participants in BitTorrent: Seeder
- Peer with a complete version of a file (i.e., with all its chunks) is known as a seeder
- Peer that initially creates the file provides the initial seed for file distribution

The roles of participants in BitTorrent: Leechers
- Peers that want to download a file are known as leechers
  - A given leecher, at any given time, contains a number of chunks for that file
- Once a leecher downloads all chunks for a file, it can become a seeder for subsequent downloads
  - Files spread virally based on demand

When a peer wants to download a file
- Contacts the tracker
- Is given a partial view of the torrent
  - The set of peers that can support the download
  - The tracker does not participate in scheduling the downloads
    - Decentralized
- Chunks are requested and transmitted in any order

Incentive mechanism: Quid pro quo
- Gives downloading preference to peers who have previously uploaded to the site
  - Encourages concurrent uploads/downloads to make better use of bandwidth
- A peer supports downloads from n simultaneous peers by unchoking these peers
  - Decisions based on rolling calculations of download rates

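The per-chunk SHA-1 checksum from the "Advantages of hashing chunks" slide can be sketched as follows. This is a minimal illustration under stated assumptions, not the real .torrent format: the descriptor is modeled as a plain list of digests, and `chunk_hashes` and `verify_chunk` are hypothetical helper names. Only `hashlib.sha1` is a real library call.

```python
import hashlib


def chunk_hashes(chunks: list[bytes]) -> list[bytes]:
    """Compute the SHA-1 digest of each chunk (what a descriptor would store)."""
    return [hashlib.sha1(c).digest() for c in chunks]


def verify_chunk(index: int, data: bytes, expected: list[bytes]) -> bool:
    """Accept a received chunk only if its SHA-1 matches the descriptor entry."""
    return hashlib.sha1(data).digest() == expected[index]


# Publisher side: hash every chunk when the descriptor is created.
chunks = [b"chunk-0 payload", b"chunk-1 payload", b"chunk-2 payload"]
descriptor_hashes = chunk_hashes(chunks)

# Downloader side: an intact chunk passes, a modified one is rejected.
ok = verify_chunk(1, b"chunk-1 payload", descriptor_hashes)
tampered = verify_chunk(1, b"chunk-1 payl0ad", descriptor_hashes)
```

A node that trusts its descriptor can check each chunk as it arrives, so a corrupted or forged chunk is discarded and requested again from another peer without restarting the whole download.
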
Scheduling downloads
- Rarest first scheduling policy
- Peer prioritizes the chunk that is rarest among its set of connected peers
- Ensures that chunks that are not widely available spread rapidly

How BitTorrent differs from a classic download
                  BitTorrent                                Classic download
Connections       Many small data requests over different   One TCP connection to one machine
                  IP connections to different machines
Download order    Random or "rarest first" to ensure        Sequential
                  high availability
** Allows BitTorrent to achieve lower cost, higher redundancy, and resistance to abuse

BitTorrent: Advantages
- Advantages
  - Lower costs, greater redundancy, higher resistance to abuse or "flash crowds"
- Shortcomings
  - Non-contiguous download precludes progressive download
  - No streaming playback
    - Beta BitTorrent Streaming protocol was made available for testing in 2013; this was not successful
    - New service BitTorrent Live was released as Public Beta in Spring 2019

BitTorrent: Shortcomings
- Downloads can take time to rise to full speed
  - May take time for enough peer connections to be established
  - Takes time for a node to receive enough data to become an effective uploader
- Regular (non-BitTorrent/traditional) downloads on the other hand:
  - Rise to full speed very quickly and maintain this speed throughout

But how do you find a torrent?
- Browsing the web or by some other means
  - Open it with a BitTorrent client
- Client connects to trackers in the torrent file and finds peers
  - If swarm contains only the initial seeder, client connects directly to it and begins to request pieces

Support for trackerless torrents
- Azureus (now Vuze) supported this first
- Mainline BitTorrent provides a DHT-based implementation
  - Mainline DHT
  - Kademlia-based Distributed Hash Table (DHT) used by BitTorrent clients
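The rarest-first policy from the "Scheduling downloads" slide can be sketched as below. This is a simplified illustration, not how any particular client implements it: the function name `rarest_first`, the peer names, and the tie-breaking rule (lowest chunk index) are assumptions made for the example.

```python
from collections import Counter
from typing import Optional


def rarest_first(needed: set[int], peer_chunks: dict[str, set[int]]) -> Optional[int]:
    """Return the needed chunk held by the fewest connected peers, or None."""
    counts = Counter()
    for held in peer_chunks.values():
        # Count only chunks that this peer has and that we still need.
        counts.update(held & needed)
    if not counts:
        return None  # no connected peer has any chunk we still need
    # Tie-break by lowest chunk index: an arbitrary choice for this sketch.
    return min(counts, key=lambda idx: (counts[idx], idx))


# Hypothetical view of which chunks each connected peer holds.
peers = {
    "peer_a": {0, 1, 2},
    "peer_b": {0, 1},
    "peer_c": {0, 3},
}
choice = rarest_first(needed={0, 1, 2, 3}, peer_chunks=peers)
```

Here chunks 2 and 3 are each held by a single peer, so one of them is requested before the widely replicated chunk 0. The rarity count uses only the peer's connected set, matching the slide's wording, rather than any global view of the swarm.
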