Peer-To-Peer Networks

14-740: Fundamentals of Computer Networks
Credit to Bill Nace, 14-740, Fall 2017
Material from Computer Networking: A Top Down Approach, 6th edition. J.F. Kurose and K.W. Ross

traceroute
• P2P Overview
• Architecture components
• Napster (Centralized)
• Gnutella (Distributed)
• Skype and KaZaA (Hybrid, Hierarchical)
• KaZaA Reverse Engineering Study

14-740: Spring 2018

What is P2P?
• Client / Server interaction
  • Client: any end-host
  • Server: specific end-host
• P2P: Peer-to-peer
  • Any end-host
• Aim to leverage resources available on “clients” (peers)
  • Hard drive space
  • Bandwidth (especially upload)
  • Computational power
  • Anonymity (e.g. zombie botnets)
  • “Edge-ness” (i.e. being distributed at network edges)
• Clients are particularly fickle
  • Users have not agreed to provide any particular level of service
  • Users are not altruistic: the algorithm must force participation without allowing cheating
  • Clients are not trusted
  • Client code may be modified
• And yet, availability of resources must be assured

P2P History
• Proto-P2P systems exist
  • DNS, Netnews/Usenet
  • Xerox Grapevine (~1982): name, mail delivery service
• Kicked into high gear in 1999
  • Many users had “always-on” broadband net connections
• 1st Generation: Napster (music exchange)
• 2nd Generation: Freenet, Gnutella, KaZaA, BitTorrent
  • More scalable, designed for anonymity, fault-tolerant
• 3rd Generation: Middleware (Pastry, Chord)
  • Provide for overlay routing to place/find resources

P2P Architecture
• Content Directory
  • “Database” of content
  • Structured? Unstructured?
  • Which peer has what files?
  • Metadata: other info about files
• Signaling protocol
  • How do peers exchange coordination messages?
  • Proprietary? Encrypted?

Architecture (2)
• File transfer
  • How does a peer retrieve a file from another peer?
  • HTTP or HTTP-like
  • Any peer must be able to send reply messages

Overlay network is not the network
• Overlay networks are formed on top of the network graph
• Peers are connected via abstract links in the overlay
• Transport is accomplished on network edges
• Overlay algorithms abstract away the particulars of the network
[Figure callouts: “perhaps even built on HTTP”; “one edge for transport!”]

Napster
• Original “centralized” design
1. When a peer connects, it informs the central server of its
   • IP address
   • content
2. Marcia queries for “Believer”
   • Server looks through index
   • Reply: “Daichi has Believer”
3. Marcia requests file from Daichi

Problems?
• File transfer is decentralized, but locating content is highly centralized
  • Single point of failure
  • Performance bottleneck
  • Single point of lawsuit
• Result: lawsuits shut down the original service in 2001; the Napster brand was later owned by Best Buy
  • Now it’s a rebranded Rhapsody music streaming service

Gnutella
• Created in response to Napster problems
• Fully decentralized
  • Does not depend on central directory
  • Participants arrange themselves in overlay
  • Queries flood network to find file
• Fully anonymous
• Public domain protocol
  • Various Gnutella clients

Bootstrapping
1. New peer X must find some member of the Gnutella network
   • Use a list of candidate peers
2. X sequentially attempts to make a TCP connection with peers on the list until successful with peer Y
3. X sends a ping message to Y; Y forwards the ping message
4. All peers receiving a ping message respond to X with a pong message
5. X receives many pong messages and can set up additional TCP connections

Query Flooding
• Query messages sent over existing TCP connections
• Peers forward query message
• QueryHit messages sent over reverse path
• File transfer arranged over HTTP
[Figure labels: Query, QueryHit, File transfer (HTTP)]

Limited Scope Query Flooding
• Original design not scalable
  • Exponential increase in signaling traffic
• Solution is to limit scope of query
  • Include peer-count field in query message, e.g. peer-count = 4
  • This field gets decremented by 1 at each hop
  • Message stops propagating when peer-count hits zero
[Figure labels: Query (peer-count = 3), Query (peer-count = 2)]

Question
• If peer-count = 4 at the start, how many peers would the query message eventually reach?
  • It depends on the number of neighbors each peer has!

More Questions
• Is limited scope query flooding scalable? (i.e. how does the number of nodes affect message counts?)
  • Not scalable
  • Number of messages grows with number of nodes
  • Desire: constant time search

Even more questions
• Are we guaranteed to find an object? (Assume the object exists somewhere in the overlay network)
  • No guarantee
  • Query stops after peer-count hits zero
  • Gnutella uses an unstructured graph

KaZaA: Exploiting Heterogeneity
• Each peer is either a Super Node (SN) or an Ordinary Node (ON) assigned to an SN
  • TCP connection between ON and its SN
  • TCP connections between some pairs of SNs
• SN tracks the content in all its children

KaZaA Queries
• Each file has a hash and a descriptor
• Client sends keyword query to its SN
• SN responds with matches:
  • For each match: metadata, hash, IP address
• If SN forwards query to other SNs, they respond with matches
• Client then selects files for downloading
  • HTTP requests using hash as identifier sent to peers holding desired file

Measurement Study
• Developed tools to reverse engineer KaZaA
• Attempt to answer the following questions:
  • What is the ratio of SNs to ONs?
  • What is the fraction of SNs overall?
  • How are SNs connected, sparsely or densely?
  • How does an ON pick the best SN?
  • Random port numbers and NATs?

Structural Properties
• Deployed measurement apparatus on the Polytechnic campus and in a broadband residential network
• SN connects to 40-50 other SNs (dynamic)
• SN has 100-160 ONs at Polytechnic, 55-70 at access network
• Given 3 million peers: 25,000 to 40,000 SNs
• SN is connected to ~0.1% of other SNs

Unanswered Questions...
• Details about the residential access network?
  • Where is it? What is it?
  • What is the uplink/download bandwidth?
• How long was the measurement study?
  • 6 hours on 2 days? Aug 22, 2003 and Oct 24, 2003
  • How are these time periods representative samples?
• Where did the 3 million peers number come from?
  • From KaZaA?
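
The structural figures quoted for KaZaA can be sanity-checked with a little arithmetic. The Python sketch below is only a rough consistency check, not part of the measurement study: the helper names are hypothetical, and it assumes every SN serves the same number of ONs, which the study does not claim.

```python
def sn_neighbor_fraction(sn_neighbors: int, total_sns: int) -> float:
    """Fraction of the *other* super nodes that one SN is connected to."""
    return sn_neighbors / (total_sns - 1)


def estimated_sn_count(total_peers: int, ons_per_sn: int) -> float:
    """SN count if every SN served exactly `ons_per_sn` ordinary nodes.

    Assumption: each group is one SN plus its ONs, so the group size is
    ons_per_sn + 1.
    """
    return total_peers / (ons_per_sn + 1)


if __name__ == "__main__":
    total_peers = 3_000_000  # figure quoted on the slide

    # Connectivity: 40-50 SN neighbors out of 25,000-40,000 SNs.
    for neighbors, total_sns in [(40, 40_000), (50, 25_000)]:
        pct = 100 * sn_neighbor_fraction(neighbors, total_sns)
        print(f"{neighbors} neighbors among {total_sns} SNs: {pct:.2f}% of other SNs")

    # SN population implied by each observed ONs-per-SN figure.
    for ons in (55, 70, 100, 160):
        count = estimated_sn_count(total_peers, ons)
        print(f"{ons} ONs per SN: about {count:,.0f} SNs")
```

Under these assumptions the connectivity works out to roughly 0.1% to 0.2%, in line with the "~0.1%" figure. The campus ON counts imply roughly 19,000 to 30,000 SNs and the residential counts roughly 42,000 to 54,000, so the quoted 25,000 to 40,000 range sits between the two sets of estimates.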

Overlay Dynamics
• Connection lifetimes are short
  • Average for ON-SN is 34 min, SN-SN is 11 min
  • 38% of ON-SN and 32% of SN-SN connections lasted < 30 s
• Why so short?
  • SN searching for other SNs with small workload
  • Long-term connection shuffling, so a larger set of SNs can be explored
  • Exchange of SN lists

Unanswered Questions...
• Big jump from overlay dynamics numbers to conjectures of what SNs are doing…
• How can we interpret these numbers better?
  • “Staircases” in the cumulative distribution?
  • Different distinct groups of connection times
  • Compare these times to conjectures

Parent Selection
• Workload
  • Exact algorithm to calculate workload is unknown
  • Tied to the number of connections an SN is currently supporting
• Locality
  • RTT measurements
    • 60% of SN-SN connections < 50 ms
    • 40% of ON-SN connections < 5 ms
    • Transatlantic traffic ~ 100 ms
    • Transpacific traffic ~ 180 ms
  • Topological closeness (prefix matching)
    • SNs in SN list close to ON
• Issues with this methodology?

Skype
• P2P Voice-over-IP (VoIP)
  • pc-to-pc, pc-to-phone, phone-to-pc
  • also IM, video
• Proprietary application-layer protocol (inferred via reverse engineering)
• Hierarchical overlay
[Figure label: Skype login server]

Making a Call
• User starts Skype
• Client registers with SN
  • list of bootstrap SNs
• Client logs in (authenticates)
• Call: client queries SN with callee ID
• SN contacts other SNs (how? unknown) to find address of callee
• SN returns address to client
• Client directly contacts callee (TCP)
[Figure label: Skype login server]

Lesson Objectives
• Now, you should be able to:
  • list reasons that led to the creation of P2P networks
  • describe what an overlay network is and how it is different from the Internet
  • use historical P2P networks to describe centralized P2P networks, fully distributed P2P networks, and hierarchical P2P networks
  • describe search techniques in the various P2P forms, and analyze search efficiencies
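
To make the search-efficiency objective concrete, here is a minimal Python sketch of the limited-scope query flooding described for Gnutella. It is an illustration under stated assumptions, not the protocol itself: the function name `flood_query`, the dictionary representation of the overlay, and the duplicate-suppression rule (a peer forwards only the first copy of a query it sees) are hypothetical choices that the slides do not specify.

```python
from collections import deque


def flood_query(overlay, start, peer_count):
    """Simulate a limited-scope query flood.

    overlay: dict mapping each peer to a list of its overlay neighbors.
    start: the peer that issues the query.
    peer_count: initial value of the peer-count field (maximum hops).

    Returns (set of peers reached, number of query messages sent).
    Assumption: a peer forwards only the first copy of a query it receives,
    and never sends it back to the neighbor it came from.
    """
    seen = {start}
    messages = 0
    # Each entry: (peer, neighbor it heard the query from, peer-count left).
    queue = deque([(start, None, peer_count)])
    while queue:
        peer, sender, remaining = queue.popleft()
        if remaining == 0:
            continue  # peer-count hit zero: stop propagating
        for neighbor in overlay[peer]:
            if neighbor == sender:
                continue
            messages += 1  # one Query message per overlay link used
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, peer, remaining - 1))
    seen.discard(start)
    return seen, messages


if __name__ == "__main__":
    # Hypothetical five-peer chain: A - B - C - D - E
    chain = {
        "A": ["B"],
        "B": ["A", "C"],
        "C": ["B", "D"],
        "D": ["C", "E"],
        "E": ["D"],
    }
    for count in (2, 4):
        reached, sent = flood_query(chain, "A", count)
        print(f"peer-count={count}: reached {sorted(reached)} using {sent} messages")
```

On this chain, a query from A with peer-count = 2 reaches only B and C, so a file held by E is not found even though it exists in the overlay; with peer-count = 4 every peer is reached. Swapping in a denser overlay shows the other half of the trade-off: the number of peers reached and the number of messages both depend on how many neighbors each peer has.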
