The Application Layer: Overview

▪ Principles of network applications
▪ Web and HTTP
▪ E-mail, SMTP, IMAP
▪ The Domain Name System DNS
▪ P2P applications
▪ video streaming and content distribution networks
▪ socket programming with UDP and TCP

Peer-to-peer (P2P) Architecture

▪ no always-on server
▪ arbitrary end systems directly communicate
▪ peers request service from other peers, provide service in return to other peers
• self scalability: new peers bring new service capacity, and new service demands
▪ peers are intermittently connected and change IP addresses
• complex management
▪ examples: P2P file sharing (BitTorrent), streaming (KanKan), VoIP (Skype)

[Figure: end systems in mobile, home, and enterprise networks, connected through local or regional ISPs, a national or global ISP, a content provider network, and a datacenter network.]

File distribution: client-server vs P2P

Q: how much time to distribute file (size F) from one server to N peers?
• peer upload/download capacity is limited resource

$u_s$: server upload capacity
$u_i$: peer i upload capacity
$d_i$: peer i download capacity

[Figure: a server holding a file of size F, connected through a network (with abundant bandwidth) to N peers; the server uploads at $u_s$, peer i uploads at $u_i$ and downloads at $d_i$.]

File distribution time: client-server

▪ server transmission: must sequentially send (upload) N file copies:
• time to send one copy: $F/u_s$
• time to send N copies: $NF/u_s$
▪ client: each client must download file copy
• $d_{min}$ = min client download rate
• the slowest client needs at least $F/d_{min}$ to download the file

time to distribute F to N clients using client-server approach:

$D_{c-s} \ge \max\{NF/u_s,\ F/d_{min}\}$

The first term, $NF/u_s$, increases linearly in N.

File distribution time: P2P

▪ server transmission: must upload at least one copy:
• time to send one copy: $F/u_s$
▪ client: each client must download file copy
• min client download time: $F/d_{min}$
▪ clients: as aggregate must download NF bits
• max upload rate (limiting max download rate) is $u_s + \sum_i u_i$

time to distribute F to N clients using P2P approach:

$D_{P2P} \ge \max\{F/u_s,\ F/d_{min},\ NF/(u_s + \sum_i u_i)\}$

The numerator NF increases linearly in N, but so does the denominator $u_s + \sum_i u_i$, as each peer brings service capacity.

Client-server vs. P2P: example

client upload rate = u, F/u = 1 hour, $u_s = 10u$, $d_{min} \ge u_s$

[Figure: Minimum Distribution Time versus N (0 to 35), with one curve for Client-Server and one for P2P.]

https://blog.jse.li/posts/torrent/
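Returning to the client-server vs. P2P example, the two lower bounds can be evaluated directly. The sketch below is only an illustration: the function names are made up here, every peer is assumed to upload at the same rate u, and $d_{min}$ is set equal to $u_s$ because the slide only states $d_{min} \ge u_s$.

```python
def client_server_time(F, N, u_s, d_min):
    """Lower bound for client-server: max{N*F/u_s, F/d_min}."""
    return max(N * F / u_s, F / d_min)


def p2p_time(F, N, u_s, d_min, peer_uploads):
    """Lower bound for P2P: max{F/u_s, F/d_min, N*F/(u_s + sum of u_i)}."""
    return max(F / u_s, F / d_min, N * F / (u_s + sum(peer_uploads)))


# Example parameters, in units where u = 1 and F/u = 1 hour.
u = 1.0
F = 1.0
u_s = 10 * u
d_min = u_s  # assumption: the example only says d_min >= u_s

for N in (5, 10, 20, 30):
    cs = client_server_time(F, N, u_s, d_min)
    p2p = p2p_time(F, N, u_s, d_min, [u] * N)
    print(f"N={N:2d}  client-server={cs:.2f} h  P2P={p2p:.2f} h")
```

Under these assumptions the client-server bound reaches 3 hours at N = 30, while the P2P bound N/(10 + N) stays below 1 hour for any N.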

P2P file distribution: BitTorrent

▪ file divided into 256 KB chunks
▪ peers in torrent send/receive file chunks

tracker: tracks peers participating in torrent
torrent: group of peers exchanging chunks of a file

Alice arrives …
… obtains list of peers from tracker
… and begins exchanging file chunks with peers in torrent

Tracker server for Debian

http://bttracker.debian.org:6969/

The Debian Linux tracker server

Central point of failure

Torrent Trackers can get seized (Single point of failure)

▪ TorrentSpy
▪ Kickass Torrent

▪ P2P discovery
▪ DHT
▪ PEX
▪ Magnet Links

Distributed Hash Table (DHT)

▪ Hash table
▪ DHT paradigm
▪ Circular DHT and overlay networks
▪ Peer churn

Simple Database

Simple database with (key, value) pairs:
• key: human name; value: social security #

Key                  Value
John Washington      132-54-3570
Diana Louise Jones   761-55-3791
Xiaoming Liu         385-41-0902
Rakesh Gopal         441-89-1956
Linda Cohen          217-66-5609
…….                  ………
Lisa Kobayashi       177-23-0199

• key: movie title; value: IP address

Hash Table

• More convenient to store and search on numerical representation of key
• key = hash(original key)

Original Key         Key       Value
John Washington      8962458   132-54-3570
Diana Louise Jones   7800356   761-55-3791
Xiaoming Liu         1567109   385-41-0902
Rakesh Gopal         2360012   441-89-1956
Linda Cohen          5430938   217-66-5609
…….                            ………
Lisa Kobayashi       9290124   177-23-0199

Distributed Hash Table (DHT)

▪ Distribute (key, value) pairs over millions of peers
• pairs are evenly distributed over peers
▪ Any peer can query database with a key
• database returns value for the key
• To resolve query, small number of messages exchanged among peers
▪ Each peer only knows about a small number of other peers
▪ Robust to peers coming and going (churn)

Assign key-value pairs to peers

▪ rule: assign key-value pair to the peer that has the closest ID.
▪ convention: closest is the immediate successor of the key.
▪ e.g., ID space {0,1,2,3,…,63}
▪ suppose 8 peers: 1,12,13,25,32,40,48,60
• If key = 51, then assigned to peer 60
• If key = 60, then assigned to peer 60
• If key = 61, then assigned to peer 1

Circular DHT

• each peer only aware of immediate successor and predecessor.
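On this ring, the assignment rule (closest = immediate successor of the key) can be sketched in a few lines. The function names are hypothetical, and using SHA-1 reduced modulo 64 as the hash is an assumption made only for illustration; the slides do not say which hash function maps original keys into the ID space.

```python
import hashlib
from bisect import bisect_left

ID_SPACE = 64                           # IDs 0..63, as in the example
PEERS = [1, 12, 13, 25, 32, 40, 48, 60]


def hash_key(original_key, id_space=ID_SPACE):
    """Map an original key (a string) to a numeric key in the ID space.
    Assumption: SHA-1 modulo the ID space size; any uniform hash would do."""
    digest = hashlib.sha1(original_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % id_space


def responsible_peer(key, peer_ids=PEERS):
    """Return the immediate successor of key on the ring (wrapping around)."""
    ring = sorted(peer_ids)
    i = bisect_left(ring, key)          # first peer with ID >= key
    return ring[i % len(ring)]          # past the largest ID: wrap to the smallest


for key in (51, 60, 61):
    print(key, "->", responsible_peer(key))
# 51 -> 60
# 60 -> 60
# 61 -> 1
```

A movie title would first go through `hash_key` and the resulting number would then be passed to `responsible_peer`.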

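[Figure: the eight peers 1, 12, 13, 25, 32, 40, 48, 60 arranged in a ring; the links between neighbouring peers form the “overlay network”.]

Resolving a query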

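What is the value associated with key 53?

[Figure: the query for key 53 is passed around the ring from peer to peer, and the value is returned to the querying peer.]

O(N) messages on average to resolve query, when there are N peers

Circular DHT with shortcuts

[Figure: the same ring with shortcut links between non-adjacent peers; the query "What is the value for key 53?" and the returned value.]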
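The message counts for the plain ring and for the ring with shortcuts can be reproduced with a small routing sketch. Two things here are assumptions and not taken from the figures: that the query starts at peer 12, and the particular shortcut links in `SHORTCUTS`, which were picked only so that the counts come out as 6 and 3.

```python
RING = [1, 12, 13, 25, 32, 40, 48, 60]
ID_SPACE = 64


def clockwise(a, b):
    """Clockwise distance from ID a to ID b in the ID space."""
    return (b - a) % ID_SPACE


def owner(key):
    """Peer responsible for key: its immediate successor on the ring."""
    return min(RING, key=lambda p: clockwise(key, p))


def route(start, key, shortcuts=None):
    """Return the list of peers a query visits, from start to the owner."""
    shortcuts = shortcuts or {}
    target = owner(key)
    path = [start]
    while path[-1] != target:
        here = path[-1]
        successor = RING[(RING.index(here) + 1) % len(RING)]
        candidates = [successor] + shortcuts.get(here, [])
        # Never jump past the owner; among the rest, take the longest hop.
        limit = clockwise(here, target)
        reachable = [p for p in candidates if clockwise(here, p) <= limit]
        path.append(max(reachable, key=lambda p: clockwise(here, p)))
    return path


# Hypothetical shortcut links (not read off the slide's figure).
SHORTCUTS = {12: [32], 32: [48]}

plain = route(12, 53)
fast = route(12, 53, SHORTCUTS)
print(plain, len(plain) - 1, "messages")
print(fast, len(fast) - 1, "messages")
```

Each hop in the returned path corresponds to one forwarded message, so the path length minus one is the message count.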

• each peer keeps track of IP addresses of predecessor, successor, short cuts.
• reduced from 6 to 3 messages.
• possible to design shortcuts with O(log N) neighbors, O(log N) messages in query

Peer churn

handling peer churn:
❖ peers may come and go (churn)
❖ each peer knows address of its two successors
❖ each peer periodically pings its two successors to check aliveness
❖ if immediate successor leaves, choose next successor as new immediate successor

[Figure: ring of peers 1, 3, 4, 5, 8, 10, 12, 15.]

example: peer 5 abruptly leaves
▪ peer 4 detects peer 5’s departure; makes 8 its immediate successor
▪ 4 asks 8 who its immediate successor is; makes 8’s immediate successor its second successor.

Torrent Trackers can get seized (Single point of failure)

▪ TorrentSpy
▪ Popcorn Time
▪ Kickass Torrent

▪ P2P discovery
▪ DHT
▪ PEX
▪ Magnet Links

d8:announce41:http://bttracker.debian.org:6969/announce7:comment35:"Debian CD from cdimage.debian.org"13:creation datei1573903810e9:httpseedsl145:https://cdimage.debian.org/cdimage/release/10.2.0//srv/cdbuilder.debian.org/dst/deb-cd/weekly-builds/amd64/iso-cd/debian-10.2.0-amd64-netinst.iso145:https://cdimage.debian.org/cdimage/archive/10.2.0//srv/cdbuilder.debian.org/dst/deb-cd/weekly-builds/amd64/iso-cd/debian-10.2.0-amd64-netinst.isoe4:infod6:lengthi351272960e4:name31:debian-10.2.0-amd64-netinst.iso12:piece lengthi262144e6:pieces26800:PS^ (binary blob of the hashes of each piece)ee
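The torrent file above is bencoded: integers are written as `i…e`, byte strings as `<length>:<bytes>`, lists as `l…e`, and dictionaries as `d…e`. A minimal decoder sketch follows. The function name `bdecode` is made up here, and the sample input is a shortened version of the Debian torrent (the comment, creation date, httpseeds, and pieces entries are left out), so it is an illustration and not a full parser with error handling.

```python
def bdecode(data, i=0):
    """Decode one bencoded value starting at data[i].
    Returns (value, index just past the value)."""
    c = data[i:i + 1]
    if c == b"i":                              # integer: i<digits>e
        end = data.index(b"e", i)
        return int(data[i + 1:end]), end + 1
    if c == b"l":                              # list: l<items>e
        i += 1
        items = []
        while data[i:i + 1] != b"e":
            item, i = bdecode(data, i)
            items.append(item)
        return items, i + 1
    if c == b"d":                              # dictionary: d<key><value>...e
        i += 1
        result = {}
        while data[i:i + 1] != b"e":
            key, i = bdecode(data, i)
            value, i = bdecode(data, i)
            result[key] = value
        return result, i + 1
    colon = data.index(b":", i)                # byte string: <length>:<bytes>
    length = int(data[i:colon])
    start = colon + 1
    return data[start:start + length], start + length


sample = (b"d8:announce41:http://bttracker.debian.org:6969/announce"
          b"4:infod6:lengthi351272960e"
          b"4:name31:debian-10.2.0-amd64-netinst.iso"
          b"12:piece lengthi262144eee")

torrent, end = bdecode(sample)
info = torrent[b"info"]
# Ceiling division: number of pieces needed to cover the whole file.
num_pieces = -(-info[b"length"] // info[b"piece length"])
print(torrent[b"announce"].decode(), num_pieces, num_pieces * 20)
```

The piece count works out to 1340, and 1340 SHA-1 hashes of 20 bytes each give 26800 bytes, which matches the `6:pieces26800:` length prefix in the torrent file.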

The SHA-1 hashes contained in the original .torrent file ensure that pieces have not been modified.
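A minimal sketch of that check, assuming the `pieces` value has already been extracted from the torrent file as a byte string of concatenated 20-byte SHA-1 digests. The helper names and the tiny 4-byte "pieces" are made up for illustration.

```python
import hashlib

SHA1_LEN = 20  # a SHA-1 digest is 20 bytes


def split_piece_hashes(pieces):
    """Split the concatenated `pieces` blob into one 20-byte digest per piece."""
    if len(pieces) % SHA1_LEN != 0:
        raise ValueError("pieces blob is not a multiple of 20 bytes")
    return [pieces[i:i + SHA1_LEN] for i in range(0, len(pieces), SHA1_LEN)]


def piece_is_valid(index, data, piece_hashes):
    """True if the downloaded piece data hashes to the expected digest."""
    return hashlib.sha1(data).digest() == piece_hashes[index]


# Toy example with two tiny "pieces" instead of real 256 KB ones.
piece0, piece1 = b"abcd", b"efgh"
blob = hashlib.sha1(piece0).digest() + hashlib.sha1(piece1).digest()
hashes = split_piece_hashes(blob)

print(piece_is_valid(0, piece0, hashes))       # True
print(piece_is_valid(1, b"efgX", hashes))      # False: the piece was modified
```

A client would discard a piece that fails this check and request it again from another peer.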

P2P file distribution: BitTorrent

▪ peer joining torrent:
• has no chunks, but will accumulate them over time from other peers
• registers with tracker to get list of peers, connects to subset of peers (“neighbors”)
▪ while downloading, peer uploads chunks to other peers
▪ peer may change peers with whom it exchanges chunks
▪ churn: peers may come and go
▪ once peer has entire file, it may (selfishly) leave or (altruistically) remain in torrent

https://markuseliasson.se/article/bittorrent-in-python/

import random

def _calculate_peer_id():
    """
    Calculate and return a unique Peer ID.

    The `peer id` is a 20 byte long identifier. This implementation uses
    the Azureus style, with the client prefix `-PC0001-`.

    Read more:
        https://wiki.theory.org/BitTorrentSpecification#peer_id
    """
    # 8-character client prefix + 12 random decimal digits = 20 bytes
    return '-PC0001-' + ''.join(
        [str(random.randint(0, 9)) for _ in range(12)])

Communicating with Tracker

▪ Make a GET request to the tracker's announce URL with these query parameters:
• info_hash
• peer_id
• port
• uploaded
• downloaded
• compact
• left
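The announce request can be sketched by building the URL with the parameters listed above. The function name is hypothetical, the port 6881 is just an example value, and the info hash and peer ID are the ones that appear in the handshake example in this deck. The sketch only constructs the URL and does not send it.

```python
from urllib.parse import urlencode


def build_announce_url(announce, info_hash, peer_id, port, left,
                       uploaded=0, downloaded=0):
    """Build the tracker GET URL. info_hash is the raw 20-byte SHA-1 digest;
    urlencode percent-escapes the binary bytes."""
    params = {
        "info_hash": info_hash,
        "peer_id": peer_id,
        "port": port,
        "uploaded": uploaded,
        "downloaded": downloaded,
        "compact": 1,          # ask for the compact (binary) peer list
        "left": left,          # bytes still to download
    }
    return announce + "?" + urlencode(params)


info_hash = bytes.fromhex("86d4c80024a469be4c50bc5a102cf71780310074")
url = build_announce_url(
    "http://bttracker.debian.org:6969/announce",
    info_hash,
    peer_id="-TR2940-k8hj0wgej6ch",
    port=6881,                 # example port, an assumption
    left=351272960,            # nothing downloaded yet: the whole file is left
)
print(url)
```

The response to such a request is itself bencoded and would be decoded before the peer list is read.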

Tracker generates the following response

▪ Peers are returned as a binary blob
▪ Refresh the peer list every 900 seconds
▪ The blob is encoded in big endian (network byte order)
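A sketch of decoding the peers blob, assuming the compact format in which each peer takes 6 bytes: a 4-byte IPv4 address followed by a 2-byte big-endian port. The function name and the sample addresses (from the documentation range 192.0.2.0/24) are made up for illustration.

```python
import socket
import struct

PEER_SIZE = 6  # 4 bytes IPv4 address + 2 bytes port


def parse_compact_peers(blob):
    """Turn the binary peers blob into a list of (ip, port) tuples."""
    if len(blob) % PEER_SIZE != 0:
        raise ValueError("peers blob is not a multiple of 6 bytes")
    peers = []
    for offset in range(0, len(blob), PEER_SIZE):
        ip = socket.inet_ntoa(blob[offset:offset + 4])
        # ">H" = big-endian unsigned 16-bit integer
        (port,) = struct.unpack(">H", blob[offset + 4:offset + 6])
        peers.append((ip, port))
    return peers


blob = bytes([192, 0, 2, 123, 0x1A, 0xE1,
              192, 0, 2, 7, 0x1B, 0x39])
print(parse_compact_peers(blob))
# [('192.0.2.123', 6881), ('192.0.2.7', 6969)]
```

Reading the port as little endian instead would give a different, wrong port number, which is why the byte order matters here.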

Downloading from peers

▪ Start TCP connection
▪ Port also provided by tracker
▪ Complete two-way BitTorrent handshake
▪ Exchange messages to download pieces

Handshake Details

\x13BitTorrent protocol\x00\x00\x00\x00\x00\x00\x00\x00\x86\xd4\xc8\x00\x24\xa4\x69\xbe\x4c\x50\xbc\x5a\x10\x2c\xf7\x17\x80\x31\x00\x74-TR2940-k8hj0wgej6ch

1. The length of the protocol identifier, which is always 19 (0x13 in hex)
2. The protocol identifier, called the pstr, which is always BitTorrent protocol
3. Eight reserved bytes, all set to 0. We’d flip some of them to 1 to indicate that we support certain extensions. But we don’t, so we’ll keep them at 0.
4. The infohash that we calculated earlier to identify which file we want
5. The Peer ID that we made up to identify ourselves

We receive a handshake in the same format back; if the info hash in it matches ours, we know that the peer is looking for the same file.
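The five fields above can be packed and unpacked as follows. This is a sketch with hypothetical function names that works on byte strings only (no socket I/O), using the info hash and peer ID from the example handshake.

```python
PSTR = b"BitTorrent protocol"


def build_handshake(info_hash, peer_id):
    """Serialize a handshake: pstrlen, pstr, 8 reserved bytes, info hash, peer ID."""
    if len(info_hash) != 20 or len(peer_id) != 20:
        raise ValueError("info_hash and peer_id must be 20 bytes each")
    return bytes([len(PSTR)]) + PSTR + bytes(8) + info_hash + peer_id


def parse_handshake(message):
    """Split a received handshake into (pstr, info_hash, peer_id)."""
    pstrlen = message[0]
    if len(message) != 1 + pstrlen + 8 + 20 + 20:
        raise ValueError("unexpected handshake length")
    pstr = message[1:1 + pstrlen]
    rest = message[1 + pstrlen + 8:]           # skip the reserved bytes
    return pstr, rest[:20], rest[20:]


info_hash = bytes.fromhex("86d4c80024a469be4c50bc5a102cf71780310074")
ours = build_handshake(info_hash, b"-TR2940-k8hj0wgej6ch")
print(len(ours))                               # 68

# Pretend the peer answered with the same info hash and its own peer ID.
reply = build_handshake(info_hash, b"-PC0001-123456789012")
pstr, their_hash, their_id = parse_handshake(reply)
print(their_hash == info_hash)                 # True: same file
```

If the info hash in the reply differed, a client would typically close the connection.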

Can’t send messages until peer says they are ready

▪ The torrent client enters a choked state.
▪ Once we have been unchoked we can begin sending messages

BitTorrent: requesting, sending file chunks

Requesting chunks:
▪ at any given time, different peers have different subsets of file chunks
▪ periodically, Alice asks each peer for list of chunks that they have
▪ Alice requests missing chunks from peers, rarest first

Sending chunks: tit-for-tat
▪ Alice sends chunks to those four peers currently sending her chunks at highest rate
• other peers are choked by Alice (do not receive chunks from her)
• re-evaluate top 4 every 10 secs
▪ every 30 secs: randomly select another peer, starts sending chunks
• “optimistically unchoke” this peer
• newly chosen peer may join top 4
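Both policies on this slide are simple selection rules, sketched below with made-up peer names and rates. The function names are hypothetical, ties are broken by chunk index or name only to keep the output deterministic, and the 10-second and 30-second timers are left out.

```python
import random
from collections import Counter


def rarest_first(my_chunks, peer_chunks):
    """Order the chunks Alice is missing by how few peers hold them.
    peer_chunks maps peer name -> set of chunk indices that peer has."""
    counts = Counter(c for chunks in peer_chunks.values() for c in chunks)
    missing = [c for c in counts if c not in my_chunks]
    return sorted(missing, key=lambda c: (counts[c], c))


def top_uploaders(rate_to_alice, n=4):
    """The n peers currently sending Alice chunks at the highest rate."""
    ranked = sorted(rate_to_alice, key=lambda p: (-rate_to_alice[p], p))
    return set(ranked[:n])


def optimistic_unchoke(all_peers, unchoked, rng=random):
    """Pick one currently choked peer at random, or None if there is none."""
    choked = sorted(set(all_peers) - set(unchoked))
    return rng.choice(choked) if choked else None


peer_chunks = {"bob": {0, 1, 2}, "carol": {1, 2}, "dave": {2, 3}}
print(rarest_first({2}, peer_chunks))          # [0, 3, 1]

rates = {"a": 5, "b": 9, "c": 1, "d": 7, "e": 3, "f": 8}
top4 = top_uploaders(rates)
print(sorted(top4))                            # ['a', 'b', 'd', 'f']
print(optimistic_unchoke(rates, top4))         # 'c' or 'e'
```

Chunks 0 and 3 are each held by a single peer, so they are requested before chunk 1, which two peers have.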

BitTorrent: tit-for-tat

(1) Alice “optimistically unchokes” Bob
(2) Alice becomes one of Bob’s top-four providers; Bob reciprocates
(3) Bob becomes one of Alice’s top-four providers

https://blog.jse.li/posts/torrent/

higher upload rate: find better trading partners, get file faster!

Messages

▪ Message IDs

MsgChoke          messageID = 0
MsgUnchoke        messageID = 1
MsgInterested     messageID = 2
MsgNotInterested  messageID = 3
MsgHave           messageID = 4
MsgBitfield       messageID = 5
MsgRequest        messageID = 6
MsgPiece          messageID = 7
MsgCancel         messageID = 8
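On the wire, each of these messages is length-prefixed: a 4-byte big-endian length, then the 1-byte message ID, then an optional payload. The sketch below frames and parses messages in that layout. The function names are hypothetical, and the 16384-byte block size in the request example is a commonly used value assumed here, not something stated on the slide.

```python
import struct
from enum import IntEnum


class MessageID(IntEnum):
    CHOKE = 0
    UNCHOKE = 1
    INTERESTED = 2
    NOT_INTERESTED = 3
    HAVE = 4
    BITFIELD = 5
    REQUEST = 6
    PIECE = 7
    CANCEL = 8


def serialize(msg_id, payload=b""):
    """Frame a message: 4-byte length (ID + payload), 1-byte ID, payload."""
    return struct.pack(">IB", 1 + len(payload), msg_id) + payload


def parse(frame):
    """Parse one complete frame. A length of 0 is a keep-alive (returns None)."""
    (length,) = struct.unpack(">I", frame[:4])
    if length == 0:
        return None
    return MessageID(frame[4]), frame[5:4 + length]


def request_payload(index, begin, length):
    """Payload of a request: piece index, byte offset in the piece, block length."""
    return struct.pack(">III", index, begin, length)


print(serialize(MessageID.UNCHOKE))            # b'\x00\x00\x00\x01\x01'

frame = serialize(MessageID.REQUEST, request_payload(0, 0, 16384))
msg_id, payload = parse(frame)
print(msg_id.name, struct.unpack(">III", payload))
```

Messages without a payload, such as choke, unchoke, and interested, are always five bytes long in this framing.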


Video Streaming and CDNs: context

▪ stream video traffic: major consumer of Internet bandwidth
• Netflix, YouTube, Amazon Prime: 80% of residential ISP traffic (2020)
▪ challenge: scale - how to reach ~1B users?
• single mega-video server won’t work (why?)
▪ challenge: heterogeneity
• different users have different capabilities (e.g., wired versus mobile; bandwidth rich versus bandwidth poor)
▪ solution: distributed, application-level infrastructure

Multimedia: Video

▪ video: sequence of images displayed at constant rate
• e.g., 24 images/sec
▪ digital image: array of pixels
• each pixel represented by bits
▪ coding: use redundancy within and between images to decrease # bits used to encode image
• spatial (within image)
• temporal (from one image to next)

spatial coding example: instead of sending N values of same color (all purple), send only two values: color value (purple) and number of repeated values (N)

temporal coding example: instead of sending complete frame at i+1, send only differences from frame i

[Figure: frame i and frame i+1.]
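The two coding examples can be illustrated on toy data, with a row of pixels as a list of color names and a frame as a flat list of pixel values. This is a sketch of the idea only, with hypothetical function names; real video codecs are far more elaborate.

```python
def run_length_encode(pixels):
    """Spatial coding: replace runs of equal values by (value, count) pairs."""
    runs = []
    for p in pixels:
        if runs and runs[-1][0] == p:
            runs[-1][1] += 1               # extend the current run
        else:
            runs.append([p, 1])            # start a new run
    return [tuple(r) for r in runs]


def frame_diff(frame_i, frame_next):
    """Temporal coding: keep only (position, new value) where the frames differ."""
    return [(pos, new) for pos, (old, new)
            in enumerate(zip(frame_i, frame_next)) if old != new]


def apply_diff(frame_i, diff):
    """Rebuild frame i+1 from frame i and the differences."""
    frame = list(frame_i)
    for pos, new in diff:
        frame[pos] = new
    return frame


row = ["purple"] * 5 + ["white"] * 2
print(run_length_encode(row))              # [('purple', 5), ('white', 2)]

frame_i = [1, 1, 1, 1]
frame_next = [1, 2, 1, 3]
diff = frame_diff(frame_i, frame_next)
print(diff)                                # [(1, 2), (3, 3)]
print(apply_diff(frame_i, diff) == frame_next)   # True
```

In both cases the saving depends on redundancy: a row with no repeated colors, or a frame in which every pixel changes, would not get any smaller.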
