2

Light at the end of the tunnel

Course Overview Key Concepts in Networking

Mike Freedman COS 461: Computer Networks http://www.cs.princeton.edu/courses/archive/spr20/cos461/

Some Key Concepts Hierarchy

• Course was organized around protocols • Scalability of large systems – But a small set of concepts recur in many protocols – Cannot store all information everywhere – Cannot centrally coordinate everything • General CS concepts • Hierarchy to manage scale – Hierarchy, indirection, caching, randomization – Divide system into smaller pieces • Networking-specific concepts • Hierarchy to divide control – Soft state, layering, (de)multiplexing – Decentralized management – End-to-end argument • Examples in the Internet – IP addresses, routing protocols, DNS, peer-to-peer

1 Hierarchy: IP Address Blocks Hierarchy: IP Address Blocks

• Number related hosts from a common subnet • Separation of control – 1.2.3.0/24 on the left LAN – Prefix: assigned to an institution – 5.6.7.0/24 on the right LAN – Addresses: assigned by institution to its nodes

1.2.3.4 1.2.3.7 1.2.3.156 5.6.7.8 5.6.7.9 5.6.7.212 • Who assigns prefixes? ...... host host host host host host – Internet Corporation for Assigned Names & Numbers

LAN 1 LAN 2 – Regional Internet Registries (RIRs) router router router WAN WAN – Internet Service Providers (ISPs)

1.2.3.0/24 – Stub networks 5.6.7.0/24 – Regions within an enterprise forwarding table

Hierarchy: Routing Protocols Hierarchy: Routing Protocols • AS-level topology • Interdomain routing ignores details in an AS – Nodes are Autonomous Systems (ASes) – Routers flood information to learn the topology – Edges are links and business relationships – Routers determine “next hop” to other routers… – Hides the detail within each AS’s network – By computing shortest paths based on link weights

4 2 1 3 3 1 5 3 2 1 5 2 7 6 4 1 3 Client Web server

2 9 Hierarchy: Domain Name System Hierarchy: Domain Name System • 13 root servers (see http://www.root-servers.org/) unnamed root • Labeled A through M com edu org ac uk zw arpa A Verisign, Dulles, VA C Cogent, Herndon, VA (also Los Angeles) D U Maryland College Park, MD generic domains country domains G US DoD Vienna, VA K RIPE London (also Amsterdam, Frankfurt) in- H ARL Aberdeen, MD bar ac addr J Verisign, ( 11 locations) I Autonomica, Stockholm (plus 3 other locations) E NASA Mt View, CA F Internet Software C. Palo Alto, CA (and 17 other m WIDE Tokyo west east cam 12 locations)

foo my usr 34 B USC-ISI Marina del Rey, CA L ICANN Los Angeles, CA my.east.bar.edu usr.cam.ac.uk 56

12.34.56.0/24

11 Hierarchy: Super Peers in P2P (KaZaA) Indirection • • Each peer is either group leader Referencing by name or assigned to group leader – Rather than the value itself – TCP connection between peer – E.g., manipulating a variable through a pointer and its group leader • Benefits of indirection – TCP connections between some pairs of group leaders – Human convenience – Reducing overhead when things change • Group leader tracks the content in all its children • Examples of indirection in the Internet ordinary peer

group-leader peer – Names vs. addresses

neighoring relationships in overlay network – Mobile IP

3 Indirection: Names vs. Addresses Indirection: Load Balancers & Switches

• Host name to IP address • Not fixed binding of IPs or MAC address to – Mnemonic names to location-dependent addresses physical machine – E.g., from www.cnn.com to 64.236.16.20 – NAT allows multiple machines to share single – Using the Domain Name System (DNS) public IP address • From IP address to MAC address – Load balancers: Machines share IP address, LB maps to physical machine by network flow – From hierarchical global address to interface card – VM can migrate across L2 network through – E.g., from 64.236.16.20 to 00-15-C5-49-04-A9 gratuitous ARP – Using the Address Resolution Protocol (ARP)

Caching Caching: DNS Caching

• Duplicating stored elsewhere Root server

– To reduce latency for accessing the data 3 – To reduce resources consumed 4 Application DNS cache • Caching is often quite effective 5 DNS query Top-level 1 10 Local DNS 6 – Speed difference between cache and primary copy 2 domain server server DNS resolver 7 – Locality of reference, and small set of popular data DNS response 9 8 • Examples from the Internet Second-level domain server – DNS caching, Web caching

4 18 Caching: Web Caching Randomization • Distributed adaptive algorithms origin – • Caching location server Multiple distributed parties – Proxy cache Proxy – Adapting independently HTTP requestserver HTTP response – Browser cache client HTTP request • Risk of synchronization • Better performance HTTP response – Many parties reacting at the same time – Lower RTT HTTP request – Leading to bad aggregate behavior HTTP response – Existing connection client • Randomization can desynchronize – Less network load origin server – Ethernet back-off, Random Early Detection • Rather than imposing centralized control

Randomization: Ethernet Back-off Randomization: Dropping Packets Early • Random access: exponential back-off • Congestion on a link – After collision, wait random time before retrying – Eventually the queue becomes full – After mth, choose K randomly from {0, …, 2m-1} – And new packets must be dropped – Wait for K*512 bit times before trying again • Drop-tail queuing leads to bursty loss – Many packets encounter a full queue – Many TCP senders reduce their sending rates

5 Randomization: Dropping Packets Early Soft State • Better to give early feedback • State: stored in nodes by network protocols – Get a few connections to slow down – Installed by receiver of a set-up message – – … before it is too late Updated when conditions change • Random Early Detection (RED) • Hard state: valid unless told otherwise – Removed by receiver of tear-down message – Randomly drop packets when queue (near) full – Requires error handling to deal with sender failure – Drop rate increases as function of queue length • Soft state: invalid if not told to refresh – Periodically refreshed, removed by timeout • Soft state reduces complexity Probability Average Queue Length – DNS caching, DHCP leases

Soft State: DNS Caching Soft State: DHCP Leases

• Cache consistency is a hard problem • DHCP “offer message” from the server – Ensuring the cached copy is not out of date – Configuration parameters (proposed IP address, mask, • Strawman: explicit revocation or updates gateway router, DNS server, ...) – – Keep track of everyone who has cached information Lease time (the time information remains valid) – If name-to-host mapping changes, update caches • Why is a lease time necessary? • Soft state solution – Client can release address (DHCP RELEASE) • – DNS responses include a “time to live” (TTL) field E.g., “ipconfig /release” or clean shutdown of computer – – Cached entry is deleted after TTL expires But, the host might not release the address • E.g., the host crashes or buggy client software – You don’t want address to be allocated forever

6 Layering: A Modular Approach Layering: Standing on Shoulders • Sub-divide the problem host host HTTP message – Each layer relies on services from layer below HTTP HTTP – Each layer exports services to layer above

• Interface between layers defines interaction TCP segment TCP TCP – Hides implementation details – Layers can change without disturbing other layers router router

IP packet IP packet IP packet IP Application IP IP IP

Application-to-application channels Ethernet Ethernet SONET SONET Ethernet Ethernet Host-to-host connectivity interface interface interface interface interface interface

Link hardware Ethernet frame SONET frame Ethernet frame

Layering: Internet Protocol Suite Layering: Encapsulation of Data • Different devices switch different things – Physical layer: electrical signals (repeaters and hubs) FTP HTTP NV TFTP Applications – Link layer: frames (bridges and switches) UDP TCP TCP UDP – Network layer: packets (routers) Waist IP Application gateway Data Link NET1 NET2 … NETn Physical Transport gateway Frame Packet TCP User header header header data The Hourglass Model Router Bridge, switch The waist facilitates interoperability Repeater, hub

7 31 Demultiplexing (De)multiplexing: With a NAT

• Separating multiple streams out of one – Recognizing the separate streams – Treating the separate streams accordingly 138.76.29.7

• Examples in the Internet 10.0.0.1 type port # outside NAT Frame Packet TCP User header header header data

10.0.0.2 protocol inside

Power at the End Host Why No Math in This Course?

End-to-End Principle • Hypothesis #1: theory not relevant to Internet Whenever possible, communications protocol – Body of math created for telephone networks operations should be defined to occur at the – Many of these models don’t work in data networks end-points of a communications system. • Hypothesis #2: too many kinds of theory – Queuing: statistical multiplexing works Programmability – Control: TCP congestion control works With programmable end hosts, new network – Optimization: TCP maximizes aggregate utility services can be added at any time, by anyone. – Game: reasoning about competing ASes

8 No Strict Notions of Identity

What Will Happen • Leads to to the Internet – Spam – Spoofing – Denial-of-service – Route hijacking

35

Protocols Designed Based on Trust Nobody in Charge

• That you don’t spoof your addresses • Traffic traverses many Autonomous Systems – MAC spoofing, IP address spoofing, spam, … – Who’s fault is it when things go wrong? – How do you upgrade functionality? • That port numbers correspond to applications – Rather than being arbitrary, meaningless numbers • Implicit trust in the end host – What if some hosts violate congestion control? • That you adhere to the protocol – Ethernet exponential back-off after a collision • Anyone can add any application – TCP additive increase, multiplicative decrease – Whether or not it is legal, moral, good, etc. • • That protocol specifications are public Spans many countries – So others can build interoperable implementations – So no one government can be in charge

9 Challenging New Requirements The Internet of the Future • Can we fix what ails the Internet • Disseminating data – Security, performance, reliability • Mobile, multi-homed hosts – Upgradability, managability – • Sometimes-connected hosts • Without throwing out baby with bathwater • Large number of hosts – Ease of adding new hosts • Real-time applications – Ease of adding new services – Ease of adding new link technologies • An open technical and policy question…

Thank You!

10