CS162: Operating Systems and Systems Programming Lecture 20
Total Page:16
File Type:pdf, Size:1020Kb
CS162: Operating Systems and Systems Programming Lecture 20: Networking: TCP/ P !finis#$% RPC (start'$ 23 Ju*y 201+ Char*es &eiss #ttps://cs162.eecs.berkeley,edu/ Recal*: Sockets .bstraction o/ network I/O inter/ace ― Bidirectional communication c#anne* ― 0ses file interface once esta-*is#ed ● read% write% c*ose Server setup: ― socket(), bind(), listen(), accept!$ ― read, write, c*ose from socket returned by accept Client setup: ― socket(), connect!$ ― t#en read, write, c*ose getaddrin/o($ to resolve names, addresses /or bind($ and connect!$ 2 Recal*: Layering Comple2 ser1ices from simpler ones TCP/IP networking layers: ― Physical + Link (4ireless, Ethernet, ..,$ ● Unrelia-*e, loca* e2c#ange o/ limited-si7ed frames ― Network (IP) – routing between networks ● Unrelia-*e, glo-a* e2c#ange o/ limited6si7ed packets ― Transport (TCP, 08P) – routing ● &elia-*ity, streams o/ bytes instead o/ packets, … ― .pplication – everything on top o/ sockets ( Recal*: Glue: Adding Functionality Physical Reality: Frames Abstraction: Stream Limited Size .r-itrary Si7e 0nordered (sometimes$ Ordered 0nre*iab*e Reliable <achine-to-machine Process-to-process On*y on local area net Routed anywhere .synchronous Synchronous Recal*: >ierarchical Networking: The nternet >ierarchy of networks – scales to mil*ions of host Other subnets "ranscontinental subnet1 Router #ink Router ther subnet$ subnets Router subnet2 + Recal*: Routing (2) Internet routing mechanism: routing tab*es ― 5ach router does tab*e lookup to decide which link to use to get packet closer to destination ― 8on’t need 4 -illion entries !or 2^128 /or IPv6) in ta-le: routing is by su-net Routing table contains: ― 8estination address range: output link c*oser to destination ― 8e/ault entry 6 Names 1ersus Addresses Names are ― meaningfu* ― memorable ― donDt c#ange if the resource moves .ddresses ― explain #ow to access a resource ― c#ange i/ the resource moves 5xamp*e: ― int foo; ← variab*e in C ― DfooD is t#e name, address is a pointer to some p*ace on the stack,,, C Naming in the nternet Name %ddress You probably want to use human-readable names: ― www,google.com ― www,-erkeley.edu Network wants an IP address: ― thatDs what's in routing ta-les ― a*lows routing ta-les to take advantage o/ hierarchy Mapping is done -y the Domain Name System Domain Name System (1) "op-le'el edu com (IT 169.229.131.81 berkeley)edu berkele/ www (it.edu calmail eecs 128.32.61.103 eecs.berkeley.edu www 8NS is a hierarc#ica* mec#anism for naming 128.32.139.., ― Name di1ided into "la-e*s"% right to le/t: www,eecs.-erkeley.edu 5ac# domain owned by a particu*ar organization ― Top *e1e* hand*ed -y ICANN ― SubseIuent le1e*s owned -y organi7ations Domain Name System (2) "op-le'el edu com (IT 169.229.131.81 berkeley)edu berkele/ www (it.edu calmail eecs 128.32.61.103 eecs.berkeley.edu www Resolution -y repeated Iueries: 128.32.139.., ― One server /or each HdomainH (JrootK, edu, -erkeley.edu$ ― Plus -ackups: redundancy !availa-le, not gaurenteed consistent$ Domain Name System (3) >ow do you find the root ser1er' >ardcoded list of root ser1ers and backups (updated rarely) … or use your ISP's ser1er (makes repeated queries on your be#al/$ ― called a recursive resolver Recal*: Iterati1e 1s. Recursive Query Master/Directory Master/Directory get(K14) get(K14) K14 N3 N3 K14 N3 V14 V14 Recursive Iterative K14 V14 K14 V14 … … N1 N2 N3 N50 N1 N2 N3 N50 Recursive Query: Directory delegates Iterative Query: Client de*egates 12 Pv4 Packet Format P >eader Size o/ datagram ;*ags & Lengt# !#eader+data$ Fragmentation to sp*it large 0 1+ 16 (1 messages P Oer4 @ >L ToS Tota* *engt#(166-its) 16--it identification Nags 136-it /rag oM Time to P #eader TTL protoco* 166-it #eader c#ecksum Li1e !#ops) 20 -ytes 32--it source IP address Type o/ 32--it destination IP address transport options (if any$ protoco* 8ata Destination machine Pv4 Packet Format: 4hat the router cares about P >eader Size o/ datagram ;*ags & Lengt# !#eader+data$ Fragmentation to sp*it large 0 1+ 16 (1 messages P Oer4 @ >L ToS Tota* *engt#(166-its) 16--it identification Nags 136-it /rag oM Time to P #eader TTL protoco* 166-it #eader c#ecksum Li1e !#ops) 20 -ytes 32--it source IP address Type o/ 32--it destination IP address transport options (if any$ protoco* 8ata Destination machine nternet Protocol Features &outing – IP packet goes anyw#ere ― Just need the IP address ;ragmentation – sp*it big messages into sma**er ― Still message si7e limit (6@Q$ ― Reassemble at destination ― >ides diferences in p#ysical layers <u*tiple protocols on top: ― 1 byte to specify protoco*: ● C<P (1), TCP (6$% 0DP (1C), IPSec !+0, 51), … ― Registry o/ protocol num-ers nternet Protocol Non6Features 0nre*iable delivery ― IP packets are not guaranteed ― <ay be lost -y underlying physical layer (radio noise'$ ― <ay be dropped i/, e.g., router out o/ resources Out-of-order/duplicate de*ivery ― Tolerance to physical layer retrying packets ― Tolerance to multiple paths Layering Comple2 ser1ices from simpler ones TCP/IP networking layers: ― Physical + Link (4ireless, Ethernet, ..,$ ● Unrelia-*e, loca* e2c#ange o/ limited-si7ed frames ― Network !IP$ 9 routing -etween networks ● Unrelia-*e, glo-a* e2c#ange o/ limited6si7ed packets ― "ransport ("CP1 UDP) – streams ● Reliablity, streams of bytes instead of packets, … ― .pplication – everything on top o/ sockets =*ue: Adding ;unctionality Physical Reality: Frames Abstraction: Stream Limited Size Arbitrary Size 2nordered rdered (sometimes) 2nreliable Reliable (achine-to-machine Process-to-process On*y on local area net Routed anywhere .synchronous Synchronous Ordered Messages: Problem Want to divide message into packets, e,g. ― H=5T /static/hw/hw1.pd/ HTTP/1.0" into ― H=5T /static" and "/hw/hw1.pd/ HTTP/1.0" 4#y? Not can't always fit request in one packet (IP: 64K *imit 9 think of uploading a fi*e) IP mig#t reorder these packets: ― H/hw/hw1.pd/ HTTP/1.0H, then ― H=5T /static" Ordered Message: Solution Ordered messages on top of unordered ones: ― .ssign seIuence num-ers to packets: 0,1,2,3,4…, ― / packets arrive out o/ order, hold and put back in order be/ore delivering to user (through socket inter/ace$ ― For instance, #old onto #3 until #2 arrives, etc. Tricky case: packets from "o*d" connections Reliable Message Deli1ery: Problem .** physical networks can gar-*e and/or drop packets ― Physical hardware pro-lems (-ad wire% -ad signa*$ ― There/ore, IP can gar-le/drop packets (doesn't fix this$ Suilding re*ia-*e message de*ivery ― Con"rm that packets arrive e7actly once ― Con"rm that packets aren't gar-led Using Acknowledgements % B % B P acket "imeout ack Checksum: detect gar-*ed packets Receiver send packet to acknowledge when packet received and ungar-*ed: ― No acknowledgement' Resend after timeout 4#at i/ ack dropped? ― Packet is resent !wastefu*$% second c#ance to acknowledge it, 4hat about duplicates' Recal*: Sequence number Simplest version: 1 bit (0/1): % B Problem: What if packet delayed too muc#' 4aiting for Acks: Performance Time from Soda to google,com and back: 8 ms Maximum packet size: ~1500 bytes % B 1500 bytes / 8 ms = 188 KByte/s 4indow-based acknowledgements !1) A pkt B 4indowing protocol !not Iuite TCP$ #0 N=5 Send up to N packets without ack ― .**ows pipelining of packets Queue 0 ― k# N *imits 8ueue si6e ac 4 k# ac Bot! source and destination need to store N packets !w#y'$ 5ac# packet #as seIuence num-er ― .ck says Vreceived all packets up to sequence number 9W ― .d1ances window Sliding Window 0 1 2 3 4 5 6 7 %0K 0 0 1 2 3 4 5 6 7 Window represents packets: ― That might need to -e retransmitted !dropped/gar-led$ ― That receiver needs space to -uMer (ordering) 4indow-based acknowledgements !2) Packet lost' ― &esent -y timeout, again .cknow*edgement lost? ― Packet resent, causing ack to -e resent (again$ ― / we're lucky acknowledged together with later packet instead 8iscard out o/ order packets' ― / no, need some way to indicate Hholes" Transmission Control Protocol !TCP$ Stream in; Stream out; ))6/7w'uts Router Router gfedcba Transmission Control Protoco* !TCP) ― Reliable b/te stream between two processes on diferent mac#ines o1er Internet (read, write$ ― Si-directiona* (two streams for e1ery connection) TCP 8etai*s ― ;ragments byte stream into packets, hands packets to IP ― 4indow-based acknowledgement protoco* ― Automatica**y retransmits lost packets ― Adjusts rate o/ transmission to a1oid congestion ● <ec#anism: Window size TCP 4indows and Sequence Numbers Se8uence Numbers 5ent 5ent Not yet acked not acked sent Sender Recei'ed Recei'ed Not /et Given to app Bu=ered recei'ed Recei'er Sender has three regions: ― sent and ack’ed ― sent and not ack’ed ― not yet sent TCP 4indows and Sequence Numbers Se8uence Numbers 5ent 5ent Not yet acked not acked sent Sender Recei'ed Recei'ed Not /et Given to app Bu=ered recei'ed Recei'er Receiver has three regions: ― received and ack’ed (given to app*ication$ ― received and buMered ― not yet received (or discarded because out o/ order$ 4indow-Based 1--Acknowledgements1.- 1+- 2$- 2*- $--(TCP)$.- $,- .-- SeI: Seq:100 Seq:140 SeI:190 SeI:230 SeI:260 SeI:300 SeI:340 380 Si7e:@0 Size:50 Size:40 Si7e:30 Si7e:40 Si7e:40 Si7e:40 Si7e: 20 %:10->$-- 5e8:1-- %:14->260 5e8:1.- %:19->210 5e8:2$- %:19->210 5e8:2*- %:19->210 5e8;$-- %:19->210 5e8:1+- Retransmit? %;$.->60 5e8;$.- %;$,->20 5e8;$,- %;.-->0 Selecti1e Acknowledgement Option !SACK$ Oanil*a TCP Acknow*edgement ― 5very message encodes SeIuence num-er and .ck ― Can include data for /orward stream and/or ack for reverse stream Se*ective Acknowledgement ― .cknowledgement in/ormation includes not just one num-er, -ut rather ranges o/ received packets ― TCP ption (e2tension$ – o/ten not used ― Turns out e2tending TCP is hard !compati-ility pro-lems) Congestion &outer Congestion: too muc# data somew#ere in network P's solution: dro packets Sad for TCP ― Lots o/ retransmission – wasted work ― Lots o/ waiting for timeouts – underuti*i7ed connection Congestion Avoidance &outer Solution: adjust window size .