16: Application, Transport, Network and Link Layers

Last Modified: 7/3/2004 1:46:53 PM

Roadmap

- Application Layer (User level)
- Transport Layer (OS)
- Network Layer (OS)
- Link Layer (Device Driver, Adapter Card)


Application Layer

- Network Applications Drive Network Design
- Important to remember that network applications are the reason we care about building a network infrastructure
- Applications range from text-based command line ones popular in the 1980s (like ftp, news, chat, etc.) to multimedia applications (Web browsers, audio and video streaming, realtime videoconferencing, etc.)

Applications and application-layer protocols

Application: communicating, distributed processes
  - running in network hosts in "user space"
  - exchange messages to implement app
  - e.g., email, file transfer, the Web

Application-layer protocols
  - one "piece" of an app
  - define messages exchanged by apps and actions taken
  - use services provided by lower layer protocols

[Figure: end hosts running the full stack (application, transport, network, data link, physical)]

Client-server paradigm

Typical network app has two pieces: client and server

Client:
- initiates contact with server ("speaks first")
- typically requests service from server
- for Web, client is implemented in browser; for e-mail, in mail reader

Server:
- Running first (always?)
- provides requested service to client, e.g., Web server sends requested Web page, mail server delivers e-mail

[Figure: client sends request, server sends reply, each through its own protocol stack]

How do clients and servers communicate?

API: application programming interface
- defines interface between application and transport layer
- socket: API
  - two processes communicate by sending data into socket, reading data out of socket

Q: how does a process "identify" the other process with which it wants to communicate?
  - IP address of host running other process
  - "port number": allows receiving host to determine to which local process the message should be delivered

… more on this later.

Socket programming

Goal: learn how to build client/server applications that communicate using sockets

Socket API
- introduced in BSD4.1 UNIX, 1981
- Sockets are explicitly created, used, released by applications
- client/server paradigm
- two types of transport service via socket API:
  - unreliable datagram
  - reliable, byte stream-oriented

socket: a host-local, application-created/owned, OS-controlled interface (a "door") into which an application process can both send and receive messages to/from another (remote or local) application process

Sockets

Socket: a door between application process and end-end transport protocol (UDP or TCP)

[Figure: two hosts (host or server) connected through the internet; in each, the process is controlled by the application developer, and the socket, kernel buffers and variables are controlled by the operating system]
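The client/server pattern over a reliable byte-stream socket can be sketched in Python as follows. This is only an illustration under assumptions that are not in the slides: the function names `run_echo_server` and `echo_once` are made up, the server echoes data back, and both ends run in one process over the loopback address with an OS-chosen port.

```python
import socket
import threading

def run_echo_server(server_sock):
    """Server side: accept one connection and echo back whatever arrives."""
    conn, _addr = server_sock.accept()
    with conn:
        while True:
            data = conn.recv(1024)
            if not data:          # client closed its sending side
                break
            conn.sendall(data)

def echo_once(message: bytes) -> bytes:
    """Start a loopback server, send `message` as a client, return the reply."""
    # SOCK_STREAM selects the reliable, byte stream-oriented service (TCP).
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server_sock:
        server_sock.settimeout(5)
        server_sock.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
        server_sock.listen(1)
        port = server_sock.getsockname()[1]
        server = threading.Thread(target=run_echo_server, args=(server_sock,))
        server.start()                        # the server is running first

        # Client side: initiates contact ("speaks first").
        with socket.create_connection(("127.0.0.1", port), timeout=5) as client_sock:
            client_sock.sendall(message)
            client_sock.shutdown(socket.SHUT_WR)   # signal "no more data"
            chunks = []
            while True:
                chunk = client_sock.recv(1024)
                if not chunk:
                    break
                chunks.append(chunk)
        server.join()
    return b"".join(chunks)
```

Calling `echo_once(b"hello")` should return the same bytes. Replacing `SOCK_STREAM` with `SOCK_DGRAM` would select the unreliable datagram service instead, which needs no `listen`/`accept` step.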


Languages and Platforms

- Socket API is available for many languages on many platforms:
  - C, Java, Perl, Python,…
  - *nix, Windows,…
- Socket Programs written in any language and running on any platform can communicate with each other!
- Client and server must agree on the type of socket, the server port number and the protocol

Transport services and protocols

- provide logical communication between app processes running on different hosts
- transport protocols run in end systems
- transport vs network layer services:
  - network layer: data transfer between end systems
  - transport layer: data transfer between processes
    - relies on, enhances, network layer services

[Figure: logical end-end transport between two hosts across a chain of routers]


Services provided by Internet transport protocols

TCP service:
- connection-oriented: setup required between client, server
- reliable transport between sending and receiving process
- flow control: sender won't overwhelm receiver
- congestion control: throttle sender when network overloaded
- does not provide: timing, minimum bandwidth guarantees

UDP service:
- unreliable data transfer between sending and receiving process
- does not provide: connection setup, reliability, flow control, congestion control, timing, or bandwidth guarantee
- Q: why bother? Why is there a UDP?

UDP

- UDP adds very little functionality (or overhead) to bare IP
- Adds multiplexing/demultiplexing
- other UDP uses (why?):
  - DNS: small, retransmit if necessary
  - often used for streaming multimedia apps
    - Loss tolerant
    - rate sensitive

UDP segment format (32 bits wide):

    source port #  | dest port #
    length         | checksum
    Application data (message)

Length: in bytes of UDP segment, including header
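The UDP segment format above (four 16-bit header fields followed by the message) can be sketched with Python's `struct` module. The helper names `build_udp_segment` and `parse_udp_segment` are hypothetical, and the checksum is simply left at 0 here rather than computed.

```python
import struct

# "!" = network (big-endian) byte order; four unsigned 16-bit fields:
# source port, destination port, length, checksum.
UDP_HEADER = struct.Struct("!HHHH")

def build_udp_segment(src_port, dst_port, payload, checksum=0):
    """Prepend an 8-byte UDP header to the payload (checksum not computed)."""
    length = UDP_HEADER.size + len(payload)   # length covers header + data
    return UDP_HEADER.pack(src_port, dst_port, length, checksum) + payload

def parse_udp_segment(segment):
    """Split a segment back into its header fields and payload."""
    src_port, dst_port, length, checksum = UDP_HEADER.unpack_from(segment)
    payload = segment[UDP_HEADER.size:length]
    return src_port, dst_port, length, checksum, payload
```

For example, `build_udp_segment(5000, 53, b"query")` yields 13 bytes: 8 bytes of header plus the 5-byte message, which illustrates how little overhead UDP adds to bare IP.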


Process-to-Process Message Delivery

Goal: Deliver application data to correct process (and more particularly to the right socket)

Segment: unit of data exchanged between transport layer entities (TPDU)

TCP/UDP segment format (32 bits wide):

    source port #  | dest port #
    other header fields
    application data (message)

Multiplexing/demultiplexing

Multiplexing: gathering data from multiple app processes, enveloping data with header (later used for demultiplexing)

Demultiplexing: Stream of incoming data into one machine separated into smaller streams destined for individual processes

Demultiplexing based on IP addresses of sender and receiver and port numbers of both sender and receiver
  - Can distinguish traffic coming to same port but part of separate conversations (like multiple client connections to a web server)

[Figure: processes P1 to P4 on sender and receiver; message M gets segment header Ht at the transport layer and Hn at the network layer]
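A minimal sketch of demultiplexing on the sender/receiver addresses and ports: incoming segments are grouped by that four-part key, so two clients talking to the same server port end up in separate streams. The tuple layout and the name `demultiplex` are assumptions made for illustration.

```python
def demultiplex(segments):
    """Group payloads by (src_ip, src_port, dst_ip, dst_port).

    `segments` is an iterable of (src_ip, src_port, dst_ip, dst_port, payload).
    Returns a dict mapping each conversation key to its payloads in arrival order.
    """
    streams = {}
    for src_ip, src_port, dst_ip, dst_port, payload in segments:
        key = (src_ip, src_port, dst_ip, dst_port)
        streams.setdefault(key, []).append(payload)
    return streams

# Two clients, both sending to port 80 on the same server (example addresses).
incoming = [
    ("10.0.0.1", 40001, "10.0.0.9", 80, b"GET /a"),
    ("10.0.0.2", 40001, "10.0.0.9", 80, b"GET /b"),
    ("10.0.0.1", 40001, "10.0.0.9", 80, b"GET /c"),
]
streams = demultiplex(incoming)
```

Here `streams` holds two entries even though every segment targets port 80, because the sender address differs.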

TCP adds functionality

- TCP adds lots of functionality over bare IP and over UDP
  - Still has multiplexing/demultiplexing
  - Adds reliable, in-order delivery
  - Adds flow control and congestion control
- How can you guarantee that other side gets "A B C D E" when network could:
  - Lose data "A B D E"
  - Duplicate data "A B C C D E"
  - Corrupt data "A B X D E"
  - Reorder data "A C D E B"
  - Or all of the above!

Common Sense

- Consider faxing a document with flaky machine
  - Can't talk to person on the other side any other way
- What would you do to make sure they got the transmission?
  - Number the pages so receiver can put them in order/detect duplicates/detect losses
  - Need feedback from the receiver!!!
  - Resend data that is missing or if you don't hear from receiver
- Put some info on cover sheet that lets person verify fax info (summarize info like checksum)
- What if it is a really big document? Receiver might like to be able to tell you to send first 10 pages then 10 more…
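The "number the pages" idea can be sketched as a receiver that uses sequence numbers to reorder, drop duplicates, discard corrupted pages, and report what is missing (the feedback the sender needs in order to resend). The function name `reassemble` and the `(number, content, checksum_ok)` tuple layout are assumptions for illustration only.

```python
def reassemble(received, total_pages):
    """Rebuild a document from numbered pages that may arrive damaged.

    `received` is an iterable of (page_number, content, checksum_ok).
    Returns (pages_in_order, missing_page_numbers).
    """
    pages = {}
    for number, content, checksum_ok in received:
        if not checksum_ok:
            continue              # corrupted: treat it as lost
        if number in pages:
            continue              # duplicate: already have this page
        pages[number] = content
    missing = [n for n in range(total_pages) if n not in pages]
    in_order = [pages[n] for n in sorted(pages)]
    return in_order, missing

# "A B C D E" sent; B arrives corrupted, C is duplicated, E arrives before D.
arrivals = [(0, "A", True), (2, "C", True), (2, "C", True),
            (4, "E", True), (1, "X", False), (3, "D", True)]
in_order, missing = reassemble(arrivals, total_pages=5)
```

With these arrivals the receiver holds A, C, D, E in order and would ask the sender to resend page 1.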

TCP Connection Management

Recall: TCP sender, receiver establish "connection" before exchanging data segments
- initialize TCP variables:
  - seq. #s
  - buffers, flow control info (e.g. RcvWindow)
- client: connection initiator
    Socket clientSocket = new Socket("hostname","port number");
- server: contacted by client
    Socket connectionSocket = welcomeSocket.accept();

Three way handshake:

Step 1: client end system sends TCP SYN control segment to server
  - specifies initial seq #

Step 2: server end system receives SYN, replies with SYNACK control segment
  - ACKs received SYN
  - allocates buffers
  - specifies server-> receiver initial seq. #

Step 3: client acknowledges server's initial seq. #

Three-Way Handshake

    Active participant (client)             Passive participant (server)
      SYN, SequenceNum = x                           ->
      <-    SYN + ACK, SequenceNum = y, Acknowledgment = x + 1
      ACK, Acknowledgment = y + 1                    ->

Note: SYNs take up a sequence number even though no data bytes
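The sequence and acknowledgment numbers exchanged in the three steps can be traced with a small sketch. It only models the numbers shown in the handshake diagram; the dictionary layout and the name `three_way_handshake` are assumptions, and no real segments are sent.

```python
SEQ_SPACE = 2 ** 32   # TCP sequence numbers are 32-bit and wrap around

def three_way_handshake(client_isn, server_isn):
    """Return the three control segments as dicts, given both initial seq #s."""
    # Step 1: client sends SYN with its initial sequence number x.
    syn = {"flags": {"SYN"}, "seq": client_isn}
    # Step 2: server replies SYN+ACK with its own number y and acknowledges x + 1
    # (the SYN consumes one sequence number even though it carries no data).
    syn_ack = {"flags": {"SYN", "ACK"},
               "seq": server_isn,
               "ack": (syn["seq"] + 1) % SEQ_SPACE}
    # Step 3: client acknowledges the server's initial sequence number: y + 1.
    ack = {"flags": {"ACK"}, "ack": (syn_ack["seq"] + 1) % SEQ_SPACE}
    return [syn, syn_ack, ack]

segments = three_way_handshake(client_isn=100, server_isn=300)
```

With x = 100 and y = 300, the server acknowledges 101 and the client acknowledges 301.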

Timeout and Retransmission

- Receiver must acknowledge receipt of all packets
- Sender sets a timer; if acknowledgement has not arrived before timer expires then sender will retransmit packet
- Adaptive retransmission: timer value computed as a function of average round trip times and variance

TCP: retransmission scenarios (1)

[Figure: two timelines between Host A and Host B.
 Lost data scenario: A sends Seq=92, 8 bytes data; the segment is lost (X); A times out and resends Seq=92, 8 bytes data; B replies ACK=100.
 Lost ACK scenario: A sends Seq=92, 8 bytes data; B replies ACK=100, which is lost (X); A times out and resends Seq=92, 8 bytes data; B replies ACK=100 again.]
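The slides do not give a formula for the adaptive timer. One common way to compute it, sketched here as an assumption, is the smoothed estimator described in RFC 6298: keep a running average of the round trip time and a running deviation, and set the timeout to the average plus four deviations. The class name is made up, and the RFC's clock-granularity term and minimum timeout are left out.

```python
class RetransmissionTimer:
    """Adaptive timeout from round trip time samples (simplified RFC 6298 style)."""

    def __init__(self, alpha=0.125, beta=0.25, k=4):
        self.alpha = alpha      # weight of a new sample in the average
        self.beta = beta        # weight of a new sample in the deviation
        self.k = k              # how many deviations to add as safety margin
        self.srtt = None        # smoothed round trip time
        self.rttvar = None      # smoothed round trip time deviation

    def update(self, sample_rtt):
        """Fold one measured RTT into the estimates and return the new timeout."""
        if self.srtt is None:
            # First measurement initializes both estimates.
            self.srtt = sample_rtt
            self.rttvar = sample_rtt / 2
        else:
            # Deviation is updated first, using the old average.
            self.rttvar = ((1 - self.beta) * self.rttvar
                           + self.beta * abs(self.srtt - sample_rtt))
            self.srtt = (1 - self.alpha) * self.srtt + self.alpha * sample_rtt
        return self.timeout()

    def timeout(self):
        return self.srtt + self.k * self.rttvar
```

A steady RTT shrinks the deviation term and tightens the timeout, while jittery samples widen it, which helps avoid premature timeouts.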


TCP: retransmission scenarios (2)

[Figure: two timelines between Host A and Host B.
 Premature timeout, cumulative ACKs: A sends Seq=92, 8 bytes data and Seq=100, 20 bytes data; the Seq=92 timer expires before ACK=100 arrives, so A resends Seq=92, 8 bytes data; B answers with cumulative ACK=120.
 Duplicate ACK, fast retransmit: A sends Seq=92, Seq=100 and Seq=120; the Seq=100 segment is lost (X); B keeps replying ACK=100; A resends Seq=100, 20 bytes data (really need 3 dup acks before fast retransmit); B then replies ACK=120.]

Network layer functions

- transport packet from sending to receiving hosts
- network layer protocols in every host, router (Recall transport layer is end-to-end)

three important functions:
- path determination: route taken by packets from source to dest. Routing algorithms
- switching: move packets from router's input to appropriate router output
- call setup: some network architectures (e.g. telephone, ATM) require router call setup along path before data flow

Internet Protocol

- The Internet is a network of heterogeneous networks:
  - using different technologies (ex. different maximum packet sizes)
  - belonging to different administrative authorities (ex. willing to accept packets from different addresses)
- Goal of IP: interconnect all these networks so can send end to end without any knowledge of the intermediate networks
- Routers, switches, bridges: machines to forward packets between heterogeneous networks

IP Addressing: introduction

- IP address: 32-bit identifier for host, router interface
- interface: connection between host and physical link
  - routers must have multiple interfaces
  - host may have multiple interfaces
  - IP addresses (unicast addresses) associated with interface, not host, router

    223.1.1.1 = 11011111 00000001 00000001 00000001
                   223       1        1        1

[Figure: router joining three LANs, with interfaces 223.1.1.1 to 223.1.1.4, 223.1.2.1, 223.1.2.2, 223.1.2.9, 223.1.3.1, 223.1.3.2, 223.1.3.27]
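The dotted-decimal to binary correspondence shown for 223.1.1.1 can be reproduced with a short sketch; the helper names `to_binary` and `to_int` are illustrative only.

```python
def to_binary(dotted):
    """Render a dotted-decimal IPv4 address as four 8-bit groups."""
    return " ".join(f"{int(octet):08b}" for octet in dotted.split("."))

def to_int(dotted):
    """Pack the four octets into the single 32-bit identifier."""
    value = 0
    for octet in dotted.split("."):
        value = (value << 8) | int(octet)
    return value

print(to_binary("223.1.1.1"))   # 11011111 00000001 00000001 00000001
```

Each decimal field is one byte of the 32-bit address, most significant byte first.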


IP Addressing

- IP address:
  - 32 bits
  - network part (high order bits)
  - host part (low order bits)
  - Defined by class of IP address?
  - Defined by subnet mask
- What's a network? (from IP address perspective)
  - device interfaces with same network part of IP address
  - can physically reach each other without intervening router

[Figure: network consisting of 3 IP networks (223.1.1, 223.1.2, 223.1.3); one of them is labeled LAN]

IP Addressing

How to find the networks?
- Detach each interface from router, host
- create "islands" of isolated networks

[Figure: interconnected system consisting of six networks, with interfaces in 223.1.1, 223.1.2, 223.1.3, 223.1.7, 223.1.8 and 223.1.9]
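Grouping interfaces by their network part can be sketched with Python's standard `ipaddress` module. The /24 prefix length is an assumption read off the figure labels (223.1.1, 223.1.2, 223.1.3), and `group_by_network` is a made-up helper name.

```python
import ipaddress
from collections import defaultdict

def group_by_network(addresses, prefix_len=24):
    """Map each network (address with host bits zeroed) to its interfaces."""
    networks = defaultdict(list)
    for addr in addresses:
        iface = ipaddress.ip_interface(f"{addr}/{prefix_len}")
        networks[str(iface.network)].append(addr)
    return dict(networks)

interfaces = ["223.1.1.1", "223.1.1.2", "223.1.2.1",
              "223.1.2.9", "223.1.3.1", "223.1.3.27"]
islands = group_by_network(interfaces)
```

Interfaces that land in the same entry share a network part and, per the slide's definition, can reach each other without an intervening router.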


IP Addresses (Classes)

given notion of "network", let's re-examine IP addresses:

"class-full" addressing (each address is 32 bits)

    class  leading bits  remaining fields  address range                 use
    A      0             network, host     1.0.0.0 to 127.255.255.255    unicast
    B      10            network, host     128.0.0.0 to 191.255.255.255  unicast
    C      110           network, host     192.0.0.0 to 223.255.255.255  unicast
    D      1110          -                 224.0.0.0 to 239.255.255.255  multicast
    E      1111          reserved          240.0.0.0 to 255.255.255.255  reserved

IP Address Space Allocation

[Figure: IP address space allocation, CAIDA 1998]
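The class of an address follows directly from the leading bits of its first byte, as in the table above. A minimal sketch (the function name `address_class` is illustrative):

```python
def address_class(dotted):
    """Return the classful address class (A to E) from the leading bits."""
    first = int(dotted.split(".")[0])
    if first & 0b10000000 == 0b00000000:
        return "A"      # 0...
    if first & 0b11000000 == 0b10000000:
        return "B"      # 10...
    if first & 0b11100000 == 0b11000000:
        return "C"      # 110...
    if first & 0b11110000 == 0b11100000:
        return "D"      # 1110... (multicast)
    return "E"          # 1111... (reserved)
```

For instance, `address_class("223.1.1.1")` gives `"C"`, since 223 is 11011111 in binary.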

Recall: How to get an IP Address?

- Answer 1: Normally, answer is get an IP address from your upstream provider
  - This is essential to maintain efficient routing!
- Answer 2: If you need lots of IP addresses then you can acquire your own block of them.
  - IP address space is a scarce resource: must prove you have fully utilized a small block before you can ask for a larger one and pay $$ (Jan 2002: $2250/year for a /20 and $18000/year for a /14)

IP addressing: CIDR

- classful addressing:
  - inefficient use of address space, address space exhaustion
  - e.g., class B net allocated enough addresses for 65K hosts, even if only 2K hosts in that network
- CIDR: Classless InterDomain Routing
  - network portion of address of arbitrary length
  - address format: a.b.c.d/x, where x is # bits in network portion of address

    |<------- network part ------->|<- host part ->|
    11001000 00010111 0001000        0 00000000

    200.23.16.0/23
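Splitting an a.b.c.d/x address into its network and host bits is a matter of cutting the 32-bit string after x bits. A small sketch, with `split_cidr` as a hypothetical helper name:

```python
def split_cidr(cidr):
    """Return (network_bits, host_bits) as bit strings for an a.b.c.d/x address."""
    address, prefix = cidr.split("/")
    x = int(prefix)                                   # bits in the network portion
    bits = "".join(f"{int(octet):08b}" for octet in address.split("."))
    return bits[:x], bits[x:]

network_bits, host_bits = split_cidr("200.23.16.0/23")
block_size = 2 ** len(host_bits)                      # addresses in the block
```

For 200.23.16.0/23 this leaves 9 host bits, so the block spans 512 addresses, a size that neither a class B (/16) nor a class C (/24) allocation could express.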


How to get lots of IP Addresses? Internet Registries

- RIPE NCC (Réseaux IP Européens Network Coordination Centre) for Europe, Middle-East, Africa
- APNIC (Asia Pacific Network Information Centre) for Asia and Pacific
- ARIN (American Registry for Internet Numbers) for the Americas, the Caribbean, sub-Saharan Africa

Note: Once again regional distribution is important for efficient routing!

Can also get Autonomous System Numbers (ASNs) from these registries

Classful vs Classless

- Class A = /8
- Class B = /16
- Class C = /24


IP addresses: how to get one?

Network (network portion):
- get allocated portion of ISP's address space:

    ISP's block     11001000 00010111 00010000 00000000   200.23.16.0/20

    Organization 0  11001000 00010111 00010000 00000000   200.23.16.0/23
    Organization 1  11001000 00010111 00010010 00000000   200.23.18.0/23
    Organization 2  11001000 00010111 00010100 00000000   200.23.20.0/23
    ...
    Organization 7  11001000 00010111 00011110 00000000   200.23.30.0/23

Hierarchical addressing: route aggregation revisited

Hierarchical addressing allows efficient advertisement of routing information:

[Figure: Organization 0 (200.23.16.0/23), Organization 1 (200.23.18.0/23), Organization 2 (200.23.20.0/23), ..., Organization 7 (200.23.30.0/23) all connect to Fly-By-Night-ISP, which tells the Internet "Send me anything with addresses beginning 200.23.16.0/20". ISPs-R-Us tells the Internet "Send me anything with addresses beginning 199.31.0.0/16".]
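Carving the ISP's /20 block into the eight /23 blocks listed above can be sketched with the standard `ipaddress` module; `allocate_blocks` is an illustrative name, not part of any library.

```python
import ipaddress

def allocate_blocks(isp_block, new_prefix):
    """Split an ISP block into equal sub-blocks with the given prefix length."""
    block = ipaddress.ip_network(isp_block)
    return [str(sub) for sub in block.subnets(new_prefix=new_prefix)]

organizations = allocate_blocks("200.23.16.0/20", 23)
for index, block in enumerate(organizations):
    print(f"Organization {index}: {block}")
```

Because all eight sub-blocks share the same first 20 bits, the ISP can advertise the single prefix 200.23.16.0/20 instead of eight separate routes.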


Hierarchical addressing: more specific routes

ISPs-R-Us has a more specific route to Organization 1

[Figure: Organization 0 (200.23.16.0/23), Organization 2 (200.23.20.0/23), ..., Organization 7 (200.23.30.0/23) connect to Fly-By-Night-ISP, which tells the Internet "Send me anything with addresses beginning 200.23.16.0/20". Organization 1 (200.23.18.0/23) now connects to ISPs-R-Us, which tells the Internet "Send me anything with addresses beginning 199.31.0.0/16 or 200.23.18.0/23".]

IP Address Allocation

- CIDR is great but must work around existing allocations of IP address space
  - Company 1 has a /20 allocation and has given out sub portions of it to other companies
  - University has a full class B address
  - Company 2 has a /23 allocation from some other class B
  - ALL use the same upstream ISP: that ISP must advertise routes to all these blocks that cannot be described with a simple CIDR network ID and mask!
- Estimated reduction in routing table size with CIDR
  - If IP addresses were reallocated based on geographic and service provider divisions and CIDR applied to all, current routing tables with 10000+ entries could be reduced to 200 entries [Ford, Rekhter and Brown 1993]
  - How stable would that be though? Leases for all?
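When both the /20 and the more specific /23 advertisement cover a destination, routers prefer the longest matching prefix. The slides do not name this rule, so the sketch below is an illustration of that standard behavior; `longest_prefix_match` and the table layout are assumptions.

```python
import ipaddress

def longest_prefix_match(table, destination):
    """Return the next hop whose prefix matches `destination` with most bits."""
    dest = ipaddress.ip_address(destination)
    best_net, best_hop = None, None
    for prefix, next_hop in table.items():
        net = ipaddress.ip_network(prefix)
        if dest in net and (best_net is None or net.prefixlen > best_net.prefixlen):
            best_net, best_hop = net, next_hop
    return best_hop

# Advertisements from the figure, as seen by a router in the Internet.
routes = {
    "200.23.16.0/20": "Fly-By-Night-ISP",
    "199.31.0.0/16": "ISPs-R-Us",
    "200.23.18.0/23": "ISPs-R-Us",
}
```

A packet for 200.23.18.7 (Organization 1) matches both 200.23.16.0/20 and 200.23.18.0/23 and goes to ISPs-R-Us, while 200.23.20.1 (Organization 2) only matches the /20 and goes to Fly-By-Night-ISP.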


Current Allocation

- Interesting to examine current IP address space allocation (who has class A's? etc.)
  - Who has A's?
  - Computer companies around during initial allocation (IBM, Apple)
  - Universities (Stanford, MIT)
  - CAIDA has info on complete allocation

Routing

- IP Routing: each router is supposed to send each IP datagram one step closer to its destination
- How do they do that?
  - Hierarchical Routing: in an ideal world would that be enough? Well, it's not an ideal world
  - Other choices
    - Static Routing
    - Dynamic Routing
      - Before we cover specific routing protocols we will cover principles of dynamic routing protocols


Routing

Routing protocol

Goal: determine "good" path (sequence of routers) thru network from source to dest.

Graph abstraction for routing algorithms:
- graph nodes are routers
- graph edges are physical links
  - link cost: delay, $ cost, or congestion level
- "good" path:
  - typically means minimum cost path
  - other definitions possible

[Figure: graph with nodes A, B, C, D, E, F and link costs 1, 1, 1, 2, 2, 2, 3, 3, 5, 5]

Routing Algorithm classification: Static or Dynamic?

Choice 1: Static or dynamic?

Static:
- routes change slowly over time
- Configured by system administrator
- Appropriate in some circumstances, but obvious drawbacks (routes added/removed? sharing load?)
- Not much more to say?

Dynamic:
- routes change more quickly
  - periodic update
  - in response to link cost changes
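Finding minimum cost paths on the graph abstraction can be sketched with Dijkstra's algorithm. The slide's figure does not survive extraction well enough to tell which cost belongs to which link, so the edge list below is an assumption chosen to use the same six nodes and the same set of cost values; `shortest_costs` is an illustrative name.

```python
import heapq

# Assumed link costs (undirected); not recoverable with certainty from the figure.
EDGES = [("A", "B", 2), ("A", "C", 5), ("A", "D", 1), ("B", "C", 3),
         ("B", "D", 2), ("C", "D", 3), ("C", "E", 1), ("C", "F", 5),
         ("D", "E", 1), ("E", "F", 2)]

def build_graph(edges):
    """Turn an undirected edge list into an adjacency dict."""
    graph = {}
    for u, v, cost in edges:
        graph.setdefault(u, {})[v] = cost
        graph.setdefault(v, {})[u] = cost
    return graph

def shortest_costs(graph, source):
    """Dijkstra: minimum path cost from `source` to every reachable node."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if cost > dist.get(node, float("inf")):
            continue                      # stale heap entry
        for neighbor, weight in graph[node].items():
            new_cost = cost + weight
            if new_cost < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_cost
                heapq.heappush(heap, (new_cost, neighbor))
    return dist

costs_from_a = shortest_costs(build_graph(EDGES), "A")
```

Under these assumed costs, the cheapest way from A to F goes through D and E (total cost 4) rather than over the direct but expensive links through C.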


Routing Algorithm classification: Global or decentralized?

Choice 2, if dynamic: global or decentralized information?

Global:
- all routers have complete topology, link cost info
- "link state" algorithms

Decentralized:
- router knows physically-connected neighbors, link costs to neighbors
- iterative process of computation, exchange of info with neighbors (gossip)
- "distance vector" algorithms

Link Layer: setting the context

- two physically connected devices:
  - host-router, router-router, host-host
- unit of data: frame

[Figure: message M gains headers Ht (transport), Hn (network) and Hl (link) on the way down the stack; the frame Hl Hn Ht M crosses the physical link between adapter cards under the data link protocol]
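The decentralized approach can be sketched as a synchronous distance vector computation: each node starts out knowing only the costs to its neighbors and repeatedly improves its estimates from the vectors its neighbors last advertised, until nothing changes. The three-node topology, its costs, and the function name are assumptions for illustration.

```python
def distance_vector(link_costs):
    """Iterate Bellman-Ford style updates until every node's vector is stable.

    `link_costs` maps (u, v) pairs to the cost of the undirected link u-v.
    Returns {node: {destination: least known cost}}.
    """
    nodes = sorted({n for pair in link_costs for n in pair})
    neighbors = {n: {} for n in nodes}
    for (u, v), cost in link_costs.items():
        neighbors[u][v] = cost
        neighbors[v][u] = cost

    inf = float("inf")
    # Initially a node only knows itself and its directly connected neighbors.
    vectors = {u: {v: (0 if u == v else neighbors[u].get(v, inf)) for v in nodes}
               for u in nodes}

    changed = True
    while changed:
        changed = False
        # Snapshot = the vectors the neighbors advertised in the previous round.
        advertised = {u: dict(vec) for u, vec in vectors.items()}
        for u in nodes:
            for dest in nodes:
                if dest == u:
                    continue
                best = min((cost + advertised[n][dest]
                            for n, cost in neighbors[u].items()), default=inf)
                if best < vectors[u][dest]:
                    vectors[u][dest] = best
                    changed = True
    return vectors

tables = distance_vector({("X", "Y"): 2, ("Y", "Z"): 1, ("X", "Z"): 7})
```

Here X learns from Y's advertisement that Z is reachable at cost 3 via Y, which beats the direct link of cost 7, without X ever seeing the full topology.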

Link Layer Services

- Framing, link access:
  - encapsulate datagram into frame, adding header, trailer
  - implement channel access if shared medium
  - 'physical addresses' used in frame headers to identify source, dest
    - different from IP address!
- Reliable delivery between two physically connected devices:
  - Reliable delivery over an unreliable link (like TCP but done at link layer)
  - seldom used on low bit error link (fiber, some twisted pair)
  - wireless links: high error rates
    - Q: why both link-level and end-end reliability?

Link Layer Services (more)

- Flow Control:
  - pacing between sender and receivers
- Error Detection:
  - errors caused by signal attenuation, noise.
  - receiver detects presence of errors:
    - signals sender for retransmission or drops frame
- Error Correction:
  - receiver identifies and corrects bit error(s) without resorting to retransmission


Multiple Access Links and Protocols

Three types of "links":
- broadcast (shared wire or medium; e.g., Wavelan, etc.)
- point-to-point (single wire, e.g. PPP, SLIP)
- switched (e.g., switched Ethernet, ATM etc)

Link Layer: Implementation

- implemented in "adapter"
  - e.g., PCMCIA card, Ethernet card
  - typically includes: RAM, DSP chips, host bus interface, and link interface

[Figure: message M gains headers Ht, Hn and Hl down the stack; the frame Hl Hn Ht M crosses the physical link between adapter cards under the data link protocol]

Multiple Access protocols

- single shared communication channel
- two or more simultaneous transmissions by nodes: interference
  - only one can send successfully at a time
- multiple access protocol:
  - distributed algorithm that determines how stations share channel, i.e., determine when station can transmit
- claim: humans use multiple access protocols all the time

CSMA: Carrier Sense Multiple Access

CSMA: listen before transmit:
- If channel sensed idle: transmit entire pkt
- If channel sensed busy, defer transmission
  - Persistent CSMA: retry immediately with probability p when channel becomes idle (may cause instability)
  - Non-persistent CSMA: retry after random interval
- human analogy: don't interrupt others!


Ethernet

"dominant" LAN technology:
- cheap: $20 for 100 Mbps!
- first widely used LAN technology
- Simpler, cheaper than token LANs and ATM
- Kept up with speed race: 10, 100, 1000 Mbps
- Uses CSMA with collision detection

[Figure: Metcalfe's Ethernet sketch]
