Forwarding and Addressing in the Link Layer

3035/GZ01 Networked Systems Kyle Jamieson Lecture 4

Department of Computer Science University College London The link layer

The link layer (L2) has the job of transferring a datagram point-to-point over a link

Datagram link_send(packet, link-id) network_handle Wireless Node A Node B Node D Link layer Link layer Link layer Serial Link layer Node C Unit of data is called a frame at the link layer Where is the link layer implemented?

• In each and every host Host Applicaon • Network adapter Transport CPU Memory – Physical card Network – Chip in a device Link Network adapter – Aaches to host’s Controller system bus Link • Implemented both in Physical PHY hardware and soware – Soware: device driver running on CPU – Hardware: FPGA, ASIC in network adapter Hardware or soware?

Host • Soware is more flexible, Applicaon unless you need: Transport CPU Memory Network 1. Time-crical funconality Link Network adapter – Backoff transmission ming in Ethernet Controller Link Physical PHY 2. Highly parallel processing – Dot product computaons for CDMA, filtering for the PHY Today

1. Framing – Byte-oriented framing – Bit-oriented framing 2. Link layer addressing 3. Link layer switching and forwarding Framing: The problem

Bit stream 1 0 0 0 1 0 0 1 1

Binary encoding

Manchester encoding

Time • We have seen how to encode bits on a link – Ethernet: Manchester encoding, synchronizaon on edges – Result in general: An infinite stream of bits • Problem: where does each frame begin and end? Byte-oriented framing

• Treat each frame as a collecon of bytes rather than bits • Byte-oriented framing link layer protocols: – PPP (point-to-point protocol), used to carry Internet Protocol (IP) packets over dialup lines, Virtual Private Networks – BISYNC (IBM, 60’s), Digital Data Communicaon Message Protocol (DDCMP), Digital Equipment Corporaon’s DECNET • BISYNC uses sennel characters to frame:

• SYN: Beginning of frame • SOH: Start of header • STX: Start of text (packet data) • ETX: End of text • What if sennel occurs in data? Character stuffing: insert escape character DLE before sennel • What if escape character occurs in data? Escape the escape character PPP byte stuffing

• Like BISYNC, PPP takes a sennel-based approach

• At the sender: – Stuff escape byte before each flag byte in payload – Stuff escape byte before each escape byte in payload

• At the receiver: – Escape, flag byte à discard escape byte, keep flag byte, connue data recepon – Single flag byte: interpret as the flag byte – Two escape bytes à discard one Byte-counng approach: DDCMP

• Byte-counng framing: Include number of bytes in header (Count field):

• What if Count field is corrupted? – Will frame the wrong bytes – This is called a framing error – With high probability, CRC will detect the framing error and discard the frame • Do we sll need to escape the SYNs? Bit-oriented framing

• Main idea: View link as stream of bits instead of bytes • Example: High-Level Data Link Control (HDLC), used in Cisco routers and the V.42 standard for telephone modems • Choose some paern of bits as a sennel e.g.: 1111111 Seven 1s • Bit stuffing algorithm: Modify the outgoing data stream as follows: • At sender: 111111 → 1111110 Six 1s Six 1s

• At receiver: 1111110 → 111111 1111111 → (end of frame) Seven 1s • Size of frame depends on data it contains! • Not guaranteed free of framing errors Today

1. Framing 2. Link layer addressing 3. Link layer switching and forwarding 4. Virtualizing links Comparing addressing schemes

address (IP address) – Funcon: move datagram to desnaon network – 32-bit address, doed quad notaon a.b.c.d where each component is an eight-bit unsigned integer – Hierarchical address space

• Link layer address (MAC address, Ethernet address): – Funcon: move frame from one point to another point on the same network – 48-bit address (in most LANs) – Burned in NIC ROM, also somemes soware seable – Usually a flat address space Link-layer addressing

• Each adapter on LAN has unique link layer address, my_station_id! • Special broadcast address broadcast_id!

Station Identifier (Ethernet 17 24 12 05 19 Address)

• Each L3 protocol registers itself in net_handler array • L2 address-related funconality for inbound frames: • Test received_frame.destination against my_station_id or broadcast_id, drop if neither match • handler ß net_handler! !! !!!![received_frame.net_protocol]! • call handler(received_frame.payload)! Mapping network to link addresses

• Goal: Translate from nework_send(data, length, IP, N) to link_send(data, length, IP, enet/18)! • Name-mapping table: example of so state

upper-layer link identifier G internet Ethernet/ L M N P Q K address station work work work work 1 H station station server station station 2 enet/15 … 3 J M 4 enet/18 1 1 1 1 1 N 6 5 E P enet/14 17 15 18 14 22 19 Q enet/22 K enet/19 Ethernet F E enet/19 Ethernet station identifier Link layer addresses aren’t routable

• Network layer routes from one network to another

upper-layer network address link identifier G L M N P Q K work work work work router 1 H station station server station station 2 … 3 J 4 1 1 1 1 1 6 5 E 17 15 18 14 22 19

Ethernet F Ethernet station identifier (link layer address) The Address Resoluon Protocol (ARP) L sending a datagram to N: L: network_send(data, length, IP, N)! L: link_send(“where is N?”, length, ARP, broadcast_id)! N: link_send(“N is at enet/18”, length, ARP, enet/17)! L: link_send(data, length, IP, enet/18)!

upper-layer network address ARP table at L: link identifier G Internet enet/ L M N P Q K Address station TTL work work work work router 1 H station station server station station 2 … 3 J N 18 20 min 4 1 1 1 1 1 6 5 E 17 15 18 14 22 19

Ethernet F Ethernet station identifier ARP’s role in L3 roung

• Walkthrough: send IP data from L to E via router K • Router K • Two ARP tables: one for each subnet • Two interfaces (interface ids 6, 4): one for each subnet, with two link-layer addresses (19, 27)

L N K E work work router server station station 4 27 1 1 6 17 18 19 28

Ethernet Ethernet Example: roung between subnets

L sending a datagram to E: L: network_send(data, length, IP, E)! L: link_send(data, length, IP, enet/28)! L: ARPs for K’s link layer address (enet/19) L: link_send(data, length, IP, enet/19)! K: ARPs for E’s link layer address (enet/28) on link/4 K: link_send(link/4, data, length, IP, enet/ 28)!

L N K E work work router server station station 4 27 1 1 6 17 18 19 28

Ethernet Ethernet Hubs

• Replicang device: bits coming in go out all ports – Operates at the physical layer (L1) • Doesn’t run CSMA/CD: immediately replicates bits • Collisions possible between all hosts • No frame buffering at the hub • Physical limitaons: can’t grow collision domain too big Hub Today

1. Framing 2. Link layer addressing 3. Link layer switching and forwarding – Learning switches – – Virtual LANs Ethernet nowadays: Switched

• Hosts have dedicated, direct A connecon to switch F B • Switches buffer frames 1 2 6 3 Switch S • Ethernet protocol used on each 5 4 incoming link, but no collisions – Each link in own collision domain C

• Switching: Allows two E D conversaons A → A’ and B → B’ simultaneously, without collisions The switch table

• How to tell which port to A forward packet? F B 21 22 26 1 2 • The switch table: 6 3 Switch S LAN addr Port Time-to-live 5 4 23 25 24 C • Switch table inially empty E D • How to build up switch table entries? Building the switch table: Self-learning

• Switch table: entries of (LAN address, port id, me-to-live)

• Learning algorithm: on receiving a packet from LAN address S to LAN address D: 1. Add S to switch table with incident port id 2. If D is broadcast_id: forward out all ports 3. If D is in switch table, lookup entry and forward to resulng port id. Otherwise, forward out all ports

• Periodically flush link table entries based on me-to-live Building up the switch table: example

1. Aà S: [A (enet/21) à D (enet/24)] A S à (all ports) F 2. B à S: [B (enet/22) à C (enet/23)] B 21 22 26 S à (all ports) 1 2 à à 6 3 3. D S: [D (enet/24) B (enet/22)] Switch S S à port 2 only 5 4 23 25 24 Switch table: C LAN addr Port TTL E D enet/21 1 20 min enet/22 2 20 min enet/24 4 20 min Aside: What is a hub?

Hub Switch • Differences – Physical layer (L1) device – Link-layer (L2) device – Replicang device: bits in one – Selecvely forward frame to port go out all ports one or more outgoing links – No medium access control – CSMA/CD MAC – No frame buffering – Buffers frames

• Commonalies – Hosts are unaware of – Hosts are unaware of presence of hubs presence of switches – No configuraon needed – No configuraon needed Interconnecng LANs

Link layer addresses S 1 4 2 3 S1 11 1 1 S3 2 S 1 21 2 A 3 4 2 12 3 4 13 D 22 23 B C E F

• Switches can connect LANs as well as hosts • Somemes called bridges in this context • The enrety is called an extended LAN Interconnecng LANs: example Suppose C sends frame to F, F responds to C

Link layer addresses S 1 4 2 3 S1 11 1 1 S3 2 S 1 21 2 A 3 4 2 12 3 4 13 D 22 23 B C E F

Address Port 13 4 Interconnecng LANs: example Suppose C sends frame to F, F responds to C S 1 4 2 3 S1 11 1 1 S3 2 S 1 21 2 A 3 4 2 12 3 4 13 D 22 23 B C E F

Address Port 13 4

Address Port 13 1 Interconnecng LANs: example Suppose C sends frame to F, F responds to C S 1 4 2 3 S1 11 1 1 S3 2 S 1 21 2 A 3 4 2 12 3 4 13 D 22 23 B C E F Address Port 13 1 Address Port 13 4 Address Port 13 1 Address Port 13 1 Interconnecng LANs: example Suppose C sends frame to F, F responds to C S 1 4 2 3 S1 11 1 1 S3 2 S 1 21 2 A 3 4 2 12 3 4 13 D 22 23 B C E F Address Port 13 1 Address Port 13 4 23 1 Address Port 13 1 23 4 Address Port 13 1 23 2 Problem: Forwarding loops

S 1 4 2 3 S1 1 11 1 1 S3 2 S2 A 3 4 5 2 12 3 4 13 22 23 B C E F Address Port 13 1 Address Port 13 4 23 1 Address Port 22 1 13 1 23 4 Address Port 22 3 13 1 23 2 22 2 Problem: Forwarding loops

S 1 4 2 3 S1 1 11 1 1 S3 2 S2 A 3 4 5 2 12 3 4 13 22 23 B C E F Address Port 13 1 Address Port 13 4 23 1 Address Port 22 1 13 1 23 4 Address Port 22 3 13 1 22 2 23 2 22 2 Problem: Forwarding loops

• Can’t learn the direcon of a source if it’s in more than one direcon, so bridge learning algorithm breaks

• Why might loops form? – Inadvertently: many people responsible for network, one person adds a bridge

• Intenonally: more connecons between bridges increases redundancy, helping to cope with failure – So we need to revise the bridge learning algorithm The spanning tree protocol (STP)

• Manager at DEC asked Radia Perlman to build a switch (bridge) to connect two

• Perlman’s idea: Switches agree on a loop-free and connected spanning tree

• Implementers at DEC resisted (wanted simplest possible design), first customer site connected bridge to one Ethernet twice, generang a “broadcast storm”

• Once the spanning tree is formed: – Switches block some ports from sending or receiving data – Switches connue using the learning switch algorithm to forward over the spanning tree

Spanning tree protocol (STP): Outline

1. Elect one root switch (switch with the lowest ID) 2. Compute shortest paths tree from root to each switch S – Note which port of S is on path to root: root port (R)

3. All switches connected to a LAN choose a designated switch to forward frames to root (switch on path to root from LAN) – e.g. switch 2 is designated for the LAN below – i.e., each port decides if it is a designated port (D)

4. Block all other ports: blocked port (B) Port number 1(D) 1(R) 2 A 3 Switch ID LAN ID Ethernet LAN Switch STP: Messages and switch state

• All switches exchange configuraon messages switch X sends: (Root idenfier, distance to root, X) – Configuraon messages are never blocked – Switch X generates configuraon messages periodically with a “clock ck” mer, inially sending (X, 0, X)

• State at each switch X: 1. Root idenfier (inially X) 2. Configuraon message to send (inially (X, 0, X)) 3. State at each port: a) Forwarding data traffic or blocking data traffic (inially forwarding) b) “Best” configuraon message + age of that message (inially empty) Elecng a per-LAN designated switch

• Designated port rule: At a switch, for each port p – Consider all configuraon messages received on port p and the configuraon message the switch would send

– If receive “beer” configuraon message on a port p, don’t send configuraon messages on port p, otherwise p is designated: send your configuraon message on p

• Rule for comparing configuraon messages: (R1, d1, X1) beer than (R2, d2, X2) if R1 < R2 or (R1 = R2 and d1 < d2) or (R1 = R2 and d1 = d2 and X1 < X2) Comparing configuraon messages

• Example: Switch 2 sends (2, 0, 2); 3 sends (3, 0, 3) • Message (2, 0, 2) is beer than (3, 0, 3)

– Subsequently, switch 2, port 1 sends configuraon messages periodically – Subsequently, switch 3, port 1 does not send configuraon messages

1(D) 1 2 3 STP: inial startup phase

F E 1(D) (78,0,78) 87 G 5 1(D) 2 78 (87,0,87) 1 4 2(D) (78,0,78) 1(D) 90 3 (66,0,66) 2 2 D 66 A 3(D) 1 5 (5,0,5) C 2 2(D) 1(D) 101 (2,0,2) 2 (2,0,2) B 1 2

• Inially, X generates configuraon Switch 5 Root Dist Transmier messages periodically, sending (X, 0, X) Port 1 2 0 2 • Port 2 2 0 2 All ports designated and forwarding Port 3 5 0 5 • Result: For each LAN, one aached switch (the designated switch) transmits configuraon messages Calculang root ID and cost to root

• Switches connuously take the following steps: 1. Root ID rule: Root ID r at switch X is the minimum of X and root IDs received at all ports 2. Calculate distance to root d at switch X: – If X is the root (X = r), d = 0 – Otherwise, X is not the root (X ≠ r): • d = one plus the minimum cost from configuraon messages received on all ports (transmier field breaks es). Suppose this comes from port p. • Root port rule: Designate port p as a root port 3. Switch X’s configuraon message is now (r, d, X). Reapply designated port rule on all ports 4. Blocked port rule: Don’t forward data to or from a port if it is not a designated port or a root port STP: building the tree

F E 1(D) (78,0,78) 87 G 1(D) 78 (87,0,87) 2(D) (78,0,78) (66,0,66) 90 1(D) 3(D) (2,1,5) D 66 A 5 1(R) C 2(B) (2,0,2) 1(D) 101 2(D) 2 (2,0,2) B

• On LANs A & B, 5 hears (2, 0, 2) Switch 5 Root Dist Transmier • 5’s root ID r: 2, d = 1 (port breaks e) Port 1 2 0 2 Port 2 2 0 2 • 5’s new message: (2, 1, 5) Port 3 2 1 5 STP: building the tree

F E 1(D) (78,0,78) 87 G 1(D) 78 (87,0,87) 2(D) (78,0,78) (2,1,66) 90 1(D) 2(R) 3(D) (2,1,5) D 66 A 5 1(R) C 2(B) (2,0,2) 1(D) 101 2(D) 2 (2,0,2) B

• On LAN A , 66 hears (2, 0, 2) Switch 66 Root Dist Transmier • 66’s root ID r: 2, d = 1 Port 1 2 1 66 Port 2 2 0 2 • 66’s new message: (2, 1, 66) STP: building the tree

F E 1(D) (78,0,78) 87 G 1(D) 78 (87,0,87) 2(D) (78,0,78) (2,1,66) 90 1(D) 2(R) 3(D) (2,1,5) D 66 A 5 1(R) C 2(B)

(2,0,2) 1(D) 1(R) 101 2(D) (2,2,101) 2(D) 2 (2,0,2) B

• On LAN C, 101 hears (2, 1, 5) Switch 101 Root Dist Transmier • 101’s root ID r: 2, d = 2 Port 1 2 1 5 Port 2 2 2 101 • 101’s new message: (2, 2, 101) STP: building the tree

(2,2,78) F E 1(D) (2,1,90) 87 G 5(D) 78 1(B) 4(D) (2,1,90) 2(R) (2,1,66) 90 3(B) 1(D) 2(R) 2(R) D 66 A 5 1(R) 3(D) C 2(B) (2,1,5) (2,2,101) (2,0,2) 1(D) 101 2(B) 2(D) 2 (2,0,2) B 1(R)

Switch 90 Root Dist Transmier Port 1 2 1 66 • On LAN A , 90 hears (2, 0, 2) Port 2 2 0 2 • 90’s root ID r: 2, d = 1 Port 3 2 1 5 • 90’s new message: (2, 1, 90) Port 4 2 1 90 Port 5 2 1 90 STP: building the tree

(2,2,87) (2,2,78) F E 1(D) (2,1,90) 1(R) 87 2(B) G 5(D) 78 1(B) 4(D) (2,1,90) 2(R) (2,1,66) 90 3(B) 1(D) 2(R) 2(R) D 66 A 5 1(R) 3(D) C 2(B) (2,1,5) (2,2,101) (2,0,2) 1(D) 101 2(B) 2(D) 2 (2,0,2) B 1(R)

Switch 87 Root Dist Transmier Port 1 2 1 66 • On LAN F , 87 hears (2, 1, 90) Port 2 2 0 2 • 87’s root ID r: 2, d = 2 • 87’s new message: (2, 2, 87) STP: Handling topology changes

• Configuraon messages also contain age: (r, d, X, age) – age = 0 when sending configuraon message

• Best configuraon message for each port contains age – Age incremented each unit of me – If age reaches some threshold (max age), discard that configuraon message and recalculate using all rules

• Recalculate when: 1. Receive beer or newer configuraon message on port p: overwrite exisng configuraon message 2. Timer cks: increment message age in stored messages for each port STP: Handling failures

(2,2,78) 1(D) (2,1,90) 87 5(D) 1(R) 2(B) 78 1(D) 4(D) (2,1,90) 2(R) (2,1,90) 90 3(B) 2(R) 2(R) 66 1(R) 5 3(D) 2(B) (2,1,5) (2,0,2) 1(D) 101 2(B) 2(D) 2 (2,0,2) 1(R)

Switch 90 Root Dist Transmier Port 1 2 2 1 1 66 90 Port 2 2 0 2 Port 3 2 1 5 Port 4 2 1 90 Port 5 2 1 90 90’s root ID r: 2, d = 1 90’s message: (2, 1, 90) STP: Handling new bridges

(2,2,78) 1(D) (2,1,90) 87 5(D) RP 2(B) 78 1(D) 4(D) (2,1,90) 2(R) (2,1,90) 90 3(B) 2(R) 2(R) 66 1(R) 5 3(D) 2(B) (2,1,5) (2,0,2) 1(D) 101 2(B) 2(D) 2 (2,0,2) 1(R) (95,0,95) (95,0,95) 2(D) 1(D) 95 • Don’t want loops, even for short periods of me • Switches 2, 5, 101 send messages immediately, with current age in table • On power-up, switch puts its ports in new pre-forwarding state: sends configuraon messages as if designated, but doesn’t forward data The problem with extended LANs

• Switched LANs afford greater scalability, but extended LANs do not isolate traffic – Broadcast frames traverse the enre extended LAN

• Two consequences: 1. Broadcast traffic does not scale, reducing overall performance 2. Allows eavesdropping across LANs

• Therefore, extended LANs don’t scale Virtual LANs (VLANs)

1 9 15 2 4 8 10 16 … … Computer Science Electrical Engineering • Idea: The switch assigns each port a color, an idenfier designang the VLAN that port belongs to • Traffic isolaon: colors = broadcast domains • Easily reconfigurable port assignments • Roung between VLANs: layer 3 roung funconality VLAN example

• Configure ports on W, X, Y, and Z to be in appropriate VLANs – Trunk ports between B1 and B2 configured for both VLANs

• Bridge inserts VLAN header containing color between Trunk Ethernet header and payload link

• If a packet contains a VLAN header, bridges only forward on matching-color or trunk ports Why can’t Ethernet replace IP?

• L2 is point-to-point, L3 has source and desnaon

• Then Ethernet came, world got confused and thought Ethernet was a competor to L3

• Perlman: “Should have been called ‘Etherlink.’”

• So, why can’t Ethernet replace IP? 1. Flat addresses 2. No hop count (loops would be a disaster) 3. No fragmentaon, reassembly, for heterogeneous networks Mul-hop Networks and End-to-end Arguments Pre-Reading: End-to-end Arguments in System Design NEXT TIME