Interconnection Protocols

Purpose of Course

• Understand generic problems and approaches
• Less important (for this course) but covered somewhat: description of what happens to be implemented

Good Ideas

• Parameters (e.g., how often to send hellos) should be settable at nodes one at a time without having to reset the whole network
• How can you ensure that two nodes have compatible parameters?
• Some changes are “compatible” in that they can be added to nodes one at a time without disrupting operations.
• How can you design messages so that you can add fields in the future and still interoperate with old nodes?


Copyright © 2001 Radia Perlman


Another Distance Vector Trick

• Double metric: hops and cost
  - allows routes to be chosen based on “real” metric (allowing slow links to have higher costs)
  - detects unreachable destinations (with path lengths more than 15) just as quickly

Link Costs

• Some think it’s a good idea for the cost of the link to vary according to congestion.
  - route around congestion
  - don’t have to configure link costs
• I don’t
  - extra control traffic
  - less time the network is in a converged state
  - can’t react quickly enough to matter


IEEE 802

Chartered to standardize LANs
Took their job very seriously -- standardized lots of LANs
• 802.1 -- management, interconnection
• 802.2 -- SAPs, LLC
• 802.3 -- CSMA/CD
• 802.4 -- Token bus
• 802.5 -- Token ring
• Other committees: FDDI, Metropolitan area nets, security

What is a LAN?

Badly defined term (typical in this field)
Assumed general characteristics:
• multiaccess
• logical full connectivity
• multicast/broadcast capability
• limited scalability (distance, number of stations, total traffic)

[Figure: LAN with stations X, A, Q, M]


What is Multicast?

• Unicast: Alice sends to Bob
• But, on a LAN, everyone can hear, so if they’re not Bob, they discard the packet
• The hardware has to be smart to filter out traffic not intended for this node (so as not to bother the node with all the interrupts)
• But sometimes you want to send to a group of nodes. Use the natural ability of the LAN!
  - to send to a group of nodes...tell them all to listen to address X (in addition to their own address). Then transmit to X and they’ll all receive it
• “Promiscuous” listen means you tell your chip to send everything to you.

Taking Turns

• Suppose there’s a shared medium (like the room we’re in). If more than one speaker, just hear garbage
• Old kinds had a master that polled the “slaves”
  - “Alice, do you have anything to send?”
  - Alice: “Terminal 23 typed ‘Gree’”
  - “Amy, I’m sending 315 to you. Anything to send?”
  - “Andy, do you have anything to send?”
  - Fancier protocols poll recently active stations more frequently
  - Could have 2 slaves talking directly, but only if 2 addresses in header


CSMA/CD (Ethernet)

- Carrier Sense Multiple Access with Collision Detect.
- Single “bus” on which all stations reside. If anyone transmits, everyone hears
- Intended for peer-to-peer (not master to slaves)
- To transmit: If link idle, transmit. Check for collision. If collision, back off a random amt. If again a collision, back off a random amt chosen from a double-sized interval, etc., until max collisions. Then give up on that packet, but for the next packet start from the small interval again: “exponential backoff”
- Many variants

The “CD” in CSMA/CD

• Speed of light: “It’s not just a good idea. It’s the law”
• Suppose packet very small. May not detect collision
• Maximum length of bus (maximum delay, including repeaters), and minimum sized packet, so guaranteed to detect collision.
• Packet size must be twice length of cable in order for the sender to know it hasn’t collided! (why?)
• Hub: Multi-port repeater
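The exponential backoff rule on the CSMA/CD slide can be sketched as below. This is only an illustration: the function name `backoff_slots`, the give-up limit of 16 attempts, and the cap of 10 doublings are assumptions (the slide only says "until max collisions"), and the wait is expressed in abstract slot times.

```python
import random

def backoff_slots(collision_count, max_collisions=16, cap_exponent=10, rng=random):
    """Return how many slot times to wait after the given number of
    collisions on one packet, or None when the sender should give up.

    max_collisions and cap_exponent are assumed values, not from the slide.
    """
    if collision_count >= max_collisions:
        return None  # give up on this packet
    # The interval doubles with each collision, up to an assumed cap.
    exponent = min(collision_count, cap_exponent)
    return rng.randrange(2 ** exponent)
```

The next packet starts over with `collision_count` at 1, so one unlucky packet does not penalize later ones.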


Token Ring

[Figure: ring of stations A, B, C, D, E, F, G, H, I]

• Just a bunch of point-to-point links
• Want to minimize delay around the circle
• Minimum delay: one bit per station
• “token” can be turned into “start of packet” by flipping one bit
• Token travels around ring. To transmit, when see token, flip bit, transmit packet

Token Rings, Continued

• Source removes frame from ring (why?)
• Slower rings (802.5) tend to have one frame at a time
• Faster rings (e.g., FDDI) have a train of packets followed by the token
• Idea simple, but made much more complicated with priorities
• Two “ack” bits: A=address recognized, C=frame copied


Token Bus

• Like Ethernet, everything on one wire
• Complicated protocol for building a logical ordering of active stations
• Each active station must know predecessor, successor
• When get token, send packet (if you have one), then transmit token to successor
• Periodically, invite other stations to join the ring (between you and your successor)
• VERY complex protocol for doing that

Which is best?

• For a while, hotly debated
• These days, all of them dying out! (I’ll show you what “Ethernet” is these days in a later lecture)


IEEE802 Addresses

• Idea: unique ID for each IEEE802 device

[Figure: address layout showing the OUI, the group/individual bit, and the globally/locally assigned bit]

• Assigned in blocks of 2^24
• Given 23-bit constant (Organizationally Unique Identifier), plus group/individual bit
• Group/individual bit supposed to be first bit transmitted
• Different bit order very annoying

Address Issues

• How do you write down an address?
  - canonical written form: 43-75-cf-5f-45-7a
• Transmitted on 802.3 or 802.4 LSB first
  01000011 01110101 11001111 01011111 01000101 01111010
• Transmitted on 802.5 MSB first
  11000010 10101110 11110011 11111010 10100010 01011110
• All 1’s intended to mean “broadcast”, i.e., “everyone”, which is nonsense. Really each protocol should use its own multicast address to mean all nodes that speak that protocol
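One reading of the two bit patterns on the Address Issues slide is that they show the address bytes as stored for each LAN type: the canonical bytes sent LSB first on 802.3/802.4, and bit-reversed bytes sent MSB first on 802.5. The sketch below checks that, under that reading, both put the same bit sequence on the wire. The helper names are hypothetical.

```python
def parse_canonical(text):
    """Parse a canonical address such as '43-75-cf-5f-45-7a' into bytes."""
    return bytes(int(part, 16) for part in text.split("-"))

def reverse_bits(byte):
    """Reverse the order of the 8 bits in one byte."""
    return int(f"{byte:08b}"[::-1], 2)

def wire_bits(octets, lsb_first):
    """Return the bits in transmission order as a string of 0s and 1s."""
    out = []
    for b in octets:
        bits = f"{b:08b}"  # written MSB first
        out.append(bits[::-1] if lsb_first else bits)
    return "".join(out)

canonical = parse_canonical("43-75-cf-5f-45-7a")
# Byte values as they would be stored for an MSB-first LAN (the second pattern).
msb_first_form = bytes(reverse_bits(b) for b in canonical)

same_on_wire = wire_bits(canonical, lsb_first=True) == wire_bits(msb_first_form, lsb_first=False)
```

This is why a bridge between the two LAN types has to bit-swap address fields: the bytes differ in memory even though the wire order agrees.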


EUI-48/EUI-64

• EUI-48 is the 6-byte address already discussed
• EUI-64 is IEEE’s new 8-byte address.
• How much should be OUI?
  - 5 byte OUI, virtually unlimited number of OUIs, 2^24 addresses per block
  - 3 byte OUI, virtually unlimited number of addresses per OUI, but 22 bits (4 million) of OUIs
  - Consequence of too small block...some people need to come back for more blocks
  - Consequence of too small OUI...run out of OUIs, world ends
  - They kept 3-byte OUI, but with plea to not ask for more OUIs until you’ve used up all your addresses

Multi-Lingual Environments

• You can speak lots of things (IP, CLNP, IPX, AppleTalk, etc.)
• Someone hands you a pile of bits. What is it?
  - Maybe we were careful -- yeah, right
  - Maybe we were lucky -- yeah, right
• Conclusion: not enough information in the packet header to differentiate -- need an extra field in the header to say what it is
  - protocol type: well-known (globally administered) values, one field in header
  - SAP (service access point) or socket: locally administered, one for destination, one for source


Packet HDRs on CSMA/CD

Ethernet
  bytes:  6     6    2    46-1500  4
  field:  dest  src  p-t  DATA     fcs

802.3
  bytes:  6     6    2   1     1     1    43-1497  4
  field:  dest  src  ln  dsap  ssap  ctl  data     fcs

Format of SAP
[Figure: SAP value with its G/L (global/local) and G/I (group/individual) bits marked]

How the SAPs work

• Notice the “global/local” bit -- those SAPs are globally assigned! If you are a very privileged protocol, and obtain one of these, you’d set DSAP=SSAP= your assigned SAP value
• How does it work if you’re not a privileged protocol? Uh...
• World class kludge -- get a SAP value assigned to mean “underprivileged protocol”. That was done. It’s called SNAP SAP (SubNetwork Access Protocol), and it = aa hex.
• If DSAP=SSAP=aa hex, then after CTL is a protocol type field
• The protocol type is 5 bytes long
• Convention: 0.0.0.protocol type allows 2 octet Ethertypes to fit into 5 octets


Converting between 802.3 and Ethernet

Ethernet
  bytes:  6     6    2  46-1500
  field:  dest  src  X  DATA

802.3
  bytes:  6     6    2   1   1   1  5        38-1492
  field:  dest  src  ln  AA  AA  3  0.0.0.X  data

Why Bridges?

• LANs don’t scale
  - Distance
  - Number of stations
  - Traffic
• Want to glue LANs together
• Want local traffic to stay local

[Figure: stations A, C, and D on bridged LANs]
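The conversion shown on the “Converting between 802.3 and Ethernet” slide can be sketched as below. It is a minimal illustration under stated assumptions: frames are passed without the FCS, minimum-size padding is ignored, and the function names are made up for this example.

```python
import struct

SNAP_SAP = 0xAA    # DSAP = SSAP = aa hex
UI_CONTROL = 0x03  # the "3" in the ctl field on the slide
SNAP_PREFIX = bytes([SNAP_SAP, SNAP_SAP, UI_CONTROL, 0, 0, 0])  # then 2-byte X

def ethernet_to_snap(frame):
    """dest, src, X, DATA  ->  dest, src, ln, AA, AA, 3, 0.0.0.X, data."""
    dest, src, ethertype, payload = frame[0:6], frame[6:12], frame[12:14], frame[14:]
    body = SNAP_PREFIX + ethertype + payload
    return dest + src + struct.pack("!H", len(body)) + body

def snap_to_ethernet(frame):
    """Inverse translation; refuses anything that is not 0.0.0.X SNAP."""
    dest, src = frame[0:6], frame[6:12]
    (length,) = struct.unpack("!H", frame[12:14])
    body = frame[14:14 + length]
    if body[0:6] != SNAP_PREFIX:
        raise ValueError("not an AA/AA/03 frame with protocol type 0.0.0.X")
    return dest + src + body[6:8] + body[8:]
```

The 8 bytes of LLC and SNAP header come out of the data field, which is why the data range shrinks when a packet is carried in 802.3 format.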


Two Types of Bridges

• Two types
  - Transparent (also known as spanning tree)
  - Source routing
• Transparent standardized by 802.1
• Source routing originally adopted by 802.5 when it lost out to transparent bridges in 802.1

Transparent Bridge

• Design constraint: work with unmodified stations, designed to work on single LAN

[Figure: stations A, C, and D on bridged LANs]

• Packet contains two interesting fields:
  - destination
  - source
• Learn based on source field
• Forward based on destination field
• When in doubt, forward
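The three rules on the Transparent Bridge slide (learn from source, forward by destination, when in doubt forward) fit in a few lines. The sketch below is only an illustration: the class and method names are hypothetical, and it leaves out cache aging and the spanning tree, which later slides cover.

```python
class LearningBridge:
    """Toy transparent bridge: a station cache plus the forwarding rule."""

    def __init__(self, ports):
        self.ports = list(ports)
        self.table = {}  # station address -> port it was last heard on

    def receive(self, in_port, src, dest):
        """Return the list of ports the packet should be forwarded onto."""
        self.table[src] = in_port          # learn based on source field
        out_port = self.table.get(dest)    # forward based on destination field
        if out_port is None:
            # When in doubt, forward (everywhere except where it came from).
            return [p for p in self.ports if p != in_port]
        if out_port == in_port:
            return []                      # local traffic stays local
        return [out_port]
```

For example, a bridge with ports 1, 2, and 3 floods the first packet from A to D, then forwards D's reply only onto A's port.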


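Multiple Ports

[Figure: bridge with several ports and the station addresses learned on each port; stations A, C, D, X, F]

Multiple Hops

[Figure: stations A, C, D, X, F on LANs joined by more than one bridge, with the station lists each bridge learns per port]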


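Arbitrary Trees

[Figure: tree of LANs and bridges with the station lists learned on each port; stations A, C, D, X, F, Y, Z]

Maybe All Topologies Work

[Figure: stations S and Z with bridges B1, B2, B3]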


What to do about loops?

• Give up on bridges: “I guess that idea didn’t work”
• Document the restriction
• Build automatic loop detection and reporting into bridges
• Spanning tree algorithm -- automatically and continuously prune the topology to use a loop-free subset

Why Allow Loops

• Useful for redundancy
• Hard to avoid

[Figure: bridge B]


Algorhyme

I think that I shall never see
A graph more lovely than a tree.

A tree whose crucial property
Is loop-free connectivity.

A tree which must be sure to span
So packets can reach every LAN.

First the Root must be selected.
By ID it is elected.

Least cost paths from Root are traced.
In the tree these paths are placed.

A mesh is made by folks like me.
Then bridges find a spanning tree.

Basic Algorithm

• Elect unique Root bridge amongst all bridges
  - lowest ID (append priority as most significant portion of ID)
• Calculate distance to Root from self
• Elect unique Designated Bridge on each LAN
  - It’s the one closest to the Root
  - (priority.ID) breaks ties
• Designated Bridge periodically transmits a Hello message on the LAN (with 802 header with multicast destination address) containing:
  - Root ID
  - distance to Root
  - Designated bridge’s ID
  - other stuff


Finding Cost to Root

(Root ID, cost to Root, my ID)

[Figure: bridge 36 hears one Hello on each of five ports: 13,5,21; 13,5,40; 15,2,21; 13,7,22; 13,4,80. Root=? cost=? my ID=?]

Pruning to a Tree

(Root ID, cost to Root, my ID)

[Figure: same bridge 36 and received Hellos 13,5,21; 13,5,40; 15,2,21; 13,7,22; 13,4,80. Root=13, cost=5, my ID=36. Bridge 36 transmits 13,5,36 on three of its ports.]

In tree: those for which you are Designated Bridge, plus single link which is best path to Root
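The worked example on these two slides can be reproduced with a short sketch. Several things here are assumptions for illustration: the port names `a` to `e`, the function name, a cost of 1 for every link (which matches the slide's step from 4 to 5), and the use of plain tuple comparison to rank Hellos as (Root ID, cost, sender ID).

```python
def evaluate_hellos(my_id, hellos, link_cost=1):
    """hellos maps port -> (root_id, cost_to_root, sender_id), best Hello per port.

    Returns (my_hello, root_port, designated_ports).
    link_cost=1 is an assumed uniform link cost.
    """
    # The best Hello is the lowest tuple: lowest Root, then cost, then sender ID.
    root_port, best = min(hellos.items(), key=lambda item: item[1])
    if my_id < best[0]:
        # Nobody advertises a Root better than us, so we are the Root.
        return (my_id, 0, my_id), None, sorted(hellos)
    my_hello = (best[0], best[1] + link_cost, my_id)
    # We are Designated Bridge wherever our Hello beats the one heard there.
    designated = sorted(
        port for port, hello in hellos.items()
        if port != root_port and my_hello < hello
    )
    return my_hello, root_port, designated

received = {
    "a": (13, 5, 21),
    "b": (13, 5, 40),
    "c": (15, 2, 21),
    "d": (13, 7, 22),
    "e": (13, 4, 80),
}
my_hello, root_port, designated = evaluate_hellos(36, received)
```

With these inputs the bridge's own Hello is 13,5,36, the port that heard 13,4,80 is the path to the Root, and the port that heard 13,5,21 is left out of the tree because bridge 21 wins the tie there.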


[Figure: spanning tree example. Hellos shown: 2,0,2 (twice), 2,1,4, 2,1,9 (three times), and 2,2,3. Other numbers labeled in the figure: 3, 4, 5, 7, 9, 12, 17, 62, 78]

Recovering from Failures

• Hello contains age field (usually 0). Increment it every time unit. Throw it away when it hits MAX-AGE
• Since (I think) temporary loops are worse than temporary partitions, wait pre-forwarding delay before transitioning port from backup to forwarding


More Details

• Memory doesn’t grow with size of net! It’s about 50 bytes per interface for each bridge
• Once algorithm stabilizes, only Designated Bridge transmits
• To ensure all bridges use the same parameters, the Root tells them what to use in its Hello
• Tuning
  - link cost
  - bridge priority
  - hello timer
  - max-age
  - pre-forwarding delay

Station Cache Timer

• How long should a bridge remember the location of a station?
• If timer too long, black holes might last too long

[Figure: bridged LANs with stations A and C]

• If timer too short, traffic needlessly leaked
• Why would cache be wrong?
  - station physically moved: user knows it happened, it takes a few minutes, station can detect powerup and multicast something
  - spanning tree reconfigures: none of the above true


Topology Change Notification

• Use short cache timer for a while after topology change
• How do you know when there’s been a topology change?
• A bridge that switches the state of a link (forwarding to non-forwarding or vice versa) notifies its parent in a “topology change notification” message
• Designated Bridge receiving topology change notifies its parent, and sets TCA in its Hello on LAN from which info received
• Root turns on bit in Hello (TC) and leaves it on for a while

Topology Change Notification

[Figure: Root R above bridges B1, B2, B3, B5, B6; topology change notifications travel up toward R, TCA is returned on the LAN where each was heard, and TC is sent down from R]

• topology change notification: anyone to DR
• TCA flag: DR in its Hello on LAN (“I heard your topology change notification”)
• TC flag: set by Root in Hello, copied by everyone else in their Hello


Bridge Configuration PDU (BPDU)

  bytes  field
  2      Protocol ID (=0)
  1      Version (=0)
  1      BPDU type (=0)
  1      flags (TCA, TC)
  8      Root ID (6+2 priority)
  4      path cost
  8      bridge ID
  2      port ID
  2      message age
  2      max age
  2      hello time
  2      pre-forwarding delay

Bridge Problems

• Bit order different -- bridges need to bit swap 1st 12 bytes
• Frame sizes different on different types of LAN
  - FDDI priority kludge -- reserve priority 0 to mean “it passed through CSMA/CD”

[Figure: S and D connected through bridges B, B]

• Info gets lost (like priority, which doesn’t exist on CSMA/CD)
• Frame formats different -- can’t always preserve CRC end to end
• Ethernet format doesn’t exist except on CSMA/CD -- need to translate back and forth
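The BPDU field list above maps directly onto a fixed 35-byte layout. The sketch below packs one with Python's `struct` module. Treat it as an illustration under assumptions: the slide does not give byte order, flag bit positions, or timer units, so network byte order, `TC_FLAG = 0x01`, `TCA_FLAG = 0x80`, priority placed ahead of the 6-byte address, and the helper names are all choices made for this example.

```python
import struct

# 2 protocol ID, 1 version, 1 type, 1 flags, 8 root ID, 4 path cost,
# 8 bridge ID, then five 2-byte fields: port ID, message age, max age,
# hello time, pre-forwarding delay.
BPDU_FORMAT = "!HBBB8sI8sHHHHH"
TC_FLAG = 0x01   # assumed bit position
TCA_FLAG = 0x80  # assumed bit position

def make_bridge_id(priority, address):
    """8-byte ID: 2-byte priority as the most significant part, then 6-byte address."""
    if len(address) != 6:
        raise ValueError("address must be 6 bytes")
    return struct.pack("!H", priority) + address

def pack_bpdu(root_id, path_cost, bridge_id, port_id,
              message_age, max_age, hello_time, forward_delay, flags=0):
    """Build a configuration BPDU; protocol ID, version, and type are all 0."""
    return struct.pack(BPDU_FORMAT, 0, 0, 0, flags, root_id, path_cost,
                       bridge_id, port_id, message_age, max_age,
                       hello_time, forward_delay)
```

Putting the priority in the most significant bytes means a plain byte comparison of two IDs implements the “(priority.ID) breaks ties” rule.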


AppleTalk Kludge

• AppleTalk distinguishes version 1 from version 2 based on whether packet is in 802.2 or Ethernet format
  - Ethernet format: 2 byte protocol type, say X.Y
  - 802.2 format: follows usual convention (which bridges also follow) for using Ethertype in 5 byte protocol type: DSAP=SSAP=SNAP, protocol type=0.0.0.X.Y

AppleTalk/Bridge Interaction, Continued

[Figure: stations A and M connected through bridges B1 and B2]

• M will think A is old version, since bridges translate into Ethernet.
• How did this get fixed?
  - New 3 byte extension declared for this purpose, 00-00-f8. Bridge algorithm: leaving CSMA/CD: If Ethernet format and PT=X.Y, translate to 00-00-f8-X.Y. When forwarding to CSMA/CD, translate 00-00-f8-X.Y to Ethernet, translate anything of form 0.0.0.*.* to Ethernet except 0.0.0.X.Y
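The bridge algorithm in the last bullet can be written as two small functions. This is a sketch only: the function names and return shapes are made up, and `SAMPLE_XY` is just a sample value standing in for the slide's X.Y.

```python
ZERO_PREFIX = bytes([0x00, 0x00, 0x00])    # the usual 0.0.0 convention
TUNNEL_PREFIX = bytes([0x00, 0x00, 0xF8])  # the 00-00-f8 extension
SAMPLE_XY = bytes([0x80, 0x9B])            # sample stand-in for X.Y

def leaving_csma_cd(ethertype, special=SAMPLE_XY):
    """Ethernet-format packet leaving CSMA/CD: pick the 5-byte protocol type."""
    prefix = TUNNEL_PREFIX if ethertype == special else ZERO_PREFIX
    return prefix + ethertype

def entering_csma_cd(protocol_type, special=SAMPLE_XY):
    """Packet with a 5-byte protocol type being forwarded onto CSMA/CD.

    Returns ("ethernet", 2-byte type) or ("802.2", unchanged 5-byte type).
    """
    prefix, ethertype = protocol_type[:3], protocol_type[3:]
    if prefix == TUNNEL_PREFIX and ethertype == special:
        return ("ethernet", ethertype)   # it started out in Ethernet format
    if prefix == ZERO_PREFIX and ethertype != special:
        return ("ethernet", ethertype)   # ordinary 0.0.0.*.* translation
    return ("802.2", protocol_type)      # includes 0.0.0.X.Y, left alone
```

The effect is that a packet comes out of the far bridge in the same format it went in, so the receiver's version test still works.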


Token Ring A and C bit

• In token ring, the source transmits the packet and removes it when it returns
• Others forward the packet while receiving it, with as small a delay as possible
• There is enough time to set a flag at the end of the packet, to acknowledge it
• Two ack flags, A=address recognized, C=packet successfully copied
• What might source do with A and C?
  - A=0, C=0 destination must be down, give up
  - A=1, C=1 success
  - A=1, C=0 destination temporarily busy, send packet for another trip round ring
  - A=0, C=1 ... shouldn’t happen

What should a bridge do?

[Figure: ring with source S, destination D1, and bridge B; destination D2 is on the far side of B]

• Leave A and C alone? Then if dest=D2, S will give up.
• Set A and C to 1? Then if dest=D1, D1 will get upset and reset the ring, thinking there’s a duplicate address
• Verdict? The standard says bridge should leave bits alone unless they are both 0, in which case the bridge should make A=0, C=1


Source Route

• 802 header expanded to contain a source route
• Source station puts route in packet header
• Stations “discover” route to each other
• Three types of packet (expanded 802 hdr tells which type of packet it is)
  - specifically routed: route is in header
  - all paths broadcast: send along all possible paths, collecting diary of travels
  - single copy broadcast: send along spanning tree to all LANs
• Stations (somehow) use all paths broadcast to find route
• Stations maintain a route cache

Encoding

802 packet
  dest  src  data

Source routing packet
  dest  src  RIF  data

source address multicast bit indicates RIF follows

Info in RIF (Route Information Field)
  - type (specifically routed, all paths broadcast, single copy broadcast)
  - length (even # between 2 and 30)
  - direction (forward or back)
  - largest frame
  - route


Encoding of Route

• Sequence of bridge IDs? Too long
• Name LANs and make it a sequence of LAN numbers?

[Figure: source S and destination D joined through LANs 7, 12, and 4, with parallel bridges between LANs; bridges labeled B1, B1, B2, B2, B3, B5, B7]

  - Suppose the route were 7,12,4. A bridge is supposed to forward it if that bridge connects to the next LAN in the route. What will happen?
• Conclusion: Give parallel bridge numbers. 12 bits for LAN, 4 bits for bridge, so route is LAN7,B2,LAN12,B1,LAN4

Route Information Field

[Figure: RIF layout with fields type, length of RIF, d, largest frame, route]

• type: specifically routed = 0xx, all paths explorer = 10x, single path explorer = 11x
• d = direction (1=reverse)
• largest frame: bottom 3 bits indicates one of 7 popular sizes. Top 3 bits gives intermediate values
• route: 12 bits for LAN, 4 for bridge
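The “12 bits for LAN, 4 for bridge” route encoding can be illustrated by packing the slide's example route, LAN7,B2,LAN12,B1,LAN4. The sketch assumes each hop is a 2-byte big-endian value with the LAN number in the upper 12 bits, and that the final LAN carries bridge number 0; neither detail is stated on the slide, and the function names are hypothetical.

```python
def pack_route(hops):
    """hops is a list of (lan, bridge) pairs; returns 2 bytes per hop."""
    out = bytearray()
    for lan, bridge in hops:
        if not (0 <= lan < 4096 and 0 <= bridge < 16):
            raise ValueError("LAN needs 12 bits, bridge needs 4 bits")
        out += ((lan << 4) | bridge).to_bytes(2, "big")
    return bytes(out)

def unpack_route(data):
    """Inverse of pack_route."""
    hops = []
    for i in range(0, len(data), 2):
        value = int.from_bytes(data[i:i + 2], "big")
        hops.append((value >> 4, value & 0x0F))
    return hops

# LAN7,B2,LAN12,B1,LAN4; the trailing bridge number 0 is an assumed filler.
example = pack_route([(7, 2), (12, 1), (4, 0)])
```

Four bits of bridge number allow at most 16 parallel bridges between any pair of LANs, and a 30-byte RIF limits how many hops a route can hold.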


Specifically Routed

• Assume bridge receives specifically routed packet on port that bridge assumes is LAN k.
• find k in route
• look at next LAN number in route (before or after, depending on d bit), say it’s j
• are any of your other ports configured to be LAN j? If no, drop the packet
• look at parallel bridge number between k and j in the route, say it’s n
• Are you configured to be bridge n between LAN k and LAN j? If no, drop packet.
• Check to make sure j isn’t in route multiple times. If so, drop packet.
• Now forward packet onto LAN j.

All Paths Broadcast

• Assume bridge receives all paths broadcast packet on port that bridge assumes is LAN k.
• check if final LAN number in collected route = k. If not, drop packet
• For each other port:
  - if that port is configured to be LAN j, check if j is already in collected route. If so, drop the packet. Otherwise add (parallel bridge #, j) to collected route and forward onto that port
  - if no hops yet in route (packet gotten directly from endnode), add (k,bridge #,j) to route before forwarding onto port for LAN j.
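The Specifically Routed checks translate almost line for line into code. In this sketch the route is held as two lists (LAN numbers, and the bridge numbers between consecutive LANs), and the bridge's configuration is a set of attached LANs plus a map from LAN pairs to its parallel bridge number. That representation and all names are assumptions made for illustration.

```python
def next_lan(lans, bridges, reverse, k, my_lans, my_bridge_numbers):
    """Return the LAN to forward onto, or None to drop the packet.

    lans:    LAN numbers in the route, e.g. [7, 12, 4]
    bridges: bridges[i] is the bridge number between lans[i] and lans[i + 1]
    reverse: the d bit (True means read the route backwards)
    k:       LAN number configured for the receiving port
    """
    if k not in lans:
        return None
    i = lans.index(k)                      # find k in route
    nxt = i - 1 if reverse else i + 1      # before or after, depending on d
    if not 0 <= nxt < len(lans):
        return None
    j = lans[nxt]
    if j == k or j not in my_lans:         # is another port configured as LAN j?
        return None
    n = bridges[min(i, nxt)]               # parallel bridge number between k and j
    if my_bridge_numbers.get(frozenset((k, j))) != n:
        return None                        # some other parallel bridge's job
    if lans.count(j) > 1:                  # j must not appear multiple times
        return None
    return j
```

For the route LAN7,B2,LAN12,B1,LAN4, only the bridge configured as number 2 between LANs 7 and 12 forwards the packet off LAN 7; a parallel bridge numbered 1 drops it.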


Single Copy Broadcast

• Bridges run the spanning tree algorithm
• Implement single copy broadcast just like all paths broadcast, but only accept these on links in the spanning tree and only forward onto links in the spanning tree

Bridge Configuration

• For each port, configure a LAN number
• For each pair of ports, configure a parallel bridge number, or use up hops in route with an “internal LAN”

[Figure: bridge B with ports on LANs 4, 12, 3, 7, 8, 51 and a bridge number per pair of ports: 4,12=B3; 4,3=B5; 4,7=B2; 4,51=B2; ...]

[Figure: the same LANs 4, 12, 3, 8, 7, 51, each joined by bridge number B1 to internal LAN 9]


End Station’s Job

• Generate additional header
• Decide when appropriate to send each type of pkt
• Choose a route based on received all paths broadcasts, or store routes from received specifically routed packets
• Decide when info in route cache is stale
• Make intelligent decisions when cache is overfull

Bridge Wars

• Transparency: Transparent bridges designed with the constraint that they can’t change end systems; otherwise, end systems could just have built in a network layer and worked with routers
• Transparent bridges confined to spanning tree, and forward pkts to unknown dest
• Source routing route discovery has exponential overhead, every time a pair of stations want to communicate
• Configuration
• Economics: original claim was that source routing bridges would be cheaper/faster. Not true. But even if it were, is it worth making endnodes more expensive?


Other Bridge Types

• SRT (source routing and transparent). Do both. (originally, TB must drop pkt if source multicast bit on, and SR must drop pkt if bit off)
• SR-TB (source route to transparent translational bridge). Sit between SR and TB region and translate.

Hubs/Switches

• Ethernet used to be a long bus
• Easier to wire, more robust if star topology: one huge multi-port repeater with pt-to-pt links
• Security feature: Learn or configure station addresses. Forward noise (after destination address) except on port where destination is
• If store and forward rather than repeater, then can have more aggregate bandwidth (A to B can simultaneously use whole bandwidth while C and D talk)
• Can cascade devices
• Should therefore do spanning tree
• We’ve reinvented the (transparent) bridge!


Virtual LANs

• “Broadcast domain”, i.e., separate LANs on the same switch, or LAN spread among switches
• Logically the same as two unconnected LANs
• Need a router to get between them
• Sometimes the switch acts as the router between the “virtual LANs”
• Separate by port or configuration simplest
• People have tried to infer the VLAN from layer 3 information
  - IP address (allows stations to move and retain IP address)
  - protocol family (e.g., IPX vs IP)

VLAN Tags

[Figure: switches q and r connected to each other, each with ports in VLAN A and VLAN B]

• Switch q has to tell switch r which VLAN to transmit the packet on
• Done by inserting extra header. Ethertype 81-00 indicates 2 byte “VLAN header” follows. (Or SNAP-encoded using 8 bytes instead of 2 to indicate tag exists)
• tag contains 3 bit “priority”, format flag, 12 bit VLAN ID
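The tag described on the VLAN Tags slide can be sketched as an insert and a parse. Assumptions made for illustration: the tag goes right after the source address, the 16 tag bits are laid out in the order the slide lists them (priority in the top 3 bits, then the format flag, then the VLAN ID), frames are passed without the FCS, and the function names are hypothetical. The SNAP-encoded variant is not covered.

```python
import struct

VLAN_ETHERTYPE = 0x8100  # "81-00" on the slide

def add_vlan_tag(frame, vlan_id, priority=0, format_flag=0):
    """Insert Ethertype 81-00 plus the 2-byte VLAN header after dest and src."""
    if not (0 <= vlan_id < 4096 and 0 <= priority < 8 and format_flag in (0, 1)):
        raise ValueError("VLAN ID is 12 bits, priority 3 bits, flag 1 bit")
    tag = (priority << 13) | (format_flag << 12) | vlan_id
    return frame[:12] + struct.pack("!HH", VLAN_ETHERTYPE, tag) + frame[12:]

def read_vlan_tag(frame):
    """Return the tag fields as a dict, or None if the frame is untagged."""
    ethertype, tag = struct.unpack("!HH", frame[12:16])
    if ethertype != VLAN_ETHERTYPE:
        return None
    return {
        "priority": tag >> 13,
        "format_flag": (tag >> 12) & 1,
        "vlan_id": tag & 0x0FFF,
    }
```

Tagging adds 4 bytes in total, so the receiving switch strips the tag again before delivering the packet to a station that does not understand it.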
