Chapter 7 Packet-Switching Networks
Network Services and Internal Network Operation Packet Network Topology Datagrams and Virtual Circuits Routing in Packet Networks Shortest Path Routing ATM Networks Traffic Management
Network Layer
Network Layer: the most complex layer. It requires the coordinated actions of multiple, geographically distributed network elements (switches & routers), and must be able to deal with very large scales: billions of users (people & communicating devices). Biggest challenges: Addressing: where should information be directed to? Routing: what path should be used to get information there?
Transfer of information as payload in data packets Packets undergo random delays & possible loss Different applications impose differing requirements on the transfer of information
Network Service
(Figure: end systems α and β exchange messages as transport-layer segments; the network layer, above the data link and physical layers, provides the network service to the transport layer.)
Network layer can offer a variety of services to transport layer Connection-oriented service or connectionless service Best-effort or delay/loss guarantees
Network Service vs. Operation
Network Service | Internal Network Operation | Example
Connectionless | Connectionless (datagram transfer) | IP
Connection-oriented (reliable and possibly constant bit rate transfer) | Connection-oriented | Telephone connection, ATM
Various combinations are possible Connection-oriented service over Connectionless operation Connectionless service over Connection-Oriented operation Context & requirements determine what makes sense
The End-to-End Argument
An end-to-end function is best implemented at a higher level than at a lower level End-to-end service requires all intermediate components to work properly Higher-level better positioned to ensure correct operation Example: stream transfer service Establishing an explicit connection for each stream across network requires all network elements (NEs) to be aware of connection; All NEs have to be involved in re- establishment of connections in case of network fault In connectionless network operation, NEs do not deal with each explicit connection and hence are much simpler in design
Network Layer Functions
What are essentials?
Routing: mechanisms for determining the set of best paths for routing packets requires the collaboration of network elements
Forwarding: transfer of packets from NE inputs to outputs
Priority & Scheduling: determining order of packet transmission in each NE Optional: congestion control, segmentation & reassembly, security
End-to-End Packet Network
Packet networks very different than telephone networks Individual packet streams are highly bursty Statistical multiplexing is used to concentrate streams User demand can undergo dramatic change Peer-to-peer applications stimulated huge growth in traffic volumes Internet structure highly decentralized Paths traversed by packets can go through many networks controlled by different organizations No single entity responsible for end-to-end service
Access Multiplexing
Access MUX To packet network
Packet traffic from users multiplexed at access to network into aggregated streams DSL traffic multiplexed at DSL Access Mux Cable modem traffic multiplexed at Cable Modem Termination System
Oversubscription
Access multiplexer: N subscribers connected @ c bps to the mux; each subscriber is active r/c of the time (average rate r). The mux has C = nc bps to the network. Oversubscription rate: N/n. Find n so that there is at most 1% overflow probability; the feasible oversubscription rate increases with size.
N    r/c    n    N/n
10   0.01   1    10    (10 extremely lightly loaded users)
10   0.05   3    3.3   (10 very lightly loaded users)
10   0.1    4    2.5   (10 lightly loaded users)
20   0.1    6    3.3   (20 lightly loaded users)
40   0.1    9    4.4   (40 lightly loaded users)
100  0.1    18   5.5   (100 lightly loaded users)
What is the probability that k users transmit at the peak rate?
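The 1%-overflow sizing above can be checked with a short binomial calculation (a sketch; it treats each subscriber as independently active with probability p = r/c, which is the model the table implies):

```python
from math import comb

# Overflow probability for the access mux: N users, each independently
# active a fraction p = r/c of the time; the mux can carry n at once.
def overflow_prob(N, p, n):
    return sum(comb(N, k) * p**k * (1 - p)**(N - k) for k in range(n + 1, N + 1))

def min_trunks(N, p, target=0.01):
    """Smallest n with at most `target` overflow probability."""
    return next(n for n in range(N + 1) if overflow_prob(N, p, n) <= target)

print(min_trunks(10, 0.01))  # 1  -> oversubscription rate 10
print(min_trunks(10, 0.05))  # 3  -> 3.3
print(min_trunks(10, 0.1))   # 4  -> 2.5
```

The small-N rows match the table exactly; for the larger N the exact binomial answer can differ from the table by a trunk or so depending on rounding. The probability that exactly k users transmit at the peak rate is the binomial term comb(N, k) · p^k · (1 − p)^(N − k).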
Home LANs
WiFi
Ethernet Home Router To packet network
How about network addressing? Home Router LAN Access using Ethernet or WiFi (IEEE 802.11) Private IP addresses in Home (192.168.0.x) using Network Address Translation (NAT) Single global IP address from ISP issued using Dynamic Host Configuration Protocol (DHCP)
Campus Network
(Figure: organization servers and a gateway router connect to a backbone of routers R and switches S; departmental servers hang off departmental routers.) Servers have redundant connectivity to the backbone. A high-speed campus backbone net connects the departmental routers. Only outgoing packets leave a LAN through its router.
Connecting to ISP
Internet service provider
Border routers
(Figure: campus network behind border routers.) Interdomain level: autonomous system or domain. Intradomain level: LAN network administered by a single organization running the same routing protocol.
Key Role of Routing
How to get packet from here to there? Decentralized nature of Internet makes routing a major challenge Interior gateway protocols (IGPs) are used to determine routes within a domain Exterior gateway protocols (EGPs) are used to determine routes across domains Routes must be consistent & produce stable flows Scalability required to accommodate growth Hierarchical structure of IP addresses essential to keeping size of routing tables manageable
Chapter 7 Packet-Switching Networks
Datagrams and Virtual Circuits
Packet Switching Network
A packet switching network transfers packets between users over transmission lines and packet switches (routers). Origin in message switching. Two modes of operation: connectionless and virtual circuit.
Message Switching
Message switching invented for telegraphy
Entire messages are multiplexed onto shared lines, stored & forwarded from source through message switches to destination. Headers carry source & destination addresses; routing is done at the message switches. Connectionless.
Transmission delay vs. propagation delay: transmit a 1000-byte message from LA to DC via a 1 Gbps network; signal propagation speed 2 × 10^8 m/s (200,000 km/s).
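Plugging in numbers (a sketch; the LA-to-DC distance of roughly 4000 km is an assumed figure, not given in the text):

```python
# Transmission vs. propagation delay for the example above.
# Assumed LA-to-DC distance of ~4000 km (not stated precisely in the slide).
msg_bits = 1000 * 8          # 1000 bytes
rate_bps = 1e9               # 1 Gbps
distance_m = 4_000_000       # ~4000 km
speed_mps = 2e8              # 200,000 km/s in fiber/copper

t_transmit = msg_bits / rate_bps        # time to push the bits onto the link
t_propagate = distance_m / speed_mps    # time for the signal to cross the country
print(t_transmit, t_propagate)          # 8 microseconds vs. 20 milliseconds
```

For a short message on a fast link, propagation delay dominates transmission delay by orders of magnitude.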
Message Switching Delay
(Timing diagram: the source and each switch transmit the entire message, transmission time T, over each of the three links, each with propagation delay τ.)
Minimum delay = 3τ + 3T
Additional queueing delays possible at each link
Long Messages vs. Packets
A 1 Mbit message is sent from source to destination over two hops, each link with bit error rate p = 10^-6. How many bits need to be transmitted to deliver the message?

Approach 1: send the 1 Mbit message as a unit.
Pc = (1 - 10^-6)^(10^6) ≈ e^-1 ≈ 1/3
On average it takes about 3 transmissions/hop.
Total # bits transmitted ≈ 6 Mbits.

Approach 2: send 10 100-kbit packets.
Pc' = (1 - 10^-6)^(10^5) ≈ e^-0.1 ≈ 0.9
On average it takes about 1.1 transmissions/hop per packet.
Total # bits transmitted ≈ 2.2 Mbits.
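The totals can be reproduced numerically (a sketch assuming the two-hop path implied by the totals above):

```python
# Expected total bits to deliver a 1 Mbit message, BER p = 1e-6 per link.
p = 1e-6

def expected_bits(msg_bits, pkt_bits, hops=2):
    p_ok = (1 - p) ** pkt_bits      # prob. one packet crosses one hop intact
    tx_per_hop = 1 / p_ok           # expected transmissions per packet per hop
    n_pkts = msg_bits // pkt_bits
    return hops * n_pkts * tx_per_hop * pkt_bits

print(expected_bits(10**6, 10**6))  # whole message: ~5.4 Mbits (slide rounds to 6)
print(expected_bits(10**6, 10**5))  # ten 100-kbit packets: ~2.2 Mbits
```

The exact expectation uses e^-1 ≈ 0.368 rather than the slide's rounded 1/3, which is why the message case comes out near 5.4 Mbits instead of 6.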
Packet Switching - Datagram
Messages broken into smaller units (packets). Source & destination addresses in packet header. Connectionless: packets routed independently (datagram). Packets may arrive out of order. Pipelining of packets across the network can reduce delay and increase throughput. Lower delay than message switching; suitable for interactive traffic.
Packet Switching Delay
Assume three packets corresponding to one message traverse the same path. (Timing diagram: packets 1, 2, 3 pipeline across the three links.) Minimum delay = 3τ + 5(T/3) (single path assumed). Additional queueing delays are possible at each link. Packet pipelining enables the message to arrive sooner.
Delay for k-Packet Mess. over L Hops
(Timing diagram: source, switch 1, switch 2, destination.)
3 hops: 3τ + 2(T/3) first bit received; 3τ + 3(T/3) first bit released; 3τ + 5(T/3) last bit released.
L hops: Lτ + (L-1)P first bit received; Lτ + LP first bit released; Lτ + LP + (k-1)P last bit released, where T = kP.
Routing Tables in Datagram Networks
Route determined by table lookup: the routing decision involves finding the next hop in the route to a given destination. The routing table has an entry for each destination specifying the output port that leads to the next hop; e.g., destination 0785 → port 7, 1345 → port 12, 1566 → port 6, 2458 → port 12. The size of the table becomes impractical for a very large number of destinations.
Example: Internet Routing
Internet protocol uses datagram packet switching across networks Networks are treated as data links
Hosts have two-part IP address: Network address + Host address
Routers do table lookup on network address This reduces size of routing table
In addition, network addresses are assigned so that they can also be aggregated (discussed as CIDR in Chapter 8).
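A toy sketch of lookup on the network part of a two-part address (the table contents are hypothetical; real IP forwarding uses longest-prefix match on aggregated CIDR prefixes, covered in Chapter 8):

```python
# Toy two-part address lookup: route on the network part only.
# Hypothetical table; real routers do longest-prefix match (CIDR).
routing_table = {          # network prefix -> output port
    "192.168.1": 7,
    "192.168.2": 12,
}

def next_port(ip):
    network = ip.rsplit(".", 1)[0]   # strip the host part of the dotted address
    return routing_table[network]    # one entry covers all hosts on that network

print(next_port("192.168.2.55"))  # 12
```

Keying the table on the network part means the table size scales with the number of networks, not the number of hosts.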
Packet Switching – Virtual Circuit
(Figure: all packets follow the same virtual circuit through the network.)
Call set-up phase sets up pointers along a fixed path through the network
All packets for a connection follow the same path
Abbreviated header identifies connection on each link
Packets queue for transmission
Variable bit rates possible, negotiated during call set-up
Delays variable, cannot be less than circuit switching
ATM Virtual Circuits
A VC is a connection with resources reserved.
A Cell is a small and fixed-size packet, delivered in order.
Routing in VC Subnet
Do VC subnets need the capability to route isolated packets from an arbitrary source to an arbitrary destination? No: label switching suffices.
Connection Setup
(Figure: a connect request propagates through switches SW 1, SW 2, …, SW n, and connect confirms flow back.)
Signaling messages propagate as route is selected
Signaling messages identify connection and setup tables in switches
Typically a connection is identified by a local tag, Virtual Circuit Identifier (VCI)
Each switch only needs to know how to relate an incoming tag in one input to an outgoing tag in the corresponding output
Once tables are set up, packets can flow along the path
Connection Setup Delay
(Timing diagram: the connect request (CR) propagates hop by hop, the connect confirm (CC) returns, and only then do packets 1, 2, 3 flow; a release follows.)
Connection setup delay is incurred before any packet can be transferred
Delay is acceptable for sustained transfer of large number of packets
This delay may be unacceptably high if only a few packets are being transferred
Virtual Circuit Forwarding Tables
Each input port of a packet switch has a forwarding table. Lookup the entry for the (per-link) VCI of the incoming packet, determine the output port (next hop), and insert the VCI for the next link. Example entries (input VCI → output port, output VCI): 12 → 13, 44; 15 → 15, 23; 27 → 13, 16; 58 → 7, 34. Very high speeds are possible (HW-based). The table can also include priority or other information about how the packet should be treated.
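A minimal sketch of the per-port lookup, using the (partly reconstructed) example entries from the table above:

```python
# Forwarding table at one input port of a VC switch:
# incoming VCI -> (output port, outgoing VCI).
# Entries reconstructed from the slide's table.
forwarding_table = {
    12: (13, 44),
    15: (15, 23),
    27: (13, 16),
    58: (7, 34),
}

def forward(vci, payload):
    out_port, out_vci = forwarding_table[vci]   # one exact-match lookup
    return out_port, (out_vci, payload)         # relabel the cell for the next link

print(forward(27, "data"))  # (13, (16, 'data'))
```

Because the lookup is a single exact match on a small label (not a search over destination addresses), it is easy to implement in hardware at very high speed.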
What are key pros & cons of VC switching vs. datagram switching?
Cut-Through Switching
(Timing diagram: each switch begins forwarding as soon as the header is processed.) Minimum delay = 3τ + T
Some networks perform error checking on header only, so packet can be forwarded as soon as header is received & processed
Delays reduced further with cut-through switching
Message vs. Packet Min. Delay
Message: L τ + L T = L τ + (L – 1) T + T
Packet: L τ + L P + (k – 1) P = L τ + (L – 1) P + T
Cut-Through Packet (Immediate forwarding after header) = L τ + T
Above neglect header processing delays
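The three minimum-delay formulas side by side, evaluated for the 3-hop, 3-packet example used earlier:

```python
# Minimum end-to-end delay: L hops, per-hop propagation delay tau,
# total message transmission time T, message split into k packets.
def msg_delay(L, tau, T):
    return L * tau + L * T                 # store-and-forward, whole message

def pkt_delay(L, tau, T, k):
    P = T / k                              # per-packet transmission time
    return L * tau + L * P + (k - 1) * P   # pipelined packets

def cut_through_delay(L, tau, T):
    return L * tau + T                     # forward as soon as header arrives

L, tau, T, k = 3, 1e-3, 3e-3, 3            # illustrative numbers: 1 ms, 3 ms
print(msg_delay(L, tau, T))                # 3*tau + 3T
print(pkt_delay(L, tau, T, k))             # 3*tau + 5(T/3)
print(cut_through_delay(L, tau, T))        # 3*tau + T
```

With these numbers, packetization cuts the store-and-forward delay from 12 ms to 8 ms, and cut-through to 6 ms, matching the formulas on the slides.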
Comparison of VC and Datagram Subnets
(Figure 5-4: comparison table of virtual-circuit and datagram subnets.)
Chapter 7 Packet-Switching Networks
Router Architecture Overview
Two key router functions: run routing algorithms/protocols (RIP, OSPF, BGP); forward datagrams from incoming to outgoing link.
Network Layer 4-1
Input Port Functions
Physical layer: bit-level reception. Data link layer: e.g., Ethernet (see chapter 5). Decentralized switching: given the datagram destination, lookup the output port using the forwarding table in input port memory; goal: complete input port processing at ‘line speed’; queuing: if datagrams arrive faster than the forwarding rate into the switch fabric.
Three types of switching fabrics
Switching Via Memory
First generation routers: traditional computers with switching under direct control of CPU packet copied to system’s memory speed limited by memory bandwidth (2 bus crossings per datagram)
(Figure: input port → memory → output port, across the system bus.)
Switching Via a Bus
datagram from input port memory to output port memory via a shared bus bus contention: switching speed limited by bus bandwidth 32 Gbps bus, Cisco 5600: sufficient speed for access and enterprise routers
Switching Via An Interconnection Network
overcome bus bandwidth limitations Banyan networks, other interconnection nets initially developed to connect processors in multiprocessor advanced design: fragmenting datagram into fixed length cells, switch cells through the fabric. Cisco 12000: switches 60 Gbps through the interconnection network
Output Ports
Buffering required when datagrams arrive from fabric faster than the transmission rate Scheduling discipline chooses among queued datagrams for transmission
Output port queueing
buffering when arrival rate via switch exceeds output line speed queueing (delay) and loss due to output port buffer overflow!
How much buffering?
RFC 3439 rule of thumb: average buffering equal to “typical” RTT (say 250 msec) times link capacity C. E.g., for a C = 10 Gbps link: 2.5 Gbit buffer. Recent recommendation: with N flows, buffering equal to RTT · C / √N.
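Both sizing rules in one helper (a sketch; the 10,000-flow count is an arbitrary illustrative number):

```python
from math import sqrt

# Router buffer sizing: classic RTT*C rule vs. the RTT*C/sqrt(N) refinement.
def buffer_bits(rtt_s, capacity_bps, n_flows=1):
    return rtt_s * capacity_bps / sqrt(n_flows)

C, rtt = 10e9, 0.250                  # 10 Gbps link, 250 ms "typical" RTT
print(buffer_bits(rtt, C))            # 2.5e9 bits = 2.5 Gbit, the classic rule
print(buffer_bits(rtt, C, 10_000))    # 100x smaller with 10,000 flows
```

With many desynchronized flows, the sqrt(N) rule shrinks the required buffer dramatically: here from 2.5 Gbit to 25 Mbit.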
Input Port Queuing
Fabric slower than input ports combined -> queueing may occur at input queues Head-of-the-Line (HOL) blocking: queued datagram at front of queue prevents others in queue from moving forward queueing delay and loss due to input buffer overflow!
Chapter 7 Packet-Switching Networks
Routing in Packet Networks
(Figure: six-node example network; each node is a switch or router.)
Three possible (loop-free) routes from 1 to 6: 1-3-6, 1-4-5-6, 1-2-5-6. Which is “best”? What is the objective function for optimization? Min delay? Min hop? Max bandwidth? Min cost? Max reliability?
Creating the Routing Tables
Need information on state of links Link up/down; congested; delay or other metrics Need to distribute link state information using a routing protocol What information is exchanged? How often? How to exchange with neighbors? Need to compute routes based on information Single metric; multiple metrics Single route; alternate routes
Routing Algorithm Requirements
Correctness
Responsiveness Topology or bandwidth changes, congestion
Optimality Resource utilization, path length
Robustness Continues working under high load, congestion, faults, equipment failures, incorrect implementations
Simplicity Efficient software implementation, reasonable processing load
Fairness
Centralized vs Distributed Routing
Centralized Routing All routes determined by a central node All state information sent to central node Problems adapting to frequent topology changes What is the problem? Does not scale
Distributed Routing: routes determined by routers using a distributed algorithm; state information exchanged by routers; adapts to topology and other changes; better scalability, but routes may be inconsistent, leading to loops.
Static vs Dynamic Routing
Static Routing Set up manually, do not change; requires administration Works when traffic predictable & network is simple Used to override some routes set by dynamic algorithm Used to provide default router
Dynamic Routing Adapt to changes in network conditions Automated Calculates routes based on received updated network state information
Flooding
Send a packet to all nodes in a network Useful in starting up network or broadcast No routing tables available Need to broadcast packet to all nodes (e.g. to propagate link state information)
Approach: send an incoming packet out on all ports except the one where it arrived. Exponential growth in packet transmissions.
Flooding Example
(Figure: six-node example network.)
Is flooding static or adaptive? What is the major problem, and how to handle it? What are the main nice properties of flooding?
How can flooding be terminated? The major problem is duplicates (an infinite number, due to loops); the remedies are a hop count (TTL) and a sequence number with a counter per source router. Nice properties: robustness, and packets always follow (among other routes) the shortest path.
Flooding Example
Flooding is initiated from Node 1: Hop 2 transmissions
Flooding Example
Flooding is initiated from Node 1: Hop 3 transmissions
Limited Flooding
A Time-to-Live field in each packet limits the number of hops to a certain diameter. Each switch adds its ID before flooding and discards repeats. The source puts a sequence number in each packet; each switch records (source address, sequence number) pairs and discards repeats.
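A small sketch of limited flooding on the six-node example network (the graph is taken from the routes listed earlier: 1-3-6, 1-4-5-6, 1-2-5-6), combining duplicate suppression with a hop limit:

```python
from collections import deque

# Six-node example network from the routing slides.
graph = {1: [2, 3, 4], 2: [1, 5], 3: [1, 6], 4: [1, 5], 5: [2, 4, 6], 6: [3, 5]}

def flood(src, ttl):
    seen = {src}                      # (source, seq#) duplicate suppression
    transmissions = 0
    frontier = deque([(src, None, ttl)])
    while frontier:
        node, came_from, hops = frontier.popleft()
        if hops == 0:
            continue                  # TTL exhausted: stop forwarding
        for nbr in graph[node]:
            if nbr == came_from:
                continue              # never send back on the arrival port
            transmissions += 1        # every copy consumes one link's bandwidth
            if nbr not in seen:       # only the first copy is forwarded onward
                seen.add(nbr)
                frontier.append((nbr, node, hops - 1))
    return len(seen), transmissions

print(flood(1, 3))  # (6, 9): all six nodes reached with 9 link transmissions
```

Without the duplicate check, copies would circulate around the loops indefinitely; with it, bandwidth cost stays bounded by roughly the number of links.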
Limited Flooding Example
Suppose the following network uses flooding as the routing algorithm. If a packet sent by A to D has a maximum hop count of 3, list all the routes it will take. Also tell how many hops' worth of bandwidth it consumes. Assume the bandwidth of all the lines is the same.
B C
A D E
F G
Shortest Paths & Routing
Many possible paths connect any given source to any given destination
Routing involves the selection of the path to be used to accomplish a given transfer
Typically it is possible to attach a cost or distance to a link connecting two nodes
Routing can then be posed as a shortest path problem
Routing Metrics
Means for measuring desirability of a path Path Length = sum of costs or distances Possible metrics Hop count: rough measure of resources used Reliability: link availability Delay: sum of delays along path; complex & dynamic Bandwidth: “available capacity” in a path Load: Link & router utilization along path Cost: $$$
An Example: Sink Tree
Sink tree: the minimum-hop (shortest) paths from all nodes to a given root form a tree.
(a) A subnet. (b) A sink tree for router B. Q1: must a sink tree be unique? An example? Q2: will each packet be delivered within a finite # of hops?
Shortest Path Approaches
Distance Vector Protocols: neighbors exchange lists of distances to destinations; the best next hop is determined for each destination; distributed Bellman-Ford shortest path algorithm. Link State Protocols: link state information is flooded to all routers; routers have complete topology information; shortest paths (& hence next hops) are calculated with Dijkstra's (centralized) shortest path algorithm.
Distance Vector Do you know the way to San Jose?
(Figure: signposts at different nodes give the distance to San Jose: 596, 294, 250, 392.)
Distance Vector
Local signpost: direction and distance. Table synthesis: neighbors exchange table entries; determine the current best next hop; inform neighbors periodically and after changes. Routing table: for each destination, list the next node and distance (dest, next, dist).
Shortest Path to SJ: focus on how nodes find their shortest path to a given destination node, i.e., SJ (San Jose).
If Di is the shortest distance to SJ from node i and j is the neighbor on that shortest path, then Di = Cij + Dj.
But we don’t know the shortest paths: node i only has local info from its neighbors j', j, j''. It picks the current shortest path: Di = min over neighbors j of (Cij + Dj).
Why Distance Vector Works SJ sends accurate info
(Figure: rings of nodes 1, 2, and 3 hops from SJ.) Hop-1 nodes calculate the current (next hop, distance) and send it to their neighbors; accurate info about SJ ripples across the network, and the shortest paths converge.
Distance Vector Routing (RIP)
Routing operates by having each router maintain a vector table giving the best known distance to each destination and which line to use to get there. The tables are updated by exchanging information with the neighbors.
Vector table: one entry for each router in the subnet; each entry contains two parts: preferred outgoing line to use for that destination and an estimate of the time or distance to the destination.
The router is assumed to know the distance/cost to each neighbor; it updates the vector table periodically by exchanging it with its neighbors.
Possible link metrics: # hops; delay (measured with ECHO packets); capacity; congestion.
An Example
(a) A subnet. (b) Input from A, I, H, K, and the new routing table for J.
Bellman-Ford Algorithm
Consider computations for one destination d Initialization Each node table has 1 row for destination d
Distance of node d to itself is zero: Dd=0
Distance of other node j to d is infinite: Dj=∝, for j≠ d Next hop node nj = -1 to indicate not yet defined for j ≠ d Send Step Send new distance vector to immediate neighbors across local link Receive Step At node i, find the next hop that gives the minimum distance to d,
Di = minj { Cij + Dj }. Replace the old (ni, Di) by the new (ni*, Di*) if the next node or distance changed. Go to the send step.
Bellman-Ford Algorithm
Now consider parallel computations for all destinations d Initialization Each node has 1 row for each destination d
Distance of node d to itself is zero: Dd(d)=0 Distance of other node j to d is infinite: Dj(d)= ∝ , for j ≠ d Next node nj = -1 since not yet defined Send Step Send new distance vector to immediate neighbors across local link Receive Step For each destination d, find the next hop that gives the minimum distance to d,
Di(d) = minj { Cij + Dj(d) }. Replace the old (ni, Di(d)) by the new (ni*, Di*(d)) if a new next node or distance is found. Go to the send step.
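The send/receive steps can be simulated with synchronous rounds on the six-node example network; the link costs below are read off the figure, so treat them as a best-effort reconstruction:

```python
INF = float("inf")

# Synchronous Bellman-Ford iterations toward destination 6 on the example
# network (costs reconstructed from the figure: 1-2:3, 1-3:2, 1-4:5,
# 2-4:1, 2-5:4, 3-4:2, 3-6:1, 4-5:3, 5-6:2).
cost = {(1, 2): 3, (1, 3): 2, (1, 4): 5, (2, 4): 1, (2, 5): 4,
        (3, 4): 2, (3, 6): 1, (4, 5): 3, (5, 6): 2}
cost.update({(v, u): c for (u, v), c in list(cost.items())})  # links are bidirectional
nbrs = {i: [v for (u, v) in cost if u == i] for i in range(1, 7)}

dest = 6
D = {i: (0 if i == dest else INF) for i in range(1, 7)}
for it in range(1, 4):
    # each node takes the best neighbor estimate plus the link cost
    D = {i: 0 if i == dest else min(cost[i, j] + D[j] for j in nbrs[i])
         for i in range(1, 7)}
    print(it, D)   # by iteration 3: D1=3, D2=4, D3=1, D4=3, D5=2
```

The per-iteration distances match the table on the following slides: nodes 3 and 5 learn their distances in round 1, nodes 1 and 4 in round 2, and node 2 settles on (4, 4) in round 3.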
29 Iteration Node 1 Node 2 Node 3 Node 4 Node 5 Initial (-1, ∞) (-1, ∞) (-1, ∞) (-1, ∞) (-1, ∞) 1 2 3
(Figure: six-node example network with link costs; table entries at nodes 1 and 3 are for destination SJ, node 6.)
Iteration Node 1 Node 2 Node 3 Node 4 Node 5 Initial (-1, ∞) (-1, ∞) (-1, ∞) (-1, ∞) (-1, ∞) 1 (-1, ∞) (-1, ∞) (6,1) (-1, ∞) (6,2) 2 3
(Figure: after iteration 1, D3 = D6 + 1 = 1 with n3 = 6 and D5 = D6 + 2 = 2 with n5 = 6; D6 = 0.)
30 Iteration Node 1 Node 2 Node 3 Node 4 Node 5 Initial (-1, ∞) (-1, ∞) (-1, ∞) (-1, ∞) (-1, ∞) 1 (-1, ∞) (-1, ∞) (6, 1) (-1, ∞) (6,2) 2 (3,3) (5,6) (6, 1) (3,3) (6,2) 3
3 1 2 3 1 1 5 2 3 0 4 San 3 6 1 3 Jose 2 5 2 4 6 2
Iteration Node 1 Node 2 Node 3 Node 4 Node 5 Initial (-1, ∞) (-1, ∞) (-1, ∞) (-1, ∞) (-1, ∞) 1 (-1, ∞) (-1, ∞) (6, 1) (-1, ∞) (6,2) 2 (3,3) (5,6) (6, 1) (3,3) (6,2) 3 (3,3) (4,4) (6, 1) (3,3) (6,2)
Nodes 1, 4, 5 process the new entry from node 2, but no new finding; Shorting path tree to node 6 is constructed. 1 3 2 3 1 1 5 2 3 0 4 San 3 6 1 3 Jose 2 5 2 4 6 4 2
31 Iteration Node 1 Node 2 Node 3 Node 4 Node 5 Initial (3,3) (4,4) (6, 1) (3,3) (6,2) 1 (3,3) (4,4) (4, 5) (3,3) (6,2) 2 3
(Figure: the network after link 3-6 fails.) Network disconnected (3-6 fails); a loop is created between nodes 3 and 4.
Iteration Node 1 Node 2 Node 3 Node 4 Node 5 Initial (3,3) (4,4) (6, 1) (3,3) (6,2) 1 (3,3) (4,4) (4, 5) (3,3) (6,2) 2 (3,7) (4,4) (4, 5) (5,5) (6,2) 3
5 7 3 2 3 1 1 5 5 2 3 0 4 San 3 1 3 6 Jose 2 2 5 4 4 2
Node 4 could have chosen 2 as next node because of tie
32 Iteration Node 1 Node 2 Node 3 Node 4 Node 5 Initial (3,3) (4,4) (6, 1) (3,3) (6,2) 1 (3,3) (4,4) (4, 5) (3,3) (6,2) 2 (3,7) (4,4) (4, 5) (5,5) (6,2) 3 (3,7) (4,6) (4, 7) (5,5) (6,2)
5 7 7 2 3 1 1 5 2 5 0 4 San 3 6 1 3 Jose 2 5 2 4 2 4 6 Node 2 could have chosen 5 as next node because of tie
Iteration Node 1 Node 2 Node 3 Node 4 Node 5 1 (3,3) (4,4) (4, 5) (3,3) (6,2) 2 (3,7) (4,4) (4, 5) (2,5) (6,2) 3 (3,7) (4,6) (4, 7) (5,5) (6,2) 4 (2,9) (4,6) (4, 7) (5,5) (6,2) 5 (2,9) (4,6) (4, 7) (5,5) (6,2)
7 7 9 2 3 1 1 5 2 5 0 4 San 3 1 3 6 Jose 2 5 2 6 4 2
Node 1 could have chosen 3 as next node because of tie
Count-to-Infinity Problem
Distance vector routing converges quickly to good news but slowly to bad news.
When A comes back up, B knows A is 1 hop away while all the other routers still think A is down. Why?
What is the spreading rate of good news? Does B know that C’s path runs through B?
How many exchanges are needed in an N-hop subnet? Why is the spreading rate of bad news so slow? What is the core problem?
Problem: Bad News Travels Slowly
Remedies Split Horizon Do not report route to a destination to the neighbor from which route was learned
Poisoned Reverse Report route to a destination to the neighbor from which route was learned, but with infinite distance Breaks erroneous direct loops immediately Does not work on some indirect loops
10/14/2013
Counting to Infinity Problem
(a) Chain 1-2-3-4 with unit link costs. (b) Link 3-4 breaks (X). Nodes come to believe the best path is through each other. (Destination is node 4.)
Update Node 1 Node 2 Node 3
Before break (2,3) (3,2) (4, 1) After break (2,3) (3,2) (2,3) 1 (2,3) (3,4) (2,3) 2 (2,5) (3,4) (2,5) 3 (2,5) (3,6) (2,5) 4 (2,7) (3,6) (2,7) 5 (2,7) (3,8) (2,7) … … … …
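The counting-up rows of the table can be reproduced with a few synchronous exchanges (a sketch; no split horizon):

```python
# 4-node chain 1-2-3-4, destination 4, just after link 3-4 breaks.
# Without split horizon, node 3 falls back on node 2's stale route,
# and distances creep upward by 2 every other exchange.
links = {1: [2], 2: [1, 3], 3: [2]}   # remaining topology (node 4 unreachable)
dist = {1: 3, 2: 2, 3: 3}             # the "After break" row of the table

def exchange(dist):
    # each node takes the best neighbor estimate plus one hop, synchronously
    return {i: min(1 + dist[j] for j in links[i]) for i in dist}

for _ in range(4):
    dist = exchange(dist)
    print(dist)   # 3/4/3, then 5/4/5, 5/6/5, 7/6/7: counting to infinity
```

Each printed round matches a row of the table above; left to run, the distances grow without bound, which is exactly the count-to-infinity problem.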
Split Horizon with Poison Reverse
(a) Chain 1-2-3-4 with unit link costs. (b) Link 3-4 breaks (X). Nodes would otherwise believe the best path is through each other.
Update | Node 1 | Node 2 | Node 3
Before break: (2, 3) (3, 2) (4, 1)
After break: (2, 3) (3, 2) (-1, ∞). Node 2 advertises its route to 4 to node 3 as having distance infinity; node 3 finds there is no route to 4.
1: (2, 3) (-1, ∞) (-1, ∞). Node 1 advertises its route to 4 to node 2 as having distance infinity; node 2 finds there is no route to 4.
2: (-1, ∞) (-1, ∞) (-1, ∞). Node 1 finds there is no route to 4.
Link-State Algorithm
Basic idea: a two-stage procedure. First, each source node gets a map of all nodes and link metrics (link state) of the entire network: learn who the neighbors are and what the delays to them are; construct a link state packet and deliver it to the others. Second, find the shortest paths on the map from the source node to all destination nodes: Dijkstra’s algorithm.
Broadcast of link-state information Every node i in the network broadcasts to every other node in the network:
ID’s of its neighbors: Ni=set of neighbors of i
Distances to its neighbors: {Cij | j ∈Ni} Flooding is a popular method of broadcasting packets
Building Link State Packets
A link state packet starts with the ID of the sender, a seq#, and an age, followed by a list of neighbors with delay information.
(a) A subnet. (b) The link state packets for this subnet.
When to build the link state packets? Periodically, or when significant event occurs.
Distributing the Link State Packets
Flooding is used to distribute the link state packets.
What is the major problem with flooding?
How to handle the problem? (source router, sequence number)
How to make the sequence number unique? 32-bit sequence number
What happens if a router crashes, losing track of its sequence number, and starts again?
What happens if a sequence number is corrupted, say to 65,540 instead of 4? An age field (in seconds; usually a new packet arrives within 10 sec) expires stale entries. All packets should be ACKed.
A Packet Buffer
Shortest Paths in Dijkstra’s Algorithm
(Figure: step-by-step execution of Dijkstra’s algorithm on the six-node example network, adding one node to N per step.)
Iteration | N | D2 D3 D4 D5 D6
Initial | {1} | 3 2 5 ∞ ∞
1 | {1,3} | 3 2 4 ∞ 3
2 | {1,2,3} | 3 2 4 7 3
3 | {1,2,3,6} | 3 2 4 5 3
4 | {1,2,3,4,6} | 3 2 4 5 3
5 | {1,2,3,4,5,6} | 3 2 4 5 3
Dijkstra’s algorithm
N: set of nodes for which shortest path already found Initialization: (Start with source node s)
N = {s}, Ds = 0, “s is distance zero from itself”
Dj=Csj for all j ≠ s, distances of directly-connected neighbors Step A: (Find next closest node i) Find i ∉ N such that
Di = min{ Dj : j ∉ N }. Add i to N. If N contains all the nodes, stop. Step B: (update minimum costs) for each node j ∉ N: Dj = min(Dj, Di + Cij), the minimum distance from s to j through nodes in N. Go to Step A.
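A compact sketch of the same algorithm (using a heap instead of the linear Step A scan) on the example network; the edge costs are read off the figure, so treat them as a reconstruction:

```python
import heapq

# Six-node example network; costs reconstructed from the figure.
edges = [(1, 2, 3), (1, 3, 2), (1, 4, 5), (2, 4, 1), (2, 5, 4),
         (3, 4, 2), (3, 6, 1), (4, 5, 3), (5, 6, 2)]
adj = {}
for u, v, c in edges:
    adj.setdefault(u, []).append((v, c))
    adj.setdefault(v, []).append((u, c))   # undirected links

def dijkstra(src):
    dist, heap = {src: 0}, [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue                        # stale heap entry, skip
        for v, c in adj[u]:
            if d + c < dist.get(v, float("inf")):
                dist[v] = d + c             # Step B: Dj = min(Dj, Di + Cij)
                heapq.heappush(heap, (dist[v], v))
    return dist

print(dict(sorted(dijkstra(1).items())))  # {1: 0, 2: 3, 3: 2, 4: 4, 5: 5, 6: 3}
```

The final distances from node 1 reproduce the last row of the iteration table above: D2=3, D3=2, D4=4, D5=5, D6=3.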
Reaction to Failure
If a link fails, Router sets link distance to infinity & floods the network with an update packet All routers immediately update their link database & recalculate their shortest paths Recovery very quick
But watch out for old update messages Add time stamp or sequence # to each update message Check whether each received update message is new If new, add it to database and broadcast If older, send update message on arriving link
Why is Link State Better?
Fast, loopless convergence Support for precise metrics, and multiple metrics if necessary (throughput, delay, cost, reliability) Support for multiple paths to a destination algorithm can be modified to find best two paths More flexible, e.g., source routing
What is the memory required to store the input data for a subnet with n routers – each of them has k neighbors? But not critical today!
Source Routing vs. Hop-by-Hop
Source host selects path that is to be followed by a packet Strict: sequence of nodes in path inserted into header Loose: subsequence of nodes in path specified Intermediate switches read next-hop address and remove address Or maintained for the reverse path Source routing allows the host to control the paths that its information traverses in the network Potentially the means for customers to select what service providers they use
Example
(Figure: source host A sends a packet with the explicit path 1,3,6,B in its header; each switch reads and removes its own entry, so the header shrinks 1,3,6,B → 3,6,B → 6,B → B at destination host B.)
How is the path learned? The source host needs link state information.
Chapter 7 Packet-Switching Networks
Traffic Management Packet Level Flow Level Flow-Aggregate Level
QOS in IP Networks
IETF groups are working on proposals to provide QOS control in IP networks, i.e., going beyond best effort to provide some assurance for QOS Work in Progress includes RSVP, Differentiated Services, and Integrated Services Simple model for sharing and congestion studies:
QoS #1
Principles for QOS Guarantees
Consider a phone application at 1Mbps and an FTP application sharing a 1.5 Mbps link. bursts of FTP can congest the router and cause audio packets to be dropped. want to give priority to audio over FTP PRINCIPLE 1: Marking of packets is needed for router to distinguish between different classes; and new router policy to treat packets accordingly
Principles for QOS Guarantees (more)
Applications misbehave (audio sends packets at a rate higher than 1Mbps assumed above); PRINCIPLE 2: provide protection (isolation) for one class from other classes Require Policing Mechanisms to ensure sources adhere to bandwidth requirements; Marking and Policing need to be done at the edges:
Principles for QOS Guarantees (more)
Alternative to Marking and Policing: allocate a set portion of bandwidth to each application flow; can lead to inefficient use of bandwidth if one of the flows does not use its allocation PRINCIPLE 3: While providing isolation, it is desirable to use resources as efficiently as possible
Principles for QOS Guarantees (more)
Traffic beyond the link capacity cannot be supported, e.g., two phone calls each requesting 1 Mbps on a 1.5 Mbps link. PRINCIPLE 4: A call admission process is needed; an application flow declares its needs, and the network may block the call if it cannot satisfy them.
Traffic Management
An analogy between vehicular and packet traffic management:
- Vehicular traffic management: traffic lights & signals control the flow of traffic in the city street system; the objective is to maximize flow with tolerable delays. Priority services: police sirens, cavalcades for dignitaries, bus & high-usage lanes, trucks allowed only at night.
- Packet traffic management: multiplexing & access control mechanisms control the flow of packet traffic; the objective is to make efficient use of network resources & deliver QoS. Priority: fault-recovery packets, real-time traffic, enterprise (high-revenue) traffic, high-bandwidth traffic.
Time Scales & Granularities
- Packet Level: queueing & scheduling at multiplexing points; determines the relative performance offered to packets over a short time scale (microseconds)
- Flow Level: management of traffic flows & resource allocation to ensure delivery of QoS (milliseconds to seconds); matching traffic flows to the resources available; congestion control
- Flow-Aggregate Level: routing of aggregate traffic flows across the network for efficient utilization of resources and meeting of service levels; "traffic engineering", at a scale of minutes to days
End-to-End QoS
[Figure: a packet traverses a chain of N multiplexing points (1, 2, …, N−1, N), each with its own packet buffer]
A packet traversing network encounters delay and possible loss at various multiplexing points End-to-end performance is accumulation of per-hop performances
How can the end-to-end delay be kept under some upper bound? What about jitter and loss?
Scheduling & QoS
- End-to-End QoS & Resource Control: buffer & bandwidth control → performance; admission control to regulate the traffic level
- Scheduling Concepts: fairness/isolation; priority, aggregation
- Fair Queueing & Variations: WFQ, PGPS
- Guaranteed Service: WFQ, rate control
- Packet Dropping: aggregation, drop priorities
FIFO Queueing
[Figure: arriving packets share a single packet buffer feeding the transmission link; packets are discarded when the buffer is full]
- All packet flows share the same buffer
- Transmission discipline: first-in, first-out
- Buffering discipline: discard arriving packets if the buffer is full (alternatives: random discard; push out the head-of-line, i.e., oldest, packet)
- What about aggressiveness vs. fairness?
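The discard-when-full (tail drop) discipline can be sketched as follows (a minimal model, not from the text):

```python
# Minimal sketch of FIFO queueing with tail drop: all flows share one
# buffer; an arriving packet is discarded when the buffer is full.
from collections import deque

def fifo_arrive(queue, packet, capacity):
    """Return True if the packet is enqueued, False if discarded."""
    if len(queue) >= capacity:
        return False                 # buffer full: discard arriving packet
    queue.append(packet)
    return True

q = deque()
accepted = [fifo_arrive(q, p, capacity=3) for p in range(5)]
# the first three packets are accepted; the rest are dropped at the tail
```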
FIFO Queueing
- Cannot provide differential QoS to different packet flows; different packet flows interact strongly
- Statistical delay guarantees via load control: restrict the number of flows allowed (connection admission control); difficult to determine the performance delivered
- The finite buffer determines the maximum possible delay
- The buffer size determines the loss probability, but this also depends on arrival & packet-length statistics
- Variation: packet enqueueing based on queue thresholds; some packet flows encounter blocking before others (higher loss, lower delay)
FIFO w/o and w/ Discard Priority
[Figure: (a) FIFO with a single packet buffer, discarding when full; (b) FIFO with discard priority — Class 2 packets are discarded when a queue threshold is exceeded, Class 1 packets only when the buffer is full]
HOL Priority Queueing
[Figure: head-of-line (HOL) priority queueing — separate high-priority and low-priority queues feed the transmission link; low-priority packets are served only when the high-priority queue is empty; each queue discards arrivals when full]
- The high-priority queue is serviced until empty, so it has lower waiting time
- Buffers can be dimensioned for different loss probabilities
- A surge in the high-priority queue can cause the low-priority queue to saturate
HOL Priority Features
- Provides differential QoS (note: packet labeling is needed)
- Pre-emptive priority: lower classes are invisible to higher classes; non-preemptive priority: lower classes impact higher classes through residual service times
- Strict priority vs. waiting-time priority (WTP)
- High-priority classes can hog all of the bandwidth & starve lower-priority classes
- Need to provide some isolation between classes, e.g., limits on per-class loads
Fair Queuing / Generalized Processor Sharing (GPS)
[Figure: fair queueing — packet flows 1 through n each have their own queue; service approximates bit-level round robin over the non-empty queues onto a transmission link of C bits/second]
- Each flow has its own logical queue: prevents hogging; allows differential loss probabilities
- C bits/sec is allocated equally among the non-empty queues: transmission rate = C / n(t), where n(t) = number of non-empty queues
- The idealized system assumes fluid flow from the queues
- Implementation requires an approximation: simulate the fluid system and sort packets according to their completion times in the ideal system
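A sketch of that approximation, under the simplifying assumption that all packets are already queued at t=0 (so full virtual-time bookkeeping can be skipped; only the tag order matters, not the tag values):

```python
# Packet-by-packet fair queueing via finish tags (equal weights).
# Assumes every packet is in its queue at t=0, so each flow's tags are
# just cumulative packet lengths; service order = increasing tag,
# which matches completion order in the fluid system.

def service_order(flows):
    """flows: dict flow_id -> list of packet lengths (service units).
    Returns flow ids in the order their packets would be transmitted."""
    tagged = []
    for fid, lengths in flows.items():
        tag = 0.0
        for seq, length in enumerate(lengths):
            tag += length            # finish tag of this packet for its flow
            tagged.append((tag, fid, seq))
    return [fid for _, fid, _ in sorted(tagged)]

# e.g. a 2-unit packet in buffer 1 and a 1-unit packet in buffer 2:
order = service_order({1: [2.0], 2: [1.0]})
# buffer 2's packet finishes first in the fluid system, so it is sent first
```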
Fair Queuing – Example 1
Two unit-length packets arrive at t=0, one in buffer 1 and one in buffer 2.
- Fluid-flow system: from t=0 both packets are served at rate 1/2 (overall rate 1 unit/second); both complete service at t=2.
- Packet-by-packet system: the packet from buffer 1 is served first at rate 1 (completing at t=1) while the packet from buffer 2 waits; then the packet from buffer 2 is served at rate 1, completing at t=2.
Fair Queuing – Example 2
A 2-unit packet arrives in buffer 1 and a 1-unit packet in buffer 2, both at t=0.
- Fluid-flow system: from t=0 both packets are served at rate 1/2; the packet from buffer 2 completes at t=2, after which the packet from buffer 1 is served at rate 1, completing at t=3. The service rate is the reciprocal of the number of active buffers at the time. (Within a buffer, service is still FIFO.)
- Packet-by-packet fair queueing: the packet from buffer 2 has the earlier completion time in the fluid system, so it is served first at rate 1 (completing at t=1) while the packet from buffer 1 waits; the packet from buffer 1 is then served at rate 1, completing at t=3.
Weighted Fair Queueing (WFQ)
Buffer 1 has weight 1/4 and buffer 2 has weight 3/4; a unit-length packet arrives in each at t=0.
- Fluid-flow system: from t=0 the packet from buffer 1 is served at rate 1/4 and the packet from buffer 2 at rate 3/4; the packet from buffer 2 completes at t=4/3, after which the packet from buffer 1 is served at rate 1, completing at t=2.
- Packet-by-packet weighted fair queueing: the packet from buffer 2 is served first at rate 1 (completing at t=1) while the packet from buffer 1 waits; then the packet from buffer 1 is served at rate 1, completing at t=2.
Example
From FIFO to fair queueing: packet-by-packet approximation vs. bit-by-bit (ideal) service.
(a) A router with five packets queued for line O. (b) Finishing times for the five packets.
Packetized GPS/WFQ
[Figure: packetized WFQ — arriving packets pass through a tagging unit into a sorted packet buffer feeding the transmission link; packets are discarded when the buffer is full]
- Compute each packet's completion time in the ideal (fluid) system
- Add a tag to the packet
- Sort packets in the queue according to their tags
- Serve the head-of-line packet
WFQ and its many variations form the basis for providing QoS in packet networks.
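A sketch of such a tagging unit. The simplified tag rule F = F_prev + L/weight is illustrative; a full implementation also tracks GPS virtual time across busy and idle periods:

```python
# Simplified WFQ tagging unit: each flow keeps a running finish tag
# F = F_prev + L/weight; the sorted buffer serves the smallest tag first.
import heapq

class Tagger:
    def __init__(self, weights):
        self.weights = weights                   # flow_id -> weight
        self.last_tag = {f: 0.0 for f in weights}
        self.buffer = []                         # sorted packet buffer (heap)

    def enqueue(self, flow, length):
        tag = self.last_tag[flow] + length / self.weights[flow]
        self.last_tag[flow] = tag
        heapq.heappush(self.buffer, (tag, flow))

    def dequeue(self):                           # serve head of line
        tag, flow = heapq.heappop(self.buffer)
        return flow

# Unit-length packets from two flows with weights 1/4 and 3/4, both at t=0:
t = Tagger({1: 0.25, 2: 0.75})
t.enqueue(1, 1.0)       # tag 4.0
t.enqueue(2, 1.0)       # tag 4/3 -> transmitted first, as in the WFQ example
```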
Buffer Management
- Packet drop strategy: which packet to drop when buffers are full
- Fairness: protect well-behaved sources from misbehaving sources
- Aggregation: per-flow buffers protect flows from misbehaving flows; full aggregation provides no protection; aggregation into classes provides intermediate protection
- Drop priorities: drop packets from the buffer according to priorities; maximizes network utilization & application QoS; examples: layered video, policing at the network edge
- Controlling sources at the edge
Early or Overloaded Drop
Random early detection (RED):
- Drop packets if the short-term average of the queue length exceeds a threshold
- The packet drop probability increases linearly with queue length
- Mark offending packets
- Improves the performance of cooperating TCP sources
- Increases the loss probability of misbehaving sources
Random Early Detection (RED)
TCP sources reduce their input rate in response to dropped packets, i.e., to network congestion. Early drop: discard packets before buffers are full. Random drop causes some sources to reduce their rates before others, producing a gradual reduction in the aggregate input rate.
Algorithm: maintain a running average Qavg of the queue length
- If Qavg < minthreshold, do nothing
- If Qavg > maxthreshold, drop the packet
- If in between, drop the packet with a probability that increases with Qavg
Flows that send more packets are more likely to have packets dropped.
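The drop decision can be sketched as follows; max_p, the ceiling of the linear region, and the averaging weight w are assumed knobs, not values from the text:

```python
# RED drop decision: no drop below min_th, forced drop above max_th,
# and a drop probability rising linearly with the average queue in between.
import random

def update_avg(q_avg, q_now, w=0.002):
    """Exponentially weighted running average of the queue length."""
    return (1 - w) * q_avg + w * q_now

def red_drop(q_avg, min_th, max_th, max_p=0.1, rnd=random.random):
    if q_avg < min_th:
        return False                   # do nothing: enqueue the packet
    if q_avg >= max_th:
        return True                    # drop (or mark) the packet
    p = max_p * (q_avg - min_th) / (max_th - min_th)
    return rnd() < p                   # probabilistic early drop
```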
Packet Drop Profile in RED
[Figure: RED drop profile — the probability of packet drop is 0 below minth, rises linearly between minth and maxth, and is 1 from maxth up to a full buffer, as a function of average queue length]
The "early" drop notifies some sources to reduce their transmission rates before the buffer becomes full.
Chapter 7 Packet-Switching Networks
Traffic Management at the Flow Level
Why Congestion?
Congestion occurs when a surge of traffic overloads network resources.
[Figure: an 8-node network with traffic from several sources converging on congested links]
Can a large buffer help? Or could it make things even worse?
Approaches to congestion control:
- Preventive approaches (open-loop): scheduling & reservations
- Reactive approaches (closed-loop): detect & throttle/discard
Ideal Effect of Congestion Control
[Figure: throughput vs. offered load — with control, throughput tracks the offered load up to capacity and then stays flat; without control, throughput collapses as the load grows beyond capacity]
Resources are used efficiently up to the capacity available.
Open-Loop Control
- Network performance is guaranteed to all traffic flows that have been admitted into the network
- Initially developed for connection-oriented networks
- Key mechanisms: admission control, policing, traffic shaping, traffic scheduling
Admission Control
- Flows negotiate a contract with the network, specifying their requirements: peak, average, and minimum bit rate; maximum burst size; delay and loss requirements
- The network computes the resources needed: the "effective" bandwidth
- If the flow is accepted, the network allocates resources to ensure the QoS is delivered as long as the source conforms to the contract
[Figure: bits/second vs. time — the typical bit rate demanded by a variable-bit-rate information source fluctuates between its peak rate and its average rate]
Policing
- The network monitors traffic flows continuously to ensure they meet their traffic contract
- When a packet violates the contract, the network can discard it or tag it with lower priority; if congestion occurs, tagged packets are discarded first
- The leaky bucket algorithm is the most commonly used policing mechanism:
  - The bucket has a specified leak rate for the average contracted rate
  - The bucket has a specified depth to accommodate variations in the arrival rate
  - An arriving packet is conforming if it does not cause the bucket to overflow
The Leaky Bucket
[Figure: (a) a leaky bucket with water poured in irregularly and leaking at a constant rate; (b) a leaky bucket with packets]
The Leaky Bucket Example
Data arrives at a router in 1 MB bursts; that is, the input runs at 25 MB/s (burst rate) for 40 msec. The router supports a 2 MB/s output (leak) rate and uses a leaky bucket for traffic shaping.
(1) How large should the bucket be so that there is no data loss (assuming a fluid system)?
(2) If the leaky bucket size is 1 MB, how long can the maximum burst interval be?
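One possible working, under the fluid assumption stated in the problem — during a burst the bucket fills at the input rate minus the leak rate:

```python
# (1) size the bucket to absorb one whole 40 ms burst;
# (2) invert the same relation for a 1 MB bucket.

burst_rate = 25.0        # MB/s, input rate during a burst
leak_rate = 2.0          # MB/s, output rate
burst_time = 0.040       # s (40 msec)

# (1) bucket size for no data loss during the burst
bucket_needed = (burst_rate - leak_rate) * burst_time     # 0.92 MB

# (2) longest burst a 1 MB bucket can absorb without loss
bucket_size = 1.0        # MB
max_burst_time = bucket_size / (burst_rate - leak_rate)   # ~0.0435 s (~43.5 ms)
```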
The Leaky Bucket Example
Example: data arrives at a router in 1 MB bursts; that is, the input runs at 25 MB/s for 40 msec. The router supports a 2 MB/s outgoing (leak) rate. The leaky bucket size is 1 MB.
(a) Input to a leaky bucket. (b) Output from a leaky bucket.
Leaky Bucket Algorithm
Upon arrival of a packet at time ta (depletion rate: 1 packet per unit time):
  X' = X - (ta - LCT)    (drain the bucket since the last conforming packet)
  if X' < 0, set X' = 0    (bucket empty)
  if X' > L, the arriving packet is nonconforming (it would cause overflow)
  otherwise the packet is conforming: X = X' + I and LCT = ta
where:
  X  = value of the leaky bucket counter
  X' = auxiliary variable
  I  = increment per arrival (nominal interarrival time)
  L + I = bucket depth
  LCT = last conformance time
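The algorithm transcribes directly into code; the arrival pattern below reproduces the I = 4, L = 6 example on the next slide:

```python
# Continuous-state leaky bucket policer, transcribing the flowchart:
# X is the bucket counter, LCT the last conformance time, I the increment
# per packet, L the limit (bucket depth L + I), drain rate 1 per unit time.

class LeakyBucketPolicer:
    def __init__(self, I, L):
        self.I, self.L = I, L
        self.X = 0.0
        self.LCT = 0.0

    def arrive(self, ta):
        Xp = self.X - (ta - self.LCT)   # content after draining since LCT
        if Xp < 0:
            Xp = 0.0                    # bucket had emptied
        if Xp > self.L:
            return False                # nonconforming: would overflow
        self.X = Xp + self.I            # conforming packet
        self.LCT = ta
        return True

# Back-to-back arrivals one time unit apart with I = 4, L = 6:
p = LeakyBucketPolicer(I=4, L=6)
conforming = [p.arrive(t) for t in range(6)]
# the first three packets conform, then the bucket saturates (MBS = 3)
```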
Leaky Bucket Example
I = 4, L = 6
[Figure: bucket content vs. time — each conforming arrival adds I to the bucket content, which drains at rate 1 per unit time; an arrival finding the content above L is nonconforming]
- This is a per-packet system, not a fluid system
- Nonconforming packets are not allowed into the bucket and hence are not included in the calculations
- Maximum burst size here: MBS = 3 packets
Leaky Bucket Traffic Shaper
[Figure: leaky bucket traffic shaper — incoming traffic enters a packet buffer of size N, and a server plays packets out at the contracted rate as shaped traffic]
- Incoming packets are buffered and played out periodically to conform to the traffic parameters
- Surges in arrivals are buffered & smoothed out
- Packet loss is possible due to buffer overflow
- This is too restrictive, since conforming traffic does not need to be completely smooth. How can some burstiness be allowed?
Token Bucket Traffic Shaper
[Figure: token bucket traffic shaper — tokens arrive periodically into a token bucket of size K; incoming traffic enters a packet buffer of size N; the server releases a packet only when sufficient tokens are available]
- An incoming packet must obtain sufficient tokens before admission into the network
- The token rate regulates the transfer of packets
- If sufficient tokens are available, packets enter the network without delay
- K determines how much burstiness is allowed into the network
Token Bucket Traffic Shaper
- The leaky bucket (LB) traffic shaper is restrictive: its output rate is constant whenever the buffer is not empty
- Most applications generate variable-rate traffic, and smoothing such traffic causes unnecessary delay
- The token bucket traffic shaper is designed to allow variation in the transmission rate: it employs a buffer to absorb bursts, as the LB shaper does, but controls the output rate using a token bucket (counter)
- The number of bytes in a burst of length S seconds at the maximum output rate M bytes/sec, with token bucket capacity C tokens and token arrival rate R tokens/sec, satisfies:
  MS = C + RS, so S = C/(M - R)
TB Algorithm:
- Tokens are generated at a constant rate and stored in a token bucket with capacity C; if the bucket is full, arriving tokens are discarded. In a packet system the token is a count of bytes, e.g., R tokens (bytes) are generated every ΔT seconds.
- A packet is transmitted from the buffer only if enough tokens can be drawn from the bucket, i.e., the bucket holds tokens equivalent to the packet length of L bytes.
- If no tokens are available, the packet is backlogged in the buffer.
- When the bucket is empty, backlogged packets are transmitted at the rate at which tokens arrive; the shaper then behaves like an LB shaper transmitting at a constant rate.
- When the bucket is not empty, packets are drawn from the buffer as soon as they arrive, so traffic burstiness is preserved. If the burst continues, the bucket is exhausted and packets are backlogged; the bucket size limits the burstiness.
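A sketch of this shaper logic, under two stated simplifications: tokens accrue fluidly rather than in ΔT batches, and transmission time is not modeled, so each departure time marks when a packet gains admission (packet lengths are assumed ≤ C):

```python
# Token-bucket shaper: tokens accrue at R bytes/sec up to capacity C;
# a packet of L bytes departs as soon as the bucket holds L tokens.
# Packets leave in FIFO order.

def shape(packets, R, C, tokens=0.0):
    """packets: list of (arrival_time, length_bytes), sorted by arrival.
    Returns the departure time of each packet."""
    last = 0.0            # time the token count was last updated
    ready = 0.0           # earliest time the next packet may depart (FIFO)
    departures = []
    for t, L in packets:
        start = max(t, ready)
        tokens = min(C, tokens + (start - last) * R)
        if tokens < L:                      # backlog until tokens suffice
            start += (L - tokens) / R
            tokens = L
        tokens -= L
        last = start
        ready = start
        departures.append(start)
    return departures

# A full bucket passes the first packet instantly; the second must wait:
out = shape([(0.0, 100.0), (0.0, 100.0)], R=100.0, C=100.0, tokens=100.0)
# -> departures at t = 0.0 and t = 1.0
```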
Delay Bound
[Figure: a token-bucket-shaped flow delivers b bytes instantly at t = 0 and then r bytes per second, so A(t) = b + rt; at a multiplexer of rate R > r, the buffer occupancy jumps to b and then drains at rate R - r]
- In time T, a token bucket with rate r bytes/sec and depth b can transmit b + rT bytes: b bytes at t = 0 and rT up to time T
- Assume a TB shaper followed by two multiplexers in tandem, each capable of transmitting R bytes/sec, where R > r
- The first multiplexer buffers b bytes at t = 0 and subsequently receives r bytes/sec; its occupancy of b bytes drains at rate R - r bytes/sec
- A newly arriving byte experiences a delay equal to the time it takes the buffer occupancy ahead of it to dissipate
Delay Bound Calculation
- The maximum delay experienced by the bytes following the initial b bytes in the first multiplexer is b/R seconds
- No queue builds up at the second multiplexer, because its arrival rate equals its service rate
- The maximum delay through a chain of multiplexers, each guaranteeing bandwidth R bytes/sec to the flow, is therefore b/R → fluid-flow model
- Parekh and Gallager derived the corresponding delay bound in a packet network:
  D <= b/R + (H - 1)m/R + Σ_{j=1..H} M/Rj
  where m is the maximum packet size of the flow, M the maximum packet size in the network, H the number of hops, and Rj the bandwidth of link j
  - b/R: buffer dissipation time at the first hop
  - (H - 1)m/R: transmission time of the preceding packet of the same flow at each subsequent hop
  - Σ_{j=1..H} M/Rj: transmission time of a preceding packet of another flow at every hop
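A numeric illustration of the bound (the parameter values are made up for the example):

```python
# Parekh-Gallager delay bound:
#   D <= b/R + (H-1)*m/R + sum_{j=1..H} M/R_j

def delay_bound(b, R, m, M, link_rates):
    H = len(link_rates)
    return b / R + (H - 1) * m / R + sum(M / Rj for Rj in link_rates)

# e.g. b = 16 kB burst, R = 1 MB/s reserved rate, m = M = 1500 B packets,
# and H = 3 hops over 10 MB/s links:
D = delay_bound(b=16_000, R=1_000_000, m=1500, M=1500,
                link_rates=[10_000_000] * 3)
# 0.016 + 0.003 + 0.00045 s, i.e. about 19.5 ms
```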
Token Bucket Shaping Effect
[Figure: cumulative output bounded by the line b + rt — b bytes instantly, then r bytes/second]
The token bucket constrains the traffic from a source to at most b + rt bits in any interval of length t.
Q1: What are the two main differences between a leaky bucket and a token bucket? The token bucket allows saving tokens to spend on bursts; and they differ in whether packets are discarded or buffered.
Q2: When is a token bucket the same as a leaky bucket? When b = 0; but they still differ in whether packets are discarded.
The Token Bucket Example 1
A network uses a token bucket for traffic shaping. A new token is put into the bucket every 1 msec. Each token is good for one packet, which contains 100 bytes of data. What is the maximum sustainable (input) data rate?
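One possible working: over a long interval the output is limited by the token rate, so the sustainable input rate cannot exceed it.

```python
# One token per 1 ms, each worth one 100-byte packet, so the long-run
# (sustainable) data rate is bounded by the token replenishment rate.

token_interval = 0.001            # seconds between tokens
bytes_per_token = 100
max_rate = bytes_per_token / token_interval   # 100,000 bytes/s = 100 KB/s
```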
The Token Bucket Example 2
Given the token bucket capacity C, the token arrival rate p, and the maximum output rate M, calculate the maximum burst interval S:
C + pS = MS, so S = C/(M - p)
Example 2: data arrives at a router in 1 MB bursts; that is, the input runs at 25 MB/s (burst rate) for 40 msec. The router uses a token bucket with a capacity of 250 KB for traffic shaping. Initially the bucket is full of tokens, and tokens are generated and put into the bucket at a rate of 2 MB/s.
What will be the output from the token bucket?
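Applying S = C/(M - p) to these numbers gives the shape of the output:

```python
# Burst interval from C + p*S = M*S with the Example 2 values.
C = 0.250      # MB, token bucket capacity (250 KB)
p = 2.0        # MB/s, token generation rate
M = 25.0       # MB/s, maximum (burst) output rate

S = C / (M - p)      # ~0.0109 s
# output: about 10.9 ms at 25 MB/s (roughly 272 KB of the burst),
# then 2 MB/s for the remainder of the 1 MB burst
```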
Fluid Examples
[Figure: output from a token bucket with capacities of (c) 250 KB, (d) 500 KB, (e) 750 KB; (f) output from a 500 KB token bucket feeding a 10-MB/sec leaky bucket of size 1 MB]
Current View of Router Function
[Figure: router block diagram — routing, reservation, and management agents feed admission control; a routing database and a traffic control database configure the classifier and packet scheduler that sit between the input driver, the internet forwarder, and the output driver]
What about security?
Closed-Loop Flow Control
- Congestion control: feedback information regulates the flow from sources into the network, based on buffer content, link utilization, etc.; examples: TCP at the transport layer; congestion control at the ATM level
- End-to-end vs. hop-by-hop: these differ in the delay incurred before control takes effect
- Implicit vs. explicit feedback: the source deduces congestion from observed behavior, or routers/switches generate messages alerting sources to congestion
E2E vs. H2H Congestion Control
[Figure: (a) end-to-end congestion control — feedback information flows from the destination back to the source along the packet flow; (b) hop-by-hop congestion control — feedback is exchanged between neighboring nodes along the path]
TCP vs. ATM
Congestion Warning
- Threshold-based utilization warning
- Which factor is used for the threshold calculation?
- How is utilization measured: instantaneously or smoothed?
- How is the threshold set? How many threshold levels?
- The warning bit in ACKs
- Choke packets sent to the source
- Isn't this approach too slow to react?
Traffic Engineering
- Management exerted at the flow-aggregate level: distribution of flows in the network to achieve efficient utilization of resources (bandwidth)
- A shortest-path algorithm to route a given flow is not enough: it does not take into account the requirements of a flow (e.g., its bandwidth requirement) or the interplay between different flows
- Must take into account the aggregate demand from all flows
Effect of Traffic Engineering
[Figure: the same 8-node network under (a) shortest-path routing, which congests the link from node 4 to node 8, and (b) a better flow allocation, which distributes flows more uniformly]
Homework
Chapter 7 Homework (due one week later): 7.28, 7.29, 7.33, 7.34, 7.51, 7.56, 7.57, 7.58, 7.59, 7.64