Last lecture: Congestion Control
What can end-points collectively do to make good use of shared underlying resources?

Queue Management
Mike Freedman
COS 461: Computer Networks
http://www.cs.princeton.edu/courses/archive/spr20/cos461/
[Figure: naming layers – logical name, logical link, physical address, path]


Today: Queue Management
What can individual links do to make good use of shared underlying resources?

Packet Queues


Router Line Cards (Interface Cards, Adaptors)
• Packet handling
  – Packet forwarding
  – Buffer management
  – Link scheduling
  – Packet filtering
  – Rate limiting
  – Packet marking
  – Measurement
[Figure: line cards connected by a switching fabric; each line card handles packets to/from its link (receive, lookup, transmit) on the data plane, with a processor for the control plane]


Queue Management Issues
• Scheduling discipline
  – Which packet to send?
  – Some notion of fairness? Priority?
• Drop policy
  – When should you discard a packet?
  – Which packet to discard?
• Goal: balance throughput and delay
  – Huge buffers minimize drops, but add to queuing delay (thus higher RTT, longer slow start, …)

FIFO Scheduling and Drop-Tail
• Access to the bandwidth: first-in first-out queue
  – Packets only differentiated when they arrive
• Access to the buffer space: drop-tail queuing
  – If the queue is full, drop the incoming packet
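These two policies are simple enough to sketch directly. A minimal illustration (not from the lecture; the packet-count `capacity` is an assumed parameter):

```python
from collections import deque

class DropTailFifo:
    """FIFO scheduling with drop-tail: serve in arrival order, drop arrivals when full."""
    def __init__(self, capacity):
        self.capacity = capacity          # buffer size in packets (assumed unit)
        self.queue = deque()

    def enqueue(self, packet):
        if len(self.queue) >= self.capacity:
            return False                  # queue full: drop the incoming packet (drop-tail)
        self.queue.append(packet)         # otherwise append; packets are not differentiated
        return True

    def dequeue(self):
        return self.queue.popleft() if self.queue else None   # first-in, first-out
```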

Bursty Loss From Drop-Tail Queuing
• TCP depends on packet loss
  – Packet loss is an indication of congestion
  – TCP additive increase drives the network into loss
• Drop-tail leads to bursty loss
  – Congested link: many packets encounter a full queue
  – Synchronization: many connections lose packets at once

Early Detection of Congestion


Slow Feedback from Drop Tail
• Feedback comes when the buffer is completely full
  – … even though the buffer has been filling for a while
• Plus, the filling buffer is increasing RTT
  – … making detection even slower
• Better to give early feedback
  – Get 1-2 connections to slow down before it's too late!

Random Early Detection (RED)
• Router notices that the queue is getting full
  – … and randomly drops packets to signal congestion
• Packet drop probability
  – Drop probability increases as the average queue length increases
  – Below a minimum threshold, drop nothing; otherwise, set drop probability as f(avg queue length)
[Figure: drop probability (from 0 to 1) as a function of average queue length]
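A rough sketch of the RED decision, assuming the usual minimum/maximum thresholds and an exponentially weighted moving average of the queue length; the parameter names and default values are illustrative, not taken from any particular router:

```python
import random

class RedQueue:
    """Random Early Detection sketch: drop probability grows with average queue length."""
    def __init__(self, min_th=5, max_th=15, max_p=0.1, ewma_weight=0.002):
        self.min_th, self.max_th, self.max_p = min_th, max_th, max_p
        self.ewma_weight = ewma_weight
        self.avg = 0.0                    # EWMA of the queue length
        self.queue = []

    def on_arrival(self, packet):
        # Update the average queue length, then decide whether to drop early.
        self.avg = (1 - self.ewma_weight) * self.avg + self.ewma_weight * len(self.queue)
        if self.avg < self.min_th:
            drop_p = 0.0                  # below the minimum threshold: never drop
        elif self.avg >= self.max_th:
            drop_p = 1.0                  # above the maximum threshold: always drop
        else:                             # in between: probability rises with avg queue length
            drop_p = self.max_p * (self.avg - self.min_th) / (self.max_th - self.min_th)
        if random.random() < drop_p:
            return False                  # early drop signals congestion before the queue fills
        self.queue.append(packet)
        return True
```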


Properties of RED
• Drops packets before the queue is full
  – In the hope of reducing the rates of some flows
• Tolerant of burstiness in the traffic
  – By basing the decisions on average queue length
• Which of the following are true?
  (Y) Drops packets in proportion to each flow's rate
  (M) High-rate flows selected more often
  (C) Helps desynchronize the TCP senders
  (A) All of the above


Problems With RED
• Hard to get tunable parameters just right
  – How early to start dropping packets?
  – What slope for the increase in drop probability?
  – What time scale for averaging the queue length?
• RED has mixed adoption in practice
  – If parameters aren't set right, RED doesn't help
• Many other variations in the research community
  – Names like "Blue" (self-tuning), "FRED", …

From Loss to Notification


Feedback: From Loss to Notification
• Early dropping of packets
  – Good: gives early feedback
  – Bad: has to drop the packet to give the feedback
• Explicit Congestion Notification
  – Router marks the packet with an ECN bit
  – Sending host interprets it as a sign of congestion

Explicit Congestion Notification
• Needs support by router, sender, AND receiver
  – End-hosts check ECN-capable during the TCP handshake
• ECN protocol (repurposes 4 header bits)
  1. Sender marks "ECN-capable" when sending
  2. If router sees "ECN-capable" and is congested, marks packet as "ECN congestion experienced"
  3. If receiver sees "congestion experienced", marks "ECN echo" flag in responses until the congestion is ACK'd
  4. If sender sees "ECN echo", reduces cwnd and marks "congestion window reduced" flag in the next packet
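The four steps can be sketched as a simplified model of the sender/router/receiver interaction. The field names below are shorthand for the real ECN/TCP bits (ECT, CE, ECE, CWR), and the queue-length marking threshold is an assumption for illustration:

```python
from dataclasses import dataclass

@dataclass
class Packet:
    ecn_capable: bool = True            # step 1: sender marks "ECN-capable"
    congestion_experienced: bool = False
    ecn_echo: bool = False              # set by the receiver in ACKs
    cwnd_reduced: bool = False          # set by the sender after backing off

def router_forward(pkt, queue_len, threshold=20):
    # Step 2: if congested and the packet is ECN-capable, mark instead of dropping.
    if queue_len > threshold:
        if pkt.ecn_capable:
            pkt.congestion_experienced = True
            return pkt
        return None                     # non-ECN traffic is dropped as before
    return pkt

def receiver_ack(pkt, echoing):
    # Step 3: start echoing on "congestion experienced"; stop once the sender's CWR arrives.
    if pkt.congestion_experienced:
        echoing = True
    if pkt.cwnd_reduced:
        echoing = False
    return Packet(ecn_echo=echoing), echoing

def sender_on_ack(ack, cwnd):
    # Step 4: on ECN-echo, back off (halve cwnd) and set CWR on the next packet.
    if ack.ecn_echo:
        return max(cwnd // 2, 1), Packet(cwnd_reduced=True)
    return cwnd, Packet()
```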

ECN Questions

Why separate ECN "experienced" and "echo" flags?
(Y) Detect reverse-path congestion with "experienced"
(M) Congestion could happen in either direction; want sender to react to the forward direction
(C) Both of the above

Why is "echo" resent & "congestion window reduced" ACK'd?
(Y) Congestion in the reverse path can lose ECN-echo; still want to respond to congestion in the forward path
(M) Should only apply backoff once per cwnd
(C) Both of the above


Link Scheduling


First-In First-Out Scheduling
• First-in first-out scheduling
  – Simple, but restrictive
• Example: two kinds of traffic
  – Voice over IP needs low delay
  – E-mail is not that sensitive about delay
• Voice traffic waits behind e-mail

Strict Priority
• Multiple levels of priority
  – Always transmit high-priority traffic, when present
• Isolation for the high-priority traffic
  – Almost like it has a dedicated link
  – Except for the (small) delay for packet transmission
• But, lower-priority traffic may starve
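A minimal sketch of a strict-priority scheduler (the two-level VoIP/e-mail split in the usage example is just for illustration):

```python
from collections import deque

class StrictPriorityScheduler:
    """Always serve the highest-priority non-empty queue; lower classes can starve."""
    def __init__(self, num_levels=2):
        self.queues = [deque() for _ in range(num_levels)]   # index 0 = highest priority

    def enqueue(self, packet, level):
        self.queues[level].append(packet)

    def dequeue(self):
        for q in self.queues:            # scan from highest to lowest priority
            if q:
                return q.popleft()
        return None                      # all queues empty

# Example: VoIP at level 0 always transmits before e-mail at level 1.
sched = StrictPriorityScheduler()
sched.enqueue("email-1", level=1)
sched.enqueue("voip-1", level=0)
assert sched.dequeue() == "voip-1"
```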


Weighted Fair Scheduling
• Weighted fair scheduling
  – Assign each queue a fraction of the link bandwidth
  – Rotate across queues on a small time scale
[Figure: link bandwidth shared 50% red, 25% blue, 25% green]
• If non-work-conserving (resources can go idle)
  – Each flow gets at most its allocated weight
• WFQ is work-conserving
  – Send extra traffic from one queue if others are idle
  – Results in (Y) higher or (M) lower utilization than non-work-conserving?
• WFQ results in max-min fairness
  – Maximize the minimum rate of each flow
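One way to approximate these weights while staying work-conserving is a byte-quantum round robin over the backlogged queues, essentially deficit round robin; this is my own sketch, not the lecture's mechanism, and the 1500-byte quantum is an assumption:

```python
from collections import deque

class WeightedRoundRobin:
    """Work-conserving approximation of weighted fair sharing: each round, a backlogged
    queue may send up to weight * quantum bytes; idle queues are skipped, so their
    share is reused by the others (essentially deficit round robin)."""
    def __init__(self, weights, quantum=1500):
        self.queues = [deque() for _ in weights]
        self.weights = list(weights)
        self.quantum = quantum
        self.deficit = [0.0] * len(self.weights)   # unused credit carried to the next round

    def enqueue(self, qid, packet_len):
        self.queues[qid].append(packet_len)

    def next_round(self):
        sent = []                                  # (queue id, packet length) in service order
        for i, q in enumerate(self.queues):
            if not q:
                self.deficit[i] = 0.0              # no credit accumulates while idle
                continue
            self.deficit[i] += self.weights[i] * self.quantum
            while q and q[0] <= self.deficit[i]:
                pkt = q.popleft()
                self.deficit[i] -= pkt
                sent.append((i, pkt))
            if not q:
                self.deficit[i] = 0.0              # reset credit once the queue empties
        return sent

# Example: weights 0.5 / 0.25 / 0.25 with 1500-byte packets -> over 4 rounds the first
# queue sends twice as many packets as each of the others.
wrr = WeightedRoundRobin([0.5, 0.25, 0.25])
```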


Max-Min Fairness
• Maximize the minimum rate of each flow
  1. Allocate in order of increasing demand
  2. No flow gets more than its demand
  3. The excess, if any, is equally shared

The fair share f satisfies  Σ_i min(r_i, f) = C,  where r_i is flow i's demand and C is the link capacity.

Example (C = 10; demands 2, 8, 10): f = 4
  min(2, 4) = 2
  min(8, 4) = 4
  min(10, 4) = 4
  Total = 10
[Figure: rates demanded/allocated, flows sorted by demand]
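The fair share f can be computed by progressively filling demands in increasing order. A small sketch that reproduces the example above (flow names A, B, C are illustrative):

```python
def max_min_allocation(capacity, demands):
    """Allocate in order of increasing demand; no flow exceeds its demand;
    leftover capacity is split equally among the still-unsatisfied flows."""
    alloc = {}
    remaining = capacity
    pending = sorted(demands.items(), key=lambda kv: kv[1])   # flows by increasing demand
    while pending:
        share = remaining / len(pending)                      # equal split of what is left
        flow, demand = pending[0]
        if demand <= share:
            alloc[flow] = demand                              # satisfied flow returns its excess
            remaining -= demand
            pending.pop(0)
        else:
            for flow, _ in pending:                           # everyone left gets exactly f = share
                alloc[flow] = share
            pending = []
    return alloc

# Example from the slide: C = 10, demands 2, 8, 10  ->  {'A': 2, 'B': 4.0, 'C': 4.0}
print(max_min_allocation(10, {"A": 2, "B": 8, "C": 10}))
```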

Bit-by-Bit Fair Queueing
[Figure: packets arriving on the ingress of links 1-3 are classified into per-flow queues (flow 1 … flow N); a scheduler serves the queues in bit-by-bit round robin onto the egress links]

Question: What is a "flow"?
  Flow 5-tuple: protocol, IP source/dest, port source/dest
Question: How can we give weights?
  Protocol class, bit markings, prefixes, etc.

WFQ slides credit: Nick McKeown (Stanford)

Bit-by-Bit Weighted Fair Queueing
• Flows are allocated different rates by servicing a different number of bits for each flow during each round
[Figure: four queues with weights w1 = 0.1, w2 = 0.3, w3 = 0.3, w4 = 0.3 share a link of rate C]
  Order of service for the four queues:
  … f1, f2, f2, f2, f3, f3, f3, f4, f4, f4, f1, …

Packet vs. "Fluid" System
• Bit-by-bit FQ is not implementable; in real packet-based systems:
  – One queue is served at any given time
  – Packet transmission cannot be preempted
• Goal: a packet scheme close to the fluid system
  – Bound performance w.r.t. the fluid system


First Cut: Simple Round Robin
• Serve a packet from each non-empty queue in turn
  – Let's assume all flows have equal weight
• Variable packet length → get more service by sending bigger packets
• Unfair instantaneous service rate (especially with variable weights)
  – E.g., 500 flows: 250 with weight 1, 250 with weight 10
  – Each round takes 2,750 packet times (250 × 1 + 250 × 10)
  – What if a packet arrives right after its "turn"?

Packet-by-Packet Fair Queuing (Weighted Fair Queuing)
• Deals better with variable-size packets and weights
• Key idea:
  1. Determine the finish time of packets in the bit-by-bit system, assuming no more arrivals
  2. Serve packets in order of finish times


Implementing WFQ
Challenge: Determining the finish time is hard.
Idea: We don't need the finish time, only the finish order.
  The finish order is a lot easier to calculate.

Finish Order
In what order do the packets finish? In order of increasing L_i / w_i.
[Figure: four queues with weights w1 = 0.1, w2 = 0.3, w3 = 0.3, w4 = 0.3 and head-of-line packets of length L1 … L4 share a link of rate C]
  Order of service for the four queues:
  … f1, f2, f2, f2, f3, f3, f3, f4, f4, f4, f1, …

The finish order does not change with future packet arrivals!


Bit-by-bit System Round
[Figure: four queues with weights w1 = 0.1, w2 = 0.3, w3 = 0.3, w4 = 0.3 share a link of rate C; order of service … f1, f2, f2, f2, f3, f3, f3, f4, f4, f4, f1, …]

Round – one complete cycle through all the queues, sending w_i bits per queue

Question: How long does a round take?
  Let R(t) = number of rounds at time t, C = link rate, and B(t) = the set of backlogged flows. Then
  dR/dt = C / Σ_{j ∈ B(t)} w_j

Question: How many rounds does it take to serve a packet of length L from flow i?
  A packet of length L takes L / w_i rounds to serve.

Round (aka "Virtual Time") Implementation of WFQ
Assign a start/finish round to each packet at arrival → serve packets in order of finish rounds.

Suppose the kth packet of flow i arrives at time a. Question: what is its finish round F_i^k?
  Start round of the kth packet:
    S_i^k = F_i^{k-1}  (flow i backlogged: start when the (k-1)th packet finishes)
    S_i^k = R(a)       (flow i empty: start at the current round number R(a))

Putting it All Together
For the kth packet of flow i, arriving at time a:
  S_i^k = max(F_i^{k-1}, R(a))
  F_i^k = S_i^k + L_i^k / w_i

Question: How to compute R(a)?
  dR/dt = C / Σ_{j ∈ B(t)} w_j
  Simple approximation: set R(a) to the start or finish round of the packet currently in service.

Implementation Trade-Offs
• FIFO
  – One queue, trivial scheduler
• Strict priority
  – One queue per priority level, simple scheduler
• Weighted fair scheduling
  – One queue per class, and a more complex scheduler
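Translating the start/finish-round formulas into code gives a sketch of a packet-by-packet WFQ scheduler; it uses the simple approximation above for R(a), i.e., the virtual time of the packet most recently put into service:

```python
import heapq

class WfqScheduler:
    """Packet-by-packet weighted fair queuing: stamp each arriving packet with a finish
    round in the bit-by-bit fluid system, then transmit in finish-round order."""
    def __init__(self, weights):
        self.weights = weights                      # weight w_i per flow
        self.last_finish = {f: 0.0 for f in weights}
        self.heap = []                              # (finish_round, seq, flow, length)
        self.virtual_time = 0.0                     # simple approximation of R(t)
        self.seq = 0                                # tie-breaker for equal finish rounds

    def enqueue(self, flow, length):
        # S_i^k = max(F_i^{k-1}, R(a));  F_i^k = S_i^k + L_i^k / w_i
        start = max(self.last_finish[flow], self.virtual_time)
        finish = start + length / self.weights[flow]
        self.last_finish[flow] = finish
        heapq.heappush(self.heap, (finish, self.seq, flow, length))
        self.seq += 1

    def dequeue(self):
        if not self.heap:
            return None
        finish, _, flow, length = heapq.heappop(self.heap)
        self.virtual_time = finish                  # approximate R(a) by the finish round in service
        return flow, length

# Example: weights 0.1 / 0.3 with equal-length packets -> flow "b" gets ~3x the service.
sched = WfqScheduler({"a": 0.1, "b": 0.3})
for _ in range(3):
    sched.enqueue("a", 1000)
    sched.enqueue("b", 1000)
print([sched.dequeue()[0] for _ in range(6)])       # ['b', 'b', 'a', 'b', 'a', 'a']
```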

Guarantees

Distinguishing Traffic
• Applications compete for bandwidth
  – E-mail traffic can cause congestion/losses for VoIP
• Principle 1: Packet marking
  – So the router can distinguish between classes
  – E.g., Type of Service (ToS) bits in the IP header


Preventing Misbehavior
• Applications misbehave
  – VoIP sends packets faster than 1 Mbps
• Principle 2: Policing
  – Protect one traffic class from another
  – By enforcing a rate limit on the traffic

Subdividing Link Resources
• Principle 3: Link scheduling
  – Ensure each application gets its share
  – … while (optionally) using any extra bandwidth
  – E.g., weighted fair queuing
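Policing is often implemented with a token bucket. A minimal sketch (the rate and burst values in the usage example are placeholders):

```python
import time

class TokenBucketPolicer:
    """Enforce a rate limit: tokens accrue at `rate` bytes/sec up to `burst` bytes;
    a packet conforms only if enough tokens are available when it arrives."""
    def __init__(self, rate, burst):
        self.rate = rate                  # bytes per second
        self.burst = burst                # maximum bucket depth in bytes
        self.tokens = burst
        self.last = time.monotonic()

    def allow(self, packet_len):
        now = time.monotonic()
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if packet_len <= self.tokens:
            self.tokens -= packet_len
            return True                   # conforming packet
        return False                      # out of profile: drop (police) or mark/delay (shape)

# Example: police a flow to 125,000 bytes/s (1 Mbps) with a 10 kB burst allowance.
policer = TokenBucketPolicer(rate=125_000, burst=10_000)
```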


Reserving Resources, and Saying No
• Traffic cannot exceed link capacity
  – Deny access, rather than degrade performance
• Principle 4: Admission control
  – Application declares its needs in advance
  – Application denied if insufficient resources available

Quality of Service (QoS)
• Guaranteed performance
  – Alternative to the best-effort delivery model
• QoS protocols and mechanisms
  – Packet classification and marking
  – Traffic shaping
  – Link scheduling
  – Resource reservation and admission control
  – Identifying paths with sufficient resources
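A toy sketch of admission control on a single link; real reservation protocols (e.g., RSVP) involve much more, so treat this purely as an illustration of the "say no" principle:

```python
class AdmissionController:
    """Admit a new flow only if its declared rate still fits within link capacity."""
    def __init__(self, capacity_bps):
        self.capacity = capacity_bps
        self.reserved = 0.0               # sum of rates promised to admitted flows

    def request(self, rate_bps):
        if self.reserved + rate_bps <= self.capacity:
            self.reserved += rate_bps     # reserve resources for the new flow
            return True                   # admitted: performance can be guaranteed
        return False                      # denied, rather than degrading everyone

# Example: a 10 Mbps link admits a 9 Mbps reservation, then says no to 2 Mbps more.
ac = AdmissionController(10_000_000)
assert ac.request(9_000_000) and not ac.request(2_000_000)
```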

