
Advanced Computer Networks

QoS in IP networks

Prof. Andrzej Duda [email protected]

http://duda.imag.fr

1 Contents

§ QoS principles § Traffic shaping § leaky bucket § token bucket § Scheduling § FIFO § Fair queueing § RED § IntServ § DiffServ

2 Improving QOS in IP Networks

§ IETF groups are working on proposals to provide better QOS control in IP networks, i.e., going beyond best effort to provide some assurance for QOS § Work in Progress includes Integrated Services, RSVP, and Differentiated Services § Simple model for sharing and congestion studies:

3 Principles for QOS Guarantees

§ Consider a phone application at 1 Mbps and an FTP application sharing a 1.5 Mbps link. § bursts of FTP can congest the router and cause audio packets to be dropped. § want to give priority to audio over FTP § PRINCIPLE 1: Marking of packets is needed for the router to distinguish between different classes; and a new router policy to treat packets accordingly

4 Principles for QOS Guarantees

§ Applications misbehave (audio sends packets at a rate higher than 1Mbps assumed above); § PRINCIPLE 2: provide protection (isolation) for one class from other classes § Require Policing Mechanisms to ensure sources adhere to bandwidth requirements; Marking and Policing need to be done at the edges:

5 Principles for QOS Guarantees

§ Alternative to Marking and Policing: allocate a set portion of bandwidth to each application flow; can lead to inefficient use of bandwidth if one of the flows does not use its allocation § PRINCIPLE 3: While providing isolation, it is desirable to use resources as efficiently as possible

6 Principles for QOS Guarantees

§ Cannot support traffic beyond link capacity § PRINCIPLE 4: Need a Call Admission Process; application flow declares its needs, network may block call if it cannot satisfy the needs

7 Traffic shaping

§ How to prevent congestion? § it may result from burstiness § make arrivals more deterministic, obtain better performance § example : no. of clients in D/D/1 vs. G/D/1 or group arrivals vs. single arrivals § control the rate and burst size § traffic description - leaky bucket, token bucket § Service contract § if the network knows the type of the traffic, it can reserve resources to support the traffic § contract between the source and the network § source: traffic description - leaky bucket, token bucket § network: QoS guarantee if the traffic conforms to the description § if the traffic is not conformant (leaky bucket, token bucket), penalty: reject a packet, no guarantees of the QoS (traffic policing) 8 Leaky bucket

§ Limited size buffer with constant departure rate § R if buffer not empty § 0 if buffer empty § Equivalent to the queue G/D/1/N § Fixed size packets § one packet per clock tick § Variable size packets § number of bytes per clock tick § Losses if buffer filled
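A minimal sketch of this behaviour in Python (the byte-counted buffer, discrete clock ticks, and all names are illustrative assumptions, not taken from the slide):

# Minimal leaky-bucket sketch: bounded buffer, constant drain per tick.
class LeakyBucket:
    def __init__(self, capacity_bytes, rate_bytes_per_tick):
        self.capacity = capacity_bytes
        self.rate = rate_bytes_per_tick
        self.backlog = 0                      # bytes currently buffered

    def arrive(self, size_bytes):
        """Queue a packet, or drop it if the buffer would overflow."""
        if self.backlog + size_bytes > self.capacity:
            return False                      # dropped
        self.backlog += size_bytes
        return True

    def tick(self):
        """One clock tick: drain at rate R if the buffer is not empty, 0 otherwise."""
        sent = min(self.backlog, self.rate)
        self.backlog -= sent
        return sent

lb = LeakyBucket(capacity_bytes=2000, rate_bytes_per_tick=1000)
print([lb.arrive(1000) for _ in range(3)])    # [True, True, False] - third packet dropped
print([lb.tick() for _ in range(3)])          # [1000, 1000, 0]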

9 Token bucket

[Diagram: tokens arrive at rate r into a bucket of depth b; arriving packets wait in a buffer, pass a conformance test against the available tokens, and leave at peak rate R]

10 Characterizing Burstiness: Token Bucket § Parameters § r – average rate, i.e., rate at which tokens fill the bucket § b – bucket depth (limits size of burst) § R – maximum link capacity or peak rate § A bit (packet) can be transmitted only when a token is available

[Arrival-curve figure: maximum number of bits sent vs. time - the peak-rate segment of slope R (≤ R b/s) and the token-bucket line of slope r (offset b bits) intersect at b·R/(R-r) bits at time b/(R-r); the regulator keeps the output below both]

Token bucket

§ Tokens generated with rate r § 1 token : 1 packet or k bytes § Packet must wait for a token before transmission § no losses § allows limited bursts (a little bit more than b) § When packets are not generated, tokens accumulate § n tokens - burst of n packets § if bucket filled, tokens are lost § Mean departure rate: r § Delay limited by b/r (Little's formula)
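A sketch of a token-bucket regulator along these lines (parameter names and the 1 token = 1 byte convention are assumptions for illustration):

# Token-bucket sketch: tokens accumulate at rate r up to depth b; a packet is
# conformant only if enough tokens are available, otherwise it waits (no losses).
class TokenBucket:
    def __init__(self, r, b):
        self.r = r                # token arrival rate (bytes/s)
        self.b = b                # bucket depth (bytes)
        self.tokens = b           # start with a full bucket
        self.last = 0.0           # time of the last update

    def _refill(self, now):
        self.tokens = min(self.b, self.tokens + self.r * (now - self.last))
        self.last = now

    def conformant(self, size, now):
        """Consume tokens if the packet conforms; otherwise return the waiting time."""
        self._refill(now)
        if self.tokens >= size:
            self.tokens -= size
            return 0.0                            # send immediately
        return (size - self.tokens) / self.r      # wait for the missing tokens

# Example: r = 2 MB/s, b = 1 MB; after an idle second the bucket is full,
# so a first 1 MB burst passes at once and the next 1 MB must wait 0.5 s.
tb = TokenBucket(r=2e6, b=1e6)
print(tb.conformant(1e6, now=1.0))   # 0.0
print(tb.conformant(1e6, now=1.0))   # 0.5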

12 Example

§ 25 MB/s link § Network can support a peak rate R = 25 MB/s, but prefers sustained throughput of r = 2 MB/s § Data generated § 1 MB each second, burst during 40 ms § Example 1. leaky bucket with b = 1 MB, R = 25 MB/s, r = 2 MB/s 2. token bucket with b = 250 KB, R = 25 MB/s, r = 2 MB/s 3. token bucket with b = 500 KB, R = 25 MB/s, r = 2 MB/s 4. token bucket with b = 750 KB, R = 25 MB/s, r = 2 MB/s 5. token bucket with b = 500 KB, R = 25 MB/s, r = 2 MB/s and leaky bucket with b = 1 MB, R = 10 MB/s

Burst duration

§ Burst duration - S [s] § Size of the bucket - b bits § Maximal departure rate - R b/s § Token arrival rate - r b/s § burst of b + rS bits § burst of RS § b + rS = RS -> S = b/(R - r) § Example § b = 250 KB, R = 25 MB/s, r = 2 MB/s § S = 11 ms
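A quick check of the S = b/(R - r) arithmetic with the numbers above (assuming 1 KB = 10^3 bytes and 1 MB = 10^6 bytes):

# Maximum burst duration of a token bucket emptied at peak rate R.
b = 250e3        # bucket size: 250 KB
R = 25e6         # peak rate: 25 MB/s
r = 2e6          # token rate: 2 MB/s

S = b / (R - r)
print(f"maximum burst duration S = {S*1e3:.1f} ms")   # ~10.9 ms, i.e. about 11 ms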

17 QoS Guarantees: Per-hop Reservation

§ End-host: specify § arrival rate characterized by token bucket with parameters (b, r, R) § the maximum tolerable delay D, no losses

§ Router: allocate bandwidth ra, buffer space Ba such that § no packet is dropped § no packet experiences a delay larger than D

[Arrival-curve figure: the arrival curve follows slope R up to b·R/(R-r) bits at time b/(R-r), then slope r; served at rate ra, the worst-case delay is D = (b/ra)·(R-ra)/(R-r) and the worst-case backlog is Ba]

Token Bucket and a router

[Diagram: a source constrained by a token bucket (b, r, R) sends at peak rate R into a router queue of size Ba served at rate ra]

19 QoS Guarantees: Per-hop Reservation

§ Router: if allocated bandwidth ra = r, buffer space B such that § no packet is dropped § no packet experiences a delay larger than D = b/r
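A small helper illustrating the per-hop delay bound from the two slides above; when ra = r it reduces to b/r (the numeric values are assumptions for illustration):

# Worst-case per-hop delay for a (b, r, R) token-bucket flow served at rate ra.
def per_hop_delay(b, r, R, ra):
    assert r <= ra <= R
    return (b / ra) * (R - ra) / (R - r)

b, r, R = 250e3, 2e6, 25e6                 # assumed example values
print(per_hop_delay(b, r, R, ra=2e6))      # ra = r  -> b/r = 0.125 s
print(per_hop_delay(b, r, R, ra=10e6))     # a larger reservation gives a smaller bound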

[Arrival-curve figure for ra = r: the delay is bounded by D = b/r and the backlog by B]

Traffic description

[Rate vs. time plot (0 to 3 MB/s over 4 s): Flow A sends at a constant 1 MB/s; Flow B sends at 0.5 MB/s for 2 s and then bursts at 2 MB/s]

§ Flow A : r = 1 MB/s, b = 1 B § Flow B : r = 1 MB/s, b = 1 MB § by sending at 0.5 MB/s during 2 s, Flow B saves 2 s × 0.5 MB/s = 1 MB of tokens, which allows it to burst at 2 MB/s for 1 s

24 Scheduling strategies

[Diagram: arriving packets enter a transmission queue handled by a scheduler]

§ Scheduler § defines the order of packet transmission § Allocation strategy § throughput § which packet to choose for transmission § when chosen, a packet benefits from a given throughput § buffers § which packet to drop when no buffers are left

25 FIFO

§ Current state of routers § Allows sharing bandwidth § proportionally to the offered load § No isolation § elastic flows (rate controlled by the source, e.g., TCP) may suffer from other flows § a greedy UDP flow may obtain a large share of the capacity § real-time flows may suffer from long delays § Packets arriving at a full buffer are dropped - tail drop § TCP adapts its bandwidth based on losses § RED (Random Early Detection) techniques § choose a packet randomly before congestion and drop it

26 Priority Queue

§ Several queues of different priority § source may mark packets with a priority § e.g., ToS field of IP § packets of the same priority served FIFO § non-preemptive § Problems § starvation - a high-priority source prevents lower-priority sources from transmitting § TOS field in IP - 3 bits of priority § how to avoid everybody sending high-priority packets?

27 Class Based Queueing (CBQ)

Class 1

Class 2

Class 3

§ Also called Custom Queueing (CISCO) § Each queue serviced in round-robin order § Dequeue a configured byte count from each queue in each cycle § Each class obtains a configured proportion of link capacity 28 Characteristics

§ Limited number of queues (CISCO - 16) § Link sharing for Classes of Service (CoS) § based on protocols, addresses, ports § Method for service differentiation § assign different proportions of capacity to different classes § not so drastic as Priority Queueing § Avoids starvation

29 Per Flow Round Robin

flow 1

flow 2

flow 3

§ Similar to Processor Sharing or Time Sharing § one queue per flow § cyclic service, one packet at a time

30 Characteristics

§ It modifies the optimal strategy of sources § FIFO: be greedy - send as much as possible § RR: make the best use of your own share § a greedy source will experience high delays and losses § Isolation § good sources protected from bad ones § Problems § flows sending large packets get more bandwidth § cost of flow classification

31 Fair Queueing

flow 1

flow 2

flow 3

time

§ Round robin "bit per bit" § each packet marked with the transmission instant of the last bit § served in the order of the instants

§ allocates rates according to local max-min fairness 32 Weighted Fair Queueing

§ Fair queueing § equal parts : 1/n § Weighted fair queueing § each flow may send different number of bits

§ Example - weights wi

flow 1 flow 2 flow 3 1/3 1/6 1/2

ri = C wi , C: link capacity

33 Rate guarantee

§ Weights expressed as proportions (wi - guaranteed weight)

§ If no packets of a given flow, unused capacity shared equally by other flows

ri >= C wi

§ Weights to guarantee a given rate

wi = ri / C
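A numeric illustration of r_i = C·w_i and w_i = r_i/C; the weights 1/3, 1/6, 1/2 come from the WFQ example figure above (the flow-to-weight mapping and the 10 Mb/s link capacity are assumptions):

# Guaranteed rates from WFQ weights, and the weight needed to guarantee a rate.
C = 10e6                                   # assumed link capacity in bit/s
weights = {"flow 1": 1/3, "flow 2": 1/6, "flow 3": 1/2}

guaranteed_rates = {flow: C * w for flow, w in weights.items()}
print(guaranteed_rates)                    # e.g. flow 3 gets at least 5 Mb/s

# Conversely, to guarantee 2 Mb/s to a flow on this link:
print(2e6 / C)                             # required weight w_i = 0.2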

34 Delay guarantee

§ Flow constrained by a token bucket § rate r, buffer of b § delay limited by b/r

§ If r ≤ ri (the rate obtained is sufficient for the flow)

§ delay limited by b/ri § total delay limited by the sum of all delays

35 Policing Mechanisms (more)

§ Token bucket and WFQ combine to provide a guaranteed upper bound on delay, i.e., a QoS guarantee!

[Figure: arriving traffic shaped by a token bucket (token rate r, bucket size b) feeds a WFQ scheduler with per-flow rate R; maximum delay D_max = b/R]


36 Deficit Round-Robin (DRR)

§ A quantum of bits to serve from each connection in order § Each queue: deficit counter (dc) (to store credits) with initial value zero § Scheduler visits each non-empty queue, compares the packet at the head to dc and tries to serve one quantum of data § if size ≤ (quantum + dc) § send and save the excess in dc: dc ← quantum + dc − size § otherwise save the entire quantum: dc += quantum § if no packet to send, reset dc § Easier implementation than other fair policies § O(1)
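A runnable sketch of this DRR loop; the quantum and queue contents reproduce the worked example on the next slide (function and variable names are illustrative):

# One deficit round-robin pass over per-flow queues of packet sizes (bytes).
from collections import deque

def drr_round(queues, dc, quantum):
    """Serves each non-empty queue once; returns the (queue, size) pairs sent."""
    sent = []
    for name, q in queues.items():
        if not q:
            dc[name] = 0                  # no packet to send: reset the credit
            continue
        dc[name] += quantum               # one quantum of credit per visit
        while q and q[0] <= dc[name]:     # serve head packets that fit in the credit
            size = q.popleft()
            dc[name] -= size
            sent.append((name, size))
    return sent

queues = {"A": deque([1500]), "B": deque([500, 300]), "C": deque([1200])}
dc = {name: 0 for name in queues}
print(drr_round(queues, dc, quantum=1000), dc)  # round 1: B served twice; A, C wait
print(drr_round(queues, dc, quantum=1000), dc)  # round 2: A and C served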

37 Deficit Round-Robin

§ DRR can handle variable packet size

Quantum size: 1000 bytes. Head-of-queue packets: A: 1500 bytes, B: 500 then 300 bytes, C: 1200 bytes
§ 1st Round § A's dc : 1000 (head packet of 1500 bytes too large, not served) § B's dc : 200 (served twice: 500 + 300 bytes) § C's dc : 1000 (head packet of 1200 bytes too large, not served)
§ 2nd Round § A's dc : 500 (served: 1000 + 1000 − 1500) § B's dc : 0 (no packet to send, counter reset) § C's dc : 800 (served: 1000 + 1000 − 1200)

38 DRR: performance

§ Handles variable length packets fairly § If weights are assigned to the queues, then the quantum size applied for each queue is multiplied by the assigned weight § Queues not served during round build up “credits”: § only non-empty queues § Quantum normally set to max expected packet size: § ensures one packet per round, per non-empty queue § Backlogged sources share bandwidth equally § Simple to implement § Similar to round robin

39 Drop-tail queues

[Diagram: Sender 1 and Sender 2 share a drop-tail queue towards a Receiver]

- Losses due to buffer overflow - de-facto mechanism today
+ Very simple to implement
- Filled buffers (large delay)
- Synchronizes flows

44 How big should router buffers be?

§ Classic buffer-sizing rule: B = C * RTprop § BDP buffer § Single TCP flow halving its window still gets a throughput of 100% link rate
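The arithmetic behind the "massive buffer" concern, using the 40 Gbit/s / 200 ms example quoted below in this deck:

# Bandwidth-delay product behind the rule B = C * RTprop.
C = 40e9           # link rate in bit/s
RTprop = 0.200     # round-trip propagation delay in seconds

B = C * RTprop     # buffer size in bits
print(B / 8 / 1e9, "GB of buffer")   # 1.0 GB - a massive packet buffer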

§ Q: should buffers be BDP-sized?

§ Significant implications:

§ Massive pkt buffers (e.g., 40 Gbit/s with 200ms RTprop): high cost § Massive pkt delays: bufferbloat “Bufferbloat”

§ Problem: too large buffers, notably in home routers, get filled by TCP leading to long packet latency Bufferbloat

§ How TCP should use the buffer

[Diagram: a TCP flow crosses a router buffer and its output link; the TCP window/rate keeps a "good queue" of packets in the buffer]

Bufferbloat

§ Impact of a bloated buffer: longer delays, same throughput

[Diagram: the TCP flow now fills a bloated router buffer; packets form a "bad queue" in the buffer]

Bufferbloat

§ Impact of a bloated buffer: high latency for real-time flows

[Diagram: a VoIP flow and a TCP flow share a bloated router buffer; packets form a "bad queue" in the buffer]

Bufferbloat

§ Impact of drop tail: unfair bandwidth sharing

[Diagram: two TCP flows share a bloated router buffer]

Bufferbloat

§ Impact of drop tail: unfair bandwidth sharing

[Diagram: a "UDP" flow and a TCP flow share a bloated router buffer]

RTT measurements

[CDF plots from a measurement study: distributions of the per-peer minimum RTT and of all RTT samples, for non-residential and residential peers, and of the difference between each sample and the per-peer minimum; 99.6% of residential and 98.3% of non-residential samples are below one second, and fewer than 1% of packets see an RTT increase of more than 1 second over the minimum]

§ Difference between each sample and the minimum RTT for the given host pair § The median increase in RTT is just over 1 msec for non-residential peers and roughly 45 msec for residential peers

Active Queue Management (AQM)

§ Detect “incipient” (early) congestion in the router § Try to keep average queue size in “good” range § Randomly choose flows to notify about congestion § e.g. RED: packet drops are implicit notifications

[Diagram: a queue with min and max thresholds on queue_len; a randomly chosen flow is "notified" via a packet drop]

53 Random Early Detection

§ Family of techniques used to detect congestion and notify sources § when a queue is saturated, packets are dropped § losses interpreted as congestion signals → decrease rate § Idea § act before congestion and reduce the rate of sources § threshold for starting to drop packets § Losses are inefficient § result in retransmissions, dropped packets should be retransmitted - enter Slow Start § Synchronization of TCP sources § several packets dropped § several sources detect congestion and enter slow start at the same instant 54 RED

[Diagram: a queue whose average length is compared against the th-min and th-max thresholds]

§ Estimation of the average queue length § average ← q measure + (1 - q) average § If average ≤ th-min § accept the packet § If th-min < average < th-max § drop with probability p § If th-max ≤ average § drop the packet

55 RED Characteristics

§ Tends to keep the queue reasonably short § low delay § Suitable for TCP § single loss recovered by Fast Retransmit § Probability p of choosing a given flow is proportional to the rate of the flow § more packets of that flow, higher probability of choosing one of its packets

56 RED Characteristics

§ Dynamic probability p § p-tmp = max-p · (average − th-min) / (th-max − th-min) § max-p: maximal drop probability when the average queue attains the th-max threshold § p = p-tmp / (1 − nb-packets · p-tmp) § nb-packets: how many packets have been accepted since the last drop § p increases slowly with nb-packets § drops are spaced in time § Recommended values § max-p = 0.02 § if the average is midway between the two thresholds, 1 drop in 50 packets 57 Drop probability

[Plot: p-tmp as a function of the average queue length - zero below th-min, rising linearly to max-p at th-max, then jumping to 1]
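A sketch combining the RED computations of the two previous slides - the EWMA of the queue length and the two-step drop probability (threshold values, the EWMA weight q, and the little simulation are illustrative assumptions):

# RED sketch: estimate the average queue length and decide drops probabilistically.
import random

class RED:
    def __init__(self, th_min, th_max, max_p=0.02, q=0.002):
        self.th_min, self.th_max = th_min, th_max
        self.max_p, self.q = max_p, q
        self.avg = 0.0
        self.nb_packets = 0                  # packets accepted since the last drop

    def on_arrival(self, queue_len):
        """Returns True if the arriving packet should be dropped (or ECN-marked)."""
        # average <- q * measure + (1 - q) * average
        self.avg = self.q * queue_len + (1 - self.q) * self.avg
        if self.avg <= self.th_min:
            drop = False
        elif self.avg >= self.th_max:
            drop = True
        else:
            p_tmp = self.max_p * (self.avg - self.th_min) / (self.th_max - self.th_min)
            denom = 1 - self.nb_packets * p_tmp
            p = 1.0 if denom <= 0 else p_tmp / denom   # spaces drops roughly 1/p_tmp apart
            drop = random.random() < p
        self.nb_packets = 0 if drop else self.nb_packets + 1
        return drop

red = RED(th_min=5, th_max=15)
drops = sum(red.on_arrival(queue_len=12) for _ in range(1000))
print("dropped", drops, "of 1000 packets")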

58 Example network for RED

[Topology: sources S1 (2 Mb/s, 10 ms), S2 (2 Mb/s, 60 ms), and S3 (2 Mb/s, 100 ms) feed a router connected to the destination by a 2 Mb/s, 100 ms link; queues are limited to 20 segments]

§ Example network with three TCP sources § different link delays § limited queues on the link (20 packets)

59 Throughput in time

[Plot: ACK numbers vs. time for sources S1, S2, S3 with drop-tail queues]

60 Throughput in time with RED

[Plot: ACK numbers vs. time for sources S1, S2, S3 with RED]

61 WRED

[Diagram: a queue with two pairs of thresholds on the average queue length: (th-min1, th-max1) and (th-min2, th-max2)]

§ Weighted RED § Different thresholds for different classes § higher priority class - higher thresholds § lower drop probability § lower priority class - lower thresholds § greater drop probability § Method for service differentiation 62 Different drop probabilities

[Plot: two p-tmp curves vs. the average queue length - the lower priority class rises from th-min1 to max-p at th-max1, the higher priority class rises from th-min2 to max-p at th-max2]

63 CoDel - Controlled Delay Management

64 CoDel § Keep a single state variable of how long the minimum delay has been above or below the TARGET value for standing queue delay § Rather than measuring queue size in bytes or packets, CoDel uses the packet sojourn time through the queue § Need to add a timestamp of the arrival time to each packet in the queue § A standing queue of TARGET delay is OK § No drop of packets if there is less than one MTU's worth of bytes in the buffer § CoDel identifies the persistent delay by tracking the (local) minimum queue delay packets experience § To ensure that the minimum value does not become stale, it has to have been experienced within the most recent INTERVAL (time on the order of a worst-case

RTT of connections through the bottleneck) 65 CoDel

[Flowchart: on packet dequeue, if the minimum queuing delay within the interval (100 ms) exceeds the target (5 ms), drop the packet and schedule the next drop; otherwise stop dropping and dequeue the next packet]

66 CoDel § TARGET = 5 ms (optimizes power) § INTERVAL = 100 ms (normal Internet usage) § Dequeue packet § track whether the sojourn time is above or below TARGET and, if above, whether it has remained above continuously for at least INTERVAL § If the sojourn time has been above TARGET for INTERVAL, enter the DROPPING STATE - the minimum packet sojourn time is greater than TARGET § If in the DROPPING STATE § drop the first packet: ++count § set the time of the next drop to t + INTERVAL / sqrt(count) - the interval to the next drop decreases in inverse proportion to the square root of the number of drops since entering the DROPPING STATE
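A simplified dequeue-side sketch of this control law (packet sizes, the one-MTU exception, and CoDel's re-entry optimisation are omitted; names are illustrative):

# CoDel sketch: enter the dropping state when the sojourn time has stayed above
# TARGET for a whole INTERVAL, then drop at times spaced INTERVAL/sqrt(count) apart.
from math import sqrt

TARGET = 0.005      # 5 ms
INTERVAL = 0.100    # 100 ms

class CoDel:
    def __init__(self):
        self.first_above_time = 0.0   # when the sojourn time will have been high for INTERVAL
        self.dropping = False
        self.drop_next = 0.0          # time of the next scheduled drop
        self.count = 0                # drops since entering the dropping state

    def on_dequeue(self, sojourn, now):
        """Returns True if this packet should be dropped."""
        if sojourn < TARGET:
            self.first_above_time = 0.0
            self.dropping = False                     # delay under control: stop dropping
            return False
        if self.first_above_time == 0.0:
            self.first_above_time = now + INTERVAL    # start the INTERVAL timer
            return False
        if not self.dropping and now >= self.first_above_time:
            self.dropping = True                      # above TARGET for a full INTERVAL
            self.count = 1
            self.drop_next = now + INTERVAL / sqrt(self.count)
            return True
        if self.dropping and now >= self.drop_next:
            self.count += 1                           # drop faster while delay stays high
            self.drop_next = now + INTERVAL / sqrt(self.count)
            return True
        return False

The caller computes the sojourn time from the arrival timestamp carried by each packet, as described on the previous slide.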

67 Power vs. Target for a Reno TCP

[Plot: average power (throughput/delay) vs. target expressed as a % of RTT, from 0 to 100%, for a Reno TCP flow through CoDel]

68 Power vs. Target for a Reno TCP

[Plot: the same curve zoomed to targets of 0 to 30% of RTT]

69 CoDel

§ Setpoint target of 5% of nominal RTT (5 ms for 100 ms RTT) yields substantial utilization improvement for small added delay

70 CoDel - Controlled Delay Management

§ AQM that has the following characteristics: § parameterless—it has no knobs for operators, users, or implementers to adjust. § treats good queue and bad queue differently—that is, it keeps the delays low while permitting bursts of traffic. § controls delay, while insensitive to round-trip delays, link rates, and traffic loads. § adapts to dynamically changing link rates with no negative impact on utilization. § simple and efficient—it can easily span the spectrum from low-end, Linux-based access points and home routers up to high-end commercial router silicon. 71 Throughput and Fairness Problem

72 Delay

[Bar chart: maximum queuing delay (ms) when the queue is full, under FIFO, RED, SFB, CoDel, and AFQ queue management schemes, for 1 CUBIC + 1 RENO flows and for 1 CUBIC + 1 UDP flows]

CoDel works great for controlling delay!

73 Throughput and Fairness

[Bar chart: throughput (Gbps) of a CUBIC flow and a RENO flow under FIFO, RED, SFB, CoDel, and AFQ queue management schemes]

1 CUBIC flow and 1 RENO flow competing in the bottleneck: unfairness for heterogeneous TCPs! AFQ approximately solves the fairness problem

74 Flow Queue CoDel Packet Scheduler - FQ-CoDel

§ Combine a packet scheduler (DRR) with CoDel (AQM) § Optimization for sparse flows similar to Shortest Queue First (SQF) § "flow queueing" rather than "fair queueing", as flows that build a queue are treated differently from flows that do not § FQ-CoDel stochastically classifies incoming packets into different queues by hashing the 5-tuple (not exactly Flow Queueing) § each queue managed by the CoDel AQM § packet ordering within a queue is preserved - FIFO § Round-robin mechanism distinguishes between § "new" queues (which don't build up a standing queue) and § "old" queues (which have queued enough data to be active for more than one iteration of the round-robin scheduler)
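A toy sketch of the flow-queueing idea described above - hashing on the 5-tuple and serving "new" queues before "old" ones (the per-queue CoDel instances and DRR deficit counters are omitted, and all names are assumptions):

# Flow-queueing sketch: stochastic classification plus new/old queue lists.
from collections import deque

NUM_QUEUES = 1024

class FlowQueues:
    def __init__(self):
        self.queues = [deque() for _ in range(NUM_QUEUES)]
        self.new_queues = deque()            # sparse flows, served first
        self.old_queues = deque()            # flows that keep a standing backlog

    def enqueue(self, five_tuple, packet):
        idx = hash(five_tuple) % NUM_QUEUES  # stochastic flow classification
        q = self.queues[idx]
        if not q and idx not in self.new_queues and idx not in self.old_queues:
            self.new_queues.append(idx)      # a fresh flow starts as a "new" queue
        q.append(packet)                     # FIFO order within the sub-queue

    def dequeue(self):
        for active in (self.new_queues, self.old_queues):
            while active:
                idx = active.popleft()
                q = self.queues[idx]
                if not q:
                    continue                 # empty queue: forget it
                packet = q.popleft()         # (a per-queue CoDel would run here)
                if q:
                    self.old_queues.append(idx)  # still backlogged: demote to "old"
                return packet
        return None

fq = FlowQueues()
fq.enqueue(("10.0.0.1", "10.0.0.2", 5000, 80, "TCP"), "bulk-1")
fq.enqueue(("10.0.0.1", "10.0.0.2", 5000, 80, "TCP"), "bulk-2")
fq.enqueue(("10.0.0.3", "10.0.0.2", 6000, 80, "UDP"), "voip-1")
print(fq.dequeue(), fq.dequeue(), fq.dequeue())  # bulk-1 voip-1 bulk-2: the sparse flow is not stuck behind the bulk flow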
75 FQ-CoDel

[Figure excerpt from a survey paper, "Simplified FQ-CoDel/FQ-PIE AQMs": incoming packets are hashed into per-flow sub-queues, each managed by its own CoDel (or PIE) instance; a deficit round robin scheduler services the sub-queues, giving higher priority to new flows than to old ones]

76 Performance of FQ-CoDel

[Figure excerpt from an ICCCN 2017 paper: a single upstream TCP flow through a FIFO or FQ-CoDel bottleneck, RTTbase = 40 ms, link speeds 1-5 Mbps - RTT is very low over FQ-CoDel and very high over FIFO, while throughput is only slightly lower under FQ-CoDel than under FIFO]

77 Performance of Reno - FIFO

§ 2 Reno sources, 1 UDP gaming stream, FIFO

[Plots from the same ICCCN 2017 paper: throughput and RTT over time of two elastic TCP upstream flows and a client-to-server UDP game flow through a FIFO bottleneck, 15/1 Mbps, RTTbase = 20 ms - despite RTTbase = 20 ms, all three flows experience RTTs ranging from just under 150 ms to just over 250 ms]

78 Performance of Reno - CoDel

§ 2 Reno sources, 1 UDP gaming stream, CoDel

[Plots from the same paper: CoDel during upstream congestion, 15/1 Mbps, RTTbase = 20 ms - the two TCP flows share the 1 Mbps far more chaotically than with FIFO, but the RTTs of all three flows drop to roughly 20-80 ms with intermittent peaks around 100 ms]

79 Performance of Reno - FQ-CoDel

§ 2 Reno sources, 1 UDP gaming stream, FQ-CoDel

[Plots from the same paper: FQ-CoDel during upstream congestion, 15/1 Mbps, RTTbase = 20 ms - bandwidth sharing between the two TCP flows is similar to single-queue CoDel or PIE, but the low-rate game flow now gets consistent throughput and an RTT significantly lower than that of the TCP flows; at 15/5 Mbps the two TCP flows share the bandwidth almost perfectly with a much tighter RTT spread]

80 QoS architectures

§ Integrated Services (IntServ) § per flow reservation at routers (RSVP protocol for reservation) § per flow scheduling § Differentiated Services (DiffServ) § no reservation § classification at the border § scheduling per aggregated classes in the backbone

81 Integrated Services

§ An architecture for providing QOS guarantees in IP networks for individual application sessions § Relies on resource reservation, and routers need to maintain state info, maintaining records of allocated resources and responding to new Call setup requests on that basis

82 Flow Admission

§ Session must first declare its QOS requirement (T- spec) and characterize the traffic it will send through the network § Routers check for resources and reserve them § A signaling protocol is needed to carry QOS requirement to the routers where reservation is required

83 RSVP (Resource Reservation Protocol)

§ PATH message § T-spec - source traffic description § defines the traffic characteristics § token bucket: rate, capacity, and peak rate § the packet travels from source to destination and determines the return route § RESV message § R-spec: if the receiver wants better QoS (e.g., higher rate, bounded jitter) § the packet travels from destination to source along the route established by PATH § reservations are made upon receiving this message

84 Flow Admission

§ Flow Admission: routers will admit flows based on their T-spec and R-spec and on the resources currently allocated at the routers to other flows

85 Integrated Services: Classes

§ Guaranteed QOS: this class is provided with firm bounds on queuing delay at a router; envisioned for hard real-time applications that are highly sensitive to end-to-end delay expectation and variance § rate and delay § Controlled Load: this class is provided a QOS closely approximating an unloaded network; envisioned for today's IP network real-time applications which perform well in an unloaded network § rate

86 Differentiated Services

§ Intended to address the following difficulties with IntServ and RSVP § Scalability: maintaining per-flow state in routers in high speed networks is difficult due to the very large number of flows § Flexible Service Models: IntServ has only two classes; we want to provide more classes - relative service distinction (Platinum, Gold, Silver, …) § Simpler signaling (than RSVP): many applications and users may only want to specify a more qualitative notion of service

87 Differentiated Services

§ Approach: § Only simple functions in the core, and relatively complex functions at edge routers (or hosts) § Do not define service classes, instead provide functional components with which service classes can be built

88 End-to-end DiffServ architecture

[Diagram of an end-to-end DiffServ architecture: mobile hosts and a fixed host attach via access routers to edge routers at the border of a core network; PHBs are applied at the edge and core routers, and an SLA (Service Level Agreement) governs the service between domains]

89 Edge Functions

§ At DS-capable host or first DS-capable router § Classification: edge node marks packets according to classification rules to be specified (manually by admin, or by some TBD protocol) § Traffic Conditioning: edge node may delay and then forward or may discard

90 Classification and Conditioning

§ The packet is marked in the Type of Service (TOS) field in IPv4, and in the Traffic Class field in IPv6 § 6 bits are used for the Differentiated Services Code Point (DSCP) and determine the PHB that the packet will receive
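For illustration, the DSCP occupies the 6 high-order bits of the TOS/Traffic Class byte (the remaining 2 bits carry ECN); a sketch of how an end host could set it through the standard IP_TOS socket option on Linux (the values and calls below are illustrative, not from the slides):

# Build the TOS byte for the Expedited Forwarding code point (DSCP EF = 46).
import socket

DSCP_EF = 46
tos_byte = DSCP_EF << 2                     # place the DSCP in bits 7..2
print(hex(tos_byte))                        # 0xb8

s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
s.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, tos_byte)   # mark outgoing packets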

91 Core Functions

§ Forwarding: according to Per-Hop-Behavior or PHB specified for the particular packet class; such PHB is strictly based on class marking (no other header fields can be used to influence PHB) § QoS, if sufficient provisioning

§ BIG ADVANTAGE: No state info to be maintained by routers!

92 DiffServ service classes

§ Two main types of application § interactive (games, interactive distributed simulations, VoIP, device control) § delay, jitter § elastic (data transfer) § sustained throughput § Traffic classes § EF (Expedited Forwarding) § short delay, small jitter § AF (Assured Forwarding) § minimal sustained throughput § 4 subclasses with 3 different drop probabilities (12 subclasses in total) § BE (Best Effort)

93 DiffServ - Edge router

§ Classification, metering, marking

94 DiffServ - Core router

§ Queue management and scheduling § EF: high priority § AF, BE: WFQ - Weighted Fair Queueing § Traffic shaping

97 Facts to remember

§ QoS in packet networks based on § scheduling algorithms § buffer management policies § Traffic shaping helps to deal with QoS § limiting bursts § traffic description § traffic policing § IETF models § IntServ, DiffServ

98