Fair and efficient router congestion control

Xiaojie Gao∗ Kamal Jain† Leonard J. Schulman‡

Abstract

Congestion is a natural phenomenon in any network queuing system, and is unavoidable if the queuing system is operated near capacity. In this paper we study how to set the rules of a queuing system so that all the users have a self-interest in controlling congestion when it happens.

Routers in the internet respond to local congestion by dropping packets. But if packets are dropped indiscriminately, the effect can be to encourage senders to actually increase their transmission rates, worsening the congestion and destabilizing the system. Alternatively, and only slightly more preferably, the effect can be to arbitrarily let a few insistent senders take over most of the router capacity.

We approach this problem from first principles: a router packet-dropping protocol is a mechanism that sets up a game between the senders, who are in turn competing for link capacity. Our task is to design this mechanism so that the game equilibrium is desirable: high total rate is achieved and is shared widely among all senders. In addition, equilibrium should be reestablished quickly in response to changes in transmission rates.

Our solution is based upon auction theory: in principle, although not always in practice, we drop packets of the highest-rate sender in case of congestion. We will prove the game-theoretic merits of our method. We'll also describe a variant of the method with some further advantages, which will be supported by network simulations.

∗ Computer Science Department, California Institute of Technology. Email: [email protected]
† Microsoft. Email: [email protected]
‡ Computer Science Department, California Institute of Technology. Supported by National Science Foundation grant no. 0049092 and by the Charles Lee Powell Foundation. Email: [email protected]

1 Introduction

In a packet-based communication network such as the internet, packets go through routers on their way from source to destination. A router must examine each packet header and perform certain operations including, most significantly, deciding along which of the physical links the packet should be sent. In many (though not all) cases, this processing of the header is the limiting factor determining the capacity (in packets per second) of the router. In order to accommodate traffic bursts, the router maintains a queue. However, when traffic arrives over a sustained period at a rate above the router capacity, the length of the queue will approach the available buffer size; eventually, packet loss is unavoidable.

In the TCP protocol, packet losses (when communicated back to the source by the absence of acknowledgments) cause the source to reduce its transmission rate. This "backoff" mechanism has shown itself to be remarkably successful at maintaining a functional internet in spite of congestion. However, not all TCP installations implement this backoff policy. Moreover, not all traffic in the internet is generated by TCP; most significantly, UDP, which increasingly carries voice and video signals, is unresponsive, which is to say, it does not have a backoff mechanism. There is nothing inherent in the design of the internet to impose a charge or penalty on unresponsive traffic in order to discourage it from crowding out responsive traffic. This raises the specter of internet performance degrading due to high-volume unresponsive traffic, in turn causing more users to implement unresponsive transmission policies. The quality of service properties of such a network are not likely to be desirable, and include the possibility of congestion collapse in the internet.

Naturally this problem has drawn significant attention in the networking literature. For a better introduction than we could possibly provide here, see [4] as well as [17] and [14]. In the sequel we will not dwell further on the motivation, but adopt the problem from the existing literature, and focus on the technical aspects of prior solutions, their advantages and limitations, finally pointing out a significant limitation of all existing solutions, and our approach to addressing it.

Comment: Router design is an active field; modern high-speed routers are complex and contain several processing units and buffers. We follow prior literature in using the simplified one-processor one-queue model as a representative for whichever component of the router happens, in a given circumstance, to be the bottleneck.

The basic problem

The engineering challenge is to design a congestion control mechanism (to be implemented at routers, since we cannot control the sources) to achieve the following simultaneous objectives.

1. The rate (packets transmitted per second) of a router should at all times be close to the lesser of the router capacity and the received traffic.

2. The achieved rates of the various sources should be fair: high-volume sources should not be able to crowd out low-volume sources.

Copyright © 2004 by the Association for Computing Machinery, Inc. and the Society for Industrial and Applied Mathematics. All rights reserved. Printed in the United States of America. No part of this book may be reproduced, stored, or transmitted in any manner without the written permission of the publisher. For information, write to the Association for Computing Machinery, 1515 Broadway, New York, NY 10036 and the Society for Industrial and Applied Mathematics, 3600 University City Science Center, Philadelphia, PA 19104-2688.

This challenge has been taken up in several papers. "Fair queuing" mechanisms create separate queues for the packets from each source, and generally forward packets from different sources in round-robin fashion, selectively dropping packets from high-volume sources in order to fairly allocate rate among sources [3, 15, 7, 16, 2]. An attempt is usually made to allocate to each source its max-min-fairness rate, defined as follows:

Definition 1.1. If the desired rate of each source i is ri, and the router capacity is C, then the max-min-fairness rates are Ri = min{ri, α}, where α = α(C, {ri}) is the supremum of the values for which Σi Ri ≤ C.

[…] proposal to similar criticism of excessive computational overhead. To address this issue, it was proposed in [17] that "core" routers of the network adopt a protocol that does not maintain "state" (such as a record of the volume) for each flow. However, such per-flow state still needs to be maintained at "edge" routers of the network, which then need to communicate rate estimates for the various flows to the core routers. In this method the core routers are not slowed down by flow-specific computations. Like the FQ methods, this proposal aims to achieve fair usage of router capacity by explicitly allocating to each flow its max-min-fairness rate. Potential drawbacks of this method are the assumption that there are edge […]
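The max-min-fairness rates of Definition 1.1 can be computed by a standard water-filling pass. The sketch below is ours, not the paper's (the function name is hypothetical); it simply makes the definition concrete:

```python
def max_min_fair_rates(demands, capacity):
    """Water-filling: returns R_i = min(r_i, alpha), where alpha is the
    largest water level at which sum_i min(r_i, alpha) <= capacity."""
    n = len(demands)
    rates = [0.0] * n
    remaining = capacity
    # Satisfy sources in increasing order of demand; at each step the
    # not-yet-satisfied sources split the remaining capacity equally.
    for pos, i in enumerate(sorted(range(n), key=lambda i: demands[i])):
        equal_share = remaining / (n - pos)
        rates[i] = min(float(demands[i]), equal_share)
        remaining -= rates[i]
    return rates

# Example: capacity 10 shared by demands (6, 5, 2) gives alpha = 4:
print(max_min_fair_rates([6, 5, 2], 10))   # [4.0, 4.0, 2.0]
```

In the example, α = 4: source 2 gets its full demand, while the two larger sources are capped at the water level, and the three rates exactly fill the capacity of 10.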


[…] preferentially penalizing high-volume flows. Moreover (like the other RED variations, as well as the FQ methods), it can be implemented at any router, independently of other routers. Although CHOKe's performance is not well understood analytically, extensive simulations have provided support for its favorable performance with regard to both objectives (1) and (2) above, provided there is just one UDP (unresponsive) flow and all other flows are TCP compliant [18].

On the other hand, when several flows are unresponsive, it's easy to see that CHOKe will fail to prevent those from crowding out the responsive flows. Roughly speaking, if a flow occupies fraction pi of the incoming traffic to the router, then fraction pi of its packets will be dropped by the router; this prevents a single unresponsive flow from trying to dominate traffic into the router, but if there are even two unresponsive flows, they have no incentive to leave any capacity to the responsive flows. If some bound can be assumed on the number of unresponsive flows, then this problem can be compensated for by increasing the complexity of CHOKe: for instance, by sampling a set of more than just two packets, and deleting any packets that occur multiply in the set. However, since the size of this set needs to grow at least linearly with the bound on the number of unresponsive flows, the complexity of this solution grows sharply with that bound, and the solution loses its principal merit, the computational efficiency that yields objective (1).

Our contribution

There is no reason to suppose that, in practice, the number of flows aggressively (unresponsively) maximizing their throughput will be bounded by one, or by any other small number. This limitation of CHOKe is the stimulus for our contribution. Our approach is rooted in game theory and, in particular, in what is known as mechanism design. The perspective is that, as the designer of the router protocol, we are in charge of a game among the sources, each of which is trying to achieve throughput as close as possible to its desired transmission rate. It is well known that, under certain technical conditions, such multiplayer games have Nash equilibria in which the strategies chosen by each of the players are best possible conditional on the strategies of the other players. It is our task to set up the game so that its Nash equilibria satisfy our design objectives (1) and (2).

We begin now by specifying the technical requirements we demand of our solution. There are two types of requirements: (A) computational requirements, (B) game-theoretic requirements.

(A) Computational requirements:

A1. The per-packet time complexity of implementing the router protocol should be constant. (A small constant, comparable with CHOKe.)

A2. The space complexity of implementing the router protocol should be within a constant factor of simply maintaining a single packet queue.

A3. The protocol should be deployable at a single router, with no dependence on whether it has been deployed at any other routers.

(B) Some explanation is needed before presenting the game-theoretic requirements. We will not try to apply the theory of Nash equilibria to the most general situation, in which there are infinitely many sources, each sending messages at times entirely of their choosing, in full knowledge of the randomized strategies of every other source. In view of the very little information actually available in practice to each source, and the overall asynchrony in a large network, this is a needlessly general setting. Instead, we will start with the case in which every source is Poisson. After establishing the basic game-theoretic conclusions in this framework, we will go on to consider what one source can gain by deviating from this framework, if all the other sources remain Poisson. (This is not a severe restriction because even if the other sources send packets at deterministic times, network delays on the way to the router introduce noise into the arrival times.) While the Poisson model is not good for short bursts of traffic, it is a reasonable model, much used in the networking literature (in spite of some limitations), for aggregate and extended-duration traffic.

We stress that when considering a source that is trying to "trick" our router, we will not constrain that source to generate Poisson traffic; the source will be allowed to generate traffic in an arbitrary pattern.

Notation: Let ri be the desired transmission rate of source i. Let C be the capacity of the router. (Specifically, C is the rate the router can achieve while administering a single queue and spending constant time per packet on congestion control. Equivalently, the rate achievable by CHOKe.) Let Ri be the max-min-fairness rate of source i (as defined earlier), given {ri} and C. Let si be the actual Poisson rate chosen by source i. Let ai be the throughput of source i.

Our game-theoretic requirements are:

B1. Assume an idealized situation in which the router, and all the sources, know the source rates {si}, and in which the queue buffer is unbounded. This idealized game should have a unique Nash equilibrium, which is the max-min-fairness rates Ri as determined by the inputs C and {ri}.

As a corollary, when all sources are acting in their own best interest, the rate of the router equals C.


B2. In the actual game (which is administered by a router that can only use its history to govern its actions), there is a small ε > 0 such that any source sending at rate si ≤ (1 − ε)α will achieve throughput ai ≥ si(1 − ε).

As a corollary, when all sources are acting in their own best interest, the rate of the router is at least C(1 − 2ε).

By establishing (B2), we will have accomplished the capacity and fairness objectives (1, 2) specified earlier. This will be done in section 4.1.

The next step will be, as indicated earlier, to remedy (to a degree) our insistence on considering only Poisson sources. We will show:

B3. The long-term throughput of a source which is allowed to send packets at arbitrary times (while all other sources are still restricted to being Poisson) is no more than 1 + ε times that of the best Poisson strategy.

Finally, we'll attend to the performance of a TCP source in our system. The reason for this is not game-theoretic; naturally, TCP is not likely to perform quite so well as a strategy optimized to play our game. Rather, the reason to consider TCP is that it is precisely the sort of responsive protocol which a mechanism such […] protocol; and that it subsequently "creeps" up toward the max-min-fairness threshold α (before again having to back off).

2 The protocol

From the network packet queuing theory perspective, our protocol is similar to CHOKe. In case of congestion, CHOKe penalizes all the sources in proportion to their sending rates. We instead penalize only the highest-rate senders. So all the senders compete to not be the highest-rate flow; this eliminates the congestion.

One can also consider CHOKe and our protocol as unusual sorts of single-item auctions in which the players wish to bid as high as possible without actually winning the item. From this perspective the winner, in the case of CHOKe, is picked randomly with probabilities proportional to the bids; whereas in our protocol the winner is the highest bidder. Since nobody wants to "win" the penalty, the senders, in our protocol, have an incentive to lower their bids until the total is low enough that the auction is cancelled.

We'll actually describe two slightly different versions of the protocol. In either case, the protocol will maintain several items of data:

1. Q, the total number of packets presently in the queue.

2. A hash table containing, for each flow i having packets in the queue, a record of mi, the total number of packets presently in the queue from source i.

3. MAX, a pointer to the record of the source having the highest number of packets in the queue.

In addition, there are several adjustable parameters controlling the protocol behavior: F, the size of the queue buffer; "high" H and "low" L markers satisfying 0 ≤ L < H ≤ F.

Packet arrivals

Each time a packet arrives at the router, the following actions are performed:

1. The packet source i is identified.

2. (a) If Q > H, stamp the packet DROP; (b) otherwise, if H ≥ Q > L and i = MAX, stamp the packet DROP; (c) otherwise, stamp the packet SEND.

3. The packet is appended to the tail of the queue, together with the stamp. (If the stamp is DROP, the packet data can be deleted, but the header is retained so long as the packet is on the queue.)

4. Q and mi are incremented by 1. (If mi was 0, a new record is created.)


5. If mi > mMAX, then the MAX pointer is reassigned the value i.

Packet departures

Each time a packet is pulled off the head of the queue, the following actions are performed:

1. The packet source i is identified.

2. Q and mi are decremented by 1. (If mi = 0, the record is eliminated from the hash table.)

3. If the packet is stamped SEND, it is routed to its destination; otherwise, it is dropped.

4. If i = MAX but mi is no longer maximal, MAX is reassigned to the maximal sender.

Comment 1: The usual procedure when facing buffer overflow is to "droptail", i.e., to continue to serve packets already on the queue but not to accept new packets into the queue. Here, we make drop decisions at the tail of the buffer, but we don't really drop the packets there. Instead, we append a packet to the tail of the buffer together with the stamp of SEND or DROP, and send or drop it at the head of the queue according to its stamp. It may appear peculiar, at first sight, that when we decide to drop packets we put them on the queue anyway, and only really get rid of them when they reach the head of the queue. The reason is that our queue serves a dual purpose. One is the ordinary purpose: a buffering facility to time-average load and thereby approach router capacity. The other purpose is as a measuring device for recent traffic volumes from the sources. In our method the contents of the queue represent, in all circumstances, the complete history of packets received at the queue over some recent interval of time.

Our handling of drops enables us to use the {mi} to estimate the source rates instead of having to compute exponential averages, as was done in some of the recent literature in this area. (The averaging is not complicated but requires reference to the system clock plus some arithmetic; in view of the computational demands on the router, the gain may be meaningful.)

Comment 2: Since we don't "droptail", we need to ensure that we don't create buffer overflow. There are three time parameters associated with this queue: T1 = the time to route a SEND packet at the head of the queue, T2 = the time to append a packet to the tail of the queue, T3 = the time to move the head of the queue past a DROP packet. For a queuing system to make sense, these times should satisfy the inequalities T1 > T2 > T3. We can prevent buffer overflow in our protocol simply by choosing F and H so that F/H ≥ T1/T2. (This is not hard to show.)

Protocol II

Protocol II differs from Protocol I only in line 2(b) of Packet arrivals, which we replace by:

2(b)′ If H ≥ Q > L and mi ≥ ((H − Q)/(H − L)) · mMAX, stamp the packet DROP; otherwise, stamp it SEND.

Protocol II has no advantage over Protocol I with regard to Nash equilibria. However, its sliding scale for drops has a substantial advantage over both CHOKe and Protocol I in the effectiveness with which multiple unresponsive flows are handled. This will be demonstrated by simulation in section 6.

3 Satisfaction of the computational requirements (A)

The most straightforward way to handle the protocol computations is to maintain the active sources (those with mi > 0, i.e., those having packets in the queue) in a priority queue, keyed by mi. This does not entirely resolve the computational requirements, though, for two reasons: (a) updates to a priority queue with n items take time O(log n), rather than a constant; (b) a hashing mechanism is still required in order to find, given a source label i, the pointer into the priority queue.

Item (a) is easily addressed. Since we change mi by only ±1 in any step, the following data structure can substitute for a general-purpose priority queue. Maintain, in a doubly-linked list, a node Nk for each k such that there exists i for which mi = k. The linked list is maintained in order of increasing k. At Nk maintain also a counter c(k) of the number of distinct i for which mi = k. Finally, from Nk maintain also c(k) two-way pointers, each linking to a node at which one of those i labels is stored. This data structure can easily be updated in constant time in response to increments or decrements in any of the mi.

Item (b) is slightly more difficult, since it asks, essentially, for a dynamic hash table with O(1) access time and linear storage space. (The elegant method of Fredman, Komlós and Szemerédi (FKS) [6] is not dynamic.) We know of no perfect way to handle this, but two are at least satisfactory.

One solution is to modify FKS, allow polylogarithmic space overhead, and achieve constant amortized access time by occasionally recomputing the FKS hash function.

An alternative solution is simply to store the pointers in a balanced binary search tree, keyed by the source labels {i}. This method uses linear space but O(log n) access time. A simple device fixes the access-time problem: instead of updating mi and Q with every packet, perform these updates only every (log F)'th packet. Our game-theoretic guarantees will still hold with a slight


loss in quality due to the slightly less accurate estimation of flow rates, and with slightly slower reactions to changes in flow rates. In practice we anticipate that these effects will be negligible, and therefore that this is preferable to the modified-FKS solution.

In some operating environments there might be a moderate, known bound on the number of sources […]

4 Satisfaction of the game-theoretic requirements (B)

4.1 Equilibrium guarantees: properties B1, B2

We first consider a "toy" version of our protocol in which all the sources, and the router, know the true transmission rates si of all sources. (As indicated earlier, we'll assume the sources are constant-rate Poisson.) Moreover, the buffer for the queue is unbounded. If Σi si > C, the router simply drops all the packets of the highest-rate source; if several tie for the highest rate, it rotates randomly between them.

Theorem 4.1. If the sources have desired rates {ri} for which Σi ri > C, and can transmit at any Poisson rates 0 ≤ si ≤ ri, then their only Nash equilibrium is to transmit at rates si = Ri, for Ri the max-min-fairness rates.

Proof. This is immediate (recalling that we treat only […]

[…] that the above argument survives the constraint of having to make do with a finite queue buffer. We first need a Chernoff bound for the probability that a Poisson process deviates far from its mean.

Lemma 4.1. If X is a Poisson process with E[X] = λ, then Pr{X ≥ (1 + δ)λ} < e^(−λδ²/4).

[…] Pr{X ≥ Y} ≤ e^(−α(β−α)²/[…]) + e^(−β(β−α)²/[…]).

Theorem 4.2. […] C of the total traffic, […] i, and which can transmit at any Poisson rates 0 ≤ si ≤ ri: there is a small ε > 0 such that any source sending at rate si ≤ (1 − ε)α will achieve throughput ai ≥ si(1 − ε). (Here α = α(C, {rj}) and L ∈ Ω((1/ε²) ln(1/ε)).)

Proof. The hypotheses guarantee a ratio of at most 1 − ε between si and α(C, {sj}). Packets from source i will be dropped only when the queue grows to size L, and mi is higher than that of all other sources.

Case 1: Σj sj ≥ Σj min{(1 − ε/2)α, rj}. The idea for this case is that with high probability the packet dropping ends quickly, since the protocol will soon identify a higher-rate source. In this case si is not the largest, and there must be a fluctuation in the transmission rates that causes mi to be the largest. Assume source k is sending at the largest rate sk. It's clear that sk ≥ (1 − ε/2)α and si ≤ […]


queue length to increase to L. The fraction of packets lost due to this reason is proportional to the probability of such a fluctuation. The argument proceeds by showing that if L is large enough, this probability is very small. (Details omitted.) □

4.2 Sources cannot increase throughput by variable-rate transmission: property B3

Consider the case that the transmission rates of all sources except one (denote it source S) are fixed. We have the following theorem:

Theorem 4.3. For a source which is allowed to send packets at arbitrary times (while all other sources are restricted to being Poisson), the long-term throughput is no more than 1 + ε times that of the best Poisson strategy.

Proof. Assume source S is allowed to send packets at arbitrary times while the other sources 1, 2, ..., B are restricted to being Poisson processes with rates si (s1 ≥ s2 ≥ ... ≥ sB).

Consider the packets from source S arriving in [T, T′]. We ignore the last few packets which are dropped, as they have no contribution to the throughput. Look at the last packet p1 from source S which is not dropped. Let its arrival time be t1, and at time t1 let the arrival time of the packet at the head of the queue be t1′. Since packet p1 is stamped SEND, in the time interval [t1′, t1], and so in [t1, T′], the number of routed packets from S is no more than mMAX at time t1; otherwise, the stamp would be DROP. By the same argument, we can split the interval [T, T′] into n + 1 sub-intervals: [T, tn], [tn, tn−1], [tn−1, tn−2], ..., [t2, t1], [t1, T′]. Let t0 = T′ and MAXi = mMAX at time ti (i ∈ {0, 1, ..., n − 1}). The number of routed packets from S in [tn, T′] is no more than MAX0(t0 − t1) + MAX1(t1 − t2) + ... + MAXn−1(tn−1 − tn). Given that [T, T′] is long enough, the throughput of source S in [T, T′] approximates that of source S in [tn, T′].

Since in any [ti+1, ti], E[MAXi] ≤ s1(ti − ti+1)(1 + ε) (details omitted), the number of routed packets from S, in the interval [T, T′], is no more than 1 + ε times that of arrival packets from source 1. The best throughput for source S is no more than 1 + ε times that of the Poisson strategy with rate s1. □

4.3 Performance of TCP: property B4

We give here a brief overview of TCP from a theoretical perspective. TCP is a transmission protocol whose main idea is additive increase and multiplicative decrease of the rate in response to absence or presence of congestion. TCP maintains a rotating window of a fixed size, say N, at the sender side. The rotating window is basically a set of N buffers named in a circular manner. When a packet arrives (from some application) at a sender, the packet is parked into one of the available buffers and also sent over the network. If no buffer is available, then the packet generation rate is higher than the serving rate; in this case the rate is halved. The parked packets are removed once the acknowledgement of their successful receipt is received. If all the buffers are emptied, then the packet generation rate is smaller than the serving rate; in this case the rate is increased by a constant, say 1. If the generation rate is exactly the same as the serving rate, then the buffer will reach the empty state or the full state occasionally. Since in our protocol the ideal serving rate, which is given by the "max-min-fairness" threshold α, is not precisely known, we assume that the generation rate is not equal to the serving rate. (Also, equality is actually a favorable case; we write the following theorem from the worst-case point of view.)

Let T^I_TCP be the time to increase the generation rate from 0 to α, or from α to 2α, if all the packets are getting through; let T^D_TCP be the time to decrease the generation rate to 0, starting from rate at most 2α, at the moment that all packets begin to be dropped; and let T^W_TCP be the waiting time until the generation rate starts increasing, once the router begins allowing packets through. These parameters are illustrated in Figure 1.

Theorem 4.4. If T^D_TCP, T^W_TCP, T_1F ∈ O(T^I_TCP), then the throughput of an adaptive flow using a TCP-like adjustment method of increase-by-one and decrease-by-half in our system is at least a 1/D fraction of optimal, for some constant D.

Proof. We'll suppose, in order to simplify details, that T^D_TCP, T^W_TCP, T_1F ≤ T^I_TCP.

Examine the router starting at a moment at which the TCP generation rate is smaller than the serving rate, which is the "max-min-fairness" threshold α. So at this time, the rate is increased by 1. When the number of packets from the flow becomes maximal among all flows (at which time the rate is between α and 2α, since T_1F ≤ T^I_TCP), the arriving packets will start to be dropped. Then within time T^D_TCP, the sender will have reduced its transmission rate. After an additional time at most T_1F, the queue will have cleared, its statistics will reflect the low transmission rate, and the router will stop dropping packets from this source. Finally, after an additional time at most T^W_TCP, the generation rate will again begin increasing. The worst-case throughput for the flow is given by a history as in Figure 1, in which the curve is the generation rate as a function of time and the shaded area illustrates the throughput of the flow (we have pessimistically supposed even that TCP backs


off all the way to rate 0). Given the simplified timing assumptions, the worst-case throughput achieves D ≤ 8. □

[Figure 1: The generation rate (vertical axis, with levels α and 2α) and throughput of the adaptive flow using a TCP-like adjustment method of increase-by-one and decrease-by-half, as a function of time; the phases satisfy T1 = T^I_TCP, T3 = T^D_TCP, T4 = T^W_TCP, and T2, T5 ≤ T_1F, with DROP and SEND regions marked, and the shaded area indicating throughput.]

As a final remark of this section, the use of a TCP-like adaptive mechanism is well motivated in our […]

[…] of packets which get through, and that the message rate must therefore be somewhat lower than the throughput rate. Therefore, by reducing transmission rate until no packets are being dropped, a source can still get just as many packets through, and increase its message rate. (This argument doesn't apply to low-volume sources whose throughputs are not limited by the network; but even for such sources, there are small additional costs, e.g., in CPU time, associated with generating extraneous packets.)

This reduces our task to showing:

Lemma 5.1. If each flow is sending at its throughput, then there is a unique Nash equilibrium, the "max-min-fairness" allocation.

Proof. It is well known that the "max-min-fairness" allocation is unique, so we have only to show that any Nash equilibrium is max-min-fair.

For each router, the summation of the "max-min-fairness" rates for all flows going through the router is at most its capacity. Assume x is a Nash equilibrium and y is any other allocation. If there exists an s ∈ {1, ..., N} (where N is the number of flows) such that ys > xs, then xs < […]
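Before turning to the simulations, the drop rules of section 2 can be collected into one executable sketch. The rendering below is ours (the class layout and the linear-time MAX lookup are simplifications; a deployed router would use the constant-time structures of section 3), but the stamping logic follows rules 2(a)–(c) of Protocol I and rule 2(b)′ of Protocol II:

```python
from collections import Counter, deque

class Router:
    """Illustrative sketch of Protocols I and II: packets are stamped
    DROP or SEND at the tail of the queue, and the stamp is acted on
    only when the packet reaches the head (section 2)."""

    def __init__(self, H, L, protocol=1):
        assert 0 <= L < H
        self.H, self.L, self.protocol = H, L, protocol
        self.queue = deque()   # (flow id, stamp) pairs, head at the left
        self.m = Counter()     # m_i: packets currently queued per flow

    def _max_flow(self):
        # The MAX pointer; a real router maintains this in O(1) (section 3).
        return max(self.m, key=self.m.get)

    def arrive(self, flow):
        Q = len(self.queue)
        if Q > self.H:                       # rule 2(a)
            stamp = 'DROP'
        elif Q > self.L and self.m:
            if self.protocol == 1:           # rule 2(b): drop only MAX
                drop = (flow == self._max_flow())
            else:                            # rule 2(b)': sliding threshold
                m_max = self.m[self._max_flow()]
                drop = self.m[flow] >= m_max * (self.H - Q) / (self.H - self.L)
            stamp = 'DROP' if drop else 'SEND'
        else:                                # rule 2(c)
            stamp = 'SEND'
        self.queue.append((flow, stamp))
        self.m[flow] += 1                    # DROP-stamped packets still count

    def depart(self):
        flow, stamp = self.queue.popleft()
        self.m[flow] -= 1
        if self.m[flow] == 0:
            del self.m[flow]                 # record eliminated when m_i = 0
        return flow, stamp                   # SEND: route it; DROP: discard
```

With H = 8 and L = 2, for example, once Q exceeds L only the flow holding the MAX pointer is stamped DROP under Protocol I, while under Protocol II the threshold ((H − Q)/(H − L)) · mMAX tightens as the queue fills, so progressively smaller flows are penalized as congestion worsens.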


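The doubly-linked list of count nodes Nk described in section 3 (item (a)) can also be made concrete. The sketch below is ours: instead of the paper's counter c(k) plus two-way pointers, it stores for each active count k the set of sources with mi = k, but it supports the same O(1) increment, decrement, and maximum queries:

```python
class CountList:
    """Constant-time substitute for a priority queue (section 3): a
    doubly-linked list with a node for each value k currently taken by
    some m_i, supporting +-1 updates and max queries in O(1)."""

    def __init__(self):
        self.m = {}                  # source -> current count m_i (> 0)
        self.members = {0: set()}    # k -> sources with m_i == k (0 = sentinel)
        self.below = {0: None}       # k -> nearest smaller active count
        self.above = {0: None}       # k -> nearest larger active count
        self.top = 0                 # largest active count

    def _splice_in(self, k, lower):
        # Create node k immediately above node `lower`.
        upper = self.above[lower]
        self.below[k], self.above[k] = lower, upper
        self.above[lower] = k
        if upper is not None:
            self.below[upper] = k
        else:
            self.top = k
        self.members[k] = set()

    def _splice_out(self, k):
        # Remove the now-empty node k > 0 from the list.
        lower, upper = self.below[k], self.above[k]
        self.above[lower] = upper
        if upper is not None:
            self.below[upper] = lower
        else:
            self.top = lower
        del self.members[k], self.below[k], self.above[k]

    def increment(self, source):     # m_i += 1 in O(1)
        k = self.m.get(source, 0)
        if k:
            self.members[k].discard(source)
        if k + 1 not in self.members:
            self._splice_in(k + 1, k)
        self.members[k + 1].add(source)
        self.m[source] = k + 1
        if k and not self.members[k]:
            self._splice_out(k)

    def decrement(self, source):     # m_i -= 1 in O(1)
        k = self.m[source]
        self.members[k].discard(source)
        if k > 1:
            if k - 1 not in self.members:
                self._splice_in(k - 1, self.below[k])
            self.members[k - 1].add(source)
            self.m[source] = k - 1
        else:
            del self.m[source]       # m_i reached 0: record removed
        if not self.members[k]:
            self._splice_out(k)

    def max_source(self):
        # The MAX pointer: any source with maximal m_i, or None.
        return next(iter(self.members[self.top])) if self.top else None
```

Because each update moves a source only between adjacent count nodes, every operation touches a constant number of pointers, which is the property the paper's structure relies on for requirement A1.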
6.1 Simulation 1: testing the equilibrium prop- erty To verify the Nash equilibrium property of our 600 Fair Throughput (2000/33 Kbps) Protocol I’s Throughput protocol, we consider a single r Mbps congested link CHOKe’s Throughput shared by 5 Poisson flows, indexed from 0 to 4. The 500 sending rates of flows 0, 1, 2, and 3 are fixed: r Mbps, 2r/3 Mbps, r/2 Mbps, and 2r/5 Mbps. Flow 4 is al- 400 lowed to send packet at arbitrary rates. We vary the sending rate of flow 4 to study our protocol over a 5000r 300 sec interval. When the buffer size is set to be finite,

H = F/6, and L is 0, the simulation results for Pro- Throughput (Kbps) 200 tocol I and II are summarized in Figure 2. When the sending rate reaches r, a large percentage of the packets 100 are dropped. The greatest bandwidth, approximately 60.6

0.3r Mbps, is obtained at a sending rate less than r 0 0 5 10 15 20 25 30 Mbps. Flow Number

0.35 Protocol I Figure 3: The average throughput of each flow over a Protocol II 0.3 100 sec interval under Protocol I and CHOKe.

[Figure 2: Average throughput of flow 4, as a function of the sending rate, under Protocol I and II.]

In the following simulations we use ns-2 (a network simulator) to test more complicated behaviors.

6.2 Simulation 2: Performance on a single congested link

6.2.1 Comparison with CHOKe. To compare the performances of Protocol I and CHOKe on a single congested link, we consider a 2 Mbps link shared by one UDP flow (indexed 1), which sends at 3 Mbps, and 32 TCP flows (indexed 2 to 33), whose propagation delays are 3 ms. The average throughput of each flow over a 100 sec interval is given in Figure 3. Under Protocol I, the ill-behaved UDP flow is penalized heavily and the TCP flows get approximately their fair rates; but CHOKe doesn't provide as much bandwidth to each TCP source as to the UDP source.

6.2.2 Ten UDP and ten TCP flows on a single congested link. To examine the impact of a set of ill-behaved UDP flows on a set of TCP flows, we consider a 1.5 Mbps link shared by ten UDP sources, each of which sends at 0.5 Mbps, and ten TCP sources, which use TCP Tahoe and whose propagation delays are set to 3 ms. (Simulations of TCP Reno, Newreno and Sack gave similar results.) We compare the performances of Protocol I, Protocol II, and CHOKe. The maximal and minimal throughputs of the two kinds of flows over a 100 sec interval are given in Table 1. All of the protocols prevent the TCP sources from being entirely shut out of the channel (as would happen without any penalties to unresponsive flows); but Protocol I and CHOKe don't penalize the ill-behaved UDP flows enough, as they still get more than their fair rates, while Protocol II stands out by providing just as much bandwidth (actually more) to the TCP sources as to the UDP sources.

Protocol         CHOKe     I         II
UDP MAX (Mbps)   0.15432   0.15576   0.00488
UDP MIN (Mbps)   0.14608   0.14376   0.00352
TCP MAX (Mbps)   0.00200   0.00224   0.14544
TCP MIN (Mbps)   0.00008   0.00056   0.13560

Table 1: The maximal and minimal throughputs of the two kinds of flows over a 100 sec interval under Protocol I, Protocol II, and CHOKe.

6.3 Simulation 3: Multiple Congested Links. So far we have verified the Nash equilibrium and seen the performance of flows on a single congested link. We now analyze how the throughputs of flows are affected when they traverse more than one congested link. A simple network configuration with three routers is constructed, as shown in Figure 4.
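The "fair rates" used as a benchmark in these simulations are max-min fair rates, which can be computed by progressive filling: raise all flows' rates in lockstep, and freeze every flow crossing a link the moment that link saturates. The following is a minimal sketch of that computation; the `max_min_fair` helper and the flow/link names are our own illustration (encoding the Figure 4 topology by hand), not part of Protocol I or II.

```python
def max_min_fair(capacity, routes):
    """Max-min fair rates by progressive filling.

    capacity: {link: Mbps}; routes: {flow: [links it traverses]}.
    """
    rate = {f: 0.0 for f in routes}
    active = set(routes)      # flows not yet frozen by a saturated link
    cap = dict(capacity)      # residual capacity per link
    while active:
        # Links currently carrying at least one active flow.
        used = [l for l in cap if any(l in routes[f] for f in active)]
        if not used:
            break
        # Largest equal increment before some link runs out of capacity.
        inc = min(cap[l] / sum(l in routes[f] for f in active) for l in used)
        for f in active:
            rate[f] += inc
        frozen = set()
        for l in used:
            cap[l] -= inc * sum(l in routes[f] for f in active)
            if cap[l] <= 1e-9:  # link saturated: freeze the flows crossing it
                frozen |= {f for f in active if l in routes[f]}
        if not frozen:
            break
        active -= frozen
    return rate

# The Figure 4 topology: two 10 Mbps links; flows 1 and 2 cross both links,
# flow 3 only the second. L23 (three flows) saturates before L12 (two flows),
# so every flow's max-min fair rate is 10/3, about 3.33 Mbps.
rates = max_min_fair({"L12": 10.0, "L23": 10.0},
                     {"UDP1": ["L12", "L23"],
                      "TCP2": ["L12", "L23"],
                      "TCP3": ["L23"]})
```

Compare with Table 2: under Protocol I the three flows obtain 4.125, 2.451, and 3.196 Mbps on L23, each within about 1 Mbps of the 3.33 Mbps fair rate, whereas CHOKe leaves UDP1 with 8.943 Mbps. The same computation applied to the single-link setup of Section 6.2.1 gives 2/33, about 0.061 Mbps, per flow.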

Each of the congested links has capacity 10 Mbps: the link between routers 1 and 2 (L12), and the following link between routers 2 and 3 (L23).

[Figure 4: Topology for analyzing how the throughput of a flow is affected by more than one congested link. Flows 1 and 2 enter the network at router 1, flow 3 at router 2, and all three flows exit at router 3.]

There are three flows in the network: source 1 is a UDP source, which sends at 10 Mbps, and the other two are TCP sources. The propagation delays of the TCPs are 3 ms. Using Protocol I, the average throughput of each flow over a 20 sec interval at each link in the network is listed in Table 2. Since the throughput of TCP2 is dominated by link L23, where its average bandwidth is 2.451 Mbps, it is reasonable that roughly three quarters of the capacity of link L12 is occupied by UDP1. Both TCP flows get rates close to their fair rates on link L23; since there is only one non-adaptive flow (UDP flow 1), it also gets close to its fair rate on that link. The average throughputs of the flows at each link in the network under Protocol II and CHOKe are also listed in Table 2. The performance of Protocol II is comparable to that of Protocol I, and it outperforms CHOKe. Comparing these results with the single congested link case, we can see that non-adaptive flows, like UDP, suffer more severe packet loss than adaptive flows.

                 CHOKe           Protocol I      Protocol II
(Mbps)           L12     L23     L12     L23     L12     L23
UDP1             9.579   8.943   7.339   4.125   7.643   4.077
TCP2             0.404   0.400   2.534   2.451   2.244   2.163
TCP3             -       0.615   -       3.196   -       3.530

Table 2: One UDP flow and two TCP flows. The average bandwidth of each flow is given under Protocol I, Protocol II, and CHOKe.

Acknowledgments: Thanks to Scott Shenker for helpful discussions and for access to simulation codes; to Steven Low and Ao Tang for helpful discussions; and to Jiantao Wang for access to simulation codes.

References

[1] A. Albanese, J. Blomer, J. Edmonds, M. Luby, and M. Sudan. Priority encoded transmission. IEEE Trans. Information Theory, 42(6):1737-1744, 1996.
[2] J. C. R. Bennett and H. Zhang. WF²Q: Worst-case fair weighted fair queueing. In Proc. IEEE INFOCOM, pages 120-128, 1996.
[3] A. Demers, S. Keshav, and S. Shenker. Analysis and simulation of a fair queueing algorithm. Journal of Internetworking Research and Experience, pages 3-26, October 1990. Also in Proc. ACM SIGCOMM '89, pages 3-12.
[4] S. Floyd and K. Fall. Promoting the use of end-to-end congestion control in the Internet. IEEE/ACM Trans. on Networking, 7(4), 1999.
[5] S. Floyd and V. Jacobson. Random early detection gateways for congestion avoidance. IEEE/ACM Trans. on Networking, 1(4):397-413, 1993.
[6] M. L. Fredman, J. Komlós, and E. Szemerédi. Storing a sparse table with O(1) worst case access time. J. Assoc. Comput. Mach., 31(3):538-544, 1984.
[7] S. Golestani. A self-clocked fair queueing scheme for broadband applications. In Proc. IEEE INFOCOM, pages 636-646, 1994.
[8] E. Hashem. Analysis of random drop for gateway congestion control. MIT LCS Technical Report 465, 1989.
[9] J. M. Jaffe. Bottleneck flow control. IEEE Trans. on Comm., COM-29(7):954-962, 1981.
[10] R. M. Karp, C. H. Papadimitriou, and S. Shenker. A simple algorithm for finding frequent elements in streams and bags. Manuscript.
[11] D. Lin and R. Morris. Dynamics of random early detection. In Proc. ACM SIGCOMM, pages 127-137, 1997.
[12] P. McKenney. Stochastic fairness queueing. In Proc. IEEE INFOCOM, pages 733-740, 1990.
[13] T. Ott, T. Lakshman, and L. Wong. SRED: Stabilized RED. In Proc. IEEE INFOCOM, pages 1346-1355, 1999.
[14] R. Pan, B. Prabhakar, and K. Psounis. CHOKe: a stateless active queue management scheme for approximating fair bandwidth allocation. In Proc. IEEE INFOCOM, 2000.
[15] A. Parekh and R. Gallager. A generalized processor sharing approach to flow control - the single node case. In Proc. IEEE INFOCOM, 1992.
[16] M. Shreedhar and G. Varghese. Efficient fair queueing using deficit round robin. In Proc. ACM SIGCOMM, pages 231-243, 1995.
[17] I. Stoica, S. Shenker, and H. Zhang. Core-stateless fair queueing: achieving approximately fair bandwidth allocations in high speed networks. In Proc. ACM SIGCOMM, pages 118-130, 1998.
[18] A. Tang, J. Wang, and S. H. Low. Understanding CHOKe. In Proc. IEEE INFOCOM, 2003.