Mo deling TCP : A Simple Mo del and its Empirical Validation

Jitendra Padhye Victor Firoiu Don Towsley Jim Kurose

f jitu, v roiu, towsley, [email protected]

Department of Computer Science

University of Massachusetts

Amherst, MA 01003 USA

Abstract In this pap er we develop a simple analytic characteriza-

tion of the steady state throughput of a bulk TCP

In this pap er we develop a simple analytic characterization

ow i.e., a ow with a large amount of data to send, such

of the steady state throughput, as a function of loss rate

as FTP transfers as a function of loss rate and round trip

and round trip time for a bulk transfer TCP ow, i.e., a

time. Unlike the recentwork of [6 , 7, 10 ], our mo del captures

ow with an unlimited amount of data to send. Unlike the

not only the b ehavior of TCP's fast retransmit mechanism

mo dels in [6 , 7, 10 ], our mo del captures not only the be-

which is also considered in [6 , 7, 10 ] but also the e ect

havior of TCP's fast retransmit mechanism whichis also

of TCP's timeout mechanism on throughput. The measure-

considered in [6 , 7, 10] but also the e ect of TCP's timeout

ments we present in Section 3 indicate that this latter b ehav-

mechanism on throughput. Our measurements suggest that

ior is imp ortant from a mo deling p ersp ective, as we observe

this latter b ehavior is imp ortant from a mo deling p ersp ec-

more timeout events than fast retransmit events in almost

tive, as almost all of our TCP traces contained more time-

all of our TCP traces. Another imp ortant di erence b etween

out events than fast retransmit events. Our measurements

ours and previous work is the ability of our mo del to accu-

demonstrate that our mo del is able to more accurately pre-

rately predict throughput over a signi cantly wider range

dict TCP throughput and is accurate over a wider range of

of loss rates than b efore; measurements presented in [7 ] as

loss rates.

well the measurements presented in this pap er, indicate that

this to o is imp ortant. We also explicitly mo del the e ects

of small receiver-side windows. By comparing our mo del's

1 Intro duction

predictions with a numb er of TCP measurements made b e-

tween various Internet hosts, we demonstrate that our mo del

A signi cant amount of to day's Internet trac, including

is able to more accurately predict TCP throughput and is

WWW HTTP, le transfer FTP, email SMTP, and re-

able to do so over a wider range of loss rates.

mote access Telnet trac, is carried by the TCP transp ort

The remainder of the pap er is organized as follows. In

proto col [18 ]. TCP together with UDP form the very core

Section 2 we describ e our mo del of TCP congestion control

of to day's Internet transp ort layer. Traditionally, simula-

in detail and derive a new analytic characterization of TCP

tion and implementation/measurementhave b een the to ols

throughput as a function of loss rate and average round trip

of choice for examining the p erformance of various asp ects of

time. In Section 3 we compare the predictions of our mo del

TCP. Recently,however, several e orts have b een directed

with a set of measured TCP ows over the Internet, having

at analytically characterizing the throughput of TCP's con-

as their endp oints sites in b oth United States and Europ e.

gestion control mechanism, as a function of and

Section 4 discusses the assumptions underlying the mo del

round trip delay [6, 10 , 7]. One reason for this recent in-

and a number of related issues in more detail. Section 5

terest is that a simple quantitativecharacterization of TCP

concludes the pap er.

throughput under given op erating conditions o ers the p os-

sibility of de ning a \fair share" or \TCP-friendly" [6 ] through-

put for a non-TCP ow that interacts with a TCP connec-

2 A Mo del for TCP Congestion Control

tion. Indeed, this notion has already b een adopted in the

design and development of several multicast congestion con-

In this section we develop a sto chastic mo del of TCP conges-

trol proto cols [19 , 20 ].

tion control that yields a relatively simple analytic expres-

sion for the throughput of a saturated TCP sender, i.e., a



This material is based up on work supp orted by the National

ow with an unlimited amount of data to send, as a function

Science Foundation under grants NCR-95-08274, NCR-95-23807 and

of loss rate and average round trip time RTT.

CDA-95-02639. Any opinions, ndings, and conclusions or recommen-

dations expressed in this material are those of the authors and do not

TCP is a proto col that can exhibit complex b ehavior,

necessarily re ect the views of the National Science Foundation.

esp ecially when considered in the context of the current In-

ternet, where the trac conditions themselves can b e quite

complicated and subtle [14 ]. In this pap er, we fo cus our at-

tention on the congestion avoidance b ehavior of TCP and

its impact on throughput, taking into account the dep en-

dence of congestion avoidance on ACK b ehavior, the manner

in which packet loss is inferred e.g., whether by duplicate

ACK detection and fast retransmit, or by timeout, limited

advertised window Section 2.3. We note that wedonot receiver window size, and average round trip time RTT.

mo del certain asp ects of TCP's b ehavior e.g., fast recov- Our mo del is based on the Reno avor of TCP, asitisby

ery but b elievewehave captured the essential elements of far the most p opular implementation in the Internet to day

TCP b ehavior, as indicated by the generally very go o d ts [13 , 12]. We assume that the reader is familiar with TCP

between mo del predictions and measurements made on nu- Reno congestion control see for example [4 , 17 , 16 ] and we

merous commercial TCP implementations, as discussed in adopt most of our terminology from [4 , 17 , 16 ].

Section 3. A more detailed discussion of mo del assumptions Our mo del fo cuses on TCP's congestion avoidance mech-

and related issues is presented in Section 4. Also note that anism, where TCP's congestion control window size, W; is

in the following, we measure throughput in terms of packets increased by 1=W each time an ACK is received. Con-

p er unit of time, instead of bytes p er unit of time. versely, the window is decreased whenever a lost packet is

detected, with the amount of the decrease dep ending on

whether packet loss is detected by duplicate ACKs or by

2.1 Loss indications are exclusively \triple-duplicate" ACKs

timeout, as discussed shortly.

In this section we assume that loss indications are exclu-

We model TCP's congestion avoidance b ehavior in terms

sively of typ e \triple-duplicate" ACK TD, and that the

of \rounds." A round starts with the back-to-back transmis-

window size is not limited by the receiver's advertised ow

sion of W packets, where W is the current size of the TCP

control window. We consider a TCP ow starting at time

congestion window. Once all packets falling within the con-

t = 0, where the sender always has data to send. For any

gestion windowhave b een sent in this back-to-back manner,

given time t> 0, we de ne N to b e the numb er of pack-

t

no other packets are sentuntil the rst ACK is received for

ets transmitted in the interval [0;t], and B = N =t, the

t t

one of these W packets. This ACK reception marks the end

throughput on that interval. Note that B is the number of

t

packets sent p er unit of time regardless of their eventual fate

of the current round and the b eginning of the next round.

i.e., whether they are received or not. Thus, B represents

t

In this mo del, the duration of a round is equal to the round

the throughput of the connection, rather than its go o dput.

trip time and is assumed to b e indep endent of the window

We de ne the long-term steady-state TCP throughput B to

size, an assumption also adopted either implicitly or ex-

be

plicitly in [6, 7 , 10 ]. Note that wehave also assumed here

N

t

B = lim B = lim

t

that the time needed to send all the packets in a windowis

t!1 t!1 t

smaller than the round trip time; this b ehavior can b e seen

Wehave assumed that if a packet is lost in a round, all re-

in observations rep orted in [2, 12 ].

maining packets transmitted until the end of the round are

0

At the b eginning of the next round, a group of W new

also lost. Therefore we de ne p to b e the probability that

0

packets will b e sent, where W is the new size of the con-

a packet is lost, given that either it is the rst packet in its

gestion control window. Let b be the number of packets

round or the preceding packet in its round is not lost. We

that are acknowledged by a received ACK. Many TCP re-

are interested in establishing a relationship B p between

ceiver implementations send one cumulative ACK for two

the throughput of the TCP connection and p, the loss prob-

consecutive packets received i.e., delayed ACK, [16 ], so b

ability de ned ab ove.

is typically 2. If W packets are sent in the rst round and

A sample path of the evolution of congestion window size

are all received and acknowledged correctly, then W=b ac-

is given in Figure 1. Between two TD loss indications, the

knowledgments will b e received. Since eachacknowledgment

sender is in congestion avoidance, and the window increases

increases the window size by1=W; the window size at the

by1=b packets p er round, as discussed earlier. Immediately

0

b eginning of the second round is then W = W +1=b. That

after the loss indication o ccurs, the window size is reduced

is, during congestion avoidance and in the absence of loss,

by a factor of two.

the window size increases linearly in time, with a slop e of

We de ne a TD p erio d TDP to b e a p erio d b etween two

1=b packets p er round trip time.

TD loss indications see Figure 1. For the i-th TD p erio d

In the following subsections, we mo del TCP's b ehavior

we de ne Y to b e the numb er of packets sent in the p erio d,

i

in the presence of packet loss. Packet loss can b e detected in

A the duration of the p erio d, and W the window size at

i i

one of twoways, either by the reception at the TCP sender

the end of the p erio d. Considering fW g to b e a Markov

i i

regenerative pro cess with rewards fY g see for example

of \triple-duplicate" acknowledgments, i.e., four ACKs with

i i

[15 ], it can b e shown that

the same sequence numb er, or via time-outs. We denote the

former event as a \TD" triple-duplicate loss indication,

E [Y ]

and the latter as a \TO" loss indication.

B = 1

E [A]

We assume that a packet is lost in a round indep endently

of any packets lost in other rounds, a mo deling assumption

In order to derive an expression for B , the long-term steady-

justi ed to some extentby past studies [1 ] that have shown

state TCP throughput, wemust next derive expressions for

that p erio dic UDP packets that are separated byas little

the mean of Y and A.

as 40 msec tend to get lost only in singleton bursts. On

Consider a TD p erio d as in Figure 2. A TD p erio d starts

the other hand, we assume that packet losses are correlated

immediately after a TD loss indication, and thus the current

among the back-to-back transmissions within a round: if a

congestion window size is equal to W =2, half the size of

i1

window b efore the TD o ccurred. At each round the window

packet is lost, all remaining packets transmitted until the

is incremented by1=b and the numb er of packets sentper

end of that round are also lost. This bursty loss b ehavior,

round is incremented by one every b rounds. We denote

which has b een shown to arise from the drop-tail queuing

by the rst packet lost in TDP , and by X the round

i i i

discipline adopted in manyInternet routers, is discussed

where this loss o ccurs see Figure 2. After packet , W 1

i i

in [2 , 3]. We discuss it further in Section 4.

more packets are sent in an additional round b efore a TD

We develop a sto chastic mo del of TCP congestion con-

loss indication o ccurs and the current TD p erio d ends, as

trol in several steps, corresp onding to its op erating regimes:

discussed in more detail in Section 2.2. Thus, a total of

when loss indications are exclusively TD Section 2.1, when

Y = + W 1 packets are sent in X +1 rounds. It

i i i i

loss indications are b oth TD and TO Section 2.2, and

follows that:

E [Y ]= E [ ]+ E [W ] 1 2

when the congestion window size is limited by the receiver's W W 1 W 2

W 3

A 1 A 2 A 3 t

TDP 1 TDP 2 TDP 3

Figure 1: Evolution of window size over time when loss indications are triple duplicate ACKs

packets LEGEND sent ACKed packet W i lost packet

W α i-1 i TD occurs α −1 2 3 ...... i TDP ends β 2 5 i 1 4 no of rounds ..... 1234 Xi last round b b b penultimate round

TDP i

Figure 2: Packets sent during a TD p erio d

To derive E [ ], consider the random pro cess f g , where Finally, to derive an expression for E [X ], we consider the

i i

is the numb er of packets sent in a TD p erio d up to and evolution of W as a function of the numb er of rounds, as in

i i

including the rst packet that is lost. Based on our assump- Figure 2. To simplify our exp osition, in this derivation we

tion that packets are lost in a round indep endently of any assume that W =2 and X =b are integers. First we observe

i1 i

packets lost in other rounds, f g is a sequence of indep en- that during the i-th TD p erio d, the window size increases

i i

dent and identically distributed i.i.d. random variables. between W =2 and W . Since the increase is linear with

i1 i

Given our loss mo del, the probability that = k is equal slop e 1=b,wehave:

i

to the probability that exactly k 1 packets are successfully

X W

i i1

acknowledged b efore a loss o ccurs

+ ; i =1; 2;: :: 7 W =

i

2 b

k 1

P [ = k ]=1 p p; k =1; 2;: :: 3

The fact that Y packets are transmitted in TDP is ex-

i i

pressed by

The mean of is thus

1

X =b1

i

X

X

1

W

k 1

i1

1 p pk = 4 E [ ]=

Y = + k b + 8 

i i

p

2

k =1

k =0

X W X X

i i1 i i

Form 2 and 4 it follows that

= +  1 + 9

i

2 2 b

1 p

X W

i i1

E [Y ]= + E [W ] 5

=  + W 1 + using 7 10

i i

p

2 2

where is the numb er of packets sent in the last round see

i

To derive E [W ] and E [A], consider again TDP . We de-

i

Figure 2. fW g is a Markov pro cess for which a stationary

i i

ne r to b e the duration round trip time of the j -th round

ij

P

X +1

distribution can b e obtained numerically, based on 7 and

i

r . of TDP . Then, the duration of TDP is A =

ij i i i

j =1

10 and on the probability density function of f g given

i

We consider the round trip times r to b e random variables,

ij

in 3. We can also compute the probability distribution

that are assumed to b e indep endent of the size of congestion

of fX g. However, a simpler approximate solution can b e

i

window, and thus indep endent of the round numb er, j . It

obtained by assuming that fX g and fW g are mutually

i i

follows that

indep endent sequences of i.i.d. random variables. With this

E [A]=E [X ]+ 1E [r ] 6

assumption, it follows from 7, 10 and 5 that

Henceforth, we denote by RT T = E [r ] the average value of

2

E [X ] 11 E [W ]=

round trip time. b

and, In the case that another time-out o ccurs b efore successfully

retransmitting the packets lost during the rst time out, the

E [W ] 1 p E [X ]

p erio d of time out doubles to 2T ; this doubling is rep eated

0

+ E [W ]= + E [W ] 1 + E [ ] 12

p 2 2

for each unsuccessful retransmission until 64T is reached,

0

after which the time out p erio d remains constantat64T .

0

We consider that , the numb er of packets in the last round,

i

An example of the evolution of congestion window size

is uniformly distributed b etween 1 and W , and thus E [ ]=

i

TO

is given in Figure 3. Let Z denote the duration of a

E [W ]=2. From 11 and 12, wehave i

TD

sequence of time-outs and Z the time interval between

i

r

2

two consecutive time-out sequences. De ne S to b e

i

2+b 2+b 81 p

13 E [W ] = + +

3b 3bp 3b

TD TO

S = Z + Z

i

i i

Observe that,

Also, de ne M to b e the numb er of packets sent during S .

i i

r

Then, fS ;M g is an i.i.d. sequence of random variables,

i i i

8

p

and wehave

E [W ]= + o1= p 14

E [M ]

3bp

B =

E [S ]

q

8

We extend our de nition of TD p erio ds given in Section 2.1

i.e., E [W ]  for small values of p. From 11, 6 and

3bp

to include p erio ds starting after, or ending in, a TO loss in-

13, it follows

dication b esides p erio ds b etween two TD loss indications.

TD

r

Let n b e the numb er of TD p erio ds in interval Z . For

i

i

2

TD

2+b 2+b 2b1 p

the j -th TD p erio d of interval Z we de ne Y to b e the

ij

i

15 E [X ] = + +

6 3p 6

numb er of packets sent in the p erio d, A to b e the dura-

ij

 !

r

tion of the p erio d, X to b e the numb er of rounds in the

ij

2

2+b 2b1 p 2+b

p erio d, and W to be the window size at the end of the

ij

E [A] = RT T +1 16 + +

p erio d. Also, R denotes the numb er of packets sent during

i

6 3p 6

TO

time-out sequence Z . Observe here that R counts the

i

i

TO

total numb er of packet transmissions in Z , and not just

i Observe that,

r

the numb er of di erent packets sent. This is b ecause, as dis-

2b

p

cussed in Section 2.1, we are interested in the throughput of

E [X ]= + o1= p 17

3p

a TCP ow, rather than its go o dput. Wehave

From 1 and 5 wehave

n n

i i

X X

TO

M = Y + R ; S = A + Z

1p

i ij i i ij

i

+ E [W ]

p

j =1 j =1

B p = 18

E [A]

r

and, thus,

2

81p

1p

2+b 2+b

+ + +

n n

p 3b 3bp 3b

i i

X X

TO

= 19

q

A ]+E [Z ] Y ]+E [R]; E [S ]= E [ E [M ]= E [

ij ij

2b1p

2+b 2+b

2

RT T + +  +1

6 3p 6

j =1 j =1

If we assume fn g to b e an i.i.d. sequence of random

Which can b e expressed as:

i i

variables, indep endentof fY g and fA g, then wehave

ij ij

r

3 1

p

n n

i i

+ o1= p 20 B p=

X X

RT T 2bp

E [ Y  ]= E [n]E [Y ]; E [ A  ]=E [n]E [A]

ij i ij i

j =1 j =1

Thus, for small values of p, 20 reduces to the throughput

formula in [6] for b =1.

TD

To derive E [n] observe that, during Z , the time b etween

i

We next extend our mo del to include TCP b ehaviors

two consecutive time-out sequences, there are n TDPs, where

i

such as timeouts and receiver-limited windows not consid-

each of the rst n 1 end in a TD, and the last TDP ends

i

ered in previous analytic studies of TCP congestion control.

TD

there is one TO out of n in a TO. It follows that in Z

i

i

loss indications. Therefore, if we denote by Q the probabil-

2.2 Loss indications are triple-duplicate ACKs and time-

ity that a loss indication ending a TDP is a TO, wehave

outs

Q =1=E [n]. Consequently,

So far, we have considered TCP ows where all loss indi-

E [Y ]+Q  E [R]

B = 21

cations are due to \triple-duplicate" ACKs. Our measure-

TO

E [A]+ Q  E [Z ]

ments show see Table 2 that in many cases the ma jorityof

Since Y and A do not dep end on time-outs, their means

window decreases are due to time-outs, rather than fast re-

ij ij

are those derived in 4 and 16. To compute TCP through-

transmits. Therefore, a go o d mo del should capture time-out

TO

loss indications.

put using 21 wemust still determine Q; E [R ] and E [Z ]:

In this section we extend our mo del to include the case We b egin by deriving an expression for Q: Consider the

round of packets where a loss indication o ccurs; it will b e re-

where the TCP sender times-out. This o ccurs when packets

1

ferred to as the \p enultimate" round see Figure 4 . Let w

or ACKs are lost, and less than three duplicate ACKs are

received. The sender waits for a p erio d of time denoted by

1

In Figure 4 each ACK acknowledges individual packets i.e.,

T , and then retransmits non-acknowledged packets. Follow-

0

ACKs are not delayed. Wehavechosen this for simplicity of il-

ing a time-out, the congestion window is reduced to one, and

lustration. We will see that the analysis do es not dep end on whether

one packet is thus resent in the rst round after a time out. ACKs are delayed or not. W i2

W i1 W i3 W t i R i=2

A i1 A i2 A i3 T0 2T0 4T0 t TD TO Zi Zi

S i

Figure 3: Evolution of window size when loss indications are triple-duplicate ACKs and time-outs

LEGEND sequence number received packet lost packet sk ACK

k sm+1

m s 1 TD occurs, fw TDP ends

fk+1 fk w

k

f1 time

RTT RTT

penultimate round last round

Figure 4: Packet and ACK transmissions preceding a loss indication

b e the current congestion window size. Thus packets f ::f are ACKed in sequence in the last round where n packets

1 w

are sent in the penultimate round. Packets f ::f are ac- were sent and the rest of the packets in the round, if any,

1

k

knowledged, and packet f is the rst one to b e lost or are lost. Then,

k +1

not ACKed. We again assume that packet losses are corre-

n

m

1 p p; m  n 1

lated within a round: if a packet is lost, so are all packets

C n; m=

n

1 p ; m = n

that follow, till the end of the round. Thus, all packets fol-

lowing f in the p enultimate round are also lost. However,

k +1

^

since packets f ..f are ACKed, another k packets, s ::s are

Then, Qw , the probability that a loss in a window of size

1 1

k k

sent in the next round, whichwe will refer to as the \last"

w is a TO, is given by

round. This round of packets mayhave another loss, say



1 if w  3

packet s . Again, our assumptions on packet loss corre-

m+1

^

lation mandates that packets s ::s are also lost in the

Qw =

m+2

k

P P P

2 w 2

last round. The m packets successfully sent in the last round

Aw; k+ Aw; k C k; m otherwise

k =0 k =3 m=0

are resp onded to byACKs for packet f , which are counted

k

22

as duplicate ACKs. These ACKs are not delayed [16 ], p.

since a TO o ccurs if the number of packets successfully

312, so the numb er of duplicate ACKs is equal to the num-

transmitted in the p enultimate round, k , is less than three,

b er of successfully received packets in the last round. If the

or otherwise if the numb er of packets successfully transmit-

numb er of suchACKs is higher than three, then a TD in-

ted in the last round, m is less than three. Also, due to the

dication o ccurs, otherwise, a TO o ccurs. In b oth cases the

assumption that packet s is lost indep endently of packet

m+1

current p erio d between losses, TDP, ends. We denote by

f since they o ccur in di erent rounds, the probability

Aw; k the probability that the rst k packets are ACKed

k +1

in a round of w packets, given there is a sequence of one or

that there is a loss at f in the p enultimate round and a

k +1

more losses in the round. Then

loss at s in the last round equals Aw; k  C k; m, and

m+1

k

22 follows.

1 p p

Aw; k=

w

1 1 p

Also, we de ne C n; m to b e the probability that m packets

After algebraic manipulations, wehave 2.3 The impact of window limitation

3 3 w 3

1 1 p 1 + 1 p 1 1 p 

So far, wehave not considered any limitation on the con-

^

Qw  = min 1;

w

gestion window size. At the b eginning of TCP ow estab-

1 1 p

lishment, however, the receiver advertises a maximum bu er

23

size which determines a maximum congestion window size,

Observe for example, using L'Hopital's rule that

W . As a consequence, during a p erio d without loss indi-

max

3

^

: lim Qw =

cations, the window size can growuptoW , but will not

max

w p!0

grow further b eyond this value. An example of the evolution

^

of window size is depicted in Figure 5.

Numerically we nd that a very go o d approximation of Q is

To simplify the analysis of the mo del, we make the fol-

3

^

lowing assumption. Let us denote by W the unconstrained

u

Qw   min1;  24

w

window size, the mean of whichisgiven in 13

Q, the probability that a loss indication is a TO, is

r

2

1

81 p 2+b 2+b

X

30 + + E [W ]=

u

^ ^

Q = Qw P [W = w ]=E [Q]

3b 3bp 3b

w =1

We assume that if E [W ]

u max

We approximate

tion E [W ]  E [W ]. In other words, if E [W ]

u u max

^

Q  QE [W ] 25

receiver-window limitation has negligible e ect on the long

where E [W ] is from 13.

term average of the TCP throughput, and thus the TCP

TO

We consider next the derivation of E [R ] and E [Z ].

throughput is given by 27.

For this, we need the probability distribution of the number

On the other hand, if W  E [W ], we approximate

max u

TD

of timeouts in a TO sequence, given that there is a TO. We

E [W ]  W . In this case, consider an interval Z be-

max

have observed in our TCP traces that in most cases, one

tween two time-out sequences consisting of a series of TD

packet is transmitted between two time-outs in sequence.

p erio ds as in Figure 6. During the rst TDP, the window

Thus, a sequence of k TOs o ccurs when there are k 1

grows linearly up to W for U rounds, then remains con-

max 1

consecutive losses the rst loss is given followed by a suc-

stant for V rounds, and then a TD indication o ccurs. The

1

cessfully transmitted packet. Consequently, the number of

window then drops to W =2, and the pro cess rep eats.

max

TOs in a TO sequence has a geometric distribution, and

Thus,

thus

W U

max

i

k 1

W = + ; 8i  2

max

P [R = k ]= p 1 p

2 b

Then we can compute R 's mean

which implies E [U ] = b=2W . Also, considering the

max

1

X

numb er of packets sent in the i-th TD p erio d, wehave

1

26 E [R]= kP [R = k ]=

1 p

W U

max

i

k =1

 + W +V W Y =

max max

i i

2 2

TO

Next, we fo cus on E [Z ], the average duration of a time-

and then

out sequence excluding retransmissions, which can b e com-

puted in a similar way. We know that the rst six time-outs

3 3b

2

i1

E [Y ]= + W E [V ] W E [U ]+ W E [V ]= W

max max max

in one sequence have length 2 T , i =1:::6, with all im-

max

0

4 8

mediately following timeouts having length 64T . Then, the

0

duration of a sequence with k time-outs is

Since Y , the numb er of packets in the i-th TD p erio d, do es

i

n

not dep end on window limitation, E [Y ] is given by 5,

k

2 1T for k  6

0

L =

E [Y ]= 1 p=p + W , and thus

k

max

63 + 64k 6T for k  7

0

TO

1 p 3b

and the mean of Z is

E [V ]= W +1

max

pW 8

1

max

X

TO

L P [R = k ] E [Z ] =

k

Finally, since X = U + V ,wehave

i i i

k =1

1 p b

2 3 4 5 6

1+p +2p +4p +8p +16p +32p

+1 W + E [X ]= E [U ]+ E [V ]=

max

= T

0

8 pW

max

1 p

TO

By substituting this result in 27, we obtain the TCP through-

Armed now with expressions for Q; E [S ];E[R ] and E [Z ]

put, B p, when the window is limited

we can now substitute these expressions into equation 21

to obtain the following for B p:

1p

1

^

+ W + QW 

max max

p 1p

1p

1

^

+ E [W ]+ QE [W ]

B p=

p 1p

f p

1p

b

^

27 B p=

+2+QW T W + RT T 

max max

0

f p

8 pW 1p

max

^

RT T E [X ] + 1 + QE [W ]T

0

1p

In conclusion, the complete characterization of TCP through-

where:

put, B p, is:

2 3 4 5 6

f p=1+p +2p +4p +8p +16p +32p 28

8

1p

1

^

+E [W ]+QE [W ]

p 1p

^

>

if E [W ]

Q is given in 23, E [W ] in 13 and E [X ] in 16. Using

u max

>

f p

b

^

<

E [W ]+1+QE [W ]T RT T 

u

0

2 1p

24, 14 and 17, wehave that 27 can b e approximated

B p=

by

1p

1

>

^

+W +QW 

max max

>

p 1p

:

1

otherwise

f p

1p

b

^

B p  29

W + +2+QW T RT T 

max max

0

p p

8 pW 1p

max

2bp 3bp

2

RT T + T min 1; 3 p1+32p 

31

0

3 8 W i2 W i1 W Wmax W t i i3 R i=2

A i1 A i2 A i3 T0 2T0 4T0 t TD Z TO

Zi i

Figure 5: Evolution of window size when limited by W max

W Wmax

Y1 Y2 Y3

U1 V1 U2 V2 U3 V3 no. of rounds

X1 X2 X3

TDP 1 TDP 2 TDP 3

Figure 6: Fast retransmit with window limitation

^

where f p is given in 28, Q is given in 23 and E [W ]in

u

13. In the following sections we will refer to 31 as the

\full mo del". The following approximation of B p follows

from 29 and 31:

0 1

W 1

max

@ A

B p  min ;

p p

RT T

2bp 3bp

2

RT T + T min 1; 3 p1 + 32p 

0

3 8

Receiver Domain Op erating System

32

ada hofstra.edu Irix 6.2

In Section 3 we verify that equation 32 is indeed a very

afer cs.umn.edu Linux

go o d approximation of equation 31. Henceforth we will refer

al cs.wm.edu Linux 2.0.31

to 32 as the \approximate mo del".

alps cc.gatech.edu SunOS 4.1.3

bab el cs.umass.edu SunOS 5.5.1

baskerville cs.arizona.edu SunOS 5.5.1

3 Measurements and Trace Analysis

ganef cs.ucla.edu SunOS 5.5.1

imagine cs.umass.edu win95

Equations 31 and 32 provide an analytic characteriza-

manic cs.umass.edu Irix 6.2

tion of TCP as a function of packet loss indication rate,

mafalda inria.fr SunOS 5.5.1

RTT, and maximum window size. In this section we empiri-

maria wustl.edu SunOS 4.1.3

cally validate these formulae, using measurement data from

mo di4 ncsa.uiuc.edu Irix 6.2

37 TCP connections established b etween 18 hosts scattered

pif inria.fr Solaris 2.5

across United States and Europ e.

p ong usc.edu HP-UX

Table 1 lists the domains and op erating systems of the

spi sics.se SunOS 4.1.4

18 hosts. All data sets are for unidirectional bulk data

sutton cs.columbia.edu SunOS 5.5.1

transfers. We gathered the measurement data by running

tove cs.umd.edu SunOS 4.1.3

tcpdump at the sender, and analyzing its output with a set

void US site Linux 2.0.30

of analysis programs develop ed by us. These programs ac-

count for various measurement and implementation related

Table 1: Domains and Op erating Systems of Hosts

problems discussed in [13 , 12 ]. For example, when we an-

alyze traces from a Linux sender, we account for the fact

that TD events o ccur after getting only two duplicate acks

instead of three. Our trace analysis programs were further

veri ed bychecking them against tcptrace[9 ] and ns [8 ].

Table 2 summarizes data from 24 data sets, each of which

corresp onds to a one hour long TCP connection in which the

sender b ehaves as an \in nite source" { it always has data

Sender Receiver Packets Loss TD TO RTT Time Sender Receiver Packets Loss TD TO RTT Time

Sent Indic. Out Sent Indic. Out

manic alps 54402 722 19 703 0.207 2.505 manic ada 531533 6432 4320 2112 0.141 2.223

manic baskerville 58120 735 306 429 0.243 2.495 manic afer 255674 4577 2584 1993 0.180 2.3

manic ganef 58924 743 272 471 0.226 2.405 manic al 264002 4720 2841 1879 0.188 2.354

manic mafalda 56283 494 2 492 0.233 2.146 manic alps 667296 3797 841 2956 0.112 1.915

manic maria 68752 649 1 648 0.180 2.416 manic baskerville 89244 1638 627 1011 0.473 3.226

manic spi 117992 784 47 737 0.211 2.274 manic ganef 160152 2470 1048 1422 0.215 2.607

manic sutton 81123 1638 988 650 0.204 2.459 manic mafalda 171308 1332 9 1323 0.250 2.512

manic tove 7938 264 1 263 0.275 3.597 manic maria 316498 2476 5 2471 0.116 1.879

void alps 37137 838 7 831 0.162 0.489 manic mo di4 282547 6072 3976 2096 0.174 2.26

void baskerville 32042 853 339 514 0.482 1.094 manic p ong 358535 4239 2328 1911 0.176 2.137

void ganef 60770 1112 414 696 0.254 0.637 manic spi 298465 2035 159 1876 0.253 2.454

void maria 93005 1651 33 1618 0.152 0.417 manic sutton 348926 6024 3694 2330 0.168 2.185

void spi 65536 671 72 599 0.415 0.749 manic tove 262365 2603 6 2597 0.115 1.955

void sutton 78246 1928 840 1088 0.211 0.601

void tove 8265 856 5 843 0.272 1.356

Table 3: Summary data from 100s traces

bab el alps 13460 1466 0 1461 0.194 1.359

bab el baskerville 62237 1753 197 1556 0.253 0.429

bab el ganef 86675 2125 398 1727 0.201 0.306

[7 ], as well as the predicted throughput from our prop osed

bab el spi 57687 1120 0 1120 0.331 0.953

bab el sutton 83486 2320 685 1635 0.210 0.705

mo del given in 31 as describ ed b elow. The title of the trace

bab el tove 83944 1516 1 1514 0.194 0.520

indicates the average round trip time, the average \single"

pif alps 83971 762 0 760 0.168 7.278

timeout duration T , and the maximum window size W

0 max

pif imagine 44891 1346 15 1329 0.229 0.700

advertised by the receiver in numb er of packets. The x-

pif manic 34251 1422 43 1377 0.257 1.454

axis represents the frequency of loss indications, p, while

y -axis represents the numb er of packets sent.

Table 2: Summary data from 1hr traces

Each one-hour trace was divided into 36 consecutive 100

second intervals, and each plotted p oint on a graph repre-

sents the numb er of packets sentversus the numb er of loss

to send and thus TCP throughput is only limited by the

indications during a 100s interval. While dividing a continu-

TCP congestion control. The exp eriments were p erformed

ous trace into xed sized intervals can lead to some inaccura-

at randomly selected times during 1997 and b eginning of

cies in measuring p, e.g., the interval b oundaries may o ccur

1998. The third and fourth columns of Table 2 indicate

within timeout intervals, thus p erhaps not attributing a loss

the numb er of packets sent and the numb er of loss indica-

eventto the interval where most of its impact is felt, we

tions resp ectively triple duplicate ack or timeout. Dividing

b elieve that by using interval sizes of 100s, which are longer

the total numb er of loss indications by the total number of

than most timeouts, wehave minimized the impact of such

packets sent, yields an approximate value of p. This ap-

inaccuracies. Each 100 second interval is classi ed into one

proximation is similar to the one used in [7 ]. The next two

of four categories: intervals of typ e \TD" did not su er any

columns show a breakdown of the loss indications bytyp e:

timeout only triple duplicate acks, intervals of typ e \T 0"

the numberofTDevents, and the numb er of timeouts. Note

su ered at least one \single" timeout but no exp onential

that p dep ends only on the total numb er of loss indications,

backo , \T 1" represents intervals that su ered a single ex-

and not on their typ e. The last two columns rep ort the

p onential backo at least once i.e a \double" timeout etc.

average round trip time, and average duration of a \single"

The line lab eled \TD Only" stands for Triple-Duplicate

timeout T . These values have b een averaged over the entire

0

acks Only plots the predictions made by the mo del de-

trace. When calculating round trip time values, we follow

scrib ed in [7], which is essentially the same mo del as de-

Karn's algorithm [5 ], in an attempt to minimize the impact

scrib ed in [6 ], while accounting for delayed acks. The line

of timeouts and retransmissions on the RTT estimates.

lab eled \Prop osed Full" represents the mo del describ ed by

Table 3 rep orts summary results from additional 13 data

Equation 31. It has b een p ointed out in [6 ] that the \TD

sets. In these cases, each data set represents 100 serially-

Only" mo del may not be accurate when the frequency of

initiated TCP connections b etween a given sender-receiver

loss indications is higher than 5. We observe that in many

pair. Each connection lasted 100 seconds, and was followed

traces the frequency of loss indications is higher than 5

by a 50 second gap b efore the next connection was initi-

and that indeed the \TD Only" mo del predicts values for

ated. These exp eriments were p erformed at randomly se-

TCP throughput much higher than measured. Also, in sev-

lected times during 1998. The data in columns 3-10 of Ta-

eral traces see for example, Figure 7 we observe that TCP

ble 3 are cumulativeover the set of 100 traces for the given

throughput is limited by the receiver's advertised window

source-destination pair. The last two columns rep ort the av-

size. This is not accounted for in the \TD Only" mo del,

erage value of round trip time and \single" timeout. These

and thus \TD Only" overestimates the throughput at low p

values have b een averaged over all one hundred traces for

values.

the given source-destination pair.

Figures 13-17 show similar graphs, where each p oint rep-

An imp ortant observation to b e drawn from the data in

resents an individual 100 second TCP connection. To plot

these tables is that, in all traces, timeouts constitute the

the mo del predictions, we used round trip and timeout du-

ma jority or a signi cant fraction of the total numb er of loss

rations that were averaged over all 100 traces these values

indications. This underscores the imp ortance of including

also app ear in Table 3. Equation 32 in Section 2 rep-

the e ects of timeouts in the mo del of TCP congestion con-

resents the simple, but approximate form 32 of the full

trol. Wehave also noticed that exp onential backo due to

mo del given in 31. In Figure 18, we plot the predictions

multiple timeouts o ccurs with signi cant frequency. More

of the approximate mo del along with the full mo del. The

details are provided in [11 ].

results for other data sets are similar.

Next, we use the measurement data describ ed ab oveto

In order to accurately evaluate the mo dels, we compute

validate our mo del prop osed in Section 2. Figures 7-12 plot

the average error as follows:

the measured throughput in our trace data, the mo del of manic-baskerville, RTT=0.243, TO=2.495, WMax=6, 1x1hr pif-imagine, RTT=0.229, TO=0.700, WMax=8, 1x1hr 10000 10000

1000 1000

100 100 TD TD T0 T0 T1 T1 T2 T2 Number of Packets Sent 10 T3 or more Number of Packets Sent 10 T3 or more TD Only TD Only Proposed (Full) Proposed (Full)

1 1 0.001 0.01 0.1 1 0.001 0.01 0.1 1

Frequency of Loss Indications (p) Frequency of Loss Indications (p)

Figure 7: manic to baskerville Figure 8: pif to imagine

pif-manic, RTT=0.257, TO=1.454, WMax=33, 1x1hr void-alps, RTT=0.162, TO=0.489, WMax=48, 1x1hr 10000 10000

1000 1000

100 100 TD TD T0 T0 T1 T1 T2 T2 Number of Packets Sent 10 T3 or more Number of Packets Sent 10 T3 or more TD Only TD Only Proposed (Full) Proposed (Full)

1 1 0.001 0.01 0.1 1 0.001 0.01 0.1 1

Frequency of Loss Indications (p) Frequency of Loss Indications (p)

Figure 9: pif to manic Figure 10: void to alps

void-tove, RTT=0.272, TO=1.356, WMax=8, 1x1hr babel-alps, RTT=0.194, TO=1.359, WMax=48, 1x1hr 10000 10000

1000 1000

100 100 TD TD T0 T0 T1 T1 T2 T2 Number of Packets Sent 10 T3 or more Number of Packets Sent 10 T3 or more TD Only TD Only Proposed (Full) Proposed (Full)

1 1 0.001 0.01 0.1 1 0.001 0.01 0.1 1

Frequency of Loss Indications (p) Frequency of Loss Indications (p)

Figure 11: void to tove Figure 12: bab el to alps

 Hour-long traces: We divide each trace into 100 B is from 31. The average error is given by:

second intervals, and compute the numb er of packets

P

jN N j

pr edicted obser v ed

sent during that interval here denoted as N 

obser v ed

N

obser v ations

obser v ed

as well as the value of loss frequency here p .

obser v ed

numb er of observations

We also calculate the average value of round trip time

and timeout for the entire trace these values are avail-

The average error of our approximate mo del using B

able in Table 2. Then, for each 100 second interval we

from 32 and of \TD Only" are calculated in a sim-

calculate the numb er of packets predicted by our pro-

ilar manner. A smaller average error indicates b etter

p osed mo del, N = B p   100s, where

pr edicted obser v ed

mo del accuracy. In Figure 19 we plot these error values

to allow visual comparison. On the x-axis, the traces manic-ganef, RTT=0.2150, TO=2.6078, WMax=6.0, 100x100s manic-mafalda, RTT=0.2501, TO=2.5127, WMax=8.0, 100x100s 10000 10000

1000 1000

100 100 TD TD T0 T0 T1 T1 T2 T2 Number of Packets Sent 10 T3 or more Number of Packets Sent 10 T3 or more TD Only TD Only Proposed (Full) Proposed (Full)

1 1 0.001 0.01 0.1 1 0.001 0.01 0.1 1

Frequency of Loss Indications (p) Frequency of Loss Indications (p)

Figure 13: manic to ganef Figure 14: manic to mafalda

manic-spiff, RTT=0.2539, TO=2.4545, WMax=32.0, 100x100s manic-baskerville, RTT=0.4735, TO=3.2269, WMax=6.0, 100x100s 10000 10000

1000 1000

100 100 TD TD T0 T0 T1 T1 T2 T2 Number of Packets Sent 10 T3 or more Number of Packets Sent 10 T3 or more TD Only TD Only Proposed (Full) Proposed (Full)

1 1 0.001 0.01 0.1 1 0.001 0.01 0.1 1

Frequency of Loss Indications (p) Frequency of Loss Indications (p)

Figure 15: manic to spi Figure 16: manic to baskerville

manic-sutton, RTT=0.1683, TO=2.1852, WMax=25.0, 100x100s RTT=0.2539, TO=2.4545, WMax=32.0, 100x100s 10000 10000

1000 1000

100 100 TD TD T0 T0 T1 T1 T2 T2 Number of Packets Sent 10 T3 or more Number of Packets Sent 10 T3 or more TD Only TD Only Proposed (Full) Proposed (Full) Proposed (Approx) 1 1 0.001 0.01 0.1 1 0.001 0.01 0.1 1

Frequency of Loss Indications (p) Frequency of Loss Indications (p)

Figure 17: manic to sutton Figure 18: manic to spi , with approximate mo del

are identi ed by sender and receiver names. The order It can b e seen from Figures 19 and 20 that in most cases,

in which the traces app ear is such that, from left to our prop osed mo del is a b etter estimator of the observed

right, the average error for the \TD Only" mo del is values than the \TD Only" mo del. Our approximate mo del

increasing. The p oints corresp onding to a given mo del also generally provides more accurate predictions than the

are joined by line segments only for better visual rep- \TD Only" mo del, and is quite close to the predictions made

resentation of the data. by the full mo del. As one would exp ect, our mo del do es not

match all of the observations. We show an example of this

 100 second traces: We use the value of round trip

in Figure 17. This is probably due to a large number of

time and timeout calculated for each 100-second trace.

triple duplicate acks observed for this trace set.

The error values are shown in Figure 20.

Figure 20: Comparison of the mo dels for 100s traces

Figure 19: Comparison of the mo dels for 1hr traces

A Discussion of the Mo del and the Exp erimental Re- 4 manic-p5, RTT=4.726, TO=18.407, WMax=22, 1x1hr

sults 10000

In this section, we discuss various simplifying assumptions

made while constructing the mo del in Section 2, and their 1000

impact on the results describ ed in Section 3.

Our mo del do es not capture the subtleties of the fast re-

covery algorithm. We b elieve that the impact of this omis-

100 ted in Section

sion is quite small, and that the results presen TD

alidate this assumption indirectly. Wehave also assumed 3v T0

T1

tinslow start is negligible compared to that the time sp en T2

Number of Packets Sent 10

Both these assumptions have also the length of our traces. T3 or more TD Only

b een made in [6 , 7, 10 ]. Proposed (Full)

Wehave assumed that packet losses within a round are

orrelated. Justi cation for this assumption comes from the c 1

0.001 0.01 0.1 1

ast ma jority of the routers in Internet to day

fact that the v Frequency of Loss Indications (p)

use the drop-tail p olicy for packet discard. Under this p ol-

icy, all packets that arrive at a full bu er are dropp ed. As

packets in a round are sent back-to-back, if a packet arrives

Figure 21: manic to p5

at a full bu er, it is likely that the same happ ens with the

rest of the packets in the round. Packet loss correlation at

drop-tail routers was also p ointed out in [2 , 3]. In addition,

plot results from one such exp eriment. The receiver was a

we assume that losses in one round are independent of losses

Pentium PC, running Linux 2.0.27 and was connected to the

in other rounds. This is justi ed by the fact that packets

Internet via a commercial service provider using a 28.8Kbps

in di erent rounds are separated by one RTT or more, and

mo dem. The results are for a 1 hour connection divided into

thus they are likely to encounter bu er states that are inde-

100 second intervals.

p endent of each other. This is also con rmed by ndings in

Wehave also assumed that all of our senders implement

[1 ].

TCP-Reno as describ ed in [4, 17 , 16 ]. In [13 , 12 ], it is ob-

Another assumption we made, that is also implicit in

served that the implementation of the proto col stack in each

[6 , 7 , 10], is that the round trip time is indep endent of the

op erating system is slightly di erent. While wehave tried to

window size. Wehave measured the co ecient of correla-

account for the signi cant di erences for example in Linux

tion b etween the duration of round samples and the number

the TD loss indications o ccur after two duplicate ACKs,

of packets in transit during each sample. For most traces

wehave not tried to customize our mo del for the nuances

summarized in Table 2, the co ecient of correlation is in

of each op erating system. For example, we have observed

the range of -0.1 to +0.1, thus lending credence to the sta-

that the Linux exp onential backo do es not exactly follow

tistical indep endence b etween round trip time and window

the algorithm describ ed in [4 , 17, 16 ]. Our observations also

size. However, when we conducted similar exp eriments with

seem to indicate that in the Irix implementation, the exp o-

5 6

receivers at the end of a mo dem line, we found the co ecient

nential backo is limited to 2 , instead of 2 . We are also

of correlation to b e as high as 0.97. We sp eculate that this

aware of the observation made in [13 ] that the SunOS TCP

is a combined e ect of a slow link and a bu er devoted ex-

implementation is derived from Taho e and not Reno. We

clusively to this connection probably at the ISP, just b efore

have not customized our mo del for these cases.

the mo dem. As a result, our mo del, as well as the mo dels

describ ed in [6 , 10 , 7 ] fail to match the observed data in the

case of a receiver at the end of a mo dem. In Figure 21, we

References 5 Conclusion

[1] J. Bolot and A. Vega-Garcia. Control mechanisms for packet

In this pap er wehave presented a simple mo del of the TCP-

audio in the Internet. In Proceedings IEEE Infocom96, 1996.

Reno proto col. The mo del captures the essence of TCP's

[2] K. Fall and S. Floyd. Simulation-based comparisons of congestion avoidance b ehavior and expresses throughput as

Taho e, Reno, and SACK TCP. Computer Communication

a function of loss rate. The mo del takes into account the

Review, 263, July 1996.

b ehavior of the proto col in the presence of timeouts, and is

valid over the entire range of loss probabilities.

[3] S. Floyd and V. Jacobson. Random Early Detection gate-

Wehave compared our mo del with the b ehavior of sev-

ways for congestion avoidance. IEEE/ACM Transactions on

Networking, 14, August 1997.

eral real-world TCP connections. We observed that most

of these connections su ered from a signi cant number of

[4] V. Jacobson. Mo di ed TCP congestion avoidance algorithm.

timeouts. We found that our mo del provides a very go o d

Note sent to end2end-interest mailing list, 1990.

match to the observed b ehavior in most cases, while mo dels

[5] P. Karn and C. Partridge. Improving Round-Trip time esti-

prop osed in [6, 7 , 10] signi cantly overestimate throughput.

mates in reliable transp ort proto cols. Computer Communi-

Thus, we conclude that timeouts have a signi cant impact

cation Review, 175, August 1987.

on the p erformance of the TCP proto col, and that our mo del

[6] J. Mahdavi and S. Floyd. TCP-friendly unicast rate-based

is able to account for this impact.

ow control. Note sent to end2end-interest mailing list, Jan

Wehave also presented a simpli ed expression for TCP

1997.

in Equation 32, which is a go o d approximation

[7] M. Mathis, J. Semske, J. Mahdavi, and T. Ott. The macro-

for the prop osed mo del in most cases. This simple approxi-

scopic b ehavior of the TCP congestion avoidance algorithm.

mation can b e used in proto cols such as those describ ed in

Computer Communication Review, 273, July 1997.

[19 , 20 ] to ensure \TCP-friendliness'.

[8] S. MCanne and S. Flyo d. ns-LBL Network Simulator, 1997.

Anumber of avenues for future work remain. First, our

Obtain via http://www-nrg.ee.lbnl.gov/ns/.

mo del can b e enhanced to account for the e ects of fast re-

[9] S. Ostermann. tcptrace: TCP dump le analysis to ol, 1996.

covery and fast retransmit. Second, a more precise through-

http://jarok.cs.ohiou.edu/software/tcptrace/.

put calculation can be obtained if the congestion window

size is mo deled as a Markovchain. Third, wehave assumed

[10] T. Ott, J. Kemp erman, and M. Mathis. The stationary b e-

havior of ideal TCP congestion avoidance. in preprint. that once a packet in a given round is lost, all remaining

packets in that round are lost as well. This assumption can

[11] J. Padhye, V. Firoiu, D. Towsley, and J. Kurose. Mo deling

b e relaxed, and the mo del can b e mo di ed to incorp orate a

TCP throughput: A simple mo del and its empirical valida-

loss distribution function. Estimating this distribution func-

tion. Technical rep ort UMASS-CS-TR-1998-08.

tion for a given path in the Internet is a signi cant research

[12] V. Paxson. Automated packet trace analysis of TCP imple-

e ort in itself. Fourth, it is interesting to further investigate

mentations. In Proceedings of SIGCOMM 97, 1997.

the b ehavior of TCP over slow links with dedicated bu ers

[13] V. Paxson. End-to-End Internet packet dynamics. In Pro-

such as mo dem lines. We are currently investigating more

ceedings of SIGCOMM 97, 1997.

closely the data sets for which our mo del is not a go o d esti-

[14] V. Paxson and S. Floyd. Whywe don't knowhow to simulate

mator. We are also working on a TCP-friendly proto col to

the Internet. In Proccedings of the 1997 Winter Simulation

control transmission of continuous media. This proto col will

Conference, 1997.

use our mo del to mo dulate its throughput to ensure TCP

[15] S. Ross. AppliedProbability Models with Optimization Ap-

friendliness.

plications.Dover, 1970.

[16] W. Stevens. TCP/IP Il lustrated, Vol.1 The Protocols.

6 Acknowledgments

Addison-Wesley, 1994.

Wewould like to thank Gary Wallace and the CSCF sta at

[17] W. Stevens. TCP Slow Start, Congestion Avoidance, Fast

Retransmit, and Fast Recovery Algorithms. RFC2001, Jan

the Department of Computer Science, University of Mas-

1997.

sachusetts, Amherst, for helping us with tcp dump setup

and general system administration. We are grateful to P.

[18] K. Thompson, G. Miller, and M. Wilder. Wide-area internet

Krishnan of Hofstra University, Zhi-Li Zhang of University

trac patterns and charateristics. IEEE Network, 116,

Novemb er-Decemb er 1997.

of Minnesota, Andreas Stathop oulos of College of William

and Mary, Peter Wan of Georgia Inst. of Tech., Larry

[19] T. Turletti, S. Parisis, and J. Bolot. Exp eriments

Peterson of Universityof Arizona, Lixia Zhnag of Univer-

with a layered transmission scheme over the Internet.

sity of California, Los Angeles, John Bolot and Phillip e

Technical rep ort RR-3296, INRIA, France. Obtain via

http://www.inria.fr/RRRT/RR-3296.html.

Nain of INRIA, Chuck Cranor of Washington University,

St. Louis, Grig Gheorhiu of University of Southern Cali-

[20] L. Vicisano, L. Rizzo, and J. Crowcroft. TCP-like congestion

fornia, Stephen Pink of Swedish Institute of Science, Hen-

control for layered multicast data transfer. In Proceedings of

ning Schulzerinne of Columbia University, Satish Tripathi of

INFOCOMM'98, 1998.

University of Maryland, College Park, and Sneha Kasera of

University of Massachusetts, Amherst for providing us with

computer accounts that allowed us to gather the data pre-

sented in this pap er. We also thank Dan Rub enstein and

the anonymous referees for their helpful comments on earlier drafts of this pap er.