TCP Congestion Control with a Misbehaving Receiver

Stefan Savage, Neal Cardwell, David Wetherall, and Tom Anderson Department of Computer Science and Engineering University of Washington, Seattle

Abstract

In this paper, we explore the operation of TCP congestion control when the receiver can misbehave, as might occur with a greedy Web client. We first demonstrate that there are simple attacks that allow a misbehaving receiver to drive a standard TCP sender arbitrarily fast, without losing end-to-end reliability. These attacks are widely applicable because they stem from the sender behavior specified in RFC 2581 rather than implementation bugs. We then show that it is possible to modify TCP to eliminate this undesirable behavior entirely, without requiring assumptions of any kind about receiver behavior. This is a strong result: with our solution a receiver can only reduce the data transfer rate by misbehaving, thereby eliminating the incentive to do so.

1 Introduction

End-to-end congestion control mechanisms, such as those used in TCP, are the primary means used for sharing scarce resources in the Internet. These mechanisms implicitly rely on both endpoints to cooperate in determining the proper rate at which to send data. Obviously, if the sending endpoint misbehaves, and does not obey the appropriate congestion control algorithm, then it may send data more quickly than well-behaved hosts, possibly forcing competing traffic to be delayed or discarded. Less obviously, a misbehaving receiver can achieve the same result.

While the possibility of such attacks has been hinted at previously [APS99, PAD+99], we believe the ease of exploiting this vulnerability and the potential impact are not fully appreciated. We note that the population of receivers is extremely large (all Internet users) and has both the incentive (faster Web surfing) and the opportunity (open source operating systems) to exploit this vulnerability.

In this paper, we explore the impact that a misbehaving receiver can have on TCP congestion control. We present two kinds of results. First, we identify several vulnerabilities that can be exploited by a malicious receiver to defeat TCP congestion control. This can be done in a manner that does not break end-to-end reliability semantics and that relies only on the standard behavior of correctly implemented TCP senders. Tests against live Web servers using a modified TCP implementation that we produced for this purpose, “TCP Daytona”, confirm that common TCP implementations possess these vulnerabilities. Second, we show that it is possible to modify the design of TCP to eliminate this behavior without requiring that the receiver be trusted in any manner. With our protocol modifications, a faulty or malicious receiver can at most cause the sender to transmit data at a slower rate than it otherwise would, thus harming only itself. Because our work has serious practical ramifications for an Internet that depends on trust to avoid congestion collapse, we also describe backwards-compatible mechanisms that can be implemented at the sender to mitigate the effects of untrusted receivers.

As far as we are aware, the division of trust between sender and receiver has not been studied previously in the context of congestion control. While end-to-end congestion control protocols assume that both sender and receiver behave correctly, in many environments the interests of sender and receiver may differ considerably, creating significant incentives to violate this “good faith” doctrine. For example, in many wide-area data retrieval applications, such as Web browsing, the sender's interest is to provide uniform service to all clients requesting data, while the interest of each client receiver is to maximize its own data throughput. Traditional cryptographic security mechanisms, such as those embodied in IPSEC [KA98], can provide guarantees of authentication and confidentiality, but they cannot prevent a receiver from violating TCP's congestion control specification and consequently undermining the fairness and stability provided therein.

The potential congestion resulting from aggressive senders has received significant attention from the networking community [She94, FF99, RHE99, VRC98], and has produced proposals for per-flow bandwidth reservation [ZDE+93] and mechanisms to detect and limit “unfriendly” flows in the network [FF99]. These solutions, if workable, would solve the more general problem of unconstrained data transmission and would make the issue of trust in end-to-end congestion control less pressing. However, given that it is unlikely that such mechanisms will be widely deployed in the near term, we feel it is still prudent to consider the potential impact of untrusted receivers on the congestion control mechanisms in today's Internet.

2 Vulnerabilities

By systematically considering sequences of message exchanges, we have been able to identify several vulnerabilities that allow misbehaving receivers to control the sending rate of unmodified, conforming TCP senders. This section describes these vulnerabilities and techniques for exploiting them. In addition to denial-of-service attacks, these techniques can be used to enhance the performance of the attacker's TCP sessions at the expense of behaving clients.

This work was funded by generous grants from NSF (CCR 94-53532), DARPA (F30602-98-1-0205), USENIX, the National Library of Medicine, Cisco, Fuji and Intel. Correspondence concerning this paper may be sent to [email protected].

2.1 TCP review

While a detailed description of TCP's error and congestion control mechanisms is beyond the scope of this paper, we describe the rudiments of their behavior below to allow those unfamiliar with TCP to understand the vulnerabilities explained later. For simplicity, we consider TCP without the Selective Acknowledgment option (SACK) [MMFR96], although the vulnerabilities we describe also exist when SACK is used.

TCP is a connection-oriented, reliable, ordered, byte-stream protocol with explicit flow control. A sending host divides the data stream into individual segments, each of which is no longer than the Sender Maximum Segment Size (SMSS) determined during connection establishment. Each segment is labeled with explicit sequence numbers to guarantee ordering and reliability. When a host receives an in-sequence segment it sends a cumulative acknowledgment (ACK) in return, notifying the sender that all of the data preceding that segment's sequence number has been received and can be retired from the sender's retransmission buffers. If an out-of-sequence segment is received, then the receiver acknowledges the next contiguous sequence number that was expected. If outstanding data is not acknowledged for a period of time, the sender will time out and retransmit the unacknowledged segments.

TCP uses several algorithms for congestion control, most notably slow start and congestion avoidance [Jac88, Ste94, APS99]. Each of these algorithms controls the sending rate by manipulating a congestion window (cwnd) that limits the number of outstanding unacknowledged bytes that are allowed at any time. When a connection starts, the slow start algorithm is used to quickly increase cwnd to reach the bottleneck capacity. When the sender infers that a segment has been lost it interprets this as an implicit signal of network overload and decreases cwnd quickly. After roughly approximating the bottleneck capacity, TCP switches to the congestion avoidance algorithm, which increases the value of cwnd more slowly to probe for additional bandwidth that may become available.

We now describe three attacks on this congestion control procedure that exploit a sender's vulnerability to non-conforming receiver behavior.

2.2 ACK division

TCP uses a byte granularity error control protocol and consequently each TCP segment is described by sequence number and acknowledgment fields that refer to byte offsets within a TCP data stream. However, TCP's congestion control algorithm is implicitly defined in terms of segments rather than bytes. For example, the most recent specification of TCP's congestion control behavior, RFC 2581, states:

    During slow start, TCP increments cwnd by at most SMSS bytes for each ACK received that acknowledges new data.
    ...
    During congestion avoidance, cwnd is incremented by 1 full-sized segment per round-trip time (RTT).

The incongruence between the byte granularity of error control and the segment granularity (or more precisely, SMSS granularity) of congestion control leads to the following vulnerability:

Attack 1:
    Upon receiving a data segment containing N bytes, the receiver divides the resulting acknowledgment into M, where M ≤ N, separate acknowledgments, each covering one of M distinct pieces of the received data segment.

This attack is demonstrated in Figure 1 with a time line. Here, each message exchanged between sender and receiver is shown as a labeled arrow, with time proceeding down the page. The labels indicate the type of message, data or acknowledgment, and the sequence space consumed. In this example we can see that each acknowledgment is valid, in that it covers data that was sent and previously unacknowledged. This leads the TCP sender to grow the congestion window at a rate that is M times faster than usual. The receiver can control this rate of growth by dividing the segment at arbitrary points, up to one acknowledgment per byte received (when M = N). At this limit, a sender with a 1460 byte SMSS could theoretically be coerced into reaching a congestion window in excess of the normal TCP sequence space (4GB) in only four round-trip times! (See footnote 1.) Moreover, while high rates of additional acknowledgment traffic may increase congestion on the path to the sender, the penalty to the receiver is negligible since the cumulative nature of acknowledgments inherently tolerates any losses that may occur.

[Figure 1 time line: the sender transmits Data 1:1461; the receiver returns ACK 487, ACK 973 and ACK 1461; one RTT later the sender transmits Data 1461:2921, Data 2921:4381, Data 4381:5841 and Data 5841:7301.]

Figure 1: Sample time line for an ACK division attack. The sender begins with cwnd=1, which is incremented for each of the three valid ACKs received. After one round-trip time, cwnd=4, instead of the expected value of cwnd=2.

Footnote 1: Of course the practical transmission rate is ultimately limited by other factors such as sender buffering, receiver buffering and network bandwidth.

2.3 DupACK spoofing

TCP uses two algorithms, fast retransmit and fast recovery, to mitigate the effects of packet loss. The fast retransmit algorithm detects loss by observing three duplicate acknowledgments and it immediately retransmits what appears to be the missing segment. However, the receipt of a duplicate ACK also suggests that segments are leaving the network. The fast recovery algorithm employs this information as follows (again quoted from RFC 2581):

    Set cwnd to ssthresh plus 3*SMSS. This artificially “inflates” the congestion window by the number of segments (three) that have left the network and which the receiver has buffered.
    ...
    For each additional duplicate ACK received, increment cwnd by SMSS. This artificially inflates the congestion window in order to reflect the additional segment that has left the network.
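The two sender rules quoted from RFC 2581 in Sections 2.2 and 2.3 can be sketched together as a toy window model. This is only an illustration under stated assumptions, not a TCP implementation: the class name `SenderWindow`, its fields, and the default `ssthresh` are hypothetical; only slow start is modelled (congestion avoidance is omitted); and `ssthresh` is taken as given, whereas RFC 2581 also recomputes it when fast retransmit is triggered. The sequence numbers are those of the Figure 1 time line.

```python
from dataclasses import dataclass

SMSS = 1460  # segment size in bytes, as used in the paper's figures


@dataclass
class SenderWindow:
    """Toy model of the quoted RFC 2581 window rules (hypothetical names)."""
    cwnd: int = SMSS       # congestion window in bytes
    ssthresh: int = 65535  # assumed starting threshold; not recomputed here
    snd_una: int = 1       # lowest unacknowledged sequence number
    dup_acks: int = 0      # duplicate ACKs seen for snd_una

    def on_ack(self, ack: int) -> None:
        if ack > self.snd_una:
            # Slow start: a full SMSS is added for every ACK of new data,
            # however few bytes that ACK actually covers.
            self.snd_una = ack
            self.dup_acks = 0
            self.cwnd += SMSS
        elif ack == self.snd_una:
            self.dup_acks += 1
            if self.dup_acks == 3:
                # Fast recovery entry: "Set cwnd to ssthresh plus 3*SMSS".
                self.cwnd = self.ssthresh + 3 * SMSS
            elif self.dup_acks > 3:
                # "For each additional duplicate ACK ... increment cwnd by SMSS".
                self.cwnd += SMSS


# One ACK for the whole first segment: the window grows to 2 segments.
normal = SenderWindow()
normal.on_ack(1461)

# The same segment acknowledged in three pieces (Figure 1): 4 segments.
divided = SenderWindow()
for ack in (487, 973, 1461):
    divided.on_ack(ack)

# Duplicate ACKs for sequence number 1: the model cannot tell which
# segment, if any, triggered each duplicate, so every one inflates cwnd.
spoofed = SenderWindow(ssthresh=SMSS)
for _ in range(8):
    spoofed.on_ack(1)

print(normal.cwnd // SMSS, divided.cwnd // SMSS, spoofed.cwnd // SMSS)
```

In this model neither rule consults how many bytes an ACK covers or which data segment caused it, which is the property both attacks rely on.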

Figure 2: Sample time line for a DupACK spoofing attack. The receiver forges multiple duplicate ACKs for sequence number 1. This causes the sender to retransmit the first segment and send a new segment for each additional forged duplicate ACK.

Figure 3: Sample time line for an optimistic ACKing attack. The ACK for the second segment is sent before the segment itself is received, leading the sender to grow cwnd more quickly than otherwise. At the end of this example, cwnd=3, rather than the expected value of cwnd=2.

There are two problems with this approach. First, it assumes that each segment that has left the network is full sized, again an unfortunate interaction of byte granularity error control and segment granularity congestion control. Second, and more important, because TCP requires that duplicate ACKs be exact duplicates, there is no way to ascertain which data segment they were sent in response to. Consequently, it is impossible to differentiate a “valid” duplicate ACK from a forged, or “spoofed”, duplicate ACK. For the same reason, the sender cannot distinguish ACKs that are accidentally duplicated by the network itself from those generated by a receiver [APS99]. In essence, duplicate ACKs are a signal that can be used by the receiver to force the sender to transmit new segments into the network as follows:

Attack 2:
    Upon receiving a data segment, the receiver sends a long stream of acknowledgments for the last sequence number received (at the start of a connection this would be for the SYN segment).

Figure 2 shows a time line for this technique. The first four ACKs for the same sequence number cause the sender to retransmit the first segment. However, cwnd is now set to its initial value plus 3*SMSS, and increased by SMSS for each additional duplicate ACK, for a total of 4 segments (as per the fast recovery algorithm). Since duplicate ACKs are indistinguishable, the receiver does not need to wait for new data to send additional acknowledgments. As a result, the sender will return data at a rate directly proportional to the rate at which the receiver sends acknowledgments. After a period, the sender will time out. However, this can easily be avoided if the receiver acknowledges the missing segment and enters fast retransmit again for a new, later, segment.

2.4 Optimistic ACKing

Implicit in TCP's algorithms is the assumption that the time between a data segment being sent and an acknowledgment for that segment returning is at least one round-trip time. Since TCP's congestion window growth is a function of round-trip time (an exponential function during slow start and a linear function during congestion avoidance), sender-receiver pairs with shorter round-trip times will transfer data more quickly.

However, the protocol does not use any mechanism to enforce its assumption. Consequently, it is possible for a receiver to emulate a shorter round-trip time by sending ACKs optimistically for data it has not yet received:

Attack 3:
    Upon receiving a data segment, the receiver sends a stream of acknowledgments anticipating data that will be sent by the sender.

This technique is demonstrated in Figure 3. Note that while it is easy for the receiver to anticipate the correct sequence numbers to use in each acknowledgment (since senders generally send full-sized segments), this accuracy is not necessary. As long as the receiver acknowledges new data the sender will transmit additional segments. Moreover, if an ACK arrives for data that has not yet been sent, this is generally ignored by the sending TCP, allowing a receiver to be arbitrarily aggressive in its generation of optimistic ACKs.

Unlike the previous attacks, this technique does not necessarily preserve end-to-end reliability semantics: if data from the sender is lost it may be unrecoverable since it has already been acknowledged. However, new features in protocols such as HTTP-1.1 allow receivers to request particular byte-ranges within a data object [FGM+99]. This suggests a strategy in which data is gathered on one connection and lost segments are then collected selectively with application-layer retransmissions on another. Optimistic ACKing could be used to ramp the transfer rate up to the bottleneck rate immediately, and then hold it there by sending acknowledgments in spite of losses. This ability of the receiver to conceal losses is extremely dangerous because it eliminates the only congestion signal available to the sender. A malicious attacker could conceal all losses and therefore lead a sender to increase cwnd indefinitely, possibly overwhelming the network with useless packets.
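The sender-side behavior that Section 2.4 relies on can be illustrated with a small model. It is a sketch under stated assumptions: the class `Sender`, its method names and the slow-start-only growth rule are hypothetical simplifications, and no packets or timers are involved. It shows only that acceptance of an ACK depends on what has been sent, never on whether the data could have arrived.

```python
SMSS = 1460  # segment size in bytes, as in the paper's figures


class Sender:
    """Toy sender state (hypothetical names); slow start only, no loss."""

    def __init__(self) -> None:
        self.snd_una = 1   # oldest unacknowledged byte
        self.snd_nxt = 1   # next byte to be sent
        self.cwnd = SMSS   # congestion window in bytes

    def send_allowed(self) -> int:
        """Send as many full segments as cwnd permits; return how many."""
        sent = 0
        while self.snd_nxt + SMSS - self.snd_una <= self.cwnd:
            self.snd_nxt += SMSS
            sent += 1
        return sent

    def on_ack(self, ack: int) -> bool:
        """Return True if the ACK is accepted as acknowledging new data."""
        if ack > self.snd_nxt or ack <= self.snd_una:
            return False       # data not yet sent (ignored), or nothing new
        self.snd_una = ack
        self.cwnd += SMSS      # growth is triggered by the ACK alone
        return True


s = Sender()
s.send_allowed()               # Data 1:1461 goes out
s.on_ack(1461)                 # ordinary ACK; cwnd becomes 2 segments
s.send_allowed()               # Data 1461:2921 and 2921:4381 go out
early = s.on_ack(2921)         # accepted even if sent before the data arrived
too_far = s.on_ack(7301)       # beyond snd_nxt: ignored, with no penalty
print(early, too_far, s.cwnd // SMSS)
```

Because the model keeps no notion of time, an ACK that arrives "too soon" is indistinguishable from one that arrives after a genuine round trip.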

[Figures 4 and 5: plots of sequence number (bytes, 0 to 60000) against time (0 to 0.7 sec). Series in each plot: Data Segments, ACKs, Data Segments (normal), ACKs (normal).]

Figure 4: The TCP Daytona ACK division attack convinces the TCP sender to send all but the first few segments of a document in a single burst.

Figure 5: The TCP Daytona DupACK spoofing attack, like the ACK division attack, convinces the TCP sender to send all but the first few segments of a document in a single burst.

3 Implementation experience

To exploit the vulnerabilities described above, we made three modifications to the TCP subsystem of Linux 2.2.10. This resulting TCP implementation, which we refer to facetiously as “TCP Daytona”, provides extremely high performance at the expense of its competitors. We demonstrate these abilities with time sequence plots of packet traces for both normal and modified receiver TCPs. Needless to say, our implementation is intentionally not “stable”, and would likely lead to congestion collapse if it were widely deployed.

3.1 ACK division

The TCP Daytona ACK division algorithm adds 24 lines of code that divide each new outgoing ACK into many ACKs for smaller extents of the sequence space. Half of the new code is dedicated to ensuring that the number of outgoing ACKs is no more than should be needed to coerce a sender in slow start to saturate our test machine's 100Mbps Ethernet interface.

Figure 4 shows client-side TCP sequence number plots of our test machine making an HTTP request for the index.html object from cnn.com, with and without our ACK division attack enabled. This figure spans the entire transaction, beginning with the TCP handshake that starts at 0ms and ends at around 70ms, when the HTTP request is sent. The first HTTP data from the server arrives at around 140ms.

This figure shows that, when this attack is enabled, the many small ACKs sent around 140ms convince the Web server to unleash the entire remainder of the document in a single burst; this data arrives exactly one round-trip time later. By contrast, with the normal TCP implementation, the server spreads out the data over the next four round-trip times. In general, as this figure suggests, this attack can convince a TCP sender to send all of its data in a single burst.

3.2 DupACK spoofing

The TCP Daytona DupACK spoofing attack is implemented by 11 lines of code that cause the receiver to send sufficient duplicate ACKs such that the sender (re-)enters fast recovery and fills the receiver's advertised flow control window each round-trip time.

Figure 5 shows another client-side plot of the same HTTP request, this time with the DupACK spoofing attack superimposed on a normal transfer. The many duplicate ACKs that the receiver sends at around 140ms cause the sender to enter fast recovery and transmit the rest of the data, which arrives at around 210ms. Were there more data, the flurry of duplicate ACKs sent at 210ms-230ms would elicit another burst from the sender. Since there is no more new data, the sender simply fills in the hole it perceives; this segment arrives at around 290ms. This figure illustrates how the DupACK spoofing attack can achieve performance essentially equivalent to the ACK division attack: both attacks can convince the sender to empty its entire send buffer in a single burst.

3.3 Optimistic ACKing

The TCP Daytona implementation of optimistic ACKing consists of 45 lines of code. Because acknowledging data that has not arrived is a fundamentally tricky business, we chose a very simple implementation as a proof of concept. When a TCP connection for an HTTP or FTP client receives its first data, we set a timer to expire every 10ms. Any interval would do, but we chose 10ms because it is the smallest interval that Linux 2.2.10 supports on the Intel PC platform. Whenever this periodic timer expires, or a new data segment arrives, our receiver sends a new optimistic ACK for one MSS beyond the previous optimistic ACK.

Figure 6 shows our optimistic ACK algorithm in action transferring the same index.html, again with a normal transfer superimposed. Note that after the first few data segments arrive at around 140ms, the receiver sends a steady stream of ACKs, where each ACK is sent about 10ms-70ms before the corresponding data arrives! The result is that the data transfer employing optimistic ACKs completes in approximately half the normal transfer time. Though this is a modest gain relative to the other attacks, a bolder optimistic ACKing scheme could achieve far greater throughput by acknowledging data at a more rapid pace.

[Figure 6: plot of sequence number (bytes, 0 to 60000) against time (0 to 0.7 sec). Series: Data Segments, ACKs, Data Segments (normal), ACKs (normal).]

Figure 6: The TCP Daytona optimistic ACK attack, by sending a stream of early ACKs, convinces the TCP sender to send data much earlier than it normally would.

3.4 Applicability

In order to verify that common TCP implementations have these vulnerabilities, we tested each attack against a set of nine Web servers running a diverse array of popular server operating systems. To determine the operating system running on each Web server, we used nmap, a tool that identifies TCP implementations based on the “fingerprint” of their characteristic response to TCP segments whose correct response is under-specified [Vas]. To further decrease the possibility of misidentifying an operating system, in all but three cases we were able to use Web servers that were operated by OS vendors and that nmap confirmed were running the high-end server operating system from that vendor.

Table 1 shows which TCP implementations are vulnerable to each attack. The attacks are all widely applicable, with three exceptions. First, Linux 2.2 is not vulnerable to the ACK division attack because it increases its congestion window only if at least one whole previously-unacknowledged segment is acknowledged. Second, Linux 2.0 refuses to count duplicate acknowledgments until cwnd is greater than three. Consequently, the DupACK attack will fail if initiated on connection startup. Finally, Windows NT appears to have a bug that causes it to rarely, if ever, enter fast recovery. This bug renders NT immune to attacks that rely on extra duplicate acknowledgments.

                      ACK Division   DupACK Spoofing   Optimistic Acks
    Solaris 2.6       Y              Y                 Y
    Linux 2.0         Y              Y(N)              Y
    Linux 2.2         N              Y                 Y
    Windows NT4/95    Y              N                 Y
    FreeBSD 3.0       Y              Y                 Y
    DIGITAL Unix 4.0  Y              Y                 Y
    IRIX 6.x          Y              Y                 Y
    HP-UX 10.20       Y              Y                 Y
    AIX 4.2           Y              Y                 Y

Table 1: For each TCP Daytona attack, we denote with a “Y” those operating systems that we found to be vulnerable. Most operating systems we tested were vulnerable to all the Daytona attacks.

4 Solutions

As demonstrated in the previous section, TCP's current specification has several vulnerabilities that allow a misbehaving receiver to control the sender's transmission rate. While it is impossible to force a receiver to behave correctly, it is both possible and desirable to remove its incentive to misbehave. That is, we wish to ensure that a misbehaving receiver cannot obtain data faster than a behaving one. In this section we describe simple modifications to the TCP protocol that, without changing the nature of congestion control, allow the verification of what has historically been an implicit contract between the sender and the receiver: that each acknowledgment faithfully and unambiguously reflects data that has been successfully transferred to the receiver.

4.1 Designing robust protocols

We believe TCP's vulnerabilities arise from a combination of unstated assumptions, casual specification and a pragmatic need to develop congestion control mechanisms that are backward compatible with previous TCP implementations. In retrospect, if the contract between sender and receiver had been defined explicitly these vulnerabilities would have been obvious.

We are inspired by Abadi and Needham's paper, Prudent Engineering Practice for Cryptographic Protocols, which presents a set of design rules that are surprisingly germane to this problem [AN96]. In particular we reprint their first three principles below:

Principle 1. Every message should say what it means: the interpretation of the message should depend only on its content.

Principle 2. The conditions for a message to be acted upon should be clearly set out so that someone reviewing a design may see whether they are acceptable or not.

Principle 3. If the identity of a principal is essential to the meaning of a message, it is prudent to mention the principal's name explicitly in the message.

4.2 ACK division

This vulnerability arises from an ambiguity about how ACKs should be interpreted, a violation of the second principle. TCP's error control allows an ACK to specify an arbitrary byte offset in the sequence space while the congestion control specification assumes that an ACK covers an entire segment.

There are two obvious solutions: either modify the congestion control mechanisms to operate at byte granularity or guarantee that segment-level granularity is always respected. The first solution is virtually identical to the “byte counting” modifications to TCP discussed in [All98, All99]. If cwnd is not incremented by a full SMSS, but only in proportion to the amount of data acknowledged, then ACK division attacks will have no effect. The second, perhaps simpler, solution is to only increment cwnd by one SMSS when a valid ACK arrives that covers the entire data segment sent. As mentioned earlier, this technique is employed in the latest versions of Linux (2.2.x) at the time of this writing.

4.3 DupACK spoofing

During fast recovery and fast retransmit, TCP's design violates the first principle: the meaning of a duplicate ACK is implicit, dependent on previous context, and consequently difficult to verify.

TCP assumes that all duplicate ACKs are sent in response to unique and distinct segments. This assumption is unenforceable without some mechanism for identifying the data segment that led to the generation of each duplicate ACK. The traditional method for guaranteeing association is to employ a nonce [Sch96]. We present a simple version of such a nonce protocol below (we will extend it shortly):

Singular Nonce:
    We introduce two new fields into the TCP packet format: Nonce and Nonce Reply. For each segment, the sender fills the Nonce field with a unique random number generated when the segment is sent. When a receiver generates an ACK in response to a data segment, it echoes the nonce value by writing it into the Nonce Reply field.

The sender can then arrange to only inflate cwnd in response to duplicate ACKs whose Nonce Reply value corresponds to a data segment previously sent and not yet acknowledged.

We note that the singular nonce, as we have described it so far, is similar to the Timestamps option [JBB92], with two important differences. First, the Nonce field preserves association for duplicate ACKs, while the Timestamps option does not (preferring instead to reuse the previous timestamp value). Second, and more important, because Timestamps is an option, a receiver has the choice to not participate in its use. We cannot rely on misbehaving clients to voluntarily participate in their own policing. For the same reason, we cannot rely on other TCP options, such as proposed extensions to SACK [FMM+99], to eliminate this vulnerability.
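The sender-side check implied by the singular nonce of Section 4.3 might look like the following minimal sketch. Everything beyond the paper's description is an assumption made for illustration: the class and method names are hypothetical, the 32-bit nonce width is arbitrary, and letting each nonce justify at most one window inflation is one possible reading of “corresponds to a data segment previously sent and not yet acknowledged”.

```python
import secrets


class NonceSender:
    """Hypothetical sender-side bookkeeping for the singular nonce."""

    def __init__(self) -> None:
        # Maps each outstanding nonce to the sequence number it was sent with.
        self.outstanding: dict[int, int] = {}

    def send_segment(self, seq: int) -> int:
        """Pick a fresh random nonce for a segment and remember it."""
        nonce = secrets.randbits(32)  # width is an assumption
        while nonce in self.outstanding:
            nonce = secrets.randbits(32)
        self.outstanding[nonce] = seq
        return nonce

    def on_cumulative_ack(self, ack: int) -> None:
        """Forget nonces of segments that start below the cumulative ACK."""
        self.outstanding = {n: s for n, s in self.outstanding.items() if s >= ack}

    def dup_ack_may_inflate(self, nonce_reply: int) -> bool:
        """Allow inflation only for a nonce that is outstanding and unused."""
        return self.outstanding.pop(nonce_reply, None) is not None


sender = NonceSender()
sender.send_segment(1)                 # the segment the receiver reports missing
n2 = sender.send_segment(1461)         # a later segment that did arrive

genuine = sender.dup_ack_may_inflate(n2)  # echoes a real, unused nonce
replayed = sender.dup_ack_may_inflate(n2) # same nonce again: rejected
print(genuine, replayed)
```

Under these assumptions a receiver can trigger at most one inflation per data segment it can prove it saw, so forging extra duplicate ACKs gains nothing unless a nonce is guessed.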

Unfortunately, our fix requires the modification of clients and servers and the addition of a TCP field. While it is the only complete solution we have discovered, there are sender-only heuristics which can mitigate, although not eliminate, the impact of the DupACK spoofing attack in a purely backward compatible manner. In particular, the sender can maintain a count of outstanding segments sent above the missing segment. For each duplicate acknowledgment this count is decremented and when it reaches zero any additional duplicate acknowledgments are ignored. This simple fix appears to limit the number of segments wrongly sent to contain no more than cwnd - SMSS bytes. Unfortunately, a clever receiver can acknowledge the missing segment and then repeat the process indefinitely unless other heuristics are employed to penalize this behavior (e.g. by refusing to enter fast retransmit multiple times in a single window as suggested in [Flo95]).

4.4 Optimistic ACKing

The optimistic ACK attack is possible because ACKs do not contain any proof regarding the identity of the data segment(s) that caused them to be sent. In the context of the third principle described earlier, a data segment is a principal and an ACK is the message of concern.

This problem is also well addressed using a nonce. If a nonce cannot be guessed by the receiver, then ACKs with valid nonces imply that a full round-trip time has taken place (man-in-the-middle attacks notwithstanding).

However, the singular nonce we have described is imperfect because it does not mirror the cumulative nature of TCP. Acknowledgments can be delayed or lost, yet the cumulative property of TCP's sequence numbers ensures that the most recent ACK can cover all previous data. In contrast, the singular nonce only provides evidence that a single segment was received. A misbehaving receiver could still mount a denial of service attack by concealing lost data, yet still sending back ACKs with valid nonces. To address this deficiency we describe a cumulative nonce as follows:

Cumulative Nonce:
    For each segment, the sender fills the Nonce field with a unique random number generated when the segment is sent. Each side maintains a nonce sum representing the cumulative sum of all in-sequence acknowledged nonces. When a receiver receives an in-sequence segment it adds the value contained in its Nonce field to this sum. When a receiver generates an ACK in response to a data segment, it either echoes the current value of the nonce sum (for in-sequence data) or echoes the nonce value sent by the sender (for out-of-sequence data).

The sender can then efficiently verify that the data acknowledged by the receiver has, in fact, been successfully transferred.

An example of this protocol is depicted in Figure 7. The second ACK (acknowledging bytes 4380 and below) demonstrates the cumulative effect of the nonce, proving that the receiver has in fact seen all three segments (125=27+62+36). The fourth data segment is lost (indicated by a dotted line) and the third ACK attempts to conceal this loss by acknowledging a later segment. However, the ACK will be rejected by the sender, since it cannot provide the proper nonce sum (149) for the data it purports to acknowledge.

[Figure 7 time line, nonce values in parentheses: Data 1:1461 (27); ACK 1461 (27); Data 1461:2921 (62); Data 2921:4381 (36); ACK 4381 (125); Data 4381:5841 (19), dropped; Data 5841:7301 (5); ACK 7301 (156).]

Figure 7: A time line for a transfer using a cumulative nonce. The nonce values are shown in parentheses and it is assumed that each side starts with a nonce sum of zero. The dotted line indicates a data segment that was dropped. The final ACK, which attempts to conceal the loss of this segment, will be rejected because its cumulative nonce value is incorrect (156 instead of the expected value of 149).

A potential complication can occur if the segment boundaries differ between the initial transmission and a subsequent retransmission. This might happen during dynamic path MTU changes. There are several implementation strategies to address this situation, but the simplest is to randomly subdivide the original nonce value in a way that the sum of the new nonce values is still consistent with the original transmission. For example, if a 1460 byte segment is initially transmitted with a nonce value of 14, but subsequent retransmissions are limited to 536 bytes by a path MTU change, then we might retransmit the data in three packets, with nonce values of 7, 3 and 4.

While it is difficult to prevent loss concealment without a cumulative nonce, there are interim sender-side modifications that can approximate a singular nonce and thereby limit the impact of optimistic ACKing attacks. If the sending TCP randomly varies the size of outgoing segments by a small amount (e.g. [SMSS-15 bytes .. SMSS bytes]), a misbehaving receiver will be unable to correctly anticipate the segment boundaries. Consequently, the exact segment boundaries encode a form of nonce and the sending TCP can filter out optimistic ACKs as those that do not fall on the appropriate sequence numbers (this assumes that receivers acknowledge all of the data they receive). As an added disincentive, the sender could send a RST for any ACK that acknowledged data not yet sent. This strategy does not prevent the receiver from concealing loss, but it can mitigate the effects of optimistic ACKs (which we believe is a more attractive attack for the average user).

5 Conclusion

In this paper we have described how a receiver can manipulate the TCP congestion control function managed by the sender, and how the sender can prevent these manipulations. Our work highlights two results that we believe are significant yet not widely appreciated:

• TCP, which was originally designed for a cooperative environment, contains several vulnerabilities that an unscrupulous receiver can exploit to obtain improved service at the expense of other network clients or to implement a denial-of-service attack. We have described ACK division, DupACK spoofing and Optimistic ACK mechanisms and implemented them to demonstrate that the attacks are both real and widely applicable.

Acknowledgments

This paper has benefited from conversations with many different people. In particular, we'd like to thank Sally Floyd, Vern Paxson, Jeff Mogul, Greg Minshall, Venkat Padmanabhan and Robert Grimm for their careful reading and thoughtful comments. In addition, Roch Guerin and the anonymous CCR reviewers provided valuable feedback on the submitted version of this paper.

References

[All98] Mark Allman. On the generation and use of TCP acknowledgments. Computer Communications Review, 28(5), October 1998.

[All99] Mark Allman. TCP byte counting refinements. Computer Communications Review, 29(3), July 1999.

[AN96] Martin Abadi and Roger Needham. Prudent engineering practice for cryptographic protocols. IEEE Transactions on Software Engineering, 22(1), January 1996.

[APS99] M. Allman, V. Paxson, and W. Stevens. TCP congestion control. RFC 2581, April 1999.

[FF99] Sally Floyd and Kevin Fall. Promoting the use of end-to-end congestion control in the Internet. IEEE/ACM Transactions on Networking, August 1999.

[FGM+99] R. Fielding, J. Gettys, J. Mogul, H. Frystyk, L. Masin-

 The design of TCP can be modified, without changing the ter, P. Leach, and T. Berners-Lee. Hypertext Transfer nature of the congestion control function, to eliminate these Protocol – HTTP/1.1. RFC 2616, June 1999. vulnerabilities. We have described the workings of a new Cu- mulative Nonce approach that accomplishes this in a simple [Flo95] Sally Floyd. TCP and successive fast re- yet effective manner. We have also identified and described transmits. http://www.aciri.org/floyd/ sender-only modifications that can be deployed immediately papers/fastretrans.ps, May 1995.

to reduce the scope of the vulnerabilities without receiver-side [FMM+ 99] Sally Floyd, Jamshid Mahdavi, Matt Mathis, Matthew modifications. Podolsky, and Allyn Romanow. An extension to the selective acknowledgment (SACK) option for TCP. Our work can readily be extended to other protocols. While Internet Draft, August 1999. the Cumulative Nonce was defined in the context of TCP, it could be adapted to any sender-based congestion control scheme. This [Jac88] Van Jacobson. Congestion avoidance and control. might prove fruitful for unreliable transports, for example, either SIGCOMM '88, August 1988. those that are explicitly TCP-friendly, such as RAP [RHE99], or other rate adaptive mechanisms, like those employed by RealAu- [JBB92] V. Jacobson, R. Braden, and D. Borman. TCP exten- dio. A Cumulative Nonce could also be used more widely to aid sions for high performance. RFC 1323, May 1992. in the design of other kinds of protocols. This is because it effec- tively defines a sequencing mechanism between untrusted parties [KA98] S. Kent and R. Atkinson. Security architecture for the that, because it is lightweight, idempotent and cumulative, is well internet protocol. RFC 2401, November 1998. suited to network environments. [MMFR96] Matt Mathis, Jamshid Mahdavi, Sally Floyd, and Al- Beyond these immediate results, our work raises more specula- lyn Romanow. TCP Selective Acknowledgement op- tive protocol design issues. TCP was originally designed for a co- tions. RFC 2018, April 1996. operative environment, and its evolution through the years has built on this base. Given this, it is perhaps not so surprising that we were [PAD+ 99] V. Paxson, M. Allman, S. Dawson, W. Fenner, able to find the vulnerabilities we did, because they naturally arise J. Griner, I. Heavens, K. Lahey, J. Semke, and B. Volz. when the sender and receiver represent different interests. With the Known TCP implementation problems. 
RFC 2525, growth of the Internet, however, it is arguable that “separate in- March 1999. terests” should be assumed by default. Protocol functions that are managed by one party would then be designed to minimize the trust [RHE99] Reza Rejaie, Mark Handley, and Deborah Estrin. they place in other parties. We observe that this kind of “separation RAP: An end-to-end rate-based congestion control of interests” will require new mechanisms, such as a Cumulative mechanism for realtime streams in the Internet. In Nonce, to guarantee that different parties respect a common behav- INFOCOM '99, March 1999. ioral contract. [Sch96] Bruce Schneier. Applied Cryptography. John Wiley & Sons, 2nd edition, 1996. [She94] Scott Shenker. Making greed work in networks: A game-theoretic analysis of switch service disciplines. In SIGCOMM '94, pages 47–57, August 1994. [Ste94] W. Richard Stevens. TCP/IP Illustrated, volume 1. Addison Wesley, 1994. [Vas] Fyodor Vaskovich. nmap. http://www. insecure.org/nmap/. [VRC98] L. Vivisano, L. Rizzo, and J. Crowcroft. TCP-like congestion control for layered multicast data transfer. In INFOCOM '98, April 1998.

[ZDE + 93] L. Zhang, S. Deering, D. Estrin, S. Shenker, and D. Zappala. RSVP: A New Resource ReSerVation Protocol. IEEE Network, pages 8–18, September 1993.
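
As a supplementary illustration of the Cumulative Nonce rule from Section 4.4, the following Python sketch models the bookkeeping on each side and replays the exchange of Figure 7. It is a minimal sketch under stated assumptions, not part of the proposed protocol: the names NonceSender, NonceReceiver, split_nonce and figure7_demo are hypothetical, the 32-bit nonce width is assumed, nonce sums are kept as unbounded integers with no wraparound, and the receiver does not buffer out-of-sequence data. The nonces 19 and 5 used for the fourth and fifth segments are assumed values, chosen to be consistent with the expected sum of 149 given in the caption.

```python
import secrets
from dataclasses import dataclass, field


@dataclass
class NonceSender:
    """Sender-side bookkeeping for a cumulative nonce (illustrative only)."""
    next_seq: int = 1
    running_sum: int = 0
    # Maps the sequence number ending each segment to the nonce sum that a
    # cumulative ACK for that sequence number must carry.
    expected_sum: dict = field(default_factory=dict)

    def send(self, length, nonce=None):
        """Record a new segment and return (start_seq, end_seq, nonce)."""
        if nonce is None:
            nonce = secrets.randbelow(2**32)  # assumed 32-bit Nonce field
        start, end = self.next_seq, self.next_seq + length
        self.running_sum += nonce
        self.expected_sum[end] = self.running_sum
        self.next_seq = end
        return start, end, nonce

    def ack_is_valid(self, ack_seq, nonce_sum):
        """Accept a cumulative ACK only if it carries the matching nonce sum."""
        return self.expected_sum.get(ack_seq) == nonce_sum


@dataclass
class NonceReceiver:
    """Well-behaved receiver; out-of-sequence data is not buffered here."""
    next_expected: int = 1
    nonce_sum: int = 0

    def receive(self, start, end, nonce):
        """Return (ack_seq, echoed_value) for one arriving segment."""
        if start == self.next_expected:
            self.next_expected = end
            self.nonce_sum += nonce
            return self.next_expected, self.nonce_sum  # in-sequence: echo sum
        return self.next_expected, nonce  # out-of-sequence: echo segment nonce


def split_nonce(nonce, parts):
    """Randomly split a nonce into `parts` non-negative values with equal sum."""
    cuts = sorted(secrets.randbelow(nonce + 1) for _ in range(parts - 1))
    bounds = [0] + cuts + [nonce]
    return [hi - lo for lo, hi in zip(bounds, bounds[1:])]


def figure7_demo():
    """Replay Figure 7: three segments delivered, the fourth one lost."""
    sender, receiver = NonceSender(), NonceReceiver()
    segments = [sender.send(1460, n) for n in (27, 62, 36, 19, 5)]
    acks = [receiver.receive(*seg) for seg in segments[:3]]
    honest_ok = sender.ack_is_valid(*acks[-1])
    # The receiver conceals the loss of segment 4 and must guess the sum.
    forged_ok = sender.ack_is_valid(7301, 156)
    return acks[-1], honest_ok, forged_ok
```

Under these assumptions, figure7_demo() returns ((4381, 125), True, False): the honest ACK covering the first three segments is accepted, while the ACK for 7301 carrying 156 is rejected because the sender expects 149. A call such as split_nonce(14, 3) yields three values that preserve the original sum, as with the values 7, 3 and 4 in the path MTU example.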