An Argument for Increasing TCP's Initial Congestion Window
Total Page:16
File Type:pdf, Size:1020Kb
An Argument for Increasing TCP's Initial Congestion Window Nandita Dukkipati Tiziana Refice Yuchung Cheng Jerry Chu Tom Herbert Amit Agarwal Arvind Jain Natalia Sutin Google Inc. {nanditad, tiziana, ycheng, hkchu, therbert, aagarwal, arvind, nsutin}@google.com ABSTRACT connection lifetime to grow the amount of data that may be TCP flows start with an initial congestion window of at most outstanding at a given time. Slow start increases the conges- four segments or approximately 4KB of data. Because most tion window by the number of data segments acknowledged Web transactions are short-lived, the initial congestion win- for each received acknowledgment. Thus the congestion win- dow is a critical TCP parameter in determining how quickly dow grows exponentially and increases in size until packet flows can finish. While the global network access speeds loss occurs, typically because of router buffer overflow, at increased dramatically on average in the past decade, the which point the maximum capacity of the connection has standard value of TCP’s initial congestion window has re- been probed and the connection exits slow start to enter mained unchanged. the congestion avoidance phase. The initial congestion win- In this paper, we propose to increase TCP’s initial conges- dow is at most four segments, but more typically is three tion window to at least ten segments (about 15KB). Through segments (approximately 4KB) [5] for standard Ethernet large-scale Internet experiments, we quantify the latency MTUs. The majority of connections on the Web are short- benefits and costs of using a larger window, as functions lived and finish before exiting the slow start phase, making of network bandwidth, round-trip time (RTT), bandwidth- TCP’s initial congestion window (init cwnd) a crucial pa- delay product (BDP), and nature of applications. We show rameter in determining flow completion time. Our premise that the average latency of HTTP responses improved by is that the initial congestion window should be increased to approximately 10% with the largest benefits being demon- speed up short Web transactions while maintaining robust- strated in high RTT and BDP networks. The latency of low ness. bandwidth networks also improved by a significant amount While the global adoption of broadband is growing, TCP’s in our experiments. The average retransmission rate in- init cwnd has remained unchanged since 2002. As per a creased by a modest 0.5%, with most of the increase com- 2009 study [4], the average connection bandwidth globally ing from applications that effectively circumvent TCP’s slow is 1.7Mbps with more than 50% of clients having bandwidth start algorithm by using multiple concurrent connections. above 2Mbps, while the usage of narrowband (<256Kbps) Based on the results from our experiments, we believe the has shrunk to about 5% of clients. At the same time, appli- initial congestion window should be at least ten segments cations devised their own mechanisms for faster download of and the same be investigated for standardization by the Web pages. Popular Web browsers, including IE8 [2], Fire- IETF. fox 3 and Google’s Chrome, open up to six TCP connections per domain, partly to increase parallelism and avoid head-of- line blocking of independent HTTP requests/responses, but Categories and Subject Descriptors mostly to boost start-up performance when downloading a C.2.2 [Computer Communication Networks]: Network Web page. Protocols—TCP, HTTP; C.2.6 [Computer Communica- In light of these trends, allowing TCP to start with a tion Networks]: Internetworking—Standards; C.4 [Perfor- higher init cwnd offers the following advantages: mance of Systems]: Measurement techniques, Performance (1) Reduce latency. Latency of a transfer completing in attributes slow start without losses [9], is: S(γ − 1) S ⌈logγ ( + 1)⌉ ∗ RTT + (1) General Terms init cwnd C Measurement, Experimentation, Performance where S is transfer size, C is bottleneck link-rate, γ is 1.5 or 2 depending on whether acknowledgments are delayed Keywords or not, and S/init cwnd ≥ 1. As link speeds scale up, TCP’s latency is dominated by the number of round-trip TCP, Congestion Control, Web Latency, Internet Measure- times (RTT) in the slow start phase. Increasing init cwnd ments enables transfers to finish in fewer RTTs. (2) Keep up with growth in Web page sizes. The Inter- 1. INTRODUCTION AND MOTIVATION net average Web page size is 384KB [14] including HTTP We propose to increase TCP’s initial congestion window headers and compressed resources. An average sized page to reduce Web latency during the slow start phase of a con- requires multiple RTTs to download when using a single nection. TCP uses the slow start algorithm early in the TCP connection with a small init cwnd. To improve page load times, Web browsers routinely open multiple concur- 1 rent TCP connections to the same server. Web sites also spread content over multiple domains so browsers can open 0.8 even more connections [8]. A study on the maximum num- 0.6 Google Web search ber of parallel connections that browsers open to load a Top 500 sites page [16] showed Firefox 2.0 opened 24 connections and IE8 0.4 All sites Gmail opened 180 connections while still not reaching its limit. 0.2 Google maps These techniques not only circumvent TCP’s congestion con- Photos blogger Top 100 sites trol mechanisms [13], but are also inefficient as each new flow 0 100 1000 10000 100000 independently probes for end-to-end bandwidth and incurs Response size (Bytes) the slow start overhead. Increasing init cwnd will not only mitigate the need for multiple connections, but also allow Figure 1: CDF of HTTP response sizes for top 100 newer protocols such as SPDY [1] to operate efficiently when sites, top 500 sites, all the Web, and for a few popu- downloading multiple Web objects over a single TCP con- lar Google services. Vertical lines highlight response nection. sizes of 3 and 10 segments. (3) Allow short transfers to compete fairly with bulk data traffic. Internet traffic measurements indicate that most 500 bytes in the network are in bulk data transfers (such as Google Web search video), while the majority of connections are short-lived and 495 transfer small amounts of data. Statistically, on start-up, 490 a short-lived connection is already competing with connec- 485 tions that have a congestion window greater than three seg- 480 ments. Because short-lived connections, such as Web trans- 475 fers, don’t last long enough to achieve their fair-share rate, TCP latency (ms) 470 a higher init cwnd gives them a better chance to compete 465 with bulk data traffic. 3 6 10 16 26 42 (4) Allow faster recovery from losses. An initial win- init_cwnd dow larger than three segments increases the likelihood that losses can be recovered through Fast Retransmit rather than Figure 2: TCP latency for Google search with dif- the longer initial retransmission timeout. Furthermore, in ferent init cwnd values. the presence of congestion, the widespread deployment of Selective Acknowledgments (SACK) enables a TCP sender to using three segments. We note that raising init cwnd to to recover multiple packet losses within a round-trip time. 16 improves latency further. However, much larger values, We propose to increase TCP’s init cwnd to at least ten such as 42, show a degradation in latency, likely due to in- segments (approximately 15KB).1 To that end, we quantify creased packet losses. Larger scale experiments described in the latency benefits and costs, as measured in large scale the rest of the paper demonstrate at length the benefits and experiments conducted via Google’s front-end infrastructure potential costs of using an initial congestion window of ten serving users a diverse set of applications. segments. Ideally, we want to pick an init cwnd satisfying each of There are numerous studies in literature on speeding up the following properties: (i) minimize average Web page short transfers over new TCP connections [10, 15]. These download time; (ii) minimize impact on tail latency due techniques range from faster start-up mechanisms using cached to increased packet loss, and (iii) maintain fairness with congestion windows such as in TCP Fast Start, to more com- competing flows. Raising the initial congestion window to plex schemes requiring router support such as Quick Start. ten segments can reasonably satisfy these properties. It im- These solutions are neither widely deployed, nor standard- proves average TCP latency, yet is sufficiently robust for ized, and do not have practical reference implementations. use on the Internet. In the following, we articulate the per- A recent measurements study [12] showed that up to 15.8% tinence of ten segments to TCP’s initial congestion window: of Internet flows have an init cwnd larger than the standard (1) Covers ≈90% of HTTP Web objects: 90% of HTTP re- specification [5]. sponses from the top 100 and 500 sites fit within 16KB [14], The rest of the paper is arranged as follows: Section 2 as shown in the response size distribution of Figure 1. The describes the experiments’ setup and datasets. Section 3 distribution also shows that about 90% of Google Web search, presents an analysis of receiver advertised windows. Sec- Maps, and Gmail responses fit in about 15KB or ten seg- tion 4 describes the experiment results with an initial con- ments; for applications with larger average size such as Pho- gestion window of ten, quantifying its benefits and cost an- tos, about 20% more responses can fit within ten segments alyzed by network properties (bandwidth, BDP, RTT), as as compared to three segments. well as traffic characteristics. Section 5 concludes the paper (2) Improves latency, while being robust: Figure 2 shows with a discussion on future work.