Reducing Internet Latency: a Survey of Techniques and Their Merits
Preprint. To appear in IEEE Communications Surveys & Tutorials <http://www.comsoc.org/cst>. Until published, please cite as: Briscoe, B., Brunstrom, A., Petlund, A., Hayes, D., Ros, D., Tsang, I.-J., Gjessing, S., Fairhurst, G., Griwodz, C. & Welzl, M., "Reducing Internet Latency: A Survey of Techniques and their Merits," IEEE Communications Surveys & Tutorials (2014). DOI: 10.1109/COMST.2014.2375213.

© 2014 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Bob Briscoe1, Anna Brunstrom2, Andreas Petlund3, David Hayes4, David Ros5,3, Ing-Jyh Tsang6, Stein Gjessing4, Gorry Fairhurst7, Carsten Griwodz3, Michael Welzl4

1BT, UK; 2Karlstad University, Sweden; 3Simula Research Laboratory AS, Norway; 4University of Oslo, Norway; 5Institut Mines-Télécom / Télécom Bretagne, France; 6Alcatel-Lucent Bell-Labs, Belgium; 7University of Aberdeen, UK

Abstract—Latency is increasingly becoming a performance bottleneck for Internet Protocol (IP) networks, but historically networks have been designed with the aim of maximizing throughput and utilization. This article offers a broad survey of techniques aimed at tackling latency in the literature up to August 2014, and their merits. A goal of this work is to quantify and compare the merits of the different Internet latency-reducing techniques, contrasting their gains in delay reduction versus the pain required to implement and deploy them. We found that classifying techniques according to the sources of delay they alleviate provided the best insight into the following issues: 1) the structural arrangement of a network, such as placement of servers and suboptimal routes, can contribute significantly to latency; 2) each interaction between communicating endpoints adds a Round Trip Time (RTT) to latency, which is especially significant for short flows; 3) in addition to base propagation delay, several sources of delay accumulate along transmission paths, today intermittently dominated by queuing delays; 4) it takes time to sense and use available capacity, with overuse inflicting latency on other flows sharing the capacity; and 5) within end systems, delay sources include operating system buffering, head-of-line blocking, and hardware interaction. No single source of delay dominates in all cases, and many of these sources are spasmodic and highly variable.
Solutions addressing these sources often both reduce the overall latency and make it more predictable.

Index Terms—Data communication, networks, Internet, performance, protocols, algorithms, standards, cross-layer, comparative evaluation, taxonomy, congestion control, latency, queuing delay, bufferbloat

I. INTRODUCTION

Many, if not most, Internet Protocol (IP) networks and protocols have traditionally been designed with optimization of throughput or link utilization in mind. Such a focus on "bandwidth" may well be justified for bulk-data transfer, or more generally for applications that do not require timeliness in their data delivery. However, nowadays the quality of experience delivered by many applications depends on the delay to complete short data transfers or to conduct real-time conversations, for which adding bandwidth makes little or no difference. As a result, latency in the current Internet has been gaining visibility as a truly critical issue that impairs present-day applications, and that may hinder the deployment of new ones.

It is therefore important to: (a) understand the root causes of latency, and (b) assess the availability of solutions, deployed or not, and their expected contribution to lowering end-to-end latency. This paper seeks to address these questions. We offer a broad survey of techniques aimed at tackling Internet latency up to August 2014, classifying the techniques according to the sources of delay that they address, i.e. where delays arise along the communications chain. To decide on the best classification system, we tried a number of alternatives: classifying by sources of delay was the easiest to understand and led to the fewest gaps and the least overlap. We also attempt to quantify the merits of a selection of the most promising techniques. We decided to focus on reduction in delay and ease of deployment, which loosely represent the main tradeoff between benefit and cost ('gain vs. pain'). The benefits of any technique are highly scenario-dependent, so we carefully chose a set of scenario parameters that would be amenable to visual comparison across an otherwise complex space.

A. Importance of latency to applications

Latency is a measure of the responsiveness of an application: how instantaneous and interactive it feels, rather than sluggish and jerky. In contrast to bandwidth, which is the rate at which bits can be delivered, latency is the time it takes for a single critical bit to reach the destination, measured from when it was first required. This definition may be stretched for different purposes depending on which bit is 'critical' for different applications, the main categories being:
1) Real-time interaction, where every 'chunk' of data produced by an end-point is unique and of equal importance and needs to be delivered as soon as possible, for example an on-line game or an interactive video conference (the 'critical bit' is therefore the last bit of each 'chunk').
2) Start-up latency, where the time to begin a service is most important, as at the start of a video stream (the 'critical bit' is the first bit of data).
3) Message completion time, where the time to complete a whole (often small) transmission is most important, for example downloading Javascript code to start an application (the 'critical bit' is the last bit of the message).

Fig. 1. Waterfall diagram showing the timing of the download of an apparently uncluttered example Web page (ieeexplore.ieee.org), actually comprising over one hundred objects, transferred over 23 connections needing 10 different DNS look-ups. The horizontal scale is in seconds.
This access was from Stockholm, Sweden, over a cable access link (28 ms RTT, 5 Mb/s down, 1 Mb/s up), using Internet Explorer v8 without any prior cache warming. Source: www.webpagetest.org

An important characteristic of latency is that it is additive in nature and accumulates over the communication session or application task. Distributed systems involving machine-to-machine interactions, e.g. Web services, consist of long sequences of automated interactions in between each human intervention. Therefore even slight unnecessary delay per transfer adds up to considerable overall delay. For instance, Fig. 1 shows the connections involved in downloading an apparently uncluttered Web page (ieeexplore.ieee.org). This unremarkable example is fairly typical of many Web pages. Closer examination shows that the critical path of serial dependencies consists of about half a dozen short TCP connections, each starting with a 3-way handshake and each preceded by a DNS look-up, adding up to at least six back-and-forth messages each, and totalling nearly forty transfers in series. It might be considered that delays of up to, say, 50 ms are so close to the bounds of human perception that they are not worth removing. However, forty such 50 ms delays in series would delay completion of this example page by about 2 seconds. Experiencing such unnecessary delay on every click during a browsing session makes the experience unnecessarily intermittent and unnatural.

In 2009 a team from Microsoft found that artificially introducing 500 ms of extra delay in the response from the Bing search engine translated to 1.2% less advertising revenue. After the artificially introduced delay was removed, it took a similar period to linearly regain the original demand. These results were presented in a joint Microsoft-Google presentation on the value of reducing Web delay [1]. Google's VP for search pointed out that if Google's experiment had been conducted on their whole customer base, losing 0.75% of their 2009 revenue would have lost them $75M for the year [2].

Certain applications suffer more when there is a large variation in latency (jitter). This includes time-stepped applications such as voice or applications with real-time interaction, such as on-line games [3]. In general, latency is characterized by its distribution function, and the relative importance of the higher-order moments is application dependent.

B. Scope

We restrict our scope to generic techniques that are each applicable to a wide range of ways the Internet could be used in its role as a public network. Still, a few promising techniques currently applicable only in private networks are included when we see that they may be adapted to, or inspire solutions for, the public Internet. We also draw a fairly arbitrary line to rule out more specialist techniques. For instance, we include the delay that certain applications (e.g.