Measuring Latency Variation in the Internet
Toke Høiland-Jørgensen (Dept. of Computer Science, Karlstad University, Sweden) [email protected]
Bengt Ahlgren (SICS, Box 1263, 164 29 Kista, Sweden) [email protected]
Per Hurtig (Dept. of Computer Science, Karlstad University, Sweden) [email protected]
Anna Brunstrom (Dept. of Computer Science, Karlstad University, Sweden) [email protected]

ABSTRACT

We analyse two complementary datasets to quantify the latency variation experienced by internet end-users: (i) a large-scale active measurement dataset (from the Measurement Lab Network Diagnostic Tool) which sheds light on long-term trends and regional differences; and (ii) passive measurement data from an access aggregation link which is used to analyse the edge links closest to the user.

The analysis shows that variation in latency is both common and of significant magnitude, with two thirds of samples exceeding 100 ms of variation. The variation is seen within single connections as well as between connections to the same client. The distribution of experienced latency variation is heavy-tailed, with the most affected clients seeing an order of magnitude larger variation than the least affected. In addition, there are large differences between regions, both within and between continents. Despite consistent improvements in throughput, most regions show no reduction in latency variation over time, and in one region it even increases.

We examine load-induced queueing latency as a possible cause for the variation in latency and find that both datasets readily exhibit symptoms of queueing latency correlated with network load. Additionally, when this queueing latency does occur, it is of significant magnitude, more than 200 ms in the median. This indicates that load-induced queueing contributes significantly to the overall latency variation.

Keywords

Latency; Bufferbloat; Access Network Performance

CoNEXT '16, December 12-15, 2016, Irvine, CA, USA
© 2016 Copyright held by the owner/author(s).
ACM ISBN 978-1-4503-4292-6/16/12.
DOI: http://dx.doi.org/10.1145/2999572.2999603

1. INTRODUCTION

As applications turn ever more interactive, network latency plays an increasingly important role for their performance. The end-goal is to get as close as possible to the physical limitations of the speed of light [25]. However, today the latency of internet connections is often larger than it needs to be. In this work we set out to quantify how much. Having this information available is important to guide work that sets out to improve the latency behaviour of the internet; and for authors of latency-sensitive applications (such as Voice over IP, or even many web applications) that seek to predict the performance they can expect from the network.

Many sources of added latency can be highly variable in nature. This means that we can quantify undesired latency by looking specifically at the latency variation experienced by a client. We do this by measuring how much client latency varies above the minimum seen for that client, as sketched below.
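To make the metric concrete, the following is a minimal sketch (not the paper's actual analysis code) of computing, per client, how far each RTT sample lies above that client's minimum; the function name and sample data are illustrative:

```python
def variation_above_minimum(samples_by_client):
    """For each client, compute how far each RTT sample lies above
    the minimum RTT observed for that client (all values in ms)."""
    variation = {}
    for client, rtts in samples_by_client.items():
        base = min(rtts)  # best latency this client has seen
        variation[client] = [rtt - base for rtt in rtts]
    return variation

# Illustrative RTT samples (ms) for two hypothetical clients.
samples = {
    "client-a": [48.0, 52.3, 47.9, 211.4, 63.1],
    "client-b": [12.5, 12.7, 98.0, 13.1],
}
for client, deltas in variation_above_minimum(samples).items():
    print(client, [round(d, 1) for d in deltas])
```

Using the per-client minimum as the baseline treats it as an estimate of the path's unloaded latency, so the deltas capture only the variable (and thus undesired) component.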
Our analysis is based on two complementary sources of data: we combine the extensive publicly available dataset from the Measurement Lab Network Diagnostic Tool (NDT) with a packet capture from within a service provider access network. The NDT data, gathered from 2010 to 2015, comprises a total of 265.8 million active test measurements from all over the world. This allows us to examine the development in latency variation over time and to look at regional differences. The access network dataset is significantly smaller, but the network characteristics are known with greater certainty. Thus, we can be more confident when interpreting the results from the latter dataset. These differences between the datasets make them complement each other nicely.

We find that significant latency variation is common in both datasets. This is the case both within single connections and between different connections from the same client. In the NDT dataset, we also observe that the magnitude of latency variation differs between geographic regions, both between and within continents. Looking at the development over time (also in the NDT dataset), we see very little change in the numbers. This is in contrast to the overall throughput, which has improved significantly.

One important aspect of latency variation is the correlation between increased latency and high link utilisation. Queueing delay, in particular, can accumulate quickly when the link capacity is exhausted, and paying attention to such scenarios can give insight into issues that can cause real, if intermittent, performance problems for users. We examine queueing delay as a possible source of the observed latency variation for both datasets, and find strong indications that it is present in a number of instances. Furthermore, when queueing latency does occur it is of significant magnitude.

The rest of the paper is structured as follows: Section 2 introduces the datasets and the methodology we have used to analyse them. Section 3 discusses the large-scale variations in latency over time and geography, and Section 4 examines delay variation on access links. Section 5 presents our examination of load-induced queueing delay. Finally, Section 6 discusses related work and Section 7 concludes the paper.

2. DATASETS AND METHODOLOGY

The datasets underlying our analysis are the publicly available dataset from Measurement Lab (M-Lab), specifically the Network Diagnostic Tool (NDT) data [1], combined with packet header traces from access aggregation links of an internet service provider. This section presents each of the datasets, and the methodology we used for analysis.

2.1 The M-Lab NDT data

The M-Lab NDT is run by users to test their internet connections. We use the 10-second bulk transfer from the server to the client, which is part of the test suite. When the test is run, the client attempts to pick the nearest server from the geographically distributed network of servers provided by the M-Lab platform. The M-Lab servers are globally distributed (see http://www.measurementlab.net/infrastructure for more information on the server placements), although with varying density in different regions. The server is instrumented with the Web100 TCP kernel instrumentation [21], and captures several variables of the TCP state machine every 5 ms of the test. Data is available from early 2009, and we focus on the six-year period 2010–2015, comprising a total of 265.8 million test runs. Table 1 shows the distribution of test runs for the years and regions we have included in our analysis.

Table 1: Total tests per region and year (millions).

Reg.     2010    2011    2012    2013    2014    2015
AF       0.65    0.63    0.79    0.72    0.76    0.57
AS       7.79    7.45    6.55    5.75    5.83    4.57
EU      35.00   31.12   27.96   22.40   21.23   16.82
NA      11.70    8.51    7.90    7.01    7.06    8.00
SA       2.68    1.73    2.83    2.94    2.05    1.26
OC       1.33    1.33    0.79    0.55    0.65    0.57
Total   59.22   50.81   46.87   39.43   37.60   31.79

Since the NDT is an active test, the gathered data is not a representative sample of the traffic mix flowing through the internet. Instead, it may tell us something about the links being traversed by the measurement flows. Looking at links under load is interesting, because some important effects can be exposed in this way, most notably bufferbloat: loading up the link causes any latent buffers to fill, adding latency that might not be visible if the link is lightly loaded. This also means that the baseline link utilisation (e.g., caused by diurnal usage patterns) becomes less important: an already loaded link can, at worst, result in a potentially higher baseline latency. This means that the results may be biased towards showing lower latency variation than is actually seen on the link over time. But since we are interested in establishing a lower bound on the variation, this is acceptable.

For the base analysis, we only exclude tests that were truncated (had a total run time of less than 9 seconds). We use the TCP RTT samples as our data points, i.e., the samples computed by the server TCP stack according to Karn's algorithm [23], and focus on the RTT span, defined as the difference between the minimum and maximum RTT observed during a test run. However, we also examine a subset of the data to assess what impact the choice of min and max observed RTT has on the data when compared to using other percentiles for each flow (see Section 3.3).
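As an illustration, here is a minimal sketch of this per-test computation. It assumes each test is available as a duration plus a list of RTT samples; the representation and function names are assumptions for the sketch, not the actual Web100/NDT schema or the authors' pipeline. The percentile arguments allow the min-to-max span to be compared against narrower spans, as discussed in Section 3.3:

```python
def rtt_span(rtts, lo_pct=0.0, hi_pct=100.0):
    """Span between two percentiles of a test's RTT samples (ms).
    The defaults give the min-to-max span used in the base analysis."""
    ordered = sorted(rtts)
    n = len(ordered)
    # Nearest-rank percentile lookup; avoids external dependencies.
    def pct(p):
        idx = min(n - 1, max(0, int(round(p / 100.0 * (n - 1)))))
        return ordered[idx]
    return pct(hi_pct) - pct(lo_pct)

def analyse(tests, min_duration=9.0):
    """Drop truncated tests and compute the RTT span of each.
    `tests` is a list of (duration_seconds, rtt_samples) pairs."""
    return [rtt_span(rtts) for dur, rtts in tests if dur >= min_duration]

# Illustrative input: one complete test and one truncated test.
tests = [
    (10.0, [40.25, 41.0, 180.0, 44.5, 39.5]),
    (4.2, [40.0, 40.25]),  # truncated: excluded from the base analysis
]
print(analyse(tests))  # -> [140.5]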
2.2 The access aggregation link data

The second dataset comes from two access aggregation links of an internet service provider. The links aggregate the traffic for about 50 and 400 clients, respectively, and connect them to the core network of the service provider. The traffic was captured passively at different times distributed over an eight-month period starting at the end of 2014. The average loads on the 1 Gbps links were, respectively, about 200 and 400 Mbit/s during peak hours. This dataset is an example of real internet traffic, since we are not generating any traffic.

We analyse the delay experienced by the TCP connection setup packets in this dataset. The TCP connection setup consists of a three-way handshake with SYN, SYN+ACK, and ACK packets, as illustrated in Figure 1.

[Figure 1: Delays computed from the TCP connection setup.]
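As a hedged sketch of this kind of handshake analysis (the precise delay definitions are those commonly used for a capture point sitting between the clients and the core, not necessarily the exact procedure of this paper): the gap between the SYN and the SYN+ACK at the monitoring point approximates the delay on the core/server side of the monitor, while the gap between the SYN+ACK and the final ACK approximates the delay on the client (access) side. The type and field names below are illustrative:

```python
from dataclasses import dataclass

@dataclass
class Handshake:
    """Capture timestamps (ms) of one TCP three-way handshake,
    as seen at the aggregation link monitoring point."""
    t_syn: float     # client -> server SYN
    t_synack: float  # server -> client SYN+ACK
    t_ack: float     # client -> server ACK

def setup_delays(hs):
    """Split the connection setup delay at the monitoring point:
    SYN -> SYN+ACK covers the core/server side of the monitor,
    SYN+ACK -> ACK covers the access/client side."""
    server_side = hs.t_synack - hs.t_syn
    client_side = hs.t_ack - hs.t_synack
    return server_side, client_side

# Illustrative handshake: 12 ms on the core/server side of the link,
# 35 ms on the client (access) side.
print(setup_delays(Handshake(t_syn=0.0, t_synack=12.0, t_ack=47.0)))
```

Because only packet headers and timestamps are needed, this approach works on passive traces without any cooperation from the endpoints.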