A Longitudinal and Cross-Dataset Study of Internet Latency and Path Stability

A Longitudinal and Cross-Dataset Study of Internet Latency and Path Stability Mosharaf Chowdhury Rachit Agarwal Vyas Sekar Ion Stoica Electrical Engineering and Computer Sciences University of California at Berkeley Technical Report No. UCB/EECS-2014-172 http://www.eecs.berkeley.edu/Pubs/TechRpts/2014/EECS-2014-172.html October 11, 2014 Copyright © 2014, by the author(s). All rights reserved. Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission. A Longitudinal and Cross-Dataset Study of Internet Latency and Path Stability Mosharaf Chowdhury Rachit Agarwal Vyas Sekar Ion Stoica UC Berkeley UC Berkeley Carnegie Mellon University UC Berkeley ABSTRACT Even though our work does not provide new active mea- We present a retrospective and longitudinal study of Internet surement techniques or new datasets, we believe that there is value in this retrospective analysis on several fronts. First, it latency and path stability using three large-scale traceroute provides a historical and longitudinal perspective of Internet datasets collected over several years: Ark and iPlane from path properties that are surprisingly lacking in the measure- 2008 to 2013 and a proprietary CDN’s traceroute dataset spanning 2012 and 2013. Using these different “lenses”, we ment community today. Second, it can help us revisit and revisit classical properties of Internet paths such as end-to- reappraise classical assumptions about path latency and sta- end latency, stability, and of routing graph structure. Iterative bility used in designing Internet-scale systems. Third, such data analysis at this scale is challenging given the idiosyn- a cross-dataset analysis can shed light on potential “blind crasies of different collection tools, measurement noise, and spots” and potential biases in our understanding even with the diverse analysis we desire. To this end, we leverage re- large-scale datasets. cent big-data techniques to develop a scalable data analy- Specifically, we analyze the following dimensions of In- sis toolkit, Hummus, that enables rapid and iterative analysis ternet path properties: on large traceroute measurement datasets. Our key findings • Has the end-to-end latency1 changed over this measure- are: (1) overall latency seems to be decreasing; (2) some ge- ment period? Have specific geographic regions improved ographical regions still have poor latency; (3) route stability more than others or have some regressed to worse con- (prevalence and persistence) is increasing; and (4) we ob- nectivity? (§4) serve a mixture of effects in the routing graph structure with • Do classical assumptions about route stability in terms high-degree ASes rapidly increasing in degree and lower- of route persistence and route prevalence [36] still hold? degree ASes forming denser “communities”. How have these evolved over this time period? (§5) 1 Introduction • Has the routing graph structure changed significantly Understanding Internet path properties is a fundamental re- over the last several years? (§6) quirement for many applications including server placement • Do different “lenses” provide complementary, consistent, (e.g., [37]) and selection (e.g., [26]), fault detection and di- or contradictory views into the above routing character- agnosis (e.g, [16]), and analyzing key inefficiencies in rout- istics? (§4–6) ing protocols (e.g., [13]). Measuring Internet paths is al- Performing such a study over diverse and large-scale most as old as the discipline of Internet measurement it- datasets raises a number of practical scalability challenges self [35]. Starting from the seminal studies by Vern Pax- in terms of preprocessing (e.g., cleaning missing traces and son [36], there have been several efforts to analyze latency converting traces to AS-granularity) and extracting mean- (e.g., [19,31,32]), route predictability (e.g., [12,38]), as well ingful information — the combined raw traceroute data in as numerous tools and datasets (e.g., [1, 7,30]). our study amounts to roughly 1 terabyte of uncompressed Our work follows in the spirit of this rich body of work in data. Conventional data analysis techniques (e.g., custom routing measurement. Our main contribution here is to sys- scripts) are simply not scalable or sustainable for such itera- tematically analyze key characteristics of the Internet paths tive analysis. Our contribution here is a systematic data anal- using multiple large-scale traceroute datasets collected over ysis toolkit, Hummus, implemented on top of an in-memory several years. Specifically we use a six-year dataset (2008- distributed data processing system called Apache Spark [42]. 2013) from iPlane [30] that uses over 1000 vantage points, Hummus enables rapid and iterative analysis of large-scale another six-year (2008-2013) dataset from Ark that uses over traceroute measurement datasets that would otherwise be 80 vantage points [1], and a proprietary dataset from a large infeasible or tedious. For instance, several analysis tasks CDN with traceroutes from over 1800 vantage points over performed in this paper that would have otherwise taken 2012-2013. Together, these datasets provide a panoramic several hours using conventional techniques can be com- and longitudinal view of Internet routing behavior to more than 25000 destination ASes spread across 200 countries. 1Since traceroute datasets allow us to only measure round-trip times (RTT) between any source-destination pair, we use RTT as the metric for latency and use the terms RTT and latency inter- changeably in this paper. 1 pleted within six minutes using Hummus. We cannot stress Ark iPlane LargeCDN the value of such a capability enough—it simplified our Period 2008-2013 2008-2013 2012/2013 analysis workflow and enabled iterative analysis, typical of Sampling 1 cycle/month 1 day/month No large-scale measurement studies, to extract interesting infor- traceroute Paris [7] Normal Normal mation. We have made our code public2 for future measure- Traces 2.52 billion 1.85 billion 0.83 billion ment studies to benefit from the scalability and rapid itera- Valid Traces 187 million 449 million 104 million tive analysis enabled by Hummus. Vantage IPs 80 1051 1895 Vantage Countries 35 40 89 Our key findings are: Vantage ASes 72 226 815 Dst Countries 219 226 222 • Overall, most countries are improving in terms of la- Dst ASes 36238 27243 25876 tency from different vantage points. Countries in Africa, which currently have the highest latencies, are also the Table 1: Dataset details. ones showing the most improvements over the last six years. Some countries are surprisingly regressing in the 2 Datasets and Methodology Ark dataset; we identify them as anomalies and point out the likely causes. In this section, we describe the traceroute datasets used in our study (§2.1) as well as the methodology we follow to • We observe that both routing prevalence (i.e., how fre- preprocess and clean the datasets, followed by a discussion quently the dominant route is used) and persistence (i.e., of potential biases and limitations of this study (§2.2). how stable are routes across consecutive measurements) are increasing both at AS- and city-granularity. However, 2.1 Datasets we find the absolute numbers to be significantly lower We use three traceroute datasets from Ark [1], iPlane [30], than those observed in classical studies. In general, we and a large commercial CDN (LargeCDN). The Ark and iPlane observe that prevalence/persistence are inversely corre- data spans six years from 2008 to 2013, while the LargeCDN lated with the in-degrees of destination ASes. dataset spans 2012 and 2013. Table 1 provides a quick sum- • The AS-granularity Internet routing graph is getting mary of each dataset. traceroutes from all three datasets in- denser and more clustered, which is correlated with the cluded only IPv4 addresses. While all three datasets cover improvements in latencies, route lengths, and stability. almost all of the geographical regions and ASes, both Ark Following a “rich-getting-richer” phenomenon, the high- and iPlane have their vantage points located in a significantly degree ASes are connecting to increasingly larger frac- smaller number of locations than LargeCDN. Each dataset is tion of ASes, and lower degree ASes are forming more collected by (periodically) issuing traceroutes from a fixed tightly-knit communities. set of vantage points to some destinations based on dataset- • We do find different datasets to be largely in qualitative specific criteria. For completeness, we briefly summarize agreement and having good coverage. But we observe how these individual traceroutes were collected and refer several subtle differences (e.g., latency anomalies in Ark) readers to the above references for more details. and some non-trivial coverage gaps (e.g., many countries • in Africa and Asia often do not have a sufficient number In iPlane, each vantage point daily probes a selective list of measurements). of IP prefixes (120K prefixes out of 450K operational prefixes [25]) concentrated on the core Internet. These results have important implications both for mea- • Ark ensures that in each “probing cycle” traceroutes are surement research as well as the design of Internet-scale sys- sent to randomly selected destinations in every routable tems. For instance, we find that our visibility into signifi- /24 IP prefix. cant geographical regions (e.g., Africa, Asia) is quite lim- • For the proprietary LargeCDN dataset, the destination se- ited, motivating the need for more careful selection of tar- lection process information is unavailable; however, we gets. Similarly, we find that most stable vantage points are observed significantly higher “coverage” over the IPv4 in the US and Europe providing a very US- and Euro-centric address space relative to Ark and iPlane.

A Longitudinal and Cross-Dataset Study of Internet Latency and Path Stability

NOAA Technical Report NOS NGS 60

Misleading Stars: What Cannot Be Measured in the Internet?

PAX A920 Mobile Smart Terminal Quick Setup Guide

PAX-It! TM Applications the Paxcam Digital USB 2.0 Camera System

Freebsd and Netbsd on Small X86 Based Systems

Parallel MPI I/O in Cube: Design & Implementation

Release 0.11.0

Z/VM Version 7 Release 2

The Emaginepos Guide to Pax / Emv / Broadpos

An Airborne Network Telemetry Link for the Inet Technical Demonstration System

LDM7 Deployment Guidance

Objections to Killer Robots