Timing is Everything: Accurate, Minimum Overhead, Available Bandwidth Estimation in High-speed Wired Networks

Han Wang1, Ki Suh Lee1, Erluo Li1, Chiun Lin Lim2, Ao Tang2, Hakim Weatherspoon1
1 Department of Computer Science, Cornell University, Ithaca, NY 14850
2 Department of Electrical and Computer Engineering, Cornell University, Ithaca, NY 14850

ABSTRACT
Active end-to-end available bandwidth estimation is intrusive, expensive, inaccurate, and does not work well with bursty cross traffic or on high capacity links. Yet, it is important for designing high performant networked systems, improving network protocols, building distributed systems, and improving application performance. In this paper, we present minProbe which addresses unsolved issues that have plagued available bandwidth estimation. As a middlebox, minProbe measures and estimates available bandwidth with high-fidelity, minimal-cost, and in userspace; thus, enabling cheaper (virtually no overhead) and more accurate available bandwidth estimation. MinProbe performs accurately on high capacity networks up to 10 Gbps and with bursty cross traffic. We evaluated the performance and accuracy of minProbe over a wide-area network, the National Lambda Rail (NLR), and within our own network testbed. Results indicate that minProbe can estimate available bandwidth with error typically no more than 0.4 Gbps in a 10 Gbps network.

1. INTRODUCTION
Active end-to-end available bandwidth estimation is intrusive, expensive, inaccurate, and does not work well with bursty cross traffic or on high capacity links [13, 33]. Yet, it is important [17, 43, 45]. It is necessary for designing high performant networked systems, improving network protocols, building distributed systems, and improving application performance.
The problems associated with available bandwidth estimation stem from a simple concept: Send a train of probe packets through a network path to momentarily congest the bottleneck link, then infer the available bandwidth at the receiving end from probe packet timestamps [20, 36, 40, 46]. There are three significant problems with current bandwidth estimation approaches. First, such approaches are intrusive since load is being added to the network. Current measurement methods require creating explicit probe packets that consume bandwidth and CPU cycles. Second, current approaches often do not have enough fidelity to accurately generate or timestamp probe packets. By operating in userspace to maximize programming flexibility, precise control and measurement of packet timings is sacrificed. As a result, the probe packet timings are often perturbed by operating system (OS) level activities. Without precise timestamps, accurate estimation becomes very challenging, especially in high-speed networks. Finally, Internet traffic is known to be bursty at a range of time scales [22], but previous studies have shown that existing estimation approaches perform poorly under bursty traffic conditions [48].
In this paper, we present a middlebox approach that addresses the issues described above. MinProbe performs available bandwidth estimation in real-time with minimal cost and high fidelity for high-speed networks (10 Gigabit Ethernet), while maintaining the flexibility of userspace control and compatibility with existing bandwidth estimation algorithms. MinProbe minimizes explicit probing costs by using application packets as probe packets whenever possible, while generating explicit packets only when necessary. MinProbe greatly increases measurement fidelity via software access to the wire and maintains wire-time timestamping of packets before they enter the host. Finally, a userspace process controls the entire available bandwidth estimation process, giving flexibility to implement new algorithms as well as reuse existing estimation algorithms.
Importantly, packets through a network path to momentarily congest the minProbe operates in real-time, such that it is useful as a bottleneck link, then infer the available bandwidth at the networked systems building block. MinProbe was designed with commodity components and Permission to make digital or hard copies of all or part of this work for personal or achieves results that advance the state of the art. It can classroom use is granted without fee provided that copies are not made or distributed be built from a commodity server and an FPGA (field pro- for profit or commercial advantage and that copies bear this notice and the full cita- grammable gate array) PCIe (peripheral component inter- tion on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or re- connect express) pluggable board, SoNIC [26]. As a result, publish, to post on servers or to redistribute to lists, requires prior specific permission any server can be turned into a minProbe middlebox. In- and/or a fee. Request permissions from [email protected]. deed, we have turned several nodes and sites in the GENI IMC’14, November 5–7, 2014, Vancouver, BC, Canada. (global environment for networking innovations) network Copyright 2014 ACM 978-1-4503-3213-2/14/11 ...$15.00. into minProbe middleboxes. It has been evaluated on a http://dx.doi.org/10.1145/2663716.2663746. Low Fidelity High-Fidelity Timestamping Timestamping Pathload [20], Active Pathchirp [40], minProbe gi hi Probe Spruce [46], TOPP [36] MP MP si+1 si si+1 si Passive tcpdump DAG [1] Monitor Table 1: Bandwidth Estimation Design Space testbed that consists of multiple 10 Gigabit (10 GbE) switches and on the wide-area Internet via the Na- Figure 1: Usage of bandwidth estimation tional Lambda Rail (NLR). It achieves high accuracy: Re- sults illustrate available bandwidth estimation with errors at the receiver to infer the amount of cross traffic in the typically no more than 0.4 Gbps in a 10 Gbps network. network. The key idea to such inferences: When the probing It estimates available bandwidth with minimal overhead: rate exceeds the available bandwidth, the observed packet Available bandwidth estimates use existing network traffic characteristics undergo a significant change. The turning as probes. Overall, minProbe allows software programs and point where the change occurs is then the estimated avail- end-hosts to accurately measure the available bandwidth of able bandwidth. See Figure 9, for example, the turning point high speed, 10 gigabit per second (Gbps), network paths where queuing delay variance increases is the available band- even with bursty cross traffic, which is a unique contribu- width estimate. Most existing available bandwidth estima- tion. tion tools can be classified according to the packet charac- Our contributions are as follows: teristics and metrics used to observing a turning point. Bandwidth estimation tools such as Spruce [46] and IGI • High performance: We are the first to demonstrate a [16] operate by observing the change in the output prob- system that can accurately estimate available bandwidth in ing rate. The idea is that when the probing rate is below 10 Gbps network. the available bandwidth, probe packets should emerge at the receiver with the same probing rate. 
However, once the • Minimal overhead: Albeit not the first middlebox ap- probing rate exceeds the available bandwidth, the gap be- proach to available bandwidth estimation(e.g. MGRP [37]), tween packets will be stretched due to queuing and conges- our system performs accurate bandwidth estimation with tion, resulting in a lower output probing rate. On the other minimal overhead. hand, bandwidth estimation tools such as Pathload [20], • Ease of use: Our system provides flexible userspace pro- PathChirp [40], TOPP [36], and Yaz [42] operate by ob- gramming interface and is compatible with existing algo- serving the change in one-way delay (OWD), i.e. the time rithms. required to send a packet from source to destination. OWD • Sensitivity study: We conduct a sensitivity study on increases when the probing rate exceeds the available band- our estimation algorithm with respect to internal parameter width, as packets have to spend extra time in buffers. selection and network condition. An alternative methodology to the end-to-end approach would be to query every network element (switch/router) 2. BACKGROUND along a network path. A network administrator could collect the statistical counters from all related ports, via protocols Available bandwidth estimation is motivated by a simple such as sFlow [5]; or infer from Openflow control messages problem: What is the maximum data rate that a user could such as PacketIn and FlowRemoved messages [50]. However, send down a network path without going over capacity? This obtaining an accurate, consistent, and timely reading from data rate is equal to the available bandwidth on the tight multiple switches in an end-to-end manner can be very diffi- link, which has the minimum available bandwidth among cult [2]. Further, collecting counters from switches is done at all links on the network path. While the problem sounds per-second granularity and requires network administrative simple, there are four main challenges to available band- privileges, which makes the approach less timely and useful width estimation—timeliness, accuracy, non-intrusiveness, for building distributed systems, improving network proto- and consistency. In particular, an available bandwidth es- cols, or improving application performance. As a result, we timation methodology and tool would ideally add as little assume and compare to other end-to-end approaches for the overhead as possible and return timely and accurate estima- rest of this paper. tion on the available bandwidth consistently across a wide range of Internet traffic conditions. 2.2 Current Limitations In this section, we first discuss many of the key ideas and While useful, the end-to-end approach for available band- assumptions underlying existing available bandwidth esti- width estimation has inherent issues that limit its applica- mation methods and tools. Then, motivate the development bility, especially in high-capacity links. We outline a few of of minProbe by illustrating the limitations faced by these these challenges. current methods and tools. • High-fidelity measurements require high-fidelity 2.1 Methodology instruments. Generating packet trains in userspace typi- Many existing available bandwidth estimation tools take cally require a CPU-intensive busy-wait loop, which is prone an end-to-end approach. A typical setup sends probe pack- to operating system level noise, such as OS scheduling, in- ets with a predefined interval along the path under measure- terrupt coalescing, and batching. 
As a result, generated in- ment, and observe change in certain packet characteristics terpacket gaps of a packet train rarely match the interpacket gaps specified by the userspace program. Moreover, times- between packets to generate probes and obtain sub- tamping in user or kernel space adds considerable error to nanosecond timings between received packets to estimate the available bandwidth estimation. Even timestamping in available bandwidth. Further, all probes can be generated hardware adds noise when timestamping packet (albeit, less and timing measurements can be received in real-time from noise than kernel space, but still enough to make available userspace. bandwidth estimation unusable in many situations [27]). We envision minProbe to be deployed in an architecture The takeaway: Without high-fidelity capabilities, available where a separate control plane manages a rich set of mid- bandwidth estimation lacks accuracy. dlebox functionalities with event trigger support(e.g [9]). In • Non-negligible overhead. Bandwidth estimation tools particular, we believe the separation of measurement hard- add traffic to the network path under measurement. This ware and production hardware enables high-fidelity mea- may adversely affect application traffic and measurement surement in high-speed networks (and in real-time) that is accuracy [37]. The amount of probe traffic is proportional difficult to achieve otherwise. to the rate of sampling and the number of concurrent mea- 3.1 Precise Probe Control surement sessions. As a result, the effect of probe packets on cross traffic exacerbates with traffic increase. MinProbe offers enhanced network measurement capabil- ities: High-fidelity packet pacing to generate probe packets • Traffic burstiness impacts measurement results. and high-fidelity packet timestamping to measure received Bandwidth estimation results are sensitive to traffic bursti- packet times. We achieve high-fidelity capabilities via direct ness. However, existing tools typically ignore the burstiness access to the from software and in real-time. of cross traffic in favor of simple estimation algorithms. Limi- To see how minProbe works, we first describe the physi- tations of batching and instrumentation further mask bursti- cal layer of 10 (GbE). Then, we describe ness to some degree to userspace tools. As a result, existing how minProbe takes advantage of the software access to the tools may not be able to reflect true available bandwidth physical layer to achieve high-fidelity network measurement. with bursty traffic. In this paper, we introduce a new tool that can be ap- 3.1.1 10 GbE Physical Layer plied to existing available bandwidth estimation algorithms According to the IEEE 802.3 standard [3], when Ethernet and tools while overcoming the limitations discussed above. frames are passed to the physical layer (PHY), they are re- Indeed, we demonstrate that we can accurately estimate formatted before being sent across the physical medium. On available bandwidth in high-speed networks with minimal the transmit path, the PHY encodes every 64bits of an Eth- overhead with existing algorithms. In the next section, we ernet frame into a 66bit block, which consists of a two bit syn- discuss the details of the design and implementation of our chronization header (sync-header) and a 64bit payload. As methodology. a result, a 10 GbE link actually operates at 10.3125 Gbaud 66 (10G× 64 ). The PHY also scrambles each block before pass- 3. 
MinProbe DESIGN ing it down the network stack to be transmitted. The entire 66bit block is transmitted as a continuous stream of symbols, MinProbe measures the available bandwidth with high- which a 10 GbE network transmits over a physical medium. fidelity, minimal-cost and in userspace; thus, enabling As 10 GbE always sends 10.3125 gigabits per second (Gbps), cheaper (virtually no overhead) and more accurate available each bit in the PHY is about 97 pico-seconds wide. On the bandwidth estimation. It achieves these goals through three receive path, the PHY layer removes the two-bit header and main features: de-scrambles each 64bit block before decoding it. First, minProbe is a middlebox architecture that uses ap- Idle symbols (/I/) are special characters that fill the gaps plication network traffic as probe traffic to eliminate the between any two packets in the PHY. When there is no Eth- need for explicit probe packets. When an application packet ernet frame to transmit, the PHY continuously inserts /I/ arrives at minProbe, minProbe decides whether the packet symbols until the next frame is available. The standard re- should be used as a probe; if so, minProbe modulates the quires at least twelve /I/s after every packet. Depending on timings between application packets chosen as probe pack- and PHY alignment, an /I/ character can ets before forwarding them to their destination. A pro- be 7 or 8 bits, thus it takes about 700∼800 pico-seconds to grammable flow table accessible from userspace controls the transmit one /I/ character [3]. Importantly, controlling or selection of an application packet as a probe packet. Thus, simply counting the number of /I/s between packets is what by using application traffic implicitly as available bandwidth enables high-fidelity. However, before the SoNIC [26] plat- probes, we are able to remove all the traditional costs and form was created, /I/ characters were typically inaccessible overheads. A similar idea was proposed in MGRP [37], which from higher layers (L2 or above), because they are discarded has the same goal of lower overhead, but MGRP lacks high- by hardware, and definitely were not accessible in software. fidelity measurement capability. Second, minProbe is a high-fidelity network measurement 3.1.2 High-Fidelity Measurement substrate that is capable of modulating and capturing traffic The key insight and capability of how minProbe achieves timings with sub-nanosecond precision. The high-precision high-fidelity is from its direct access to the /I/ characters is achieved by enabling software access to the physical layer in the physical layer. In particular, minProbe can measure of the network protocol stack. When minProbe modulates (count) and generate (insert or remove) an exact number probe packets (application packets), it adds and removes of /I/ characters between each subsequent probe packet to minute spacings between packets through direct access to measure the relative time elapsed between packets or gen- the physical layer. erate a desired interpacket gap. Further, if two subsequent Finally, minProbe is accessible from userspace. From probe packets are separated by packets from a different flow userspace, users can control sub-nanosecond modulations dlebox. The x-axis is the distance in time between the first 0 0 10 10 byte of subsequent packets (or interpacket delay), and y-axis

is the frequency of occurrences. We generated two flows of application traffic at 1 Gbps, and a minProbe middlebox was configured to only modulate one of the flows. As can be seen in Figure 2b, after minProbe, the packets of the probing flow have minimal spacing (interpacket delay) between

10-4 10-4 1000 2000 3000 4000 5000 6000 7000 8000 9000 10000 0 2000 4000 6000 8000 10000 subsequent packets, and exhibit a much higher probe rate: Interpacket Delay (ns) Interpacket Delay (ns) The peak just to the right of 0 ns interpacket delay was the (a) original traffic (b) modulated traffic modulated probing traffic. Even though the overall applica- tion traffic rate was 1 Gbps, we were able to increase the Figure 2: Comparison of traffic pattern before and after instantaneous probe rate up to 10 Gbps for short periods passed through middlebox of time by creating short packet trains with minimal inter- (cross traffic), the gap between probe packets will include packet delay. The packets of the other flows, on the other /I/ characters as well as data characters of the cross traffic hand, were forwarded as-is, with no changes in timing. packets, but the measurement will still be exact (i.e. /I/ and How would minProbe impact the application traffic per- data characters represent time with sub-nanosecond preci- formance? Obviously, if any traffic were paced, then min- sion). This level of access from software is unprecedented. Probe would change the pacing. But, for general TCP traf- Traditionally, however, an end host may timestamp packets fic, the minute modulation that minProbe causes does not in userspace, kernel, or network interface hardware which affect the rate or TCP throughput. Similar results have been all add significant noise. None of these methods provide demonstrated in prior art MGRP [37]. One problem may oc- enough precision for high-fidelity network measurements; cur when a TCP cwnd (congestion window) is small, e.g., consequently, many existing bandwidth estimation tools re- when cwnd = 1. In this case, the application does not create port significant estimation error (large variation and low ac- enough traffic and dummy probe needs to be created. curacy) [27]. In a similar fashion to measuring the space between pack- 3.2 Generalized Probe Model ets, minProbe generates probe packets through high-fidelity pacing, by inserting an exact spacing between packets. By Probes accessing the physical layer of 10 GbE in software in real- G time, minProbe can insert or remove /I/ characters from ap- N R plication traffic as needed. In particular, two variables of the probe traffic are of interest: The gap between packets and the overall rate of packet trains. Users can program minProbe via command line calls or API calls to specify the number of D /I/ characters (i.e. the number of 100s of pico-seconds) to be maintained between subsequent probe packets, allowing Figure 3: Generalized probe train model userspace programs to perform high-fidelity pacing. Direct access to /I/ characters from software and in real- Figure 3 shows a generalized probe train model we devel- time differentiates minProbe from other measurement tools. oped to emulate a number of existing bandwidth estimation Specifically, minProbe is able to characterize and generate algorithms and used for the rest of the paper. The horizontal an exact spacing (interpacket gap) between probe packets. dimension is time. The vertical dimension is the correspond- 3.1.3 Probing without a Probe ing (instantaneous) probe rate. Each pulse (we call it a train) contains multiple probe packets sent at a particular rate, as Explicit probes are not necessary for minProbe. Similar depicted by the height. 
In particular, the parameter N rep- to MGRP [37], minProbe can use application traffic as resents the number of packets in each train and parameter probe packets. It has a programmable flow table that R represents the (instantaneous) probe rate of the train (i.e. performs flow matching on pass-through traffic. Users can we are able to change the interpacket gap between N packets insert entries into the flow table to specify which flows are to match a target probe rate R). Packet sizes are considered probes. Traffic that have a match in the flow table are in computing the probe rate. For a mixed-size packet train, modulated before they are forwarded. Other flows that minProbe is able to adjust the space between all adjacent do not have a match are simply forwarded without any packets within the train to achieve the desired probe rate change in timing. With minProbe, we are able to perform R. The gaps between successive probe trains are specified line-rate forwarding at 10 Gbps, even for the minimum size with parameter G (gap). Finally, each measurement sample packets. Moreover, minProbe has the ability to generate consists of a set of probe trains with increasing probe rate. explicit probe packets, especially when there is insufficient The distance in time between each measurement sample is application traffic. Dummy probing packets are padded identified with parameter D (distance). With these four pa- with zero bytes and transmitted at pre-defined intervals as rameters, we can emulate most probe traffic used in prior programmed by the users. work, as shown in Table 2 and discussed in Section 2.1. For example, to emulate Spruce probes, we can set the param- Figure 2 illustrates minProbe’s ability to perform high- eters (N,R,G,D) to the values (2, 500Mbps, 1.2ns, 48us), fidelity measurement and pacing as a middlebox with only which would generate a pair of probe packets every 48 us application packets as probes. Figure 2a shows the distri- with minimal inter-packet gap (12 /I/ characters or 1.2ns) bution of incoming application traffic to a minProbe mid- at 500Mbps (5% of the capacity of a 10Gbps link), as in- Algorithm N R G D measurement Pathload 20 [0.1:0.1:9.6] Gbps Variable Variable algorithm mbox-ctl Pathchirp 1 [0.1:0.1:9.6] Gbps Exponential decrease Variable Spruce 2 500Mbps 1.2ns 48us IGI 60 [0.1:0.1:9.6] Gbps 30s 30s middlebox daemon Table 2: Parameter setting for existing algorithms. G is the userspace gap between packet trains. R is the rate of probe. N is the number of probe packets in each sub-train. D is the gap kernel space between each sub-train. gap extraction !ow table tended by the original Spruce Algorithm [46]. Similarly, to reproduce the IGI experiment results [16], we set the param- eters (N,R,G,D) to the values (60, [0.1:0.1:9.6]Gbps, 30s, 30s) to generate probe trains of 60 packets every 30 seconds forwarding path with rate ranging from 100Mbps to 9.6Gbps at 100Mbps in- crements. In summary, most types of existing probe trains Figure 4: MinProbe Architecture can be emulated with the generalized model we developed for minProbe, where different points in the parameteriza- tion space can represent different points in the entire design Additionally, the difference in one-way delay tells us that space covering prior art and possibly new algorithms. 
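To make the (N, R, G, D) parameterization concrete, the sketch below is illustrative Python only: the helper names are ours and not the minProbe API, and the roughly 0.8 ns width of an /I/ character and the 12-/I/ minimum are taken from the 10 GbE discussion in Section 3.1.1. It computes the interpacket gap, in nanoseconds and in /I/ characters, needed for a fixed-size packet train to reach a target instantaneous rate R, and writes the Table 2 settings as (N, R, G, D) tuples. Expressing the gap in /I/ characters is what gives the sub-nanosecond pacing granularity described above.

```python
# Illustrative sketch of the generalized probe train model; hypothetical names.
LINE_RATE_GBPS = 10.0   # 10 GbE
NS_PER_IDLE = 0.8       # one /I/ character is 7 or 8 bits, roughly 0.7-0.8 ns

def gap_for_rate(pkt_bytes, rate_gbps):
    """Gap (ns, /I/ count) between fixed-size packets so that the train's
    instantaneous rate is rate_gbps on a 10 Gbps link (simplified model,
    ignoring preamble and other framing overhead)."""
    bits = 8 * pkt_bytes
    wire_time_ns = bits / LINE_RATE_GBPS   # time the packet itself occupies the wire
    budget_ns = bits / rate_gbps           # per-packet time budget at the target rate
    gap_ns = budget_ns - wire_time_ns
    idles = max(12, round(gap_ns / NS_PER_IDLE))  # standard requires >= 12 /I/s per packet
    return gap_ns, idles

# Table 2 expressed as (N, R, G, D); R is the list of train rates in Gbps.
PATHLOAD  = (20, [r / 10 for r in range(1, 97)], "variable",             "variable")
PATHCHIRP = (1,  [r / 10 for r in range(1, 97)], "exponential decrease", "variable")
SPRUCE    = (2,  [0.5],                          "1.2ns",                "48us")
IGI       = (60, [r / 10 for r in range(1, 97)], "30s",                  "30s")

if __name__ == "__main__":
    # A 792-byte train at 3 Gbps needs roughly a 1.5 us gap (about 1850 /I/s)
    # in this simplified model.
    print(gap_for_rate(792, 3.0))
```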
In the following sections, we use a parameterization that is similar to Pathload, which estimates available bandwidth based on the increasing one-way delay (OWD) in probe trains. If the rate of a probe train is larger than the available bandwidth of the bottleneck link, the probe train will induce congestion in the network. As a result, the last packet of the probe train will experience longer queuing delays compared to the first packet of the same probe train. The difference between the OWD of the last packet and the first packet of the same probe train can be used to compute the increasing OWD in the probe train. On the other hand, if a probe train does not induce congestion, then there will be no change in the OWD between the first and last packet. Thus, if an available bandwidth estimation algorithm sends probe trains at different rates R, it will estimate the available bandwidth to be the lowest probe train rate R where the OWD (queuing delay) increases.

A bandwidth estimation algorithm based on Pathload needs to measure the change (increase) in OWD between the first and last packet in a probe train. Since we can measure the gaps between subsequent packets, we show that the interpacket gap information is sufficient to compute the increase in the OWD of a probe packet train.

Proof. Consider sending a train of n packets with packet sizes s_1, s_2, \ldots, s_n and interpacket gaps g_1, g_2, \ldots, g_{n-1} from host A to host B through a network path. The received packets at host B experience one-way delays of q_1, q_2, \ldots, q_n through the network and now have interpacket gaps h_1, h_2, \ldots, h_{n-1}. Assume no packet losses due to network congestion, and no packet reordering in the network. Then we want to show that the difference in one-way delay of the first and last packets is equal to the total increase in interpacket gaps, i.e. q_n - q_1 = \sum_i h_i - \sum_i g_i.

Initially, the length of the packet train measured from the first bit of the first packet to the last bit of the nth packet is given by

l^A = \sum_{i=1}^{n-1} g_i + \sum_{j=1}^{n} s_j    (1)

Similarly, at the receiving end,

l^B = \sum_{i=1}^{n-1} h_i + \sum_{j=1}^{n} s_j    (2)

Additionally, the difference in one-way delay tells us that

l^B = l^A + (q_n - q_1)    (3)

Substituting the expressions for l^A and l^B, we can see that the difference in one-way delay is equivalent to the difference of the interpacket gaps, i.e. q_n - q_1 = \sum_{i=1}^{n-1} h_i - \sum_{i=1}^{n-1} g_i.

3.3 Implementation
We built a prototype of minProbe on the software network interface platform, SoNIC, with two 10 GbE ports [26]. The detailed design of the SoNIC platform has been described in prior work [26]. Here, we briefly recount the main features of SoNIC. SoNIC consists of two components: a software stack that runs on commodity multi-core processors, and a hardware PCIe pluggable board. The software implements all of the functionality in the 10 GbE physical layer that manipulates bits to enable access to /I/ in software. The hardware performs line-speed data transfer between the 10 GbE transceiver and the host. To enable the high-fidelity network measurements required by minProbe, we extended SoNIC's capabilities with three new features:

Packet Filtering and Forwarding: We extended SoNIC to support packet forwarding and filtering at line rate. Packet forwarding preserves the timing characteristics of the pass-through traffic exactly: Data is copied from an incoming port to an outgoing port, including the exact number of /I/ characters in between each packet. To filter for probe packets, we use an in-kernel flow table that matches on the 5-tuple of a packet IP header. The packets that have a match are temporarily buffered in a queue, interpacket gap modulated, and sent once enough packets have been buffered for a full probe train.

Packet Gap Extraction and Manipulation: MinProbe has two operational modes: a modulation mode for sending probes and an extraction mode for receiving probes. In the modulation mode, we extended SoNIC with additional packet queues for temporary buffering of probe packets, and modified the amount of /I/ characters in front of the buffered packet to change the packet spacing. If /I/ characters are removed from the probe packet trains to increase the probe rate, we compensate for the loss of /I/ characters at the end of the probe packet train to conform to the 10 GbE standard, and vice versa. In the extraction mode, the number of /I/ characters is extracted from the matched probe packet trains and sent to the userspace programs via kernel upcalls.

Function | Description | Example
set mode | set middlebox mode | set mode
set gap | set gap between probes | set gap(120,120,120)
flow add | add flow used as probes | flow add(srcip,dstip)

Table 3: Application Programming Interface NYC Application Programming Interface: The last feature Cornell (NYC) we added was the userspace accessibility. Users can in- sert/remove flow entries, dynamically adjust probe packet App0 MP0 MP1 App1 gaps, and retrieve probe packet gap counts from minProbe. It is required to implement a flexible user and kernel space Cornell (Ithaca) communication channel for passing measurement data and control messages between user and kernel space. We used Figure 5: National Lambda Rail Experiment the netlink [4] protocol as the underlying mechanism and N0 N2 N4 N6 N7 implemented the exchange of information using netlink mes- N0 N1 sages. For example, we encapsulate the interpacket gaps in a special netlink message, and transmitted it from kernel to userspace, and vice versa. Userspace daemons control minProbe. Userspace daemons MP N N N MP provide a programmable interface for the kernel-to-user dat- MP 0 MP 1 0 1 3 5 1 apath. Table 3 presents a few API calls that are supported by minProbe. For example, as shown in Figure 4, users can APP0 APP1 APP0 APP1 program the flow table through the daemon or directly by flow-add commands. In addition, users can retrieve infor- (a) Dumb-bell (b) Parking lot mation from the kernel through the userspace daemon. In particular, a kernel module captures the interpacket gap /I/ Figure 6: Controlled Experiment Topologies counts from the probe traffic and passes the information to userspace through the userspace daemon. All communi- cation consumes a reasonably small amount of bandwidth. usage or bandwidth. As a result, traffic in the NLR is typi- For instance, the most expensive action is the transfer of cally bursty, yet persistent, and similar to the Internet. We interpacket gap /I/ counts from kernel to userspace, which provisioned a dedicated virtual route that spans nine rout- requires a few hundred mega-bytes per second of bandwidth ing hops over 2500 miles, as shown in Figure 5. Since the for minimum sized 64 byte packets at 10 Gbps. underlying physical network is also shared with other provi- sioned routes, we observed persistent cross traffic between 1 4. EVALUATION Gbps to 3.5 Gbps. MinProbe can accurately estimate the available band- For the controlled environment experiments, we set up two width in high-speed networks with minimal overhead. To different network topologies in a 10 Gbps network testbed: demonstrate its capabilities and accuracy, we evaluated min- Dumb-bell (DB) and parking-lot (PL), as illustrated in Fig- Probe over three different physical networks: Two topolo- ure 6 and described in [49]. We used one minProbe mid- gies in our controlled environment and the National Lambda dlebox (MP0) to modulate the application traffic generated Rail (NLR). We used a controlled environment to study the from a server directly connected to MP0 . We used the other sensitivity of estimation accuracy to various network condi- minProbe middlebox (MP1) as the receiver of the modulated tions. We also evaluated minProbe on NLR to study its per- probes to capture the timing information from the applica- formance in wide area networks. To highlight the accuracy tion traffic. A network of one or more hops separated the minProbe is capable of, Figure 8 in Section 4.2 shows that two middleboxes, where one link was a bottleneck link (i.e. minProbe can accurately estimate the available bandwidth a tight link with the least available bandwidth). 
Our ex- within 1% of the actual available bandwidth in a 10Gbps periments attempted to measure and estimate the available network with topology shown in Figure 6a. bandwidth of this bottleneck (tight) link. The testbeds were In the remainder of this section, we evaluate minProbe in built with commercial 10 GbE switches, represented by cir- depth and seek to answer the following questions: cles in Figure 6. MinProbe middleboxes and hosts that gen- • How sensitive is minProbe to application and cross traffic erated cross traffic were separately connected to the switches characteristics (Section 4.3 and 4.4)? using 10 GbE links, illustrated with the solid line. In the DB • Does minProbe work in the wild (Section 4.5)? topology, the link between the two switches is the bottleneck • Can other software middleboxes accurately measure avail- (tight) link. In the PL topology, the bottleneck (tight) link able bandwidth (Section 4.6)? depends on the pattern of the cross traffic. MP0 and MP1 were Dell PowerEdge T710 servers. Each 4.1 Experimental setup server had two Intel Xeon Westmere [19] X5670 2.93GHz processors, with six cores in each CPU and a total of 24 GB Our evaluation consists of two parts: Controlled environ- RAM. The Westmere architecture of the processor is well- ment experiments and wide area network (WAN) experi- known for its capability of processing packets in a multi- ments. For the WAN experiments, we evaluate minProbe threaded environment [11, 14, 35]. The switches used in the over the National Lambda Rail (NLR). NLR is a transcon- environment consisted of an IBM G8264 RackSwitch and tinental production 10Gbps Ethernet network shared by re- search universities and laboratories with no restriction on Packet size Data Rate Packet Rate IPD IPG

[Bytes]   [Gbps]   [pps]      [ns]    [/I/]
792       1        156250     6400    7200
792       3        467814     2137    1872
792       6        935628     1068    536
792       8        1250000    800     200
Table 4: IPD and IPG of uniformly spaced packet streams.

[Figure 7, panels (a) CAIDA OC-192 trace and (b) Energy plot for CAIDA trace: throughput (Gbps) over time and wavelet energy versus timescale.]

CAIDA: In the third scheme, we generated cross traffic from a real internet trace. In particular, we used the CAIDA OC-192 [6] trace which was recorded using a DAG card with nano-second scale timestamps.
While there is not one commonly accepted definition for traffic burstiness in the literature, burstiness generally refers to the statistical variability of the traffic. In other words, high variability in traffic data rates implies very bursty traf-

2.6G highly bursty CBR2 0.1 highly bursty p=0.1 moderately bursty CBR2 moderately bursty, p=0.4 fic. Variability, and by extension, traffic burstiness could be CBR1 constant bitrate, p=1.0 0.01 0.0 5 10 15 20 25 30 captured by wavelet-based energy plot [47]. Specifically, the Time (seconds) timescale energy of the traffic represents the level of burstiness at a (c) Synthesized traffic (d) Energy plot for synthe- particular timescale (e.g. at the microsecond timescale). The sized traffic higher the energy, the burstier is the traffic. Figure 7b and Figure 7: The time series and wavelet energy plots for cross 7d illustrate this wavelet-based energy plot when applied to traffic used in controlled experiments. Figure 7a shows a the byte arrivals for CAIDA Internet traces and synthet- time series of a CAIDA trace in three different time scales: ically generated traffic, respectively. The x-axis represents 10ms, 100ms and 1s. Coarser time scale means longer av- different time scale, ranging from nano-second to second in eraging period, hence less burstiness. Figure 7b shows the log scale. The y-axis represents the abstracted energy level corresponding wavelet energy plot for the trace in Figure 7a. at a particular time scale. We were mostly interested in the Figure 7c shows three different traces with different traffic micro-second timescale, which correspond to x-axis values burstiness of the same time scale. Figure 7d shows the corre- around 15. Notice that the synthetic traffic behave very sponding wavelet energy plot, with higher energy indicating much like CAIDA internet traces for energy levels for these more burstiness. values. For most experiments, we used 792 bytes as the applica- tion traffic packet size, which according to [6], is close to the a Dell Force10 switch, with network divided into separate average packet size observed in the Internet. We adjusted the logical areas using VLANs. traffic data rate by varying interpacket gaps (IPG). In par- We used three different classes of cross traffic for our con- ticular, we insert a specific number of /I/’s between packets trolled experiments. Figure 7 illustrates the time series char- to generate a specific data rate. We controlled the exact data acteristics of all three traffic classes. The different classes of rate of CBR1 and CBR2 since we control the exact number cross traffic are also described below. of /I/’s inserted between packets. However, for the CAIDA traces, we selected portions of the trace recording times to Constant bit rate cross traffic (CBR1): CBR1 cross obtain different average data rates. In addition, when we traffic consists of fixed-sized and uniformly distributed pack- replay a trace, we only need to preserve the timing informa- ets. In particular, CBR1 traffic is a stationary traffic with tion of the trace, not the actual payload. We used SoNIC constant average data rate over long time scale (millisec- to replay the CAIDA trace. As described in [26], SoNIC can onds), and interpacket gaps are uniform (i.e., variance is regenerate the precise timing characteristics of a trace with zero). no deviation from the original. Stationary, but bursty cross traffic (CBR2): The second class of cross traffic was generated from stationary 4.2 Baseline Estimation but bursty distributions, CBR2. CBR2 is created by varying How does available bandwidth estimation perform using both the packet size and packet chain length distributions minProbe in a base case: A simple topology with a couple with two parameters Dsize and Dlen. 
Both parameters range to several routing hops (Figure 6.a and b) and uniform cross from 0.0 (zero variance) to 1.0 (high variance). We draw the traffic (uniform packet size and interpacket gaps)? We illus- packet size from a log-normal distribution which has been trate the result with the following experiments. demonstrated to closely resemble the distribution of Inter- In the first experimental network setup, we use a dumb- net traffic in [28]. The mean of the log-normal distribution bell topology with two routing hops and a single bottleneck is fixed, while the variance is controlled by the parameter link (Figure 6a). Cross traffic is generated with a constant Dsize to be Dsize · Vsize, where Vsize is a fixed high variance bit rate that has uniform packet sizes and interpacket gaps value. A packet chain is a series of consecutive packets with (CBR1). Node N0 sends cross traffic to node N1 according minimal interpacket gap in between, and from [12], we know to the CBR1 uniform distribution. its length takes on a geometric distribution. The parameter The second experimental setup is similar to the first ex- of the geometric distribution is taken to be 1 − Dlen. Thus, cept that we use a parking-lot topology (Figure 6b). Further, CBR1 is a special case of CBR2 where both Dsize and Dlen cross traffic is generated with the CBR1 uniform distribu- equals 0.0. tion between neighboring nodes: Neven to Nodd (e.g. N0 to 10 3000 6000

[Figure 8 plots estimated versus actual available bandwidth (Gbps), with separate series for the dumb-bell and parking-lot topologies. Figure 9, panels (a) 1 Gbps and (b) 3 Gbps, plots queuing delay variance versus probe rate (Gbps).]
Figure 8: Available bandwidth estimation in a dumb-bell topology under CBR traffic. Both cross traffic and probe

0 0 traffic share one bottleneck with the capacity of 10Gbps. 0 2 4 6 8 10 0 2 4 6 8 10 The x-axis represents the actual available bandwidth of the Probe Rate(Gbps) Probe Rate(Gbps) bottleneck link. The y-axis represents the estimation by min- (c) 6 Gbps (d) 8 Gbps Probe. This evaluation demonstrates minProbe’s ability to Figure 9: Scatter-plot showing the queuing delay variance of accurately measure the available bandwidth and achieve the probe packets versus the probe rate. The cross traffic rate estimation with minimal probing overhead. are constant at 1Gbps, 3Gbps, 6Gbps and 8Gbps. We used probe with N=20, R=[0.1:0.1:9.6]Gbps, G=10us, D=4ms. N1, N2 to N3, etc). Thus, we can control the cross traffic separately on each link. which the probe trains were sent. The y-axis is the variance In all experiments, we varied the data rate of the CBR1 of the queuing delay, which was computed from the one-way cross traffic from 1 Gbps to 8 Gbps with an increment of 1 delay (OWD) experienced by the packets within the same Gbps each time. We configured MP0 to modulate application probe train. We kept the cross traffic rate constant at 1Gbps, traffic from APP0 destined to APP1 to create probe packet 3Gbps, 6Gbps and 8Gbps. The available bandwidth was es- samples. Note that there is no overhead since we are not in- timated to be the turning point where the delay variance troducing any new packets. Further, we configured MP1 to shows an significant trend of increasing. capture the timing information of the probe packet samples (i.e. application traffic destined to APP1 from source APP0). 4.3 Impact of Application Traffic The probe packets modulated by MP0 were parameterized Next, we study whether minProbe is impacted by charac- using the model introduced in Section 3.2. We used the pa- teristics of application traffic. In particular, we focus on two rameters (N,R,G,D) = (20, [0.1 : 0.1 : 9.6] Gbps, 10 us, characteristics that matter the most for available bandwidth 4 ms) where MP0 sends a probe packet sample every 4 ms estimation: the length of a probe train and the distribution enabling us to collect 250 samples per second. Each probe of probe packet sizes. The length of a probe train affects the sample consists of 96 constant bit rate (CBR) trains with delay experienced by application traffic (See the description a 10 us gap between trains. Each train runs at an increas- of probe train length in Section 3.2 and equation 1). The ing data rate ranging from 0.1 Gbps to 9.6 Gbps, with an longer the probe train, the longer the application traffic has increment of 0.1 Gbps. Recall that we are able to precisely to be buffered in the middlebox, resulting in longer delays control the data rate of trains by controlling the interpacket the application traffic experiences. Typically, a probe train gap between probe packets within the train (e.g. See Ta- consists of around 100 or more packets [20,40]. A probe train ble 4). We assume node APP0 generates enough application of two packets is essentially similar to the packet pair scheme traffic to create probe packet samples. described by Spruce [46]. Figure 8 shows the result. It shows the actual available Creating probe packet trains from variable sized packets bandwidth on the x-axis and estimated available bandwidth is difficult because packets with different sizes suffer dif- on the y-axis for the the dumb-bell and parking-lot topolo- ferent delays through a network. Because minProbe does gies. 
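One simple way to turn the raw samples in Figure 9 into an estimate is sketched below in illustrative Python; the baseline-plus-threshold rule and all names are our own assumptions, not necessarily the detector minProbe uses. It computes the queuing delay variance of each train from its send-side and receive-side gaps, then reports the lowest probe rate whose variance rises clearly above the variance seen at the lowest probe rates.

```python
from statistics import median, variance

def train_owd_variance(send_gaps_ns, recv_gaps_ns):
    """Variance of queuing delay within one train. The one-way delay of
    packet k relative to the first packet is sum(h_i - g_i) for i < k
    (Section 3.2), so only interpacket gaps are needed."""
    rel_owd, acc = [], 0.0
    for g, h in zip(send_gaps_ns, recv_gaps_ns):
        acc += h - g
        rel_owd.append(acc)
    return variance(rel_owd)

def estimate_available_bw(samples, factor=3.0):
    """samples: list of (probe_rate_gbps, owd_variance), sorted by rate.
    Assumed rule: the turning point is the lowest rate whose variance
    exceeds `factor` times the median variance of the slowest quarter."""
    baseline = median(v for _, v in samples[: max(1, len(samples) // 4)])
    for rate, v in samples:
        if v > factor * baseline:
            return rate
    return samples[-1][0]   # no turning point seen: path never saturated
```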
We estimated the available bandwidth with the mean of not have control over the application traffic that is used as 10 measurement samples. The estimations in the dumb-bell packet probes, it is important to understand how the ac- topology were within 0.1 Gbps of the actual available band- curacy of minProbe changes with different distributions of width. The estimations in parking-lot topology tended to probe packet sizes. Another consideration is that many esti- under-estimate the available bandwidth, due to the multiple mation algorithms prefer large probe packets to small pack- bottleneck links in the network path. Higher available band- ets [20,40]. width scenarios were more difficult to measure as accurately We evaluate the effect of the length of a probe packet in a multiple-hop network path. Alternatively, the maximum train (i.e. the number of packets in a train) to the accu- value of a measurement could be used to estimate available racy of available bandwidth estimation in Section 4.3.1 and bandwidth if the estimation was always overly conservative; the effect of the distributions of probe packet sizes in Sec- this would actually increase the estimation accuracy. tion 4.3.2 and 4.3.3. For these experiments, we always used Figure 9 shows the raw probe sample data that was used CBR1 as cross traffic. to estimate the available bandwidth of different cross traffic rates in the dumb-bell topology. The x-axis is the rate (R) at 4.3.1 Impact of Varying Probe Train Length Available Bandwidth [Gbps] Dumb-bell Parking-lot 1 Actual 1.9 3.9 6.9 8.9 1.9 3.9 6.9 8.9 0.8 Length Estimated Available Bandwidth [Gbps]

5 2.57 5.57 8.54 9.5 1.99 4.41 6.57 8.59 0.6 20 2.07 3.96 6.97 8.8 1.80 3.80 6.90 8.30 40 1.9 3.87 6.94 8.68 1.80 3.86 6.70 8.50 0.4 60 1.85 3.79 6.79 8.70 1.80 3.80 6.76 8.56 80 1.86 3.79 6.90 8.70 1.80 3.75 6.78 8.44 0.2 100 1.83 3.96 6.79 8.55 1.80 3.70 6.56 8.02 Packet size CDF Table 5: Estimation with different probe train length. 0 0 200 400 600 800 1000 1200 1400 1600 Available Bandwidth [Gbps] Dumb-bell Parking-lot Figure 10: The distribution of probe packet sizes from the 1.9 3.9 6.9 8.9 1.9 3.9 6.9 8.9 CAIDA trace. 10 Size [B] Estimated Bandwidth [Gbps] 64 9.47 9.50 9.50 9.50 9.50 9.50 9.50 9.50 8 512 2.06 4.51 7.53 8.9 1.85 3.76 6.64 8.09 792 2.07 3.96 6.97 8.8 1.80 3.80 6.90 8.30 6 1024 1.90 3.97 7.01 8.83 1.80 3.75 6.72 8.54 1518 1.81 3.88 6.91 8.84 1.80 3.81 6.83 8.48 4 Table 6: Estimation results with different probe packet size. 2 In order to understand how the number of packets in a Estimated Available Bandwidth(Gbps) probe packet train affects the accuracy of minProbe, we 0 varied the number of packets in each train while using the 0 2 4 6 8 10 Actual Available Bandwidth(Gbps) same parameters as the baseline experiment (Section 4.2). In particular, we used (N, [0.1 : 0.1 : 9.6] Gbps, 10 us, 4 ms) Figure 11: Estimation with probe packets drawn from the while increasing N from 5 to 100. Table 5 illustrates the CAIDA trace. estimated available bandwidth when different train lengths we used 792 byte packets in our baseline experiment, which were used. The actual available bandwidth is shown in the resulted in similar available bandwidth estimate accuracy as row marked “Actual” and the estimated available bandwidth 1518 byte probe packets. On the other hand, smaller probe is in the rows below “Length” in Table 5. As shown in the packets, such as 64 byte packets resulted in poor accuracy. table, increasing the number of probe packets per flow yields This result and observation is consistent with the previous diminishing returns. A probe train with five packets is not as results from literature such as [40]. The take-away is that accurate as the baseline experiment with 20 packets regard- probe trains with medium to large packet sizes perform bet- less of the actual available bandwidth. However, trains with ter than trains of smaller packet sizes. larger than 20 packets result in similar estimation accuracy. The take-away: Increasing the number of packets in a probe 4.3.3 Mixed Probe Packet Sizes packet train does not necessarily result in more accurate esti- In a real network, the distribution of packet sizes is usually mation. The minimum number of packets to obtain accurate mixed with large and small packets. Figure 10 shows the estimation results was about 20 packets. More packets in a distribution of packet sizes in a recorded Internet trace. We probe train elongate the delay of applications traffic due to repeat the previous experiment using the baseline setting, buffering, but do not improve estimation accuracy. except that in this case, we use APP0 to replay the Internet What happens when there is not enough packets between trace. We let MP0 modulate the probe train in presence of a source and destination pair? This highlights the fundamen- mixed-sized probe traffic. For each rate R ∈ [0.1, 9.6], MP0 tal trade-off between application latency and probing over- ensures that the interpacket gaps between packets of the head. MinProbe has a timeout mechanism to deal with this same flow are uniform. Figure 11 shows the result for CBR1 corner case. 
If the oldest intercepted packet has exceeded a cross traffic. As can be seen, even though a probe train has pre-defined timeout value, minProbe will generate dummy mixed packet sizes, the estimation result is still accurate. probe packets to fulfill the need for probe packets and send The reason is that small packets only take a small fraction out the probe train. Since minProbe requires much shorter of total probe train length with respect to large packets. probe trains compared to existing tools, opportunistically The estimation accuracy is largely determined by the large inserting additional dummy packets add minimal overhead. packets, thus remains accurate. 4.3.2 Impact of Varying Probe Size 4.4 Impact of Cross Traffic Characteristics Next, we evaluated minProbe with different packet sizes So far, we have evaluated minProbe under the assumption for probes. All the parameters were the same as in the base- of constant bit-rate cross traffic. In this section, we no longer line experiment except that we varied the size of packets for make this assumption and instead examine minProbe under probes from 64 bytes to 1518 bytes (the minimum to max- bursty cross traffic. In particular, we evaluate minProbe us- imum sized packets allowed by Ethernet). In particular, we ing modeled synthetic bursty traffic (CBR2) and realistic used (20, [0.1 : 0.1 : 9.6] Gbps, 10 us, 4 ms) while increasing Internet traces. probe packet sizes from 64 to 1518 bytes. As can be seen from Table 6, larger packet sizes (more than 512 bytes) per- 4.4.1 Cross Traffic Burstiness form better than smaller packet sizes in general. Note that 8000 8000


(a) raw measurement  (b) after moving average
Figure 13: Bandwidth estimation of CAIDA trace; the figure on the left is the raw data trace, the figure on the right is the moving average data.
We observed that the real traffic traces were burstier than the synthesized traffic in the previous section. To compensate, we used a standard exponential moving average (EMA) method to smooth out the data in Figure 13a. For each data point in Figure 13a, the value was replaced by the weighted average of the five data points preceding the current data point. Figure 13b shows the result. Similar to the highly bursty synthesized cross traffic, we achieved a similarly accurate estimation result for the real traffic trace via

Packet Chain Length Distribution (D Packet Chain Length Distribution (D using the EMA. The estimation error was typically within 0.1 0.1 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 0.4Gbps of the true available bandwidth. Packet Size Distribution (Dsize) Packet Size Distribution (Dsize) In summary, while bursty cross traffic resulted in noisier measurement data than CBR, we were able to compensate (c) 6 Gbps (d) 8 Gbps for the noise by performing additional statistical processing Figure 12: Bandwidth estimation accuracy with different in the bandwidth estimation. Specifically, we found using cross traffic burstiness. On y-axis, we turn the knob from no the EMA smoothed the measurement data and improved clustering to batching. On x-axis, we turn the knob on cross the estimation accuracy. As a result, we observed that cross traffic packet size distribution from uniform distribution to traffic burstiness had limited impact on minProbe. log-normal distribution. We plot the graph for different cross traffic rate: 1Gbps, 3Gbps, 6Gbps and 8Gbps. 4.5 MinProbe In The Wild In this section, we demonstrate that minProbe works in Synthesized Traffic: We first show the results of synthe- a wide area Internet network, the National Lambda Rail sized bursty cross traffic (CBR2). We generated CBR2 with (NLR). Figure 5 illustrates the experimental network topol- rate ∈ [1, 3, 6, 8] Gbps. We set Vsize to be a fixed large vari- 4 2 ogy. We setup a dedicated route on the NLR with both ends ance value, e.g. 8 × 10 bytes , as specified in [28]. Next, we terminating at Cornell University. Node APP0 generated ap- varied the distribution Dsize of packet sizes in CBR2 from plication traffic, which was modulated by MP0 using param- 0 to 1 with 0.1 step size. 0 means uniform distribution, 1 eter (N,R,G,D) = (20, [0.1 : 0.1 : 9.6] Gbps, 10 us, 4 ms). means distribution with large variance. Similarly, we varied The traffic was routed through the path shown in Figure 5 the distribution D of packet chain length in CBR2 from len across over 2500 miles. Since it was difficult to obtain accu- between 0 to 1 with 0.1 step size. Note that even though rate readings of cross traffic at each hop, we relied on the CBR2 is bursty in terms of the variation of packet distri- router port statistics to obtain a 30-second average of cross bution, the long term average data rate of CBR2 remains traffic. We observed that the link between Cleveland and constant. NYC experienced the most cross traffic compared to other Figure 12 shows the estimation error (in Gbps) over a links, and was therefore the bottleneck (tight) link of the range of cross traffic burstiness for four different cross traf- path. The amount of cross traffic was 1.87 Gbps on aver- fic data rates. In each figure, we plot the difference between age, with a maximum of 3.58 Gbps during the times of our the actual available bandwidth and the estimated available experiments. bandwidth for the cross traffic burstiness identified by the Figure 14 shows the result of minProbe in the wild. We (Dsize, D ) parameters. Traffic close to the bottom left cor- len highlight two important takeaways: First, the average of our ner was more uniformly distributed in both packet sizes and estimation is close to 2 Gbps, which was consistent with train lengths. Traffic on the top right corner was burstier. the router port statistics collected every 30 seconds. 
Sec- Dark gray colors mean under estimation, and light gray col- ond, we observed that the readings were very bursty, which ors mean over estimation. As shown in the figure, the esti- means the actual cross traffic on NLR exhibited a good deal mation error by minProbe is typically within 0.4 Gbps of of burstiness. In summary, minProbe performed well in the the true value except when the link utilization is low, or the wild, in the wide area. The available bandwidth estimation cross traffic is bursty. by minProbe agreed with the link utilization computed from Real Internet Trace: Next, we evaluate minProbe switch port statistical counters: The mean of the minProbe with traces extracted from CAIDA anonymized OC192 estimation was 1.7Gbps, which was close to the 30-second dataset [6], and replayed by SoNIC as a traffic generator. average 1.8Gbps computed from switch port statistical coun- Figure 13a shows an example of the raw data captured 8 1 raw measurement samples 7 moving average over 250 ms moving average over 1 s 0.8 6 5 0.6 CDF 4 0.4 3 0.2 Intel DPDK L3 forward, 1us burst 2 Linux kernel ixgbe L3 forward minProbe forward 1 0 Estimated Cross Traffic(Gb/s) 0 2 4 6 8 10 0 0 1000 2000 3000 4000 5000 6000 7000 Bitrate (Gbps) Time (ms) (a) Figure 14: Measurement result in NLR. ters. Moreover, minProbe estimated available bandwidth at 10 higher sample rate and with finer resolution. Many cloud datacenters and federated testbeds enforce 8 rate limits on their tenants. For example, the tenants are re- 6 quired to specify the amount of bandwidth to be reserved at the time of provisioning. However, many bandwidth estima- 4 tion algorithms need to send probe traffic at high data rates Instantaneous Probe Packet Pair Rate(Gpbs) to temporarily saturate the bottleneck link. As a result, rate 2 Intel DPDK L3 forward, 1us burst limiting may affect the applicability of the algorithms. We Linux kernel ixgbe L3 forward 0 minProbe forward tested the minProbe middlebox inside a federated testbed, 0 50 100 150 200 250 300 350 400 Packet Number ExoGENI, with a virtual network provisioned at 1Gbps on a (b) 10Gbps physical network. We used minProbe to send probe packet trains with different lengths at line rate to find the Figure 15: Software Routers do not exhibit the same fidelity maximum length of packet train that could be sent before as minProbe throttled by rate limiters. We found that, in the provisioned ideal CDF curve should increase along a diagonal frp, 0.5 1Gbps virtual network, if the probe train was less than 1200 Gbps to 9.5 Gbps at 0.5 Gbps increments (the red line). packets, then there was no observable packet loss. As has However, as seen in the figure, over 25% of the probe packet been shown in Section 4.3, minProbe only uses probe trains pairs in the Intel DPDK-based software router were batched with less than 100 probe packets. As a result, and interest- with minimum interpacket gap: Batching can be seen via the ingly, rate limiters do not impact the applicability of min- large increase in the CDF near the highest data rate close Probe in federate testbed environments. to 10 Gbps. Worse, the vanilla kernel datapath (with ixgbe 4.6 Software Router and Other Hardware kernel driver) batched over 40% of packets. These results were consistent with the existing research findings that show Finally, we evaluate whether other middleboxes, such as that network interface cards (NICs) generate bursty traffic software routers, can be used to perform high-fidelity net- at sub-100 microsecond timescale. 
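To illustrate what Figure 15a measures, the short sketch below is illustrative Python; the names and the roughly 20 bytes of per-packet framing overhead are our assumptions, not minProbe code. It converts each received probe packet pair's interpacket gap into an instantaneous rate. Pairs that a software router has batched arrive with close to the minimum gap, so they show up near the 10 Gbps line rate regardless of the rate at which they were paced.

```python
def pair_rate_gbps(pkt_bytes, gap_ns, overhead_bytes=20):
    """Instantaneous rate implied by one packet pair: the second packet's
    bits divided by the time from the start of the first packet to the
    start of the second (wire time of the first packet plus the gap).
    overhead_bytes roughly covers preamble and minimum interframe spacing."""
    bits = 8 * pkt_bytes
    wire_ns = 8 * (pkt_bytes + overhead_bytes) / 10.0   # ~0.8 ns per byte at 10 GbE
    return bits / (wire_ns + gap_ns)

def pair_rate_cdf(pkt_bytes, gaps_ns):
    """Sorted instantaneous pair rates; plotting rank/len against these
    values gives a CDF in the style of Figure 15a."""
    return sorted(pair_rate_gbps(pkt_bytes, g) for g in gaps_ns)
```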
[25] work measurements required to accurately estimate avail- Second, Figure 15b highlights the same result but slightly able bandwidth without any overhead. In particular, we eval- differently. The x-axis is the packet number and the y-axis uated two software routers, one based on the Linux kernel is the measured instantaneous probe packet pair rate (in network datapath, and another based on the Intel DPDK Gbps). The figure shows that minProbe was able to mod- driver [18]. We configured the software routers to perform ulate interpacket gaps to maintain the required probe data L2 forwarding from one port to another. Again, we used the rate (all 20 packets in a probe train exhibited the target baseline setup, with the exception that software routers were data rate), while the software routers were not able to con- placed between minProbe and the switch. The network path trol the probe rate at all. To summarize, due to batching, started at the application. After minProbe modulated the software routers are not suitable for performing the required application traffic to create packet probe trains, the probe high-fidelity available bandwidth estimation on high-speed trains were passed through a software router before being networks. sent to a 10 GbE switches. Next, we investigated the question if prior work such as The experiment was generated from probe packet trains MGRP [37] could perform on high-speed networks? To an- with data rates ranging according to (N,R,G,D) = (20, swer this, we setup MGRP to perform the same baseline es- [0.5 : 0.5 : 9.5] Gbps, 10 us, 4 ms). We show two figures timation experiment in Figure 8. Unfortunately, the result from the this experiment to illustrate the result. First, Fig- was consistent with our findings above and in Figures 15a ure 15a shows the CDF of the measured instantaneous data and 15b. In particular, we found that MGRP could not re- rate of probe packet pairs passing through a software router port valid results when it was used in high-speed networks (i.e. we measured the interpacket gap of probe packet pairs (10Gbps Ethernet). The accuracy of MGRP was limited after passing through a software router). We configured each by its capability to precisely timestamp the received probe software router to forward each packet as soon as it received packets and its ability to control the probe packet gaps. Both the packet to minimize any batching. Since we generated an of these capabilities were performed at kernel level in MGRP equal number of probe packets (20) for a probe train and and were prone to operating system level noise. The accu- increased the probe packet train data rate by 0.5 Gbps, the racy of MGRP was better than Pathload [20], it eliminated 80 NetFPGA gap. It can measure the underlying network and provide per- 70 Dell IBM formance metrics to the virtualized applications to optimize 60 tasks such as VM placement and route selection. 50 5.3 Generalization to 1GbE and 40GbE 40 At the time of writing, minProbe operated only in 10 GbE 30 networks. However, the high-fidelity capability of minProbe 20 can be extended to support 1GbE and 40GbE networks as 10

Figure 15: Software routers do not exhibit the same fidelity as minProbe. (a) CDF of the instantaneous probe packet pair bitrate; (b) instantaneous probe packet pair rate per packet number. Curves compare Intel DPDK L3 forwarding (1 us burst), the Linux kernel ixgbe forwarding path, and minProbe forwarding.

First, Figure 15a shows the CDF of the measured instantaneous data rate of probe packet pairs passing through a software router (i.e., we measured the interpacket gap of probe packet pairs after they passed through a software router). We configured each software router to forward every packet as soon as it was received, to minimize batching. Since we generated an equal number of probe packets (20) for each probe train and increased the probe train data rate in 0.5 Gbps steps, the ideal CDF curve should increase along a diagonal from 0.5 Gbps to 9.5 Gbps at 0.5 Gbps increments (the red line). However, as seen in the figure, over 25% of the probe packet pairs in the Intel DPDK-based software router were batched with the minimum interpacket gap: the batching appears as the large increase in the CDF near the highest data rate, close to 10 Gbps. Worse, the vanilla kernel datapath (with the ixgbe kernel driver) batched over 40% of the packets. These results are consistent with existing research findings that network interface cards (NICs) generate bursty traffic at sub-100 microsecond timescales [25].

Second, Figure 15b highlights the same result slightly differently. The x-axis is the packet number and the y-axis is the measured instantaneous probe packet pair rate (in Gbps). The figure shows that minProbe was able to modulate interpacket gaps to maintain the required probe data rate (all 20 packets in a probe train exhibited the target data rate), while the software routers were not able to control the probe rate at all. To summarize, due to batching, software routers are not suitable for performing the high-fidelity measurements required for available bandwidth estimation on high-speed networks.
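The batching effect described above can be quantified directly from receive timestamps. The sketch below is an illustration (not the measurement code behind Figure 15): it computes the instantaneous rate of each consecutive probe packet pair from its start-to-start spacing and reports what fraction of pairs arrive essentially back-to-back, i.e., near the 10 Gbps line rate with roughly the minimum interpacket gap.

    def pair_rates_gbps(timestamps_ns, pkt_bytes=1518):
        """Instantaneous rate (Gbps) of each consecutive packet pair in a train."""
        rates = []
        for t0, t1 in zip(timestamps_ns, timestamps_ns[1:]):
            spacing_ns = t1 - t0                      # start-to-start spacing of the pair
            rates.append(pkt_bytes * 8 / spacing_ns)  # bits per ns == Gbps
        return rates

    def batched_fraction(rates, line_gbps=10.0, tolerance=0.05):
        """Fraction of pairs forwarded essentially back-to-back (near line rate)."""
        return sum(1 for r in rates if r >= (1 - tolerance) * line_gbps) / len(rates)

    # Example: a 20-packet train paced at 2 Gbps whose second half was batched
    # by a router and forwarded back-to-back at ~10 Gbps.
    paced = [i * 6072.0 for i in range(20)]                        # 1518 B * 8 / 2 Gbps spacing
    batched = paced[:10] + [paced[10] + i * 1214.4 for i in range(10)]
    print(batched_fraction(pair_rates_gbps(batched)))              # ~0.47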
Next, we investigated whether prior work such as MGRP [37] could perform well on high-speed networks. To answer this, we set up MGRP to perform the same baseline estimation experiment as in Figure 8. Unfortunately, the result was consistent with our findings above and in Figures 15a and 15b. In particular, we found that MGRP could not report valid results when used in high-speed (10 Gbps Ethernet) networks. The accuracy of MGRP was limited by its ability to precisely timestamp received probe packets and to control probe packet gaps; both of these capabilities were implemented at the kernel level in MGRP and were prone to operating-system-level noise. MGRP's accuracy was better than Pathload's [20] because it eliminated the overhead of extra probing traffic, but its fundamental accuracy was still constrained by the measurement platform.

Finally, we studied the sensitivity of bandwidth estimation with respect to network hardware. In particular, we experimented with three different 10 GbE switches: an IBM G8264T, a Dell S4810, and the NetFPGA-10G reference switch design [31]. We used the baseline setup for all experiments and configured all three switches to perform L2 forwarding. Figure 16 illustrates an example of the queuing delay captured after each device. As can be seen from the figure, the three switches can be used nearly interchangeably, since there was no significant difference in queuing delay. As a result, we found that the underlying hardware itself may not affect the available bandwidth estimation. Instead, as earlier results in the paper indicate, the fidelity of the available bandwidth estimation depends more on the ability to control generated interpacket gaps and to measure received interpacket gaps.

Figure 16: Estimating available bandwidth on different switches (probe packet queueing delay in bytes versus probe traffic rate in Gbps for the NetFPGA, Dell, and IBM switches).

5. DISCUSSION
In this section, we discuss issues regarding the deployment of minProbe, the applications of available bandwidth estimation, and whether minProbe can be generalized to measure bandwidth in 1GbE and 40GbE networks.

5.1 Deployment
Middleboxes are widely deployed in datacenter networks to perform a wide spectrum of advanced network processing functions, ranging from security (firewalls, intrusion detection) and traffic shaping (rate limiters, load balancers) to improving application performance (accelerators, proxies, caches), to name a few. Similarly, minProbe deployment scenarios include middleboxes with enhanced measurement capabilities potentially located in the same rack as other servers, or even integrated into each server directly, possibly using hypervisor-based network virtualization, provided that there is proper support for accessing the physical layer of the network in software. In the future, we will explore the possibility of deploying minProbe as a platform for providing high-fidelity network measurement as a service (NMaaS).

5.2 Virtualization and Available Bandwidth Estimation
Because of virtualization, datacenter tenants no longer have direct access to the underlying physical network. The extra layer of indirection makes it hard to optimize application performance. MinProbe could potentially bridge the gap: it can measure the underlying network and provide performance metrics to virtualized applications to optimize tasks such as VM placement and route selection.

5.3 Generalization to 1GbE and 40GbE
At the time of writing, minProbe operated only in 10 GbE networks. However, the high-fidelity capability of minProbe can be extended to support 1GbE and 40GbE networks as well. The architecture of the PHY for 40GbE networks is very similar to the PHY in a 10GbE network; as future work, we intend to extend minProbe to support 40GbE, which will require parallelizing the PHY-layer processing on modern multi-core architectures. Moreover, as future work, we intend to extend minProbe to support 1GbE networks. The PHY for 1GbE networks is less complicated than the 10GbE PHY, which makes it possible to port minProbe to 1GbE. For example, the encoding and decoding schemes in the 1GbE physical coding sublayer are simpler than in the 10GbE physical coding sublayer.

5.4 GENI
MinProbe is available on ExoGENI [10]. Users can provision compute nodes that have SoNIC cards installed and use them as minProbe middleboxes to run network measurement experiments. At the time of writing, minProbe was enabled at two ExoGENI sites: RENCI (Renaissance Computing Institute) and the University of California, Davis. We are in the process of expanding to more sites in the near future.

6. RELATED WORK
Prior work such as MAD [43], MGRP [37], and Periscope [15] provides a thin measurement layer (via a userspace daemon and kernel module) for network measurement in a shared environment. These works complement minProbe, which is also applicable in a shared environment. We also advance the conversation on available bandwidth estimation to 10 Gbps, unlike most of the prior work, which was only demonstrated to operate at 1 Gbps or less.

We use an algorithm similar to Pathload [20] and pathChirp [40] to estimate available bandwidth. There is a large body of related work on both the theoretical and practical aspects of available bandwidth estimation [7, 20, 23, 29, 30, 32, 38, 40, 42, 44]. Our work contributes more to the practice of available bandwidth estimation in high-speed (10 Gbps) networks. We found that existing probing parameterizations such as probe trains are sound and still applicable in high-speed networks when supported by the appropriate instrumentation. Furthermore, [21, 48] noted that burstiness of the cross traffic can negatively affect estimation accuracy. While we found that bursty traffic introduces noise into the measurement data, the noise can be filtered out with simple statistical processing, such as a moving average.

Middleboxes are popular building blocks in current network architectures [41]. MinProbe, when operating as a middlebox, is similar to software routers in the sense that it consists of a data forwarding path and a programmable flow table. Unlike software routers, which typically focus on high throughput, minProbe has the capability to precisely control the distance between packets; this capability is absent in most existing software routers. From a system design perspective, ICIM [34] proposed an inline network measurement mechanism similar to our middlebox approach; however, ICIM was only evaluated in simulation, whereas we implemented a prototype and performed experiments in real networks. Others have investigated the effects of interrupt coalescence [39] and the memory I/O subsystem [24], but these efforts are orthogonal to ours. MinProbe, albeit a software tool, avoids the noise of the OS and network stack completely by operating at the physical layer of the network stack.

Another area of related work is precise traffic pacing and precise packet timestamping, both useful building blocks for designing a high-fidelity network measurement platform. Traditionally, traffic pacing [8] is used to smooth out traffic burstiness and improve system performance; minProbe uses traffic pacing in the opposite way, to generate micro-bursts of traffic that serve as probe packets. Precise timestamping has been used widely in passive network monitoring, typically achieved with dedicated hardware platforms such as the Endace Data Acquisition and Generation (DAG) card [1]. With minProbe, we achieve the same nanosecond timestamping precision, and we are able to use the precise timestamps for active network measurement, which the traditional hardware platforms are incapable of.
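As a concrete illustration of what physical-layer precision buys, the helpers below convert between an interpacket gap expressed in (decoded) bytes on a 10 GbE link and wall-clock time. The names are ours, and the sketch deliberately ignores 64b/66b encoding overhead, which does not change the data-rate arithmetic.

    # At the 10 Gbps data rate, one bit occupies 0.1 ns on the wire, so one byte
    # occupies 0.8 ns (the encoded line rate is 10.3125 Gbaud because of 64b/66b
    # encoding, but gap accounting in decoded bytes uses the 10 Gbps data rate).
    NS_PER_BYTE_10G = 0.8

    def gap_bytes_to_ns(gap_bytes):
        """Interpacket gap measured in idle bytes -> nanoseconds at 10 GbE."""
        return gap_bytes * NS_PER_BYTE_10G

    def gap_ns_to_bytes(gap_ns):
        """Nanoseconds -> equivalent number of idle bytes at 10 GbE."""
        return gap_ns / NS_PER_BYTE_10G

    # The minimum standard interpacket gap of 12 bytes lasts only 9.6 ns, while a few
    # microseconds of OS scheduling jitter spans thousands of byte slots.
    print(gap_bytes_to_ns(12), gap_ns_to_bytes(5000))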
7. CONCLUSION
Available bandwidth estimation as a research topic has been studied extensively, but an accurate and low-cost solution for high-speed networks is still lacking. We present minProbe, a high-fidelity, minimal-cost, and userspace-accessible network measurement approach that is able to accurately measure available bandwidth in 10 gigabit-per-second (Gbps) networks. We provided a systematic study of minProbe's performance with respect to various cross traffic and probe traffic conditions and patterns. We found that minProbe performs well even under bursty traffic, and its estimation is typically within 0.4 Gbps of the true value in a 10 Gbps network.

8. AVAILABILITY
The minProbe and SoNIC source code is published under a BSD license and is freely available for download at http://sonic.cs.cornell.edu. SoNIC nodes are available for experimentation as bare-metal nodes at the following ExoGENI sites: RENCI and UC Davis.

9. ACKNOWLEDGMENTS
This work was partially funded and supported by an Intel Early Career and IBM Faculty Award received by Hakim Weatherspoon, DARPA (No. D11AP00266), DARPA MRC (No. FA8750-11-2-0256), NSF CAREER (No. 1053757), NSF TRUST (No. 0424422), NSF FIA (No. 1040689), NSF CiC (No. 1047540), and NSF EAGER (No. 1151268). We would like to thank our shepherd, Oliver Spatsch, and the anonymous reviewers for their comments. We further recognize the people who helped establish and maintain the network infrastructure on which we performed our experiments: Eric Cronise and Laurie Collinsworth (Cornell Information Technologies), and Scott Yoest and his team (Cornell Computing and Information Science Technical Staff). We also thank the people who provided support for us to perform experiments on GENI and ExoGENI: Ilya Baldin (RENCI) and his team (RENCI Network Research and Infrastructure Program), Jonathan Mills (RENCI), and Ken Gribble (UC Davis).

10. REFERENCES
[1] Endace DAG network cards. http://www.endace.com/endace-dag-high-speed-packet-capture-cards.html.
[2] High frequency sFlow v5 counter sampling. ftp://ftp.netperf.org/papers/high_freq_sflow/hf_sflow_counters.pdf.
[3] IEEE Standard 802.3-2008. http://standards.ieee.org/about/get/802/802.3.html.
[4] Linux programmer's manual. http://man7.org/linux/man-pages/man7/netlink.7.html.
[5] sFlow, version 5. http://www.sflow.org/sflow_version_5.txt.
[6] The CAIDA UCSD anonymized Internet traces. http://www.caida.org/datasets/.
[7] Ahmed Ait Ali, Fabien Michaut, and Francis Lepage. End-to-end available bandwidth measurement tools: A comparative evaluation of performances. arXiv.org, June 2007.
[8] Mohammad Alizadeh, Abdul Kabbani, Tom Edsall, Balaji Prabhakar, Amin Vahdat, and Masato Yasuda. Less is more: Trading a little bandwidth for ultra-low latency in the data center. In Proc. NSDI, 2012.
[9] Bilal Anwer, Theophilus Benson, Nick Feamster, Dave Levin, and Jennifer Rexford. A slick control plane for network middleboxes. In Proc. HotSDN, 2013.
[10] Mark Berman, Jeffrey S. Chase, Lawrence Landweber, Akihiro Nakao, Max Ott, Dipankar Raychaudhuri, Robert Ricci, and Ivan Seskar. GENI: A federated testbed for innovative network experiments. Computer Networks, 61:5–23, March 2014.
[11] Mihai Dobrescu, Norbert Egi, Katerina Argyraki, Byung-Gon Chun, Kevin Fall, Gianluca Iannaccone, Allan Knies, Maziar Manesh, and Sylvia Ratnasamy. RouteBricks: Exploiting parallelism to scale software routers. In Proc. SOSP, 2009.
[12] Daniel A. Freedman, Tudor Marian, Jennifer H. Lee, Ken Birman, Hakim Weatherspoon, and Chris Xu. Exact temporal characterization of 10 Gbps optical wide-area network. In Proc. IMC, 2010.
[13] Cesar D. Guerrero and Miguel A. Labrador. On the applicability of available bandwidth estimation techniques and tools. Computer Communications, 33(1):11–22, January 2010.
[14] Sangjin Han, Keon Jang, KyongSoo Park, and Sue Moon. PacketShader: A GPU-accelerated software router. In Proc. SIGCOMM, 2010.
[15] Khaled Harfoush, Azer Bestavros, and John Byers. Periscope: An active Internet probing and measurement API. Technical report, Boston University Computer Science Department, 2002.
[16] Ningning Hu, Li Erran Li, Zhuoqing Morley Mao, Peter Steenkiste, and Jia Wang. Locating Internet bottlenecks: Algorithms, measurements, and implications. In Proc. SIGCOMM, 2004.
[17] Péter Hága, Attila Pásztor, Darryl Veitch, and István Csabai. PathSensor: Towards efficient available bandwidth measurement. In Proc. IPS-MoMe, 2005.
[18] Intel DPDK. http://www.dpdk.com.
[19] Intel Westmere. http://ark.intel.com/products/codename/33174/Westmere-EP.
[20] M. Jain and C. Dovrolis. Pathload: A measurement tool for end-to-end available bandwidth. 2002.
[21] Manish Jain and Constantinos Dovrolis. Ten fallacies and pitfalls on end-to-end available bandwidth estimation. In Proc. IMC, 2004.
[22] Hao Jiang and Constantinos Dovrolis. Why is the Internet traffic bursty in short time scales? In Proc. SIGMETRICS, 2005.
[23] Guojun Jin and Brian L. Tierney. System capability effects on algorithms for network bandwidth measurement. In Proc. IMC, 2003.
[24] Guojun Jin and Brian L. Tierney. System capability effects on algorithms for network bandwidth measurement. In Proc. IMC, 2003.
[25] Rishi Kapoor, Alex C. Snoeren, Geoffrey M. Voelker, and George Porter. Bullet trains: A study of NIC burst behavior at microsecond timescales. In Proc. CoNEXT, 2013.
[26] Ki Suh Lee, Han Wang, and Hakim Weatherspoon. SoNIC: Precise realtime software access and control of wired networks. In Proc. NSDI, 2013.
[27] Ki Suh Lee, Han Wang, and Hakim Weatherspoon. PHY covert channels: Can you see the idles? In Proc. NSDI, 2014.
[28] Chiun Lin Lim, Ki Suh Lee, Han Wang, Hakim Weatherspoon, and Ao Tang. Packet clustering introduced by routers: Modelling, analysis and experiments. In Proc. CISS, 2014.
[29] Xiliang Liu, Kaliappa Ravindran, Benyuan Liu, and Dmitri Loguinov. Single-hop probing asymptotics in available bandwidth estimation: Sample-path analysis. In Proc. IMC, 2004.
[30] Xiliang Liu, Kaliappa Ravindran, and Dmitri Loguinov. Multi-hop probing asymptotics in available bandwidth estimation: Stochastic analysis. In Proc. IMC, October 2005.
[31] John W. Lockwood, Nick McKeown, Greg Watson, Glen Gibb, Paul Hartke, Jad Naous, Ramanan Raghuraman, and Jianying Luo. NetFPGA – an open platform for gigabit-rate network switching and routing. In Proc. Microelectronics Systems Education, 2007.
[32] Sridhar Machiraju. Theory and Practice of Non-Intrusive Active Network Measurements. PhD thesis, EECS Department, University of California, Berkeley, May 2006.
[33] Sridhar Machiraju and Darryl Veitch. A measurement-friendly network (MFN) architecture. In Proc. SIGCOMM INM, 2006.
[34] Cao Le Thanh Man, G. Hasegawa, and M. Murata. ICIM: An inline network measurement mechanism for highspeed networks. In Proc. IEEE/IFIP Workshop on End-to-End Monitoring Techniques and Services, April 2006.
[35] Tudor Marian, Ki Suh Lee, and Hakim Weatherspoon. NetSlices: Scalable multi-core packet processing in user-space. In Proc. ANCS, 2012.
[36] B. Melander, M. Bjorkman, and P. Gunningberg. A new end-to-end probing and analysis method for estimating bandwidth bottlenecks. In Proc. IEEE Global Telecommunications Conference, 2000.
[37] Pavlos Papageorge, Justin McCann, and Michael Hicks. Passive aggressive measurement with MGRP. SIGCOMM Computer Communication Review, 39(4):279–290, August 2009.
[38] R. Prasad, C. Dovrolis, M. Murray, and K. Claffy. Bandwidth estimation: Metrics, measurement techniques, and tools. IEEE Network, 2003.
[39] Ravi Prasad, Manish Jain, and Constantinos Dovrolis. Effects of interrupt coalescence on network measurements. In Proc. Passive and Active Measurement Workshop, 2004.
[40] V. Ribeiro, R. Riedi, R. Baraniuk, and J. Navratil. pathChirp: Efficient available bandwidth estimation for network paths. In Proc. PAM, 2003.
[41] Vyas Sekar, Sylvia Ratnasamy, Michael K. Reiter, Norbert Egi, and Guangyu Shi. The middlebox manifesto: Enabling innovation in middlebox deployment. In Proc. HotNets, 2011.
[42] J. Sommers, P. Barford, and W. Willinger. Laboratory-based calibration of available bandwidth estimation tools. Microprocessors and Microsystems, 31(4):222–235, 2007.
[43] Joel Sommers and Paul Barford. An active measurement system for shared environments. In Proc. IMC, 2007.
[44] Joel Sommers and Paul Barford. An active measurement system for shared environments. In Proc. IMC, October 2007.
[45] Joel Sommers, Paul Barford, and Mark Crovella. Router primitives for programmable active measurement. In Proc. PRESTO, 2009.
[46] Jacob Strauss, Dina Katabi, and Frans Kaashoek. A measurement study of available bandwidth estimation tools. In Proc. SIGCOMM, 2003.
[47] Kashi Venkatesh Vishwanath and Amin Vahdat. Realistic and responsive network traffic generation. In Proc. SIGCOMM, 2006.
[48] Kashi Venkatesh Vishwanath and Amin Vahdat. Evaluating distributed systems: Does background traffic matter? In Proc. USENIX ATC, 2008.
[49] Yong Xia, Lakshminarayanan Subramanian, Ion Stoica, and Shivkumar Kalyanaraman. One more bit is enough. SIGCOMM Computer Communication Review, 35(4):37–48, August 2005.
[50] Curtis Yu, Cristian Lumezanu, Yueping Zhang, Vishal Singh, Guofei Jiang, and Harsha V. Madhyastha. FlowSense: Monitoring network utilization with zero measurement cost. In Proc. PAM, 2013.