Pied Piper: Rethinking Internet Data Delivery

Pied Piper: Rethinking Internet Data Delivery Aran Bergman1;2, Israel Cidon1, Isaac Keslassy1;2, Noga Rotman3, Michael Schapira3, Alex Markuze1;2, Eyal Zohar1 1 VMware 2 Technion 3 HUJI ABSTRACT congestion control, though a decades-old agenda, remains at We contend that, analogously to the transition from resource- the center of attention, as evidenced by the surge of recent limited on-prem computing to resource-abundant cloud com- interest in novel congestion control paradigms [6, 9, 13] and puting, Internet data delivery should also be adapted to a real- in data-driven BGP-path-selection [34, 45]. ity in which the cloud offers a virtually unlimited resource, Meanwhile, recent developments give rise to new opportu- i.e., network capacity, and virtualization enables delegating nities. Major cloud providers are pouring billions of dollars local tasks, such as routing and congestion control, to the into high-performance globe-spanning networking infrastruc- cloud. This necessitates rethinking the traditional roles of tures [3, 35, 37]. In addition, virtualization enables setting inter- and intra-domain routing and conventional end-to-end up logical relays within the cloud (in the form of virtual ma- congestion control. chines (VMs)), and controlling the routing and transport-layer We introduce Optimized Cloudified Delivery (OCD), a protocols executed by these relays. As a result, any user can holistic approach for optimizing joint Internet/cloud data de- establish a virtually infinite number of VMs in the cloud, each livery, and evaluate OCD through hundreds of thousands of potentially sending at a virtually infinite capacity to its outgo- file downloads from multiple locations. We start by examin- ing links (e.g., , up to 2Gbps per VM in our measurements on ing an OCD baseline approach: traffic from a source A to a the most basic VM types), in a pay-as-you-go model where destination B successively passes through two cloud virtual rates per GB keep falling. machines operating as relays - nearest to A and B; and the We argue that the emergence of cloud networks of this scale two cloud relays employ TCP split. and capacity compels the networking community to rethink We show that even this na¨ıve strategy can outperform the traditional roles of inter- and intra-domain routing, and recently proposed improved end-to-end congestion control of conventional end-to-end congestion control. In fact, we paradigms (BBR and PCC) by an order of magnitude. contend that networking is following the path of computing Next, we present a protocol-free, ideal pipe model of data and storage in their transition from resource-limited on-prem transmission, and identify where today’s Internet data deliv- computing to resource-abundant cloud computing. Similarly ery mechanisms diverge from this model. We then design to storage and computing, in the context of networking, too, and implement OCD Pied Piper. Pied Piper leverages vari- the cloud offers a virtually unlimited resource, i.e., network ous techniques, including novel kernel-based transport-layer capacity, and virtualization enables delegating to the cloud accelerations, to improve the Internet-Cloud interface so as local tasks, namely, (interdomain) routing and congestion to approximately match the ideal network pipe model. control. Our aim is to analyze what would happen to Internet data delivery if it also were to become “cloudified”. 1 INTRODUCTION arXiv:1812.05582v2 [cs.NI] 20 Dec 2018 The Internet’s data delivery infrastructure, whose core com- OCD baseline (§2). We introduce Optimized and Cloudified ponents (BGP, TCP, etc.) were designed decades ago, re- Delivery (OCD), a holistic approach that aims at exploring the flects a traditional view of the Internet’s landscape: traffic performance implications of delegating routing and conges- leaving a source traverses multiple organizational networks tion control to the cloud. We start by comparing the current (Autonomous Systems), potentially contending over network end-to-end Internet data delivery to an OCD baseline: (1) resources with other traffic flows en route to its destination traffic from a source A to a destination B successively passes at multiple locations along the way. Indeed, despite many through the two cloud relays, one near A and the other near B; changes to the Internet ecosystem over the past two decades and (2) the two cloud relays employ TCP split, dividing the (the emergence of global corporate WANs, IXPs, CDNs), end-to-end TCP connection into three connections. Thanks BGP paths consistently remain about 4-AS-hop-long on av- to the advent of virtualization, this OCD baseline is relatively erage [2, 19], and congestion control on the Internet remains easy to implement. This baseline stategy, while simple, cap- notoriously bad [9]. Consequently, fixing the deficiencies tures the essence of OCD: minimizing time outside the cloud of the Internet’s core networking protocols for routing and so as to avoid BGP’s inefficiencies and separating traffic mod- This work has been submitted to the IEEE for possible publication. Copyright ulation within and without the cloud. may be transferred without notice, after which this version may no longer be Our evaluation of the OCD baseline on three major clouds accessible. and on both inter- and intra-continental traffic, reveals that file 1 download times for a 4MB file over this OCD baseline outper- 2 OCD BASELINE form those over the wild Internet by a 5× factor. Download We present below a baseline proposal for OCD. We show that times of large files can improve by up to 30×, while those of even the proposed na¨ıve solution significantly outperforms small files are barely affected or get slightly worse. Impor- utilizing recently proposed improved end-to-end congestion tantly, using recently proposed end-to-end congestion-control control schemes. protocols over the Internet, such as PCC [9] and BBR [6], improves performance by less than 30%. In short, the OCD baseline approach appears quite promising. Pied Piper (§3). To go beyond the OCD baseline, we next 2.1 Baseline Strategy introduce the abstraction of a fixed-capacity network pipe Consider an end-host A in Frankfurt that is sending traffic to interconnecting source A and destination B. We present a an end-host B in New York. We will refer to A as the server protocol-free, ideal theoretical model of such a network pipe, and B as the client. Under today’s default data transport and explain how today’s Internet data delivery mechanisms mechanisms, traffic from A to B will be routed along the BGP introduce a seriality that diverges from this ideal pipe model. path between the two. As illustrated in Figure 1a, such a We argue that as the (long) cloud leg essentially encounters path might connect A’s ISP to B’s ISP via an intermediate no congestion, transport-layer mechanisms can be adapted to ISP (or more). The modulation of traffic from A to B in better approximate the ideal network pipe model. To accom- our example is via TCP congestion control, which reacts to plish this, we design and implement OCD Pied Piper, which network conditions on the path (congestion, freed capacity) leverages various techniques, including novel kernel-based by adaptively adjusting the sending rate. transport-layer accelerations. We consider the following simple strategy for the cloudified OCD Evaluations (§4). We next revisit two key design data delivery, as illustrated in Figure 1b: Traffic from A is choices behind our OCD approach: (1) employing two cloud sent directly to a geographically-close cloud ingress point. relays (placed near the source and the destination) and (2) Then, traffic traverses the (single) cloud’s infrastructure and splitting TCP. Our evaluation relies on a large-scale empirical leaves that cloud en route to B at an egress point close to investigation involving the download of hundreds of thou- B. In addition, the transport-layer connection between A and sands of files of varying sizes from multiple locations within B is split into 3 separate connections, each employing TCP and outside the cloud. Our results show that: (1) Using the (Cubic) congestion control: between A and the cloud, within two cloud relays that are closest to A and B indeed improves the cloud, and between the cloud and B. We point out that performance over using most single cloud relays. While using this strategy does not incorporate aspects such as employing the best single cloud relay may perform slightly better, it is other rate control schemes, leveraging additional relays within hard to identify this relay in advance. Further, using three the cloud, traversing multiple clouds, and more. Yet, as our relays instead of two provides, at best, small marginal benefit. empirical results below demonstrate, even the simple baseline Also, (2) not splitting TCP significantly hurts the data transfer strategy yields significant performance benefits. We later performance, and therefore splitting appears to be a must. revisit the rate-control and routing aspects of the baseline In addition, we find that combining routes through several approach in Sections3 and 4.1, respectively. To conclude, the clouds may help in the future. baseline strategy relies on: OCD implementation (§5). The OCD Pied Piper relies on a novel Linux kernel-based split acceleration we call K- Two cloud relays, one geographically-close to the server Split [50]. To our knowledge, K-Split is the first TCP split and one geographically-close to the client. We set up, for tool written in the Linux kernel, using kernel sockets. Avoid- each client-server pair, two cloud relays (VMs), selected ing the penalties that stem from the numerous system calls according to their geographical proximity to the end-point of user-mode implementations. We make K-Split publicly (the first closest to the server and the second closest to the available as open-source on Github, thus enabling any user to client).

Pied Piper: Rethinking Internet Data Delivery

Libioth (Slides)

Improving Networking

Post Sockets - a Modern Systems Network API Mihail Yanev (2065983) April 23, 2018

Introduction to RAW-Sockets Jens Heuschkel, Tobias Hofmann, Thorsten Hollstein, Joel Kuepper

Freertos and TCP/IP Communication: the Lwip Library

STREAMS Vs. Sockets Performance Comparison for UDP

Transport Layer Socket Programming

Socket Programming

The Transport Layer = L4

Design and Research of Future Network (IPV9) API

The Berkeley Sockets API

Socket Programming& Porting Linux On