NetKernel: Making Network Stack Part of the Virtualized Infrastructure

Zhixiong Niu (City University of Hong Kong), Hong Xu (City University of Hong Kong), Peng Cheng (Microsoft Research), Yongqiang Xiong (Microsoft Research), Tao Wang (City University of Hong Kong), Dongsu Han (KAIST), Keith Winstein (Stanford University)

ABSTRACT

The network stack is implemented inside virtual machines (VMs) in today's cloud. This paper presents a system called NetKernel that decouples the network stack from the guest, and offers it as an independent module implemented by the cloud operator. NetKernel represents a new paradigm where the network stack is managed by the operator as part of the virtualized infrastructure. It provides important efficiency benefits: by gaining control and visibility of the network stack, the operator can perform network management more directly and flexibly, such as multiplexing VMs running different applications onto the same network stack module to save CPU cores, and enforcing fair bandwidth sharing with distributed congestion control. Users also benefit from simplified stack deployment and better performance. For example, mTCP can be deployed without API change to support nginx and redis natively, and shared memory networking can be readily enabled to improve performance of colocating VMs. Testbed evaluation using 100G NICs shows that NetKernel preserves the performance and scalability of both kernel and userspace network stacks, and provides the same isolation as the current architecture.

1 INTRODUCTION

The virtual machine (VM) is the predominant form of virtualization in today's cloud due to its strong isolation guarantees. VMs allow customers to run applications in a wide variety of operating systems (OSes) and configurations. VMs are also heavily used by cloud operators to deploy internal services, such as load balancing, proxy, VPN, etc., both in a public cloud for tenants and in a private cloud for supporting various business units of an organization. Lightweight virtualization technologies such as containers are also provisioned inside VMs in many production settings for isolation, security, and management reasons [2, 3, 6].

VM based virtualization largely follows traditional OS design. In particular, the TCP/IP network stack is encapsulated inside the VM as part of the guest OS as shown in Figure 1(a). Applications own the network stack, which is separated from the network infrastructure that operators own; they interface using the virtual NIC abstraction. This architecture preserves the familiar hardware and OS abstractions so a vast array of workloads can be easily moved into the cloud. It provides high flexibility to applications to customize the entire network stack.

We argue that the current division of labor between application and network infrastructure is becoming increasingly inadequate. The central issue is that the operator has almost zero visibility and control over the network stack. This leads to many efficiency problems that manifest in various aspects of running the cloud network.

Many network management tasks like monitoring, diagnosis, and troubleshooting have to be done in an extra layer outside the VMs, which requires significant effort in design and implementation [23, 54, 55]. Since these network functions need to process packets at the end-host [29, 37, 45, 61], they could be done more efficiently if the network stack were opened up to the operator. More importantly, the operator is unable to orchestrate resource allocation at the end-points of the network fabric, resulting in low resource utilization. It remains difficult today for the operator to meet or define performance SLAs despite much prior work [17, 28, 34, 39, 52, 53], as she cannot precisely provision resources just for the network stack or control how the stack consumes these resources. Further, resources (e.g. CPU) have to be provisioned on a per-VM basis based on the peak traffic; it is impossible to coordinate across VM boundaries. This degrades the overall utilization of the network stack since in practice traffic to individual VMs is extremely bursty.

Even the simple task of maintaining or deploying a network stack suffers from inefficiency today. The network stack has critical impact on performance, and many optimizations have been studied with numerous effective solutions, ranging from congestion control [13, 19, 47], scalability [33, 40], zerocopy datapath [5, 33, 51, 59, 60], NIC multiqueue scheduling [57], etc. Yet the operator, with sufficient expertise

VM VM We make three specific contributions. APP1 APP2 APP1 APP2 • We design and implement a system called NetKernel

Tenant Networking API Networking API to show that this new division of labor is feasible with- out radical changes to application or infrastructure Network Stack (§3–§5). NetKernel provides transparent BSD socket Network stack module vNIC redirection so existing applications can run directly. Network Stack Provider The socket semantics from the application are encap- sulated into small queue elements and transmitted to (a). Existing architecture (b). Decoupling network stack from the guest the corresponding NSM via lockless shared memory Figure 1: Decoupling network stack from the guest, and making it queues. part of the virtualized infrastructure. • We present new use cases that are difficult to realize today to show NetKernel’s potential benefits (§6). For example, we show that NetKernel enables multiplex- and resources, still cannot deploy these extensions to im- ing: one NSM can serve multiple VMs at the same prove performance and reduce overheads. As a result, our time and save over 40% CPU cores without degrading community is still finding ways to deploy DCTCP in the pub- performance using traces from a production cloud. lic cloud [20, 31]. On the other hand, applications without • We conduct comprehensive testbed evaluation with much knowledge of the underlying network or expertise on commodity 100G NICs to show that NetKernel achieves networking are forced to juggle the deployment and mainte- the same performance, scalability, and isolation as the nance details. For example if one wants to deploy a new stack current architecture (§7). For example, the kernel stack like mTCP [33], a host of problems arise such as setting up NSM achieves 100G send throughput with 3 cores; the kernel bypass, testing with kernel versions and NIC drivers, mTCP NSM achieves 1.1M RPS with 8 cores. and porting applications to the new . Given the intricacy of implementation and the velocity of development, it is a 2 MOTIVATION daunting task if not impossible to expect users, whether ten- Decoupling the network stack from the guest OS, hence mak- ants in a public cloud or first-party services in a private cloud, ing it part of the infrastructure, marks a clear departure from to individually maintain the network stack themselves. the way networking is provided to VMs nowadays. In this We thus advocate a new division of labor in a VM-based section we elaborate why this is a better architectural design cloud in this paper. We believe that network stack should by presenting its benefits and contrasting with alternative be managed as part of the virtualized infrastructure instead solutions. We discuss its potential issues in §8. of in the VM by application. The operator is naturally in a better position to own the last mile of packet delivery, so it can directly deploy, manage, and optimize the network stack, 2.1 Benefits and comprehensively improve the efficiency of running the We highlight key benefits of our vision with new use cases entire network fabric. Applications’ functionality and perfor- we experimentally realize with NetKernel in §6. mance requirements can be consolidated and satisfied with Better efficiency in management for the operator. Gain- several different network stacks provided by the operator. ing control over the network stack, the operator can now As the heavy-lifting is taken care of, applications can just perform network management more efficiently. 
For exam- use network stack as a basic service of the infrastructure and ple it can orchestrate the resource provisioning strategies focus on their business logic. much more flexibly: For mission-critical workloads, it can Specifically, we propose to decouple the VM network stack dedicate CPU resources to their NSMs to offer performance from the guest as shown in Figure 1(b). We keep the network SLAs in terms of throughput and RPS (requests per second) APIs such as BSD sockets intact, and use them as the ab- guarantees. For elastic workloads, on the other hand, it can straction boundary between application and infrastructure. consolidate their VMs to the same NSM (if they use the same Each VM is served by a network stack module (NSM) that network stack) to improve its resource utilization. The op- runs the network stack chosen by the user. Application data erator can also directly implement management functions are handled outside the VM in the NSM, whose design and as an integral part of user’s network stack and improve the implementation are managed by the operator. Various net- effectiveness of management, compared to doing them inan work stacks can be provided as different NSMs to ensure extra layer outside the guests. applications with diverse requirements can be properly sat- Use case 1: Multiplexing (§6.1). Utilization of network stack isfied. We do not enforce a single transport design, ortrade in VMs is very low most of the time in practice. Using a real off flexibility of the existing architecture in our approach. trace from a large cloud, we show that NetKernel enables 2 NetKernel: Making Network Stack Part of the Virtualized Infrastructure Technical report, 2019, online multiple VMs to be multiplexed onto one NSM to serve the 2.2 Alternative Solutions aggregated traffic and saves over 40% CPU cores forthe We now discuss several alternative solutions and why they operator without performance degradation. are inadequate. Use case 2: Fair bandwidth sharing (§6.2). TCP’s notion Why not just use containers? Containers are gaining pop- of flow-level fairness leads to poor bandwidth sharing in ularity as a lightweight and portable alternative to VMs [4]. data centers [55]. We show that NetKernel allows us to read- A container is essentially a process with namespace isolation. ily implement VM-level congestion control [55] as an NSM Using containers can address some efficiency problems be- to achieve fair sharing regardless of number of flows and cause the network stack is in the instead of in the destinations. containers. Without the guest OS, however, containers have Deployment and performance gains for users. Making poor isolation [36] and are difficult to manage. Moreover, network stack part of the virtualized infrastructure is also containers are constrained to using the host network stack, beneficial for users in both public and private clouds. Op- whereas NetKernel provides choices for applications on the erator can directly optimize the network stack design and same host. This is important as data center applications have implementation. Various kernel stack optimizations [40, 59], diverse requirements that cannot be satisfied with a single high-performance userspace stacks [1, 18, 33, 51], and even design. designs using advanced hardware [7, 9, 10, 41] can now be In a word, containers or other lightweight virtualization deployed and maintained transparently without user involve- represent a more radical approach of removing the guest ment or application code change. 
Since the BSD socket is kernel, which leads to several practical issues. Thus they the only abstraction exposed to the applications, it is now are commonly deployed inside VMs in production settings. feasible to adopt new stack designs independent of the guest In fact we find that all major public clouds [2, 3, 6] require kernel or the network API. Our vision also opens up new users to launch containers inside VMs. Thus, our discussion design space by allowing the network stack to exploit the is centered around VMs that cover the vast majority of usage visibility into the infrastructure for performance benefits. scenarios in a cloud. NetKernel readily benefits containers Use case 3: Deploying mTCP without API change (§6.3). running inside VMs as well. We show that NetKernel enables unmodified applications Why not on the hypervisor? Another possible approach in the VM to use mTCP [33] in the NSM, and improves is to keep the VM intact, and add the network stack imple- performance greatly due to mTCP’s kernel bypass design. mentation outside on the hypervisor. Some existing work mTCP is a userspace stack with new APIs (including modified takes this approach to realize a uniform congestion control /kqueue). During the process, we also find and fix a without changing VMs [20, 31]. This does allow the operator compatibility issue between mTCP and our NIC driver, and to gain control on network stack. Yet the performance and save significant maintenance time and effort for users. efficiency of this approach is even lower than the current Use case 4: Shared memory networking (§6.4). When two architecture because data are then processed twice in two VMs of the same user are colocated on the same host, NetK- independent stacks, first by the VM network stack and then ernel can directly detect this and copy their data via shared the stack outside. memory to bypass TCP stack processing and improve through- Why not use customized OS images? Operators can build put. This is difficult to achieve today as VMs have no knowl- customized OS images with the required network stacks for edge about the underlying infrastructure [38, 62]. users, which remedies the maintenance and deployment is- And beyond. We focus on efficiency benefits in this paper sues. Yet this approach has many downsides. It is not trans- since they seem most immediate. Making network stack part parent: customers need to update these images on their own, of the virtualized infrastructure also brings additional bene- and deploying images causes downtime and disrupts appli- fits that are more far-reaching. For example, it facilitates in- cations. But more importantly, since even just a single user novation by allowing new protocols in different layers of the will have vastly different workloads that require different en- stack to be rapidly prototyped and experimented. It provides vironments ( or FreeBSD or Windows, kernel versions, a direct path for enforcing centralized control, so network driver versions, etc.), the cost of testing and maintenance for functions like failure detection [29] and monitoring [37, 45] all these possibilities is prohibitive. can be integrated into the network stack implementation. In contrast, NetKernel does not have these issues because It opens up new design space to more freely exploit end- it breaks the coupling of the network stack to the guest OS. point coordination [25, 50], software-hardware co-design, Architecturally a network stack module can be used by VMs and programmable data planes [15, 41]. 
We encourage the with different guest OSes since BSD socket APIs are widely community to fully explore these opportunities in the future. supported, thereby greatly reducing development resources required for operators. Maintenance is also transparent and

3 Technical report, 2019, online Z. Niu et al. non-disruptive to customers as operators can roll out updates Tenant VM NSM in the background. APP1 APP2 Network Stack BSD Socket Huge pages GuestLib ServiceLib 3 DESIGN PHILOSOPHY (NetKernel Socket) vNIC Huge NetKernel imposes three fundamental design questions around NetKernel Huge mmap device pages pages the separation of network stack and the guest OS: queues queues

(1) How to transparently redirect socket API calls without Virtual Switch or Embedded NetKernel CoreEngine changing applications? Switch (SR-IOV) (2) How to transmit the socket semantics between the stripped area indicates a shared memory region pNICs VM and NSM whose implementation of the stack may vary? Figure 2: NetKernel design overview. (3) How to ensure high performance with semantics trans- Switching the queue elements offers important benefits mission (e.g., 100 Gbps)? beyond lockless queues. It facilitates a flexible mapping be- These questions touch upon largely uncharted territory in tween VM and NSM: a NSM can support multiple VMs with- the design space. Thus our main objective in this paper is to out adding more queues compared to binding the queues demonstrate feasibility of our approach on existing virtual- directly between VM and NSM. In addition, it allows dy- ization platforms and showcase its potential. Performance namic resource management: cores can be readily added to and overhead are not our primary goals. It is also not our or removed from a NSM, and a user can switch her NSM goal to improve any particular network stack design. on the fly. The CPU overhead of software switching can be In answering the questions above, NetKernel’s design has addressed by hardware offloading24 [ , 27], which we discuss the following highlights. in §7.8 in more detail. Transparent Socket API Redirection. NetKernel needs VM Based NSM. Lastly we discuss an important design to redirect BSD socket calls to the NSM instead of the tenant choice regarding the NSM. The NSM can take various forms. network stack. This is done by inserting into the guest a It may be a full-fledged VM with a monolithic kernel. Or library called GuestLib. The GuestLib provides a new socket it can be a container or module running on the hypervisor, type called NetKernel socket with a complete implementa- which is appealing because it consumes less resource and tion of BSD socket APIs. It replaces all TCP and UDP sockets offers better performance. Yet it entails porting a complete when they are created with NetKernel sockets, effectively TCP/IP stack to the hypervisor. Achieving memory isolation redirecting them without changing applications. among containers or modules are also difficult [48]. More A Lightweight Semantics Channel. Different network importantly, it introduces another coupling between the net- stacks may run as different NSMs, so NetKernel needs to work stack and the hypervisor, which defeats the purpose ensure socket semantics from the VM work properly with of NetKernel. Thus we choose to use a VM for NSM. VM the actual NSM stack implementation. For this purpose NetK- based NSM readily supports existing kernel and userspace ernel builds a lightweight socket semantics channel between stacks from various OSes. VMs also provide good isolation VM and its NSM. The channel relies on small fix-sized queue and we can dedicate resources to a NSM to guarantee per- elements as intermediate representations of socket seman- formance. VM based NSM is the most flexible: we can run tics: each socket API call in the VM is encapsulated into a stacks independent of the hypervisor. queue element and sent to the NSM, who would effectively translate the queue element into the corresponding API call 4 DESIGN of its network stack. Figure 2 depicts NetKernel’s architecture. The BSD socket Scalable Lockless Queues. 
As NIC speed in cloud evolves APIs are transparently redirected to a complete NetKernel from 40G/50G to 100G [24] and higher, the NSM has to use socket implementation in GuestLib in the guest kernel (§4.1). multiple cores for the network stack to achieve line rate. The GuestLib can be deployed as a kernel patch and is the NetKernel thus adopts scalable lockless queues to ensure only change we make to the user VM. Network stacks are VM-NSM socket semantics transmission is not a bottleneck. implemented by the provider on the same host as Network Each core services a dedicated set of queues so performance Stack Modules (NSMs), which are individual VMs in our is scalable with number of cores. More importantly, each current design. Inside the NSM, a ServiceLib interfaces with queue is memory shared with a software switch, so it can be the network stack. The NSM connects to the vSwitch, be it lockless with only a single producer and a single consumer a software or a hardware switch, and then the pNICs. Thus to avoid expensive lock contention [32, 33, 40]. our design also supports SR-IOV. 4 NetKernel: Making Network Stack Part of the Virtualized Infrastructure Technical report, 2019, online
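To make the single-producer/single-consumer queue design concrete, the following is a minimal sketch of the kind of lockless ring such a queue could use. The struct and function names are illustrative, not taken from NetKernel's code, and a production version would add cache-line padding appropriate to the target architecture.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE 1024            /* power of two; one slot kept empty to tell full from empty */

struct nqe;                       /* 32-byte queue element, see Figure 3 */

struct spsc_ring {
    _Atomic uint32_t head;        /* advanced only by the consumer */
    _Atomic uint32_t tail;        /* advanced only by the producer */
    struct nqe *slots[RING_SIZE]; /* lives in the shared memory region mapped by both sides */
};

/* Producer side (e.g., GuestLib): returns false if the ring is full. */
static bool ring_enqueue(struct spsc_ring *r, struct nqe *e)
{
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    uint32_t next = (tail + 1) & (RING_SIZE - 1);
    if (next == atomic_load_explicit(&r->head, memory_order_acquire))
        return false;             /* full */
    r->slots[tail] = e;
    atomic_store_explicit(&r->tail, next, memory_order_release);
    return true;
}

/* Consumer side (e.g., the software switch): returns NULL if the ring is empty. */
static struct nqe *ring_dequeue(struct spsc_ring *r)
{
    uint32_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    if (head == atomic_load_explicit(&r->tail, memory_order_acquire))
        return NULL;              /* empty */
    struct nqe *e = r->slots[head];
    atomic_store_explicit(&r->head, (head + 1) & (RING_SIZE - 1), memory_order_release);
    return e;
}
```

Because each queue has exactly one producer and one consumer, no locks or atomic read-modify-write operations are needed; this is why per-core queue sets shared with the switch scale with the number of cores.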

All socket operations and their results are translated into and initializes the socket data structure with function point- NetKernel Queue Elements (NQEs) by GuestLib and Ser- ers to NetKernel socket implementation in GuestLib. The viceLib (§4.2). For NQE transmission, GuestLib and ServiceLib sendmsg() for example now points to nk_sendmsg() in each has a NetKernel device, or NK device in the follow- GuestLib instead of tcp_sendmsg(). ing, consisting of one or more sets of lockless queues. Each queue set has a send queue and receive queue for operations Table 1: NetKernel socket implementation. with data transfer (e.g. send()), and a job queue and com- pletion queue for control operations without data transfer inet_stream_ops netkernel_pro inet_bind() nk_bind() (e.g. setsockopt()). Each NK device connects to a software bind connect inet_connect() nk_connect() switch called CoreEngine, which runs on the hypervisor and accept inet_accept() nk_accept() performs actual NQE switching (§4.3). The CoreEngine is poll tcp_poll() nk_poll() also responsible for various management tasks such as set- inet_ioctl() nk_ioctl() ting up the NK devices, ensuring isolation among VMs, etc. listen inet_listen() nk_listen() (§4.4) A unique set of hugepages are shared between each shutdown inet_shutdown() nk_shutdown() VM-NSM tuple for application data exchange. A NK device setsockopt sock_common_setsockopt() nk_setsockopt() also maintains a hugepage region that is memory mapped recvmsg tcp_recvmsg() nk_recvmsg() to the corresponding application hugepages as shown in sendmsg tcp_sendmsg() nk_sendmsg() Figure 2 (§4.5). For ease of presentation, we assume both the user VM and NSM run Linux, and the NSM uses the kernel stack. 4.2 A Lightweight Socket Semantics Channel Socket semantics are contained in NQEs and carried around 4.1 Transparent Socket API Redirection between GuestLib and ServiceLib via their respective NK We first describe how NetKernel’s GuestLib interacts with devices. applications to support BSD socket semantics transparently. Kernel Space API Redirection. There are essentially two 1B 1B 1B 4B 8B 8B 4B 5B op Queue VM VM ID op_data data pointer size rsved approaches to redirect BSD socket calls to NSM, each with type set ID socket ID its unique tradeoffs. One is to implement it in userspace using LD_PRELOAD for example. The advantages are: (1) It Figure 3: Structure of a NQE. Here socket ID denotes a pointer to the sock struct in the user VM or NSM, and is used for NQE trans- is efficient without syscall overheads and performance is mission with VM ID and queue set ID in §4.3; op_data contains data high [33]; (2) It is easy to deploy without kernel modification. necessary for socket operations, such as for bind; data However, this implies each application needs to have its pointer is a pointer to application data in hugepages; and size is own redirection service, which limits the usage scenarios. the size of pointed data in hugepages. Another way is kernel space redirection, which naturally supports multiple applications without IPC. The flip side is that performance may be lower due to context switching Tenant VM and syscall overheads. BSD Socket API We opt for kernel space API redirection to support most of socket(), send(), … the usage scenarios, and leave userspace redirection as future (4) return to app (1) NetKernel socket work. GuestLib is a kernel module deployed in the guest. 
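As a rough illustration of the redirection summarized in Table 1, GuestLib can register its own protocol-ops table so that every BSD socket call on a SOCK_NETKERNEL socket dispatches to an nk_* handler instead of the in-guest TCP code. The snippet below is a hedged sketch in the style of a Linux proto_ops table, not NetKernel's actual source; the nk_* handlers are GuestLib functions assumed to be defined elsewhere.

```c
#include <linux/net.h>
#include <net/sock.h>

/* Sketch only: each nk_* handler translates the call into an NQE and waits
 * for the NSM's response NQE, as described in Section 4.2. */
static const struct proto_ops netkernel_stream_ops = {
    .family     = PF_INET,
    .owner      = THIS_MODULE,
    .bind       = nk_bind,        /* replaces inet_bind()              */
    .connect    = nk_connect,     /* replaces inet_connect()           */
    .accept     = nk_accept,      /* replaces inet_accept()            */
    .poll       = nk_poll,        /* replaces tcp_poll()               */
    .ioctl      = nk_ioctl,       /* replaces inet_ioctl()             */
    .listen     = nk_listen,      /* replaces inet_listen()            */
    .shutdown   = nk_shutdown,    /* replaces inet_shutdown()          */
    .setsockopt = nk_setsockopt,  /* replaces sock_common_setsockopt() */
    .sendmsg    = nk_sendmsg,     /* replaces tcp_sendmsg()            */
    .recvmsg    = nk_recvmsg,     /* replaces tcp_recvmsg()            */
};
```

Once the socket's ops pointer refers to such a table, the application's unmodified calls through the BSD socket API land in GuestLib without any change to the application itself.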
GuestLib This is feasible by distributing images of para-virtualizated nk_bind(), nk_sendmsg(), … guest kernels to users, a practice providers are already doing (3) response NQE (2) translate to NQE nowadays. Kernel space redirection also allows NetKernel to Huge NetKernel Queues NQE work directly with I/O event notification syscalls like epoll. device pages NetKernel Socket API. GuestLib creates a new type of sockets—SOCK_NETKERNEL, in addition to TCP (SOCK_STREAM) Figure 4: NetKernel socket implementation in GuestLib redirects SOCK_DGRAM socket API calls. GuestLib translates socket API calls to NQEs and and UDP ( ) sockets. It registers a complete im- ServiceLib translates results into NQEs as well (not shown here). plementation of BSD socket APIs as shown in Table 1 to the guest kernel. When the guest kernel receives a socket() call to create a new TCP socket say, it replaces the socket NQE and Socket Semantics Translation. Figure 3 shows type with SOCK_NETKERNEL, creates a new NetKernel socket, the structure of a NQE with a fixed size of 32 bytes. Transla- tion happens at both ends of the semantics channel: GuestLib 5 Technical report, 2019, online Z. Niu et al. encapsulates the socket semantics into NQEs and sends to Tenant VM NSM (6) ServiceLib, which then invokes the corresponding API of Network Stack epoll_wait() recv() its network stack to execute the operation; the execution (1) nk_poll() (7) result is again turned into a NQE in ServiceLib first, and then ServiceLib GuestLib nk_recvmsg() translated by GuestLib back into the corresponding response (2) (4) NQE of socket APIs. NetKernel For example in Figure 4, to handle the socket() call in devices (3) the VM, GuestLib creates a new NQE with the operation data received type and information such as its VM ID for NQE transmis- (5) Receive Completion Send Job Receive Completion Send Job CoreEngine wakes queue queue queue queue queue queue queue queue sion. The NQE is transmitted by GuestLib’s NK device. The up the device socket() call now blocks until a response NQE is received. After receiving the NQE, ServiceLib parses the NQE from Figure 5: The socket semantics channel with epoll as an example. its NK device, invokes the socket() of the kernel stack to GuestLib and ServiceLib translate semantics to NQEs, and queues create a new TCP socket, prepares a new NQE with the ex- in the NK devices perform NQE transmission. Job and completion ecution result, and enqueues it to the NK device. GuestLib queues are for socket operations and execution results, send queues then receives and parses the response NQE and wakes up the are for socket operations with data, and receive queues are for socket() call. The socket() call now returns to application events of newly received data. Application data processing is not shown. with the NetKernel socket file descriptor (fd) if a TCP socket is created at the NSM, or with an error number consistent with the execution result of the NSM. 4.3 NQE Switching across Lockless Queues We defer the handling of application data to §4.5. We now elaborate how NQEs are switched by CoreEngine Queues for NQE Transmission. NQEs are transmitted via and how the NK devices interact with CoreEngine. one or more sets of queues in the NK devices. A queue set has Scalable Queue Design. 
The queues in a NK device is scal- four independent queues: a job queue for NQEs representing able: there are one dedicated queue set per vCPU for both socket operations issued by the VM without data transfer, a VM and NSM, so NetKernel performance scales with CPU re- completion queue for NQEs with execution results of control sources. Each queue set is shared memory with the CoreEngine, operations from the NSM, a send queue for NQEs represent- essentially making it a single producer single consumer ing operations issued by VM with data transfer; and a receive queue without lock contention. VM and NSM may have queue for NQEs representing events of newly received data different numbers of queue sets. from NSM. Queues of different NK devices have strict cor- Switching NQEs in CoreEngine. NQEs are load balanced respondence: the NQE for socket() for example is put in across multiple queue sets with the CoreEngine acting as a the job queue of GuestLib’s NK device, and sent to the job switch. CoreEngine maintains a connection table as shown in queue of ServiceLib’s NK device. Figure 6, which maps the tuple ⟨VM ID, queue set ID, socket We now present the working of I/O event notification ID⟩ to the corresponding ⟨NSM ID, queue set ID, socket ID⟩ mechanisms like epoll with the receive queue. Suppose an and vice versa. Here a socket ID corresponds to a pointer to application issues epoll_wait() to monitor some sockets. the sock struct in the user VM or NSM. We call them VM Since all sockets are now NetKernel sockets, the nk_poll() tuple and NSM tuple respectively. NQEs only contain VM is invoked by epoll_wait() and checks the receive queue to tuple information. see if there is any NQE for this socket. If yes, this means there Using the running example of the socket() call, we can are new data received, epoll_wait() then returns and the see how CoreEngine uses the connection table. The process application issues a recv() call with the NetKernel socket is also shown in Figure 6. (1) When CoreEngine processes the fd of the event. This points to nk_recvmsg() which parses socket NQE from VM1’s queue set 1, it realizes this is a new the NQE from receive queue for the data pointer, copies data connection, and inserts a new entry to the table with the VM from the hugepage directly to the userspace, and returns. tuple from the NQE. (2) It checks which NSM should handle If nk_poll() does not find any relevant NQE, it sleeps it,1 performs hashing based on the three tuple to determine until CoreEngine wakes up the NK device when new NQEs which queue set (say 2) to switch to if there are multiple arrive to its receive queue. GuestLib then parses the NQEs queue sets, and copies the NQE to the NSM’s corresponding to check if any sockets are in the epoll instances, and wakes job queue. CoreEngine adds the NSM ID and queue set ID to up the epoll to return to application. An epoll_wait()can the new entry. (3) ServiceLib gets the NQE and copies the also be returned by a timeout. 1A user VM to NSM mapping is determined either by the users offline or some load balancing scheme dynamically by CoreEngine. 6 NetKernel: Making Network Stack Part of the Virtualized Infrastructure Technical report, 2019, online
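Based on the field widths given in Figure 3, the 32-byte NQE could be declared roughly as the packed struct below. The field names follow the figure; the exact layout is our reading of it rather than the authors' definition.

```c
#include <stdint.h>

/* 32-byte NetKernel Queue Element, following the layout in Figure 3. */
struct nqe {
    uint8_t  op_type;        /* 1B: which socket operation (socket, bind, send, ...)  */
    uint8_t  queue_set_id;   /* 1B: which queue set carried this NQE                  */
    uint8_t  vm_id;          /* 1B: issuing VM (or NSM on the return path)            */
    uint32_t socket_id;      /* 4B: identifies the sock struct in the VM or NSM       */
    uint64_t op_data;        /* 8B: operation arguments, e.g. address/port for bind   */
    uint64_t data_ptr;       /* 8B: location of application data in the hugepages     */
    uint32_t size;           /* 4B: size of the data pointed to in the hugepages      */
    uint8_t  reserved[5];    /* 5B: padding up to 32 bytes                            */
} __attribute__((packed));

_Static_assert(sizeof(struct nqe) == 32, "NQE must stay 32 bytes");
```

Keeping the element small and fixed-size is what lets CoreEngine switch tens of millions of NQEs per second per core and lets the connection table key off the ⟨VM ID, queue set ID, socket ID⟩ tuple carried in every element.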

VM1 NSM 1 4.5 Processing Application Data

GuestLib ServiceLib So far we have covered API redirection, socket semantics transmission, NQE switching, and CoreEngine management NK device … in NetKernel. We now discuss the last missing piece: how application data are actually processed in the system. queue set 1 queue set 1 queue set 2 Sending Data. Application data are transmitted by hugepages connection table shared between the VM and NSM. Their NK devices main- <01, 01, 2A 3E 97 C3> <01, 01, C8 5D 42 6F> tain a hugepage region that is mmaped to the application <01, 01, FC 68 4E 02> <01, 02, ?> hugepages. For sending data with send(), GuestLib copies CoreEngine data from userspace directly to the hugepage, and adds a data pointer to the send NQE. It also increases the send buffer us- age for this socket similar to the send buffer size maintained Figure 6: NQE switching with CoreEngine. in an OS. The send() now returns to the application. Ser- viceLib invokes the tcp_sendmsg() provided by the kernel stack upon receiving the send NQE. Data are obtained from VM tuple to its response NQE, and adds the newly created the hugepage, processed by the network stack and sent via connection ID in the NSM to the op_data field of response the vNIC. A new NQE is generated with the result of send NQE. (4) CoreEngine parses the response NQE, matches at NSM and sent to GuestLib, who then decreases the send the VM tuple to the entry and adds the NSM socket ID to buffer usage accordingly. complete it, and copies the response NQE to the completion Receiving Data. Now for receiving packets in the NSM, a queue 1 of the corresponding VM1 as instructed in the NQE. normal network stack would send received data to userspace Later NQEs for this VM connection can be processed by the applications. In order to send received data to the user VM, correct NSM connection and vice versa. Note ServiceLib pins ServiceLib then copies the data chunk to huge pages and its connections to its vCPUs and queue sets, thus processing create a new NQE to the receive queue, which is then sent to the NQE and sending the response NQE is done on the same the VM. It also increases the receive buffer usage for this con- CPU. nection, similar to the send buffer maintained by GuestLib The connection table allows flexible multiplexing and de- described above. The rest of the receive process is already multiplexing with the socket ID information. For example explained in §4.2. Note that application uses recv() to copy one NSM can serve multiple VMs using different sockets. data from hugepages to their own buffer. CoreEngine uses polling across all queue sets to maximize ServiceLib. As discussed ServiceLib deals with much of data performance. processing at the NSM side so the network stack works in concert with the rest of NetKernel. One thing to note is that 4.4 Management with CoreEngine unlike the kernel space GuestLib, ServiceLib should live in the same space as the network stack to ensure best perfor- CoreEngine acts as the control plane of NetKernel and carries mance. We have focused on a stack with a kernel out many control tasks beyond NQE switching. space ServiceLib here. The design of a userspace ServiceLib NK Device and Queue Setup. CoreEngine allocates shared for a userspace stack is similar in principle. We implement memory for the queue sets and setts up the NK devices ac- both types of stacks as NSMs in §5. ServiceLib polls all its cordingly when a VM or NSM starts up, and de-allocates queues whenever possible for maximum performance. when they shut down. 
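A condensed, purely illustrative view of the send path described above follows; the helper names (hugepage_alloc, hugepage_offset, nk_enqueue_send) and the nk_socket fields are ours, not NetKernel's. The flow is: copy the user buffer into the shared hugepage region, account for it against the per-socket send budget, and emit a send NQE carrying only the data pointer and length.

```c
/* Kernel-style sketch of the GuestLib send path from Section 4.5. */
static ssize_t nk_send_sketch(struct nk_socket *nks, const void *ubuf, size_t len)
{
    if (nks->send_buf_used + len > nks->send_buf_limit)
        return -EAGAIN;                      /* send buffer full: back-pressure the app   */

    void *dst = hugepage_alloc(nks->hugepage_region, len);
    if (!dst)
        return -ENOMEM;
    if (copy_from_user(dst, ubuf, len))      /* one copy: user buffer -> shared hugepages */
        return -EFAULT;

    struct nqe e = {
        .op_type   = NK_OP_SEND,             /* hypothetical opcode constant              */
        .vm_id     = nks->vm_id,
        .socket_id = nks->socket_id,
        .data_ptr  = hugepage_offset(nks->hugepage_region, dst),
        .size      = len,
    };
    nk_enqueue_send(nks->queue_set, &e);     /* NSM's result NQE later decrements usage   */

    nks->send_buf_used += len;
    return len;                              /* pipelined: return before the NSM result   */
}
```

The receive path mirrors this: ServiceLib copies received data into the hugepages, posts an NQE to the receive queue, and the VM's recv() copies from the hugepages into the application buffer.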
Queues can also be dynamically added or removed with the number of vCPUs. Isolation. CoreEngine sits in an ideal position to carry out 4.6 Optimization isolation among VMs, a task essential in public clouds with We present several best-practice optimizations employed in VMs sharing one NSM. In our design CoreEngine polls each NetKernel to improve efficiency. queue set in a round-robin fashion to ensure the basic fair Pipelining. NetKernel applies pipelining in general between sharing. Providers can implement other forms of isolation VM and NSM for performance. For example on the VM mechanisms to rate limit a VM in terms of bandwidth or the side, a send() returns immediately after putting data to the number of NQEs (i.e. operations) per second, which we also hugepages, instead of waiting for the actual send result from experimentally show in §7.6. Note that CoreEngine isolation the NSM. Similarly the NSM would handle the accept() happens for egress; ingress isolation at the NSM in general by accepting a new connection and returning immediately, is more challenging and may need to resort to physical NIC before the corresponding NQE is sent to GuestLib and then queues [21]. application to process. Doing so does not break BSD socket 7 Technical report, 2019, online Z. Niu et al. semantics. Take send() for example. A successful send() operations without entering userspace. We create an inde- does not guarantee delivery of the message [12]; it merely pendent kthread to poll the job queue and send queue for indicates the message is written to socket buffer successfully. NQEs to avoid kernel stuck. Some BSD socket APIs can not In NetKernel a successful send() indicates the message is be invoked in kernel space directly. We use EXPORT_SYMBOLS written to buffer in the hugepages successfully. As explained to export the functions for ServiceLib. Meanwhile, the bound- in §4.5 the NSM sends the result of send back to the VM to ary check between kernel space and userspace is disabled. indicate if the socket buffer usage can be decreased or not. We use per-core epoll_wait() to obtain incoming events Interrupt-Driven Polling. We adopt an interrupt-driven from the kernel stack. polling design for NQE event notification to GuestLib’s NK We also port mTCP [11] as a userspace stack NSM. It uses device. This is to reduce the overhead of GuestLib and user DPDK 17.08 as the packet I/O engine. The DPDK driver has VM. When an application is waiting for events e.g. the result not been tested for 100G NICs and we fixed a compatibility of the socket() call or receive data for epoll, the device will bug during the process; more details are in §6.3. For sim- first poll its completion queue and receive queue. If nonew plicity, we maintain the two-thread model and per-core data NQE comes after a short time period (20µs in our experi- structure in mTCP. We implement the NSM in mTCP’s ap- ments), the device sends an interrupt to CoreEngine, notify- plication thread at each core. The per-core mTCP thread (1) ing that it is expecting NQE, and stops polling. CoreEngine translates NQEs polled from the NK device to mTCP socket later wakes up the device, which goes back to polling mode APIs, and (2) responds NQEs to the tenant VM based on to process new NQEs from the completion queue. the network events collected by mtcp_epoll_wait(). 
Since Interrupt-driven polling presents a favorable trade-off be- mTCP works in non-blocking mode for performance en- tween overhead and performance compared to pure polling hancement, we buffer send operations at each core and set based or interrupt based design. It saves precious CPU cycles the timeout parameter to 1ms in mtcp_epoll_wait() to when load is low and ensures the overhead of NetKernel is avoid starvation when polling NQE requests. very small to the user VM. Performance on the other hand Queues and Huge Pages. The huge pages are implemented is competent since the response NQE is received within the based on QEMU’s IVSHMEM. The page size is 2 MB and we polling period in most cases for blocking calls, and when use 128 pages. The queues are ring buffers implemented as the load is high polling automatically drives the notification much smaller IVSHMEM devices. Together they form a NK mechanism. As explained before CoreEngine and ServiceLib device which is a virtual device to the VM and NSM. both use busy polling to maximize performance. CoreEngine. The CoreEngine is a with two threads Batching. As a common best-practice, batching is used in on the KVM hypervisor. One thread listens on a pre-defined many parts of NetKernel for better throughput. CoreEngine port to handle NK device (de)allocation requests, namely 8- uses batching whenever possible for polling from and copy- byte network messages of the tuples ⟨ce_op,ce_data⟩. When ing into the queues. The NK devices also receive NQEs in a a VM (or NSM) starts (or terminates), it sends a request to batch for both GuestLib and ServiceLib. CoreEngine for registering (or deregistering) a NK device. If the request is successfully handled, CoreEngine responds 5 IMPLEMENTATION in the same message format. Otherwise, an error code is returned. The other thread polls NQEs in batches from all Our implementation is based on QEMU KVM 2.5.0 and Linux NK devices and switches them as described in §4.3. kernel 4.9 for both the host and the guest OSes, with over 11K LoC. We plan to open source our implementation. GuestLib. We add the SOCK_NETKERNEL socket to the kernel 6 NEW USE CASES (net.h), and modify socket. to rewrite the SOCK_STREAM To demonstrate the potential of NetKernel, we present some to SOCK_NETKERNEL during the socket creation. We imple- new use cases that are realized in our implementation. Details ment GuestLib as a kernel module with two components: of our testbed is presented in §7.1. The first two use cases Guestlib_core and nk_driver. Guestlib_core is mainly for show benefits for the operator, while the next two show Netkernel sockets and NQE translation, and nk_driver is benefits for users. for NQE communications via queues. Guestlib_core and nk_driver communicate with each other using function calls. ServiceLib and NSM. We also implement ServiceLib as two 6.1 Multiplexing components: Servicelib_core and nk_driver. Servicelib_core translates NQEs to network stack APIs, and the nk_driver is Here we describe a new use case where the operator can identical with the one in GuestLib. For the kernel stack NSM, optimize resource utilization by serving multiple bursty VMs Servicelib_core calls the kernel APIs directly to handle socket with one NSM. 8 NetKernel: Making Network Stack Part of the Virtualized Infrastructure Technical report, 2019, online
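The interrupt-driven polling described above can be summarized by a loop of the following shape. This is illustrative pseudocode in C: the 20µs budget is the value used in the paper's experiments, while the struct and helper names (now_ns, queue_has_nqe, notify_coreengine_sleeping, wait_for_wakeup) are hypothetical.

```c
#include <stdint.h>

#define POLL_BUDGET_NS (20 * 1000ULL)        /* poll ~20us before arming the interrupt */

/* Sketch of the NK device notification logic on the VM side (Section 4.6). */
static void nk_device_wait_for_nqe(struct nk_device *dev)
{
    uint64_t deadline = now_ns() + POLL_BUDGET_NS;

    for (;;) {
        if (queue_has_nqe(dev->completion_queue) || queue_has_nqe(dev->receive_queue))
            return;                          /* fast path: work arrived while polling  */
        if (now_ns() >= deadline)
            break;                           /* budget exhausted, fall back to sleep   */
        cpu_relax();
    }

    notify_coreengine_sleeping(dev);         /* tell CoreEngine we stopped polling     */
    wait_for_wakeup(dev);                    /* CoreEngine wakes the device when new   */
                                             /* NQEs arrive; polling then resumes      */
}
```

Under light load this keeps the VM's CPU cost near zero; under heavy load a blocking call almost always finds its response NQE within the polling window, so the notification cost stays off the critical path.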

To make things concrete we draw upon a user traffic trace 60% in the worst case for ∼97% of the AGs in the trace. Thus collected from a large cloud in September 2018. The trace con- one machine can run 13 or 81.25% more AGs now, which tains statistics of tens of thousands of application gateways means the operator can save over 40% cores for supporting (AGs) that handle tenant (web) traffic in order to provide load this workload. This implies salient financial gains for the balancing, proxy, and other services. The AGs are internally operator: according to [24] one physical core has a maximum deployed as VMs by the operator. We find that the AG’s av- potential revenue of $900/yr. erage utilization is very low most of the time. Figure 7 shows normalized traffic processed by three most utilized AGs (in Baseline NetKernel the same datacenter) in our trace with 1-minute intervals for Total # Cores 32 32 a 1-hour period. We can clearly see the bursty nature of the NSM 0 2 traffic. Yet it is very difficult to consolidate their workloadsin CoreEngine 0 1 current cloud because they serve different customers using # AGs 16 29 different configurations (proxy settings, LB strategies, etc.), Table 2: NetKernel multiplexes more AGs per machine and saves and there is no way to separate the application logic with over 40% cores. the underlying network stack. The operator has to deploy AGs as independent VMs, reserve resources for them, and charge customers accordingly. 6.2 Fair Bandwidth Sharing

120 15.0 TCP is designed to achieve flow-level fairness for bandwidth AG1 AG2 AG3 Baseline Netkernel 100 12.5 sharing in a network. This leads to poor fairness in a cloud 80 10.0 where a misbehaved VM can hog the bandwidth by say us- 60 7.5 ing many TCP flows. Distributed congestion control atan 40 5.0 entity-level (VM, process, etc.) such as Seawall [55] has been 20 2.5 Normalized RPS per core Normalized RPS Performance 0 0.0 proposed and implemented in a non-virtualized setting. Yet 0 10 20 30 40 50 60 0 10 20 30 40 50 60 Time (min) Time (min) using Seawall in a cloud has many difficulties: the provider has to implement it on the vSwitch or hypervisor and make it Figure 7: Traffic of three most Figure 8: Per-core RPS compari- work for various guest OSes. The interaction with the VM’s utilized application gateways son. Baseline uses 12 cores for 3 (AGs) in our trace. They are AGs, while NetKernel with mul- own congestion control logic makes it even harder [31]. deployed as VMs. tiplexing only needs 9 cores. 120 Baseline NetKernel NetKernel enables multiplexing across AGs running dis- 100 tinct services, since the common TCP stack processing is 80 VM A: 8 conn. VM A: 8 conn. now separated into the NSM. Using the three most utilized VM A: 8 conn. VM A: 8 conn. VM A: 8 conn. VM A: 8 conn. AGs which have the least benefit from multiplexing as an ex- 60 ample, without NetKernel each needs 4 cores in our testbed 40 to handle their peak traffic, and the total per-core requests 20 VM B: 8 conn. VM B: 8 conn. per second (RPS) of the system is depicted in Figure 8 as VM B: 16 conn. VM B: 16 conn. VM B: 24 conn. VM B: 24 conn. % of aggregate throughput 0 Baseline. Then in NetKernel, we deploy 3 VMs each with 1 1:1 2:1 3:1 core to replay the trace as the AGs, and use a kernel stack Number of conn.nations used (selfish : well-behaved traffic) NSM with 5 cores which is sufficient to handle the aggregate traffic. Totally 9 cores are used including CoreEngine, rep- Figure 9: By sharing a unified congestion window to same destina- resenting a saving of 3 cores in this case. The per core RPS tion, a NSM can achieve VM fairness. is thus improved by 33% as shown in Figure 8. Each AG has exactly the same RPS performance without any packet loss. NetKernel allows schemes like Seawall to be easily im- In the general case multiplexing these AGs brings even plemented as a new NSM and effectively enforce VM-level more gains since their peak traffic is far from their capacity. fair bandwidth sharing. Our proof-of-concept runs a simple For ease of exposition we assume the operator reserves 2 VM-level congestion control in the NSM: One VM maintains cores for each AG. A 32-core machine can host 16 AGs. If a global congestion window shared among all its connec- we use NetKernel with 1 core for CoreEngine and a 2-core tions to different destinations. Each individual flow’s ACK NSM, we find that we can always pack 29 AGs each with advances the shared congestion window, and when sending 1 core for the application logic as depicted in Table 2, and data, each flow cannot send more thann 1/ of the shared the maximum utilization of the NSM would be well under window where n is the number of active flows. We then 9 Technical report, 2019, online Z. Niu et al.
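The 1/n sharing rule of this proof-of-concept NSM can be captured in a few lines. The sketch below uses our own naming, not the paper's code, and shows how a per-flow send allowance would be derived from the VM-wide shared window.

```c
#include <stdint.h>

/* VM-level congestion state kept by the NSM for one tenant VM (Section 6.2 sketch). */
struct vm_cc {
    uint32_t shared_cwnd_bytes;   /* grown by ACKs from any of the VM's flows */
    uint32_t active_flows;        /* n: flows currently sending               */
};

/* Every ACK on any flow advances the shared window (simplified additive increase). */
static void vm_cc_on_ack(struct vm_cc *cc, uint32_t mss)
{
    cc->shared_cwnd_bytes += (mss * mss) / cc->shared_cwnd_bytes;   /* ~1 MSS per RTT */
}

/* A flow may keep at most 1/n of the shared window in flight. */
static uint32_t vm_cc_flow_allowance(const struct vm_cc *cc, uint32_t inflight_bytes)
{
    uint32_t n = cc->active_flows ? cc->active_flows : 1;
    uint32_t share = cc->shared_cwnd_bytes / n;
    return inflight_bytes >= share ? 0 : share - inflight_bytes;
}
```

Because the window is shared across all of the VM's connections, opening more flows no longer buys the VM more than its fair share of bandwidth.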

# vCPUs              1        2        4
Kernel stack NSM     71.9K    133.6K   200.1K
mTCP NSM             98.1K    183.6K   379.2K
Table 3: Performance of unmodified nginx using ab with 64B html files, a concurrency of 100, and 10M requests in total. The NSM and VM use the same number of vCPUs.

Throughput (Gbps) 20 run experiments with 2 VMs: a well-behaved VM that has 0 8 active flow, and a selfish VM that uses varying number of 64 128 256 512 1024 2048 4096 8192 Message Size (B) active flows. Figure 9 presents the results. NetKernel with our VM-level congestion control NSM is able to enforce an equal share of bandwidth between two VMs regardless of Figure 10: Using shared memory NSM for NetKernel for traffic be- number of flows. We leave the implementation of a complete tween two colocating VMs of the same user. NetKernel uses 2 cores general solution such as Seawall in NetKernel as future work. for each VM, 2 cores for the NSM, and 1 core for CoreEngine. Base- line uses 2 core for the sending VM, 5 cores for receiving VM, and runs TCP Cubic. Both schemes use 8 TCP connections. 6.3 Deploying mTCP without API Change We now focus on use cases of deployment and performance benefits for users. and operator to optimize for this case, because the VM has Most userspace stacks use their own APIs and require no information about where the other endpoint is. The hy- applications to be ported [1, 5, 33]. For example, to use pervisor cannot help either as the data has already been mTCP an application has to use mtcp_epoll_wait() to fetch processed by the TCP/IP stack. With NetKernel the NSM is events [33]. The semantics of these APIs are also different part of the infrastructure, the operator can easily detect the from socket APIs [33]. These factors lead to expensive code on-host traffic and use shared memory to copy data forthe changes and make it difficult to use the stack in practice. two VMs. We build a prototype NSM to demonstrate this Currently mTCP is ported for only a few applications, and idea: When a socket pair is detected as an internal socket pair does not support complex web servers like nginx. by the GuestLib, and the two VMs belong to the same user, a With NetKernel, applications can directly take advantage shared memory NSM takes over their traffic. This NSM sim- of userspace stacks without any code change. To show this, ply copies the message chunks between their hugepages and we deploy unmodified nginx in the VM with the mTCP NSM bypasses the TCP stack processing. As shown in Figure 10, we implement, and benchmark its performance using ab. with 7 cores in total, NetKernel with shared memory NSM Both VM and NSM use the same number of vCPUs. Table 3 can achieve 100Gbps, which is ∼2x of Baseline using TCP depicts that mTCP provides 1.4x–1.9x improvements over Cubic. the kernel stack NSM across various vCPU setting. NetKernel also mitigates the maintenance efforts required 7 EVALUATION from tenants. We provide another piece of evidence with We seek to examine a few crucial aspects of NetKernel in mTCP here. When compiling the DPDK version required our evaluation: (1) microbenchmarks of NQE switching and by mTCP on our testbed, we could not set the RSS (receive data copying §7.2; (2) basic performance with the kernel side scaling) key properly to the mlx5_core driver for our stack NSM §7.3; (3) scalability with multiple cores §7.4 and NIC and mTCP performance was very low. After discussing multiple NSMs §7.5; (4) isolation of multiple VMs §7.6; (5) with mTCP developers, we were able to attribute this to the latency of short connections §7.7; and (6) overhead of the asymmetric RSS key used in the NIC, and fixed the prob- system §7.8. lem by modifying the code in DPDK mlx5 driver. We have submitted our fix to mTCP community. 
Without NetKernel 7.1 Setup tenants would have to deal with such technical complication Our testbed servers each have two Xeon E5-2698 v3 16-core by themselves. Now they are taken care of transparently, CPUs clocked at 2.3 GHz, 256 GB memory at 2133 MHz, and a saving much time and effort for many users. Mellanox ConnectX-4 single port 100G NIC. Hyperthreading is disabled. We compare to the status quo where an appli- 6.4 Shared Memory Networking cation uses the kernel TCP stack in its VM, referred to as In the existing architecture, a VM’s traffic always goes though Baseline in the following. We designate NetKernel to refer to its network stack, then the vNIC, and the vSwitch, even when the common setting where we use the kernel stack NSM in the other VM is on the same host. It is difficult for both users our implementation. When mTCP NSM is used we explicitly 10 NetKernel: Making Network Stack Part of the Virtualized Infrastructure Technical report, 2019, online mark the setting in the figures. CoreEngine uses one core 7.3 Basic Performance with Kernel Stack for NQE switching throughout the evaluation. Unless stated We now look at NetKernel’s basic performance with Linux otherwise, Baseline and NetKernel use 1 vCPU for the VM, kernel stack. The results here are obtained with a 1-core and NetKernel uses 1 vCPU for the NSM. The same TCP VM and 1-core NSM; all other cores of the CPU are disabled. parameter settings are used for both systems. Baseline uses one core for the VM.

7.2 Microbenchmarks Baseline 14 Baseline 35 NetKernel NetKernel 12 We first microbenchmark NetKernel regarding NQE and data 30 25 10 transmission performance. 20 8 15 6

NQE switching. NQEs are transmitted by CoreEngine as 10 4 Throughput (Gbps) Throughput (Gbps) a software switch. It is important that CoreEngine offers 5 2 0 0 64 128 256 512 1024 2048 4096 8192 16384 64 128 256 512 1024 2048 4096 8192 16384 enough horsepower to ensure performance at 100G. We mea- Message Size (B) Message Size (B) sure CoreEngine throughput defined as the number of 32- byte NQEs copied from GuestLib’s NK device queues to the Figure 13: Single TCP stream Figure 14: Single TCP stream re- ServiceLib’s NK device queues with two copy operations. Fig- send throughput with the ker- ceive throughput with the ker- ure 11 shows the results with varying batch sizes. CoreEngine nel stack NSM. The NSM uses 1 nel stack NSM. The NSM uses 1 achieves ∼8M NQEs/s throughput without batching. With a vCPU. vCPU. small batch size of 4 or 8 throughput reaches 41.4M NQEs/s and 65.9M NQEs/s, respectively, which is sufficient for most Single TCP Stream. We benchmark the single stream TCP applications.2 More aggressive batching provides through- throughput with different message sizes. The results are av- put up to 198M NQEs/s. We use a batch size of 4 in all the eraged over 5 runs each lasting 30 seconds. Figure 13 depicts following experiments. the send throughput and Figure 14 receive throughput. We find that NetKernel performs on par with Baseline in allcases. 150 144.2

) 198.5 6 200 Send throughput reaches 30.9Gbps and receive throughput 0 178.2

1 125 118.0 x ( 150 tops at 13.6Gbps in NetKernel. Receive throughput is much d 100 n

o 119.6 80.3 c

e 100.2 75 lower because the kernel stack’s RX processing is much more

s 100

r e 65.9 45.9 p 50 CPU-intensive with interrupts. Note that if the other cores s 50

E 41.4 Throughput (Gbps) 25.8

Q 25 22.3 14.7 N 14.4 8.0 4.9 8.3 of the NUMA node are not disabled, soft interrupts (softirq) 0 0 1 2 4 8 16 32 64 128 256 64 128 256 512 1024 2048 4096 8192 Batch Size Message Size (B) may be sent to those cores instead of the one assigned to the NSM (or VM), thereby inflating the receive throughput.3 Figure 11: CoreEngine switching Figure 12: Message copy through- Multiple TCP Streams. We look at throughput for 8 TCP throughput using a single core put via hugepages with different streams on the same single-core setup as above. Figures 15 with different batch sizes. message sizes. and 16 show the results. Send throughput tops at 55.2Gbps, and receive throughput tops at 17.4Gbps with 16KB messages. Memory copy. We also measure the memory copy through- NetKernel achieves the same performance with Baseline. put between GuestLib and ServiceLib via hugepages. A mem-

60 20.0 ory copy in this experiment includes the following: (0) ap- Baseline Baseline 17.5 50 NetKernel NetKernel plication in the VM issues a send() with data, (1) GuestLib 15.0 40 gets a pointer from the hugepages, (2) copies the message 12.5 30 10.0

7.5 to hugepages, (3) prepares a NQE with the data pointer, (4) 20 5.0

Throughput (Gbps) 10 Throughput (Gbps) CoreEngine copies the NQE to ServiceLib, (5) ServiceLib ob- 2.5

0 0.0 tains the data pointer and puts it back to the hugepages. Thus 64 128 256 512 1024 2048 4096 819216384 64 128 256 512 1024 2048 4096 819216384 Message Size (B) Message Size (B) it measures the effective application-level throughput using NetKernel (including NQE transmission) without network Figure 15: 8-stream TCP send Figure 16: 8-stream TCP receive stack processing. throughput with the kernel throughput with the kernel Observe from Figure 12 that NetKernel delivers over 100G stack NSM. The NSM uses 1 stack NSM. The NSM uses 1 throughput with messages larger than 4KB: with 8KB mes- vCPU. vCPU. sages 144G is achievable. Thus NetKernel provides enough raw performance to the network stack and is not a bottleneck Short TCP Connections. We also benchmark NetKernel’s to the emerging 100G deployment in public clouds. performance in handling short TCP connections using a 264Mpps provides more than 100G bandwidth with an average message 3We observe 30.6Gbps receive throughput with 16KB messages in both size of 192B. NetKernel and Baseline when leaving the other cores on. 11 Technical report, 2019, online Z. Niu et al. server sending a short message as a response. The servers to demonstrate NetKernel’s full capability, we also run the are multi-threaded using epoll with a single listening port. mTCP NSM with 1, 2, 4, and 8 vCPUs.4 NetKernel with mTCP Our workload generates 10 million requests in total with a offers 190Krps, 366Krps, 652Krps, and 1.1Mrps respectively, concurrency of 1000. The connections are non-keepalive. Ob- and shows better scalability than kernel stack. serve from Figure 17 that NetKernel achieves ∼70K requests The results show that NetKernel preserves the scalabil- per second (rps) similar to Baseline, when the messages are ity of different network stacks, including high performance smaller than 1KB. For larger message sizes performance de- stacks like mTCP. grades slightly due to more expensive memory copies for 1200 both systems. Baseline 1000 NetKernel )

Figure 17: RPS with the kernel stack NSM using 1 vCPU.
Figure 20: Performance of TCP short connections with varying number of vCPUs. Message size 64B. Both kernel stack and mTCP NSMs are used.

7.4 Network Stack Scalability 7.5 NetKernel Scalability Here we focus on the scalability of network stacks in NetK- We now investigate the scalability of our design. In par- ernel. ticular, we look at whether adding more NSMs can scale Throughput. We use 8 TCP streams with 8KB messages to performance. Different from the previous section where we evaluate the throughput scalability of the kernel stack NSM. focus on the scalability of a network stack, here we aim to Results are averaged over 5 runs each lasting 30 seconds. show the scalability of NetKernel’s overall design. Figure 18 shows that both systems achieve the line rate of We use the same epoll servers in this set of experiments. 100G using 3 vCPUs or more for send throughput. For receive, The methodology is the same as §7.4, with 8 connections both achieve 91Gbps using 8 vCPUs as shown in Figure 19. and 8KB messages for throughput experiments and 10 mil- lions of requests with 64B messages for short connections
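For context on the benchmark methodology, a multi-threaded epoll server of the kind described here typically opens one SO_REUSEPORT listener per worker thread on the same port. The sketch below is a generic version of such a setup, not the authors' benchmark code; error handling is omitted for brevity.

```c
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Each worker calls this once, so the kernel (or NSM) spreads incoming
 * connections for the shared port across cores. */
static int make_reuseport_listener(uint16_t port)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);   /* becomes SOCK_NETKERNEL inside the VM */
    int one = 1;
    setsockopt(fd, SOL_SOCKET, SO_REUSEPORT, &one, sizeof(one));

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family      = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port        = htons(port);
    bind(fd, (struct sockaddr *)&addr, sizeof(addr));
    listen(fd, 1024);

    int epfd = epoll_create1(0);
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = fd };
    epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
    return epfd;                                /* the worker then loops on epoll_wait() */
}
```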

120 120 Baseline Baseline experiments. Each kernel stack NSM now uses 2 vCPUs. The 100 NetKernel 100 NetKernel servers in different NSMs listen on different ports anddoes 80 80 not share an accept queue. We vary the number of NSMs to 60 60

40 40 serve this 1-core VM.
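SO_REUSEPORT lets each worker thread bind its own listening socket to the same port, so every thread gets a separate accept queue instead of contending on a shared one. A hedged sketch of the per-thread listener setup follows; the port and backlog values are illustrative, not taken from the paper.

    /* Illustrative per-thread listener with SO_REUSEPORT: each worker
     * binds its own socket to the same port and gets its own accept queue. */
    #include <netinet/in.h>
    #include <sys/socket.h>

    int make_listener(unsigned short port) {
        int fd = socket(AF_INET, SOCK_STREAM, 0);
        int one = 1;
        /* Allow multiple sockets (one per worker thread) to bind the same port. */
        setsockopt(fd, SOL_SOCKET, SO_REUSEPORT, &one, sizeof(one));

        struct sockaddr_in addr = {0};
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port = htons(port);
        if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 || listen(fd, 4096) < 0)
            return -1;
        return fd;   /* each thread then adds its own fd to its own epoll instance */
    }

Incoming connections are then spread across the per-thread listeners, which is consistent with the near-linear short-connection scaling reported above.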

7.5 NetKernel Scalability

We now investigate the scalability of our design. In particular, we look at whether adding more NSMs can scale performance. Different from the previous section, where we focus on the scalability of a network stack, here we aim to show the scalability of NetKernel's overall design.

We use the same epoll servers in this set of experiments. The methodology is the same as §7.4, with 8 connections and 8KB messages for the throughput experiments and 10 million requests with 64B messages for the short connection experiments. Each kernel stack NSM now uses 2 vCPUs. The servers in different NSMs listen on different ports and do not share an accept queue. We vary the number of NSMs serving this 1-core VM.

# of 2-vCPU NSMs             1      2      3      4
Send throughput (Gbps)       85.1   94.0   94.1   94.2
Receive throughput (Gbps)    33.6   61.2   91.0   91.0
Requests per sec (x10^3)     131.6  260.4  399.1  520.1
Table 4: Throughput and short connection scaling with varying numbers of kernel stack NSMs, each with 2 vCPUs.

Table 4 shows the throughput scaling results. Throughput for send is already 85.1Gbps with 2 vCPUs (recall Figure 18), and adding NSMs does not improve it beyond 94.2Gbps. Throughput for receive, on the other hand, shows almost linear scalability. Performance of short connections also exhibits near linear scalability: one NSM provides 131.6Krps, 2 NSMs 260.4Krps, and 4 NSMs 520.1Krps, which is 4x better. The results indicate that NetKernel's design is highly scalable; reflecting on the results in §7.4, it is the network stack's own scalability that limits multicore performance.

7.6 Isolation

Isolation is important in public clouds to ensure co-located tenants do not interfere with each other. We conduct an experiment to verify NetKernel's isolation guarantee. As discussed in §4.4, CoreEngine uses round-robin to poll each VM's NK device. In addition, for this experiment we implement token buckets in CoreEngine to limit the bandwidth of each VM, taking into account the varying message sizes. There are 3 VMs now: VM1 is rate limited at 1Gbps, VM2 at 500Mbps, and VM3 has unlimited bandwidth. They arrive and depart at different times. They are colocated on the same host running a kernel stack NSM using 1 vCPU. The NSM is given a 10G VF for simplicity of showing work conservation.

Figure 21 shows the time series of each VM's throughput, measured by our epoll servers at 100ms intervals. VM1 joins the system at time 0 and leaves at 25s. VM2 comes later at 4.5s and leaves at 21s. VM3 joins last and stays until 30s. Observe that NetKernel throttles VM1's and VM2's throughput at their respective limits correctly despite the dynamics. VM3 is also able to use all the remaining capacity of the 10G NSM: it obtains 9Gbps after VM2 leaves and 10Gbps after VM1 leaves at 25s. Therefore, NetKernel is able to achieve the same isolation as in today's public clouds with bandwidth caps. More complex isolation mechanisms can also be applied in NetKernel, which is beyond the scope of this paper.

Figure 21: VM1 is capped at 1Gbps, VM2 at 500Mbps, and VM3 uncapped. All VMs use the same kernel stack NSM. The NSM is assigned 10Gbps bandwidth. NetKernel isolates VM1 and VM2 successfully while allowing VM3 to obtain the remaining capacity.
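The token buckets themselves are not spelled out in the paper; a minimal sketch of per-VM rate limiting of this kind, charging each NQE by its message size as described above, could look like the following. All names, units, and the refill policy are illustrative.

    /* Illustrative per-VM token bucket, charged per NQE by message size.
     * rate_Bps and burst_bytes would be derived from the VM's bandwidth cap. */
    #include <stdbool.h>
    #include <stdint.h>

    struct token_bucket {
        double tokens;        /* available bytes */
        double rate_Bps;      /* refill rate in bytes per second */
        double burst_bytes;   /* bucket depth */
        uint64_t last_ns;     /* timestamp of last refill */
    };

    /* Returns true if an NQE carrying msg_bytes may be forwarded now. */
    static bool tb_admit(struct token_bucket *tb, uint64_t now_ns, uint32_t msg_bytes) {
        double elapsed = (now_ns - tb->last_ns) / 1e9;
        tb->tokens += elapsed * tb->rate_Bps;        /* refill since last check */
        if (tb->tokens > tb->burst_bytes)
            tb->tokens = tb->burst_bytes;
        tb->last_ns = now_ns;

        if (tb->tokens >= msg_bytes) {
            tb->tokens -= msg_bytes;                 /* charge by actual message size */
            return true;
        }
        return false;                                /* leave the NQE queued until refill */
    }

Because admission is charged in bytes rather than per NQE, VMs sending large and small messages are throttled consistently, which matches the behavior shown in Figure 21.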

7.7 Latency

One may wonder whether NetKernel, with the NQE transmission, adds delay to TCP processing, especially in handling short connections. Table 5 shows the latency statistics when we run ab to generate 1K concurrent connections to the epoll server with 64B messages. A total of 5 million requests are used. NetKernel achieves virtually the same latency as Baseline. Even for the mTCP NSM, NetKernel preserves its low latency due to mTCP's much simpler TCP stack processing and various optimizations [33]. The standard deviation of mTCP latency is also much smaller, implying that NetKernel itself provides stable performance to the network stacks.

                        Min   Mean   Stddev   Median   Max
Baseline                0     16     105.6    2        7019
NetKernel               0     16     105.9    2        7019
NetKernel, mTCP NSM     3     4      0.23     4        11
Table 5: Distribution of response times (ms) for 64B messages with 5 million requests and 1K concurrency.

7.8 Overhead

We finally investigate NetKernel's CPU overhead. To quantify it, we use the epoll servers at the VM side, and run clients from a different machine with fixed throughput or requests per second for both NetKernel and Baseline with the kernel TCP stack. We disable all unnecessary background system services in both the VM and the NSM, and ensure that the CPU usage is almost zero without running our servers. During the experiments, we measure the total number of cycles spent by the VM in Baseline, and the total cycles spent by the VM and the NSM together in NetKernel. We then report NetKernel's CPU usage normalized over Baseline's for the same performance level in Tables 6 and 7.

Throughput              20Gbps   40Gbps   60Gbps   80Gbps   100Gbps
Normalized CPU usage    1.14     1.28     1.42     1.56     1.70
Table 6: Overhead for throughput. The NSM runs the Linux kernel TCP stack. We use 8 TCP streams with 8KB messages. NetKernel's CPU usage is normalized over that of Baseline.

Requests per second (rps)   100K   200K   300K   400K   500K
Normalized CPU usage        1.06   1.05   1.08   1.08   1.09
Table 7: Overhead for short TCP connections. The NSM runs the Linux kernel TCP stack. We use 64B messages with a concurrency of 100.

We can see that to achieve the same throughput, NetKernel incurs relatively high overhead, especially as throughput increases. This is due to the extra memory copy from the hugepages to the NSM. This overhead can be optimized away by implementing zerocopy between the hugepages and the NSM, which we are currently working on.
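In other words, the normalized CPU usage reported in Tables 6 and 7 is the ratio of cycles spent under NetKernel to cycles spent under Baseline at the same performance level:

    \text{Normalized CPU usage} = \frac{C^{\text{NetKernel}}_{\text{VM}} + C_{\text{NSM}}}{C^{\text{Baseline}}_{\text{VM}}}

where C denotes the total cycles measured during a run. For example, at 100Gbps the ratio is 1.70: the VM and the NSM together spend 70% more cycles than the Baseline VM alone for the same throughput.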

Table 7 shows NetKernel's overhead with short TCP connections. Observe that the overhead ranges from 5% to 9% in all cases and is fairly mild. As the messages are only 64B here, the results verify that the NQE transmission overhead of the NK devices is small.

Lastly, throughout all experiments of our evaluation we dedicate one core to CoreEngine, which is another overhead. As we focus on showing the feasibility and potential of NetKernel in this paper, we resort to software NQE switching, which accounts for the polling overhead. It is possible to explore hardware offloading, for example using FPGAs, to attack this overhead, just like offloading vSwitch processing to SmartNICs [23, 24]. This way CoreEngine does not consume CPU for the majority of the NQEs: only the first NQE of a new connection needs to be handled in CPU (to direct it to a proper NSM as in §4.3).

To quickly recap, the current NetKernel implementation incurs CPU overheads, especially for the extra data copy and CoreEngine. We believe, however, that they are not inevitable when separating the network stack from the guest OS. They can be largely mitigated using known implementation techniques, which is left as future work. In addition, as shown in §6.1, multiplexing can be used in NetKernel's current implementation to actually save CPU compared to dedicating cores to individual VMs.

8 DISCUSSION

NetKernel marks a significant departure from the way networking is provided to VMs nowadays. One may have the following concerns, which we address now.

How about security? One may have security concerns with NetKernel's approach of using the provider's NSM to handle tenant traffic. The security impact is minimal because most security protocols, such as HTTPS/TLS, work at the application layer and continue to work as usual with NetKernel. One exception is IPSec. Due to the certificate exchange issue, IPSec does not work directly in our design. However, in practice IPSec is usually implemented at dedicated gateways instead of end-hosts [56]. Thus we believe the impact is not serious.

How about fate-sharing? Making the network stack a service introduces some additional fate-sharing, say when VMs share the same NSM. We believe this is not serious because cloud customers already have fate-sharing with the vSwitch, hypervisor, and the complete virtual infrastructure. The efficiency, performance, and convenience benefits of our approach as demonstrated before outweigh the marginal increase of fate-sharing; the success of cloud computing over these years is another strong testament to this tradeoff.

How can I use my own networking tools now? Due to the removal of the vNIC and the redirection of traffic away from the VM's own TCP stack, some networking tools like netfilter are affected. Though our current design does not address them, they may be supported by adding additional callback functions to the network stack in the NSM. When the NSM serves multiple VMs, it then becomes challenging to apply netfilter just for packets of a specific VM. We argue that this is acceptable since most tenants wish to focus on their applications instead of tuning a network stack. NetKernel does not aim to completely replace the current architecture. Tenants may still use VMs without NetKernel if they wish to gain maximum flexibility on the network stack implementation.

Can hardware offloading be supported? Providers are exploring how to offload certain networking tasks, such as congestion control, to hardware like FPGAs [15] or programmable NICs [46]. NetKernel is not at odds with this trend. It actually provides better support for hardware offloading compared to the legacy architecture: the provider can fully control how the NSM utilizes the underlying hardware capabilities. NetKernel can also exploit hardware acceleration for NQE switching as discussed in §7.8.

9 RELATED WORK

We survey several lines of closely related work.

There has been emerging interest in providing proper congestion control abstractions in our community. CCP [47], for example, puts forth a common API to expose various congestion control signals to congestion control algorithms independent of the data path. HotCocoa proposes abstractions for offloading congestion control to hardware [15]. They focus on congestion control, while NetKernel focuses on the stack architecture. They are thus orthogonal to our work and can be deployed as NSMs in NetKernel to reduce the effort of porting different congestion control algorithms.

Some work has looked at how to enforce a uniform congestion control logic across tenants without modifying the VMs [20, 31]. The differences between this line of work and ours are clear: these approaches require packets to go through two different stacks, one in the guest kernel and another in the hypervisor, leading to performance and efficiency loss. NetKernel does not suffer from these problems. In addition, they also focus on congestion control while our work targets the entire network stack.

In a broader sense, our work is also related to the debate on how an OS should be architected in general, and microkernels [26] and unikernels [22, 42] in particular. Microkernels take a minimalist approach and only implement address space management, thread management, and IPC in the kernel. Other tasks such as file systems and I/O are done in userspace [58].

Unikernels [22, 42] aim to provide various OS services as libraries that can be flexibly combined to construct an OS. Different from these works, which require radical changes to the OS, we seek to flexibly provide the network stack as a service without re-writing the existing guest kernel or the hypervisor. In other words, our approach brings some key benefits of microkernels and unikernels without a complete overhaul of existing virtualization technology.

Our work is also in line with the vision presented in the position paper [14]. We provide the complete design, implementation, and evaluation of a working system, in addition to several new use cases compared to [14].

Lastly, there are many novel network stack designs that improve performance. The kernel TCP/IP stack continues to witness optimization efforts in various aspects [40, 49, 59]. On the other hand, since mTCP [33], userspace stacks based on high performance packet I/O have been quickly gaining momentum [1, 8, 38, 43, 44, 51, 60]. Beyond the transport layer, novel flow scheduling [16] and end-host based load balancing schemes [30, 35] have been developed to reduce flow completion times. These proposals address specific problems of the stack, and can potentially be deployed as network stack modules in NetKernel. This paper takes on a broader and more fundamental issue: how can we properly re-factor the VM network stack, so that different designs can be easily deployed and operating them is more efficient?

10 CONCLUSION

We have presented NetKernel, a system that decouples the network stack from the guest, thereby making it part of the virtualized infrastructure in the cloud. NetKernel improves network management efficiency for the operator, and provides deployment and performance gains for users. We experimentally demonstrated new use cases enabled by NetKernel that are otherwise difficult to realize in the current architecture. Through testbed evaluation with 100G NICs, we showed that NetKernel achieves the same performance and isolation as today's cloud.

NetKernel opens up new design space with many possibilities. As future work we are implementing zerocopy to the NSM, and exploring using the hardware queues of a SmartNIC to offload CoreEngine and eliminate its CPU overhead as discussed in §7.8.

This work does not raise any ethical issues.


REFERENCES
[1] http://www.seastar-project.org/.
[2] Amazon EC2 Container Service. https://aws.amazon.com/ecs/details/.
[3] Azure Container Service. https://azure.microsoft.com/en-us/pricing/details/container-service/.
[4] Docker community passes two billion pulls. https://blog.docker.com/2016/02/docker-hub-two-billion-pulls/.
[5] F-Stack: A high performance userspace stack based on FreeBSD 11.0 stable. http://www.f-stack.org/.
[6] Google container engine. https://cloud.google.com/container-engine/pricing.
[7] Intel Programmable Acceleration Card with Intel Arria 10 GX FPGA. https://www.intel.com/content/www/us/en/programmable/products/boards_and_kits/dev-kits/altera/acceleration-card-arria-10-gx.html.
[8] Introduction to OpenOnload-Building Application Transparency and Protocol Conformance into Application Acceleration Middleware. http://www.moderntech.com.hk/sites/default/files/whitepaper/V10_Solarflare_OpenOnload_IntroPaper.pdf.
[9] Mellanox Smart Network Adaptors. http://www.mellanox.com/page/programmable_network_adapters?mtag=programmable_adapter_cards.
[10] Netronome. https://www.netronome.com/.
[11] https://github.com/eunyoung14/mtcp/tree/2385bf3a0e47428fa21e87e341480b6f232985bd, March 2018.
[12] The Open Group Base Specifications Issue 7, 2018 edition. IEEE Std 1003.1-2017. http://pubs.opengroup.org/onlinepubs/9699919799/functions/contents.html, 2018.
[13] M. Alizadeh, A. Greenberg, D. A. Maltz, J. Padhye, P. Patel, B. Prabhakar, S. Sengupta, and M. Sridharan. Data center TCP (DCTCP). In Proc. ACM SIGCOMM, 2010.
[14] Anonymous. Details omitted for double-blind reviewing. 2017.
[15] M. T. Arashloo, M. Ghobadi, J. Rexford, and D. Walker. HotCocoa: Hardware Congestion Control Abstractions. In Proc. ACM HotNets, 2017.
[16] W. Bai, L. Chen, K. Chen, D. Han, C. Tian, and H. Wang. PIAS: Practical information-agnostic flow scheduling for data center networks. In Proc. USENIX NSDI, 2015.
[17] H. Ballani, P. Costa, T. Karagiannis, and A. Rowstron. Towards predictable datacenter networks. In Proc. ACM SIGCOMM, 2011.
[18] A. Belay, G. Prekas, A. Klimovic, S. Grossman, C. Kozyrakis, and E. Bugnion. IX: A Protected Dataplane Operating System for High Throughput and Low Latency. In Proc. USENIX OSDI, 2014.
[19] N. Cardwell, Y. Cheng, C. S. Gunn, S. H. Yeganeh, and V. Jacobson. BBR: Congestion-Based Congestion Control. Commun. ACM, 60(2):58-66, February 2017.
[20] B. Cronkite-Ratcliff, A. Bergman, S. Vargaftik, M. Ravi, N. McKeown, I. Abraham, and I. Keslassy. Virtualized Congestion Control. In Proc. ACM SIGCOMM, 2016.
[21] M. Dalton, D. Schultz, J. Adriaens, A. Arefin, A. Gupta, B. Fahs, D. Rubinstein, E. C. Zermeno, E. Rubow, J. A. Docauer, J. Alpert, J. Ai, J. Olson, K. DeCabooter, M. de Kruijf, N. Hua, N. Lewis, N. Kasinadhuni, R. Crepaldi, S. Krishnan, S. Venkata, Y. Richter, U. Naik, and A. Vahdat. Andromeda: Performance, Isolation, and Velocity at Scale in Cloud Network Virtualization. In Proc. USENIX NSDI, 2018.
[22] D. R. Engler, M. F. Kaashoek, and J. O'Toole, Jr. Exokernel: An Operating System Architecture for Application-level Resource Management. In Proc. ACM SOSP, 1995.
[23] D. Firestone. VFP: A Virtual Switch Platform for Host SDN in the Public Cloud. In Proc. USENIX NSDI, 2017.
[24] D. Firestone, A. Putnam, S. Mundkur, D. Chiou, A. Dabagh, M. Andrewartha, H. Angepat, V. Bhanu, A. Caulfield, E. Chung, H. K. Chandrappa, S. Chaturmohta, M. Humphrey, J. Lavier, N. Lam, F. Liu, K. Ovtcharov, J. Padhye, G. Popuri, S. Raindel, T. Sapre, M. Shaw, G. Silva, M. Sivakumar, N. Srivastava, A. Verma, Q. Zuhair, D. Bansal, D. Burger, K. Vaid, D. A. Maltz, and A. Greenberg. Azure Accelerated Networking: SmartNICs in the Public Cloud. In Proc. USENIX NSDI, 2018.
[25] P. X. Gao, A. Narayan, G. Kumar, R. Agarwal, S. Ratnasamy, and S. Shenker. pHost: Distributed Near-optimal Datacenter Transport Over Commodity Network Fabric. In Proc. ACM CoNEXT, 2015.
[26] D. B. Golub, D. P. Julin, R. F. Rashid, R. P. Draves, R. W. Dean, A. Forin, J. Barrera, H. Tokuda, G. Malan, and D. Bohman. Microkernel operating system architecture and Mach. In Proc. USENIX Workshop on Micro-Kernels and Other Kernel Architectures, 1992.
[27] A. Greenberg. SDN in the Cloud. Keynote, ACM SIGCOMM 2015.
[28] C. Guo, G. Lu, H. J. Wang, S. Yang, C. Kong, P. Sun, W. Wu, and Y. Zhang. SecondNet: A data center network virtualization architecture with bandwidth guarantees. In Proc. ACM CoNEXT, 2010.
[29] C. Guo, L. Yuan, D. Xiang, Y. Dang, R. Huang, D. Maltz, Z. Liu, V. Wang, B. Pang, H. Chen, Z.-W. Lin, and V. Kurien. Pingmesh: A Large-Scale System for Data Center Network Latency Measurement and Analysis. In Proc. ACM SIGCOMM, 2015.
[30] K. He, E. Rozner, K. Agarwal, W. Felter, J. Carter, and A. Akella. Presto: Edge-based Load Balancing for Fast Datacenter Networks. In Proc. ACM SIGCOMM, 2015.
[31] K. He, E. Rozner, K. Agarwal, Y. J. Gu, W. Felter, J. Carter, and A. Akella. AC/DC TCP: Virtual Congestion Control Enforcement for Datacenter Networks. In Proc. ACM SIGCOMM, 2016.
[32] J. Hwang, K. K. Ramakrishnan, and T. Wood. NetVM: High performance and flexible networking using virtualization on commodity platforms. In Proc. USENIX NSDI, 2014.
[33] E. Jeong, S. Wood, M. Jamshed, H. Jeong, S. Ihm, D. Han, and K. Park. mTCP: A Highly Scalable User-level TCP Stack for Multicore Systems. In Proc. USENIX NSDI, 2014.
[34] V. Jeyakumar, M. Alizadeh, D. Mazieres, B. Prabhakar, C. Kim, and A. Greenberg. EyeQ: Practical network performance isolation at the edge. In Proc. USENIX NSDI, 2013.
[35] N. Katta, M. Hira, A. Ghag, C. Kim, I. Keslassy, and J. Rexford. CLOVE: How I Learned to Stop Worrying About the Core and Love the Edge. In Proc. ACM HotNets, 2016.
[36] J. Khalid, E. Rozner, W. Felter, C. Xu, K. Rajamani, A. Ferreira, and A. Akella. Iron: Isolating Network-based CPU in Container Environments. In Proc. USENIX NSDI, 2018.
[37] A. Khandelwal, R. Agarwal, and I. Stoica. Confluo: Distributed Monitoring and Diagnosis Stack for High-speed Networks. In Proc. USENIX NSDI, 2019.
[38] D. Kim, T. Yu, H. Liu, Y. Zhu, J. Padhye, S. Raindel, C. Guo, V. Sekar, and S. Seshan. FreeFlow: Software-based Virtual RDMA Networking for Containerized Clouds. In Proc. USENIX NSDI, 2019.
[39] K. LaCurts, J. C. Mogul, H. Balakrishnan, and Y. Turner. Cicada: Introducing predictive guarantees for cloud networks. In Proc. USENIX HotCloud, 2014.
[40] X. Lin, Y. Chen, X. Li, J. Mao, J. He, W. Xu, and Y. Shi. Scalable Kernel TCP Design and Implementation for Short-Lived Connections. In Proc. ASPLOS, 2016.
[41] Y. Lu, G. Chen, B. Li, K. Tan, Y. Xiong, P. Cheng, J. Zhang, E. Chen, and T. Moscibroda. Multi-Path Transport for RDMA in Datacenters. In Proc. USENIX NSDI, 2018.
[42] A. Madhavapeddy, R. Mortier, C. Rotsos, D. Scott, B. Singh, T. Gazagnaire, S. Smith, S. Hand, and J. Crowcroft. Unikernels: Library operating systems for the cloud. In Proc. ASPLOS, 2013.
[43] I. Marinos, R. N. Watson, and M. Handley. Network stack specialization for performance. In Proc. ACM SIGCOMM, 2014.
[44] R. Mittal, V. T. Lam, N. Dukkipati, E. Blem, H. Wassel, M. Ghobadi, A. Vahdat, Y. Wang, D. Wetherall, and D. Zats. TIMELY: RTT-based Congestion Control for the Datacenter. In Proc. ACM SIGCOMM, 2015.
[45] M. Moshref, M. Yu, R. Govindan, and A. Vahdat. Trumpet: Timely and precise triggers in data centers. In Proc. ACM SIGCOMM, 2016.
[46] A. Narayan, F. Cangialosi, P. Goyal, S. Narayana, M. Alizadeh, and H. Balakrishnan. The Case for Moving Congestion Control Out of the Datapath. In Proc. ACM HotNets, 2017.
[47] A. Narayan, F. Cangialosi, D. Raghavan, P. Goyal, S. Narayana, R. Mittal, M. Alizadeh, and H. Balakrishnan. Restructuring Endpoint Congestion Control. In Proc. ACM SIGCOMM, 2018.
[48] A. Panda, S. Han, K. Jang, M. Walls, S. Ratnasamy, and S. Shenker. NetBricks: Taking the V out of NFV. In Proc. USENIX OSDI, 2016.
[49] S. Pathak and V. S. Pai. ModNet: A Modular Approach to Network Stack Extension. In Proc. USENIX NSDI, 2015.
[50] J. Perry, A. Ousterhout, H. Balakrishnan, D. Shah, and H. Fugal. Fastpass: A Centralized "Zero-Queue" Datacenter Network. In Proc. ACM SIGCOMM, 2014.
[51] S. Peter, J. Li, I. Zhang, D. R. K. Ports, D. Woos, A. Krishnamurthy, T. Anderson, and T. Roscoe. Arrakis: The Operating System is the Control Plane. In Proc. USENIX OSDI, 2014.
[52] L. Popa, G. Kumar, M. Chowdhury, A. Krishnamurthy, S. Ratnasamy, and I. Stoica. FairCloud: Sharing the network in cloud computing. In Proc. ACM SIGCOMM, 2012.
[53] L. Popa, P. Yalagandula, S. Banerjee, J. C. Mogul, Y. Turner, and J. R. Santos. ElasticSwitch: Practical Work-conserving Bandwidth Guarantees for Cloud Computing. In Proc. ACM SIGCOMM, 2013.
[54] A. Saeed, N. Dukkipati, V. Valancius, V. The Lam, C. Contavalli, and A. Vahdat. Carousel: Scalable Traffic Shaping at End Hosts. In Proc. ACM SIGCOMM, 2017.
[55] A. Shieh, S. Kandula, A. Greenberg, C. Kim, and B. Saha. Sharing the data center network. In Proc. USENIX NSDI, 2011.
[56] J. Son, Y. Xiong, K. Tan, P. Wang, Z. Gan, and S. Moon. Protego: Cloud-Scale Multitenant IPsec Gateway. In Proc. USENIX ATC, 2017.
[57] B. Stephens, A. Singhvi, A. Akella, and M. Swift. Titan: Fair Packet Scheduling for Commodity Multiqueue NICs. In Proc. USENIX ATC, 2017.
[58] B. K. R. Vangoor, V. Tarasov, and E. Zadok. To FUSE or Not to FUSE: Performance of User-Space File Systems. In Proc. USENIX FAST, 2017.
[59] K. Yasukata, M. Honda, D. Santry, and L. Eggert. StackMap: Low-Latency Networking with the OS Stack and Dedicated NICs. In Proc. USENIX ATC, 2016.
[60] Y. Zhu, H. Eran, D. Firestone, C. Guo, M. Lipshteyn, Y. Liron, J. Padhye, S. Raindel, M. H. Yahia, and M. Zhang. Congestion Control for Large-Scale RDMA Deployments. In Proc. ACM SIGCOMM, 2015.
[61] Y. Zhu, N. Kang, J. Cao, A. Greenberg, G. Lu, R. Mahajan, D. Maltz, L. Yuan, M. Zhang, B. Zhao, and H. Zheng. Packet-Level Telemetry in Large Datacenter Networks. In Proc. ACM SIGCOMM, 2015.
[62] D. Zhuo, K. Zhang, Y. Zhu, H. H. Liu, M. Rockett, A. Krishnamurthy, and T. Anderson. Slim: OS Kernel Support for a Low-Overhead Container Overlay Network. In Proc. USENIX NSDI, 2019.
