Fast and Scalable VMM Live Upgrade in Large Cloud Infrastructure

Xiantao Zhang (Alibaba Group) [email protected]; Xiao Zheng (Alibaba Group) [email protected]; Zhi Wang (Florida State University) [email protected]; Qi Li (Tsinghua University) [email protected]; Junkang Fu (Alibaba Group) [email protected]; Yang Zhang (Alibaba Group) [email protected]; Yibin Shen (Alibaba Group) [email protected]

Abstract
High availability is the most important and challenging problem for cloud providers. However, the virtual machine monitor (VMM), a crucial component of the cloud infrastructure, has to be frequently updated and restarted to add security patches and new features, undermining high availability. There are two existing live update methods to improve the cloud availability: kernel live patching and Virtual Machine (VM) live migration. However, they both have serious drawbacks that impair their usefulness in the large cloud infrastructure: kernel live patching cannot handle complex changes (e.g., changes to persistent data structures); and VM live migration may incur unacceptably long delays when migrating millions of VMs in the whole cloud, for example, to deploy urgent security patches.

In this paper, we propose a new method, VMM live upgrade, that can promptly upgrade the whole VMM (KVM & QEMU) without interrupting customer VMs. Timely upgrade of the VMM is essential to the cloud because it is both the main attack surface of malicious VMs and the component to integrate new features. We have built a VMM live upgrade system called Orthus. Orthus features three key techniques: dual KVM, VM grafting, and device handover. Together, they enable the cloud provider to load an upgraded KVM instance while the original one is running and "cut-and-paste" the VM to this new instance. In addition, Orthus can seamlessly hand over passthrough devices to the new KVM instance without losing any ongoing (DMA) operations. Our evaluation shows that Orthus can reduce the total migration time and downtime by more than 99% and 90%, respectively. We have deployed Orthus in one of the largest cloud infrastructures for a long time. It has become the most effective and indispensable tool in our daily maintenance of hundreds of thousands of servers and millions of VMs.

CCS Concepts • Security and privacy → Virtualization and security; • Computer systems organization → Availability.

Keywords virtualization; live upgrade; cloud infrastructure

ACM Reference Format: Xiantao Zhang, Xiao Zheng, Zhi Wang, Qi Li, Junkang Fu, Yang Zhang, and Yibin Shen. 2019. Fast and Scalable VMM Live Upgrade in Large Cloud Infrastructure. In ASPLOS '19: ACM International Conference on Architectural Support for Programming Languages and Operating Systems. ACM, New York, NY, USA, 13 pages. https://doi.org/10.1145/1122445.1122456

1 Introduction
High availability is the most important but challenging problem for cloud providers. It requires the service to be accessible anywhere and anytime without perceivable downtime. However, there are two important issues that can undermine high availability: new features and security. Specifically, a cloud usually provides a range of services, such as the storage, database, and program language services, in addition to
running customer VMs. Features like the cloud storage need to be integrated into the core cloud component, the VMM. New features will only take effect after the VMM is restarted, thus interrupting the service. Meanwhile, security demands the system software to be frequently updated to patch newly discovered vulnerabilities. All the major Linux distributions push security updates weekly, if not daily. Cloud providers have to keep their systems patched or risk being compromised.

Even though there are efforts to proactively discover and fix vulnerabilities in the VMM, vulnerabilities are inevitable given the complexity of the VMM. For example, KVM, the VMM used by many major public cloud providers, is a type-2 VMM running in the host kernel (Linux), which has more than 16 million lines of code. Therefore, it is critical to keep the VMM updated to minimize the window of vulnerability. In summary, both new features and security demand a method to quickly update the cloud system with minimal downtime.

Limitations of the existing methods: currently, there are two live update methods available to cloud providers: kernel live patching and VM live migration. They are valuable in improving the cloud availability but have serious drawbacks. Kernel live patching applies patches to the running kernel without rebooting the system [7, 27, 33]. However, live patching cannot handle complex updates that, for example, change the persistent data structures or add new features. It is most suitable for simple security fixes, such as checking buffer sizes. We instead aim at creating an update mechanism that can not only support simple security checks but also add new features and apply complex security patches (e.g., the Retpoline fix for Spectre that requires recompiling the code [39]).

VM live migration is another valuable tool used by cloud providers to facilitate software update [10, 12]. It iteratively copies the VM states (e.g., memory) from its current server to an idle backup server while the VM is still running. At the final iteration, it completely stops the VM, copies the last changed states, and restarts the VM on the backup server. This allows the cloud provider to update/replace everything on the source server, including the host kernel, the VMM, failed hardware devices, etc. The VM can be migrated back to the source server after the update if necessary. Nevertheless, VM live migration has three limitations that severely limit its usefulness in large cloud datacenters:

• First, it requires spare servers to temporarily hold the migrating VMs. These servers must be equally powerful in order not to degrade the VM performance. If a large number of VMs have to be migrated simultaneously, it brings significant pressure to the inventory control for spare servers. In large clouds like ours, it is common for hundreds of thousands of or even millions of VMs to be migrated, for example, to fix urgent security vulnerabilities. In addition, large-scale VM migration can cause severe network congestion.
• Second, VM live migration will incur temporary performance downgrade and service downtime [40]. The more memory a VM has, the longer the migration takes. High-end cloud customers may use hundreds of GBs of memory per VM to run big data, massive streaming, and other applications. These applications are particularly sensitive to performance downgrade and downtime. Such customers often closely monitor the real-time system performance. VM live migration is simply not an option for those VMs.
• Last, VM live migration cannot support passthrough devices. Device passthrough grants a VM direct access to the hardware devices. Passthrough devices are widely deployed in the cloud. For example, network adapter passthrough is often used to speed up network traffic, and many heterogeneous computing services rely on GPU or FPGA passthrough. Existing VM live migration systems do not support device passthrough.

VMM live upgrade, a new method: in this paper, we propose a third method complementary to, but more useful than, kernel live patching and VM live migration – VMM live upgrade, a method to replace the live VMM. Unlike kernel live patching, VMM live upgrade replaces the whole running VMM. Consequently, it can apply more complex fixes to the VMM, for example, to support a new cloud storage system. Because VMM live upgrade is conducted on the same physical server, it does not require spare servers to temporarily hold VMs or introduce significant network traffic; its performance and downtime are significantly better than VM live migration; and it can seamlessly support passthrough devices. Table 1 compares these three methods.

Methods              | Application                                  | Advantages                                                          | Limitations
Kernel live patching | Simple security fixes                        | No downtime                                                         | Cannot handle complex fixes; maintenance headache
VMM live upgrade     | New cloud features; complex fixes in the VMM | Short downtime; scalable, no migration traffic; passthrough device support | VMM-only update (justified)
VM live migration    | Whole system upgrade                         | Update everything on the server                                     | Not scalable, needs spare servers; migration traffic consumes network bandwidth; performance downgrade and long downtime; cannot support passthrough devices
Table 1. Comparison of three live update methods

VMM live upgrade replaces the VMM only. This is consistent with the threat model in the cloud, where the host kernel is well isolated from the untrusted VMs through network segregation, device passthrough, and driver domains; and the main attack surface lies in the interaction between the VMM and VMs. This is reflected in Table 2, which lists all the known vulnerabilities related to KVM/QEMU [3, 34]. These vulnerabilities are exposed to the VMs and pose the most imminent threat to the cloud infrastructure.
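To make the contrast with kernel live patching concrete, below is a minimal sketch of a livepatch module using the upstream Linux livepatch API (linux/livepatch.h). The patched function and the fix are purely illustrative (there is no real handle_request() being patched here); the point is that live patching swaps individual functions at run time, which works well for a localized fix such as a bounds check but cannot express a changed structure layout or a new feature.

```c
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/errno.h>
#include <linux/livepatch.h>

/* Illustrative replacement: same signature as a (hypothetical)
 * vulnerable function, with an added bounds check. */
static int lp_handle_request(void *buf, unsigned long len)
{
	if (len > 4096)          /* the one-line security fix */
		return -EINVAL;
	/* ... original logic would follow here ... */
	return 0;
}

static struct klp_func funcs[] = {
	{
		.old_name = "handle_request",   /* function to replace (hypothetical) */
		.new_func = lp_handle_request,
	},
	{ }
};

static struct klp_object objs[] = {
	{
		/* .name = NULL patches vmlinux; set to "kvm" to patch a module */
		.funcs = funcs,
	},
	{ }
};

static struct klp_patch patch = {
	.mod  = THIS_MODULE,
	.objs = objs,
};

static int __init lp_init(void)
{
	return klp_enable_patch(&patch);   /* kernels >= 5.1 */
}
static void __exit lp_exit(void) { }

module_init(lp_init);
module_exit(lp_exit);
MODULE_LICENSE("GPL");
MODULE_INFO(livepatch, "Y");
```

Anything beyond this granularity — a changed persistent data structure, a feature spanning many files — is exactly what the livepatch model cannot express, which is what motivates replacing the whole VMM instead.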

Other Core Emu VMX PIC Other Arch Kernel QEMU Total the same hardware configurations. Each physical server can 17 13 14 7 22 6 2 14 95 serve up to 100 VMs). It has become the most indispensable Table 2. Distribution of reported vulnerabilities in KVM system in our daily operation. We rely on it to deploy security patches and new features. The other two live update methods Out of these 95 vulnerabilities, only two of them (2.1%) are used much less frequently. Particularly, we use VM live lie in the main kernel; fourteen lie in QEMU, the user-space migration only when there is a hard requirement to replace part of the VMM; and the rest 79 lie in KVM, the kernel part the node, for example, due to hardware failure. Orthus is very of the VMM (column 1 to 6). The distribution of those vul- scalable as it can be applied to from a single VM to hundreds nerabilities highlights the troublesome components in the of VMs in a physical server simultaneously. In summary, this VMM: KVM has a layered structure with a common core and paper makes the following contribution: the architecture-specific layers. The core has 17 reported vulnerabilities (column 1) and the x86 layer alone has 56 • We propose a new method called VMM live upgrade vulnerabilities (column 2 to 5). These vulnerabilities include that can live-update the whole VMM (KVM/QEMU) information leaks, race conditions, buffer overflows, etc. In without disrupting the running VMs. VMM live up- addition, most of the fourteen vulnerabilities in QEMU ap- grade is complementary to, but more useful than, the pear in the device emulation. Many of these vulnerabilities existing live update methods. are not fixable by kernel live patching: we manually exam- • We have built Orthus, a VMM live upgrade system, ined all the eight KVM-related vulnerabilities reported in that can effectively “cut-and-paste” a VM from its run- 2018 and found that only three of them can be supported ning VMM to the upgraded VMM. In addition, Orthus by kernel live patching. The rest five all incur changes to can seamlessly transit passthrough devices to the new persistent data structures and thus cannot be supported by VMM without involving any manual efforts or losing kernel live patching. Nevertheless, VMM live upgrade can ongoing operations. easily fix all the vulnerabilities in both KVM and QEMU, thus • We have deployed Orthus in one of the largest public eliminating most of the threats to the cloud infrastructure. cloud infrastructures and demonstrated its effective- Our focus on the VMM is also justified by the fact that new ness in promptly deploying urgent security patches cloud features are integrated mostly through the VMM; the (e.g., Spectre) and new features. host kernel is rarely involved. For example, new GPU/FPGA devices are often passed through to VMs directly. The host The rest of the paper is organized as follows. We first fur- kernel does not need to drive these devices. ther motivate the problem by measuring the negative impact Our system: we have designed, built, and deployed a of VM live migration and discussing its limitations in Sec- VMM live upgrade system called Orthus.1 Orthus has three tion 2. We then present the design and evaluation of Orthus key techniques: dual KVM, VM grafting, and device handover. in Section 3 and 4, respectively. Section 5 compares Orthus Dual KVM runs two instances of the KVM kernel module to the related work, and Section 6 discusses the potential side-by-side. 
One is currently running the VMs and the other improvements to Orthus. We conclude the paper in Section 7. is the upgraded version. Instead of migrating the VM be- tween these two instances, Orthus directly “grafts” the VM 2 Background and Motivation from the original VMM to the upgraded VMM, avoiding the lengthy memory copies and long downtime in the VM live In this section, we describe how kernel live patching and VM migration. Lastly, device handover seamlessly transfers the live migration are used in cloud datacenters, in particular, ownership of passthrough devices to the new VMM with- their limitations that raise the need for VMM live upgrade. out losing any ongoing (DMA) operations. With these three High availability is the most important goal of public techniques, Orthus can quickly upgrade the VMM to apply clouds. Many decisions are made to improve availability. critical security patches and integrate new features. In particular, server hardware replacement and major kernel Even though Orthus is designed for KVM, we believe a upgrades are rather rare; security patches and feature up- similar approach can be applied to type-1 like grades to the VMM are far more frequent in order to improve because Xen uses the same hardware features as KVM security and enable new features. Most of these updates can- (i.e., hardware virtualization support) and relies on a privi- not be applied by kernel live patching because they often leged domain for device drivers. Though, the implementation involve complex changes in multiple files and changes to per- could be more complex. sistent data structures. Frequent applications of kernel live Orthus has been deployed in the Alibaba Cloud, one of patching can also make the kernel/VMM difficult to main- the largest public cloud infrastructures, serving thousands tain. On the other hand, VM live migration is too costly for of clusters (each cluster consists of hundreds of servers with regular VMM updates, especially when a large number of VMMs need to be updated, for example, to apply an urgent 1Orthus is a two-headed dog guarding the cattle in the Greek mythology. security patch. 3 ASPLOS ’19, April 13–17, 2019, Providence, RI Zhang and Zheng, et al.

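The migration costs quantified in Section 2.2 stem from the iterative pre-copy scheme. As a concrete illustration (this is not QEMU's actual migration code), a pre-copy loop built on KVM's dirty-page tracking interface looks roughly like the following; the helper functions and thresholds are assumptions of the sketch.

```c
#include <linux/kvm.h>
#include <stdlib.h>
#include <sys/ioctl.h>

#define MAX_ROUNDS 30
#define THRESHOLD  256          /* "few enough" dirty pages to stop iterating */

/* Illustrative only: one memory slot, helpers assumed to exist. */
extern int vm_fd;
extern struct kvm_userspace_memory_region slot;   /* already registered */
extern size_t dirty_count(unsigned long *bitmap, size_t npages);
extern void send_pages(void *ram, unsigned long *bitmap, size_t npages);
extern void pause_vm(void), resume_on_target(void);

void precopy_migrate(void *ram)
{
	size_t npages = slot.memory_size / 4096;
	unsigned long *bitmap = calloc(npages / 8 + 8, 1);

	/* Round 0: enable dirty logging on the slot, then push all memory once. */
	slot.flags |= KVM_MEM_LOG_DIRTY_PAGES;
	ioctl(vm_fd, KVM_SET_USER_MEMORY_REGION, &slot);
	send_pages(ram, NULL, npages);                 /* bulk copy */

	/* Iterative rounds: resend only pages dirtied since the last round. */
	for (int round = 0; round < MAX_ROUNDS; round++) {
		struct kvm_dirty_log log = {
			.slot = slot.slot,
			.dirty_bitmap = bitmap,
		};
		ioctl(vm_fd, KVM_GET_DIRTY_LOG, &log); /* fetch and reset bitmap */
		if (dirty_count(bitmap, npages) < THRESHOLD)
			break;
		send_pages(ram, bitmap, npages);
	}

	/* Stop-and-copy: pause the guest, send the residue, resume remotely. */
	pause_vm();
	/* final KVM_GET_DIRTY_LOG + send_pages(...) elided */
	resume_on_target();
}
```

The dirty-bitmap bookkeeping and the repeated memory copies are precisely the overheads measured in the following subsection.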
seconds in total to migrate a VM with 16GB of memory; and the service downtime can last up to 600ms. Fig. 1 shows the impact on the customer network traffic when migrating the VM. The customer traffic drops significantly immediately after the migration starts due to the burst transmission of the initial VM memory. The impact is minimal during the subsequent iterations of memory copies, with a big drop during the stop-and-copy phase.2 Note that we dedicate 10Gbps of network bandwidth for migration. The overhead thus

14.3 18.2 11.7 13.0 15.6 16.9 10.4 comes from KVM’s dirty memory tracking and the competi- Time unit: second tion on the memory bandwidth. Fig. 1 only shows the result Figure 1. Impact on the VM network traffic during live mi- for the migration of a single VM. A cluster-wide VM mi- gration of a 16GB VM over 10GB dedicated network gration could lead to much worse results due to network congestion. Importantly, VM live migration affects high-end customers most because their VMs have larger memory sizes 2.1 Large-scale VM Migration is Impractical and use more CPU power. Some high-end customers do not Concurrently migrating lots of VMs is impractical due to the trust the provider’s statistics; instead, they run their own limited availability of spare servers and network bandwidth. bandwidth and CPU monitoring tools to detect any service A cluster in the datacenter consists of hundreds of servers downtime. Hundreds of milliseconds in downtime are suf- with the same hardware specification. The cloud operator ficient to cause severe problems for customers like stock often reserves only a small number of spare servers in each exchanges and banks. cluster for urgent/peak business needs. This significantly Unlike VM migration, Orthus uses VM grafting to directly limits the number of VMs that can be simultaneously mi- move the VM to the new VMM instance, avoiding the lengthy grated. To upgrade the whole cluster, we need to use these memory copies and downtime. Our collected data in the real spare servers back and forth. Even worse, VM migration can cloud environment shows that Orthus can reduce 99% of the only happen when the datacenter is relatively quiet (e.g., the total migration time and 90% of the service downtime with midnight of the majority of cloud users) to minimize the the same VM. impact on the customer network traffic. Consequently, it is not uncommon to take an entire month to completely update 2.3 Migration of Passthrough Devices a cluster. This poses a serious threat to the security because Device passthrough gives VMs direct access to hardware the lag between the release of a security patch to the first devices. It is critical for high-performance VMs and reduces appearance of exploits targeting the patched vulnerabilities the attack surface of the host kernel because it does not need is only days. to drive those devices. Device passthrough is used widely The design of Orthus allows us to concurrently upgrade as in public clouds. Even though some research has studied many VMMs as possible because Orthus does not need spare the migration of specific devices such as GPUs31 [ , 38] and machines to temporarily hold VMs. It performs the upgrade SRIOV-based network adapters, there is no generic solution within the same machine with a small memory footprint. to migrate passthrough devices. Device migration needs to save a snapshot of its internal states, including the ongoing 2.2 Impact of VM Live Migration DMA transactions, and restore it in the target server. Cur- VM live migration has three stages: pre-copy, stop-and-copy, rently, no hardware devices natively support live migration. and post-copy. In the pre-copy stage, the VM’s memory is A possible solution is to detach from the current device iteratively copied to a backup server while the VM is still and attach to a same one in the target server. However, all running. Specifically, KVM keeps track of the VM’s dirty the ongoing transactions will be lost. 
This may cause, say, memory (i.e., the memory changed since the last iteration of network packets to be lost. In worse cases, it could corrupt copy) and copies that dirty memory in the next iteration. In the customer data, for example, if GPUs are being used to the stop-and-copy stage, the VM is paused and the leftover accelerate computation and all the pending GPU tasks are dirty memory and other VM states are copied over to the lost. To avoid that, the cloud provider has to either notify the backup server. In the post-copy stage, the VM is resumed customer to stop the GPU tasks or wait for the VM to termi- on the backup server. Dirty memory tracking and copying nate. Neither is an ideal solution. This problem is addressed downgrade the performance of the VM, while stop-and-copy in Orthus by the device handover, which can seamlessly incurs the downtime. migrate all sorts of passthrough devices, including GPUs, The total migration time is decided by the memory size of the VM and the available network bandwidth for migration. 2We sample the data in 0.2-second intervals; therefore, there is still some Our measurement shows that it could take as long as 12 traffic during the stop-and-copy phase. 4 Fast and Scalable VMM Live Upgrade in Large Cloud Infrastructure ASPLOS ’19, April 13–17, 2019, Providence, RI

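As background for the system design that follows: a user-space VMM drives KVM through a device node, which is exactly the interface Orthus duplicates (/dev/kvm0, /dev/kvm1, ...). Below is a minimal, simplified sketch of the standard KVM ioctl flow (error handling and guest register setup omitted); only the node name changes under dual KVM.

```c
#include <fcntl.h>
#include <linux/kvm.h>
#include <sys/ioctl.h>
#include <sys/mman.h>

int main(void)
{
	/* A stock QEMU opens /dev/kvm; with dual KVM it would open the
	 * node of the KVM instance it is bound to, e.g. /dev/kvm0. */
	int kvm = open("/dev/kvm", O_RDWR | O_CLOEXEC);
	int vm  = ioctl(kvm, KVM_CREATE_VM, 0);

	/* Back guest "physical" memory with host memory and register it. */
	size_t sz = 256UL << 20;
	void *ram = mmap(NULL, sz, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	struct kvm_userspace_memory_region slot = {
		.slot = 0,
		.guest_phys_addr = 0,
		.memory_size = sz,
		.userspace_addr = (unsigned long)ram,
	};
	ioctl(vm, KVM_SET_USER_MEMORY_REGION, &slot);

	/* Create a vCPU and enter the run loop. */
	int vcpu = ioctl(vm, KVM_CREATE_VCPU, 0);
	struct kvm_run *run = mmap(NULL, ioctl(kvm, KVM_GET_VCPU_MMAP_SIZE, 0),
				   PROT_READ | PROT_WRITE, MAP_SHARED, vcpu, 0);
	for (;;) {
		ioctl(vcpu, KVM_RUN, 0);
		/* dispatch on run->exit_reason: I/O, MMIO, halt, ... */
		if (run->exit_reason == KVM_EXIT_HLT)
			break;
	}
	return 0;
}
```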
SRIOV-based devices, and FPGAs, without losing ongoing itself to kvm.ko by providing an array of pointers into its transactions. related internal functions. It then creates the device node at /dev/kvmn (n is 0, 1, ...) for QEMU to interact with the 3 System Design KVM kernel module. Multiple kvm-intel.ko instances can be The design of Orthus can be summarized by three key tech- loaded and work concurrently without conflicts. They can niques: dual KVM, VM grafting, and device handover. Specif- access the virtualization hardware (e.g., Intel VTx) in a time- ically, dual KVM creates two instances of the KVM kernel shared fashion. This is feasible because Intel VTx is mostly module – one is called the source KVM that is currently run- stateless; all the VM-related states are saved in the VM’s ning the VM, and the other is the target KVM, an upgraded VMCS data structure[13]. A typical scenario is for kvm-intel- version of KVM. VM grafting directly moves the VM from 0.ko to run all the VMs and for kvm-intel-1.ko to contain the the source KVM to the target KVM. Lastly, device handover upgraded version with new features and/or security patches. seamlessly transfers the ownership of passthrough devices We can then “graft” the VMs from kvm-intel-0.ko to kvm- to the target VM. In the rest of this section, we describe each intel-1.ko one by one until all the VMs are moved to the new technique in detail. version. kvm-intel-0.ko can then be unloaded to remove the vulnerable/unused code from the kernel. 3.1 Dual KVM 3.2 VM Grafting KVM is the most popular used in public clouds, such as Google Compute Engine and Amazon EC2. It is a With KVM, each VM is represented by a QEMU . type-II hypervisor, which runs inside a host OS kernel and All the resources a VM owns are allocated and managed relies on it for the resource management.3 KVM consists by this process, including the guest memory, virtual CPUs of a few kernel modules. On the Intel platform, it consists (vCPUs), storage, network adapters, etc. In particular, the of kvm.ko, the architecture-independent layer of KVM, and VM’s memory is allocated from the process’ heap. To move kvm-intel.ko, the module that manages Intel’s hardware vir- a VM from one KVM instance to another, we need to move tualization extension [14](kvm-amd.ko on the AMD plat- both the VM’s memory and all its internal states. Technically, form). To load two instances of KVM, we need to resolve we could reuse VM live migration for this purpose. How- name conflict of the symbols (i.e., functions and global vari- ever, that requires the server having twice the memory to ables) and provide an arbitration mechanism to allow them copy the VM’s memory, thus not feasible. Instead, we just to share the access to the virtualization hardware. “cut-and-paste” the VM’s memory and internal states from one KVM instance to another since both KVM instances run on the same machine. We call this process VM grafting. As VM VM# VM## previously mentioned, QEMU has a sizable number of vul- Userspace /dev/kvm /dev/kvm0 /dev/kvm1 nerabilities and new features are mostly integrated through QEMU. Consequently, we must allow QEMU to be upgradedVM with old image Kernel KVM.ko Thin KVM.ko as well. We implement VM grafting in QEMU because the cpu storage nic Standard QEMU states maintained by QEMU for each VM are large and com- save/restore kVM- kVM- kVM- plex. Some cloud providers replace QEMU with1 proprietary steps intel.ko intel-0.ko intel-1.ko implementations. A similar design can be applied.mem others Fig. 
3 shows the VM grafting process in Orthus. In step 1, Intel VTx Intel VTx hardware we mark the VM’s memory as reserved in order to move itfork() to hardware the new VM later. In the second step, we first fork the original parent child Replaceable Non-replaceable QEMU process; the parent process then pauses and saves2 the 3 VM’s internal states using QEMU’s savevm_state function;Save VM state execv (new Figure 2. Dual KVM system in Orthus the child process subsequently calls the execv function to qemu image) load the upgraded QEMU program (step 3).MAP_KVM If the initializa-_RESERVED tion succeeds, the new QEMU process (i.e., the child) restores 4 Fig. 2 shows Orthus’ dual KVM structure compared to the If any failure restore VM state the VM states with QEMU’s loadvm_state function (step 4). original KVM design. In Orthus, we move most of kvm.ko’s We use this multi-process design because VM initialization6 fallback functions into kvm-intel.ko and keep kvm.ko as a very thin and restoration is a lengthy process and could potentially restore layer for interfacingVM# with the host kernel.VM## This essentially fail. By forking the process, the parent can unpause the VM moves all the vulnerable parts of KVM into kvm-intel.ko. To if the child fails to restore the VM. In the final step (step 5), VM with new image load two copies1 of kvm-intel.ko2 , we associate3 4 all the original we map the VM’s memory in the new QEMU process and global variables in KVM with kvm-intel.ko and make all the cpu storage nic VFIO connector: resume the VM. If any failure occurs in step 3, 4, and 5, we global functions local. When loaded, kvm-intel.ko registers 1. /dev/vfio discard the child process and resume the saved VM in the mem others 2. /dev/vfio/#grp 3A type-IUserspace hypervisor like Xen runs directly on the bare metal. parent process. 3. Eventfd/irqfd 5 /dev/kvm# 5 kernel Host physical memory HPA pages kVM.ko iommu_ vfio_pci.ko type1.ko kVM- intel-kVM1.ko- vfio.ko intel-1.ko Replaceable Intel VTx hardware Non-replaceable ASPLOS ’19, April 13–17, 2019, Providence, RI Zhang and Zheng, et al. VM VM# VM##
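The dual-KVM arrangement of Section 3.1 can be pictured with the sketch below. This is an illustration, not Orthus's actual code: each loadable kvm-intel instance hands the thin kvm.ko layer a table of entry points and exposes a per-instance device node, so that /dev/kvm0 and /dev/kvm1 can coexist and be unloaded independently. The registration helper kvm_thin_register() and the ops layout are hypothetical.

```c
#include <linux/errno.h>
#include <linux/fs.h>
#include <linux/miscdevice.h>
#include <linux/module.h>

/* Entry points this kvm-intel instance exports to the thin kvm.ko
 * layer (names and layout are illustrative). */
struct kvm_instance_ops {
	long (*dev_ioctl)(struct file *filp, unsigned int ioctl,
			  unsigned long arg);
	int  (*hardware_enable)(void);
	void (*hardware_disable)(void);
};

static long kvm_intel1_ioctl(struct file *filp, unsigned int ioctl,
			     unsigned long arg)
{
	return -ENOTTY;            /* real KVM_* handling elided */
}
static int  kvm_intel1_hw_enable(void)  { return 0; }
static void kvm_intel1_hw_disable(void) { }

static const struct kvm_instance_ops ops = {
	.dev_ioctl        = kvm_intel1_ioctl,
	.hardware_enable  = kvm_intel1_hw_enable,
	.hardware_disable = kvm_intel1_hw_disable,
};

static const struct file_operations kvm_chardev_fops = {
	.owner          = THIS_MODULE,
	.unlocked_ioctl = kvm_intel1_ioctl,
};

/* Per-instance device node: /dev/kvm1 for the upgraded module. */
static struct miscdevice kvm_dev = {
	.minor = MISC_DYNAMIC_MINOR,
	.name  = "kvm1",
	.fops  = &kvm_chardev_fops,
};

/* Provided by the thin kvm.ko layer (hypothetical export). */
extern int kvm_thin_register(const struct kvm_instance_ops *ops);

static int __init kvm_intel1_init(void)
{
	int r = kvm_thin_register(&ops);
	if (r)
		return r;
	return misc_register(&kvm_dev);
}
static void __exit kvm_intel1_exit(void)
{
	misc_deregister(&kvm_dev);
}
module_init(kvm_intel1_init);
module_exit(kvm_intel1_exit);
MODULE_LICENSE("GPL");
```

Because all per-VM hardware state lives in the VMCS rather than in the module, two such instances can time-share VT-x, as described above.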

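The grafting sequence of Section 3.2 (pause, save, fork, execv, restore, with the parent kept as a fallback) looks roughly like the sketch below from QEMU's point of view. It is a simplified illustration under stated assumptions: vm_stop()/vm_resume()/save_vm_state() stand in for QEMU's savevm machinery, the state path is hypothetical, the guest RAM is assumed to be mapped with the MAP_KVM_RESERVED flag so it survives execve(), and the new QEMU is assumed to signal success over the inherited pipe.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Stand-ins for QEMU internals (assumed for illustration). */
extern void vm_stop(void);
extern void vm_resume(void);
extern int  save_vm_state(const char *path);   /* savevm-style snapshot */

#define STATE_PATH "/run/orthus/vmstate"       /* hypothetical path */

/* Returns only if the upgrade failed and the old QEMU keeps running. */
void orthus_graft(const char *new_qemu_path, char *const new_argv[])
{
	int ok[2];
	pipe(ok);                          /* child writes one byte on success */

	vm_stop();                         /* step 2: pause the guest */
	if (save_vm_state(STATE_PATH)) {   /* step 2: save device/CPU state */
		vm_resume();
		return;
	}

	pid_t child = fork();
	if (child == 0) {
		close(ok[0]);
		/* Step 3: replace this process image with the upgraded QEMU.
		 * Guest RAM tagged MAP_KVM_RESERVED survives execve(); the new
		 * binary is told (via new_argv, by assumption) to restore from
		 * STATE_PATH (step 4), resume the guest (step 5), and then
		 * write a byte to fd ok[1]. */
		execv(new_qemu_path, new_argv);
		_exit(127);                /* execv itself failed */
	}

	close(ok[1]);
	char c;
	if (read(ok[0], &c, 1) == 1)       /* success: new QEMU owns the VM */
		exit(0);                   /* the old QEMU bows out */

	/* Step 6 (fallback): child died before taking over; resume the guest
	 * on the old, still-intact QEMU/KVM instance. */
	waitpid(child, NULL, 0);
	vm_resume();
}
```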
Userspace /dev/kvm /dev/kvm0 /dev/kvm1 VM with old image programmed through DMA (direct memory access). For ex- Kernel KVM.ko Thin KVM.ko ample, the in the VM fills a buffer with GPU cpu storage nic Standard QEMU save/restore commands and submits the buffer to the GPU via its DMA en- kVM- kVM- kVM- steps gine. The GPU executes the commands and writes the results intel-0.ko intel-1.ko 1 mem others intel.ko back to the memory, also via DMA. However, direct device

Intel VTx Intel VTx hardware fork() access allows a malicious VM to compromise the VMM or hardware parent child other VMs by misusing the DMA to read/write their data, Replaceable Non-replaceable 2 3 i.e., the so-called DMA attack. To address that, modern CPUs Save VM state execv (new provide IOMMU to remap the DMA memory access (e.g., qemu image) Intel VT-d [2, 5, 20], and AMD IOMMU [6]). Normally, a VM MAP_KVM_RESERVED 4 has three types of memory addresses: GVA (guest virtual If any failure restore VM state address), GPA (guest physical address), and HPA (host/ac- 6 fallback tual physical address). GVA is translated into GPA by the restore VM# VM## guest page table in the same way as a non-virtualized sys- tem; GPA is then translated into HPA by the extended page 1 2 3 4 VM with new image table (EPT) to access the physical memory. However, this cpu storage nic two-level address translation is only applied to the software VFIO connector: 1. /dev/vfio access, not the DMA access. DMA access instead is translated mem others 2. /dev/vfio/#grp by IOMMU from GPA to HPA. Userspace /dev/kvm# 3. Eventfd/irqfd 5 To migrate a VM with a passthrough device, we need kernel Host physical memory HPA pages to migrate the device’s internal states, rebuild the IOMMU kVM.ko iommu_ vfio_pci.ko remapping table on the target server, and save/restore the type1.ko kVM- ongoing DMA operations in order to avoid losing data. So far, intel-kVM1.ko- vfio.ko Figure 3. VM grafting in Orthus there is no universal solution to migrate passthrough devices, intel-1.ko Replaceable especially for complex devices like GPUs and FPGAs. In Intel VTx Orthus, we address these challenges by the device handover. hardware Non-replaceable In Linux, device passthrough is enabled by VFIO (virtual In order for the new QEMU process to map the VM’s function I/O), an IOMMU/device agnostic framework to al- memory, it needs to share the memory with the old one. This low direct device access by the . VFIO exposes can potentially be implemented using the existing POSIX a number of device nodes under /dev/vfio/* for QEMU to shared memory (e.g, mmap with MAP_ANONYMOUS | use. Orthus introduces a new user-space component, the MAP_SHARED). Unfortunately, POSIX shared memory will VFIO connector, to wrap all the VFIO-related file descrip- not survive the execve , which reloads the call- tors and interfaces, including /dev/vfio, /dev/vfio/grp*, VFIO ing process with a new program. Note that the child pro- eventfd, and KVM irqfd. VFIO connector wraps QEMU’s cess uses execv to load the upgraded QEMU program. To access to these file descriptors. It is the only path to access address that, we add a new flag to the mmap system call the VFIO kernel. During the live upgrade, the ownership (MAP_KVM_RESERVED). Memory allocated with this flag of the VFIO connection is handed over to the new QEMU is shared between the parent and child processes and sur- process, as shown in Fig. 4. Specifically, the access to KVM vives the execve system call. As usual, such memory will be is switched from path 1 (kvm-intel-0) to path 3 (kvm-intel-1), released back to the kernel when the VM exits. We would and the VFIO connector is owned by the new VM (path 4). like to mention that this change to the kernel only adds a By handing over the VFIO connector, Orthus avoids closing few lines of source code and the flag is protected from being and reopening the stored file descriptors. From the kernel’s used by regular processes. 
As such, this new flag should not point of view, nothing is changed during the live upgrade. become a security concern. With the design of Orthus, there is no need to migrate With VM grafting, we can significantly reduce the migra- the IOMMU mapping or the device’s internal states. Specifi- tion time and the downtime compared to VM live migration. cally, the VM’s memory is shared by the new and old QEMU Moreover, we can upgrade as many VMMs as possible simul- processes. Consequently, the mapping from GPA to HPA is taneously because of the absence of network traffic. not changed during the live upgrade; the IOMMU transla- tion table thus remains valid. Because of this, any ongoing 3.3 Passthrough Device Handover DMA operations can continue execution without interrup- Device passthrough gives the VM direct access to hardware tion, even when the VM is stopped. If a DMA operation is devices. Heterogeneous cloud services rely on GPU or FPGA completed while the VM is paused for upgrade, the device passthrough to accelerate customer applications, such as will raise an interrupt, which is routed to KVM by the kernel. deep learning and AI. Modern hardware devices are often KVM temporarily caches the interrupt in its irqfd. The irqfd 6 VM VM# VM##

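The VFIO connector and the interrupt hand-off can be sketched as follows. This illustrates the mechanism rather than Orthus's implementation: the connector is simply a bundle of already-open file descriptors (container, group, device, interrupt eventfd) that stays open across execve() and is reused by the new QEMU, and any interrupt that fired while the guest was paused is picked up by draining the eventfd after resume.

```c
#include <stdint.h>
#include <unistd.h>

/* All VFIO/KVM interrupt plumbing the old QEMU had set up; the fds are
 * inherited across execve() because they are opened without O_CLOEXEC.
 * The layout is illustrative. */
struct vfio_connector {
	int container_fd;   /* /dev/vfio/vfio                    */
	int group_fd;       /* /dev/vfio/<group>                 */
	int device_fd;      /* the passthrough device            */
	int irq_eventfd;    /* eventfd signaled on device IRQs,  */
	                    /* also wired to KVM as an irqfd     */
};

/* After the grafted VM resumes: an interrupt that arrived while the guest
 * was paused sits as a counter in the eventfd.  Reading the eventfd
 * (created with EFD_NONBLOCK in this sketch) returns and clears that
 * counter; the caller then (re)injects the interrupt into the guest. */
int drain_pending_irqs(struct vfio_connector *c,
		       void (*inject)(unsigned long count))
{
	uint64_t count;
	ssize_t n = read(c->irq_eventfd, &count, sizeof(count));
	if (n == sizeof(count) && count > 0)
		inject(count);
	return n == sizeof(count) ? 0 : -1;
}
```

This matches the behavior described in the text: from the kernel's point of view nothing changes, and injecting a virtual IRQ right after the handover covers any notification that would otherwise be missed.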


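The reason the IOMMU table needs no rebuilding is that the guest-physical-to-host-physical mapping was established once, against host memory that does not move during grafting. A minimal sketch of how such a mapping is created through VFIO's type-1 IOMMU interface (a standard Linux API; the surrounding container/group setup is omitted):

```c
#include <linux/vfio.h>
#include <string.h>
#include <sys/ioctl.h>

/* Map one region of guest RAM for device DMA: the device addresses it by
 * guest-physical address (iova), and the IOMMU translates that to the host
 * pages backing 'vaddr'.  Because Orthus keeps the same host memory mapped
 * into the new QEMU, this mapping stays valid across the upgrade. */
int map_guest_ram_for_dma(int container_fd, void *vaddr,
			  unsigned long long guest_phys,
			  unsigned long long size)
{
	struct vfio_iommu_type1_dma_map map;

	memset(&map, 0, sizeof(map));
	map.argsz = sizeof(map);
	map.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE;
	map.vaddr = (unsigned long long)vaddr;   /* host virtual address   */
	map.iova  = guest_phys;                  /* address the device uses */
	map.size  = size;

	return ioctl(container_fd, VFIO_IOMMU_MAP_DMA, &map);
}
```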
6 fallback 4.1 Experiment Setup restore VM# VM## Two types of the host configurations were used in the exper- 1 2 3 4 iments, as shown in Table 4. Computing service is typically VM with new image used by web servers, image processing, media servers, and cpu storage nic VFIO connector: database servers; GPU service is popular among machine 1. /dev/vfio mem others 2. /dev/vfio/#grp learning, deep learning, AI, and graphic rendering appli- Userspace /dev/kvm# 3. Eventfd/irqfd cations. The VM configurations are shown in Table 55. All the VMs were configured to use 2MB huge pages for the kernel Host physical memory HPA pages kVM.ko iommu_ memory. We conducted the experiments on both individual vfio_pci.ko type1.ko servers and a number of clusters with 45, 000 VMs. The for- kVM- mer measured the performance impact on an individual VM; intel-kVM1.ko- vfio.ko intel-1.ko while the latter measured the overall improvement to the Replaceable Intel VTx large-scale VMM upgrade. hardware Non-replaceable Host Type Hardware Configurations Intel(R) Xeon(R) CPU E5-2682 v4 Figure 4. Device handover in Orthus Computing Service CPU: 32 cores/64 [email protected] in 2 sockets 256GB RAM, 1TB SSD CS host + NVIDIA Tesla V100 GPU Service is usually connected to an eventfd. QEMU can receive the V100 up to 8 cards interrupt by reading from this eventfd. As mentioned earlier, Table 4. Evaluation host nodes configuration the eventfd is stored in the VFIO connector and handed over to the new QEMU process during the live upgrade. When the Guest VM Configurations VM is resumed, it reads from the eventfd and receives the CS Guest 16vCPUs, 4˜32GB RAM, 40GB Cloud Disk (SSD) pending interrupts. During our implementation, we spent 16vCPUs, 128GB RAM, 40GB Cloud Disk (SSD) GS Guest significant efforts to prevent interrupts from being lost. How- NVIDIA Tesla V100 GPU ever in large-scale tests, we still found data loses due to the Table 5. Evaluation guest VM’s configuration missing interrupts. To address that, we inject a virtual irq into the VM right after handing over the device. We aimed at measuring the performance of Orthus on the typical use cases of our cloud, such as web servers, database Device Types Supported Devices servers, media processing, and GPU-accelerated deep learn- AMD GPUs S7150, S7150 SRIOV ing workloads. To this end, we used the following standard NVIDIA Tesla M40, P100, benchmarks to simulate these services and measured the NVIDIA GPUs P4, V100 up to 8 cards total migration time and downtime caused by Orthus: FPGA Xilinx, Stratix • ApacheBench and Netperf: ApacheBench [9] and Network adapters Intel 82599ES SFP+ Netperf [22] were used to simulate traffic to web servers. Table 3. Passthrough devices supported by Orthus • MySQL and sysbench: sysbench was used to mea- sure the performance of the MySQL database during VMM live upgrade [26]. • SPEC CPU2006: SPEC CPU2006 consists of a number Device handover is a generic solution to migrate passthrough of CPU and memory intensive workloads [19] . We devices to the restored VM. It supports migrating all the chose SPEC benchmark suites because they are similar passthrough devices currently used in our cloud. The list of to the real customer usage of our cloud. these devices is given in Table 3. • Tensorflow: Tensorflow is a popular machine learn- ing framework from Google [4]. The MNIST dataset 4 Evaluation is a large database of hand-written digits commonly Orthus has been deployed in the production environment of used to train image processing systems [28]. 
We used Alibaba cloud datacenters. It has significantly improved our this dataset to train a Tensorflow model. ability to promptly upgrade VMMs with security patches or new features. In this section, we evaluate the performance 4.2 Service Downtime of Orthus in the real cloud environment. Specifically, we will Minimizing service downtime is the major concern of cloud demonstrate that Orthus can significantly reduce the total providers. We measured the service downtime of Orthus by migration time and downtime over VM live migration, and upgrading the VMM while the benchmarks were running. We it can be used for simultaneous large-scale VMM upgrade. recorded the service downtime as the time between saving 7 ASPLOS ’19, April 13–17, 2019, Providence, RI Zhang and Zheng, et al. the VM state (Fig. 3, step 2) and resuming the VM execu- others devices 16vCPU, tion (step 5). With this definition, the service downtime is , 4ms 4ms virtio-net, virtually the same as the total migration time for Orthus. 3ms wait for vm We used different benchmarks to simulate the common use virtio-balloon, cases of our cloud services, including computation, database, start signal, 12ms 3ms web, and machine learning. The goal was to understand how Orthus might impact different types of customer tasks in cirrus_vga, (unit: ms) 4ms case of the downtime. Service downtime for computing services: we used SPEC CPU2006, MySQL/sysbench, and web server to sim- Figure 6. Breakdown of service downtime in Orthus ulate the computation, database, and web services. While 700 running these benchmarks, we upgraded the VMM with Or- 678 600 MAX AVG MIN thus every 5 seconds until 100 samples of downtime were 483 500 383

recorded. The guest VM had 16 virtual CPUs, 32GB memory, and 40GB SSD-based Cloud disk. We run just this VM on the test server to avoid the interference from co-hosted VMs.
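The downtime reported in this subsection is simply the wall-clock gap between step 2 and step 5 of Fig. 3. A trivial sketch of how such a window can be measured is shown below; it assumes (as an illustration, not as Orthus's actual instrumentation) that the timestamp taken by the old QEMU travels along with the saved state, and relies on CLOCK_MONOTONIC being shared by both processes on the same host.

```c
#include <stdio.h>
#include <time.h>

/* Old QEMU stamps the moment the VM state is saved (Fig. 3, step 2). */
struct timespec stamp_vm_saved(void)
{
	struct timespec t;
	clock_gettime(CLOCK_MONOTONIC, &t);
	return t;
}

/* New QEMU calls this right after the guest resumes (Fig. 3, step 5). */
void report_downtime(struct timespec t_saved)
{
	struct timespec t_resumed;
	clock_gettime(CLOCK_MONOTONIC, &t_resumed);
	double ms = (t_resumed.tv_sec - t_saved.tv_sec) * 1e3 +
		    (t_resumed.tv_nsec - t_saved.tv_nsec) / 1e6;
	printf("service downtime: %.1f ms\n", ms);
}
```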

35

30

25 Figure 7. Downtime of Orthus for TensorFlow training

1 7


13 workloads with 1,2,4,6,8 GPUs (unit: ms) downtime downtime (unit:ms) idle WebServer MySQL spec cpu2006 devices. In our evaluation, we measured the downtime caused Figure 5. Service downtime for four cases (unit ms), sampled by Orthus for VMs with GPUs. Specifically, we used the every 5sec, total 100 sample MNIST dataset to train a Tensorflow model. The dataset contains 70, 000 images of handwritten digits (0 to 9), di- Fig. 5 shows service downtime of these four cases (includ- vided into the training set (60, 000 images) and the testing ing the idle VM for comparison). Overall, the downtime was set (10, 000 images). During the training, we upgraded the between 29ms to 37ms with an average of 30ms. We also VMM every 5 seconds and recorded the downtime. We took measured the downtime and total migration time for MySQL 100 samples of downtime. The results are summarized in under VM live migration with 800 samples. They were 268ms Fig. 7. The VM was assigned with 1, 2, 4, 6, or 8 GPUs. Fig. 7 and 27.3s on average, respectively. Compared to VM live mi- also shows the downtime of the idle VM as a comparison. gration, Orthus reduced about 90% of the downtime and Similar to computing services, the downtime for GPU 99% of the total migration time. Notice that all the sampled services also fell within a small range of time, i.e., the perfor- data of Orthus lied within a narrow range (29ms to 37ms). mance of Orthus was consistent with only minor fluctuation. As such, different workloads of the VM had limited impact It seems that the downtime increased by about 30ms for each on the downtime of Orthus. Our experiments also showed additional assigned GPU. This is because each GPU intro- that the memory size of the VM did not affect the downtime duces additional PCI configure space and VFIO-pci status much (this is expected since we do not copy the VM’s mem- that need to be saved and restored by Orthus. Interestingly, ory.) Unlike Orthus, a busier or larger VM has longer total the maximum downtime of idle VM was about 300ms longer migration time and downtime in VM live migration. than the active VM. This is because that idle GPUs may fall Fig. 6 shows the breakdown of a typical sample of the into power saving states. To restore the VM on the new VMM downtime (30ms in total). It took 12ms to wait for the vm_start (Fig. 3, step 4), the target QEMU needs to wake up the GPU signal from libvirt, a popular toolkit to manage virtualization in order to read its PCI config space. This takes much longer platforms.4 It took another 4ms to restore the VGA device than reading the PCI config space of an awake device. (cirrus_vga), 4ms for the 16 vCPUs, and 3ms for virtio-net Comparison to local VM live migration: technically, and virtio-balloon, each. it is possible to run live migration locally, i.e., use the same Service downtime for GPU services: unlike VM live server as both the source and target servers during the mi- migration, Orthus can seamlessly handle VMs with passthrough gration. This eliminates the network traffic for migration. 4This part was caused by the quirks in the design of libvirt. We are investi- However, it cannot upgrade the KVM kernel module or mi- gating the way to reduce it. grate passthrough devices (unless with our dual KVM and 8 Fast and Scalable VMM Live Upgrade in Large Cloud Infrastructure ASPLOS ’19, April 13–17, 2019, Providence, RI

Figure 8. MySQL random query response time (s) during 5 live migration sessions within a 2-minute experiment (lower is better). Per-session annotations: total migration time 12.0–12.2s, downtime 227–346ms.

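The response-time curves in Fig. 8 and Fig. 9 come from sysbench issuing random queries against MySQL. The same kind of measurement can be reproduced with a small client like the sketch below (libmysqlclient, single connection, simplified; the table/credentials are assumptions), which times each query so that a stall during a migration or an Orthus upgrade shows up as a latency spike.

```c
#include <mysql/mysql.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main(void)
{
	MYSQL *conn = mysql_init(NULL);
	if (!mysql_real_connect(conn, "127.0.0.1", "bench", "bench",
				"sbtest", 3306, NULL, 0)) {
		fprintf(stderr, "connect: %s\n", mysql_error(conn));
		return 1;
	}

	for (int i = 0; i < 100000; i++) {
		char q[128];
		/* sysbench-style random point select (table name assumed). */
		snprintf(q, sizeof(q),
			 "SELECT c FROM sbtest1 WHERE id=%d",
			 rand() % 100000 + 1);

		struct timespec a, b;
		clock_gettime(CLOCK_MONOTONIC, &a);
		if (mysql_query(conn, q) == 0) {
			MYSQL_RES *r = mysql_store_result(conn);
			if (r)
				mysql_free_result(r);
		}
		clock_gettime(CLOCK_MONOTONIC, &b);

		double ms = (b.tv_sec - a.tv_sec) * 1e3 +
			    (b.tv_nsec - a.tv_nsec) / 1e6;
		printf("%d %.3f\n", i, ms);   /* sample index, latency in ms */
	}
	mysql_close(conn);
	return 0;
}
```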

Figure 9. Mysql random query performance during 50 times Orthus live upgrade (lower is better) device handover techniques). Compared to local VM live 2min while the benchmark was running. The average to- migration, Orthus’ VM grafting can significantly reduce the tal migration time was 12s and the downtime was between downtime. The average downtime for Orthus is about 31ms, 255ms and 346ms. The figure shows notable performance while that for local migration is about 268ms for 16GB VM drops at the beginning of each session and more significant with the MySQL workload. The distribution of downtime is drops during the downtime. As mentioned before, the over- also much wider for the local migration (in a range of about head comes from mainly the initial burst memory transfer 400ms vs. 20ms). Therefore, Orthus is a better solution than (competition in the memory bandwidth), the dirty memory local VM live migration. tracking, and the downtime. These performance drops, albeit short, could lead to complaints and lose customers. 4.3 Performance Figure 9 shows a totally different picture under Orthus. We measured the impact of Orthus and VM live migration on We collected the data within the same 2min period but run the performance of services running in the VM, including the Orthus 50 times (10 times of live migration). Each vertical MySQL and Tensorflow workloads as the typical examples line in the figure corresponds to one execution. Each session of computing and GPU services. incurred about 30ms in downtime. As shown in the figure, Performance impact on computing services: The test the impact of Orthus on the MySQL service was minimal. VM had 16 virtual CPUs and 16GB memory. The host server Performance impact on GPU services: We used the had two SPF+ network adapters bonded together for a total same VM but with two Nvidia Telsa V100 GPUs. We run of 20Gbps bandwidth. We reserved 10 Gbps bandwidth for the same Tensorflow training task. The entire training task live migration in order to separate it from the customer took around 60s without Orthus. We then run Orthus 15 traffic. The VM run the MySQL database server. Weused times during the training. We did not notice any statistically sysbench to measure the response time for random queries significant difference in the time of the training task inthere- under Orthus and VM live migration. peated tests. According to Fig. 7, Orthus should have caused Fig. 8 shows the response time of MySQL queries under 1.35s of downtime in total (15 × 90ms). The reason that we live migration. We executed 5 live migration sessions within did not find noticeable difference is that, even though the 9 ASPLOS ’19, April 13–17, 2019, Providence, RI Zhang and Zheng, et al. vCPUs were stopped during the downtime, the GPUs were • For the same reason, we limited the network band- still processing the pending workloads. We would like to width for migration to 10Gbps. Therefore, we could point out that VM live migration simply does not work well not simultaneously migrate too many VMs. in this case because it cannot migrate GPU tasks. • The upgrade was deployed with the staged rollout, a common practice in the upgrade process. Initially, 4.4 Real-world Differences Made by Orthus only less than 2% VMs were migrated each day while We have deployed Orthus in most servers in our cloud data- we closely monitored the customer feedback. The roll- centers. It has become the most used tool in our daily main- out would have been stopped if we received any com- tenance. 
tenance. We use Orthus to deploy security patches and new features across the datacenters. It has significantly simplified and shortened the system upgrade process. We only use VM live migration sporadically, mostly to replace failed hardware devices (e.g., disks). With Orthus, we can deploy a new version of the VMM across the whole cloud in days (with the staged rollout), while it would previously take months with live migration. In particular, we previously had to manually upgrade the VMs with passthrough GPUs or FPGAs; these tasks can now be fully automated by Orthus. To demonstrate the gain in the efficiency, we describe our experience in deploying KVM's halt polling feature.

complaints. Note that the maximum we could do was to migrate about 12% of VMs each day due to the previous three constraints. A more aggressive rollout schedule may cause problems because of the network congestion and the competition for spare servers.

VMs  | Downtime | Total Migration Time
0.8% | >0.48s   | >96.3s
0.2% | >2.3s    | >174.9s
Max  | 9.82s    | 427.0s
Table 6. Distribution of downtime and total migration time for the worst 1% of VMs

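Halt polling — the KVM feature whose cloud-wide rollout is described next — makes a vCPU spin briefly in the host, checking for a wake-up event, before it is actually descheduled when the guest executes HLT. The following is a simplified sketch of the idea only; KVM's real implementation lives in kvm_vcpu_block() and adapts the polling window (halt_poll_ns) dynamically.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-ins for KVM internals. */
struct vcpu;
extern bool wake_event_pending(struct vcpu *v);  /* interrupt/timer ready? */
extern void deschedule_and_wait(struct vcpu *v); /* the expensive path     */
extern uint64_t now_ns(void);

/* Guest executed HLT: poll for up to poll_window_ns before blocking.
 * If a wake event arrives during the window, the guest resumes without
 * paying the cost of a host schedule-out/schedule-in. */
void vcpu_halt(struct vcpu *v, uint64_t poll_window_ns)
{
	uint64_t deadline = now_ns() + poll_window_ns;

	while (now_ns() < deadline) {
		if (wake_event_pending(v))
			return;                  /* woke up while polling */
		/* a cpu_relax()-style pause would go here */
	}
	deschedule_and_wait(v);                  /* fall back to blocking */
}
```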
We analyzed the whole migration logs. Even in the best 14% case, we could not avoid long downtime for some VMs. Sta- 12% Migration VMs < 12% per day tistics showed that almost 1% of VMs suffered a downtime of 10% more than 0.48s and the total migration time of more than 8% 1.5min. Even worse, the VM that suffered most often had 6% higher configurations with more vCPUs and memory. Ta- 4% ble 6 shows the distribution of downtime and total migration 2% time for these 1% VMs. Notice that the worst downtime was 9.82s and the worst migration time was more than 7min. Migration update: percentage update: Migration 0% 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 Next, we used Orthus to deploy this feature to all the other days VMs in the whole cloud with millions of VMs. In total, it only took 45min to upgrade all the VMs across the cloud, staged Figure 10. Upgrading 45, 000 VMs with live migration in three days. All we needed to do was to send the upgraded KVM/QEMU to all the servers. Orthus then upgraded them to the new version, no matter whether VMs were busy or not. Halt polling is a feature in KVM to reduce the guest latency In each case, the downtime was about 30ms. Our previous by polling wake conditions in the host for a short period experience shows that it would have taken a few months to before halting the guest (upon its own request). Kernel live accomplish the same cloud-wide update by live migration. patching cannot be used to apply this new feature because Overall, Orthus has tremendously simplified the daily some fundamental data structures, such as kvm_vcpu, are maintenance work in one of the largest public clouds. changed. In our first setup, we used VM live migration to Using Orthus to fix Spectre v1 attack: Meltdown and upgrade 45, 000 VMs in a couple of clusters. We applied our Spectre are side-channel attacks that exploit speculative ex- usual procedure in upgrading these VMMs. In total, it took 15 ecution in modern CPUs. Meltdown is a rouge data load days to complete the upgrade. Fig. 10 shows the percentage that can break the kernel/user-space isolation [29]. Melt- of VMs upgraded each day. Everyday, no more than 12% of down does not work against KVM because the memory of VMs could be upgraded. The reasons that slowed down the KVM or other VMs is not mapped in the address space of process were: the malicious VM. However, Spectre can break the isolation • Most time was wasted to wait for the spare servers. between different applications (potentially VMs), posing a These clusters, like others, were to their capabil- severe security threat to the cloud [24]. The official recom- ity with just enough spare servers to handle urgent/- mendation to fix Spectre by Intel requires loading anew peak traffic. microcode and upgrading the system software [15]. The mi- • To avoid impacting the customer traffic, live migration crocode adds three new capabilities to compatible CPUs: was only conducted in the midnight. indirect-branch restricted speculation (IBRS), single- 10 Fast and Scalable VMM Live Upgrade in Large Cloud Infrastructure ASPLOS ’19, April 13–17, 2019, Providence, RI indirect-branch predictors (STIBP), and indirect-branch pre- VM live migration allows the cloud provider to upgrade al- dictor barrier (IBPB). These capabilities can be used by the most everything in the original server, from hardware de- updated system software to prevent the misuse of branch vices to the . However, the cluster-wide prediction by Spectre. 
The KVM community made the nec- live migration is inherently limited by the availability of essary changes to both KVM and QEMU.5 Fortunately, Intel backup servers and the network bandwidth. Therefore, it has provided a method to update the microcode without cannot meet the time requirement when deploying urgent rebooting [16]. (security) updates. Moreover, live migration incurs relatively To fix Spectre in our cloud, we first upgraded each server’s long downtime and total migration time and cannot handle microcode at run-time to add these new capabilities; we then passthrough devices. Compared to VM live migration, Or- patched our KVM/QEMU with the upstream fixes; and finally thus can reduce the downtime and total migration time by deployed the upgraded KVM/QEMU by Orthus. With Orthus, 90% and 99%; it does not need backup servers or consume we were able to fix Spectre across our whole cloud within just network bandwidth; it can upgrade a large number of VMs a few days, even with the staged rollout. This demonstrated in parallel. Since its deployment in our cloud, Orthus has the tremendous advantage of Orthus over VM live migration mostly replaced VM live migration, which is only used to in the large-scale cloud infrastructure – it would have taken replace failed hardware devices and major kernel upgrade a few months for VM live migration to upgrade the whole (new versions). datacenters. As mentioned before, it only takes days for new Efforts to reduce the impact of live migration: A study exploits to appear after a vulnerability is publicized. of the performance in the cloud shows that application ser- vice levels drop during the migration, especially for Internet 5 Related Work applications[40]. The overhead comes from iterative memory Kernel live patching, VMM live upgrade, and VM live migra- copy between the source and target VMs. Jin et al. propose tion are three powerful, complementary methods to update a compression-based migration approach to reduce its net- the cloud infrastructures. In this section, we compare Orthus work traffic [21]. Specifically, memory pages are selectively to the representative systems in these other two methods. compressed in batches on the source VM and decompressed on the target VM. This system can reduce service down- 5.1 Kernel Live Patching time by 27%. Liu et al. propose to use full system tracing and replay to provide fast, transparent VM migration[30]. Kernel live patching is a lightweight solution to apply tem- Availability of network bandwidth is another constraint to porary patches to the running kernel. It can apply patches at the live migration, especially for the large-scale migration the function level (e.g., [27] and KGraft [33]) or the in the datacenters. Researchers have proposed to limit the instruction level (e.g., [7]). Recently, KARMA auto- migration network traffic within the inner racks, instead of matically adapts an official kernel patch to many fragmented the core network links where network congestion is more Android kernels [11]. Kernel live patching is most suitable costly. Deshpande et al. propose a distributed system for for applying simple security patches. It does not support inter-rack migration [17]. They were able to reduce the total patches that may change persistent data structures (i.e., data network traffic on the core network by 44%. structures that have allocated instances on the kernel heap Compared to these systems, Orthus can reduce the mi- or stacks). 
Moreover, kernel live patching could become a gration time and downtime by 90% and 99%, respectively, maintenance headache. It is especially problematic to the and completely eliminate the network traffic caused by itera- public cloud, which has hundreds of thousands of servers. tive memory copying. As such, Orthus can support massive The cloud provider needs to keep track of all the live patches parallel VMM upgrade. applied to individual servers and test whether these kernel Live migration of passthrough device: How to mi- patches are compatible with each other or not. Eventually, grate pass-through devices is still a big challenge to public each server needs to be rebooted to install a clean upgrade. cloud providers. There are disconnected solutions to migrate Orthus instead focuses on the live upgrade of the whole specific devices, essentially by introducing another layer of VMM. It can support complex changes to KVM/QEMU, in- abstraction. For example, an approach has been proposed cluding adding new features. Such changes simply cannot to live-migrate Intel GPUs [31]. Specifically, they assign be applied by kernel live patching. Nevertheless, kernel live sliced vGPUs [37] that share a physical GPU to multiple VMs. patching is still useful for the cloud provider. The sliced vGPUs can be migrated between different Intel GPUs. There is also a similar system designed for GPU-based 5.2 VM Live Migration machine-learning and AI applications that exposes a virtual VM live migration temporarily moves the VMs to a backup CUDA interface to the VM [35]. Both systems are tied to the server, upgrades the system, and then moves the VMs back [12]. GPUs they support and require a new implementation for each used GPU. Kadav and Swift propose to use shadow dri- 5QEMU was changed to expose these capabilities to the VMs. vers to capture and reset the state of a device driver in order 11 ASPLOS ’19, April 13–17, 2019, Providence, RI Zhang and Zheng, et al. to migrate passthrough devices [23]. Intel provides a solution will fail to restore the VM image on the new KVM. How- to migrate network devices. Specifically, it bonds a virtual ever, by design, Orthus will resume the execution of the VM NIC and physical NIC together and hot-unplug/hot-plug the on the original VMM instance. Consequently, there will be virtual NIC during migration [8, 18, 32, 42]. As shown in no interruption to the VM. This problem can be addressed Table 3, our cloud supports several types of passthrough later by adding the compatibility code to QEMU. Note that, devices, including FPGAs. VM live migration uses the same mechanism to save and Orthus’ device handover is a generic solution for device restore VMs. As such, they have exactly the same problem migration during the VMM live upgrade. It can seamlessly (this is also the reason why QEMU is unlikely to break back- handover the passthrough devices to the target VM without ward compatibility.) Note that we cannot resort to the live losing the ongoing operations. It can support all the pass- migration to update the VMM in this case because the live through devices in our cloud without device-specific code. migration will also fail. VMM live upgrade is an important defense again malicious VMs. There are orthogonal efforts to mitigate this problem 5.3 Dual VMM from different angles. For example, fuzz testing has been Dual VMM has been used to isolate resource and improve used to discover unknown vulnerabilities in the KVM/QEMU security. 
Orthus' device handover is a generic solution for device migration during the VMM live upgrade. It can seamlessly hand over the passthrough devices to the target VM without losing the ongoing operations, and it supports all the passthrough devices in our cloud without device-specific code.

5.3 Dual VMM

A dual-VMM design has been used to isolate resources and improve security. For example, it has been used to consolidate HPC (high-performance computing) and commodity applications in the cloud [25]. MultiHype uses multiple VMMs in a single host to improve security [36], so that a compromised VMM is confined to its own domain without affecting the others. Dual KVM is one of the three components of Orthus. Together, these three components allow us to upgrade both KVM and QEMU to a new version without disrupting the running VMs.

6 Discussion

In this section, we discuss potential improvements to Orthus and future work. First, by design, Orthus is a method to upgrade the live VMM and thus cannot upgrade the host kernel. This design is based on the cloud threat model, in which the host kernel is mostly isolated from the guest through network segregation, device passthrough, and driver domains. The main attack surface is the VMM that directly interacts with untrusted VMs. Our survey of all the KVM-related vulnerabilities in the CVE database shows that only 2 out of 95 vulnerabilities lie in the kernel and can be exploited by VMs; the other 93 vulnerabilities lie in the VMM. Therefore, keeping the VMM up-to-date is critical in protecting the cloud infrastructure from malicious VMs, and the two kernel vulnerabilities can be patched by kernel live patching or VM live migration. Moreover, new functions usually need to be integrated into the VMM. Kernel live patching, VMM live upgrade, and VM live migration form a complete toolset for cloud providers. Our own experience shows that Orthus is the most useful of the three because of its effectiveness, scalability, and minimal impact on customer traffic.

Orthus uses QEMU's save and restore functions to "cut-and-paste" VMs from the old VMM to the updated one. As such, it assumes that the saved VM image can be restored by the updated VMM. This assumption holds as long as the VM format in QEMU remains backward compatible. QEMU, as a mature open-source project, rarely breaks backward compatibility, and we have not encountered such a case during the deployment of Orthus in our cloud. If it does happen, Orthus will fail to restore the VM image on the new KVM; however, by design, Orthus will then resume the execution of the VM on the original VMM instance, so there is no interruption to the VM. The problem can be addressed later by adding the compatibility code to QEMU. Note that VM live migration uses the same mechanism to save and restore VMs, so it has exactly the same problem (this is also the reason why QEMU is unlikely to break backward compatibility). We cannot resort to live migration to update the VMM in this case because the live migration would also fail.
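To make this save/restore dependency concrete, the sketch below drives a QEMU instance over its QMP socket to stop the guest and stream its state to a file; an upgraded QEMU started with the same versioned machine type and an -incoming option can then restore it. The socket path, state file, and single-line response handling are simplifying assumptions; this illustrates the generic QEMU mechanism, not Orthus' actual grafting path.

```python
# Hedged sketch of QEMU's generic save/restore mechanism over QMP.
# Socket path and state file are hypothetical; error handling is omitted
# (a real client must also filter asynchronous QMP events).
import json
import socket

def qmp(sock_path: str, command: str, arguments: dict = None) -> dict:
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
        s.connect(sock_path)
        f = s.makefile("rw")
        json.loads(f.readline())                              # QMP greeting
        f.write(json.dumps({"execute": "qmp_capabilities"}) + "\n"); f.flush()
        json.loads(f.readline())                              # handshake ack
        msg = {"execute": command}
        if arguments:
            msg["arguments"] = arguments
        f.write(json.dumps(msg) + "\n"); f.flush()
        return json.loads(f.readline())

OLD_QMP = "/var/run/vm0-old.qmp"                              # hypothetical

# 1. Pause the guest and stream its device and RAM state to a local file.
qmp(OLD_QMP, "stop")
qmp(OLD_QMP, "migrate", {"uri": "exec:cat > /var/run/vm0.state"})

# 2. The upgraded QEMU restores it only if the format is compatible, e.g.:
#      qemu-system-x86_64 -machine pc-i440fx-2.12 ... \
#          -incoming "exec:cat /var/run/vm0.state"
#    If the restore fails, Orthus resumes the VM on the original instance.
```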
VMM live upgrade is an important defense against malicious VMs. There are orthogonal efforts that mitigate this problem from different angles. For example, fuzz testing has been used to discover unknown vulnerabilities in the KVM/QEMU stack: it essentially feeds the VMM with random or unexpected inputs, trying to trigger abnormal behaviors, and has proved effective in finding vulnerabilities in KVM [1]. There are also efforts to reduce the privileged code in KVM by moving non-performance-critical code to user space [1], and to sandbox and isolate KVM so that a compromised KVM cannot affect other VMs [41]. All these efforts improve the security of the cloud. On the other hand, vulnerabilities are inevitable given the complexity of KVM/QEMU and its host OS (Linux). A method to promptly upgrade the VMM, like Orthus, is essential to cloud security.

7 Conclusion

We have presented the details of Orthus, a system that can upgrade the live VMM in the cloud. Orthus features three key techniques: dual KVM, VM grafting, and device handover. Together, they allow us to upgrade the running KVM/QEMU to an updated version with new features or (complex) security fixes. In addition, Orthus can seamlessly hand over passthrough devices to the new instance without losing any ongoing (DMA) operations. Our evaluation shows that Orthus can reduce the total migration time and service downtime by more than 99% and 90%, respectively. It has become the most effective and indispensable tool in our daily maintenance and operations of hundreds of thousands of servers and millions of VMs.

8 Acknowledgment

We would like to thank the anonymous reviewers for their insightful comments that helped improve the presentation of this paper. Zhi Wang was partially supported by the National Science Foundation (NSF) under Grant 1453020; Qi Li was partially supported by the National Natural Science Foundation of China (NSFC) under Grants 61572278 and U1736209. Any opinions, findings, and conclusions or recommendations expressed in this paper are those of the authors and do not necessarily reflect the views of NSF or NSFC. Xiantao Zhang and Qi Li are the corresponding authors of this paper.

References

[1] 7 ways we harden our kvm hypervisor at google cloud: security in plaintext. https://bit.ly/2jSjru3, 2017.
[2] Intel® virtualization technology for directed i/o architecture specification. https://software.intel.com/sites/default/files/managed/c5/15/vt-directed-io-spec.pdf, 2018.
[3] Search results for cve in kvm. https://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=kvm, 2018.
[4] Martín Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, et al. Tensorflow: a system for large-scale machine learning. In OSDI, volume 16, pages 265–283, 2016.
[5] Darren Abramson, Jeff Jackson, Sridhar Muthrasanallur, Gil Neiger, Greg Regnier, Rajesh Sankaran, Ioannis Schoinas, Rich Uhlig, Balaji Vembu, and John Wiegert. Intel virtualization technology for directed i/o. Intel Technology Journal, 10(3), 2006.
[6] AMD. Amd i/o virtualization technology (iommu) specification. 2016.
[7] Jeff Arnold and M Frans Kaashoek. Ksplice: Automatic rebootless kernel updates. In Proceedings of the 4th ACM European Conference on Computer Systems, pages 187–198. ACM, 2009.
[8] Adam M Belay. Migrating virtual machines configured with pass-through devices, March 27 2012. US Patent 8,146,082.
[9] Apache Bench. ab - apache http server benchmarking tool.
[10] Robert Bradford, Evangelos Kotsovinos, Anja Feldmann, and Harald Schiöberg. Live wide-area migration of virtual machines including local persistent state. In Proceedings of the 3rd International Conference on Virtual Execution Environments, pages 169–179. ACM, 2007.
[11] Yue Chen, Yulong Zhang, Zhi Wang, Liangzhao Xia, Chenfu Bao, and Tao Wei. Adaptive android kernel live patching. In Proceedings of the 26th USENIX Security Symposium (USENIX Security 17), 2017.
[12] Christopher Clark, Keir Fraser, Steven Hand, Jacob Gorm Hansen, Eric Jul, Christian Limpach, Ian Pratt, and Andrew Warfield. Live migration of virtual machines. In Proceedings of the 2nd Conference on Symposium on Networked Systems Design & Implementation - Volume 2, pages 273–286. USENIX Association, 2005.
[13] Intel Corporation. Intel® 64 and ia-32 architectures software developer manuals. https://software.intel.com/en-us/articles/intel-sdm, 2016.
[14] Intel Corporation. Intel® virtualization technology (intel® vt). https://www.intel.com/content/www/us/en/virtualization/virtualization-technology/intel-virtualization-technology.html, 2016.
[15] Intel Corporation. Intel analysis of speculative execution side channels. https://newsroom.intel.com/wp-content/uploads/sites/11/2018/01/Intel-Analysis-of-Speculative-Execution-Side-Channels.pdf, 2018.
[16] Intel Corporation. Intel processor microcode package for linux description. https://downloadcenter.intel.com/download/27945/Linux-Processor-Microcode-Data-File?product=873, 2018.
[17] Umesh Deshpande, Unmesh Kulkarni, and Kartik Gopalan. Inter-rack live migration of multiple virtual machines. In Proceedings of the 6th International Workshop on Virtualization Technologies in Distributed Computing Date, pages 19–26. ACM, 2012.
[18] Yaozu Dong. Efficient migration of virtual functions to enable high availability and resource rebalance, September 10 2013. US Patent 8,533,713.
[19] John L Henning. Spec cpu2006 benchmark descriptions. ACM SIGARCH Computer Architecture News, 34(4):1–17, 2006.
[20] Radhakrishna Hiremane. Intel virtualization technology for directed i/o (intel vt-d). Technology@Intel Magazine, 4(10), 2007.
[21] Hai Jin, Li Deng, Song Wu, Xuanhua Shi, and Xiaodong Pan. Live virtual machine migration with adaptive, memory compression. In Cluster Computing and Workshops, 2009. CLUSTER'09. IEEE International Conference on, pages 1–10. IEEE, 2009.
[22] Rick Jones et al. Netperf: a network performance benchmark. Information Networks Division, Hewlett-Packard Company, 1996.
[23] Asim Kadav and Michael M. Swift. Live migration of direct-access devices. In Proceedings of the First Conference on I/O Virtualization, WIOV'08, Berkeley, CA, USA, 2008. USENIX Association.
[24] Paul Kocher, Daniel Genkin, Daniel Gruss, Werner Haas, Mike Hamburg, Moritz Lipp, Stefan Mangard, Thomas Prescher, Michael Schwarz, and Yuval Yarom. Spectre attacks: Exploiting speculative execution. arXiv preprint arXiv:1801.01203, 2018.
[25] Brian Kocoloski, Jiannan Ouyang, and John Lange. A case for dual stack virtualization: consolidating hpc and commodity applications in the cloud. In Proceedings of the Third ACM Symposium on Cloud Computing, page 23. ACM, 2012.
[26] Alexey Kopytov. Sysbench: a system performance benchmark. http://sysbench.sourceforge.net/, 2004.
[27] Kpatch. kpatch: Dynamic kernel patching. 2018.
[28] Yann LeCun, Corinna Cortes, and CJ Burges. Mnist handwritten digit database. AT&T Labs [Online]. Available: http://yann.lecun.com/exdb/mnist, 2, 2010.
[29] Moritz Lipp, Michael Schwarz, Daniel Gruss, Thomas Prescher, Werner Haas, Stefan Mangard, Paul Kocher, Daniel Genkin, Yuval Yarom, and Mike Hamburg. Meltdown. arXiv preprint arXiv:1801.01207, 2018.
[30] Haikun Liu, Hai Jin, Xiaofei Liao, Liting Hu, and Chen Yu. Live migration of virtual machine based on full system trace and replay. In Proceedings of the 18th ACM International Symposium on High Performance Distributed Computing, pages 101–110. ACM, 2009.
[31] Jiacheng Ma, Xiao Zheng, Yaozu Dong, Wentai Li, Zhengwei Qi, Bingsheng He, and Haibing Guan. gmig: Efficient gpu live migration optimized by software dirty page for full virtualization. In Proceedings of the 14th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments, pages 31–44. ACM, 2018.
[32] Zhenhao Pan, Yaozu Dong, Yu Chen, Lei Zhang, and Zhijiao Zhang. Compsc: live migration with pass-through devices. ACM SIGPLAN Notices, 47(7):109–120, 2012.
[33] V Pavlík. kgraft - live patching of the linux kernel. Technical report, SUSE, Maxfeldstrasse 5, 90409 Nuremberg, Germany, 2014.
[34] Diego Perez-Botero, Jakub Szefer, and Ruby B Lee. Characterizing hypervisor vulnerabilities in cloud computing servers. In Proceedings of the 2013 International Workshop on Security in Cloud Computing, pages 3–10. ACM, 2013.
[35] Lin Shi, Hao Chen, Jianhua Sun, and Kenli Li. vcuda: Gpu-accelerated high-performance computing in virtual machines. IEEE Transactions on Computers, 61(6):804–816, 2012.
[36] Weidong Shi, JongHyuk Lee, Taeweon Suh, Dong Hyuk Woo, and Xinwen Zhang. Architectural support of multiple hypervisors over single platform for enhancing cloud computing security. In Proceedings of the 9th Conference on Computing Frontiers, pages 75–84. ACM, 2012.
[37] Jike Song, Zhiyuan Lv, and Kevin Tian. Kvmgt: A full gpu virtualization solution. In KVM Forum, volume 2014, 2014.
[38] Kun Tian, Yaozu Dong, and David Cowperthwaite. A full gpu virtualization solution with mediated pass-through. In USENIX Annual Technical Conference, pages 121–132, 2014.
[39] Paul Turner. Retpoline: a software construct for preventing branch-target-injection. https://support.google.com/faqs/answer/7625886, 2018.
[40] William Voorsluys, James Broberg, Srikumar Venugopal, and Rajkumar Buyya. Cost of virtual machine live migration in clouds: A performance evaluation. In IEEE International Conference on Cloud Computing, pages 254–265. Springer, 2009.
[41] Zhi Wang, Chiachih Wu, Michael Grace, and Xuxian Jiang. Isolating commodity hosted hypervisors with hyperlock. In Proceedings of the 7th ACM European Conference on Computer Systems, April 2012.
[42] Edwin Zhai, Gregory D Cummings, and Yaozu Dong. Live migration with pass-through device for linux vm. In OLS'08: The 2008 Ottawa Linux Symposium, pages 261–268, 2008.
