Fast and Live Hypervisor Replacement
Spoorti Doddamani, Piush Sinha, Hui Lu, Tsu-Hsiang K. Cheng, Hardik H. Bagdi, and Kartik Gopalan
Binghamton University, New York, USA

Abstract
Hypervisors are increasingly complex and must often be updated to apply security patches, bug fixes, and feature upgrades. However, in a virtualized cloud infrastructure, updates to an operational hypervisor can be highly disruptive: before being updated, virtual machines (VMs) running on a hypervisor must be either migrated away or shut down, resulting in downtime, performance loss, and network overhead. We present a new technique, called HyperFresh, to transparently replace a hypervisor with a new, updated instance without disrupting any running VMs. A thin shim layer, called the hyperplexor, performs live hypervisor replacement by remapping guest memory to a new, updated hypervisor on the same machine. The hyperplexor leverages nested virtualization for hypervisor replacement while minimizing nesting overheads during normal execution. We present a prototype implementation of the hyperplexor on the KVM/QEMU platform that can perform live hypervisor replacement within 10 ms. We also demonstrate how a hyperplexor-based approach can be used for sub-second relocation of containers for live OS replacement.

CCS Concepts • Software and its engineering → Virtual machines; Operating systems.

Keywords Hypervisor, Virtualization, Container, Live Migration

ACM Reference Format:
Spoorti Doddamani, Piush Sinha, Hui Lu, Tsu-Hsiang K. Cheng, Hardik H. Bagdi, and Kartik Gopalan. 2019. Fast and Live Hypervisor Replacement. In Proceedings of the 15th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments (VEE '19), April 14, 2019, Providence, RI, USA. ACM, New York, NY, USA, 14 pages. https://doi.org/10.1145/3313808.3313821

1 Introduction
Virtualization-based server consolidation is a common practice in today's cloud data centers [2, 24, 43]. Hypervisors host multiple virtual machines (VMs), or guests, on a single physical host to improve resource utilization and achieve agility in resource provisioning for cloud applications [3, 5–7, 50]. Hypervisors must often be updated or replaced for various purposes, such as applying security and bug fixes [23, 41], adding new features [15, 25], or simply software rejuvenation [40] to reset the effects of unknown memory leaks or other latent bugs.

Updating a hypervisor usually requires a system reboot, especially in the cases of system failures and software aging. Live patching [61] can be used to perform some of these updates without rebooting, but it relies heavily on the old hypervisor being patched, which can be buggy and unstable. To eliminate the need for a system reboot and mitigate service disruption, another approach is to live migrate the VMs from the current host to another host that runs a clean, updated hypervisor. Though widely used, live migrating [18, 27] tens or hundreds of VMs from one physical host to another, i.e., inter-host live migration, can lead to significant service disruption, long total migration time, and heavy migration-triggered network traffic, which can also affect other, unrelated VMs.

In this paper, we present HyperFresh, a faster and less disruptive approach to live hypervisor replacement, which transparently and quickly replaces an old hypervisor with a new instance on the same host while minimizing the impact on running VMs. Using nested virtualization [12], a lightweight shim layer, called the hyperplexor, runs beneath the traditional full-fledged hypervisor on which the VMs run. The new replacement hypervisor is instantiated as a guest atop the hyperplexor. Next, the states of all VMs are transferred from the old hypervisor to the replacement hypervisor via intra-host live VM migration.

However, two major challenges must be tackled with this approach. First, existing live migration techniques [18, 27] incur significant memory-copying overhead, even for intra-host VM transfers. Second, nested virtualization can degrade a VM's performance during normal execution, when no hypervisor replacement is being performed. HyperFresh addresses these two challenges as follows.

First, instead of copying a VM's memory, the hyperplexor relocates the ownership of the VM's memory pages from the old hypervisor to the replacement hypervisor. The hyperplexor records the mappings between the VM's guest-physical and host-physical address spaces from the old hypervisor and uses them to reconstruct the VM's memory mappings on the replacement hypervisor. Most of the remapping operations are performed outside the critical path of the VM state transfer, leading to a very low hypervisor replacement time of around 10 ms, irrespective of the size of the VMs being relocated. In contrast, traditional intra-host VM migration, which involves memory copying, can take several seconds. For the same reason, HyperFresh also scales well when remapping multiple VMs to the replacement hypervisor.
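To make the remapping idea concrete, the following is a minimal sketch of a hyperplexor remapping table. It is illustrative scaffolding under our own assumptions, not HyperFresh's actual code: the struct layout, the table size, and the install_ept_mapping() primitive are invented for exposition, and a real implementation would walk and rewrite extended page tables rather than call a per-page helper.

```c
#include <stdint.h>
#include <stddef.h>

struct gpa_hpa_map {
    uint64_t gpa;   /* guest-physical page address */
    uint64_t hpa;   /* host-physical page address  */
};

#define MAX_PAGES (1UL << 20)   /* sized for illustration: 4 GB of 4 KB pages */
static struct gpa_hpa_map table[MAX_PAGES];
static size_t nentries;

/* Phase 1, off the critical path: record each translation observed in
 * the old hypervisor's extended page tables. */
void record_mapping(uint64_t gpa, uint64_t hpa)
{
    table[nentries++] = (struct gpa_hpa_map){ .gpa = gpa, .hpa = hpa };
}

/* Hypothetical hyperplexor primitive: map one host-physical page into
 * the replacement hypervisor's address space at the same guest-physical
 * address. */
extern void install_ept_mapping(uint64_t gpa, uint64_t hpa);

/* Phase 2, during the brief switch: transfer *ownership* of the pages by
 * replaying the recorded translations; no page contents are copied. */
void reconstruct_mappings(void)
{
    for (size_t i = 0; i < nentries; i++)
        install_ept_mapping(table[i].gpa, table[i].hpa);
}
```

Because only translations are replayed, the work on the critical path is independent of how much memory the VM actually uses, which is consistent with the roughly 10 ms replacement time reported above.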
HyperFresh addresses the second challenge, nesting overhead during normal execution, as follows. In comparison with a traditional single-level virtualization setup, where the hypervisor directly controls the hardware, nested virtualization introduces additional overheads, especially for I/O virtualization. Hence, HyperFresh includes a number of optimizations to minimize nesting overheads, allowing the hypervisor and its VMs to execute mostly without hyperplexor intervention during normal operation. Specifically, HyperFresh uses direct device assignment (VT-d) for an emulation-free I/O path to the hypervisor, dedicates physical CPUs to the hypervisor to reduce scheduling overheads, reduces CPU utilization on the hyperplexor by disabling the polling of hypervisor VCPUs, and eliminates VM exits due to external device interrupts.

Finally, as a lightweight alternative to VMs, containers [21, 52–54] can be used to consolidate multiple processes. We demonstrate how the hyperplexor-based approach can be used for live relocation of containers to support replacing the underlying OS. Specifically, we demonstrate sub-second live relocation of a container from an old OS to a replacement OS by combining the hyperplexor-based memory remapping mechanism with a well-known process migration tool, CRIU [58], as sketched below. In this case, the hyperplexor runs as a thin shim layer.
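As one way to picture that container path, here is a minimal sketch using libcriu, the C API that ships with CRIU. The images directory path is hypothetical, and the sketch assumes a reachable CRIU service and that the checkpoint images remain accessible across the OS switch (in HyperFresh's setting, via hyperplexor memory remapping); it is not the paper's actual integration.

```c
#include <criu/criu.h>
#include <fcntl.h>
#include <stdbool.h>
#include <sys/types.h>

/* Point libcriu at a directory for checkpoint images and basic logging.
 * The path is a hypothetical example. */
static int setup(const char *images_dir)
{
    int fd = open(images_dir, O_DIRECTORY);
    if (fd < 0 || criu_init_opts() < 0)
        return -1;
    criu_set_images_dir_fd(fd);
    criu_set_log_file("hyperfresh-criu.log");
    criu_set_log_level(4);
    return 0;
}

/* Run on the old OS: checkpoint the container's process tree. */
int checkpoint_container(pid_t init_pid)
{
    if (setup("/var/lib/hyperfresh/imgs") < 0)
        return -1;
    criu_set_pid(init_pid);           /* dump this process tree */
    criu_set_tcp_established(true);   /* preserve live TCP connections */
    return criu_dump();               /* negative on error */
}

/* Run on the replacement OS: restore from the same images. */
int restore_container(void)
{
    if (setup("/var/lib/hyperfresh/imgs") < 0)
        return -1;
    return criu_restore();            /* restored pid, or negative on error */
}
```

The key point is that CRIU captures process state, while the page contents themselves can be handed over by remapping rather than by writing and re-reading image files, which is what makes sub-second relocation plausible.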
In summary, HyperFresh contributes a remapping-based mechanism for fast live hypervisor replacement, optimizations to reduce nesting overheads, and live container relocation for OS replacement. In the rest of this paper, we first demonstrate the quantitative overheads of VM migration-based hypervisor replacement, followed by the HyperFresh design, implementation, and evaluation, and finally a discussion of related work and conclusions.

2 Problem Demonstration
In this section, we examine the performance of traditional live migration for hypervisor replacement to motivate the need for a faster remapping-based mechanism.

2.1 Using Pre-Copy For Hypervisor Replacement
[Figure 1. Hypervisor replacement in (a) inter-host (non-nested) and (b) intra-host (nested) settings.]

Inter-Host Live VM Migration: To refresh a hypervisor, a traditional approach is to live migrate VMs from their current host to another host (or hosts) running an updated hypervisor. As shown in Figure 1(a), we can leverage the state-of-the-art pre-copy live VM migration technique. Pre-copy live VM migration consists of three major phases: iterative memory pre-copy rounds, stop-and-copy, and resumption. During the memory pre-copy phase, the first iteration transfers all memory pages over the network to the destination, while the VM continues to execute concurrently. In subsequent iterations, only the dirtied pages are transferred. After a certain number of iterations, determined by a convergence criterion, the stop-and-copy phase is initiated, during which the VM is paused at the source and any remaining dirty pages, VCPU state, and I/O state are transferred to the destination VM. Finally, the VM is resumed at the destination. (A sketch of one pre-copy iteration appears at the end of this section.)

Intra-Host Live VM Migration: As Figure 1(b) shows, with nested virtualization, a base hyperplexor at layer-0 (L0) can run deprivileged hypervisors at layer-1 (L1), which control VMs running at layer-2 (L2). At hypervisor replacement time, the VMs can be live migrated from the old L1 hypervisor to the replacement L1 hypervisor on the same host.
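As a concrete illustration of the pre-copy phases described in Section 2.1, the sketch below shows one pre-copy iteration against KVM's dirty-page logging interface (KVM_GET_DIRty_LOG is real; everything around it is assumed scaffolding: the vm_fd/slot setup, the guest_mem layout, the caller-allocated bitmap, and the send_page() transport). Production migration code, such as QEMU's, additionally handles bandwidth limits and convergence estimation.

```c
#include <linux/kvm.h>
#include <sys/ioctl.h>
#include <stdint.h>

#define PAGE_SIZE 4096UL

/* Hypothetical helper: transmit one guest page to the destination. */
extern void send_page(uint64_t gfn, const void *data);

/* One pre-copy iteration over a memslot that was registered with the
 * KVM_MEM_LOG_DIRTY_PAGES flag. `bitmap` must hold one bit per guest
 * page ((npages + 63) / 64 words, 64-bit little-endian host assumed).
 * Returns the number of pages sent this round. */
static unsigned long precopy_round(int vm_fd, int slot,
                                   uint8_t *guest_mem,
                                   unsigned long npages,
                                   unsigned long *bitmap)
{
    struct kvm_dirty_log log = {
        .slot = slot,
        .dirty_bitmap = bitmap,
    };
    unsigned long sent = 0;

    /* Fetch and clear the slot's dirty bitmap in one ioctl. */
    if (ioctl(vm_fd, KVM_GET_DIRTY_LOG, &log) < 0)
        return 0;

    for (unsigned long gfn = 0; gfn < npages; gfn++) {
        if (bitmap[gfn / 64] & (1UL << (gfn % 64))) {
            send_page(gfn, guest_mem + gfn * PAGE_SIZE);
            sent++;
        }
    }
    return sent;
}
```

A driver loop would repeat precopy_round() while the VM runs, stopping once the count of freshly dirtied pages falls below a threshold (the convergence criterion), and then pause the VM for the stop-and-copy phase. The cost that HyperFresh avoids is visible here: every round copies page contents, so total migration time grows with the VM's size and write rate.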