Post-Copy Live Migration of Virtual Machines
Michael R. Hines, Umesh Deshpande, and Kartik Gopalan
Computer Science, Binghamton University (SUNY)
{mhines,udeshpa1,kartik}@cs.binghamton.edu

ABSTRACT

We present the design, implementation, and evaluation of post-copy based live migration for virtual machines (VMs) across a Gigabit LAN. Post-copy migration defers the transfer of a VM's memory contents until after its processor state has been sent to the target host. This deferral is in contrast to the traditional pre-copy approach, which first copies the memory state over multiple iterations followed by a final transfer of the processor state. The post-copy strategy can provide a "win-win" by reducing total migration time while maintaining the liveness of the VM during migration. We compare post-copy extensively against the traditional pre-copy approach on the Xen Hypervisor. Using a range of VM workloads we show that post-copy improves several metrics including pages transferred, total migration time, and network overhead. We facilitate the use of post-copy with adaptive prepaging techniques to minimize the number of page faults across the network. We propose different prepaging strategies and quantitatively compare their effectiveness in reducing network-bound page faults. Finally, we eliminate the transfer of free memory pages in both pre-copy and post-copy through a dynamic self-ballooning (DSB) mechanism. DSB periodically reclaims free pages from a VM and significantly speeds up migration with negligible performance impact on VM workload.

Categories and Subject Descriptors: D.4 [Operating Systems]

General Terms: Experimentation, Performance

Keywords: Virtual Machines, Operating Systems, Process Migration, Post-Copy, Xen

A shorter version of this paper appeared in the ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments (VEE), March 2009 [11]. The additional contributions of this paper are new prepaging strategies including dual-direction and multi-pivot bubbling (Section 3.2), proactive LRU ordering of pages (Section 4.3), and their evaluation (Section 5.4).

1. INTRODUCTION

This paper addresses the problem of optimizing the live migration of system virtual machines (VMs). Live migration is a key selling point for state-of-the-art virtualization technologies. It allows administrators to consolidate system load, perform maintenance, and flexibly reallocate cluster-wide resources on-the-fly. We focus on VM migration within a cluster environment where physical nodes are interconnected via a high-speed LAN and also employ a network-accessible storage system. State-of-the-art live migration techniques [19, 3] use the pre-copy approach, which works as follows. The bulk of the VM's memory state is migrated to a target node even as the VM continues to execute at a source node. If a transmitted page is dirtied, it is re-sent to the target in the next round. This iterative copying of dirtied pages continues until either a small, writable working set (WWS) has been identified, or a preset number of iterations is reached, whichever comes first. This constitutes the end of the memory transfer phase and the beginning of service downtime. The VM is then suspended and its processor state plus any remaining dirty pages are sent to the target node, where the VM is restarted.

Pre-copy's overriding goal is to keep downtime small by minimizing the amount of VM state that needs to be transferred during downtime. Pre-copy will cap the number of copying iterations to a preset limit since the WWS is not guaranteed to converge across successive iterations. On the other hand, if the iterations are terminated too early, then the larger WWS will significantly increase service downtime. Pre-copy minimizes two metrics particularly well (VM downtime and application degradation) when the VM is executing a largely read-intensive workload. However, even moderately write-intensive workloads can reduce pre-copy's effectiveness during migration because pages that are repeatedly dirtied may have to be transmitted multiple times.
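To make the iteration structure concrete, the following is a minimal sketch of a pre-copy style migration loop as described above. The object and method names (all_pages, read_page, dirtied_since_last_round, receive_page, and so on) and the two thresholds are illustrative placeholders, not Xen's actual migration interface.

# Illustrative sketch of the pre-copy iteration loop (not Xen's actual code).
MAX_ITERATIONS = 30      # preset cap on copying rounds
WWS_THRESHOLD = 50       # dirty set small enough to stop iterating, in pages

def precopy_migrate(vm, target):
    to_send = set(vm.all_pages())                # round 1: send every page
    for _ in range(MAX_ITERATIONS):
        for pfn in to_send:
            target.receive_page(pfn, vm.read_page(pfn))
        to_send = vm.dirtied_since_last_round()  # pages re-dirtied while copying
        if len(to_send) <= WWS_THRESHOLD:
            break                                # writable working set is small
    # Downtime begins: suspend the VM, flush the remaining dirty pages,
    # then transfer processor state and restart the VM at the target.
    vm.suspend()
    for pfn in to_send:
        target.receive_page(pfn, vm.read_page(pfn))
    target.receive_cpu_state(vm.cpu_state())
    target.resume()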
In this paper, we propose and evaluate the post-copy strategy for live VM migration, previously studied only in the context of process migration. At a high level, post-copy migration defers the memory transfer phase until after the VM's CPU state has already been transferred to the target and resumed there. Post-copy first transmits all processor state to the target, starts the VM at the target, and then actively pushes the VM's memory pages from source to target. Concurrently, any memory pages that are faulted on by the VM at the target, and not yet pushed, are demand-paged over the network from the source. Post-copy thus ensures that each memory page is transferred at most once, avoiding the duplicate transmission overhead of pre-copy.

The effectiveness of post-copy depends on the ability to minimize the number of network-bound page faults (or network faults) by pushing pages from the source before they are faulted upon by the VM at the target. To reduce network faults, we supplement the active push component of post-copy with adaptive prepaging. Prepaging is a term borrowed from earlier literature [22, 32] on optimizing memory-constrained disk-based paging systems. It traditionally refers to a more proactive form of pre-fetching from storage devices (such as hard disks) in which the memory subsystem tries to hide the latency of high-locality page faults by intelligently sequencing the pre-fetched pages. Modern virtual memory subsystems do not typically employ prepaging due to increasing DRAM capacities. Although post-copy does not deal with disk-based paging, the prepaging algorithms themselves can still play a helpful role in reducing the number of network faults in post-copy. Prepaging adapts the sequence of actively pushed pages by using network faults as hints to predict the VM's page access locality at the target, pushing the pages in the neighborhood of a network fault before they are accessed by the VM. We propose and compare a number of prepaging strategies for post-copy, which we call bubbling, that reduce the number of network faults to varying degrees.
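As a rough illustration of how network faults can steer the push order (the concrete single-pivot, dual-direction, and multi-pivot bubbling variants are defined in Section 3.2), the sketch below shows an active-push loop that re-centers on each network fault and pushes outward from that pivot in both directions. All names (poll_network_fault, receive_page, read_page) are assumed placeholders, not the actual implementation.

# Illustrative sketch of fault-guided prepaging ("bubbling") during post-copy.
# Pages are pushed outward from a pivot; each network fault re-centers the pivot.

def postcopy_push(source_vm, target, total_pages):
    sent = [False] * total_pages
    pivot, offset = 0, 0                            # current bubble center and radius
    while not all(sent):
        fault_pfn = target.poll_network_fault()     # None if no fault is pending
        if fault_pfn is not None:
            push(source_vm, target, sent, fault_pfn)  # service the fault first
            pivot, offset = fault_pfn, 1              # re-center the bubble on the fault
        # Push the pages at distance `offset` on either side of the pivot.
        for pfn in (pivot - offset, pivot + offset):
            if 0 <= pfn < total_pages:
                push(source_vm, target, sent, pfn)
        offset += 1

def push(source_vm, target, sent, pfn):
    if not sent[pfn]:
        target.receive_page(pfn, source_vm.read_page(pfn))
        sent[pfn] = True

A multi-pivot variant would maintain several such bubbles concurrently; this single-pivot loop is only meant to show how faults re-center the push sequence so that no page is ever sent twice.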
Additionally, we identified a deficiency in both pre-copy and post-copy migration due to which free pages in the VM are also transmitted during migration, increasing the total migration time. To avoid transmitting the free pages, we develop a "dynamic self-ballooning" (DSB) mechanism (a simplified sketch is given at the end of Section 2). Ballooning is an existing technique that allows a guest kernel to reduce its memory footprint by releasing its free memory pages back to the hypervisor. DSB automates the ballooning mechanism so it can trigger periodically (say every 5 seconds) without degrading application performance. Our DSB implementation reacts directly to kernel memory allocation requests without the need for guest kernel modifications. It neither requires external introspection by a co-located VM nor excessive communication with the hypervisor. We show that DSB significantly reduces total migration time by eliminating the transfer of free memory pages in both pre-copy and post-copy.

The original pre-copy algorithm has advantages of its own. It can be implemented in a relatively self-contained external migration daemon that isolates most of the copying complexity to a single process at each node. Further, pre-copy also provides a clean way to abort the migration should the target node ever crash during migration, because the VM is still running at the source. (Source node failure is fatal to both migration schemes.) Although our current post-copy im-

2. RELATED WORK

Process Migration: The post-copy technique has been variously studied in the context of the process migration literature: it was first implemented as "Freeze Free" using a file server [26], then evaluated via simulations [25], and later via an actual Linux implementation [20]. There was also a recent implementation of post-copy process migration under openMosix [12]. In contrast, our contribution is to develop a viable post-copy technique for live migration of virtual machines. Process migration techniques in general have been extensively researched, and an excellent survey can be found in [17]. Several distributed computing projects incorporate process migration [31, 24, 18, 30, 13, 6]. However, these systems have not gained widespread acceptance, primarily because of portability and residual dependency limitations. In contrast, VM migration operates on whole operating systems and is naturally free of these problems.

Prepaging: Prepaging is a technique for hiding the latency of page faults (and, in general, of I/O accesses in the critical execution path) by predicting the future working set [5] and loading the required pages before they are accessed. Prepaging is also known as adaptive prefetching or adaptive remote paging. It has been studied extensively [22, 33, 34, 32] in the context of disk-based storage systems, since disk I/O accesses in the critical application execution path can be highly expensive. Traditional prepaging algorithms use reactive and history-based approaches to predict and prefetch the working set of the application. Our system employs prepaging, not in the context of disk prefetching, but for the limited duration of live VM migration, to avoid the latency of network page faults from target to source. Our implementation employs a reactive approach that uses any network faults as hints about the VM's working set, with additional optimizations described in Section 3.1.

Live VM Migration: Pre-copy is the predominant approach for live VM migration. Implementations include hypervisor-based approaches from VMware [19], Xen [3], and KVM [14], OS-level approaches that do not use hypervisors from OpenVZ [21], as well as wide-area migration [2]. Self-migration of operating systems (which has much in common with process migration) was implemented in [9], building upon prior work [8] atop the L4 Linux microkernel.
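Returning to the dynamic self-ballooning mechanism introduced in Section 1, the following minimal sketch illustrates the idea of a guest-resident loop that periodically returns free pages to the hypervisor and takes them back under allocation pressure. The class and method names (free_page_count, detach_free_pages, grant_pages, and so on) are assumptions for illustration and do not correspond to the actual Xen balloon-driver interface.

# Illustrative sketch of dynamic self-ballooning (DSB); not the actual driver code.
import time

BALLOON_INTERVAL = 5.0   # seconds between reclaim passes ("say every 5 seconds")
RESERVE_PAGES = 1024     # headroom kept free so normal allocations rarely stall

class DynamicSelfBalloon:
    def __init__(self, guest, hypervisor):
        self.guest = guest
        self.hypervisor = hypervisor

    def reclaim_pass(self):
        # Inflate the balloon: hand any surplus free pages back to the
        # hypervisor so that migration never has to copy free memory.
        surplus = self.guest.free_page_count() - RESERVE_PAGES
        if surplus > 0:
            pages = self.guest.detach_free_pages(surplus)
            self.hypervisor.return_pages(pages)

    def on_allocation_pressure(self, needed_pages):
        # Deflate the balloon: invoked from the guest's allocation path when
        # free memory runs low, so application performance is not degraded.
        pages = self.hypervisor.grant_pages(needed_pages)
        self.guest.attach_pages(pages)

    def run(self):
        while True:
            self.reclaim_pass()
            time.sleep(BALLOON_INTERVAL)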