Survey on Mechanisms for Live Virtual Machine Migration and Its Improvements
Information and Media Technologies 11: 101-115 (2016), reprinted from: Computer Software 33(2): 101-115 (2016) © Japan Society for Software Science and Technology

Special Feature: Survey Papers (Commentary)

Hiroshi Yamada
Dept. of Information and Computer Sciences, Tokyo University of Agriculture and Technology

Live virtual machine (VM) migration (simply live migration) is a powerful tool for managing data center resources. Live migration moves a running VM between different physical machines without losing any state, such as network connections and CPU status. Live migration has attracted the attention of academic and industrial researchers because relocating running VMs inside data centers by live migration makes it easier to manage data center resources. This paper summarizes live migration basics and techniques for improving them. Specifically, this survey focuses on software mechanisms for realizing basic live migration, improving its performance, and expanding its applicability. This paper also identifies research opportunities that state-of-the-art live migration techniques have not yet covered.

Japanese title: Survey on Live Virtual Machine Migration Mechanisms and Their Efficiency Improvements. Computer Software, Vol. 33, No. 2 (2016), pp. 101-115. [Commentary paper] Received October 2, 2015.

1 Introduction

One of the innovative technologies in computer systems over the last decade is system virtualization, which allows us to run multiple operating systems (OSes) on a physical machine. In system virtualization, the virtual machine monitor (VMM), instead of the OS, is the primary software layer that directly manages the underlying hardware. The VMM provides virtual machines (VMs) on which OSes run as if they were running on physical machines. System virtualization brings several benefits. For example, we can reduce the number of running physical machines by consolidating VMs running server software onto one physical machine. This leads to improvements in physical resource utilization and a reduction in power consumption. An AFCOM survey [1] reports that 72.9% of data centers in the world were virtualized in 2010. Also, virtualization software such as Xen [5], KVM [31], VirtualBox [46], VMware ESXi [55], and Hyper-V [41] is widely available.

Live VM migration (simply live migration) is a powerful tool for managing data center resources. Live migration moves a running VM between different physical machines without losing any state, such as network connections and CPU status. Relocating running VMs inside data centers by live migration makes it easier to manage data center resources. For example, the availability of services can be improved by migrating less loaded VMs to another host so that resources can be assigned to more loaded VMs. Another typical example is support for physical machine maintenance; a physical machine can be maintained with much less service downtime by migrating all the VMs running on the target machine to other machines. Policies for VM replacement using live migration, including load balancing [21][57][62] and power saving [19][40][54][58], have been widely studied in research communities.

Exploring ways to design and implement effective and/or efficient live migration is still a hot topic in the systems research community. This paper surveys live migration basics and techniques for improving them. Specifically, this survey focuses on software mechanisms for realizing basic live migration, improving its performance, and expanding its applicability. We believe that this survey helps researchers learn about existing live migration techniques, helps administrators judge which live migration technique is suitable for their services and data centers, and sheds light on the research directions of live migration.

Numerous studies have implicitly assumed that live migration is done on a local area network (LAN); the source and destination are connected to the same network. The focus of this paper is live migration mechanisms that are used within a data center. Some efforts extend live migration to wide area networks [9][23][38][66]. Surveying these techniques is outside the scope of this paper. We also note that exploring VM replacement policies [19][21][40][54][57][58][62] is an important topic of live migration research, but this paper focuses on the software mechanisms for realizing live migration.

The contributions of this paper are as follows:
• We describe mechanisms of live migration and its improvements. Note that a previous survey of live migration [39] describes basic mechanisms of live migration; our survey differs from it. The previous survey mainly focuses on the difference between process migration and VM migration, while our main focus is on the differences among live migration mechanisms (Sec. 3, 4, and 5).
• We classify live migration mechanisms into two categories: performance and applicability. We discuss the state-of-the-art live migration mechanisms in terms of these two aspects (Sec. 4 and 5).
• We compare the mechanisms with each other and show some research directions of live migration (Sec. 6).

2 Background

2.1 Virtualization

System virtualization is commonplace in computing environments including high-end data centers, laptops, and embedded systems. To support system virtualization, CPU vendors offer CPU extensions for virtualization. Typical examples are Intel VT-x [27], AMD SVM [3], and ARM TrustZone [4].

In virtualized environments, an OS runs on a VM created by the VMM. We refer to an OS running on the VM as the guest OS. The VMM provides the illusion that the guest OSes are running on physical hardware. The VMM multiplexes the underlying hardware to create virtual hardware such as virtual CPUs and virtual devices. The VMM assigns part of the underlying hardware to the VMs running on top of it. In addition, the VMM achieves isolation between running VMs; even if a guest OS crashes or is hijacked, the other guest OSes are not affected.

The VMM runs in privileged mode to manage and multiplex the underlying physical hardware devices, whereas guest OSes run in non-privileged mode. When a guest OS executes a privileged instruction, such as an access to the MMU or I/O peripherals, a software interrupt occurs and control is transferred to the VMM. At this point, the VMM can capture and regulate all resources because it processes the interrupts before delivering them to the guest OS.

2.2 Live Migration

The goal of live migration is to move a running VM between physical machines without disrupting its services. To achieve this goal, it is important to minimize the downtime during which the VM is stopped. This is the main difference from a suspend & resume scheme, which stops a VM, extracts its memory image, and restores it on the destination machine. The details of algorithms that minimize downtime while moving the VM are described in Sec. 3.

To move a running VM from one physical machine to another, the live migration mechanism running inside the VMM transfers the target VM's hardware states to the destination machine. At the destination machine, the VMM builds the target VM from the received states and runs it after the state restoration. The live migration mechanism typically transfers memory contents, CPU states, CPU register values, and device states. Live migration is supported by open-source virtualization software such as Xen [5] and KVM [31].

The concept of migration is not new. In the systems research community, process-level migration approaches have been studied widely [42]. Compared to this approach, VM-level migration has the following advantages, as described in [11].
• VM states on source are eliminated completely: The narrow interface between a guest OS and the VMM makes it easy to avoid the problem of residual dependencies, in which the source machine must remain available and network-accessible to service certain system calls or even memory accesses on behalf of migrated processes. Avoiding this problem is valuable when we conduct live migration for maintenance of the source machine.
• Entire VM memory can be migrated: Live migration transfers all of the in-memory state of the VM in a consistent and efficient fashion. This applies to the kernel-internal state (e.g., the TCP control block for a currently active connection) as well as the application-level state, even when this is shared between multiple cooperating processes. In practical terms, for example, this means that we can migrate an on-line game server [...]

[...] Sec. 1.

3.1 Pre-copy Approach

Pre-copy [11] is a widely used approach to transferring VM resources. The basic idea of pre-copy is to iteratively copy, from the source to the destination, the VM's pages that have been dirtied during live migration execution. This idea is also used by other systems such as VM fault-tolerant systems [13][43][49].

Figure 1 shows the execution flow of pre-copy migration. Pre-copy consists of two phases: the push phase and the stop-and-copy phase. When live migration starts, it first enters the push phase. The VMM copies all pages of a VM from the source to the destination during the first iteration. Subsequent iterations copy only those pages dirtied during the previous iteration. To detect dirty pages, the VMM maintains a dirty bitmap that records which pages have been dirtied.
When the number of dirty pages is under a predefined threshold, the pre-copy live migration starts the stop-and-copy phase.
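
The push phase and the switch to the stop-and-copy phase can be illustrated with a small simulation. The following Python sketch is not taken from any VMM; the function name `precopy_migrate`, the `dirty_rate` model (the guest performs a number of page writes proportional to the pages sent in that round), and the `max_rounds` safety cap are all assumptions made for illustration only.

```python
import random


def precopy_migrate(num_pages, dirty_rate, threshold, max_rounds=30, seed=0):
    """Simulate pre-copy migration of a VM with `num_pages` memory pages.

    Hypothetical model: while a round is in flight, the running guest
    performs int(pages_sent * dirty_rate) writes to random pages.
    `threshold` should be at least 1; `max_rounds` is an assumed cap for
    workloads that never converge.
    Returns (rounds, pages_pushed, pages_sent_while_stopped).
    """
    rng = random.Random(seed)
    # Dirty bitmap: True means the page still has to be sent.
    # Every page counts as dirty before the first iteration.
    dirty = [True] * num_pages
    pages_pushed = 0
    rounds = 0

    # Push phase: iterate until the dirty set is under the threshold.
    while sum(dirty) >= threshold and rounds < max_rounds:
        to_send = [page for page, is_dirty in enumerate(dirty) if is_dirty]
        dirty = [False] * num_pages  # reset the bitmap for this round
        pages_pushed += len(to_send)
        # The guest keeps running during the copy and dirties pages again.
        for _ in range(int(len(to_send) * dirty_rate)):
            dirty[rng.randrange(num_pages)] = True
        rounds += 1

    # Stop-and-copy phase: the VM is paused and the remaining dirty pages
    # are sent; this count is what determines the downtime.
    pages_sent_while_stopped = sum(dirty)
    return rounds, pages_pushed, pages_sent_while_stopped


if __name__ == "__main__":
    rounds, pushed, stopped = precopy_migrate(
        num_pages=1000, dirty_rate=0.5, threshold=50
    )
    print(f"rounds={rounds} pushed={pushed} sent_while_stopped={stopped}")
```

In this model the total number of pages pushed exceeds the VM size whenever the guest writes during migration, which is the bandwidth cost pre-copy pays for a short stop-and-copy phase.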