Live Virtual Machine Migration Techniques: Survey and Research Challenges

Divya Kapil, Emmanuel S. Pilli and Ramesh C. Joshi
Department of Computer Science and Engineering, Graphic Era University, Dehradun, India
(divya.k.rksh, emmshub, chancellor.geu)@gmail.com

Abstract—Cloud is an emerging technology in the world of information technology and is built on the key concept of virtualization. Virtualization separates hardware from software and has the benefits of server consolidation and live migration. Live migration is a useful tool for migrating OS instances across distant physical hosts of data centers and clusters. It facilitates load balancing, fault management, low-level system maintenance and reduction in energy consumption. In this paper, we survey the major issues of virtual machine live migration. We discuss how the key performance metrics, e.g. downtime, total migration time and transferred data, are affected when a live virtual machine is migrated over a WAN, under heavy workload, or when several VMs are migrated together. We classify the techniques and compare the various techniques in a particular class.

Keywords—Cloud computing; Virtualization; Virtual machine; Live migration; Pre-copy; Post-copy

I. INTRODUCTION

The computational world has become very large and complex. Cloud computing is the latest evolution of computing, where IT capabilities are offered as services. Cloud computing delivers services such as software or applications (SaaS - Software as a Service), infrastructure (IaaS - Infrastructure as a Service), and platform (PaaS - Platform as a Service). Computing is made available to users in a pay-as-you-use manner. Some common examples are Google's App Engine [1], Amazon's EC2 [2], Microsoft Azure [3], and IBM SmartCloud [4]. Cloud based services are on demand, scalable, device independent and reliable. Many different businesses and organizations have adopted the concept of cloud computing. Cloud computing enables consumers and businesses to use applications without installation and to access their files from any computer through the Internet. A standard definition of cloud computing is "a model for enabling convenient, on demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction" [5].

The cloud model is composed of five essential characteristics: on-demand self service, broad network access, resource pooling, rapid elasticity and measured service. On-demand self service ensures that a consumer can unilaterally provision computing capabilities automatically, without requiring human interaction with each service provider. Broad network access gives access to capabilities available over the network through standard mechanisms. Resource pooling pools computing resources to serve multiple consumers. Rapid elasticity is used to elastically provision and release capabilities or resources. Measured service controls and optimizes resource use by leveraging a metering capability.

The key concept of cloud computing is virtualization. Virtualization technology has become popular and valuable for cloud computing environments; it was first implemented on IBM mainframes in the 1960s. Virtualization is the abstraction of the physical resources needed to complete a request and of the underlying hardware used to provide service. It splits up a physical machine into several virtual machines. "A virtual machine (VM) is a software implementation of a computing environment in which an operating system or program can be installed and run" [6].

Fig. 1. Virtualization (multiple operating systems and their applications running on the virtualized hardware, CPU, MEM, NIC and DISK, of a single host)

VMware ESX / ESXi [7], Virtual PC [8], Xen [9], Microsoft Hyper-V [10], KVM [11], and VirtualBox [12] are popular virtualization software. Virtualization can run multiple operating systems concurrently, as shown in Fig. 1. A single host can have many smaller virtual machines in which isolated operating system instances are running. Virtualization technologies have a host program called the Virtual Machine Monitor or "Hypervisor", which is a logical layer between the underlying hardware and computational processes, and runs on top of a given host.

In cloud computing, storage, applications, servers and network devices can be virtualized. Virtualization can provide many benefits, such as better resource utilization, portability, application isolation, system reliability, higher performance, improved manageability and fault tolerance.

978-1-4673-4529-3/12/$31.00 © 2012 IEEE 963

The reasons for VM migration are: Load Balancing, accomplished by migrating VMs out of overloaded or overheated servers, and Server Consolidation, where servers can be selectively brought down for maintenance after migrating their workload to other servers [13].

In this paper we survey the performance technologies of VM live migration. We discuss live migration techniques for clusters, grids, etc., developed much before the concept was applied to cloud computing. We survey the literature on the evaluation of various VM migration techniques and identify the performance metrics. The existing live virtual machine migration techniques are studied and classified based on these metrics. This paper is organized as follows. Section II gives a brief introduction to Virtual Machine Migration (VMM). Section III describes some related work on evaluation metrics. Live VMM techniques are surveyed in Section IV and research challenges in Section V. We conclude our work in Section VI with future directions.

II. BACKGROUND

Virtualization technology allows multiple operating systems to run concurrently on the same physical machine. Virtualization provides the facility to migrate a virtual machine from one physical host (source) to another (destination). Virtual Machine Migration (VMM) is a useful tool for administrators of data centers and clusters: it allows a clean separation between hardware and software. Process-level migration problems can be avoided by migrating a whole virtual machine, so VMM avoids residual dependencies. Virtual machine migration enables energy saving, load balancing and efficient resource utilization.

Virtual machine migration methods are divided into two types: hot (live) migration and cold (non-live) migration. In cold migration, the status of the VM is lost and the user notices the service interruption. In hot (live) migration, the virtual machine keeps running while migrating and does not lose its status, so the user does not feel any interruption in service. In the live migration process, the state of the virtual machine to be migrated is transferred. The state consists of its memory contents and local file system, although with shared storage the local file system need not be transferred. First, the VM is suspended, then its state is transferred, and lastly, the VM is resumed at the destination host.

Live migration facilitates online maintenance, load balancing and energy management:

1. Online maintenance: To improve a system's reliability and availability, the system must stay connected with its clients, yet upgrades and maintenance are also necessary tasks; for this, all VMs are migrated away without disconnecting clients.

2. Load Balancing: VMs can be migrated from heavily loaded hosts to lightly loaded hosts to avoid overloading any one server.

3. Energy Management: VMs can be consolidated to save energy. Some of the underutilized servers are switched down after their VMs are consolidated, ensuring a power efficient green cloud.

Sapuntzakis et al. [14] demonstrate how to quickly move the state of a running computer across a network, including the state in its disks, memory, CPU registers, and I/O devices. This hardware state is called a capsule and includes the entire operating system as well as applications and running processes. They developed techniques to reduce the amount of data sent over the network: copy-on-write disks track only the updates to capsule disks, "ballooning" zeros unused memory, demand paging fetches only needed blocks, and hashing avoids sending blocks that already exist at the remote end.

The basic live migration algorithm was first proposed by Clark et al. [15]. First the hypervisor marks all pages as dirty; then the algorithm iteratively transfers dirty pages across the network until the number of pages remaining to be transferred is below a certain threshold or a maximum number of iterations is reached. The hypervisor marks transferred pages as clean, but since the VM operates during live migration, already transferred memory pages may be dirtied during an iteration and must be re-transferred. The VM is suspended at some point on the source to stop further memory writes, and the remaining pages are transferred. After transferring all the memory contents, the VM resumes at the destination.

Nelson et al. [16] describe the design and implementation of a system that uses virtual machine technology to provide fast, transparent application migration; neither the applications nor the operating systems need to be modified. Performance is measured with a hundred virtual machines migrating concurrently under standard industry benchmarks. It shows that for a variety of workloads, application downtime due to migration is less than a second.

A high performance virtual machine migration design based on Remote Direct Memory Access (RDMA) was proposed by Huang et al. [17]. InfiniBand is an emerging interconnect offering high performance and features such as OS-bypass and RDMA. RDMA is direct memory access from the memory of one computer into that of another without involving either one's operating system. Using RDMA, remote memory can be read and written (modified) directly, and hardware I/O devices can access memory without involving the OS.

Luo et al. [18] describe a whole-system live migration scheme, which transfers the whole system run-time state, including CPU state, memory data, and local disk storage, of the virtual machine (VM). They propose a three-phase migration (TPM) algorithm as well as an incremental migration (IM) algorithm, which migrates the virtual machine back to the source machine in a very short total migration time. During the migration, all write accesses to the local disk storage are tracked using a block-bitmap, and the local disk storage is synchronized according to the block-bitmap. The migration downtime is around 100 milliseconds, close to shared-storage migration. Using the IM algorithm, total migration time is reduced. The synchronization mechanism based on the block-bitmap is simple and effective, and the performance overhead of recording all the writes on the migrated VM is very low.

Bradford et al. [19] presented a system for supporting the transparent, live wide-area migration of virtual machines that use local storage for their persistent state.
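The iterative pre-copy loop described by Clark et al. [15] can be sketched as a small simulation. The page counts, dirtying model, threshold and round limit below are invented for illustration and are not taken from [15]:

```python
import random

def precopy_migrate(num_pages, dirty_rate, threshold=8, max_rounds=30, seed=0):
    """Simulate iterative pre-copy; returns (rounds, pages_sent, stop_copy_pages)."""
    rng = random.Random(seed)
    dirty = set(range(num_pages))      # round 1: every page is marked dirty
    sent = 0
    rounds = 0
    while len(dirty) > threshold and rounds < max_rounds:
        sent += len(dirty)             # push this round's dirty set; VM keeps running
        # while copying, the still-running VM dirties some pages again
        dirty = {p for p in range(num_pages) if rng.random() < dirty_rate}
        rounds += 1
    # stop-and-copy: suspend the VM and transfer the remaining dirty pages
    stop_copy_pages = len(dirty)
    sent += stop_copy_pages
    return rounds, sent, stop_copy_pages
```

With a low dirty rate the loop converges in a few rounds and only a handful of pages remain for the brief stop-and-copy phase; with a dirty rate close to the copy rate, only the round limit forces termination, which is why heavily written VMs suffer long downtimes.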

964 2013 3rd IEEE International Advance Computing Conference (IACC)

This approach is transparent to the migrated VM, does not interrupt open network connections to and from the VM during wide area migration, guarantees consistency of the VM's local persistent state at the source and the destination after migration, and is able to handle highly write-intensive workloads.

III. PERFORMANCE METRICS

Researchers have evaluated the issues in live virtual machine migration and suggested various performance metrics. Voorsluys et al. [13] evaluate the effects of live migration of virtual machines on the performance of applications running inside Xen VMs. Results show that migration overhead is acceptable but cannot be disregarded, especially in systems where availability and responsiveness are governed by strict SLAs.

Kuno et al. [20] present a performance evaluation of both migration methods (live and non-live), and demonstrate that the performance of processes on a migrating virtual machine severely declines. The important reasons for the decline are host OS communication and memory writing. They also analyze the reasons for I/O performance decline. These results demonstrate that one of the important reasons for the performance decline is the transmission for migration.

Feng et al. [21] compare the performance of VMotion and XenMotion. VMotion performs better than XenMotion in terms of total live migration data generated when migrating a VM instance. The performance of both VMotion and XenMotion degrades in networks with delay and packet loss, and VMotion performs much worse than XenMotion in certain networks with moderate delay and packet loss. Existing live migration technology performs well in LAN live migration.

The following metrics are usually used to measure the performance of live migration [22]:

1. Preparation Time: The time between the start of migration and the transfer of the VM's state to the target node, during which the VM continues to execute and dirty its memory.
2. Downtime: The time during which the migrating VM is not executing. It includes the transfer of processor state.
3. Resume Time: The time between resuming the VM's execution at the target and the end of migration, when all dependencies on the source are eliminated.
4. Pages Transferred: The total amount of memory pages transferred, including duplicates, across all of the above time periods.
5. Total Migration Time: The total time from start to finish. Total time is important because it affects the release of resources on both participating nodes as well as within the VMs.
6. Application Degradation: The extent to which migration slows down the applications executing within the VM.

IV. LIVE VM MIGRATION TECHNIQUES IN CLOUD

Live migration is an extremely powerful tool for cluster and cloud administrators. An administrator can migrate OS instances with their applications so that a machine can be freed for maintenance. Similarly, to improve manageability, OS instances may be rearranged across machines to relieve the load on overloaded hosts. In order to perform the live migration of a VM, its runtime state must be transferred from the source to the destination with the VM still running. There are two major approaches: post-copy and pre-copy memory migration.

Post-copy first suspends the migrating VM at the source, copies minimal processor state to the target node, resumes the virtual machine, and begins fetching memory pages over the network from the source.

There are two phases in the pre-copy approach: the warm-up phase and the stop-and-copy phase. In the warm-up memory migration phase, the hypervisor copies all the memory pages from source to destination while the VM is still running on the source. If some memory pages change during the memory copy process (dirty pages), they are re-copied in further rounds; this continues as long as the rate of re-copied pages is not less than the page dirtying rate. In the stop-and-copy phase, the VM is stopped at the source, the remaining dirty pages are copied to the destination, and the VM is resumed at the destination.

A. Post-Copy Approaches

Hines et al. [22] present the design and implementation of a post-copy technique for live migration of virtual machines. Post-copy consists of four key components: demand paging, active pushing, prepaging, and dynamic self-ballooning. They have implemented and evaluated post-copy on a Xen and Linux based platform. The evaluations show that post-copy significantly reduces the total migration time and the number of pages transferred compared to pre-copy. The bubbling algorithm for prepaging is able to significantly reduce the number of network faults incurred during post-copy migration.

Michael et al. [23] compare post-copy against the pre-copy approach on top of the Xen hypervisor. This shows improvements in several migration metrics, including pages transferred, total migration time and network overhead, using a range of VM workloads. They use post-copy with adaptive prepaging in order to eliminate all duplicate page transmissions. They eliminate the transfer of free memory pages in both migration schemes through a dynamic self-ballooning (DSB) mechanism. DSB periodically releases free pages in a guest VM back to the hypervisor and significantly speeds up migration with negligible performance degradation.

B. Pre-Copy Approaches

There are many categories in the pre-copy approach: technologies are combined, existing pre-copy approaches are improved, multiple VMs are migrated together, and specific application loads are considered. The techniques are explained below.

1) Combined Technologies:

Liu et al. [24] describe a novel approach. They combine recovery-system technology (checkpointing / recovery and trace / replay) with CPU scheduling to provide fast and transparent migration. The target host executes log files generated on the source host to synchronize the states of the source and target hosts, during which a CPU scheduling mechanism is used to adjust the log generation rate. This approach has short downtime and reasonable total migration time.
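To make the relationships among the Section III metrics concrete, consider a hypothetical migration timeline; the timestamps below are invented for illustration:

```python
# Event timestamps in seconds for one hypothetical live migration.
events = {
    "start": 0.0,      # migration begins; warm-up copying starts
    "suspend": 9.2,    # VM paused on the source (end of warm-up)
    "resume": 9.5,     # VM resumed on the destination
    "end": 10.4,       # all dependencies on the source released
}

preparation_time = events["suspend"] - events["start"]    # 9.2 s
downtime = events["resume"] - events["suspend"]           # 0.3 s
resume_time = events["end"] - events["resume"]            # 0.9 s
total_migration_time = events["end"] - events["start"]    # 10.4 s
```

Note that preparation time, downtime and resume time partition the total migration time, while pages transferred counts traffic across all three periods.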

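The post-copy flow (resume at the target with an empty memory map, fault in pages on demand, and actively push the rest) can be sketched as follows; the class and method names are illustrative stand-ins, not the implementation of [22]:

```python
class PostCopyTarget:
    """Toy model of the destination host during post-copy migration."""

    def __init__(self, source_pages):
        self.source = dict(source_pages)   # pages still resident at the source
        self.resident = {}                 # pages already at the target
        self.network_faults = 0

    def access(self, page_id):
        """The resumed VM touches a page; fetch it over the network on a fault."""
        if page_id not in self.resident:
            self.network_faults += 1
            self.resident[page_id] = self.source.pop(page_id)
        return self.resident[page_id]

    def push_one(self):
        """Background active push: stream one not-yet-sent page to the target."""
        if self.source:
            page_id, data = self.source.popitem()
            self.resident[page_id] = data
```

Because every network fault stalls the guest, prepaging (pushing the pages the VM is likely to touch next, e.g. around the last fault) is what keeps the fault count, and thus the performance degradation, low.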
Liu et al. [25] describe a novel approach, CR/TR-Motion, that adopts checkpointing / recovery and trace / replay technology to provide fast, transparent VM migration. This scheme can greatly reduce the migration downtime and network bandwidth consumption. In a multi-processor (or multi-core) environment, the expensive memory races among different VCPUs must be recorded and replayed, which makes it inherently difficult for this approach to migrate SMP guest OSes. The VCPU hot-plug technique may address this issue by dynamically configuring the migrated VM to use only one VCPU before migration, and giving back the VCPUs after the migration.

When CPU and/or memory intensive VMs are migrated, migration suffers extended downtime, which may cause service interruption or even failure, and prolonged total migration time, which is harmful to overall system performance. Svard et al. [26] approach this two-fold problem through a combination of techniques. They dynamically adapt the transfer order of memory pages during live migration, reducing the risk of re-transfers for frequently dirtied pages, and use a compression scheme that increases the migration throughput; the migration downtime is effectively reduced.

Moving a live VM of large size over a WAN with low bandwidth is a big problem. Bose et al. [27] propose to combine VM replication with VM scheduling so that migration latencies can be minimized. They compensate for the additional storage requirement due to the increase in the number of replicas by exploiting commonalities across different VM images using de-duplication techniques.

Kumar Bose et al. [28] propose to combine VM replication with VM scheduling to overcome the challenge of migration latencies associated with moving large files (VM images) over relatively low-bandwidth networks. They replicate a VM image selectively across different cloud sites, choose one replica of the VM image to be the primary copy, and propagate the incremental changes at the primary copy to the remaining replicas of the VM. The proposed architecture for integrated replication and scheduling, called CloudSpider, is capable of minimizing the migration latencies associated with live migration of VM images across WANs.

2) Improved Pre-Copy Approaches:

Jin et al. [29] present the design and implementation of a novel memory-compression based VM migration approach (MECOM). They first use memory compression to provide fast and stable virtual machine migration, though virtual machine services may be slightly affected depending on memory page characteristics. They also designed an adaptive zero-aware compression algorithm for balancing the performance and the cost of virtual machine migration. Pages are quickly compressed in batches on the source and exactly recovered on the target. Experiments demonstrate that compared with Xen, this system can reduce downtime by 27.1%, total migration time by 32% and total transferred data by 68.8%.

Fei Ma et al. [30] improved the pre-copy approach on Xen 3.3.0 by adding a bitmap page which marks frequently updated pages. In the iteration process, frequently updated pages are put into the page bitmap, and those pages are only transmitted in the last round of the iteration process. This ensures that frequently updated pages are transmitted just once.

Svard et al. [31] implemented a delta compression live migration algorithm as a modification to the KVM hypervisor. The performance is evaluated by migrating running VMs with different types of workload, and it shows a significant reduction in migration downtime. They demonstrate that when VMs migrate with high workloads and/or over low-bandwidth networks there is a high risk of service interruption. Using delta compression, this risk can be reduced, as data is transferred in the form of changes between page versions. In order to improve performance, either the dirtying rate has to be reduced or the network throughput increased.

Ibrahim et al. [32] present a performance analysis of the current KVM implementation and study the behavior of iterative pre-copy live migration for memory intensive VM applications. For a scientific application running on a VM with multiple cores, the memory rate of change is likely to be higher than the migration draining rate. They present a novel algorithm, implemented in KVM, that achieves both low downtime and low application performance impact.

3) Multiple VM Migration:

Al-Kiswany et al. present VMFlockMS [33], a migration service optimized for cross-datacenter transfer and instantiation of groups of related VM images that form an application-level solution (e.g., a three-tier web application). VMFlockMS uses two techniques: 1) data deduplication within the VMFlock to be migrated, among the VMs in the VMFlock and against the data already present at the destination datacenter; and 2) accelerated instantiation of the application at the target datacenter after transferring only a partial set of data blocks, with prioritization of the remaining data based on previously observed access patterns originating from the running VMs. A scalable and high performance migration service can be achieved.

Ye et al. [34] present a resource-reservation based live migration framework consisting of a migration decision maker, migration controller, resource reservation controller and resource monitor. The reserved resources on the source machine include CPU (on the Xen virtualization platform) and memory (by dynamically adjusting VM memory size), while on the target machine they include the whole virtual machine's resources. Three metrics quantify the efficiency: downtime, total time, and workload performance overhead.

Kikuchi et al. [35] constructed a performance model of concurrent live migrations in virtualized datacenters. First, data is collected while live migrations execute simultaneously; then a performance model representing the performance characteristics of live migration is constructed using the PRISM probabilistic model checker. This approach makes it possible to orchestrate management operations and determine the appropriate configuration to avoid undesirable situations in a cloud system from a probabilistic viewpoint.

Deshpande et al. [36] present the design, implementation, and evaluation of a de-duplication based approach to perform concurrent live migration of co-located VMs. This approach transmits memory content that is identical across VMs only once during migration, significantly reducing both the total migration time and network traffic.
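The delta-compression idea used by Svard et al. [31] can be illustrated with a short sketch. Here zlib stands in for the real codec and the page cache is simplified; only the XOR-then-compress structure reflects the technique:

```python
import random
import zlib

page_cache = {}   # page_id -> bytes of the version last sent to the destination

def encode_page(page_id, data):
    """Return the compressed bytes to send for this page."""
    prev = page_cache.get(page_id)
    page_cache[page_id] = data
    if prev is None or len(prev) != len(data):
        return zlib.compress(data)                    # first transfer: full page
    delta = bytes(a ^ b for a, b in zip(data, prev))  # mostly zero bytes
    return zlib.compress(delta)                       # re-send: tiny delta

# A re-dirtied page that differs in a single byte yields a tiny payload:
rng = random.Random(42)
v1 = bytes(rng.getrandbits(8) for _ in range(4096))
v2 = bytearray(v1)
v2[100] ^= 0xFF
first = encode_page(9, v1)           # roughly the full 4 KiB (random data)
second = encode_page(9, bytes(v2))   # a few dozen bytes after compression
```

Because re-sends shrink dramatically, the effective migration throughput rises, which is exactly what shortens downtime on low-bandwidth links.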

They used the QEMU/KVM Linux platform for live gang migration of virtual machines.

Deshpande et al. [37] present an inter-rack live migration (IRLM) system. IRLM reduces the traffic load on the core network links during mass VM migration through distributed deduplication of the VMs' memory images. The initial IRLM prototype migrates multiple QEMU/KVM VMs within a Gigabit Ethernet cluster with 10GigE core links. For a configuration of 6 hosts per rack and 4 VMs per host, IRLM can reduce the amount of data transferred over the core links during migration.

4) Specific Cloud Environments:

Elmroth et al. [38] presented two novel interface and architectural contributions that make it easier for cloud computing software to use inter- and intra-site VM migration, and improved inter- and intra-site monitoring of VM resources, both on an infrastructural and on an application-specific level.

Celesti et al. [39] propose a Composed Image Cloning (CIC) methodology to reduce consumption of bandwidth and cloud resources. This approach does not consider the disk image of a VM as a single monolithic block, but as a combination of "composable" and "user data" blocks.

Suen et al. [40] propose and compare techniques that can reduce the transfer bandwidth and storage cost of the data involved in the migration process.

5) Application / Workload Specific Technologies:

Migration technology has limitations when it is used on larger application systems such as SAP ERP, since such systems consume a large amount of memory. Hacking and Hudzia [41] present the design, implementation, and evaluation of a system for supporting the transparent, live migration of virtual machines running typical large enterprise application workloads. It minimizes service disruption by using a delta compression algorithm for VM memory transfer, as well as introducing an adaptive warm-up phase in order to reduce the rigidity of migrating large VMs.

Sato et al. [42] presented a VM relocation algorithm for data intensive applications on a virtual machine in a geographically distributed environment. The proposed algorithm determines the optimal location of a VM for accessing a target file, minimizing the total expected file access time by solving DAG shortest path search problems, on the assumption that the network throughput between sites, the status of the WAN and the LANs connected to it, and the sizes and locations of target files are given. It can achieve higher performance than simple techniques.

Piao et al. [43] present a network-aware VM placement and migration approach for data intensive applications in cloud computing environments. The proposed approach places the VMs on physical machines with consideration of the network conditions between the physical machines and the data storage.

V. Shrivastava et al. [44] introduce AppAware, a novel, computationally efficient scheme for incorporating (1) inter-VM dependencies and (2) the underlying network topology into VM migration decisions. AppAware is a greedy algorithm with heuristics for assigning VMs to physical machines one at a time, while trying to minimize the cost that results from the mapping at each step. They compared metrics used for the evaluation of AppAware, such as the total traffic volume transported by the data center network once all overloaded VMs have been assigned to physical machines using one of the methods. Using simulations, they show that AppAware decreases network traffic by up to 81% compared to a well known alternative VM migration method that is not application-aware.

H. Liu et al. [45] construct two application-oblivious models for cost prediction by using learned knowledge about the workloads at the hypervisor level. It is the first model of VM migration costs in terms of both performance and energy. They validate the models by conducting a large set of experiments. The evaluation results demonstrate the effectiveness of model-guided live migration in both performance and energy costs. Modeling the performance of migration involves several factors: the size of VM memory, the workload characteristic (which determines the memory dirtying rate), the network transmission rate, and the migration algorithm (different configurations of the migration algorithm mean great variations in migration performance). The most important challenge is to correctly characterize the memory access pattern of each running workload.

6) Other Technologies:

Nocentino et al. [46] propose a novel dependency-aware approach to live virtual machine migration and present the results of an initial investigation into its ability to reduce migration latency and overhead. The approach uses a tainting mechanism originally developed for intrusion detection. Dependency information is used to distinguish processes that create direct or indirect external dependencies during live migration.

Akoush et al. [47] show that link speed and page dirty rate are the major factors impacting migration behavior. These factors have a non-linear effect on migration performance, largely because of the hard stop conditions that force migration to its final stop-and-copy stage. Migration times should be accurately predicted to enable more dynamic and intelligent placement of VMs without degrading performance.

The "address-warping" problem is one of the difficulties in wide-area migration: the address of the VM warps from the source server to the destination server, which complicates the status of the WAN and of the LANs connected to the WAN. Kanada et al. [48] propose two solutions to this problem: 1) switching an address-translation rule and 2) switching multiple virtual networks (both analogous to paging in memory virtualization).

Wood et al. [49] present the cloud framework CloudNet, consisting of cloud computing platforms linked with a VPN-based network infrastructure to provide seamless and secure connectivity between enterprise and cloud data center sites. CloudNet provides optimized support for live WAN migration of virtual machines that is beneficial over low bandwidth and high latency Internet links; it minimizes the cost of transferring storage and virtual machine memory during migrations. At the heart of CloudNet is a Virtual Cloud Pool (VCP) abstraction that enables server resources across data centers and cloud providers to be logically grouped into a single server pool.
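The deduplication principle behind gang migration [36], [37], where identical page content crosses the network once and is afterwards referenced by its hash, can be sketched as follows (function and variable names are illustrative, not from those systems):

```python
import hashlib

def dedup_transfer(vms):
    """vms: {vm_id: [page_bytes, ...]}; returns (payloads_sent, references)."""
    sent = {}   # content hash -> page payload actually transferred
    refs = []   # (vm_id, page_index, content_hash) records sent instead of data
    for vm_id, pages in vms.items():
        for idx, page in enumerate(pages):
            h = hashlib.sha256(page).hexdigest()
            if h not in sent:
                sent[h] = page   # first copy of this content crosses the link
            refs.append((vm_id, idx, h))
    return sent, refs
```

Co-located VMs booted from the same image share many pages (zero pages, kernel and library text), so the payloads actually sent can be far smaller than the sum of all VM memories.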

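A back-of-envelope model of the interplay between link speed and page dirty rate, in the spirit of (but not taken from) the analysis in [47]: if the VM dirties memory at D bytes/s and the link moves B bytes/s, each pre-copy round re-sends roughly a fraction r = D/B of what the previous round sent:

```python
def precopy_estimate(mem_bytes, bandwidth, dirty_rate, rounds):
    """Geometric-series estimate of pre-copy cost; assumes dirty_rate < bandwidth."""
    r = dirty_rate / bandwidth                     # per-round shrink factor
    sent = sum(mem_bytes * r**i for i in range(rounds))
    final_dirty = mem_bytes * r**rounds            # what stop-and-copy must move
    downtime = final_dirty / bandwidth
    total_time = sent / bandwidth + downtime
    return sent, downtime, total_time

# Example: a 4 GiB VM over a 1 Gb/s link (125 MB/s) dirtying 100 MB/s.
sent, downtime, total = precopy_estimate(4 * 2**30, 125e6, 100e6, rounds=10)
```

When r approaches 1 the series converges slowly and the final dirty set stays large, which matches the observation that link speed and dirty rate affect migration performance non-linearly.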
A VPN based on Multi-Protocol Label Switching (MPLS) is used in CloudNet to create the abstraction of a private network and address space shared by multiple data centers. The hypervisor's memory migration is coordinated with a disk replication system by CloudNet so that the entire VM state can be transferred if needed. CloudNet is optimized to reduce the amount of data transferred, the total migration time and the application downtime.

Huang et al. [50] present a live migration benchmark implementation, Virt-LM, for comparing live migration performance among different software and hardware environments in a data center scenario. Metrics, workloads, impartial scoring methodology, stability, compatibility, and usability are the design goals of Virt-LM.

Resource availability can help to make better decisions on when to migrate a VM and how to allocate the necessary resources. Wu and Zhao [51] create a performance model using statistical methods such as regression. It can be used to predict migration time and guide resource management decisions. They experimented by migrating a Xen-based VM running CPU, memory, or I/O intensive applications while allocating different amounts of CPU share. It shows that the resources available to live migration have an impact on migration time.

Jing et al. [52] propose an optimized migration framework to reduce migration downtime, based on an analysis of the memory transfer in the real-time migration of the current Xen virtual machine. This framework makes use of a layered copy algorithm and a memory compression algorithm, optimizes the time and space complexity of real-time migration, greatly reduces migration downtime and improves migration performance.

Ashino et al. [53] propose a VM migration method to solve the problems of a guest OS failing to boot on the destination after migration, loading device drivers, or adjusting its device configuration. Their EDAMP migration method, still in development, only overwrites files and does not destroy the device drivers. EDAMP can be used in multiple cloud services and integrated into one hypervisor.

V. RESEARCH CHALLENGES IN LIVE VM MIGRATION

A. Low Bandwidth over WAN

A virtual machine can be scheduled for execution at geographically disparate cloud locations depending upon the cost of computation and the load at these locations. However, trans-locating a live VM across high-latency, low-bandwidth wide area networks (WAN) within a 'reasonable' time is nearly impossible due to the large size of the VM image [27].

B. Virtual Machines with Different Types of Workload

There is a limitation of migration technology when it is used on larger application systems such as SAP ERP. Such systems consume a large amount of memory [41].

C. Link Speed and Page Dirty Rate

Link speed and page dirty rate are the major factors impacting migration behavior. Link capacity is inversely proportional to total migration time and downtime. Page dirty rate is the rate at which memory pages in the VM are modified, which, in turn, directly affects the number of pages that are transferred in each pre-copy iteration [47].

D. Available Resources

Resource availability can help to make better decisions on when to migrate a VM and how to allocate resources [51].

E. Address Warping

The address-warping problem is one of the difficulties in wide-area migration: the address of the VM warps from the source server to the destination server, which complicates the status of the WAN and of the LANs connected to the WAN [48].

There are some other challenges, such as network faults [22], overloaded VMs [44], memory and data intensive applications [32, 42, 43], and consumption of bandwidth and cloud resources [39].

VI. CONCLUSION AND FUTURE WORK

This paper is a survey of live virtual machine migration techniques. Live migration involves transferring a running virtual machine across distinct physical hosts. There are many techniques which attempt to minimize the downtime and to provide better performance in low bandwidth environments. We have categorized the papers, and there is a need to compare the techniques in each category to understand their strengths and weaknesses. In future, we plan to propose a performance model based on the research gaps identified through the limitations. This will be helpful for reducing the migration time under heavy workload. We would want to parallelize the migration process using MapReduce so that data can be distributed among various places. The other problem in live migration is low bandwidth; better utilization of the network bandwidth can be achieved by allocating it dynamically.

REFERENCES

[1] Google, "Google App Engine", (2012), [online]. Available: cloud.google.com [Nov 1, 2012].
[2] Amazon, "Amazon Elastic Compute Cloud (Amazon EC2)", (2012), [online]. Available: aws.amazon.com/ec2/ [Nov 1, 2012].
[3] Microsoft, "Windows Azure", (2012), [online]. Available: windowsazure.com [Nov 1, 2012].
[4] IBM, "SmartCloud", (2012), [online]. Available: ibm.com/cloud-computing [Nov 1, 2012].
[5] P. Mell and T. Grance, "The NIST definition of cloud computing (draft)," NIST Special Publication, vol. 800, p. 145.
[6] A. Desai, "Virtual Machine", (2012), [online]. Available: http://searchservervirtualization.techtarget.com/definition/virtualmachin
[7] VMware, "vSphere ESX and ESXi Info Center", (2012), [online]. Available: vmware.com/products/vsphere/esxi-and-esx [Nov 1, 2012].
[8] Microsoft, "Windows Virtual PC", (2012), [online]. Available: http://www.microsoft.com/windows/virtual-pc/ [Nov 1, 2012].
[9] Xen, "Xen Hypervisor", (2012), [online]. Available: http://www.xen.org/products/xenhyp.html [Nov 1, 2012].
[10] Microsoft, "Hyper-V Server 2012", (2012), [online]. Available: microsoft.com/server-cloud/hyper-v-server/ [Nov 1, 2012].
[11] KVM, "Kernel-based Virtual Machine", (2012), [online]. Available: linux-kvm.org [Nov 1, 2012].
[12] Oracle, "VirtualBox", (2012), [online]. Available: .org [Nov 1, 2012].
[13] W. Voorsluys, J. Broberg, S. Venugopal, and R. Buyya, "Cost of Virtual Machine Live Migration in Clouds: A Performance Evaluation," in 1st

[14] P. S. Constantine, C. Ramesh, P. Ben, C. Jim, S. L. Monica, and R. Mendel, "Optimizing the migration of virtual computers," in 5th Symposium on Operating Systems Design and Implementation, SIGOPS Oper. Syst. Rev., vol. 36, no. SI, pp. 377-390, 2002.
[15] C. Christopher, F. Keir, H. Steven, H. Jacob Gorm, J. Eric, L. Christian, P. Ian, and W. Andrew, "Live migration of virtual machines," in 2nd Symposium on Networked Systems Design and Implementation, USENIX Association, 2005.
[16] N. Michael, L. Beng-Hong, and H. Greg, "Fast transparent migration for virtual machines," in USENIX Annual Technical Conference, Anaheim, CA: USENIX Association, 2005.
[17] H. Wei, G. Qi, L. Jiuxing, and D. K. Panda, "High performance virtual machine migration with RDMA over modern interconnects," in IEEE International Conference on Cluster Computing, 2007, pp. 11-20.
[18] L. Yingwei, Z. Binbin, W. Xiaolin, W. Zhenlin, S. Yifeng, and C. Haogang, "Live and incremental whole-system migration of virtual machines using block-bitmap," in IEEE International Conference on Cluster Computing, 2008, pp. 99-106.
[19] B. Robert, K. Evangelos, F. Anja, and S. Harald, "Live wide-area migration of virtual machines including local persistent state," in 3rd International Conference on Virtual Execution Environments, San Diego, California, USA: ACM, 2007.
[20] Y. Kuno, K. Nii, and S. Yamaguchi, "A study on performance of processes in migrating virtual machines," in 10th International Symposium on Autonomous Decentralized Systems, ISADS 2011, 2011, pp. 567-572.
[21] X. Feng, J. Tang, X. Luo, and Y. Jin, "A performance study of live VM migration technologies: VMotion vs XenMotion," The International Society for Optical Engineering, 2011.
[22] R. H. Michael, D. Umesh, and G. Kartik, "Post-copy live migration of virtual machines," SIGOPS Oper. Syst. Rev., vol. 43, pp. 14-26, 2009.
[23] R. H. Michael and G. Kartik, "Post-copy based live virtual machine migration using adaptive pre-paging and dynamic self-ballooning," in ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments, Washington, DC, USA: ACM, 2009.
[24] L. Weining and F. Tao, "Live migration of virtual machine based on recovering system and CPU scheduling," in 6th IEEE Joint International Information Technology and Artificial Intelligence Conference, Piscataway, NJ, USA, May 2009, pp. 303-307.
[25] L. Haikun, J. Hai, L. Xiaofei, H. Liting, and Y. Chen, "Live migration of virtual machine based on full system trace and replay," in 18th ACM International Symposium on High Performance Distributed Computing, Garching, Germany: ACM, 2009.
[26] P. Svard, J. Tordsson, B. Hudzia, and E. Elmroth, "High performance live migration through dynamic page transfer reordering and compression," in 3rd IEEE International Conference on Cloud Computing Technology and Science, CloudCom 2011, pp. 542-548.
[27] S. K. Bose, S. Brock, R. Skeoch, and S. Rao, "CloudSpider: Combining replication with scheduling for optimizing live migration of virtual machines across wide area networks," in 11th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, CCGrid 2011, May 2011, pp. 13-22.
[28] S. K. Bose, S. Brock, R. Skeoch, N. Shaikh, and S. Rao, "Optimizing live migration of virtual machines across wide area networks using integrated replication and scheduling," in 2011 IEEE International Systems Conference, SysCon 2011, pp. 97-102.
[29] J. Hai, D. Li, W. Song, S. Xuanhua, and P. Xiaodong, "Live virtual machine migration with adaptive memory compression," in IEEE International Conference on Cluster Computing and Workshops, CLUSTER '09, pp. 1-10.
[30] M. Fei, L. Feng, and L. Zhen, "Live virtual machine migration based on improved pre-copy approach," in IEEE International Conference on Software Engineering and Service Sciences (ICSESS), 2010, pp. 230-233.
[31] S. Petter, H. Benoit, T. Johan, and E. Erik, "Evaluation of delta compression techniques for efficient live migration of large virtual machines," in 7th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments, California, USA: ACM, 2011.
[32] K. Z. Ibrahim, S. Hofmeyr, C. Iancu, and E. Roman, "Optimized pre-copy live migration for memory intensive applications," in International Conference for High Performance Computing, Networking, Storage and Analysis (SC), 2011, pp. 1-11.
[33] S. Al-Kiswany, D. Subhraveti, P. Sarkar, and M. Ripeanu, "VMFlock: Virtual machine co-migration for the cloud," in IEEE International Symposium on High Performance Distributed Computing, 2011, pp. 159-170.
[34] Y. Kejiang, J. Xiaohong, H. Dawei, C. Jianhai, and W. Bei, "Live migration of multiple virtual machines with resource reservation in cloud computing environments," California, USA, 2011, pp. 267-274.
[35] S. Kikuchi and Y. Matsumoto, "Performance modeling of concurrent live migration operations in cloud computing systems using PRISM probabilistic model checker," in IEEE 4th International Conference on Cloud Computing, CLOUD 2011, July 2011, pp. 49-56.
[36] D. Umesh, W. Xiaoshuang, and G. Kartik, "Live gang migration of virtual machines," in 20th International Symposium on High Performance Distributed Computing, San Jose, California, USA: ACM, 2011.
[37] D. Umesh, K. Unmesh, and G. Kartik, "Inter-rack live migration of multiple virtual machines," in 6th International Workshop on Virtualization Technologies in Distributed Computing, Delft, Netherlands.
[38] E. Elmroth and L. Larsson, "Interfaces for placement, migration, and monitoring of virtual machines in federated clouds," in 8th International Conference on Grid and Cooperative Computing, GCC 2009, pp. 253-260.
[39] A. Celesti, F. Tusa, M. Villari, and A. Puliafito, "Improving virtual machine migration in federated cloud environments," in 2nd International Conference on Evolving Internet, Internet 2010, pp. 61-67.
[40] S. Chun-Hui, M. Kirchberg, and L. Bu Sung, "Efficient migration of virtual machines between public and private cloud," in IEEE Third International Conference on Cloud Computing Technology and Science (CloudCom), Los Alamitos, CA, USA, Nov 2011, pp. 549-553.
[41] H. Stuart and H. Benoit, "Improving the live migration process of large enterprise applications," in 3rd International Workshop on Virtualization Technologies in Distributed Computing, Barcelona, Spain: ACM, 2009.
[42] K. Sato, H. Sato, and S. Matsuoka, "A model-based algorithm for optimizing I/O intensive applications in clouds using VM-based migration," in 9th IEEE/ACM International Symposium on Cluster Computing and the Grid, CCGRID 2009, pp. 466-471.
[43] J. T. Piao and J. Yan, "A network-aware virtual machine placement and migration approach in cloud computing," in 9th International Conference on Grid and Cloud Computing, GCC 2010, pp. 87-92.
[44] V. Shrivastava, P. Zerfos, L. Kang-won, H. Jamjoom, L. Yew-Huey, and S. Banerjee, "Application-aware virtual machine migration in data centers," in IEEE INFOCOM, 2011, pp. 66-70.
[45] L. Haikun, X. Cheng-Zhong, J. Hai, G. Jiayu, and L. Xiaofei, "Performance and energy modeling for live migration of virtual machines," in 20th International Symposium on High Performance Distributed Computing, San Jose, California, USA: ACM, 2011.
[46] N. Anthony and M. R. Paul, "Toward dependency-aware live virtual machine migration," in 3rd International Workshop on Virtualization Technologies in Distributed Computing, Barcelona, Spain: ACM, 2009.
[47] A. Sherif, S. Ripduman, R. Andrew, W. M. Andrew, and H. Andy, "Predicting the performance of virtual machine migration," in IEEE International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems, 2010.
[48] Y. Kanada and T. Tarui, "A "network-paging" based method for wide-area live-migration of VMs," in International Conference on Information Networking, ICOIN 2011, Jan 2011, pp. 268-272.
[49] T. Wood, P. Shenoy, K. K. Ramakrishnan, and J. Van Der Merwe, "CloudNet: Dynamic pooling of cloud resources by live WAN migration of virtual machines," in 2011 ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments, VEE 2011, pp. 121-132.
[50] H. Dawei, Y. Deshi, H. Qinming, C. Jianhai, and Y. Kejiang, "Virt-LM: A benchmark for live migration of virtual machine," in Second Joint WOSP/SIPEW International Conference on Performance Engineering, Karlsruhe, Germany: ACM, 2011.
[51] Y. Wu and M. Zhao, "Performance modeling of virtual machine live migration," in IEEE 4th International Conference on Cloud Computing, CLOUD 2011, pp. 492-499.
[52] J. Yang, "Key technologies and optimization for dynamic migration of virtual machines in cloud computing," in International Conference on Intelligent Systems Design and Engineering Applications, ISDEA 2012, pp. 643-647.
[53] Y. Ashino and M. Nakae, "Virtual machine migration method between different hypervisor implementations and its evaluation," in 26th IEEE International Conference on Advanced Information Networking and Applications Workshops, WAINA 2012, pp. 1089-1094.
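The interplay described in Section V.C between link capacity, page dirty rate, total migration time, and downtime can be made concrete with a first-order model of iterative pre-copy migration. This is an illustrative sketch, not a model taken from any of the surveyed papers; the function name, units, and the stop threshold are assumptions:

```python
# Illustrative first-order model of iterative pre-copy live migration.
# Iteration 0 pushes all of RAM; every later iteration re-sends the
# pages dirtied while the previous iteration was being transferred.

def precopy_estimate(mem_mb, bw_mbps, dirty_mbps, stop_mb=32.0, max_iters=30):
    """Return (total_time_s, downtime_s, precopy_iterations).

    bw_mbps and dirty_mbps are in MB/s; stop_mb is the remaining-data
    threshold at which the VM is paused for the final stop-and-copy.
    """
    if dirty_mbps >= bw_mbps:
        raise ValueError("dirty rate >= link capacity: pre-copy cannot converge")
    remaining = float(mem_mb)
    total, iters = 0.0, 0
    while remaining > stop_mb and iters < max_iters:
        t = remaining / bw_mbps      # time to send the current dirty set
        total += t
        remaining = dirty_mbps * t   # data dirtied during that transfer
        iters += 1
    downtime = remaining / bw_mbps   # final copy with the VM paused
    return total + downtime, downtime, iters
```

With a hypothetical 2048 MB VM on a 128 MB/s link, this toy model gives 3 pre-copy rounds at a 32 MB/s dirty rate but 6 rounds and a roughly 50% longer total migration time at 64 MB/s, mirroring the inverse dependence on link capacity and the direct dependence on dirty rate described in Section V.C.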

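Wu and Zhao's regression-based idea [51] of predicting migration time from observed migrations can be sketched as follows. The training numbers, the 1/cpu-share feature, and the function names are all hypothetical; the point is only to show a least-squares fit being reused as a predictor for resource-management decisions:

```python
# Illustrative regression model in the spirit of Wu and Zhao [51]:
# predict migration time from the CPU share available to migration.
import numpy as np

# Synthetic observations (made-up numbers, not data from [51]).
cpu_share = np.array([0.2, 0.4, 0.6, 0.8, 1.0])      # fraction of a core
mig_time = np.array([95.0, 52.0, 38.0, 30.0, 26.0])  # observed seconds

# Migration time grows roughly like 1/share, so fit a line in that
# transformed feature: time ~ a * (1 / share) + b.
a, b = np.polyfit(1.0 / cpu_share, mig_time, deg=1)

def predict_migration_time(share):
    """Predict migration time in seconds for a given CPU share."""
    return a / share + b
```

A scheduler could query such a predictor before deciding whether a migration fits inside a maintenance window, which is the kind of resource-management guidance [51] has in mind.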
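The memory-compression step used by the frameworks in [29] and [52] to cut the data sent per pre-copy round can be illustrated with a generic page compressor. zlib here is purely a stand-in (those systems use their own algorithms), and PAGE_SIZE and compress_page are hypothetical names for this sketch:

```python
# Illustrative page-level memory compression, in the spirit of the
# compression stages in [29] and [52].
import zlib

PAGE_SIZE = 4096  # bytes; a typical x86 page

def compress_page(page: bytes) -> bytes:
    """Compress one page, falling back to the raw bytes when
    compression would not shrink it (e.g. incompressible data)."""
    packed = zlib.compress(page, level=1)  # fast level suits live migration
    return packed if len(packed) < len(page) else page

# Zero-filled pages, common in guest RAM, shrink to a handful of
# bytes, so far less data crosses the migration link for them.
```

A real sender would also carry a per-page flag telling the receiver whether to decompress; that framing, and the choice of a faster or adaptive algorithm as in [29], is omitted here.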