Live Container Migration Via Pre-Restore and Random Access Memory

2020 IEEE Intl Conf on Parallel & Distributed Processing with Applications, Big Data & Cloud Computing, Sustainable Computing & Communications, Social Computing & Networking (ISPA/BDCloud/SocialCom/SustainCom)

Zhixing Yu, Kejing He
Guangdong Key Laboratory of Computer Network, School of Computer Science and Engineering, South China University of Technology, Guangzhou, China

Chao Chen, Jian Wang
School of Computer Science and Engineering, South China University of Technology, Guangzhou, China

This work was supported by the Science and Technology Planning Project of Guangdong Province, China (2017B030306016), and the Special Support Program of Guangdong Province (201528004).

978-0-7381-3199-3/20/$31.00 ©2020 IEEE. DOI 10.1109/ISPA-BDCloud-SocialCom-SustainCom51426.2020.00039

Abstract: Container technology is increasingly being used for virtualization due to its ability to isolate the operating environment of the program. In a cloud computing environment, we need to migrate containers between different hosts for load balancing or downtime maintenance. However, during the migration process, the container will be temporarily shut down, and the service will be unavailable. Therefore, the time cost is an essential indicator to measure the quality of the migration process. To achieve live container migration, we propose a pre-restore method and a complete random access memory (RAM) based method to migrate containers. Extensive experiments validate the effectiveness of our methods in reducing downtime and improving the efficiency of container migration.

Index Terms: Container, Live migration, Downtime, Pre-restore, RAM

I. INTRODUCTION

Virtual machine migration is a method of moving a virtual machine on a source host to a target host and re-running the virtual machine on the target host. After years of research, virtual machine migration technology has been widely applied in practice [1], with some work focusing on Service Level Agreements (SLA) [2], some on energy efficiency [3], etc. In recent years, container technology, also known as the lightweight virtual machine, has attracted great attention because it is lightweight and easy to deploy. Some studies have shown that, in most cases, the performance of the container is better than or equal to that of the virtual machine [4]. In a real production environment, container migration generally comes from the requirement for load balancing or downtime maintenance.

In this paper, we focus on live migration (or real-time migration) of a container, which is a process transmitted from source hosts to destination hosts. In this case, there is no need to migrate the files used by the container process to the target host; only the container runtime memory and other metadata information need to be migrated. As shown in Fig. 1, the conventional container migration process includes three steps. First, the migration tool freezes the running container process in order to collect memory page information and metadata information (e.g., file descriptors, pipe parameters) and then dumps the information locally as image files. Second, all the local image files are transported from the source host to the destination host over the network. Finally, the migration tool re-runs the container by extracting the information from the image files on the target host.

[Figure 1: a source host and a destination host, each with a container and a file system; the container is dumped on the source, the image files are transported, and the container is restored on the destination.]
Fig. 1. Conventional container migration process.

There are four main metrics to measure the performance of live migration, including the three key time metrics for container migration shown in Fig. 2:

• total amount of data transferred, including the size of the container memory image and the metadata size.
• total migration time, which refers to the time from the start of the migration until the container is running again at the target host.
• downtime, which is the time from when the migrated container pauses on the source host until it re-runs on the target host.
• absolute downtime, which is the difference between downtime and network transmission time. Therefore, absolute downtime can eliminate the network quality factor compared to downtime.

[Figure 2: timeline with the events down, start transmit, finish transmit, and re-run; labeled spans are total migration time, downtime, transmitting, restarting, and absolute downtime (t_abs).]
Fig. 2. Time metrics for container migration.

However, the larger the image files being transferred, the longer the transmission time and downtime, which makes the migrated container service unavailable for a long time. Therefore, various methods have been proposed to improve the performance of migration from the perspective of the migrated file transfer mode. Dynamic self-expansion [1] and a compression-based method [5] were proposed to reduce the data to be transmitted. On the other hand, pre-copy [6] and post-copy [7] were proposed to effectively reduce downtime and accelerate the transport process of image files by dividing the migration into multiple phases.

In this paper, we propose two approaches to improve the performance of live container migration: the pre-restore method and the complete Random Access Memory (RAM) based method. The pre-restore method leverages the idea of mapping and merging the memory pages; the RAM-based method uses random access memory instead of disks to store the collection of image files during the migration process. The main contributions are summarized as follows:

• We design and implement a novel pre-restore method for live container migration, which processes the image files before transmission and reconstruction, reducing the downtime of the migrated container.
• We propose a novel complete RAM-based method for the migration process of all image files, which alleviates the disk I/O overhead.
• We implement the above two methods in the form of middleware running on the destination host and carry out experiments. Experimental results demonstrate that the proposed methods reduce the total migration time and absolute downtime effectively.

The rest of this paper is organized as follows. Section 2 provides the background and problem statement of live container migration. Section 3 elaborates on our proposed pre-

II. BACKGROUND AND PROBLEM STATEMENT

A. Background

1) Live migration: Live migration or real-time migration for virtual machines refers to the process of moving applications between different physical computers or cloud platforms while ensuring that client access is not interrupted [8]. Live container migration is a similar process. In [6], Andrey Mirkin, Alexey Kuznetsov, and Kir Kolyshkin first presented the checkpointing and restart mechanism for live container migration implemented in OpenVZ. During live container migration, the memory, file systems, and network connections required to run the container on bare hardware can be moved from the source host to the target host while ensuring consistent state. Live container migration is useful for maintaining high availability and fault tolerance of applications in the cloud environment, as well as for dynamic load balancing in a cluster server.

2) Pre-copy: A container needs to be dumped to disk before it can be further migrated to another server. Therefore, if the container to be migrated consumes a lot of memory, it will take a long time to dump, resulting in extra downtime. Long downtime means that the service is interrupted for a long time, which violates the SLA. Consequently, the main requirement for live container migration is to minimize application downtime.

Clark, Fraser, et al. proposed a pre-copy method to solve this problem [8]. As shown in Fig. 3, pre-copy mainly includes three phases. In the first phase, all the memory pages of the container are dumped into the image file and transferred to the destination host while the container keeps running. In the second phase, the memory page tracking technique is used to record the memory pages modified since the last iteration and dump them into the image file for migration. In the final phase, once the iteration stop condition is reached, the container stops, and all remaining memory pages and metadata information are copied and transferred to the target host.

3) Post-copy: Michael R. Hines, Umesh Deshpande, and Kartik Gopalan proposed the post-copy [7] method for virtual machine migration. The post-copy method is quite different from the pre-copy method, although they are both iterative migration algorithms. First, it migrates the virtual machine's smallest working set from the source host to the target host and then immediately resumes the virtual machine on the target host. After the recovery operation, a page fault interrupt occurs whenever a memory page that needs to be accessed does not exist, and the corresponding page information is then obtained from the source host through the network. Hirofuchi, Nakada, et al. [9] proposed a virtual machine migration method based on post-copy, which enables the virtual machine to be migrated automatically according to the change of resource usage, thus providing a higher performance
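
The three pre-copy phases described in the Background section can be illustrated with a small simulation. The sketch below is not the authors' implementation and does not use any real checkpoint tool: memory is modeled as a dictionary of pages, and the names `pre_copy`, `shrinking_workload`, `max_rounds`, and `stop_threshold` are hypothetical choices made only for illustration. The stop condition (a small dirty set or a round limit) is likewise an assumption.

```python
from typing import Callable, Dict, List, Set, Tuple

Memory = Dict[int, bytes]


def pre_copy(
    source: Memory,
    run_workload: Callable[[Memory, int], Set[int]],
    max_rounds: int = 5,
    stop_threshold: int = 2,
) -> Tuple[Memory, List[int], int]:
    """Simulate pre-copy: returns (target memory, pages sent per live
    round, pages sent while the container is frozen)."""
    target: Memory = {}
    sent_per_round: List[int] = []

    # Phase 1: the first round sends every page while the container runs.
    to_send: Set[int] = set(source)
    for round_no in range(max_rounds):
        for page in to_send:
            target[page] = source[page]
        sent_per_round.append(len(to_send))

        # The container keeps running during the transfer; the workload
        # callback stands in for dirty-page tracking and reports which
        # pages were modified in the meantime.
        to_send = run_workload(source, round_no)

        # Phase 2 repeats with only the dirty pages until the (assumed)
        # stop condition holds.
        if len(to_send) <= stop_threshold:
            break

    # Phase 3: the container is frozen and the remaining dirty pages are
    # copied. Only this step contributes to downtime.
    for page in to_send:
        target[page] = source[page]
    return target, sent_per_round, len(to_send)


def shrinking_workload(memory: Memory, round_no: int) -> Set[int]:
    """Hypothetical workload whose dirty set halves every round."""
    dirty = set(range(8 >> round_no))
    for page in dirty:
        memory[page] = f"round{round_no}-page{page}".encode()
    return dirty


if __name__ == "__main__":
    memory = {page: b"init" for page in range(16)}
    target, rounds, frozen = pre_copy(memory, shrinking_workload)
    print(rounds, frozen, target == memory)
```

With this hypothetical workload and 16 pages, the live rounds send 16, 8, and 4 pages, and only 2 pages remain to be copied while the container is frozen, which is why pre-copy shortens downtime when the dirty set shrinks between rounds.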
