
[Figure: processes running on Host A and Host B, each host with its own OS, connected by an interconnect]

Migrating LinuX Containers Using CRIU
Simon Pickartz, Niklas Eiling, Stefan Lankes, Lukas Razik, and Antonello Monti
RWTH Aachen University, Aachen, Germany

Why do we need Migration?
- Resiliency
  - Increasing hard- and software failures with growing cluster sizes
  - Evacuation of faulty nodes ⇒ no whole job aborts
- Load balancing
  - Applications' scalability is usually limited by a single resource
  - Co-scheduling can improve the overall cluster utilization
  - Revocation of an exclusive node assignment
  - Dynamic schedules necessary

Migrating LinuX Containers Using CRIU | Simon Pickartz et al. | ACS | 06/23/2016

Why should we care about containers?
[Figure: virtual machine stack (apps, guest kernels, hypervisor, host kernel, hardware) next to the container stack (apps, abstraction layer, shared kernel, hardware)]
Virtual Machines
- Virtualization of hardware
- Multiple kernels
- Device emulation
- Requires special hardware support for nearly native performance
- Flexible in terms of kernel / OS choice
Containers
- Re-use host kernel
- User-space instances defined and separated by so-called cgroups and namespaces
- No special hardware support required
- Restricts kernel / OS choice
⇒ Containers are fast and light-weight

Agenda
- Background
  - Container-based Virtualization in terms of HPC
  - Libvirt
  - Checkpoint / Restore in Userspace (CRIU)
- Lxctools Driver
  - Integration into Libvirt
  - Migration Based on CRIU
- Evaluation
  - lxctools Driver Overhead
  - Migration Time
- Conclusion

Container-based Virtualization in terms of HPC
[Figure: container stack (apps, abstraction layer, shared kernel, hardware)]
- Isolation at OS level, which is weaker than with hypervisors ⇒ a container may crash the whole system
- No overhead introduced by multiple kernels
- Flexible resource allocation
- Potentially better I/O and CPU performance than common hypervisor-based approaches
⇒ Complies with the goals of HPC

Namespaces and Cgroups
- Namespaces and cgroups are features of the vanilla Linux kernel
- Namespaces separate processes by groups
- Cgroups control, limit, and observe the resource usage of processes

Namespaces
  pid     Process IDs
  net     Network configuration
  ipc     Inter-process communication
  mnt     Mountpoints
  uts     Hostname
  user    User and group IDs

Cgroup subsystems
  cpu     CPU time and utilization
  cpuset  Number of CPU cores
  blkio   Limit / observe I/O accesses
  memory  Limit memory consumption / utilization

LinuX Containers
- Infrastructure for container projects (e.g. Docker)
- Linux Containers (LXC) use cgroups and namespaces
- User-space solution ⇒ needs no customized kernels
- Ships with
  - C-API
  - Command-Line Interface (CLI)

Libvirt
- Domain (VM etc.) management library written in C
- Support for a variety of virtualization solutions, e.g. QEMU, VMware ESX, OpenVZ
- Domain Independent Layer
  - ⇒ Independence of changes to the virtualization layer
  - Implements common functionality (e.g., configuration via XML)
- Domain Dependent Layer
  - Implemented in terms of drivers
  - Remote driver enabling remote management of domains
[Figure: libvirt architecture: user interactions (virsh, API) and requests from the network (libvirtd) enter the domain-independent layer, which sits on top of the libvirt drivers managing the domains and the remote driver leading to the network]

Checkpoint / Restore in Userspace (CRIU)
- Checkpoint / Restore (C/R) without kernel modifications
- C/R of processes or groups thereof
- Supports incremental checkpoints
- Uses ptrace to freeze a process and inject parasite code
- Gathers file descriptors, memory maps, and register contents
- Requires root privileges
- Provides direct memory transfer to a page-server running on remote nodes
[Figure: a process is dumped to and restored from the file system]

The lxctools Driver
- Utilizes LXC and CRIU for usage by libvirt
- Minor libvirt source code modifications
  - Make the driver public to libvirt (7 source files)
  - Link against LXC's C-API
- Remote management capabilities by libvirt
- Implemented features
  - Start / stop containers
  - Checkpoint / restore containers
  - Migrate containers (cold- and live-migration)
  - Configuration of LXC instances via XML
- The lxctools driver is not related to libvirt's lxc driver
  - libvirt's lxc driver provides no migration support
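The namespace types listed on the Namespaces and Cgroups slide can be inspected from user space. The following is a minimal sketch, not part of the talk, that assumes the usual Linux convention that every entry in /proc/<pid>/ns is a symbolic link of the form type:[inode]; the helper names read_namespaces and share_namespace are hypothetical and chosen only for illustration.

```python
import os
import re

# Link targets in /proc/<pid>/ns look like "pid:[4026531836]".
_NS_LINK = re.compile(r"^(?P<kind>\w+):\[(?P<inode>\d+)\]$")


def read_namespaces(ns_dir="/proc/self/ns"):
    """Return a dict mapping each namespace entry (pid, net, ...) to its inode.

    ns_dir is a parameter so that another process can be inspected
    (e.g. "/proc/1234/ns") or a fake directory can be used for testing.
    """
    result = {}
    for entry in sorted(os.listdir(ns_dir)):
        path = os.path.join(ns_dir, entry)
        if not os.path.islink(path):
            continue  # skip anything that is not a namespace link
        match = _NS_LINK.match(os.readlink(path))
        if match:
            result[entry] = int(match.group("inode"))
    return result


def share_namespace(a, b, kind):
    """True if both namespace dicts refer to the same namespace of this kind."""
    return kind in a and kind in b and a[kind] == b[kind]
```

On a Linux host, comparing read_namespaces() with read_namespaces("/proc/<pid>/ns") for a process inside a container would typically show different inode numbers for pid, net, ipc, mnt, and uts, which is the separation the slide describes.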
Migration by lxctools Driver
- Implements skeleton of libvirt's domain-independent layer
- Reduce file system overhead by using tmpfs
- Migration in four steps
  1. Create tmpfs on both nodes
  2. Start page-server on target node
  3. Dump process and transfer memory content via page-server; transfer other checkpointing data via file-transfer
  4. Restore process from dump on target system and remove tmpfs
[Figure: migration data flow between source node and destination node: the container's memory content and metadata, with the file systems on both nodes]

Live-Migration by lxctools Driver
- Live-migration leverages incremental dumps
- Static and dynamic iteration counts are supported
- Pre-dumps only for memory content
- Common pre-copy live-migration approach
[Figure: timeline: pre-dumps, final dump (stop domain), restore domain, resume domain]

lxctools Driver Overhead
- Systems:
  - 2 NUMA nodes, each with 2 sockets
  - Intel IvyBridge CPUs (E5-2650 v2), 8 cores (16 threads) @ 2.6 GHz
  - Gigabit Ethernet
  - Fedora 23, Linux Kernel v4.4.4-301, libvirt v1.2.16
- Comparison: libvirt & lxctools vs. LXC CLI
  - Averaged over 200 runs
  - Save / restore with 1 GiB memory load
  - Libvirt support at the cost of a few tens of milliseconds
[Figure: bar chart of execution time in s (0 to 1) per command (start, shutdown, save, restore) for LXC and libvirt]

Comparison: Migration Time
[Figure: two panels comparing lxctools and QEMU / KVM. Left: "Empty VM / Container", migration time in s over domain size in GiB (0.25, 0.5, 1, 2, 4). Right: "Migration with Workload", migration time in s over workload in GiB (Min, 0.25, 0.5, 1, 2, 4)]

Cold vs. Live Migration
- 1 GiB memory load
- Averaged over 20 runs
- In this case: only two pre-dumps
- Live migration downtime of 1.96 s
[Figure: execution time in seconds (0 to 12) for cold and live migration, broken down into pre-dump, dump, and restore]

Conclusion
- First libvirt driver with migration support for LXC containers
- Working prototype
  - Support for container management via libvirt
  - Minimal overhead compared to the LXC CLI
  - Promising migration times
- Future work
  - Reduction of downtimes for live-migration
  - RDMA migration
- Open source: https://github.com/RWTH-OS/libvirt

Thank you for your kind attention!
Simon Pickartz et al. – [email protected]
Institute for Automation of Complex Power Systems
E.ON Energy Research Center, RWTH Aachen University
Mathieustraße 10
52074 Aachen
www.eonerc.rwth-aachen.de
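The four steps on the "Migration by lxctools Driver" slide can be sketched as a command plan. This is an illustration only and not the driver's implementation, which links against LXC's C-API: it assumes a plain process tree migrated with the criu command-line tool, rsync as the file-transfer mechanism, and a hypothetical image directory, port, and host name. The function only builds the command lines and does not execute them.

```python
from typing import List, NamedTuple


class Step(NamedTuple):
    node: str        # "source" or "target"
    argv: List[str]  # command line to run on that node


def plan_cold_migration(pid: int, target_host: str,
                        image_dir: str = "/mnt/criu-images",  # assumed path
                        port: int = 9876) -> List[Step]:      # assumed port
    """Build the command sequence for a disk-less cold migration."""
    mount = ["mount", "-t", "tmpfs", "tmpfs", image_dir]
    umount = ["umount", image_dir]
    return [
        # Step 1: create a tmpfs on both nodes to avoid disk I/O.
        Step("source", mount),
        Step("target", mount),
        # Step 2: start the page-server on the target node.
        Step("target", ["criu", "page-server",
                        "--images-dir", image_dir, "--port", str(port)]),
        # Step 3a: dump the process tree; memory pages go to the page-server.
        Step("source", ["criu", "dump", "--tree", str(pid),
                        "--images-dir", image_dir, "--page-server",
                        "--address", target_host, "--port", str(port)]),
        # Step 3b: transfer the remaining checkpoint data by file transfer
        # (rsync is an assumption; the slides only say "file-transfer").
        Step("source", ["rsync", "-a", image_dir + "/",
                        f"{target_host}:{image_dir}/"]),
        # Step 4: restore on the target, then remove the tmpfs again.
        Step("target", ["criu", "restore", "--restore-detached",
                        "--images-dir", image_dir]),
        Step("source", umount),
        Step("target", umount),
    ]
```

Printing each step, for example with shlex.join(step.argv), gives a readable checklist. Restoring a full LXC container needs additional options that this sketch leaves out.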