2016 IEEE International Conference on Big Data (Big Data)

Container-Based Virtualization for Byte-Addressable NVM Data Storage

Ellis R. Giles
Rice University
Houston, Texas
[email protected]

Abstract—Container based virtualization is rapidly growing in popularity for cloud deployments and applications as a virtualization alternative due to the ease of deployment coupled with high performance. Emerging byte-addressable, non-volatile memories, commonly called Storage Class Memory or SCM, are promising both byte-addressability and persistence near DRAM speeds operating on the main memory bus. These new memory alternatives open up a new realm of applications that no longer have to rely on slow, block-based persistence, but can rather operate directly on persistent data using ordinary loads and stores through the cache hierarchy coupled with transaction techniques.

However, SCM presents a new challenge for container-based applications, which typically access persistent data through layers of block based file isolation. Traditional persistent data accesses in containers are performed through layered file access, which slows byte-addressable persistence and transactional guarantees, or through direct access to drivers, which do not provide isolation guarantees or security.

This paper presents a high-performance containerized version of byte-addressable, non-volatile memory (SCM) for applications running inside a container that solves performance challenges while providing isolation guarantees. We created an open-source container-aware loadable Kernel Module (LKM) called Containerized Storage Class Memory, or CSCM, that presents SCM for application isolation and ease of portability. We performed evaluation using micro-benchmarks, STREAM, and Redis, a popular in-memory data structure store, and found our CSCM driver has near the same memory throughput for SCM applications as a non-containerized application running on a host and much higher throughput than persistent in-memory applications accessing SCM through Docker Storage or Volumes.

I. INTRODUCTION

Docker [1] is a relatively new open-source implementation of container-based virtualization technology that has been gaining in popularity for quick and easy cloud deployments and for recent work in live migration of containers using Flocker [2]. Docker creates lightweight Linux based containers, with containerized applications having near the same performance as when the application is executed outside the container [3]. Containers offer lightweight virtualization for applications and services running on the same host. Containers are an alternative to full virtualization of a host operating system and devices, only isolating running applications using a chroot environment for persistent file accesses, Linux cgroups for CPU and memory usage, and I/O isolation.

Storage Class Memory, or SCM, is an exciting new memory technology with the potential of replacing hard drives and SSDs, as it offers high-speed, byte-addressable persistence on the main memory bus. Several technologies are currently under research and development, each with different performance, durability, and capacity characteristics. These include a ReRAM by Micron and Sony, a slower but very large capacity Phase Change Memory or PCM by Micron and others, and a fast, smaller spin-torque ST-MRAM by Everspin. High-speed, byte-addressable persistence will give rise to new applications that no longer have to rely on slow, block based storage devices and serialize data for persistence. When writing values into a persistent memory tier, programmers are faced with a dual edged problem of how to catch spurious cache evictions while atomically grouping stores to manage consistency guarantees in case of failure.

Figure 1 shows how new Storage Class Memory sits at the intersection of both byte-addressability and persistence, allowing applications to use ordinary loads and stores to quickly persist data without having to serialize data for block storage. SCM is timely for Big Data applications as it sits at the common challenges of Velocity, Volume, and Variety [4] [5] and recent extensions to additional characteristics [6]. Figure 1 also shows traditional container based storage in relation to Storage Class Memory. As noted in the figure, SCM sits at an interesting intersection of byte-addressable access and persistence. Access to SCM will be provided via a traditional mmap call to an underlying device or file [7]–[9].

Figure 1: Container based storage with byte-addressable, fast, and non-volatile Storage Class Memory.

This poses a problem for applications running inside an isolated container, because an mmap of a file through an isolation layer may not gain the performance benefits of SCM, due to cascading persistence consistency guarantees in layered file accesses. Additionally, exposing a shared device or volume to multiple containers can remove persistence isolation from containers and introduce security and portability issues. New Direct Access (DAX) support and eXecute-In-Place (XIP) support bypass the virtual-memory system and page caches, creating direct access to SCM without the need to copy data into internal buffers for persistence. Even with DAX support in file systems such as ext2 and ext4, containers will still have to access data through a Docker Storage Layer or Volume, each with the challenges above. Additionally, multi-threaded applications utilizing the Docker Volume driver do not scale well due to internal Docker synchronization.

We present a solution to these problems by introducing a Containerized Storage Class Memory or CSCM driver. It detects when applications are accessing the driver from within a container and presents an identical copy of SCM. This allows for ease in portability of containers coupled with high-speed, byte-addressable persistent memory accesses for storage. In addition, with isolated, direct access to persistent SCM, consistency guarantees that perform persistent memory fences in an application, such as an in-memory database, do not suffer from multiple layers each performing persistence consistency, thus allowing our CSCM implementation to achieve high scalability.

II. OVERVIEW

Virtualization was pioneered by Popek and Goldberg, who stated that a virtual machine should be "an efficient, isolated duplicate of the real machine" [10]. Virtualization has three key properties: 1) Isolation - guests should not be able to affect others; 2) Encapsulation - allowing for easy migration as an entire system can be represented as a single file; and 3) Interposition - the system can monitor or intercept guest operations.

Jails [11] was introduced for lightweight virtualization of environments to allow for the sharing of a machine between several customers or users while still allowing for isolation of files and services of the guests on the same machine through the use of chroot and I/O constraints. Chroot changes the root of the file system to a different location for application-level persistence isolation and security. This has many benefits since a system can be shared securely with little performance burden, but services such as CPU and memory were not isolated and could be abused by users. Linux Containers or LXC [12] were introduced as the Linux version of Jails. The implementation added features to restrict memory and CPU usage that extended its isolation features. Docker [1] is an open-source Linux project which automates the deployment of applications or bundled services inside of Linux Containers. Volatile memory is handled from inside a container using regular accesses, with limits imposed by the operating system if Linux cgroups are used.

Handling access to persistent data inside a Docker Container is not a choice to be taken lightly, as there are several choices available: using a container file or traditional Docker Storage, an external directory or Docker Volume, or direct access to a device. Each has its own advantages and disadvantages and needs to be specified up-front on container start if requiring a special device or volume. We summarize each method and then relate it to SCM storage access.

1. Docker Storage: Storage to traditional files inside a container is accessed using a pluggable storage driver architecture. File accesses are layered using AUFS (or Another Union File System), Device Mapper, OverlayFS, VFS, or ZFS. A layered access requires copying of data through multiple file or persistence layers; an example is shown in Figure 2. AUFS uses Copy-on-Write and copies the entire file on first update, which could waste space but be faster if all of the contents of a file are updated. Device Mapper is a new option that performs copies at the block layer, reducing space, but is slower on the writes to first blocks. Another drawback for Docker Storage is that the entire Docker service must be configured for the driver and cannot be changed unless reconfigured and re-installed.

2. Docker Volumes: Docker Volumes are used to share data between containers or between the container and the host, also shown in Figure 2. Volume specifications are passed as options on startup of a container, and cannot be added later. Data is also not isolated, so changes in one environment affect the other. If a container is paused and restarted elsewhere, volume data must be copied and managed. Additionally, we found that multi-threaded applications, with each thread stressing the persistence volume, did not scale well due to internal synchronization points in the volume driver.

3. Direct Device Access: Direct access to a device is specified and granted on the start of a container, and the exact same driver must be present on restart. If a new driver is installed later, the container must be restarted or the host may crash. Special privilege must also be granted to the container, which violates isolation principles, as it allows each container to access or overwrite data in the shared driver among containers launched with the same specification.

Figure 2: An example of Docker Storage and Volume Drivers accessing data through a persistent file layer.

Storage Class Memory presents a new situation for container based storage, as SCM offers byte-addressable memory that is also persistent. Access to persistent memory, or SCM, like regular file accesses, can be presented to a container using one of the options listed above. The three options for SCM accesses are shown in Figure 3.

Figure 3: Storage Class Memory exposed to a Docker Container through an SCM Filesystem or Driver.

Combining Docker Containers with SCM poses an interesting challenge. If SCM is exposed as Docker Storage or a Volume, then the high performance of SCM byte addressability is lost. Further, exposing persistent SCM as a dedicated, privileged device driver will cause a loss in application isolation and portability.

Additionally, Direct Access (DAX) support, which allows mmaped files to bypass the virtual-memory system and page caches to store values directly to byte-addressable persistent memory, still must be passed to a Docker container using one of the three avenues above. If a DAX supported filesystem is passed to a containerized application through Docker Storage, persistent memory is still exposed via a slow image layer and driver as in Figure 2; exposing it through a Volume or Direct Device still has the drawbacks above. Adding support for containers to the persistent memory DAX driver would introduce a new problem of managing containerized ranges of SCM inside a given /dev/pmem device range; and when migrating the container to another system, management tools would need to be created to access the persistent memory and recreate an identical copy on the new system.

We developed a new driver, which we call CSCM, that manages SCM at a file level, gaining ease of portability, while simultaneously presenting data to the container through the direct driver access avenue, thus allowing for byte-addressable speed with containerized persistence. It provides isolation through the driver and management through the file layer, while allowing direct access to SCM through loads and stores for performance and scalability without layered access.

A. Persistence Consistency and Transactions

Persistence consistency guarantees for transactions impose additional problems. Consider a simple application that is writing to a variable Z. The value might be caught in a number of places, from the processor cache to write buffers. If an atomic operation is needed for an in-memory data structure, the new value of Z should be visible and accessible to others on the completion of a transaction. If a power failure occurs, a DRAM based variable being located anywhere in the hardware is not problematic, as the variable value is cleared on restart. However, if the value is located in SCM and a power failure occurs, the result of the computation may not be committed to persistent SCM. Even if a value is written to a memory location using a cache-line flush, clflush, the value may not be persisted to SCM.

The new Intel architecture specification [13] specifies new instructions such as clwb, or cache-line write-back, which writes a cache line out to the write buffers without invalidation, and persistent memory store fences, which guarantee writes against loss but are global synchronous operations.

This complicates fast, reliable storage for Docker Containers, as multiple levels of synchronization, which are expensive, are required to maintain consistency through the layers. For instance, an application might be performing its own consistency guarantees through an Undo Log, which requires a synchronous operation making a copy of a value before writing a new value. Before each write, however, each value must be made persistent on the underlying medium. In the case of disk based persistence, this is accomplished through a disk flush. In SCM this can be accomplished by creating a log, writing the address of Z and the value of Z to the log, waiting for a persistent memory fence, and then writing the new value of Z. Once new values are written, the values may be flushed immediately or delayed. If the writes to the SCM locations are being performed through a Docker Storage Layer or Volume, then all of the writes to the variables, and the synchronization points, are also passed through, which can be a very expensive operation.

Therefore, it is advantageous to access SCM directly through the device driver to avoid cascading synchronization. Our CSCM driver allows for direct access to SCM in an isolated and managed manner so that containerized applications can have fast, scalable in-memory transactions.
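As an illustration of the ordering described above, the following minimal sketch (ours, not part of the CSCM library) logs the address and old value of Z, persists the log entry, and only then writes the new value. It assumes a pre-mapped SCM region and x86 clflush/sfence intrinsics; a clwb-based flush could be substituted on hardware that supports it, and the scm_log and pm_flush_fence names are hypothetical.

#include <stdint.h>
#include <stddef.h>
#include <immintrin.h>   /* _mm_clflush, _mm_sfence */

struct undo_entry { uint64_t *addr; uint64_t old; };

/* Hypothetical undo log placed in the mmap'ed SCM region. */
static struct undo_entry *scm_log;
static size_t scm_log_len;

/* Flush the cache line holding p and fence so the write reaches the
 * memory subsystem before any later store. */
static void pm_flush_fence(const void *p) {
    _mm_clflush(p);
    _mm_sfence();
}

/* Durably update *z to new_val using one undo log entry. */
static void scm_update(uint64_t *z, uint64_t new_val) {
    struct undo_entry *e = &scm_log[scm_log_len++];
    e->addr = z;                 /* 1. record address and old value  */
    e->old  = *z;
    pm_flush_fence(e);           /* 2. persist the log entry first   */
    *z = new_val;                /* 3. write the new value           */
    pm_flush_fence(z);           /* 4. flush now, or lazily later    */
}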

III. CONTAINERIZING SCM

This section presents the Containerized Storage Class Memory design as a Linux loadable Kernel Module, or LKM. The overall system design is shown in Figure 4.

Figure 4: Containerized Storage Class Memory System Design.

The system implementation is comprised of three main components, described briefly in the list below and in more detail following:

• Docker CLI: The Docker Command Line Client integrations are simple additions we performed via the Docker API to add functionality when launching or saving a Docker Image.

• User Library: This can be any user-level library, such as pmem.io [8] or SoftWrAP [17], that provides a Storage Class Memory allocator and optional persistence mechanisms or transaction support. We provide a basic working implementation for accessing SCM and flushing updates through the cache hierarchy for persistence consistency, with configurations for evaluation purposes.

• CSCM Driver: Our Container Aware Linux loadable Kernel Module for Storage Class Memory, or CSCM. This is where most of the work is performed to provide isolated, scalable access to SCM through the Docker system.

Due to the lack of readily available SCM DIMMs for Docker testing, we used Linux 4.5.3 with the NVDIMM, PMEMFS, and Direct Access (DAX) supports that we chose as options when compiling the kernel, per [8], to simulate SCM data. Then, for a file system that performs persistence consistency, we used an ext4 file system in /dev/pmem0, the emulated PM or SCM data space. The file system contents for Docker and supporting libraries are installed into the SCM FS space, which we mount on /scmfs. The file system layout is shown in Figure 5 and details are described below.

Figure 5: Data Layout on /dev/pmem0 for CSCM.

The Docker container is executed with a privileged device mapping to our CSCM LKM driver, which gives the container privileged access to read and write our device driver /dev/cscm. Additional options for the volume tests can be passed on container creation. Even though the privileged flag is passed, our CSCM driver performs accesses to SCM via loads and stores in isolation as described below.

docker run --privileged -ti --device=/dev/cscm --name=test1 rhel:7

Docker CLI: Docker Command Line Integration

Docker 1.10.3 and the API for Client Version 1.22 are built in the Go Programming Language. The Docker API client allows for simple extensions to the Docker interface. It communicates with the Docker server daemon, and responses from the daemon can be handled through the Go language. On container start, the command line to launch an image is:

docker run [options] imageid

We modified the Docker API Client run code to initialize the data in a container on startup. The image id is obtained from the return value from the Docker daemon after the container image is created and before the container is run. The data in /scmfs/images for the image identifier is copied if an image exists for the new container being launched. If no image id is present, it need not be copied, since the CSCM driver creates it on the first access. Likewise, for a commit, the reverse procedure is followed; the file docker//client/commit.go was modified to copy the file /scmfs/container/value to the image directory. Response codes in the Docker CLI from communication with the Docker Server Daemon indicate success or failure of a commit and identifiers for the launching container.

Listing 1: Docker CLI Additions

func (cli *DockerCli) CmdCommit(args ...string) error {
    response, err := cli.client.ContainerCommit(...)
    // Commit container SCM data to /scmfs/images
    exec.Command("docker scm commit", name, response.ID)
    ...
}

func (cli *DockerCli) CmdRun(args ...string) error {
    ...
    // Initialize container SCM data in /scmfs/containers
    exec.Command("docker scm run", name, createResponse.ID)
    // Run command in container.
    ...
}

User Library

The user-level library may be any library, such as pmem.io [8] or SoftWrAP [17], that provides a Storage Class Memory allocator and optional persistence mechanisms or transaction support. Our library provides several functions, namely scmalloc and scmsync. The scmalloc call examines the environment and can be configured to provide access to any one of the configurations described previously: Docker Volumes, Docker Storage, our CSCM driver, or volatile DRAM. For volume or storage access, a named file can be specified and is then opened on first access and mmaped into the user address space.

For evaluation purposes, an environment variable, SCM, is first read on library initialization at application startup. If the value is 0 or the variable is not defined, then /dev/cscm is accessed, opened for reading and writing, and mmapped into user-space. This creates an interface to the /dev/cscm driver for the application. If the value of SCM is 1, then Docker Storage on /dfscmdata is opened. This is access to the Docker Storage, which is on /scmfs or any SCM file system. It provides byte-addressable and DAX supported file access to Docker, but it does not pass through entirely to the application layer, requiring access through the AUFS or other Docker Storage Layers.

A value of 2 for SCM indicates that our library should allocate memory from a file in the Docker Volume driver external to the Container and mmap it into user space. This volume is configured via the -v option on Docker Container launch to point to the /scmfs/volumes volume. Note that this volume is shared, as it would be by any container accessing volume data. To secure isolation in this mode, managers must configure each container with its own volume, and on container migration or restart, ensure that the volumes have persisted and have identical data for the container application accesses.

The resulting pointer from the mmap call is returned and saved statically in the function and incremented by the requested size. Subsequent calls to scmalloc use an offset of the pointer and increment the next available location as well. Note that sophisticated memory allocators for SCM are directly usable by this library or any application that initializes it with the /dev/cscm driver.

The scmsync function call simply calls the persistence mechanism on the file handle. When accessing SCM directly via /dev/cscm, we can use a persistent memory fence after first issuing a store fence to ensure proper ordering. We emulate the cost of a persistent memory fence using a global serializing instruction, CPUID.
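The following user-space sketch illustrates the bump-pointer allocation path described above. It is a minimal illustration rather than the library's actual code: the mapping size, the file name under /dfscmdata, and the omission of error handling are our assumptions.

#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define SCM_REGION_SIZE (1UL << 30)   /* assumed mapping size */

static char  *scm_base;    /* base of the mmap'ed SCM region */
static size_t scm_next;    /* bump-pointer allocation offset */
static int    scm_fd = -1;

/* Bump-pointer allocator: open /dev/cscm (or, per the SCM environment
 * variable, a file in Docker Storage or a Volume) and hand out offsets. */
void *scmalloc(size_t size) {
    if (scm_fd < 0) {
        const char *env  = getenv("SCM");
        const char *path = (env == NULL || strcmp(env, "0") == 0)
                               ? "/dev/cscm"        /* CSCM driver              */
                               : "/dfscmdata/scm";  /* illustrative storage file */
        scm_fd   = open(path, O_RDWR | O_CREAT, 0600);
        scm_base = mmap(NULL, SCM_REGION_SIZE, PROT_READ | PROT_WRITE,
                        MAP_SHARED, scm_fd, 0);
    }
    void *p = scm_base + scm_next;
    scm_next += size;
    return p;
}

/* Persist outstanding stores.  This sketch simply msyncs the mapping;
 * the paper's /dev/cscm path instead issues a store fence followed by a
 * CPUID-serialized emulated persistent memory fence. */
void scmsync(void) {
    msync(scm_base, scm_next, MS_SYNC);
}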
CSCM: Container Aware SCM Linux Kernel Module

The Containerized SCM Linux loadable Kernel Module creates an SCM data area for the container and sets up the node and major version number. It registers our container supported /dev/cscm device appropriately and registers the mmap handler. On container restart, the container is attached through the active device to the persistent SCM data.

On an mmap, the following pseudo-code is executed:

Listing 2: CSCM LKM mmap flow

static int scm_mmap(struct file *filp, struct vm_area_struct *vma)
{
    ...
    // Get the file system root
    // Detect chroot (is the fs root / ?)
    // Get the SCM location:
    //   if root, use /scmdata/hostdata
    //   else examine /proc/1/cgroup for the container id
    // Lock process
    // Open the SCM file
    // Unlock process
    // Set up the VMA with generic_file_mmap
}

On driver installation, the /dev/cscm device is created for the Linux host operating system. Applications access the driver by using a regular file open call and then may call mmap to access and map data into user space.

When mmapping the data, the pseudo-code above is executed. The file system root is determined, container-level isolation is detected via OS calls to detect chroot, and then, based on the result and the operating system type, an offset into the base operating file system is determined. This allows our driver to manage SCM data on a file basis for containers for portability, while still serving DAX byte-level persistence through the driver and Docker system to container applications for high performance and scalability.

One challenge, after getting the generic Virtual Memory Address range, or VMA, set, was determining not just whether an application is in a container but also determining the container id. The container id dictates what underlying file to open. The file /proc/1/cgroup has different values depending on the OS. Once the correct SCM data file is opened, the file can be mapped, since it is using Direct Access Extensions underneath, and the resulting generic_file_mmap call value can be returned to the scmalloc call. If the SCM data file isn't present, it is created on the first access.
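For illustration, the user-space fragment below shows the kind of /proc/1/cgroup parsing this refers to; the exact line format varies across distributions and Docker versions, and the in-kernel CSCM module performs the equivalent lookup before choosing which backing file to open. The function name is ours.

#include <stdio.h>
#include <string.h>

/* Extract a container id from /proc/1/cgroup, e.g. from a line such as
 *   1:name=systemd:/docker/<64-hex-id>
 * Returns 0 and fills id[] on success, -1 if no docker entry is found. */
static int cscm_container_id(char *id, size_t len) {
    FILE *f = fopen("/proc/1/cgroup", "r");
    if (!f)
        return -1;
    char line[512];
    int found = -1;
    while (fgets(line, sizeof(line), f)) {
        char *p = strstr(line, "/docker/");   /* format varies by distro */
        if (p) {
            snprintf(id, len, "%s", p + strlen("/docker/"));
            id[strcspn(id, "\n")] = '\0';      /* strip trailing newline  */
            found = 0;
            break;
        }
    }
    fclose(f);
    return found;
}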

A. Security and Additional Operating Systems

One area of consideration is security for the privileged CSCM LKM driver. Chroot has legacy security issues, such as open file handles to parent files still being usable after issuing a chroot, thereby giving a child process access to a parent view of a file system. However, our CSCM LKM securely locks the file system and current process so a child can't open a file while the LKM has chrooted itself to open the appropriate SCM file in the parent system. It prevents any child from grabbing the file handle, as it is used internally, locked only briefly, and used only once.

Additionally, the CSCM driver locks the information stored in /proc/1/cgroup. It is possible that the host root could put some malicious content in that file to cause a parse error or, if known, direct the driver to a different file for SCM. While this could be a problem for any malicious application in a container, we are working on securing this further.

We also ported, built, and tested the system on another Linux distribution, Ubuntu Server 14.04 LTS, upgraded to Linux Kernel 4.2 and also built with DAX and PM support. The test machine was an Intel i7 4500U running at 3.0 GHz with 8GB of memory. Minor changes to the source code were performed to get the Linux loadable Kernel Module to work in the additional system and kernel. Such changes included parsing the /proc/1/cgroup file to handle changes in the way that containers are referred to. Additionally, different function calls were not exposed for the LKM and were implemented in the CSCM source.

IV. EVALUATION

For evaluation, we used an Intel Xeon(R) CPU E5-2697 v2 12 core processor at 2.70 GHz, with 32 GB DDR3 (4 x 8 GB) clocked at 1.867 GHz, running the Red Hat Enterprise Linux Server release 7.2 x64 distribution and a Linux 4.5.3 kernel compiled (per [8]) with NVDIMM, PMEMFS, and Direct Access (DAX) supports. A 6GB persistent memory emulation space for the pmem driver was created in DRAM, and an ext4 file system was created and mounted for persistent memory file system emulation. Our Containerized Storage Class Memory library, including the CSCM Linux loadable Kernel Module driver, was built using gcc 4.8.5. We used Docker Version 1.10.3 and API for Client Version 1.22. In evaluation, SCM is exposed as 1GB chunks on the /scmfs file system. A Docker container is executed with:

docker run --privileged -ti --device=/dev/cscm --name=test1 -v /scmfs:/vol testa

We tested two micro-benchmarks followed by tests using STREAM [14] and Redis [15] using each configuration of Host, our Containerized SCM Docker Device (CSCM), Docker Storage, and Docker Volumes. We also tested transactional guarantees to in-memory data structures and show the CSCM driver performs isolated SCM access to a shared device. We first show isolation, then show how each method performs similarly with little to no synchronization or transactional guarantees, and finally add in transactions and synchronization to show how our CSCM driver has near-native application performance and scalability.

A. Isolation Testing

Figure 6: Number of Elements Different on Each Portion of Random Array Update.

To first test the guarantee of isolation between concurrent Docker Containers accessing the device, we created a large array in SCM in a container's persistent memory space. We then update the array with a constant value, and test how many times it is different. On each pass over a percentage of randomly located elements in the array, the test displays the percent different. Figure 6 shows the tests for two Docker Containers simultaneously accessing /dev/cscm. When CSCM is enabled to use a containerized data segment, each array operates on its own copy of SCM, even though both containers are accessing the same privileged /dev/cscm device. The array differences go to zero quickly once all of the elements have been scanned and updated in random order.
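A condensed sketch of this isolation check is shown below; the array size, sample fraction, and pass count are illustrative, and scmalloc is assumed to come from the CSCM user library.

#include <stdio.h>
#include <stdlib.h>

#define N      (1 << 24)       /* array size (illustrative)         */
#define SAMPLE (N / 10)        /* elements checked/updated per pass */

extern void *scmalloc(size_t size);   /* CSCM user library allocator */

/* Each container writes its own tag into random slots and reports how
 * many sampled slots still hold a foreign value.  With CSCM isolation
 * the percentage falls to zero; with a shared volume it keeps bouncing. */
void isolation_pass(int my_tag) {
    static int *arr;
    if (!arr)
        arr = scmalloc(N * sizeof(int));
    for (int pass = 0; pass < 20; pass++) {
        long diff = 0;
        for (int i = 0; i < SAMPLE; i++) {
            int idx = rand() % N;
            if (arr[idx] != my_tag)
                diff++;
            arr[idx] = my_tag;
        }
        printf("pass %d: %.1f%% different\n", pass, 100.0 * diff / SAMPLE);
    }
}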

However, when CSCM support is disabled in the driver, or when accessing a volume mount to a shared data segment, the arrays always show differences, as they are not isolated. This is shown in the bouncing behavior of the disabled CSCM driver in processes P1 and P2 in separate Docker containers. These tests show that the CSCM driver provides isolation to the emulated SCM through the shared /dev/cscm device. Therefore, a containerized application is unaware it is running inside a container, appearing to itself as running on a host.

B. Micro-Benchmarks

Figure 7: Single Updates Per Second on Integer Array.

We used a set of micro-benchmarks to update elements in an SCM backed array, initially without any synchronization points. These tests show the similarity in performance of the different methods without any type of synchronized persistence or consistency guarantees.

The first micro-benchmark is an array update which creates an array in SCM using the CSCM user library scmalloc for 200M 4-byte integers and randomly updates elements in the array. We record the average throughput of updates per second for each configuration and average over 20 executions. Figure 7 shows how our Docker CSCM device has near the same memory throughput as when the test is running on the host, and only slightly better throughput than the other methods.
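A sketch of this array-update micro-benchmark is shown below; the timing harness and update count are ours, with scmalloc assumed from the CSCM user library.

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define ELEMS 200000000UL            /* 200M 4-byte integers */

extern void *scmalloc(size_t size);  /* CSCM user library allocator */

/* Randomly update elements of an SCM-backed array and report the
 * average update throughput; no synchronization or scmsync calls,
 * matching the unsynchronized micro-benchmark described above. */
int main(void) {
    int *arr = scmalloc(ELEMS * sizeof(int));
    unsigned long updates = 100000000UL;
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (unsigned long i = 0; i < updates; i++)
        arr[rand() % ELEMS] = (int)i;
    clock_gettime(CLOCK_MONOTONIC, &t1);
    double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    printf("%.1f million updates/sec\n", updates / secs / 1e6);
    return 0;
}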

Figure 8: Memory Copy Throughput in MB/s for STREAM.

We then modified the memory STREAM Benchmark [14] to allocate memory using the user-level library with scmalloc and linked it with the CSCM user library. We tested the Copy function, which sequentially copies data from one 90MB chunk to another location. Figure 8 shows how the Docker CSCM device has near the same memory throughput as when the test is running on the host. There are no consistency guarantees in this memory benchmark, and the sequential nature of the copy allows the methods to have similar performance due to caching and buffers.

Figure 9: Single Element Inserts per Second into B-Tree.

We also tested a micro-benchmark that adds elements to an in-memory initialized B-Tree. Figure 9 shows the average time required to insert additional elements. Docker Storage had the lowest value recorded. The roughly equal performance is due to many of the top level elements in the tree being in the processor cache, resulting in a constant cost. Consistency guarantees in the following experiments show the higher performance of the CSCM Device Driver.

C. Benchmarks

Figure 10: Redis Benchmark of Service Requests Per Second.

Next, we tested several benchmarks in Redis [15], an in-memory data structure store that offers lists, sets, and hash table values. We configured Redis to operate in memory and integrated it with the CSCM user library by modifying the memory allocator Jemalloc that ships with Redis to use our CSCM scmalloc routines. We executed the Redis benchmarks for set, get, and list push times and recorded the average requests serviced per second.

Figure 10 shows that the CSCM driver in a Docker container has equal requests-per-second throughput as the same benchmark running on the host through the driver. These have twice the throughput of the Docker Storage or Volume options, as requests do not have to flow through the Docker Storage or Volume drivers. The benchmark is scaled to use 40 threads performing simultaneous requests to Redis for the Set, Get, and LPush operations; in addition to poor write performance, the Volume and Storage drivers do not scale well. Our CSCM device scales well and provides near native host performance. Forward work includes investigating increasing client threads and additional in-memory database consistency techniques.

Consistency Testing:

Figure 11: Average Number of 10 Element Update Transactions Per Second Into an Array.
Figure 12: Average Number of B-Tree Random Element Insert Transactions Per Second.

If an application is doing a copy-on-write approach to persistence, then it will copy the old values and persist them before writing new values. We tested persistence consistency by adding synchronization using scmsync in the CSCM library to the array and B-Tree tests. Figure 11 shows the array throughput and Figure 12 shows the B-Tree throughput. In both cases the CSCM driver running in the Docker Container has the same throughput as when running on the host. When the SCM is accessed through volumes, however, performance degrades by a factor of 10. This is due to the additional level of synchronization required for the volume. Even slower is the Docker Storage driver, which is another factor of 10 slower than the volume driver, making it two orders of magnitude slower than the host or device. This is due to the storage driver going through additional layers. Future work includes exploring changes in the transaction size and persistence method, such as with SoftWrAP [17].

D. Non-Uniform Memory Access

Figure 13: Array Element Updates Per Second for Host and Docker SCM Devices Without Numactl, Per Execution.
Figure 14: Average Number of Array Element Updates Per Second Pinned to Core.

Non-Uniform Memory Access, or NUMA, testing was performed on a machine with two Intel(R) Xeon(R) CPU E5-2697 v2 processors running at 2.70GHz, each with 32 GB as described above, for a total of 64 GB and 48 cores.

Figure 13 shows the performance of consecutive executions of the array update over time. The performance can vary from 8 to almost 16 million updates per second. This is due to the CPU scheduler and where the process is placed. If it is on the same socket as the SCM driver memory accesses, it achieves 16 million updates per second, otherwise 8 million.

Figure 14 shows the performance when the array test thread is pinned to hardware threads using numactl. This produces higher performance if the SCM data is local, but can produce results that are twice as slow when the accesses are to another socket.

V. RELATED WORK

NVM programming models will include use of mmap to access SCM through loads and stores [7]. Docker [1] has a number of supported file systems and volumes, from AUFS, a layered union file system that uses Copy-on-Write when files are modified, to the Device Mapper, which uses thin provisioning to implement the layers. Recent work on Flocker [2] manages Docker containers themselves and integrates with a Docker Swarm manager, allowing Docker Data Volumes to follow migrated Docker Containers. However, this support is for block based devices and not byte-addressable, non-volatile memory, which faces persistent memory consistency problems.

Consistency models for persistent memory were considered in [16]. For SCM atomic consistency, numerous software logging approaches and new hardware-software methods have been proposed [17]–[23]. Specialized reliable SCM software structures like B-Trees [24], multi-versioned indexes [20], [25], and NV heaps [26] have been proposed recently; these often require changes to the underlying system architecture. ATLAS [19] uses a compiler pass to automatically generate transactional regions for atomic writes utilizing a synchronous undo log, but faces the same disadvantage of coupling distinct concerns in a single framework. Mnemosyne [22] uses software transactional memory (STM) to intercept transactional writes and reads and integrate concurrency control and atomicity. SoftWrAP [17] uses an asynchronous Redo-Log and fast aliasing technique; a similar approach is described in REWIND [27], which offers in-place updates with multilayered recovery aided by an Atomic Doubly Linked List. Pmem.io is a recently released persistent memory library by Intel [8] which offers several user-level libraries for direct access to SCM and persistent memory transactions using Copy-on-Write techniques. These all rely on atomic 8-64 byte writes to SCM and new instructions such as pcommit. Application-level control of consistency is discussed in RVM [28]. However, these models do not address virtual machine containers for accessing SCM, but they may be used as transaction mechanisms for writes by applications running in a container with direct access to SCM.

Several general purpose persistent memory file systems built on SCM have been proposed that can allow quick adoption of application use of SCM. Since these are general purpose file systems, they could support full virtual machines running on top of them, such as KVM [29]. BPFS, or Block-Persistent File System, [24] uses copy-on-write techniques for ordering of cache evictions but requires changes to the hardware. The Persistent Memory File System [9], or PMFS, is a complete file-system implementation built for SCM but doesn't address containers. DAX support for file systems, including Ext2 and Ext4, provides direct, unbuffered access to files, but doesn't scale when used with Docker Storage or Volumes or provide isolation as a device. SCMFS uses sequences of mfence and clflush operations to perform ordering and flushing of load and store instructions and requires garbage collection [30]. Aerie [31] exposes an SCM file system without kernel interaction in user mode. NOVA [32] is a hybrid file system for SCM and DRAM that provides consistency guarantees. These file systems could be mounted and accessed in a container; however, they would be accessed via regular Docker Storage or through Docker Volume management, suffering from poor performance and scalability and lack of isolation for volumes. Our CSCM driver, which uses a file system for our container-data management, may be utilized with any underlying SCM based file-system that supports mmap and DAX for high performance.

Research into Non-Volatile Memory allocators such as nvmalloc could be accessed from containers, but these do not support any sort of container isolation other than regular virtual memory isolation. Recent work to virtualize Non-Volatile RAM [33] was presented, but doesn't address containers or the consistency guarantees that might be needed by host applications.

VI. SUMMARY

Running applications inside containers using Docker is growing in popularity as it presents a low-cost, high performance method for isolating applications and services. Emerging byte-addressable non-volatile memory, commonly called Storage Class Memory, presents an interesting challenge for in-memory persistent applications running inside a container.

This paper investigated tradeoffs for presenting SCM persistence to a container based application through a memory-mapped file inside a container, mounted as a volume, and as a container-aware Linux loadable Kernel Module. We presented and evaluated our Containerized Storage Class Memory Driver, or CSCM, a containerized LKM with an mmap interface for in-memory applications and integration with Docker.

We found our CSCM driver to have the highest in-memory application throughput, with orders of magnitude higher throughput than transactional persistence through volumes and storage, while still achieving container persistence isolation for Storage Class Memory.

VII. ACKNOWLEDGEMENTS

I would like to give special thanks to Dr. Scott Rixner and Dr. Peter Varman for their inspiration and encouragement and to the National Science Foundation for support.

REFERENCES

[1] D. Merkel, "Docker: Lightweight linux containers for consistent development and deployment," Linux J., vol. 2014, no. 239, Mar. 2014.

[2] R. Peinl, F. Holzschuher, and F. Pfitzer, "Docker cluster management for the cloud-survey results and own solution," Journal of Grid Computing, pp. 1–18, 2016.

[3] W. Felter, A. Ferreira, R. Rajamony, and J. Rubio, "An updated performance comparison of virtual machines and linux containers," in Performance Analysis of Systems and Software (ISPASS), 2015 IEEE International Symposium On. IEEE, 2015, pp. 171–172.

[4] S. Singh and N. Singh, "Big data analytics," in Communication, Information Computing Technology (ICCICT), 2012 International Conference on, Oct 2012, pp. 1–4.

[5] P. Russom et al., "Big data analytics," TDWI Best Practices Report, Fourth Quarter, 2011.

[6] M. F. Uddin, N. Gupta et al., "Seven v's of big data understanding big data to extract value," in American Society for Engineering Education (ASEE Zone 1), 2014 Zone 1 Conference of the. IEEE, 2014, pp. 1–5.

[7] A. Rudoff, "Programming models for emerging non-volatile memory technologies," Login, vol. 83, no. 3, June 2013.

[8] Pmem.io. (2016) Persistent Memory Programming. [Online]. Available: http://pmem.io/

[9] S. R. Dulloor, S. Kumar, A. Keshavamurthy, P. Lantz, D. Reddy, R. Sankaran, and J. Jackson, "System software for persistent memory," in Proceedings of the Ninth European Conference on Computer Systems, ser. EuroSys '14. New York, NY, USA: ACM, 2014, pp. 15:1–15:15.

[10] G. J. Popek and R. P. Goldberg, "Formal requirements for virtualizable third generation architectures," Communications of the ACM, vol. 17, no. 7, pp. 412–421, 1974.

[11] P.-H. Kamp and R. N. Watson, "Jails: Confining the omnipotent root," in Proceedings of the 2nd International SANE Conference, vol. 43, 2000, p. 116.

[12] Linux, "Linux containers," 2012. [Online]. Available: http://lxc.sourceforge.net

[13] Intel Corporation, "Intel Architecture Instruction Set Extensions Programming Reference," October 2014, http://software.intel.com/.

[14] J. D. McCalpin, "A survey of memory bandwidth and machine balance in current high performance computers," IEEE TCCA Newsletter, pp. 19–25, 1995.

[15] Redis.io. (2016) Redis, a data structure store. [Online]. Available: http://redis.io/

[16] S. Pelley, P. M. Chen, and T. F. Wenisch, "Memory persistency," in ISCA'14, 2014, pp. 265–276.

[17] E. Giles, K. Doshi, and P. Varman, "Softwrap: A lightweight framework for transactional support of storage class memory," in Proc. 31st Symposium on Mass Storage Systems and Technologies, ser. MSST '15. IEEE, 2015.

[18] K. Doshi, E. Giles, and P. Varman, "Atomic persistence for scm with a non-intrusive backend controller," in 2016 IEEE International Symposium on High Performance Computer Architecture (HPCA). IEEE, 2016, pp. 77–89.

[19] D. R. Chakrabarti, H.-J. Boehm, and K. Bhandari, "ATLAS: Leveraging locks for non-volatile memory consistency," in Proceedings of the 2014 ACM International Conference on Object Oriented Programming Systems Languages & Applications, ser. OOPSLA '14. New York, NY, USA: ACM, 2014, pp. 433–452.

[20] S. Venkatraman, N. Tolia, P. Ranganathan, and R. H. Campbell, "Consistent and durable data structures for non-volatile byte addressable memory," in Proceedings of 9th Usenix Conference on File and Storage Technologies. ACM Press, 2011, pp. 61–76.

[21] J. Zhao, O. Mutlu, and Y. Xie, "FIRM: Fair and high-performance memory control for persistent memory systems," in Proceedings of the 47th Annual IEEE/ACM International Symposium on Microarchitecture, ser. MICRO-47. Washington, DC, USA: IEEE Computer Society, 2014, pp. 153–165.

[22] H. Volos, A. J. Tack, and M. M. Swift, "Mnemosyne: Lightweight persistent memory," in Proceedings of the Sixteenth International Conference on Architectural Support for Programming Languages and Operating Systems, ser. ASPLOS XVI. New York, NY, USA: ACM, 2011, pp. 91–104.

[23] A. Chatzistergiou, M. Cintra, and S. D. Viglas, "REWIND: Recovery write-ahead system for in-memory non-volatile data-structures," Proceedings of the VLDB Endowment, vol. 8, no. 5, pp. 497–508, 2015.

[24] J. Condit, E. B. Nightingale, C. Frost, E. Ipek, B. Lee, D. Burger, and D. Coetzee, "Better I/O through byte-addressable, persistent memory," in Proceedings of 22nd ACM SOSP. ACM Press, 2009.

[25] J. Moraru, D. Andersen, M. Kmainsky, N. Binkert, N. Tolia, R. Munz, and P. Ranganathan, "Persistent, protected and cached: Building blocks for main memory data stores," in CMU Parallel Data Lab Technical Report, CMU-PDL-11-114, Dec. 2011.

[26] J. Coburn, A. Caulfield, A. Akel, L. Frupp, R. Gupta, R. Jhala, and S. Swanson, "Nv-heaps: Making persistent objects fast and safe with next generation, non-volatile memories," in Proceedings of 16th ASPLOS. ACM Press, 2011, pp. 105–118.

[27] A. Chatzistergiou, M. Cintra, and S. D. Viglas, "Rewind: Recovery write-ahead system for in-memory non-volatile data-structures," VLDB'15, vol. 8, no. 5, pp. 497–508, 2015.

[28] M. Satyanarayanan, H. H. Mashburn, P. Kumar, D. C. Steere, and J. J. Kistler, "Lightweight recoverable virtual memory," ACM Trans. Comput. Syst., vol. 12, no. 1, pp. 33–57, Feb. 1994.

[29] A. Kivity, Y. Kamay, D. Laor, U. Lublin, and A. Liguori, "kvm: the linux virtual machine monitor," in Proceedings of the Linux symposium, vol. 1, 2007, pp. 225–230.

[30] X. Wu and A. L. N. Reddy, "Scmfs: a file system for storage class memory," in Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis, ser. SC '11. New York, NY, USA: ACM, 2011, pp. 39:1–39:11.

[31] H. Volos, S. Nalli, S. Panneerselvam, V. Varadarajan, P. Saxena, and M. M. Swift, "Aerie: Flexible file-system interfaces to storage-class memory," in Proceedings of the Ninth European Conference on Computer Systems. ACM, 2014, p. 14.

[32] J. Xu and S. Swanson, "Nova: a log-structured file system for hybrid volatile/non-volatile main memories," in 14th USENIX Conference on File and Storage Technologies (FAST 16), 2016, pp. 323–338.

[33] A. Ruia, "Virtualization of non-volatile ram," Ph.D. dissertation, Texas A&M University, 2015.
