On the Use of Containers in High Performance Computing Environments

Subil Abraham†, Arnab K. Paul†, Redwan Ibne Seraj Khan, Ali R. Butt
Virginia Tech
{subil, akpaul, redwan, butta}@vt.edu
† Made equal contribution to this work.

Abstract—The lightweight nature, application portability, and deployment flexibility of containers are driving their widespread adoption in cloud solutions. Data analysis and deep learning (DL)/machine learning (ML) applications have especially benefited from containerization. As such data analysis is adopted in high performance computing (HPC), the need for container support in HPC has become paramount. However, containers face crucial performance and I/O challenges in HPC. One obstacle is that while HPC container solutions exist, they have not been thoroughly investigated, especially with respect to their impact on the crucial HPC I/O throughput. To this end, this paper provides a first-of-its-kind empirical analysis of state-of-the-art representative container solutions (Docker, Podman, Singularity, and Charliecloud) in HPC environments. We also explore how containers interact with an HPC parallel file system such as Lustre. We present the design of an analysis framework that is deployed on all nodes in an HPC environment and captures CPU, memory, network, and file I/O statistics from the nodes and the storage system. We are able to garner key insights from our analysis, e.g., Charliecloud outperforms the other container solutions in terms of container start-up time, while Singularity and Charliecloud are equivalent in I/O throughput. But this comes at a cost, as Charliecloud invokes the most metadata and I/O operations on the underlying Lustre file system. By identifying such trade-offs and optimization opportunities, we can enhance the performance of HPC containers and the ML/DL applications that increasingly rely on them.

Keywords: Container Performance, High Performance Computing, Parallel File Systems, HPC Storage and I/O

I. INTRODUCTION

Containers are experiencing massive growth as the deployment unit of choice in a broad range of applications, from enterprise to web services. This is especially true for leveraging the flexibility and scalability of cloud environments [1]. Extant data analysis, including deep learning (DL) and machine learning (ML), is further driving the adoption of containerized solutions. Containers offer highly desirable features: they are lightweight, comprehensively capture dependencies into easy-to-deploy images, provide application portability, and can be scaled to meet application demands. Thus, containers free application developers from lower-level deployment and management issues, allowing them to focus on their applications and in turn significantly reduce the time-to-solution for mission-critical applications.

Modern data analysis is compute and I/O hungry [2], [3] and requires massive computing power, far beyond the capabilities of what a scaled-up single node can provide. At the same time, more and more scientific workflows rely on DL/ML to analyze and infer from scientific observations and simulation data. Thus, there is a natural need for data analysis and I/O-intensive DL/ML support in HPC. A challenge to this end is that modern DL/ML software stacks are complex and bespoke, with no two setups exactly the same. This leads to an operations nightmare of addressing library, software, and other dependencies and conflicts. To this end, enterprise data analysis solutions have adopted containers given their ability to encapsulate disparate services that can be orchestrated together, reused, and reinstantiated transparently. Thus, efficiently supporting containers in HPC systems will provide a suitable and effective substrate for these crucial data analysis tasks.

The use of containers is being explored in the HPC environment and has produced benefits for large-scale image processing, DL/ML workloads [4]–[6], and simulations [7]. Due to user demand and ease of use, containers are also becoming an integral part of HPC workflows at leadership computing facilities such as the Oak Ridge [8] and Los Alamos [9] national laboratories. Currently available HPC container solutions include Singularity [10], Podman [11], Shifter [12], and Charliecloud [13]. Among these, Singularity and Charliecloud are designed especially to make the best use of HPC resources.
In addition to meeting the computing needs of HPC applications, containers have to support dynamic I/O requirements and scale, which introduces new challenges for data storage and access [14]–[17]. Moreover, different container solutions incur different overheads. Therefore, it is crucial to analyze the interaction of containers with the underlying HPC storage system, one of the most important HPC resources. The knowledge gained can help build an effective ecosystem where both parallel file systems (PFS) and containers can thrive. Several works have examined the performance of containers on HPC systems [7], [18]–[23]. There have also been studies on parallel file systems for use in cloud environments [24]–[28]. However, to the best of our knowledge, existing works have not studied the behavior of the HPC storage system, especially the parallel file system, in service of containers.

In this paper, we examine the performance of different container solutions on a parallel file system. We present a framework to analyze different representative container solutions running atop HPC storage. The framework collects and compares performance metrics from both client and storage nodes, including CPU, memory, network, and file system usage. We also collect metadata and I/O statistics gathered from the PFS. We focus on file I/O, specifically analyzing the effects of I/O operations made by containers. We use the containerized image of Sysbench [29] to run CPU, memory, and file I/O benchmarks. We also use real-world representative HPC workloads such as HaccIO [30] and IOR [31] to explore the I/O impact of the studied containers.

We study four extant container solutions: Docker, Podman, Singularity, and Charliecloud. The HPC PFS that we selected for our analysis is Lustre [32], as it is the most widely used file system in HPC [33].
According to the latest Top 500 list [34], Lustre powers ∼60% of the top 100 supercomputers in the world, and will power Frontier [35], the world's first exascale supercomputer. While the analysis is based on experiments on the Lustre file system, the results can be readily extended to other HPC file systems. Our analysis shows that Charliecloud gives the best performance in terms of container startup time. Both Singularity and Charliecloud give similar I/O throughput on the Lustre file system. However, when analyzing the number of requests that arrive at Lustre's metadata and object storage servers, Charliecloud fares the worst. Our aim is to provide insights that help both HPC practitioners and system administrators obtain the best I/O performance when running HPC container solutions.

II. BACKGROUND

In this section, we give an introduction to container technology and describe some of the commonly used container solutions. We also describe HPC parallel file systems, especially the architecture of the Lustre file system.

A. Containers

Containers differ from other virtualization methods like KVM [36] that explicitly virtualize the hardware and run separate kernels. In contrast, containers on the same node share the same kernel. Containers are able to provide VM-like isolation through the use of cgroups [37] and namespaces [38]. As a result, a container can provide the necessary isolation between applications without the heavy resource utilization of a virtual machine, in turn allowing many more containers than VMs to run simultaneously on a node.

Containers provide the ability to read and write data (I/O) by bind-mounting a directory in the container to a directory on the node that hosts the container. Therefore, container I/O to a bind-mounted directory actually takes place on the underlying host directory.
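To make the bind-mount path concrete, the following is a minimal sketch (ours, not part of the paper's framework) of how container I/O lands on the host; Docker is used for brevity, and the image name and host path are illustrative assumptions.

    import pathlib
    import subprocess

    host_dir = pathlib.Path("/tmp/bind_demo")   # host-side directory (assumed path)
    host_dir.mkdir(exist_ok=True)

    # Write a file at /data inside the container; /data is bind-mounted to host_dir,
    # so the write is actually serviced by the host file system (ext4, Lustre, etc.).
    subprocess.run(
        ["docker", "run", "--rm",
         "-v", f"{host_dir}:/data",              # host_dir <-> /data bind mount
         "alpine", "sh", "-c", "echo from-container > /data/out.txt"],
        check=True)

    # The data is immediately visible on the host side, with no copy step.
    print((host_dir / "out.txt").read_text())

Singularity and Charliecloud expose the same idea through their own bind options, which is why all container file I/O studied here ultimately hits the underlying local or Lustre mount.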
Beyond isolation, arguably the main benefit of containerization is the ease of packaging an application and all of its dependencies in a readily deployable format. This advantage is particularly useful for HPC, where such packaging makes deployment of ML/DL significantly easier by freeing users from installing and managing software dependencies on complex HPC systems.

B. Containerization Solutions

1) Docker: Docker [39] is one of the most popular container solutions. Although approaches such as LXC [40] have been around for a long time, it was Docker that popularized containerization as a better and more efficient solution for providing isolation for applications, especially in the cloud computing setting. Figure 1a shows a simplified architecture overview of Docker.

Figure 1: Architecture of the studied container solutions: (a) Docker, (b) Podman, (c) Singularity, (d) Charliecloud.

However, a number of drawbacks in Docker have limited its enthusiastic adoption in HPC environments. First, the approach depends on an always-on Docker daemon that the Docker client (e.g., the command line interface (CLI)) interacts with to perform container operations. This daemon spawns the containers as child processes, which makes it a single point of failure and can possibly lead to orphaned containers if the daemon crashes. The entire container management ecosystem of Docker rests inside this single daemon, making it bloated and unsuitable for mission-critical HPC deployments. Docker also needs root permissions to run. This almost always contradicts HPC center policies that disallow such permissions to prevent execution of harmful arbitrary code. In addition, Docker stores its images split into a number of read-only layers. When a Docker container is started from an image, the daemon uses OverlayFS [41] to create a union of all of its layers and creates a read-write layer on top. However, a parallel file system such as Lustre does not support OverlayFS. So an image must be explicitly downloaded and run from a supported local file system, a step that introduces unnecessary overhead and limits scalability. Docker also does not provide useful support for native MPI operations that are often needed for typical HPC applications.

Storing Docker images in the Lustre file system has also been proposed [42]. However, this solution involved an inefficient way of abusing the device-mapper driver and creating loopback devices in Lustre so that the data-root can be pointed to those devices. The device-mapper driver is now deprecated and will be removed in future Docker releases, so such a solution is also not future-proof.

2) Podman: Podman [43] provides a comprehensive container solution similar to Docker. The main driving force for the development of Podman has been the need to avoid the privileged management daemon used in Docker. Podman replaces the daemon-client architecture of Docker with individual processes that run the containers, and uses conmon [44] to provide the necessary monitoring and debugging capabilities. Figure 1b shows an overview of the Podman architecture.

Podman can also run rootless containers through the use of user namespaces [45], which ensures additional security and separation among containers, especially in HPC scenarios.

Podman, however, also relies on OverlayFS to store and run containers, similar to Docker. The layer structure of Podman images is OCI compliant, but the dependence on OverlayFS is a roadblock for storing images on distributed file systems in HPC environments, as discussed earlier.

3) Singularity: Singularity [10] is a container framework tailored specifically for HPC environments. Its goals are to provide isolation for workloads while preventing privilege escalation; offer native support for MPI, InfiniBand, and GPUs; and support ease of portability and reusability by distributing container images as a single SquashFS [46] file. Thus, Singularity can store images and run containers from the Lustre file system. Figure 1c shows an overview of Singularity.

A key feature of Singularity is its focus on security, such as root file system encryption as well as cryptographic signing of container images. Singularity containers are read-only by default and all write operations can only be done via bind-mounted directories. These features make Singularity attractive for HPC systems.
4) Charliecloud: Another HPC-focused containerization solution is Charliecloud [13], which also provides native MPI support similar to Singularity. Charliecloud differs from Singularity in that it focuses on simplicity of architecture instead of portability. Its primary aim is to provide a way to encapsulate dependencies with minimal overhead. Charliecloud employs Docker or other image builder tools such as Buildah [47], umoci [48], and the built-in ch-grow to build an image, and then extracts all the contents of that image into an unpacked file tree format. As a result, instead of existing as compressed layers (as in Docker and Podman) or as a single file (as in Singularity), Charliecloud images are stored as a file tree. Applications can be run from the file tree by pivot rooting [49] into it, as shown in Figure 1d. Charliecloud also provides a means of constructing and running compressed SquashFS images akin to Singularity, which is expected to perform similarly to Singularity. Unlike Docker, which isolates everything, Charliecloud only uses separate namespaces for mount and user, whereas the network and process namespaces are shared with the underlying OS. Because of this minimal isolation, Charliecloud is intended to be used in a trusted environment (similar to an HPC allocation).

Discussion: Although Docker and Podman are designed for cloud systems, and not necessarily for HPC environments, they are well-established and widely used container solutions and their efficacy in HPC should be explored. This is further underscored by the observation that HPC-focused container solutions also support Docker images. Moreover, systems such as Charliecloud can employ Docker as needed. We note that Shifter [12] is another prominent HPC container solution. We did create and evaluate an approximated setup of Shifter from available resources. However, due to Shifter's close dependence on NERSC's difficult-to-recreate supercomputers and their job scheduling setup, the results from our Shifter study did not yield useful insights for general HPC containers. Therefore, we do not include those observations in this paper.

C. HPC Parallel File Systems

HPC PFSs are designed to distribute file data across multiple servers so that multiple clients can access a file system simultaneously at extreme scales. Typically, an HPC PFS consists of clients that read or write data to the file system, data servers that store the data, metadata servers that manage the metadata and the placement of data on the data servers, and networks to connect these components. Data may be distributed (divided into stripes) across multiple data servers to enable parallel reads and writes. This level of parallelism is transparent to the clients, which access the PFS as if it were a local file system. Therefore, important functions of a PFS include avoiding potential conflicts among multiple clients and ensuring data integrity and system redundancy. The most common HPC PFSs include Lustre, GlusterFS, BeeGFS, and IBM Spectrum Scale, with Lustre being one of the most extensively used in HPC environments.
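As a concrete illustration of striping (our example, not part of the paper's framework), a Lustre client can set and inspect the stripe layout with the standard lfs utility; the mount point, stripe count, and file name below are assumptions.

    import os
    import subprocess

    lustre_dir = "/mnt/lustre/demo"          # assumed Lustre-mounted directory
    os.makedirs(lustre_dir, exist_ok=True)

    # New files created in this directory are striped across 4 OSTs, 1 MiB per stripe.
    subprocess.run(["lfs", "setstripe", "-c", "4", "-S", "1m", lustre_dir], check=True)

    # Write a file and ask Lustre which OSTs hold its stripes; clients then read and
    # write those stripes in parallel, talking to the OSSs directly.
    path = os.path.join(lustre_dir, "output.dat")
    with open(path, "wb") as f:
        f.write(os.urandom(4 * 1024 * 1024))  # 4 MiB, so the data spans all 4 stripes
    subprocess.run(["lfs", "getstripe", path], check=True)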


Figure 2: An overview of the Lustre architecture.

1) Lustre: Figure 2 shows a high-level overview of the Lustre architecture. Lustre has a client-server network architecture and is designed for high performance and scalability. The Management Server (MGS) is responsible for storing the configuration information for the entire Lustre file system. This persistent information is stored on the Management Target (MGT). The Metadata Server (MDS) manages all the namespace operations for the file system. The namespace metadata, such as directories, file names, file layout, and access permissions, are stored in a Metadata Target (MDT). Every Lustre file system must have a minimum of one MDT. Object Storage Servers (OSSs) provide the storage for the file contents. Each file is stored on one or more Object Storage Targets (OSTs) mounted on the OSSs. Applications access the file system data via Lustre clients that interact with the OSSs directly for parallel file access. The internal high-speed data networking protocol for the Lustre file system is abstracted and managed by the Lustre Network (LNet) layer.

III. ANALYSIS FRAMEWORK

A. Lustre Deployment

We use a Lustre cluster of ten nodes with one MDS, seven OSSs, and two clients for our study. Each node runs CentOS 7 with an AMD FX-8320E eight-core 3.2 GHz processor, 16 GB of RAM, and a 512 GB SSD. All nodes are interconnected with 10 Gbps Ethernet. Furthermore, each OSS has five OSTs, each supporting 10 GB of attached storage. The setup offers a 350 GB Lustre store for our analysis.

B. Container Deployment

For deploying Docker and Podman on our HPC setup, we utilized the clients' local storage because Lustre does not support OverlayFS (as discussed in Section II-B). For Singularity and Charliecloud, the analysis was mainly done with Lustre. However, we used both Lustre and the clients' local storage for the startup time tests and the final multi-node throughput tests. We benchmark the CPU, memory, and file I/O performance of the containers with Sysbench [29]. We also studied Lustre I/O throughput under containerized real-world HPC workloads, including HaccIO [30] and IOR [31].

C. Analysis Framework

We build a framework, shown in Figure 3, to encapsulate all the required operations of building the container images, running them, and collecting and analyzing the metrics.

Figure 3: Architecture of our analysis framework.

The framework takes as input the container solution to analyze and the type of benchmark to run. When the framework is first deployed, it invokes the container image build process. The built images are stored on the local file system or on Lustre, depending on the container deployment. The framework incorporates the collection of sar [50] metrics on the client for the benchmarks. On the Lustre server nodes (MDS and OSS), the framework collects the sar metrics, as well as OSS- and MDS-specific statistics from the individual nodes. The framework collects metrics on CPU and memory utilization, network usage, and reads and writes on the storage devices.

When the analysis framework is activated, it starts servers on the Lustre nodes that listen for GET requests from the clients. The GET requests are sent from the client nodes at the start and stop of each benchmark run to trigger the start and stop of metrics collection on the Lustre nodes, respectively. After activating metrics collection, the container with the benchmark code is started and the benchmark is run. At the end of each benchmark run, the used container is removed from memory and disk, and the caches are cleared. For each new benchmark run, a new container is spawned from the corresponding image to avoid reuse of any cached data or old paused containers and thus avoid data pollution. All Sysbench benchmarks are cut off at 30 seconds (selected based on our observation of typical benchmark runs on our target cluster) and are forcibly stopped if they do not complete in this time period. However, the HaccIO and IOR benchmarks are run to completion. The metrics collected from each of the nodes for each run are gathered at a central location for analysis.
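The framework's code is not listed in the paper; the sketch below only illustrates the start/stop-over-GET idea described above, using Python's standard http.server and sar. The port, URL paths, output file, and sar flags are our assumptions, not the authors' exact implementation.

    import signal
    import subprocess
    from http.server import BaseHTTPRequestHandler, HTTPServer

    sar_proc = None  # the currently running sar collector, if any

    class MetricsTrigger(BaseHTTPRequestHandler):
        def do_GET(self):
            global sar_proc
            if self.path == "/start" and sar_proc is None:
                # Sample CPU (-u), memory (-r), network (-n DEV), and block I/O (-b)
                # once per second until we are told to stop.
                out = open("/tmp/sar_run.log", "w")
                sar_proc = subprocess.Popen(
                    ["sar", "-u", "-r", "-n", "DEV", "-b", "1"], stdout=out)
            elif self.path == "/stop" and sar_proc is not None:
                sar_proc.send_signal(signal.SIGINT)  # let sar flush its final averages
                sar_proc.wait()
                sar_proc = None
            self.send_response(200)
            self.end_headers()

    # One such server runs on every MDS/OSS node; the clients hit /start and /stop
    # around each benchmark run.
    HTTPServer(("0.0.0.0", 8000), MetricsTrigger).serve_forever()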
IV. ANALYSIS

A. Workloads Description

Our experimental setup comprises a Lustre file system deployment of seven OSSs and one MDS. Container operations were done through two client nodes to validate both single- and multi-node setups. Our single-node workloads employed Sysbench to benchmark memory performance, CPU performance, and sequential and random file I/O throughput. All Sysbench benchmarks are run with eight threads and measurements are averaged across ten runs.

The CPU workload from Sysbench repeatedly calculates prime numbers up to 1,000,000 to stress load the CPU until the specified timeout is reached. It then reports the number of repetitions, broken down by thread, during the benchmark run, and the average latency for each repetition.

The memory workload consists of repeatedly allocating a 1 KB block, performing sequential/random reads/writes (depending on the options passed to Sysbench), and deallocating the block. This is repeated until the timeout is reached or 100 GB of data is read or written. The benchmark then reports the amount of data read or written, and the total and per-thread average time taken to complete the benchmark.

The file I/O benchmark consists of preparing 20 GB of data split across 128 files filled with random data, and then performing sequential or random synchronous file I/O operations on the files, depending on the particular file I/O benchmark being run. Sysbench reports the amount of data read and written during the benchmark run.

We use sar to collect system statistics during the Sysbench benchmark runs.
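For reference, the following sketch shows how such Sysbench workloads can be invoked; the option values mirror the parameters given above (eight threads, 30-second cutoff, 20 GB over 128 files), but the exact invocation used by the framework is not given in the paper, so treat the flags as assumptions. In the study, these commands run inside each container rather than directly on the host.

    import subprocess

    COMMON = ["--threads=8", "--time=30"]          # 8 threads, 30 s cutoff

    # CPU: repeatedly find primes up to 1,000,000.
    subprocess.run(["sysbench", "cpu", "--cpu-max-prime=1000000", *COMMON, "run"],
                   check=True)

    # Memory: 1 KiB blocks, random reads, stop at 30 s or 100 GB of traffic.
    subprocess.run(["sysbench", "memory", "--memory-block-size=1K",
                    "--memory-total-size=100G", "--memory-oper=read",
                    "--memory-access-mode=rnd", *COMMON, "run"], check=True)

    # File I/O: 20 GB spread over 128 files, synchronous random read-write.
    fileio = ["sysbench", "fileio", "--file-total-size=20G", "--file-num=128",
              "--file-test-mode=rndrw", "--file-io-mode=sync"]
    subprocess.run([*fileio, "prepare"], check=True)   # create the 128 data files
    subprocess.run([*fileio, *COMMON, "run"], check=True)
    subprocess.run([*fileio, "cleanup"], check=True)   # remove them afterwards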
In addition, we also tested a real-world multi-node setup using HaccIO [30] and IOR [31] running simultaneously from two different client nodes. The HaccIO benchmark simulates the I/O patterns of the Hardware Assisted Cosmology Code [51], representing a typical workload for HPC systems. It is run with the number of particles set to 4,096,000 for all the runs. In the IOR benchmark, each MPI process writes a block of 256 MB of data in parallel and reads it back with the processes shifted, so that no process reads the same block it wrote. We look at the I/O throughput reported by these benchmark suites, as well as the statistics reported by Lustre, to study relative performance.
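As an illustration of the IOR configuration just described (one 256 MB block per process, read back with shifted ranks), a run could be launched as below; the MPI launcher, transfer size, and output path are our assumptions rather than the authors' exact command.

    import subprocess

    nprocs = 8                        # 2, 4, or 8 MPI processes in the paper's runs
    subprocess.run(
        ["mpirun", "-np", str(nprocs), "ior",
         "-w", "-r",                  # write phase, then read phase
         "-C",                        # shift ranks for the read-back so no process
                                      # reads the block it just wrote
         "-b", "256m",                # one 256 MB block per process
         "-t", "1m",                  # 1 MB transfer size (assumed)
         "-o", "/mnt/lustre/ior_testfile"],   # shared test file on Lustre (assumed)
        check=True)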

B. Research Questions

In analyzing and comparing the studied container solutions, we try to answer the following research questions:

Q1. How does performance compare among the different container solutions running in an HPC environment?

Q2. How does the Lustre file system performance differ for the different container solutions?

C. Container Startup Time

Figure 4 shows the startup time for the studied container solutions.

Figure 4: OS image startup times under the studied container solutions.

The higher startup time in Docker and Podman is due to the overhead of building the container from the multiple image layers, setting up the read-write layer, and setting up the monitoring of the containers. Charliecloud and Singularity do not use layers to build their containers, nor do they set up a read-write layer by default. In addition, they usually expect the job scheduler to handle monitoring. As a result, their startup times are lower than Docker's.
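Startup time can be measured externally by timing the launch of a trivial command in a fresh container, which is essentially what Figure 4 reports. The sketch below shows one way to do this for Singularity and Charliecloud; the image paths are assumed, the paper does not specify its exact timing harness, and caches should be cleared between runs as the framework does.

    import subprocess
    import time

    def startup_seconds(cmd):
        """Time a cold container start that runs /bin/true and exits."""
        start = time.perf_counter()
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        return time.perf_counter() - start

    # Singularity: the image is a single SquashFS file (here assumed to be on Lustre).
    sing = startup_seconds(
        ["singularity", "exec", "/mnt/lustre/images/centos.sif", "true"])

    # Charliecloud: the image is an unpacked directory tree; ch-run pivot-roots into it.
    char = startup_seconds(
        ["ch-run", "/mnt/lustre/images/centos", "--", "true"])

    print(f"singularity: {sing:.2f}s  charliecloud: {char:.2f}s")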

D. Observations on the Client

1) Resource usage spikes in container startup: Starting CPU utilization is higher for Docker and Podman, as seen in Figure 5a. Docker and Podman use additional computational resources for spawning containers and starting up monitoring, and thus have additional CPU overhead at startup. This is likely because of the additional work of starting containers from multi-layered images, which also explains their longer startup time.

There is also a noticeable spike in data received over the network for Singularity, as seen in Figure 5b. This spike occurs because the Singularity image is stored as a single file on Lustre. This results in Singularity having to load a huge amount of data (and not just the startup portion) from the image at startup time, which in turn slows the startup process.

Figure 5c shows that Charliecloud has a spike in the data transmitted from the client node over the network. This is likely because Charliecloud has to make a larger number of MDS and OSS requests: since its image is a bare tree structure, not shared layers or a single file, it needs to access a lot of individual files from the image tree on Lustre from the outset.

Figure 5d shows a spike in blocks written for Docker, which is likely caused by the creation of the read-write layer when starting the container. This spike for Docker is significantly higher than that for Podman, which also does the same thing. This indicates that Podman uses far fewer resources for the startup process.

Figure 5: Resource usage spikes observed during container startup for the Sysbench benchmarks: (a) spike in CPU utilization for Docker and Podman during Sysbench sequential read-write; (b) spike in data received over the network for Singularity during Sysbench memory random read¹; (c) spike in data transmitted over the network for Charliecloud during Sysbench memory random read¹; (d) spike in blocks written for Docker during Sysbench sequential read-write.

¹The memory read benchmark itself does not do any file or network I/O.

2) CPU and Memory Utilization: Figures 6 and 7 show box plots of the CPU and memory utilization, respectively, for the CPU (sbcpu), memory (sbmem), and I/O (sbfileio) workloads across 10 runs. All four container solutions behave fairly similarly in how they use CPU and memory for the studied workloads.

Figure 6: CPU utilization for different workloads.

Figure 7: Memory utilization for different workloads.

3) File I/O Throughput: Table I shows the average read and write throughput on the client node for Sysbench.

Table I: Throughput (MB/s) for container solutions for file I/O workloads.

                        Docker          Podman          Singularity     Charliecloud
Benchmark               read    write   read    write   read    write   read    write
Sequential Read         226.691 0       224.137 0       222.045 0       226.13  0
Sequential Write        0       87.447  0       86.808  0       90.655  0       90.447
Sequential Read-Write   0       89.387  0       89.044  0       91.498  0       94.357
Random Read             124.404 0       128.498 0       131.483 0       123.033 0
Random Write            0       39.029  0       38.76   0       39.787  0       39.949
Random Read-Write       29.012  19.339  28.515  19.007  29.605  19.733  29.198  19.465

Singularity and Charliecloud slightly outperform Docker and Podman in read and write throughput in most cases. Singularity performs worst for sequential read, and Charliecloud gives the worst performance for random read. But the performance differences are fairly small and do not seem to indicate significant container overhead during the benchmark run. This observation also holds for the real-world benchmarks shown in Section IV-F, even though there are observed differences in the performance of the underlying Lustre PFS.

E. Behavior of Lustre File System

Table II: Average Lustre activity per run of the Sysbench random read/write file I/O benchmark (Sing: Singularity, Char: Charliecloud).

Average                     Docker    Podman    Sing      Char
read bytes/OSS (MB)         317.8     314.71    370.52    352.72
write bytes/OSS (MB)        83.26     81.86     84.97     83.85
# of requests/OSS           9920      9760      10179     10072
req waittime/OSS (µs)       36.59     36.55     36.06     36.37
# of requests in MDS        120768    118764    123235    121772
req waittime in MDS (µs)    24.87     24.91     24.89     24.45
open calls in MDS           128       128       131       200

Table II shows the Lustre statistics for the Sysbench random read-write file I/O benchmark. Singularity has the highest amount of data read per OSS. This is mainly because Singularity has to read larger blocks of content from its container image on Lustre, in addition to the read operations of the benchmark itself. Docker and Podman read container images from the client's local file system, so they do not exhibit such an overhead. Singularity reads larger blocks compared to Charliecloud. This is because Singularity's container image is a large single file and would load more data at a time, whereas Charliecloud reads smaller amounts of data since it only has to read individual files from its unpacked file tree.

For the same reasons, Singularity also shows a higher average number of requests on the MDS and OSSs. However, this is in contrast to the multi-node real-world scenario, where Charliecloud has a comparatively higher number of OSS and MDS requests, as discussed in Section IV-F. In either case, the higher number of metadata and data operations caused by container overhead could potentially lead to more metadata and I/O contention on the MDS and OSSs when working with a large number of I/O-intensive containerized applications in an HPC environment.
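The per-server counters in Table II come from Lustre's own statistics interfaces. The exact counters and tooling used by the framework are not spelled out in the paper; as an assumed sketch, the obdfilter stats exposed through lctl on an OSS can be read and differenced around a benchmark run.

    import subprocess

    def ost_counters():
        """Rough per-OSS totals from cumulative obdfilter counters (assumed source)."""
        text = subprocess.run(["lctl", "get_param", "-n", "obdfilter.*.stats"],
                              capture_output=True, text=True, check=True).stdout
        totals = {"read_bytes": 0, "write_bytes": 0, "requests": 0}
        for line in text.splitlines():
            fields = line.split()
            if len(fields) < 2 or fields[0] == "snapshot_time":
                continue
            name, count = fields[0], int(fields[1])
            if name in ("read_bytes", "write_bytes") and len(fields) >= 7:
                totals[name] += int(fields[6])  # last column is the cumulative byte sum
            totals["requests"] += count         # fields[1] is the sampled request count
        return totals
        # (On the MDS, mdt.*.md_stats exposes the equivalent metadata counters.)

    before = ost_counters()
    # ... run the containerized benchmark here ...
    after = ost_counters()
    print({k: after[k] - before[k] for k in before})   # per-run deltas, as in Table II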

F. Real-World Benchmarks: HaccIO and IOR

As a final step, we also obtain the I/O metrics across ten HaccIO and IOR runs on the Lustre file system, run through each container solution in a multi-node setup. We run HaccIO and IOR separately on each client node to mimic a heterogeneous workload on Lustre. The number of MPI processes is varied to see how the throughput changes with increased parallelism in the workloads. In this workload, HaccIO and IOR are run simultaneously in the multi-node setup with two, four, and eight MPI processes each.

Figures 8 and 9 show that Charliecloud, with increasing numbers of processes, has the highest amount of activity on the OSSs and MDS. This is significantly higher than any other container solution, due to the unpacked file tree structure of its container image. This means that Charliecloud has to read multiple small files individually. These reads multiply as we increase the number of MPI processes, since they are all independent of each other. As a result, Charliecloud has to do a lot of reads in addition to just the work of IOR, leading to a higher number of requests on the Lustre metadata and storage server nodes. A Singularity image, on the other hand, is stored as just one file and does not incur such an overhead. Therefore, Singularity scales better with increasing parallelism. For a single process, Charliecloud fares better, as it makes fewer requests to the OSSs and MDS for image data reads compared to Singularity. This is seen in Table II, which shows that Charliecloud is more efficient than Singularity in Lustre operations, though their read and write throughput are similar, as seen in Table I. However, such is not the case in a multi-process setup, where Singularity behaves better from the Lustre perspective.

Figure 8: Box plot of the number of requests per OSS across ten runs, averaged across the seven OSSs, for the HaccIO and IOR benchmarks with different numbers of MPI processes.

Figure 9: Box plot of the number of requests on the MDS across ten runs for the HaccIO and IOR benchmarks with different numbers of MPI processes.

We also performed another test of HaccIO and IOR in the multi-node setup by increasing the number of particles in HaccIO to 8,192,000 and changing the number of 256 MB blocks written by each MPI process from one to four. This means that each run has eight MPI processes and the total IOR data written and read is 8 GB. This experiment allows a better look at the I/O throughput. Additionally, we perform a run of Singularity and Charliecloud from local storage, similar to Docker and Podman, to see whether there is a difference in throughput between running the same container from local storage and from the PFS. We focus on the IOR read and write throughput results.

Figure 10: Box plot showing read and write throughput for IOR during the multi-node run (Sing: Singularity, Char: Charliecloud, Lus: image stored on Lustre, Loc: image stored on the client's local storage).

Figure 10 shows that Singularity and Charliecloud do not exhibit any differences in their throughput when we compare storing the container image on Lustre versus the local file system. This indicates that running these container images from a distributed file system instead of a local file system does not affect their observed I/O performance.

G. Discussion

In answering our first research question of how the performance of the studied container solutions differs, the most important observation is the significant difference in startup times among the studied container solutions, as well as the spikes seen in various metrics during container startup. Here, Charliecloud offers the best startup performance. Besides startup, the container solutions do not show any major differences in non-I/O-related metrics. In terms of I/O performance, it is clear that the Lustre-based Singularity and Charliecloud are comparable to, and sometimes better than, the local-storage-based Docker and Podman. This applies to both single-node and multi-node scenarios. In the multi-node scenario, the Lustre-based tests showed I/O performance comparable to that of the local storage tests.

In exploring our second research question of how the Lustre file system performance differs for different container solutions, we observe a large difference in the number of file and metadata requests that occur with Singularity and Charliecloud compared to Docker and Podman. This difference gets more pronounced with increasing numbers of processes, especially for Charliecloud. It is clear that the way Charliecloud is built (using a whole uncompressed file tree) causes it to create a large number of MDS and OSS requests stemming from the large number of individual file requests for the tree format. This overhead increases with increasing levels of parallelism because each additional process needs to individually access the container-related files. Singularity seems to be the better option for running containers on HPC file systems because of its single-file nature, which greatly reduces the MDS and OSS requests needed for just the container functionality as parallelism increases. This is an important finding, and should be considered if users expect an HPC environment with a large number of parallel jobs running from containers atop Lustre; using the wrong container solution could potentially lead to resource contention and bottlenecks on the MDS and OSS servers. The Charliecloud documentation does note that there can be a high metadata load when running uncompressed images from a shared file system. However, we are the first to explicitly measure the extent of the problem and how it aggravates with increasing parallelism.

V. RELATED WORK

Many works have performed analyses of container solutions focusing on resource usage and performance. Kovács [52] offers basic comparisons on a smaller scale between different container solutions. On a larger scale, Rudyy et al. [7] compare Docker, Shifter, and Singularity in an HPC environment, focusing on scalability and portability between different architectures. Younge et al. [21] specifically focus on comparing Singularity and Docker in a Cray system environment. Torrez et al. [53] discuss the performance similarities of Singularity, Charliecloud, and Shifter at the CPU, memory, and application level in HPC. Wharf [54] explores storing Docker container images in a distributed file system such that different layers can be shared among users. In contrast to these works, this paper focuses on the effect of file I/O operations from different container images stored on Lustre and local storage.

Arango et al. [22] compare CPU, memory, GPU, and disk I/O performance for LXC, Docker, and Singularity. In comparison, our work provides a much closer look at the details of a distributed file system's behavior under container operations. Le et al. [23] conduct a performance comparison for scientific simulation benchmarks between Singularity and bare-metal runs on the Comet supercomputer at SDSC [55]. Beltre et al. [56] and Saha et al. [57] evaluate the use of containerization in cloud infrastructure for HPC and the effects of different interconnects. These works are complementary to ours, wherein our work aims to fill the gap in knowledge about the behavior of different container solutions' I/O on a parallel file system such as Lustre.

Huang et al. [25] perform a comparison of Lustre and GlusterFS as backing stores for a cloud system and conclude that Lustre is superior in performance and throughput. That is another motivation for using Lustre as the HPC file system for our analysis. Zhao et al. [24] offer a comparison of representative file systems for scientific applications and make the case for distributed metadata management. Pan et al. [26] provide a framework for integrating a PFS into cloud settings for use with HPC applications hosted in the cloud.
All of these works evaluate file systems for the cloud setting, but do not cover the effect of containers on Lustre, which is analyzed in this paper.

VI. CONCLUSION

We have presented an empirical analysis of container solutions for HPC environments. We study four container solutions, Docker, Podman, Singularity, and Charliecloud, on the widely popular Lustre file system. We present a framework for managing the analysis of the different container solutions, incorporating multiple benchmarks, and integrating the metrics collection from the clients as well as the multiple Lustre server nodes. Our evaluation shows startup time overhead for Docker and Podman, as well as network overhead at startup time for Singularity and Charliecloud. Our I/O evaluations show that with increasing parallelism, Charliecloud incurs a large overhead on Lustre's MDS and OSSs. Moreover, we find that the observed throughput of containers on Lustre is on par with containers running from local storage. This is promising. In future work, we plan to extend our analysis to HPC object stores as well as other parallel file systems, such as BeeGFS.

ACKNOWLEDGMENT

This work is sponsored in part by the National Science Foundation under grants CCF-1919113, CNS-1405697, CNS-1615411, and CNS-1565314/1838271.

REFERENCES

[1] "14 Tech Companies Embracing Container Technology." https://learn.g2.com/container-technology. Accessed: November 30, 2019.
[2] E. Gawehn, J. A. Hiss, and G. Schneider, "Deep Learning in Drug Discovery," Molecular Informatics, vol. 35, no. 1, 2016.
[3] M. Reichstein, G. Camps-Valls, B. Stevens, M. Jung, J. Denzler, N. Carvalhais, and Prabhat, "Deep learning and process understanding for data-driven Earth system science," Nature, vol. 566, no. 7743, pp. 195–204, 2019.
[4] G. González and C. L. Evans, "Biomedical Image Processing with Containers and Deep Learning: An Automated Analysis Pipeline," BioEssays, vol. 41, no. 6, p. 1900004, 2019.
[5] R. Chard, Z. Li, K. Chard, L. Ward, Y. Babuji, A. Woodard, S. Tuecke, B. Blaiszik, M. J. Franklin, and I. Foster, "DLHub: Model and Data Serving for Science," in Proceedings of the 2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS), May 2019.
[6] P. Xu, S. Shi, and X. Chu, "Performance Evaluation of Deep Learning Tools in Docker Containers," in Proceedings of the 3rd International Conference on Big Data Computing and Communications (BIGCOM), Aug. 2017.
[7] O. Rudyy, M. Garcia-Gasulla, F. Mantovani, A. Santiago, R. Sirvent, and M. Vázquez, "Containers in HPC: A Scalability and Portability Study in Production Biological Simulations," in Proceedings of the 2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS), May 2019.
[8] W. Shin, C. D. Brumgard, B. Xie, S. S. Vazhkudai, D. Ghoshal, S. Oral, and L. Ramakrishnan, "Data Jockey: Automatic Data Management for HPC Multi-tiered Storage Systems," in Proceedings of the 2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS), May 2019.
[9] C. Seamons, "Building complex software inside containers (poster)," in 2019 International Conference for High Performance Computing, Networking, Storage and Analysis (SC), ACM, Nov. 2019.
[10] G. M. Kurtzer, V. Sochat, and M. W. Bauer, "Singularity: Scientific containers for mobility of compute," PLoS ONE, vol. 12, no. 5, p. e0177459, 2017.
[11] "Podman." https://podman.io/. Accessed: November 30, 2019.
[12] L. Gerhardt, W. Bhimji, M. Fasel, J. Porter, M. Mustafa, D. Jacobsen, V. Tsulaia, and S. Canon, "Shifter: Containers for HPC," in J. Phys. Conf. Ser., vol. 898, 2017.
[13] R. Priedhorsky and T. Randles, "Charliecloud: Unprivileged containers for user-defined software stacks in HPC," in Proceedings of the 2017 International Conference for High Performance Computing, Networking, Storage and Analysis (SC), (Denver, Colorado), ACM, Nov. 2017.
[14] A. K. Paul, O. Faaland, A. Moody, E. Gonsiorowski, K. Mohror, and A. R. Butt, "Understanding HPC application I/O behavior using system level statistics," 2019.
[15] B. Wadhwa, A. K. Paul, S. Neuwirth, F. Wang, S. Oral, A. R. Butt, J. Bernard, and K. Cameron, "iez: Resource contention aware load balancing for large-scale parallel file systems," in Proceedings of the IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2019.
[16] A. K. Paul, R. Chard, K. Chard, S. Tuecke, A. R. Butt, and I. Foster, "FSMonitor: Scalable file system monitoring for arbitrary storage systems," in 2019 IEEE International Conference on Cluster Computing (CLUSTER), pp. 1–11, IEEE, 2019.
[17] A. K. Paul, B. Wang, N. Rutman, C. Spitz, and A. R. Butt, "Efficient metadata indexing for HPC storage systems," in CCGRID, p. 10, 2020.
[18] M. G. Xavier, M. V. Neves, F. D. Rossi, T. C. Ferreto, T. Lange, and C. A. De Rose, "Performance evaluation of container-based virtualization for high performance computing environments," in Proceedings of the 21st Euromicro International Conference on Parallel, Distributed, and Network-Based Processing, IEEE, 2013.
[19] C. Ruiz, E. Jeanvoine, and L. Nussbaum, "Performance Evaluation of Containers for HPC," in Proceedings of the 2015 European Conference on Parallel Processing, 2015.
[20] J. Sparks, "Enabling Docker for HPC," Concurrency and Computation: Practice and Experience, Dec. 2018.
[21] A. J. Younge, K. Pedretti, R. E. Grant, and R. Brightwell, "A tale of two systems: Using containers to deploy HPC applications on supercomputers and clouds," in Proceedings of the 2017 IEEE International Conference on Cloud Computing Technology and Science (CloudCom), Dec. 2017.
[22] C. Arango, R. Dernat, and J. Sanabria, "Performance Evaluation of Container-based Virtualization for High Performance Computing Environments," arXiv:1709.10140 [cs], Sept. 2017.
[23] "Performance Analysis of Applications using Singularity Container on SDSC Comet," in Proceedings of the Practice and Experience in Advanced Research Computing 2017 on Sustainability, Success and Impact (PEARC), (New Orleans, LA, USA), 2017.
[24] D. Zhao, X. Yang, I. Sadooghi, G. Garzoglio, S. Timm, and I. Raicu, "High-Performance Storage Support for Scientific Applications on the Cloud," in Proceedings of the 6th Workshop on Scientific Cloud Computing, (Portland, Oregon, USA), June 2015.
[25] W.-C. Huang, C.-C. Lai, C.-A. Lin, and C.-M. Liu, "File System Allocation in Cloud Storage Services with GlusterFS and Lustre," in Proceedings of the 2015 IEEE International Conference on Smart City/SocialCom/SustainCom (SmartCity), Dec. 2015.
[26] "Integrating High Performance File Systems in a Cloud Computing Environment," in 2012 SC Companion: High Performance Computing, Networking Storage and Analysis, (Salt Lake City, UT), 2012.
[27] A. K. Paul, A. Goyal, F. Wang, S. Oral, A. R. Butt, M. J. Brim, and S. B. Srinivasa, "I/O load balancing for big data HPC applications," in International Conference on Big Data, pp. 233–242, IEEE, 2017.
[28] A. K. Paul, S. Tuecke, R. Chard, A. R. Butt, K. Chard, and I. Foster, "Toward scalable monitoring on large-scale storage for software defined cyberinfrastructure," in Proceedings of the 2nd Joint International Workshop on Parallel Data Storage & Data Intensive Scalable Computing Systems, pp. 49–54, 2017.

[29] "Sysbench." https://github.com/akopytov/sysbench. Accessed: November 30, 2019.
[30] "Coral benchmarks: Haccio." https://asc.llnl.gov/CORAL-benchmarks/#hacc. Accessed: November 30, 2019.
[31] "LLNL - IOR Benchmark." https://asc.llnl.gov/sequoia/benchmarks/IORsummaryv1.0.pdf. Accessed: March 11, 2019.
[32] "OpenSFS and EOFS - Lustre file system." http://lustre.org/. Accessed: March 23, 2019.
[33] "Parallel Virtual File Systems on Microsoft Azure Part 2." https://bit.ly/2OGat4k. Accessed: November 30, 2019.
[34] "Top 500 List." https://www.top500.org/lists/2019/11/. Accessed: November 30, 2019.
[35] "Frontier." https://www.olcf.ornl.gov/frontier/#4. Accessed: March 3, 2020.
[36] "Kernel Virtual Machine." https://www.linux-kvm.org/page/Main_Page. Accessed: November 30, 2019.

[37] "cgroups - Linux man pages." http://man7.org/linux/man-pages/man7/cgroups.7.html. Accessed: November 30, 2019.
[38] "namespaces - Linux man pages." http://man7.org/linux/man-pages/man7/namespaces.7.html. Accessed: November 30, 2019.
[39] "Docker." https://www.docker.com/. Accessed: November 19, 2019.
[40] "LXC." https://linuxcontainers.org/lxc/. Accessed: November 25, 2019.
[41] "Use the OverlayFS storage driver." https://docs.docker.com/storage/storagedriver/overlayfs-driver/. Accessed: November 25, 2019.
[42] "Lustre graph driver for Docker." https://github.com/bacaldwell/lustre-graph-driver.

[43] "What is Podman?" https://podman.io/whatis.html. Accessed: November 25, 2019.
[44] "conmon." https://github.com/containers/conmon. Accessed: November 25, 2019.
[45] "User namespaces support in Podman." https://www.projectatomic.io/blog/2018/05/podman-userns/. Accessed: November 25, 2019.
[46] "SquashFS." https://github.com/plougher/squashfs-tools. Accessed: November 25, 2019.
[47] "Buildah." https://buildah.io/. Accessed: May 19, 2020.

[48] A. S. et al., "umoci - standalone tool for manipulating container images," 2016.
[49] "pivot_root(2): change root file system - Linux man page." Accessed: March 22, 2020.
[50] "sar(1) - Linux man page." https://linux.die.net/man/1/sar.
[51] S. Habib, "Cosmology and Computers: HACCing the Universe," in Proceedings of the 2015 International Conference on Parallel Architecture and Compilation (PACT), Oct. 2015.
[52] Kovács, "Comparison of different Linux containers," in Proceedings of the 40th International Conference on Telecommunications and Signal Processing (TSP), July 2017.

[53] A. Torrez, T. Randles, and R. Priedhorsky, "HPC container runtimes have minimal or no performance impact," in 2019 IEEE/ACM International Workshop on Containers and New Orchestration Paradigms for Isolated Environments in HPC (CANOPIE-HPC), pp. 37–42, 2019.
[54] C. Zheng, L. Rupprecht, V. Tarasov, D. Thain, M. Mohamed, D. Skourtis, A. S. Warke, and D. Hildebrand, "Wharf: Sharing Docker Images in a Distributed File System," in Proceedings of the 2018 ACM Symposium on Cloud Computing (SoCC), (CA, USA), pp. 174–185, Oct. 2018.
[55] "San Diego Supercomputer Center." https://www.sdsc.edu/. Accessed: November 30, 2019.
[56] A. M. Beltre, P. Saha, M. Govindaraju, A. Younge, and R. E. Grant, "Enabling HPC Workloads on Cloud Infrastructure Using Kubernetes Container Orchestration Mechanisms," in Proceedings of the 2019 IEEE/ACM International Workshop on Containers and New Orchestration Paradigms for Isolated Environments in HPC (CANOPIE-HPC), Nov. 2019.
[57] P. Saha, A. Beltre, P. Uminski, and M. Govindaraju, "Evaluation of Docker Containers for Scientific Workloads in the Cloud," in Proceedings of the Practice and Experience on Advanced Research Computing (PEARC), July 2018.