ENERGY 2014 : The Fourth International Conference on Smart Grids, Green Communications and IT Energy-aware Technologies
Copyright (c) IARIA, 2014. ISBN: 978-1-61208-332-2

Evaluating Power-Performance Benefits of Data Compression in HPC Storage Servers

Konstantinos Chasapis, Manuel F. Dolz, Michael Kuhn, Thomas Ludwig
Department of Informatics, University of Hamburg, Hamburg, Germany
E-mails: {konstantinos.chasapis, manuel.dolz, michael.kuhn, [email protected]}

Abstract—Both energy and storage are becoming key issues in high-performance computing (HPC) systems, especially when thinking about upcoming Exascale systems. The amount of energy consumption and storage capacity needed to solve future problems is growing sharply, a trend that the HPC community must face in cost-/energy-efficient ways. In this paper, we provide a power-performance evaluation of HPC storage servers that take over tasks other than simply storing the data to disk. We use the Lustre parallel distributed file system with its ZFS back-end, which natively supports compression, to show that data compression can help to alleviate capacity and energy problems. In the first step of our analysis, we study different compression algorithms with regard to their CPU and power overhead on a real dataset. Then, we use a modified version of the IOR benchmark to verify our claims for the HPC environment. The results demonstrate that the energy consumption can be reduced by up to 30 % in the write phase of the application and by 7 % for write-intensive applications. At the same time, the required storage capacity can be reduced by approximately 50 %. These savings can help in designing more power-efficient and leaner storage systems.

Keywords—power consumption; high performance computing; parallel distributed file system; storage servers; compression.

I. INTRODUCTION

As we progress towards Exascale systems, the economic cost of energy consumption and the pressure exerted by power dissipation on cooling equipment are rapidly becoming major hurdles to the deployment of new HPC facilities. As of today, the most energy-efficient HPC supercomputers deliver close to 4.5 GFLOPS/W (10^9 floating-point operations per second, per watt) [1]. Simple arithmetic shows that building an ExaFLOPS system based on this technology would require about 220 MW, rendering this approach economically unfeasible. Even if we can maintain the considerable improvements experienced by the most energy-efficient systems, the goal of building a 20 MW Exascale system will still be largely exceeded. If we want to continue enjoying the significant advances enabled by scientific computing and supercomputers during these past decades, a holistic investigation is needed to improve the energy efficiency of HPC hardware and software.

High-performance I/O is also a major stumbling block on the way to the ExaFLOPS barrier. Considering that large-scale computations and simulations conducted to solve complex problems in several scientific domains increasingly produce large amounts of data, the ExaFLOPS performance goal is still far away. While CPU speed and HDD capacity have increased by roughly a factor of 1,000 every 10 years [2], HDD speed only develops at a slow pace: only a 300-fold throughput increase over the last 25 years. Minimizing the amount of data that is being stored in the storage subsystem can also help to improve application performance; data compression can be used for this purpose. Furthermore, due to increasing electricity footprints, the energy used for storage represents an important portion of the Total Cost of Ownership (TCO) [3]. For instance, the German Climate Computing Center's (DKRZ) HDD-based storage system uses 10,000 HDDs to provide a 7 PB file system for earth system science. Assuming a power consumption of 5–10 W for typical HDDs, this results in energy costs of 50,000–100,000 € per year for the HDDs alone. To make the problem worse, the growth of HDD capacity has recently also started to slow down, requiring additional investment to keep up with the increasing processing power.

In this paper, we analyze the usefulness of HPC storage servers that compress file data in order to reduce the required amount of storage hardware and the overall energy consumption, and thus the total cost of the storage hardware usually found in supercomputers. As current-generation CPUs provide ample performance for data processing such as compression or encryption, we provide a case for turning on compression by default to reduce the number of required storage devices. In particular, the paper includes the following contributions:

1) We review the architecture of parallel distributed file systems for the special case of Lustre and present a power-performance profiling and tracing environment for these platforms.
2) We evaluate the performance of different compression algorithms using different synthetic and real scientific sets of input data.
3) We employ the parallel I/O Performance Benchmark (IOR) [4] and real scientific data to assess different compression algorithms of the Lustre parallel distributed file system with its Zettabyte File System (ZFS) [5] back-end.
4) We evaluate the impact of slowly-spinning, energy-aware HDDs used in storage servers as a way to reduce energy consumption in large storage systems.

The paper is structured as follows: In Section II, we present the parallel distributed file system Lustre and our power-performance measurement framework used to analyze the benefits of our proposed approach. Section III contains an extensive evaluation of our proposal to compress all data produced by HPC systems. We present related work in Section IV. Finally, we draw conclusions and discuss further work in Section V.

II. BACKGROUND

In this section, we explain the power-performance framework that we use to conduct our evaluation. First, we describe the parallel distributed file system and then the power-performance measurement framework we used.

[Figure 1. Lustre architecture in combination with the power-performance analysis framework.]

Parallel distributed file systems are used in HPC to aggregate the storage devices existing in several storage servers. File systems provide an abstraction layer between the applications and the actual storage hardware, such that application developers do not have to worry about the organizational layout or technology of the underlying storage hardware. To meet the high demands of current high-performance computing applications, they distribute data across multiple storage servers. Moreover, they are designed to allow multiple clients parallel access and to cooperatively work with the same data concurrently. On the other hand, power measurements are needed to design new energy-aware approaches and to meet consumption constraints. In this case, wattmeters attached to the server nodes are leveraged in a profiling and tracing framework that allows developers to correlate performance and power data.

A. Parallel distributed file systems

To perform our evaluation we use Lustre [6]. Lustre is an open-source parallel distributed file system and is widely used on current supercomputers. Lustre powers around half of the TOP100 supercomputers and almost one third of all TOP500 supercomputers [7]. In contrast to proprietary solutions such as the General Parallel File System (GPFS) [8], it is possible to adapt, extend and improve Lustre since it is licensed under the GPL (version 2).

Lustre distinguishes between file system clients and servers. While all clients are identical, the servers can have different roles. First, the object storage servers (OSSs) manage the file system's data and are connected to possibly multiple object storage targets (OSTs) that store the actual file data. Second, the meta-data servers (MDSs) handle the file system's meta-data, such as directories, file names and permissions. Each MDS is connected to possibly multiple meta-data targets (MDTs) that store the actual meta-data. The general architecture of Lustre is illustrated in Figure 1. Both OSTs and MDTs use an underlying back-end file system to store their data. This file system is responsible for block allocation and for hiding the internal disk layout from Lustre. Past versions of Lustre supported only a modified version of ext4 [9] called ldiskfs.

Our measurement framework builds on VampirTrace and Vampir, which are profiling/tracing and visualization tools, respectively. The left part of Figure 1 offers a graphical representation of the Lustre architecture; the right part depicts the tracing and profiling framework. To use our approach, the application running on the clients is compiled using the VampirTrace compiler wrappers, which automatically instrument the code. Next, the application is run on the client nodes, thus consuming a certain amount of energy, both on the client and server nodes, due to the I/O instructions. The server nodes are connected to power measurement devices that account for the consumed energy and send the power data to the tracing server. The attached VampirTrace pmlib plugin employs the client API that sends start/stop primitives in order to gather the data captured by the wattmeters on the tracing server, where an instance of the pmlib server is running. Once the application's run has finished, the VampirTrace pmlib plugin receives the power data from the tracing server. The instrumentation post-process generates the performance trace files and the pmlib plugin inserts the power data into them.

In addition to the power measurements, we also account for the resource utilization values of the Lustre servers: CPU load, memory usage and storage device utilization. We run special pmlib server instances on the server nodes that retrieve these values from the proc file system (leveraging the psutil Python library). Thus, pmlib plugin instances
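The server-side resource sampling just described can be sketched in a few lines of Python. This is a hypothetical illustration, not the actual pmlib code; it assumes only the psutil library named above:

```python
import time

import psutil  # the library the paper's resource-tracing servers leverage


def sample_utilization():
    """One CPU/memory/disk sample, roughly as a pmlib-style server
    instance might collect it on a Lustre server node (illustrative only)."""
    disk = psutil.disk_io_counters()  # cumulative block-device counters
    return {
        "timestamp": time.time(),
        "cpu_percent": psutil.cpu_percent(interval=0.1),  # CPU load
        "mem_percent": psutil.virtual_memory().percent,   # memory usage
        "disk_read_bytes": disk.read_bytes,               # storage activity
        "disk_write_bytes": disk.write_bytes,
    }
```

A real tracing server would call such a function periodically and stream the samples to the trace post-processing step alongside the wattmeter data.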
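The first analysis step, weighing compression ratio against CPU cost, can be illustrated with a small self-contained sketch. The codecs below (Python's stdlib zlib, bz2 and lzma) are stand-ins chosen for illustration and are not necessarily the algorithms evaluated in the paper:

```python
import bz2
import lzma
import time
import zlib

CODECS = {
    "zlib": (zlib.compress, zlib.decompress),
    "bz2": (bz2.compress, bz2.decompress),
    "lzma": (lzma.compress, lzma.decompress),
}


def compare_codecs(data):
    """Measure compression ratio and compression time per codec.

    CPU time is a rough proxy for the power overhead the paper measures
    with wattmeters; a higher ratio means fewer bytes reach the HDDs.
    """
    results = {}
    for name, (compress, decompress) in CODECS.items():
        start = time.perf_counter()
        packed = compress(data)
        elapsed = time.perf_counter() - start
        assert decompress(packed) == data  # compression must be lossless
        results[name] = {"ratio": len(data) / len(packed), "seconds": elapsed}
    return results
```

Running this on a representative dataset exposes the trade-off the paper studies: stronger codecs buy capacity savings at the price of more CPU work, and thus more power, per byte written.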
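The introduction's DKRZ cost estimate can be reproduced with simple arithmetic. The electricity tariff of about 0.114 €/kWh below is my assumption, back-solved to match the stated 50,000–100,000 € range; the paper itself does not give a tariff:

```python
def annual_hdd_energy_cost(n_hdds, watts_per_hdd, eur_per_kwh=0.114):
    """Yearly electricity cost of an HDD fleet.

    eur_per_kwh is an assumed tariff, not a figure from the paper.
    """
    kwh_per_year = n_hdds * watts_per_hdd * 24 * 365 / 1000.0
    return kwh_per_year * eur_per_kwh


# DKRZ example from the introduction: 10,000 HDDs at 5-10 W each
low = annual_hdd_energy_cost(10_000, 5)    # roughly 50,000 EUR per year
high = annual_hdd_energy_cost(10_000, 10)  # roughly 100,000 EUR per year
```

Halving the required number of HDDs via compression would, under the same assumptions, halve this cost term directly.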