A Metric for Evaluating System-Wide Energy Efficiency in HPC Systems

The Green Index: A Metric for Evaluating System-Wide Energy Efficiency in HPC Systems Balaji Subramaniam and Wu-chun Feng Department of Computer Science Virginia Tech fbalaji, [email protected] Abstract— In recent years, the high-performance com- HPCC benchmark suite is a collection of benchmarks puting (HPC) community has recognized the need to that provide better coverage and stress-testing of dif- design energy-efficient HPC systems. The main focus, ferent components of the system. However, the broader however, has been on improving the energy efficiency acceptance of HPCC as the performance benchmark for of computation, resulting in an oversight on the energy HPC has been limited, arguably because of its inability efficiencies of other aspects of the system such as memory or disks. Furthermore, the energy consumption of the to capture performance in a rankable manner, e.g., with non-computational parts of a HPC system continues to a single number. Hence, HPL and its performance metric consume an increasing percentage of the overall energy of FLOPS continues to reign. consumption. Therefore, to capture a more accurate pic- Today, the HPC community finds itself in a similar ture of the energy efficiency of a HPC system, we seek to situation, but this time, with respect to the greenness of create a benchmark suite and associated methodology to HPC systems. The HPC community has acknowledged stress different components of a HPC system, such as the this issue as a major problem in designing future exascale processor, memory, and disk. Doing so, however, results in systems. Efforts such as the Green500 [9], launched a potpourri of benchmark numbers that make it difficult in 2007, seek to raise the awareness of power and to “rank” the energy efficiency of HPC systems. This leads to the following question: What metric, if any, can capture energy in supercomputing by reporting on power con- the energy efficiency of a HPC system with a single number? sumption and energy efficiency, as defined by floating- To address the above, we propose The Green Index point operations per second per watt or FLOPS/watt. (TGI), a metric to capture the system-wide energy effi- By 2008, the TOP500 also began tracking the power ciency of a HPC system as a single number. Then, in consumption of supercomputers. However, these efforts turn, we present (1) a methodology to compute TGI, (2) are inherently limited by the use of LINPACK bench- an evaluation of system-wide energy efficiency using TGI, mark for performance measurement because it primarily and (3) a preliminary comparison of TGI to the tra- stresses only the CPU component of an HPC system. ditional performance-to-power metric, i.e., floating-point Furthermore, as noted in a recent exascale study [1], the operations per second (FLOPS) per watt. energy consumption of a HPC system when executing non-computational tasks, especially data movement, is I. INTRODUCTION expected to overtake the energy consumed due to the For decades now, the LINPACK benchmark has processing elements. Clearly, there is a necessity to been widely used to evaluate the performance of high- evaluate the energy efficiency of different components performance computing (HPC) systems. The TOP500 of a HPC system. list [7], for example, uses the high-performance LIN- Like performance benchmarking with HPCC, we pro- PACK (HPL) benchmark [2] to rank the 500 fastest pose an approach that evaluates the energy efficiency supercomputers in the world with respect to the floating- of different components of an HPC system using a point operations per second (or FLOPS). While the benchmark suite. However, there are seven different TOP500 continues to be of significant importance to the benchmark tests in the suite, and each of them reports HPC community, the use of HPL to rank the systems their own individual performance using their own met- has its own limitations as HPL primarily stresses only the rics. We propose an approach that seeks to answer the processor (or CPU component) of the system. To address following questions: this limitation and enable the performance evaluation of different components of a system, the HPC Challenge • What metric should be used to evaluate the energy (HPCC) benchmark suite [3] was developed in 2003. The efficiency of different components of the system? • How should the energy consumed by the different relative to a reference system, where time is used as the benchmarks be represented as a single number? unit of performance. A SPEC rating of 25 means that the system under test is 25 times faster than the reference This paper presents an initial step towards answering system. The rating is relative to a reference system in or- the above questions. Specifically, we propose The Green der to normalize and ease the process of comparison with Index (TGI) [8] — a metric for evaluating the system- other systems. The reference system for each benchmark wide energy efficiency of an HPC system via a suite of is different. For example, the reference system for SPEC benchmarks. The chosen benchmarks currently include CPU2000 is a Sun Ultra5 10 workstation whereas SPEC HPL for computation, STREAM for memory, and IO- CPU2006 uses the Sun Ultra Enterprise 2 workstation as zone for I/O. its reference machine. The contributions of this paper are as follows: Performance of Reference System SPEC rating = (1) Performance of System Under Test • A methodology for aggregating the energy efficiency of different benchmarks into a single We propose the Green Index (TGI) metric in an metric—TGI. effort to capture the “greenness” of supercomputers by • A preliminary analysis of the goodness and scala- combining all the different performance outputs from bility of the TGI metric for evaluating system-wide the different benchmark tests. We believe that TGI is energy efficiency. the first effort towards providing a single number for • A comparison of the TGI metric to traditional evaluating the energy efficiency of supercomputers using energy-efficient metrics such as the performance- a benchmark suite. to-power ratio. Following an approach similar to the SPEC rating, the energy efficiency of a supercomputer is measured The rest of the paper is organized as follows. Section II with respect to a reference system by providing a relative describes the proposed TGI metric. Section III presents performance-to-watt metric. The TGI of a system can be an evaluation of the TGI metric. Specifically, we propose calculated by using the following algorithm: the desired property for an energy-efficient metric and analyze the validity of the TGI metric. We also look 1) Calculate the energy efficiency (EE), i.e., into appropriate weights which can be used with TGI. performance-to-power ratio, while executing In Section IV, we present experimental results, including different benchmark tests from a benchmark suite a description of the benchmarks that we used to evaluate on the supercomputer: the TGI metric and a comparison of TGI with the Performancei performance metrics used in the benchmarks. Related EEi = (2) Power Consumedi work in this area is described in Section V. Section VI concludes the paper. where each i represents a different benchmark test. 2) Obtain the relative energy efficiency (REE) for a II. THE GREEN INDEX (TGI) FOR HPC specific benchmark by dividing the above results Formulating a canonical green metric for a diverse with the corresponding result from a reference benchmark suite that stresses different components of system: an HPC system is a challenging task. To the best of our EEi REEi = (3) knowledge, there exists no methodology that can be used EERefi to combine all the different performance outputs from the where each i represents a different benchmark test. different benchmark tests and deliver a single number of energy efficiency to look at. Despite arguments that 3) For each benchmark, assign a TGI component energy efficiency can only be represented by a vector (or weighting factor W) such that the sum of all which captures the effect of energy consumed by a weighting factor is equal to one. benchmark suite, we seek the “holy grail” of a single 4) Use the weighting factors and sum across product representative number with which to make comparisons. of all weighting factors and corresponding REEs The earliest metric for comparing system performance to arrive at the overall TGI of the system. is the Standard Performance Evaluation Corporation X (SPEC) rating [6]. As shown in Equation (1), the SPEC T GI = Wi ∗ REEi (4) rating defines the performance of a system under test, i A. Arithmetic Mean TGI allows for flexibility in green benchmarking as The simplest way to assign the TGI component it can be used and viewed in different ways by its con- (weighting factor) is to use the arithmetic mean i.e., sumers. For example, although we use the performance- assign an equal weighting factor to all the benchmarks. per-watt metric for energy efficiency in this paper, the The arithmetic mean (AM) for a data set Xi where i ≥ 1 methodology used for computing TGI can be used with in general is defined by Equation (6) any other energy-efficient metric, such as the energy- Pn delay product. The advantages of using TGI for eval- REEi TGI using Arithmetic Mean = i=0 (7) uating the energy efficiency of supercomputers are as n follows: 1) Each weighting factor can be assigned a value Pn EEi based of the specific needs of the user, e.g., as- i=0 EERef TGI using Arithmetic Mean = i signing a higher weighting factor for the memory n benchmark if we are evaluating a supercomputer Pn Performance Metrici i=0 to execute a memory-intensive application. = Power Consumedi EERefi ∗ n 2) TGI can be extended to incorporate power con- n X 1 sumed outside the HPC system, e.g., cooling.

Load more