Agile, Efficient Virtualization Power Management with Low-latency Server Power States

Canturk Isci⋆ Suzanne McIntosh⋆ Jeffrey Kephart⋆ Rajarshi Das⋆ James Hanson⋆ Scott Piper† Robert Wolford† Thomas Brey† Robert Kantner† Allen Ng† James Norris⋆ Abdoulaye Traore⋆ Michael Frissora⋆

⋆ IBM T.J. Watson Research Center, Yorktown Heights, NY
† IBM Systems & Technology Group, Raleigh, NC & Kirkland, WA

ABSTRACT

One of the main driving forces of the growing adoption of virtualization is its dramatic simplification of the provisioning and dynamic management of IT resources. By decoupling running entities from the underlying physical resources, and by providing easy-to-use controls to allocate, deallocate and migrate virtual machines (VMs) across physical boundaries, virtualization opens up new opportunities for improving overall system resource use and power efficiency. While a range of techniques for dynamic, distributed resource management of virtualized systems have been proposed and have seen widespread adoption in enterprise systems, similar techniques for dynamic power management have seen limited acceptance. The main barrier to dynamic, power-aware virtualization management stems not from the limitations of virtualization, but rather from the underlying physical systems; in particular, the high latency and energy cost of the power state change actions suited to virtualization power management.

In this work, we first explore the feasibility of low-latency power states for enterprise server systems and demonstrate, with real prototypes, their quantitative energy-performance trade-offs compared to traditional server power states. Then, we demonstrate an end-to-end power-aware virtualization management solution leveraging these states, and evaluate the dramatically favorable power-performance characteristics achievable with such systems. We show, via both real system implementations and scale-out simulations, that virtualization power management with low-latency server power states can achieve overheads comparable to those of baseline distributed resource management in virtualized systems, and thus can benefit from the same level of adoption, while delivering close to energy-proportional power efficiency.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
ISCA'13 Tel-Aviv, Israel
Copyright 2013 ACM 978-1-4503-2079-5/13/06 ...$15.00.

1. INTRODUCTION

Over the last decade, reducing the energy consumption of computing devices in data centers has become a topic of great practical and academic interest [24, 41]. Myriad techniques for reducing energy consumption have been introduced at all levels of the stack, from circuits and architecture levels up through the firmware, operating system, and the middleware.

Among these techniques, those which employ virtualization have been shown to yield significant energy efficiency improvements. By consolidating multiple workloads onto a single physical server, virtualization drives up system utilization and enables a given amount of computational work to be performed on a smaller set of servers. While a more heavily utilized server uses more power than a lightly utilized one, the total power consumed by running a workload on a smaller number of more heavily utilized servers can be substantially less than that for a large number of lightly utilized servers.

One important class of virtualization-based energy efficiency methods is static consolidation. The fundamental idea, which arose from the observation that a large fraction of dedicated physical servers are severely underutilized [3], is based upon capacity planning. Long-term statistics on resource consumption for a given set of workloads are used to estimate how many virtualized physical servers would be needed to support that computational load [8, 36]. This approach is commonly used for migrating workloads from dedicated physical servers to virtualized environments [30, 43, 44], although the original environment could also be virtual. VM migration may be employed from time to time to even out the workload across the physical servers, but no further changes are made to the power state of the servers. Thus, while static consolidation eliminates substantial amounts of overprovisioning, it does not take advantage of dynamic fluctuations in workload intensity [31, 39].

A second important class, dynamic consolidation, attempts to extract further efficiency by migrating VMs to vacate some physical servers during lulls in workload intensity, turning them off to save energy until the workload increases [16, 25, 42, 46, 40]. Dynamic consolidation techniques promise to yield substantial energy efficiency improvements on top of what is achievable by static methods. However, while they have gained some traction in certain desktop virtualization frameworks [10, 5], in practice they do not make good on that promise at the enterprise level. This is because mechanisms for turning servers on and off entail latencies that can extend to minutes. The longer the latency, the greater the risk that the workload will increase and the performance will suffer for the period during which the increase is detected and addressed by turning on enough servers.

Previously explored methods for mitigating the risk caused by the latency of coarse-grained power management include workload scheduling, workload forecasting, and keeping extra servers on as a buffer to protect against unanticipated workload surges. All of these approaches have drawbacks. Workload scheduling is feasible only when one has substantial control over the workload (for example, in a compute grid scenario). Workload forecasting is difficult to apply because no workload is completely predictable, and the penalty for suffering an unanticipated surge—even if such errors are rare—can be substantial. Therefore, even when one uses workload forecasting, some amount of buffering is necessary, and this can largely wipe out the potential energy savings.

In this paper, we demonstrate a radically different approach that overcomes the traditional barriers to achieving substantial energy savings via dynamic consolidation. Rather than merely treating the symptoms, we attack the underlying root cause of the problem: the latency itself. Our method exploits low-power, low-latency power states in enterprise servers; specifically, the S3 state. We first present an across-the-stack implementation of S3 capability for enterprise servers with two actual prototypes. Then, we build an end-to-end virtualization management solution leveraging this technology. Our experimental results with our real-system prototypes and scale-out simulation framework show substantial energy efficiency improvements over standard techniques. These further underline two key aspects of our approach with low-latency server power states: the ability to react to fine-grained load fluctuations that most existing solutions cannot handle with standard power states; and reduced impact from incorrect power management decisions with substantially faster reaction times. While the basic principle of our solution is simple, realizing this principle in practice entails substantial work all along the stack, from the hardware level up through the firmware, the OS, and the middleware used to control the VM migration and manage the power states of the physical servers.

[Figure 1: The effect of power state change overheads on energy-aware virtualization management. (a) DPM overview; (b) power management with low-latency power states. Each panel plots power and demand over time, marking the power down/up overheads, the lost power-saving opportunity while waiting to be sure demand is low, and the performance degradation incurred during power-up.]

As Figure 1 illustrates, there are several regions (shaded areas) during which power is wasted and/or performance is degraded:

• Power-on latency. In the rightmost region, the PM algorithm detects that the cluster demand has risen, so it initiates a power-on action. As the server powers on, power consumption is high—potentially even higher than when the server is processing the workload. Worse, the performance of the cluster as a whole suffers until the boot-up is complete and work has migrated back to the server.

• Power-off latency. In the central region, the power consumed as the server is being turned off is used to place the server in a stable state rather than being used to process the workload.

• Decision period. A PM algorithm must be judicious in its response to lower demand. If it is too hasty, and the reduction in demand proves short-lived, the performance cost during power-on latency will outweigh any energy savings that may have accrued during the transient. To reduce the risk of regret, it is reasonable for the PM to adopt a more conservative strategy, according to which it waits for a specified time period before turning the server off. Yet this wait time (which may reasonably be set to the better part of an hour) can result in significant lost opportunity for power savings, and makes it impossible to save any power if the duration of the low demand is less than the wait time.

The remainder of this paper is organized as follows. Section 2 provides more background on the fundamental idea underlying our solution and highlights its energy saving potential. Section 3 presents a power state characterization study, establishing that the S3 state offers an excellent trade-off between latency and power consumption. Section 4 details how we modified a commercial server to support S3 at the hardware, firmware and OS levels, while Section 5 describes middleware that we developed for power-aware virtualization management. Sections 6 and 7 describe our experimental results. We describe related work in Section 8, and summarize our results in Section 9.
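The latency argument made in the introduction can be illustrated with a simple break-even calculation: powering a server down only saves energy if the idle period is long enough to amortize the extra energy spent on the down/up transitions. The sketch below is illustrative only; all power and latency figures are invented for the example, not measurements from this paper.

```python
def breakeven_idle_seconds(p_idle_w, p_sleep_w, p_transition_w,
                           t_down_s, t_up_s):
    """Minimum idle-period length for which powering down saves energy.

    During the down/up transitions the server draws p_transition_w but
    does no useful work; while asleep it draws p_sleep_w instead of
    p_idle_w. Powering down pays off once the sleep-state savings
    exceed the extra transition energy.
    """
    t_transition = t_down_s + t_up_s
    extra_transition_energy = (p_transition_w - p_idle_w) * t_transition
    saving_rate = p_idle_w - p_sleep_w  # watts saved per second asleep
    # The idle period must cover the transitions plus enough sleep time
    # to amortize their extra energy cost.
    return t_transition + extra_transition_energy / saving_rate

# Hypothetical numbers: a full shutdown/boot cycle with minutes of
# latency versus an S3-style suspend/resume cycle with seconds.
full_off = breakeven_idle_seconds(p_idle_w=150, p_sleep_w=5,
                                  p_transition_w=200,
                                  t_down_s=60, t_up_s=180)
s3 = breakeven_idle_seconds(p_idle_w=150, p_sleep_w=10,
                            p_transition_w=200,
                            t_down_s=5, t_up_s=10)
print(f"break-even idle period, full off: {full_off:.0f} s")
print(f"break-even idle period, S3:       {s3:.0f} s")
```

With these (assumed) numbers, a full power cycle only pays off for lulls of several minutes, while an S3-style cycle breaks even in tens of seconds, which is why low-latency states can chase much finer-grained load fluctuations.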
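The conservative decision-period strategy described above, waiting a fixed period before concluding that demand is genuinely low, amounts to a simple hysteresis loop. The sketch below is a toy controller written to illustrate that idea, not the paper's middleware; the thresholds, wait length, and sampling scheme are all assumptions made for the example.

```python
class PowerManager:
    """Toy DPM decision loop with a conservative power-off wait time.

    Demand is sampled once per tick. A server is powered off only after
    demand has stayed below `off_threshold` for `wait_ticks` consecutive
    samples, mirroring the "wait to be sure demand is low" period of
    Figure 1(a). Power-on happens immediately, since under-provisioning
    costs performance. All names and thresholds are illustrative.
    """

    def __init__(self, off_threshold, on_threshold, wait_ticks):
        self.off_threshold = off_threshold
        self.on_threshold = on_threshold
        self.wait_ticks = wait_ticks
        self.low_demand_ticks = 0
        self.powered_on = True

    def observe(self, demand):
        """Consume one demand sample and return the action taken."""
        if self.powered_on:
            if demand < self.off_threshold:
                self.low_demand_ticks += 1
                if self.low_demand_ticks >= self.wait_ticks:
                    self.powered_on = False
                    self.low_demand_ticks = 0
                    return "power-off"      # e.g. suspend to S3
            else:
                self.low_demand_ticks = 0   # lull was short-lived: reset
        elif demand > self.on_threshold:
            self.powered_on = True
            return "power-on"               # resume without delay
        return "no-op"

pm = PowerManager(off_threshold=0.3, on_threshold=0.7, wait_ticks=3)
trace = [0.2, 0.2, 0.5, 0.2, 0.2, 0.2, 0.1, 0.9]
actions = [pm.observe(d) for d in trace]
print(actions)
```

Note how the brief spike to 0.5 resets the wait counter, so the power-off only fires after three consecutive low samples. The cheaper the power transitions, the smaller `wait_ticks` can be made, which shrinks the lost-savings window highlighted in Figure 1(b).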