System-Level Power Management Using Online Machine Learning for Prediction and Adaptation

Total Page:16

File Type:pdf, Size:1020Kb

System-Level Power Management Using Online Machine Learning for Prediction and Adaptation UNIVERSITY OF SOUTHAMPTON System-Level Power Management using Online Machine Learning for Prediction and Adaptation by Luis Alfonso Maeda-Nunez A thesis submitted in partial fulfillment for the degree of Doctor of Philosophy in the Faculty of Physical Sciences and Engineering Electronics and Computer Science 2016-07-22 UNIVERSITY OF SOUTHAMPTON ABSTRACT FACULTY OF PHYSICAL SCIENCES AND ENGINEERING SCHOOL OF ELECTRONICS AND COMPUTER SCIENCE Doctor of Philosophy by Luis Alfonso Maeda-Nunez Nowadays embedded devices have the need to be portable, battery powered and high performance. This need for high performance makes power management a matter of critical priority. Power management algorithms exist, but most of the approaches focus on an energy-performance trade-off oblivious to the applications running on the system. Others are application-specific and their solution cannot be applied to other applications. This work proposes Shepherd, a cross-layer runtime management system for reduction of energy consumption whilst offering soft real-time performance. It is cross-layer because it takes the performance requirements from the application, and learns to adjust the power management knobs to provide the expected performance at the minimum cost of energy. Shepherd is implemented as a Linux governor running at OS level, this layer offers a low-overhead interface to change the CPU voltage and frequency dynamically. As opposed to the reactive behaviour of Linux Governors, Shepherd adapts to the application-specific performance requirements dynamically, and proactively selects the power state that fulfils these requirements while consuming the least power. Proac- tiveness is achieved by using AEWMA for adapting to the upcoming workload. These adaptations are facilitated using a model-free reinforcement learning algorithm, that once it learns the optimal decisions it starts exploiting them. This work enables Shep- herd to work with different applications. A programming framework was designed to allow programmers to develop their applications to be power-aware, by enabling them to send their performance requirements and annotations to Shepherd and provide the cross-layer soft real-time performance desired. Shepherd is implemented within the Linux Kernel 3.7.10, interfacing with the applica- tion and hardware to select an appropriate voltage-frequency control for the executing application. The performance of Shepherd is demonstrated on an ARM Cortex-A8 pro- cessor. Experiments conducted with multimedia applications demonstrate that Shep- herd minimises energy consumption by up to 30% against existing Governors. Also, the framework has been used to adapt example applications to work with Shepherd, achieving 60% energy savings compared to the existing approaches. Contents Acknowledgements xvii 1 Introduction 1 1.1 Research Justification . .2 1.1.1 Power Consumption . .2 1.1.2 Low Power Design Techniques . .3 1.1.2.1 Power Gating . .4 1.1.2.2 Dynamic Voltage and Frequency Scaling . .5 1.1.3 Challenges of System Level Power Management . .6 1.2 Research Questions . .9 1.3 Research Contributions . .9 1.4 Research Outputs . 10 1.5 Document Outline . 10 2 Power Management in Microprocessors 11 2.1 Power consumption in Microprocessors . 13 2.1.1 Dynamic Power . 14 2.1.1.1 Switching Power . 15 2.1.1.2 Internal Power . 16 2.1.2 Static Power . 17 2.2 Power Management Techniques: Knobs . 18 2.2.1 Multi-Vdd . 18 2.2.2 Dynamic Voltage and Frequency Scaling (DVFS) . 18 2.2.3 Power Gating (PG) . 19 2.3 Applications and OS: Requirements . 22 2.3.1 Real-Time Systems . 22 2.3.2 Hard Real-Time versus Soft Real-Time . 23 2.4 Power Management: Control . 24 2.4.1 Dynamic Power Management (DPM) . 24 2.4.1.1 Predictive Techniques . 24 2.4.1.2 Stochastic Techniques . 26 2.4.1.3 Machine Learning Techniques . 27 2.4.1.4 Reinforcement Learning for DPM . 29 2.4.1.5 Other Predictive Techniques . 30 2.4.2 DVFS . 31 2.4.2.1 Workload Detection . 31 2.4.2.2 Online Learning . 32 v vi CONTENTS 2.4.2.3 Offline Learning . 32 2.4.2.4 Workload Prediction . 34 2.4.2.5 Deadline Prediction . 34 2.4.2.6 Application-specific DVFS . 35 2.4.2.7 General Purpose DVFS . 36 2.4.3 The Interplay of DPM and DVFS . 37 2.4.4 Reinforcement Learning . 38 2.5 Discussion . 41 3 System-Level Power Management 43 3.1 Run-Time Management . 43 3.1.1 Requriements . 44 3.1.1.1 Application adaptation for use with RTM . 47 3.1.2 Power Knobs . 47 3.1.3 Control . 48 3.1.3.1 Monitor Feedback . 48 3.1.4 Run-Time Management Algorithm . 48 3.2 Prediction Unit . 50 3.2.1 Motivation . 50 3.2.2 EWMA . 51 3.2.2.1 Optimisation and Results . 52 3.2.3 AEWMA . 54 3.3 Decision Unit . 57 3.3.1 Cost Function . 58 3.3.2 Learning Phase in Shepherd . 60 3.4 Discussion and Summary . 61 4 Implementation of Run-time Manager 65 4.1 Implementation as Linux Governor . 65 4.1.1 Restrictions and optimisations for Linux Governor Implementation 67 4.2 Implementation of Prediction Unit . 68 4.3 Implementation of Decision Unit . 70 4.3.1 Implementation of Learning Phase in Shepherd . 71 4.3.2 Action Suitability for Exploiting Learning . 73 4.4 Performance Counters Module . 74 4.5 An Application Programming Interface (API) for Shepherd . 75 4.5.1 Design flow for application adapted to use Shepherd Governor . 77 4.6 Discussion . 78 5 Results 79 5.1 Experimental Setup . 79 5.2 Case Study: Run-Time Manager for Video Decoding . 81 5.2.1 Real time behaviour of Shepherd . 82 5.2.1.1 Shepherd on H.264 VGA . 84 5.2.1.2 Shepherd on H.264 QVGA . 85 5.2.1.3 Run-time Manager (RTM) on H.263 . 86 5.2.1.4 Shepherd on MPEG2 . 86 CONTENTS vii 5.2.1.5 Shepherd on MPEG4 . 87 5.2.2 Governor comparison . 87 5.2.2.1 Comparison versus Dynamic Governors . 87 5.2.2.2 Comparison versus Static Governors . 88 5.3 Shepherd API results . 89 5.4 Overheads . 91 5.5 Discussion and Summary . 92 6 Conclusions 93 6.1 Future Work . 94 Bibliography 97 Appendix A - Sample Shepherd Output Files 107 Appendix B - Publications 109 List of Figures 1.1 Dynamic and Leakage Power trend[1] . .3 1.2 Power vs Energy. Reproduced from [2] . .6 2.1 Flowchart of an embedded system in a cross-layer approach. Analytical approach (left) and practical approach (right). Sections 2.2, 2.3 and 2.4 are described as per the figure . 12 2.2 A CMOS Inverter showing the charging and discharging of the capacitor Cload. Based on [3] . 15 2.3 A CMOS Inverter showing the short circuit current flowing from VDD to VSS. Based on [4] . 16 2.4 Leakage Currents in a CMOS circuit (a). Taken from [2]. ITRS trends for leakage power dissipation (b). Based on [5]. 18 2.5 Definitions of power and times for a device with IDLE and SLEEP modes 20 2.6 Markov chain and Markov Decision Process model for DPM of Stron- gARM SA-1100, as presented by Benini et al. [6] . 26 2.7 Decision Tree example for selecting V-F settings. Taken from [7]. 33 2.8 Reinforcement Learning simple diagram. Taken from [8] . 38 3.1 Generic cross-layer RTM, where the arrows show the communication be- tween the layers. 44 3.2 Run-Time Management Unit in the cross-layer approach . 49 3.3 Comparison of real workload vs. predicted on a sample video, from frames 430 to 615 . 52 3.4 Effect of weight for Exponential Weighted Moving Average (EWMA) on prediction error using dynamic workloads . 53 3.5 Effect of weight for EWMA on prediction error using static workloads . 54 3.6 Sample of video with frame types 1, 2 and 3. Red line represents the group of frames of type 2 (cross) and 3 (green filled circles) between frames type 1 (blue hollow circles) used to calculate local standard deviation. 55 3.7 AEWMA λ parameter change at transitions . 56 3.8 Cost function of Q-Learning algorithm for Shepherd. This graph shows the level of reward/punishment obtained for finishing the workload too early (A), very close to the deadline without surpassing it (B), finishing the workload shortly after the deadline (C) and finishing very late (D). Time left for deadline is calculated as 100 tdeadline−tfinished ......... 60 tdeadline 3.9 Q-Table during A) exploration and B) exploitation phases. The red boxes represent the best Action for each State. 61 3.10 Suitability of actions for a particular state (state 4), defined by the Q Values of the different actions . 62 ix x LIST OF FIGURES 4.1 Shepherd governor implementation . 67 4.2 Random number distribution generated from taking the 8 Least Signif- icant Bits (LSBs) from the CPU cycles measured per frame, over 5760 frames. 68 4.3 Adaptation of program code for utilisation of the Shepherd Run-Time Manager . 78 5.1 BeagleBoard-xM[9] development board used for running the experiments of Shepherd . 80 5.2 Video Decoding experiment using Shepherd Governor running on BeagleBoard- xM........................................ 81 5.3 Performance and Power Consumption of the governors Shepherd and On- demand for an H.264 Video ..
Recommended publications
  • Power Management 24
    Power Management 24 The embedded Pentium® processor family implements Intel’s System Management Mode (SMM) architecture. This chapter describes the hardware interface to SMM and Clock Control. 24.1 Power Management Features • System Management Interrupt can be delivered through the SMI# signal or through the local APIC using the SMI# message, which enhances the SMI interface, and provides for SMI delivery in APIC-based Pentium processor dual processing systems. • In dual processing systems, SMIACT# from the bus master (MRM) behaves differently than in uniprocessor systems. If the LRM processor is the processor in SMM mode, SMIACT# will be inactive and remain so until that processor becomes the MRM. • The Pentium processor is capable of supporting an SMM I/O instruction restart. This feature is automatically disabled following RESET. To enable the I/O instruction restart feature, set bit 9 of the TR12 register to “1”. • The Pentium processor default SMM revision identifier has a value of 2 when the SMM I/O instruction restart feature is enabled. • SMI# is NOT recognized by the processor in the shutdown state. 24.2 System Management Interrupt Processing The system interrupts the normal program execution and invokes SMM by generating a System Management Interrupt (SMI#) to the processor. The processor will service the SMI# by executing the following sequence. See Figure 24-1. 1. Wait for all pending bus cycles to complete and EWBE# to go active. 2. The processor asserts the SMIACT# signal while in SMM indicating to the system that it should enable the SMRAM. 3. The processor saves its state (context) to SMRAM, starting at address location SMBASE + 0FFFFH, proceeding downward in a stack-like fashion.
    [Show full text]
  • Power Management Using FPGA Architectural Features Abu Eghan, Principal Engineer Xilinx Inc
    Power Management Using FPGA Architectural Features Abu Eghan, Principal Engineer Xilinx Inc. Agenda • Introduction – Impact of Technology Node Adoption – Programmability & FPGA Expanding Application Space – Review of FPGA Power characteristics • Areas for power consideration – Architecture Features, Silicon design & Fabrication – now and future – Power & Package choices – Software & Implementation of Features – The end-user choices & Enablers • Thermal Management – Enabling tools • Summary Slide 2 2008 MEPTEC Symposium “The Heat is On” Abu Eghan, Xilinx Inc Technology Node Adoption in FPGA • New Tech. node Adoption & level of integration: – Opportunities – at 90nm, 65nm and beyond. FPGAs at leading edge of node adoption. • More Programmable logic Arrays • Higher clock speeds capability and higher performance • Increased adoption of Embedded Blocks: Processors, SERDES, BRAMs, DCM, Xtreme DSP, Ethernet MAC etc – Impact – general and may not be unique to FPGA • Increased need to manage leakage current and static power • Heat flux (watts/cm2) trend is generally up and can be non-uniform. • Potentially higher dynamic power as transistor counts soar. • Power Challenges -- Shared with Industry – Reliability limitation & lower operating temperatures – Performance & Cost Trade-offs – Lower thermal budgets – Battery Life expectancy challenges Slide 3 2008 MEPTEC Symposium “The Heat is On” Abu Eghan, Xilinx Inc FPGA-101: FPGA Terms • FPGA – Field Programmable Gate Arrays • Configurable Logic Blocks – used to implement a wide range of arbitrary digital
    [Show full text]
  • Clock Gating for Power Optimization in ASIC Design Cycle: Theory & Practice
    Clock Gating for Power Optimization in ASIC Design Cycle: Theory & Practice Jairam S, Madhusudan Rao, Jithendra Srinivas, Parimala Vishwanath, Udayakumar H, Jagdish Rao SoC Center of Excellence, Texas Instruments, India (sjairam, bgm-rao, jithendra, pari, uday, j-rao) @ti.com 1 AGENDA • Introduction • Combinational Clock Gating – State of the art – Open problems • Sequential Clock Gating – State of the art – Open problems • Clock Power Analysis and Estimation • Clock Gating In Design Flows JS/BGM – ISLPED08 2 AGENDA • Introduction • Combinational Clock Gating – State of the art – Open problems • Sequential Clock Gating – State of the art – Open problems • Clock Power Analysis and Estimation • Clock Gating In Design Flows JS/BGM – ISLPED08 3 Clock Gating Overview JS/BGM – ISLPED08 4 Clock Gating Overview • System level gating: Turn off entire block disabling all functionality. • Conditions for disabling identified by the designer JS/BGM – ISLPED08 4 Clock Gating Overview • System level gating: Turn off entire block disabling all functionality. • Conditions for disabling identified by the designer • Suspend clocks selectively • No change to functionality • Specific to circuit structure • Possible to automate gating at RTL or gate-level JS/BGM – ISLPED08 4 Clock Network Power JS/BGM – ISLPED08 5 Clock Network Power • Clock network power consists of JS/BGM – ISLPED08 5 Clock Network Power • Clock network power consists of – Clock Tree Buffer Power JS/BGM – ISLPED08 5 Clock Network Power • Clock network power consists of – Clock Tree Buffer
    [Show full text]
  • Computer Architecture Techniques for Power-Efficiency
    MOCL005-FM MOCL005-FM.cls June 27, 2008 8:35 COMPUTER ARCHITECTURE TECHNIQUES FOR POWER-EFFICIENCY i MOCL005-FM MOCL005-FM.cls June 27, 2008 8:35 ii MOCL005-FM MOCL005-FM.cls June 27, 2008 8:35 iii Synthesis Lectures on Computer Architecture Editor Mark D. Hill, University of Wisconsin, Madison Synthesis Lectures on Computer Architecture publishes 50 to 150 page publications on topics pertaining to the science and art of designing, analyzing, selecting and interconnecting hardware components to create computers that meet functional, performance and cost goals. Computer Architecture Techniques for Power-Efficiency Stefanos Kaxiras and Margaret Martonosi 2008 Chip Mutiprocessor Architecture: Techniques to Improve Throughput and Latency Kunle Olukotun, Lance Hammond, James Laudon 2007 Transactional Memory James R. Larus, Ravi Rajwar 2007 Quantum Computing for Computer Architects Tzvetan S. Metodi, Frederic T. Chong 2006 MOCL005-FM MOCL005-FM.cls June 27, 2008 8:35 Copyright © 2008 by Morgan & Claypool All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means—electronic, mechanical, photocopy, recording, or any other except for brief quotations in printed reviews, without the prior permission of the publisher. Computer Architecture Techniques for Power-Efficiency Stefanos Kaxiras and Margaret Martonosi www.morganclaypool.com ISBN: 9781598292084 paper ISBN: 9781598292091 ebook DOI: 10.2200/S00119ED1V01Y200805CAC004 A Publication in the Morgan & Claypool Publishers
    [Show full text]
  • Analysis of Body Bias Control Using Overhead Conditions for Real Time Systems: a Practical Approach∗
    IEICE TRANS. INF. & SYST., VOL.E101–D, NO.4 APRIL 2018 1116 PAPER Analysis of Body Bias Control Using Overhead Conditions for Real Time Systems: A Practical Approach∗ Carlos Cesar CORTES TORRES†a), Nonmember, Hayate OKUHARA†, Student Member, Nobuyuki YAMASAKI†, Member, and Hideharu AMANO†, Fellow SUMMARY In the past decade, real-time systems (RTSs), which must in RTSs. These techniques can improve energy efficiency; maintain time constraints to avoid catastrophic consequences, have been however, they often require a large amount of power since widely introduced into various embedded systems and Internet of Things they must control the supply voltages of the systems. (IoTs). The RTSs are required to be energy efficient as they are used in embedded devices in which battery life is important. In this study, we in- Body bias (BB) control is another solution that can im- vestigated the RTS energy efficiency by analyzing the ability of body bias prove RTS energy efficiency as it can manage the tradeoff (BB) in providing a satisfying tradeoff between performance and energy. between power leakage and performance without affecting We propose a practical and realistic model that includes the BB energy and the power supply [4], [5].Itseffect is further endorsed when timing overhead in addition to idle region analysis. This study was con- ducted using accurate parameters extracted from a real chip using silicon systems are enabled with silicon on thin box (SOTB) tech- on thin box (SOTB) technology. By using the BB control based on the nology [6], which is a novel and advanced fully depleted sili- proposed model, about 34% energy reduction was achieved.
    [Show full text]
  • Dynamic Voltage/Frequency Scaling and Power-Gating of Network-On-Chip with Machine Learning
    Dynamic Voltage/Frequency Scaling and Power-Gating of Network-on-Chip with Machine Learning A thesis presented to the faculty of the Russ College of Engineering and Technology of Ohio University In partial fulfillment of the requirements for the degree Master of Science Mark A. Clark May 2019 © 2019 Mark A. Clark. All Rights Reserved. 2 This thesis titled Dynamic Voltage/Frequency Scaling and Power-Gating of Network-on-Chip with Machine Learning by MARK A. CLARK has been approved for the School of Electrical Engineering and Computer Science and the Russ College of Engineering and Technology by Avinash Karanth Professor of Electrical Engineering and Computer Science Dennis Irwin Dean, Russ College of Engineering and Technology 3 Abstract CLARK, MARK A., M.S., May 2019, Electrical Engineering Dynamic Voltage/Frequency Scaling and Power-Gating of Network-on-Chip with Machine Learning (89 pp.) Director of Thesis: Avinash Karanth Network-on-chip (NoC) continues to be the preferred communication fabric in multicore and manycore architectures as the NoC seamlessly blends the resource efficiency of the bus with the parallelization of the crossbar. However, without adaptable power management the NoC suffers from excessive static power consumption at higher core counts. Static power consumption will increase proportionally as the size of the NoC increases to accommodate higher core counts in the future. NoC also suffers from excessive dynamic energy as traffic loads fluctuate throughout the execution of an application. Power- gating (PG) and Dynamic Voltage and Frequency Scaling (DVFS) are two highly effective techniques proposed in literature to reduce static power and dynamic energy in the NoC respectively.
    [Show full text]
  • Power Reduction Techniques for Microprocessor Systems
    Power Reduction Techniques For Microprocessor Systems VASANTH VENKATACHALAM AND MICHAEL FRANZ University of California, Irvine Power consumption is a major factor that limits the performance of computers. We survey the “state of the art” in techniques that reduce the total power consumed by a microprocessor system over time. These techniques are applied at various levels ranging from circuits to architectures, architectures to system software, and system software to applications. They also include holistic approaches that will become more important over the next decade. We conclude that power management is a multifaceted discipline that is continually expanding with new techniques being developed at every level. These techniques may eventually allow computers to break through the “power wall” and achieve unprecedented levels of performance, versatility, and reliability. Yet it remains too early to tell which techniques will ultimately solve the power problem. Categories and Subject Descriptors: C.5.3 [Computer System Implementation]: Microcomputers—Microprocessors;D.2.10 [Software Engineering]: Design— Methodologies; I.m [Computing Methodologies]: Miscellaneous General Terms: Algorithms, Design, Experimentation, Management, Measurement, Performance Additional Key Words and Phrases: Energy dissipation, power reduction 1. INTRODUCTION of power; so much power, in fact, that their power densities and concomitant Computer scientists have always tried to heat generation are rapidly approaching improve the performance of computers. levels comparable to nuclear reactors But although today’s computers are much (Figure 1). These high power densities faster and far more versatile than their impair chip reliability and life expectancy, predecessors, they also consume a lot increase cooling costs, and, for large Parts of this effort have been sponsored by the National Science Foundation under ITR grant CCR-0205712 and by the Office of Naval Research under grant N00014-01-1-0854.
    [Show full text]
  • Real-Time Dynamic Voltage Scaling for Low-Power Embedded Operating Systems£
    Real-Time Dynamic Voltage Scaling for Low-Power Embedded Operating Systems£ Padmanabhan Pillai and Kang G. Shin Real-Time Computing Laboratory Department of Electrical Engineering and Computer Science The University of Michigan Ann Arbor, MI 48109-2122, U.S.A. pillai,kgshin @eecs.umich.edu ABSTRACT ful microprocessors running sophisticated, intelligent control soft- In recent years, there has been a rapid and wide spread of non- ware in a vast array of devices including digital camcorders, cellu- traditional computing platforms, especially mobile and portable com- lar phones, and portable medical devices. puting devices. As applications become increasingly sophisticated and processing power increases, the most serious limitation on these Unfortunately, there is an inherent conflict in the design goals be- devices is the available battery life. Dynamic Voltage Scaling (DVS) hind these devices: as mobile systems, they should be designed to has been a key technique in exploiting the hardware characteristics maximize battery life, but as intelligent devices, they need powerful of processors to reduce energy dissipation by lowering the supply processors, which consume more energy than those in simpler de- voltage and operating frequency. The DVS algorithms are shown to vices, thus reducing battery life. In spite of continuous advances in be able to make dramatic energy savings while providing the nec- semiconductor and battery technologies that allow microprocessors essary peak computation power in general-purpose systems. How- to provide much greater computation per unit of energy and longer ever, for a large class of applications in embedded real-time sys- total battery life, the fundamental tradeoff between performance tems like cellular phones and camcorders, the variable operating and battery life remains critically important.
    [Show full text]
  • Happy: Hyperthread-Aware Power Profiling Dynamically
    HaPPy: Hyperthread-aware Power Profiling Dynamically Yan Zhai, University of Wisconsin; Xiao Zhang and Stephane Eranian, Google Inc.; Lingjia Tang and Jason Mars, University of Michigan https://www.usenix.org/conference/atc14/technical-sessions/presentation/zhai This paper is included in the Proceedings of USENIX ATC ’14: 2014 USENIX Annual Technical Conference. June 19–20, 2014 • Philadelphia, PA 978-1-931971-10-2 Open access to the Proceedings of USENIX ATC ’14: 2014 USENIX Annual Technical Conference is sponsored by USENIX. HaPPy: Hyperthread-aware Power Profiling Dynamically Yan Zhai Xiao Zhang, Stephane Eranian Lingjia Tang, Jason Mars University of Wisconsin Google Inc. University of Michigan [email protected] xiaozhang,eranian @google.com lingjia,profmars @eesc.umich.edu { } { } Abstract specified power threshold by suspending a subset of jobs. Quantifying the power consumption of individual appli- Scheduling can also be used to limit processor utilization cations co-running on a single server is a critical compo- to reach energy consumption goals. Beyond power bud- nent for software-based power capping, scheduling, and geting, pricing the power consumed by jobs in datacen- provisioning techniques in modern datacenters. How- ters is also important in multi-tenant environments. ever, with the proliferation of hyperthreading in the last One capability that proves critical in enabling software few generations of server-grade processor designs, the to monitor and manage power resources in large-scale challenge of accurately and dynamically performing this datacenter infrastructures is the attribution of power con- power attribution to individual threads has been signifi- sumption to the individual applications co-running on cantly exacerbated.
    [Show full text]
  • Learning-Directed Dynamic Voltage and Frequency Scaling Scheme with Adjustable Performance for Single-Core and Multi-Core Embedded and Mobile Systems †
    sensors Article Learning-Directed Dynamic Voltage and Frequency Scaling Scheme with Adjustable Performance for Single-Core and Multi-Core Embedded and Mobile Systems † Yen-Lin Chen 1,* , Ming-Feng Chang 2, Chao-Wei Yu 1 , Xiu-Zhi Chen 1 and Wen-Yew Liang 1 1 Department of Computer Science and Information Engineering, National Taipei University of Technology, Taipei 10608, Taiwan; [email protected] (C.-W.Y.); [email protected] (X.-Z.C.); [email protected] (W.-Y.L.) 2 MediaTek Inc., Hsinchu 30078, Taiwan; [email protected] * Correspondence: [email protected]; Tel.: +886-2-27712171 (ext. 4239) † This paper is an expanded version of “Learning-Directed Dynamic Volt-age and Frequency Scaling for Computation Time Prediction” published in Proceedings of 2011 IEEE 10th International Conference on Trust, Security and Privacy in Computing and Communications, Changsha, China, 16–18 November 2011. Received: 6 August 2018; Accepted: 8 September 2018; Published: 12 September 2018 Abstract: Dynamic voltage and frequency scaling (DVFS) is a well-known method for saving energy consumption. Several DVFS studies have applied learning-based methods to implement the DVFS prediction model instead of complicated mathematical models. This paper proposes a lightweight learning-directed DVFS method that involves using counter propagation networks to sense and classify the task behavior and predict the best voltage/frequency setting for the system. An intelligent adjustment mechanism for performance is also provided to users under various performance requirements. The comparative experimental results of the proposed algorithms and other competitive techniques are evaluated on the NVIDIA JETSON Tegra K1 multicore platform and Intel PXA270 embedded platforms.
    [Show full text]
  • Energy Proportional Computing in Commercial Fpgas with Adaptive
    Energy proportional computing in Commercial FPGAs with Adaptive Voltage Scaling Jose Nunez-Yanez Department of Electrical and Electronic Engineering University of Bristol, UK +44 117 3315128 [email protected] ABSTRACT an open-loop configuration. However, worst case variability is Voltage and frequency adaptation can be used to create energy rarely the case. For this reason, this paper investigates Adaptive proportional systems in which energy usage adapts to the amount Voltage Scaling (AVS) in which run-time monitoring of of work to be done in the available time. Closed-loop voltage and performance variability in the silicon is used together with system frequency scaling can also take into account process and characterization to influence the voltage and the frequency on the temperature variations in addition to system load and this removes fly in a closed-loop configuration. a significant proportion of the margins used by device The rest of the paper is structured as follows. Section 2 describes manufacturers. This paper explores the capabilities of commercial related work. Section 3 presents the hardware platform used in FPGAs to use closed-loop adaptive voltage scaling to improve this research while section 4 introduces the design flow that their energy and performance profiles beyond nominal. An embeds the AVS capabilities in the user design. Section 5 adaptive power architecture based on a modified design flow is presents the power adaptive architecture based on the novel in- created with in-situ detectors and dynamic reconfiguration of situ detectors. Section 6 presents and discusses the results clock management resources. The results of deploying AVS in focusing on power and energy measurements.
    [Show full text]
  • The Dynamic Voltage and Frequency Scaling Based on the On-Chip Microcontroller System
    Journal of Theoretical and Applied Information Technology 10th May 2013. Vol. 51 No.1 © 2005 - 2013 JATIT & LLS. All rights reserved. ISSN: 1992-8645 www.jatit.org E-ISSN: 1817-3195 THE DYNAMIC VOLTAGE AND FREQUENCY SCALING BASED ON THE ON-CHIP MICROCONTROLLER SYSTEM 1,2TIEFENG LI, 1CAIWEN MA, 1WENHUA LI 1 Xi'an Institute of Optics and Precision Mechanics of Chinese Academy of Sciences, Xi’an, 710119,China 2The Graduate University of Chinese Academy of Sciences ,Beijing, 100049, China E-mail: [email protected] , [email protected] , [email protected] ABSTRACT With the rapid increase of complexity and size of the on-chip microcontroller system (OCMS), the power consumption issue for the OMCS is increasingly becoming critical and needs to be solved quickly. Because the CPU is often the major power consumer in the OCMS, so an important strategy to achieve energy saving is via the dynamic voltage and frequency scaling (DVFS), which can enable a processor to operate at a range of voltages and frequencies. However, it needs to be emphasized that the conventional DVFS is fully executed by the software scheduler in the operating system, and its main drawback is that the scheduler can’t accurately track the performance requirements of CPU when the dormant frequency of CPU is increasing continuously. In this paper, we firstly present a typical hardware DVFS architecture, which can automatically carry out DVFS without the software scheduler involvement. Therefore, it avoids increasing the software's workload and reduces the power consumption. Keywords: Power Consumption, DVFS, Software Scheduler, Performance Requirement 1.
    [Show full text]