Enhancing Energy-Performance for Power Constrained Soc Systems

ENHANCING ENERGY-PERFORMANCE FOR POWER CONSTRAINED SOC SYSTEMS Rami Jioussy Technion - Computer Science Department - M.Sc. Thesis MSC-2015-02 - 2015 Technion - Computer Science Department - M.Sc. Thesis MSC-2015-02 - 2015 ENHANCING ENERGY-PERFORMANCE FOR POWER CONSTRAINED SOC SYSTEMS Research Thesis Submitted in partial fulfillment of the requirements for the degree of Master of Science in Computer Science Rami Jioussy Submitted to the Senate of the Technion Israel Institute of Technology Sh’vat 5775 Haifa February 2015 Technion - Computer Science Department - M.Sc. Thesis MSC-2015-02 - 2015 Technion - Computer Science Department - M.Sc. Thesis MSC-2015-02 - 2015 This research was carried out under the supervision of Prof. Avi Mendelson and Dr. Yariv Aridor (Intel), in the Faculty of Computer Science. Acknowledgments I would like to thank my supervisors, Prof. Avi Mendelson for enriching me with a lot of information and valuable help, and Dr. Yariv Aridor (Intel) for keeping pushing me forward, providing me with guidance to keep me focused on this thesis objectives. All this research wouldn’t have even started without the encouragement, help and endless patience of my dear ex-wife, Lana Kattawi Jioussy (Rest In Peace). I would like to thank my team at Intel (specially Shai Satt and Shiri Manor) for demonstrating patience with my intermittent absence and providing me the hours during a regular work-day to complete my master studies. Last and not least, I would like to thank my loving wife, Heba and my family for keeping pushing me forward and being there for me in the tough moments. Technion - Computer Science Department - M.Sc. Thesis MSC-2015-02 - 2015 Technion - Computer Science Department - M.Sc. Thesis MSC-2015-02 - 2015 Contents Abstract ........................................................................................................................................... 1 1 Introduction ............................................................................................................................. 2 2 Related works .......................................................................................................................... 4 3 Glossary and Assumptions....................................................................................................... 5 3.1 Terms ............................................................................................................................... 5 3.2 Assumptions .................................................................................................................... 6 4 Technical Background .............................................................................................................. 7 4.1 SoC Architecture .............................................................................................................. 7 4.1.1 Shared power envelop............................................................................................. 8 4.2 OS power-policies ............................................................................................................ 8 4.3 OpenCL .......................................................................................................................... 11 4.3.1 OpenCL kernel splitting for Hybrid execution ....................................................... 12 5 This research focus ................................................................................................................ 14 5.1 The hybrid execution model .......................................................................................... 14 5.1.1 Mapping to the OpenCL model ............................................................................. 14 5.2 EDP: energy-performance metric .................................................................................. 15 6 WHP: an energy-performance optimizing method ............................................................... 16 6.1 Observations .................................................................................................................. 16 6.1.1 Constant power consumption ............................................................................... 16 6.1.2 Performance is linear............................................................................................. 16 6.1.3 Package power offset ............................................................................................ 17 6.1.4 TDP budget interferes hybrid execution on SoC systems ..................................... 17 6.2 Device completion assumption ..................................................................................... 19 6.3 Algorithm ....................................................................................................................... 19 6.3.1 Stage 1: Constructing FT and FP ............................................................................ 20 6.3.2 Stage 2: Determine optimal configuration ............................................................ 20 6.3.3 Stage 3: Apply (cpufreq, gpufreq, α) settings ............................................................. 21 6.4 Walk-through example .................................................................................................. 21 7 Experimental testbed ............................................................................................................ 23 7.1 The testbed hardware ................................................................................................... 23 7.2 Measurement Methodology ......................................................................................... 23 7.2.1 Measuring power .................................................................................................. 24 7.2.2 Measuring performance (execution time) ............................................................ 24 7.3 Workloads...................................................................................................................... 24 Technion - Computer Science Department - M.Sc. Thesis MSC-2015-02 - 2015 8 Experiments ........................................................................................................................... 26 8.1 Demonstrating the observations .................................................................................. 26 8.1.1 Constant power consumption ............................................................................... 26 8.1.2 Performance is linear............................................................................................. 26 8.1.3 Package power offset ............................................................................................ 27 8.2 Applying WHP ................................................................................................................ 28 8.3 WHP vs. Balanced .......................................................................................................... 29 8.3.1 Results discussion .................................................................................................. 30 8.4 WHP vs. other OS policies ............................................................................................. 34 9 Summary and future work .................................................................................................... 36 9.1 Ideas for future work ..................................................................................................... 36 10 References ......................................................................................................................... 37 List of Tables Table 1: Mapping the hybrid execution model onto OpenCL ....................................................... 14 Table 2 FT and FP for AES256 kernel ............................................................................................. 21 Table 3 WHP result for AES26 kernel ............................................................................................ 21 Table 4: The Testbed platform specification ................................................................................. 23 Table 5 Characteristics of the testbed kernels .............................................................................. 25 Table 6 Testbed kernels power consumption: CPUwatt + GPUwatt / Packagewatt. Confirms Equation 3. .................................................................................................................................................... 28 Table 7: WHP scores (execution time, energy, EDP). The runtime EDP is calculated by Time x Energy. The EDP matching ratio is calculated by MIN (runtime EDP, computed EDP)/MAX (runtime EDP, computed EDP) ...................................................................................................... 29 Table 8: balanced-mode policy scores (execution time, energy, EDP). The runtime EDP is calculated by Time x Energy. ......................................................................................................... 29 Table 9 High-performance policy scores ...................................................................................... 35 Table 10 Power-save policy scores ................................................................................................ 35 List of Figures Figure 1 IvyBridge SoC Architecture ................................................................................................ 7 Figure 2 High-performance power policy. CPU is set to 2800MHz

Enhancing Energy-Performance for Power Constrained Soc Systems

Wind Rose Data Comes in the Form >200,000 Wind Rose Images

Copyrighted Material

Power1.Ps (Mpage)

Chapter 1-Introduction to Microprocessors File

Power Architecture® ISA 2.06 Stride N Prefetch Engines to Boost Application's Performance

Floboss 107 Flow Manager Instruction Manual

Ibm Power8 Processors Analýza Výkonnosti Procesorů Ibm Power8

Computer Architectures an Overview

The POWER4 Processor Introduction and Tuning Guide

Ilore: Discovering a Lineage of Microprocessors

POWER Processor

Qoriq LS1012A SDK V0.3 Contents