Timing Speculation and Adaptive Reliable Overclocking Techniques for Aggressive Computer Systems Viswanathan Subramanian Iowa State University

Iowa State University Capstones, Theses and Dissertations
Graduate Theses and Dissertations
2009

Timing speculation and adaptive reliable overclocking techniques for aggressive computer systems
Viswanathan Subramanian
Iowa State University

Follow this and additional works at: https://lib.dr.iastate.edu/etd
Part of the Electrical and Computer Engineering Commons

Recommended Citation
Subramanian, Viswanathan, "Timing speculation and adaptive reliable overclocking techniques for aggressive computer systems" (2009). Graduate Theses and Dissertations. 10967.
https://lib.dr.iastate.edu/etd/10967

This Dissertation is brought to you for free and open access by the Iowa State University Capstones, Theses and Dissertations at Iowa State University Digital Repository. It has been accepted for inclusion in Graduate Theses and Dissertations by an authorized administrator of Iowa State University Digital Repository. For more information, please contact [email protected].

Timing speculation and adaptive reliable overclocking techniques for aggressive computer systems
by
Viswanathan Subramanian

A dissertation submitted to the graduate faculty in partial fulfillment of the requirements for the degree of
DOCTOR OF PHILOSOPHY

Major: Computer Engineering

Program of Study Committee:
Arun K. Somani, Major Professor
Akhilesh Tyagi
Randall L. Geiger
Joseph A. Zambreno
David Fernández-Baca

Iowa State University
Ames, Iowa
2009
Copyright © Viswanathan Subramanian, 2009. All rights reserved.

To my dear parents
To my enlightening teachers
To my loving wife
To my caring family
To my beloved friends

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
ACKNOWLEDGEMENTS
ABSTRACT
CHAPTER 1. INTRODUCTION
  1.1 High Performance Computing
    1.1.1 Device Scaling
    1.1.2 Microprocessor Architectures
    1.1.3 Better-Than-Worst-Case Designs
    1.1.4 Adaptive Systems
  1.2 Fault Tolerant Computing
    1.2.1 Transient Faults
    1.2.2 Redundancy Techniques
    1.2.3 Fault Mitigation Techniques
    1.2.4 Exploiting Fault Tolerance to Improve Performance
  1.3 Power/Thermal Aware Computing
  1.4 Contributions of this Thesis
CHAPTER 2. BACKGROUND
  2.1 Parameter Variations
  2.2 Reliable Overclocking
    2.2.1 Timing Error Detection and Recovery
    2.2.2 Timing Error Rate Based Feedback Control System
    2.2.3 Timing Speculation
  2.3 Razor Architecture
  2.4 SPRIT3E Framework
CHAPTER 3. MANIPULATING SHORT-PATHS FOR PERFORMANCE
  3.1 Impact of Short-paths
    3.1.1 Timing Constraints
    3.1.2 Variable or Fixed Phase Shift
    3.1.3 Manipulating Contamination Delay
  3.2 Increasing Contamination Delay of a CLA Adder Circuit - A Case Study
    3.2.1 Analysis of Reliable Overclocking Performance
CHAPTER 4. CHARACTERIZING ADAPTIVE RELIABLE OVERCLOCKING
  4.1 Evaluating Speculative Reliable Overclocking
    4.1.1 Performance Metrics
  4.2 Analysis Framework
    4.2.1 Modeling a Reliably Overclocked Processor (ROP)
    4.2.2 Power and Thermal Modeling
  4.3 Adaptive Clocking
    4.3.1 Clock Tuning Schemes
    4.3.2 Comparing Adaptive Clocking Techniques
  4.4 Reliable Overclocking Analysis
CHAPTER 5. THERMAL IMPACT OF RELIABLE OVERCLOCKING
  5.1 Thermal and Reliability Management
  5.2 Analysis Framework for Estimating On-chip Temperature
    5.2.1 Thermal Throttling
    5.2.2 Simulation Parameters
  5.3 On-chip Temperature Trends in Reliably Overclocked Processors
CHAPTER 6. RELIABLE OVERCLOCKING AND TECHNOLOGY SCALING
  6.1 Technology Scaling
  6.2 A Reliable Overclocking Approach
  6.3 Analysis Framework
  6.4 Performance at Different Technology Nodes
  6.5 Comparing Technology Scaling with Reliable Overclocking
CHAPTER 7. FAULT TOLERANT AGGRESSIVE SYSTEMS
  7.1 Conjoined Pipeline Architecture
    7.1.1 Conjoined Pipeline Datapath Description
    7.1.2 Error Detection and Recovery
  7.2 Timing Requirements
  7.3 Implementation Considerations
    7.3.1 Two Clock Approach
  7.4 Experiments and Results
CHAPTER 8. CONCLUSIONS AND FUTURE WORK

LIST OF TABLES

Table 3.1 Implementation details of CLA adder circuits
Table 4.1 Processor specifications
Table 4.2 Synthesis report of major pipeline stages
Table 4.3 Simulator parameters
Table 4.4 Comparing various performance metrics between a base non-overclocked processor, a reliably overclocked processor tuned using a single clock generator and a reliably overclocked processor tuned using dual clock generators. All the systems execute SPEC2000 integer benchmarks
Table 4.5 Comparing various performance metrics between a base non-overclocked processor, a reliably overclocked processor tuned using a single clock generator and a reliably overclocked processor tuned using dual clock generators. All the systems execute SPEC2000 floating point benchmarks
Table 4.6 Comparing various performance metrics for non-overclocked and reliably overclocked processors executing SPEC2000 integer benchmarks
Table 4.7 Comparing various performance metrics for non-overclocked and reliably overclocked processors executing SPEC2000 floating point benchmarks
Table 4.8 Effect of memory overclocking on the performance benefits of a ROP executing SPEC2000 integer benchmarks
Table 4.9 Effect of memory overclocking on the performance benefits of a ROP executing SPEC2000 floating point benchmarks
Table 5.1 Mean Time To Failure (MTTF) for critical wear out models
Table 5.2 Simulator parameters
Table 6.1 Technology scaling parameters
Table 6.2 Comparing various performance metrics across different technology nodes for a non-overclocked processor executing SPEC2000 integer benchmarks
Table 6.3 Comparing various performance metrics across different technology nodes for a non-overclocked processor executing SPEC2000 floating point benchmarks
Table 7.1 Possible error scenarios
Table 7.2 Fault injection results
Table 7.3 Timing errors

LIST OF FIGURES

Figure 2.1 Cross section of a n-channel MOSFET in the ON state showing channel formation. The channel exhibits pinch-off near drain indicating operation in saturation (active) region.
Figure 2.2 Typical pipeline stage in a ROP. Local timing error detection and recovery scheme for critical registers is shown in detail.
Figure 2.3 Timing diagram showing overclocking advantage per cycle, as compared to the worst-case clock
Figure 2.4 Timing diagram showing pipeline stage level timing speculation
Figure 2.5 Reduced overhead Razor flip-flop and metastability detection circuits (Figure reproduced from [27])
Figure 2.6 SPRIT3E framework
Figure 3.1 Clock timing waveforms showing governing requirements, for MAINCLK and PSCLK, over the full range of overclocked aggressive frequencies (F_MIN → F_MAX)
Figure 3.2 Examples of Main and PS clocks with variable and fixed phase shifts
Figure 3.3 Timing waveforms after increasing contamination delay to half the propagation delay for the full range of overclocked aggressive frequencies (F_MIN → F_MAX)
Figure 3.4 8-bit CLA adder
Figure 3.5 Delay distribution for an 8-bit CLA adder
Figure 3.6 8-bit CLA adder with additional delay blocks to increase contamination delay
Figure 3.7 Delay distribution for an 8-bit CLA adder after increasing contamination delay
Figure 3.8 Experimental setup to estimate performance improvement of CLA adder circuits
Figure 3.9 Percent of error cycles versus clock period for an 8-bit delay added CLA adder circuit
Figure 3.10 Percent of error cycles versus clock period for a 32-bit delay added CLA adder circuit (Contamination delay 1.21 ns)
Figure 3.11 Percent of error cycles versus clock period for a 32-bit delay added CLA adder circuit (Contamination delay 1.38 ns)
Figure 3.12 Percent of error cycles versus clock period for a 64-bit delay added CLA adder circuit
Figure 4.1 Alpha 21264 integer and floating point pipeline showing timing error detection and recovery circuit for critical registers
Figure 4.2 Simulation framework
Figure 4.3 Cumulative error profile for all pipeline stages at overclocked operating frequencies for SPEC2000 integer benchmarks. Also shown separately are error profiles for issue stage and execute stage.
Figure 4.4 Error profile for three SPEC2000 integer benchmarks executing five different instruction and data sets