Computer Power Management Rules
- Jim Kardach, retired chief power architect, Intel
  http://www.youtube.com/watch?v=cZ6akewB0ps
HW1
- Has been posted on the online schedule.
- Due on March 3rd, 1pm.
- Submit in class.
- Hard deadline: no homework accepted after the deadline.
- No collaboration is allowed.
Chenyang Lu, CSE 467S
The Power Problem
- Processors improve performance at the cost of power.
  - Performance per watt remains low.
- Solution
  - Hardware offers mechanisms for saving power.
  - Software executes power management policies.
Power vs. Energy
- Power: energy consumed per unit time
  - 1 watt = 1 joule/second
- Power → heat
- Energy → battery life
Why worry about energy? Intel vs. Duracell
[Figure: improvement compared to year 0 (1x to 16x) versus time (0 to 6 years) for processor (MIPS), hard disk (capacity), memory (capacity), and battery (energy stored).]
- No Moore's Law in batteries: 2-3%/year growth.
Trend in Power Density
[Figure: power density (Watts/cm²) versus process generation (1.5µ down to 0.07µ) for i386, i486, Pentium, Pentium Pro, Pentium II, Pentium III, and Pentium 4, with reference levels for a hot plate, a nuclear reactor, a rocket nozzle, and the Sun's surface.]
Source: Fred Pollack, New Microarchitecture Challenges in the Coming Generations of CMOS Process Technologies, Intel Corp., Micro, 1999.
Trend in Cooling Solution
Power
- Hardware support
- Power management policy
- Power manager
- Holistic approach
CMOS Power Consumption
- Voltage drops: power consumption ∝ V².
- Toggling: more activity → higher power.
- Leakage when inactive.
Power-Saving Features
- Voltage drops: reduce power supply voltage.
- Toggling: run at lower clock frequency; reduce activity; disable function units when not in use.
- Leakage: disconnect parts from power supply when not in use.
- Why voltage scaling?
  - Power ∝ V² → reducing the power supply voltage saves energy.
  - Lower voltage → lower clock frequency.
    - Tradeoff between performance and energy.
- Why dynamic?
  - Peak computing demand is much higher than average.
- Changing voltage takes time
  - to stabilize the power supply and clock.
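The voltage/frequency tradeoff can be made concrete with the standard CMOS switching-power model, P = activity × C × V² × f. The sketch below is illustrative only: the capacitance, the cycle count, and the pairing of 2.0 V with 600 MHz and 1.4 V with 200 MHz are assumptions, and leakage is ignored.

```python
def dynamic_power(c_eff, voltage, freq_hz, activity=1.0):
    """Switching power in watts: P = activity * C * V^2 * f (leakage ignored)."""
    return activity * c_eff * voltage ** 2 * freq_hz


def energy_for_cycles(c_eff, voltage, freq_hz, cycles):
    """Energy in joules to execute a fixed number of cycles at one operating point."""
    seconds = cycles / freq_hz
    return dynamic_power(c_eff, voltage, freq_hz) * seconds


# Hypothetical numbers, chosen only to illustrate the tradeoff.
C_EFF = 1e-9      # effective switched capacitance (farads), assumed
CYCLES = 600e6    # amount of work to do, assumed

fast = energy_for_cycles(C_EFF, 2.0, 600e6, CYCLES)  # finishes in 1 s
slow = energy_for_cycles(C_EFF, 1.4, 200e6, CYCLES)  # finishes in 3 s

print(f"fast: {fast:.3f} J, slow: {slow:.3f} J, ratio: {slow / fast:.2f}")
```

Under this model the energy for a fixed amount of work depends on V² but not on f, so the saving comes from the lower voltage, and the lower clock is the performance price paid for it.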
Examples
- StrongARM SA-1100 takes two supplies
  - VDD is main 3.3V supply.
  - VDDX is 1.5V.
- AMD K6-2+
  - 8 frequencies: 200-600 MHz.
  - Voltage: 1.4, 2.0 V.
  - Transition time: 0.4 ms for voltage change.
- PowerPC 603
  - Can shut down unused execution units.
  - Cache organized into subarrays to reduce active circuitry.
Intel SpeedStep
[Figures: Intel Core 2 Duo E6600; Intel Pentium M P states]
Linux DVFS Governors
- Performance
  - Always set at the max frequency
- Powersave
  - Always set at the lowest frequency
- Ondemand
  - Automatically adjusts the frequency according to CPU usage
- Conservative
  - Like ondemand, but changes the frequency more gradually
- Userspace
  - Set at a fixed frequency by the user
Ondemand
- Initial implementation in Linux 2.6.9
- For all CPUs
  - if (> 80% busy) then P0 (max frequency)
  - if (< 20% busy) then down by 20%
- Multiple improvements since 2.6.9
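The two rules of the initial ondemand governor can be sketched as a single decision step. This is not the kernel code: the function name is hypothetical, and "down by 20%" is read here as "reduce the current frequency by 20%", which is an assumption.

```python
def ondemand_step(busy_fraction, cur_khz, min_khz, max_khz):
    """One sampling step of a simplified ondemand-style rule.

    busy_fraction: CPU utilisation over the last sampling interval (0.0 to 1.0).
    Returns the frequency (kHz) to request next.
    """
    if busy_fraction > 0.80:
        return max_khz                            # jump straight to P0
    if busy_fraction < 0.20:
        return max(min_khz, cur_khz * 80 // 100)  # step down by 20%, not below min
    return cur_khz                                # otherwise keep the current frequency


print(ondemand_step(0.95, 1600000, 1600000, 2400000))
print(ondemand_step(0.10, 2400000, 1600000, 2400000))
```

A real governor would also snap the result to one of the frequencies the hardware supports; that step is omitted here.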
Get & Set CPU Frequency
- Get the current frequency:
  - /sys/devices/system/cpu/cpu[X]/cpufreq/scaling_cur_freq
  - Example: 2400000 (in kHz, i.e. 2.4 GHz)
- Frequencies & governors available:
  - /sys/devices/system/cpu/cpu[X]/cpufreq/scaling_available_frequencies
  - Example: 2400000 2133000 1867000 1600000
  - /sys/devices/system/cpu/cpu[X]/cpufreq/scaling_available_governors
  - Example: ondemand userspace performance powersave conservative
- Set the frequency:
  - Requires root privilege
  - echo userspace > /sys/devices/system/cpu/cpu[X]/cpufreq/scaling_governor
  - echo 2133000 > /sys/devices/system/cpu/cpu[X]/cpufreq/scaling_setspeed
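The same sysfs files can be read and written from a script. The helper names below are hypothetical, and the `root` parameter exists only so the functions can be pointed at a test directory. Writing requires root, and some cpufreq drivers do not expose `scaling_available_frequencies` or the userspace governor, so treat this as a sketch.

```python
from pathlib import Path

SYSFS_CPU = Path("/sys/devices/system/cpu")


def cpufreq_dir(cpu, root=SYSFS_CPU):
    """Directory holding the cpufreq attributes of one CPU."""
    return Path(root) / f"cpu{cpu}" / "cpufreq"


def read_attr(cpu, name, root=SYSFS_CPU):
    """Read one cpufreq attribute as a stripped string."""
    return (cpufreq_dir(cpu, root) / name).read_text().strip()


def current_khz(cpu, root=SYSFS_CPU):
    """Current frequency in kHz (scaling_cur_freq)."""
    return int(read_attr(cpu, "scaling_cur_freq", root))


def available_khz(cpu, root=SYSFS_CPU):
    """Frequencies in kHz listed in scaling_available_frequencies."""
    text = read_attr(cpu, "scaling_available_frequencies", root)
    return [int(f) for f in text.split()]


def set_fixed_khz(cpu, khz, root=SYSFS_CPU):
    """Select the userspace governor, then request a fixed frequency."""
    d = cpufreq_dir(cpu, root)
    (d / "scaling_governor").write_text("userspace\n")
    (d / "scaling_setspeed").write_text(f"{khz}\n")
```

Usage on a machine that supports it, run as root: `set_fixed_khz(0, 2133000)` followed by `current_khz(0)`.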
Clock Gating
- Applicable to clocked digital components
  - Processors, controllers, memories
- Stop clock → stop signal propagation in circuits
✔ Short transition time
  - Clock generation is not stopped
  - Only clock distribution is stopped
✘ Relatively high power consumption
  - Clock itself still consumes energy
  - Cannot prevent power leaking
Supply Shutdown
- Disconnect parts from power supply when not in use.
✔ General
✔ Saves the most power
✘ Long transition time
Example: SA-1100
Three power modes:
- Run: normal operation.
- Idle: stops CPU clock, with I/O logic still powered.
- Sleep: shuts off most chip activity.
SA-1100 SLEEP
- RUN → SLEEP
  - (30 µs) Flush CPU state (registers) to memory
  - (30 µs) Reset processor state and wakeup event
  - (30 µs) Shut down clock
- SLEEP → RUN
  - (10 ms) Ramp up power supply
  - (150 ms) Stabilize clock
  - (negligible) CPU boot
Intel Core Duo Processor SV
State  Name                           Vcc (V)  Power (W)
C0     High Frequency Mode (P0)       1.3      31
C0     Low Frequency Mode (Pn)        1.0      -
C1     Auto Halt Stop Grant (HFM)     -        15.8
C1E    Enhanced Halt (LFM)            -        4.8
C2     Stop Clock (HFM)               -        15.5
C2E    Enhanced Stop Clock (LFM)      -        4.7
C3     Deep Sleep (HFM)               -        10.5
C3E    Enhanced Deep Sleep (LFM)      -        3.4
C4     Intel Deeper Sleep             0.85     2.2
DC4    Intel Enhanced Deeper Sleep    0.80     1.8
(- : value not listed on the slide)
Source: Intel® Core™ Duo Processor on 65nm Process, Datasheet; Ottawa Linux Symposium, July 19, 2006.
[Figure source: The Mote Revolution: Low Power Wireless Sensor Network Devices, Joseph Polastre, Robert Szewczyk, Cory Sharp, David Culler, Hot Chips 16.]
Power Consumption: Computer with Wireless NIC
Power
- Hardware support
- Power management policy
- Power manager
- Holistic approach
Approaches
- Static Power Management
  - Does not depend on activity.
  - Example: user-activated power-down.
- Dynamic Power Management
  - Adapts to activity at run time.
  - Example: automatically disabling function units.
Dynamic Power Management
- Inherent tradeoff: energy vs. performance
- Fundamental premises
  - Non-uniform workload during operation
  - Possible to predict workload with some degree of accuracy
PowerPC 603 Activity
Percentage of time idle for SPEC integer/floating-point:
Unit             Specint92  Specfp92
D cache          29%        28%
I cache          29%        17%
load/store       35%        17%
fixed-point      38%        76%
floating-point   99%        30%
system register  89%        97%
Problem Formulations
- Minimize energy under performance constraints
  - Real-time applications
- Optimize performance under energy/power constraints
  - Battery lifetime (energy)
  - Temperature (power)
Power Down/Up Cost
- Going into/out of an inactive mode costs
  - time
  - energy
- Must determine if going into an inactive mode is worthwhile.
- Model power states with a Power State Machine (PSM)
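SA-1100 Power State Machine
- States: run ($P_{ON}$ = 400 mW), idle (P = 50 mW), sleep ($P_{OFF}$ = 0.16 mW)
- Transitions: run ↔ idle 10 µs each way; run → sleep 90 µs; idle → sleep 90 µs; sleep → run 160 ms
- $P_{TR} = P_{ON}$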
Greedy Policy
- Immediately goes to sleep when the system becomes idle
- Works when transition time is negligible
  - Ex. between IDLE and RUN in SA-1100
- Doesn't work when transition time is long!
  - Ex. between SLEEP and RUN/IDLE in SA-1100
  - Need better solutions!
Break-Even Time $T_{BE}$
- Minimum idle time required to compensate for the cost of entering an inactive state.
- Entering an inactive state is beneficial only if idle time > $T_{BE}$.
Break-Even Time
$P_{TR} \le P_{ON}$
- $P_{TR}$: power consumption during transition
- $P_{ON}$: power consumption when active
- $T_{BE}$ of an inactive state is the total time it takes to enter and leave the state
- $T_{BE} = T_{TR} = T_{ON,OFF} + T_{OFF,ON}$
  - $T_{BE}$ = 160 ms + 90 µs for SLEEP in SA-1100
Break-Even Time
$P_{TR} > P_{ON}$
- $T_{BE}$ must include additional inactive time to compensate for the extra power consumed during transition.
  $T_{BE} = T_{TR} + T_{TR} \dfrac{P_{TR} - P_{ON}}{P_{ON} - P_{OFF}}$
- Reduce $T_{BE}$ → save more energy
  - Shorter $T_{TR}$
  - Larger power difference $P_{ON} - P_{OFF}$
  - Lower $P_{TR}$
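Both break-even cases fit in one small function. The sketch below follows the two formulas on these slides and plugs in the SA-1100 SLEEP numbers from the power state machine (90 µs down, 160 ms up, $P_{TR} = P_{ON}$ = 400 mW, $P_{OFF}$ = 0.16 mW). The function and variable names are illustrative.

```python
def break_even_time(t_tr, p_tr, p_on, p_off):
    """Break-even time T_BE of an inactive state (seconds; powers in watts).

    t_tr:  total transition time, T_ON,OFF + T_OFF,ON
    p_tr:  power drawn during the transitions
    p_on:  power drawn when active
    p_off: power drawn in the inactive state
    """
    if p_on <= p_off:
        raise ValueError("the inactive state must draw less power than the active state")
    if p_tr <= p_on:
        return t_tr  # transitions cost no extra power: T_BE = T_TR
    # Extra inactive time needed to pay back the transition overhead.
    return t_tr + t_tr * (p_tr - p_on) / (p_on - p_off)


T_TR_SLEEP = 90e-6 + 160e-3  # run -> sleep plus sleep -> run
sa1100_sleep = break_even_time(T_TR_SLEEP, p_tr=0.400, p_on=0.400, p_off=0.16e-3)
print(f"SA-1100 SLEEP break-even time: {sa1100_sleep * 1e3:.2f} ms")
```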
Inherent Exploitability
- Achievable energy saving depends on workload!
  - Distribution of idle periods
- Given an idle period $T_{idle} > T_{BE}$
  - $E_S(T_{idle}) = (T_{idle} - T_{TR})(P_{ON} - P_{OFF}) + T_{TR}(P_{ON} - P_{TR})$
- Assumptions
  - No performance penalty.
  - Ideal manager with knowledge of workload in advance.
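A minimal sketch of the saving formula, applied to a list of idle periods. The idle trace is made up for illustration. Clamping at zero models the ideal manager, which simply stays on whenever the formula would go negative (that is, whenever $T_{idle} \le T_{BE}$).

```python
def energy_saved(t_idle, t_tr, p_tr, p_on, p_off):
    """E_S for one idle period under an ideal manager (joules)."""
    saving = (t_idle - t_tr) * (p_on - p_off) + t_tr * (p_on - p_tr)
    return max(0.0, saving)  # the ideal manager stays on if shutting down would not pay


def exploitable_energy(idle_periods, t_tr, p_tr, p_on, p_off):
    """Upper bound on the energy any policy can save on this workload."""
    return sum(energy_saved(t, t_tr, p_tr, p_on, p_off) for t in idle_periods)


# SA-1100 SLEEP state; the idle periods (seconds) are hypothetical.
T_TR, P_TR, P_ON, P_OFF = 90e-6 + 160e-3, 0.400, 0.400, 0.16e-3
trace = [0.05, 0.1, 1.0, 4.0]
print(f"{exploitable_energy(trace, T_TR, P_TR, P_ON, P_OFF):.4f} J")
```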
Inherent Exploitability Based on Real Workload
Time-Power Product
Workload-independent metric
$C_S = T_{BE} P_{OFF}$
- An inactive state with lower $C_S$ may save more energy
- Only a crude estimate
  - May not be representative of real power savings
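As a quick check of the metric, the sketch below evaluates $C_S$ for the two inactive states of the SA-1100. It assumes $P_{TR} \le P_{ON}$ for both states, so that $T_{BE}$ equals the round-trip transition time, and it treats the 50 mW idle power as that state's $P_{OFF}$.

```python
def time_power_product(t_be, p_off):
    """C_S = T_BE * P_OFF, in joules."""
    return t_be * p_off


idle_cs = time_power_product(10e-6 + 10e-6, 50e-3)       # run <-> idle, 50 mW
sleep_cs = time_power_product(90e-6 + 160e-3, 0.16e-3)   # run <-> sleep, 0.16 mW

print(f"idle:  {idle_cs:.3e} J")
print(f"sleep: {sleep_cs:.3e} J")
```

With these numbers idle has the lower $C_S$ even though sleep draws far less power, which is consistent with the caveat that $C_S$ is only a crude estimate: the actual saving depends on how long the idle periods are.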
Predictive Techniques
- Event of interest: $p = \{T_{idle} > T_{BE}\}$
  - Predicted based on history
- Observed event: $o$
  - Triggers state transition
- Objective: predict $p$ based on $o$
Metrics
- Safety: conditional probability Prob(p|o)
  - If an observed event happens → the probability that $T_{idle} > T_{BE}$
  - Ideally, safety = 1.
- Efficiency: Prob(o|p)
  - If $T_{idle} > T_{BE}$ → the probability of correctly predicting it.
- Overprediction → high performance penalty → poor safety
- Underprediction → wastes energy → poor efficiency
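Both metrics can be estimated from a log with one record per idle period. The record layout below (a pair of booleans: did the observed event fire, and was the idle period actually longer than $T_{BE}$) is an assumption made for this sketch.

```python
def safety_and_efficiency(samples):
    """Estimate Prob(p|o) and Prob(o|p) from (observed, long_idle) pairs.

    Returns (safety, efficiency); an entry is None when it is undefined
    because the conditioning event never occurred.
    """
    samples = list(samples)
    both = sum(1 for o, p in samples if o and p)
    n_o = sum(1 for o, _ in samples if o)
    n_p = sum(1 for _, p in samples if p)
    safety = both / n_o if n_o else None
    efficiency = both / n_p if n_p else None
    return safety, efficiency


log = [(True, True), (True, True), (True, False),
       (False, True), (False, True), (False, False)]
safety, efficiency = safety_and_efficiency(log)
print(safety, efficiency)
```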
Fixed Timeout Policy
- Enter the inactive state when the system has been idle for $T_{TO}$
  - $o$: $T_{idle} > T_{TO}$
- Wake up in response to activity
- Hypothesis: if the system has been idle for $T_{TO}$ → it will continue to be idle for $T_{idle} - T_{TO} > T_{BE}$
$T_{TO}$???
- Increasing $T_{TO}$ improves safety, but reduces efficiency.
- Highly workload dependent
- Karlin's result: $T_{TO} = T_{BE}$ → energy consumption is at most twice the energy consumed under an ideal policy
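The factor-of-two bound can be checked numerically with a simple per-idle-period energy model. The model is an assumption of this sketch: each shutdown is charged $P_{TR} T_{TR}$ for the full round trip, the oracle powers down only when that is cheaper than staying on, and the idle trace is made up. The power numbers are those of the SA-1100 SLEEP state.

```python
def ideal_energy(t_idle, t_tr, p_tr, p_on, p_off):
    """Energy used in one idle period by an oracle that knows its length."""
    stay_on = p_on * t_idle
    if t_idle < t_tr:  # too short to complete the transitions
        return stay_on
    power_down = p_tr * t_tr + p_off * (t_idle - t_tr)
    return min(stay_on, power_down)


def timeout_energy(t_idle, t_to, t_tr, p_tr, p_on, p_off):
    """Energy used in one idle period by a fixed-timeout policy."""
    if t_idle <= t_to:
        return p_on * t_idle  # timeout never expires, stays on
    asleep = max(0.0, t_idle - t_to - t_tr)
    return p_on * t_to + p_tr * t_tr + p_off * asleep


P_ON = P_TR = 0.400
P_OFF = 0.16e-3
T_TR = 90e-6 + 160e-3
T_BE = T_TR                                # because P_TR <= P_ON
IDLE_TRACE = [0.05, 0.2, 0.5, 2.0, 10.0]   # seconds, hypothetical

e_ideal = sum(ideal_energy(t, T_TR, P_TR, P_ON, P_OFF) for t in IDLE_TRACE)
e_timeout = sum(timeout_energy(t, T_BE, T_TR, P_TR, P_ON, P_OFF) for t in IDLE_TRACE)
print(f"ideal {e_ideal:.4f} J, timeout {e_timeout:.4f} J, ratio {e_timeout / e_ideal:.2f}")
```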
Impact of Timeout Threshold
Impact of Workloads
Critique: Fixed Timeout
- How to set the timeout threshold?
  - Tradeoff between safety and efficiency
  - Works best when workload traces are available
- Fundamental limitations
  - Always wastes energy before reaching the timeout threshold
  - Always incurs a performance penalty for wake-up
Possible Improvement
- Predictive shutdown
  - Shut down immediately when an idle period starts.
  - Avoids wasting energy before reaching the timeout threshold.
  - More efficient, less safe.
- Predictive wakeup
  - Wake up when the predicted idle time expires, even if no new activity has occurred.
  - Avoids the performance penalty for wakeup.
  - Less efficient, safer.
Predictive Shutdown: Threshold-based Policy
- Observation: a short active period tends to be followed by a long idle period.
- If active period < threshold, the following idle period is predicted to be longer than $T_{BE}$.
- What is the right threshold?
  - Workload dependent
  - Requires offline analysis
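The threshold rule and the kind of offline analysis it calls for can be sketched as follows. The trace format (pairs of active time and following idle time) and all numbers are hypothetical.

```python
def predict_long_idle(active_time, threshold):
    """Threshold rule: a short active period predicts a long idle period."""
    return active_time < threshold


def evaluate_threshold(trace, threshold, t_be):
    """Count shutdowns and how many of them were followed by an idle period > T_BE.

    trace: list of (active_time, following_idle_time) pairs in seconds.
    Returns (shutdowns, correct_shutdowns).
    """
    predicted = [idle for active, idle in trace if predict_long_idle(active, threshold)]
    correct = sum(1 for idle in predicted if idle > t_be)
    return len(predicted), correct


# Hypothetical trace: (active, idle) in seconds.
trace = [(0.02, 3.0), (0.50, 0.05), (0.03, 1.2), (0.04, 0.10), (0.80, 0.02)]
for threshold in (0.01, 0.1, 1.0):
    print(threshold, evaluate_threshold(trace, threshold, t_be=0.16))
```

Sweeping the threshold over a recorded trace like this is one way to see the workload dependence: too low and nothing is shut down, too high and every idle period triggers a shutdown.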
Threshold-based Predictive Shutdown
Predictive Wakeup: Regression-based Algorithm
- Predict the length of an idle period based on
  - the preceding active period
  - the previous n pairs of idle/active periods
- More complicated than fixed timeout
  - Need to maintain history information
- Depends on offline analysis and traces to determine the regression function and parameters
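The slides do not specify the regression function. As a stand-in, the sketch below fits a straight line (ordinary least squares) from the preceding active period to the following idle period; the linear form and the history values are assumptions made purely for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = intercept + slope * x (needs >= 2 distinct xs)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def predict_idle(active_time, intercept, slope):
    """Predicted idle time, never negative."""
    return max(0.0, intercept + slope * active_time)


# Hypothetical history: active periods and the idle periods that followed them.
active_history = [1.0, 2.0, 3.0]
idle_history = [5.0, 3.0, 1.0]
intercept, slope = fit_line(active_history, idle_history)
print(predict_idle(1.5, intercept, slope))
```

A power manager would compare the predicted idle time with $T_{BE}$ to decide whether to shut down, and could schedule the wake-up for when the predicted idle time expires.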
Adapt to Workload Changes
- Grade n timeout thresholds based on history
  - Use the best one for prediction
  - Use a weighted average of the n thresholds
- Adjust timeout
  - Increase the timeout threshold if it causes too many shutdowns
  - Decrease the timeout threshold if it causes too few shutdowns
- Stochastic techniques
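One way to realise the "adjust timeout" idea is to update the threshold after every idle period. The specific rule below (multiply by a factor after a shutdown that did not pay off, shrink it after one that did, clamp to a range) is an assumption of this sketch, not a rule taken from the slides.

```python
def adjust_timeout(t_to, t_idle, t_be, up=2.0, down=0.5, lo=1e-3, hi=10.0):
    """Return the timeout to use for the next idle period (seconds).

    t_to:   timeout used for the idle period that just ended
    t_idle: actual length of that idle period
    t_be:   break-even time of the inactive state
    """
    if t_idle > t_to:                 # the timeout expired, so a shutdown happened
        if t_idle - t_to < t_be:
            t_to *= up                # shutdown did not pay off: be more cautious
        else:
            t_to *= down              # shutdown paid off: be more aggressive
    return min(hi, max(lo, t_to))     # keep the threshold in a sane range


t_to = 1.0
for t_idle in [1.1, 5.0, 0.3]:        # hypothetical idle periods
    t_to = adjust_timeout(t_to, t_idle, t_be=0.5)
    print(t_to)
```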
Critiques: History-based Predictors
- Depend on short-term correlation between past & future
  - Holds in many workloads
  - Fails when the correlation is weak
- Workloads in many embedded systems are more predictable than those of PCs
  - Workload (e.g., periodic tasks) known a priori
  - Specialized applications
ESSAT
Efficient Sleep Scheduling based on Application Timing
- Reduces radio power consumption by exploiting the timing properties of periodic queries in sensor networks
- Sleep scheduling incurs low delay penalty
O. Chipara, C. Lu, and G.-C. Roman, Efficient Power Management based on Application Timing Semantics for Wireless Sensor Networks, ICDCS 2005.
Power
- Hardware support
- Power management policy
- Power manager
- Holistic approach
Power Manager
- Usually implemented in software (OS) for flexibility
- Hardware and software co-design
  - Software implements policy
  - Hardware implements power saving mechanisms
- Need standard interfaces to deal with hardware diversity
  - Different vendors
  - Different devices: processor, sensor, controller …
ACPI
Advanced Configuration and Power Interface
Open standard for power management services.
http://www.acpi.info/
[Figure: ACPI software stack. Applications; OS kernel with power management and device drivers; ACPI BIOS; hardware platform (devices, processor, chipset).]
ACPI System Power States
Used as a contract between hardware and OS vendors
ACPI Global Power States
- G3: mechanical off, no power consumption
- G2: soft off, restore requires full OS reboot
- G1: sleeping state
  - S1: low wake-up latency with no loss of context
  - S2: low latency with loss of CPU/cache state
  - S3: low latency with loss of all state except memory
  - S4: lowest-power state with all devices off
- G0: working state
Intel Core i7 C States
Intel Pentium M P states
Device Power States
- Device power state is invisible to the user.
  - Devices may be inactive when the system is in the working state.
- Each device may be controlled by a separate power management policy.
Power
- Hardware support
- Power management policy
- Power manager
- Holistic approach
Holistic View of Power Consumption
- Instruction execution (CPU)
- Cache (instruction, data)
- Main memory
- Other: non-volatile memory, display, network interface, I/O devices
Mote
- System view when switching from sleep to active
[Figure: wake-up timeline with annotations "1-10 ms" and "2.5 ms typical".]
Source: Joseph Polastre, Robert Szewczyk, Cory Sharp, David Culler. The Mote Revolution: Low Power Wireless Sensor Network Devices. In Hot Chips 16, 2004.
Sources of Energy Consumption
Relative energy per operation (Catthoor):
- memory transfer: 33
- external I/O: 10
- SRAM write: 9
- SRAM read: 4.4
- multiply: 3.6
- add: 1
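These relative costs make it easy to compare operation mixes. The sketch below uses the figures from the list above; the two operation counts are made-up examples.

```python
# Relative energy per operation, from the list above (add = 1).
RELATIVE_ENERGY = {
    "memory transfer": 33,
    "external I/O": 10,
    "SRAM write": 9,
    "SRAM read": 4.4,
    "multiply": 3.6,
    "add": 1,
}


def relative_energy(op_counts):
    """Total relative energy of a mix given as {operation name: count}."""
    return sum(RELATIVE_ENERGY[op] * count for op, count in op_counts.items())


# Hypothetical loop body: same arithmetic, operands in main memory vs. in SRAM.
from_memory = {"add": 100, "multiply": 10, "memory transfer": 5}
from_sram = {"add": 100, "multiply": 10, "SRAM read": 5}
print(relative_energy(from_memory), relative_energy(from_sram))
```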
Optimize Memory System
- Different instructions → different energy consumption
- Energy: register << cache (SRAM) << memory (DRAM)
- Optimizing memory system → significant energy saving
Cache Behavior
Sweet spot in cache size:
- Too small: wastes energy on memory accesses;
- Too large: the cache itself burns too much power.
Impacts of Cache Size
Optimizations
- Reduce memory footprint
  - Reduce code size
  - Analyze/test footprint to find right size: stack, heap…
- Find correct cache size
  - Analyze cache behavior (size of working set)
- Minimize memory and cache access
  - Use registers efficiently → less cache access
  - Identify and eliminate cache conflicts → less memory access
- Better performance → more idle time!
Reading
- Textbook 3.7
- Required: Sections I, II, III.A, III.B, IV of L. Benini, A. Bogliolo and G. De Micheli, A Survey of Design Techniques for System-Level Dynamic Power Management, IEEE Transactions on VLSI, pp. 299-316, June 2000.
- Interesting: Intel Inside…Your Smartphone
  http://spectrum.ieee.org/semiconductors/processors/intel-insideyour-smartphone