Galaxy Low Power Solution

Galaxy Low Power Solution

Design methodologies and techniques for production low power SOC designs Dr. Kaijian Shi Synopsys Professional Services 1 Content • Dynamic Voltage Frequency Scaling • Power-gating design • Production low-power SOC implementation • Power intent definitions through UPF • Production low-power design environment • Summary 2 Dynamic Voltage Frequency Scaling (DVFS) • Principles • Workload based DVFS • Adaptive VFS (AVFS) • Application and PVT based VFS • Production design considerations and recommendations 3 DVFS Principles CMOS Power, Energy and performance 2 P = Pdy + Pleak ~ C * v dd * f + vdd * Ileak E = ∫ P * dt f ~ (vdd-vt) / vdd 1.3 • Scale Vdd and f dynamically to just meet performance needs • Reduce f helps lowering power and thermal but not energy (battery life) for a task • Must reduce Vdd to save energy 4 Workload-based DVFS • Workload – number of clock cycles to complete a task • Deadline – latest time to complete the task • DVFS – Reduce V and F to run just fast enough to meet the task deadlines and maintain quality of operations Task-2 100% Run/Idle 1 2 3 4 5 1 Run task as slow as possible 0% Reduce voltage 2 to lower level 2 1 4 3 Run task in 100% 3 time available Energy Saved 4 Reduce voltage DVFS to match time 3 4 5 0% 5 DVFS System DVFS software DVFS controller Programmable Look-ahead task V-regulator DVFS SoC block Predict workload F = workload deadline Programmable V from F-V table Clock generator – OS-plug-in -> Pre-defined performance profiles Level shifters (policies) -> predict workload and deadline of next task -> F -> look up V from F-V table – Profiles are statistically generated and app- specific, e.g IEM for ARM cpu, – F-V table is SoC block and process dependent Other SoC blocks – DVFS controller is app-specific hardware – Level shifters are needed at block interface 6 Level Shifters VddL • H-L: a simple buffer - timing model differs from In (VddH) Out(VddL) normal buffers; characterized with high input swing and low output transition • L-H: diff-amp buffer VddH • low-swing inputs (VddL) • pull-up to VddH by diff-amp VddL Out(0-VddH) In(0-VddL) 7 IEM Testchip – SoC Implementation 8 DVFS Power and Energy saving 300MHz, 1.21V 100% 100% 225MHz, 1.03V 56% 75% 150MHz, 0.81V 23% 46% 75MHz, 0.69V 9% 36% 0% 20% 40% 60% 80% 100% 0% 20% 40% 60% 80% 100% Power Energy • 4 levels of DVFS 9 DVFS design considerations • V-F scaling sequence – Up-scaling: scale V -> settles -> switch F – Down-scaling: scale F -> locked -> scale V • Manage large clock skew due to V scaling – fully asynchronous - synchronizer or FIFO – Clock pre-compensation – shift launch clock earlier (max- skew) and add buffers to fix hold in short paths • Above all, reliable app-specific DVFS software is essential – a real challenge! – Sufficient app-runs to generate quality performance profile – Conservative workload prediction 10 Clock latency variation with V-T (130um) Clock Tree Latency Analysis 1.5 1.4 Latency[FF] • Variation 0.733 0.96 1.333 1.3 Latency[TT] accelerates 1.2 1.064 Latency[SS] below 0.9V ) 1.1 1.207 1.741 V ( e Delta = 1ns (50%) g a 1 t l 1.079 1.483 2.187 o V 0.9 1.568 • Much worse in 0.8 1.962 2.963 2.227 weak corner - 1.66 2.36 3.626 0.7 50% (1V-0.8V) 0.6 0.5 0 5 0 5 0 5 0 5 0 . 0 0 1 1 2 2 3 3 4 Latency (ns) 11 Adaptive voltage frequency scaling (AVS) • Based on on-chip performance monitor • Close-loop system for process and temperature compensation AVS controller Programmable DVFS SoC block V-regulator monitor Programmable Clock generator shiftersLevel Level shifters 12 AVS design considerations • On-chip performance monitors – Ring oscillator sensing PVT variation – Power & area overhead mitigation – Multiple monitors placed in chip centre and four corners • AVS controller – Reliable algorithms for loop stability in discrete V-F scaling – Short enough loop response time to prevent VF oscillation • Mixed analog-digital physical implementation – Analog/digital isolations (guide rings etc.) – A/D block interface signalling – A/D power grids separations 13 Application based VFS • Pseudo-dynamic V-F scaling based on application needs (e.g. voice vs. video) • Switching before an app-execution • Simpler design -> lower risk impact on yield, TTM – A few V-F levels and hence manageable PVT variations – Full corner timing closure is feasible for high yield – No sensors and complex scaling control – Not depend on workload profiling • Less efficient than DVFS and AVS • Not for PVT compensation 14 An example: Intel Turbo Boost • Clock scaling constrained by T and IR- drop to boost performance • For a high-performance application, increase clock frequency by 133MHz a step, until reaching defined T or IR-drop limit • Only scale clock and hence not for energy saving 15 IBM EngerScale for Power core • Static Power Saver Mode – User control (on/off) for predictable workload change – Lower V-F with safe margin • Dynamic Power Saver Mode – DFS based on core utilization and policies configured by user – Favor Power: default at low F, increase under heavy utilization – Favor Performance: default at max F, decrease when lightly utilized or idle 16 Production design considerations and recommendation – V-F scaling • V-F scaling – Min V = 1.8~2 * Vt • Too big variation to manage when noise margin < Vt • T-inversion at low Vdd make the case even worse – Max 4 scaling levels (design corners and closure concerns) – Up to 2 Vt cells in a VFS domain to maximize VF scaling range and PVT variation tolerance • V-regulator – Avoid on-chip regulator (linear regulator does not reduce chip P & E; bulk regulator is too noisy) – Programmable off-chip regulator is often the choice – Watch regulator’s settling time (10’s-100’s us) – custom design regulator to reduce settling time as needed 17 Production design considerations and recommendations - VFS • Clock generator – Can be embedded in the SoC and combined with SoC clk-generator for efficiency – Watch clock locking time – good PLL to meet req • Block interface timing challenge due to large clock skew in large V variation in scaling – Async interface (sync-cell or FIFO), if applicable – Clock pre-compensation – shift launch clock earlier by max-skew-variation and add buffers to fix hold violations in short paths 18 DVFS vs. AVS vs. App VFS vs. PVT DVFS • Things to consider: – Actual P&E saving considering added P/E overhead (P&E on DVFS logic and scaling operations) – Area penalty – Design closure and verification – QoR and TTM – IO standards do not scale! – Reliability and cost • A good choice for production design: Chip-level, app-based VFS combined with PVT DVFS, fixed V on standard IOs 19 Content • Dynamic Voltage Frequency Scaling • Power-gating design • Production low-power SOC implementation • Power intent definitions through UPF • Production low-power design environment • Summary 20 Power-gating design • Principle • System and components • Retention strategies and techniques • Production design considerations and recommendations 21 Power gating principle 25250 sleep Leakage Power Vdd 0 Virtual Vdd 20200 Dynamic Power 0 15150 0 10100 0 Power (Watts) Power 5050 L-VTH logic cells 0 0 Virtual Vss 25025 nm 180 nm 13013 nm 90 nm 65 nm sleep Vss 0 Technology0 (nm) • Leakage trend 22 Power gating system components Shut-down Shut-down domain 1 domain 2 • Shut-down and ao domains Retention Memory ISO • PM unit RR ISO • ao-buffers • Switch cells ao Always- • Isolation cells on domain • Retention rams PM • Retention flops 23 Power Management (PM) Unit • Control sleep/wakeup sequence – Clocks – architecturally suppress during sleep – Isolation – control clamping of outputs pre-sleep – Retention – control save and restore states – Resets – put the block in “quiet” state pre-/after-sleep – Power-Down – when to shut-down and wakeup a block CLOCK N_ISOLATE SAVE N_RESET N_PWRON RESTORE 24 Always-on repeaters • Distribute signals through shut-down domains • Dual rail buffer/inverter with AO-power pin VDD rail is not used; VDDC connects to AO-power No placement restriction VDD VDDC VSS • Normal buf/inv – placed them in dedicate regions with separated ao-rail 25 Switch cells TSMC 65g pMOS : Ion/Ioff (Vdd=1, Vbb=1v) Ion/Ioff • Custom-designed HVt 25000 pMOS for header switch 20000 and nMOS for footer switch 15000 Wgate 10000 • Optimized (L,W,BBS) for 4.40E-06 5000 max efficiency (Ion/Ioff) 2.40E-06 0 8 4.00E-07 8 - 7 • Integrated repeaters - - 7 - Lgate 6.50E • Single/dual switch cell 9.50E 1.25E 1.55E Pon-ack VDD Pon-ack1 VDD Pon-ack2 Pon VDDC Pon1 VDDC Pon2 26 Signal isolations VDD • isolation methods ISO ISO – Retain “1” isolation circuit ISO_out IN – Retain “0” isolation circuit IN – Retain current state circuit ISO_out Retention flop • Output isolations Gnd – Pros – simple control – Cons – ao-cell and ao-power VDDC • Input isolations – Pros - Normal std_cells and ISO_out Less power (floating input) IN – Cons - Complex control on ISO inputs that connect to ISO_out power-down outputs ISO IN Gnd 27 Power-gating design • Principle • System and components • Retention strategies and techniques • Production design considerations 28 State retention in shut-down period • Needed to fast resume operations after wakeup based on the states at shut-down • Retention through live memories • Retention registers • Retention rams • Production design considerations and recommendations 29 Retention through live memory 1. Write/read states to/from live memory, before sleep and after wakeup – Proc: simple (software) – Cons: long retention/restore latency 2. Scan in/out states to/from the memory – Proc: shorter latency – Cons: retention-based flop stitching may conflict optimal DFT scan stitching • Much longer latency than retention registers 30 Balloon style retention register • Add an always-on high-Vt balloon latch for state retention.

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    127 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us