Analysis of the Enhanced Intel® SpeedStep® Technology of the Pentium® M Processor

Efi Rotem, Avi Mendelson, Alon Naveh, Micha Moffie

June 2004

MOBILE TECHNOLOGY

Overview

- Mobile computers challenges
  - Energy and Average power
  - Thermal Design Power
- Intel® Pentium® M power management
- The experiments
- Test results
- Conclusions

TACS June 2004

Power & Density increases

- Think Watts/cm²
  - Complex architecture at faster and smaller technology
  - Denser power is harder to cool

[Figure: power density (Watts/cm², scale from 1 to 1000) versus process technology (1.5µ down to 0.07µ) for i386, i486, Pentium®, Pentium® Pro, Pentium® II, Pentium® III, Pentium® 4 and Centrino®, with hot plate, nuclear reactor and rocket nozzle as reference levels.]

* "New Microarchitecture Challenges in the Coming Generations of CMOS Process Technologies", Fred Pollack, Corp., Micro32 conference keynote, 1999.

The Mobile Environment

- Maximize performance & features within given constraints
  - Power / Thermal
  - Size (form factor)
  - Noise
  - Energy / Battery life
  - etc.
- Mobile platforms offer tradeoff preferences
  - User defined or built-in scheme
    - Compromise performance for longer battery life, lower acoustic noise, cool box etc.


Die Temperature measurements

- Temperature based control mechanisms
- A diode connected to an external A/D reports temperature
- Fixed temperature sensors
  - Max spec junction temperature
  - Critical temperature detector

Symbols:
  n: diode ideality factor
  k: Boltzmann constant
  q: electron charge constant
  T: diode temperature (Kelvin)
  VBE: voltage from base to emitter
  Ic: collector current
  Is: saturation current
  N: ratio of collector currents

[Figure: source current, measured Vbe and A/D for the thermal diode; fixed sensors with references producing the High# and Critical# signals.]
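The formula on this slide did not survive extraction. The symbol list matches the standard two-current diode thermometry relation, ΔVBE = (n·k·T/q)·ln(N), so the sketch below assumes that relation. The function names and the sample values used to exercise it are hypothetical, not taken from the slides.

```python
import math

K_BOLTZMANN = 1.380649e-23     # Boltzmann constant k, J/K
Q_ELECTRON = 1.602176634e-19   # electron charge q, C


def delta_vbe_volts(temperature_k, current_ratio, ideality=1.0):
    """Difference in VBE between the two source currents (assumed relation)."""
    return ideality * K_BOLTZMANN * temperature_k / Q_ELECTRON * math.log(current_ratio)


def diode_temperature_kelvin(delta_vbe, current_ratio, ideality=1.0):
    """Solve delta_vbe = (n*k*T/q) * ln(N) for the diode temperature T."""
    if current_ratio <= 1.0:
        raise ValueError("current_ratio (N) must be greater than 1")
    return delta_vbe * Q_ELECTRON / (ideality * K_BOLTZMANN * math.log(current_ratio))
```

Because the saturation current Is cancels when two currents are applied, only n, N and the measured voltage difference are needed to recover T.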

Controlling TDP - Linear

- P states and T states

[Figure: timing diagram with traces Trigger, Temp., P>TDP, Power, Clock.]

Controlling TDP - Linear

- Self or S/W trigger: stop clock asserted, power and temperature drop

[Figure: timing diagram with traces Trigger, Temp., P>TDP, Power, Clock, STOPCLK.]


Controlling TDP - Linear

- STOPCLOCK continues toggling for a predefined time
- Average power and temperature drop; performance drops
- Linear dependency (less than 1:1) of power to performance
- Energy = P*t → no gain
- Leakage impacts energy

[Figure: timing diagram with traces Trigger, Temp. (average temperature over the thermal significant time), P>TDP, Power (full power, leakage, reduced average power), Clock, STOPCLK.]
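The "no gain" argument for linear throttling can be sketched numerically. This is a simplified model, not the measured behavior: it assumes dynamic power is drawn only while the clock runs, leakage is drawn all the time, and performance scales with the clock duty cycle. All names and wattages are hypothetical.

```python
def throttled_power(duty, dynamic_w, leakage_w):
    """Average power when the clock runs for a fraction `duty` of the time."""
    return leakage_w + duty * dynamic_w


def energy_for_fixed_work(duty, dynamic_w, leakage_w, full_speed_seconds):
    """Energy (P*t) to finish a job that takes full_speed_seconds unthrottled."""
    runtime = full_speed_seconds / duty  # assumption: performance scales with duty
    return throttled_power(duty, dynamic_w, leakage_w) * runtime


# Hypothetical part: 20 W dynamic, 4 W leakage, a 10 s job.
full = energy_for_fixed_work(1.0, 20.0, 4.0, 10.0)
half = energy_for_fixed_work(0.5, 20.0, 4.0, 10.0)
print(f"full speed: {full} J, 50% duty: {half} J")
```

In this model, halving the duty cycle cuts average power by less than half because leakage remains, and the job takes twice as long, so the energy for the same work does not fall.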

Dynamic Voltage Scaling (DVS)

- One µarch implementation: power and energy control

[Figure: timing diagram with traces Switch (thermal or utilization), Clock, Vcc, Power.]

Dynamic Voltage Scaling (DVS)

- PLL relock at lower frequency at same Vcc
- Fast change: no user experience impact
  - Short time O.K. with S/W

[Figure: timing diagram with traces Clock, Vcc, Power.]

Dynamic Voltage Scaling (DVS)

- Vcc drops gradually while CPU active
- Power savings change from linear to F³
- Performance drops as a function of F (frequency)
- CPU power drops as P = C*V²*F ≅ function of F³
- Power savings exceed the time increase: net energy savings

[Figure: timing diagram with traces Clock, Vcc, Power.]
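A minimal sketch of the F³ argument, assuming voltage is scaled in direct proportion to frequency and ignoring leakage and the minimum-Vcc floor. The function names and the 20 W, 10 s example are hypothetical.

```python
def dvs_power(freq_ratio, full_power_w):
    """P = C*V^2*F with V scaled in proportion to F, so P scales as F^3."""
    voltage_ratio = freq_ratio  # assumption: V tracks F linearly
    return full_power_w * voltage_ratio ** 2 * freq_ratio


def dvs_energy_for_fixed_work(freq_ratio, full_power_w, full_speed_seconds):
    """Energy to finish a job that takes full_speed_seconds at full frequency."""
    runtime = full_speed_seconds / freq_ratio  # performance is linear in F
    return dvs_power(freq_ratio, full_power_w) * runtime


# Hypothetical 20 W part running a 10 s job at full and at half frequency.
print(dvs_energy_for_fixed_work(1.0, 20.0, 10.0))
print(dvs_energy_for_fixed_work(0.5, 20.0, 10.0))
```

Under these assumptions power falls with the cube of frequency while run time grows only linearly, so energy for a fixed amount of work falls with the square of frequency.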

Dynamic Voltage Scaling (DVS)

- Vcc is ramped up, increasing power
- Once stable, PLL relock at high frequency

[Figure: timing diagram with traces Switch back, Clock, Vcc, Power.]

Adaptive Energy Control

- Applications have wide dynamic power range
- Require high power, high performance bursts
  - Determine user experience
- Trade power performance as needed
  - Driven by Operating system ACPI
  - Average power control on the fly - ADAPTIVE

[Figure: traces switching between High and Low levels.]

ACPI Control

- Operating system feature, industry standard
  - Supported by MS and Linux
  - Passive and active policies defined for each zone
  - User preference: Max performance or Max battery
    - Switches _PSV and _ACx
  - DVS or clock throttling used for passive power control
    - Available power states published by BIOS
    - DVS, and once min Vcc is reached, linear
    - Implements PD controller algorithm
  - Fan on/off and speed used for active cooling

[Figure: thermal zone temperature scale (25 to 95) with trip points _CRT, _AC0, _AC1 and _PSV and a reading of _TMP = 60; plot of Frequency (500 to 1900) and Temperature (40 to 100) over time from 8:56:59 to 9:00:35.]
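The PD controller mentioned above can be sketched as follows. The update follows the general form of the passive cooling equation in the ACPI specification, where a derivative term (weighted by _TC1) reacts to the temperature change since the last sample and a proportional term (weighted by _TC2) reacts to the distance from the _PSV trip point. The function name, coefficient values and temperatures are hypothetical, and a real implementation maps the resulting percentage onto the power states published by the BIOS.

```python
def passive_cooling_step(perf_pct, temp_now, temp_prev, psv_trip, tc1, tc2):
    """One sampling period of a PD-style passive cooling controller.

    Returns the new performance level in percent, clamped to 0..100.
    """
    delta_pct = tc1 * (temp_now - temp_prev) + tc2 * (temp_now - psv_trip)
    new_perf = perf_pct - delta_pct  # hotter or rising temperature lowers performance
    return max(0.0, min(100.0, new_perf))


# Hypothetical run: trip point 60, temperature rising past it, then falling back.
perf = 100.0
previous = 59.0
for reading in (61.0, 62.0, 60.0, 58.0):
    perf = passive_cooling_step(perf, reading, previous, psv_trip=60.0, tc1=2.0, tc2=3.0)
    previous = reading
    print(reading, perf)
```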

Test Setup

Thermal control
- CPU temperature is a function of power and ambient temp.
  - Long time to heat the ambient and heat sink
- Case temperature control formed stable environment
  - Heating plate and fan to keep the case at a fixed temperature
  - Force extreme conditions

Test cases
- SPEC-Int and SPEC-FP components
- Used the self trigger mechanism
  - For repeatable & consistent results
- Collected power, temperature and performance

Testing on real silicon the theoretical expected behavior

[Figure: test rig with fan, controller, heating plate at the case (Tc), die (Tj) and mobile main board.]


Linear vs. DVS control

[Figure: "Thermal Throttle impact": SPEC2K Performance [% of max] (94% to 100%) versus Cooling [% of max] (55% to 80%) for Linear and DVS control. Annotations: "Start Throttle"; "Higher temp. Less cooling"; "Shallower slope: more cooling at same performance"; markers of ~3%, ~4%, ~5% and ~6%.]

DVS set point

[Figure: "Throttle impact on performance": Performance [%] (50% to 100%) versus Cooling [% of max] (50 to 100) for set points of 600 MHz, 800 MHz, 1000 MHz and 1200 MHz, with regions labeled "100% throttle" and "Overheat".]

Optimal point (performance) is at the highest frequency that does not exceed Tj_max, with no transitions up and down
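The set-point rule can be illustrated with a steady-state thermal model. This is a sketch under stated assumptions, not the method used for the measurements: junction temperature is modeled as ambient plus thermal resistance times power, and the per-frequency power figures, ambient temperature, thermal resistance and Tj_max below are all hypothetical.

```python
def pick_set_point(power_by_freq_mhz, ambient_c, theta_c_per_w, tj_max_c):
    """Return the highest frequency whose steady-state Tj stays at or below
    tj_max_c, or None if even the lowest frequency overheats."""
    sustainable = [
        freq
        for freq, power_w in power_by_freq_mhz.items()
        if ambient_c + theta_c_per_w * power_w <= tj_max_c
    ]
    return max(sustainable) if sustainable else None


# Hypothetical operating points (MHz -> W); not measured values.
operating_points = {600: 6.0, 800: 9.0, 1000: 12.0, 1200: 16.0, 1600: 24.5}
print(pick_set_point(operating_points, ambient_c=50.0, theta_c_per_w=3.0, tj_max_c=100.0))
print(pick_set_point(operating_points, ambient_c=50.0, theta_c_per_w=4.0, tj_max_c=100.0))
```

With weaker cooling (a larger thermal resistance) the sustainable set point moves to a lower frequency, which avoids bouncing between a too-hot state and a throttled state.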

Average Power management

Mode                      Frequency        Mobile Mark 02 [Score]   Average CPU Power [W]
Power saving (Battery)    600 MHz          112                      0.37
ADAPTIVE                  600 - 1600 MHz   195                      0.63
Performance (AC)          1600 MHz         216                      1.1

- Static lowest frequency (600 MHz) versus 1600 MHz: 48% performance, 66% power, efficiency 1:1.4
- Adaptive versus 1600 MHz: 10% performance, 43% power, efficiency 1:4
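The percentages on this slide can be reproduced from the chart values (scores 112, 195 and 216; average power 0.37 W, 0.63 W and 1.1 W). The sketch reads "efficiency" as the ratio of power saved to performance lost, which is an interpretation of the slide. The function name is hypothetical.

```python
def tradeoff(score, power_w, ref_score, ref_power_w):
    """Return (performance loss, power saving, saving per unit of loss)
    relative to a reference operating point."""
    perf_loss = 1.0 - score / ref_score
    power_saving = 1.0 - power_w / ref_power_w
    return perf_loss, power_saving, power_saving / perf_loss


# Reference point: 1600 MHz, score 216 at 1.1 W.
for label, score, power_w in (("static 600 MHz", 112, 0.37), ("adaptive", 195, 0.63)):
    loss, saving, ratio = tradeoff(score, power_w, 216, 1.1)
    print(f"{label}: {loss:.0%} performance, {saving:.0%} power, 1:{ratio:.1f}")
```

The adaptive ratio comes out slightly above 4 when computed from unrounded values; the slide's 1:4 is consistent with rounding.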

Energy efficiency

- Energy consumed for the SPEC-Int and SPEC-FP
  - We measured energy as E = ∫ P(t) dt
  - Energy consumed at 1600 MHz defined as 100%

[Figure: "Energy efficiency": Relative energy efficiency (0% to 100%) versus Frequency [MHz] at 1600, 1200, 1000, 800 and 600, annotated "Lowest energy at lowest frequency".]
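With sampled power data, the integral E = ∫ P(t) dt can be approximated numerically. The sketch below uses the trapezoidal rule, which is an assumption, since the slides do not say how the integration was performed. The sample traces are made up for illustration.

```python
def energy_joules(times_s, power_w):
    """Trapezoidal approximation of E = integral of P(t) dt over sampled data."""
    if len(times_s) != len(power_w):
        raise ValueError("times_s and power_w must have the same length")
    total = 0.0
    for i in range(1, len(times_s)):
        dt = times_s[i] - times_s[i - 1]
        total += 0.5 * (power_w[i] + power_w[i - 1]) * dt
    return total


def relative_energy(energy, reference_energy):
    """Energy as a fraction of the reference run (1600 MHz defined as 100%)."""
    return energy / reference_energy


# Made-up traces: a short high-power run and a longer low-power run.
fast = energy_joules([0.0, 1.0, 2.0, 3.0], [10.0, 12.0, 12.0, 10.0])
slow = energy_joules([0.0, 2.0, 4.0, 6.0, 8.0], [2.0, 2.5, 2.5, 2.5, 2.0])
print(f"{relative_energy(slow, fast):.0%} of the reference energy")
```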


Summary and Conclusions

- Intel Pentium M: specifically designed for mobile
  - Targeting energy and power efficiency
- Mobile systems trade performance for power
  - To meet user preference
- Enhanced Intel SpeedStep Technology provides a significantly improved power-to-performance and energy control scheme
  - Silicon measurements confirm the theoretical work
  - Optimal point for performance to power is at the highest frequency that does not exceed Tj_max
    - Policy implemented into ACPI algorithm
  - Optimal point for energy and battery life is at the minimum frequency possible with DVS

Thanks

The authors would like to thank Nachum Shamir, Irina Ilatov and Riad Durr for the thermal clamping measurements; Jessie Garcia, Ali Saeed and Marco Wirasinghe for the evaluation and data collection for this article; and Cohen Aviad and Lev Finkelstein for results verification using simulation.

Die Temperature measurements

- A diode connected to an external A/D reports temperature
- Fixed temperature
- Applying 2 currents

Symbols:
  n: diode ideality factor
  k: Boltzmann constant
  q: electron charge constant
  T: diode temperature (Kelvin)
  VBE: voltage from base to emitter
  Ic: collector current
  Is: saturation current
  N: ratio of collector currents

[Figure: source current, measured Vbe and A/D for the thermal diode; fixed sensors with references producing the High# and Critical# signals.]



The Pentium M Power Control Schemes

Linear power scaling
- Change CPU frequency only
  - Reduces both power and performance linearly with frequency
  - Saves power, no energy savings

New (DVS)
- Reduces voltage and frequency on the fly
  - Frequency depends approximately linearly on voltage
  - Power is a function of C*F*V²
- Reduction of both voltage and frequency provides F³ power reduction for linear performance reduction
  - Also results in energy savings

ACPI Interface

The Pentium M Power Control Schemes

Linear power scaling
- Change CPU frequency only
  - Reduces both power and performance linearly with frequency

New Dynamic Voltage Scaling (DVS)
- Reduces voltage and frequency on the fly
- Reduction of both voltage and frequency provides F³ power reduction for linear performance reduction
- See details…
