Pentium M Processor Micro-Architecture: The Performance
The Intel® Pentium® M Processor Power-Awareness Story: From Theory to Practice
Ronny Ronen
Senior Principal Engineer, Director of Architecture Research
Intel Labs - Haifa, Intel Corporation
Technion EE, Haifa, June 2, 2003

Based on...
The Intel® Pentium® M Processor: Microarchitecture and Performance
By Simcha Gochman, Ronny Ronen, Ittai Anati, Ariel Berkovits, Tsvika Kurts, Alon Naveh, Ali Saeed, Zeev Sperber, Robert C. Valentine
Intel Technology Journal Q2/2003
http://developer.intel.com/technology/itj/

All dates, plans, and features are preliminary and subject to change without notice

IDC – Israel Development Center
Located on Israel's Mediterranean coast, Haifa is the home of Intel's Israel Development Center (IDC). IDC was established in 1974 and is Intel's first development center outside the US. The center is a multidisciplinary team with more than 1000 employees. Many of Intel's leading products were developed and originated at IDC. IDC's employees are currently working on Intel's future microprocessors, CAD tools, advanced networking components and software technologies.
[Photo: The Baha'i Shrine, Haifa's best-known attraction]

The Intel® Centrino™ Mobile Technology
[Diagram label: Intel® Pentium® M Processor]
Announcing Intel® Centrino™ mobile technology. Intel has expanded its history of innovation with new notebook capabilities designed specifically for the mobile world. Now you can work, play and connect without wires. And choose from a whole new generation of thin, light notebooks designed to enable extended battery life.
This new innovative technology enables:
• Integrated wireless LAN capability
• Breakthrough mobile performance
• Extended battery life
• Thinner, lighter designs
[Diagram labels: Intel® 855 Chipset Family; Intel® Pro/Wireless 2100 Network Connection; ICH4-M]

Agenda
• The Theory
  – Power, Energy
  – Power Awareness
• The Practice
  – The Intel Pentium M processor micro-architecture
• The Performance

Power and the digital world...
• Power is consumed:
  – When capacitance is charged and discharged.
  – A charged cap is a logical '1', a discharged cap is '0'.
  – Energy per charge: $E = \tfrac{1}{2} C V^2$
  [Figure: logic gate with IN and OUT bit streams]
• The capacitance can be the gates of other transistors or wires (busses and long interconnects).

Power and the digital world (2)...
• Secondary effects like leakage and short-circuit current are increasing with advanced process technologies.
  [Figure: two gates with IN and OUT signals, one illustrating leakage (sub-threshold) current and one illustrating short-circuit current]
• Depends heavily on operating voltage and temperature
• Leakage is growing dramatically
  – Reaching 20% in current process technology, and growing...

Power & Energy
• Total energy
  – Total of all switch energy and leakage waste
  – Measured in either joules or watt-hours
• "Energy per task"
  Lower energy per task means
  – Longer battery life.
  – Lower electric bills
• Power = energy / time = $\alpha C V^2 f$ (+ leakage power)
  ($\alpha$: activity, $C$: capacitance, $V$: voltage, $f$: frequency)
  – Measured in watts

Power & Energy
• Average power
  – Total energy / total time
  – Including low-activity and idle time
• Peak power
  – Higher power ⇒ higher current.
  – Higher power ⇒ higher temperature.
  – Cannot exceed the thermal constraints.
• Typical figures (leading edge processors)
  – Average power: 1W-3W
  – Peak power: 20W-100W

Power Density
• Think of watts/cm².
• Denser power is harder to cool.
• Complex algorithms lead to denser power:
  – Dense random logic.
  – Timing pressure leads to faster/bigger/power-hungrier gates.
• Increased every process technology generation (higher power @ smaller die size).

Power Density and Thermal
[Figure: Pentium M processor power density example. Left panel: Power Density (Simulated) (1); color codes from lowest to highest: black, red, orange, yellow, white. Right panel: Thermal Map (EDO System) (2, 3); color codes from lowest to highest: blue, green, yellow, orange, purple, white.]
1 Source: Intel® Pentium® M Processor Power Estimation, Budgeting, Optimization, and Validation, Dani Genossar, Nachum Shamir, ITJ Q2/2003
2 Source: Dani Genossar, Nachum Shamir, Intel 2003
3 The left portion of the L2 thermal map is blank due to measurement limitations.
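
The switching-energy and dynamic-power relations from the Power & Energy slides can be sketched numerically. This is a minimal illustration, not part of the original talk: the function names and the sample values (1 nF of switched capacitance, 1 GHz, 20% activity) are assumptions chosen only to make the arithmetic easy to follow.

```python
def switch_energy(capacitance_f, voltage_v):
    """Energy of one charge event on a capacitor: E = 1/2 * C * V^2 (joules)."""
    return 0.5 * capacitance_f * voltage_v ** 2


def dynamic_power(activity, capacitance_f, voltage_v, frequency_hz, leakage_w=0.0):
    """Power = alpha * C * V^2 * f, plus an optional leakage term (watts)."""
    return activity * capacitance_f * voltage_v ** 2 * frequency_hz + leakage_w


def power_density(power_w, die_area_cm2):
    """Power density in watts per square centimeter."""
    return power_w / die_area_cm2


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    alpha, cap, freq = 0.2, 1e-9, 1e9
    for volts in (1.0, 0.5):
        watts = dynamic_power(alpha, cap, volts, freq)
        print(f"V = {volts:.1f} V -> dynamic power = {watts:.3f} W")
```

With everything else fixed, halving the voltage cuts the dynamic term to a quarter, which is the quadratic dependence on V in the formula. Frequency is held constant here; the later slides add the observation that achievable frequency also scales with voltage.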
Voltage, Power, Frequency
• Transistor switches faster at higher voltage ⇒ higher voltage enables higher frequency
• Maximum frequency grows about linearly with voltage, within a given voltage range Vmin-Vmax.
  – V < Vmin ⇒ transistors won't switch.
  – V > Vmax ⇒ the device may burn.
• "The cube law": $P \approx k V^3$ (about a 1% change in V gives a 3% change in P)
• Implications
  – Can save energy/power when performance is not a factor
[Figure: XScale processor frequency (MHz) and power (mW) vs. voltage, for voltages from 0.5 V to 1.9 V. Source: Intel Corp. (http://developer.intel.com)]

Bean Counting*
                                                    High Voltage   Low Voltage
 1. Number of transistors                           80 M           80 M
 2. Die area                                        80 mm²         80 mm²
 3. Operation voltage                               1.48 V         0.96 V
 4. Operation frequency                             1.6 GHz        0.6 GHz
 5. Energy per switch per transistor                0.85 fJ        0.36 fJ
 6. Power per transistor (#4 x #5)                  1.4 µW         0.21 µW
 7. Activity factor                                 20%            10%
 8. Energy per cycle per chip (#1 x #5 x #7)        14 nJ          2.9 nJ
 9. Power (#4 x #8)                                 22 W           1.7 W
10. Power density (#9 / #2)                         27 W/cm²       2.1 W/cm²
* These numbers are representative only and are not intended to reflect any existing device.

Mobile Platform Goals & Challenges
• Goal: Higher performance
  – Challenge: How much power can one afford to spend in order to implement a performance feature?
• Goal: Longer battery life
  – Challenge: How to balance the design for maximum performance and extended battery life?

Higher Performance vs. Longer Battery Life
• Processor average power is <10% of platform
• LCD and other components consume much more
  ⇒ Even an ideal processor can extend battery life by 11% at most!
  ⇒ Decision:
  – Optimize for performance when active
  – Optimize for battery life when idle
• Caveat
  – This observation is Pentium M specific. May not hold as such in the future!
[Figure: pie chart of platform average power by component. Display (panel + inverter) 33%; CPU 10%; Power Supply 10%; Intel® MCH 9%; HDD 8%; GFX 8%; Misc. 8%; CLK 5%; Intel® ICH 3%; DVD 2%; LAN 2%; Fan 2%.]
Source: 2004 Extended Battery Life Technologies, Don J Nguyen, Intel Developer Forum, Spring 2003

Optimize for Performance
"Maximize performance at given thermal constraints"
⇒ Approximated by: maximizing performance at a given power budget
• The test: "A micro-architectural feature that gains performance or saves power should be better than simply using voltage/frequency scaling"
• That is:
  $f \approx K V$
  $\text{Power} = \alpha C V^2 f \propto \alpha C f^3$
  $\Delta\text{Power} / \text{Power} = ((f + \Delta f)^3 - f^3) / f^3 \approx 3 \Delta f / f$
  $\text{Perf} = \text{IPC} \times f$
⇒ The right Performance/Power tradeoff: 1% more performance in less than 3% power – a gain!

Optimize for Battery Life
"Minimize Energy per Task"
• Should address both active and idle energy
• The active energy tradeoff:
  $\text{Energy}_{\text{active}} = \text{Power}_{\text{active}} \times \text{Time}_{\text{active}}$
  or
  $\text{Energy}_{\text{active}} \cong \text{Power}_{\text{active}} / \text{Perf}_{\text{active}}$
⇒ The right Performance/Power tradeoff: 1% more performance in less than 1% power – a gain!
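
The two breakeven rules above (roughly 3% power per 1% performance when thermally constrained, 1% power per 1% performance for energy per task) and the "11% at most" battery-life bound can be checked with a short sketch. The function names are hypothetical and the model is deliberately simplified: it assumes power scales with the cube of frequency, performance scales linearly with frequency, and energy per task is power divided by performance, exactly as stated on the slides.

```python
def scaling_power_ratio(perf_ratio):
    """Power ratio needed to reach perf_ratio by voltage/frequency scaling alone.

    Assumes Power ~ f^3 and Perf ~ f (the cube-law approximation).
    """
    return perf_ratio ** 3


def gains_under_power_budget(perf_gain, power_gain):
    """True if the feature beats plain voltage/frequency scaling.

    perf_gain and power_gain are fractions, e.g. 0.01 for +1%.
    """
    return (1 + power_gain) < scaling_power_ratio(1 + perf_gain)


def gains_energy_per_task(perf_gain, power_gain):
    """True if active energy per task (Power / Perf) goes down."""
    return (1 + power_gain) / (1 + perf_gain) < 1


def max_battery_life_extension(cpu_share):
    """Upper bound on battery-life gain if the CPU drew no power at all.

    cpu_share is the CPU's fraction of average platform power.
    """
    return 1 / (1 - cpu_share) - 1


if __name__ == "__main__":
    # A hypothetical feature: +1% performance for +2% power.
    perf, power = 0.01, 0.02
    print("beats V/f scaling:", gains_under_power_budget(perf, power))
    print("lowers energy per task:", gains_energy_per_task(perf, power))
    print(f"ideal-CPU battery gain at 10% share: {max_battery_life_extension(0.10):.1%}")
```

A feature that costs 2% power for 1% performance passes the thermally constrained test but fails the energy-per-task test, which is why the two goals lead to different breakeven lines.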
Putting it all together: The Pentium M processor Approach
[Figure: chart of power gain/loss (%) against gain/loss percentages from about -30% to +30%, with an energy breakeven line and a constrained-performance breakeven line. Labeled zones: Energy Gain, Energy Loss, Constrained Perf Gain, Constrained Perf Loss, and a "wrong trade-off zone".]