ICCAD Tutorial 2006

Power and Thermal Challenges for 65 nm and Below

Organizer: Kaustav Banerjee, University of California-Santa Barbara

Speakers:
Vivek De, Circuit Research Labs, Intel
Paul Coteus, IBM T.J. Watson Research Center
Kaustav Banerjee, University of California-Santa Barbara

Part-I

Power Dissipation in Nanometer CMOS: Trends, Challenges and Solutions

Vivek De, Circuits Research Lab, Intel Corporation

1 ICCAD’06 tutorial

Outline

• Power challenges
• Energy-efficient technology & design
• Leakage power control

2 Vivek De ICCAD’06 tutorial

Moore’s Law for multi-core processors

[Figure: transistor count (log scale, 1.E+05 to 1.E+10) vs. year (1980 to 2010) for the 286, 386, 486, Pentium®, Pentium® II, Pentium® III, Pentium® 4, Itanium®, Itanium® 2 and dual-core processors.]

Power is the limiter…

[Figure: processor power in Watts (log scale, 0.1 to 1000) vs. year (1970 to 2020) for the 8080, 8086, 386, Pentium® and Pentium® 4 processors; the projected trend is annotated "1000's of Watts?"]

Power challenges are neither new nor fundamental

“Will it be possible to remove the heat generated by 10’s of thousands of components?”
G. Moore, Cramming more components onto integrated circuits, Electronics, Volume 38, Number 8, April 19, 1965

Tulsa – Xeon®

[Figure: die floorplan with FSB top and FSB bottom, Core 0 and Core 1 each with 1MB L2, shared 16MB L3 with tags, control logic and interface.]

• Process technology: 65 nm, 8 Cu interconnect layers
• Transistor count: 1.328 Billion
• Die area: 435 mm²

Montecito – Itanium® 2 Processor

[Figure: die floorplan with Core 0 and Core 1, each with 1MB L2I and 256kB L2D, 12MB L3 per core and bus interface.]

• Process technology: 90nm, 7 Cu interconnect layers
• Transistor count: 1.72 Billion
• Die area: 596 mm²

Yonah – Mobile/Desktop/Blade Server

[Figure: die floorplan with Core 0, Core 1, bus interface and shared 2MB L2 cache.]

• Process technology: 65 nm, 8 Cu interconnect layers
• Transistor count: 151 Million
• Die area: 90.3 mm²

Tulsa power breakdown

[Figure: two pie charts.]
Total power breakdown: cores 74%, L3 cache 12%, I/O and control 3% and 11%
Leakage breakdown: cores 67%, L3 cache 22%, I/O and control 2% and 9%

Technology outlook

High Volume Manufacturing    2004   2006   2008   2010   2012   2014   2016   2018
Technology Node (nm)           90     65     45     32     22     16     11      8
Integration Capacity (BT)       2      4      8     16     32     64    128    256
Delay = CV/I scaling          0.7   ~0.7   >0.7   Delay scaling will slow down
Energy/Logic Op scaling     >0.35   >0.5   >0.5   Energy scaling will slow down
Bulk Planar CMOS            High Probability → Low Probability
Alternate, 3G etc           Low Probability → High Probability
Variability                 Medium → High → Very High
ILD (K)                        ~3     <3   Reduce slowly towards 2-2.5
RC Delay                        1      1      1      1      1      1      1      1
Metal Layers                  6-7    7-8    8-9   0.5 to 1 layer per generation

Subthreshold leakage

[Figure: Ioff (nA/µm, log scale, 1 to 10,000) vs. temperature (30 to 110 C) for the 0.25 µm, 0.18 µm, 0.13 µm and 0.10 µm generations.]

Gate oxide leakage

Junction leakage

[Figure: junction leakage I_JE (A/µm, log scale, 1E-12 to 1E-06) vs. doping concentration (0 to 20 ×10^18/cm³), with curves labeled 10nm, 15nm, 20nm and 30nm, and Ig leakage @ 30nm and Ioff leakage @ 30nm marked.]

Fig 15. Junction leakage vs. doping concentration. Circles: data; squares: extrapolated points. Other sources of leakage at Lg=30nm have been added to the graph.

Leakage power dominating…

[Figure: leakage power (% of total, 0% to 50%) vs. technology (1.5, 0.7, 0.35, 0.18, 0.09, 0.05 µ), annotated "Must stop at 50%".]

Limit power growth…

[Figure: power (W, log scale, 0.001 to 1000) vs. year (1990 to 2006), with curves for total power, active power and leakage.]

Single- design (in)efficiency!

[Figure, two panels, each for the same process technology. Left: growth (X, 0 to 4) in die area, performance and power from the previous uArch for S-Scalar, Dynamic and Deep Pipeline designs. Right: reduction in MIPS/Watt (0% to 40%) for the same three designs, annotated "Energy efficiency drops ~20%".]

Memory latency bottleneck!

[Figure: memory latency (clocks, log scale, 1 to 1000) vs. frequency (100 to 10000 MHz), assuming 50ns memory latency. Inset: CPU, cache (small, ~few clocks) and memory (large, 50-100ns).]
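The figure's assumption of a fixed 50 ns memory latency converts directly into clock counts. A minimal sketch of that arithmetic follows; the function name `latency_in_clocks` is illustrative and does not come from the tutorial.

```python
def latency_in_clocks(latency_ns: float, freq_mhz: float) -> float:
    """Number of clock cycles that elapse during a fixed memory latency."""
    # cycles = time x frequency; ns x MHz = 1e-9 s x 1e6 Hz = 1e-3 cycles
    return latency_ns * freq_mhz / 1000.0


# Frequencies taken from the figure's x-axis range; 50 ns is the stated assumption.
for freq_mhz in (100, 1000, 10000):
    clocks = latency_in_clocks(50.0, freq_mhz)
    print(f"{freq_mhz:>6} MHz: {clocks:.0f} clocks")
```

With the wall-clock latency held constant, the cycle count grows linearly with frequency, which is the bottleneck the slide points at.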

Energy-efficient technology & design


Carrier mobility enhancement

[Figure: strained-silicon transistor cross-sections (gate G, source S, drain D). PMOS: SiGe S-D creates strain. NMOS: tensile Si3N4 cap.]

• 10-25% higher ON current
• 84-97% leakage current reduction OR 15% active power reduction

Source: Mark Bohr,

Non-planar MOS devices

[Figure: tri-gate transistor with source, drain and three gates (Gate 1, Gate 2, Gate 3), with dimensions Lg, WSi and TSi. Source: Intel]

• Improved short-channel effects
• Higher ON current for lower SD Leakage
• Manufacturing control: research underway

Active Power Reduction

$P = \alpha \, C_L \, V^2 \, f_{CLK}$

Reduce switched capacitance:
• Minimize diffusion, wire and gate loading
• Use more efficient layout techniques

Technology scaling:
• Supply voltage scaling is slowing down
• Thresholds don’t scale

Reduce switching activity:
• Conditional execution
• Conditional clocking
• Conditional precharge
• Turn off inactive blocks

Reduce clock frequency:
• Use parallelism
• Fewer pipeline stages
• Use double-edge flip-flops
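The active power relation on this slide, P = α · C_L · V² · f_CLK, can be evaluated numerically to see how strongly the supply voltage matters. The sketch below is illustrative only: the switched capacitance of 1 nF is a hypothetical value, while α = 0.05, 4.05 GHz, 1.35 V and 1.28 V are operating points quoted on later slides of this tutorial.

```python
def dynamic_power_w(alpha, c_load_f, vdd_v, f_clk_hz):
    """Active (switching) power: P = alpha * C_L * V^2 * f_CLK."""
    return alpha * c_load_f * vdd_v ** 2 * f_clk_hz


# Hypothetical 1 nF of switched capacitance; other values are quoted in the tutorial.
baseline = dynamic_power_w(alpha=0.05, c_load_f=1e-9, vdd_v=1.35, f_clk_hz=4.05e9)
lower_v = dynamic_power_w(alpha=0.05, c_load_f=1e-9, vdd_v=1.28, f_clk_hz=4.05e9)

print(f"1.35 V: {baseline:.3f} W")
print(f"1.28 V: {lower_v:.3f} W")
print(f"ratio:  {lower_v / baseline:.3f}")  # equals (1.28 / 1.35) ** 2
```

Because voltage enters squared, a roughly 5% lower supply at the same frequency trims active power by roughly 10%, independent of the assumed capacitance.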

Active power management

[Figure: power vs. frequency, from deep sleep / quick start through the minimum operating voltage and the most efficient operating point (peak Freq/Power) up to max performance. Power ∝ V³; power scaling range ~3–4.]

• Voltage-frequency scaling with active thermal feedback
• Multi-operating states from high performance to deep sleep
• Power management reduces average and peak power
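The Power ∝ V³ relation in the figure follows from P = α · C · V² · f when frequency is scaled in proportion to voltage. The sketch below works under that idealization (linear frequency-voltage scaling, leakage ignored); the function names are illustrative.

```python
def relative_power(v_scale: float) -> float:
    """Relative active power when V and f are both scaled by v_scale (P ~ V^3)."""
    return v_scale ** 3


def v_scale_for_power_reduction(factor: float) -> float:
    """Voltage (and frequency) scale needed to cut power by the given factor."""
    return factor ** (-1.0 / 3.0)


print(f"V and f at 80%: power at {relative_power(0.8):.3f} of nominal")
for factor in (3.0, 4.0):
    scale = v_scale_for_power_reduction(factor)
    print(f"{factor:.0f}x lower power needs V and f at {scale:.3f} of nominal")
```

Under this assumption, the ~3–4 power scaling range quoted in the figure corresponds to running at roughly 63% to 69% of nominal voltage and frequency.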

Xscale V/F adjustment

[Figure: core power (mW, 0 to 1200) and MIPS/Watt (0 to 5000) vs. core speed (0 to 1000 MHz).]

Tulsa V/F scaling

[Figure: normalized TDP (0.75 to 1) vs. normalized frequency (0.85 to 1), with operating points at 1.15V, 1.20V and 1.25V.]

More on-die cache memory

[Figure, two panels. Left: power density (Watts/cm², log scale, 1 to 100) of logic vs. memory across the 0.25µ, 0.18µ, 0.13µ and 0.1µ generations. Right: cache % of total area (0% to 100%) for the 486, Pentium®, Pentium® III, Pentium® 4 and Pentium® M across 1u, 0.5u, 0.25u, 0.13u and 65nm.]

Multi-threading

Thermals & Power Delivery designed for full HW utilization

[Figure: performance (0% to 100% of full HW utilization) vs. cache hit % (100%, 98%, 96%) at 1 GHz, 2 GHz and 3 GHz. A single thread (ST) spends time waiting for memory; multi-threading (MT1, MT2, MT3) fills the wait time.]

Multi-core

[Figure: relative performance (1 to 3.5) vs. die area and power (1 to 4) for single core and multi core. Inset: four cores (C1, C2, C3, C4) around a shared cache.]

General-purpose vs. special-purpose


Energy & area efficiency

Courtesy: Prof. Teresa Meng, Stanford

Special-purpose hardware

TCP Offload Engine

[Figure: MIPS (log scale, 1.E+02 to 1.E+06) vs. year (1995 to 2015), comparing GP MIPS @75W with TOE MIPS @~2W. Die photo of the TCP offload engine with blocks labeled PLL, send buffer, ROB, exec core, ROM, TCB, CLB, CAM1 and input seq; 2.23 mm X 3.54 mm, 260K transistors.]

Opportunities:
• Network processing engines
• MPEG Encode/Decode engines
• Speech engines

The Leap to Parallelism: Driving Energy-Efficient Performance

The Next Leap Is Our Focus

[Figure: energy-efficient performance vs. time, stepping from OOO and multi-processor through Hyper-Threading to dual-core and quad-core.]

• Dual-Core: 2005 – First Intel dual-core ships; 2H’06 – Next Generation uArch
• Quad-Core: Early ’07 DP Clovertown; 2007 MP Tigerton

Leakage Control


Gate oxide leakage control

[Figure: gate current density I_GATE (A/cm², log scale, 10^-7 to 10^3) vs. gate voltage (0 to 12 V) for oxide thicknesses including 2.5nm, 3.0nm, 3.5nm, 5.1nm and 7.6nm (C. Hu, 1996). Inset: cross-section of a 130nm transistor (CoSi2, Si3N4, 70 nm gate) with poly Si gate electrode, 1.5 nm gate oxide and Si substrate.]

Leakage vs. voltage

[Figure: normalized leakage (0 to 100) vs. voltage (0 to 1.5 V) for subthreshold leakage and gate leakage.]

Dual-Vt design for leakage control

[Figure, two panels (note: not drawn to scale). Left: number of paths vs. delay for 150nm low-Vt, 100nm high-Vt, 100nm low-Vt and 100nm dual-Vt designs; the dual-Vt design reaches full low-Vt performance with low-Vt usage of 34%. Right: low-Vt transistor width (as % of total transistor width, 0% to 60%) vs. % timing scaling from an all high-Vt design (0% to 25%).]

Reverse body bias (RBB)

[Figure, two panels: intrinsic leakage (Iint) reduction factor (X, log scale, 1 to 100) vs. target Ioff (nA/um, log scale, 0.01 to 1000) with 0.5V RBB for low-Vt and high-Vt devices, at 110C and at 27C.]

Optimal RBB

[Figure, two panels: IDD (A, log scale) vs. VBS (0 to 1.5 V) for a microprocessor critical path circuit and an I/O circuit, with curves labeled Lwc, Lnom and chip, and temperature labels 110C and 27C.]

Scaling of RBB effectiveness

[Figure: Iint reduction factor (X, log scale, 1 to 10) vs. target Ioff (nA/um, log scale, 0.1 to 10000) at 110C for 150 nm, 110 nm LVt and 110 nm HVt devices.]

Stack effect

[Figure, three panels. A two-transistor stack between Vdd and ground, with upper device width wu, lower device width wl, intermediate node voltage Vint and currents Idevice, Istack-u and Istack-l. Normalized current (0 to 1.2) vs. Vint (0 to 1.5 V). Normalized two-stack leakage vs. normalized single device leakage (log scales) at 30 C and 80 C.]

Exploiting natural stacks

[Figure: 32-bit Kogge-Stone adder, % of input vectors (0% to 30%) vs. standby leakage current (µA) for high VT (5.0 to 7.4 µA) and low VT (105 to 135 µA).]

Reduction    Avg     Worst
High VT      1.5X    2.5X
Low VT       1.5X    2X

Stack forcing for leakage control

[Figure: stack forcing replaces a device of width w with a two-stack of widths wu and wl (wu + wl = w, wu ≥ ½ w, wl ≤ ½ w) under iso-input load. Panels: normalized delay under iso-input load vs. normalized Ioff (1e-5 to 1e+0) for low-Vt, high-Vt and two-stack devices; number of paths vs. delay relative to the target delay for dual-Vt and dual-Vt + stack forcing.]

Low-Vt + stack-forcing reduces leakage power by 3X

Active leakage control

[Figure: timelines alternating IDLE and ACTIVE periods; idle periods use clock gating alone, or clock gating + sleep or body bias.]

Dynamic body bias

[Figure: dual-VT core with separately driven PMOS and NMOS body terminals between VCC and VSS.]

• Active mode: forward body bias (FBB), 450mV FBB on the PMOS body and 450mV FBB on the NMOS body
• Idle mode: reverse body bias (RBB), 500mV RBB on the PMOS body (VHIGH) and 500mV RBB on the NMOS body (VLOW)

FBB effectiveness

[Figure: active to idle Ioff reduction (X, log scale, 1 to 1000) vs. target Ioff at 27C (nA/um, log scale, 0.1 to 10000) at 110C, for low-Vt and high-Vt devices with FBB and NBB; annotations mark 30X, 3X and 2X.]

Dynamic sleep transistor

[Figure: dual-VT core between virtual VCC and virtual VSS, with sleep transistors to VCC and VSS.]

• Active mode: sleep transistor ON (gate overdrive), PMOS forward body bias; noise on virtual supply
• Idle mode: sleep transistor OFF (gate underdrive), PMOS reverse body bias; virtual supply collapse

Frequency impacts

[Figure, two panels: frequency (2.5 to 4.5 GHz) vs. Vcc (1 to 1.5 V) at 75°C.
Body bias, no sleep transistor: 450mV FBB to core vs. ZBB gives a 5% frequency increase at 4.05GHz, or 5% lower VCC for the same frequency (1.28V vs. 1.35V).
Sleep transistor, 450mV FBB to core: PMOS or NMOS sleep transistor vs. none gives a 3% frequency penalty at 4.05GHz, or 3% higher VCC for the same frequency (1.32V vs. 1.28V).]

Leakage reduction

Reference: No sleep transistor, 450mV FBB to core, 1.35V, 75°C

                                                Frequency     Leakage     Area
                                                degradation   reduction   increase
PMOS sleep transistor
  No over/under drive or sleep body bias        2.3%          37X         11%
  200mV over/under drive                        1.8%          44X         12%
  Sleep body bias: 450mV FBB – 500mV RBB        1.8%          64X         12%
PMOS body bias
  Dynamic body bias: 450mV FBB - ZBB            0%            1.9X        2%

Total active power savings

TON = 100 cycles, 75°C, α=0.05, F=4.05GHz

[Figure: total power (mW, 0 to 12), split into switching, leakage and overhead, for clock gating + sleep transistor (1.32V), clock gating only (1.28V) and clock gating + body bias (1.28V). Annotations: 15% and 8% total savings; leakage ↓ 77% and ↓ 45%; switching ↑ 3%; LBG.]

L3 cache sleep and shut-off modes

[Figure: a sub-array in active mode, sleep mode and shut-off mode, with sleep bias, block select and shut-off controls on virtual VSS. The voltage axis marks 1.1V, 520mV, 250mV and 0V for virtual VSS; each mode step is annotated "2x lower leakage".]

Dynamic Intel® smart cache sizing

• First implementation in dual-core
  – Dynamic implementation of the shut-off mode
• HW based algorithm predicts cache usage requirements
  – Considers the % of time the CPU is in Active state compared to the various sleep states
• During periods of low activity or inactivity the processor dynamically adapts its effective cache size
  – Cache content is gradually flushed to system memory
  – Cache ways are gradually turned off (physically as well as logically), thus reducing power
• Cache ways are re-powered on demand to deliver full performance when needed

Part-II

From Chips to Systems: Cooling challenges and solutions

Paul Coteus, Thomas J. Watson Research Center, IBM

Acknowledgements: Craig Atherton, R.J. Bezama, Erwin Cohen, Evan Colgan, B. Furman, M. Gaynes, Shawn Hall, Hendrik F. Hamann, Madhusudan Iyengar, N. LaBianca, James Lacey, John Magerlein, K. Marston, Martin O’Boyle, R.J. Polastre, Roger Schmidt, and Alan Weger

Introduction

• Challenges for sub-65 nm technology nodes
• Overview of packaging and cooling techniques
• Measurement and simulation techniques of chip thermal profiles
  − Silicon microchannel cooling
• Hot-spot management
• System level cooling issues
• Emerging Technologies: 3-D

2 Paul Coteus ICCAD’06 tutorial

History Repeats

[Figure: module heat flux (watts/cm², 0 to 14) vs. year of announcement (1950 to 2010) for bipolar and CMOS; the CMOS trend ends with "?" and "Opportunity".]

Passive Power Continues to Explode

• Power components:
  − Active power
  − Passive power
    Sub-threshold leakage (source-drain leakage)
    Gate leakage

[Figure: power density (W/cm², log scale, 0.001 to 1000) vs. gate length (microns, 1 to 0.01) for active power, passive power and gate leakage, with 1994 and 2005 marked.]

Approaches to Overcome the Limits

• Materials Innovation & Structures: 12-22 nm ultra-thin SOI (heavily doped, ultra-thin body), High K – Metal Gate, Strained Silicon, Silicon Germanium
• Layout: speed optimized (contacts further from junction, stress liner overlap increased)
• Lithography: Immersion Lithography, 29.5 nm resolution
• Cooling: micro-channel cooler; Hitachi water cooled notebook, July 2002

Range of Heat Fluxes

[Figure: heat flux (W/cm², log scale, 0.01 to 100000) vs. temperature (K, log scale, 10 to 10000), with the black body radiation line (σT⁴) and markers for a 100 W light bulb, CMOS IC’s, an oxy-acetylene torch and the surface of the sun; 373 K (100 C) is marked on the temperature axis.]
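The black body line in this figure is the Stefan-Boltzmann law, flux = σT⁴. The sketch below evaluates it in the figure's units (W/cm²); the 100 W/cm² value used for comparison is the order of magnitude the figure associates with CMOS IC's, and the function names are illustrative.

```python
SIGMA_W_PER_M2_K4 = 5.670374419e-8  # Stefan-Boltzmann constant


def blackbody_flux_w_per_cm2(temp_k: float) -> float:
    """Radiated heat flux of an ideal black body, in W/cm^2."""
    return SIGMA_W_PER_M2_K4 * temp_k ** 4 / 1e4  # 1 m^2 = 1e4 cm^2


def blackbody_temp_for_flux_k(flux_w_per_cm2: float) -> float:
    """Black body temperature (K) that radiates the given flux."""
    return (flux_w_per_cm2 * 1e4 / SIGMA_W_PER_M2_K4) ** 0.25


print(f"373 K radiates {blackbody_flux_w_per_cm2(373.0):.3f} W/cm^2")
print(f"100 W/cm^2 needs {blackbody_temp_for_flux_k(100.0):.0f} K")
```

A black body at 373 K radiates only about 0.1 W/cm², so radiation alone would need a surface near 2000 K to shed 100 W/cm²; chips at ~100 C must rely on conduction and convection instead.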

Computing Systems Have Many Cooling Challenges

[Figure: power density (W/cm², log scale, 0.1 to 1000) vs. total power (kW, log scale, 0.1 to 10000) for chip / module, rack and data center.]

Chip / Module, key issues:
• Extreme power density
• Hot spots
• Thermal interfaces
• Mechanical issues
• Cost

Rack, key issues:
• Noise
• Size
• Near air cooling limit

Data Center, key issues:
• Cooling using air is costly, noisy, and is near limit
• Chilled water distribution?

Packaging and Cooling Techniques


Limits of Advanced Cooling Technologies

• Air cooling limit is increasing due to advanced thermal interface materials (TIMs)

[Figure: chip power density (W/cm², 0 to 350, for Tj ~ 85 C) reachable with a traditional paste TIM, a liquid metal TIM, the liquid-cooled Apple machine, a copper based liquid cooler and a Si microchannel cooler.]

Liquid cooling enables:
• Higher power density
• Lower junction temperature
• Compact system packaging

Chip Power Density Forecasts and Cooling Technologies

[Figure: average power density (W/cm², 0 to 500) vs. year of production (2000 to 2020) from the ITRS 2003 high perf, iNEMI 2004 high perf, ITRS 2005 cost perf and ITRS 2005 high perf forecasts, set against cooling regimes: air cooling, better TIM and heat pipes, liquid cooling with cold plate, microchannel cooling and advanced microchannel. The approximate current limit for conventional C4’s is marked.]

Options for Cooling Very High Power Density Chips

[Figure: relative cooling potential (log scale, 0.1 to 10000) of cooling options, with "Today" marked:
air natural convection; forced convection air + high performance heat sink; fluorocarbon forced convection over chip; fluorocarbon pool boiling on chip; fluorocarbon flow boiling over chip; fluorocarbon single phase jet impingement on chip; fluorocarbon spray cooling on chip; fluorocarbon jet impingement with boiling on chip; water cooling with high performance cold plate; water jet impingement on chip; water cooling with microchannels.
Source: International Electronics Manufacturing Initiative (iNEMI) Technology Roadmap, December 2004.]

Motivation for Liquid Cooling

Device performance
• Increasingly difficult to further improve CMOS performance
• Lower temperature operation can reduce leakage current and improve performance

Chip power density
• Air cooling limit has increased from ~50 W/cm² to ~100 W/cm², but further increases very difficult
• Power density (and “hot-spots”) increasing

Total rack power
• Current maximum rack power ~30 kW
• Little further increase possible with air cooling
• High power, air-cooled racks cannot be packed closely together in a data center

Liquid cooling solves many problems!

Typical Electronic Packaging for Processor Chips

[Figure: single chip module (SCM) and multi-chip module (MCM) cross-sections showing heat sink or cooler, cap or heat spreader, thermal interface materials (TIM), chip with 1000’s of flip chip solder balls at ~0.2 mm pitch, and a ball, pin, or land grid array at ~1 mm pitch to the circuit board.]

Fundamental trends in electronic packaging:
• Ever higher electrical and thermal power density
• Ever higher interconnect density with improved high-frequency performance
  – Wirebond replaced by flip chip
  – Peripheral attach replaced by full area array of contacts
• Transition from ceramic to plastic modules
• System-on-a-chip replaced by system-on-a-package for some applications
• Elimination of potentially harmful materials including Pb and halogens

Thermal Interface Material (TIM) Issues

[Figure: package stack of heatsink, TIM, thermal hat, chip and substrate.]

• Thermal paste conductivity ~50X lower than Si or metals, but ~100X better than air
• Critical to make paste layer thin
• May need to provide mechanical compliance as well as thermal contact
• Coverage and degradation are issues

Material         Thermal Conductivity (W/m-K)   Thermal Exp Coef (ppm/°C)
Si               124                            3
SiC              270-300                        3
Al               180                            23
Cu               390                            17
Cu-W             170                            7
Ag               430                            19
Diamond          600-2000                       2-3
Thermal paste    5-10                           ---
Air              0.03                           ---
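Why the paste layer must be thin follows from one-dimensional conduction, where the unit thermal resistance of a layer is its thickness divided by its conductivity. The sketch below uses conductivities from the table above; the 50 µm bond line thickness is an assumed value for illustration, not a number from the slide, and spreading and contact resistances are ignored.

```python
def unit_resistance_c_mm2_per_w(thickness_um: float, k_w_per_m_k: float) -> float:
    """1-D conduction unit resistance t/k, in C-mm^2/W.

    t/k in m^2-K/W is (thickness_um * 1e-6) / k; multiplying by 1e6 mm^2/m^2
    leaves thickness_um / k.
    """
    return thickness_um / k_w_per_m_k


BOND_LINE_UM = 50.0  # assumed layer thickness, for illustration only

# Conductivities (W/m-K) from the slide's table; paste uses the low end of 5-10.
for name, k in (("thermal paste", 5.0), ("Cu", 390.0), ("air", 0.03)):
    r_unit = unit_resistance_c_mm2_per_w(BOND_LINE_UM, k)
    print(f"{name:>13}: {r_unit:9.3f} C-mm^2/W")
```

Halving the layer thickness halves its resistance, and the same gap filled with air instead of paste would be more than 100X worse, which matches the slide's comparison.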

Thermal Resistances for Air and Liquid Cooling

Thermal path: Tjunction → Rchip → RTIM → Rspreader → Rheatsink → Tambient

Air cooling: Tair = 20-35 C, ∆T = 50-65 C, Tj ~ 85 C
Liquid cooling: Tliquid = 20-35 C, ∆T = 50-65 C, Tj ~ 85 C

For a 1x1 cm chip with ∆T = 60 C:

Thermal Resistance (C/W)   Max Power Density (W/cm²)
1                          60
0.8                        75
0.6                        100
0.4                        150
0.2                        300

[Figure: cost rises from air cooling ($) through enhanced air cooling ($$) to enhanced water cooling ($$$).]

$ for cooling a processor; with many processors and machine room costs, $ comparison may invert
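The pairing of thermal resistance and maximum power density on this slide is the relation q = ∆T / (R · A) for a 1x1 cm chip with ∆T = 60 C. A short sketch that reproduces the slide's values (the function name is illustrative):

```python
def max_power_density_w_per_cm2(delta_t_c: float, r_c_per_w: float,
                                area_cm2: float = 1.0) -> float:
    """Largest uniform power density that keeps the temperature rise at delta_t_c."""
    max_power_w = delta_t_c / r_c_per_w  # P = delta_T / R
    return max_power_w / area_cm2


# Thermal resistances from the slide's axis, for a 1x1 cm chip and delta_T = 60 C.
for r in (1.0, 0.8, 0.6, 0.4, 0.2):
    q = max_power_density_w_per_cm2(60.0, r)
    print(f"R = {r:.1f} C/W -> {q:.0f} W/cm^2")
```

The printed values are 60, 75, 100, 150 and 300 W/cm², matching the second axis of the figure.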

How to Use Low Thermal Resistance Offered by Water Cooling

1. Increase chip power while keeping same Tj
• Highest performance, but much poorer performance/power
• Leakage current large and increasing with technology generations
• Supplying 2-4X as much power would be very difficult
• Reliability concerns

2. Reduce Tj and keep power about the same
• Increased performance for a given technology generation
  Estimate 15-20% increase in transistor switching frequency
• Improved power efficiency and lower leakage current
• Improved reliability

Introduction to Microchannel Cooling

• Invented ~20 years ago by Tuckerman and Pease
  − High heat transfer coefficient due to small channels and greatly increased surface area for heat transfer
• Can use multiple “heat-exchanger” zones in parallel to reduce pressure drop
• Fabricating cooler from Si avoids problems with differential thermal expansion
• Copper microchannel coolers are also attractive

[Figure: manifold chip with inlets and outlets, stacked on the microchannel chip and the active chip.]

Si Microchannel Coolers

• Demonstrated cooling over 1500 W (>500 W/cm²) in a practical single chip module (SCM) implementation

[Figure: Si microchannel cooler with staggered fins.]

[Figure: practical SCM implementation, showing manifold block with inlet and outlet, adhesive, gasket, microchannel cooler, TIM, chip and ceramic substrate.]

Manifold Parts and Assembled Microchannel SCM

[Figure, six photos: lower manifold (bottom view); lower manifold (top view); upper manifold (bottom view); chip on ceramic SCM; microchannel cooler / gasket; completed microchannel SCM.]

• Upper and lower manifold parts molded from high-temperature plastic
• Assembled into completed microchannel SCM

Temperature Profile Measurements and Simulations


Determining Thermal Resistance

[Figure: temperature difference ∆T (C, 0 to 16) vs. chip power (0 to 120 W) for a 2x2 cm chip. Test stack: chip with thermometer and heater resistors, TIM, Cu block with thermocouple, water in/out.]

Thermal Chip for Characterizing Microchannel Coolers

[Figure: thermal test chip with thermometer resistor and its terminals, heater resistor and its terminals, and water in/out.]

Microchannel Cooler Characterization

[Figure: test loop with chiller and pump, filter, manual bypass, proportioning valve, flow meter, differential pressure gauge and thermocouples; power is supplied to the heater and the thermometer resistance is measured.]

Microchannel Single Chip Module Section

[Figure, three cross-sections (a, b, c) showing molded manifold block, gasket, Si microchannel cooler (250 µm channels, 200 µm base), Ag epoxy, chip (400 µm), epoxy underfill, adhesive and ceramic substrate.]

• Polished cross-sections, after filling channels w/epoxy.

Microchannel Single Chip Module Results

[Figure: thermal resistance (C-mm²/W, 15 to 25) vs. flow (0.5 to 2 lpm) for 400 um chip / Ag epoxy / 450 um channel chip, with P60/C35, P75/C42, P75/C48, P80/C53 and P100/C60 microchannels.]

• For 1.25 lpm, P60/C35 best with 15.9 C-mm²/W & 42 kPa.
• For 34.5 kPa, P80/C53 best with 15.8 C-mm²/W & 1.6 lpm
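A unit thermal resistance in C-mm²/W converts directly to a temperature rise once the heat flux is known: ∆T = R_unit × q. The sketch below applies that to the 15.9 C-mm²/W result above; the 100 W/cm² flux and the 60 C budget are example inputs chosen for illustration, and uniform heating is assumed.

```python
def temperature_rise_c(r_unit_c_mm2_per_w: float, flux_w_per_cm2: float) -> float:
    """Temperature rise across the stack for a uniform heat flux."""
    flux_w_per_mm2 = flux_w_per_cm2 / 100.0  # 1 cm^2 = 100 mm^2
    return r_unit_c_mm2_per_w * flux_w_per_mm2


def max_flux_w_per_cm2(r_unit_c_mm2_per_w: float, delta_t_budget_c: float) -> float:
    """Largest uniform heat flux that stays within the temperature budget."""
    return delta_t_budget_c / r_unit_c_mm2_per_w * 100.0


R_UNIT = 15.9  # C-mm^2/W, the P60/C35 result at 1.25 lpm quoted above

print(f"rise at 100 W/cm^2: {temperature_rise_c(R_UNIT, 100.0):.1f} C")
print(f"flux for a 60 C budget: {max_flux_w_per_cm2(R_UNIT, 60.0):.0f} W/cm^2")
```

Using unit resistance rather than C/W makes results comparable across chips of different heated area.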

Microchannel Single Chip Module Results: Flow Resistance

[Figure: pressure drop (0 to 70 kPa) vs. flow (0.5 to 2 lpm) for P60/C35, P75/C42, P75/C48, P80/C53 and P100/C60 microchannels.]

• Higher flow resistance with smaller pitch & channel width
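Pressure drop and flow rate together set the ideal hydraulic power needed to push coolant through the cooler, P = ∆p × Q. The sketch below uses the 42 kPa at 1.25 lpm operating point quoted for the P60/C35 cooler; it ignores pump efficiency and losses elsewhere in the loop, so it is a lower bound rather than a measured pump power.

```python
def hydraulic_power_w(pressure_drop_kpa: float, flow_lpm: float) -> float:
    """Ideal pumping power: pressure drop (Pa) times volumetric flow (m^3/s)."""
    pressure_pa = pressure_drop_kpa * 1e3
    flow_m3_per_s = flow_lpm * 1e-3 / 60.0  # litres per minute to m^3/s
    return pressure_pa * flow_m3_per_s


print(f"{hydraulic_power_w(42.0, 1.25):.3f} W")
```

At under one watt, the ideal pumping power is small next to the chip powers being cooled, so the pressure drop mainly constrains the plumbing rather than the energy budget.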

Microchannel SCM Results with In TIM

[Figure: thermal resistance (C-mm²/W, 10 to 25) vs. flow (0.5 to 1.5 lpm) for 400 um chip / TIM / 450 um channel chip with P60/C35 microchannels, comparing Ag epoxy TIM with In solder TIM.]

• For In solder with 1.25 lpm, 12.0 C-mm²/W and 41.4 kPa.
• For In solder with 34.5 kPa, 12.5 C-mm²/W and 1.1 lpm

Si Microchannel Cooling > 500 W/cm²

[Figure: power (W, 0 to 1600), Toutlet − Tinlet and Tchip avg − Tinlet (C, 0 to 80) vs. time (0 to 15 minutes) at four operating points: 1.1 lpm, 29 kPa, 1.59 kW; 1.3 lpm, 37 kPa, 1.58 kW; 1.6 lpm, 50 kPa, 1.56 kW; 1.7 lpm, 57 kPa, 1.56 kW.]

• With 400 µm chip, In TIM and 450 µm channel chip, cooled >500 W/cm² with Tchip – Tinlet = 66 C for 37 kPa & 1.3 lpm.
• Heated area of 3 cm² & P60/C35 microchannel.
• Inlet water heating up as exceeded chiller capacity.
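The Toutlet − Tinlet curve in this experiment can be estimated from an energy balance on the coolant, ∆T = P / (ṁ · c_p). The sketch below assumes water with a density of about 1 kg per litre and a specific heat of about 4180 J/kg-K (textbook values, not from the slide) and that all chip power ends up in the water.

```python
WATER_DENSITY_KG_PER_L = 1.0       # assumed, approximate
WATER_CP_J_PER_KG_K = 4180.0       # assumed, approximate


def coolant_temp_rise_c(power_w: float, flow_lpm: float) -> float:
    """Outlet minus inlet temperature if all power is absorbed by the water."""
    mass_flow_kg_per_s = flow_lpm * WATER_DENSITY_KG_PER_L / 60.0
    return power_w / (mass_flow_kg_per_s * WATER_CP_J_PER_KG_K)


# Operating point from the slide: 1.58 kW at 1.3 lpm.
print(f"{coolant_temp_rise_c(1580.0, 1.3):.1f} C")
```

Under these assumptions the water warms by roughly 17 C across the cooler, a sizeable share of the 66 C chip-to-inlet difference; at kilowatt power levels the flow rate matters as much as the cooler's own thermal resistance.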

Si Microchannel Cooling Results with Water

[Figure: total unit resistance (C-mm²/W, 0 to 25), split into chip, TIM, base and fins, for four configurations: 75 um Stag, +Thin Si, 60 um Stag, +In TIM. Each bar is annotated with the heat flux reached for a ∆T of 63 C (Tj avg − Tinlet): 285 W/cm², 380 W/cm², 395 W/cm² and 525 W/cm².]

• Total unit resistance is average of 6 sensors with 1.25 lpm.
• 75 µm or 60 µm pitch staggered microchannels.
• +Thin Si: 400 µm chip & 450 µm Si channel chip vs. 725 µm & 675 µm Si.
• +In TIM: replace Ag epoxy with In solder.

Sub-ambient Si Microchannel Cooling

• N2 purged plexiglas box over test area to avoid condensation.
• Special tubing & materials for -40 C fluorinated fluid.

Cooling High Power with Sub-ambient Fluids

[Figure: power (W, 0 to 1000), Toutlet − Tinlet and Tchip avg − Tin (C, 0 to 70) vs. time (0 to 15 minutes) for 400 um chip / Ag epoxy / 525 um channel chip with P75/C42 microchannel, Tinlet = -28 C, ~1.4 lpm flow & 70 kPa pressure drop. Marked points: Tchip = -4 C at 320 W, Tchip = 21 C at 640 W, Tchip = 35 C at 815 W.]

• Demonstrated cooling 270 W/cm² with Tinlet of -28 C & Tj of 35 C.
• Heated area of 3 cm².

Sub-ambient Si Microchannel Cooling

[Figure: thermal resistance (C-mm²/W, 21 to 27) vs. inlet fluid temperature (-40 to 20 C) for Ag epoxy TIM and In solder TIM; 400 um chip / TIM / 450 um channel chip, P60/C35 microchannel, pressure drop of ~67 kPa.]

• Reduced thermal resistance with In solder vs. Ag epoxy TIM layer.
• With In TIM, for 100 W/cm² & Tinlet = -40 C, Tjavg = -17 C.

Si Microchannel Cooling: Fluorinated Fluid vs. Water

[Figure: total unit resistance (C-mm²/W, 0 to 30), split into chip, TIM, base and fins, for Ag epoxy with water, Ag epoxy with fluorinated fluid, In solder with water and In solder with fluorinated fluid; 400 um chip / TIM / 450 um channel chip, 1.25 lpm H2O, 1.6 lpm fluorinated fluid. Bars are annotated with the heat flux reached for a ∆T of 63 C (Tj avg − Tinlet): 255 W/cm², 290 W/cm², 395 W/cm² and 525 W/cm².]

• Unit resistance increased by ~9 C-mm²/W using fluorinated fluid vs. water. Both fluids at ~20 C.

Hot Spot Management

Dual Core PowerPC™ 970MP Microprocessor

Power- vs. Hotspot-aware Layout

[Figure: power maps and temperature maps for a power-limited design and a hotspot-limited design (Ptotal = 25 W and Ptotal = 100 W), followed by a hotspot-limited example:
Ptotal = 50 W: 185 W/cm² peak, 98 ∆T°C
save 10 W in low power density region (Ptotal = 40 W): 185 W/cm² peak, 96 ∆T°C
save 10 W in high power density region (Ptotal = 40 W): 111 W/cm² peak, 61 ∆T°C]

Power distributions are very important if chip is hotspot-limited

Spatially-resolved Imaging of Microprocessor Power (SIMP)

Step #1: IR thermal measurements using a transparent heat sink

[Figure: IR camera (detector and imaging lens) viewing the chip through a flow cell with IR-transparent windows and a transparent liquid over the chip base.]

- infra-red (IR) radiation signal measures on-chip temperatures
- transparent heat sink is well-defined & characterized
- high cooling rates (up to 4 W/cm² K)
- tunability emulates different package conditions
- high cooling rates limit spreading & mimic correct transient behavior

Spatially-resolved Imaging of Microprocessor Power (SIMP)

Step #2: Temperature-to-power conversion*
*here for 30 x 30 power sources

measured chip thermal map is represented as the sum of the individual temperature fields of each power source:
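The superposition described here can be written as a linear system T = A · P, where column j of A is the temperature field produced by one watt in power source j; inverting it recovers the power map from the measured temperatures. The sketch below is a toy illustration under that assumption, not the authors' implementation: the 3 x 3 matrix, the power values and the helper `solve_linear` are all hypothetical, and a real 30 x 20 grid with measurement noise would typically need a regularized least-squares solve instead.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


# Hypothetical A-matrix: A[i][j] = temperature rise (C) at pixel i per watt in source j.
A = [[2.0, 0.5, 0.1],
     [0.5, 2.0, 0.5],
     [0.1, 0.5, 2.0]]
true_power_w = [10.0, 2.0, 5.0]

# Forward model: the thermal map is the sum of each source's temperature field.
measured_c = [sum(A[i][j] * true_power_w[j] for j in range(3)) for i in range(3)]

recovered_w = solve_linear(A, measured_c)
print([round(p, 6) for p in recovered_w])
```

The off-diagonal entries model thermal cross talk between neighbouring sources, which is why the raw thermal map cannot simply be read as a power map.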

Determination of A-matrixes

1. Experimentally: individual temperature fields are measured using a focused laser beam as a heat source*
   *here for 30 x 20 power sources

[Figure: measured temperature field for a single heat source, color scale from min to max.]

2. Computational fluid dynamic calculations

In this study we employed calculated matrixes, which were validated using thermal test chips

SIMP validation experiments

Test vehicle: fully packaged thermal test chip with heaters and 14 resistive sensors

[Figure: measured thermal distribution (color scale 0K to 21.7K) over the chip (x from 0 to 18 mm, y in mm) with the 14 sensor locations numbered; derived power distribution (W/mm²), here for 30 x 20 power sources; temperature (°C, 0 to 100) at each sensor location comparing sensors, heater and IR readings.]

SIMP can derive very accurate power maps

High power density regions of PowerPC™ 970MP

[Figure: power map & die picture overlay, with units labeled vector engine, dual FXU, ISU, dual FPU, IDU, IFU, dual LSU, L1 inst. cache, L1 data cache and L2 cache.]

- High power density regions: sub units of vector engine, ISUs and the FXUs
- both cores running: thermal cross talk between FXUs governs hotspot
- one core running: vector engine governs hotspot

Hotspot movement

[Figure: thermal maps for dual core and single core operation, with the hotspot marked in each.]

- hotspot movement of ~2 mm from single to dual core operation
- thermal sensor reading & difference to Tjmax is workload dependent
- hotspot movements can have broad implications for appropriate thermal/power management

Power5 Hotspot Patterns

[Figure: thermal map and power map.]

- 50 different workloads for Power5 imaged & analyzed
- observed significant differences in circuit utilization

Jointly with P. Bose, Z. Hu, Y. Li et al. (IBM Research)

Clock Gating observed by SIMP

[Figure: power maps without clock gating and with clock gating.]

- 14 % hotspot temperature reduction
- 18 % total power reduction

Jointly with P. Bose, Z. Hu, Y. Li et al. (IBM Research)

System Level Cooling Techniques


System with Microchannel Coolers

• Allows for higher power density and/or lower Tj
• A key design feature for PERCS and Zebra supercomputers
• Evaluating for IBM servers such as IH node

[Figure: blade prototype with 2 microchannel coolers; thermal rack prototype with 32 microchannel coolers; proposed PERCS water cooled blade.]

Thermal Prototype Blade without Covers (One of 32 per rack)

[Figure: labeled photo of the blade showing the midplane connector, thermal replicas of memory DIMMs (64 per blade, 1350 W total), optical connectors (for future work), plumbing + flow monitors, DRAM power converters and re-drive chips. Hoses between A-A and B-B are not shown.]

Total power per blade: ~2.5 kW

Thermal Rack Prototype

• Blade packaging with 32 blades/rack
• Up to 82 kW/rack
  – High power density chips are direct water cooled (26 kW/rack)
  – Other chips (DRAM, etc.) are air cooled (up to 56 kW)
  – Closed-loop air path (“Machine room in a rack”)

[Figure: rack drawing with air path positions numbered 1 to 8; rack dimensions 2.20 m, 1.52 m and 0.81 m]
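The rack figures quoted above can be cross-checked with simple arithmetic. The sketch below only uses numbers from the slides (32 blades, 26 kW water cooled, up to 56 kW air cooled); the function name and dictionary keys are illustrative.

```python
# Rack power budget, using only the figures quoted on the slides.
BLADES_PER_RACK = 32
WATER_COOLED_KW = 26.0  # high power density chips, direct water cooled
AIR_COOLED_KW = 56.0    # DRAM and other chips, air cooled (upper bound)


def rack_budget(water_kw=WATER_COOLED_KW, air_kw=AIR_COOLED_KW,
                blades=BLADES_PER_RACK):
    """Return total rack power, power per blade, and the water-cooled share."""
    total_kw = water_kw + air_kw
    return {
        "total_kw": total_kw,
        "per_blade_kw": total_kw / blades,
        "water_fraction": water_kw / total_kw,
    }


budget = rack_budget()
print(budget)
```

The total matches the 82 kW/rack limit, and the per-blade value is consistent with the ~2.5 kW per blade quoted for the thermal prototype blade.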

Typical Data Center Layout

"High-tech" cooling for multi-million dollar servers

Data Center Power

• Currently reaching the air cooling limit for high-performance computing data centers
• Energy efficiency is a key design issue

[Figure: heat density chart with annotations "High Performance Computing: Extreme Thermal Problem" and "Commercial Computing: Creeping Thermal Problem"; typical office: 4 to 8 watts/ft²]

Roger Schmidt, IBM Server Group

HPC Data center heat load density trends

[Figure: zonal heat flux [W/ft²] (0 to 500) and rack power [kW] (0 to 40) vs. year, 1998 to 2006; trend: 2x every 2.5 years. Data from R.R. Schmidt et al., IBM Journal of Res. and Dev. 49, 709 (2005)]

Exploding heat load densities at the data center level
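The "2x every 2.5 years" trend corresponds to exponential growth. A minimal sketch of that scaling follows; the starting value of 100 W/ft² is a hypothetical placeholder chosen only to show the arithmetic, not a value read off the chart.

```python
def heat_load(initial, years, doubling_period_years=2.5):
    """Project a heat load density that doubles every `doubling_period_years`."""
    return initial * 2.0 ** (years / doubling_period_years)


# Hypothetical starting point (W/ft^2), used only to illustrate the scaling.
START = 100.0
for years in (0, 2.5, 5, 10):
    print(years, round(heat_load(START, years), 1))
```

At this rate a data center's zonal heat flux grows 16x in a decade, which is why air cooling limits are reached quickly.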

Facility/Network Concerns

When asked about the top 3 concerns…

[Figure: bar chart, percentage [%] (0 to 80). Categories shown: Heat/Power Density, Availability (Uptime), Space Constraints/Growth, Technology changes, Monitoring heat/power/cooling, Security, Justifying Expenditures, D.C. Consolidations, Hardware Reliability, Sarbanes Oxley Compliance, Service Delivery, Staffing/Training Limitations, Software Issues, Other. Bar values shown, in descending order: 78, 57, 39, 23, 18, 17, 14, 13, 13, 10, 7, 6, 1, 1 %]

#1 concern: heat/power density (78 %)

From Spring 2005 Data Center Users’ Group Conference, "The Adaptive Data Center: Managing Dynamic Technologies"

Biggest Cooling Issues

[Figure: bar chart, percentage [%] (0 to 40). Categories shown: Planning, Hot spots, Uniform air distribution, Aging CRACs, Airflow through the rack, Other. Bar values shown: 44, 24, 18, 6, 6, 2 %]

#2 issue: Hot spots

From Spring 2005 Data Center Users’ Group Conference, "The Adaptive Data Center: Managing Dynamic Technologies"

RApid Thermal Imaging of a Data center (RATID)

A cart with thermal sensors mounted in a defined 3D pattern is rolled through the data center while logging data.

Deployment of RATID cart

[Figure: RATID cart in a data center aisle; labels: inlet air, outlet air, chilled air]

IBM supercomputer layout

[Figure: data center layout with x, y, z axes]

60 x 56 x 9 ~ 30,000 cubic feet; 50,000 data points in a few hours

3D Temperature distributions of IBM supercomputer

[Figure: temperature map in the x-y plane at z = 5.5 feet; hot spot at the long aisle]

First 3D temperature maps of a data center

Cross-section: Intermixing between hot and cold air

[Figure: y-z cross-section of the temperature distribution]

- Hot air is sucked into the cold aisle
- Higher nodes can potentially be too hot
- Lower nodes are cooled too effectively

Cool Blue Heat Exchangers

[Figure: temperature maps before installation of Cool Blue and with 6 Cool Blue installations]

Hot spots removed by ~10-25 °C

Emerging Technologies: 3-D

Why 3D Now?

• Increasing difficulties associated with 2D horizontal scaling in a power-constrained world
• Multi-core, multi-threaded and multi-image (virtualization) chips are stressing memory bandwidth and on-chip memory capacity
• Ultra-low-voltage operating point enables 3D integration
• Advances in Si through-via technology and wafer bonding are providing much higher I/O densities

Transition to 3D CMOS

[Figure: module heat flux (watts/cm²), 0 to 14, vs. year of announcement, 1950 to 2010, for bipolar and CMOS; annotation: "Opportunity for 3D Si?"]

Broad Industry Investigation of 3D CMOS

Established Companies: Micron, Hitachi, Infineon, Intel (Cu:Cu bonding, front-to-front; Ref: B. Black, CCD conference 2004), IBM
Consortia: Sematech, IMEC, ASET
Start-Up Companies: Ziptronics, Tezzaron (Ref: http://www.tezzaron.com/), Zycube (adhesion layer, front-to-back; Ref: http://www.zy-cube.com/)
Approaches listed for IBM, Ziptronics and Tezzaron: Oxide:Oxide bonding, Cu:Cu bonding, front-to-front, front-to-back

Active Research on Many 3D Electronics Structures

Active work on 3D electronics at ASET (Japan), Fraunhofer Institute, Georgia Tech, IBM, Intel, Micron Technology, Philips, Samsung, Singapore, Tezzaron, Thru-Si, Ziptronix, ...

[Figures: 3D IC (IBM); chip stacks with through-Si vias (Intel); memory chip stack (Samsung); multiple chips on a Si carrier (IBM), with labels: microjoins, chip 1, chip 2, BEOL Cu wiring, silicon carrier, through-Si vias, decoupling caps, C4, substrate]

Packaging for Revolutionary 3D Integrated Silicon

3D Chip Package Interaction Issues
• Intra-chip layer heat generation/flow
• Heat flow to external cooling solution
• Chip to carrier stress
• Inter- and intra-layer stress

[Figure: stack of cooling solution, 3D chip, chip carrier and PCB]

Difficulties in Cooling Stacked Chips

[Figure: stack of cooler, thermal interface material (TIM), chip 1, chip joining layer, chip 2]

• For high-performance processor applications, large I/O count usually requires many metal bump interconnects between the chips and vias through chip 2
• Typically a processor chip, which has many I/O, must be on the bottom
  − Highest power chip is located where cooling is the worst
• Current thick dielectric stacks and chip joining layers have high thermal resistance
• Internal cooling channels would occupy space needed for chip-chip interconnects

Stacked Chips – Estimating Cooling Capability

Layer | Thermal Resistance (C-mm²/W)
Good air-cooled heatsink with spreader for 1 cm² chip | ~40
High-performance microchannel cooler to inlet water | 7-15
Metal TIM | ~3
725 µm thick Si | 5.6
200 µm thick Si | 1.5
10 µm thick SiO2 | 7.7
50 µm of typical chip underfill | ~100
20% coverage of 25 µm long thermal vias with thermal conductivity half that of Cu (not demonstrated) | ~0.6
C4 layer with solder balls 100 µm high | ~12
Board to ambient | Very high

[Figure: heat flow from the chip stack to the cooler (air or water coolant) and to ambient]

• Thermal modeling calculates temperature for various configurations
  − If 2 chips at 100 W/cm² each and 2 memory chips on top with aggressive water cooler, estimate ~40 C-mm²/W, ∆T = 80 C
• Cooling is likely to limit performance for microprocessor applications
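The ∆T estimate above follows from multiplying an area-normalized thermal resistance by a heat flux. The sketch below reproduces the 80 C figure under the assumption that the flux is the sum of the two 100 W/cm² processor chips (memory power neglected). The single-chip stack at the end is a hypothetical example that picks the midpoint of the 7-15 C-mm²/W cooler range.

```python
MM2_PER_CM2 = 100.0  # 1 cm^2 = 100 mm^2


def temperature_rise(resistance_c_mm2_per_w, heat_flux_w_per_cm2):
    """Delta-T (C) across a layer stack: R'' [C-mm^2/W] times q'' [W/mm^2]."""
    return resistance_c_mm2_per_w * (heat_flux_w_per_cm2 / MM2_PER_CM2)


# Slide case: two chips at 100 W/cm^2 each through ~40 C-mm^2/W.
# Assumption: the memory chips' power is neglected.
delta_t_stack = temperature_rise(40.0, 2 * 100.0)

# Hypothetical single-chip stack built from table rows:
# microchannel cooler (midpoint of 7-15, an assumption) + metal TIM + 725 um Si.
single_chip_layers = [11.0, 3.0, 5.6]
delta_t_single = temperature_rise(sum(single_chip_layers), 100.0)

print(delta_t_stack, delta_t_single)
```

Series resistances simply add in this one-dimensional picture, so a single 50 µm underfill layer (~100 C-mm²/W) would dominate any stack that contains it.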

Integrated Thermal-fluidic Chip Cooling

• Assembly/encapsulation with thermal-fluidic I/Os

[Figure: die with on-chip microfluidic channels, I/Os and heaters; labels: encapsulant, C4 solder bumps, tube, overcoat or glue layer, coolant out]

B. Dang et al., IEEE Electron Device Letters, Feb 2006
B. Dang, Ph.D. Dissertation, Ga Tech, 2006

Enabling Various Thermal-fluidic Schemes

[Figure, left: distribution channels are at the front side of the package]
[Figure, right: distribution channels are at the back side of the substrate, aligned with through-vias]

• Compatible fabrication processes for various configurations

B. Dang, Ph.D. Dissertation, Ga Tech, 2006

Summary

• Computers face thermal challenges at the chip, rack, and data center level
• Better thermal interface materials are critical
• For higher power densities, liquid cooling will be needed
• System issues must be considered

[Figure: elements contributing to a qualified thermal solution: thermally intelligent chip design, thermal modeling, burn-in and reliability testing, assembly process development, mechanical design and modeling, prototype design and fabrication, material development, material characterization]

Industry Implications

• Power and variability continue to be the two principal concerns facing the semiconductor industry
• CMOS performance gains will continue for the next ten years through the incorporation of new materials, innovative process integration, and tailored layouts
• Ultra-low-voltage devices and circuits will emerge, enabling 3D silicon integration with attendant density scaling
• Packaging technology will continue to grow in importance as device structures increase in complexity
• Increasingly, on-chip sensing and monitoring for improved variability control and yield will be the norm

Part-III

Electrothermal Engineering: from devices and interconnects to circuits and systems

Kaustav Banerjee Electrical and Computer Engineering University of California-Santa Barbara

1 Kaustav Banerjee ICCAD’06 tutorial

Electrothermal Engineering

Temperature awareness at every level… Integrated approach…

Outline

• Micro-scale vs macro-scale thermal effects
• Electrothermal effects in scaled devices
• Electrothermal effects in scaled interconnects
• Circuit level effects
• Chip-scale effects
• Implications for emerging technologies

Micro-Scale vs. Macro-Scale

Global View of IC Heat Transfer….

Micro-Scale Macro-Scale

Device

• Self-heating increases…
• Leakage increases…
• Degrades performance…
• Degrades reliability…

[Figures: dissipated power (µW/µm) and ∆T increase (K); leakage vs. temperature (°C); log-scale plot (10³ to 10⁹) vs. operating temperature (°C), −40 to 100, for N-FET and P-FET PD-SOI (120 nm) and 45 nm N-MOSFET and P-MOSFET; ESD failure (Texas Instruments)]

Pop et al. IEDM 2001; Banerjee et al. IEDM 2003; Lin et al. IEDM 2005

Interconnect

• Number of metal layers increases…
• Low-k dielectrics increase self-heating
• Current density increases…

[Figures: ULSI metallization (IBM, 20 µm scale bar); back-end thermal profile with global wires, temperature labels 209 °C and 126 °C; electromigration failure (50 nm scale bar); ESD failure]

Im and Banerjee, IEDM 2000; Ryu et al. IRPS 1997; Banerjee et al. IRPS 2000

Circuit

• Increased delay and variance…
• Chip temperatures are non-uniform…

[Figure: delay distribution, delay (ns) from 12 to 18; at T = 25 °C: µ = 1.41e-9, σ² = 2.0e-21; at T = 125 °C: µ = 1.55e-9, σ² = 2.3e-21 (Lin et al. IEDM 2005)]
[Figure: chip temperature map (°C); cache 70 ºC, core 120 ºC (courtesy of S. Borkar, Intel)]

Affected design concerns:
• Reliability
• Leakage and yield estimation
• Wire delay and clock skew
• Power/performance optimization
• Buffer insertion
• Voltage drop

Zhang et al. ISLPED 2004; Lin et al. ICCD 2005; Ajami et al., TCAD 2005, JAICSP 2005

System

Power density increases……

source : Intel, AMD

System

Impact of higher power dissipation and density (hot-spots) on system

• System Performance and Stability
  – Higher temperature degrades performance
  – Heat-induced failure and instability
• Product Lifetime and Reliability
  – Most reliability mechanisms are highly temperature sensitive
• Operating Cost
  – More complex cooling solutions

M. Miller, AMD

Electrothermal Effects in Scaled Devices

Implications of Self-heating for Device Performance

• Increasing self-heating
• Increasing leakage power

[Figure: two-axis plot vs. channel length (nm), 0 to 210; axis ranges 0 to 1400 and 0 to 200 (Pop et al. IEDM 2001; Banerjee et al. IEDM 2003)]

Degradation in on-current:
• ~15% for strained Si/Ge (Jenkins et al. IEDM 2002)
• ~25% for FinFETs (Kolluri et al. (UCSB))

Implications of Self-heating for Emerging CMOS

[Figures: emerging device structures, ITRS 2004]

Due to confined geometry and poor thermal conductivity materials, emerging CMOS devices will exhibit severe localized heating effects!

Degrades performance…

[Figure: log-scale plot (10³ to 10⁹) vs. operating temperature (°C), −40 to 100, for N-FET and P-FET PD-SOI (120 nm) and 45 nm N-MOSFET and P-MOSFET (Lin et al. IEDM 2005)]

Implications of Self-heating for Device Reliability

ESD Failure
• High sensitivity of FinFET devices to ESD, leading to early failure (It2)
• Failure in DENMOS devices under ESD (Texas Instruments)

[Figures: failed devices with source, gate and drain labeled]

Russ et al. EOS/ESD Symposium 05

FinFET Temperature Profiles

High temperatures due to phonon confinement

Indirectly verified with experimental measurements… Kolluri et al. (UCSB)

• High temperature gradient from source to drain
• Localized heat generation region near the drain
• Understanding critical for optimization of device parameters

Self-Consistent Simulations

Flow:
1. Specification of operating conditions
2. Estimation of heat generation profile: electrical simulations (SPICE / 2D or 3D device simulations)
3. Estimation of temperature distribution: thermal simulations (analytical or compact model / finite element or device simulations)
4. Convergence? If no, the temperature profile is fed back to step 2; if yes, continue
5. Final self-consistent temperature profile and output current

• SPICE/device simulations can be used for estimating heat generation
• Analytical or compact models / numerical simulations can be used for estimating temperature distribution

Kolluri et al. (UCSB)
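The loop above can be sketched as a fixed-point iteration between an electrical step (power as a function of temperature) and a thermal step (temperature as a function of power). The models below are deliberately simple stand-ins and are assumptions, not the simulators named on the slide: leakage is taken to double every 25 C, and the thermal step is a single lumped resistance `theta_ja`.

```python
def leakage_power(temp_c, p_leak_ref=10.0, t_ref=25.0, t_double=25.0):
    """Hypothetical leakage model: p_leak_ref watts at t_ref, doubling every t_double C."""
    return p_leak_ref * 2.0 ** ((temp_c - t_ref) / t_double)


def self_consistent_temperature(p_dynamic, theta_ja, t_ambient=25.0,
                                tol=1e-6, max_iter=200, t_limit=250.0):
    """Iterate electrical and thermal steps until the temperature stops changing.

    Returns (temperature, power, iterations), or None if the iteration exceeds
    t_limit or fails to converge, which this sketch treats as thermal runaway.
    """
    temp = t_ambient
    for iteration in range(1, max_iter + 1):
        power = p_dynamic + leakage_power(temp)   # electrical step
        new_temp = t_ambient + theta_ja * power   # thermal step
        if new_temp > t_limit:
            return None
        if abs(new_temp - temp) < tol:
            return new_temp, power, iteration
        temp = new_temp
    return None


print(self_consistent_temperature(p_dynamic=50.0, theta_ja=0.5))
print(self_consistent_temperature(p_dynamic=50.0, theta_ja=2.0))
```

With the lower thermal resistance the loop settles at a stable operating point; with the higher one the leakage feedback outpaces heat removal and no solution is found below the limit.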

Thermal Modeling of FinFETs

Compact Thermal Model for FinFETs

Kolluri et al. (UCSB)

[Figure: two contour plots, x axis 10 to 80, y axis 50 to 150, color scale 340 to 460]

Design windows constructed based on the compact thermal model

Heat Generation Models – Increasing Complexity

Macro-Scale: P = I ⋅ V
  I = Current, V = Voltage

Micro-Scale, Drift Diffusion:
  Q’’’ = Heat Generation Rate Per Unit Vol., J = Current Density, E = Electric Field, R = Electron Recombination Rate, G = Electron Generation Rate

Micro-Scale, Hydrodynamic Model:
  Te = Electron Temp., TL = Lattice Temp., Eg = Band Gap

Quantum-Mechanical Monte-Carlo Models: summation over all phonon generation and absorption energies
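For reference, the forms below are the ones commonly quoted in the device-simulation literature for the first three levels, written with the symbols defined above. They are given as an assumption about the intended expressions, not as a transcription of the slide; k_B (Boltzmann constant), n (electron density) and τ_{e-L} (electron-lattice energy relaxation time) are added symbols.

```latex
% Macro-scale: lumped Joule heating
P = I \cdot V

% Micro-scale, drift diffusion (lattice and electron temperatures equal, T)
Q''' = \mathbf{J} \cdot \mathbf{E} + (R - G)\,(E_g + 3 k_B T)

% Micro-scale, hydrodynamic (separate electron and lattice temperatures)
Q''' = \frac{3}{2}\,\frac{k_B\, n\,(T_e - T_L)}{\tau_{e-L}}
       + (R - G)\left[E_g + \frac{3}{2} k_B (T_e + T_L)\right]
```

Setting T_e = T_L = T in the recombination term of the hydrodynamic form recovers the drift-diffusion recombination term, which is a quick consistency check between the two levels.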

Drift Diffusion Model

• Heat Generation Rate

• Assumes: Lattice Temperature = Electron Temperature.

• Does not account for non-locality of phonon emission near strong electric field regions, like drain area of a transistor.

• Does not differentiate between various phonon modes and frequencies.

Hydrodynamic Approach

• A better model (than drift diffusion) considering electron temperature (different from lattice temperature) and energy relaxation time.

• Only average electron temperature and relaxation times are used - but carrier scattering rates are energy dependent.

• Does not differentiate between different phonon modes & frequencies – but phonon velocities are dependent on frequency.


Monte Carlo Method

• A simulation of the motion of all electrons, subject to applied electric fields and given scattering mechanisms.

• Flight duration, type of scattering event, etc. are randomly chosen during the simulation based on microscopic probabilities.

• Phonon generation rates are calculated considering different modes (LA,TA,LO,TO) and frequencies.
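The random selection step described above can be sketched as follows. This is a minimal illustration under strong assumptions: the scattering rates are hypothetical constants (real rates are energy dependent), there is no field or band structure, and the mechanism names are placeholders.

```python
import math
import random

# Hypothetical scattering rates in 1/s, chosen only for illustration.
RATES = {"acoustic phonon": 2.0e12, "optical phonon": 5.0e12, "impurity": 1.0e12}


def sample_free_flight(total_rate, rng):
    """Draw a flight duration from an exponential distribution with mean 1/total_rate."""
    return -math.log(1.0 - rng.random()) / total_rate


def sample_mechanism(rates, rng):
    """Pick a scattering mechanism with probability proportional to its rate."""
    threshold = rng.random() * sum(rates.values())
    cumulative = 0.0
    for name, rate in rates.items():
        cumulative += rate
        if threshold < cumulative:
            return name
    return name  # guard against floating-point round-off


def simulate(n_events, rates=RATES, seed=0):
    """Return the mean free-flight time and a count of events per mechanism."""
    rng = random.Random(seed)
    total = sum(rates.values())
    counts = {name: 0 for name in rates}
    elapsed = 0.0
    for _ in range(n_events):
        elapsed += sample_free_flight(total, rng)
        counts[sample_mechanism(rates, rng)] += 1
    return elapsed / n_events, counts


mean_flight, counts = simulate(20000)
print(mean_flight, counts)
```

In a heat-generation study, each phonon emission or absorption event picked this way would be tallied by mode and frequency to build the phonon generation rates mentioned above.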

Monte Carlo Method (cont.)

Heat Transport Models – Micro-Scale vs. Macro-Scale

Phonon Transport – Length Scales

• Phonon wavelength (~2 nm) < device dimensions ⇒ quantum mechanical analysis is not necessary
• Device dimensions < phonon mean free path (~100-300 nm) ⇒ classical heat diffusion equation is NOT VALID!

⇒ Sub-continuum analysis necessary: Boltzmann Transport Equation
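The length-scale argument can be expressed as a simple classification. The thresholds come from the slide (~2 nm phonon wavelength, ~100-300 nm mean free path); using the lower end of the mean free path range as the default cutoff is an assumption of this sketch, as are the function and label names.

```python
PHONON_WAVELENGTH_NM = 2.0
PHONON_MFP_RANGE_NM = (100.0, 300.0)


def heat_transport_regime(device_nm, wavelength_nm=PHONON_WAVELENGTH_NM,
                          mfp_nm=PHONON_MFP_RANGE_NM[0]):
    """Classify which heat transport description applies to a device dimension."""
    if device_nm <= wavelength_nm:
        return "quantum mechanical"
    if device_nm < mfp_nm:
        return "sub-continuum (phonon BTE)"
    return "continuum (heat diffusion)"


for size in (1.0, 45.0, 65.0, 1000.0):
    print(size, heat_transport_regime(size))
```

Passing `mfp_nm=300.0` gives the more conservative classification, flagging any device smaller than the upper end of the mean free path range.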

Electrothermal Effects in Nanoscale Devices

[Figure: transistor cross-section (source, gate, drain, gate length L) showing a high-energy electron near the drain emitting phonons; phonon mean free path Λ ~ 300 nm in silicon at room temperature]

• Traditional CMOS scaling assumes an isothermal problem
• High E field at drain → hot electrons
• Phonon hot spot near drain (as Λ > L)
• Affects device behavior
• Need to solve phonon Boltzmann Transport Equation (BTE)

Device Temperature Profiles

Local hot-spot temperature rises well beyond the diffusion theory prediction…

[Figures: T (K) vs. x (nm) and y (nm) across source, gate and drain]

Pop et al. IEDM 2001; Sverdrup et al. SISPAD 2000

Indirectly verified against ESD failure pulses…

• Important implications for device performance and leakage
• Critical for estimating failure conditions under ESD events

Multi-scale Approach for Practical Simulations

Electrothermal Device Modeling and Simulation

• Channel region (microscale): electron Monte Carlo simulation (heat generation rate estimation), then phonon BTE (channel thermal profile estimation)
• Other region (macroscale): heat diffusion (thermal profile estimation)
• The two regions exchange heat generation and temperature profiles and feed a compact electrothermal device model

Ongoing research at UCSB (collaboration with Stanford, IBM and TI)

Electrothermal Effects in Scaled Interconnects

Size Effect on Cu Interconnect Resistivity

• Cu Diffusion Barrier
  – Barriers have higher resistivity
  – Barriers can’t be scaled below a minimum thickness
  – Barrier occupies 20%-25% of intended Cu cross-section area
• Surface Scattering
  – e- scattering from the surface
  – Increases as surface area to volume ratio increases
• Grain Boundary Scattering
  – e- scattering from the grain boundaries (G-bs)
  – Increases as grain size decreases

[Figure: TEM cross-section of narrow Cu interconnect with high-resistance diffusion barrier layer (TaN)]

W. Steinhogl et al, JAP, 2005

Cu Interconnect Resistivity

W. Steinhogl et al, JAP, 2005

$$\rho = \rho_o \left\{ \frac{1}{3} \bigg/ \left[ \frac{1}{3} - \frac{\alpha}{2} + \alpha^2 - \alpha^3 \ln\left(1 + \frac{1}{\alpha}\right) \right] + \frac{3}{8}\, C\,(1 - p)\, \frac{1 + AR}{AR}\, \frac{\lambda}{w} \right\}$$

The first term is grain boundary scattering; the second term is surface scattering.

$$\alpha = \frac{\lambda}{d_g}\, \frac{R}{1 - R}$$

d_g: distance between grain boundaries

• Reflectivity Coefficient (R): R = 0: Complete Transmission; R = 1: Complete Scattering
• Specularity Parameter (p): p = 0: Diffuse Scattering; p = 1: Specular Scattering

Scattering parameters (p, R) are independent of temperature

Temperature dependence results from ρ0 (bulk resistivity) and λ (mean free path of electrons in Cu)
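The resistivity model above is straightforward to evaluate numerically. In the sketch below, λ = 40 nm, p = 0.5, R = 0.2 and AR = 2 are taken from the following slide; ρ0 = 1.7 µΩ-cm, C = 1.2 and a grain size equal to the wire width are assumptions made for illustration.

```python
import math


def cu_resistivity(width_nm, rho0=1.7, mfp_nm=40.0, R=0.2, p=0.5,
                   AR=2.0, C=1.2, grain_nm=None):
    """Cu wire resistivity (same units as rho0) with grain boundary and surface scattering.

    Assumptions: rho0 = 1.7 uOhm-cm, C = 1.2, and grain size equal to the wire
    width unless grain_nm is given.
    """
    d_g = width_nm if grain_nm is None else grain_nm
    alpha = (mfp_nm / d_g) * R / (1.0 - R)
    if alpha == 0.0:
        grain_term = 1.0  # limit of the bracketed expression as alpha -> 0
    else:
        bracket = (1.0 / 3.0 - alpha / 2.0 + alpha ** 2
                   - alpha ** 3 * math.log(1.0 + 1.0 / alpha))
        grain_term = (1.0 / 3.0) / bracket
    surface_term = (3.0 / 8.0) * C * (1.0 - p) * (1.0 + AR) / AR * mfp_nm / width_nm
    return rho0 * (grain_term + surface_term)


for width in (22.0, 45.0, 90.0, 1000.0):
    print(width, round(cu_resistivity(width), 2))
```

The barrier layer effect shown in the plots is not part of this formula; it would further reduce the conducting cross-section and raise the effective resistivity.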

Cu Interconnect Resistivity

MFP of Cu ~ 40 nm at room temperature

[Figure: resistivity (µΩ-cm) of intermediate tier wires vs. technology node (90, 65, 45, 32, 22 nm) at 300 K, broken down into bulk resistivity, grain boundary scattering, surface scattering, barrier layer effect and total]
[Figure: resistivity [µΩ-cm] vs. metal width (10 to 1000 nm) for p = 0.5, R = 0.2, AR = 2, T = 300 K: background scattering (ρo), grain boundary scattering, surface scattering and combined model]

Im et al., IEEE TED, Dec. 2005

• Impact is worse for local wires and vias
• Increases wire delay, even in local wires

Local Wires….

Metal 1 dimensions and barrier layer thickness (ITRS 2005)

Technology Node | 65 nm | 45 nm | 32 nm | 22 nm
Metal 1 Width (nm) | 68 | 45 | 32 | 22
Metal 1 Height (nm) | 115.6 | 81 | 60.8 | 44
Aspect Ratio | 1.7 | 1.8 | 1.9 | 2
Barrier Thickness (nm) | 5.2 | 3.3 | 2.4 | 1.7

[Figure: resistivity (µΩ-cm) vs. technology node (65, 45, 32, 22 nm) at 300 K, broken down into bulk resistivity (ρo), grain boundary scattering, surface scattering, barrier layer effect and total]
[Figure: resistivity (µΩ-cm), 2 to 8, vs. temperature (K), 300 to 600, for the 65, 45, 32 and 22 nm technology nodes]

Banerjee et al., IEEE Nano-Net 2006.
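The table values can be used to estimate how much of the Metal 1 cross-section the barrier consumes. The sketch assumes a simple geometry in which the barrier lines both sidewalls and the bottom of the trench; that geometry and the names used are assumptions, not something stated in the table.

```python
# (width nm, height nm, barrier thickness nm) per technology node, from the table.
METAL1 = {
    65: (68.0, 115.6, 5.2),
    45: (45.0, 81.0, 3.3),
    32: (32.0, 60.8, 2.4),
    22: (22.0, 44.0, 1.7),
}


def barrier_fraction(width, height, barrier):
    """Fraction of the trench cross-section taken by the barrier.

    Assumes the barrier covers two sidewalls and the bottom of the trench.
    """
    cu_area = (width - 2.0 * barrier) * (height - barrier)
    return 1.0 - cu_area / (width * height)


fractions = {node: barrier_fraction(*dims) for node, dims in METAL1.items()}
for node, frac in fractions.items():
    width, height, _ = METAL1[node]
    print(node, round(height / width, 2), round(frac, 3))
```

Under this geometry the barrier takes a little under 20 % of the cross-section at every node, close to the lower end of the 20%-25% range quoted earlier for the barrier share.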

Implications….

• Current carrying capability (reliability/thermal)

• Interconnect performance (delay)

• Increasing IR-drop in power/ground distribution networks

• Other architecture level issues….

Thermal Implications

[Figure: thermal conductivity [W/(m-K)], 0 to 1.6, vs. dielectric constant, 1 to 4, for FSG, HSQ, CDO, polymer, MSQ and xerogel, with DEM and PWSM models for xerogel]
[Figure: FEM simulations at the 45 nm node; temperature labels 770 K and 378 K]

Im et al., IEEE TED, Dec. 2005

• Interconnect resistivity
• Current density
• ILD thermal conductivity

Cu interconnect temperature rises significantly due to self-heating….

Current Carrying Capability

• Electromigration lifetime: strongly reduces with temperature
• Limits maximum current carrying capacity….

[Figure: current density (MA/cm²), 0 to 18, vs. technology node (90, 65, 45, 30, 22 nm), duty ratio = 0.001: ITRS requirement compared with the maximum allowed J based on self-consistent (EM + self-heating) solutions]

• Significant deficit in current carrying capacity for local vias….
• Increasing via size and/or number will be expensive….

Srivastava and Banerjee, JOM 2004.

Need to Identify New Interconnect Material….

Future Interconnect Requirements: 2005 ITRS

Red areas: no known solutions! From 2014 onwards: Jmax > 1.06 x 10⁷ A/cm²

Circuit-Level Electrothermal Effects

Circuit Level ET Issues

Impact of temperature variations on buffered interconnect systems…..

Wason and Banerjee, ISLPED 2005

Temperature variation has a strong impact on both delay and leakage power…..

Circuit Level ET Issues

Impact of substrate thermal gradients….

Delay/skew analysis for non-uniform interconnect temperature

Ajami et al., DAC 2001

Direction dependence of thermal gradient….

An increasing thermal profile has better performance than a decreasing thermal profile (optimal wire sizing)

[Figure: T(x) vs. x for an increasing and a decreasing profile; the increasing profile is marked "Better"]
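The direction dependence can be illustrated with an Elmore delay estimate of a driver, a distributed RC wire and a load, where the wire resistance per unit length rises linearly with local temperature. All parameter values below are hypothetical, and the linear temperature profiles are an assumption made to keep the sketch short.

```python
def elmore_delay(temps_c, length_um=2000.0, r0=0.1, beta=0.004,
                 c=0.2e-15, r_driver=100.0, c_load=5e-15):
    """Elmore delay (s) of driver + wire + load for a per-segment temperature list.

    temps_c[0] is the segment next to the driver. Hypothetical parameters:
    r0 in ohm/um at 0 C, beta = temperature coefficient of resistance (1/C),
    c in F/um, r_driver in ohm, c_load in F.
    """
    n = len(temps_c)
    dx = length_um / n
    seg_cap = c * dx
    delay = r_driver * (n * seg_cap + c_load)
    for i, temp in enumerate(temps_c):
        seg_res = r0 * (1.0 + beta * temp) * dx
        downstream_cap = (n - i) * seg_cap + c_load  # capacitance this segment charges
        delay += seg_res * downstream_cap
    return delay


N = 100
rising = [50.0 + 50.0 * i / (N - 1) for i in range(N)]  # cool at driver, hot at load
falling = list(reversed(rising))                        # hot at driver, cool at load
uniform = [75.0] * N                                    # same average temperature

print(elmore_delay(rising), elmore_delay(uniform), elmore_delay(falling))
```

Segments near the driver charge the most downstream capacitance, so placing the hot (high-resistance) part of the wire near the driver costs more delay than placing it near the load, which is why the increasing profile comes out ahead.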

Implications of Substrate Thermal Gradients

Impact on delay estimation…..
• T1: positive exponential gradient
• T2: negative exponential gradient
• For a fixed T_Low

Impact on IR-drop analysis…..
• Worst-case voltage drop (VIR/Vdd) increases in the presence of thermal gradients

Ajami et al., TCAD 2005; Ajami et al., JAICSP, 2005

Impact on Buffer Insertion

Buffer movement in a 6660 um line (180 nm node)

[Figure: performance improvement (%), 0 to 18, vs. temperature gradient (C), 15 to 75, for the 0.18, 0.13 and 0.1 nodes]

Delay improvement after thermally-aware buffer insertion

Ajami et al., ICCAD 2001

Chip-Scale Electrothermal Effects

Thermal Challenges

Transistor scaling continues

• Leakage power is expected to be more than half of the chip power!! (source: Intel)
• Substantial thermal gradients are observed in high performance chips

[Figure: Montecito thermal map for TPC-C code (C. Poirier et al., ISSCC 2005)]

Impact of Heat

• System Performance and Stability
  – Higher temperature degrades performance
  – Heat-induced failure and instability
• Product Lifetime and Reliability
  – Most reliability mechanisms are highly temperature sensitive
• Operating Cost
  – More complex cooling solutions
  – Does NOT conform to consumers’ expectations

[Figure: cause of failure (M. Miller, AMD)]

Stronger Electrothermal Couplings

[Figure: coupling diagram linking supply voltage, threshold voltage, performance, switching power, leakage power, total power, temperature, device model, process variation, cooling cost and reliability]

Banerjee et al., IEDM 2003

Self-Consistent ET Analysis Tool

Inputs:
• Packaging / cooling model (boundary condition): realistic packaging structure of heatsink, heat spreader, TIM 2, TIM 1, core (die), substrate, socket and PCB
• Layout geometry & power dissipation (including active and leakage power)

Parabolic heat PDEs with electrothermal couplings

Output: 3-D electrothermally-aware spatial temperature estimation

[Figure: die power map over die width (µm), 0 to 10000, with color scale 0.6 to 1.8]

Lin and Banerjee, ICCAD 2006

Thermal Profile & Implications

Lin and Banerjee, ICCAD 2006

Application 1: Full-Chip Leakage Estimation

Case 1: Die-to-die channel length variations
Case 2: Case 1 + within-die variations
Case 3: Case 2 + die-to-die temperature variations

Zhang et al. ISLPED 2004

Die-to-die temperature variations significantly increase the leakage power

Application 2: Power-Performance Tradeoff

Low Power Design
• Lower supply voltage
• Higher threshold voltage

High Performance Design
• Higher supply voltage
• Lower threshold voltage

Need a design metric for optimization

Application 2 (cont.): Power-Performance Tradeoff

Vdd-Vth optimization using Energy-Delay Product (EDP) (Mark Horowitz, 1997)

Electrothermally coupled EDP

[Figure: EDP contours in the plane of supply voltage Vdd (V) vs. threshold voltage Vth (V), 0.2 to 0.5, showing the optimal EDP point, the Vdd = Vth boundary and the thermal runaway region (Lin et al., DAC 2004)]

• A shift in optimal point, EDP and performance contours
• An overall change in shape of EDP contours
• Operation region restricted by electrothermal constraints

Application 3: Leakage and Packaging Aware Design Space

[Figures: design space plots with supply voltage Vdd (V) axes]

Banerjee et al. IMAPS 2005

• While the leakage increases due to technology scaling or process variations, the operation region prohibited by thermal runaway expands
• Lowering of the junction temperature by employing advanced packaging and cooling techniques with lower thermal impedance (θj) will expand the design space

Allows circuit designers to comprehend reliability and packaging constraints……

Application 4 Thermally-Aware Design-Specific Optimization

Different metrics result in different optimization……

Metric: PT^μ

μ: ratio of the exponent of delay over the exponent of power

EDP: Energy-Delay Product (PT²), μ = 2
PDP: Power-Delay Product (PT), μ = 1
PEP: Power-Energy Product (P²T), μ = 0.5

How to choose a design-specific metric?
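How the choice of μ changes the preferred design can be seen with a few lines of code. The three design points below are hypothetical (power in W, delay in ns) and exist only to show that different exponents pick different winners.

```python
def metric(power, delay, mu):
    """Generalized power-delay metric P * T**mu (lower is better)."""
    return power * delay ** mu


# Hypothetical design points: (power in W, delay in ns).
DESIGNS = {
    "low power": (1.0, 4.0),
    "balanced": (1.5, 2.5),
    "high performance": (4.0, 1.5),
}


def best_design(mu, designs=DESIGNS):
    """Return the name of the design that minimizes P * T**mu."""
    return min(designs, key=lambda name: metric(*designs[name], mu))


for mu in (0.5, 1.0, 2.0):
    print(mu, best_design(mu))
```

Minimizing P·T^0.5 gives the same ranking as minimizing P²T, since the latter is the square of the former; this is why the power-energy product corresponds to μ = 0.5.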

Application 4 (cont.): Thermally-Aware Design-Specific Optimization

Metric: PT^μ

μ: ratio of the exponent of delay over the exponent of power

[Figure: optimal operating points in terms of supply voltage Vdd (V) for EDP: PT² (μ = 2), PDP: PT (μ = 1) and PEP: P²T (μ = 0.5)]

Lin et al. ICCD 2005

μ is bounded by thermal and performance requirements

Application 4 (cont.)

Impact of technology scaling

[Figure: optimal operation loci, labeled by µ (0.5 to 8), in the plane of supply voltage Vdd (V), 0.3 to 0.9, vs. threshold voltage Vth (V), 0.25 to 0.4, for the 100 nm and 70 nm technology nodes; the design space is restricted by the temperature limit (thermal runaway) and by the Vdd > Vth constraint or the required performance constraint]

Lin et al. ICCD 2005

• The optimal operation locus shifts right with technology scaling
• Design space gets increasingly restricted by thermal constraints

Application 5: Power and Thermal Management

Efforts on low power… without hurting performance

• Device Engineering
  – Enhanced channel mobility
  – Reduced gate leakage
  – High-K gate, nitrogen doped
• Circuit Level
  – Adaptive body-biasing, dual Vth
  – Sleep transistor, clock/power gating
• Micro-Architecture Level
  – Multi-core

Why Cooling

[Figure: power dissipation reduced by dual-Vth, circuit techniques and cooling]

Cooling is the Knob!

Device and Circuit Benefits

[Figure: log-scale plot (10³ to 10⁹) vs. operating temperature (°C), −40 to 100, for N-FET and P-FET PD-SOI (120 nm) and 45 nm N-MOSFET and P-MOSFET (Lin et al. IEDM 2005)]
[Figure: 9-stage inverter chain (S. Borkar, Intel)]

Lowering temperature:

• Enhances Ion to Ioff ratio
• Reduces propagation delay and variance
• Benefits back-end performance and reliability

Cooling Benefit-Cost Tradeoff

[Figures: power dissipation (W) plots (Lin et al. IEDM 2005)]

• Beyond a certain point, further cooling does not lead to any power saving
• The limit occurs at a lower temperature as technology scales

Hot-Spot Management

Global vs. localized cooling

• Global cooling (reduce θja by 20%): TMAX decreases but hot-spots remain
• Localized cooling (using thin-film TEC, size 0.8 mm x 0.8 mm)

[Figure: substrate temperature profile (°C), 118 to 130, over die length (µm) and die width (µm), 0 to 10000 (Lin et al. IEDM 2005)]

Localized cooling will be more effective for hot-spot management

Electrothermal Effects in Emerging Technologies

ET Issues in 3D ICs

[Figures: thermal gradient; Pchip (Watts) and Pleak (Watts)]

Banerjee et al. Proc. IEEE, 2001

Performance evaluation of 3D design must account for the negative impact of high temperature on all active layers

Implications for 3-D Processor-

Execution time per instruction and maximum chip temperature as a function of operating frequency for 2-D and 3-D chip for (a) highly memory intensive (mcf) application and (b) less memory intensive (twolf).

Loi et al., DAC 2006

Reliability and Current Carrying Capacity of Carbon Nanotubes

[Figures: graphene; single-walled nanotube (SWCNT); multi-walled nanotube (MWCNT)]

Wei et al. APL 79, 1172 (2001)

• Current density up to 10¹⁰ A/cm² without heatsink (not embedded in SiO2)
• Equivalent Au, Cu and Al wires deteriorate at 10⁷ A/cm²

Thermal Management with Carbon Nanotube Vias

[Figure: maximum interconnect temperature rise for a Cu interconnect stack with Cu vias compared to CNT bundle vias integrated with Cu interconnects]

Srivastava et al. IEDM 2005

For CNT bundles, the shaded region shows the range 1750 W/mK < Kth < 5800 W/mK

Impact of CNT Vias on Cu Interconnect Reliability and Performance

2 orders of magnitude improvement in the lifetime of Cu!

Srivastava et al. IEDM 2005

Hybridization of CNT vias with Cu lines can help extend the lifetime of Cu……

Conclusions

• Electrothermal effects are increasing at every level, from devices and interconnects to circuits and systems
  – Need careful modeling and optimization
• Electrothermal engineering is a critical need…
  – Temperature awareness at every level to optimize performance, power and reliability
  – Understand various couplings through an integrated approach
• Important for emerging technologies…