The Challenges of Correlating Silicon and Models in High Variability CMOS Processes

Rob Aitken ARM R&D [email protected]

Outline

§ Background
§ Variability
§ Characterization
§ Library validation
§ Debug examples
§ Conclusions

©2009 Rob Aitken

Benchmark Results

[Figure: Google Scholar citations (log scale, 100 to 100000) by benchmark (n10, n30, n50, n100, n300). Plot labels: optimal, unmodified route, opportunity.]

Physical IP is more than NAND gates

§ 15+ foundries
§ Up to 8 process generations per foundry
§ Up to 11 variants in each generation
§ 6+ memory generators
§ 900+ cells per library
§ 200+ I/Os
§ 10+ views per cell
§ Plus PHY, analog IP, etc.

§ 300+ libraries per year, each with thousands of individual data elements

Moore’s Law

§ Original Paper:
  § “Cramming more components onto integrated circuits”
  § Electronics, 38-8, 4/19/65
  § Tracks changes from 1959 to 1965 and predicts trend going forward
§ It’s still going…

Feedback: Moore’s Law and Consumer Expectations

Nintendo GameBoy (1989)

CPU: 8-bit Z-80 processor, 1.05 MHz
Screen: 2.6" 160 x 144 LCD, 4 b/w
Connectivity: 4 players by serial cable

Introductory price - $169

DSi (2008/2009)

CPU: ARM9™ (133 MHz), ARM7™ (33 MHz)
Screen: Two 3" 256 x 192 color LCDs
256MB Flash, AAC audio, 2 VGA cameras
Connectivity: Wifi, web browser, shopping

Introductory price - $169

>1000x performance for the same price

© Nintendo

Variability trends

§ Non-Gaussian behavior
§ Local spread close to global
§ Reduced correlation

[Figure: two scatter plots, each titled "1000 Samples of Variation" and labelled 90 and 45, showing Idsat versus Leakage for global and local variation.]

Background

§ Classes of test chip:
  § ARM has multiple classes of test chip
    § Library qualification chips
    § Processor qualification chips
    § Experimental test chips
§ Chips:
  § 2 32nm tapeouts since 9/08
  § ~40 tapeouts in 2008, mainly first group
  § Part of shuttle or multi-project wafer
  § Usually 40-100 packaged chips
§ Challenge:
  § Silicon validation of library elements (standard cells, memory, IO)

Background cont.

§ The library qualification test chip program objectives
  § Verify functionality
  § Validate new architectures in silicon
  § Provide silicon correlation for timing and power measurements
§ Objectives for other programs (not these chips)
  § Serve as yield predictors or measure defect density
  § Evaluate transistor properties or develop SPICE models
  § Study lithography issues
  § Determine the statistical properties of a process (FEOL or BEOL)
  § Measure reliability

Library characterization history

[Figure: cell count, disk usage and CPU usage by process node (.35um, .25um, .18um, .15um, .13um, 90nm, 65nm, 45nm); vertical axis 0 to 1000.]

Characterization History

§ Three delay numbers: slow, typ, fast
  § Just too inaccurate
§ Linear delay: f(cap)
  § What about slew rates?
§ NLDM: table f(cap, slew), interpolate between points
  § Which table? How many points?
  § Multiple voltage support complicated
  § Complex interconnects not modeled well
§ Current source models (CCS, ECSM)
  § More complex modeling of device behavior
  § Allows for more accuracy, especially at intermediate points
  § Giant files
§ Statistical models…
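The NLDM step, "table f(cap, slew), interpolate between points", can be sketched as a bilinear lookup. This is a minimal illustration, not the interpolation any particular timing tool uses; the function names, the axis values and the 3x3 delay table are all hypothetical.

```python
from bisect import bisect_right


def bracket(axis, x):
    """Index of the lower grid point of the segment used for x.

    Values outside the axis reuse the nearest edge segment,
    which amounts to linear extrapolation.
    """
    i = bisect_right(axis, x) - 1
    return max(0, min(i, len(axis) - 2))


def nldm_lookup(slews, caps, table, slew, cap):
    """Bilinear interpolation of table[i][j], indexed by slews[i] and caps[j]."""
    i = bracket(slews, slew)
    j = bracket(caps, cap)
    ts = (slew - slews[i]) / (slews[i + 1] - slews[i])
    tc = (cap - caps[j]) / (caps[j + 1] - caps[j])
    # Interpolate along the load axis on both bracketing slew rows...
    low = table[i][j] + tc * (table[i][j + 1] - table[i][j])
    high = table[i + 1][j] + tc * (table[i + 1][j + 1] - table[i + 1][j])
    # ...then along the slew axis.
    return low + ts * (high - low)


# Hypothetical 3x3 table: input slew in ns, load in pF, delay in ns.
SLEWS = [0.01, 0.05, 0.20]
CAPS = [0.001, 0.010, 0.100]
DELAYS = [
    [0.016, 0.031, 0.161],
    [0.020, 0.035, 0.166],
    [0.032, 0.048, 0.180],
]

if __name__ == "__main__":
    print(nldm_lookup(SLEWS, CAPS, DELAYS, slew=0.03, cap=0.005))
```

The questions on the slide ("Which table? How many points?") show up directly here: the result at an intermediate point depends only on the four surrounding grid values, so the choice of index points sets the interpolation error.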

How much accuracy do you need?

§ Depends on tools, stage of design flow
§ Need for accuracy varies
  § Preliminary floorplanner needs different accuracy than final extraction flow
§ Interconnect modeling needs good, but not perfect accuracy
  § 3D field solver not required, but need more than interpolation between table data points, especially for complex shapes, long distances
§ Accuracy is important, but needs to be defined correctly

"0.015667, 0.020832, 0.030359, 0.048810, 0.086296, 0.161212, 0.311795", \
"0.017252, 0.023636, 0.033981, 0.052357, 0.089780, 0.164684, 0.315177", \

These are not “within X% of SPICE”, they are SPICE! So are these.

values("7.117695e-03, 1.329720e-02, 3.667810e-02, 4.492980e-02, 5.068780e-02, 5.180790e-02, 5.086270e-02, 4.749920e-02, 5.026640e-02, 4.425650e-02, 2.832520e-02, 2.129720e-02, 1.605390e-02, 9.854060e-03, 5.124280e-03, 2.224300e-03, 1.424661e-03");

Accuracy and precision: NLDM interpolation versus transistor variability

§ Interpolation accuracy: 2-3%
§ Temperature, voltage variation handled with current source models
§ Variability: SS versus TT: -20% to 40%

[Figure: two 3D surface plots, labelled SS and TT.]

“Scaling” of standard cell delay

§ Relative delay for equivalent cells in two different technologies A and B
§ Is A slower than B or faster?

[Figure: surface plot of the performance ratio A to B (40% to 200%) versus load (scaled) and transition (scaled); legend bands from 0.4-0.6 to 1.8-2.]

Silicon Validation of Libraries

§ Basic idea
  § Measure silicon, compare with model prediction
§ Things to measure
  § Delay
  § Power
    § Leakage
    § Dynamic
§ Challenges
  § Where does silicon fit in “corners”
  § Measurement accuracy
  § Single point versus table
  § Model versus SPICE
  § SPICE versus silicon
  § Parametric variation
  § Presence of “soft” defects

Overcoming challenges

§ Challenges
  § Where does silicon fit in “corners”
    § Oscillator data, test structure data
  § Measurement accuracy
    § Understand equipment, measure deltas
    § Big challenge for power
  § Single point versus table
    § Carefully select design point
    § Shmoo across voltage, temperature
  § Model versus SPICE
    § Understand characterization issues
  § SPICE versus silicon
    § Work with foundries to understand
  § Parametric variation
    § Design around local variation
  § Presence of “soft” defects
    § Look for trends across chips

Validation in practice

[Figure: measured values (vertical axis 110 to 140) for samples 1 to 16.]

§ Variability observed for similar objects across chips
§ “Correct” value somewhere in the middle
§ Does this validate it?
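One way to make "somewhere in the middle" concrete is to express the model prediction relative to the measured spread. The sketch below is illustrative only: `compare_to_silicon` is a hypothetical helper and the sixteen measurements are invented values in the range of the plot, not data from the talk.

```python
import statistics


def compare_to_silicon(measured, predicted):
    """Summarize where a model prediction sits in the measured spread."""
    mean = statistics.mean(measured)
    sigma = statistics.stdev(measured)  # sample standard deviation
    return {
        "mean": mean,
        "sigma": sigma,
        "offset_pct": 100.0 * (predicted - mean) / mean,
        "z": (predicted - mean) / sigma,
        "inside_range": min(measured) <= predicted <= max(measured),
    }


# Hypothetical measurements of the same structure on 16 chips.
MEASURED = [118, 131, 124, 127, 121, 135, 129, 123,
            126, 119, 133, 125, 128, 122, 130, 127]

if __name__ == "__main__":
    for key, value in compare_to_silicon(MEASURED, predicted=126.0).items():
        print(key, value)
```

A prediction that lands inside the measured range with a small z value is consistent with the silicon, which is weaker than saying the model has been validated.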

How Close is it to SPICE?

§ Significant difference between simulators observed
§ Variety of issues represented
  § Model file interpretation
  § Performance options
  § Extraction issues
  § Silicon variability

[Figure: Delay versus Silicon (50% to 150%) for instances 1 to 23, Simulator 1 and Simulator 2.]

Remember this the next time some tool claims to be within X% of SPICE

Number of Contacts (Yield vs Speed)

§ Typical questions after early data collection (small number of units)
§ 3% shift in mean but what is mean - 3σ?

[Figure: Units 208/304/501 contact experiments. Stage delay, normalized (92% to 108%), for oscilator_inv_x1_cont11, oscilator_inv_x1_cont12, oscilator_inv_x1_cont21 and oscilator_inv_x1_cont22.]
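The "what is mean - 3σ?" question is hard to answer from a handful of units because σ itself is poorly known. A minimal sketch, assuming roughly normal data: `small_sample_summary` is a hypothetical helper, the three delays are invented, and the relative uncertainty of σ uses the common approximation 1/sqrt(2(n-1)).

```python
import math
import statistics


def small_sample_summary(delays):
    """Mean, sigma and 3-sigma points, plus how uncertain they are for n samples."""
    n = len(delays)
    mean = statistics.mean(delays)
    sigma = statistics.stdev(delays)  # sample standard deviation
    return {
        "n": n,
        "mean": mean,
        "sigma": sigma,
        "mean_minus_3sigma": mean - 3 * sigma,
        "mean_plus_3sigma": mean + 3 * sigma,
        # Standard error of the mean.
        "sem": sigma / math.sqrt(n),
        # Approximate relative standard error of sigma (normal data assumed).
        "sigma_rel_err": 1 / math.sqrt(2 * (n - 1)),
    }


if __name__ == "__main__":
    # Hypothetical normalized stage delays from three units.
    print(small_sample_summary([0.98, 1.00, 1.02]))
```

With three units the estimate of σ is uncertain by about half its value, so a 3% shift in the mean says little about where the 3σ tail has moved.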

Min VDD for Different Design Styles

[Figure: Min VDD (0.46 to 0.54) versus instance class (1 to 5) for Design 1, Design 2, Design 3 and Design 4.]

§ Question: Are any of these designs better?
§ Statistically: No
§ But: More data might give the edge to design 4
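"Statistically: No" can be checked with a significance test. The slides do not say which test was used; the sketch below swaps in an exact two-sided permutation test on the difference of means, which needs no distribution assumption and is cheap for five points per design. The Min VDD values are hypothetical.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(a, b):
    """Exact two-sided permutation test on the difference of means."""
    pooled = list(a) + list(b)
    n_a, n_b = len(a), len(b)
    observed = abs(mean(a) - mean(b))
    total = sum(pooled)
    extreme = 0
    count = 0
    # Enumerate every way of relabelling n_a of the pooled values as "design A".
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        if diff >= observed - 1e-12:  # tolerance for floating-point ties
            extreme += 1
        count += 1
    return extreme / count


# Hypothetical Min VDD (V) per instance class for two designs.
DESIGN_1 = [0.51, 0.50, 0.52, 0.49, 0.51]
DESIGN_4 = [0.49, 0.50, 0.48, 0.50, 0.49]

if __name__ == "__main__":
    print(permutation_p_value(DESIGN_1, DESIGN_4))
```

A large p-value means the observed gap is what random relabelling would produce anyway; more samples per design are needed before one can claim the edge.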

Variability and Validation

[Figure: Delay Correlation vs. Circuit Distance (SEC, RVT, -40C, 1.08V, H2L). Correlation coefficient (0.955 to 1.005) versus circuit block distance (0 to 2000 um).]

§ Look for correlation
§ 65nm data
§ Result: small, but measurable, distance-based effect observed
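Each point on such a plot is a correlation coefficient between the delays of two circuit blocks, computed across chips. A minimal sketch of that calculation, using the standard Pearson coefficient; the per-chip delays and the block names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical delays (ps) of a reference block and of blocks placed
# near and far from it, measured on the same six chips.
REFERENCE = [101.0, 98.5, 103.2, 99.1, 100.4, 102.0]
NEAR = [101.2, 98.4, 103.0, 99.3, 100.5, 101.8]
FAR = [100.6, 98.9, 102.5, 99.8, 100.9, 101.1]

if __name__ == "__main__":
    print("near:", pearson(REFERENCE, NEAR))
    print("far: ", pearson(REFERENCE, FAR))
```

Repeating this for block pairs at increasing separation gives correlation as a function of distance.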

Sources of Variability

§ Lithography
  § Line edge roughness
  § CD variation
  § Influence of neighbors
§ Device
  § Well boundary effects
  § Variation between N and P
  § Stress/strain effects
§ Interconnect
  § Dielectric variation
  § Via/contact quality
  § Metal width/height variation
§ Deterministic versus Random

Effects of Variability

$I_d \approx \mu \cdot C_{ox} \left( \frac{W}{L} \right) (V_{gs} - V_t)^{\alpha}$

§ Leakage
  § Variation in L, Vt, µ, tox
§ Performance
  § Changes in L, W, R, C, Vt, µ
§ Min VDD
  § Changes in Vt, L, W
  § SRAM bit cell main limiter
§ Dynamic power
  § Changes in C
  § Side effect of changes in performance, leakage
§ Yield
  § Indirect result of others
  § Parameter goes beyond spec + tolerance

[Figure: parameter distribution with +3σ and -3σ limits marked.]
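The drain-current expression on this slide has the form of the alpha-power law, and evaluating it shows how shifts in Vt or L move the current. This is a sketch only: the function names and every nominal value (including α = 1.3 and the normalized µ·Cox) are assumptions chosen for illustration, not process data.

```python
def drain_current(mu_cox, w, l, vgs, vt, alpha):
    """Alpha-power-law drain current: mu*Cox * (W/L) * (Vgs - Vt)**alpha."""
    if vgs <= vt:
        raise ValueError("expression only applies above threshold (Vgs > Vt)")
    return mu_cox * (w / l) * (vgs - vt) ** alpha


def current_ratio(nominal, **shifted):
    """Drain current with some parameters overridden, relative to nominal."""
    return drain_current(**{**nominal, **shifted}) / drain_current(**nominal)


# Hypothetical normalized operating point.
NOMINAL = dict(mu_cox=1.0, w=1.0, l=1.0, vgs=1.0, vt=0.3, alpha=1.3)

if __name__ == "__main__":
    print("Vt +30 mV:", current_ratio(NOMINAL, vt=0.33))
    print("L  +5%:   ", current_ratio(NOMINAL, l=1.05))
    print("Vgs 0.6 V, Vt +30 mV:",
          drain_current(**{**NOMINAL, "vgs": 0.6, "vt": 0.33})
          / drain_current(**{**NOMINAL, "vgs": 0.6}))
```

The last line illustrates why Min VDD is so sensitive to Vt: the same Vt shift removes a larger share of the gate overdrive as the supply drops.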

Local Variation Dominates in VDSM

[Figure: "Histogram of Leakage" and "Histogram of Idsat", each with local and global series. 1000 Monte Carlo samples, 45nm technology.]

§ Local variation (within chip) is nearly as much as global variation (between chips) at 45nm
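The split between local and global variation can be illustrated with a toy Monte Carlo in which each device value is a nominal plus one per-chip (global) offset plus one per-device (local) offset. This is a sketch under a simple additive Gaussian assumption, not the foundry's statistical model; the function names and sigma values are hypothetical.

```python
import random
import statistics


def simulate(n_chips, n_devices, sigma_global, sigma_local, nominal=1.0, seed=0):
    """Return per-chip lists of device values with global and local variation."""
    rng = random.Random(seed)
    chips = []
    for _ in range(n_chips):
        chip_offset = rng.gauss(0.0, sigma_global)  # shared by the whole chip
        chips.append([nominal + chip_offset + rng.gauss(0.0, sigma_local)
                      for _ in range(n_devices)])
    return chips


def spreads(chips):
    """Estimate (global, local) spread from simulated or measured data."""
    # Global: how much the chip averages move from chip to chip.
    global_sigma = statistics.stdev([statistics.mean(c) for c in chips])
    # Local: typical spread of devices within one chip.
    local_sigma = statistics.mean([statistics.stdev(c) for c in chips])
    return global_sigma, local_sigma


if __name__ == "__main__":
    # Equal sigmas mimic "local nearly as much as global".
    g, l = spreads(simulate(200, 20, sigma_global=0.05, sigma_local=0.05))
    print("global:", g, "local:", l)
```

When the two spreads are comparable, a single global corner no longer bounds the behavior of devices on one chip.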

Critical Defect Behavior

[Figure: output waveforms for defect resistances 1720, 1730, 1750, 1770, 1800 and 2000, plus a good device, with nominal delay and worst-case delay marked.]

Resistance (Ω)   1700   1720    1730    1750    1800    2000   3000
Delay            SA1    600ps   400ps   250ps   150ps   70ps   <10ps

Critical Variability

§ Designed for a predetermined operating point, plus margin
  § Example: SS Corner, 0.9V, 125C, 0 slack
§ Case 1: TT silicon, 0.9V, 85C
  § >2X nominal delay will still function correctly
§ Case 2: SS silicon, 0.9V, 85C
  § ~20% extra delay will cause failure
§ Influential factors:
  § Process, voltage, temperature, slack, noise
§ Expected silicon distribution: SS < 1%, TT(+/-) > 50%
  § Might expect a 1-2% yield hit if SS corner just misses timing
§ Guaranteed silicon distribution: None (usually)
  § Could wind up with no yield 1-2% of the time (1 week per year)
  § Or worse…
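The arithmetic behind the two cases and the "1 week per year" remark can be sketched as follows. The path delays are hypothetical numbers chosen only to reproduce the ratios quoted on the slide; the function names are illustrative.

```python
def tolerable_delay_factor(signoff_delay, actual_delay):
    """How far a path can slow down before it misses the signed-off cycle.

    signoff_delay: path delay at the sign-off point with 0 slack
                   (so it equals the usable clock period).
    actual_delay:  delay of the same path on the silicon being tested.
    """
    return signoff_delay / actual_delay


def zero_yield_weeks(fraction_of_time, weeks_per_year=52):
    """Weeks per year affected if lots miss timing this fraction of the time."""
    return fraction_of_time * weeks_per_year


if __name__ == "__main__":
    SIGNOFF = 1.00       # ns at SS, 0.9V, 125C (hypothetical)
    TT_AT_85C = 0.45     # ns, hypothetical typical silicon
    SS_AT_85C = 0.83     # ns, hypothetical slow silicon
    print("Case 1 factor:", tolerable_delay_factor(SIGNOFF, TT_AT_85C))
    print("Case 2 factor:", tolerable_delay_factor(SIGNOFF, SS_AT_85C))
    print("weeks at 2%:  ", zero_yield_weeks(0.02))
```

With these assumed delays, Case 1 tolerates more than 2X the nominal delay while Case 2 fails after about 20% extra, and 2% of a 52-week year is roughly one week.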

Outline

§ Background
§ Variability
§ Characterization
§ Library validation
§ Debug examples
§ Conclusions

Key to Memory Debug: Bit Mapping

§ Pass/Fail information is adequate for production testing
§ For yield improvement, debug, system bring-up, etc., it is also useful to be able to identify each failing address and data pattern
  § Need logical to physical mapping for this also
§ Some common patterns are shown below:

[Figure: bit map patterns: Single Cell, Entire Column, Half Row.]
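The two steps, logical to physical mapping and recognizing a failure pattern, can be sketched as below. The column-mux layout in `logical_to_physical` is an assumed example (real memories use generator-specific scrambling), and `classify` covers only the three patterns named on this slide.

```python
def logical_to_physical(address, bit, mux=4):
    """Map a logical (address, data bit) to a physical (row, column).

    Assumed layout: each data bit owns `mux` adjacent columns, and the low
    address bits select among them. Real arrays differ.
    """
    row = address // mux
    col = bit * mux + address % mux
    return row, col


def classify(fails, n_rows, n_cols):
    """Name the pattern formed by a set of failing (row, col) cells."""
    rows = {r for r, _ in fails}
    cols = {c for _, c in fails}
    if len(fails) == 1:
        return "single cell"
    if len(cols) == 1 and len(rows) == n_rows:
        return "entire column"
    half = n_cols // 2
    if len(rows) == 1 and cols in (set(range(half)), set(range(half, n_cols))):
        return "half row"
    return "other"


if __name__ == "__main__":
    # Hypothetical fail log: every address fails on data bit 0 with
    # address % 4 == 3, which maps to one physical column in a 4 x 8 array.
    log = [(addr, 0) for addr in (3, 7, 11, 15)]
    cells = {logical_to_physical(a, b) for a, b in log}
    print(classify(cells, n_rows=4, n_cols=8))
```

Without the mapping step the same failures look like scattered addresses; in physical coordinates they line up as one column.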

Learning From Bit Maps

Vertical Pair: Bit Line Contact
Partial Column: Resistive Bit Line Short
Multi-Row: Address Decoder
Swath: CMP Scratch
Entire Bit: Sense amp, I/O
Catastrophic: Timing circuit

What can happen?

§ Temperature related leakage problem
§ Root causes
  § 5-10X worst case leakage
  § Circuit marginality

[Figure: normalized worst case sub-threshold current (log scale, 0.001 to 1) versus temperature (-20 to 80).]

Lithography troubles

§ Early silicon may be affected by incomplete or improper processing
§ This is less common later in process cycle (memory optimized first)
§ May still be an issue for logic

Power Design

§ The good and the bad of power connection
§ Memory provider can only account for so much!
§ More margin = less performance

Analog problem: Read disturb fault

[Figure: data node voltage waveform; 1st read 250 mV, 2nd read 520 mV.]

§ Data node voltage increases with successive reads
§ Given time, settles back to zero
§ Root cause: defective ground contact

Conclusions

§ Validation is more complicated than you’d think
§ Variation increasingly important
§ Understanding sources of variation helps
§ An effective debug methodology helps when new troubles arise
  § Tools
  § Infrastructure (silicon, equipment, software)
  § Experience

§ Feel free to send questions to [email protected]
