TRADEOFFS BETWEEN PERFORMANCE AND
RELIABILITY IN INTEGRATED CIRCUITS
by
DANIEL J. WEYER
Submitted in partial fulfillment of the requirements
for the degree of Doctor of Philosophy
Department of Electrical Engineering and Computer Science
CASE WESTERN RESERVE UNIVERSITY
May 2019
CASE WESTERN RESERVE UNIVERSITY
SCHOOL OF GRADUATE STUDIES
We hereby approve the dissertation of Daniel J. Weyer
candidate for the degree of Doctor of Philosophy*.
Committee Chair Dr. Christos Papachristou
Committee Member Dr. Phillip Feng
Committee Member Dr. Soumyajit Mandal
Committee Member Dr. Francis Merat
Committee Member Dr. Daniel Saab
Date of Defense
April 9, 2019
*We also certify that written approval has been obtained
for any proprietary material contained therein.
Contents
List of Tables v
List of Figures ix
Acknowledgments xx
Abstract xxii
1 Introduction
1.1 Problem Statement
1.2 Outline
2 Reliability of ICs/ASICs
2.1 Reliability analysis
2.1.1 Time To Failure
2.2 Continuous Distributions
2.2.1 Exponential Distribution
2.2.2 Normal (Gaussian) Distribution
2.2.3 Lognormal Distribution
2.2.4 Weibull Distribution
2.2.5 Gamma Distribution
2.3 Discrete Distributions
2.3.1 Poisson Distribution
2.3.2 Chi-Square Distribution
3 Aging Mechanism of CMOS Devices
3.0.1 xBTI = NBTI and PBTI
3.0.2 Hot Carrier Injection (HCI)
3.0.3 TDDB
3.0.4 Electromigration (EM)
3.0.5 Industry Standard Reliability Calculations
3.0.6 Summary of IC/ASIC Aging/Degradation Models
4 Interconnects
4.1 Metal and Dielectric Diffusion
4.2 Electromigration Failure in Interconnects
4.3 Reliability and Electromigration (EM)
4.4 Interconnect Resistance
4.4.1 Interconnect Scaling
4.4.2 Dennard scaling relationship to Moore’s law
4.4.3 ITRS scaling
4.4.4 Interconnect Dielectric scaling
4.4.5 Electromigration (EM) in interconnect wires
4.5 ITRS Interconnect Scaling challenges
4.5.1 Barrier Metal
4.5.2 Inter-metal Dielectrics (ILD)
4.5.3 Interconnect delay challenges
4.5.4 Industry standards for EM
4.5.5 EM percent lifetime equations
4.6 Electromigration Copper Model parameters
4.7 Future of Interconnects
4.7.1 Cobalt
4.7.2 Ruthenium
5 Trade offs for Lifetime versus Performance
5.1 Electromigration tradeoffs
5.1.1 Current density (j) effects on gate delay
5.2 Synopsys Process Development Kit
5.2.1 Interconnect Resistivity
5.2.2 Synopsys 28/32 PDK EM tradeoffs by Imax
6 Lifetime Driven Design Methodology
6.1 OpenCore Amber 25 Microprocessor
6.2 High Level Description of the Methodology
6.3 Detailed Description of the Methodology
7 Results
7.1 Layout
7.2 Amber 25 Layout analysis
7.3 EM analysis
8 Summary
9 Discussion
10 Acronyms
11 Appendix
11.1 Tool Versions
11.2 Amber 25 Testbench Source
11.3 ICC TCL reporting
11.4 Python script to generate ALF file
11.5 ITF (Interconnect Technology File)
11.6 StarXtract
11.7 Hspice Simulations
11.7.1 Lifetime Comparisons
11.7.2 Hspice Resistance Sweeps
11.7.3 Clock Buffer hspice Simulations
11.8 Semiconductor properties
11.9 Nernst-Einstein relationship of drift velocity
11.10 Intel and Synopsys 90nm parameters
12 Published Papers
List of Tables
4.1 Interconnection roadmap for scaling. [IRDS, 2016]
4.2 Interconnect, etc. difficult challenges. [IRDS, 2016]
4.3 Resistivity and temperature coefficient at 20 °C [GSU, 2017]
4.4 Resistivity temperature variation for bulk pure metals
4.5 Intel Resistivity Roadmap
4.6 Intel 130nm process
4.7 Dennard Scaling
4.8 Intel Node Scaling
4.9 Intel Jmax and ITRS Imax Scaling
4.10 Intel and ITRS predicted Resistivity
4.11 ITRS Jmax and Imax Scaling
4.12 Intel Jmax and Imax Scaling
4.13 Dennard Reliability effects
4.14 Dennard Reliability effects on EM
4.15 Interconnect Dielectric
4.16 Interconnect’s Dielectric by Node
4.17 EM Terms
4.18 Projected electrical specifications of logic core device. [IRDS, 2016]
4.19 Projected power-performance-area (PPA) metrics of functional datapath. [IRDS, 2016]
4.20 IC/ASIC Aging Properties
4.21 Black’s law Regression
4.22 Black’s law Regression M2
4.23 M2 CDF
4.24 Black’s Law from Tan and Vairagar
4.25 Black’s Law t50 for various Temperatures
4.26 Jmax for various Temperatures and Failure rates
5.1 Tradeoff for n = 1 and α = 1, and for n = 2 and Sakurai-Newton α = 1.2
5.2 Temperature versus Performance and Lifetime. Lowering the temperature has a more pronounced impact on lifetime than on performance.
5.3 Synopsys PDK3D45: Metal Layers
5.4 Comparison of Synopsys 28/32nm to the Intel 32nm process
5.5 BEOL for 28G and 28LP Products for Global Foundries Cu interconnects [Augur et al., 2012]
5.6 ILD parameters for Synopsys 28/32 PDK
5.7 Synopsys 28/32nm ILD parameters
5.8 M1 dimensions for Synopsys 28/32 PDK
5.9 Interconnect resistance parameters for Synopsys 28/32 PDK
5.10 Via resistivity parameters for 25 °C [Lin et al., 2007]
5.11 Via resistivity calculated for different Temperatures [Lin et al., 2007]
5.12 Dielectric and Capacitance parameters for Synopsys 28/32nm PDK
5.13 EM parameters for Synopsys 28/32nm PDK
5.14 Synopsys 28/32nm Tsubstrate temperature rise based on Tambient and percent rise in temperature of the IC/ASIC
5.15 Imax, Imaxw and Jmax for various Tuse temperatures
5.16 Synopsys 28/32 nm PDK data for FEM EM temperature derating
6.1 Comparison of Lifetime calculations between ICC and hspice
6.2 Synopsys 28/32 nm PDK data for the EM function FEM temperature derating
6.3 Synopsys 28/32 nm PDK data for the EM function FEM temperature derating
6.4 Synopsys 28/32 nm PDK data for the EM function FEM temperature derating
6.5 Calculations for EM I and J values for metal layers from Synopsys 28/32nm PDK
6.6 Calculations for EM I and J values calculated for Vias
7.1 Design Exploration (Max performance - meets timing)
7.2 Design Exploration (Increment the clock period by 1 ns)
7.3 Design Exploration (Iterate until 15 year target met)
7.4 Design Exploration (Iterate until 15 year target met)
7.5 Design Exploration (Power 15 year target met)
10.1 Acronyms
11.1 Tool Versions
11.2 Percent change in tp for the 28/32nm inverter with ±40 change in interconnect resistance
11.3 Delay change in tp for a typical 28/32nm inverter with ±40 change in interconnect resistance
11.4 Iomax change for a typical 28/32nm inverter with ±40 change in interconnect resistance
11.5 Percent change in tp for the 90nm inverter with ±40 change in interconnect resistance
11.6 Delay change in tp for a typical 90nm inverter with ±40 change in interconnect resistance
11.7 Iomax change for a typical 90nm inverter with ±40 change in interconnect resistance
11.8 Capacitance load effect on power, Jmax and Lifetime
11.9 Semiconductor Properties
11.10 Boundary Conditions
11.11 Intel: Metal Layers
11.12 Synopsys 90nm PDK Metal Layers
11.13 Synopsys 90nm PDK simplified Metal Layers
11.14 Synopsys 90nm PDK capacitance for Metal Layers
List of Figures
1.1 Perspective on sizes. [Bohr, 2014]
2.1 The IC/ASIC Shrinking Bathtub
2.2 Aging effects of a CMOS inverter [Bafleur and Perdu, 2016]
2.3 Typical FIT rates of electronic components and trend due to aging or degradation. [Hillman, 2009]
2.4 Total guard banding of 15% is large. Blue shows the faults due to process variation. Red shows aging degradation [Alam et al., 2008].
2.5 Warranty Costs
2.6 PDF f(t). f(t) represents the probability of finding a device failure between t and t+dt [McPherson, 2013]
2.7 CDF F(t). F(t) represents the fraction of the population that failed [policeanalyst.com, 2012]
2.8 Comparison of PDF, CDF and R [Andy, 2018]
2.9 The Exponential PDF [Wikipedia, 2017b]
2.10 The Exponential CDF [Wikipedia, 2017b]
2.11 PDF f(t) for Normal Distribution [Kapur and Pecht, 2014]
2.12 PDF f(t) of the Log normal Distribution for σ = 0.1 and σ = 0.5. [Kapur and Pecht, 2014]
2.13 The Weibull PDF [Wikipedia, 2017d]
2.14 The Weibull distribution showing use in Reliability [Spinato et al., 2009]
2.15 The Weibull CDF [Wikipedia, 2017d]
2.16 The Weibull distribution plot in terms of Weibits [McPherson, 2013]
2.17 The Weibull distribution plot in terms of Weibits [McPherson, 2013]
2.18 Comparison of the PDF, CDF and hazard functions for Exponential, Normal, lognormal and Weibull [Industrial-Electronics, 2017]
2.19 The Gamma function for real values of α [Pishro-Nik, 2017]
2.20 PDF for the Gamma Distribution for values α and λ [Pishro-Nik, 2017]
2.21 The Poisson PMF [Wikipedia, 2017c]
2.22 The Poisson CDF [Wikipedia, 2017c]
2.23 The Chi-Square PDF [Wikipedia, 2017a]
2.24 The Chi-Square CDF [Wikipedia, 2017a]
2.25 Chi Square Statistic [Berman, 2017]
3.1 Threshold voltage shifts as a function of stress time under NBTI vs channel length [Yan-Rong et al., 2010]
3.2 CMOS Inverter xBTI stress in a design [Shiyanovskii et al., 2009c]
3.3 Silicon Lattice interface at Gate Oxide [Shiyanovskii et al., 2009c]
3.4 Illustrating the effects of PMOS NBTI effect on a CMOS inverter under stress [Shiyanovskii et al., 2009c]
3.5 Measurement of NBTI at different temperatures
3.6 HCI effects: CHE (Vd = Vg), DAHC (Vd = 2Vg), SGHE (Vd > Vg), SHE (|Vsub| >> 0) [Shiyanovskii et al., 2009c]
3.7 A. Dielectric degradation occurs due to broken bonds/trap-creation in the dielectric material and at the SiO2/Si interface [McPherson, 2013]
3.8 Poly Short due to TDDB [McPherson, 2013]
3.9 The four models’ best fits to the same set of accelerated TDDB data [McPherson, 2013].
4.1 Interconnection distribution. [Borkar, 1999]
4.2 Diagram showing an ideal (sharp) interface of the metal and dielectric materials [Balasinski, 2016]
4.3 Diagram showing a diffused metal-dielectric interface after the penetration of the metal into the dielectric [Balasinski, 2016]
4.4 Diagram showing the metal dielectric interface with an energy diagram showing how metal atoms diffuse out of the metal matrix into the dielectric [Balasinski, 2016]
4.5 Negative heat of oxide formation per oxygen atom in various metals [Balasinski, 2016]
4.6 Cross section of the Al SiO2 interface [Balasinski, 2016]
4.7 Failures in a damascene line. (a) Failure dominated by the void nucleation phase. (b) Failure dominated by void nucleation migration and growth [Orio, 2010].
4.8 EM lifetime variation as a function of the interconnect dimensions [Orio, 2010].
4.9 Active layers FEOL, MOL Local interconnect Li and BEOL Metal interconnect
4.10 Experiment and model of lifetime scaling versus interconnect geometry (∆Lcr). [IRDS, 2016]
4.11 Evolution of Jmax (from device performance) and JEM (from targeted lifetime). [IRDS, 2016]
4.12 Interconnection distribution.
4.13 Comparison of the manufacturing process step differences between Al and Cu. [Khan and Kim, 2011]
4.14 Copper dual-damascene fabrication process: Via patterning and Via and trench patterning [Orio, 2010].
4.15 Copper dual-damascene: Barrier layer deposition and Cu seed deposition. Cu electroplating and excess removal by chemical mechanical polishing [Orio, 2010].
4.16 Copper dual-damascene: Capping layer deposition [Orio, 2010].
4.17 Interconnect dimensions.
4.18 3 wire segments with different dimensions and branches.
4.19 Illustrating different materials vias and interconnect wires.
4.20 TCR (α) versus line width. [Guillaumond et al., 2003]
4.21 TCR (α) versus line width. [Huang et al., 2008a]
4.22 Copper grain boundaries [Sun, 2009].
4.23 Grain boundary scattering [Cornelius and Toimil-Molares, 2010].
4.24 Surface scattering [Cornelius and Toimil-Molares, 2010].
4.25 Resistivity of Cu versus thickness; all surface scattering elastic, ρ = 1 [Yarimbiyik et al., 2006].
4.26 The effects on Cu resistivity of line width scaling due to scattering [Saraswat, 2003].
4.27 The resistivity of copper wire as a function of line width. The total resistivity is from scattering at the liner interface, grain boundary (GB) scattering and bulk resistivity (electron-phonon scattering) [Roberts et al., 2015].
4.28 Thickness dependence of the resistivity of evaporated copper films at 293 K. Experimental data (—) calculated according to the statistical model [Finzel and Wimann, 1985]; (- - -) calculated on the basis of best fit [Schmiedl et al., 2008].
4.29 Bulk resistivity of various metals. [GSU, 2017]
4.30 Sheet resistance as a function of layer pitch [Tyagi et al., 2000a]
4.31 Sheet resistance as a function of layer pitch [Brain, 2016]
4.32 Cu line resistivity increases rapidly as line width decreases. The actual Van der Pauw pad thickness is labeled next to the data point. The resistivity of the largest pad with a 0.26 um thickness matches the measured data very well. [Jiang et al., 2001]
4.33 Van der Pauw resistivity measurement technique. [Gadkari, 2005]
4.34 Modeled Cu resistivity as a function of both inverse width and height. Model assumes no grain boundary scattering and ρ = 0, completely inelastic sidewall scattering. [VanOlmen et al., 2007]
4.35 Cu has excellent electromigration resistance. [Heidenreich et al., 1998]
4.36 Multiple interconnect stacks for cost, density and performance (from ITRS).
4.37 Technology for 90nm to 22nm nodes (from ITRS).
4.38 Parallel plate capacitance model of interconnect wire [Brain, 2016]
4.39 Capacitance vs Resistance change [Stork, 2005]
4.40 Interconnect scaling [Brain, 2016]
4.41 Ctotal for the line includes capacitance components from line-to-line and layer-to-layer [Brain, 2016]
4.42 RC Delay Calculation Wire [Bohr, 1995]
4.43 Interconnect scaling is limiting speed increases. [Bohr, 1995]
4.44 Global line scaling. [Diebold, 2016]
4.45 Delay versus Pitch. [Diebold, 2016]
4.46 Novel materials innovations drive contact and BEOL RC improvement (reduction). [Besser, 2017a]
4.47 Dennard scaling underestimates Intel and ITRS for M1 (y axis = Pitch in nm)
4.48 Dennard scaling underestimates Intel for M1 (y axis = Pitch in nm)
4.49 Dennard scaling underestimates Intel and ITRS for M1 (y axis = Pitch in nm log 10)
4.50 Dennard scaling underestimates Intel for M1 (y axis = Pitch in nm log 10)
4.51 Dennard scaling underestimates Intel and ITRS for thickness (y axis = H in nm)
4.52 Dennard scaling underestimates Intel for thickness (y axis = H in nm)
4.53 Dennard scaling underestimates Intel and ITRS for thickness (y axis = H in nm log 10)
4.54 Dennard scaling underestimates Intel for thickness (y axis = H in nm log 10)
4.55 Plot illustrating Intel and the ITRS are not scaling Imaxw according to Dennard scaling. The 2001 - 2011 values are the ITRS predictions (y = mA/µm).
4.56 Plot illustrating Intel and the ITRS are not scaling Imaxw according to Dennard scaling (y = mA/µm).
4.57 Intel and the ITRS are not scaling Imaxw according to Dennard scaling (y = mA/µm log 10)
4.58 Intel and ITRS are not scaling Imaxw according to Dennard scaling (y = mA log 10)
4.59 Intel is not scaling Imax according to Dennard scaling (y axis = mA)
4.60 Correlation of the thermal conductivity to the dielectric constant of various materials [Im et al., 2005b]
4.61 Electron Wind [McPherson, 2013]
4.62 A FIB cross section of the dual damascene copper line showing a slit failure under the Via. [He and Suo, 2004]
4.63 A FIB cross section of the dual damascene copper line showing a trench failure in the interconnect line [He and Suo, 2004]
4.64 Barrier layer needed to prevent Cu diffusion into dielectric
4.65 Plot of ln t50 versus ln J to determine n
4.66 Plot of ln t50 versus 1/T to determine the value of Ea
4.67 Unipolar waveform illustrating Ipeak, Irms and Iavg [Liew et al., 1990]
4.68 How the current values Ipeak, Irms and Iavg are used in interconnects
4.69 Black’s Law Regression
4.70 Black’s Law Regression for M2
4.71 Black’s Law Regression CDF
4.72 tf50% Lifetimes using Black’s Law for M1 (y = years)
4.73 log10(tf50%) Lifetimes using Black’s Law for M1 (y = years)
4.74 tf50% Lifetimes using Black’s Law for M2 (y = years)
4.75 Electromigration activation energies (left) and lifetimes for Cu/TaN/Ta liner/SiCN cap, Cu/TaN/Ta liner/Co cap, and Cu/TaN/Co liner/Co cap. [Edelstein, 2017]
4.76 Via chamfer in the non-SAV direction (left). FAV scheme comparison for chamfer and CD control (right). [Briggs et al., 2017]
4.77 Fully-Aligned Via (FAV) schematic, Cu/barrier recess TEM/EELS map, and implementation on W MOL. [Briggs et al., 2017]
4.78 Through-Co Self-Forming Barrier concept and data. Mn from Cu(Mn) seed layer diffuses through ultrathin TaN/Co liner, reacts with residual O, and seals the composite barrier. Provides line-R reduction with preserved reliability. [Nogami et al., 2015]
4.79 Through-Co Self-Forming Barrier concept and data. Mn from Cu(Mn) seed layer diffuses through ultrathin TaN/Co liner, reacts with residual O, and seals the composite barrier. Provides line-R reduction with preserved reliability. [Briggs et al., 2017]
4.80 Adhesion energy of CoWP compared to SiC, NSiC and SiN capping layers. [Gupta, 2010]
5.1 DPA Model [Nagaraj et al., 1998]
5.2 28nm EM example
5.3 Performance versus Lifetime tradeoff by % change in j with ∆tlifetime current density exponent (j) of n = 1 and ∆fperformance current density exponent (j) of α = 1.
5.4 Performance versus Lifetime tradeoff for ∆tlifetime current density exponent of n = 2, and Sakurai-Newton ∆fperformance current density exponent of α = 1.2.
5.5 Performance versus Lifetime trade off for n = 1 and α = 1
5.6 Plot illustrating the tradeoffs of Current density j vs Temperature (T) vs Lifetime (Z).
5.7 Metal stack for nominal process variation
5.8 Capacitance, C, vs. inverse resistance, 1/R, at 112nm pitch (32nm: Metal-2; 22nm: Metal-4) [Ingerly et al., 2012b]
5.9 Capping and Barrier layers for M2
5.10 Modulus of elasticity versus Dielectric constant, εr [Besser, 2007]
5.11 Interconnect capacitance model
5.12 COMSOL simulation for Tsubstrate of 105 °C and Tambient of 25 °C
5.13 COMSOL simulation for Tsubstrate of 105 °C and Tambient of 85 °C
5.14 COMSOL simulation for Tsubstrate of 105 °C and Tambient of -40 °C
5.15 COMSOL simulation for Tsubstrate of 105 °C and Tambient of 25 °C
5.16 COMSOL simulation for Tsubstrate of 105 °C and Tambient of 85 °C
5.17 COMSOL simulation for Tsubstrate of 105 °C and Tambient of -40 °C
6.1 Amber 25 design block diagram
6.2 High Level Design Exploration System
6.3 Plot illustrating the tradeoffs of Current density j vs Temperature (T) vs Lifetime (Z).
6.4 Tool flow for EM Methodology
6.5 Library PVT (Process, Voltage and Temperature) diagram
6.6 Tool flow for Spice to validate the EM Methodology using ICC
6.7 Synopsys 28/32nm PDK Acceleration Factor
6.8 Synopsys 28/32nm PDK Acceleration Factor as used in ICC
7.1 Amber 25 design with floorplanning, gate placement and clock tree synthesized
7.2 Clock tree highlighted in ICC
7.3 Final ICC layout including the signals. Not all layers are shown. The x, y units are in µm
7.4 Amber 25 layout showing the major blocks of the design
7.5 Arm3 processor die for comparison to the Amber 25 design
7.6 Amber 25 design wire count (Net Wire Length is in µm).
7.7 Amber 25 design average number of signals and clocks per metal layer (Net Wire Length is in µm).
7.8 Amber 25 design average number of Vias for the signals and clocks (Net Wire Length is in µm).
7.9 Amber 25 design ICC default toggle estimates (Net Wire Length is in µm).
7.10 Amber 25 design toggle rates with a cache size of 64 (Net Wire Length is in µm).
7.11 Amber 25 design toggle rates with a cache size of 128 (Net Wire Length is in µm).
7.12 Amber 25 design Power using the ICC default toggle rates (Net Wire Length is in µm).
7.13 Amber 25 design Power with a cache size of 128 (Net Wire Length is in µm).
7.14 Resistance for signal and clocks (Net Wire Length is in µm).
7.15 Capacitance for signal and clocks (Net Wire Length is in µm).
7.16 Fanout of signal and clocks (Net Wire Length is in µm).
7.17 ICC layout showing the EM violations (yellow X’s)
11.1 28/32nm LVT Inverter 0.5 Drive Schematic
11.2 90nm LVT Inverter Schematic
11.3 90nm LVT Inverter Schematic with Capacitors
11.4 28/32nm LVT NAND2 Schematic
11.5 90nm LVT NAND2 Schematic
11.6 90nm LVT NOR Schematic
11.7 Plot of the percent change in tp for a typical 28/32nm inverter
11.8 Plot of the delay change in tp for a typical 28/32nm inverter with ±40 change in interconnect resistance
11.9 Plot of the Iomax change for a typical 28/32nm inverter with ±40 change in interconnect resistance
11.10 Plot of the percent change in tp for a typical 90nm inverter
11.11 Plot of the delay change in tp for a typical 90nm inverter with ±40 change in interconnect resistance
11.12 Plot of the Iomax change for a typical 90nm inverter with ±40 change in interconnect resistance
11.13 Interconnect delay calculation for Tperiod = 1RC
11.14 Interconnect delay calculation for Tperiod = 4RC
11.15 Interconnect delay calculation for Tperiod = 5RC
11.16 Interconnect delay calculation for Tperiod = 6RC
11.17 Interconnect delay calculation for Tperiod = 10RC
11.18 Clock simulation with Idd for time constant of 1 RC
11.19 Clock simulation with Idd for time constant of 4 RC
11.20 Clock simulation with Idd for time constant of 5 RC
11.21 Clock simulation with Idd for time constant of 6 RC
11.22 Clock simulation with Idd for time constant of 10 RC
11.23 Ratio of vacancy concentration at the blocking barrier C(0, t) to the initial vacancy concentration (C(x, 0) = C0) as a function of time normalized to τ = α² × Dt for various conductor lengths for each of the boundary conditions of 11.6. Note that all solutions approximate the semi-infinite case except near steady state. [Clement and Lloyd, 1992]
Acknowledgments
Dedicated to my parents Richard and Hazel Weyer.
I owe my deepest gratitude to my Committee Chair Dr. Christos Papachristou. His personal and professional guidance throughout my research has taught me more than I can give him credit for here. I also gratefully acknowledge my committee members for their support and time for this work.
I would like to express my sincere appreciation to my friend and mentor in research, Dr. Francis G. Wolff. Without his enthusiasm, encouragement, knowledge and assistance throughout my educational journey, I would not have completed this dissertation. He inspired me to complete my research. Francis is the university’s Synopsys tool administrator and the SolvNet ambassador. His help in getting through all the complexity of the tools and technology files was needed for this work.
I am extremely grateful to my friend and co-worker Steve Clay. His knowledge and expertise in mathematics and reliability engineering, together with his willingness and ability to teach me the principles of reliability, have been instrumental in completing this dissertation.
I need to thank Rockwell Automation for their support in my pursuit of my PhD. My manager and co-workers all supported me throughout my Master’s and PhD work.
I would also like to thank Synopsys Incorporated for their support of the Case Computer Engineering Department by allowing the use of their ASIC tool suite, which made this work possible.
And finally, I need to thank my family, and especially my wife Geri and children LeeAnn, Daniel and Darlene for their loving support in my education. Without their support this would not have been possible.
Tradeoffs between Performance and Reliability in Integrated Circuits
Abstract
by
DANIEL J. WEYER
Integrated circuits (ICs), and ASICs in particular, have always been designed as a tradeoff between Performance, Cost (Area) and Power. The Reliability of the ICs or ASICs was assumed to always exceed the expected lifetime of the product. Reliability cannot be ignored as the IC/ASIC industry moves to nano-scale geometries. This dissertation describes a design methodology to perform tradeoffs between Lifetime, Performance, Cost (Area) and Power. The main objective is to develop a design space exploration method and tools for IC/ASICs driven by lifetime concerns due to Electromigration. Our method applies both to safety-based products that require longer lifetimes and to higher-performance products that are frequently replaced.
Chapter 1
Introduction
In modern nanoscale IC and ASIC (IC/ASIC) designs, Lifetime and Reliability are now among the design tradeoffs that need to be considered. Traditionally, IC/ASIC design has been dominated by performance, cost (area) and power consumption, especially in mobile wireless applications. There are different requirements for consumer, computer server, automotive, medical, industrial, avionics and military IC/ASICs. Methods are needed to predict the useful life of an IC/ASIC for the lifetime of the product(s) it will be used in. Transistor degradation occurs due to the effects of Electromigration (EM), Negative Bias Temperature Instability (NBTI), Positive Bias Temperature Instability (PBTI), Hot Carrier Injection (HCI) and Time Dependent Dielectric Breakdown (TDDB), all of which can reduce the useful life of the ASIC. The useful IC/ASIC life under these degradation mechanisms was over 100 years in 350 nanometer (nm) technology. The useful life is now targeted at 10 years for IC/ASIC technologies at 150 nm and below.
Motivation
The IC/ASIC designer needs to consider Lifetime and Reliability as one of the tradeoffs along with Performance, Area and Power. If the end product needs the highest performance and the lifetime of the product is 5 years, the IC/ASIC design can be maximized for performance at the expense of lifetime. If the lifetime of the product is greater than 10 years, performance can be traded off for longer lifetimes. With the growth of autonomous vehicles, reliability and lifetime will be very important design criteria going forward. Autonomous vehicles will require the latest IC/ASIC technologies and will most likely be under power longer than the current average of 2 hours per day.
There are different lifetime expectations for consumer, automotive, medical, industrial and military products. Methods are needed to predict the useful life of an IC/ASIC for the expected lifetime of the product(s) it will be used in. In smaller IC geometries, transistor degradation occurs due to the effects of Electromigration (EM), Negative Bias Temperature Instability (NBTI), Positive Bias Temperature Instability (PBTI), Hot Carrier Injection (HCI) and Time Dependent Dielectric Breakdown (TDDB). Each of these can reduce the useful life of the IC/ASIC. The useful IC/ASIC life under these degradation mechanisms was > 100 years in 350 nm technology. IC/ASIC processes at 150 nm and below are now designed for a useful life of > 10 years.
Each new IC/ASIC process has been following Moore’s law. In 1965 Gordon Moore, who at the time was the director of research at Fairchild Semiconductor, was asked to forecast what the IC/ASIC industry would do in the next 10 years. He published an editorial in Electronics Magazine in which he speculated:
“The complexity for minimum component costs has increased at a rate of roughly a factor of two per year. Certainly over the short term this rate can be expected to continue, if not to increase. Over the longer term, the rate of increase is a bit more uncertain, although there is no reason to believe it will not remain nearly constant for at least 10 years” [Moore, 1965a]
He revised his forecast in 1975, in an article presented at the IEEE International Electron Devices Meeting, to:
“Semiconductor complexity would continue to double annually until about 1980 after which it would decrease to a rate of doubling approximately every two years” [Moore, 1975]
Moore’s law has been driving the IC/ASIC industry. This doubling has reduced the lifetime of an IC/ASIC. Figure 1.1 gives a perspective on the transistor sizes of the IC/ASIC technologies needed to meet “Moore’s Law”.
Figure 1.1: Perspective on sizes. [Bohr, 2014]
As transistors shrank to meet Moore’s Law, the interconnect wires that connect the transistors also had to shrink. The interconnect wires are now the limiting factor in the useful life of an IC/ASIC, due to electromigration in the interconnect wires. Engineers at AMD described the problem as “Electromigration: the time bomb in deep-submicron ICs” [Li and Young, 1996]. At IBM it was estimated that close to a billion 1966 dollars were spent in the effort to understand and fix the problem of electromigration failure. “This was when a billion dollars was a lot of money.” [Lloyd, 2002]. EM continues to be a major challenge in IC/ASIC design today. In the late 1980’s Western Digital (WD) desktop hard drives had widespread failures within 12 to 18 months. The root cause was determined to be an electromigration rule violation in a third party controller IC/ASIC. WD corrected the problem, but not before damage was done to their reputation [Balasinski, 2016].
Guard banding has been the technique used to predict an IC/ASIC’s useful life. Guard banding consists of a set of design rules used when developing an IC/ASIC. These rules account for the process variations in the manufacturing of an IC/ASIC. Process variation occurs in the fabrication of an IC/ASIC and causes variations in the attributes of transistors (length, width, oxide thickness) and interconnect wires (length, width, thickness). Process variation causes measurable and predictable variance in the performance and reliability of the circuits in the IC/ASIC design. The amount of process variation becomes more pronounced at smaller process nodes [Wikipedia, 2018c].

The useful life is the expected life of the product the IC/ASIC was designed into. The expected useful life of a cell phone is different from the expected life of an automobile or a satellite. “Companies that have suffered from electromigration failures consequently, tend to set very conservative design rules, usually including wide power rails, which consume valuable routing space” “Unless the problem is eliminated by design, deep-submicron circuits will be failures waiting to occur” [Li and Young, 1996].

Guard banding makes the assumption that a component is operated under the absolute worst case conditions. When trading off reliability against performance, a 10% degradation of a parameter does not imply component failure at 10 years. IC/ASIC life may be greater or less than 10 years depending on the IC/ASIC circuit design and the environmental conditions that the IC/ASIC is operated under. We will propose a design methodology which will allow a designer to better predict an IC/ASIC’s useful life. The methodology allows a designer to consider Reliability in the Performance, Cost and Power design tradeoffs.
1.1 Problem Statement
The decision to create a new product involves detailed documentation of the design requirements. The requirements include, but are not limited to:
• Cost of the product in the marketplace
• Design and manufacturing costs
• How many years can the product be sold
• Number of products sold
• How long must the product be in service
• Performance
• Power requirements
• Environment the product will be used in
• Expected useful life of the product
• Size of the end product
From the product requirements, a product design team will make the decision to use a Commercial Off the Shelf (COTS) IC, an FPGA, or to design a custom ASIC. The COTS IC supplier will have to consider the design requirements for the IC with targeted markets, such as consumer, medical, industrial, aerospace, military, etc. Product design teams have the knowledge to make the tradeoffs between Performance, Cost and Power. Reliability of the IC/ASICs was historically greater than that of the other components in the product, such as electrolytic capacitors in the power supplies. As the process nodes continue to shrink, this assumption no longer applies.

The objective is to present an explanation of the physical aspects of transistor degradation, to describe how the calculations for reliability are performed in the IC/ASIC industry, and to propose a methodology with which product design teams can trade off between performance, power, cost and reliability. The focus will be on electromigration (EM) in the interconnect wires, as this is the limiting factor in useful life in process technologies below 150nm. The IC/ASIC industry made the tradeoff between performance and the reliability of the interconnect wires and set the design goal of an IC/ASIC lifetime of at least 10 years. The effects of EM will be discussed, and layout rules and simulation will be used to show how the tradeoff between performance and reliability can be made.

For a product design, the IC/ASIC designer needs to consider the degradation or aging mechanisms of an IC/ASIC to ensure the requirements for lifetime are met. The focus of our research is the clock lines and a process to analyze tradeoffs in performance versus reliability in clock lines. Power lines carry more current and thus have the potential for EM failures, but they are routed on wider interconnect layers, have specialized tools for design and analysis, and can typically be overdesigned in practice. For this discussion, aging is a continuous phenomenon in the degradation of materials. Lifetime is a milestone in the age of an IC/ASIC, which depends on the application domain. Our focus is lifetime.
1.2 Outline
Chapter 2 looks at the reliability of IC/ASICs and discusses the reasons reliability is important, reliability from the IC/ASIC industry perspective, and how reliability is calculated for each of the aging mechanisms, as follows:
1. Importance of Reliability in IC/ASICs
2. Theory of transistor degradation or aging
(a) NBTI (Negative Bias Temperature Instability) and PBTI (Positive Bias Temperature Instability)
(b) HCI (Hot Carrier Injection)
(c) TDDB (Time Dependent Dielectric Breakdown)
(d) EM (Electromigration)
EM will be expanded to discuss our approach and how it needs to be considered in the design of an IC/ASIC. Chapter 3 discusses the aging or degradation mechanisms of CMOS devices used in IC/ASIC design. Chapter 4 investigates the interconnects and their effects on performance and reliability. Chapter 5 presents the tradeoffs of lifetime versus performance and the library used in this work to show the design considerations for EM aging or degradation. Chapter 6 is our methodology. It uses the theory and equations from Chapter 5 with a 28/32nm process node to show the effects of aging due to electromigration and our proposed methodology. An Open Core 32-bit processor design is used as an example, with simulations and analysis of the clock trees, to show how an IC/ASIC can be designed to optimize the tradeoffs of performance, cost, power and reliability. Chapter 7 presents the results of our research and discusses how the design space can be explored for performance, power, lifetime and cost in IC/ASIC design. Chapter 8 is a discussion of our work. Chapter 9 is a summary of our contributions and future research. Chapter 10 is a list of acronyms. Chapter 11 is the Appendix, with details of the work performed to explore the IC/ASIC design space. Chapter 12 is my related research.
Chapter 2
Reliability of ICs/ASICs
IC/ASIC CMOS technology scaling has posed increasing part reliability concerns, which affect the whole bathtub curve regions (Figure 2.1):
• Failure rate increases with scaling feature size
• Aging mechanisms: aging mechanisms dominate
• Can be optimized for the end product
The bathtub curve plots the failure rate versus time for IC/ASICs. There are three parts of the bathtub curve:
• Infant Mortality
• Constant or Random failure
• Aging
Understanding the scaling impact on IC/ASIC reliability and the limitations of progressively scaled IC/ASIC CMOS technologies from a Physics of Failure (PoF) standpoint will lead to improved reliability prediction and appropriate de-rating criteria, and will help projects more effectively mitigate risks.

Figure 2.1: The IC/ASIC Shrinking Bathtub

As fabrication processes progress to smaller geometries, certain semiconductor effects come to dominate and limit the useful life of an IC/ASIC. The factors for transistor aging are EM, NBTI, PBTI, HCI and TDDB. These mechanisms affect the useful life. EM, NBTI, PBTI, HCI, and TDDB, which used to be second or even third-order effects, are now becoming major failure mechanisms if the IC/ASIC design does not consider these effects, as shown in Figure 2.2. These affect the performance of the CMOS transistors over time. The transistors trap charge in the case of NBTI, PBTI and HCI. Shrinking the interconnects increases their resistance, which affects the speed at which a circuit will perform. To keep performance, a tradeoff must be made on the amount of current allowed in an interconnect wire. The more current, the better the performance, but the higher current causes degradation due to EM. TDDB is a catastrophic failure and must be accounted for in the design of the dielectric of the IC/ASIC technology process.
Figure 2.2: Aging effects of a CMOS inverter [Bafleur and Perdu, 2016].
The early Infant Mortality failures are due to manufacturing defects. These are hopefully found in the manufacturing testing of ICs/ASICs. Random Failures are failures of the IC/ASIC during its lifetime. Aging failures determine the useful lifetime of the IC/ASIC. A group of industry experts formed the International Technology Roadmap for Semiconductors (ITRS). ITRS was created to assess semiconductor or IC/ASIC technology and to create roadmaps. In 2011 ITRS emphasized its concerns about these IC/ASIC failure mechanisms. For example, a telecom 90nm ASIC had a 10% failure rate within 4 years due to HCI [Hillman, 2009]. These reliability limitations are usually measured in units of FITs (Failures in Time), where 1 FIT is one failure in one billion device-hours of operation (e.g. 1000 devices for 1 million hours or 1 million devices for 1000 hours). For example, an IC/ASIC CMOS process must not be worse than the 50 FIT level. Figure 2.3 shows typical FIT rates for electronic components and the products they are used in. The figure shows that semiconductor FIT rates are trending upward. This is due to Moore’s Law, as more transistors are being placed in IC/ASICs.
Figure 2.3: Typical FIT rates of electronic components and trend due to aging or degradation. [Hillman, 2009]
There is no practical method for correlating transistor degradation with an IC/ASIC library of gates, flip-flops, etc. or with the design application. There are millions of gates and interconnect wires, all with process variations in manufacturing. Guard banding has been the technique used to bound an IC/ASIC’s useful life, but guard banding does not predict the useful life. Guard banding makes the assumption that a component is operated under the absolute worst case conditions. When trading off reliability against performance, a 10% transistor degradation parameter rule does not imply component failure at 10 years. ASIC life may be greater or less than 10 years depending on the IC/ASIC circuit design and the environmental conditions that the ASIC is operated under. Figure 2.4 shows the design consideration due to aging. Clearly, guard banding of 15% is large, and why it is necessary needs to be investigated in order to increase the performance of the IC/ASIC design. The cost of guaranteeing reliable operation of an IC/ASIC for the specified lifetime is increasing, and is paid for in terms of the chip’s performance [Arasu et al., 2016].

Nanoscale IC/ASICs have become more sensitive to various process parameters and to operating conditions such as supply voltage and temperature. Designers follow the worst case design approach to guarantee IC/ASIC operation for all Process, Voltage and Temperature (PVT) variations of the IC/ASIC manufacturing process. However, the extreme corners rarely occur in fabricated IC/ASICs. Also, over time the IC/ASIC fabrication process is held to tighter tolerances. Excessive use of guard banding limits the maximum performance. To continue the digital design success in nanometer CMOS, cost-effective variation-tolerant design approaches are needed that guarantee circuit robustness in the presence of variability influences, while avoiding over-constraining of the design [Meijer, 2011].
Figure 2.4: Total guard banding of 15% is large. Blue shows the faults due to process variation. Red shows aging degradation [Alam et al., 2008].
The reason for guard banding in ICs/ASICs is to ensure the reliability of the products designed with the IC/ASIC. Figure 2.5 illustrates the cost of warranty to various industries. As can be seen from the figure, the warranty costs are in billions of dollars.
Figure 2.5: Warranty Costs
Examples of recalls are: PC motherboards [Singer, 2005], medical pacemakers [LaPedus, 2006], thermal reliability problems in graphics ICs [Yoshida, 2008], microprocessors causing system errors when running certain programs at a particular temperature [Merritt, 2004] [Fried, 2000], wire bonding and packaging reliability problems in graphics ICs [Clark, 2008] [LaPedus, 2008], and an FPGA vendor [McGrath, 2006]. These led either to a recall of the product or to an extension of the product warranties. CMOS technology scaling has posed increasing parts reliability concerns, which affect all regions of the bathtub curve:
• Failure rate increases with scaling feature size
• Aging or wear-out mechanisms: aging mechanisms dominate
• Can be pushed out by design
Understanding the scaling impact on parts reliability and the limitations of progressively scaled CMOS technologies will lead to improved reliability prediction and appropriate derating criteria, and will help IC/ASIC designers more effectively mitigate risks. Reliability is measured in FITs (Failures in Time). The number of failed units/ICs is measured in PPM (Parts per Million). It is useful to note the definitions below:
1 FIT = 1 failure in $10^{9}$ device-hours
1 PPM = 1 failure in $10^{6}$ parts
Examples:
100 FIT = after 10 years operation 1% failure
10 FIT = after 10 years operation 0.1% failure
MTBF = Mean Time Between Failure - the arithmetic mean time between failures for a repairable system.
MTTF = Mean Time to Failure - the arithmetic mean time to failures for a non-repairable system. MTBF ≈ MTTF for non-repairable systems.
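The FIT examples above can be checked with a short calculation. The following is a minimal sketch, assuming a constant failure rate (exponential model) and continuous operation of 8760 hours per year; the function names are illustrative and do not come from any standard.

```python
import math

HOURS_PER_YEAR = 8760  # assumption: powered 24 hours a day, 365 days a year


def fit_to_failure_rate(fit):
    """Convert FIT (failures per 1e9 device-hours) to failures per hour."""
    return fit * 1e-9


def fraction_failed(fit, years):
    """Cumulative fraction failed after `years`, assuming a constant rate."""
    lam = fit_to_failure_rate(fit)
    return 1.0 - math.exp(-lam * years * HOURS_PER_YEAR)


def mttf_hours(fit):
    """MTTF in hours for a constant failure rate: MTTF = 1 / lambda."""
    return 1.0 / fit_to_failure_rate(fit)


def expected_failures(fit, devices, hours):
    """Expected number of failures for `devices` parts run for `hours` each."""
    return fit_to_failure_rate(fit) * devices * hours


for fit in (100, 10):
    percent = 100.0 * fraction_failed(fit, 10)
    print(f"{fit} FIT, 10 years: {percent:.3f}% failed")
```

Under these assumptions, 100 FIT gives about 0.87% failed after 10 years and 10 FIT about 0.088%, consistent with the rounded 1% and 0.1% figures. Evaluating `fraction_failed` at `t = MTTF` gives 1 − e⁻¹, or about 63%, for any FIT value.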
It is noteworthy that, for a constant failure rate, the MTBF and MTTF values correspond to the time at which about 63% of the devices or products have failed.

Safety is also a concern, especially for the Aerospace, Automotive, Industrial, Medical, and Military markets. There are standards such as IEC 61508 (IEC - International Electrotechnical Commission) for Industrial, ISO 26262 (ISO - International Organization for Standardization) for Automotive, and various others such as the defense industry DOD 254. There are overlaps between the standards. For example, the automotive standard ISO 26262 specifies the use of IEC 61508. IEC 61508 is a functional safety standard for Electric/Electronic/Programmable Electronic (E/E/PE) safety-related systems - Proof of Safety [Foerster, 2010]. It consists of a documented body of evidence that provides a convincing and valid argument that a system is adequately safe for a given application in a given environment, with justification of engineering and management approaches to safety issues as follows:
1. Safety System Project has various phases in its life cycle.
2. Functional Safety Assessment is a critical activity that checks and reviews out- put of each phase to make sure that the Functional Safety has actually been achieved.
3. Based on the Risk level SIL (Safety Integrity Level) an Independent person or an independent organization is required to carry out a Safety Assessment.
There are four SIL ratings (ASIL for automotive), which classify the compound or average failure probability (probability of a failure per hour of operation that introduces danger) for a design. These are categorized as follows:
SIL4 (-): 1 failure in a minimum of 110,000 years
SIL3 (ASIL-D): 1 failure in a minimum of 11,000 years
SIL2 (ASIL-B/C): 1 failure in a minimum of 1,100 years
SIL1 (ASIL-A): 1 failure in a minimum of 380 years
For example, a nuclear power plant would be certified to SIL 4. Most industrial and automotive applications are SIL2 (ASIL B/C) or SIL3 (ASIL D). Aging must be accounted for in the IC/ASIC design to ensure the failure metrics are met for the product designs.

When IC/ASICs are manufactured and used under identical stress conditions, they will not fail in exactly the same way or at the same time. The reason is that slight differences exist in the micro-structure of each IC/ASIC and in the manufacturing processes. An IC/ASIC consists of a silicon die placed in a package. Failure can occur in the silicon or in the package, and the failure mechanisms may not be identical. Statistical analysis can be performed on failures. Reliability of IC/ASICs can be summarized as follows [Wikipedia, 2018d]:
1. Semiconductor devices are very sensitive to impurities and particles. Therefore, to manufacture these devices it is necessary to manage many processes while accurately controlling the level of impurities and particles. The finished product quality depends upon the many layered relationship of each interacting substance in the semiconductor, including metallization, IC/ASIC material and package.
2. The problems of micro-processes and thin films must be fully understood as they apply to metallization and wire bonding. It is also necessary to analyze surface phenomena from the aspect of thin films.
3. Due to the rapid advances in technology, many new devices are developed using new materials and processes, and design calendar time is limited due to non-recurring engineering constraints, plus time to market concerns. Consequently, it is not possible to base new designs on the reliability of existing devices.
4. To achieve economy of scale, semiconductor products are manufactured in high volume. Furthermore, repair of finished semiconductor products is impractical. Therefore, incorporation of reliability at the design stage and reduction of variation in the production stage have become essential.
5. Reliability of semiconductor devices may depend on assembly, use, and environmental conditions. Stress factors affecting device reliability include gas, dust, contamination, voltage, current density, temperature, humidity, mechanical stress, vibration, shock, radiation, pressure, and intensity of magnetic and electrical fields.
2.1 Reliability analysis
There are many time-dependent forms of degradation. The following three (3) forms are generally used as they tend to occur in nature. They are the Power-Law, Exponential and Logarithmic models:
$f(x) = a x^{\pm k}$ is the Power Law function
$f(x) = b^{x}$ is the Exponential function
$\log_b(x) = y$ is the Logarithmic function
The following sections give a brief overview of reliability and the statistical distributions used by the IC/ASIC industry. A very good summary can be found in the book “Reliability Physics and Engineering” by J. W. McPherson [McPherson, 2013], which was used extensively in the following sections.
2.1.1 Time To Failure
For the semiconductor industry, three probability distribution functions have historically been used for reliability analysis. They are the “normal”, “lognormal” and “Weibull” distributions. The normal, lognormal and Weibull distributions are continuous distributions used for time to failure (TTF or $t_f$) analysis when failure in time data is available. The following are important statistical concepts, which are defined mathematically in the next sections [Strong et al., 2009a]:
$f_t = f(t)$ = Probability Density Function (PDF) or Probability Distribution Function = Probability of observing a failure between t and t+dt
$F_t = F(t)$ = Cumulative Distribution Function (CDF) or Cumulative failure probability
$R_t = R(t)$ = Cumulative surviving distribution = $1 - F(t)$
$h(t)$ = Instantaneous Failure Rate (IFR) or Hazard function
$H(t)$ = Cumulative hazard function
The PDF is a function that describes the ratio of the devices or products that fail in a given period of time (t to t+dt) to the total number of devices or products [Kapur and Pecht, 2014]. Figure 2.6 illustrates the PDF.
Figure 2.6: PDF f(t). f(t) represents the probability of finding a device failure between t and t+dt [McPherson, 2013]
The CDF is the fraction of the population that has failed. The CDF increases from zero to one. The CDF mathematically describes the probability of failure at a given time as is shown in Figure 2.7. The CDF equation is Equation 2.1.
Figure 2.7: CDF F(t). F(t) represents the fraction of the population that failed [policeanalyst.com, 2012]
$$ F(t) = \text{Probability of failure} = \text{Probability}[\text{Failed product} \le t] = P[T \le t] \tag{2.1} $$
$$ F(t) = \int_{0}^{t} f(\tau)\,d\tau \tag{2.2} $$
The reliability function R(t) is the cumulative distribution of the surviving population. It is obtained by subtracting the cumulative fails (CDF) from 1, that is, $R(t) = 1 - F(t)$. Just as the CDF must equal 1 after the last failure, the reliability function must then equal 0, since by definition there are no survivors [Kapur and Pecht, 2014]. Figure 2.8 illustrates the PDF, CDF and Reliability.
Figure 2.8: Comparison of PDF, CDF and R [Andy, 2018]
The hazard rate is the rate at which failures occur over a given time interval. It does not depend on the original sample size. The hazard function or rate is defined in Equation 2.3:
$$ \text{Hazard rate} = \frac{\text{\# of failures in a given time interval}}{\text{\# of survivors at the start of the interval} \times \text{interval length}} \tag{2.3} $$
$$ h(t) = \frac{f(t)}{R(t)} \tag{2.4} $$
The hazard rate or the instantaneous failure rate indicates the failure rate over the life of the population. The hazard rate is defined in terms of the reliability R(t), and is described in Equation 2.5 [Kapur and Pecht, 2014]:
$$ h(t) = \frac{-1}{R(t)} \frac{dR(t)}{dt} \tag{2.5} $$
since $f(t) = -\frac{dR(t)}{dt}$. The reliability function is shown in Equation 2.6:
$$ \text{Reliability function} = R(t) = \text{Probability}[\text{Product life} > t] \tag{2.6} $$
$$ = P[T > t] \tag{2.7} $$
$$ = 1 - P[T \le t] \tag{2.8} $$
$$ = 1 - F(t) \tag{2.9} $$
The importance of the cumulative hazard function H(t) is that it indicates the change in failure rate over the life of the population. Two designs may provide the same reliability at a specific time, but the hazard rates can differ over time. The cumulative hazard is given by Equation 2.10 [Kapur and Pecht, 2014]:
$$ H(t) = \int_{\tau=0}^{t} h(\tau)\,d\tau \tag{2.10} $$
R(t) and F(t) are related to h(t) and H(t), and the following relationships can be developed (Equation 2.11) [Kapur and Pecht, 2014].
$$ h(t) = \frac{f(t)}{R(t)} = -\frac{1}{R(t)} \frac{d}{dt} R(t) \tag{2.11} $$
$$ = -\frac{d \ln[R(t)]}{dt} \tag{2.12} $$
or
$$ -d \ln(R(t)) = h(t)\,dt \tag{2.13} $$
Integrating both sides yields Equation 2.14:
$$ -\ln[R(t)] = \int_{\tau=0}^{t} h(\tau)\,d\tau \tag{2.14} $$
$$ = H(t) \tag{2.15} $$
Since the four functions f(t), F(t), R(t) and h(t) are related, if any one of the functions is known the other three can be developed, as shown in Equations 2.16, 2.17, 2.18 and 2.19:
$$ h(t) = \frac{f(t)}{R(t)} \tag{2.16} $$
$$ R(t) = \exp\left[-\int_{0}^{t} h(u)\,du\right] \tag{2.17} $$
$$ f(t) = h(t) \exp\left[-\int_{0}^{t} h(u)\,du\right] \tag{2.18} $$
$$ F(t) = 1 - R(t) \tag{2.19} $$
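The relationships in Equations 2.16 through 2.19 can be illustrated numerically. The sketch below starts from a hazard function only and recovers H(t), R(t), f(t) and F(t); the linearly increasing hazard `example_hazard` is a hypothetical choice made purely for illustration, and the trapezoidal integration is one simple option among many.

```python
import math


def cumulative_hazard(h, t, steps=10_000):
    """H(t): integral of h(u) du from 0 to t, by the trapezoidal rule."""
    dt = t / steps
    total = 0.5 * (h(0.0) + h(t))
    for i in range(1, steps):
        total += h(i * dt)
    return total * dt


def reliability(h, t):
    """R(t) = exp(-H(t)), as in Equation 2.17."""
    return math.exp(-cumulative_hazard(h, t))


def pdf(h, t):
    """f(t) = h(t) * R(t), as in Equation 2.18."""
    return h(t) * reliability(h, t)


def cdf(h, t):
    """F(t) = 1 - R(t), as in Equation 2.19."""
    return 1.0 - reliability(h, t)


def example_hazard(t):
    """Hypothetical hazard h(t) = 2t, for which R(t) = exp(-t^2) exactly."""
    return 2.0 * t


t = 0.5
print("R(t) numeric:", reliability(example_hazard, t))
print("R(t) closed form:", math.exp(-t ** 2))
print("h(t) recovered as f/R:", pdf(example_hazard, t) / reliability(example_hazard, t))
```

For this hazard the closed-form reliability is exp(−t²), so the numeric and closed-form values can be compared directly, and dividing f(t) by R(t) returns the hazard that was supplied.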
It is useful to note that the reliability function R(t) for systems is assumed to follow an exponential distribution, as shown in Equation 2.20.
$$ R(t) = \exp(-\tau_0 t) \tag{2.20} $$
2.2 Continuous Distributions
Useful probability distributions for the analysis of IC/ASICs are the Exponential, Normal (Gaussian), lognormal, Weibull and Gamma distributions. These are continuous distributions, in which a variable can take on any value between two specified values.
2.2.1 Exponential Distribution
The exponential distribution is characterized as follows [Taboga, 2010], [Tsirelson, 2010]: Let X be an absolutely continuous random variable whose support is the set of non-negative real numbers (Equation 2.21).
$$ R_X = [0, \infty) \tag{2.21} $$
Let $\lambda \in \mathbb{R}_{+}$. Then X has an exponential distribution with rate parameter λ if its probability density function (PDF) is as shown in Equation 2.22.
$$ f_X(x) = \begin{cases} \lambda \exp(-\lambda x) & \text{if } x \in R_X \\ 0 & \text{if } x \notin R_X \end{cases} \tag{2.22} $$
Figure 2.9 shows the plots for the Exponential PDF. The cumulative distribution function (CDF), or cumulative failure probability, is then shown in Equation 2.23:
$$ F_X(x) = \begin{cases} 1 - \exp(-\lambda x) & \text{if } x \in R_X \\ 0 & \text{if } x \notin R_X \end{cases} \tag{2.23} $$
Figure 2.9: The Exponential PDF [Wikipedia, 2017b]
Figure 2.10 shows the plots for Exponential CDF:
Figure 2.10: The Exponential CDF [Wikipedia, 2017b]
An important property of the exponential distribution is the memoryless property. This means that if a random variable X is exponentially distributed, its conditional probability obeys Equation 2.24 [Taboga, 2010].
$$ P(X \le x + y \mid X > x) = P(X \le y) \tag{2.24} $$
This property says that the probability that the event happens during a time interval of length y is independent of how much time x has already elapsed without the event happening. For example, if x = 20 seconds and y = 10 seconds, the equivalent statement in terms of survival probabilities is shown in Equation 2.25:
$$ P(X > 30 \mid X > 20) = P(X > 10) \tag{2.25} $$
which says that the probability of having to wait more than another 10 seconds before the first arrival, given that the first arrival has not yet happened after 20 seconds, is no different from the initial probability of having to wait more than 10 seconds for the first arrival.

Reliability theory and reliability engineering make use of the exponential distribution. Because of its memoryless property, the exponential distribution is well suited to model the constant (intrinsic) failure portion of the bathtub curve in Figure 2.14. It is also very convenient because it is easy to add failure rates in a reliability model [Morris, 2014]. The mean or expected value E of an exponentially distributed variable X with rate parameter λ is shown in Equation 2.26:
$$ E[X] = \frac{1}{\lambda} \tag{2.26} $$
The variance of X is shown in Equation 2.27:
$$ \mathrm{Var}[X] = \frac{1}{\lambda^{2}} \tag{2.27} $$
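The memoryless property and the moments of the exponential distribution can be checked with a few lines of code. This is a minimal sketch using the 20 second and 10 second values of Equation 2.25; the rate `lam` is a hypothetical value chosen only for illustration.

```python
import math


def exp_cdf(x, lam):
    """Exponential CDF: 1 - exp(-lam * x) for x >= 0, else 0."""
    return 1.0 - math.exp(-lam * x) if x >= 0 else 0.0


def exp_survival(x, lam):
    """P(X > x) = 1 - F(x)."""
    return 1.0 - exp_cdf(x, lam)


lam = 0.05           # hypothetical rate parameter, events per second
x, y = 20.0, 10.0    # time already elapsed, and additional waiting time

# P(X > x + y | X > x) = P(X > x + y) / P(X > x)
conditional = exp_survival(x + y, lam) / exp_survival(x, lam)
unconditional = exp_survival(y, lam)

mean = 1.0 / lam            # Equation 2.26
variance = 1.0 / lam ** 2   # Equation 2.27

print("P(X > 30 | X > 20) =", conditional)
print("P(X > 10)          =", unconditional)
print("mean =", mean, "variance =", variance)
```

The two probabilities agree to within floating-point rounding for any positive rate, which is the memoryless property in numerical form.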
2.2.2 Normal (Gaussian) Distribution
The normal distribution occurs when a random variable is affected by many random effects, none of which dominates. This follows from the central limit theorem, which states that the sum of a large number of random variables is approximately normally distributed [Kapur and Pecht, 2014]. The central limit theorem (CLT) establishes that, in most situations, when independent random variables are added, their properly normalized sum tends toward a normal (Gaussian) distribution even if the variables themselves are not normally distributed. The theorem is a key concept in probability theory because it implies that probabilistic and statistical methods that work for normal distributions can be applicable to many problems involving other types of distributions [Wikipedia, 2018a]. The CLT is used in IC/ASIC reliability calculations in that the Normal distribution equations are used for the calculation of lognormal distributions. The PDF, based on the Gaussian function, is shown in Equation 2.28.
$$ f(t) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{t-\mu}{\sigma}\right)^{2}\right], \quad -\infty \le t \le \infty \tag{2.28} $$
where µ is the mean and σ is the standard deviation. The normal distribution parameters follow:
Mean (arithmetic average) = µ
Variance = σ²
Mode (highest value) = µ
Median = µ
Location parameter = µ
Shape parameter/standard deviation = σ
Figure 2.11 shows a plot of the Normal distribution PDF. The CDF or unreliability for the normal distribution is shown in Equation 2.29.
$$ F(t) = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{t} \exp\left[-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}\right] dx \tag{2.29} $$
Figure 2.11: PDF f(t) for Normal Distribution [Kapur and Pecht, 2014]
There is no closed form of Equation 2.29. The values for the area under the Normal distribution are obtained from Normal distribution tables. This is performed by converting the random variable t to a random variable z as shown in Equation 2.30.
$$ z = \frac{t - \mu}{\sigma} \tag{2.30} $$
A Normal random variable with mean equal to zero and variance of 1 is called a standard Normal variable (Z). Its PDF is given by Equation 2.31.
$$ \phi(z) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{z^{2}}{2}\right) \tag{2.31} $$
where $z \equiv \frac{t-\mu}{\sigma}$. The properties of the standard Normal variable are tabulated in statistical tables. The CDF is defined in Equation 2.32.
$$ F(t) = \Phi(z) = \Phi\left(\frac{t-\mu}{\sigma}\right) \tag{2.32} $$
The reliability function is shown in Equation 2.33. The hazard function is shown in Equation 2.34.
$$ R(t) = 1 - \Phi\left(\frac{t-\mu}{\sigma}\right) \tag{2.33} $$
$$ h(t) = \frac{\phi\left(\frac{t-\mu}{\sigma}\right)}{\sigma R(t)} \tag{2.34} $$
The Normal distribution has an increasing hazard function. When the normal distribution is used, the probabilities of a failure occurring before or after the mean time µ are equal, because the mean is the same as the median. For a given mean value, the variability about the mean is defined through the standard deviation.
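In place of statistical tables, the standard Normal functions φ and Φ can be evaluated in software. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later) to evaluate Equations 2.32 through 2.34; the mean and standard deviation in the example are hypothetical values, not data from this work.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard Normal: mean 0, standard deviation 1


def normal_cdf(t, mu, sigma):
    """F(t) = Phi((t - mu) / sigma), as in Equation 2.32."""
    return STD_NORMAL.cdf((t - mu) / sigma)


def normal_reliability(t, mu, sigma):
    """R(t) = 1 - Phi((t - mu) / sigma), as in Equation 2.33."""
    return 1.0 - normal_cdf(t, mu, sigma)


def normal_hazard(t, mu, sigma):
    """h(t) = phi((t - mu) / sigma) / (sigma * R(t)), as in Equation 2.34."""
    z = (t - mu) / sigma
    return STD_NORMAL.pdf(z) / (sigma * normal_reliability(t, mu, sigma))


mu, sigma = 10.0, 2.0  # hypothetical mean and standard deviation, in years
for t in (8.0, 10.0, 12.0):
    print(t, normal_cdf(t, mu, sigma), normal_hazard(t, mu, sigma))
```

With these values, F(µ) is 0.5 and the hazard grows from t = 8 to t = 12, which matches the statements above that the mean equals the median and that the Normal hazard function is increasing.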
2.2.3 lognormal Distribution
For a continuous random variable, there may be a situation in which the random variable is a product of a series of random variables. The lognormal distribution is a positively skewed distribution. It can be used to model situations where large occurrences are concentrated at the tail (left) end of the range. The lognormal distribution is based on the normal distribution, but the failures in time are assumed to be distributed logarithmically rather than linearly. If X is a random variable with a normal distribution, then Y as defined in Equation 2.35 has a lognormal distribution:
$$ Y = \exp(X) \tag{2.35} $$
Conversely, if Y has a lognormal distribution, then X = ln Y has a normal distribution. Let Y be the product of n independent random variables, as shown in Equation 2.36:
$$ Y = Y_1 Y_2 \cdots Y_n \tag{2.36} $$
Taking the natural logarithm of Equation 2.36 gives:
$$ \ln Y = \ln Y_1 + \ln Y_2 + \cdots + \ln Y_n \tag{2.37} $$
Then ln Y can be approximated by the normal distribution based on the central limit theorem [Kapur and Pecht, 2014]. The lognormal distribution applies to many engineering functions. The lognormal distribution is used to describe failures in time for an IC/ASIC where the aging mechanisms are general and complex in nature and not caused by one failure. For example, IC/ASIC industry committees have used the lognormal distribution for EM. The lognormal is also used for corrosion-induced and fatigue-induced failures. The lognormal distribution (PDF) is shown in Equation 2.38 [Kapur and Pecht, 2014].
$$ f(t) = \frac{1}{\sigma t \sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{\ln(t)-\mu}{\sigma}\right)^{2}\right] \tag{2.38} $$
$$ = \frac{1}{\sigma t \sqrt{2\pi}} \exp\left[-\left(\frac{\ln(t)-\mu}{\sigma\sqrt{2}}\right)^{2}\right] \tag{2.39} $$
where σ is the standard deviation of the logarithms of the times to failure and µ is the mean of the logarithms of the times to failure. If a random variable X follows a lognormal distribution then ln X follows a normal distribution. The parameter µ (the mean of the normally distributed ln t, which equals its median and mode) can also be expressed as $\ln(t_{50})$, which allows the lognormal PDF to be expressed as in Equation 2.40 [McPherson, 2013].
$$ f(t) = \frac{1}{\sigma t \sqrt{2\pi}} \exp\left[-\left(\frac{\ln(t)-\ln(t_{50})}{\sigma\sqrt{2}}\right)^{2}\right] \tag{2.40} $$
σ is approximated by Equation 2.41:
$$ \sigma = \ln(t_{50}) - \ln(t_{15.87}) \approx \ln\left(\frac{t_{50}}{t_{16}}\right) \tag{2.41} $$
where $t_{16}$ is the time to failure for 16% of the devices [McPherson, 2013]. Figure 2.12 shows a plot of the lognormal distribution PDF.
Figure 2.12: PDF f(t) of the Log normal Distribution for σ = 0.1 and σ = 0.5. [Kapur and Pecht, 2014]
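The cumulative failure probability F for the lognormal distribution is given by Equations 2.42 through 2.45:
$$ F(t) = \frac{1}{\sigma\sqrt{2\pi}} \int_{0}^{t} \frac{1}{x} \exp\left[-\frac{1}{2}\left(\frac{\ln x-\mu}{\sigma}\right)^{2}\right] dx \tag{2.42} $$
$$ F(t) = \Phi\left(\frac{\ln t-\mu}{\sigma}\right) \tag{2.43} $$
$$ F(t) = \frac{1}{2}\,\mathrm{erfc}\left[\frac{\ln(t_{50})-\ln(t)}{\sigma\sqrt{2}}\right] \quad \text{for } t \le t_{50} \tag{2.44} $$
$$ F(t) = 1 - \frac{1}{2}\,\mathrm{erfc}\left[\frac{\ln(t)-\ln(t_{50})}{\sigma\sqrt{2}}\right] \quad \text{for } t \ge t_{50} \tag{2.45} $$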
An alternative method to determine and plot the cumulative fraction failed F is to use the number of logarithmic standard deviations, represented by the “Z-values”. The Z-value is the number of standard deviations associated with a cumulative failure fraction F. The values to go from F to Z-value and from Z-value to F can be found in tables or by using Microsoft’s EXCEL program. EXCEL provides the functions Z = NORMSINV(F) and F = NORMSDIST(Z). In general, once the $t_{50}$ and σ values have been found, the time at which any cumulative fraction F has failed can be found with Equation 2.46 [McPherson, 2013].
$$ t_{F\%} = t_{50} \exp(Z_F \times \sigma) \tag{2.46} $$
The following relationships (Equation 2.47) are often used:
$$ t_{16\%} = \frac{t_{50\%}}{\exp(1\sigma)}; \quad t_{1\%} = \frac{t_{50\%}}{\exp(2.33\sigma)}; \quad t_{0.13\%} = \frac{t_{50\%}}{\exp(3\sigma)} \tag{2.47} $$
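The Z-value procedure can also be carried out without tables or a spreadsheet. The following is a minimal sketch that uses `statistics.NormalDist` from the Python standard library, where `inv_cdf` converts F to Z and `cdf` converts Z to F; the median life and σ used in the example are hypothetical values chosen for illustration.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard Normal: mean 0, standard deviation 1


def time_to_fraction(t50, sigma, fraction):
    """t_F% = t50 * exp(Z_F * sigma), as in Equation 2.46."""
    z = STD_NORMAL.inv_cdf(fraction)  # Z-value for cumulative fraction F
    return t50 * math.exp(z * sigma)


def lognormal_cdf(t, t50, sigma):
    """F(t) = Phi((ln t - ln t50) / sigma), as in Equation 2.43."""
    z = (math.log(t) - math.log(t50)) / sigma
    return STD_NORMAL.cdf(z)


def estimate_sigma(t50, t16):
    """sigma is approximately ln(t50 / t16), as in Equation 2.41."""
    return math.log(t50 / t16)


t50, sigma = 10.0, 0.5  # hypothetical median life (years) and log std deviation
for fraction in (0.16, 0.01, 0.0013):
    print(f"t at {fraction:.2%} failed:", time_to_fraction(t50, sigma, fraction))
```

Because the exact Z-values are used rather than the rounded 1, 2.33 and 3, the results differ slightly from the shortcuts in Equation 2.47, but they agree to within about one percent.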
The hazard function for the lognormal is described in Equation 2.48.
$$ h(t) = \frac{\phi\left(\frac{\ln t-\mu}{\sigma}\right)}{t \sigma R(t)} \tag{2.48} $$
If a population follows a lognormal distribution, then the MTTF can be found with Equation 2.49.
$$ MTTF = \exp\left(\mu + \frac{\sigma^{2}}{2}\right) \tag{2.49} $$
The lognormal distribution parameters follow:
Mean (arithmetic average) = exp(µ + 0.5σ²)
Variance = (exp(σ²) − 1) exp(2µ + σ²)
Median (50% failures) = exp(µ)
Mode (highest value of f(t)) = exp(µ − σ²)
Location parameter = exp(µ)
Shape parameter/standard deviation = σ
Estimate of σ = ln(t50/t16)
[McPherson, 2013].
2.2.4 Weibull Distribution
The Weibull distribution is used to model the distribution of lifetimes of objects. It was originally proposed to quantify fatigue data, but it is also used in the analysis of systems involving a “weakest link” [Weisstein, 2017]. Weakest link means that the failure of a system is dominated by the weakest element in the system, which makes the Weibull distribution well suited to system reliability. The Weibull distribution is useful in calculating and plotting IC/ASIC failure mechanisms. An example is Time Dependent Dielectric Breakdown (TDDB), in which the entire capacitor fails when a localized region of the capacitor fails. The Weibull distribution is useful in modeling semiconductor failure rates when failure-in-time data is available.
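The general form of the Weibull probability density function (PDF) is defined by Equation 2.50:
$$ f(t) = \frac{\beta}{\alpha}\left(\frac{t - \gamma}{\alpha}\right)^{\beta - 1} \exp\left[-\left(\frac{t - \gamma}{\alpha}\right)^{\beta}\right] \tag{2.50} $$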
where:
β is the shape parameter
α is the scale parameter
γ is the location parameter, which is usually not needed and can be set to 0
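In most cases of IC/ASIC reliability analysis, the location parameter γ is not required. The Weibull can then be represented by Equation 2.51:
$$ f(t) = \frac{\beta}{\alpha}\left(\frac{t}{\alpha}\right)^{\beta - 1} \exp\left[-\left(\frac{t}{\alpha}\right)^{\beta}\right] \tag{2.51} $$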
Figure 2.13 shows the plots for Weibull PDF.
Figure 2.13: The Weibull PDF [Wikipedia, 2017d]
Figure 2.14 illustrates how the shape parameter affects the shape of the Weibull distribution and how it can be used in reliability analysis. β < 1 corresponds to the infant mortality or early failures of an IC/ASIC, β = 1 to the intrinsic or normal life failures, and β > 1 to the deterioration or aging failures. The failure rate is the frequency at which an IC/ASIC fails and is expressed in failures per unit of time. The Greek letter lambda (λ) is used for the failure rate in reliability engineering. The MTBF is defined as:
$$ MTBF = \frac{1}{\lambda} $$
This relationship is only valid in the normal life, or flat, region of the bathtub curve, where λ is constant. The IC/ASIC lifetimes are typically much less than the MTBF due to early failures and aging failures.
Figure 2.14: The Weibull distribution showing use in Reliability [Spinato et al., 2009]
The cumulative failure probability can be found by Equation 2.52.
$$ F(t) = \int_{0}^{t} f(x)\,dx = 1 - \exp\left[-\left(\frac{t}{\alpha}\right)^{\beta}\right] \tag{2.52} $$
Figure 2.15 shows the plots for Weibull CDF.
Figure 2.15: The Weibull CDF [Wikipedia, 2017d]
Rearranging Equation 2.52 and taking the logarithm of both sides twice gives Equation 2.53.
$$ \ln\left[-\ln(1 - F)\right] = \beta \ln\left(\frac{t}{\alpha}\right) \tag{2.53} $$
When F is 0.63212 the left side of Equation 2.53 equals 0, so t = α. Therefore, for data following the Weibull distribution the 63.2 percentile is the figure of merit for fallout [Strong et al., 2009b]. Generally this is simplified and t63 is substituted for α as shown in Equation 2.54.
$$ \ln\left[-\ln(1 - F)\right] = \beta \ln\left(\frac{t}{t_{63}}\right) \tag{2.54} $$
The slope β can be found by rearranging Equation 2.54, which results in Equation 2.55.
$$ \beta = \frac{\ln\left[-\ln(1 - F)\right]}{\ln\left(\frac{t}{t_{63}}\right)} \tag{2.55} $$
When fitting data it is useful to plot the data using “Weibits” [McPherson, 2013]. A Weibit is defined as in Equation 2.56.
$$ \text{Weibit} = \ln\left[-\ln(1 - F)\right] \tag{2.56} $$
A plot of the collected data with Weibits on the y axis and time t (on a logarithmic scale) on the x axis is shown in Figure 2.16. On the graph of the data, Weibit = 0 corresponds to t63 and the slope is β.
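The graphical Weibit procedure can also be carried out numerically with an ordinary least-squares line through the points (ln t, Weibit). The sketch below is an illustration only: the function names are hypothetical and the data points are synthetic, generated from an assumed Weibull with β = 2 and t63 = 100 hours so that the fit can be checked against known values.

```python
from math import exp, log


def weibit(fraction):
    """Equation 2.56: Weibit = ln[-ln(1 - F)]."""
    return log(-log(1.0 - fraction))


def fit_weibull(times, fractions):
    """Least-squares line through (ln t, Weibit); returns (beta, t63).

    From Equation 2.54, Weibit = beta * ln(t) - beta * ln(t63), so the
    slope is beta and the line crosses Weibit = 0 at t = t63.
    """
    xs = [log(t) for t in times]
    ys = [weibit(f) for f in fractions]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx
    intercept = mean_y - beta * mean_x
    t63 = exp(-intercept / beta)
    return beta, t63


# Synthetic data from an assumed Weibull (beta = 2, t63 = 100 hours).
times = [20.0, 50.0, 100.0, 150.0]
fractions = [1.0 - exp(-((t / 100.0) ** 2.0)) for t in times]
beta, t63 = fit_weibull(times, fractions)
print(f"beta = {beta:.3f}, t63 = {t63:.1f} hours")
```

With real test data the points will scatter around the line, and the quality of the straight-line fit is itself an indication of whether the Weibull distribution is appropriate.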
Figure 2.16: The Weibull distribution plot in terms of Weibits [McPherson, 2013]
Conversions from F (t) to Weibits can be found in tables. An example table is in Figure 2.17.
Figure 2.17: The Weibull distribution plot in terms of Weibits [McPherson, 2013]
Knowing t63 and β, the time to any cumulative fraction failed F can be found by using Equation 2.57.
$$ t_{F\%} = t_{63\%} \exp\left\{\frac{1}{\beta} \ln\left[-\ln(1 - F)\right]\right\} \tag{2.57} $$
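The following relationships (Equation 2.58) are often used [McPherson, 2013].
$$ t_{10\%} = \frac{t_{63\%}}{\exp\left(\frac{2.25}{\beta}\right)}; \quad t_{1\%} = \frac{t_{63\%}}{\exp\left(\frac{4.60}{\beta}\right)}; \quad t_{0.1\%} = \frac{t_{63\%}}{\exp\left(\frac{6.91}{\beta}\right)} \tag{2.58} $$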
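Equation 2.57 and the constants in Equation 2.58 can be checked with a short Python sketch. The function name and the example values (t63 = 100 hours, β = 2) are assumptions made for illustration only.

```python
from math import exp, log


def weibull_time_to_fraction(t63, beta, fraction):
    """Equation 2.57: t_F = t63 * exp((1 / beta) * ln[-ln(1 - F)])."""
    return t63 * exp(log(-log(1.0 - fraction)) / beta)


# The constants 2.25, 4.60 and 6.91 in Equation 2.58 are -ln[-ln(1 - F)].
for fraction in (0.10, 0.01, 0.001):
    constant = -log(-log(1.0 - fraction))
    print(f"F = {fraction:5.3f}  -ln[-ln(1 - F)] = {constant:.2f}")

# Hypothetical population: t63 = 100 hours, beta = 2.
print(f"t10% = {weibull_time_to_fraction(100.0, 2.0, 0.10):.1f} hours")
```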
A comparison of the PDF, CDF and hazard functions for the Exponential, Normal, lognormal and Weibull is shown in Figure 2.18.
Figure 2.18: Comparison of the PDF, CDF and hazard functions for Exponential, Normal, lognormal and Weibull [Industrial-Electronics, 2017]
2.2.5 Gamma Distribution
The gamma distribution is a two-parameter family of continuous probability distributions. The gamma distribution is used in reliability analysis for cases where partial failures can exist, that is, where a given number of partial failures must occur before an item fails (e.g. redundant systems). It also describes the time to the second failure when the time to failure is exponentially distributed. The failure density function or PDF is Equation 2.59.
$$ f(t) = \frac{\lambda}{\Gamma(\alpha)} (\lambda t)^{\alpha - 1} \exp(-\lambda t) \tag{2.59} $$
for t > 0 and where:
mean = µ = α/λ
standard deviation = σ = √α/λ
α is the number of partial failures or events to generate a failure
λ is the complete failure rate
The Γ(α) is the gamma function (Equation 2.60).
$$ \Gamma(\alpha) = \int_{0}^{\infty} x^{\alpha - 1} \exp(-x)\,dx \tag{2.60} $$
Figure 2.19 illustrates the Gamma function for positive real values, which can be evaluated with standard tables. When α is a positive integer, Γ(α) = (α − 1)!, which is usually the case for most reliability analysis, i.e. a partial failure situation. The failure density distribution is then Equation 2.61.
$$ f(t) = \frac{\lambda}{(\alpha - 1)!} (\lambda t)^{\alpha - 1} \exp(-\lambda t) \tag{2.61} $$
Figure 2.19: The Gamma function for real values of α [Pishro-Nik, 2017]
For the case α = 1 the Gamma density function becomes the Exponential density function. Figure 2.20 illustrates the PDF for the Gamma distribution for example values of α.
Figure 2.20: PDF for the Gamma Distribution for values α and λ [Pishro-Nik, 2017]
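The reliability function R(t) = 1 − F(t), where F(t) is the cumulative failure probability (CDF), is shown in Equation 2.62.
$$ R(t) = 1 - F(t) = \frac{\lambda^{\alpha}}{\Gamma(\alpha)} \int_{t}^{\infty} x^{\alpha - 1} \exp(-\lambda x)\,dx \tag{2.62} $$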
It can be shown that when α is an integer (Equation 2.63):
$$ R(t) = 1 - F(t) = \sum_{k=0}^{\alpha - 1} \frac{(\lambda t)^{k} \exp(-\lambda t)}{k!} \tag{2.63} $$
which is a cumulative sum of Poisson probabilities [Math, 2011].
An example [Math, 2011]: An anti-aircraft missile system has demonstrated a gamma failure distribution with α = 3 and λ = 0.05. What is the reliability for a 24 hour mission time, and what is the hazard rate at the end of 24 hours? Since α is an integer, Equation 2.63 applies (Equations 2.64 to 2.66).
$$ R(t) = 1 - F(t) = \sum_{k=0}^{\alpha - 1} \frac{(\lambda t)^{k} \exp(-\lambda t)}{k!} \tag{2.64} $$
$$ R(24) = \sum_{k=0}^{3 - 1} \frac{(0.05(24))^{k} \exp(-0.05(24))}{k!} \tag{2.65} $$
$$ R(24) = 0.301 + 0.362 + 0.216 = 0.88 \tag{2.66} $$
The hazard function for the gamma is Equation 2.67.
$$ h(t) = \frac{f(t)}{R(t)} \tag{2.67} $$
The PDF is shown in Equation 2.68.
$$ f(t) = \frac{\lambda}{(\alpha - 1)!} (\lambda t)^{\alpha - 1} \exp(-\lambda t) \tag{2.68} $$
$$ f(24) = \frac{0.05}{(3 - 1)!} (0.05(24))^{3 - 1} \exp(-0.05(24)) \tag{2.69} $$
$$ f(24) = 0.011 \tag{2.70} $$
and the hazard calculation is Equation 2.71.
$$ h(24) = \frac{f(24)}{R(24)} = \frac{0.011}{0.88} \tag{2.71} $$
$$ h(24) = 0.012 \text{ failures/hour} \tag{2.72} $$
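The worked example can be reproduced with a few lines of Python. This is a minimal sketch that assumes an integer α, so that Equations 2.63 and 2.61 apply; the function names are hypothetical.

```python
from math import exp, factorial


def gamma_reliability(t, alpha, lam):
    """Equation 2.63: R(t) for an integer shape parameter alpha."""
    return sum((lam * t) ** k * exp(-lam * t) / factorial(k) for k in range(alpha))


def gamma_pdf(t, alpha, lam):
    """Equation 2.61: failure density for an integer alpha."""
    return lam / factorial(alpha - 1) * (lam * t) ** (alpha - 1) * exp(-lam * t)


def gamma_hazard(t, alpha, lam):
    """Equation 2.67: h(t) = f(t) / R(t)."""
    return gamma_pdf(t, alpha, lam) / gamma_reliability(t, alpha, lam)


# Values from the example: alpha = 3, lambda = 0.05 per hour, t = 24 hours.
print(f"R(24) = {gamma_reliability(24.0, 3, 0.05):.3f}")
print(f"f(24) = {gamma_pdf(24.0, 3, 0.05):.4f}")
print(f"h(24) = {gamma_hazard(24.0, 3, 0.05):.4f} failures/hour")
```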
2.3 Discrete Distributions
Other useful probability distributions are the Poisson, Binomial and Chi-Square distributions. These distributions are applied to discrete data in physics, engineering and manufacturing (the Chi-Square distribution is itself continuous, but it is used here with discrete count data). They describe the probability of occurrence for discrete random events or processes controlled by chance. They are related and express the probability of a given number of events occurring in a fixed interval of time and/or space, if these events occur with a known average rate and independently of the time since the last event [Franken, 1970]. If a population is sampled in order to determine defects, discrete values are obtained, such as the part is good or the part is defective. Examples are the number of defects observed on a silicon wafer, the number of manufacturing defects observed to come off the assembly line in given time intervals, and the calculation of FIT rates for an IC/ASIC device. These distributions are used to estimate lifetimes when part failure data in time is not available. The following sections give the definitions and a small example of how they can be used in IC/ASIC fabrication.
2.3.1 Poisson Distribution
The Poisson distribution is a discrete distribution used to describe the probability of occurrence of a random event or process. A Poisson random variable must satisfy the following conditions [Bourne, 2018], [Wikipedia, 2017c]:
1. The number of successes in two disjoint time intervals is independent
2. The probability of a success during a small time interval is proportional to the entire length of the time interval
3. The occurrence of one event does not affect the probability that a second event will occur. They occur independently
4. The rate at which events occur is constant. The rate cannot be higher in some intervals and lower in other intervals
5. Two events cannot occur at exactly the same instant; instead, at each very small sub-interval exactly one event either occurs or does not occur
6. The probability of an event in a small sub-interval is proportional to the length of the sub-interval
If these conditions are met, then k is a Poisson random variable and the distribution of k is a Poisson distribution. Often the quantity of interest is the number of times an event occurs in a specific time interval or in a specific length or area. Examples are:
1. Number of defects per silicon wafer
2. Number of late shipments per 1,000 shipments
3. Number of bugs per byte of code
4. Number of pieces scrapped per 1,000,000 pieces produced
5. Survival rate analysis
6. Birth defects and genetic mutations
7. The number of phone calls received at an exchange or call center in an hour
Equation 2.73 is the Poisson distribution.
$$ P(X = k) = f(k) = \frac{\lambda^{k}}{k!} \exp(-\lambda) \tag{2.73} $$
where:
k = 0, 1, 2,...
λ is the mean number of successes in the given time interval or region of space
If λ is the average number of successes occurring in a given time interval or region for the Poisson distribution, then the mean E(X) and the variance V(X) of the Poisson distribution are both equal to λ (Equation 2.74).
$$ E(X) = \lambda; \quad V(X) = \sigma^{2} = \lambda \tag{2.74} $$
The Probability Mass Function (PMF) gives the probability that a discrete random variable is exactly equal to a value. Figure 2.21 shows plots of the Poisson PMF for various values of λ. Figure 2.22 shows the plots for the Poisson CDF.
Figure 2.21: The Poisson PMF [Wikipedia, 2017c]
Figure 2.22: The Poisson CDF [Wikipedia, 2017c]
Example [Wiley, 2017]: Assume that the number of particles of contamination on a wafer follows a Poisson distribution with 1.5 particles per square inch. The specifications for a 6-inch diameter wafer state that there must be 12 or fewer particles in each of the six sectors of the wafer. The objective of this type of specification is to limit the number and the clustering of particles. What yield is expected from the current process?
1. For a six-inch diameter wafer, the area is π × 3² = 28.27 square inches.
2. For each of the six sectors, the area is 28.27/6 = 4.71 square inches.
3. Consequently, the mean number of particles per sector is 4.71 in² × 1.5 particles/in² = 7.06 particles.
Let X have a Poisson distribution with λ = 7.06; the probability that there are 12 or fewer particles in a sector is given by Equation 2.75.
$$ P(X \le 12) = \sum_{k=0}^{12} \frac{\lambda^{k}}{k!} \exp(-\lambda) = \sum_{k=0}^{12} \frac{7.06^{k} \exp(-7.06)}{k!} = 0.9714 = 97.14\% \tag{2.75} $$
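The sum in Equation 2.75 is tedious by hand but short in code. The sketch below is illustrative only; the function name is hypothetical, and the rounded mean of 7.06 particles per sector is taken from the example.

```python
from math import exp, factorial, pi


def poisson_cdf(k_max, lam):
    """P(X <= k_max) for a Poisson random variable with mean lam."""
    return sum(lam ** k * exp(-lam) / factorial(k) for k in range(k_max + 1))


wafer_area = pi * 3.0 ** 2    # 6-inch diameter wafer, square inches
sector_area = wafer_area / 6  # six equal sectors
print(f"sector area = {sector_area:.2f} square inches")

lam = 7.06                    # rounded mean particles per sector, as in the example
print(f"P(X <= 12) = {poisson_cdf(12, lam):.4f}")
```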
2.3.2 Chi-Square Distribution
Another useful probability distribution is the Chi-Square (pronounced Kai-Square) distribution. This distribution is used to estimate lifetimes when part failure data in time is not available. The Chi-Square distribution is used for reliability demonstration test design when the failure rate behavior of the product follows an exponential distribution. It is well known in reliability engineering for testing the “goodness of fit”. The devices/processes are treated as good or defective, so the data take one of two discrete values. The Chi-Square distribution is used in estimating the lifetimes of an IC/ASIC in the High Temperature Operating Lifetime (HTOL) test. HTOL is a reliability test applied to IC/ASICs to determine their intrinsic reliability. The HTOL test stresses the IC/ASICs at elevated temperature, elevated voltage and dynamic operation for a predefined period of time. The IC/ASIC is monitored under stress and tested at intermediate intervals. The HTOL test is sometimes referred to as a “lifetime test”. The value χ2 can be computed by performing k trials and noting the observed value Oi versus the expected value Ei; this sampling distribution follows a Chi-Square distribution as shown in Equation 2.76 [McPherson, 2013].
$$ \chi^{2} = \sum_{i=1}^{k} \frac{(O_i - E_i)^{2}}{E_i} \tag{2.76} $$
The Chi-Square distribution has the following properties [Berman, 2017]:
The mean is equal to the number of degrees of freedom: µ = v.
The variance is equal to two times the number of degrees of freedom: σ2 = 2×v
When the degrees of freedom are ≥ 2, the maximum value for PDF occurs when χ2 = v − 2. As the degrees of freedom increase, the Chi-Square curve approaches a normal distribution.
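As a small illustration of the Chi-Square statistic in Equation 2.76, the following sketch computes χ2 for a good/defective sample. The counts are hypothetical and chosen only for the example, and the squared-deviation form of the statistic is assumed.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O_i - E_i)^2 / E_i over all categories (Equation 2.76)."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical sample of 100 parts: 18 defective and 82 good observed,
# where 10 defective and 90 good were expected.
observed = [18, 82]
expected = [10, 90]
print(f"chi-square = {chi_square_statistic(observed, expected):.3f}")
```

With two categories (defective or good) there is one degree of freedom, so the result would be compared against the Chi-Square distribution with v = 1.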
Figure 2.23 shows the plots for Chi-Square PDF.
Figure 2.23: The Chi-Square PDF [Wikipedia, 2017a]
The Chi-Square distribution is constructed so that the total area under the curve is equal to 1. The area under the curve between 0 and a particular Chi-Square value is the cumulative probability associated with that Chi-Square value. Figure 2.24 shows the plots for Chi-Square CDF.
Figure 2.24: The Chi-Square CDF [Wikipedia, 2017a]
For example, in Figure 2.25 below, the shaded area represents the cumulative probability associated with a Chi-Square statistic equal to A, that is, the probability that the value of a Chi-Square statistic will fall between 0 and A [Berman, 2017] [weibull.com, 2017].
Figure 2.25: Chi Square Statistic [Berman, 2017]
χ2(1 − P, v) is the upper end of the range and χ2(P, v) is the lower end of the specified range. To find a defective fraction F for a single sample of size SS drawn from a large population of IC/ASICs, the expected number of defects is SS × F, but the number of defects observed in the sample is SS × Fs. Let Fs = x/SS, where x is the number of defective IC/ASIC devices; the sample size can then be described by Equation 2.77 [McPherson, 2013].
$$ \frac{\chi^{2}(1 - P, v)}{\left[\left(\frac{x}{SS}\right) - F\right]^{2} / F} \le SS \le \frac{\chi^{2}(P, v)}{\left[\left(\frac{x}{SS}\right) - F\right]^{2} / F} \tag{2.77} $$
Since the IC/ASIC is either defective or non-defective, the number of degrees of freedom is v = k − 1 = 2 − 1 = 1. The sample of size SS is randomly drawn from the population for a P confidence level. The upper end of the range in Equation 2.77 can be used to determine the sample size SS that should be drawn from the population to be at a P confidence that the fraction defective in the population is ≤ F, which gives Equation 2.78 [McPherson, 2013].
$$ SS = \frac{\chi^{2}(P, v = 1)}{\left[\left(\frac{x}{SS}\right) - F\right]^{2} / F} \tag{2.78} $$
Confidence intervals for the Chi-Square distribution can be defined as in Equation 2.79.
$$ \chi^{2}(1 - P, v) \le \chi^{2} \le \chi^{2}(P, v) \tag{2.79} $$
The confidence interval has a Confidence Level (CL), which in general terms quantifies the level of confidence that the parameter lies in the interval.
Using Chi square in Reliability
The Chi square distribution is one of the most widely used probability distributions in inferential statistics. Examples of where Chi square is used follows:
1. Goodness of fit tests
2. In hypothesis testing
3. Estimating variances
4. Independence of two criteria of classification of qualitative data
5. Friedman’s analysis of variance by ranks
6. Estimating the slope of a regression line via its role in Student’s t-distribution
7. Analysis of variance problems via its role in the F-distribution [Morteza and Ahmadabadi, 2010]
The Chi square can be used when the failure distribution is exponential such as IC/ASIC aging where the failure in time distribution is exponential. The exponential distribution PDF is shown in Equation 2.80 [weibull.com, 2017] [McPherson, 2013].
$$ f(t) = \lambda \exp(-\lambda t) \tag{2.80} $$
where λ is the failure rate. The reliability function R(t) for an exponential distribution is shown in Equation 2.81.
$$ R(t) = \exp(-\lambda t) \tag{2.81} $$
For the exponential distribution, the Mean Time To Failure (MTTF) is the inverse of the failure rate, which is equal to the mean of the exponential distribution, as shown in Equation 2.82.
$$ MTTF = \frac{1}{\lambda} \tag{2.82} $$
There is a one-to-one relationship between the failure rate, the MTTF and the reliability R. If T is the accumulated test time, then Equation 2.83 relates T to the MTTF.
$$ \frac{\chi^{2}_{CL,2(r+1)}}{2T} = \frac{1}{MTTF} \tag{2.83} $$
where r is the number of the failures and CL is the confidence level.
The Chi-Square distribution can be used for the design of a reliability demonstration test for the following reason [weibull.com, 2017]. When the failure times follow an exponential distribution, the number of failures in the time interval T follows a Poisson distribution with the associated parameter λT, as shown in Equation 2.84.
$$ P[N(T) = i] = \frac{(\lambda T)^{i} \exp(-\lambda T)}{i!} \tag{2.84} $$
where N(T) is the number of events during time T. The upper bound of the failure rate λ can be found by Equation 2.85.
$$ 1 - CL = \sum_{i=0}^{r} \frac{(\lambda T)^{i} \exp(-\lambda T)}{i!} \tag{2.85} $$
where:
r is the total number of failures
CL is the confidence level
λ is the failure rate at the confidence level of CL
From Equation 2.85 the relationship from the Poisson distribution to the Chi-Square distribution can be shown. Letting x = λT, Equation 2.85 becomes Equation 2.86.
$$ 1 - CL = \sum_{i=0}^{r} \frac{x^{i} \exp(-x)}{i!} \tag{2.86} $$
For a given confidence level CL, the corresponding upper bound of a random variable X can be solved by Equation 2.87.
$$ \Pr(X < x) = CL = 1 - \sum_{i=0}^{r} \frac{x^{i} \exp(-x)}{i!} \tag{2.87} $$
It can be shown that Equation 2.87 can be related to the Gamma distribution Y ∼ Gamma(k, λ). The CDF for the Gamma distribution is Equation 2.88.
$$ \Pr(Y < y) = F(y, k, \lambda) = 1 - \sum_{i=0}^{k-1} \frac{(\lambda y)^{i} \exp(-\lambda y)}{i!} \tag{2.88} $$
Comparing Equation 2.87 with Equation 2.88, it can be seen that X ∼ Gamma(r + 1, 1). Based on the properties of a Gamma random variable, 2X ∼ Gamma(r + 1, 2), and the Chi-Square distribution χ2 with 2(r + 1) degrees of freedom is the special case of the Gamma distribution for a random variable that follows Gamma(r + 1, 2). Therefore, 2X follows the Chi-Square distribution with 2(r + 1) degrees of freedom. Since x = λT, the upper bound of the failure rate is Equation 2.89.
$$ \lambda = \frac{\chi^{2}_{CL,2(r+1)}}{2T} = \frac{1}{MTTF} \tag{2.89} $$
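The upper bound in Equation 2.89 can be obtained without Chi-Square tables by solving the Poisson form in Equation 2.85 numerically for λT. The sketch below uses simple bisection and only the Python standard library; the function names and the test values (zero failures in 1,000,000 device-hours at a 60% confidence level) are assumptions chosen for illustration.

```python
from math import exp, factorial, log


def poisson_cdf(r, x):
    """Probability of r or fewer events when the mean count is x."""
    return sum(x ** i * exp(-x) / factorial(i) for i in range(r + 1))


def failure_rate_upper_bound(failures, test_time, confidence):
    """Solve Equation 2.85 for lambda by bisection on x = lambda * T.

    The solution x equals one half of the Chi-Square value at the given
    confidence level with 2(r + 1) degrees of freedom.
    """
    target = 1.0 - confidence
    low, high = 0.0, 1.0
    while poisson_cdf(failures, high) > target:  # expand until bracketed
        high *= 2.0
    for _ in range(200):
        mid = 0.5 * (low + high)
        if poisson_cdf(failures, mid) > target:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high) / test_time


# Hypothetical test: 0 failures in 1,000,000 device-hours, CL = 60%.
lam = failure_rate_upper_bound(0, 1.0e6, 0.60)
print(f"lambda <= {lam:.3e} failures/hour, MTTF >= {1.0 / lam:.3e} hours")
```

For zero failures the result reduces to λ = −ln(1 − CL)/T, which gives a quick check of the numerical solution.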
Chapter 3
Aging Mechanism of CMOS Devices
In this section the following mechanisms of the aging process for a CMOS device will be briefly discussed for completeness: Negative Bias Temperature Instability (NBTI), Positive Bias Temperature Instability (PBTI), and Hot Carrier Injection (HCI). These mechanisms create charged centers in the gate oxide layer over time when the IC/ASIC is operating. These charged centers affect the potential distribution within the transistor and degrade transistor performance by increasing the transistor threshold voltage (Vth) and decreasing the saturation drive current (IDsat) and transconductance (gm). The exponential character of the model equations for both NBTI and HCI reveals extreme sensitivity of the device lifetime to the variation of the device parameters that are controlled by the manufacturing process. CMOS technology scaling and the rising complexity of the manufacturing process create variations of device parameters that are critical for device lifetime [Shiyanovskii et al., 2010].
This section will also discuss the failure mechanisms Electromigration (EM) and Time Dependent Dielectric Breakdown (TDDB). EM affects the resistance of the interconnect wires, vias and transistor contacts over time. This increase in resistance will cause an increase in the delay of a circuit, and eventually a timing failure will occur. TDDB is a catastrophic failure, not a degradation or aging mechanism, and must be accounted for in the design and manufacture of the dielectrics in the IC/ASIC.
3.0.1 xBTI = NBTI and PBTI
NBTI and PBTI effects have been known since the very early developments of MOSFET devices. In nano-scale IC/ASIC technologies they are major degradation mechanisms. NBTI affects the p-channel MOSFETs and PBTI affects the n-channel MOSFETs. NBTI causes a significant threshold voltage shift (50 - 100 mV) and a decrease in drain current, mostly in p-channel MOSFETs under a negative gate bias and at elevated temperatures (100 − 150 ◦C). PBTI occurs mostly in n-channel MOSFETs under a positive gate bias and at elevated temperatures; see Figure 3.1 [Shiyanovskii et al., 2009b]. For the discussion, NBTI and PBTI will be combined as xBTI.
There are many models proposed to explain the xBTI effects. These include oxide hole injection, electron tunneling and the diffusion-reaction models. The most accepted model is the diffusion-reaction (or electrochemical) model, which relates the activation energy Ea to the diffusion of hydrogen that dissociates at the interface from the hydrogen-terminated Si bonds (Si − H). The deposition processes for current MOSFETs employ an oxynitride (SiON) layer as the gate dielectric material, deposited by plasma enhanced CVD or rapid thermal oxidation in the presence of NO or NO2 gases [Shiyanovskii et al., 2009b].
The transistors of an inverter are under stress in a design as shown in Figure 3.2, which illustrates when the conditions for xBTI occur in an IC/ASIC design.
Figure 3.1: Threshold voltage shifts as a function of stress time under NBTI vs channel length [Yan-Rong et al., 2010]
Figure 3.2: CMOS Inverter xBTI stress in a design [Shiyanovskii et al., 2009c]
The xBTI effect is caused by the breaking of Si − H bonds at the Si − SiO2 interface in the diffusion-reaction model [Liu et al., 2002], [Alam, 2003], [Schroder and Babcock, 2003]. The transistor channel consists of a silicon lattice (Si), which interfaces with a gate oxide layer consisting of silicon dioxide (SiO2), shown in Figure 3.3. Due to a crystal mismatch at the Si − SiO2 interface, there are dangling Si bonds, called interface traps, which are present at the interface. Hydrogen (H) is usually used to passivate most of the interface traps [Shiyanovskii et al., 2010].
Figure 3.3: Silicon Lattice interface at Gate Oxide [Shiyanovskii et al., 2009c]
The breaking of the Si − H bonds occurs when the MOSFET transistors are under stress. This results in a dangling silicon interface trap, an increase in Vth, and the released hydrogen forming H2 within the gate oxide as shown in Figure 3.4. The change in the number of interface traps can be described as ∆Nit ∝ Σ(broken Si − H bonds). The changes in Vth, IDsat and transistor delay τd are also proportional to the change in Nit as shown in Equation 3.1:
$$ \Delta\tau_d \propto \Delta V_{th} \propto \Delta I_{Dsat} \propto \Delta N_{it} \propto \sum (\text{broken Si-H bonds}) \tag{3.1} $$
The aging mechanism as a function of time can be described using the static NBTI neutral H2 diffusion model [Vattikonda, 2006] shown in Equation 3.2.
$$ N_{it} = K^{\frac{2}{3}} \times t^{\frac{1}{6}} + N_{it0} \tag{3.2} $$
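The value K is calculated by Equation 3.3.
$$ K \propto \sqrt{\frac{\varepsilon_{ox}}{T_{ox}} (V_{gs} - V_{th})} \times \exp\left(\frac{V_{gs} - V_{th}}{T_{ox} E_0}\right) \times \exp\left(\frac{-E_a}{kT}\right) \tag{3.3} $$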
Figure 3.4: Illustrating the effects of PMOS NBTI effect on a CMOS inverter under stress [Shiyanovskii et al., 2009c]
where Nit0 is the initial number of interface traps, t is the stress time, Tox is the gate oxide thickness, Vgs is the gate to source voltage, Vth is the threshold voltage, k is the Boltzmann constant, T is the temperature, and E0 and Ea are fitted coefficients of 1.9 to 2.0 MV/cm and 0.12 eV, respectively. Equation 3.4 describes how NBTI causes a shift in a transistor’s performance.
$$ \Delta p = A_0\, e^{\beta V_{gs}}\, e^{\frac{E_a}{k_B T}}\, t^{n} \tag{3.4} $$
where ∆p is a shift in a device parameter such as ∆Vth or ∆IDsat, A0 is derived from the CMOS gate oxide and process technology, β is the measured gate voltage sensitivity, Vgs is the gate to source voltage, Ea is the apparent activation energy, t is the stress time, n is the measured stress time exponent, kB is the Boltzmann constant, and T is the temperature. Equations 3.3 and 3.5 demonstrate that the NBTI effect increases with elevated temperature.
The effect is due to the creation of interface traps and to the buildup of positive oxide charge over periods of time (from several months to years depending on the device operation conditions). Trapped holes can be thermally activated and can cause dissociation of oxide defects. The threshold voltage shift ∆Vth is expressed as shown in Equation 3.5.
$$ \Delta V_{th} = A_0(V_g; t) \exp\left(\frac{-E_a}{kT}\right) \tag{3.5} $$
where Ea is the NBTI activation energy and A0(Vg; t) is a function of gate voltage and time [Bernstein et al., 2006], [Gielen et al., 2008], [Stathis and Zafar, 2006], [Shiyanovskii et al., 2010]. NBTI temperature dependence is shown in Figure 3.5.
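The temperature dependence in Equation 3.5 can be illustrated by taking the ratio of the threshold voltage shift at two temperatures, which cancels the prefactor A0(Vg; t). The sketch below is an illustration under stated assumptions: the prefactor is taken to be the same at both temperatures, the activation energy of 0.12 eV is borrowed from the fitted coefficient quoted for Equation 3.3, and the function name and temperatures are hypothetical.

```python
from math import exp

K_BOLTZMANN_EV = 8.617333e-5  # Boltzmann constant in eV/K


def nbti_temperature_factor(temp_low_c, temp_high_c, ea_ev=0.12):
    """Ratio of the Equation 3.5 shift at two temperatures.

    Assumes the prefactor A0(Vg; t) is identical at both temperatures,
    so only the exp(-Ea / kT) term contributes to the ratio.
    """
    t_low = temp_low_c + 273.15   # convert Celsius to Kelvin
    t_high = temp_high_c + 273.15
    shift_low = exp(-ea_ev / (K_BOLTZMANN_EV * t_low))
    shift_high = exp(-ea_ev / (K_BOLTZMANN_EV * t_high))
    return shift_high / shift_low


factor = nbti_temperature_factor(25.0, 125.0)
print(f"Shift at 125 C relative to 25 C: {factor:.2f}x")
```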
Figure 3.5: Measurement of NBTI at different temperatures
3.0.2 Hot Carrier Injection (HCI)
The term hot carrier injection describes electrons (or holes) that have accumulated sufficient kinetic energy to overcome a potential barrier and be injected into the gate oxide. Such accumulation occurs in a high electric field for electrons that have avoided subsequent scattering with the lattice atoms. The operational conditions and the toggling frequency of the CMOS transistor are direct contributors to the HCI rate [Schroder and Babcock, 2003]. The carriers must overcome the Si − SiO2 energy barrier of about 3.7 eV for electrons and 4.6 eV for holes. For NMOS, hot electrons are produced and for PMOS hot holes are produced. The mean free path of hot electrons decreases with the rise in temperature, so the HCI effect is enhanced at low temperature operation. Injection of hot carriers can result in the generation of new traps at or near the Si − SiO2 interface or the generation of new traps in the oxide itself. The traps located at the Si − SiO2 interface affect the transconductance, gm, and leakage current of the device. The traps that are located in the gate oxide increase the threshold voltage, Vth. The carriers can also increase the substrate current, Isub. Thus, the HCI degradation can be monitored through shifts in the threshold voltage or transconductance and drain current.
There are four known mechanisms for HCI, which describe the conditions under which carriers enter the gate oxide, as listed below:
CHE: Channel hot-electron
DAHC: Drain Avalanche hot-carrier
SGHE: Secondary generated hot-electron
SHE: Substrate hot-electron
The conditions necessary for the hot carrier mechanisms are listed below:
CHE: Vg ≈ Vd
DAHC: Vg << Vd
SGHE: Vd > Vg
SHE: |Vsub| >> 0
Figure 3.6 illustrates the injection mechanisms for CHE, DAHC, SGHE and SHE. The effects of HCI are more prominent in NMOS devices than in PMOS devices. It requires 3.3 eV for electrons to overcome the surface energy barrier at the Si − SiO2 interface and be injected into the oxide, compared to 4.6 eV for holes [Schroder and Babcock, 2003]. Hot carrier effects are brought about or aggravated by reductions in device dimensions without corresponding reductions in operating voltages, resulting in higher electric fields internal to the device.
Figure 3.6: HCI effects: CHE (Vd = Vg), DAHC (Vd = 2Vg), SGHE (Vd > Vg), SHE (|Vsub| >> 0) [Shiyanovskii et al., 2009c]
CHE injection is at a maximum with Vg ≈ Vd. Channel carriers that travel from the source to the drain are sometimes driven towards the gate oxide and can be trapped in the SiO2 region. DAHC injection occurs under the stress condition Vg << Vd, in which hot electrons and hot holes are injected into the dielectric. The carriers gain their energy from the high electric field in the drain region. SGHE is a photon generation process with the stress condition Vd > Vg. Photons are generated in the high field region near the drain and induce electron-hole pairs. The second effect is the avalanche condition near the drain region leading to the injection of both electrons and holes into the gate dielectric. SHE injection is a result of a high positive or negative bias at the substrate (|Vsub| >> 0).
Figure 3.6 shows the operating region where HCI conditions CHE and DAHC occur during IC/ASIC operation. The most physically destructive HCI mechanism is DAHC injection [Takeda et al., 1983]. This type of carrier injection occurs when the drain voltage, Vd, is much greater than the gate voltage, Vg (worst case Vd = 2Vg). Such conditions create a very high electric field near the drain region. This high electric field accelerates the carriers into the drain depletion region. The accelerated carriers collide with Si lattice atoms and, through impact ionization, create displaced electron-hole pairs, shown by the yellow region in Figure 3.6. The majority of the generated holes are usually absorbed by the substrate and thus increase the substrate current, Isub. Some of the generated electrons proceed to the drain and result in increased drain current, Id. However, some of the electron-hole pairs gain enough energy to breach the Si − SiO2 interface energy barrier, 3.7 eV for electrons and 4.6 eV for holes.
Once the carriers have passed the energy barrier of the Si − SiO2 interface, they can either be trapped at the Si − SiO2 interface, be trapped within the oxide itself, or become gate current, Ig. After the bulk silicon has been cleaved and the exposed silicon bonds have been passivated during the manufacturing process, Si − H bonds are formed as shown in Figure 3.3. Carriers with enough energy, 0.3 eV, can break these weak Si − H bonds and get trapped, thus forming a space charge. Over time, as more and more carriers are trapped, there is an increase in threshold voltage, Vth, and a change in transconductance, gm, for NMOS transistors. These changes in device characteristics result in decreased performance and eventual device failure [Shiyanovskii et al., 2009c].
There have been several models describing the effect of HCI on device lifetime.
The first model, shown in Equation 3.6, expresses the generated interface traps through a set of experimentally measured characteristics: the substrate current (Isub), the drain current (Id), the degradation parameters extracted from wafer measurements (n, m and H), the time duration per transition (TS), and the total number of transistor switching events (NS) [Jiang et al., 1998].
$$ \Delta N_{it} \propto \left( N_S \times \frac{1}{WH} \int_{0}^{T_S} I_d^{1-m}(t) \times I_{sub}^{m}(t)\,dt \right)^{n} \tag{3.6} $$
The second model characterizes the HCI effect through the mean free path of the hot electron, λe, shown in Equation 3.7 [Li et al., 2008].
$$ \Delta N_{it} = C_1 \left[ \frac{I_{ds}}{W} \times \exp\left(\frac{-\phi_{it,e}}{q \lambda_e E_m}\right) \right]^{n} \times t^{n} \tag{3.7} $$
where W is the width of the CMOS device, Em is the electric field at the drain, Ids is the drain to source current, φit,e is the critical energy of an electron to generate interface traps ∆Nit, q is the charge of an electron, t is the stress time, the value of n ranges between 0.5 and 1, and C1 is a process constant [Jiang et al., 1998].
The third model expresses device the lifetime through the substrate current, Isub (Equation 3.8 [Li et al., 2008].
\[ t_f = A_{HCI} \left( \frac{I_{sub}}{W} \right)^{n} \exp\left( \frac{E_{aHCI}}{k_B T} \right) \qquad (3.8) \]
where tf is the time to failure, Isub is the substrate current, EaHCI is the apparent activation energy (0.1 to 0.2 eV), kB is Boltzmann's constant, T is the temperature in Kelvin, n is a process-dependent constant, and AHCI is a model pre-factor. The exponential character of the model equations for HCI (and NBTI) reveals the extreme sensitivity of the device lifetime to variation in the device parameters that are controlled by the manufacturing process [Shiyanovskii et al., 2010], [Shiyanovskii et al., 2009c], [Shiyanovskii et al., 2009b].
3.0.3 TDDB
Time Dependent Dielectric Breakdown (TDDB) is not an aging mechanism. TDDB is due to very high electric fields in the gate dielectric of the MOSFET devices in an IC/ASIC. After a period of degradation due to bond breakage and/or trap creation, the dielectric eventually undergoes a breakdown, as illustrated in Figure 3.7. The breakdown is caused by a thermal runaway condition due to high current flow. This localized current density and the associated severe heating can result in a conductive filament forming in the dielectric, shorting the gate to the substrate in the MOSFET device (see Figure 3.8). A failure due to TDDB is a catastrophic failure.
Figure 3.7: A. Dielectric degradation occurs due to broken bonds/trap-creation in the dielectric material and at the SiO2/Si interface [McPherson, 2013]
Figure 3.8 shows hole trapping initially, followed by electron trapping, which continues up to the point of catastrophic breakdown, whereby the localized Joule heating produces a melt-filament shorting the poly-gate and silicon substrate. In very thin dielectrics (10 nm), the pre-breakdown leakage may show a stress-induced leakage current increase prior to breakdown of the dielectric. Also, hyper-thin dielectrics (4 nm) can show soft breakdown characteristics [McPherson, 2013]. Historically, two models have been used to describe TDDB: the E-Model, which is field driven, and the 1/E-Model, which is current driven.
Figure 3.8: Poly Short due to TDDB [McPherson, 2013]
TDDB: Exponential E-Model
In the E-Model, TDDB at low fields and high temperatures is attributed to field-enhanced thermal bond breakage. The field stretches the molecular bonds, making them weaker and more susceptible to breakage by Boltzmann thermal processes. The time to failure, tf, in Equation 3.9 is the inverse of the degradation rate and decreases exponentially with the field [McPherson, 2013].
\[ t_f = A_0 \exp\left( -\gamma E_{ox} \right) \exp\left( \frac{Q}{k_B T} \right) \qquad (3.9) \]
where:
γ is the field acceleration parameter,
Eox is the electric field in the oxide, given by the voltage dropped across the dielectric, Vox, divided by the oxide thickness, tox,
Q is the activation energy (enthalpy of activation), and
A0 is a process/material-dependent coefficient that varies for each node and causes tf to become a time-to-failure distribution, usually modeled as a Weibull distribution.
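As an illustration of how Equation 3.9 is commonly applied, the following minimal sketch computes the acceleration factor between a stress condition and a use condition. Because A0 cancels in the ratio, only γ, Q, the two fields and the two temperatures are needed. The function names and every numeric value (γ = 4 cm/MV, Q = 0.7 eV, the fields and the temperatures) are hypothetical placeholders chosen for illustration and are not taken from this work.

```python
import math

K_B_EV = 8.617333e-5  # Boltzmann's constant in eV/K


def e_model_tf(a0, gamma, e_ox, q_act, temp_k):
    """E-Model time to failure, Equation 3.9."""
    return a0 * math.exp(-gamma * e_ox) * math.exp(q_act / (K_B_EV * temp_k))


def e_model_acceleration(gamma, q_act, e_stress, temp_stress, e_use, temp_use):
    """Ratio tf(use) / tf(stress); the prefactor A0 cancels."""
    field_term = math.exp(gamma * (e_stress - e_use))
    thermal_term = math.exp((q_act / K_B_EV) * (1.0 / temp_use - 1.0 / temp_stress))
    return field_term * thermal_term


# Hypothetical illustration values, not measured data.
GAMMA = 4.0   # field acceleration parameter, cm/MV
Q_ACT = 0.7   # activation energy, eV

af = e_model_acceleration(GAMMA, Q_ACT,
                          e_stress=10.0, temp_stress=398.15,  # MV/cm, K
                          e_use=5.0, temp_use=378.15)         # MV/cm, K
print(f"E-Model acceleration factor (stress -> use): {af:.3e}")
```

The same ratio can be obtained by calling `e_model_tf` twice with any common A0, which is a convenient consistency check when the parameters are fitted from measured data.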
TDDB: Exponential 1/E-Model
In the 1/E-Model, TDDB damage is assumed to be caused by current flow through the dielectric due to Fowler-Nordheim (F-N) conduction. Electrons are injected by F-N conduction into the conduction band of the dielectric and are accelerated from the cathode toward the anode. As the electrons are accelerated through the dielectric, impact ionization can damage the dielectric. Hot holes can also be produced, which could tunnel back into the dielectric and cause further damage.
Since the degradation of the dielectric is the result of F-N conduction, tf has an exponential dependence on the reciprocal of the electric field, i.e. 1/E, as shown in Equation 3.10 [McPherson, 2013]:
\[ t_f = \tau_0(T) \exp\left( \frac{G(T)}{E_{ox}} \right) \qquad (3.10) \]
where:
τ0(T) is a temperature-dependent prefactor
G(T) is a temperature-dependent field acceleration parameter for the 1/E-Model.
TDDB: Power-Law Voltage V-Model
For thin SiO2 dielectrics (40 Å), a power-law voltage model has been proposed for TDDB in the form of Equation 3.11:
\[ t_f = B_0(T) \, [V]^{-n} \qquad (3.11) \]
This model assumes ballistic transport, with no scattering or energy loss in the ultra-thin dielectric film, so the amount of energy delivered to the anode is e × V. For ultra-thin oxides, the observed exponent is in the range n = 40 to 48 [McPherson, 2013].
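To give a feel for how steep a power law with n between 40 and 48 is, the short sketch below evaluates the lifetime ratio implied by Equation 3.11 for a small change in voltage at fixed temperature, where B0(T) cancels. The function name and the 5% voltage reduction are assumptions made only for this illustration.

```python
def v_model_lifetime_ratio(v_ref, v_new, n):
    """tf(v_new) / tf(v_ref) from Equation 3.11 at fixed temperature; B0(T) cancels."""
    return (v_new / v_ref) ** (-n)


# Hypothetical 5% reduction in voltage; n spans the range quoted in the text.
for n in (40, 44, 48):
    ratio = v_model_lifetime_ratio(v_ref=1.00, v_new=0.95, n=n)
    print(f"n = {n}: lifetime changes by a factor of {ratio:.1f}")
```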
TDDB: Exponential √E-Model
For poor-quality SiO2 or low-k dielectrics, the mechanism has been proposed to be the Poole-Frenkel effect (a means by which an insulator can conduct electricity) or Schottky conduction. The tf model for this current-induced degradation is shown in Equation 3.12 [McPherson, 2013]:
\[ t_f = C_0(T) \exp\left[ -\alpha \sqrt{E} \right] \qquad (3.12) \]
where the root-field acceleration parameter α is given by Equation 3.13:
\[ \alpha = -\left[ \frac{\partial \ln(t_f)}{\partial \sqrt{E}} \right]_{T} \qquad (3.13) \]
The physics behind each of the TDDB models is different, and there is no consensus in the industry on which model to use. McPherson has ranked the results as shown in Figure 3.9, which shows the best fits of the four models to the same set of accelerated TDDB data. All of the models fit the four accelerated TDDB data points very well, but their extrapolated results at lower electric fields are quite different. The E-Model gives the shortest time-to-failure when the results are extrapolated to lower electric fields, and the 1/E-Model gives the longest. The E-Model is therefore the most conservative in its projection of tf, followed by the √E-Model, then the V-Model, and lastly the 1/E-Model, which is the most optimistic [McPherson, 2013].
Figure 3.9: The four models best fittings to the same set of accelerated TDDB data [McPherson, 2013].
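The divergence between the models under extrapolation can be reproduced with a small numerical sketch. Below, the E-Model and the 1/E-Model are each calibrated through the same two accelerated-stress points at a fixed temperature and then evaluated at a lower field. The two data points and the use field are hypothetical values invented for illustration, not McPherson's data, and the work is done in ln(tf) to avoid numerical overflow.

```python
import math


def fit_e_model(e1, t1, e2, t2):
    """Fit ln(tf) = ln_a - gamma * E through two points (fixed temperature)."""
    gamma = (math.log(t1) - math.log(t2)) / (e2 - e1)
    ln_a = math.log(t1) + gamma * e1
    return ln_a, gamma


def fit_inv_e_model(e1, t1, e2, t2):
    """Fit ln(tf) = ln_tau + G / E through the same two points."""
    g = (math.log(t1) - math.log(t2)) / (1.0 / e1 - 1.0 / e2)
    ln_tau = math.log(t1) - g / e1
    return ln_tau, g


def ln_tf_e_model(ln_a, gamma, e):
    return ln_a - gamma * e


def ln_tf_inv_e_model(ln_tau, g, e):
    return ln_tau + g / e


# Hypothetical accelerated-stress points: field in MV/cm, time to failure in s.
E1, T1 = 10.0, 1.0e4
E2, T2 = 12.0, 1.0e2
E_USE = 5.0  # hypothetical use field, MV/cm

ln_a, gamma = fit_e_model(E1, T1, E2, T2)
ln_tau, g = fit_inv_e_model(E1, T1, E2, T2)

log10_e = ln_tf_e_model(ln_a, gamma, E_USE) / math.log(10.0)
log10_inv_e = ln_tf_inv_e_model(ln_tau, g, E_USE) / math.log(10.0)
print(f"E-Model   log10(tf/s) at {E_USE} MV/cm: {log10_e:.2f}")
print(f"1/E-Model log10(tf/s) at {E_USE} MV/cm: {log10_inv_e:.2f}")
```

Both fits pass exactly through the calibration points, yet the 1/E-Model projects a much longer lifetime at the lower field, because ln(tf) is linear in E for the E-Model and convex in E for the 1/E-Model.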
3.0.4 Electromigration (EM)
Electromigration was discovered more than 100 years ago. The first observation was reported by the French physicist M. Gerardin in 1861. It emerged as an important area of study in the late 1960s, when electromigration damage was found to have caused failure of conductor lines in integrated circuits [Ho and Kwok, 1989]. Although EM exists whenever current flows through a metal wire, the conditions necessary for EM to be a problem simply did not exist before then. It became a concern only under the relatively severe conditions required for the operation of IC/ASICs, and increasingly so as the geometries of IC/ASIC interconnect wires keep shrinking [Lloyd, 2002]. James R. Black of Motorola Inc. studied EM in semiconductors in 1967. He carried out experimental work that led to the development of Black's Law. The original Black's Law model is shown in Equation 3.14 [Black, 1967].
\[ t_{f50\%} = A_0 \times j^{-2} \times e^{\frac{E_a}{k_B T}} \qquad (3.14) \]
where:
tf50% is the median time to failure in hours, i.e. the time for 50% of the interconnect wires to fail
A0 is a constant including the failure condition in hours/A/cm2
j is the current density in the interconnect wire
Ea is the activation energy in electron volts, ranging from 0.2 eV to 1.33 eV for Aluminum
2 is the current density exponent (n = 2, nucleation)
kB is Boltzmann's constant
T is the temperature in Kelvin
There are many more proposed equations for EM. The General Black's Law model shown in Equation 3.15 was generalized by Blair [Blair et al., 1971] because the j−2 current density dependence in Black's Law (Equation 3.14) did not fit measured data as the process nodes shrank. It was found that n ranged from 1 to 2. The industry had to switch from Aluminum to Copper as the interconnect wires shrank; the reasons are described in Chapter 4. Aluminum has n = 2 and an activation energy (Ea) of 0.2 eV to 1.33 eV. Copper was found to have n = 1 and an Ea of 0.4 eV to 2.07 eV.
\[ t_f = A_0 \times j^{-n} \times e^{\frac{E_a}{k_B T}} \qquad (3.15) \]
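A minimal sketch of how the generalized Black's Law is used to translate an accelerated test into a use-condition lifetime is given below. It evaluates the ratio tf(use)/tf(stress), in which A0 cancels, for n = 1 and n = 2. The function names, the activation energy of 0.9 eV, and the stress and use conditions are hypothetical values chosen only to illustrate the arithmetic.

```python
import math

K_B_EV = 8.617333e-5  # Boltzmann's constant in eV/K


def black_tf(a0, j, ea, temp_k, n=2.0):
    """Generalized Black's Law, Equation 3.15 (n = 2 gives Equation 3.14)."""
    return a0 * j ** (-n) * math.exp(ea / (K_B_EV * temp_k))


def black_acceleration(j_stress, temp_stress, j_use, temp_use, ea, n):
    """Ratio tf(use) / tf(stress); the constant A0 cancels."""
    current_term = (j_stress / j_use) ** n
    thermal_term = math.exp((ea / K_B_EV) * (1.0 / temp_use - 1.0 / temp_stress))
    return current_term * thermal_term


# Hypothetical conditions: current densities in A/cm^2, temperatures in K.
J_STRESS, T_STRESS = 2.0e6, 573.15
J_USE, T_USE = 0.5e6, 378.15
EA = 0.9  # eV, hypothetical

af_by_n = {n: black_acceleration(J_STRESS, T_STRESS, J_USE, T_USE, EA, n)
           for n in (1.0, 2.0)}
for n, af in af_by_n.items():
    print(f"n = {n:.0f}: acceleration factor = {af:.3e}")
```

With these inputs the two exponents differ only through the current term, so the choice of n changes the projected lifetime by the ratio of the stress and use current densities.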
The Duty cycle model (Equation 3.16) takes into account the duty cycle of the AC waveform (d) in the interconnect wire [English et al., 1972], [Hummel and Hoang, 1989], [Tao et al., 1994].
\[ t_f = \frac{A_0}{d^{m} j^{n}} \, e^{\frac{E_a}{k_B T}} \qquad (3.16) \]
The Blech Length model in Equation 3.17 was proposed when it was observed that a critical current density (jc) was needed before EM occurred [Arzt and Nix, 1991], [Blech, 1976].
\[ t_f = A_0 \left( j - j_c \right)^{-n} e^{\frac{E_a}{k_B T}} \qquad (3.17) \]
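The effect of the critical current density can be sketched as follows. The function below evaluates Equation 3.17 and, as an interpretation made for this example, returns an infinite lifetime when j does not exceed jc, reflecting the observation that no EM occurs below the critical current density. All parameter values are hypothetical.

```python
import math

K_B_EV = 8.617333e-5  # Boltzmann's constant in eV/K


def blech_tf(a0, j, j_c, ea, temp_k, n):
    """Equation 3.17; returns math.inf when j does not exceed j_c (assumed convention)."""
    if j <= j_c:
        return math.inf
    return a0 * (j - j_c) ** (-n) * math.exp(ea / (K_B_EV * temp_k))


# Hypothetical values: current densities in A/cm^2, temperature in K.
A0, J_C, EA, TEMP_K, N = 1.0, 0.5e6, 0.9, 378.15, 1.0
for j in (0.4e6, 0.6e6, 1.0e6, 2.0e6):
    print(f"j = {j:.1e} A/cm^2 -> tf = {blech_tf(A0, j, J_C, EA, TEMP_K, N):.3e}")
```

Setting jc to zero recovers the generalized Black's Law form, so the same routine can serve both models.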
The Width model in Equation 3.18 added the width of the interconnect W [Black, 1978], [Merchant, 1982], [Young and Christou, 1994], [Yan et al., 2005].
\[ t_f = A_0 W t \times j^{-2} \times e^{\frac{E_a}{k_B T}} \qquad (3.18) \]
The Contact model was proposed to account for the metal-semiconductor contacts of the CMOS transistors (Equation 3.19). W is the contact width, L is the contact length, and α is a constant that relates grain size and distribution [Prokop and Joseph, 1972], [Veshinfsky, 1995]. B0 is a process constant. It is differentiated from A0 in Black's Law as it may be different for contacts as opposed to vias.
\[ t_f = \frac{B_0 W}{j} \, e^{\frac{\alpha}{L} + \frac{E_a}{k_B T}} \qquad (3.19) \]
The Self-heating (Joule) model shown in Equation 3.20, considered the effects of the self heating due to the current flow in the interconnect [Black, 1982].
\[ t_f = A_0 \times j^{-2} \, e^{\frac{E_a}{k_B \left( T_{self} + T_{metal} \right)}} \qquad (3.20) \]
Researchers created an Analytical model considering diffusion concurrently with EM (Equation 3.21), which resulted in the original Black's Law current density exponent of 2, i.e. j−2 [Shatzkes and Lloyd, 1986].
\[ t_f = \frac{C_f}{D_0} \left( \frac{kT}{Z^{*} q \rho} \right)^{2} \times j^{-2} \, e^{\frac{E_a}{k_B T}} \qquad (3.21) \]
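The Recovery model has a recovery factor (r) for AC current flow. Since current flows in both directions for an AC signal, metal ions move in both directions, which causes recovery from EM in the interconnect (Equation 3.22) [Ting et al., 1993]. Here javg+ and javg− are the average current densities in the forward and reverse directions.
\[ t_f = \frac{A_0}{\left( j_{avg+} - r \left| j_{avg-} \right| \right)^{n}} \, e^{\frac{E_a}{k_B T}} \qquad (3.22) \]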
The Multilayered model (Equation 3.23) considers the case where barriers are used in the interconnect [Tao et al., 1996]. W is the width and H is the height of the interconnect wire.
\[ t_f = \left( W^{2} - H^{2} + \frac{20 - 3\pi}{10 W} H^{3} \right) \frac{1}{j} \, e^{\frac{E_a}{k_B T}} \qquad (3.23) \]
The Nucleation-Growth model considers the relative contributions of nucleation and growth. A0 and B0 are constants that contain geometric information, such as the size of the void required for failure. Given values for B0 (nucleation) and A0 (growth), Equation 3.24 shows that the relative contributions of nucleation and growth vary as a function of the current density [Lloyd, 2007].
\[ t_f = t_{nuc} + t_{growth} = \left( \frac{B_0 T}{j^{2}} + \frac{A_0 k T}{j} \right) e^{\frac{E_a}{k_B T}} \qquad (3.24) \]
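How the balance between the two terms shifts with current density can be illustrated numerically. The sketch below takes the 1/j² term of Equation 3.24 as the nucleation time and the 1/j term as the growth time, following the order t_nuc + t_growth in the equation, and reports the share of the total lifetime contributed by nucleation. The common exponential factor cancels in this share. The constants A0 and B0 are hypothetical and are chosen so that the two terms are equal at 1 × 10^6 A/cm².

```python
K_B_EV = 8.617333e-5  # Boltzmann's constant in eV/K


def nucleation_fraction(a0, b0, j, temp_k):
    """Share of tf from the nucleation term B0*T/j^2 in Equation 3.24."""
    t_nuc = b0 * temp_k / j ** 2          # nucleation term (common exp factor omitted)
    t_growth = a0 * K_B_EV * temp_k / j   # growth term (common exp factor omitted)
    return t_nuc / (t_nuc + t_growth)


# Hypothetical constants: the two terms balance at j = 1e6 A/cm^2.
A0 = 1.0
B0 = A0 * K_B_EV * 1.0e6
TEMP_K = 378.15

for j in (1.0e5, 1.0e6, 1.0e7):
    share = nucleation_fraction(A0, B0, j, TEMP_K)
    print(f"j = {j:.0e} A/cm^2: nucleation share of tf = {share:.3f}")
```

Under these assumptions the nucleation share falls as the current density rises, so the effective current density exponent moves from 2 toward 1.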
The Multi-Via model is an empirical model (Equation 3.25) for Cu dual-damascene technology (Section 4.4) that provides a quantitative relation capable of predicting the lifetime expectancy of multi-Via structures [Marras et al., 2007].
\[ t_f = A_0 \, j^{-n} \, e^{\frac{E_a}{k_B T}} \, e^{\gamma \left( 1 - \frac{1}{N_{vias}} \right)} \qquad (3.25) \]
Current density, j, is not clearly defined in many published papers. Symbolically, j could refer to javg, jpeak, jmax, jrms or jdc. The original usage is the average current density (javg) [Black, 1969], but later these ambiguities were refined to consider pulse waveforms [English et al., 1972] or electromigration recovery [Ting et al., 1993]. The current density exponent, n, was originally 2 [Black, 1969] and was assumed to be a property of aluminum, but later fabrication techniques using barriers, liners, refractory metals, joule heating and shunts revealed different values that could be better modeled as nucleation and void growth [Lloyd, 2007]. In Black's Law, the time to failure is defined as the time at which 50% of the interconnects have failed. It is not a degradation model, and no implications are made for reliability calculations, e.g. the 0.1% failure distributions used by reliability engineers. The temperature T is also ambiguous in Black's Law, because it is not clear whether T is the substrate temperature or includes the temperature rise due to joule heating from the current in the interconnect.
3.0.5 Industry Standard Reliability Calculations
The IC/ASIC industry created a council called the Joint Electron Device Engineering Council (JEDEC). JEDEC is the global leader in developing standards for the IC/ASIC industry. JEDEC's collaborative efforts ensure product interoperability, decrease time-to-market, and reduce product development costs, which benefits the industry and the consumer. JEDEC has created standards for predicting, testing and measuring failures that are used by the IC/ASIC industry to predict lifetimes. The standards can be found online at www.jedec.org. These standards are elaborated in the next Sections. “JESD” is a JEDEC Standard Document and “JEP” is a JEDEC Publication. EM is the dominant aging/degradation mechanism at today's process nodes, and needs to be considered when designing an IC/ASIC. EM is discussed in detail in Chapter 4.
Industry Standards for calculating xBTI
JEDEC has multiple standards that can be used for estimating aging/degradation effects due to xBTI. They are:
Standard JEP 122H [JEP122H, 2016] “Failure Mechanisms and Models for Semiconductor Devices”
Standard JESD 90 [JESD90, 2004] “A Procedure for Measuring P-Channel MOSFET Negative Bias Temperature Instabilities”
Standard JESD 241 [JESD241, 2015] “Procedure for Wafer-Level DC Characterization of Bias Temperature Instabilities”
JEP 122H's treatment of xBTI recognizes that the current xBTI models are limited by knowledge of the physics of the mechanism, in contrast to EM, where the physics is better understood and observable with electron microscopes.
For a given gate oxide thickness (tox), either Equation 3.26 or Equation 3.27 is used as the phenomenological model for xBTI degradation.
\[ \Delta p = A_0 \left( V_G \right)^{\alpha} \exp\left( \frac{E_a}{kT} \right) t^{n} \qquad (3.26) \]
\[ \Delta p = A_0 \exp\left( \beta V_G \right) \exp\left( \frac{E_a}{kT} \right) t^{n} \qquad (3.27) \]
where:
∆p = shift in device parameter of interest (Vt, %gm, %Idsat, etc.)
A0 = pre-factor dependent on the gate oxide process and CMOS technology
Ea = apparent activation energy (experimentally measured values range from -0.01 to +0.15 eV)
k = Boltzmann's constant (8.617332478 × 10^−5 eV/K)
T = channel temperature in kelvin (K)
α = measured gate voltage exponent (measured values range between 3 and 4)
β = measured gate voltage sensitivity, in units of reciprocal voltage
t = stress time
n = measured time exponent (measured values range between 0.15 and 0.25)
The xBTI lifetime equation for Equation 3.26 is shown in Equation 3.28.
\[ t_{f50\%} = \left[ \frac{\Delta p}{A_0 \left( V_G \right)^{\alpha} \exp\left( \frac{E_a}{kT} \right)} \right]^{\frac{1}{n}} \qquad (3.28) \]
The xBTI lifetime equation for Equation 3.27 is shown in Equation 3.29.
\[ t_{f50\%} = \left[ \frac{\Delta p}{A_0 \exp\left( \beta V_G \right) \exp\left( \frac{E_a}{kT} \right)} \right]^{\frac{1}{n}} \qquad (3.29) \]
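The relation between the degradation model and its lifetime equation can be checked numerically. The sketch below implements the power-law form (Equation 3.26) and its inverse (Equation 3.28), then confirms that the parameter shift evaluated at the computed lifetime returns the failure criterion. The function names and all parameter values are hypothetical, with α, n and Ea picked from inside the ranges listed above; ∆p and A0 are in arbitrary consistent units.

```python
import math

K_B_EV = 8.617333e-5  # Boltzmann's constant in eV/K


def xbti_shift(a0, v_g, alpha, ea, temp_k, t, n):
    """Parameter shift after stress time t, Equation 3.26."""
    return a0 * v_g ** alpha * math.exp(ea / (K_B_EV * temp_k)) * t ** n


def xbti_lifetime(dp_fail, a0, v_g, alpha, ea, temp_k, n):
    """Stress time at which the shift reaches dp_fail, Equation 3.28."""
    rate = a0 * v_g ** alpha * math.exp(ea / (K_B_EV * temp_k))
    return (dp_fail / rate) ** (1.0 / n)


# Hypothetical parameters chosen only for illustration.
PARAMS = dict(a0=1.0e-3, v_g=1.0, alpha=3.5, ea=0.1, temp_k=378.15, n=0.2)
DP_FAIL = 0.10  # failure criterion, e.g. a 10% parameter shift

t_f = xbti_lifetime(DP_FAIL, **PARAMS)
print(f"lifetime (arbitrary time units): {t_f:.3e}")
print(f"shift at that lifetime: {xbti_shift(t=t_f, **PARAMS):.4f}")
```

Because n is small, the 1/n exponent amplifies any change in the prefactor: with n = 0.2, doubling A0 shortens the computed lifetime by a factor of 2^5 = 32.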
HCI
There are multiple JEDEC Standards used for HCI:
JEP 122H [JEP122H, 2016] “Failure Mechanisms and Models for Semiconductor Devices”
JESD 28-1 [JESD28-1, 2001] “N-Channel MOSFET Hot Carrier Data Analysis”
JESD 28A [JESD28A, 2001] “Procedure for Measuring N-Channel MOSFET Hot-Carrier-Induced Degradation Under DC Stress”
JESD 60A [JESD60A, 2004] “A Procedure for Measuring P-Channel MOSFET Hot-Carrier-Induced Degradation Under DC Stress”
JEP 122H states that for sub-0.25 µm p-channel devices, the drive current tends to decrease after hot carrier stress, as it does for NMOS. For sub-0.25 µm p-channel devices, the worst-case lifetime occurs at maximum substrate current stress, and the time-to-failure (tf) model is the same as for n-channel devices. The drive currents of n-channel transistors tend to decrease after HCI stressing; the p-channel drive current may increase or decrease depending on channel length and stress conditions. The JEP 122H model for degradation induced by HCI is shown in Equation 3.30.
\[ \Delta p = A_0 \times t^{n} \qquad (3.30) \]
where:
∆p = shift in device parameter of interest (Vt, %gm, %Idsat, etc.)
A0 = material-dependent parameter
t = stress time
n = empirically determined exponent, a function of stress voltage, temperature and effective channel length
The N-channel transistor lifetime uses an Eyring model. The Eyring model makes the practical assumption of mathematically separable, independent variables. Equation 3.31 shows the model for an N-channel device.
\[ t_{f50\%} = B \left( I_{sub} \right)^{-N} \exp\left( \frac{E_a}{kT} \right) \qquad (3.31) \]
where:
B = arbitrary scale factor (strong function of proprietary factors such as doping profiles, sidewall spacing dimensions, etc.)
Isub = peak substrate current during stressing
N = 2 to 4
Ea = apparent activation energy (experimentally measured values range from -0.2 to +0.4 eV)
k = Boltzmann's constant (8.617332478 × 10^−5 eV/K)
T = channel temperature in kelvin (K)
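The separability assumed by the Eyring model is what allows N and Ea to be extracted independently: N from two stress runs at the same temperature, and Ea from two runs at the same substrate current. The sketch below demonstrates this on synthetic data generated from Equation 3.31 itself. The function names and all numeric values are hypothetical and serve only to show that the extraction recovers the parameters used to generate the data.

```python
import math

K_B_EV = 8.617333e-5  # Boltzmann's constant in eV/K


def eyring_tf(b, i_sub, n_exp, ea, temp_k):
    """HCI Eyring model for tf50%, Equation 3.31."""
    return b * i_sub ** (-n_exp) * math.exp(ea / (K_B_EV * temp_k))


def extract_n(i_sub_1, tf_1, i_sub_2, tf_2):
    """Exponent N from two runs at the same temperature."""
    return math.log(tf_1 / tf_2) / math.log(i_sub_2 / i_sub_1)


def extract_ea(temp_1, tf_1, temp_2, tf_2):
    """Activation energy (eV) from two runs at the same substrate current."""
    return K_B_EV * math.log(tf_1 / tf_2) / (1.0 / temp_1 - 1.0 / temp_2)


# Synthetic "measurements" generated with hypothetical B, N and Ea.
B_TRUE, N_TRUE, EA_TRUE = 1.0e-9, 3.0, 0.2
tf_a = eyring_tf(B_TRUE, 1.0e-6, N_TRUE, EA_TRUE, 300.0)
tf_b = eyring_tf(B_TRUE, 2.0e-6, N_TRUE, EA_TRUE, 300.0)  # same T, higher Isub
tf_c = eyring_tf(B_TRUE, 1.0e-6, N_TRUE, EA_TRUE, 350.0)  # same Isub, higher T

n_fit = extract_n(1.0e-6, tf_a, 2.0e-6, tf_b)
ea_fit = extract_ea(300.0, tf_a, 350.0, tf_c)
print(f"recovered N = {n_fit:.3f}, recovered Ea = {ea_fit:.3f} eV")
```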
The P-channel transistor (< 0.25 µm) model is shown in Equation 3.32.
\[ t_{f50\%} = B \left( I_{sub} \right)^{-N} \exp\left( \frac{E_a}{kT} \right) \qquad (3.32) \]
where:
B = arbitrary scale factor (strong function of proprietary factors such as doping profiles, sidewall spacing dimensions, etc.)
Isub = peak substrate current during stressing
N = 2 to 4
Ea = apparent activation energy (experimentally measured values range from +0.1 to +0.2 eV)
k = Boltzmann's constant (8.617332478 × 10^−5 eV/K)
T = channel temperature in kelvin (K)
Industry Standards for TDDB
JEDEC has created standard procedures for testing an IC/ASIC process for TDDB:
JEP001 [JEP001A, 2014] “Foundry Process Qualification Guidelines - Backend of Line (Wafer Fabrication Manufacturing Sites)”
JEP122 [JEP122H, 2016] “Failure Mechanisms and Models for Semiconductor Devices”
JEP159 [JEP159A, 2015] “Procedure for the Evaluation of Low-k Metal Inter / Intra-Level Dielectric Integrity”
These standards specify tests and procedures for TDDB, and recommend using the model that best fits the data collected when a process node is being created.
Industry Standards for EM
The JEDEC standard JESD63 [JESD63, 1998], used by the semiconductor industry, only considers the electromigration stress of direct current (DC), jdc or javg, and does not consider recovery or the joule heating in the interconnects, which are discussed later in Chapter 4.
3.0.6 Summary of IC/ASIC Aging/Degradation Models
This Section summarizes the aging/degradation time to failure (tf) models. The tf criterion is defined when the process node is being developed. For xBTI, HCI and EM, failure is typically defined as a 10% change in a parameter. These parameters affect the timing, and are guard banded to allow for a 10% change over the defined lifetime of the IC/ASIC. The lifetime of an IC/ASIC is defined as the time when 0.1% of the devices fail in sub-180 nm processes. The IC/ASIC manufacturing process lifetime goal is typically greater than 10 years. We want to use these parameters and guard bands to create a process for making the tradeoffs between performance, cost, power and reliability of an IC/ASIC design.
NBTI is a degradation of the PMOS transistor, as previously described. NBTI is a charge trapping phenomenon. The NBTI tf is typically less than that of PBTI. Equation 3.33 is used for NBTI [McPherson, 2013], [Yan et al., 2009], [Bernstein et al., 2006], [Salemi, 2008], [Schroder, 2007], [Pompl and Rhner, 2005].
\[ t_f = A_0 \, e^{\gamma_{NBTI} E_{ox}} \, e^{\frac{E_a}{k_B T}} \qquad (3.33) \]
where:
tf is the time to the failure condition (i.e. 10% Vt shift)
A0 is a process constant
γNBTI is the effective metal charge
Eox is the electric field across the gate and channel
Ea is 0.019 eV to 0.24 eV
kB is Boltzmann's constant
T is the temperature in Kelvin
Equation 3.34 is the equation for HCI.
\[ t_f = A_0 \left( \frac{I_{sub}}{W} \right) e^{\frac{E_a}{k_B T}} \qquad (3.34) \]
[McPherson, 2013] where:
tf is the time to the failure condition (i.e. 10% Vt shift)
A0 is a process constant
Isub is the substrate current
W is the width of the transistor
Ea is 0.019 eV to 0.24 eV
kB is Boltzmann's constant
T is the temperature in Kelvin
EM tf is usually defined as a 10% change in resistance of the interconnect wire, not a failure of the wire. The EM TTF is shown in Equation 3.35.
\[ t_f = A_0 \times j^{-n} \times e^{\frac{E_a}{k_B T}} \qquad (3.35) \]
Chapter 4
Interconnects
Advanced IC/ASICs contain ≈ 30 miles of interconnect wires if laid end-to-end [Miyasato, 2018]. Figure 4.1 illustrates the amount of interconnect wires and the wire length within an IC/ASIC. With the IC/ASIC industry trying to follow Moore's law, the number of interconnect wires will continue to grow. This chapter introduces the concepts of interconnect delays, which are dependent on interconnect wire resistance, wire capacitance, the wire geometrical dimensions of length, width, thickness and pitch, Vias, and layers. Topics also discussed are future trends (Dennard scaling), current density and electromigration.
Figure 4.1: Interconnection distribution [Borkar, 1999].
77 CHAPTER 4. INTERCONNECTS
4.1 Metal and Dielectric Diffusion
In the IC/ASIC process, metal is placed over a dielectric. An issue is that the metal can migrate into the dielectric. There are two mechanisms by which metals can migrate into dielectrics. One is diffusion of the metal atoms at elevated temperature. The other is drift of metal ions driven by an external electric field. Ideally there would be a very sharp interface between the metal and dielectric, as the schematic diagram in Figure 4.2 illustrates. But under sufficiently high temperature, metal atoms can be activated and will diffuse into the dielectric. There is a higher concentration of metal atoms in the deposited metal film of a metal interconnect than in the dielectric, so the movement produces a net flow of metal atoms across the interface into the dielectric. The consequence is that the boundary between the metal and the dielectric is not clearly defined, as shown in Figure 4.3 [Balasinski, 2016].
Figure 4.2: Diagram showing an ideal (sharp) interface of the metal and dielectric materials [Balasinski, 2016]
There is also field-enhanced ion drift, which can occur if a strong electric field is applied to the dielectric and there are metal ions in the dielectric. The electric field provides an additional driving force for ion migration inside the dielectric. Figure 4.4 shows the metal-dielectric interface with an energy diagram, illustrating that the metal ions must overcome two barriers to diffuse into the dielectric film.
Figure 4.3: Diagram showing a diffused metal-dielectric interface after the penetration of the metal into the dielectric [Balasinski, 2016]
Figure 4.4: Diagram showing the metal dielectric interface with an energy diagram showing how metal atoms diffuse out of the metal matrix into the dielectric [Balasinski, 2016]
The process by which the metal atoms or ions are initially released is complex. For metal atoms to thermally diffuse into the dielectric, the atoms need to overcome the metallic bonding Em, as was shown in Figure 4.4. A good indication of the metallic bonding Em is given by the melting point of the material. The origin of the metal ions depends on the chemistry of the interface layer. Most dielectric materials used in IC/ASICs contain oxygen. The oxidation of the metal at the dielectric surface is an important reaction that takes place at the metal-dielectric interface. The Metal Oxygen (M − O) bonding strength is related to the heat of metal oxide formation. A less negative heat of oxide formation implies a lower oxidation tendency and weaker M − O bonds. These bonds may not be sustained under severe electrical and thermal stress. Figure 4.5 shows the negative heat of oxide formation per oxygen atom for a selected group of elements.
Figure 4.5: Negative heat of oxide formation per oxygen atom in various metals [Balasinski, 2016]
Al has a low melting temperature, which indicates a weaker Em barrier. However, the negative heat of formation of Al2O3 is very high. When an Al interconnect is deposited on a SiO2 surface, the Al reduces the SiO2 dielectric to form a thin layer of Al2O3. The Al2O3 serves as a barrier. It is so dense that it prevents Al atoms and oxygen from diffusing, which prevents further oxidation, so it acts as a self-limiting oxide as shown in Figure 4.6. The dense Al oxide has very strong Al − O bonds, which prevents bond breakage in the oxide and eliminates ion generation [Fisher and Eizenberg, 2008], [He et al., 2011], [Mallikarjunan et al., 2002].
This strong Al − SiO2 interface is very stable and has served the IC/ASIC industry for a long time [Balasinski, 2016].
Figure 4.6: Cross section of the Al SiO2 interface [Balasinski, 2016]
As the interconnect wires became smaller, the resistance of Al became a limiting factor. Cu replaced Al due to its lower resistivity, but Cu does not have a stable interface to the dielectric. Most literature attributes Cu ion drift in SiO2 to the oxidation of Cu at the interface. This oxidation is caused by contamination from oxygen-containing species on the surface or by an oxidizing ambient in the environment, instead of by reduction at the SiO2 surface. In general, if there is a Cu oxide at the Cu − SiO2 interface, Cu ions can be generated and released into the SiO2 under thermal or electrical stress. Cu has a relatively low melting temperature (1085 °C), which means the metallic bonding within the Cu metal matrix is weak. Without a stable and dense interface, Cu atoms can diffuse thermally into low-k dielectrics at an elevated temperature or during the PVD (Physical Vapor Deposition) of Cu [Balasinski, 2016]. Once Cu is ionized, it diffuses through the dielectric due to a concentration gradient, reaching the interface with the Si where the transistors are formed [Fisher and Eizenberg, 2008]. This is why a barrier is needed for Cu (Section 4.4).
4.2 Electromigration Failure in Interconnects
When an electric current passes through a conductor, ions and atoms are driven towards the anode due to the momentum transfer from the electrons to the ions/atoms [Huntington and Grone, 1961]. Due to the confinement boundary imposed by the barrier layer in Cu dual-damascene interconnects, there is an accumulation of ions/atoms at the anode end and a depletion of ions/atoms at the cathode end. The dual-damascene process is how Cu interconnects are created and will be described in Section 4.4. The ion/atom accumulation at the anode leads to a compressive stress, while the depletion of atoms at the cathode causes a tensile stress. These stresses can result in two distinct failures. If the compressive stress is sufficiently high and the surrounding dielectrics are weak, metal extrusions or hillocks can form, causing a short circuit [Wei et al., 2008]. Conversely, a sufficiently high tensile stress can lead to void formation; the void can grow and span the line or via, causing the line resistance to increase significantly until the interconnect fails [Gignac et al., 2003]. Usually, the critical stress for extrusion formation is larger than that for void formation, so the latter is the dominant failure mechanism [De Orio, 1981].
Electromigration-induced failures occur in two distinct phases: nucleation and growth. In the nucleation phase, no EM-generated voids can be observed in the interconnect, and no significant resistance change of the line is detected [Marieb et al., 1995]. This phase lasts until a void is nucleated and can be observed in scanning electron microscopy (SEM) pictures. Once the second phase (growth) starts, the void can evolve in several different ways until it finally grows to a critical size, causing a significant resistance increase. The void may grow and completely sever the interconnect line [Choi et al., 2008], [Besser et al., 1992].
The total EM lifetime is the sum of the time for a void to nucleate plus the time for the void to develop [De Orio, 1981]. Figure 4.7 shows the two typical failures in a copper dual-damascene line [Gambino et al., 2009]. In the first case (a), the void nucleates right under the via, and failure tends to occur soon after nucleation. The time to failure of the interconnect is dominated by the nucleation period. If the void nucleates away from the Via (b), the void first has to migrate towards the via, where it then grows to cause the failure. In that case the lifetime of the interconnect is dominated by the void growth or evolution phase [Gambino et al., 2009], [De Orio, 1981].
Figure 4.7: Failures in a damascene line. (a) Failure dominated by the void nucleation phase. (b) Failure dominated by void nucleation migration and growth [Orio, 2010].
Figure 4.8 shows the EM lifetime, normalized to the expected lifetime for the 1 µm technology node, as a function of the interconnect cross-sectional area [Hu et al., 2006], [Hu et al., 2004]. As the dimensions of the interconnects decrease, the electromigration lifetime also decreases, because with reduced Via and line dimensions a smaller critical void is sufficient to cause failure [Hu et al., 2006]. Moreover, as the line width is reduced below 100 nm, the growth of Cu grains during line fabrication is also reduced, leading to smaller grain sizes [Dubreuil et al., 2008], [Yao et al., 2008], [Schwartz and Srikrishnan, 2007]. This causes the interconnect to change from a bamboo-like to a polycrystalline structure, in which grain boundary diffusion provides an additional path for mass transport of the Cu ions [Hu et al., 2007]. Another contribution to shorter electromigration lifetimes comes from the introduction of low-κ interlevel dielectrics [Noguchi et al., 2009], [Gambino et al., 2009], [De Orio, 1981], [Orio, 2010].
Figure 4.8: EM lifetime variation as a function of the interconnect dimensions [Orio, 2010].
4.3 Reliability and Electromigration (EM)
Interconnect wires are formed in the Back End of Line (BEOL) of the IC/ASIC process (Metal 1 and above). The active transistors are formed in the Front End of Line (FEOL). The Middle of the Line (MOL), sometimes referred to as the Middle End of Line (MEOL), was first introduced for mainstream production at 20 nm and can help reduce congestion for short local routes. The MOL local interconnect (Li) layers offer a way to achieve very dense local routing below the first metal layer (BEOL). There may be several of these layers available in a given process, and most do not use contacts or vias. Instead, they connect by shape overlap without any need for a cut layer. Not needing contacts makes the interconnect routing denser because contacts are larger than nets, and cannot be placed too close to nets [Carlson, 2013]. The MOL is below the first metal layer, and can be considered an extension to polysilicon local interconnect. Although MOL routing requires multiple masks, it is less expensive to implement than traditional metal layers because it does not employ vias. These interconnect lines connect the upper and lower layers and on to the polysilicon contacts using shape overlap [Carlson, 2013]. Figure 4.9 [Beckley, 2012] illustrates the FEOL, MOL and BEOL layers in an IC/ASIC.
Figure 4.9: Active layers FEOL, MOL Local interconnect Li and BEOL Metal inter- connect
The resistance of the interconnect wires is a limiting factor in the performance of an IC/ASIC. In a perfect lattice, there is no resistance. Electrons move in a periodic potential with no other interaction with the metal atoms. But a perfect lattice cannot exist above absolute zero due to atomic vibrations, vacancies, and chemical impurities. Grain boundaries and dislocations are also present. Perhaps even more important, at any temperature above 0 K, the atomic vibrations are larger. These vibrations move the metal atoms out of their equilibrium positions, disturbing the periodic potential of the lattice and causing electron scattering by the atoms. The force due to collisions of electrons with metal atoms is called the momentum exchange. This results in EM. EM is defined as the motion of ions/atoms of a thin film interconnect due to the high current densities passing through the interconnect. The motion of the ions/atoms can lead to the formation of voids or holes (Figure 4.7 above) and hillocks in the thin film. These can grow to a size where the interconnect resistance increases until the interconnect is unable to pass current, or where the interconnect shorts to other interconnects. Copper is known to be more resistant to electromigration than aluminum [Gadkari, 2005]. The industry roadmap for scaling of the wires is shown in Table 4.1.
Table 4.1: Interconnection roadmap for scaling. [IRDS, 2016]
The challenges that interconnect wire scaling poses for performance are listed in Table 4.2.
Table 4.2: Interconnect, etc. difficult challenges. [IRDS, 2016]
An effective scaling model has been established, which assumes the void is located at the cathode end of the interconnect wire containing a single Via, with a drift velocity (ion/atom movement) dominated by interfacial diffusion, as shown in Figure 4.10. This model predicts that lifetime scales with (w × h)/J, where w is the linewidth (or the Via diameter), h is the interconnect thickness, and J is the current density. The geometrical model predicts that the lifetime of the interconnect decreases by half for each new generation; it can also be affected by small process variations of the interconnect dimensions.
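The halving of lifetime per generation follows directly from the (w × h)/J proportionality. The short Python sketch below illustrates this under stated assumptions: the 0.7x linear shrink per generation and the 1.3x rise in J are illustrative values chosen here, not figures taken from [IRDS, 2016], and the function name is hypothetical.

```python
def relative_lifetime(w_scale, h_scale, j_scale=1.0):
    """Ratio of new to old EM lifetime, assuming lifetime is proportional to w*h/J.

    Each argument is the ratio (new value / old value) of linewidth w,
    thickness h, and current density J between two process generations.
    """
    return (w_scale * h_scale) / j_scale


# Assumption: w and h each shrink by 0.7x per generation, J held constant.
per_generation = relative_lifetime(0.7, 0.7)
print(f"lifetime ratio, constant J: {per_generation:.2f}")

# Assumption: J also rises by a hypothetical 1.3x per generation.
with_rising_j = relative_lifetime(0.7, 0.7, 1.3)
print(f"lifetime ratio, J up 1.3x:  {with_rising_j:.2f}")
```

With both dimensions scaled by 0.7, the cross-section shrinks to about 0.49 of its previous value, which is the "decreases by half" behaviour described above; any accompanying increase in J reduces the lifetime further.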
Figure 4.10: Experiment and model of lifetime scaling versus interconnect geometry (∆Lcr). [IRDS, 2016]
Figure 4.11 shows the maximum equivalent DC current density (Jmax) and the maximum current density allowed by the targeted EM lifetime (JEM), which is limited by the interconnect geometry scaling. Jmax increases with scaling due to the reduction in the interconnect cross-sectional area and the increase in the maximum operating frequencies of IC/ASICs. Practical solutions to overcome the lifetime decrease in narrow linewidths are being pursued. Recent studies show an increasingly important role of grain structure in contributing to the drift velocity and thus the EM reliability beyond the 45nm node. Process options with Cu alloy seed layers (e.g., Al or Mn) have been shown to be an optimum approach to increase the lifetime. Other approaches are the insertion of a thin metal layer (e.g., CoWP or CVD-Co) between the Cu trench and the dielectric SiCN barrier, and the use of the short length effect. The short length effect has been used effectively to extend the current-carrying capability of conductor lines and has dominated the current density design rule for interconnects [IRDS, 2016].
Figure 4.11: Evolution of Jmax (from device performance) and JEM (from targeted lifetime). [IRDS, 2016]
The issue with interconnect wires is that the delay increases with the square of the length (L²). Minimizing the delay of the wire, and thus achieving timing closure, requires more current drive from the gate to compensate for the increased resistance. This increase in current leads to progressively worse EM aging issues. Long wires called global interconnects are used to route VDD and voltage signals across the IC/ASIC. For these wires, the wire resistance dominates the resistance of the driving gate, i.e., Rwire ≫ Rdr → RwireCwire ∝ L². The length of the longest wire on an IC/ASIC has increased by ≈ 20% with each new process technology. The cross-sectional area of global interconnects has not scaled with each process technology: the width (w) and the height (h) are not scaled down in newer process technologies. In order to minimize the increase in global interconnect delay, global interconnects are placed in separate planes or layers of the IC/ASIC interconnect wiring. Repeaters can be used to reduce the maximum capacitance of the interconnect wiring, which reduces the driving gate current, as shown in Figure 4.12 [Magen et al., 2004]. In general, Delay = n(R × C), where n is the number of wires a gate has to drive. Buffers can be added to reduce the delay: Delay = 3 × RC + 2 × GateDelay.
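The quadratic dependence of wire delay on length, and the benefit of splitting a long wire with buffers, can be illustrated with a short Python sketch. It is a simplified lumped model under stated assumptions: the delay of a segment is taken as the plain product of its total resistance and capacitance, driver resistance and gate input capacitance are ignored, and the per-length resistance, per-length capacitance, wire length, and gate delay are hypothetical values chosen only for illustration. If RC in the buffered expression above is read as the per-segment product, the three-segment case reproduces the form 3 × RC + 2 × GateDelay.

```python
import math


def unbuffered_delay(r_per_m, c_per_m, length_m):
    """Lumped RC delay of one wire: (r*L) * (c*L), so it grows as L**2."""
    return (r_per_m * length_m) * (c_per_m * length_m)


def buffered_delay(r_per_m, c_per_m, length_m, segments, gate_delay_s):
    """Delay of a wire split into equal segments with a buffer between each pair."""
    segment_rc = unbuffered_delay(r_per_m, c_per_m, length_m / segments)
    return segments * segment_rc + (segments - 1) * gate_delay_s


# Hypothetical values, for illustration only.
R_PER_M = 2.0e5       # ohm per metre (0.2 ohm per micrometre)
C_PER_M = 2.0e-10     # farad per metre (0.2 fF per micrometre)
LENGTH = 3.0e-3       # 3 mm global wire
GATE_DELAY = 20e-12   # 20 ps per buffer

plain = unbuffered_delay(R_PER_M, C_PER_M, LENGTH)
split = buffered_delay(R_PER_M, C_PER_M, LENGTH, 3, GATE_DELAY)
print(f"unbuffered: {plain * 1e12:.0f} ps, three segments: {split * 1e12:.0f} ps")
```

For these assumed numbers the unbuffered wire has a delay of about 360 ps, while three 1 mm segments contribute about 40 ps each, giving roughly 160 ps including the two buffer delays.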
Figure 4.12: Interconnection distribution.
4.4 Interconnect Resistance
The IC/ASIC industry has moved from aluminum to copper for interconnect wires to decrease resistance and increase performance. IBM introduced copper on its PowerPC microprocessor in 1998, which was revolutionary at the time, when aluminum interconnects were the standard. The rest of the industry followed more slowly. Advanced Micro Devices produced a 180-nanometer copper Athlon processor in 2000, and in 2002 Intel produced a 130-nm copper Pentium 4. Fabrication cost and yield were the fundamental reasons for the slow adoption. The move was made for three reasons: performance, scaling and reliability. The lower resistivity of copper results in higher performance as wire delay is decreased. Also, as noted in the previous section, the length of the longest wire in an IC/ASIC increases by ≈ 20% with each new process, which results in increased wire resistance. The lower resistivity of copper leads to lower joule heating of the wire. This allows higher current densities for the smaller wire sizes. Reliability is also improved due to copper's higher activation energy for EM, making it more resistant to EM failures. Moreover, copper's higher thermal conductivity provides more efficient heat conduction paths [Khan and Kim, 2011], [Quirk and Serda, 2001b], [Keyes and Scansen, 2006], [Matsumoto and Wade, 1999].
There are challenges with copper. Copper is difficult to pattern using the conventional techniques used for aluminum. Chlorine gas, which is used to etch metals in plasma, forms a copper chloride that does not readily evaporate. Copper quickly diffuses (hillocks) into silicon and oxides (copper poisoning). This causes spikes of copper that can be long enough to penetrate through the transistor junctions. Copper also has poor oxidation and corrosion resistance. Copper quickly oxidizes in air, and the oxide does not protect the underlying copper (Section 4.1) from further oxidation [Khan and Kim, 2011].
Copper interconnects require a manufacturing process significantly different from that of aluminum-based IC/ASICs. Around 1995, IBM collaborated with equipment manufacturer Novellus Systems (now part of Lam) on an electrochemical deposition (ECD) or metal plating process for copper. IBM contributed the ECD solution, which enabled copper to be plated from the bottom up without voids in high aspect ratio features. Novellus led a Damascus Alliance to address key integration issues to aid in the process adoption. Then, in June of 1998, the Damascus production-ready damascene process was announced. Shortly thereafter, IC/ASIC manufacturers began replacing aluminum with copper for the interconnect layers as the Age of Damascene began [Miyasato, 2018].
In 1997, IBM and Motorola introduced the process called Dual Damascene to form copper interconnects. The name originates from Damascus, the capital of Syria. The Damascene process is similar to a metal inlay process used in the Middle East during the Middle Ages. The Damascene process is an additive process. Figure 4.13 illustrates the differences in the process steps between Al and Cu [Khan and Kim, 2011].
Figure 4.13: Comparison of the manufacturing process step differences between Al and Cu. [Khan and Kim, 2011]
For the Damascene process, the dielectric is deposited (shown in Yellow) and patterned using standard lithography and etching techniques to form the Via and trench, as shown in Figure 4.14. This is followed by the deposition of a diffusion barrier, which is typically a Ta-based layer, shown in Blue in Figure 4.15. The diffusion barrier layer has two major functions: it prevents Cu atoms from migrating into the interlevel dielectric (ILD), and it provides good adhesion to Cu. A copper seed is then deposited by physical vapor deposition (PVD), followed by electroplating of the copper to fill the Via and trench, shown in Figure 4.16 (Green). The excess Cu is removed by a chemical mechanical polishing (CMP) process, and an etch stop layer (also called a capping layer, shown in Tan), typically SiN based, is deposited. The complete interconnect structure of an IC/ASIC is produced by repeating these process steps for each level of metallization [Justison, 2003], [De Orio, 1981], [Orio, 2010].
Figure 4.14: Copper dual-damascene fabrication process: Via patterning and Via and trench patterning [Orio, 2010].
Figure 4.15: Copper dual-damascene: Barrier layer deposition and Cu seed deposition. Cu electroplating and excess removal by chemical mechanical polishing [Orio, 2010].
Figure 4.16: Copper dual-damascene: Capping layer deposition [Orio, 2010].
The benefit of introducing copper is the reduction of resistivity to 1.678 µΩ-cm versus 3.2 µΩ-cm for Al-0.5%Cu, which results in a reduction in power consumption, tighter packing density and superior resistance to EM [Quirk and Serda, 2001a]. Table 4.3 compares the resistivity of various materials.
Table 4.3: Resistivity and temperature coefficient at 20 °C [GSU, 2017]
Interconnect wires within a metal layer are described by length (l), width (w), thickness/height (h) and pitch (P). Pitch is the sum of the wire width and the spacing between the wires (W + S). Pitch generally refers to the minimum width (W) and spacing (S) of a particular metal layer, as shown in Figure 4.17.
Figure 4.17: Interconnect dimensions.
Interconnect wires are routed with different widths and lengths within a single metal layer, as shown in Figure 4.18. Vias are used to connect interconnect wires between one or more layers.
Figure 4.18: 3 wire segments with different dimensions and branches.
The Via metal is not required to be the same metal material as the interconnect wire. Metal materials at different layers may also be different. For example, in past nodes M1 may be aluminum (in future nodes, Co or Ru) and M2 may be Cu, as shown in Figure 4.19.
Figure 4.19: Illustration of different materials for vias and interconnect wires.
Interconnect resistance is described in several ways as follows:
1. resistivity Rho
2. ρ (Greek letter Rho)
3. sheet ρ
4. sheet resistance Rs
5. wire resistance Rw
Fabrication suppliers closely guard the resistivity of their process, and generally quote Rs or Rw instead. If the supplier considers the minimum width of the process a competitive advantage, they specify Rw. Pitch may even be given in place of width so as not to reveal the width of the interconnects to the competition. Useful equations for wire resistance are shown in Equations 4.1 through 4.9, in which t and h both denote the wire thickness.
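\begin{align}
R &= \frac{\rho(T, h, g)}{t} \times \frac{l}{w} \quad \text{ohm} \tag{4.1} \\
R_s &= \frac{\rho}{t} \quad \Omega/\square \tag{4.2} \\
R_w &= \frac{\rho}{t \times W} \quad \text{ohm/m} \tag{4.3} \\
P &= 2 \times W \tag{4.4} \\
W &\approx \frac{P}{2} \tag{4.5} \\
AR &= \frac{t}{w} = \text{aspect ratio} \tag{4.6} \\
R &= \frac{\rho}{h} \times \frac{l}{w} = 2 \times \frac{\rho \times l}{h \times P} = \frac{\rho\, l}{h\, w} = R_s \times \frac{2l}{P} = R_w \times l \quad \text{ohm} \tag{4.7} \\
\rho &= R_s \times h = R_w \times h \times w = R_w \times h \times \frac{P}{2} \tag{4.8} \\
&= R_W \times AR \times \frac{P^2}{4} \quad \text{ohm-m} \tag{4.9}
\end{align}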
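As a numerical check on how these relations fit together, the Python sketch below evaluates the total resistance of a wire three equivalent ways (directly from ρ, from the sheet resistance, and from the per-length resistance) and then recovers ρ from Rw, the aspect ratio and the pitch. The resistivities are the 1.678 µΩ-cm (Cu) and 3.2 µΩ-cm (Al-0.5%Cu) values quoted earlier in this section; the wire geometry and all function names are hypothetical, and bulk resistivity is assumed to hold at these dimensions, which is optimistic for narrow wires.

```python
import math


def sheet_resistance(rho, h):
    """Sheet resistance rho/h, in ohms per square."""
    return rho / h


def resistance_per_length(rho, h, w):
    """Wire resistance per unit length rho/(h*w), in ohm/m."""
    return rho / (h * w)


def wire_resistance(rho, h, w, length):
    """Total wire resistance (rho/h) * (length/w), in ohms."""
    return (rho / h) * (length / w)


RHO_CU = 1.678e-8   # ohm-m (1.678 micro-ohm-cm)
RHO_AL = 3.2e-8     # ohm-m (3.2 micro-ohm-cm, Al-0.5%Cu)

# Hypothetical geometry, for illustration only.
W = 50e-9           # width, m
H = 100e-9          # thickness, m (aspect ratio 2)
L = 100e-6          # length, m
P = 2 * W           # pitch, assuming spacing equal to width

r_cu = wire_resistance(RHO_CU, H, W, L)
r_al = wire_resistance(RHO_AL, H, W, L)
r_from_sheet = sheet_resistance(RHO_CU, H) * (2 * L / P)
r_from_rw = resistance_per_length(RHO_CU, H, W) * L
rho_back = resistance_per_length(RHO_CU, H, W) * (H / W) * P**2 / 4

print(f"Cu: {r_cu:.1f} ohm, Al-0.5%Cu: {r_al:.1f} ohm")
```

For this assumed geometry the copper wire comes to roughly 336 ohm against 640 ohm for the aluminum alloy, and all three routes to R agree, as the chain of equalities in Equation 4.7 requires.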
The symbol ρ represents resistivity in units of ohm-m (ohm-meters). ρ is dependent on temperature (T), thickness/height (h) and grain size (g). For example, the bulk DC resistivity of pure copper, ρ0, is 1.72 × 10⁻⁸ ohm-m at a temperature of 20 °C (1.72 µΩ-cm or 17.2 nohm-m). Small grain sizes and small thickness can lead to a resistivity of copper as high as 7.0 × 10⁻⁸ ohm-m. For Al(0.5% Cu), ρ = 2.8 µΩ-cm; this alloy is sometimes used in processes as small as 45nm. As mentioned earlier, in metals, increasing temperature increases ion vibration amplitudes, which increases collisions and reduces the current flow. This produces a positive temperature coefficient. For semiconductors, increasing temperature “shakes loose” more electrons, which increases the carrier concentration and increases current flow. This produces a negative temperature coefficient.
Equation 4.11 is the first-order linear equation for resistivity as a function of temperature. The nominal temperature, Tnom, is the temperature at which the nominal resistivity ρnom,w of the wire is measured, where w is the width of the wire. The temperature coefficient, α, is the linear rate at which the wire resistivity changes with temperature.
\begin{equation}
\rho_w = \rho_{nom,w}\left(1 + \alpha_{nom,w}\,(T - T_{nom})\right) \tag{4.11}
\end{equation}
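A minimal Python sketch of Equation 4.11 is given below. The nominal resistivity is the 1.72 × 10⁻⁸ ohm-m bulk copper value quoted above at 20 °C; the temperature coefficient of 0.0039 per °C is an assumed, commonly quoted illustrative value rather than a figure taken from Table 4.3, and the function name is hypothetical.

```python
import math


def resistivity_at(temp_c, rho_nom=1.72e-8, alpha=0.0039, t_nom=20.0):
    """First-order linear model: rho = rho_nom * (1 + alpha * (T - T_nom)).

    rho_nom : resistivity at t_nom, ohm-m (bulk copper value from the text)
    alpha   : temperature coefficient per degree C (assumed value)
    t_nom   : nominal temperature, degrees C
    """
    return rho_nom * (1.0 + alpha * (temp_c - t_nom))


rho_20 = resistivity_at(20.0)
rho_100 = resistivity_at(100.0)
print(f"rho(20 C)  = {rho_20:.3e} ohm-m")
print(f"rho(100 C) = {rho_100:.3e} ohm-m, ratio {rho_100 / rho_20:.3f}")
```

Under these assumptions an 80 °C rise increases the resistivity by about 31%, which illustrates why the operating temperature has to be stated alongside any quoted wire resistance.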
Equation 4.12 shows how the SPICE simulator models the resistor. The second-order term TC2, where TC denotes the temperature coefficient, is rarely reported in the literature.