
Embedded Memory Design in CMOS FinFET Technology

Yih Wang

Logic Technology Development, Intel Corporation, Hillsboro, Oregon, U.S.A.

2016 VLSI Technology Symposium, Honolulu, HI, U.S.A. June 14th, 2016

2016 Symposium on VLSI Circuits Short Course

Outline

• Introduction
• FinFET CMOS Technology
• Embedded SRAM Design in FinFET Technology
• SRAM Scaling Trend
• SRAM Design Challenges in FinFET
• Circuit Assist Techniques
• Reliability
• Alternative Embedded Memory Technologies
• Summary


[Figure: memory hierarchy pyramid, from top to bottom: processor registers, L1 cache, L2 cache, L3 cache, main memory, local storage, network storage. Levels toward the top have smaller capacity, higher bandwidth, lower latency, higher cost and higher reliability. Registers and caches are predominantly SRAM, embedded in the SoC for performance; main memory is predominantly DRAM; local and network storage are non-volatile.]

• Memory hierarchy is driven by the memory locality principle
• Different embedded memory technologies have been developed to meet diverse capacity, latency, bandwidth and cost requirements

Growing Bandwidth for Embedded Memory

Chittor, ISSCC Forum, 2015

• High-performance computing drives bandwidth (BW) growth in servers
• GFX/visual computing drives BW growth in clients

Embedded Memory

• SRAM remains the predominant embedded memory technology in modern SoCs
  – High-performance bitcell capable of operating at high frequency and over a wide voltage range
  – Simple integration with logic technology at no additional cost
  – Performance and density benefit from continual process scaling
• Alternative embedded memories (e.g. eDRAM and eFlash) address specific applications
  – Logic-based high-performance eDRAM provides high memory bandwidth (>100GB/s) directly to compute engines
  – eFlash enables direct code execution and persistent storage for MCUs
  – Alternative embedded memories require additional processing steps vs. logic CMOS, adding cost and integration complexity


CMOS Scaling

[Figure: XTEM images of planar CMOS transistors (45nm, 32nm) and FinFET CMOS transistors (22nm, 14nm). Sources: Mistry, IEDM, 2007; Natarajan, IEDM, 2008; Auth, VLSI, 2012; Natarajan, IEDM, 2014. XTEM images do not reflect real size and are used for illustration only.]

• FinFET technology has been in use since the 22nm node

SoC in FinFET CMOS

[Figure: die photos of the products listed below. Die sizes do not reflect real die size and are used for illustration only.]

• Intel Core M-5Y70: 4MB LLC, 14nm SRAM
• Intel 5110P: 30.5MB, 22nm SRAM
• Intel Xeon E5-2699v4: 55MB LLC, 14nm SRAM
• Intel Core i7-6700: 8MB LLC, 14nm SRAM
• Intel Xeon E5-2699v3: 45MB LLC, 22nm SRAM

SRAM cache continues to play a key role in the performance of modern SoCs.

Benefits of FinFET Technology

• Superior short-channel control vs. planar
• Low-doped channel enables a low-VT transistor, critical for low VMIN
• Reverse transistor width scaling provides higher drive current per unit area
• Improved Ion/Ioff ratio at scaled channel length
• Reduced transistor VT variation due to the low-doped channel
• Improved soft-error rate (SER) due to reduced transistor junction area

Kuhn, ECSM, 2010

FinFET vs. Planar Transistor Performance

Ramey, IRPS, 2013

The Trigate FinFET transistor offers higher drive current, lower subthreshold slope and lower transistor VT than the planar transistor.

Fin Design Consideration

[Figure: fin cross-section labeled with LGATE, fin width (W), fin height (Hsi) and fin pitch]

• Transistor width per fin = W + 2*Hsi
• Fin width (W)
  – Higher LGATE / W ratio improves DIBL and subthreshold slope
  – Smaller fin width is difficult to pattern
• Fin height (Hsi)
  – Improves transistor current
  – Taller fin is difficult for the fin etch process
  – Scaling challenge for FinFET
• Fin pitch
  – Determines transistor layout density
  – Tighter fin pitch is difficult for the fin patterning process
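The width relation on this slide can be turned into a small calculation. The sketch below is illustrative only: the fin dimensions and pitches are hypothetical placeholders rather than values from the course, and `effective_width_nm` and `width_per_pitch` are names introduced here.

```python
def effective_width_nm(fin_width_nm, fin_height_nm, n_fins=1):
    """Electrical width of an n-fin transistor: n * (W + 2 * Hsi)."""
    if n_fins < 1:
        raise ValueError("a FinFET needs at least one fin")
    return n_fins * (fin_width_nm + 2.0 * fin_height_nm)


def width_per_pitch(fin_width_nm, fin_height_nm, fin_pitch_nm):
    """Electrical width per nanometre of layout width, assuming one fin per pitch."""
    return effective_width_nm(fin_width_nm, fin_height_nm) / fin_pitch_nm


if __name__ == "__main__":
    # Hypothetical fin geometries, chosen only to show the trend.
    cases = [("shorter fin, relaxed pitch", 8.0, 34.0, 60.0),
             ("taller fin, tighter pitch", 8.0, 42.0, 42.0)]
    for label, w, h, pitch in cases:
        print(f"{label}: Weff = {effective_width_nm(w, h):.0f} nm per fin, "
              f"{width_per_pitch(w, h, pitch):.2f} nm of width per nm of pitch")
```

Because width only comes in whole fins, the drive strength of a device changes in steps of one fin; a taller fin on a tighter pitch raises the width available per unit of layout area.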

Transistor Fin Scaling

[Figure: fin cross-sections at 22nm and 14nm; Bohr, Intel Development Forum, 2014]

• Tighter fin pitch improves circuit density
• Taller and thinner fins improve circuit performance and short-channel control

Transistor Variability

[Figure panels: gate length variation by line-edge roughness (LER); random dopant fluctuation (RDF); gate WF, fin LER and Rext variation. Sources: K. Kuhn, IEDM, 2007; T. Matsukawa, VLSI, 2008; P. Oldiges, ICSSPD, 2000]

• RDF, gate work function (WF), fin and gate LER, and Rext contribute to the variation of transistor threshold voltage and drive current.
• RDF is significantly reduced in FinFET. Variations from WF, fin LER and Rext become significant.

Transistor Variation Trend

[Figure: 1-fin-equivalent VT variation and minimum transistor width (A.U.) for the minimum-width transistor, annotated with strain, HKMG and FinFET; Natarajan, IEDM, 2014; Giles, VLSI, 2015]

• Low-doped channel and reversed width scaling in FinFET technology enable reduction of transistor VT variation for the min-width transistor
• Lower VT variation enables an improved read/write window for FinFET SRAM
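One common way to reason about this trend is an area-scaling (Pelgrom-style) estimate, sigma(VT) = A_VT / sqrt(W * L). The slide does not state this model, so the sketch below is an assumption for illustration; the A_VT coefficients and dimensions are hypothetical, and `sigma_vt_mv` and `fin_width_um` are names introduced here.

```python
import math


def fin_width_um(fin_width_nm, fin_height_nm, n_fins=1):
    """Electrical width in micrometres from the W + 2*Hsi relation."""
    return n_fins * (fin_width_nm + 2.0 * fin_height_nm) / 1000.0


def sigma_vt_mv(a_vt_mv_um, width_um, length_um):
    """Area-scaling estimate of random VT mismatch: A_VT / sqrt(W * L)."""
    return a_vt_mv_um / math.sqrt(width_um * length_um)


if __name__ == "__main__":
    length_um = 0.025                  # hypothetical gate length
    planar_w = 0.050                   # hypothetical minimum planar width
    finfet_w = fin_width_um(8.0, 42.0) # hypothetical single fin
    # Hypothetical coefficients: a lower A_VT stands in for the low-doped channel.
    print("planar :", round(sigma_vt_mv(2.0, planar_w, length_um), 1), "mV")
    print("finfet :", round(sigma_vt_mv(1.3, finfet_w, length_um), 1), "mV")
```

In this picture both effects named on the slide act in the same direction: a lower mismatch coefficient and a larger electrical width for the minimum device each reduce sigma(VT).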

Challenges of FinFET Technology

• Complex process integration
  – e.g. fin patterning and isolation, gate stack and S/D engineering
• New sources of variability
  – VT variation is sensitive to variability of fin dimension and shape
  – Work function of the metal gate is dominant for an undoped channel
• Increasing device parasitics
  – Higher fringing capacitance due to the 3D FinFET structure
  – Increasing complexity of parasitics modeling and circuit simulation
• Design-layout challenges
  – Designs need to adapt to quantized device width and new parasitics
  – Min-area SRAM cell requires assist circuitry to balance read/write margin
  – EDA tools need to be adapted for FinFET design

FinFET Parasitics

Kuhn, ECSM, 2010 Wong, SISPAD, 2009

• Intrinsic gate capacitance is a small percentage of total capacitance
• Optimization of memory bitcell and circuit must include parasitics

Feature Size Scaling

[Figure: contacted poly pitch (CPP, gate and contact) and metal pitch in nm, log scale, versus technology node (45, 32/28, 22/20, 16/14, 10), spanning planar CMOS and FinFET CMOS. Data: 1. Mistry, IEDM, 2007; 2. Natarajan, IEDM, 2008; 3. Auth, VLSI, 2012; 4. Natarajan, IEDM, 2014]

• 14nm technology has a 70nm contacted poly pitch and a 52nm metal pitch
• Interconnect has increasingly become the chip performance limiter

Interconnect Scaling

[Figure: interconnect pitch (nm) and line resistance (A.U.), log scale, versus technology generation (65, 45, 32, 22, 14 nm). Data: 1. Bai, IEDM, 2004; 2. Mistry, IEDM, 2007; 3. Natarajan, IEDM, 2008; 4. Auth, VLSI, 2012; 5. Natarajan, IEDM, 2014]

• Line resistivity increases as metal cross-sectional area decreases with scaling
• Resistance variation is also getting worse with scaling
• Interconnect resistance has an increasing impact on memory performance
  – Wordlines and bitlines use narrow widths for density
• New memory architectures and novel interconnects are needed to overcome the interconnect scaling challenge
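The geometric part of this trend follows from R = rho * L / (w * t) for a rectangular wire. The sketch below uses hypothetical wire dimensions and the bulk resistivity of copper; it deliberately holds resistivity constant, so it understates the increase described on the slide, where resistivity itself rises as the cross-section shrinks. The function name is introduced here for illustration.

```python
COPPER_BULK_RESISTIVITY_OHM_M = 1.68e-8  # bulk value near room temperature


def line_resistance_ohm(length_m, width_m, thickness_m,
                        resistivity_ohm_m=COPPER_BULK_RESISTIVITY_OHM_M):
    """Resistance of a rectangular wire: rho * L / (w * t)."""
    return resistivity_ohm_m * length_m / (width_m * thickness_m)


if __name__ == "__main__":
    # Hypothetical wire: 10 um long, 30 nm wide, 60 nm thick.
    length, width, thickness = 10e-6, 30e-9, 60e-9
    shrink = 0.7  # linear shrink applied to width and thickness only
    r_before = line_resistance_ohm(length, width, thickness)
    r_after = line_resistance_ohm(length, width * shrink, thickness * shrink)
    print(f"before shrink: {r_before:.1f} ohm")
    print(f"after shrink : {r_after:.1f} ohm ({r_after / r_before:.2f}x)")
```

A 0.7x shrink of both width and thickness roughly doubles the resistance of a wire of fixed length, before any size-dependent resistivity increase is counted.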


SRAM Bitcell Scaling

[Figure: SRAM bitcell area (axis from 0.00 to 0.10) versus technology node (22, 20, 16, 14, 10), planar and FinFET. Data: 1. Song, ISSCC, 2016; 2. Karl, ISSCC, 2015; 3. Song, ISSCC, 2014; 4. Chen, ISSCC, 2014; 5. Chang, ISSCC, 2013; 6. Karl, ISSCC, 2012]

• FinFET technology enables continual SRAM cell area scaling
• 10nm 128Mb FinFET SRAM reported at ISSCC’16

SRAM Array Density Scaling

[Figure: SRAM array density (Mbits/mm2, log scale from 0.1 to 100) versus process technology (65nm, 45nm, 32nm, 22nm, 14nm) for high-density and low-voltage SRAM; the 14nm points are 14.5Mb/mm2 and 11.6Mb/mm2; Karl, ISSCC, 2015]

• 14nm HDC SRAM achieves 14.5Mb/mm2 array density, including assist circuitry for leakage reduction and R/W margin expansion

14nm HDC SRAM Performance

[Figure panels: HDC SRAM (512b/BL, 14.5Mb/mm2) and LVC SRAM (256b/BL, 11.6Mb/mm2); Karl, ISSCC, 2015]

• 14nm HDC SRAM array reaches 1.2GHz at 0.7V
• Performance is driven by bitcell C/I, variability and array configuration

Low-Power SRAM

Transistor and circuit co-optimization enable sub-10pA bitcell leakage in 22nm FinFET technology


SRAM Design

[Figure: SRAM bitcell schematic with storage nodes N0 and N1]

• Conflicting requirements arise from sharing the pass-gate transistor between read and write operations
• A strong pass-gate transistor and a weak PMOS are needed for a write-optimized SRAM
• A weak pass-gate transistor and a strong PMOS are needed for a read-optimized SRAM
• The bitcell is normally optimized to strike a balance between read stability and write-ability
• Transistor variability reduces the margin available to balance these competing requirements at low VMIN

SRAM Operating Voltage

[Figure: cumulative percentile (%) versus SRAM VMIN (A.U.). Annotations: random variation determines the VMIN of each die; systematic variation determines the SRAM VMIN of the (remainder of this annotation is truncated in the source).]

• Low SRAM VMIN is critical for SoC power consumption and reliability
• VMIN is dominated by random and systematic transistor variations
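A toy Monte Carlo model can illustrate how the two variation components combine. It assumes, purely for illustration, that each cell's VMIN is Gaussian around a nominal value, that a die fails at the VMIN of its worst cell, and that systematic variation shifts every cell on a die by the same offset. All voltages and sigmas are hypothetical, and the function names are introduced here.

```python
import random


def die_vmin(rng, n_cells, v_nominal, sigma_random, die_offset):
    """VMIN of one die is set by its worst (highest-VMIN) cell."""
    worst = max(rng.gauss(0.0, sigma_random) for _ in range(n_cells))
    return v_nominal + die_offset + worst


def vmin_population(n_dies, n_cells, v_nominal=0.5, sigma_random=0.02,
                    sigma_systematic=0.01, seed=1):
    """Sorted die-level VMIN values for a population of dies."""
    rng = random.Random(seed)
    return sorted(
        die_vmin(rng, n_cells, v_nominal, sigma_random,
                 rng.gauss(0.0, sigma_systematic))
        for _ in range(n_dies)
    )


if __name__ == "__main__":
    dies = vmin_population(n_dies=100, n_cells=1000)
    for pct in (10, 50, 90):
        index = int(pct / 100 * (len(dies) - 1))
        print(f"{pct}th percentile die VMIN: {dies[index]:.3f} V")
```

In this model a larger array pushes every die's VMIN up, because the worst cell sits further out in the tail, while the die-to-die offset spreads the population along the VMIN axis.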

SRAM Performance Requirement

[Figure: four radar charts (Ideal SRAM, High-Performance SRAM, Low-Power SRAM, High-Density SRAM), each rating Vmin (Vccmin), speed, density and power on a scale from worse to better]

SRAM assist needs to be optimized for diverse product requirements

Restricted Design Rules

90nm SRAM cell:
• Bi-directional features
• Wide range of gate CD
• Wide range of transistor size

14nm HDC SRAM cell (1:1:1 PU:PG:PD fin ratio):
• Uni-directional features
• Limited gate CD options
• Gridded diffusion and poly
• Quantized device width

Restricted design rules in FinFET CMOS remove the sizing options available in a conventional SRAM cell.
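The effect of quantized device width on cell sizing can be sketched by enumerating the strength ratios that whole fins allow. The sketch assumes drive strength is proportional to fin count (same fin geometry and gate length for every device), which is a simplification; `achievable_ratios` is a name introduced here.

```python
from fractions import Fraction
from itertools import product


def achievable_ratios(max_fins):
    """Distinct PD:PG fin ratios reachable with 1..max_fins fins per device."""
    fins = range(1, max_fins + 1)
    return sorted({Fraction(pd, pg) for pd, pg in product(fins, repeat=2)})


if __name__ == "__main__":
    for max_fins in (1, 2, 3):
        ratios = ", ".join(str(r) for r in achievable_ratios(max_fins))
        print(f"up to {max_fins} fin(s) per device: {ratios}")
```

With one fin per device the only available ratio is 1, matching the 1:1:1 cell on this slide; a planar cell could instead pick any ratio by drawing a different transistor width.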

SRAM Design: From 32nm Planar to 22nm FinFET

• Shrinking read/write operating window at lower VMIN
• Quantized device size in the FinFET SRAM cell affects the operating margin of the min-area SRAM bitcell because the transistor sizing ratio is fixed
• External circuit assist techniques are used to enhance and balance SRAM read/write margins


Assist Circuits: Area Overhead

Karl, ISSCC, 2012 Chang, ISSCC, 2013 Song, ISSCC, 2014

• Assist circuits enable superior VMIN at the cost of 1-5% area overhead
• Still a better option than increasing memory cell size
• Power, performance and area tradeoffs decide the optimal assist technique

Circuit Assist Techniques for Read Margin

[Figure: read-assist options: boost global Vcc, negative VBL/VSS, boost SRAM Vcc and underdrive WL; Mann, Solid State Electronics, 2010]

Wordline Under Drive Read Assist

Karl, ISSCC, 2012 Nii, ISSCC, 2007

• Wordline underdrive read assist has a low implementation cost but can significantly limit read performance and write margin at lower Vmin
• Careful design is required to limit the variation of the underdriven wordline voltage across process and temperature

Bitline Voltage Underdrive Read Assist

Khellah, VLSI, 2006 Pilo, ISSCC, 2011

[Figure panels: lower bitline voltage; higher bitline voltage]

• Lower bitline voltage reduces read disturb at the storage node
• Read stability vs. read performance tradeoff (no write margin impact)
• Limited improvement vs. wordline underdrive read assist circuit
• An optimal bitline voltage range is needed to achieve optimal read stability
  – Lower bump voltage vs. reverse stability failure at lower bitline voltage

Exploiting Dynamic Stability for Read Margin

[Figure panels: short WL pulse (duration too short for voltage change); long WL pulse (enough time to complete voltage change); reducing the wordline pulse lowers the read fail bit count. Sources: Toh, JSSC, 2011; Yamaoka, ESSCIRC, 2008]

• Shortening the wordline pulse reduces the read failure rate, but sensing margin is a concern
• A short bitline improves stability and alleviates the sensing margin concern with a short WL pulse

Circuit Assist Techniques for Write Margin

[Figure: write-assist options: negative bitline, SRAM VSS boost, SRAM VCC collapse and boost WL; Mann, Solid State Electronics, 2010]

Transient Voltage Collapse (TVC) Write Assist

Wang, IEDM 2011 Karl, ISSCC 2012

• Dynamically weaken PMOS to reduce write contention
• Risk: dynamic retention of unselected cells along the column
• Dynamic voltage droop (write margin) vs. retention trade-off
• No impact on read stability
• Write power overhead

TVC Write Assist Circuits

[Figure panels: strong-bias TVC; charge-share TVC; Karl, ISSCC, 2015]

• Design tradeoff: write power vs. assist circuit area overhead
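The charge-share variant can be approximated with ideal charge conservation: a column supply node at Vcc is connected to a capacitor that starts at a lower voltage, and the two settle to a common level. This is a simplified sketch under that assumption, not the published circuit; the capacitance values are hypothetical and `charge_share_voltage` is a name introduced here.

```python
def charge_share_voltage(vcc, c_column_f, c_share_f, v_share_initial=0.0):
    """Final voltage after ideal charge sharing between two capacitors."""
    total_charge = c_column_f * vcc + c_share_f * v_share_initial
    return total_charge / (c_column_f + c_share_f)


if __name__ == "__main__":
    vcc = 0.7            # hypothetical supply in volts
    c_column = 30e-15    # hypothetical column supply capacitance
    for c_share in (5e-15, 10e-15, 20e-15):
        v = charge_share_voltage(vcc, c_column, c_share)
        print(f"C_share = {c_share * 1e15:.0f} fF -> column supply droops to {v:.3f} V")
```

A larger sharing capacitor gives a deeper droop, which helps write margin at the expense of area and of retention margin for unselected cells on the same column.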

Negative Bit-Line Write Assist

[Figure: negative bit-line boost circuit; annotation: precharge Nboost to ground. Sources: Pilo, ISSCC, 2011; Karl, IEDM, 2012]

• Dynamically strengthen PG NMOS to improve write margin
• PG NMOS reliability and data retention issues of unselected bitcells
• Timing and area overhead
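A first-order estimate of the negative bitline level treats the boost as an ideal capacitive divider: the bitline is driven to ground and floated, then the far plate of a boost capacitor is switched from Vdd to ground. This is an idealized sketch that ignores leakage, junction clamping and driver details; the values are hypothetical and both function names are introduced here.

```python
def negative_bitline_voltage(vdd, c_boost_f, c_bitline_f):
    """Bitline voltage after the boost capacitor's far plate swings Vdd -> 0."""
    return -vdd * c_boost_f / (c_boost_f + c_bitline_f)


def boost_cap_for_target(vdd, c_bitline_f, target_magnitude_v):
    """Boost capacitance needed to reach -target_magnitude_v on the bitline."""
    if not 0.0 < target_magnitude_v < vdd:
        raise ValueError("target must lie between 0 and vdd")
    return c_bitline_f * target_magnitude_v / (vdd - target_magnitude_v)


if __name__ == "__main__":
    vdd, c_bl = 0.8, 30e-15  # hypothetical supply and bitline capacitance
    c_boost = boost_cap_for_target(vdd, c_bl, 0.2)
    print(f"boost capacitor: {c_boost * 1e15:.1f} fF")
    print(f"bitline level  : {negative_bitline_voltage(vdd, c_boost, c_bl):.3f} V")
```

The divider shows why area is part of the overhead listed above: a longer bitline has more capacitance, so it needs a proportionally larger boost capacitor for the same negative level.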

Systematic Variations

[Figure panels: transistor stress vs. layout density; die-to-die variation of on-die ring oscillator frequency; anneal temperature variation; CMP. Source: Kuhn, IEDM, 2007]

• New process technologies bring new sources of systematic variation
  – Stress from strained silicon: drive current variation
  – Metal gate work function: transistor VT variation
  – Fin etch: drive current and VT variation due to fin width, height and shape variations
• Adaptive circuit techniques monitor and compensate for systematic variations

Process Variation Control

[Figure panels: poly opening polish (POP) requirements for metal gate, showing STI and POP within-die (WID) CMP topography against the threshold needed to enable a functional HKMG transistor; scaling of gate CD control, showing systematic and random variation versus technology node (350, 250, 180, 130, 90, 65, 45 nm). Sources: Steigerwald, IEDM, 2008; Kuhn, IEDM, 2007]

Continued improvement of process control is important for future SRAM scaling

Adaptive to Systematic Variation

Kolar, JSSC, 2011

• Systematic variation can be mitigated by design and test techniques
• An on-die sensor “intelligently” chooses the optimal WL voltage setting for WLUD
• Per-die control of assist settings

SRAM Interconnect Challenges

• Increasing bitline and wordline resistances as technology scales
  – 22nm: 256b M2 bitlines ~10-100s Ω
  – 7nm: 256b M2 bitline ~1000 Ω
• Bigger impact on high-performance arrays and large-signal operation
  – Repeater insertion affects memory density
  – SRAM write, precharge and large-signal sensing are more sensitive to higher metal resistance

Hierarchical Array Design

+ Read stability from short BL
+ Access time less sensitive to metal resistance
+ Lower power consumption
- Lower density

Sinangil, ISSCC, 2011

Hierarchical design uses short local bitlines and long global bitlines to address the power (Cdyn) and delay of a long, resistive bitline.
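The benefit of short local bitlines can be seen from the Elmore delay of a distributed RC line, which grows with the square of its length. The sketch below models only the wire itself with hypothetical per-cell resistance and capacitance; it ignores the driver, the cell read current, the global bitline and the sense amplifier. `bitline_elmore_delay_s` is a name introduced here.

```python
def bitline_elmore_delay_s(n_cells, r_per_cell_ohm, c_per_cell_f):
    """Elmore delay of a distributed RC line: 0.5 * R_total * C_total."""
    r_total = n_cells * r_per_cell_ohm
    c_total = n_cells * c_per_cell_f
    return 0.5 * r_total * c_total


if __name__ == "__main__":
    r_cell, c_cell = 1.0, 0.2e-15  # hypothetical per-cell wire R and C
    for cells in (256, 128, 64):
        delay_ps = bitline_elmore_delay_s(cells, r_cell, c_cell) * 1e12
        print(f"{cells:3d} cells per bitline: wire delay about {delay_ps:.2f} ps")
```

Cutting the local bitline from 256 to 64 cells reduces this wire term by 16x and lowers the switched capacitance per access, at the price of extra local sense or multiplexing circuitry, which is the density penalty listed on the slide.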


FinFET Transistor Reliability

• Intrinsic reliability of the FinFET transistor matches or exceeds that of planar with appropriate engineering.
• FinFET has improved TDDB and matched BTI vs. planar.

Wang, EDL, 2013 Ramey, IRPS, 2013 Lee, IRPS, 2013

Bias-Temperature-Instability (BTI) and SRAM Vmin

[Figure: read Vmin and write Vmin shifts after stress. Sources: Lin, IRPS, 2008; Pae, TDMR, 2008; A. Bansal, IRPS, 2009]

Read & Standby:
• VT and σ(VT) increase after BT stress
• Read Vmin increases
• Write Vmin decreases
• Vmin after burn-in (BI) could increase or decrease depending on whether the time-zero Vmin is read or write limited

• BTI degrades P/NMOS and changes the balance of read and write margins set at time-zero
• Design for both time-zero and BTI-induced variability

Radiation Induced Soft Error

[Figure: planar transistor versus FinFET transistor; Fang and Oates, TDMR, 2011]

$\mathrm{SER} \propto A_{\mathrm{diff}} \exp\left(-Q_{\mathrm{CRIT}} / Q_{\mathrm{COLL}}\right)$

Less collected charge QCOLL → reduced SER

Soft error events are substantially reduced in FinFET transistors due to the reduced junction area
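The proportionality on this slide can be explored numerically. The sketch assumes the usual exponential form with a negative exponent, SER proportional to A_diff * exp(-Q_CRIT / Q_COLL), which matches the slide's statement that less collected charge reduces SER. All charge and area numbers are hypothetical, and `relative_ser` is a name introduced here.

```python
import math


def relative_ser(area_ratio, q_crit, q_coll_ref, q_coll_new):
    """SER of a new device relative to a reference device.

    area_ratio is A_diff(new) / A_diff(ref); charges share any one unit.
    """
    exponent = -q_crit / q_coll_new + q_crit / q_coll_ref
    return area_ratio * math.exp(exponent)


if __name__ == "__main__":
    # Hypothetical numbers: half the junction area and 30% less collected charge.
    ratio = relative_ser(area_ratio=0.5, q_crit=1.0, q_coll_ref=1.0, q_coll_new=0.7)
    print(f"relative SER: {ratio:.3f}")
```

The area term scales SER linearly, while the collected-charge term acts through the exponent, so a modest reduction in collected charge can outweigh the area reduction.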

Soft Error Trend

[Figure: soft-error sources (alpha particles, thermal neutrons, high-energy neutrons) and upset types. High-energy neutrons cause both SBU and MCU. SBU is correctable by ECC; ECC is less effective for MBU, and column interleaving alleviates MBU.]

[Figure panels: SER (a.u.) versus voltage (V) for 22nm and 32nm; 1D MCU probability versus cell distance (um). Source: N. Seifert, DFT, 2011]

• Single-bit upset (SBU) is reduced with FinFET technology
• Multi-bit upset (MBU) needs to be mitigated by a combination of ECC and bit interleaving techniques
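Column interleaving can be sketched as a mapping from physical columns to logical words. The sketch assumes a simple modulo mapping (physical column c belongs to word c mod k for interleave factor k) and a single-error-correcting code per word; both are illustrative assumptions, and the function names are introduced here.

```python
from collections import Counter


def errors_per_word(first_col, cluster_width, interleave):
    """Count upset bits per logical word for a cluster of adjacent columns."""
    hits = Counter()
    for col in range(first_col, first_col + cluster_width):
        hits[col % interleave] += 1
    return hits


def correctable_by_sec(first_col, cluster_width, interleave):
    """True if no word receives more than one upset bit."""
    return max(errors_per_word(first_col, cluster_width, interleave).values()) <= 1


if __name__ == "__main__":
    for interleave in (1, 2, 4):
        ok = correctable_by_sec(first_col=5, cluster_width=3, interleave=interleave)
        print(f"interleave {interleave}: 3-cell cluster correctable = {ok}")
```

With this mapping a cluster of adjacent upsets is correctable as long as it is no wider than the interleave factor, because each affected word then sees only a single-bit error.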


Embedded Memory for Low-Power IoT Platform

[Figure: processor-centric memory hierarchy. Processor, registers, L1 cache, L2 cache and L3 cache use embedded SRAM; main memory, local storage and network storage use external DRAM and storage.]

Low-power on-die volatile and non-volatile memories are critical to meet the requirements of ultra-low-power IoT systems

Emerging Memory Technology

[Figure: application segments (compute, IoT, MCU, automotive) mapped to SRAM (volatile), eDRAM (volatile), eNVM (eFlash/EEPROM/OTP/MTP) and emerging memory technology]

• Existing embedded memories cannot address latency, memory BW and persistent storage requirements at the same time
  – SRAM/eDRAM are fast but volatile; eFlash is non-volatile but slow and has limited endurance
• Alternative memories and hierarchies are being actively explored to expand the capability for both compute and persistent storage

Alternative Embedded Memory Technologies

• Logic-process-based embedded DRAM (eDRAM)
  + High density, lower latency and higher BW/Watt vs. SRAM cache, lower SER
  + 22nm FinFET Gbit eDRAM product
  – Process integration complexity and manufacturing cost
  – Scaling of capacitor and low-leakage access transistor
• ReRAM
  + High density memory with nonvolatility at high temperature, zero retention power
  – Requires high-voltage thick-gate transistor
  – Limited cycling endurance, not suitable for cache memory
• Spin-Transfer-Torque MRAM (STT-MRAM)
  + High density memory with nonvolatility and high cycling endurance
  – Complicated material stack and integration requirement with logic process
• Phase Change Memory
  + High density memory with nonvolatility and zero retention power
  – Device integration requirement with logic process and write power

Alternative memories complement SRAM for applications demanding higher capacity, higher memory BW/Watt and non-volatility

Embedded DRAM

For applications demanding large memory capacity, embedded DRAM bridges the gap in memory density, latency, power and cost between on-die SRAM and DRAM

High Performance SoC using Embedded DRAM

[Figure: die photos with figure labels as extracted]
• IBM zEC12 chip (Warnock, ISSCC, 2013): CPU, 48MB eDRAM L3 cache, 192MB eDRAM L4 cache, 526mm2; eDRAM on 32nm SOI CMOS
• Intel Iris™ Pro Graphics (Hamzaoglu, ISSCC, 2014): 128MB eDRAM, 77mm2; eDRAM on 22nm FinFET CMOS

Logic-based eDRAM improves SoC performance by increasing embedded memory capacity and memory bandwidth at higher BW/Watt versus SRAM cache

22nm Embedded DRAM

[Figure: 22nm 1T1C trigate eDRAM cell (0.029 µm2) showing the cell transistor and fin, and the >13fF MIM capacitor embedded in the 22nm logic process. Sources: Wang, IEDM, 2013; Meterelliyoz, VLSI, 2014; Brain, VLSI, 2013]

• 0.029µm2 eDRAM cell with >13fF MIM capacitor
• eDRAM cell uses a trigate transistor for low leakage and a MIM capacitor to enable >100µs retention time
• Noise reduction techniques and bitcell layout optimization are applied to reduce the retention failure rate
• Temperature-dependent self-refresh minimizes refresh power

ReRAM

Wei, IEDM, 2015:
• 2Mbit RRAM in 40nm logic CMOS
• Demonstrated 10-year data retention at 85°C with 100k cycles

Ueki, VLSI, 2015:
• 2Mbit RRAM in 90nm logic CMOS
• Demonstrated 10-year data retention at 85°C

• Data retention performance shows promise as an alternative eNVM
• Challenges:
  – Integrating the RRAM device in a scaled BEOL process
  – High forming/set/reset voltage requires a thick-gate transistor
  – Reliability: resistance variability after cycling, endurance and data retention
  – Limited cycling endurance prevents usage as on-die cache

Spin-Transfer-Torque MRAM (STT-MRAM)

Lu, IEDM, 2015 Ranjan, SNIA, 2016

• 1Mbit STT-MRAM in 40nm LP CMOS
• Memory device: perpendicular MTJ
• 64Mbit STT-MRAM in 55nm CMOS
• Bitcell size = 0.065um2
• Memory device: 50nm perpendicular MTJ
• Demonstrated 20ns read time and <100ns write time
[Figure annotation: embedded between M1 and transistor]

• Speed, endurance and data retention show promise to replace eFlash, eDRAM and SRAM
• Challenges:
  – Integrating the MTJ device in a scaled BEOL process
  – Reliability: data retention at high temperature, magnetic interference and stochastic write error

Phase Change Memory

Sandre, JSSC, 2011:
• 4Mbit ePCM in 90nm 6-ML CMOS
• Bitcell size = 36F2 / 0.29um2
• Three additional masks for integrating the memory element
• Set/reset time < 1 µsec
• Endurance ~ 10^6 cycles

• Data retention performance shows promise as an alternative eNVM
• Challenges:
  – Integrating the PCM device in a scaled BEOL process
  – Device performance at high temperature
  – Impact of high programming temperature on BEOL reliability
  – High write energy and latency
  – Reliability: resistance drift after cycling


Summary

• SRAM continues to be the workhorse memory of modern SoC products.
• FinFET technology enables continual SRAM scaling by providing higher performance and lower VMIN.
• Circuit-assist techniques and tight design-process collaboration are increasingly critical to address SRAM scaling challenges in FinFET CMOS and future technologies.
• Alternative memory technologies and hierarchies are emerging to address requirements for dense and non-volatile embedded memory in new classes of applications.

Acknowledgements

The author gratefully acknowledges the many people in the following organizations at Intel who contributed to this work:
• Advanced Design
• Logic Technology Development
• Quality and Reliability Engineering
• Components Research
• Assembly & Test Technology Development

References

[1] C. Auth, et al., “A 22nm High Performance and Low-Power CMOS Technology Featuring Fully-Depleted Tri-Gate Transistors, Self-Aligned Contacts and High Density MIM Capacitors,” Symp. on VLSI Tech. Dig., 2012.
[2] A. Bansal, et al., “Impact of NBTI and PBTI in SRAM Bit-cells: Relative Sensitivities and Guidelines for Application-Specific Target Stability/Performance,” IRPS, 2009.
[3] R. Baumann, “The Impact of Technology Scaling on Soft Error Rate Performance and Limits to the Efficacy of Error Correction,” IEDM Tech. Digest, Dec. 2002.
[4] M. Bohr, “The Evolution of Scaling from the Homogeneous Era to the Heterogeneous Era,” IEDM Tech. Digest, Dec. 2011.
[5] M. Bohr, “14nm Process Technology: Opening New Horizons,” 2014 Intel Development Forum.
[6] R. Brain, et al., “A 22nm High Performance Embedded DRAM SoC Technology Featuring Tri-Gate Transistors and MIMCAP COB,” Symp. on VLSI Tech. Dig., June 2013.
[7] J. Chang, et al., “A 20nm 112Mb SRAM in High-κ metal-gate with assist circuitry for low-leakage and low-VMIN applications,” ISSCC Dig. Tech. Papers, Feb. 2013.
[8] Y. Chen, et al., “A 16nm 128Mb SRAM in High-K Metal-Gate FinFET Technology with Write-Assist Circuitry for Low-VMIN Applications,” ISSCC Dig. Tech. Papers, Feb. 2014.
[9] S. Chittor, “Memory Requirement Trends and Challenges: Servers to Devices,” ISSCC Forum, 2015.
[10] M. Giles, et al., “High sigma measurement of random threshold voltage variation in 14nm Logic FinFET technology,” Symp. on VLSI Tech. Dig., 2015.
[11] F. Hamzaoglu, et al., “A 153 Mb-SRAM design with dynamic stability enhancement and leakage reduction in 45 nm high-K metal-gate CMOS technology,” IEEE ISSCC Dig. Tech. Papers, Feb. 3–7, 2008, pp. 376–621.
[12] F. Hamzaoglu, et al., “A 1Gb 2GHz Embedded DRAM in 22nm Tri-Gate CMOS Technology,” IEEE J. Solid-State Circuits, Jan. 2015.
[13] C.-H. Jan, et al., “A 32nm SoC Platform Technology with 2nd Generation High-k/Metal Gate Transistors Optimized for Ultra Low Power, High Performance, and High Density Product Applications,” IEDM Tech. Digest, Dec. 2009.
[14] C.-H. Jan, et al., “A 22nm SoC platform technology featuring 3-D tri-gate and high-k/metal gate, optimized for ultra low power, high performance and high density SoC applications,” IEDM Tech. Digest, Dec. 2012.
[15] C.-H. Jan, et al., “A 14nm SoC Platform Technology Featuring 2nd Generation Tri-Gate Transistors, 70nm Gate Pitch, 52nm Metal Pitch, and 0.0499um2 SRAM cells, Optimized for Low Power, High Performance and High Density SoC Products,” Symp. on VLSI Tech. Dig., 2015.
[16] E. Karl, et al., “A 4.6GHz 162Mb Single-Supply SRAM Design in 22nm Tri-Gate CMOS Technology with Active Vmin-Enhancing Assist Circuitry,” ISSCC Dig. Tech. Papers, Feb. 2012.
[17] E. Karl, et al., “A 0.6V, 1.5GHz 84Mb SRAM in 14 nm FinFET CMOS Technology With Capacitive Charge-Sharing Write Assist Circuitry,” ISSCC Dig. Tech. Papers, Feb. 2015.
[18] M. Khellah, et al., “Process, temperature, and supply-noise tolerant 45 nm dense cache arrays with Diffusion-Notch-Free (DNF) 6T SRAM cells and dynamic multi-Vcc circuits,” IEEE J. Solid-State Circuits, vol. 44, no. 4, pp. 1199–1208, Apr. 2009.

[19] P. Kolar, et al., “A 32nm High-K Metal Gate SRAM with Adaptive Dynamic Stability Enhancement for Low-Voltage Operation,” IEEE J. Solid-State Circuits, vol. 46, no. 1, pp. 76–84, Jan. 2011.
[20] K. Kuhn, “Reducing Variation in Advanced Logic Technologies: Approaches to Process and Design for Manufacturability of nanoscale CMOS,” IEDM Tech. Digest, Dec. 2007.
[21] K. Kuhn, “Moore's Law past 32nm: Future Challenges in Device Scaling,” Solid-State Materials and Devices Conference, 2009.
[22] K. Lee, et al., “Technology Scaling on High-K & Metal-Gate FinFET BTI Reliability,” IRPS, 2013.
[23] J. Lin, et al., “Time Dependent Vccmin Degradation of SRAM Fabricated with High-k Gate Dielectrics,” IRPS, pp. 439–444, 2007.
[24] Y. Lu, et al., “Fully Functional Perpendicular STT-MRAM Macro Embedded in 40 nm Logic for Energy-efficient IOT Applications,” IEDM Tech. Digest, Dec. 2015.
[25] T. K. Liu, “FinFET History, Fundamentals and Future,” Symp. on VLSI Tech. short course, 2012.
[26] R. Mann, et al., “Impact of circuit assist methods on margin and performance in 6T SRAM,” Solid-State Electronics, vol. 54, pp. 1398–1407, 2010.
[27] W. Maszara, et al., “FinFETs - Technology and Circuit Design Challenges,” ESSCIRC, Sept. 2013.
[28] T. Matsukawa, et al., “Comprehensive analysis of variability sources of FinFET characteristics,” VLSI Tech. Digest, 2008.
[29] M. Meterelliyoz, et al., “2nd Generation Embedded DRAM with 4X Lower Self Refresh Power in 22nm Tri-Gate CMOS Technology,” Symp. on VLSI Circuits Dig., June 2014.
[30] K. Mistry, et al., “A 45nm Logic Technology with High-k+Metal Gate Transistors, Strained Silicon, 9 Cu Interconnect Layers, 193nm Dry Patterning, and 100% Pb-free Packaging,” IEDM Tech. Digest, Dec. 2007.
[31] S. Natarajan, et al., “A 32nm logic technology featuring 2nd-generation high-k + metal-gate transistors, enhanced channel strain and 0.171μm2 SRAM cell size in a 291Mb array,” IEDM Tech. Digest, Dec. 2008.
[32] S. Natarajan, et al., “A 14nm logic technology featuring 2nd-Generation FinFET transistors, Air-Gapped Interconnects, Self-Aligned Double Patterning and a 0.0588μm2 SRAM cell size,” IEDM Tech. Digest, Dec. 2014.
[33] K. Nii, et al., “A 45-nm bulk CMOS embedded SRAM with improved immunity against process and temperature variations,” IEEE J. Solid-State Circuits, vol. 43, no. 1, pp. 180–191, Jan. 2008.
[34] P. Packan, et al., “High performance Hi-K + metal gate strain enhanced transistors on (110) silicon,” IEDM Tech. Digest, Dec. 2008.
[35] S. Pae, et al., “Effect of BTI Degradation on Transistor Variability in Advanced Technologies,” IEEE Trans. Device Mater. Rel., vol. 8, no. 3, Sept. 2008.
[36] H. Pilo, ISSCC Tutorial, 2011.
[37] M. Radosavljevic, et al., “Non-planar, multi-gate InGaAs quantum well field effect transistors with high-K gate dielectric and ultra-scaled gate-to-drain/gate-to-source separation for low power logic applications,” IEDM Tech. Digest, Dec. 2010.

[38] S. Ramey, et al., “Tri-gate Transistor Reliability,” IRPS Tutorial, 2014.
[39] N. Seifert, “Radiation Effects in Nanoscale Devices,” IEEE Symp. on Defect and Fault Tolerance, Oct. 2011.
[40] J. Steigerwald, et al., “Chemical mechanical polish: The enabling technology,” IEDM Tech. Digest, Dec. 2008.
[41] G. Sandre, et al., “A 4 Mb LV MOS-Selected Embedded Phase Change Memory in 90 nm Standard CMOS Technology,” IEEE J. Solid-State Circuits, vol. 46, no. 1, pp. 52–63, Jan. 2011.
[42] T. Song, et al., “A 14nm FinFET 128Mb 6T SRAM with VMIN Enhancement Techniques for Low-Power Applications,” ISSCC Dig. Tech. Papers, Feb. 2014.
[43] T. Song, et al., “A 10nm FinFET 128Mb SRAM with Assist Adjustment System for Power, Performance, and Area Optimization,” ISSCC Dig. Tech. Papers, Feb. 2016.
[44] R. Strenz, “Embedded Flash Technologies and their Applications: Status & Outlook,” IEDM Tech. Digest, Dec. 2011.
[45] T. Suzuki, et al., “A Sub-0.5-V Operating Embedded SRAM Featuring a Multi-Bit-Error-Immune Hidden-ECC Scheme,” IEEE J. Solid-State Circuits, vol. 41, no. 1, pp. 152–160, Jan. 2006.
[46] M. Ueki, et al., “Low-Power Embedded ReRAM Technology for IoT Applications,” VLSI Tech. Digest, 2015.
[47] M. Wang, et al., “Superior PBTI Reliability for SOI FinFET Technologies and Its Physical Understanding,” IEEE EDL, vol. 34, no. 7, pp. 837–839, July 2013.
[48] Y. Wang, et al., “A 1.1GHz 12uA/Mb-Leakage SRAM Design in 65nm Ultra-Low-Power CMOS with Integrated Leakage Reduction for Mobile Applications,” ISSCC Dig. Tech. Papers, Feb. 2007.
[49] Y. Wang, et al., “A 4.0 GHz 291 Mb voltage-scalable SRAM design in a 32 nm high-k+ metal-gate CMOS technology with integrated power management,” IEEE J. Solid-State Circuits, vol. 45, pp. 103–110, 2010.
[50] Y. Wang, et al., “Dynamic Behavior of SRAM Data Retention and a Novel Transient Voltage Collapse Technique for 0.6V 32nm LP SRAM,” IEDM Technical Digest, Dec. 2011.
[51] Y. Wang, et al., “Retention Time Optimization for eDRAM in 22nm Tri-Gate CMOS Technology,” IEDM Technical Digest, Dec. 2013.
[52] J. Warnock, et al., “5.5 GHz System z and Multi-Chip Module,” ISSCC Dig. Tech. Papers, Feb. 2013.
[53] Z. Wei, et al., “Distribution Projecting the Reliability for 40 nm ReRAM and beyond based on Stochastic Differential Equation,” IEDM Technical Digest, Dec. 2015.
[54] M. Yamaoka, et al., “Low-power embedded SRAM modules with expanded margins for writing,” IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers, 2005, pp. 480–611.
[55] T. Yabe, et al., “Circuit Techniques to Improve Disturb and Write Margin Degraded by MOSFET Variability in High-Density SRAM Cells,” VLSI Tech. Digest, pp. 106–107, 2011.
[56] H. Yamauchi, “A Discussion on SRAM Circuit Design Trend in Deeper Nanometer-Scale Technologies,” IEEE Trans. VLSI Systems, vol. 18, no. 5, May 2010.
