Lecture 5: Scaled for ICs

MSE 6001, Materials Lectures Fall 2006

5 MOSFETs and scaling

Silicon is a mediocre semiconductor, and several others have better electrical and optical properties. However, the very high quality of the electrical properties of the silicon-silicon dioxide interface allows very good metal-oxide-semiconductor field-effect transistors (MOSFETs) to be fabricated. These devices have several properties, such as operating frequency and power consumption, that improve as their size is scaled down, which also allows more transistors to be packed onto a chip. The smaller sizes are achieved using higher-resolution photolithography, made possible by steadily improving fabrication technologies. The trends of steadily improving performance and greater integration density with time are generically referred to as “Moore’s Law”. Of course, scaling has to end eventually, somewhere before the scale of atoms is reached.

5.1 Basic MOSFET

MOSFETs turn on and off via the non-linear gate capacitor between the gate electrode and the substrate. The MOSFET of Fig. 1 controls the flow of electrons (an NMOSFET) from the n-type source electrode to the n-type drain electrode. A positive gate potential attracts a very thin layer of electrons to the surface of the substrate, at its interface with the oxide, forming a “channel” that allows current to be conducted from the source to drain.

[Figure 1 image: transmission electron micrograph of a MOSFET, 65 nm process (TEM cross sections of 35 nm NMOS and PMOS devices), shown over a page from P. Bai, et al., “65 nm Logic Technology Featuring 35 nm Gate Lengths, Enhanced Channel Strain, 8 Cu Interconnect Layers, Low-k ILD, and 0.57 µm² SRAM Cell”, IEDM Proceedings, 2004. The page also carries the table and trend plot summarized here.]

Table 1: Layer pitch, thickness and aspect ratio

Layer                  Pitch (nm)  Thick (nm)  Aspect Ratio
Isolation              220         320         -
Polysilicon            220         90          -
Contacted gate pitch   220         -           -
Metal 1                210         170         1.6
Metal 2                210         190         1.8
Metal 3                220         200         1.8
Metal 4                280         250         1.8
Metal 5                330         300         1.8
Metal 6                480         430         1.8
Metal 7                720         650         1.8
Metal 8                1080        975         1.8

[Plot: Intel contacted gate pitch and SRAM area trends, 1994 to 2006, for the 250 nm, 180 nm, 130 nm, 90 nm and 65 nm nodes. Contacted gate pitch (µm) shrinks 0.7x every 2 years; SRAM cell area (µm²) shrinks 0.5x every 2 years.]

FIGURE 1: The MOSFET gate controls the conductivity between the source and drain.

The gate length L is the separation between the source and drain. The smallest feature that can be fabricated in an IC technology is approximated by the “half-pitch width”, which characterizes the IC technology and is a measure of L. For example, this year 65 nm technology ICs have been in full production, and 45 nm technology chips will be introduced. The gate length for the 65 nm technology is approximately 35 nm. The thin gate oxide, made from silicon dioxide, has a thickness t_ox, which in the current advanced chips is approximately 2 nm.

[Figure 2 image: schematic illustration of the scaling of Si technology by a factor alpha, reproduced from Frank, et al.]

FIGURE 2: Scaling the size of a MOSFET.
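The Intel trend plot reproduced with Figure 1 quotes a contacted gate pitch that shrinks 0.7x every 2 years and an SRAM cell area that shrinks 0.5x every 2 years. The following is a minimal sketch of that arithmetic, assuming an idealized constant 0.7x linear shrink per generation; the function name `next_node` is only illustrative. It suggests that the two rates are consistent with each other, and that the same factor roughly links the node names from 250 nm down to the 45 nm generation mentioned above.

```python
# Idealized linear shrink per technology generation (assumed constant here).
LINEAR_SHRINK = 0.7


def next_node(node_nm: float, shrink: float = LINEAR_SHRINK) -> float:
    """Return the feature size of the next generation under a constant shrink."""
    return node_nm * shrink


# Area scales as the square of a linear dimension, so 0.7x in pitch
# corresponds to roughly 0.5x in cell area.
area_shrink = LINEAR_SHRINK ** 2

# Node names quoted in the lecture and in the Intel trend plot.
nodes_nm = [250, 180, 130, 90, 65, 45]

if __name__ == "__main__":
    print(f"area shrink per generation: {area_shrink:.2f}")
    for current, following in zip(nodes_nm, nodes_nm[1:]):
        predicted = next_node(current)
        print(f"{current} nm -> predicted {predicted:.1f} nm, named {following} nm")
```

The predicted sizes do not match the node names exactly, since the names are rounded labels; the sketch only illustrates the approximate geometric progression.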
5.2 Scaling

Figure 2 (Frank, et al., “Device scaling limits of Si MOSFETs and their application dependencies”, Proc. IEEE, 2001) depicts the effects of scaling, and Fig. 3 tabulates the resulting changes in transistor and circuit properties.

Scaling will be limited by a number of issues. For a very thin gate oxide, less than about 1 nm, electrons can quantum mechanically tunnel directly from the gate electrode to the conducting channel, giving large leakage currents. For very short channels, with L less than about 5 nm, electrons can tunnel directly from the source to drain. Some of the smallest transistors made to date have L ≈ 6 nm. Doping becomes a problem, because the random distribution of dopant atoms means that different MOSFETs have different numbers of dopants and thus different electrical properties. At the highest doping levels, which can approach 10^19 cm^-3, the dopant atom spacing is only about 3 nm. The gate metal is also expected to change. Since approximately 1980, heavily-doped polycrystalline silicon (poly) has been the preferred gate “metal”. However, at oxide thicknesses below about 1.5 nm, the semiconducting properties of the poly become important, manifested as a ∼1 nm scale depletion layer, and actual metals will need to be used.

5.3 25 nm MOSFET technologies

Within a few years, the semiconductor industry expects to produce 25 nm MOSFETs; by comparison, the current Intel 65 nm technology has 35 nm physical gate lengths. A generic device that has been studied is shown schematically in Fig. 4 (Frank, et al.).
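For a device of this size, the dopant statistics mentioned in Section 5.2 become tangible. The sketch below is a rough estimate only. It assumes a uniform doping of 10^19 cm^-3 and a hypothetical 25 nm x 25 nm x 10 nm doped region; the box dimensions and function names are illustrative and are not taken from the lecture. It compares two common measures of dopant spacing: the cube-root estimate gives about 4.6 nm, while the radius of the sphere containing one dopant on average gives about 2.9 nm, which is closer to the "about 3 nm" quoted in the text.

```python
import math


def mean_spacing_nm(density_cm3: float) -> float:
    """Cube-root estimate of dopant spacing, n**(-1/3), converted to nm."""
    return density_cm3 ** (-1.0 / 3.0) * 1e7  # 1 cm = 1e7 nm


def sphere_radius_nm(density_cm3: float) -> float:
    """Radius of the sphere that holds one dopant on average, in nm."""
    return (3.0 / (4.0 * math.pi * density_cm3)) ** (1.0 / 3.0) * 1e7


def dopant_count(density_cm3: float, length_nm: float,
                 width_nm: float, depth_nm: float) -> tuple[float, float]:
    """Mean dopant count in a box and its Poisson standard deviation."""
    volume_cm3 = length_nm * width_nm * depth_nm * 1e-21  # 1 nm^3 = 1e-21 cm^3
    mean = density_cm3 * volume_cm3
    return mean, math.sqrt(mean)


if __name__ == "__main__":
    n = 1e19  # cm^-3, the upper doping level quoted in Section 5.2
    print(f"cube-root spacing: {mean_spacing_nm(n):.2f} nm")
    print(f"one-dopant sphere radius: {sphere_radius_nm(n):.2f} nm")
    # Hypothetical box: 25 nm x 25 nm x 10 nm (illustrative dimensions).
    mean, sigma = dopant_count(n, 25.0, 25.0, 10.0)
    print(f"dopants in box: {mean:.1f} +/- {sigma:.1f}")
```

Under these assumptions the box holds only a few tens of dopants, so device-to-device fluctuations of roughly ten percent in dopant number would be expected from counting statistics alone.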

[Figure 3 image: Table 1, “Technology Scaling Rules for Three Cases”, on a page reproduced from Frank, et al., Proceedings of the IEEE, vol. 89, no. 3, March 2001, p. 260.]

FIGURE 3: Scaling properties of silicon MOSFETs.

This 25 nm IC technology should enable many new applications, as tabulated in Fig. 5, including, for example, the ability to translate languages in real time, which would require only 0.2 cm^2 of chip area and only 10 mW of power. Human intelligence-scale computation power would require some tens of square meters, according to Frank, et al., Device Research Conference, 1999.

5.4 New materials needed for scaling

Since the early 1980s, the materials used for integrated MOSFETs on silicon substrates have not changed greatly. The gate “metal” is made from highly-doped polycrystalline silicon. The gate oxide is silicon dioxide. For the smallest devices, these materials will need to be replaced.

5.4.1 New gate oxides

The capacitance per area of the gate oxide is

\[ C_{ox} = \frac{\epsilon_{ox}}{t_{ox}} = \frac{K \epsilon_o}{t_{ox}}, \qquad (1) \]

where ε_ox is the permittivity of the oxide, ε_o the permittivity of the vacuum, and K the dielectric constant. Scaled MOSFETs require larger C_ox, which has been achieved with smaller t_ox. Increasing K can also increase C_ox, and other oxides, “high K dielectrics”, are being developed, including, for example, mixtures of HfO2 and Al2O3.

[Figure 4 image: page reproduced from Frank, et al. (2001), showing source, drain, and superhalo doping contours in a 25-nm nMOSFET design (Fig. 15), subthreshold currents for channel lengths from 30 to 15 nm (Fig. 16), and short-channel threshold rolloff for superhalo and retrograde (nonhalo) doping profiles (Fig. 17).]

FIGURE 4: 25 nm gate length MOSFET (Frank, et al., “Device scaling limits of Si MOSFETs and their application dependencies”, Proc. IEEE, 2001).

[Figure 5 image: Table 1, “Selected possible applications of 25 nm CMOS technology and their estimated requirements”. Power estimates are for general purpose processors (GPP) and special purpose DSP-like processors. Applications listed: speech recognition (to text); real time language translation; video encoding (QCIF, 174 x 144, 10 fps; CCIR 601, 720 x 480, 30 fps; very high res., 1920 x 1200, 30 fps); 2-way video wrist watch; PDA; tablet; factoring 512 bit numbers; Deep Blue chess; QM-based device simulation; petaFLOPS computing challenges.]

FIGURE 5: Future applications enabled by 25 nm gate length MOSFET (Frank, et al., Device Research Conference, 1999).

5.4.2 New gate metal

The doped polycrystalline silicon used for gates has a very thin depletion layer, approximately 1 nm thick, which causes scaling problems for small devices. Other metals are being investigated for replacing the silicon gates, including tungsten and molybdenum.
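To see why a 1 nm depletion layer matters more as the oxide thins, the gate stack can be treated as two capacitors in series. The sketch below is a simplified estimate: it assumes the depletion layer behaves as a fixed 1 nm slab of silicon with dielectric constant 11.7 in series with a silicon dioxide layer (K = 3.9), and it ignores the bias dependence of the depletion width. The function name is hypothetical.

```python
K_SIO2 = 3.9   # dielectric constant of silicon dioxide
K_SI = 11.7    # dielectric constant of silicon


def capacitance_ratio(t_ox_nm: float, t_dep_nm: float = 1.0) -> float:
    """Ratio of the gate capacitance with a poly depletion layer to the
    ideal oxide capacitance, treating the two layers as series capacitors."""
    # Express the silicon depletion layer as an equivalent SiO2 thickness.
    t_dep_equiv_nm = t_dep_nm * K_SIO2 / K_SI
    return t_ox_nm / (t_ox_nm + t_dep_equiv_nm)


if __name__ == "__main__":
    for t_ox in (3.0, 2.0, 1.5, 1.0):
        ratio = capacitance_ratio(t_ox)
        print(f"t_ox = {t_ox:.1f} nm: capacitance is {ratio:.1%} of ideal")
```

In this simple picture the same depletion layer removes a growing fraction of the gate capacitance as t_ox shrinks, which is the scaling problem that motivates replacing polysilicon with true metal gates.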

[Figure 6 plot: feature size [nm] (axis ticks at 10, 20, 50, 100) versus year (2000 to 2020), with curves for polysilicon half-pitch, printed gate L, and physical gate L.]

FIGURE 6: The ITRS roadmap for the gate length and polysilicon half-pitch in DRAMs. (ITRS 2004 update: http://public.itrs.net/)

5.4.3 Removing the substrate: Silicon on insulator

For high-frequency circuits (about 5 GHz and above), capacitive coupling to the silicon substrate limits the switching frequency. Also, leakage into the substrate from the small devices can cause extra power dissipation. These problems are being avoided by making circuits on insulating substrates (either sapphire or silicon dioxide) that have a thin, approximately 100 nm, layer of crystalline silicon, in which the MOSFETs are fabricated.

5.5 The Roadmap

The semiconductor industry collaborates on predicting and determining the future changes to IC technology, contained in the International Technology Roadmap for Semiconductors (ITRS). Figure 6 gives an example from the 2004 update for dynamic random access memory (DRAM) MOSFETs. The gate length is projected to drop below 10 nm in about 10 years.
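A back-of-the-envelope check of that projection can be sketched as follows. It assumes, purely for illustration, that the physical gate length starts at the 35 nm quoted for the 65 nm technology and shrinks by the same 0.7x every 2 years seen in the Intel pitch trend; the ITRS projection itself is based on more detailed considerations, and the function name is hypothetical.

```python
def years_to_reach(start_nm: float, target_nm: float,
                   shrink: float = 0.7, years_per_step: int = 2):
    """Count the years needed for a feature size to fall below target_nm,
    assuming a constant shrink factor every years_per_step years."""
    if not 0.0 < shrink < 1.0:
        raise ValueError("shrink must be between 0 and 1")
    size_nm = start_nm
    years = 0
    while size_nm >= target_nm:
        size_nm *= shrink
        years += years_per_step
    return years, size_nm


if __name__ == "__main__":
    years, final_nm = years_to_reach(35.0, 10.0)
    print(f"below 10 nm after about {years} years (gate length {final_nm:.1f} nm)")
```

With these assumed numbers the gate length falls below 10 nm after four generations, which is of the same order as the roughly 10 years read from the roadmap.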
