NANOCHIP Technology Journal

INTEGRATING ATOMIC LAYER DEPOSITION HIGH-κ DIELECTRICS

IN THIS ISSUE • Advanced Transistors—Scaling with New Materials and New Architecture • Spike Anneals for 32nm and Beyond • Nano-Porous Dielectrics for 28nm Applications

Volume 9, Issue 2, 2011

A MESSAGE FROM KLAUS SCHUEGRAF
Corporate Vice President and CTO, Silicon Systems Group

This is an exciting time in our industry, spurring the technical advances highlighted in this issue. With planar transistor scaling facing a growing number of limitations, we seek a new degree of freedom by going three-dimensional, while continuing to pursue solutions that extend planar scaling to its ultimate extent. The 2x nanometer era requires bold changes in device architectures, introducing a significant increase in complexity throughout the manufacturing flow.

At the 2x nanometer node, logic devices require high-k/metal gate architecture, introducing additional process steps and much more stringent process requirements. Equivalent oxide thickness scaling and gate leakage challenges drive the need for atomic layer deposition (ALD) of high-k dielectric gate stacks. Turning attention to source/drain regions, we discuss solutions to thermal pattern loading effects in rapid thermal processing (RTP), which is particularly critical in the formation of shallower junctions without compromise to activation.

As DRAM devices scale below the 3x nanometer node, a new deep plasma process is required to enable higher-dose nitridation without increasing leakage current or the magnitude of transistor threshold voltage. Tungsten CMP applications in memory now require real-time process control for greater precision and flatness uniformity, and benefit from dual-wafer polishing with its significantly lower cost of consumables.

Scaling is also driving breakthroughs in the interconnect. We discuss a third-generation low-k dielectric with uniform porosity, which boosts mechanical strength over previous generations and lowers the dielectric constant to 2.2. We also discuss a new tuning parameter for achieving etch depth uniformity and CD control needed for uniform resistance of copper interconnect lines.

I expect these articles will interest you, and I am encouraged by our ongoing engagements as we all innovate to enhance the technical expertise, productivity, and efficiency of our industry.

Front Cover: To extend Moore’s law to the 2x nm node and beyond, logic devices require high-k/metal gates; the high-k dielectric is actually a stack of several layers, with atomic layer deposition (ALD) used to deposit the ultra-thin high-k layer. Integrating the ALD process on a single platform with low-temperature radical oxidation, nitridation, and post-nitridation anneal chambers enables the high-k dielectric to be optimized, minimizing queue-time between each step and keeping the wafer in a controlled environment for the entire sequence. This avoids contamination and prevents degradation of the interfaces in this critical core of the transistor.

Table of Contents

3 Advanced Transistors — Scaling with New Materials and New Architecture

10 Optimizing Spike Anneals for 32nm and Beyond

15 Integrating Atomic Layer Deposition High-κ Dielectrics at the ≤22/20nm Logic Technology Node

20 Extending Oxynitride Gate Technology for Advanced DRAM

25 Improving Tungsten Chemical Mechanical Planarization for Next-Generation Applications

29 A High Productivity ALD-Like Conformal Oxide Liner for ≤20nm Technology Nodes

35 Optimizing Nano-Porous Dielectrics for ≤28nm Applications

43 Enhancing Dielectric Etch Uniformity for 28nm Copper Dual Damascene

ADVANCED TRANSISTORS
Scaling with New Materials and New Architecture

KEYWORDS
Transistor, 3D, FinFET, QWFET, HKMG, Isolation, Channel, Gate Stack, Source/Drain, Silicide, Epitaxial Growth, RTP Anneal, Shallow Junction, Strain Engineering

Classic transistor scaling has given way to modern scaling and related performance enhancement based on new materials in strain engineering and high-κ metal gate schemes. Next-generation devices will incorporate an even wider range of new materials and three-dimensional transistor architectures to sustain Moore’s Law; these trends are posing challenges that are driving development of new process capabilities in transistor fabrication.

Over the past 40 years, transistors have undergone steady miniaturization in accordance with Moore’s Law. Until the 130nm node, referred to as the classic or Dennard era, scaling followed a set of simple rules to shrink gate length, gate dielectric thickness, and junction depth by a factor 1/k (k~1.4).[1] The 90nm node transistor to the present, referred to as the modern MOSFET (metal-oxide-semiconductor field-effect transistor) era, has seen the non-classical adoption of new materials to sustain scaling. For the 22nm node and beyond, we expect an even greater adoption of new materials and architectures.

The classic era efficiently provided for the needs of that time in higher circuit speeds and higher densities. But the modern era imposes multiple demands for lower active and passive power, and higher speed and packing density. Many innovations in new materials have been responses to these multiple demands. Strain engineering through epitaxial source-drain structures increased carrier mobility in the channel and enabled higher speeds.[2] At the 45nm node, high-κ metal gates substantially reduced active power by cutting leakage from gate to channel.[3] The 3D FinFET (fin field effect transistor) transistor at the 22nm node is a new architecture that substantially reduces passive power or active power while enabling advanced transistor scaling.[4] New substrates, such as FDSOI (fully depleted silicon-on-insulator), are designed for lower operating power and higher performance benefits.[5] New channel materials incorporated in a QWFET architecture (quantum well field effect transistor) can offer a step function improvement in transistor performance with substantially higher carrier mobilities and lower operating voltages.[6] Structural building blocks of a MOSFET transistor are transforming radically (Figure 1). This review examines the evolution of each to meet today’s and tomorrow’s needs.

Figure 1. (a) Classic (130nm) MOSFET and (b) modern (32nm) MOSFET showing application of newer materials for strain engineering and high-κ metal gate. [Labels in (a): poly, spacer, SiON, source, drain. Labels in (b): metal, high-κ, WF metal, spacer, SiON, liner, epi source, epi drain.]

Figure 2. (a) STI structure post-etch and (b) similar structure post-STI fill, CMP, nitride strip, and oxide recess for fin formation.[7] [Labels: SiN, Si, oxide.]

CHANNEL
Much recent research has been directed at new channel materials: SiGe,[8] Ge,[9] III-V compound semiconductors,[10] carbon nanotubes,[11] and graphene.[12] Among them, III-V and Ge materials are front runners given their greater maturity through adoption in optoelectronic and communication devices and in logic devices today. Other candidates face a “bottom up synthesis” challenge requiring a different alignment approach than established etch and patterning.[13]

ISOLATION
While the active silicon pitch and shallow trench isolation (STI) widths decrease proportionally with scaling, STI depth is decreasing only incrementally. According to the 2009 ITRS, STI top width will decrease from 59nm in 2009 to 28nm in 2015, but the trench depth will decrease only from 353nm to 309nm. As a result, the STI aspect ratio will grow from 6:1 to 11:1, increasing the challenge for void-free STI fill. A series of chemical vapor deposition (CVD) oxide processes has been developed to meet fill challenges: from high-density plasma (HDP) to sub-atmospheric CVD (SACVD), to high aspect ratio process (HARP) CVD, to today’s eHARP, and flowable CVD solutions. HDP has good film quality and low wet etch rate ratio (WERR) as deposited, while other films require optimized post-deposition anneal to improve film quality and reduce WERR. HDP is capable of void-free STI fill up to an aspect ratio of approximately 5:1. HARP and eHARP are currently standard films for STI fill, while flowable CVD with its outstanding bottom-up fill capability will be used for the n+1 node and beyond.

For 3D FinFET transistors, a critical process step is recessing the STI oxide to form the “fin” (Figure 2). Standard wet etching, dry plasma etching, or plasma-free dry oxide removal processes can be employed. The latter iterates between growth and sublimation of ammonium fluorosilicate with each cycle, consuming a well-controlled amount of oxide (e.g., 20nm). Recent research on forming a FinFET structure using this recess etch process after STI showed good electrical results for Ion vs. Ioff performance relative to planar transistors.[7] The other advantage of this process is that the oxide removal rate is less dependent on oxide density, reducing the incidence of foot and void formations that occur in conventional wet etch or dry etch processes.[7]

CHANNEL (continued)
III-V materials, such as InSb or InAs, can theoretically provide 50-100 times the electron mobility of silicon and Ge provides higher hole mobility than silicon, making them attractive candidates for NMOS and PMOS, respectively.[14] A possible architectural construct for implementing these new channel materials is a QWFET transistor derived from a HEMT (high electron mobility transistor) device.[14] Here the active channel layer is sandwiched between two other material layers and carriers are confined to the active layer (hence the name quantum well). Coulombic scattering, which adversely affects carrier mobility, is virtually eliminated, giving rise to the possibility of exceptionally high carrier mobilities.

Multiple processing challenges exist for heterogeneous integration of III-V materials onto a silicon substrate. The epitaxial growth of a III-V film from a starting silicon lattice can be very challenging owing to a large lattice constant mismatch between the two materials that can introduce crystallographic defects, such as dislocations and anti-phase domains.[15] A composite buffer layer approach is needed, wherein intermediate layers bridge the wide gap in lattice constants via smaller intermediate steps, thereby relaxing strain over the stack so that a final defect-free channel layer can be grown (Figure 3).[16] Coefficient of thermal expansion mismatch issues in subsequent heating and cooling steps also must be addressed. III-V materials form a direct Schottky contact with metal gate in a QWFET and a suitable gate dielectric is needed to reduce the high parasitic gate leakage.[17]

Integration of Ge onto silicon also presents processing challenges. Epitaxial Ge growth on silicon is relatively straightforward compared to III-V growth on silicon, but surface preparation and interface control for defect density prior to high-κ dielectric deposition is critical. Nitridation of the Ge surface[18,19] or annealing in a SiH4+N2 chemistry[20] shows improvement in surface

passivation, but much more development is needed to obtain the best quality surface. A stable gate dielectric with low defect densities is difficult to obtain as opposed to the prior high quality Si/SiO2 interface. In general, the Ge/GeO2 system is less stable chemically and electrically than a SiO2/Si system. Ge oxynitride has better stability than native germanium oxides[21] and a high quality thin oxynitride has been formed on germanium by NH3 nitridation of a thermally grown germanium oxide.[22]

In the future, III-V and Ge channel-based transistors will be incorporated in 3D transistor architectures, whose feasibility with silicon has been demonstrated in recent path-finding studies.[23]

Figure 3. A composite buffer layer approach is necessary for growing defect-free InGaAs quantum well structures on silicon.[16] [Layer labels: In0.53Ga0.47As cap layer, InP etch stop, In0.52Al0.48As top barrier, In0.7Ga0.3As QW, In0.52Al0.48As bottom barrier, InxAl1-xAs, GaAs, silicon substrate; thickness labels 0.8µm, 1.3µm, 0.5µm.]

GATE STACK
The introduction of the high-κ (dielectric constant) metal gate (HKMG) has been a key inflection in continued scaling.[24] As seen in Figure 4, HKMG restores the stalled electrical thickness scaling while simultaneously reducing the otherwise rising leakage.[24]

Figure 4. HKMG has been a major enabler of sustained electrical thickness scaling.[24] [Plot: TINV (nm) and relative gate leakage versus technology node (nm), from 350 down to 11.]

HKMG is composed of two material systems: a high-κ dielectric stack and a metal gate electrode. The high-κ dielectric is physically thick (impeding leakage) but behaves capacitively thin (promoting greater electrostatics). For example, a 3nm thick HfO2 layer has the capacitance of a <1nm thin SiO2 film. High-κ films require an underlying SiON interface layer (IL) to achieve low interface trap density. This reduces the average κ value that can be achieved. There must be good dielectric film stack reliability measured in metrics such as biased temperature instability, time-dependent dielectric breakdown, and voltage extended life test. Also, HfO2 dielectrics can react with metal gate materials and silicon at high temperatures. Mitigation of this reactivity can require cap metal protection and minimal exposure to high-temperature steps.

Interface engineering is paramount for good carrier mobility and precise control is required in formation of the high-κ/SiON (IL)/Si interface (Figure 5). The interface needs to be of uniform thickness and free of unwanted native surface oxidation or carbon contamination. Interface formation involves precisely controlled deposition, anneal, and nitridation steps. A clustered platform integrating all these steps and processing a wafer through the entire sequence under continuous vacuum can offer the needed interface control. Future options to scale electrical thickness include increasing the κ value of the high-κ dielectric, reducing the IL thickness, or increasing the IL κ value. Reliability performance will play a strong role in determining the eventual course.

A metal gate electrode must be paired with a high-κ dielectric to address two known issues of Vt (threshold voltage) pinning and depletion layer formation (4-6Å increase in electrical thickness) seen with conventional poly gate. Dual work functions for independent and low Vt control for NMOS and PMOS transistors are needed for best performance. Clustered tool metal gate processing offers optimal conditions for depositing multiple metal layers (Figure 6) to ensure compositional and contamination control. Other gate requirements include low-damage downstream steps, non-reactivity to high-κ, good adhesion to interface films, and low resistivity at shrinking device nodes. The smaller critical dimension (CD) causes more wall scattering and reduces the volume of conducting metal relative to barrier metal inside the gate trench, resulting in worsening sheet resistivity.[25] Metal fill also becomes more challenging for small gaps. In response, new materials and atomic layer depositions (ALD) or CVD are likely to be required.

The industry’s convergence on the gate-last HKMG[26] scheme makes chemical mechanical planarization (CMP) processes more critical in gate feature size and height control. Poly open and metal gate CMP processes face stringent requirements for precise thickness and uniformity control within die, within wafer, and from wafer to wafer to minimize variability and ensure highest transistor performance.[27] Furthermore, HKMG stacks will be integrated onto 3D FinFET architecture with high-κ and metal gate surrounding the 3D channel. Deposition processes that offer outstanding conformality along all surfaces of the fin (e.g., ALD of high-κ) are likely to be required.
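
The gate stack discussion states that a 3nm HfO2 layer has the capacitance of a sub-1nm SiO2 film. That claim can be checked with the standard equivalent oxide thickness (EOT) relation, EOT = t × (κ_SiO2 / κ_film), with layers in series adding their EOTs. The sketch below is illustrative only: the dielectric constants assumed for HfO2 (20) and for the SiON interface layer (5) are typical values chosen for this example and do not come from the article, and the function names are hypothetical. The 2.5nm/0.5nm stack uses the approximate thicknesses labeled in Figure 5.

```python
# Equivalent oxide thickness (EOT) sketch for a high-k gate stack.
# ASSUMPTIONS (not from the article): HfO2 kappa = 20, SiON kappa = 5.
# Function names are hypothetical and exist only for this illustration.

K_SIO2 = 3.9          # relative permittivity of SiO2 (standard reference value)
HFO2_KAPPA = 20.0     # assumed typical value for HfO2
SION_KAPPA = 5.0      # assumed value for the SiON interface layer


def eot_nm(thickness_nm: float, kappa: float) -> float:
    """SiO2 thickness (nm) with the same capacitance per area as the film."""
    return thickness_nm * K_SIO2 / kappa


def stack_eot_nm(layers) -> float:
    """EOT of dielectric layers in series; layers is a list of (nm, kappa)."""
    return sum(eot_nm(t, k) for t, k in layers)


# Single 3nm HfO2 layer, as in the example given in the text.
single_layer = eot_nm(3.0, HFO2_KAPPA)

# Stack using the approximate Figure 5 thicknesses: ~2.5nm high-k on ~0.5nm SiON.
stack = stack_eot_nm([(2.5, HFO2_KAPPA), (0.5, SION_KAPPA)])

print(f"3nm HfO2 alone:        EOT = {single_layer:.3f} nm")
print(f"2.5nm HfO2 + 0.5nm IL: EOT = {stack:.3f} nm")
```

With these assumed values the single layer comes out near 0.59nm and the two-layer stack near 0.88nm. The thin, lower-κ interface layer accounts for a little under half of the stack total, which is consistent with the text listing a thinner IL or a higher IL κ among the options for further electrical thickness scaling.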

Figure 5. Precisely engineered high-κ/IL/Si interface. [Labels: metal electrode (metal gate), cap metal, high-κ (~2.5nm), SiON interface layer (~0.5nm), silicon substrate. Source: Applied Materials Maydan Technology Center.]

Figure 6. A metal gate electrode is composed of multiple metallic layers. [Labels: metal gate, TiAl, Ti, TiN, TaN, TiN, cap metal, high-κ, SiON, silicon substrate.]

SOURCE/DRAIN
In planar transistors, source/drain (S/D) parasitic resistance control and ultra-shallow junction formation are key concerns. In-situ dopant incorporation during epitaxial growth and subsequent dopant activation to the fullest extent (even beyond solid solubility limits) are critical in lowering the resistance of the S/D region (RSD) and its extension region (RSDE). Also, rapid thermal anneal processes (RTP) must maintain an ultra-shallow junction depth within a lower thermal budget, which necessitates lower anneal temperatures and shorter residence times. With the advent of 3D FinFET transistors, S/D regions may be grown by selective epitaxy on top of the silicon fins; one such dual-raised S/D approach has been recently demonstrated.[28] Uniform doping across the three-dimensional surfaces of the fins becomes critical for device performance and conformal doping technologies will become enabling solutions. In FDSOI technology, the already thin extension regions can undergo amorphization damage during ion implantation, which degrades RSDE, and a dopant loss path to buried oxide can worsen RSD.[29] Recent studies employing a diffusion-based approach to drive dopants from a faceted p+ epi S/D into the extension regions show a much lower parasitic resistance for PMOSFETs.[30,31]

CONTACT AND SHEET
Sheet resistivity and interface resistance are the main concerns in contact scaling. Sheet resistivity increases geometrically at smaller CDs owing to a shrinking contact area and improved channel conductance.[32] Using trench contacts (large rectilinear openings) instead of plugs (small circular openings) relieves this problem by accommodating more metal and shunting the entire S/D area. Interface resistance from silicide intrinsic resistance and silicide-silicon Schottky barrier height can be reduced through new materials engineering. Nickel silicide and NiPtSi, today’s material choices, have the lowest bulk resistivity among common silicides. Continued junction scaling, however, calls for a thinner silicide layer. It is critical to ensure uniform conversion to the low resistance monosilicide phase, and avoid the resistive NiSi2 phase, which causes spikes or pipes that lead to junction leakage.[33,34] This can be achieved by using an optimized PVD process to ensure uniform metal deposition with density and topology, a dry chemistry pre-clean process and advanced anneal steps to obtain the best possible uniformity at the lowest thermal budget. Two key advances are backside rapid thermal annealing to improve uniformity in the conversion step and laser millisecond anneal, which reduces nickel diffusion. Laser anneal has also proved useful for reducing leakage variability.[35] Barrier height lowering is attracting active research, and dual silicides that optimize NMOS and PMOS separately are being investigated.[36] In the future, one can expect ALD or CVD of thinner barrier materials, advanced laser anneal technologies, and newer metal choices (e.g., low resistivity tungsten or copper contacts).[37] Integration with 3D FinFET transistors will likely build upon the rectilinear trench contacts; here, a design in which multiple fins share a common contact is promising.[38]

STRAIN ENGINEERING
Strain engineering played a key role in perpetuating Moore’s Law after the 90nm node by reviving the degraded carrier mobility after years of classic scaling. The introduction of strain into the channel reduces carrier scattering, increases mobility, and produces higher drive current performance. Two commonly employed strain approaches are dual stress liners and raised or recessed S/D structures.

The nature of desired strain is dependent on transistor type. Tensile strain increases electron mobility suiting NMOS while compressive strain increases hole mobility suiting PMOS transistors. Dual stress liners employ stress-tunable silicon nitride films[39] with a tensile nitride film deposited over the NMOS gate stack and a compressive nitride film deposited over the PMOS gate stack, respectively. In the recessed S/D approach, active silicon regions are etched and replaced with epitaxially-grown structures that impart strain to the channel. Si-Ge structures have been implemented for PMOS transistor S/D regions to transfer compressive strain to the channel and increase hole mobility. Similarly, Si-C structures are considered potential candidates for NMOS S/D regions; research efforts have shown more than 50% higher electron mobility.[40] Scaling for greater strain to achieve higher drive currents in future nodes is possible and can be achieved through higher dopant concentrations, geometrical depth, and channel proximity of S/D regions. Integrating strain engineering into FinFET transistors will be an inevitable extension of this technology. However, achieving a high-stress state in a free-standing fin surface will be a challenge calling for novel solutions.[41]

CONCLUSION
Transistor scaling into the next decade will be enabled by new materials and architectures. The initial wave of new materials adoption in strain engineering and HKMG will proliferate throughout the transistor structure. New architectural constructs, such as 3D FinFET, QWFET, and FDSOI will increase process complexity and require innovative and holistic solutions.

REFERENCES
[1] R. Dennard, et al., “Design of Ion-Implanted MOSFET’s with Very Small Physical Dimensions,” IEEE Journal of Solid State Circuits, Vol. SC-9, No. 5, pp. 256-268, October 1974.
[2] S. Thompson, et al., “A 90nm Logic Technology Featuring 50nm Strained Silicon Channel Transistors, 7 layers of Cu Interconnects, Low-κ ILD, and 1μm2 SRAM Cell,” IEEE International Electron Devices Meeting Technical Digest, pp. 61-64, 2002.
[3] K. Mistry, et al., “A 45nm Logic Technology with High-κ + Metal-Gate Transistors, Strained Silicon, 9 Cu Interconnect Layers, 193nm Dry Patterning, and 100% Pb-Free Packaging,” IEDM Tech. Dig., pp. 247-250, December 2007.

[4] “Intel Reinvents Transistors Using New 3-D Structure,” retrieved 5/4/2011. http://newsroom.intel.com/community/intel_newsroom/blog/2011/05/04/intel-reinvents-transistors-using-new-3-d-structure.
[5] S. A. Vitale, et al., “FDSOI Process Technology for Sub-Threshold-Operation Ultra-Low-Power Electronics,” Proceedings of the IEEE, Vol. 98, No. 2, pp. 333-342, 2010.
[6] R. Chau, et al., “Integrated Nanoelectronics for the Future,” Nature Materials, Vol. 6, pp. 810-812, November 2007.
[7] A. Redolfi, et al., “Bulk FinFET Fabrication with New Approaches for Oxide Topography Control Using Dry Removal Techniques,” 12th International Conference on Ultimate Integration on Silicon (ULIS), Tyndall Institute, Cork-Ireland, March 2011.
[8] M.L. Lee and E.A. Fitzgerald, “Hole Mobility Enhancements in Nanometer-Scale Strained-Silicon Heterostructures Grown on Ge-Rich Relaxed Si1−xGex,” J. Appl. Phys., 94, pp. 2590–2596, 2003.
[9] Y. Kamata, “High-κ/Ge MOSFETs for Future Nanoelectronics,” Materials Today, Vol. 11, No. 1-2, pp. 30-38, 2008.
[10] R. Chau, “III-V on Silicon for Future High Speed and Ultra-Low-Power Digital Applications: Challenges and Opportunities,” CS Mantech Conference, Digest of Papers, pp. 15-18, 2008.
[11] S.J. Wind, et al., “Vertical Scaling of Carbon Nanotube Field-Effect Transistors Using Top Gate Electrodes,” Appl. Phys. Lett., Vol. 80, pp. 3817-3819, 2002.
[12] W. A. de Heer, et al., “Pionics: The Emerging Science and Technology of Graphene-Based Nanoelectronics,” IEEE International Electron Devices Meeting Technical Digest, pp. 199-202, 2007.
[13] R. Chau, et al., “Opportunities and Challenges of III-V Nanoelectronics for Future High-Speed, Low-Power Logic Applications,” Technical Digest, IEEE Compound Semiconductor Integrated Circuit Symposium, Palm Springs, CA., pp. 17-20, November 2005.
[14] S.M. Sze, High Speed Semiconductor Devices, Wiley, New York, 1990.
[15] S.F. Fang, et al., “Gallium Arsenide and Other Compound Semiconductors on Silicon,” J. Appl. Phys., Vol. 68, pp. R31-R58, 1990.
[16] M.K. Hudait, et al., “Heterogeneous Integration of Enhancement Mode In0.7Ga0.3As Quantum Well Transistor on Silicon Substrate Using Thin (≤2µm) Composite Buffer Architecture for High-Speed and Low-Voltage (0.5V) Logic Applications,” IEEE International Electron Devices Meeting Technical Digest, pp. 625-628, 2007.
[17] S. Datta, et al., “85nm Gate Length Enhancement and Depletion Mode InSb Quantum Well Transistors for Ultra-High Speed and Very Low Power Digital Logic Applications,” IEEE International Electron Devices Meeting Technical Digest, pp. 783-786, 2005.
[18] J.J.H. Chen, et al., “Ultra-Thin Al2O3 and HfO2 Gate Dielectrics on Surface-Nitrided Ge,” IEEE Trans. Electron Device, Vol. 51, pp. 1441-1447, 2004.
[19] E.P. Gusev, et al., “Microstructure and Thermal Stability of HfO2 Gate Dielectric Deposited on Ge (100),” Appl. Phys. Lett., 85, p. 2334, 2004.
[20] N. Wu, et al., “Alternative Surface Passivation on Germanium for Metal-Oxide-Semiconductor Applications with High-κ Gate Dielectric,” Appl. Phys. Lett., 85, pp. 4127-4129, 2004.
[21] D.J. Hymes and J.J. Rosenberg, “Growth and Materials Characterization of Native Germanium Oxynitride Thin Films on Germanium,” J. Electrochem. Soc., Vol. 135, pp. 961-965, 1988.
[22] C.O. Chui, et al., “Nanoscale Germanium MOS Dielectrics-Part I: Germanium Oxynitrides,” IEEE Trans. Electron Devices, Vol. 53, No. 7, pp. 1501-1508, 2000.
[23] M. Radosavljevic, et al., “Non-Planar, Multi-Gate InGaAs Quantum Well Field Effect Transistors with High-κ Gate Dielectric and Ultra-Scaled Gate-to-Drain/Gate-to-Source Separation for Low Power Logic Applications,” IEEE International Electron Devices Meeting Technical Digest, pp. 6.1.1-6.1.4, 2010.
[24] C. Auth, et al., “45nm High-κ+ Metal Gate Strain-Enhanced Transistors,” Symp. VLSI Technology, pp. 128-129, June 2008 and Applied Materials CTO group forecast.

[25] C. Auth, et al., “45nm High-κ+ Metal Gate Strain-Enhanced Transistors,” Symp. VLSI Technology, pp. 128-129, June 2008.
[26] M. Lapedus, “Update: IBM ‘Fab Club’ Switches High-κ Camps,” EE Times News & Analysis, 18 Jan. 2011, http://www.eetimes.com/electronics-news/4212271/IBM--fab-club--switches-high-k-camps.
[27] J. Steigerwald, “Chemical Mechanical Polish: The Enabling Technology,” IEEE International Electron Devices Meeting Technical Digest, pp. 37-40, December 2008.
[28] J. Kavalieros, et al., “Tri-Gate Transistor Architecture with High-κ Gate Dielectrics, Metal Gates, and Strain Engineering,” VLSI Technology Digest of Technical Papers, pp. 62-63, June 2006.
[29] A. Majumdar, et al., “High-Performance Undoped-Body 8nm-Thin SOI Field-Effect Transistors,” Electron Device Letters, IEEE, Vol. 29, No. 5, pp. 515-517, May 2008.
[30] K. Cheng, et al., “Extremely Thin SOI (ETSOI) CMOS with Record Low Variability for Low Power System-on-Chip Applications,” VLSI Tech. Dig., p. 49, 2009.
[31] K. Cheng, et al., “Extremely Thin SOI (ETSOI) CMOS with Record Low Variability for Low Power System-on-Chip Applications,” Electron Devices Meeting, 2009 IEEE International, pp. 1-4, Issue 7-9, Dec. 2009.
[32] M. Tada, et al., “Performance Modeling of Low-κ/Cu Interconnects for 32nm-Node and Beyond,” Electron Devices, IEEE Transactions, Vol. 56, No. 9, pp. 1852-1861, 2009.
[33] Lauwers, et al., “Ni-Based Silicides for 45nm CMOS and Beyond,” J. Materials Science and Engineering, B 114–115, pp. 29–41, 2004.
[34] C. Detavernier, et al., “Kinetic of Agglomeration of NiSi and NiSi2 Phase Formation,” Mat. Res. Soc. Symp. Proc., Vol. 745, 2003.
[35] Y. Chen, et al., “Advances on 32nm NiPt Salicide Process,” Advanced Thermal Processing of Semiconductors, 17th International Conference, pp. 1-4, Sept. 29-Oct. 2, 2009.
[36] B. Nishi, et al., “Interfacial Segregation of Metal at NiSi/Si Junctional for Novel Dual Silicide Technology,” IEEE International Electron Devices Meeting Technical Digest, pp. 135-138, 2007.
[37] A. Topol, et al., “Lower Resistance Scaled Metal Contacts to Silicide for Advanced CMOS,” VLSI Technology, 2006.
[38] K. Kuhn, “Moore’s Law Past 32nm: Future Challenges in Device Scaling,” Proc. of Intl. Workshop on Computational Electronics, pp. 1-6, 2009.
[39] S. Pidin, et al., “A Novel Strain Enhanced CMOS Architecture Using Selectively Deposited High Tensile and High Compressive Silicon Nitride Films,” IEEE International Electron Devices Meeting Technical Digest, pp. 213-216, 2004.
[40] K.W. Ang, et al., “Enhanced Performance in 50nm N-MOSFETs with Silicon-Carbon Source/Drain Regions,” IEEE International Electron Devices Meeting Technical Digest, pp. 1069-1072, 2004.
[41] Kelin Kuhn, “22nm Device Architecture and Performance Elements,” IEEE International Electron Devices Meeting Short Course, 2008.

AUTHORS
Balaji Chandrasekaran is a marketing programs manager in the Silicon Systems Group at Applied Materials. He holds his M.S. in materials science and engineering from Northwestern University and an MBA from the University of California, Berkeley.

Khaled Ahmed is a distinguished member of technical staff in the Silicon Systems Group at Applied Materials. He received his Ph.D. in electrical engineering from North Carolina State University.

Tony Pan is a distinguished member of technical staff in the Silicon Systems Group at Applied Materials. He holds his Ph.D. in materials science and engineering from Cornell University.

Adam Brand is a director of the Transistor Technology group in the Silicon Systems Group at Applied Materials. He received his M.S. in electrical engineering from the Massachusetts Institute of Technology.

ARTICLE CONTACT
[email protected]

Volume 9, Issue 2, 2011 Nanochip Technology Journal Applied Materials, Inc.

OPTIMIZING SPIKE ANNEALS for 32nm and Beyond

KEYWORDS
Spike Anneal
Low-Temperature Anneal
Rapid Thermal Processing
Pattern Loading Effect
Residence Time
Multi-Point Temperature Control

Rapid thermal processing is becoming increasingly challenging as shrinking geometries require the reduction of yield-limiting thermal pattern loading effects (PLE), smaller thermal budgets, and lower process temperatures. However, an innovative approach to wafer heating now minimizes PLE, while new techniques dramatically shorten spike residence time to meet tight thermal budgets, and transmission pyrometry enables closed-loop control to below 75°C for achieving production-worthy repeatability at process temperatures below 180°C.

Rapid thermal processing (RTP) continues to be an important process in the VLSI circuit manufacturing flow. Originally introduced for dopant diffusion and annealing, the process space has steadily expanded into a wider temperature processing regime and to lower thermal budgets. Thermal processes now range from relatively low temperatures (around 250°C and below) for contact formation applications to very high temperatures (greater than 1200°C) for substrate defect anneals. At the same time, tighter thermal budget requirements have driven the need for very short annealing times, for example, “spike” annealing used for diffusion engineering. Although this trend to shorter anneal times has created some applications that cannot be performed with traditional RTP (e.g., millisecond annealing for contact nickel silicide formation), new RTP processes are being introduced; these range from thermal processes previously performed in furnaces to thermal modifications of new materials.

Each application brings with it its own set of unique challenges. Currently, a major challenge for spike annealing is to minimize temperature variations arising from variations in radiant energy absorption within a die. These effects are generically called pattern effects (or pattern loading effects, PLE) and result from variations in optical properties (reflectivity, absorptivity, and diffraction) within the die itself, as well as thermal heat transfer from the surface to the bulk silicon beneath the die. PLE is a strong function of the radiative flux incident upon a surface. Thus, as ramp rates increase to reduce total thermal budget, so do pattern effects.[1,2]

An additional challenge for spike annealing is the reduction in thermal budget itself. Heat transfer occurs in spike annealing primarily through radiation but also through conduction and convection. Much attention has focused on increasing the ramp rate of a spike anneal to reduce the time taken to reach peak temperature; now the limiting factor in the thermal budget is the cooling rate of the wafer after the peak temperature has been reached.

Finally, implementation of nickel and nickel alloy silicides has been driving lower temperature RTP processing. To enable this, the main problem to solve is the temperature measurement itself. The limit today with traditional optical temperature measurement techniques is approximately 200°C. Alternative measurement techniques will have to be employed to accurately measure lower temperatures.

Here, we address these three aspects of RTP and demonstrate significant improvement in each. To address PLE, we explore lamp heating from the back of the substrate and employ conduction through the bulk as the method for heating the device. For thermal budget reduction, we demonstrate increased cooling rates by employing concepts to improve radiative, conductive, and convective heat loss. Finally, a novel temperature measurement technique is tested for low temperature processing.


MITIGATING PLE
Since the 65nm node, PLE in RTP has emerged as a major yield limiter. Interference effects and reflectivity variation resulting from layout details of the layers present at the integration stage of the spike anneal (e.g., STI, poly, and nitride spacers) lead to significant temperature differences across each die. This in turn leads to larger variations in critical transistor performance parameters.[1,2] Figure 1 shows an example of temperature differences across a chip during a 1000°C, 150°C/s spike anneal. The top image is a simulation of the temperature variation across a device using the detailed optical properties of the chip layout to predict the values. The bottom graph shows the actual measurements, using temperature-sensitive test structures like ring-oscillators placed in the marked regions. The simulation and the measurements match well, confirming that this phenomenon is well understood.[2] The temperature difference is a strong function of the ramp-up rate due to the increased radiative energy flux from the lamps during spike anneal.[1] Several methods for mitigating the temperature variation have been proposed, such as layout modifications to reduce the variation of optical properties across the wafer, and the application of absorption layers. Although these methods can lead to significant improvements, they also limit the flexibility of chip layout as well as process flexibility, and certainly add cost.

Figure 1. (a) Simulated temperature map with cold region colored blue and (b) measured temperature range for a 750mm² chip, showing close correlation between the two sets of data.[2] [Panel (b): ΔT measured (°C) versus ΔT simulated (°C); R² = 0.97, ΔT = 8°C.]

The studies reported here aimed to devise a more robust solution to the PLE problem by delivering radiation solely to the wafer backside. Because the backside of the wafer has uniform absorptivity and no structures, radiative heating is very uniform. The heating of the device region is accomplished by conductive heating through the substrate. In this manner, a more uniform temperature distribution is achieved across the device and no layout modifications or absorption layers are necessary.

To test the effectiveness of this approach, a patterned wafer with extreme variation in absorptivity was created comprising a 20mm checkerboard pattern of highly reflective and highly absorptive film stacks (Figure 2). The non-patterned side of the wafer received a BF2 implant at 3keV and a dose of 1·10^15 cm^-2, a typical implant condition used for a spike monitor wafer. The temperature sensitivity of this implant is 2.7Ohm/sq/°C and the target Rs is 240Ohm/sq for a 1050°C spike anneal with a ramp rate of 220°C/s. The test wafer was placed in an RTP chamber first with the patterned side facing the lamps and then with the non-patterned side facing them. As the results in Figure 2 demonstrate, backside heating reduced the pattern-related temperature variation from more than 120°C to a level below measurement noise. For a wafer like that shown in Figure 1, it is, therefore, reasonable to expect a complete absence of PLE.

Figure 2. (a) Checkerboard wafer with poly/oxide film stack and oxide film to create large areas of extreme reflectivity difference (0.485 vs. 0.816), implant on backside. [Panel (a) labels: Poly (570Å)/Oxide, R ~ 0.485; Oxide (1700Å), R ~ 0.816.]


Figure 2 (continued). (b) Pattern-facing lamps showed large temperature fluctuation during spike and extremely large Rs variation along centerline, measured on implanted side. (c) Backside-facing lamps showed good Rs uniformity. [Panels for each orientation: temperature (°C) versus time (s), and Rs (Ohm/sq) versus wafer diameter (mm); pattern to lamps: >>5°C, 3σ; pattern to reflector: <5°C, 3σ.]

REDUCING RESIDENCE TIME PROFILES
Extending spike anneal to next-generation nodes requires faster ramp-up and cool-down rates to minimize the spike residence time, generally defined as the time taken for the wafer to reach the peak temperature from 50°C below it and back down. The main effect of shorter residence time is a reduction in lateral diffusion length of the source/drain regions, which improves the gate overlap capacitance and Vt roll off while maintaining high Ion/Ioff.[3] While backside heating eliminates the risk of PLE temperature variations, in turn facilitating a faster ramp-up rate, accelerating the cool-down rate is a different challenge. Cooling of the silicon wafer is enhanced by improving the conductive, radiative, and convective heat transfers. In these studies, greater conductive heat loss was achieved by gradually increasing helium gas flow during the recipe, such that a 75°C/s cooling rate was obtained (red profile in Figure 3). Radiative heat loss was increased by chamber modifications to absorb heat in general. This approach increased the cool-down rate from 75°C/s to 90°C/s (yellow profile in Figure 3). However, to further improve cool-down, specifically after the peak temperature was reached, the distance between wafer and reflector was dynamically reduced during the recipe. Wafer motion towards the reflector plate provided additional convective heat transfer from the gas exiting the space during the motion. This resulted in the shortest residence time (green in Figure 3). Whereas the radiative heat loss decreased strongly with temperature, the conductive and convective parts did not.
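The residence-time definition above (time spent within 50°C of the peak) can be evaluated directly from a sampled time/temperature profile. The following is a minimal sketch, not the authors' analysis code: the function names and the idealized triangular profile are assumptions introduced for illustration, and only the 220°C/s ramp and the 75°C/s and 90°C/s cool-down rates are taken from the text.

```python
def residence_time(times, temps, window=50.0):
    """Time (s) the profile spends within `window` degC of its peak.

    Threshold crossings are located by linear interpolation between samples.
    """
    peak = max(temps)
    threshold = peak - window
    i_peak = temps.index(peak)

    def crossing(a, b):
        # Time at which the straight line between samples a and b hits the threshold.
        return times[a] + (threshold - temps[a]) * (times[b] - times[a]) / (temps[b] - temps[a])

    lo = i_peak
    while lo > 0 and temps[lo] >= threshold:
        lo -= 1  # walk back to the last sample below the threshold (ramp-up side)
    hi = i_peak
    while hi < len(temps) - 1 and temps[hi] >= threshold:
        hi += 1  # walk forward to the first sample below the threshold (cool-down side)
    if temps[lo] >= threshold or temps[hi] >= threshold:
        raise ValueError("profile does not span the full residence window")
    return crossing(hi - 1, hi) - crossing(lo, lo + 1)


def triangular_profile(peak, ramp_up, cool_down, span=150.0):
    """Hypothetical idealized spike: linear ramp-up and linear cool-down (degC/s)."""
    t_peak = span / ramp_up
    t_end = t_peak + span / cool_down
    return [0.0, t_peak, t_end], [peak - span, peak, peak - span]


if __name__ == "__main__":
    for rate in (75.0, 90.0):
        t, T = triangular_profile(1050.0, 220.0, rate)
        print(f"cool-down {rate:.0f} degC/s -> residence time {residence_time(t, T):.2f} s")
```

For an idealized linear spike the result reduces to 50/(ramp-up rate) + 50/(cool-down rate), which makes explicit why the cool-down rate dominates the residence time once the ramp-up rate is already high. Real profiles are not linear near the peak, so measured residence times will differ from this idealization.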


Figure 3. (a) Time/temperature profiles of spike anneals (using three methods to improve cool-down) used to calculate diffusion length for boron based on an intrinsic diffusion model. (b) Experimental Rs/Xj plot with BF2 implant (1keV, 1·10^15 cm^-2) using three cool-down methods over a range of temperatures. [Panel (a): temperature (°C) and diffusion length (arb. units) versus time (s) for methods (1), (1)+(2), and (1)+(2)+(3); residence times 1.49s, 1.09s, and 0.78s. Panel (b): Rs (Ohm/sq) versus Xj at 5·10^18 cm^-3 (Å), with as-implanted Xj marked; residence times 0.75s, 1.2s, and 1.5s.]

Figure 4 illustrates results of the above experiments using various dynamic wafer-level movements. All other conditions remained constant. In all profiles, the lamps shut off when the temperature reached 575°C and the wafer-level movement was initiated after that as shown in the graphs. Compared to the static wafer profile, slow movement of the wafer toward the reflector plate increased the cool-down rate approximately proportionally as would be expected from the increase in conductive heat transfer (red temperature profile). However, faster movement produced a significant increase in the cool-down rate. This effect can be attributed to convective cooling action and makes possible significant improvement in sharpening the spike profile, even in lower temperature regimes.
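The rate of temperature change plotted for these profiles can be estimated from sampled data by finite differences. The sketch below is one plausible way to do this and is not taken from the study; the function name is hypothetical, and it assumes strictly increasing sample times.

```python
def rate_of_change(times, temps):
    """Estimate dT/dt (K/s) at each sample.

    Uses central differences at interior points and one-sided differences
    at the two ends. Assumes `times` is strictly increasing.
    """
    n = len(times)
    if n != len(temps) or n < 2:
        raise ValueError("need two or more samples of equal length")
    rates = []
    for k in range(n):
        lo = max(k - 1, 0)
        hi = min(k + 1, n - 1)
        rates.append((temps[hi] - temps[lo]) / (times[hi] - times[lo]))
    return rates


if __name__ == "__main__":
    # Illustrative input only: a constant 75 K/s cool-down starting at the
    # 575 degC lamp shut-off temperature mentioned in the text.
    times = [0.1 * k for k in range(11)]
    temps = [575.0 - 75.0 * t for t in times]
    print(rate_of_change(times, temps))
```

Measured pyrometer traces are noisy, so in practice the profile would normally be smoothed before differentiating; that step is omitted here.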

Figure 4. (a) Experimental time/temperature profiles with actual wafer-level profiles. (b) Rate of temperature change calculated for the same profiles. [Panel (a): temperature (°C) and wafer distance from reflector plate (mm) versus time (s). Panel (b): dT/dt (K/s) and Z-profile (mm) versus time (s). Conditions: no move; 10mm/s with 0.5s delay; 10mm/s with 1s delay; 5mm/s with 1s delay.]


IMPROVING PYROMETRY FOR LOW-TEMPERATURE RAPID THERMAL ANNEAL
As silicide film thickness shrinks with each technology node, process temperatures and times must be correspondingly adjusted downward. Temperatures lower than 250°C are required. Minimizing the thermal budget while sustaining minimum reaction temperatures necessitates low-temperature spike anneals.[4]

Achieving a production-worthy process poses a two-fold challenge. First, temperature measurement with conventional pyrometry reaches a fundamental limitation at approximately 200°C due to the decreasing signal-to-noise ratio between the thermal emission dictated by the Planck relation and the background radiation. Therefore, new methods for non-contact temperature measurement must be developed. Second, to obtain repeatable results for short processing times, multi-point temperature control must be implemented for all possible wafer properties from as low as 50°C.

To address the temperature measurement challenge, the temperature dependence of the silicon substrate itself is used to calculate temperature via a transmission measurement. Figure 5 shows a closed-loop, ultra-low temperature spike using this transmission pyrometry method at four locations distributed from the center to the edge of the wafer.

Figure 5. Temperature profile for a low-temperature spike. [Plot: wafer and setpoint temperature (°C) versus time (s).]

CONCLUSION
Several recent advances expand the process window for RTP. Backside heating has proven effective in overcoming PLE regardless of device structures or differences in material absorptivity. Dynamically reducing the distance between wafer and reflection plate during processing increases the rates of radiative, conductive, and convective cooling, consistent with smaller thermal budgets. New temperature measurement techniques show promise for wafer processing below the limits imposed by optical pyrometry, suggesting that further reductions in chamber processing temperature can be achieved.

REFERENCES
[1] I. Ahsan, et al., “Impact of Intra-Die Thermal Variation on Accurate MOSFET Gate-Length Measurement,” ASMC, p. 174, 2009.

[2] P. Morin, et al., “Managing Annealing Pattern Effects in 45nm Low-Power CMOS Technology,” ESSDERC, 2009.

[3] C.I. Li, et al., “Superior Spike Annealing Performance in 65nm Source/Drain Extension Engineering,” RTP, p. 163, 2005.

[4] S. Ramamurthy, et al., “Nickel Silicide Formation Using Low-Temperature Spike Anneal,” Solid State Technology, p. 37, October 2004.

AUTHORS
Wolfgang Aderhold is a senior member of the technical staff in the Anneals and Epitaxy group of the Front End Products business unit at Applied Materials. He holds his Ph.D. in electrical engineering from Friedrich-Alexander Universität, Erlangen, Germany.

Aaron Hunter is a senior director in the Anneals and Epitaxy group of the Front End Products business unit at Applied Materials. He received his B.A. in physics from the University of California at Santa Cruz.

Shankar Muthukrishnan is a global product manager in the Anneals and Epitaxy group of the Front End Products business unit at Applied Materials. He holds his M.S. in environmental engineering from Texas A&M University.

ARTICLE CONTACT
[email protected]

PROCESS SYSTEM USED IN STUDY
Applied Vantage® Vulcan™ RTP

INTEGRATING ATOMIC LAYER DEPOSITION HIGH-κ DIELECTRICS at the ≤22/20nm Logic Technology Node

KEYWORDS
High-κ/Metal Gate
Atomic Layer Deposition
Radical Oxidation
Plasma Nitridation
Process Clustering

At ≤22nm, the dielectric gate stack must attain an EOT less than 9Å without degrading gate leakage current, gate stack reliability, or channel carrier mobility. Methods of extending each unit process in the replacement gate are shown to meet requirements for 20nm. In addition, process clustering proves a means of improving gate stack control and quality.

In 2007, Intel announced the “biggest change in transistor technology since the introduction of the polysilicon gate MOS transistor in the late 1960s”[1,2] with the industry’s first high-κ/metal gate product at the 45nm node. From that time until now, other industry logic and foundry leaders have announced their offerings for high-κ/metal gate with variations in the integration approach: gate first and gate last.[3-6] The high-κ dielectric instantly provided relief to the equivalent oxide thickness (EOT) scaling speed bump that came about when SiON hit its limit due to the immense increase in leakage current.[2] EOT scaling is required for both performance boost (increase in Cinv and improvement to sub-threshold slope) and mitigating short channel effects, specifically drain induced barrier lowering (DIBL). However, extending Moore’s Law to the 22nm and 14nm nodes gives rise to high-κ gate stack scaling challenges.

The gate stack is composed of the interface oxide layer, typically grown thermally or chemically, and the bulk high-κ layer, deposited by either chemical vapor deposition (CVD) or atomic layer deposition (ALD). With each node, the stack thickness must scale down to meet ever decreasing EOT targets (Table 1). To reduce the EOT further, plasma nitridation can be employed to incorporate a controlled dose of nitrogen into the entire stack, followed by an anneal to stabilize the stack. With this process flow, the 22nm Jg/EOT targets can be met. However, manufacturing the stack in a repeatable and reliable manner involves an additional consideration. The dielectric gate stack is the core of the transistor and is electrically very sensitive to variation and quality. Both of these perturbations can be reduced by (a) having a consistent and minimal queue-time between each step, and (b) keeping the wafer in a controlled vacuum atmosphere between steps. Clustering each of the process chambers onto a vacuum mainframe can satisfy these requirements.

Table 1. Target thickness values by technology node

Node (nm)   EOT (Å)   IL Thickness (Å)   High-κ Thickness (Å)
32/28       9-12      7-10               20-23
22/20       6-9       4-7                17-20
16/14       4.5-6     0-3                15-18


To characterize the physical nature of the stacks, XPS was used to accurately determine the thickness of the interface layer and HfO2. AR-XPS was used to verify the nitrogen profile in the stack after post high-κ nitridation, and TEM to measure ALD step coverage. MOSCAP devices were fabricated with thermal budgets mimicking both “gate first” and “gate last” process flows to quantify Jg and EOT values. Short-loop transistors (ring gates) were used to investigate the impact of scaling on electron mobility.

INTERFACE LAYER SCALING
The oxide interface layer (IL) has a low-κ value; hence, it is advantageous to focus on reducing the thickness of this layer as it will produce a large effect on overall EOT. Both rapid thermal oxidation (RTO) and radical oxidation processes offer a means to scale the IL below 8Å, as measured by XPS.[7] For scaling the IL to 5Å and below, the focus has been on radical oxidation (N2O/H2), which enables scaling it down to 2Å in a controlled and repeatable fashion with a thermal budget compatible with both gate first and gate last processes (Figure 1). MOSCAP results verified equivalent Jg/EOT performance for both high-temperature RTO and low-temperature radox processes, confirming that no quality was lost in using the latter (Figure 2a). In addition, ring gate devices verified the expected mobility trend for both processes (Figure 2b).[8]

Figure 1. Radical oxidation enables controlled, repeatable IL scaling to 2Å for gate first and gate last processes. [Plot: XPS thickness (Å) versus temperature (°C) for LT RadOx 5s 1%H2 and LT RadOx 60s 1%H2, with SC1 Chem-Ox and HF-Last thickness levels marked.]

Figure 2. (a) MOSCAP results showed equivalent Jg/EOT performance for high-temperature RTO and low-temperature radox processes. (b) Ring gate mobility decreased with EOT for both along the expected trendline. [Panel (a): Jg at Vfb-1 versus EOT (Å) for spike RTO and low-temp RadOx, gate first and gate last, with 3Å, 6Å, and 9Å IL points and HfO2 trendline. Panel (b): peak mobility (cm²/Vs) versus EOT (Å) with IL scaling trendline. Internal MTCG data.]

Although there is a decrease in mobility from IL scaling due to the increasing proximity of Hf atoms to the channel causing phonon scattering, the reduction in EOT does boost transistor drive current as defined by the following approximation,

$$I_{dsat} \propto C_{ox}\,\mu\,(V_g - V_t)^2\,(W/L),$$

where $C_{ox}$ is the gate capacitance, which is inversely proportional to the EOT. The balance between IL scaling for EOT reduction and its impact on mobility must be optimized.

An alternative approach for IL formation is to use an SC1 chemical treatment, which cleans the surface and leaves an oxide layer of approximately 4Å. This type of oxide has a much lower density than a thermally grown oxide, which can be seen by comparing the large discrepancy in thickness from optical ellipsometry versus XPS. Current studies are examining the reliability of this type of oxide versus thermally grown oxide; results will be presented at a future date.
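The trade-off in this section between a thinner interface layer and lower mobility can be illustrated with the standard series-capacitor EOT relation and the drive-current proportionality given above. This is a simplified sketch under stated assumptions, not data from the study: the interface layer is treated as SiO2 (κ = 3.9), the HfO2 dielectric constant of 20 and the 10% mobility loss are illustrative values only, and the layer thicknesses are picked from the ranges in Table 1.

```python
K_SIO2 = 3.9  # relative permittivity of SiO2, the reference for EOT


def eot(layers):
    """EOT (Å) of a dielectric stack given (thickness in Å, kappa) pairs."""
    return sum(thickness * K_SIO2 / kappa for thickness, kappa in layers)


def relative_idsat(eot_new, mobility_new, eot_ref, mobility_ref):
    """Drive-current ratio at fixed (Vg - Vt) and W/L.

    Uses Idsat ∝ Cox·μ with Cox ∝ 1/EOT, as in the approximation in the text.
    """
    return (eot_ref / eot_new) * (mobility_new / mobility_ref)


if __name__ == "__main__":
    K_HFO2 = 20.0  # assumed, illustrative value for HfO2
    thick = eot([(7.0, K_SIO2), (20.0, K_HFO2)])  # 7Å IL + 20Å high-κ
    thin = eot([(4.0, K_SIO2), (17.0, K_HFO2)])   # 4Å IL + 17Å high-κ
    # Hypothetical 10% mobility loss accompanying the thinner IL.
    gain = relative_idsat(thin, 0.9, thick, 1.0)
    print(f"EOT {thick:.1f} Å -> {thin:.1f} Å, relative Idsat {gain:.2f}")
```

Under these assumptions the EOT reduction outweighs the mobility penalty, which is the qualitative point of the approximation; the actual balance depends on measured mobility and on capacitance effects that this sketch ignores.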


ALD HIGH-κ DEPOSITION AND POST TREATMENTS
Replacing thermally grown oxide dielectrics with a deposited dielectric (ALD HfO2) requires a highly manufacturable process, in particular excellent uniformity and low particle count.[9] The ALD high-κ film used in these studies demonstrates 100% step coverage in a very aggressive 10nm structure, confirming its extendibility to emerging 3D transistor structures (Figure 3).

Figure 3. ALD high-κ achieves 100% step coverage of aggressive aspect ratios. [Image: 10nm CD trench with 7.0-8.6 aspect ratio (internal wafer).]

Figure 4 shows the scaling trend of oxide, oxynitride, and HfO2, defined by the dielectric constant, band gap and alignment, and tunneling effective mass.[10] For higher-κ films, the slope of the Jg/EOT trendline becomes steeper, reducing the EOT window for each material. At this rate, the leakage for HfO2 is expected to be too high for EOTs of less than ~7-8Å. To extend HfO2 further, post high-κ plasma nitridation (typically a 4-10% dose) provides 1-2Å of additional EOT scaling (Figure 4b), deferring the need for higher-κ materials to <6Å.[11] A post-nitridation anneal removes metastable Hf-O-N bonds. The EOT-reduction effect of nitrided HfO2 derives from an increase in the film dielectric constant (through greater electron and ionic polarization).[12,13]

GATE STACK CLUSTERING
Clustering of oxidation, plasma nitridation, and anneal process chambers on a vacuum mainframe is well established in high-volume manufacturing for SiON gate dielectrics. This approach ensures short, consistent, and vacuum-controlled queue-time between each process step in the critical and sensitive gate stack sequence. By adding a high-κ dielectric film to the gate stack, two additional interfaces are introduced. At each logic technology node reduction, the interface-to-bulk ratio increases dramatically, making process clustering ever more crucial. During a vacuum break, molecular contaminants (e.g., C, N, O, F, Na, S) can be incorporated into the gate stack interfaces. Previous studies have shown a reduction in gate oxide integrity from the introduction of hydrocarbons into the gate stack.[14]

Besides freedom from contamination, process control is another vital advantage of gate stack clustering. Figure 5 shows the growth in interface layer thickness with increasing queue-time after an ultra-thin IL process as compared to a clustered process (zero queue-time data point). Data collected thus far suggest tighter transistor characteristics (e.g., >30% reduction in σVt) and >5% increase in mobility when gate stack clustering is used.

Figure 4. (a) Jg/EOT trends illustrate the narrower EOT scaling window for each material. (b) Below 7-8Å, gate leakage necessitates plasma nitridation to obtain 1-2Å further EOT scaling. [Panel (a): gate leakage versus EOT (nm) for SiO2, SiON, HfO2, and doped high-κ. Panel (b): Jg versus EOT (Å) for ALD HfO2 and ALD HfO2 with post nitridation.]


Figure 5. Increasing interface layer thickness with unclustered process for ultra-thin IL. [Plot: XPS thickness (Å) versus queue-time to ALD high-κ deposition (hours), with the clustered condition at zero queue-time.]

EXTENDIBILITY TO 16NM AND BELOW
To further scale the EOT to meet 16/14nm node requirements, reduction of the low-κ interface layer to less than 3Å was investigated (Figure 1). This is the limit of an SC1-last pre-clean. For ultra-thin to zero IL formation, an HF-last type clean must be used. The degree of control and interface quality necessary for the gate stack require that a dry plasma gate pre-clean be integrated on the same mainframe. MOSCAP data show Jg/EOT performance equivalent to traditional cleans, but with superior interface layer control resulting from fixed and clustered queue-time. The κ-value of bulk HfO2 must also be increased, which is accomplished by incorporating a κ-boosting element, such as Ti, into the matrix during ALD deposition. By so doing, the element distribution in the layer can be carefully tailored.

CONCLUSION
The consistent reduction in EOT at each subsequent node necessitated by Moore’s Law poses new challenges for the high-κ gate stack at 22nm and below. These EOT scaling challenges can be addressed at each process step in the gate stack sequence. Using low-temperature radical oxidation, IL thickness can be reduced in a controlled manner to less than 8Å. The κ-value of the ALD HfO2 layer can be increased through post-deposition plasma nitridation and anneal, reducing the EOT by an additional 1-2Å. Clustering the IL and high-κ processes improves control and quality. The impact on transistor performance and reliability is still under investigation, but initial data suggest improvement in both electrical performance and distribution. The proposed gate stack is extendible to 14/16nm nodes by introducing integrated dry gate pre-clean, <2Å IL formation, and higher-κ ALD films.

ACKNOWLEDGEMENTS
The authors would like to thank the entire high-κ ALD team, the oxidation and nitridation team in FEP, design and core engineering, GPS, TPS, CAT, DTCL, and MTCG.

REFERENCES
[1] http://www.intel.com/pressroom/archive/releases/2007/20070128comp.htm.

[2] K. Mistry, et al., “A 45nm Logic Technology with High-κ + Metal Gate Transistors, Strained Silicon, 9 Cu Interconnect Layers, 193nm Dry Patterning, and 100% Pb-Free Packaging,” Proceedings of the IEDM, pp. 247-250, 2007.

[3] M. Chudzik, et al., “High-Performance High-κ/Metal Gates for 45nm CMOS and Beyond with Gate-First Processing,” Symposium on VLSI Technology, pp. 194-195, 2007.

[4] http://www.samsung.com/us/business/semiconductor/newsView.do?news_id=1162.

[5] http://www.globalfoundries.com/newsroom/2010/20100901_ARM.aspx.

[6] http://www.tsmc.com/tsmcdotcom/PRListingNewsArchivesAction.do?action=detail&newsid=4041&language=E.

[7] M.J. Bevan, et al., “Ultra-Thin SiO2 Interface Layer Growth,” Proc. of the 18th Conf. on Adv. Thermal Process. of Semic. – RTP 2010.

[8] J. Huang, et al., “Gate First High-κ/Metal Gate Stacks with Zero SiOx Interface Achieving EOT=0.59nm for 16nm Application,” Symposium on VLSI Technology, pp. 34-35, 2009.

[9] A. Noori, et al., “Enabling ALD Hardware for High-κ Dielectrics,” 10th International Conference on Atomic Layer Deposition, 2010.

[10] Y.C. Yeo, et al., “MOSFET Gate Leakage Modeling and Selection Guide for Alternative Gate Dielectrics Based on Leakage Considerations,” IEEE Trans. on Elect. Dev., Vol. 50, No. 4, pp. 1027-1035, 2003.

[11] S. Hung, et al., “ALD HfOx Scaling Through Nitridation for 22nm and Beyond,” 10th International Conference on Atomic Layer Deposition, 2010.


[12] T. Ino, et al., “Dielectric Constant Behavior of Hf-O-N System,” Jap J. of Appl. Phys. Vol. 45, No. 4B, pp. 2908-2913, 2006.

[13] C.S. Kang, et al., “The Electrical and Material Characterization of Hafnium Oxynitride Gate Dielectrics with TaN-Gate Electrode,” IEEE Trans. Elect. Dev., Vol. 51, No. 2, pp. 220-227, 2004.

[14] F. Tardiff, et al., “Hydrocarbons Impact on Thin Gate Oxides,” Proc. of the Intern. Symp. on Ultra-Clean Process. of Silicon Surf, pp. 309-312, 1996.

ARTICLE CONTACT
[email protected]

PROCESS SYSTEM USED IN STUDY
Centura® Integrated Gate Stack (RadOxiL, ALD High-κ, DPN3, RadiancePlus Anneal)

AUTHORS
Atif Noori is a global product manager in the ALD division at Applied Materials. He holds his Ph.D. in materials science and engineering from UCLA.

Steven Hung is an integration engineer in the HKMG Technology Group of the ALD division at Applied Materials. He received his Ph.D. in electrical engineering from Stanford University.

Tatsuya E. Sato is a process member of technical staff in the HKMG Technology Group of the ALD division at Applied Materials. He holds his M.S. in mechanical engineering from Hokkaido University, Japan.

Malcolm Bevan is gate manager of the Gate Stack and Oxidation Products unit of the Front End Products business unit at Applied Materials. He received his Ph.D. in physical chemistry from Cambridge University.

Kenric Choi is a member of technical staff in the ALD division at Applied Materials. He holds his B.S. in mechanical engineering from San Jose State University.

Brendan McDougall is an integration engineer in the Transistor Technology Group of Applied Materials. He received his Ph.D. in physics from Brandeis University.

Johanes Swenberg is head of the gate and oxidation product line in the Front End Products business unit at Applied Materials. He holds his Ph.D. in applied physics from the California Institute of Technology.

Maitreyee Mahajani is general manager in the HKMG Technology Group of the ALD division at Applied Materials. She holds her M.S. in materials science from the University of Alabama.

EXTENDING OXYNITRIDE GATE TECHNOLOGY for Advanced DRAM

Scaling the DRAM peripheral gate is key to advanced high-performance, low-power devices, but current nitridation processes are limited in achieving the optimal leakage and threshold voltage. A new high-temperature, high-power NH3 nitridation process with pulsed RF plasma generates the higher nitrogen doses needed for the 3x/2x nodes without deterioration of leakage and threshold voltage performance.

KEYWORDS
Nitridation, Gate Stack, DRAM Peripheral Gate, Pulsed Plasma, Post-Nitridation Anneal

DRAM devices with advanced power-management features and faster access/storage rates are required to keep pace with the rapidly expanding variety of mobile devices. The subsequent performance scaling of the dielectric gate has prompted successive advances in oxynitride gate technology. This article discusses the advantages of plasma nitridation in creating an oxynitride film and the advances in extending this technology over multiple generations.

Scaling the transistor to smaller dimensions requires a gate dielectric with increased capacitance to control short channel effects. Higher capacitance is achieved by reducing the gate oxide thickness, but this increases gate leakage. At thicknesses less than approximately 30Å, the leakage is unacceptably high when SiO2 is used as the gate dielectric. Consequently, integrated device manufacturers transitioned to nitrided oxide (SiON). Over the past decade, nitridation treatments have been used to incorporate nitrogen into previously grown SiO2 film. Incorporating the SiON (oxynitride) as the dielectric gate raises the gate dielectric constant, thus reducing the equivalent oxide thickness (EOT), while also reducing the gate leakage more than ten times over SiO2. Besides enabling device scaling, oxynitride gate dielectrics are highly effective barriers to boron dopant diffusion from the polysilicon gate, which would otherwise result in unmanageable PMOS threshold voltage shifts and degraded process control.

Over the past decade, operating voltages have scaled aggressively from 2.5V to levels now approaching 1.0V while, simultaneously, transfer rates have increased from 1.0GB/s to those exceeding 20GB/s. Scaling the DRAM peripheral gate is key to enabling this aggressive performance improvement.

PLASMA VS. THERMAL NITRIDATION
Although the first oxynitrides were processed by annealing a grown SiO2 film in nitrogen-containing chemistries, plasma nitridation has proven to be far superior in performance, reliability, and extendibility. Unlike thermal nitridation, which relies on temperature to drive nitrogen into the oxide, plasma nitridation employs a plasma source to expose the SiO2 film to a nitrogen-rich environment, thereby “dosing” it with nitrogen. An RF generator creates an electric current through two source coils (inner and outer) that oscillates at 13.56MHz. The resulting synchronous oscillating magnetic field generates an oscillating electric field that initiates plasma ionization. The ensuing inductively coupled plasma is “self-contained” in this electric field, minimizing acceleration of the ions to the wafer surface. Adjusting the input power modifies the ion density of the plasma, which, in turn, defines the amount of nitrogen incorporated into the oxide layer. The inner and outer coil design of the source allows for tuning plasma density across the wafer to improve nitrogen dose uniformity.

Figure 1 compares nitrogen depth profiles of a thermal nitridation and plasma nitridation treatment.[1] The former results in high nitrogen concentration at the Si/SiON interface. Nitrogen at this interface has a deleterious effect on device performance, both in the form of reduced mobility and degraded negative bias temperature instability (NBTI) performance.[2,3] On the other hand, plasma nitridation concentrates most of the nitrogen near the top surface of the oxynitride film, improving device performance relative to thermal nitridation. By concentrating the nitrogen near the top surface, higher nitrogen content can be realized compared to thermal nitridation without adversely affecting device performance.[1] In addition, plasma nitridation at lower temperatures suppresses unwanted dielectric gate thickening caused by thermal processes, thus enabling greater scaling control. Investigation of alternative process integrations incorporating thermal nitridation has shown some capability to shift the nitrogen profile toward the top surface, but such efforts are inherently limited by the thermal nature of the process.[4,5,6]

PULSED PLASMA
The penetration of nitrogen ions through the dielectric gate to the Si/SiO2 interface, potentially compromising device performance, becomes of increasing concern as the dielectric gate is thinned. Pulsing the plasma avoids this effect.[2] Higher energy electrons diffuse to the wall of the chamber during the off cycle, leaving and cooling the plasma. The heavier ions are too slow to escape. The net result is a significant reduction in the electron energy of the plasma species (<0.5eV) without a change in ion density. The plasma “softens” by this shifting of the ion energy distribution to lower levels.

Figure 1. Comparison of nitrogen profiles shows nitrogen accumulation at the Si/SiO2 interface with thermal nitridation. Plasma nitridation keeps nitrogen at the top surface away from the Si/SiO2 interface. [Panels: (a) Thermal Nitridation, (b) Plasma Nitridation; N and O secondary ion counts vs. sputter time (s), with the SiON/Si interface marked. Images courtesy of Toshiba.]

Figure 2. (a) Langmuir probe measurements reveal lower kTe and Vp when the RF plasma is pulsed (green data). A 4X reduction is achieved at 10mTorr. (b) Nitrogen ion-energy distributions (10mTorr and fixed effective RF power for modified pulsed RF plasmas at 10kHz and duty cycles <20%) show a shift to lower ion energy levels, i.e., “softer” plasma. [Panels: (a) normalized electron temperature or plasma potential (au) vs. pressure (mTorr), for CW and pulsed RF; (b) curves for CW and pulsed RF at 5%, 10%, 15%, and 20% duty cycle vs. normalized ion energy (au).]
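The pulse timing behind the Figure 2 conditions (10kHz pulsing, duty cycles of 20% or less) can be worked out with a short sketch. It assumes an ideal rectangular pulse envelope and reads "fixed effective RF power" as a fixed time-averaged power; both are assumptions made for illustration, and the 500W figure is a hypothetical value that does not come from the article.

```python
def pulse_timing(frequency_hz, duty_cycle):
    """Return (on_time_s, off_time_s) for one period of a pulsed RF envelope."""
    if not 0.0 < duty_cycle <= 1.0:
        raise ValueError("duty_cycle must be in (0, 1]")
    period_s = 1.0 / frequency_hz
    on_time_s = duty_cycle * period_s
    return on_time_s, period_s - on_time_s


def peak_power_w(average_power_w, duty_cycle):
    """Peak power during the on-time that keeps the time-averaged power fixed.

    Assumes an ideal rectangular envelope: average = peak * duty_cycle.
    """
    if not 0.0 < duty_cycle <= 1.0:
        raise ValueError("duty_cycle must be in (0, 1]")
    return average_power_w / duty_cycle


if __name__ == "__main__":
    AVERAGE_POWER_W = 500.0  # hypothetical value, not taken from the article
    PULSE_FREQUENCY_HZ = 10e3  # 10kHz, as quoted in the Figure 2 caption
    for duty in (0.05, 0.10, 0.15, 0.20):
        on_s, off_s = pulse_timing(PULSE_FREQUENCY_HZ, duty)
        peak = peak_power_w(AVERAGE_POWER_W, duty)
        print(f"duty {duty:.0%}: on {on_s * 1e6:.0f} us, "
              f"off {off_s * 1e6:.0f} us, peak {peak:.0f} W")
```

Under these assumptions a lower duty cycle lengthens the off-time, during which the text says energetic electrons leave the plasma, at the cost of a proportionally higher peak power to hold the average constant.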


Langmuir probe measurements of kTe at 1µs intervals reveal the periodic variations resulting from pulsed plasma power. At the lower process pressures (~10mTorr) optimal for high plasma density, kTe for pulsed RF plasma is four times lower than for continuous wave plasma (Figure 2a). The shift in ion-energy level can also be represented by plotting the ion-energy distributions (Figure 2b). (Although higher pressures can be used as an alternative means of lowering the ion energy, by decreasing the mean free path, the subsequent drop in ion density results in lower nitridation rates.[7]) The ensuing “softer” plasma further ensures that nitrogen is incorporated only at the top surface of the dielectric gate, thereby maintaining high channel mobility.[3] This “softer” plasma results in a 15% improvement in high-field effective mobility compared to continuous wave plasma at a given nitrogen content (Figure 3). This improvement reduces threshold voltage shift, translating into an approximate doubling of transistor lifetime.[2]

Figure 3. (a) Normalized saturation drain current for long channel PMOS transistors. (b) NBTI lifetime estimate (10% Idsat degradation failure criterion). The lower electron temperature (“softer” plasma) of pulsed plasma improves high-field effective mobility by 15%, in turn improving NBTI performance. Source: Reference 2. [Panels: (a) normalized Idsat vs. nitrogen concentration (at. %), pulsed RF (15% higher) vs. CW; (b) PMOS lifetime (au) vs. stress voltage (V), pulsed RF vs. CW.]

INTEGRATED POST-NITRIDATION ANNEAL
Nitrogen incorporation into the dielectric gate shifts the threshold voltage. The high sensitivity of threshold voltage to nitrogen content poses manufacturing challenges as nitrogen concentration in the oxynitride decreases over time following the nitridation treatment.[8] Figure 4a shows a greater than 10% drop in nitrogen content over a short idle time after plasma nitridation, which translates into a significant shift in threshold voltage. A high-temperature (≥950°C) post-nitridation anneal (PNA) immediately following the nitridation process counteracts this nitrogen loss (Figure 4b). By clustering the plasma nitridation and post-nitridation anneal on a common platform, time-dependent process variability can be removed, resulting in a stable and robust process essential for manufacturing oxynitride gates.

The PNA also eliminates an unstable bonding phase that results from plasma damage during the nitridation process. This instability causes a fluctuation in electrical characteristics and, hence, variation in the threshold voltage. The high-temperature, post-nitridation anneal not only removes this unstable phase, but also reduces the defects brought about by plasma damage near the high-mobility channel, further ensuring optimal device performance.

Figure 4. (a) PMOS threshold voltage shows a significant shift as a function of nitrogen content. Vt sensitivity is 28mV/%N. (b) A 15% drop in nitrogen content occurs without PNA. PNA counteracts the nitrogen loss. [Panels: (a) Vt shift (mV) vs. a thermal nitridation reference, plotted against XPS N%; (b) normalized N dose (%) vs. time to XPS (hrs). Labels: PMOS, 10Å Base Oxide; 20Å Base Oxide, ~20% N, 1000°C PNA; With PNA; No PNA.]

HIGHER NITROGEN CONTENT PROCESSING
Conventional nitrogen plasma nitridation begins to approach a limit when scaling the dielectric gate to thicknesses of approximately 20Å. Attempts to increase nitrogen content can enable further EOT scaling; however, both leakage and threshold voltage requirements can no longer be met. A high-temperature, high-power NH3 plasma satisfies all three requirements by altering the trade-off between EOT, threshold voltage, and gate leakage.

NH3 plasma consists primarily of NH radicals while N2 plasma contains mostly N2 ions (Figure 5). It has been theorized that ammonia’s propensity to dissociate in plasma relative to that of nitrogen results in more effective incorporation of nitrogen into the oxide film.[9] However, NH3 plasma nitridation at room temperature produces an unacceptably large shift in threshold voltage for the same amount of nitrogen incorporated (Figure 6a). Heating the substrate during nitridation eliminates the threshold voltage shift and also improves EOT scaling (Figure 6b). The additional process space made available by the shifted threshold voltage versus EOT trade-off may then be exploited to improve leakage performance by increasing the nitrogen dose.

NH3 chemistry is introduced into the chamber in a fashion similar to that used for earlier nitrogen chemistry, with an added insulated gas box to ensure safe delivery of the toxic chemistry. The necessary high temperature is provided by an electrostatic chuck that delivers direct and more stable heating of the wafer than is possible with a heated pedestal that relies on convective heating. Although a heated pedestal can achieve higher starting temperatures, the convective nature of the heater causes wafer temperature to drop during the reduced-pressure plasma process. This results in a highly variable process temperature with an overall lower average. In comparison, an electrostatic chuck demonstrates a stable and robust temperature over the entire process (Figure 7).

High power is necessary to compensate for the lower nitridation rates associated with NH3 plasma. Pulsing delivers a “soft” plasma at elevated RF power and temperature. Combining these conditions with an integrated PNA on a common platform delivers superior production-worthy, plasma nitridation capability.

Figure 5. Optical emission spectroscopy highlights the species difference between N2 and NH3 plasmas. [Plot: OES intensity (au) vs. wavelength (nm), 300-400nm, for N2 plasma and NH3 plasma, with N2, N2+, and NH emission lines labeled.]

Figure 6. (a) Elevated process temperature eliminates the shift in threshold voltage that occurs in room-temperature NH3 plasma. (b) NH3 plasma improves EOT scaling compared to N2 plasma for given nitrogen content. [Panels: (a) threshold voltage (V) vs. ReVera N% for room-temperature NH3, high-temperature NH3, and N2, with target performance marked; (b) threshold voltage (V) vs. equivalent oxide thickness (CET@1V - 0.45) (nm) for NH3 and N2.]
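To see why the nitrogen loss discussed under the post-nitridation anneal matters, the Figure 4 numbers can be combined in a rough estimate. This is only a sketch: it assumes the 28mV/%N sensitivity is linear over the range of interest, treats the 15% drop as relative to the starting nitrogen content, and uses a hypothetical starting content of 20% N.

```python
VT_SENSITIVITY_MV_PER_PCT_N = 28.0  # from the Figure 4 caption


def vt_shift_mv(initial_n_pct, relative_loss):
    """Estimate the threshold voltage shift caused by nitrogen loss.

    initial_n_pct: starting nitrogen content in %N (hypothetical input).
    relative_loss: fraction of that nitrogen lost, e.g. 0.15 for 15%.
    Assumes a linear Vt response to nitrogen content.
    """
    lost_n_pct = initial_n_pct * relative_loss
    return VT_SENSITIVITY_MV_PER_PCT_N * lost_n_pct


if __name__ == "__main__":
    # 20 %N starting content is an assumed example value.
    print(f"{vt_shift_mv(20.0, 0.15):.0f} mV")
```

Under these assumptions, losing 15% of a 20% N film corresponds to 3 %N and a shift on the order of 84mV, which illustrates why clustering the anneal with the nitridation step is presented as essential for threshold voltage repeatability.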


Figure 7. Process temperature-time profiles for electrostatic chuck and heated pedestal demonstrate temperature stability during the plasma process when using the chuck. [Plot: temperature (°C) vs. time for DPN HD (500°C) and heated pedestal (550°C); annotated steps: Wafer Chuck, Gas Stabilization, Pre-Heat and Gas Stabilization, Plasma On, Wafer Dechuck.]

CONCLUSION
Combining a high-temperature, high-power NH3 plasma nitridation with pulsed RF plasma and an integrated PNA satisfies EOT scaling, leakage, and threshold voltage requirements for next-generation DRAM peripheral gates and offers an attractive alternative to the costly transition to high-κ dielectric gates. Furthermore, a unique feature of the NH3 chemistry is the formation of NH radicals in the plasma, which is typically associated with greater process conformality. As device structures transition to 3D, this property may facilitate new nitridation applications utilizing NH3 plasma.

REFERENCES
[1] B. Chow, et al., “Extending Silicon Oxynitride Gate Dielectrics for the 90nm Node,” Semiconductor FabTech, 17th Edition – Wafer Processing, pp. 107-109, 2002.

[2] P.A. Kraus, et al., “Low-Energy Nitrogen Plasmas for 65nm node Oxynitride Gate Dielectrics: A Correlation of Plasma Characteristics and Device Parameters,” VLSI, 2003.

[3] P.A. Kraus, et al., “Further Optimization of Plasma Nitridation of Ultra-Thin Oxides for 65nm Node MOSFETS,” Semiconductor FabTech 23rd Edition – Wafer Processing, pp. 73-76, 2004.

[4] S. Inaba, et al., “Device Performance of Sub-50nm CMOS with Ultra-Thin Plasma Nitrided Gate Dielectrics,” IEEE-IEDM, pp. 651-654, 2002.

[5] T. Sasaki, et al., “Engineering of Nitrogen Profile in an Ultra-Thin Gate Insulator to Improve Transistor Performance and NBTI,” IEEE Electron Device Letters, Vol. 24, No. 3, pp. 150-152, 2003.

[6] S.H. Hong, et al., “The Development of Dual-Gate Poly Scheme with Plasma Nitride Gate Oxide for Mobile High Performance DRAMs: Plasma Process Monitoring and the Correlation with Electrical Results,” IEEE ICICDT, pp. 219-222, 2004.

[7] H.H. Tseng, et al., “Ultra-Thin Decoupled Plasma Nitridation (DPN) Oxynitride Gate Dielectric for 80nm Advanced Technology,” IEEE Electron Device Letters, Vol. 23, No. 12, pp. 704-706, 2002.

[8] A. Hegedus, et al., “Clustering of Plasma Nitridation and Post Anneal Steps to Improve Threshold Voltage Repeatability,” IEEE Transactions on Semiconductor Manufacturing, Vol. 16, No. 2, pp. 165-169, 2003.

[9] T. Bieniek, et al., “Ultra-Shallow Nitrogen Plasma Implantation for Ultra-Thin Silicon Oxynitride (SiOxNy) Layer Formation,” Journal of Telecommunications and Information Technology, pp. 70-75, 2005.

AUTHORS
David Chu is global product manager of the Gate Stack and Oxidation Products unit of the Front End Products business unit at Applied Materials. He received his Ph.D. in materials science and engineering from the University of California, Berkeley.

Wei Liu is a senior member of technical staff in the Gate Stack and Oxidation Products unit of the Front End Products business unit at Applied Materials. He holds his Ph.D. in chemistry from the University of British Columbia, Canada.

Theresa Guarini is a process engineer in the Gate Stack and Oxidation Products unit of the Front End Products business unit at Applied Materials. She received her Ph.D. in applied physics from Stanford University.

Nathan Sanchez is program manager for the Gate Stack and Oxidation Products unit of the Front End Products business unit at Applied Materials. He holds his B.S. in mechanical engineering from the Massachusetts Institute of Technology.

ARTICLE CONTACT
[email protected]

PROCESS SYSTEM USED IN STUDY
Applied Centura® DPN HD™

IMPROVING TUNGSTEN CHEMICAL MECHANICAL PLANARIZATION for Next-Generation Applications

Tungsten CMP is becoming a more frequent and challenging process in advanced integrations. Here, preservation of device topography through precision planarization end-pointing is crucial to successful device performance. Future applications, such as 3D memory structures, will demand even greater removal control, while lower costs and higher productivity will be necessary if these new integration schemes are to be cost competitive. Dual-wafer polishing and real-time process control will help satisfy these diverse requirements.

KEYWORDS
Tungsten, CMP, Slurry, Pad, Planarization

Tungsten is used throughout the transistor creation level, i.e., front-end, of advanced 3D memory in buried word line and buried bit line, local interconnects, and contacts to transistors. Tungsten is favored over other metals for its large thermal budget for subsequent processing steps. In addition, tungsten’s conductivity, lack of diffusion/interaction with other materials, and proven performance in logic and memory devices also make it the likely metal for next-generation applications.

Like other metal CMP applications, the fundamental mechanism for tungsten removal is based on oxidizing the metal surface and subsequently abrading the metal oxide to expose a new layer of unoxidized metal. This process repeats itself until all the metal is removed over a stop layer (typically a dielectric material, e.g., TEOS, USG), which is also referred to as clearing the metal. The remaining metal in the via or trench undergoes further removal (metal dishing) by this oxidation process. Controlling the removal process to clear the tungsten over oxide and ensure uniform removal within the dies, both within wafer (WiW) and wafer-to-wafer (WtW), is the key to meeting next-generation tungsten CMP performance requirements.

This article looks at two technologies to help improve tungsten CMP performance and lower costs. First, it shows how adapting an in-situ tungsten thickness sensor and a multi-zone polishing head improves WtW, WiW, and within die uniformity. Second, it investigates dual-wafer tungsten CMP and its ability to lower costs. The combination and synergy of these two technologies in a single product enables tungsten CMP for future on-wafer performance, throughput, and cost requirements.

EXPERIMENTAL SET-UP
Dual-wafer and single-wafer polishing platforms were used in these comparative studies. Both were equipped with production-proven stacked polishing pads comprising a top pad composed of a high durometer polyurethane over a high compression sub-pad, i.e., a hard pad, with concentric grooves. For these experiments, the polish slurry was a silica-based abrasive slurry with 2% H2O2. The polishing test was conducted on both blanket tungsten and test pattern wafers. An RS-100 four-point probe and an eddy current sensor measured blanket tungsten film thickness and profile.
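A four-point probe reports sheet resistance rather than thickness directly, so a blanket-film thickness follows from the relation t = ρ/Rs. The sketch below shows that conversion; it assumes a uniform film resistivity, and the 10 µΩ·cm value and the sheet resistance readings are hypothetical examples, since the article does not state the resistivity used.

```python
def film_thickness_angstrom(sheet_resistance_ohm_sq, resistivity_uohm_cm):
    """Convert sheet resistance to film thickness using t = rho / Rs.

    sheet_resistance_ohm_sq: four-point probe reading in ohms per square.
    resistivity_uohm_cm: assumed film resistivity in micro-ohm centimeters.
    Returns thickness in angstroms.
    """
    if sheet_resistance_ohm_sq <= 0:
        raise ValueError("sheet resistance must be positive")
    resistivity_ohm_cm = resistivity_uohm_cm * 1e-6
    thickness_cm = resistivity_ohm_cm / sheet_resistance_ohm_sq
    return thickness_cm * 1e8  # 1 cm = 1e8 angstroms


if __name__ == "__main__":
    ASSUMED_RESISTIVITY_UOHM_CM = 10.0  # hypothetical thin-film value
    # Hypothetical readings from wafer center to edge, in ohms per square.
    for rs in (0.50, 0.52, 0.55):
        t = film_thickness_angstrom(rs, ASSUMED_RESISTIVITY_UOHM_CM)
        print(f"Rs = {rs:.2f} ohm/sq -> {t:.0f} A")
```

Because thin-film resistivity generally depends on the deposition process and on thickness itself, a practical conversion would rely on a calibrated resistivity rather than a single assumed constant.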


TUNGSTEN REMOVAL PROFILE CONTROL
Profile control has been implemented for 300mm tungsten CMP via closed-loop, real-time WiW control of bulk tungsten removal. The method uses a high-resolution sensor for in-situ measurement of tungsten thickness across the wafer during polishing. The high-resolution sensor is a critical component of process control technology due to the challenge of the relatively high resistance of the relatively thin tungsten film. The bulk tungsten removal process is controlled via real-time, in-situ incremental changes to polishing conditions, using a multi-zone polishing head. These controls have demonstrated more uniform tungsten clearing, resulting in lower and more consistent WiW tungsten and dielectric loss than is possible with an open-loop process.

Figure 1 shows the elements of the control system. The eddy current-based sensor makes in-situ, millimeter-scale measurements of tungsten thickness from the center to the edge of the film.[1] During polishing, the software model instantaneously analyzes the tungsten thickness profile and determines which areas are diverging from the target. It then generates incremental modifications to the polish recipe parameters, feeding these forward to the CMP system at discrete, user-defined intervals to correct the deviations.

The process control model is constructed from a physical model of the hardware components and the characteristic response of the tungsten CMP process for a given set of input parameters. The system can be adapted to a variety of input parameters, including pressure, velocity, and slurry flow. The polish head incorporates multiple independent annular pressure zones located behind the wafer. The simple graphical user interface allows the user to optimize system performance by adjusting relatively few parameters.

The evolution of the tungsten film profile during a CMP process is shown in Figure 2, in which the open-loop polish (Figure 2a) was conducted with fixed input pressures and the closed-loop polish (Figure 2b) with variable input pressures on different zones depending on signal feedback. Polishing conditions were the same at the start of both experiments. Under open-loop conditions, the polish rate at the edge dropped over time, resulting in an edge-thick profile of the remaining film (edge thickness of 450Å vs. center thickness of 250Å). Under closed-loop conditions, the system increased the pressure to the edge zones as polishing proceeded, achieving a much more uniform post-polish profile. The 200Å difference in remaining edge thickness between the two polish modes is significant, as the comparison of post-polish profiles in Figure 2c shows.

Figure 1. (a) Schematic of eddy current sensor for tungsten thickness measurement. (b) Elements of the closed-loop profile control system. [Panel (a) labels: Wafer, Bulk Conductor Layer, Driving Flux, Opposing Flux, Eddy Current, Electromagnetic Sensor, Udrive, Usignal; D can be correlated to Usignal (amplitude) and to the phase difference between Udrive and Usignal. Panel (b) blocks: Desired Profile, In-Situ Profile Measurement, Profile Error, Process Control Software Model, Incremental Adjustment, CMP System.]

Figure 2. (a) Open-loop and (b) closed-loop polishing profiles of tungsten films at 3.5psi, and (c) a film profile comparison. [Plots: tungsten thickness (Å) vs. position (mm), -150 to 150mm; panel (c) shows 4-point probe measurements for open loop (1 wafer) and closed loop (4 wafers).]

DUAL-WAFER POLISHING OF TUNGSTEN
A dual-wafer polishing configuration was developed to advance CMP manufacturing capabilities in which two wafers were polished simultaneously on the same platen (42-inch). The two heads were controlled independently with dedicated pressure regulators and closed-loop control systems. Figure 3 demonstrates the closely matched performance on the two wafers.

Figure 3. Comparison of in-situ, real-time tungsten film profile monitoring with closed-loop control on dual heads A and B. [Plot: post tungsten film thickness (Å, eddy current measurement) vs. wafer diameter (mm) for Side A and Side B.]

In describing its polish rate as a function of platen revolutions per minute and polish pressure, tungsten CMP closely follows the Preston equation RR = k*P*V, where:

RR = removal rate
k = empirical constant taking into account such parameters as polishing pad surface and slurry availability at the wafer’s surface
P = average polishing pressure at the wafer’s surface
V = average polishing velocity between wafer and pad surfaces

As shown in Figure 4, the rate of tungsten removal is linearly proportional to the down force and the linear velocity experienced at the wafer surface. The linearity is almost independent of the polish system as the data from a 30-inch platen system and those from a 42-inch platen closely match a single fit line. The smaller diameter of the 30-inch platen limits the process window whereas a 42-inch platen widens it and enhances the removal rate by 50%.

Figure 4 also illustrates the increase in removal rate achieved by using a dual-wafer platen. While each wafer can still be independently controlled, an additional synergy occurs when two wafers share the same platen. This synergy dramatically enhances the polish rate and significantly reduces slurry consumption. In our tests, we observed an average of 20% enhancement in productivity and 50% savings in slurry usage. The rate enhancement also enables flexibility in pad selection. Pads that are typically eliminated on the basis of a lower removal rate can be reconsidered for their superior defect performance and/or cost advantage.

Figure 4. (a) Tungsten CMP exhibits the Prestonian relationship regardless of platen size, but (b) dual-wafer polishing increases the removal rate. [Panels: (a) removal (Å, 60sec) vs. Prestonian, pressure x linear velocity (psi*inch/sec), for 30” and 42” platens at 3psi/73rpm and 4psi/115rpm; (b) removal rate (Å/min) vs. Prestonian, pressure x linear velocity (psi*inch/sec), for dual heads per platen and single head per platen.]

Figure 5 illustrates a typical example of pattern wafer performance. The tungsten plugs remained intact after polish with minimal plug topography (24Å).


Figure 5. Tungsten CMP performance on patterned wafers (Applied Materials internal test pattern, 50nm plug arrays). (a) SEM shows intact plugs and (b) AFM shows minimum plug recess. [Panel labels: (a) FOV = 3μm; (b) Section Analysis, height (nm) vs. distance (nm), Plug Recess = 24Å.]
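The Preston relationship and the closed-loop pressure adjustment described above can be illustrated with a small numerical sketch. It is not the process control model used in the study: the two-zone wafer, the rate constant, the linear velocity, the decaying edge rate, the controller gain, and the proportional pressure correction are all hypothetical choices, made only to show how RR = k*P*V, a single fit line through the origin, and zone pressure feedback fit together.

```python
def preston_rate(k, pressure, velocity):
    """Preston equation: removal rate RR = k * P * V."""
    return k * pressure * velocity


def fit_preston_k(pv_products, removal_rates):
    """Least-squares slope of RR versus P*V for a line through the origin."""
    numerator = sum(x * y for x, y in zip(pv_products, removal_rates))
    denominator = sum(x * x for x in pv_products)
    return numerator / denominator


def polish(closed_loop, steps=50, dt=1.0):
    """Simulate a two-zone polish; return remaining thickness per zone (angstroms).

    All constants below are invented for illustration.
    """
    k0 = 0.1             # Preston constant, angstrom / (psi * inch)
    velocity = 100.0     # linear velocity, inch / s
    base_pressure = 3.5  # psi applied to every zone at the start
    gain = 0.05          # psi of correction per angstrom of thickness error
    thickness = {"center": 2000.0, "edge": 2000.0}
    pressure = {"center": base_pressure, "edge": base_pressure}

    for step in range(steps):
        t = step * dt
        # Assumed disturbance: the edge rate constant decays linearly in time.
        k = {"center": k0, "edge": k0 * (1.0 - 0.005 * t)}
        for zone in thickness:
            removed = preston_rate(k[zone], pressure[zone], velocity) * dt
            thickness[zone] = max(thickness[zone] - removed, 0.0)
        if closed_loop:
            mean = sum(thickness.values()) / len(thickness)
            for zone in thickness:
                # Zones thicker than average get more pressure, thinner get less.
                error = thickness[zone] - mean
                pressure[zone] = max(base_pressure + gain * error, 0.0)
    return thickness


if __name__ == "__main__":
    for mode in (False, True):
        result = polish(closed_loop=mode)
        label = "closed loop" if mode else "open loop"
        print(f"{label}: center {result['center']:.0f} A, edge {result['edge']:.0f} A")
```

With these invented numbers the open-loop run ends edge-thick, while the feedback run raises the edge pressure and narrows the center-to-edge difference. A production controller would use gains, zone counts, and update intervals derived from the tool's own hardware and process model.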

PERFORMANCE CONSISTENCY
In a 5000-wafer marathon to test the validity of the above studies, the tungsten CMP process demonstrated a consistent removal rate of 3700Å/min with average WiW non-uniformity of less than 2.1% and WtW non-uniformity of 3.7%. With the eddy current sensor, the consistent polishing performance results in accurate in-film stopping capability (Figure 6).

Figure 6. Consistent in-film stopping at 650Å remaining through 5000-wafer marathon. [Plot: remaining thickness (Å, @ 28% EP) vs. wafer number, 0 to 5000, with 10% control lines; Day 1 through Day 5 separated by wet idle periods.]

CONCLUSION
Real-time process control enables the greater precision and uniformity of tungsten removal and dishing required for next-generation devices, such as 3D memories. In addition, real-time process control enables dual-wafer polishing in a production environment. Real-time process control ensures the same on-wafer performance for both of the wafers being polished. Dual-wafer polishing inherently improves productivity/throughput, but more importantly uses substantially less slurry without compromising removal rate, uniformity, and defectivity. By combining both technologies in a single product, both on-wafer performance and productivity/cost for future tungsten CMP applications can be achieved simultaneously.

REFERENCES
[1] D. Bennett, et al., “Real-Time Profile Control: Advanced Process Control for Copper CMP,” Semiconductor Fabtech, Vol. 22, pp. 33-35, 2004.

AUTHORS
Sidney Huey is a global product manager in the Chemical Mechanical Planarization business unit at Applied Materials. He holds his M.S. in mechanical engineering from Princeton University.

Stan Tsai is a technology manager in the Chemical Mechanical Planarization business unit at Applied Materials. He received his Ph.D. in chemistry from the University of Alberta, Canada.

Zhefu Wang is a process engineer in the Chemical Mechanical Planarization business unit at Applied Materials. He holds his Ph.D. in mechanical engineering from Oregon State University.

Rixin (Vince) Peng is a process engineer in the Chemical Mechanical Planarization business unit at Applied Materials. He received his B.S. in chemical engineering from the University of California, Berkeley.

ARTICLE CONTACT
[email protected]

PROCESS SYSTEMS USED IN STUDY
Applied Reflexion® GT™ CMP
Applied Reflexion® LK™ CMP

Applied Materials, Inc. Nanochip Technology Journal Volume 9, Issue 2, 2011

A HIGH PRODUCTIVITY ALD-LIKE CONFORMAL OXIDE LINER for ≤20nm Technology Nodes

KEYWORDS
ALD, Liner, Oxide, Low-Temperature, Conformal, Pattern Loading Effect

Complex multi-level logic and memory structures require a large number of thin conformal oxide layers, but low thermal budgets rule out furnace processes. Additionally, conformality and pattern loading effects become significant challenges at the aggressive aspect ratios and high feature densities of sub-20nm geometries. A new low-temperature process deposits an oxide film with conformality and pattern independence similar to atomic layer deposition, but scalable from 10Å to 300Å at high productivity, making it suitable for diverse applications.

With device geometries shrinking to sub-20nm levels, the integration schemes for manufacturing complex multi-level logic and memory structures require an increased number of applications for thin conformal oxide films (typically 10Å-300Å thick). Besides good conformality, these applications require minimal to no pattern loading and film properties similar to those of thermal oxides. Further, at advanced technology nodes, the thermal budget of these liners is limited to less than 400°C. Furnaces have historically been used for thin oxide film deposition, but they typically operate at much higher temperatures (>600°C) and film properties and deposition rates degrade as the thermal budget is lowered. Atomic layer deposition (ALD) techniques can provide near 100% conformality and zero pattern loading, but deposition rates are extremely low and productivity concerns rule out scaling these technologies for mid-range thicknesses nearing 300Å.

An ALD-like conformal oxide film that deposits at low temperatures has been developed to address the thermal budget concerns for these thin oxide liners. This ALD-like process uses a new silicon precursor with low sticking coefficient to achieve near 100% conformality and minimal pattern loading, as well as high deposition rates and extremely good film quality. Further, the process can be scaled from very thin films approaching 10Å up to 300Å at high productivity, making it suitable for an array of spacer and thin oxide film applications.

EXPERIMENTAL SET-UP AND RESULTS
The reactor used for this film was a modified HARP design[1] with the unique capability of in-situ plasma treatments using different inert gases, which densify the film and result in lower wet etch rate ratios (WERR). It also allows for cyclic processing similar to ALD reactors with plasma treatments. Precursor flow control and hardware were optimized to enable uniform mixing of individual reactants, facilitating the short deposition times necessary for thickness control in the angstrom regime.

For process development, such parameters as temperature, precursor flow, oxidative ambient flow/concentration, and heater spacing were varied. Step coverage and pattern loading effect (PLE) were evaluated using cross-sectional TEM images. Based on these experiments, a process regime was identified that shows extremely good conformality and minimal PLE. Figure 1 shows several examples. Figure 1a is a cross-section of TEM images of via features with ~2:1 aspect ratio. The images were taken in the via array center and the via array edge. The TEM measurements reveal >99% film conformality. Figure 1b shows the ALD-like oxide liner deposited on 4:1 aspect ratio structures on a dense array of trenches and a trench feature adjacent to a wide open area. Here, the film was >99% conformal and PLE measured only ~3Å from the dense sidewall thickness to the thickness in the open field. For oxide liner applications in deep trenches, such as shallow trench isolation (STI) liners or buried word lines and buried bit lines in advanced DRAM, the oxide liner conformality was also measured in high aspect ratio (7:1) STI features. Figure 1c shows an STI structure on which >99% conformality was achieved.

Figure 1. The new liner exhibits >99% conformality on aspect ratios from (a) a 2:1 via to (b) 4:1 trenches to (c) 7:1 STI in addition to minimal PLE. Panels: (a) via at pattern edge and center via, with liner thickness readings of about 24nm; (b) dense and open trenches, with readings of 69Å to 74Å; (c) STI before liner and with liner deposited. Annotations: Step Coverage: Sidewall/Top: 100%; Bottom/Top: 100%. Pattern Loading Effect: Top/Open: 96%; Sidewall/Open: 96%.

The low deposition temperature (350°C) makes the new liner suitable for a variety of advanced applications in which lowering the thermal budget is critical. Several in advanced memory and logic require oxide deposition on metals, such as TiN or W (e.g., on a gate in logic or flash, or a word or bit line in memory). Higher temperatures are not suitable because of the oxidation of the underlying metal, which may inherently increase the resistivity of the metal line. The TEMs in Figure 2 show deposition on W and TiN with virtually no oxidation of the underlying metal.

Figure 2. Low-temperature deposition of new oxide liner allows for deposition on metal lines without oxidation of underlying metal. Panels: TiN substrate, pre-deposition and with ~150Å oxide liner (minimal oxide growth after liner deposition); W substrate, with 30Å oxide prior to liner deposition and 35Å oxide after liner deposition.
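The step coverage and pattern loading percentages annotated on Figure 1 are simple thickness ratios. The sketch below shows that arithmetic in Python. The function names are hypothetical, and the 71Å (dense array) and 74Å (open field) inputs are an assumed reading of the Figure 1b labels, chosen because they are consistent with the ~3Å difference and the 96% ratio quoted above; they are not tabulated in the article.

```python
def ratio_percent(thickness_a, reference_a):
    """Return thickness_a as a percentage of reference_a (both in angstroms)."""
    return 100.0 * thickness_a / reference_a


def step_coverage(sidewall_a, bottom_a, top_a):
    """Step coverage: sidewall and bottom thickness relative to the top of the feature."""
    return {
        "sidewall/top": ratio_percent(sidewall_a, top_a),
        "bottom/top": ratio_percent(bottom_a, top_a),
    }


def pattern_loading(dense_a, open_a):
    """Pattern loading effect: dense-array thickness relative to the open field."""
    return {
        "dense/open": ratio_percent(dense_a, open_a),
        "delta_angstrom": open_a - dense_a,
    }


# Assumed readings (angstroms) taken from the Figure 1b labels for illustration.
coverage = step_coverage(sidewall_a=71.0, bottom_a=71.0, top_a=71.0)
loading = pattern_loading(dense_a=71.0, open_a=74.0)

print(coverage)   # both ratios are 100.0 for equal thicknesses
print(round(loading["dense/open"]), loading["delta_angstrom"])
```

With these assumed inputs the dense/open ratio rounds to 96% and the thickness difference is 3Å, which matches the values annotated on the figure.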


Electrical properties for the new oxide liner were obtained on both planar and 3D MOSCAP structures. For a planar MOSCAP, the top and bottom electrodes are separated by a dielectric in the horizontal plane (Figure 3). The 3D MOSCAP is conceptually similar to a planar MOSCAP, except that the dielectrics between the two electrodes are no longer planar. Three dielectrics are present: SiN on top of the silicon fin, a sidewall dielectric, and a bottom STI oxide. The sidewall dielectric is designed to be much thinner than the top SiN and bottom STI oxide, thus all current-related measurements are dominated by the sidewall. The total sidewall area is designed to be much larger than the area of the top SiN and bottom STI oxide. The total measured capacitance is dominated by the sidewall dielectric due to this combination of much larger area and much thinner dielectric. The oxide liner exhibited excellent electrical and isolation characteristics. Figure 3 shows planar MOSCAP I-V data for 100Å of conformal oxide compared to 100Å of high quality thermal oxide, demonstrating a higher breakdown voltage (>10MV/cm) for the new liner. Also shown is the sidewall breakdown voltage (using 3D MOSCAP wafers), which measures the insulation and electrical characteristics of the film on the sidewall. Again, the oxide liner performance was superior to that of a 100Å high quality thermal oxide film.

The modified HARP reactor allows for in-situ densification of the oxide liner using plasma treatments. These can be cycled in blocks of thickness depending on the final thickness and WERR requirement of the film. Several applications for new oxide liners require improved WERR in wet clean environments. The new oxide liner exhibits a WERR of <2.3 in 100:1 DHF compared to thermal oxides. The improvement in film quality with the plasma treatment manifests itself in film stress stability as a function of time, as well as in Fourier-transform infrared (FTIR) spectra. Figure 4 shows excellent film stress stability over time in addition to FTIR spectra confirming good quality oxide with no Si-H, Si-OH or C present.
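The breakdown figure above is quoted in MV/cm, which is a field rather than a voltage. As a worked illustration, the sketch below converts between the two for a given film thickness using E = V/t. The helper names are hypothetical and the 100Å, 10MV/cm inputs are simply the values quoted in the text.

```python
ANGSTROM_TO_CM = 1e-8  # 1 angstrom = 1e-8 cm


def breakdown_voltage(field_mv_per_cm, thickness_angstrom):
    """Voltage (V) across a film of the given thickness at the given field (MV/cm)."""
    return field_mv_per_cm * 1e6 * thickness_angstrom * ANGSTROM_TO_CM


def breakdown_field(voltage_v, thickness_angstrom):
    """Field (MV/cm) for a given applied voltage (V) and film thickness (angstroms)."""
    return voltage_v / (thickness_angstrom * ANGSTROM_TO_CM) / 1e6


# A 100 angstrom film at 10 MV/cm corresponds to roughly 10 V across the film.
print(breakdown_voltage(10.0, 100.0))
print(breakdown_field(10.0, 100.0))
```

This is only a unit conversion; it says nothing about how the measurements in Figure 3 were actually taken.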

Figure 3. (a) Cross-section of planar MOSCAP and 3D MOSCAP. Labeled layers: pad metal, poly gate, gate dielectric, Si (planar); pad metal, poly, SiN, dielectric, SiO2 (3D). (b) (c) Excellent breakdown voltage on both planar and 3D MOSCAPS. Series: 2X 50Å liner and 100Å thermal oxide film. Axes: I (Amps), 1.00E-12 to 1.00E+00, vs. Vbd (MV/cm), 0 to 20.


Figure 4. (a) The new oxide shows stable compressive stress (-135 MPa) over time, while (b) FTIR spectra show stable film without detectable C or H bonds. Axes: (a) Stress (MPa) vs. Time (hrs); (b) Absorption (au) vs. Wavenumber (cm-1), with spectra at 5min, 1hr, 2hr, and 3hr and labeled Si-O stretch and Si-O bend peaks.

DISCUSSION
Advanced semiconductor fabrication requires thin films to be deposited uniformly on patterned structures with varying geometries (trenches, fins, vias, and islands) and a wide range of aspect ratios. Line-of-sight processes (PVD, PECVD) do not provide good conformality on aggressive structures, but ALD and CVD provide viable alternatives. CVD processes in certain growth regimes are capable of high conformality without sacrificing film growth rates. In the CVD literature, the problem of identifying conformal growth regimes is addressed by the surface reaction probability β (also called sticking coefficient), defined as the probability that a precursor molecule will adsorb to form a film per unit molecular collision.[2,3] For high conformality, lower values of β are preferred, which implies that the precursor consumption rate is low, resulting in a smaller gradient in precursor partial pressure with increasing depth in the feature.[4]

The new ALD-like process lowers the surface reaction probability β by changes in process, as well as by employing a new precursor for film growth. The two key features of the new process regime that improve conformality are high pressure (>500Torr) and choked oxidative ambient flow. Higher pressure reduces the precursor partial pressure gradient from feature top to bottom and the choked oxidative ambient flow reduces the precursor consumption rate. Both these parameters help to operate in a surface reaction limited regime with lower values of surface reaction probability β.

The improved conformality with the new precursor is attributed to the reduction in the concentration of the higher order intermediates and their participation in film growth. As shown in Figure 5, the new precursor forms a stable intermediate while higher order reactive species like poly-siloxanes are not formed. (These heavier species have low mobility and much higher sticking coefficients, leading to an increased surface reaction rate and severely worsening film conformality.) The secondary reactive species are produced by gas phase reactions and/or surface reactions on the trench walls and thus have a higher concentration gradient (with feature depth) than that of the first order intermediates.

Figure 5. The >Si=O moieties of the new precursor are very stable and no further polymerization is possible to bigger chains. Low sticking coefficient of these monomers promotes less selective and highly conformal film growth with minimal PLE. Panels: TEOS as precursor (reaction with O3 to a silanol, then to higher order silanols with Si-O-Si linkages); new precursor (reaction with O3).

PLE has been another major concern with traditional CVD oxide liners as the deposition rate is inversely correlated with the pattern area density of the substrate.[5,6] One of the keys to pattern insensitivity is eliminating mass transport as a limiting step in the deposition mechanism.[7] For this to occur, precursor concentration must be constant through the boundary layer over the entire deposition surface (Figure 6). This is achieved in two ways: 1) the oxidative ambient is limited to reduce gas phase reaction and confine the reaction closer to the substrate surface and 2) the reaction times are kept short and cycled to minimize any depletion that may occur during deposition. The deposition rate limiting step then becomes the surface reaction wherein the reactive intermediates are incorporated into the bulk film.

Figure 6. Incubation effects on deposition mechanism. Upper panel: low-temperature CVD process, Thickness (Å) vs. Deposition Time (s), with a conformal (low pattern loading) region and a selective (high pattern loading) region. Lower panels: gas and oxide film separated by a boundary layer of thickness δ. Cg: bulk gas phase concentration of reactive intermediate [e.g., (NEt)HSi=O]. Cs: concentration of the reactive intermediate at the interface.

Figure 7 shows that the deposition rate is insensitive to temperature, precursor flow, and spacing, suggesting that the process is indeed operating in a surface reaction limited regime. This is achieved by depleting oxidative ambients available for the CVD reaction. Supplying a choked oxidative ambient flow substantially confines the reaction closer to or on the patterned substrate surface, thereby promoting uniform deposition thickness regardless of the local exposed pattern area density. Furthermore, the deposition thickness is limited, terminating the CVD reaction before the silicon reactant concentration in the boundary layer decreases.

Figure 7. Deposition rate independence from temperature, spacing, and precursor flow suggest that the reaction is in a surface reaction limited regime. Panels: Temperature (C), 340 to 360; Spacing (mils), 150 to 350; Flow (mgm), 400 to 1200.

This methodology of operating in a surface reaction limited regime and cyclic deposition would work with differing precursors, such as TEOS, which is commonly used for CVD oxide depositions. However, to minimize PLE, the new precursor was selected, because its small siloxane intermediates are stable and inhibit further polymerization. As these radical species are much smaller (compared with TEOS), they have low sticking coefficients, which results in a high net diffusion rate through the boundary layer that promotes film growth with minimal PLE.

CONCLUSION
A new ALD-like oxide liner with excellent electrical film properties has been developed that achieves maximum conformality and negligible PLE. The film can be used for multiple thin oxide applications, such as STI liners, gate and implant spacers, oxide hard masks, and contact liners. Furthermore, a single CVD reactor can be used to deposit film thicknesses ranging from 10Å to 300Å at high productivity.
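The Discussion's link between a low sticking coefficient β and high conformality can be illustrated with a textbook-style diffusion-reaction estimate. The sketch below is not the authors' model. It assumes a cylindrical via, free-molecular (Knudsen) transport inside the feature, a first-order surface reaction, no consumption at the via bottom, and a profile that does not change during growth. Under those assumptions the bottom-to-top thickness ratio is 1/cosh(AR·√(3β)), where AR is depth divided by diameter. The function names are hypothetical.

```python
import math


def step_coverage(beta, aspect_ratio):
    """Bottom/top thickness ratio for a cylindrical via (simplified model).

    Assumes Knudsen diffusion (D = v*d/3) and wall loss flux beta*v*C/4,
    which gives d2C/dz2 = (3*beta/d**2)*C and C(bottom)/C(top) = 1/cosh(phi).
    """
    phi = aspect_ratio * math.sqrt(3.0 * beta)
    return 1.0 / math.cosh(phi)


def max_beta_for_coverage(target_coverage, aspect_ratio):
    """Largest beta that still reaches the target bottom/top ratio in this model."""
    phi = math.acosh(1.0 / target_coverage)
    return (phi / aspect_ratio) ** 2 / 3.0


# Coverage falls as beta rises for a 7:1 feature.
for beta in (1e-5, 1e-4, 1e-3, 1e-2):
    print(beta, round(step_coverage(beta, 7.0), 4))

# Under these assumptions, >99% coverage at 7:1 needs beta on the order of 1e-4.
print(max_beta_for_coverage(0.99, 7.0))
```

The trend is the point of the example: in this simplified picture, the higher the aspect ratio, the smaller β must be to keep the precursor concentration nearly uniform from the top of the feature to the bottom.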


REFERENCES
[1] C. Ching, et al., “Improving Electrical Performance Using SACVD Oxide Films,” Semiconductor International, April 1, 2008.
[2] J.C. Rey, et al., “Monte Carlo Low Pressure Deposition Profile Simulations,” Journal of Vacuum Science and Technology, May 1991.
[3] A. Sanger, et al., “Chemical Vapor Deposition of Tungsten Silicide (WSix) for High Aspect Ratio Applications,” Thin Solid Films, October 2003.
[4] A. Yanguas-Gil, et al., “Highly Conformal Film Growth by Chemical Vapor Deposition. I. A Conformal Zone Diagram Based on Kinetics,” Journal of Vacuum Science & Technology A, Vol. 27, Issue 5, 2009.
[5] A.H. Labun, et al., “Mechanistic Feature-Scale Profile Simulation of SiO2 Low-Pressure Chemical Vapor Deposition by Tetraethoxysilane Pyrolysis,” Journal of Vacuum Science & Technology B, Vol. 18, Issue 1, 2000.
[6] M.K. Gobbert, et al., “Mesoscopic Scale Modeling of Microloading During Low Pressure Chemical Vapor Deposition,” Journal of the Electrochemical Society, Vol. 142, Issue 8, 2003.
[7] J.W. Smith, et al., “Pattern-Dependent Microloading and Step Coverage of Silicon Nitride Thin Films Deposited in a Single-Wafer Thermal Chemical Vapor Deposition Chamber,” Journal of Vacuum Science & Technology B, Vol. 23, Issue 6, 2005.

AUTHORS
Cary Ching is a product manager in the Dielectric Systems and Modules business unit at Applied Materials. He holds his M.S. in materials science from Rensselaer Polytechnic Institute.

Sidharth Bhatia is a process engineer in the Dielectric Systems and Modules business unit at Applied Materials. He received his Ph.D. in materials science from Brown University.

Paul Gee is a process engineering manager in the Dielectric Systems and Modules business unit at Applied Materials. He holds his Ph.D. in chemical engineering from the University of California Los Angeles.

Bingxi Wood is a senior technology program manager in FEOL integration in the SSG CTO/MTCG group at Applied Materials. She received her Ph.D. in physics from Rensselaer Polytechnic Institute.

Ajay Bhatnagar is a director of Global Product Management in the Dielectric Systems and Modules business unit at Applied Materials. He holds his Ph.D. in materials science and engineering from Stanford University.

ARTICLE CONTACT
[email protected]

PROCESS SYSTEM USED IN STUDY
Applied Producer® GT™

OPTIMIZING NANO-POROUS DIELECTRICS for ≤28nm Applications

KEYWORDS
Interlayer Dielectrics, Nano-Porous Dielectrics, Ultraviolet, UV Curing, Ultra-Low-κ Dielectrics, Cross-Linking, Shrinkage

Nano-porous low-κ films are the standard interlayer dielectrics at 45nm, but their relative fragility poses challenges, such as susceptibility to downstream plasma damage or physical damage during packaging. However, new chemistries and improved ultraviolet curing have enhanced these films’ mechanical strength and damage-resistance through controlled polymerization and physical restructuring, producing a new generation of robust ultra-low-κ dielectrics that fulfill requirements for 28nm and beyond.

As semiconductor devices continue to scale in accordance with Moore’s Law, the ability to increase signal speed is significantly affected by the dielectric constant (κ) of the insulating materials sandwiched between the interconnects. To achieve desired electrical performance, the effective κ value (κeff) of the inter-level dielectric (ILD) combined with any associated barrier films must scale correspondingly. With packaging development and the move towards lead-free solder, chip packaging interaction becomes increasingly important to ensure high yields. The dielectric constant reduction must not be at the expense of film strength and compatibility with existing process techniques.

Plasma enhanced chemical vapor deposition (PECVD) carbon-doped oxides (CDO) were first introduced as a means of reducing RC delay at the 90nm node. As devices scaled below the 45nm node, the industry transitioned from dense CDO (dielectric constant, or κ of 3) to nano-porous CDO in which porosity was introduced to lower the dielectric constant below 2.5. Today, six years after the introduction of nano-porous CDOs, the industry is seeking to extend nano-porous CDOs to 28nm and below through innovation of the dielectric film and subsequent UV cure process.

IMPROVING FILM STRENGTH AND REDUCING DIELECTRIC CONSTANT
Nano-porous low-κ films with a dielectric constant of approximately 2.5 are the industry standard for the 45/32nm node. These films are most effectively fabricated using a two-step process. First, PECVD is used to create a nano-composite composed of an organo-silicate glass (OSG) backbone, simultaneously deposited with a thermally labile organic (TLO) phase that will partially define the final pore space. Second, an advanced curing step, optimally an ultraviolet (UV) cure, is applied to the film to remove the labile phase and to restructure the remaining matrix to form the final nano-porous film.[1] This porosity helps to reduce the final dielectric constant of the film; however, it can also make the film extremely susceptible to material damage during subsequent wet and dry processing steps.


At 45/32nm, this challenge has been successfully managed. Achieving ultra-low-κ (ULK) film (κ~2.2) for scaling to 28nm and below, however, requires re-engineering of the ILD’s composition and structure to improve its ability to withstand subsequent downstream plasma etch, photoresist ash, wet clean, and chemical mechanical planarization (CMP), and maintain required mechanical properties.

To determine the most effective chemistry for making a robust nano-porous low-κ film, comparative studies were conducted of various organosilane precursors with different carbon-based backbone structures and an organic precursor for the labile, pore-forming phase. Among the organosilane precursors, the key differences included the nature of the carbon bridge (seen to largely determine resistance to integration damage) and methyl content. Comparisons of commercially available options showed that a new, proprietary precursor needed to be developed as the methyl in the existing chemicals tended to decrease the mechanical strength contributed by the carbon.

Engineering a precursor with the appropriate carbon content and structure provided the matrix needed to control film properties and maintain chemical resistance. As with the existing generation of 45nm films, the key to achieving the desired mechanical strength (elastic modulus, E, and hardness, H) and dielectric constant was applying a UV cure to the film. This process, in cooperation with the engineered nano-porous film, promotes controlled cross-linking of the dielectric backbone, creating a denser, stronger material. It also drives out the labile species, thereby lowering the κ value. The new chemistry enables a wide range of dielectric constants while maintaining superior film mechanical properties (Table 1).

Ultraviolet curing leverages both thermal and photon energy in restructuring the film to generate porosity, enhances backbone structure, and modifies film composition to meet advanced logic requirements. However, the curing of nano-porous CDOs introduced new challenges requiring advances in curing process and hardware design.

Table 1. Key film properties over wide range of dielectric constants

Dielectric Constant (κ)   2.2    2.4    2.55   2.6    2.8
Modulus (GPa)             7.1    10.7   8      11.3   14.2
Hardness (GPa)            1.0    1.54   1.25   1.8    2.27
XPS Carbon (%)            23     17.3   19     >19    >19

CREATING NANO-POROUS LOW-κ FILM BY UV CURING
Curing of the nano-composite starts with TLO removal using UV photon wavelengths (200nm-300nm) to break the carbon-hydrogen (C-H) and carbon-carbon (C-C) bonds in the TLO. Figure 1a shows the C-H intensity, measured using FTIR absorption spectroscopy, as a function of UV curing time. The intensity is expressed as a percentage of C-H peak area remaining relative to that of an uncured sample. When exposed to UV light, a rapid decrease in C-H bonds occurs within approximately 15 seconds and then gradually slows down. The data set shows that TLO removal occurs very rapidly and can be accelerated by strengthening the UV intensity. This TLO removal behavior remains unchanged even for ULK film with a dielectric constant of 2.2.

Figure 1. (a) FTIR C-H intensity vs. UV curing time. Plots A and B represent ULK films cured under normal and higher UV intensity, respectively. Plot C represents κ≤2.2 films with increased TLO content cured under higher UV intensity. (b) Degree of cross-linking vs. UV curing time for ULK films. Axes: (a) C-H Intensity vs. Time (s); (b) Si-O-Si Network/Si-O-Si Total vs. Time (s).

During the curing process, cross-linking of the Si-O-Si network occurs in parallel with TLO removal. Figure 1b shows the extent of cross-linking as a function of UV cure time. As soon as ULK film is exposed to UV light, the Si-O-Si network starts to increase linearly as curing progresses. This indicates that cross-linking and TLO removal happen simultaneously in the early stage of the film re-structuring process, and that cross-linking then continues after TLO removal is complete. This behavior remains the same for all of the current porous ULK films cured under UV treatment, which suggests that it will also hold true for films with κ≤2.5 and next-generation ULK film with κ≤2.2.

Experiments were conducted to understand whether or not ULK film property behavior varies with different UV curing conditions. Film shrinkage is a control parameter often used as a measure of the film state to target desired dielectric constant, carbon content, and mechanical strength. Figure 2a and b illustrate the trends in dielectric constant, modulus, and carbon content with respect to shrinkage of ULK film cured under various wafer temperature and UV output power conditions. While elastic modulus increases linearly as film shrinkage progresses, dielectric constant drops and plateaus at a certain shrinkage range before starting to rise again as a result of greater loss in Si-CH3 bonds. From Figure 2, multiple conclusions can be made. First, the curing behavior of ULK film remains consistent regardless of UV curing conditions. In other words, at the same film shrinkage, comparable film properties can be achieved. Second, for every curing condition, an optimal window of process conditions exists in which elastic modulus increases while dielectric constant of the ULK film remains constant.

Figure 2. (a) Dielectric constant and modulus, and (b) methyl content with respect to film shrinkage of ULK film cured under various wafer temperatures and UV output powers. Series: 400°C, 75%; 400°C, 90%; 390°C, 90%; 410°C, 90%. Axes: Dielectric Constant (κ), Modulus (GPa), and Methyl Content (%) vs. Normalized Film Shrinkage.

SELECTING THE OPTIMAL UV WAVELENGTH
Effective curing of nano-porous CDOs starts with selection of the correct UV wavelength for efficient TLO removal (C-H, C-C) and Si-O-Si network cross-linking. The microwave lamp used as the UV source in this study can easily shift the wavelength distribution by the selection of the appropriate UV bulb. To identify the optimal UV bulb, two types with different wavelengths were used to cure nano-porous low-κ film (κ≤2.5). In addition, the effect of separating the TLO removal and cross-linking steps was studied to understand whether or not acceleration of the film restructuring process could be achieved by first removing TLO with a long wavelength UV bulb (300nm–500nm) followed by cross-linking with a short wavelength UV bulb (200nm-300nm). Heater temperature was adjusted to achieve comparable wafer temperature between the two bulbs. Figure 3 illustrates film shrinkage with respect to the cure time for different approaches. The data indicate that one-step curing with Bulb A was the most effective. Two-step curing did not show acceleration of film restructuring in this experiment. Figure 3b and Table 2 confirm that regardless of the curing approach, ULK film curing behavior is universal and comparable film properties are achievable for the same film shrinkage. Thus, reaching the target film shrinkage as fast as possible is the key factor for successful ULK curing.

Figure 3. (a) Effect of bulb type on film shrinkage. Bulb A and B are short wavelength (200–300nm) and long wavelength (300nm–500nm) bulbs, respectively. (b) Dielectric constant, modulus, and methyl content with respect to film shrinkage of ULK film cured under various UV bulbs. Series: Bulb A only; Bulb B only; Bulb B/Bulb A.

CURING PROCESS EFFICIENCY
As noted earlier, the key function of the UV curing process is to modify the molecular structure of the as-deposited ULK film through UV photon energy. Therefore, it is important to develop a curing process that can accelerate this restructuring process and reach the desired target as rapidly, uniformly, and consistently as possible.

The kinetics of removing the TLO and cross-linking can be enhanced without compromising the thermal budget by increasing the UV intensity.[2] However, the curing temperature plays a critical role in this process, as illustrated in Figure 4. ULK film was cured at various wafer temperatures for different cure times using broadband UV. The UV cure time series data demonstrate a linear relationship between film shrinkage and cure time with the trendline shifting to the left as wafer temperature increases, indicating higher curing efficiency. However, BEOL thermal budget limitation requires wafer temperature to be ~400°C.

Table 2. Post-cured properties of ULK film with similar shrinkage cured under various UV bulbs

Parameter                        Bulb A Only   Bulb B Only   Bulb B/Bulb A (2-Step)
Normalized Cure Time             1.0           3.0x          1.4x
Refractive Index                 1.349         1.347         1.350
Normalized Film Shrinkage        1.0           0.96          1.02
Dielectric Constant (5pts avg)   2.48          2.48          2.48
Hardness/Modulus (GPa)           1.28/8.3      1.27/8.2      1.32/8.6
Porosity (%)                     23.9          24.9          23.2
Pore Radius (Å)                  11.6          11.5          11.3
SiCH3/SiO (area %)               2.8           2.8           2.8
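The Table 2 comparison can be restated as a relative cure rate, meaning normalized shrinkage achieved per unit of normalized cure time. The short sketch below performs that arithmetic on the tabulated values. The dictionary layout and the "relative cure rate" metric are illustrative choices made here, not quantities defined in the article.

```python
# Values transcribed from Table 2 (normalized cure time, normalized shrinkage,
# dielectric constant, modulus in GPa).
TABLE_2 = {
    "Bulb A only": {"cure_time": 1.0, "shrinkage": 1.0, "k": 2.48, "modulus_gpa": 8.3},
    "Bulb B only": {"cure_time": 3.0, "shrinkage": 0.96, "k": 2.48, "modulus_gpa": 8.2},
    "Bulb B/Bulb A (2-step)": {"cure_time": 1.4, "shrinkage": 1.02, "k": 2.48, "modulus_gpa": 8.6},
}


def relative_cure_rate(row):
    """Normalized shrinkage per unit normalized cure time (illustrative metric)."""
    return row["shrinkage"] / row["cure_time"]


rates = {name: relative_cure_rate(row) for name, row in TABLE_2.items()}
fastest = max(rates, key=rates.get)

# Spread in modulus across the three approaches at similar shrinkage.
moduli = [row["modulus_gpa"] for row in TABLE_2.values()]
modulus_spread = max(moduli) - min(moduli)

for name, rate in rates.items():
    print(f"{name}: {rate:.2f}")
print("fastest:", fastest)
print(f"modulus spread: {modulus_spread:.1f} GPa")
```

On these numbers, one-step curing with Bulb A reaches comparable shrinkage in the least time, while dielectric constant is identical and modulus differs by only a few tenths of a GPa across the three approaches, consistent with the statement that film properties track shrinkage rather than curing approach.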


Figure 4. Relationship of wafer temperature to film shrinkage and curing efficiency. Series: wafer temperature 405°C, 420°C, 435°C, 445°C. Axes: Film Shrinkage (%) vs. Normalized Cure Time.

Cure pressure is another process variable that accelerates film restructuring. Figure 5 shows the relationship of pressure to film shrinkage at constant wafer temperature. To understand the effect of pressure on the restructuring process, ULK film was cured at low and high pressures and for various lengths of time. Figure 5a shows that lower cure pressure achieves the target film shrinkage approximately 40% faster than high cure pressure. Figures 5b and 5c show that for the same film shrinkage, low pressure performance is superior to high pressure cure, suggesting that the former accelerates restructuring of the as-deposited ULK film.

Figure 5. (a) Effect of curing pressure on film shrinkage. (b) Dielectric constant vs. film shrinkage. (c) Modulus vs. film shrinkage. Series: low pressure; high pressure. Axes: (a) Normalized Film Shrinkage vs. Normalized Cure Time; (b) Dielectric Constant vs. Normalized Film Shrinkage; (c) Modulus (GPa) vs. Normalized Film Shrinkage.

One of the major challenges in curing TLO-containing ULK film is managing the outgassed TLO by-products. As shown earlier, TLOs are removed very quickly. These organic by-products can unfortunately be easily re-deposited onto cold (<200°C) surfaces, such as the UV window and chamber body, leading to inconsistent and non-uniform curing that, in turn, compromises film properties and can have a negative impact on film defectivity levels. Consequently, maintaining a clean environment is imperative.

UV CHAMBER CLEANING
In-situ cleaning is essential for delivering fast, uniform, and consistent film restructuring by curing. Because the outgassed by-products are organic, an oxygen-based clean was expected to be best for removing them, producing mainly CO2 and H2 by-products that do not re-deposit on chamber surfaces. Two oxygen-based cleaning processes were evaluated: one ozone-based and the other remote plasma source-based. Both are known to create active oxygen radicals. The clean etch rate of the two approaches was compared using blanket photoresist and constant wafer temperature to eliminate the thermal effect. The data show an etch rate approximately 20% higher for the ozone-based cleaning, suggesting greater oxygen radical availability. An additional benefit of the ozone clean is that the dissociation process starts only when the gases are introduced into the chamber and are exposed to UV light (200nm–300nm) and hot (>200°C) surfaces, creating the chemically active oxygen radicals precisely where they are needed for effective and efficient cleaning.


OPTIMIZING CURING PROCESS UNIFORMITY
Purge flow is another critical parameter governing consistent and uniform cure within the wafer and from wafer to wafer. The purge flow prevents outgassed by-products from reaching the UV window so that uniform and consistent UV light reaches the wafer. Computational fluid dynamics (CFD) modeling was used to compare concentrations of outgassing species for side-to-side purge flow and uniform purge flow from an overhead source. Simulation data indicated that uniform purge flow was superior in keeping the window continuously clean (Figure 6a). A previous study has shown that the uniformity of the film shrinkage is highly dependent on the uniformity of wafer temperature. Side-to-side purge flow creates a temperature gradient on the wafer that can be correlated to the shrinkage profile. Uniform purge flow can reduce wafer temperature non-uniformity to <5°C.

Figure 6b shows the change in the film shrinkage maps of post-cured TLO-containing ULK (κ≤2.5) film using conventional UV hardware and optimized UV hardware. Shrinkage uniformity can be further enhanced through uniform purge flow and improved window engineering compared to conventional side-to-side purge flow and window configuration.

Delivering uniform incident UV light to the wafer is essential for uniform film curing. UV source rotation is one method of achieving this while maximizing source efficiency. As in a microwave oven, rotation of the UV source delivers uniform light across the wafer without leaving “hot spots” behind as occurs when the UV source is fixed. Although the issue can be resolved by increasing the distance between the wafer and the UV bulb, this approach reduces source efficiency and leads to a drop in cure rate. The UV window itself can also improve UV light distribution uniformity. Figure 7 compares film shrinkage profiles across the wafer for conventional and engineered windows. The profile for the conventional window shows more shrinkage at the center than at the edge, corresponding to the loss of light at the rim of the window. The new window design minimizes the reflection loss at the window surfaces, allowing more UV light to illuminate the edge of the wafer and thereby optimizing UV light distribution. Shrinkage uniformity maps in Figure 6b demonstrate the effect on curing uniformity of the re-engineered lighting and purge flow hardware.

IMPROVING CDO ADHESION TO THE BARRIER DIELECTRIC
The low-κ dielectric must be mechanically strong to resist stresses applied to the chip during packaging. Packaging failures commonly occur at the barrier/low-κ interface due to mismatch between the barrier film and bulk dielectric. To enhance adhesion, a transition layer is deposited between the barrier and bulk low-κ. Although this layer greatly improves adhesion, its higher dielectric properties raise the κeff. However, the chemistry used results in the initiation and transition layer of the film being significantly thinner than was true of previous-generation chemistries. The SIMS profile in Figure 8 shows that the transition from low carbon at the barrier interface to the bulk carbon in the film occurs in 75Å. This thinner adhesion layer improves overall κeff of the film without loss of mechanical integrity.

Figure 6. (a) CFD modeling showed uniform purge flow to be more effective in maintaining a constantly clean UV window. (b) On-wafer results also demonstrated better shrinkage uniformity. (Panel (a) maps outgassing species concentration at the window for uniform and side-to-side purge flow; panel (b) shows shrinkage maps and the table below.)

Parameter                         Optimized UV Hardware   Conventional UV Hardware
Film Shrinkage (%)                15.0                    15.3
Shrinkage Uniformity (%, 1σ)      1.6                     3.0
Shrinkage Range (%) (Max-Min)     1.0                     1.8
E&H Uniformity (GPa) (Max-Min)    0.3/0.04                0.8/0.1
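The improvement reported in the Figure 6b table can be expressed as a relative reduction for each metric. The short Python sketch below is only an illustration of that arithmetic: the numbers are transcribed from the table, while the function name, the dictionary keys, and the reading of "E&H" as modulus and hardness are assumptions made here, not taken from the article.

```python
# Values transcribed from the Figure 6b table as (optimized, conventional).
# Key names are illustrative; reading "E&H" as modulus (E) and hardness (H)
# is an assumption.
FIG6B = {
    "shrinkage_uniformity_pct_1sigma": (1.6, 3.0),
    "shrinkage_range_pct": (1.0, 1.8),
    "modulus_range_gpa": (0.3, 0.8),
    "hardness_range_gpa": (0.04, 0.1),
}


def relative_reduction_pct(optimized, conventional):
    """Percent reduction of a non-uniformity metric relative to conventional hardware."""
    return (conventional - optimized) / conventional * 100.0


for name, (optimized, conventional) in FIG6B.items():
    reduction = relative_reduction_pct(optimized, conventional)
    print(f"{name}: {reduction:.1f}% lower with optimized hardware")
```

Under these assumptions, each of the listed non-uniformity metrics drops by roughly half with the optimized hardware, while the mean film shrinkage stays nearly unchanged (15.0% vs. 15.3%).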


Figure 7. Greater illumination around the edges of the new window results in more uniform center-to-edge film shrinkage. (Plot of normalized shrinkage vs. wafer position for the conventional and engineered windows.)

Figure 8. Carbon profile demonstrates the reduction of adhesion layer thickness. (Plot of carbon concentration (atom%) vs. depth (Å) for the new and previous chemistries, with transition widths of 75Å and 340Å marked.)

Figure 9. 2LM test structure fabricated with TLO-containing ULK dielectric and advanced UV curing (courtesy MTCG).

CONCLUSION
Developing TLO-containing ULK dielectric films for successful integration at advanced logic nodes poses many challenges in identifying the optimal CDO chemistry, enhancing the curing process, and optimizing integrated process steps to maintain adhesion and lower κeff. Designing a UV cure solution for robust film restructuring not only extends the use of nano-porous ULK film to the next generation but also extends curing to other applications, such as tensile stress nitride for device performance improvement and κ-recovery of integration-damaged nano-porous ULK films. Process and hardware advances in dielectric deposition, UV curing, and interface engineering have created a film ready for sub-20nm integration (Figure 9).

ACKNOWLEDGMENTS
The authors would like to thank K. Chan, M. Chhabra, B. Xie, S. Hendrickson, A. Kangude, M. Martinelli, D. Raj, R. Odom and D. Witty for their support and comments.

REFERENCES
[1] Kelvin Chan, et al., “Structural Evolution of Nano-Porous Ultra-Low-κ Dielectrics under Broadband UV Curing,” Advanced Metallization Conference 2007, Proceedings, October 9-11, 2007, Tokyo, Japan; October 22-24, 2007, New York, NY, pp. 507-511, 2008.

[2] Alex Demos, et al., “Porous Low-κ Dielectrics Using Ultraviolet Curing,” Solid State Technology, September, 2005.

AUTHORS

Harry Whitesell is a global product manager in the Dielectric Systems and Modules business unit at Applied Materials. He holds his Ph.D. in materials engineering from Auburn University.

Tsutomu Kiyohara is a global product manager in the Dielectric Systems and Modules business unit at Applied Materials. He received his B.S. in industrial engineering from the University of Toronto.

Kang Sub Yim is a senior member of technical staff in the Dielectric Systems and Modules business unit at Applied Materials. He holds his Ph.D. in chemical engineering from Stanford University.


Thomas Nowak is a distinguished member of technical staff in the Dielectric Systems and Modules business unit at Applied Materials. He received his Ph.D. in mechanical engineering from the Massachusetts Institute of Technology.

Alex Demos is a director in the Dielectric Systems and Modules business unit at Applied Materials. He holds his Ph.D. in chemical engineering from the University of Michigan.

Juan Carlos Rocha is an engineering director in the Dielectric Systems and Modules business unit at Applied Materials. He received his Engineer’s Degree in Mechanical Engineering from the Massachusetts Institute of Technology.

Sanjeev Baluja is a senior engineering manager in the Dielectric Systems and Modules business unit at Applied Materials. He holds his Master’s of Engineering in Manufacturing from the University of Michigan.

ARTICLE CONTACT
[email protected]
[email protected]

PROCESS SYSTEMS USED IN STUDY
Applied Producer® PECVD Black Diamond® 3
Applied Producer® PECVD with Nanocure™ 3

OPTIMIZING DIELECTRIC ETCH UNIFORMITY for 28nm Copper Dual Damascene

KEYWORDS
Dual Damascene
Dielectric Etch
Inter-Electrode Gap
Center-to-Edge Uniformity

Feature dimensions at the 2x node are making it increasingly challenging to achieve the center-to-edge etch depth uniformity and CD control necessary for uniform resistance of copper interconnect lines. In combination with optimized chemistry, varying the inter-electrode gap of a plasma reactor is an effective method by which to tune these aspects of etch performance. Experiments show that using this mechanism for individual steps achieves superior yield from a multi-step process.

Etch reactors are equipped with a number of tuning parameters, including power levels, process chemistry, gas flow distribution, chamber pressure, and wafer temperature gradient, to meet requirements for etch rate, selectivity, and critical dimension (CD) control. However, as process requirements tighten, it becomes difficult to optimize the process for one performance requirement without affecting other aspects of performance. Varying the gap between the electrodes provides a significant and independent means of tuning process performance, particularly center-to-edge uniformity. We present results from an investigation of dual damascene (DD) etch uniformity using a reactor with a moving cathode to optimize the inter-electrode gap.

MODELING STUDY
Delivering uniform process results across the wafer typically requires optimizing the spatial distribution of plasma-generated species (ions, electrons) in the reactor. One factor that significantly influences plasma spatial distribution is the inter-electrode gap. To understand the effect of this gap on the behavior of capacitively coupled plasmas, we developed a two-dimensional model of an argon plasma operating at 13.5MHz.

The reactor has a top electrode separated from the grounded chamber wall by a quartz ring. The bottom electrode is surrounded by a set of silicon and quartz rings. A silicon wafer is placed at the bottom electrode; argon gas flows into the plasma process chamber through the top electrode and is removed through the pump port at the bottom of the chamber. The plasma simulations were performed at 1000W with the RF source connected to the bottom electrode. Gas pressure was 100mTorr and the inter-electrode gap was varied between 1.25in. and 3.0in. Secondary electron emission was not included in these simulations. Peak electron density is close to the edge of the bottom electrode for an inter-electrode gap of 1.25in. (Figure 1).[1]

Figure 1. (a) Electron density and (b) power distribution for different inter-electrode gaps at 1000W, 100mTorr, and 13.5MHz. Model results (top to bottom) from 1.25in., 2.5in., and 3.0in. gap. (The reactor schematic labels the top electrode, bottom electrode, wafer, gap, quartz and Si rings, chamber wall, and pump port.)

Gap       Peak electron density    Peak power deposition
1.25in.   3.54 x 10^16 m^-3        377.8 kW/m^3
2.5in.    6.17 x 10^16 m^-3        240.0 kW/m^3
3.0in.    6.07 x 10^16 m^-3        239.3 kW/m^3

Plasma sheath thickness varies inversely with electron density; at the minimum gap, the sheath is thicker near the center of the chamber where electron density is lower. As the inter-electrode gap is increased, the peak plasma density initially increases, then saturates and starts decreasing (Figure 1). Although not shown here, the peak electron density further decreases if the inter-electrode gap exceeds 3.0in.

These changes in electron density magnitude can be understood in terms of the surface area to volume ratio, A/V, of the effective plasma region, which decreases as the inter-electrode gap increases. Electrons are lost primarily at the reactor surfaces in an argon discharge. Therefore, as the relative surface area A/V is larger at smaller gaps, electron density is lowest when the inter-electrode gap is 1.25in. As the gap increases beyond a certain limit, the same RF power is deposited over a larger volume. The peak plasma density therefore saturates and then decreases. The spatial structure of the plasma changes appreciably with the inter-electrode gap.

Electric field enhancement causes electron power deposition to be stronger near the edge of the electrode (Figure 1). Plasma is therefore produced more strongly near the electrode edge. At large inter-electrode gaps, plasma can diffuse relatively easily and fills the inter-electrode region better. The axial location of the peak in plasma density shifts toward the powered electrode when the inter-electrode gap increases. When the gap is narrow, the lower electron density results in a thicker cathode sheath, notably above the lower electrode, as the RF potential drop across the sheath is larger in this asymmetric plasma reactor. The peak electron density therefore appears off axis, closer to the top electrode. As the inter-electrode gap increases, the peak electron density gradually moves closer to the lower electrode as more electron power is being deposited adjacent to it.

The effect of the inter-electrode gap on plasma spatial structure is also reflected in the ion flux to the wafer (Figure 2). As electron density is higher near the edge of the lower electrode, argon ion flux to the wafer peaks near the electrode edge at the narrow gap of 1.25in. The plasma diffuses to the chamber center as the inter-electrode gap is increased. This, in turn, raises plasma density and argon ion flux near the wafer center.

Figure 2. Argon ion flux to the wafer for different inter-electrode gaps at 1000W, 100mTorr, and 13.5MHz. (Plot of Ar+ flux (m^-2 s^-1) vs. radius (m) for gaps of 1.25in., 1.5in., 2.0in., 2.5in., and 3.0in.)

The etch rate is typically dependent on the flux of ions and reactive species to the wafer. If the etch process at the center of the reactor is limited by the availability of reactive species, widening the inter-electrode gap facilitates diffusion of the reactive species from the edge region, providing a more uniform etch result. Controlling the flux distribution by varying the inter-electrode gap offers a mechanism for optimizing the etch across the wafer. The CD of the etch feature, which is dependent on the anisotropy of the etch and mask erosion, can be similarly optimized.
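The surface-area-to-volume argument in the modeling study can be illustrated with a simple geometric idealization. The Python sketch below treats the effective plasma region as a cylinder of radius R and height equal to the gap d, so that A/V = 2/d + 2/R. This cylinder is an assumption made here for illustration and is not the two-dimensional model used by the authors; the radius of 0.15 m is a hypothetical value (chosen only because the radial axis of Figure 2 extends to 0.15 m), and the function and constant names are likewise illustrative. Only the gap values come from the article.

```python
import math

INCH_TO_M = 0.0254  # metres per inch


def area_to_volume(gap_m, radius_m):
    """A/V of an idealized cylindrical plasma region (two end faces plus side wall)."""
    area = 2.0 * math.pi * radius_m**2 + 2.0 * math.pi * radius_m * gap_m
    volume = math.pi * radius_m**2 * gap_m
    return area / volume  # algebraically equal to 2/gap + 2/radius


RADIUS_M = 0.15  # hypothetical plasma radius, an assumption for illustration
GAPS_IN = (1.25, 1.5, 2.0, 2.5, 3.0)  # gap values quoted in the article

for gap_in in GAPS_IN:
    ratio = area_to_volume(gap_in * INCH_TO_M, RADIUS_M)
    print(f"gap {gap_in:.2f} in. -> A/V = {ratio:.1f} 1/m")
```

In this idealization the 2/d term dominates at narrow gaps, so A/V falls steadily as the gap widens, which matches the qualitative statement that relative surface losses are largest at 1.25in.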


Figure 3. Via CD bias vs. gap for via mask open etch (a) 1.25in., (b) 1.6in., (c) 2.5in., and (d) 3.5in. (Wafer maps of CD bias; summary statistics for each panel are listed below.)

Panel (gap)     Mean (nm)   3-Sigma (nm)   Range (nm)
(a) 1.25in.     -26.3       10.2           14.7
(b) 1.6in.      -26.2       8.9            10.4
(c) 2.5in.      -32.8       6.1            8.5
(d) 3.5in.      -29.4       7.2            10.5
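The Figure 3 statistics lend themselves to a simple comparison across gaps. The Python sketch below picks the gap with the smallest value of a chosen non-uniformity metric. The numbers are transcribed from the figure, assuming the four sets of statistics map left to right onto panels (a) through (d); the dictionary layout and the helper name `best_gap` are hypothetical and introduced only for this illustration.

```python
# Via CD bias statistics transcribed from Figure 3, keyed by gap in inches.
# Mapping of each statistics column to a panel (a)-(d) is assumed left to right.
FIG3_STATS = {
    1.25: {"mean_nm": -26.3, "three_sigma_nm": 10.2, "range_nm": 14.7},
    1.6: {"mean_nm": -26.2, "three_sigma_nm": 8.9, "range_nm": 10.4},
    2.5: {"mean_nm": -32.8, "three_sigma_nm": 6.1, "range_nm": 8.5},
    3.5: {"mean_nm": -29.4, "three_sigma_nm": 7.2, "range_nm": 10.5},
}


def best_gap(stats, metric):
    """Return the gap (in.) with the smallest value of the given non-uniformity metric."""
    return min(stats, key=lambda gap: stats[gap][metric])


for metric in ("three_sigma_nm", "range_nm"):
    print(f"lowest {metric}: {best_gap(FIG3_STATS, metric)} in. gap")
```

Of the four gaps shown, both metrics are lowest at 2.5in. Note that the mean CD bias also shifts with the gap, so a uniformity-optimal gap may still need a compensating adjustment elsewhere in the recipe.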

ON-WAFER RESULTS
The effect of electrode gap variation on etch performance was investigated experimentally for copper back-end-of-line DD etches using porous low-κ dielectric with a TiN hard mask, representative of the film stacks being adopted by the industry for nominal 28nm design rules. The inter-electrode gap shows a significant effect on CD uniformity for the via etch process. During the via mask open step, the CD pattern shifts from center-small/edge-large to center-large/edge-small as the gap increases from 1.25in. to 3.5in., with optimum uniformity occurring at gaps between 1.6in. and 2.5in. (Figure 3). Via depth uniformity also varies with the gap and follows a trend similar to the via CD results, with the optimum center-edge uniformity occurring when the gap ranged between 1.5in. and 2.0in.

For the subsequent trench etch, the inter-electrode gap was found to be a powerful means of optimizing center-to-edge trench depth uniformity. However, unlike the via process, center-to-edge trench CD uniformity was not affected by the gap, likely because of the difference in plasma conditions (power, gas flows) for the two processes (Figure 4).

The optimum gap varies with other parameters of the etch process, such as the process chemistry. Figure 5 shows electrical resistance of copper-filled DD test structures with optimization of process chemistry and gap to improve uniformity of the electrical results. Achieving uniform resistance of the copper lines requires optimizing both CD and trench depth uniformity. Moving from a C4F6-based chemistry to a leaner C4F8-based chemistry and reducing the gap from 3.5in. to 1.8in. resulted in a 55% reduction in non-uniformity.

For reactors in which one of the electrodes can be moved under recipe control, the inter-electrode gap can be used as a process tuning parameter to optimize uniformity for individual steps of a multi-step recipe (Figure 6). In this example, the electrode gap was adjusted by moving the cathode to the appropriate position for individual steps of the DD process, resulting in CD uniformity of 1.4nm 3σ. After filling with copper, these structures yielded at 100% with a uniform line resistance distribution (2.3% 1σ).
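The uniformity figures quoted in this section (3σ in nm, max-min range, and 1σ as a percentage of the mean) can all be computed from a set of across-wafer measurements. The Python sketch below shows one plausible way to do so. The article does not say whether a sample or population standard deviation was used, so the sample form is an assumption here, and the function name and the example measurements are hypothetical.

```python
import statistics


def uniformity_metrics(values):
    """Summarize across-wafer measurements with the metrics quoted in the text.

    Uses the sample standard deviation; the article does not specify which
    form was used, so this choice is an assumption.
    """
    mean = statistics.mean(values)
    sigma = statistics.stdev(values)
    return {
        "mean": mean,
        "three_sigma": 3.0 * sigma,
        "range": max(values) - min(values),
        "one_sigma_pct": 100.0 * sigma / abs(mean),
    }


# Hypothetical CD measurements (nm) at several wafer sites, for illustration only.
example_cd_nm = [44.6, 45.1, 45.0, 44.8, 45.3, 45.2]
for name, value in uniformity_metrics(example_cd_nm).items():
    print(f"{name}: {value:.2f}")
```

Reporting 3σ or range suits absolute quantities such as CD, while the percentage form suits quantities such as line resistance, where the relative spread matters.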

Figure 4. Effect of inter-electrode gap on trench etch depth and CD uniformities. (Plot of trench depth and trench CD non-uniformity (nm) vs. inter-electrode gap (in.) at 1.60, 1.75, and 2.00.)

Figure 5. Line resistance (R) for 4.5m long 45/45nm line/space copper lines. (Cumulative percent plot of line resistance (Ohms) with LSL, target, and USL marked, comparing 1.8in. gap/C4F8 chemistry with 3.5in. gap/C4F6 chemistry.)


Figure 6. (a) DD etch process with gap optimized by process step. (b) Distribution of electrical line resistance for 120 micron M1 75/75nm comb-serpentine test structures. (Panel (a) shows the film stack, labeled PR, ARC, HM, SOH, TiN HM, ULK, and Barrier, and plots the gap (in.) for each step: via hard mask etch, organic mask etch, via etch, via ash/organic mask strip, trench etch, and barrier open/post etch treatment. Panel (b) is a cumulative percent plot of the measurements.)

CONCLUSION
Experimental studies have demonstrated the effectiveness of using the inter-electrode gap of a plasma reactor to tune center-to-edge etch performance, such as CD uniformity or etch depth uniformity. Widening the gap caused etch rate to transition from edge high to center high; the CD pattern shifted from center-small/edge-large to center-large/edge-small as the gap increased. These results were consistent with an argon plasma model that showed an increase in ion flux near the center of the wafer and a decrease at the edge as the distance between the electrodes widened.

ACKNOWLEDGEMENTS
The authors would like to acknowledge the support of Nikos Bekiaris and the Applied Materials Maydan Technology Center for processing the electrical test wafers. Shahid Rauf, Peter Hsieh, Kathryn Keswick, and Bryan Pu of the Etch Products business unit also contributed to this work.

AUTHORS
Amulya Athayde is the global product manager for Dielectric Etch at Applied Materials. He holds his Ph.D. in chemical engineering from the University of Notre Dame.

Chia-Ling Kao is a senior process engineer in the Dielectric Etch Application Technology Development group at Applied Materials. He received his Ph.D. in chemical engineering from Stanford University.

Kallol Bera is a senior member of technical staff in the Dielectric Etch Disruptive Technology and Engineering group at Applied Materials. He holds his Ph.D. in mechanical engineering from Drexel University.

Sean Kang is a senior member of technical staff in the Dielectric Etch Application Technology Development group at Applied Materials. He received his M.S. in chemical engineering from the University of Southern California.

ARTICLE CONTACT
[email protected]

PROCESS SYSTEM USED IN STUDY
Applied Producer® Etch

REFERENCES

[1] K. Bera, et al., “Effects of Inter-Electrode Gap on High Frequency and Very High Frequency Capacitively Coupled Plasmas,” J. Vac. Sci. Technol. A 27 (4), pp. 706-711, Jul/Aug 2009.


BECAUSE INNOVATION MATTERS™

www.appliedmaterials.com 3050 Bowers Avenue P.O. Box 58039 Santa Clara, CA 95054-3299 U.S.A. Tel: +1-408-727-5555

Applied Materials and the Applied Materials logo are registered trademarks. All trademarks so designated or otherwise indicated as product names or services are trademarks of Applied Materials, Inc. in the U.S. and other countries. All other product and service marks contained herein are trademarks of their respective owners. © 2011 Applied Materials, Inc. All rights reserved. Printed in the U.S. 07/11 6.5K