NEMO5 on Blue Waters – A Flexible Package for Nanoelectronics Modeling Problems
Gerhard Klimeck Network for Computational Nanotechnology
PRAC: Accelerating Nano-scale Transistor Innovation PI: Gerhard Klimeck Blue Waters Symposium June 2018 Moore’s Law Forever? Number of transistors: Moore s law is continuing
http://jai-on-asp.blogspot.com 2005: free lunch is over, updated 2009 CPU’s are not getting faster! Number of transistors: Moore s law is continuing
Clock speed: no longer scaling
http://jai-on-asp.blogspot.com 2005: free lunch is over, updated 2009 Power is the Limit! Number of transistors: Moore s law is continuing
Clock speed: no longer scaling Power: today s limitation ~100W
http://jai-on-asp.blogspot.com 2005: free lunch is over, updated 2009 Limited Performance Improvements
Number of transistors: Moore s law is continuing
Clock speed: no longer scaling Power: today s limitation ~100W Performance Gain: limited Supply voltage (Vdd) stopped http://jai-on-asp.blogspot.com scaling at around 2003 2005: free lunch is over, updated 2009 What is Special about 100W ?
Power: today s limitation ~100W
http://jai-on-asp.blogspot.com 2005: free lunch is over, updated 2009 Intel Projection from 2004 How do we burn power? Switching Circuits & Leakage => Transistors
Power: today s limitation ~100W
http://jai-on-asp.blogspot.com CMOS Inverter Dynamic / Switching Power: Vdd P ØCharging a capacitor network leakage ! H L ! ∝ !!!!!!!!! conduct ØReduce frequency L N Vss ØReduce capacitance => device size K ØReduce voltage J Static Power: ØLeakage through transistors
! ∝ !!""!∝ 1/exp!(V!!)! L “Fundamental” Limit Vg S D log Id Metal Oxide Threshold n+ p n+
Vg↑ S≥60 mV/dec
0 Vdd Vg
E
x “Fundamental” Limit Vg S D log Id Metal Oxide Threshold n+ p n+
DOS(E), log f(E) Vg↑ S≥60 mV/dec ` Ef 0 Vdd Vg
E
x
log ! ! ∝ −!! !! Device Scaling for Performance
log I Dynamic / Switching Power: d ØCharging a capacitor network ! ! ∝ !!!!!!!!!! Ø Reduce supply voltage I OFF S≥60 mV/dec
0 Vdd Vg Vdd Vdd Static Power: ØLeakage through transistors
! ∝ !!"" ∝ 1/exp!(V!!)! Device Scaling for Performance
log Id
I OFF S≥60 mV/dec
From: Eric Pop, Nanotechnology
0 Vdd Vg
Static Power: ØLeakage through transistors
! ∝ !!"" ∝ 1/exp!(V!!)! Device Scaling for Performance
log Id log Id ION
S<<60 mV/dec I S≥60 mV/dec OFF S≥60 mV/dec IOFF
0 Vdd Vg 0 Vdd Vg
Static Power: ØLeakage through transistors
! ∝ !!"" ∝ 1/exp!(V!!)! Need a Different Switch
log I G d n-type S D Metal ION Oxide p+ i n+
S<<60 mV/dec log f(E) S≥60 mV/dec IOFF Ef 0 Vdd Vg
Cold injection Static Power: ØLeakage through transistors
! ∝ !!"" ∝ 1/exp!(V!!)! Ultra-Thin-Body (8nm) InAs BTBT Device
§ Do not need phonons for direct current § Highly non-parabolic conduction band § Realistic valence band features
Gerhard Klimeck UTB Band Edges
• 1D quantization => Bandgap raised from bulk 0.37eV => ~0.6eV • Realistic Poisson Solution Doping 1x1018/cm3 • High doping regions flat bands • Central gate region good control
Doping 1x1018/cm3 § Gerhard Klimeck UTB: Current Hotspots in Energy
• 1D quantization => Bandgap raised from bulk 0.37eV => ~0.6eV • Realistic Poisson Solution Doping 1x1018/cm3 • High doping regions flat bands • Central gate region good control
• Bandgap narrow at »Certain energies »Certain biases
Gerhard Klimeck UTB Current Density in Energy
• 1D quantization => Bandgap raised from bulk 0.37eV => ~0.6eV • Realistic Poisson Solution Doping 1x1018/cm3 • High doping regions flat bands • Central gate region good control
• Bandgap narrow at »Certain energies »Certain biases • 2 energy regions of current flow »Low Ec / high Ev »High Ec / low Ev
Gerhard Klimeck UTB Current as a function of Gate Voltage
• Current vs. Gate Voltage
Gerhard Klimeck Application to pin InAs UTB and Nanowire
Single-Gate UTB 20nm 20nm
6nm 6nm
GAA Nanowire 20nm
-3 NA=ND=5e19 cm 6nm t =1nm ox Double-Gate UTB
Gerhard Klimeck BTBT in pin InAs Devices – SG-UTB / DG-UTB / NW
Subthreshold Swing @Vds=0.2 V
Nanowires can deliver very steep turn-offs!
Gerhard Klimeck Tunnel FETs
J. Charles, P. Sarangapani, D. Lemus, Chin Yi Chen, P. Long, Tarek Ameen, X. Guo, T. Kubis, G. Klimeck Nitride Tunnel Hetero-structures
Multi-Scale, Multi-Physics 23Partitioning is critical Nitride Tunnel Hetero-structures
ON and OFF current need careful24 Geometry designs phonon scattering degrades transistor performance
Local current density 1016
1014
1012
scattering
1010 A/(eV*m) BlueWaters simulation: Phonon scattering degrades ON/OFF ratio of 3HJ TFETs.
At 30nm Lg , VDD must be increased ~0.07V to maintain on/off ratio 25 5. Chemically doped tunneling FETs Channel thickness optimization Application: Tunneling FETs Impact: § Low power electronics. § TFET’s performance Ion/Ioff can be improved up § Mobile/medical devices. 4 Simulation method: 10 if optimized channel thickness is used. § Quantum transmitting boundary method coupled with Poisson equation WSe2 BP InAs § 10 band WSe2 , Phosphorene (BP), and InAs tight binding Hamiltonian Achievement: V § Channel thickness design rule. DD ü Compact model developed. VDD source oxide drain
oxide
Details: arXiv:1804.11034; https://arxiv.org/abs/1804.11034 26 Benchmarking by industry: Charge-devices continue to shine D. Nikonov, I. Young (Intel) 102 2012 ST: Spin-Torque Charge-based devices outperform ST transfer 101 others in electronic switching Spin-wave triad
ST oscillator 0 All-spin 10 logic ST Transfer/ Graphene pn junction SpinFET BISFET ST Domain-Wall Nanomagnet -1 majority 10 CMOS high performance gate Logic
Heterojunction CMOS low power -2 10 III-V Energy(fJ)
Graphene nanoribbon 10-3 Preferred corner Tunnel-FETs 10-4 10-1 100 101 102 103 104 Delay (ps)
Energy vs delay of inverters with fanout 4 with current-controlled switching, Vdd=0.01 V Power Problem: Tunneling Transistors to the Rescue!
For a little while! Intel Roadmap Today: non-planar 3D devices Better gate control!
Intel 22nm finFET
http://www.goldstandardsimulations.com/index.php/news/blog_search/simulation-analysis-of-the-intel-22nm-finfet/ http://www.chipworks.com/media/wpmu/uploads/blogs.dir/2/files/2012/08/Intel22nmPMOSfin.jpg Today: non-planar 3D devices Better gate control!
22nm = 176 atoms 8nm = 64 atoms
http://www.goldstandardsimulations.com/index.php/news/blog_search/simulation-analysis-of-the-intel-22nm-finfet/ http://www.chipworks.com/media/wpmu/uploads/blogs.dir/2/files/2012/08/Intel22nmPMOSfin.jpg Today: non-planar 3D devices Better gate control! 1,085 atoms
22nm = 176 atoms 8nm = 64 atoms
http://www.goldstandardsimulations.com/index.php/news/blog_search/simulation-analysis-of-the-intel-22nm-finfet/ http://www.chipworks.com/media/wpmu/uploads/blogs.dir/2/files/2012/08/Intel22nmPMOSfin.jpg Roadmap of finite atoms! Atomistic Modeling => NEMO
nm Node 22 14 10 7 5 Node atoms 176 122 80 56 40 Critical atoms 64 44(?) 29(?) 20(?) 14(?) Electrons 160-190 64-80 30-38 18-23 11-15 Nanoelectronics Today
• $300 billion semiconductor industry • International Technology Roadmap for Semiconductors (ITRS) • Moore’s Law • Countable number of atoms • Quantum effects • Devices (transistors) • Smaller, faster, more energy efficient • Designs, materials Band-to-band tunneling Topological insulators Single atom transistors
IEEE Elec. Dev. Lett. 30, 602 (2009) Nature Physics 6, 584 (2010) Nature Nanotechnology 7, 242 (2012) http://www.chipworks.com/media/wpmu/uploads/blogs.dir34 /2/files/2012/08/Intel22nmPMOSfin.jpg NEMO5 - Bridging the Scales From Ab-Initio to Realistic Devices Ab-Initio TCAD
Goal: Approach: • Device performance with realistic • Ab-initio: extent, heterostructures, fields, etc. • Bulk constituents for new / unknown materials • Small ideal superlattices Problems: • Map ab-initio to tight binding • Need ab-initio to explore new (binaries and superlattices) material properties • Current flow in ideal structures • Ab-initio cannot model non- • Study devices perturbed by: equilibrium. • Large applied biases • TCAD uses quantum corrections • Disorder 35 • Phonons Quantum Transport far from Equilibrium
Macroscopic dimensions NonLaw-Equilibriumof Equilibrium Quantum: Statisticalρ = exp (−( HMechanics− µN)/kT)
Diffusive Atomic Ballistic Drift / dimensions Diffusion Quantum Σs Boltzmann Unified model Transport µ1 H µ2 Non-WhichEquilibrium GreenFormalism? Functions Σ1 Σ2 S D SILICON S D INSULATOR VG VD VG VD Gerhard Klimeck,I Supriyo Datta I Multi-Scale Modeling
Atom Geometry Positions Construction ~10-50 million o Valence Force Field Atomistic atoms Strain (VFF) Method Relaxation
Piezoelectric Hamiltonian Electrical / o Piezoelectric eff. Valence Potential Construction Magnetic field Pol. charge density Electrons o Empirical tight binding Optical Prop. ~0.5-10 million Single Particle 3 5 States Electronic Str sp d s* + spin orbit
e-e Many Body Optical Prop. o SCP: Poisson + LDA interactions Interactions Electronic Str Few States o Slater Determinants A Journey Through Nanoelectronics Tools NEMO and OMEN
NEMO-1D NEMO-3D NEMO3Dpeta OMEN NEMO5 Transport Yes - - Yes Yes Dim. 1D any any any any Atoms ~1,000 50 Million 100 Million ~140,000 100 Million Crystal [100] [100] [100], Any Any Cubic, ZB Cubic, ZB Cubic,ZB, WU Any Any Strain - VFF VFF - MVFF Multi- - Spin, physics Classical Parallel 3 levels 1 level 3 levels 4 levels 4 levels Comp. 23,000 cores 80 cores 30,000 cores 220,000 co 100,000 cores
38 A Journey Through Nanoelectronics Tools NEMO and OMEN
NEMO-1D NEMO-3D NEMO3Dpeta OMEN NEMO5 Transport Yes - - Yes Yes Dim. 1D any any any any Atoms ~1,000 50 Million 100 Million ~140,000 100 Million Crystal [100] [100] [100], Any Any Cubic, ZB Cubic, ZB Cubic,ZB, WU Any Any Strain - VFF VFF - MVFF Multi- - Spin, physics Classical Parallel 3 levels 1 level 3 levels 4 levels 4 levels Comp. 23,000 cores 80 cores 30,000 cores 220,000 co 100,000 cores
39 A Journey Through Nanoelectronics Tools NEMO and OMEN
NEMO-1D NEMO-3D NEMO3Dpeta OMEN NEMO5 Transport Yes - - Yes Yes Dim. 1D any any any any Atoms ~1,000 50 Million 100 Million ~140,000 100 Million Crystal [100] [100] [100], Any Any Cubic, ZB Cubic, ZB Cubic,ZB, WU Any Any Strain - VFF VFF - MVFF Multi- - Spin, physics Classical Parallel 3 levels 1 level 3 levels 4 levels 4 levels Comp. 23,000 cores 80 cores 30,000 cores 220,000 co 100,000 cores
40 A Journey Through Nanoelectronics Tools NEMO and OMEN
NEMO-1D NEMO-3D NEMO3Dpeta OMEN NEMO5 Transport Yes - - Yes Yes Dim. 1D any any any any Atoms ~1,000 50 Million 100 Million ~140,000 100 Million Crystal [100] [100] [100], Any Any Cubic, ZB Cubic, ZB Cubic,ZB, WU Any Any Strain - VFF VFF - MVFF Multi- - Spin, physics Classical Parallel 3 levels 1 level 3 levels 4 levels 4 levels Comp. 23,000 cores 80 cores 30,000 cores 220,000 co 100,000 cores
41 A Journey Through Nanoelectronics Tools NEMO and OMEN
NEMO-1D NEMO-3D NEMO3Dpeta OMEN NEMO5 Transport Yes - - Yes Yes Dim. 1D any any any any Atoms ~1,000 50 Million 100 Million ~140,000 100 Million Crystal [100] [100] [100], Any Any Cubic, ZB Cubic, ZB Cubic,ZB, WU Any Any Strain - VFF VFF - MVFF Multi- - Spin, physics Classical Parallel 3 levels 1 level 3 levels 4 levels 4 levels Comp. 23,000 cores 80 cores 30,000 cores 220,000 co 100,000 cores
42 Modeling Electronic Transport of Metal Interconnects
Daniel Valencia
Electrical and Computer Engineering Purdue University Why interconnect power dissipation is a BIG deal
Total interconnect length in Metal 1 and Intermediate wiring ~ 3km/cm2 !!!
Electron travels through far more metal than semiconductor!!!
2004, Nir Magen, Intel http://lp-hp.com/pangrle/files/2011/09/barry1.png ITRS 2007 Interconnect
45 45 Sources of resistance in interconnects
1. Surface and Surface roughness scattering (Metal-dielectric Interface) 2. Grain Boundary scattering (Metal-Metal interface)
2 3. Barrier scattering (Metal-Metal Interface) 1
3
Image from EE-311, V.K.Saraswat, Stanford University 2003
ITRS Interconnect 46(2010) 46 Challenges
Resistivity as a function of the Atomistic structure of an width interconnect
Traditional Model
Challenges of Traditional Models Challenges of Atomistic Models o Require experimental input o Structure becomes an arrangement of o Fail to describe the effect of surface atoms instead of a continuous bulk roughness and grain boundary material individually o Atoms models requires a quantum o Pure fitting process and lack of treatment physical meaning of those parameters o Grain has 104 to 105 atoms à cannot be modelled by ab initio Atomistic models can be used to o Atom positions are unknown a priori à tackles those shortcomings the structure must be minimized Results: Resistivity at room temperature Resistivity at room At room T, resistivity:ρT = ρGB + ρSR + ρe−ph temperature Simulated copper thin- filmLength
Grain 1 Grain 2 Grain 3 Width
Source Drain
The resistivity for copper thin films at room T is simulated assuming same structures than before but Results: o At room T, the e-ph effects are o e-ph method correctly describes the included by the Molecular Dynamics + increasing in the resistivity at room Landauer formalism temperature o 10 different “snapshots” are averaged o e-ph contribution is smaller than SR per configuration effect for the simulated results Mode Space: Computational Burden Reduction
Daniel Lemus
Electrical and Computer Engineering Purdue University Computational complexity of Non- Equilibrium Green’s Functions (NEGF)
NEGF equations best solved with recursive Green’s function (RGF) algorithm Device Hamiltonian ! rows for total device
Solution for RGF: # *(+) for + layers
With RGF:
Time-to-solution: Matrix mult./inversions $(&') ,- = / − # − Σ- 23 † ,4 = ,-Σ4,- Memory: Σ4 = ,464 " rows for $(&)) Σ- = ,-6- + ,-64 + ,46- single layer 54 Computational complexity - example
• Hundreds of thousands of rows » ! "# time to solution $ » ! " memory footprint NEGF equations solved • Realistic basis sets 500 million times » 10 rows per atom 80 million CPU hours can be • Self-consistency spent on designing one device » 100 iterations • Multiple energies, momentums All of these factors needed for » 500 instances realistic simulations of nanodevices • Multiple electrostatic response tests » 10 instances A way to reduce this workload • Calculation of inelastic scattering is required » 100 instances • Geometry/material variations » 10 instances 55 Mode space basis reduction
2880 81
2880 ∗ 81 !" ∗
81 ! × # × !" !# !$ = ∗ (+,-- 81 ( 2880 )* !$ 2880