Physical Design Challenges and Innovations to Meet Power, Speed, and Area Scaling Trend
LC LU TSMC TSMC Fellow/Senior Director, R&D
© 2017 ISPD ISPD 2017 1 Chip and System Integration Trends for Better PPA & System Performance
Platform Solutions
Mobile
3D Transistors High (FinFET, GAA-NWT) Performance Computing
Automotive CPU
Voice 3D Stacking
(BSI, Monolithic MEMS) Camera IoT
Video MP 3 Storage SYSTEM SoC 3D Packaging INTEGRATION (WLP, WoW) Multiple System performance/ Better performance Bandwidth Chips Better Power Application-specific on PCB Better form factor Process technology + Design Solutions 2D 3D © 2017 ISPD ISPD 2017 2 Application-Optimized Design Platforms Area, Performance and Power
Power Speed & Memory Bandwidth High- Performance Functional Safety Computing Standard Compliance
Mobile
Ultra-Low Power Automotive
IoT
Speed
© 2017 ISPD ISPD 2017 3 Semiconductor Trend
Area: process primary dimension scaling slows down Very challenging to continue Moore’s law economically Process and design co-optimization to provide enough area scaling Performance: dimension scaling grows metal and via resistance exponentially Fully automatic and smart via pillar design flow to reduce high resistance impact Power: ultra low voltage for ultra low power Robust design and variation modeling at low voltages Heterogeneous integration Low cost and high system performance 3D packaging High physical design complexity Machine learning to be applied to physical design
© 2017 ISPD ISPD 2017 4 Area Scaling
Process primary dimension (metal, gate and fin pitch) scales less and per generation cadence takes longer Process-design co-optimization innovation to keep area scaling Fin depopulation to increase cell density Power plan and cell layout co-optimization to increase cell placement utilization EUV to increase routing density
© 2017 ISPD ISPD 2017 5 Fin Depopulation can Increase Density and Reduce Process Dimension Scaling Pressure
4-fin cell 3-fin cell 2-fin cell
© 2017 ISPD ISPD 2017 6 Fin Depopulation can Increase Speed and Power Efficiency
Higher fins have highest speed Fewer fins have highest speed@same power and lowest power@same speed
© 2017 ISPD ISPD 2017 7 Fin Depopulation Impact on Logic Density and Cell Utilization
Logic Density 3.0
2.0 Cell Utilization 1.0 80%
75%
70% 16nm 10nm 7nm 3~4 fins 3 fins 2 fins
© 2017 ISPD ISPD 2017 8 Logic Density Improvement (I) Power Plan Optimization
To improve power plan IR, PG via count is increasing over technology generations
Shrink PG Pitch to increase via count
However, shrinking PG pitch impacts routing resources. So different and new PG design approaches evolved
Allows better cell placement freedom
© 2017 ISPD ISPD 2017 9 Logic Density Improvement (II) Power Plan and Cell Layout Co-optimization
Uniform M1 Dual M1 Top Down
PG design has to be friendly to cell architecture (Uniform Dual) PG and Cell Optimized Co-Design Cell re-Design
Big cell design has to be Bottom Up redesigned to be compatible with Dual PG architecture
© 2017 ISPD ISPD 2017 10 Logic Density Improvement (III) Power Plan and Cell Layout Push to Limit
Power strap: No stagger pin: 5 access points Less cells placed under PG
Unused space
Power stub: More cells placed under PG Stagger pin: 6 access points © 2017 ISPD ISPD 2017 11 Logic Density Improvement (IV) EUV to Increase Routing Density
NXE3300 EUV Hole type DSA, Line/Space DSA, Single 16 nm half pitch 12 nm half pitch Patterning
Inverse Multiple EUV Directed Self- Litho. patterning Assembly
© 2017 ISPD ISPD 2017 12 EUV Metal Metal : Poly pitch = 2 : 3
More Metal resources for logic density improvement 2 library sets are needed for two metal offset.
Metal:Poly = 1:1 Metal:Poly = 2:3
© 2017 ISPD ISPD 2017 13 Less Coupling from Metal : Poly = 2 : 3
ND2D1 ND2D1 M1:PO pitch = 1:1 M1:PO pitch = 2:3
© 2017 ISPD ISPD 2017 14 More Routing from Metal : Poly = 2 : 3
M1:Poly pitch = 1:1 M1:Poly pitch = 2:3
© 2017 ISPD ISPD 2017 15 Enhanced Logic Density and Cell Utilization Trends Logic Density 3.0
2.0 Optimized power plan, Cell pin access, Utilization 1.0 EUV routing 80%
75%
70% N16 N10 N7 11ML 12ML 13ML
© 2017 ISPD ISPD 2017 16 Performance Scaling
Process dimension scaling grows metal and via resistance exponentially Fully automatic and smart EDA design flow is needed to effectively solve high resistance issue for 7nm and beyond
© 2017 ISPD ISPD 2017 17 Cross Node Metal RC Scaling
75 2
) Mx R
1.8 nm
60 40
increase ~3X 1.6 nm)
45 40
1.4 (vs. (vs. 30 increase ~3X
Mx C 1.2 Cap
15 Mx
1
Mx Resistance (vs. (vs. Resistance Mx
0 0.8 40nm 28nm 16nm 7nm 5nm
© 2017 ISPD ISPD 2017 18 Cross Node Wire R and Via Impact
50%
40%
30%
20%
10% BEOL R delay vs. Total delay Total vs. delay R BEOL
0% 40nm 28nm 16nm 7nm 5nm
© 2017 ISPD ISPD 2017 19 VIA Pillar
M6
Single stacking via Via pillar
M6 M6
M5 M5 M4 M4
M3 M3 M2 M2
• Large drivers + upper thick metal + via pillar to reduce transistor, metal and via resistance • VIA-Pillar enablement is complex and requires EDA enablement across all Implementation phases
© 2017 ISPD ISPD 2017 20 Via Pillar Insertion
Implies layer promotion for better wire resistance Effectively reduces source Via resistance
Connection through DPT layer: DPT wire resistance dominates
Metal R Source Via R Std. Wire cell (20μm) 55.43X 1X
Layer promotion to non-DPT layer: Source Via resistance increases
Metal R Source Via R 13.23X 3.29X
Via pillar: Best combination of metal and source Via resistance
Metal R Source Via R 13.23X 1X
Std. cell Double Patterning Non-Double Patterning Technology (DTP) Layer Technology (DTP) Layer © 2017 ISPD ISPD 2017 21 Cross Node Wire R and Via Impact
50%
40%
30%
20% with Via Pillar
10% BEOL R delay vs. Total delay Total vs. delay R BEOL
0% 40nm 28nm 16nm 7nm 5nm
© 2017 ISPD ISPD 2017 22 Via Pillar Type and Usage
• EM via pillar (EM VP), used to guarantee cell-level EM, has to be 100% inserted – One cell master will only have 1 EM VP • Performance via pillar (Performance VP), which is larger than EM VP, can reduce more resistance – One cell master can have multiple performance VP
VP Association I
Cell Performance Performance Performance EM Default VP1 VP2 VP.. VP VP Master VP Association II Cell EM Performance Performance Performance Master VP VP1 VP2 VP..
© 2017 ISPD ISPD 2017 23 Via Pillar Automatic Design Flow
Design Kits •VIA pillar (VP) definition, association
Floorplan
• VP aware placement Placement • Automatic NDR
• VP insertion and verification CTS • CTS with VP
Routing/Opt. • VP optimization • DEF out option
© 2017 ISPD ISPD 2017 24 Voltage and Power Scaling
Low voltage design reduces power effectively Low voltage design challenges Functionality robustness Accurate variation modeling
© 2017 ISPD ISPD 2017 25 Ultra-Low-Voltage Standard Cell Solution
Skew Cells/ D0 or Fine-Grained Cells Internal Half-Sizing Cells 0.9V- 0.6V Small High-Stack(3 or 4) Cells Transmission Gates Multi-Bit Latch/Flop Proper Stage Ratio to Maintain Internal Slew 0.5V and Proper P/N Ratio to Improve Noise Margin below New Circuit Structure to Enhance Robustness at Low–Vdd
Improve cell performance by reducing delay degradation and variation induced by low-Vdd
© 2017 ISPD ISPD 2017 26 Flop Low Voltage Design Robustness
• High sigma design checks are required to ensure robust operation at low operating voltage • For write operation, forward path (red) must be stronger than feedback path (blue)
QN
Reset clk_b clk SI
D clk Reset clk_b
clk_b
Reset clk clk
clk_b Reset
© 2017 ISPD ISPD 2017 27 Delay Variation Increases at Lower VDD
2 1.8 Delay variation increases 1.6 exponentially as VDD decreases
1.4
1.2
1
0.8 deviation 0.6 ◄ Non-Gaussian Gaussian ► 0.4 at low voltage at regular voltage
0.2 Log of scaled delay standard delay scaled of Log 0 VDD
© 2017 ISPD ISPD 2017 28 Abnormal Hold-Time Caused by Low-Vdd Non-Gaussian Behavior
Flop Hold Constraint Behavior
Output Low VDD
Regular VDD
Hold time
© 2017 ISPD ISPD 2017 29 Solutions for Low-Vdd Variations (I)
Methodology improvement – Timing values are models by both “mean” and “early/late distribution” – Implement the new models in timing characterization and STA tools Original distribution (Non-Gaussian with long right tail)
“Early” distribution to cover short tail “Late” distribution to cover long tail
© 2017 ISPD ISPD 2017 30 Solutions for Low-Vdd Variations (II)
By using new timing model and advanced statistical OCV method, we achieve more accurate STA results compared to Monte-Carlo simulations
Optimistic
Slack
SPICE Monte Carlo Simple OCV Advanced Statistical OCV Path ID
© 2017 ISPD ISPD 2017 31 Heterogeneous Integration
Low cost and high system performance InFO and CoWoS integration Integrated EDA design flow for package and chips
© 2017 ISPD ISPD 2017 32 Future Device Possibilities Through Heterogeneous Technology Integration
TFET RF CMOS Analog GAA- NWT eFlash
Source Gate Drain III-V FFET Mixed Signal Si CMOS BSI CIS 3D Low Integrated Ge FinFET Power Specialty on Si CMOS MEMS CMOS Technology
FinFET HV 3D-IC Technology BCD - Power IC InFO TM Cu-TSV CoWoS Vertical Stacking Si Interposer
© 2017 ISPD ISPD 2017 33 3D Packaging for Better System Performance and Form Factor
Analog RF Passive
Logic
Memory Wafer Bonding
Conventional CoWoS / InFO Vertical Stacking
Conventional SIP 3D Wafer-based or MCM System Integration (Wirebond, (CoWoS, InFo-PoP, Vertical Stacking, Wafer Bonding) Flipchip, SMT)
Sources: A-SSCC 2014, IEDM 2014, VLSI 2015 © 2017 ISPD ISPD 2017 34 Different Systems Require Different Packaging Solutions
4000
3000
CoWoS
2000 InFO
I/O Pin CountI/O Pin InFO-M 1000 InFO PoP
WLCSP*
0 0 200 400 600 800 1000 1200 1400 2 * WLCSP : Wafer Level Chip Scale Package Die/PKG size (mm ) ** InFO : Integrated Fan-Out *** CoWoS : Chip on Wafer on Substrate © 2017 ISPD ISPD 2017 35 Wafer-Level Packaging Technologies: Integrated Fan-Out (InFO)
4000 Logic Multi-chips integration Smallest form-factor InFO
3000 Cost competitive
CoWoSDRAM Logic Through- 2000 InFO–Via InFO (TIV) InFO-PoP
I/O Pin CountI/O Pin InFO-M 1000 InFO PoP Logic Memory
* WLCSP InFO-M (Multi-chip) 0 0 200 400 600 800 1000 1200 1400 Die/PKG size (mm2)
© 2017 ISPD ISPD 2017 36 Wafer-Level Packaging Technologies: Chip-on-Wafer-on-Substrate (CoWoS)
4000
Homogeneous partition
3000
CoWoS High bandwidth memory 2000 integration for high-performance InFO computing
I/O Pin CountI/O Pin InFO-M 1000 InFO PoP High-performance heterogeneous partitioning WLCSP*
0 0 200 400 600 800 1000 1200 1400 Die/PKG size (mm2)
© 2017 ISPD ISPD 2017 37 Advanced Packaging Co-Design – SoC + InFO + IPD (Integrated Passive Device) FC-PoP InFO-PoP o o Rja (17.2 C/W) Rja (15.4 C/W) IPD 1 Thermal SR F-S pad L6 PM4 dissipation RDL3 Insulator5 PM3 improved by 12% L5 RDL2 Insulator4 PM2 L4 RDL1 250μm RDL0 PM1 Core Molding 590μm Silicon L3 Compounds Insulator2 L2 B-S pad Insulator1 L1 SR 2 Thickness TMV Silicon DRAM Molding reduced by 57% Compounds
DRAM Eye Opening: Eye Opening: 0.785 UI 0.663 UI 3 Eye width improved by >12%
InFO-PoP W/I IPD
FC-PoP W/O IPD 4 Voltage droop © 2017 ISPD reduced by 5~10% ISPD 2017 38 InFO Design Solutions (InFO Only)
CDL Netlist Auto-Export InFO Layout Creation InFO GDS PKG Layout In-design DRC InFO PKG Layout InFO CDL LVS Netlist
InFO DRC/LVS
RLCK Extraction < 4GHz RC/RLCK Extraction for STA, IR, SEM, & PI Analysis InFO RC Extraction S-parameter Extraction 4GHz – 10GHz S-parameters Extraction for SI/PI, & EMI analysis
© 2017 ISPD ISPD 2017 39 InFO Design Solutions (Dies + InFO)
Inter-die DRC/LVS Inter-die Inter-die DRC Inter-die LVS DRC/LVS InFO EM/IR Analysis
B1 B2 B3 B4 Dynamic A1 A2 A3 A4 Static IR IR Top die EM/IR Analysis Analysis Analysis
SI/PI Simulation SI/PI Simulation SI Analysis PI Analysis Thermal-aware EM/IR Analysis
Normal Analysis Thermal-aware (No Thermal) Analysis Thermal Analysis
© 2017 ISPD ISPD 2017 40 Applying Machine Learning to
Complex Physical Design Problems
Place and route Machine Learning experiment Feature extraction and convolutional neural network data mapping Clock gating and routing congestion speed improvement
© 2017 ISPD ISPD 2017 41 Machine Learning-Based Design Solution
Eliminate human & fixed-model subjective bias with statistical important features
Traditional EDA Approach Machine Learning Approach
Placement Placement result Routing result Routing Model Routing simulation overheads training overheads
Routing programming Detailed Learning
Prediction pattern model routing result
Easy to be biased and Only heuristics Leverage widely used deep no guaranteed quality are possible learning packages
Traditional (EDA) flow Flow enables by Machine Learning Pre-route design Congestion map after routing Congestion prediction Congestion map after routing
Only know routing results after Able to predict routing behavior and improve running routing congestion with new recipes to achieve 40Mhz © 2017 ISPD speed improvement ISPD 2017 42 ML Design Enablement Platform
RTL Synthesis Place & Route STA Signoff Tapeout
Adopt TSMC ML Platform to predict the design improvement opportunity
RTL Synthesis Place & Route STA Signoff Optimized design
APR TSMC ML Design Enablement Platform database Script to Feature Data pre- perform design Prediction extractions processing optimization
TSMC’s proven ML training models
© 2017 ISPD ISPD 2017 43 ML Design Enablement Platform
RTL Synthesis Place & Route STA Signoff Tapeout
Adopt TSMC ML Platform to train design and build new models
RTL Synthesis Place & Route STA Signoff Optimized design
APR database TSMC ML Design Enablement Platform for new Feature Data pre- Script to learning perform design extractions processing optimization Prediction
Training Model Mgmt. TSMC’s models
Customer’s models
© 2017 ISPD ISPD 2017 44 Performance Gains Prediction from ML
• Capability to predict and set accurate ARM A72 clock gating latency to avoid over-design and achieve better speed from 50 to 150Mhz • Post-route speed is improved by 40MHz with detour prediction and early fix
Predicted detours Real post-route detours
© 2017 ISPD Achieved 95% matching on A72 ISPD 2017 45 Conclusion
Five semiconductor industry trends, issues and solutions are discussed: area, performance and power scaling, heterogeneous 3D integration and Machine Learning Physical design and EDA play even more critical roles to extend Moore’s law and to enable highly integrated complex 3D SOC chips
© 2017 ISPD ISPD 2017 46