Hu.Pdf (653.8Kb)

ALGORITHMIC TECHNIQUES FOR NANOMETER VLSI DESIGN AND MANUFACTURING CLOSURE

A Dissertation by SHIYAN HU

Submitted to the Office of Graduate Studies of Texas A&M University in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY

May 2008

Major Subject: Computer Engineering

Approved by:
Chair of Committee, Jiang Hu
Committee Members, Charles J. Alpert, Mosong Cheng, Donald K. Friesen, Weiping Shi
Head of Department, Costas N. Georghiades

ABSTRACT

Algorithmic Techniques for Nanometer VLSI Design and Manufacturing Closure. (May 2008)
Shiyan Hu, B.S., Beijing University of Aeronautics and Astronautics; M.S., Polytechnic University, Brooklyn, NY
Chair of Advisory Committee: Dr. Jiang Hu

As Very Large Scale Integration (VLSI) technology moves to the nanoscale regime, design and manufacturing closure becomes very difficult to achieve due to increasing chip and power density. Imperfections due to process, voltage and temperature variations aggravate the problem. Uncertainty in the electrical characteristics of individual devices and wires may cause significant performance deviations or even functional failures. These pose tremendous challenges to the continuation of Moore's law as well as the growth of the semiconductor industry. Efforts are needed in both the deterministic design stage and the variation-aware design stage. This research proposes various innovative algorithms to address both stages for obtaining a design with high frequency, low power and high robustness. For deterministic optimizations, new buffer insertion and gate sizing techniques are proposed.
For variation-aware optimizations, new lithography-driven and post-silicon tuning-driven design techniques are proposed.

For buffer insertion, a new slew buffering formulation is presented and is proved to be NP-hard. Despite this, a highly efficient algorithm which runs more than 90× faster than the best alternatives is proposed. The algorithm is also extended to handle continuous buffer locations and blockages.

For gate sizing, a new algorithm is proposed to handle a discrete gate library, in contrast to the unrealistic continuous gate library assumed by most existing algorithms. Our approach is a continuous-solution-guided dynamic programming approach, which combines the high solution quality of dynamic programming with the short runtime of rounding a continuous solution.

For lithography-driven optimization, the problem of cell placement considering manufacturability is studied. Three algorithms are proposed to handle cell flipping and relocation. They are based on dynamic programming and graph theoretic approaches, and can provide different tradeoffs between variation reduction and wirelength increase.

For post-silicon tuning-driven optimization, the problem of unified adaptivity optimization on logical and clock signal tuning is studied, which enables us to significantly save resources. The new algorithm is based on a novel linear programming formulation which is solved by an advanced robust linear programming technique. The continuous solution is then discretized using binary-search-accelerated dynamic programming, batch-based optimization, and Latin Hypercube sampling-based fast simulation.

To my parents Changxin Hu and Xiaoyu Hu.

ACKNOWLEDGMENTS

I would like to express my great thanks to my advisor Dr. Jiang Hu for his kind guidance during my Ph.D. study. Dr. Jiang Hu introduced me to the field of VLSI Computer-Aided Design. He shared his deep knowledge and research experience with me and constantly provided invaluable advice to me.
I truly appreciate all of his academic, moral and financial support.

Many thanks to my Ph.D. dissertation committee members, Dr. Charles Alpert, Dr. Mosong Cheng, Dr. Donald Friesen, Dr. Jiang Hu and Dr. Weiping Shi. I really appreciate their invaluable assistance with my dissertation. In addition, I would like to thank Dr. Weiping Shi for teaching great courses on physical design, where I learned a lot. He also spent much time discussing various CAD problems with me. I am very grateful to Dr. Charles Alpert of the IBM Austin Research Lab for being my mentor and manager when I was an intern there. He shared his great academic and industrial experience with me. Special thanks to Dr. Mosong Cheng and Dr. Donald Friesen for giving many highly valuable comments on my preliminary examination, proposal and dissertation.

I would also like to thank the graduate students Ganesh Venkataraman, Zhuo Feng, Pratik Shah, Zhanyuan Jiang and Nikhil Jayakumar in the Computer Engineering group at Texas A&M University for their help with my research.

Last, but not least, I would like to express my greatest gratitude to my family for their long-lasting encouragement and support.

TABLE OF CONTENTS

CHAPTER

I INTRODUCTION
  A. Preliminaries and Motivation
  B. Contribution
    1. Fast Algorithms for Slew Constrained Minimum Cost Buffering
    2. Gate Sizing for Cell Library-Based Designs
    3. Pattern Sensitive Placement for Manufacturability
    4. Unified Adaptivity Optimization of Clock and Logic Signals

II FAST ALGORITHMS FOR SLEW CONSTRAINED MINIMUM COST BUFFERING
  A. Introduction
  B. Preliminaries
  C. Complexity of Slew Buffering Problem
  D. Slew Constrained Minimum Cost Buffering Algorithms
    1. Overview of Classic Timing-Driven Buffering
    2. Discrete Slew Buffering Assuming Fixed Input Slew
      a. Algorithm
      b. Critical Differences from Timing Buffering
      c. Implementation Experiences
    3. Discrete Buffering without Input Slew Assumptions
      a. Basic Modifications
      b. Reduction to Maximum Bipartite Matching
    4. Continuous Slew Buffering
    5. Buffer Blockage Avoidance
  E. Discussion of Related Approaches
    1. Minimum Cost Slew Constrained Timing Buffering
    2. Capacitance-Based Buffering
  F. Experimental Results
    1. Experiment Setup
    2. Comparison with Timing Buffering
    3. Slew Buffering with Non-Fixed Input Slew
    4. Continuous Slew Buffering
    5. Handling Blockage
    6. Comparison with Capacitance-Based Buffering
  G. Conclusion

III GATE SIZING FOR CELL LIBRARY-BASED DESIGNS
  A. Introduction
  B. Problem Formulation
  C. Optimization Methodology
    1. Error Due to Nearest Rounding
    2. Proposed Methodology
  D. Discretization Algorithm
    1. Explore Gate Sizes Close to the Continuous Solution
    2. Solution Pruning
    3. Solution Clustering by LSH
  E. Experimental Results
  F. Conclusion

IV PATTERN SENSITIVE PLACEMENT FOR MANUFACTURABILITY
  A. Introduction
  B. Preliminaries
    1. Motivation
    2. Pattern
    3. Lookup Table for Manufacturability Cost
    4. Problem Formulation
  C. Cell Flipping
    1. Algorithmic Overview
    2. Solution Characterization
    3. Solution Propagation
    4. Solution Pruning
  D. Single Row Optimization and Multiple Row Optimization
    1. Algorithmic Overview (Single Row Optimization)
    2. Unconstrained Optimal Manufacturability-Driven Placement
    3. Manufacturability-Wirelength Tradeoff
    4. Extension to Multiple Row Optimization
  E. Experimental Results
    1. Experiment Setup
    2. Experiments with ISCAS'89 Benchmark Circuits
    3. Experiments with ISPD'04 Benchmark Circuits
  F. Conclusion

V UNIFIED ADAPTIVITY OPTIMIZATION OF CLOCK AND LOGIC SIGNALS
  A. Introduction
  B. Preliminaries and Motivation
  C. Overall Flow
  D. Continuous Optimization
    1. Linear Programming Formulation
    2. Robust Linear Programming
    3. Adaptive Application of Robust Linear Programming
  E. Discretization
    1. Discretizing PST Clock Buffers
      a. Solution Characterization
      b. Solution Propagation
      c. Acceleration by Pruning
    2. Discretizing Logic Circuits
    3. Fast Simulations for Timing Yield Estimation
    4. Time Complexity
  F. Experiments
    1. Continuous Adaptivity Optimization
    2. Discretization
  G. Conclusion

VI CONCLUSION

REFERENCES

VITA

LIST OF TABLES

TABLE

I Technology trend for VLSI chips [1].
II C, Q values for sinks [19].
III C, R, W values for each buffer type [19].
IV Comparison of discrete slew buffering (SB) and slew constrained timing buffering (VGL+S). #S refers to the average number of non-dominated solutions at driver. Slack is in ns. CPU time is in seconds.
V Slew constrained buffering with pruning based on (C, W), CWB. #S: the number of non-dominated solutions at driver. Area Saving is obtained comparing to SB.
VI The comparison of SB and VGL+S+PSP (VGL+S incorporated with pre-buffer slack pruning [19]). Speed up refers to the runtime difference between SB and VGL+S+PSP.
VII Comparison of discrete slew buffering (SB) and slew constrained timing buffering (VGL+SB+PSP) on 100 large-degree nets. Slack is in ns.
VIII Results of slew buffering with
