PhD Dissertation

ABSTRACT

WIDIALAKSONO, RANDY HARI. Three-Dimensional Integration of Heterogeneous Multi-Core Processors. (Under the direction of Dr. Paul Franzon and Dr. W. Rhett Davis.)

This dissertation explores the advantages of, and the design methodology for, 3D integration in the context of building heterogeneous multi-core processors. The processor features a fast thread migration and cache core decoupling scheme. First, we present empirical results in a commercial 130 nm process. We demonstrate that the 3D implementation of a heterogeneous multi-core processor consumes 31% less power and has 22% shorter average wirelength than the 2D implementation. Second, this work presents the physical design methodology used for the tape-out of a die-stacked 3D-IC processor. Finally, we propose a new algorithm and methodology for timing-driven 3D-IC via assignment. Experimental results show up to 30% improvement in total negative slack compared to a via assignment algorithm with a total wirelength objective function.

© Copyright 2016 by Randy Hari Widialaksono
All Rights Reserved

Three-Dimensional Integration of Heterogeneous Multi-Core Processors

by Randy Hari Widialaksono

A dissertation submitted to the Graduate Faculty of North Carolina State University in partial fulfillment of the requirements for the Degree of Doctor of Philosophy

Computer Engineering

Raleigh, North Carolina
2016

APPROVED BY:

Dr. Eric Rotenberg
Dr. Agnes Szanto
Dr. Paul Franzon, Co-chair of Advisory Committee
Dr. W. Rhett Davis, Co-chair of Advisory Committee

DEDICATION

Dedicated to my wife and my parents, who instilled the importance of pursuing and applying knowledge.

BIOGRAPHY

Randy Widialaksono was born in Jakarta, Indonesia. He completed a Bachelor's degree in Electrical Engineering at Institut Teknologi Bandung, Indonesia, in 2009. He started his Ph.D. in Computer Engineering at North Carolina State University in 2010.
His research focus is on design implementation methodologies for realizing 3D integrated circuits. He also maintains an active interest in computer architecture, digital VLSI design, and machine learning. He has been an IEEE member since 2008.

ACKNOWLEDGEMENTS

First of all, I would like to thank both of my advisors, Dr. Paul Franzon and Dr. W. Rhett Davis, for being supportive and providing the opportunity for a rewarding research project.

I would also like to thank the following faculty: Dr. Eric Rotenberg for teaching advanced computer micro-architecture concepts. Dr. Krishnendu Chakrabarty at Duke University for collaborating on our research project and welcoming me to his DFT course and research. Dr. Agnes Szanto for feedback on the assignment problem and teaching computer algebra.

I would like to thank the following people for their contributions that made this dissertation possible: Dr. Steve Lipa for his tremendous contributions in deploying the design kit infrastructure, signing off our tapeouts, and developing numerous EDA utilities. Zhenqian Zhang for being a great colleague throughout the research project and helping in the final days of the tapeout by performing timing ECO fixes. Bagus Wibowo for collaboration on the timing-driven via assignment experiments. Wenxu Zhao for collaboration on papers and helpful technical discussions. Josh Ledford for developing customized I/O pads for the 3D-IC tapeout process. Jongbeum Park for sharing insights on device and interconnect scaling. Kirti Bhanushali and Dr. T. Robert Harris for proofreading and presentation feedback. Thor Thorollfsson for mentoring and sharing tape-out/Ph.D. experience. Elliott Forbes for conducting chip bringup of the 2D prototype. Rangeen for taking part in physical design for tapeouts and collaboration on papers. Brandon Dwiel for establishing the processor implementation. Vinesh Srinivasan for providing and verifying the 3D processor netlist.
Qouniitah Fadhilah for full support throughout the doctoral program, proofreading, and assisting in the graphics and typesetting department.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES

Chapter 1 Introduction
  1.1 Overview of the Following Chapters
  1.2 Abbreviations

Chapter 2 3D Integration
  2.1 3D Multi-core Processor
  2.2 Challenges for 3D
  2.3 Design for Test
    2.3.1 3D DFT Overhead
  2.4 On-chip Timing Measurements
  2.5 Routability Improvement
  2.6 3D Via Assignment

Chapter 3 3D-IC Physical Design Methodology
  3.1 Fabrication Process Technology
  3.2 Processor Architecture
    3.2.1 2D Prototype
    3.2.2 3D Architecture
  3.3 Design Flow
  3.4 Floorplan
  3.5 Via Assignment
    3.5.1 Visualization Tool
  3.6 Power Delivery
  3.7 Timing
    3.7.1 Timing Constraints and Analysis
    3.7.2 Inter-tier Clock Skew Balancing
  3.8 Physical Verification
    3.8.1 Design Rule Checks
    3.8.2 Connectivity Checks
  3.9 Physical Design Metrics

Chapter 4 3D-IC Benefits Case Study
  4.1 2D vs 3D Register File
    4.1.1 Experimental Framework
    4.1.2 Floorplan
    4.1.3 Area Comparison
    4.1.4 Power Analysis
    4.1.5 Face-to-face Via Pitch Analysis
    4.1.6 Routing Congestion
    4.1.7 Wirelength Analysis
  4.2 2D vs 3D Processor
    4.2.1 Floorplan
    4.2.2 Wirelength Analysis
    4.2.3 Power Analysis
    4.2.4 Path Delay Analysis
  4.3 Conclusion

Chapter 5 Timing Driven Via Assignment in 3D-IC
  5.1 Timing Metrics
  5.2 Optimal Assignment
  5.3 Nearest-Neighbor Assignment
    5.3.1 Timing-Ordered
    5.3.2 Contention Based
  5.4 Resolving Multiple Sinks
    5.4.1 Fan-In
    5.4.2 Fan-Out
  5.5 Timing Aware Cost Function
  5.6 Congestion Avoidance
  5.7 Experiment Results
    5.7.1 Framework
    5.7.2 Runtime
    5.7.3 Parameter Search
    5.7.4 Wirelength Comparison
    5.7.5 Quality of Result Comparison
  5.8 Conclusion

Chapter 6 Conclusion and Future Work
  6.1 Summary of Contributions
  6.2 Future Work

BIBLIOGRAPHY

LIST OF TABLES

Table 3.1 Process technology metrics
Table 3.2 H3 core types
Table 3.3 FabScalar processor metrics [33]
Table 3.4 Estimated maximum currents per metal width for vias and metals (mA per µm) [12]
Table 3.5 Physical design metrics of the fabricated 3D-IC processor
Table 4.1 Face-to-face via experiment parameters
Table 5.1 Assignment runtime with 2500 x 2500 problem size (seconds)
Table 5.2 Comparison of total wirelength between via assignment schemes (µm)
Table 5.3 Comparison of WNS between via assignment schemes (ns)

LIST OF FIGURES

Figure 2.1 Vernier TDC architecture for 3D on-chip timing measurements
Figure 2.2 3D on-chip timing measurement scheme
Figure 3.1 Cross-section of 3D-IC stack
Figure 3.2 Prototype fabricated in IBM-8RF 130 nm
Figure 3.3 Inter-core state transfer scheme: fast thread migration (FTM) [10]
Figure 3.4 Inter-core state transfer scheme: cache core decoupling (CCD) [10]
Figure 3.5 The 3D-IC physical design flow
Figure 3.6 Detailed 3D-IC EDA tool flow
Figure 3.7 3D-IC heterogeneous processor floorplan
Figure 3.8 Inter-tier signal to via assignment flow
Figure 3.9 F2F via visualization and analysis tool
Figure 3.10