Bespoke Behavioral Processors

BESPOKE BEHAVIORAL PROCESSORS

by

Rohit Sreekumar

APPROVED BY SUPERVISORY COMMITTEE:

Dr. Benjamin Carrion Schaefer, Chair
Dr. Lakshman Tamil
Dr. Yang Hu

Copyright © 2020 Rohit Sreekumar
All rights reserved

This thesis is dedicated to my parents, R Sreekumar and Girija Sreekumar, and my dear friends.

BESPOKE BEHAVIORAL PROCESSORS

by

ROHIT SREEKUMAR, B.Tech

THESIS

Presented to the Faculty of The University of Texas at Dallas in Partial Fulfillment of the Requirements for the Degree of

MASTER OF SCIENCE IN COMPUTER ENGINEERING

THE UNIVERSITY OF TEXAS AT DALLAS
May 2020

ACKNOWLEDGMENTS

I would like to thank god almighty for instilling in me the confidence and drive to successfully complete this study.

I express my deepest gratitude to my graduate thesis advisor, Dr. Benjamin Carrion Schaefer, for the opportunity to work with him on a research-oriented project. His continued support, guidance and, most of all, his belief in me enabled me to successfully complete my thesis. I was able to learn immensely from his in-depth knowledge and experience in my work. Thank you sir.

I would like to express my gratitude to Dr. Lakshman Tamil and Dr. Yang Hu for being on my evaluation committee.

I would like to give a special thanks to my fiancee for her love, support and immense encouragement throughout. I would like to thank my family for their constant motivation and support, and for providing me with words of wisdom during my difficult times.

I am thankful to the Department of Electrical and Computer Engineering at The University of Texas at Dallas for their help and for providing me with ample research facilities all through my work.

March 2020

BESPOKE BEHAVIORAL PROCESSORS

Rohit Sreekumar, MSCE
The University of Texas at Dallas, 2020

Supervising Professor: Dr. Benjamin Carrion Schaefer

Many emerging applications require simple controllers that run the exact same application continuously. These include medical devices and IoT devices of various kinds.
Because of their nature, these applications have to be ultra-low power and small. Most of these applications are computationally inexpensive and therefore amenable to execution on a simple microprocessor, so they are typically mapped onto low-power processors. One problem with using a general-purpose processor is that a specific application does not require all of its resources; thus, there is large potential for simplifying the processor to achieve lower area and power. In addition, these processors can be specified at the behavioral level, with High-Level Synthesis (HLS) generating the RTL automatically. This opens a window for additional optimizations, as the processor can be pruned and re-synthesized at different VLSI design levels in order to obtain a smaller and more power-efficient processor. This work presents a methodology to customize a behavioral RISC processor automatically for a given workload such that its area and power are significantly reduced compared to the original processor. Compared to previous work that customizes a given processor at the gate netlist level only, the proposed method helps reduce the area and power significantly by raising the level of abstraction.

TABLE OF CONTENTS

ACKNOWLEDGMENTS
ABSTRACT
LIST OF FIGURES
LIST OF TABLES
CHAPTER 1 INTRODUCTION
  1.1 Thesis Motivation
  1.2 Thesis Contribution and Organization
CHAPTER 2 APPLICATION SPECIFIC INSTRUCTION SET PROCESSORS
  2.1 Introduction
  2.2 Application Specific Processors
    2.2.1 Definition
    2.2.2 ASIP vs General CPU
  2.3 ASIP Design Flow
  2.4 Synopsys Processor Designer
  2.5 Cadence Xtensa
CHAPTER 3 HIGH LEVEL SYNTHESIS
  3.1 Introduction to VLSI Design
  3.2 VLSI Design Flow and its Applications
  3.3 Introduction To High Level Synthesis
  3.4 High Level Synthesis Design Flow
    3.4.1 Resource Allocation
    3.4.2 Scheduling
    3.4.3 Binding
    3.4.4 RTL Code Generation
  3.5 Advantages of HLS
  3.6 Disadvantages of HLS
  3.7 Commercial HLS Tools
    3.7.1 Vivado HLS
    3.7.2 Catapult C
    3.7.3 C to Silicon
    3.7.4 CyberWorkBench
CHAPTER 4 BESPOKE PROCESSORS
  4.1 Introduction
  4.2 Motivational Example
  4.3 Bespoke Processor Proposed Method
    4.3.1 Behavioral Pruning
    4.3.2 RTL Pruning
    4.3.3 Gate Netlist Pruning
  4.4 Experimental Results
    4.4.1 Experimental Setup
    4.4.2 Experimental Results
CHAPTER 5 CONCLUSION AND FUTURE WORK
  5.1 Conclusion
  5.2 Future Work
REFERENCES
BIOGRAPHICAL SKETCH
CURRICULUM VITAE

LIST OF FIGURES

2.1 ASIP design flow. [1]
2.2 Synopsys Processor Design overview [2]
2.3 Cadence Xtensa Design Flow. [3]
3.1 VLSI design flow
3.2 HLS Gajski-Kuhn Y-chart [18]
3.3 HLS Design Flow
3.4 Control and Data Flow graph
3.5 Resource allocation example
3.6 Scheduling example
3.7 Binding example
4.1 Motivational example. (a) Synthesizable behavioral description snippet of scalar MIPS processor. (b) RT-Level block diagram of processor. (c) Gate netlist view. (d) Application to be run on the processor (average of 8 numbers).
4.2 Area, power and timing reduction after each stage
4.3 Overview of complete bespoke processor proposed flow
4.4 Proposed method vs Previous method area savings
4.5 Area savings per level of abstraction
4.6 Power savings per level of abstraction
4.7 Delay savings per level of abstraction
4.8 Synthetic Benchmarks Area Savings

LIST OF TABLES

4.1 Supported MIPS Instruction Set
4.2 Benchmark details
4.3 Run Time for Iterative Reduction Approach
4.4 Run Time for Direct Reduction Approach
LIST OF ABBREVIATIONS

ALAP   As Late As Possible
ALU    Arithmetic Logic Unit
ASAP   As Soon As Possible
ASIP   Application Specific Instruction Set Processor
CISC   Complex Instruction Set Computer
CDFG   Control and Data Flow Graph
CPU    Central Processing Unit
DSP    Digital Signal Processing
HDL    Hardware Description Language
HLS    High Level Synthesis
I/O    Input/Output
IoT    Internet of Things
LISA   Language for Instruction Set Architectures
MOS    Metal Oxide Semiconductor
RAM    Random Access Memory
RISC   Reduced Instruction Set Computer
ROM    Read Only Memory
RTL    Register Transfer Level
VHDL   Very High Speed Integrated Circuit Hardware Description Language
VLIW   Very Long Instruction Word
VLSI   Very Large Scale Integration

CHAPTER 1

INTRODUCTION

The Internet of Things (IoT) is probably one of the most exciting and fastest-growing technologies today. There are currently over 6 billion connected devices, and it is predicted that by 2022 this number will increase to over 30 billion. It is also estimated that the global data volume will grow exponentially, from 4.4 zettabytes to 44 zettabytes in 2022. These devices are deployed in settings ranging from connected homes to industrial applications. Examples of IoT applications range from smart homes and cities to healthcare and transportation. The term IoT was coined to describe this network of interconnected devices.

IoT systems typically require an embedded processor, some communication circuit (e.g. Wi-Fi, Bluetooth) and multiple sensors. The main problem is that these types of systems are often battery operated and rely on renewable energy to recharge their batteries; thus, they need to be ultra-low power. At the same time, they often execute specific, static applications. This raises the question of whether these IoT systems can be tailored to further reduce their power consumption.

1.1 Thesis Motivation

A large number of applications require ultra-low power hardware and are cost sensitive.
At the same time, these applications are not very computationally demanding and thus can be executed on a general-purpose processor. These applications include wearable [10; 23] and IoT applications [11; 24].

One of the problems with general-purpose processors is that they are not as power efficient as dedicated solutions like ASICs. To address this, state-of-the-art processors make use of different levels of adaptive power management techniques, such as power gating [15; 12] and event-driven programming through interrupts [14]. These techniques help reduce the power consumption of the unused parts of the processor, but they are often restricted to a coarse granularity and at the same time lead to area overheads due to the need to include different clock domains, gating logic, etc.

1.2 Thesis Contribution and Organization

A bespoke processor is created by tailoring an existing general-purpose microprocessor to a particular target application. This thesis raises the abstraction level of the tailoring procedure: instead of starting at the gate netlist level, tailoring begins at a higher level of abstraction. The methodology follows an iterative tailoring procedure. Tailoring begins at the behavioral description of the processor, where all unused lines of code are removed, followed by an RTL-level reduction and finally a gate-level netlist reduction.

The remaining chapters of the thesis are organized as follows. Chapter 2 introduces ASIPs, components used in Systems-on-Chip whose instruction set is tailored to a specific application, together with their design flow and the tools that support it. Chapter 3 introduces the concept of High Level Synthesis, its design flow, the pros and cons of HLS, and the various tools for HLS. Chapter 4 introduces the concept of bespoke processors, the proposed methodology for the generation of behavioral bespoke processors, and the experimental results obtained. Chapter 5 presents the conclusion.
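The iterative tailoring procedure described in Section 1.2 can be illustrated with a small toy model. The sketch below is not the thesis's tool flow, which re-synthesizes the processor between stages. It only assumes, for illustration, that each abstraction level can be represented as a mapping from a design element to the opcodes that exercise it. All element names (`case_add`, `adder`, `carry_chain`, and so on), the helper functions, and the four-instruction MIPS-like workload are hypothetical.

```python
from typing import Dict, List, Set

# Hypothetical toy model: each abstraction level maps a design element
# to the set of opcodes that exercise it. Names are illustrative only.
Level = Dict[str, Set[str]]


def used_opcodes(program: List[str]) -> Set[str]:
    """Collect the opcodes of a fixed workload (first token of each line)."""
    return {line.split()[0] for line in program if line.strip()}


def prune(level: Level, used: Set[str]) -> Level:
    """Keep only the elements exercised by at least one used opcode."""
    return {name: ops for name, ops in level.items() if ops & used}


def tailor(levels: List[Level], program: List[str]) -> List[Level]:
    """Prune every level in order: behavioral, then RTL, then gate netlist."""
    used = used_opcodes(program)
    return [prune(level, used) for level in levels]


# Assumed workload: load two numbers, add them, store the result.
program = ["lw r1 0(r0)", "lw r2 4(r0)", "add r3 r1 r2", "sw r3 8(r0)"]

# Assumed design elements at each level of abstraction.
behavioral = {"case_add": {"add"}, "case_mul": {"mul"}, "case_mem": {"lw", "sw"}}
rtl = {"adder": {"add", "lw", "sw"}, "multiplier": {"mul"}}
gates = {"carry_chain": {"add", "lw", "sw"}, "partial_products": {"mul"}}

pruned = tailor([behavioral, rtl, gates], program)
for name, level in zip(["behavioral", "RTL", "gate"], pruned):
    print(name, sorted(level))
```

Because the workload never issues `mul`, every element that only `mul` exercises disappears at all three levels, which is the intuition behind removing unused resources for a fixed application.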
CHAPTER 2

APPLICATION SPECIFIC INSTRUCTION SET PROCESSORS

2.1 Introduction

Advancements in semiconductor fabrication technologies have enabled solutions with short product cycles to cope with constantly varying application functionality.