Design Automation of Real-Life Asynchronous Devices and Systems Full Text Available At

Total Page:16

File Type:pdf, Size:1020Kb

Design Automation of Real-Life Asynchronous Devices and Systems Full Text Available At Full text available at: http://dx.doi.org/10.1561/1000000006 Design Automation of Real-Life Asynchronous Devices and Systems Full text available at: http://dx.doi.org/10.1561/1000000006 Design Automation of Real-Life Asynchronous Devices and Systems Alexander Taubin Boston University, USA Jordi Cortadella Universitat Polit`ecnica de Catalunya, Spain Luciano Lavagno Politecnico di Torino, Italy Alex Kondratyev Cadence Design Systems, USA Ad Peeters Handshake Solutions, The Netherlands Boston – Delft Full text available at: http://dx.doi.org/10.1561/1000000006 Foundations and Trends R in Electronic Design Automation Published, sold and distributed by: now Publishers Inc. PO Box 1024 Hanover, MA 02339 USA Tel. +1-781-985-4510 www.nowpublishers.com [email protected] Outside North America: now Publishers Inc. PO Box 179 2600 AD Delft The Netherlands Tel. +31-6-51115274 The preferred citation for this publication is A. Taubin, J. Cortadella, L. Lavagno, A. Kondratyev and A. Peeters, Design Automation of Real-Life Asynchronous Devices and Systems, Foundations and Trends R in Electronic Design Automation, vol 2, no 1, pp 1–133, 2007 ISBN: 978-1-60198-058-8 c 2007 A. Taubin, J. Cortadella, L. Lavagno, A. Kondratyev and A. Peeters All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, mechanical, photocopying, recording or otherwise, without prior written permission of the publishers. Photocopying. In the USA: This journal is registered at the Copyright Clearance Cen- ter, Inc., 222 Rosewood Drive, Danvers, MA 01923. Authorization to photocopy items for internal or personal use, or the internal or personal use of specific clients, is granted by now Publishers Inc for users registered with the Copyright Clearance Center (CCC). The ‘services’ for users can be found on the internet at: www.copyright.com For those organizations that have been granted a photocopy license, a separate system of payment has been arranged. Authorization does not extend to other kinds of copy- ing, such as that for general distribution, for advertising or promotional purposes, for creating new collective works, or for resale. In the rest of the world: Permission to pho- tocopy must be obtained from the copyright owner. Please apply to now Publishers Inc., PO Box 1024, Hanover, MA 02339, USA; Tel. +1-781-871-0245; www.nowpublishers.com; [email protected] now Publishers Inc. has an exclusive license to publish this material worldwide. Permission to use this content must be obtained from the copyright license holder. Please apply to now Publishers, PO Box 179, 2600 AD Delft, The Netherlands, www.nowpublishers.com; e-mail: [email protected] Full text available at: http://dx.doi.org/10.1561/1000000006 Foundations and Trends R in Electronic Design Automation Volume 2 Issue 1, 2007 Editorial Board Editor-in-Chief: Sharad Malik Department of Electrical Engineering Princeton University Princeton, NJ 08544 Editors Robert K. Brayton (UC Berkeley) Raul Camposano (Synopsys) K.T. Tim Cheng (UC Santa Barbara) Jason Cong (UCLA) Masahiro Fujita (University of Tokyo) Georges Gielen (KU Leuven) Tom Henzinger (EPFL) Andrew Kahng (UC San Diego) Andreas Kuehlmann (Cadence Berkeley Labs) Ralph Otten (TU Eindhoven) Joel Phillips (Cadence Berkeley Labs) Jonathan Rose (University of Toronto) Rob Rutenbar (CMU) Alberto Sangiovanni-Vincentelli (UC Berkeley) Leon Stok (IBM Research) Full text available at: http://dx.doi.org/10.1561/1000000006 Editorial Scope Foundations and Trends R in Electronic Design Automation will publish survey and tutorial articles in the following topics: • System Level Design • Physical Design • Behavioral Synthesis • Circuit Level Design • Logic Design • Reconfigurable Systems • Verification • Analog Design • Test Information for Librarians Foundations and Trends R in Electronic Design Automation, 2007, Volume 2, 4 issues. ISSN paper version 1551-3939. ISSN online version 1551-3947. Also available as a combined paper and online subscription. Full text available at: http://dx.doi.org/10.1561/1000000006 Foundations and Trends R in Electronic Design Automation Vol. 2, No. 1 (2007) 1–133 c 2007 A. Taubin, J. Cortadella, L. Lavagno, A. Kondratyev and A. Peeters DOI: 10.1561/1000000006 Design Automation of Real-Life Asynchronous Devices and Systems Alexander Taubin1, Jordi Cortadella2, Luciano Lavagno3, Alex Kondratyev4 and Ad Peeters5 1 Boston University, USA, [email protected] 2 Universitat Polit`ecnica de Catalunya, Spain, [email protected] 3 Politecnico di Torino, Italy, [email protected] 4 Cadence Design Systems, USA, [email protected] 5 Handshake Solutions, The Netherlands, [email protected] Abstract The number of gates on a chip is quickly growing toward and beyond the one billion mark. Keeping all the gates running at the beat of a single or a few rationally related clocks is becoming impossible. In static timing analysis process variations and signal integrity issues stretch the timing margins to the point where they become too conservative and result in significant overdesign. Importance and difficulty of such problems push some developers to once again turn to asynchronous alternatives. However, the electronics industry for the most part is still reluctant to adopt asynchronous design (with a few notable exceptions) due to a common belief that we still lack a commercial-quality Electronic Design Automation tools (similar to the synchronous RTL-to-GDSII flow) for asynchronous circuits. Full text available at: http://dx.doi.org/10.1561/1000000006 The purpose of this paper is to counteract this view by presenting design flows that can tackle large designs without significant changes with respect to synchronous design flow. We are limiting ourselves to four design flows that we believe to be closest to this goal. We start from the Tangram flow, because it is the most commercially proven and it is one of the oldest from a methodological point of view. The other three flows (Null Convention Logic, de-synchronization, and gate-level pipelining) could be considered together as asynchronous re-implementations of synchronous (RTL- or gate-level) specifications. The main common idea is substituting the global clocks by local syn- chronizations. Their most important aspect is to open the possibility to implement large legacy synchronous designs in an almost “push but- ton” manner, where all asynchronous machinery is hidden, so that syn- chronous RTL designers do not need to be re-educated. These three flows offer a trade-off from very low overhead, almost synchronous implementations, to very high performance, extremely robust dual-rail pipelines. Full text available at: http://dx.doi.org/10.1561/1000000006 Contents 1 Introduction 1 1.1 Requirements for an Asynchronous Design Flow 1 1.2 Motivation for Asynchronous Approach 4 1.3 Asynchronous Design 7 1.4 An Overview of Asynchronous Design Styles 9 1.5 Asynchronous Design Flows 11 1.6 Paper Organization 16 2 Handshake Technology 19 2.1 Motivation 19 2.2 Functional Design 21 2.3 Design Language Haste 21 2.4 Handshake Circuits 29 2.5 Handshake Implementations 32 2.6 Library Connection 34 2.7 Simulation and Verification 35 2.8 Structural Design Flow 36 2.9 Physical Design 39 3 Synchronous-to-asynchronous RTL Flow Using NCL 43 3.1 Overview 43 ix Full text available at: http://dx.doi.org/10.1561/1000000006 3.2 Null Convention Logic 45 3.3 NCL Design Flow with HDL Tools 49 3.4 DIMS-based NCL Design Flow 51 3.5 NCL Flow with Explicit Completeness 53 3.6 Verification of NCL Circuits 55 4 De-synchronization: Simple Mutation Circuit into Asynchronous 63 4.1 Introduction 63 4.2 Signal Transition Graphs 64 4.3 Revisiting Synchronous Circuits 66 4.4 Relaxing the Synchronous Requirement 68 4.5 Minimum Requirements for Correct Asynchronous Communication 69 4.6 Handshake Protocols for De-synchronization 76 4.7 Implementation of Handshake Controllers 80 4.8 Design Flow 82 4.9 Why De-synchronize? 83 4.10 Conclusions 84 5 Automated Gate Level Pipelining (Weaver) 85 5.1 Automated Pipelining: Motivation 85 5.2 Automated Gate-Level Pipelining: General Approach 88 5.3 Micropipeline Stages: QDI Template 91 5.4 Design Flow Basics 92 5.5 Fine-Grain Pipelining with Registers 93 5.6 Pipeline Petri Net Model of the Flow 95 5.7 Weaving: Simple Examples 100 6 Applications and Success Stories 107 6.1 Low-Power Robust Design Using Haste 107 6.2 Low-Power Robust Design using De-synchronization 109 6.3 Design of Cryptographic Coprocessor 111 Full text available at: http://dx.doi.org/10.1561/1000000006 7 Conclusions 121 Acknowledgments 125 References 127 Full text available at: http://dx.doi.org/10.1561/1000000006 1 Introduction 1.1 Requirements for an Asynchronous Design Flow With the increases in die size and clock frequency, it has become increasingly difficult to drive signals across a die following a globally synchronous approach. Process variations and signal integrity stretch the timing margins in static timing analysis to the point where they become too conservative and result in significant over-design. Asyn- chronous circuits measure, rather than estimate, the delay of the combinational logic, of the wires, and of the memory elements. They operate reliably at very low voltages (even below the transistor thresh- old), when device characteristics exhibit second and third order effects. They do not require cycle-accurate specifications of the design, but can exploit specification concurrency virtually at every level of granular- ity. They can save power because they naturally perform computation on-demand. Asynchronous design also provides unique advantages, such as reduced electromagnetic emission, extremely aggressive pipelining for high performance, and improved security for cryptographic devices. In addition, asynchronous design might become a necessity for 1 Full text available at: http://dx.doi.org/10.1561/1000000006 2 Introduction non-standard future fabrication technologies, such as flexible electron- ics based on low-temperature poly-silicon TFT technology [50] or nano- computing [97].
Recommended publications
  • Software-Defined Hardware Provides the Key to High-Performance Data
    Software-Defined Hardware Provides the Key to High-Performance Data Acceleration (WP019) Software-Defined Hardware Provides the Key to High- Performance Data Acceleration (WP019) November 13, 2019 White Paper Executive Summary Across a wide range of industries, data acceleration is the key to building efficient, smart systems. Traditional general-purpose processors are falling short in their ability to support the performance and latency constraints that users have. A number of accelerator technologies have appeared to fill the gap that are based on custom silicon, graphics processors or dynamically reconfigurable hardware, but the key to their success is their ability to integrate into an environment where high throughput, low latency and ease of development are paramount requirements. A board-level platform developed jointly by Achronix and BittWare has been optimized for these applications, providing developers with a rapid path to deployment for high-throughput data acceleration. A Growing Demand for Distributed Acceleration There is a massive thirst for performance to power a diverse range of applications in both cloud and edge computing. To satisfy this demand, operators of data centers, network hubs and edge-computing sites are turning to the technology of customized accelerators. Accelerators are a practical response to the challenges faced by users with a need for high-performance computing platforms who can no longer count on traditional general-purpose CPUs, such as those in the Intel Xeon family, to support the growth in demand for data throughput. The core of the problem with the general- purpose CPU is that Moore's Law continues to double the number of available transistors per square millimeter approximately every two years but no longer allows for growth in clock speeds.
    [Show full text]
  • Opencapi, Gen-Z, CCIX: Technology Overview, Trends, and Alignments
    OpenCAPI, Gen-Z, CCIX: Technology Overview, Trends, and Alignments BRAD BENTON| AUGUST 8, 2017 Agenda 2 | OpenSHMEM Workshop | August 8, 2017 AGENDA Overview and motivation for new system interconnects Technical overview of the new, proposed technologies Emerging trends Paths to convergence? 3 | OpenSHMEM Workshop | August 8, 2017 Overview and Motivation 4 | OpenSHMEM Workshop | August 8, 2017 NEWLY EMERGING BUS/INTERCONNECT STANDARDS Three new bus/interconnect standards announced in 2016 CCIX: Cache Coherent Interconnect for Accelerators ‒ Formed May, 2016 ‒ Founding members include: AMD, ARM, Huawei, IBM, Mellanox, Xilinx ‒ ccixconsortium.com Gen-Z ‒ Formed August, 2016 ‒ Founding members include: AMD, ARM, Cray, Dell EMC, HPE, IBM, Mellanox, Micron, Xilinx ‒ genzconsortium.org OpenCAPI: Open Coherent Accelerator Processor Interface ‒ Formed September, 2016 ‒ Founding members include: AMD, Google, IBM, Mellanox, Micron ‒ opencapi.org 5 5 | OpenSHMEM Workshop | August 8, 2017 NEWLY EMERGING BUS/INTERCONNECT STANDARDS Motivations for these new standards Tighter coupling between processors and accelerators (GPUs, FPGAs, etc.) ‒ unified, virtual memory address space ‒ reduce data movement and avoid data copies to/from accelerators ‒ enables sharing of pointer-based data structures w/o the need for deep copies Open, non-proprietary standards-based solutions Higher bandwidth solutions ‒ 25Gbs and above vs. 16Gbs forPCIe-Gen4 Better exploitation of new and emerging memory/storage technologies ‒ NVMe ‒ Storage class memory (SCM),
    [Show full text]
  • Program & Exhibits Guide
    FROM CHIPS TO SYSTEMS – LEARN TODAY, CREATE TOMORROW CONFERENCE PROGRAM & EXHIBITS GUIDE JUNE 24-28, 2018 | SAN FRANCISCO, CA | MOSCONE CENTER WEST Mark You Calendar! DAC IS IN LAS VEGAS IN 2019! MACHINE IP LEARNING ESS & AUTO DESIGN SECURITY EDA IoT FROM CHIPS TO SYSTEMS – LEARN TODAY, CREATE TOMORROW JUNE 2-6, 2019 LAS VEGAS CONVENTION CENTER LAS VEGAS, NV DAC.COM DAC.COM #55DAC GET THE DAC APP! Fusion Technology Transforms DOWNLOAD FOR FREE! the RTL-to-GDSII Flow GET THE LATEST INFORMATION • Fusion of Best-in-Class Optimization and Industry-golden Signoff Tools RIGHT WHEN YOU NEED IT. • Unique Fusion Data Model for Both Logical and Physical Representation DAC.COM • Best Full-flow Quality-of-Results and Fastest Time-to-Results MONDAY SPECIAL EVENT: RTL-to-GDSII Fusion Technology • Search the Lunch at the Marriott Technical Program • Find Exhibitors www.synopsys.com/fusion • Create Your Personalized Schedule Visit DAC.com for more details and to download the FREE app! GENERAL CHAIR’S WELCOME Dear Colleagues, be able to visit over 175 exhibitors and our popular DAC Welcome to the 55th Design Automation Pavilion. #55DAC’s exhibition halls bring attendees several Conference! new areas/activities: It is great to have you join us in San • Design Infrastructure Alley is for professionals Francisco, one of the most beautiful who manage the HW and SW products and services cities in the world and now an information required by design teams. It houses a dedicated technology capital (it’s also the city that Design-on-Cloud Pavilion featuring presentations my son is named after).
    [Show full text]
  • Migrating to Achronix FPGA Technology (AN023) Migrating to Achronix FPGA Technology (AN023)
    Migrating to Achronix FPGA Technology (AN023) Migrating to Achronix FPGA Technology (AN023) November 19, 2020 Application Note Introduction Many users transitioning to Achronix FPGA technology will be familiar with existing FPGA solutions from other vendors. Although Achronix technology and tools are very similar to existing FPGA technology and tools, there are some differences. Understanding these differences are needed to achieve the very best performance and quality of results (QoR). This application note discusses any differences in the Achronix tool flow, highlighting key files and methodologies that users may not be familiar with. Further this application note details the primitive components present in the Achronix fabric, and how they may differ from, or in many cases are similar to, other vendors. Finally this application note reviews the unique features, particularly focused on AI and ML workloads that are present in the Achronix FPGA devices. Related Documents This application note is intended to give an overview of any changes that a user may encounter when migrating to Achronix technology. For full details of any of the items described below, the user is directed to the appropriate user guide or application note. Instead of duplicating information, this application note highlights the changes, and refers the user to the appropriate user guide where they can obtain the full information. A number of user guides are commonly referred to throughout this document Speedster7t IP Component Library User Guide (UG086) .This user guide describes all the silicon elements on the Speedster7t family. It includes descriptions, and instantiation templates, of the memories, DSP, MLP, NAP and logic primitives.
    [Show full text]
  • QDI Asynchronous Design and Voltage Scaling
    PONTIFICAL CATHOLIC UNIVERSITY OF RIO GRANDE DO SUL FACULTY OF INFORMATICS COMPUTER SCIENCE GRADUATE PROGRAM QDI Asynchronous Design and Voltage Scaling Ricardo Aquino Guazzelli Dissertation submitted to the Pontifical Catholic University of Rio Grande do Sul in partial fulfillment of the requirements for the degree of Master in Computer Science. Advisor: Prof. Dr. Ney Laert Vilar Calazans Porto Alegre 2017 Replace this page with the Library Catalog Page Replace this page with the Committee Forms “If I have seen further than others, it is by standing upon the shoulders of giants.” (Isaac Newton) PROJETO ASSÍNCRONO QDI E ESCALAMENTO DA TENSÃO DE ALIMENTAÇÃO RESUMO Circuitos quasi-delay insensitive (QDI) são uma solução promissora para lidar com variações significativas de processo em tecnologias modernas, já que suas características acomodam variações significativas de atraso de fios e portas lógicas. Além disso, com as novas tendências de dispositivos móveis e a Internet das Coisas (IoT), o projeto QDI apresenta-se como um alternativa de implementação para aplicações operando em tensão baixa ou muito baixa. Como principal desvantagem, o projeto QDI conduz a um aumento significativo no consumo de área e potência, o que pode comprometer seu emprego. Este trabalho investiga a compatibilidade de projeto QDI sob escalamento de tensão, explorando e analisando templates QDI disponíveis na literatura. Entre estes templates, seleciona-se o template SCL como um opção interessante para amenizar consumo de área e potên- cia. Contudo, a proposta original deste template, demonstra-se aqui, apresenta problemas temporais. Devido a estes problemas, propõe-se aqui um template alternativo. Usando o fluxo de projeto ASCEnD, bibliotecas de standard cells para os templates em questão foram geradas a fim de avaliar os benefícios e desvantagens destes.
    [Show full text]
  • Transistor-Level Programmable Fabric
    TRANSISTOR-LEVEL PROGRAMMABLE FABRIC by Jingxiang Tian APPROVED BY SUPERVISORY COMMITTEE: ___________________________________________ Carl Sechen, Chair ___________________________________________ Yiorgos Makris, Co-Chair ___________________________________________ Benjamin Carrion Schaefer ___________________________________________ William Swartz Copyright 2019 Jingxiang Tian All Rights Reserve TRANSISTOR-LEVEL PROGRAMMABLE FABRIC by JINGXIANG TIAN, BS DISSERTATION Presented to the Faculty of The University of Texas at Dallas in Partial Fulfillment of the Requirements for the Degree of DOCTOR OF PHILOSOPHY IN ELECTRICAL ENGINEERING THE UNIVERSITY OF TEXAS AT DALLAS December 2019 ACKNOWLEDGMENTS I would like to express my deep sense of gratitude to the consistent and dedicated support of my advisor, Dr. Carl Sechen, throughout my entire PhD study. You have been a great mentor, and your help in my research and career is beyond measure. I sincerely thank my co-advisor, Dr. Yiorgos Makris, who brings ideas and funds for this project. I am grateful that they made it possible for me to work on this amazing research topic that is of great interest to me. I would also like to thank my committee members, Dr. Benjamin Schaefer and Dr. William Swartz for serving as my committee members and giving lots of advice on my research. I would like to give my special thanks to the tech support staff, Steve Martindell, who patiently helped me a thousand times if not more. With the help of so many people, I have become who I am. My parents and my parents-in-law give me tremendous support. My husband, Tianshi Xie, always stands by my side when I am facing challenges. Ada, my precious little gift, is the motivation and faith for me to keep moving on.
    [Show full text]
  • A Radiation Hardened Reconfigurable FPGA Shankarnarayanan Ramaswamy1, Leonard Rockett1, Dinu Patel1, Steven Danziger1 Rajit Manohar2, Clinton W
    A Radiation Hardened Reconfigurable FPGA Shankarnarayanan Ramaswamy1, Leonard Rockett1, Dinu Patel1, Steven Danziger1 Rajit Manohar2, Clinton W. Kelly, IV2, John Lofton Holt2, Virantha Ekanayake2, Dan Elftmann2 1BAE Systems, 9300 Wellington Road, Manassas, VA, 20110, USA 2Achronix Semiconductor Corporation, 333 W. San Carlos Street, San Jose, CA, 95110, USA 703-367-4611 [email protected] Abstract—A new high density, high performance radiation As described previously BAE Systems has produced, hardened, reconfigurable Field Programmable Gate Array completed radiation testing, and qualification of a radiation (FPGA) is being developed by Achronix Semiconductor hardened 16M Static Random Access Memory (SRAM) and BAE Systems for use in space and other radiation built using the RH15 process [2]. The BAE-Achronix Proof hardened applications.12 The reconfigurable FPGA fabric of Concept (POC) RH FPGA device, to be named architecture utilizes Achronix Semiconductor novel RadRunner, currently being fabricated uses this same picoPIPE technology and it is being manufactured at BAE memory cell technology for storing the device FPGA fabric Systems using their strategically radiation hardened 150 nm configuration data [3]. epitaxial bulk CMOS technology, called RH15. Circuits built in RH15 consistently demonstrate megarad total dose hardness and the picoPIPE asynchronous technology has RH15 Technology Features been adapted for use in space with a Redundancy Voting Circuit (RVC) methodology to protect the user circuits from Rad Hard 150nm CMOS Technology (RH15) Features single event effects. Minimum Feature Size 150nm Isolation Shallow Trench Isolation Device Options 26 Å / 70 Å S/D Engineering Halo with S/D Extensions Supply Voltages 1.5V / 3.3V Gate Electrodes N+ Poly (NFET) / P+ Poly (PFET) TABLE OF CONTENTS Metal Levels 7 Levels (Planarized BEOL) Poly & Diffusion Silicide CoSi2 1.
    [Show full text]
  • Are Fpgas Suffering from the Innovator's Dilemma?
    FPGA’2013 Panel Are FPGAs Suffering from the Innovator’s Dilemma? Moderator: Jason Cong, UCLA Panelists Jonathan Bachrach, UC Berkeley Robert Blake, CEO, Achronix Misha Burich, CTO, Altera Chuck Thacker, Technical Fellow, Microsoft Research Steve Trimberger, Fellow, Xilinx Credit – Idea of the Panel Jonathan Rose at Univ. of Toronto Sabbatical in Shanghai, China 2/18/13 UCLA VLSI CAD LAB 2 FPGA Industry Has Been Innovating and Riding on Moore’s Law System BOM Extensible Processing Sub-system Agile Mixed Signal Converter Power Cost Stacked Silicon Interconnect Ethernet MAC PCI Interface System Mixed Signal System Monitor Phase Multi-Mode Clock Generators Performance Multi-Gigabit SerDes Processor LVDS Transceivers DSP I/O Termination Impedance Phase Locked Loops Multi-Standard Programmable I/O Support I/O Buffers with Programmable Drive Strength CMOS / TTL Programmable I/O Dual Port RAM Block RAM Distributed RAM Oscillator 2/18/13 UCLA VLSI CAD LAB source: 3 FPGA Has the Highest Margins in Semiconductor Symbol Gross Margin Operating Margin Pre-tax Margin Net Margin AMD 22.80% -17.60% -22.40% -21.80% ALTR 69.60% 33.20% 33.20% 31.20% INTC 62.20% 27.40% 27.90% 20.60% MU 11.80% -7.50% -12.70% -12.50% NVDA 51.40% 16.20% 16.60% 14.50% XLNX 64.90% 28.90% 26.70% 23.70% source: Morgan Stanley 2/18/13 UCLA VLSI CAD LAB 4 So, What’s the Problem? Source WSTS (January 2013) and Xilinx The FPGA fraction is only $4.5B/$300B = 1.5% of semiconductor industy, and has been that way for 10+ years! 2/18/13 UCLA VLSI CAD LAB source: 5 ASIC Product Segment Marketshare
    [Show full text]
  • The Fabless/Foundry Supply Chain
    Close The Fabless/Foundry Supply Chain Fabless companies are growing and evolving, driven by the need for a wider array of high performance, low power products in tablets, smartphones, and ultrabooks. Foundries are racing to meet their needs. Led by Qualcomm, Broadcom, AMD, Nvidia and Marvell, fabless companies are quickly evolving, developing high performance, low power products in tablets, smartphones, and ultrabooks. Foundries, led by TSMC, UMC, GLOBALFOUNDRIES and SMIC, are racing to meet their needs, pushing into the 22nm technology node and more advanced types of packaging, including 3D integration. Meanwhile, all eyes are on Intel's recent expansion into the foundry business. The giant microprocessor company recently landed three new fabless customers: Netronome, Tabula and Achronix. That's just a toe in the water at this point, but the potential to be a major player is explosive. Also making a major change in the conventional fabless/foundry supply chain model is the industry leader, Taiwan Semiconductor Manufacturing Company (TSMC). The company recently announced plans to offer some levels of advanced packaging – so-called 2.5 3D integration – which pits is squarely against OSATS (companies offering OutSourced Assembly and Test Services), such as Amkor, ASE and STATSChipPAC. TSMC has also expanded into solar, LEDs and MEMS, and it's a sure bet other foundries are at least contemplating similar moves. The overall supply chain is made of fabless companies – more than 1800 of them according to the Global Semiconductor Alliance (GSA) – foundries that do that do manufacturing and OSATS that do the packaging and testing of the parts. Analysts differentiate "pure play" foundries that do nothing but foundry work from other companies that offer foundry services.
    [Show full text]
  • Complicity of High-End SOC-FPGA's for Data Centers
    International Journal of Recent Technology and Engineering (IJRTE) ISSN: 2277-3878, Volume-9 Issue-6, March 2021 Complicity of High-End SOC-FPGA’s for Data Centers Retikal Anil Kumar Abstract: As the network traffic increasing significantly due to application user can integrate it to system through Ethernet, increase in Data streaming, Big Data Analytics, Cloud PCIe, SATA, etc., the way of the FPGA integration with the Computing, Increasing the load on Data Centers, Which leads to system defines the type of data processing. This paper demand for high computational capabilities, low latency, consists of seven sections, the section-I will gives the high-bandwidth, power efficient data accelerators. As Re-Configurability of FPGA’s are more flexible for developing introduction, Section-II briefs the role of FPGA’s in data customized applications, so the FPGA hardware based data centers and section-III to section-VI describes about high-end accelerators are the potential devices to achieve low latency and FPGA’s by various vendors and their specifications, Section power efficient requirements. The modern FPGA’s are coming up VII gives conclusion of the paper. The major vendors of with the embedded communication hard IP’s like PCIe, Ethernet, FPGS’s are: (In this paper, considered SOC-FPGA’s & DDR based memory controllers, which makes easy for the embedded with features like PCIe Gen4 or 5 / Ethernet / deployment of network attached FPGA’s in data centers. This paper presents the role of FPGA’s in datacenters and analysis of SATA / Memory Controllers). high-end FPGA’s by various vendors, which are suitable for • Xilinx (Acquired By AMD) deployment in data centers.
    [Show full text]
  • The Future of Fpgas
    The Future of FPGAs Rajit Manohar Computer Systems Lab Chief Scientist Cornell University, Ithaca, NY 14853 Achronix Semiconductor Corp. http://vlsi.cornell.edu/ http://www.achronix.com/ System complexity in VLSI Year Name Transistors 1982 80286 134,100 1985 80386 275,000 1993 Pentium 3.1 million 1995 Pentium Pro 6 million 1998 Pentium III 9.5 million 2000 Pentium IV 42 million 2002 McKinley 243 million 2005 Montecito 1.7 billion 2020 ???? ~50 billion Source: Intel, IBM CPU frequency scaling Processor Frequencies 4000 3000 2000 Frequency (MHz) Frequency 1000 0 1990 1994 1998 2002 2006 2010 Year Source: Intel, IBM What did CPUs do with the transistors? • Improved throughput while preserving a sequential programming model ❖ Multi-cycle ❖ Pipelined ❖ Superscalar ❖ Out-of-order A A A A A B B B B B A A A A A ❖ Multi-threaded B B B B B Today’s supercomputers use commodity microprocessors as building-blocks FPGA architectures LB CB LB CB CB SB CB SB LB CB LB CB CB SB CB SB LB CB LB CB CB SB CB SB LB CB LB CB CB SB CB SB FPGA frequency scaling 3000 CPU FPGA 2250 1500 Frequency (MHz) Frequency 750 0 1990 1992 1996 1998 2000 2002 2004 2006 2007 2008 Year Source: Intel, IBM, Xilinx, Altera datasheets What did FPGAs do with the transistors? • Basic lookup-table (LUT) / flip-flop (FF) / carry-chain • Embedded multiplier LUT • Embedded memory FF • Embedded processor • DSP slices • Hardened I/Os MAC / MULT • Larger LUT configurations DSP • ... more logic ... reduce logic depth MEM Seems simple compared to CPUs..
    [Show full text]
  • Implementing Balsa Handshake Circuits
    Implementing Balsa Handshake Circuits A thesis submitted to the University of Manchester for the degree of Doctor of Philosophy in the Faculty of Science & Engineering 2000 Andrew Bardsley Department of Computer Science 1 Contents Chapter 1. Introduction .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. 12 1.1. Asynchronous design .. .. .. .. .. .. .. .. .. .. .. .. .. .. 13 1.1.1. The handshake .. .. .. .. .. .. .. .. .. .. .. .. .. 15 1.1.2. Delay models .. .. .. .. .. .. .. .. .. .. .. .. .. 19 1.1.3. Data encoding .. .. .. .. .. .. .. .. .. .. .. .. .. 22 1.1.4. The Muller C-element .. .. .. .. .. .. .. .. .. .. .. 25 1.1.5. The S-element .. .. .. .. .. .. .. .. .. .. .. .. .. 27 1.2. Thesis Structure .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. 28 Chapter 2. Asynchronous Synthesis .. .. .. .. .. .. .. .. .. .. .. .. 30 2.1. Design flows for asynchronous synthesis .. .. .. .. .. .. .. .. 30 2.2. Directness and user intervention .. .. .. .. .. .. .. .. .. .. .. 32 2.3. Macromodules and DI interconnect .. .. .. .. .. .. .. .. .. .. 33 2.3.1. Sutherland’s micropipelines .. .. .. .. .. .. .. .. .. 34 2.3.2. Macromodules .. .. .. .. .. .. .. .. .. .. .. .. .. 39 2.3.3. Brunvand’s OCCAM synthesis .. .. .. .. .. .. .. .. 40 2.3.4. Plana’s pulse handshaking modules .. .. .. .. .. .. .. 40 2.4. Other asynchronous synthesis approaches .. .. .. .. .. .. .. .. 41 2.4.1. ‘Classical’asynchronous state machines .. .. .. .. .. .. 42 2.4.2. Petri-net synthesis .. .. .. .. .. .. .. .. .. .. .. .. 43 2.4.3. Burst-mode machines .. .. .. .. .. .. .. .. .
    [Show full text]