HARDWARE ACCELERATORS FOR VLSI GLOBAL ROUTING


A Thesis
Presented to the Faculty of Graduate Studies of the University of Guelph
by Mahdi Elghazali
In partial fulfilment of requirements for the degree of Master of Science
January, 2009
© Mahdi Elghazali, 2009

Library and Archives Canada, Published Heritage Branch, 395 Wellington Street, Ottawa ON K1A 0N4, Canada
ISBN: 978-0-494-47764-9

NOTICE: The author has granted a non-exclusive license allowing Library and Archives Canada to reproduce, publish, archive, preserve, conserve, communicate to the public by telecommunication or on the Internet, loan, distribute and sell theses worldwide, for commercial or non-commercial purposes, in microform, paper, electronic and/or any other formats.

The author retains copyright ownership and moral rights in this thesis. Neither the thesis nor substantial extracts from it may be printed or otherwise reproduced without the author's permission.

In compliance with the Canadian Privacy Act some supporting forms may have been removed from this thesis. While these forms may be included in the document page count, their removal does not represent any loss of content from the thesis.

ABSTRACT

Mahdi Elghazali, University of Guelph, 2009
Advisor: Dr. Shawki Areibi

This thesis investigates three approaches to enhancing the performance of the global routing step in the physical design process. The first approach is based on a hardware/software co-design strategy. The second is a custom hardware implementation using Handel-C [1]. The third is an application-specific instruction-set implementation that targets the Tensilica configurable processor. The experimental results show that all three approaches produce solutions of the same quality as the pure-software implementation. The co-design approach achieves an average speedup of 4.3x over the pure-software approach, while the custom hardware approach achieves an average speedup of 3.9x. The configurable processor approach achieves an average speedup of 33.6x over the pure-software approach, and speedups of 7.81x and 8.61x over the hardware/software co-design and the custom hardware approaches respectively.

I hereby declare that I am the sole author of this thesis. I authorize the University of Guelph to lend this thesis to other institutions or individuals for the purpose of scholarly research. I further authorize the University of Guelph to reproduce this thesis by photocopying or by other means, in total or in part, at the request of other institutions or individuals for the purpose of scholarly research.

The University of Guelph requires the signatures of all persons using or photocopying this thesis. Please sign below, and give address and date.
Acknowledgments

I would like to take this opportunity to express my sincere appreciation and thanks to my supervisor, Professor Shawki Areibi, for his great guidance and assistance, and for the help he provided throughout this Master's program. Many thanks to Professor Radu Muresan and Professor Gary Grewal for reviewing this thesis. I would also like to thank Adam Erb and Jon Spenceley for their help in this work. I especially want to thank my father, my mother, my brothers and sister for their continuous encouragement and support. Finally, many thanks to all my friends. Special thanks to Ahmed Saghaier and Ahmed Elhossini; I really enjoyed the time we spent together. Thanks to all the people who helped me by any means.

To my family for their support and encouragement.

Contents

1 Introduction
  1.1 Motivation
  1.2 Overall Methodology
  1.3 Contributions
  1.4 Thesis Organization
2 Background
  2.1 VLSI Design Process
    2.1.1 VLSI Physical Design Automation
  2.2 Global Routing
    2.2.1 Routing Problem Definition
    2.2.2 A Classification of Global Routing Algorithms
  2.3 Maze Routing Algorithms
    2.3.1 Lee's Algorithm
    2.3.2 Limitations of Lee's Algorithm for Large Circuits
    2.3.3 Reducing the Running Time
  2.4 Reconfigurable Computing Systems
    2.4.1 Hardware/Software Co-design in RCS
    2.4.2 Field-Programmable Gate Arrays (FPGAs)
  2.5 Application Specific Instruction-set Processors
    2.5.1 Tensilica Configurable Processors
  2.6 Benchmarks
  2.7 Summary
3 Literature Review
  3.1 Placement Based Hardware Accelerators
  3.2 Accelerators for FPGA Routers
    3.2.1 Distributed Workstations
    3.2.2 Pure Hardware Accelerators
  3.3 Accelerators for ASIC Routers
    3.3.1 General Purpose Processors
    3.3.2 ASIC-Based Implementations
    3.3.3 FPGA-Based Implementations
  3.4 Summary
4 Hardware/Software Co-design
  4.1 Methodology
  4.2 Design Flow of Lee's Algorithm
  4.3 A Pure-software Based Implementation
    4.3.1 Implementation on a MicroBlaze System
    4.3.2 Major Software Functions
    4.3.3 Multi-Terminal Nets Routing
    4.3.4 Profiling
    4.3.5 Framing Technique
  4.4 A Hardware/Software Co-Design Implementation
    4.4.1 Fast Simplex Link (FSL) Bus
    4.4.2 The Hardware Accelerator Module
  4.5 Results
    4.5.1 FPGA Usage
    4.5.2 Speedup
  4.6 Summary
5 A Handel-C Custom RTL Implementation
  5.1 DK Design Flow
  5.2 Design Constraints
  5.3 Design Details
    5.3.1 Parallelizing Lee's Algorithm
    5.3.2 Input/Output Data
  5.4 The Custom Hardware vs. The MicroBlaze Based Implementations
    5.4.1 Speedup
    5.4.2 FPGA Usage
  5.5 Summary
6 Configurable Processors Implementation
  6.1 Tensilica Configurable Processors
    6.1.1 Xtensa Processors
    6.1.2 Design Flow
  6.2 Design Details
    6.2.1 Design Environment and Overall Architecture
    6.2.2 Profiling
  6.3 Results
    6.3.1 Speed and Area
  6.4 Overall Comparison
    6.4.1 Speedup
    6.4.2 Area
  6.5 Summary
7 Conclusions
  7.1 Future Work
Bibliography
A Glossary
B AMIRIX AP1000 FPGA PCI Development Board
C RC10
D The Netlist and the Placement Files
  D.1 The Netlist File
  D.2 The Placement File

List of Tables

2.1 Benchmarks
3.1 Comparison between the three placement architectures
3.2 PE Commands
4.1 The Profiling Results
4.2 The FPGA Usage
4.3 The Consumed Clock Cycles and the Maximum Operating Frequency
4.4 The Obtained Speed up over pure-software
5.1 The Consumed Clock Cycles
5.2 The Actual Execution Time of the Three Implementations in Milliseconds
5.3 The FPGA Usage
6.1 Xtensa Processor Configuration Detail
6.2 The Profiling Results
6.3 The Consumed Clock Cycles and the Speed up Obtained over the Pure ISA Processor
6.4 The Consumed Clock Cycles
6.5 The Actual Execution Time of the Three Approaches in ms
6.6 The Speed up obtained by Tensilica Approach over the H/S and Handel-C Approaches

List of Figures

1.1 Interconnect and Gate Delay
1.2 The Overall Design Methodology
2.1 The VLSI Design Process
2.2 VLSI Physical Design Cycle
2.3 An Illustration of General Routing
2.4 The Classification of the Global Routing Algorithms
2.5 Lee's Algorithm: (a) The Wave Propagation Phase (b) The Retrace Phase (c) The Clean up Phase
2.6 Schemes to Reduce the Running Time of Lee's Algorithm: (a) Starting point selection (b) Double fan-out (c) Framing
2.7 A General FPGA Structure
2.8 A General Configurable Logic Block [2]
2.9 The Different Computing Approaches
3.1 Hardware Accelerators for CAD
3.2 The model of the partially reconfigurable dynamic system [3]
3.3 The Serial Architecture [3]
3.4 The Parallel Architecture [3]
3.5 The Serial Parallel Architecture [3]
3.6 HSRA T-Switch with Path-Search OR [4]
3.7 Maze Router General Architecture and Pipelined Processors
3.8 Basic structure of the wavefront machine
3.9 Block diagram of a single PE
3.10 L3 General Organization [5]
3.11 L4 Architecture [6]
4.1 The Design Methodology
4.2 The Flow Chart of Lee's Algorithm
4.3 The MicroBlaze System for the Pure-software Based Implementation
4.4 Assign the Source and the Target
4.5 The Wave Propagation Function
4.6 Retrace and Clean up Function
4.7 The Rip up Function
4.8 The Rip up Function Steps
4.9 Multi-Terminal Nets Routing
4.10 The Wave Propagation Function with Framing Technique
4.11 Framing 1 Technique
4.12 The MicroBlaze System for the Hardware/Software Co-design ..