The What, Why, and How of Customizable Processors: Meeting Performance, Cost, and Power Objectives While Reducing ASIC Design Risk and Increasing Design Flexibility


Customizable processors that perform intensive data processing are designed to provide programmability in the performance-intensive dataplane of the system-on-chip (SoC) design. Not only do they combine the capabilities of a DSP and a CPU, but they can be customized to maximize efficiency for your target application.

Contents
Introduction
Getting More Performance the Old Way #1: Higher Clock Speed
Getting More Performance the Old Way #2: RTL Acceleration
What is a Customizable Processor?
Processor Design Cycle
Achieving Lower Energy Consumption with the Xtensa Processor
Benefits of Using a Custom Processor Design
Customizable Processors for Digital Signal Processing
Customizable Processors as RTL Alternatives
Advantages of Using Processors Instead of RTL
Fit the Processor to the Algorithm
Conclusion
Additional Information

Introduction

While processors are often used for the control functions in system-on-chip (SoC) designs, designers turn to RTL blocks for many data-intensive functions that control processors can’t handle. However, RTL blocks take a long time to design and even longer to verify, and they are not programmable to handle multiple standards or designs.

The most common embedded microprocessor architectures, such as the ARM®, MIPS, and PowerPC processors, were developed in the 1980s for stand-alone microprocessor chips. These general-purpose processor architectures, or CPUs, are good at executing a wide range of algorithms with a focus on control code, but SoC designers often need more performance in critical datapath portions of their designs than these microprocessor architectures can deliver.

To bridge this performance gap, the two most-used approaches are to run the general-purpose processor at a higher clock rate (thus extracting more performance from the same processor architecture), or to hand-design acceleration hardware that offloads some of the processing burden from the processor. Running a general-purpose processor core at a high clock rate incurs a power and area penalty, and designing acceleration hardware takes additional development time, not just for the design but for verification of the new acceleration hardware. In fact, verification can consume as much as 70% to 80% of the total design time.

This white paper discusses how Cadence® Tensilica® Xtensa® customizable processors can achieve high performance and lower energy consumption, save time, and provide design flexibility versus hand-coded RTL hardware or a general-purpose processor.

Getting More Performance the Old Way #1: Higher Clock Speed

When designing a new chip, some ASIC design teams wait until they know exactly how much processing “horsepower” they need, then they select the processor. This design approach locks up the chip’s system architecture by basing it around the relatively narrow performance range occupied by all of the general-purpose processor architectures. For example, all 32-bit RISC architectures provide essentially the same performance at the same clock speed.
If the ASIC design team is wrong about the system’s performance needs and additional processor performance is required, the only alternative available when using general-purpose processors that have fixed instruction set architectures (ISAs) is to use a processor from the same family that runs at a higher clock rate. This is a tried-and-true approach to system engineering dating back to the earliest days of microprocessor usage.

However, this decades-old approach has clearly reached the end of its useful life. High clock rates, now measured in GHz, result in excessive power and energy consumption. Chips overheat, fans are needed, and power supply costs rise as a result of this additional energy consumption. All of these consequences increase system costs. In addition, a general-purpose processor can just as easily be employed by one of your competitors in a similar design, making your software and design ideas that much easier to copy.

What is the alternative? Custom processors, such as the Xtensa processor, can be optimized for a specific application. Optimization adds a variety of useful capabilities that keep clock rates and energy consumption low:

• Special registers sized to the natural data types of the tasks to be performed
• Specialized execution units that efficiently perform task-specific algorithms, often in one or two clock cycles
• Customized, system-specific I/Os that can connect directly to neighboring blocks of dedicated hardware

Further, no other company can replicate your application-specific processor. You can hide portions of your design or give others full access to it using the development tools. These advantages help keep the design pirates away.
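The link between clock rate and energy noted in this section can be made concrete with the standard first-order model of dynamic power in CMOS logic. This model is textbook background, not something stated in the white paper, and the symbols are introduced here for illustration: α is the switching activity factor, C the switched capacitance, V_dd the supply voltage, f the clock frequency, and N the number of clock cycles a task needs.

```latex
\begin{align}
P_{\text{dyn}} &\approx \alpha \, C \, V_{dd}^{2} \, f \\
E_{\text{task}} &= P_{\text{dyn}} \cdot \frac{N}{f} \approx \alpha \, C \, V_{dd}^{2} \, N
\end{align}
```

Under this model, power grows linearly with f at a fixed voltage, but reaching a higher clock rate usually also requires a higher supply voltage. As a purely illustrative assumption, doubling f while raising V_dd by 20% gives roughly 2 × 1.2² ≈ 2.9 times the power and about 1.44 times the energy for the same task. Reducing the cycle count N, which is what task-specific registers and execution units aim to do, lowers energy per task without raising the clock rate or the voltage.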
Getting More Performance the Old Way #2: RTL Acceleration

Because many applications simply cannot run fast enough on standard embedded microprocessors, even with an auxiliary DSP, engineering teams often use a hardware description language (HDL) such as Verilog or VHDL to create hand-coded acceleration hardware for the parts of their design with aggressive performance goals that are clearly out of any microprocessor’s reach.

However, custom RTL hardware takes a long time to design and even longer to verify. In addition, RTL hardware blocks can’t be easily changed once they’re designed because of the huge verification time. Yet changes are often needed to fix bugs or to accommodate new standards or product features. Figure 1 shows the internal structure of a generic RTL accelerator block. The block’s datapath appears on the left and its state machine appears on the right.

[Figure 1: Hardwired RTL = Datapath Plus State Machine. Block diagram showing a data RAM feeding a datapath of multipliers and adders/subtractors, controlled by a finite state machine.]

In most RTL accelerators, the datapath consumes the vast majority of the gates. A typical datapath may be as narrow as 16 or 20 bits or range up to hundreds of bits wide, depending on the target task or application. RTL datapaths typically contain many data registers and often include significant blocks of RAM or interfaces to blocks of memory that are shared with other RTL blocks.

By contrast, the RTL logic block’s finite state machine (FSM) contains nothing but the control logic. All the nuances of sequencing data through the datapath, all the exception and error conditions, and all the handshakes with other blocks are captured in the FSM. While the FSM is often small in area compared to the datapath, it usually embodies most or all of the design and verification risk due to its logical complexity.
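The datapath-plus-FSM split described above can be sketched as a small behavioral model. This is only an illustration under stated assumptions, not code from the white paper: the `Accelerator` class, the three-state controller, the 20-bit multiply-accumulate datapath, and all names are hypothetical.

```python
from enum import Enum, auto


class State(Enum):
    """Controller states of the hypothetical accelerator."""
    IDLE = auto()
    RUN = auto()
    DONE = auto()


def datapath_mac(acc, sample, coeff, width=20):
    """Datapath: one multiply-accumulate, wrapped to `width` unsigned bits."""
    return (acc + sample * coeff) & ((1 << width) - 1)


class Accelerator:
    """Hypothetical block: an FSM sequencing data RAM through the datapath."""

    def __init__(self, data_ram, coeffs):
        if len(data_ram) != len(coeffs):
            raise ValueError("data_ram and coeffs must have the same length")
        self.data_ram = list(data_ram)
        self.coeffs = list(coeffs)
        self.state = State.IDLE
        self.index = 0
        self.acc = 0

    def step(self, start=False):
        """Advance the control FSM by one clock cycle."""
        if self.state is State.IDLE:
            if start:
                self.index, self.acc = 0, 0
                self.state = State.RUN if self.data_ram else State.DONE
        elif self.state is State.RUN:
            # The FSM only picks operands; the arithmetic lives in the datapath.
            self.acc = datapath_mac(
                self.acc, self.data_ram[self.index], self.coeffs[self.index]
            )
            self.index += 1
            if self.index == len(self.data_ram):
                self.state = State.DONE
        else:  # State.DONE: return to IDLE, result stays in self.acc
            self.state = State.IDLE
        return self.state


def run(block):
    """Start the block and clock it until DONE; return (result, cycles)."""
    block.step(start=True)
    cycles = 1
    while block.state is not State.DONE:
        block.step()
        cycles += 1
    return block.acc, cycles
```

In this sketch the arithmetic is a single short function, while every sequencing decision (when to start, which operands to read, when to stop) sits in `step()`. Adding an error state or a handshake with another block would change only the controller, which mirrors the point that control logic carries most of the complexity.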
A late design change made to an RTL acceleration block is more likely to affect the FSM than the datapath because the FSM contains most of the design complexity.

An alternative to dedicated RTL blocks is to use a custom processor: the datapath logic is embedded in the processor hardware, and the FSM control logic moves into the processor’s firmware, which decouples the datapath from the control logic. This separation minimizes hardware design changes and gives you the flexibility to keep pace with updates, which can easily be made in firmware. Custom processors preserve the RTL block’s ability to have wide, non-integer-sized datapaths, which can be guaranteed correct-by-construction by the processor vendor, while reducing the risks associated with FSM design because processor-based FSMs are firmware programmable.

What is a Customizable Processor?

Xtensa customizable processors, based on a single processor architecture, use a common development flow with tools that scale from tiny microcontrollers and digital signal controllers to high-performance, real-time controllers and DSPs. Xtensa processors allow you to make the processor run your application more efficiently. You can tailor the Xtensa processor in various ways:

• Select from standard configuration options, such as bus widths, interfaces, memories, and preconfigured execution units (floating-point units, DSPs, etc.)
• Add new registers, register files, and custom task-specific instructions that support specialized data types and operations such as 48-bit data types (stereo audio sample pairs), 56-bit data and operations for security processing (encryption and decryption), or 256-bit data types and operations for packet processing
• Add interfaces for more input/output (I/O) bandwidth to move data into and out of the processor, such as Ports (general-purpose I/O [GPIO]), Queues (FIFO), and memory Lookup interfaces
• Add instructions on top of the base ISA from pre-defined options or create your own using the Tensilica Instruction Extension (TIE) language

Xtensa processors can implement wide, parallel, and complex datapath operations that closely match those used in custom RTL hardware. The equivalent datapaths are implemented by augmenting the base processor’s integer pipeline with additional execution units, registers, and other functions developed by the chip architect for a target application. You get this customization with