Parallel Ultra Low Power Embedded System


João Pedro Alves Vieira

Thesis to obtain the Master of Science Degree in Electrical and Computer Engineering

Supervisors: Prof. Aleksandar Ilic, Prof. Leonel Augusto Pires Seabra de Sousa

Examination Committee
Chairperson: Prof. Gonçalo Nuno Gomes Tavares
Supervisor: Prof. Leonel Augusto Pires Seabra de Sousa
Member of the Committee: Prof. Paulo Ferreira Godinho Flores

December 2017

Acknowledgments

First of all, a special thank you goes to my family and closest friends, who supported me throughout this journey, especially when it got tough. I would like to thank Professor James C. Hoe and Professor Peter Milder from Carnegie Mellon University, who were tireless in helping to debug a major issue that was found. I would also like to thank my supervisors for their guidance and insights.

Resumo

The future of the portable electronic device market will be built around the Internet of Things, where everyday objects will be connected to the internet and possibly controlled by other devices. These devices have started to appear in our daily activities and are expected to grow substantially in the near future; examples include health monitors, light bulbs, thermostats, and fitness wristbands. Most of these wireless, sensor-equipped devices depend on batteries, so energy-efficient operation is essential. It is achieved by developing devices with architectures capable of meeting the needs of low power consumption and real-time performance. This Thesis aims to improve the energy efficiency of a low-power processor, namely PULPino. To achieve this, hardware accelerators were added to it in a modular way, with the goal of encouraging the development of new accelerators by the open-source community.
To test the viability of this approach, two different kinds of accelerators were individually added. The first is a cryptographic SHA-3 accelerator, which implements a hash algorithm and could improve security in IoT devices. The second is an FFT accelerator, widely used in digital signal processing applications. Both accelerators were tested on PULPino for their speedup and energy-efficiency capabilities, achieving energy savings of up to 99% and 66% and speedups of 185 and 3 times on SHA-3 and FFT respectively, relative to a non-accelerated version of the algorithms executed on PULPino with a RI5CY core.

Keywords: Internet of Things, Power Consumption, Embedded System, Energy Efficiency.

Abstract

The future of the portable electronics market will be built around the Internet of Things (IoT), where everyday objects will be connected to the internet and possibly controlled by other devices. Examples of such devices, including health monitors, light bulbs, thermostats, and fitness wristbands, have already started to take part in our daily activities and are expected to experience tremendous growth in the near future. Most of these devices rely on battery-powered wireless transceivers combined with sensors, so it is essential to sustain energy-efficient execution by developing device architectures capable of delivering both low power and real-time computing performance. Within the scope of IoT applications, this Thesis aims to boost the energy efficiency of a state-of-the-art ultra-low-power processor, namely PULPino. This challenge was tackled by modularly attaching hardware accelerators to it. They connect to PULPino through a low-power, plug-and-play custom AXI-lite interface, which is intended to encourage the development of new accelerators by PULPino’s growing open-source community.
To test the viability of this approach, two kinds of accelerators were individually attached. The first is a cryptographic SHA-3 accelerator, implementing a commonly used hash algorithm, which could improve the security of IoT applications. The second is an FFT accelerator, implementing an algorithm widely used in Digital Signal Processing (DSP) applications. Both accelerators were tested on PULPino for their speedup and energy-efficiency capabilities, achieving energy savings of up to 99% and 66% and speedups of 185 and 3 times on SHA-3 and FFT respectively, in comparison to a non-hardware-accelerated version of the algorithms executed on the PULPino RI5CY core configuration.

Keywords: Internet of Things, Ultra-low-power, Embedded System, Energy-Efficiency.

Contents

Resumo
Abstract
List of Figures
Glossary
1 Introduction
  1.1 Motivation
  1.2 Main Objectives
  1.3 Main Contribution of this Thesis
  1.4 Outline
2 Background
  2.1 State-of-the-Art: PULP - Parallel Ultra Low Power Platform
  2.2 PULPino
  2.3 Additional PULPino’s Core Configurations
  2.4 Interconnect Networks
    2.4.1 Cache Coherent Interconnect for Accelerators (CCIX)
    2.4.2 GEN-Z
    2.4.3 Open Coherent Accelerator Processor Interface (OpenCAPI)
    2.4.4 Standards Comparison
  2.5 Hardware Accelerators
  2.6 Summary
3 Hardware/Software Co-design
  3.1 AXI Protocol
    3.1.1 AXI Interconnect
  3.2 Overall System Architecture
    3.2.1 Hardware Interface
    3.2.2 Software Interface
  3.3 Hardware Accelerators
  3.4 Summary
4 Implementation and Experimental Work
  4.1 Target Device
  4.2 System Configuration
  4.3 New AXI Interconnect Slave
  4.4 New Accelerator
  4.5 Summary
5 Experimental Results
  5.1 Software vs Hardware
    5.1.1 SHA-3
    5.1.2 FFT
  5.2 Power Efficiency
    5.2.1 SHA-3
    5.2.2 FFT
  5.3 Summary
6 Conclusions and Future Work
References
A Software-only Algorithms
  A.1 SHA-3
  A.2 FFT

List of Figures

2.1 PULP cluster with 4 cores
2.2 Comparison between RI5CY and ARM’s Cortex-M4
2.3 RISC-V pipeline
2.4 LSU Software vs Hardware
2.5 Shuffle instruction diagram
2.6 Area breakdown of three core configurations
2.7 Energy consumption comparison between three core configurations
2.8 Use cases of CCIX
2.9 Comparison between typical CPU-memory interface and Gen-Z Media Controller
2.10 Gen-Z architecture aggregating different types of media devices
2.11 Comparison of CCIX, Gen-Z and OpenCAPI main features
2.12 Comparison between SPIRAL generated design and Xilinx LogiCore FFT v4.1
3.1 PULPino’s SoC block diagram
3.2 PULPino’s memory map
3.3 AXI4 node overview
3.4 PULPino with attached accelerators block diagram
3.5 SHA-3 kernel overview architecture
3.6 SHA-3 padding module’s architecture
3.7 SHA-3 permutation module’s architecture
3.8 SHA-3 accelerator data path
3.9 SPIRAL Fast Fourier Transform (FFT) iterative architecture
3.10 SPIRAL Fast Fourier Transform (FFT) fully streaming architecture
3.11 FFT accelerator’s data path
4.1 Xilinx Zynq-7000 SoC block diagram overview
4.2 Implementation block diagram
5.1 SHA-3 computation speedup using hardware accelerator
5.2 FFT computation speedup using hardware accelerator
5.3 SHA-3 computation power versus energy ratio
5.4 SHA-3 accelerator energy saved on multiple frequencies
5.5 FFT accelerator, dynamic and static on-chip power consumption
5.6 FFT accelerator, static