Litex: an Open-Source Soc Builder and Library Based on Migen Python DSL

Total Page:16

File Type:pdf, Size:1020Kb

Litex: an Open-Source Soc Builder and Library Based on Migen Python DSL LiteX: an open-source SoC builder and library based on Migen Python DSL Florent Kermarrec Sebastien´ Bourdeauducq Jean-Christophe Le Lann and Hannah Badier Enjoy-Digital, Landivisiau, France M-Labs, Hong-Kong ENSTA Bretagne, Brest and Lab-STICC UMR fl[email protected] [email protected] [email protected] Abstract—LiteX [1] is a GitHub-hosted SoC builder / IP A. EDA renewal through DSLs and platform-based design library and utilities that can be used to create SoCs and full A recent trend, however, shows that several new HDLs are FPGA designs. Besides being open-source and BSD licensed, its originality lies in the fact that its IP components are entirely starting to emerge. Their common characteristic is that they are described using Migen Python internal DSL, which simplifies its essentially internal Domain Specific Languages (DSLs). These design in depth. LiteX already supports various softcores CPUs languages –sometimes called “embedded DSLs”– are hosted and essential peripherals, with no dependencies on proprietary by a mainstream programming language. No matter which IP blocks or generators. This paper provides an overview of host language is used, embedding a DSL in a mainstream LiteX: two real SoC designs on FPGA are presented. They both leverage the LiteX approach in terms of design entry, libraries programming language provides several benefits: comfort, and integration capabilities. The first one is based on RISC-V smooth learning curve, access to host language libraries and core, while the second is based on a LM32 core. In the second ecosystem. Complex compiler technologies or specific parsers use case, we further demonstrate the use of a fully open-source are not necessary, which significantly reduces the amount of toolchain coupled with LiteX. work required compared to the effort needed to create a brand- Index Terms—open-source, SoC, Python, DSL, FPGA, IP library new language. The design of such internal DSLs is based on efficiently mapping domain-specific concepts to the host language syntax. I. INTRODUCTION The renewal of HDLs, based on embedded DSLs like Electronic Design Automation (EDA) plays a central role in Chisel/SpinalHDL, is also encouraged by the advent of RISC- the advent of all the electronic devices we use daily, ranging V platform-based design (PBD). This initiative is intended from small embedded systems to internet-scale infrastructures. to counterbalance the supremacy of ARM in the field of The success of EDA may mainly be attributed to its capability embedded systems, by providing a set of royalty-free soft- to describe semi-conductor devices with tools and languages cores. Such softcores play a central role in platform-based based on robust abstraction layers: transistors, layout, logic, design, which has been defined [3] as “an integration oriented register-transfer level (RTL), behavioral descriptions, etc. design approach emphasizing systematic reuse, for develop- Thanks to these levels of abstraction, engineers do not have to ing complex products based upon platforms and compatible face the entire complexity of the underlying device they are hardware and software virtual components, intended to reduce designing, which makes the design process more constructive development risks, costs and time to market”. This reuse calls and efficient. In this process, Hardware Description Languages for new methodologies: to build complex embedded systems (HDLs), such as Verilog and VHDL, still play a pivotal role. organized around a processor, two ingredients are mandatory. New features that aim at facilitating the designer’s job have The first is abstraction: modern languages are likely to provide been added at a slow pace. This conservative slowness has such characteristics though either object-oriented or functional arXiv:2005.02506v1 [cs.AR] 5 May 2020 left only very little room for newcomers. Among them, Sys- modeling. As clearly stated in the previous definition, the temVerilog has succeeded to emerge as the most natural way to second ingredient is the availability of component libraries. describe and execute complex test benches, while SystemC is B. Objectives of the paper now mainly associated to transaction-level platform modeling. On the contrary, synchronous languages [2], whose formal In this paper, we present LiteX [1], SoC builder and library aspects could have been beneficial to EDA, have not been of IP components described at the RTL level, together with adopted widely in EDA, despite decades of extensive research. various utilities that facilitate the building of complete SoC In the same vein, High-level Synthesis, which calls for C- designs. LiteX resorts to a DSL, written in Python and named based design entry, has not gained the popularity of HDLs Migen. LiteX has been used successfully and on a daily basis yet. Nowadays, classical HDLs, acting as RTL description for several years by Enjoy-Digital [4], a company dedicated languages, still stand as the cornerstone of EDA. to open-source FPGA-based design, in various application domains (multimedia, software defined radio, automotive, etc.) and is progressively being adopted by users around the world. This work is licensed under a “CC BY-NC-SA 4.0” license. The obvious key of this adoption lies both in the very permissive open-source license adopted by LiteX (BSD), and A. Migen FHDL the choice of Python for the RTL descriptions and deployment. Similar to previous cited technologies, Migen FHDL (Frag- LiteX participates in a larger trend that will be discussed in mented Hardware Description Language) allows to describe this paper. and simulate RTL circuits. It is based on a custom Python The rest of the paper is organized as follows. Next Section abstract syntax tree (AST) and can produce synthesizable Ver- makes a short survey of the trend of eDSLs for EDA, and es- ilog. FHDL does not follow the event-driven paradigm of most pecially for RTL design. Section 3 presents Migen FHDL and HDLs, and instead replaces it with notions of synchronous and MiSoC. Section 4 introduces LiteX SoC builder and library. combinatorial statements. In Section 5, a full SoC design on FPGA is exposed and a Thanks to Migen FHDL, highly and easily configurable fully open-source design flow based on LiteX is presented. cores can be designed. Creating a design by writing a Python program raises the level of abstraction, by for example II. RELATED WORK : BENEFITS OF INTERNAL DSLS FOR enabling the use of object-oriented programming or meta- RTL DESIGN programming. The choice of Python in particular, as opposed to other programming languages such as Scala, also gives a Many hardware-oriented embedded DSLs have been pro- clear advantage for Migen adoption: Python is an easy to posed. Among them, SystemC itself can be cited at first. learn and well-known language, already used for many other It was proposed by Liao in [5] (at that time, SystemC was tasks by application engineers, for instance for algorithmic named Scenic). SystemC relies on C++ objects and allows to prototyping. describe hardware systems at various level of abstractions. The The following code example shows how a functional design simulation engine remains event-driven, similar to VHDL and can be implemented in a short, efficient and readable manner: Verilog. It has received a large audience, and has improved since the early paper of Liao to include sophisticated features from migen import * for transaction-level modeling [6], which allows to build mod- from migen.fhdl import verilog els usable by software programmers for early SoC validation. c l a s s Blinker(Module): However, SystemC has not been used fully as RTL entry. d e f i n i t (self, sys c l k freq , period): A DSL that has recently been receiving growing attention is s e l f.led = led = Signal() Chisel [7] and its fork named SpinalHDL. They both strongly ### contribute to the recent renew of RTL design practices. They also take an active part in the advent of RISC-V softcores. toggle = Signal() c o u n t e r p r e l o a d = int(sys c l k f r e q * p e r i o d / 2 ) Chisel initially targets synchronous designs, where the clock counter = Signal(max=counter p r e l o a d + 1) is implicit. The host language is Scala, a multi-paradigm language (object-oriented and functional), with a static typing. s e l f.comb += toggle.eq(counter == 0) s e l f.sync += n Here again, Scala [8] is known for its syntactic flexibility: I f ( t o g g l e , semicolons are optional, any method can be used as an infix led.eq(˜led), operator, etc. These features provide a strong differentiator counter.eq(counter p r e l o a d ) ) . E l s e ( with respect to VHDL for instance, which is known to be ex- counter.eq(counter − 1) tremely verbose. However, the benefits of Scala go far beyond ) the syntax: the type inference mechanism, parameterized types # Create a 10Hz blinker from a 100MHz system clock. and generators are particularly appealing for hardware design. blinker = Blinker(sys c l k freq=100e6, period=1e −1) HardCaml, an OCaml library for designing hardware, shares p r i n t(verilog.convert(blinker , f blinker.led g )) several aspects with Chisel, as OCaml offers many of the same modern multi-paradigm features provided by Scala. Haskell (a pure functional language) has also been used extensively for Listing 1. Blinking LED design example describing hardware: Lava [9], CλaSH [10], [11]. An older approach based on Java for FPGA programming was proposed This design consists of a decrementing counter that toggles in [12]. Furthermore, an interesting approach named MyHDL a one-bit signal whenever it reaches 0. The one-bit signal [13] and based on Python was introduced by Decaluwe in [14].
Recommended publications
  • Latticemico32 Development Kit User's Guide for Latticeecp
    LatticeMico32 Development Kit User’s Guide for LatticeECP Lattice Semiconductor Corporation 5555 NE Moore Court Hillsboro, OR 97124 (503) 268-8000 May 2007 Copyright Copyright © 2007 Lattice Semiconductor Corporation. This document may not, in whole or part, be copied, photocopied, reproduced, translated, or reduced to any electronic medium or machine- readable form without prior written consent from Lattice Semiconductor Corporation. Trademarks Lattice Semiconductor Corporation, L Lattice Semiconductor Corporation (logo), L (stylized), L (design), Lattice (design), LSC, E2CMOS, Extreme Performance, FlashBAK, flexiFlash, flexiMAC, flexiPCS, FreedomChip, GAL, GDX, Generic Array Logic, HDL Explorer, IPexpress, ISP, ispATE, ispClock, ispDOWNLOAD, ispGAL, ispGDS, ispGDX, ispGDXV, ispGDX2, ispGENERATOR, ispJTAG, ispLEVER, ispLeverCORE, ispLSI, ispMACH, ispPAC, ispTRACY, ispTURBO, ispVIRTUAL MACHINE, ispVM, ispXP, ispXPGA, ispXPLD, LatticeEC, LatticeECP, LatticeECP-DSP, LatticeECP2, LatticeECP2M, LatticeMico8, LatticeMico32, LatticeSC, LatticeSCM, LatticeXP, LatticeXP2, MACH, MachXO, MACO, ORCA, PAC, PAC-Designer, PAL, Performance Analyst, PURESPEED, Reveal, Silicon Forest, Speedlocked, Speed Locking, SuperBIG, SuperCOOL, SuperFAST, SuperWIDE, sysCLOCK, sysCONFIG, sysDSP, sysHSI, sysI/O, sysMEM, The Simple Machine for Complex Design, TransFR, UltraMOS, and specific product designations are either registered trademarks or trademarks of Lattice Semiconductor Corporation or its subsidiaries in the United States and/or other countries. ISP, Bringing the Best Together, and More of the Best are service marks of Lattice Semiconductor Corporation. Other product names used in this publication are for identification purposes only and may be trademarks of their respective companies. Disclaimers NO WARRANTIES: THE INFORMATION PROVIDED IN THIS DOCUMENT IS “AS IS” WITHOUT ANY EXPRESS OR IMPLIED WARRANTY OF ANY KIND INCLUDING WARRANTIES OF ACCURACY, COMPLETENESS, MERCHANTABILITY, NONINFRINGEMENT OF INTELLECTUAL PROPERTY, OR FITNESS FOR ANY PARTICULAR PURPOSE.
    [Show full text]
  • Latticemico32 Software Developer User Guide
    LatticeMico32 Software Developer User Guide May 2014 Copyright Copyright © 2014 Lattice Semiconductor Corporation. This document may not, in whole or part, be copied, photocopied, reproduced, translated, or reduced to any electronic medium or machine-readable form without prior written consent from Lattice Semiconductor Corporation. Trademarks Lattice Semiconductor Corporation, L Lattice Semiconductor Corporation (logo), L (stylized), L (design), Lattice (design), LSC, CleanClock, Custom Mobile Device, DiePlus, E2CMOS, ECP5, Extreme Performance, FlashBAK, FlexiClock, flexiFLASH, flexiMAC, flexiPCS, FreedomChip, GAL, GDX, Generic Array Logic, HDL Explorer, iCE Dice, iCE40, iCE65, iCEblink, iCEcable, iCEchip, iCEcube, iCEcube2, iCEman, iCEprog, iCEsab, iCEsocket, IPexpress, ISP, ispATE, ispClock, ispDOWNLOAD, ispGAL, ispGDS, ispGDX, ispGDX2, ispGDXV, ispGENERATOR, ispJTAG, ispLEVER, ispLeverCORE, ispLSI, ispMACH, ispPAC, ispTRACY, ispTURBO, ispVIRTUAL MACHINE, ispVM, ispXP, ispXPGA, ispXPLD, Lattice Diamond, LatticeCORE, LatticeEC, LatticeECP, LatticeECP-DSP, LatticeECP2, LatticeECP2M, LatticeECP3, LatticeECP4, LatticeMico, LatticeMico8, LatticeMico32, LatticeSC, LatticeSCM, LatticeXP, LatticeXP2, MACH, MachXO, MachXO2, MachXO3, MACO, mobileFPGA, ORCA, PAC, PAC-Designer, PAL, Performance Analyst, Platform Manager, ProcessorPM, PURESPEED, Reveal, SensorExtender, SiliconBlue, Silicon Forest, Speedlocked, Speed Locking, SuperBIG, SuperCOOL, SuperFAST, SuperWIDE, sysCLOCK, sysCONFIG, sysDSP, sysHSI, sysI/O, sysMEM, The Simple Machine for Complex
    [Show full text]
  • A Superscalar Out-Of-Order X86 Soft Processor for FPGA
    A Superscalar Out-of-Order x86 Soft Processor for FPGA Henry Wong University of Toronto, Intel [email protected] June 5, 2019 Stanford University EE380 1 Hi! ● CPU architect, Intel Hillsboro ● Ph.D., University of Toronto ● Today: x86 OoO processor for FPGA (Ph.D. work) – Motivation – High-level design and results – Microarchitecture details and some circuits 2 FPGA: Field-Programmable Gate Array ● Is a digital circuit (logic gates and wires) ● Is field-programmable (at power-on, not in the fab) ● Pre-fab everything you’ll ever need – 20x area, 20x delay cost – Circuit building blocks are somewhat bigger than logic gates 6-LUT6-LUT 6-LUT6-LUT 3 6-LUT 6-LUT FPGA: Field-Programmable Gate Array ● Is a digital circuit (logic gates and wires) ● Is field-programmable (at power-on, not in the fab) ● Pre-fab everything you’ll ever need – 20x area, 20x delay cost – Circuit building blocks are somewhat bigger than logic gates 6-LUT 6-LUT 6-LUT 6-LUT 4 6-LUT 6-LUT FPGA Soft Processors ● FPGA systems often have software components – Often running on a soft processor ● Need more performance? – Parallel code and hardware accelerators need effort – Less effort if soft processors got faster 5 FPGA Soft Processors ● FPGA systems often have software components – Often running on a soft processor ● Need more performance? – Parallel code and hardware accelerators need effort – Less effort if soft processors got faster 6 FPGA Soft Processors ● FPGA systems often have software components – Often running on a soft processor ● Need more performance? – Parallel
    [Show full text]
  • Reconfigurable Computing
    Reconfigurable Computing: A Survey of Systems and Software KATHERINE COMPTON Northwestern University AND SCOTT HAUCK University of Washington Due to its potential to greatly accelerate a wide variety of applications, reconfigurable computing has become a subject of a great deal of research. Its key feature is the ability to perform computations in hardware to increase performance, while retaining much of the flexibility of a software solution. In this survey, we explore the hardware aspects of reconfigurable computing machines, from single chip architectures to multi-chip systems, including internal structures and external coupling. We also focus on the software that targets these machines, such as compilation tools that map high-level algorithms directly to the reconfigurable substrate. Finally, we consider the issues involved in run-time reconfigurable systems, which reuse the configurable hardware during program execution. Categories and Subject Descriptors: A.1 [Introductory and Survey]; B.6.1 [Logic Design]: Design Style—logic arrays; B.6.3 [Logic Design]: Design Aids; B.7.1 [Integrated Circuits]: Types and Design Styles—gate arrays General Terms: Design, Performance Additional Key Words and Phrases: Automatic design, field-programmable, FPGA, manual design, reconfigurable architectures, reconfigurable computing, reconfigurable systems 1. INTRODUCTION of algorithms. The first is to use hard- wired technology, either an Application There are two primary methods in con- Specific Integrated Circuit (ASIC) or a ventional computing for the execution group of individual components forming a This research was supported in part by Motorola, Inc., DARPA, and NSF. K. Compton was supported by an NSF fellowship. S. Hauck was supported in part by an NSF CAREER award and a Sloan Research Fellowship.
    [Show full text]
  • Implementation, Verification and Validation of an Openrisc-1200
    (IJACSA) International Journal of Advanced Computer Science and Applications, Vol. 10, No. 1, 2019 Implementation, Verification and Validation of an OpenRISC-1200 Soft-core Processor on FPGA Abdul Rafay Khatri Department of Electronic Engineering, QUEST, NawabShah, Pakistan Abstract—An embedded system is a dedicated computer system in which hardware and software are combined to per- form some specific tasks. Recent advancements in the Field Programmable Gate Array (FPGA) technology make it possible to implement the complete embedded system on a single FPGA chip. The fundamental component of an embedded system is a microprocessor. Soft-core processors are written in hardware description languages and functionally equivalent to an ordinary microprocessor. These soft-core processors are synthesized and implemented on the FPGA devices. In this paper, the OpenRISC 1200 processor is used, which is a 32-bit soft-core processor and Fig. 1. General block diagram of embedded systems. written in the Verilog HDL. Xilinx ISE tools perform synthesis, design implementation and configure/program the FPGA. For verification and debugging purpose, a software toolchain from (RISC) processor. This processor consists of all necessary GNU is configured and installed. The software is written in C components which are available in any other microproces- and Assembly languages. The communication between the host computer and FPGA board is carried out through the serial RS- sor. These components are connected through a bus called 232 port. Wishbone bus. In this work, the OR1200 processor is used to implement the system on a chip technology on a Virtex-5 Keywords—FPGA Design; HDLs; Hw-Sw Co-design; Open- FPGA board from Xilinx.
    [Show full text]
  • Syllabus: EEL4930/5934 Reconfigurable Computing
    EEL4720/5721 Reconfigurable Computing (dual-listed course) Department of Electrical and Computer Engineering University of Florida Spring Semester 2019 Catalog Description: Prereq: EEL4712C or EEL5764 or consent of instructor. Fundamental concepts at advanced undergraduate level (EEL4720) and introductory graduate level (EEL5721) in reconfigurable computing (RC) based upon advanced technologies in field-programmable logic devices. Topics include general RC concepts, device architectures, design tools, metrics and kernels, system architectures, and application case studies. Credit Hours: 3 Prerequisites by Topic: Fundamentals of digital design including device technologies, design methodology and techniques, and design environments and tools; fundamentals of computer organization and architecture, including datapath and control structures, data formats, instruction-set principles, pipelining, instruction-level parallelism, memory hierarchy, and interconnects and interfacing. Instructor: Dr. Herman Lam Office: Benton Hall, Room 313 Office hours: TBA Telephone: (352) 392-2689 Email: [email protected] Teaching Assistant: Seyed Hashemi Office hours: TBA Email: [email protected] Class lectures: MWF 4th period, Larsen Hall 239 Required textbook: none References: . Reconfigurable Computing: The Theory and Practice of FPGA-Based Computation, edited by Scott Hauck and Andre DeHon, Elsevier, Inc. (Morgan Kaufmann Publishers), Amsterdam, 2008. ISBN: 978-0-12-370522-8 . C. Maxfield, The Design Warrior's Guide to FPGAs, Newnes, 2004, ISBN: 978-0750676045.
    [Show full text]
  • A MATLAB Compiler for Distributed, Heterogeneous, Reconfigurable
    A MATLAB Compiler For Distributed, Heterogeneous, Reconfigurable Computing Systems P. Banerjee, N. Shenoy, A. Choudhary, S. Hauck, C. Bachmann, M. Haldar, P. Joisha, A. Jones, A. Kanhare A. Nayak, S. Periyacheri, M. Walkden, D. Zaretsky Electrical and Computer Engineering Northwestern University 2145 Sheridan Road Evanston, IL-60208 [email protected] Abstract capabilities and are coordinated to perform an application whose subtasks have diverse execution requirements. One Recently, high-level languages such as MATLAB have can visualize such systems to consist of embedded proces- become popular in prototyping algorithms in domains such sors, digital signal processors, specialized chips, and field- as signal and image processing. Many of these applications programmable gate arrays (FPGA) interconnected through whose subtasks have diverse execution requirements, often a high-speed interconnection network; several such systems employ distributed, heterogeneous, reconfigurable systems. have been described in [9]. These systems consist of an interconnected set of heteroge- A key question that needs to be addressed is how to map neous processing resources that provide a variety of archi- a given computation on such a heterogeneous architecture tectural capabilities. The objective of the MATCH (MATlab without expecting the application programmer to get into Compiler for Heterogeneous computing systems) compiler the low level details of the architecture or forcing him/her to project at Northwestern University is to make it easier for understand
    [Show full text]
  • Softcores for FPGA: the Free and Open Source Alternatives
    Softcores for FPGA: The Free and Open Source Alternatives Jeremy Bennett and Simon Cook, Embecosm Abstract Open source soft cores have reached a degree of maturity. You can find them in products as diverse as a Samsung set-top box, NXP's ZigBee chips and NASA's TechEdSat. In this article we'll look at some of the more widely used: the OpenRISC 1000 from OpenCores, Gaisler's LEON family, Lattice Semiconductor's LM32 and Oracle's OpenSPARC, as well as more bleeding edge research designs such as BERI and CHERI from Cambridge University Computer Laboratory. We'll consider the technology, the business case, the engineering risks, and the licensing challenges of using such designs. What do we mean by “free” “Free” is an overloaded term in English. It can mean something you don't pay for, as in “this beer is free”. But it can also mean freedom, as in “you are free to share this article”. It is this latter sense that is central when we talk about free software or hardware designs. Of course such software or hardware designs may also be free, in the sense that you don't have to pay for them, but that is secondary. “Free” software in this sense has been around for a very long time. Explicitly since 1993 and Richard Stallman's GNU Manifesto, but implicitly since the first software was written and shared with colleagues and friends. That freedom is achieved by making the source code available, so others can modify the program, hence the more recent alternative name “open source”, but it is freedom that remains the key concept.
    [Show full text]
  • Openpiton: an Open Source Manycore Research Framework
    OpenPiton: An Open Source Manycore Research Framework Jonathan Balkind Michael McKeown Yaosheng Fu Tri Nguyen Yanqi Zhou Alexey Lavrov Mohammad Shahrad Adi Fuchs Samuel Payne ∗ Xiaohua Liang Matthew Matl David Wentzlaff Princeton University fjbalkind,mmckeown,yfu,trin,yanqiz,alavrov,mshahrad,[email protected], [email protected], fxiaohua,mmatl,[email protected] Abstract chipset Industry is building larger, more complex, manycore proces- sors on the back of strong institutional knowledge, but aca- demic projects face difficulties in replicating that scale. To Tile alleviate these difficulties and to develop and share knowl- edge, the community needs open architecture frameworks for simulation, synthesis, and software exploration which Chip support extensibility, scalability, and configurability, along- side an established base of verification tools and supported software. In this paper we present OpenPiton, an open source framework for building scalable architecture research proto- types from 1 core to 500 million cores. OpenPiton is the world’s first open source, general-purpose, multithreaded manycore processor and framework. OpenPiton leverages the industry hardened OpenSPARC T1 core with modifica- Figure 1: OpenPiton Architecture. Multiple manycore chips tions and builds upon it with a scratch-built, scalable uncore are connected together with chipset logic and networks to creating a flexible, modern manycore design. In addition, build large scalable manycore systems. OpenPiton’s cache OpenPiton provides synthesis and backend scripts for ASIC coherence protocol extends off chip. and FPGA to enable other researchers to bring their designs to implementation. OpenPiton provides a complete verifica- tion infrastructure of over 8000 tests, is supported by mature software tools, runs full-stack multiuser Debian Linux, and has been widespread across the industry with manycore pro- is written in industry standard Verilog.
    [Show full text]
  • Architecture and Programming Model Support for Reconfigurable Accelerators in Multi-Core Embedded Systems »
    THESE DE DOCTORAT DE L’UNIVERSITÉ BRETAGNE SUD COMUE UNIVERSITE BRETAGNE LOIRE ÉCOLE DOCTORALE N° 601 Mathématiques et Sciences et Technologies de l'Information et de la Communication Spécialité : Électronique Par « Satyajit DAS » « Architecture and Programming Model Support For Reconfigurable Accelerators in Multi-Core Embedded Systems » Thèse présentée et soutenue à Lorient, le 4 juin 2018 Unité de recherche : Lab-STICC Thèse N° : 491 Rapporteurs avant soutenance : Composition du Jury : Michael HÜBNER Professeur, Ruhr-Universität François PÊCHEUX Professeur, Sorbonne Université Bochum Président (à préciser après la soutenance) Jari NURMI Professeur, Tampere University of Angeliki KRITIKAKOU Maître de conférences, Université Technology Rennes 1 Davide ROSSI Assistant professor, Université de Bologna Kevin MARTIN Maître de conférences, Université Bretagne Sud Directeur de thèse Philippe COUSSY Professeur, Université Bretagne Sud Co-directeur de thèse Luca BENINI Professeur, Université de Bologna Architecture and Programming Model Support for Reconfigurable Accelerators in Multi-Core Embedded Systems Satyajit Das 2018 iii ABSTRACT Emerging trends in embedded systems and applications need high throughput and low power consumption. Due to the increasing demand for low power computing, and diminishing returns from technology scaling, industry and academia are turning with renewed interest toward energy efficient hardware accelerators. The main drawback of hardware accelerators is that they are not programmable. Therefore, their utilization can be low as they perform one specific function and increasing the number of the accelerators in a system onchip (SoC) causes scalability issues. Programmable accelerators provide flexibility and solve the scalability issues. Coarse-Grained Reconfigurable Array (CGRA) architecture consisting several processing elements with word level granularity is a promising choice for programmable accelerator.
    [Show full text]
  • Hardware Platforms for Embedded Computing Graphics: © Alexandra Nolte, Gesine Marwedel, 2003 Universität Dortmund Importance of Energy Efficiency
    Universität Dortmund Hardware Platforms for Embedded Computing Graphics: © Alexandra Nolte, Gesine Marwedel, 2003 Universität Dortmund Importance of Energy Efficiency Efficient software design needed, otherwise, the price for software flexibility cannot be paid. © Hugo De Man (IMEC) Philips, 2007 Universität Dortmund Embedded vs. general-purpose processors • Embedded processors may be optimized for a category of applications. – Customization may be narrow or broad. • We may judge embedded processors using different metrics: – Code size. – Memory system performance. – Preditability. • Disappearing distinction: embedded processors everywhere Universität Dortmund Microcontroller Architectures Memory 0 Address Bus Program + CPU Data Bus Data Von Neumann 2n Architecture Memory 0 Address Bus Program CPU Fetch Bus Harvard Address Bus 0 Architecture Data Bus Data Universität Dortmund RISC processors • RISC generally means highly-pipelinable, one instruction per cycle. • Pipelines of embedded RISC processors have grown over time: – ARM7 has 3-stage pipeline. – ARM9 has 5-stage pipeline. – ARM11 has eight-stage pipeline. ARM11 pipeline [ARM05]. Universität Dortmund ARM Cortex Based on ARMv7 Architecture & Thumb®-2 ISA – ARM Cortex A Series - Applications CPUs focused on the execution of complex OS and user applications • First Product: Cortex-A8 • Executes ARM, Thumb-2 instructions – ARM Cortex R Series - Deeply embedded processors focused on Real-time environments • First Product: Cortex-R4(F) • Executes ARM, Thumb-2 instructions – ARM Cortex M
    [Show full text]
  • Reconfigurable Computing
    Reconfigurable Computing David Boland1, Chung-Kuan Cheng2, Andrew B. Kahng2, Philip H.W. Leong1 1School of Electrical and Information Engineering, The University of Sydney, Australia 2006 2Dept. of Computer Science and Engineering, University of California, La Jolla, California Abstract: Reconfigurable computing is the application of adaptable fabrics to solve computational problems, often taking advantage of the flexibility available to produce problem-specific architectures that achieve high performance because of customization. Reconfigurable computing has been successfully applied to fields as diverse as digital signal processing, cryptography, bioinformatics, logic emulation, CAD tool acceleration, scientific computing, and rapid prototyping. Although Estrin-first proposed the idea of a reconfigurable system in the form of a fixed plus variable structure computer in 1960 [1] it has only been in recent years that reconfigurable fabrics, in the form of field-programmable gate arrays (FPGAs), have reached sufficient density to make them a compelling implementation platform for high performance applications and embedded systems. In this article, intended for the non-specialist, we describe some of the basic concepts, tools and architectures associated with reconfigurable computing. Keywords: reconfigurable computing; adaptable fabrics; application integrated circuits; field programmable gate arrays (FPGAs); system architecture; runtime 1 Introduction Although reconfigurable fabrics can in principle be constructed from any type of technology, in practice, most contemporary designs are made using commercial field programmable gate arrays (FPGAs). An FPGA is an integrated circuit containing an array of logic gates in which the connections can be configured by downloading a bitstream to its memory. FPGAs can also be embedded in integrated circuits as intellectual property cores.
    [Show full text]