Introduction to High-Level Synthesis ECE 699: Lecture 12
Required Reading

The ZYNQ Book
• Chapter 14: Spotlight on High-Level Synthesis
• Chapter 15: Vivado HLS: A Closer Look

S. Neuendorffer and F. Martinez-Vallina, Building Zynq Accelerators with Vivado High Level Synthesis, FPGA 2013 Tutorial

Recommended Reading

G. Martin and G. Smith, “High-Level Synthesis: Past, Present, and Future,” IEEE Design & Test of Computers, vol. 26, no. 4, pp. 18–25, July 2009.
Vivado Design Suite Tutorial, High-Level Synthesis, UG871, Nov. 2014.
Vivado Design Suite User Guide, High-Level Synthesis, UG902, Oct. 2014.
Introduction to FPGA Design with Vivado High-Level Synthesis, UG998, Jul. 2013.

Behavioral Synthesis

Figure: Algorithm, I/O Behavior, and Target Library -> Behavioral Synthesis -> RTL Design -> Logic Synthesis -> Gate-level Netlist. The part from RTL Design to Gate-level Netlist is labeled "Classic RTL Design Flow".

Need for High-Level Design

• Higher level of abstraction
• Modeling complex designs
• Reduce design efforts
• Fast turnaround time
• Technology independence
• Ease of HW/SW partitioning

Platform Mapping

Figure: Program -> SW/HW Partitioning -> Hardware (executed in the reconfigurable processor system) and Software (executed in the microprocessor system).

ECE 448 – FPGA and ASIC Design with VHDL

SW/HW Partitioning & Coding: Traditional Approach

Figure: Specification -> SW/HW Partitioning -> SW Coding and HW Coding -> SW Compilation and HW Compilation -> SW Profiling and HW Profiling.

SW/HW Partitioning & Coding: New Approach

Figure: Specification -> SW/HW Coding -> SW/HW Partitioning -> SW Compilation and HW Compilation -> SW Profiling and HW Profiling.

Advantages of Behavioral Synthesis

• Easy to model higher levels of complexity
• Smaller source size compared to RTL code
• Generates RTL much faster than the manual method
• Multi-cycle functionality
• Loops
• Memory Access

Short History of High-Level Synthesis

Generation 1 (1980s-early 1990s): research period
Generation 2 (mid 1990s-early 2000s):
• Commercial tools from Synopsys,
Cadence, Mentor Graphics, etc.
• Input languages: behavioral HDLs
• Target: ASIC
Outcome: Commercial failure
Generation 3 (from early 2000s):
• Domain oriented commercial tools: in particular for DSP
• Input languages: C, C++, C-like languages (Impulse C, Handel C, etc.), Matlab + Simulink, Bluespec
• Target: FPGA, ASIC, or both
Outcome: First success stories

Hardware-Oriented High-Level Languages

• C-Based System level languages
  • Commercial
    • Handel C -- Celoxica Ltd.
    • Impulse C -- Impulse Accelerated Technologies
    • Carte C -- SRC Computers
    • SystemC -- The Open SystemC Initiative
  • Research
    • Streams-C -- Los Alamos National Laboratory
    • SA-C -- Colorado State University, University of California, Riverside, Khoral Research, Inc.
    • SpecC -- University of California, Irvine and SpecC Technology Open Consortium

Other High-Level Design Flows

• Matlab-based
  • AccelChip DSP Synthesis -- AccelChip
  • System Generator for DSP -- Xilinx
• GUI Data-Flow based
  • Corefire -- Annapolis Microsystems
• Java-based
  • Commercial
    • Forge -- Xilinx
  • Research
    • JHDL -- Brigham Young University

Handel-C Overview

• High-level language based on ISO/ANSI-C for the implementation of algorithms in hardware
• Allows software engineers to design hardware without retraining
• Clean extensions for hardware design including flexible data widths, parallelism and communications
• Well defined timing model
  • Each statement takes a single clock cycle
• Includes extended operators for bit manipulation, and high-level mathematical macros (including floating point)

Handel-C/ANSI-C Comparisons

Figure labels (feature comparison diagram): ANSI-C; Handel-C; Handel-C Standard Library; ANSI-C Standard Library; Preprocessors, i.e.
#define; Parallelism; Pointers; Structures; Arbitrary width variables; Recursion; ANSI-C constructs (for, while, if, switch); Arrays; Bitwise logical operators; Enhanced bit manipulation; Floating point; Logical operators; Arithmetic operators; RAM, ROM; Functions; Signals; Interfaces

Handel-C Design Flow

Figure labels: Executable Specification; Handel-C; VHDL; Synthesis; EDIF; EDIF; Place & Route

Different Levels of C/C++ Synthesis Abstraction

Figure: three abstraction domains, from more abstract and less implementation-specific (top) to less abstract and more implementation-specific (bottom):
• Untimed C Domain (non-implementation-specific): Pure C/C++
• Timed C Domain (implementation-specific): Augmented C/C++, SystemC
• RTL Domain (implementation-specific): Verilog and VHDL

Source: The Design Warrior’s Guide to FPGAs: Devices, Tools, and Flows. ISBN 0750676043. Copyright © 2004 Mentor Graphics Corp. (www.mentor.com)

Pure Untimed C/C++ Design Flow

Figure: Pure C/C++ (non-implementation-specific, easy to create, fast to simulate, easy to modify) -> Pure C/C++ Synthesis (with user interaction and guidance) -> auto-generated, implementation-specific Verilog/VHDL RTL -> RTL Synthesis -> gate-level netlist (ASIC target) or LUT/CLB-level netlist (FPGA target).

Source: The Design Warrior’s Guide to FPGAs: Devices, Tools, and Flows. ISBN 0750676043. Copyright © 2004 Mentor Graphics Corp. (www.mentor.com)

Catapult C

Mentor Graphics (2004-2011)
Calypto Design Systems (2011-2015)
Mentor Graphics (2015-present)

Catapult C

• Catapult C automatically converts un-timed C/C++ descriptions into synthesizable RTL.

SystemC-based design-flow alternatives

Figure (alternative SystemC flows): SystemC (implementation specific, relatively slow to simulate, relatively difficult to modify) -> either Auto-RTL Translation -> Verilog/VHDL RTL -> RTL Synthesis -> gate-level netlist, or SystemC Synthesis -> gate-level netlist.

SystemC Evolution

Figure labels: abstraction levels System, Algorithmic, Behavioral/Transaction-level, and RTL; annotations Untimed and Timed; ranges marked SystemC 1.0 and SystemC 2.0.

Source: The Design Warrior’s Guide to FPGAs: Devices, Tools, and Flows.
ISBN 0750676043. Copyright © 2004 Mentor Graphics Corp. (www.mentor.com)

Reconfigurable Supercomputers

What is a Reconfigurable Computer?

Figure: a microprocessor system (µP ... µP, each with its own memory, plus I/O and an interface) connected to a reconfigurable system (FPGA ... FPGA, each with its own memory, plus an interface and I/O).

Reconfigurable Supercomputers

Machine                           Released
SRC 6 from SRC Computers          2002
Cray XD1 from Cray                2005
SGI Altix from SGI                2005
SRC 7 from SRC Computers, Inc.    2006

Pros and cons of reconfigurable computers

+ can be programmed using high-level programming languages, such as C, by mathematicians & scientists themselves
+ facilitates hardware/software co-design
+ shortens development time, encourages experimentation and complex optimizations
+ allows sharing costs among users of various applications
- high entry cost (~$100,000)
- hardware-aware programming
- limited portability
- limited availability of libraries
- limited maturity of tools

SRC Programming Model

Figure:
• Microprocessor: main.c, written in ANSI C, calls function_1() and function_2()
• FPGA: function_1 and function_2, written in MAP C (subset of ANSI C)
  • function_1: macro_1(a, b, c); macro_2(b, d); macro_2(c, e)
  • function_2: macro_3(s, t); macro_1(n, b); macro_4(t, k)
• Libraries of macros: macro_1, macro_2, macro_3, macro_4, ..., written in VHDL
• Data flow of function_1 on the FPGA: I/O -> a, b, c -> Macro_1 -> Macro_2 and Macro_2 -> d, e -> I/O

SRC Compilation Process

Figure labels: Application sources: .c or .f files, .mc or .mf files; Macro sources: .vhd or .v files; HDL sources: .v files; Logic synthesis; µP Compiler; MAP Compiler; Netlists:
.ngo files; Object files: .o files, .o files; Place & Route; Linker; .bin files; Configuration bitstreams; Application executable

Library Development - SRC

Figure:
• µP system: LLL (ASM); HLL (C, Fortran); HLL (C, Fortran)
• FPGA system: HDL (VHDL, Verilog); HLL (C, Fortran); HLL (C, Fortran)
• Roles: Library Developer; Application Programmer

SRC Programming Environment

+ very easy to learn and use
+ standard ANSI C
+ hides implementation details
+ very well integrated environment
+ mature
- subset of C
- legacy C code requires rewriting
- C limitations in describing HW (parallelism, data types)
- closed environment, limited portability of code to HW platforms other than SRC

Application Development for Reconfigurable Computers

Figure labels: Program Entry; Platform mapping; Debugging & Verification; Compilation; Execution

Ideal Program Entry

Figure: Function -> Program Entry

Actual Program Entry

Figure labels (inputs to Program Entry): Function; Preferred Architectures; SW/HW Partitioning; Use of FPGA Resources (multipliers, µP cores); FPGA Mapping; Data Transfers & Synchronization; Sequence of Run-time Reconfigurations; SW/HW Interface; Use of Internal and External Memories

Cinderella Story

AutoESL Design Technologies, Inc.
(25 employees)
Flagship product: AutoPilot, translating C/C++/SystemC to VHDL or Verilog
• Acquired by the biggest FPGA company, Xilinx Inc., in 2011
• AutoPilot integrated into the primary Xilinx toolset, Vivado, as Vivado HLS, released in 2012
“High-Level Synthesis for the Masses”

Vivado HLS

Figure: High Level Language (C, C++, SystemC) -> Vivado HLS -> Hardware Description Language (VHDL or Verilog)

HLS-Based Development and Benchmarking Flow

Figure labels: Reference Implementation in C; Manual Modifications (pragmas, tweaks); Test Vectors; HLS-ready C code; High-Level Synthesis; HDL Code; Functional Verification; Physical Implementation (FPGA Tools); Post Place & Route Netlist; Timing Verification; Results

LegUp – Academic Tool for HLS

– Open-source HLS Tool
  • Developed at the University of Toronto
  • Faculty supervisors: Jason H. Anderson and Stephen Brown
  • FPL Community Award 2014
– High-Level Synthesis from C to Verilog
– Targets Altera FPGAs (extension to Xilinx relatively simple)
– Two flows
  • Pure Hardware
  • Hardware/Software Hybrid = Tiger MIPS + hardware accelerator(s) + Avalon bus + shared on-chip and off-chip memory

Cryptol – New Language for Cryptology

– Domain specific language for cryptology: Cryptol
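
Returning to the HLS-Based Development and Benchmarking Flow slide: its first steps (a reference implementation in C, manual modifications such as pragmas, and test vectors) can be sketched in plain C. This is a minimal illustration under stated assumptions, not material from the lecture. The dot-product kernel, the names dot_ref, dot_hls, count_mismatches, VEC_LEN and NUM_VECTORS, and the test vectors themselves are all hypothetical. PIPELINE is a Vivado HLS directive; an ordinary C compiler ignores it (possibly with an unknown-pragma warning), so the same file can be checked in software before synthesis.

```c
#include <assert.h>   /* for callers that assert on count_mismatches() */
#include <stdint.h>

#define VEC_LEN 4      /* hypothetical fixed vector length */
#define NUM_VECTORS 3  /* hypothetical number of test vectors */

/* Reference implementation in C: written for clarity only. */
int32_t dot_ref(const int16_t *a, const int16_t *b, int n)
{
    int32_t acc = 0;
    for (int i = 0; i < n; i++)
        acc += (int32_t)a[i] * (int32_t)b[i];
    return acc;
}

/* HLS-ready C code: same arithmetic, but with fixed-size array
 * arguments, a constant loop bound, and a tool directive. */
int32_t dot_hls(const int16_t a[VEC_LEN], const int16_t b[VEC_LEN])
{
    int32_t acc = 0;
    for (int i = 0; i < VEC_LEN; i++) {
#pragma HLS PIPELINE II=1
        acc += (int32_t)a[i] * (int32_t)b[i];
    }
    return acc;
}

/* Test vectors: run both versions on the same inputs and return
 * the number of vectors on which they disagree. */
int count_mismatches(void)
{
    static const int16_t a[NUM_VECTORS][VEC_LEN] = {
        {1, 2, 3, 4}, {-1, 0, 1, 0}, {32767, -32768, 100, -100}
    };
    static const int16_t b[NUM_VECTORS][VEC_LEN] = {
        {5, 6, 7, 8}, {9, 9, 9, 9}, {2, 2, 3, 3}
    };
    int mismatches = 0;
    for (int v = 0; v < NUM_VECTORS; v++) {
        if (dot_hls(a[v], b[v]) != dot_ref(a[v], b[v], VEC_LEN))
            mismatches++;
    }
    return mismatches;
}
```

A testbench would typically call count_mismatches() and require the result to be 0 before passing dot_hls to the synthesis tool. The later steps of the flow (functional verification of the HDL, place and route, timing verification) happen inside the FPGA tools and are not modeled here.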