Embedded Processing Using Fpgas Agenda

Embedded Processing Using FPGAs Agenda • Why FPGA Platform Based Embedded Processing • Embedded Use Models And Their FPGA Based Solutions • Architecture/Topology Choices • A Reoccurring Question: Hardware Or Software • Reconfigurable Hardware • Tool Flows For FPGA Based Embedded Systems 2 - Embedded Processing using FPGAs www.xilinx.com Agenda • Why FPGA Platform Based Embedded Processing • Embedded Use Models And Their FPGA Based Solutions • Architecture/Topology Choices • A Reoccurring Question: Hardware Or Software • Reconfigurable Hardware • Tool Flows For FPGA Based Embedded Systems 3 - Embedded Processing using FPGAs www.xilinx.com Why Use Processors In the First Place • Microcontrollers (µC) and Microprocessors (µP) Provide a Higher Level of Design Abstraction – Most µC functions can be implemented using VHDL or Verilog - downsides are parallelism & complexity – Using C/C++ abstraction & serial execution make certain functions much easier to implement in a µC • Discrete µCs are Inexpensive and Widely Used – µCs have years of momentum and software designers have vast experience using them 4 - Embedded Processing using FPGAs www.xilinx.com Why Embedded Design using FPGAs In Addition To The Universal Drive Towards Smaller Cheaper Faster With Less Power…. 1 Difficult to Find the Required Mix of Peripherals in Off the Shelf (OTS) Microcontroller Solutions •2 Selecting a Single Processor Core with Long Term Solution Viability is Difficult at Best •3 Without Direct Ownership of the Processing Solution, Obsolescence is Always a Concern •4 Many Microprocessor Based Solutions Provide Limited On-Chip Peripheral Support 5 - Embedded Processing using FPGAs www.xilinx.com Rarely the Ideal Mix 1• Difficult to Find the Required Mix of Peripherals in Off the Shelf (OTS) Microcontroller Solutions – Even with variants, compromises must still be made Microcontroller #1 Microcontroller #2 System Requirements FLASHFLASH UARTUART FLASHFLASH UARTUART 2@ UART RAM UART DDR CAN TIMER RAM UART DDR CAN I2C GPIOGPIO SPISPI GPIOGPIO SPISPI CPU Core CPU CPU Core CPU ? SPI CPU Core CPU CPU Core CPU GPIO Timer I2C Timer FLASH Timer I2C Timer DDR SDRAM Lacks I2C & Includes Lacks a Second UART & RAM vs DDR SDRAM Includes Unnecessary IP – Must decide between inventory impact of multiple configurations or device cost of comprehensive IP 6 - Embedded Processing using FPGAs www.xilinx.com Changing Requirements 2• Selecting a Single Processor Core with Long Term Solution Viability is Difficult at Best – Future proofing system requirements is difficult at best Microcontroller Today’s System Future System Requirements FLASHFLASH UARTUART Requirements 100 MHz 150 MHz 2@ UART DDRDDR UARTUART 2@ UART TIMER TIMER I2C GPIOGPIO SPISPI I2C ? SPI SPI 100 MHz MHz Core 100 100 MHz MHz Core 100 GPIO I2CI2C TimerTimer GPIO FLASH FLASH DDR SDRAM Completely meets current DDR SDRAM 10/100 Ethernet system requirements – Changing processor cores to accommodate new requirements consumes valuable design resources 7 - Embedded Processing using FPGAs www.xilinx.com Here Today, Gone Tomorrow 3• Without Direct Ownership of the Processing Solution, Obsolescence is Always a Concern – A single core is used to create a family of µCs Microcontroller Variant #1 - High Volume Automotive FLASHFLASH CANCAN Microcontroller Variant #2 - Moderate Volume Consumer RAMRAM FLASHFLASH UARTUART SPISPI Microcontroller Variant #3 - Lower Volume Niche GPIOGPIO RAMRAM FLASHFLASH SPISPI UARTUART I2CI2C CPU Core CPU CPU Core CPU GPIOGPIO RAMRAM TimerTimer SPISPI UARTUART CPU Core CPU CPU Core CPU GPIOGPIO TimerTimer SPISPI CPU Core CPU CPU Core CPU ? TimerTimer – The sheer number of µC configurations can lead to obsolescence of lower volume configurations/variants 8 - Embedded Processing using FPGAs www.xilinx.com Chipset Solutions 4• Many Microprocessor Based Solutions Provide Limited On-Chip Peripheral Support – Manufacturers create I/O devices to extend the functionality of the base processor core SerialSerial Microprocessor Chip-Set ParallelParallel MPU Interface MPU MPU Interface MPU PWMPWM Pre-defined interface Microprocessor limits performance Microprocessor I/O Expansion Device – An external, pre-defined interface between a µP and its support device limits overall system performance 9 - Embedded Processing using FPGAs www.xilinx.com FPGA Embedded Design 1• FPGAs Allow For The Implementation Of An Ideal Mix Of Peripherals And System Infrastructure 2• New System Requirements Can Be Supported Without Changing Core Design 3• Longevity Of FPGAs Approaches The Longest Available Microcontrollers In The Market 4• FPGAs Are Used To Augment µp Functionality - Absorbing The Core Is The Next Natural Step 10 - Embedded Processing using FPGAs www.xilinx.com Agenda • Why FPGA Platform Based Embedded Processing • Embedded Use Models And Their FPGA Based Solutions • Architecture/Topology Choices • A Reoccurring Question: Hardware Or Software • Reconfigurable Hardware • Tool Flows For FPGA Based Embedded Systems 11 - Embedded Processing using FPGAs www.xilinx.com Processor Use Models 1 2 3 State Machine Microcontroller Custom Embedded • Lowest Cost, No • Medium Cost, Some • Highest Integration, Peripherals, No RTOS Peripherals, Possible Extensive Peripherals, & No Bus Structures RTOS & Bus Structures RTOS & Bus Structures • VGA & LCD Controllers • Control & Instrumentation • Networking & Wireless • Low/High Performance • Moderate Performance • High Performance Range of Use Models 12 - Embedded Processing using FPGAs www.xilinx.com Xilinx Processor Solutions 1 2 3 State Machine Microcontroller Custom Embedded Range of Solutions 13 - Embedded Processing using FPGAs www.xilinx.com PicoBlaze 8-Bit Microcontroller Reference Design • Free PicoBlaze macro for implementing complex state machines and micro sequencers • Supports Spartan-3, Virtex-4, Virtex-II/Pro FPGAs and CoolRunner-II CPLDs • Minimal logic size – Only 96 Spartan-3 slices, 5% of XC3S200 device • High performance – Up to 44 MIPS in Spartan-3 and 100 MIPS in Virtex-4 • 100% embedded capability – Everything in FPGA – no external components required • http://www.xilinx.com/picoblaze 14 - Embedded Processing using FPGAs www.xilinx.com • 3-Stage Parallel Pipeline MicroBlaze • Optional Instruction and Data Caches – 2KB to 64KB Each 123 DMIPS @ 150 MHz in • Optional Multiply Unit Virtex-II Pro (0.82 DMIPs/MHz) • 32 x 32-bit GPRs Program • Dedicated Fast Simplex Links - FSL Counter Execution Unit ( Add, Sub, Shift, – 32-bit, Point-to-Point Links Logical, Multiply ) between GPRs in the Register File and User Logic Instruction – 8 Input & 8 Output Decode ( includes – Blocking & Non-Blocking inst buffer ) FSL Register File Transfers – Predefined Insts ( 32 X 32b ) • Local Memory Bus Interfaces - LMB Data-Side Bus Interface Data-Side Local MemoryLocal -Bus LMB Local MemoryLocal -Bus LMB Data-Side Bus Interface Data-Side Local MemoryLocal -Bus LMB Local MemoryLocal -Bus LMB • Provide Access to Inst/Data Instruction-Side Bus Interface Instruction-Side Bus Interface I-Cache D-Cache Memory Without Bus Access ( 2KB to 64KB ) ( 2KB to 64KB ) • 100 MHz+ On-Chip Peripheral Bus – 32-bit Semi-Duplex Bus Firm IP Core Relatively Placed Macro (RPM) • Fast access “Cache Link” interface Remaining FPGA Fabric On -Chip Peripheral Bus - OPB Cache Link 15 - Embedded Processing using FPGAs www.xilinx.com PowerPC Processor Block 10/100/1000 InstructionInstruction SideSide EMAC On Chip Memory OnFPGA Chip Logic Memory Core Client PHY InterfaceInterface Processor Block ISOCM Control Auxiliary ISOCM Processor EMAC Statistics APU APU FCM Unit Control PowerPC 405 Core DCR Host I/F ISPLB Processor Local DSPLB Bus Interface DCR DCR Statistics EMAC Control DSOCM Device Configuration DSOCM Reset, Register Control Debug Register InterfaceInterface 10/100/1000 Client PHY Ext DCR Data Side EMAC On Chip Memory Core InterfaceInterface 16 - Embedded Processing using FPGAs www.xilinx.com Integrating a PowerPC 405 • IP-Immersion™ • ActiveInterconnect™ – Advanced Technology Metal – Segmented Routing Enables High “Headroom” Enables True IP Immersion Bandwidth Interface –1 IP-Immersion Tiles Provide Direct –2 General Purpose Routing Passes Connectivity to Virtex-II Pro Fabric Over Hard IP (Hex & Long) Metal 9 Metal 8 Metal 7 2 Metal 6 Metal 5 Metal 4 Metal 4 Metal 4 Metal 3 Metal 3 Metal 3 Embedded Metal 2 Metal 2 Metal 2 Tile Tile PowerPC 405 1 Metal 1 Metal 1 Metal 1 IP-Immersion IP-Immersion Poly IP-Immersion Poly Poly Silicon Substrate Replaces only 1,024 LUTs and 8 Block RAMs 17 - Embedded Processing using FPGAs www.xilinx.com UltraController Easy to use 5-port HDL module JTAG sys_clk sys_rst PowerPC 405 ResetReset JTAGJTAG BRAM BRAM gpio_in ISOCM DSOCM ISOCM DSOCM gpio_out Code stored in block RAM Virtex-II Pro Fabric Simple processor/fabric interface uses minimal FPGA resources 18 - Embedded Processing using FPGAs www.xilinx.com UltraController II Easy to use HDL module Interrupt sys_clk sys_rst_out RR PowerPC 405 ee sys_rst ss ee tt JJ gpio_in JTAG TT A ISOCM A DSOCM ISOCM DSOCM gpio_out GG Code Loaded FPGAFPGA Fabric Fabric and stored in Cache Operates at maximum CPU frequency Simple processor/fabric interface uses minimal FPGA resources 19 - Embedded Processing using FPGAs www.xilinx.com Virtex 5 Processor Block • 5 x 2 Crossbar Connection (128-bit) including Arbiters – Configure through bit-stream or DCR – Up to1:1 freq w/ processor • 1 PLB master interface to FPGA fabric Processor Block DMA LocalLink • 2 PLB slave interfaces from FPGA fabric DMA LocalLink • PLB Master/Slave ports PLB-S0 I/F –

Embedded Processing Using Fpgas Agenda

Automatic Generation of Hardware for Custom Instructions

Review of FPD's Languages, Compilers, Interpreters and Tools

Reconfigurable Computing in the New Age of Parallelism

PACT HDL: AC Compiler Targeting Asics and Fpgas With

HDL and Programming Languages ■ 6 Languages ■ 6.1 Analogue Circuit Design ■ 6.2 Digital Circuit Design ■ 6.3 Printed Circuit Board Design ■ 7 See Also

Advances in Architectures and Tools for Fpgas and Their Impact on the Design of Complex Systems for Particle Physics

Evaluation of the Hardwired Sequence Control System Generated by High-Level Synthesis,” Proc

An Automated Flow to Generate Hardware Computing Nodes from C for an FPGA-Based MPI Computing Network

Advances in Architectures and Tools for Fpgas and Their Impact on the Design of Complex Systems for Particle Physics

An Integrated Development Toolset and Implementation Methodology for Partially Reconfigurable System-On-Chips

Download The

The GENCOD Project: Automated Generation of Hardware Code For