Embedded Processing Using FPGAs Agenda
• Why FPGA Platform Based Embedded Processing • Embedded Use Models And Their FPGA Based Solutions • Architecture/Topology Choices • A Reoccurring Question: Hardware Or Software • Reconfigurable Hardware • Tool Flows For FPGA Based Embedded Systems
2 - Embedded Processing using FPGAs www.xilinx.com Agenda
• Why FPGA Platform Based Embedded Processing • Embedded Use Models And Their FPGA Based Solutions • Architecture/Topology Choices • A Reoccurring Question: Hardware Or Software • Reconfigurable Hardware • Tool Flows For FPGA Based Embedded Systems
3 - Embedded Processing using FPGAs www.xilinx.com Why Use Processors In the First Place • Microcontrollers (µC) and Microprocessors (µP) Provide a Higher Level of Design Abstraction – Most µC functions can be implemented using VHDL or Verilog - downsides are parallelism & complexity – Using C/C++ abstraction & serial execution make certain functions much easier to implement in a µC • Discrete µCs are Inexpensive and Widely Used – µCs have years of momentum and software designers have vast experience using them
4 - Embedded Processing using FPGAs www.xilinx.com Why Embedded Design using FPGAs In Addition To The Universal Drive Towards Smaller Cheaper Faster With Less Power….
1 Difficult to Find the Required Mix of Peripherals in Off the Shelf (OTS) Microcontroller Solutions •2 Selecting a Single Processor Core with Long Term Solution Viability is Difficult at Best •3 Without Direct Ownership of the Processing Solution, Obsolescence is Always a Concern •4 Many Microprocessor Based Solutions Provide Limited On-Chip Peripheral Support
5 - Embedded Processing using FPGAs www.xilinx.com Rarely the Ideal Mix
1• Difficult to Find the Required Mix of Peripherals in Off the Shelf (OTS) Microcontroller Solutions – Even with variants, compromises must still be made
Microcontroller #1 Microcontroller #2 System Requirements FLASHFLASH UARTUART FLASHFLASH UARTUART 2@ UART RAM UART DDR CAN TIMER RAM UART DDR CAN I2C GPIOGPIO SPISPI GPIOGPIO SPISPI CPU Core CPU CPU Core CPU ? SPI CPU Core CPU CPU Core CPU GPIO Timer I2C Timer FLASH Timer I2C Timer DDR SDRAM Lacks I2C & Includes Lacks a Second UART & RAM vs DDR SDRAM Includes Unnecessary IP – Must decide between inventory impact of multiple configurations or device cost of comprehensive IP
6 - Embedded Processing using FPGAs www.xilinx.com Changing Requirements
2• Selecting a Single Processor Core with Long Term Solution Viability is Difficult at Best – Future proofing system requirements is difficult at best
Microcontroller Today’s System Future System
Requirements FLASHFLASH UARTUART Requirements 100 MHz 150 MHz 2@ UART DDRDDR UARTUART 2@ UART TIMER TIMER I2C GPIOGPIO SPISPI I2C ? SPI SPI 100 MHz Core MHz 100 100 MHz Core MHz 100 GPIO I2CI2C TimerTimer GPIO FLASH FLASH DDR SDRAM Completely meets current DDR SDRAM 10/100 Ethernet system requirements – Changing processor cores to accommodate new requirements consumes valuable design resources
7 - Embedded Processing using FPGAs www.xilinx.com Here Today, Gone Tomorrow
3• Without Direct Ownership of the Processing Solution, Obsolescence is Always a Concern – A single core is used to create a family of µCs
Microcontroller Variant #1 - High Volume Automotive FLASHFLASH CANCAN Microcontroller Variant #2 - Moderate Volume Consumer RAMRAM FLASHFLASH UARTUART SPISPI Microcontroller Variant #3 - Lower Volume Niche GPIOGPIO RAMRAM FLASHFLASH SPISPI UARTUART I2CI2C CPU Core CPU CPU Core CPU
GPIOGPIO RAMRAM TimerTimer SPISPI UARTUART CPU Core CPU CPU Core CPU
GPIOGPIO TimerTimer SPISPI CPU Core CPU
CPU Core CPU ?
TimerTimer – The sheer number of µC configurations can lead to obsolescence of lower volume configurations/variants
8 - Embedded Processing using FPGAs www.xilinx.com Chipset Solutions
4• Many Microprocessor Based Solutions Provide Limited On-Chip Peripheral Support – Manufacturers create I/O devices to extend the functionality of the base processor core
SerialSerial Microprocessor Chip-Set ParallelParallel MPU Interface MPU MPU Interface MPU PWMPWM Pre-defined interface
Microprocessor limits performance Microprocessor I/O Expansion Device – An external, pre-defined interface between a µP and its support device limits overall system performance
9 - Embedded Processing using FPGAs www.xilinx.com FPGA Embedded Design
1• FPGAs Allow For The Implementation Of An Ideal Mix Of Peripherals And System Infrastructure 2• New System Requirements Can Be Supported Without Changing Core Design 3• Longevity Of FPGAs Approaches The Longest Available Microcontrollers In The Market 4• FPGAs Are Used To Augment µp Functionality - Absorbing The Core Is The Next Natural Step
10 - Embedded Processing using FPGAs www.xilinx.com Agenda
• Why FPGA Platform Based Embedded Processing • Embedded Use Models And Their FPGA Based Solutions • Architecture/Topology Choices • A Reoccurring Question: Hardware Or Software • Reconfigurable Hardware • Tool Flows For FPGA Based Embedded Systems
11 - Embedded Processing using FPGAs www.xilinx.com Processor Use Models
1 2 3
State Machine Microcontroller Custom Embedded • Lowest Cost, No • Medium Cost, Some • Highest Integration, Peripherals, No RTOS Peripherals, Possible Extensive Peripherals, & No Bus Structures RTOS & Bus Structures RTOS & Bus Structures • VGA & LCD Controllers • Control & Instrumentation • Networking & Wireless • Low/High Performance • Moderate Performance • High Performance
Range of Use Models
12 - Embedded Processing using FPGAs www.xilinx.com Xilinx Processor Solutions
1 2 3
State Machine Microcontroller Custom Embedded
Range of Solutions
13 - Embedded Processing using FPGAs www.xilinx.com PicoBlaze 8-Bit Microcontroller Reference Design
• Free PicoBlaze macro for implementing complex state machines and micro sequencers • Supports Spartan-3, Virtex-4, Virtex-II/Pro FPGAs and CoolRunner-II CPLDs • Minimal logic size – Only 96 Spartan-3 slices, 5% of XC3S200 device • High performance – Up to 44 MIPS in Spartan-3 and 100 MIPS in Virtex-4 • 100% embedded capability – Everything in FPGA – no external components required • http://www.xilinx.com/picoblaze
14 - Embedded Processing using FPGAs www.xilinx.com • 3-Stage Parallel Pipeline MicroBlaze • Optional Instruction and Data Caches – 2KB to 64KB Each 123 DMIPS @ 150 MHz in • Optional Multiply Unit Virtex-II Pro (0.82 DMIPs/MHz) • 32 x 32-bit GPRs Program • Dedicated Fast Simplex Links - FSL Counter Execution Unit ( Add, Sub, Shift, – 32-bit, Point-to-Point Links Logical, Multiply ) between GPRs in the Register File and User Logic Instruction – 8 Input & 8 Output Decode ( includes – Blocking & Non-Blocking inst buffer ) FSL Register File Transfers – Predefined Insts ( 32 X 32b ) • Local Memory Bus Interfaces - LMB Data-Side Bus Interface Data-Side Local MemoryLocal - Bus LMB Local MemoryLocal - Bus LMB Data-Side Bus Interface Data-Side Local MemoryLocal - Bus LMB Local MemoryLocal - Bus LMB • Provide Access to Inst/Data Instruction-Side Bus Interface
Instruction-Side Bus Interface I-Cache D-Cache Memory Without Bus Access ( 2KB to 64KB ) ( 2KB to 64KB ) • 100 MHz+ On-Chip Peripheral Bus – 32-bit Semi-Duplex Bus Firm IP Core Relatively Placed Macro (RPM) • Fast access “Cache Link” interface Remaining FPGA Fabric On -Chip Peripheral Bus - OPB Cache Link
15 - Embedded Processing using FPGAs www.xilinx.com PowerPC Processor Block
10/100/1000 InstructionInstruction SideSide EMAC On Chip Memory OnFPGA Chip Logic Memory Core Client PHY InterfaceInterface
Processor Block ISOCM Control Auxiliary ISOCM Processor EMAC Statistics APU APU FCM Unit Control PowerPC 405 Core DCR Host I/F ISPLB Processor Local
DSPLB Bus Interface
DCR DCR Statistics EMAC Control DSOCM Device Configuration DSOCM Reset, Register Control Debug Register InterfaceInterface
10/100/1000 Client PHY Ext DCR Data Side EMAC On Chip Memory Core InterfaceInterface
16 - Embedded Processing using FPGAs www.xilinx.com Integrating a PowerPC 405
• IP-Immersion™ • ActiveInterconnect™ – Advanced Technology Metal – Segmented Routing Enables High “Headroom” Enables True IP Immersion Bandwidth Interface –1 IP-Immersion Tiles Provide Direct –2 General Purpose Routing Passes Connectivity to Virtex-II Pro Fabric Over Hard IP (Hex & Long)
Metal 9 Metal 8 Metal 7 2 Metal 6 Metal 5 Metal 4 Metal 4 Metal 4 Metal 3 Metal 3 Metal 3 Embedded Metal 2 Metal 2 Metal 2 Tile Tile PowerPC 405 1 Metal 1 Metal 1 Metal 1 IP-Immersion IP-Immersion Poly IP-Immersion Poly Poly
Silicon Substrate Replaces only 1,024 LUTs and 8 Block RAMs
17 - Embedded Processing using FPGAs www.xilinx.com UltraController
Easy to use 5-port HDL module JTAG sys_clk sys_rst
PowerPC 405 ResetReset JTAGJTAG
BRAM BRAM gpio_in ISOCM DSOCM ISOCM DSOCM gpio_out
Code stored in block RAM Virtex-II Pro Fabric
Simple processor/fabric interface uses minimal FPGA resources
18 - Embedded Processing using FPGAs www.xilinx.com UltraController II
Easy to use HDL module Interrupt
sys_clk sys_rst_out
RR PowerPC 405 ee sys_rst ss ee tt JJ gpio_in JTAG TT
A ISOCM
A DSOCM ISOCM DSOCM gpio_out GG
Code Loaded FPGAFPGA Fabric Fabric and stored in Cache Operates at maximum CPU frequency Simple processor/fabric interface uses minimal FPGA resources
19 - Embedded Processing using FPGAs www.xilinx.com Virtex 5 Processor Block
• 5 x 2 Crossbar Connection (128-bit) including Arbiters – Configure through bit-stream or DCR – Up to1:1 freq w/ processor • 1 PLB master interface to FPGA fabric Processor Block DMA LocalLink • 2 PLB slave interfaces from FPGA fabric DMA LocalLink • PLB Master/Slave ports PLB-S0 I/F – operate at diff freq 440 FCM APU Core Memory Ctlr I/F • 4 Thirty-two bit DMA Channels Interface Control IPLB DRPLB PLB-M I/F • Memory Ctlr Interface to connect DWPLB soft logic based Memory controller CPM/ PLB-S1 I/F • Auxiliary Processing Unit Controller Control
– 128 bit interface, pipelining supported DCR DMA LocalLink Internal • DCR Controller Module Registers DMA LocalLink
DCR Slave DCR Master • CPM and Control Module Interface Interface
20 - Embedded Processing using FPGAs www.xilinx.com The System Infrastructure
Bus infrastructure ranges from logic efficient to maximum bandwidth
MasterMaster MasterMaster MasterMaster MasterMaster MasterMaster MasterMaster
3-State3-State Bi-directionalBi-directional BusBus CentralCentral ArbitrationArbitration ControlControl
ArbArb ArbArb
SlaveSlave SlaveSlave SlaveSlave SlaveSlave SlaveSlave SlaveSlave
Bi-directional Bus Centrally Arbitrated Slave-side Arbitration Single channel of communication, Full duplex communication, N-channels of communication, least amount of logic required and moderate amount of logic required most expensive logic requirement moderate routing requirement and minimal routing requirement and extensive routing required
21 - Embedded Processing using FPGAs www.xilinx.com System Peripherals - IP
ArbiterArbiter PrimaryPrimary BusBus BridgeBridge SecondarySecondary BusBus ArbiterArbiter
Slower peripherals such as I/O functions are typically placed 10/10010/100 DDRDDR SPISPI WatchdogWatchdog UARTUART IntcIntc on secondary bus structures EthernetEthernet SDRAMSDRAM EEPROMEEPROM TimerTimer
Infrastructure I/O Based Peripherals Memory Interfaces High Level Functions High speed memory and system interfaces are typically placed on primary bus structures
22 - Embedded Processing using FPGAs www.xilinx.com High Levels of Integration
PowerPC 405 Data Control Register Bus - DCR SystemSystem ResetReset BRAM UserUser PeripheralPeripheral ISOCM DSOCM ISOCM DSOCM UserUser LogicLogic IPIFIPIF JTAGJTAG (master(master oror BlockBlock slave)slave) Instruction Data
Arbiter Processor Local Bus - PLB Bridge Secondary PLB Arbiter
10/100/1G10/100/1G MemoryMemory 10/10010/100 PCIPCI 32/3332/33 GPIOGPIO UARTUART EthernetEthernet ControllerController EthernetEthernet
MemoryMemory DDR SDRAM
SDRAM ZBT SSRAM Full System Customization & High Performance
23 - Embedded Processing using FPGAs www.xilinx.com Complete Customization
Dual Port Instruction-Side Dual Port Data-Side Local Local Memory Bus BlockBlock RAMRAM Memory Bus
MicroBlaze SystemSystem UserUser ResetReset PeripheralPeripheral
JTAG Inst LMB JTAG Data LMB IPIF Inst Inst LMB IPIF Data Data LMB BlockBlock
Processor Local Bus - PLB Arbiter
MemoryMemory EthernetEthernet UART1UART1 UARTUART 7 7 SPISPI ControllerController oror CANCAN
FPGA Fabric DDR SDRAM ZBT SSRAM SDRAM
24 - Embedded Processing using FPGAs www.xilinx.com Gigabit System Reference Design (GSRD) • Designed for maximum TCP/IP performance with the PowerPC 405 • GSRD building blocks – LocalLink Gigabit Ethernet MAC – Communications DMA Controller – Multi Port Memory Controller • High Performance TCP transmit – Over 900Mbps (with Treck TCP/IP stack) • Applications – High performance bridging – TCP/IP and user data interfaces – IP over GigE Ethernet Switching – High Speed Remote Monitoring – Remote Image/Video Capture
25 - Embedded Processing using FPGAs www.xilinx.com Multiported Memory Controller
26 - Embedded Processing using FPGAs www.xilinx.com Today You Have Multiple Topology Choices
Custom Local Memory Coprocessors
JTAG Point-to-Point MicroBlaze or PowerPC Debug Trace Connections for Shared Bus for higher performance PLBv46 smaller area Multi Port Interrupt Memory Controller Controller DMA Timer/PWM
Ethernet MAC I2C/SPI
PCI UART
Generic Peripheral Controller GPIO
CAN/MOSTGPIO Custom I/O GPIO Peripherals USBGPIOGPIO 2.0 Virtex or Spartan FPGA
27 - Embedded Processing using FPGAs www.xilinx.com Multicore Topologies
Memory Map. SW responsibility for non- PLBv46 PLBv46 conflicting resource partition XPS_INTC XPS_INTC Private boot memory Private boot memory Data_IRQ_A Data_IRQ_B
XPS BRAM XPS BRAM Shared external memory
Interrupt Interrupt Pass small sized messages between a IPLB2 IPLB IXCL IPLB MPMC DPLB DPLB2 DPLB DXCL sender and a receiver (External Memory) PPC405 MicroBlaze Partitioning clocking PLB XCL domains Use synchronous-polling or PLB XCL assync interrupts LL PLB Treatment of CPU and Free LL system reset
Other LL XPS_LLTEMAC Peripheral
_Data_IRQ_A _Data_IRQ_B Processor Subsystem 1 Send FIFO Processor Subsystem 2
Other XPS Receive FIFO Other XPS Peripherals Peripherals UART, GPIO, TIMER XPS_Mailbox Core UART, GPIO, TIMER
Mutex array
XPS_Mutex Core
Shared BRAM memory
CPUs with shared XPS BRAM You can still use shared resources need to sync. memory techniques with BRAM Configure number of
PLB_v46 BRIDGE mutex registers to test and set
28 - Embedded Processing using FPGAs www.xilinx.com Virtex 5 Processor Block
• 5 x 2 Crossbar Connection (128-bit) including Arbiters – Configure through bit-stream or DCR – Up to1:1 freq w/ processor • 1 PLB master interface to FPGA fabric Processor Block DMA LocalLink • 2 PLB slave interfaces from FPGA fabric DMA LocalLink • PLB Master/Slave ports PLB-S0 I/F – operate at diff freq 440 FCM APU Core Memory Ctlr I/F • 4 Thirty-two bit DMA Channels Interface Control IPLB DRPLB PLB-M I/F • Memory Ctlr Interface to connect DWPLB soft logic based Memory controller CPM/ PLB-S1 I/F • Auxiliary Processing Unit Controller Control
– 128 bit interface, pipelining supported DCR DMA LocalLink Internal • DCR Controller Module Registers DMA LocalLink
DCR Slave DCR Master • CPM and Control Module Interface Interface
29 - Embedded Processing using FPGAs www.xilinx.com FPGA logic
FPGA logic FPGA logic
Significant reduction in amount of FPGA logic required to buil d a system using integrated functionality of Virtex -5 Processor Block
30 - Embedded Processing using FPGAs www.xilinx.com Agenda
• Why FPGA Platform Based Embedded Processing • Embedded Use Models And Their FPGA Based Solutions • Architecture/Topology Choices • The Reoccurring Question: Hardware Or Software • Reconfigurable Hardware • Tool Flows For FPGA Based Embedded Systems
31 - Embedded Processing using FPGAs www.xilinx.com Two Ways to meet Performance Requirements
APU or FSL 5 GHz 5 MHz
CPU HW Accele FPGAFPGA rator µµCC
Brut Force Intelligent
33 - Embedded Processing using FPGAs www.xilinx.com Co-Processor Interfacing
FPGA,FPGA, ASICASIC Conventional µC µC oror ASSPASSP approach Bus Interface
• Fast • Deterministic • Easy Xilinx APU Solutions FSL implementation Co-Processor / Co-Processor / Peripheral Peripheral
34 - Embedded Processing using FPGAs www.xilinx.com Conventional vs. APU Implementations
Bus Conventional Co-ProcessorCo -Processor PowerPC BUS InterfaceInterface Logic Approach BUS Interface Logic Virtex-II PRO Logic
Overhead required to connect soft logic
Benefits APU centric • Less overhead logic APU approach PowerPC Controller Logic • Faster transfer Virtex-4 • Efficient software implementation
35 - Embedded Processing using FPGAs www.xilinx.com Bus Cycle Improvement via APU
Wr Execution Res With APU
Write Operand 1 Write Operand 2 Execution Read Result Read Status Without APU
Reduce number of bus cycles by factor of 10 by using APU
36 - Embedded Processing using FPGAs www.xilinx.com Up to 13x Performance Boost PPC 405 Single Precision FPU
Performance Algorithm Software Emulation of Single Precision Speed Up Floating Pt * FPU
FIR Filter 846.09 ms 63.94 ms 13x
Video Editing Algorithm 248.38 ms 27.93 ms 9x
PID Loop 1.66 us 0.37 us 4x
1024pt FFT 19.48 ms 5.42 ms 4x
(*C-Code is compiled using IBM Performance Libs delivered with EDK 8.2i)
www.xilinx.com Floating Point Accelerated by APU
• Algorithm description - FPU w/ FIR filter • PowerPC operating frequency – 400 MHz • APU with direct interface to fabric- XtremeDSP blocks
SW 2MFlops
PLBPLB ++ FPUFPU 2121 MFlopsMFlops
OCMOCM ++ FPUFPU 3030 MFlopsMFlops
APUAPU ++ FPUFPU 4545 MFlopsMFlops (est)(est)
APU improves algorithm execution 20X over software emulationemulation
38 - Embedded Processing using FPGAs www.xilinx.com Accelerated MPEG De-Compression Example: Video application - IDCT MPEG de-compression
• Leverages Virtex-4 Integrated Features PLB – PowerPC, APU, XtremeDSP Slice – Software Emulation – 13000 IDCTs/sec • HW acceleration over software – Lower latency and high bandwidth APU Soft PowerPC Auxiliary Controller • Efficient HW/SW design partitioning APU FPGA Processor – Optimized implementation I/F Interface ExtermeDSP • Significant performance increase ExtremeDSP Hard Processor Block – 20x improvement over software ExtremeDSP – Utilizing APU and ExtremeDSP OCM FPGA Fabric
39 - Embedded Processing using FPGAs www.xilinx.com Tailor the System to Achieve Performance Take MP3 Decoding with Custom Hardware Logic
100MHz100MHz MicroBlazeMicroBlaze TMTM ,, purepure softwaresoftware 1X == 146146 secondsseconds IMDCT DCT LL MAC
100MHz100MHz MicroBlazeMicroBlaze TMTM +FSL++FSL+ LLLL MACMAC 16X == 99 secondsseconds
TM Custom Hardware Logic HardwareCustom 100MHz100MHz MicroBlazeMicroBlaze TM +FSL+FSL +DCT+DCT ++ IMDCTIMDCT ++ LLLL MACMAC 21X == 77 secondsseconds 1x 2x 8x 10x 20x 50x 100x Performance Improvement
40 - Embedded Processing using FPGAs www.xilinx.com Note: MicroBlaze TM v4.00 core, ML40x board, 100MHz system clock, EDK8.1 An Example: MicroBlaze TM FPU in Motor Control
Function Configuration Improvement Fault GPIO Calls/Second MicroBlaze Feedback MicroBlaze TM UART w/FPU Over 82,000 Over 15X with FPU SPI Motor TM Contrl. MicroBlaze BRAM
Power Stage Power Over 5,300 1X CAN without FPU
IIC
Quad Isolation Ethernet Decoder
Memory HALL Interface Decoder Benefit: XC3S1500 The customer now have the extra processor bandwidth for use on other things. Extra cycles needed for specialized peripherals
• Design implications: – You will get more done for a given clock frequency – Reducing the clock frequency (if possible) reduces power dissipation
Note:41 - Embedded Used MicroBlaze Processing usingTM v4.00 FPGAs core on Spartan 3S1500.www.xilinx.com Auxiliary Processing Solutions
PLB Fabric Floating Point Control Application Module Unit Specific Hard Processor Block Interface -V4
APU Auxiliary 405 Core APU I/F Auxiliary Control Processor Processor FSL C to HDL
Poseidon Design OCM Impulse Systems General Celoxica Solution FPGA Fabric Technologies Solution
42 - Embedded Processing using FPGAs www.xilinx.com C to HDL Tools
Code Profile • Create Code • Profile Code • Identify Critical Code Segments
• Translate Critical Code to HDL HDL • Build HW Accelerators from HDL • Replace Critical Code with Calls to HW Accelerators Code Accelerator Call
43 - Embedded Processing using FPGAs www.xilinx.com Impulse C Programming Model • System of Independent sequential execution units – Called Processes – Can be executed in HW or SW and run concurrently • Processes communicate with each other – Via self-synchronizing FIFO buffers called streams, or – Via shared memories
SW Only SW FPGA Fabric Faster Producer Producer Execution Function Function Stream Function
Co Developer™ Critical Critical Function
Function PROCESS Function Consumer Consumer Function Stream PROCESS PROCESS
Sequential execution only execution Sequential Function Function Concurrent Software and Hardware execution
46 - Embedded Processing using FPGAs www.xilinx.com Impulse C-HDL Tool Flow
47 - Embedded Processing using FPGAs www.xilinx.com Agenda
• Why FPGA Platform Based Embedded Processing • Embedded Use Models And Their FPGA Based Solutions • Architecture/Topology Choices • A Reoccurring Question: Hardware Or Software • Reconfigurable Hardware • Tool Flows For FPGA Based Embedded Systems
48 - Embedded Processing using FPGAs www.xilinx.com What is Partial Reconfiguration?
• Dynamically modify blocks of logic in an FPGA by downloading partial bit files while unmodified logic continues to operate
• Logic is LUTs, FFs, BRAMs, DSP blocks, MGTs...
FPGA
Reconfig Block
49 - Embedded Processing using FPGAs www.xilinx.com PR Applications Today • JTRS Software Defined Radio – SWaP and multi-channel concurrency
• Microwave Communications – Flexibility and extended functionality • Cryptography – Physical isolation in one FPGA
50 - Embedded Processing using FPGAs www.xilinx.com SDR Development Platform
RJ45 – 100BT RJ45 – 100BT RJ45 – 100BT (Data) (Voice) (Ctlr)
Ethernet Switch
Ethernet interface Core Connect
WAVEFORM B WAVEFORM WAVEFORM A A WAVEFORM
Waveform combiner Interpolation filter Decimation filter TX NCO RX NCO D/A Interface A/D Interface (14 bits -- 300 MHz) (12 bits -- 200 MHz)
DAC ADC
51 - Embedded Processing using FPGAs www.xilinx.com PR Application: Dynamic Network Interface Protocol Switching
Config Memory Storage FPGA 10 GigE tx/rx 10 GigE tx/rx
OC48 tx/rx OC48 tx/rx Fibre tx/rx Switch uP Custom tx/rx Fibre tx/rx
Custom tx/rx
52 - Embedded Processing using FPGAs www.xilinx.com Partial Reconfiguration Applications
• Reduce FPGA size by multiplexing FPGA hardware resources
External Memory with full system configuration and PR bit streams
Memory Waveform D Waveform C Waveform B Waveform A
CPU
Reconfiguration Port Internal configuration access port
54 - Embedded Processing using FPGAs www.xilinx.com Agenda
• Why FPGA Platform Based Embedded Processing • Embedded Use Models And Their FPGA Based Solutions • Architecture/Topology Choices • A Reoccurring Question: Hardware Or Software • Reconfigurable Hardware • Tool Flows For FPGA Based Embedded Systems
55 - Embedded Processing using FPGAs www.xilinx.com Necessary Tools
• A Full Complement of Tools are Required to Design an Embedded Processor System – Processor System Generation – Hardware Implementation & Software Compilation – Hardware & Software Debug • HW Implementation & SW Compilation are the two Main Flows that Must be Addressed – The Embedded Flows Should Mirror Traditional Flows
56 - Embedded Processing using FPGAs www.xilinx.com System Generation
Tools are needed to create the system infrastructure, connect peripherals and clock configure the peripherals as desired interrupts reset CoreCore
ArbiterArbiter PrimaryPrimary BusBus BridgeBridge SecondarySecondary BusBus ArbiterArbiter
Databits = 8 FLASHFLASH TimerTimer UARTUART Baudrate = 9600 ? Parity = Off
SRAMSRAM Once the system is defined, the hardware can be implemented and software compiled
57 - Embedded Processing using FPGAs www.xilinx.com 58 - Embedded Processing using FPGAs www.xilinx.com Co-processor Creation Auto-generated HW & SW Templates
HW instantiated Test application code
SW Drivers created
61 - Embedded Processing using FPGAs www.xilinx.com Hardware Implementation
Source Code SynthesisSynthesis VHDL or Verilog
System Netlist Standard FPGA HW Development Flow
Constraints BuildBuild && MapMap Simulation/Synthesis
Build & Map Merged & Mapped Design Place & Route
Memory Map for Configuration File Place & Route Internal Code Storage Place & Route
Configuration File
62 - Embedded Processing using FPGAs www.xilinx.com Eclipse Based SDK for the Software Savvy Engineer
Powerful Software Development • Eclipse open source Software IDE allows an industry standard method for seamless tool integration • Consists of: • Perspectives • Views • Editors • Environment window displays one or more perspectives • A perspective contains views (like the Navigator) and editors
63 - Embedded Processing using FPGAs www.xilinx.com Software IP
Board Support • Device drivers are provided for Package (BSP) each hardware device • Device drivers are written in ANSI RTOS Adaptation Layer C and are designed to be portable across processors • Device drivers allow the user to
Driver select the desired functionality to UART 16550 UART Device Driver Device Device Driver Device Device Driver Device Device Drivers Device
Ethernet 10/100 Ethernet minimize the required memory IIC IIC MasterSlave & ATM Utopia Device ATM Peripheraln, n+1… Initialization Code • Device drivers in all layers have Boot Code common characteristics and API
64 - Embedded Processing using FPGAs www.xilinx.com Board Support Package
ArbiterArbiter PrimaryPrimary BusBus BridgeBridge SecondarySecondary BusBus ArbiterArbiter libxil.a
XEmac_mWriteReg ( … ); XEmac_SendFrame ( … ); DDRDDR 10/10010/100 SPISPI UARTUART TimerTimer IntcIntc ... SDRAMSDRAM EthernetEthernet EEPROMEEPROM XSpi_mSendByte ( … ); XSpi_GetStatusReg ( … ); ... xtmrctr.h XUartLite_SendByte ( … ); xemac.h XUartLite_GetStats ( … ); XTmrCtr_mWriteReg ( … ); ... XEmac_mWriteReg ( … ); XTmrCtr_IsExpired ( … ); XTmrCtr_mWriteReg ( … ); XEmac_SendFrame ( … ); XTmrCtr_IsExpired ( … ); ... XIntc_mEnableIntr ( … ); ... … uartlite.h ... XIntc_mEnableIntr ( … ); ... xspi.h XUartLite_SendByte ( … ); XUartLite_GetStats ( … ); XSpi_mSendByte ( … );... XSpi_GetStatusReg ( … ); ...
Board Support Packages (BSPs) are collections of parameterized drivers for a specific processor system
65 - Embedded Processing using FPGAs www.xilinx.com Software Compilation
C Code CompilerCompiler Source Code
Standard Embedded Assembly Code SW Development Flow
C/C++ Cross Compiler AssemblerAssembler Source Code
Assembler Relocatable Linker Object Code
Machine Code Mapfile and Linker Linker Linker Script
Machine Code
66 - Embedded Processing using FPGAs www.xilinx.com System Design Flow
C Code VHDL or Verilog
Standard Embedded Embedded Standard FPGA SW Development Flow Developers Kit HW Development Flow
Code Entry Board Support HDL Entry System Netlist Package Instantiate the C/C++Include Cross the Compiler BSP Simulation/Synthesis and Compile the ‘System Netlist’ Software Image and Implement Linker Data2MEM Implementationthe FPGA
? 2 Compiled ELF 3 Compiled BIT 1 ?
Download Combined Load Software Download Bitstream Image to FPGA Into FLASH Into FPGA
Debugger Chipscope
RTOS, Board Support Package
67 - Embedded Processing using FPGAs www.xilinx.com HW & SW Debug
Internal & external logic analysis is JTAG Port used to determine where problems reside Debug Port clock interrupts IntegratedIntegrated reset CoreCore BusBus AnalyzerAnalyzer
ArbiterArbiter PrimaryPrimary BusBus BridgeBridge SecondarySecondary BusBus ArbiterArbiter
SW debug typically IntegratedIntegrated FLASHFLASH SRAMSRAM TimerTimer UARTUART consists of JTAG based LogicLogic AnalyzerAnalyzer runtime control of the target processor core
External Connection
68 - Embedded Processing using FPGAs www.xilinx.com “Platform Debug” Integrated HW/SW Debuggers
• Cross Trigger HW and SW Debuggers to Find and Fix Bugs Faster!
• Enable better insight into the HW / SW code dynamics
69 - Embedded Processing using FPGAs www.xilinx.com Key Messages
• FPGA Based Embedded Systems Solve Classic Integration Issues (Size, Cost, Power) • They Also Creating Novel Solutions Not Easily Realized Using Other Technologies • FPGA Platforms Enable Flexible and Dynamic Systems • Intelligent Tools Accelerate Development And Enable You To Optimize The Balance Of Feature Set, Performance, Area & Cost Of Your Design
72 - Embedded Processing using FPGAs www.xilinx.com Q&A
73 - Embedded Processing using FPGAs www.xilinx.com