INTERNATIONAL SCIENTIFIC CONFERENCE
23-24 November 2007, GABROVO

IMPLEMENTATION OF SOFT-CORE PROCESSORS IN FPGAs

Valentina Stoianova Kukenska, Technical University of Gabrovo
Petar Borisov Minev, Technical University of Gabrovo

Abstract

Field programmable gate arrays (FPGAs) provide designers with the ability to quickly create hardware circuits. Increases in FPGA configurable logic capacity and decreasing FPGA costs have enabled designers to more readily incorporate FPGAs in their designs. FPGA vendors have begun providing configurable soft processor cores that can be synthesized onto their FPGA products. While FPGAs with soft processor cores provide designers with increased flexibility, such processors typically have degraded performance and energy consumption compared to hard-core processors. In this paper, we study the implementation of soft-core processors in FPGAs, and some of the decisions and design tradeoffs which must be made during the design process.

Keywords: soft-cores, soft-processors, FPGA, embedded systems.

1. INTRODUCTION

Microprocessors on field-programmable gate array (FPGA) chips are becoming an increasingly popular software implementation platform, due to their coexistence on-chip with custom logic. Such coexistence can reduce parts costs and board sizes, and can improve system performance due to reduced communication times between processor and FPGA. A hard-core processor is laid out on the chip next to the FPGA's configurable logic fabric. In contrast, a soft-core processor is synthesized onto the FPGA's fabric, just like any other circuit.

Compared to hard-core microprocessors on some FPGA devices, soft-core processors have the advantages of utilizing standard mass-produced and hence lower-cost FPGA parts, and of enabling a custom number of microprocessors per FPGA (subject to size constraints); over 100 soft-core processors can fit on modern high-end FPGAs. However, soft-core processors have the disadvantages of reduced processor performance, higher power consumption, and larger size [1].

While any soft-core microprocessor could conceivably be mapped to an FPGA, FPGA vendors have in the past years introduced soft-core processors specifically targeted for FPGA implementation. Such FPGA soft-cores have instruction sets, arithmetic-logic units, register files, and other features specifically tailored to efficiently use FPGA resources, or perhaps more accurately, to avoid inefficient use of FPGA resources that may occur when synthesizing a general soft-core processor to an FPGA. The performance overhead of such soft-core processors on FPGAs compared to general soft-core processors on ASICs (application-specific integrated circuits) can thus be significantly less than the overheads when comparing FPGA versus ASIC implementations of general circuits [1].

A feature of FPGA soft-core processors is that of core configuration by the user (the application developer) through the setting of parameters. Configurable parameters may include instantiating a cache (and specifying its size), or instantiating a predefined datapath unit (like a multiplier or floating-point unit) and an accompanying instruction that uses the instantiated unit. Parameterized soft cores represent a different problem from that of developing custom datapath units and accompanying custom instructions, as done in application-specific instruction-set processors (ASIPs) like the ASIC-oriented ASIPs or FPGA-oriented ASIPs, due to the "on/off" (or limited number of) values of the parameters [1].

2. EXAMPLES OF SOFT PROCESSOR CORES

Today, we are witnessing the emergence of many commercial soft-core processors, as well as supporting and development tools. Some of the principal products available are Altera Nios/NiosII, LatticeMico32, and Xilinx MicroBlaze. They offer memory and logic elements with several Intellectual Property (IP) peripherals for the rapid development of System-on-Programmable-Chip (SoPC) [2].
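
The "on/off" style of core configuration described in the Introduction can be illustrated with a small sketch. It enumerates every combination of a few optional units and keeps those that fit a resource budget. The option names, the LUT figures and the budget are hypothetical placeholders chosen for illustration; they are not vendor data and are not taken from [1] or [2].

```python
from itertools import product

# Hypothetical optional units and their LUT costs (illustrative placeholders,
# not measured values for any real soft core).
OPTIONS = {
    "multiplier": 300,
    "fpu": 900,
    "icache": 250,
    "dcache": 250,
}
BASE_LUTS = 900  # hypothetical cost of the bare core with every option off


def enumerate_configs(options):
    """Yield every on/off combination as a dict {option_name: bool}."""
    names = list(options)
    for bits in product([False, True], repeat=len(names)):
        yield dict(zip(names, bits))


def lut_cost(config, options, base=BASE_LUTS):
    """Return the base cost plus the cost of every enabled option."""
    return base + sum(options[name] for name, on in config.items() if on)


def configs_within_budget(options, budget):
    """Return all configurations whose estimated cost fits the LUT budget."""
    return [c for c in enumerate_configs(options) if lut_cost(c, options) <= budget]


if __name__ == "__main__":
    print(len(list(enumerate_configs(OPTIONS))), "possible configurations")
    for cfg in configs_within_budget(OPTIONS, budget=1500):
        enabled = [name for name, on in cfg.items() if on] or ["bare core"]
        print(lut_cost(cfg, OPTIONS), "LUTs:", ", ".join(enabled))
```

With n on/off parameters the search space has only 2^n points, which is why selecting parameter values is a different problem from designing new custom datapath units. A real flow would replace the placeholder costs with figures from a synthesis utilization report.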

2.1 MicroBlaze Soft Processor Core

A popular soft processor core example is Xilinx's MicroBlaze, which can be customized with different peripheral and memory configurations. This soft processor core is a 32-bit Reduced Instruction Set Computer (RISC). It has a three-stage pipeline with variable-length instruction latencies, typically ranging from one to three cycles. The tool used to create the design is Xilinx Platform Studio; in this friendly environment, a MicroBlaze-based system is built by instantiating and configuring cores from the provided libraries.

MicroBlaze is built around a Harvard memory architecture. Two Local Memory Buses (LMB) connect the instruction and data memories. The sizes of these memories, as well as the number of peripherals used in a particular design, are defined by the user. Additionally, the On-Chip Peripheral Bus is used to alleviate system performance bottlenecks and is designed to support low-performance, low-speed peripherals such as UART, GPIO, USB and external bus controllers. A MicroBlaze system is presented in Fig. 6 as a good example of this technology.

The MicroBlaze can operate at up to 200 MHz within a Virtex-4 (4VLX40-12) component. The range of resources required to implement a MicroBlaze soft processor is between 900 and 2,600 Xilinx Look-Up Tables (LUTs), depending on how the processor is configured [2].

2.2 NIOS II Soft Processor Core

Another popular soft-core processor example is Altera's NIOS II, which has a load-store RISC architecture in which many architectural parameters can be customized at design time. The user can choose between a 16-bit and a 32-bit datapath width and can select register file sizes, cache size, and custom instructions that perform user-defined operations in customized hardware to speed them up. These functionalities are supported by the builder development tools, and the Nios II Integrated Development Environment (IDE) makes it possible to build, run, and debug software for several platforms. Altera also introduces an SOPC Builder [38] for the rapid creation and easy evaluation of embedded systems. Off-the-shelf intellectual property (IP) and reusable custom components are integrated in a friendly way, which reduces the time required to set up an SoC and makes it possible to construct designs in hours instead of weeks [2].

2.3 Mico32 Soft Processor Core

Both Xilinx and Altera created their own proprietary soft-core processors, making the decision to accept a tougher adoption curve in exchange for saddling customers with an IP block that tended to lock their design into that particular FPGA vendor's devices.

Like Xilinx's MicroBlaze and Altera's Nios, Lattice's Mico32 is a soft-core RISC processor that can be easily dropped into an FPGA. Unlike the others, however, Mico32 is completely open [3]. Rather than take the lock-'em-in approach of their competitors, Lattice has gone the open source route, cleverly betting that enabling processor-based designs on their devices was much more important than locking customers into their architecture with an IP core.

LatticeMico32 uses fewer than 2,000 look-up tables (LUTs) on an FPGA, which makes it a very inexpensive engine for your embedded design. Because the processor is soft, you can configure it with just the options you want for your application. Optional features include things like data and instruction caches, user-defined instructions, and multipliers. This kind of application-specific customizability, together with the ability to add any number of processors to your design with only a small area penalty, is the kind of flexibility that has made FPGA-based soft cores so popular among designers. Mico32 weighs in with 32 general-purpose registers, up to 32 external interrupts, and a dual Wishbone memory interface. Lattice estimates that the processors can run at over 100 MHz on their low-cost 90 nm ECP2 FPGAs.

In keeping with the open-source approach, Lattice chose the public domain Wishbone bus interface for Mico32 and has already announced a variety of available peripherals, including memory controllers, asynchronous SRAM, on-chip block memory, I/O ports, a 32-bit timer, a DMA controller, general-purpose I/O (GPIO), an I2C master controller, a serial peripheral interface (SPI), and a UART. These plug-on peripherals dramatically speed up system design, eliminating the need to custom-code many of the common hardware functions if you're building a Mico32-based embedded system [3].

3. DESIGN CONSIDERATIONS

3.1 Performance and Power

Two potentially critical system factors are the desired functionality and operational performance, and the power required to implement the desired system functionality. There will typically be a delta between the power consumption and level of performance for fixed-function processor implementations and potentially more flexible FPGA-based soft processor cores.

In order to compare the relative performance of soft processor cores, a common processor benchmark approach must be used. Currently the most common benchmark is the DMIPS (Dhrystone Million Instructions Per Second) benchmark. The DMIPS benchmark is based on running an algorithm on a targeted processor core to measure its integer processing capabilities within a defined time period.

Additional performance considerations include the architecture of the soft processor core and its suitability for the targeted application. Factors to evaluate include: type and size of the memory and peripheral bus; size and model of address space; type and size of cache (instruction/data); type of controllers like DMA and interrupt structure; hardware accelerator capability (co-processor functionality); functional units such as the register file and execution units; type of pipeline and strategies to prevent stalls such as branch prediction.

Several factors influence power consumption, including speed of operation, the number and type of resources required to implement the soft processor core, and the characteristics of the FPGA component, including static and dynamic power consumption vs. operational speed and temperature. One of the challenges associated with FPGA design is the difficulty of estimating power consumption. In an ideal development flow, schedule and resources will be allocated to design evaluation on a targeted development platform with an identical target FPGA component and soft processor implementation [4].

3.2 Design and Development Tools

The features and ease of use of the tool suite should be considered along with the tool design flow. Effective tool evaluation and analysis is important. The following factors can have a significant effect on design cycle efficiency: ease of use and feature set; design tool flow; development environment tool maturity; compatibility between major software releases; available training and quality of tool tutorials; debug and verification capabilities [4].

The tool suite (Figure 1) includes a collection of traditional software and FPGA design and development tools. The interaction between these two tool groups is commonly referred to as co-design or platform development tool. The software and soft processor core development tools are responsible for the parameterization of the soft core and associated peripherals and the implementation of processor buses, memory maps, interrupt structures and required processor peripherals. The software tools also include traditional compilation, linking, debug and download to the target processor.

FPGA design tools include the traditional development environments for capturing and synthesizing HDL code, simulation, place and route, debug and download of the design to the target FPGA platform.

3.3 Considerations

Another important design factor is the ability to utilize popular operating systems (OSs). Most embedded designs on 32-bit processors include an OS to reduce the design time of the software by providing an abstraction interface level to the software. Most operating systems include the OS and any lower-level software required to connect the OS to the hardware. This collection of software elements is commonly referred to as a board support package (BSP) (Figure 1). The BSP can include items such as the processor boot code and interrupt service routines for peripherals.

Fig. 1 (block diagram not reproduced. Block labels: Platform Development Tool (Co-Design); Software Development Tools; FPGA Design and Development; Target Board; HW Platform configuration; Generate Platform Description in HDL; HDL Description; Processor BSP; Compiler; Assembler; Linker; Instruction Set Simulator; Debugger; Debug; HDL Entry; Simulation; Synthesize; Place and Route; FPGA Program; Flash)

Some important OS considerations include interrupt latency, kernel size, implementation of a robust set of services, and a collection of full-featured middleware. Some example middleware components include: USB stack; TCP/IP stack; embedded web server; encryption algorithms; wireless Ethernet connection.

Other items which should be considered when selecting an OS include the API set, level of IDE integration, tasking models, kernel robustness, pre-emption, resource allocation, protection schemes and OS footprint.

Processor cores typically have a list of certified operating systems that have been pre-verified. If the design team does not have experience with the selected OS, it is advantageous for the team to be trained on the specifics of the OS to reduce development time and eliminate issues that could be encountered during development. Typical OS components include: task services; priority levels; timer management; memory management; application programmer's interface (API); inter-task communication and synchronization [4].

3.4 Debug Options and Capability

The debug phase of a design will be iterative by nature and can consume a significant percentage of a design schedule without the correct tools and design access. The ability to efficiently debug a design can save weeks of design effort and schedule. Robust debug features and capability are very important design efficiency factors. Some of the most effective tools for debugging a soft processor core design include: simulation (behavioral and timing); timing analysis; embedded logic analyzers and embedded bus analyzers; software simulators; non-intrusive real-time software debugger; trace capability; hardware/software logic analyzer triggers; board-level visual indicators, signal access ports and input control signals; standardized debug interface via JTAG bus [4].

3.5 Common Design Oversights

The following design factors are often overlooked by design teams new to implementing embedded FPGA soft processors. These factors should be given special consideration during each design cycle. Making a mistake in any of these areas may result in a significant impact to a project's cost or schedule.

Power consumption: verify consumption on an evaluation board before final target board design and layout; Underestimating the FPGA resources required to implement the complete soft processor solution including all peripherals and bus structures: implement the design and analyze the utilization report; Incomplete understanding of the impact of a soft processor's bus structure, memory interface overhead and peripheral interface speed on overall performance: verify performance and functionality by implementing a design on a target evaluation board early in the design cycle; Not implementing or maintaining sufficient design margin for design migration and expansion: select target FPGA components with room for growth with a common package to support potential future design enhancements [4].

4. CONCLUSION

This paper presents the implementation of soft-core processors in FPGAs, and some of the decisions and design tradeoffs which must be made during the design process. Making informed decisions during the design process reduces the time required to design, implement, debug and test an FPGA soft processor-based project. Important design factors are reviewed, common design oversights are discussed and examples of soft cores are presented.

Soft processor design teams will benefit from a system-oriented design approach, which considers the long-term effects of design decisions at each design phase. With a solid understanding of the overall design cycle, development tool options, and benefits of key design trade-studies, the design team can avoid many common design mistakes and oversights, resulting in a more efficient and flexible design cycle.

5. REFERENCES

[1] Sheldon, D., R. Kumar, F. Vahid, R. Lysecky, D. Tullsen, Application-Specific Customization of Parameterized FPGA Soft-Core Processors, International Conference on Computer-Aided Design, ICCAD, San Jose, November 2006.
[2] Calderón, H., C. Elena, S. Vassiliadis, Soft Core Processors and Embedded Processing: a survey and analysis, Proceedings of ProRISC, pp. 483-488, Veldhoven, The Netherlands, November 2005.
[3] Morris, K., Soft Core War LatticeMico32 Opens the Field, FPGA and Structured ASIC, September 26, 2006.
[4] Cofer, R.C., B. Harding, FPGA Soft Processor Design Considerations, Programmable Logic DesignLine, October 12, 2005.