MicroBlaze Soft Processor Core
- 32-bit Harvard RISC architecture
- Performance, features, and cost
- Easy to use
- Area-efficient
- Optimized for cost-sensitive designs
- Intended to remain supported in future solutions
MicroBlaze's Features
- Cost-efficient, high-performance 32-bit soft processor
- Optimized for Xilinx FPGAs
- Highly configurable feature set
- Co-processor interface for hardware acceleration
- Fully supported by the Platform Studio embedded development environment
- JTAG-based integrated debug support
- Optional integrated single-precision FPU
PicoBlaze
- Cost-effective, fully embedded 8-bit RISC microcontroller core
- Optimized for the Virtex and Spartan series of FPGAs and CoolRunner-II CPLDs
- Provided as a free, source-level VHDL file with royalty-free re-use within Xilinx FPGAs
- Suited to cost-efficient microcontroller-based control and simple data processing
PicoBlaze's Features
- 16 byte-wide general-purpose data registers
- 1K instructions of programmable on-chip program store, automatically loaded during FPGA configuration
- Byte-wide Arithmetic Logic Unit (ALU) with CARRY and ZERO indicator flags
- 64-byte internal scratchpad RAM
- 256 input and 256 output ports for easy expansion and enhancement
- Automatic 31-location CALL/RETURN stack
- Predictable performance: always two clock cycles per instruction, up to 200 MHz or 100 MIPS in a Virtex-4™ FPGA and 88 MHz or 44 MIPS in a Spartan-3 FPGA
- Fast interrupt response; worst case 5 clock cycles
- Assembler and instruction-set simulator support
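The PicoBlaze performance figures above follow directly from the fixed two-cycles-per-instruction rule. The short sketch below reproduces that arithmetic; the function names are illustrative only and are not part of any Xilinx tool.

```python
def picoblaze_mips(clock_mhz: float, cycles_per_instruction: int = 2) -> float:
    """Instruction throughput in MIPS for a fixed cycles-per-instruction core."""
    return clock_mhz / cycles_per_instruction


def worst_case_interrupt_ns(clock_mhz: float, worst_case_cycles: int = 5) -> float:
    """Worst-case interrupt response time in nanoseconds."""
    return worst_case_cycles * 1000.0 / clock_mhz


print(picoblaze_mips(200))            # Virtex-4 figure from the slide
print(picoblaze_mips(88))             # Spartan-3 figure from the slide
print(worst_case_interrupt_ns(200))   # 5 cycles at 200 MHz
```

Because every instruction takes the same number of cycles, execution time for a routine can be computed exactly by counting instructions.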
PicoBlaze Microcontroller Embedded within an FPGA
Optimal Balance between Microcontroller and FPGA Solutions
MicroBlaze Hardware Options and Configurable Blocks
Hardware Functions
- Hardware Barrel Shifter
- Hardware Divider
- Machine Status Set and Clear Instructions
- Hardware Exception Support
- Processor Version Register
- Floating-Point Unit (FPU)
- Hardware Multiplier
- Hardware Debug Logic
Cache Options
- Configurable size: 2 kB to 64 kB
- Configurable micro-cache size: 64 B to 1024 B
- 4- or 8-word cache lines
Bus Infrastructure
- On-Chip Peripheral Bus (OPB) for interfacing to peripherals
- Local Memory Bus (LMB) for fast local access memory
- Fast Simplex Link (FSL) for interfacing to co-processors
MicroBlaze Architecture
- 32 general-purpose registers
- An Arithmetic Logic Unit (ALU)
- A shift unit
- Two levels of interrupt
This basic design can be configured with more advanced features, allowing the user to balance the required performance of the target application against the logic area cost of the soft processor.
MicroBlaze Block Diagram
MicroBlaze - Data Types and Endianness
- Uses big-endian, bit-reversed format to represent data
- Hardware-supported data types:
  - Word
  - Half word
  - Byte
Word Data Type
Half Word Data Type
Byte Data Type
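To make the big-endian, bit-reversed convention above concrete, the sketch below lays out a word in memory with the most significant byte first and numbers bits so that bit 0 is the most significant bit. The helper names are illustrative assumptions, not MicroBlaze APIs.

```python
import struct


def word_to_bytes_big_endian(value: int) -> bytes:
    """Pack a 32-bit word with the most significant byte at the lowest address."""
    return struct.pack(">I", value & 0xFFFFFFFF)


def get_bit(value: int, bit: int, width: int = 32) -> int:
    """Read one bit using bit-reversed numbering: bit 0 is the MSB."""
    if not 0 <= bit < width:
        raise ValueError("bit index out of range")
    return (value >> (width - 1 - bit)) & 1


word = 0x12345678
print(word_to_bytes_big_endian(word).hex())  # byte order as stored in memory
print(get_bit(0x80000000, 0))                # bit 0 selects the MSB
print(get_bit(0x00000001, 31))               # bit 31 selects the LSB
```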
MicroBlaze - Instructions
Instructions are 32 bits wide and are defined as either Type A or Type B.
Type A
- Up to two source register operands and one destination register operand
Type B
- One source register and a 16-bit immediate operand; the immediate can be extended to 32 bits by preceding the Type B instruction with an IMM instruction
- A single destination register operand
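The IMM mechanism described above can be sketched as follows. This is a minimal model, assuming that a Type B immediate is sign-extended when no IMM prefix is present and that the IMM instruction supplies the upper 16 bits otherwise; the function names are hypothetical.

```python
def sign_extend_16(imm16: int) -> int:
    """Sign-extend a 16-bit immediate to a 32-bit value."""
    imm16 &= 0xFFFF
    if imm16 & 0x8000:
        return imm16 | 0xFFFF0000
    return imm16


def type_b_operand(imm16: int, imm_prefix=None) -> int:
    """Effective 32-bit operand of a Type B instruction.

    imm_prefix is the 16-bit value of a preceding IMM instruction, or None
    when the Type B instruction is not preceded by IMM.
    """
    if imm_prefix is None:
        return sign_extend_16(imm16)
    return ((imm_prefix & 0xFFFF) << 16) | (imm16 & 0xFFFF)


print(hex(type_b_operand(0x8000)))          # sign-extended immediate
print(hex(type_b_operand(0x5678, 0x1234)))  # IMM supplies the upper half
```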
add - Arithmetic Add (MicroBlaze Instruction)
  add   rD, rA, rB    Add
  addc  rD, rA, rB    Add with Carry
  addk  rD, rA, rB    Add and Keep Carry
  addkc rD, rA, rB    Add with Carry and Keep Carry
Pseudocode:
  if C = 0 then
    (rD) ← (rA) + (rB)
  else
    (rD) ← (rA) + (rB) + MSR[C]
  if K = 0 then
    MSR[C] ← CarryOut
Registers Altered: rD, MSR[C]
Latency: 1 cycle
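The four add variants differ only in the C bit (use the carry as input) and the K bit (keep the old carry). The sketch below models that pseudocode on plain integers; `add_variant` and its parameters are illustrative names, not part of any toolchain.

```python
MASK32 = 0xFFFFFFFF


def add_variant(ra: int, rb: int, carry_in: int,
                use_carry: bool = False, keep_carry: bool = False):
    """Return (rD, MSR[C]) for add, addc, addk, or addkc.

    use_carry models the C bit; keep_carry models the K bit.
    """
    total = (ra & MASK32) + (rb & MASK32) + (carry_in if use_carry else 0)
    rd = total & MASK32
    carry_out = total >> 32
    # With K set, MSR[C] is left unchanged; otherwise it takes the carry out.
    msr_c = carry_in if keep_carry else carry_out
    return rd, msr_c


print(add_variant(0xFFFFFFFF, 1, 0))                   # add: result wraps, carry set
print(add_variant(0xFFFFFFFF, 1, 0, keep_carry=True))  # addk: carry left unchanged
```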
bne - Branch if Not Equal (MicroBlaze Instruction)
  bne   rA, rB    Branch if Not Equal
  bned  rA, rB    Branch if Not Equal with Delay
Pseudocode:
  if rA ≠ 0 then
    PC ← PC + rB
  else
    PC ← PC + 4
  if D = 1 then
    allow following instruction to complete execution
Registers Altered: PC
Latency:
- 1 cycle (if branch is not taken)
- 2 cycles (if branch is taken and the D bit is set)
- 3 cycles (if branch is taken and the D bit is not set)
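The branch pseudocode and latency rules above can be combined into one small model. This is a sketch only: it computes the next PC and the cycle count listed on the slide, and it does not model execution of the delay-slot instruction itself. The name `bne` is reused here purely for readability.

```python
MASK32 = 0xFFFFFFFF


def bne(pc: int, ra: int, rb: int, delay: bool = False):
    """Return (next_pc, latency_cycles) for bne (delay=False) or bned (delay=True)."""
    if (ra & MASK32) != 0:
        # Taken: rB holds the offset; masking gives two's-complement wraparound.
        next_pc = (pc + rb) & MASK32
        cycles = 2 if delay else 3
    else:
        next_pc = (pc + 4) & MASK32
        cycles = 1
    return next_pc, cycles


print(bne(0x100, 1, 0x20))                    # taken, no delay slot
print(bne(0x100, 0, 0x20))                    # not taken
print(bne(0x100, 5, 0xFFFFFFF0, delay=True))  # taken backwards, with delay slot
```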
MicroBlaze - Registers
General Purpose Registers
- 32-bit general purpose registers, numbered R0 through R31
Special Purpose Registers
- Program Counter (PC)
- Machine Status Register (MSR)
- Exception Address Register (EAR)
- Exception Status Register (ESR)
- Branch Target Register (BTR)
- Floating Point Status Register (FSR)
- Processor Version Register (PVR)
MicroBlaze Pipeline Architecture
Three Stage Pipeline
When area optimization is enabled, the pipeline is divided into three stages to minimize hardware cost: Fetch, Decode, and Execute.
Five Stage Pipeline
When area optimization is disabled, the pipeline is divided into five stages to maximize performance: Fetch (IF), Decode (OF), Execute (EX), Access Memory (MEM), and Writeback (WB).
Branches
The instructions in the fetch and decode stages (as well as the prefetch buffer) are flushed when a taken branch executes. The fetch pipeline stage is then reloaded with a new instruction from the calculated branch address. A taken branch in MicroBlaze takes three clock cycles to execute, two of which are required for refilling the pipeline. To reduce this latency overhead, MicroBlaze supports branches with delay slots.
MicroBlaze - Reset, Interrupts, Exceptions, and Break
Reset
Equivalent Pseudocode
  PC ← 0x00000000
  MSR ← C_RESET_MSR
  EAR ← 0
  ESR ← 0
  FSR ← 0
Hardware Exceptions
Equivalent Pseudocode
  ESR[DS] ← exception in delay slot
  if ESR[DS] then
    BTR ← branch target PC
    r17 ← invalid value
  else
    r17 ← PC + 4
  PC ← 0x00000020
  MSR[EE] ← 0
  MSR[EIP] ← 1
  ESR[EC] ← exception specific value
  ESR[ESS] ← exception specific value
  EAR ← exception specific value
  FSR ← exception specific value
Breaks
There are two kinds of breaks:
1. Hardware (external) breaks
2. Software (internal) breaks
Equivalent Pseudocode
  r16 ← PC
  PC ← 0x00000018
  MSR[BIP] ← 1
Interrupt
Equivalent Pseudocode
  r14 ← PC
  PC ← 0x00000010
  MSR[IE] ← 0
User Vector (Exception)
Pseudocode
  rx ← PC
  PC ← 0x00000008
Interrupt and Exception Handling
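The entry sequences in the MicroBlaze reset, break, interrupt, and exception pseudocode can be exercised with a toy state model. The sketch below is an illustration only: the `Cpu` class and function names are hypothetical, only a few MSR bits are tracked, and the delay-slot case of a hardware exception is deliberately not modeled.

```python
from dataclasses import dataclass, field

# Vector addresses taken from the pseudocode on the slides.
VECTORS = {
    "reset": 0x00000000,
    "user_vector": 0x00000008,
    "interrupt": 0x00000010,
    "break": 0x00000018,
    "hw_exception": 0x00000020,
}


@dataclass
class Cpu:
    pc: int = 0
    regs: list = field(default_factory=lambda: [0] * 32)
    msr: dict = field(default_factory=lambda: {"IE": 1, "BIP": 0, "EE": 1, "EIP": 0})


def take_interrupt(cpu: Cpu) -> None:
    cpu.regs[14] = cpu.pc            # r14 <- PC
    cpu.pc = VECTORS["interrupt"]
    cpu.msr["IE"] = 0                # further interrupts disabled


def take_break(cpu: Cpu) -> None:
    cpu.regs[16] = cpu.pc            # r16 <- PC
    cpu.pc = VECTORS["break"]
    cpu.msr["BIP"] = 1


def take_hw_exception(cpu: Cpu) -> None:
    # Assumes the faulting instruction is NOT in a delay slot.
    cpu.regs[17] = cpu.pc + 4        # r17 <- PC + 4
    cpu.pc = VECTORS["hw_exception"]
    cpu.msr["EE"] = 0
    cpu.msr["EIP"] = 1


cpu = Cpu(pc=0x1000)
take_interrupt(cpu)
print(hex(cpu.pc), hex(cpu.regs[14]), cpu.msr["IE"])
```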
MicroBlaze - Instruction Cache
Optional instruction cache for improved performance when executing code that resides outside the LMB address range. The instruction cache has the following features:
- Direct mapped (1-way associative)
- User-selectable cacheable memory address range
- Configurable cache and tag size
- Caching over CacheLink (XCL) interface
- Option to use 4- or 8-word cache lines
- Cache on and off controlled using a bit in the MSR
- Optional WIC instruction to invalidate instruction cache lines
MicroBlaze - Data Cache
Optional data cache for improved performance. The cached memory range must not include addresses in the LMB address range. The data cache has the following features:
- Direct mapped (1-way associative)
- Write-through
- User-selectable cacheable memory address range
- Configurable cache size and tag size
- Caching over CacheLink (XCL) interface
- Option to use 4- or 8-word cache lines
- Cache on and off controlled using a bit in the MSR
- Optional WDC instruction to invalidate data cache lines
PicoBlaze™ 8-bit Embedded Microcontroller's Block Diagram
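Looking back at the direct-mapped MicroBlaze instruction and data caches, the sketch below shows how a generic direct-mapped cache splits an address into tag, line index, and byte offset. It is a textbook model under stated assumptions: the actual MicroBlaze tag width depends on the configured cacheable address range, which is ignored here, and `split_address` is a hypothetical helper.

```python
def split_address(addr: int, cache_bytes: int = 8192, words_per_line: int = 4):
    """Return (tag, line_index, byte_offset) for a direct-mapped cache.

    cache_bytes: total cache size (the slides allow 2 kB to 64 kB).
    words_per_line: 4 or 8 32-bit words per cache line.
    """
    line_bytes = words_per_line * 4
    num_lines = cache_bytes // line_bytes
    byte_offset = addr % line_bytes
    line_index = (addr // line_bytes) % num_lines
    tag = addr // cache_bytes
    return tag, line_index, byte_offset


# Two addresses exactly one cache size apart map to the same line
# with different tags, so they evict each other in a direct-mapped cache.
print(split_address(0x80000010))
print(split_address(0x80002010))
```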
PicoBlaze - Instruction Set
Processing Data
- All data processing instructions operate on any of the 16 general-purpose registers.
- The data processing instructions consist of the following types:
  - Logic instructions
  - Arithmetic instructions
  - Test and Compare instructions
  - Shift and Rotate instructions
Logic Instructions
- Bitwise logical AND, OR, or XOR between two operands:
  - The first operand is a register location.
  - The second operand is either a register location or a literal constant.
- Besides performing pure AND, OR, and XOR operations, the logic instructions provide a means to:
  - complement or invert a register
  - clear a register
  - set or clear specific bits within a register
PicoBlaze Bitwise AND, OR, XOR
Bitwise AND, OR, XOR
- All logic instructions are bitwise operations.
- The OR and XOR instructions are similar to the AND instruction illustrated in the figure above, except that they perform an OR or XOR logical operation, respectively.
PicoBlaze Complement/Invert Register
- The PicoBlaze microcontroller does not have a specific instruction to invert all the bits of register sX. However, the XOR sX,FF instruction performs the equivalent operation, as shown in the figure below.
Complementing a Register Value
- Nor does it have a specific instruction to invert or toggle an individual bit or bits within a register. However, the XOR instruction with a suitable mask performs the equivalent operation.
Inverting an Individual Bit Location
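The register tricks described above (complement, toggle, set, clear) can be checked on 8-bit values. The sketch below models the PicoBlaze logic instructions as plain functions on bytes; the function names are illustrative, and only the result value is modeled, not the flags.

```python
BYTE = 0xFF


def op_and(sx: int, operand: int) -> int:
    return sx & operand & BYTE


def op_or(sx: int, operand: int) -> int:
    return (sx | operand) & BYTE


def op_xor(sx: int, operand: int) -> int:
    return (sx ^ operand) & BYTE


s0 = 0b10101010
print(f"{op_xor(s0, 0xFF):08b}")        # XOR sX,FF complements the whole register
print(f"{op_xor(s0, 0b00000001):08b}")  # XOR with a one-bit mask toggles that bit
print(f"{op_or(s0, 0b00000101):08b}")   # OR sets the masked bits
print(f"{op_and(s0, 0b11110000):08b}")  # AND clears the bits that are 0 in the mask
print(f"{op_xor(s0, s0):08b}")          # XOR sX,sX clears the register
```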
Third Party Real Time Operating Systems (RTOS) Support
Although the MicroBlaze lacks a Memory Management Unit (MMU) and thus cannot run full Linux, several operating systems have been ported to it, including µClinux and FreeRTOS.
ARM Comparison
- Xilinx's MicroBlaze soft core is similar to Altera's Nios II, sharing many of the same advantages and disadvantages.
- Nios II is designed specifically for FPGA integration:
  - It runs at 140–180 MHz in a fast Stratix or Stratix II device.
  - It is a 32-bit RISC processor with a deeper pipeline than the ARM7TDMI.
- It has additional advantages:
  - Dynamic or static branch prediction
  - Configurable instruction and data caches
  - An extendable instruction set
- Nios II has two disadvantages:
  - All instructions are 32 bits long, whereas the ARM7TDMI has 16-bit Thumb instructions for greater code density.
  - It is a proprietary Altera architecture, not the industry-standard ARM architecture.