CoreConnect™ - A Simplistic Overview

PowerPC 405

• 5-stage data path pipeline
• 16KB D and I Caches
• Embedded Memory Management Unit
• Execution Unit
  – Multiply / divide unit
  – 32 x 32-bit GPR
• Dedicated on-chip memory interfaces
• Timers: PIT, FIT, Watchdog
• Debug and trace support
• Performance:
  – 450 DMIPS at 300 MHz
  – 0.9mW/MHz Typical Power

[Block diagram labels: I-Side On-Chip Memory (OCM); I-Cache (16KB); Fetch & Decode; MMU (64 Entry TLB); Timers and Debug Logic; D-Cache (16KB); Execution Unit (32x32 GPR, ALU, MAC); JTAG; Instruction Trace; Processor Local Bus (PLB); D-Side On-Chip Memory (OCM)]
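The performance figures above can be turned into per-MHz and per-milliwatt numbers with a few lines of arithmetic. The sketch below uses only the values quoted on the slide; the function names are illustrative and not part of any Xilinx or IBM tool.

```python
# Figures quoted on the slide.
DMIPS = 450.0             # Dhrystone MIPS at the quoted clock
CLOCK_MHZ = 300.0         # quoted clock frequency
POWER_MW_PER_MHZ = 0.9    # quoted typical power per MHz


def dmips_per_mhz(dmips: float, clock_mhz: float) -> float:
    """Architectural efficiency: DMIPS delivered per MHz of clock."""
    return dmips / clock_mhz


def typical_power_mw(mw_per_mhz: float, clock_mhz: float) -> float:
    """Typical power at a given clock, assuming power scales linearly with frequency."""
    return mw_per_mhz * clock_mhz


def dmips_per_mw(dmips: float, mw_per_mhz: float, clock_mhz: float) -> float:
    """Performance per milliwatt at the quoted operating point."""
    return dmips / typical_power_mw(mw_per_mhz, clock_mhz)


if __name__ == "__main__":
    print(f"{dmips_per_mhz(DMIPS, CLOCK_MHZ):.2f} DMIPS/MHz")
    print(f"{typical_power_mw(POWER_MW_PER_MHZ, CLOCK_MHZ):.0f} mW typical")
    print(f"{dmips_per_mw(DMIPS, POWER_MW_PER_MHZ, CLOCK_MHZ):.2f} DMIPS/mW")
```

The linear power scaling is an assumption implied by the mW/MHz unit; real power also depends on voltage, process, and workload.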

2 CoreConnect™ Bus Architecture

Device Control Register Bus (DCR)
• 32-bit bus for initiating peripherals
• Saves PLB and OPB bandwidth

Processor Local Bus (PLB)
• 32-bit address, 64-bit data
• Primary high-bandwidth bus interfacing “directly” with the processor

On-chip Peripheral Bus (OPB)
• 32-bit address, 32-bit data
• Lower bandwidth bus interfacing to system peripherals

CoreConnect Foundation for PowerPC and MicroBlaze

[Diagram labels: DCR; PPC 405; PLB; PLB Arbiter; Memory Controller; High-Speed Peripheral; PLB-OPB Bridge; OPB; OPB Arbiter; Low-Speed Peripheral; I/O Interfaces; Hard-IP; Soft-IP]

3 CoreConnect™ Example

• Full System Customization
• High Performance

[Diagram labels: Dedicated Hard IP; Flexible Soft IP; PowerPC 405 Core (Instruction, Data); DCR Bus; Processor Local Bus; PLB Arbiter; Bus Bridge; On-Chip Peripheral Bus; OPB Arbiter; Hi-Speed Peripheral (x2); Memory Controller; On-Chip Peripheral (x3); Off-Chip Memory (ZBT SSRAM, DDR SDRAM, SDRAM)]

4 Processor Use Models

Buried Processor
• Processor runs from BRAM only
• No external pins, no RTOS, no peripherals
• Typical use: packet processing, control functions

Embedded Computing
• Processor runs from large external memory
• CoreConnect bus structure, peripherals
• Typical use: running embedded software applications on RTOS

[Diagram labels, Buried Processor: PowerPC 405; BRAM (x2); Data Path; Data Processing Path; FPGA Fabric. Embedded Computing: Dedicated Hard IP; Flexible Soft IP; PowerPC 405 Core (Instruction, Data); DCR Bus; Processor Local Bus; PLB Arbiter; Bus Bridge; On-Chip Peripheral Bus; OPB Arbiter; Hi-Speed Peripheral (x2); Memory Controller; On-Chip Peripheral (x3); Off-Chip Memory (ZBT SSRAM, DDR SDRAM, SDRAM)]

5 System Diagram

[System diagram labels. Processor: Machine Status Reg; Register File (32 x 32bit, r0 to r31); Program Counter; Control Unit; Instruction Buffer; Instruction-side Bus Controller; Data-side Bus Controller; Instruction-side LMB; Data-side LMB; Shift / Logical; Add / Subtract; Multiply. Peripherals: CoreConnect™ OPB I/F (x2); Interrupt Controller; UART; Watchdog Timer; General Purpose I/O; Timer / Counters; Off-Chip Memory 0-4GB (x2)]

6 MicroBlaze & CoreConnect™

• MicroBlaze uses the On-Chip Peripheral Bus from IBM’s CoreConnect bus structure.
• All OPB IP is portable between processors
• Seamlessly integrate both Hard and Soft processors in Virtex-II Pro

7 Example - CoreConnect™ Based System

[System diagram labels. On-chip Peripheral Bus (OPB), 32-bit: OPB Arbiter; SRAM/ROM Controller; External Peripheral Controller; External Bus Master Controller; I2C; UART; USB; GPIO; FPU. PPC405 CPU (Inst, Data); DMA Controller; Interrupt Controller; OPB Bridge; MAL; 10/100. Device Control Register Bus. Processor Local Bus (PLB), 64-bit: PLB Arbiter; Reset, Clock Control, Power Mgmt; PC133/DDR133 SDRAM Controller; PCI-X Bridge; SRAM Controller; SRAM; Custom Logic]

8 CoreConnect™ Details

• Provides three buses for interconnecting cores, library macros, and custom logic:
  – Processor Local Bus (PLB)
  – On-Chip Peripheral Bus (OPB)
  – Device Control Register (DCR) Bus
• Shares many similarities with AMBA 2.0
• IBM offers a no-fee, royalty-free CoreConnect architectural license
  – Licensees receive the PLB arbiter, OPB arbiter and PLB/OPB bridge designs along with bus model toolkits and bus functional compilers for the PLB, OPB and DCR buses

9 Processor Local Bus (PLB)

• High performance, synchronous on-chip bus
  – 32-bit address, 64-bit write and 64-bit read data bus
  – Instruction Cache Unit PLB master is read only!
• Read/write transfers between master and slave devices
• Each PLB master has separate address, read data, write data, and transfer qualifiers
• PLB slaves have shared, but decoupled, address, read data, write data, transfer qualifiers, and status signals
• Access granted through a central arbitration mechanism

10 PLB Block Diagram

11 PLB Features

• Architecture supports 8 PLB masters
  – Instruction Cache and Data Cache are PLB masters
• Timing is provided by a single, shared clock source
• Overlapped read & write transfers permit 2 data transfers/clock
• Four levels of request priority for each master
• Byte enables for unaligned halfword and 3-byte transfers
• Support for 16-, 32- and 64-byte line data transfers (4-, 8- and 16-word lines)
• Variable or fixed length burst transfers supported
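As a rough illustration of what "2 transfers/clock" means for throughput, the sketch below computes the theoretical peak PLB bandwidth from the 64-bit read and write data buses. The 100 MHz clock used in the example is an assumed value chosen for illustration, not a figure from these slides, and the function names are hypothetical.

```python
PLB_DATA_WIDTH_BITS = 64  # width of each of the separate read and write data buses


def plb_peak_bytes_per_clock(overlapped: bool = True) -> int:
    """Peak bytes moved per PLB clock.

    With overlapped transfers, one read and one write can complete in the
    same clock, so both 64-bit buses carry data simultaneously.
    """
    bytes_per_bus = PLB_DATA_WIDTH_BITS // 8
    return bytes_per_bus * (2 if overlapped else 1)


def plb_peak_mbytes_per_s(clock_mhz: float, overlapped: bool = True) -> float:
    """Peak bandwidth in MB/s (10**6 bytes per second) at the given clock."""
    return plb_peak_bytes_per_clock(overlapped) * clock_mhz


if __name__ == "__main__":
    assumed_clock_mhz = 100.0  # assumption for illustration only
    print(plb_peak_mbytes_per_s(assumed_clock_mhz, overlapped=False), "MB/s, one direction")
    print(plb_peak_mbytes_per_s(assumed_clock_mhz, overlapped=True), "MB/s, read + write overlapped")
```

These are upper bounds. Sustained bandwidth depends on arbitration, slave wait states, and the mix of reads and writes.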

12 PLB Transfer Protocol

Address Cycles: Request Phase, Transfer Phase, Address Acknowledge Phase
Data Cycles: Transfer Phase, Data Acknowledge Phase(s)

• Decoupled address, read data, and write data bus
  – Supports overlapped transfers
  – Address cycle overlapped with read or write data
  – Read data overlapped with write data
• Address pipelining enables next address transfer to begin before current data transfer has completed
• Address pipelining & overlapped transfers reduce bus latency
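The latency benefit of address pipelining can be seen in a toy cycle-count model. This is a deliberately simplified sketch, not a cycle-accurate PLB model: it assumes every transfer takes a fixed number of address cycles and data cycles, that the data bus handles one transfer at a time, and that the next address cycle may start as soon as the current data transfer starts. The cycle counts in the example are hypothetical.

```python
def total_cycles(n_transfers: int, addr_cycles: int, data_cycles: int,
                 pipelined: bool) -> int:
    """Total bus cycles for n back-to-back transfers in the same direction."""
    if not pipelined:
        # Each transfer finishes its data phase before the next address starts.
        return n_transfers * (addr_cycles + data_cycles)

    addr_start = 0      # cycle at which the current address phase begins
    prev_data_end = 0   # cycle at which the previous data phase finished
    for _ in range(n_transfers):
        addr_end = addr_start + addr_cycles
        # Data can start once its address is acknowledged and the bus is free.
        data_start = max(addr_end, prev_data_end)
        prev_data_end = data_start + data_cycles
        # Pipelining: the next address begins while this data is in flight.
        addr_start = data_start
    return prev_data_end


if __name__ == "__main__":
    n, a, d = 4, 2, 4  # hypothetical values
    print("non-pipelined:", total_cycles(n, a, d, pipelined=False))
    print("pipelined:    ", total_cycles(n, a, d, pipelined=True))
```

In this model, when the data phase is at least as long as the address phase, only the first address phase is exposed and all later ones are hidden behind data transfers.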

13 Overlapped PLB Transfers

14 On-Chip Peripheral Bus (OPB)

• Architected to alleviate system performance bottlenecks by reducing capacitive loading on the PLB
  – Fully synchronous
  – 32-bit address bus, 32-bit data bus
  – Supports single-cycle data transfers between master and slaves
  – Supports multiple masters, determined by arbitration implementation
  – Bridge function can be master on PLB or OPB
  – No tri-state drivers required

15 Device Control Register (DCR) Bus

16 Device Control Register (DCR) Bus

• PPC405 is the only master for this 32-bit bus
• DCR address is 10 bits (1024 maximum registers)
• All other attached devices are slaves
• Used for on-chip device configuration purposes
• Doesn’t slow down high performance PLB bus
• Two instructions used to access registers
  – Move to DCR “mtdcr”
  – Move from DCR “mfdcr”
  – Privileged mode access only!
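The rules on this slide (10-bit address, 32-bit data, privileged access only) can be captured in a small software model. The sketch below is a behavioral illustration only: `mtdcr` and `mfdcr` are the instruction names from the slide, but the `DcrBusModel` class, the `PrivilegeError` exception, and the choice to read unwritten registers as zero are assumptions made for the example.

```python
DCR_ADDR_BITS = 10
DCR_COUNT = 1 << DCR_ADDR_BITS   # 1024 addressable registers
DCR_DATA_MASK = 0xFFFFFFFF       # 32-bit data bus


class PrivilegeError(Exception):
    """Raised when a DCR access is attempted outside privileged mode."""


class DcrBusModel:
    """Toy model of the DCR address space with the CPU as the only master."""

    def __init__(self) -> None:
        self._regs: dict[int, int] = {}

    def _check(self, dcrn: int, privileged: bool) -> None:
        if not privileged:
            raise PrivilegeError("mtdcr/mfdcr are privileged instructions")
        if not 0 <= dcrn < DCR_COUNT:
            raise ValueError(f"DCR number {dcrn} does not fit in {DCR_ADDR_BITS} bits")

    def mtdcr(self, dcrn: int, value: int, privileged: bool = True) -> None:
        """Move to DCR: write a 32-bit value from a GPR to a slave register."""
        self._check(dcrn, privileged)
        self._regs[dcrn] = value & DCR_DATA_MASK

    def mfdcr(self, dcrn: int, privileged: bool = True) -> int:
        """Move from DCR: read a slave register (unwritten registers read 0 here)."""
        self._check(dcrn, privileged)
        return self._regs.get(dcrn, 0)


if __name__ == "__main__":
    bus = DcrBusModel()
    bus.mtdcr(0x3FF, 0xDEADBEEF)   # highest valid DCR number
    print(hex(bus.mfdcr(0x3FF)))
```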

17 Device Control Register (DCR) Bus

• Transfers data between CPU general purpose registers and DCR slave registers
• 3 cycle minimum read or write transfer, extendable by slave or master
• DCR logic must return DCRCPUACK within 64 clock cycles or processor times out. No error! CPU executes next instruction!
• Slaves may be clocked slower/faster than master
• IBM DCR Register Bus Architecture Specification, Version 2.8 available on the web
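Because a DCR timeout raises no error, software cannot tell a slow or missing slave from a successful access unless it checks some other way. The sketch below models only the timing rule from the slide: a 3-cycle minimum and a 64-cycle acknowledge window. The function name and return shape are hypothetical, and treating an acknowledge on exactly cycle 64 as in time is an assumption about how "within 64 clock cycles" is counted.

```python
from typing import Optional, Tuple

DCR_MIN_CYCLES = 3       # minimum read or write transfer length
DCR_TIMEOUT_CYCLES = 64  # acknowledge window before the processor gives up


def dcr_access(slave_ack_cycle: Optional[int]) -> Tuple[int, bool]:
    """Return (cycles spent, acknowledged) for one DCR transfer.

    slave_ack_cycle is the cycle on which the slave would assert DCRCPUACK,
    or None if no slave responds at that address.
    """
    if slave_ack_cycle is None or slave_ack_cycle > DCR_TIMEOUT_CYCLES:
        # Timeout: no exception is raised; the CPU simply moves on.
        return DCR_TIMEOUT_CYCLES, False
    return max(DCR_MIN_CYCLES, slave_ack_cycle), True


if __name__ == "__main__":
    for ack in (1, 10, 64, 65, None):
        print(ack, dcr_access(ack))
```

A practical consequence is that driver code touching an unpopulated DCR address stalls for the full timeout and then continues silently, so it is worth validating the DCR map before relying on values read back.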

18 Example DCR Implementation

19 CoreConnect™ (Features)

• PLB Arbiter
  – Arbitration for up to 8 PLB master devices on PLB bus; includes watchdog timer and separate address, read, and write data paths
  – Supports address pipelining
• PLB to OPB Bridge
  – PLB slave and OPB master device
  – Supports dynamic bus sizing for OPB connection
  – Supports burst reads and writes
  – Compliant with various burst sizes
  – Supports 4-, 8-, and 16-word line transfers
  – Supports DMA transfers to/from OPB master peripherals

20 CoreConnect™ (Features)

• OPB to PLB Bridge
  – PLB master and OPB slave device
  – 64-bit PLB master interface supports doubleword (64-bit) reads, and doubleword, word, halfword, and byte writes
  – Data packing on writes, up to 4 doublewords
  – Fixed length burst (4 doubleword) prefetching for reads
  – 50+ MHz OPB clock frequency
  – Support for PLB at 1, 2, 3, or 4 times the frequency of the OPB
• OPB Arbiter
  – Arbitration for up to 4 OPB master peripherals on OPB bus

21 Over 40 Processor IP Modules

IP Function                   Class
PPC405 Boot                   SW Only
Memory Tests                  SW Only
BRAM                          SW Only
SRAM                          SW Only
ZBT                           SW Only
DDR                           SW Only
VxWorks Support               SW Only
VxWorks BSP                   SW Only
Chip Support Package          SW Only
PLB Arbiter                   Infrastructure (HW&SW)
PLB<>OPB Bridge               Infrastructure (HW&SW)
OPB Arbiter                   Infrastructure (HW&SW)
OPB Bus Structure             Infrastructure (HW&SW)
System Reset                  Infrastructure (HW&SW)
IPIF Slave Attachment         IPIF Module (HW&SW)
IPIF Master Attachment        IPIF Module (HW&SW)
IPIF Address Decode           IPIF Module (HW&SW)
IPIF Interrupt Control        IPIF Module (HW&SW)
IPIF Read Packet FIFO         IPIF Module (HW&SW)
IPIF Write Packet FIFO        IPIF Module (HW&SW)
IPIF DMA                      IPIF Module (HW&SW)
IPIF Scatter/Gather           IPIF Module (HW&SW)
PLB BRAM                      Memory Controller (HW&SW)
PLB SRAM                      Memory Controller (HW&SW)
PLB DDR                       Memory Controller (HW&SW)
PLB ZBT                       Memory Controller (HW&SW)
PLB Flash                     Memory Controller (HW&SW)
OPB BRAM                      Memory Controller (HW&SW)
OPB SRAM                      Memory Controller (HW&SW)
OPB ZBT                       Memory Controller (HW&SW)
OPB Flash                     Memory Controller (HW&SW)
Interrupt Controller          Peripheral IP Core (HW&SW)
UART Lite                     Peripheral IP Core (HW&SW)
UART 16450                    Peripheral IP Core (HW&SW)
UART 16550                    Peripheral IP Core (HW&SW)
IIC Master & Slave            Peripheral IP Core (HW&SW)
SPI Master & Slave            Peripheral IP Core (HW&SW)
Ethernet 10/100M              Peripheral IP Core (HW&SW)
ATM Utopia Level 2 Slave      Peripheral IP Core (HW&SW)
OPB Timer/Counter             Peripheral IP Core (HW&SW)
IPB Timebase/WDT              Peripheral IP Core (HW&SW)
OPB GPIO                      Peripheral IP Core (HW&SW)

22 CoreConnect PLB/OPB IP Blocks

IP Module                 Slices (Approx.)   LUTs   DFFs   BRAMs
PLB Arbiter                            735   1237    234       0
PLB Bus Logic                           47     95      0       0
PLB DDR SDRAM                          379    279    479       0
PLB ZBT SRAM                           249     97    401       0
PLB SRAM/FLASH                         298    242    354       0
PLB BRAM                               266    149    183      16
PLB to OPB Bridge                      686    763    609       0
OPB Arbiter                             76    120     33       0
OPB Bus Logic                           54    108      0       0
OPB IIC                                230    279    182       0
OPB 16450 UART                         373    425    322       0
OPB 16550 UART                         548    678    418       0
OPB GPIO                               144     87    201       0
OPB CROM                                80     78     83       1
OPB LCD Controller                      71     66     76       0
OPB to PCI Bridge                      974    802   1147       0
OPB to DCR Bridge                      104     73    135       0
OPB to PLB Bridge                      161     99    223       0
INTC - Critical                        252    312    193       0
INTC - Non Critical                    252    312    193       0
Packet Processor (OCM)                 628    783    474       7
Packet Processor (PLB)                 628    783    474       1
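One way to use the table above is to add up the rows for a candidate system to get a first-order resource budget. The sketch below does this for a hypothetical system (PLB infrastructure, a PLB BRAM controller, the PLB to OPB bridge, OPB infrastructure, a 16550 UART and GPIO). The system composition is an assumption for illustration; the per-module numbers are copied from the table. Summing slice counts is only approximate, since actual packing depends on the place-and-route result.

```python
from typing import Dict, Iterable, Tuple

# (slices, LUTs, DFFs, BRAMs) per module, copied from the table above.
IP_RESOURCES: Dict[str, Tuple[int, int, int, int]] = {
    "PLB Arbiter":       (735, 1237, 234, 0),
    "PLB Bus Logic":     (47, 95, 0, 0),
    "PLB BRAM":          (266, 149, 183, 16),
    "PLB to OPB Bridge": (686, 763, 609, 0),
    "OPB Arbiter":       (76, 120, 33, 0),
    "OPB Bus Logic":     (54, 108, 0, 0),
    "OPB 16550 UART":    (548, 678, 418, 0),
    "OPB GPIO":          (144, 87, 201, 0),
}


def estimate_resources(modules: Iterable[str]) -> Dict[str, int]:
    """Sum the table entries for the named modules."""
    totals = {"slices": 0, "luts": 0, "dffs": 0, "brams": 0}
    for name in modules:
        slices, luts, dffs, brams = IP_RESOURCES[name]
        totals["slices"] += slices
        totals["luts"] += luts
        totals["dffs"] += dffs
        totals["brams"] += brams
    return totals


if __name__ == "__main__":
    example_system = list(IP_RESOURCES)  # hypothetical system using every listed block
    print(estimate_resources(example_system))
```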

23 The Benefits of Parameterization

Example: OPB Arbiter

NUM_MASTERS  PROC_INTERFACE  DYNAM_PRIORITY  PARK  REG_GRANTS  LUTs  F MAX (MHz)
1            N               N               N     N             11          295
2            N               N               N     N             18          223
4            N               N               N     N             34          193
4            Y               N               N     N             59          156
4            N               Y               N     N             54          169
4            N               N               Y     N             83          159
4            N               N               N     Y             34          201
4            Y               Y               Y     Y            146          145
8            Y               Y               Y     Y            388          112

Difference: >4x in size, >30% in speed

• Significantly increases performance or saves area
• Only include what you need
• This can only be accomplished in a programmable system
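The ">4x in size" and ">30% in speed" figures can be checked against the OPB Arbiter rows. The sketch below compares the smallest and largest, and the fastest and slowest, of the 4-master configurations. Restricting the comparison to the 4-master rows is an interpretation of the slide's annotation, and the type and function names are illustrative.

```python
from typing import List, NamedTuple


class ArbiterConfig(NamedTuple):
    num_masters: int
    proc_interface: bool
    dynam_priority: bool
    park: bool
    reg_grants: bool
    luts: int
    fmax_mhz: int


# Rows copied from the OPB Arbiter table (Y = True, N = False).
CONFIGS: List[ArbiterConfig] = [
    ArbiterConfig(1, False, False, False, False, 11, 295),
    ArbiterConfig(2, False, False, False, False, 18, 223),
    ArbiterConfig(4, False, False, False, False, 34, 193),
    ArbiterConfig(4, True, False, False, False, 59, 156),
    ArbiterConfig(4, False, True, False, False, 54, 169),
    ArbiterConfig(4, False, False, True, False, 83, 159),
    ArbiterConfig(4, False, False, False, True, 34, 201),
    ArbiterConfig(4, True, True, True, True, 146, 145),
    ArbiterConfig(8, True, True, True, True, 388, 112),
]


def size_ratio(configs: List[ArbiterConfig]) -> float:
    """Largest LUT count divided by the smallest."""
    return max(c.luts for c in configs) / min(c.luts for c in configs)


def speed_ratio(configs: List[ArbiterConfig]) -> float:
    """Highest F MAX divided by the lowest."""
    return max(c.fmax_mhz for c in configs) / min(c.fmax_mhz for c in configs)


four_master = [c for c in CONFIGS if c.num_masters == 4]

if __name__ == "__main__":
    print(f"size:  {size_ratio(four_master):.2f}x")
    print(f"speed: {(speed_ratio(four_master) - 1) * 100:.0f}% faster")
```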

24 Intellectual Property InterFace (IPIF)

• Purpose
  – Simplifies task of interfacing IP to CoreConnect™ buses
• Provides simple interfaces to common tasks
• Mix-n-match philosophy of interfaces
  – Parameterizable: pay for only the logic needed
• Five Interface Types to Cover Needs
  – Slave SRAM style interface
  – Slave Control Register style interface
  – Slave FIFO style interface
  – Slave DMA handshake style interface
  – Master Interface

25 IP Interface (IPIF)

[Diagram labels: Low Perf. IP; High Perf. IP; IPIF (x2); OPB Bus; Bridge; PLB; PLB Arbiter; CPU; MMU; I-Cache; I-Cache Controller; D-Cache; D-Cache Controller; PLB I/F (x2); ISOCM Controller; DSOCM Controller; BRAM (x2)]

The IP interface enables customers to easily integrate their own custom application specific value-added IP into CoreConnect based systems

26 The IPIF

[Diagram labels: PLB/OPB Bus; IP Interface; Master Interface; Slave SRAM I/F; Slave Control Reg. I/F; Slave DMA Handshake I/F; DMA Engine; Slave FIFO I/F; Customer IP]

The IP interface enables customers to easily integrate their own custom application specific value-added IP into CoreConnect™ based systems

27 CoreConnect IPIF Details

• Consists of 8 modules
• Automatically configures a core to the processor bus
• OPB & PLB supported
• IPIF will be added to other LogiCOREs

[Diagram labels: IP Core from Xilinx, customer or 3rd party; Interrupt Controller; Slave Attachment; Addr Decode; MUX; Write FIFO; Read FIFO; Master Attachment; DMA; Scatter Gather; On-Chip Bus; Bus Attachment Layer (BAL); Bus/Core HW Independent Layer]

28 System Generator Pro

[Design flow diagram labels: Designer has a concept; Specify system architecture; Configure peripheral; Assemble system]

29 Designer Pushes the Button

• Tool automatically connects IP cores to Processor Bus
• Automatically generates both HW & SW components
  – Netlist, implementation files, etc.
  – Drivers, etc.
• Performs rule and consistency check
  – Memory map
  – Consistent & proper interfaces

30 ChipScope Pro Debug/Verification for a New Development Paradigm

• ChipScope Pro Verification and Debug Tools
  – On-Chip CoreConnect Integrated Bus Analysis (IBA)
  – On-Chip Integrated Logic Analysis (ILA)
• Solves Tough Debug Problems
  – Full node/bus visibility
  – On-chip HW/SW co-verification
  – Debug at full system clock rate

[Diagram labels: IO Pads; Integrated Logic Analyzer (ILA) Cores; Integrated Bus Analyzer (IBA) Core; IP Cores; Custom Logic; Embedded PPC405; Memory Array; ICON; Boundary Scan TAP Controller; JTAG Target Connection]

31 Summary

• CoreConnect interconnects high-bandwidth devices such as processor cores, external memory interfaces and DMA controllers
• Provides three buses for interconnecting cores, library macros, and custom logic:
  – Processor Local Bus (PLB)
  – On-Chip Peripheral Bus (OPB)
  – Device Control Register (DCR) Bus
• IPIF
  – Extremely useful tool to interface proprietary IP blocks
  – Parameterizable: pay for only the logic needed
• ChipScope Pro provides the industry’s best node and bus visibility with full debug capability at wirespeed

32 For More Details...

http://www.xilinx.com/xlnx/xil_prodcat_product.jsp?title=coreconnect
http://www-3.ibm.com/chips/products/coreconnect/

Questions?
