CoreConnect™ - A Simplistic Overview
PowerPC 405
• 5-stage data path pipeline
• 16KB D and I Caches
• Embedded Memory Management Unit
• Execution Unit
  – Multiply / divide unit
  – 32 x 32-bit GPR
• Dedicated on-chip memory interfaces
• Timers: PIT, FIT, Watchdog
• Debug and trace support
• Performance:
  – 450 DMIPS at 300 MHz
  – 0.9mW/MHz Typical Power
[Block diagram: I-Side On-Chip Memory (OCM); I-Cache (16KB); Fetch & Decode; Timers and Debug Logic; MMU (64 Entry TLB); D-Cache (16KB); Execution Unit (32x32 GPR, ALU, MAC); JTAG; Instruction Trace; Processor Local Bus (PLB); D-Side On-Chip Memory (OCM)]
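The two performance figures above can be turned into per-MHz efficiency and an absolute power estimate. The sketch below is only arithmetic on the quoted numbers; treating the typical mW/MHz figure as scaling linearly up to 300 MHz is an assumption, not something the slide states.

```python
# Figures quoted on the slide.
DMIPS = 450
CLOCK_MHZ = 300
POWER_MW_PER_MHZ = 0.9

# Derived figures. The power estimate assumes the typical mW/MHz figure
# scales linearly with clock frequency (an assumption, not from the slide).
dmips_per_mhz = DMIPS / CLOCK_MHZ
typical_power_mw = POWER_MW_PER_MHZ * CLOCK_MHZ

print(f"{dmips_per_mhz:.2f} DMIPS/MHz")
print(f"{typical_power_mw:.0f} mW typical at {CLOCK_MHZ} MHz")
```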
CoreConnect™ Bus Architecture
Device Control Register Bus (DCR)
• 32-bit bus for initiating peripherals
• Saves PLB and OPB bandwidth
Processor Local Bus (PLB)
• 32-bit address, 64-bit data
• Primary high-bandwidth bus interfacing “directly” with the processor
On-chip Peripheral Bus (OPB)
• 32-bit address, 32-bit data
• Lower bandwidth bus interfacing to system peripherals
[Block diagram: PPC 405; DCR; PLB; PLB Arbiter; Memory Controller; High-Speed Peripheral; PLB-OPB Bridge; OPB; OPB Arbiter; Low-Speed Peripheral; I/O Interfaces (Hard-IP); I/O Interfaces (Soft-IP)]
CoreConnect Foundation for PowerPC and MicroBlaze
CoreConnect™ Example
• Full System Customization
• High Performance
[Block diagram: Dedicated Hard IP; Flexible Soft IP; PowerPC 405 Core (Instruction, Data); DCR Bus; PLB Arbiter; Processor Local Bus; Bus Bridge; On-Chip Peripheral Bus; OPB Arbiter; PLB devices, e.g. Memory Controller and Hi-Speed Peripherals; On-Chip Peripherals on the OPB; Off-Chip Memory: ZBT SSRAM, DDR SDRAM, SDRAM]
Processor Use Models
Buried Processor
• Processor runs from BRAM only
• No external pins, no RTOS, no peripherals
• Typical use: packet processing, control functions
Embedded Computing
• Processor runs from large external memory
• CoreConnect bus structure, peripherals
• Typical use: running embedded software applications on RTOS
[Diagrams: Buried Processor: PowerPC 405 with BRAM on the data path and control path, inside the FPGA fabric. Embedded Computing: Dedicated Hard IP and Flexible Soft IP; PowerPC 405 Core (Instruction, Data); DCR Bus; PLB Arbiter; Processor Local Bus; Bus Bridge; On-Chip Peripheral Bus; OPB Arbiter; Memory Controller; Hi-Speed Peripherals; On-Chip Peripherals; Off-Chip Memory: ZBT SSRAM, DDR SDRAM, SDRAM]
System Diagram
[Block diagram: Processor: Machine Status Reg; Program Counter; Control Unit; Instruction Buffer; Register File 32 x 32bit (r0, r1 ... r31); Shift / Logical; Add / Subtract; Multiply; Instruction-side Bus Controller (LMB); Data Side Bus Controller (LMB). Outside the processor: CoreConnect™ OPB I/F (two); Interrupt Controller; UART; Watchdog Timer; General Purpose I/O; Timer / Counters; Off-Chip Memory 0-4GB (two); Peripherals]
MicroBlaze & CoreConnect™
• MicroBlaze uses the On-Chip Peripheral Bus from IBM’s CoreConnect bus structure
• All OPB IP is portable between processors
• Seamlessly integrate both Hard and Soft processors in Virtex-II Pro
Example - CoreConnect™ Based System
[Block diagram: On-chip Peripheral Bus (OPB) 32-bit: SRAM/ROM Peripheral Controller; External Bus Master Controller; I2C; UART; USB; GPIO; OPB Arbiter. Processor Local Bus (PLB) 64-bit: PPC405 CPU (Inst, Data); FPU; DMA Controller; Interrupt Controller; OPB Bridge; MAL; 10/100 Ethernet; PLB Arbiter; PC133/DDR133 SDRAM Controller; PCI-X Bridge; SRAM Controller with SRAM; Custom Logic. Also shown: Device Control Register Bus; Reset, Clock Control, Power Mgmt]
CoreConnect™ Details
• Provides three buses for interconnecting cores, library macros, and custom logic:
  – Processor Local Bus (PLB)
  – On-Chip Peripheral Bus (OPB)
  – Device Control Register (DCR) Bus
• Shares many similarities with AMBA 2.0
• IBM offers a no-fee, royalty-free CoreConnect architectural license
  – Licensees receive the PLB arbiter, OPB arbiter and PLB/OPB bridge designs along with bus model toolkits and bus functional compilers for the PLB, OPB and DCR buses
Processor Local Bus (PLB)
• High performance, synchronous on-chip bus
  – 32-bit address, 64-bit write and 64-bit read data bus
  – Instruction Cache Unit PLB master is read only!
• Read/write transfers between master and slave devices
• Each PLB master has separate address, read data, write data, and transfer qualifiers
• PLB slaves have shared, but decoupled, address, read data, write data, transfer qualifiers, and status signals
• Access granted through a central arbitration mechanism
PLB Block Diagram
PLB Features
• Architecture supports 8 PLB masters
  – Instruction Cache and Data Cache are PLB masters
• Timing is provided by a single, shared clock source
• Overlapped read and write transfers permit 2 data transfers per clock
• Four levels of request priority for each master
• Byte enables for unaligned halfword and 3-byte transfers
• Support for 16-, 32- and 64-byte line data transfers
• Variable or fixed length burst transfers supported
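The master count and priority levels listed above suggest how a central arbiter might pick a winner. The following is a minimal sketch, not the IBM arbiter's actual algorithm: the `Request` type, the priority numbering (3 highest), and the tie-break by lowest master id are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

MAX_MASTERS = 8       # from the slide: architecture supports 8 PLB masters
PRIORITY_LEVELS = 4   # from the slide: four request priority levels per master


@dataclass(frozen=True)
class Request:
    master_id: int  # 0..7
    priority: int   # 0 (lowest) .. 3 (highest); this numbering is an assumption


def arbitrate(requests: Sequence[Request]) -> Optional[int]:
    """Return the id of the master granted the bus, or None if idle.

    Assumption: among equal priorities the lowest master id wins. A real
    arbiter may use a different (for example, fair) tie-break.
    """
    for req in requests:
        if not 0 <= req.master_id < MAX_MASTERS:
            raise ValueError(f"master id {req.master_id} out of range")
        if not 0 <= req.priority < PRIORITY_LEVELS:
            raise ValueError(f"priority {req.priority} out of range")
    if not requests:
        return None
    # Highest priority first, then lowest master id.
    winner = max(requests, key=lambda r: (r.priority, -r.master_id))
    return winner.master_id


granted = arbitrate([Request(0, 1), Request(3, 3), Request(5, 3)])
print(f"granted master: {granted}")
```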
PLB Transfer Protocol
Address Cycles: Request Phase, Transfer Phase, Address Acknowledge Phase
Data Cycles: Transfer Phase, Data Acknowledge Phase(s)
• Decoupled address, read data, and write data bus
  – Supports overlapped transfers
  – Address cycle overlapped with read or write data
  – Read data overlapped with write data
• Address pipelining enables next address transfer to begin before current data transfer has completed
• Address pipelining & overlapped transfers reduce bus latency
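To see why address pipelining helps, the effect can be estimated with an idealized cycle count. This is a simplified model under stated assumptions (fixed-length address and data phases, one level of pipelining, no arbitration or wait-state overhead); the function name and parameters are hypothetical and the numbers are not taken from the PLB specification.

```python
def total_cycles(n_transfers: int, addr_cycles: int, data_cycles: int,
                 pipelined: bool) -> int:
    """Idealized bus occupancy for a run of back-to-back transfers."""
    if n_transfers < 0 or addr_cycles < 1 or data_cycles < 1:
        raise ValueError("counts must be non-negative and phases >= 1 cycle")
    if n_transfers == 0:
        return 0
    if not pipelined:
        # Each transfer finishes its data phase before the next address phase.
        return n_transfers * (addr_cycles + data_cycles)
    # Pipelined: the next address phase overlaps the current data phase, so
    # after the first transfer the bus advances by the longer of the two phases.
    step = max(addr_cycles, data_cycles)
    return addr_cycles + data_cycles + (n_transfers - 1) * step


serial = total_cycles(4, addr_cycles=2, data_cycles=4, pipelined=False)
overlapped = total_cycles(4, addr_cycles=2, data_cycles=4, pipelined=True)
print(f"serial: {serial} cycles, pipelined: {overlapped} cycles")
```

In this model the saving grows with the number of transfers, because every address phase after the first is hidden behind a data phase.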
Overlapped PLB Transfers
On-Chip Peripheral Bus (OPB)
• Architected to alleviate system performance bottlenecks by reducing capacitive loading on the PLB
  – Fully synchronous
  – 32-bit address bus, 32-bit data bus
  – Supports single-cycle data transfers between master and slaves
  – Supports multiple masters, determined by arbitration implementation
  – Bridge function can be master on PLB or OPB
  – No tri-state drivers required
Device Control Register (DCR) Bus
Device Control Register (DCR) Bus
• PPC405 is the only master for this 32-bit bus
• DCR address is 10 bits (1024 maximum registers)
• All other attached devices are slaves
• Used for on-chip device configuration purposes
• Doesn’t slow down high performance PLB bus
• Two instructions used to access registers
  – Move to DCR “mtdcr”
  – Move from DCR “mfdcr”
  – Privileged mode access only!
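The addressing and access rules above can be captured in a small software model. This is an illustrative sketch only: the `DcrBus` class is hypothetical, `PermissionError` stands in for the processor's privilege exception, and returning 0 for a register that was never written is an assumption (real slaves define their own reset values).

```python
DCR_ADDR_BITS = 10                   # from the slide: 10-bit DCR address
DCR_REGISTERS = 1 << DCR_ADDR_BITS   # 1024 registers at most
WORD_MASK = 0xFFFFFFFF               # 32-bit data bus


class DcrBus:
    """Toy model of the DCR register space seen by the single master."""

    def __init__(self) -> None:
        self._regs = {}

    def _check(self, dcrn: int, privileged: bool) -> None:
        if not privileged:
            # Stand-in for the exception a user-mode access would cause.
            raise PermissionError("mtdcr/mfdcr are privileged instructions")
        if not 0 <= dcrn < DCR_REGISTERS:
            raise ValueError(f"DCR number {dcrn} does not fit in 10 bits")

    def mtdcr(self, dcrn: int, value: int, privileged: bool = True) -> None:
        """Move to DCR: write a 32-bit value to register dcrn."""
        self._check(dcrn, privileged)
        self._regs[dcrn] = value & WORD_MASK

    def mfdcr(self, dcrn: int, privileged: bool = True) -> int:
        """Move from DCR: read register dcrn (0 if never written, assumed)."""
        self._check(dcrn, privileged)
        return self._regs.get(dcrn, 0)


bus = DcrBus()
bus.mtdcr(0x3FF, 0x123456789)        # value is truncated to 32 bits
print(hex(bus.mfdcr(0x3FF)))
```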
Device Control Register (DCR) Bus
• Transfers data between CPU general purpose registers and DCR slave registers
• 3 cycle minimum read or write transfer, extendable by slave or master
• DCR logic must return DCRCPUACK within 64 clock cycles or processor times out. No error! CPU executes next instruction!
• Slaves may be clocked slower/faster than master
• IBM DCR Register Bus Architecture Specification, Version 2.8 available on the web
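The timing rules above imply that a missing acknowledge is silent from the software's point of view. The sketch below models only the cycle counting; the function name is hypothetical, and treating an acknowledge that arrives exactly on cycle 64 as "in time" is an assumption about how the limit is counted.

```python
from typing import Optional, Tuple

DCR_MIN_CYCLES = 3       # from the slide: 3 cycle minimum transfer
DCR_TIMEOUT_CYCLES = 64  # from the slide: ack must arrive within 64 cycles


def dcr_transfer(ack_cycle: Optional[int]) -> Tuple[int, bool]:
    """Return (cycles_spent, acknowledged) for one DCR read or write.

    ack_cycle is the cycle on which the slave raises its acknowledge,
    or None if it never responds.
    """
    if ack_cycle is not None and ack_cycle < 1:
        raise ValueError("ack_cycle must be >= 1")
    if ack_cycle is not None and ack_cycle <= DCR_TIMEOUT_CYCLES:
        # A fast slave still costs the minimum transfer length.
        return max(ack_cycle, DCR_MIN_CYCLES), True
    # Timeout: no error is raised, the CPU simply moves on.
    return DCR_TIMEOUT_CYCLES, False


for ack in (1, 10, None):
    cycles, ok = dcr_transfer(ack)
    print(f"ack_cycle={ack}: {cycles} cycles, acknowledged={ok}")
```

Because the timeout raises no error, software that needs certainty would have to confirm a write some other way, for example by reading the register back.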
Example DCR Implementation
CoreConnect™ (Features)
• PLB Arbiter
  – Arbitration for up to 8 PLB master devices on PLB bus; includes watchdog timer and separate address, read, and write data paths
  – Supports address pipelining
• PLB to OPB Bridge
  – PLB slave and OPB master device
  – Supports dynamic bus sizing for OPB connection
  – Supports burst reads and writes
  – Compliant with various burst sizes
  – Supports 4-, 8-, and 16-word line transfers
  – Supports DMA transfers to/from OPB master peripherals
CoreConnect™ (Features)
• OPB to PLB Bridge
  – PLB master and OPB slave device
  – 64-bit PLB master interface supports doubleword (64-bit) reads, and doubleword, word, halfword, and byte writes
  – Data packing on writes, up to 4 doublewords
  – Fixed length burst (4 doubleword) prefetching for reads
  – 50+ MHz OPB clock frequency
  – Support for PLB at 1, 2, 3, or 4 times the frequency of the OPB
• OPB Arbiter
  – Arbitration for up to 4 OPB master peripherals on OPB bus
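The data packing feature of the OPB to PLB bridge can be pictured as collecting 32-bit OPB words into 64-bit PLB doublewords and grouping them into bursts of at most four. This is a sketch of the idea only, not the bridge's documented behaviour: the function name is hypothetical, the big-endian ordering (first word in the upper half) is an assumption, and odd word counts are simply rejected.

```python
from typing import List, Sequence

WORD_MASK = 0xFFFFFFFF
MAX_DOUBLEWORDS_PER_BURST = 4   # from the slide: packing up to 4 doublewords


def pack_words(words: Sequence[int]) -> List[List[int]]:
    """Pack 32-bit words into bursts of up to four 64-bit doublewords."""
    if len(words) % 2:
        raise ValueError("this sketch only handles whole doublewords")
    for w in words:
        if not 0 <= w <= WORD_MASK:
            raise ValueError(f"{w:#x} is not a 32-bit word")
    # Assumed big-endian packing: the first word lands in bits 63..32.
    doublewords = [(words[i] << 32) | words[i + 1]
                   for i in range(0, len(words), 2)]
    return [doublewords[i:i + MAX_DOUBLEWORDS_PER_BURST]
            for i in range(0, len(doublewords), MAX_DOUBLEWORDS_PER_BURST)]


bursts = pack_words(list(range(10)))   # ten OPB word writes
print([len(b) for b in bursts])        # doublewords per PLB burst
```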
Over 40 Processor IP Modules
IP Function                 Class
PPC405 Boot                 SW Only
Memory Tests                SW Only
BRAM                        SW Only
SRAM                        SW Only
ZBT                         SW Only
DDR                         SW Only
VxWorks Support             SW Only
VxWorks BSP                 SW Only
Chip Support Package        SW Only
PLB Arbiter                 Infrastructure (HW&SW)
PLB<>OPB Bridge             Infrastructure (HW&SW)
OPB Arbiter                 Infrastructure (HW&SW)
OPB Bus Structure           Infrastructure (HW&SW)
System Reset                Infrastructure (HW&SW)
IPIF Slave Attachment       IPIF Module (HW&SW)
IPIF Master Attachment      IPIF Module (HW&SW)
IPIF Address Decode         IPIF Module (HW&SW)
IPIF Interrupt Control      IPIF Module (HW&SW)
IPIF Read Packet FIFO       IPIF Module (HW&SW)
IPIF Write Packet FIFO      IPIF Module (HW&SW)
IPIF DMA                    IPIF Module (HW&SW)
IPIF Scatter/Gather         IPIF Module (HW&SW)
PLB BRAM                    Memory Controller (HW&SW)
PLB SRAM                    Memory Controller (HW&SW)
PLB DDR                     Memory Controller (HW&SW)
PLB ZBT                     Memory Controller (HW&SW)
PLB Flash                   Memory Controller (HW&SW)
OPB BRAM                    Memory Controller (HW&SW)
OPB SRAM                    Memory Controller (HW&SW)
OPB ZBT                     Memory Controller (HW&SW)
OPB Flash                   Memory Controller (HW&SW)
Interrupt Controller        Peripheral IP Core (HW&SW)
UART Lite                   Peripheral IP Core (HW&SW)
UART 16450                  Peripheral IP Core (HW&SW)
UART 16550                  Peripheral IP Core (HW&SW)
IIC Master & Slave          Peripheral IP Core (HW&SW)
SPI Master & Slave          Peripheral IP Core (HW&SW)
Ethernet 10/100M            Peripheral IP Core (HW&SW)
ATM Utopia Level 2 Slave    Peripheral IP Core (HW&SW)
OPB Timer/Counter           Peripheral IP Core (HW&SW)
OPB Timebase/WDT            Peripheral IP Core (HW&SW)
OPB GPIO                    Peripheral IP Core (HW&SW)
CoreConnect PLB/OPB IP Blocks
IP Module                 Slices (Approx.)   LUTs   DFFs   BRAMs
PLB Arbiter               735                1237   234    0
PLB Bus Logic             47                 95     0      0
PLB DDR SDRAM             379                279    479    0
PLB ZBT SRAM              249                97     401    0
PLB SRAM/FLASH            298                242    354    0
PLB BRAM                  266                149    183    16
PLB to OPB Bridge         686                763    609    0
OPB Arbiter               76                 120    33     0
OPB Bus Logic             54                 108    0      0
OPB IIC                   230                279    182    0
OPB 16450 UART            373                425    322    0
OPB 16550 UART            548                678    418    0
OPB GPIO                  144                87     201    0
OPB CROM                  80                 78     83     1
OPB LCD Controller        71                 66     76     0
OPB to PCI Bridge         974                802    1147   0
OPB to DCR Bridge         104                73     135    0
OPB to PLB Bridge         161                99     223    0
INTC - Critical           252                312    193    0
INTC - Non Critical       252                312    193    0
Packet Processor (OCM)    628                783    474    7
Packet Processor (PLB)    628                783    474    1
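The table can be used to budget the fabric cost of a candidate system by summing the rows for the cores it would include. The sketch below does this for one hypothetical selection of cores (the selection is an assumption; the per-core numbers are copied from the table). Summing approximate slice counts ignores any packing or optimization the tools perform, so the result is a rough estimate at best.

```python
# (slices, BRAMs) per core, copied from the table above.
RESOURCES = {
    "PLB Arbiter": (735, 0),
    "PLB Bus Logic": (47, 0),
    "PLB BRAM": (266, 16),
    "PLB to OPB Bridge": (686, 0),
    "OPB Arbiter": (76, 0),
    "OPB Bus Logic": (54, 0),
    "OPB 16550 UART": (548, 0),
    "OPB GPIO": (144, 0),
}

# Hypothetical system: every core listed in RESOURCES, one instance each.
system = list(RESOURCES)

total_slices = sum(RESOURCES[name][0] for name in system)
total_brams = sum(RESOURCES[name][1] for name in system)

print(f"approx. slices: {total_slices}, block RAMs: {total_brams}")
```

For this selection the sum comes to 2556 slices and 16 block RAMs.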
The Benefits of Parameterization
Example: OPB Arbiter
Parameter Values                                                   Resources   F MAX
NUM_MASTERS   PROC_INTERFACE   DYNAM_PRIORITY   PARK   REG_GRANTS   LUTs        MHz
1             N                N                N      N            11          295
2             N                N                N      N            18          223
4             N                N                N      N            34          193
4             Y                N                N      N            59          156
4             N                Y                N      N            54          169
4             N                N                Y      N            83          159
4             N                N                N      Y            34          201
4             Y                Y                Y      Y            146         145
8             Y                Y                Y      Y            388         112
Difference (4 masters, minimal vs. all options enabled): >4x in size, >30% in speed
• Significantly increases performance or saves area
• Only include what you need
• This can only be accomplished in a programmable system
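The ">4x in size" and ">30% in speed" figures can be checked against the table rows for a 4-master arbiter. The snippet below is plain arithmetic on those rows; which minimal row the slide's annotation refers to is not stated, so comparing the all-options-off row with the all-options-on row is an assumption (the REG_GRANTS-only row, at 34 LUTs and 201 MHz, gives an even larger speed gap).

```python
# 4-master rows from the table above.
minimal = {"luts": 34, "fmax_mhz": 193}   # all options set to N
full = {"luts": 146, "fmax_mhz": 145}     # all options set to Y

size_ratio = full["luts"] / minimal["luts"]
speed_gain = minimal["fmax_mhz"] / full["fmax_mhz"] - 1.0

print(f"size ratio: {size_ratio:.2f}x")
print(f"speed advantage of the minimal build: {speed_gain:.0%}")
```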
Intellectual Property InterFace (IPIF)
• Purpose
  – Simplifies task of interfacing IP to CoreConnect™ buses
• Provides simple interfaces to common tasks
• Mix-n-match philosophy of interfaces
  – Parameterizable: pay for only the logic needed
• Five Interface Types to Cover Needs
  – Slave SRAM style interface
  – Slave Control Register style interface
  – Slave FIFO style interface
  – Slave DMA handshake style interface
  – Master Interface
IP Interface (IPIF)
[Block diagram: Low Perf. IP attached through an IPIF to the OPB Bus; High Perf. IP attached through an IPIF to the PLB; Bridge; Arbiter; CPU; MMU; I-Cache; I-Cache Controller; D-Cache; D-Cache Controller; PLB I/F (two); ISOCM Controller with BRAM; DSOCM Controller with BRAM]
The IP interface enables customers to easily integrate their own custom application specific value-added IP into CoreConnect based systems
The Xilinx IPIF
[Block diagram: PLB/OPB Bus; IP Interface containing Master Interface, Slave SRAM I/F, Slave Control Reg. I/F, Slave DMA Handshake I/F with DMA Engine, Slave FIFO I/F; Customer IP]
The IP interface enables customers to easily integrate their own custom application specific value-added IP into CoreConnect™ based systems
CoreConnect IPIF Details
• Consists of 8 modules
• Automatically configures a core to the processor bus
• OPB & PLB supported
• IPIF will be added to other LogiCOREs
[Block diagram: On-Chip Bus; Bus Attachment Layer (BAL) HW: Slave Attachment, Master Attachment; Bus/Core Independent Layer: Interrupt Controller, Addr Decode, MUX, Write FIFO, Read FIFO, DMA, Scatter Gather; IP Core from Xilinx, customer or 3rd party]
System Generator Pro
[Flow diagram: Designer has a concept; Specify system architecture; Configure peripheral; Assemble system]
Designer Pushes the Button
• Tool automatically connects IP cores to Processor Bus
• Automatically generates both HW & SW components
  – Netlist, implementation files, etc.
  – Drivers, etc.
• Performs rule and consistency check
  – Memory map
  – Consistent & proper interfaces
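One of the consistency checks listed above, the memory map check, amounts to verifying that no two peripherals claim overlapping address ranges. The sketch below illustrates that kind of check; it is not the tool's implementation, and the `Region` type, peripheral names and addresses are hypothetical.

```python
from typing import List, NamedTuple, Sequence, Tuple

ADDRESS_SPACE = 1 << 32   # 32-bit bus addresses


class Region(NamedTuple):
    name: str
    base: int
    size: int   # in bytes


def find_overlaps(regions: Sequence[Region]) -> List[Tuple[str, str]]:
    """Return pairs of region names whose address ranges overlap."""
    for r in regions:
        if r.size <= 0 or r.base < 0 or r.base + r.size > ADDRESS_SPACE:
            raise ValueError(f"{r.name}: range does not fit in 32 bits")
    ordered = sorted(regions, key=lambda r: r.base)
    overlaps = []
    for i, first in enumerate(ordered):
        for second in ordered[i + 1:]:
            # Sorted by base, so once a region starts past the end of
            # 'first', no later region can overlap it either.
            if second.base >= first.base + first.size:
                break
            overlaps.append((first.name, second.name))
    return overlaps


# Hypothetical memory map with one deliberate conflict.
memory_map = [
    Region("uart", 0xA0000000, 0x100),
    Region("gpio", 0xA0000080, 0x100),
    Region("bram", 0xFFFF0000, 0x10000),
]
print(find_overlaps(memory_map))
```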
ChipScope Pro Debug/Verification for a New Development Paradigm
• ChipScope Pro Verification and Debug Tools
  – On-Chip CoreConnect Integrated Bus Analysis (IBA)
  – On-Chip Integrated Logic Analysis (ILA)
• Solves Tough Debug Problems
  – Full node/bus visibility
  – On-chip HW/SW co-verification
  – Debug at full system clock rate
[Block diagram: IO Pads; Integrated Logic Analyzer (ILA) Cores; Integrated Bus Analyzer (IBA) Core; IP Cores; Custom Logic; Embedded System Bus; PPC405; Memory Array; ICON; Boundary Scan TAP Controller; JTAG; Target Connection]
Summary
• CoreConnect interconnects high-bandwidth devices such as processor cores, external memory interfaces and DMA controllers
• Provides three buses for interconnecting cores, library macros, and custom logic:
  – Processor Local Bus (PLB)
  – On-Chip Peripheral Bus (OPB)
  – Device Control Register (DCR) Bus
• IPIF
  – Extremely useful tool to interface proprietary IP blocks
  – Parameterizable: pay for only the logic needed
• ChipScope Pro provides the industry’s best node and bus visibility with full debug capability at wirespeed
For More Details...
For more information, go to:
http://www.xilinx.com/xlnx/xil_prodcat_product.jsp?title=coreconnect and
http://www-3.ibm.com/chips/products/coreconnect/
Or contact the eSP team [email protected]
Questions?