Cell Broadband Engine
Spencer Dennis, Nicholas Barlow


The Cell Processor
  ◦ Objective: “[to bring] supercomputer power to everyday life”
  ◦ Bridge the gap between conventional CPUs and high-performance GPUs

History
  Original patent application in 2002
  Generations
  ◦ 90 nm - 2005
  ◦ 65 nm - 2007 (PowerXCell 8i)
  ◦ 45 nm - 2009
  Cost $400 million to develop
  Team of 400 engineers
  STI Design Center
  ◦ Sony
  ◦ Toshiba
  ◦ IBM

PS3
  Employed as CPU
  ◦ Clocked at 3.2 GHz
  ◦ Theoretical maximum performance of 230.4 GFLOPS (single precision)
  Utilized alongside NVIDIA RSX 'Reality Synthesizer' GPU
  ◦ Complemented graphical performance

Design
  ◦ 8 Synergistic Processing Elements (SPE)
  ◦ Single dual-issue Power Processing Element (PPE)
  ◦ Memory IO Controller (MIC)
  ◦ Element Interconnect Bus (EIB)
  ◦ Bus Interface Controller (BIC)

Architecture Overview
  SPU/SPE - Synergistic Processing Unit/Element
  SXU - Synergistic Execution Unit
  LS - Local Store
  SMF - Synergistic Memory Frontend
  EIB - Element Interconnect Bus
  PPE - Power Processing Element
  MIC - Memory IO Controller
  BIC - Bus Interface Controller

Synergistic Processing Element (SPE)
  128-bit dual-issue SIMD dataflow
  ◦ “Single Instruction Multiple Data”
  ◦ Optimized for data-level parallelism
  ◦ Designed for vectorized floating-point calculations
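As a rough illustration of the 128-bit SIMD model described for the SPE, the sketch below uses plain Python lists to stand in for one register holding four 32-bit floats. The names `simd_madd` and `peak_mflops` are hypothetical, and the peak-rate calculation assumes one multiply-add (counted as two floating-point operations) per lane per cycle, which the slides do not state.

```python
# Illustrative model only: a Python list stands in for one 128-bit SIMD register.
LANES = 128 // 32  # four 32-bit floats fit in one 128-bit register


def simd_madd(a, b, c):
    """Model one SIMD multiply-add: every lane computes a*b + c in lockstep."""
    assert len(a) == len(b) == len(c) == LANES
    return [x * y + z for x, y, z in zip(a, b, c)]


def peak_mflops(clock_mhz, cores, flops_per_lane=2):
    """Peak rate, ASSUMING one multiply-add (2 flops) per lane per cycle."""
    return clock_mhz * LANES * flops_per_lane * cores


# One "instruction" updates four values at once.
result = simd_madd([1.0, 2.0, 3.0, 4.0],
                   [2.0, 2.0, 2.0, 2.0],
                   [0.5, 0.5, 0.5, 0.5])
print(result)  # [2.5, 4.5, 6.5, 8.5]

# A single core at the 3.2 GHz clock quoted for the PS3.
per_spe = peak_mflops(3200, cores=1)
print(per_spe)  # 25600 MFLOPS, i.e. 25.6 GFLOPS under the stated assumption
```

The point of the model is that throughput scales with lane count only when the data can be laid out as vectors, which is why the SPE is described as optimized for data-level parallelism.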
SPE Continued
  ◦ Workhorses of the processor
  ◦ Handle most of the computational workload
  ◦ Each contains its own instruction + data memory
  ◦ “Local Store”
    ▫ Embedded SRAM

Power Processor Element (PPE)
  Responsible for governing SPEs
  ◦ “Extensions” of the PPE
  Shares main memory with SPEs
  ◦ Can initiate accesses for SPE cores
  Power Architecture
  ◦ Implements Power Architecture hypervisor
    ▫ Can run multiple operating systems concurrently
  Memory (1st generation)
  ◦ 32 KB split L1 instruction & data cache
    ▫ Unified 512 KB L2 cache

Element Interconnect Bus
  High-bandwidth internal bus
  1st generation: 96 bytes/cycle
  4 × 16 B rings
  ◦ Each ring can handle up to 3 simultaneous data transfers
  12 on and off ramps
  ◦ Each SPE + PPE
  ◦ Memory controller
  ◦ 2 off-chip I/O interfaces

Memory Flow Controller
  Asynchronous memory controller
  Retrieves data from main memory to the SPEs' local stores & the PPE's cache
  Supports two Rambus XDR memory banks

Bus Interface Controller
  Provides asynchronous interface between EIB and IO interfaces
  Two flexible IO interfaces to rest of system
  ◦ One interface can be reconfigured to provide a Symmetric Multiprocessing (SMP) interface
  Contains pervasive unit
  ◦ Provides test, debug and monitoring functionality
    ▫ Chip-level error checking
  ◦ Provides clock generation & distribution control
  ◦ Power on Reset Unit (POR)
    ▫ Responsible for unit initialization
  ◦ Performance monitoring
  Power Management Unit (PMU)
  ◦ Allows software-controlled power reduction
  Thermal Management Unit (TMU)

Developing for Cell
  Octopiler
  ◦ Takes high-level sequential code and parallelizes it to optimize it for a multiprocessor system
    ▫ High-level languages
  ◦ Divides code nine ways
    ▫ 8 sets of instructions are written for the SPEs
    ▫ The final set is written for the PowerPC PPE
  GCC
  ◦ IBM-sourced plugins for Cell PPU/SPU development

SPU ISA
SPU ISA (cont’d)

Applications (In Depth)
  Console gaming
  ◦ PS3
    ▫ PPE controls 6 SPEs, delegating tasks
    ▫ 1 SPE is OS-reserved, 1 SPE is redundant
  Supercomputing
  ◦ IBM BladeCenter QS Series
    ▫ Easy scalability
  Password cracking
  ◦ High parallelism allows for high floating-point brute-force performance

Conclusion
  Discontinued in 2009
  ◦ Difficult development environment
    ▫ Programmer-managed SPE memory
    ▫ Explicit parallelism
    ▫ Two separate ISAs
  Idea still lives on…
  ◦ General Purpose GPU
    ▫ Intel Larrabee Architecture
      . Intel Many Integrated Core Architecture
    ▫ AMD FireStream
    ▫ Nvidia Tesla
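To make the "programmer-managed SPE memory" and "explicit parallelism" points more concrete, here is a minimal sketch of the pattern in plain Python. It is not Cell code and uses no real Cell API: `spe_kernel` and `ppe_dispatch` are hypothetical names, the 256 KB local store size and 16 KB staging buffer are assumptions not stated in the slides, and the workers run one after another here, whereas real SPEs would run concurrently.

```python
# Hypothetical sizes and names throughout; this models the pattern, not real Cell APIs.
LOCAL_STORE_BYTES = 256 * 1024       # ASSUMED local store capacity per SPE
BUFFER_BYTES = 16 * 1024             # ASSUMED staging buffer inside the local store
FLOAT_BYTES = 4
CHUNK = BUFFER_BYTES // FLOAT_BYTES  # elements moved per staged transfer
NUM_WORKERS = 6                      # SPEs the PPE delegates to on the PS3, per the deck

assert BUFFER_BYTES <= LOCAL_STORE_BYTES


def spe_kernel(main_memory, start, stop, scale):
    """One simulated SPE: stage a chunk in, compute on it, write it back."""
    for base in range(start, stop, CHUNK):
        end = min(base + CHUNK, stop)
        local = main_memory[base:end]        # explicit copy in (stands in for a DMA get)
        local = [x * scale for x in local]   # compute touches only local data
        main_memory[base:end] = local        # explicit copy out (stands in for a DMA put)


def ppe_dispatch(main_memory, scale):
    """Simulated PPE: hand each worker one contiguous slice of the array."""
    n = len(main_memory)
    per_worker = -(-n // NUM_WORKERS)        # ceiling division
    for w in range(NUM_WORKERS):
        start = min(w * per_worker, n)
        stop = min(start + per_worker, n)
        spe_kernel(main_memory, start, stop, scale)  # serial here, for simplicity


data = [float(i) for i in range(100_000)]
ppe_dispatch(data, 2.0)
```

Even in this toy form, the programmer has to choose the chunk size, partition the work, and move every byte in and out by hand, which hints at why the development environment was considered difficult.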
