OpenSPARC T1 on Xilinx FPGAs – Updates
Durgam Vahia, OpenSPARC Engineering
Paul Hartke, Xilinx University Program
RAMP Retreat – Summer 2007, San Diego

Overview
• OpenSPARC – Quick Recap
• New Developments
> Hardware system with T1 on Xilinx board
> C program on system
> Full system simulation support
> Availability
• What's next
• Q & A
www.opensparc.net FCRC-RAMP-2007-San Diego

Recap – Big Goals
• Proliferation of OpenSPARC Technology
• Proliferation of Xilinx FPGA Technology
> Make OpenSPARC FPGA-friendly
> Create reference design with complete system functionality
> Boot Solaris/Linux on the reference design
> Open it up ..
> Seed ideas in the community
Enable multi-core research

OpenSPARC T1
• SPARC V9 implementation
• Eight cores, four threads each – 32 simultaneous threads
• All cores connect through a 134.4 GB/s crossbar switch
• High-bandwidth, 12-way associative 3 MB on-chip L2 cache
• 4 DDR2 channels (23 GB/s)
• 70 W power
• ~300M transistors
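
As a quick sanity check on the figures above, the per-unit numbers can be derived with simple arithmetic. The sketch below assumes that bandwidth divides evenly across cores and DRAM channels and that cache capacity divides evenly across ways; that even split is an illustrative assumption, not something the slide states.

```python
# Figures quoted on the slide.
CORES = 8
THREADS_PER_CORE = 4
CROSSBAR_GBPS = 134.4   # aggregate crossbar bandwidth, GB/s
L2_MB = 3               # on-chip L2 capacity, MB
L2_WAYS = 12
DDR2_CHANNELS = 4
DDR2_GBPS = 23          # aggregate DRAM bandwidth, GB/s

total_threads = CORES * THREADS_PER_CORE

# Assumption: resources are split evenly (not stated on the slide).
crossbar_gbps_per_core = CROSSBAR_GBPS / CORES
l2_kb_per_way = L2_MB * 1024 / L2_WAYS
ddr2_gbps_per_channel = DDR2_GBPS / DDR2_CHANNELS

print(f"threads: {total_threads}")
print(f"crossbar per core: {crossbar_gbps_per_core:.1f} GB/s")
print(f"L2 per way: {l2_kb_per_way:.0f} KB")
print(f"DDR2 per channel: {ddr2_gbps_per_channel:.2f} GB/s")
```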

OpenSPARC T1 – Some design choices
• Simpler core architecture to maximize cores on die
• Caches, DRAM channels shared across cores
• Shared L2 decreases cost of coherence misses significantly
• Crossbar good for b/w, latency and functional verification

UltraSPARC-T1 Processor Core
● Four threads per core
● Single-issue, 6-stage pipeline
● 16KB I-Cache, 8KB D-Cache
> Unique resources per thread
  > Registers
  > Portions of I-fetch datapath
  > Store and Miss buffers
> Resources shared by 4 threads
  > Caches, TLBs, Execution Units
  > Pipeline registers and DP
● Core Area = 11 mm² in 90 nm
● MT adds ~20% area to core
[Figure: processor core diagram with labeled units MUL, EXU, IFU, MMU, LSU, TRAP]

SPARC Core Pipeline
All processor IO (including interrupts) via Crossbar interface

OpenSPARC T1 on FPGAs (1)
• Single-thread version
> ~40K Virtex-2/4 LUTs, 30K Virtex-5 LUTs
> Optimized for area
> No modular arithmetic (MA), reduced TLBs
> Basic building block for larger multi-* designs
> Easily meets 20 ns cycle time (50 MHz)
> 50K Virtex-4 LUTs with MA and full TLBs
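
The timing and area figures above relate through simple arithmetic. The following sketch converts the quoted cycle time to a clock frequency and estimates what restoring the modular arithmetic unit and full TLBs costs in Virtex-4 LUTs. The function and variable names are illustrative, and the LUT counts are the rounded values from the slide, so the results are approximate.

```python
def period_ns_to_mhz(period_ns: float) -> float:
    """Convert a clock period in nanoseconds to a frequency in MHz."""
    return 1000.0 / period_ns

# Rounded Virtex-4 LUT counts from the slide, in thousands.
LUTS_REDUCED_K = 40  # no MA, reduced TLBs
LUTS_FULL_K = 50     # with MA and full TLBs

freq_mhz = period_ns_to_mhz(20.0)
extra_luts_k = LUTS_FULL_K - LUTS_REDUCED_K
extra_pct = 100.0 * extra_luts_k / LUTS_REDUCED_K

print(f"{freq_mhz:.0f} MHz")
print(f"MA + full TLBs: about {extra_luts_k}K extra LUTs (+{extra_pct:.0f}%)")
```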

T1 on FPGA (2)
• Four-thread version
> Functionality identical to Niagara1 core – on FPGAs
> 78K Virtex-2/4 LUTs, 59K Virtex-5 LUTs
> 40%+ reduction in area compared to original design
> Modular Arithmetic Unit can be removed to save area
> Further optimizations can be made in register file design
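
To put the LUT counts for the two core versions side by side, the sketch below computes how area scales from one thread to four, and how much smaller each version is on Virtex-5. It uses only the rounded figures quoted on the slides (the single-thread count is given as "~40K"), so treat the ratios as rough. Part of the Virtex-5 saving is expected simply because Virtex-5 uses 6-input LUTs where Virtex-4 uses 4-input LUTs.

```python
# Rounded LUT counts from the slides, in thousands.
SINGLE_THREAD_K = {"Virtex-2/4": 40, "Virtex-5": 30}
FOUR_THREAD_K = {"Virtex-2/4": 78, "Virtex-5": 59}

def thread_scaling(family: str) -> float:
    """LUT ratio of the four-thread core to the single-thread core."""
    return FOUR_THREAD_K[family] / SINGLE_THREAD_K[family]

def virtex5_saving_pct(luts_k: dict) -> float:
    """Percent fewer LUTs on Virtex-5 than on Virtex-2/4."""
    return 100.0 * (1.0 - luts_k["Virtex-5"] / luts_k["Virtex-2/4"])

for family in SINGLE_THREAD_K:
    print(f"{family}: 4 threads cost {thread_scaling(family):.2f}x the LUTs of 1 thread")

print(f"Virtex-5 saving, single thread: {virtex5_saving_pct(SINGLE_THREAD_K):.1f}%")
print(f"Virtex-5 saving, four threads: {virtex5_saving_pct(FOUR_THREAD_K):.1f}%")
```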

T1 System Block Diagram
[Figure: T1 system block diagram, built as a Xilinx EDK design. Labeled blocks: FPGA boundary, external DDR memory, memory controller, OpenSPARC T1 core, T1/Microblaze bridge, Microblaze service processor, Microblaze debug UART, OpenSPARC T1 UART, 10/100 Ethernet interface, Fast Simplex Links (FSL). Legend: Operational / Planned.]

System Operation
• OpenSPARC T1 core communicates exclusively via processor-to-crossbar interface (PCX)
> Glue logic to connect to Microblaze service processor
• Service processor firmware polls T1 core and system peripherals
> Does protocol translation
> Routes requests to the right destination
• Current support for DRAM and UART
> Ethernet planned
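
The polling-and-routing behavior described above can be sketched as a small behavioral model. This is an illustration only: the real firmware runs on the Microblaze, and the address map, request fields, and class names below are hypothetical, not taken from the OpenSPARC release. Protocol translation and peripheral polling are left out to keep the routing step visible.

```python
from collections import deque
from dataclasses import dataclass

# Hypothetical address map: the slide names DRAM and UART as supported
# targets but gives no addresses, so these ranges are made up.
DRAM_BASE, DRAM_SIZE = 0x0000_0000, 0x1000_0000
UART_BASE, UART_SIZE = 0xFFF0_0000, 0x100

@dataclass
class Request:
    """Simplified stand-in for a request arriving from the T1 core."""
    is_store: bool
    addr: int
    data: int = 0

class Dram:
    def __init__(self):
        self.mem = {}

    def access(self, req, offset):
        if req.is_store:
            self.mem[offset] = req.data
            return None
        return self.mem.get(offset, 0)

class Uart:
    def __init__(self):
        self.tx = []  # characters "transmitted" so far

    def access(self, req, offset):
        if req.is_store:
            self.tx.append(chr(req.data & 0xFF))
        return None  # UART reads are not modeled

class ServiceProcessor:
    def __init__(self):
        self.dram, self.uart = Dram(), Uart()
        self.inbox = deque()  # requests queued by the core
        self.replies = []     # load results returned to the core
        self.routes = [
            (DRAM_BASE, DRAM_SIZE, self.dram),
            (UART_BASE, UART_SIZE, self.uart),
        ]

    def route(self, addr):
        """Pick the destination device and the offset within it."""
        for base, size, device in self.routes:
            if base <= addr < base + size:
                return device, addr - base
        raise ValueError(f"unmapped address {addr:#x}")

    def poll_once(self):
        """One polling pass: drain pending requests and dispatch each."""
        while self.inbox:
            req = self.inbox.popleft()
            device, offset = self.route(req.addr)
            result = device.access(req, offset)
            if not req.is_store:
                self.replies.append(result)

sp = ServiceProcessor()
sp.inbox.append(Request(is_store=True, addr=0x1000, data=42))
sp.inbox.append(Request(is_store=False, addr=0x1000))
sp.inbox.append(Request(is_store=True, addr=UART_BASE, data=ord("A")))
sp.poll_once()
print(sp.replies, sp.uart.tx)
```

Keeping the routing table as data means a planned target such as Ethernet would only need another entry in `routes` in this model.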

System Status - Hardware
• Entire system functional on Xilinx ML411 board
> With XC4VFX100 device
> Through Xilinx Embedded Development Kit (EDK) project
> Service processor firmware maintains reverse directory for coherency
• Ran entire OpenSPARC regression suite
> Introduced a new flow that converts simulation diagnostics to run on real hardware
> 322 tests in total, verifying various T1 functionality
> Essentially a hardware regression suite
Robust system functionality for Software work

System Status - Software
• Able to boot Niagara Hypervisor software
> Executes entire Niagara reset sequence
> Sets up hardware virtualization
> Key for booting both Solaris and Linux
> Provides interface to run stand-alone C program without OS
• C program
> Hypervisor gives control to the C program
> OS calls need to be handled by the program
• To demonstrate, we ran a text-based adventure program – Dungeon – on this system
> Showed at FCRC

T1 on ML411

Hypervisor Boot

Full-system simulation
• For adoption and proliferation, availability of the verification environment is key
> Added support for full-system simulation that includes behavioral models of all the hardware components
> Single-thread T1, glue logic, service processor
> DRAM control, UART control, DDR model
> T1 HDL can be replaced by a gate-level netlist; Synplicity and Xilinx XST are supported
> System simulator – Mentor ModelSim
> Part of Xilinx EDK project
> Easily reproducible

Availability
• OpenSPARC T1 1.4 released on 3/13
> Supports single-thread version of T1
> Includes FPGA optimizations
• Next release will include
> Xilinx XST simulation support. Synplicity support will continue
> System build environment with EDK
> T1 and glue logic as peripheral cores (“pcores”)
> Service processor firmware
> Hardware regression environment
> Hypervisor code for FPGA system
> Full system simulation with ModelSim
> DOCUMENTATION
All necessary components for academic proliferation

Other notable
• Gaisler Research integrating single-thread T1 in GRLIB
> Designed T1 crossbar AHB bridge
> Full system simulation with T1 working with Leon2 DRAM controller, UART and timer controller
> Working through software integration issues (reset code etc.)
> Goal to run C program

Next Steps
• Continue porting software stack
> OpenBoot PROM – SPARC equivalent of BIOS
> Operating system: Linux and/or Solaris
• Support for Virtex-5 devices
> BRAM mapping will be a challenge
• Extend the system to handle four-threaded T1 core
• What would RAMP like to see in OpenSPARC?

Team
• Sun
– Ismet Bayraktaroglu
– Thomas Thatcher
– Kushal Datta
– Durgam Vahia
• Xilinx
– Paul Hartke
– Gopal Reddy

OpenSPARC momentum
Innovation will happen everywhere
Innovation Happens Everywhere
> 5200 downloads