Rapid silicon prototyping and production for RISC-V SoCs

5th RISC-V Workshop, Nov 29-30 2016
Neil Hand, Codasip VP Marketing

Every new IoT application (and company) is different, so why use a standard part?

Because until now it is all you had…

CHOICES TODAY ARE LIMITED FOR NEW DESIGN TEAMS

Standard parts
• Cheap and easy to get started
• Poor production scaling
• Limited differentiation

ASIC
• Expensive to get started
• Good production scaling
• High level of differentiation

NEED TO SUPPORT RAPID EVOLUTION OF IOT DESIGNS

Phase 1: Basic functionality to market
• Enable the software platform

Phase 2: Usability and cost optimization
• Expand the market

Phase 3: Multi-function integration
• Deliver unique value

Benefits from ASIC

WHAT IF? SOC DESIGN ON AN IOT BUDGET!

[Chart: Design Cost versus BOM Cost and/or Product Differentiation. Regions: Off-the-Shelf Components, IoT class ASICs, Mainstream ASICs, Mobile class ASICs, Custom Silicon Design]

At the high-cost end
• Mask set > $25M
• Prototypes > 6 months
• EDA tools > $10M
• Bleeding-edge technology
• Highly specialized resources

On an IoT budget
• Mask set in $10K's
• Prototypes in weeks
• EDA tools < $100K
• Mature technology
• Limited specialized resources

RISC-V PLATFORM FROM SILICON TO INTELLIGENCE

An integrated platform that
• Keeps designers in control
• Is easy to build and expand
• Provides as much, or as little, help as you need
• Supports low-cost MPW and wafer scaling

[Platform stack diagram: TensorFlow, OpenCV / SYCL, OpenCL / LLVM / Application optimized RISC-V / Advanced Debug, Analysis, optimization / FPGA Porting, MPSC / RICH foundational IP Library. STANDARDS BASED]

IT BEGINS AT THE SILICON LEVEL

[Flow diagram: Spec, RTL, Implementation, Emulation, FPGA Proto, MetalCopy, FPGA Port, MCSC, ASIC UltraShuttle. Companies named: Codasip, Baysand, UltraSoC]

METAL CONFIGURABLE STANDARD CELL (MCSC) ACCELERATES DESIGN AND PRODUCTION

• MCSC – 65nm and 40nm Technology
• 600k to 4 million usable gates
• 242 to 1250 IOs
• 1 Mbit to 70 Mbit of memory
• Including DCMs/PLLs
• Package flexibility supporting up to 1760 Flip Chip BGA package
• ASIC UltraShuttle offering
  • RTL signoff
  • Low cost verification vehicle
  • Deliver 100 fully tested packaged devices
  • 8 weeks from tape-out

Rich Family of IP Cores, Core Design Elements, Processors

MCSC - Core Logic
• Metal Configurable Standard Cell
• 500+ Standard Cell Library
• Multi-Vt Support

MCSC - Memory
• Metal Configurable RAM
• SP/DP/SDP/ROM/FIFO
• Support FPGA Features+

MCSC - IO & PHY
• Metal Configurable IO Banks
• Metal Configurable DDR/LVDS PHY
• 1.2V to 3.3V IO Standards
• PHY can be Anywhere

Transceiver (Serdes)
• 6.5 & 12.5 Gbps
• Supports Multi-Protocol
• Soft Multi-Protocol PCS

MCSC - PLL
• Metal Configurable PLL
• 6 or 9 Output
• Support Dynamic Freq. & Phase

METALCOPY SEAMLESS FPGA TO ASIC CONVERSION

• Xpresso IP conversion tool
  • Makes untethering from FPGA easy
• Metal Configurable Standard Cell (MCSC)
  • Supporting 65nm and 40nm
• UltraShuttle
  • Low-risk silicon verification
• Proven RTL to working silicon methodology
• Production ready - DFT, ATPG and BIST

LAYERING ON RISC-V CODIX BERKELIUM

Processor Features
• 3-stage 32-bit, or 5-stage 32-bit or 64-bit
• Support for standard extensions (A, C, F, etc.)
• Configurable general purpose registers
• Compiler or hardware-based hazard avoidance
• Configurable interrupt support
• Configurable branch prediction unit
• Configurable sleep mode support
• Optional instruction and data cache
• JTAG support with full debug support
• Full custom extension support
• Complete LLVM/GNU SDK including profiling, emulation, etc.

Codix Berkelium delivers RISC-V instruction-set compatibility, allowing users to leverage the rich ecosystem of software and tools becoming available, in addition to those provided by Codasip.

                                      Min area constraint    Max freq constraint
Core                      Registers   Gates      Freq.       Gates      Freq.
Codix-Bk3                 1740        20901      54.6        33979      431.4
Codix-Bk3C                1807        22364      54.6        36332      419.8
Berkeley Chisel Zscale    2117        22335      59.2        33549      355.5
Berkeley VScale           1864        20870      44.2        38921      339.6

EASY EXTENSIBILITY

[Flow diagram: Codix-Bk, Application Driven Analysis, Specification Driven Analysis, Updated Model (by Codasip or by Customer), Automatically generated SW and HW infrastructure (SDK, HDK)]

• Powerful tools simplify identification of optimization opportunities
• Easy to update and configure processor models
• Complete development infrastructure that is optimized to, and understands, your unique processor

The Codasip team required only days to implement WalnutDSA, a project that previously would have required as many as three calendar months [while improving performance].
Quantum-Resistant IoT Security on RISC-V, SecureRF

NEXT ADD DEBUG, BRING-UP, AND SOC ANALYSIS

[Architecture diagram. System blocks on the System Interconnect: Processor, Accelerator, Graphics, Custom Circuit, Byte Stream, Memory Controller, Custom Circuit, Security Engine, JTAG Control, Bus Master/Slave. UltraSoC modules: Processor Analytics Module, Status monitor, Bus Monitor, Additional Monitors. UltraSoC Infrastructure with USB Comm. and USB. Legend: System block, UltraSoC]

• Modules are protocol aware and “smart” with filter and trace
• Portfolio of configurable modules, optimized for different system IP blocks
• Flexible, scalable message fabric, easy to route
• Supports subsystems with different power domains, clock domains
• Debug & trace is transparent: does not impact system bus

PROBLEMS ULTRASOC SOLVES

• Why is the CPU not as fast as expected?
• Why do some DMA transfers take too long?
• Can I trust system security?
• Why does the system occasionally hang or deadlock?
• What is going on with my memory controller?
• Why is my interconnect slower than I expected?

[Example SoC diagram: Radio IF, DSP, FFT, Processors with I-TCM, I$, D-TCM and D$, Turbo, MAC, USB, Interconnect, Peripheral Interconnect, DMA-1, RAM, DMA-2, Timer, Security, DRAM controller, DFI-PHY, PHY, DDR3. UltraSoC IP: Status monitors (SM), Bus monitors (BM), Debug Hub, UltraSoC Infrastructure]

FINALLY WE NEED THE APPLICATION LAYER

• LLVM is the glue that holds the solution together
• Allows multiple vendors to collaborate easily
• Increases return on investment by allowing portability

SUPPORTS A LAYERED PROGRAMMING MODEL

• From machine learning to device control
• Program at any level you desire
• Each level/layer is well specified, tested and validated

Graph programming
• OpenCV, OpenVX, Halide, VisionCpp, TensorFlow, Caffe
• Validate graph models
• Validate the code using standard tools

C/C++-level programming
• SYCL, OpenCL, HCC, C++ AMP, NVIDIA CUDA*
• OpenCL/SYCL specs
• Conformance testsuites
• CLsmith testsuite
• Wide range of other testsuites

Higher-level language enabler
• SPIR-V, HSAIL, OpenCL SPIR, NVIDIA PTX*
• SPIR/SPIR-V/HSAIL specs
• Conformance testsuites

Device-specific programming
• Assembly language, VHDL, device-specific C-like programming models
• Device-specific specification
• Device-specific testing and validation

* Not supported by Codeplay

MACHINE LEARNING → SILICON

Layers: Graph programming, C/C++-level programming, Higher-level language enabler, Device-specific programming

Technologies:
• TensorFlow
• Caffe
• OpenCV
• SYCL
• VisionCpp
• OpenCL SPIR
• LLVM

• Open standards benefit from clear specification, testing and validation per standard
• Codeplay has products to deliver each layer
• Flexible for evolution of standards, products, scaling and market demands

COMMON IDE INFRASTRUCTURE

Eclipse provides a common interface throughout the solution.

WHAT DOES IT MEAN FOR YOU?

• Concept to silicon in record time
• Do in weeks what took months
• Proven low-risk solution
• We’ve done the work so you don’t have to
• Based on open standards
• Use only what you want/need

[Platform stack diagram: TensorFlow, OpenCV / SYCL, OpenCL / LLVM / Application optimized RISC-V / Advanced Debug, Analysis, optimization / FPGA Porting, MPSC / RICH foundational IP Library]

WHAT’S NEXT

• Test chip taping out very soon
• Will be available to interested parties
• Also exploring general availability on dev-board
• Continuing to add more collaborators
• Proactive R&D investment from all involved
• Stay tuned for more announcements
• Aim is to provide a platform from prototype to scaling
• De-risk RISC-V for any and every design

Thank You For Your Attention