The Next Major Advance in Chip-Level Design Productivity
Total Page:16
File Type:pdf, Size:1020Kb
The Next Major Advance in Chip-Level Synopsys EDA Interoperability Developers’ Forum Design Productivity Santa Clara, CA st [email protected] October 21 , 2004 The Next Major Advance in Chip-Level Design Productivity A. Richard Newton University of California, Berkeley Synopsys EDA Interoperability Developers’ Forum Santa Clara, CA October 21st, 2004 Fundamental Drivers of Future Chip Designs (1) (2) (3) (4) Silicon Scaling Rising Design Growing Complexity Increased System Requires Concurrency Drives Drives Chip Cost Drives Software-Based Multiple Processor Capacity Programmability Solutions Architectures SoC Becomes A “Sea Of Processors” SoC Programmable Software-Centric Sea-of-Processors Platforms Design Design Source: Chris Rowen, Tensilica Page 1 The Next Major Advance in Chip-Level Synopsys EDA Interoperability Developers’ Forum Design Productivity Santa Clara, CA st [email protected] October 21 , 2004 Key Points The future mainstream building-block of electronic system-level design will present a (configurable) clocked synchronous Von Neumann programmer’s model to the system-level application developer (ASIP or TSP) The majority of large silicon systems will consist of many such processors, connected in an asynchronous network These processors may be integrated on a single chip (CMP) and/or as a (possibly very large) collection of chips These conclusions lead to a number of critical design-technology research challenges and new business opportunities Fundamental Drivers of Future Chip Designs (1) (2) Silicon Scaling Rising Design Drives Chip Cost Drives Capacity Programmability Programmable Platforms Page 2 The Next Major Advance in Chip-Level Synopsys EDA Interoperability Developers’ Forum Design Productivity Santa Clara, CA st [email protected] October 21 , 2004 Conventional Arguments: The Changing Landscape of Design, Manufacture, and Test The NRE cost of building a complex chip is O($20M) in 2004: Fixed Costs (Masks, EDA Tools, IP Blocks, Diagnosis and Test) Design Costs (Team Size, Verification, Timing Closure) Opportunity Cost (Predictability Of Design Time, Chip Characteristics, and Manufacturing Reliability) Need either a single, huge market or ability to address multiple application variants and system product generations with same physical device Programmability brings adaptability to SoC. Two popular forms: • Field-programmable logic, based on low-level logic and interconnect hardware configuration, from hardware description languages (e.g. Verilog), and O(20-40) times slower/larger/more power than equivalent custom logic Processors, based on sequential instruction programming from high-level languages (mostly C/C++ plus limited assembly code), and O(10-1,000) times slower/more power than equivalent custom logic Total IC Designs 10,000 9,000 8,000 7,000 6,000 5,000 ASSP 4,000 IC Designs 3,000 2,000 ASIC 1,000 0 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 Year Source: Handel Jones, IBS, October 2002 Page 3 The Next Major Advance in Chip-Level Synopsys EDA Interoperability Developers’ Forum Design Productivity Santa Clara, CA st [email protected] October 21 , 2004 Total IC Designs 10,000 9,000 8,000 7,000 6,000 5,000 ASSP 4,000 IC Designs 3,000 2,000 ASIC 1,000 0 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 Year Source: Handel Jones, IBS, October 2002 Fundamental Drivers of Future Chip Designs (1) (2) (3) Silicon Scaling Rising Design Growing Complexity Requires Drives Chip Cost Drives Software-Based Capacity Programmability Solutions Software-Centric Design Page 4 The Next Major Advance in Chip-Level Synopsys EDA Interoperability Developers’ Forum Design Productivity Santa Clara, CA st [email protected] October 21 , 2004 Growing Complexity Drives Software-Centric Design Growing product complexity driven by both market competition in end products and growing capability of silicon Complexity of the external application domain makes accurate specification of application domain almost impossible Example: voice codec ITU document size G.711 (1988): 190KB, G.726 (1990): 290KB, G.729 (1996): 2.1MB Growing complexity means: 1. Greater design time 2. Greater bug risk and bug fix effort 3. Greater diversity of customer requirements 4. Greater exposure to changing standards Software, today written in high-levellevel languageslanguages (e.g.(e.g. C/C/C++) is the best understood, most scalable means of developing and debugging complexlex functions. A Discipline of Platform-Based Design ApplicationApplication Programming Model: Models/Estimators Kernels/Benchmarks Architecture(s)Architecture(s) Architectural Platform Microarchitecture(s)Microarchitecture(s) Functional Blocks, Cycle-speed, power, area Interconnect VVSSSG V S G S V S Circuit Fabric(s) S Circuit Fabric(s) V S V G Silicon Implementation Platform S ManfacturingManfacturing Interface Interface Delay, variation, Basic device & interconnect SPICE models structures SiliconSilicon ImplementationImplementation GSRC Review, Sep. 2001 Page 5 The Next Major Advance in Chip-Level Synopsys EDA Interoperability Developers’ Forum Design Productivity Santa Clara, CA st [email protected] October 21 , 2004 Today: “Given a Processor Chip (and it’s Accelerators)…” I get to choose from existing ApplicationApplication hardware product offerings… Programming Model: Models/Estimators Kernels/Benchmarks Then I decide what software Architecture(s)Architecture(s) components I have or can find, Architectural Platform for OS, for IO, for data Microarchitecture(s)Microarchitecture(s) Functional Blocks, conversion, etc., then I port Cycle-speed, power, area Interconnect what I must, and I plan to write CircuitCircuit Fabric(s)Fabric(s) the rest. Silicon Implementation Platform ManfacturingManfacturing Interface Interface A “Hardware-up” methodology Delay, variation, Basic device & interconnect SPICE models structures SiliconSilicon ImplementationImplementation Tomorrow: “Given an Application, and a software development environment…” I get to specify the characteristics ApplicationApplication Programming Model: of a programmable hardware core Models/Estimators Kernels/Benchmarks or sea-of-cores… Architecture(s)Architecture(s) Architectural Platform Then I decide what Microarchitecture(s)Microarchitecture(s) accelerators/additional instructions Functional Blocks, Cycle-speed, power, area Interconnect I might need, select IP from libraries, and use them to design a CircuitCircuit Fabric(s)Fabric(s) Silicon Implementation Platform chip for this class of application ManfacturingManfacturing Interface Interface Delay, variation, Basic device & interconnect A “Software-down” methodology SPICE models structures SiliconSilicon ImplementationImplementation Page 6 The Next Major Advance in Chip-Level Synopsys EDA Interoperability Developers’ Forum Design Productivity Santa Clara, CA st [email protected] October 21 , 2004 Platforms “We could work with other companies to develop new cores and amortize costs across multiple of their customers” Source:UC Professor Berkeley, Edward Edward Lee Lee 13 Configurability Source: Tensilica, Inc Page 7 The Next Major Advance in Chip-Level Synopsys EDA Interoperability Developers’ Forum Design Productivity Santa Clara, CA st [email protected] October 21 , 2004 Configurabilty Only Works if Essentially Transparent to the Application Programmer ALU I/O Critical role of TIE Pipe Cache Timer Register File MMU Tailored, HDL uP core Using the Describe the processor processor generator, attributes from Customized Use a create... a browser-like Compiler, standard Assembler, cell library interface Linker, to target to Debugger, the silicon Simulator process Source: Tensilica, Inc Enabling Design-Space Exploration ApplicationApplication ModelModel ProgrammingProgramming programprogram Architecture View ArchitectureArchitecture CompilerCompiler gengen ModelModel Estimator gengen Estimator .o.o Simulator/Instr.Simulator/Instr. Emulator Emulator Source: Mescal Group Page 8 The Next Major Advance in Chip-Level Synopsys EDA Interoperability Developers’ Forum Design Productivity Santa Clara, CA st [email protected] October 21 , 2004 EEMBC Networking Benchmark • Benchmarks: OSPF, Route Lookup, Packet Flow • Xtensa with no optimization comparable to 64b RISCs • Xtensa with optimization comparable to high-end desktop CPUs • Xtensa has outstanding efficiency (performance per cycle, per watt, per mm2) • Xtensa optimizations: custom instructions for route lookup and packet flow IDT 32334/100 IDT79RC32364/100 14 0.045 NEC V 832-143 0.040 AMD ElanSC520/133 12 2 Toshiba TMPR3927F-GH189/133 0.035 IDT79RC32V334-150 10 Toshiba TMPR3927F-GHM2000/133 0.030 NEC VR5432-167 8 0.025 Xtensa/200 IDT79RC64575IDtc/250 6 0.020 NEC VR5000 0.015 IDT79RC64575Algor/250 4 AMD K6-2/450 Performance/MHz Netmark 0.010 AMD K6-2E/400 Performance relative to IDT (MIPS3 32334/100 2 Xtensa Optimized/200 0.005 AMD K6-2E+/500 0 0.000 A MD K6-IIIE+/550 Colors: Blue-Xtensa, Green-Desktop x86s, Maroon-64b RISCs, Orange-32b RISCs Source: Tensilica, Inc EEMBC Consumer Benchmark • Benchmarks: JPEG, Grey-scale filter, Color-space conversion • Xtensa with no optimization comparable to 64b RISCs • Xtensa with optimization beats all processors by 6x (no JPEG optimization) • Xtensa has exceptional efficiency (performance per cycle, per watt, per mm2) • Xtensa optimizations:custom instructions for filters, RGB-YIQ, RGB-CMYK 200 1.00 ST20C2/50 AMD ElanSC520/133 175 NEC V832/143 0.80 150 National Geode GX1/200