Using Systemc for High-Level Synthesis and Integration with TLM
Total Page:16
File Type:pdf, Size:1020Kb
Using SystemC for high-level synthesis and integration with TLM Presented at India SystemC User Group (ISCUG) http://www.iscug.in April 14, 2013 Mike Meredith – VP Strategic Marketing Forte Design Systems © 2013 Forte Design Systems Agenda • High-level synthesis • Keys to raising abstraction level – A brief history of high-level – Implicit state machine synthesis vs explicit state machine – What is HLS? – Design exploration – HLS Basics – HLS and Arrays • Synthesizability – HLS and I/O protocols – Introduction to the • Integrating synthesizable code synthesizable subset standard in TLM models draft • User Experience – Recommendations and extensions © 2013 Forte Design Systems 2 A Brief History Of High-level Synthesis © 2013 Forte Design Systems 3 What’s In A Name? Algorithmic Behavioral Synthesis Synthesis ESL Synthesis High-level Synthesis © 2013 Forte Design Systems 4 Synthesizing Hardware From A High-level Description -- Not a New Idea! Generation 0 Generation 1 Generation 2 Research First Industrial Tools C-based Tools 1970’s 1980’s 1990’s 2000’s Today • Generation 0 - Research – 1970’s - 1990 – First mention (that I can find) - 1974 • CMU: Barbacci, Siewiorek “Some aspects of the symbolic manipulation of computer descriptions” – A sampling of key researchers: • Thomas, Paulin, Gajski, Wakabayashi, DeMann – Academic implementations • 1979 CMU - CMU-DA • 1988 Stanford Hercules – Most research focused on scheduling and allocation for datapath-dominated designs © 2013 Forte Design Systems 5 Generation 1 - 1990’s Generation 0 Generation 1 Generation 2 Research First Industrial Tools C-based Tools 1970’s 1980’s 1990’s 2000’s Today • First Industrial Tools • Challenges – NEC Cyber Workbench - – Synthesis direct to gates 1993 (internal to NEC) – HDL’s are not good behavioral – Synopsys Behavioral Compiler algorithm languages - 1994 – Poor quality of results – Cadence Visual Architect - • Area and timing closure 1997 – Verification difficulty – Mentor Monet - 1998 • Interface synthesis not good • I/O cycle accuracy not guaranteed © 2013 Forte Design Systems 6 Generation 2 - 2000’s Generation 0 Generation 1 Generation 2 Research First Industrial Tools C-based Tools 1970’s 1980’s 1990’s 2000’s Today • “Bloom” of industrial • Reasons for success implementations – Adoption of C & C variants for – Forte Cynthesizer input give access to algorithm – Celoxica Agility Compiler code – Mentor Catapult – Adoption of flows that include trusted logic synthesis tools – Synfora Pico Express – Quality of results caught up – Cadence C to Silicon Compiler with hand-coded – Verification considered along with design – Designs getting too big for RTL © 2013 Forte Design Systems 7 Why RTL Productivity Isn’t Enough (Hint: Moore’s Law) © 2013 Forte Design Systems 8 What Is High-level Synthesis © 2013 Forte Design Systems What Is High-level Synthesis? It’s all about abstraction Architectural • Y-chart Behavioral Structural – Gajski & Kuhn Algorithmic Systems CPU, Memory Functional Block – Captures relationships Algorithms Subsystems, Buses between specification Register-transfer Logic ALUs, Registers domains and abstraction Logic Circuit Gates, Flipflops levels Transfer functions Transistors Logic Synthesis: Register-transfer Gates/Flipflops Polygons Functional Block level Logic level Cells, Module Plans High-level Synthesis: Macros, Floor Plans Datapath: Algorithms ALUs/Registers Clusters Control: Algorithms Register-transfer Chips, Physical Partitions Algorithmic level Functional Block level Geometry © 2013 Forte Design Systems 10 What is High-Level Synthesis? SystemC and C++ HLS takes Synthesizable Behavioral Models abstract descriptions Constraints, Directives HLS Technology Detail Library FSM Datapath Abstraction MUL ADD DIV FPU and adds detail RTL - Verilog to produce RTL © 2013 Forte Design Systems 11 What is High-Level Synthesis? • HLS creates hardware from un-timed algorithm code – Synthesis from a behavioral description • Input is much more abstract than RTL code – No breakdown into clock cycles – No explicit state machine – No explicit memory management – No explicit register management – The high-level synthesis tool does all of the above © 2013 Forte Design Systems 12 High-Level Synthesis is not... • ... the same as writing software – the algorithm may be the same, but the implementation is very different – the cost/performance tradeoffs are different – e.g. use of large arrays • Software can assume essentially infinite storage with equal (quick) access time • Hardware implementations must tradeoff storage size vs. access time – these in turn affect the coding style used for synthesis • ... the same as writing RTL – the input code is much more abstract © 2013 Forte Design Systems 13 What does the designer do? What does the tool do? Decisions made by designer Decisions made by HLS • Function • State machine – As implicit state machine – Structure, encoding • Performance • Pipelining – Latency, throughput – Pipeline registers, stalling • Interfaces • Scheduling • Storage architecture – Memory I/O – Interface I/O – Memories, register banks etc – Functional operations • Partitioning into modules © 2013 Forte Design Systems 14 What is High-level Synthesis Used For? • Datapath dominated designs – Most of the design is computation oriented – Image processing – Wireless signal processing • Control dominated designs – Most of the design is making decisions and moving data – SSD controllers – Specialized processor development – Network switching applications © 2013 Forte Design Systems 15 Statistics – Real-world usage • Block / Module size – Common = 30K 300K gates – Largest we have seen = 750K gates • Blocks per Project – Typical • Depends on customer stage of adoption • Common = 1-8 – Largest we have seen = 200+ blocks • Project size – (implemented with Cynthesizer) – Largest we have seen = 15M gates • QOR (Area) – Typically 5-20% improvement over hand-RTL 3 Copyright 2011 Forte Design Systems - Proprietary and Confidential 16 Adopting HLS Productivity values and expectations HLS Design The Feature Complete RTL Design Panic 2x-4x over target Zone A R Feature Complete 20% over target E A Estimated Target Area 4 DEVELOPMENT TIME Copyright 2011 Forte Design Systems - Confidential 17 Keys to HLS Raising Abstraction • Implicit state machine – More like algorithm code – Leaves room for exploration • Design exploration High Level – Same source can produce designs for different purposes High-Level Synthesis • Array handling RTL – Arrays can be implemented in multiple ways – HLS schedules memory protocol signals w/ algorithm Logic Synthesis • Handling I/O protocols Gate – HLS schedules I/O protocol signals w/ Level algorithm – Modular interfaces allow protocol encapsulation Design Abstraction & reuse © 2013 Forte Design Systems 18 High-level Synthesis Basics © 2013 Forte Design Systems High-level Synthesis Basics • High-level synthesis begins with an Behavioral algorithmic description of the desired behavior Input • Behavior is expressed in a high level language • SystemC is the input language for Cynthesizer Lexical Processing • The second input is the ASIC or FPGA Algorithm Optimization technology library Control/Dataflow Analysis – Cell functions, area and timing Library Processing • The third input is directives for synthesis – Target clock period (required) Scheduling & Binding – Other directives: pipelining, latency constraints, memory architecture Design Evaluation Output Generation © 2013 Forte Design Systems 20 High-level Synthesis Example Consider the following behavioral description to be synthesized for a 100MHz clock period: typedef unsigned char bit8; unsigned long example_func( bit8 a, bit8 b, bit8 c, bit8 d, bit8 e ) { unsigned long y; y = ( ( a * b ) + c ) * ( d * e ); return y; } • This description is un-timed: The user does not specify the clock boundaries – Does not specify the latency – Does not specify which values are registered or when to compute each value • Cynthesizer will map the SystemC/C++ operations (‘*’, ‘+’) to datapath components, and add clock boundaries – Driven by target clock period and other directives – Results in known latency and registers – Depends on technology library © 2013 Forte Design Systems 21 High-level Synthesis Example: Control & Data Flow Graph • Cynthesizer analyzes the source and determines: – Inputs & Outputs – Operations – Data dependencies • A control/dataflow graph (CDFG) is constructed: – With no timing information – Represents the data flow for the functionality of the design if (x == 0) q = d1; • The graph shows data dependencies: else – Operations that must complete before others can begin q = d2; • e.g. (a*b) must complete before we can calculate 0 (a*b)+c d1 q d2 1 • The graph shows control dependencies: – if/else constructs will turn into select operations x © 2013 Forte Design Systems 22 High-level Synthesis Example: Control & Data Flow Graph The control/dataflow graph (CDFG) analysis would construct the following structure from example_func(): 8 Lexical Processing a 16 Algorithm Optimization 1 8 x t1 17 b 1 Control/Dataflow Analysis + t3 8 33 Library Processing c x3 y 16 t2 8 Scheduling & Binding d x2 8 Design Evaluation e Output Generation © 2013 Forte Design Systems 23 High-level Synthesis Example: Library Processing Assume the characterized library has the following functional units available Functional Unit Delay Area Lexical Processing 8x8=16 2.78 ns 4896.5 Algorithm Optimization 16+16=17 1.99 ns 1440.3 Control/Dataflow Analysis 20x20=40 5.88 ns 27692.6 Library Processing