Virtual Prototypes and Pla orms A Primer
Eyck Jentzsch, MINRES Technologies GmbH Rocco Jonack, MINRES Technologies GmbH Josef Eckmüller, Intel Deutschland GmbH
© Accellera Systems Initiative 1 FOUNDATIONS OF VP
© Accellera Systems Initiative 2 What are not virtual prototypes & pla orms
• These are not virtualiza on solu ons or virtual machines like VirtualBox or VMWare – A virtual machine (VM) is an emula on of a computer system (...) and provides func onality of a physical computer (Wikipedia) – A VM or virtual machine was originally defined by Popek and Goldberg as "an efficient, isolated duplicate of a real computer machine." • Although some concepts and mechanisms are also used in VPs
© Accellera Systems Initiative 3 What is a virtual prototype or pla orm
• Virtual pla orm is a hardware simulator execu ng embedded so ware (eSW) • Usually built on top of an instruc on set simulator (ISS) of the processor(s) connected via buses to memory and peripherals such as mers, general I/O, communica on interface, etc. • While the VP is mostly implemented in SystemC the ISS may be implemented in some other general purpose language.
© Accellera Systems Initiative 4 Differences between virtual prototype and pla orm
• Virtual prototype – Is in development and does not have final structure and par oning. – Possibility to quickly change this – Comprehensive analysis means – Used to op mize SoC design wrt. power, area, speed, throughput, and/or latency • Virtual pla orm – Represents a SoC in its final shape, – Aims at simula on speed as well as func onal accuracy and completeness, – Used for (e)SW development
© Accellera Systems Initiative 5 Alterna ves to VPs
• Emula on – high accuracy and speed – needs Register-Transfer-Level (RTL) descrip on, expensive to scale • FPGA – high accuracy and speed – needs RTL implementa on to do synthesis, place and route è long implement- compile-debug loops • Hardware in the loop (HIL) – very high accuracy and speed (real- me) – comes late in the design process, scalability challenge
© Accellera Systems Initiative 6 VP standards
• Focus on standards for VP development - SW and FW running on VP may have different requirements/standards • Usage of C++ - Performance requirements - Compliance with exis ng models - Most tools provide APIs - Linking of scrip ng environments for dynamic control e.g. Python, TCL, Lua • Standards are important for IP exchange - Internal implementa on might be different - Standards o en imply standard interfaces/APIs - Standards typically allow easier tool deployment • Tools can blur the line between standard and proprietary implementa on - Addi onal func onality provided that is vendor specific - Model libraries may incur tool specific func onality
© Accellera Systems Initiative 7
Modeling techniques
• Several modeling approaches are used to implement a VP and its components – Behavioral/un med – Func onal/loosely med – Cycle-accurate/approximately med – Register-transfer-level • A par cular VP usually is a mix-and-match of these techniques
© Accellera Systems Initiative 8 Behavioral/un med
• The un med model is a func onal IF descrip on having no no on of me • A set of algorithmic blocks F1 communica ng using func on calls or message pipes • Keeps the causality and order of invoca on F2
© Accellera Systems Initiative 9 Func onal/loosely med (LT) • The loosely med model is a structural and behavioral refinement of the OS+SW (IF, F2) func onal model. ACCEL • Mapping of func onal blocks to HW and CPU (F1) SW components and communica on interfaces in-between based on a chosen architecture • Subsystems can execute ‘ahead-of- me’ • Transac ons at communica on MEM I/O interfaces correspond to a complete read or write across a bus or network in physical hardware and provide ming at the level of the individual transac on
© Accellera Systems Initiative 10 Cycle-accurate/approximately med (AT)
• Approximated ming on bus OS+SW communica on and on hardware (IF, F2) ACCEL resource access CPU (F1) – Interface communica on me – Average processing me in hardware IP • Transac ons are broken down into a
MEM I/O number of phases corresponding much more closely to the phasing of par cular hardware protocols
© Accellera Systems Initiative 11 Register-transfer level (RTL)
• Models a block or SoC in terms of the flow of data between hardware registers, and the logical opera ons performed on this data • Star ng point for physical implementa on by synthesis, place, and route
© Accellera Systems Initiative 12 C++ class libraries for modeling
• SystemC is a class library on top of C++ SCML - Structural descrip on elements - Data types TLM2 - Event driven simula on kernel • TLM2 modeling allows for interoperability SystemC - Reuse of exis ng components whether 3rd party or inhouse C++ - Tool environments support TLM2 - Interac on with more cycle-accurate models is well defined – Even bridging into RTL simulators is well defined and supported by (commercial) tools • Produc vity libraries enable faster modeling of common components - Registers, memories, infrastructure tasks - Examples of libraries - Propriatary: Synopsys SCML, Intel ISCTLM, ASTC Genesis - Open Source: Greensocs Greenlib, MINRES SC Components
© Accellera Systems Initiative 13 TLM2 based modeling
• TLM2 ports are the interfaces between IA TLM2 ports modules - Retain connec on to architectural view of SoC - Allows for reuse of components Subsystems, for instance Audio, - Focus on interfaces, not on func onality Graphics, Modem etc. • Provides infrastructure for modeling efficiency - LT modeling allows op miza on of events Ini ator Slave - Direct-memory-interface (DMI) calls can nb_transport_bw b_transport invalidate_direct_mem_ptr nb_transport_fw provide very high memory-access efficiency get_direct_mem_ptr transport_dbg - Temporal decoupling allows models to delay synchroniza on with other components
© Accellera Systems Initiative 14 Modeling efficiency with TLM2 LT
• LT specifies different transport mechanisms IA ISS RAM - b_transport() as blocking access per transac on - DMI calls to request memory access by pointer
• Design goals Subsystems, for instance Audio, - Minimizing synchroniza on with environment Graphics, Modem etc. - Localizing workload - Allows for registra on of callback func ons • Works well in conjunc on with temporal decoupling - Avoid synchroniza on in blocking calls if possible - Processor models can decide when to synchronize
© Accellera Systems Initiative 15 Produc vity Libraries
• Common elements should be modeled on top IA of common classes • Registers, Memories, etc.
• Efficient for stubs or preliminary implementa ons Subsystems, for instance Audio, • Common infrastructure tasks should be Graphics, modeled on top of common classes Modem etc. – Logging – Parametriza on – Domain handling (Clock, reset, power) • SCML, ISCTLM, and Genesis are examples of (proprietary) produc vity libraries – Concepts and implementa ons might serve as star ng point for standardiza on
© Accellera Systems Initiative 16 Register modeling
• Registers are one of the main interface between HW and SW • Registers mean different things to HW and SW groups registers control Write_start() - HW contains many registers, but only few are exposed to SW enable registers set Write_finished() - SW visible registers should be implemented, documented and reset tested src_addr Reset_values() dest_addr • Consider automated code genera on - Large amount of registers - changing requirements - Meta data formats like IPXACT, RDL, RAL • Having a model that allows consistent HW and SW access modeling is important - Access modes, reset and reten on values - Efficient access through callback func ons - Produc vity layer provides register classes
© Accellera Systems Initiative 17 Model meta data
• S tching models together is a recurring IA and error prone task • Using meta data is o en useful – Data visualiza on Subsystems, for – Data exchange instance Audio, Graphics, • IPXACT is a XML based schema for Modem etc. interface descrip ons – Some tools can provide or read IPXACT • XML parsing for simple tasks can be done using Open source libraries • Other examples for meta data representa ons are RDL, RAL, sysML
© Accellera Systems Initiative 18 Improvement of standards
• TLM2 provides a generic transport mechanism and extension mechanisms – Specific protocols are not described • AMBA, PCI, MIPI are le to implementer • Produc vity libraries are not standardized • More support for common infrastructure tasks – Tracing – Profiling – Paralleliza on
© Accellera Systems Initiative 19 Using IP for Virtual Pla orms • VP represents hardware – SoC design typically contains a significant number of 3rd IP components – Ideally all IP should come from same source (RTL, architecture, verifica on, VP models) • Availability of IP for VP is improving, but s ll limited – Availability of models o en defines the usage – Consider Co-simula on • Processor-based models that execute SW – User of VP is exposed to those models • Peripheral models – Execute func onality autonomously, but under control of SW • Interconnect models – Connec ng different elements based on addresses
© Accellera Systems Initiative 20 Modeling processors
SW/FW • Core func onality of VP IA ISS • Need support for high speed, good debug interface, profiling, etc.
Subsystems, for • Most embedded processor providers instance Audio, Graphics, also provide ISS models Modem etc. – Speed versus ming accuracy • Dynamically configurable to run in performance mode or accuracy mode – Links to debug environment and other SW tools should be consistent with user expecta on
© Accellera Systems Initiative 21 Processor modeling techniques
• Interpre ng execu on – Slow but easy to implement and accurate • Just-in- me (JIT) compiling/dynamic binary transla on (DBT) – Virtualiza on library converts target instruc ons to host instruc ons e.g. using LLVM or QEMU right before execu ng them • Host based/host compiled simula ons – Instead of running the target SW/FW binary on a target processor (model), run a host-compiled representa on interac ng with the model of the pla orm – Virtualiza on environments allow running the SW/FW binary
© Accellera Systems Initiative 22 ARM processor models
• ARM fast models – LT based models for speed with debug interfaces • Cycle accurate model based on Carbon tool transla on – Cycle accuracy guaranteed
Source: ARM support website
© Accellera Systems Initiative 23 Tensilica processor models • ISS from Tensilica is highly configurable – Turbo mode (JIT) is using various mechanisms to speed up simula on – Non-turbo mode (interpre ng) is instruc on accurate • Models provide TLM2 based interfaces • Tensilica provides extension to gdb for debugging • Integra on into Tensilica tool environment by a aching simula on
through TCP port Source: Cadence support website
© Accellera Systems Initiative 24 Modeling peripherals • Typical peripherals are UART, PCI, USB, mers – May connect to the ‘real’ world by using the host worksta on resources – Not main func onality, but important to get FW contents right – Synopsys DW library provides TLM2 based versions of various IP – Some vendors react on demand • Various versions of peripherals are common – Upgrading even to a new version may imply changes in the interface • Peripherals can have significant contribu on to system performance – DMAs, peripherals sending video data, real- me constraints Source: screen shot from PlatformCreator
© Accellera Systems Initiative 25 Modeling interconnect
IA • The model that connects everything ISS – Mostly address map in VP context – Consistent view of registers Subsystems, for – May contain some SW visible func onality like instance Audio, firewalls, address shi ing, domain crossings Graphics, Modem etc. • Some libraries and tools contain generic models – Fast ini al implementa on – High performance • Some vendors provide TLM2 LT interconnect model, e.g. NetSpeed – Ensures consistency within design flow Source: screenshot from NetSpeed NocStudio
© Accellera Systems Initiative 26 Custom build models
• Custom models are necessary, for instance – In house IP – Important IP without any available 3rd party model – Extensions to exis ng models • Consider cost of maintenance – Build based on produc vity libraries – Reuse exis ng models for instance un med models • Consider RTL co-simula on or transla on like Carbon tools
© Accellera Systems Initiative 27 Commercial tools available for VP development (exemplary list) • Synopsys Processor Designer, Virtualizer • Cadence Virtual System Pla orm • Siemens Mentor Vista • MathWorks Matlab, Simulink • ASTC VLAB, Processor Modeling Studio • Wind River Simics • Xilinx Vivado • ARM Carbon Model Studio, SoC Designer Plus
© Accellera Systems Initiative 28 SystemRDL, IPXACT & SystemC DEMO
© Accellera Systems Initiative 29 DEVELOPMENT AND APPLICATIONS OF VP
© Accellera Systems Initiative 30 SoC/system & VP development phases
• Behavioral descrip on of the SoC or Defini on/Specifica on system • Par oning and Mapping to Architecture/Design Architecture (HW) and SW • HW & FW development Implementa on
• Func onal, ming, power, ... valida on Valida on
• SW development and further Op miza on op miza on
© Accellera Systems Initiative 31 User roles in the VP dev phases • Pla orm authors, IP authors Defini on/Specifica on
• Pla orm authors Architecture/Design
• Pla orm authors, IP authors Implementa on
• Pla orm authors, pla orm users (SoC Valida on & system integrators) • Pla orm users Op miza on
© Accellera Systems Initiative 32 VP uses cases • Architectural explora on and Defini on/Specifica on performance analysis
Architecture/Design • Func onal verifica on • FW development (driver, HAL, Implementa on OS) • System performance valida on Valida on • Debug & analysis in FW/SW development Op miza on • Con nuous integra on & test- driven SW development
© Accellera Systems Initiative 33 Architectural explora on & performance analysis
• Mapping of a behavioral model to one or more points in the architectural space consis ng of different HW implementa ons • Evalua on based on performance characteris cs for different system architectures, such as a HW/SW split, communica on system, or system components • Typical use case for pla orm authors • Important proper es: accuracy wrt. to performance metrics i.e. ming, latency, throughput, power
© Accellera Systems Initiative 34 Func onal verifica on
• Behavioral model allows early development of system and integra on test thus valida ng the specifica on • Refined VP with final architecture and HW/SW par oning serves as executable specifica on for hardware implementa on
© Accellera Systems Initiative 35 Func onal verifica on • Can be used as a reference model for the SoC implementa on as well as for component development (RTL block verifica on) – Replace (IP-)blocks by their RTL counterpart and validate in system context – VP as part of the block test bench • Reuse of s muli from VP to RTL, component to system – E.g. using portable s muli (Accellera standard)
© Accellera Systems Initiative 36 FW development (driver, HAL & OS) • Use the VP as a vehicle to execute target SW by simula ng HW behavior Implement • Allows shi -le by star ng SW development early in the design process • Used by pla orm authors and system integrators • Important proper es: balance between speed vs. accuracy Build • Comprehensive analysis, visibility and debug Debug means needed – Logging of HW and SW events (e.g. via host I/O) – Signal & transac on tracing – Non-intrusive debugger integra on
© Accellera Systems Initiative 37 System performance valida on
• Validate performance of pla orm and SW in applica on scenarios within target environments – Recep on of data at wireless modems – Event response me in real me cri cal systems • Used by system integrators
© Accellera Systems Initiative 38 Debug & analysis in FW/SW development
• Similar to FW development • Used by system integrators • Focus even more on speed as not only the pla orm is being simulated rather also the target environment e.g. cellular base sta ons, en re motor controller including the control loop containing actuators and sensors
© Accellera Systems Initiative 39 Debug & analysis in FW/SW development
• Comprehensive analysis, visibility, tracing, and debug enables non- intrusive observa on and debug of behavior thus easing the debugging • Tools like Eclipse Trace Compass or impulse allow to fuse HW and FW/ SW events to ease debugging system behavior
© Accellera Systems Initiative 40 Con nuous integra on & test-driven SW development • The all-in-so ware approach allows to deploy modern SW development techniques like con nuous integra on (CI) and test-driven design (TDD) • Each feature is being (unit-) tested so that func onality regressions can be detected early • Each SW change is tested before propaga ng to the main line of development • Allows close monitoring of the eSW development progress • Maximum benefits in situa ons where one so ware system addresses mul ple hardware variants in different system contexts • Used by system integrators
© Accellera Systems Initiative 41 FW debugging in Eclipse using VP as target. DEMO
© Accellera Systems Initiative 42 ADVANTAGES, LIMITATIONS, AND CHALLENGES
© Accellera Systems Initiative 43 Advantages
• Early availability & easy reconfigura on • Observability • System fault injec on & fault simula on • Simula on speed and accuracy • Scalability • Enabler of agile eSW development methodology
© Accellera Systems Initiative 44 Limita ons
• Simula on speed vs. accuracy • Limited mul -threading capabili es in SystemC – advantageous for sub-system integra on and mul -core simula ons • Missing standards for run me configura on, inspec on, logging and
tracing – integra on of IPs from different vendors s ll very tedious
© Accellera Systems Initiative 45 Challenges
• Simula on speed vs. accuracy • Use case broadening (FW/SW performance valida on, dynamic power es ma on) • Integra on of sub-system and systems
– happens at at different levels ( sub-component, component, sub-system, system) – re-use of test infrastructure between levels – mul -core debugging, SMP, AMP – efficient sub-system integra on process – dynamic rou ng and linking – availability of lower FW/SW layer required to integrate higher layer SW • Conflict of interest between pla orm authors, tool and IP vendors – pla orm author need maximal flexibility in terms of IP selec on and tool support – IP and tool vendor aims to lock-in pla orm author and user • Exchange of VP pla orms between pla orm authors and pla orm users – legal and IP protec on challenges
© Accellera Systems Initiative 46 Simula on speed vs. accuracy
• Simula on speed is crucial for the adop on of VPs • Modern ISS reach several tens or hundreds of MIPS depending on the complexity of the processor • Modeling the en re pla orm slows down esp. If components or communica on interfaces need to be modeled in more detail and/or with higher accuracy • Single-threaded SystemC kernel
© Accellera Systems Initiative 47 Simula on speed and accuracy
• Ways to address this: – Mix-n-match approach (coarse and fine grain modeled components) – use case adapted VPs (variants) – Mul -threading and simula on par oning – Simula on check-poin ng – Faster host hardware �
© Accellera Systems Initiative 48 Early availability & easy reconfigura on
• Earlier system availability by agile development – Implemen ng only func onality that is needed for the use case (so ware development, performance analysis) – Incremental extension un l comple on with intermediate deliveries • As communica on is abstracted, a VP can easily be re-par oned – Graphical tools are available to ease this • By using so ware build mechanisms or run me configura on mechanisms it’s easy to create and maintain variants or deriva ves to suit different requirements
© Accellera Systems Initiative 49 Observability
• Simula on can be stopped upon occurrence of important or interes ng events • State of the pla orm (both the hardware and the so ware state) is frozen and can be examined e.g. using debuggers • Non-intrusive debugging: other than ‘real’ hardware the system cannot determine that it has been stopped • Same or be er debugger access than hardware • Extensive logging and tracing e.g. VCD or transac on recording allows comprehensive post-simula on analysis
© Accellera Systems Initiative 50 Observability
• Combining hardware traces and logs with SW events and traces gives even more insight • Detailed analysis of system w/o – Expensive on-chip measurement – Slow HDL simula on
© Accellera Systems Initiative 51 Scalability
• The all-in-so ware approach executes target code on standard hardware • No dependencies on special hardware – Other than FPGA, Emula on or HIL • Each developer has its own target system to debug the code • Easy scalability by adding off-the-shelf hardware like worksta ons or servers
© Accellera Systems Initiative 52 Enabling agile eSW development methodology
• The all-in-so ware approach allows to deploy modern SW development techniques like con nuous integra on (CI) and test-driven design (TDD) • It can be easily scaled to meet developers demand and compu ng requirements for CI • Allows close monitoring of the eSW development progress and provides benefits: – Be er code structure and quality by frequent checking and extrac ng so ware metrics – Easier debug and less impact (less code to roll-back) – Fewer major integra on bugs (detected early and easy to track) – Constant availability of a "current" build for tes ng, demo, or release purposes
© Accellera Systems Initiative 53 References
• IP-XACT, RDL, SystemC/TLM 2.0: h p://www.accellera.org • ASTC VLAB: h p://vlabworks.com/ • ARM support website: h p://infocenter.arm.com • Tensilica support website: h ps://ip.cadence.com/support • GreenSocs Greenlib: h ps://www.greensocs.com/docs • MINRES SC-Components: h ps://www.minres.com/#opensource • Eclipse Trace Compass: h p://tracecompass.org/ • impulse: h p://toem.de/index.php/projects/impulse
© Accellera Systems Initiative 54 Ques ons
© Accellera Systems Initiative 55