Heterogeneous Multicore OpenAMP

Felix Baum, Mentor Embedded

1 Trends Create Opportunities

[Diagram labels: Competitive Market; Consolidation (Reuse, Space, Weight, Power); SoC Technology Advancement; Global Competition; Pace of Innovation; Multiple OSes; Mixed Core Types; Complex Configurations; Increasing Complexity]

2 Heterogeneous Computing for Power/Performance

• Systems that use more than one kind of processor

• Allows improved efficiency for various workloads

• Increases the programming challenge

3 Heterogeneous Computing for Power/Performance

4 SoC FPGA: Heterogeneous Processing Choices

Hard processor system: ARM* Cortex*-A9, A53

• Dual- and quad-core variants

• System management and control

Soft CPU(s): Nios II

• Configurable (performance, number of cores)

• From -programmable state machine to Linux*

Hardware accelerators

• Application-specific hardware in FPGA

• OpenCL*, MATLAB*/Simulink*, custom-designed RTL

• Managed by CPU or autonomous

5 Techniques to Parallelize the Processing: OpenMP

OpenMP (Open Multi-Processing) is an application programming interface (API) that supports multi-platform shared-memory multiprocessing programming in C, C++, and Fortran.

• It consists of a set of directives, library routines, and environment variables that influence runtime behavior

• The user identifies supported constructs and manually inserts directives that help the compiler restructure those constructs into parallel code

• The user does not need to create threads or decide how work is assigned to them

• OpenMP lets users without in-depth knowledge of parallel programming explore parallel computing
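As a minimal sketch of the directive-based style described above, the C function below marks a loop for parallel execution with a single OpenMP directive. The function name `sum_of_squares` is an illustrative assumption, not something from the slides. It assumes a compiler with OpenMP enabled (for example via `-fopenmp`); without it the pragma is ignored and the loop runs serially.

```c
#include <assert.h>
#include <stddef.h>

/* Sum the squares of n values.
 * The directive asks an OpenMP-enabled compiler to split the loop
 * iterations across threads and to combine the per-thread partial sums
 * into `total`. No thread is created or scheduled by hand. */
double sum_of_squares(const double *values, size_t n)
{
    assert(values != NULL || n == 0);

    double total = 0.0;
    #pragma omp parallel for reduction(+:total)
    for (long i = 0; i < (long)n; i++) {
        total += values[i] * values[i];
    }
    return total;
}
```

The `reduction(+:total)` clause is what keeps the accumulation free of data races: each thread works on a private copy of `total`, and the copies are added together when the loop ends.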

6 Techniques to Parallelize the Processing: OpenCL*

OpenCL* (Open Computing Language) is the industry’s open standard for writing data-parallel code in heterogeneous systems.

• OpenCL is promoted as a primary approach to programming parallel computing hardware. Writing efficient parallel code in OpenCL requires low-level understanding and competence.

• OpenCL primarily targets the data parallelism of devices, for example by offloading CPU-intensive algorithms to GPUs. Additional considerations are needed to use the multiple cores available on the CPUs in the system.

The multicore framework listed in this paper differs from OpenCL: it does not try to parallelize processing for the sake of performance, but isolates and separates execution blocks based on performance or power consumption requirements.
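To make the data-parallel model concrete, here is a rough sketch that emulates it in plain C so that it runs without an OpenCL runtime. This is not OpenCL host code: the names `vec_add_work_item` and `emulate_vec_add` are hypothetical, and the serial loop only stands in for the dispatch a real device would perform in parallel.

```c
#include <assert.h>
#include <stddef.h>

/* Body of a data-parallel kernel: computes exactly one output element.
 * In OpenCL C the equivalent kernel would obtain its index with
 * get_global_id(0) instead of receiving it as a parameter. */
static void vec_add_work_item(size_t gid, const float *a, const float *b,
                              float *out)
{
    out[gid] = a[gid] + b[gid];
}

/* Stand-in for the runtime's dispatch over an index space of
 * `global_size` work-items. Every iteration is independent of the
 * others, which is what allows a device to execute them concurrently. */
void emulate_vec_add(const float *a, const float *b, float *out,
                     size_t global_size)
{
    assert(a != NULL && b != NULL && out != NULL);

    for (size_t gid = 0; gid < global_size; gid++) {
        vec_add_work_item(gid, a, b, out);
    }
}
```

The point of the sketch is the shape of the work: one small function applied across an index space. The low-level effort the slide refers to (buffers, command queues, device selection) is exactly what this emulation leaves out.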

7 Techniques to Parallelize the Processing: OpenAMP

The multicore framework listed in this paper differs from OpenCL* and OpenMP as it does not try to parallelize processing for the sake of performance, but tries to isolate and separate execution blocks based on performance or power consumption requirements.

OpenAMP defines how operating systems interact:

• Focused on IPC and booting frameworks

• In particular between Linux* and RTOS/bare-metal

• In particular in heterogeneous systems

8 Multicore Configurations

SMP – Symmetric MultiProcessing; uAMP – unsupervised Asymmetric MultiProcessing; sAMP – supervised AMP, aka virtualized

[Diagram, built up over eight consecutive slides: multicore configurations on Cortex* A cores, grouped as SMP, Homogeneous (sAMP), and Heterogeneous (sAMP). Labels in the Linux*-based row: Linux (SMP), Linux (master), RTOS, Bare Metal Env., Hypervisor. Labels in the RTOS-based row: RTOS (SMP), RTOS (master), Linux, Bare Metal Env., Hypervisor.]

16 Complexity Skyrockets

Extreme complexity is introduced with general purpose development

• System architecture

• Configuration

• Booting

• Debugging

• Separation

• Device sharing

• Inter-processor communication

• Security

[Chart: Development Complexity (vertical axis) across Single Core, Dual Cores, Multiple Cores, and Heterogeneous Multiple Cores (horizontal axis)]

17 The OpenAMP Standard

• OpenAMP is an open source project managed by the Multicore Association

• OpenAMP standardizes how operating systems interact in heterogeneous systems
  - Focused on inter-processor communications (IPC) and booting frameworks
  - In particular between Linux* and RTOS/bare metal

• Guiding principles:
  - Open source implementations for Linux and RTOSes
  - Prototype and prove in open source before standardizing
  - Business-friendly APIs and implementations to allow proprietary solutions

18 The OpenAMP Collaboration

Mentor Embedded

• Participates on MCA Board of Directors

• Serves in a Technical Advisor role on the MCA OpenAMP Committee

• Contributed most of the initial code to the OpenAMP git tree

• Mentor Embedded Multicore Framework is a first commercial implementation of the standard

19 OpenAMP: Core Management

• Used by master OS to boot remote OSes on remote CPUs

• remoteproc user API for processor lifecycle management

• Conformance to upstream Linux* remoteproc implementation

• Standalone, OS-agnostic clean-room implementation of the remoteproc API

Where A could be Cortex* A7/A9/A15/A53…
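The lifecycle management a master OS performs on a remote core can be sketched as a small state machine. Everything below is a hypothetical illustration: the `demo_rproc_*` names and the three states are assumptions made for this sketch and are not the remoteproc API of Linux or of the OpenAMP library, which define their own functions and richer state sets.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical lifecycle states for a remote core. */
enum demo_rproc_state {
    DEMO_RPROC_OFFLINE,  /* remote core idle, no firmware selected */
    DEMO_RPROC_LOADED,   /* firmware image selected by the master */
    DEMO_RPROC_RUNNING   /* remote core executing its firmware */
};

struct demo_rproc {
    enum demo_rproc_state state;
    const void *image;   /* firmware image supplied by the master */
    size_t image_size;
};

void demo_rproc_init(struct demo_rproc *rp)
{
    assert(rp != NULL);
    rp->state = DEMO_RPROC_OFFLINE;
    rp->image = NULL;
    rp->image_size = 0;
}

/* Record the firmware image. A real implementation would also parse
 * the image and copy its segments into the remote core's memory.
 * Returns 0 on success, -1 if the request is invalid in this state. */
int demo_rproc_load(struct demo_rproc *rp, const void *image, size_t size)
{
    assert(rp != NULL);
    if (rp->state != DEMO_RPROC_OFFLINE || image == NULL || size == 0)
        return -1;
    rp->image = image;
    rp->image_size = size;
    rp->state = DEMO_RPROC_LOADED;
    return 0;
}

/* Mark the remote as running. On hardware this is where the master
 * would release the remote core from reset. */
int demo_rproc_start(struct demo_rproc *rp)
{
    assert(rp != NULL);
    if (rp->state != DEMO_RPROC_LOADED)
        return -1;
    rp->state = DEMO_RPROC_RUNNING;
    return 0;
}

/* Stop the remote and return to OFFLINE so a new image can be loaded. */
int demo_rproc_stop(struct demo_rproc *rp)
{
    assert(rp != NULL);
    if (rp->state != DEMO_RPROC_RUNNING)
        return -1;
    rp->state = DEMO_RPROC_OFFLINE;
    rp->image = NULL;
    rp->image_size = 0;
    return 0;
}
```

Rejecting out-of-order calls (for example, starting a core before an image is loaded) is the main thing the sketch is meant to show: the master owns the remote's lifecycle and enforces the order load, start, stop.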

20 OpenAMP: Inter Process Communication

• For inter-processor communications between OS/software contexts

• rpmsg user API for inter-processor communication

• Conformance to upstream Linux* rpmsg implementation

• Standalone, OS-agnostic clean-room implementation of virtio and rpmsg for bare-metal or other non-Linux OSes
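The message-passing idea behind rpmsg can be illustrated with a much simplified, single-threaded sketch: fixed-size message slots in a ring, each tagged with source and destination addresses. The `demo_*` names, slot count, and payload size are assumptions for illustration only. This is not the rpmsg or virtio API, and it omits what a real shared-memory transport needs, such as memory barriers, cache maintenance, and inter-processor notifications.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_RING_SLOTS  4u   /* power of two, so index wrap-around is safe */
#define DEMO_PAYLOAD_MAX 32u

/* One message slot: addressing information plus payload. */
struct demo_msg {
    uint32_t src;   /* sender's endpoint address */
    uint32_t dst;   /* receiver's endpoint address */
    uint16_t len;   /* number of valid payload bytes */
    uint8_t  data[DEMO_PAYLOAD_MAX];
};

/* Ring that would live in memory visible to both processors. */
struct demo_channel {
    struct demo_msg slots[DEMO_RING_SLOTS];
    uint32_t head;  /* advanced only by the sender */
    uint32_t tail;  /* advanced only by the receiver */
};

/* Queue one message. Returns 0 on success, -1 if the payload is too
 * large or the ring is full. */
int demo_send(struct demo_channel *ch, uint32_t src, uint32_t dst,
              const void *payload, uint16_t len)
{
    assert(ch != NULL);
    if (len > DEMO_PAYLOAD_MAX || (len > 0 && payload == NULL))
        return -1;
    if (ch->head - ch->tail == DEMO_RING_SLOTS)
        return -1;  /* full */

    struct demo_msg *m = &ch->slots[ch->head % DEMO_RING_SLOTS];
    m->src = src;
    m->dst = dst;
    m->len = len;
    if (len > 0)
        memcpy(m->data, payload, len);
    ch->head++;
    return 0;
}

/* Dequeue the oldest message into `out`. Returns the payload length,
 * or -1 if the ring is empty or `out` is too small. */
int demo_receive(struct demo_channel *ch, void *out, uint16_t capacity,
                 uint32_t *src)
{
    assert(ch != NULL && out != NULL);
    if (ch->head == ch->tail)
        return -1;  /* empty */

    const struct demo_msg *m = &ch->slots[ch->tail % DEMO_RING_SLOTS];
    if (m->len > capacity)
        return -1;
    memcpy(out, m->data, m->len);
    if (src != NULL)
        *src = m->src;
    int len = m->len;
    ch->tail++;
    return len;
}
```

Having each index written by only one side is a common design choice for such rings, because it lets sender and receiver run on different processors without a shared lock.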

21 OpenAMP: Use Cases

• Separation of applications. One example: separating other tasks from critical processing tasks.

• Offload work for computationally intensive operations such as processing, encryption of secure, sensitive data

22 OpenAMP: Medical Use Case Real-Time Display

Vitals Data Acquisition

Web View

23 Summary

• Heterogeneous multicore is here to stay and has great potential to help with both consolidation and isolation

• Making multiple and varied operating environments run side by side is hard work; OpenAMP makes it possible while minimizing schedule risk

• OpenAMP unleashes the ‘silicony goodness’ by providing developers with the framework and tools to optimize the processing and extract the most out of the underlying hardware

24 Next Steps

• SoC FPGA Overview - portfolio, tools, ecosystem, road map https://www.altera.com/socfpga

• Low-cost Development Platform - evaluation kit https://www.altera.com/atlassoc

• Mentor Embedded Multicore Framework www.mentor.com/embedded-software/multicore

• White Paper: Architecting Success with Heterogeneous Systems

• Webinar: Challenges of Debugging Heterogeneous Multicore SoCs

26 Q&A

27 Legal Notices and Disclaimers

• Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Performance varies depending on system configuration. No system can be absolutely secure. Check with your system manufacturer or retailer or learn more at intel.com.

• Tests document performance of components on a particular test, in specific systems. Differences in hardware, software, or configuration will affect actual performance. Consult other sources of information to evaluate performance as you consider your purchase. For more complete information about performance and benchmark results, visit http://www.intel.com/performance.

• Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more complete information visit http://www.intel.com/performance.

• Cost reduction scenarios described are intended as examples of how a given Intel-based product, in the specified circumstances and configurations, may affect future costs and provide cost savings. Circumstances will vary. Intel does not guarantee any costs or cost reduction.

• This document contains information on products, services and/or processes in development. All information provided here is subject to change without notice. Contact your Intel representative to obtain the latest forecast, schedule, specifications and roadmaps.

• No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document.

• Statements in this document that refer to Intel’s plans and expectations for the quarter, the year, and the future, are forward-looking statements that involve a number of risks and uncertainties. A detailed discussion of the factors that could affect Intel’s results and plans is included in Intel’s SEC filings, including the annual report on Form 10-K.

• All products, computer systems, dates and figures specified are preliminary based on current expectations, and are subject to change without notice. The products described may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request.

• Intel does not control or audit third-party benchmark data or the web sites referenced in this document. You should visit the referenced web site and confirm whether referenced data are accurate.

• © 2016 Intel Corporation. Intel, the Intel logo, and others are trademarks of Intel Corporation in the U.S. and/or other countries.

• Altera, Arria, Cyclone, Enpirion, Max, Megacore, Nios, Quartus and Stratix words and logos are trademarks of Altera and registered in the U.S. Patent and Trademark Office and in other countries.

• *Other names and brands may be claimed as the property of others.
