Connecting Software to Silicon China Outreach, October 2018

Neil Trevett | Khronos President NVIDIA | VP Developer Ecosystems

Copyright © Khronos® Group Inc. 2018 - Page 1 Khronos Mission Asian Members

Software

Silicon

Khronos is an International Industry Consortium creating royalty-free, open standards to enable software to access hardware acceleration for 3D Graphics, Virtual and Augmented Reality, Parallel Computing, Neural Networks and Vision Processing

Copyright © Khronos® Group Inc. 2018 - Page 2 What is An Open Standard?

An INTEROPERABILITY STANDARD enables two entities to COMMUNICATE. E.g. Software <-> Hardware

Bad Standards Good Standards - Overprescribes implementation details - Prescribes ONLY interoperability - Forces lowest common denominator - Enables implementation diversity - Stifles innovation – Encourages innovation -> Commoditization -> Differentiation

A Truly OPEN standard Is not controlled by a single company – but by the whole industry Is freely available to use by any company without royalty payments Has a well defined IP Framework to protect standard AND member’s IP

Standards Grow Markets… … Reduce Costs … … Accelerate Time to Market By reducing consumer confusion and By sharing development between With well-proven functionality, increasing capabilities and usability many companies and driving volume testing and interoperability

Copyright © Khronos® Group Inc. 2018 - Page 3 Khronos Standards

Download 3D object and scene data Vision and sensor processing - including neural network inferencing for machine learning

High-performance, low-latency 3D Graphics

Portable interaction with VR/AR sensor, haptic and display devices

Copyright © Khronos® Group Inc. 2018 - Page 4 Vulkan and New Generation GPU

Modern architecture | Low overhead| Multi-thread friendly EXPLICIT GPU access for EFFICIENT, LOW-LATENCY, PREDICTABLE performance

Vulkan Porting Tools

Non-proprietary, royalty-free open standard ‘By the industry for the industry’ Portable across multiple platforms - desktop and mobile

Copyright © Khronos® Group Inc. 2018 - Page 5 What is an Explicit API? Explicit API • Provides information at the right time Timeline(s) state definitions - To prepare for efficient execution API Object • Better scheduling control over CPU and GPU generation - Split multi-threaded work creation from work submission object - Explicit synchronization primitives usage • Predictable performance costs - Creating pipelines, allocating memory … Legacy API state context • No driver magic on-the-clock manipulation - Removes guesswork and late decision-making • Simpler drivers Implicit Internal - The application has the most complete Validation & knowledge for holistic optimization Objects

Copyright © Khronos® Group Inc. 2018 - Page 6 Vulkan Explicit GPU Control Vulkan = high performance and low latency 3D and GPU Compute Ideal for VR/AR applications

Resource management offloaded to app: low-overhead, low-latency Complex drivers Application driver cause overhead Single thread per context and inconsistent Application Consistent behavior: behavior across Memory allocation Multiple Front-end no ‘fighting with driver vendors Thread management heuristics’ Synchronization Compilers High-level Driver GLSL, HLSL etc. Always active Multi-threaded generation Validation and debug layers of command buffers error handling Abstraction loaded only when needed Layered GPU Control Full GLSL Context management SPIR-V SPIR-V intermediate preprocessor and Memory allocation pre-compiled shaders language: shading language compiler in Full GLSL compiler flexibility Error detection driver Thin Driver Loadable debug and Explicit GPU Control validation layers Multi-threaded command OpenGL vs. creation. Multiple graphics, OpenGL ES command and DMA queues GPU GPU Unified API across all platforms with feature set flexibility Vulkan 1.0 provides access to OpenGL ES 3.1 / OpenGL 4.X-class GPU functionality but with increased performance and flexibility

Copyright © Khronos® Group Inc. 2018 - Page 7 Vulkan Multi-threading Efficiency 1. Multiple threads can construct Command Buffers in parallel Application is responsible for thread management and synch CPU Thread Command Buffer

CPU Thread Command CPU Buffer Thread

CPU Command Thread Command Buffer Queue GPU

CPU Command Thread 2. Command Buffers placed in Command Buffer Queue by separate submission thread

CPU Thread Command Buffer

Copyright © Khronos® Group Inc. 2018 - Page 8 Vulkan Tools Architecture • Layered design for cross-vendor tools innovation and flexibility - IHVs plug into a common, extensible architecture for code validation, debugging and profiling during development without impacting production performance • Khronos Open Source Loader enables use of tools layers during debug - Finds and loads drivers, dispatches API calls to correct driver and layers

Production Path Interactive (Performance) Debug Layers can be Debugger Vulkan-based Title installed during Development

Validation and Sim Layers Vulkan’s Common Loader Debug Layers

IHV’s Installable Client Debug information via Driver standardized API calls

Copyright © Khronos® Group Inc. 2018 - Page 9 Vulkan – Now Available Everywhere

VR-Related Features NOW Multi-GPU support Multiview Rendering Context priority Front buffer rendering

IN DISCUSSION Variable Rate Rendering Tiled rendering (beam racing)

Copyright © Khronos® Group Inc. 2018 - Page 10 Vulkan 1.1 Launch and Ongoing Momentum Strengthening the Ecosystem Improved developer tools (SDK, validation/debug layers) More rigorous conformance testing Building Vulkan’s Future Shader toolchain improvements (size, speed, robustness) Deliver complete ecosystem – not just specs Shading language flexibility – HLSL and OpenCL C support Listen and prioritize developer needs Vulkan Public Ecosystem Forum Drive GPU technology

Released Vulkan 1.1 Extensions KHR_draw_indirect_count Vulkan 1.0 Extensions Source draw count parameter from a buffer in GPU-writable Maintenance updates plus additional functionality memory for greater flexibility for GPU-generated work Explicit Building Blocks for VR: KHR_8bit_storage e.g. multiview 8-bit types in uniform and storage buffers for improved compute Explicit Building Blocks for support in apps such as inferencing and vision Homogeneous Multi-GPU EXT_descriptor_indexing Enhanced Windows System Integration Dynamically non-uniform (aka bindless) resource access Increased Shader Flexibility: Required by some modern game engine architectures February 2016 16 bit storage, Variable Pointers Enhanced Cross-Process and Discussions in Flight Vulkan 1.0 March 2018 Cross-API Sharing Vulkan 1.1 Reduced precision arithmetic types in shaders Detailed driver property queries Integration of 1.0 Extensions. Variable-resolution rendering New Technology into Core e.g. Cross-vendor performance counter queries Subgroup Operations Memory residency management Depth/stencil resolve Ray tracing Widening Platform Support Video Pervasive GPU vendor native driver availability New synchronization primitives Open source drivers – ANV (Intel), AMDVLK/RADV (AMD) Port Vulkan apps to macOS/iOS and DX12

Copyright © Khronos® Group Inc. 2018 - Page 11 New Functionality in Vulkan 1.1 • Protected Content - Restrict access or copying from resources used for rendering and display - Secure playback and display of protected multimedia content • Subgroup Operations - Efficient mechanisms that enable parallel shader invocations to communicate - Wide variety of parallel computation models supported

Example Subgroup Operations A subgroup is a set of invocations (tasks) running on a GPU Compute Unit (Note many GPUs typically support subgroup sizes of 32/64 invocations)

5 2 1 6 7 9 3 4 3 1 3 2 2 3 Active Invocations

+ X Inactive Invocations

5 5 5 5 3 9 9 7 9 9 9 9 6 6 Broadcast Shuffle Add Clustered Multiply

Copyright © Khronos® Group Inc. 2018 - Page 12 Content is shipping on desktop…

Vulkan-only AAA Titles on PC

Dota 2 on PC and macOS

AAA titles on Linux

Copyright © Khronos® Group Inc. 2018 - Page 13 …and Mobile

Plus…. Lineage 2 Revolution Heroes of Incredible Tales Dream League Soccer…

Copyright © Khronos® Group Inc. 2018 - Page 14 Fortnite on Android!

Fortnite on Android uses Vulkan on select phones for optimal performance, including the best-performing Samsung – the Galaxy Note9

Copyright © Khronos® Group Inc. 2018 - Page 15 Vulkan Developer Activity – SDK and GitHub

LunarG Vulkan SDK Download rate increases every year since launch http://vulkan.lunarg.com

SIGGRAPH 2016 Vulkan Related GitHub Repos SIGGRAPH 2018 2500

2000

1500 SIGGRAPH 2017

1000

500

0 SIGGRAPH GDC SIGGRAPH GDC SIGGRAPH 2016 2017 2017 2018 2018

Copyright © Khronos® Group Inc. 2018 - Page 16 NVIDIA Vulkan Raytracing Vendor Extension • Vulkan Raytracing Extension to access RTX Raytracing - Running on Turing-based boards • Nsight Graphics with increased Vulkan support - Including Raytracing extension

Copyright © Khronos® Group Inc. 2018 - Page 17 OpenGL and OpenGL ES

January 2018 June 2018 OpenGL 4.6 conformance test suite OpenGL ES CTS 3.2.5.0 released in open source released in open source Intel and NVIDIA released conformant Raises the quality bar for OpenGL OpenGL 4.6 drivers 3.2 implementations OpenGL ES still the most prevalent April 2018 3D API (billions of units!) More conformant products added OpenGL 4.6.0.1 CTS bugfix update OpenGL ES 3.2 adoption increasing released in April

Working Group Meetings Merged under one Chairperson for Improved Efficiency 13.9% 30.6% GLSL and ESSL specs merged and migrated from LibreOffice to AsciiDoctor to improve maintainability and reduce divergence OpenGL 4.6, OpenGL ES 3.2, GLSL 4.60 and ESSL 3.20 specs June 2018 Lots of bug fixes - many leveraged from open GitHub projects 23.8% 31.7%

Google data – July 2018

Copyright © Khronos® Group Inc. 2018 - Page 18 WebGL Stack

Content downloaded Content from the Web JavaScript, HTML, CSS, ...

Low-level WebGL API provide a Middleware provides accessibility JavaScript Middleware powerful foundation for a rich for non-expert programmers JavaScript middleware E.g. three.js library ecosystem

CSS Reliable WebGL Browser provides WebGL relies on work by 3D engine alongside other HTML5 both GPU and technologies - no plug-in required JavaScript HTML5 Browser Vendors -> Khronos has the right membership OS Provided Drivers to enable that WebGL uses OpenGL ES 2.0 or cooperation Angle for OpenGL ES 2.0 over DX9

Copyright © Khronos® Group Inc. 2018 - Page 19 OpenGL ES and WebGL Evolution

Pervasive OpenGL ES 2.0 Desktop Graphics OpenGL and OpenGL ES ships on every desktop and mobile OS. Textures: NPOT, 3D, Depth, Arrays, Int/float 3D on the Web is enabled! Objects: Query, Sync, Samplers Apple does not ship Seamless Cubemaps, Integer vertex attributes OpenGL ES 3.1 Advanced Graphics Mobile Graphics Multiple Render Targets, Instanced rendering Cannot bring compute Tessellation and geometry shaders Programmable Vertex Transform feedback, Uniform blocks shaders into core WebGL ASTC Texture Compression and Fragment shaders Vertex array objects, GLSL ES 3.0 shaders Floating point render targets Compute Shaders Debug and robustness for security

2007 2012 2014 20162015 OpenGL ES 2.0 OpenGL ES 3.0 OpenGL ES 3.1 OpenGL ES 3.2 Compute Shaders WebGL 2.0 Compute Context Multiview extension 4 years Work in 2011 Progress WebGL 1.0 5 years March 2017 WebGL 2.0 Conformance Testing is vital for Cross-Platform Reliability WebGL 2.0 conformance tests are very thorough 10x more tests than WebGL 1.0 tests

Copyright © Khronos® Group Inc. 2018 - Page 20 WebGL Momentum – WebGL 2.0 is Here!

WebGL 1.0 93.26% Globally

http://caniuse.com/#feat=webgl WebGL 2.0 65.77% Globally

WebGL 2.0 brings Desktop-class graphics to the Web The time to create a new class of

Will reach WebGL 1.0 levels when Safari and Edge ship Web-based 3D Apps is now!

Copyright © Khronos® Group Inc. 2018 - Page 21 Khronos Standards

Download 3D object and scene data Vision and sensor processing - including neural network inferencing for machine learning

High-performance, low-latency 3D Graphics

Portable interaction with VR/AR sensor, haptic and display devices

Copyright © Khronos® Group Inc. 2018 - Page 22 glTF – The JPEG of 3D! glTF spec development on open GitHub – get involved! https://github.com/KhronosGroup/glTF

Compact to Transmit Simple and Fast to Load Describes Full Scenes Runtime Neutral Open and Extensible glTF 2.0 – June 2017 Efficient, reliable transmission glTF 1.0 – December 2015 Native AND Web APIs Bring 3D assets into 1000s of Primarily for WebGL Physically Based Rendering apps and engines Uses GLSL for materials Metallic-Roughness and Specular-Glossiness

Copyright © Khronos® Group Inc. 2018 - Page 23 glTF 2.0 Scene Description Structure

. (JSON) Node hierarchy, PBR material textures, cameras

.bin .png Geometry: vertices and indices .jpg Animation: key-frames ... Skins: inverse-bind matrices Textures

Mandatory Metallic-Roughness Materials

Optional Specular-Glossiness Materials

Geometry Texture based PBR materials

Copyright © Khronos® Group Inc. 2018 - Page 24 glTF Ecosystem Repositories Creation Tools

Discover

Windows Mixed Sony 3D Creator Create Experience Reality Home Oculus

Modo Paint 3D Drive Demand

Mixed Reality 3D Builder Viewer Collada2gltf gltf-vscode FBX2glTF Prep for 3D printing

glTF-validator glTF-asset-generator Apps and Engines Users Copyright © Khronos® Group Inc. 2018 - Page 25 glTF Recent Highlights

Microsoft makes glTF files as pervasively usable as JPGs in Windows 10 Open sources entire glTF SDK! https://github.com/Microsoft/glTF-SDK

Adds glTF to StemCell - 60K+ 3D artists and 700K 3D models https://www.khronos.org/blog/turbosquid-adds-gltf-to-supported-formats-for-its-stemcell-initiative

Supports drag and drop of glTF models to your feed https://developers.facebook.com/blog/post/2018/02/20/3d-posts-facebook/

Over 150K royalty-free glTF models https://sketchfab.com/features/gltf

Dimension generates glTF for delivery of 3D marketing assets https://www.khronos.org/assets/uploads/developers/library/2018-gdc--and-gltf/glTF-Adobe-Dimension-GDC_Mar18.pdf

Integrating glTF into HUBS Web VR Meeting Space https://www.roadtovr.com/mozillas-hubs-one-click-vr-meeting-space-ive-waiting/

3D Tiles community standard proposal references glTF for streaming 3D geospatial datasets http://www.opengeospatial.org/pressroom/pressreleases/2829

Import of glTF into AR Core apps via the Sceneform Tools plugin Draco Mesh Compression Technology Contributed as an extension https://www.khronos.org/news/press/khronos-announces-gltf-geometry-compression-extension-google-draco

Copyright © Khronos® Group Inc. 2018 - Page 26 glTF Roadmap • glTF manages its roadmap very carefully – complexity is the enemy - Mission #1: ensure widespread, consistent, reliable usage • Optional Draco Mesh Compression Extension Shipping Now - Adoption enabled with open source encoders and decoders • Extensions and projects in discussion… - Draco compressed Point clouds, Universal compressed textures, LODs and streaming - Nexgen PBR materials, Advanced animations (e.g. Avatars and Face emoji), Metadata • Developing reference viewer for visual consistency across web and native engines

Domain-specific extensions stay as extensions Stable Mesh Compression Core Spec New widely needed Ratios functionality ships first as extensions

Integrate extensions into new core spec only when: 1) Widespread need is confirmed by the industry 2) Widespread reliable implementation is enabled (e.g. open source)

Copyright © Khronos® Group Inc. 2018 - Page 27 Texture Transmission Extension in Progress

.js, C++, GLSL/HLSL open source transcoders convert Encoding decoupled from data on-the-fly. GPU target device. Progressive Decode – stream Texture One encode pass per only up to desired Original texture asset quality/resolution per target TextureOriginal Original Universal AssetsTexture Encode and Transcode GPU AssetsTexture Supercompress Texture to GPU formats Texture Assets Assets

Transcodable, supercompressed textures for efficient transmission GPU 25% size of the equivalent native GPU encoding. Texture Rate-distortion optimization (RDO) for fine-grain control over quality vs bitrate. Optional LZ/ANS lossless codec stage for maximized compression efficiency. Transcode to a format that Support for both low precision and high-precision modes to support transcoding to the full is natively GPU-accelerated range of industry standard GPU formats on platform: BC1-5, ETC1/2, ASTC, BC6H/7, PVRTC Extension in design - welcome industry feedback https://github.com/KhronosGroup/glTF-Texture-Transmission-Tools

Copyright © Khronos® Group Inc. 2018 - Page 28 With Consumer Capture – 3D Will Go Social!

3D Descriptor Search Database

Manufacturers provide 3D Object Descriptors - much more information than Amazon-style 2D search

Social Loop Object Photos -> Facebook Upload, View, Share Capture Videos -> YouTube and Comment 3D -> glTF!

3D Printing Print Inspire and (e.g. Shapeways.com) Motivate

Copyright © Khronos® Group Inc. 2018 - Page 29 Khronos Standards for AR and VR

Download 3D object and scene data Vision and sensor processing - including neural network inferencing for machine learning

High-performance, low-latency 3D Graphics

Portable interaction with VR/AR sensor, haptic and display devices

Copyright © Khronos® Group Inc. 2018 - Page 30 The Need for VR Standards

Vertically Integrated One-off Solutions in 1990s High-volume Consumer Platforms Today!

Interoperability standards so Consumer platforms have insatiable need for content content can run on all VR devices! to drive demand - can’t afford silo’d content! Compete on adding user value – Consumer confidence – will my device run the content not restricting user choice! I want? Avoid Betamax vs. VHS Syndrome!

VR/AR will build on a HMD Connectivity Wireless connectivity and security constellation of standards

VR Video Formats Acceleration APIs XR in the Browser

Copyright © Khronos® Group Inc. 2018 - Page 31 XR = AR + VR

V1.0 - focused on VR V A After 1.0 – equal focus on AR

Copyright © Khronos® Group Inc. 2018 - Page 32 OpenXR – Solving VR/AR Fragmentation

VR AR VR AR VR AR VR AR App App App App App App App App 1 2 3 4 1 2 3 4

Proprietary Proprietary WebXR Engine WebXR Engine

Application Interface

Device Integration Layer

VR VR VR VR VR VR VR VR VR VR Device Device Device Device Device Device Device Device Device Device 1 2 3 4 5 1 2 3 4 5 Before OpenXR After OpenXR XR Market Fragmentation Wide interoperability of XR apps and devices

Copyright © Khronos® Group Inc. 2018 - Page 33 OpenXR Architecture

Portable AR/VR Input Device Discovery Multiple Sensor Tracking Device Events Pose Normalization Haptics Control Optical Corrections

OpenXR doesn’t replace AR/VR runtimes! It enables those runtimes to use PORTABLE APIs to expose their functionality

Copyright © Khronos® Group Inc. 2018 - Page 34 Companies Publicly Supporting OpenXR

OpenXR is a collaborative design Integrating many lessons from proprietary ‘first-generation’ API designs

Copyright © Khronos® Group Inc. 2018 - Page 35 Input and Haptics Abstraction • Apps use abstracted Input Actions - E.g. “Move,” “Jump,” “Teleport” • Device Events mapped to Input Actions - For the specific system in use • Many advantages - Content can use new devices with no code changes - Can mix-and-match multiple input sources to create a unified UI - Easy optional feature support (e.g. eye and body tracking) - Future-proofing for innovation in input devices and form factors

Copyright © Khronos® Group Inc. 2018 - Page 36 OpenXR Viewport Configuration Flexibility • Applications can: - Query for runtime supported Viewport Configurations - Applications can then set the Viewport Configurations that they plan to use - Select and change their active configuration over the lifetime of the session

Camera Passthrough AR Stereoscopic VR / AR Projection CAVE-like

Photo Credit: Dave Pape One Viewport Two Viewports (one per eye) Twelve Viewports (six per eye)

/viewport_configuration/ar_mono/magic_window /viewport_configuration/vr/hmd /viewport_configuration/vr_cube/cave_vr

Copyright © Khronos® Group Inc. 2018 - Page 37 OpenXR Development Process

Call for Participation / Exploratory Group Formation Fall F2F, October 2016: Korea

Statement of Work / Working Group Formation Winter F2F, January 2017: Vancouver

Specification Work Spring F2F, April 2017: Amsterdam Much more detailed specification overview Interim F2F, July 2017: Washington and SIGGRAPH session videos: https://www.khronos.org/developers/library/2018-siggraph Defining the MVP Fall F2F, September 2017: Chicago

Resolving Implementation Issues Winterim F2F, November 2017: Washington Winter F2F, January 2018: Taipei

First Public Information GDC, March 2018 Multiple implementations Underway! Present Day First Public Demonstrations! Final specification will incorporate SIGGRAPH, August 2018 Coming Soon implementation feedback Release Provisional Specification

Conformance Tests and Feedback Finalize Adopters Program Implementations

Ratify and release Final Specification and Enable Conformant Implementations to Ship

Copyright © Khronos® Group Inc. 2018 - Page 38 Epic ‘Showdown’ VR Demo at SIGGRAPH

Demo runs portably across StarVR and Microsoft Windows Mixed Reality headsets through the OpenXR APIs via an Unreal Engine 4 plugin https://www.youtube.com/watch?v=FCAM-3aAzXg&t=17250s

Copyright © Khronos® Group Inc. 2018 - Page 39 Mobile Augmented Reality Libraries Encapsulated Vision-based Functionality Also leveraging motion sensors ARKit Pose Tracking Yes Yes Plane Detection and Tracking Yes Yes Image Recognition and Tracking Yes Yes 3D Object Scanning, Recognition and Tracking Yes No Ambient light level and temperature Yes Yes 3rd party AR Libraries Environment Light Probes Yes No typically use Environment Reflective Texturing Yes No ARKit/ARCore if available or Link to Neural Net-based Object Detection Yes Yes implement own Access to Point Cloud Yes Yes tracking and Camera Intrinsics Yes Yes functionality if not Multi-user and persistent cloud-based anchors Yes Yes Multi-user Viewing Yes No Face tracking (iPhone X) Yes No OS Availability iOS Android and iOS

Copyright © Khronos® Group Inc. 2018 - Page 40 Bringing VR and AR to the Web

Native XR Apps Web XR Apps Future versions of OpenXR will include cross-platform extended AR functionality WebXR 3D Engines 3D Engines System-exposed AR Capabilities

Close ongoing collaboration between WebXR and OpenXR

Khronos providing the foundation for 3D and XR in the Web and native stacks

Copyright © Khronos® Group Inc. 2018 - Page 41 Khronos Standards for AR and VR

Download 3D object and scene data Vision and sensor processing - including neural network inferencing for machine learning

High-performance, low-latency 3D Graphics

Portable interaction with VR/AR sensor, haptic and display devices

Copyright © Khronos® Group Inc. 2018 - Page 42 Machine Learning Acceleration

Training on Desktop and Cloud Deployment on Embedded Devices

Training Data Sets

Vision and Inferencing Applications Live NeuralNeural Net Net Training Training OR Data NeuralTrainingFrameworks Net Training Trained Frameworks Optimization Vision and Inferencing FrameworksFrameworks Networks Runtimes

Desktop and Cloud Diverse Inferencing GPU/TPU Acceleration Acceleration Hardware

Copyright © Khronos® Group Inc. 2018 - Page 43 Machine Learning Training

NeuralNeural Net Net Training Training NeuralNeural NetFrameworks Net Training Training Authoring FrameworksFrameworks Interchange Frameworks

GPUs have well established APIs and libraries for compute acceleration

Desktop and Cloud Hardware

cuDNN MIOpen clDNN TPU

Copyright © Khronos® Group Inc. 2018 - Page 44 Machine Learning Acceleration

Training on Desktop and Cloud

Training Data Sets

Vision and Inferencing Applications Live NeuralNeural Net Net Training Training OR Data NeuralTrainingFrameworks Net Training Trained Frameworks Optimization Vision and Inferencing FrameworksFrameworks Networks Runtimes

Desktop and Cloud Diverse Inferencing GPU/TPU Acceleration Acceleration Hardware

Deployment on Embedded Devices

Copyright © Khronos® Group Inc. 2018 - Page 45 NNEF - Neural Network Exchange Format

Before NNEF

NN Authoring Framework 1 Inference Engine 1

NN Authoring Framework 2 Inference Engine 2

NN Authoring Framework 3 Inference Engine 3 Every Tool Needs an Exporter to Every Accelerator

With NNEF

NN Authoring Framework 1 Inference Engine 1

NN Authoring Framework 2 Inference Engine 2

NN Authoring Framework 3 Inference Engine 3 Optimization and processing tools

Copyright © Khronos® Group Inc. 2018 - Page 46 NNEF Captures a Neural Network Description

Network Structure File Distilled, platform independent network description Human readable, syntactical elements from Python

Standardized Operations Can associate multiple Data Files with one Network Rigorously defined semantics Structure File e.g. the same data in multiple formats Linear, convolution, pooling, normalization, activation, unary/binary Supports fully connected, convolutional, recurrent architectures Network Data File Binary formatNetwork contains Data parameter File tensors BinarySupports formatNetwork float contains and Data quantized parameter File (integer) tensors data Two Levels of Expressiveness BinarySupportsFlexible format bitfloat widthscontains and quantized and parameter quantization (integer) tensors algorithms data Flat SupportsFlexibleQuantization bitfloat widths and algorithms quantized and quantization expressed (integer) algorithmsas data extensible Basic transfer of computation graphs with standardized operations FlexibleQuantization bit widths algorithmscompound and quantization expressed operations algorithmsas extensible Simple to parse and translate to vendor specific formats QuantizationQuantization algorithmscompound info provided expressed operations as hints as extensible for execution Compositional Quantizationcompound info provided operations as hints for execution Quantization info provided as hints for execution Define custom compound operations Higher-level graph descriptions More complex to parse but offers more optimization hints

Split Structure and Data files Easy independent access to network structure or individual parameter data Set of files can use a container such as tar or zip with optional compression and encryption

Copyright © Khronos® Group Inc. 2018 - Page 47 NNEF 1.0

TensorFlow and Caffe NNEF V1.0 released in August 2018 Import / Live Export After positive industry feedback on Provisional specification released in December 2017 Imminent

TensorFlow Syntax and Caffe2 Parser/ Import / Validator Export Files

Google OpenVX NNAPI Ingestion & Convertor Execution

NNEF open source projects hosted on NNEF Working Group Participants Khronos NNEF GitHub repository Apache 2.0 license https://github.com/KhronosGroup/NNEF-Tools

Copyright © Khronos® Group Inc. 2018 - Page 48 NNEF and ONNX

Comparing Neural Network Same Industry Dynamics as Exchange Industry Initiatives LLVM and SPIR-V

Bidirectional translator in open source

Embedded Inferencing Import Training Interchange Initiating open source Defined Specification Open Source Project bidirectional translator

Stability for hardware deployment Software stack flexibility

Multi-company Governance Initiated by Facebook Khronos tried to use LLVM as a hardware IR BUT Flexible Precision / Quantization 32-bit Floating Point only LLVM evolves without needing to preserve backwards compatibility. Fine for software compilers – very difficult to ONNX and NNEF are Complementary manage for hardware toolchains and roadmaps - ONNX will HAVE to move fast to track authoring framework interchange SO - NNEF provides a stable bridge from training into edge inferencing engines Khronos created hardware oriented SPIR-V with bidirectional translation to LLVM

Copyright © Khronos® Group Inc. 2018 - Page 49 Machine Learning Acceleration

Training on Desktop and Cloud

Training Data Sets

Vision and Inferencing Applications Live NeuralNeural Net Net Training Training OR Data NeuralTrainingFrameworks Net Training Trained Frameworks Optimization Vision and Inferencing FrameworksFrameworks Networks Runtimes

Desktop and Cloud Diverse Inferencing GPU/TPU Acceleration Acceleration Hardware

Deployment on Embedded Devices

Copyright © Khronos® Group Inc. 2018 - Page 50 Three Broad Inferencing Choices

Run in Acceleration needs high-level programming training tools – typically large C++ applications framework E.g. TensorFlow or TensorFlow Lite

Export to The most popular industry choice today. inference Runtime often uses underlying acceleration APIs runtime E.g. TensorRT, CoreML

Compile to Often used to merge custom or vision code optimized alongside inferencing runtime – generates code LLVM for CPU + accelerated API code

Copyright © Khronos® Group Inc. 2018 - Page 51 SYCL Single Source C++ Parallel Programming • SYCL 1.2.1 Adopters Program released in July 2018 with open source conformance tests - https://www.khronos.org/news/press/khronos-releases-conformance-test-suite-for-sycl-1.2.1 • Multiple Implémentations shipping: triSYCL, ComputeCpp - http://sycl.tech • Multiple SYCL libraries for vision and inferencing - SYCL -BLAS, SYCL-DNN, SYCL-Eigen Single application source C++ templates and lambda file using STANDARD C++ functions separate host & device code

C++ Kernel Fusion can gives better performance on complex apps and libs than hand-coding

Accelerated code passed into device OpenCL compilers

Copyright © Khronos® Group Inc. 2018 - Page 52 TensorFlow on SYCL / OpenCL

Python Client C++ Client

Optional C API State-of-the-art C++ compilers can fuse nodes TensorFlow tensor Kernels Matrix multiply Convolutions in vision and neural (> 800 kernels) network graphs to provide optimized performance often faster than hand- Eigen Tensors SYCL-BLAS Library SYCL-DNN Library coded applications

Copyright © Khronos® Group Inc. 2018 - Page 53 Three Broad Inferencing Choices

Run in Acceleration needs high-level programming training tools – typically large C++ applications framework E.g. TensorFlow or TensorFlow Lite

Export to The most popular industry choice today. inference Runtime often uses underlying acceleration APIs runtime E.g. TensorRT, CoreML

Compile to Often used to merge custom or vision code optimized alongside inferencing runtime – generates code LLVM for CPU + accelerated API code

Copyright © Khronos® Group Inc. 2018 - Page 54 Platform Inferencing Stacks Consistent Three Steps 1. Import trained NN model file 2. Build optimized version of graph 3. Run graph on accelerated runtime using underlying low-level API

Core ML Model

Microsoft Windows Google Android Apple MacOS and iOS Windows Machine Learning (WinML) Neural Network API (NNAPI) CoreML https://docs.microsoft.com/en-us/windows/uwp/machine-learning/ https://developer.android.com/ndk/guides/neuralnetworks/ https://developer.apple.com/documentation/coreml

Copyright © Khronos® Group Inc. 2018 - Page 55 NNVM - Open Compiler for AI Inferencing

Paul G. Allen School of Computer Science & Engineering, University of Washington http://www.tvmlang.org/2017/08/17/tvm-release-announcement.html

1.Import Trained Network Description

2. Graph-level Optimizations

3. Decompose to primitive instructions and emit programs for accelerated run-times

SPIR-V IR for parallel accelerators LLVM IR for CPUs Backend in development Facebook Glow Compiler (Graph Lowering Optimizations) https://facebook.ai/developers/tools/glow

Copyright © Khronos® Group Inc. 2018 - Page 56 OpenVX

Wide range of vision hardware architectures OpenVX provides a high-level Graph-based abstraction -> Enables Graph-level optimizations! Can be implemented on almost any hardware or processor! -> Portable, Efficient Vision Processing!

Shipping Implementations X100 Dedicated Hardware Vision DSPs X10 Vision GPU Node Compute Vision Vision Node Node Multi-core Vision Node Power Efficiency Power X1 CPU

Computation Flexibility Vision Processing Graph

Copyright © Khronos® Group Inc. 2018 - Page 57 Extending OpenVX for Inferencing #1 Importing NNEF Neural Network Descriptions

Neural Network Extension NNEF Translator • OpenVX Nodes to represent common NN Layers • Ingests NNEF File and builds OpenVX node graph • 1D-4D Tensors to connect layers • Open source project in progress • INT16, INT7.8, INT8, and U8 Tensor Ops

vxActivationLayer vxConvolutionLayer vxDeconvolutionLayer

vxFullyConnectedLayer vxNormalizationLayer vxPoolingLayer

vxSoftmaxLayer vxROIPoolingLayer …

Vision Node Native Vision Vision Downstream Camera Application Control Node CNN Nodes Node Processing

An OpenVX graph mixing CNN nodes with traditional vision nodes NNEF Translator converts NNEF representation into OpenVX Node Graphs

Copyright © Khronos® Group Inc. 2018 - Page 60 Extending OpenVX for Inferencing #2 Creating Custom User Nodes

OpenVX/OpenCL Interop Kernel/Graph Import • Provisional Extension • Provisional Extension • Enables custom OpenCL acceleration to be • Defines container for executable or IR code invoked from OpenVX User Kernels • Enables arbitrary code to be inserted as a • Memory objects can be mapped or copied OpenVX Node in a graph

Application OpenVX user-kernels can access command queue and cl_mem objects to asynchronously Fully asynchronous host-device schedule OpenCL kernel execution operations during data exchange Copy or export OpenVX data objects cl_mem buffers into OpenVX cl_mem buffers data objects

OpenCL Command Queue

Runtime Map or copy OpenVX data objects into cl_mem buffers Runtime

OpenVX/OpenCL Interop

Copyright © Khronos® Group Inc. 2018 - Page 61 NNEF and OpenVX for Inferencing

Many inferencing stacks end up using OpenCL for hardware acceleration

Ingestion Compilation

Translator Compile to Proprietary IR/Binary Executable Runtimes Code NN Kernel Extension Import Acceleration APIs Vision Nodes To mix inferencing with vision and other User Nodes custom processing

Execute Compile to accelerated executable OpenVX code Runtime

Copyright © Khronos® Group Inc. 2018 - Page 62 OpenCL – Unique Heterogeneous Runtime

OpenCL is the only industry standard for low-level heterogeneous compute Portable control over memory and parallel task execution “The closest you can be to your embedded accelerator and still be portable”

Application or Application or Inferencing Run-time Inferencing Run-time

Growing number of optimized OpenCL vision and inferencing libraries Vision: OpenCV, Halide, Visioncpp Machine Learning: Xiaomi MACE, Arm Compute Library Fragmented GPU Linear Algebra: clDNN, clBlast, ViennaCL API Landscape

CPU GPU CPUCPU GPUGPU

FPGA DSP

Custom Hardware

Copyright © Khronos® Group Inc. 2018 - Page 63 OpenCL – Low-level Parallel Programing • Low-level programming of heterogeneous parallel compute resources - One code tree can be executed on CPUs, GPUs, DSPs and FPGA … • OpenCL C or C++ language to write kernel programs to execute on any compute device - Platform Layer API - to query, select and initialize compute devices - Runtime API - to build and execute kernels programs on multiple devices • The programmer gets to control: - What programs execute on what device - Where data is stored in various speed and size memories in the system - When programs are run, and what operations are dependent on earlier operations

GPU OpenCL CPU KernelOpenCL Kernel code DSP Runtime API CodeKernelOpenCL compiled for CPU loads and executes CPU CodeKernelOpenCL devices FPGA kernels across devices CodeKernel Devices Code Host

Copyright © Khronos® Group Inc. 2018 - Page 64 Growing OpenCL Adoption • 100s of applications using OpenCL acceleration - Rendering, visualization, video editing, simulation, image processing • Almost 6,000 GitHub repositories using OpenCL - Tools, applications, libraries, languages - Up from 4310 one year ago • Khronos Resource Hub https://www.khronos.org/opencl/resources/opencl-applications-using-opencl

Copyright © Khronos® Group Inc. 2018 - Page 65 OpenCL as Language/Library Backend

C++ based Language for Accelerated Single Java language Vision Compiler Open source Neural image computing Source C++ extensions processing directives for software library network processing and library in open Programming for open source Fortran, for machine framework computational source for OpenCL parallelism project C and C++ learning (beta) photography

Hundreds of languages, frameworks and projects using OpenCL to access vendor-optimized, heterogeneous compute runtimes

Copyright © Khronos® Group Inc. 2018 - Page 66 OpenCL Conformant Implementations

1.0|May09 1.1|Jul11 1.2|Jun12 1.0|Aug09 1.1|Aug10 1.2|May12 2.0|Dec14

1.0|May10 1.1|Feb11 Desktop 1.1|Mar11 1.2|Dec12 2.0|Jul14 2.1 | Jun16

1.0|May09 1.1|Jun10 1.2|May15

1.1|Aug12 1.2|Mar16 2.0|Apr17

1.0 | Feb11 1.2|Sep13 Mobile

1.1|Nov12 1.2|Apr14 2.0|Nov15

1.1|Apr12 1.2|Dec14

1.2|Sep14 Adoption Since Last IWOCL NVIDIA: 1.2 on all current GPUs – Windows 1.1|May13and Linux 1.0|Jan10Qualcomm: 2.0 on Adreno GPUs - Android 8.0 Embedded Intel:1.2 on latest processors – Windows 10 1.2|May15 1.2|Aug15 1.0|Jul13 FPGA 1.0|Dec14

Vendor timelines are Dec08 Jun10 Nov11 Nov13 Nov15 first conformant OpenCL 1.0 OpenCL 1.1 OpenCL 1.2 OpenCL 2.0 OpenCL 2.1 submission for each spec Specification Specification Specification Specification Specification generation Copyright © Khronos® Group Inc. 2018 - Page 67 OpenCL Ecosystem Roadmap

OpenCL has an active Bringing Heterogeneous compute to standard ISO C++ three track roadmap Khronos hosting C++17 Parallel STL C++20 Parallel STL with Ranges Proposal

SYCL 1.2 SYCL 1.2.1 C++11 Single source C++11 Single source Processor Deployment programming programming Flexibility Parallel computation across diverse processor architectures

2011 2015 2017 OpenCL 1.2 OpenCL 2.1 OpenCL 2.2 OpenCL C Kernel SPIR-V in Core C++ Kernel Language Kernel Deployment Language Flexibility Execute OpenCL C kernels on Vulkan GPU runtimes

Copyright © Khronos® Group Inc. 2018 - Page 68 OpenCL Next - Feature Set Flexibility • Defining OpenCL features that become optional for enhanced deployment flexibility - API and language features e.g. floating point precisions • Feature Sets avoid fragmentation - Defined to suit specific markets – e.g. desktop, embedded vision and inferencing • Implementations are conformant if fully support feature set functionality

OpenCL 2.2 Functionality = queryable, optional feature

Industry-defined Feature Set E.g. Khronos-defined Khronos-defined Embedded Vision OpenCL 2.2 Full Profile OpenCL 1.2 Full Profile and Inferencing Feature Set Feature Set

Copyright © Khronos® Group Inc. 2018 - Page 69 Universal Deployment Flexibility October 2018

Copyright © Khronos® Group Inc. 2018 - Page 70 Universal Deployment Flexibility

OpenCL Programs

Open source SPIRV-Cross converts Clspv and clvk SPIR-V to MSL or HLSL Open source tools enable open source tools OpenCL and Vulkan apps to be increasingly deployed on any platform Open source shims convert Vulkan to Metal or D3D API calls

Native Vulkan Drivers

UWP and D3D based Consoles

Copyright © Khronos® Group Inc. 2018 - Page 71 Bringing OpenCL Compute to Vulkan • Experimental Clspv Compiler from Google, Adobe and Codeplay - Compiles OpenCL C to Vulkan’s SPIR-V execution environment Successfully tested on over 200K lines of Adobe OpenCL C production code - Open source - tracks top-of-tree LLVM and clang, not a fork

Increasing deployment options for OpenCL developers OpenCL C OpenCL e.g. Vulkan is a supported Source Host Code API on Android

Prototype open Clspv Run-time source project Compiler API Translator Prototype open https://github.com/google/clspv source project https://github.com/kpet/clvk Runtime Kevin Petit from ARM

Copyright © Khronos® Group Inc. 2018 - Page 72 Refining clspv with Diverse Workloads

Public and Private test kernel repositories so developers can upload Compile diverse OpenCL C OpenCL WG Discussion kernels to be used in long-term kernel workloads at Budapest: perf/regression testing – Apache 2.0 Getting kernels into https://github.com/KhronosGroup/OpenCL-Sample-Kernels this clspv process! https://gitlab.khronos.org/opencl/OpenCL-Sample-Kernels 1. Setting up kernel Efficient mapping to sample repo with more Interesting domains to explore include Vulkan SPIR-V? restricted license for existing OpenCL compute libraries - sensitive kernels including vision and inferencing 2. Work with open source library vendors Updates to compiler achieve to extract key kernels efficient mapping? – maybe a funded project?

3. Encouraging members to submit Vulkan is already expanding its compute Propose updates to Vulkan test kernels for key model e.g. 16-bit storage, Variable programming model use cases: e.g. vision Pointers, Subgroups. Compact memory and inferencing types and operations coming

Copyright © Khronos® Group Inc. 2018 - Page 73 SPIR-V Ecosystem

• Khronos defined cross-API IR GLSL HLSL • Native graphics and parallel compute

Third party kernel and • Stable specification to complement shader languages glslang DXC MSL LLVM Open Source Project

HLSL SPIRV-Cross OpenCL C OpenCL C++ SYCL GLSL Front-end Front-end Front-end

SPIR-V (Dis)Assembler

SPIR-V Validator LLVM LLVM to SPIR-V Bi-directional Khronos cooperating with SPIRV-opt | SPIRV-remap Translators Clang/LLVM Community

Optimization Khronos-hosted Tools Open Source Projects https://github.com/KhronosGroup/SPIRV-Tools

IHV Driver Runtimes

Copyright © Khronos® Group Inc. 2018 - Page 74 SPIR-V Recent Updates • SPIR-V 1.3 released - Non-uniform subgroup ops to support Vulkan Subgroups - Six additional extensions incorporated • Tooling evolves - SPIRV-Tools: validation, optimizations, HLSL legalization - SPIRV-Reflect: C/C++ reflection API for SPIR-V shader bytecode - Fossilize serialization format for recording and replaying SPIR-V • Compilers - Glslang: used as basis for GLSL and SPIR-V extensions - DXC: HLSL to SPIR-V compiler

Copyright © Khronos® Group Inc. 2018 - Page 75 Vulkan Portability Initiative

Enabling and accelerating the creation of tools and run-time libraries for Vulkan applications to run on platforms supporting only Metal or

Multiple third party implementations Most Vulkan functionality supported. Of Vulkan Portability run-time E.g. Metal 1.0 supports everything libraries and tools except: Triangle fans, Separate stencil reference masks, Vulkan Events, some texture-specific Portability Layers swizzles. Metal 2.0 reduces this list DevSim Layer – develop and debug with the features of a Portability Library on a full Vulkan driver Porting Research Vulkan Portability Validation Layer – enforces use of What % of Vulkan can be Extension Portability library features EFFICIENTLY supported Standardizes app queries for Vulkan by run-time layer over features NOT supported on Conformance Tests D3D and Metal library/platform combination Vulkan CTS will skip tests for missing functionality Subsets cannot be officially conformant but functionality that is present must work!

Implementation and testing experience

Copyright © Khronos® Group Inc. 2018 - Page 76 Bringing Vulkan Apps to Apple Platforms Today

Dota 2 running on Mac up to Applications 50% faster than native OpenGL Open source SDK to build, run, and debug applications First productions apps using on macOS including MoltenVK already shipping Vulkan validation layer support on macOS and iOS macOS SDK

SPIRV-Cross macOS / iOS Beta release – working to pass all Convert SPIR-V shaders to Run-time applicable conformance tests platform source formats Maps Vulkan to Metal Previously a paid product from MoltenVK for macOS and iOS Brenwill Workshop. Now released into For macOS 10.11, iOS 9.0 and up OPEN SOURCE. Completely free to use - no fees or royalties – including commercial applications

Copyright © Khronos® Group Inc. 2018 - Page 77 Valve - Vulkan Dota 2 on macOS

Shipping Now. Vulkan delivering up to 50% performance increase over native OpenGL

Copyright © Khronos® Group Inc. 2018 - Page 78 gfx-rs from • Vulkan Portability over D3D, Metal, and OpenGL - D3D layer useful for Vulkan on UWP platforms such as Windows 10 S, Polaris, Xbox One - https://github.com/gfx-rs/gfx - https://github.com/gfx-rs/portability Efficiently running Dota 2 on Mac – working to increase conformance coverage https://gfx-rs.github.io/2018/08/10/dota2-macos-performance.html C Apps and Libs

Rust Apps and Libs Warden Test framework Portability (C API) WebRender WebGPU gfx-HAL (Rust 3D API) Secondary Backends

Copyright © Khronos® Group Inc. 2018 - Page 79 Vulkan Portability Timeline

Multiple iOS and macOS apps being organically developed/ported – support Portability Extension through MoltenVK website. E.g. Ratified Forsaken Remastered on Mac DevSim and Validation MoltenVK implementing Layers Complete Dota 2 on Mac KHR_variable_pointers and Ships in Production KHR_16bit_storage to support Updates to CTS to turn off clspv tests for missing First iOS Apps using functionality MoltenVK ship RenderDoc being integrated with through app store MoltenVK MoltenVK passing MoltenVK in open source conformance for all present Dota 2 on Mac Prototype Qt Running on Mac Interest in Vulkan layer for functionality. MacOS SDK from LunarG through MoltenVK Playstation – may need funding? gfx-rs soon after

GDC June TODAY GDC 2018 2018 2018 2019

Copyright © Khronos® Group Inc. 2018 - Page 80 Safety Critical Acceleration APIs October 2018

Copyright © Khronos® Group Inc. 2018 - Page 81 OpenGL SC Milestones

OpenGL SC 1.0 - 2005 OpenGL SC 2.0 - April 2016 Fixed function graphics Shader programmable pipeline safety critical subset safety critical subset

OpenGL ES 1.0 - 2003 OpenGL ES 2.0 - 2007 Fixed function graphics Shader programmable pipeline

Copyright © Khronos® Group Inc. 2018 - Page 82 OpenVX SC - Safety Critical Vision Processing • Based on OpenVX 1.1 specification - Enhanced determinism - Specification identifies and numbers requirements • MISRA C clean per KlocWorks v10 • Divides functionality into “development” and “deployment” feature sets - Adds requirement to support import/export extension

OpenVX SC Verify Binary Import OpenVX SC Development Feature format Deployment Feature Set Set (Create Graph) Export (Execute Graph)

Entire graph creation API Implementation- No graph creation API dependent format

Copyright © Khronos® Group Inc. 2018 - Page 83 Khronos Safety Critical Advisory Forum

Need for new-generation APIs for safety certifiable vision, graphics and compute e.g. ISO 26262 and DO-178B/C Khronos SC Activities

Industry outreach and cooperation Vulkan and OpenCL Gathering SC requirements

AESIN Automotive ADAS & AV + security https://aesin.org.uk Generating guidelines for designing safety critical APIs to ease system certification. SYCL SC Open to Khronos member AND industry experts Guidelines to augment Industry https://www.khronos.org/advisors/kscaf First Safe and Secure Parallel and MISRA C++ Heterogeneous C++ C++ WG23 Programming Vulnerabilities Safe AI for Automotive ISO C Safe and Secure SG We invite safety critical ISO C++ Vulnerabilities Safety Critical SG experts to join KSCAF! No cost or work commitment OpenVX SC 1.1 - May 2017 Being incorporated into new generation mainline OpenVX specification

Copyright © Khronos® Group Inc. 2018 - Page 84 Liaisons and Exploratory Initiatives October 2018 Neil Trevett | Khronos President NVIDIA | VP Developer Ecosystems

Copyright © Khronos® Group Inc. 2018 - Page 85 Khronos Liaisons

Formal Liaisons Khronos and OGC (Open Geospatial Consortium) jointly advancing open geospatial standards related ETSI (The European Telecommunications to the fields of augmented and virtual reality, Standards Institute) ARF (AR Framework) distributed simulation, and 3D content services for interoperable AR applications has identified Khronos standards as developing key standards to be referenced

COLLADA is a formal ISO standard. Working to make glTF a formal ISO standard to enable its adoption in products such as PDF Working together to ensure hardware HMD standards and OpenXR are aligned

Informal Liaisons

Copyright © Khronos® Group Inc. 2018 - Page 86 Takyon Exploratory Group

Abaco – made a standardization proposal to Khronos Takyon - unifies low level point-to-point communication and signaling APIs, for the Embedded HPC audience Khronos – outreaching to members and industry to determine interest. Will establish working group if established industry need

Before After Five Essential API Commands Create: create a communication path to a remote endpoint Send: send a message to the remote endpoint (blocking and non-blocking) Send Completion Test: test if a previously started non-blocking send is complete Receive: receive a message Destroy: destroy the path https://www.khronos.org/exploratory/takyon

Copyright © Khronos® Group Inc. 2018 - Page 87 Need for Camera Control API? • Embedded market is distinct from mobile market -> different camera APIs? - Different sensors, multiple sensors, synchronization etc. etc… • Should we reach out to European Machine Vision Association (EMVA) - https://www.emva.org/ - Looking to expand functionality – productive cooperation? - https://www.emva.org/news/emva-hosts-two-new-standards/ • Genicam - https://www.emva.org/standards-technology/genicam/ - Currently a simple interface for industrial inspection - Should we talk to camera vendors e.g. Basler https://www.baslerweb.com/en/

Defines control of Sensor, Color Filter Array Lens, Flash, Focus, Aperture OpenKCAM standard is Auto Exposure (AE) currently on ice – do we Auto White Balance (AWB) Auto Focus (AF) Stream of Images for need to restart? Image Signal Vision Processing Processor (ISP)

Copyright © Khronos® Group Inc. 2018 - Page 88 Principles of Operation October 2018

Neil Trevett | Khronos President NVIDIA | VP Developer Ecosystems

Copyright © Khronos® Group Inc. 2018 - Page 89 Khronos Principles of Organization

Any company is welcome to join. Any member can propose new One company one vote standards initiatives Software Strong IP Framework to enable Conformance Tests and Adopters ROYALTY-FREE specifications: Programs for defining conformance, all members agree not assert patents specification integrity and cross- against conformant implementations vendor portability

Only invest where there is Silicon Non-profit organization - strong industry momentum to Membership and Adopters fees ensure industry relevance – cover operating, marketing and let Darwinism rule! engineering expenses

Copyright © Khronos® Group Inc. 2018 - Page 90 Khronos Cooperative Framework

Advisory Panels Khronos Board API Working Groups Individual industry experts Strategy, budget and (Industry, Academic and by invitation, no fee. $$ oversight Associate members) Provide requirements and draft spec feedback $ One vote per industry member

Industry and Community Adopters Conformance Royalty-Free Implementations, Tools Educator Guidelines Feedback and $ Programs Tests Specifications Documentation, Samples Courseware Materials contributions on released materials Open and royalty-free. Often open sourced

Adopters Developers Educators / Certifiers Build conformant Develop applications Create Courses implementation and products using the APIs Training and Certification

Copyright © Khronos® Group Inc. 2018 - Page 91 Khronos Standardization Process WG Passes Finalized Specification to Board - Board votes to Ratify Specification - Specification and Conformance Tests Publicly Released

Any Member Can Propose a Working Group Working Groups - Board votes to establish - Chair & specification editor elected from WG members Exploratory Group if enough interest - Design Contributions welcome from any WG member - Exploratory Group creates detailed - Consensus-based decision process – one company one vote Statement of Work (SOW) - All work is discussed and documented online - Board votes to establish a Working - One teleconference per week (in English) Group to execute the SOW - Three Khronos F2F Meetings a year - open to all members

Anyone with member email address can create account for full access to Khronos resources under Khronos NDA - Draft specifications and tests for all working groups - Email reflectors for each working group

Any Company Can Join Any Member Can Propose an Khronos as a Member Individual for an Advisory Panel - Board - $75K/year - Working Group votes to approve - Contributor - $18K/year - Invitation sent to individual to sign - Associate (<20 employees) - $3.5K/year Advisory Panel Agreement - Academic - $1K/year

Copyright © Khronos® Group Inc. 2018 - Page 92 Khronos IP and Adopters Framework

Working Group releases Conformance Tests for each Ratified Specification

When a Submission is All Members Agree successfully reviewed that to NOT assert their patents Adopters Process product becomes Conformant against any CONFORMANT defines how to submit -> implementation of a RATIFIED Conformance Test Results for The Implementation can use the Specification (no patents need Working Group Review API trademark be disclosed) AND The Implementation becomes protected from patent litigation Implementer executes Adopters from other members Membership agreement contains Agreement to access Adopters mechanisms for members to Process. Adopters fee for each exclude specific patents (rarely generation of specification used as license grant is very (typically around $20K for narrow) unlimited number of products)

Copyright © Khronos® Group Inc. 2018 - Page 93 Khronos Conformance Submission Process

Company Company executes Port Test Upload test results Successful Review enables implementing Adopters Agreement Source to to Khronos private products to use Khronos Khronos spec wishes and pays fee products and web-site. Peer trademarks, be IP Protected and to use the (typically ~$20-50K generate test review by to be listed on Khronos website trademark for unlimited results members/Adopters E.g. “We have products) to enter implemented Adopters program for OpenGL ES” an API Full use of logo Restricted use and trademark trademark (not logo) with small Full use of logo Adopter Benefit with disclaimer disclaimer and trademark language AND IP Protection

Copyright © Khronos® Group Inc. 2018 - Page 94 Exploratory Group Process

Any Member(s) IP-Free Briefing Board votes to Exploratory Group Board votes to Working can propose a with informal establish Exploratory creates detailed establish a Group new Working straw-poll on Group if enough Statement of Work Working Group to Meetings Group level on interest member interest (SOW) execute the SOW Start

Detailed design starts. Participation incurs NO IP licensing obligations Participation removes right to withdraw from Discussions should be high-level: requirements, high-level design directions IP licensing obligations

Exploratory Group Process is designed so that Khronos, and its members, can explore whether to undertake a new initiative without committing any member to IP licensing obligations

Copyright © Khronos® Group Inc. 2018 - Page 95 New Open Source Engagement Model

• Khronos is open sourcing specification sources, Source Materials for Specifications and Reference conformance tests, tools Documentation CONTRIBUTED Under Khronos IP Framework - Merge requests welcome from the community (you won’t assert patents against conformant implementations, (subject to review by OpenCL working group) and license copyright for Khronos use) • Deeper Community Enablement Spec and - Mix your own documentation! Ref Language Source Redistribution - Contribute and fix conformance tests under CC-BY 4.0 - Fix the specification, headers, ICD etc. Contributions and - Contribute new features (carefully) Distribution under Apache 2.0 Spec Build System and Contributions and Scripts Distribution under Conformance Apache 2.0 Test Suite Source Khronos builds and Ratifies Spec and Ref Language Source and Canonical Specification under derivative materials. Khronos IP Framework. Re-mixable under CC-BY by the Khronos Adopters No changes or re-hosting allowed industry and community Program

Anyone can test any Conformant Implementations can Community built implementation at use trademark and are covered documentation and any time by Khronos IP Framework tools

Copyright © Khronos® Group Inc. 2018 - Page 96 Khronos Advisory Panels

Specification drafts and Requirements and invitations for feedback on requirements and feedback specification drafts

Working Shared Email Advisory list and Group Repository Panel Hosted by Khronos. Khronos Members Under Khronos NDA Khronos Advisors Any company can join Invited industry experts Membership Fee $0 Cost Sign NDA and IP Framework Sign NDA and IP Framework

Advisory Panels Active for Vulkan, OpenCL/SYCL and NNEF

Copyright © Khronos® Group Inc. 2018 - Page 97 Khronos Education Forum

Support educators to teach Khronos technologies!

Khronos Education Forum Online Hub for collaboration https://www.khronos.org/education Shared course materials under permissive license Open to all – no fees or membership required Just send an email to [email protected] Educators can connect Direct contact with to share and discuss Khronos working course materials groups for review and feedback

Oregon State University Graphics Class with Vulkan Ref Cards

Copyright © Khronos® Group Inc. 2018 - Page 98 The Value of Khronos Participation

See an early window Have a voice in Develop into the future of the how key products in Products are aligned industry technology standards parallel with with global market roadmap before evolve to suit spec drafting needs and trends products are your business for faster time developed needs to market Members can ship products faster than non-members

Gather Industry Draft Specifications Publicly Release Non-members Requirements for future Confidential to Specifications and Release silicon acceleration Khronos members Conformance Tests Products

The Khronos standardization process is proven to RAPIDLY generate industry consensus on open standards to EFFICIENTLY create new market opportunities

Copyright © Khronos® Group Inc. 2018 - Page 99 Summary and Contacts Angela • Khronos is creating cutting-edge royalty-free open standards WeChat ID: cxy3532 - For 3D, VR/AR, Compute, Vision and machine learning • Please join Khronos! - We welcome members from China and Asia - Influence the direction of industry standards - Get early access to draft specifications • More Information www.khronos.org - WeChat ID: neiltrevett - [email protected] - @neilt3d

Copyright © Khronos® Group Inc. 2018 - Page 100