Connecting Software to Silicon China Outreach, October 2018
Neil Trevett | Khronos President NVIDIA | VP Developer Ecosystems
Copyright © Khronos® Group Inc. 2018 - Page 1 Khronos Mission Asian Members
Software
Silicon
Khronos is an International Industry Consortium creating royalty-free, open standards to enable software to access hardware acceleration for 3D Graphics, Virtual and Augmented Reality, Parallel Computing, Neural Networks and Vision Processing
Copyright © Khronos® Group Inc. 2018 - Page 2 What is An Open Standard?
An INTEROPERABILITY STANDARD enables two entities to COMMUNICATE. E.g. Software <-> Hardware
Bad Standards Good Standards - Overprescribes implementation details - Prescribes ONLY interoperability - Forces lowest common denominator - Enables implementation diversity - Stifles innovation – Encourages innovation -> Commoditization -> Differentiation
A Truly OPEN standard Is not controlled by a single company – but by the whole industry Is freely available to use by any company without royalty payments Has a well defined IP Framework to protect standard AND member’s IP
Standards Grow Markets… … Reduce Costs … … Accelerate Time to Market By reducing consumer confusion and By sharing development between With well-proven functionality, increasing capabilities and usability many companies and driving volume testing and interoperability
Copyright © Khronos® Group Inc. 2018 - Page 3 Khronos Standards
Download 3D object and scene data Vision and sensor processing - including neural network inferencing for machine learning
High-performance, low-latency 3D Graphics
Portable interaction with VR/AR sensor, haptic and display devices
Copyright © Khronos® Group Inc. 2018 - Page 4 Vulkan and New Generation GPU APIs
Modern architecture | Low overhead| Multi-thread friendly EXPLICIT GPU access for EFFICIENT, LOW-LATENCY, PREDICTABLE performance
Vulkan Porting Tools
Non-proprietary, royalty-free open standard ‘By the industry for the industry’ Portable across multiple platforms - desktop and mobile
Copyright © Khronos® Group Inc. 2018 - Page 5 What is an Explicit API? Explicit API • Provides information at the right time Timeline(s) state definitions - To prepare for efficient execution API Object • Better scheduling control over CPU and GPU generation - Split multi-threaded work creation from work submission api object - Explicit synchronization primitives usage • Predictable performance costs - Creating pipelines, allocating memory … Legacy API state context • No driver magic on-the-clock manipulation - Removes guesswork and late decision-making • Simpler drivers Implicit Internal - The application has the most complete Validation & knowledge for holistic optimization Objects
Copyright © Khronos® Group Inc. 2018 - Page 6 Vulkan Explicit GPU Control Vulkan = high performance and low latency 3D and GPU Compute Ideal for VR/AR applications
Resource management offloaded to app: low-overhead, low-latency Complex drivers Application driver cause overhead Single thread per context and inconsistent Application Consistent behavior: behavior across Memory allocation Multiple Front-end no ‘fighting with driver vendors Thread management heuristics’ Synchronization Compilers High-level Driver GLSL, HLSL etc. Always active Multi-threaded generation Validation and debug layers of command buffers error handling Abstraction loaded only when needed Layered GPU Control Full GLSL Context management SPIR-V SPIR-V intermediate preprocessor and Memory allocation pre-compiled shaders language: shading language compiler in Full GLSL compiler flexibility Error detection driver Thin Driver Loadable debug and Explicit GPU Control validation layers Multi-threaded command OpenGL vs. creation. Multiple graphics, OpenGL ES command and DMA queues GPU GPU Unified API across all platforms with feature set flexibility Vulkan 1.0 provides access to OpenGL ES 3.1 / OpenGL 4.X-class GPU functionality but with increased performance and flexibility
Copyright © Khronos® Group Inc. 2018 - Page 7 Vulkan Multi-threading Efficiency 1. Multiple threads can construct Command Buffers in parallel Application is responsible for thread management and synch CPU Thread Command Buffer
CPU Thread Command CPU Buffer Thread
CPU Command Thread Command Buffer Queue GPU
CPU Command Thread 2. Command Buffers placed in Command Buffer Queue by separate submission thread
CPU Thread Command Buffer
Copyright © Khronos® Group Inc. 2018 - Page 8 Vulkan Tools Architecture • Layered design for cross-vendor tools innovation and flexibility - IHVs plug into a common, extensible architecture for code validation, debugging and profiling during development without impacting production performance • Khronos Open Source Loader enables use of tools layers during debug - Finds and loads drivers, dispatches API calls to correct driver and layers
Production Path Interactive (Performance) Debug Layers can be Debugger Vulkan-based Title installed during Development
Validation and Sim Layers Vulkan’s Common Loader Debug Layers
IHV’s Installable Client Debug information via Driver standardized API calls
Copyright © Khronos® Group Inc. 2018 - Page 9 Vulkan – Now Available Everywhere
VR-Related Features NOW Multi-GPU support Multiview Rendering Context priority Front buffer rendering
IN DISCUSSION Variable Rate Rendering Tiled rendering (beam racing)
Copyright © Khronos® Group Inc. 2018 - Page 10 Vulkan 1.1 Launch and Ongoing Momentum Strengthening the Ecosystem Improved developer tools (SDK, validation/debug layers) More rigorous conformance testing Building Vulkan’s Future Shader toolchain improvements (size, speed, robustness) Deliver complete ecosystem – not just specs Shading language flexibility – HLSL and OpenCL C support Listen and prioritize developer needs Vulkan Public Ecosystem Forum Drive GPU technology
Released Vulkan 1.1 Extensions KHR_draw_indirect_count Vulkan 1.0 Extensions Source draw count parameter from a buffer in GPU-writable Maintenance updates plus additional functionality memory for greater flexibility for GPU-generated work Explicit Building Blocks for VR: KHR_8bit_storage e.g. multiview 8-bit types in uniform and storage buffers for improved compute Explicit Building Blocks for support in apps such as inferencing and vision Homogeneous Multi-GPU EXT_descriptor_indexing Enhanced Windows System Integration Dynamically non-uniform (aka bindless) resource access Increased Shader Flexibility: Required by some modern game engine architectures February 2016 16 bit storage, Variable Pointers Enhanced Cross-Process and Discussions in Flight Vulkan 1.0 March 2018 Cross-API Sharing Vulkan 1.1 Reduced precision arithmetic types in shaders Detailed driver property queries Integration of 1.0 Extensions. Variable-resolution rendering New Technology into Core e.g. Cross-vendor performance counter queries Subgroup Operations Memory residency management Depth/stencil resolve Ray tracing Widening Platform Support Video Pervasive GPU vendor native driver availability New synchronization primitives Open source drivers – ANV (Intel), AMDVLK/RADV (AMD) Port Vulkan apps to macOS/iOS and DX12
Copyright © Khronos® Group Inc. 2018 - Page 11 New Functionality in Vulkan 1.1 • Protected Content - Restrict access or copying from resources used for rendering and display - Secure playback and display of protected multimedia content • Subgroup Operations - Efficient mechanisms that enable parallel shader invocations to communicate - Wide variety of parallel computation models supported
Example Subgroup Operations A subgroup is a set of invocations (tasks) running on a GPU Compute Unit (Note many GPUs typically support subgroup sizes of 32/64 invocations)
5 2 1 6 7 9 3 4 3 1 3 2 2 3 Active Invocations
+ X Inactive Invocations
5 5 5 5 3 9 9 7 9 9 9 9 6 6 Broadcast Shuffle Add Clustered Multiply
Copyright © Khronos® Group Inc. 2018 - Page 12 Content is shipping on desktop…
Vulkan-only AAA Titles on PC
Dota 2 on PC and macOS
AAA titles on Linux
Copyright © Khronos® Group Inc. 2018 - Page 13 …and Mobile
Plus…. Lineage 2 Revolution Heroes of Incredible Tales Dream League Soccer…
Copyright © Khronos® Group Inc. 2018 - Page 14 Fortnite on Android!
Fortnite on Android uses Vulkan on select phones for optimal performance, including the best-performing Samsung – the Galaxy Note9
Copyright © Khronos® Group Inc. 2018 - Page 15 Vulkan Developer Activity – SDK and GitHub
LunarG Vulkan SDK Download rate increases every year since launch http://vulkan.lunarg.com
SIGGRAPH 2016 Vulkan Related GitHub Repos SIGGRAPH 2018 2500
2000
1500 SIGGRAPH 2017
1000
500
0 SIGGRAPH GDC SIGGRAPH GDC SIGGRAPH 2016 2017 2017 2018 2018
Copyright © Khronos® Group Inc. 2018 - Page 16 NVIDIA Vulkan Raytracing Vendor Extension • Vulkan Raytracing Extension to access RTX Raytracing - Running on Turing-based boards • Nsight Graphics with increased Vulkan support - Including Raytracing extension
Copyright © Khronos® Group Inc. 2018 - Page 17 OpenGL and OpenGL ES
January 2018 June 2018 OpenGL 4.6 conformance test suite OpenGL ES CTS 3.2.5.0 released in open source released in open source Intel and NVIDIA released conformant Raises the quality bar for OpenGL OpenGL 4.6 drivers 3.2 implementations OpenGL ES still the most prevalent April 2018 3D API (billions of units!) More conformant products added OpenGL 4.6.0.1 CTS bugfix update OpenGL ES 3.2 adoption increasing released in April
Working Group Meetings Merged under one Chairperson for Improved Efficiency 13.9% 30.6% GLSL and ESSL specs merged and migrated from LibreOffice to AsciiDoctor to improve maintainability and reduce divergence OpenGL 4.6, OpenGL ES 3.2, GLSL 4.60 and ESSL 3.20 specs June 2018 Lots of bug fixes - many leveraged from open GitHub projects 23.8% 31.7%
Google data – July 2018
Copyright © Khronos® Group Inc. 2018 - Page 18 WebGL Stack
Content downloaded Content from the Web JavaScript, HTML, CSS, ...
Low-level WebGL API provide a Middleware provides accessibility JavaScript Middleware powerful foundation for a rich for non-expert programmers JavaScript middleware E.g. three.js library ecosystem
CSS Reliable WebGL Browser provides WebGL relies on work by 3D engine alongside other HTML5 both GPU and technologies - no plug-in required JavaScript HTML5 Browser Vendors -> Khronos has the right membership OS Provided Drivers to enable that WebGL uses OpenGL ES 2.0 or cooperation Angle for OpenGL ES 2.0 over DX9
Copyright © Khronos® Group Inc. 2018 - Page 19 OpenGL ES and WebGL Evolution
Pervasive OpenGL ES 2.0 Desktop Graphics OpenGL and OpenGL ES ships on every desktop and mobile OS. Textures: NPOT, 3D, Depth, Arrays, Int/float 3D on the Web is enabled! Objects: Query, Sync, Samplers Apple does not ship Seamless Cubemaps, Integer vertex attributes OpenGL ES 3.1 Advanced Graphics Mobile Graphics Multiple Render Targets, Instanced rendering Cannot bring compute Tessellation and geometry shaders Programmable Vertex Transform feedback, Uniform blocks shaders into core WebGL ASTC Texture Compression and Fragment shaders Vertex array objects, GLSL ES 3.0 shaders Floating point render targets Compute Shaders Debug and robustness for security
2007 2012 2014 20162015 OpenGL ES 2.0 OpenGL ES 3.0 OpenGL ES 3.1 OpenGL ES 3.2 Compute Shaders WebGL 2.0 Compute Context Multiview extension 4 years Work in 2011 Progress WebGL 1.0 5 years March 2017 WebGL 2.0 Conformance Testing is vital for Cross-Platform Reliability WebGL 2.0 conformance tests are very thorough 10x more tests than WebGL 1.0 tests
Copyright © Khronos® Group Inc. 2018 - Page 20 WebGL Momentum – WebGL 2.0 is Here!
WebGL 1.0 93.26% Globally
http://caniuse.com/#feat=webgl WebGL 2.0 65.77% Globally
WebGL 2.0 brings Desktop-class graphics to the Web The time to create a new class of
Will reach WebGL 1.0 levels when Safari and Edge ship Web-based 3D Apps is now!
Copyright © Khronos® Group Inc. 2018 - Page 21 Khronos Standards
Download 3D object and scene data Vision and sensor processing - including neural network inferencing for machine learning
High-performance, low-latency 3D Graphics
Portable interaction with VR/AR sensor, haptic and display devices
Copyright © Khronos® Group Inc. 2018 - Page 22 glTF – The JPEG of 3D! glTF spec development on open GitHub – get involved! https://github.com/KhronosGroup/glTF
Compact to Transmit Simple and Fast to Load Describes Full Scenes Runtime Neutral Open and Extensible glTF 2.0 – June 2017 Efficient, reliable transmission glTF 1.0 – December 2015 Native AND Web APIs Bring 3D assets into 1000s of Primarily for WebGL Physically Based Rendering apps and engines Uses GLSL for materials Metallic-Roughness and Specular-Glossiness
Copyright © Khronos® Group Inc. 2018 - Page 23 glTF 2.0 Scene Description Structure
.gltf (JSON) Node hierarchy, PBR material textures, cameras
.bin .png Geometry: vertices and indices .jpg Animation: key-frames ... Skins: inverse-bind matrices Textures
Mandatory Metallic-Roughness Materials
Optional Specular-Glossiness Materials
Geometry Texture based PBR materials
Copyright © Khronos® Group Inc. 2018 - Page 24 glTF Ecosystem Repositories Creation Tools
Discover
Windows Mixed Sony 3D Creator Create Experience Reality Home Oculus
Modo Paint 3D Drive Demand
Mixed Reality 3D Builder Viewer Collada2gltf gltf-vscode FBX2glTF Prep for 3D printing
glTF-validator glTF-asset-generator Apps and Engines Users Copyright © Khronos® Group Inc. 2018 - Page 25 glTF Recent Highlights
Microsoft makes glTF files as pervasively usable as JPGs in Windows 10 Open sources entire glTF SDK! https://github.com/Microsoft/glTF-SDK
Adds glTF to StemCell - 60K+ 3D artists and 700K 3D models https://www.khronos.org/blog/turbosquid-adds-gltf-to-supported-formats-for-its-stemcell-initiative
Supports drag and drop of glTF models to your feed https://developers.facebook.com/blog/post/2018/02/20/3d-posts-facebook/
Over 150K royalty-free glTF models https://sketchfab.com/features/gltf
Dimension generates glTF for delivery of 3D marketing assets https://www.khronos.org/assets/uploads/developers/library/2018-gdc-webgl-and-gltf/glTF-Adobe-Dimension-GDC_Mar18.pdf
Integrating glTF into HUBS Web VR Meeting Space https://www.roadtovr.com/mozillas-hubs-one-click-vr-meeting-space-ive-waiting/
3D Tiles community standard proposal references glTF for streaming 3D geospatial datasets http://www.opengeospatial.org/pressroom/pressreleases/2829
Import of glTF into AR Core apps via the Google Sceneform Tools plugin Draco Mesh Compression Technology Contributed as an extension https://www.khronos.org/news/press/khronos-announces-gltf-geometry-compression-extension-google-draco
Copyright © Khronos® Group Inc. 2018 - Page 26 glTF Roadmap • glTF manages its roadmap very carefully – complexity is the enemy - Mission #1: ensure widespread, consistent, reliable usage • Optional Draco Mesh Compression Extension Shipping Now - Adoption enabled with open source encoders and decoders • Extensions and projects in discussion… - Draco compressed Point clouds, Universal compressed textures, LODs and streaming - Nexgen PBR materials, Advanced animations (e.g. Avatars and Face emoji), Metadata • Developing reference viewer for visual consistency across web and native engines
Domain-specific extensions stay as extensions Stable Mesh Compression Core Spec New widely needed Ratios functionality ships first as extensions
Integrate extensions into new core spec only when: 1) Widespread need is confirmed by the industry 2) Widespread reliable implementation is enabled (e.g. open source)
Copyright © Khronos® Group Inc. 2018 - Page 27 Texture Transmission Extension in Progress
.js, C++, GLSL/HLSL open source transcoders convert Encoding decoupled from data on-the-fly. GPU target device. Progressive Decode – stream Texture One encode pass per only up to desired Original texture asset quality/resolution per target TextureOriginal Original Universal AssetsTexture Encode and Transcode GPU AssetsTexture Supercompress Texture to GPU formats Texture Assets Assets
Transcodable, supercompressed textures for efficient transmission GPU 25% size of the equivalent native GPU encoding. Texture Rate-distortion optimization (RDO) for fine-grain control over quality vs bitrate. Optional LZ/ANS lossless codec stage for maximized compression efficiency. Transcode to a format that Support for both low precision and high-precision modes to support transcoding to the full is natively GPU-accelerated range of industry standard GPU formats on platform: BC1-5, ETC1/2, ASTC, BC6H/7, PVRTC Extension in design - welcome industry feedback https://github.com/KhronosGroup/glTF-Texture-Transmission-Tools
Copyright © Khronos® Group Inc. 2018 - Page 28 With Consumer Capture – 3D Will Go Social!
3D Descriptor Search Database
Manufacturers provide 3D Object Descriptors - much more information than Amazon-style 2D search
Social Loop Object Photos -> Facebook Upload, View, Share Capture Videos -> YouTube and Comment 3D -> glTF!
3D Printing Print Inspire and (e.g. Shapeways.com) Motivate
Copyright © Khronos® Group Inc. 2018 - Page 29 Khronos Standards for AR and VR
Download 3D object and scene data Vision and sensor processing - including neural network inferencing for machine learning
High-performance, low-latency 3D Graphics
Portable interaction with VR/AR sensor, haptic and display devices
Copyright © Khronos® Group Inc. 2018 - Page 30 The Need for VR Standards
Vertically Integrated One-off Solutions in 1990s High-volume Consumer Platforms Today!
Interoperability standards so Consumer platforms have insatiable need for content content can run on all VR devices! to drive demand - can’t afford silo’d content! Compete on adding user value – Consumer confidence – will my device run the content not restricting user choice! I want? Avoid Betamax vs. VHS Syndrome!
VR/AR will build on a HMD Connectivity Wireless connectivity and security constellation of standards
VR Video Formats Acceleration APIs XR in the Browser
Copyright © Khronos® Group Inc. 2018 - Page 31 XR = AR + VR
V1.0 - focused on VR V A After 1.0 – equal focus on AR
Copyright © Khronos® Group Inc. 2018 - Page 32 OpenXR – Solving VR/AR Fragmentation
VR AR VR AR VR AR VR AR App App App App App App App App 1 2 3 4 1 2 3 4
Proprietary Proprietary WebXR Engine WebXR Engine
Application Interface
Device Integration Layer
VR VR VR VR VR VR VR VR VR VR Device Device Device Device Device Device Device Device Device Device 1 2 3 4 5 1 2 3 4 5 Before OpenXR After OpenXR XR Market Fragmentation Wide interoperability of XR apps and devices
Copyright © Khronos® Group Inc. 2018 - Page 33 OpenXR Architecture
Portable AR/VR Input Device Discovery Multiple Sensor Tracking Device Events Pose Normalization Haptics Control Optical Corrections
OpenXR doesn’t replace AR/VR runtimes! It enables those runtimes to use PORTABLE APIs to expose their functionality
Copyright © Khronos® Group Inc. 2018 - Page 34 Companies Publicly Supporting OpenXR
OpenXR is a collaborative design Integrating many lessons from proprietary ‘first-generation’ API designs
Copyright © Khronos® Group Inc. 2018 - Page 35 Input and Haptics Abstraction • Apps use abstracted Input Actions - E.g. “Move,” “Jump,” “Teleport” • Device Events mapped to Input Actions - For the specific system in use • Many advantages - Content can use new devices with no code changes - Can mix-and-match multiple input sources to create a unified UI - Easy optional feature support (e.g. eye and body tracking) - Future-proofing for innovation in input devices and form factors
Copyright © Khronos® Group Inc. 2018 - Page 36 OpenXR Viewport Configuration Flexibility • Applications can: - Query for runtime supported Viewport Configurations - Applications can then set the Viewport Configurations that they plan to use - Select and change their active configuration over the lifetime of the session
Camera Passthrough AR Stereoscopic VR / AR Projection CAVE-like
Photo Credit: Dave Pape One Viewport Two Viewports (one per eye) Twelve Viewports (six per eye)
/viewport_configuration/ar_mono/magic_window /viewport_configuration/vr/hmd /viewport_configuration/vr_cube/cave_vr
Copyright © Khronos® Group Inc. 2018 - Page 37 OpenXR Development Process
Call for Participation / Exploratory Group Formation Fall F2F, October 2016: Korea
Statement of Work / Working Group Formation Winter F2F, January 2017: Vancouver
Specification Work Spring F2F, April 2017: Amsterdam Much more detailed specification overview Interim F2F, July 2017: Washington and SIGGRAPH session videos: https://www.khronos.org/developers/library/2018-siggraph Defining the MVP Fall F2F, September 2017: Chicago
Resolving Implementation Issues Winterim F2F, November 2017: Washington Winter F2F, January 2018: Taipei
First Public Information GDC, March 2018 Multiple implementations Underway! Present Day First Public Demonstrations! Final specification will incorporate SIGGRAPH, August 2018 Coming Soon implementation feedback Release Provisional Specification
Conformance Tests and Feedback Finalize Adopters Program Implementations
Ratify and release Final Specification and Enable Conformant Implementations to Ship
Copyright © Khronos® Group Inc. 2018 - Page 38 Epic ‘Showdown’ VR Demo at SIGGRAPH
Demo runs portably across StarVR and Microsoft Windows Mixed Reality headsets through the OpenXR APIs via an Unreal Engine 4 plugin https://www.youtube.com/watch?v=FCAM-3aAzXg&t=17250s
Copyright © Khronos® Group Inc. 2018 - Page 39 Mobile Augmented Reality Libraries Encapsulated Vision-based Functionality Also leveraging motion sensors ARKit Pose Tracking Yes Yes Plane Detection and Tracking Yes Yes Image Recognition and Tracking Yes Yes 3D Object Scanning, Recognition and Tracking Yes No Ambient light level and temperature Yes Yes 3rd party AR Libraries Environment Light Probes Yes No typically use Environment Reflective Texturing Yes No ARKit/ARCore if available or Link to Neural Net-based Object Detection Yes Yes implement own Access to Point Cloud Yes Yes tracking and Camera Intrinsics Yes Yes functionality if not Multi-user and persistent cloud-based anchors Yes Yes Multi-user Viewing Yes No Face tracking (iPhone X) Yes No OS Availability iOS Android and iOS
Copyright © Khronos® Group Inc. 2018 - Page 40 Bringing VR and AR to the Web
Native XR Apps Web XR Apps Future versions of OpenXR will include cross-platform extended AR functionality WebXR 3D Engines 3D Engines System-exposed AR Capabilities
Close ongoing collaboration between WebXR and OpenXR
Khronos providing the foundation for 3D and XR in the Web and native stacks
Copyright © Khronos® Group Inc. 2018 - Page 41 Khronos Standards for AR and VR
Download 3D object and scene data Vision and sensor processing - including neural network inferencing for machine learning
High-performance, low-latency 3D Graphics
Portable interaction with VR/AR sensor, haptic and display devices
Copyright © Khronos® Group Inc. 2018 - Page 42 Machine Learning Acceleration
Training on Desktop and Cloud Deployment on Embedded Devices
Training Data Sets
Vision and Inferencing Applications Live NeuralNeural Net Net Training Training OR Data NeuralTrainingFrameworks Net Training Trained Frameworks Optimization Vision and Inferencing FrameworksFrameworks Networks Runtimes
Desktop and Cloud Diverse Inferencing GPU/TPU Acceleration Acceleration Hardware
Copyright © Khronos® Group Inc. 2018 - Page 43 Machine Learning Training
NeuralNeural Net Net Training Training NeuralNeural NetFrameworks Net Training Training Authoring FrameworksFrameworks Interchange Frameworks
GPUs have well established APIs and libraries for compute acceleration
Desktop and Cloud Hardware
cuDNN MIOpen clDNN TPU
Copyright © Khronos® Group Inc. 2018 - Page 44 Machine Learning Acceleration
Training on Desktop and Cloud
Training Data Sets
Vision and Inferencing Applications Live NeuralNeural Net Net Training Training OR Data NeuralTrainingFrameworks Net Training Trained Frameworks Optimization Vision and Inferencing FrameworksFrameworks Networks Runtimes
Desktop and Cloud Diverse Inferencing GPU/TPU Acceleration Acceleration Hardware
Deployment on Embedded Devices
Copyright © Khronos® Group Inc. 2018 - Page 45 NNEF - Neural Network Exchange Format
Before NNEF
NN Authoring Framework 1 Inference Engine 1
NN Authoring Framework 2 Inference Engine 2
NN Authoring Framework 3 Inference Engine 3 Every Tool Needs an Exporter to Every Accelerator
With NNEF
NN Authoring Framework 1 Inference Engine 1
NN Authoring Framework 2 Inference Engine 2
NN Authoring Framework 3 Inference Engine 3 Optimization and processing tools
Copyright © Khronos® Group Inc. 2018 - Page 46 NNEF Captures a Neural Network Description
Network Structure File Distilled, platform independent network description Human readable, syntactical elements from Python
Standardized Operations Can associate multiple Data Files with one Network Rigorously defined semantics Structure File e.g. the same data in multiple formats Linear, convolution, pooling, normalization, activation, unary/binary Supports fully connected, convolutional, recurrent architectures Network Data File Binary formatNetwork contains Data parameter File tensors BinarySupports formatNetwork float contains and Data quantized parameter File (integer) tensors data Two Levels of Expressiveness BinarySupportsFlexible format bitfloat widthscontains and quantized and parameter quantization (integer) tensors algorithms data Flat SupportsFlexibleQuantization bitfloat widths and algorithms quantized and quantization expressed (integer) algorithmsas data extensible Basic transfer of computation graphs with standardized operations FlexibleQuantization bit widths algorithmscompound and quantization expressed operations algorithmsas extensible Simple to parse and translate to vendor specific formats QuantizationQuantization algorithmscompound info provided expressed operations as hints as extensible for execution Compositional Quantizationcompound info provided operations as hints for execution Quantization info provided as hints for execution Define custom compound operations Higher-level graph descriptions More complex to parse but offers more optimization hints
Split Structure and Data files Easy independent access to network structure or individual parameter data Set of files can use a container such as tar or zip with optional compression and encryption
Copyright © Khronos® Group Inc. 2018 - Page 47 NNEF 1.0
TensorFlow and Caffe NNEF V1.0 released in August 2018 Import / Live Export After positive industry feedback on Provisional specification released in December 2017 Imminent
TensorFlow Syntax and Caffe2 Parser/ Import / Validator Export Files
Google OpenVX NNAPI Ingestion & Convertor Execution
NNEF open source projects hosted on NNEF Working Group Participants Khronos NNEF GitHub repository Apache 2.0 license https://github.com/KhronosGroup/NNEF-Tools
Copyright © Khronos® Group Inc. 2018 - Page 48 NNEF and ONNX
Comparing Neural Network Same Industry Dynamics as Exchange Industry Initiatives LLVM and SPIR-V
Bidirectional translator in open source
Embedded Inferencing Import Training Interchange Initiating open source Defined Specification Open Source Project bidirectional translator
Stability for hardware deployment Software stack flexibility
Multi-company Governance Initiated by Facebook Khronos tried to use LLVM as a hardware IR BUT Flexible Precision / Quantization 32-bit Floating Point only LLVM evolves without needing to preserve backwards compatibility. Fine for software compilers – very difficult to ONNX and NNEF are Complementary manage for hardware toolchains and roadmaps - ONNX will HAVE to move fast to track authoring framework interchange SO - NNEF provides a stable bridge from training into edge inferencing engines Khronos created hardware oriented SPIR-V with bidirectional translation to LLVM
Copyright © Khronos® Group Inc. 2018 - Page 49 Machine Learning Acceleration
Training on Desktop and Cloud
Training Data Sets
Vision and Inferencing Applications Live NeuralNeural Net Net Training Training OR Data NeuralTrainingFrameworks Net Training Trained Frameworks Optimization Vision and Inferencing FrameworksFrameworks Networks Runtimes
Desktop and Cloud Diverse Inferencing GPU/TPU Acceleration Acceleration Hardware
Deployment on Embedded Devices
Copyright © Khronos® Group Inc. 2018 - Page 50 Three Broad Inferencing Choices
Run in Acceleration needs high-level programming training tools – typically large C++ applications framework E.g. TensorFlow or TensorFlow Lite
Export to The most popular industry choice today. inference Runtime often uses underlying acceleration APIs runtime E.g. TensorRT, CoreML
Compile to Often used to merge custom or vision code optimized alongside inferencing runtime – generates code LLVM for CPU + accelerated API code
Copyright © Khronos® Group Inc. 2018 - Page 51 SYCL Single Source C++ Parallel Programming • SYCL 1.2.1 Adopters Program released in July 2018 with open source conformance tests - https://www.khronos.org/news/press/khronos-releases-conformance-test-suite-for-sycl-1.2.1 • Multiple Implémentations shipping: triSYCL, ComputeCpp - http://sycl.tech • Multiple SYCL libraries for vision and inferencing - SYCL -BLAS, SYCL-DNN, SYCL-Eigen Single application source C++ templates and lambda file using STANDARD C++ functions separate host & device code
C++ Kernel Fusion can gives better performance on complex apps and libs than hand-coding
Accelerated code passed into device OpenCL compilers
Copyright © Khronos® Group Inc. 2018 - Page 52 TensorFlow on SYCL / OpenCL
Python Client C++ Client
Optional C API State-of-the-art C++ compilers can fuse nodes TensorFlow tensor Kernels Matrix multiply Convolutions in vision and neural (> 800 kernels) network graphs to provide optimized performance often faster than hand- Eigen Tensors SYCL-BLAS Library SYCL-DNN Library coded applications
Copyright © Khronos® Group Inc. 2018 - Page 53 Three Broad Inferencing Choices
Run in Acceleration needs high-level programming training tools – typically large C++ applications framework E.g. TensorFlow or TensorFlow Lite
Export to The most popular industry choice today. inference Runtime often uses underlying acceleration APIs runtime E.g. TensorRT, CoreML
Compile to Often used to merge custom or vision code optimized alongside inferencing runtime – generates code LLVM for CPU + accelerated API code
Copyright © Khronos® Group Inc. 2018 - Page 54 Platform Inferencing Stacks Consistent Three Steps 1. Import trained NN model file 2. Build optimized version of graph 3. Run graph on accelerated runtime using underlying low-level API
Core ML Model
Microsoft Windows Google Android Apple MacOS and iOS Windows Machine Learning (WinML) Neural Network API (NNAPI) CoreML https://docs.microsoft.com/en-us/windows/uwp/machine-learning/ https://developer.android.com/ndk/guides/neuralnetworks/ https://developer.apple.com/documentation/coreml
Copyright © Khronos® Group Inc. 2018 - Page 55 NNVM - Open Compiler for AI Inferencing
Paul G. Allen School of Computer Science & Engineering, University of Washington http://www.tvmlang.org/2017/08/17/tvm-release-announcement.html
1.Import Trained Network Description
2. Graph-level Optimizations
3. Decompose to primitive instructions and emit programs for accelerated run-times
SPIR-V IR for parallel accelerators LLVM IR for CPUs Backend in development Facebook Glow Compiler (Graph Lowering Optimizations) https://facebook.ai/developers/tools/glow
Copyright © Khronos® Group Inc. 2018 - Page 56 OpenVX
Wide range of vision hardware architectures OpenVX provides a high-level Graph-based abstraction -> Enables Graph-level optimizations! Can be implemented on almost any hardware or processor! -> Portable, Efficient Vision Processing!
Shipping Implementations X100 Dedicated Hardware Vision DSPs X10 Vision GPU Node Compute Vision Vision Node Node Multi-core Vision Node Power Efficiency Power X1 CPU
Computation Flexibility Vision Processing Graph
Copyright © Khronos® Group Inc. 2018 - Page 57 Extending OpenVX for Inferencing #1 Importing NNEF Neural Network Descriptions
Neural Network Extension NNEF Translator • OpenVX Nodes to represent common NN Layers • Ingests NNEF File and builds OpenVX node graph • 1D-4D Tensors to connect layers • Open source project in progress • INT16, INT7.8, INT8, and U8 Tensor Ops
vxActivationLayer vxConvolutionLayer vxDeconvolutionLayer
vxFullyConnectedLayer vxNormalizationLayer vxPoolingLayer
vxSoftmaxLayer vxROIPoolingLayer …
Vision Node Native Vision Vision Downstream Camera Application Control Node CNN Nodes Node Processing
An OpenVX graph mixing CNN nodes with traditional vision nodes NNEF Translator converts NNEF representation into OpenVX Node Graphs
Copyright © Khronos® Group Inc. 2018 - Page 60 Extending OpenVX for Inferencing #2 Creating Custom User Nodes
OpenVX/OpenCL Interop Kernel/Graph Import • Provisional Extension • Provisional Extension • Enables custom OpenCL acceleration to be • Defines container for executable or IR code invoked from OpenVX User Kernels • Enables arbitrary code to be inserted as a • Memory objects can be mapped or copied OpenVX Node in a graph
Application OpenVX user-kernels can access command queue and cl_mem objects to asynchronously Fully asynchronous host-device schedule OpenCL kernel execution operations during data exchange Copy or export OpenVX data objects cl_mem buffers into OpenVX cl_mem buffers data objects
OpenCL Command Queue
Runtime Map or copy OpenVX data objects into cl_mem buffers Runtime
OpenVX/OpenCL Interop
Copyright © Khronos® Group Inc. 2018 - Page 61 NNEF and OpenVX for Inferencing
Many inferencing stacks end up using OpenCL for hardware acceleration
Ingestion Compilation
Translator Compile to Proprietary IR/Binary Executable Runtimes Code NN Kernel Extension Import Acceleration APIs Vision Nodes To mix inferencing with vision and other User Nodes custom processing
Execute Compile to accelerated executable OpenVX code Runtime
Copyright © Khronos® Group Inc. 2018 - Page 62 OpenCL – Unique Heterogeneous Runtime
OpenCL is the only industry standard for low-level heterogeneous compute Portable control over memory and parallel task execution “The closest you can be to your embedded accelerator and still be portable”
Application or Application or Inferencing Run-time Inferencing Run-time
Growing number of optimized OpenCL vision and inferencing libraries Vision: OpenCV, Halide, Visioncpp Machine Learning: Xiaomi MACE, Arm Compute Library Fragmented GPU Linear Algebra: clDNN, clBlast, ViennaCL API Landscape
CPU GPU CPUCPU GPUGPU
FPGA DSP
Custom Hardware
Copyright © Khronos® Group Inc. 2018 - Page 63 OpenCL – Low-level Parallel Programing • Low-level programming of heterogeneous parallel compute resources - One code tree can be executed on CPUs, GPUs, DSPs and FPGA … • OpenCL C or C++ language to write kernel programs to execute on any compute device - Platform Layer API - to query, select and initialize compute devices - Runtime API - to build and execute kernels programs on multiple devices • The programmer gets to control: - What programs execute on what device - Where data is stored in various speed and size memories in the system - When programs are run, and what operations are dependent on earlier operations
GPU OpenCL CPU KernelOpenCL Kernel code DSP Runtime API CodeKernelOpenCL compiled for CPU loads and executes CPU CodeKernelOpenCL devices FPGA kernels across devices CodeKernel Devices Code Host
Copyright © Khronos® Group Inc. 2018 - Page 64 Growing OpenCL Adoption • 100s of applications using OpenCL acceleration - Rendering, visualization, video editing, simulation, image processing • Almost 6,000 GitHub repositories using OpenCL - Tools, applications, libraries, languages - Up from 4310 one year ago • Khronos Resource Hub https://www.khronos.org/opencl/resources/opencl-applications-using-opencl
Copyright © Khronos® Group Inc. 2018 - Page 65 OpenCL as Language/Library Backend
C++ based Language for Accelerated Single Java language Vision Compiler Open source Neural image computing Source C++ extensions processing directives for software library network processing and library in open Programming for open source Fortran, for machine framework computational source for OpenCL parallelism project C and C++ learning (beta) photography
Hundreds of languages, frameworks and projects using OpenCL to access vendor-optimized, heterogeneous compute runtimes
Copyright © Khronos® Group Inc. 2018 - Page 66 OpenCL Conformant Implementations
1.0|May09 1.1|Jul11 1.2|Jun12 1.0|Aug09 1.1|Aug10 1.2|May12 2.0|Dec14
1.0|May10 1.1|Feb11 Desktop 1.1|Mar11 1.2|Dec12 2.0|Jul14 2.1 | Jun16
1.0|May09 1.1|Jun10 1.2|May15
1.1|Aug12 1.2|Mar16 2.0|Apr17
1.0 | Feb11 1.2|Sep13 Mobile
1.1|Nov12 1.2|Apr14 2.0|Nov15
1.1|Apr12 1.2|Dec14
1.2|Sep14 Adoption Since Last IWOCL NVIDIA: 1.2 on all current GPUs – Windows 1.1|May13and Linux 1.0|Jan10Qualcomm: 2.0 on Adreno GPUs - Android 8.0 Embedded Intel:1.2 on latest processors – Windows 10 1.2|May15 1.2|Aug15 1.0|Jul13 FPGA 1.0|Dec14
Vendor timelines are Dec08 Jun10 Nov11 Nov13 Nov15 first conformant OpenCL 1.0 OpenCL 1.1 OpenCL 1.2 OpenCL 2.0 OpenCL 2.1 submission for each spec Specification Specification Specification Specification Specification generation Copyright © Khronos® Group Inc. 2018 - Page 67 OpenCL Ecosystem Roadmap
OpenCL has an active Bringing Heterogeneous compute to standard ISO C++ three track roadmap Khronos hosting C++17 Parallel STL C++20 Parallel STL with Ranges Proposal
SYCL 1.2 SYCL 1.2.1 C++11 Single source C++11 Single source Processor Deployment programming programming Flexibility Parallel computation across diverse processor architectures
2011 2015 2017 OpenCL 1.2 OpenCL 2.1 OpenCL 2.2 OpenCL C Kernel SPIR-V in Core C++ Kernel Language Kernel Deployment Language Flexibility Execute OpenCL C kernels on Vulkan GPU runtimes
Copyright © Khronos® Group Inc. 2018 - Page 68 OpenCL Next - Feature Set Flexibility • Defining OpenCL features that become optional for enhanced deployment flexibility - API and language features e.g. floating point precisions • Feature Sets avoid fragmentation - Defined to suit specific markets – e.g. desktop, embedded vision and inferencing • Implementations are conformant if fully support feature set functionality
OpenCL 2.2 Functionality = queryable, optional feature
Industry-defined Feature Set E.g. Khronos-defined Khronos-defined Embedded Vision OpenCL 2.2 Full Profile OpenCL 1.2 Full Profile and Inferencing Feature Set Feature Set
Copyright © Khronos® Group Inc. 2018 - Page 69 Universal Deployment Flexibility October 2018
Copyright © Khronos® Group Inc. 2018 - Page 70 Universal Deployment Flexibility
OpenCL Programs
Open source SPIRV-Cross converts Clspv and clvk SPIR-V to MSL or HLSL Open source tools enable open source tools OpenCL and Vulkan apps to be increasingly deployed on any platform Open source shims convert Vulkan to Metal or D3D API calls
Native Vulkan Drivers
UWP and D3D based Consoles
Copyright © Khronos® Group Inc. 2018 - Page 71 Bringing OpenCL Compute to Vulkan • Experimental Clspv Compiler from Google, Adobe and Codeplay - Compiles OpenCL C to Vulkan’s SPIR-V execution environment Successfully tested on over 200K lines of Adobe OpenCL C production code - Open source - tracks top-of-tree LLVM and clang, not a fork
Increasing deployment options for OpenCL developers OpenCL C OpenCL e.g. Vulkan is a supported Source Host Code API on Android
Prototype open Clspv Run-time source project Compiler API Translator Prototype open https://github.com/google/clspv source project https://github.com/kpet/clvk Runtime Kevin Petit from ARM
Copyright © Khronos® Group Inc. 2018 - Page 72 Refining clspv with Diverse Workloads
Public and Private test kernel repositories so developers can upload Compile diverse OpenCL C OpenCL WG Discussion kernels to be used in long-term kernel workloads at Budapest: perf/regression testing – Apache 2.0 Getting kernels into https://github.com/KhronosGroup/OpenCL-Sample-Kernels this clspv process! https://gitlab.khronos.org/opencl/OpenCL-Sample-Kernels 1. Setting up kernel Efficient mapping to sample repo with more Interesting domains to explore include Vulkan SPIR-V? restricted license for existing OpenCL compute libraries - sensitive kernels including vision and inferencing 2. Work with open source library vendors Updates to compiler achieve to extract key kernels efficient mapping? – maybe a funded project?
3. Encouraging members to submit Vulkan is already expanding its compute Propose updates to Vulkan test kernels for key model e.g. 16-bit storage, Variable programming model use cases: e.g. vision Pointers, Subgroups. Compact memory and inferencing types and operations coming
Copyright © Khronos® Group Inc. 2018 - Page 73 SPIR-V Ecosystem
• Khronos defined cross-API IR GLSL HLSL • Native graphics and parallel compute
Third party kernel and • Stable specification to complement shader languages glslang DXC MSL LLVM Open Source Project
HLSL SPIRV-Cross OpenCL C OpenCL C++ SYCL GLSL Front-end Front-end Front-end
SPIR-V (Dis)Assembler
SPIR-V Validator LLVM LLVM to SPIR-V Bi-directional Khronos cooperating with SPIRV-opt | SPIRV-remap Translators Clang/LLVM Community
Optimization Khronos-hosted Tools Open Source Projects https://github.com/KhronosGroup/SPIRV-Tools
IHV Driver Runtimes
Copyright © Khronos® Group Inc. 2018 - Page 74 SPIR-V Recent Updates • SPIR-V 1.3 released - Non-uniform subgroup ops to support Vulkan Subgroups - Six additional extensions incorporated • Tooling evolves - SPIRV-Tools: validation, optimizations, HLSL legalization - SPIRV-Reflect: C/C++ reflection API for SPIR-V shader bytecode - Fossilize serialization format for recording and replaying SPIR-V • Compilers - Glslang: used as basis for GLSL and SPIR-V extensions - DXC: HLSL to SPIR-V compiler
Copyright © Khronos® Group Inc. 2018 - Page 75 Vulkan Portability Initiative
Enabling and accelerating the creation of tools and run-time libraries for Vulkan applications to run on platforms supporting only Metal or Direct3D
Multiple third party implementations Most Vulkan functionality supported. Of Vulkan Portability run-time E.g. Metal 1.0 supports everything libraries and tools except: Triangle fans, Separate stencil reference masks, Vulkan Events, some texture-specific Portability Layers swizzles. Metal 2.0 reduces this list DevSim Layer – develop and debug with the features of a Portability Library on a full Vulkan driver Porting Research Vulkan Portability Validation Layer – enforces use of What % of Vulkan can be Extension Portability library features EFFICIENTLY supported Standardizes app queries for Vulkan by run-time layer over features NOT supported on Conformance Tests D3D and Metal library/platform combination Vulkan CTS will skip tests for missing functionality Subsets cannot be officially conformant but functionality that is present must work!
Implementation and testing experience
Copyright © Khronos® Group Inc. 2018 - Page 76 Bringing Vulkan Apps to Apple Platforms Today
Dota 2 running on Mac up to Applications 50% faster than native OpenGL Open source SDK to build, run, and debug applications First productions apps using on macOS including MoltenVK already shipping Vulkan validation layer support on macOS and iOS macOS SDK
SPIRV-Cross macOS / iOS Beta release – working to pass all Convert SPIR-V shaders to Run-time applicable conformance tests platform source formats Maps Vulkan to Metal Previously a paid product from MoltenVK for macOS and iOS Brenwill Workshop. Now released into For macOS 10.11, iOS 9.0 and up OPEN SOURCE. Completely free to use - no fees or royalties – including commercial applications
Copyright © Khronos® Group Inc. 2018 - Page 77 Valve - Vulkan Dota 2 on macOS
Shipping Now. Vulkan delivering up to 50% performance increase over native OpenGL
Copyright © Khronos® Group Inc. 2018 - Page 78 gfx-rs from Mozilla • Vulkan Portability over D3D, Metal, and OpenGL - D3D layer useful for Vulkan on UWP platforms such as Windows 10 S, Polaris, Xbox One - https://github.com/gfx-rs/gfx - https://github.com/gfx-rs/portability Efficiently running Dota 2 on Mac – working to increase conformance coverage https://gfx-rs.github.io/2018/08/10/dota2-macos-performance.html C Apps and Libs
Rust Apps and Libs Warden Test framework Portability (C API) WebRender WebGPU gfx-HAL (Rust 3D API) Secondary Backends
Copyright © Khronos® Group Inc. 2018 - Page 79 Vulkan Portability Timeline
Multiple iOS and macOS apps being organically developed/ported – support Portability Extension through MoltenVK website. E.g. Ratified Forsaken Remastered on Mac DevSim and Validation MoltenVK implementing Layers Complete Dota 2 on Mac KHR_variable_pointers and Ships in Production KHR_16bit_storage to support Updates to CTS to turn off clspv tests for missing First iOS Apps using functionality MoltenVK ship RenderDoc being integrated with through app store MoltenVK MoltenVK passing MoltenVK in open source conformance for all present Dota 2 on Mac Prototype Qt Running on Mac Interest in Vulkan layer for functionality. MacOS SDK from LunarG through MoltenVK Playstation – may need funding? gfx-rs soon after
GDC June TODAY GDC 2018 2018 2018 2019
Copyright © Khronos® Group Inc. 2018 - Page 80 Safety Critical Acceleration APIs October 2018
Copyright © Khronos® Group Inc. 2018 - Page 81 OpenGL SC Milestones
OpenGL SC 1.0 - 2005 OpenGL SC 2.0 - April 2016 Fixed function graphics Shader programmable pipeline safety critical subset safety critical subset
OpenGL ES 1.0 - 2003 OpenGL ES 2.0 - 2007 Fixed function graphics Shader programmable pipeline
Copyright © Khronos® Group Inc. 2018 - Page 82 OpenVX SC - Safety Critical Vision Processing • Based on OpenVX 1.1 specification - Enhanced determinism - Specification identifies and numbers requirements • MISRA C clean per KlocWorks v10 • Divides functionality into “development” and “deployment” feature sets - Adds requirement to support import/export extension
OpenVX SC Verify Binary Import OpenVX SC Development Feature format Deployment Feature Set Set (Create Graph) Export (Execute Graph)
Entire graph creation API Implementation- No graph creation API dependent format
Copyright © Khronos® Group Inc. 2018 - Page 83 Khronos Safety Critical Advisory Forum
Need for new-generation APIs for safety certifiable vision, graphics and compute e.g. ISO 26262 and DO-178B/C Khronos SC Activities
Industry outreach and cooperation Vulkan and OpenCL Gathering SC requirements
AESIN Automotive ADAS & AV + security https://aesin.org.uk Generating guidelines for designing safety critical APIs to ease system certification. SYCL SC Open to Khronos member AND industry experts Guidelines to augment Industry https://www.khronos.org/advisors/kscaf First Safe and Secure Parallel and MISRA C++ Heterogeneous C++ C++ WG23 Programming Vulnerabilities Safe AI for Automotive ISO C Safe and Secure SG We invite safety critical ISO C++ Vulnerabilities Safety Critical SG experts to join KSCAF! No cost or work commitment OpenVX SC 1.1 - May 2017 Being incorporated into new generation mainline OpenVX specification
Copyright © Khronos® Group Inc. 2018 - Page 84 Liaisons and Exploratory Initiatives October 2018 Neil Trevett | Khronos President NVIDIA | VP Developer Ecosystems
Copyright © Khronos® Group Inc. 2018 - Page 85 Khronos Liaisons
Formal Liaisons Khronos and OGC (Open Geospatial Consortium) jointly advancing open geospatial standards related ETSI (The European Telecommunications to the fields of augmented and virtual reality, Standards Institute) ARF (AR Framework) distributed simulation, and 3D content services for interoperable AR applications has identified Khronos standards as developing key standards to be referenced
COLLADA is a formal ISO standard. Working to make glTF a formal ISO standard to enable its adoption in products such as PDF Working together to ensure hardware HMD standards and OpenXR are aligned
Informal Liaisons
Copyright © Khronos® Group Inc. 2018 - Page 86 Takyon Exploratory Group
Abaco – made a standardization proposal to Khronos Takyon - unifies low level point-to-point communication and signaling APIs, for the Embedded HPC audience Khronos – outreaching to members and industry to determine interest. Will establish working group if established industry need
Before After Five Essential API Commands Create: create a communication path to a remote endpoint Send: send a message to the remote endpoint (blocking and non-blocking) Send Completion Test: test if a previously started non-blocking send is complete Receive: receive a message Destroy: destroy the path https://www.khronos.org/exploratory/takyon
Copyright © Khronos® Group Inc. 2018 - Page 87 Need for Camera Control API? • Embedded market is distinct from mobile market -> different camera APIs? - Different sensors, multiple sensors, synchronization etc. etc… • Should we reach out to European Machine Vision Association (EMVA) - https://www.emva.org/ - Looking to expand functionality – productive cooperation? - https://www.emva.org/news/emva-hosts-two-new-standards/ • Genicam - https://www.emva.org/standards-technology/genicam/ - Currently a simple interface for industrial inspection - Should we talk to camera vendors e.g. Basler https://www.baslerweb.com/en/
Defines control of Sensor, Color Filter Array Lens, Flash, Focus, Aperture OpenKCAM standard is Auto Exposure (AE) currently on ice – do we Auto White Balance (AWB) Auto Focus (AF) Stream of Images for need to restart? Image Signal Vision Processing Processor (ISP)
Copyright © Khronos® Group Inc. 2018 - Page 88 Principles of Operation October 2018
Neil Trevett | Khronos President NVIDIA | VP Developer Ecosystems
Copyright © Khronos® Group Inc. 2018 - Page 89 Khronos Principles of Organization
Any company is welcome to join. Any member can propose new One company one vote standards initiatives Software Strong IP Framework to enable Conformance Tests and Adopters ROYALTY-FREE specifications: Programs for defining conformance, all members agree not assert patents specification integrity and cross- against conformant implementations vendor portability
Only invest where there is Silicon Non-profit organization - strong industry momentum to Membership and Adopters fees ensure industry relevance – cover operating, marketing and let Darwinism rule! engineering expenses
Copyright © Khronos® Group Inc. 2018 - Page 90 Khronos Cooperative Framework
Advisory Panels Khronos Board API Working Groups Individual industry experts Strategy, budget and (Industry, Academic and by invitation, no fee. $$ oversight Associate members) Provide requirements and draft spec feedback $ One vote per industry member
Industry and Community Adopters Conformance Royalty-Free Implementations, Tools Educator Guidelines Feedback and $ Programs Tests Specifications Documentation, Samples Courseware Materials contributions on released materials Open and royalty-free. Often open sourced
Adopters Developers Educators / Certifiers Build conformant Develop applications Create Courses implementation and products using the APIs Training and Certification
Copyright © Khronos® Group Inc. 2018 - Page 91 Khronos Standardization Process WG Passes Finalized Specification to Board - Board votes to Ratify Specification - Specification and Conformance Tests Publicly Released
Any Member Can Propose a Working Group Working Groups - Board votes to establish - Chair & specification editor elected from WG members Exploratory Group if enough interest - Design Contributions welcome from any WG member - Exploratory Group creates detailed - Consensus-based decision process – one company one vote Statement of Work (SOW) - All work is discussed and documented online - Board votes to establish a Working - One teleconference per week (in English) Group to execute the SOW - Three Khronos F2F Meetings a year - open to all members
Anyone with member email address can create account for full access to Khronos resources under Khronos NDA - Draft specifications and tests for all working groups - Email reflectors for each working group
Any Company Can Join Any Member Can Propose an Khronos as a Member Individual for an Advisory Panel - Board - $75K/year - Working Group votes to approve - Contributor - $18K/year - Invitation sent to individual to sign - Associate (<20 employees) - $3.5K/year Advisory Panel Agreement - Academic - $1K/year
Copyright © Khronos® Group Inc. 2018 - Page 92 Khronos IP and Adopters Framework
Working Group releases Conformance Tests for each Ratified Specification
When a Submission is All Members Agree successfully reviewed that to NOT assert their patents Adopters Process product becomes Conformant against any CONFORMANT defines how to submit -> implementation of a RATIFIED Conformance Test Results for The Implementation can use the Specification (no patents need Working Group Review API trademark be disclosed) AND The Implementation becomes protected from patent litigation Implementer executes Adopters from other members Membership agreement contains Agreement to access Adopters mechanisms for members to Process. Adopters fee for each exclude specific patents (rarely generation of specification used as license grant is very (typically around $20K for narrow) unlimited number of products)
Copyright © Khronos® Group Inc. 2018 - Page 93 Khronos Conformance Submission Process
Company Company executes Port Test Upload test results Successful Review enables implementing Adopters Agreement Source to to Khronos private products to use Khronos Khronos spec wishes and pays fee products and web-site. Peer trademarks, be IP Protected and to use the (typically ~$20-50K generate test review by to be listed on Khronos website trademark for unlimited results members/Adopters E.g. “We have products) to enter implemented Adopters program for OpenGL ES” an API Full use of logo Restricted use and trademark trademark (not logo) with small Full use of logo Adopter Benefit with disclaimer disclaimer and trademark language AND IP Protection
Copyright © Khronos® Group Inc. 2018 - Page 94 Exploratory Group Process
Any Member(s) IP-Free Briefing Board votes to Exploratory Group Board votes to Working can propose a with informal establish Exploratory creates detailed establish a Group new Working straw-poll on Group if enough Statement of Work Working Group to Meetings Group level on interest member interest (SOW) execute the SOW Start
Detailed design starts. Participation incurs NO IP licensing obligations Participation removes right to withdraw from Discussions should be high-level: requirements, high-level design directions IP licensing obligations
Exploratory Group Process is designed so that Khronos, and its members, can explore whether to undertake a new initiative without committing any member to IP licensing obligations
Copyright © Khronos® Group Inc. 2018 - Page 95 New Open Source Engagement Model
• Khronos is open sourcing specification sources, Source Materials for Specifications and Reference conformance tests, tools Documentation CONTRIBUTED Under Khronos IP Framework - Merge requests welcome from the community (you won’t assert patents against conformant implementations, (subject to review by OpenCL working group) and license copyright for Khronos use) • Deeper Community Enablement Spec and - Mix your own documentation! Ref Language Source Redistribution - Contribute and fix conformance tests under CC-BY 4.0 - Fix the specification, headers, ICD etc. Contributions and - Contribute new features (carefully) Distribution under Apache 2.0 Spec Build System and Contributions and Scripts Distribution under Conformance Apache 2.0 Test Suite Source Khronos builds and Ratifies Spec and Ref Language Source and Canonical Specification under derivative materials. Khronos IP Framework. Re-mixable under CC-BY by the Khronos Adopters No changes or re-hosting allowed industry and community Program
Anyone can test any Conformant Implementations can Community built implementation at use trademark and are covered documentation and any time by Khronos IP Framework tools
Copyright © Khronos® Group Inc. 2018 - Page 96 Khronos Advisory Panels
Specification drafts and Requirements and invitations for feedback on requirements and feedback specification drafts
Working Shared Email Advisory list and Group Repository Panel Hosted by Khronos. Khronos Members Under Khronos NDA Khronos Advisors Any company can join Invited industry experts Membership Fee $0 Cost Sign NDA and IP Framework Sign NDA and IP Framework
Advisory Panels Active for Vulkan, OpenCL/SYCL and NNEF
Copyright © Khronos® Group Inc. 2018 - Page 97 Khronos Education Forum
Support educators to teach Khronos technologies!
Khronos Education Forum Online Hub for collaboration https://www.khronos.org/education Shared course materials under permissive license Open to all – no fees or membership required Just send an email to [email protected] Educators can connect Direct contact with to share and discuss Khronos working course materials groups for review and feedback
Oregon State University Graphics Class with Vulkan Ref Cards
Copyright © Khronos® Group Inc. 2018 - Page 98 The Value of Khronos Participation
See an early window Have a voice in Develop into the future of the how key products in Products are aligned industry technology standards parallel with with global market roadmap before evolve to suit spec drafting needs and trends products are your business for faster time developed needs to market Members can ship products faster than non-members
Gather Industry Draft Specifications Publicly Release Non-members Requirements for future Confidential to Specifications and Release silicon acceleration Khronos members Conformance Tests Products
The Khronos standardization process is proven to RAPIDLY generate industry consensus on open standards to EFFICIENTLY create new market opportunities
Copyright © Khronos® Group Inc. 2018 - Page 99 Summary and Contacts Angela • Khronos is creating cutting-edge royalty-free open standards WeChat ID: cxy3532 - For 3D, VR/AR, Compute, Vision and machine learning • Please join Khronos! - We welcome members from China and Asia - Influence the direction of industry standards - Get early access to draft specifications • More Information www.khronos.org - WeChat ID: neiltrevett - [email protected] - @neilt3d
Copyright © Khronos® Group Inc. 2018 - Page 100