Khronos Group SYCL 1.2.1 Specification

Total Page:16

File Type:pdf, Size:1020Kb

Khronos Group SYCL 1.2.1 Specification SYCL™ Specification SYCL™ integrates OpenCL™ devices with modern C++ Version 1.2.1 Document Revision: 6 Revision Date: May 14, 2019 Git revision: heads/travis-0-g4f9a12d-dirty Khronos® OpenCL™ Working Group — SYCL™ subgroup Editors: Ronan Keryell, Maria Rovatsou & Lee Howes Copyright 2011-2019 The Khronos® Group Inc. All Rights Reserved SYCL 1.2.1 Copyright© 2013-2019 The Khronos® Group Inc. All Rights Reserved. This specification is protected by copyright laws and contains material proprietary to the Khronos® Group, Inc. It or any components may not be reproduced, republished, distributed, transmitted, displayed, broadcast, or other- wise exploited in any manner without the express prior written permission of Khronos® Group. You may use this specification for implementing the functionality therein, without altering or removing any trademark, copyright or other notice from the specification, but the receipt or possession of this specification does not convey any rights to reproduce, disclose, or distribute its contents, or to manufacture, use, or sell anything that it may describe, in whole or in part. Khronos® Group grants express permission to any current Promoter, Contributor or Adopter member of Khronos® to copy and redistribute UNMODIFIED versions of this specification in any fashion, provided that NO CHARGE is made for the specification and the latest available update of the specification for any version of the API is used whenever possible. Such distributed specification may be reformatted AS LONG AS the contents of the specification are not changed in any way. The specification may be incorporated into a product that is sold as long as such product includes significant independent work developed by the seller. A link to the current ver- sion of this specification on the Khronos® Group website should be included whenever possible with specification distributions. Khronos® Group makes no, and expressly disclaims any, representations or warranties, express or implied, re- garding this specification, including, without limitation, any implied warranties of merchantability or fitness for a particular purpose or non-infringement of any intellectual property. Khronos® Group makes no, and expressly disclaims any, warranties, express or implied, regarding the correctness, accuracy, completeness, timeliness, and reliability of the specification. Under no circumstances will the Khronos® Group, or any of its Promoters, Con- tributors or Members or their respective partners, officers, directors, employees, agents, or representatives be liable for any damages, whether direct, indirect, special or consequential damages for lost revenues, lost profits, or otherwise, arising from or in connection with these materials. Khronos® is a registered trademark and SYCL™, SPIR™, WebGL™, EGL™, COLLADA™, StreamInput™, OpenVX™, OpenKCam™, glTF™, OpenKODE™, OpenVG™, OpenWF™, OpenSL ES™, OpenMAX™, OpenMAX AL™, OpenMAX IL™ and OpenMAX DL™ and WebCL™ are trademarks of the Khronos® Group Inc. OpenCL™ is a trademark of Apple Inc. and OpenGL® and OpenML® are registered trademarks and the OpenGL ES™ and OpenGL SC™ logos are trademarks of Silicon Graphics International used under license by Khronos®. All other product names, trademarks, and/or company names are used solely for identification and belong to their respective owners. 2 Contents 1 Acknowledgements 13 2 Introduction 14 3 SYCL Architecture 16 3.1 Overview . 16 3.2 Anatomy of a SYCL application . 17 3.3 The SYCL Platform Model . 18 3.3.1 Platform Mixed Version Support . 19 3.4 SYCL Execution Model . 19 3.4.1 SYCL Application Execution Model . 19 3.4.1.1 OpenCL resources managed by SYCL Application . 19 3.4.1.2 SYCL Command Groups and Execution Order . 20 3.4.2 SYCL Kernel Execution Model . 22 3.5 Memory Model . 23 3.5.1 SYCL Application Memory Model . 23 3.5.2 SYCL Device Memory Model . 26 3.5.2.1 Access to memory . 26 3.5.2.2 Memory consistency . 27 3.5.2.3 Atomic operations . 27 3.6 The SYCL programming model . 27 3.6.1 Basic data parallel kernels . 28 3.6.2 Work-group data parallel kernels . 28 3.6.3 Hierarchical data parallel kernels . 28 3.6.4 Kernels that are not launched over parallel instances . 28 3.6.5 Synchronization . 29 3.6.5.1 Synchronization in the SYCL Application . 29 3.6.5.2 Synchronization in SYCL Kernels . 30 3.6.6 Error handling . 30 3.6.7 Fallback Mechanism . 30 3.6.8 Scheduling of kernels and data movement . 31 3.6.9 Managing object lifetimes . 31 3.6.10 Device discovery and selection . 32 3.6.11 Interfacing with OpenCL . 32 3.7 Memory objects . 33 3.8 SYCL for OpenCL Framework . 34 3.9 SYCL device compiler . 34 3.9.1 Building a SYCL program . 34 3.9.2 Naming of kernels . 35 3.10 Language restrictions in kernels . 35 3.10.1 SYCL Linker . 36 3.10.2 Functions and datatypes available in kernels . 36 3.11 Execution of kernels on the SYCL host device . 36 3 CONTENTS SYCL 1.2.1 3.12 Endianness support . 37 3.13 Example SYCL application . 37 4 SYCL Programming Interface 39 4.1 Header files and namespaces . 39 4.2 Class availability . 39 4.3 Common interface . 40 4.3.1 OpenCL interoperability . 40 4.3.2 Common reference semantics . 40 4.3.3 Common by-value semantics . 42 4.3.4 Properties . 44 4.3.4.1 Properties interface . 45 4.4 Param traits class . 46 4.5 C++ Standard library classes required for the interface . 46 4.6 SYCL runtime classes . 47 4.6.1 Device selection class . 48 4.6.1.1 Device selector interface . 48 4.6.1.2 Derived device selector classes . 49 4.6.2 Platform class . 50 4.6.2.1 Platform interface . 51 4.6.2.2 Platform information descriptors . 52 4.6.3 Context class . 53 4.6.3.1 Context interface . 53 4.6.3.2 Context information descriptors . 56 4.6.4 Device class . 56 4.6.4.1 Device interface . 57 4.6.4.2 Device information descriptors . 60 4.6.5 Queue class . 71 4.6.5.1 Queue interface . 72 4.6.5.2 Queue information descriptors . 75 4.6.5.3 Queue Properties . 76 4.6.5.4 Queue error handling . 76 4.6.6 Event class . 77 4.6.6.1 Event information and profiling descriptors . 80 4.7 Data access and storage in SYCL . 81 4.7.1 Host allocation . 81 4.7.1.1 Default Allocators . 82 4.7.2 Buffers . 82 4.7.2.1 Buffer Interface . 83 4.7.2.2 Buffer Properties . 91 4.7.2.3 Buffer Synchronization Rules . 92 4.7.3 Images . 93 4.7.3.1 Image Interface . 94 4.7.3.2 Image Properties . ..
Recommended publications
  • GLSL 4.50 Spec
    The OpenGL® Shading Language Language Version: 4.50 Document Revision: 7 09-May-2017 Editor: John Kessenich, Google Version 1.1 Authors: John Kessenich, Dave Baldwin, Randi Rost Copyright (c) 2008-2017 The Khronos Group Inc. All Rights Reserved. This specification is protected by copyright laws and contains material proprietary to the Khronos Group, Inc. It or any components may not be reproduced, republished, distributed, transmitted, displayed, broadcast, or otherwise exploited in any manner without the express prior written permission of Khronos Group. You may use this specification for implementing the functionality therein, without altering or removing any trademark, copyright or other notice from the specification, but the receipt or possession of this specification does not convey any rights to reproduce, disclose, or distribute its contents, or to manufacture, use, or sell anything that it may describe, in whole or in part. Khronos Group grants express permission to any current Promoter, Contributor or Adopter member of Khronos to copy and redistribute UNMODIFIED versions of this specification in any fashion, provided that NO CHARGE is made for the specification and the latest available update of the specification for any version of the API is used whenever possible. Such distributed specification may be reformatted AS LONG AS the contents of the specification are not changed in any way. The specification may be incorporated into a product that is sold as long as such product includes significant independent work developed by the seller. A link to the current version of this specification on the Khronos Group website should be included whenever possible with specification distributions.
    [Show full text]
  • Report of Contributions
    X.Org Developers Conference 2020 Report of Contributions https://xdc2020.x.org/e/XDC2020 X.Org Developer … / Report of Contributions State of text input on Wayland Contribution ID: 1 Type: not specified State of text input on Wayland Wednesday, 16 September 2020 20:15 (5 minutes) Between the last impromptu talk at GUADEC 2018, text input on Wayland has become more organized and more widely adopted. As before, the three-pronged approach of text_input, in- put_method, and virtual keyboard still causes confusion, but increased interest in implementing it helps find problems and come closer to something that really works for many usecases. The talk will mention how a broken assumption causes a broken protocol, and why we’re notdone with Wayland input methods yet. It’s recommended to people who want to know more about the current state of input methods on Wayland. Recommended background: aforementioned GUADEC talk, wayland-protocols reposi- tory, my blog: https://dcz_self.gitlab.io/ Code of Conduct Yes GSoC, EVoC or Outreachy No Primary author: DCZ, Dorota Session Classification: Demos / Lightning talks I Track Classification: Lightning Talk September 30, 2021 Page 1 X.Org Developer … / Report of Contributions IGT GPU Tools 2020 Update Contribution ID: 2 Type: not specified IGT GPU Tools 2020 Update Wednesday, 16 September 2020 20:00 (5 minutes) Short update on IGT - what has changed in the last year, where are we right now and what we have planned for the near future. IGT GPU Tools is a collection of tools and tests aiding development of DRM drivers. It’s widely used by Intel in its public CI system.
    [Show full text]
  • GLSL Specification
    The OpenGL® Shading Language Language Version: 4.60 Document Revision: 3 23-Jul-2017 Editor: John Kessenich, Google Version 1.1 Authors: John Kessenich, Dave Baldwin, Randi Rost Copyright (c) 2008-2017 The Khronos Group Inc. All Rights Reserved. This specification is protected by copyright laws and contains material proprietary to the Khronos Group, Inc. It or any components may not be reproduced, republished, distributed, transmitted, displayed, broadcast, or otherwise exploited in any manner without the express prior written permission of Khronos Group. You may use this specification for implementing the functionality therein, without altering or removing any trademark, copyright or other notice from the specification, but the receipt or possession of this specification does not convey any rights to reproduce, disclose, or distribute its contents, or to manufacture, use, or sell anything that it may describe, in whole or in part. Khronos Group grants express permission to any current Promoter, Contributor or Adopter member of Khronos to copy and redistribute UNMODIFIED versions of this specification in any fashion, provided that NO CHARGE is made for the specification and the latest available update of the specification for any version of the API is used whenever possible. Such distributed specification may be reformatted AS LONG AS the contents of the specification are not changed in any way. The specification may be incorporated into a product that is sold as long as such product includes significant independent work developed by the seller. A link to the current version of this specification on the Khronos Group website should be included whenever possible with specification distributions.
    [Show full text]
  • Marc Benito Bermúdez, Matina Maria Trompouki, Leonidas Kosmidis
    Evaluation of Graphics-based General Purpose Computation Solutions for Safety Critical Systems: An Avionics Case Study Marc Benito Bermúdez, Matina Maria Trompouki, Leonidas Kosmidis Juan David Garcia, Sergio Carretero, Ken Wenger {marc.benito, leonidas.Kosmidis}@bsc.es Motivation • Massively parallel architectures, high computational power and high energy efficiency, in thermally limited systems • OpenCL and CUDA dominate the market of GPGPU Embedded GPUs can programming in HPC provide the required • Easily programmable APIs performance • Cannot be used in safety critical systems because of pointers and dynamic memory allocation • Safety Critical Systems require higher performance to support new advanced functionalities • In this Bachelor’s thesis [1] awarded with a Technology Transfer Award we: Characteristics of Safety Critical Systems: • Analyze their differences compared to desktop graphics APIs • Certification: Need to comply with safety standards: ISO26262 / DO178 • Demonstrate how a safety-critical application written in a non-certifiable • Very conservative in terms of hardware and software: simple programming model can be converted to use safety-critical APIs. processors, mainly single core • Evaluate performance and programmability trade-offs OpenGL and Vulkan versions diagram Visual output of the Avionics Application Initial prototype GPU-based avionics application, written in Vulkan and ported to OpenGL SC 2 following the guidelines of [2][3]. Basic Compute Shader Shader 3D Image Processed Image Brook SC Porting and Comparison
    [Show full text]
  • Blackberry QNX Multimedia Suite
    PRODUCT BRIEF QNX Multimedia Suite The QNX Multimedia Suite is a comprehensive collection of media technology that has evolved over the years to keep pace with the latest media requirements of current-day embedded systems. Proven in tens of millions of automotive infotainment head units, the suite enables media-rich, high-quality playback, encoding and streaming of audio and video content. The multimedia suite comprises a modular, highly-scalable architecture that enables building high value, customized solutions that range from simple media players to networked systems in the car. The suite is optimized to leverage system-on-chip (SoC) video acceleration, in addition to supporting OpenMAX AL, an industry open standard API for application-level access to a device’s audio, video and imaging capabilities. Overview Consumer’s demand for multimedia has fueled an anywhere- o QNX SDK for Smartphone Connectivity (with support for Apple anytime paradigm, making multimedia ubiquitous in embedded CarPlay and Android Auto) systems. More and more embedded applications have require- o Qt distributions for QNX SDP 7 ments for audio, video and communication processing capabilities. For example, an infotainment system’s media player enables o QNX CAR Platform for Infotainment playback of content, stored either on-board or accessed from an • Support for a variety of external media stores external drive, mobile device or streamed over IP via a browser. Increasingly, these systems also have streaming requirements for Features at a Glance distributing content across a network, for instance from a head Multimedia Playback unit to the digital instrument cluster or rear seat entertainment units. Multimedia is also becoming pervasive in other markets, • Software-based audio CODECs such as medical, industrial, and whitegoods where user interfaces • Hardware accelerated video CODECs are increasingly providing users with a rich media experience.
    [Show full text]
  • The Opencl Specification
    The OpenCL Specification Version: 2.0 Document Revision: 22 Khronos OpenCL Working Group Editor: Aaftab Munshi Last Revision Date: 3/18/14 Page 1 1. INTRODUCTION ............................................................................................................... 10 2. GLOSSARY ......................................................................................................................... 12 3. THE OPENCL ARCHITECTURE .................................................................................... 23 3.1 Platform Model ................................................................................................................................ 23 3.2 Execution Model .............................................................................................................................. 25 3.2.1 Execution Model: Mapping work-items onto an NDRange ........................................................................28 3.2.2 Execution Model: Execution of kernel-instances ........................................................................................30 3.2.3 Execution Model: Device-side enqueue ......................................................................................................31 3.2.4 Execution Model: Synchronization .............................................................................................................32 3.2.5 Execution Model: Categories of Kernels ....................................................................................................33 3.3 Memory
    [Show full text]
  • Standards for Vision Processing and Neural Networks
    Standards for Vision Processing and Neural Networks Radhakrishna Giduthuri, AMD [email protected] © Copyright Khronos Group 2017 - Page 1 Agenda • Why we need a standard? • Khronos NNEF • Khronos OpenVX dog Network Architecture Pre-trained Network Model (weights, …) © Copyright Khronos Group 2017 - Page 2 Neural Network End-to-End Workflow Neural Network Third Vision/AI Party Applications Training Frameworks Tools Datasets Trained Vision and Neural Network Network Inferencing Runtime Network Model Architecture Desktop and Cloud Hardware Embedded/Mobile Embedded/MobileEmbedded/Mobile Embedded/Mobile/Desktop/CloudVision/InferencingVision/Inferencing Hardware Hardware cuDNN MIOpen MKL-DNN Vision/InferencingVision/Inferencing Hardware Hardware GPU DSP CPU Custom FPGA © Copyright Khronos Group 2017 - Page 3 Problem: Neural Network Fragmentation Neural Network Training and Inferencing Fragmentation NN Authoring Framework 1 Inference Engine 1 NN Authoring Framework 2 Inference Engine 2 NN Authoring Framework 3 Inference Engine 3 Every Tool Needs an Exporter to Every Accelerator Neural Network Inferencing Fragmentation toll on Applications Inference Engine 1 Hardware 1 Vision/AI Inference Engine 2 Hardware 2 Application Inference Engine 3 Hardware 3 Every Application Needs know about Every Accelerator API © Copyright Khronos Group 2017 - Page 4 Khronos APIs Connect Software to Silicon Software Silicon Khronos is an International Industry Consortium of over 100 companies creating royalty-free, open standard APIs to enable software to access
    [Show full text]
  • Openbricks Embedded Linux Framework - User Manual I
    OpenBricks Embedded Linux Framework - User Manual i OpenBricks Embedded Linux Framework - User Manual OpenBricks Embedded Linux Framework - User Manual ii Contents 1 OpenBricks Introduction 1 1.1 What is it ?......................................................1 1.2 Who is it for ?.....................................................1 1.3 Which hardware is supported ?............................................1 1.4 What does the software offer ?............................................1 1.5 Who’s using it ?....................................................1 2 List of supported features 2 2.1 Key Features.....................................................2 2.2 Applicative Toolkits..................................................2 2.3 Graphic Extensions..................................................2 2.4 Video Extensions...................................................3 2.5 Audio Extensions...................................................3 2.6 Media Players.....................................................3 2.7 Key Audio/Video Profiles...............................................3 2.8 Networking Features.................................................3 2.9 Supported Filesystems................................................4 2.10 Toolchain Features..................................................4 3 OpenBricks Supported Platforms 5 3.1 Supported Hardware Architectures..........................................5 3.2 Available Platforms..................................................5 3.3 Certified Platforms..................................................7
    [Show full text]
  • Khronos Template 2015
    Ecosystem Overview Neil Trevett | Khronos President NVIDIA Vice President Developer Ecosystem [email protected] | @neilt3d © Copyright Khronos Group 2016 - Page 1 Khronos Mission Software Silicon Khronos is an Industry Consortium of over 100 companies creating royalty-free, open standard APIs to enable software to access hardware acceleration for graphics, parallel compute and vision © Copyright Khronos Group 2016 - Page 2 http://accelerateyourworld.org/ © Copyright Khronos Group 2016 - Page 3 Vision Pipeline Challenges and Opportunities Growing Camera Diversity Diverse Vision Processors Sensor Proliferation 22 Flexible sensor and camera Use efficient acceleration to Combine vision output control to GENERATE PROCESS with other sensor data an image stream the image stream on device © Copyright Khronos Group 2016 - Page 4 OpenVX – Low Power Vision Acceleration • Higher level abstraction API - Targeted at real-time mobile and embedded platforms • Performance portability across diverse architectures - Multi-core CPUs, GPUs, DSPs and DSP arrays, ISPs, Dedicated hardware… • Extends portable vision acceleration to very low power domains - Doesn’t require high-power CPU/GPU Complex - Lower precision requirements than OpenCL - Low-power host can setup and manage frame-rate graph Vision Engine Middleware Application X100 Dedicated Vision Processing Hardware Efficiency Vision DSPs X10 GPU Compute Accelerator Multi-core Accelerator Power Efficiency Power X1 CPU Accelerator Computation Flexibility © Copyright Khronos Group 2016 - Page 5 OpenVX Graphs
    [Show full text]
  • Khronos Data Format Specification
    Khronos Data Format Specification Andrew Garrard Version 1.3.1 2020-04-03 1 / 281 Khronos Data Format Specification License Information Copyright (C) 2014-2019 The Khronos Group Inc. All Rights Reserved. This specification is protected by copyright laws and contains material proprietary to the Khronos Group, Inc. It or any components may not be reproduced, republished, distributed, transmitted, displayed, broadcast, or otherwise exploited in any manner without the express prior written permission of Khronos Group. You may use this specification for implementing the functionality therein, without altering or removing any trademark, copyright or other notice from the specification, but the receipt or possession of this specification does not convey any rights to reproduce, disclose, or distribute its contents, or to manufacture, use, or sell anything that it may describe, in whole or in part. This version of the Data Format Specification is published and copyrighted by Khronos, but is not a Khronos ratified specification. Accordingly, it does not fall within the scope of the Khronos IP policy, except to the extent that sections of it are normatively referenced in ratified Khronos specifications. Such references incorporate the referenced sections into the ratified specifications, and bring those sections into the scope of the policy for those specifications. Khronos Group grants express permission to any current Promoter, Contributor or Adopter member of Khronos to copy and redistribute UNMODIFIED versions of this specification in any fashion, provided that NO CHARGE is made for the specification and the latest available update of the specification for any version of the API is used whenever possible.
    [Show full text]
  • Khronos Native Platform Graphics Interface (EGL Version 1.4 - April 6, 2011)
    Khronos Native Platform Graphics Interface (EGL Version 1.4 - April 6, 2011) Editor: Jon Leech 2 Copyright (c) 2002-2011 The Khronos Group Inc. All Rights Reserved. This specification is protected by copyright laws and contains material proprietary to the Khronos Group, Inc. It or any components may not be reproduced, repub- lished, distributed, transmitted, displayed, broadcast or otherwise exploited in any manner without the express prior written permission of Khronos Group. You may use this specification for implementing the functionality therein, without altering or removing any trademark, copyright or other notice from the specification, but the receipt or possession of this specification does not convey any rights to reproduce, disclose, or distribute its contents, or to manufacture, use, or sell anything that it may describe, in whole or in part. Khronos Group grants express permission to any current Promoter, Contributor or Adopter member of Khronos to copy and redistribute UNMODIFIED versions of this specification in any fashion, provided that NO CHARGE is made for the specification and the latest available update of the specification for any version of the API is used whenever possible. Such distributed specification may be re- formatted AS LONG AS the contents of the specification are not changed in any way. The specification may be incorporated into a product that is sold as long as such product includes significant independent work developed by the seller. A link to the current version of this specification on the Khronos Group web-site should be included whenever possible with specification distributions. Khronos Group makes no, and expressly disclaims any, representations or war- ranties, express or implied, regarding this specification, including, without limita- tion, any implied warranties of merchantability or fitness for a particular purpose or non-infringement of any intellectual property.
    [Show full text]
  • Gstreamer and Dmabuf
    GStreamer and dmabuf OMAP4+ graphics/multimedia update Rob Clark Outline • A quick hardware overview • Kernel infrastructure: drm/gem, rpmsg+dce, dmabuf • Blinky s***.. putting pixels on the screen • Bringing it all together in GStreamer A quick hardware overview DMM/Tiler • Like a system-wide GART – Provides a contiguous view of memory to various hw accelerators: IVAHD, ISS, DSS • Provides tiling modes for enhanced memory bandwidth efficiency – For initiators like IVAHD which access memory in 2D block patterns • Provides support for rotation – Zero cost rotation for DSS/ISS access in 0º/90º/180º/270º orientations (with horizontal or vertical reflection) IVA-HD • Multi-codec hw video encode/decode – H.264 BP/MP/HP encode/decode – MPEG-4 SP/ASP encode/decode – MPEG-2 SP/MP encode/decode – MJPEG encode/decode – VC1/WMV9 decode – etc DSS – Display Subsystem • Display Subsystem – 4 video pipes, 3 support scaling and YUV – Any number of video pipes can be attached to one of 3 “overlay manager” to route to a display Kernel infrastructure: drm/gem, rpmsg+dce, dmabuf DRM Overview • DRM → Direct Rendering Manager – Started life heavily based on x86/desktop graphics card architecture – But more recently has evolved to better support ARM and other SoC platforms • KMS → Kernel Mode Setting – Replaces fbdev for more advanced display management – Hotplug, multiple display support (spanning/cloning) – And more recently support for overlays (planes) • GEM → Graphics Execution Manager – But the important/useful part here is the graphics/multimedia buffer management DRM - KMS • Models the display hardware as: – Connector → the thing that the display connects to • Handles DDC/EDID, hotplug detection – Encoder → takes pixel data from CRTC and encodes it to a format suitable for connectors • ie.
    [Show full text]