The Opencl™ Specification

Total Page:16

File Type:pdf, Size:1020Kb

The Opencl™ Specification The OpenCL™ Specification Khronos® OpenCL Working Group Version v3.0.6-68-g7d736c4, Thu, 02 Sep 2021 20:01:22 +0000: from git branch: commit: 7d736c4fda6b26166bfe2bcc5cfab82170af04e7 Table of Contents 1. Introduction. 2 1.1. Normative References . 3 1.2. Version Numbers . 3 1.3. Unified Specification . 3 2. Glossary . 4 3. The OpenCL Architecture. 18 3.1. Platform Model . 18 3.2. Execution Model . 19 3.2.1. Mapping work-items onto an NDRange . 23 3.2.2. Execution of kernel-instances . 25 3.2.3. Device-side enqueue . 26 3.2.4. Synchronization . 27 3.2.5. Categories of Kernels . 29 3.3. Memory Model . 29 3.3.1. Fundamental Memory Regions . 30 3.3.2. Memory Objects . 32 3.3.3. Shared Virtual Memory. 34 3.3.4. Memory Consistency Model for OpenCL 1.x . 35 3.3.5. Memory Consistency Model for OpenCL 2.x . 35 3.3.6. Overview of atomic and fence operations . 38 3.3.7. Memory Ordering Rules . 39 3.4. The OpenCL Framework. 50 3.4.1. Mixed Version Support . 50 3.4.2. Backwards Compatibility . 51 3.4.3. Versioning. 52 4. The OpenCL Platform Layer . 54 4.1. Querying Platform Info. 54 4.2. Querying Devices . 56 4.3. Partitioning a Device . 84 4.4. Contexts. 89 5. The OpenCL Runtime . 97 5.1. Command Queues. 97 5.2. Buffer Objects . 104 5.2.1. Creating Buffer Objects . 104 5.2.2. Reading, Writing and Copying Buffer Objects . 111 5.2.3. Filling Buffer Objects . 122 5.2.4. Mapping Buffer Objects. 124 5.3. Image Objects . 127 5.3.1. Creating Image Objects . 128 5.3.2. Querying List of Supported Image Formats . 139 5.3.3. Reading, Writing and Copying Image Objects . 143 5.3.4. Filling Image Objects . 149 5.3.5. Copying between Image and Buffer Objects . 151 5.3.6. Mapping Image Objects. 155 5.3.7. Image Object Queries. 158 5.4. Pipes. 160 5.4.1. Creating Pipe Objects . 160 5.4.2. Pipe Object Queries . 162 5.5. Querying, Unmapping, Migrating, Retaining and Releasing Memory Objects . 163 5.5.1. Retaining and Releasing Memory Objects . 163 5.5.2. Unmapping Mapped Memory Objects. 166 5.5.3. Accessing mapped regions of a memory object . 168 5.5.4. Migrating Memory Objects. 168 5.5.5. Memory Object Queries. 171 5.6. Shared Virtual Memory . 174 5.6.1. SVM sharing granularity: coarse- and fine- grained sharing. 175 5.6.2. Memory consistency for SVM allocations. 187 5.7. Sampler Objects . ..
Recommended publications
  • GLSL 4.50 Spec
    The OpenGL® Shading Language Language Version: 4.50 Document Revision: 7 09-May-2017 Editor: John Kessenich, Google Version 1.1 Authors: John Kessenich, Dave Baldwin, Randi Rost Copyright (c) 2008-2017 The Khronos Group Inc. All Rights Reserved. This specification is protected by copyright laws and contains material proprietary to the Khronos Group, Inc. It or any components may not be reproduced, republished, distributed, transmitted, displayed, broadcast, or otherwise exploited in any manner without the express prior written permission of Khronos Group. You may use this specification for implementing the functionality therein, without altering or removing any trademark, copyright or other notice from the specification, but the receipt or possession of this specification does not convey any rights to reproduce, disclose, or distribute its contents, or to manufacture, use, or sell anything that it may describe, in whole or in part. Khronos Group grants express permission to any current Promoter, Contributor or Adopter member of Khronos to copy and redistribute UNMODIFIED versions of this specification in any fashion, provided that NO CHARGE is made for the specification and the latest available update of the specification for any version of the API is used whenever possible. Such distributed specification may be reformatted AS LONG AS the contents of the specification are not changed in any way. The specification may be incorporated into a product that is sold as long as such product includes significant independent work developed by the seller. A link to the current version of this specification on the Khronos Group website should be included whenever possible with specification distributions.
    [Show full text]
  • Report of Contributions
    X.Org Developers Conference 2020 Report of Contributions https://xdc2020.x.org/e/XDC2020 X.Org Developer … / Report of Contributions State of text input on Wayland Contribution ID: 1 Type: not specified State of text input on Wayland Wednesday, 16 September 2020 20:15 (5 minutes) Between the last impromptu talk at GUADEC 2018, text input on Wayland has become more organized and more widely adopted. As before, the three-pronged approach of text_input, in- put_method, and virtual keyboard still causes confusion, but increased interest in implementing it helps find problems and come closer to something that really works for many usecases. The talk will mention how a broken assumption causes a broken protocol, and why we’re notdone with Wayland input methods yet. It’s recommended to people who want to know more about the current state of input methods on Wayland. Recommended background: aforementioned GUADEC talk, wayland-protocols reposi- tory, my blog: https://dcz_self.gitlab.io/ Code of Conduct Yes GSoC, EVoC or Outreachy No Primary author: DCZ, Dorota Session Classification: Demos / Lightning talks I Track Classification: Lightning Talk September 30, 2021 Page 1 X.Org Developer … / Report of Contributions IGT GPU Tools 2020 Update Contribution ID: 2 Type: not specified IGT GPU Tools 2020 Update Wednesday, 16 September 2020 20:00 (5 minutes) Short update on IGT - what has changed in the last year, where are we right now and what we have planned for the near future. IGT GPU Tools is a collection of tools and tests aiding development of DRM drivers. It’s widely used by Intel in its public CI system.
    [Show full text]
  • GLSL Specification
    The OpenGL® Shading Language Language Version: 4.60 Document Revision: 3 23-Jul-2017 Editor: John Kessenich, Google Version 1.1 Authors: John Kessenich, Dave Baldwin, Randi Rost Copyright (c) 2008-2017 The Khronos Group Inc. All Rights Reserved. This specification is protected by copyright laws and contains material proprietary to the Khronos Group, Inc. It or any components may not be reproduced, republished, distributed, transmitted, displayed, broadcast, or otherwise exploited in any manner without the express prior written permission of Khronos Group. You may use this specification for implementing the functionality therein, without altering or removing any trademark, copyright or other notice from the specification, but the receipt or possession of this specification does not convey any rights to reproduce, disclose, or distribute its contents, or to manufacture, use, or sell anything that it may describe, in whole or in part. Khronos Group grants express permission to any current Promoter, Contributor or Adopter member of Khronos to copy and redistribute UNMODIFIED versions of this specification in any fashion, provided that NO CHARGE is made for the specification and the latest available update of the specification for any version of the API is used whenever possible. Such distributed specification may be reformatted AS LONG AS the contents of the specification are not changed in any way. The specification may be incorporated into a product that is sold as long as such product includes significant independent work developed by the seller. A link to the current version of this specification on the Khronos Group website should be included whenever possible with specification distributions.
    [Show full text]
  • Blackberry QNX Multimedia Suite
    PRODUCT BRIEF QNX Multimedia Suite The QNX Multimedia Suite is a comprehensive collection of media technology that has evolved over the years to keep pace with the latest media requirements of current-day embedded systems. Proven in tens of millions of automotive infotainment head units, the suite enables media-rich, high-quality playback, encoding and streaming of audio and video content. The multimedia suite comprises a modular, highly-scalable architecture that enables building high value, customized solutions that range from simple media players to networked systems in the car. The suite is optimized to leverage system-on-chip (SoC) video acceleration, in addition to supporting OpenMAX AL, an industry open standard API for application-level access to a device’s audio, video and imaging capabilities. Overview Consumer’s demand for multimedia has fueled an anywhere- o QNX SDK for Smartphone Connectivity (with support for Apple anytime paradigm, making multimedia ubiquitous in embedded CarPlay and Android Auto) systems. More and more embedded applications have require- o Qt distributions for QNX SDP 7 ments for audio, video and communication processing capabilities. For example, an infotainment system’s media player enables o QNX CAR Platform for Infotainment playback of content, stored either on-board or accessed from an • Support for a variety of external media stores external drive, mobile device or streamed over IP via a browser. Increasingly, these systems also have streaming requirements for Features at a Glance distributing content across a network, for instance from a head Multimedia Playback unit to the digital instrument cluster or rear seat entertainment units. Multimedia is also becoming pervasive in other markets, • Software-based audio CODECs such as medical, industrial, and whitegoods where user interfaces • Hardware accelerated video CODECs are increasingly providing users with a rich media experience.
    [Show full text]
  • Opengl ES / Openvg / Opencl / Webgl / Etc
    KNU-3DC : Education and Training Plan 24 July 2013 Hwanyong LEE, Ph.D. Principal Engineer & Industry Cooperation Prof., KNU 3DC [email protected] KNU-3DC Introduction • Kyungpook National Univ. 3D Convergence Technology Center . Korea Government Funded Org. 23 staffs (10 Ph.D) . Research / Supporting Industry . Training and Education • Training and Education . Dassault Training Center . KIKS(Korea Institute of Khronos Study) • Constructing New Center Building . HUGE ! 7 Floors ! (2014E) 2 KNU-3DC KIKS • KIKS(Korea Institute of Khronos Study) . Leading Role in Korea for Training and Education of Khronos Standard – Collaboration with Khronos Group . Open Lecture + Develop Coursework for • OpenGL / OpenGL ES / OpenVG / OpenCL / WebGL / etc. • Opening / Sponsoring Workshop and Forum . Participating Khronos Activities • Contributor Member (Plan, now Processing) • Active participation of Khronos WG . Other Standard Activities • Make Khronos Standard into Korea National Standard. (WebGL) • W3C, ISO/IEC JTC1, IEEE 3333.X . Research and Consulting for Industry and Academy 3 KIKS Course • KIKS Course will be categorized into . Basic / Advanced / Packaged . Special Course - For instance Overview / Optimization / Consulting • Developing Courseware (for Khronos API) . OpenGL ES Basic & Advanced . OpenGL Basic & Advanced . OpenVG . OpenCL Basic & Advanced . WebGL . Etc. – new standards (Red - Started / Orange – Start at 4Q2013 / Dark Blue – Start at 2014) 4 Different View of Courses University Computer Game Image Parallel View Graphics Develop Processing … Processing JavaScript Khronos Canvas View … iPhone Android Company Web-App Parallel App App Application Develop View Develop Develop … Develop Course Development – Packaging Example • Android Application with OpenGL ES . General Android API’s – JNI, Java etc. OpenGL ES • iPhone App. Development with OpenGL ES . General iPhone APP API – cocoa, Objective-C, etc.
    [Show full text]
  • The Opencl Specification
    The OpenCL Specification Version: 2.0 Document Revision: 22 Khronos OpenCL Working Group Editor: Aaftab Munshi Last Revision Date: 3/18/14 Page 1 1. INTRODUCTION ............................................................................................................... 10 2. GLOSSARY ......................................................................................................................... 12 3. THE OPENCL ARCHITECTURE .................................................................................... 23 3.1 Platform Model ................................................................................................................................ 23 3.2 Execution Model .............................................................................................................................. 25 3.2.1 Execution Model: Mapping work-items onto an NDRange ........................................................................28 3.2.2 Execution Model: Execution of kernel-instances ........................................................................................30 3.2.3 Execution Model: Device-side enqueue ......................................................................................................31 3.2.4 Execution Model: Synchronization .............................................................................................................32 3.2.5 Execution Model: Categories of Kernels ....................................................................................................33 3.3 Memory
    [Show full text]
  • Openvg 1.1 API Quick Reference Card - Page 1
    OpenVG 1.1 API Quick Reference Card - Page 1 OpenVG® is an API for hardware-accelerated two-dimensional Errors [4.1] vector and raster graphics. It provides a device-independent Error codes and their numerical values are defined by the VGErrorCode enumeration and can be and vendor-neutral interface for sophisticated 2D graphical obtained with the function: VGErrorCode vgGetError(void). The possible values are as follows: applications, while allowing device manufacturers to provide VG_NO_ERROR 0 VG_UNSUPPORTED_IMAGE_FORMAT_ERROR 0x1004 hardware acceleration where appropriate. VG_BAD_HANDLE_ERROR 0x1000 VG_UNSUPPORTED_PATH_FORMAT_ERROR 0x1005 • [n.n.n] refers to sections and tables in the OpenVG 1.1 API VG_ILLEGAL_ARGUMENT_ERROR 0x1001 VG_IMAGE_IN_USE_ERROR 0x1006 specification available at www.khronos.org/openvg/ VG_OUT_OF_MEMORY_ERROR 0x1002 VG_NO_CONTEXT_ERROR 0x1007 • Default values are shown in blue. VG_PATH_CAPABILITY_ERROR 0x1003 Data Types & Number Representations Colors [3.4] The sRGB color space defines values R’sRGB, G’sRGB, B’sRGB in terms of the linear lRGB primaries. Primitive Data Types [3.2] Colors in OpenVG other than those stored in image pixels are represented as non-premultiplied sRGBA color values. openvg.h khronos_type.h range Convert from lRGB to sRGB Convert from sRGB to lRGB Image pixel color and alpha values lie in the range [0,1] (gamma mapping) (1) (inverse gamma mapping) (2) VGbyte khronos_int8_t [-128, 127] unless otherwise noted. -1 VGubyte khronos_uint8_t [0, 255] R’sRGB = γ(R) R = γ (R’sRGB) Color Space Definitions -1 The linear lRGB color space is defined in terms of the G’sRGB = γ(G) G = γ (G’sRGB) VGshort khronos_int16_t [-32768, 32767] -1 standard CIE XYZ color space, following ITU Rec.
    [Show full text]
  • Hardware Implementation of a Tessellation Accelerator for the Openvg Standard
    IEICE Electronics Express, Vol.7, No.6, 440–446 Hardware implementation of a tessellation accelerator for the OpenVG standard Seung Hun Kim, Yunho Oh, Karam Park, and Won Woo Roa) School of Electrical and Electronic Engineering, Yonsei University, 134 Shinchon-Dong, Seodaemun-Gu, SEOUL, 120–749, KOREA a) [email protected] Abstract: The OpenVG standard has been introduced as an efficient vector graphics API for embedded systems. There have been several OpenVG implementations that are based on the software rendering of image. However, the software rendering needs more execution time and power consumption than hardware accelerated rendering. For the effi- cient hardware implementation, we merge eight pipeline stages in the original specification to four pipeline stages. The first hardware accel- eration stage is the tessellation part which is one of the pipeline stages that calculates the edge of vector graphics. In this paper, we provide an efficient hardware design for the tessellation stage and claim this would eventually reduce the execution time and hardware complexity. Keywords: OpenVG, vector graphics, tessellation, hardware acceler- ator Classification: Electron devices, circuits, and systems References [1] K. Pulli, “New APIs for mobile graphics,” Proc. SPIE - The International Society for Optical Engineering, vol. 6074, pp. 1–13, 2006. [2] Khronos Group Inc., “OpenVG specification Version 1.0.1” [Online] http://www.khronos.org/openvg/ [3] S.-Y. Lee, S. Kim, J. Chung, and B.-U. Choi, “Salable Vector Graphics (OpenVG) for Creating Animation Image in Embedded Systems,” Lecture Notes in Computer Science, vol. 4693, pp. 99–108, 2007. [4] G. He, B. Bai, Z. Pan, and X.
    [Show full text]
  • Openbricks Embedded Linux Framework - User Manual I
    OpenBricks Embedded Linux Framework - User Manual i OpenBricks Embedded Linux Framework - User Manual OpenBricks Embedded Linux Framework - User Manual ii Contents 1 OpenBricks Introduction 1 1.1 What is it ?......................................................1 1.2 Who is it for ?.....................................................1 1.3 Which hardware is supported ?............................................1 1.4 What does the software offer ?............................................1 1.5 Who’s using it ?....................................................1 2 List of supported features 2 2.1 Key Features.....................................................2 2.2 Applicative Toolkits..................................................2 2.3 Graphic Extensions..................................................2 2.4 Video Extensions...................................................3 2.5 Audio Extensions...................................................3 2.6 Media Players.....................................................3 2.7 Key Audio/Video Profiles...............................................3 2.8 Networking Features.................................................3 2.9 Supported Filesystems................................................4 2.10 Toolchain Features..................................................4 3 OpenBricks Supported Platforms 5 3.1 Supported Hardware Architectures..........................................5 3.2 Available Platforms..................................................5 3.3 Certified Platforms..................................................7
    [Show full text]
  • Real-Time Computer Vision with Opencv Khanh Vo Duc, Mobile Vision Team, NVIDIA
    Real-time Computer Vision with OpenCV Khanh Vo Duc, Mobile Vision Team, NVIDIA Outline . What is OpenCV? . OpenCV Example – CPU vs. GPU with CUDA . OpenCV CUDA functions . Future of OpenCV . Summary OpenCV Introduction . Open source library for computer vision, image processing and machine learning . Permissible BSD license . Freely available (www.opencv.org) Portability . Real-time computer vision (x86 MMX/SSE, ARM NEON, CUDA) . C (11 years), now C++ (3 years since v2.0), Python and Java . Windows, OS X, Linux, Android and iOS 3 Functionality Desktop . x86 single-core (Intel started, now Itseez.com) - v2.4.5 >2500 functions (multiple algorithm options, data types) . CUDA GPU (Nvidia) - 250 functions (5x – 100x speed-up) http://docs.opencv.org/modules/gpu/doc/gpu.html . OpenCL GPU (3rd parties) - 100 functions (launch times ~7x slower than CUDA*) Mobile (Nvidia): . Android (not optimized) . Tegra – 50 functions NEON, GLSL, multi-core (1.6–32x speed-up) 4 Functionality Image/video I/O, processing, display (core, imgproc, highgui) Object/feature detection (objdetect, features2d, nonfree) Geometry-based monocular or stereo computer vision (calib3d, stitching, videostab) Computational photography (photo, video, superres) Machine learning & clustering (ml, flann) CUDA and OpenCL GPU acceleration (gpu, ocl) 5 Outline . What is OpenCV? . OpenCV Example – CPU vs. GPU with CUDA . OpenCV CUDA functions . Future of OpenCV . Summary OpenCV CPU example #include <opencv2/opencv.hpp> OpenCV header files using namespace cv; OpenCV C++ namespace int
    [Show full text]
  • Khronos Native Platform Graphics Interface (EGL Version 1.4 - April 6, 2011)
    Khronos Native Platform Graphics Interface (EGL Version 1.4 - April 6, 2011) Editor: Jon Leech 2 Copyright (c) 2002-2011 The Khronos Group Inc. All Rights Reserved. This specification is protected by copyright laws and contains material proprietary to the Khronos Group, Inc. It or any components may not be reproduced, repub- lished, distributed, transmitted, displayed, broadcast or otherwise exploited in any manner without the express prior written permission of Khronos Group. You may use this specification for implementing the functionality therein, without altering or removing any trademark, copyright or other notice from the specification, but the receipt or possession of this specification does not convey any rights to reproduce, disclose, or distribute its contents, or to manufacture, use, or sell anything that it may describe, in whole or in part. Khronos Group grants express permission to any current Promoter, Contributor or Adopter member of Khronos to copy and redistribute UNMODIFIED versions of this specification in any fashion, provided that NO CHARGE is made for the specification and the latest available update of the specification for any version of the API is used whenever possible. Such distributed specification may be re- formatted AS LONG AS the contents of the specification are not changed in any way. The specification may be incorporated into a product that is sold as long as such product includes significant independent work developed by the seller. A link to the current version of this specification on the Khronos Group web-site should be included whenever possible with specification distributions. Khronos Group makes no, and expressly disclaims any, representations or war- ranties, express or implied, regarding this specification, including, without limita- tion, any implied warranties of merchantability or fitness for a particular purpose or non-infringement of any intellectual property.
    [Show full text]
  • Gstreamer and Dmabuf
    GStreamer and dmabuf OMAP4+ graphics/multimedia update Rob Clark Outline • A quick hardware overview • Kernel infrastructure: drm/gem, rpmsg+dce, dmabuf • Blinky s***.. putting pixels on the screen • Bringing it all together in GStreamer A quick hardware overview DMM/Tiler • Like a system-wide GART – Provides a contiguous view of memory to various hw accelerators: IVAHD, ISS, DSS • Provides tiling modes for enhanced memory bandwidth efficiency – For initiators like IVAHD which access memory in 2D block patterns • Provides support for rotation – Zero cost rotation for DSS/ISS access in 0º/90º/180º/270º orientations (with horizontal or vertical reflection) IVA-HD • Multi-codec hw video encode/decode – H.264 BP/MP/HP encode/decode – MPEG-4 SP/ASP encode/decode – MPEG-2 SP/MP encode/decode – MJPEG encode/decode – VC1/WMV9 decode – etc DSS – Display Subsystem • Display Subsystem – 4 video pipes, 3 support scaling and YUV – Any number of video pipes can be attached to one of 3 “overlay manager” to route to a display Kernel infrastructure: drm/gem, rpmsg+dce, dmabuf DRM Overview • DRM → Direct Rendering Manager – Started life heavily based on x86/desktop graphics card architecture – But more recently has evolved to better support ARM and other SoC platforms • KMS → Kernel Mode Setting – Replaces fbdev for more advanced display management – Hotplug, multiple display support (spanning/cloning) – And more recently support for overlays (planes) • GEM → Graphics Execution Manager – But the important/useful part here is the graphics/multimedia buffer management DRM - KMS • Models the display hardware as: – Connector → the thing that the display connects to • Handles DDC/EDID, hotplug detection – Encoder → takes pixel data from CRTC and encodes it to a format suitable for connectors • ie.
    [Show full text]