From Zero to Zero-Copy Accelerating Your Embedded Media Player
Lukas Rusak
September 2018

whoami
Lukas Rusak
● Bachelor of Applied Science in Mechanical Engineering
● Team Kodi Member
● Team LibreELEC Member
● LibreELEC Board Member
● [email protected]
[Photos: Kelowna, B.C. and Okanagan Lake; Okanagan Region of B.C.; Mission Hill Vineyard]

Agenda
● Introducing zero-copy
● Current proprietary methods
● Open source solutions
● Video decoding pipeline
● Rendering methods
● Board specifics

Zero-copy
"Zero-copy" describes computer operations in which the CPU does not perform the task of copying data from one memory area to another. [1]
● Embedded devices with minimal CPU power trying to run 4K HEVC video
● Critical for devices with limited memory bandwidth
● Available in proprietary and open source solutions

Current Proprietary Methods
● Raspberry Pi (Broadcom) dispmanx
  ○ libbcm_host
● i.MX6 Vivante
  ○ Blob that does everything for windowing and rendering
  ○ libvpuwrap for decoding
● Amlogic libamcodec
  ○ Userspace library dependent on the vendor BSP kernel (3.14)
● Impossible to fix bugs
● Locked to vendor kernels
● No chance to update
● Code maintenance for vendor-specific methods

Open Source Solutions
● Video4Linux2 (V4L2) Driver
  ○ Access to onboard decoding capabilities
  ○ Memory-to-Memory devices
  ○ Codecs are hardware and driver dependent
● Direct Rendering Manager (DRM) Driver
  ○ Only required if utilizing direct to plane rendering
● OpenGLES 2.0 and EGL
  ○ Only required if utilizing EGL image import to OpenGLES
  ○ EXT_image_dma_buf_import [2]
  ○ OES_EGL_image_external [3]
  ○ NV12 or YUYV or YU12 or similar YUV format support

Video Decoding Pipeline
● V4L2 interface directly using ioctls
● GStreamer
  ○ dma-buf ready
● FFmpeg
  ○ dma-buf patches under review
● Using GStreamer or FFmpeg shifts video decoding out of your application
[Diagram labels: Compressed Frame, Decoder, AVDRMFrameDescriptor, fd, Decoded Frame]

Direct to Plane Rendering
Positives
● Embedded devices may support YUV formats directly
● Most performant (no empirical evidence)
● No OpenGLES/EGL context required
  ○ Skips the GPU entirely
● Video overlays are possible using DRM Planes
● Very straightforward when using atomic DRM
Negatives
● Requires* in-driver scaling capabilities
● No OpenGLES capabilities (shaders)
● More complex if using multiple planes
[Diagram labels: Video + GUI; GUI Only; Primary, Overlay, Cursor (DRM Planes)]

OpenGLES/EGL Image Import
Positives
● Shader support
● Only one DRM Plane required
Negatives
● More expensive (requires EGL context and GLES operations)
● Requires extensions that are less common

Mali Proprietary Blob
● Open source kernel module paired with closed source binary blob for OpenGLES
● Licensed from ARM
● Bundled library containing either:
  ○ x11
  ○ Wayland/GBM
  ○ GBM
● Impossible to fix bugs
● Lima and Panfrost open source efforts underway

Allwinner
● DRM Driver
  ○ “sun4i”
  ○ Supports scaling
  ○ Supports overlays
  ○ Supports NV12
● OpenGLES Driver
  ○ “mali”
  ○ Supports EGL import
  ○ Supports NV12
● V4L2 Driver
  ○ Patches pending (requires V4L2 request API)
  ○ Supports NV12
Board pictured: Orange Pi PC

Amlogic
● DRM Driver
  ○ “meson”
  ○ Supports scaling (WIP)
  ○ Supports overlays (WIP)
  ○ Supports NV12
  ○ Neil Armstrong <[email protected]>
● OpenGLES Driver
  ○ “mali”
  ○ Supports EGL import
  ○ Supports NV12
● V4L2 Driver
  ○ “Meson” (WIP)
  ○ Supports NV12
  ○ Supports HEVC, H264
  ○ Maxime Jourdan <[email protected]>
Board pictured: Libre Computer Le Potato (S905)

Broadcom
● DRM Driver
  ○ “vc4”
  ○ Supports scaling
  ○ Supports overlays
  ○ Supports NV12 and YUV420
● OpenGLES Driver
  ○ “vc4”
  ○ No YUV import
● V4L2 Driver
  ○ “Bcm2835-v4l2-codec” (WIP)
  ○ Supports NV12 and YUV420
Board pictured: Raspberry Pi 3B+

NXP
● DRM Driver
  ○ “imx” and “etnaviv”
  ○ No scaling (available via a separate IP block)
  ○ Supports overlays
  ○ Supports NV12
● OpenGLES Driver
  ○ “etnaviv”
  ○ Supports EGL import
  ○ Supports NV12 (WIP)
    ■ Target mesa 18.3
    ■ gc3000 good performance
    ■ gc2000 ok performance
● V4L2 Driver
  ○ “coda”
  ○ Supports NV12
Board pictured: Solidrun Cubox i4-Pro (gc2000)

Qualcomm
● DRM Driver
  ○ “msm”
  ○ Supports scaling
  ○ Supports overlays
  ○ Supports NV12
● OpenGLES Driver
  ○ “freedreno”
  ○ Supports EGL import
  ○ Supports NV12
● V4L2 Driver
  ○ “venus”
  ○ Supports NV12
Board pictured: Dragonboard 410c

Others
● HiSilicon
● MediaTek
● Renesas
● Rockchip
Boards pictured: Asus Tinkerboard S (RK3288), Renesas R-Car

Index
1. https://en.wikipedia.org/wiki/Zero-copy
2. https://www.khronos.org/registry/EGL/extensions/EXT/EGL_EXT_image_dma_buf_import.txt
3. https://www.khronos.org/registry/OpenGL/extensions/OES/OES_EGL_image_external.txt

Kodi Implementation Links:
https://github.com/xbmc/xbmc/tree/master/xbmc/windowing/gbm
https://github.com/xbmc/xbmc/blob/master/xbmc/cores/VideoPlayer/VideoRenderers/HwDecRender/RendererDRMPRIME.cpp
https://github.com/xbmc/xbmc/blob/master/xbmc/cores/VideoPlayer/VideoRenderers/HwDecRender/RendererDRMPRIMEGLES.cpp
https://github.com/xbmc/xbmc/blob/master/xbmc/cores/VideoPlayer/DVDCodecs/Video/DVDVideoCodecDRMPRIME.cpp

Q/A

Theme by: Samfisher Design
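
Appendix sketch: NV12 frame layout
The slides repeatedly list NV12 support and show the decoded frame travelling as an fd. As a rough illustration of what such a hand-off has to describe, the C below packs the NV12 FourCC in the same little-endian scheme the DRM and V4L2 headers use, then computes the plane offsets for one frame. `struct nv12_frame`, `struct plane_layout` and `nv12_describe` are hypothetical names made up for this sketch; they are not FFmpeg's AVDRMFrameDescriptor or any Kodi type. The layout assumes a tightly packed buffer with no height alignment, which real decoders often do not guarantee.

```c
#include <assert.h>
#include <stdint.h>

/* Pack four characters into a little-endian FourCC code. */
static uint32_t fourcc(char a, char b, char c, char d)
{
    return (uint32_t)(uint8_t)a |
           ((uint32_t)(uint8_t)b << 8) |
           ((uint32_t)(uint8_t)c << 16) |
           ((uint32_t)(uint8_t)d << 24);
}

/* Hypothetical per-plane description: byte offset into the buffer
 * and bytes per row. */
struct plane_layout {
    uint32_t offset;
    uint32_t pitch;
};

/* Hypothetical, simplified frame description: one buffer fd shared by
 * both NV12 planes. Not a real library type. */
struct nv12_frame {
    int fd;                        /* dma-buf fd; the caller owns it */
    uint32_t format;               /* FourCC 'N','V','1','2' */
    uint32_t width, height;        /* visible size in pixels */
    struct plane_layout planes[2]; /* [0] = Y, [1] = interleaved CbCr */
    uint32_t size;                 /* minimum buffer size in bytes */
};

/* Describe an NV12 frame stored in a single buffer.
 * Assumption: rows are `pitch` bytes apart and the chroma plane
 * starts immediately after the last luma row (no padding rows). */
static struct nv12_frame nv12_describe(int fd, uint32_t width,
                                       uint32_t height, uint32_t pitch)
{
    struct nv12_frame f;
    uint32_t chroma_rows = (height + 1) / 2; /* 4:2:0 halves the rows */

    assert(pitch >= width); /* one byte per luma sample */

    f.fd = fd;
    f.format = fourcc('N', 'V', '1', '2');
    f.width = width;
    f.height = height;
    f.planes[0].offset = 0;
    f.planes[0].pitch = pitch;
    /* Half as many chroma samples per row, but two bytes (Cb, Cr)
     * each, so the chroma plane uses the same byte pitch as luma. */
    f.planes[1].offset = pitch * height;
    f.planes[1].pitch = pitch;
    f.size = pitch * height + pitch * chroma_rows;
    return f;
}
```

Calling `nv12_describe(-1, 1920, 1080, 1920)` with a placeholder fd of -1 yields a chroma offset of 2073600 bytes and a total size of 3110400 bytes, which is 1.5 bytes per pixel. Offsets, pitches and the format code are the kind of per-plane values a consumer needs before it can import or display a buffer without copying it.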