VCU relaunch initiative

PROPOSED AGENDA

1) Introduction. Big picture overview. Top-down systems considerations
2) Video data flows and connectivity
3) Video processing – what happens to a video pipe when you include VPSS and video mixer; conversions from one video format to another, pixel formats, etc.
4) Device abstractions: IPI, DTS, device drivers and hardware frameworks
5) Codecs: Why and what
6) Media software frameworks – a review of GStreamer, DRM, KMS, V4L2, etc., and how they all tie together
7) Synchronization and compressed video system considerations: learn how to add audio, multi-stream video and graphics overlay; understand the criticality of synchronization in AV systems
8) VQ (video quality): you will get an opportunity to run live inputs into the VCU TRD and change bit-rates and GOP
9) Software application design #1: intended to tie together a lot of concepts from prior sessions
10) Software application design #2
11) Creating a system with Vitis / Vivado / PetaLinux

Session 1 Agenda

• Welcome & Introduction
• Overview & Systems Considerations
• Hands-On/Lab Session
• Wrap-Up & Homework


Session 2 Agenda

• Welcome & Introduction
• Recap from Session #1
• Interactive Session – HDMI Passthru (35 mins)
• Multimedia design – HW/SW View
• Video4Linux
• Direct Rendering Manager / Kernel Mode Setting
• GStreamer - An introduction
• Wrap-Up & Homework

Session 3 Agenda

• Recap – Data flows/Connectivity
• Interactive Session (45 mins)
• Why Video Processing? (Quick!)
• Video Primer
• Video Processing
• Data Representation
• Xilinx Video Processing IP/Drivers
• Wrap-Up & Homework

Session 4 Agenda

“Analyzing hardware won’t give you insights into software…” [ref]

• Explore the links between hardware and software
• IP Integrator
  • Review hardware configuration (PL blocks)
• Device Tree Overview
  • Describe hardware configuration(s) to software
  • Targets device drivers and software frameworks
• Device Drivers
  • Bindings – driver-specific configuration
  • Probe and “of” functions
• Multimedia Device Frameworks
  • Transition from “devices” to “data flows”
  • Exposing devices to application software

Session 5 Agenda

• Housekeeping
• Introduction to Session 5
• Codecs: Why & What
• Hands-On/Lab Session
• Wrap-Up & Homework

https://www.elecard.com/zh/products/video-analysis/streameye

Some Basic Concepts and Understanding

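◼ Existing competitors
  ⚫ Socionext, NXP, Freescale, TI, Ambarella, Nvidia, Intel, HiSilicon
◼ Audio/video sources
  ⚫ Interfaces (HDMI, MIPI, Ethernet…)
  ⚫ Audio formats: AES, Dolby, 44K, 192K…
  ⚫ Video formats: YUV, RGB, MP4, MKV, AVI…
◼ Main advantages of the MPSoC VCU
  ⚫ HVQ
  ⚫ One chip
  ⚫ Customization…
◼ Challenges encountered
◼ Price
◼ Added value from PL-side design…

[Figure: chart with the labels "Basic algorithm", "General register" and "Special register", the values 60%, 30% and 10%, and the caption "small size, high quality".]

GStreamer Overview

• Element
• Pad: IN is a sink pad, OUT is a src (source) pad; a pad's availability is always, sometimes, or on request
• Caps (properties)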

• Bin

Installing GStreamer on Ubuntu

1. Install GStreamer on Ubuntu (verified to work)
       sudo apt-get install libgstreamer1.0-dev
2. Building applications using GStreamer
       pkg-config --cflags --libs gstreamer-1.0
3. Getting the tutorial's source code
       git clone git://anongit.freedesktop.org/gstreamer/gst-docs
4. Building the tutorial
       cd ~/gst-docs/examples/tutorials
       gcc basic-tutorial-1.c -o basic-tutorial-1 `pkg-config --cflags --libs gstreamer-1.0`
5. Running the tutorial
       ./basic-tutorial-1

What basic-tutorial-1 does:

1. Initializes GStreamer's internal data structures
2. Checks which plugins are available
3. Parses and executes the GStreamer arguments passed on the command line
4. Builds the pipeline that plays the video
5. Sets the pipeline to GST_STATE_PLAYING to start playing the video stream
6. Calls gst_bus_timed_pop_filtered() to wait for ERROR or EOS (End-Of-Stream)
7. Frees the allocated memory at the end of the program

Building a GStreamer Application – Code Flow

[Figure: source listing of basic-tutorial-1 with callouts 1 to 7 marking the seven steps.]
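The seven steps can also be sketched in Python. This is a minimal sketch, not the tutorial's own code: it assumes the PyGObject bindings for GStreamer 1.0 are installed, and the helper names `pipeline_description` and `play` are hypothetical. The `playbin` element and the `Gst` calls mirror what the C tutorial uses.

```python
def pipeline_description(uri):
    """Build the textual pipeline handed to the parser (step 4)."""
    return f"playbin uri={uri}"


def play(uri):
    """Play one URI to completion; returns the ERROR or EOS message."""
    # Imported here so the module loads even where PyGObject is absent.
    import gi
    gi.require_version("Gst", "1.0")
    from gi.repository import Gst

    # Steps 1-3: initialise the library and its plugin registry.
    # Passing sys.argv instead of None lets GStreamer parse its own options.
    Gst.init(None)

    # Step 4: build the playback pipeline from its description.
    pipeline = Gst.parse_launch(pipeline_description(uri))

    # Step 5: start playing.
    pipeline.set_state(Gst.State.PLAYING)

    # Step 6: block until an error occurs or the stream ends.
    bus = pipeline.get_bus()
    message = bus.timed_pop_filtered(
        Gst.CLOCK_TIME_NONE,
        Gst.MessageType.ERROR | Gst.MessageType.EOS,
    )

    # Step 7: release the pipeline's resources.
    pipeline.set_state(Gst.State.NULL)
    return message
```

Calling `play("file:///path/to/clip.mp4")` with a real media URI would block until playback finishes; the path shown is a placeholder.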

Review the HDMI-RX V4L Topology

HDMI-RX Driver Stack

• HDMI-RX Linux Driver Stack
  • Video PHY
    • Driver: hdmi-modules/hdmi/phy-vphy.c
  • V4L2 HDMI-RX Subsystem
    • Driver: hdmi-modules/hdmi/xilinx-hdmirx.c
  • V4L2 Video Processing Subsystem
    • Driver: linux-xlnx/drivers/…/xilinx-vpss-scaler.c
  • AXI Broadcaster**
    • No Broadcaster Driver (transparent operation)
    • Dummy Driver Required for V4L2 media pipeline
  • Framebuffer Write
    • Driver: linux-xlnx/drivers/dma/…/xilinx_frmbuf.c
  • V4L2 Media Pipeline Driver
    • Driver: linux-xlnx/drivers/media/…/xilinx-vipp.c


Software loading and performance review

[Figure: PS-side and PL-side hardware mapped to the application software stack. Labels: PS Side H/W; PL Side H/W; App S/W Stack; CPU (x86, ARM); GPU (Nvidia, MPSoC); IP; Xilinx DPU; RTL IP; Audio; HEVC Codec; Video Conversion; Capture; Video EN / Decoder; Data Streaming; Linux / Windows; App; GStreamer App; OpenCV library; Kernel Code; RTL IP Driver; V4L2 Media Driver.]

MPSoC VCU Video Data Flow (IN)

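[Figure: input-side data flow through the MPSoC VCU. Recoverable labels:]

• Stages: peripheral input interface, video capture, video source format, data analysis and parameter setup, data-processing IP, VCU start, driver, transfer to buffer memory
• Each stage runs on either ARM or PL RTL
• Buffers: PS_DDR, PL_DDR
• Inputs: PL I/O, MIPI, camera, video interfaces (SDI/HDMI/DP …), storage devices (USB/SD/DDR …), video streaming (HTTP/TCP, RTP/UDP), PCIe DMA
• Container formats: MP4/AVI/MOV/MKV
• Raw video data: RGB, YUV, YCbCr; 4:2:0, 4:2:2, 4:4:4; 8/10/12… bit
• V4L2 pixel formats: NV12, NV16, IOAL, P010, XV20, XV15, P210
• VCU compression (encode) and VCU decompression (decode)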
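To make the chroma subsampling and bit-depth labels concrete, here is a minimal sketch of how raw frame size and memory bandwidth follow from the pixel format. It covers only the 8-bit NV12 and NV16 formats and ignores any stride or alignment padding a real driver may add; the function names are hypothetical.

```python
# Average bits per pixel for two 8-bit semi-planar formats.
BITS_PER_PIXEL = {
    "NV12": 12,  # 4:2:0: full-resolution luma plus quarter-resolution Cb and Cr
    "NV16": 16,  # 4:2:2: full-resolution luma plus half-resolution Cb and Cr
}


def frame_bytes(width, height, fmt):
    """Size of one unpadded raw frame in bytes."""
    return width * height * BITS_PER_PIXEL[fmt] // 8


def stream_bytes_per_second(width, height, fps, fmt):
    """Raw bandwidth needed to move every frame through DDR once."""
    return frame_bytes(width, height, fmt) * fps


if __name__ == "__main__":
    for fmt in BITS_PER_PIXEL:
        rate = stream_bytes_per_second(3840, 2160, 60, fmt)
        print(f"4Kp60 {fmt}: {rate / 1e6:.1f} MB/s per pass through memory")
```

Each extra hop between PS_DDR and PL_DDR repeats this traffic, which is one reason buffer placement matters for the pipeline.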

MPSoC VCU Video Data Flow (OUT)

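[Figure: output-side data flow through the MPSoC VCU. Recoverable labels:]

• Stages: invoke and start, peripheral interface, video output, confirm output, select container, transfer, driver, buffer staging, video destination, format and data transfer, IP, display, storage
• Each stage runs on either ARM or PL RTL
• Buffers: PS_DDR, PL_DDR
• Outputs: PL I/O, MIPI, camera, video interfaces (SDI/HDMI/DP …), storage devices (USB/SD/DDR …), video streaming (HTTP/TCP, RTP/UDP), PCIe DMA, V4L2
• Container formats: MP4/AVI/MOV/MKV
• Raw video data: RGB, YUV, YCbCr; 4:2:0, 4:2:2, 4:4:4; 8/10/12… bit
• VCU compression (encode) and VCU decompression (decode)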

Review the HDMI-RX V4L Topology

V4L Media Controller Framework View

Summary:
• Hardware: Device Tree → Device Drivers → V4L2 / Media Controller Framework: /dev/video0 ...
• Why? Pass CONTROL to software

    root@vcu_trd:~# media-ctl --device /dev/media8 -p
    Media controller API version 5.4.0

    Device topology
    - entity 1: vcap_hdmi output 0 (1 pad, 1 link)
                type Node subtype V4L flags 0
                device node name /dev/video0
            pad0: Sink
                    <- "a0080000.v_proc_ss":1 [ENABLED]

    - entity 5: a0080000.v_proc_ss (2 pads, 2 links)
                type V4L2 subdev subtype Unknown flags 0
                device node name /dev/v4l-subdev15
            pad0: Sink
                    [fmt:VYYUYY8_1X24/1280x720 field:none colorspace:srgb]
                    <- "a0000000.v_hdmi_rx_ss":0 [ENABLED]
            pad1: Source
                    [fmt:VYYUYY8_1X24/1920x1080 field:none colorspace:srgb]
                    -> "vcap_hdmi output 0":0 [ENABLED]

    - entity 8: a0000000.v_hdmi_rx_ss (1 pad, 1 link)
                type V4L2 subdev subtype Unknown flags 0
                device node name /dev/v4l-subdev16
            pad0: Source
                    [fmt:RBG888_1X24/1280x720 field:none colorspace:srgb]
                    [dv.caps:BT.656/1120 min:0x0@25000000 max:4096x2160@297000000 stds:CEA-861,DMT,CVT,GTF caps:progressive,reduced-blanking,custom]
                    [dv.query:no-link]
                    -> "a0080000.v_proc_ss":0 [ENABLED]

[Figure panels: Hardware View, V4L View]
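The topology printout can be turned into an ordered list of pipeline stages with a few regular expressions. The following is a minimal sketch: `SAMPLE` is a trimmed copy of the output above (format lines removed), the function names are hypothetical, and the walk assumes a single linear chain with no branches.

```python
import re

SAMPLE = '''\
- entity 1: vcap_hdmi output 0 (1 pad, 1 link)
        pad0: Sink
                <- "a0080000.v_proc_ss":1 [ENABLED]
- entity 5: a0080000.v_proc_ss (2 pads, 2 links)
        pad0: Sink
                <- "a0000000.v_hdmi_rx_ss":0 [ENABLED]
        pad1: Source
                -> "vcap_hdmi output 0":0 [ENABLED]
- entity 8: a0000000.v_hdmi_rx_ss (1 pad, 1 link)
        pad0: Source
                -> "a0080000.v_proc_ss":0 [ENABLED]
'''

ENTITY_RE = re.compile(r'^- entity \d+: (.+?) \(\d+ pads?, \d+ links?\)')
PAD_RE = re.compile(r'^\s*pad(\d+): (?:Sink|Source)')
LINK_RE = re.compile(r'^\s*(<-|->) "([^"]+)":(\d+) \[([A-Z,]+)\]')


def parse_links(text):
    """Return (src_entity, src_pad, dst_entity, dst_pad) per enabled '->' link."""
    links = []
    entity = pad = None
    for line in text.splitlines():
        if (m := ENTITY_RE.match(line)):
            entity = m.group(1)
        elif (m := PAD_RE.match(line)):
            pad = int(m.group(1))
        elif (m := LINK_RE.match(line)):
            # '<-' lines repeat the same link from the sink side; skip them.
            if m.group(1) == "->" and "ENABLED" in m.group(4).split(","):
                links.append((entity, pad, m.group(2), int(m.group(3))))
    return links


def pipeline_order(links):
    """Walk a linear chain from the entity that nothing feeds to the final sink."""
    downstream = {src: dst for src, _, dst, _ in links}
    fed = set(downstream.values())
    node = next(src for src in downstream if src not in fed)
    order = [node]
    while node in downstream:
        node = downstream[node]
        order.append(node)
    return order


if __name__ == "__main__":
    print(" -> ".join(pipeline_order(parse_links(SAMPLE))))
```

For the sample, the walk starts at the HDMI-RX subsystem, passes through the video processing subsystem, and ends at the capture node behind /dev/video0.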

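Considerations for HEVC Compression Quality Parameters

• By relationship between frames
• By pixel content (static and dynamic)

Under the Hood: What is going on?

Under The Hood: Partitions & Intra-frames (intra-frame prediction)

Under The Hood: P-Frames (motion prediction)

Under The Hood: B-Frames (prediction from preceding and following frames)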
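How I-, P- and B-frames interleave inside a group of pictures (GOP) can be shown with a short sketch. The parameter names `gop_length` and `b_frames` are illustrative, and the pattern is a simplification in display order; a real encoder decides frame types itself and may handle the end of a GOP differently.

```python
def gop_frame_types(gop_length, b_frames):
    """Simplified display-order frame types for one GOP.

    The first frame is an I-frame (intra prediction only). After that, every
    (b_frames + 1)-th frame is a P-frame (predicted from earlier frames) and
    the frames in between are B-frames (predicted from both directions).
    """
    if gop_length < 1 or b_frames < 0:
        raise ValueError("gop_length must be >= 1 and b_frames >= 0")
    types = []
    for index in range(gop_length):
        if index == 0:
            types.append("I")
        elif index % (b_frames + 1) == 0:
            types.append("P")
        else:
            types.append("B")
    return "".join(types)


if __name__ == "__main__":
    print(gop_frame_types(12, 2))
    print(gop_frame_types(6, 0))
```

A longer GOP spends fewer bits on I-frames at a given quality, but a decoder joining mid-stream waits longer for the next I-frame.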

VCU Software Stack Status and Roadmap

Releases: Available 2020.1 (Jun’20) · Planned 2020.2 (Nov’20) · Roadmap 2021.1 and later (May’21)

[Table flattened in extraction; the assignment of each item to a release column is not recoverable.]

Control Software
• HDR10 Support
• Power Management
• Interlace solution enhancements
• Dynamic resolution change enhancements
• Region of Interest (ROI) based encoding enhancements: provide an option in the existing parameter to take the delta QP from users
• NTSC resolution support for HEVC coding
• Audio Support
• Scene Change Detect (SCD) IP
• Interlace support
• Low latency pipeline:
  • Audio Support
  • Up to four stream Encoding
• Frame / Macroblock Skip Enhancement
• HLG (Hybrid Log Gamma) support
• Provide mechanism to add Frame Skip information in stream for application

GStreamer
• HDR10 Support full pipeline (EA release: first week of July)
• HDR10 Support full pipeline: Production release
• Power Management
• Interlace solution enhancements
• Dynamic resolution change enhancements
• Region of Interest (ROI) based encoding enhancements: provide an option in the existing parameter to take the delta QP from users
• NTSC resolution support for HEVC coding
• Audio Support
• Scene Change Detect (SCD) IP
• Interlace support
• Low latency pipeline:
  • Audio Support
  • Up to four stream Encoding
• Frame / Macroblock Skip Enhancement
• HLG (Hybrid Log Gamma) support
• Provide mechanism to add Frame Skip information in stream for application

* For Low latency full pipeline example design details and schedule refer to subsequent slides

Video IP Supporting VCU

Status columns: Available, Planned, Roadmap. Releases: 2019.1 (May’19), 2019.2 (Oct’19), 2020.1 (May’20), 2020.2 and later (Oct’20)

[Table flattened in extraction; the assignment of each item to a release column is not recoverable.]

IP and Drivers (V4L2/DRM/ALSA)
• Scene change detection, memory based
• Scene change detection, streaming based
• Mixer Enhancements
• Mixer IP Enhancement
  • Support output Colorimetry
• HLG support for SDI subsystem IP
• HDR support in HDMI RX and TX
• HDR10 Support for HDMI subsystem IP
• VCU Memory Controller
• Mpeg2 TS Mux Acceleration
• Multi-channel Scaler
• Low latency Pipeline
  • IP Enhancement: PL resource count Optimization
• Sync IP to achieve slice based processing (Single and Multi-stream)
• Eight Audio channel support
• SPDIF Audio Driver
• Audio support with Interlace Video
• Enabled Multiple sample rate Audio support in ALSA
• VPSS driver: 10 bit support
• VCU PL DDR4 Controller IP
  • New DDR Part support
    ○ MT40A1G8SA-075E-2666

GStreamer Support
• Low Latency Full Pipeline (Single stream, June 15th)
• Low Latency Full Pipeline (Multiple stream)
• Scene change detection, memory based
• Scene change detection, streaming based
• Mixer Enhancements
• Mixer IP Enhancement
  • Support output Colorimetry
• HLG support in SDI subsystem IP
• HDR10 Support for HDMI subsystem IP
• VCU Memory Controller
• Mpeg2 TS Mux Acceleration
• Multi-channel Scaler
• Low latency Pipeline
  • Sync IP Enhancement: PL resource count Optimization
• Sync IP to achieve slice based processing (Single Stream)
• Sync IP to achieve slice based processing (Multiple Stream)
• Eight Audio channel support
• Audio support with Interlace Video
• VCU PL DDR4 Controller IP
  • New DDR Part support
    ○ MT40A1G8SA-075E-2666

* This information is about Video IP working with the VCU, not a general update of the Video IP and its software

Low Latency (<35ms) Video Solutions Available Now

Vivado    Use case                     Capture  Encode  Decode  Display  Total
Release                                (ms)     (ms)    (ms)    (ms)     (ms)
2017.1    Normal Latency               16.6     16.6    83.33   16.6     133
2017.3    Reduced Latency              16.6     16.6    50      16.6     99.8
2018.1    Low Latency VCU Only         16.6     4       22.6    16.6     59.8
2019.2    Low Latency Full Pipeline*   2        5       10      12.5     <35

*Numbers subject to change

• The latency calculations assume 60fps
• Low latency modes assume 8 slices per frame
• Low latency modes have restrictions on GOP structure and bit rates

<35ms Example Designs

Single-stream
• HDMI only
• 4Kp60, 1080p60, 4Kp30, 720p60
• Chroma format: 4:2:0, 8-bit
• SDI Rx/Tx support

Multi-stream (2)
• HDMI only
• 4Kp30, 1080p60, 720p60
• Chroma format: 4:2:0, 8-bit
• Chroma 4:2:2, 10-bit support – Coming Soon

Release labels on the slide: Early Access; Public Release Vivado 2019.2
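The totals in the latency table are the sums of the four stage latencies, and 16.6 ms is one frame period at 60 fps. A minimal sketch that reproduces the arithmetic from the slide's own numbers (the dictionary and function names are illustrative):

```python
# Per-stage latencies in milliseconds, copied from the table.
# Stage order: capture, encode, decode, display.
LATENCY_MS = {
    "Normal Latency (2017.1)": (16.6, 16.6, 83.33, 16.6),
    "Reduced Latency (2017.3)": (16.6, 16.6, 50.0, 16.6),
    "Low Latency VCU Only (2018.1)": (16.6, 4.0, 22.6, 16.6),
    "Low Latency Full Pipeline (2019.2)": (2.0, 5.0, 10.0, 12.5),
}

FPS = 60  # the slide states that the calculations assume 60 fps


def frame_period_ms(fps):
    """Duration of one frame in milliseconds."""
    return 1000.0 / fps


def total_latency_ms(stages):
    """End-to-end latency as the sum of the stage latencies."""
    return sum(stages)


def latency_in_frames(stages, fps=FPS):
    """End-to-end latency expressed in frame periods."""
    return total_latency_ms(stages) / frame_period_ms(fps)


if __name__ == "__main__":
    for name, stages in LATENCY_MS.items():
        total = total_latency_ms(stages)
        frames = latency_in_frames(stages)
        print(f"{name}: {total:.1f} ms, about {frames:.1f} frame periods")
```

Stage latencies below one frame period, as in the full-pipeline row, are only possible when stages work on slices rather than whole frames, which matches the slide's assumption of 8 slices per frame.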

Codecs & Bitrates for Different Applications

Unsupported Features

The following features are not supported:
• Scalable video coding in AVC/HEVC
  ○ Non-Annex B (AVCC)
• H.264 (AVC)
  ○ Interlace video format
  ○ Encoder:
    - Lossless mode (transform bypass, I_PCM)
  ○ Decoder:
    - Flexible macroblock ordering (FMO)
    - Arbitrary slice ordering (ASO)
    - Redundant slice (RS)
    - Dynamic chroma format/profile/level change within a single stream
• H.265 (HEVC)
  ○ Encoder:
    - Sample adaptive offset (SAO) filter
    - Asymmetric motion partition (AMP)
    - Lossless mode (transquant bypass, PCM)
    - Transform skip mode
    - Wavefront parallel processing (WPP)
  ○ Decoder:
    - Dynamic chroma format/profile/level change within a single stream

See the Zynq UltraScale+ MPSoC Production Errata (EN285) for more information.

VCU parameter           GStreamer property
Profile                 Profile
Level                   Level
Tier                    Tier
ChromaMode              N/A
BitDepth                N/A
NumSlices               num-slices
SliceSize               slice-size
DependentSlice          dependent-slice
EntropyMode             entropy-mode
CabacInit               No
PicCbQpOffset           No
PicCrQpOffset           No
SliceCbQpOffset         No
SliceCrQpOffset         No
ScalingList             scaling-list
QpCtrlMode              qp-mode
CuQpDeltaDepth          No
ConstrainedIntraPred    constrained-intra-prediction
VrtRange_P              low-bandwidth
LoopFilter              loop-filter-mode
LoopFilter.CrossSlice   No
LoopFilter.CrossTile    No
LoopFilter.BetaOffset   loop-filter-beta-offset
LoopFilter.TcOffset     loop-filter-alpha-c0-offset
CacheLevel2             prefetch-buffer
AspectRatio             aspect-ratio
LookAhead               look-ahead
VideoMode               N/A
EnableSEI               N/A
LambdaCtrlMode          No
LambdaFactors           No
EnableFillerData        filler-data
AvcLowLat               No
ColourDescription       N/A
ColourMatrix            N/A
CostMode                No
EnableAUD               No
FileScalingList         No
NumCore                 No
SCDFirstPass            No
SliceLat                No
SubframeLatency         No
TransferCharac          No
TwoPass                 No
WaveFront               No
RateCtrlMode            control-rate
BitRate                 target-bitrate
MaxBitRate              max-bitrate
FrameRate               N/A
SliceQP                 quant-i-frames, quant-p-frames, quant-b-frames
MinQP                   min-qp
MaxQP                   max-qp
InitialDelay            initial-delay
CPBSize                 cpb-size
IPDelta                 No
PBDelta                 No
ScnChgResilience        N/A
MaxPictureSize          max-picture-size
EnableSkip              skip-frame

GStreamer-only settings:
• default-roi-quality: default quality level to apply to each ROI
• roi (dynamic setting): set ROI range and quality
• insert SEI (dynamic setting): insert SEI
• scene change (dynamic setting): trigger a scene change to change look-ahead

4K AVC & FHD AVC Encode PRBS comparison

720 AVC & 4K HEVC Encode PRBS comparison

FHD HEVC & 720p HEVC Encode PRBS comparison
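The VCU-to-GStreamer parameter table earlier in this section can be applied mechanically when porting a control-software configuration to a GStreamer pipeline. The following is a minimal sketch using a subset of the rows; the function name is hypothetical, values are passed through unchanged, and units or enumerations may differ between the two interfaces, so each value should be checked against the encoder element's documentation.

```python
# Subset of the table: VCU control-software parameter -> GStreamer property.
CFG_TO_GST = {
    "BitRate": "target-bitrate",
    "MaxBitRate": "max-bitrate",
    "RateCtrlMode": "control-rate",
    "QpCtrlMode": "qp-mode",
    "NumSlices": "num-slices",
    "MinQP": "min-qp",
    "MaxQP": "max-qp",
    "CPBSize": "cpb-size",
    "InitialDelay": "initial-delay",
}


def to_gst_properties(cfg):
    """Translate a parameter dict into 'name=value' pairs.

    Returns the property string plus the list of parameters that have no
    entry in the mapping (rows marked "No" or "N/A" in the table fall here).
    """
    props, unmapped = [], []
    for key, value in cfg.items():
        name = CFG_TO_GST.get(key)
        if name is None:
            unmapped.append(key)
        else:
            props.append(f"{name}={value}")
    return " ".join(props), unmapped


if __name__ == "__main__":
    settings = {"BitRate": 20000, "NumSlices": 8, "TwoPass": 1}
    properties, skipped = to_gst_properties(settings)
    print("encoder properties:", properties)
    print("no GStreamer equivalent:", skipped)
```

The resulting string would be appended to the encoder element in a `gst-launch-1.0` pipeline description; parameters reported as unmapped need another mechanism or are unavailable from GStreamer.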