SUBMITTED TO IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, AUGUST 2020

Quantization and Entropy Coding in the Versatile Video Coding (VVC) Standard

Heiko Schwarz, Muhammed Coban, Marta Karczewicz, Tzu-Der Chuang, Frank Bossen, Senior Member, IEEE, Alexander Alshin, Jani Lainema, and Christian R. Helmrich, Senior Member, IEEE

Abstract—The paper provides an overview of the quantization and entropy coding methods in the Versatile Video Coding (VVC) standard. Special focus is laid on techniques that improve coding efficiency relative to the methods included in the High Efficiency Video Coding (HEVC) standard: the inclusion of trellis-coded quantization, the advanced context modeling for entropy coding of transform coefficient levels, the binary arithmetic coding engine with multi-hypothesis probability estimation, and the joint coding of chroma residuals. Beside a description of the design concepts, the paper also discusses motivations and implementation aspects. The effectiveness of the quantization and entropy coding methods specified in VVC is validated by experimental results.

Index Terms—Versatile Video Coding (VVC), quantization, entropy coding, transform coefficient coding, video coding.

I. INTRODUCTION

THE Versatile Video Coding (VVC) standard [1], [2] is the most recent joint video coding standard of the ITU-T and ISO/IEC standardization organizations. It was developed by the Joint Video Experts Team (JVET), a partnership between the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG). VVC was technically finalized in July 2020 and will be published as ITU-T Rec. H.266 and ISO/IEC 23090-3 (MPEG-I Part 3).

The primary objective of the new VVC standard is to provide a significant increase in compression capability compared to its predecessor, the High Efficiency Video Coding (HEVC) standard [3]. At the same time, VVC includes design features that make it suitable for a broad range of video applications. In addition to conventional video applications, it particularly addresses the coding of video with high dynamic range and wide color gamut, computer-generated video (e.g., for remote screen sharing or gaming), and omnidirectional video, and it supports adaptive streaming with resolution switching, scalable coding, and tile-based streaming for immersive applications. Despite the rich set of coding tools and functionalities, particular care was taken to enable decoder implementations with reasonable complexity in both hardware and software.

Similar to all previous video coding standards of the ITU-T and ISO/IEC since H.261 [4], the VVC design follows the general concept of block-based hybrid video coding. The video pictures are partitioned into rectangular blocks and each block is predicted by intra- or inter-picture prediction. The resulting prediction error blocks are coded using transform coding, which consists of an orthogonal transform, quantization of the transform coefficients, and entropy coding of the resulting quantization indexes. Quantization artifacts are attenuated by applying so-called in-loop filters to reconstructed pictures before they are output or used as references for inter-picture prediction of following pictures.

Although VVC uses the same coding framework as its predecessors, it includes various improvements that eventually result in a substantially improved compression performance. One of the most prominent changes in comparison to HEVC is the very flexible block partitioning concept [5] that supports non-square blocks for coding mode selection, intra-picture prediction, inter-picture prediction, and transform coding and, thus, impacts the design of many other aspects. In the present paper, we describe modifications to quantization and entropy coding. The coding efficiency improvements in this area can be mainly attributed to the following four features:
• the support of trellis-coded quantization (TCQ);
• the advanced entropy coding of quantization indexes suitable for both TCQ and scalar quantization;
• the binary arithmetic coding engine with multi-hypothesis probability estimation;
• the support of joint chroma residual coding.

Manuscript uploaded August 22, 2020.
H. Schwarz and C. R. Helmrich are with the Fraunhofer Institute for Telecommunications, Heinrich Hertz Institute, 10587 Berlin, Germany. H. Schwarz is also with the Institute of Computer Science, Free University of Berlin, 14195 Berlin, Germany (e-mail: [email protected]; [email protected]).
M. Coban and M. Karczewicz are with Qualcomm Technologies Inc., San Diego, CA 92121, USA (e-mail: [email protected]; [email protected]).
T.-D. Chuang is with MediaTek Inc., Hsinchu 30078, Taiwan (e-mail: [email protected]).
F. Bossen is with Sharp Electronics of Canada Ltd., Mississauga, ON L4Z 1W9, Canada (e-mail: [email protected]).
A. Alshin is in Moscow, 121614, Russia (e-mail: [email protected]).
J. Lainema is with Nokia Technologies, 33101 Tampere, Finland (e-mail: [email protected]).

These changes in quantization and entropy coding together with a block-adaptive transform selection [6] eventually led to a substantially increased efficiency of the transform coding design in VVC compared to that of HEVC.

The paper is organized as follows. Section II describes the quantization in VVC with special focus on the TCQ design. The entropy coding of quantization indexes including context modeling is presented in Section III. Section IV discusses the improvements of the core binary arithmetic coding engine. The joint coding of chroma prediction errors is described in Section V. Experimental results validating the effectiveness of the quantization and entropy coding tools are provided in Section VI, and Section VII concludes the paper.

II. QUANTIZATION

Quantization is an irreversible mapping of input values to output values. For the specification in image and video coding standards, it is split into a non-normative encoder mapping of input samples to integer quantization indexes, which are also referred to as levels and are transmitted using entropy coding, and a normative decoder mapping of the quantization indexes to reconstructed samples. The aim of quantization is to decrease the bit rate required for transmitting the quantization indexes while maintaining a low reconstruction error.

In hybrid video coding, quantization is generally applied to transform coefficients that are obtained by transforming prediction error blocks (also referred to as residual blocks) using an approximately orthogonal transform. The transforms used have the property that, for typical residual blocks, the signal energy is concentrated into a small number of transform coefficients. This eventually has the effect that simple scalar quantizers are more effective in the transform domain than in the original sample space [7]. In particular for improving the coding efficiency for screen content [8], where residual blocks often have different properties, VVC also provides a transform skip (TS) mode, in which no transform is applied, but the residual samples are quantized directly.

Similarly as in AVC (Advanced Video Coding) [9] and HEVC, the quantizer design in VVC is based on scalar quantization with uniform reconstruction quantizers. But VVC also includes two extensions that can improve coding efficiency at the cost of an increased encoder complexity.

A. Basic Design: Uniform Reconstruction Quantizers

In scalar quantization, the reconstructed value t′_k of each input coefficient (or sample) t_k depends only on the associated quantization index q_k. Uniform reconstruction quantizers (URQs) are a simple variant, in which the set of admissible reconstruction values is specified by a single parameter, called quantization step size Δ_k. The decoder operation is given by a simple scaling, t′_k = Δ_k · q_k. Similar to previous ITU-T and ISO/IEC video coding standards, VVC supports quantization weighting matrices by which the quantization step size can be varied across the transform coefficients of a block. Conceptually, the step size for a coefficient t_k is given by Δ_k = α_k Δ, where α_k is a weighting factor that depends on the location of the coefficient t_k inside the transform block and Δ is a quantization step size, which can be selected on a block basis among a pre-defined set of candidates. The chosen Δ is indicated by a non-negative integer value referred to as quantization parameter (QP). VVC uses an exponential relationship between Δ and QP, which was originally introduced in AVC. When neglecting rounding operations, the reconstruction of transform coefficients can be written as

    t′_k = α_k · 2^{(QP−4)/6} · q_k.                              (1)

An increase of the quantization parameter by one corresponds to an increase of about 12% for the quantization step size.

For avoiding reconstruction mismatches, the entire VVC decoding process is specified using exact integer operations (similar to AVC and HEVC). In comparison to the idealized case with orthogonal transforms, the inverse transform for a W×H block includes an additional scaling by √(WH) · 2^{B−15}, where B represents the bit depth of the color component in bits per sample. Consequently, the scaling in the decoder has to approximately generate reconstructed coefficients

    t′_k = α_k · 2^{(QP−4)/6} · 2^{15−B} · (WH)^{−1/2} · q_k,    (2)

which are then used as input values to the inverse transform. With β = ⌈(1/2) log2 WH⌉, γ = 2β − log2 WH, p = ⌊QP/6⌋, and m = QP % 6, where ⌈·⌉ and ⌊·⌋ denote the ceiling and floor functions, respectively, and % denotes the modulus operator, the mapping q_k ↦ t′_k can be rewritten according to

    t′_k = (2^4 α_k) · (2^{(32+3γ+m)/6}) · 2^p · 2^{5−β−B} · q_k. (3)

Since both the width W and the height H of a transform block are integer powers of two, γ ∈ {0, 1} is a binary parameter. For obtaining a realization with integer operations, the two terms in parentheses are rounded to integer values and the multiplication with 2^{5−β−B} is approximated by a bit shift. The VVC standard specifies the reconstruction according to

    t′_k = ( w_k · (a[γ][m] ≪ p) · q_k + ((1 ≪ b) ≫ 1) ) ≫ b,    (4)

where ≪ and ≫ denote bit shifts to the left and right (in two's complement arithmetic), respectively, and b = B + β − 5. The 2×6 array a[γ][m] specifies integer values that approximate the terms 2^{(32+3γ+m)/6}. It is given by a = {{40, 45, 51, 57, 64, 72}, {57, 64, 72, 80, 90, 102}}. The integer values w_k = round(2^4 α_k), with w_k ∈ [1; 255], are called scaling list. As further detailed in Section II-F, scaling lists for different block types can be specified in a corresponding high-level data structure. If scaling lists are not used, the values w_k are inferred to be equal to 16, which corresponds to Δ_k = Δ.

In transform skip mode, no inverse transform is applied and, hence, no additional scaling factor has to be included in the reconstruction process of residual samples r′_k. Furthermore, the concept of scaling lists is not applicable. An integer realization of the reconstruction r′_k = Δ · q_k is obtained by using (4) with w_k = 16, γ = 0, and b = 10, which yields

    r′_k = ( (a[0][m] ≪ (p + 4)) · q_k + 512 ) ≫ 10.             (5)

B. Quantization Improvements

If one only considers scalar quantization, the restriction to URQs has no negative impact on coding efficiency. When combined with a suitable entropy coding and encoder decision, URQs can achieve virtually the same rate-distortion efficiency as optimal scalar quantizers for typical distributions of transform coefficients [10], [11]. However, even for statistically independent transform coefficients, the usage of scalar quantization results in an unavoidable loss in coding efficiency relative to the fundamental rate-distortion bound. This gap can only be reduced by using vector quantizers (VQs) [12].

VVC includes two advanced techniques for quantizing transform coefficients that are referred to as sign data hiding and trellis-coded quantization. Both have properties of VQs, but also represent simple extensions of URQs and require only minimal changes of the decoding process. Since these approaches cannot utilize statistical dependencies of the input data, the basic concept of transform coding is not modified. The dependencies between residual samples are exploited by applying the quantization in the transform domain and by using an appropriate entropy coding method.

C. Sign Data Hiding

Sign data hiding (SDH) [13]–[15] is a technique that is already included in HEVC and has not been modified in the context of VVC. Consider a block of reconstructed transform coefficients {t′_k} that is represented by a corresponding block of quantization indexes {q_k}, with t′_k = Δ_k q_k. The basic idea of SDH is to omit the coding of the sign for one nonzero index in {q_k} and instead derive it from the parity of the sum of absolute values |q_k|. In comparison to scalar quantization with the same step sizes Δ_k, SDH saves about 1 bit per block, which for suitably large blocks outweighs the average increase in distortion. But note that an encoder has to carefully select quantization indexes {q_k} that obey the sign hiding condition in order to achieve coding efficiency improvements.

In HEVC and VVC, SDH is applied on the basis of so-called coefficient groups (CGs), which represent groups of successive levels q_k in coding order (see Section III); in most cases, they include 16 levels. If the difference between the scan indexes of the last and first nonzero levels (in coding order) inside a CG is greater than 3, the sign for the last nonzero level of the CG is not coded but derived based on the sum of absolute values, Σ_{k∈CG} |q_k|, where odd sums indicate negative values. At the decoder side, SDH does not require any change of the scaling in (4); only the entropy coding of sign flags is modified.

D. Trellis-Coded Quantization

The second improvement [16], [17] employs the concept of trellis-coded quantization (TCQ), first described in [18]. Since the reconstruction process specified in the standard does not use trellis structures, the TCQ design included in VVC is also referred to as dependent quantization. TCQ was well studied in the 1990s and it was demonstrated that it can significantly outperform the best scalar quantizers [18]–[21]. Due to its simple structure, it can be applied for quantizing vectors of arbitrary dimensions.

From a decoder perspective, TCQ specifies two scalar quantizers and a procedure for switching between these quantizers. The two scalar quantizers Q0 and Q1 used in VVC are illustrated in Fig. 1. Similar to URQs, the reconstruction levels of both quantizers represent integer multiples of a quantization step size Δ_k. The quantizer Q0 includes the even multiples of Δ_k and the quantizer Q1 includes the odd multiples of Δ_k and, in addition, the value of zero. Note that both quantizers are symmetric and include the reconstruction level equal to zero. This deviation from conventional TCQ designs improves the coding efficiency at low and medium rates without requiring significant changes of the entropy coding (in comparison to using URQs). For both quantizers Q0 and Q1, the selected reconstruction levels t′_k are indicated by integer quantization indexes q_k, as illustrated by the labels in Fig. 1.

Fig. 1. Scalar quantizers Q0 and Q1. The circles indicate the reconstruction levels and the labels represent the associated quantization indexes.

In contrast to scalar quantization, the transform coefficients of a block have to be reconstructed in a pre-defined order, which shall be indicated by the index k. The reconstruction order is chosen to be equal to the coding order of quantization indexes q_k, which additionally enables the exploitation of certain TCQ properties in the entropy coding (see Section III). Given the reconstruction order, the procedure for switching between the two quantizers Q0 and Q1 can be specified by a state machine with 2^K states (K ≥ 2), where the state s_k for a current coefficient t_k uniquely determines the quantizer used. The state s_{k+1} for the next coefficient t_{k+1} is determined by the current state s_k and the parity p_k = (q_k & 1) of the current quantization index q_k (the operator & represents a bitwise "and" in two's complement arithmetic). Even though the achievable coding efficiency increases with the number of states [18], [22], the TCQ design in VVC uses the minimal number of 4 states for limiting the required encoder complexity. The state transition and quantizer selection are given in Table I. The initial state s_0 is always set equal to zero.

TABLE I
STATE TRANSITION AND QUANTIZER SELECTION

  current     quantizer          next state s_{k+1}
  state s_k   used for t_k   (q_k & 1) = 0   (q_k & 1) = 1
      0           Q0               0               2
      1           Q0               2               0
      2           Q1               1               3
      3           Q1               3               1

The reconstruction of transform coefficients t′_k is specified as follows. First, the quantization indexes q_k for a block are mapped to integer multiplication factors q*_k for the corresponding quantization step sizes Δ_k. Then, the multiplication t′_k = q*_k Δ_k is approximated as in the conventional URQ case. For a block with N transform coefficients, the calculation of the factors q*_k can be specified by the following algorithm, where stateTransTable represents the 4×2 state transition table given in Table I and sgn(·) denotes the signum function:

  s_0 = 0
  for k = 0 to N−1 do
    q*_k = 2 · q_k − (s_k ≫ 1) · sgn(q_k)
    s_{k+1} = stateTransTable[ s_k ][ q_k & 1 ]
  end for

The distance between two neighboring reconstruction levels in the quantizers Q0 and Q1 is in most cases 2Δ_k, and not Δ_k as for URQs. For obtaining approximately the same reconstruction quality for a given QP, regardless of whether TCQ is enabled, the quantization step sizes Δ_k have to be scaled for TCQ. As verified experimentally, a scaling factor of 2^{−5/6}, corresponding to a QP decrease of 5, represents a suitable choice.

Hence, when TCQ is enabled, the scaling t′_k = q*_k Δ_k is specified by re-using (4), but with q_k being replaced with q*_k and modified parameters b = B + β − 4, p = ⌊(QP + 1)/6⌋, and m = (QP + 1) % 6.

For supporting TCQ at the decoder side, only three changes are required: (1) an additional mapping from levels q_k to multiplication factors q*_k; (2) a modification of the scaling parameters b, p, and m; and (3) a state-dependent context selection, which will be described in Section III.

E. Encoder Operation

Even though the quantization process at the encoder, i.e., the algorithm for selecting levels q_k, is outside the scope of the standard, it has a significant impact on coding efficiency. State-of-the-art video encoders often use algorithms that select the levels q = {q_k} for a block by minimizing a Lagrangian function J(q) = D(q) + λR(q) of the MSE distortion D(q) and the number of bits R(q) required for transmitting the levels [23]–[25]. The Lagrange multiplier λ determines the operating point and is typically set depending on a base QP. These approaches take into account dependencies between levels q_k that are introduced in the entropy coding and are referred to as rate-distortion optimized quantization (RDOQ). An RDOQ algorithm suitable for URQs and the entropy coding design in HEVC and VVC is described in [26]. This algorithm is also implemented in the reference encoders HM [27] and VTM [28] for HEVC and VVC, respectively.

1) Sign Data Hiding: When SDH is enabled, an encoder has to ensure that the sign hiding condition (the parity of the sum of absolute levels correctly indicates the sign of the last nonzero index) is met for all CGs. This is typically achieved as follows [15]. First, the RDOQ algorithm for URQs is applied. Then, in a second step, for all CGs for which the sign condition is violated, one of the levels q_k is increased or decreased by one. The corresponding level as well as the direction of the change are selected by minimizing the Lagrange cost J(q).

2) Trellis-Coded Quantization: The quantizer switching in TCQ introduces dependencies, which have to be taken into account for achieving a good coding efficiency. The potential transitions between the quantizers Q0 and Q1 can be elegantly represented by a trellis with 4 states per coefficient [18]. The selection of indexes q for a block is then equivalent to finding the path with minimum J(q) through the trellis.

For a better consideration of certain entropy coding aspects (coding of last position), the algorithm [17] implemented in the VTM software uses a trellis with 5 states, as shown in Fig. 2. In addition to the states 0–3, it includes an "uncoded" state, which represents levels equal to 0 that precede the first nonzero level in coding order. Note that the start and end states s_{k−1} and s_k, respectively, of a connection between two nodes uniquely determine the quantizer used and the parity of the associated level q_k. For each connection, the candidate level q_k that minimizes the difference |t_k − t′_k(q_k)| between the original and reconstructed coefficients is determined first. Then, the final levels q = {q_k} for a block are selected among these candidates by applying the Viterbi algorithm¹ [29]. The cost assigned to a connection represents the contribution D_k(q_k) + λR_k(q_k | ·) of the associated candidate q_k to the overall cost J(q). Preceding levels in the trellis paths are taken into account for calculating the rate terms R_k(q_k | ·).

Fig. 2. Trellis structure used in the VTM reference encoder [28]. (The figure shows, for k = 0, 1, ..., N−1, the "uncoded" state and the states s_k = 0–3 with their connections.)

¹Due to complicated dependencies in the entropy coding, the Viterbi algorithm does not yield the optimal solution, but it still provides a very good trade-off between coding efficiency and implementation complexity.

There are several possibilities for speeding up the encoding process, for example, by approximating the Lagrange costs or pruning unlikely connections. The VTM reference encoder uses a simple method that significantly reduces the encoder run time for typical video bit rates, at which most high-frequency coefficients are small compared to the quantization step size. In an initial step, the first original coefficient t_i in coding order with |t_i| > Δ_i/2 is determined. The levels for all coefficients that precede this coefficient in coding order are set equal to zero and the Viterbi algorithm starts at k = i.

F. Quantization Control

As described above, VVC supports three quantizer designs (URQs, SDH, and TCQ) with different trade-offs between achievable coding efficiency and encoder complexity. An encoder can select the one that best suits the application requirements. The choice is indicated in the slice header.

For enabling both block-based rate control algorithms and perceptually optimized encoding approaches (e.g., [30]), the QPs can be selected on a block basis. The corresponding blocks are called quantization groups (QGs); their sizes are indicated in the picture header. The QPs for the luma component are coded differentially. For each QG that contains nonzero levels, the difference between the QP used and a prediction derived from spatially neighboring QGs is transmitted. For the chroma components, the QPs are derived from the luma QP of the co-located block via look-up tables. There are three different tables: one for the Cb component, one for the Cr component, and another one that is explicitly used for the JCCR modes with |m| = 2 (see Section V). For supporting a wide range of transfer functions and color formats, an encoder has the freedom to choose suitable look-up tables. They are defined by piece-wise linear mapping functions that are coded in the sequence parameter set. VVC supports QP values in the range from 0 to 63 + 6(B − 8), inclusive, where B denotes the bit depth of the corresponding color component.

As noted above, the quantization of transform coefficients can be additionally controlled by weighting matrices, which are specified using scaling lists. The main motivation is that the usage of frequency-dependent quantization step sizes can help an encoder to better take the contrast sensitivity behavior of human vision into account. In total, VVC includes 28 scaling lists, each defining weighting factors for a 2×2, 4×4, or 8×8 array of coefficients.
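To make the trellis search concrete, the following toy sketch runs a Viterbi-style survivor search over the 4-state trellis of Table I. It is only a conceptual illustration under strong assumptions: squared error as distortion and a crude rate proxy λ·|q_k| instead of the actual CABAC rate estimates used in the VTM encoder; all names are our own.

```python
# Toy Viterbi search over the 4-state TCQ trellis (Table I): selects levels
# q_k minimizing the sum of squared errors plus a crude rate proxy lam*|q_k|.
# A conceptual sketch only -- not the VTM algorithm, which uses CABAC rates
# and a 5-state trellis with an "uncoded" state.

STATE_TRANS = [[0, 2], [2, 0], [1, 3], [3, 1]]

def recon(q, state, delta):
    """Reconstruction t'(q) for quantizer Q0 (states 0,1) or Q1 (states 2,3)."""
    sgn = (q > 0) - (q < 0)
    return (2 * q - (state >> 1) * sgn) * delta

def tcq_encode(coeffs, delta, lam):
    # paths[s] = (cost, chosen levels) of the best survivor ending in state s.
    paths = {0: (0.0, [])}                   # initial state s_0 = 0
    for t in coeffs:
        new_paths = {}
        for s, (cost, levels) in paths.items():
            base = t / (2 * delta)           # candidate levels near t
            for q in {int(base), int(base) + 1, int(base) - 1, 0}:
                d = (t - recon(q, s, delta)) ** 2
                c = cost + d + lam * abs(q)
                s_next = STATE_TRANS[s][q & 1]
                if s_next not in new_paths or c < new_paths[s_next][0]:
                    new_paths[s_next] = (c, levels + [q])
        paths = new_paths
    return min(paths.values())[1]            # levels of the cheapest survivor

print(tcq_encode([7.3, 0.2, -3.9], delta=2.0, lam=1.0))
```

The per-state survivor update is the essential point: because the next state depends only on the current state and the level parity, keeping one best path per state suffices for the search.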

The scaling lists can be transmitted in a high-level syntax structure referred to as adaptation parameter set; similarities between the different lists are exploited using predictive coding. The list that is used for a transform block is determined by the color component, the prediction mode, and the maximum of the width and height of the block. For block sizes not equal to 2×2, 4×4, or 8×8, the weighting matrices are resampled using nearest-neighbor interpolation.

III. TRANSFORM COEFFICIENT CODING

Similarly as HEVC, VVC employs context-based adaptive binary arithmetic coding (CABAC) for entropy coding of all low-level syntax elements. Non-binary syntax elements are mapped to binary codewords. The bijective mapping between symbols and codewords, for which typically simple structured codes are used, is called binarization. The binary symbols, also called bins, of both binary syntax elements and codewords for non-binary data are coded using binary arithmetic coding. The core coding engine, which is further discussed in Section IV, supports two operating modes: a regular mode, in which the bins are coded with adaptive probability models, and a less complex bypass mode that uses fixed probabilities of 1/2. The adaptive probability models are also called contexts and the assignment of probability models to individual bins is referred to as context modeling. Note that both the binarization and the context modeling used have a significant impact on coding efficiency. The required encoder and decoder complexities primarily increase with the number of context-coded bins (i.e., bins coded in regular mode). But they are also affected by other aspects such as the degree of dependencies between successive bins, the complexity of the context modeling used, or the frequency with which a switching between the regular and bypass modes of the arithmetic coding engine occurs.

The entropy coding of quantization indexes for transform blocks is commonly referred to as transform coefficient coding. Since, at typical video bit rates, transform coefficient levels consume the major part of the total bit rate, it is important to find a reasonable trade-off between coding efficiency and implementation complexity. The basic concept of the transform coefficient coding in VVC is similar to the coefficient coding specified in HEVC [15]:
1) A coded block flag (CBF) indicates whether a transform block includes any nonzero levels;
2) For blocks with CBF equal to 1, the x and y coordinates of the last nonzero level in forward scan order are transmitted;
3) Starting from the indicated last position, the levels are transmitted in reverse scan order, organized into so-called coefficient groups (CGs). The bins for a CG are coded in multiple passes, where all bypass-coded bins are grouped together in order to enable efficient implementations.

Since VVC supports a larger range of transform sizes than HEVC, some aspects of the transform coefficient coding were generalized. In contrast to HEVC, the scan order does not depend on the intra prediction mode, as such a mode-dependent scan was found to provide only negligible improvements and would unnecessarily complicate the design. Moreover, the context modeling for the bins representing levels is independent of the block size; there are no exceptions for certain block shapes. But instead, the context dependency restrictions found in HEVC are relaxed and local statistical dependencies between levels are utilized for increasing coding efficiency. For enabling the exploitation of certain TCQ properties, the binarization for levels includes a parity bin and all context-coded bins of a CG are coded in a single pass. VVC uses a transform-block-based restriction on the number of context-coded bins to keep a similar worst-case complexity as HEVC.

A. Coded Block Flag

The coded block flag (CBF) is coded in the regular mode of the coding engine. In total, 9 contexts are used (4 for luma, 2 for Cb, and 3 for Cr). One context per component is reserved for blocks coded in BDPCM mode (a special variant of the transform skip mode, see [8]). For luma, two contexts are used only for transform blocks coded in the intra sub-partitioning mode (see [31]); here, the chosen context depends on the CBF of the preceding luma transform block inside the same coding unit. In order to exploit statistical dependencies between the CBFs of the chroma components, the context for Cr blocks not coded in BDPCM mode is selected depending on the CBF of the co-located Cb block.

B. Coefficient Groups and Scan Order

The transform coefficient levels {q} of a W×H transform block are arranged in a W×H matrix. For enabling a harmonized processing across all block sizes (see also [15]), but also for increasing coding efficiency for transform blocks in which the signal energy is concentrated into transform coefficients that correspond to low horizontal or low vertical frequencies, transform blocks are partitioned into coefficient groups (CGs). As further detailed in Section III-D, the levels for each CG are coded in a unified manner using multiple scan passes. Since VVC also supports block sizes with widths and heights less than 4, the shape of CGs depends on the transform block size as shown in Table II. For transform blocks with at least 16 coefficients, the CGs always include 16 levels; for smaller blocks, CGs of 2×2 levels are used.

TABLE II
COEFFICIENT GROUP SIZES FOR W×H TRANSFORM BLOCKS

                               width W
  height H     1      2     4     8     16     32     64
      1        –      –     –     –    16×1   16×1   16×1
      2        –     2×2   2×2   8×2    8×2    8×2    8×2
      4        –     2×2   4×4   4×4    4×4    4×4    4×4
      8        –     2×8   4×4   4×4    4×4    4×4    4×4
     16      1×16    2×8   4×4   4×4    4×4    4×4    4×4
     32      1×16    2×8   4×4   4×4    4×4    4×4    4×4
     64      1×16    2×8   4×4   4×4    4×4    4×4    4×4

The coding order of CGs is given by the reverse diagonal scan illustrated in Fig. 3. Independent of the CG size, the CG diagonals are processed from the bottom right to the top left of a transform block, where each diagonal is scanned in down-left direction. For limiting the worst-case decoder complexity, high-frequency coefficients of large transforms are forced to be equal to zero [6]. Nonzero quantization indexes can only be present in a min(W, 32)×min(H, 32) region at the top-left of a transform block².
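The reverse diagonal scan described above can be sketched as follows. This is a non-normative illustration (our own function name and coordinate convention: x is the column index, y the row index, so "down-left" means decreasing x and increasing y):

```python
# Illustrative sketch: reverse diagonal scan order for a w x h grid of CGs
# (diagonals processed from bottom-right to top-left, each diagonal scanned
# in down-left direction).

def reverse_diagonal_scan(w, h):
    """Return (x, y) positions in coding order for a w x h grid."""
    order = []
    for d in range(w + h - 2, -1, -1):       # diagonals x + y = d, last first
        # down-left: start at the largest valid x on this diagonal
        for x in range(min(d, w - 1), -1, -1):
            y = d - x
            if y < h:
                order.append((x, y))
    return order

# 2x2 grid: the bottom-right position comes first, the top-left (DC) last.
print(reverse_diagonal_scan(2, 2))  # → [(1, 1), (1, 0), (0, 1), (0, 0)]
```

The same routine describes both levels of the scan, since the coding order of levels inside a CG and the coding order of CGs inside a block follow the same reverse diagonal pattern.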

Hence, CGs outside this region are not coded and thus excluded from the scan, as is illustrated in Fig. 3d. The coding order of levels inside CGs is specified by the same reverse diagonal scan.

Fig. 3. Illustration of the reverse diagonal scan: Coding order of CGs in (a) 8×16 blocks, (b) 16×16 blocks, (c) 32×16 blocks, and (d) 64×16 blocks. The scan shown in (b) also illustrates the coding order of levels in 4×4 CGs. In (d), the right half of the block is the zero-out region.

²Although VVC specifies larger zero-out regions for transforms other than the DCT-II, see [6], this does not impact the transform coefficient coding; the syntax elements specifying the transform used are coded after the levels and they are conditioned on the presence of nonzero levels in certain regions.

C. Last Significant Coefficient Position

Similar as in HEVC, the explicit coding of zero quantization indexes for coefficients related to high-frequency components is eliminated by transmitting the position of the last nonzero level in forward scan order (which is the first nonzero level in coding order). This does not only increase coding efficiency, but also reduces the number of context-coded bins.

The x and y coordinates corresponding to the column and row number, respectively, in the matrix of coefficient levels are transmitted independently of each other. As shown in Table III, each component is represented by a combination of a prefix codeword and a (possibly empty) suffix codeword. The prefix part specifies an interval of values. It is binarized using truncated unary (TU) binarization and the bins are coded in regular mode. The prefix part indicating the last interval of the non-zero-out region of a transform block is truncated. That means, the zero bins in parentheses shown in Table III are not coded if min(W, 32), for the x coordinate, or min(H, 32), for the y coordinate, is equal to the number in the last table column. In particular, the coding of a coordinate is completely skipped if the corresponding block width or height is equal to 1. The suffix part represents the offset inside the interval indicated by the prefix part. It is binarized using fixed-length (FL) binarization and coded in bypass mode. Only x and y coordinates with values greater than 3 have a suffix part.

TABLE III
BINARIZATION FOR COORDINATES OF LAST COEFFICIENT POSITION

  coordinate    prefix (TU)             suffix (FL)
  0             (0)                     –            ← 1
  1             1 (0)                   –            ← 2
  2             1 1 0                   –
  3             1 1 1 (0)               –            ← 4
  4 – 5         1 1 1 1 0               x0
  6 – 7         1 1 1 1 1 (0)           x0           ← 8
  8 – 11        1 1 1 1 1 1 0           x0 x1
  12 – 15       1 1 1 1 1 1 1 (0)       x0 x1        ← 16
  16 – 23       1 1 1 1 1 1 1 1 0       x0 x1 x2
  24 – 31       1 1 1 1 1 1 1 1 1       x0 x1 x2     ← 32

At the decoder side, the values of the x and y coordinates of the last significant level are derived as follows. Let v_pre be the number of bins equal to 1 in the prefix codeword. Then, the number n_suf of suffix bins to be decoded is derived by

    n_suf = max( 0, ⌊v_pre / 2⌋ − 1 ).                           (6)

With v_suf being the value specified by the suffix codeword (in binary representation), the decoded coordinate value last is calculated according to

    last = 2^{n_suf} · (2 + (v_pre & 1)) + v_suf,   if n_suf > 0;
           v_pre,                                   otherwise.   (7)

The prefix part for the x coordinate is signaled first, followed by that for the y coordinate. For grouping bypass-coded bins, the suffix parts are coded after the prefix codewords. The prefix bins of the x and y coordinates are coded using separate sets of context models. Table IV lists the context offsets that indicate the probability model used inside a set. The model chosen depends on whether a luma or chroma block is coded, the width or height of the transform block, and the bin number inside the prefix codeword. Note that for large transform blocks, where zero-out is present, the transform dimension (and not the dimension of the non-zero-out region) is used to derive the context offset. In total, 46 contexts (40 for luma and 6 for chroma) are used for coding the last coefficient position.

TABLE IV
CONTEXT INDICES FOR PREFIX BINS OF LAST COEFFICIENT COORDINATES

            transform                    bin index
            dimension    0   1   2   3   4   5   6   7   8
  luma          2        0   1
                4        0   1   2
                8        3   3   4   4   5
               16        6   6   7   7   8   8   9
               32       10  10  11  11  12  12  13  13  14
               64       15  15  16  16  17  17  18  18  19
  chroma        2       20  21
                4       20  21  22
                8       20  20  21  21  22
               16       20  20  20  20  21  21  21
               32       20  20  20  20  21  21  21  21  22

D. Binarization and Coding Order

Starting with the CG containing the last nonzero level (as indicated by the x and y coordinates), the CGs are transmitted in coding order (given by the reverse diagonal scan). The first syntax element coded for a CG is the sb_coded_flag. If this flag is equal to 0, it indicates that the CG contains only zero levels. For the first CG (which contains the last nonzero level) and the last CG (which contains the DC level), this flag is not coded, but inferred to be equal to 1. The sb_coded_flag is coded in regular mode. The chosen context depends on whether the CG to the right or the CG below contain any nonzero levels, where separate context sets are specified for
6 for chroma) are used for coding the last coefficient position. The prefix part specifies an interval of values. It is binarized using truncated unary (TU) binarization and the bins are coded in regular mode. The prefix part indicating the last interval of D. Binarization and Coding Order the non-zero-out region of a transform block is truncated. That Starting with the CG containing the last nonzero level (as means, the zero bins in parenthesis shown in Table III are not indicated by the x and y coordinates), the CGs are transmitted coded if max(W, 32), for the x coordinate, or max(H, 32), in coding order (given by the reverse diagonal scan). The first for the y coordinate, is equal to the number in the last table syntax element coded for a CG is the sb coded flag. If this column. In particular, the coding of a coordinate is completely flag is equal to 0, it indicates that the CG contains only zero skipped if the corresponding block width or height is equal levels. For the first CG (which contains the last nonzero level) to 1. The suffix part represents the offset inside the interval and the last CG (which contains the DC level), this flag is not coded, but inferred to be equal to 1. The sb coded flag 2Although VVC specifies larger zero-out regions for transforms other than is coded in regular mode. The chosen context depends on the DCT-II, see [6], this does not impact the transform coefficient coding; the syntax elements specifying the transform used are coded after the levels and whether the CG to the right or the CG below contain any they are conditioned on the presence of nonzero levels in certain regions. nonzero levels, where separate context sets are specified for SUBMITTED TO IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, AUGUST 2020 7
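The decoder-side derivation of Eqs. (6) and (7) amounts to a few integer operations; the sketch below (illustrative, not the normative specification text) reconstructs one coordinate from the decoded prefix and suffix values:

```python
def decode_last_coordinate(v_pre, v_suf=0):
    """Derive one coordinate of the last significant position from the
    number of 1-bins in its TU prefix (v_pre) and the value of its
    FL-coded suffix (v_suf), following Eqs. (6) and (7)."""
    n_suf = max(0, v_pre // 2 - 1)        # Eq. (6): number of suffix bins
    if n_suf > 0:                         # Eq. (7)
        return (1 << n_suf) * (2 + (v_pre & 1)) + v_suf
    return v_pre
```

For example, a prefix of six 1-bins and a 2-bit suffix of value 3 yields the coordinate 11, matching the interval 8-11 in Table III.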

luma and chroma. In total, 4 contexts (2 for luma and 2 for chroma) are used. For CGs with sb_coded_flag equal to 1, the level values are coded as described in the following.

The binarization of coefficient levels and the coding order of bins were chosen to support an efficient entropy coding for both TCQ and conventional quantization. Due to the different structures of the two scalar quantizers Q0 and Q1 used in TCQ (see Fig. 1), the probability that a level is equal to 0 highly depends on the quantizer used. For exploiting this effect in context modeling (Section III-E) and, at the same time, grouping the context- and bypass-coded bins, the binarization includes a dedicated parity flag that is used for determining the TCQ state during entropy coding [32]. By additionally taking into account the number of context-coded bins required for achieving a good coding efficiency [33], [34] as well as the dependencies between successive bins [35], the binarization shown in Table V was chosen. The absolute values |q| of the quantization indexes are mapped to the bins sig (significance), gt1 (greater than 1), par (parity), gt3 (greater than 3), and the non-binary remainder rem.

TABLE V
BINARIZATION OF THE ABSOLUTE LEVEL VALUES

  |q|   0  1  2  3  4  5  6  7  8  9  10  11  ···
  sig   0  1  1  1  1  1  1  1  1  1  1   1   ···
  gt1   -  0  1  1  1  1  1  1  1  1  1   1   ···
  par   -  -  0  1  0  1  0  1  0  1  0   1   ···
  gt3   -  -  0  0  1  1  1  1  1  1  1   1   ···
  rem   -  -  -  -  0  0  1  1  2  2  3   3   ···

The syntax elements for a CG are coded in multiple passes over the scan positions. Unlike HEVC, where a single syntax element per coefficient is coded per scan pass, VVC codes up to 4 syntax elements per coefficient in a single pass. In the first pass, the context-coded bins sig, gt1, par, and gt3 are coded in an interleaved manner (i.e., all bins for a scan position are coded before proceeding to the next scan position). Note that the parity bin driving the TCQ state machine is included in the first pass for enabling an efficient coding of the sig bin for the TCQ case. For scan positions for which the sig bin can be inferred to be equal to 1 (e.g., for the last significant position), it is not signaled. The presence of the gt1, par, and gt3 bins is controlled as specified in Table V. The non-binary remainders rem are coded in a second scan pass. They are binarized using similar parametric codes as in HEVC, and the resulting bins are coded in the bypass mode of the coding engine.

In order to increase the worst-case throughput, the number of context-coded bins that can be coded in the first pass is restricted [33], [35]. For allowing a suitable distribution of context-coded bins across CGs, the limit is specified on a transform block basis. With N being the number of transform coefficients in the non-zero-out region of a transform block, the maximum allowed number of context-coded bins is set to 1.75·N. This would correspond to 28 bins per CG if the bin budget was distributed equally among CGs, which is only slightly higher than the limit specified in HEVC (25 bins). The limit on context-coded bins is enforced as follows. If, at the start of a scan position, the total number of already coded sig, gt1, par, and gt3 bins for the transform block exceeds 1.75·N - 4, i.e., less than 4 bins are remaining in the budget, the first coding pass is terminated. In that case, the absolute values |q| for the remaining scan positions are coded in a third scan pass. They are represented by the syntax elements decAbsLevel, which are completely coded in bypass mode.

Finally, in the fourth and last pass, the signs for all nonzero levels of a CG are coded in bypass mode. If SDH is enabled and the difference between the scan indexes of the last and first nonzero level inside the CG is greater than 3, the sign for the last nonzero level is not signaled. Fig. 4 illustrates the organization of level data into the different scan passes.

Fig. 4. Illustration of the scan passes (shaded bins have zero values). Before the limit for the number of regular-coded bins is reached, the absolute levels for a CG are coded in passes 1 and 2. After reaching the limit, the remaining absolute values are coded in pass 3 only. The signs are coded in the last pass.

E. Context Modeling

In order to efficiently utilize conditional statistics for arithmetic coding, VVC uses a rather large set of context models for coding the bins sig, gt1, par, and gt3. Beside the TCQ state³, the context modeling also exploits statistical dependencies between spatially neighboring quantization indexes, similar to the approaches described in [36]-[38].

³It was observed [32] that the probability distribution for the sig bin actually depends on the TCQ state and not only on the quantizer used.

The context for the sig bin depends on the associated TCQ state s_k, the diagonal position d = x + y of the coefficient in the transform block, and the sum of partially reconstructed absolute levels q* inside the local template T illustrated in Fig. 5a. The partially reconstructed absolute levels are given by already coded bins for neighboring scan positions and can be calculated according to

  q* = sig + gt1 + par + 2 · gt3.   (8)

For luma blocks, the context index c_lum^sig indicating the adaptive probability model used is derived according to

  c_lum^sig = 12 · max(0, s_k - 1) + f_sig(T) + { 8, d < 2;  4, 2 ≤ d < 5;  0, d ≥ 5 },   (9)

with

  f_sig(T) = min(3, (1 + Σ_T q*) >> 1)   (10)

being a function of the partially reconstructed levels q* inside the local template T. For chroma blocks, only two classes of

diagonal positions (d < 2 and d ≥ 2) are used. The context index c_chr^sig is derived by

  c_chr^sig = 8 · max(0, s_k - 1) + f_sig(T) + { 4, d < 2;  0, d ≥ 2 }.   (11)

When TCQ is not enabled, the value of the TCQ state s_k is set equal to 0. In total, 60 context models are used for coding the sig bin (36 for luma and 24 for chroma).

Fig. 5. Local template T (gray) around a current scan position (black): (a) for transform coefficient coding; (b) for transform skip residual coding.

The probability models chosen for gt1, par, and gt3 do not depend on the TCQ state, as it was found to provide only a very minor benefit. A single shared context offset is computed to select the probability model for these syntax elements. It is chosen based on the diagonal position d of the coefficient (4 classes for luma and 2 for chroma) and the sum of the values max(0, q* - 1) inside the local template T. With

  f(T) = min(4, Σ_T max(0, q* - 1))   (12)

being another function of the partially reconstructed levels q* inside the local template T, the context indexes c_lum and c_chr for luma and chroma blocks, respectively, are given by

  c_lum = f(T) + { 1, d ≥ 10;  6, 3 ≤ d < 10;  11, 0 < d < 3;  16, d = 0 },   (13)

  c_chr = f(T) + { 1, d > 0;  6, d = 0 }.   (14)

In addition, for the last coefficient position a separate context (given by c_lum = 0 and c_chr = 0) is used. For each of the gt1, par, and gt3 bins, 32 probability models (21 for luma and 11 for chroma) are used in total.

F. Binarization of Bypass-Coded Level Data

The syntax elements rem coded in the second pass represent remainders for absolute levels. They are only transmitted for a scan position if the associated gt3 bin is equal to 1. With q* being a partially reconstructed level according to (8), the absolute value |q| of the level is given by

  |q| = q* + 2 · rem.   (15)

The remainders rem and the syntax elements decAbsLevel, which represent absolute levels coded in the third pass, are binarized using a combination of truncated Rice (TR) and Exp-Golomb (EG) codes, similar to the remainder values in HEVC. The resulting bins are coded in the bypass mode of the coding engine. Unlike HEVC, the Rice parameter for the TR codes is derived based on the sum of absolute level values |q| in a local template T. The local template T used is the same as the template used for context index derivation in the first coding pass. The Rice parameter m is given by

  m = { 0, s_T < 7;  1, 7 ≤ s_T < 14;  2, 14 ≤ s_T < 28;  3, s_T ≥ 28 },   with s_T = Σ_T |q| - 5 · z0,   (16)

where z0 is set equal to 4 for coding the remainders rem, and it is set equal to 0 for coding decAbsLevel. The reason for this difference is that the values of decAbsLevel specify complete absolute levels, while the remainders rem represent differences rem = (|q| - q*)/2, which have smaller values.

For each Rice parameter m, values less than v_max = 6 · 2^m are coded using only TR codes of order m (TR_m); this corresponds to codes with a unary prefix of length 6. For values greater than or equal to v_max, the TR_m codes are concatenated with Exp-Golomb codes of order m + 1 (EG_m+1). Table VI shows the binarization for Rice parameters m = 0 and m = 3 with a concatenation of TR_m and EG_m+1 codes; for values below v_max, the codewords consist of the TR_m code only, while the additional prefix and suffix bins for larger values belong to the EG_m+1 extension. When the combined code length would exceed 32 bins, the binarization is slightly modified [39]. In this case, the length of the Exp-Golomb prefix is limited to 11 bins (see the last row for m = 0 in Table VI) and the remaining 15 bins of the 32-bit budget are used to represent the suffix part.

TABLE VI
BINARIZATION FOR SYNTAX ELEMENTS rem AND decAbsLevel

  value range       prefix (unary)       suffix (fixed length)
  Rice parameter m = 0
  0                 0
  1                 10
  2                 110
  3                 1110
  4                 11110
  5                 111110
  [6; 7]            1111110              x
  [8; 11]           11111110             xx
  ·········
  [2052; 4099]      11111111111111110    xxxx xxxx xxx
  [4100; 32768]     11111111111111111    xxxx xxxx xxxx xxx
  Rice parameter m = 3
  [0; 7]            0                    xxx
  [8; 15]           10                   xxx
  [16; 23]          110                  xxx
  [24; 31]          1110                 xxx
  [32; 39]          11110                xxx
  [40; 47]          111110               xxx
  [48; 63]          1111110              xxxx
  [64; 95]          11111110             xxxx x
  ·········
  [8224; 16415]     1111111111111110     xxxx xxxx xxxxx
  [16416; 32768]    11111111111111110    xxxx xxxx xxxxx x

For increasing the coding efficiency for completely bypass-coded levels [33], the values of decAbsLevel do not represent the absolute level values |q| directly, but are derived as

  decAbsLevel = { pos0,      |q| = 0;
                  |q| - 1,   0 < |q| ≤ pos0;
                  |q|,       |q| > pos0 }.   (17)
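To make the mapping of Table V concrete, the following sketch (an illustrative reimplementation, not code from the standard) derives the bins for an absolute level and inverts them via q* of Eq. (8) and the remainder relation of Eq. (15):

```python
def binarize_abs_level(q):
    """Map |q| to the bins of Table V: sig, gt1, par, gt3 and the
    non-binary remainder rem; a bin that is not present is None."""
    sig = 1 if q > 0 else 0
    gt1 = (1 if q > 1 else 0) if sig == 1 else None
    par = (q - 2) & 1 if gt1 == 1 else None   # parity, present for q >= 2
    gt3 = (1 if q > 3 else 0) if gt1 == 1 else None
    rem = (q - 4) >> 1 if gt3 == 1 else None  # rem = (|q| - q*) / 2, Eq. (15)
    return sig, gt1, par, gt3, rem

def abs_level_from_bins(sig, gt1, par, gt3, rem):
    """Inverse mapping: q* = sig + gt1 + par + 2*gt3 (Eq. (8)) and
    |q| = q* + 2*rem (Eq. (15))."""
    q_star = sig + (gt1 or 0) + (par or 0) + 2 * (gt3 or 0)
    return q_star + 2 * (rem or 0)
```

For example, |q| = 5 maps to sig = 1, gt1 = 1, par = 1, gt3 = 1, rem = 0, in agreement with the corresponding column of Table V.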
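The concatenation of TR and EG codes shown in Table VI can be sketched as follows; the function returns the bin string for a given value and Rice parameter m, and omits the 32-bin length limitation discussed above (illustrative only):

```python
def tr_eg_binarize(value, m):
    """Binarize value with Rice parameter m: a truncated-Rice code of
    order m for values below v_max = 6 * 2**m, otherwise a prefix of
    six 1-bins followed by an Exp-Golomb code of order m + 1."""
    v_max = 6 << m
    if value < v_max:
        bins = "1" * (value >> m) + "0"            # TR (unary) prefix
        if m:
            bins += format(value & ((1 << m) - 1), "0{}b".format(m))
        return bins
    bins, k, rest = "1" * 6, m + 1, value - v_max  # EG(m+1) extension
    while rest >= (1 << k):                        # grow the EG prefix
        bins += "1"
        rest -= 1 << k
        k += 1
    return bins + "0" + format(rest, "0{}b".format(k))
```

For m = 0, the value 9 yields the bins 1111111001, matching the row [8; 11] of Table VI.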

These values are coded using the same binarization as for the remainders rem. Note that the parameter pos0 basically specifies the position of the codeword for |q| = 0 in a reordered codeword table. It is derived based on the Rice parameter m and the TCQ state s_k according to

  pos0 = 2^m · { 1, s_k < 2;  2, s_k ≥ 2 }.   (18)

G. Transform Skip Residual Coding

In addition to the regular residual coding (RRC) for transform coefficients described above, VVC also includes a dedicated entropy coding for quantization indexes in transform skip mode, which is referred to as transform skip residual coding (TSRC). It was mainly designed for improving coding efficiency for screen content and can be enabled on a slice level. When enabled, the TSRC scheme is used for coding quantization indexes of transform skip blocks; when not enabled, the quantization indexes of transform skip blocks are coded with the regular residual coding.

In contrast to the regular residual coding, the position of the last nonzero level is not transmitted and the levels are coded in forward scan order, i.e., starting from the top-left coefficient and proceeding to the bottom-right coefficient. Similar to RRC, the syntax elements for a CG are coded in multiple passes over the scan positions, and the same limit for the number of context-coded bins is applied. As long as this limit is not reached, the levels are coded using three passes, as shown in Fig. 6. In the first pass, the bins of sig, sign, gt1, and par are interleaved and context-coded using adaptive probability models. A local template, as shown in Fig. 5b, is also applied in TSRC for deriving the context indexes, but it only includes two neighboring coefficient positions. Since, in transform skip blocks, successive signs often have similar values, the sign flags are included into the first pass and are coded in the regular mode of the coding engine. If the limit of regular-coded bins is still not reached after the first pass for a CG, up to four greater-than-x flags (gt3, gt5, gt7, and gt9) per coefficient are coded in a second pass. These bins are also context-coded. Finally, in a third pass, the remainders for absolute levels (rem) are coded in bypass mode. Note that the remainders can have different meanings, depending on whether the bin limit was reached for a scan position during the second pass (and, thus, no gt3 bin could be coded). For all scan positions for which no data were transmitted in the first pass, the complete absolute values (decAbsLevel) as well as the associated sign flags are coded in bypass mode in a fourth pass. The Rice parameter m for both rem and decAbsLevel is always set equal to 1. For more details on the design of TSRC, the reader is referred to [8].

Fig. 6. Illustration of coding passes in transform skip residual coding (shaded elements have zero values).

IV. BINARY ARITHMETIC CODING

Context-based adaptive binary arithmetic coding (CABAC) [40] was originally introduced in AVC as one of two supported entropy coding methods. Due to its superior coding efficiency compared to conventional variable-length coding, it is the only entropy coding method supported in both HEVC and VVC. But while AVC and HEVC share the same core coding engine, VVC introduces a new engine for the regular coding mode that is designed to be more flexible and efficient.

In binary arithmetic coding, the coding engine consists of two elements: probability estimation and codeword mapping. The purpose of probability estimation is to determine the likelihood of the next binary symbol having the value 1. This estimation is based on the history of symbol values coded using the same context and typically uses an exponential decay window [41]. Given a sequence of binary symbols x(t), with t ∈ {1, ···, N}, the estimated probability p(t + 1) of x(t + 1) being equal to 1 is given by

  p(t + 1) = p(1) + α · Σ_{k=1}^{t} (1 - α)^(t-k) · (x(k) - p(1)),   (19)

where p(1) is an initial probability estimate and α is a base determining the rate of adaptation. Alternatively, this can be expressed in a recursive manner as

  p(t + 1) = p(t) · (1 - α) + x(t) · α.   (20)

The engine of AVC and HEVC implements such an exponential smoothing estimator using a single finite state machine with 128 states. VVC also uses such an estimator, but with some key differences:
  • VVC maintains two estimates for each context, where each estimate uses its own base α. The probability that is actually used for coding is the average of the two estimates. The reason for using multiple estimates is to improve compression performance;
  • VVC defines a different pair of bases for each context to improve compression performance;
  • VVC does not use a state machine, but arithmetically derives the probability estimates using the recursive function described above.

More details on the rationale for using two estimates and per-context customized bases are provided in Section IV-A.

In VVC, the initial estimate p(1) is derived for each context using a linear function of the quantization parameter QP, as is also done in AVC/HEVC. The main difference lies in the fact that, in VVC, the so derived value represents an actual probability (linear space), whereas in AVC/HEVC, it represents a state of the state machine (logarithmic space).
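The estimator of Eq. (20), and the averaging of two estimates with different bases used by the VVC engine, can be sketched as follows (the bases alpha_fast and alpha_slow are arbitrary illustrative values; VVC selects a pair per context):

```python
def update(p, x, alpha):
    """Recursive update of Eq. (20): p(t+1) = p(t)*(1 - alpha) + x(t)*alpha."""
    return p * (1.0 - alpha) + x * alpha

class TwoRateEstimator:
    """Two exponential-decay estimates with different bases; the
    probability used for coding is their average."""
    def __init__(self, p_init=0.5, alpha_fast=2 ** -4, alpha_slow=2 ** -7):
        self.p_fast = self.p_slow = p_init
        self.alpha_fast, self.alpha_slow = alpha_fast, alpha_slow

    def probability(self):
        # average of the fast and slow estimates
        return 0.5 * (self.p_fast + self.p_slow)

    def observe(self, x):
        self.p_fast = update(self.p_fast, x, self.alpha_fast)
        self.p_slow = update(self.p_slow, x, self.alpha_slow)
```

Feeding a long run of 1-symbols moves the combined estimate toward 1, with the fast estimate adapting first while the slow one smooths out short-term fluctuations.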

Fig. 7. Histogram of first-order auto-correlation coefficients ϱ.

Fig. 8. Optimal values for the parameters α0 and α1 as a function of the correlation coefficient ϱ.

For codeword mapping, a current interval is split into two subintervals, each corresponding to one of the possible values of a binary symbol. The range of each subinterval is obtained by multiplying the range r of the current interval with the corresponding probability estimate. In AVC/HEVC, the multiplication is approximated using a lookup table, which determines the range r_LPS of the subinterval associated with the least probable symbol (LPS). In VVC, a direct multiplication is used instead, while using the same LPS convention. Once r_LPS is determined, the AVC/HEVC and VVC engines operate in identical manners.

A. Multi-Hypothesis Probability Estimation

Consider a binary source x(t) and let p0 be the marginal probability of a symbol being equal to 1. When using the initial estimate p(1) = p0, the expected value of the exponential smoothing estimator is given by

  E[ p(t) ] = p0,   (21)

which means that the estimator is unbiased. Assuming that the source is uncorrelated, i.e.,

  E[ (x(t_i) - p0) · (x(t_k) - p0) ] = 0,   if t_i ≠ t_k,   (22)

the variance of the prediction error is

  E[ (p(t) - p0)² ] = α · (p0 - p0²) / (2 - α).   (23)

This implies that the optimal value of α should be 0. However, this is not observed in practice, where larger values of α are found to be optimal. The assumption that the source is uncorrelated is therefore incorrect. Fig. 7 shows the distribution of first-order auto-correlation coefficients ϱ for data collected from a set of VVC bitstreams, where the correlation coefficients were estimated on chunks of 4096 symbols for each context. The range [-0.05, 0.45] and distribution are clearly biased towards positive auto-correlation coefficients.

The combination of multiple estimators using averaging was proposed in [42]. It is given by

  p0(t + 1) = p0(t) · (1 - α0) + x(t) · α0,
  p1(t + 1) = p1(t) · (1 - α1) + x(t) · α1,   (24)
  p(t + 1) = ( p0(t + 1) + p1(t + 1) ) / 2.

For this two-parameter estimator, the relationship between the auto-correlation coefficient ϱ of the source and the estimation error was investigated in [43]. In particular, considering a first-order auto-regressive source model,

  E[ (x(t_i) - p0) · (x(t_k) - p0) ] = ϱ^|t_i - t_k| · σ²,   (25)

with 0 < ϱ < 1, optimal values for α0 and α1 were derived as a function of the correlation coefficient ϱ. As shown in Fig. 8, the first parameter α0 should be equal to 0, and the second parameter α1 should be chosen as a function of ϱ. Using these optimal parameters, it was further shown in [43] that the two-parameter estimator outperforms the traditional one-parameter estimator for a wide range of correlation coefficients ϱ.

The above assumes that the initial probability estimate is set equal to p0. In practice, this is not achievable as p0 may depend on the actual content of a slice. α0 should therefore be set to a value larger than 0 such as to gradually disregard the initial estimate. The larger the value of α0, the less impact the initial estimate has. In VVC, the parameters α0 and α1 were selected for each context using a training algorithm that jointly optimizes these parameters and the initial probability estimates [44].

B. Implementation Considerations

Arithmetic coding is an inherently serial process: each symbol must be processed in sequence. Throughput, measured in the number of symbols processed per second, is a key complexity metric to be considered in the design of a coding engine. Another key complexity metric is the memory requirement. A combination of hardware and software considerations has been used to design the VVC coding engine.

For probability estimation, to simplify implementation and avoid multiplications, the bases α are limited to negative integer powers of 2, i.e., α = 2^(-β) with β being a positive integer. This enables implementations with bit-shifting operations [45],

  q(t + 1) = q(t) - (q(t) >> β) + x(t) · ((2^b - 1) >> β),   (26)

where q(t) is an integer representation of p(t) with b bits. The relationship between p(t) and q(t) is given by

  p(t) = q(t) · 2^(-b) + 2^(-b-1).   (27)

The value of b for each estimator is selected based on coding efficiency and memory considerations. Memory requirements are driven by the product of two numbers: the number of contexts n and the number of bits m required to capture the state for each context. n depends on context modeling, as discussed in Section III, while m is equal to the sum b1 + b2 of the numbers of bits used for the two estimators. As VVC uses two estimators with different adaptation rates, a smaller number of bits is typically required for the estimator with the faster adaptation rate. Hence, b2 = 10 and b1 = 14 bits are used for

the faster and slower estimator, respectively, yielding a total of m = 24 bits per context. This amount is significantly higher than for HEVC (7 bits) but nevertheless remains reasonable.

Multiplications are thus avoided for probability estimation but not for subinterval range computation. While the bit width of the multiplier typically has no impact in software (latency and throughput are the same for 8-, 16-, and 32-bit multiplications), it does matter in hardware, where smaller multipliers are preferred. The size of the multiplier in VVC is thus limited to 5 by 4 bits, where 5 is the number of bits representing the probability estimate and 4 the number of bits representing the range of the current interval. Thus, r_LPS is computed as follows,

  q(t) = q1(t) + 16 · q2(t),   (28)
  q5(t) = (q(t) >> 10) ⊕ (63 · (q(t) >> 14)),   (29)
  r_LPS(t) = ((q5(t) · (r(t - 1) >> 5)) >> 1) + 4,   (30)

where ⊕ specifies the bit-wise "exclusive or" operator.

During the development of VVC, the throughput of the coding engine was measured for optimized software implementations (see experiment 2 in [46]). In that experiment, the throughput of the VVC engine was determined to be about 7% lower than that of the AVC/HEVC engine (128.5 million symbols per second versus 137.8 million symbols per second).

V. JOINT CODING OF CHROMA RESIDUALS

The previous sections focused on the actual quantization and entropy coding of individual blocks of transform coefficients in the VVC standard. An efficient joint representation of multi-component residual block signals, however, was also addressed during the development of VVC. In addition to the cross-component linear model (CCLM) prediction of chroma samples from collocated luma samples [47], VVC provides a means for the joint coding of chroma residuals (JCCR), which is described in the following.

Digital images and video pictures are generally composed of multiple color components (for example, red, green, and blue in RGB color formats and Y, Cb, Cr in the YCbCr color format). In natural pictures acquired via image sensors, a signal correlation can be observed between these color components, causing some redundancy to remain in the quantized residuals. JCCR [48], [49] exploits the correlation between chroma components (particularly, the Cb and Cr components in YCbCr coding) by allowing an encoder to transmit, on a transform unit (TU) basis, only one instead of two quantized residual signals, along with compact correlation angle and sign information. In the decoder, the transmitted downmixed joint residual block signal is then upmixed to the original color components, scaled according to the angle and sign information.

The following two subsections describe the JCCR mode in VVC in more detail. For further information on the fundamental concept behind the JCCR tool, the inter-component transformation (ICT), the reader is referred to [50].

A. Forward and Inverse Rotational Transform

The JCCR processing can be regarded as a switchable inter-component rotational transform applied in addition to conventional intra-component spatial transforms like the DCT-II [50], with the purpose of achieving increased compaction of residual energy into a single component on a block basis. As illustrated in Fig. 9, this rotational transform on two residual blocks r_Cb and r_Cr is controlled by an angle α. Conceptually, the forward and inverse transforms are given by

  (r_C1, r_C2)ᵀ = Tα · (r_Cb, r_Cr)ᵀ,   (r'_Cb, r'_Cr)ᵀ = Tα⁻¹ · (r'_C1, r'_C2)ᵀ,   (31)

with the forward transform matrix

  Tα = βα · [ cos α, sin α;  -sin α, cos α ].   (32)

Fig. 9. Rotational transform of vectors (r_Cb, r_Cr) to (r_C1, r_C2) by an angle α.

In the JCCR modes supported in VVC, the samples of the second component r'_C2 are enforced to be equal to zero. Hence, at the decoder side, both color components r'_Cb and r'_Cr are reconstructed from the transmitted downmix block signal r'_C1. At the encoder side, the rotation angle α can be selected from a predefined set of values. Typically, an encoder would select the angle α_opt that yields the lowest rate-distortion cost (when considering the reconstruction of the Cb and Cr components). Naturally, an encoder can also disable JCCR on a TU basis (for example, if it would decrease coding efficiency); then, the residual blocks for Cb and Cr are coded separately.

To enable efficient hardware and software implementations with integer arithmetic, the scaling parameter βα is chosen as

  βα = max( |cos α|, |sin α| ).   (33)

VVC supports, in total, 6 different rotation angles α, which can be indicated by an angular mode m. The rotation angles α and the corresponding weights of Tα⁻¹ for all supported modes m are shown in Table VII. Note that multiplications by 1/2 can be realized efficiently using bit shifts to the right.

TABLE VII
SUPPORTED VALUES OF α AND ASSOCIATED WEIGHTS OF THE ROTATION MATRIX APPLIED DURING JCCR PROCESSING IN A VVC DECODER

  angular mode m              -3      -2      -1       1       2       3
  value of α               -63.4°   -45°   -26.6°   26.6°    45°    63.4°
  weight of Tα⁻¹ for Cb     -1/2      1       1       1       1      1/2
  weight of Tα⁻¹ for Cr       1      -1     -1/2     1/2      1       1

Thus, JCCR upmixing, which follows the inverse spatial transforms in the decoder, is given by

  r'_Cb = r'_C1,   r'_Cr = (c_sign · r'_C1) >> 1,   for |m| = 1,   (34)
  r'_Cb = r'_C1,   r'_Cr = c_sign · r'_C1,          for |m| = 2,   (35)
  r'_Cr = r'_C1,   r'_Cb = (c_sign · r'_C1) >> 1,   for |m| = 3,   (36)

where c_sign ∈ {-1, 1} represents the sign of the mode m or, equivalently, the sign of the rotation angle α.

B. Signaling of JCCR Usage and Rotation Parameters

The general usage of JCCR can be enabled on a picture level. When enabled, a flag tu_joint_cbcr_residual_flag is transmitted for every TU for which either or both chroma coded block flags, CBF_Cb and CBF_Cr, are equal to 1. If the tu_joint_cbcr_residual_flag is equal to 0, JCCR is not used for the TU and the chroma residual blocks are reconstructed in a conventional manner. Otherwise, if the flag is equal to 1, the absolute value of the JCCR mode m is derived by

  |m| = { 1, if CBF_Cb = 1 and CBF_Cr = 0;
          2, if CBF_Cb = 1 and CBF_Cr = 1;
          3, if CBF_Cb = 0 and CBF_Cr = 1 }.   (37)

The value of c_sign, allowing the decoder to distinguish between m < 0 and m > 0, is conveyed via the picture header syntax element ph_joint_cbcr_sign_flag. This flag is transmitted on a picture level, since it was observed that its optimal value usually varies very little within a video frame. Note that, when JCCR is enabled for a TU and both chroma CBFs are equal to 1, no quantization indexes are sent for the second chroma component. Hence, in all cases, either the Cb residual r_Cb or the Cr residual r_Cr is replaced by the downmix component r_C1.

The support of JCCR does not require any modifications of the transform coding for residual blocks. But as noted in Section II-F, a separate look-up table can be specified for deriving the QP for JCCR modes with |m| = 2. For |m| = 1 and |m| = 3, the QPs for the Cb and Cr components, respectively, are used.

VI. EXPERIMENTAL RESULTS

In the following, we provide experimental results evaluating the coding efficiency impact of the quantization and entropy coding modifications in VVC relative to HEVC. All results were obtained by running coding experiments according to the JVET Common Test Conditions (CTC) [51]. In addition to the test sequences specified in JVET's CTC, we also generated results for two other well-known test sets: the UHD-1 test set of the EBU [52] with 12 sequences in 2160p resolution and the 5 publicly available sequences of the SVT [53] test set. All sequences are given in the YCbCr 4:2:0 format with 8 or 10 bits per sample and frame rates of 24 Hz to 60 Hz.

Since a video coding standard like VVC specifies a combination of multiple coding tools and design concepts, it is difficult to assess the benefit of individual aspects. For example, the quantization influences the effectiveness of all transform coding tools, and the non-normative encoding algorithm has a significant impact on all coding efficiency comparisons. In our coding experiments, we ran simulations with the VTM-9.1 reference software [28] and compared the following five versions:
  1) VTM-9.1 configured according to CTC (enabling all tools that contribute to coding efficiency);
  2) Version 1 with disabling JCCR;
  3) Version 2 with additionally disabling TCQ but enabling SDH (already supported in HEVC) instead;
  4) Version 3 with additionally replacing the arithmetic coder of VVC with that of AVC/HEVC (the same initialization tables are used, but with a mapping to initial states);
  5) Version 4 with additionally replacing the VVC with the HEVC coefficient coding (for supporting all block shapes, the definition of CGs and the scan is not modified).

By comparing bitstreams generated with versions 1 and 5, we can estimate the coding efficiency benefit of the newly added features for quantization and entropy coding. The contribution of individual tools is assessed by comparing two successive versions in the list above. As a measure for coding efficiency differences, we use the Bjøntegaard delta (BD) rate [54] with base QP values of 37, 32, 27, and 22, as specified in the JVET CTC. Note that negative numbers indicate corresponding average savings in bit rate for the same quality, measured as peak signal-to-noise ratio (PSNR).

The BD rates are measured for the three test scenarios all intra (AI), random access (RA), and low delay (LD) specified in the CTC. The average results for the CTC sequence classes (A1 to F) and the two additional test sets (EBU and SVT) are summarized in Table VIII. For each scenario, the table lists two BD rate averages: an average over the sequences of classes A1, A2, B, C, and E as defined in JVET's CTC and an average over all tested HD and UHD sequences. It also reports increases in encoder and decoder run times (measured via geometric averages, see [51]), which give an indication of the impact on encoder and decoder complexity, respectively.

The simulation results indicate that the improvements of quantization and entropy coding in VVC relative to HEVC yield bit-rate savings of roughly 4% at reasonably small increases of encoding and decoding times, where somewhat higher gains (but also higher encoding times) are observed for intra-only coding. The contributions of the individual tools lie in a range of about 0.5-2%. The larger improvements for class F, which comprises sequences with screen content, can be attributed to the newly included transform skip residual coding (see also [8]). The decreased decoding times for enabling TCQ are caused by a slight shift of the quantizer's operating point towards lower bit rates. For the results shown in Table VIII, TCQ was compared to SDH, since the latter is already included in HEVC. When one compares TCQ to conventional quantization with URQs, the bit-rate savings increase by about 0.5-0.7%, which represents the gain of SDH. Due to their VQ properties, both quantization tools show larger gains for higher video qualities.
In contrast to that, JCCR is more effective the design of many tools is affected by the block partitioning, for lower bit rates. JCCR also yields larger benefits for non- improvements in intra- and inter-picture prediction influence 4:2:0 color sampling formats, as these include more samples SUBMITTED TO IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, AUGUST 2020 13
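The JCCR mode derivation and upmixing described above can be summarized in a short decoder-side sketch. This is a simplified illustration, not the normative VVC decoding process; the function names are our own, integer residuals are assumed, and the division by 2 is realized as an arithmetic right shift:

```python
def jccr_mode(cbf_cb, cbf_cr):
    # |m| derivation from the chroma coded block flags, cf. Eq. (37).
    if cbf_cb == 1 and cbf_cr == 0:
        return 1
    if cbf_cb == 1 and cbf_cr == 1:
        return 2
    if cbf_cb == 0 and cbf_cr == 1:
        return 3
    raise ValueError("JCCR requires at least one chroma CBF equal to 1")

def jccr_upmix(r_c1, abs_m, c_sign):
    # Reconstruct both chroma residual blocks from the single transmitted
    # downmix block r_c1, cf. Eqs. (34)-(36). Python's >> on negative ints
    # is an arithmetic shift (rounding towards minus infinity).
    assert c_sign in (-1, 1)
    if abs_m == 1:
        r_cb = list(r_c1)
        r_cr = [(c_sign * x) >> 1 for x in r_c1]
    elif abs_m == 2:
        r_cb = list(r_c1)
        r_cr = [c_sign * x for x in r_c1]
    else:  # abs_m == 3
        r_cr = list(r_c1)
        r_cb = [(c_sign * x) >> 1 for x in r_c1]
    return r_cb, r_cr
```

For example, with both chroma CBFs equal to 1 and ph_joint_cbcr_sign_flag indicating c_sign = −1, a TU carries only r_C1, and the Cr residual is reconstructed as its negated copy.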

TABLE VIII
AVERAGE BD RATES [%] FOR THE IMPROVEMENTS OF QUANTIZATION AND ENTROPY CODING IN VVC RELATIVE TO HEVC

                    all improvements    coefficient coding   arithmetic coding     TCQ (vs SDH)          JCCR
                     Y     Cb    Cr      Y     Cb    Cr       Y     Cb    Cr      Y     Cb    Cr     Y     Cb    Cr
all intra
  EBU (2160p)      -5.3  -3.2  -5.1    -1.2  -0.2  -0.6     -1.4  -1.3  -1.2    -2.6   0.0   0.9   -0.3  -1.6  -4.2
  SVT (1080p)      -4.5  -2.4  -4.5    -0.7  -0.5  -0.8     -1.1  -1.0  -0.9    -2.0  -2.8  -2.1   -0.8   2.0  -0.8
  class A1         -4.9  -2.4  -0.3    -1.2  -0.5   0.0     -1.3  -1.1  -1.1    -2.0  -1.5  -0.8   -0.6   0.8   1.6
  class A2         -6.1  -4.3  -3.0    -1.2  -0.9  -0.7     -1.3  -1.4  -1.2    -2.6  -1.0  -1.5   -1.2  -1.2   0.4
  class B          -3.9  -2.7  -3.4    -1.0  -0.3  -0.6     -1.1  -1.1  -1.1    -1.5  -1.0  -0.4   -0.3  -0.3  -1.3
  class C          -4.0  -4.6  -5.0    -1.5  -0.7  -0.6     -1.0  -1.2  -1.1    -1.1  -1.4  -1.2   -0.5  -1.5  -2.2
  class D          -3.9  -4.3  -4.5    -1.4  -0.5  -0.7     -1.0  -1.1  -0.9    -1.2  -1.1  -1.1   -0.5  -1.5  -1.9
  class E          -4.2  -3.1  -3.0    -1.6  -0.8  -0.5     -1.0  -1.3  -1.3    -1.2  -0.8  -0.6   -0.4  -0.4  -0.6
  class F          -7.3  -7.5  -8.1    -5.1  -3.5  -3.9     -1.2  -1.2  -1.1    -0.5  -1.2  -1.0   -0.7  -1.8  -2.3
  avg. CTC         -4.5  -3.4  -3.1    -1.3  -0.6  -0.5     -1.1  -1.2  -1.2    -1.6  -1.1  -0.9   -0.6  -0.5  -0.6
  avg. HD/UHD      -4.9  -3.0  -3.9    -1.1  -0.4  -0.6     -1.3  -1.2  -1.1    -2.3  -1.0  -0.3   -0.5  -0.4  -2.0
  (enc./dec. time) (121% / 98%)        (105% / 102%)        (108% / 101%)       (105% / 95%)       (102% / 100%)
random access
  EBU (2160p)      -3.9  -1.6  -4.3    -0.9  -0.3  -1.7     -1.0  -0.6   0.2    -1.7   0.1   0.1   -0.3  -0.8  -2.9
  SVT (1080p)      -4.6  -1.0  -4.8    -0.7  -0.2  -1.0     -1.2  -0.8  -0.4    -2.2  -1.7  -1.5   -0.7   1.7  -1.9
  class A1         -3.9  -1.8   1.5    -0.8  -0.9  -0.2     -1.0  -0.9  -0.9    -1.4  -0.6  -0.8   -0.8   0.7   1.7
  class A2         -4.0  -3.5  -1.8    -0.7  -0.5  -0.7     -1.0  -0.7  -0.6    -1.6  -0.5  -0.8   -0.8  -1.8   0.2
  class B          -3.8  -2.4  -2.1    -0.8  -0.6  -0.9     -1.0  -0.9  -0.5    -1.8  -0.4   0.1   -0.3  -0.5  -0.8
  class C          -3.5  -2.9  -3.2    -1.2  -0.5  -0.8     -0.8  -0.8  -0.7    -1.1  -1.8  -1.0   -0.5   0.1  -0.6
  class D          -3.4  -3.4  -2.7    -1.3  -0.2   0.2     -0.7  -0.6  -0.8    -1.2  -1.8  -1.5   -0.4  -0.9  -0.7
  class F          -6.4  -6.1  -6.7    -4.1  -3.1  -3.4     -0.9  -1.0  -0.9    -0.6  -1.2  -1.3   -1.0  -0.9  -1.3
  avg. CTC         -3.8  -2.6  -1.6    -0.9  -0.6  -0.7     -0.9  -0.8  -0.6    -1.5  -0.8  -0.2   -0.5  -0.4  -0.1
  avg. HD/UHD      -4.0  -1.9  -3.1    -0.8  -0.4  -1.2     -1.0  -0.7  -0.2    -1.8  -0.5  -0.2   -0.5  -0.2  -1.5
  (enc./dec. time) (110% / 99%)        (104% / 101%)        (103% / 100%)       (101% / 98%)       (101% / 100%)
low delay
  SVT (1080p)      -4.6  -2.3  -3.6    -0.3  -0.3  -1.1     -1.2  -0.6   0.3    -3.0  -1.8   0.0   -0.3   0.3  -2.9
  class B          -3.7  -2.9  -2.7    -0.6  -0.6  -1.4     -0.9  -0.3   0.2    -2.1  -0.1   0.6   -0.1  -2.0  -2.0
  class C          -3.2  -4.8  -6.0    -1.0  -0.5  -2.0     -0.8  -1.0  -0.1    -1.3  -1.2  -0.9   -0.2  -2.2  -3.0
  class D          -3.3  -6.4  -5.8    -1.1  -0.4  -1.6     -0.7   0.0   0.6    -1.4  -2.1  -1.8   -0.1  -3.8  -3.1
  class E          -2.6  -1.6   0.4    -0.8   0.2  -0.1     -0.7  -2.3   0.9    -1.1   1.8   1.5    0.0  -1.2  -1.9
  class F          -5.5  -6.1  -7.4    -3.6  -2.5  -2.8     -0.9  -1.1  -1.3    -0.7  -0.9  -0.8   -0.3  -1.6  -2.6
  avg. CTC         -3.3  -3.2  -3.0    -0.8  -0.3  -1.3     -0.8  -1.0   0.3    -1.6   0.0   0.3   -0.1  -1.9  -2.3
  avg. HD          -4.2  -2.6  -3.2    -0.5  -0.4  -1.3     -1.1  -0.4   0.2    -2.5  -0.9   0.3   -0.2  -0.9  -2.4
  (enc./dec. time) (111% / 97%)        (105% / 101%)        (102% / 99%)        (102% / 98%)       (101% / 100%)
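The BD-rate figures in Table VIII can be reproduced in principle from four rate/PSNR points per configuration. The following self-contained sketch is our own illustrative implementation of the Bjøntegaard delta rate [54] (not the JVET reference tooling): it interpolates log-rate as a cubic function of PSNR for each curve and averages the horizontal gap between the two curves over their overlapping PSNR range.

```python
import math

def _cubic_coeffs(xs, ys):
    # Lagrange interpolation through four (x, y) support points; returns
    # coefficients c[0..3] of c0 + c1*x + c2*x^2 + c3*x^3.
    c = [0.0, 0.0, 0.0, 0.0]
    for i in range(4):
        basis = [1.0]          # running product, lowest degree first
        denom = 1.0
        for j in range(4):
            if j == i:
                continue
            # multiply the basis polynomial by (x - xs[j])
            new = [0.0] * (len(basis) + 1)
            for k, b in enumerate(basis):
                new[k] -= xs[j] * b
                new[k + 1] += b
            basis = new
            denom *= (xs[i] - xs[j])
        for k in range(4):
            c[k] += ys[i] * basis[k] / denom
    return c

def _integral(c, lo, hi):
    # definite integral of the cubic with coefficients c over [lo, hi]
    f = lambda x: sum(c[k] * x ** (k + 1) / (k + 1) for k in range(4))
    return f(hi) - f(lo)

def bd_rate(rates_anchor, psnr_anchor, rates_test, psnr_test):
    # Average bit-rate difference in percent at equal PSNR, computed from
    # four rate/PSNR points per curve; negative values mean rate savings.
    la = _cubic_coeffs(psnr_anchor, [math.log(r) for r in rates_anchor])
    lt = _cubic_coeffs(psnr_test, [math.log(r) for r in rates_test])
    lo = max(min(psnr_anchor), min(psnr_test))
    hi = min(max(psnr_anchor), max(psnr_test))
    avg_diff = (_integral(lt, lo, hi) - _integral(la, lo, hi)) / (hi - lo)
    return (math.exp(avg_diff) - 1.0) * 100.0
```

With four base QPs per run, as in the CTC, the cubic fit reduces to exact interpolation; a test curve whose rates are uniformly 4% below the anchor at identical PSNR values yields a BD rate of −4%.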

This is demonstrated by additional results shown in Table IX, which were obtained by running simulations for eight sequences in YCbCr 4:4:4 and RGB 4:4:4 formats according to the JVET Common Test Conditions for non-4:2:0 color formats [55].

TABLE IX
AVERAGE BD RATES [%] FOR ENABLING JCCR IN 4:4:4 SEQUENCES

                   YCbCr 4:4:4         RGB 4:4:4
                  Y     Cb    Cr     G     B     R
all intra       -2.4  -0.2  -0.3   -3.0  -0.1   0.4
random access   -1.0  -0.9  -0.1   -3.3  -0.8  -0.3
low delay       -1.0  -1.0  -0.8   -1.4  -0.4   0.0

VII. CONCLUSION

Transform coding of prediction error blocks is one of the key components in hybrid video coding. This paper described the fundamental principles and implementation considerations behind the quantization and entropy coding design in the recently finalized Versatile Video Coding (VVC) standard. It introduced the trellis-coded quantization feature of VVC and highlighted the improvements, relative to the High Efficiency Video Coding (HEVC) standard, made in both the entropy coding scheme for quantized transform coefficients and the binary arithmetic coding engine. In addition, a newly integrated method for a block-wise joint coding of chroma residuals in color images was discussed. A comprehensive performance evaluation, conducted by means of a large set of video sequences of varying resolution, confirmed the increased coding efficiency (measured in bit-rate reduction at the same peak signal-to-noise ratio) achieved by each of the aforementioned improvements, as well as by the combination of these coding tools.

Beside the technology described in this paper, VVC includes a variety of other improvements such as the flexible block partitioning [5], block-adaptive transforms [6], various improvements of the intra- and inter-picture prediction, and new adaptive in-loop filters. By combining all these coding tools, VVC is able to outperform its predecessors, HEVC and AVC, in compression efficiency by a considerable margin and, thus, represents the new state of the art in video coding.

REFERENCES

[1] ITU-T and ISO/IEC, "Versatile video coding," ITU-T Rec. H.266 and ISO/IEC 23090-3, to be published.
[2] B. Bross, J. Chen, J.-R. Ohm, and G. J. Sullivan, "Developments in international video coding standardization after AVC, with an overview of Versatile Video Coding (VVC)," Proc. IEEE, to appear.
[3] ITU-T and ISO/IEC, "High efficiency video coding," ITU-T Rec. H.265 and ISO/IEC 23008-2 (HEVC), version 1, 2013.
[4] ITU-T, "Video codec for audiovisual services at p × 64 kbit/s," ITU-T Rec. H.261, version 3, 1993.
[5] Y.-W. Huang, J. An, H. Huang, X. Li, S.-T. Hsiang, K. Zhang, H. Gao, and J. Ma, "Block partitioning structure in the VVC standard," IEEE Trans. Circuits Syst. Video Technol., this issue.

[6] X. Zhao, S.-H. Kim, Y. Zhao, H. E. Egilmez, M. Koo, S. Liu, J. Lainema, and M. Karczewicz, "Transform coding in the VVC standard," IEEE Trans. Circuits Syst. Video Technol., this issue.
[7] V. K. Goyal, "Theoretical foundations of transform coding," IEEE Signal Process. Mag., vol. 18, no. 5, pp. 9–21, Sep. 2001.
[8] T. Nguyen, X. Xu, F. Henry, R.-L. Liao, M. Sarwer, M. Karczewicz, Y.-H. Chao, J. Xu, S. Liu, and G. J. Sullivan, "Overview of the screen content support in VVC: Applications, coding tools, and performance," IEEE Trans. Circuits Syst. Video Technol., this issue.
[9] ITU-T and ISO/IEC, "Advanced video coding for generic audiovisual services," ITU-T Rec. H.264 and ISO/IEC 14496-10 (AVC), 2003.
[10] G. J. Sullivan, "Efficient scalar quantization of exponential and Laplacian random variables," IEEE Trans. Inf. Theory, vol. 42, no. 5, pp. 1365–1374, Sep. 1996.
[11] H. Schwarz and T. Wiegand, "Video coding: Part II of fundamentals of source and video coding," Found. and Trends in Signal Processing, vol. 10, no. 1–3, pp. 1–346, Dec. 2016.
[12] T. D. Lookabaugh and R. M. Gray, "High-resolution quantization theory and the vector quantizer advantage," IEEE Trans. Inf. Theory, vol. 35, no. 5, pp. 1020–1033, Sep. 1989.
[13] G. Clare, F. Henry, and J. Jung, "Sign data hiding," Joint Collaborative Team on Video Coding (JCT-VC), doc. JCTVC-G271, Nov. 2011.
[14] X. Yu, J. Wang, D. He, G. Martin-Cocher, and S. Campell, "Multiple sign bits hiding," Joint Collaborative Team on Video Coding (JCT-VC), doc. JCTVC-H0481, Feb. 2012.
[15] J. Sole et al., "Transform coefficient coding in HEVC," IEEE Trans. Circuits Syst. Video Technol., vol. 22, no. 12, pp. 1765–1777, Dec. 2012.
[16] H. Schwarz, T. Nguyen, D. Marpe, and T. Wiegand, "CE7: Transform coefficient coding and dependent quantization (tests 7.1.2, 7.2.1)," Joint Video Experts Team (JVET), doc. JVET-K0071, Jul. 2018.
[17] H. Schwarz, T. Nguyen, D. Marpe, and T. Wiegand, "Hybrid video coding with trellis-coded quantization," in Proc. of Data Compression Conference (DCC), Mar. 2019.
[18] M. W. Marcellin and T. R. Fischer, "Trellis coded quantization of memoryless and Gauss-Markov sources," IEEE Trans. Commun., vol. 38, no. 1, pp. 82–93, Jan. 1990.
[19] T. R. Fischer and M. Wang, "Entropy-constrained trellis-coded quantization," IEEE Trans. Inf. Theory, vol. 38, no. 2, pp. 415–426, Mar. 1992.
[20] R. L. Joshi, V. J. Crump, and T. R. Fischer, "Image subband coding using arithmetic coded trellis coded quantization," IEEE Trans. Circuits Syst. Video Technol., vol. 5, no. 6, pp. 515–523, Dec. 1995.
[21] J. H. Kasner, M. W. Marcellin, and B. R. Hunt, "Universal trellis coded quantization," IEEE Trans. Image Process., vol. 8, no. 12, pp. 1677–1687, Dec. 1999.
[22] H. Schwarz, S. Schmidt, P. Haase, T. Nguyen, D. Marpe, and T. Wiegand, "Additional support of dependent quantization with 8 states," Joint Video Experts Team (JVET), doc. JVET-Q0243, Jan. 2020.
[23] K. Ramchandran and M. Vetterli, "Rate-distortion optimal fast thresholding with complete JPEG/MPEG decoder compatibility," IEEE Trans. Image Process., vol. 3, no. 5, pp. 700–704, Sep. 1994.
[24] M. Karczewicz, Y. Ye, and I. Chong, "Rate distortion optimized quantization," ITU-T SG16/Q6 (VCEG), doc. VCEG-AH21, Jan. 2008.
[25] E.-H. Yang and X. Yu, "Soft decision quantization for H.264 with main profile compatibility," IEEE Trans. Circuits Syst. Video Technol., vol. 19, no. 1, pp. 122–127, Jan. 2009.
[26] J.-R. Ohm, G. J. Sullivan, H. Schwarz, T. K. Tan, and T. Wiegand, "Comparison of the coding efficiency of video coding standards — including High Efficiency Video Coding (HEVC)," IEEE Trans. Circuits Syst. Video Technol., vol. 22, no. 12, pp. 1669–1684, Dec. 2012.
[27] Joint Collaborative Team on Video Coding (JCT-VC), "HEVC test model software (HM)," software repository available at https://vcgit.hhi.fraunhofer.de/jct-vc/HM.
[28] Joint Video Experts Team (JVET), "VVC test model software (VTM)," software repository available at https://vcgit.hhi.fraunhofer.de/jvet/VVCSoftware_VTM.
[29] G. D. Forney, Jr., "The Viterbi algorithm," Proc. IEEE, vol. 61, no. 3, pp. 268–278, Mar. 1973.
[30] C. R. Helmrich, S. Bosse, M. Siekmann, H. Schwarz, D. Marpe, and T. Wiegand, "Perceptually optimized QP adaptation and associated distortion measure," in Proc. of Data Compression Conference (DCC), Mar. 2019.
[31] J. Pfaff, A. Filippov, S. Liu, X. Zhao, J. Chen, S. de Luxán Hernández, V. Rufitskiy, A. K. Ramasubramonian, and G. van der Auwera, "Intra prediction and mode coding in the VVC standard," IEEE Trans. Circuits Syst. Video Technol., this issue.
[32] H. Schwarz, T. Nguyen, D. Marpe, and T. Wiegand, "Non-CE7: Alternative entropy coding for dependent scalar quantization," Joint Video Experts Team (JVET), doc. JVET-K0072, Jul. 2018.
[33] H. Schwarz, T. Nguyen, D. Marpe, T. Wiegand, M. Karczewicz, M. Coban, and J. Dong, "CE7: Transform coefficient coding with reduced number of regular-coded bins (tests 7.1.3a, 7.1.3b)," Joint Video Experts Team (JVET), doc. JVET-L0274, Oct. 2018.
[34] F. Bossen, "CE7-related: Modified binarization for reduced bin-to-bit ratio," Joint Video Experts Team (JVET), doc. JVET-L0325, Oct. 2018.
[35] T.-D. Chuang, S.-T. Hsiang, Z.-Y. Lin, C.-Y. Chen, Y.-W. Huang, and S.-M. Lei, "CE7-related: Constraints on context-coded bins for coefficient coding," Joint Video Experts Team (JVET), doc. JVET-L0145, Oct. 2018.
[36] T. Nguyen, H. Schwarz, H. Kirchhoffer, D. Marpe, and T. Wiegand, "Improved context modeling for coding quantized transform coefficients in video compression," in Proc. of Picture Coding Symposium (PCS), Dec. 2010, pp. 378–381.
[37] T. Nguyen, D. Marpe, and T. Wiegand, "Non-CE11: Proposed cleanup for transform coefficient coding," Joint Collaborative Team on Video Coding (JCT-VC), doc. JCTVC-H0228, Feb. 2012.
[38] J. Chen, W.-J. Chien, M. Karczewicz, X. Li, H. Liu, A. Said, L. Zhang, and X. Zhao, "Further improvements to HMKTA-1.0," ITU-T SG16/Q6 (VCEG), doc. VCEG-AZ07, Jun. 2015.
[39] D. Flynn, D. Marpe, M. Naccari, T. Nguyen, C. Rosewarne, K. Sharman, J. Sole, and J. Xu, "Overview of the range extensions for the HEVC standard: Tools, profiles, and performance," IEEE Trans. Circuits Syst. Video Technol., vol. 26, no. 1, pp. 4–19, Jan. 2016.
[40] D. Marpe, H. Schwarz, and T. Wiegand, "Context-based adaptive binary arithmetic coding in the H.264/AVC video compression standard," IEEE Trans. Circuits Syst. Video Technol., vol. 13, no. 7, pp. 620–636, Jul. 2003.
[41] C. Holt, "Forecasting trends and seasonals by exponentially weighted moving averages," O.N.R. Memorandum no. 52, Carnegie Inst. of Technology, 1957.
[42] A. Alshin, E. Alshina, and H. Park, "CE1 (subset B): Multi-parameter probability up-date for CABAC," Joint Collaborative Team on Video Coding (JCT-VC), doc. JCTVC-G0764, Nov. 2011.
[43] A. Alshin, E. Alshina, and H. Park, "High precision probability estimation for CABAC," in Proc. of Visual Communications and Image Processing (VCIP), Nov. 2013.
[44] F. Bossen, J. Dong, and H. Kirchhoffer, "Description of Core Experiment 1 (CE1): CABAC initialization," Joint Video Experts Team (JVET), doc. JVET-N1021, Apr. 2019.
[45] F. Bossen, "CE5-related: Implementation considerations for entropy coding," Joint Video Experts Team (JVET), doc. JVET-K0273, Jul. 2018.
[46] H. Kirchhoffer and A. Said, "Description of Core Experiment 5 (CE5): Arithmetic coding engine," Joint Video Experts Team (JVET), doc. JVET-L1025, Jan. 2019.
[47] K. Zhang, J. Chen, L. Zhang, X. Li, and M. Karczewicz, "Enhanced cross-component linear model for chroma intra-prediction in video coding," IEEE Trans. Image Process., vol. 27, no. 8, pp. 3983–3997, Aug. 2018.
[48] J. Lainema, "CE7-related: Joint coding of chrominance residuals," Joint Video Experts Team (JVET), doc. JVET-M0305, Jan. 2019.
[49] C. Helmrich, C. Rudat, T. Nguyen, H. Schwarz, D. Marpe, and T. Wiegand, "CE7-related: Joint chroma residual coding with multiple modes," Joint Video Experts Team (JVET), doc. JVET-N0282, Mar. 2019.
[50] C. Rudat, C. Helmrich, J. Lainema, T. Nguyen, H. Schwarz, D. Marpe, and T. Wiegand, "Inter-component transform for color video coding," in Proc. of Picture Coding Symposium (PCS), Nov. 2019.
[51] F. Bossen, J. Boyce, K. Sühring, X. Li, and V. Seregin, "JVET common test conditions and software reference configurations for SDR video," Joint Video Experts Team (JVET), doc. JVET-N1010, May 2019.
[52] European Broadcasting Union. UHD-1 test sequences. [Online]. Available: https://tech.ebu.ch/testsequences/uhd-1
[53] European Broadcasting Union. SVT high definition multi format test set. [Online]. Available: https://tech.ebu.ch/hdtv/hdtv_test-sequences
[54] G. Bjøntegaard, "Calculation of average PSNR differences between RD curves," ITU-T SG16/Q6 (VCEG), doc. VCEG-M33, Apr. 2001.
[55] Y.-H. Chao, Y.-C. Sun, J. Xu, and X. Xu, "JVET common test conditions and software reference configurations for non-4:2:0 colour formats," Joint Video Experts Team (JVET), doc. JVET-R2013, Apr. 2020.